Justin Corvino, Pengzi Miao - Lectures On Mathematical Relativity-Cambridge University Press (2025)

This book explores the connections between Einstein's theory of gravitation and differential geometry, targeting graduate students in mathematics and mathematical physics. It covers foundational topics in special and general relativity, the Einstein field equations, and advanced results in geometric analysis, including the Penrose singularity theorem and the Riemannian Penrose inequality. The text includes over 100 exercises to engage readers and enhance their understanding of the material presented.

This book introduces and explores some of the deep connections between
Einstein’s theory of gravitation and differential geometry. As an outgrowth of
graduate summer schools, the presentation is aimed at graduate students in
mathematics and mathematical physics, starting from the foundations of special
and general relativity, and moving to more advanced results in geometric analysis
and the Einstein constraint equations. Topics include the formulation of the
Einstein field equation and the Einstein constraint equations; a treatment of the
Penrose singularity theorem; an introduction to scalar curvature deformation and
the conformal method; a detailed introduction to asymptotically flat spaces and
the Riemannian positive mass theorem; gluing construction of initial data sets
which are Schwarzschild near infinity; constant mean curvature surfaces and the
center of mass for asymptotically flat initial data sets; and an introduction to the
Riemannian Penrose inequality.
While the book assumes a background in differential geometry and real analysis,
a number of basic results in geometry are included in the text and exercises. A
brief treatment of elliptic partial differential equations is designed to help the
reader navigate through the applications of geometric analysis to the Einstein
constraint equations discussed in the analysis-heavy second half of the book.
There are well over 100 exercises, many woven into the fabric of the chapters as
well as others collected at the end of chapters, to give readers a chance to engage
and extend the text.
Mathematical Sciences Research Institute Publications 66

Lectures on Mathematical Relativity

Mathematical Sciences Research Institute Publications

This series is based on work undertaken at the Simons Laufer Mathematical Sciences Institute
(SLMath), formerly the Mathematical Sciences Research Institute (MSRI), in Berkeley,
California. It publishes surveys and workshop proceedings of long-lasting value, as well as
lecture notes and monographs by visitors to the Institute. The volumes below are published
by Cambridge University Press; earlier ones may be available from Springer-Verlag.

5 Blackadar: K-Theory for Operator Algebras, second edition
9 Moore/Schochet: Global Analysis on Foliated Spaces, second edition
28 Clemens/Kollár (eds.): Current Topics in Complex Algebraic Geometry
29 Nowakowski (ed.): Games of No Chance
30 Grove/Petersen (eds.): Comparison Geometry
31 Levy (ed.): Flavors of Geometry
32 Cecil/Chern (eds.): Tight and Taut Submanifolds
33 Axler/McCarthy/Sarason (eds.): Holomorphic Spaces
34 Ball/Milman (eds.): Convex Geometric Analysis
35 Levy (ed.): The Eightfold Way
36 Gavosto/Krantz/McCallum (eds.): Contemporary Issues in Mathematics Education
37 Schneider/Siu (eds.): Several Complex Variables
38 Billera/Björner/Green/Simion/Stanley (eds.): New Perspectives in Geometric Combinatorics
39 Haskell/Pillay/Steinhorn (eds.): Model Theory, Algebra, and Geometry
40 Bleher/Its (eds.): Random Matrix Models and Their Applications
41 Schneps (ed.): Galois Groups and Fundamental Groups
42 Nowakowski (ed.): More Games of No Chance
43 Montgomery/Schneider (eds.): New Directions in Hopf Algebras
44 Buhler/Stevenhagen (eds.): Algorithmic Number Theory: Lattices, Number Fields, Curves and Cryptography
45 Jensen/Ledet/Yui: Generic Polynomials: Constructive Aspects of the Inverse Galois Problem
46 Rockmore/Healy (eds.): Modern Signal Processing
47 Uhlmann (ed.): Inside Out: Inverse Problems and Applications
48 Gross/Kotiuga: Electromagnetic Theory and Computation: A Topological Approach
49 Darmon/Zhang (eds.): Heegner Points and Rankin L-Series
50 Bao/Bryant/Chern/Shen (eds.): A Sampler of Riemann–Finsler Geometry
51 Avramov/Green/Huneke/Smith/Sturmfels (eds.): Trends in Commutative Algebra
52 Goodman/Pach/Welzl (eds.): Combinatorial and Computational Geometry
53 Schoenfeld (ed.): Assessing Mathematical Proficiency
54 Hasselblatt (ed.): Dynamics, Ergodic Theory, and Geometry
55 Pinsky/Birnir (eds.): Probability, Geometry and Integrable Systems
56 Albert/Nowakowski (eds.): Games of No Chance 3
57 Kirsten/Williams (eds.): A Window into Zeta and Modular Physics
58 Friedman/Hunsicker/Libgober/Maxim (eds.): Topology of Stratified Spaces
59 Caporaso/McKernan/Mustață/Popa (eds.): Current Developments in Algebraic Geometry
60 Uhlmann (ed.): Inverse Problems and Applications: Inside Out II
61 Breuillard/Oh (eds.): Thin Groups and Superstrong Approximation
62 Eguchi/Eliashberg/Maeda (eds.): Symplectic, Poisson, and Noncommutative Geometry
63 Nowakowski (ed.): Games of No Chance 4
64 Bellamy/Rogalski/Schedler/Stafford/Wemyss (eds.): Noncommutative Algebraic Geometry
65 Deift/Forrester (eds.): Random Matrix Theory, Interacting Particle Systems, and Integrable Systems
66 Corvino/Miao: Lectures on Mathematical Relativity
67–68 Eisenbud/Iyengar/Singh/Stafford/Van den Bergh (eds.): Commutative Algebra and Noncommutative Algebraic Geometry
69 Blumberg/Gerhardt/Hill (eds.): Stable Categories and Structured Ring Spectra
70 Larsson (ed.): Games of No Chance 5
71 Larsson (ed.): Games of No Chance 6
72 Fathi/Morrison/M-Seara/Tabachnikov (eds.): Hamiltonian Systems: Dynamics, Analysis, Applications
Schwarzschild meets Cortona: with thanks to Mauro Carfora
Lectures on
Mathematical Relativity

Justin Corvino
Lafayette College

Pengzi Miao
University of Miami

with additional chapters by

Lan-Hsuan Huang
University of Connecticut, Storrs

Brian Allen
Lehman College, CUNY

Fernando Schwartz
University of Tennessee, Knoxville
Justin Corvino
Lafayette College
Easton, PA 18042
United States

Pengzi Miao
University of Miami
Coral Gables, FL 33146
United States

Silvio Levy (series editor)

The Simons Laufer Mathematical Sciences Institute wishes to acknowledge support by the National
Science Foundation and the Pacific Journal of Mathematics for the publication of this series.

Shaftesbury Road, Cambridge CB2 8EA, United Kingdom


One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314-321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi - 110025, India
103 Penang Road, #05-06/07, Visioncrest Commercial, Singapore 238467
Cambridge University Press is part of Cambridge University Press & Assessment, a department of
the University of Cambridge. We share the University’s mission to contribute to society through
the pursuit of education, learning and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107079939
DOI: 10.1017/9781139942300
© Simons Laufer Mathematical Sciences Institute (SLMath) 2025
This publication is in copyright. Subject to statutory exception and to the provisions of relevant
collective licensing agreements, no reproduction of any part may take place without the written
permission of Cambridge University Press & Assessment.
When citing this work, please include a reference to the DOI 10.1017/9781139942300
First published 2025
A catalogue record for this publication is available from the British Library
A Cataloging-in-Publication data record for this book is available from the Library of Congress
ISBN 978-1-107-07993-9 Hardback
ISBN 978-1-107-43925-2 Paperback
Cambridge University Press & Assessment has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
For EU product safety concerns, contact us at Calle de José Abascal, 56, 1º, 28003 Madrid, Spain,
or email [email protected].
We dedicate this book to the memory of Sergio Dain.
Contents

Preface
Notation and conventions
Chapter 1. Special relativity and Minkowski spacetime
1.1. Lorentz transformations
1.2. Kinematics in Minkowski spacetime
1.3. Energy and momentum
1.4. Some geometric aspects of Minkowski spacetime
Exercises
Chapter 2. The Einstein equation
2.1. Newtonian gravity
2.2. From the equivalence principle to general relativity
2.3. The Einstein equation
2.4. Spacetime examples
Exercises
Chapter 3. Basics of Lorentzian causality
3.1. Preliminaries from Lorentzian geometry
3.2. Causality relations
3.3. Causality conditions
3.4. Achronal sets
3.5. Cauchy hypersurfaces
3.6. Domains of dependence
3.7. Cauchy horizons
Exercises
Chapter 4. The Penrose singularity theorem
4.1. Jacobi fields and focal points
4.2. Riccati and Raychaudhuri equations
4.3. Proof of Penrose’s singularity theorem
Chapter 5. The Einstein constraint equations
5.1. Introduction
5.2. The Einstein constraint equations
5.3. The initial value formulation for the vacuum Einstein equation
Exercises
Chapter 6. Scalar curvature deformation and the Einstein constraint equations
6.1. A primer on elliptic PDE
6.2. Solving the constraint equations: the conformal method
6.3. Scalar curvature deformation on closed manifolds
Exercises
Excursus: First and second variation of area
Exercises
Chapter 7. Asymptotically flat solutions of the Einstein constraint equations
7.1. Harmonically flat solutions of the constraint equations
7.2. Asymptotically flat initial data
7.3. Harmonically flat asymptotics
7.4. On the positive mass theorem
7.5. Localized scalar curvature deformation and asymptotics
Exercises
Chapter 8. On the center of mass and constant mean curvature surfaces of asymptotically flat initial data sets
8.1. Introduction
8.2. Uniqueness of embedded CMC surfaces
8.3. Stable CMC surfaces
8.4. Existence of CMC surfaces in asymptotically flat initial data sets
8.5. Stability and foliations
8.6. Density theorems
Chapter 9. On the Riemannian Penrose inequality
9.1. Introduction
9.2. Preliminaries
9.3. Lam’s proof of the RPI (and PMT) for graphs, in arbitrary dimensions
9.4. Huisken and Ilmanen’s proof of the RPI using IMCF
9.5. Bray’s proof
References
Index
Preface

This volume arose from the Summer Graduate Workshop in Mathematical General
Relativity at the Mathematical Sciences Research Institute (MSRI, now
renamed the Simons Laufer Mathematical Sciences Institute) in Berkeley, CA
in 2012, and the subsequent summer school in Cortona, Italy, in 2013. The
editors of the volume served as scientific organizers for the summer schools.
The contributions to the volume grew out of lectures given at one or both of the
schools.
We have endeavored to enhance the presentation of the material covered in the
two-week summer schools to make it suitable for reading in book form, while at
the same time remaining faithful to the spirit of those schools.
The advertised prerequisites for the schools, and hence this volume, included a
standard first-year graduate analysis course, with elements of real and functional
analysis as might be found in Real analysis by H. Royden [193] and Real and
complex analysis by W. Rudin [194]. We also assumed introductory graduate
courses in differential and Riemannian geometry, at the level of the following
texts: An introduction to smooth manifolds and Riemannian manifolds, by J. M.
Lee [141; 140]; Riemannian geometry by M. P. do Carmo [41]; Riemannian
geometry by P. Peterson [182]; and of particular relevance to the summer schools,
Semi-Riemannian geometry by B. O’Neill [174]. A graduate course in partial
differential equations (PDE), at the level of Partial differential equations by L. C.
Evans [86] and the first half of Elliptic partial differential equations of second
order by D. Gilbarg and N. Trudinger [107], was not a requirement for the
schools, and although some of the lectures needed to draw on some PDE results,
students without such background could profit from the bulk of the material
discussed at the schools. For this volume, however, certain PDE details in some
of the presentations have been fleshed out, so those sections would be better
approached with this background in hand. We have endeavored to bridge the gap
by including a section introducing and motivating some of the PDE tools.
The students came to the schools with a wide range of backgrounds in mathematics,
from those who had nearly completed their doctoral dissertations to those
who came without the prerequisite geometry background. Through exercises and
tutorial sessions, students were able to build enough intuition and computational
skills to understand much of the material presented. While we decided a primer
section on elliptic PDE was essential for the flow of this book, we resisted the
temptation to add further sections on background geometry. That said, we do
recall or develop some foundational material where needed, and some startup
notations and conventions are reviewed below, under Notation and conventions. We include exercises
that were assigned before and during the schools, both to give readers a feel for
the tutorials and to help focus those who are learning the topics for the first time
or reviewing on the fly. We added many exercises as well, some collected at the
end of chapters, some interspersed in the text. Of particular note, many exercises
in Chapters 1 and 2 serve to review and extend background in geometry.
Strictly speaking, no physics background is required. We assume, as we did
at the schools, a nodding acquaintance with pre-relativity physics, enough so
that students can approach the development of the theory of special and general
relativity with context from which to appreciate the rudiments of spacetime
structure and the line of thought from Galileo to Newton to Einstein, and to
motivate why the Einstein equations and the initial value constraints were to
receive so much of their attention. In part for this reason, the first chapter
contains some very basic material that would be included in an undergraduate
course in special relativity, but we found it to be a fun way to start each school,
engendering some interesting discussion amongst participants without needing
much in the way of background. The first two chapters on special and general
relativity may seem somewhat chatty, including some discussion of physics
without always being mathematically efficient or fastidious, but we hope it helps
to frame the mathematical theory. We could have cut the physics discussion short
by formulating the mathematical postulates from the start with a small amount
of motivation, but we decided, given the audience, to put some more time into
developing these ideas from their genesis in physics. Even giving ourselves
some leeway, the presentation is not too leisurely, and the lecture schedule at the
schools called for covering the physics background reasonably efficiently at the
beginning of the first week.
The mathematical and physical foundations of relativity have been an active
topic of discussion and research for over a hundred years, and we have not tried
to approach the scope of the debate (for instance, we chose not to discuss Mach’s
principle in depth), nor have we tried to use too fine a brush in painting the
logical and philosophical distinctions, nor strained to give a serious historical
account of the development of the theory. Interested readers can follow up with
references such as [82; 83; 85; 161; 169; 170; 171], and with a wealth of material
available online.
Even while starting off in an elementary fashion, and keeping in mind the
range of student backgrounds represented, we were able to cover a reasonable
amount of ground at each school. During the MSRI workshop, Pengzi Miao
covered sufficient elements of causal theory to present the proof of the Penrose
singularity theorem. Justin Corvino developed enough background in scalar
curvature and asymptotically flat solutions of the constraint equations to be able
to present a proof of the Riemannian positive mass theorem in three dimensions,
while Lan-Hsuan Huang and Fernando Schwartz were able to build on this to
discuss advanced aspects of the geometry of initial data sets, with Lan-Hsuan
discussing constant mean curvature surfaces and the notion of center of mass,
and with Fernando outlining multiple approaches to the Riemannian Penrose
inequality. This volume reflects essentially the material covered during the MSRI
workshop.
At Cortona, in lieu of Pengzi’s lectures, Mauro Carfora (Università di Pavia)
presented an engaging and marvelously illustrated development connecting the
constraint equations (elliptic PDE governing initial values for the Einstein
evolution) and the Ricci flow,¹ while Michael Eichmair (ETH Zürich, now at the
University of Vienna) developed connections between the positive mass theorem
and the geometry of initial data sets (including isoperimetry of large spheres),²
which dovetailed beautifully with the lectures of Huang and Schwartz.

¹ A full treatment of the topic in Mauro’s lectures can be found in the recent monograph [40].
² For this material see [32; 33; 79; 80; 81].

With all this background to present, the organizers decided to focus the topics
lectures on the Einstein constraint equations which govern the initial data for
the Einstein evolution, at the expense of not including advanced and/or current
topics on the evolution problem. While this is a reasonable basis for criticism
(of the schools and hence this volume), the field has developed to a point where
there is room for multiple programs on each of these topics, and the relations
between them; articles such as [64] and volumes such as [11; 51] indicate the
considerable breadth and depth of the field.
The years just after the workshops witnessed a flurry of activity in general
relativity. The centennial year of 2015 marked the hundredth anniversary of
Einstein’s formulation of a geometric theory of gravity governed by the Einstein
equation, and was capped off with the excitement over the detection by LIGO
of gravitational waves generated from black hole mergers, the discovery of
which led to the 2017 Nobel Prize in physics. Roger Penrose shared the 2020
Nobel Prize in physics for his work on singularity formation and black holes,
some of which we discuss. We hope the field will continue to develop in a robust
manner, and that this work will be of some value in introducing graduate students
to the field, and showing them some aspects of more advanced topics. Along
these lines, we enthusiastically point the reader to the graduate text Geometric
Relativity by Dan A. Lee (Queens College, CUNY), which has appeared recently
[142], and would surely have been a recommended text for the schools.
The first two chapters of this volume present the basic background, from
Minkowski spacetime and special relativity, to Einstein’s equation and general
relativity. Chapters 3 and 4 treat causality and the Penrose singularity theorem.
Chapter 5 on the Einstein constraint equations rounds out the basic background
from general relativity. Starting from Chapter 6 the text takes a sharp turn in the
direction of geometric analysis. Chapter 6 includes some background motivation
on elliptic PDE, with some applications to the constraint equations and scalar
curvature; of note, there is an excursus on the first and second variations of area,
which will appear throughout the rest of the text. Chapters 7–9 are written as
topical chapters and are largely independent of each other, though one might
find utility in referring to Chapter 7 for some properties of asymptotically flat
spaces. That said, on a first pass, some readers might find themselves giving
some of the more technical discussions in Chapter 7 a light read.
We would like to thank the graduate students for their hard work and enthusiasm
at the summer schools, and in particular Alan Parry and Xin Zhou, as
well as Peter McGrath and Andrea Santi, for their work as graduate assistants
at the MSRI and Cortona schools, respectively. During one tutorial session,
Alan introduced us to his research area, by presenting work of his thesis advisor
Hubert Bray (Duke University), which modifies the Einstein–Hilbert action of
general relativity with a goal to model dark matter; while we do not treat this
topic in the text, we refer the interested reader to [27]; see also [30]. It has been
inspiring to the scientific organizers to see so many of the students producing a
staggering amount of interesting theses and papers in the years since the summer
schools were held, and many have moved on to postdocs and faculty positions. In
particular, Brian Allen, currently in the Department of Mathematics at Lehman
College, CUNY, attended the MSRI summer school as a graduate student, and is
a coauthor on Chapter 9 in this volume.
There are many people to thank for helping this project along. Giorgio
Patrizio (Università di Firenze) first broached the idea of a volume after the
Cortona summer school. We thank Hélène Barcelo (MSRI) for her enthusiastic
support throughout the process. We also thank all the great staff at MSRI, and in
particular Chris Marshall, for their support before, during and after the school,
and likewise at Cortona, in particular Silvana Boscherini and Cinzia Benedetti.
Funding for the schools was provided in part by the National Science Foundation,
the Clay Foundation, and INdAM (Istituto Nazionale di Alta Matematica), and
we thank them for their generous support. Likewise we thank our respective
home institutions, Lafayette College and the University of Miami. The editors
shaped the book in part during their invited mini-course at the 2013 Taiwan
International Conference on Geometry, at the National Taiwan University, and we
would like to extend our thanks to Yng-Ing Lee for that opportunity. JC thanks
Lehigh University, and especially Huai-Dong Cao, for inviting him to teach a
graduate course in mathematical relativity in 2011, an experience that helped
frame the approach to some of the material. JC would also like to acknowledge
invitations from the Park City Math Institute, the Erwin Schrödinger Institute in
Vienna, as well as from the Ravello Summer School, where he delivered
mini-courses in the summers of 2013, 2014 and 2015, respectively, at which some
of the presentation was honed. Of particular note is the support of Tommaso
Ruggeri (Università di Bologna) for both the Cortona and Ravello summer
schools. We thank Greg Galloway for reading Chapters 3 and 4 and offering
some helpful feedback. JC thanks former student Kevin Manogue (Lafayette
College) for feedback on Chapters 1 and 2, David Maxwell (University of Alaska,
Fairbanks) for discussions on the conformal method, Farhan Abedin (Lafayette
College) for reading parts of several chapters and for critical feedback that led
to a reorganization of Chapters 5–7, and finally John D. Norton (University
of Pittsburgh) for several enlightening email exchanges on the foundations of
general relativity. In addition to lecturing in Cortona, Mauro Carfora read several
chapters in detail and offered critical advice from a physics perspective; in
addition, his beautiful sketch of the palace at which the school was held adorns
this volume. A huge thank you goes out to the editor Silvio Levy not only for
his advice and encouragement, but for his calm patience while this project took
longer than anticipated.
This book is dedicated to our friend and colleague Sergio Dain, who passed
away in February 2016 at the age of 46. Sergio was an inspiration — through
his work and his talks, he shared his deep insights into mathematical relativity
and inspired you to be a better mathematician, while through his friendly and
generous personality, interacting with him inspired you to be a better person. We
lack the words to express how much he is missed.
Notation and conventions

We will often indicate conventions when they appear in the text (sometimes
repeatedly), but we will mention a few here, just to get started.
While we generally use the term smooth to mean $C^\infty$ (partly for definiteness),
we note that often it will be obvious that a certain $C^k$-smoothness level is
sufficiently smooth for the context under consideration. Subset notation $A \subset B$
also allows for $A = B$. Vectors will be denoted in various ways; standard basis
vectors in coordinates $x^i$ will be often written as partial derivative operators
$\partial/\partial x^i$, so that a vector $V$ can be written as a linear combination
$V = V^i \, \partial/\partial x^i$. Here we have used the Einstein summation convention
of summing over repeated upper and lower indices. While this convention will be in
force unless otherwise noted, we will repeat it on occasion for the sake of clarity.
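As a quick illustration of the summation convention, the following sketch writes the implied sums out in full. The dimension $n$, the index range $1, \ldots, n$, the second vector $W$, and the components $g_{ij}$ are assumptions made only for this example; in a spacetime the indices would typically start at 0.

```latex
% A repeated index, appearing once up and once down, is summed over its range.
\[
  V = V^i \frac{\partial}{\partial x^i}
    = \sum_{i=1}^{n} V^i \frac{\partial}{\partial x^i}
    = V^1 \frac{\partial}{\partial x^1} + \cdots + V^n \frac{\partial}{\partial x^n}.
\]
% Two repeated pairs of indices give a double sum.
\[
  g_{ij} V^i W^j = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij} V^i W^j .
\]
```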
The term manifold will generally refer to a smooth manifold without boundary.
A closed manifold will refer to a compact manifold (again, without boundary).
While we assume the standard topological conditions that manifolds are Hausdorff
and second countable, we are ambivalent about whether to restrict to connected
manifolds: many results will not require connectedness, and for certain results
that do, it is rather obvious that a statement as written would only hold on
each component separately. We will try to point out where connectedness is
assumed, but we trust the reader can discern if we have missed such an instance.
A submanifold of codimension one is a hypersurface, which will generally be
taken to be smoothly embedded, though we will try to point out when we allow
it to be immersed, or weaken the regularity assumption (as in Chapter 3).
We will work with semi-Riemannian (also called pseudo-Riemannian) metrics
on $M$, mostly Lorentzian or Riemannian; our signature for Lorentzian metrics
is $(-, +, +, \ldots, +)$. When the spacetime is the focus, it may be given as
a Lorentzian manifold $(M, g)$, whereas at some point, the focus in the book
will shift primarily to Riemannian manifolds, often construed as Riemannian
hypersurfaces in a spacetime, so that the Riemannian manifold might then be
given as $(M, g)$, and the corresponding spacetime (if referenced) by $(S, \bar{g})$, for
example. Pay close attention to this, and also to the dimension of the spacetime.
This will be made clear in each situation, but just keep it in mind when
cross-referencing formulae across chapters and sections.
When dealing with tensors, we sometimes just need the value of the tensor at
a point, and sometimes we are referring to a tensor field; this will not always be
explicitly stated, but should be clear in context. If a formula refers to derivatives
of the tensor field, we will assume, unless stated otherwise, that the tensor field
is smooth, or, at least smooth enough to do the indicated computations. For
example, “consider a one-form θ” might really mean “consider a smooth
one-form field θ”. At various points we will consider fields that have less regularity
(e.g., Sobolev spaces of tensor fields), and that will be made clear when needed;
in particular, we will be more deliberate about emphasizing the regularity when
it comes to the fore starting with the PDE discussion in Chapter 6.
Recall that a connection on the tangent bundle $TM$ (an affine connection)
assigns to vector fields $X$ and $Y$ a vector field $\nabla_X Y$, which is $C^\infty(M)$-linear
(and hence tensorial) in $X$ and $\mathbb{R}$-linear in $Y$, and satisfies the product rule
$\nabla_X (fY) = (\nabla_X f)Y + f \nabla_X Y$ for $f \in C^\infty(M)$, where $\nabla_X f = X[f]$ is the
directional derivative of $f$; the value of $(\nabla_X Y)|_p$ depends only on $X|_p$ and the
values of $Y$ along a curve tangent to $X|_p$. One can extend the connection to
tensor fields $T$, defining $\nabla_X T$ by applying a product rule; e.g., if $T$ is a one-form,
$\nabla_X (T(Y)) = (\nabla_X T)(Y) + T(\nabla_X Y)$. In general, $\nabla_X T$ is a tensor of the same rank
as $T$, and it follows easily from the definition that $\nabla_X T$ is tensorial in $X$. Hence
we can construe $\nabla T$ as a tensor with rank higher by one: if $T$ is an $(r, s)$-tensor,
producing a scalar from a tuple of $r$ one-forms and $s$ vectors, then $\nabla T$ is an
$(r, s+1)$-tensor. On a semi-Riemannian $(M, g)$, there is a unique connection,
called the Levi-Civita connection and denoted by $\nabla$ (among other notations you
might see in the text), which is torsion-free ($\nabla_X Y - \nabla_Y X = [X, Y]$) and satisfies
$\nabla g = 0$; this will generally be the connection employed unless stated otherwise.
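In local coordinates the product rule for a one-form can be made concrete. The following is a sketch only: the shorthand $\partial_i = \partial/\partial x^i$ and the Christoffel symbols $\Gamma^k_{ij}$ are standard notation assumed here for illustration, and $(g^{k\ell})$ denotes the inverse of the matrix $(g_{ij})$ of metric components.

```latex
% Christoffel symbols: the coefficients of the connection in a coordinate frame.
\[
  \nabla_{\partial_i} \partial_j = \Gamma^k_{ij}\, \partial_k .
\]
% Product rule for a one-form T = T_j dx^j, applied with X = \partial_i, Y = \partial_j.
\[
  (\nabla_{\partial_i} T)(\partial_j)
    = \partial_i \big( T(\partial_j) \big) - T(\nabla_{\partial_i} \partial_j)
    = \partial_i T_j - \Gamma^k_{ij}\, T_k .
\]
% For the Levi-Civita connection, torsion-freeness and \nabla g = 0 together give
\[
  \Gamma^k_{ij} = \tfrac{1}{2}\, g^{k\ell}
    \big( \partial_i g_{j\ell} + \partial_j g_{i\ell} - \partial_\ell g_{ij} \big).
\]
```

Since coordinate vector fields commute, torsion-freeness appears here as the symmetry $\Gamma^k_{ij} = \Gamma^k_{ji}$.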
A metric $g$ will often be written in bracket notation: $g(X, Y) = \langle X, Y \rangle$.
In coordinates, $g$ is given by a symmetric matrix of components $g_{ij}$, so that
locally $g = g_{ij}\, dx^i \otimes dx^j = g_{ij}\, dx^i dx^j$, where for one-forms $\theta$ and $\eta$ we define
$\theta\eta = \tfrac{1}{2}(\theta \otimes \eta + \eta \otimes \theta)$ (whereas the wedge product is given by
$\theta \wedge \eta = \theta \otimes \eta - \eta \otimes \theta$).
Thus the Euclidean metric $g_{\mathbb{E}^n}$ on $\mathbb{R}^n$, for which the component functions $x^i$
are Cartesian coordinates, is then expressed as $g_{\mathbb{E}^n} = \delta_{ij}\, dx^i dx^j$, for example.
The nondegeneracy of $g$ corresponds in components to the invertibility of the
matrix $(g_{ij})$, and we write $(g^{ij}) = (g_{ij})^{-1}$, i.e., $g^{ij} g_{jk} = \delta^i_k$. There is a natural
volume measure $dv_g$ associated to $g$, which in local coordinates takes the form
$dv_g = \sqrt{|\det(g_{ij})|}\, dx$, where $dx$ is the Euclidean (Lebesgue) volume measure in
coordinates; $dv_g$ corresponds to a volume form $\omega_g$ in case $M$ is orientable. We
sometimes let $\det g = \det(g_{ij})$ for abbreviation, and we often let $d\sigma$ or $d\sigma_g$ be
the volume measure induced on a semi-Riemannian submanifold.
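For instance, the coordinate formula for the volume measure can be checked on the Euclidean plane in polar coordinates. This is a sketch; the coordinate names $(r, \varphi)$ are chosen here for illustration.

```latex
% Polar coordinates on the plane minus the origin:
% x^1 = r \cos\varphi, \quad x^2 = r \sin\varphi, \quad r > 0.
\[
  g_{\mathbb{E}^2} = (dx^1)^2 + (dx^2)^2 = dr^2 + r^2\, d\varphi^2 ,
\]
% so in the coordinates (r, \varphi) the matrix of components is diagonal:
\[
  (g_{ij}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix},
  \qquad
  \det(g_{ij}) = r^2,
  \qquad
  dv_g = \sqrt{|\det(g_{ij})|}\; dr\, d\varphi = r\, dr\, d\varphi .
\]
```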
Since at each point on $M$ the metric $g$ is nondegenerate, it can be used to
change the tensor type, e.g., a vector $X$ is associated to a dual form $X^\flat$ by
$g(X, Y) = X^\flat(Y)$, and likewise a one-form $\alpha$ can be associated to its vector dual
$\alpha^\sharp$ by $g(\alpha^\sharp, Y) = \alpha(Y)$. It is easy to check in a basis $v_j$ for $T_pM$ with dual basis
$\theta^i$ for $T_p^*M$ (so $\theta^i(v_j) = \delta^i_j$) that if $X = X^j v_j$ then $X^\flat = X_i \theta^i$ with $X_i = g_{ij} X^j$,
where $g_{ij} = g(v_i, v_j)$; similarly, if $\alpha = \alpha_i \theta^i$, then $\alpha^\sharp = \alpha^j v_j$ with $\alpha^j = g^{ij} \alpha_i$.
This kind of operation, known as raising and lowering of indices from the way
the notation is arranged, can be performed on more general tensors $T$, with the
positions of the indices generally indicating tensor type in lieu of the musical $\sharp$
and $\flat$ notation.
We remark on the consistency of the raising/lowering notation: if T is a
(0, 2)-tensor with components $T_{ij}$, then $T^{ij} = g^{ik}g^{j\ell}T_{k\ell}$ give the components of
the tensor obtained by type-changing using g, so that if T = g, then in fact we see
$T^{ij} = g^{ij}$ (the components of the inverse matrix). Furthermore, we can extend g
as a bilinear form on more general tensors, defining ⟨S, T⟩ to be an appropriate
metric contraction of S ⊗ T; e.g., if S and T are (1, 2)-tensors, then in a local
basis $\langle S, T\rangle = g_{i\ell}\,g^{js}g^{km}S^{i}_{jk}T^{\ell}_{sm}$. We may write this in various ways, depending
on context: $\langle S, T\rangle = \langle S, T\rangle_g = S\cdot_g T = S\cdot T$, and we let $|T|_g^2 = \langle T, T\rangle_g$ (this
is in fact nonnegative when g is Riemannian). Note that if h is a (0, 2)-tensor,
then $\langle g, h\rangle_g = g^{ij}h_{ij} = \operatorname{tr}_g h$, and similarly if h is a (2, 0)-tensor.
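The index conventions above can be tried out numerically. The following is a minimal sketch, assuming the Minkowski metric diag(−1, 1, 1, 1) in units with c = 1 as the example metric; the helper names `lower`, `raise_index` and `metric_trace` are illustrative and not notation from the text.

```python
# Example metric: g = diag(-1, 1, 1, 1) (an illustrative choice, units with c = 1).
n = 4
g = [[0.0] * n for _ in range(n)]
for i in range(n):
    g[i][i] = -1.0 if i == 0 else 1.0
# For a diagonal metric with entries +1 or -1, the inverse matrix has the same entries.
g_inv = [row[:] for row in g]

def lower(X):
    """Return the components X_i = g_ij X^j of the one-form dual to X."""
    return [sum(g[i][j] * X[j] for j in range(n)) for i in range(n)]

def raise_index(alpha):
    """Return the components alpha^j = g^ij alpha_i of the vector dual to alpha."""
    return [sum(g_inv[i][j] * alpha[i] for i in range(n)) for j in range(n)]

def metric_trace(h):
    """Return tr_g h = g^ij h_ij for a (0, 2)-tensor h."""
    return sum(g_inv[i][j] * h[i][j] for i in range(n) for j in range(n))

X = [2.0, 1.0, 0.0, 3.0]
X_flat = lower(X)                                   # only the time component changes sign
X_back = raise_index(X_flat)                        # raising undoes lowering
norm_sq = sum(X_flat[i] * X[i] for i in range(n))   # <X, X> = -4 + 1 + 0 + 9
trace_g = metric_trace(g)                           # <g, g>_g = tr_g g = n
```

Here `trace_g` equals the dimension, which is the statement $\langle g, g\rangle_g = \operatorname{tr}_g g = n$ in this example.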
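The Riemann curvature tensor will be defined via the vector field

\[ R(X, Y, Z) = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z = R(X, Y)Z, \]

with index conventions $R^{\ell}_{ijk}\,\frac{\partial}{\partial x^\ell} = R\big(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\big)$, in which R is a (1, 3)-tensor,
while the components of the corresponding (0, 4)-tensor are given by
$R_{ijk\ell} = g_{\ell m}R^{m}_{ijk} = \big\langle R\big(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\big), \frac{\partial}{\partial x^\ell}\big\rangle$. Different books use different
conventions, so be alert! The curvature tensor enjoys a number of symmetries.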


Clearly, R(X, Y, Z) = −R(Y, X, Z); slightly less obvious is symmetry-by-pairs
⟨R(V, W, Y), Z⟩ = ⟨R(Y, Z, V), W⟩. Thus we have the component identities:
$R_{k\ell ij} = R_{ijk\ell} = -R_{jik\ell} = R_{ji\ell k}$. For a nondegenerate two-plane Π ⊂ $T_pM$,
the following expression is independent of basis {V, W} for Π, and defines the
sectional curvature K(Π):
\[ K(\Pi) = \frac{\langle R(V, W, W), V\rangle}{\langle V, V\rangle\langle W, W\rangle - \langle V, W\rangle^2}. \tag{0.0.1} \]
For given X and Y , R( · , X, Y ) is a linear transformation, whose trace is defined
to be Ric(X, Y ), the Ricci curvature. The Ricci tensor Ric (alternatively, Ric(g)
or Ricg ) is a symmetric (0, 2)-tensor (via the preceding curvature component


identities), and it is generally the same tensor across texts (though a notable
exception is [221], where the sign differs from ours), which means the way
it is defined from the Riemann tensor may differ to account for sign. In our
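convention,

\[ \operatorname{Ric}(X, Y) = dx^\ell\Big(R\Big(\frac{\partial}{\partial x^\ell}, X, Y\Big)\Big) = g^{k\ell}\Big\langle R\Big(\frac{\partial}{\partial x^\ell}, X, Y\Big), \frac{\partial}{\partial x^k}\Big\rangle, \]
\[ R_{ij} = \operatorname{Ric}\Big(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\Big) = R^{\ell}_{\ell ij} = g^{k\ell}R_{\ell ijk}. \]
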
The scalar curvature is the metric trace of the Ricci tensor, and is given in
components by $R(g) = g^{ij}R_{ij}$.
A comma is used to denote a partial derivative, whereas a semicolon is used
to denote components of the covariant derivative of a tensor. For example, with
$T_{ijk} = g_{km}T^{m}_{ij}$, we have $(\nabla T)_{ijk\ell} = T_{ijk;\ell} = (g_{km}T^{m}_{ij})_{;\ell} = g_{km}T^{m}_{ij;\ell}$, since ∇g = 0.
While the covariant derivative of a function f is naturally a one-form df, i.e.,
$\nabla f(X) = \nabla_X f = X[f] = df(X)$, sometimes ∇f is instead taken to be the vector
$(df)^\sharp = \operatorname{grad}_g f$ dual to df, i.e., the gradient of f with respect to the metric g,
so that $df(X) = g(X, \operatorname{grad}_g f)$; the meaning should be clear in context.
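The Christoffel symbols $\Gamma^k_{ij}$ for a coordinate frame are defined by

\[ \nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j} = \Gamma^k_{ij}\,\frac{\partial}{\partial x^k}, \]

and can be computed in terms of the metric as $\Gamma^k_{ij} = \tfrac12 g^{km}(g_{mj,i} + g_{im,j} - g_{ij,m})$.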
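As an illustration of the coordinate formula for the Christoffel symbols, here is a minimal numerical sketch. It assumes, as an example not taken from the text, the Euclidean plane in polar coordinates (r, θ), where $g = dr^2 + r^2 d\theta^2$, and it approximates the partial derivatives $g_{ij,m}$ by central differences; all function names are illustrative.

```python
def metric(x):
    """Components g_ij of the Euclidean plane metric in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix, giving the components g^ij."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_partials(x, h=1e-5):
    """Return dg with dg[m][i][j] approximating g_{ij,m} by central differences."""
    dim = len(x)
    dg = []
    for m in range(dim):
        x_plus, x_minus = list(x), list(x)
        x_plus[m] += h
        x_minus[m] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(dim)]
                   for i in range(dim)])
    return dg

def christoffel(x):
    """Return G with G[k][i][j] = (1/2) g^{km} (g_{mj,i} + g_{im,j} - g_{ij,m})."""
    dim = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = metric_partials(x)
    return [[[0.5 * sum(g_inv[k][m] * (dg[i][m][j] + dg[j][i][m] - dg[m][i][j])
                        for m in range(dim))
              for j in range(dim)]
             for i in range(dim)]
            for k in range(dim)]

# Evaluate at r = 2 (index 0 is r, index 1 is theta).
G = christoffel([2.0, 0.3])
```

In this example the only nonzero symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, which the sketch reproduces up to discretization error.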

If u is a smooth function on M, the Hessian of u is defined by $\operatorname{Hess}_g u = \nabla(du)$.
It is a (0, 2)-tensor, with $(\operatorname{Hess}_g u)_{ij} = u_{;ij}$ in components, and moreover it is
symmetric (Exercise 1-9). The Laplacian is the trace of the Hessian:

\[ \Delta_g u = \operatorname{tr}_g(\operatorname{Hess}_g u) = g^{ij}u_{;ij}. \]

In some texts, the term Laplacian is reserved for the case (M, g) is Riemannian,
and may be defined as the negative of our definition. When (M, g) is Lorentzian,
the trace of the Hessian is often called (again, up to a sign) the wave operator $\Box_g$.

Geodesic normal coordinates at a point p ∈ M can be useful in computations.
In such a coordinate system, $g_{ij}(p) = \pm\delta_{ij}$ and $\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j}\Big|_p = \Gamma^k_{ij}(p)\,\frac{\partial}{\partial x^k}\Big|_p = 0$, the
latter condition being equivalent to the vanishing $g_{ij,k}(p) = 0$ of all the partial
derivatives of the components of g at p. Thus, for example, if T is a (1, 2)-tensor
field, then $T^{k}_{ij;\ell} = T^{k}_{ij,\ell} + \Gamma^{k}_{\ell m}T^{m}_{ij} - \Gamma^{m}_{\ell i}T^{k}_{mj} - \Gamma^{m}_{\ell j}T^{k}_{im}$, which greatly simplifies at
a point p in normal coordinates. When we use an expression like “at a point in
normal coordinates”, we generally imply evaluating at the point p around which


the normal coordinates chart is centered.
We will sometimes use “big O” notation: f = O(h) means that | f | ≤ C|h|
for some C > 0, where the quantities may be tensors, with corresponding norms.
Generally one must pay attention to the dependence of C. If f and h are functions
of x, then C might be uniformly chosen for x in a compact subset, or possibly f
is a function of a tensor h, and so the C might depend on the set of tensors under
consideration. Sometimes this notation also implies some bounds on derivatives
of f as well, which will have to be specified in context.
Various function spaces will play a role in some of the analysis herein. We
will in Chapter 6 recall basic definitions of Sobolev and Hölder spaces, and we
encourage the reader to review their basic properties from references such as [2;
86; 107; 144]. We let Ω be an open subset of $\mathbb{R}^n$, sometimes called a domain in
$\mathbb{R}^n$. For k a nonnegative integer, we let $C^k(\Omega)$ be the set of all functions u on
Ω such that u and all its partials up through order k are continuous, and we let
$C^\infty(\Omega) = \bigcap_{k=0}^{\infty} C^k(\Omega)$.
CHAPTER 1

Special relativity and Minkowski spacetime

This chapter is a brief introduction to Minkowskian geometry, highlighting the


physical motivation for this model of spacetime. This mathematical model
resolves a number of perplexing issues that had arisen in experimental and
theoretical physics by the dawn of the twentieth century, and it led to surprising
predictions for physics, which have been confirmed in the laboratory. While we
review several well-known features of spacetime, we certainly cannot do the
topic justice (from the point of view of mathematics or physics) in this brief
introduction. Rather, we will focus on what we need for our purposes, and
refer the reader to the bibliography, on which we have drawn heavily for both
inspiration and details.

1.1. Lorentz transformations

We begin by discussing some of the physical underpinnings of the Minkowski


spacetime model of physics, following in spirit the ideas of Einstein. We make the
mathematical assumption that events in space and time form a four-dimensional
continuum (manifold) S, which for now we take to be $\mathbb{R}^4$ (with the standard
topology). We examine the role of a distinguished class of physical observers,
corresponding to inertial frames of reference, and giving rise to a family of
coordinate charts for Minkowski spacetime. The transformations between such
charts form a group of diffeomorphisms of $\mathbb{R}^4$, and from the Kleinian perspective,
the invariants of this group yield geometric quantities of interest. Einstein echoed
the principle of relativity of Galileo and Newton in asserting the equivalence
of inertial observers from the point of view of physics, extending it from the
realm of mechanics to include electromagnetism. There should be no preferred
inertial frame that could be distinguished by experiment, and thus the laws of
mechanics and electromagnetism should take the same form in any inertial frame
(special covariance of the laws of physics). This being so, quantities of interest
in physics would be invariants of the associated transformation group. In this
respect, then, the departure from the Galilean invariance of Newtonian mechanics
is that Einstein identified a different group of transformations between inertial

frames — in essence, a different geometric structure, moving from Euclidean


to Minkowskian geometry — which allowed him to incorporate the laws of
electromagnetic phenomena, such as the speed of light in vacuum, in a satisfactory
manner. The physics and the associated Minkowskian geometry distinguish a
special class of frames, the inertial frames to which the principle of relativity is
applied, and as such the resulting theory is known as special relativity.

1.1.1. Galilean transformations. Inertial frames of reference (and corresponding
inertial observers) are those for which Newton’s first law, the law of inertia,
holds: objects will move with constant velocity unless acted upon by a force. As
this can be interpreted to be the definition of an inertial frame, Newton’s first law
can be read as asserting the existence of an inertial frame of reference. Newton’s
law of inertia incorporates Descartes’ first two laws of motion, which had in turn
modified Galileo’s law of inertia, and while all of these appear to echo an idea
from Aristotle’s Physics (IV, 8), the law of inertia in fact overturns Aristotelian
law, which maintained that objects which are not acted upon by a force should
naturally come to rest.
The principle of relativity of Galileo and Newton asserts that mechanics should
look the same to all inertial observers: all inertial systems are equivalent for the
formulation of the laws of mechanics. From Newton’s law of inertia, it is clear
that an observer in uniform motion with respect to an inertial observer is also
an inertial observer. As reference frames yield coordinate charts for spacetime,
we are interested in those coordinate changes that correspond to comparisons of
measurements made by two inertial observers.
If we assume that the universe is endowed with some Newtonian time function
that measures absolute time intervals between events, then making a simple time
translation to coordinate the time for two observers, we can arrange that their time
functions agree. A rigid frame of reference then corresponds to a family of curves
foliating spacetime, (the world lines of) observers at rest with respect to the
frame, so that the points on the curves at any fixed time form a three-dimensional
Euclidean space, with distances between observers independent of time. If we
pick an observer and choose Cartesian axes at a given time, we can build a
coordinate chart for spacetime adapted to the frame in a natural way, with the
chosen observer at the origin of spatial coordinates at each time. The coordinate
transformation relating two such charts obtained by two rigid frames O and Õ in
uniform motion with respect to one another is a Galilean transformation, with
one frame being inertial if and only if the other one is.
We can write the transformation that relates the coordinates (t, x) of an event
(a point in spacetime) in one frame O to the coordinates (t̃, x̃) in a frame Õ
moving at constant velocity v with respect to O, arranging that the spacetime
origins agree, and the Cartesian axes coincide at time t = 0:

t˜ = t, x̃ = x − vt.

If we also arrange the relative velocity to lie along the x-axis, then the transfor-
mation becomes, with v = v ∂/∂ x,

t˜ = t, x̃ = x − vt, ỹ = y, z̃ = z. (1.1.1)

Moreover, if Ô is an observer moving with constant velocity w with respect to
Õ, then it is elementary to obtain the Galilean law of addition of velocities:

x = x̃ + vt = x̂ + wt + vt = x̂ + (v + w)t. (1.1.2)

We see that relative velocities satisfy a very simple addition rule.
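The addition rule (1.1.2) can be checked by composing two Galilean changes of coordinates. This is a minimal sketch in one space dimension; the function name `galilean` and the sample numbers are illustrative assumptions.

```python
def galilean(event, v):
    """Map coordinates (t, x) to those of a frame moving at velocity v, as in (1.1.1)."""
    t, x = event
    return (t, x - v * t)

v, w = 3.0, 4.0        # illustrative relative velocities
event = (2.0, 10.0)    # coordinates (t, x) of an event in the first frame

# Change frames twice (velocity v, then velocity w relative to the second frame) ...
step_by_step = galilean(galilean(event, v), w)
# ... or once, with the summed velocity v + w.
single_step = galilean(event, v + w)
```

Both routes give the same coordinates, which is the content of the Galilean law of addition of velocities.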


Consider a curve γ : I ⊂ R → S parametrized by Newtonian time. If we let
x(t) be the spatial components of γ(t) in O, and let x̃(t) be likewise in Õ, then
x̃(t) = x(t) − vt. Therefore, if a prime denotes a time derivative, x̃′(t) = x′(t) − v,
and x̃′′(t) = x′′(t). Hence the acceleration a(t) = x′′(t) of the path γ is the same
as measured in either frame.
Suppose γ is the path of an object of mass m, an observer-independent quantity
that we take to be independent of t, so that the momentum p(t) = m x′(t) in O
differs from p̃(t) = m x̃′(t) in Õ by a constant. Newton’s second law of motion
states that the net force F on the object due to physical interactions equals the
time rate of change of its momentum, as measured in an inertial frame, which
becomes the familiar F = ma; to obtain an analogous equation in a non-inertial
frame, a fictitious force must be added to balance the frame acceleration. It
seems reasonable that the net interaction forces should be observer-independent,
as would be the case when the force between objects is a function of their relative
separation and relative velocity. Therefore Newton’s second law of motion holds
in all inertial frames if it holds in one inertial frame.
Einstein’s foundational 1905 paper “On the electrodynamics of moving bodies”
[82] emphasizes how the incompatibility of electromagnetism and the Galilean
transformations led to the reformulation of mechanics. There are of course
very fundamental issues in interpreting electromagnetism. Nineteenth-century
experiments revealed that a magnetic field is generated by charges in motion.
The Lorentz force law $F = q\big(E + \tfrac{v}{c}\times B\big)$ (written in Gaussian or cgs units,
with c the speed of light in vacuum) determines the force on a charge q moving
at velocity v in an electromagnetic field. For example, consider a charge q
which moves across the field lines of a stationary magnet. The moving charge
experiences a magnetic force from the Lorentz force law. On the other hand, if
we switch to a frame moving with the charge, then the charge q to which we
apply the Lorentz force law is stationary and the magnet is moving. As such,
the charge experiences a force from an electric field induced (Faraday’s law) by
the changing magnetic field moving past q. In the end, of course, the physical
predictions are the same in each case; it is only the interpretation that differs.
Einstein looked for a fundamental explanation of this in terms of relativity, that
the laws of electromagnetism should have the same form in all inertial frames.
A consequence of Maxwell’s equations for electromagnetism is that light
travels according to a wave equation, the speed of which can be determined. Of
course, this raises the question: the speed relative to what? And what would be
the medium capable of transmitting electromagnetic disturbances at such a great
speed, while seeming transparent to the motion of the earth through it? Attempts
such as the Michelson–Morley experiment in the late nineteenth century failed to
find the medium, a preferred reference frame (which was called the ether frame),
with respect to which light in vacuum travels at speed c, roughly $3\times 10^8$ meters
per second. Under the Galilean transformations, inertial observers in relative
motion with respect to the ether frame would have different measurements of the
value of the speed of light. That this was not observed in experiments caused
quite a quandary. Classical results on stellar aberration along with the Fizeau
experiment supplied evidence against an ether drag theory that the ether moves
along with massive bodies. Other experiments ruled out theories that were
consistent with the null result of Michelson–Morley, for example the Lorentz
contraction hypothesis, which on its own cannot account for the result of the
Kennedy–Thorndike experiment; see [96; 137; 188], for instance, for more details
on these experiments. The ether theory embraced the notion that the principle of
relativity did not apply to electromagnetism, in the sense that the ether frame is
a preferred frame of reference for the theory. That there were problems with this
led Einstein to postulate that relativity does apply to electromagnetism. Thus,
although Newtonian dynamics works well with the Galilean transformations, for
relativity to apply to electromagnetism, the Galilean transformations required
modification.
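For some foreshadowing, consider frames of reference O and Õ with constant
relative velocity, with respective coordinates related as in (1.1.1). Suppose a
function ψ : $\mathbb{R}^4 \to \mathbb{R}$ satisfies the wave equation (with wave speed c) in O, in
the sense that

\[ \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}. \]

Then since
\[ \frac{\partial}{\partial t} = \frac{\partial\tilde t}{\partial t}\frac{\partial}{\partial\tilde t} + \frac{\partial\tilde x}{\partial t}\frac{\partial}{\partial\tilde x} = \frac{\partial}{\partial\tilde t} - v\frac{\partial}{\partial\tilde x} \]
and
\[ \frac{\partial}{\partial x} = \frac{\partial\tilde t}{\partial x}\frac{\partial}{\partial\tilde t} + \frac{\partial\tilde x}{\partial x}\frac{\partial}{\partial\tilde x} = \frac{\partial}{\partial\tilde x}, \]
we see that $\Psi(\tilde t, \tilde x) := \psi(t, x)$ satisfies

\[ \frac{1}{c^2}\Big(\frac{\partial}{\partial\tilde t} - v\frac{\partial}{\partial\tilde x}\Big)^2\Psi = \frac{\partial^2\Psi}{\partial\tilde x^2} + \frac{\partial^2\Psi}{\partial\tilde y^2} + \frac{\partial^2\Psi}{\partial\tilde z^2}. \]
This can be rewritten as (assuming ψ is $C^2$)
\[ \frac{1}{c^2}\frac{\partial^2\Psi}{\partial\tilde t^2} - 2\frac{v}{c^2}\frac{\partial^2\Psi}{\partial\tilde t\,\partial\tilde x} = \Big(1 - \frac{v^2}{c^2}\Big)\frac{\partial^2\Psi}{\partial\tilde x^2} + \frac{\partial^2\Psi}{\partial\tilde y^2} + \frac{\partial^2\Psi}{\partial\tilde z^2}. \]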
We see that the wave equation is not invariant under Galilean transformations.
This is perfectly reasonable for mechanical waves, but the homogeneous wave
equation governing light propagation in vacuum is a consequence of Maxwell’s
equations, and we have seen that experiments indicate that light travels at the
same speed in vacuum for all inertial observers.
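The failure of Galilean invariance can be seen numerically on a single solution. The sketch below assumes, for illustration only, units with c = 1, a relative speed v = 0.5, and the right-moving wave ψ(t, x) = sin(x − ct) with no dependence on y or z; second derivatives are approximated by central differences, and the helper names are hypothetical.

```python
import math

c, v = 1.0, 0.5   # illustrative units and relative speed

def psi(t, x):
    """A right-moving solution of the wave equation in frame O."""
    return math.sin(x - c * t)

def Psi(tt, xt):
    """The same wave in Galilean coordinates, where t = tt and x = xt + v * tt."""
    return psi(tt, xt + v * tt)

def second_partials(f, a, b, h=1e-3):
    """Central-difference estimates of (f_aa, f_ab, f_bb) at the point (a, b)."""
    f_aa = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    f_bb = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    f_ab = (f(a + h, b + h) - f(a + h, b - h)
            - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return f_aa, f_ab, f_bb

P_tt, P_tx, P_xx = second_partials(Psi, 0.3, 0.7)

# Residual of the standard wave equation in the moving coordinates: clearly nonzero.
standard_residual = P_tt / c**2 - P_xx
# Residual of the transformed equation with the mixed-derivative term: approximately zero.
transformed_residual = (P_tt / c**2 - 2 * (v / c**2) * P_tx
                        - (1 - v**2 / c**2) * P_xx)
```

In the moving coordinates this wave is sin(x̃ − (c − v)t̃), so it travels at speed c − v rather than c, which is why the standard residual does not vanish.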
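However, it is not too hard to play around with the transformation so as to
coax the preceding equation into the standard wave equation form. Namely, let
\[ \tilde t = \frac{1}{\sqrt{1 - (v/c)^2}}\Big(t - \frac{v}{c^2}x\Big), \tag{1.1.3} \]
\[ \tilde x = \frac{1}{\sqrt{1 - (v/c)^2}}\,(x - vt) \tag{1.1.4} \]
replace the Galilean coordinate change. Then, as we did above, we obtain
\[ \frac{\partial}{\partial t} = \frac{\partial\tilde t}{\partial t}\frac{\partial}{\partial\tilde t} + \frac{\partial\tilde x}{\partial t}\frac{\partial}{\partial\tilde x} = \frac{1}{\sqrt{1 - (v/c)^2}}\Big(\frac{\partial}{\partial\tilde t} - v\frac{\partial}{\partial\tilde x}\Big), \]
\[ \frac{\partial}{\partial x} = \frac{\partial\tilde t}{\partial x}\frac{\partial}{\partial\tilde t} + \frac{\partial\tilde x}{\partial x}\frac{\partial}{\partial\tilde x} = \frac{1}{\sqrt{1 - (v/c)^2}}\Big(\frac{\partial}{\partial\tilde x} - \frac{v}{c^2}\frac{\partial}{\partial\tilde t}\Big). \]
Thus we find
\[ \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \frac{1}{c^2}\cdot\frac{1}{1 - (v/c)^2}\Big(\frac{\partial^2}{\partial\tilde t^2} - 2v\frac{\partial^2}{\partial\tilde t\,\partial\tilde x} + v^2\frac{\partial^2}{\partial\tilde x^2}\Big), \]
\[ \frac{\partial^2}{\partial x^2} = \frac{1}{1 - (v/c)^2}\Big(\frac{\partial^2}{\partial\tilde x^2} - 2\frac{v}{c^2}\frac{\partial^2}{\partial\tilde t\,\partial\tilde x} + \frac{1}{c^2}\Big(\frac{v}{c}\Big)^2\frac{\partial^2}{\partial\tilde t^2}\Big) \]
as operators on $C^2$ functions, and therefore,

\[ \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2}{\partial\tilde t^2} - \frac{\partial^2}{\partial\tilde x^2}. \]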
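The invariance of the wave operator under (1.1.3)–(1.1.4) can be checked on one solution. This minimal sketch again assumes units with c = 1, a relative speed v = 0.5, and the sample wave ψ(t, x) = sin(x − ct); the inverse change of variables used in `Psi` is obtained by solving (1.1.3)–(1.1.4) for t and x, and the helper names are illustrative.

```python
import math

c, v = 1.0, 0.5   # illustrative units and relative speed
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def psi(t, x):
    """A right-moving solution of the wave equation in the (t, x) coordinates."""
    return math.sin(x - c * t)

def Psi(tt, xt):
    """psi written in the coordinates (1.1.3)-(1.1.4), via the inverse change of variables."""
    t = gamma * (tt + v * xt / c**2)
    x = gamma * (xt + v * tt)
    return psi(t, x)

def wave_residual(f, a, b, h=1e-3):
    """Central-difference estimate of (1/c^2) f_aa - f_bb at the point (a, b)."""
    f_aa = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    f_bb = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    return f_aa / c**2 - f_bb

residual_original = wave_residual(psi, 0.3, 0.7)
residual_transformed = wave_residual(Psi, 0.3, 0.7)
```

Both residuals vanish up to discretization error: in the new coordinates the wave depends only on x̃ − ct̃, so it still travels at speed c.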
We have found a coordinate change for which the wave equation is preserved,
but at what cost? Especially disturbing is that t˜ depends not only on t, but
on x and v as well! In fact, we will derive these equations from applying a
few simple fundamental principles, as Einstein did. Namely, we combine the
principle of relativity (the Galilean/Newtonian principle for mechanics, extended
by Einstein to encompass electromagnetism), that physical laws should have
the same form in all inertial frames, along with Einstein’s postulate that the
speed of light in a vacuum is a physical law, and thus must be the same for all
inertial observers. This immediately obviates the need for an ether, and gives
rise to striking predictions that are consistent with experimental results. For
example, Lorentz contraction can account for the result of Michelson–Morley,
and together with time dilation can explain the result of Kennedy–Thorndike;
we will derive both of these predictions in Section 1.2.3. (For a derivation of the
Lorentz transformation based directly on experimental results, see [192].)

1.1.2. Deriving the Lorentz transformations. The above coordinate change


(1.1.3)–(1.1.4) was motivated by preserving the wave equation for electromag-
netic propagation. We will now derive this Lorentz transformation of coordinates
between inertial observers from a more fundamental standpoint, as well as
indicate the transformation rule for electromagnetic fields, from which one may
confirm that Maxwell’s equations are invariant under change of inertial frames.
As such, Maxwell’s equations for electromagnetism are seen to be physical laws
which are in accordance with the principle of relativity. As we will see, however,
the laws of mechanics require some modification from those of Newton.
We assume there exists an inertial frame of reference, in which the law of
inertia holds. We assume in particular that using such a frame, we construct a
coordinate chart for spacetime S, a diffeomorphism between S and $\mathbb{R}^4$, assigning
time and space coordinates (t, x) to points in spacetime, so that the Euclidean
distance between two points in any $\{t_0\}\times\mathbb{R}^3$ is construed as measuring the
spatial distance between the corresponding events, while $\Delta t = t_1 - t_0$ is the time
difference between events with respective coordinates in $\{t_0\}\times\mathbb{R}^3$ and $\{t_1\}\times\mathbb{R}^3$.
For each fixed $x_0 \in \mathbb{R}^3$, we obtain an observer at rest in the frame, given by events
corresponding to points $(t, x_0)$. In fact, for each such observer, we can construct
a coordinate chart (in fact many, with freedom in choosing a Cartesian frame).
For any inertial frame, and for any of these observers, the law of inertia holds,
and so we refer to these observers as inertial observers, and we refer to any


such set of coordinates built from an inertial frame as inertial coordinates. (Note
that we have chosen to restrict inertial coordinates to be Cartesian on constant
time slices $\{t_0\}\times\mathbb{R}^3$, whereas one might naturally consider inertial spherical or
cylindrical coordinates on such slices.) Given a coordinate chart, we will often
blur the distinction between points (or displacements) in S and their coordinates
(or their coordinate displacement vectors) in $\mathbb{R}^4$. Given two inertial observers at
rest in this frame, there are inertial coordinates for each for which the change of
coordinates is a simple spacetime translation. We build in the assumption that at
all points in an inertial frame, clocks at rest in the frame run at the same rate,
and that any two inertial observers agree on the future direction of time.
Inherent in this framework, deriving from the equivalence of inertial observers,
is the homogeneity and isotropy of this spacetime: no point is intrinsically
different from any other, and there are no preferred spacetime directions. (See
[174, pp. 257–260] for a careful treatment of these notions and relations between
them.) Indeed, with respect to any inertial observer, the spaces $\{t\}\times\mathbb{R}^3$ have no
preferred direction, consistent with the isotropy of light propagation; moreover,
the direction of the spacetime path of any given inertial observer should not be
preferred over that of any other, whether at rest or in uniform motion relative to
each other. To compare two inertial frames, then, we might as well arrange that
the origins in each inertial coordinate system (and we reiterate that each inertial
coordinate system is built on Cartesian coordinates for its constant-time spatial
slices) correspond to the same point in S . Uniform motion with respect to an
inertial observer is given by a straight-line spacetime path (with nonconstant time)
in the corresponding coordinate system, and should therefore be a straight line in
the other coordinate system (by the law of inertia and the principle of relativity);
actually, as we will see, there are many such spacetime lines that cannot represent
paths of physical particles. If nonetheless we propose that all lines map to lines
under the coordinate change, and if we assume the coordinate transformation is
continuous, we can conclude with a little work that the transformation must be
linear (recall that the origins match up, and see Exercise 1-4). Another way to
argue for linearity is based on the homogeneity and isotropy of spacetime: there
should be no preferred directions or points. A nonlinear change of coordinates
would not preserve displacements along some line parallel to an axis (space or
time), which would violate both of these properties. One could also postulate
linearity for simplicity’s sake, or in the sense of a Taylor approximation, and see
what happens; as we are about to see, linear maps exist which do the job!
There are various subtleties in physically setting up an inertial coordinate


system, especially as regards synchronization of clocks (which at rest all run
at the same rate), but we do not go into details here. Clearly, however, if two
inertial observers are not in relative motion, then after synchronization of clocks,
and up to an application of a spatial orthogonal transformation, the coordinates
should simply differ by a spatial translation. On the other hand, when there
is nontrivial relative motion, we will soon see that two inertial observers will
generally not agree whether two events are simultaneous.
That notwithstanding, we note that if in one inertial frame O, two events $E_1$
and $E_2$ are simultaneous and with spatial displacement orthogonal to the direction
of motion of an inertial frame Õ, then those events are also simultaneous in Õ.
Indeed, one can imagine a clock in Õ which will pass halfway between the two
events in space; from this clock, send light signals (at the same time) which will
arrive at the events $E_1$ and $E_2$ and be reflected back. Such signals will arrive
back at the clock (moving orthogonally to the separation of the two events in O)
at the same event, and so Õ will also measure $E_1$ and $E_2$ as being simultaneous.
These events should also have the same distance measurement in each frame.
Indeed, suppose the two points differ by a spatial displacement of unit size in
one frame of reference O. If an inertial observer from Õ moving orthogonally
to this displacement were to measure the spatial displacement of the points to
be, say, less than 1, then by reversing the roles (with no preferred direction, the
results should agree), we would have to conclude that O should measure the
displacement to be less than that measured by Õ, and that cannot be the case!
Note that embedded in this is a notion of how to measure spatial displacement
between events that are simultaneous, as is the case with $E_1$ and $E_2$ in both
frames.
The paths of light rays emanating from the origin comprise the lightcone or
nullcone, which can be represented in coordinates as $\Sigma := \{(t, x) : (ct)^2 = |x|^2\}$.
Lightcones can emanate from any point, and in fact you might instead consider the
displacement $(\Delta t, \Delta x)$ between two points in spacetime along a light ray, which
must satisfy $\Delta s^2 := -c^2(\Delta t)^2 + |\Delta x|^2 = 0$. Applying the Einstein postulate that
the speed of light in vacuum is the same for all inertial observers, we require that
the linear transformation between inertial coordinates should preserve lightcones.
If we set $\Delta x^0 = c\,\Delta t$ and $\Delta x = (\Delta x^1, \Delta x^2, \Delta x^3)$, and let $(\Lambda^{\mu}_{\ \nu})$ be the matrix
for the linear transformation between inertial coordinates, with the rescaled
time coordinate $x^0 = ct$, we have, using the Einstein summation convention of
summing over repeated upper and lower indices, $\Delta x^\mu = \Lambda^{\mu}_{\ \nu}\,\Delta\tilde x^\nu$. Then with
$\eta_{00} = -1$, $\eta_{0i} = 0$ and $\eta_{ij} = \delta_{ij}$ for i, j ∈ {1, 2, 3}, we have

\[ \Delta s^2 = \Delta x^\mu\,\eta_{\mu\nu}\,\Delta x^\nu = \Delta\tilde x^\lambda\,\Lambda^{\mu}_{\ \lambda}\,\eta_{\mu\nu}\,\Lambda^{\nu}_{\ \sigma}\,\Delta\tilde x^\sigma = \Delta\tilde x^\lambda\,\tilde\eta_{\lambda\sigma}\,\Delta\tilde x^\sigma, \]

where we let $\tilde\eta_{\lambda\sigma} = \Lambda^{\mu}_{\ \lambda}\,\eta_{\mu\nu}\,\Lambda^{\nu}_{\ \sigma}$; note immediately that $\tilde\eta_{\lambda\sigma} = \tilde\eta_{\sigma\lambda}$. We know that
$\Delta s^2$ vanishes whenever $\Delta\tilde s^2 = 0$; for example, if $\Delta\tilde x^0 = 1$ and $\Delta\tilde x^1 = \pm 1$, with
$\Delta\tilde x^2 = 0 = \Delta\tilde x^3$, we get $0 = \tilde\eta_{00} + \tilde\eta_{11} \pm 2\tilde\eta_{01}$. Thus $\tilde\eta_{01} = 0$; likewise $\tilde\eta_{02} = 0 = \tilde\eta_{03}$
and moreover $\tilde\eta_{00} = -\tilde\eta_{11} = -\tilde\eta_{22} = -\tilde\eta_{33}$. In general for $\Delta\tilde x^0 = |\Delta\tilde x|$, we have

\[ 0 = \Delta s^2 = \tilde\eta_{00}(\Delta\tilde x^0)^2 + \tilde\eta_{ij}\,\Delta\tilde x^i\Delta\tilde x^j = \tilde\eta_{00}\big((\Delta\tilde x^0)^2 - |\Delta\tilde x|^2\big) + \sum_{i\neq j}\tilde\eta_{ij}\,\Delta\tilde x^i\Delta\tilde x^j, \]

so that $0 = \sum_{i\neq j}\tilde\eta_{ij}\,\Delta\tilde x^i\Delta\tilde x^j$. By choosing $\Delta\tilde x^1 = 1 = \Delta\tilde x^2$ and $\Delta\tilde x^3 = 0$, and
$\Delta\tilde x^0 = |\Delta\tilde x|$, we infer that $\tilde\eta_{12} = 0$, and likewise that $\tilde\eta_{ij} = 0$ for i ≠ j. We thus
have $\tilde\eta_{\lambda\sigma} = -\tilde\eta_{00}\,\eta_{\lambda\sigma}$, and conclude in general that $\Delta s^2 = -\tilde\eta_{00}\,\Delta\tilde s^2$. In other words, that lightcones must
be preserved implies that the spacetime intervals $\Delta s^2$ in two inertial coordinate
systems must agree up to a multiplicative constant. By the principle of relativity
this multiplicative constant must equal ±1 (as it must be the same in going from
frame O to Õ or going in reverse: only the relative motion should matter). On the
other hand, we know that certain non-vanishing spacetime intervals are preserved
from one observer to another (such as simultaneous events orthogonal to the
direction of motion, as discussed above), so that we must have $-\tilde\eta_{00} = 1$. We see
that $\Delta s^2 = \Delta\tilde s^2$, and $\tilde\eta_{\mu\nu} = \eta_{\mu\nu}$: the coordinate change is a linear transformation
that preserves $\eta = -(dx^0)^2 + \sum_{i=1}^{3}(dx^i)^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2$. Such
linear maps are Lorentz transformations, and we want to get our hands on them
by writing some down explicitly.
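As a preview of what such a map looks like, the condition $\Lambda^{\mu}_{\ \lambda}\,\eta_{\mu\nu}\,\Lambda^{\nu}_{\ \sigma} = \eta_{\lambda\sigma}$ can be checked numerically. The sketch below assumes the boost along the x-axis in the form reached later in this section, written in the coordinates (x⁰ = ct, x, y, z) with β = v/c = 0.6 as a sample value; the function names are illustrative.

```python
import math

# Components eta_{mu nu} in the coordinates (x^0 = ct, x, y, z).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def boost_x(beta):
    """Matrix of a boost along the x-axis with beta = v/c (assumed form, derived later)."""
    gam = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gam, gam * beta, 0.0, 0.0],
            [gam * beta, gam, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def pulled_back_metric(lam):
    """Return eta_tilde[a][b] = sum over m, n of lam[m][a] * ETA[m][n] * lam[n][b]."""
    idx = range(4)
    return [[sum(lam[m][a] * ETA[m][n] * lam[n][b] for m in idx for n in idx)
             for b in idx] for a in idx]

def interval(dx):
    """Return dx^mu eta_{mu nu} dx^nu for a displacement dx."""
    return sum(dx[m] * ETA[m][n] * dx[n] for m in range(4) for n in range(4))

lam = boost_x(0.6)
eta_tilde = pulled_back_metric(lam)

# A null displacement in the tilde coordinates and its image under the matrix.
dx_tilde = [1.0, 1.0, 0.0, 0.0]
dx = [sum(lam[m][n] * dx_tilde[n] for n in range(4)) for m in range(4)]
null_interval = interval(dx)
```

Here `eta_tilde` reproduces `ETA` up to rounding, and the null displacement stays null, as the lightcone argument requires.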
We can set up axes with the direction of relative motion along the x-axis in
each coordinate system, oriented so that a photon path along the x-axis (fixed y
and z) given by x = ct corresponds to x̃ = ct̃. Based on the assumption that there
is no preferred direction orthogonal to the direction of motion, and recalling the
above argument regarding such orthogonal spatial displacements, it follows that
we can arrange ỹ = y and z̃ = z. Similar symmetry considerations show that
the coordinates t̃ and x̃ of an event should be independent of y and z, and thus
depend on t and x only. Arguing from the principle of relativity, if Õ is moving
with velocity v = v ∂/∂x with respect to O, then O is moving with velocity −v
with respect to Õ (which can also be readily seen by letting x = 0 and using
(1.1.5) below). As the transformation between coordinates should depend only
on the relative velocity, if $T_v$ maps the coordinates of Õ to those of O, then we
should have $T_v^{-1} = T_{-v}$ (compare Einstein’s derivation in [82]).
We have reduced the problem to a two-dimensional one, with relative motion
in the x-direction, and with the change of coordinates giving a map between the
(t̃, x̃) and (t, x) coordinates of events in spacetime. Moreover, a vector along the
lightcone in the (t̃, x̃)-plane should map to the lightcone in the (t, x)-plane. We
let the reduced two-dimensional mapping with v = v ∂/∂x be $T_v$. Using standard
bases we represent the linear transformation by a matrix $[T_v] = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, so that

\[ t = a_{11}\tilde t + a_{12}\tilde x, \qquad x = a_{21}\tilde t + a_{22}\tilde x. \]

From here, we could use that this reduced map must preserve η (a corollary of
the fact shown above that the full mapping preserves η) to get relations between
the coefficients; see [188], for instance, and compare [84, Appendix 1]. Instead
we will follow [38], proceeding to highlight the preservation of lightcones, and
in the process showing directly that η is preserved in spacetime dimension two.
Now x̃ = 0 is the path of the observer moving with respect to O with velocity
v, and is given by $t = a_{11}\tilde t$, $x = a_{21}\tilde t$, so that $v = a_{21}/a_{11}$. Now we apply the
invariance of the lightcone: the path x = ±ct should map to x̃ = ±ct̃ (with
respective signs corresponding by orientation), which implies that the vectors
$\begin{pmatrix} 1 \\ c \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -c \end{pmatrix}$ are eigenvectors of $T_v$; by orientation again, the eigenvalues should
be positive. This implies there are respective eigenvalues $\lambda_\pm > 0$ such that

\[ a_{11} \pm c\,a_{12} = \lambda_\pm, \qquad a_{21} \pm c\,a_{22} = \pm c\,\lambda_\pm. \]

Thus $c\,a_{11} + c^2a_{12} = a_{21} + c\,a_{22}$, and $c\,a_{11} - c^2a_{12} = -a_{21} + c\,a_{22}$. Adding and
subtracting these two equations yields $a_{11} = a_{22}$ and $a_{21} = c^2a_{12}$, from which we
deduce $v/c^2 = a_{12}/a_{11}$. We then see the matrix for $T_v$ has the following form,
with $\alpha(v) := a_{11}$:
\[ [T_v] = \begin{pmatrix} a_{11} & a_{12} \\ c^2a_{12} & a_{11} \end{pmatrix} = \alpha(v)\begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix}. \tag{1.1.5} \]

Now $0 < \lambda_+\lambda_- = \det[T_v] = (\alpha(v))^2\big(1 - (v/c)^2\big)$, so |v| < c: the relative speed
of two inertial observers is less than that of light. We also see α(v) > 0, because
$2\alpha(v) = \operatorname{tr}[T_v] = \lambda_+ + \lambda_- > 0$.
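The eigenvector computation can be verified on the form (1.1.5) with sample numbers. In this minimal sketch the values of c, v and α are arbitrary illustrative choices (α is left undetermined at this stage of the argument), and the closed form $\lambda_\pm = \alpha(v)(1 \pm v/c)$ is read off from $a_{11} \pm c\,a_{12} = \lambda_\pm$.

```python
c = 3.0       # illustrative units
v = 1.5       # sample relative speed with |v| < c
alpha = 2.0   # any positive value; its actual value is fixed later in the argument

# The matrix (1.1.5): alpha * [[1, v/c^2], [v, 1]].
T = [[alpha, alpha * v / c**2],
     [alpha * v, alpha]]

def apply(matrix, vec):
    """Multiply a 2x2 matrix by a column vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]

lam_plus = alpha * (1 + v / c)    # eigenvalue for the vector (1, c)
lam_minus = alpha * (1 - v / c)   # eigenvalue for the vector (1, -c)

image_plus = apply(T, [1.0, c])
image_minus = apply(T, [1.0, -c])
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
```

The product `lam_plus * lam_minus` agrees with `det_T`, which is the identity $\lambda_+\lambda_- = (\alpha(v))^2(1 - (v/c)^2)$ used to conclude |v| < c.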
We now apply a symmetry argument similar to one we used earlier. Start with
adapted coordinates (t, x) and (t̃, x̃) as above for two inertial frames of reference
O and Õ, respectively. We also consider adapted coordinates (t′, x′) = (t, −x)
and (t̂, x̂) = (t̃, −x̃) related to the original coordinates by the parity operator
P(τ, ξ) = (τ, −ξ). The axes in these coordinates are consistently aligned, and
frame Õ moves with velocity v = v ∂/∂x = −v ∂/∂x′ with respect to O. So
the transformation from (t̂, x̂) to (t′, x′), which like $T_v$ represents the identity
map on spacetime, should be given by $T_{-v}$ (consistent with isotropy and the
principle of relativity). Thus if $[T_{-v}]$ is the matrix representing this coordinate
transformation in the respective standard coordinate bases, then with $[P] = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$,
we see $[P][T_{-v}][P] = [T_v]$. From this we conclude $\det[T_{-v}] = \det[T_v]$, but from
$[T_v]^{-1} = [T_{-v}]$, we see $\det[T_v] = \pm 1$, and thus $\det[T_v] = 1$ since we know the
determinant is positive.
Together with the result of the last paragraph, we get
$$\alpha(v) = \frac{1}{\sqrt{1 - (v/c)^2}},$$
and we note then that α(v) = α(−v). One could have argued α(v) = α(−v)
from the principle of relativity, from which the value of α(v) then follows using
$[T_v]^{-1} = [T_{-v}]$ and (1.1.5). In any case, we have
$$[T_v] = \frac{1}{\sqrt{1 - (v/c)^2}} \begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix}, \qquad [T_{-v}] = \frac{1}{\sqrt{1 - (v/c)^2}} \begin{pmatrix} 1 & -v/c^2 \\ -v & 1 \end{pmatrix}.$$
Hence we have arrived at the Lorentz transformation
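$$t = \frac{1}{\sqrt{1 - (v/c)^2}} \left( \tilde{t} + \frac{v}{c^2}\,\tilde{x} \right), \qquad x = \frac{1}{\sqrt{1 - (v/c)^2}} \left( v\tilde{t} + \tilde{x} \right),$$
$$\tilde{t} = \frac{1}{\sqrt{1 - (v/c)^2}} \left( t - \frac{v}{c^2}\,x \right), \qquad \tilde{x} = \frac{1}{\sqrt{1 - (v/c)^2}} \left( -vt + x \right). \tag{1.1.6}$$
Note that if we change the first variable to $x^0 := ct$, so that the coordinates have
the same units, we have
$$x^0 = \frac{1}{\sqrt{1 - (v/c)^2}} \left( \tilde{x}^0 + \frac{v}{c}\,\tilde{x} \right), \qquad x = \frac{1}{\sqrt{1 - (v/c)^2}} \left( \frac{v}{c}\,\tilde{x}^0 + \tilde{x} \right),$$
$$\tilde{x}^0 = \frac{1}{\sqrt{1 - (v/c)^2}} \left( x^0 - \frac{v}{c}\,x \right), \qquad \tilde{x} = \frac{1}{\sqrt{1 - (v/c)^2}} \left( -\frac{v}{c}\,x^0 + x \right).$$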

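Let β = v/c, so that the matrix for $T_v$ relative to the bases $\left\{ \frac{1}{c}\frac{\partial}{\partial \tilde{t}}, \frac{\partial}{\partial \tilde{x}} \right\}$ and
$\left\{ \frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x} \right\}$ is then just
$$[T_v] = \frac{1}{\sqrt{1 - \beta^2}} \begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}, \quad \text{and likewise} \quad [T_{-v}] = \frac{1}{\sqrt{1 - \beta^2}} \begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}.$$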

For −1 < β = v/c < 1, there is a unique θ such that


$$\sinh\theta = \frac{\beta}{\sqrt{1 - \beta^2}} = \frac{v/c}{\sqrt{1 - (v/c)^2}}.$$
Then $\cosh\theta = 1/\sqrt{1 - \beta^2} = 1/\sqrt{1 - (v/c)^2}$ (since $\cosh^2\theta - \sinh^2\theta = 1$ and
$\cosh\theta > 0$). Thus we can write from the above
$$[T_{-v}] = \begin{pmatrix} \cosh\theta & -\sinh\theta \\ -\sinh\theta & \cosh\theta \end{pmatrix},$$
so that if θ corresponds to v, −θ corresponds to −v. You might observe the
appearance of the hyperbolic functions is not surprising, given that the level sets
of the function $-c^2 t^2 + x^2$ are invariant hyperbolae: a Lorentz transformation
preserves the value of this function, i.e., $-c^2 \tilde{t}^2 + \tilde{x}^2 = -c^2 t^2 + x^2$ when $(\tilde{t}, \tilde{x})$
and (t, x) are related by a Lorentz transformation.
We can use this to find the velocity addition rule in special relativity: if θ1 and
$\theta_2$ correspond to $v_1$ and $v_2$, then we note that if we consider frame $\tilde{O}$ moving
along the x-axis of O at velocity $v_1$, and $\hat{O}$ moving along the $\tilde{x}$-axis of $\tilde{O}$ at
velocity $v_2$, then the transformation T which takes coordinates measured in $\hat{O}$ to
those in O satisfies the following:
$$[T] = [T_{v_1}][T_{v_2}] = \begin{pmatrix} \cosh\theta_1 & \sinh\theta_1 \\ \sinh\theta_1 & \cosh\theta_1 \end{pmatrix} \begin{pmatrix} \cosh\theta_2 & \sinh\theta_2 \\ \sinh\theta_2 & \cosh\theta_2 \end{pmatrix} = \begin{pmatrix} \cosh(\theta_1 + \theta_2) & \sinh(\theta_1 + \theta_2) \\ \sinh(\theta_1 + \theta_2) & \cosh(\theta_1 + \theta_2) \end{pmatrix} = [T_v],$$
where v corresponds to $\theta = \theta_1 + \theta_2$. Thus the transformation between frames is
precisely that which has $\hat{O}$ moving along the x-axis relative to O at velocity v.
We see that the set of Lorentz transformations corresponding to inertial observers
in relative motion along a common axis has an elementary group structure.
Moreover, we also have from the above,
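$$\frac{v}{c} = \tanh\theta = \tanh(\theta_1 + \theta_2) = \frac{\tanh\theta_1 + \tanh\theta_2}{1 + \tanh\theta_1 \tanh\theta_2} = \frac{\frac{v_1}{c} + \frac{v_2}{c}}{1 + \frac{v_1 v_2}{c^2}}.$$
Thus the Galilean rule (1.1.2) for the addition of velocities, $v = v_1 + v_2$, is
replaced by
$$v = \frac{v_1 + v_2}{1 + \frac{v_1 v_2}{c^2}}.$$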
Note that for |v1 |, |v2 | < c, we have (1 ± v1 /c) (1 ± v2 /c) > 0, from which it
follows that |v| < c. As expected, for |v| ≪ c, the velocity addition rule, and in
fact the Lorentz transformations, approximate their Galilean counterparts.
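
The composition law can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names (`rapidity`, `boost_matrix`, `matmul2`, `add_velocities`) and the sample velocities 0.6c and 0.7c are illustrative assumptions. It multiplies the two hyperbolic matrices and compares the product with the single boost at the velocity given by the addition rule.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def rapidity(v, c=C):
    """Return theta with tanh(theta) = v/c; requires |v| < c."""
    return math.atanh(v / c)


def boost_matrix(theta):
    """Matrix [[cosh, sinh], [sinh, cosh]] of T_v relative to the (x^0, x) bases."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, sh], [sh, ch]]


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add_velocities(v1, v2, c=C):
    """Relativistic addition of two collinear velocities."""
    return (v1 + v2) / (1.0 + v1 * v2 / c**2)


# Illustrative sample values (hypothetical, chosen only for the check).
v1, v2 = 0.6 * C, 0.7 * C
v = add_velocities(v1, v2)

# [T_{v1}][T_{v2}] should agree with [T_v], entry by entry.
composed = matmul2(boost_matrix(rapidity(v1)), boost_matrix(rapidity(v2)))
direct = boost_matrix(rapidity(v))
max_entry_gap = max(abs(composed[i][j] - direct[i][j])
                    for i in range(2) for j in range(2))
```

Working with rapidities makes the group structure visible: composing boosts adds the parameters θ, while the velocities combine through tanh.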

1.1.3. Electromagnetism. As noted above, electromagnetism played a central
role in the development of special relativity. We saw that the wave equation,
which governs the propagation of the electric and magnetic fields in vacuum, is
invariant under Lorentz transformations. Moreover, it is a fundamental fact of
physics that the electromagnetic fields behave in such a way that the laws which
govern electromagnetism, Maxwell’s equations, are invariant under Lorentz
transformations (sometimes referred to as special covariance of the equations).
Thus the laws of electromagnetism take the same form in all inertial frames, in
accordance with the principle of relativity.
While we will not develop the foundations of electromagnetic theory in a
relativistic framework, we should at least indicate how the electromagnetic fields
as measured in two inertial frames compare, i.e., how they transform under a
Lorentz transformation. As we saw earlier, we expect the electric and magnetic
fields to somehow transform together, since charges in motion induce and are
affected by a magnetic field, but motion is relative to the frame of reference.
In fact, the fields do transform together as an anti-symmetric two-tensor, called
the Faraday tensor. In an inertial frame O (and using $x^0 = ct$), where the electric
field has components $E^i$ and the magnetic field $B^i$, the Faraday tensor has a
component matrix
$$[F^{\mu\nu}] = \begin{pmatrix} 0 & E^1 & E^2 & E^3 \\ -E^1 & 0 & B^3 & -B^2 \\ -E^2 & -B^3 & 0 & B^1 \\ -E^3 & B^2 & -B^1 & 0 \end{pmatrix}. \tag{1.1.7}$$

We can write F as a two-form $F^\flat$ (with components down) as (where we have
$E_i = E^i$, $B_i = B^i$ in inertial coordinates)
$$F^\flat = (E_1\,dx^1 + E_2\,dx^2 + E_3\,dx^3) \wedge dx^0 + (B_1\,dx^2 \wedge dx^3 + B_2\,dx^3 \wedge dx^1 + B_3\,dx^1 \wedge dx^2).$$

How do the fields transform exactly? For example, let us compute the magnetic
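field component $\tilde{B}^3$ in another inertial frame $\tilde{O}$ moving along the x-axis at
velocity v relative to O. The associated Lorentz transformation has matrix
elements $\Lambda^\mu{}_\nu$ given by
$$[\Lambda^\mu{}_\nu] = \begin{pmatrix} \frac{1}{\sqrt{1 - (v/c)^2}} & \frac{-v/c}{\sqrt{1 - (v/c)^2}} & 0 & 0 \\ \frac{-v/c}{\sqrt{1 - (v/c)^2}} & \frac{1}{\sqrt{1 - (v/c)^2}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$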
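Hence, using the Einstein summation convention, we have
$$\begin{aligned} \tilde{B}^3 = \tilde{F}^{12} = \Lambda^1{}_\mu F^{\mu\nu} \Lambda^2{}_\nu &= \Lambda^1{}_0 F^{02} \Lambda^2{}_2 + \Lambda^1{}_1 F^{12} \Lambda^2{}_2 \\ &= -\frac{v/c}{\sqrt{1 - (v/c)^2}}\,E^2 + \frac{1}{\sqrt{1 - (v/c)^2}}\,B^3 \\ &= \frac{1}{\sqrt{1 - (v/c)^2}} \left( B^3 - \frac{v}{c}\,E^2 \right). \end{aligned}$$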
Similarly we find
$$\tilde{E}^1 = \tilde{F}^{01} = \Lambda^0{}_\mu F^{\mu\nu} \Lambda^1{}_\nu = \Lambda^0{}_0 F^{01} \Lambda^1{}_1 + \Lambda^0{}_1 F^{10} \Lambda^1{}_0 = E^1.$$

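One can check the other transformation rules:
$$\begin{aligned} \tilde{B}^1 &= B^1, \\ \tilde{B}^2 &= \frac{1}{\sqrt{1 - (v/c)^2}} \left( B^2 + \frac{v}{c}\,E^3 \right), \\ \tilde{E}^2 &= \frac{1}{\sqrt{1 - (v/c)^2}} \left( E^2 - \frac{v}{c}\,B^3 \right), \\ \tilde{E}^3 &= \frac{1}{\sqrt{1 - (v/c)^2}} \left( E^3 + \frac{v}{c}\,B^2 \right). \end{aligned}$$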
In a region with electromagnetic fields but free of charges and hence current
(for simplicity), Maxwell’s equations are as follows (in inertial coordinates, using
spatial divergence and curl):
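$$\operatorname{div} E = 0, \qquad \frac{1}{c}\frac{\partial E}{\partial t} = \operatorname{curl} B, \tag{1.1.8}$$
$$\operatorname{div} B = 0, \qquad \frac{1}{c}\frac{\partial B}{\partial t} = -\operatorname{curl} E. \tag{1.1.9}$$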
Maxwell’s equations in any inertial frame will take the same familiar form,
consistent with the principle of relativity; indeed, one can easily check, using the
above transformation rules for the fields, that the form of Maxwell’s equations
in inertial coordinates is preserved by Lorentz transformations.
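
The field transformation rules listed above can be verified by carrying out the contraction of the Faraday tensor with the boost matrix directly. The sketch below is an illustration under stated assumptions: the function names (`boost_x`, `faraday`, `transform`, `fields`) and the sample field values are hypothetical, and plain nested lists stand in for the component matrices (1.1.7) and $[\Lambda^\mu{}_\nu]$.

```python
import math


def boost_x(beta):
    """Matrix Lambda^mu_nu for a boost along x with beta = v/c, |beta| < 1."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def faraday(E, B):
    """Component matrix F^{mu nu} as in (1.1.7), from 3-tuples E and B."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0.0, E1, E2, E3],
            [-E1, 0.0, B3, -B2],
            [-E2, -B3, 0.0, B1],
            [-E3, B2, -B1, 0.0]]


def transform(L, F):
    """Return F~^{ab} = L^a_mu F^{mu nu} L^b_nu (summed over mu, nu)."""
    return [[sum(L[a][m] * F[m][n] * L[b][n]
                 for m in range(4) for n in range(4))
             for b in range(4)] for a in range(4)]


def fields(F):
    """Read E^i = F^{0i} and (B^1, B^2, B^3) = (F^{23}, F^{31}, F^{12})."""
    return (F[0][1], F[0][2], F[0][3]), (F[2][3], F[3][1], F[1][2])


# Hypothetical sample data, chosen only to exercise every component.
beta = 0.6
E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
E_new, B_new = fields(transform(boost_x(beta), faraday(E, B)))

# Closed-form rules from the text, for comparison.
g = 1.0 / math.sqrt(1.0 - beta**2)
E_rule = (E[0], g * (E[1] - beta * B[2]), g * (E[2] + beta * B[1]))
B_rule = (B[0], g * (B[1] + beta * E[2]), g * (B[2] - beta * E[1]))
```

Comparing `E_new`, `B_new` with `E_rule`, `B_rule` confirms that the six rules are just the components of a single tensor transformation.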
Having said that, it is instructive to write Maxwell’s equations directly in
terms of F. Equations (1.1.8) are equivalent to $F^{\mu\nu}{}_{,\nu} = 0$, with Gauss’s law
div E = 0 for µ = 0, and with $\frac{1}{c}\frac{\partial E}{\partial t} = \operatorname{curl} B$ corresponding to the remaining
three components (recall that $\frac{\partial}{\partial x^0} = \frac{1}{c}\frac{\partial}{\partial t}$). Of course, $F^{\mu\nu}{}_{,\nu}$ gives the
components of $\operatorname{div}_\eta F$ in inertial coordinates, so that (1.1.8) can be expressed as
$\operatorname{div}_\eta F = 0$. Similarly, expressing $dF^\flat = 0$ in inertial coordinates is equivalent to
(1.1.9).
Since, as stated above, the electromagnetic field transforms according to
the two-tensor F under change of inertial frame, Maxwell’s equations (being
expressible in terms of $\operatorname{div}_\eta F$ and $dF^\flat$) have the same familiar form in any
inertial frame. By tensoriality, we can readily express F and $F^\flat$ in any coordinate
system in terms of the explicit expressions of F and $F^\flat$ in an inertial coordinate
system. This is a purely mathematical fact; that the Lorentz force law (p. 32)
with this F should hold in any coordinate system, including one somehow
adapted to a non-inertial frame, must derive from some principle of physics;
see Section 1.2.4. Assuming that, writing Maxwell’s equations in tensorial
form makes manifest their general covariance with respect to all spacetime
coordinate systems. All frames of reference are in principle equivalent from
the point of view of formulating the physical law (consistent with the principle
of general relativity). While the component form of Maxwell’s equations in
inertial coordinates in Minkowski spacetime seems particularly convenient, we
shall see that in Einstein’s theory of gravitation, (locally) inertial frames are not
determined a priori, though in such frames the laws of physics should reduce
to their special relativistic form (in accordance with a version of Einstein’s
equivalence principle). But this is getting slightly ahead of the story at this point!

1.2. Kinematics in Minkowski spacetime

In the remainder of the chapter, we will use Roman indices such as i, j, k to label
spatial components, whereas Greek indices will continue to be used for space and
time components. The Einstein convention will remain in force, summing over
the relevant index range, as we will remind the reader sporadically throughout.
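1.2.1. The Minkowski metric. Consider the matrix $\Lambda = \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix}$. Since
$$\Lambda^T \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \Lambda = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},$$
we again see that each element of the group of Lorentz transformations in one
space dimension preserves the bilinear form $-(dx^0)^2 + dx^2 = -c^2\,dt^2 + dx^2$.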
Returning to three spatial dimensions, the relevant bilinear form is
$$\eta = -(dx^0)^2 + dx^2 + dy^2 + dz^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2,$$
which as we have seen is preserved by the (four-dimensional) transformations
between two sets of inertial coordinates (corresponding to rigid reference frames
for two inertial observers with the same origin of spacetime coordinates). The
set of linear maps that preserve the quadratic form η (the set of Lorentz
transformations) forms a group, the Lorentz group; the proper Lorentz group is
the component of the identity map, chosen to ensure observers use a consistent
spatiotemporal orientation (and in particular keep the same arrow of time). The
Lorentz group is a subgroup of the Poincaré group of maps which preserve η,
which also includes translations.
We let ⟨v, w⟩ := η(v, w). The manifold $\mathbb{R}^4$ together with the semi-Riemannian
metric η is four-dimensional Minkowski spacetime $\mathbb{M}^4$, or $\mathbb{R}^4_1$, indicating the
signature of the metric, and similarly for $\mathbb{M}^n = \mathbb{R}^n_1$, for n ≥ 2.

1.2.2. Causal nature of vectors. (See Figure 1.) Vectors $w = (c\Delta t, \Delta x)$ with
⟨w, w⟩ < 0 are called timelike, since $|\Delta x| < c|\Delta t|$. Then $|v| = |\Delta x|/|\Delta t| < c$.
Such vectors may thus represent the tangents to spacetime paths of material
particles. We note in (1.2.1) below how such vectors represent the spacetime
displacements between pairs of events that in some inertial frame share the same
spatial location. Similarly, null vectors w ≠ 0 satisfy ⟨w, w⟩ = 0; these are
tangent to the lightcone and represent paths of light rays. Finally, vectors with
⟨w, w⟩ > 0 are called spacelike, and represent displacements between pairs of
events which are simultaneous (i.e., share the same time coordinate) in some
inertial frame; see (1.2.2) below. The zero vector is also defined to be spacelike.
Suppose that $w = (c\Delta t, \Delta x, 0, 0)$ is timelike, and let $v = \Delta x/\Delta t$, so that
|v| < c. The Lorentz transformation $T_{-v}$ satisfies
$$[T_{-v}] \begin{pmatrix} c\Delta t \\ \Delta x \end{pmatrix} = \frac{1}{\sqrt{1 - (v/c)^2}} \begin{pmatrix} 1 & -v/c \\ -v/c & 1 \end{pmatrix} \begin{pmatrix} c\Delta t \\ \Delta x \end{pmatrix} = \begin{pmatrix} * \\ 0 \end{pmatrix}. \tag{1.2.1}$$
Similarly if $|\Delta x| > c|\Delta t|$, then let $v/c = c\Delta t/\Delta x$, so that |v| < c. Then
$$[T_{-v}] \begin{pmatrix} c\Delta t \\ \Delta x \end{pmatrix} = \frac{1}{\sqrt{1 - (v/c)^2}} \begin{pmatrix} 1 & -v/c \\ -v/c & 1 \end{pmatrix} \begin{pmatrix} c\Delta t \\ \Delta x \end{pmatrix} = \begin{pmatrix} 0 \\ * \end{pmatrix}. \tag{1.2.2}$$
Thus in either case, there is a Lorentz transformation which maps the timelike
or spacelike vector to align with a timelike or spacelike axis for an appropriate
observer.
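
As a small illustration of (1.2.1) and (1.2.2), the sketch below classifies a two-dimensional vector $w = (c\Delta t, \Delta x)$ and computes the boost that aligns it with an axis. The function names (`norm2`, `classify`, `align`) are hypothetical, and the exact comparison with zero is only reliable for inputs that are exactly representable, which is an assumption of this example.

```python
import math


def norm2(w0, w1):
    """Minkowski square <w, w> = -(w0)^2 + (w1)^2 for w = (c*dt, dx)."""
    return -w0**2 + w1**2


def classify(w0, w1):
    """Causal type of w; the zero vector counts as spacelike, as in the text."""
    if w0 == 0 and w1 == 0:
        return "spacelike"
    q = norm2(w0, w1)
    if q < 0:
        return "timelike"
    if q == 0:
        return "null"
    return "spacelike"


def align(w0, w1):
    """Return (beta, boosted vector), where beta = v/c is chosen as in
    (1.2.1) for timelike w and as in (1.2.2) for spacelike w."""
    q = norm2(w0, w1)
    if q < 0:
        beta = w1 / w0      # v = dx/dt, so v/c = dx/(c dt)
    elif q > 0:
        beta = w0 / w1      # v/c = c dt / dx
    else:
        raise ValueError("null or zero vectors cannot be aligned with an axis")
    g = 1.0 / math.sqrt(1.0 - beta**2)
    # Apply [T_{-v}] = g * [[1, -beta], [-beta, 1]].
    return beta, (g * (w0 - beta * w1), g * (-beta * w0 + w1))


beta_t, w_t = align(5.0, 3.0)   # timelike sample: lands on the time axis
beta_s, w_s = align(3.0, 5.0)   # spacelike sample: lands on the space axis
```

In both cases the Minkowski square is unchanged by the boost, so the nonzero component that remains has absolute value $\sqrt{|\langle w, w\rangle|}$.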
1.2.2.1. Twin paradox. It turns out that the familiar triangle inequality for vectors
in Euclidean geometry is reversed for timelike vectors in Minkowski spacetime.

[Figure 1. Causal nature of vectors. Spacetime diagram with axes x and ct through the origin O; labeled regions: timelike, spacelike, lightcone (null).]


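Given a smooth timelike curve γ(λ) (meaning that γ′(λ) is timelike for all λ),
we define the proper time Δτ along a portion of γ as
$$\Delta\tau = c^{-1} \int_{\lambda_0}^{\lambda_1} \sqrt{-\langle \gamma'(\lambda), \gamma'(\lambda) \rangle}\,d\lambda,$$
and the proper time function as
$$\tau(\lambda) = c^{-1} \int_{\lambda_0}^{\lambda} \sqrt{-\langle \gamma'(\lambda), \gamma'(\lambda) \rangle}\,d\lambda.$$
If we reparametrize γ by proper time τ by inverting to get λ = λ(τ), so that
$\tilde{\gamma}(\tau) := \gamma(\lambda(\tau))$, we have
$$\tilde{\gamma}'(\tau) = \gamma'(\lambda(\tau))\,\frac{d\lambda}{d\tau} = \frac{c}{\sqrt{-\langle \gamma'(\lambda(\tau)), \gamma'(\lambda(\tau)) \rangle}}\,\gamma'(\lambda(\tau)).$$
Thus we see the tangent vector has constant length $\sqrt{-\langle \tilde{\gamma}'(\tau), \tilde{\gamma}'(\tau) \rangle} = c$. This
means that the parameter $\tau - \tau(\lambda_0)$ is indeed the proper time elapsed along γ
from $\lambda_0$ to λ(τ).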
We now consider the reversed triangle inequality. If the displacement vector
$\overrightarrow{OB}$ from O to B is timelike, let
$$|\overrightarrow{OB}| = \sqrt{-\langle \overrightarrow{OB}, \overrightarrow{OB} \rangle},$$
which is, up to a factor of c, the proper time along a straight-line path from O to
B (the elapsed time measured by the inertial observer passing through O and B).
Definition 1-1. A vector is causal if it is timelike or null. A path is causal if
its tangent vector at each point is causal. A causal vector w is future-pointing if
⟨w, ∂/∂t⟩ < 0, and analogously for past-pointing.
Proposition 1-2. If $\overrightarrow{OB}$ is future-pointing and timelike, and $\overrightarrow{OA}$ and $\overrightarrow{AB}$ are
future-pointing and causal, then $|\overrightarrow{OB}| \ge |\overrightarrow{OA}| + |\overrightarrow{AB}|$, with equality only in case
O, A and B are collinear.
Proof. By applying a Lorentz transformation as in (1.2.1), we may arrange
that $\overrightarrow{OB}$ has components $(t_B, 0, 0, 0)$, for $t_B > 0$, and $\overrightarrow{OA}$ has components
$(t_A, x_A, 0, 0)$, with $|x_A| \le c t_A$. Then $\overrightarrow{AB}$ has components $(t_B - t_A, -x_A, 0, 0)$,
with $|x_A| \le c(t_B - t_A)$. Therefore,
$$|\overrightarrow{OA}|^2 = (c t_A)^2 - x_A^2 \quad \text{and} \quad |\overrightarrow{AB}|^2 = c^2 (t_B - t_A)^2 - x_A^2,$$
so that
$$|\overrightarrow{OA}| + |\overrightarrow{AB}| = \sqrt{(c t_A)^2 - x_A^2} + \sqrt{c^2 (t_B - t_A)^2 - x_A^2} \le c t_A + c(t_B - t_A) = c t_B = |\overrightarrow{OB}|.$$
The only way equality holds is if $x_A = 0$. □


The reversed triangle inequality has the following analogue for piecewise
smooth paths.
Proposition 1-3. Suppose O and B are two points in Minkowski spacetime so
that the displacement vector $\overrightarrow{OB}$ is future-pointing and timelike. Then amongst all
piecewise smooth future-pointing causal paths from O to B, the one of maximal
proper time interval is the straight-line path from O to B.
Proof. By applying a Lorentz transformation as in (1.2.1), we may choose an
inertial frame so that O is the origin and B lies on the positive t-axis. If γ is
a timelike curve from O to B parametrized by proper time τ , with coordinates
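(t(τ), x(τ), y(τ), z(τ)), then along γ,
$$\begin{aligned} \Delta\tau &= c^{-1} \int_0^{\Delta\tau} \sqrt{-\langle \gamma'(\tau), \gamma'(\tau) \rangle}\,d\tau \\ &= c^{-1} \int_0^{\Delta\tau} \sqrt{c^2 \left( \frac{dt}{d\tau} \right)^2 - \left( \frac{dx}{d\tau} \right)^2 - \left( \frac{dy}{d\tau} \right)^2 - \left( \frac{dz}{d\tau} \right)^2}\,d\tau \\ &\le \int_0^{\Delta\tau} \frac{dt}{d\tau}\,d\tau = \Delta t. \end{aligned}$$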
We used the fact that dt/dτ > 0. Also, if the curve were piecewise smooth, we
could break it up into finitely many intervals and apply the above analysis on
each interval. The inequality is clearly strict unless x, y and z are constant (equal
to 0), in which case the curve is indeed along the straight line path. □
The title “twin paradox” of the subsection refers to the following interpretation
of the reversed triangle inequality. Suppose two twins are together at O, at which
time they are inertial observers moving relative to each other along their x-axes
with velocity 80% of the speed of light. Suppose that from the point of view of
one of the twins who remains an inertial observer at rest in an inertial frame O,
the other twin travels for five years to arrive at spacetime point A, quickly turns
around and returns to join the other twin at spacetime point B after a total travel
time of ten years as determined in O. In other words, the proper time elapsed on
a straight path from O to B is ten years. On the other hand, the proper time that
elapses on the other twin’s path from O to A to B is strictly less than ten years,
by Proposition 1-2: the other twin is younger! How can this be if both
are moving relative to each other: are the situations not symmetric? Well, no:
the twin moving to point A “and back” must accelerate to turn around. This
acceleration means that this twin does not remain at rest with respect to a single
inertial frame the entire time. In other words, two rigid frames of reference
adapted to these observers (and respective coordinate charts for Minkowski
spacetime) do not correspond by a simple Lorentz transformation.
Before we move on, let us compute how much time passes for the younger
twin. Suppose the twins mark each passing year by sending a light signal to each
other. From the point of view of O, a signal sent at year t = k will be received
by the other twin on the outgoing part of the journey at time tk determined by
vtk = c(tk − k), which determines how long it will take the light to catch up to
the moving twin. Now, v/c = 0.8, so tk = 5k. Thus the first signal (k = 1) will
be received just as the second twin turns around to begin the return trip. Only
one signal from O will be received on the outgoing portion of the trip, and so
the other nine signals will be received by the other twin on the return portion of
the journey.
What about signals sent from the twin moving relative to O? The first signal
is sent after one unit of proper time has passed along the path from O to A,
which is at time $t = 1/\sqrt{1 - (v/c)^2}$ as measured in O. The distance between the
twins at this instant, as measured in O, is just $v/\sqrt{1 - (v/c)^2}$. Thus the time at
which the signal will arrive at x = 0 is
$$\frac{1}{\sqrt{1 - (v/c)^2}} + \frac{1}{c}\,\frac{v}{\sqrt{1 - (v/c)^2}} = \frac{\sqrt{1 + v/c}}{\sqrt{1 - v/c}}.$$
Similarly the signal sent after ℓ years of proper time have elapsed on the outward
journey from O to A will arrive at x = 0 at
$$t = \frac{\sqrt{1 + v/c}}{\sqrt{1 - v/c}}\,\ell. \tag{1.2.3}$$

The coefficient of ℓ in (1.2.3) gives the time interval between the reception of
successive signals which were emitted one unit in time apart, and thus it is the
ratio between the frequency of emission of signals, as measured by the emitter,
and the frequency of reception of signals, as measured by the receiver. As such,
it gives a measure of the relativistic Doppler shift. When v/c = 0.8, this Doppler
factor equals 3. The numerology works out so that the third signal sent along
the path from O to A occurs at $t = 3/\sqrt{1 - (0.8)^2} = 5$, so three signals are sent
along the outward journey, and the third signal arrives at x = 0 at t = 9. The twin
at A has sent three signals back to his twin, but has received only one signal. On
the “homeward” journey, this twin will send three more signals (the last just as
the twins are back together again), and will receive a total of nine signals from
the twin with frame O, the last upon arrival.
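
The bookkeeping of signals can be reproduced with exact rational arithmetic. The following Python sketch is illustrative only: it assumes units with c = 1 (years and light-years), an instantaneous turnaround at t = 5, and hypothetical names (`exact_sqrt`, `traveler_x`, `home_signal_reception`, `traveler_signal_arrival`). Fractions are used so that borderline comparisons, such as the first signal arriving exactly at the turnaround, are not affected by rounding.

```python
from fractions import Fraction
from math import isqrt


def exact_sqrt(q):
    """Square root of a Fraction that is a perfect square (else ValueError)."""
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    if n * n != q.numerator or d * d != q.denominator:
        raise ValueError("not a perfect square")
    return Fraction(n, d)


BETA = Fraction(4, 5)                 # v/c = 0.8
T_TURN = Fraction(5)                  # turnaround time in O, in years
GAMMA = 1 / exact_sqrt(1 - BETA**2)   # dt/dtau on each leg of the trip
DOPPLER = exact_sqrt((1 + BETA) / (1 - BETA))   # coefficient in (1.2.3)
TRAVELER_YEARS = 2 * T_TURN / GAMMA   # proper time of the travelling twin


def traveler_x(t):
    """Position in O of the travelling twin at O-time t, 0 <= t <= 2*T_TURN."""
    return BETA * t if t <= T_TURN else BETA * (2 * T_TURN - t)


def home_signal_reception(k):
    """O-time at which the signal sent from x = 0 at t = k reaches the traveller."""
    t_out = k / (1 - BETA)            # solves v t = c (t - k) on the way out
    if t_out <= T_TURN:
        return t_out
    # On the way back the traveller is at BETA*(2*T_TURN - t); solve for t.
    return (k + 2 * BETA * T_TURN) / (1 + BETA)


def traveler_signal_arrival(l):
    """O-time at which the traveller's l-th yearly signal reaches x = 0."""
    t_emit = GAMMA * l                # O-time of emission (both legs have equal speed)
    return t_emit + traveler_x(t_emit)


received_outbound = [k for k in range(1, 11) if k / (1 - BETA) <= T_TURN]
home_receptions = [home_signal_reception(k) for k in range(1, 11)]
traveler_arrivals = [traveler_signal_arrival(l)
                     for l in range(1, int(TRAVELER_YEARS) + 1)]
```

Under these assumptions the travelling twin ages six years while ten years pass in O, receives one signal on the way out and nine on the way back, and the six signals sent home arrive at intervals of three years (outbound) and one third of a year (inbound).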

1.2.3. Simultaneity. The worldlines of inertial observers are special paths, namely
timelike geodesics. Moreover, inertial observers correspond to certain coordinate
charts on Minkowski spacetime M. Mathematically it is not a big deal when a
point in M has two different sets of coordinates in two different charts. However,
interpreting the coordinates in the physical model yields some interesting results:
the coordinates are not merely labels, but rather they are supposed to be the
results of physical measurements.
The first observation is that simultaneity is relative: two different observers
adapted to respective inertial frames O and $\tilde{O}$ moving relative to each other will
not agree in general on whether two events occur at the same time. Imagine
we synchronize the observers at a common origin O, and that they are moving
along their respective x-axes. The Lorentz transformations tell us that the events
which $\tilde{O}$ charts as occurring simultaneously at $\tilde{t} = 0$ correspond to $t = vx/c^2$
in O. Different points in this set have different t-coordinates, and so O will not
agree that they occur at the same time; cf. (1.2.2).
Though simultaneity is relative, both observers will calculate the same value of
$-(ct)^2 + x^2 = -(x^0)^2 + x^2 = -(\tilde{x}^0)^2 + \tilde{x}^2 = -(c\tilde{t})^2 + \tilde{x}^2$. This observation can
be used to obtain two interesting conclusions that can be verified experimentally:
time dilation and Lorentz contraction.
Consider the event A which has coordinates $\tilde{t}_A = 1$, $\tilde{x}_A = 0$ (we suppress
the other spatial dimensions, whose coordinates we take to be 0). The Lorentz
transformation gives us the coordinates $(t_A, x_A)$ for A in O, and in particular
$$t_A = \frac{\tilde{t}_A + v\tilde{x}_A/c^2}{\sqrt{1 - (v/c)^2}} = \frac{1}{\sqrt{1 - (v/c)^2}} > 1.$$
O measures more time to have elapsed from O to A, and so concludes that the
moving clock in $\tilde{O}$’s frame runs slow. Another way to see this from the invariant
hyperbola is to note that since $-(c\tilde{t}_A)^2 + \tilde{x}_A^2 = -c^2$, A is on the hyperbola
$-(ct)^2 + x^2 = -c^2$; but $x_A = v t_A \ne 0$, so that we must have $t_A > 1 = \tilde{t}_A$! You
can see this with a simple picture: the invariant hyperbola through A hits the
t-axis at the point $(t, x) = (1, 0) = (\tilde{t}_A, 0)$, with t-coordinate clearly lower than
$t_A$. (This point is labeled B in Figure 2, left.) In general, if time $\Delta\tilde{t}$ is measured
between events at a fixed $\tilde{x}$ value, the time between the events as measured in O
will be $\Delta t = \Delta\tilde{t}/\sqrt{1 - (v/c)^2} > \Delta\tilde{t}$; cf. (1.2.1).
Similarly, moving objects contract along the direction of motion. To be precise,
consider a rod along the $\tilde{x}$-axis, whose rest length measured in $\tilde{O}$ is L, and which
is moving with velocity v > 0 along the x-axis. This means that the ends of the
rod are measured simultaneously in $\tilde{O}$ at, say, O given by $(\tilde{t}, \tilde{x}) = (0, 0)$ and A
given by $(\tilde{t}, \tilde{x}) = (0, L)$. The ends of the rod make paths in spacetime, one given
by $\tilde{x} = 0$, the other by $\tilde{x} = L$. (See Figure 2, right.)

[Figure 2. Left: time dilation. Right: Lorentz contraction. Both panels show the axes ct, x and $c\tilde{t}$, $\tilde{x}$ through the origin O; the left panel marks the events A and B, the right panel the events A, B and C.]

We need to find the coordinates of the point B where $\tilde{x} = L$ intersects t =
0, since then both O and B will be simultaneous with respect to O. By the
Lorentz transformation (1.1.6), the event B will have $\tilde{t} = -vL/c^2$, so that
$x = \sqrt{1 - (v/c)^2}\,L < L$. In O, the rod is measured to have length $\sqrt{1 - (v/c)^2}\,L$,
since determining the length of the rod amounts to finding the spatial separation
between the ends at the same time. It is this simultaneity that is relative. This
length contraction is a necessary consequence of time dilation and the agreement
by the observers on their relative speed, and can readily be seen geometrically
in terms of the invariant hyperbola. Indeed, the hyperbola $-(c\tilde{t})^2 + \tilde{x}^2 = L^2$
through the point A lies to the right of the line x̃ = L, touching only at the point
of tangency at A. Therefore, if C is the event given by the intersection of the
hyperbola and the line t = 0, the event B which occurs on x̃ = L is between the
origin O and the point C along the line t = 0. Thus C must have coordinate
(t, x) = (0, L), since it lies on the hyperbola. Hence the point B must have
x-coordinate less than L.
1.2.3.1. Pole and barn paradox. Consider a barn of rest length 10 and a pole of
rest length 20. Suppose they are at rest in respective inertial frames moving along
the x-axis with respect to each other with relative velocity v given by $v/c = \frac{\sqrt{3}}{2}$,
so that $\sqrt{1 - (v/c)^2} = \frac{1}{2}$. From the point of view of the barn, the pole contracts
along the direction of motion to half its rest length, so that it can fit entirely in
the barn as it moves through. From the point of view of the pole, the barn is
moving toward it, and thus it contracts to length 5; thus in the rest frame of the
pole, it can never fit entirely inside the barn. Can both viewpoints of reality be
correct?

The answer of course is yes! We analyze this first from the point of view of the
barn frame O, in which the one end of the barn has worldline x = 0, and the other
end x = 10. The pole is moving in the positive x-direction, and at time t = 0 in
O, the ends of the pole are at x = 0 (front) and x = −10 (back). The worldlines
for the front and back ends of the pole are respectively $x = vt/c = \frac{\sqrt{3}}{2}\,t$ and
$x = -10 + \frac{\sqrt{3}}{2}\,t$. At time $t = \frac{20}{\sqrt{3}}$, the front end of the pole is at x = 10 (let this
be spacetime point A), and the back end at x = 0 (at spacetime point B, say).
From the pole frame $\tilde{O}$, the barn never contains the pole. That the two
observers disagree is not a paradox, since simultaneity is relative. The issue is
simply that in $\tilde{O}$, the events A and B are not simultaneous as they are in O. This
is easy to compute by using the Lorentz transformation (1.1.6) on the points with
coordinates $(t, x) = \left( \frac{20}{\sqrt{3}}, 10 \right)$ (point A) and $(t, x) = \left( \frac{20}{\sqrt{3}}, 0 \right)$ (point B).
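
The computation suggested here can be sketched in a few lines of Python. This is an illustration under an explicit assumption: since the worldlines are written as x = vt/c, the time coordinate is treated as $x^0 = ct$ (time in units of length), and the $x^0$ form of the Lorentz transformation is used. The function name `to_moving_frame` and the variable names are hypothetical.

```python
import math


def to_moving_frame(x0, x, beta):
    """Map (x^0, x) in O to the coordinates in the frame moving with
    velocity beta = v/c along the x-axis (the x^0 form of (1.1.6))."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (x0 - beta * x), g * (-beta * x0 + x)


beta = math.sqrt(3) / 2          # v/c for the pole relative to the barn
A = (20 / math.sqrt(3), 10.0)    # front of the pole reaches the far end of the barn
B = (20 / math.sqrt(3), 0.0)     # back of the pole reaches the near end of the barn

A_pole = to_moving_frame(A[0], A[1], beta)
B_pole = to_moving_frame(B[0], B[1], beta)

# Difference of the time coordinates of B and A in the pole frame.
time_gap_in_pole_frame = B_pole[0] - A_pole[0]
```

In the pole frame the spatial coordinates come out as approximately 0 for A and −20 for B (the two ends of the pole at its rest length), while the time coordinate of A is smaller than that of B: the front end leaves the barn before the back end enters it.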

1.2.4. Acceleration. If γ is a future-pointing timelike curve parametrized by
proper time τ, then for each τ, there is a momentarily comoving inertial reference
frame, or local rest frame, giving coordinates $(t, x^k)$ for Minkowski spacetime
for which γ(τ) corresponds to the origin and $\gamma'(\tau) = \partial/\partial t$ at γ(τ). Since
$\langle \gamma'(\tau), \gamma'(\tau) \rangle$ is constant, the (covariant) acceleration $\gamma''(\tau) = D\gamma'(\tau)/d\tau$ is
orthogonal to $\gamma'(\tau)$. At γ(τ) in the momentarily comoving inertial frame, then,
$\gamma''(\tau)$ is purely spatial, and thus $\langle \gamma''(\tau), \gamma''(\tau) \rangle \ge 0$. In fact, in this frame at
γ(τ), we have dt/dτ = 1 and $dx^k/dt = 0$, so that
$$\frac{d^2 x^k}{d\tau^2} = \frac{d}{d\tau} \left( \frac{dt}{d\tau}\,\frac{dx^k}{dt} \right) = \frac{d^2 x^k}{dt^2}$$
holds here, from which we conclude
$$\gamma''(\tau) = \left. \frac{d^2 x^k}{dt^2} \right|_0 \left. \frac{\partial}{\partial x^k} \right|_{\gamma(\tau)}.$$
The spacetime norm of the covariant acceleration $A = \gamma''(\tau)$ is thus the norm of
the spatial acceleration $\mathbf{a} = (d^2 x^k/dt^2)\,\partial/\partial x^k$ as measured in the momentarily
comoving inertial frame. (Recall that the Einstein convention applies here, even
though ∂/∂ x k is written with a slash rather than in stacked notation.)
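Consider an inertial frame O with respect to which the instantaneous velocity
of γ(τ) is $\mathbf{v} = v^i\,\partial/\partial x^i$, with speed $v = |\mathbf{v}|$, and spatial acceleration
$\mathbf{a} = (dv^i/dt)\,\partial/\partial x^i$. Then, since $dt/d\tau = 1/\sqrt{1 - (v/c)^2}$, we obtain
$$U = \gamma'(\tau) = \frac{1}{\sqrt{1 - (v/c)^2}} \left( \frac{\partial}{\partial t} + \mathbf{v} \right), \tag{1.2.4}$$
$$A = \frac{1}{\sqrt{1 - (v/c)^2}}\,\frac{d}{dt} \left( \frac{1}{\sqrt{1 - (v/c)^2}} \right) \left( \frac{\partial}{\partial t} + \mathbf{v} \right) + \frac{1}{1 - (v/c)^2}\,\mathbf{a}.$$
A simple exercise for the reader is to verify that A = 0 if and only if $\mathbf{a} = 0$.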


It is interesting to consider how an accelerating observer can set up spacetime
coordinates to make physical measurements, and then compare the coordinates
to that of an inertial observer; for examples, see [38, Chapter 4] for a uniformly
rotating frame, and a frame with constant acceleration. Facilitating this is the
clock hypothesis, consistent with empirical evidence, which asserts that relative
to an inertial frame, the rate of a clock depends on its instantaneous velocity,
but not its acceleration. Consequently, spacetime measurements by an observer
should be in accordance locally with those from the local rest frames along
the worldline, and should yield the components of the Minkowski metric as
appropriately transformed from those of an inertial frame, consistent with the
extension of tensorial laws from inertial to non-inertial coordinates. In particular,
the proper time along a worldline should agree with the time elapsed on a clock
moving along the worldline.
1.2.4.1. Constant acceleration: hyperbolic motion. Suppose γ (τ ) is a future-
pointing timelike curve parametrized by proper time τ , and that for some inertial
observer, γ (τ ) has parametrization (t (τ ), x(τ ), 0, 0). We consider the situation
that $\langle \gamma''(\tau), \gamma''(\tau) \rangle = a^2$ is constant, for some a > 0. We can determine the
coordinate functions using the two conditions
$$c^2 \left( \frac{dt}{d\tau} \right)^2 - \left( \frac{dx}{d\tau} \right)^2 = c^2, \qquad -c^2 \left( \frac{d^2 t}{d\tau^2} \right)^2 + \left( \frac{d^2 x}{d\tau^2} \right)^2 = a^2.$$
Since dt/dτ > 0 (γ is future-pointing), by the above we have dt/dτ ≥ 1, and there
is a unique function f(τ) such that dt/dτ = cosh f(τ) and dx/dτ = c sinh f(τ),
from which we see that f(τ) is smooth. Using the acceleration condition, we
obtain
$$c^2 (f'(\tau))^2 = a^2.$$
Therefore $f(\tau) = \pm\frac{a}{c}(\tau + \tau_0)$ for some constant $\tau_0$. Hence, by integration, there
are constants $t_0$ and $x_0$ such that
$$ct(\tau) = \frac{c^2}{a} \sinh\left( \frac{a}{c}(\tau + \tau_0) \right) + ct_0, \qquad x(\tau) = \pm\frac{c^2}{a} \cosh\left( \frac{a}{c}(\tau + \tau_0) \right) + x_0.$$
Thus the curve is a hyperbola in the t x-plane.
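
The two defining conditions can be checked numerically for this solution. The sketch below is an illustration, not the authors' method: it uses central finite differences (a swapped-in numerical technique) with a hypothetical step size `H`, and the names `worldline`, `first_derivative`, `second_derivative` and the sample parameters are assumptions made for the example.

```python
import math

H = 1e-4  # finite-difference step (an illustrative choice)


def worldline(tau, a, c=1.0, tau0=0.0, t0=0.0, x0=0.0, sign=1.0):
    """Return (t, x) on the hyperbolic worldline with proper acceleration a."""
    s = a * (tau + tau0) / c
    return (c / a) * math.sinh(s) + t0, sign * (c**2 / a) * math.cosh(s) + x0


def first_derivative(f, tau):
    """Central-difference approximation to f'(tau)."""
    return (f(tau + H) - f(tau - H)) / (2 * H)


def second_derivative(f, tau):
    """Central-difference approximation to f''(tau)."""
    return (f(tau + H) - 2 * f(tau) + f(tau - H)) / H**2


# Hypothetical sample parameters.
a, c, tau = 2.0, 1.0, 0.7
t_of = lambda s: worldline(s, a, c)[0]
x_of = lambda s: worldline(s, a, c)[1]

# Normalization of the tangent: c^2 (dt/dtau)^2 - (dx/dtau)^2 should equal c^2.
velocity_condition = (c**2 * first_derivative(t_of, tau)**2
                      - first_derivative(x_of, tau)**2)

# Acceleration condition: -c^2 (t'')^2 + (x'')^2 should equal a^2.
acceleration_condition = (-c**2 * second_derivative(t_of, tau)**2
                          + second_derivative(x_of, tau)**2)
```

Up to discretization error, `velocity_condition` agrees with $c^2$ and `acceleration_condition` with $a^2$, and $(x - x_0)^2 - c^2 (t - t_0)^2 = (c^2/a)^2$ along the curve.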
We finish this section with an interesting example; cf. [174, p. 181–183].
Built into special relativity is the feature that all inertial observers measure the
same speed of light in vacuum. However, one might wonder if by accelerating,
they could catch up with light. So, consider for b > 0 a curve γ (τ ) with
uniform acceleration a = bc, whose components in an inertial frame are given
by $t(\tau) = \frac{1}{b}\sinh(b\tau)$, $x(\tau) = \frac{c}{b}\cosh(b\tau)$. In the tx-plane the path forms a
hyperbola, and as one can infer from a sketch of the trajectory, any light signal
sent from a point along γ will reach the inertial observer at x = 0 at some
time t > 0, but if reflected, will never reach γ. Moreover, suppose that at
γ(0), a photon of light is emitted and moves in the positive x-direction. The
accelerating observer γ is also moving in the positive x-direction for τ > 0, so
what speed does γ measure for the photon? Well, since γ(τ) is orthogonal to
γ′(τ), a simple Lorentz transformation argument shows that the ray $\rho_\tau$ from the
origin through γ(τ) is composed of events which γ will deem (via radar) to be
simultaneous with γ(τ). If we consider the path of the photon as parametrized
by $(t(s), x(s)) = \left( s, \frac{c}{b} + cs \right)$, then it intersects $\rho_\tau$ at some point $A_\tau = h(\tau)\gamma(\tau)$.
Thus $s = \frac{h(\tau)}{b}\sinh(b\tau)$, and
$$\frac{c}{b} + c\,\frac{h(\tau)}{b}\sinh(b\tau) = h(\tau)\,\frac{c}{b}\cosh(b\tau),$$
from which we see $h(\tau) = e^{b\tau}$. The displacement vector from γ(τ) to $A_\tau$
is spacelike, and its length is the distance from γ(τ) to the photon after τ
units of time have elapsed as measured along γ. This displacement has length
$|e^{b\tau} - 1| \cdot |\overrightarrow{O\gamma(\tau)}| = \frac{c}{b}(e^{b\tau} - 1)$. This is the distance from γ(τ) to the photon, the
derivative of which with respect to τ is $c e^{b\tau}$. So, not only is the acceleration not
helping the observer γ make any progress on catching the photon, but as measured
along γ , the photon is accelerating away from γ . This is counterintuitive, but is a
consequence of how γ makes measurements along its worldline, and furthermore,
keep in mind that in any local rest frame, the photon has speed c.
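
The intersection point $A_\tau$ and the resulting distance can also be computed numerically. The Python sketch below is only an illustration: the names `gamma_curve`, `photon_on_ray`, `interval_length` and the sample values of b and τ are assumptions, and units with c = 1 are used by default.

```python
import math


def gamma_curve(tau, b, c=1.0):
    """Components (t, x) of the accelerated worldline at proper time tau."""
    return math.sinh(b * tau) / b, (c / b) * math.cosh(b * tau)


def photon_on_ray(tau, b, c=1.0):
    """Scale factor h and event h*gamma(tau) where the photon x = c/b + c*t
    meets the ray from the origin through gamma(tau)."""
    t_g, x_g = gamma_curve(tau, b, c)
    # The photon passes through h*(t_g, x_g) when c/b + c*h*t_g = h*x_g.
    h = (c / b) / (x_g - c * t_g)
    return h, (h * t_g, h * x_g)


def interval_length(dt, dx, c=1.0):
    """Length sqrt(-c^2 dt^2 + dx^2) of a spacelike displacement."""
    return math.sqrt(-(c * dt)**2 + dx**2)


# Hypothetical sample values.
b, tau = 0.5, 1.5
h, (t_A, x_A) = photon_on_ray(tau, b)
t_g, x_g = gamma_curve(tau, b)
distance = interval_length(t_A - t_g, x_A - x_g)
```

For these inputs `h` agrees with $e^{b\tau}$ and `distance` with $\frac{c}{b}(e^{b\tau} - 1)$ up to rounding, in line with the formulas above.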

1.3. Energy and momentum

We begin with a thought experiment due to Einstein. Imagine a box (and its
contents) of total mass M and length L, at rest. Suddenly from inside the left
side of the box some photons of total energy E are emitted in the direction
toward the right side of the box. The formula for photon momentum p is E = cp.
By conservation of momentum, then, the box should acquire a net momentum
−p = Mv (to the left). When the photons reach the right side of the box and stop,
the motion ceases, with the box having moved to the left with a net displacement
Δx < 0. (You might argue that by causality, what happens at one end cannot
instantaneously affect the other end; this can be taken into account, still arriving
at the conclusion below; see [96, p. 27–28] or [188, p. 138–143].)
Einstein argues that there is no reason why in this closed system the center of
mass should have changed from the start to the end of the process. He suggests
that the photons must have carried a mass m to the right side of the box to
balance out the center, i.e., m(L + Δx) + (M − m)Δx = 0. The velocity of the
box is determined by (M − m)v = −p (conservation of momentum), and the
elapsed time during the photon motion is (L + Δx)/c = Δt = Δx/v. Putting
these together we obtain $m = p/c = E/c^2$, or

E = mc2 .

This is the famous formula relating mass m to energy.

1.3.1. Energy-momentum four-vector. Associated to any massive particle is a
quantity called its rest mass. The rest mass is a measure of inertia, and while a
discussion of the subtleties involved in the concepts of mass and inertia would
be germane in the context of Einstein’s theory of gravitation (especially Mach’s
principle), we will work with these notions operationally through Newton’s
second law $F = \frac{d}{dt}(mv)$ (so F = ma if the mass of the object is constant). Thus
one can imagine measuring the inertial mass by applying Newton’s law in a frame
of reference in which the massive particle is momentarily at rest. Of course,
special relativity rewrites the kinematical and dynamical laws, but there is still
an analogous concept of mass. The term rest mass suggests that in relativity,
the quantity mass may itself be relative. However, the rest mass is by definition
an invariant, since it is associated to the object itself, as measured in a frame
adapted to the object. Our present goal is to tie together the concepts of mass,
momentum and energy.
If $\gamma(\tau)$ is a timelike curve parametrized by proper time and representing the
worldline of a particle of rest mass $m_0$, the energy-momentum four-vector is
$P = m_0\gamma'(\tau) = m_0 U$. At the point $\gamma(\tau)$ in a momentarily comoving inertial frame
$\widetilde{O}$, we have $P = m_0\,\partial/\partial\tilde t = m_0 c\,\partial/\partial\tilde x^0$, where $\tilde x^0 = c\tilde t$. Note that
$-\langle P, P\rangle = (m_0 c)^2 = (E_0/c)^2$, where $E_0 = m_0 c^2$ is the rest energy. If instead we now use
an inertial frame $O$ in which the particle is instantaneously moving with velocity
$v$ along the $x$-axis, we get from (1.2.4)
\[ P = \frac{m_0}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial t} + \frac{m_0 v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x}
     = \frac{m_0 c}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x^0} + \frac{m_0 v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x}. \]
We identify
\[ E := \frac{m_0 c^2}{\sqrt{1-(v/c)^2}} \quad\text{and}\quad \mathbf{p} := \frac{m_0 v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x} \]
as, respectively, the particle’s energy and spatial momentum as measured by the
inertial frame $O$. Since the spatial velocity is $\mathbf{v} = v\,\partial/\partial x$, from the point of view
of the inertial frame O with respect to which the particle is moving, the mass
might be construed to be
\[ m(v) = \frac{m_0}{\sqrt{1-(v/c)^2}}. \]
Note that $\lim_{v \nearrow c} m(v) = +\infty$, which indicates that inertia increases with speed;
this is consistent with the fact that a force cannot accelerate any particle to or
above the speed of light. Note also that by Taylor (binomial) expansion,

\[ E = m(v)c^2 = m_0 c^2\bigl(1 + \tfrac12 (v/c)^2 + O((v/c)^4)\bigr) = m_0 c^2 + \tfrac12 m_0 v^2 + O(v^4/c^2). \]

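The first two terms are the rest energy and the kinetic energy. Furthermore if
$U_{\mathrm{obs}}/c$ is a timelike unit vector tangent to the path of an observer, the observed
energy of the particle with momentum $P$ is just $E_{\mathrm{obs}} = -\langle P, U_{\mathrm{obs}}\rangle$. Finally, we
note that
\[ P = \frac{E_0}{c^2}\frac{\partial}{\partial\tilde t} = \frac{E}{c^2}\frac{\partial}{\partial t} + \mathbf{p} = \frac{E}{c}\frac{\partial}{\partial x^0} + \mathbf{p}, \]
\[ E_0^2 = -c^2\langle P, P\rangle = E^2 - c^2|\mathbf{p}|^2 = (m(v)c^2)^2 - c^2|\mathbf{p}|^2. \]

In units where $c = 1$, we have $P = E\,\partial/\partial x^0 + \mathbf{p}$ and $m_0^2 = E^2 - |\mathbf{p}|^2$.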
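
As a quick sanity check of these relations, here is a small sketch (the function names and the sample values of `m0`, `c` and `v` are illustrative assumptions, not taken from the text). It evaluates $E$ and $|\mathbf{p}|$ for motion along one axis, confirms that $E^2 - c^2|\mathbf{p}|^2$ does not depend on $v$, and compares $E - m_0c^2$ with the Newtonian kinetic energy at low speed.

```python
import math

def energy_momentum(m0, v, c=1.0):
    """Return (E, p) for rest mass m0 moving at speed v along one axis, |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m0 * c ** 2, gamma * m0 * v

def rest_energy_squared(E, p, c=1.0):
    """The frame-independent quantity E_0^2 = E^2 - c^2 p^2."""
    return E ** 2 - (c * p) ** 2

m0, c = 2.0, 3.0   # hypothetical units
invariants = [rest_energy_squared(*energy_momentum(m0, v, c), c)
              for v in (0.0, 0.3, 1.5, 2.9)]

# Low-speed regime: E - m0 c^2 should be close to (1/2) m0 v^2.
v_slow = 1e-3 * c
E_slow, _ = energy_momentum(m0, v_slow, c)
kinetic = E_slow - m0 * c ** 2
newtonian = 0.5 * m0 * v_slow ** 2
```

Every entry of `invariants` agrees with $(m_0c^2)^2$ up to rounding, and the relative gap between `kinetic` and `newtonian` is of order $(v/c)^2$, as the expansion predicts.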


Photons can carry momentum too, as we noted in the introductory thought
experiment for this section. The energy and momentum of a photon are related
by $E = cp$. The energy-momentum vector for a photon must be null, tangent to
the spacetime trajectory of the photon, and the rest mass of a photon is zero. To
define the energy-momentum four-vector, the trajectory of the photon is given
as a geodesic $\gamma$, with affine parameter $s$, so that $\gamma''(s) = D\gamma'/ds = 0$. The
energy-momentum four-vector is the null vector $\gamma'(s)$, which in a given inertial
frame is written $\gamma'(s) = \frac{E}{c^2}\frac{\partial}{\partial t} + \mathbf{p} = \frac{E}{c^2}\frac{\partial}{\partial t} + p^i\frac{\partial}{\partial x^i}$. In an inertial frame $\widetilde{O}$ which
has velocity $v$ with respect to the original frame (along the aligned $x$-axes of the
frames, say), the four-momentum of the photon will thus be
\[ \gamma'(s) = \frac{\frac{E}{c^2} - \frac{v}{c^2}\,p^1}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial\tilde t}
   + \frac{-\frac{vE}{c^2} + p^1}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial\tilde x^1}
   + p^2\frac{\partial}{\partial\tilde x^2} + p^3\frac{\partial}{\partial\tilde x^3}. \]

Thus the measured photon energy in $\widetilde{O}$ will be
\[ \widetilde{E} = \frac{1}{\sqrt{1-(v/c)^2}}\bigl(E - v p^1\bigr). \tag{1.3.1} \]
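
For a photon travelling along the positive $x$-axis one has $p^1 = E/c$, and (1.3.1) then reduces to the longitudinal Doppler factor that appears in Exercise 1-5. The sketch below checks this numerically; the function names and sample velocities are illustrative assumptions.

```python
import math

def photon_energy_in_moving_frame(E, v, c=1.0):
    """Apply (1.3.1) to a photon moving along +x, so that p^1 = E / c."""
    p1 = E / c
    return (E - v * p1) / math.sqrt(1.0 - (v / c) ** 2)

def doppler_factor(v, c=1.0):
    """Longitudinal Doppler factor sqrt((1 - v/c) / (1 + v/c))."""
    return math.sqrt((1.0 - v / c) / (1.0 + v / c))

E = 5.0  # hypothetical photon energy in the original frame
ratios = [(photon_energy_in_moving_frame(E, v) / E, doppler_factor(v))
          for v in (-0.8, -0.1, 0.0, 0.5, 0.9)]
```

Each pair in `ratios` agrees to rounding error: a frame receding from the source ($v > 0$) measures a lower energy, an approaching one ($v < 0$) a higher energy.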
1.3.2. The stress-energy tensor. One of the constituents of the Einstein equation
in general relativity is the stress-energy or energy-momentum tensor, which
encodes the energy and momentum fluxes associated to the physical matter or
fields in the spacetime. This is a generalization to spacetime of the spatial stress
tensor of classical mechanics. We introduce it here, just enough to motivate its
place in the formulation of general relativity; for further reading see, e.g., [94;
161; 174; 189; 207]. In the next chapter, we will define the stress-energy tensor
via the Lagrangian formulation of the Einstein equation.
We define the stress-energy tensor as a $(2,0)$-tensor. Consider a one-form $\theta$
at a point $P$ in Minkowski spacetime $\mathbb{M}^4$ (this can be readily generalized). We
assume that $\theta$ is dual to a nonzero vector, which is either timelike or spacelike. In
particular, it has a nonzero metric norm, which we normalize to be $\langle\theta,\theta\rangle = \pm 1$.
For example, $\theta = dx^0 = c\,dt$, or $\theta = dx^i$, $i = 1, 2, 3$. In a region of spacetime, we
imagine there is a collection of particles (maybe dust particles, or elements of a
fluid or field), each possessing an energy-momentum four-vector. We consider
$\theta \neq 0$ as a linear functional operating on the tangent space at $P$. Its nullspace $W$
is three-dimensional, and by assumption on the causal nature of $\theta$, it must be
either spacelike or timelike (Lorentzian), and we consider it as an affine subspace
at $P$. Let $B \subset W$ be a region inside $W$ about $P$. Let $\Delta P$ be the sum (or integral)
of the four-vectors $\operatorname{sgn}(\theta(\widetilde{P}))\cdot\widetilde{P}$, summed over each of the energy-momentum
four-vectors $\widetilde{P}$ associated to the particles in the spacetime region $B$. The sign
accounts for the direction of flow across the corresponding spatial boundary of $B$,
or in case $\theta$ is timelike, the sign keeps track of the direction of time to the future
or past. We remark that under standard energy conditions, the energy-momentum
four-vector elements $\widetilde{P}$ are future-pointing timelike, so that if $\theta$ is dual to a
timelike vector, then $\theta(\widetilde{P})$ does not change sign as $\widetilde{P}$ varies over $B$.
Let $\Delta V$ be the volume of $B$ with respect to the metric, and define the vector
field
\[ T(\theta,\,\cdot\,)\big|_P := \lim_{B\to\{P\}} \frac{c\,\Delta P}{\Delta V}. \]

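If $\xi$ is another one-form at $P$, we define $T(\theta,\xi) = \xi(T(\theta,\,\cdot\,))$. Now we extend
the definition of $T$ by allowing scaling in $\theta$; given $\theta$ which is timelike or spacelike,
we define $\hat\theta$ by $\theta = b\hat\theta$, where $b = \sqrt{|\langle\theta,\theta\rangle|} > 0$, so that $\langle\hat\theta,\hat\theta\rangle = \pm 1$. We then define $T(\theta,\xi)$ as
$bT(\hat\theta,\xi)$, where the latter quantity has been defined above.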
Clearly, T (θ, ξ ) is linear in ξ . Physical reasoning as in classical mechanics
may be used to argue that T should also be linear in θ. What may be more
surprising is that when ξ is also either timelike or spacelike, we can argue that
T (θ, ξ ) = T (ξ, θ ) (see [189] or [207, Chapter 4]). We then see that T can be
extended to all forms by polarization, and yields a symmetric (2, 0)-tensor.
Let us interpret the components of this tensor in some inertial frame; this
is often how the tensor is defined. $T(dx^0,\,\cdot\,)$ is $c$ times the spatial density
of the energy-momentum four-vectors of the physical particles at $P$. Then
$T^{00} = T(dx^0, dx^0) = dx^0(T(dx^0,\,\cdot\,))$ is precisely the energy density $\rho$ at $P$ as
measured in this frame. It is instructive at this point, since it may seem that $T$
is a complicated object, to note as a simple exercise that the tensor $T$ can be
determined in terms of the energy densities for inertial observers.
For $i = 1, 2, 3$, $T^{0i} = cT(dt, dx^i)$ is $c$ times the $x^i$-component of the momentum
density (per unit spatial volume) at $P$. Similarly, for $T(dx^1,\,\cdot\,)$, a spacetime
box $B$ is now determined by a rectangle $R$ of area $\Delta A$ in the $x^2x^3$-plane, as well as
a side of length $\Delta x^0 = c\,\Delta t$. Thus $\Delta V = c\,\Delta t\,\Delta A$, and so $T^{10} \approx \Delta E/(c\,\Delta t\,\Delta A)$
equals $1/c$ times the rate of energy flux (per unit area per unit time) in the
$x^1$-direction. Note that by symmetry, $T^{10} = T^{01}$, which reflects the equivalence
of mass and energy (and note how the units work out). We move on to the
purely spatial components, such as $T^{11} \approx \Delta p^1/(\Delta t\,\Delta A)$; as the force is given by
$F = d\mathbf{p}/dt$, we see that $T^{11}$ is the normal stress in the $x^1$-direction, that is, the
force per unit area normal to the $x^2x^3$-plane. Another way to frame this is as
the flux in the $x^1$-direction of the $x^1$-component of the momentum. The normal
stress is sometimes called the pressure when it is independent of direction. On
the other hand the component $T^{12}$ is then a shear, the force per unit area on the
$x^2x^3$-plane acting in the $x^2$-direction; this is a component of force tangential to
the relevant area element, or alternately, the flux in the $x^1$-direction of the
$x^2$-component of the momentum.
As defined, the total stress-energy tensor encompasses all relevant physical
interactions, so that there is no net external force; from this we can derive
another key property of a stress-energy tensor, namely that it is divergence-
free. To motivate this, we compute a component of the divergence in inertial
coordinates (so that the Christoffel symbols vanish), again using the spatial
index i ∈ {1, 2, 3}, and with the Einstein summation convention still in force. In
such a coordinate system, $c$ times the zero-component of the divergence is then
$c(T^{00}{}_{,0} + T^{i0}{}_{,i}) = \partial\rho/\partial t + cT^{i0}{}_{,i}$. As discussed above, $cT^{i0}\,\partial/\partial x^i$ is the vector field
measuring the rate of flux (per unit area per unit time) of the energy. Applying
the divergence theorem to a spatial region $R$, we obtain the time rate of change
of the energy in the region as
\[ \frac{\partial}{\partial t}\int_R \rho\,dx = \int_R \frac{\partial\rho}{\partial t}\,dx = \int_R cT^{00}{}_{,0}\,dx. \]

By conservation of energy, this must balance with the rate of flux of energy into
the region across the spatial boundary $\partial R$ with outward unit normal $n$:

\[ \int_R cT^{00}{}_{,0}\,dx = \int_{\partial R}\Bigl\langle cT^{i0}\frac{\partial}{\partial x^i},\,-n\Bigr\rangle\,d\sigma = -\int_R cT^{i0}{}_{,i}\,dx, \]

where we used the divergence theorem. Since the region $R$ may be made
arbitrarily small, we obtain the 0-component of $\operatorname{div}_\eta T = 0$. The other components
may be derived similarly using conservation of momentum.
We now introduce several standard examples of the stress-energy tensor. Of
course, when in vacuum (free of fields or particles), T = 0. The next simplest
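example is that of dust. Consider a collection of particles which are all at rest
in some inertial frame $\widetilde{O}$. In this frame, the rest energy density $\rho$ of the dust
determines the stress-energy tensor:
\[ T = c^{-2}\rho\,\frac{\partial}{\partial\tilde t}\otimes\frac{\partial}{\partial\tilde t} = \rho\,\frac{\partial}{\partial\tilde x^0}\otimes\frac{\partial}{\partial\tilde x^0}. \]
This can be written invariantly as $T = c^{-2}\rho\,U\otimes U$, where $U$ is the four-velocity
of the dust. Using a Lorentz transformation, one can determine the energy density
in a frame $O$ with respect to which the dust moves with velocity $v$ aligned along
the $x$-axis. Indeed, in such a frame, (1.2.4) gives
\begin{align*}
U = \frac{\partial}{\partial\tilde t} &= \frac{1}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial t} + \frac{v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x} \\
 &= \frac{c}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x^0} + \frac{v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x},
\end{align*}
so that
\begin{align*}
T &= c^{-2}\rho\,U\otimes U \\
  &= \frac{\rho}{1-(v/c)^2}\frac{\partial}{\partial x^0}\otimes\frac{\partial}{\partial x^0}
   + \frac{\rho v/c}{1-(v/c)^2}\Bigl(\frac{\partial}{\partial x^0}\otimes\frac{\partial}{\partial x} + \frac{\partial}{\partial x}\otimes\frac{\partial}{\partial x^0}\Bigr)
   + \frac{\rho (v/c)^2}{1-(v/c)^2}\frac{\partial}{\partial x}\otimes\frac{\partial}{\partial x}. \tag{1.3.2}
\end{align*}

We note that the energy density as measured in $O$ is $\rho$ divided by two factors
$\sqrt{1-(v/c)^2}$: one for the transformation of energy and another for the length
contraction, which affects the value of the measured density in $O$. The other
summands in (1.3.2) likewise show two factors of $\sqrt{1-(v/c)^2}$ in the denominator,
for similar reasons.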
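
The components in (1.3.2) can be reproduced directly from $T = c^{-2}\rho\,U\otimes U$. The following sketch uses plain nested lists for the components in the basis $\partial/\partial x^0, \partial/\partial x^i$; the helper names and the numerical values of `rho` and `v` are illustrative assumptions.

```python
import math

def dust_four_velocity(v, c=1.0):
    """Components (U^0, U^1, U^2, U^3) of U for dust moving at speed v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [gamma * c, gamma * v, 0.0, 0.0]

def dust_tensor(rho, U, c=1.0):
    """Components T^{mu nu} = c^{-2} rho U^mu U^nu."""
    return [[rho * a * b / c ** 2 for b in U] for a in U]

rho, v, c = 2.0, 0.6, 1.0   # hypothetical rest energy density and speed
T = dust_tensor(rho, dust_four_velocity(v, c), c)

beta2 = (v / c) ** 2
expected_T00 = rho / (1.0 - beta2)            # two factors of sqrt(1 - (v/c)^2)
expected_T01 = rho * (v / c) / (1.0 - beta2)
expected_T11 = rho * beta2 / (1.0 - beta2)
# The eta-trace -T^{00} + T^{11} + T^{22} + T^{33} is frame independent.
trace = -T[0][0] + T[1][1] + T[2][2] + T[3][3]
```

Here `trace` equals $-\rho$ in every frame, while `T[0][0]` picks up the two factors discussed above.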
To amplify this further, we may write $c^{-2}\rho\,U\otimes U = P\otimes N$, where $P = m_0 U$
is the energy-momentum four-vector for a single dust particle (rest mass $m_0$) and
$N = nU$, for $n$ the number density (number of dust particles per unit volume), as
measured in the rest frame of the dust. Thus $m_0 n$ is the mass density in the rest
frame, and $m_0 n c^2 = \rho$ is the energy density in the rest frame. Moreover, the $\partial/\partial t$
components of $P$ and $N$ are $m_0/\sqrt{1-(v/c)^2}$ and $n/\sqrt{1-(v/c)^2}$, respectively,
the latter being the particle density as measured in $O$.


A perfect fluid is a continuum model specified by its four-velocity $U$ and
density $\rho$, as well as a further scalar quantity, the pressure $p$. For a perfect fluid,
in a momentarily comoving inertial reference frame, there is no heat conduction
($T^{0i} = 0$) and no shear stress ($T^{ij} = 0$ for $i \neq j$); since the latter requirement must
hold upon spatial rotation of such a frame, the matrix $[T^{ij}]$ must be a multiple
of the identity matrix. Perfect fluids exhibit no viscosity, and the pressure $p$
encodes the normal stress in an inertial frame momentarily comoving with the
fluid, so that $T^{ij} = p\,\delta^{ij}$ in such a frame.
An invariant way to express the perfect fluid stress-energy tensor is then
$T = c^{-2}(\rho + p)\,U\otimes U + p\,\eta^\sharp$, where $\eta^\sharp$ is the contravariant form of the metric
tensor $\eta$ (indices up). In a rest frame of a fluid element,
\[ T = \rho\,\frac{\partial}{\partial x^0}\otimes\frac{\partial}{\partial x^0} + p\,\delta^{ij}\frac{\partial}{\partial x^i}\otimes\frac{\partial}{\partial x^j}. \]
For comparison, the dust model is an idealization in which the constituent
particles have no random motion at all (they are all at rest in a certain frame),
and thus there is no pressure.
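
To see the invariant expression at work, the sketch below evaluates $T = c^{-2}(\rho + p)\,U\otimes U + p\,\eta^\sharp$ once in the rest frame and once for a fluid element moving along the $x$-axis. The function name, the sample values, and the choice $c = 1$ are assumptions made only for illustration.

```python
import math

# Contravariant Minkowski metric eta^{mu nu} in inertial coordinates (x^0, x^1, x^2, x^3).
ETA_UP = [[-1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]

def perfect_fluid_tensor(rho, p, U, c=1.0):
    """Components T^{mu nu} = c^{-2} (rho + p) U^mu U^nu + p eta^{mu nu}."""
    return [[(rho + p) * U[a] * U[b] / c ** 2 + p * ETA_UP[a][b]
             for b in range(4)] for a in range(4)]

def eta_trace(T):
    """Contraction eta_{mu nu} T^{mu nu}."""
    return -T[0][0] + T[1][1] + T[2][2] + T[3][3]

rho, p, c, v = 3.0, 0.5, 1.0, 0.6      # hypothetical density, pressure, speed
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)

T_rest = perfect_fluid_tensor(rho, p, [c, 0.0, 0.0, 0.0], c)
T_moving = perfect_fluid_tensor(rho, p, [gamma * c, gamma * v, 0.0, 0.0], c)
```

In the rest frame the tensor is diagonal with entries $(\rho, p, p, p)$, and the contraction with $\eta$ equals $-\rho + 3p$ in both frames; setting `p = 0.0` recovers the dust model.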
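Finally, as we will see in Example 2-20, the electromagnetic stress-energy
tensor is given by
\[ T^{\mu\nu} = \frac{1}{4\pi}\Bigl(F^{\mu\alpha}\eta_{\alpha\beta}F^{\nu\beta} - \tfrac14\,\eta^{\mu\nu}F_{\alpha\beta}F^{\alpha\beta}\Bigr). \]
By (1.1.7) we have
\[ T^{00} = \frac{1}{4\pi}\Bigl(F^{0\alpha}\eta_{\alpha\beta}F^{0\beta} + \tfrac14\bigl(-2|E|^2 + 2|B|^2\bigr)\Bigr) = \frac{1}{8\pi}\bigl(|E|^2 + |B|^2\bigr), \tag{1.3.3} \]
the familiar formula for the energy density of the electromagnetic field (in cgs
units), and similarly
\[ T^{0i} = \frac{1}{4\pi}(E\times B)^i. \tag{1.3.4} \]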

1.3.2.1. Brief remarks on force. Whereas force plays a key role in Newtonian
mechanics, energy and momentum play a more central role in relativistic dynamics.
For a discussion of transformation rules for acceleration and force, see, for
example, [96; 188; 189]. We will be content with introducing a four-vector for
the force, with Newton’s second law in mind, and see what it means.
If $\gamma(\tau)$ is a timelike curve parametrized by proper time $\tau$, recall that the
four-velocity is $U(\tau) = \gamma'(\tau)$, and the four-acceleration is the covariant derivative
$A(\tau) = \gamma''(\tau)$.
For the path of a particle of rest mass $m_0$, with energy-momentum $P = m_0 U$,
we define the four-force (sometimes called the Minkowski force) as $F = DP/d\tau$.
Thus $F = m_0 A + (dm_0/d\tau)\,U$, so that $F = m_0 A$ if the rest mass is unchanging,
which in turn is easily seen to be equivalent to the orthogonality of $U$ and $F$: since
$U(\tau) = \gamma'(\tau)$ has constant length, $A$ and $U$ are orthogonal, and so
$c^2\,dm_0/d\tau = -\langle F, U\rangle$.
Consider an inertial frame $O$ with respect to which the particle has instantaneous
velocity $\mathbf{v} = v^i\,\partial/\partial x^i$ (summation convention in use!), and let $v = |\mathbf{v}|$.
Recall that $dt/d\tau = 1/\sqrt{1-(v/c)^2} = m(v)/m_0$, and note that if $v = 0$, then
at this instant, $d(m(v))/d\tau = d(m(v))/dt = dm_0/dt = dm_0/d\tau$. We have
$P = m(v)\,\partial/\partial t + m(v)v^i\,\partial/\partial x^i = m(v)\,\partial/\partial t + \mathbf{p}$, and thus
\[ \frac{DP}{d\tau} = \frac{1}{\sqrt{1-(v/c)^2}}\frac{DP}{dt} = \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{d(m(v))}{dt}\frac{\partial}{\partial t} + \frac{dp^i}{dt}\frac{\partial}{\partial x^i}\Bigr). \]
Aside from the time dilation factor out front, the spatial part of this vector is the
classical force $\mathbf{f} = d\mathbf{p}/dt$ (written lowercase to avoid confusion here), so that
\[ F = \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{d(m(v))}{dt}\frac{\partial}{\partial t} + \mathbf{f}\Bigr). \]
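Note that if $O$ were momentarily comoving with the particle, we would have
$F = (dm_0/dt)\,\partial/\partial t + \mathbf{f}$, and that in general with respect to another frame $\widehat{O}$,
$\hat{\mathbf{f}} = d\hat{\mathbf{p}}/d\hat t$ is not the same as $\mathbf{f}$. Now, as noted above, $A$ is orthogonal to
\[ U = \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{\partial}{\partial t} + v^i\frac{\partial}{\partial x^i}\Bigr), \]
and $c^2\,dm_0/d\tau = -\langle F, U\rangle$, so that
\[ \frac{c^2}{\sqrt{1-(v/c)^2}}\frac{dm_0}{dt} = c^2\frac{dm_0}{d\tau} = -\langle F, U\rangle = \frac{1}{1-(v/c)^2}\Bigl(c^2\frac{dm(v)}{dt} - \langle\mathbf{f},\mathbf{v}\rangle\Bigr). \]
This yields $c^2\,dm(v)/dt = \langle\mathbf{f},\mathbf{v}\rangle + \sqrt{1-(v/c)^2}\;c^2\,dm_0/dt$; note that $\langle\mathbf{f},\mathbf{v}\rangle$ is
the classical measure of the rate at which the force $\mathbf{f}$ applied with velocity $\mathbf{v}$ is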
doing work, i.e., the (mechanical) power developed by the force. As E = m(v)c2 ,
we see that the measured rate of change of energy d E/dt in O is the sum of
two terms, the first of which we might interpret as the rate of change of kinetic
energy, and the second as the rate of change of heat energy, as it arises from the
rate of change of the internal energy as measured in O (cf. [189, Chapter V]).
This should not be surprising, as the energy-momentum four-vector delineates
relations between energy and momentum (hence mass) in different frames.
Given the discussion of the stress-energy tensor above in terms of energy and
momentum fluxes, if one measures the stress-energy fluxes for a system upon
which external forces or fields act (a non-closed system, then), the divergence
$(\operatorname{div}_\eta T)^\mu = T^{\mu\nu}{}_{;\nu}$ of the stress-energy tensor $T$ for the system is then a vector
field whose components encode the density of the net power and force applied
to the system (see [162, p. 188], for instance).
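Before we move on, we give one example, that of the force on a particle
of charge $q$ moving with four-velocity $U$ in an electromagnetic field given by
the Faraday tensor $F^{\mu\nu}$. The four-force is given by $F^\mu = (q/c)\,U_\nu F^{\mu\nu}$. By
skew-symmetry, $F^\mu U_\mu$ vanishes, so that the force is purely mechanical and does
not change the rest mass of the charged particle. We compute the components
in an inertial frame with respect to which the particle has velocity $\mathbf{v} = v^i\,\partial/\partial x^i$.
Then with $v_i = v^i$, we have
\[ U_\nu\,dx^\nu = \frac{1}{\sqrt{1-(v/c)^2}}\bigl(-c\,dx^0 + v_i\,dx^i\bigr), \]
so that
\begin{align*}
F &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{q}{c}\langle\mathbf{v}, E\rangle\frac{\partial}{\partial x^0} + q\Bigl(E + \frac{\mathbf{v}}{c}\times B\Bigr)^i\frac{\partial}{\partial x^i}\Bigr) \\
  &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{1}{c^2}\langle\mathbf{v}, qE\rangle\frac{\partial}{\partial t} + q\Bigl(E + \frac{\mathbf{v}}{c}\times B\Bigr)\Bigr).
\end{align*}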

We plainly see here the Lorentz force in the spatial components, as well as
(c−2 times) the power developed by the electric field on the moving charge in
the time component.
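
The orthogonality of this four-force to the four-velocity can be verified numerically from the component formula alone. The sketch below works in the basis $(\partial/\partial t, \partial/\partial x^i)$, where $\langle\partial/\partial t, \partial/\partial t\rangle = -c^2$; the helper names and all field and velocity values are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def gamma_factor(v, c):
    return 1.0 / math.sqrt(1.0 - dot(v, v) / c ** 2)

def four_velocity(v, c):
    """(time, space) components of U in the basis (d/dt, d/dx^i)."""
    g = gamma_factor(v, c)
    return g, [g * vi for vi in v]

def lorentz_four_force(q, v, E, B, c):
    """(time, space) components of the four-force, following the display above."""
    g = gamma_factor(v, c)
    v_cross_B = cross(v, B)
    time_part = g * q * dot(v, E) / c ** 2
    space_part = [g * q * (E[i] + v_cross_B[i] / c) for i in range(3)]
    return time_part, space_part

def minkowski_product(a, b, c):
    """Inner product of (time, space) pairs, with <d/dt, d/dt> = -c^2."""
    return -c ** 2 * a[0] * b[0] + dot(a[1], b[1])

c, q = 2.0, 1.5
v, E, B = [0.3, -0.4, 1.2], [1.0, 2.0, -0.5], [0.7, 0.1, 3.0]
F = lorentz_four_force(q, v, E, B, c)
U = four_velocity(v, c)
orthogonality_residual = minkowski_product(F, U, c)
```

The residual vanishes up to rounding because $(\mathbf{v}\times B)\cdot\mathbf{v} = 0$ and the power term in the time component cancels $q\langle E, \mathbf{v}\rangle$.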

1.4. Some geometric aspects of Minkowski spacetime

In this section we again use inertial coordinates $(x^0, x^1, x^2, x^3)$ for Minkowski
spacetime $\mathbb{M}^4 = \mathbb{R}^4_1$, with metric $\langle\,\cdot\,,\cdot\,\rangle = -(dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2$,
and analogously for $\mathbb{M}^{1+k} = \mathbb{R}^{1+k}_1$. We will blur the distinction between a point
$P$ and its coordinates $(x^\mu(P))$. We let $D$ be the Levi-Civita connection for
the Minkowski metric, so that if a vector field $Y$ has the expression $Y^\mu\,\partial/\partial x^\mu$
(summation convention) in inertial coordinates, then since $D_X(\partial/\partial x^\mu) = 0$, we
have simply
\[ D_X Y = D_X(Y^\mu)\frac{\partial}{\partial x^\mu} = X[Y^\mu]\frac{\partial}{\partial x^\mu}. \]
1.4.1. Hyperquadrics. We consider the level sets of the function

\[ F(x) = -(x^0)^2 + (x^1)^2 + (x^2)^2 + \cdots + (x^k)^2. \]

We start with the zero level set, $\Sigma = F^{-1}(0)$, which is precisely the set of all
points whose position vector from the origin $O$ is null; in other words, $\Sigma$ is the
lightcone from the origin $O$. The complement $\Sigma\setminus\{O\} = \Sigma^+\cup\Sigma^-$ is a smooth
null hypersurface: along $\Sigma\setminus\{O\}$, the null vector field $x^\mu\,\partial/\partial x^\mu$ is both tangent
and normal to $\Sigma$.
We move on to discuss the other level sets $\Sigma$, which are regular hypersurfaces.
Recall (or see 5.1.2) that for $Y$ and $Z$ tangent to $\Sigma$, if $D^\Sigma_Y Z$ is the tangential
component of $D_Y Z$, then $D^\Sigma$ is the Levi-Civita connection for $\Sigma$ in the induced
metric. We write $D_Y Z = D^\Sigma_Y Z + \mathrm{II}(Y, Z)$, where $\mathrm{II}$ is the second fundamental
form of $\Sigma$.

1.4.1.1. Hyperbolic space. We now consider the smooth hypersurface $\Sigma$ equal
to one of the two components of $F^{-1}(-r^2)$, for $r > 0$, say the component where
$x^0 > 0$. Along $\Sigma$, $n = r^{-1}x^\mu\,\partial/\partial x^\mu$ is a unit timelike normal vector field. The
induced metric on the hypersurface is thus Riemannian. If $Y = Y^\mu\,\partial/\partial x^\mu$ is
tangent to $\Sigma$, then
\[ D_Y n = r^{-1}D_Y(x^\mu)\frac{\partial}{\partial x^\mu} = r^{-1}Y^\mu\frac{\partial}{\partial x^\mu} = r^{-1}Y. \]
From here we see that $\Sigma$ is totally umbilic, that is, the second fundamental form
is proportional to the induced metric:

\[ \mathrm{II}(Y, Z) = \langle D_Y Z, n\rangle\langle n, n\rangle n = \langle D_Y n, Z\rangle n = r^{-1}\langle Y, Z\rangle n. \]

We can compute its curvature via the Gauss equation (see Proposition 5-5): for
$X, Y, Z, W$ tangent to $\Sigma$,

\[ \langle R^\Sigma(X, Y, Z), W\rangle = \langle R(X, Y, Z), W\rangle - \langle\mathrm{II}(X, Z), \mathrm{II}(Y, W)\rangle + \langle\mathrm{II}(X, W), \mathrm{II}(Y, Z)\rangle. \tag{1.4.1} \]

Thus with $R = 0$ on Minkowski spacetime, we insert the formula for the second
fundamental form to obtain

\[ \langle R^\Sigma(X, Y, Z), W\rangle = r^{-2}\bigl(-\langle X, Z\rangle\langle Y, W\rangle\langle n, n\rangle + \langle X, W\rangle\langle Y, Z\rangle\langle n, n\rangle\bigr). \tag{1.4.2} \]

If $E_1, E_2$ are orthonormal and span a two-plane $\Pi$ tangent to $\Sigma$, then the sectional
curvature (see (0.0.1)) of $\Pi$ in $\Sigma$ is $K(\Pi) = \langle R^\Sigma(E_1, E_2, E_2), E_1\rangle = -r^{-2}$.
Thus $\Sigma$ (for $k \geq 2$) has constant negative sectional curvature; being simply
connected, $\Sigma$ is therefore isometric to the hyperbolic space of curvature $-r^{-2}$.
In particular, $\Sigma \subset \mathbb{M}^4$ is isometric to $\mathbb{H}^3$ when $r = 1$. In $\mathbb{M}^{1+k}$ with $k \geq 2$, each
component of $F^{-1}(-1)$ is isometric to hyperbolic space $\mathbb{H}^k$.

1.4.1.2. De Sitter spacetime. The level set $F^{-1}(r^2) =: S^k_1(r)$ for $r > 0$ is a
smooth connected hypersurface which is topologically $\mathbb{R}\times S^{k-1}$, so that for
$k \geq 3$, $S^k_1(r)$ is simply connected. Along the level set, $n = r^{-1}x^\mu\,\partial/\partial x^\mu$ is a unit
spacelike normal vector field. The induced metric on $S^k_1(r)$ is Lorentzian: the
vector field
\[ \bigl(r^2 + (x^0)^2\bigr)\frac{\partial}{\partial x^0} + x^0 x^i\frac{\partial}{\partial x^i} \]
is tangent and timelike along the submanifold. If $\Pi$ is a nondegenerate two-plane
tangent to $S^k_1(r)$, then (0.0.1) and (1.4.2) imply $K(\Pi) = r^{-2}$; note that this holds
not only for a spacelike two-plane, but also for a nondegenerate plane $\Pi$ spanned
by orthonormal vectors $E_0$ (timelike) and $E_1$ (spacelike). We call $S^4_1(1)$ de Sitter
spacetime.
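
The claim that the displayed vector field is tangent and timelike along $S^k_1(r)$ amounts to two identities on the level set: its Minkowski product with the position vector vanishes, and its squared norm equals $-r^2(r^2 + (x^0)^2)$. The sketch below checks both at one sample point with $k = 4$; the helper names and the chosen point are illustrative assumptions.

```python
import math

def minkowski(a, b):
    """Minkowski product -a^0 b^0 + sum_i a^i b^i."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def point_on_level_set(x0, direction, r):
    """Point of F^{-1}(r^2) with time coordinate x0; `direction` is a unit vector in R^k."""
    radius = math.sqrt(r ** 2 + x0 ** 2)
    return [x0] + [radius * d for d in direction]

def candidate_field(x, r):
    """The vector field (r^2 + (x^0)^2) d/dx^0 + x^0 x^i d/dx^i evaluated at x."""
    return [r ** 2 + x[0] ** 2] + [x[0] * xi for xi in x[1:]]

r = 2.0
x = point_on_level_set(1.5, [0.6, 0.0, 0.8, 0.0], r)
V = candidate_field(x, r)

level = minkowski(x, x)          # should equal r^2
tangency = minkowski(V, x)       # should vanish: x is normal to the level set
norm_squared = minkowski(V, V)   # should equal -r^2 (r^2 + (x^0)^2) < 0
```

Since the position vector is normal to the level set, `tangency` vanishing is exactly the tangency condition, and the negative `norm_squared` confirms that the induced metric has a timelike direction.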

1.4.2. Conformal compactification of $\mathbb{M}^4$. We now present the classical
compactification of $\mathbb{M}^4$. The motivation is to topologically compactify the spacetime
in a way that preserves the causal structure, and in particular preserves the
nullcones. In this way, one can faithfully represent the null geometry on a
compact region, whose boundary in part represents the null paths at infinity. This
can be generalized to $\mathbb{M}^{1+k}$, and has been used to prove the existence of global
solutions to quasilinear hyperbolic systems with “small” initial data (cf. [54],
[51, XV.2]).
We will obtain the compactification via a conformal embedding, built using the
null structure, of $\mathbb{M}^4$ into the Einstein static universe $\mathbb{R}\times S^3$ with the Lorentzian
product metric. We compute in explicit coordinates for convenience, but the
construction is geometric in nature; for example, we first introduce coordinates
based on affine parameters along null geodesics. Indeed, consider advanced and
retarded null coordinates $v = x^0 + r$, $u = x^0 - r$, where $r^2 = (x^1)^2 + (x^2)^2 + (x^3)^2$.
The Minkowski metric becomes

\[ -du\,dv + \tfrac14 (v - u)^2\,\mathring g_{S^2}, \]

where $du\,dv = \tfrac12(du\otimes dv + dv\otimes du)$, and $\mathring g_{S^2}$ is the metric on the round unit
sphere. Note that the level sets of $u$ and $v$ are null. If one holds $u$ and a point
$\omega \in S^2$ fixed, then varying $v \to +\infty$ corresponds to going forward in time along
a null geodesic, to infinity. Likewise, with $v$ fixed, $u \to -\infty$ corresponds to
a path of light going to past infinity. The goal is to represent where these null
paths “are” at infinity. One way to do this is to use the inverse tangent function
to define new coordinates

\[ T = \tan^{-1}v + \tan^{-1}u, \qquad R = \tan^{-1}v - \tan^{-1}u. \]


Since $dT = (1+v^2)^{-1}dv + (1+u^2)^{-1}du$ and $dR = (1+v^2)^{-1}dv - (1+u^2)^{-1}du$,
we can easily derive the Jacobian determinant
\[ \frac{\partial(T, R)}{\partial(u, v)} = \frac{2}{(1+v^2)(1+u^2)} > 0. \]
Moreover,
\[ -dT^2 + dR^2 = \frac{4}{(1+v^2)(1+u^2)}(-du\,dv) =: \Omega^2(-du\,dv). \]
Note that $\Omega^2 = \dfrac{4}{(1+t^2+r^2)^2 - 4t^2r^2}$ is smooth on all of $\mathbb{M}^4$, and that
\[ \sin R = \sin(\tan^{-1}v)\cos(\tan^{-1}u) - \sin(\tan^{-1}u)\cos(\tan^{-1}v) = \frac{v - u}{\sqrt{1+v^2}\,\sqrt{1+u^2}}. \]
Thus $\sin^2 R = \tfrac14\Omega^2(v-u)^2$. This implies

\[ -dT^2 + dR^2 + (\sin^2 R)\,\mathring g_{S^2} = \Omega^2\bigl(-du\,dv + \tfrac14(v-u)^2\,\mathring g_{S^2}\bigr). \]


The metric on the left is manifestly conformal to the Minkowski metric, and is
readily identified as a Lorentzian product metric $(\mathbb{R}\times S^3, -dt^2 + \mathring g_{S^3})$, which
is in fact the Einstein static universe (Section 2.4.2). In other words, we have
produced an embedding of Minkowski spacetime into the Einstein static universe,
which is not an isometry, but a conformal isometry. Thus it preserves the causal
nature of vectors, in particular the null structure. The image of the embedding is
a bounded set, since $-\pi < T < \pi$ and $0 \leq R < \pi$ on Minkowski spacetime. The
boundary of the set is the union of two smooth null hypersurfaces $\mathscr{J}^\pm$, “scri-plus”
and “scri-minus”, where “scri” is short for “script I.” We note that $\Omega$ is a defining
function for $\mathscr{J}^\pm$, since $\Omega = 0$ there, with $d\Omega \neq 0$.
We now briefly describe some of the features of the boundary (Figure 3).
Since $T + R = 2\tan^{-1}v$, the null rays to the future end up ($v \to +\infty$) at
$T + R = \pi$, which for $0 < R < \pi$ gives $\mathscr{J}^+$. Similarly for $u \to -\infty$ we get
$T - R = 2\tan^{-1}u = -\pi$, which gives $\mathscr{J}^-$. The null vector $\partial/\partial T \mp \partial/\partial R$ is both
tangent and normal to $\mathscr{J}^\pm$. The closure of the image of Minkowski spacetime
can be represented by a $T$-$R$ triangle, bounded by $R = 0$ and $T \pm R = \pm\pi$.
Every point in this region represents a two-sphere, except where $R = 0$ or $R = \pi$,
each point of which represents a point. One can argue that timelike geodesics
must start at $i^-$ in the past, corresponding to $(T, R) = (-\pi, 0)$, and must end
at $i^+$, corresponding to $(T, R) = (\pi, 0)$. We let $i^0$ be the point corresponding
to $(T, R) = (0, \pi)$, which is called spacelike infinity. Spacelike curves with
$r \to +\infty$ have a limit at $i^0$ in the conformal picture.
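
The coordinate change can be explored numerically. The sketch below is only an illustration: the function names and sample points are assumptions, and $t$ stands for $x^0$. It maps $(t, r)$ to $(T, R)$, checks that the image lies in the triangle $-\pi < T - R \leq T + R < \pi$, and verifies the identity $\sin^2 R = \tfrac14\Omega^2(v-u)^2 = \Omega^2 r^2$.

```python
import math

def compactify(t, r):
    """Map (t, r), r >= 0, to (T, R) via v = t + r, u = t - r and the inverse tangent."""
    v, u = t + r, t - r
    return math.atan(v) + math.atan(u), math.atan(v) - math.atan(u)

def conformal_factor_squared(t, r):
    """Omega^2 = 4 / ((1 + v^2)(1 + u^2))."""
    v, u = t + r, t - r
    return 4.0 / ((1.0 + v * v) * (1.0 + u * u))

samples = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5), (1.0e3, 1.0e3), (-50.0, 1.0e6)]
images = [compactify(t, r) for t, r in samples]
identity_pairs = [(math.sin(R) ** 2, conformal_factor_squared(t, r) * r ** 2)
                  for (t, r), (T, R) in zip(samples, images)]
```

Points with large $r$ at moderate $t$, such as the last sample, land close to $i^0$ at $(T, R) = (0, \pi)$, while points with $t = r$ large approach $\mathscr{J}^+$.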

[Figure: the $T$-$R$ triangle, with labels $i^+$ ($T = \pi$), $i^0$ ($T = 0$) and $i^-$ ($T = -\pi$), and horizontal axis running from $R = \pi$ to $R = 0$.]

Figure 3. Conformal compactification of Minkowski spacetime.

Exercises

Exercise 1-4. a. Show that if $T\colon\mathbb{R}^2\to\mathbb{R}^2$ is continuous and invertible, maps
each line onto a line, and fixes the origin, then $T$ is a linear transformation. (Hint:
Use the parallelogram law for addition to get additivity of $T$. Infer from this and
continuity that $T(qw) = qT(w)$ for all $w\in\mathbb{R}^2$ and for any rational $q$, then use
continuity to extend to all $q\in\mathbb{R}$.)
b. Suppose $T\colon\mathbb{R}^n\to\mathbb{R}^n$ preserves Euclidean distance between points and fixes
the origin. Show that $T$ is linear, and hence is given by multiplication by an
orthogonal matrix.
c. Suppose $T\colon\mathbb{R}^{n+1}\to\mathbb{R}^{n+1}$ preserves the Minkowski spacetime interval
$\Delta s^2 = -(\Delta x^0)^2 + \sum_{k=1}^n(\Delta x^k)^2$ between points and fixes the origin. Show that
$T$ is linear, and hence is given by multiplication by a Lorentz transformation
matrix $\Lambda$.
d. Suppose that $\varphi\colon U\to V$ is a diffeomorphism between open, connected
subsets of $\mathbb{R}^{n+1}$, and suppose that $\varphi^*\eta = \eta$, i.e., $\varphi$ preserves the Minkowski
metric $\eta = -(dx^0)^2 + \sum_{k=1}^n(dx^k)^2$. Show that $\varphi$ is an affine linear function
of the coordinates $(x^\mu)$, and thus up to translation is given by multiplication
by a Lorentz matrix. (Hint: You might let $y^\mu(x) = \varphi^\mu(x)$ be the component
functions of $\varphi$ in inertial coordinates on $V$, and show these have vanishing second
derivatives. The components of $\varphi^*\eta$ are given in terms of the partials of $y^\mu$;
differentiate and permute indices in a manner similar to the computation of the
Christoffel symbols (which vanish in these coordinates) in terms of the metric to
conclude.)
Exercise 1-5 (Doppler factor). Suppose light with wavelength $\lambda_0$ emanates from
an emitter at rest in an inertial frame $O$, with respect to which a detector moves
at velocity $v$ along the same line as the propagation of light; let $\widetilde{O}$ be a rest
frame for the detector. As measured in $O$, the frequency $\nu_0$ of the light satisfies
$\lambda_0\nu_0 = c$, and the time from the start to the end of emission of one wavelength
is $\Delta t_0 = 1/\nu_0$. Let $\nu_1$ be the frequency of the light as measured by the detector,
so that $\Delta t_1 = 1/\nu_1$ is the time for the reception of one wavelength as measured
in $\widetilde{O}$. Derive the Doppler shift formula
\[ \frac{\Delta t_0}{\Delta t_1} = \frac{\nu_1}{\nu_0} = \frac{\sqrt{1 - v/c}}{\sqrt{1 + v/c}}, \]
and show this is consistent with the quantum mechanical formula $E = h\nu$. (Hint:
This is basically recasting the derivation of the Doppler factor from (1.2.3)
in the twin paradox example. Set up respective inertial coordinate systems
with common axis along the direction of motion, so that one can analyze the
problem in the relevant coordinates $(t, x)$ for $O$ and $(\tilde t, \tilde x)$ for $\widetilde{O}$. Suppose
that in $O$, the start and end of the emission of one wavelength of a photon of
light occur at $(0, 0)$ and $(\Delta t_0, 0)$, respectively. Let the worldline of the detector
be given in $O$ by $x = x_0 + vt$, and let the detections of the start and end of
the wavelength occur at $(t_1, x_1)$ and $(t_2, x_2)$, respectively, as measured in $O$.
Observe that $x_1 = ct_1 = x_0 + vt_1$ and $x_2 = c(t_2 - \Delta t_0) = x_0 + vt_2$. Use a
Lorentz transformation to convert to $\widetilde{O}$. Apply (1.3.1) to show that the ratio
$E_1/E_0$ of photon energies at reception and emission agrees with the ratio of the
frequencies.)
The following exercises represent a small sampling of basic problems from
semi-Riemannian geometry. They should be a review and a warmup for some
of the geometry you will see in later chapters. Please refer to the preface for
curvature and index conventions. We reiterate that we use the Einstein summation
convention, summing over a pair of repeated indices, one upper and one lower
(the latter including such notation as ∂/∂ x i ).
By a connection ∇, we mean an affine connection on the tangent bundle
T M of a manifold M. When (M, g = ⟨·, ·⟩) is semi-Riemannian, we use the
Levi-Civita connection ∇ (metric compatible (∇g = 0) with vanishing torsion).
We also assume tensor fields are (sufficiently) smooth.
Exercise 1-6. a. If $\nabla$ and $\hat\nabla$ are connections on $TM$, show that their difference
$S(X, Y) = \nabla_X Y - \hat\nabla_X Y$ is tensorial in both $X$ and $Y$; that is, it is $C^\infty(M)$-linear
in $X$ and $Y$.
b. If $\nabla$ is a connection on $TM$, show that $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]$ is
tensorial in $X$ and $Y$. We call $T$ the torsion tensor.
Note: $S$ and $T$ each determine a $(1, 2)$-tensor, e.g., $(\theta, X, Y) \mapsto \theta(T(X, Y))$,
where $\theta$ is a one-form. Observe that for each one-form $\theta$, if we let $\tau^\theta(X, Y) =
\theta(T(X, Y))$, then $\tau^\theta$ is a two-form (alternating).
c. Prove that for a one-form $\alpha$: $d\alpha(X, Y) = X[\alpha(Y)] - Y[\alpha(X)] - \alpha([X, Y])$.
d. If $\alpha$ is a one-form and $\nabla$ is a connection on $TM$ with torsion tensor $T$, prove
that $d\alpha(X, Y) = (\nabla_X\alpha)(Y) - (\nabla_Y\alpha)(X) + \tau^\alpha(X, Y)$.
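Exercise 1-7. Suppose $\nabla$ is a connection on $TM$, extended to tensor bundles by
the product rule. Suppose $(x^i)$ is a local coordinate system on $M$, and
\[ \nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j} = \Gamma^k_{ij}\frac{\partial}{\partial x^k}. \]
a. If we express $\nabla_{\frac{\partial}{\partial x^i}}dx^j$ as $C^j_{ik}\,dx^k$, find $C^j_{ik}$.
b. If $T$ is a $(1, 2)$-tensor field on $M$, and if the components of $T$ in the coordinate
system $(x^i)$ are $T^i_{jk}$, i.e.,
\[ T = T^i_{jk}\,\frac{\partial}{\partial x^i}\otimes dx^j\otimes dx^k, \]
show the components $T^i_{jk;\ell}$ of $\nabla T$ satisfy
\[ T^i_{jk;\ell} = T^i_{jk,\ell} + \Gamma^i_{\ell m}T^m_{jk} - \Gamma^m_{\ell j}T^i_{mk} - \Gamma^m_{\ell k}T^i_{jm}. \]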

Exercise 1-8 (parallel transport). Let $I$ be an interval containing the origin,
and let $\gamma\colon I\to(M, g)$ be a smooth curve. In local coordinates we have
$\gamma'(t) = \dot\gamma^i(t)\,\partial/\partial x^i$, where $\dot\gamma^i(t) = d\gamma^i/dt$. For a vector field $W$ along $\gamma$, recall
that the covariant derivative $DW/dt$ can be expressed in local coordinates as
\[ \frac{DW}{dt} = \Bigl(\frac{dW^k}{dt} + \Gamma^k_{ij}(\gamma(t))\,\dot\gamma^i(t)\,W^j(t)\Bigr)\frac{\partial}{\partial x^k}\Big|_{\gamma(t)}. \]
Let $P_t\colon T_{\gamma(0)}M\to T_{\gamma(t)}M$ be the parallel transport operator: for $w\in T_{\gamma(0)}M$,
$P_t(w) = W(t)$, where $W(t)$ solves the linear ODE $DW/dt = 0$ along $\gamma$, with
$W(0) = w$.
a. Use parallel transport to prove there exists a smooth orthonormal parallel
frame field $e_1(t), \ldots, e_n(t)$ along $\gamma$.
b. If $V$ is a smooth vector field along $\gamma$, show (perhaps using part a.) that the
covariant derivative satisfies
\[ \frac{DV}{dt}\Big|_{t=0} = \frac{d}{dt}\Big|_{t=0}P_t^{-1}(V(t)). \]
Exercise 1-9 (the Hessian). Suppose $(M, g)$ is semi-Riemannian, and $u$ is a
smooth function on $M$. The Hessian of $u$ is defined by $\operatorname{Hess}_g u = \nabla(du)$. It is a
$(0, 2)$-tensor; in local coordinates, $(\operatorname{Hess}_g u)_{ij} = u_{;ij}$ (and recall $u_{,i} = u_{;i}$, since
$du = u_{,i}\,dx^i$).
a. Show that $\operatorname{Hess}_g u(X, Y) = Y\bigl[X[u]\bigr] - (\nabla_Y X)[u]$, where we recall that $X[u] =
du(X)$ is the directional derivative of $u$ in the direction $X$.
b. Show that the Hessian is symmetric, in two ways: (i) using the identity in a.,
and (ii) writing out the components $u_{;ij}$. Identify the common fact needed for
either proof.
c. The Laplacian $\Delta_g$ is the trace of the Hessian: $\Delta_g u = \operatorname{tr}_g(\operatorname{Hess}_g u) = g^{ij}u_{;ij}$.
Compare the component expressions at the center point of a normal coordinate
chart for $(M, g)$ in case $g$ is Riemannian versus when $g$ is Lorentzian. The term
Laplacian is often reserved for the case $(M, g)$ is Riemannian (though sometimes
defined as the negative of our definition), and in the case $(M, g)$ is Lorentzian,
the trace of the Hessian is (again, up to a sign) the wave operator $\Box_g$.
Exercise 1-10 (divergence and Laplacian). Consider a semi-Riemannian mani-
fold (M, g) of dimension n with Levi-Civita connection ∇. For any vector field
$X$, $\nabla X$ is a $(1, 1)$-tensor: $\nabla X(\theta, Y) = \theta(\nabla_Y X)$, whose components are $X^i{}_{;j}$:
\[ \nabla X = X^i{}_{;j}\, \frac{\partial}{\partial x^i} \otimes dx^j . \]
The divergence of X is the contraction of ∇ X ; in coordinates, divg X = X i;i .
a. Explain how any (1, 1)-tensor, such as ∇ X , can be construed at each p ∈ M
as a linear operator on T p M, and conversely. What is contraction in terms of
linear operator terminology?
b. Suppose ω is a volume form on T p M; i.e., ω(e1 , . . . , en ) = ±1 for any
orthonormal frame {e1 , . . . , en } for T p M. Let {E 1 , . . . , E n } be an orthonormal
frame for T p M with dual frame {θ 1 , . . . , θ n } (i.e., θ i (E j ) = δ i j ). Show that
θ 1 ∧ θ 2 ∧ · · · ∧ θ n is a volume form, and that if ω is any volume form on T p M,
then ω = ±θ 1 ∧ θ 2 ∧ · · · ∧ θ n .
c. A smooth (or at least continuous) n-form defined on an open set U ⊂ M is
a local volume form if it is a volume form on T p M for each p ∈ U . Show that,
in local coordinates, $\mathring{\omega} = \sqrt{|\det(g_{ij})|}\; dx^1 \wedge \cdots \wedge dx^n$ gives a local volume form.
Show that M is orientable if and only if it admits a continuous (smooth in fact)


global volume form ω.
d. For any p-form α, consider the interior product ι X (α) = α(X, . . .), which is a
( p −1)-form. Show that for a smooth local volume form ω, d(ι X ω) = (divg X ) ω.
Conclude that, in local coordinates,
\[ \mathrm{div}_g X = \frac{1}{\sqrt{|\det g|}}\, \frac{\partial}{\partial x^i} \left( \sqrt{|\det g|}\; X^i \right). \]
What does this reduce to at the center point of a normal coordinate chart?
e. Argue that trg (Hessg u) = divg (gradg u), where gradg u = (du)♯ is the vector
dual to the one-form du under the metric g, i.e., du(X ) = ⟨X, gradg u⟩. Use
part d. to express trg (Hessg u) in local coordinates, and conclude that, in local
coordinates on a Riemannian manifold (M, g),
\[ \Delta_g u = \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^i} \left( g^{ij} \sqrt{\det g}\; \frac{\partial u}{\partial x^j} \right). \]
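As a sanity check on this coordinate formula (an added illustration, not part of the exercise), one can specialize it to a familiar case. The sketch below assumes the Euclidean plane in polar coordinates $(r, \theta)$ with $r > 0$; the choice of example is ours, and it should reproduce the usual polar Laplacian.

```latex
% Assumed example: Euclidean plane in polar coordinates (r, \theta), r > 0.
% Here g = dr^2 + r^2 d\theta^2, so \det g = r^2, g^{rr} = 1, g^{\theta\theta} = r^{-2},
% and the off-diagonal components of g^{ij} vanish.
\begin{align*}
\Delta_g u
 &= \frac{1}{r} \left[ \frac{\partial}{\partial r} \left( r\, \frac{\partial u}{\partial r} \right)
    + \frac{\partial}{\partial \theta} \left( \frac{1}{r^{2}}\, r\, \frac{\partial u}{\partial \theta} \right) \right] \\
 &= \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r}
    + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} .
\end{align*}
```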
Exercise 1-11 (Gauss’s divergence theorem and Green’s identity). Let (M, g)
be a Riemannian manifold, possibly with nonempty (smooth) boundary ∂ M. We
assume either that M is compact, or that the functions or vector fields in the
integrands below have compact support.
a. Assume (M, g) is oriented with global volume form ωg . If ∂ M is nonempty,
give it the induced orientation with induced volume form σg . If ν is the outward
unit normal to the boundary of M, use Exercise 1-10 to show that
\[ \int_M (\mathrm{div}_g X)\, \omega_g = \int_{\partial M} \langle X, \nu \rangle\, \sigma_g . \]

What, if anything, changes in the case of (M, g) semi-Riemannian?


b. Show that the analogous formula to that in part a. holds even if M is not
orientable, if the form $\omega$ is replaced by the volume measure $dv_g$ (and induced
measure $d\sigma_g$ on the boundary); recall that in local coordinates $dv_g = \sqrt{\det(g_{ij})}\; dx$,
where $dx$ is Lebesgue measure on $\mathbb{R}^n$.


c. For a function u, we let the normal derivative du(ν) = ∂u/∂ν be the directional
derivative in the ν-direction. Prove that
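\[ \int_M v\, \Delta_g u \; dv_g = \int_{\partial M} v\, \frac{\partial u}{\partial \nu} \; d\sigma_g - \int_M \langle \mathrm{grad}_g u, \mathrm{grad}_g v \rangle \; dv_g , \tag{1-11a} \]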

and then deduce Green’s identity


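\[ \int_M (u\, \Delta_g v - v\, \Delta_g u) \; dv_g = \int_{\partial M} \left( u\, \frac{\partial v}{\partial \nu} - v\, \frac{\partial u}{\partial \nu} \right) d\sigma_g . \tag{1-11b} \]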
d. Equation (1-11a) is often called integration by parts. Suppose u and v are


supported in a compact set contained in an open coordinate neighborhood as
follows: in terms of the coordinates, the boundary points correspond to x n = 0
(if this is a boundary chart), and there is some K > 0 such that u and v are
supported in {x ∈ Rn : x n ≥ 0 and |x i | ≤ K for all 1 ≤ i ≤ n − 1}. Write out
the identity (1-11a) in coordinates and verify that it corresponds to integration
by parts along coordinate directions. You might consider using a dual frame
to an adapted orthonormal frame along the boundary to relate dvg and dσg in
coordinates.
e. Suppose $\Delta_g u = -\lambda u$ for some nontrivial (smooth) function $u$ on a closed
(compact without boundary) Riemannian manifold (M, g). Show that λ ≥ 0.
If λ = 0, what is u?

Exercise 1-12 (Ricci formula). Let (M, g) be semi-Riemannian.


a. Prove the vector field version of the Ricci formula, which essentially is the
definition of the Riemann curvature tensor: $Z^i{}_{;jk} - Z^i{}_{;kj} = Z^\ell R^i_{kj\ell}$.
b. Deduce the equivalent one-form version of the Ricci formula: if $\alpha$ is a
one-form, then $\alpha_{i;jk} - \alpha_{i;kj} = \alpha_\ell R^\ell_{jki}$.
c. Use the Ricci formula to prove the following, for smooth functions $u$, where
$\nabla_g u := \mathrm{grad}_g u$ and $\Delta_g u = \mathrm{tr}_g(\mathrm{Hess}_g u)$ (for any signature metric):

\[ \langle \nabla_g(\Delta_g u), \nabla_g u \rangle + \langle \mathrm{Hess}_g u, \mathrm{Hess}_g u \rangle + \mathrm{Ric}_g(\nabla_g u, \nabla_g u) = \tfrac12 \Delta_g \big( \langle \nabla_g u, \nabla_g u \rangle \big). \]

Exercise 1-13 (Cartan’s structural equations). Let E 1 , . . . , E n be a local frame


field for T M with dual frame θ 1 , . . . , θ n , i.e., θ i (E j ) = δ i j . Let ∇ be a connection
on T M, with torsion tensor T (Exercise 1-6). Since ∇ X Y is tensorial in X , there is
a matrix $(\omega_i^j)$ of connection one-forms such that $\nabla_X E_i = \omega_i^j(X) E_j$. Furthermore,
there are two-forms τ j with T (X, Y ) = τ j (X, Y )E j = θ j (T (X, Y ))E j , so that
j
τ j = τ θ in the notation of Exercise 1-6.
a. What is (∇ X θ j )(E k ) in terms of the connection one-forms? Write the one-form
∇ X θ j as a linear combination of θ 1 , . . . , θ n .
b. Prove Cartan's first structural equation: $d\theta^j = \theta^i \wedge \omega_i^j + \tau^j$. If the connection
is torsion-free, and if the frame θ i = d x i is dual to a local coordinate frame,
what does this reduce to?
c. If (M, g) is Riemannian with Levi-Civita connection ∇, and if E 1 , . . . , E n is
a local orthonormal frame field, what can you say about the matrix $(\omega_i^j)$?
Now suppose $(M, g)$ is semi-Riemannian and $\nabla$ is the Levi-Civita connection.
In particular, $\tau = 0$. Let $(\Omega_i^j)$ be a matrix of two-forms defined by
$\Omega_i^j = \tfrac12 R^j_{k\ell i}\, \theta^k \wedge \theta^\ell$, so that $\Omega_i^j(X, Y) E_j = R(X, Y, E_i)$.
d. Prove Cartan's second structural equation $\Omega_i^j = d\omega_i^j - \omega_i^k \wedge \omega_k^j$.
e. Prove that $d\Omega_i^j = \omega_i^k \wedge \Omega_k^j - \Omega_i^k \wedge \omega_k^j = \omega_i^k \wedge \Omega_k^j - \omega_k^j \wedge \Omega_i^k$. Use this to derive
the differential Bianchi identity (cf. Proposition 2-1), written in components
as $R^j_{k\ell i;s} + R^j_{\ell s i;k} + R^j_{s k i;\ell} = 0$. (Hint: If you expand this in a local coordinate
frame using Christoffel symbols and set $C_{k\ell s} = (\Gamma^m_{ks} R^j_{m\ell i} + \Gamma^m_{\ell s} R^j_{kmi})$, observe
that $C_{k\ell s} + C_{\ell s k} + C_{s k\ell}$ must vanish.)
Remark: This technique for computing curvature is often applied to local ortho-
normal frame fields. While the metric takes a simple form in such a frame, there
is a trade-off, since unlike coordinate fields the Lie brackets [E i , E j ] do not
vanish for general frame fields.
Exercise 1-14. Let I ⊂ R be an open interval, and let u and v be smooth positive
functions defined on I . Consider the two-dimensional Lorentzian metric defined
for (t, r ) ∈ R × I by ḡ = −u(r ) dt 2 + v(r ) dr 2 .
a. Show that the nonzero Christoffel symbols of ḡ in the (t, r )-coordinates are
given by
\[ \Gamma^r_{rr} = \frac12 \frac{v'(r)}{v(r)}, \qquad \Gamma^r_{tt} = \frac12 \frac{u'(r)}{v(r)}, \qquad \Gamma^t_{tr} = \frac12 \frac{u'(r)}{u(r)} = \Gamma^t_{rt} . \]
b. Find ⟨∇r, ∇r ⟩ḡ and Hessḡ r .
c. Show that Ric(ḡ) = K ḡ, where
\[ K = -\frac12 \frac{u''(r)}{u(r) v(r)} + \frac14 \frac{(u'(r))^2}{(u(r))^2 v(r)} + \frac14 \frac{u'(r) v'(r)}{u(r) (v(r))^2} . \]
d. Compute and verify the Cartan structural equations (cf. Exercise 1-13) for the
Lorentzian-orthonormal frame $E_1 = u(r)^{-1/2}\, \partial/\partial t$, $E_2 = v(r)^{-1/2}\, \partial/\partial r$.
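The expression for K in part c. can be spot-checked numerically before attempting the derivation. The following is a minimal sketch, not part of the exercise: the function names `curvature_K` and `static_patch_K` are hypothetical, and the two test metrics (u = r², v = 1, and u = 1 − r², v = 1/(1 − r²)) are our own illustrative choices. The first is a flat metric written in accelerated-observer coordinates and should give K = 0; for the second, the formula evaluates to the constant K = 1.

```python
def curvature_K(u, du, ddu, v, dv):
    """Evaluate K from part c. given u, u', u'', v, v' at a single value of r."""
    return (-0.5 * ddu / (u * v)
            + 0.25 * du ** 2 / (u ** 2 * v)
            + 0.25 * du * dv / (u * v ** 2))


def rindler_K(r):
    """Assumed example: u = r^2, v = 1 (a flat metric), for r > 0."""
    return curvature_K(r * r, 2.0 * r, 2.0, 1.0, 0.0)


def static_patch_K(r):
    """Assumed example: u = 1 - r^2, v = 1/(1 - r^2), for |r| < 1."""
    u = 1.0 - r * r
    du = -2.0 * r
    ddu = -2.0
    v = 1.0 / u
    dv = 2.0 * r / u ** 2  # derivative of 1/(1 - r^2)
    return curvature_K(u, du, ddu, v, dv)


if __name__ == "__main__":
    for r in (0.1, 0.3, 0.7):
        print(r, rindler_K(r), static_patch_K(r))
```

Since only pointwise values of u, v and their derivatives enter, any other pair of smooth positive functions can be tested in the same way.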
Exercise 1-15 (curvature of warped products). Let B and F be manifolds of
respective dimensions b and d, and let M = B × F. For p ∈ B and q ∈ F, we can
identify T( p,q) M with T p B ⊕ Tq F, by lifting via the canonical projection maps
π B : M → B and π F : M → F given by π B ( p, q) = p and π F ( p, q) = q. Given
a point ( p, q) ∈ M and vector v ∈ T p B, there is a unique vector V ∈ T( p,q) M
such that (π B )∗ (V ) = v and (π F )∗ (V ) = 0, and similarly W ∈ T( p,q) M can be
analogously associated to w ∈ Tq F. We thus identify vectors tangent to B or to F
as tangent vectors to M, and any vector in T( p,q) M can be uniquely expressed as
V + W , with V and W as above. If g B and gF are semi-Riemannian metrics on B
and F respectively, and if f : B → (0, +∞) is a smooth positive function, then
we can consider the warped product metric g = g B + f 2 g F , or more precisely,


g = π B∗ (g B ) + (π B∗ f )2 π F∗ (g F ), where π B∗ f = f ◦ π B . Bracket notation for the
metric g will be employed. Let ∇, ∇ B and ∇ F be the associated Levi-Civita
connections for g, g B and g F , and let R, R B and R F be their Riemann tensors.
In what follows, these identifications of the tangent spaces will be in force and
$f^{-1}$ will denote $1/f$. Furthermore, computations can be done either invariantly
(cf. [174, Chapter 7], with the opposite curvature convention) or using adapted
coordinates (x 1 , . . . , x b ; y 1 , . . . , y d ), where (x i ) are local coordinates for B and
(y j ) for F, with associated tangent vectors ∂/∂ x i and ∂/∂ y j , regarded also as
tangent vectors to M under the identifications above.

a. For X and Y tangent to B, and V and W tangent to F, show that

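\[ \begin{aligned}
\nabla_X Y &= \nabla^B_X Y, \\
\nabla_X V &= f^{-1} (\nabla_X f)\, V = \nabla_V X, \\
\nabla_V W &= -f^{-1} \langle V, W \rangle\, \mathrm{grad}_{g_B} f + \nabla^F_V W \\
 &= -f\, g_F(V, W)\, \mathrm{grad}_{g_B} f + \nabla^F_V W .
\end{aligned} \]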

(Note: Computing Christoffel symbols in local coordinates is straightforward.)

b. Show that for any smooth function ϕ : B → R, the gradient gradg (π B∗ ϕ) agrees
(under the identification above) with gradg B (ϕ). Show that Hessg B ϕ(X, Y ) =
Hessg (π B∗ ϕ)(X, Y ) if X and Y are vectors tangent to B. Can you give an example
of a warped product metric, along with a function ϕ and a vector field Z tangent
to M, for which Hessg B ϕ((π B )∗ (Z ), (π B )∗ (Z )) ̸= Hessg (π B∗ ϕ)(Z , Z )?

c. Let X , Y and Z be tangent fields to B, and let U , V and W be tangent to F.


Verify the following curvature formulae:

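\[ \begin{aligned}
R(X, Y, Z) &= R_B(X, Y, Z), \\
R(X, V, Y) &= f^{-1} \big( \mathrm{Hess}_{g_B} f(X, Y) \big) V, \\
R(X, Y, V) &= 0 = R(V, W, X), \\
R(V, X, W) &= f^{-1} \langle V, W \rangle\, \nabla^B_X (\mathrm{grad}_{g_B} f) = f\, g_F(V, W)\, \nabla^B_X (\mathrm{grad}_{g_B} f), \\
R(U, V, W) &= R_F(U, V, W) - f^{-2} \langle \mathrm{grad}_{g_B} f, \mathrm{grad}_{g_B} f \rangle \big( \langle V, W \rangle U - \langle U, W \rangle V \big) \\
 &= R_F(U, V, W) - \langle \mathrm{grad}_{g_B} f, \mathrm{grad}_{g_B} f \rangle \big( g_F(V, W) U - g_F(U, W) V \big).
\end{aligned} \]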

d. Let X , Y be tangent fields to B, and let V and W be tangent to F. Verify


the following Ricci curvature formulae, where $\Delta_{g_B} f = \mathrm{tr}_{g_B}(\mathrm{Hess}_{g_B} f)$ (in any
signature):

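\[ \begin{aligned}
\mathrm{Ric}_g(X, Y) &= \mathrm{Ric}_{g_B}(X, Y) - d \cdot f^{-1}\, \mathrm{Hess}_{g_B} f(X, Y), \\
\mathrm{Ric}_g(X, V) &= 0, \\
\mathrm{Ric}_g(V, W) &= \mathrm{Ric}_{g_F}(V, W) - \langle V, W \rangle \big( f^{-1} \Delta_{g_B} f + (d-1) f^{-2} \langle \mathrm{grad}_{g_B} f, \mathrm{grad}_{g_B} f \rangle \big) \\
 &= \mathrm{Ric}_{g_F}(V, W) - g_F(V, W) \big( f\, \Delta_{g_B} f + (d-1) \langle \mathrm{grad}_{g_B} f, \mathrm{grad}_{g_B} f \rangle \big).
\end{aligned} \]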
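A quick consistency check of these Ricci formulae (an added illustration, not part of the exercise) is to apply them to a warped product known to be flat. The sketch below assumes the example of punctured Euclidean space written in polar form, with the round unit sphere as fiber; the only outside fact used is that the unit sphere $S^d$ has $\mathrm{Ric} = (d-1)\, g_{S^d}$.

```latex
% Assumed example: B = (0, \infty) with g_B = dr^2, F = S^d with the round unit metric,
% and f(r) = r, so that g = dr^2 + r^2 g_{S^d} is the flat metric on R^{d+1} \setminus \{0\}.
% Then Hess_{g_B} f = 0, \Delta_{g_B} f = 0, <grad f, grad f> = 1, Ric_{g_B} = 0.
\begin{align*}
\mathrm{Ric}_g(X, Y) &= 0 - d \cdot r^{-1} \cdot 0 = 0, \\
\mathrm{Ric}_g(V, W) &= (d-1)\, g_F(V, W) - g_F(V, W) \big( r \cdot 0 + (d-1) \cdot 1 \big) = 0 .
\end{align*}
```

Both components vanish, as they must for a flat metric.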

Exercise 1-16 (totally geodesic submanifolds). Let $M \subset (\bar M, \bar g)$ be an embedded
submanifold of a semi-Riemannian manifold, with induced metric $g$ on $M$. If $v$
is tangent to $M$, let $\gamma_v$ be the $g$-geodesic on $M$ with $\gamma_v'(0) = v$, and let $\bar\gamma_v$ be the
$\bar g$-geodesic on $\bar M$ with $\bar\gamma_v'(0) = v$. For convenience, we restrict the domain of
these curves to a common interval $I$ containing 0.
a. Show that the second fundamental form $\mathrm{II}(X, Y) = (\bar\nabla_X Y)^\perp$ vanishes identically
on $M$ if and only if for every $v$ tangent to $M$, $\gamma_v = \bar\gamma_v$. In this case we say
$M$ is totally geodesic in $\bar M$.
b. For semi-Riemannian manifolds $(M_1, g_1)$ and $(M_2, g_2)$, we consider the
product $(\bar M = M_1 \times M_2,\ \bar g = g_1 \oplus g_2)$. Show that $\{p_1\} \times M_2$ and $M_1 \times \{p_2\}$, for
any $p_i \in M_i$, are totally geodesic in $\bar M$.

Exercise 1-17 (curvature and parallel transport). The curvature tensor can be
computed via parallel transport, and in fact it measures the failure of path
independence of parallel transport. Some geometry texts skip this, whereas
some general relativity texts point this out, to varying degrees of mathematical
precision. See, for example, [207; 218].
Let g be a semi-Riemannian metric on M, and let p ∈ M. Consider a coordinate
chart ϕ : U ⊂ Rn → M centered at p, and let B ⊂ U be a closed rectangle in
a two-dimensional coordinate plane around the origin, say (x 1 , x 2 ) ∈ B for
max(|x 1 |, |x 2 |) ≤ ε0 . Given a vector V ∈ T p M, define a vector field along the
coordinate mapping from B to M as follows: for (x 1 , x 2 ) ∈ B, let V (x 1 , x 2 ) ∈
Tϕ(x 1 ,x 2 ) M be the vector obtained by the parallel transport of V ∈ T p M along the
image under ϕ of the segment from (0, 0) to (x 1 , 0), and then along the image
of the segment from (x 1 , 0) to (x 1 , x 2 ). By smooth dependence of solutions to
ODE, this depends smoothly on $(x^1, x^2)$. Similarly, let $\tilde V(x^1, x^2)$ be the vector
field obtained by the parallel transport of V ∈ T p M along the image of the
segment from (0, 0) to (0, x 2 ), and then along the image of the segment from
(0, x 2 ) to (x 1 , x 2 ). We remark that we can compute in coordinates, i.e., compute
in (U, ϕ ∗ g), which we will do without further comment, pushing forward or
pulling back quantities between U and M as needed.
Given $(x^1, x^2) \in B$, let $\gamma_{(x^1,x^2)}$ be the piecewise linear path parametrizing
the rectangle with vertices $(0, 0)$, $(x^1, 0)$, $(x^1, x^2)$ and $(0, x^2)$, oriented by this
ordering, and let $\tilde W_{(x^1,x^2)}$ be the vector field along $\gamma_{(x^1,x^2)}$ obtained by parallel
transport of $V \in T_p M$ around $\gamma_{(x^1,x^2)}$. Then $\tilde W_{(x^1,x^2)}(\xi^1, \xi^2) = V(\xi^1, \xi^2)$ for any
$(\xi^1, \xi^2)$ on either of the first two segments of $\gamma_{(x^1,x^2)}$. Let $W(x^1, x^2) \in T_p M$
be the value of $\tilde W_{(x^1,x^2)}$ at the final point along the map $\gamma_{(x^1,x^2)}$; in other words,
$W(x^1, x^2)$ is the vector obtained by parallel transport of $V(x^1, x^2)$ first along
the line from $(x^1, x^2)$ to $(0, x^2)$, and then along the line from $(0, x^2)$ to $(0, 0)$.
Given $V$, we can choose a uniform constant for the “O”-estimates below, or
likewise, given $K$, we have uniform “O”-estimates for all $|V|_g \le K$.
a. Note that $V(x^1, x^2) = \tilde V(x^1, x^2)$ along the axes, where $x^1 x^2 = 0$. Show that
$V(x^1, x^2) - \tilde V(x^1, x^2) = O(|x^1 x^2|)$. (You might write the parallel transport
system in coordinates along the sides of the rectangle and estimate the change
in the vector fields along pairs of parallel sides by estimating an appropriate
integral.)
b. Note that W = V ∈ T p M for x 1 x 2 = 0. Argue that |W − V | = O(|x 1 x 2 |).
Remark. You can make the corresponding construction on any rectangle
$[x^1, x^1 + \Delta x^1] \times [x^2, x^2 + \Delta x^2] \subset B$,
starting from $V(x^1, x^2)$, say, and the analogous difference is $O(|\Delta x^1 \Delta x^2|)$.
c. Show that
\[ \lim_{\substack{(x^1, x^2) \to (0,0) \\ x^1 x^2 \neq 0}} \frac{W(x^1, x^2) - V}{x^1 x^2} = R\left( \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^1}, V \right). \]

(Hint: Write the numerator above in terms of integrals along the sides of $\gamma_{(x^1,x^2)}$.
You again want to estimate the integrals in parallel pairs. The answer will drop out
once you argue that up to an acceptable error, you can replace the $\tilde W_{(x^1,x^2)}$-term
in the integrals by the appropriate $V$-term or $\tilde V$-term.)

CHAPTER 2

The Einstein equation

The theory of special relativity incorporates a modification of Newtonian
mechanics together with electromagnetism. A natural question to consider is how
gravitation fits into the framework of relativity. We focus our analysis of this
question along two main ideas, that of the equivalence between uniform accel-
eration and a uniform gravitational field, and that of the gravitational redshift,
which will lead us to the Einstein equation. We begin, however, by reviewing
Newton’s law of gravitation.

2.1. Newtonian gravity

Newton’s law of gravity can be formulated as follows. If two objects are separated
by a spatial distance r , then the magnitude of the gravitational force between
them is given by F = Gm g Mg /r 2 , where the direction of the force is along the
line from one mass to the other. Here m g and Mg are the gravitational masses
associated to the two objects, and G is Newton’s gravitational constant. If r̂ is the
unit vector from the object of mass Mg to the other object, then the force on the
object of mass m g is F = −(Gm g Mg /r 2 ) r̂. If the object of mass Mg is located
at the origin, and x ∈ R3 is the position of the other object, then r = |x|, r̂ = x/r ,
and we can write the force as follows, where ∇ is the Euclidean gradient:

\[ F = -\frac{G m_g M_g}{|x|^2}\, \hat r = m_g \nabla \left( \frac{G M_g}{|x|} \right) = -m_g \nabla \Phi, \tag{2.1.1} \]

where $\Phi(x) := -G M_g/|x|$ is the gravitational potential associated to the mass
Mg . In analogy with Coulomb’s law of electrostatics, namely that the electric
force between stationary charged particles (with charges q1 and q2 ) is given by
F = q1 q2 /r 2 (in cgs units), m g and Mg play the role of gravitational charges.
Of course, there is already a notion of mass embodied in Newton’s second
law: in this context we write it as F = m i a, and call m i the inertial mass. By
equating forces, we solve for the acceleration of an object of gravitational mass
m g and inertial mass m i due to the gravitational force of an object of gravitational

mass Mg :
\[ a = -\frac{m_g}{m_i} \nabla \Phi . \]

We could in principle use this equation to discern the ratio of the inertial to the
gravitational mass for various objects. It turns out that the acceleration is the
same for all bodies, and hence the mass ratio is constant, a result epitomized
by the apocryphal experiments of Galileo dropping objects of different masses
from the tower in Pisa. By adjusting G, we may assume, then, that m i = m g :
the inertial and gravitational masses agree. The effect of gravity is universal:
it accelerates all objects the same way, independent of what precisely comprises
the mass. In this way gravity is decidedly different from electromagnetism.
Before we move on, we note that the potential function for Newtonian gravity
satisfies a simple partial differential equation. Indeed, away from x = 0, the
function $\Phi(x) = -GM/|x|$ is harmonic (with respect to the Euclidean metric),
i.e., $\Delta\Phi = 0$, as you can easily check. Of course, $\Delta\Phi$ can be interpreted globally
as a distribution, say $T = \Delta\Phi$, and we obtain the equation

\[ \Delta\Phi = 4\pi G M \delta_0 , \tag{2.1.2} \]

where δ0 is the Dirac measure at the origin (the “location” of the mass M).
Indeed, suppose ψ ∈ Cc∞ (R3 ) is any smooth function supported in the ball of
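radius $r_0$ around the origin. There exists a $C > 0$ such that $|(\Phi\nabla\psi)(x)| \le C/|x|$
for all $|x| \ne 0$; hence, for all $\varepsilon > 0$, we have $\left| \int_{\{|x| = \varepsilon\}} \Phi \nabla\psi \cdot \hat r \; d\sigma \right| \le 4\pi C \varepsilon$. Then,
since $1/|x| \in L^1_{\mathrm{loc}}(\mathbb{R}^3)$, an application of Gauss's divergence theorem, along with
Green's identity $\mathrm{div}(\Phi\nabla\psi - \psi\nabla\Phi) = \Phi\Delta\psi - \psi\Delta\Phi$ (compare (1-11b)) and the
vanishing of $\Delta\Phi(x)$ for $|x| \ne 0$, yields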
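\[ \begin{aligned}
T(\psi) := \int_{\mathbb{R}^3} \Phi\, \Delta\psi \; dx
 &= \lim_{\varepsilon \searrow 0} \int_{\{\varepsilon \le |x| \le r_0\}} \Phi\, \Delta\psi \; dx \\
 &= -\lim_{\varepsilon \searrow 0} \int_{\{|x| = \varepsilon\}} (\Phi\nabla\psi - \psi\nabla\Phi) \cdot \hat r \; d\sigma \\
 &= \lim_{\varepsilon \searrow 0} \int_{\{|x| = \varepsilon\}} \psi\, \frac{GM}{|x|^2}\, \hat r \cdot \hat r \; d\sigma \\
 &= 4\pi G M \lim_{\varepsilon \searrow 0} \frac{1}{4\pi\varepsilon^2} \int_{\{|x| = \varepsilon\}} \psi \; d\sigma \\
 &= 4\pi G M \psi(0),
\end{aligned} \]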

where the last identity follows by continuity.
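The two facts driving this computation can be checked numerically: the potential is harmonic away from the origin, and the outward flux of its gradient through any sphere centered at the origin equals 4πGM. The following is a minimal sketch using finite differences; the function names, the step sizes, and the normalization GM = 1 are our own assumptions, chosen only for illustration.

```python
import math


def phi(x, y, z, gm=1.0):
    """Newtonian potential -GM/|x| of a point mass at the origin (GM = 1 assumed)."""
    return -gm / math.sqrt(x * x + y * y + z * z)


def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of the Euclidean Laplacian."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)


def radial_flux(radius, gm=1.0, h=1e-6):
    """Flux of grad(phi) through the sphere |x| = radius.

    By spherical symmetry the radial derivative is constant on the sphere,
    so it is sampled once along the x-axis and multiplied by the area.
    """
    dphi_dr = (phi(radius + h, 0.0, 0.0, gm) - phi(radius - h, 0.0, 0.0, gm)) / (2.0 * h)
    return 4.0 * math.pi * radius ** 2 * dphi_dr


if __name__ == "__main__":
    print("Laplacian at (1, 2, 2):", laplacian(phi, 1.0, 2.0, 2.0))
    for radius in (0.5, 1.0, 3.0):
        print("flux through radius", radius, ":", radial_flux(radius))
```

The flux being independent of the radius is the numerical counterpart of the limit ε ↘ 0 in the last steps of the derivation.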



In this case the matter density is σ = Mδ0 . For a more general matter distri-
bution of density σ , the gravitational potential solves Poisson’s equation

\[ \Delta\Phi = 4\pi G \sigma . \tag{2.1.3} \]

If $\sigma$ is compactly supported (or more generally, if $\sigma$ decays sufficiently at infinity),
we may choose $\Phi(x) \to 0$ as $|x| \to \infty$.
Newton’s law of gravitation has a flaw, which Newton himself critiqued.
Namely, the gravitational force between two masses does not appear to be
effected through an intermediary, even if the masses are far apart. This “action
at a distance” leads immediately to causality violation, since the motion of one
mass would instantly affect the gravitational field everywhere else. This would
mean that gravitational effects have infinite speed of propagation, which seems
troublesome whether or not the discussion is framed in the context of special
relativity. Einstein sought to rectify this, and in doing so to incorporate another
fundamental force within a relativistic framework.

2.2. From the equivalence principle to general relativity

In Minkowski spacetime there is a preferred set of coordinate charts, corresponding
in physics terminology to inertial observers. In such charts the metric for
Minkowski spacetime has the familiar form $\eta_{\mu\nu}\, dx^\mu dx^\nu = -(dx^0)^2 + \delta_{ij}\, dx^i dx^j$,
and, up to a spacetime translation, any two such charts are related by a Lorentz
transformation. These charts are the analogues of Cartesian coordinate systems
for Euclidean space. From the point of view of physics, coordinates from an
inertial chart correspond to measurements made by an inertial observer. The
principle of (special) relativity asserts that there are no preferred inertial observers,
and as such physical laws should take the same form in all inertial
frames (sometimes called the principle of special covariance). An interesting
physics question is whether inertial observers can exist in principle when a
nontrivial gravitational field is present, which relates mathematically to whether
the spacetime metric can be flat, as in Minkowski spacetime.

2.2.1. The equivalence principle. We can bring to bear upon the question of the
existence of inertial frames a famous thought experiment of Einstein. Suppose
there were an inertial frame of reference, say a small lab room isolated from
other forces or fields. In such a frame, if one lets go of a ball (or rather a test
particle, say), it would tend to stay at rest. Now suppose a rocket is attached to
the top of the room, and then accelerates the room “upward” at a uniform rate.
If one lets go of a ball now, it will “fall” toward the floor, just as it would in a
uniform gravitational field.3 In this case, one cannot distinguish, by making local
measurements of the paths of objects upon which no forces (other than possibly
gravity) act, whether the frame is non-inertial, or whether there is a uniform
gravitational field present. Note that this depends on the universality of gravity:
it imparts the same acceleration to all objects. This version of the equivalence
principle basically comes down to the equality of gravitational and inertial masses:
if they were different, then one could distinguish between the two situations by
making appropriate measurements. The equality of the gravitational and inertial
masses is thus a reflection of the equivalence of a uniformly accelerating frame
with an inertial frame in which a uniform gravitational field is present. Other
versions of the equivalence principle involve other laws of physics, asserting
that measurements involving those laws cannot distinguish between a uniformly
accelerating frame and a frame in which a uniform gravitational field is present.
We have just seen how to “create” a uniform gravitational field via acceleration.
Conversely, consider now the lab room as a spaceship in orbit, or closer to home,
a compartment of one of those amusement park “free fall” rides. In free fall,
if one lets go of a small object (not safe to try this on the free fall ride!), its
position with respect to the room is constant, because a uniform gravitational
field accelerates all objects the same. The law of inertia appears to hold, and
the observer in free fall will then not detect a uniform gravitational field. We on
the earth claim to detect such a field precisely because we are not in free fall:
the contact forces with the earth keep us from a free fall path, and thus if we
drop an apple, we observe it “fall” to the earth under the force of gravity. This is
equivalent to the accelerated room: we feel the contact force between the floor
and our legs, and we observe objects falling toward the ground.
Even light cannot escape “gravity’s” pull: if we imagine a light ray entering
the room at one end moving in a straight line in an inertial frame, the path of the
light is curved in the accelerated frame of reference. Einstein reasoned that by
equivalence, a gravitational field should bend the paths of light rays too. Thus,
were there to exist an inertial reference frame, where a nonzero gravitational
field accounts for acceleration not attributed to other forces, light rays would
apparently not move along straight lines in this frame. (While Newton had
actually anticipated the deflection of light by a massive object, Einstein was able

3 Strictly speaking, we are considering a lab frame which is limited in extent in both space
and time, so that we can reliably expect a gravitational field to be roughly uniform, causing an
approximately constant acceleration. In regions more extended in spacetime, the non-uniformity
of the gravitational field will give rise to tidal forces that could be used to distinguish between
uniform acceleration and a gravitational field, and in fact will be directly related to the curvature
of spacetime, as we will see below.
to use his theory of gravitation to give a much more accurate prediction than that
of classical physics.)
A natural question is how to proceed with these notions and their ramifications
for the question of the existence of inertial frames. Before doing this, we introduce
the physical phenomenon of gravitational redshift.

2.2.2. Gravitational redshift. We recall the Doppler shift formula, as seen earlier
in the twin paradox example. Suppose light with wavelength λ0 emanates from
an emitter at rest in an inertial frame O; the frequency of the light ν0 satisfies
λ0 ν0 = c, and the time between the start and end of the emission of one full
wavelength is $\Delta t_0 = 1/\nu_0$. The light is absorbed by a detector moving at velocity
$v$ with respect to $O$, along the same line as the propagation of light. We let $\tilde O$ be
the rest frame of the detector, and let $\nu_1$ be the frequency of the light as measured
by the detector in $\tilde O$, so that $\Delta t_1 = 1/\nu_1$ is the time for the absorption of one
wavelength as measured in $\tilde O$. As shown in Exercise 1-5, we have

\[ \frac{\Delta t_0}{\Delta t_1} = \frac{\nu_1}{\nu_0} = \sqrt{\frac{1 - v/c}{1 + v/c}} . \]

We next show that the gravitational redshift, which is derived in concert with
and provides evidence for the equivalence between uniform acceleration and a
uniform gravitational field, places a roadblock in the way of the existence of an
inertial observer, and thus calls into question whether one can mesh gravity with
the Minkowskian geometry of special relativity. Indeed, imagine two rockets
moving along the $y$-axis, one following the other at a fixed distance $\Delta y$, and with
a uniform acceleration $a > 0$ with respect to some inertial frame $O$. Suppose the
frame $O$ is momentarily comoving with the rockets at the instant a photon of
wavelength $\lambda_0 = c/\nu_0$ is emitted from the trailing rocket to the lead rocket. In the
time $\Delta t$ (measured in $O$) the photon travels to the lead rocket, the rockets have
a change in velocity $\Delta v = a \Delta t$, so there will be a Doppler shift in the reported
frequency of the received photon, from the difference in velocity at reception and
emission. The Doppler shift can be computed using the formula recalled above,
since the shift will be the same as that computed by inertial frames with relative
velocity $\Delta v$ (momentarily comoving with the respective rocket at emission and
absorption, respectively; see Section 1.2.4). Thus if $\nu_1$ is the frequency of the
received photon (as measured by the lead rocket), then

\[ \frac{\nu_1}{\nu_0} = \sqrt{\frac{1 - \Delta v/c}{1 + \Delta v/c}} = \frac{1 - \Delta v/c}{\sqrt{1 - (\Delta v/c)^2}} . \]
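If we arrange $a \Delta t = \Delta v \ll c$, then $\Delta y = (c - \tfrac12 a \Delta t) \Delta t \approx c \Delta t$, and
$\nu_0/\nu_1 \approx 1 + \Delta v/c$, to first order in $\Delta v/c$. We relate this to $\Delta y$ as follows:
\[ \frac{\Delta\lambda}{\lambda_0} = \frac{\lambda_1 - \lambda_0}{\lambda_0} = \frac{\nu_0}{\nu_1} - 1 \approx \frac{\Delta v}{c} = \frac{a \Delta t}{c} \approx \frac{a \Delta y}{c^2} . \tag{2.2.1} \]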
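The first-order approximation behind (2.2.1) is easy to test against the exact Doppler factor. The sketch below is an added illustration; the function names are hypothetical, and `beta` stands for the dimensionless ratio Δv/c. It evaluates the exact ratio in both of the algebraic forms given above and compares the exact fractional wavelength shift with the linear term.

```python
import math


def doppler_ratio(beta):
    """Exact nu_1/nu_0 for a receiver receding at speed beta*c (0 <= beta < 1)."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


def doppler_ratio_alt(beta):
    """The same ratio written as (1 - beta)/sqrt(1 - beta^2)."""
    return (1.0 - beta) / math.sqrt(1.0 - beta * beta)


def fractional_wavelength_shift(beta):
    """Exact (lambda_1 - lambda_0)/lambda_0 = nu_0/nu_1 - 1."""
    return 1.0 / doppler_ratio(beta) - 1.0


if __name__ == "__main__":
    for beta in (1e-1, 1e-2, 1e-3):
        exact = fractional_wavelength_shift(beta)
        # The gap between the exact shift and beta shrinks like beta^2 / 2.
        print(beta, exact, exact - beta)
```

The printed differences decrease quadratically in `beta`, which is what "to first order in Δv/c" means in practice.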
With the equivalence principle in mind, we compare the above situation to
that in which there is a uniform gravitational field, with gravitational potential
$\Phi$, which we assume is time-independent, and only varies in one spatial variable
$y \in \mathbb{R}$. We imagine two observers stationary in the gravitational field, each of
whom would measure that an object, in the absence of other forces, would experience
a “downward” (i.e., in the negative $y$-direction) acceleration of magnitude
$a = \partial\Phi/\partial y$. The equivalence principle asserts this situation should be equivalent
to the observers in an accelerating “elevator,” or to the accelerating rockets as
in the preceding paragraph. Now suppose a photon of frequency $\nu_0$ travels a
“height” $\Delta y$ in the gravitational field, from an observer at $y$ (akin to the trailing
rocket) to an observer at height $y + \Delta y$. Then using (2.2.1), we should see a
relative Doppler shift of $\Delta\lambda/\lambda_0 \approx c^{-2} (\partial\Phi/\partial y) \Delta y \approx c^{-2} (\Phi(y + \Delta y) - \Phi(y))$.
This leads to an experimentally verified phenomenon about clock rates in a
gravitational field. Suppose a photon is emitted from a height $y$ and is received
at height $y + \Delta y$ in the gravitational field. If the photon has wavelength $\lambda_0$ at
emission, the time between the beginning and end of a wavelength (as measured
by an inertial observer at emission, say) is $\Delta t_0 = c^{-1} \lambda_0$. As we have just seen,
an inertial observer at reception measures the time between the beginning and
end of a wavelength as $\Delta t_1 = \Delta t_0 + c^{-1} \Delta\lambda > \Delta t_0$, since the wavelength of
the absorbed photon is longer: the frequency, and hence the energy, is lower,
accounting for the gravitational potential energy via the relativistic mass-energy
equivalence and the equality of inertial and gravitational masses. This means that
clocks run at different rates in different places in a gravitational field; in this case
the clock at $y + \Delta y$ runs faster than an identical clock at $y$. This was confirmed
experimentally by the Pound–Rebka–Snider experiment (see [161, p. 1055–1060],
for example). A similar analysis can be done involving acceleration via rotation,
cf. [94, p. 24–25].
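To get a feeling for the size of the effect, one can evaluate the first-order shift from (2.2.1) for a laboratory-scale height. The sketch below is only an order-of-magnitude illustration: the tower height of 22.5 m and the surface acceleration 9.81 m/s² are assumed values supplied here, not figures taken from the text, and the function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s (exact by definition of the metre)


def gravitational_redshift(accel, height, c=SPEED_OF_LIGHT):
    """First-order fractional wavelength shift a * dy / c^2 from (2.2.1)."""
    return accel * height / (c * c)


if __name__ == "__main__":
    # Assumed illustrative values: a tower of roughly 22.5 m near the earth's surface.
    shift = gravitational_redshift(accel=9.81, height=22.5)
    print("fractional shift:", shift)
```

With these assumed inputs the shift is of order 10^-15, which indicates why a very sharp spectral line is needed to detect it.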
We return to the question at hand: could there be inertial reference frames
(with respect to which the principle of special relativity is formulated) in the
presence of a gravitational field, where acceleration unaccounted for by other
forces could be interpreted to be gravitational in nature? We certainly cannot
reconcile the principle of special relativity with the existence of inertial frames
corresponding to two observers stationary at the heights $y$ and $y + \Delta y$ as in
the above paragraph. Two such observers are not in motion relative to each
other, so should they be inertial observers, then since the spacetime paths of the
beginning and tail ends of the photon wavelength should be related by a simple
time-translation (the gravitational field is time-independent, and only depends
on $y$), the $\Delta t$ measurements should be the same at the two different heights.
The fact that experiment shows otherwise indicates an incompatibility between
gravitation and the existence of such inertial observers.

2.2.3. Towards a geometric solution. Einstein made an argument [85; 84] that
the spacetime continuum in the presence of a gravitational field (related to
acceleration via the equivalence principle) should be “non-Euclidean” (i.e., non-
flat, so non-Minkowskian). As we will see, the argument might be construed
as more heuristic than precise. We present it briefly for historical reasons, and
hopefully the reader will get some utility from it without getting the wrong
impression.
Consider a frame of reference O which represents uniform rotation with
angular velocity ω > 0 with respect to an inertial frame of reference. We pick
the origin for coordinate charts adapted to the frames as the center of rotation,
synchronize the clocks for the two observers at the origin of coordinates, and
align the axes at t = 0, so that the x and y axes are in the plane of rotation.
One can conceive ways to make measurements, and thus build coordinates for
spacetime, adapted in some way to O. There is not a canonical way to do this,
though one might initially be inclined to think the following is such a way: relate
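the coordinates $(t, x, y, z)$ in $O$ to the inertial coordinates $(\mathring t, \mathring x, \mathring y, \mathring z)$ of an event
by the transformation (where $(x, y) = (r\cos\theta, r\sin\theta)$ for $r \ge 0$), $\mathring t = t$, $\mathring z = z$,
$\mathring x = x\cos\omega t - y\sin\omega t = r\cos(\theta + \omega t)$, $\mathring y = x\sin\omega t + y\cos\omega t = r\sin(\theta + \omega t)$.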
For example, one might use radar (by sending and receiving signals to and from
an event), using the clock at the origin to read off the time, which then agrees in
both frames because there is no time dilation. One could also build coordinates
for spacetime by making measurements of events using measuring devices (a grid
of rods and clocks) associated to the observers at rest relative to O. See [161;
38; 162] for further discussion of coordinates adapted to accelerated observers.
Let $C_r$ be a circle centered at the origin, of radius $r$ as measured in the inertial
frame, with respect to which the length is $2\pi r$. An observer on $C_r$ at rest in $O$ is
moving along the circle $C_r$ with angular velocity $\omega$ relative to the inertial frame; of
course we require $\omega r < c$. We proceed following Einstein: relative to the inertial
frame, the length of a unit measuring stick in the frame $O$ oriented tangentially
along $C_r$ is shorter than one unit, by the factor $\sqrt{1 - (\omega r/c)^2}$. Thus, the length
$L(C_r)$ measured using rods adapted to $O$ is $2\pi r/\sqrt{1 - (\omega r/c)^2}$, which is more
than $2\pi r$: if $\omega r/c$ is small, then in $O$, $L(C_r) \approx 2\pi r\,\bigl(1 + \tfrac12(\omega r/c)^2\bigr)$. Another
way to think of this is that the observers at rest in O located along Cr place
non-overlapping measuring sticks along the circle to mark its circumference,
and one just counts how many of these are needed to go around the perimeter to
determine the length of the circle as measured in O. On the other hand, distances
perpendicular to the direction of motion agree in both frames. Thus both frames
agree on the radius r . It appears that from the point of view of the frame O, one
might conclude that the spatial geometry is curved. Indeed, recall the classical
formula for the Gauss curvature (Exercise 2-55), which applied to the above
analysis would yield nonzero (negative) curvature:

\[
K(p) = \lim_{r \to 0} \frac{3}{\pi} \cdot \frac{2\pi r - L(C_r)}{r^3}.
\]
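As a numerical sanity check on the sign and size of this curvature, the following minimal sketch evaluates the difference quotient at small radii, taking the rod-measured circumference $2\pi r/\sqrt{1 - (\omega r/c)^2}$ at face value. The function names and the sample values $\omega = 0.5$, $c = 1$ are our own illustrative choices; the small-$\omega r/c$ expansion above suggests the values should approach $-3\omega^2/c^2$.

```python
import math

def rotating_circumference(r, omega, c=1.0):
    """Circumference of C_r as counted with rods at rest in the rotating frame O."""
    beta = omega * r / c
    if not 0 <= beta < 1:
        raise ValueError("need 0 <= omega * r < c")
    return 2 * math.pi * r / math.sqrt(1 - beta ** 2)

def curvature_estimate(r, omega, c=1.0):
    """Finite-r version of K(p) = lim (3/pi) * (2*pi*r - L(C_r)) / r**3."""
    deficit = 2 * math.pi * r - rotating_circumference(r, omega, c)
    return (3 / math.pi) * deficit / r ** 3

omega, c = 0.5, 1.0   # illustrative values, not taken from the text
for r in (0.1, 0.01, 0.001):
    # The estimates are negative and tend to -3 * omega**2 / c**2 as r shrinks.
    print(r, curvature_estimate(r, omega, c))
```

The estimate is negative for every admissible radius because the rod-measured circumference always exceeds $2\pi r$; this says nothing about the critique of the argument that follows.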
While Einstein cited this intriguing argument as motivation for the introduction
of non-Euclidean geometry into the theory of gravitation, one must critique it
in various ways. Bearing in mind the relativity of simultaneity, for instance,
has the argument above really succeeded in showing there is an observer who
will measure the circumference and radius of a circle to be out of step with
Euclidean geometry? Or does the analysis just yield some sort of non-Euclidean
geometry on the set of worldlines of rotating observers? While the spacetime
interval between two events is invariant, one needs to consider carefully to what
extent one can define and compare the spatial length and radius of a circle in
the two frames, keeping in mind that the observers will not necessarily agree on
simultaneity. Clock rates will vary for rotating clocks depending on the location
relative to the center; the clock rates depend on r , so while the rates are the same
for observers at rest in O on each Cr (each of which we note has a different local
rest frame), we might ask to what extent the observers can agree on a set of events
to be deemed the disk spanned by Cr . Imagine too if the rotating frame is slowly
brought to rest relative to the inertial frame: what happens as it slows and the
length contraction factor goes to 1? There are in principle too many measuring
rods positioned around the circumference, so something would have to give.
This discussion is related to the breakdown of standard notions of rigidity in
relativity, and the thought experiment of Einstein is related to similar arguments
in discussions of Ehrenfest’s paradox concerning the fate of a rotating cylinder
in relativity. For more on this topic, see [110], for example, and for Kaluza’s
argument relating Ehrenfest’s paradox to hyperbolic geometry, see [132].
From a geometric point of view, if spacetime is a manifold equipped with
a Lorentzian metric, then spacetime geometry is either Minkowskian or not.
If spacetime were Minkowskian, then the events which comprise a circle at a
fixed time in an inertial frame would yield a spacelike curve ξ embedded in a
Euclidean three-space (and hence a Euclidean plane), so that the length of the
curve would follow the well-known formula for the circumference of a circle.
The length of the curve ξ is a geometric invariant; if another frame of reference
were used to build spacetime coordinates (e.g., coordinates somehow adapted to
an accelerating frame), the spacelike curve ξ might not be comprised of events
simultaneous in this frame, but its length would be invariant. The coordinate
measurements might need to be converted using metric components to obtain
truly invariant geometric distances; this is akin to using curvilinear coordinates
for the Euclidean plane. If on the other hand there is a region where the spacetime
geometry is curved, there is no frame from which one could build coordinates
for which the metric is everywhere identical to that of Minkowski spacetime in
inertial coordinates, a situation analogous to the familiar fact that one cannot
make maps (charts) of the Earth’s surface which are isometries (up to a constant
scale factor) between the geometry on the surface and the Euclidean geometry
of the planar map.
Similar comments apply to the gravitational redshift scenario above. For
the observers stationary in a gravitational field with time-independent potential
varying in y as above, we maintain that the paths of successive photon crests
should be congruent in a coordinate chart associated to either observer. The
conundrum about clock rates discussed above would persist if this observer were
an inertial observer, say, using the Minkowski metric in inertial coordinates to
compute spacetime intervals. On the other hand, if the metric components are
not the Minkowskian components in inertial coordinates, but rather components
for Minkowski spacetime in a non-inertial frame, or components for a metric
for which the geometry of spacetime is curved, then it is generally expected
that coordinate measurements do not give geometric invariants, and that one
would need to use the metric to compute the relevant invariant, and thus give the
physical quantity of interest (in this case a proper time interval).

2.2.4. Free fall and geodesics. Beyond the issues we have seen above regarding
the obstruction to the existence of observers which are at once inertial and
stationary in a gravitational field, another fundamental issue arises from the
universality of gravity. An inertial frame of reference is one in which test
particles upon which no net force acts move with constant velocity. Consider a
region where the only forces are electromagnetic in nature: neutral particles and
charged particles could in principle be distinguished by their relative motions.
In an inertial frame, the neutral particles would move with constant velocity
according to the law of inertia, while charged particles would move according to
the Lorentz force law.
The situation is decidedly different in the context of gravitation. In a region
where the only forces are due to a nontrivial gravitational field, there would be
no free test particle to illustrate the law of inertia. Imagine a Newtonian inertial
frame in a region far away from massive objects, in which free particles (including
photons) are moving; if we now “dial up” a gravitational field (by moving a
massive object nearby), then what happens? All free particles, including light,
would be accelerated in the same manner away from the straight line paths in this
frame. In fact, if we conceive of a frame in terms of an observer, the universality
of gravity implies that the observer would also be affected by gravity.
In seeking a frame where the law of inertia holds, we have to reconcile the
notion of gravitational force with the equivalence principle. Namely, observations
of physical phenomena in a frame in which there is a uniform gravitational field
should be equivalent to those from a frame uniformly accelerating with respect
to an inertial frame. Recall that this was tied into the equivalence of inertial and
gravitational masses: if these differed, one could in principle perform experiments
on test particles to distinguish between the dynamic effects of a gravitational
field and the basic kinematic effects of frame acceleration. In any case, local
experiments (where tidal effects due to a non-uniform gravitational field are
too small to detect) cannot distinguish the two situations, which suggests, then,
that test particles should behave like free particles, with the local effects of the
gravitational force interpreted as a fictitious force arising from an accelerated
frame of reference. From this viewpoint, then, the worldlines of particles which
undergo only gravitational forces, i.e., freely falling particles, should give
rise to a class of observers which to some extent might assume the role inertial
observers played in Minkowski spacetime of special relativity.
We test this hypothesis via the gravitational redshift analysis above: what if we
do the analysis instead from the perspective of two observers falling freely in the
gravitational field, as opposed to those remaining stationary with respect to the
gravitational potential? If the freely falling observers build coordinates adapted
to their motion (normal coordinates, or Fermi coordinates along the worldline,
as we will later employ in deriving the Einstein equation), then the physics
in this frame corresponds, to good approximation and locally in spacetime, to
what would be measured in an inertial frame. For example, consider the above
photon with emission frequency $\nu_0$, having traveled a height $\Delta y$. As we saw
above, a stationary observer in the field measures the photon frequency $\nu_1$ with
$\nu_0/\nu_1 \approx 1 + c^{-2}a\,\Delta y$. On the other hand, a freely falling observer which was
stationary in the field when the photon was emitted will have attained a velocity
$\Delta v \approx -c^{-1}a\,\Delta y$ at absorption (sign indicates relative direction), and so will
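measure the photon to have a Doppler shift given by
\[
\frac{\nu_2}{\nu_1} = \sqrt{\frac{1 - \Delta v/c}{1 + \Delta v/c}} \approx 1 - \frac{\Delta v}{c} \approx 1 + c^{-2}a\,\Delta y.
\]
We see that $\nu_0$ and $\nu_2$ agree to leading order in $|\Delta v|/c$. Thus while spacetime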
is not Minkowskian, there are certain observers that correspond, approximately
and locally in spacetime, to inertial frames.
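The leading-order agreement can be checked numerically. The sketch below works with the single dimensionless parameter $\varepsilon = a\,\Delta y/c^2$ (our notation), so that the stationary observers' ratio is $1 + \varepsilon$ and the freely falling observer has $\Delta v/c = -\varepsilon$. The function names and the sample values of $\varepsilon$ are illustrative assumptions, chosen much larger than any laboratory value so that the second-order gap is visible in floating point.

```python
import math

def stationary_ratio(eps):
    """nu_0 / nu_1 for observers at rest in the field, with eps = a * dy / c**2."""
    return 1 + eps

def falling_ratio(eps):
    """nu_2 / nu_1 from the Doppler formula, using dv / c = -eps."""
    dv_over_c = -eps
    return math.sqrt((1 - dv_over_c) / (1 + dv_over_c))

for eps in (1e-2, 1e-3, 1e-4):
    gap = falling_ratio(eps) - stationary_ratio(eps)
    # The gap is second order: gap / eps**2 stays close to 1/2.
    print(eps, gap, gap / eps ** 2)
```

The two ratios differ only at order $\varepsilon^2$, which is the sense in which $\nu_0$ and $\nu_2$ agree to leading order.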
In light of all this, one might dispense with the notion of gravitational force
and acceleration due to gravity, and instead assert that a gravitational field
manifests itself through a family of timelike paths that represent freely falling
test particles (null paths for photons), whose trajectories are determined by and
encode gravitational effects, but upon which no other forces act. We stress that
while this assertion identifies a class of observers, this class is not determined a
priori, so such observers will not be preferred for the formulation of physical laws,
in accordance with Einstein’s principle of general relativity (see Section 2.3).
Geodesics are paths that have zero covariant acceleration (with respect to an
affine connection ∇), and as such are analogues of straight lines for a curved
space. Einstein asserted that objects that are experiencing no other force except
possibly that of gravity should move along timelike geodesics, while light should
propagate along null geodesics. In any coordinate chart for spacetime, the
covariant acceleration can be computed, and if it vanishes in one chart, it vanishes
in all charts. In this way, the paths of light rays in vacuum obey a rule that takes
the same form in every frame of reference, consistent with the principle of general
relativity, and providing a simple example of the principle of general covariance:
the laws of physics should in principle be diffeomorphism-invariant, and so
should be able to be formulated in a coordinate-independent way.
Of course, the coordinate acceleration along a geodesic may not vanish, as
the equation for a geodesic $\gamma(t)$ with coordinates $x^\mu(t)$ can be expressed as
\[
\frac{d^2x^\mu}{dt^2} + \Gamma^\mu_{\nu\sigma}\Big|_{\gamma(t)}\,\frac{dx^\nu}{dt}\frac{dx^\sigma}{dt} = 0,
\]
where $\nabla_{\partial/\partial x^\nu}\,\partial/\partial x^\sigma = \Gamma^\mu_{\nu\sigma}\,\partial/\partial x^\mu$; if we use the Levi-Civita connection for a
metric, we have the formula $\Gamma^\mu_{\nu\sigma} = \tfrac12 g^{\mu\rho}(g_{\rho\sigma,\nu} + g_{\nu\rho,\sigma} - g_{\nu\sigma,\rho})$ for the Christoffel
symbols. As broached earlier, then, gravitational force may be locally akin to a
fictitious force in an accelerating frame in Euclidean space. By choosing local
coordinates adapted to a freely falling observer, one can arrange for the Christoffel
symbols $\Gamma^\mu_{\nu\sigma}$ to vanish at an event (or even along a worldline). However, unlike
in Euclidean space or Minkowski spacetime, in the presence of curvature, there
is no coordinate system in which the Christoffel symbols can be made to vanish
identically, and as we will soon see, spacetime curvature produces tidal effects
ascribed to a gravitational field.
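To illustrate that nonvanishing Christoffel symbols need not signal curvature, the following sketch evaluates the Levi-Civita formula above for the flat Euclidean plane in polar coordinates, using central finite differences for the metric derivatives. The helper names and the step size are our own assumptions, and the code is restricted to two dimensions for brevity. One expects $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = 1/r$, so straight lines have nonzero coordinate acceleration in these coordinates even though the geometry is flat.

```python
def polar_metric(x):
    """Euclidean plane metric in polar coordinates x = (r, theta): diag(1, r**2)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def christoffel(metric, x, h=1e-5):
    """Gamma[mu][nu][sigma] = 1/2 g^{mu rho} (g_{rho sigma,nu} + g_{nu rho,sigma} - g_{nu sigma,rho}).

    Partial derivatives are approximated by central differences with step h.
    """
    n = len(x)
    dg = []  # dg[k][i][j] approximates the derivative of g_{ij} along x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[mu][rho] * (dg[nu][rho][sigma]
                                         + dg[sigma][nu][rho]
                                         - dg[rho][nu][sigma])
                        for rho in range(n))
              for sigma in range(n)]
             for nu in range(n)]
            for mu in range(n)]

gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1])   # Gamma^r_{theta theta}, close to -r = -2
print(gamma[1][0][1])   # Gamma^theta_{r theta}, close to 1/r = 0.5
```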
The universality of gravity has led to its incorporation into the structure of
spacetime at a fundamental level, determining inertial properties (motion of test
particles which experience no non-gravitational forces) through the geometry
of spacetime. Furthermore, as experiments continue to confirm that signals in
vacuum cannot travel at speeds faster than the speed of light, gravity plays a
distinguished role in determining the causal structure of spacetime through its
effect on paths of light rays. For a concrete geometric consequence, consider a
spacetime modeled on a Lorentzian manifold, and assert that light rays move
along null paths. The collection of lightcones determines the conformal class of
the spacetime metric g (and recall for comparison, the conformal compactification
of Minkowski spacetime as presented in the first chapter indeed preserves the
lightcone structure). Indeed, if $X$ is timelike and $Y \neq 0$ is spacelike at a point $p$,
then $g(X + aY, X + aY) = a^2 g(Y, Y) + 2a\,g(X, Y) + g(X, X)$, a quadratic
polynomial in $a$ with a positive leading coefficient and a negative constant term.
Hence there are exactly two real roots of this quadratic, which in principle we can
glean from the lightcone at p. The product of these roots gives g(X, X )/g(Y, Y ).
If V and W are any tangent vectors at p, then

\[
g(V, W) = \tfrac12\bigl(g(V + W, V + W) - g(V, V) - g(W, W)\bigr).
\]
Knowing the lightcone at p, then, allows us to find the ratio g(V, W )/g(X, X ),
since any of the terms on the right of the preceding equation, if nonzero, can be
gleaned in ratio with either g(X, X ) or g(Y, Y ).
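The root argument can be tried out in Minkowski spacetime. In the sketch below, the vectors $X$ and $Y$, the conformal factor 7, and the function names are arbitrary illustrative choices of ours. It computes the two values of $a$ for which $X + aY$ is null, checks that their product equals $g(X, X)/g(Y, Y)$, and confirms that a conformally rescaled metric has the same roots, so the lightcone alone cannot fix more than the conformal class.

```python
import math

def minkowski(v, w):
    """Minkowski inner product with signature (-, +, +, +) and c = 1."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def null_roots(g, X, Y):
    """Real roots a of g(X + aY, X + aY) = 0 for timelike X and spacelike Y."""
    A, B, C = g(Y, Y), 2 * g(X, Y), g(X, X)
    disc = math.sqrt(B * B - 4 * A * C)   # real, since A > 0 > C
    return (-B - disc) / (2 * A), (-B + disc) / (2 * A)

def rescaled(v, w):
    """A conformally rescaled metric (constant factor chosen arbitrarily)."""
    return 7.0 * minkowski(v, w)

X = (1.0, 0.2, 0.0, 0.0)   # timelike: g(X, X) = -0.96
Y = (0.3, 1.0, 0.5, 0.0)   # spacelike: g(Y, Y) = 1.16

a1, a2 = null_roots(minkowski, X, Y)
b1, b2 = null_roots(rescaled, X, Y)
ratio = minkowski(X, X) / minkowski(Y, Y)

print(a1 * a2, ratio)      # product of roots equals g(X, X) / g(Y, Y)
print((a1, a2), (b1, b2))  # same roots for the rescaled metric
```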
The question remains how to connect the geometry of spacetime to the distri-
bution of matter and energy, whose motion should be in part determined by the
curvature of spacetime and whose gravitational effects should in turn influence
the geometry of spacetime. An answer lies in the Einstein equation, to which we
now turn.

2.3. The Einstein equation

We begin with a quote from Einstein [83, p. 113]: “The laws of physics must
be of such a nature that they apply to systems of reference in any kind of
motion.” This principle of general relativity puts all frames of reference on
an equal footing, in contrast to the privileged inertial frames of reference of
special relativity. Together with the equivalence principle, that an accelerating
frame ought to be locally (and approximately) equivalent from the point of view
of physics to a frame in which there is a uniform gravitational field, we see
that a theory obeying general relativity ought to naturally be in part a theory of
gravitation, whereby the global inertial frames of special relativity are replaced
by local inertial frames adapted to freely falling observers. Such a theory incor-
porating gravity is consistent with and gives impetus for the assertion that no
coordinate system should be preferred for the formulation of physical laws. This
is echoed in a more geometric way by the principle of general covariance, which
asserts that the laws of physics should be invariant under diffeomorphisms, so
that it should be possible to formulate their equations in a coordinate-independent
manner. In particular, the laws of physics should have the same form for all
frames of reference (compare the formulation in [218, p. 57]). While equations of
tensorial physical laws in Minkowski spacetime can be cast in generally covariant
form, we have argued that other spacetime metrics should arise when there is
a nontrivial gravitational field. We emphasize that this can be the case even in
vacuum regions of spacetime, possibly corresponding to the gravitational field
outside a compact massive object, say.
One seeks to relate the spacetime geometry to the distribution of matter fields
and energy within spacetime: the result is the Einstein equation, which will be
discussed at length below. That the theory should incorporate some features of
special relativity suggests we need to strengthen the equivalence principle, and
assert that spacetime geometry should be given by a Lorentzian metric (recall that
our signature for Lorentzian metrics is (−, +, +, . . . , +)); in normal coordinates
at a point in spacetime, the laws of physics will take the same form as they would
in special relativity, and thus locally, the laws are approximately of the same form
as in special relativity. Said another way, local experiments in a small region
of spacetime cannot detect a gravitational field which is roughly uniform, and
could be ascribed to a uniformly accelerating frame of reference, with the results
being in accordance with special relativity. In larger regions of spacetime, non-
uniformities of the gravitational field can be detected by measuring tidal forces.
As a final guide to the Einstein equation, we have a correspondence principle:
the equations governing gravity should yield Newton’s law of gravitation when
the gravitational field is sufficiently weak.
A few words are in order before we move on. We will not go into a discussion
of Mach’s principle and its influence on Einstein’s development of a theory of
gravitation and cosmology (cf. [85; 161; 171]), other than to say that some of the
spirit of Mach’s ideas may be present in the way spacetime geometry, from which
we discern the timelike geodesics which represent inertial motion, interacts with
matter and energy, through the forthcoming Einstein equation; in other words, the
inertia of a test particle is determined in relation to the rest of matter and energy
of spacetime. We also will not spend any more time delineating differences and
relationships between the various principles at the foundation of the theory of
gravitation. It may indeed appear that we have been reciting the same themes over
and over. There is a whole literature on the foundational underpinnings of the
theory, stemming from writings of Einstein and his contemporaries, continuing
right up to the present, from physicists, mathematicians, as well as philosophers
and historians of science; see, e.g., [169; 170].
Einstein searched for a way to relate the geometry of spacetime to the
energy-momentum distribution of matter and fields within it. The gravitational field
itself is encoded in the metric of spacetime. Einstein sought to equate the stress-
energy tensor T describing the energy-momentum densities of the fields and
matter to some tensor created from a Lorentzian metric g. In this quest he faced
some restrictions. The tensor T is symmetric and divergence-free in special
relativity, and therefore these properties should hold at the center of a normal
coordinate chart (locally inertial frame), in accordance with the equivalence
principle; the divergence of T must then vanish in any coordinate system. We note
that it is a mathematical statement whether the divergence of a tensor vanishes
(which can be computed in any coordinate system), whereas the physics dictates
whether the stress-energy properties can be encoded by a tensor, which then
behaves in accordance with general covariance. Einstein originally tried to equate
T with the Ricci tensor, up to a constant scalar multiple, but that was doomed
to fail in general, because of the contracted Bianchi identity (Corollary 2-2).
Nowadays this is a well-known fact covered in introductory graduate geometry
courses, but as it is so important, we give the proof here, after recalling a few
facts and conventions about the curvature.
Throughout the remainder of the chapter, indices will run over all components,
except where we specifically note otherwise. Recall that we often employ angle
brackets ⟨ · , · ⟩ for the metric. For convenience, we recall our various curvature
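conventions here. We take $R(X, Y, Z) = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]} Z$ to be
the curvature tensor, with index convention
\[
R^{\ell}_{ijk}\,\frac{\partial}{\partial x^\ell} = R\Bigl(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\Bigr),
\]
in which $R$ is a $(1, 3)$-tensor, while
\[
R_{ijk\ell} = g_{\ell m} R^{m}_{ijk} = \Bigl\langle R\Bigl(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle
\]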
gives the components of the corresponding $(0, 4)$-tensor. $R(X, Y, Z)$ is alternating
in $(X, Y)$ and enjoys symmetry-by-pairs: $\langle R(V, W, Y), Z\rangle = \langle R(Y, Z, V), W\rangle$.
Thus we have the component identities $R_{k\ell ij} = R_{ijk\ell} = -R_{jik\ell} = -R_{ij\ell k}$. The
Ricci tensor $\mathrm{Ric}(g)$ is a symmetric $(0, 2)$-tensor with components
$R_{jk} = R^{i}_{ijk} = g^{i\ell} R_{ijk\ell} = R_{kj}$, and the metric trace of the Ricci tensor is the scalar curvature
$R(g) = g^{ij} R_{ij}$.

Proposition 2-1 (Bianchi identities). In a semi-Riemannian manifold $(M, g)$,
with Levi-Civita connection $\nabla$, the curvature tensor satisfies an algebraic Bianchi
identity,
\[
R(X, Y, Z) + R(Y, Z, X) + R(Z, X, Y) = 0,
\]
for all vectors $X$, $Y$, and $Z$. The curvature tensor also satisfies a differential
Bianchi identity: for all vectors $X$, $Y$, $Z$, $V$, and $W$,
\[
\langle(\nabla_X R)(V, W, Y), Z\rangle + \langle(\nabla_Y R)(V, W, Z), X\rangle + \langle(\nabla_Z R)(V, W, X), Y\rangle = 0.
\]
By symmetry-by-pairs, this is equivalent to
\[
\langle(\nabla_X R)(V, W, Y), Z\rangle + \langle(\nabla_V R)(W, X, Y), Z\rangle + \langle(\nabla_W R)(X, V, Y), Z\rangle = 0.
\]

Proof. Applying $R(X, Y, Z) = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]} Z$, we can rearrange
terms to obtain
\begin{align*}
R(X, Y, Z) &+ R(Y, Z, X) + R(Z, X, Y) \\
&= \nabla_X(\nabla_Y Z - \nabla_Z Y) + \nabla_Y(\nabla_Z X - \nabla_X Z) + \nabla_Z(\nabla_X Y - \nabla_Y X) \\
&\qquad - \nabla_{[X,Y]} Z - \nabla_{[Y,Z]} X - \nabla_{[Z,X]} Y \\
&= \nabla_X [Y, Z] + \nabla_Y [Z, X] + \nabla_Z [X, Y] - \nabla_{[X,Y]} Z - \nabla_{[Y,Z]} X - \nabla_{[Z,X]} Y \\
&= [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0.
\end{align*}
On the last line we used the Jacobi identity, and in the line above that we used
the torsion-free property of the Levi-Civita connection.
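By symmetry-by-pairs, it remains only to prove the second differential Bianchi
identity above, for which it suffices to verify on a coordinate frame $\frac{\partial}{\partial x^i}$. In
fact, we use normal coordinates at a point $p \in M$, so that $g_{ij}(p) = \pm\delta_{ij}$, and
$\nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j}\big|_p = \Gamma^k_{ij}\big|_p\,\frac{\partial}{\partial x^k}\big|_p = 0$. Since $\bigl[\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\bigr] = 0$, we have at the point $p$, where
the Christoffel symbols vanish (and hence a component of a covariant derivative
at the point agrees with the corresponding partial derivative):
\begin{align*}
R_{ijk\ell;m}(p) &= \frac{\partial}{\partial x^m}\Big|_p \Bigl\langle R\Bigl(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle \\
&= \Bigl\langle \nabla_{\frac{\partial}{\partial x^m}}\Bigl(\nabla_{\frac{\partial}{\partial x^i}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle\Big|_p.
\end{align*}
By combining terms in pairs and using that $\nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j}\big|_p = 0$ for all $i, j$, we find
\begin{align*}
R_{ijk\ell;m}(p) &+ R_{jmk\ell;i}(p) + R_{mik\ell;j}(p) \\
&= \Bigl\langle \nabla_{\frac{\partial}{\partial x^m}}\Bigl(\nabla_{\frac{\partial}{\partial x^i}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle\Big|_p \\
&\quad + \Bigl\langle \nabla_{\frac{\partial}{\partial x^i}}\Bigl(\nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^m}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^m}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle\Big|_p \\
&\quad + \Bigl\langle \nabla_{\frac{\partial}{\partial x^j}}\Bigl(\nabla_{\frac{\partial}{\partial x^m}}\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^i}}\nabla_{\frac{\partial}{\partial x^m}}\frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle\Big|_p \\
&= \Bigl\langle R\Bigl(\frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^m}, \nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^k}\Bigr) + R\Bigl(\frac{\partial}{\partial x^m}, \frac{\partial}{\partial x^i}, \nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k}\Bigr) + R\Bigl(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \nabla_{\frac{\partial}{\partial x^m}}\frac{\partial}{\partial x^k}\Bigr), \frac{\partial}{\partial x^\ell} \Bigr\rangle\Big|_p \\
&= 0. \qquad \square
\end{align*}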

Corollary 2-2. If $(M, g)$ is a semi-Riemannian manifold with scalar curvature
$R(g)$, then
\[
2\,\mathrm{div}_g\,\mathrm{Ric}(g) = dR(g).
\]

Proof. We use the differential Bianchi identity, along with symmetries of the
curvature tensor, and the fact that $\nabla g = 0$, so that $g^{ij}{}_{;k} = 0$ for all $i, j, k$:
\begin{align*}
dR(g)_i &= (g^{j\ell} g^{km} R_{kj\ell m})_{;i} \\
&= g^{j\ell} g^{km} (-R_{kjmi;\ell} - R_{kji\ell;m}) \\
&= g^{j\ell} g^{km} (R_{jkmi;\ell} + R_{jki\ell;m}) \\
&= g^{j\ell} R_{ji;\ell} + g^{km} R_{ki;m} \\
&= 2(\mathrm{div}_g\,\mathrm{Ric}(g))_i. \qquad \square
\end{align*}

With this corollary in mind, we introduce the Einstein tensor
\[
G_\Lambda = \mathrm{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g, \tag{2.3.1}
\]
where $\Lambda$ is a constant, called the cosmological constant. Note that sometimes the
Einstein tensor refers only to the case $\Lambda = 0$ above, i.e., $G = \mathrm{Ric}(g) - \tfrac12 R(g)\,g$.
The Einstein tensor is divergence-free, as is any constant scalar multiple, and
thus provides a candidate for the stress-energy tensor of spacetime. In fact, it
is known that up to scalar multiple, $G_\Lambda$ is the only divergence-free symmetric
tensor whose coordinate expression is a function of the components $g_{\mu\nu}$ of the
metric tensor, along with their first and second partial derivatives. This result
was known to Cartan and Weyl in the special case that the tensor is quasilinear,
and the more general result was proved by Lovelock [148, p. 322].
From this result, if the Einstein equation should be as simple as possible, and
thus be second-order in the metric components, then it must take the form
\[
G_\Lambda = \kappa T \tag{2.3.2}
\]
for some constant κ that will be determined by the Newtonian limit, as we now
show.
2.3.1. The Newtonian limit. Consider a spacetime metric $g$ that is close to the
Minkowski metric $\eta$, in the sense that there are coordinates in which
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $h_{\mu\nu}$ and its derivatives can be taken to be “small”, and $\eta_{\mu\nu}$
takes the standard inertial form. We assume that $g_{\mu\nu,0} = 0$ (or at least $g_{\mu\nu,0} \sim 0$;
see below), so that with $x^0 = ct$, the field is (approximately) time-independent
in these coordinates. We let $i$ and $j$ run over spatial indices ($i, j \neq 0$). Let the
trajectory of a slowly moving particle be modeled by a geodesic with coordinates
$x^\mu(\tau)$, parametrized by proper time $\tau$, so that $|dx^i/d\tau| \ll c\,dt/d\tau \approx c$. We
will expand to first order in $h$ (and derivatives of $h$) and $c^{-1}dx^i/d\tau$, and denote
expressions that are equal up to terms quadratic in these quantities (with bounded
coefficients) using “$\sim$”. Thus $dt/d\tau \sim 1$, so that $c^{-1}dx^i/dt \sim c^{-1}dx^i/d\tau$. Since
$g^{\mu\nu} = \eta^{\mu\nu} + O(h)$, we have
\[
\Gamma^\mu_{00} = \tfrac12 g^{\mu\nu}(g_{\nu 0,0} + g_{0\nu,0} - g_{00,\nu}) \sim -\tfrac12\eta^{\mu\nu}h_{00,\nu}. \tag{2.3.3}
\]
Since $\Gamma^\mu_{\rho\sigma} = \tfrac12 g^{\mu\nu}(h_{\nu\sigma,\rho} + h_{\rho\nu,\sigma} - h_{\rho\sigma,\nu})$, the geodesic equation becomes
\begin{align*}
0 &= c^{-2}\Bigl(\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\rho\sigma}\frac{dx^\rho}{d\tau}\frac{dx^\sigma}{d\tau}\Bigr) \\
&\sim c^{-2}\Bigl(\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\Bigl(\frac{dx^0}{d\tau}\Bigr)^2\Bigr) \sim c^{-2}\Bigl(\frac{d^2x^\mu}{d\tau^2} + c^2\,\Gamma^\mu_{00}\Bigr).
\end{align*}
The time component gives, using (2.3.3),
\[
c^{-1}\frac{d^2t}{d\tau^2} = c^{-2}\frac{d^2x^0}{d\tau^2} \sim -\Gamma^0_{00} \sim 0.
\]
For the spatial components we have
\[
c^{-2}\frac{d^2x^i}{d\tau^2} = c^{-2}\frac{d}{d\tau}\Bigl(\frac{dx^i}{dt}\frac{dt}{d\tau}\Bigr) = c^{-2}\Bigl(\frac{d^2x^i}{dt^2}\Bigl(\frac{dt}{d\tau}\Bigr)^2 + \frac{dx^i}{dt}\frac{d^2t}{d\tau^2}\Bigr) \sim c^{-2}\frac{d^2x^i}{dt^2},
\]
so that
\[
c^{-2}\frac{d^2x^i}{dt^2} \sim c^{-2}\frac{d^2x^i}{d\tau^2} \sim -\Gamma^i_{00} = \tfrac12 h_{00,i}.
\]
If we let $\Phi = -\tfrac12 c^2 h_{00}$, so that $g_{00} = -(1 + 2\Phi c^{-2})$, we recover the Newtonian
relation between acceleration and the gradient of the gravitational potential
(2.1.1). We remark that this analysis can be interpreted in the case where g
is the Minkowski metric, and gµν = ηµν + h µν gives the components of the
Minkowski metric in a (weakly) accelerating coordinate system, consistent with
the equivalence principle.
We now determine $\kappa$. To do this, we consider the stress-energy for a dust model.
For a perfect fluid, the pressure becomes important due to high random motion
of the particles, and we are assuming our particles are slowly moving. So we are
just considering the dust particles at rest in a given frame, without any pressure
forces between them, each with four-velocity $U$. Hence $T^{\mu\nu} = c^{-2}\rho\,U^\mu U^\nu$. We
again consider $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $g_{\mu\nu,0} = 0$ (or at least $g_{\mu\nu,0} \sim 0$). We
expand to first order in $h$ (and its derivatives), $U^i$ and $\rho$. Since $g_{\mu\nu}U^\mu U^\nu = -c^2$,
we have $\mathrm{tr}_g T = g_{\mu\nu}T^{\mu\nu} = -\rho$ and $g_{00}U^0U^0 \sim -c^2$, as well as
\[
T_{00} = g_{\mu 0}\,g_{\nu 0}\,T^{\mu\nu} \sim g_{00}\,g_{00}\,T^{00} = (-1 + h_{00})^2 c^{-2}\rho\,U^0U^0 \sim \rho.
\]

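Starting from the Einstein equation (2.3.2) with $\Lambda = 0$, $\mathrm{Ric}(g) - \tfrac12 R(g)\,g = \kappa T$,
and taking a trace yields $R(g) = \kappa\rho$; evaluating the $(0, 0)$-component of the
Einstein equation then gives $R_{00} + \tfrac12\kappa\rho \sim \kappa\rho$, or $R_{00} \sim \tfrac12\kappa\rho$. We can also compute
$R_{00}$ in terms of Christoffel symbols: from equation (2-8a) on page 72 we have
\begin{align*}
R^{i}_{j00} &= \Gamma^i_{00,j} - \Gamma^i_{j0,0} + \Gamma^i_{j\mu}\Gamma^\mu_{00} - \Gamma^i_{0\mu}\Gamma^\mu_{j0} \\
&\sim \Gamma^i_{00,j} = \bigl(\tfrac12 g^{i\mu}(g_{\mu 0,0} + g_{0\mu,0} - g_{00,\mu})\bigr)_{,j} \\
&\sim -\tfrac12\eta^{i\mu}g_{00,\mu j} = -\tfrac12\delta^{i\mu}g_{00,\mu j}
\end{align*}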

and so
\[
\tfrac12\kappa\rho = R_{00} = R^{i}_{i00} \sim -\tfrac12\Delta(h_{00}) = c^{-2}\Delta\Phi.
\]
To compare with the Newtonian limit, we convert the energy density to mass
density, $\sigma = c^{-2}\rho$, to obtain $\Delta\Phi = \tfrac12\kappa c^4\sigma$. Thus from (2.1.3) we get (in
spacetime dimension four) $\tfrac12\kappa c^4 = 4\pi G$, or
\[
\kappa = \frac{8\pi G}{c^4}. \tag{2.3.4}
\]
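To get a feel for the magnitudes involved, the following sketch evaluates $\kappa = 8\pi G/c^4$ in SI units and the perturbation $h_{00} = -2\Phi/c^2$ for a Newtonian potential $\Phi = -GM/R$ at the surface of an Earth-like body. The rounded values of $G$, $M$ and $R$ and the helper name are illustrative assumptions of ours, not data from the text.

```python
import math

G = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2 (rounded)
c = 299_792_458.0    # speed of light in m/s

kappa = 8 * math.pi * G / c ** 4   # equation (2.3.4) in SI units

def h00_from_potential(phi):
    """Perturbation h_00 = -2 * Phi / c**2, from Phi = -(1/2) * c**2 * h_00."""
    return -2 * phi / c ** 2

# Illustrative weak field: surface of an Earth-like body (rounded, assumed values).
M, R = 5.97e24, 6.37e6           # mass in kg, radius in m
phi = -G * M / R                 # Newtonian potential at the surface
h00 = h00_from_potential(phi)

print(kappa)   # of order 1e-43 in SI units
print(h00)     # of order 1e-9, comfortably "small"
```

The tiny value of $h_{00}$ is consistent with treating $h$ as a first-order perturbation in such a field.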
2.3.2. Energy conditions. Without the imposition of additional conditions on T ,
the Einstein equation does not impose any restrictions on a metric g, since
the Einstein tensor is always symmetric and divergence-free. We note here
some conditions often imposed on T based on physically reasonable energy
considerations.
We rewrite the Einstein equation (2.3.2) as follows. From (2.3.1) and (2.3.4)
we have $\mathrm{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g = \kappa T$; take the trace to obtain (in spacetime
dimension four) $-R(g) + 4\Lambda = \kappa\,\mathrm{tr}_g T$, so
\[
\mathrm{Ric}(g) = \kappa\bigl(T - \tfrac12(\mathrm{tr}_g T)\,g\bigr) + \Lambda g. \tag{2.3.5}
\]
In the vacuum case ($T = 0$) the Einstein equation becomes $\mathrm{Ric}(g) = \Lambda g$; a metric
satisfying an equation of this form is an Einstein metric. The vacuum Einstein
equation commonly refers to $\mathrm{Ric}(g) = 0$, which holds when $T = 0$ and $\Lambda = 0$.
We now introduce several conditions coming from physical notions that are
sometimes imposed on T , some of which will appear in later chapters. The
weak energy condition is that $T(\xi, \xi) \ge 0$ for all timelike $\xi$. If $c = 1$, say, then
unit timelike vectors $U$ correspond to (instantaneous) physical observers, and
$T(U, U)$ is the energy density as measured by such an observer. The strong
energy condition is that for all unit timelike $U$, $T(U, U) \ge -\tfrac12\mathrm{tr}_g T$. From (2.3.5),
we see this is equivalent when $\Lambda = 0$ to the timelike convergence condition
$\mathrm{Ric}(\xi, \xi) \ge 0$ for all timelike $\xi$; replacing timelike $\xi$ by null $\xi$, we have the null
energy condition. In a time-oriented Lorentz manifold (i.e., if the manifold admits
a smooth timelike vector field that can be used to give a smooth assignment of a
future timecone in the tangent space at each point), we define the dominant energy
condition that for all future-directed timelike $\xi$, the vector given by $-T^{a}{}_{b}\,\xi^{b}$ is
future-directed causal, or in other words, for all future-directed timelike (causal)
$\xi$ and $\chi$, we have $T(\xi, \chi) \ge 0$. The dominant energy condition clearly implies
the weak energy condition.
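As a small illustration of the weak and strong conditions, consider (as an assumption not made in the text above) a stress-energy tensor with components $\mathrm{diag}(\rho, p, p, p)$ in an orthonormal frame with $c = 1$, the standard form for a perfect fluid at rest. A unit timelike $U$ with rapidity $\chi$ relative to that frame gives $T(U, U) = \rho\cosh^2\chi + p\sinh^2\chi$, and $\mathrm{tr}_g T = -\rho + 3p$. The sketch tests the inequalities on a finite sample of rapidities only, so a result of True is evidence rather than proof; all names are hypothetical.

```python
import math

def energy_density(rho, p, chi):
    """T(U, U) for T = diag(rho, p, p, p) in an orthonormal frame (c = 1),

    where U is a unit timelike vector with rapidity chi relative to that frame.
    """
    return rho * math.cosh(chi) ** 2 + p * math.sinh(chi) ** 2

def trace(rho, p):
    """tr_g T = -rho + 3p for signature (-, +, +, +)."""
    return -rho + 3 * p

RAPIDITIES = [0.25 * k for k in range(21)]   # sample chi in [0, 5]

def weak_on_samples(rho, p):
    """Check T(U, U) >= 0 on the sampled observers."""
    return all(energy_density(rho, p, chi) >= 0 for chi in RAPIDITIES)

def strong_on_samples(rho, p):
    """Check T(U, U) >= -(1/2) tr_g T on the sampled observers."""
    bound = -0.5 * trace(rho, p)
    return all(energy_density(rho, p, chi) >= bound for chi in RAPIDITIES)

print(weak_on_samples(1.0, 0.0), strong_on_samples(1.0, 0.0))     # dust
print(weak_on_samples(1.0, -1.0), strong_on_samples(1.0, -1.0))   # p = -rho
```

Dust with positive density passes both sampled checks, while the choice $p = -\rho$ passes the weak check but already fails the strong one for the observer at rest ($\chi = 0$).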

2.3.3. The Einstein equation in Fermi coordinates along a timelike geodesic.


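Consider the motion of particles under gravitational force with potential $\Phi$ in the
Newtonian framework. We consider a family of paths $\xi(t, s)$ with coordinates
$x^k(t, s)$, where $s$ parametrizes the family of paths by, say, their initial position $s$
along an axis. Newton’s law becomes
\[
\frac{\partial^2 x^k}{\partial t^2} = -\frac{\partial\Phi}{\partial x^k}(\xi(t, s)).
\]
We now consider the variation vector
\[
V = \frac{\partial\xi}{\partial s} = \frac{\partial x^k}{\partial s}\frac{\partial}{\partial x^k}
\]
in the direction across nearby paths. It satisfies the equation ($1 \le i, j, k \le 3$)
\[
\Bigl(\frac{D^2 V}{dt^2}\Bigr)^{k} = \frac{\partial^2}{\partial t^2}\Bigl(\frac{\partial x^k}{\partial s}\Bigr) = -\Bigl(\frac{\partial^2\Phi}{\partial x^j\,\partial x^k}\Bigr)\Bigl(\frac{\partial x^j}{\partial s}\Bigr) = -\Bigl(\frac{\partial^2\Phi}{\partial x^j\,\partial x^k}\Bigr)V^j. \tag{2.3.6}
\]
This equation describes the relative motion of particles moving on nearby paths
under the force of gravity. The relative motion is sometimes described in terms of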
tidal forces from non-uniformities in the gravitational field, which is consistent
with what we see in the equation. The matrix governing this behavior is minus
the Hessian of $\Phi$, so its trace is $-\Delta\Phi = -4\pi G\sigma$, where $\sigma$ is the mass density.
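For a concrete instance of this tidal matrix, the sketch below takes the point-mass potential $\Phi = -GM/r$, a standard vacuum example we assume here with $GM = 1$ in arbitrary units, and approximates minus its Hessian by central finite differences. The function names and step size are our own. At the point $(r, 0, 0)$ one expects $\mathrm{diag}(2GM/r^3, -GM/r^3, -GM/r^3)$: stretching along the radial direction, squeezing in the transverse directions, and vanishing trace since $\sigma = 0$ away from the mass.

```python
import math

GM = 1.0   # G times the central mass, in illustrative units (assumed value)

def potential(x, y, z):
    """Newtonian point-mass potential Phi = -GM / r, valid away from the origin."""
    return -GM / math.sqrt(x * x + y * y + z * z)

def tidal_matrix(phi, point, h=1e-3):
    """Minus the Hessian of phi at point, by central finite differences."""
    n = len(point)

    def shifted(i, di, j, dj):
        q = list(point)
        q[i] += di
        q[j] += dj
        return phi(*q)

    hess = [[(shifted(i, h, j, h) - shifted(i, h, j, -h)
              - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4 * h * h)
             for j in range(n)] for i in range(n)]
    return [[-hess[i][j] for j in range(n)] for i in range(n)]

tidal = tidal_matrix(potential, (2.0, 0.0, 0.0))
for row in tidal:
    print(row)
print(sum(tidal[i][i] for i in range(3)))   # trace, close to 0 in vacuum
```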
We can use this formulation as a motivation for Einstein’s equation. In general
relativity, the paths of observers in free fall (i.e., subject only to gravitation) are
given by timelike geodesics, which are timelike paths with vanishing covariant
acceleration. By the equivalence principle, experiments cannot distinguish a
uniform gravitational field from uniform acceleration. It is as if locally, gravita-
tional field effects can be either created via an accelerating frame, or effectively
cancelled out (at least approximately near a given point in spacetime) by using
a suitable (“freely falling”) reference frame; we emphasize that we are not
claiming that any gravitational field can be precisely generated by accelerating
some inertial frame.
To be precise, if γ (t) is a geodesic, the geodesic equation in coordinates is
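\[ \frac{d^2\gamma^\mu}{dt^2} = -\Gamma^\mu_{\nu\sigma}(\gamma(t))\,\frac{d\gamma^\nu}{dt}\frac{d\gamma^\sigma}{dt}. \]
In normal coordinates at $p = \gamma(0)$ (or in Fermi coordinates along $\gamma$, as we
discuss below), these equations reduce to $\frac{d^2\gamma^\mu}{dt^2}\big|_0 = 0$, which is analogous to the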
Newtonian equation with vanishing gravitational field. Unlike the components
of the curvature tensor, for example, the Christoffel symbols do not form the
components of a tensor field, and can be transformed away at a point by a
coordinate change. At such a point in such coordinates, covariant derivatives
reduce to partial derivatives, and tensor equations for physical laws take their
special relativistic form as in inertial coordinates in Minkowski spacetime.
We now derive the equation governing the behavior of a family of nearby
geodesics. Let f (t, s) be a two-parameter map, and let D/∂s and D/∂t be the
covariant derivatives along the s- and t-curves under the map f . Suppose that
for each s, γs (t) = f (t, s) is a geodesic. The vector field V = ∂ f /∂s along f
is the variation field for the family of geodesics. If we focus on γ = γ0 and
consider V (t) along γ , then V (t) satisfies the differential equation (2.3.7), the
Jacobi equation.
Proposition 2-3. Consider a family of geodesics f (t, s) as above, with variation
field V . Then, along γ = f ( · , 0),
\[ \frac{D^2 V}{dt^2} = R(\gamma'(t), V(t), \gamma'(t)). \tag{2.3.7} \]
Proof. If V is a smooth vector field defined in a neighborhood of a curve γ , then
DV /dt = ∇γ ′ (t) V . For example, V might be a local coordinate vector field, or
possibly V (t) is defined along an immersed curve γ (t), so that V can be locally
extended and the identity holds (compare to the case γ (t) = p0 is constant, while
V (t) ∈ Tp0 M has a nonzero derivative).
With this in mind, we find (using the symmetry of the Christoffel symbols at
the last step, and with the summation convention running over all indices)

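\begin{align*}
\frac{D}{\partial t}\frac{\partial f}{\partial s}
&= \frac{D}{\partial t}\left(\frac{\partial f^k}{\partial s}\frac{\partial}{\partial x^k}\right) \\
&= \frac{\partial^2 f^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k} + \frac{\partial f^k}{\partial s}\nabla_{\frac{\partial f}{\partial t}}\frac{\partial}{\partial x^k} \\
&= \frac{\partial^2 f^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k} + \frac{\partial f^k}{\partial s}\frac{\partial f^\ell}{\partial t}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k} \\
&= \left(\frac{\partial^2 f^m}{\partial t\,\partial s} + \frac{\partial f^k}{\partial s}\frac{\partial f^\ell}{\partial t}\Gamma^m_{\ell k}\right)\frac{\partial}{\partial x^m} \\
&= \frac{D}{\partial s}\frac{\partial f}{\partial t}. \tag{2.3.8}
\end{align*}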
As we did not use the geodesic equation, this identity holds for general f (t, s).
For a smooth vector field W (t, s) along the map f , we have similarly,

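\begin{align*}
\frac{D}{\partial t}\frac{DW}{\partial s}
&= \frac{D}{\partial t}\left(\frac{\partial W^k}{\partial s}\frac{\partial}{\partial x^k} + W^k\nabla_{\frac{\partial f}{\partial s}}\frac{\partial}{\partial x^k}\right) \\
&= \frac{D}{\partial t}\left(\frac{\partial W^k}{\partial s}\frac{\partial}{\partial x^k} + W^k\frac{\partial f^\ell}{\partial s}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k}\right) \\
&= \frac{\partial^2 W^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k} + \frac{\partial W^k}{\partial s}\nabla_{\frac{\partial f}{\partial t}}\frac{\partial}{\partial x^k} + \frac{\partial W^k}{\partial t}\nabla_{\frac{\partial f}{\partial s}}\frac{\partial}{\partial x^k} + W^k\frac{\partial^2 f^\ell}{\partial t\,\partial s}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k} \\
&\quad + W^k\frac{\partial f^\ell}{\partial s}\frac{\partial f^j}{\partial t}\nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k}.
\end{align*}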
Thus we see that
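\[ \frac{D}{\partial t}\frac{DW}{\partial s} - \frac{D}{\partial s}\frac{DW}{\partial t} = W^k\frac{\partial f^\ell}{\partial s}\frac{\partial f^j}{\partial t}\left(\nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^\ell}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k}\right). \]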

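Since $R(X, Y, Z) = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]} Z$ (from the definition of curvature)
and since $[\partial/\partial x^i, \partial/\partial x^j] = 0$, we get
\begin{align*}
\frac{D}{\partial t}\frac{DW}{\partial s} - \frac{D}{\partial s}\frac{DW}{\partial t}
&= W^k\frac{\partial f^\ell}{\partial s}\frac{\partial f^j}{\partial t}\,R\left(\frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^\ell}, \frac{\partial}{\partial x^k}\right) \\
&= R\left(\frac{\partial f}{\partial t}, \frac{\partial f}{\partial s}, W\right). \tag{2.3.9}
\end{align*}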
Specializing to W = ∂ f /∂t, and with γ ′ (t) = ∂ f /∂t along s = 0, we have


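\begin{align*}
\frac{D^2 V}{\partial t^2} = \frac{D}{\partial t}\frac{D}{\partial t}\frac{\partial f}{\partial s}
&= \frac{D}{\partial t}\frac{D}{\partial s}\frac{\partial f}{\partial t} \\
&= \frac{D}{\partial s}\frac{D}{\partial t}\frac{\partial f}{\partial t} + R(\gamma'(t), V(t), \gamma'(t)) \\
&= R(\gamma'(t), V(t), \gamma'(t));
\end{align*}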
in the last step we used that Dγ ′ (t)/∂t = 0 since γ is a geodesic.
We note that when f is an embedding (or an immersion, which is an embedding
upon suitably restricting the domain of f ), so that vector fields along f can
be extended, we can see the above much more simply as follows. Since in
this case we can use D/∂t = ∇γ ′ (t) and D/∂s = ∇∂ f /∂s = ∇V (t) , and since
0 = d f ([∂/∂s, ∂/∂t]) = [∂ f /∂s, ∂ f /∂t], we again have, using that γ ′ (t) = ∂ f /∂t
along s = 0,
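\begin{align*}
\frac{D^2 V}{\partial t^2} = \frac{D}{\partial t}\frac{D}{\partial t}\frac{\partial f}{\partial s} = \frac{D}{\partial t}\frac{D}{\partial s}\frac{\partial f}{\partial t}
&= \nabla_{\frac{\partial f}{\partial t}}\nabla_{\frac{\partial f}{\partial s}}\frac{\partial f}{\partial t} \\
&= \nabla_{\frac{\partial f}{\partial s}}\nabla_{\frac{\partial f}{\partial t}}\frac{\partial f}{\partial t} + R(\gamma'(t), V(t), \gamma'(t)) \\
&= R(\gamma'(t), V(t), \gamma'(t)). \qquad \square
\end{align*}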
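The Jacobi equation can be checked numerically in a simple Riemannian model. The sketch below is an illustration under stated assumptions, not part of the text: it takes the unit sphere in $\mathbb R^3$, the family of great circles through the north pole, and uses two standard facts, namely that the covariant derivative along a curve on an embedded surface is the tangential projection of the ordinary derivative, and that on the unit sphere $R(X, Y, Z) = \langle Y, Z\rangle X - \langle X, Z\rangle Y$ with the sign convention used above. All function names are hypothetical.

```python
import math

def f(t, s):
    """Great circles through the north pole of the unit sphere, indexed by s."""
    return (math.sin(t) * math.cos(s), math.sin(t) * math.sin(s), math.cos(t))

def gamma(t):
    """Reference geodesic gamma = f(., 0), parametrized by arc length."""
    return f(t, 0.0)

def V(t):
    """Variation field df/ds along gamma (s = 0), computed by hand."""
    return (0.0, math.sin(t), 0.0)

def tangential(p, w):
    """Orthogonal projection of w onto the tangent plane of the sphere at p."""
    dot = sum(a * b for a, b in zip(w, p))
    return tuple(a - dot * b for a, b in zip(w, p))

def covariant_derivative(field, h=1e-4):
    """Return t -> D(field)/dt along gamma, via central differences in R^3."""
    def derivative(t):
        raw = tuple((a - b) / (2 * h) for a, b in zip(field(t + h), field(t - h)))
        return tangential(gamma(t), raw)
    return derivative

D2V = covariant_derivative(covariant_derivative(V))

def curvature_term(t):
    """R(gamma', V, gamma') = <V, gamma'> gamma' - <gamma', gamma'> V on the unit sphere."""
    gp = (math.cos(t), 0.0, -math.sin(t))  # gamma'(t), unit speed
    v = V(t)
    vg = sum(a * b for a, b in zip(v, gp))
    gg = sum(a * a for a in gp)
    return tuple(vg * a - gg * b for a, b in zip(gp, v))

t0 = 0.7  # arbitrary sample parameter (assumption)
print("D^2V/dt^2        :", D2V(t0))
print("R(gamma',V,gamma'):", curvature_term(t0))
```

Both sides reduce to $-V(t)$ here, which reflects the focusing of geodesics in positive curvature: the separation $\sin t$ between neighboring great circles grows, then shrinks back to zero at the south pole.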
We compare this to (2.3.6) to infer that the tidal acceleration relative to
$U = \gamma'(t)$, i.e., the acceleration $D^2(V\Delta s)/dt^2$, where $V\Delta s$ is approximately the
displacement from the reference observer $\gamma(t)$ to a nearby observer, is given by
$R(U, V\Delta s, U)$, which must be akin to the (tidal) force (per unit mass). We see
that the analogue of the matrix $\left(\frac{\partial^2\Phi}{\partial x^j\partial x^k}\right)$ is the matrix
\[ \left(R^k_{j\mu\nu}\frac{d\gamma^\mu}{dt}\frac{d\gamma^\nu}{dt}\right) \]
(note the index swap to match the sign), so that the analogue of $\Delta\Phi$ is obtained
by tracing over $j$ and $k$ to obtain $\operatorname{Ric}(\gamma'(t), \gamma'(t))$, where $t$ is proper time. Since
Ric(γ ′ (t), γ ′ (t)) = 0 for all timelike γ ′ (t) implies that the Ricci curvature must
vanish, we see that the Newtonian law of gravitation in vacuum is analogous to
the vacuum Einstein field equation Ric(g) = 0.
To make a further link to Newtonian theory, we now construct coordinates
along a timelike geodesic adapted to the geodesic and the geometry. We use
Greek letters for spacetime indices and Roman letters for spatial indices. Consider
a timelike geodesic γ (τ ), parametrized by proper time τ , with γ (0) = p. Choose
an orthonormal frame {e1 , e2 , e3 } for the orthogonal complement of γ ′ (0) in
T p M. We parallel translate these vectors along γ to produce an orthonormal
frame {e1 (τ ), e2 (τ ), e3 (τ )} for the orthogonal complement of γ ′ (τ ) in Tγ (τ ) M.


We define coordinates in a neighborhood of the geodesic by the map

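\[ \varphi(\tau, x) = \exp_{\gamma(\tau)}\big(x^i e_i(\tau)\big). \]
Since
\[ d\varphi\left(\frac{\partial}{\partial\tau}\bigg|_{(\tau,0)}\right) = \gamma'(\tau) \quad\text{and}\quad d\varphi\left(\frac{\partial}{\partial x^i}\bigg|_{(\tau,0)}\right) = e_i(\tau), \]
$\varphi$ defines a coordinate system, called Fermi coordinates, in a neighborhood of $\gamma$.
For index purposes, we rescale the time coordinate to $x^0 = c\tau$, and refer to the
coordinates $x^\mu$ as Fermi coordinates. Note that the coordinates of $\gamma(\tau) = \varphi(\tau, 0)$
are $\gamma^0(\tau) = c\tau$, while $\gamma^j(\tau) = 0$, and we let $\Gamma^\mu_{\nu\sigma}(c\tau, x) = \Gamma^\mu_{\nu\sigma}|_{\varphi(\tau,x)}$.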
It is clear by construction that the metric components along γ in this coordinate
system agree with the components of the Minkowski metric in inertial coordinates.
We now establish a lemma regarding the behavior of the Christoffel symbols
along γ .
Lemma 2-4. For all $\mu, \nu, \sigma \in \{0, 1, 2, 3\}$, $\Gamma^\mu_{\nu\sigma}(c\tau, 0) = 0$.

Proof. For any $(\tau, b)$, with $b = (b^1, b^2, b^3)$, consider the curve defined via the
exponential map as $\beta(s) = \exp_{\gamma(\tau)}(s b^i e_i(\tau)) = \varphi(\tau, sb)$. Let $b^0 = 0$. Then
$\beta^0(s) = c\tau$ and $\beta^k(s) = s b^k$ for $k = 1, 2, 3$. By definition, $\beta$ is a geodesic. Since
$d^2\beta^\mu/ds^2 = 0$ for $\mu = 0, 1, 2, 3$, we have
\[ 0 = \Gamma^\mu_{\nu\sigma}(c\tau, sb)\frac{d\beta^\nu}{ds}\frac{d\beta^\sigma}{ds} = \Gamma^\mu_{\nu\sigma}(c\tau, sb)\,b^\nu b^\sigma = \Gamma^\mu_{ij}(c\tau, sb)\,b^i b^j. \]
Since at $s = 0$, we can consider $\beta$ defined by an arbitrary $b \in \mathbb R^3$, we have
$\Gamma^\mu_{ij}(c\tau, 0) = 0$ for $0 \le \mu \le 3$ and $1 \le i, j \le 3$. Similarly, the geodesic equation
for $\gamma$ yields $\Gamma^\mu_{00}(c\tau, 0) = 0$ for $0 \le \mu \le 3$.
To get the other Christoffel symbols, we use the equations for parallel transport.
Along $\gamma$, the $x^\ell$-coordinate vector is $e_\ell$ for $\ell = 1, 2, 3$, so the components of $e_\ell$
along $\gamma$ are $e_\ell^\mu(\tau) = \delta^\mu_\ell$. The parallel transport equations are then given by
\[ 0 = \frac{d}{d\tau}\big(e_\ell^\mu(\tau)\big) + \Gamma^\mu_{\nu\sigma}(c\tau, 0)\,e_\ell^\nu(\tau)\frac{d\gamma^\sigma}{d\tau} = \Gamma^\mu_{\nu\sigma}(c\tau, 0)\,\delta^\nu_\ell\frac{d\gamma^\sigma}{d\tau} = c\,\Gamma^\mu_{\ell 0}(c\tau, 0), \]
where we used the fact that $\gamma^0(\tau) = c\tau$ and $\gamma^j(\tau) = 0$ for $j = 1, 2, 3$. □


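Lemma 2-5. Along $\gamma$, we have $R^\mu_{\nu 00} = \partial\Gamma^\mu_{00}/\partial x^\nu$, so that $R^k_{j00} = -\tfrac12 g_{00,kj}$ for
$j, k \in \{1, 2, 3\}$, while $R^0_{j00} = 0$.

Proof. Along $\gamma$ the Christoffel symbols vanish and hence $\partial\Gamma^\mu_{\nu\sigma}/\partial\tau = 0$, so that
$R^\mu_{000} = 0 = \partial\Gamma^\mu_{00}/\partial x^0$ along $\gamma$, while for $j = 1, 2, 3$,
\begin{align*}
R^\mu_{j00}\frac{\partial}{\partial x^\mu}
&= \nabla_{e_j(\tau)}\nabla_{\frac{\partial}{\partial x^0}}\frac{\partial}{\partial x^0} - \nabla_{c^{-1}\gamma'(\tau)}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^0} \\
&= \nabla_{e_j(\tau)}\left(\Gamma^\mu_{00}\frac{\partial}{\partial x^\mu}\right) - c^{-1}\nabla_{\gamma'(\tau)}\left(\Gamma^\mu_{j0}\frac{\partial}{\partial x^\mu}\right) \\
&= \frac{\partial\Gamma^\mu_{00}}{\partial x^j}\frac{\partial}{\partial x^\mu}.
\end{align*}
Moreover, since the Christoffel symbols vanish along $\gamma$, so do the first partials
of $g_{\mu\nu}$ and $g^{\mu\nu}$. Thus, along $\gamma$, we have
\[ \frac{\partial\Gamma^k_{00}}{\partial x^j} = \frac{\partial}{\partial x^j}\left(\tfrac12 g^{k\sigma}(2g_{0\sigma,0} - g_{00,\sigma})\right) = -\tfrac12\delta^{km}g_{00,mj} = -\tfrac12 g_{00,kj}. \]
Similarly, $\partial\Gamma^0_{00}/\partial x^j = -\tfrac12 g_{00,0j} = 0$. □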
Now, as we noted above, the analogue of the matrix $\left(\frac{\partial^2\Phi}{\partial x^j\partial x^k}\right)$ is
\[ c^2 R^k_{j00} = -\tfrac12 c^2 g_{00,jk}. \]
Thus the analogue of the gravitational potential $\Phi$ is $-\tfrac12 c^2 g_{00}$, which is equivalent
to what we had earlier (since $\Phi$ is defined only up to an additive constant). The
analogue of its Laplacian $\Delta\Phi$ is then $\operatorname{Ric}(\gamma'(\tau), \gamma'(\tau)) = c^2 R_{00}$, as we had
before. Our analysis leads us again to propose that the Ricci curvature should
be related to the matter density. Of course, the mass-energy density is not an
invariant object, but the stress-energy tensor is. We can argue as in our earlier
analysis how to get from here to the Einstein equation.

2.3.4. Variational formulation. We now consider a Lagrangian variational formulation
for the Einstein equation, as first derived by Hilbert around the time
Einstein proposed the equation to model gravitation. Consider the Einstein–Hilbert
action, or total scalar curvature functional, $\mathcal R(g) = \int_M R(g)\,dv_g$ (or the
related action $\mathcal S(g) = \int_M \frac{1}{2\kappa}R(g)\,dv_g$; cf. (2.3.15)), where, in local coordinates,
$dv_g = \sqrt{|\det(g_{ij})|}\,dx$. (The definition holds for semi-Riemannian metrics $g$, and
thus includes both the Lorentzian and Riemannian cases.) We assume that $M$ is
compact, or more generally that $R(g) \in L^1(M, dv_g)$. We want to compute the
first variation of $\mathcal R$ and the associated Euler–Lagrange equation.
We will need Cramer's rule. If $A = (A_{ij})$ is an $n \times n$ matrix, we let $M_{ij}$ be
the determinant of the $(n-1) \times (n-1)$ minor obtained by deleting row $i$ and
column $j$ of $A$. The determinant of $A$ is given by column or row expansion:
\[ \det A = \sum_{i=1}^n (-1)^{i+j}M_{ij}A_{ij} = \sum_{j=1}^n (-1)^{i+j}M_{ij}A_{ij}. \]
The $n \times n$ matrix with entries $\operatorname{cof}(A)_{ij} = (-1)^{i+j}M_{ij}$ is known as the cofactor
matrix, and its transpose, $A^{\mathrm{adj}} = (\operatorname{cof}(A))^T$, is called the Cramer's rule adjoint,
so $A^{\mathrm{adj}}_{ij} = (-1)^{i+j}M_{ji} = \operatorname{cof}(A)_{ji}$. Thus for any $j \in \{1, 2, \dots, n\}$ we have
$\det A = (A^{\mathrm{adj}} \cdot A)_{jj}$. For $i \ne j$, we have $(A^{\mathrm{adj}} \cdot A)_{ij} = \sum_{k=1}^n (-1)^{i+k}M_{ki}A_{kj} = 0$:
indeed, we can interpret this sum as the determinant of the matrix $\tilde A$ obtained
by replacing column $i$ of $A$ by column $j$ of $A$. The minors $M_{ki}$ are obtained by
crossing out column $i$, so $M_{ki}$ are the same for $A$ and $\tilde A$. But $\det\tilde A = 0$ since
$\tilde A$ has two equal columns. In summary we arrive at Cramer's rule: If $I_n$ is the
$n \times n$ identity matrix, then $A^{\mathrm{adj}} \cdot A = (\det A)I_n$. If $A$ is invertible, then
\[ A^{-1} = \frac{1}{\det A}A^{\mathrm{adj}}. \]
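The identity $A^{\mathrm{adj}} \cdot A = (\det A) I_n$ is easy to confirm on a concrete matrix. The following sketch builds the adjoint from minors exactly as described above, using integer arithmetic so that the check is exact. The helper names and the sample matrix are illustrative assumptions.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by expansion along the first row."""
    if not A:
        return 1  # determinant of the empty matrix, needed for the 1 x 1 case
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """Cramer's rule adjoint: entry (i, j) is (-1)^(i+j) times the minor M_ji."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(P, Q):
    """Plain matrix product of two square matrices of the same size."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]  # sample matrix (assumption)
product = matmul(adjugate(A), A)
print("det A =", det(A))
print("A^adj . A =", product)
```

The recursive expansion is exponential in $n$ and is meant only to mirror the definition; it is not how determinants are computed in practice.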
Now we turn to variational formulae.
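Lemma 2-6. If $A(t)$ is a smooth path of $n \times n$ matrices,
\[ \frac{d}{dt}\det A(t) = \operatorname{tr}\big(A^{\mathrm{adj}}(t) \cdot A'(t)\big). \]
If $A(t)$ is invertible,
\[ \frac{d}{dt}\log|\det A(t)| = \operatorname{tr}\big(A^{-1}(t) \cdot A'(t)\big) \]
and
\[ \frac{d}{dt}\sqrt{|\det A(t)|} = \tfrac12\sqrt{|\det A(t)|}\,\operatorname{tr}\big(A^{-1}(t) \cdot A'(t)\big). \tag{2.3.10} \]
Proof. If we consider $\det A$ as a function of $(A_{ij}) \in \mathbb R^{n^2}$, then
\[ \frac{\partial\det A}{\partial A_{ij}} = (-1)^{i+j}M_{ij} = A^{\mathrm{adj}}_{ji}. \]
By the chain rule,
\[ \frac{d}{dt}\det A(t) = \sum_{i,j=1}^n \frac{\partial\det A}{\partial A_{ij}} \cdot \frac{\partial A_{ij}}{\partial t}. \]
Together with the preceding equation, this establishes the lemma. □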
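The two derivative formulas of Lemma 2-6 can be compared against finite differences. The sketch below does this for a hand-picked path of $2 \times 2$ matrices; the path `A(t)`, the sample time `t0`, and the step `h` are assumptions made for illustration, and the $2 \times 2$ adjoint is written out explicitly.

```python
import math

def A(t):
    """A smooth path of 2 x 2 matrices (chosen arbitrarily for the check)."""
    return [[1.0 + t, t * t], [math.sin(t), 2.0 + math.cos(t)]]

def dA(t):
    """Entrywise derivative A'(t) of the path above."""
    return [[1.0, 2.0 * t], [math.cos(t), -math.sin(t)]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def adj2(M):
    """Cramer's rule adjoint of a 2 x 2 matrix."""
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def trace_of_product(P, Q):
    """tr(P . Q) for 2 x 2 matrices."""
    return sum(P[i][k] * Q[k][i] for i in range(2) for k in range(2))

t0, h = 0.3, 1e-5

# d/dt det A(t) versus tr(A^adj . A')
det_numeric = (det2(A(t0 + h)) - det2(A(t0 - h))) / (2 * h)
det_formula = trace_of_product(adj2(A(t0)), dA(t0))

# (2.3.10): d/dt sqrt|det A| versus (1/2) sqrt|det A| tr(A^{-1} . A')
sqrt_numeric = (math.sqrt(abs(det2(A(t0 + h))))
                - math.sqrt(abs(det2(A(t0 - h))))) / (2 * h)
inverse = [[entry / det2(A(t0)) for entry in row] for row in adj2(A(t0))]
sqrt_formula = 0.5 * math.sqrt(abs(det2(A(t0)))) * trace_of_product(inverse, dA(t0))

print(det_numeric, det_formula)
print(sqrt_numeric, sqrt_formula)
```

Formula (2.3.10) is the one used for the volume element: with $A(t)$ the matrix of components of $g + th$, it gives the rate of change of $\sqrt{|\det(g_{ij})|}$.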


We need one more variational formula, for the scalar curvature, to compute
the Euler–Lagrange equation for the Einstein–Hilbert action. For a symmetric
(0, 2)-tensor field $h$, at any point, $g_t := g + th$ is a metric for $|t|$ small enough
on any compact subset, or on all of $M$ if $h$ has compact support. We define
$L_g h := \frac{d}{dt}\big|_{t=0} R(g + th)$, which may be computed locally, so we may assume in
the computation that $|t|$ is small enough that $g_t$ is indeed a metric.
Lemma 2-7. For a symmetric (0, 2)-tensor field $h$,
\[ L_g h = -\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g\operatorname{div}_g h - h \cdot \operatorname{Ric}(g) \tag{2.3.11} \]
where $h \cdot \operatorname{Ric}(g) = g^{ik}g^{j\ell}h_{ij}R_{k\ell}$.
The proof can be carried out in a straightforward computation. For instance,
one might compute in normal coordinates at a point, and it might be useful to
employ the fact that if $\Gamma^k_{ij}(t)$ are the Christoffel symbols (which do not comprise
components for a local tensor field) for $g_t = g + th$ in a coordinate chart, then
$\delta\Gamma^k_{ij} := \frac{d}{dt}\big|_{t=0}\Gamma^k_{ij}(t)$ in fact do give the components for a local tensor field; this
follows from the fact that the difference of two connections is tensorial, so that if
$\nabla^g$ is the Levi-Civita connection for $g$, then $S(X, Y) := \nabla^{g_t}_X Y - \nabla^g_X Y$ is tensorial
(Exercise 1-6).
Exercise 2-8. Prove Lemma 2-7. You might start by deriving the local coordinate
formula for the curvature,
\[ R^\ell_{kij} = \Gamma^\ell_{ij,k} - \Gamma^\ell_{jk,i} + \Gamma^\ell_{km}\Gamma^m_{ij} - \Gamma^m_{jk}\Gamma^\ell_{im}, \tag{2-8a} \]
and thus for the Ricci and scalar curvatures:
\[ R(g) = g^{ij}R_{ij} = g^{ij}\big(\Gamma^k_{ij,k} - \Gamma^k_{ik,j} + \Gamma^k_{km}\Gamma^m_{ij} - \Gamma^m_{jk}\Gamma^k_{im}\big). \tag{2-8b} \]
You might then argue that $\delta\Gamma^k_{ij} = \tfrac12 g^{km}(h_{mj;i} + h_{im;j} - h_{ij;m})$, and observe that
the variation of the Ricci tensor is given by $\frac{d}{dt}\big|_{t=0}R_{ij} = (\delta\Gamma)^k_{ij;k} - (\delta\Gamma)^k_{ik;j}$.
We now derive the Euler–Lagrange equation for the Einstein–Hilbert action.
We will vary the metric g in the direction of a symmetric (0, 2)-tensor h. We
will take h to be compactly supported, so we can make sense out of the variation
for a given h even in case R(g) fails to be integrable, by integrating only over
the support of h, where the metric g is actually changing.
Theorem 2-9. The first variation of the Einstein–Hilbert action (total scalar
curvature functional) $\mathcal R$ is given by
\[ \frac{d}{dt}\bigg|_{t=0}\mathcal R(g + th) = -\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g\big)\,dv_g \]
for all compactly supported tensors $h$ (vanishing near the boundary $\partial M$ if $\partial M$
is nonempty). Thus the Euler–Lagrange equation is $\operatorname{Ric}(g) - \tfrac12 R(g)\,g = 0$. This
equation is satisfied on all two-dimensional manifolds $(M, g)$. For $n = \dim M \ge 3$,
the Euler–Lagrange equation is equivalent to $\operatorname{Ric}(g) = 0$.
Proof. In local coordinates $x$, we have $dv_g = \sqrt{|\det(g_{ij})|}\,dx$ (which we can treat
as a measure or locally as an $n$-form). Hence, from (2.3.10) in Lemma 2-6, the symmetry
of $g$ (or $h$), and the fact that $g^{ij}h_{ij} = \operatorname{tr}_g h$, we have, for $g_t = g + th$,
\[ \frac{d}{dt}\bigg|_{t=0} dv_{g_t} = \tfrac12 g^{ij}h_{ij}\,dv_g = \tfrac12(\operatorname{tr}_g h)\,dv_g. \tag{2.3.12} \]
Integrating by parts and observing that boundary terms vanish by the choice of $h$,
we obtain
\begin{align*}
\frac{d}{dt}\bigg|_{t=0}\mathcal R(g + th)
&= \int_M L_g h\,dv_g + \int_M R(g) \cdot \tfrac12(\operatorname{tr}_g h)\,dv_g \\
&= \int_M \Big[\big(-\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g\operatorname{div}_g h - h \cdot \operatorname{Ric}(g)\big) + \tfrac12 R(g)\,g \cdot h\Big]\,dv_g \\
&= -\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g\big)\,dv_g.
\end{align*}
If $M$ is closed, we let $h = \operatorname{Ric}(g) - \tfrac12 R(g)\,g$ to finish the proof. In any case, the
preceding equation holds for all $h$ in a dense subset of $L^2(M, dv_g)$, so we see
that we must have $\operatorname{Ric}(g) - \tfrac12 R(g)\,g = 0$.
If M is two-dimensional and {e1 , e2 } is an orthonormal basis of T p M, say
⟨e1 , e2 ⟩ = 0, ⟨e1 , e1 ⟩ = ϵ1 = ±1, and ⟨e2 , e2 ⟩ = ϵ2 = 1, then

Ricg (e1 , e j )= ϵ1 ⟨R(e1 , e1 , e j ), e1 ⟩ + ϵ2 ⟨R(e2 , e1 , e j ), e2 ⟩


= ⟨R(e2 , e1 , e1 ), e2 ⟩δ1 j
= ϵ1 ⟨R(e2 , e1 , e1 ), e2 ⟩g(e1 , e j ),
Ricg (e2 , e j )= ϵ1 ⟨R(e2 , e1 , e1 ), e2 ⟩δ2 j
= ϵ1 ⟨R(e2 , e1 , e1 ), e2 ⟩g(e2 , e j ).

Hence the desired equation follows from

R(g) = ϵ1 Ricg (e1 , e1 ) + ϵ2 Ricg (e2 , e2 ) = 2ϵ1 ⟨R(e2 , e1 , e1 ), e2 ⟩.

If $M$ has higher dimension, we trace the Euler–Lagrange equation to obtain
$R(g) - \tfrac12(\dim M)R(g) = 0$, or $R(g) = 0$. Thus $\operatorname{Ric}(g) = 0$ in this case; the
converse is trivial. □
Definition 2-10. A semi-Riemannian manifold (M n , g) is called an Einstein
manifold if Ric(g) = Cg for some constant C. In this case the scalar curvature
is constant: R(g) = nC.
Exercise 2-11. Consider an Einstein semi-Riemannian manifold (M n , g). In the


case n = 2, the sectional curvature is clearly constant. Prove this as well for n = 3.
(Hint: consider using an orthonormal frame {E 1 , E 2 , E 3 } for T p M, mindful of
the signature ϵi = ⟨E i , E i ⟩.)

Lemma 2-12. Consider (M n , g) semi-Riemannian with n ≥ 3. If Ric(g) = f g


for some function f , then f is (locally) constant.

Exercise 2-13. Prove the lemma. You might recall the Bianchi identity; compare
Corollary 2-2.

For any constant $\Lambda$, we can also consider $\mathcal R_\Lambda(g) = \int_M (R(g) - 2\Lambda)\,dv_g$. The
formula (2.3.12) for the variation of the volume element $dv_g$ easily gives this:

Corollary 2-14. For all compactly supported tensors $h$ (vanishing near $\partial M$ if
$\partial M$ is nonempty), we have
\[ \frac{d}{dt}\bigg|_{t=0}\mathcal R_\Lambda(g + th) = -\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g\big)\,dv_g. \]
Thus the Euler–Lagrange equation for $\mathcal R_\Lambda$ is
\[ \operatorname{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g = 0. \]
Suppose that $g$ solves the Euler–Lagrange equation. Then $\Lambda = 0$ if $M$ has
dimension 2. For $\dim M = n \ge 3$, we have
\[ R(g) = \frac{2n}{n-2}\Lambda, \]
that is, the metric $g$ is Einstein, $\operatorname{Ric}(g) = Cg$, with $C = \dfrac{2\Lambda}{n-2}$.
Definition 2-15. The Einstein tensor is given by $G = G(g) = \operatorname{Ric}(g) - \tfrac12 R(g)\,g$.
For $\Lambda$ constant, we also define the related tensor $G_\Lambda(g) = \operatorname{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g$.
The vacuum Einstein equation (with cosmological constant $\Lambda$) is given by $G_\Lambda = 0$.
If $\Lambda = 0$ and $n > 2$, this is equivalent to $\operatorname{Ric}(g) = 0$.
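The trace computation relating the cosmological constant to the Einstein constant is short enough to check with exact rational arithmetic. In the sketch below, an Einstein metric with $\operatorname{Ric}(g) = Cg$ and $R(g) = nC$ is substituted into the vacuum equation, and the coefficient of $g$ is evaluated; the function names and the sample value of the constant are illustrative assumptions.

```python
from fractions import Fraction

def einstein_constant(n, lam):
    """C = 2*Lambda/(n - 2), the constant in Ric(g) = C g for n >= 3."""
    return Fraction(2) * lam / (n - 2)

def scalar_curvature(n, lam):
    """R(g) = n*C for an Einstein metric with Ric(g) = C g."""
    return n * einstein_constant(n, lam)

def residual_coefficient(n, lam):
    """Coefficient of g in Ric - (1/2) R g + Lambda g when Ric = C g."""
    C = einstein_constant(n, lam)
    R = scalar_curvature(n, lam)
    return C - R / 2 + lam

lam = Fraction(3, 5)  # sample cosmological constant (assumption)
for n in range(3, 8):
    print(n, einstein_constant(n, lam), scalar_curvature(n, lam),
          residual_coefficient(n, lam))
```

In dimension four the constant reduces to $C = \Lambda$, so the vacuum equation with cosmological constant reads $\operatorname{Ric}(g) = \Lambda g$.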

2.3.4.1. Lagrangian formulation with matter fields. We now discuss how to
include matter fields into the variational formulation. We assume the matter
fields are given in terms of a collection $\Psi$ of tensor fields (including scalars,
vectors, etc.), which we assume are independent of the metric $g$ and are governed
by an action $\int_M \mathcal L_m(g, \Psi)\,dv_g$. Note that $\mathcal L_m$ can also depend on derivatives of
the fields in $\Psi$. If $h$ is a symmetric (0, 2)-tensor and $\Phi$ is a collection of tensor
fields representing a direction of variation of $\Psi$, we write
\begin{align*}
\frac{d}{dt}\bigg|_{t=0}\mathcal L_m(g + th, \Psi) &=: (D_1\mathcal L_m)_{(g,\Psi)}(h, 0), \\
\frac{d}{dt}\bigg|_{t=0}\mathcal L_m(g, \Psi + t\Phi) &=: (D_2\mathcal L_m)_{(g,\Psi)}(0, \Phi).
\end{align*}
We remark that it is sometimes useful, especially when varying the metric, to
express the action through the Lagrangian density, $\mathcal L_m \cdot \sqrt{|\det g|}$.
The fields $\Psi$ must satisfy certain equations for the action to be stationary at
$(g, \Psi)$: for any direction of variation $\Phi$ (compactly supported and vanishing
near the boundary, if nonempty) of the fields, we must have
\[ 0 = \frac{d}{dt}\bigg|_{t=0}\int_M \mathcal L_m(g, \Psi + t\Phi)\,dv_g = \int_M (D_2\mathcal L_m)_{(g,\Psi)}(0, \Phi)\,dv_g. \tag{2.3.13} \]

From here we can derive field equations (sometimes called equations of motion)
for the matter fields. Instead of introducing new notation, we illustrate with
examples.
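Example 2-16. Consider $g = \eta$, the Minkowski metric on $\mathbb M^{1+k} = \mathbb R^{1+k}_1$, and
work with global inertial coordinates $(t, x^i)$. Then for any smooth function $\psi$,
\[ \eta(d\psi, d\psi) = -c^{-2}\left(\frac{\partial\psi}{\partial t}\right)^2 + \sum_{i=1}^k\left(\frac{\partial\psi}{\partial x^i}\right)^2. \]
If the action integrand is $\mathcal L_m = -\tfrac12\eta(d\psi, d\psi)$, then for any smooth compactly
supported $\varphi$, the field equation is obtained from
\begin{align*}
0 &= \frac{d}{dt}\bigg|_{t=0}\int_M -\tfrac12\eta(d\psi + t\,d\varphi, d\psi + t\,d\varphi)\,dt\,dx^1\cdots dx^k \\
&= \int_M -\eta(d\psi, d\varphi)\,dt\,dx^1\cdots dx^k = \int_M \varphi\,\Box\psi\,dt\,dx^1\cdots dx^k
\end{align*}
where
\[ \Box = -c^{-2}\left(\frac{\partial}{\partial t}\right)^2 + \sum_{i=1}^k\left(\frac{\partial}{\partial x^i}\right)^2 \]
is the wave operator in Minkowski spacetime. Since this must hold for all
appropriate $\varphi$, the field equation is the wave equation $\Box\psi = 0$.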

Exercise 2-17. If we let $\mathcal L_m = -\tfrac12\eta(d\psi, d\psi) - V(\psi)$, where $V$ is a smooth
function of one variable, show that the resulting field equation is $\Box\psi = V'(\psi)$.
In case $V(\psi) = \tfrac12 m^2\psi^2$, the resulting equation $\Box\psi - m^2\psi = 0$ is called the
Klein–Gordon equation (with $c = 1$, $\hbar = 1$).
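Both field equations can be sanity-checked on explicit solutions in one space dimension. The sketch below applies a finite-difference version of the wave operator, with the sign convention $\Box = -c^{-2}\partial_t^2 + \partial_x^2$ used above, to a travelling Gaussian profile and to a plane wave obeying the dispersion relation $\omega^2 = k^2 + m^2$ (with $c = 1$). The profiles, parameter values, and function names are assumptions chosen for illustration.

```python
import math

def box(psi, t, x, c, h=1e-3):
    """Finite-difference wave operator -c^{-2} d^2/dt^2 + d^2/dx^2 applied to psi."""
    d2t = (psi(t + h, x) - 2 * psi(t, x) + psi(t - h, x)) / h ** 2
    d2x = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / h ** 2
    return -d2t / c ** 2 + d2x

c = 2.0  # sample wave speed (assumption)

def travelling(t, x):
    """A profile moving to the right at speed c; it should satisfy box(psi) = 0."""
    return math.exp(-(x - c * t) ** 2)

m, k = 1.5, 0.8                   # sample mass and wave number (assumptions)
omega = math.sqrt(k * k + m * m)  # dispersion relation with c = 1

def plane_wave(t, x):
    """Plane wave; with the dispersion relation above, box(psi) = m^2 psi."""
    return math.cos(k * x - omega * t)

wave_residual = box(travelling, 0.3, 0.5, c)
kg_residual = box(plane_wave, 0.2, 0.4, 1.0) - m * m * plane_wave(0.2, 0.4)
print("wave equation residual:", wave_residual)
print("Klein-Gordon residual:", kg_residual)
```

Both residuals are of the size of the discretization error, which shrinks quadratically with the step `h` until rounding error takes over.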
We remark that the Lagrangian density in the Einstein–Hilbert action for the
gravitational field involves second-order derivatives of the metric, as does the
corresponding Euler–Lagrange equation (the Einstein equation), whereas for the
preceding two examples, the Lagrangian density is first-order in ψ, while the
Euler–Lagrange equation is second-order in ψ.

Example 2-18. For the source-free electromagnetic field, one has
\[ \mathcal L_m = -\frac{1}{16\pi}F_{ab}F_{cd}\,g^{ac}g^{bd}, \]
where $F_{ab} = (dA)_{ab}$ for a one-form $A$, the potential. If $\varphi$ is a direction of
variation of the potential, then the stationarity of the action requires (we use the
antisymmetry of $F$ and Exercise 1-6)
\begin{align*}
0 &= \int_M -\frac{1}{8\pi}\langle d\varphi, F\rangle_g\,dv_g = \int_M -\frac{1}{8\pi}(\varphi_{b;a} - \varphi_{a;b})F_{cd}\,g^{ac}g^{bd}\,dv_g \\
&= \int_M \frac{1}{4\pi}\varphi_{a;b}F_{cd}\,g^{ac}g^{bd}\,dv_g \\
&= \int_M -\frac{1}{4\pi}\varphi_a(F_{cd;b}\,g^{bd})g^{ac}\,dv_g.
\end{align*}
In other words, we obtain the equation $\operatorname{div}_g F = 0$, where $\operatorname{div}_g$ is the spacetime
divergence $F_{cd;b}\,g^{bd}$, which forms part of Maxwell's equations. The other part
comes from the fact that $F_{ab}$ is a closed two-form.

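We now define the stress-energy tensor $T$ corresponding to matter fields
given by a Lagrangian as above, via the following equation identifying $T$ as a
symmetric (0, 2)-tensor via an integral pairing:
\begin{align*}
\frac{d}{dt}\bigg|_{t=0}\int_M \mathcal L_m(g + th, \Psi)\,dv_{g+th}
&= \int_M \Big[(D_1\mathcal L_m)_{(g,\Psi)}(h, 0) + \mathcal L_m(g, \Psi) \cdot \tfrac12\operatorname{tr}_g h\Big]\,dv_g \\
&=: \frac12\int_M \langle h, T\rangle_g\,dv_g. \tag{2.3.14}
\end{align*}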

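Example 2-19. Consider a Klein–Gordon field $\mathcal L_m = -\tfrac12\eta(d\psi, d\psi) - \tfrac12 m^2\psi^2$
at the Minkowski metric on $\mathbb M^{1+k}$. The stress tensor $T$ satisfies
\[ \int_M \eta(h, T)\,dv_\eta = 2\int_M \Big[\tfrac12\eta^{ij}h_{js}\eta^{s\ell}\psi_{;i}\psi_{;\ell} - \big(\tfrac12\eta(d\psi, d\psi) + \tfrac12 m^2\psi^2\big)\tfrac12 h_{js}\eta^{js}\Big]\,dv_\eta. \]
Thus we see $T_{ab} = \psi_{;a}\psi_{;b} - \tfrac12\big(\eta(d\psi, d\psi) + m^2\psi^2\big)\eta_{ab}$. There is an analogue
on any spacetime $(M, g)$: $T_{ab} = \psi_{;a}\psi_{;b} - \tfrac12\big(\psi_{;c}\psi_{;d}\,g^{cd} + m^2\psi^2\big)g_{ab}$.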
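Example 2-20. For the source-free Maxwell field $\mathcal L_m = -\dfrac{1}{16\pi}F_{ab}F_{cd}\,g^{ac}g^{bd}$,
the method above yields
\begin{align*}
\int_M \langle h, T\rangle_g\,dv_g
&= \int_M -\frac{1}{8\pi}F_{ab}F_{cd}\big(-g^{ai}h_{ij}g^{jc}g^{bd} - g^{ac}g^{bi}h_{ij}g^{jd} + \tfrac12 h_{ij}g^{ij}g^{ac}g^{bd}\big)\,dv_g \\
&= \int_M \frac{1}{8\pi}h_{ij}F_{ab}F_{cd}\big(g^{ai}g^{jc}g^{bd} + g^{ac}g^{bi}g^{jd} - \tfrac12 g^{ac}g^{bd}g^{ij}\big)\,dv_g \\
&= \int_M \frac{1}{4\pi}h_{ij}F_{ab}F_{cd}\big(g^{bd}g^{ai}g^{jc} - \tfrac14 g^{ac}g^{bd}g^{ij}\big)\,dv_g
\end{align*}
where in the last step we used the antisymmetry of $F_{ab}$. Thus
\[ T_{ab} = \frac{1}{4\pi}\big(F_{ac}F_{bd}\,g^{cd} - \tfrac14 F_{ij}F_{s\ell}\,g^{is}g^{j\ell}g_{ab}\big). \]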

One can derive the Einstein equation starting from the ansatz that the action
for the combination of gravitation with the fields is simply the sum of a constant
multiple of $\mathcal R_\Lambda$ with the action for the matter. In particular, we are assuming that
the fields do not themselves appear in the Lagrangian for the gravitational field.
With $\kappa$ a positive constant as above (where $\kappa = 8\pi G/c^4$ in spacetime dimension
four by the Newtonian limit), we consider the action
\[ \mathcal S(g, \Psi) = \int_M \Big[\frac{1}{2\kappa}(R(g) - 2\Lambda) + \mathcal L_m(g, \Psi)\Big]\,dv_g. \tag{2.3.15} \]

An important consideration in the earlier derivation of the Einstein tensor
is that the stress-energy tensor is divergence-free. This can be derived from
the variational definition above, along with diffeomorphism invariance of the
action. The Einstein–Hilbert action $\mathcal R_\Lambda$ is clearly diffeomorphism invariant: if $\phi$
is a diffeomorphism of $M$, then $\mathcal R_\Lambda(g) = \mathcal R_\Lambda(\phi^* g)$. Thus the diffeomorphism
invariance of $\mathcal S(g, \Psi)$ is equivalent to the condition that the action for the matter,
$\mathcal R_m(g, \Psi) = \int_M \mathcal L_m(g, \Psi)\,dv_g$, is diffeomorphism invariant.
Now, suppose that $X$ is a smooth vector field on $M$, which generates a local
one-parameter subgroup of diffeomorphisms $\phi_t$. Recall that $\frac{d}{dt}\big|_{t=0}\phi_t^* g = L_X g$,
where $(L_X g)_{ij} = X_{i;j} + X_{j;i}$.
Exercise 2-21. Suppose $X$ is compactly supported away from the boundary,
and let $h = L_X g$. Show directly from the formula for the variation of the
Einstein–Hilbert action and the Bianchi identities that $\frac{d}{dt}\big|_{t=0}\mathcal R(g + th) = 0$.
For $X$ compactly supported away from the boundary with one-parameter
group of diffeomorphisms $\phi_t$, we then have by using the preceding exercise,
along with (2.3.13) and (2.3.14), the symmetry of $T$ and the diffeomorphism
invariance of the action,
\begin{align*}
0 &= \frac{d}{dt}\bigg|_{t=0}\int_M \mathcal L_m(\phi_t^* g, \phi_t^*\Psi)\,dv_{\phi_t^* g} \\
&= \frac12\int_M \langle L_X g, T\rangle_g\,dv_g = \int_M X_{i;j}T^{ij}\,dv_g = -\int_M X_i\,T^{ij}{}_{;j}\,dv_g.
\end{align*}
Since this holds for any compactly supported $X$, we see that $T$ must be divergence-free
as desired.

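2.3.4.2. Related variational problems. Let $M$ be a closed (compact without
boundary) manifold of dimension $n \ge 3$. We want to characterize critical points
of $\mathcal R$ constrained to metrics of fixed volume, say $\operatorname{Vol} g = 1$. We let $\mathcal M$ be
the set of (smooth) metrics, and let $\mathcal M_1 \subset \mathcal M$ be the subset of unit volume
metrics. Note that if $g \in \mathcal M$, and if $C > 0$ is a constant, then $dv_{(Cg)} = C^{n/2}\,dv_g$,
and so $\operatorname{Vol}(Cg) = C^{n/2}\operatorname{Vol} g$. Thus $(\operatorname{Vol} g)^{-2/n}g \in \mathcal M_1$. We also note that
$R(Cg) = C^{-1}R(g)$.
We can consider $\mathcal R$ on $\mathcal M_1$, or by rescaling $g \in \mathcal M$ to $\bar g = (\operatorname{Vol} g)^{-2/n}g \in \mathcal M_1$,
we can consider the normalized functional
\[ \overline{\mathcal R}(g) = \mathcal R(\bar g) = \frac{\mathcal R(g)}{(\operatorname{Vol} g)^{1-\frac{2}{n}}}. \]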
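The scale invariance of the normalized functional can be seen concretely on round spheres. The sketch below uses two standard facts that are not derived in the text: the round $n$-sphere of radius $r$ has scalar curvature $n(n-1)/r^2$, and the unit $n$-sphere has volume $2\pi^{(n+1)/2}/\Gamma((n+1)/2)$. Function names are hypothetical.

```python
import math

def unit_sphere_volume(n):
    """Volume of the unit round n-sphere: 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

def round_sphere(n, r):
    """Return (volume, scalar curvature, total scalar curvature) for radius r."""
    volume = r ** n * unit_sphere_volume(n)
    scalar = n * (n - 1) / r ** 2
    return volume, scalar, scalar * volume

def normalized_total_scalar_curvature(n, r):
    """Total scalar curvature divided by Vol^(1 - 2/n)."""
    volume, _, total = round_sphere(n, r)
    return total / volume ** (1 - 2 / n)

for r in (0.5, 1.0, 2.5):
    print(r, round_sphere(3, r)[2], normalized_total_scalar_curvature(3, r))
```

The total scalar curvature itself scales like $r^{n-2}$, while the normalized quantity stays at $n(n-1)\,\omega_n^{2/n}$, where $\omega_n$ denotes the volume of the unit sphere.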

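Proposition 2-22. A metric $g \in \mathcal M$ is critical for the normalized functional $\overline{\mathcal R}$ if and only if $g$ is Einstein.
Thus $g \in \mathcal M_1$ is critical for $\mathcal R$ restricted to $\mathcal M_1$ if and only if $g$ is Einstein.

Proof. Let $g_t = g + th$, and let $\bar g_t = (\operatorname{Vol} g_t)^{-2/n}g_t \in \mathcal M_1$. Then $\overline{\mathcal R}(g_t) = \mathcal R(\bar g_t) =
\mathcal R(g_t)/(\operatorname{Vol} g_t)^{1-\frac{2}{n}}$. We compute, using $\frac{d}{dt}\big|_{t=0}\operatorname{Vol} g_t = \int_M \tfrac12(\operatorname{tr}_g h)\,dv_g$,
\begin{align*}
\frac{d}{dt}\bigg|_{t=0}\mathcal R(\bar g_t)
&= -\frac{\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g\big)\,dv_g}{(\operatorname{Vol} g)^{1-\frac{2}{n}}} - \Big(1 - \frac{2}{n}\Big)\frac{\mathcal R(g)}{(\operatorname{Vol} g)^{2-\frac{2}{n}}}\int_M \tfrac12(\operatorname{tr}_g h)\,dv_g \\
&= -\frac{1}{(\operatorname{Vol} g)^{1-\frac{2}{n}}}\int_M h \cdot \Big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g + \frac{n-2}{2n}(\operatorname{Vol} g)^{-1}\mathcal R(g)\,g\Big)\,dv_g.
\end{align*}
For this to vanish for all symmetric (0, 2)-tensors $h$, we must have
\[ \operatorname{Ric}(g) - \frac12 R(g)\,g + \frac{n-2}{2n}(\operatorname{Vol} g)^{-1}\mathcal R(g)\,g = 0. \]
Taking the trace yields
\[ -\frac{n-2}{2}\Big(R(g) - (\operatorname{Vol} g)^{-1}\mathcal R(g)\Big) = 0. \]
Hence $R(g) = (\operatorname{Vol} g)^{-1}\mathcal R(g)$, a constant. From here, we clearly have
\[ \operatorname{Ric}(g) = \frac{R(g)}{n}\,g, \]
and $R(g)$ is constant. □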
We note that while the unconstrained problem has critical points characterized
by Ric(g) = 0, the constrained problem has a more general critical point equation,
which can be interpreted as a Lagrange multiplier condition. If we interpret the
gradient (in an $L^2$ integral sense) of $\mathcal R$ as $-\big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g\big)$, the gradient of
the volume functional would be $\tfrac12 g$, and a Lagrange multiplier condition would
be of the form $-\big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g\big) = \lambda \cdot \tfrac12 g$.
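Remark 2-23. Einstein metrics arise as stationary points of normalized Ricci
flow. From (2.3.12), we see that a solution $g = g_t$ to the normalized Ricci flow
\[ \frac{\partial g}{\partial t} = -2\operatorname{Ric}(g) + \frac{2}{n}\frac{\mathcal R(g)}{\operatorname{Vol} g}\,g = -2\Big(\operatorname{Ric}(g) - \frac{1}{n}\frac{\mathcal R(g)}{\operatorname{Vol} g}\,g\Big) \]
has volume $\operatorname{Vol} g_t$ constant.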
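Exercise 2-24. Suppose $g = g_t$ is a one-parameter family of metrics satisfying
the Ricci flow $\partial g/\partial t = -2\operatorname{Ric}(g)$ on a closed manifold $M$. Let $V_t = \operatorname{Vol} g_t$, and
let $\tilde g = \tilde g_t$ be the rescaled metric $\tilde g_t = V_t^{-2/n}g_t$. Note that $\tilde g$ has unit volume.
Let $\tilde t = \tilde t(t)$ be a rescaled time function determined by $d\tilde t/dt = V_t^{-2/n}$, with
$\tilde t(0) = 0$. Show that $\tilde g$ solves the normalized Ricci flow with respect to $\tilde t$, i.e.,
$\partial\tilde g/\partial\tilde t = -2\big(\operatorname{Ric}(\tilde g) - \tfrac{1}{n}\mathcal R(\tilde g)\,\tilde g\big)$.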


We could further constrain the variation so that the metrics under consideration
are not only of unit volume, but pointwise conformal to g, that is, expressible as
$f \cdot g$, for $f > 0$ a smooth function on $M$. One can partition $\mathcal M$ into equivalence
classes $[g]$ of pointwise conformal metrics. As an immediate corollary of the
proof of Proposition 2-22 we have the following.
Proposition 2-25. A metric $g$ is critical for the normalized functional $\overline{\mathcal R}$ amongst variations $g_t \in [g]$ if and
only if $R(g)$ is constant. A metric $g \in \mathcal M_1$ is critical for $\mathcal R$ amongst variations
$g_t \in \mathcal M_1 \cap [g]$ if and only if $R(g)$ is constant.
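Proof. Let $g_t = f_t\,g$, $t \in I$, be smooth in $I \times M$, with $f_0 = 1$. Then $f_t$ is smooth,
and $h = dg_t/dt = (df_t/dt)\,g$. Let $\psi = df_t/dt\,|_{t=0}$, which can be any smooth
function on $M$, since we could let $f_t = 1 + t\psi$. Applying the argument in the
proof of Proposition 2-22, with $h = \psi g$ and $g \cdot \operatorname{Ric}(g) = R(g)$, $g \cdot g = n$, we have
\begin{align*}
\frac{d}{dt}\bigg|_{t=0}\overline{\mathcal R}(g_t)
&= -\frac{1}{(\operatorname{Vol} g)^{1-\frac{2}{n}}}\int_M \psi g \cdot \Big(\operatorname{Ric}(g) - \tfrac12 R(g)\,g + \frac{n-2}{2n}(\operatorname{Vol} g)^{-1}\mathcal R(g)\,g\Big)\,dv_g \\
&= \frac{1}{(\operatorname{Vol} g)^{1-\frac{2}{n}}} \cdot \frac{n-2}{2}\int_M \psi\Big(R(g) - (\operatorname{Vol} g)^{-1}\mathcal R(g)\Big)\,dv_g.
\end{align*}
Since this must vanish for all $\psi$, we must have $R(g) = (\operatorname{Vol} g)^{-1}\mathcal R(g)$, and
conversely. □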

2.3.5. The Gauss–Bonnet theorem for closed surfaces. Before we return to


general relativity, we discuss how the analysis of the Einstein–Hilbert action
yields the global Gauss–Bonnet theorem.
Theorem 2-26. Suppose that $(\Sigma, g)$ is an orientable, closed Riemannian surface
with Euler characteristic $\chi(\Sigma)$. If $K = K(g)$ is the Gauss (sectional) curvature,
then
\[ \int_\Sigma K\,dv_g = 2\pi\chi(\Sigma). \]

Proof. Any connected such surface $\Sigma$ is topologically a sphere, or a torus, or a
torus with some number $\gamma$ of handles attached (genus $\gamma$). A sphere has Euler
characteristic 2, as can be seen using a tetrahedral triangulation. It is also not too
hard to triangulate a torus and compute its Euler characteristic to be 0. Higher-
genus surfaces are obtained by doing a connected sum to a torus. We can view
this as removing a triangle from a surface and a torus, and gluing along the edges
of the triangle. Suppose the original surface had Euler characteristic χ , and we
know the torus has Euler characteristic 0. So we know the total alternating sum
of vertices, edges and faces at the start for both surfaces is χ. In the process
of adding a handle, we lose two faces, three edges and three vertices. Thus
adding a handle brings the Euler characteristic down by 2. Thus $\chi(\Sigma) = 2 - 2\gamma$,
$\gamma = 0, 1, 2, \dots$.
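The counts in this argument can be reproduced mechanically. The sketch below computes $V - E + F$ from a list of triangular faces for a tetrahedron and for a grid triangulation of the torus, and then applies the handle-attachment bookkeeping described above. The particular $3 \times 3$ torus triangulation and the function names are illustrative choices, not taken from the text.

```python
def euler_characteristic(faces):
    """V - E + F for a surface given as a list of triangles (vertex triples)."""
    vertices = {v for face in faces for v in face}
    edges = {frozenset(pair) for a, b, c in faces
             for pair in ((a, b), (b, c), (a, c))}
    return len(vertices) - len(edges) + len(faces)

# Boundary of a tetrahedron: a triangulated sphere.
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def torus_triangulation(n=3):
    """Triangulate the torus as an n x n grid of squares with periodic
    identifications, each square cut along a diagonal (n >= 3)."""
    faces = []
    for i in range(n):
        for j in range(n):
            a, b = (i, j), ((i + 1) % n, j)
            c, d = (i, (j + 1) % n), ((i + 1) % n, (j + 1) % n)
            faces.append((a, b, d))
            faces.append((a, d, c))
    return faces

def chi_after_handles(genus):
    """Start from the sphere and attach handles one at a time."""
    chi = 2
    for _ in range(genus):
        # Connected sum with a torus (chi = 0): gluing along a removed triangle
        # loses two faces, three edges and three vertices, a net change of -2.
        chi = chi + 0 - 2
    return chi

print("sphere:", euler_characteristic(tetrahedron))
print("torus:", euler_characteristic(torus_triangulation(3)))
print("genus 3:", chi_after_handles(3))
```

Representing edges as unordered pairs of vertices means an edge shared by two triangles is counted once, which is what the alternating sum requires.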
As we proved earlier, the Einstein–Hilbert action $\mathcal R(g)$ is critical at every
metric $g$ on a two-dimensional manifold. This means that $\mathcal R$ is constant on the
space $\mathcal M$ of metrics: any two metrics $g_1, g_2 \in \mathcal M$ can be connected by a linear
path $g_t = (1 - t)g_1 + tg_2$, $0 \le t \le 1$, and we have seen that $\mathcal R(g_t)$ is constant in $t$.
Since $R(g) = 2K(g)$, we see that $\int_\Sigma K(g)\,dv_g$ is independent of $g$, and thus it
suffices to compute it at any single chosen metric.


Figure 4. Connected sum construction for the proof of the Gauss–Bonnet theorem.

For a sphere, we take the metric of the unit (round) sphere, K = 1 and
Vol g = 4π. The Euler characteristic of the sphere is 2, as mentioned.
For a torus, we have Euler characteristic 0, and we can use a flat metric on
the torus to compute $\mathcal R(g) = 0$.
To progress to the higher-genus case, we use an auxiliary construction. By
smoothly capping off an end of a circular cylinder, we obtain a surface D which
is topologically a closed disk, with a cylindrical collar neighborhood of its
boundary, which is a circular geodesic on the cylinder. Reflect the surface D
across the plane of the geodesic circle to produce a congruent surface D ′ , and
let S = D ∪ D ′ , which is smooth with spherical topology, and induced metric g.
Thus
\[ \int_D K(g)\,dv_g = \frac12\int_S K(g)\,dv_g = 2\pi. \tag{2.3.16} \]
Observe that (2.3.16) is independent of the precise geometry of the smooth cap,
as well as the size of the cylindrical portion (C in Figure 4), where K = 0.
Given a closed surface $\Sigma$, let $\Sigma'$ be the connected sum of $\Sigma$ and a torus $T$. We
can readily put a metric $g$ on $\Sigma$ and $g_0$ on $T$, so that there are neighborhoods $U \subset \Sigma$
and $V \subset T$, each isometric to the interior of the surface $D$ as above, and whose
complements $\Sigma \setminus U$ and $T \setminus V$ each possess a cylindrical collar neighborhood of
their boundary circle. We can represent $\Sigma'$ as the union $(\Sigma \setminus U) \cup (T \setminus V)$ along
the common boundary circle, and take the metric $g'$ coinciding with the metrics
on each piece, noting that $g'$ is smooth by construction. If the Gauss–Bonnet
theorem holds for $\Sigma$, then by (2.3.16)
\[ \int_{\Sigma'} K(g')\,dv_{g'} = \int_\Sigma K(g)\,dv_g + \int_T K(g_0)\,dv_{g_0} - 4\pi = 2\pi(\chi(\Sigma) - 2). \]
Since adding a handle contributes to a decrease in the Euler characteristic by 2,
the Gauss–Bonnet theorem follows by induction. □
The interested reader can readily extend the Gauss–Bonnet theorem to the case
where $\Sigma$ is closed but nonorientable, by applying the preceding to the orientation
double covering for any nonorientable component of $\Sigma$.
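The independence of the total curvature from the metric can be observed numerically on a family of spheroids. The sketch below integrates the Gauss curvature of the surface of revolution with profile $(a\sin\varphi, b\cos\varphi)$ by the midpoint rule. The curvature formula $K = b^2/(a^2\cos^2\varphi + b^2\sin^2\varphi)^2$ and the area density $a\sin\varphi\sqrt{a^2\cos^2\varphi + b^2\sin^2\varphi}$ are standard expressions for this parametrization, quoted here as assumptions since they are not derived in the text; the function name and sample semi-axes are hypothetical.

```python
import math

def total_curvature_spheroid(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of K over the spheroid
    x = a sin(phi) cos(theta), y = a sin(phi) sin(theta), z = b cos(phi)."""
    dphi = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        w = (a * math.cos(phi)) ** 2 + (b * math.sin(phi)) ** 2
        gauss_curvature = b * b / (w * w)
        area_density = a * math.sin(phi) * math.sqrt(w)  # per d(phi) d(theta)
        total += gauss_curvature * area_density * dphi
    return 2 * math.pi * total  # the integrand does not depend on theta

for a, b in ((1.0, 1.0), (1.0, 3.0), (2.0, 0.5)):
    print(a, b, total_curvature_spheroid(a, b), 4 * math.pi)
```

The pointwise curvature varies strongly between the poles and the equator when $a \ne b$, yet the integral stays at $4\pi = 2\pi\chi$ with $\chi = 2$ for the sphere.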

2.4. Spacetime examples

In this section we consider some examples of spacetimes and the form of the
Einstein equation they satisfy.

2.4.1. Constant curvature spacetimes. Let $\mathbb{R}^n_\nu$, for 0 ≤ ν ≤ n, denote the
semi-Riemannian manifold $\mathbb{R}^n$ with the metric $\langle \partial/\partial x^i, \partial/\partial x^j\rangle = \epsilon_i\delta_{ij}$, where $\epsilon_i = -1$
for 1 ≤ i ≤ ν and $\epsilon_i = 1$ for ν + 1 ≤ i ≤ n. We define the following manifolds
as level sets of certain quadratic polynomials, with the metric induced from the
indicated inclusions: for n ≥ 2 and r > 0, we let

\[ \mathbb{S}^n_1(r) = \{x \in \mathbb{R}^{n+1} : -(x^0)^2 + (x^1)^2 + \cdots + (x^n)^2 = r^2\} \subset \mathbb{R}^{n+1}_1 = \mathbb{M}^{1+n}, \]

\[ \mathbb{H}^n_1(r) = \{x \in \mathbb{R}^{n+1} : -(x^0)^2 - (x^1)^2 + (x^2)^2 + \cdots + (x^n)^2 = -r^2\} \subset \mathbb{R}^{n+1}_2. \]

It is easy to see that $\mathbb{S}^n_1(r)$ is diffeomorphic to $\mathbb{R} \times \mathbb{S}^{n-1}$, which is simply
connected for n ≥ 3, while $\mathbb{H}^n_1(r)$ is diffeomorphic to $\mathbb{S}^1 \times \mathbb{R}^{n-1}$. We let $\widetilde{\mathbb{S}}^2_1(r)$
and $\widetilde{\mathbb{H}}^n_1(r)$ be the universal covering manifolds with the associated covering
metrics.
Proposition 2-27 [174, p. 228–229]. A complete, simply connected n-dimensional
Lorentzian manifold of constant sectional curvature C is isometric to one
of the following:
• $\mathbb{S}^n_1(r)$ (n ≥ 3, $C = r^{-2}$),
• $\widetilde{\mathbb{S}}^2_1(r)$ (n = 2, $C = r^{-2}$),
• $\mathbb{R}^n_1$ (C = 0),
• $\widetilde{\mathbb{H}}^n_1(r)$ ($C = -r^{-2}$).

$\mathbb{S}^4_1(r)$ is de Sitter spacetime of curvature $C = r^{-2}$, $\mathbb{H}^4_1(r)$ is anti-de Sitter
spacetime of curvature $C = -r^{-2}$, while the universal cover $\widetilde{\mathbb{H}}^4_1(r)$ is universal
anti-de Sitter spacetime.
Constant curvature manifolds are Einstein. Indeed, in a constant curvature
Lorentz four-manifold $(\mathcal{S}^4, g)$, we have $\mathrm{Ric}(g) = \tfrac14 R(g)\,g$, where R(g) is constant.
Then $\mathrm{Ric}(g) - \tfrac12 R(g)\,g + \tfrac14 R(g)\,g = 0$. One can interpret this as Einstein’s
equation with T = 0 and $\Lambda = \tfrac14 R(g)$. Thus Ric(g) = Λg, and in the examples
above, Λ > 0 corresponds to de Sitter spacetime, while Λ < 0 corresponds to
anti-de Sitter spacetime.

2.4.2. The Einstein static universe. Consider the space $\mathbb{R} \times \mathbb{S}^3$ with the product
metric $g = -c^2\,dt^2 + \mathring{g}_{\mathbb{S}^3}$, where $\mathring{g}_{\mathbb{S}^3}$ is the unit round metric on the three-sphere.
Exercise 2-28. Show that the Ricci curvature of this metric is $\mathrm{Ric}(g) = 2\mathring{g}_{\mathbb{S}^3}$.

From this we see that R(g) = 6, so the Einstein tensor is $\mathrm{Ric}(g) - \tfrac12 R(g)\,g = 3c^2\,dt^2 - \mathring{g}_{\mathbb{S}^3}$,
and $G_\Lambda(g) = (3-\Lambda)c^2\,dt^2 + (\Lambda-1)\mathring{g}_{\mathbb{S}^3}$. We identify this with the
stress-energy tensor of a perfect fluid, with fluid velocity U = ∂/∂t, density ρ and
pressure p. If we write T as a (0, 2)-tensor, we have $T = (\rho + p)c^2\,dt^2 + pg = \rho c^2\,dt^2 + p\,\mathring{g}_{\mathbb{S}^3}$.
The Einstein equation $G_\Lambda(g) = \kappa T$ is then equivalent to

\[ 3 - \Lambda = \kappa\rho, \qquad \Lambda - 1 = \kappa p. \]

For p ≥ 0, we must have Λ ≥ 1. Note that ρ + p = 2/κ and $\kappa = 8\pi G/c^4$.

2.4.3. Friedmann–Lemaître–Robertson–Walker (FLRW) spacetimes. Let I be
an interval and Σ a three-manifold. We consider a warped product metric of
the form $g = -c^2\,dt^2 + (c f(t))^2 g_0$, where $g_0$ is a Riemannian metric of constant
curvature $k_0$ on Σ and f is a positive function on I. Such metrics arise in
cosmological models which incorporate the observation that the universe seems
to be everywhere isotropic (with respect to a class of observers, which may be on
the galactic scale), which in turn implies spatial homogeneity of the manifolds
orthogonal to the trajectories of such observers (spaces of simultaneity). See
[42; 174; 218] for more discussion of how isotropy and spatial homogeneity are
translated into a metric of the above form.
We let g = ⟨ · , · ⟩, and for the rest of this section, we let U = ∂/∂t. Further,
let X, Y and Z denote tangent vectors to $\Sigma_t = \{t\} \times \Sigma$ for t ∈ I, so that
$g_0(X, Y) = (c f(t))^{-2}\langle X, Y\rangle$.
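Exercise 2-29. Verify the following formulae for the Riemann tensor of g (cf.
Exercise 1-15):
\[ \begin{aligned}
R(X, Y, Z) &= c^{-2}\left[\left(\frac{f'(t)}{f(t)}\right)^{2} + \frac{k_0}{(f(t))^2}\right]\big(\langle Y, Z\rangle X - \langle X, Z\rangle Y\big),\\
R(X, U, U) &= -\frac{f''(t)}{f(t)}\,X,\\
R(X, Y, U) &= 0,\\
R(X, U, Y) &= -c^{-2}\,\frac{f''(t)}{f(t)}\,\langle X, Y\rangle\,U.
\end{aligned} \]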
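We can now readily find the Ricci and scalar curvatures of g:
\[ \begin{aligned}
\mathrm{Ric}(U, U) &= -3\,\frac{f''(t)}{f(t)} = 3\,\frac{f''(t)}{f(t)}\,\langle U, U\rangle c^{-2},\\
\mathrm{Ric}(U, X) &= 0,\\
\mathrm{Ric}(X, Y) &= \left[2\left(\frac{f'(t)}{f(t)}\right)^{2} + 2\,\frac{k_0}{(f(t))^2} + \frac{f''(t)}{f(t)}\right]\langle X, Y\rangle c^{-2},\\
R(g) &= 6\left[\left(\frac{f'(t)}{f(t)}\right)^{2} + \frac{k_0}{(f(t))^2} + \frac{f''(t)}{f(t)}\right]c^{-2}.
\end{aligned} \]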
The stress-energy tensor that corresponds to this metric can be found using
the Einstein equation:
\[ T = \frac{c^4}{8\pi G}\left(\mathrm{Ric}(g) - \frac12 R(g)\,g + \Lambda g\right). \]
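Exercise 2-30. Verify the following formulas.
\[ \begin{aligned}
T(U, X) &= 0,\\
T(X, Y) &= -\frac{c^2}{8\pi G}\left[\left(\frac{f'(t)}{f(t)}\right)^{2} + \frac{k_0}{(f(t))^2} + 2\,\frac{f''(t)}{f(t)} - \Lambda c^2\right]\langle X, Y\rangle,\\
T(U, U) &= \frac{c^4}{8\pi G}\left[3\left(\frac{f'(t)}{f(t)}\right)^{2} + 3\,\frac{k_0}{(f(t))^2} - \Lambda c^2\right].
\end{aligned} \]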
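We can try to identify this with a perfect fluid, which has stress tensor
$T = c^{-2}(\rho + p)\,U^\flat \otimes U^\flat + pg$. For instance, $T(U, U) = c^2\rho$, T(U, X) = 0 and
T(X, Y) = p⟨X, Y⟩. We can identify the stress-energy tensor of the warped
product as that of a perfect fluid, with
\[ \begin{aligned}
\rho &= \frac{c^2}{8\pi G}\left[3\left(\frac{f'(t)}{f(t)}\right)^{2} + 3\,\frac{k_0}{(f(t))^2} - \Lambda c^2\right]
      = \frac{c^4}{8\pi G}\left[3\left(\frac{f'(t)}{c f(t)}\right)^{2} + 3\,\frac{k_0}{(c f(t))^2} - \Lambda\right],\\
p &= -\frac{c^2}{8\pi G}\left[\left(\frac{f'(t)}{f(t)}\right)^{2} + \frac{k_0}{(f(t))^2} + 2\,\frac{f''(t)}{f(t)} - \Lambda c^2\right]
   = \frac{c^4}{8\pi G}\left[-\left(\frac{f'(t)}{c f(t)}\right)^{2} - \frac{k_0}{(c f(t))^2} - 2\,\frac{f''(t)}{c^2 f(t)} + \Lambda\right].
\end{aligned} \]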

Note that $(4\pi G/c^4)(\rho + 3p) - \Lambda = -3 f''(t)/(c^2 f(t))$. It is also easy to derive
the following equation from the above: $\rho'(t) = -3(\rho(t) + p(t))\,f'(t)/f(t)$;
it is basically the vanishing of the t-component of $\operatorname{div}_g T$, and so expresses
conservation of energy.
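The conservation identity can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the scale factor f(t) = 1 + t², the constants G, c, Λ, k0 and all function names are arbitrary illustrative choices. It evaluates ρ and p from the formulas above and compares a finite-difference derivative of ρ with −3(ρ + p) f′/f.

```python
import math

# Hypothetical sample constants, chosen only for illustration.
G, c, LAM, K0 = 1.0, 2.0, 0.3, 1.0

def f(t):   return 1.0 + t * t   # sample scale factor (an assumption)
def fp(t):  return 2.0 * t       # f'
def fpp(t): return 2.0           # f''

def rho(t):
    a = (fp(t) / f(t)) ** 2
    b = K0 / f(t) ** 2
    return c**2 / (8 * math.pi * G) * (3 * a + 3 * b - LAM * c**2)

def p(t):
    a = (fp(t) / f(t)) ** 2
    b = K0 / f(t) ** 2
    d = fpp(t) / f(t)
    return -(c**2) / (8 * math.pi * G) * (a + b + 2 * d - LAM * c**2)

def conservation_residual(t, h=1e-5):
    """rho'(t) + 3 (rho + p) f'/f, with rho' taken by central differences."""
    drho = (rho(t + h) - rho(t - h)) / (2 * h)
    return drho + 3 * (rho(t) + p(t)) * fp(t) / f(t)
```

The residual should vanish up to finite-difference error for any smooth positive f, since the identity does not use the field equations beyond the definitions of ρ and p.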
When f(t) is constant, for example in the Einstein static universe, the geometry
of the spatial slices is not dynamic. In this case, if Λ = 0, then ρ and p must
have opposite signs, which is not appealing in terms of standard physics.
The Friedmann cosmological models are the cases of the above that correspond
to dust models, so that p = 0, with $f'(t_0)/f(t_0) > 0$ for some time $t_0 \in I$. This
would model situations when the energy density dominates pressure, as might
be the present situation in the universe (but not, say, near the Big Bang). When
p = 0, we obtain the first-order linear equation $\rho'(t) + 3\rho(t)\,f'(t)/f(t) = 0$, so
by integrating we obtain $\rho f^3 =: m$, a constant. Substituting into the equation for
ρ above, we get the Friedmann equation

\[ \frac{8\pi G m}{3c^2}\cdot\frac{1}{f(t)} = (f'(t))^2 + k_0 - \tfrac13\Lambda c^2 (f(t))^2. \]

Note that there is a critical value of Λ for which one can achieve a static model
(constant f(t)) when p = 0, given by $\Lambda c^2 = k_0^3\,(4\pi G m/c^2)^{-2}$.
We let Λ = 0 and $A = 8\pi G m/(3c^2)$, so that the Friedmann equation becomes

\[ \frac{A}{f(t)} = (f'(t))^2 + k_0. \]

One can solve this for each possible sign of $k_0$. For $k_0 = 0$, we get, assuming
f(t) > 0 and f′(t) ≠ 0, $\sqrt{f(t)}\,f'(t) = \pm\sqrt{A}$, so if we let f(0) = 0, then
$f(t) = Ct^{2/3}$, where C > 0 is constant. This describes a universe that expands
from an initial “Big Bang” singularity (note that f′(t) → +∞ as t ↘ 0). If $k_0 = 1$,
the solution graph can be written in parametrized form as $t = \tfrac12 A(u - \sin u)$,
$f = \tfrac12 A(1 - \cos u)$, which describes a cycloid. The geometry here expands from
f(0) = 0 to a maximum value, then recollapses at t = πA (u = 2π), the “Big
Crunch”. Similarly, if $k_0 = -1$ and ρ > 0, then $(f'(t))^2 = 1 + A/f(t) > 1$, so
that f(t) keeps growing without bound, and the spatial slices expand in size over
time. In this case the solution graph can be parametrized as $t = \tfrac12 A(\sinh u - u)$,
$f = \tfrac12 A(\cosh u - 1)$.
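As a quick sanity check of the three solution families for Λ = 0, the sketch below (an illustration only; the function names are made up here) evaluates f and f′ = (df/du)/(dt/du) from the parametrizations and measures the residual of A/f = (f′)² + k0. For the flat case it uses C = (9A/4)^{1/3}, which is what √f f′ = √A forces when f = Ct^{2/3}.

```python
import math

def friedmann_residual(f, fprime, A, k0):
    """Residual of the Lambda = 0 Friedmann equation A/f = (f')^2 + k0."""
    return A / f - (fprime ** 2 + k0)

def closed_model(u, A):
    """k0 = +1: cycloid t = A(u - sin u)/2, f = A(1 - cos u)/2, for 0 < u < 2*pi."""
    f = 0.5 * A * (1 - math.cos(u))
    fprime = math.sin(u) / (1 - math.cos(u))      # (df/du) / (dt/du)
    return f, fprime

def open_model(u, A):
    """k0 = -1: t = A(sinh u - u)/2, f = A(cosh u - 1)/2, for u > 0."""
    f = 0.5 * A * (math.cosh(u) - 1)
    fprime = math.sinh(u) / (math.cosh(u) - 1)
    return f, fprime

def flat_model(t, A):
    """k0 = 0: f = C t^(2/3) with C = (9A/4)^(1/3), for t > 0."""
    C = (9 * A / 4) ** (1 / 3)
    return C * t ** (2 / 3), (2 / 3) * C * t ** (-1 / 3)
```

For example, `friedmann_residual(*closed_model(1.3, 2.0), 2.0, 1)` should be zero up to rounding.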

2.4.4. Schwarzschild spacetime. We consider the vacuum Einstein equation


Ric(g) = 0. This is a nonlinear system of second-order partial differential
equations for the metric components. One can reduce this to ordinary differential
equations by imposing an ansatz of spherical symmetry, and argue (see the proof
of Birkhoff’s theorem in [112], [161, Section 32.2], or [218, Section 6.1]) that in
spherical symmetry, the (n+1)-dimensional spacetime metric has the form

\[ g = -u(r)\,dt^2 + v(r)\,dr^2 + r^2\,\mathring{g}_{\mathbb{S}^{n-1}}, \]

where g̊Sn−1 is the unit round metric on the sphere. (We take n ≥ 3, since any
Ricci-flat surface or three-manifold is flat; see Exercise 2-11.) So a spherically
symmetric metric is also static (see Section 2.4.5, and compare [218, Chapter 6;
174, Chapter 12]), where the vector field ∂/∂t is a Killing field, timelike for
u(r ) > 0, which is orthogonal to the orbits of the isometry group.
We impose the Einstein equation Ric(g) = 0 to determine g. We will sketch
this here, referring to the works just cited, as well as, e.g., [182, Chapter 3].
We will use results from Exercises 1-14 and 1-15, namely the formulas for the
curvature of a warped product, with the tr-plane as the base B with metric
$g_B = -u(r)\,dt^2 + v(r)\,dr^2$ and with fiber F the round sphere $(\mathbb{S}^{n-1}, \mathring{g}_{\mathbb{S}^{n-1}})$, with
warping function r.
We apply Exercises 1-14 and 1-15 to find $\mathrm{Ric}_g(X, Y)$ for vectors X and Y
tangent to the base, which yields
\[ \left[K\big(-u(r)\,dt^2 + v(r)\,dr^2\big) - \frac{n-1}{r}\cdot\left(-\frac12\,\frac{u'(r)}{v(r)}\,dt^2 - \frac12\,\frac{v'(r)}{v(r)}\,dr^2\right)\right](X, Y) \]
where
\[ K = -\frac12\,\frac{u''(r)}{u(r)v(r)} + \frac14\,\frac{(u'(r))^2}{(u(r))^2 v(r)} + \frac14\,\frac{u'(r)v'(r)}{u(r)(v(r))^2}. \]
For this to vanish for all X and Y is equivalent to
\[ -K u(r) + \frac{n-1}{2r}\,\frac{u'(r)}{v(r)} = 0 = K v(r) + \frac{n-1}{2r}\,\frac{v'(r)}{v(r)}. \tag{2.4.1} \]
Thus a necessary condition for Ric(g) = 0 is that $\dfrac{u'(r)}{u(r)} = -\dfrac{v'(r)}{v(r)}$, i.e., u(r)v(r)
is constant.
Exercise 2-31. Use the formula for K, as well as u(r)v(r) = C, to show that the
condition (2.4.1) above reduces to $r u''(r) + (n-1)u'(r) = 0$. Show the general
solution is given by $u(r) = c_1 + c_2/r^{n-2}$, where $c_1$ and $c_2$ are constants.
Thus we can achieve the Ricci-flat condition along base directions with
$u(r) = c_1 + c_2/r^{n-2}$ and v(r) = C/u(r). From Exercise 1-15, it follows that we just
have to consider the fiber directions, for which we note $\mathrm{Ric}(g_F) = (n-2)\,\mathring{g}_{\mathbb{S}^{n-1}}$,
since the fiber is the unit round sphere of dimension n − 1.
Furthermore, as in Exercise 1-15, the meaning of $\langle\nabla r, \nabla r\rangle = (v(r))^{-1}$ is the
same computed in $g_B$ or g. Moreover from Exercise 1-14 we have
\[ \mathrm{Hess}_{g_B} r = -\frac12\,\frac{u'(r)}{v(r)}\,dt^2 - \frac12\,\frac{v'(r)}{v(r)}\,dr^2, \]
so
\[ \Delta_{g_B} r = \mathrm{tr}_{g_B}(\mathrm{Hess}_{g_B} r) = \frac12\left(\frac{u'(r)}{u(r)v(r)} - \frac{v'(r)}{(v(r))^2}\right). \]
Therefore, for vectors V and W tangent to the fiber, we find $\mathrm{Ric}_g(V, W)$ is
precisely
\[ \mathring{g}_{\mathbb{S}^{n-1}}(V, W)\left\{(n-2) - \left[\frac{r}{2}\left(\frac{u'(r)}{u(r)v(r)} - \frac{v'(r)}{(v(r))^2}\right) + \frac{n-2}{v(r)}\right]\right\}. \]
Exercise 2-32. Show that under the condition u(r)v(r) = C, the vanishing of
$\mathrm{Ric}_g(V, W)$ for all V and W tangent to the fiber is tantamount to
\[ r u'(r) + (n-2)u(r) = C(n-2). \]
Solve this to obtain $u(r) = C + c_3/r^{n-2}$ for some constant $c_3$.
Together with the result of Exercise 2-31, we see that Ric(g) = 0 boils down
to (letting α satisfy $c_3 = -\alpha C$)
\[ u(r) = C + \frac{c_3}{r^{n-2}} = C\left(1 - \frac{\alpha}{r^{n-2}}\right) \quad\text{and}\quad v(r) = \left(1 - \frac{\alpha}{r^{n-2}}\right)^{-1}, \]
for C > 0 and α arbitrary constants. We remark that we can rescale the time
variable by a constant factor (changing time units effectively adjusts the numerical
value of the speed of light c), to normalize the constant value of u(r)v(r). If
we take $C = c^2$, we can write the Schwarzschild spacetime metric $\bar{g}_S$, for any
constant α, as
\[ \bar{g}_S = -\left(1 - \frac{\alpha}{r^{n-2}}\right)c^2\,dt^2 + \left(1 - \frac{\alpha}{r^{n-2}}\right)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^{n-1}}. \]
Note that for large r, $\bar{g}_S \approx -c^2\,dt^2 + g_{\mathbb{E}^n} = -(dx^0)^2 + \delta_{ij}\,dx^i\,dx^j$, the Minkowski
metric. So not only must the spacetime be static, which follows directly from the
ansatz of spherical symmetry, but in the far field it must approach Minkowski
spacetime, a consequence of the symmetry together with the Ricci-flat condition
from the vacuum Einstein equation.
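The reduction to u(r) = C(1 − α/r^{n−2}), v(r) = (1 − α/r^{n−2})^{−1} can be checked numerically against the curvature formulas quoted above. The sketch below is only an illustration (the function name and default parameter values are arbitrary): it evaluates K from its displayed formula using exact derivatives of u and v, then returns the two base-direction expressions in (2.4.1) and the bracketed fiber-direction factor, all of which should vanish.

```python
def schwarzschild_check(r, n=4, C=1.5, alpha=0.8):
    """Return (dt^2 condition, dr^2 condition, fiber factor); all should be ~0.

    Assumes r is large enough that 1 - alpha / r**(n-2) > 0.
    """
    k = n - 2
    w = 1 - alpha / r**k
    u, v = C * w, 1 / w
    dw = alpha * k / r**(k + 1)                 # w'(r)
    du = C * dw                                 # u'(r)
    d2u = -C * alpha * k * (k + 1) / r**(k + 2)  # u''(r)
    dv = -dw / w**2                             # v'(r)
    # Gauss curvature K of the base metric -u dt^2 + v dr^2
    K = (-0.5 * d2u / (u * v)
         + 0.25 * du**2 / (u**2 * v)
         + 0.25 * du * dv / (u * v**2))
    base_t = -K * u + (n - 1) * du / (2 * r * v)
    base_r = K * v + (n - 1) * dv / (2 * r * v)
    fiber = (n - 2) - 0.5 * r * (du / (u * v) - dv / v**2) - (n - 2) / v
    return base_t, base_r, fiber
```

Changing `n`, `C` or `alpha` (keeping 1 − α/r^{n−2} positive) should leave all three values at rounding level.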
We want to identify α as proportional to the mass m of the spacetime, even-
tually settling on units for which α = 2m. We motivate this in dimension four,
i.e., n = 3, and we restrict the discussion to this dimension until further notice.
The Schwarzschild metric can represent the gravitational field in the exterior
of a non-rotating spherically symmetric massive body, and in the weak field
regime (large r ), the effect of the gravitational field on test particles (ascer-
tained by finding the geodesics in the Schwarzschild metric) is roughly that of a
Newtonian gravitational field for a point mass m. In fact, note that if we write
$(\bar{g}_S)_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $h_{00} = 2Gm/(c^2 r)$ and $dx^0 = c\,dt$, we can as on p. 63
identify a gravitational potential $\Phi = -\tfrac12 c^2 h_{00} = -Gm/r$, in agreement with
Newtonian theory for the field of a body of mass m. The spacetime so obtained
is then
\[ \bar{g}_S = -\left(1 - \frac{2Gm}{c^2 r}\right)c^2\,dt^2 + \left(1 - \frac{2Gm}{c^2 r}\right)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2}. \]
For simplicity, we take units for which G = 1 and c = 1, so that
\[ \bar{g}_S = -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2}. \tag{2.4.2} \]
For m = 0 we recover Minkowski spacetime. For m > 0, there are metric
components that are singular at r = 2m and at r = 0, so that we have spacetime
regions with 0 < r < 2m and r > 2m, respectively. It turns out the geometry does
not become singular as r → 2m, and in fact it can be extended across r = 2m,
as we will investigate and make precise in the next section (2.4.4.1). However,
as r ↘ 0, there is a curvature blowup, as we conclude in Remark 2-35; see also
[161, p. 822].
Exercise 2-33. (i) Show that the sectional curvature K of the metric
\[ -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 \]
is $K = 2m/r^3$. (ii) Let I ⊂ (0, +∞) be an open interval, and consider a
(Lorentzian or Riemannian) metric on $I \times \mathbb{S}^2$ of the form $g = h(r)\,dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2}$.
Show that $\mathrm{Ric}_g(\partial/\partial r, \partial/\partial r) = h'(r)/(r\,h(r))$. Use this to show that, if ν is the
unit vector $\nu = \sqrt{|1 - 2m/r|}\;\partial/\partial r$, then $\mathrm{Ric}_{g_S}(\nu, \nu) = -2m/r^3$ for the induced
metric $g_S = (1 - 2m/r)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2}$ (0 < r ≠ 2m) on a constant t-slice in
Schwarzschild. Note that for m > 0, the curvatures computed here have a finite
limit as r → 2m.
Exercise 2-34. Find the Christoffel symbols $\Gamma^j_{i0}$ of the Schwarzschild metric $\bar{g}_S$,
where i and j index spherical spatial coordinates and 0 is the t-index. Conclude
that the second fundamental form II of a constant t-slice of Schwarzschild
vanishes.
Remark 2-35. While the spacetime Schwarzschild metric ḡ S has vanishing Ricci
curvature, the Riemannian Schwarzschild metric g S on a constant t-slice does
not. The scalar curvature of g S does vanish, as we will see later in the section
(you can also just compute it directly). The norm of the ambient curvature tensor
for ḡ S blows up as r ↘ 0, by direct calculation, or by using the Gauss equation
(Proposition 5-5) together with II = 0 and RicgS (ν, ν) = −2m/r 3 for a constant
t-slice.
For m < 0, the metric $g_S$ on a constant t-slice is defined for all r > 0 and is
Riemannian. There are radial geodesics for this metric $g_S$, which are then easily
seen to be spacelike geodesics for $\bar{g}_S$, since II = 0. If $r_0 > 0$, then for m < 0,
$\int_0^{r_0} (1 - 2m/r)^{-1/2}\,dr < +\infty$. Thus a radial geodesic going from $r_0$ toward r = 0
has finite length, so that both $(\mathbb{R}^3 \setminus \{0\}, g_S)$ and $(\mathbb{R} \times (\mathbb{R}^3 \setminus \{0\}), \bar{g}_S)$ are geodesically
incomplete. Neither metric can be smoothly extended in order to extend the
geodesic, due to the curvature blowup as r ↘ 0. As we will see below, for m > 0,
while the geometry of the spacetime metric ḡ S is well-behaved near r = 2m, the
metric is causally geodesically incomplete: there are timelike geodesics which
approach r = 0 in finite proper time (as well as null geodesics which approach
r = 0 at finite affine parameter), and along which there is curvature blowup.
We collect a few formulas that will be useful here and in the next section, and
make one more observation before moving on.
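Exercise 2-36. By direct calculation, either computing the Christoffel symbols
$\Gamma^\mu_{\rho\sigma} = \tfrac12 g^{\mu\nu}(g_{\nu\sigma,\rho} + g_{\rho\nu,\sigma} - g_{\rho\sigma,\nu})$ or by using the Cartan equations (Exercise
1-13), show that in the Schwarzschild metric $\bar{g}_S$ we have
\[ \begin{aligned}
\nabla_{\frac{\partial}{\partial t}}\frac{\partial}{\partial t} &= \frac{m}{r^2}\left(1 - \frac{2m}{r}\right)\frac{\partial}{\partial r},\\
\nabla_{\frac{\partial}{\partial t}}\frac{\partial}{\partial r} &= \frac{m}{r^2}\left(1 - \frac{2m}{r}\right)^{-1}\frac{\partial}{\partial t} = \nabla_{\frac{\partial}{\partial r}}\frac{\partial}{\partial t},\\
\nabla_{\frac{\partial}{\partial r}}\frac{\partial}{\partial r} &= -\frac{m}{r^2}\left(1 - \frac{2m}{r}\right)^{-1}\frac{\partial}{\partial r}.
\end{aligned} \]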
Consider the unit timelike vector field $U = (1 - 2m/r)^{-1/2}\,\partial/\partial t$, for r > 2m.
Integral curves γ(τ) of this vector field are Schwarzschild observers. In these
coordinates, their spatial positions are fixed. From the preceding exercise, we
note that $D\gamma'(\tau)/d\tau = \nabla_U U = (m/r^2)\,\partial/\partial r$. Thus, for m ≠ 0, such observers
are not in free fall. In fact, as we have seen earlier (cf. Section 1.3.2.1), if
$m_0$ is the rest mass of the observer, one might interpret $f = (m_0 m/r^2)\,\partial/\partial r$
as the force on the observer. As the observer has fixed spatial coordinates in
this chart, one might interpret this as the force required to oppose the effect of
gravitation, to keep the observer from geodesic motion (free fall); moreover,
∂/∂r is approximately a unit vector if r is large, so the interpretation is consistent
with Newtonian gravity (with G = 1), at least in the far field (weak field limit).
2.4.4.1. Kruskal extension. We have seen that the null structure plays a key role
in the geometry of spacetime, so we study the behavior of null geodesics near
r = 2m > 0. Fix a point ω0 on the sphere, and consider a curve γ (s) which is
given in coordinates by (t (s), r (s), ω0 ). Thus γ ′ (s) = t ′ (s)∂/∂t + r ′ (s)∂/∂r .
Therefore, using Exercise 2-36, we obtain
\[ \begin{aligned}
\gamma''(s) ={}& \left[t''(s) + 2t'(s)r'(s)\,\frac{m}{r^2}\left(1 - \frac{2m}{r}\right)^{-1}\right]\frac{\partial}{\partial t}\\
&+ \left[r''(s) + (t'(s))^2\,\frac{m}{r^2}\left(1 - \frac{2m}{r}\right) - (r'(s))^2\,\frac{m}{r^2}\left(1 - \frac{2m}{r}\right)^{-1}\right]\frac{\partial}{\partial r}.
\end{aligned} \]
We observe that the ∂/∂t-component of the geodesic equation γ′′(s) = 0 is first-order
linear in t′(s), and in fact can be written $\frac{d}{ds}\big[t'(s)(1 - 2m/r)\big] = 0$, which
reduces to the condition that t′(s)(1 − 2m/r) be constant.


Note that γ is null if and only if $(t'(s))^2(1 - 2m/r)^2 = (r'(s))^2$, in which case
\[ \gamma''(s) = \left[t''(s) + 2t'(s)r'(s)\,\frac{m}{r^2}\left(1 - \frac{2m}{r}\right)^{-1}\right]\frac{\partial}{\partial t} + r''(s)\,\frac{\partial}{\partial r}. \]
If we want γ (s) to be a null geodesic, we must require r ′′ (s) = 0, i.e., r ′ (s) is a
constant, which we can assume to equal 1 by rescaling the affine parameter s.
We then see that
\[ t'(s) = \pm r'(s)\left(1 - \frac{2m}{r}\right)^{-1} = \pm\frac{r}{r - 2m} = \pm\left(1 + \frac{2m}{r - 2m}\right) \]
from the null condition, which we can integrate to get t = C ± (r + 2m log |r − 2m|),
for some constant C. We thus obtain the solutions to the null geodesic equation
for γ(s), with r(s) linear in s. Note that for m > 0, as $r(s) \to 2m^{\pm}$, |t(s)| → ∞:
so, in the tr-plane, these null geodesics are asymptotic to r = 2m over a bounded
affine parameter interval; compare this to the case m < 0, for which t(s) remains
bounded as r(s) ↘ 0, again over a bounded affine parameter interval.
We can analyze this in terms of lightcones. We again fix a point on the sphere,
and study the lightcones as r → 2m. Null vectors of the form a ∂/∂t + b ∂/∂r
must satisfy $a/b = \pm(1 - 2m/r)^{-1}$, which approaches infinity as r → 2m. As
we saw above, a/b = dt/dr along a null geodesic path. Thus the lightcones
seem to be pinching in these coordinates.
To see how the geometry is not ill-behaved at r = 2m > 0, we seek coordinates
in which the lightcones are better behaved. They are suggested by the null
geodesics, whose form is $t = \pm r_*$ (up to an additive constant), where
$r_* := r + 2m\log|r/(2m) - 1|$ is known as the Regge–Wheeler coordinate. Note that
$dr_*/dr = (1 - 2m/r)^{-1}$. If we use this to define new coordinates via incoming
and outgoing null directions, we have, say, $u = t - r_*$ and $v = t + r_*$, so that
$du = dt - (1 - 2m/r)^{-1}dr$, $dv = dt + (1 - 2m/r)^{-1}dr$, and thus
\[ -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 = -\left(1 - \frac{2m}{r}\right)du\,dv. \]
The metric is now adapted to the structure of null geodesics, but it still has
singular components at r = 2m. Note that $r_* = \tfrac12(v - u)$, so that
\[ e^{\frac{v-u}{4m}} = |F(r)|, \quad\text{with}\quad F(r) := e^{\frac{r}{2m}}\left(\frac{r}{2m} - 1\right). \]
It is a simple exercise to show that F is a diffeomorphism between (0, +∞) and
(−1, +∞), mapping (0, 2m) to (−1, 0) and (2m, +∞) to (0, +∞).
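With this in mind, we let $U = e^{-\frac{u}{4m}}$ and $V = e^{\frac{v}{4m}}$, so that for r > 2m,
UV = F(r). The portion of the Schwarzschild spacetime for r > 2m corresponds,
with $r_*$ taking on all values in $\mathbb{R}$, to {(U, V) : U > 0, V > 0}. Moreover, we
compute, since $dU = -\frac{1}{4m}e^{-\frac{u}{4m}}\,du$ and $dV = \frac{1}{4m}e^{\frac{v}{4m}}\,dv$, so that $du\,dv = -16m^2\,dU\,dV/(UV)$, that
\[ \begin{aligned}
-\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 &= -\left(1 - \frac{2m}{r}\right)du\,dv\\
&= 16m^2\,\frac{1 - \frac{2m}{r}}{e^{\frac{r}{2m}}\left(\frac{r}{2m} - 1\right)}\,dU\,dV\\
&= \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV.
\end{aligned} \]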
Let $F^{-1}$ be the inverse function for F. We can construe $(U, V) \mapsto F^{-1}(UV)$
as a smooth function on the set {(U, V) : UV > −1}, for which we note that
$\{(U, V) : F^{-1}(UV) > 2m\} = \{(U, V) : UV > 0\}$, one component of which
we just used above to map to r > 2m in Schwarzschild. Likewise we have
$\{(U, V) : 0 < F^{-1}(UV) < 2m\} = \{(U, V) : -1 < UV < 0\}$, one component
of which we will map to the region 0 < r < 2m in Schwarzschild, namely
{(U, V) : UV > −1, U < 0, V > 0}. Note that as r increases in the interval
(0, 2m), $r_*$ decreases over the interval (−∞, 0); if we now let $U = -e^{-\frac{u}{4m}}$ and
$V = e^{\frac{v}{4m}}$, we again have in this region UV = F(r), and also again we compute
with $dU = \frac{1}{4m}e^{-\frac{u}{4m}}\,du$ and $dV = \frac{1}{4m}e^{\frac{v}{4m}}\,dv$, that
\[ -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 = -\left(1 - \frac{2m}{r}\right)du\,dv = \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV. \]
Again, $r = F^{-1}(UV)$ can be construed as a smooth function of (U, V)
in {(U, V) : UV > −1}, taking on all values in (0, +∞): r = 2m does not
correspond to a singularity of this metric. We have the original two disconnected
regions r > 2m and 0 < r < 2m embedded isometrically into a smooth manifold,
connected along the axis U = 0, V > 0. Notice that the level sets of r, which
correspond to varying t, are hyperbolae in the UV-plane, with one exception:
r = 2m corresponds to the axes, UV = 0. In any case, we see that there is not a
singularity at r = 2m in the spacetime metric, which smoothly extends over this
set, not only connecting r > 2m with 0 < r < 2m, but also giving rise to two
other related regions as well. They are topologically connected to each other,
but not all regions are causally connected, as will soon be clear.
If we want to take the metric out of null form, we can let $T = \tfrac12(V - U)$, and
$R = \tfrac12(V + U)$, and on $\{(T, R) : R^2 - T^2 > -1\}$, we have
\[ -\left(1 - \frac{2m}{r}\right)dt^2 + \left(1 - \frac{2m}{r}\right)^{-1}dr^2 = \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV = \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,(-dT^2 + dR^2), \]
where $r = F^{-1}(R^2 - T^2)$; the level sets of r are hyperbolae, and r = 2m now
corresponds to R = ±T, which is clearly a null hypersurface. The lightcones
behave uniformly in these coordinates, which of course is how the coordinates
were constructed.
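The original region r > 2m corresponds to R + T = V > 0 and R − T = U > 0,
i.e., R > |T|, and in this region, we can relate the coordinates to the original
(t, r) coordinates, as follows:
\[ \begin{aligned}
T &= \tfrac12(V - U) = \tfrac12\left(e^{\frac{t + r_*}{4m}} - e^{-\frac{t - r_*}{4m}}\right) = e^{\frac{r_*}{4m}}\sinh\frac{t}{4m} = \left(\frac{r}{2m} - 1\right)^{\frac12} e^{\frac{r}{4m}}\sinh\frac{t}{4m},\\
R &= \tfrac12(V + U) = \left(\frac{r}{2m} - 1\right)^{\frac12} e^{\frac{r}{4m}}\cosh\frac{t}{4m}.
\end{aligned} \]
The region 0 < r < 2m corresponds to R + T > 0, R − T < 0, i.e., T > |R|,
with $R^2 - T^2 > -1$, with coordinates related to the (t, r) coordinates by (watch
the log term in $r_*$)
\[ \begin{aligned}
T &= \tfrac12(V - U) = \tfrac12\left(e^{\frac{t + r_*}{4m}} + e^{-\frac{t - r_*}{4m}}\right) = e^{\frac{r_*}{4m}}\cosh\frac{t}{4m} = \left(1 - \frac{r}{2m}\right)^{\frac12} e^{\frac{r}{4m}}\cosh\frac{t}{4m},\\
R &= \tfrac12(V + U) = \left(1 - \frac{r}{2m}\right)^{\frac12} e^{\frac{r}{4m}}\sinh\frac{t}{4m}.
\end{aligned} \]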
We note several things about the causal structure (Figure 5). Spacetime has a
time orientation given by ∂/∂T. Because the lightcones in the (T, R)-plane are
at 45° from the T-axis, we see that no causal curve from the region $E_2$ given
by R < −|T| can enter the region $E_1$, where R > |T| (the original r > 2m
region). Moreover, any ingoing (to the future) radial null geodesic starting
from $E_1$ will cross R = |T| (r = 2m), entering into the region $B_1$ (the original
0 < r < 2m region), where T > |R|, $R^2 - T^2 > -1$, and will inevitably “reach”
r = 0 ($R^2 - T^2 = -1$) at a finite affine parameter. In fact, any causal curve
emanating from $B_1$ stays within this region to the future, and must reach r = 0
at finite proper time (or affine parameter). This is a simple consequence of the
structure of lightcones in the (T, R)-plane, and recall that a null geodesic has
affine parameter s with r′(s) constant. The final region $B_2$ is dual to $B_1$, and is
given by T < −|R|, $R^2 - T^2 > -1$; future-directed causal curves starting from
$B_2$ can enter $E_1$, say, but no future-directed causal curve can go from $E_1$ to $B_2$.
Interpreting causal curves in terms of signals, one can interpret $B_1$ as a black
hole and $B_2$ as a white hole.

[Figure 5. Schwarzschild black hole: Kruskal extension. Axes T and R; regions E1, E2, B1, B2; the lines r = 2m and the curves r = 0.]

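2.4.4.2. Isotropic coordinates. The spatial slice at constant t in the Schwarzschild
metric is actually conformally flat. To see this, we perform a monotone increasing
change in the radial variable, from r to r̃. We have
\[ \left(1 - \frac{2m}{r}\right)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2} = \Bigg[\underbrace{\left(1 - \frac{2m}{r}\right)^{-\frac12}\frac{dr}{d\tilde{r}}}\Bigg]^2 d\tilde{r}^2 + \left(\frac{r}{\tilde{r}}\right)^2\tilde{r}^2\,\mathring{g}_{\mathbb{S}^2}. \]

We would like the expression in braces to equal r/r̃. This gives
\[ \int\frac{d\tilde{r}}{\tilde{r}} = \int\frac{dr}{\sqrt{r^2 - 2mr}} = \int\frac{dr}{\sqrt{(r - m)^2 - m^2}}. \]

We can integrate this with the substitution r − m = m cosh w to get $w = \log\tilde{r} + \tilde{C}$,
or $e^w = C\tilde{r}$. Thus
\[ r = m + m\cosh w = m + \frac{m}{2}\left(C\tilde{r} + \frac{1}{C\tilde{r}}\right), \]
and so $\dfrac{r}{\tilde{r}} = \dfrac{mC}{2} + \dfrac{m}{\tilde{r}} + \dfrac{m}{2C\tilde{r}^2}$. If we let C = 2/m, we get
\[ \frac{r}{\tilde{r}} = \left(1 + \frac{m}{2\tilde{r}}\right)^2. \]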
[Figure 6. Riemannian Schwarzschild metric of positive mass. Labels: r̃ → ∞; r̃-constant spheres; minimal sphere r̃ = m/2; r̃ → 0.]

Thus the Schwarzschild metric can also be written
\[ \bar{g}_S = -\left(\frac{1 - \frac12 m/\tilde{r}}{1 + \frac12 m/\tilde{r}}\right)^2 dt^2 + \left(1 + \frac{m}{2\tilde{r}}\right)^4\big(d\tilde{r}^2 + \tilde{r}^2\,\mathring{g}_{\mathbb{S}^2}\big). \tag{2.4.3} \]

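Of course the Euclidean metric $g_{\mathbb{E}^3}$ on $\mathbb{R}^3$ is written in standard spherical
coordinates with r̃ = |x|, $x \in \mathbb{R}^3$, as $g_{\mathbb{E}^3} = d\tilde{r}^2 + \tilde{r}^2\,\mathring{g}_{\mathbb{S}^2}$, so we see that the constant
time slice is conformally Euclidean:
\[ g_S = \left(1 + \frac{m}{2|x|}\right)^4 g_{\mathbb{E}^3}. \]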
For m > 0, the conformal metric is defined for all x ≠ 0. For m < 0, r ↘ 0
corresponds to r̃ ↘ −m/2. The manifold $(\mathbb{R}^3 \setminus \{x : |x| \le -\tfrac{m}{2}\}, g_S)$ is incomplete,
by the same argument given earlier, and cannot be extended over |x| = −m/2: the
$g_S$-areas of the spheres {|x| = r} tend to 0 as r ↘ −m/2 (Exercise 2-49), and radial
geodesics from $r = r_0 > -\tfrac{m}{2}$ have finite length as r ↘ −m/2, but the curvature
blowup precludes being able to add a point to complete the metric, as is done in
the case m = 0 for the Euclidean metric.
For m > 0, the values r > 2m correspond to r̃ > m/2, and the set where 0 < r̃ ≤ m/2
is in the extended Schwarzschild spacetime developed above. The set where
t = 0 and r̃ = m/2 is a round two-sphere, which is actually totally geodesic inside
the three-slice (Figure 6). We note that as r̃ = |x| → ∞, $g_S$ approaches the
Euclidean metric. What is not immediately obvious is that the same can be said
of the geometry as r̃ ↘ 0, so that the metric is complete. This follows from the
following exercise.

Exercise 2-37. There is an isometric inversion of $(\mathbb{R}^3 \setminus \{0\}, g_S)$ through the
two-sphere r̃ = m/2, given by $\tilde{r} \mapsto (m/2)^2/\tilde{r}$, which fixes this two-sphere, and maps
the set where r̃ > m/2 isometrically onto the set where 0 < r̃ < m/2. Conclude that
the fixed two-sphere is totally geodesic.
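The change of radial variable and the inversion can be probed numerically. The following is a rough sketch with made-up function names, assuming m > 0: it uses r = r̃(1 + m/(2r̃))², checks by finite differences that (1 − 2m/r)^{−1/2} dr/dr̃ = r/r̃ on the range r̃ > m/2, and checks that the area radius r is unchanged under r̃ ↦ (m/2)²/r̃, which is a necessary condition for the inversion to be an isometry (it does not replace the exercise).

```python
import math

def areal_radius(rt, m):
    """Area radius r as a function of the isotropic radius rt (assumes m > 0)."""
    return rt * (1 + m / (2 * rt)) ** 2

def stretch_residual(rt, m, h=1e-6):
    """(1 - 2m/r)^(-1/2) dr/drt - r/rt, valid for rt > m/2 (where r > 2m)."""
    r = areal_radius(rt, m)
    dr = (areal_radius(rt + h, m) - areal_radius(rt - h, m)) / (2 * h)
    return dr / math.sqrt(1 - 2 * m / r) - r / rt

def invert(rt, m):
    """Inversion through the sphere rt = m/2."""
    return (m / 2) ** 2 / rt
```

For instance, with m = 1 the points r̃ = 2 and r̃ = 1/8 are exchanged by the inversion, and both have area radius 3.125.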
A similar change of radial coordinate can be made on the Schwarzschild
metric in any dimension. Indeed from $g_S = (1 - \alpha/r^{n-2})^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^{n-1}}$ (as
above, we may let α = 2m), we see as above the desired change would satisfy
\[ \left(1 - \frac{\alpha}{r^{n-2}}\right)^{-\frac12}\frac{dr}{d\tilde{r}} = \frac{r}{\tilde{r}}. \tag{2.4.4} \]
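Exercise 2-38. Suppose k > 0 is a constant and
\[ r = (k\tilde{r})\left(1 + \frac{\alpha}{4(k\tilde{r})^{n-2}}\right)^{\frac{2}{n-2}}. \]
Show that r satisfies (2.4.4). If we let k = 1, then
\[ g_S = \left(1 + \frac{\alpha}{4\tilde{r}^{n-2}}\right)^{\frac{4}{n-2}}\big(d\tilde{r}^2 + \tilde{r}^2\,\mathring{g}_{\mathbb{S}^{n-1}}\big). \]

Write the spacetime Schwarzschild metric $\bar{g}_S$ in the form $\bar{g}_S = -(f(\tilde{r}))^2\,dt^2 + g_S$.
(Answer: $f(\tilde{r}) = (1 - \tfrac14\alpha/\tilde{r}^{n-2})/(1 + \tfrac14\alpha/\tilde{r}^{n-2})$.)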
Remark 2-39. We saw that the Schwarzschild family $\bar{g}_S$ is obtained by imposing
a rotationally symmetric ansatz for a Lorentzian metric, together with the vanishing
of the Ricci curvature (the vacuum Einstein equation). Likewise, one obtains
a metric for a spatial slice in Schwarzschild by imposing a rotationally symmetric
ansatz for a Riemannian metric and solving for vanishing scalar curvature. To
see this, let $I \subset \mathbb{R}$ be an interval, and consider the manifold $M = I \times \mathbb{S}^{n-1}$, on
which we prescribe a rotationally symmetric metric $g = ds^2 + H(s)\,\mathring{g}_{\mathbb{S}^{n-1}}$, for
s ∈ I and some smooth positive function H on I. It is easy to see that for fixed
$\omega_0 \in \mathbb{S}^{n-1}$, the path $\gamma(s) = (s, \omega_0)$ is a unit-speed geodesic orthogonal to the
family of spheres $\{s\} \times \mathbb{S}^{n-1}$, s ∈ I. We let $\rho = \sqrt{H}$, a smooth positive function
called the area radius, since the geometry of $\{s\} \times \mathbb{S}^{n-1}$ in (M, g) (and thus
in particular its area A(s)) is precisely that of the Euclidean sphere of radius
ρ(s). We remark that we can use the area radius as the radial coordinate on the
set where ρ′(s) ≠ 0, equivalently where A′(s) ≠ 0. Once we impose vanishing
scalar curvature, there must be points where ρ′(s) ≠ 0, since ρ(s) cannot be
constant on any interval: the simple metric product of an interval and a sphere
has positive scalar curvature. That said, we will use a different reparametrization.
Suppose s = s(r) defines a smooth function with s′(r) ≠ 0, giving a bijection
s : J → I, where $J \subset \mathbb{R}$ is thus an interval. We can change coordinates on M
and write the metric as
\[ g = (s'(r))^2\,dr^2 + H(s)\,\mathring{g}_{\mathbb{S}^{n-1}} = (s'(r))^2\left(dr^2 + \frac{H(s)}{(s'(r))^2}\,\mathring{g}_{\mathbb{S}^{n-1}}\right). \]
We define r = r(s) to be a function satisfying $dr/ds = r(s)/\sqrt{H(s)}$, i.e.,
$r = r_0 e^{F(s)}$ where $dF/ds = 1/\sqrt{H(s)}$; we take $r_0 > 0$. Thus r : I → r(I) =: J
is a smooth bijection, with smooth inverse s = s(r), for which we have
$g = (s'(r))^2(dr^2 + r^2\,\mathring{g}_{\mathbb{S}^{n-1}}) = (s'(r))^2 g_{\mathbb{E}^n}$. Thus g is conformally Euclidean. As n ≥ 3,
we can write this in a standard form given in Exercise 2-51 as $g = u^{4/(n-2)} g_{\mathbb{E}^n}$,
where in this case u = u(r) and r = |x|. Applying (2-51a) from that exercise,
we see that the vanishing of the scalar curvature amounts to Δu = 0, which for a
function of r = |x| alone yields $u = c_1 + c_2/r^{n-2}$; cf. (7.1.3). If $c_2 = 0$, the metric
g is flat; more generally, by a radial rescaling, it is easy to see that g is isometric
to the Riemannian Schwarzschild metric $g_S$ of mass $2c_1c_2$. See Exercise 7-36.
We make one final remark here. Let I = (0, +∞). For m > 0, we consider the
Riemannian Schwarzschild metric $g_S$ as a metric on $I \times \mathbb{S}^{n-1}$. This metric admits
an isometric $\mathbb{Z}_2$-action, given by radial inversion (cf. Exercises 2-37 and 2-49)
along I, together with the antipodal map on the sphere $\mathbb{S}^{n-1}$. The Schwarzschild
metric then induces a complete Riemannian metric with zero scalar curvature on
the quotient space, a smooth noncompact manifold diffeomorphic to $\mathbb{RP}^n$ with a
point removed, known as the $\mathbb{RP}^n$-geon. The geometry can still be construed as
rotationally symmetric, with one asymptotically flat end instead of two.

2.4.5. Static vacuum metrics. We briefly consider spacetime metrics on
$\mathcal{S} = I \times M$ of the form $\bar{g} = -f^2\,dt^2 + g$, where $I \subset \mathbb{R}$ is an open interval, $f : M \to \mathbb{R}$,
and g is a Riemannian metric on M. For ḡ to be a Lorentzian metric, we assume
that f is nowhere-vanishing on M. In this case, ∂/∂t is a timelike Killing vector
field orthogonal to the time slices $M_{t_0} := \{t_0\} \times M$ for $t_0 \in I$. We recall the
formulae for Ric(ḡ), from Exercise 1-15.

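Lemma 2-40. Let X and Y be (lifts of) vector fields tangent to the base M. Then
\[ \begin{aligned}
\mathrm{Ric}_{\bar{g}}(X, Y) &= \mathrm{Ric}_g(X, Y) - f^{-1}\,\mathrm{Hess}_g f(X, Y),\\
\mathrm{Ric}_{\bar{g}}(X, \partial/\partial t) &= 0,\\
\mathrm{Ric}_{\bar{g}}(\partial/\partial t, \partial/\partial t) &= f\,\Delta_g f.
\end{aligned} \]

Thus the condition that ḡ be Ricci-flat is given by the static vacuum equations
$\mathrm{Ric}(g) = f^{-1}\,\mathrm{Hess}_g f$ and $\Delta_g f = 0$. Note that from here we get R(g) = 0.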
We can relate the static equations to scalar curvature deformation, by recalling
the linearization $L_g h$ of the scalar curvature from (2.3.11), written in local
coordinates as

\[ L_g h = -g^{k\ell}g^{ij}h_{ij;k\ell} + g^{ik}g^{j\ell}h_{ij;k\ell} - g^{ik}g^{j\ell}h_{ij}R_{k\ell}. \]


For h compactly supported, we can integrate by parts to get
$\int_M f\,L_g h\,dv_g = \int_M h \cdot L_g^* f\,dv_g$, where

\[ L_g^* f = -(\Delta_g f)\,g + \mathrm{Hess}_g f - f\,\mathrm{Ric}(g). \tag{2.4.5} \]

Thus, if $\bar{g} = -f^2\,dt^2 + g$ is a Ricci-flat Lorentzian metric, then $L_g^* f = 0$.


Example 2-41. Consider the Riemannian Schwarzschild metric
\[ g_S = \left(1 - \frac{2m}{r}\right)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2} \]
induced on a constant time slice from the spacetime metric $\bar{g}_S$ in (2.4.2). From
the form of $\bar{g}_S$, we see that $f = \sqrt{1 - 2m/|x|}$ is a solution of $L_{g_S}^* f = 0$. If we
change to isotropic coordinates, $g_S = (1 + \tfrac12 m/|x|)^4 g_{\mathbb{E}^3}$, then using the form
(2.4.3) of the Schwarzschild metric $\bar{g}_S$ in these coordinates, we see that
\[ f = \frac{1 - \tfrac12 m/|x|}{1 + \tfrac12 m/|x|} \]
is a solution to $L_{g_S}^* f = 0$.

Now consider the converse: suppose $f : M \to \mathbb{R}$ is a smooth function with
$L_g^* f = 0$. Such a function is called a static potential. By taking the trace of the
equation, we obtain $\Delta_g f = -\frac{1}{n-1} R(g) f$, with $n = \dim M \ge 2$. Thus we can
rewrite the system $L_g^* f = 0$ as $f\,\mathrm{Ric}(g) = \mathrm{Hess}_g f + \frac{1}{n-1} R(g) f g$. On the open
set where $f \ne 0$, then, we see from Lemma 2-40 that $\mathrm{Ric}(\bar g) = \frac{1}{n-1} R(g)\, \bar g$. By
applying this to the Schwarzschild metric $\bar g_S$, we could conclude $R(g_S) = 0$, i.e.,
the Riemannian Schwarzschild metric $g_S$ has vanishing scalar curvature.
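
The trace step and the resulting Einstein condition can be spelled out as follows. This is a sketch that uses only (2.4.5), the three formulas of Lemma 2-40 (with the last one read as $\mathrm{Ric}_{\bar g}(\partial/\partial t, \partial/\partial t) = f\,\Delta_g f$), and the fact that $\bar g(\partial/\partial t, \partial/\partial t) = -f^2$.

```latex
% Trace of L_g^* f = -(\Delta_g f) g + Hess_g f - f Ric(g), using tr_g g = n:
\begin{align*}
0 = \mathrm{tr}_g\bigl(L_g^* f\bigr)
  &= -n\,\Delta_g f + \Delta_g f - f\,R(g)
   = -(n-1)\,\Delta_g f - R(g)\, f, \\
\Delta_g f &= -\frac{R(g)}{n-1}\, f . \\[4pt]
% Substitute back into L_g^* f = 0 and apply Lemma 2-40 where f \neq 0:
\mathrm{Ric}_{\bar g}(X, Y)
  &= \mathrm{Ric}_g(X, Y) - f^{-1}\,\mathrm{Hess}_g f(X, Y)
   = \frac{R(g)}{n-1}\, g(X, Y), \\
\mathrm{Ric}_{\bar g}(\partial/\partial t, \partial/\partial t)
  &= f\,\Delta_g f = -\frac{R(g)}{n-1}\, f^2
   = \frac{R(g)}{n-1}\, \bar g(\partial/\partial t, \partial/\partial t).
\end{align*}
```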
Exercise 2-42. For $(M, g) = (\mathbb{R}^n, g_{E^n})$, show that the solution space to $L_{g_{E^n}}^* f = 0$
is the span of the constants and coordinate functions $x^i$, $i = 1, \dots, n$. In the case
of a flat torus $(T^n, g_F)$, the only solutions to $L_{g_F}^* f = 0$ are the constant functions.
Derive this from knowing the solutions on Euclidean space, or from the general
fact that $L_g^* f = 0$ and $R(g) = 0$ implies $f$ is harmonic.
Exercise 2-43. In the case of a round sphere $(S^n, \mathring g)$, which we consider embedded
as the unit sphere about the origin in $\mathbb{R}^{n+1}$, the solutions to $L_{\mathring g}^* f = 0$ are
the restrictions of the coordinates $x^1, \dots, x^{n+1}$ to the sphere. Show that these
functions solve $L_{\mathring g}^* f = 0$, and apply the next proposition to conclude these span
the space of all solutions. These functions are eigenfunctions for $\Delta_{\mathring g}$, with $\lambda = n$;
cf. Remark 7-27. (Hint: Compute $\mathrm{Hess}_{\mathring g}(x^i)$ by computing $\langle D_Y(\mathrm{grad}_{\mathring g}\, x^i), Z\rangle$
for $Y$ and $Z$ tangent to the sphere, where $D$ is the Levi-Civita connection for
the Euclidean metric; use the fact that $\mathrm{grad}_{\mathring g}\, x^i$ is the tangential component of
$\mathrm{grad}_{g_E}\, x^i = e_i = \partial/\partial x^i$, and along the unit sphere centered at the origin, the
position vector $x$ is a unit normal.)
Proposition 2-44. Let $(M, g)$ be a connected Riemannian manifold with $n = \dim M \ge 2$.
The dimension of the space of solutions to $L_g^* f = 0$ is at most
$n+1$. For any $f : M \to \mathbb{R}$ which is a nontrivial solution to $L_g^* f = 0$, the
level set $\Sigma := \{p \in M : f(p) = 0\}$ is either empty or a smooth totally geodesic
hypersurface, along any component of which $|df|_g^2$ is a nonzero constant.
Proof. We first show that the set $\{f = 0\}$ is a regular level set for $f$. Suppose
$L_g^* f = 0$, and assume $\{f = 0\}$ is nonempty. Let $p \in M$ with $f(p) = 0$. We claim
that since $f$ is nontrivial, $df_p$ cannot vanish; equivalently, if $df_p = 0$, then $f$
would be trivial. To see this, observe that the system $L_g^* f = 0$ can be rewritten as
$\mathrm{Hess}_g f + f\bigl(\frac{R(g)}{n-1}\, g - \mathrm{Ric}(g)\bigr) = 0$. If $\gamma(t)$ is a unit-speed geodesic emanating
from $p = \gamma(0)$, and if we let $h(t) = f(\gamma(t))$, then since $D\gamma'/dt = 0$, we have
$h''(t) = \mathrm{Hess}_g f(\gamma'(t), \gamma'(t))$, so that $h(t)$ satisfies the second-order linear ODE

\[ h''(t) + \Bigl(\frac{R(g)}{n-1} - \mathrm{Ric}_g(\gamma'(t), \gamma'(t))\Bigr) h(t) = 0. \]

By the conditions satisfied by $f$ and $p$, $h(0) = 0$ and $h'(0) = df_p(\gamma'(0)) = 0$,
so that $h(t)$ vanishes identically. Since the geodesic was arbitrary, we have that
$f = 0$ in a neighborhood of $p$.
If (M, g) were complete, the argument above would suffice. More generally,
let S be the interior of { f = 0}, which we have just argued is nonempty. If q is
in the closure of S, there is a point s ∈ S close enough to q so that q is contained
in a geodesic ball B around s. Since s ∈ S, f (s) = 0 and d f s = 0, from which
we conclude as above that f = 0 on B; in particular, q ∈ B ⊂ S. Thus S ⊂ M is
nonempty, open (by definition) and closed. By connectedness, S = M.
The dimension bound follows from the fact that by the ODE argument, a
solution is determined by the values of f ( p) and d f p at a single point.
That $\Sigma = \{f = 0\}$ is totally geodesic in case it is a hypersurface (i.e., nonempty
and not all of $M$) follows directly from the fact that along $\Sigma$, $L_g^* f = 0$ reduces
to $\mathrm{Hess}_g f = 0$, and thus the gradient of $f$, which is normal to $\Sigma$, has constant
nonzero length along any component of $\Sigma$. □
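
The last step of the proof can be unpacked in two short computations. This is a sketch, assuming only that $f = 0$ and $\mathrm{Hess}_g f = 0$ along the zero set $\Sigma = \{f = 0\}$ and that $\nabla f \ne 0$ there; the unit normal $\nu$ is notation introduced here.

```latex
% X, Y are vector fields tangent to \Sigma, and \nu = \nabla f / |\nabla f|_g.
\begin{align*}
% (1) Second fundamental form: Y f = df(Y) = 0 along \Sigma, so X(Y f) = 0 and
0 = \mathrm{Hess}_g f(X, Y) &= X(Y f) - (\nabla_X Y) f
   = -\langle \nabla f, \nabla_X Y \rangle
   = -|\nabla f|_g\, \langle \nabla_X Y, \nu \rangle, \\
% hence the normal component of \nabla_X Y vanishes: \Sigma is totally geodesic.
% (2) Length of the gradient:
X\bigl(|\nabla f|_g^2\bigr) &= 2\,\langle \nabla_X \nabla f, \nabla f\rangle
   = 2\,\mathrm{Hess}_g f(X, \nabla f) = 0 ,
\end{align*}
% so |df|_g^2 is locally constant along \Sigma, i.e. constant on each component.
```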
We summarize our discussion with a corollary (using Lemma 2-12, cf. Exercise
2-52), and we emphasize that a nontrivial solution to $L_g^* f = 0$ gives rise to a
static vacuum metric $\bar g = -f^2\,dt^2 + g$, possibly with nonzero $\Lambda$.
Corollary 2-45. Let $(M^n, g)$, $n \ge 2$, be a connected Riemannian manifold which
admits a nontrivial solution $f$ of $L_g^* f = 0$, and let $\bar g = -f^2\,dt^2 + g$ on $\mathbb{R} \times \{f \ne 0\}$.
Then the scalar curvature $R(g)$ is constant, and $\bar g$ is an Einstein metric with
$\mathrm{Ric}(\bar g) = \frac{R(g)}{n-1}\, \bar g$.
Exercise 2-46. What changes, if anything, if at the beginning of this section, we
allowed $(M, g)$ to be semi-Riemannian, and/or if we defined $\bar g = f^2\,dt^2 + g$?

2.4.6. The Kerr metric. The Kerr metrics form a family of axisymmetric Ricci-
flat metrics, which includes as a subset the Schwarzschild metrics, with their full
spherical symmetry. Such metrics model the exterior of a rotating gravitational
object. We will write down the metric in a certain coordinate system, the Boyer–
Lindquist coordinates. Let r , φ and θ be spherical coordinates on three-space.
We will use the mathematical convention that θ is the polar angle in the xy-plane
(the azimuthal angle), whereas φ is the angle between the position vector and the
z-axis: in most physics books,
the angle notation is reversed, so be careful when comparing. In any case, the
metric appears somewhat complicated, and depends on two parameters, m and a.
For simplicity, we use units where G = 1 and c = 1; in general, we can replace
“m” with “Gm/c2 ” and “dt” with “c dt” in what follows to obtain the formula
for general units. See [161, p. 36] for more on geometrized units.
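Let $\Delta = r^2 - 2mr + a^2$ and $\rho^2 = r^2 + a^2 \cos^2\phi$. The Kerr metric is given by

\[
\begin{aligned}
g = {} & -\Bigl(1 - \frac{2mr}{\rho^2}\Bigr) dt^2 - \frac{2mr \sin^2\phi}{\rho^2}\, a\, (dt \otimes d\theta + d\theta \otimes dt) \\
& + \frac{(r^2 + a^2)^2 - a^2 \Delta \sin^2\phi}{\rho^2}\, \sin^2\phi \, d\theta^2 + \frac{\rho^2}{\Delta}\, dr^2 + \rho^2\, d\phi^2 .
\end{aligned}
\tag{2.4.6}
\]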
This spacetime metric is stationary: ∂/∂t is a timelike Killing vector, but unlike
the Schwarzschild situation, if a ̸= 0, this Killing field is not orthogonal to the con-
stant t-slices. The axisymmetry is apparent, as ∂/∂θ is also a Killing vector. One
can define suitably the angular momentum (see (7.2.6)) and show that it is J = am
in the z-direction. When a = 0, the metric reduces to the Schwarzschild metric.
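
The reduction to Schwarzschild at $a = 0$ can be checked term by term. The sketch below uses the definitions $\Delta = r^2 - 2mr + a^2$ and $\rho^2 = r^2 + a^2\cos^2\phi$ that precede (2.4.6), with $\phi$ the angle from the $z$-axis as in the text.

```latex
% Setting a = 0 in the Kerr metric (2.4.6):
\begin{align*}
\Delta &= r^2 - 2mr, \qquad \rho^2 = r^2, \\
-\Bigl(1 - \frac{2mr}{\rho^2}\Bigr) dt^2 &= -\Bigl(1 - \frac{2m}{r}\Bigr) dt^2, \\
\frac{2mr\sin^2\phi}{\rho^2}\, a\,(dt\otimes d\theta + d\theta\otimes dt) &= 0, \\
\frac{(r^2+a^2)^2 - a^2\Delta\sin^2\phi}{\rho^2}\,\sin^2\phi\, d\theta^2 &= r^2 \sin^2\phi\, d\theta^2, \\
\frac{\rho^2}{\Delta}\, dr^2 &= \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} dr^2, \\[4pt]
g\big|_{a=0} &= -\Bigl(1 - \frac{2m}{r}\Bigr) dt^2 + \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} dr^2
   + r^2\bigl(d\phi^2 + \sin^2\phi\, d\theta^2\bigr).
\end{align*}
```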
General relativity predicts that a rotating massive object will cause the space-
time around it to be “warped”. One such manifestation of this is the frame
dragging effect, which we illustrate in the Kerr metric. We use the following
lemma saying that a symmetry gives rise to a conserved quantity along geodesics.
Lemma 2-47. If X is a Killing field on (M, g), and if γ is a geodesic, then
⟨X, γ ′ ⟩ is constant.
Proof. Since $\gamma(\tau)$ has constant speed, $\langle \nabla_X \gamma', \gamma' \rangle = 0$. By the geodesic equation,
then, $d\langle X, \gamma' \rangle / d\tau = \langle \nabla_{\gamma'} X, \gamma' \rangle = g_{ij} (\gamma^k)' X^i{}_{;k} (\gamma^j)' = X_{j;k} (\gamma^k)' (\gamma^j)' =
X_{k;j} (\gamma^k)' (\gamma^j)'$. We see this must vanish by applying the Killing equation
$X_{j;k} + X_{k;j} = 0$. □
Consider in any spacetime a timelike geodesic parametrized by proper time τ ,
with velocity vector U (τ ). In the Kerr metric, X = ∂/∂θ is a Killing vector, so
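that by Lemma 2-47, $U_\theta = \langle U, \partial/\partial\theta \rangle$ is conserved along the geodesic. One can
see this directly from the geodesic equations $U^\alpha U_{\beta;\alpha} = 0$, which can be written

\[
\begin{aligned}
\frac{dU_\beta}{d\tau} &= \Gamma^\gamma_{\alpha\beta}\, U^\alpha U_\gamma = \tfrac12 g^{\gamma\nu} (g_{\nu\beta,\alpha} + g_{\alpha\nu,\beta} - g_{\alpha\beta,\nu})\, U^\alpha U_\gamma \\
&= \tfrac12 (g_{\nu\beta,\alpha} + g_{\alpha\nu,\beta} - g_{\alpha\beta,\nu})\, U^\alpha U^\nu = \tfrac12 g_{\alpha\nu,\beta}\, U^\alpha U^\nu .
\end{aligned}
\]

Since the metric is independent of $\theta$, then $U_\theta$ is conserved. If the geodesic
represents the path of a particle of rest mass $m_0$, then $P_\theta = m_0 U_\theta$ is a component
of the momentum one-form which is conserved.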
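Now consider a geodesic in Kerr with $U_\theta = \langle U, \partial/\partial\theta \rangle = 0$. Although $U$
remains orthogonal to $\partial/\partial\theta$, the fact that $g_{t\theta} \ne 0$ means that $U^\theta$ will be nonzero.
In fact, we have $U^t = dt/d\tau \ne 0$ and $U^\theta = d\theta/d\tau$. Thus

\[ \frac{d\theta}{dt} = \frac{U^\theta}{U^t} = \frac{g^{\theta\alpha} U_\alpha}{g^{t\alpha} U_\alpha} = \frac{g^{\theta t}}{g^{tt}} = -\frac{g_{\theta t}}{g_{\theta\theta}} = \frac{2mra}{(r^2 + a^2)^2 - a^2 \Delta \sin^2\phi} . \]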

Although in the metric the trajectory is orthogonal to the direction of the symmetry
given by the Killing field ∂/∂θ, the velocity has a nontrivial component in the
θ -direction. Thus a freely falling object seems to pick up some coordinate angular
momentum in the direction of rotation of the massive object, say, generating this
gravitational field (metric). For tests of the effect of the rotation of a massive
object on spacetime, see the results of the Gravity Probe B experiment [87].
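
To get a feel for the size of the frame dragging effect, one can expand the coordinate angular velocity far from the source. This is a rough sketch under the added assumptions $r \gg m$ and $r \gg |a|$ (these regimes are not singled out in the text); it uses $\Delta = r^2 - 2mr + a^2$ and $J = am$ from above.

```latex
% Large-r expansion of d\theta/dt for a geodesic with U_\theta = 0.
\begin{align*}
(r^2+a^2)^2 - a^2\Delta\sin^2\phi
  &= r^4 + 2a^2 r^2 + a^4 - a^2\sin^2\phi\,\bigl(r^2 - 2mr + a^2\bigr)
   = r^4\bigl(1 + O(a^2/r^2)\bigr), \\
\frac{d\theta}{dt}
  &= \frac{2mra}{(r^2+a^2)^2 - a^2\Delta\sin^2\phi}
   = \frac{2ma}{r^3}\bigl(1 + O(a^2/r^2)\bigr)
   = \frac{2J}{r^3}\bigl(1 + O(a^2/r^2)\bigr).
\end{align*}
```

In this regime the induced rotation rate is proportional to the angular momentum $J$, has the sign of $a$, and decays like $r^{-3}$.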

Exercises

Exercise 2-48. Consider the Minkowski plane with inertial coordinates $x^0 = ct$
and $x^1$. Let $g > 0$ be a constant, and introduce a set of coordinates $(\xi^0, \xi^1)$ with
$\xi^1 > -c^2/g$, such that the change of coordinates is given by

\[ x^0 = \Bigl(\xi^1 + \frac{c^2}{g}\Bigr) \sinh\Bigl(\frac{g\xi^0}{c^2}\Bigr), \qquad x^1 = \Bigl(\xi^1 + \frac{c^2}{g}\Bigr) \cosh\Bigl(\frac{g\xi^0}{c^2}\Bigr). \]

It is a straightforward matter to express $(\xi^0, \xi^1)$ as smooth functions of the
inertial coordinates.
a. Sketch a grid of level sets of ξ 0 and ξ 1 in the x 0 x 1 -plane, and indicate the
region covered by these coordinates. Show that the ξ 0 and ξ 1 coordinate curves
are orthogonal for the Minkowski metric, and that the metric spacing of the grid
points along ξ 0 = b is independent of b. To do this, find the components ḡµν
for the Minkowski metric −(d x 0 )2 + (d x 1 )2 = ḡµν dξ µ dξ ν , and confirm that it
retains a static form in these coordinates.
b. Find the proper acceleration along the worldlines of constant ξ 1 . Show that
for |gξ 0 /c2 | small, the motion along any such worldline approximates that of
uniform Newtonian acceleration in the inertial coordinates.
c. Interpreting the metric in the $(\xi^0, \xi^1)$ coordinates for $|g\xi^1/c^2|$ small in terms
of a Newtonian limit, estimate the Newtonian gravitational potential $\Phi$, and show
it is consistent with that of a uniform gravitational field.
d. Let $C_\xi$ be the worldline $\xi^1 = \xi$. Suppose that light is emitted along $C_\xi$ with
coordinate frequency $\nu$, i.e. with $\Delta\xi^0 = c/\nu$ between successive wave crests.
Argue that the light will have the same coordinate frequency of absorption as
measured along $C_{\tilde\xi}$ for any $\tilde\xi$. Find the relative change in proper frequency from
emission along $C_\xi$ to absorption at a detector carried along $C_0$, conclude that the
change in energy per unit mass is $c^2\bigl(\sqrt{-\bar g_{00}(\xi)} - 1\bigr)$, and relate this to $\Phi$.

Exercise 2-49 (geometry of the Riemannian Schwarzschild metric). Define

\[
M = \begin{cases}
\{x \in \mathbb{R}^3 : x \ne 0\} & \text{if } m > 0, \\
\mathbb{R}^3 & \text{if } m = 0, \\
\{x \in \mathbb{R}^3 : |x| > -\tfrac{m}{2}\} & \text{if } m < 0.
\end{cases}
\]

On $M$, we let $g_S = \bigl(1 + \frac{m}{2|x|}\bigr)^4 g_{E^3}$, so $(g_S)_{ij} = \bigl(1 + \frac{m}{2|x|}\bigr)^4 \delta_{ij}$ in Cartesian coordinates,
while in spherical coordinates we find $g_S = \bigl(1 + \frac{m}{2r}\bigr)^4 (dr^2 + r^2\, \mathring g_{S^2})$ with
$r = |x|$. For each $r > 0$, let $A(r)$ be the $g_S$-area of $S_r := \{|x| = r\}$, and for $m > 0$,
let $\Sigma = S_{m/2}$ (cf. Exercise 2-37).
a. Calculate $A(r)$. Observe that $\lim_{r \searrow -m/2} A(r) = 0$ for $m < 0$, while if $m > 0$,
$A(r)$ has a global minimum at $r = \frac{m}{2}$, and that $A(r) = A\bigl(\frac{m^2}{4r}\bigr)$, which is consistent
with Exercise 2-37. Solve for $m$ in terms of the area $A_\Sigma$ of $\Sigma$.
b. For $m > 0$, show that $\Sigma$ is totally geodesic in $M$ by direct computation in
coordinates (as opposed to using the inversion isometry, cf. Exercise 2-37).
c. Let $m > 0$. Find an isometric embedding of $(M, g_S)$ into $E^4$, identified in
Cartesian coordinates $(x, y, z, w)$ with $(\mathbb{R}^4, dx^2 + dy^2 + dz^2 + dw^2)$. It might
be easiest to use the first set of coordinates we introduced for the Schwarzschild
metric, for which $g_S = \bigl(1 - \frac{2m}{r}\bigr)^{-1} dr^2 + r^2\, \mathring g_{S^2}$, $r > 2m$. For $\omega \in S^2$, look for an
embedding of the form $r\omega \mapsto (r\omega, \xi(r)) \in \mathbb{R}^4$, for $r > 2m$ and $\xi'(r) > 0$. This
corresponds to half of $(M, g_S)$. The map you get will then extend by reflection
to the other half. Use this to sketch a representative picture of the Riemannian
Schwarzschild metric. (If you let $w = \xi(r)$, with $\lim_{r \searrow 2m} \xi(r) = 0$, the solution
set in the $wr$-plane is (half of) a parabola. The image of the embedding in $\mathbb{R}^4$ is
called Flamm’s paraboloid.)
d. When $m < 0$ the above argument breaks down. Instead, look for an isometric
embedding into Minkowski spacetime $M^4$, which is identified with $\mathbb{R}^4$ with the
metric $dx^2 + dy^2 + dz^2 - dw^2$.
e. Generalize the above to higher dimensions: consider the higher-dimensional
Riemannian Schwarzschild metric $g_S$, with $\alpha = 2m > 0$, given in isotropic
coordinates in Exercise 2-38. Find the value of $r$ that minimizes the area $A(r)$
of $S_r = \{x : |x| = r\}$. Answer: $r^{n-2} = \frac{m}{2}$. Find an isometric inversion through
the corresponding minimal sphere $\Sigma$. Show this sphere is totally geodesic, and
express $m$ in terms of the area of $\Sigma$. Can you embed $(\mathbb{R}^n \setminus \{0\}, g_S)$ with $m > 0$
isometrically in Euclidean space $(\mathbb{R}^{n+1}, g_{E^{n+1}})$?

Exercise 2-50. In Euclidean space, the spheres minimize surface area for a given
enclosed volume $V$. In fact if a closed surface of area $A$ encloses a volume $V$,
the isoperimetric inequality in three dimensions is $V \le A^{3/2}/(6\sqrt{\pi})$.
Let $m > 0$. In his Ph.D. thesis [25] (see also [29]), Hubert Bray showed
that the spheres $S_r = \{x : |x| = r\}$ in the Riemannian Schwarzschild metric
$\bigl(\mathbb{R}^3 \setminus \{0\}, (1 + \tfrac12 m/|x|)^4 g_{E^3}\bigr)$ are isoperimetric in the homology class of $\Sigma = S_{m/2}$.
In other words, amongst all surfaces homologous to the minimal surface $\Sigma$ and
enclosing a certain volume $V$ with $\Sigma$, the one with smallest area is the sphere $S_r$
of the correct $r$ value to enclose volume $V$.
a. Show that the volume $V(r)$ enclosed by $\Sigma$ and $S_r$ ($r \ge m/2$) has the expansion

\[ V(r) = \frac{4\pi r^3}{3} \Bigl(1 + \frac{9m}{2r} + O(m r^{-2})\Bigr). \]

b. Conclude that the volume $V$ enclosed by $\Sigma$ and the sphere $S_r$ of area $A$ has
the expansion

\[ V(A) = \frac{A^{3/2}}{6\sqrt{\pi}} \Bigl(1 + \frac{(3\sqrt{\pi})\, m}{\sqrt{A}} + O(m A^{-1})\Bigr). \]
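Exercise 2-51 (conformal deformation of scalar curvature). Suppose $(M, g)$ is a
semi-Riemannian manifold of dimension $n$, and $\tilde g = e^{\varphi} g$ where $\varphi$ is a smooth
function on $M$, and let $u > 0$ be a smooth function on $M$.
a. Show that, with $\Delta_g \varphi = \mathrm{tr}_g(\mathrm{Hess}_g \varphi)$ in any signature,

\[ R(\tilde g) = e^{-\varphi} \Bigl( R(g) - (n-1)\,\Delta_g \varphi - \tfrac14 (n-1)(n-2)\, \langle \nabla_g \varphi, \nabla_g \varphi \rangle_g \Bigr). \]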

b. In case $n \ge 3$, show that there is a unique choice of $q \ne 0$ such that if we
write $e^{\varphi} = u^q$ for $u > 0$, the resulting formula for $R(\tilde g)$ does not contain any
$\langle \nabla_g u, \nabla_g u \rangle_g$-term. In fact, for this to be the case, we must have $q = \frac{4}{n-2}$, and
with $\tilde g = u^{4/(n-2)} g$, for $u > 0$, also

\[ R(\tilde g) = u^{-\frac{n+2}{n-2}} \Bigl( R(g)\, u - \frac{4(n-1)}{n-2}\, \Delta_g u \Bigr) = -(c(n))^{-1}\, u^{-\frac{n+2}{n-2}}\, L_g u, \tag{2-51a} \]

where $c(n) = \frac{n-2}{4(n-1)}$ and $L_g u = \Delta_g u - c(n) R(g) u$ defines the conformal
Laplacian $L_g$.
c. Suppose $(M, g)$ is a closed (compact without boundary) Riemannian manifold.
Show that the total scalar curvature of $\tilde g = u^{4/(n-2)} g$ is given by

\[ \int_M R(\tilde g)\, dv_{\tilde g} = c(n)^{-1} \int_M \bigl( |\nabla_g u|_g^2 + c(n) R(g) u^2 \bigr)\, dv_g . \]

(Hint: Show that $dv_{\tilde g} = u^{2n/(n-2)}\, dv_g$.)

d. For $\tilde g = u^{4/(n-2)} g$ and any smooth $f$, prove that $L_g f = u^{(n+2)/(n-2)} L_{\tilde g}(f/u)$.
Exercise 2-52. Suppose (M, g) is a Riemannian manifold.
a. Use the Bianchi identity and the Ricci formula to show that $\mathrm{div}_g L_g^* f =
-\tfrac12 f\, dR(g)$. Use this and Proposition 2-44 to give another proof of Corollary 2-45.
b. Find the kernel of $L_g^*$ if $(M, g)$ is closed and has negative scalar curvature.
(Hint: Exercise 1-11.)
c. Observe that if $\mathrm{Ric}(g) = 0$, then $L_g^*$ has nontrivial kernel. Recall an example
of a Ricci-flat metric where the kernel of $L_g^*$ has dimension greater than one, and
an example of a metric with zero scalar curvature and which is not Ricci-flat, but
for which $L_g^*$ is nontrivial. What can you say about the kernel of $L_g^*$ if $(M, g)$ is
closed with zero scalar curvature?
d. Consider the metric $g = (n-2)^{-1}\, \mathring g_{S^1} \oplus \mathring g_{S^{n-1}}$ on $S^1 \times S^{n-1}$. Show that
$f(t, \omega) = \sin t$ is in the kernel of $L_g^*$.
Exercise 2-53 (kernel of $L_{g_S}^*$). Let $g_S$ be a Riemannian Schwarzschild metric of
nonzero mass $m$. Show that there is a one-dimensional kernel for $L_{g_S}^*$. Outline:
Observe that $L_{g_S}^* f = 0$ implies $\mathrm{Hess}_{g_S} f = f\,\mathrm{Ric}(g_S)$. (Recall from Remark 2-35
that whereas $\mathrm{Ric}(\bar g_S) = 0$, $\mathrm{Ric}(g_S) \ne 0$). In three dimensions, write this out in
coordinates for which $g_S = (1 - 2m/r)^{-1} dr^2 + r^2 (d\varphi^2 + \sin^2\varphi \, d\theta^2)$. Show
that $\partial_\theta f = 0$ and $\partial_\varphi f = 0$, and then solve the remaining ODE for $f$. Compare
your answer with Example 2-41; for $m > 0$, note where this potential vanishes
(cf. Exercise 2-49). Generalize the argument to higher dimensions.
Exercise 2-54 (metric expansion in normal coordinates). Suppose ∇ is the Levi-
Civita connection for a metric g = ⟨ · , · ⟩ on M. Suppose that γ (t) is a unit-speed
geodesic, and that J (t) is a Jacobi field along γ : J ′′ (t) = R(γ ′ (t), J (t), γ ′ (t))
(cf. Proposition 2-3).
a. If R is the Riemann curvature tensor, show that


J ′′′ (t) = (∇γ ′ (t) R)(γ ′ (t), J (t), γ ′ (t)) + R(γ ′ (t), J ′ (t), γ ′ (t)).
b. Suppose $J(0) = 0$. Let $\chi(t) = \langle J(t), J(t) \rangle$. Derive the fourth-order Taylor
expansion

\[ \chi(t) = \langle J'(0), J'(0) \rangle\, t^2 - \tfrac13 \langle R(\gamma'(0), J'(0), J'(0)), \gamma'(0) \rangle\, t^4 + O(t^5). \]

c. Suppose $(M, g)$ is a Riemannian manifold and $p \in M$. Show that in normal
coordinates centered at $p$ (so $x^i(p) = 0$)

\[ g_{ij}(x) = \delta_{ij} - \tfrac13 R_{kij\ell}\, x^k x^\ell + O(|x|^3). \]
(Hint: In normal coordinates $(x^i)$ centered at $p$, consider a unit speed radial
geodesic $\gamma(t)$ and the vector field $W(t) = t W^i\, \partial/\partial x^i$ along $\gamma$, where the $W^i$ are
constants. (Note that the summation convention applies here.) Show that $W(t)$ is
a Jacobi field along $\gamma$ with $W'(0) = W^i (\partial/\partial x^i)|_p$. One way to do this is to build
a variation $\Gamma(s, t)$ of $\gamma = \Gamma(0, \cdot)$ through geodesics. In any case, you might first
observe that in normal coordinates, the curve $\beta$ with components $\beta^i(t) = t V^i$ is
a geodesic with $\beta(0) = p$ and $\beta'(0) = V^i (\partial/\partial x^i)|_p$.)
Exercise 2-55 (geometric formula for Gaussian curvature). Let (M, g) be a
surface with a Riemannian metric g. Consider an orthonormal basis {e1 , e2 }
of T p M. Note that the Gauss curvature at p is just K ( p) = ⟨R(e1 , e2 , e2 ), e1 ⟩,
where R is the Riemann tensor. Consider a normal neighborhood of radius
a > 0 about p, with normal coordinates (x, y) built using the orthonormal basis
$\{e_1, e_2\}$ of $T_p M$: $(x, y) \mapsto \exp_p(x e_1 + y e_2)$. Define geodesic polar coordinates
by $(r, \theta) \mapsto f(r, \theta) = \exp_p(r \cos\theta\, e_1 + r \sin\theta\, e_2)$. Note that the change of
coordinates map is just $(r, \theta) \mapsto (x, y) = (r\cos\theta, r\sin\theta)$, which shows that the
map $f$, which is clearly smooth for $r < a$ and all $\theta$, is a diffeomorphism for
$0 < r < a$ and $\theta \in I$, where $I$ is any open interval of length at most $2\pi$. Note
that by Gauss’s lemma, the metric components in geodesic polar coordinates are
$g_{rr} = 1$, $g_{r\theta} = 0$ and $g_{\theta\theta} = |\partial f/\partial\theta|_g^2$. Since the radial curves of constant $\theta$ on $M$
are geodesics, by Proposition 2-3, for any $\theta_0$, $J(r) = (\partial f/\partial\theta)(r, \theta_0)$ is a Jacobi
field along the radial geodesic $r \mapsto \gamma(r) = f(r, \theta_0)$.
a. For any θ0 , show that J ′ (0) ⊥ γ ′ (0) and |J ′ (0)|g = 1. (Equation (2.3.8) might
be useful.)
b. Use part b. of Exercise 2-54 to show that $g_{\theta\theta}(r, \theta) = r^2 - \tfrac13 K(p) r^4 + E(r, \theta)$,
where $E(r, \theta) = O(r^5)$ uniformly in $\theta$, i.e., $|E(r, \theta)| \le C r^5$, where $C$ can be
chosen independent of $r$ (small) and $\theta$. Use the Taylor expansion $(1 + x)^\alpha =
1 + \alpha x + O(x^2)$ to derive $\sqrt{g_{\theta\theta}(r, \theta)} = r - \frac{1}{3!} K(p) r^3 + O(r^4)$.
c. Let $L(r)$ be the length of a geodesic circle of radius $r$ about $p$, and let $A(r)$
be the area enclosed by this circle, both computed using the metric $g$. Show that

\[ \lim_{r \searrow 0} \frac{3}{\pi}\, \frac{2\pi r - L(r)}{r^3} = K(p) = \lim_{r \searrow 0} \frac{12}{\pi}\, \frac{\pi r^2 - A(r)}{r^4} . \]

d. Let $D = \{(x, y) : x^2 + y^2 < 1\}$ be the unit disk in the plane, with polar
coordinates $(\rho, \theta)$, and consider the hyperbolic metric

\[ g_H = \frac{4}{(1 - (x^2 + y^2))^2}\, (dx^2 + dy^2) = \frac{4}{(1 - \rho^2)^2}\, (d\rho^2 + \rho^2\, d\theta^2). \]

By solving the differential equation

\[ \frac{2\, d\rho}{1 - \rho^2} = dr, \]

show how to rewrite the hyperbolic metric as $g_H = dr^2 + \sinh^2 r \, d\theta^2$. Use this
along with the formulas above to show $K = -1$ at the origin of coordinates (of
course, $K = -1$ everywhere).

Exercise 2-56 (volume expansion of geodesic balls). In this problem you will
generalize the relation of curvature and area from part c. of Exercise 2-55 to
higher dimensions, showing how the scalar curvature measures the top-order
deviation of the volume of small geodesic balls from that of Euclidean geometry.
a. Suppose $(V, \langle\,,\rangle)$ is an $n$-dimensional real inner product space. Suppose that
$T : V \to V$ is a self-adjoint linear operator. If $d\sigma$ is the Euclidean area measure,
$B^n$ is the unit ball, and $S^{n-1} = \{x \in V : \langle x, x \rangle = 1\} \subset V$ is the unit sphere in $V$,
then if Vol is the Euclidean volume,

\[ \int_{S^{n-1}} \langle T(x), x \rangle \, d\sigma = \mathrm{trace}(T)\, \mathrm{Vol}(B^n). \]

b. If $(M, g)$ is Riemannian and $p \in M$, let $B_r(p) \subset M$ be the geodesic ball of
radius $r > 0$ (for sufficiently small $r$). Then

\[ \mathrm{Vol}_g(B_r(p)) = \mathrm{Vol}(B^n)\, r^n \Bigl( 1 - \frac{R(g)|_p}{6(n+2)}\, r^2 + O(r^3) \Bigr). \]
(Hint: You may wish to use det(I + t A) = 1 + t trace(A) + O(t 2 ), along with
Exercise 2-54.)
CHAPTER 3

Basics of Lorentzian causality

In this chapter we introduce some basic concepts of Lorentzian causality needed


in the discussion of the Penrose singularity theorem in the next chapter. For a
more comprehensive treatment of the causal structure of a spacetime, readers
are referred to [179; 112; 174; 218; 59; 62; 101].

3.1. Preliminaries from Lorentzian geometry

Let S be a connected, smooth Lorentz manifold of dimension n + 1. Let ⟨ · , · ⟩


denote the Lorentzian metric on S which has signature (−, +, . . . , +). We
classify tangent vectors v to S as follows:
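\[
v \text{ is }
\begin{cases}
\text{timelike} & \text{if } \langle v, v \rangle < 0, \\
\text{null} & \text{if } \langle v, v \rangle = 0 \text{ and } v \ne 0, \\
\text{spacelike} & \text{if } \langle v, v \rangle > 0 \text{ or } v = 0.
\end{cases}
\]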

3.1.1. Time-orientability. Given p ∈ S , the set of timelike vectors v ∈ T p S is


the disjoint union of two open cones, known as timecones at p. While there is
no intrinsic choice of one of these timecones at each point in S to serve as the
designation of the future direction, for the purpose of doing this in a continuous
manner, we introduce the concept of time-orientability. Let C be the set of all
timecones on S . We say that a map φ : S → C is a time-orientation on S if
φ( p) ⊂ T p S and φ( p) depends smoothly on p in the sense that, for all p ∈ S ,
there exists an open neighborhood U of p and a smooth timelike vector field X
on U such that X (q) ∈ φ(q), for all q ∈ U . If S admits such a time-orientation
φ, we say S is time-orientable. In this case, one can designate φ( p) as the future
timecone at p, and the other cone as the past timecone at p. A time-orientable
S with designated future (and past) timecones is called time-oriented.
Using a partition of unity, one obtains:
Lemma 3-1 [174, Lemma 5.32]. S is time-orientable if and only if there exists a
smooth timelike vector field on S .
Lemma 3-1 reveals a sufficient (and indeed necessary) topological condition
for a differentiable manifold to admit a time-orientable Lorentzian metric:

Proposition 3-2 [174, Proposition 5.37]. Let M be a smooth manifold. If M is


noncompact or if M is compact and has Euler number χ(M) = 0, then there
exists a smooth time-orientable Lorentzian metric on M.
Proof. The given topological assumption implies that M has a smooth, nowhere
vanishing vector field X (see [217]). Let g be any smooth Riemannian metric on
M. Since X is nowhere vanishing, one can normalize X to obtain a g-unit vector
field n. Let ω be the 1-form dual to n with respect to g. Then ḡ = g − 2ω ⊗ ω is
a time-orientable Lorentzian metric on M. □
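
The claim at the end of the proof of Proposition 3-2 can be checked pointwise. The sketch below assumes only what the proof sets up: $n$ is a $g$-unit vector field and $\omega = g(n, \cdot)$; the vectors $v, w$ are arbitrary tangent vectors $g$-orthogonal to $n$, introduced here for the check.

```latex
% Signature of \bar g = g - 2\,\omega\otimes\omega, with \omega(n) = 1 and \omega(v) = \omega(w) = 0.
\begin{align*}
\bar g(n, n) &= g(n, n) - 2\,\omega(n)^2 = 1 - 2 = -1, \\
\bar g(n, v) &= g(n, v) - 2\,\omega(n)\,\omega(v) = 0, \\
\bar g(v, w) &= g(v, w) - 2\,\omega(v)\,\omega(w) = g(v, w).
\end{align*}
% So \bar g is negative definite on span{n} and positive definite on the
% g-orthogonal complement of n: signature (-, +, ..., +). Since n is a smooth,
% globally defined \bar g-timelike vector field, Lemma 3-1 gives time-orientability.
```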
Remark 3-3. By Lemma 3-11 below, all physical spacetimes are noncompact.

3.1.2. Causal curves. Suppose S is time-oriented. A timelike or null tangent


vector v is called future-directed if v lies in the closure of the future timecone;
past-directed is defined similarly. Given a differentiable curve σ : [a, b] → S ,
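we say

\[
\sigma \text{ is }
\begin{cases}
\text{timelike} & \text{if } \sigma'(t) \text{ is timelike for all } t, \\
\text{null} & \text{if } \sigma'(t) \text{ is null for all } t, \\
\text{causal} & \text{if } \sigma'(t) \text{ is either timelike or null for all } t.
\end{cases}
\]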

A timelike, null, or causal curve σ is called future-directed if σ ′ (t) is future-


directed. Past-directed curves are defined in a similar way.
These notions extend naturally to piecewise differentiable curves σ by requir-
ing that, when σ is differentiable on [a, b] and [b, c], the vectors σ ′ (b− ) and
σ ′ (b+ ) point into the closure of the same timecone.
From now on in this chapter, we will work within the class of piecewise differentiable
causal curves. We state a basic result regarding the deformation of causal
curves:
Theorem 3-4 [174, Theorem 10.51]. Given a causal curve σ from p to q in S ,
there exists a timelike curve from p to q which is arbitrarily near σ unless σ is a
null geodesic (when suitably parametrized) along which there are no conjugate
points of p before q.
We refer readers to Section 4.1 for the definition of a conjugate point.

3.1.3. Convex open sets. For the later purpose of understanding the causality
relations locally on S , we recall the concept of convex open sets in S .
Given p ∈ S , an open neighborhood U of p is called a normal neighborhood
of p provided U = exp p (Ũ ), where exp p ( · ) denotes the exponential map at p
and Ũ ⊂ T p S is a starshaped open set containing 0 such that exp p : Ũ → U is a
diffeomorphism (see [174, p. 71]).
An open set U in S is said to be convex if U is a normal neighborhood of each


p ∈ U . Thus, given any two points p, q in a convex U , there exists a unique
geodesic segment σ pq : [0, 1] → S from p to q which lies entirely in U . Every
point in S has a convex open neighborhood [174, Proposition 5.7].

3.2. Causality relations

Henceforth in this chapter, S denotes a spacetime, i.e., a connected, time-oriented


Lorentz manifold. The causality relations ≪ and < on S are defined as follows.
Definition 3-5. For p, q ∈ S ,
• p ≪ q means there exists a future-directed timelike curve from p to q;
• p < q means there exists a future-directed causal curve from p to q.
By Theorem 3-4, one has:
Lemma 3-6. If p < q and q ≪ r , or if p ≪ q and q < r , then p ≪ r .
Given p ∈ S , the timelike future and causal future of p, denoted by I + ( p)
and J + ( p) respectively, are defined by

I + ( p) = {q ∈ S : p ≪ q} and J + ( p) = {q ∈ S : p ≤ q}.

Here, the notation p ≤ q means either p = q or p < q. The timelike past and
causal past of p, denoted by I − ( p) and J − ( p) respectively, are defined in a
time-dual manner.
Naturally one wonders if I + ( p) is open in S . To answer this question, it is
convenient to restrict the causality relations to an open set U of S . Given p ∈ U ,
the timelike future of p within U , denoted by I + ( p, U ), consists of all points q
in U for which there exists a future-directed timelike curve within U from p
to q. Similarly, one defines J + ( p, U ).
When U is convex, the causality relations in U are as simple as those of the
Minkowski spacetime.
Lemma 3-7 (see [174, p. 403]). Let U be a convex open set in S , and let p, q ∈ U .
• q ∈ I + ( p, U ) if and only if the unique geodesic segment σ pq connecting p to q
within U is future-directed timelike. Consequently, I + ( p, U ) is open in U
(hence open in S ).
• If q ̸= p, then q ∈ J + ( p, U ) if and only if the unique geodesic segment σ pq
in U is future-directed causal. Consequently, J + ( p, U ) is the closure of
I + ( p, U ) in U .
Lemma 3-7 implies that ≪ is indeed an open relation on S .
110 3. BASICS OF L ORENTZIAN CAUSALITY

Proposition 3-8. Let p, q ∈ S . If p ≪ q, then there are open neighborhoods U


and V of p and q respectively, such that p̃ ≪ q̃ for all p̃ ∈ U and q̃ ∈ V .
Proof. Let σ : [0, 1] → S be a future-directed timelike curve from p to q. Let
Wq be a convex open neighborhood of q = σ (1). Choose ϵ > 0 small such that
σ (t) ∈ Wq for all t ∈ [1 − ϵ, 1]. Then q ∈ I + (σ (1 − ϵ), Wq ). Similarly, one can
choose a convex open neighborhood W p of p = σ (0) and a small δ > 0 such
that p ∈ I − (σ (δ), W p ). By Lemma 3-6, the open sets I + (σ (1 − ϵ), Wq ) and
I − (σ (δ), W p ) have the required properties of V and U respectively. □
Given a subset A ⊂ S , the timelike future and causal future of A are defined
respectively by
\[ I^+(A) = \bigcup_{p \in A} I^+(p) \quad \text{and} \quad J^+(A) = \bigcup_{p \in A} J^+(p). \]

It follows from Proposition 3-8 that I + (A) is an open set in S .


We now turn to the structure of J + (A). In general, J + (A) need not be the
closure of I + (A); in fact, J + (A) may not even be closed, as illustrated by the
Minkowski spacetime with one point removed. However, we do have this:
Lemma 3-9. Let A ⊂ S .
(i) int J + (A) = I + (A). Here int J + (A) denotes the interior of J + (A).
(ii) $J^+(A) \subset \overline{I^+(A)}$, the closure of $I^+(A)$, with equality if and only if $J^+(A)$ is a closed set.
Proof. (i) It suffices to show int J + (A) ⊂ I + (A). Let q ∈ int J + (A), then there
is an open neighborhood U of q such that U ⊂ J + (A). Take any q − ∈ I − (q, U ),
then q − ∈ J + (A). By Lemma 3-6, q ∈ I + (A).
The proof of (ii) is left as an exercise. □
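
The failure of closedness mentioned before Lemma 3-9 can be made concrete in two dimensions. The following is a sketch with specific choices made here for illustration (the removed point $(1,1)$, the base point $p$, and the sequence $q_k$ are not taken from the text); coordinates are $(t, x)$ with metric $-dt^2 + dx^2$.

```latex
% Two-dimensional Minkowski spacetime with the point (1,1) removed.
\begin{align*}
& S = \mathbb{R}^{1+1} \setminus \{(1,1)\}, \qquad A = \{p\}, \quad p = (0,0), \quad q = (2,2). \\
& \text{In } \mathbb{R}^{1+1} \text{ the vector } q - p \text{ is null, so every causal curve from } p \text{ to } q
  \text{ is a reparametrization of the} \\
& \text{null segment } s \mapsto (2s, 2s),\ 0 \le s \le 1, \text{ which passes through } (1,1) \notin S.
  \text{ Hence } q \notin J^+(p) \text{ in } S. \\
& \text{On the other hand, for } q_k = (2 + \tfrac1k,\, 2) \text{ the straight segment } s \mapsto s\, q_k
  \text{ is future-directed timelike,} \\
& \text{since } (2 + \tfrac1k)^2 > 2^2, \text{ and it avoids } (1,1). \text{ Thus } q_k \in I^+(p) \text{ in } S
  \text{ and } q_k \to q, \text{ so} \\
& q \in \overline{I^+(p)} \setminus J^+(p), \quad \text{and } J^+(p) \text{ is not closed in } S.
\end{align*}
```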
As a direct corollary of Theorem 3-4, the set J + (A) \ (A ∪ I + (A)) has an
interesting geometric structure.
Proposition 3-10. Given A ⊂ S , suppose q ∈ J + (A) \ (A ∪ I + (A)). Let σ be a
future-directed causal curve ending at q and starting from some p ∈ A. Then σ
(when suitably parametrized) is a future-directed null geodesic along which there
are no conjugate points of p before q; moreover σ does not intersect I + (A).

3.3. Causality conditions

In a physical spacetime, one does not expect observers to be able to travel to


their own past. A spacetime S is said to satisfy the chronology condition if it
contains no closed timelike curves.
Lemma 3-11. A compact spacetime does not satisfy the chronology condition.
Proof. Suppose S is compact. Since $I^+(p)$ is open for all $p$, there exist a finite
number of points $p_1, \dots, p_m$ such that $S \subset \bigcup_{i=1}^{m} I^+(p_i)$. If $p_m \in I^+(p_m)$, then
there is a closed timelike curve through $p_m$. If $p_m \notin I^+(p_m)$, then $p_m \in I^+(p_j)$
for some $j \le m - 1$, so that $I^+(p_m) \subset I^+(p_j)$ and hence $S \subset \bigcup_{i=1}^{m-1} I^+(p_i)$. Repeating this argument, one
concludes that S must contain a closed timelike curve. □
A spacetime S is said to satisfy the causality condition if it contains no closed
causal curves. Despite being stronger than the chronology condition, the causality
condition itself in many cases is not well suited for doing analysis on S because
it does not rule out existence of “almost closed” causal curves. For this reason,
the following condition is often imposed.
Definition 3-12. The strong causality condition is said to hold at a point p ∈
S provided that given any open neighborhood U of p there exists an open
neighborhood V ⊂ U of p such that every causal curve σ : [0, 1] → S with
σ(0) ∈ V and σ(1) ∈ V lies entirely in U .
A spacetime S is said to be strongly causal if the strong causality condition
holds at every p ∈ S .
The following lemma gives a good illustration of the implication of the strong
causality condition.
Lemma 3-13. Suppose K ⊂ S is a compact set and the strong causality condition
holds at every point in K . Let σ : [0, b) → S , where b ≤ ∞, be a future-directed
causal curve such that $\sigma(0) \in K$. If $\sigma$ is future-inextendible, i.e., if $\lim_{t \nearrow b} \sigma(t)$
does not exist, then $\sigma$ eventually leaves $K$: that is, there exists a $T \in (0, b)$ such
that $\sigma(t) \notin K$, for all $t > T$.
Proof. If the conclusion is false, there exists an increasing sequence {ti } ⊂ (0, b)
such that σ (ti ) ∈ K and limi→∞ ti = b. As K is compact, passing to a subse-
quence, one may assume limi→∞ σ (ti ) = p for some p ∈ K . Applying the strong
causality condition at p, one can show limt↗b σ (t) = p, which is a contradiction.

The (strong) causality condition is used to define a globally hyperbolic space-
time.
Definition 3-14. A spacetime S is said to be globally hyperbolic if
(1) S is strongly causal, and
(2) the sets J + ( p) ∩ J − (q) are compact for all p, q ∈ S .
Remark 3-15. In [21], it was shown that the causality condition together with
condition (2) in fact implies the strong causality condition.
A direct consequence of condition (2) in Definition 3-14 is the closedness of


J ± ( p).
Proposition 3-16. Suppose a spacetime S satisfies condition 2 above, that is,
J + ( p) ∩ J − (q) is compact for all p, q ∈ S . Then J + (A) and J − (A) are closed
for any compact subset A ⊂ S .
Proof. First consider the case where A = { p}. Let {qk } ⊂ J + ( p) and assume
limk→∞ qk = q. Take q + ∈ I + (q), then qk ∈ I − (q + ) ⊂ J − (q + ) for large k.
Hence, qk ∈ J + ( p) ∩ J − (q + ) which is compact by assumption. Therefore,
q = limk→∞ qk ∈ J + ( p). This shows J + ( p) (and J − ( p)) are always closed.
Next suppose $A$ is compact. Suppose $\{q_k\} \subset J^+(A)$ and $\{p_k\} \subset A$ such that
$q_k \in J^+(p_k)$ and $\lim_{k \to \infty} q_k = q$. As $A$ is compact, passing to a subsequence, one
may assume $\lim_{k \to \infty} p_k = p \in A$. Now take a sequence $\{p_m^-\} \subset I^-(p)$ such that
$\lim_{m \to \infty} p_m^- = p$. For each fixed $m$, when $k$ is sufficiently large, $p_k \in I^+(p_m^-)$
which implies $q_k \in J^+(p_m^-)$. Since $J^+(p_m^-)$ is closed, one has $q \in J^+(p_m^-)$ or
equivalently $p_m^- \in J^-(q)$. Since $J^-(q)$ is closed, one concludes $p \in J^-(q)$ and
hence $q \in J^+(p) \subset J^+(A)$. This shows $J^+(A)$ is closed. □
Remark 3-17. The proof of Proposition 3-16 indeed shows that ≤ is a closed
relation on S if the spacetime S satisfies condition (2) in Definition 3-14.
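To see why condition (2) cannot simply be dropped in Proposition 3-16, the following sketch works through a standard example that is not taken from the text: two-dimensional Minkowski space with one point removed. The coordinates (t, x) and the specific points are choices made here for illustration only.

```latex
% Illustrative example (an assumption of this sketch, not from the text):
% S = R^{1,1} \setminus \{(1,1)\} with metric -dt^2 + dx^2, and p = (0,0).
\[
  q_k = \bigl(2 + \tfrac{1}{k},\, 2\bigr) \in I^+(p) \subset J^+(p),
  \qquad
  \lim_{k\to\infty} q_k = q = (2,2).
\]
% The straight segment from p to q_k is timelike and misses (1,1).
% In R^{1,1} the only causal curve from p to q is the null segment t = x,
% which passes through the removed point. Hence
\[
  q \in \overline{J^+(p)} \setminus J^+(p),
\]
% so J^+(p) is not closed in S. Consistently, condition (2) fails:
% with q^+ = (3,2), every q_k lies in J^+(p) \cap J^-(q^+),
% but the limit q does not, so this set is not compact.
```

The example suggests that the compactness of the sets J^+(p) ∩ J^-(q) is exactly what rules out such "missing" limit points.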

3.4. Achronal sets

We now discuss properties of certain subsets in a spacetime S. A subset A ⊂ S
is said to be achronal if A ∩ I^+(A) = ∅, i.e., if the relation p ≪ q never holds
for p, q ∈ A.
Since ≪ is an open relation (Proposition 3-8), one has:
Lemma 3-18. If A is achronal, so is its closure \overline{A}.
An important example of an achronal set is the topological boundary of the
timelike future (and past) of any given set.
Lemma 3-19. Let A ⊂ S .
(i) If p ∈ ∂ I + (A), then I + ( p) ⊂ I + (A), and I − ( p) ⊂ S \ I + (A).
(ii) ∂ I + (A) is achronal.
Proof. (i) Take q ∈ I + ( p). Let { pk } ⊂ I + (A) such that limk→∞ pk = p. As
I − (q) is an open set containing p, pk ∈ I − (q) for large k. Hence, q ∈ I + (A).
This proves I + ( p) ⊂ I + (A). Similarly, one shows I − ( p) ⊂ S \ I + (A).
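Statement (ii) follows from (i) and the fact that I^+(A) is open: if p, q ∈ ∂I^+(A)
and p ≪ q, then q ∈ I^+(p) ⊂ I^+(A), which is impossible since an open set contains
none of its boundary points. □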
Because of Lemma 3-19(ii), a subset B ⊂ S is said to be an achronal boundary
if B = ∂I^+(A) or B = ∂I^-(A) for some A ⊂ S. A main result in this section is
that an achronal boundary, if nonempty, is always a C 0 hypersurface. To prove
this, one needs the concept of an edge point.
Definition 3-20. Given an achronal set A, a point p ∈ \overline{A} is called an edge point of
A if every open neighborhood U of p contains a timelike curve from I^-(p, U)
to I^+(p, U) that does not meet A.
Remark 3-21. By definition, a point p ∈ \overline{A} is not an edge point of A if there
exists an open neighborhood V of p such that every timelike curve that is
contained in V and runs from I^-(p, V) to I^+(p, V) must meet A.
We will denote by edge(A) the set of all edge points of an achronal set A. By
Lemma 3-18, if p ∈ \overline{A} \ A, no timelike curve passing through p meets A. Hence

\overline{A} \ A ⊂ edge(A) ⊂ \overline{A}.

In particular, A being edgeless (edge(A) = ∅) implies that A is closed.


Lemma 3-22. Given any A ⊂ S , the achronal boundary ∂ I + (A) is edgeless.
Proof. Taking any p ∈ ∂ I + (A), it follows from Lemma 3-19(i) that any timelike
curve from I − ( p) to I + ( p) must meet ∂ I + (A). Hence, ∂ I + (A) is edgeless by
Remark 3-21. □
We now state a basic structural result.
Proposition 3-23. Suppose A is achronal. Then A \ edge(A), if nonempty, is a
C 0 hypersurface in S .
Proof. Suppose p ∈ A \ edge(A); we want to show that there exists an open
neighborhood U of p such that U ∩ A is homeomorphic to an open set in R^n.
Let {e_0, e_1, ..., e_n} be an orthonormal basis of T_pS such that e_0 is
future-directed timelike. Using this basis, one identifies T_pS with R^{n,1}, with
coordinates (t, x_1, ..., x_n). For any δ > 0, define W̃_δ = {(t, x) : |t| < δ, |x| < δ},
where x = (x_1, ..., x_n) ∈ R^n and |x| = √(x_1^2 + ··· + x_n^2). When δ is small, the
exponential map exp_p(·) is a diffeomorphism from W̃_δ onto its image. Denote this
image by W_δ.
Since p ∉ edge(A), there exists an open neighborhood V of p such that every
timelike curve that is contained in V and runs from I^-(p, V) to I^+(p, V) must
meet A. Fix a small δ > 0 such that W_{2δ} ⊂ V. Given any x ∈ D_δ = {x ∈ R^n : |x| < δ},
consider the curve γ_x(t) = exp_p(t e_0 + x_1 e_1 + ··· + x_n e_n), |t| < 2δ. By choosing
δ small, one can assume γ_x is timelike for all x ∈ D_δ. Now one restricts attention
to W_δ. As t increases within (−δ, δ), γ_x runs from I^-(p, W_δ) ⊂ I^-(p, V) to
I^+(p, W_δ) ⊂ I^+(p, V). Hence, γ_x must meet A. Moreover, γ_x meets A only
once since A is achronal. Therefore, exp_p^{-1}(W_δ ∩ A) is the graph over D_δ of
some function f : D_δ → (−δ, δ).
To complete the proof, one needs to show f is continuous. If f were not
continuous at some point x_* ∈ D_δ, there would exist an ϵ > 0 and a sequence
{y_k} ⊂ D_δ such that y_k → x_* as k → ∞ and |t_k − t_*| > ϵ, where t_k = f(y_k) and
t_* = f(x_*). Let σ̃_k ⊂ T_pS be the line segment that connects (t_*, x_*) to (t_k, y_k).
Let σ_k be the image of σ̃_k under the exponential map exp_p(·). Since the curve
segment γ_{x_*}(t), t ∈ [−δ, δ], is timelike, one sees that σ_k is also timelike for large k.
As σ_k joins two points of A, this contradicts the fact that A is achronal. Hence, f is
continuous on D_δ. □
The next corollary follows directly from Lemma 3-19(ii), Lemma 3-22 and
Proposition 3-23.
Corollary 3-24. Given any A ⊂ S , the achronal boundary ∂ I + (A), if nonempty,
is a C 0 hypersurface in S .
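A quick sanity check of Corollary 3-24 is the light cone in Minkowski space. The sketch below is an illustration added here (the choice A = {0} is an assumption of the example); it also shows that C^0 regularity cannot be improved in general.

```latex
% Illustrative example: S = R^{n,1} with coordinates (t,x), A = \{0\}.
\[
  I^+(A) = \{(t,x) : t > |x|\},
  \qquad
  \partial I^+(A) = \{(t,x) : t = |x|\} = \operatorname{graph}(f),
  \quad f(x) = |x|.
\]
% f is continuous (indeed 1-Lipschitz), so \partial I^+(A) is a C^0
% hypersurface, in agreement with Corollary 3-24.
% f is not differentiable at x = 0, so \partial I^+(A) is not a C^1
% hypersurface near the vertex of the cone.
```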

3.5. Cauchy hypersurfaces

We give two equivalent definitions of a Cauchy hypersurface in a spacetime S.
Recall that a causal curve σ : (a, b) → S is inextendible if lim_{t↘a} σ(t) and
lim_{t↗b} σ(t) do not exist.

Definition I. A subset Σ ⊂ S is said to be a Cauchy hypersurface if Σ is met
exactly once by every inextendible timelike curve in S.
Definition II. A subset Σ ⊂ S is said to be a Cauchy hypersurface if Σ is
achronal and is met by every inextendible causal curve in S.
It is evident that Definition II implies Definition I. The following lemmas may
be used to verify the reverse direction.
Lemma 3-25. Let Σ be a Cauchy hypersurface according to Definition I.
(i) S is the disjoint union of Σ, I^+(Σ) and I^-(Σ).
(ii) Σ is a closed, edgeless, achronal set. Hence Σ is a C^0 hypersurface.
Proof. The proof of (i) is left as an exercise. Given any p ∈ Σ, the fact that
any timelike curve passing through p immediately enters both I^-(Σ) and I^+(Σ) shows
Σ ⊂ ∂I^+(Σ) and Σ ⊂ ∂I^-(Σ). This combined with (i) then implies Σ =
∂I^+(Σ) = ∂I^-(Σ). (ii) now follows from Lemma 3-19, Lemma 3-22 and
Proposition 3-23. □
Lemma 3-26. Let C ⊂ S be a closed subset. Let β : [0, b) → S be a past-directed,
past-inextendible causal curve, i.e., lim_{t↗b} β(t) does not exist. Suppose β does not
meet C.
(i) Given any p_0 ∈ I^+(β(0), S \ C), there is a past-inextendible timelike curve
starting at p_0 that does not meet C.
(ii) If β is not a null geodesic free of conjugate points of β(0), there is a past-
inextendible timelike curve starting at β(0) that does not meet C.
Proof. (i) By reparametrization, one may assume b = ∞. Let γ ⊂ S \ C be
a past-directed timelike curve from p0 to β(0). Applying Theorem 3-4 to the
union of γ and β|[0,1] , one knows there exists a past-directed timelike curve γ1
contained in S \ C, connecting p0 and β(1). Let p1 be a point on γ1 that is close
to β(1). Similarly, one can find a past-directed timelike curve γ2 contained in
S \ C, connecting p1 and β(2). Let p2 be a point on γ2 that is close to β(2).
Repeating this argument, one obtains a sequence of points p1 , p2 , . . . in S \ C
and a sequence of past-directed timelike curves γ1 , γ2 , . . . contained in S \ C
such that γk connects pk−1 and pk ; moreover, with respect to any metric d( · , · )
for the topology on S , { pk } can be arranged so that d( pk , β(k)) < 1/k. In
particular, limk→∞ pk does not exist since limk→∞ β(k) does not exist. Now let
γ̃ be the past-directed timelike curve which is the union of γ1 , γ2 , . . . , then γ̃ is
past-inextendible and is contained in S \ C.
(ii) Suppose b = ∞. Choose T < ∞ such that β|_{[0,T]} is not a null geodesic that has
no conjugate points of β(0) before β(T). Applying Theorem 3-4 to β|_{[0,T]}, one
knows there exists a past-directed timelike curve σ : [0, 1] → S \ C, connecting
β(0) and β(T). Let α be the union of σ|_{[1/2,1]} and β|_{[T,∞)}; then α is a past-
inextendible causal curve that does not meet C. Since β(0) ∈ I^+(σ(1/2), S \ C),
(ii) now follows from (i). □


Proof that Definition I implies Definition II. Let Σ ⊂ S satisfy Definition I.
Then Σ is achronal since no timelike curves meet Σ more than once. Let
σ : (−∞, ∞) → S be an inextendible causal curve. If σ(0) ∈ Σ, then σ meets Σ.
Suppose σ(0) ∉ Σ; then σ(0) ∈ I^+(Σ) ∪ I^-(Σ) by Lemma 3-25(i). Without loss
of generality, one can assume σ(0) ∈ I^+(Σ). By reversing the time direction of
σ, one may also assume σ is past-directed. Now suppose σ|_{[0,∞)} does not meet
Σ. Since Σ is closed, by Lemma 3-26(i), for a fixed p ∈ I^+(σ(0)) ⊂ I^+(Σ),
there exists a past-inextendible timelike curve α starting at p which does not
meet Σ. On the other hand, let η be any future-inextendible timelike curve
starting at p; then η stays in I^+(Σ). Putting together η and −α, one obtains
an inextendible timelike curve that never meets Σ, contradicting Definition I.
Therefore, σ must meet Σ. □
Definition I suggests using a timelike vector field to study the topology of a
spacetime admitting a Cauchy hypersurface.
Let T be a smooth timelike vector field on S (the existence of such a T
is guaranteed by the time-orientability of S). Given any p ∈ S, let γ_p(t) be
the maximal integral curve of T with γ_p(0) = p. It is easily seen that γ_p is
inextendible. Consider the map F : (p, t) ↦ γ_p(t). The domain of F is the
open subset of S × R given by U = ⋃_{p∈S} ({p} × (a_p, b_p)), where (a_p, b_p) is the
maximal interval on which γ_p(·) is defined. If Σ ⊂ S is a Cauchy hypersurface,
one has the continuous map

Φ_Σ : U ∩ (Σ × R) → S,

where Φ_Σ(x, t) = γ_x(t) for all x ∈ Σ. It is a good exercise to check that
Definition I implies Φ_Σ is one-to-one and onto. Since U ∩ (Σ × R) and S are
topological manifolds, Φ_Σ is a homeomorphism by the theorem on invariance
of domain (for this result see [217] and [165], for example). Denote by π_Σ the
natural projection from Σ × R to Σ. Composing Φ_Σ^{-1} with π_Σ, one obtains a
map

ψ_Σ = π_Σ ∘ Φ_Σ^{-1} : S → Σ,

which is a continuous open map and satisfies the property ψ_Σ(p) = p for all
p ∈ Σ. In particular, this shows that Σ is connected, since S is connected.

Proposition 3-27. Let S be a spacetime that has a Cauchy hypersurface Σ.

(i) Σ is homeomorphic to any other Cauchy hypersurface in S.

(ii) Suppose A ⊂ S is achronal. If A is a compact C^0 hypersurface, then A and
Σ are homeomorphic. Consequently, Σ must be compact.

Proof. (i) Let Σ′ be a second Cauchy hypersurface in S. For a fixed timelike
vector field T, consider the maps ψ_Σ and ψ_Σ′ defined as above. For any p ∈ Σ
and p′ ∈ Σ′, it is evident that ψ_Σ(ψ_Σ′(p)) = p and ψ_Σ′(ψ_Σ(p′)) = p′. Hence,
Σ and Σ′ are homeomorphic.
(ii) Consider the restriction of ψ_Σ to A. The fact that A is achronal shows that ψ_Σ|_A
is one-to-one. Since A is a C^0 hypersurface, by invariance of domain, ψ_Σ|_A is
an open map. On the other hand, A being compact shows that ψ_Σ(A) is compact,
hence closed in Σ. Therefore, since Σ is connected, ψ_Σ(A) = Σ and ψ_Σ|_A is a
homeomorphism between A and Σ. □

Remark 3-28. The compact achronal C^0 hypersurface A in Proposition 3-27(ii)
is indeed a Cauchy hypersurface itself. We refer readers to Proposition 4.8 in
[101] for a discussion of the proof.
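The flow construction behind Proposition 3-27(i) can be made concrete in Minkowski space. The following is a sketch under assumptions chosen here for illustration: T = ∂_t, Σ = {t = 0}, and a second hypersurface Σ′ given as a graph of a smooth function f with |∇f| ≤ c < 1 (a condition under which Σ′ is also a Cauchy hypersurface).

```latex
% Illustrative example: S = R^{n,1}, T = \partial_t, \Sigma = \{t = 0\}.
\[
  \gamma_{(t_0,x_0)}(s) = (t_0 + s,\, x_0),
  \qquad
  \Phi_\Sigma\bigl((0,x), s\bigr) = (s, x),
  \qquad
  \psi_\Sigma(t,x) = (0,x).
\]
% Assumption: \Sigma' = \{t = f(x)\} with f smooth and |\nabla f| \le c < 1.
\[
  \psi_{\Sigma'}(t,x) = (f(x),\, x),
  \qquad
  \psi_{\Sigma'}\big|_{\Sigma} : (0,x) \mapsto (f(x),x),
  \qquad
  \psi_{\Sigma}\big|_{\Sigma'} : (f(x),x) \mapsto (0,x).
\]
% The two restricted maps are mutually inverse and continuous,
% so \Sigma and \Sigma' are homeomorphic (both to R^n).
```

In this model case the maps simply slide points along the vertical lines x = const, which are the integral curves of T.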
3.6. Domains of dependence

Definition 3-29. Given an achronal set A ⊂ S , the future and past domains of
dependence of A, D + (A) and D − (A), are defined as follows:
D + (A) = {q : every past-inextendible causal curve from q meets A},
D − (A) = { p : every future-inextendible causal curve from p meets A}.
The union of D + (A) and D − (A), denoted by D(A), is called the domain of
dependence of A.
Remark 3-30. D + (A) and D − (A) are also known as the future and past Cauchy
developments of A respectively. Similarly, D(A) = D + (A) ∪ D − (A) is also
called the Cauchy development of A.
Remark 3-31. It follows from Definition II on p. 114 that an achronal set Σ is a
Cauchy hypersurface if and only if D(Σ) = S.
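As an illustration of Remark 3-31, the sketch below computes the domain of dependence of a hyperbola in two-dimensional Minkowski space, a standard example not taken from the text. The null coordinates u, v are notation introduced here; a causal curve is future-directed exactly when u and v are both nondecreasing along it.

```latex
% Illustrative example: S = R^{1,1}, null coordinates u = t - x, v = t + x.
% \Sigma = \{uv = 1,\ u > 0\} is the hyperbola t = \sqrt{1 + x^2}.
\[
  D^+(\Sigma) = \{uv \ge 1,\ u > 0\},
  \qquad
  D^-(\Sigma) = \{0 < uv \le 1,\ u > 0\},
\]
\[
  D(\Sigma) = \{u > 0,\ v > 0\} = I^+(0) \ne S.
\]
% The null line u = 0 is an inextendible causal curve that never meets
% \Sigma, so \Sigma is achronal, closed and edgeless but not a Cauchy
% hypersurface.
```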
The following facts relating D ± (A), I ± (A) and J ± (A) are easily checked:
A ⊂ D + (A) ⊂ A ∪ I + (A) ⊂ J + (A),
A ⊂ ∂ D + (A),
D + (A) ∩ I − (A) = ∅,
D + (A) ∩ D − (A) = A,
D(A) ∩ I + (A) = D + (A) \ A.
A basic feature about D + (A) is that information outside D + (A) traveling into
D + (A) must first pass through A.
Lemma 3-32. Suppose σ : [0, 1] → S is a past-directed causal curve with
σ(0) ∈ D^+(A) and σ(1) ∉ D^+(A); then σ(t) ∈ A for some t ∈ [0, 1).
Proof. Since σ(1) ∉ D^+(A), there is a past-inextendible causal curve β from
σ(1) that does not meet A. The union σ ∪ β is a past-inextendible causal curve
starting at σ(0) ∈ D^+(A). Hence, it must meet A somewhere on σ|_{[0,1)}. □
One of the important aspects of D(A) = D + (A) ∪ D − (A) is that its interior
int D(A), if nonempty, is an open set with appealing properties.
Lemma 3-33. Suppose q ∈ int D(A). If q ∈ D^+(A), then any past-inextendible
causal curve from q must meet I^-(A); similarly, if q ∈ D^-(A), any future-inextendible
causal curve from q must meet I^+(A).
Proof. It suffices to consider the case q ∈ D + (A). Let β : [0, b) → S be a past-
inextendible causal curve with β(0) = q. The fact q ∈ int D(A) implies that there
is a nearby point p_0 ∈ I^+(q) ∩ D(A). Let γ be a past-directed timelike curve from
p_0 to β(0). Repeating the construction in the proof of Lemma 3-26(i), one obtains
a past-inextendible timelike curve γ̃ starting from p_0 and a sequence of points
{p_k} on γ̃ such that β(k) ≪ p_k for all k ≥ 1. Since p_0 ∈ I^+(A) ∩ D(A) ⊂ D^+(A),
γ̃ meets A somewhere. Therefore, β(k) ∈ I^-(A) for large k. □
Remark 3-34. It follows from Lemma 3-33 that if p ∈ int D(A), then every
inextendible causal curve through p must meet both I − (A) and I + (A).
Lemma 3-35. The causality condition holds on int D(A), i.e., no causal loop
meets int D(A).
Proof. Suppose α is a causal loop passing through some p ∈ int D(A). Traveling along α
infinitely many times, one gets an inextendible causal curve α̃. By Lemma 3-33,
α̃ meets both I^+(A) and I^-(A). On the other hand, α̃ meets A since p ∈ D(A). Because
α̃ keeps retracing the same loop, this contradicts the achronality of A. □
Lemma 3-36. If p, q ∈ int D(A) and p ≤ q, then J + ( p) ∩ J − (q) ⊂ int D(A).
Proof. When q = p, J + (q) ∩ J − (q) = {q} by Lemma 3-35. Hence it suffices to
assume p < q. There are a few cases to consider:
Case 1: q, p ∈ D + (A) \ A = D(A) ∩ I + (A). In this case, points that are
close to q are still in D(A) ∩ I + (A). Let q + ∈ I + (q) be chosen such that
q + ∈ D(A) ∩ I + (A). Consider the open set U = I − (q + ) ∩ I + (A). Given any
past-directed causal curve α : [0, 1] → S from q to p, one has α(t) ∈ U , for all
t ∈ [0, 1].
We proceed to show that U ⊂ D^+(A). Suppose y ∈ U. Let σ : [0, 1] → S
be a past-directed timelike curve from q^+ to y. If y ∉ D^+(A), Lemma 3-32
implies σ(s) ∈ A for some s ∈ [0, 1). Since σ(1) = y ∈ I^+(A), this contradicts
the achronality of A. Hence, U ⊂ D^+(A).
Case 2: q ∈ D + (A) \ A = D(A) ∩ I + (A) and p ∈ D − (A) \ A = D(A) ∩ I − (A).
In this case, choose p − ∈ I − ( p) and q + ∈ I + (q) respectively such that p − ∈
D(A) ∩ I − (A) and q + ∈ D(A) ∩ I + (A). Any future-directed causal curve α
from p to q now is contained in the open set V = I + ( p − ) ∩ I − (q + ).
We conclude by showing that V ⊂ D(A). Suppose x ∈ V. Let γ : [0, 1] → S
and τ : [0, 1] → S be past-directed timelike curves from q^+ to x and from x
to p^- respectively. If x ∉ D(A) = D^+(A) ∪ D^-(A), then Lemma 3-32 implies
γ(s) ∈ A for some s ∈ [0, 1) and τ(t) ∈ A for some t ∈ (0, 1]. Again, this
contradicts the achronality of A. Hence, V ⊂ D(A).
Case 3: q ∈ D + (A) \ A = D(A) ∩ I + (A) and p ∈ A. The proof of this case is
identical to that of Case 2.
Case 4: q, p ∈ A. Again, this case can be proved in the same way as Case 2.
Any remaining case is dual to one of the cases above by reversing the time
orientation on S . This completes the proof. □

Results stronger than Lemmas 3-35 and 3-36 indeed hold on int D(A). Inter-
ested readers are referred to [174, Theorem 14.38] for a complete proof of the
following theorem.

Theorem 3-37. Let A ⊂ S be an achronal set. Then int D(A), if nonempty,
satisfies the following properties:

(i) The strong causality condition holds at every p ∈ int D(A).

(ii) Given p, q ∈ int D(A), if p ≤ q, then J^+(p) ∩ J^-(q) is compact and is
contained in int D(A).

Corollary 3-38. If a spacetime S has a Cauchy hypersurface, then S is globally
hyperbolic.

This follows from Theorem 3-37 and Remark 3-31. The corollary’s converse
is also true; see [104].

3.7. Cauchy horizons

We end this chapter with a brief introduction to Cauchy horizons. Although this
concept is not needed in the proof of the Penrose singularity theorem, it arises
naturally when an achronal set A is not a Cauchy hypersurface.

Definition 3-39. Suppose A ⊂ S is achronal. Its future Cauchy horizon H^+(A)
is defined as

H^+(A) = {p ∈ \overline{D^+(A)} : I^+(p) ∩ D^+(A) = ∅}.

The past Cauchy horizon H^-(A) is defined dually. The Cauchy horizon of A is
H(A) = H^+(A) ∪ H^-(A).

By definition, H^+(A) = \overline{D^+(A)} \ I^-(D^+(A)). Therefore, H^+(A) is closed.
Moreover, I^+(p) ∩ D^+(A) = ∅ implies I^+(p) ∩ \overline{D^+(A)} = ∅ as I^+(p) is open.
Hence, H^+(A) is achronal. By Proposition 3-23, H^+(A) \ edge(H^+(A)), if
nonempty, is always a C^0 hypersurface.

Lemma 3-40. If A is a closed achronal set, then

\overline{D^+(A)} = {q : every past-inextendible timelike curve from q meets A}.

Consequently, \overline{D^+(A)} ⊂ A ∪ I^+(A).

Proof. Given q ∈ \overline{D^+(A)}, suppose there is a past-inextendible timelike curve
σ : [0, b) → S with σ(0) = q such that σ does not meet A. As A is closed,
S \ A is open. Let p = σ(s) for some s ∈ (0, b); then q ∈ I^+(p, S \ A). Hence,
there exists q̃ ∈ D^+(A) ∩ I^+(p, S \ A). Note that σ : [s, b) → S is a past-
inextendible timelike curve starting at p that does not meet A. By Lemma 3-26,
there exists a past-inextendible timelike curve β starting at q̃ that does not meet A,
contradicting the fact that q̃ ∈ D^+(A).
Next, suppose q is a point such that every past-inextendible timelike curve
from q meets A. Suppose q ∈ S \ \overline{D^+(A)}. Take q^- ∈ I^-(q, S \ \overline{D^+(A)}). Since
q^- ∉ D^+(A), there is a past-inextendible causal curve starting from q^- that does
not meet A. By Lemma 3-26, there exists a past-inextendible timelike curve
starting at q that does not meet A, which is a contradiction. □

Proposition 3-41. If A is a closed achronal set, then ∂D^+(A) = A ∪ H^+(A).

Proof. It suffices to show ∂D^+(A) ⊂ A ∪ H^+(A). To do so, suppose
p ∈ \overline{D^+(A)} \ (A ∪ H^+(A)). The fact p ∈ \overline{D^+(A)} \ A implies p ∈ I^+(A) by
Lemma 3-40. The fact p ∈ \overline{D^+(A)} \ H^+(A) implies there exists q ∈ I^+(p) ∩ D^+(A).
Consider the open set U = I^-(q) ∩ I^+(A), which contains p. Given any y ∈ U, there
exists a past-directed timelike curve σ from q to y. If y ∉ D^+(A), Lemma 3-32
implies σ must meet A before y, contradicting the fact that y ∈ I^+(A). Therefore
y ∈ D^+(A) and hence U ⊂ int D^+(A). In particular, p must be an interior point
of D^+(A). This shows ∂D^+(A) ⊂ A ∪ H^+(A). □
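Proposition 3-41 can be checked by hand in a simple model case. The sketch below uses a closed segment in two-dimensional Minkowski space; the choice of A is an assumption made here for illustration and does not come from the text.

```latex
% Illustrative example: S = R^{1,1} with coordinates (t,x),
% A = \{(0,x) : |x| \le 1\}, a closed achronal set.
\[
  D^+(A) = \{(t,x) : 0 \le t \le 1 - |x|\},
  \qquad
  H^+(A) = \{(t,x) : t = 1 - |x|,\ |x| \le 1\}.
\]
% D^+(A) is a closed triangle; its topological boundary consists of the
% base A and the two null sides, which together form H^+(A):
\[
  \partial D^+(A) = A \cup H^+(A).
\]
% The corner points (0, \pm 1) belong to both A and H^+(A).
```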

Exercises

We assume that (S , ⟨ · , · ⟩) is a connected, time-oriented Lorentzian manifold.

Exercise 3-42. Let α : [0, 1] → S be a smooth causal curve segment. Let ε > 0
and x : (−ε, ε) × [0, 1] → S be a smooth variation of α, i.e., x(0, t) = α(t) for
all t ∈ [0, 1]. Let V = (∂ x/∂s)|s=0 be the variation vector field along α, with
covariant derivative V ′ (t) along α. Suppose for all t ∈ [0, 1], ⟨V ′ (t), α ′ (t)⟩ < 0.
Show that there is ε0 > 0 small enough so that for all 0 < s < ε0 , the curve
xs : [0, 1] → S with xs (t) = x(s, t) is timelike. Where could the strict inequality
be relaxed to ⟨V ′ (t), α ′ (t)⟩ ≤ 0?

Exercise 3-43. Suppose p, q ∈ S, and γ : [0, 1] → S is a smooth causal curve
from p to q, with γ′(0) or γ′(1) timelike. Show that γ can be deformed to a
nearby timelike curve with a fixed endpoint variation x : (−ε, ε) × [0, 1] → S.
(Hint: Start by considering the vector field along γ obtained by parallel transport
of a timelike velocity vector at an endpoint.)
Remark. A more general result holds [174, Proposition 10.46]: If γ is a piecewise
smooth causal curve from p to q which is not a smooth null geodesic, then γ
can be deformed to a nearby timelike curve with a fixed endpoint variation.
Exercise 3-44. Suppose p ∈ S and Ũ ⊂ T_pS and U ⊂ S are open sets for which
the exponential map exp_p : Ũ → U is a diffeomorphism. Suppose γ̃ : [0, 1] →
Ũ ⊂ T_pS is a piecewise smooth curve with γ̃(0) = 0 ∈ T_pS. If γ = exp_p ∘ γ̃
is a future-pointing timelike curve in S, prove that γ̃((0, 1]) lies in the future
timecone of T_pS. (Hint: Gauss lemma.)
Exercise 3-45. Suppose A is an achronal set that contains no edge points. Show
that I (A) := I − (A) ∪ A ∪ I + (A) is an open set.
Exercise 3-46. Suppose S has the property that for all p ∈ S , J + ( p) and J − ( p)
are closed sets. Show that if K is compact, then J + (K ) is closed.
Exercise 3-47. Suppose S has a noncompact Cauchy hypersurface. Prove that
for any A ⊂ S , the topological boundary ∂ J + (A), if nonempty, is noncompact.
(Hint: Compare ∂ J + (A) and ∂ I + (A).)
CHAPTER 4

The Penrose singularity theorem

This chapter presents the classical Penrose singularity theorem [177]. The main
ingredients in the proof concern, on the one hand, the causal structure of a
globally hyperbolic spacetime, discussed in the previous chapter, and on the
other, differential geometry techniques involving Jacobi fields together with the
Riccati and Raychaudhuri equations.
This chapter is organized as follows. In Section 4.1, we review the concept of
Jacobi fields and focal points of a spacelike submanifold. In Section 4.2, we give
a geometric formulation of the Riccati and Raychaudhuri equations along causal
geodesics. In Section 4.3, we state and prove the Penrose singularity theorem.
Throughout this chapter, S denotes an (n+1)-dimensional spacetime.

4.1. Jacobi fields and focal points

In general relativity, a freely falling observer is represented by a future-directed
timelike geodesic. A family of such nearby observers can thus be described by a
map F : (−ϵ, ϵ) × (a, b) → S such that for each |s| < ϵ, the curve γ_s(·) = F(s, ·)
is a (timelike) geodesic. The position of these observers relative to a given γ_s
is measured by the tangent vector V = ∂F/∂s along γ_s. Differentiating (or
linearizing) the geodesic equation ∇_{γ_s′} γ_s′ = 0, and using ∇_V γ_s′ = ∇_{γ_s′} V
together with [V, γ_s′] = 0, one obtains

0 = ∇_V(∇_{γ_s′} γ_s′) = ∇_V(∇_{γ_s′} γ_s′) − ∇_{γ_s′}(∇_V γ_s′) + ∇_{γ_s′}(∇_{γ_s′} V)
  = V′′ + R(V, γ_s′)γ_s′,    (4.1.1)

where ∇ denotes covariant differentiation in S, V′′ = ∇_{γ_s′}(∇_{γ_s′} V) and

R(X, Y)Z = ∇_X ∇_Y Z − ∇_Y ∇_X Z − ∇_{[X,Y]} Z.

Thinking in the Newtonian regime, one tends to interpret −R(V, γ_s′)γ_s′ as a
certain “force” acting on V. Thus the map V ↦ −R(V, γ_s′)γ_s′ is often known as
the tidal force operator.
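To get a feel for the tidal force operator, the following sketch solves the linearized equation V′′ + R(V, γ′)γ′ = 0 in a model case that the text does not treat here: a spacetime of constant sectional curvature κ. The curvature formula below is the standard one for constant curvature, written in the sign convention for R stated above; the constant κ, the unit-speed normalization and the parallel field E are assumptions of the example.

```latex
% Assumption: constant sectional curvature \kappa, i.e.
\[
  R(X,Y)Z = \kappa\bigl(\langle Y,Z\rangle X - \langle X,Z\rangle Y\bigr).
\]
% Take \gamma timelike with \langle\gamma',\gamma'\rangle = -1 and V \perp \gamma'. Then
\[
  R(V,\gamma')\gamma'
  = \kappa\bigl(\langle\gamma',\gamma'\rangle V - \langle V,\gamma'\rangle\gamma'\bigr)
  = -\kappa V,
  \qquad\text{so}\qquad
  V'' - \kappa V = 0.
\]
% With V(0) = 0 and E a parallel unit field along \gamma orthogonal to \gamma':
\[
  V(t) =
  \begin{cases}
    \kappa^{-1/2}\sinh(\sqrt{\kappa}\,t)\,E(t), & \kappa > 0,\\[2pt]
    t\,E(t), & \kappa = 0,\\[2pt]
    |\kappa|^{-1/2}\sin(\sqrt{|\kappa|}\,t)\,E(t), & \kappa < 0.
  \end{cases}
\]
% For \kappa > 0 nearby observers separate exponentially (repulsive tidal
% force); for \kappa < 0 they reconverge at t = \pi/\sqrt{|\kappa|}.
```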
A submanifold P ⊂ S is called spacelike if every tangent vector to P is
spacelike.

We let II(·, ·) : T_pP × T_pP → (T_pP)^⊥ be the second fundamental form of
P at p ∈ P, i.e., II(v, w) = (∇_v w)^⊥, where ⊥ denotes orthogonal projection to
(T_pP)^⊥, the orthogonal complement of T_pP (see Lemma 5-3).
Suppose the observers {γs } above originate from a given spacelike submanifold
P, i.e. γs (0) ∈ P and γs′ (0) ⊥ P, for all s. Then V (0) ∈ Tγs (0) P, and V ′ (0)
satisfies (apply (2.3.8) to the map F)

⟨V ′ (0), w⟩ = ⟨∇γs′ (0) V, w⟩ = ⟨∇V γs′ (0), w⟩ = −⟨γs′ (0), II(V (0), w)⟩, (4.1.2)

for all w ∈ Tγs (0) P.


Equations (4.1.1) and (4.1.2) suggest the following definitions. Given a
geodesic γ : [0, b] → S , a vector field V along γ is called a Jacobi field if

V ′′ + R(V, γ ′ )γ ′ = 0. (4.1.3)

When γ is orthogonal to a spacelike submanifold P, that is, if γ(0) ∈ P and
γ′(0) ⊥ P, a point γ(t), t > 0, is called a focal point of P along γ, provided
there exists a nontrivial Jacobi field V along γ such that V(t) = 0 and

V(0) ∈ T_{γ(0)}P,
⟨V′(0), w⟩ = −⟨γ′(0), II(V(0), w)⟩ for all w ∈ T_{γ(0)}P.    (4.1.4)

Heuristically, if q = γ (t) is a focal point of P along a timelike γ , there exist
nearby freely falling observers that start from P and (almost) meet γ at q. When
P consists of a single point, focal points of P are conjugate points of γ (0).
The following basic result concerning the deformation of causal curves is
needed in the proof of the Penrose singularity theorem:
Theorem 4-1 [174, Theorem 10.51]. Let P be a spacelike submanifold in a
spacetime S . Given p ∈ P and q ∈ S , let α be a causal curve from p to q. There
exists a timelike curve from p to q which is arbitrarily near α unless α is a null
geodesic (when suitably parametrized) orthogonal to P along which there are
no focal points of P before q.

4.2. Riccati and Raychaudhuri equations

In this section, we present a geometric description of the Riccati and Raychaudhuri
equations along causal geodesics.

4.2.1. Geometric Riccati equation. Let P ⊂ S be a spacelike submanifold.
Consider a causal geodesic

γ : [0, L) → S, with γ(0) = p ∈ P and γ′(0) ⊥ P.


The set Ṽ of Jacobi fields along γ satisfying the initial condition (4.1.4) is a
vector space of dimension n + 1. Consider the subspace V ⊂ Ṽ given by

V = {J ∈ Ṽ : J ⊥ γ ′ along γ }.

Besides (4.1.4), elements in V are characterized by an additional initial condition
J′(0) ⊥ γ′(0). Hence, V has dimension n.
In what follows, it is always assumed that γ(t) is not a focal point of P along
γ, for all t ∈ (0, L). At each γ(t), there is a well-defined map B(t) : V → γ′(t)^⊥,
given by B(t)(J) = J(t). Here γ′(t)^⊥ = {v ∈ T_{γ(t)}S : ⟨v, γ′(t)⟩ = 0}. Since
γ(t) is not a focal point, B(t) is a linear isomorphism and B(t)^{-1}(v) is the
unique Jacobi field J ∈ V such that J(t) = v. The map A(t) : γ′(t)^⊥ → γ′(t)^⊥
defined by

A(t)(v) = [B(t)^{-1}(v)]′(t),    (4.2.1)

where the prime in (4.2.1) denotes covariant differentiation by γ′(t), satisfies
(from the definition)

A(t)(J(t)) = J′(t) for all J ∈ V.    (4.2.2)

A(t) is therefore a smooth section of the vector bundle over γ with fiber
End(γ ′ (t)⊥ , γ ′ (t)⊥ ), the space of linear maps from γ ′ (t)⊥ to itself.
Now let A′ (t) = ∇γ ′ (t) A(t). By definition,

A′ (t)(W (t)) = ∇γ ′ (t) [A(t)(W (t))] − A(t)(∇γ ′ (t) W (t))

for any smooth vector field W (t) along γ such that W (t) ∈ γ ′ (t)⊥ . Taking
W (t) = J (t) ∈ V and applying (4.2.2), we have
A′(t)(J(t)) = ∇_{γ′(t)}[A(t)(J(t))] − A(t)(∇_{γ′(t)} J(t))
            = J′′(t) − A(t)(J′(t))
            = −R(J(t), γ′(t))γ′(t) − A(t) ∘ A(t)(J(t)).    (4.2.3)

The following proposition follows directly from (4.2.3).


Proposition 4-2. The linear maps A(t) : γ ′ (t)⊥ → γ ′ (t)⊥ defined in (4.2.1)
satisfy a Riccati equation of the form

A′ (t)( · ) = −R( · , γ ′ )γ ′ − A(t) ◦ A(t)( · ). (4.2.4)

With respect to the spacetime metric ⟨ · , · ⟩, A(t) satisfies a further relation:


Proposition 4-3. For all J1 , J2 ∈ V,

⟨ A(t)(J1 (t)), J2 (t)⟩= ⟨ J1 (t), A(t)(J2 (t))⟩. (4.2.5)


Proof. By (4.1.4),

lim_{t↘0} ⟨A(t)(J_i(t)), J_j(t)⟩ = ⟨J_i′(0), J_j(0)⟩ = −⟨γ′(0), II(J_i(0), J_j(0))⟩

for i, j ∈ {1, 2}. Since II is symmetric, it follows that

lim_{t↘0} ⟨A(t)(J_1(t)), J_2(t)⟩ = lim_{t↘0} ⟨J_1(t), A(t)(J_2(t))⟩.    (4.2.6)

On the other hand, (4.1.3) implies

(d/dt) [⟨A(t)(J_1(t)), J_2(t)⟩ − ⟨J_1(t), A(t)(J_2(t))⟩] = 0.    (4.2.7)

Equation (4.2.5) follows from (4.2.6) and (4.2.7). □
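For readers who want the step from (4.1.3) to (4.2.7) spelled out, here is a short derivation. It uses A(t)(J_i(t)) = J_i′(t) from (4.2.2) and the pair symmetry ⟨R(X,Y)Z, W⟩ = ⟨R(Z,W)X, Y⟩ of the curvature tensor, which is a standard identity assumed here rather than quoted from this section.

```latex
% Write A(t)J_i(t) = J_i'(t) by (4.2.2), so the bracket in (4.2.7) is
% \langle J_1', J_2\rangle - \langle J_1, J_2'\rangle.
\begin{align*}
  \frac{d}{dt}\bigl[\langle J_1', J_2\rangle - \langle J_1, J_2'\rangle\bigr]
  &= \langle J_1'', J_2\rangle + \langle J_1', J_2'\rangle
     - \langle J_1', J_2'\rangle - \langle J_1, J_2''\rangle \\
  &= -\langle R(J_1,\gamma')\gamma', J_2\rangle
     + \langle J_1, R(J_2,\gamma')\gamma'\rangle
     && \text{by (4.1.3)} \\
  &= 0 && \text{by pair symmetry of } R.
\end{align*}
```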

4.2.2. Raychaudhuri equation along timelike geodesics. If γ′(t) is timelike,
the restriction of ⟨·, ·⟩ to γ′(t)^⊥ is positive definite. In this case, Proposition 4-3
shows A(t) is self-adjoint with respect to ⟨·, ·⟩. Let h(t) : γ′(t)^⊥ × γ′(t)^⊥ → R
be the associated symmetric bilinear form: h(t)(v, w) = ⟨A(t)(v), w⟩. Define

θ(t) = tr_{γ′(t)^⊥} A(t) = tr_{γ′(t)^⊥} h(t),    (4.2.8)

where tr_{γ′(t)^⊥}(·) denotes the trace on γ′(t)^⊥. It is easily checked that

(d/dt) tr_{γ′(t)^⊥} A(t) = tr_{γ′(t)^⊥} A′(t).    (4.2.9)

With Proposition 4-2, this shows that θ(t) satisfies the following Raychaudhuri
equation.

Proposition 4-4. When γ is timelike, θ(t) defined in (4.2.8) obeys

θ′(t) = −Ric(γ′, γ′) − (1/n) θ(t)^2 − |h̊(t)|^2.    (4.2.10)

Here Ric(·, ·) denotes the Ricci curvature of S, h̊(t) is the traceless part of h(t)
and |·| is the norm taken on γ′(t)^⊥.

Proof. Taking the trace of (4.2.4) and using (4.2.9), one has

θ′(t) = −Ric(γ′, γ′) − tr_{γ′(t)^⊥}(A(t) ∘ A(t)),

where tr_{γ′(t)^⊥}(A(t) ∘ A(t)) = |h(t)|^2 = (1/n) θ(t)^2 + |h̊(t)|^2. This proves (4.2.10). □
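The Raychaudhuri equation is typically used through a focusing estimate. The sketch below records the standard argument under an extra hypothesis that is not assumed in the statement above, namely Ric(γ′, γ′) ≥ 0 along γ; the symbols t_0 and θ_0 are introduced here for illustration.

```latex
% Assumption (not part of Proposition 4-4): Ric(\gamma',\gamma') \ge 0.
% Then (4.2.10) gives the differential inequality
\[
  \theta'(t) \le -\tfrac{1}{n}\,\theta(t)^2 .
\]
% Suppose \theta(t_0) = \theta_0 < 0. Then \theta is nonincreasing, so
% \theta(t) < 0 for t \ge t_0, and
\[
  \frac{d}{dt}\Bigl(\frac{1}{\theta}\Bigr) = -\frac{\theta'}{\theta^2} \ge \frac{1}{n}
  \quad\Longrightarrow\quad
  \frac{1}{\theta(t)} \ge \frac{1}{\theta_0} + \frac{t - t_0}{n}.
\]
% Since 1/\theta(t) < 0, this forces t - t_0 < n/|\theta_0|. Hence \theta
% cannot stay finite on the whole interval [t_0,\, t_0 + n/|\theta_0|]:
% if \gamma extends that far, \theta(t) \to -\infty at some
% t \le t_0 + n/|\theta_0|.
```

Since θ(t) is only defined while no focal point of P has occurred, the blow-up signals a focal point along γ within the stated parameter range.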

The physical meaning of θ(t) can be seen as follows. Let J_1, ..., J_n be n
Jacobi fields along γ such that {J_1, ..., J_n} forms a basis of V. For t > 0, let
g_{ij}(t) = ⟨J_i(t), J_j(t)⟩ and let (g^{ij}(t))_{n×n} be the inverse matrix of (g_{ij}(t))_{n×n}.
By definition and Proposition 4-3,

θ(t) = Σ_{1≤i,j≤n} g^{ij}(t) ⟨J_i′(t), J_j(t)⟩
     = (1/2) Σ_{1≤i,j≤n} g^{ij}(t) (d/dt)⟨J_i(t), J_j(t)⟩
     = (1/√det(g_{ij}(t))) (d/dt) √det(g_{ij}(t)).    (4.2.11)

Heuristically, (4.2.11) suggests that θ(t) represents the rate of the expansion of
the spatial world γ′(t)^⊥ as measured by the (nearby) observers J_1, ..., J_n.
When P is a spacelike hypersurface, lim_{t↘0} θ(t) has an explicit geometric
meaning.

Proposition 4-5. Suppose P ⊂ S is a spacelike hypersurface. Then

lim_{t↘0} θ(t) = −⟨γ′(0), H⃗⟩,

where H⃗ is the mean curvature vector of P at γ(0).

Proof. Let {e_1, ..., e_n} ⊂ T_{γ(0)}P be an orthonormal frame. By (4.1.4), there
exists {J_1, ..., J_n} ⊂ V satisfying J_i(0) = e_i and

⟨J_i′(0), w⟩ = −⟨γ′(0), II(e_i, w)⟩, for all w ∈ T_{γ(0)}P.    (4.2.12)

Clearly, {J_1, ..., J_n} forms a basis of V. Let g_{ij}(t) and g^{ij}(t) be given as in
(4.2.11). Since g^{ij}(0) = δ^{ij}, by (4.2.11) and (4.2.12),

lim_{t↘0} θ(t) = Σ_{1≤i,j≤n} δ^{ij} ⟨J_i′(0), J_j(0)⟩ = −Σ_{i=1}^{n} ⟨γ′(0), II(e_i, e_i)⟩
               = −⟨γ′(0), H⃗⟩. □
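As a consistency check of (4.2.10), (4.2.11) and Proposition 4-5, one can work out a flat example by hand. The sketch below takes P to be a hyperboloid in Minkowski space; this choice, the radius r_0 and the parallel frame E_i are assumptions introduced here for illustration.

```latex
% Illustrative example: S = R^{n,1}, P = \{X : \langle X,X\rangle = -r_0^2,\ t > 0\}.
% The future unit normal at X \in P is \nu = X/r_0, and the normal geodesic is
\[
  \gamma(s) = X + s\,\nu = \bigl(1 + \tfrac{s}{r_0}\bigr) X .
\]
% For tangent fields v, w on P one finds II(v,w) = \tfrac{1}{r_0}\langle v,w\rangle\,\nu.
% Jacobi fields satisfying (4.1.4), with E_i parallel and E_i(0) = e_i orthonormal:
\[
  J_i(s) = \bigl(1 + \tfrac{s}{r_0}\bigr) E_i(s),
  \qquad
  g_{ij}(s) = \bigl(1 + \tfrac{s}{r_0}\bigr)^2 \delta_{ij},
  \qquad
  \sqrt{\det g_{ij}(s)} = \bigl(1 + \tfrac{s}{r_0}\bigr)^n .
\]
% Formula (4.2.11) then gives
\[
  \theta(s) = \frac{n}{r_0 + s},
  \qquad
  \theta'(s) = -\frac{n}{(r_0+s)^2} = -\frac{1}{n}\,\theta(s)^2 ,
\]
% which is (4.2.10) with Ric = 0 and vanishing traceless part. Moreover
\[
  \lim_{s \searrow 0} \theta(s) = \frac{n}{r_0}
  = -\Bigl\langle \nu,\ \tfrac{n}{r_0}\,\nu \Bigr\rangle
  = -\langle \gamma'(0), \vec H\rangle ,
\]
% in agreement with Proposition 4-5.
```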

4.2.3. Raychaudhuri equation along null geodesics. When γ ′ (t) is null, the
restriction of ⟨ · , · ⟩ to γ ′ (t)⊥ is degenerate, since γ ′ (t) ∈ γ ′ (t)⊥ . The next two
lemmas are left as exercises.

Lemma 4-6. If γ′(t) is null, every v ∈ γ′(t)^⊥ which is not a scalar multiple of
γ′(t) is spacelike.
Lemma 4-7. Ric(γ′(t), γ′(t)) = Σ_{i=1}^{n−1} ⟨R(e_i, γ′(t))γ′(t), e_i⟩ for any collection
of vectors {e_1, ..., e_{n−1}} ⊂ γ′(t)^⊥ satisfying ⟨e_i, e_j⟩ = δ_{ij}.
Lemma 4-6 suggests one should define an equivalence relation ∼ in γ′(t)^⊥
as follows: given v, w ∈ γ′(t)^⊥,

v ∼ w if and only if (v − w) ∥ γ′(t).

The spacetime metric ⟨·, ·⟩ descends to a positive definite metric ⟨·, ·⟩_∼ on
the quotient space γ′(t)^⊥/∼, where

⟨[v], [w]⟩_∼ := ⟨v, w⟩.

Here [v], [w] denote the equivalence classes containing v, w. The following
facts hold about γ′^⊥/∼:
(a) The tidal force operator −R(·, γ′(t))γ′(t) : γ′(t)^⊥ → γ′(t)^⊥ descends to
a linear transformation −R̃(t)(·) on γ′(t)^⊥/∼, since R(γ′, γ′)γ′ = 0. Here
R̃(t)([v]) := [R(v, γ′(t))γ′(t)].
(b) The vector field tγ′(t) is an element in V, hence A(t)(γ′(t)) = t^{-1}γ′(t).
Therefore, A(t) descends to a linear transformation Ã(t) on γ′(t)^⊥/∼,
where Ã(t)([v]) := [A(t)(v)]. By Proposition 4-3,

Ã(t) : γ′(t)^⊥/∼ → γ′(t)^⊥/∼

is self-adjoint with respect to ⟨·, ·⟩_∼.

(c) γ ′⊥ / ∼ is an (n−1)-dimensional vector bundle over γ , on which there is a
connection ∇˜ induced from ∇. Precisely, ∇˜ γ ′ (t) [V ] = [∇γ ′ (t) V ] where V is
any smooth vector field along γ with V (t) ∈ γ ′ (t)⊥ . Define

Ã′ (t)([V ]) := ∇˜ γ ′ (t) [ Ã(t)([V ])] − Ã(t)(∇˜ γ ′ (t) [V ]).

By Proposition 4-2, Ã′ (t) satisfies

Ã′ (t)( · ) = − R̃(t)( · ) − Ã(t) ◦ Ã(t)( · ). (4.2.13)

As in the previous case, consider the associated symmetric bilinear form h̃(t) :
γ ′ (t)⊥ / ∼ × γ ′ (t)⊥ / ∼ → R where h̃(t)([v], [w]) = ⟨ Ã(t)([v]), [w]⟩∼ .
 

Define
θ̃ (t) = trγ ′ (t)⊥ /∼ Ã(t) = trγ ′ (t)⊥ /∼ h̃(t), (4.2.14)

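where $\operatorname{tr}_{\gamma'(t)^\perp/\sim}(\,\cdot\,)$ is the trace on $\gamma'(t)^\perp/\!\sim$. Similar to (4.2.9), one has
$$\frac{d}{dt}\tilde\theta(t) = \operatorname{tr}_{\gamma'(t)^\perp/\sim} \tilde A'(t). \tag{4.2.15}$$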
The following Raychaudhuri equation for θ̃ (t) follows from Lemma 4-7, (4.2.15)
and taking the trace of (4.2.13).

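Proposition 4-8. When γ(t) is null, $\tilde\theta(t)$ defined in (4.2.14) obeys
$$\tilde\theta'(t) = -\operatorname{Ric}(\gamma', \gamma') - \frac{1}{n-1}\,\tilde\theta(t)^2 - \bigl|\mathring{\tilde h}(t)\bigr|_\sim^2 .$$
Here $\mathring{\tilde h}(t)$ is the traceless part of $\tilde h(t)$ and $|\cdot|_\sim$ is the norm taken on $\gamma'(t)^\perp/\!\sim$.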
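For convenience, here is a sketch of how taking the trace of (4.2.13) produces the three terms in the Raychaudhuri equation. It assumes only what is stated above: self-adjointness of $\tilde A(t)$, Lemma 4-7, and the fact that the quotient has dimension $n-1$.

```latex
% All traces and norms are taken on \gamma'(t)^\perp/\sim, which has dimension n-1.
\begin{align*}
\tilde\theta'(t) &= \operatorname{tr}\tilde A'(t)
   = -\operatorname{tr}\tilde R(t) - \operatorname{tr}\bigl(\tilde A(t)\circ\tilde A(t)\bigr)
   && \text{by (4.2.15) and (4.2.13)},\\
\operatorname{tr}\tilde R(t) &= \sum_{i=1}^{n-1}\langle R(e_i,\gamma')\gamma', e_i\rangle
   = \operatorname{Ric}(\gamma',\gamma')
   && \text{by Lemma 4-7},\\
\operatorname{tr}(\tilde A\circ\tilde A) &= |\tilde h|_\sim^2
   = \Bigl|\mathring{\tilde h} + \tfrac{\tilde\theta}{n-1}\langle\cdot,\cdot\rangle_\sim\Bigr|_\sim^2
   = |\mathring{\tilde h}|_\sim^2 + \tfrac{1}{n-1}\,\tilde\theta^{\,2}
   && \text{(\(\tilde A\) self-adjoint, \(\mathring{\tilde h}\) traceless)}.
\end{align*}
```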

Similarly to Proposition 4-5, when P is a codimension-2 spacelike submanifold, $\lim_{t \searrow 0} \tilde\theta(t)$ has an explicit geometric meaning.
Proposition 4-9. Suppose P is a codimension-2 spacelike submanifold in S. Then
$$\lim_{t \searrow 0} \tilde\theta(t) = -\langle \gamma'(0), \vec H \rangle ,$$
where $\vec H$ is the mean curvature vector of P at γ(0).


Proof. Let {e1 , . . . , en−1 } ⊂ Tγ (0) P be an orthonormal frame. By (4.1.4), there
exists {J1 , . . . , Jn−1 } ⊂ V satisfying Ji (0) = ei and

⟨ Ji′ (0), w⟩= −⟨γ ′ (0), II(ei , w)⟩, for all w ∈ Tγ (0) P. (4.2.16)

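Clearly, $J_1, \dots, J_{n-1}$ together with $J_n(t) = t\gamma'(t)$ form a basis of V. Hence, $\{[J_1(t)], \dots, [J_{n-1}(t)]\}$ is a basis of $\gamma'(t)^\perp/\!\sim$. For $\alpha, \beta \le n-1$, let
$$g_{\alpha\beta}(t) = \langle [J_\alpha(t)], [J_\beta(t)] \rangle_\sim$$
and let $(g^{\alpha\beta}(t))_{(n-1)\times(n-1)}$ be the inverse matrix of $(g_{\alpha\beta}(t))_{(n-1)\times(n-1)}$. Applying (4.2.16) and the fact that $g_{\alpha\beta}(0) = \delta_{\alpha\beta}$, one concludes that
$$\lim_{t \searrow 0} \tilde\theta(t) = \lim_{t \searrow 0} \sum_{1 \le \alpha, \beta \le n-1} g^{\alpha\beta}(t)\,\langle J_\alpha'(t), J_\beta(t) \rangle = -\sum_{\alpha=1}^{n-1} \langle \gamma'(0), II(e_\alpha, e_\alpha) \rangle = -\langle \gamma'(0), \vec H \rangle . \qquad \square$$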

4.3. Proof of Penrose’s singularity theorem

In general relativity, the problem of how to define a singularity is rather difficult
(see [208; 103], for example). The approach adopted in the Penrose singularity
theorem is to diagnose the existence of singular behavior of spacetimes in terms
of the incompleteness of future null geodesics.
A future-directed causal geodesic β : [0, b) → S is called incomplete if b < ∞,
where [0, b) is the maximum interval on which β exists.
Definition 4-10. A spacetime S is said to be future null geodesically incomplete
if it contains an incomplete, future-directed null geodesic.

A key notion in the Penrose singularity theorem is that of a trapped surface (in 3 + 1 dimensions). In general, a closed (i.e., compact without boundary) codimension-2 spacelike submanifold $\Sigma \subset S$ is said to be (future) trapped if
$$\langle \nu, \vec H \rangle > 0 \tag{4.3.1}$$
for all future-directed null vectors ν that are normal to Σ. Here $\vec H$ denotes the mean curvature vector of Σ in S.

Theorem 4-11 (Penrose singularity theorem [177]). Let S be a globally hyperbolic spacetime with a noncompact Cauchy hypersurface satisfying the null energy condition, i.e. $\operatorname{Ric}(v, v) \ge 0$ for all null vectors v. If S contains a closed, trapped, codimension-2 spacelike submanifold Σ, then S is future null geodesically incomplete.

Proof. We argue by contradiction. Suppose that S is not future null geodesically incomplete; then every future-directed null geodesic in S is defined on [0, ∞). Given any $p \in \Sigma$, let $(T_p\Sigma)^\perp = \{v \in T_pS \mid v \perp T_p\Sigma\}$. Let $\exp_p(\,\cdot\,)$ be the exponential map on S at p. Since Σ is compact, there exists a finite open covering $\{U_i\}_{1 \le i \le m}$ of Σ such that, for $1 \le i \le m$,
• the closure $\overline{U}_i$ is compact and is contained in some open set $V_i \subset \Sigma$, and
• on $V_i$, there exist two linearly independent future-directed null vector fields, which we denote by $K_i^+$ and $K_i^-$.

It follows from (4.3.1) that there exists a constant δ > 0 such that
$$\langle K_i^\pm(q), \vec H(q) \rangle > \delta \tag{4.3.2}$$
for all $q \in \overline{U}_i$ and $i \in \{1, \dots, m\}$.


Now fix $i \in \{1, \dots, m\}$. For each $p \in U_i$, consider the future-directed null geodesics $\gamma_p^+(t) = \exp_p(tK_i^+(p))$ and $\gamma_p^-(t) = \exp_p(tK_i^-(p))$. By assumption, $\gamma_p^+$ and $\gamma_p^-$ are defined on [0, ∞). Suppose Σ has no focal point along $\gamma_p^+$ on the interval (0, L] for some L > 0. Let $\tilde\theta(t)$ be the quantity defined in Section 4.2.3 with γ(t) replaced by $\gamma_p^+(t)$. By Propositions 4-8 and 4-9, (4.3.2) and the null energy condition, $\tilde\theta(t)$ satisfies
$$\tilde\theta'(t) \le -\frac{1}{n-1}\,\tilde\theta(t)^2 \quad \text{for all } t \in (0, L] \tag{4.3.3}$$
and
$$\lim_{t \searrow 0} \tilde\theta(t) = -\langle K_i^+(p), \vec H(p) \rangle < -\delta . \tag{4.3.4}$$
It follows from (4.3.3), (4.3.4) and elementary calculus that $L < \frac{n-1}{\delta}$. Hence, we conclude that Σ must have a focal point along $\gamma_p^+$ in $\bigl(0, \frac{n-1}{\delta}\bigr]$. Similarly, Σ must also have a focal point along $\gamma_p^-$ in $\bigl(0, \frac{n-1}{\delta}\bigr]$.
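The "elementary calculus" step bounding L can be spelled out as follows. This is one standard way to integrate the differential inequality, offered as a sketch; it uses only (4.3.3) and (4.3.4).

```latex
% By (4.3.3), \tilde\theta' \le 0, so \tilde\theta is nonincreasing; with (4.3.4) this gives
% \tilde\theta(t) < -\delta < 0 on (0, L], so 1/\tilde\theta is well defined there.
\begin{align*}
\frac{d}{dt}\Bigl(\frac{1}{\tilde\theta}\Bigr) = -\frac{\tilde\theta'}{\tilde\theta^{2}}
   &\ge \frac{1}{n-1}
   && \text{on } (0, L], \text{ by (4.3.3)},\\
\frac{1}{\tilde\theta(L)} &\ge \lim_{t\searrow 0}\frac{1}{\tilde\theta(t)} + \frac{L}{n-1}
   > -\frac{1}{\delta} + \frac{L}{n-1}
   && \text{by (4.3.4)},\\
\frac{1}{\tilde\theta(L)} < 0 \quad &\Longrightarrow \quad L < \frac{n-1}{\delta}.
\end{align*}
```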


 

Next we consider the achronal boundary $\partial I^+(\Sigma)$. Since S is globally hyperbolic, $J^+(\Sigma)$ is closed by Proposition 3-16. This implies $J^+(\Sigma) = \overline{I^+(\Sigma)}$, by Lemma 3-9. Therefore, $\partial I^+(\Sigma) = J^+(\Sigma) \setminus I^+(\Sigma)$. Suppose $q \in \partial I^+(\Sigma) \setminus \Sigma$. By Theorem 4-1, there is a future-directed null geodesic β emanating from some $p \in \Sigma$ and ending at q such that β is orthogonal to Σ at p and Σ has no focal points along β before q. Let $i \in \{1, \dots, m\}$ be such that $p \in U_i$. Then β′(0) is a positive constant multiple of either $K_i^+(p)$ or $K_i^-(p)$. Reparametrizing β, we can assume that $\beta = \gamma_p^+$ or $\beta = \gamma_p^-$. Since Σ has no focal point along β before q, q must lie in the set $W_i = W_i^+ \cup W_i^-$, where
$$\begin{aligned}
W_i^+ &:= \bigl\{ \exp_p(tK_i^+(p)) : p \in \overline{U}_i,\ t \in \bigl[0, \tfrac{n-1}{\delta}\bigr] \bigr\},\\
W_i^- &:= \bigl\{ \exp_p(tK_i^-(p)) : p \in \overline{U}_i,\ t \in \bigl[0, \tfrac{n-1}{\delta}\bigr] \bigr\}.
\end{aligned} \tag{4.3.5}$$
Since q is arbitrary, we have $\partial I^+(\Sigma) \subset W$, where $W = \bigcup_{j=1}^m W_j$ and each $W_j$ is defined like $W_i$ just above. Since W is compact and $\partial I^+(\Sigma)$ is closed, $\partial I^+(\Sigma)$ must also be compact.
Now, by Lemma 3-19(ii) and Corollary 3-24, $\partial I^+(\Sigma)$ is an achronal, $C^0$ hypersurface. Since $\partial I^+(\Sigma)$ is also compact, Proposition 3-27(ii) implies that any Cauchy hypersurface in S is homeomorphic to $\partial I^+(\Sigma)$, and therefore must be compact. This contradicts the assumption that S has a noncompact Cauchy hypersurface, completing the proof. □
CHAPTER 5

The Einstein constraint equations

5.1. Introduction

Many physical models admit an initial value formulation. In Newtonian mechanics, for instance, if we suppose the force is a function of the positions and
velocities of the various particles under observation, then Newton’s second law
gives a system of ordinary differential equations which in principle will yield the
evolution of the system once the initial positions and velocities of the particles
are specified. The wave and heat equations also admit initial value problems.
Maxwell’s equations likewise admit an initial value formulation, but unlike the
previous examples, where the initial configuration is essentially unconstrained,
one cannot arbitrarily prescribe the electric and magnetic field at t = 0, say,
and hope to solve Maxwell’s equations. The reason is that parts of Maxwell’s
equations do not involve time derivatives: the spatial divergence of B vanishes,
as does that of E (in regions with vanishing charge distribution). These two
divergence constraints place restrictions on the vector fields one can use to
prescribe initial data. As it turns out, these are the only constraints.
In this chapter, we discuss an initial value formulation for the vacuum Einstein
equation. A vacuum initial data set will be given geometrically as a manifold Σ,
endowed with a Riemannian metric g and symmetric (0, 2)-tensor K . That Σ
embeds as a hypersurface in a Lorentzian manifold (M, ḡ) satisfying the vacuum
Einstein equation, with induced metric g and second fundamental form K ,
imposes constraints on g and K . These conditions on g and K , which govern the
space of allowable initial data sets for the vacuum Einstein equation, comprise the
Einstein constraint equations, the study of solutions to which forms an interesting
and rich subject for geometric analysis.

5.1.1. Initial value formulation for Maxwell’s equations. We briefly discuss the initial value problem for the source-free Maxwell equations on Minkowski spacetime $\mathbb{M}^4$. For simplicity we take c = 1, so $x^0 = t$. We have seen that the Maxwell equations can be written in terms of the Faraday tensor F, an antisymmetric (2, 0)-tensor with corresponding two-form $F^\flat$. In inertial coordinates we write

$$F^\flat = (E_1\,dx^1 + E_2\,dx^2 + E_3\,dx^3) \wedge dx^0 + (B_1\,dx^2 \wedge dx^3 + B_2\,dx^3 \wedge dx^1 + B_3\,dx^1 \wedge dx^2).$$

Maxwell’s equations can then be written

divη F = 0 and dF ♭ = 0.

From the second equation and the Poincaré lemma (see, e.g., [141]), we can write $F^\flat = dA$ for a one-form $A = A_\mu\,dx^\mu$, well-defined up to a gauge function φ: $F^\flat = d(A + d\varphi)$ for any smooth φ. If $\vec\nabla$ is the spatial (Euclidean) gradient operator, and if we let
$$A = A_1 \frac{\partial}{\partial x^1} + A_2 \frac{\partial}{\partial x^2} + A_3 \frac{\partial}{\partial x^3},$$
then we can make the identifications
$$B = \vec\nabla \times A, \qquad E = \vec\nabla A_0 - \frac{\partial A}{\partial t}.$$
Note that the spatial divergence vanishes automatically: $\vec\nabla \cdot B = 0$. The spatial divergence of E is given by
$$\vec\nabla \cdot E = \Delta A_0 - \vec\nabla \cdot \frac{\partial A}{\partial t},$$
where Δ is the Euclidean Laplacian on $\mathbb{R}^3$. A simple calculation shows that $(\operatorname{div}_\eta F)^0 = F^{0\nu}{}_{;\nu} = \vec\nabla \cdot E$, so we see that while Maxwell’s equations are second order in A, this component has only one time derivative of A in it. In terms of formulating an evolution problem, this component of Maxwell’s equations, equivalent to the vanishing of the spatial divergence of E, must be satisfied by the initial data. This imposes a constraint on the data.
To formulate a second-order initial value problem for A, we can use the gauge freedom in A. The Lorenz⁴ gauge condition imposes precisely that
$$0 = \operatorname{div}_\eta A = -\frac{\partial A_0}{\partial t} + \vec\nabla \cdot A .$$
Under the Lorenz gauge condition, Maxwell’s equations are equivalent to
$$\Box A_\mu = -\frac{\partial^2 A_\mu}{\partial t^2} + \Delta A_\mu = 0 .$$
Exercise 5-1. Verify this last claim. Use that $F_{\mu\nu} = \dfrac{\partial A_\nu}{\partial x^\mu} - \dfrac{\partial A_\mu}{\partial x^\nu}$.
⁴ Note the spelling; this is not the eponym of Lorentz transformations!

We may proceed as follows. We specify initial data $A_\mu$ and $\partial A_\mu/\partial t$ at t = 0, satisfying the constraint
$$\vec\nabla \cdot E = \Delta A_0 - \vec\nabla \cdot \frac{\partial A}{\partial t} = 0 . \tag{5.1.1}$$
We may also assume that the Lorenz gauge condition holds at t = 0. Indeed, given initial data $\mathring{A}_\mu|_{t=0}$ and $(\partial \mathring{A}_\mu/\partial t)|_{t=0}$ satisfying the constraint, we can arrange the Lorenz gauge condition at t = 0 by adding dφ to $\mathring{A}$, for a function φ depending only on $(x^1, x^2, x^3)$; this is a gauge transformation, and does not affect the field F. In particular, it does not change E, which remains divergence-free. Finding φ involves solving a Poisson equation $\Delta\varphi = (\partial \mathring{A}_0/\partial t - \vec\nabla \cdot \mathring{A})|_{t=0}$ on $\mathbb{R}^3$, where we note the right-hand side can be computed in terms of the initial data. We assume we are working in function spaces where we can solve this equation (for instance, where the fields decay sufficiently near infinity), and similar comments apply in the remainder of this section. We note that this gauge transformation $A = \mathring{A} + d\varphi$ does not change the time derivative at t = 0.
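The Poisson equation for the gauge function can be read off directly from the Lorenz condition. The computation below is a short sketch under the stated assumption that φ depends only on the spatial variables.

```latex
% \varphi = \varphi(x^1, x^2, x^3) is time-independent, and A = \mathring{A} + d\varphi.
\begin{align*}
A_0 &= \mathring A_0 + \partial_t\varphi = \mathring A_0 ,
   \qquad A_\ell = \mathring A_\ell + \partial_\ell\varphi \quad (\ell = 1, 2, 3),\\
\operatorname{div}_\eta A &= -\frac{\partial A_0}{\partial t} + \vec\nabla\cdot A
   = -\frac{\partial \mathring A_0}{\partial t} + \vec\nabla\cdot\mathring A + \Delta\varphi ,\\
\operatorname{div}_\eta A\big|_{t=0} = 0
   &\iff \Delta\varphi = \Bigl(\frac{\partial\mathring A_0}{\partial t}
        - \vec\nabla\cdot\mathring A\Bigr)\Big|_{t=0} .
\end{align*}
% Since \partial_t\varphi = 0, the time derivatives \partial_t A_\mu and \partial_t\mathring A_\mu agree.
```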
We now solve the wave equation □ Aµ = 0. We will have produced a solution
to Maxwell’s equations, provided we can show that the Lorenz gauge condition,
which we have satisfied at t = 0, is propagated in time. How do we do this? We
have not yet incorporated the constraint. In fact, the condition (5.1.1) at t = 0,
together with the wave equation $\Box A_0 = 0$, yields at t = 0
$$\frac{\partial^2 A_0}{\partial t^2} = \Delta A_0 = \sum_{\ell=1}^{3} \frac{\partial^2 A_\ell}{\partial t\,\partial x^\ell},$$
which is just (∂ divη A/∂t)|t=0 = 0. Now, the wave equation for each Aµ also
implies □ (divη A) = 0. In other words, we see that divη A satisfies the wave
equation, and has vanishing initial data. Thus divη A = 0 for all t, and the gauge
condition propagates in time. In summary, the gauge term (which we arranged
to vanish at t = 0) is propagated with the wave equation, which, together with
the constraint ∇⃗ · E = 0, implies the vanishing of the first derivative of the gauge
term.
We could also formulate the initial value problem for the source-free Maxwell’s
equations in terms of E and B, with the constraints that the spatial divergences
vanish. We will solve for A using the wave equation as above, where we take
the initial data for A as follows: $A_0 = 0$, and A is chosen so that $\vec\nabla \times A = B$ (this is possible since $\vec\nabla \cdot B = 0$ at t = 0). We also set $(\partial A/\partial t)|_{t=0} = -E$, and to arrange the Lorenz gauge condition initially, we take $\partial A_0/\partial t = \vec\nabla \cdot A$ at t = 0.
We can now solve for A as above to produce solutions to the Maxwell equations
with given initial electric and magnetic fields.

We could also have proceeded as follows. The source-free Maxwell’s equations imply vector wave equations for E and B, i.e., $\Box E = 0 = \Box B$. We can thus solve for E in time using the initial values for E (divergence-free at t = 0) as well as imposing the Maxwell equation $\partial E/\partial t = \vec\nabla \times B$ at t = 0. The vector wave equation for E implies $\vec\nabla \cdot E$ solves the wave equation. Moreover, $\partial(\vec\nabla \cdot E)/\partial t = \vec\nabla \cdot (\vec\nabla \times B) = 0$ at t = 0, and thus the constraint $\vec\nabla \cdot E = 0$ propagates in time, giving us one of Maxwell’s equations. We solve for A at t = 0 so that $B = \vec\nabla \times A$, and we take $A_0 = 0$. The evolution of A is then given by $\partial A/\partial t = -E$, and we then define $B = \vec\nabla \times A$. This clearly gives the Maxwell equation $\partial B/\partial t = -\vec\nabla \times E$, from which we see that
$$\frac{\partial(\vec\nabla \cdot B)}{\partial t} = -\vec\nabla \cdot (\vec\nabla \times E) = 0,$$
so that $\vec\nabla \cdot B = 0$ propagates in time as well. We need now only show that $\partial E/\partial t = \vec\nabla \times B$, which holds at t = 0, also propagates. We have
$$\begin{aligned}
\frac{\partial}{\partial t}\Bigl(\frac{\partial E}{\partial t} - \vec\nabla \times B\Bigr)
 &= \frac{\partial^2 E}{\partial t^2} - \vec\nabla \times \Bigl(\vec\nabla \times \frac{\partial A}{\partial t}\Bigr)
  = \frac{\partial^2 E}{\partial t^2} + \vec\nabla \times (\vec\nabla \times E)\\
 &= \frac{\partial^2 E}{\partial t^2} + \vec\nabla(\vec\nabla \cdot E) - \Delta E = 0,
\end{aligned}$$
by the wave equation for E and the propagation of the constraint $\vec\nabla \cdot E = 0$. From this along with the definition $E = -\partial A/\partial t$, we see that $\vec\nabla \cdot A$ is constant in time. We can replace A by $A + \vec\nabla\varphi$ for a suitable time-independent φ to arrange $\vec\nabla \cdot A = 0$ at t = 0, and hence for all time. The vanishing of the divergence of A is called the Coulomb gauge. Since we took $A_0 = 0$ here, this also corresponds to the Lorenz gauge in this special situation.

5.1.2. The Gauss and Codazzi equations. We will consider submanifolds Σ of a semi-Riemannian manifold $(M, \bar g)$. If $\bar g$ is Riemannian, the metric induces a Riemannian metric g on Σ. This is not necessarily the case in the semi-Riemannian setting, as in the next example.
Example 5-2. Let
$$\Sigma^+ = \Bigl\{(x^0, x^1, x^2, x^3) : 0 < x^0 = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}\Bigr\} \subset \mathbb{M}^4$$
denote the forward lightcone minus the origin. Consider a point p in this submanifold at which $x^2 = 0 = x^3$, and $x^1 > 0$, so that $x^0 = x^1$ at p. The tangent space $T_p\Sigma^+$ is spanned by the vectors $(\partial/\partial x^0)|_p + (\partial/\partial x^1)|_p$, $(\partial/\partial x^2)|_p$, $(\partial/\partial x^3)|_p$. Note that the first of these is orthogonal to all of $T_p\Sigma^+$, and so the Minkowski metric does not induce a metric on $\Sigma^+$. $\Sigma^+$ is a null hypersurface.
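The degeneracy claimed in Example 5-2 can be verified by a direct computation with the Minkowski metric. In the sketch below, the name L for the first spanning vector is notation introduced only for illustration.

```latex
% Minkowski metric \eta = -(dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2, evaluated at p.
\begin{align*}
L &:= \partial_0 + \partial_1 , \qquad \eta(L, L) = -1 + 1 = 0 ,\\
\eta(L, \partial_2) &= 0 , \qquad \eta(L, \partial_3) = 0 .
\end{align*}
% Thus L \in T_p\Sigma^+ pairs to zero with every element of the spanning set of T_p\Sigma^+,
% so the restriction of \eta to T_p\Sigma^+ is degenerate.
```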

Let $\Sigma^k$ be a submanifold of $(M, \bar g)$ upon which $\bar g$ induces a semi-Riemannian metric g. We let $\bar g(X, Y) = \langle X, Y \rangle$. Since at each $p \in \Sigma$ the metric is nondegenerate on $T_p\Sigma$, we have that $\bar g$ is also nondegenerate on $(T_p\Sigma)^\perp$, and $T_pM = T_p\Sigma \oplus (T_p\Sigma)^\perp$ (cf. [174, Chapter 2]). Thus we can decompose the tangent bundle of M along Σ into the direct sum of the tangent and normal bundles of Σ: $TM = T\Sigma \oplus N\Sigma$. It is not hard to show that the Levi-Civita connection $\nabla^\Sigma$ from the induced metric on Σ satisfies, for smooth vector fields X and Y tangent to Σ,
$$\nabla_X Y = \nabla^\Sigma_X Y + II(X, Y)$$
where $\nabla^\Sigma_X Y$ is tangent to Σ, and II(X, Y) is normal to Σ.
Lemma 5-3. For any vector fields X, Y tangent to Σ, the induced Levi-Civita connection $\nabla^\Sigma_X Y$ is the tangential projection of $\nabla_X Y$, while II(X, Y) is tensorial in X and Y, and is symmetric.
Proof. We define $\nabla^\Sigma_X Y$ as the tangential component of $\nabla_X Y$, and we show it satisfies the defining properties of the Levi-Civita connection. Clearly $\nabla^\Sigma$ is torsion-free, since $\nabla_X Y - \nabla_Y X = [X, Y]$, which is tangent to Σ; this latter equation also implies that II(X, Y) = II(Y, X). Clearly $\nabla^\Sigma_X Y$ is $C^\infty$-linear in X, and $\mathbb{R}$-linear in Y, since $\nabla_X Y$ has these properties. Since $\nabla_X(fY) = X[f]Y + f\nabla_X Y$, and X[f]Y is tangent to Σ, by taking projections we get the corresponding equation for $\nabla^\Sigma$. Finally, the preceding equation also implies that II(X, Y) is $C^\infty$-linear in Y, and hence by symmetry in X as well. □
Definition 5-4. The tensor II is the (vector-valued) second fundamental form. The mean curvature vector field $\mathbf{H}$ (or $\vec H$) is $\mathbf{H} = \operatorname{tr}_g II = \sum_{i=1}^{k} \epsilon_i\, II(E_i, E_i)$, where $\{E_1, \dots, E_k\}$ is an orthonormal basis of $T_p\Sigma$ with $\epsilon_i = \langle E_i, E_i \rangle$. The scalar-valued second fundamental form with respect to the unit normal vector n is given by $K(X, Y) = \langle II(X, Y), n \rangle$, and the respective mean curvature H is the trace: $H = \operatorname{tr}_g K = \langle \mathbf{H}, n \rangle$.
If n denotes a local smooth unit normal field, $K(X, Y) = \langle II(X, Y), n \rangle = \langle -\nabla_X n, Y \rangle$. While K and H depend on a choice of n, we emphasize that II and $\mathbf{H}$ do not. We note that K is sometimes defined with the opposite sign, so we define $\underline{K}(X, Y) = \langle \nabla_X n, Y \rangle = -K(X, Y)$, and $\underline{H} = \operatorname{tr}_g \underline{K}$. Moreover, in the hypersurface case, sometimes the scalar-valued second fundamental form is defined as $\hat K$, where $\hat K(X, Y)\,n = II(X, Y) = \langle n, n \rangle K(X, Y)\,n$. If we let $\epsilon = \langle n, n \rangle = \pm 1$, then we see $\hat K = \epsilon K$, so that with $\hat H = \operatorname{tr}_g \hat K = \epsilon H$, we have $\mathbf{H} = \hat H n = \epsilon H n$. In case $(M, \bar g)$ is Riemannian we of course have $\hat K = K = -\underline{K}$, whereas in the case $(M, \bar g)$ is Lorentzian and $(\Sigma, g)$ is a Riemannian hypersurface, then $\epsilon = \langle n, n \rangle = -1$, and $\hat K = \underline{K} = -K$, i.e., $\hat K(X, Y) = \langle \nabla_X n, Y \rangle$.

We begin by reviewing the proof of the Gauss equation, which relates the
curvature of the submanifold to the ambient curvature and the second fundamental
form.
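Proposition 5-5 (the Gauss equation). For any $X, Y, Z, W \in T_p\Sigma$, we have
$$\langle R^\Sigma(X, Y, Z), W \rangle = \langle R(X, Y, Z), W \rangle - \langle II(X, Z), II(Y, W) \rangle + \langle II(X, W), II(Y, Z) \rangle . \tag{5.1.2}$$

Proof. If N is normal to Σ, then since X and Y are tangential, it follows that $\nabla_X \langle Y, N \rangle = 0$, which is equivalent to $\langle \nabla_X Y, N \rangle = -\langle Y, \nabla_X N \rangle$. Moreover, by decomposing $\nabla_X Y$ into tangential and normal components, we obtain $\langle \nabla_X Y, W \rangle = \langle \nabla^\Sigma_X Y, W \rangle$, while $\langle \nabla_X Y, N \rangle = \langle II(X, Y), N \rangle$. Using these we compute as follows:
$$\begin{aligned}
\langle R(X, Y, Z), W \rangle
 &= \langle \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z,\ W \rangle\\
 &= \langle \nabla_X(\nabla^\Sigma_Y Z + II(Y, Z)) - \nabla_Y(\nabla^\Sigma_X Z + II(X, Z)) - \nabla^\Sigma_{[X,Y]} Z,\ W \rangle\\
 &= \langle \nabla^\Sigma_X \nabla^\Sigma_Y Z + II(X, \nabla^\Sigma_Y Z) + \nabla_X(II(Y, Z))\\
 &\qquad - \nabla^\Sigma_Y \nabla^\Sigma_X Z - II(Y, \nabla^\Sigma_X Z) - \nabla_Y(II(X, Z)) - \nabla^\Sigma_{[X,Y]} Z,\ W \rangle\\
 &= \langle R^\Sigma(X, Y, Z), W \rangle - \langle II(Y, Z), \nabla_X W \rangle + \langle II(X, Z), \nabla_Y W \rangle\\
 &= \langle R^\Sigma(X, Y, Z), W \rangle - \langle II(Y, Z), II(X, W) \rangle + \langle II(X, Z), II(Y, W) \rangle . \qquad \square
\end{aligned}$$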

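In the hypersurface case we can write the Gauss equation as
$$\begin{aligned}
\langle R^\Sigma(X, Y, Z), W \rangle
 &= \langle R(X, Y, Z), W \rangle - \langle n, n \rangle \langle K(X, Z), K(Y, W) \rangle + \langle n, n \rangle \langle K(X, W), K(Y, Z) \rangle\\
 &= \langle R(X, Y, Z), W \rangle - \langle n, n \rangle \langle \hat K(X, Z), \hat K(Y, W) \rangle + \langle n, n \rangle \langle \hat K(X, W), \hat K(Y, Z) \rangle .
\end{aligned} \tag{5.1.3}$$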
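The passage from the vector-valued form (5.1.2) to the hypersurface form (5.1.3) rests on one substitution, sketched below. Here ε abbreviates ⟨n, n⟩, and the products on the right are ordinary products of scalars.

```latex
% \epsilon := \langle n, n\rangle = \pm 1, and II(X,Y) = \epsilon K(X,Y)\, n = \hat K(X,Y)\, n.
\begin{align*}
\langle II(X,Z), II(Y,W)\rangle
  &= \epsilon^2 K(X,Z)\,K(Y,W)\,\langle n, n\rangle = \epsilon\, K(X,Z)\,K(Y,W),\\
\langle II(X,Z), II(Y,W)\rangle
  &= \hat K(X,Z)\,\hat K(Y,W)\,\langle n, n\rangle = \epsilon\, \hat K(X,Z)\,\hat K(Y,W).
\end{align*}
% Inserting either line (and its analogue with Z and W exchanged) into (5.1.2) gives (5.1.3).
```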

To derive the Einstein constraint equations, we will use the Einstein equation,
together with the Gauss equation, and the Codazzi equation, which we present
now. We first define the normal connection $\nabla^\perp$ in the normal bundle $N\Sigma$ as follows: for V tangent to Σ and Z normal along Σ, we define $\nabla^\perp_V Z$ to be the normal component of $\nabla_V Z$. We can use this connection (and impose a product rule) to differentiate tensors with values in the normal bundle, in particular the second fundamental form: for V, X and Y tangent to Σ,
$$(\nabla_V II)(X, Y) := \nabla^\perp_V(II(X, Y)) - II(\nabla^\Sigma_V X, Y) - II(X, \nabla^\Sigma_V Y). \tag{5.1.4}$$
For X, Y and Z tangent to Σ, let $R^\perp(X, Y, Z)$ be the normal component of R(X, Y, Z).

Proposition 5-6 (the Codazzi equation). For X, Y and Z tangent to Σ,
$$R^\perp(X, Y, Z) = (\nabla_X II)(Y, Z) - (\nabla_Y II)(X, Z). \tag{5.1.5}$$

Proof. As in the proof of the Gauss equation, we decompose the curvature tensor:
$$\begin{aligned}
R(X, Y, Z) &= \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z\\
 &= \nabla_X(\nabla^\Sigma_Y Z + II(Y, Z)) - \nabla_Y(\nabla^\Sigma_X Z + II(X, Z)) - \nabla_{[X,Y]} Z .
\end{aligned}$$
Using $[X, Y] = \nabla^\Sigma_X Y - \nabla^\Sigma_Y X$ and taking the normal component we have
$$\begin{aligned}
R^\perp(X, Y, Z) &= II(X, \nabla^\Sigma_Y Z) + \nabla^\perp_X(II(Y, Z)) - II(Y, \nabla^\Sigma_X Z) - \nabla^\perp_Y(II(X, Z)) - II([X, Y], Z)\\
 &= II(X, \nabla^\Sigma_Y Z) + \nabla^\perp_X(II(Y, Z)) - II(Y, \nabla^\Sigma_X Z) - \nabla^\perp_Y(II(X, Z)) - II(\nabla^\Sigma_X Y, Z) + II(\nabla^\Sigma_Y X, Z)\\
 &= (\nabla_X II)(Y, Z) - (\nabla_Y II)(X, Z). \qquad \square
\end{aligned}$$

Before we move on, we record the Gauss and Codazzi equations in index form in local coordinates, for the hypersurface case. Recall the convention $R_{ij\ell m} = g_{ms} R_{ij\ell}{}^{s}$. In the following, the indices refer to components tangential to Σ, while the “n” index refers to the vector n inserted in the corresponding slot in the tensor. The Gauss equation is easily seen to be
$$R^\Sigma_{ij\ell m} = R_{ij\ell m} + \langle n, n \rangle (K_{im} K_{j\ell} - K_{i\ell} K_{jm}) = R_{ij\ell m} + \langle n, n \rangle (\hat K_{im} \hat K_{j\ell} - \hat K_{i\ell} \hat K_{jm}), \tag{5.1.6}$$
while the Codazzi equation takes the form
$$R_{ij\ell n} = K_{j\ell;i} - K_{i\ell;j} = \langle n, n \rangle (\hat K_{j\ell;i} - \hat K_{i\ell;j}). \tag{5.1.7}$$
This readily follows from (5.1.4), $II(X, Y) = \langle n, n \rangle K(X, Y)\,n$ and $\langle \nabla_X n, n \rangle = 0$.

5.2. The Einstein constraint equations

Suppose $(M, \bar g)$ is Lorentzian, and $\Sigma \subset M$ is a k-dimensional (k ≥ 2) spacelike hypersurface, i.e., the induced metric g on Σ is Riemannian. Let n be a (smooth, local) timelike unit normal vector field to Σ (see Figure 7). We let T be a symmetric (0, 2)-tensor and assume that $(M, \bar g)$ satisfies the Einstein equation $G_\Lambda(\bar g) = \kappa T$ (note that κ is not the dimension k of Σ). Then $\operatorname{div}_{\bar g} T = 0$. We let $\mathbf{J} = (-T(n, \cdot\,))^\sharp$, so that $\mathbf{J}^\nu = -T^{\mu\nu} n_\mu$ (recall that summed-over Greek indices range from 0 to k). We write $\mathbf{J} = \rho n + J^\sharp$, where $J^\sharp$ is tangent to Σ. Then $\rho = T(n, n) = T_{00}$, and if $E_1, \dots, E_k$ is a local frame for $T\Sigma$ and we let $E_0 = n$ to complete the indexing, we have $J^\sharp = \sum_{i=1}^{k} J^i E_i$, where, for j ≥ 1,
$$-T_{\mu j} n^\mu = -T_{0j} = -T(n, E_j) = \langle \mathbf{J}, E_j \rangle = \sum_{i=1}^{k} J^i g_{ij} = J_j . \tag{5.2.1}$$
Note that $J^i = \sum_{j=1}^{k} g^{ij} J_j = -T^{i\mu} n_\mu = T^{i0}$, i ≥ 1. Then ρ is the energy density of the matter fields as measured by the observer with four-velocity cn, and J is (c times) the corresponding momentum density one-form. If n is future-pointing, then the dominant energy condition ($\mathbf{J}$ is future-pointing causal) implies $\rho \ge |J|_g = \sqrt{\sum_{i=1}^{k} J^i J_i}$.

Figure 7. A spacelike hypersurface $\Sigma^k$, with unit normal n, in a spacetime $(M^{k+1}, \bar g)$.
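The implication from the dominant energy condition to the inequality between ρ and $|J|_g$ is a two-line computation, sketched here. The bold $\mathbf{J}$ denotes the spacetime vector $(-T(n, \cdot\,))^\sharp$, decomposed as $\rho n + J^\sharp$.

```latex
% \mathbf{J} = \rho n + J^\sharp, with n unit timelike and J^\sharp tangent to \Sigma.
\begin{align*}
\langle \mathbf{J}, \mathbf{J}\rangle &= -\rho^2 + |J|_g^2 \le 0
   && (\mathbf{J} \text{ causal}),\\
\langle \mathbf{J}, n\rangle &= -\rho \le 0
   && (\mathbf{J} \text{ and } n \text{ both future-pointing}).
\end{align*}
% Hence \rho \ge 0 and \rho^2 \ge |J|_g^2, that is, \rho \ge |J|_g.
```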
We now come to the Einstein constraint equations, analogues of the divergence
constraint on the initial data for the Maxwell equations we saw earlier. We recall
that in spacetime dimension four, κ = 8π G/c4 .
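Theorem 5-7 (Einstein constraint equations). Let $(M, \bar g)$ be Lorentzian. The following system of equations must hold on a Riemannian hypersurface $\Sigma \subset M$, where the Einstein equation $\operatorname{Ric}(\bar g) - \tfrac12 R(\bar g)\bar g + \Lambda \bar g = \kappa T$ holds on M:
$$R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2 = 2\kappa\rho + 2\Lambda, \tag{5.2.2}$$
$$\operatorname{div}_g K - d(\operatorname{tr}_g K) = \kappa J. \tag{5.2.3}$$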
Equation (5.2.2) is known as the Hamiltonian constraint, and (5.2.3) is the
momentum constraint; see Sections 5.3.3.2 and 7.2.1 for further connection to
the energy and momenta for gravitational systems. When we insert ρ = 0 and
J = 0 into the constraints, we obtain the vacuum constraint equations.
Proof. Let $E_1, \dots, E_k$ be a smooth local orthonormal frame field for Σ. We first apply the Gauss equation:
$$\begin{aligned}
\sum_{i,j=1}^{k} \langle R(E_i, E_j, E_j), E_i \rangle
 &= \sum_{i,j=1}^{k} \Bigl( \langle R^\Sigma(E_i, E_j, E_j), E_i \rangle - \langle II(E_j, E_j), II(E_i, E_i) \rangle + \langle II(E_i, E_j), II(E_i, E_j) \rangle \Bigr)\\
 &= R(g) - |K|_g^2 + H^2 .
\end{aligned} \tag{5.2.4}$$
Because $\langle n, n \rangle = -1$, we have for i ≥ 1,
$$\operatorname{Ric}_{\bar g}(E_i, E_i) = -\langle R(n, E_i, E_i), n \rangle + \sum_{j=1}^{k} \langle R(E_i, E_j, E_j), E_i \rangle ,$$
so that with the Einstein tensor $G = \operatorname{Ric}(\bar g) - \tfrac12 R(\bar g)\bar g$,
$$\begin{aligned}
\sum_{i,j=1}^{k} \langle R(E_i, E_j, E_j), E_i \rangle
 &= \operatorname{Ric}_{\bar g}(n, n) + \sum_{i=1}^{k} \operatorname{Ric}_{\bar g}(E_i, E_i)\\
 &= R(\bar g) + 2\operatorname{Ric}_{\bar g}(n, n) = 2G(n, n)\\
 &= 2(-\Lambda \bar g + \kappa T)(n, n) = 2\Lambda + 2\kappa\rho .
\end{aligned} \tag{5.2.5}$$
Together with (5.2.4), we conclude (5.2.2).
For (5.2.3), we apply the Codazzi equation (5.1.7), along with the fact that $R_{ninn} = 0 = \bar g_{in}$ for any $i \in \{1, \dots, k\}$, to obtain
$$\begin{aligned}
G_{in} = R_{in} &= \sum_{j=1}^{k} \langle R(E_j, E_i, n), E_j \rangle = -\sum_{j=1}^{k} R_{jijn}\\
 &= -\sum_{j=1}^{k} (K_{ij;j} - K_{jj;i}) = -\bigl(\operatorname{div}_g K - d(\operatorname{tr}_g K)\bigr)_i .
\end{aligned} \tag{5.2.6}$$
By the Einstein equation and (5.2.1), we find $G_{in} = -\kappa J_i$, from which we conclude (5.2.3) as desired. □
To summarize, we have from (5.2.4), (5.2.5) and (5.2.6) the following identities
for the Einstein tensor, which show that these components of G can be expressed
purely in terms of g and K (note the covariant derivatives are with respect to g):

$$2G_{nn} = R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2, \tag{5.2.7}$$
$$G_{ni} = -\bigl(K^{j}{}_{i;j} - K^{j}{}_{j;i}\bigr) = \bigl(\hat K^{j}{}_{i;j} - \hat K^{j}{}_{j;i}\bigr). \tag{5.2.8}$$

Remark 5-8. The momentum constraint (5.2.3) appears on [218, p. 266] and
some other works with a sign difference where the second fundamental form has
the opposite sign to ours. In Chapter 8 of this volume, the momentum constraint
takes the same form as (5.2.3), with both the second fundamental form and the
one-form J having the opposite sign to what we have taken here.
Example 5-9. In Minkowski spacetime $\mathbb{M}^{1+k}$, the metric $\bar g = -dt^2 + \sum_{j=1}^{k} (dx^j)^2$ induces on the hypersurface Σ = {t = 0} the Euclidean metric, and the second fundamental form vanishes. On the other hand, if we consider the hypersurface $\Sigma = \{(t, x) : -t^2 + |x|^2 = -1,\ t > 0\}$, as we saw in Section 1.4, the induced metric is the hyperbolic metric of constant curvature −1, and the second fundamental form (with respect to n future-pointing) is given by K = −g. The vacuum constraint equations (Λ = 0) are easily verified in each case.
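For the hyperboloid, the verification of the vacuum constraints amounts to the following arithmetic. This is a sketch using only the facts quoted in the example: Σ is k-dimensional with constant sectional curvature −1, and K = −g.

```latex
% \Sigma = unit hyperboloid in \mathbb{M}^{1+k}; K = -g; vacuum with \Lambda = 0, so \rho = 0, J = 0.
\begin{align*}
R(g) &= -k(k-1), \qquad |K|_g^2 = |g|_g^2 = k, \qquad (\operatorname{tr}_g K)^2 = k^2 ,\\
R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2 &= -k(k-1) - k + k^2 = 0 ,\\
\operatorname{div}_g K - d(\operatorname{tr}_g K) &= -\operatorname{div}_g g - d(-k) = 0 .
\end{align*}
% For the slice \{t = 0\}: g is Euclidean and K = 0, so both constraints hold trivially.
```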
Example 5-10. Let n ≥ 3. Recall from Section 2.4.1 anti-de Sitter spacetime,
$$\mathbb{H}^n_1(1) = \{x \in \mathbb{R}^{n+1} : -(x^0)^2 - (x^1)^2 + (x^2)^2 + \cdots + (x^n)^2 = -1\} \subset \mathbb{R}^{n+1}_2 .$$
The metric induced on $\{x \in \mathbb{R}^{n+1} : x^0 = 0\}$ is that of Minkowski spacetime $\mathbb{M}^n$. The metric induced on the set $\Sigma = \{x \in \mathbb{H}^n_1(1) : x^0 = 0,\ x^1 > 0\}$, on the other hand, is Riemannian; it is the unit hyperbolic metric, since Σ can be isometrically identified with the upper unit hyperboloid in $\mathbb{R}^n_1$. However, Σ is totally geodesic in $\mathbb{H}^n_1(1)$, i.e., its second fundamental form in $\mathbb{H}^n_1(1)$ vanishes; this is easy to see, since $\partial/\partial x^0$ is a unit normal field to Σ which is tangent to $\mathbb{H}^n_1(1)$ along Σ. One readily checks that the vacuum constraint equations with nonzero $\Lambda = -\tfrac12 (n-1)(n-2)$ hold (Σ is isometric to $\mathbb{H}^{n-1}$ here).
We can in turn consider universal anti-de Sitter spacetime $\widetilde{\mathbb{H}}^n_1(1)$, diffeomorphic to $\mathbb{R}^n$, with covering map
$$\widetilde{\mathbb{H}}^n_1(1) \approx \mathbb{R} \times \mathbb{R}^{n-1} \ni (t, \xi) \mapsto \bigl(\sqrt{1 + |\xi|^2}\,\cos t,\ \sqrt{1 + |\xi|^2}\,\sin t,\ \xi\bigr) \in \mathbb{H}^n_1(1).$$
The covering map restricts to a diffeomorphism of each constant t-slice $\Sigma_t = \{t\} \times \mathbb{R}^{n-1}$ in $\widetilde{\mathbb{H}}^n_1(1)$ to the image of Σ under an isometry of $\mathbb{R}^{n+1}_2$ (a rotation of the $x^0x^1$-plane, fixing $x^2, \dots, x^n$). Thus $\Sigma_t$ is isometric to the unit hyperbolic space, and is also totally geodesic.
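The value of the cosmological constant quoted in Example 5-10 can be recovered from the Hamiltonian constraint (5.2.2) alone. The short check below is a sketch using that the slice is a totally geodesic copy of unit hyperbolic space of dimension n − 1.

```latex
% \Sigma \cong \mathbb{H}^{n-1}: dimension n-1, sectional curvature -1; K = 0; vacuum (\rho = 0, J = 0).
\begin{align*}
R(g) &= -(n-1)(n-2), \qquad
R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2 = R(g) = 2\Lambda\\
&\Longrightarrow\quad \Lambda = -\tfrac12 (n-1)(n-2).
\end{align*}
% The momentum constraint (5.2.3) holds trivially since K = 0.
```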
Exercise 5-11. In an FLRW spacetime $(I \times \Sigma,\ -dt^2 + (f(t))^2 g_0)$, where $g_0$ has constant curvature $k_0 = -1$, show that the second fundamental form of $\{t\} \times \Sigma$ with respect to $n = \partial/\partial t$ is given by $K = -f(t) f'(t)\, g_0$, and verify explicitly the Einstein constraint equations in spacetime dimension four; cf. Section 2.4.3.
Example 5-12. Recall the exterior Schwarzschild spacetime metric
$$\bar g_S = -\Bigl(1 - \frac{2m}{r}\Bigr) dt^2 + \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} dr^2 + r^2\, \mathring{g}_{S^2} = -\Bigl(1 - \frac{2m}{r}\Bigr) dt^2 + g_S .$$
The t = 0 slice Σ has induced metric $g_S$, and is easily seen to be totally geodesic, since ∂/∂t is parallel along Σ. The vacuum Einstein constraints (Λ = 0) reduce to a scalar curvature constraint $R(g_S) = 0$.
Note that the constraints come from imposing $G_\Lambda(\bar g)(n, \cdot\,) = \kappa T(n, \cdot\,)$ along a spacelike hypersurface. As a simple exercise, one can see that if this equation holds in the normal direction along every spacelike hypersurface, then the Einstein equation $G_\Lambda(\bar g) = \kappa T$ holds. For example, if $(G_\Lambda)_{nn}$ vanishes along every spacelike hypersurface of a spacetime, then the vacuum field equation $G_\Lambda(\bar g) = 0$ must hold, which is equivalent to $\operatorname{Ric}(\bar g) = \frac{2\Lambda}{k-1}\,\bar g$ in spacetime dimension k + 1 ≥ 3.
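The stated equivalence between the vacuum field equation with cosmological constant and the Einstein-metric condition follows by taking a trace. A brief sketch, in spacetime dimension k + 1:

```latex
% G_\Lambda(\bar g) = \operatorname{Ric}(\bar g) - \tfrac12 R(\bar g)\bar g + \Lambda\bar g,
% and \operatorname{tr}_{\bar g}\bar g = k+1.
\begin{align*}
0 = \operatorname{tr}_{\bar g} G_\Lambda(\bar g)
   &= R(\bar g) - \tfrac{k+1}{2} R(\bar g) + (k+1)\Lambda
   \quad\Longrightarrow\quad R(\bar g) = \frac{2(k+1)}{k-1}\,\Lambda ,\\
\operatorname{Ric}(\bar g) &= \Bigl(\tfrac12 R(\bar g) - \Lambda\Bigr)\bar g
   = \Bigl(\frac{k+1}{k-1} - 1\Bigr)\Lambda\,\bar g
   = \frac{2\Lambda}{k-1}\,\bar g .
\end{align*}
% Conversely, \operatorname{Ric}(\bar g) = \tfrac{2\Lambda}{k-1}\bar g gives
% R(\bar g) = \tfrac{2(k+1)}{k-1}\Lambda, and substituting back yields G_\Lambda(\bar g) = 0.
```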

As K involves one time derivative of the metric (see (5.3.4) below), the expressions of $G_{nn}$ and $G_{ni}$ in (5.2.7)–(5.2.8) do not involve second time derivatives of the metric. Thus in formulating the Einstein equation as a second-order evolution problem for the spacetime metric, as discussed in the next section, the values of $G_{n\mu}$ (and hence $(G_\Lambda)_{n\mu}$) can be expressed in terms of the initial data set (Σ, g, K; ρ, J) (where for the non-vacuum case, we augment the geometric initial data (g, K) with the energy-momentum density (ρ, J) along Σ). Thus, for instance, a valid initial data set (Σ, g, K) for the vacuum Einstein equation is constrained by the condition that these components must vanish along Σ: $(G_\Lambda)_{nn} = 0$, $(G_\Lambda)_{ni} = 0$ along Σ, i.e., (5.2.2)–(5.2.3) must hold with (ρ, J) = (0, 0).
In the non-vacuum case, an initial data set may include initial data for the
physical fields being modeled, from which ρ and J can be computed, and
moreover the constraints may be augmented by equations that these fields must
also satisfy, as in the following example.
Example 5-13. We consider the coupled Einstein–Maxwell equations in spacetime dimension four, initial data for which will be taken to be (Σ, g, K; E, B), though we might instead encode the electromagnetic fields on Σ using a potential A, as seen in Section 5.1.1. As in (1.3.3)–(1.3.4), we have
$$\rho = \frac{1}{8\pi}\bigl(|E|_g^2 + |B|_g^2\bigr) \quad \text{and} \quad J^i = \frac{1}{4\pi}(E \times_g B)^i .$$
Along with using these in the constraint equations (5.2.2)–(5.2.3), we also impose constraints on the divergence of E and B, namely in the absence of charge, $\operatorname{div}_g E$ and $\operatorname{div}_g B$ vanish.
Any spacelike hypersurface in any Lorentzian manifold gives rise to a solution of the constraints, defining T so that the Einstein equation holds for a given Λ. Similarly, given (Σ, g, K), one can simply define ρ and J by (5.2.2)–(5.2.3) to get a solution (Σ, g, K; ρ, J) of the constraint equations. With this in mind, we often impose some form for T, such as T = 0 (vacuum), or at least impose an energy condition (cf. Section 2.3.2) on T, such as the dominant energy condition $\rho \ge |J|_g$. Under such restrictions, the equations (5.2.2)–(5.2.3) do constrain g and K on $\Sigma^k$ (k ≥ 2) in some way. For instance, if Λ = 0, the dominant energy condition can be written
$$\tfrac12\bigl(R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2\bigr) \ge \bigl|\operatorname{div}_g K - d(\operatorname{tr}_g K)\bigr|_g .$$

That said, even the vacuum constraints form an underdetermined system. In the case of spacetime dimension four, the constraints form a system of four equations for twelve independent functions in g and K (which are symmetric tensors). Even when one accounts for diffeomorphism gauge equivalence, the system is still underdetermined, and indeed there are lots of solutions to the vacuum constraints.
The time-symmetric, or Riemannian, constraints are the case K = 0. Then (5.2.2) reduces to R(g) = 2κρ + 2Λ, so that if Λ = 0, the scalar curvature R(g) is proportional to the energy density, and R(g) ≥ 0 if and only if ρ ≥ 0 (which holds under the weak, and hence under the dominant, energy condition). The maximal case is H = 0, so that $R(g) = |K|_g^2 + 2\kappa\rho + 2\Lambda \ge 2\kappa\rho + 2\Lambda$. In the vacuum case (T = 0), the time-symmetric constraints reduce to R(g) = 2Λ (constant scalar curvature), and in the maximal case we have R(g) ≥ 2Λ; we often consider Λ = 0, which highlights the condition of zero or nonnegative scalar curvature of (Σ, g).
The constraints operator Φ is defined by
$$\Phi(g, K) = \bigl(R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2,\ \operatorname{div}_g K - d(\operatorname{tr}_g K)\bigr).$$



As we will see in Sections 5.3.3.2 and 7.2.1, we may sometimes want to rewrite the constraints in terms of the momentum tensor π, which is algebraically equivalent (k ≥ 2) to K via $\pi_{ij} = K_{ij} - (\operatorname{tr}_g K) g_{ij}$. It is easy to see that
$$\begin{aligned}
R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2 &= R(g) - |\pi|_g^2 + \frac{1}{k-1}(\operatorname{tr}_g \pi)^2 ,\\
\operatorname{div}_g K - d(\operatorname{tr}_g K) &= \operatorname{div}_g \pi .
\end{aligned}$$
When we use π instead of K, we may abuse notation and write the constraints operator as $\Phi(g, \pi) = \bigl(R(g) - |\pi|_g^2 + \frac{1}{k-1}(\operatorname{tr}_g \pi)^2,\ \operatorname{div}_g \pi\bigr)$. While this abuse of notation should not cause confusion, we note that below we might use the same notation for the operator where π is treated as a (2, 0)-tensor, in which case $\operatorname{div}_g \pi$ is a vector field. In the literature, the constraints operator may appear as the above operator except with a factor of $\pm\tfrac12$ in the first component (or a factor of ±2 on the second component); see, e.g., (5.2.7) or Section 5.3.3.2.

5.3. The initial value formulation for the vacuum Einstein equation

In this section we discuss aspects of the analysis and geometry of the initial value
formulation for $\mathrm{Ric}(\bar g) = 0$, or more generally $G_\Lambda(\bar g) = 0$. We will follow the
approach of the foundational work of Choquet-Bruhat [49]. The purpose of this
section is to illustrate the ideas; to be mathematically precise, we should specify
function spaces and state carefully the partial differential equations results that
are in play. We will not do this, but refer the reader to [51; 190; 218].

5.3.1. Einstein’s equation in Lorentz-harmonic gauge. In order to formulate
the Einstein equation in terms of an initial value problem for a nonlinear wave
equation, we express the Einstein equations in terms of a partial differential
equation along with a gauge condition, as we did for Maxwell’s equations above.
The gauge choice we use is the choice of coordinates. In fact, we will use
harmonic coordinates, or in the Lorentzian case, wave coordinates $x^\alpha$, which
are coordinates so that $\lambda^\alpha := \Box_{\bar g} x^\alpha = 0$, where we recall that for a Lorentzian
metric $\bar g$, we let $\Box_{\bar g} u = \mathrm{tr}_{\bar g}(\mathrm{Hess}_{\bar g} u)$. On any Lorentzian manifold we can locally
set up wave coordinates: given any local coordinates $y^\mu$, we solve the Cauchy
problem for the linear wave equations $\Box_{\bar g} x^\mu = 0$ with initial conditions $x^\mu = y^\mu$
and $\nabla_n x^\mu = \nabla_n y^\mu$ imposed on a level surface of $y^0$ ($\partial/\partial y^0$ is timelike), where
$n$ is the unit normal to the level surface.
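Let $\Gamma^\kappa_{\nu\beta} = \tfrac12 \bar g^{\kappa\mu} (\bar g_{\nu\mu,\beta} + \bar g_{\mu\beta,\nu} - \bar g_{\nu\beta,\mu})$ be the Christoffel symbols for $\bar g$ in a
coordinate system. It is easy to see that

\[ \lambda^\alpha = \Box_{\bar g} x^\alpha = \bar g^{\mu\nu} x^\alpha_{;\mu\nu} = \bar g^{\mu\nu} (-\Gamma^\kappa_{\mu\nu} x^\alpha_{,\kappa}) = -\bar g^{\mu\nu} \Gamma^\alpha_{\mu\nu}. \]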

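In what follows we will write “$A \sim B$” to mean $A - B$ is a function of the
components $\bar g_{\mu\nu}$ and $\bar g_{\mu\nu,\kappa}$. In particular, if $A \sim B$, then $A - B$ does not depend
on second derivatives of the metric components. For example,

\[ -(\bar g_{\alpha\nu} \lambda^\alpha_{,\beta} + \bar g_{\alpha\beta} \lambda^\alpha_{,\nu}) \sim \bar g_{\alpha\nu} \bar g^{\kappa\mu} \Gamma^\alpha_{\kappa\mu,\beta} + \bar g_{\alpha\beta} \bar g^{\kappa\mu} \Gamma^\alpha_{\kappa\mu,\nu}. \]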

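Exercise 5-14. Show that

\[ \tfrac12 (\bar g_{\alpha\nu} \lambda^\alpha_{,\beta} + \bar g_{\alpha\beta} \lambda^\alpha_{,\nu}) \sim -\tfrac12 \bar g^{\kappa\mu} (\bar g_{\kappa\nu,\mu\beta} + \bar g_{\beta\mu,\kappa\nu} - \bar g_{\kappa\mu,\nu\beta}). \tag{5-14a} \]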

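From (2-8b) (p. 72), the components of the Ricci curvature of $\bar g$ are given by

\[ R_{\mu\nu} = \Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu} + \Gamma^\kappa_{\kappa\gamma} \Gamma^\gamma_{\mu\nu} - \Gamma^\kappa_{\nu\gamma} \Gamma^\gamma_{\mu\kappa} \sim \Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu}. \]

Moreover,

\begin{align*}
\Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu}
 &\sim \tfrac12 \bar g^{\kappa\gamma} \big[ (\bar g_{\mu\gamma,\nu\kappa} + \bar g_{\nu\gamma,\mu\kappa} - \bar g_{\mu\nu,\gamma\kappa}) - (\bar g_{\mu\gamma,\kappa\nu} + \bar g_{\kappa\gamma,\mu\nu} - \bar g_{\mu\kappa,\gamma\nu}) \big] \\
 &= \tfrac12 \bar g^{\kappa\gamma} (\bar g_{\nu\gamma,\mu\kappa} - \bar g_{\mu\nu,\gamma\kappa} - \bar g_{\kappa\gamma,\mu\nu} + \bar g_{\mu\kappa,\gamma\nu}).
\end{align*}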


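From this and (5-14a) we see that

\[ R^H_{\mu\nu} := R_{\mu\nu} + \tfrac12 (\bar g_{\alpha\mu} \lambda^\alpha_{,\nu} + \bar g_{\alpha\nu} \lambda^\alpha_{,\mu}) \sim -\tfrac12 \bar g^{\kappa\gamma} \bar g_{\mu\nu,\kappa\gamma}. \]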
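The cancellation behind this relation can be checked term by term. The sketch below relabels (5-14a), replacing the index pair $(\nu, \beta)$ by $(\mu, \nu)$ and the summation index $\mu$ by $\gamma$ (the relabeling is ours), and then uses the symmetry of $\bar g^{\kappa\gamma}$, of $\bar g_{\mu\nu}$, and of second partial derivatives.

```latex
% Second-order part of the Ricci tensor and of the gauge term, with matching indices.
\begin{align*}
R_{\mu\nu} &\sim \tfrac12 \bar g^{\kappa\gamma}\big(
   \bar g_{\nu\gamma,\mu\kappa} - \bar g_{\mu\nu,\gamma\kappa}
   - \bar g_{\kappa\gamma,\mu\nu} + \bar g_{\mu\kappa,\gamma\nu}\big), \\
\tfrac12(\bar g_{\alpha\mu}\lambda^\alpha_{,\nu} + \bar g_{\alpha\nu}\lambda^\alpha_{,\mu})
 &\sim -\tfrac12 \bar g^{\kappa\gamma}\big(
   \bar g_{\kappa\mu,\gamma\nu} + \bar g_{\nu\gamma,\kappa\mu}
   - \bar g_{\kappa\gamma,\mu\nu}\big).
\end{align*}
% Adding the two lines, three pairs cancel:
%   \bar g_{\nu\gamma,\mu\kappa}   against  \bar g_{\nu\gamma,\kappa\mu},
%   \bar g_{\kappa\gamma,\mu\nu}   against  \bar g_{\kappa\gamma,\mu\nu},
%   \bar g_{\mu\kappa,\gamma\nu}   against  \bar g_{\kappa\mu,\gamma\nu},
% leaving only the wave-operator term:
\[
R_{\mu\nu} + \tfrac12(\bar g_{\alpha\mu}\lambda^\alpha_{,\nu} + \bar g_{\alpha\nu}\lambda^\alpha_{,\mu})
 \sim -\tfrac12 \bar g^{\kappa\gamma}\bar g_{\mu\nu,\gamma\kappa}.
\]
```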

By adding to the Ricci curvature operator the Lie derivative of the metric with
respect to a suitable vector field, we obtain the reduced Ricci curvature $R^H_{\mu\nu}$,
whose leading term plainly constitutes a nonlinear wave operator. (Recall the
Lie derivative formula $(L_X \bar g)_{\mu\nu} = \bar g_{\alpha\mu} X^\alpha_{;\nu} + \bar g_{\alpha\nu} X^\alpha_{;\mu}$.)

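We let $R^H = \bar g^{\mu\nu} R^H_{\mu\nu}$ and $(G^H_\Lambda(\bar g))_{\mu\nu} = R^H_{\mu\nu} - \tfrac12 R^H \bar g_{\mu\nu} + \Lambda \bar g_{\mu\nu}$, and we
consider the reduced Einstein equation $G^H_\Lambda(\bar g) = 0$, which can be formulated as
a system of quasilinear wave equations:

\[ -\tfrac12 \bar g^{\alpha\beta} \bar g_{\mu\nu,\alpha\beta} + \Psi_{\mu\nu}\big((\bar g_{\gamma\theta}), (\bar g_{\gamma\theta,\kappa})\big) = 0. \tag{5.3.1} \]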

A solution to (5.3.1) will solve the Einstein equation if we can arrange $\lambda^\alpha = 0$
for all $\alpha$. From the seminal work of Leray, along with a rescaling argument (see
the references, e.g., [218, Chapter 10]), we can solve a system like (5.3.1) for
small time.
The gauge condition above is not the only one that can be used, and moreover
gauge choices can be expressed in a more geometric framework, in terms of
background metrics (or connections), and the tension field for harmonic maps.
See [17; 190, Chapter 14; 51, Chapter 7]. Furthermore, the same degeneracy and
issue of gauge for the Ricci operator arises in Riemannian geometry, for instance
in the study of the Ricci flow; cf. [73; 74].

5.3.2. The Einstein constraints and the propagation of the gauge condition.
Suppose we are given a solution $(\Sigma, g, K)$ of the Einstein constraint equations
for $G_\Lambda(\bar g) = 0$. We now incorporate this solution into initial data for (5.3.1).
Choose local coordinates $x^i$ ($i \ge 1$) on $V \subset \Sigma$; we will obtain a solution of (5.3.1)
on the product $I \times U$, where $I$ is an interval around $0$ with coordinate $x^0 = ct$,
and $U$ is a compactly contained open subset of $V$. We prescribe initial values of
$\bar g_{\mu\nu}$ and $\bar g_{\mu\nu,0}$ at $t = 0$ as follows: for $i, j \ge 1$, let $\bar g_{ij} = g_{ij}$ and $\bar g_{ij,0} = -2K_{ij}$;
let $\bar g_{00} = -1$, and for $j \ge 1$, let $\bar g_{0j} = 0$. For a spacetime metric $\bar g$ with these
conditions, $\partial/\partial x^0$ is a unit normal field to $\{0\} \times U$, and so the components of
the second fundamental form of $U$ are indeed given by

\[ K_{ij} = \bar g\Big( \nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^0} \Big) = -\tfrac12 \bar g_{ij,0}, \]

as desired. For $\mu \ge 0$, we will choose the initial values of $\bar g_{0\mu,0}$ to arrange
the gauge condition, as we explain. In this section, the Einstein summation
convention will be in force for Greek indices, which as usual run over all values.
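Now using the formula (2.3.10) derived earlier for the derivative of a determinant,
we see that

\[ \lambda^\alpha = \frac{1}{\sqrt{|\det \bar g|}} \Big( \sqrt{|\det \bar g|}\, \bar g^{\beta\gamma} x^\alpha_{,\gamma} \Big)_{,\beta} = \bar g^{\beta\alpha}{}_{,\beta} + \tfrac12 \bar g^{\beta\alpha} \bar g^{\rho\sigma} \bar g_{\rho\sigma,\beta}. \]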

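At $t = 0$ we get

\[ \lambda^0 = \bar g^{00}{}_{,0} + \tfrac12 \bar g^{00} \bar g^{\rho\sigma} \bar g_{\rho\sigma,0} = -\tfrac12 \bar g_{00,0} - \tfrac12 \sum_{i,j \ge 1} \bar g^{ij} \bar g_{ij,0}, \tag{5.3.2} \]

and, for $i \ge 1$,

\begin{align*}
\lambda^i &= \bar g^{0i}{}_{,0} + \sum_{j \ge 1} \Big( \bar g^{ji}{}_{,j} + \tfrac12 \bar g^{ji} \bar g^{\rho\sigma} \bar g_{\rho\sigma,j} \Big) \\
 &= \sum_{j \ge 1} \bar g_{0j,0}\, \bar g^{ji} + \sum_{j \ge 1} \Big( \bar g^{ji}{}_{,j} + \tfrac12 \bar g^{ji} \bar g^{\rho\sigma} \bar g_{\rho\sigma,j} \Big). \tag{5.3.3}
\end{align*}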

The final summation in (5.3.3) only involves spatial derivatives, so we can solve
for $\bar g_{0j,0}$ at $t = 0$ ($j \ge 1$) to arrange $\lambda^i|_{t=0} = 0$ for $i \ge 1$. We can also clearly use
(5.3.2) to determine $\bar g_{00,0}$ at $t = 0$ in order that $\lambda^0|_{t=0} = 0$ as well.
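To make this choice of the remaining Cauchy data concrete, here is a sketch of the resulting formulas, obtained by inserting the initial values $\bar g_{ij} = g_{ij}$, $\bar g_{ij,0} = -2K_{ij}$, $\bar g_{00} = -1$, $\bar g_{0j} = 0$ into (5.3.2) and (5.3.3). The index names $a, b, \ell, m$ are ours, and repeated Latin indices are summed over spatial values only.

```latex
% From (5.3.2): at t = 0 we have \bar g^{ij} \bar g_{ij,0} = -2 tr_g K, so
\[
\lambda^0\big|_{t=0} = -\tfrac12 \bar g_{00,0} + \mathrm{tr}_g K ,
\qquad\text{hence}\qquad
\lambda^0\big|_{t=0} = 0 \iff \bar g_{00,0} = 2\,\mathrm{tr}_g K .
\]
% From (5.3.3): along the slice \bar g_{00} is constant and \bar g^{0a} = 0, so
% \bar g^{\rho\sigma} \bar g_{\rho\sigma,\ell} = g^{ab} g_{ab,\ell}. Lowering the free index with g_{mi}:
\[
\lambda^i\big|_{t=0} = 0 \iff
\bar g_{0m,0} = - g_{mi}\Big( g^{\ell i}{}_{,\ell} + \tfrac12\, g^{\ell i} g^{ab} g_{ab,\ell} \Big).
\]
```

Only the spatial metric, its spatial derivatives, and $K$ enter these expressions, so the gauge condition at $t = 0$ places no further restriction on the data $(g, K)$.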
Now that we have specified all the Cauchy data, we can use standard theory for
nonlinear wave equations (see [51; 190; 218]) to obtain a solution $\bar g_{\mu\nu}$ to (5.3.1),
which is a Lorentzian metric on the product of $U$ with an interval about $0$, and
the induced geometry on the slice $\{0\} \times U$ is precisely $(U, g, K)$. The question
now is how to guarantee that $\lambda^\alpha = 0$ propagates in time, so that $\bar g$ solves the
Einstein equation. As we will see, a homogeneous linear wave equation for $\lambda^\alpha$
is a consequence of the Bianchi identities, while the Einstein constraints will
show that the initial time derivative of $\lambda^\alpha$ vanishes. Together with the preceding
paragraph, this will allow us to conclude that the gauge conditions that we have
arranged at $t = 0$ propagate in time.
We begin with a simple exercise.
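Exercise 5-15. Assuming $G^H_\Lambda(\bar g) = 0$, show that

\[ (G_\Lambda(\bar g))_{\mu\nu} = R_{\mu\nu} - \tfrac12 R(\bar g) \bar g_{\mu\nu} + \Lambda \bar g_{\mu\nu} = -\tfrac12 \bar g_{\alpha\mu} \lambda^\alpha_{,\nu} - \tfrac12 \bar g_{\alpha\nu} \lambda^\alpha_{,\mu} + \tfrac12 \bar g_{\mu\nu} \lambda^\alpha_{,\alpha}. \]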

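The vacuum constraints that are satisfied by the induced geometry $(U, g, K)$
are precisely $(G_\Lambda(\bar g))_{\mu\nu} n^\nu = 0$ for $\mu \ge 0$, or in our set up, $(G_\Lambda(\bar g))_{\mu 0} = 0$ at
$t = 0$; cf. (5.2.7)–(5.2.8). Note that by our arrangement of the gauge condition at
$t = 0$, $\lambda^\mu_{,i} = 0$ for $i \ge 1$ and all $\mu$. The component of the vacuum constraint for
$\mu = 0$ (at $t = 0$) is just

\[ 0 = (G_\Lambda(\bar g))_{00} = -\tfrac12 \bar g_{\alpha 0} \lambda^\alpha_{,0} - \tfrac12 \bar g_{\alpha 0} \lambda^\alpha_{,0} + \tfrac12 \bar g_{00} \lambda^\alpha_{,\alpha} = \tfrac12 \lambda^0_{,0}. \]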

For $i \ge 1$, we have at $t = 0$ (again using the vanishing of the spatial derivatives
of $\lambda^\mu$, and the initial condition on the metric)

\[ 0 = (G_\Lambda(\bar g))_{i0} = -\tfrac12 \bar g_{\alpha i} \lambda^\alpha_{,0} - \tfrac12 \bar g_{\alpha 0} \lambda^\alpha_{,i} + \tfrac12 \bar g_{i0} \lambda^\alpha_{,\alpha} = -\tfrac12 \sum_{j \ge 1} g_{ji} \lambda^j_{,0}. \]

Since this is true for $i \ge 1$ and the matrix $(g_{ij})$ is invertible, $\lambda^i_{,0}$ must vanish as
well at $t = 0$.

Exercise 5-16. Use the preceding exercise to show that for a solution of (5.3.1),
the vanishing of the divergence of $G_\Lambda(\bar g)$ is equivalent to

\[ 0 = -\tfrac12 \bar g_{\alpha\nu} \Box_{\bar g} \lambda^\alpha + B_{\nu\gamma}^{\theta}\big((\bar g_{\rho\sigma}), (\bar g_{\rho\sigma,\beta})\big) \lambda^\gamma_{,\theta}. \]

The functions $B_{\nu\gamma}^{\theta}$ arise from expanding the divergence of $G_\Lambda(\bar g)$ in components,
using the Christoffel symbols of $\bar g$, which involve metric components and their
first partials.

From this exercise we see that the partials of $\lambda^\alpha$ satisfy a homogeneous linear
hyperbolic system with vanishing initial data. Thus the $\lambda^\alpha$ vanish identically (see
[190, Chapters 8 and 12]). Hence the gauge condition holds, and the solution
to the reduced Einstein equation yields a solution to the Einstein equation, as
desired.

5.3.2.1. More on the evolution problem. What we have discussed above is a local
construction. One would like to piece local solutions together, by patching along
a cover of $\Sigma$, to obtain an existence result for a vacuum spacetime containing
$(\Sigma, g, K)$; see [51; 190; 218] for details. In order to formulate this, we sketch a
local uniqueness result for the evolution of initial data.
Suppose $(V_1, \bar g_1)$ and $(V_2, \bar g_2)$ are vacuum spacetimes containing respective
coordinate neighborhoods $U_1$ and $U_2$ on $\Sigma$, with induced geometry $(g, K)$.
Set $U = U_1 \cap U_2 \subset \Sigma$. There is a coordinate system in a neighborhood of
$U_1$ in $V_1$ with respect to which the metric components of $\bar g_1$ along $U_1$ satisfy
$(\bar g_1)_{ij} = g_{ij}$ and $(\bar g_1)_{0j} = 0$ for $i, j \ge 1$, and $(\bar g_1)_{00} = -1$; indeed, one can start from
coordinates $(x^i)$ for $U_1$ and build adapted coordinates by $(x^\mu) \mapsto \exp^{\bar g_1}_{p(x^i)}(x^0 n_1)$,
where $p(x^i) \in U_1$ corresponds to coordinates $(x^i)$ for $U_1$, and $n_1$ is a smooth
timelike unit normal along $U_1$. We see that $x^0 = 0$ corresponds to $U_1$, and
$n_1 = \partial/\partial x^0$ along $U_1$. We have then that for $i, j \ge 1$, $(\bar g_1)_{ij,0} = -2K_{ij}$ along
$U_1$. Actually in these specific coordinates, the geodesic equations also yield
$(\bar g_1)_{0\mu,0} = 0$ along $U_1$. The analogous construction can be made along $U_2$ with
unit normal $n_2$ with respect to $\bar g_2$, and we can build such coordinate
charts along $U$ from a common coordinate domain to respective neighborhoods
$W_1 \subset V_1$ and $W_2 \subset V_2$ of $U$. If we pull back $\bar g_1$ and $\bar g_2$ by these coordinate
charts, respectively, then along $x^0 = 0$ we will have both $(\bar g_1)_{\mu\nu} = (\bar g_2)_{\mu\nu}$ and
$(\bar g_1)_{\mu\nu,0} = (\bar g_2)_{\mu\nu,0}$ (again, this comes from the constraint equations, along with
the geodesic equation). Said another way, if we define the diffeomorphism
$\varphi : W_1 \to W_2$ by $\varphi\big(\exp^{\bar g_1}_{p(x^i)}(x^0 n_1)\big) = \exp^{\bar g_2}_{p(x^i)}(x^0 n_2)$, so that expressed in these
coordinates, $\varphi$ is the identity map, then $\varphi^* \bar g_2$ agrees with $\bar g_1$ along $U$, and
$(\varphi^* \bar g_2)_{\mu\nu,0} = (\bar g_1)_{\mu\nu,0}$ on $x^0 = 0$ in these coordinates for $W_1$.

We now move to wave coordinates, by modifying the given coordinates along
$U$ in $W_1$ and $W_2$, in exactly the way we did in Section 5.3.1: solve the wave
equation using the original coordinate functions and their time derivatives at
$x^0 = 0$ as the initial data; we can again restrict to a common coordinate domain
neighborhood $\tilde U$ around the $x^0 = 0$ slice. It is not hard to see that changing to
wave coordinates this way preserves the first-order partial derivative operators
along $U$, so in particular $(\bar g_1)_{\mu\nu}$ and $(\bar g_1)_{ij,0}$ for $i, j \ge 1$ (which remains equal
to $-2K_{ij}$) do not change at $x^0 = 0$. Note that in wave coordinates, we can
then determine $(\bar g_1)_{0\mu,0}$ from $(\bar g_1)_{\mu\nu}$ and $(\bar g_1)_{ij,0}$, via (5.3.2)–(5.3.3). Similar
comments apply to $\bar g_2$, and so in these wave coordinates, $(\bar g_1)_{\mu\nu} = (\bar g_2)_{\mu\nu}$ and
$(\bar g_1)_{\mu\nu,0} = (\bar g_2)_{\mu\nu,0}$ along $x^0 = 0$. But as we have expressed vacuum Einstein
metrics in wave coordinates, $(\bar g_1)_{\mu\nu}$ and $(\bar g_2)_{\mu\nu}$ are solutions of the reduced
Einstein equation (5.3.1), with the same initial conditions. Uniqueness for the
Cauchy problem for the reduced Einstein equation shows that the two solutions
must in fact agree.
We formulate the above in terms of a local isometry. If the wave coordinate
charts are given by $\tilde\varphi_i : \tilde U \to \tilde W_i \subset W_i$, let $\psi = \tilde\varphi_2 \circ \tilde\varphi_1^{-1} : \tilde W_1 \to \tilde W_2$ be the
induced diffeomorphism. In $\tilde W_1$, $\psi^* \bar g_2$ and $\bar g_1$ both satisfy the vacuum Einstein
equation. The components of the coordinate map $\tilde\varphi_2^{-1}$ are harmonic for $\bar g_2$, so that
the components of the pullback $\tilde\varphi_1^{-1} = \tilde\varphi_2^{-1} \circ \psi$, which are harmonic for $\bar g_1$, are
also harmonic for $\psi^*(\bar g_2)$. Thus both $((\bar g_1)_{\mu\nu})$ and $(\psi^*(\bar g_2)_{\mu\nu})$ solve the reduced
Einstein equation on $\tilde U$, and we can argue as above that $(\psi^* \bar g_2)_{\mu\nu} = (\bar g_1)_{\mu\nu}$
and $(\psi^* \bar g_2)_{\mu\nu,0} = (\bar g_1)_{\mu\nu,0}$ along $t = 0$. Of course in the $\tilde\varphi_1$ coordinates, the
components $\psi^*(\bar g_2)_{\mu\nu}$ are just the $\tilde\varphi_2$-components $(\bar g_2)_{\mu\nu}$. Thus we see that
$\psi^* \bar g_2 = \bar g_1$, so that $\bar g_1$ and $\bar g_2$ are isometric in neighborhoods around a common
solution $(U, g, K)$ of the constraint equations.
One can use local existence together with the above uniqueness result to build
a solution of the vacuum Einstein equation containing $(\Sigma, g, K)$. A natural
question is whether there is a maximal (in a certain sense) such solution which
is determined from the initial data by the Einstein equation. To formulate this
one utilizes causality notions, some of which were introduced in Chapter 3, to
establish the existence of a maximal globally hyperbolic spacetime development
of the initial data (see [51; 190; 218], and [191, Chapter 23] for a detailed
exposition). We will be content here with considering the following question:
given a solution $(\Sigma, g, K)$ of the vacuum Einstein constraint equations (with
$\Lambda = 0$, say), does there exist a globally hyperbolic spacetime $(\mathcal{S}, \bar g)$ (i.e., a
connected, time-oriented Lorentzian manifold with a Cauchy hypersurface; see
Corollary 3-38 and its converse), with the following conditions: (i) $\mathrm{Ric}(\bar g) = 0$;
(ii) there is an imbedding of $\Sigma$ in $\mathcal{S}$, the image of which we identify with
$\Sigma$, with induced fundamental forms $(g, K)$; and (iii) $\Sigma \subset (\mathcal{S}, \bar g)$ is a Cauchy
hypersurface? By Remark 3-31, (iii) means $\Sigma$ is achronal and $D(\Sigma) = \mathcal{S}$.
We build on the causality discussion in Chapter 3, and give a roadmap of
the argument, citing facts from [174]. Let $(\bar M, \bar g)$ be a connected time-oriented
Lorentzian manifold. If a subset $A \subset \bar M$ is achronal, then $\mathrm{int}\, D(A)$, if nonempty,
is globally hyperbolic, cf. [174, Theorem 14.38]. Moreover, if $A \subset \bar M$ is a
topological hypersurface which is acausal (i.e., no causal curve meets $A$ more
than once, a stronger condition than achronality), then $D(A)$ is open [174, Lemma
14.43]. As a consequence, we see that if $(\bar M, \bar g)$ satisfies conditions (i) and (ii)
above, such that $\Sigma \subset (\bar M, \bar g)$ is acausal, then $(\mathcal{S}, \bar g) = (D(\Sigma), \bar g)$ will satisfy (i),
(ii) and (iii).
By [174, Lemma 14.42], an achronal spacelike hypersurface is in fact acausal.
Moreover, [174, Lemma 14.45] shows that in case a closed, connected subset
$\Sigma \subset \bar M$ is a spacelike hypersurface such that $\bar M \setminus \Sigma$ is disconnected, then $\Sigma$ is
in fact achronal, and hence acausal, and hence by the above, $D(\Sigma)$ is open.
As an upshot of these causality results, we see that given a connected solution
$(\Sigma, g, K)$ of the vacuum constraints, for an open set $\bar M \subset \mathbb{R} \times \Sigma$ with $\{0\} \times \Sigma \subset \bar M$,
if we have (say, by solving Einstein’s equation locally and patching solutions
together as sketched above) a Lorentzian metric $\bar g$ on $\bar M$ with $\mathrm{Ric}(\bar g) = 0$, inducing
on $\{0\} \times \Sigma$ the initial data $(g, K)$, then $(D(\Sigma), \bar g)$ will satisfy properties (i), (ii),
(iii) as desired. Thus this spacetime $(\mathcal{S} = D(\Sigma), \bar g)$ is determined by the evolution
of the geometric data $(\Sigma, g, K)$, a topic we discuss further in the next section.

5.3.3. Evolution of the geometry. Consider a Lorentzian spacetime $(\mathcal{S}, \bar g) =
(I \times \Sigma^k, \bar g)$, where $0 \in I \subset \mathbb{R}$ is an interval, and where the slices $\Sigma_t = \{t\} \times \Sigma$
are spacelike. We let $g = g(t)$ be the induced metric on $\Sigma_t$. Let the unit vector
$n$ be normal to the slices, parallel to the spacetime gradient of $t$, pointing in the
same time direction as $\partial/\partial t = N n + X$, where $X$ is tangent to the slices and
$N > 0$. We call $N$ the lapse function and $X$ the shift vector field. We can write
the metric $\bar g$ in local coordinates $x^i$ for $\Sigma$ as follows (summation over $1, \dots, k$):

\[ \bar g = -N^2 dt^2 + g_{ij} (dx^i + X^i dt) \otimes (dx^j + X^j dt). \]

The first and second fundamental forms of the slices form a family of solutions
to the Einstein constraint equations. In our discussion of solving the Einstein
equations from initial data, we chose $N = 1$ and $X = 0$ on the initial slice. One
can use the solution of the initial value problem to determine a lapse and shift
for a spacetime splitting. You might also consider suitably prescribing $N$ and $X$,
possibly solving an auxiliary set of equations which might for instance impose
gauge conditions, and solve for the induced metric and second fundamental form
of the slices; cf. [7]. Given $N$ and $X$, we indicate below the evolution equations
of these geometric quantities on the slices.
It is useful to have the analogous formulas in the Riemannian setting as well.
So, with $\partial/\partial t = N n + X$, we let $\epsilon = \langle n, n \rangle = \pm 1$, and write the metric in the form

\begin{align*}
\bar g &= \epsilon N^2 dt^2 + g_{ij} (dx^i + X^i dt) \otimes (dx^j + X^j dt) \\
 &= (\epsilon N^2 + |X|_g^2)\, dt^2 + X^\flat \otimes dt + dt \otimes X^\flat + g,
\end{align*}

where $N^2 > |X|_g^2$ in the Lorentzian case. With our convention on $K$, we have
$\nabla_X Y = \nabla^\Sigma_X Y + \epsilon K(X, Y) n$, where $X$ and $Y$ are tangent to a slice (and, as here,
we may suppress the subscript on $\Sigma_t$).
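For computations it can help to have the component form of this metric and of its inverse side by side. The block below is a sketch derived from the lapse-shift form above together with $n = N^{-1}(\partial/\partial t - X)$; the inverse-metric formulas are our own bookkeeping (they follow from writing the inverse as $\epsilon\, n \otimes n + g^{ij} e_i \otimes e_j$) and are not quoted from the text.

```latex
% Components in the coordinates (t, x^1, ..., x^k), with X_i = g_{ij} X^j:
\[
\bar g_{tt} = \epsilon N^2 + |X|_g^2, \qquad
\bar g_{ti} = X_i, \qquad
\bar g_{ij} = g_{ij}.
\]
% Inverse metric:
\[
\bar g^{tt} = \epsilon N^{-2}, \qquad
\bar g^{ti} = -\epsilon N^{-2} X^i, \qquad
\bar g^{ij} = g^{ij} + \epsilon N^{-2} X^i X^j .
\]
% Check of one row of \bar g \bar g^{-1} = I:
\begin{align*}
\bar g_{tt}\bar g^{tt} + \bar g_{ti}\bar g^{it}
  &= (\epsilon N^2 + |X|_g^2)\,\epsilon N^{-2} - \epsilon N^{-2} |X|_g^2 = 1, \\
\bar g_{tt}\bar g^{tj} + \bar g_{ti}\bar g^{ij}
  &= -(\epsilon N^2 + |X|_g^2)\,\epsilon N^{-2} X^j + X^j + \epsilon N^{-2}|X|_g^2 X^j = 0 .
\end{align*}
```

In the Lorentzian case $\epsilon = -1$ this gives $\bar g^{tt} = -N^{-2}$, $\bar g^{ti} = N^{-2} X^i$, and $\bar g^{ij} = g^{ij} - N^{-2} X^i X^j$.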

5.3.3.1. The ADM equations. We compute the time derivative of the induced
metric. Let $e_i = \partial/\partial x^i$, $1 \le i \le k$, be a coordinate frame for $\Sigma$, and let $e_0 = \partial/\partial t$.
Using metric compatibility, the torsion-free property of the connection, and the
fact that all the $e_\mu$ commute, we have

\begin{align*}
\frac{\partial g_{ij}}{\partial t}
 &= \bar g(\nabla_{e_i} e_0, e_j) + \bar g(e_i, \nabla_{e_j} e_0) \\
 &= \bar g(\nabla_{e_i} (N n + X), e_j) + \bar g(e_i, \nabla_{e_j} (N n + X)) \\
 &= N \bar g(\nabla_{e_i} n, e_j) + N \bar g(e_i, \nabla_{e_j} n) + \bar g(\nabla_{e_i} X, e_j) + \bar g(e_i, \nabla_{e_j} X) \\
 &= -2N K_{ij} + g(\nabla^\Sigma_{e_i} X, e_j) + g(e_i, \nabla^\Sigma_{e_j} X) \\
 &= -2N K_{ij} + (L_X g)_{ij}, \tag{5.3.4}
\end{align*}

where $L_X g$ is the Lie derivative, $(L_X g)_{ij} = X_{i;j} + X_{j;i}$, where the semicolon
indicates covariant differentiation for the Levi-Civita connection of $g$. Note that
we can solve for the second fundamental form:

\[ K_{ij} = -\tfrac12 N^{-1} \Big( \frac{\partial g_{ij}}{\partial t} - (L_X g)_{ij} \Big). \]

A more laborious exercise determines the time evolution of $K$:

\[ \frac{\partial K_{ij}}{\partial t} = \epsilon N_{;ij} + (L_X K)_{ij} + N \big( \epsilon (R_{ij} - R^\Sigma_{ij}) - 2 K_i{}^\ell K_{j\ell} + K^\ell{}_\ell K_{ij} \big), \tag{5.3.5} \]

where $N_{;ij}$ are the components of $\mathrm{Hess}_g N$, $R_{ij}$ are components of $\mathrm{Ric}(\bar g)$, $R^\Sigma_{ij}$
are components of $\mathrm{Ric}(g)$, and $L_X K$ is the Lie derivative of $K$, $(L_X K)(Y, Z) =
X[K(Y, Z)] - K([X, Y], Z) - K(Y, [X, Z])$. Note that there are no time derivatives
of $N$ or $X$ in (5.3.4)–(5.3.5). We outline the proof of (5.3.5) in the remainder
of the section.

Exercise 5-17. a. Show that if $T$ is a (0, 2)-tensor, and $W$, $Y$ and $Z$ are vector
fields, then $(L_W T)(Y, Z) = (\nabla_W T)(Y, Z) + T(\nabla_Y W, Z) + T(Y, \nabla_Z W)$.

b. With $e_i$ and $\partial/\partial t = N n + X$ as above, if $Y = Y^i e_i$, then

\[ [n, Y] = \big( n[Y^i] + N^{-1} Y[X^i] \big) e_i + N^{-1} Y[N]\, n. \]

Conclude that the tangential component of $[n, e_\ell]$ is $-N^{-1} [X, e_\ell]$.

We will require a couple of formulas relating $g$ and $K$ to the ambient geometry
of $\bar g = \langle \cdot, \cdot \rangle$. Define the tensor $K^2$ by $K^2_{ij} = K_{ik} K_{j\ell} g^{k\ell}$, a metric contraction of
$K \otimes K$. Note that $\nabla_{e_i} n = -K_i{}^\ell e_\ell$, since $K_{ij} = \langle -\nabla_{e_i} n, e_j \rangle = K_i{}^\ell g_{\ell j}$, and thus
$\langle \nabla_{e_i} n, \nabla_{e_j} n \rangle = \langle -\nabla_{e_i} n, K_j{}^\ell e_\ell \rangle = K_{i\ell} K_j{}^\ell = K^2_{ij}$. In other words, $K^2(Y, Z) =
\langle \nabla_Y n, \nabla_Z n \rangle = -K(\nabla_Y n, Z)$.
We extend $K$, i.e., the family of tensors $K(t)$, as a tensor on $\mathcal{S}$ as follows: for
$p \in \Sigma_t$ and $v, w \in T_p \mathcal{S}$, we let $K(v, w) := K(t)(v^T, w^T)$, where $v^T = v - \epsilon \langle v, n \rangle n$
is the tangential projection of $v$ to $T_p \Sigma_t$.
The following identity we leave as an exercise, whose derivation is similar to
that of the Hamiltonian constraint equation (5.2.2).

Exercise 5-18. For $Y$ and $Z$ orthogonal to $n$, we have

\[ \bar g(R(Y, n, n), Z) = \epsilon \big( \mathrm{Ric}_{\bar g}(Y, Z) - \mathrm{Ric}_g(Y, Z) \big) + (\mathrm{tr}_g K) K(Y, Z) - K^2(Y, Z). \]

(Hint: Start by writing out $\mathrm{Ric}_{\bar g}(Y, Z)$ in an adapted orthonormal frame.)

We will also make use of the following form of what is sometimes referred to
as the Mainardi equation.

Lemma 5-19. For $Y$ and $Z$ orthogonal to $n$,

\[ \bar g(R(Y, n, n), Z) = (\nabla_n K)(Y, Z) - K^2(Y, Z) - \epsilon N^{-1} \mathrm{Hess}_g N(Y, Z). \tag{5.3.6} \]

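Proof. We first note the elementary identity

\begin{align*}
(\nabla_n K)(Y, Z) &= \nabla_n (K(Y, Z)) - K(\nabla_n Y, Z) - K(Y, \nabla_n Z) \\
 &= \nabla_n (K(Y, Z)) - K(\nabla_Y n, Z) - K([n, Y], Z) + \langle \nabla_Y n, \nabla_n Z \rangle \\
 &= \nabla_n (K(Y, Z)) + K^2(Y, Z) - K([n, Y], Z) + \langle \nabla_Y n, \nabla_n Z \rangle.
\end{align*}

Now use $[n, Y] = [n, Y]^T + \epsilon \langle [n, Y], n \rangle n = [n, Y]^T + N^{-1} Y[N]\, n$ (Exercise
5-17) and $K([n, Y], Z) = K([n, Y]^T, Z) = \langle -\nabla_{[n,Y]^T} n, Z \rangle$ (by definition) to
obtain

\begin{align*}
\langle R(Y, n, n), Z \rangle
 &= \langle \nabla_Y \nabla_n n - \nabla_n \nabla_Y n, Z \rangle - \langle \nabla_{[Y,n]} n, Z \rangle \\
 &= \langle \nabla_Y \nabla_n n, Z \rangle - \nabla_n \langle \nabla_Y n, Z \rangle + \langle \nabla_Y n, \nabla_n Z \rangle + \langle \nabla_{[n,Y]} n, Z \rangle \\
 &= \langle \nabla_Y \nabla_n n, Z \rangle + \nabla_n (K(Y, Z)) + \langle \nabla_Y n, \nabla_n Z \rangle \\
 &\qquad - K([n, Y], Z) + N^{-1} Y[N] \langle \nabla_n n, Z \rangle \\
 &= (\nabla_n K)(Y, Z) - K^2(Y, Z) + \nabla_Y \langle \nabla_n n, Z \rangle \\
 &\qquad - \langle \nabla_n n, \nabla_Y Z \rangle + N^{-1} Y[N] \langle \nabla_n n, Z \rangle.
\end{align*}

Now, for a vector field $Z$ normal to $n$,

\begin{align*}
\langle \nabla_n n, Z \rangle &= -\langle n, \nabla_n Z \rangle = -\langle n, \nabla_Z n + [n, Z] \rangle = -\langle n, [n, Z] \rangle \\
 &= -\epsilon N^{-1} Z[N]. \tag{5.3.7}
\end{align*}

Thus $\langle \nabla_n n, \nabla_Y Z \rangle = \langle \nabla_n n, \nabla^\Sigma_Y Z \rangle = -\epsilon N^{-1} (\nabla^\Sigma_Y Z)[N]$, and so

\begin{align*}
\nabla_Y \langle \nabla_n n, Z \rangle &- \langle \nabla_n n, \nabla_Y Z \rangle + N^{-1} Y[N] \langle \nabla_n n, Z \rangle \\
 &= -\epsilon \big( Y[N^{-1} Z[N]] - N^{-1} (\nabla^\Sigma_Y Z)[N] + N^{-2} Y[N] Z[N] \big) \\
 &= -\epsilon N^{-1} \mathrm{Hess}_g N(Z, Y) = -\epsilon N^{-1} \mathrm{Hess}_g N(Y, Z),
\end{align*}

as desired (the Hessian is with respect to the metric $g$, with connection $\nabla^\Sigma$). □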


To derive (5.3.5), we have with $\partial/\partial t = N n + X$, and recalling the tangential
component of $[n, e_\ell]$ is $-N^{-1} [X, e_\ell]$, as well as $K^2(Y, Z) = -K(\nabla_Y n, Z)$,

\begin{align*}
\frac{\partial K_{ij}}{\partial t}
 &= \nabla_{N n + X} [K(e_i, e_j)] \\
 &= N (\nabla_n K)_{ij} + N K(\nabla_n e_i, e_j) + N K(e_i, \nabla_n e_j) + X[K_{ij}] \\
 &= N (\nabla_n K)_{ij} + N K(\nabla_{e_i} n + [n, e_i], e_j) \\
 &\qquad + N K(e_i, \nabla_{e_j} n + [n, e_j]) + X[K_{ij}] \\
 &= N (\nabla_n K)_{ij} - 2N K^2_{ij} \\
 &\qquad + N K(-N^{-1} [X, e_i], e_j) + N K(e_i, -N^{-1} [X, e_j]) + X[K_{ij}] \\
 &= N (\nabla_n K)_{ij} - 2N K^2_{ij} + (L_X K)_{ij}.
\end{align*}

Putting this together with Exercise 5-18 and Lemma 5-19, we obtain (5.3.5).
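For the reader who wants the last step spelled out, the following sketch equates the two expressions for $\bar g(R(Y, n, n), Z)$ from Exercise 5-18 and (5.3.6), and substitutes the result into the formula for $\partial K_{ij}/\partial t$ just derived. It takes the statement of Exercise 5-18 for granted.

```latex
% Equating Exercise 5-18 with (5.3.6): the K^2 terms cancel, leaving
\[
(\nabla_n K)_{ij}
  = \epsilon\,(R_{ij} - R^\Sigma_{ij}) + (\mathrm{tr}_g K)\,K_{ij} + \epsilon N^{-1} N_{;ij}.
\]
% Substituting into  \partial_t K_{ij} = N(\nabla_n K)_{ij} - 2N K^2_{ij} + (L_X K)_{ij} :
\begin{align*}
\frac{\partial K_{ij}}{\partial t}
 &= \epsilon N (R_{ij} - R^\Sigma_{ij}) + N(\mathrm{tr}_g K) K_{ij} + \epsilon N_{;ij}
    - 2N K^2_{ij} + (L_X K)_{ij} \\
 &= \epsilon N_{;ij} + (L_X K)_{ij}
    + N\big( \epsilon (R_{ij} - R^\Sigma_{ij}) - 2 K_i{}^\ell K_{j\ell} + K^\ell{}_\ell K_{ij} \big),
\end{align*}
% using K^2_{ij} = K_i{}^\ell K_{j\ell} and tr_g K = K^\ell{}_\ell.
```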
The system (5.3.4)–(5.3.5) is known as the ADM equations, after Arnowitt,
Deser and Misner [10]. If the spacetime $(\mathcal{S}, \bar g)$ as above satisfies the vacuum
Einstein equation, then $R_{ij} = 0$ in (5.3.5), with $\epsilon = -1$. Turning this around,
suppose we have some lapse $N$ and shift $X$, and we want to build a vacuum
spacetime metric $\bar g$ by solving for $g = g(t)$. The spacetime scalar curvature
is given by $R(\bar g) = -\mathrm{Ric}_{\bar g}(n, n) + g^{ij} R_{ij}$, and the Einstein tensor is given by
$G(n, n) = \tfrac12 \mathrm{Ric}_{\bar g}(n, n) + \tfrac12 g^{ij} R_{ij}$, $G(n, e_i) = \mathrm{Ric}_{\bar g}(n, e_i)$, and $G_{ij} = R_{ij} -
\tfrac12 R(\bar g) g_{ij}$. We want to encode $G_{ij} = 0$ into the dynamical equations, since
as we have seen earlier $G(n, \cdot)$ is encoded in the constraint equations. With
$G(n, n) = \tfrac12 \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big)$, we have

\[ \mathrm{Ric}_{\bar g}(n, n) = R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 - g^{ij} R_{ij}, \]

and so $R(\bar g) = 2 g^{ij} R_{ij} - \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big)$. Note that if $G_{ij} = 0$, then
($k \ge 2$) $g^{ij} R_{ij} = \tfrac{k}{2} R(\bar g)$, and so $R(\bar g) = \tfrac{1}{k-1} \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big)$. Thus
$G_{ij} = 0$ can be written $R_{ij} = \tfrac{1}{2(k-1)} \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big) g_{ij}$.
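The trace argument in the last few sentences can be written out in two lines. In the sketch below, the abbreviation $C$ for the Hamiltonian-constraint expression is ours and is introduced only to shorten the display.

```latex
% Abbreviation (ours):  C := R(g) - |K|_g^2 + (tr_g K)^2 .
% Trace G_{ij} = R_{ij} - (1/2) R(\bar g) g_{ij} = 0 with g^{ij}, using g^{ij} g_{ij} = k:
\[
g^{ij} R_{ij} = \tfrac{k}{2}\, R(\bar g).
\]
% Insert this into  R(\bar g) = 2 g^{ij} R_{ij} - C :
\[
R(\bar g) = k\,R(\bar g) - C
\;\Longrightarrow\;
R(\bar g) = \frac{C}{k-1},
\qquad
R_{ij} = \tfrac12 R(\bar g)\, g_{ij} = \frac{C}{2(k-1)}\, g_{ij}.
\]
```

In particular, when the Hamiltonian constraint $C = 0$ holds on a slice, the condition $G_{ij} = 0$ reduces to $R_{ij} = 0$ there.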


If we thus insert $R_{ij}$ as above into the ADM equations, and if we have a
PDE framework that allows us to solve the system (cf., e.g., [7], and [17; 51]
and references therein), then we can construct $\bar g$ from the solution. The first
equation (5.3.4) guarantees that the tensor $K = K(t)$ is in fact the second
fundamental form of $\{t\} \times \Sigma \subset (\mathcal{S}, \bar g)$, and so (5.3.5) with $R_{ij}$ as above really
does imply that the spatial components $G_{ij}$ (tangent to $\{t\} \times \Sigma$) vanish. As we
recalled above, the other components will be given by the constraints. Indeed, we
have not yet mentioned the initial data $(g(0), K(0))$; we of course will impose
the vacuum constraints on this data, and in so doing, we will have $G_{\mu\nu} = 0$ on
the $t = 0$ slice.
The Einstein equation $G_{\mu\nu} = 0$ will come from the dynamical equations
(5.3.4)–(5.3.5) together with the constraint equations. The constraints are imposed at
$t = 0$, and we need to verify (as we did in Section 5.3.2) that they are propagated.
This is done as before, using the Bianchi identity $\bar g^{\alpha\beta} G_{\mu\alpha;\beta} = 0$, which, given
$G_{ij} = 0$, can be written as a (symmetrizable hyperbolic: see [51, Chapter VI] or
[214, Chapter 18]) system of linear homogeneous evolution equations for $G_{\mu 0}$,
with vanishing initial conditions (imposed by the constraints). Thus $G_{\mu 0}$ must
vanish identically.
Exercise 5-20. Suppose for simplicity that the lapse $N$ is $1$ and the shift $X$ is $0$,
and that we have solved for the spacetime metric $\bar g$ from (5.3.4)–(5.3.5) imposing
$G_{ij} = 0$ for $i, j \ge 1$ as above. Suppose furthermore that $(g, K)$ solve the constraint
equations at $t = 0$. Using $G_{ij} = 0$ and the Bianchi identity, show that $G_{00,0}$ is a
linear combination (with coefficients depending on $\bar g$ and its Christoffel symbols)
of $G_{0\mu}$ and $G_{i0,j}$. Show that $G_{i0,0}$ is such a linear combination, but only with
terms $G_{j0}$; you might want to observe $\Gamma^0_{i0} = 0$. Because of this, you can derive
a system of the form $\partial Z/\partial t = AZ$, where $A$ is a matrix function depending
smoothly (algebraically in fact) on $(\bar g, \partial \bar g, \partial^2 \bar g)$, and where the components of
the vector $Z$ are the components $G_{0\mu}$ and $G_{i0,j}$ for $\mu \ge 0$ and $i, j \ge 1$.

5.3.3.2. Hamiltonian formulation. The ADM equations are related to a Hamiltonian
formulation of the Einstein equations, for which we have already seen
a Lagrangian formulation. We will indicate some basic notions in this regard,
referring the interested reader to references such as [10; 17; 90; 51; 218].
In classical mechanics, one can often recast an action generated by a Lagrangian
$L = L(q, \dot q, t)$ in Hamiltonian form, using the Legendre transform
$H(q, p, t) = p \dot q - L(q, \dot q, t)$ with $p = \partial L/\partial \dot q$ to define the associated Hamiltonian
function [8; 220]; $q$ is the vector of state variables and $p$ is the vector
of the conjugate momenta (we have abbreviated by not including subscripts
to number the components, e.g., $p \dot q = \sum_{j=1}^{N} p_j \dot q_j$). Stationarity of the action
$\int_{t_1}^{t_2} L(q, \dot q, t)\, dt = \int_{t_1}^{t_2} (p \dot q - H(q, p, t))\, dt$, with respect to variations of $q(t)$
fixing the endpoint values $q(t_1)$ and $q(t_2)$, leads naturally to the Euler–Lagrange
equations

\[ \frac{d}{dt} \Big( \frac{\partial L}{\partial \dot q} \Big) = \frac{\partial L}{\partial q}; \]

using these along with $p = \partial L/\partial \dot q$ while examining the differential $dH$ leads to
Hamilton’s equations

\[ \dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}. \]

We remark that precisely the same equations characterize critical points of
the Hamiltonian form of the action over more general curves $(q(t), p(t))$ in
momentum phase space fixing the endpoint values $q(t_1)$ and $q(t_2)$, not only over
curves for which $p = \partial L/\partial \dot q$ [8].
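The step "examining the differential $dH$" can be sketched as follows for a single degree of freedom. The harmonic oscillator at the end is our own illustrative example, with a hypothetical frequency parameter $\omega$, and does not appear in the text.

```latex
% Differential of H(q,p,t) = p \dot q - L(q, \dot q, t):
\begin{align*}
dH &= \dot q\,dp + p\,d\dot q
      - \frac{\partial L}{\partial q}\,dq
      - \frac{\partial L}{\partial \dot q}\,d\dot q
      - \frac{\partial L}{\partial t}\,dt \\
   &= \dot q\,dp - \frac{\partial L}{\partial q}\,dq - \frac{\partial L}{\partial t}\,dt
   \qquad \text{(since } p = \partial L/\partial \dot q \text{)}.
\end{align*}
% Reading off coefficients and using the Euler--Lagrange equation:
\[
\frac{\partial H}{\partial p} = \dot q, \qquad
\frac{\partial H}{\partial q} = -\frac{\partial L}{\partial q}
  = -\frac{d}{dt}\Big(\frac{\partial L}{\partial \dot q}\Big) = -\dot p .
\]
% Illustrative example (ours): L = (1/2)\dot q^2 - (1/2)\omega^2 q^2.
\[
p = \dot q, \qquad
H = \tfrac12 p^2 + \tfrac12 \omega^2 q^2, \qquad
\dot q = p, \qquad \dot p = -\omega^2 q .
\]
```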
We can recast the Einstein–Hilbert action $\int_{\mathcal{S}} R(\bar g)\, dv_{\bar g}$ on $(\mathcal{S} = I \times \Sigma, \bar g)$,
where $\Sigma$ is a $k$-manifold ($k \ge 2$), in an analogous fashion, connecting the ADM
equations with Hamilton’s equations. We assume we can integrate over the time
interval $I = (t_1, t_2)$ (restricting from a larger interval if needed), with fields
extending smoothly to the boundary $\partial \mathcal{S}$, which in case $\Sigma$ has empty boundary,
consists of two slices $\Sigma_{t_1} = \{t_1\} \times \Sigma$ and $\Sigma_{t_2} = \{t_2\} \times \Sigma$. If $\mathcal{S}$ is not compact,
then as we have noted before, one can derive the field equations by considering
stationary points $\bar g$ of a regularized action defined by integrating $(R(\bar g + \bar h) - R(\bar g))$
for compactly supported variations $\bar h$, or one can work in spaces of metrics (with
associated deformations) in which the scalar curvature is integrable. Though
we generally take the field variations to vanish near the boundary (if nonempty)
or near infinity (in the noncompact case) when deriving the Euler–Lagrange
equations, we do remark that an analysis of the boundary terms (including the
behavior at infinity) is of interest (see [229; 113; 218], for example), and in
particular plays a role in the study of isolated gravitational systems, and in
the notion of quasilocal mass [158]. In fact, for isolated systems, the natural
decay rates are such that not all the boundary terms should be discarded in
the Hamiltonian analysis; as we will see in Section 7.2.1, the ADM energy-
momentum is defined in terms of flux integrals at infinity (i.e., as a limit of
flux integrals over large spheres in an asymptotic end), and these terms must be
included to get a well-defined Hamiltonian in the natural phase space [9; 16;
75; 187]. For simplicity on the first pass, we will have in mind the case when
$\Sigma$ is compact with no boundary, but see Remark 5-22 and Exercise 5-30 for a
brief discussion of boundary terms, and see Section 7.2.1 for a discussion in the
asymptotically flat setting. Finally, we note that to get the units of energy, we
want to consider the Hamiltonian related to the Lagrangian $\int_{\mathcal{S}} \tfrac{1}{2\kappa} R(\bar g)\, dv_{\bar g}$; see
Remark 5-23. For simplicity, we work below in units where $2\kappa = 1$, though in
later chapters we will use units where $G = 1$ and $c = 1$, so that the energy of
the Schwarzschild spacetime is $m$; in spacetime dimension four, for instance,
$2\kappa = 16\pi$.
We begin by finding expressions for the scalar curvature $R(\bar g)$ of $(\mathcal{S}, \bar g)$
with $\bar g = -N^2 dt^2 + g_{ij} (dx^i + X^i dt) \otimes (dx^j + X^j dt)$.

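Exercise 5-21. Prove that $\nabla_n (\mathrm{tr}_g K) = \mathrm{tr}_g (\nabla_n K)$. Then use Lemma 5-19 to show

\begin{align*}
R(\bar g) &= -2 \nabla_n (\mathrm{tr}_g K) - 2 N^{-1} \Delta_g N + 2 |K|_g^2 + \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big) \tag{5-21a} \\
 &= -2\, \mathrm{div}_{\bar g} ((\mathrm{tr}_g K) n) - 2 (\mathrm{tr}_g K)^2 - 2\, \mathrm{div}_{\bar g} (\nabla_n n) + 2 |K|_g^2 \\
 &\qquad + \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big). \tag{5-21b}
\end{align*}

(Hint: You might use (5.3.7) to help show $N^{-1} \Delta_g N = \mathrm{div}_{\bar g} (\nabla_n n)$.)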

To rewrite the Einstein–Hilbert action, we will use the ADM equations (5.3.4)
and (5.3.5), and we also note the following fact: $\sqrt{|\det \bar g|} = N \sqrt{\det g}$. To derive
this by direct expansion of the matrix for $\bar g$ in lapse-shift form, let $\xi^{(i)}$ be the
matrix obtained by replacing column $i$ of the matrix for $g$ with the column whose
entry in row $j$ is $X_j = g_{j\ell} X^\ell$, so that

\begin{align*}
\det \bar g &= (-N^2 + |X|_g^2) \det g - \sum_{i=1}^{k} X_i \det(\xi^{(i)}) \\
 &= (-N^2 + |X|_g^2) \det g - \sum_{i,j=1}^{k} (-1)^{i+j} X_i X_j M_{ij},
\end{align*}

where $M_{ij}$ is the $(i, j)$-minor determinant of the matrix for $g$. Cramer’s rule for
the inverse (Section 2.3.4) now can be employed to get $g^{ij} \det g = (-1)^{i+j} M_{ji}$,
from which the determinant formula follows.
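As a cross-check, the same determinant can be obtained from the standard block-determinant (Schur complement) formula. This route is an alternative to the cofactor expansion used in the text, not a restatement of it, and it assumes only that the matrix $(g_{ij})$ is invertible.

```latex
% Block form of \bar g in the coordinates (t, x^1, ..., x^k), with X^\flat = (X_1, ..., X_k):
\[
\bar g =
\begin{pmatrix}
 -N^2 + |X|_g^2 & X^\flat \\
 (X^\flat)^{T}  & g
\end{pmatrix}.
\]
% For an invertible lower-right block g, det of the block matrix equals
% det(g) times the Schur complement of g:
\[
\det \bar g
 = \det g \,\big( -N^2 + |X|_g^2 - X_i\, g^{ij} X_j \big)
 = \det g \,\big( -N^2 + |X|_g^2 - |X|_g^2 \big)
 = -N^2 \det g ,
\]
% and therefore
\[
\sqrt{|\det \bar g|} = N \sqrt{\det g}.
\]
```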

We can thus write the Einstein–Hilbert action over the spacetime product
as $\int_{\mathcal{S}} R(\bar g)\, dv_{\bar g} = \int_I \int_\Sigma R(\bar g) N\, dv_g\, dt$. Before we insert the formula (5-21b)
for $R(\bar g)$ and integrate, we note that with Hamilton’s equations, as well as the
evolution equation (5.3.4) for $\partial g/\partial t$, in mind, we let $\pi^{ij} = K^{ij} - (\mathrm{tr}_g K) g^{ij}$, and
$\hat\pi^{ij} = \hat K^{ij} - (\mathrm{tr}_g \hat K) g^{ij} = -\pi^{ij}$, and we consider the following expression:

\begin{align*}
\hat\pi^{ij} \frac{\partial g_{ij}}{\partial t}
 &= -\big( K^{ij} - (\mathrm{tr}_g K) g^{ij} \big) (-2N K_{ij}) + \hat\pi^{ij} (L_X g)_{ij} \\
 &= -2N (\mathrm{tr}_g K)^2 + 2N |K|_g^2 + \hat\pi^{ij} (X_{i;j} + X_{j;i}) \\
 &= -2N (\mathrm{tr}_g K)^2 + 2N |K|_g^2 + 2 \hat\pi^{ij} X_{i;j}. \tag{5.3.8}
\end{align*}

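We insert this into (5-21b) to obtain

\begin{align*}
R(\bar g) &= N^{-1} \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} - 2 N^{-1} \hat\pi^{ij} X_{i;j} + \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big) \\
 &\qquad - 2\, \mathrm{div}_{\bar g} ((\mathrm{tr}_g K) n) - 2\, \mathrm{div}_{\bar g} (\nabla_n n).
\end{align*}

Thus the action $\int_{\mathcal{S}} R(\bar g)\, dv_{\bar g}$ (modulo any boundary terms) is equal to

\[ \int_I \int_\Sigma \Big( \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} + 2 (\mathrm{div}_g \hat\pi)_i X^i + N \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big) \Big)\, dv_g\, dt. \]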
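The shift term in this expression comes from an integration by parts on each slice. The sketch below records that step, using the volume element $N\, dv_g\, dt$ and the convention $(\mathrm{div}_g \hat\pi)^i = \hat\pi^{ij}{}_{;j}$; the one-form notation $\hat\pi(X^\flat, \cdot)$ for the vector field with components $\hat\pi^{ij} X_i$ is ours.

```latex
% Multiply the shift term of R(\bar g) by the factor N from the volume element:
\begin{align*}
-2 N^{-1} \hat\pi^{ij} X_{i;j} \cdot N
 &= -2\, \hat\pi^{ij} X_{i;j} \\
 &= -2\, (\hat\pi^{ij} X_i)_{;j} + 2\, \hat\pi^{ij}{}_{;j} X_i \\
 &= -2\, \mathrm{div}_g\big( \hat\pi(X^\flat, \cdot) \big) + 2\, (\mathrm{div}_g \hat\pi)_i X^i .
\end{align*}
% On a compact slice without boundary the exact divergence integrates to zero:
\[
\int_\Sigma -2\, \hat\pi^{ij} X_{i;j}\, dv_g = \int_\Sigma 2\, (\mathrm{div}_g \hat\pi)_i X^i \, dv_g .
\]
```

When $\Sigma$ has boundary or is noncompact, the discarded divergence contributes a boundary or flux term, which is the kind of term the text defers to Exercise 5-30 and Section 7.2.1.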
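Remark 5-22. In case $\partial \Sigma$ is empty, the boundary terms are easily seen to be

\[ -\Big( \int_{\Sigma_{t_2}} 2\, \mathrm{tr}_g K\, dv_g - \int_{\Sigma_{t_1}} 2\, \mathrm{tr}_g K\, dv_g \Big) =: -\Big( \int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}} \Big) 2\, \mathrm{tr}_g K\, dv_g. \]

For more on boundary terms, see Exercise 5-30.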

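We now assemble this into Hamiltonian form. We have $\mathrm{tr}_g \hat\pi = (k - 1)\, \mathrm{tr}_g K$
and $|\hat\pi|_g^2 = |K|_g^2 + (k - 2)(\mathrm{tr}_g K)^2$. Define $H_{\mathrm{ADM}} = \int_\Sigma \hat H\, dv_g$, with

\begin{align*}
\hat H = \hat H(g, \hat\pi)
 &= -N \big( R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 \big) + 2 \big( \mathrm{div}_g K - d(\mathrm{tr}_g K) \big)_i X^i \\
 &= -N \Big( R(g) - |\hat\pi|_g^2 + \tfrac{1}{k-1} (\mathrm{tr}_g \hat\pi)^2 \Big) - 2 (\mathrm{div}_g \hat\pi)_i X^i \tag{5.3.9} \\
 &= (N, X) \cdot \hat\Phi(g, \hat\pi), \tag{5.3.10}
\end{align*}

where

\[ \hat\Phi(g, \hat\pi) = -\Big( R(g) - |\hat\pi|_g^2 + \tfrac{1}{k-1} (\mathrm{tr}_g \hat\pi)^2,\ 2\, \mathrm{div}_g \hat\pi \Big), \]

with the second component a one-form field. We emphasize that the vacuum
constraint equations (with $\Lambda = 0$) are just $\hat\Phi(g, \hat\pi) = 0$.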

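Thus we can write $\int_{\mathcal{S}} R(\bar g)\, dv_{\bar g}$ (once again, modulo boundary terms) as

\begin{align*}
\int_I \Big( \int_\Sigma \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t}\, dv_g - H_{\mathrm{ADM}} \Big)\, dt
 = \int_I \int_\Sigma \Big( \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot \hat\Phi(g, \hat\pi) \Big)\, dv_g\, dt. \tag{5.3.11}
\end{align*}

$N$ and $X$ appear as Lagrange multipliers for a constrained optimization problem.
Indeed, stationarity of the action with respect to variations of lapse and shift
yields the vacuum constraint equations.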
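With an eye toward Hamilton’s equations, we choose a background volume
element $d\mathring v$ on $\Sigma$, induced from a background metric $\mathring g$ on $\Sigma$. The function
$\theta_g = \sqrt{\det g / \det \mathring g}$ encodes the ratio of the volume elements (and in particular
is independent of coordinates). We let $\tilde\pi^{ij} = \theta_g \hat\pi^{ij}$, $\tilde H(g, \tilde\pi) = \theta_g \hat H(g, \hat\pi)$, and
$\tilde\Phi(g, \tilde\pi) = \theta_g \hat\Phi(g, \hat\pi)$, so that the action can be written

\begin{align*}
\int_I \Big( \int_\Sigma \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t}\, dv_g - H_{\mathrm{ADM}} \Big)\, dt
 &= \int_I \int_\Sigma \Big( \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} - \hat H \Big)\, dv_g\, dt
  = \int_I \int_\Sigma \Big( \tilde\pi^{ij} \frac{\partial g_{ij}}{\partial t} - \tilde H \Big)\, d\mathring v\, dt \\
 &= \int_I \int_\Sigma \Big( \tilde\pi^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot \tilde\Phi(g, \tilde\pi) \Big)\, d\mathring v\, dt. \tag{5.3.12}
\end{align*}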
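Starting from the stationarity of the action in Hamiltonian form, we can obtain
the ADM equations. Indeed consider a one-parameter variation (compactly
supported away from the boundary) $\hat\pi^{ij} + \epsilon\hat\sigma^{ij}$, keeping $g$ fixed, or equivalently
$\tilde\pi^{ij} + \epsilon\tilde\sigma^{ij} = \theta_g(\hat\pi^{ij} + \epsilon\hat\sigma^{ij})$, and define
\[
\hat\sigma^{ij}\frac{\delta\hat H}{\delta\hat\pi^{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\hat H(g,\hat\pi + \epsilon\hat\sigma),
\qquad
\tilde\sigma^{ij}\frac{\delta\tilde H}{\delta\tilde\pi^{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\tilde H(g,\tilde\pi + \epsilon\tilde\sigma).
\]
We then observe that $\delta\hat H/\delta\hat\pi^{ij} = \delta\tilde H/\delta\tilde\pi^{ij}$, and moreover with $H_{\mathrm{ADM}}(g,\hat\pi) := \int_\Sigma\hat H\,dv_g = \int_\Sigma\tilde H\,d\mathring v =: H_{\mathrm{ADM}}(g,\tilde\pi)$,
\[
\frac{d}{d\epsilon}\Big|_{\epsilon=0}H_{\mathrm{ADM}}(g,\hat\pi + \epsilon\hat\sigma)
= \int_\Sigma\hat\sigma^{ij}\frac{\delta\hat H}{\delta\hat\pi^{ij}}\,dv_g
= \int_\Sigma\tilde\sigma^{ij}\frac{\delta\tilde H}{\delta\tilde\pi^{ij}}\,d\mathring v
= \frac{d}{d\epsilon}\Big|_{\epsilon=0}H_{\mathrm{ADM}}(g,\tilde\pi + \epsilon\tilde\sigma).
\]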
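From the first two lines of (5.3.12), the corresponding variation of the action is
\[
\int_I\int_\Sigma\hat\sigma^{ij}\Bigl(\frac{\partial g_{ij}}{\partial t} - \frac{\delta\hat H}{\delta\hat\pi^{ij}}\Bigr)dv_g\,dt
= \int_I\int_\Sigma\tilde\sigma^{ij}\Bigl(\frac{\partial g_{ij}}{\partial t} - \frac{\delta\tilde H}{\delta\tilde\pi^{ij}}\Bigr)d\mathring v\,dt.
\tag{5.3.13}
\]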

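For this to vanish for all $\hat\sigma$ (compactly supported away from the boundary), we
conclude one of Hamilton’s equations
\[
\frac{\partial g_{ij}}{\partial t} = \frac{\delta\hat H}{\delta\hat\pi^{ij}} = \frac{\delta\tilde H}{\delta\tilde\pi^{ij}}.
\tag{5.3.14}
\]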
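If we now consider a (compactly supported) variation $g_{ij} + \epsilon h_{ij}$, with $\tilde\pi^{ij}$
fixed (so $\hat\pi$ may change with $g$), we define
\[
h_{ij}\frac{\delta\tilde H}{\delta g_{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\tilde H(g + \epsilon h,\tilde\pi),
\]
so that
\[
\frac{d}{d\epsilon}\Big|_{\epsilon=0}H_{\mathrm{ADM}}(g + \epsilon h,\tilde\pi) =: \int_\Sigma h_{ij}\frac{\delta\tilde H}{\delta g_{ij}}\,d\mathring v.
\]
The variation of the action $\int_I\int_\Sigma\bigl(\tilde\pi^{ij}\frac{\partial g_{ij}}{\partial t} - \tilde H\bigr)d\mathring v\,dt$ is then
\[
\int_I\int_\Sigma\Bigl(\tilde\pi^{ij}\frac{\partial h_{ij}}{\partial t} - h_{ij}\frac{\delta\tilde H}{\delta g_{ij}}\Bigr)d\mathring v\,dt.
\]
We integrate by parts in t; since the expression above must vanish for all
compactly supported h, we get another of Hamilton’s equations
\[
\frac{\partial\tilde\pi^{ij}}{\partial t} = -\frac{\delta\tilde H}{\delta g_{ij}}.
\tag{5.3.15}
\]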
We now determine what (5.3.14) means. We compute from (5.3.9)
\[
\hat\sigma^{ij}\frac{\delta\hat H}{\delta\hat\pi^{ij}}
= 2N\hat\sigma^{ij}\Bigl(\hat\pi_{ij} - \frac{1}{k-1}(\operatorname{tr}_g\hat\pi)\,g_{ij}\Bigr) - 2\hat\sigma^{ij}{}_{;j}X_i
= 2N\hat\sigma^{ij}\hat K_{ij} - 2\hat\sigma^{ij}{}_{;j}X_i.
\]
Thus with an integration by parts we find
\[
\int_\Sigma\hat\sigma^{ij}\frac{\delta\hat H}{\delta\hat\pi^{ij}}\,dv_g = \int_\Sigma\hat\sigma^{ij}\bigl(2N\hat K_{ij} + X_{i;j} + X_{j;i}\bigr)\,dv_g.
\]
By (5.3.13), we thus recover the first of the ADM equations, $\partial g_{ij}/\partial t = -2NK_{ij} + (L_X g)_{ij}$.
A similar (but more laborious) analysis produces the second ADM
equation, which we leave as an exercise; cf. [51] or [218, Appendix E].
We can write the ADM equations in a symplectic form using the constraint
map $\hat\Phi$, as follows. If we let
\[
D\hat\Phi_{(g,\hat\pi)}(h,\hat\sigma) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\hat\Phi(g + \epsilon h,\hat\pi + \epsilon\hat\sigma),
\]
we define a formal $L^2(dv_g)$-adjoint operator by the prescription
\[
\int_\Sigma (N,X)\cdot D\hat\Phi_{(g,\hat\pi)}(h,\hat\sigma)\,dv_g = \int_\Sigma (h,\hat\sigma)\cdot_g D\hat\Phi^*_{(g,\hat\pi)}(N,X)\,dv_g
\]
for all compactly supported (vanishing near the boundary) $(h,\hat\sigma)$. We similarly
define $D\tilde\Phi_{(g,\tilde\pi)}(h,\tilde\sigma) = \frac{d}{d\epsilon}\big|_{\epsilon=0}\tilde\Phi(g + \epsilon h,\tilde\pi + \epsilon\tilde\sigma)$ and
\[
\int_\Sigma (N,X)\cdot D\tilde\Phi_{(g,\tilde\pi)}(h,\tilde\sigma)\,d\mathring v = \int_\Sigma (h,\tilde\sigma)\cdot_g D\tilde\Phi^*_{(g,\tilde\pi)}(N,X)\,d\mathring v.
\]

One needs to take care when computing this adjoint, since the volume element
for the integral is induced from g̊, but the derivative operators would naturally be
taken with respect to g (factors of θg are used to compensate; cf. Exercise 5-28).
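Using (5.3.11) and (5.3.12), we obtain the first variation of the action fixing $g$
and varying $\hat\pi^{ij} + \epsilon\hat\sigma^{ij}$ (equivalently $\tilde\pi^{ij} + \epsilon\tilde\sigma^{ij}$) to be
\[
\int_I\int_\Sigma\Bigl(\hat\sigma^{ij}\frac{\partial g_{ij}}{\partial t} - (N,X)\cdot D\hat\Phi_{(g,\hat\pi)}(0,\hat\sigma)\Bigr)dv_g\,dt
= \int_I\int_\Sigma\Bigl(\hat\sigma^{ij}\frac{\partial g_{ij}}{\partial t} - (0,\hat\sigma)\cdot_g D\hat\Phi^*_{(g,\hat\pi)}(N,X)\Bigr)dv_g\,dt;
\]
equivalently,
\[
\int_I\int_\Sigma\Bigl(\tilde\sigma^{ij}\frac{\partial g_{ij}}{\partial t} - (N,X)\cdot D\tilde\Phi_{(g,\tilde\pi)}(0,\tilde\sigma)\Bigr)d\mathring v\,dt
= \int_I\int_\Sigma\Bigl(\tilde\sigma^{ij}\frac{\partial g_{ij}}{\partial t} - (0,\tilde\sigma)\cdot_g D\tilde\Phi^*_{(g,\tilde\pi)}(N,X)\Bigr)d\mathring v\,dt.
\tag{5.3.16}
\]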

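Similarly, if we keep $\tilde\pi^{ij}$ fixed and vary $g$ by $g_{ij} + \epsilon h_{ij}$ we get the first variation
of the action to be
\[
\int_I\int_\Sigma\Bigl(\tilde\pi^{ij}\frac{\partial h_{ij}}{\partial t} - (N,X)\cdot D\tilde\Phi_{(g,\tilde\pi)}(h,0)\Bigr)d\mathring v\,dt
= \int_I\int_\Sigma\Bigl(-h_{ij}\frac{\partial\tilde\pi^{ij}}{\partial t} - (h,0)\cdot_g D\tilde\Phi^*_{(g,\tilde\pi)}(N,X)\Bigr)d\mathring v\,dt.
\tag{5.3.17}
\]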
By (5.3.17) and (5.3.16), for the action to be stationary for variations of
$g_{ij}$, $\tilde\pi^{ij}$, as well as N and X (which yield the constraints), we have, with the
components of the adjoint written vertically and with the indices raised or lowered
by g to match the tensor type,
\[
\begin{pmatrix}\partial g/\partial t \\ \partial\tilde\pi/\partial t\end{pmatrix}
= \begin{pmatrix}0 & 1 \\ -1 & 0\end{pmatrix} D\tilde\Phi^*_{(g,\tilde\pi)}(N,X).
\tag{5.3.18}
\]

Compare [51, p. 154–155] (slightly different notation), or [17] (check signs).


This illustrates how the constraints operator generates the evolution; furthermore,
an element in the kernel of $D\tilde\Phi^*_{(g,\tilde\pi)}$ (cf. Exercise 5-28) corresponds to a Killing
field of the vacuum spacetime, which is then stationary (if the Killing field is
timelike), see [163]. Such an element (N, X) in the kernel is called a KID, for
Killing initial data, and the presence (or lack thereof) of KIDs has implications
for constructions involving the constraints, such as linearization stability [88;
163], as well as for gluing constructions of solutions to the constraints [61; 69].

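Remark 5-23. As noted, to get the units right, we consider the Hamiltonian
related to the Lagrangian $\int_S \frac{1}{2\kappa}R(\bar g)\,dv_{\bar g}$. We can readily modify the above to
incorporate $\kappa$. Indeed, the canonical momentum would be $\hat\pi_\kappa^{ij} = \frac{1}{2\kappa}\hat\pi^{ij}$, with
$\tilde\pi_\kappa^{ij} = \frac{1}{2\kappa}\tilde\pi^{ij}$. The Hamiltonian would change to $H^\kappa_{\mathrm{ADM}}(g,\hat\pi_\kappa) = \frac{1}{2\kappa}H_{\mathrm{ADM}}(g,\hat\pi)$
with the weighted Hamiltonian $\tilde H_\kappa(g,\tilde\pi_\kappa) = \frac{1}{2\kappa}\tilde H(g,\tilde\pi) = \frac{1}{2\kappa}\theta_g\hat H(g,\hat\pi)$, so that
$\hat H_\kappa(g,\hat\pi_\kappa) = \frac{1}{2\kappa}\hat H(g,\hat\pi)$, and finally $\hat\Phi_\kappa(g,\hat\pi_\kappa) = \frac{1}{2\kappa}\hat\Phi(g,\hat\pi)$, with the weighted
form $\tilde\Phi_\kappa(g,\tilde\pi_\kappa) = \frac{1}{2\kappa}\tilde\Phi(g,\tilde\pi)$. Hamilton’s equations (5.3.14)–(5.3.15) retain the
same form, since one can readily show that $\delta\hat H_\kappa/\delta\hat\pi_\kappa^{ij} = \delta\hat H/\delta\hat\pi^{ij} = \delta\tilde H_\kappa/\delta\tilde\pi_\kappa^{ij}$,
while $\partial\tilde\pi_\kappa^{ij}/\partial t = -\delta\tilde H_\kappa/\delta g_{ij}$ is obtained simply by multiplying (5.3.15) by $\frac{1}{2\kappa}$.
Finally, (5.3.18) would retain the same form, as it can be readily seen to be
equivalent to the same equation with $\tilde\Phi_\kappa$ and $\tilde\pi_\kappa$ in place of $\tilde\Phi$ and $\tilde\pi$. See also
Exercise 5-28.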

Exercises

Exercise 5-24. Let $g_{\mathbb{E}^3}$ be the Euclidean metric. Let $I \subset \mathbb{R}$ be an interval, and let
$a : I \to (0,+\infty)$ be a smooth function. Let $S = I \times \mathbb{R}^3$, with $\bar g = -dt^2 + (a(t))^2 g_{\mathbb{E}^3}$.
On the slices $\{t\}\times\mathbb{R}^3$, $g(t) = (a(t))^2 g_{\mathbb{E}^3}$, and $n = \partial/\partial t$ is a global unit normal
field along the slices. Let a zero index denote the $\partial/\partial t$-component direction.
a. Use the ADM equations to compute the second fundamental form $K$, as well
as the components of $\operatorname{Ric}(\bar g)$ in directions tangent to the slices.
b. Use the relation between $g$ and $K$ and $G_{\mu 0}$ to compute $G_{\mu 0}$; then compute $G_{\mu\nu}$.

Exercise 5-25 (shape operator). Suppose $\Sigma \subset (M, \bar g = \langle\cdot,\cdot\rangle)$ is an embedded
oriented hypersurface, with smooth unit normal vector field $\nu$, and with induced
metric (first fundamental form) $g$ with Levi-Civita connection $\nabla^\Sigma$. For $W \in T_p\Sigma$,
let $S_p(W) = -\nabla_W\nu$.
a. Show that for all $p \in \Sigma$, $S = S_p$ gives a linear operator $S : T_p\Sigma \to T_p\Sigma$,
the shape operator. Show that $S$ is self-adjoint: $\langle S(V), W\rangle = \langle V, S(W)\rangle$ for
$V, W \in T_p\Sigma$.
The corresponding bilinear form, given by $K(V,W) = \langle S(V), W\rangle$, is the
second fundamental form of $\Sigma$, and is symmetric.
b. The covariant derivative of $S$ as a (1, 1)-tensor field on $\Sigma$ produces a (1, 2)-tensor
field $\nabla^\Sigma S$. Show that for $V$ and $W$ in $T_p\Sigma$,
\[
(\nabla^\Sigma_V S)(W) = \nabla^\Sigma_V(S(W)) - S(\nabla^\Sigma_V W).
\]
c. Consider the case where $\Sigma \subset (M,\bar g) = (\mathbb{R}^n, g_{\mathbb{E}^n})$ is an oriented hypersurface
in Euclidean space. Prove $(\nabla^\Sigma_V S)(W) = (\nabla^\Sigma_W S)(V)$ for $V$ and $W$ in $T_p\Sigma$, and
conclude $\operatorname{div}_g S = d(\operatorname{tr}_g S)$ on $\Sigma$.
d. $\Sigma$ is totally umbilic at $p \in \Sigma$ if there is a normal vector $Z$ at $p$ for which
$\mathrm{II}(V,W) = \langle V, W\rangle Z$ for all $V, W \in T_p\Sigma$. $\Sigma$ is totally umbilic if it is totally
umbilic at each point. In this case, show that $Z$ is a uniquely defined smooth
vector field along $\Sigma$. Show that $\Sigma$ is totally umbilic if and only if there is a
(smooth) function $f$ on $\Sigma$ such that $S_p(V) = f(p)V$ for all $p \in \Sigma$ and all
$V \in T_p\Sigma$.
e. If $\Sigma$ is a connected, totally umbilic hypersurface in Euclidean space $(\mathbb{R}^n, g_{\mathbb{E}^n})$,
$n \ge 3$, show that the function $f$ from part d. must be constant. If $\Sigma$ is also
compact (and without boundary), one can show furthermore that $\Sigma$ is a round
sphere; see [174], for instance.

Exercise 5-26 (some classical surface geometry). Let $X : U \to \mathbb{R}^3$ be an embedding
of an open subset $U$ of $\mathbb{R}^2$ onto a surface $\Sigma = X(U)$ in Euclidean space;
we let $D$ be the Levi-Civita connection for $g_{\mathbb{E}^3} = \langle\cdot,\cdot\rangle$. The components of the
induced metric (first fundamental form) on $\Sigma$ are written $g_{11} = E = \langle X_u, X_u\rangle > 0$,
$g_{12} = F = \langle X_u, X_v\rangle = g_{21}$, and $g_{22} = G = \langle X_v, X_v\rangle > 0$. Let
\[
\nu = \frac{X_u \times X_v}{\|X_u \times X_v\|}
\]
be a smooth unit normal field. For each $p \in \Sigma$, let $S = S_p$ be the shape operator,
defined for $W \in T_p\Sigma$ by $S_p(W) = -D_W\nu$. By Exercise 5-25, $S_p$ defines a
self-adjoint linear operator on $T_p\Sigma$, so that the associated bilinear form given
by $K(V,W) = \langle S(V), W\rangle$, for $V, W \in T_p\Sigma$, is symmetric. $S$ is diagonalizable,
with the principal curvatures defined to be the eigenvalues $\kappa_1$ and $\kappa_2$. The Gauss
curvature is defined as $\kappa_1\kappa_2$.
a. Suppose $S = S_p$ is represented by the matrix $\begin{pmatrix} a & b \\ c & d\end{pmatrix}$ in the basis $\{X_u, X_v\}$, and
that $K$ is represented in the same basis by $\begin{pmatrix}\ell & m \\ m & n\end{pmatrix}$. Relate these two matrices to
the first fundamental form matrix $\begin{pmatrix} E & F \\ F & G\end{pmatrix}$. These are the Weingarten equations.
b. What do you get if you decompose the vector equations $X_{uuv} - X_{uvu} = 0$ and
$X_{vuv} - X_{vvu} = 0$ into tangential and normal components? (Answer: The Gauss–Codazzi
equations. From the tangential components comes the Gauss equation,
which shows the Gauss curvature $\kappa_1\kappa_2$ (derived from the second fundamental
form) is the sectional curvature of $\Sigma$, which can be computed from the first
fundamental form. This is known as Gauss’s theorema egregium, or “remarkable
theorem”.)

Exercise 5-27. a. Prove that Euclidean space $(\mathbb{R}^3, g_{\mathbb{E}^3})$ does not admit a closed
immersed minimal surface (one for which $H = 0$). To do this, show that there
must be a point on the surface where the Gaussian curvature is strictly positive.
Generalize this result to closed immersed hypersurfaces in $(\mathbb{R}^n, g_{\mathbb{E}^n})$ for $n \ge 3$.
b. Conclude that any embedding of a two-torus $\mathbb{T}^2$ into Euclidean $\mathbb{R}^3$ is not flat.
Show that there is an embedding of a two-torus into Euclidean $\mathbb{R}^4$ for which the
induced metric is flat. Can you likewise embed a flat torus isometrically in the
product $\mathbb{S}^2 \times \mathbb{S}^2$ of round spheres?
c. Consider the surface of revolution in Euclidean space given by $\sqrt{x^2 + y^2} = \cosh z$.
The surface is a catenoid. Show the catenoid is minimal, and compute
its Gaussian curvature.

Exercise 5-28. Let $(\Sigma, g)$ be Riemannian, and let $\hat\pi$ be a symmetric (2, 0)-tensor
on $\Sigma$. Recalling notation from Section 5.3.3.2, we let $\tilde\pi = \theta_g\hat\pi$ and
$\tilde\Phi(g,\tilde\pi) = \theta_g\hat\Phi(g,\theta_g^{-1}\tilde\pi)$, from which we have
\[
H_{\mathrm{ADM}} = \int_\Sigma (N,X)\cdot\tilde\Phi(g,\tilde\pi)\,d\mathring v = \int_\Sigma (N,X)\cdot\hat\Phi(g,\theta_g^{-1}\tilde\pi)\,dv_g.
\]
a. Prove that, with $\tilde\sigma = \theta_g\hat\sigma$,
\[
D\tilde\Phi_{(g,\tilde\pi)}(h,\tilde\sigma)
= \theta_g D\hat\Phi_{(g,\hat\pi)}(h,\hat\sigma) + \tfrac12\theta_g\Bigl((\operatorname{tr}_g h)\,\hat\Phi(g,\hat\pi) - D\hat\Phi_{(g,\hat\pi)}\bigl(0,(\operatorname{tr}_g h)\hat\pi\bigr)\Bigr).
\tag{5-28a}
\]
From here you can conclude that if $(g,\hat\pi)$ solves the vacuum constraint equations,
then $(N,X)$ is in the kernel of $D\hat\Phi^*_{(g,\hat\pi)}$ if and only if it is in the kernel of $D\tilde\Phi^*_{(g,\tilde\pi)}$.
b. Show that the kernel of $D\hat\Phi^*_{(g,\hat\pi)}$ agrees with the kernel of $(D\hat\Phi_\kappa)^*_{(g,\hat\pi_\kappa)}$, and if
$(g,\pi)$ solves the vacuum constraint equations, find a simple bijection between
the kernel of $D\hat\Phi^*_{(g,\hat\pi)}$ and the kernel of the map $D\Phi^*_{(g,\pi)}$ defined in (5-29a).

Exercise 5-29. Consider the vacuum constraints operator on $(M^n, g, \pi)$,
\[
\Phi(g,\pi) = \Bigl(R(g) - |\pi|_g^2 + \frac{1}{n-1}(\operatorname{tr}_g\pi)^2,\ \operatorname{div}_g\pi\Bigr),
\]
where $g$ is a Riemannian metric, and $\pi^{ij}$ is a (2, 0)-tensor, whose divergence
defines a vector field. We define the operator $D\Phi^*_{(g,\pi)}$, from the relation
\[
\int_M (N,X)\cdot_g D\Phi_{(g,\pi)}(h,\sigma)\,dv_g = \int_M D\Phi^*_{(g,\pi)}(N,X)\cdot_g (h,\sigma)\,dv_g,
\tag{5-29a}
\]
assuming the integrand has compact support (in the interior of $M$, should $\partial M$
be nonempty).
a. Suppose $\Phi(g,0) = 0$. Find $D\Phi_{(g,0)}(h,\sigma)$, where $h$ is a symmetric (0, 2)-tensor
and $\sigma$ is a symmetric (2, 0)-tensor, and then find the operator $D\Phi^*_{(g,0)}$.
Since $\pi = 0$, the linearization simplifies dramatically from the general case.
b. Find the kernel of $D\Phi^*_{(g_{\mathbb{E}^n},0)}$ at the Minkowski data on $M = \mathbb{R}^n$. (Hint: for
$n = 3$, the kernel is ten-dimensional.)
c. Let $m \ne 0$. Find the kernel of $D\Phi^*_{(g_S,0)}$ at the data for a constant t-slice in
Schwarzschild $g_S = \bigl(1 + \frac{m}{2|x|}\bigr)^4 g_{\mathbb{E}^3}$ on $M = \mathbb{R}^3 \setminus \bigl\{x : |x| \le \max\bigl\{0, -\frac{m}{2}\bigr\}\bigr\}$. (Hint:
the kernel is four-dimensional; cf. Exercise 2-53.)
d. Find $D\Phi_{(g,\pi)}(h,\sigma)$ and $D\Phi^*_{(g,\pi)}(N,X)$ in general, proving the following
formulas:
\[
\begin{aligned}
D\Phi_{(g,\pi)}(h,\sigma)
= \Bigl(& L_g h - 2h_{ij}\pi^{i\ell}\pi^{j}{}_{\ell} - 2\pi^{j}{}_{k}\sigma^{k}{}_{j} + \frac{2}{n-1}\operatorname{tr}_g\pi\,\bigl(h_{ij}\pi^{ij} + \operatorname{tr}_g\sigma\bigr), \\
& (\operatorname{div}_g\sigma)^i - \tfrac12\pi^{jk}h_{jk;\ell}\,g^{\ell i} + \pi^{jk}h^{i}{}_{j;k} + \tfrac12\pi^{ij}(\operatorname{tr}_g h)_{,j}\Bigr),
\end{aligned}
\tag{5-29b}
\]
\[
\begin{aligned}
D\Phi^*_{(g,\pi)}(N,X)
= \Bigl(& (L_g^* N)_{ij} + N\Bigl(\frac{2}{n-1}(\operatorname{tr}_g\pi)\pi_{ij} - 2\pi_{ik}\pi^{k}{}_{j}\Bigr) + \tfrac12\bigl(g_{i\ell}g_{jm}(L_X\pi)^{\ell m} + (\operatorname{div}_g X)\pi_{ij}\bigr) \\
& - \tfrac12\bigl(X_i\pi^{k}{}_{j;k} + X_j\pi^{k}{}_{i;k} + X_{k;m}\pi^{km}g_{ij} + X_k\pi^{km}{}_{;m}\,g_{ij}\bigr), \\
& -\tfrac12\bigl(X^{i}{}_{;\ell}\,g^{\ell j} + X^{j}{}_{;\ell}\,g^{\ell i}\bigr) + N\Bigl(\frac{2}{n-1}(\operatorname{tr}_g\pi)g^{ij} - 2\pi^{ij}\Bigr)\Bigr),
\end{aligned}
\tag{5-29c}
\]
where $L_g$ is the linearization of the scalar curvature operator (cf. (2.3.11)), and
$L_X$ is the Lie derivative (cf. Exercise 5-17).

Exercise 5-30 (boundary terms in the Hamiltonian). We briefly explore the
boundary terms in the Hamiltonian formulation for the (vacuum) Einstein equation,
cf. [113; 158; 218; 229]. The boundary integral $\bigl(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\bigr)\,2\operatorname{tr}_g K\,dv_g$
is often added to the Einstein–Hilbert action, and from Remark 5-22, we see
it cancels out the first boundary term we discarded. We now assume $\partial\Sigma$ is
nonempty, so that $\partial S$ has an additional piece given by the timelike hypersurface
$I \times \partial\Sigma$.

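a. In general the outward unit conormal field $\nu$ to the slices $\Sigma_t$ does not agree
with the outward unit normal of $\partial S$. To analyze the boundary terms, you might
rewrite the scalar curvature from (5-21a), and combine with (5.3.8), to obtain
\[
\begin{aligned}
R(\bar g) ={}& 2N^{-1}\Bigl(-\frac{\partial}{\partial t}(\operatorname{tr}_g K) + \nabla_X(\operatorname{tr}_g K)\Bigr) - 2N^{-1}\Delta_g N + N^{-1}\hat\pi^{ij}\frac{\partial g_{ij}}{\partial t} \\
& - 2N^{-1}\hat\pi^{ij}X_{i;j} + 2(\operatorname{tr}_g K)^2 + R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2.
\end{aligned}
\tag{5-30a}
\]
(i) We can write the integral over $S$ as a product over $I \times \Sigma$. This allows us to
apply integration by parts to the first term in the expression (5-30a) for the scalar
curvature. Using the equality (2.3.10) in the form
\[
\frac{\partial}{\partial t}\sqrt{\det g} = \frac12\operatorname{tr}_g\Bigl(\frac{\partial g}{\partial t}\Bigr)\sqrt{\det g}
\]
and (5.3.4), show that
\[
\int_I\int_\Sigma -2N^{-1}\frac{\partial}{\partial t}(\operatorname{tr}_g K)\,dv_{\bar g}
= \int_I\int_\Sigma\bigl(-2N(\operatorname{tr}_g K)^2 + 2(\operatorname{tr}_g K)\operatorname{div}_g X\bigr)\,dv_g\,dt - \Bigl(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\Bigr)\,2\operatorname{tr}_g K\,dv_g.
\]
(ii) Show that the modified action $\int_S R(\bar g)\,dv_{\bar g} + \bigl(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\bigr)\,2\operatorname{tr}_g K\,dv_g$ can be
written
\[
\begin{aligned}
&\int_I\int_\Sigma\Bigl[\hat\pi^{ij}\frac{\partial g_{ij}}{\partial t} + 2(\operatorname{div}_g\hat\pi)_i X^i + N\bigl(R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2\bigr)\Bigr]\,dv_g\,dt \\
&\quad - 2\int_I\int_{\partial\Sigma}\Bigl(\frac{\partial N}{\partial\nu} + \hat\pi^{ij}X_i\nu_j\Bigr)\,d\sigma_g\,dt + 2\int_I\int_{\partial\Sigma}(\operatorname{tr}_g K)X^j\nu_j\,d\sigma_g\,dt.
\end{aligned}
\]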
Note that we could use (5.3.7) to replace $\partial N/\partial\nu$ by $N\langle\nabla_n n, \nu\rangle$ on the last line.
To interpret the boundary integrals from the divergence terms in (5-21b), it
would be convenient for $n$ to be tangent to this timelike boundary, i.e., for the
outward unit conormal $\nu$ to the slices $\Sigma_t$ to be the outward unit normal to $\partial S$;
however, as $\partial/\partial t$ is tangent to $\partial S$, this would restrict $X$ to satisfy $\langle X, \nu\rangle = 0$
along $\partial S$. So as in [113; 158], we restrict if necessary to a spacetime $(S', \bar g)$, with
$S' = \bigcup_{t\in I}\Sigma'_t$, with smooth hypersurfaces $\Sigma'_t \subset \Sigma_t$, where the boundary of each
$\Sigma'_t$ meets the timelike boundary $\partial S'$ orthogonally, so that the outward pointing
unit conormal $\nu$ of each $\Sigma'_t$ is normal to this timelike boundary. Then if we let $\gamma$
be the induced metric on $\partial S'$, the orthogonality gives us that, if we have a local
form field $\omega_g$ which restricts to a local volume form along each $\partial\Sigma'_t$ (in a suitable
neighborhood), then a local volume form for $(\partial S', \gamma)$ is $n^\flat \wedge \omega_g = N\,dt \wedge \omega_g$.
If we extend $\nu$ as $\nu = -n$ on $\Sigma'_{t_2}$ and $\nu = n$ on $\Sigma'_{t_1}$, the term that is added to
the action is then $\int_{\partial S'} 2\operatorname{tr}_\gamma A\,dv_\gamma$, where if $\mathrm{II}$ is the second fundamental form of
the boundary $\partial S'$, $\mathrm{II}(X,Y) = (\nabla_X Y)^N$, then $A(X,Y) = \langle\mathrm{II}(X,Y), -\nu\rangle$. Thus
$A = K$ on $\Sigma'_{t_2}$, but $A = -K$ on $\Sigma'_{t_1}$, so that $\bigl(\int_{\Sigma'_{t_2}} - \int_{\Sigma'_{t_1}}\bigr)\,2\operatorname{tr}_g K\,dv_g$ is indeed
part of the boundary term $\int_{\partial S'} 2\operatorname{tr}_\gamma A\,dv_\gamma$. The modified action $\int_{S'} R(\bar g)\,dv_{\bar g} + \int_{\partial S'} 2\operatorname{tr}_\gamma A\,dv_\gamma$
gives rise to a modified Hamiltonian with boundary terms, as
follows.
b. In the setting above, show the modified action $\int_{S'} R(\bar g)\,dv_{\bar g} + \int_{\partial S'} 2\operatorname{tr}_\gamma A\,dv_\gamma$
can be written
\[
\int_I\Bigl(\int_{\Sigma'_t}\hat\pi^{ij}\frac{\partial g_{ij}}{\partial t}\,dv_g - H_{\mathrm{ADM}} - 2\int_{\partial\Sigma'_t}\bigl(NH + \hat\pi^{ij}X_i\nu_j\bigr)\,d\sigma_g\Bigr)dt,
\]
where $H$ is the mean curvature of the boundary $\partial\Sigma'_t$ inside $\Sigma'_t$, with respect to
the normal $\nu$ (i.e., if $\Pi$ is the vector-valued second fundamental form of $\partial\Sigma'_t$
inside $\Sigma'_t$, then $H$ is the trace along $\partial\Sigma'_t$ of $\kappa(X,Y) := \langle\Pi(X,Y), \nu\rangle$). Thus the
modified Hamiltonian is $H_{\mathrm{ADM}} + 2\int_{\partial\Sigma'_t}\bigl(NH + \hat\pi^{ij}X_i\nu_j\bigr)\,d\sigma_g$.

Remark. If you try to compare boundary terms from the analyses in parts a. and
b., you have to take care; in part a. we obtained one such term from an integration
by parts in the t-direction; if you tried to carry over the derivation from part (ii)
for $S'$, there may be an additional term in the t-derivative of the slice integrals,
since the slices in $S'$ are generally not just $\Sigma_t = \{t\}\times\Sigma$, which had allowed us
to easily write the slice integrals in $S$ over a fixed domain $\Sigma$.
CHAPTER 6

Scalar curvature deformation and the Einstein constraint equations

We have derived the Einstein constraint equations and pinpointed their place in
terms of the initial value formulation for Einstein’s equation. From one point
of view, to construct interesting spacetimes, one could construct interesting
solutions of the Einstein constraint equations, and then study their evolution.
If one could then effectively parametrize the space of solutions to the Einstein
constraint equations, one could identify the parameters as the true gravitational
degrees of freedom; cf. [226; 227]. A classical approach to this is the conformal
method to construct solutions of the constraint equations, and we introduce this
in Section 6.2.
Once we set up the conformal method, we will focus on the constant mean
curvature (CMC) case. In this case, the momentum constraint decouples, and
the problem will be reduced to the analysis of a single nonlinear equation, the
Lichnerowicz equation (6.2.8), to solve the Hamiltonian constraint, as we will see.
The Hamiltonian constraint brings the analysis of the scalar curvature operator
to the fore, and we will study the range of the scalar curvature operator, focusing
on scalar curvature deformation, sometimes within a conformal class of metrics.
As we have seen, in the time-symmetric case, the scalar curvature is proportional
to energy density, and the dominant energy condition then focuses our attention
on positive (nonnegative) scalar curvature metrics. There are in fact topological
obstructions to admitting positive scalar curvature metrics, and as developed by
R. M. Schoen and S.-T. Yau, this gives an approach to the positive mass theorem,
as we will see in Chapter 7.
The remaining chapters will make heavier use of analysis and PDE (partial
differential equations), such as properties of the Laplace operator on Euclidean
space or more general Riemannian manifolds. We will discuss some aspects
of elliptic PDE theory in various amounts of detail. Rather than relegating the
PDE aspects to an appendix or simply listing bullet points of facts, we prefer to
cover various aspects of the theory that we will use, giving many references to
possible sources for further reading, and sketching a number of missing details
as exercises. We suggest the reader at least peruse the PDE sections to set some
notations and expectations. Hopefully someone who has had a course in elliptic
PDE will get a better appreciation for some basic applications of the theory,
while those without the background can get some feel for the place of elliptic
PDE in the study of the Einstein constraint equations.

6.1. A primer on elliptic PDE

This section contains some of the required fundamentals of elliptic PDE theory.
Given the fundamental role of the Laplace and Poisson equations in geometric
analysis and mathematical physics, the Laplace operator will be our primary,
though not exclusive, example of an elliptic operator. While we assume the reader
is familiar with basic tools of analysis, including basic functional analysis, found
in standard texts such as those mentioned in the preface, we start by collecting
some of the facts and notation about function spaces that we will use.

6.1.1. Sobolev and Hölder spaces. Many results on elliptic PDE are cast in
Sobolev and Hölder spaces. We define them briefly, and encourage the reader to
review their basic properties from references such as [2; 86; 107; 144].
We let $\Omega$ be an open subset of $\mathbb{R}^n$, sometimes called a domain in $\mathbb{R}^n$. For a
real number $p \ge 1$ and a nonnegative integer $k$, the space $W^{k,p}(\Omega)$ is the set of
all Lebesgue measurable functions (up to an equivalence relation for functions
which agree except for a set of measure zero) which are in $L^p(\Omega)$ along with
weak derivatives up to order $k$. The space has a Banach space structure with
norm given by
\[
\|u\|^p_{W^{k,p}(\Omega)} = \sum_{|\beta|\le k}\|\partial^\beta u\|^p_{L^p(\Omega)},
\]
where $\beta = (\beta_1,\ldots,\beta_n)$ is a multi-index (n-tuple of nonnegative integers), and
we have set $|\beta| = \sum_{i=1}^n\beta_i$ and
\[
\partial^\beta = \partial_x^\beta = \frac{\partial^\beta}{\partial x^\beta} = \Bigl(\frac{\partial}{\partial x^1}\Bigr)^{\beta_1}\cdots\Bigl(\frac{\partial}{\partial x^n}\Bigr)^{\beta_n}.
\]
When $p = 2$, the norm is induced by an evident Hilbert space structure, and
we let $H^k = W^{k,2}$. We let $W^{k,p}_{\mathrm{loc}}(\Omega)$ (or just $W^{k,p}_{\mathrm{loc}}$) be the space of functions in
$W^{k,p}(\Omega')$ for all $\Omega'$ compactly contained in $\Omega$. A straightforward cutoff and
mollification argument proves that $C_c^\infty(\mathbb{R}^n)$ is dense in $W^{k,p}(\mathbb{R}^n)$, while the
Meyers–Serrin theorem shows that $C^\infty(\Omega)\cap W^{k,p}(\Omega)$ is dense in $W^{k,p}(\Omega)$ for
any open $\Omega \subset \mathbb{R}^n$. See, for example, [2, Theorem 3.16] or [107, Theorem 7.9].
There are ways to extend the definition to $s \in \mathbb{R}$, and we briefly discuss the
case $p = 2$. The space $H^s(\mathbb{R}^n)$ is the completion of the set of Schwartz functions
under the norm $\|u\|^2_{H^s(\mathbb{R}^n)} = \int_{\mathbb{R}^n}(1 + |\xi|^2)^s|\hat u(\xi)|^2\,d\xi$, where $\hat u$ is the Fourier
transform of $u$. By Plancherel’s theorem, the norm on $H^0(\mathbb{R}^n)$ agrees with that
of $L^2(\mathbb{R}^n)$ (up to a constant multiple), and in fact, for $k \in \mathbb{Z}_+$, $H^k(\mathbb{R}^n)$ agrees
with the preceding definition (with equivalent norm), while for $s < 0$, $H^s(\mathbb{R}^n)$ is
a subset of the set of tempered distributions (see [195], for example).
Let $k$ be a nonnegative integer, and let $\alpha \in (0,1]$. We recall that $C^k(\Omega)$ is the
set of all functions $u$ on $\Omega$ such that $u$ and all its partials up through order $k$ are
continuous. For such $u$, let
\[
\|u\|_{C^k(\Omega)} = \sum_{|\beta|\le k}\sup_{x\in\Omega}|\partial^\beta u(x)|
\]
(which could be infinite). There are different conventions for $C^k(\bar\Omega)$. A natural
one, adopted in [107], is that $C^k(\bar\Omega)$ consists of all functions $u \in C^k(\Omega)$ such
that $u$ and all its partials through order $k$ possess continuous extensions to $\bar\Omega$; for
such functions, it is easy to see $\|u\|_{C^k(\Omega)} = \sum_{|\beta|\le k}\sup_{x\in\bar\Omega}|\partial^\beta u(x)| =: \|u\|_{C^k(\bar\Omega)}$,
where we have not introduced new notation to denote the extension of $\partial^\beta u$. If $\bar\Omega$
is not compact, the functions $\partial^\beta u$, $|\beta| \le k$, need not be bounded, however. To
get a natural Banach space structure using $\|u\|_{C^k(\Omega)}$, one can consider the space
$C^k_B(\Omega) \subset C^k(\Omega)$ of functions $u$ for which $\partial^\beta u$ is bounded on $\Omega$ for $|\beta| \le k$,
and the corresponding space $C^k_B(\bar\Omega)$ of functions $u \in C^k_B(\Omega)$ for which $\partial^\beta u$ also
extends continuously to $\bar\Omega$, for $|\beta| \le k$. (In a convex domain, for example, this
extension condition is automatic for $|\beta| < k$ by applying the mean value theorem
to infer uniform continuity.) Both $C^k_B(\Omega)$ and $C^k_B(\bar\Omega)$ are Banach spaces under
the norm $\|\cdot\|_{C^k(\Omega)}$, and $C^k_B(\bar\Omega)$ agrees with $C^k(\bar\Omega)$ when $\Omega$ is bounded. (We note
that in some sources like [2], $C^k(\bar\Omega)$ is defined to be the subspace of functions
$u$ in $C^k(\Omega)$ with $\partial^\beta u$ bounded and uniformly continuous on $\Omega$, for $|\beta| \le k$;
such functions then possess continuous extensions to $\bar\Omega$. Of course, when $\Omega$ is
bounded, these two notions of $C^k(\bar\Omega)$ agree.)
For $X \subset \Omega$, we set
\[
[u]_{k,\alpha;X} = \max_{|\beta|=k}\ \sup_{\substack{x,y\in X \\ x\ne y}}\frac{|\partial^\beta u(x) - \partial^\beta u(y)|}{|x-y|^\alpha},
\]
and define $\|u\|_{C^{k,\alpha}(\Omega)} = \|u\|_{C^k(\Omega)} + [u]_{k,\alpha;\Omega}$, which may be infinite. We let
$C^{k,\alpha}(\Omega)$ (or $C^{k,\alpha}_{\mathrm{loc}}(\Omega)$ for emphasis) be the set of all functions $u$ in $C^k(\Omega)$ such
that for all compact $X \subset \Omega$, $[u]_{k,\alpha;X}$ is finite; this is equivalent to requiring
$[u]_{k,\alpha;\bar B}$ be finite for all closed balls $\bar B \subset \Omega$. Note that if $[u]_{0,\alpha;\Omega}$ is finite, then $u$
is uniformly continuous on $\Omega$, and hence extends continuously to $\bar\Omega$, and if $u$
also denotes the extension, $[u]_{0,\alpha;\bar\Omega} = [u]_{0,\alpha;\Omega}$. In [107], $C^{k,\alpha}(\bar\Omega)$ is then defined to
be all $u \in C^k(\bar\Omega)$ such that $[u]_{k,\alpha;\Omega}$ is finite. To have a Banach space with norm
$\|u\|_{C^{k,\alpha}(\Omega)}$, one would naturally restrict to the subspace $C^{k,\alpha}_B(\bar\Omega)$ of functions
$u \in C^k_B(\bar\Omega)$ with $[u]_{k,\alpha;\Omega}$ finite; when $\Omega$ is bounded $C^{k,\alpha}_B(\bar\Omega) = C^{k,\alpha}(\bar\Omega)$, and
we will generally work in bounded domains, or in unbounded domains with
weighted norms. We note that in [2], the definition of $C^{k,\alpha}(\bar\Omega)$ also includes the
condition that $[u]_{\ell,\alpha;\Omega}$ is finite for all $0 \le \ell \le k$, and the norm adds the maximum
of these seminorms to the $C^k$-norm. In the domains we will use, it is automatic
that $u \in C^{k,\alpha}_B(\bar\Omega)$ will satisfy this requirement, and in particular the seminorms
$[u]_{\ell,\alpha;\Omega}$, $0 \le \ell \le k$, are bounded up to a constant factor by
\[
\|u\|_{C^{k,\alpha}(\bar\Omega)} := \|u\|_{C^{k,\alpha}(\Omega)} = \|u\|_{C^k(\Omega)} + [u]_{k,\alpha;\Omega}.
\]
For example, if $\Omega$ is convex, this follows from the mean value theorem; for
$\bar\Omega$ a smooth compact manifold-with-boundary, this again follows by the mean
value theorem, using a finite covering by coordinate charts, each of which is a
compactly contained restriction of a larger chart, and the image of which is a
ball or half-ball.
Remark 6-1. Whereas smooth functions are dense in Sobolev spaces, this fails
in Hölder spaces. For $0 < \alpha \le 1$, the function $f_\alpha(x) = |x|^\alpha$ is readily seen to be
in $C^{0,\alpha}([-1,1])$. For $0 < \alpha < 1$ and $f$ Lipschitz, we have, for $0 < x \le 1$,
\[
[f_\alpha - f]_{0,\alpha;[-1,1]} \ge \frac{|x^\alpha - (f(x) - f(0))|}{x^\alpha},
\]
which approaches 1 as $x \searrow 0$. A similar two-sided argument shows the estimate
$[f_1 - f]_{0,1;[-1,1]} \ge 1$ for $f \in C^1([-1,1])$. Furthermore, while by Weierstrass
approximation, $W^{k,p}((-1,1))$ and $C^k([-1,1])$ are separable, this fails for
$C^{0,\alpha}([-1,1])$ for $0 < \alpha \le 1$: the functions $f_p(x) = |x-p|^\alpha$ are in $C^{0,\alpha}([-1,1])$,
while for $p$ and $q$ distinct points in $[-1,1]$, $[f_p - f_q]_{0,\alpha;[-1,1]} \ge 2$.
As we will see in this section, elliptic regularity theory is readily formulated
in terms of Sobolev and Hölder spaces, rather than C k spaces. (See also the
discussion of the Newtonian potential in Section 7.1.1.2.)
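The two lower bounds in Remark 6-1 come from evaluating a single Hölder difference quotient at a well-chosen pair of points, which is easy to reproduce numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the Lipschitz comparison function is an arbitrary choice supplied by the caller.

```python
import math


def holder_quotient(u, x, y, alpha):
    """|u(x) - u(y)| / |x - y|^alpha for distinct points x and y."""
    return abs(u(x) - u(y)) / abs(x - y) ** alpha


def gap_to_lipschitz(alpha, f, x):
    """Hoelder quotient of f_alpha - f between the points x > 0 and 0,
    where f_alpha(s) = |s|^alpha; this bounds [f_alpha - f] from below."""
    def u(s):
        return abs(s) ** alpha - f(s)
    return holder_quotient(u, x, 0.0, alpha)


def separation(alpha, p, q):
    """Hoelder quotient of f_p - f_q between the points p and q,
    where f_p(s) = |s - p|^alpha; this bounds [f_p - f_q] from below."""
    def u(s):
        return abs(s - p) ** alpha - abs(s - q) ** alpha
    return holder_quotient(u, p, q, alpha)


def smooth_example(s):
    """An arbitrary smooth (hence Lipschitz on [-1, 1]) comparison function."""
    return math.sin(s) + s * s
```

For instance, `gap_to_lipschitz(0.5, smooth_example, 1e-8)` is close to 1, reflecting that no Lipschitz function approximates $|x|^{1/2}$ in the $C^{0,1/2}$ seminorm, while `separation(0.7, 0.3, -0.5)` equals 2 up to roundoff, which is the uniform separation behind the failure of separability.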
The above definitions can be extended naturally to tensor fields (for example
by taking the summation or maximum of the corresponding norms of Cartesian
components of the tensor field). To extend to functions or tensors (or more
generally to sections of vector bundles) on compact manifolds (smooth, and with
or without boundary), one builds a norm using the Sobolev or Hölder norms
coming from a covering by a finite number of appropriate coordinate charts
(say, charts which are compactly contained restrictions of larger charts, so that
overlap maps have uniformly bounded derivatives); any two such norms are
easily seen to be equivalent. In fact one could define an equivalent norm by
replacing the partial derivative with a smooth connection such as the Levi-Civita
connection for a smooth Riemannian metric on M (or compatible connection
for a Riemannian vector bundle [139]), using a Riemannian volume measure for
integrals, and using parallel transport to construct the Hölder quotient (replacing
distance with arclength). We remark that using the Levi-Civita connection to
define the function spaces requires some regularity in the metric, which we
assume (and sometimes but not always reiterate) is smooth (C ∞ ) unless noted
otherwise. One is often led to consider metrics g that have finite regularity a
priori, to consider an operator with coefficients depending on g (via curvature,
or Christoffel symbols in local coordinates, say) or to define an operator acting
on metrics g, so that it may in fact be more natural to define the function spaces
using either a covering by charts or a smooth background metric g̊. In any case,
one must pay attention to the regularity of g, which may naturally limit the
regularity of operators under consideration.
We let $H^k(M)$ be the space of sections of class $H^k$ (suppressing notation for
bundle), with norm $\|\cdot\|_{H^k(M)}$ (suppressing notation for a metric if used for the
norm), and similarly for other regularity classes of sections. The density results
cited above can be readily used to show that the space of smooth sections is dense
in the Sobolev space $W^{k,p}(M)$, for $k$ a nonnegative integer and $1 \le p < \infty$.
One can also define suitable such function spaces in the noncompact setting,
possibly with some control on the geometry, such as a positive injectivity radius,
or asymptotic conditions. As we will see in Section 7.2.2.2, to define the relevant
function spaces it is often natural to use an appropriately chosen family of
coordinate charts, or to use a background metric adapted to the asymptotics.

6.1.2. A few compactness results. We will make use of a number of compact
inclusion mappings of one space into another. We start by recalling some notions.
Definition 6-2. Let X and Y be normed vector spaces. A map T : X → Y is
compact if the image of every bounded subset of X has compact closure in Y .
An easy exercise shows that (i) T : X → Y is compact if and only if the image
of every bounded sequence in X has a convergent subsequence in Y , and (ii) if
T : X → Y is a linear map, then T is compact if and only if the image of the unit
ball in X has compact closure in Y , and such a compact linear map T : X → Y
is necessarily continuous.
Definition 6-3. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. Let $F$ be a set of
functions from $X$ to $Y$. For $x \in X$, $F$ is equicontinuous at $x$ if for all $\varepsilon > 0$, there
is $\delta > 0$ such that for all $f \in F$, and for all $x_1 \in X$ with $d_X(x_1, x) < \delta$, it follows
that $d_Y(f(x_1), f(x)) < \varepsilon$. $F$ is pointwise equicontinuous if it is equicontinuous
at each $x \in X$. $F$ is (uniformly) equicontinuous if for all $\varepsilon > 0$, there is $\delta > 0$
so that for all $f \in F$, and for all $x_1, x_2 \in X$ with $d_X(x_1, x_2) < \delta$, it follows that
$d_Y(f(x_1), f(x_2)) < \varepsilon$.
If (X, d X ) is a compact metric space, it is a simple exercise in compactness
to show that (i) F is pointwise equicontinuous if and only if it is equicontinuous,
and (ii) if F is equicontinuous, then { f (x) : x ∈ X, f ∈ F } is dY -bounded (F is
uniformly bounded) if and only if for each x ∈ X , { f (x) : f ∈ F } is dY -bounded
(F is pointwise bounded at each x).
A fundamental result which yields compact embeddings is the Ascoli–Arzelà
theorem, a version of which is formulated as follows.
Theorem 6-4 (Ascoli–Arzelà). A collection $F$ of continuous real-valued functions
on a compact metric space $X$ has compact closure in the uniform norm
$\|f\|_{C^0(X)} = \sup_{x\in X}|f(x)| = \max_{x\in X}|f(x)|$ if and only if $F$ is bounded and
equicontinuous.
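To see why boundedness alone is not enough in Theorem 6-4, consider the family $f_n(x) = \sin(nx)$ on $[0, 2\pi]$. It is bounded by 1 in the uniform norm, yet for $n \ne m$ the mean square of $f_n - f_m$ over a period is 1, so $\|f_n - f_m\|_{C^0} \ge 1$ and no subsequence can converge uniformly; by the theorem the family cannot be equicontinuous. The following minimal sketch estimates these distances on a grid; the function name and grid size are choices made for this illustration only.

```python
import math


def sup_distance(n, m, samples=4096):
    """Grid estimate (from below) of the sup over [0, 2*pi] of
    |sin(n x) - sin(m x)|, using equally spaced sample points."""
    return max(
        abs(math.sin(n * x) - math.sin(m * x))
        for x in (2.0 * math.pi * j / samples for j in range(samples))
    )
```

Evaluating `sup_distance(n, m)` for small distinct positive integers gives values of at least 1 (up to roundoff), consistent with the mean-square argument above. By contrast, the rescaled family $\sin(nx)/n$ has uniformly bounded derivatives and is equicontinuous.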
The Ascoli–Arzelà theorem yields a compactness result for Hölder spaces.
For example, if $(M, g)$ is a compact smooth Riemannian manifold (possibly with
smooth boundary), then for $0 < \alpha < 1$, the inclusion of $C^1(M)$ into $C^{0,\alpha}(M)$ is
a compact inclusion (i.e., any bounded sequence in $C^1(M)$ has a subsequence
which converges in $C^{0,\alpha}(M)$), and similarly, $C^{0,\alpha}(M)$ is compactly included in
$C^{0,\beta}(M)$ for $0 < \beta < \alpha \le 1$, and in $C^0(M)$.
There are also compactness results in Sobolev spaces. We recall a version of
the Rellich–Kondrachov theorem [86; 107]. A proof for p = 2 (Rellich’s lemma)
based on the Fourier transform and the Ascoli–Arzelà theorem can be found in
[139], where the authors also discuss going from the setting of a domain in Rn
to the manifold setting.
Lemma 6-5 (Rellich–Kondrachov). Suppose $(M, g)$ is a compact Riemannian
manifold (possibly with smooth boundary), $1 \le p < \infty$, and $k, j \ge 0$ are integers,
with $k > j$. Then the inclusion $W^{k,p}(M) \hookrightarrow W^{j,p}(M)$ is compact.
Other such embeddings are recalled in Section 7.2.2.1. Before we move on,
we note that we can use compactness to interpolate between certain norms, a
basic example of which is as follows.
Exercise 6-6. Suppose $(M, g)$ is a compact Riemannian manifold (possibly with
smooth boundary), $1 \le p < \infty$ and $\ell > k > j \ge 0$ are integers. Show that for
any $\varepsilon > 0$, there is $C > 0$ such that
$\|u\|_{W^{k,p}(M)} \le \varepsilon\|u\|_{W^{\ell,p}(M)} + C\|u\|_{W^{j,p}(M)}$
for all $u \in W^{\ell,p}(M)$. You can argue abstractly, by contradiction, using only that
the inclusion $W^{\ell,p}(M) \hookrightarrow W^{k,p}(M)$ is compact and the inclusion
$W^{k,p}(M) \hookrightarrow W^{j,p}(M)$ is continuous. Apply the same method to get
corresponding inequalities for Hölder and $C^k$-spaces.
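One way the abstract contradiction argument might run is outlined below. This is only a sketch under the hypotheses of the exercise; the sequence $u_i$ and the constants $\varepsilon_0$, $C_0$ are names introduced here for illustration.

```latex
% Sketch of the contradiction argument (the notation u_i, \varepsilon_0, C_0 is ours).
% Suppose the inequality fails for some \varepsilon_0 > 0. By homogeneity, for each i
% there is u_i \in W^{\ell,p}(M) with \|u_i\|_{W^{\ell,p}(M)} = 1 and
\[
  \|u_i\|_{W^{k,p}(M)} > \varepsilon_0 + i\,\|u_i\|_{W^{j,p}(M)} .
\]
% Continuity of W^{\ell,p} \hookrightarrow W^{k,p} gives \|u_i\|_{W^{k,p}(M)} \le C_0, hence
\[
  \|u_i\|_{W^{j,p}(M)} < \frac{C_0}{i} \longrightarrow 0 .
\]
% Compactness of W^{\ell,p} \hookrightarrow W^{k,p}: a subsequence converges in W^{k,p}(M)
% to some u, and the first display passes to the limit as
\[
  \|u\|_{W^{k,p}(M)} \ge \varepsilon_0 > 0 .
\]
% Continuity of W^{k,p} \hookrightarrow W^{j,p}: u_i \to u in W^{j,p}(M) as well, so
\[
  \|u\|_{W^{j,p}(M)} = \lim_{i\to\infty} \|u_i\|_{W^{j,p}(M)} = 0 ,
\]
% which forces u = 0 and contradicts \|u\|_{W^{k,p}(M)} \ge \varepsilon_0.
```

Only the compactness of one inclusion and the continuity (and injectivity) of the other are used, which is why the same outline should adapt to Hölder and $C^k$ norms.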
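6.1.3. Elliptic operators. We will often apply elliptic theory to the Laplace
operator $\Delta_g$ on a Riemannian manifold $(M, g)$, given in coordinates by
$$\Delta_g = \frac{1}{\sqrt{\det g}}\,\frac{\partial}{\partial x^i}\Big( g^{ij}\sqrt{\det g}\,\frac{\partial}{\partial x^j}\Big) = g^{ij}\frac{\partial^2}{\partial x^i\,\partial x^j} - g^{ij}\Gamma^k_{ij}\frac{\partial}{\partial x^k}.$$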
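The equality of the divergence form and the non-divergence form of the Laplace operator can be checked directly. The sketch below assumes the two standard coordinate identities recorded in its opening comments; they are not derived in this section.

```latex
% Standard identities assumed (Levi-Civita connection of g):
%   \partial_i \sqrt{\det g} = \sqrt{\det g}\,\Gamma^l_{il},
%   \partial_i g^{ij} = -g^{il}\Gamma^j_{il} - g^{jl}\Gamma^i_{il}   (sum over i).
\begin{align*}
\frac{1}{\sqrt{\det g}}\,\partial_i\big(g^{ij}\sqrt{\det g}\,\partial_j u\big)
 &= g^{ij}\partial_i\partial_j u + (\partial_i g^{ij})\,\partial_j u
    + g^{ij}\Gamma^l_{il}\,\partial_j u \\
 &= g^{ij}\partial_i\partial_j u - g^{il}\Gamma^j_{il}\,\partial_j u
    - g^{jl}\Gamma^i_{il}\,\partial_j u + g^{ij}\Gamma^l_{il}\,\partial_j u \\
 &= g^{ij}\partial_i\partial_j u - g^{ij}\Gamma^k_{ij}\,\partial_k u .
\end{align*}
% The last two terms of the middle line cancel after relabeling the summed indices.
```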
We first review a notion of ellipticity for second-order scalar operators on
$\Omega \subset \mathbb{R}^n$, and then generalize from there.
An operator of the form
$$L = a^{ij}(x)\frac{\partial^2}{\partial x^i\,\partial x^j} + b^k(x)\frac{\partial}{\partial x^k} + c(x)$$
is elliptic at $x \in \Omega \subset \mathbb{R}^n$ if $a^{ij}(x)\xi_i\xi_j > 0$ for all
$\xi \in \mathbb{R}^n \setminus \{0\}$. Observe that by symmetrization, we could arrange
$a^{ij} = a^{ji}$ and define ellipticity to mean $(a^{ij}(x))$ is positive definite. In this
case we let $\lambda(x) > 0$ be the greatest $\lambda$ such that for all $\xi \in \mathbb{R}^n$,
$a^{ij}(x)\xi_i\xi_j \ge \lambda|\xi|^2$; likewise we let $\mu(x) \ge \lambda(x) > 0$ be the
least $\mu$ such that $a^{ij}(x)\xi_i\xi_j \le \mu|\xi|^2$ for all $\xi \in \mathbb{R}^n$. We
call $L$ elliptic on $\Omega \subset \mathbb{R}^n$ if it is elliptic at each point in $\Omega$,
and call it strictly elliptic on $\Omega$ if there is a $\lambda > 0$ such that
$a^{ij}(x)\xi_i\xi_j \ge \lambda|\xi|^2$ for all $x \in \Omega$ and all $\xi \in \mathbb{R}^n$.
We could replace $L$ by $(\lambda(x))^{-1}L$ to achieve strict ellipticity, but then we would
likely want to impose appropriate bounds on the coefficients. For example, considering
$\lambda(x)|\xi|^2 \le a^{ij}(x)\xi_i\xi_j \le \mu(x)|\xi|^2$, we say $L$ is uniformly elliptic
on $\Omega$ if $\mu/\lambda$ is bounded. If we have an elliptic operator with continuous
coefficients, we get
uniform ellipticity on any compactly contained subdomain. It is a simple exercise
to check that ellipticity is preserved by a change of coordinates. With this in
mind, note that we get an elliptic operator when we express the operator $\Delta_g$ for
a metric g in a local chart; in fact in a suitable chart on a compactly contained
domain, the operator written in local coordinates is uniformly elliptic. As you
might imagine, we can generalize the notion of ellipticity to the manifold setting
in a coordinate-invariant way as follows.
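To see how the notions of elliptic, strictly elliptic and uniformly elliptic can differ for second-order scalar operators, the following toy operator may help. The operator and the domains are chosen here purely for illustration and do not come from the text.

```latex
% Illustrative operator on the half-plane \Omega = \{(x,y) \in \mathbb{R}^2 : x > 0\}:
\[
  L = \frac{\partial^2}{\partial x^2} + x\,\frac{\partial^2}{\partial y^2},
  \qquad (a^{ij}(x,y)) = \begin{pmatrix} 1 & 0 \\ 0 & x \end{pmatrix}.
\]
% The eigenvalues of (a^{ij}) are 1 and x, so
\[
  \lambda(x,y) = \min\{1, x\} > 0, \qquad \mu(x,y) = \max\{1, x\},
  \qquad \frac{\mu}{\lambda} = \max\{x, 1/x\}.
\]
% L is elliptic on \Omega, but not strictly elliptic (\inf_\Omega \lambda = 0) and not
% uniformly elliptic (\mu/\lambda is unbounded as x \to 0 or x \to \infty).
% On a compactly contained subdomain \Omega' \subset \{\delta < x < R\}, 0 < \delta < 1 < R:
\[
  \lambda \ge \delta, \qquad \frac{\mu}{\lambda} \le \max\{R, 1/\delta\},
\]
% so L is strictly and uniformly elliptic on \Omega'.
% Rescaling: \lambda^{-1}L has ellipticity constant 1 on \Omega, but for x < 1 its
% coefficient matrix is diag(1/x, 1), which is unbounded as x \to 0.
```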
Suppose L is a linear differential operator of order m ∈ Z+ , either operating
on functions, or more generally between (sections of) two vector bundles over M,
such as vector or tensor fields; for the moment we can consider smooth sections,
but we will soon also want to allow sections in Sobolev or Hölder spaces.
Motivated by the Fourier transform, we will define the principal symbol $\sigma$
of $L$ (at $p \in M$), which takes any $\xi \in T_p^*M$ to a linear transformation $\sigma(\xi)$
between the fibers at $p$ of the relevant vector bundles; we illustrate with $L$ taking
vector fields to vector fields, so that $\sigma(\xi)$ is a linear operator on $T_pM$, and the
generalization will be immediate. To define $\sigma(\xi)$, first choose a smooth function
$\varphi$ with $\varphi(p) = 0$ and $d\varphi|_p = \xi$, and given $v \in T_pM$, choose a smooth
vector field $W$ with $W(p) = v$. Then
$$\sigma(\xi)(v) := \frac{i^m}{m!}\,L(\varphi^m W)(p)$$
is easily seen to depend only on $\xi$, $v$, and the principal part of $L$, i.e., the
highest order (order $m$) derivative terms in $L$. (The factor $i^m$ is not essential,
but it becomes useful when considering the symbol of the adjoint operator; see
p. 175.) To generalize to sections of other bundles, given $v$ in the fiber at $p$ of
the relevant bundle, $W$ would be chosen as a smooth section of the bundle for
which $W(p) = v$.
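As a worked instance of this definition, consider the Laplace operator acting on functions. The computation below is a sketch in which the section $W$ is replaced by a function $w$ with $w(p) = v$; the name $w$ is introduced here for illustration.

```latex
% L = \Delta_g on functions, so m = 2. Choose \varphi with \varphi(p) = 0, d\varphi|_p = \xi.
% By the product rule for the Laplacian,
\[
  \Delta_g(\varphi^2 w) = 2\,|d\varphi|_g^2\, w
     + 2\varphi\,\big(w\,\Delta_g\varphi + 2\langle d\varphi, dw\rangle_g\big)
     + \varphi^2\,\Delta_g w .
\]
% Every term except the first carries a factor of \varphi, which vanishes at p. Hence
\[
  \sigma(\xi)(v) = \frac{i^2}{2!}\,\Delta_g(\varphi^2 w)(p)
                 = \frac{i^2}{2}\cdot 2\,|\xi|_g^2\, v = -|\xi|_g^2\, v .
\]
% Only the second-order part g^{ij}\partial_i\partial_j contributed; the first-order
% term -g^{ij}\Gamma^k_{ij}\partial_k is invisible to the principal symbol.
```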
If $\xi = \xi_i\,dx^i$, we define, in multi-index notation, $\xi^\alpha = \xi_1^{\alpha_1}\cdots\xi_n^{\alpha_n}$.
For any multi-index $\alpha$ of order $|\alpha| = m$, the principal symbol $\sigma(\xi)$ of
$\partial^\alpha/\partial x^\alpha$ is seen to be $i^m\xi^\alpha$, the same as the action of the
Fourier transform; clearly, this correspondence between the symbol and the Fourier transform
extends to all constant-coefficient linear differential operators. It is a simple exercise to
check that if $L = \sum_{|\alpha|\le m} A_\alpha(x)\,\partial^\alpha/\partial x^\alpha$, where each
$A_\alpha(x)$ is a coefficient matrix of the same size for all $\alpha$, then at $x$,
$\sigma(\xi) = i^m \sum_{|\alpha|=m} A_\alpha(x)\,\xi^\alpha$, as a matrix linear
transformation. For any such operator $L$ between vector bundles on a manifold,
in local coordinates the size of the matrices Aα depends on the fiber dimensions
of the bundles, and the preceding formula yields the matrix representation [σ (ξ )]
for σ (ξ ) at the point p with coordinates x, in the corresponding local bases (for
example, for tensor bundles, we would use bases built out of d x i and ∂/∂ x j ).
While the definition for the principal symbol is not coordinate-dependent, it
is a good exercise to compute [σ (ξ )] in two different coordinate systems and
compare, taking care with the change of bases from d x i and ∂/∂ x j to dy k and
∂/∂ y ℓ , which affects the coefficient matrices and the components of ξ .
Definition 6-7. Let L be a linear differential operator as above, with principal
symbol σ . L is an elliptic operator at p in case σ (ξ ) is an isomorphism for any
ξ ∈ T p∗ M \ {0}. L is an elliptic operator if it is elliptic at each p ∈ M.
While this definition does generalize the one above for second-order linear
scalar operators, one might consider the following generalization: suppose L is
linear of order m and operates by mapping sections of a bundle E to sections of
E, so that in a local representation, the symbol will be represented by a square
matrix $[\sigma(\xi)]$. When working with operators with real coefficients, we could
require the real matrix $i^{-m}[\sigma(\xi)]$ to be positive definite for all $\xi \ne 0$; this is
called strong ellipticity in [22, Appendix I] and elsewhere. For such an operator,
we see by changing $\xi$ to $-\xi$ that $m$ must be even.
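The parity claim, and the consistency of strong ellipticity with the earlier second-order scalar definition, can be read off from the local formula for the symbol. The following is a short sketch in the local notation $L = \sum_{|\alpha|\le m} A_\alpha(x)\,\partial^\alpha/\partial x^\alpha$.

```latex
% From \sigma(\xi) = i^m \sum_{|\alpha| = m} A_\alpha(x)\,\xi^\alpha:
\[
  i^{-m}[\sigma(\xi)] = \sum_{|\alpha| = m} A_\alpha(x)\,\xi^\alpha ,
\]
% which is homogeneous of degree m in \xi, so
\[
  i^{-m}[\sigma(-\xi)] = (-1)^m\, i^{-m}[\sigma(\xi)] .
\]
% If m were odd, the matrices for \xi and -\xi would be negatives of each other and
% could not both be positive definite. Hence m is even.
% Scalar second-order case, L = a^{ij}\partial_i\partial_j + b^k\partial_k + c with a^{ij} = a^{ji}:
\[
  \sigma(\xi) = i^2\,a^{ij}(x)\xi_i\xi_j = -a^{ij}(x)\xi_i\xi_j ,
  \qquad i^{-2}\sigma(\xi) = a^{ij}(x)\xi_i\xi_j ,
\]
% so strong ellipticity at x is exactly the condition a^{ij}(x)\xi_i\xi_j > 0 for \xi \ne 0.
```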
There are related notions which include elliptic operators, but which can
accommodate systems of differential equations involving bundles whose fibers
may have different dimensions.
Definition 6-8. A linear differential operator L with principal symbol σ is
overdetermined-elliptic if for each ξ ∈ T p∗ M \ {0}, σ (ξ ) is injective, while it is
underdetermined-elliptic if for each ξ ∈ T p∗ M \ {0}, σ (ξ ) is surjective.
Suppose L is a linear differential operator between two bundles E and F
each equipped with a smoothly varying fiber-wise inner product ⟨ · , · ⟩ (we use
the same notation for both if it will not cause confusion). We will in particular
focus on tensor bundles over a Riemannian manifold, and the inner product
will be the natural inner product on tensors induced by the metric. The formal
adjoint operator $L^*$ between $F$ and $E$ is defined by integration by parts against
compactly supported (away from the boundary if $\partial M$ is nonempty) sections of the
corresponding bundle: $\int_M \langle Lu, v\rangle\,dv_g = \int_M \langle u, L^*v\rangle\,dv_g$;
note that $(L^*)^* = L$. We say that $Lu = f$ weakly if for all smooth sections $v$ (compactly
supported in the interior of $M$), $\int_M \langle f, v\rangle\,dv_g = \int_M \langle u, L^*v\rangle\,dv_g$;
this can be suitably interpreted and defined for distributions $u$. It is not a hard exercise
to show that the principal
symbol of L ∗ is the conjugate transformation σ (ξ )∗ : at each point p ∈ M and
ξ ∈ T p∗ M, we have ⟨σ (ξ )(v), w⟩ = ⟨v, σ (ξ )∗ (w)⟩. (Without the factor i m in the
definition, which gets conjugated on one side of the Hermitian product, this
relation would hold only up to sign.)
Exercise 6-9. a. Prove the preceding claim, and conclude that L is underdeter-
mined-elliptic if and only if L ∗ is overdetermined-elliptic.
b. Suppose Q is a linear differential operator between (sections of vector bundles)
E and E 1 , and likewise P is a linear differential operator between E 1 and F,
so that P ◦ Q is a linear differential operator between E and F. Show that the
principal symbols satisfy σ P◦Q (ξ ) = σ P (ξ ) ◦ σ Q (ξ ).
c. Conclude that L ∗ is overdetermined-elliptic if and only if L L ∗ is elliptic.
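The statements of Exercise 6-9 can be tested on the simplest pair of operators, the divergence and the gradient. The sketch below assumes the Hermitian product is conjugate-linear in its second slot and takes $v$ and $c$ real; the labels $P$, $v$, $c$ are used here for illustration only.

```latex
% P = \operatorname{div}_g : vector fields -> functions, first order (m = 1).
% Formal adjoint: P^* f = -\nabla f. With \sigma(\xi)(v) = i\,L(\varphi W)(p):
\begin{align*}
  \sigma_P(\xi)(v) &= i\,\xi(v),
     & &\text{surjective onto the scalars for } \xi \ne 0,\\
  \sigma_{P^*}(\xi)(c) &= -i\,c\,\xi^\sharp,
     & &\text{injective for } \xi \ne 0 .
\end{align*}
% So P is underdetermined-elliptic and P^* is overdetermined-elliptic (part a).
% Adjoint relation between the symbols:
\[
  \langle \sigma_P(\xi)(v), c\rangle = i\,\xi(v)\,c
  = \overline{(-i\,c)}\,\langle v, \xi^\sharp\rangle
  = \langle v, \sigma_{P^*}(\xi)(c)\rangle .
\]
% Composition (parts b and c):
\[
  \sigma_{PP^*}(\xi)(c) = \sigma_P(\xi)\big(-i\,c\,\xi^\sharp\big) = |\xi|_g^2\, c ,
\]
% in agreement with PP^* = -\Delta_g, whose symbol is -(i^2|\xi|_g^2) = |\xi|_g^2.
```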
We will now turn to two important examples; another will be met in the
discussion of the TT decomposition (Section 6.2.1).
Example 6-10. Recall the linearization $DR_g$ of the scalar curvature operator
(2.3.11), and its formal adjoint $(DR_g)^*$, given by
$$(DR_g)^*(f) = -(\Delta_g f)\,g + \operatorname{Hess}_g f - f\operatorname{Ric}(g).$$
The principal symbol of $(DR_g)^*$ is then given by $\sigma(\xi)(1) = |\xi|_g^2\,g - \xi\otimes\xi$,
mapping a one-dimensional fiber to the space of symmetric $(0,2)$-tensors on
$T_pM$. If $\sigma(\xi)(1) = 0$, then upon taking a trace we get $(n-1)|\xi|_g^2 = 0$, which
cannot hold for $\xi \ne 0$. In this case, the symbol of $(DR_g)^*$ is injective, i.e., $(DR_g)^*$
is overdetermined-elliptic. Note that $DR_g(DR_g)^*$ is a fourth-order operator with
principal part $(n-1)\Delta_g^2$, which is indeed elliptic, in agreement with the preceding
exercise.
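The principal part of $DR_g(DR_g)^*$ can be confirmed at the level of symbols. The sketch below assumes the standard formula $DR_g(h) = -\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g\operatorname{div}_g h - \langle h, \operatorname{Ric}(g)\rangle_g$, which we take to agree with (2.3.11); that equation is not reproduced in this section.

```latex
% Symbol of DR_g (second order), from the assumed formula for DR_g(h):
\[
  \sigma_{DR_g}(\xi)(h) = i^2\big(-|\xi|_g^2\,\operatorname{tr}_g h + h(\xi^\sharp,\xi^\sharp)\big)
   = |\xi|_g^2\,\operatorname{tr}_g h - h(\xi^\sharp,\xi^\sharp).
\]
% Apply it to \sigma_{(DR_g)^*}(\xi)(1) = |\xi|_g^2\,g - \xi\otimes\xi, using
% \operatorname{tr}_g(|\xi|_g^2 g - \xi\otimes\xi) = (n-1)|\xi|_g^2 and
% (|\xi|_g^2 g - \xi\otimes\xi)(\xi^\sharp,\xi^\sharp) = 0:
\[
  \sigma_{DR_g}(\xi)\big(|\xi|_g^2\,g - \xi\otimes\xi\big)
   = |\xi|_g^2\,(n-1)|\xi|_g^2 - 0 = (n-1)|\xi|_g^4 .
\]
% Symbol of (n-1)\Delta_g^2 (order m = 4): (n-1)\,i^4\,|\xi|_g^4 = (n-1)|\xi|_g^4.
```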
For nonlinear operators, one can define ellipticity (at any p ∈ M) at a given
section of the bundle on which the operator acts, using the linearization of the
operator at the given section. For example, the graphical mean curvature operator,
cf. Exercise 6-58, is elliptic. As another example, we now consider the Ricci
curvature operator, which does not have an elliptic linearization (see the next
example). We saw in Section 5.3.1 that in the Lorentzian setting, the Ricci
curvature operator can be put into hyperbolic form in wave coordinates. An
analogue holds in the Riemannian setting, using a derivation following that of
equation (5.3.1), with harmonic coordinates playing the role of wave coordinates.
Example 6-11. The operator $g \mapsto \operatorname{Ric}(g)$ is nonlinear, and we let
$P = D\operatorname{Ric}_g$ be its linearization at a metric $g$. This operator acts between
symmetric $(0,2)$-tensor fields on $M$, and it fails to be elliptic precisely because of
diffeomorphism invariance. We first note that if $\phi$ is a diffeomorphism on $M$, then
$\phi^*(\operatorname{Ric}(g)) = \operatorname{Ric}(\phi^*g)$ (this is easy to compute in local
coordinates $\psi_1 : U \subset \mathbb{R}^n \to V \subset M$ and
$\psi_2 = \phi\circ\psi_1 : U \to \phi(V)$). If $X$ is a smooth vector field, generating a flow
$\phi_t$, then differentiating $\phi_t^*(\operatorname{Ric}(g)) = \operatorname{Ric}(\phi_t^*g)$
with respect to $t$ at $t = 0$ gives $L_X(\operatorname{Ric}(g)) = P(L_Xg)$, where $L_X$ is the
Lie derivative. As differential operators on $X$, $P(L_Xg)$ is of third order and
$L_X(\operatorname{Ric}(g))$ is of first order. Taking the principal symbol yields
$0 = \sigma_P(\xi)\circ\sigma_Q(\xi)$, where we let $Q(X) = L_Xg$. Since $X \mapsto Q(X)$ maps
vector fields to symmetric $(0,2)$-tensor fields, this symbol equation means that
$0 = \sigma_P(\xi)(\sigma_Q(\xi)(v))$ for any $v \in T_pM$ and $\xi \in T_p^*M$. From
$(L_Xg)_{ij} = X_{i;j} + X_{j;i}$, we see that
$\sigma_Q(\xi)(v) = i\,(\xi\otimes v^\flat + v^\flat\otimes\xi)$. Thus for $\xi \ne 0$, we have a
nontrivial kernel for $\sigma_P(\xi)$. In Exercise 6-59 we will see that the kernel elements of
$\sigma_P(\xi)$ for $\xi \ne 0$ are precisely those symmetric $(0,2)$-tensors of the form
$\xi\otimes v^\flat + v^\flat\otimes\xi$.
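The formula for $\sigma_Q(\xi)$ follows from the definition of the principal symbol with $m = 1$. A sketch of the computation, together with a check that $\sigma_Q(\xi)$ is injective (so that the kernel of $\sigma_P(\xi)$ contains an $n$-dimensional subspace), is given below.

```latex
% m = 1: \sigma_Q(\xi)(v) = i\,Q(\varphi W)(p), with \varphi(p) = 0, d\varphi|_p = \xi, W(p) = v.
\begin{align*}
 (L_{\varphi W}\,g)_{ij} &= (\varphi W_i)_{;j} + (\varphi W_j)_{;i}
   = \varphi\,(W_{i;j} + W_{j;i}) + \partial_j\varphi\,W_i + \partial_i\varphi\,W_j ,\\
 (L_{\varphi W}\,g)_{ij}(p) &= \xi_j v_i + \xi_i v_j ,\\
 \sigma_Q(\xi)(v) &= i\,\big(\xi\otimes v^\flat + v^\flat\otimes\xi\big).
\end{align*}
% Injectivity for \xi \ne 0: suppose \xi\otimes v^\flat + v^\flat\otimes\xi = 0.
% Contracting twice with \xi^\sharp gives 2|\xi|_g^2\,\xi(v) = 0, so \xi(v) = 0.
% Contracting once with \xi^\sharp then gives |\xi|_g^2\,v^\flat + \xi(v)\,\xi = |\xi|_g^2\,v^\flat = 0,
% so v = 0.
```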
The constraints operator
$\Phi(g, K) = \big(R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2,\ \operatorname{div}_g K - d(\operatorname{tr}_g K)\big)$
can be construed as a nonlinear, underdetermined-elliptic operator of mixed order,
following [76], cf. [69; 61].

6.1.4. Elliptic estimates. We consider linear operators $L$ on a manifold $M$, and
while it will be important to understand the role of the regularity of the coefficients
of $L$ in the analysis, we assume the coefficients of $L$ are smooth unless so noted.
There are fundamental estimates for elliptic operators that will play an important
role in what follows. We will see that these elliptic estimates (sometimes paired
with compactness results in appropriate function spaces) can yield convergence
results, which might be used for instance in an iteration scheme to produce a
solution of a PDE under consideration. We will begin by illustrating several such
estimates.
Recall that a closed manifold is one which is compact and without boundary.
Proposition 6-12 (elliptic estimates). Let $L$ be a linear elliptic operator of order
$m$, on a closed manifold $M$. Let $1 < p < \infty$ and $0 < \alpha < 1$. There is a constant
$C > 0$ such that the following estimates hold for any smooth $u$:
$$\|u\|_{H^m(M)} \le C\big(\|Lu\|_{L^2(M)} + \|u\|_{L^2(M)}\big), \tag{6.1.1}$$
$$\|u\|_{W^{m,p}(M)} \le C\big(\|Lu\|_{L^p(M)} + \|u\|_{L^p(M)}\big), \tag{6.1.2}$$
$$\|u\|_{C^{m,\alpha}(M)} \le C\big(\|Lu\|_{C^{0,\alpha}(M)} + \|u\|_{C^0(M)}\big). \tag{6.1.3}$$

Remark 6-13. We could have labeled the constant differently for each of the
estimates above, since in addition to depending on L and M (including the
background metric g, if used to define the norms), the constant in (6.1.2) depends
on p, and in (6.1.3) depends on α.
Inequalities (6.1.1)–(6.1.3) are basic versions of, respectively, the L 2 -estimates,
L p -estimates, and Schauder estimates. One often breaks out the case p = 2
separately, as it can be obtained by a more elementary treatment than the general
L p case. These are a priori estimates: we are assuming smoothness on u to begin
with, and then the estimates show that u and all its derivatives up through order
m can be estimated in the indicated norms in terms of the associated norm of
$Lu$, and a lower-order norm on $u$; for instance the first and second derivatives of
a function $u$ can be controlled in terms of $\Delta_g u$, along with a lower-order norm.
In fact, the estimates do not need $u$ to be $C^\infty$; it is sufficient to assume that $u$
lies in $H^m(M)$, $W^{m,p}(M)$ or $C^{m,\alpha}(M)$ as the case may be. For (6.1.1)–(6.1.2)
this can be easily derived from the smooth case by taking a sequence of smooth
$u_i$ converging to $u$ in the relevant Sobolev norm.
We next indicate a few basic facts one can get by combining the estimates
with compactness results. These observations are used heavily; for instance they
can be employed in the proof of the Fredholm property (Proposition 6-25). For
the next proposition, the domain of L can be any of C m,α (M), 0 < α < 1, or
W m, p (M), 1 < p < ∞, for which the above elliptic estimates hold. We will see
shortly that the kernel is independent of which of these function spaces we take
for the domain.
Proposition 6-14. Let M be a closed manifold, on which L is a linear elliptic
operator of order m. The kernel of L is finite-dimensional.
Proof. The kernel of L is a closed subspace, and hence a Banach space. The claim
will be established by showing that the closed unit ball in ker L is compact. Given
any sequence in this unit ball, we can extract a subsequence u i that converges
appropriately, either in L p (M) (using Rellich–Kondrachov) or in C 0 (M) (using
Ascoli–Arzelà). Since Lu i = 0, the respective elliptic estimate may be applied
to the differences u i −u j to show that the sequence u i converges in ker L, from
which the claim follows. □
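The key step in this proof, written out for the Sobolev case, is the following estimate on differences. It is a sketch of the step indicated in the proof, not an additional hypothesis.

```latex
% u_i in the closed unit ball of \ker L \subset W^{m,p}(M), already converging in L^p(M).
\[
  \|u_i - u_j\|_{W^{m,p}(M)}
   \le C\big(\|L(u_i - u_j)\|_{L^p(M)} + \|u_i - u_j\|_{L^p(M)}\big)
   = C\,\|u_i - u_j\|_{L^p(M)} \longrightarrow 0 .
\]
% Hence (u_i) is Cauchy in W^{m,p}(M); its limit u satisfies Lu = 0 and \|u\|_{W^{m,p}(M)} \le 1.
% A Banach space whose closed unit ball is compact is finite-dimensional (Riesz's lemma).
% The Hölder case is identical, with (6.1.3) and C^0(M)-convergence in place of
% (6.1.2) and L^p(M)-convergence.
```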
A second fundamental fact is an injectivity estimate transverse to the kernel;
thus should L have no kernel, it gives a sharper overall estimate.
Proposition 6-15. Let M be a closed manifold, on which L is a linear elliptic
operator of order m. Let 0 < α < 1, respectively 1 < p < ∞, and suppose S ⊂
C m,α (M), respectively S ⊂ W m, p (M), is a closed subspace with S ∩ ker L = {0}.
There is a C > 0 such that for all u ∈ S, ∥u∥C m,α (M) ≤ C∥Lu∥C 0,α (M) , respectively
∥u∥W m, p (M) ≤ C∥Lu∥ L p (M) .
Exercise 6-16. Prove this proposition. Proceed by contradiction, starting with a
sequence u i ∈ S with ∥u i ∥C m,α (M) = 1 but ∥Lu i ∥C 0,α (M) → 0. Use the basic elliptic
estimate, together with Ascoli–Arzelà, to get convergence of a subsequence in
C m,α (M). Find a contradiction. The respective statement in Sobolev spaces
follows the same way, via Rellich–Kondrachov.
We can get higher-order estimates by differentiation.
Proposition 6-17. Let $M$ be a closed manifold, on which $L$ is a linear elliptic
operator of order $m$. Let $0 < \alpha < 1$, $1 < p < \infty$, and $k \in \mathbb{Z}_+$. There
exists a constant $C$ such that
$\|u\|_{W^{m+k,p}(M)} \le C\big(\|Lu\|_{W^{k,p}(M)} + \|u\|_{L^p(M)}\big)$ for all
$u \in W^{m+k,p}(M)$. Likewise there is a constant $C$ such that
$\|u\|_{C^{m+k,\alpha}(M)} \le C\big(\|Lu\|_{C^{k,\alpha}(M)} + \|u\|_{C^0(M)}\big)$ for all
$u \in C^{m+k,\alpha}(M)$.
Proof. Observe that L(∂ β u) − ∂ β (Lu) can be expressed as a linear combination
of ∂ γ u for |γ | < m + |β| (assuming the coefficients of L are smooth, or smooth
enough, else we have to restrict |β|). Applying (6.1.2), for example, to ∂ β u
for |β| ≤ k, we get ∥u∥W m+k, p (M) ≤ C(∥Lu∥W k, p (M) + ∥u∥W m+k−1, p (M) ) for all
u ∈ W m+k, p (M) (adjusting C as necessary). The lower-order term ∥u∥W m+k−1, p (M)
can be replaced by ∥u∥ L p (M) via interpolation (see Exercise 6-6). The higher-
order analogue of (6.1.3) follows likewise. □
6.1.4.1. On the proof of (6.1.1)–(6.1.3) and interior estimates. We make only
brief comments on the proofs of (6.1.1)–(6.1.3). One natural approach to proving
(6.1.1) is to use Fourier analysis; see, e.g., the proof from [139, p. 193], which
also uses a parametrix (for more on this notion see [114; 115]). For a beautiful
approach to Schauder estimates by scaling, see [210]. A standard approach that
works for all three estimates is to use corresponding interior estimates as in [86;
107; 3], along with a suitable finite cover of $M$. To state and motivate these,
we proceed in the converse direction, getting interior estimates from (6.1.1)–
(6.1.3) as follows. Let $\Omega'$ be open with compact closure contained in the open
set $\Omega \subset M$. Consider a smooth cutoff function $\zeta$ that is identically 1 on $\Omega'$
and vanishes outside $\Omega$. By applying any of (6.1.1)–(6.1.3) to $\zeta u$, and noting
$L(\zeta u) - \zeta Lu$ can be expressed as a linear combination of $u$ and its derivatives
up to order $m - 1$, with coefficients depending on $\zeta$ and its derivatives, one
can get interior estimates. We illustrate with the $L^2$-estimate, with analogous
comments applying to the others. Applying the cutoff and (6.1.1), we get,
adjusting $C$ as needed,
$\|u\|_{H^m(\Omega')} \le C\big(\|Lu\|_{L^2(\Omega)} + \|u\|_{H^{m-1}(\Omega)}\big)$. As you can
imagine, $C$ will depend on the two domains; in particular, the derivatives of
the cutoff function will depend on the distance between the domains. As noted,
one often derives such interior estimates first, and then from these, one can get
an estimate $\|u\|_{H^m(M)} \le C\big(\|Lu\|_{L^2(M)} + \|u\|_{H^{m-1}(M)}\big)$ with a suitable
covering. From here, (6.1.1) follows using interpolation as in Exercise 6-6 to replace the
norm $\|u\|_{H^{m-1}(M)}$ with $\|u\|_{L^2(M)}$. We remark that to replace the norm
$\|u\|_{H^{m-1}(\Omega)}$ with $\|u\|_{L^2(\Omega)}$ in the interior estimate (and with the
analogous replacement for the other estimates, respectively), one has to work a bit harder,
because the domains are different on each side of the inequality in the interior estimate.
See, e.g., [86, Section 6.3], or for more general scaling arguments, [107, Sections 6.1
and 9.5] and [209, Chapter 2, Lemma 2].
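The cutoff argument for the $L^2$ interior estimate can be written as a chain of inequalities. The sketch below uses the commutator notation $[L, \zeta]u := L(\zeta u) - \zeta Lu$, introduced here for convenience, and takes $\zeta$ supported in $\Omega$.

```latex
% \zeta \equiv 1 on \Omega', supp \zeta \subset \Omega; [L,\zeta] has order at most m-1.
\begin{align*}
 \|u\|_{H^m(\Omega')} \le \|\zeta u\|_{H^m(M)}
  &\le C\big(\|L(\zeta u)\|_{L^2(M)} + \|\zeta u\|_{L^2(M)}\big)\\
  &\le C\big(\|\zeta Lu\|_{L^2(M)} + \|[L,\zeta]u\|_{L^2(M)} + \|\zeta u\|_{L^2(M)}\big)\\
  &\le C'\big(\|Lu\|_{L^2(\Omega)} + \|u\|_{H^{m-1}(\Omega)}\big).
\end{align*}
% C' depends on bounds for \zeta and its derivatives up to order m, and these bounds
% grow as the distance between \Omega' and the complement of \Omega shrinks.
```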

6.1.5. Elliptic regularity and bootstrapping. Suppose for the moment that the
coefficients of the linear elliptic operator L are smooth, and u is a locally
integrable function (or more generally a distribution) so that Lu = f weakly.
We would like to infer how smooth u is based on the regularity of f . A seminal
result along these lines is Weyl’s lemma, which we recall now.

Lemma 6-18 (Weyl's Lemma). Suppose $\Omega \subset \mathbb{R}^n$ is open, and that $u$ is a
distributional solution of $\Delta u = 0$ in $\Omega$, i.e., for all $\psi \in C_c^\infty(\Omega)$,
$\int_\Omega u\,\Delta\psi\,dx = 0$. Then $u \in C^\infty(\Omega)$, i.e., the distribution $u$
can be represented by integration against a smooth function.

This immediately implies that if $L = a^{ij}\,\partial^2/\partial x^i\partial x^j$, for $(a^{ij})$
a positive definite $n \times n$ constant matrix, then distributions $u$ with $Lu = 0$ are
actually smooth. A classical proof of Weyl's Lemma that does not use elliptic estimates is a
mollification argument based on the mean value property (Proposition 7-8), as
in [86] in case $u$ is continuous; this can be extended to harmonic distributions
$u$ by showing that the mollification of $u$ yields a smooth harmonic function, cf.
[195, Theorem 6.30].
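The passage from the Laplacian to a general constant-coefficient operator $L = a^{ij}\,\partial^2/\partial x^i\partial x^j$ is a linear change of variables. A sketch, written for smooth $u$ and with the symmetric square root $B = A^{1/2}$ introduced here for illustration, is as follows; the same change of variables applies to distributions by duality.

```latex
% A = (a^{ij}) symmetric positive definite, B = A^{1/2} its symmetric square root.
% Set v(y) = u(By), i.e. x = By. Then \partial v/\partial y^k = B^{ik}\,(\partial u/\partial x^i)(By), and
\[
  \Delta_y v(y) = \sum_k B^{ik}B^{jk}\,\frac{\partial^2 u}{\partial x^i\,\partial x^j}(By)
               = a^{ij}\,\frac{\partial^2 u}{\partial x^i\,\partial x^j}(By) = (Lu)(By).
\]
% Hence Lu = 0 on \Omega if and only if \Delta_y v = 0 on B^{-1}\Omega.
% Weyl's lemma gives that v is smooth, and therefore u = v \circ B^{-1} is smooth.
```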
The interior estimates can be used to obtain more general elliptic regularity
results for solutions of elliptic equations. For illustration, let us suppose L is linear
elliptic of order m, with smooth coefficients, and M is a closed manifold (although
as we will see presently, most of the action will be happening in a localized
setting). Take u ∈ H m (M), so that Lu ∈ L 2 (M); if we suppose furthermore that
f := Lu ∈ H 1 (M), we would like to conclude that u ∈ H m+1 (M). We can turn
this into a local question using a suitable covering by coordinate charts, and
work on a bounded domain in Rn .
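Proposition 6-19. Suppose $L$ is a linear elliptic operator of order $m$ with smooth
coefficients on an open set $\Omega \subset \mathbb{R}^n$. Let $1 < p < \infty$, $0 < \alpha < 1$,
and $k \in \mathbb{Z}_+$. If $u \in W^{m,p}_{\mathrm{loc}}(\Omega)$ with
$Lu = f \in W^{k,p}_{\mathrm{loc}}(\Omega)$, then $u \in W^{m+k,p}_{\mathrm{loc}}(\Omega)$. Likewise if
$u \in C^{m,\alpha}_{\mathrm{loc}}(\Omega)$ with $Lu = f \in C^{k,\alpha}_{\mathrm{loc}}(\Omega)$, then
$u \in C^{m+k,\alpha}_{\mathrm{loc}}(\Omega)$.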
Proof. Suppose $u \in H^m_{\mathrm{loc}}(\Omega)$ with $Lu = f \in H^1_{\mathrm{loc}}(\Omega)$, and
take compactly contained domains $\Omega'' \subset \Omega' \subset \Omega$. While we might not be
able to differentiate the equation, we can difference it to obtain an elliptic equation
satisfied by any difference quotient $\Delta_i^h u(x) = h^{-1}(u(x + he_i) - u(x))$ ($h \ne 0$),
where $\{e_1, \ldots, e_n\}$ is the standard basis for $\mathbb{R}^n$. Indeed, we let
$u^h(x) = u(x + he_i)$, and if $L = \sum_{|\alpha|\le m} a_\alpha\,\partial^\alpha/\partial x^\alpha$,
then we let $\Delta_i^h L = \sum_{|\alpha|\le m} (\Delta_i^h a_\alpha)\,\partial^\alpha/\partial x^\alpha$.
It is easy to show the product rule for differences:
$\Delta_i^h(Lu) = L(\Delta_i^h u) + (\Delta_i^h L)(u^h)$. For
$0 < |h| < \operatorname{dist}(\Omega', \partial\Omega)$, the interior estimate for $L$ gives, since
$\Delta_i^h u \in H^m(\Omega')$,
$$\begin{aligned}
\|\Delta_i^h u\|_{H^m(\Omega'')} &\le C\big(\|L(\Delta_i^h u)\|_{L^2(\Omega')} + \|\Delta_i^h u\|_{H^{m-1}(\Omega')}\big)\\
&\le C'\big(\|\Delta_i^h(Lu)\|_{L^2(\Omega')} + \|u\|_{H^m(\Omega)} + \|\Delta_i^h u\|_{H^{m-1}(\Omega')}\big)
\end{aligned} \tag{6.1.4}$$
where we used a bound on the difference quotients $\Delta_i^h a_\alpha$, by the mean value
theorem, say. To complete the proof, we invoke a difference quotient lemma,
which roughly says that for $u \in H^m$, difference quotients of $u$ are uniformly
bounded if and only if $u \in H^{m+1}$. More precisely, for $v \in H^1(\Omega)$, and for any
$\Omega' \subset \Omega$ with $0 < |h| < d(\Omega', \partial\Omega)$,
$\|\Delta_i^h v\|_{L^2(\Omega')} \le \|\partial v/\partial x^i\|_{L^2(\Omega)}$; conversely for
$v \in L^2(\Omega)$, if there is a $K > 0$ such that for any $\Omega' \subset \Omega$ with
$0 < h < d(\Omega', \partial\Omega)$, $\|\Delta_i^h v\|_{L^2(\Omega')} \le K$, then
$\partial v/\partial x^i \in L^2(\Omega)$ and $\|\partial v/\partial x^i\|_{L^2(\Omega)} \le K$. For a
proof of the analogous statement for all $1 < p < \infty$, see [107, Chapter 7]. Applying
this to (6.1.4) yields a constant $C''$ such that (note where we use $Lu \in H^1(\Omega)$)
$$\|\Delta_i^h u\|_{H^m(\Omega'')} \le C''\big(\|Lu\|_{H^1(\Omega)} + \|u\|_{H^m(\Omega)}\big).$$
By applying the difference quotient lemma again, we get $u \in H^{m+1}_{\mathrm{loc}}(\Omega)$ as
desired.
The proof for $1 < p < \infty$ follows in the same way, and we can apply induction
to bootstrap regularity for higher $k$. A similar approach can be used for Hölder
regularity; see Exercise 6-60. □
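The product rule for differences used in the proof is a one-line computation on each term of $L$. The sketch below writes $a^h(x) = a(x + he_i)$ for a coefficient $a$, a notation introduced here by analogy with $u^h$.

```latex
% One term a\,\partial^\alpha u of Lu; a^h(x) = a(x + he_i), u^h(x) = u(x + he_i).
\begin{align*}
 \Delta_i^h\big(a\,\partial^\alpha u\big)
  &= \frac{a^h\,\partial^\alpha u^h - a\,\partial^\alpha u}{h}\\
  &= a\,\frac{\partial^\alpha u^h - \partial^\alpha u}{h}
     + \frac{a^h - a}{h}\,\partial^\alpha u^h\\
  &= a\,\partial^\alpha(\Delta_i^h u) + (\Delta_i^h a)\,\partial^\alpha u^h .
\end{align*}
% Summing over |\alpha| \le m gives \Delta_i^h(Lu) = L(\Delta_i^h u) + (\Delta_i^h L)(u^h).
% The second term involves at most m derivatives of u^h, with coefficients \Delta_i^h a_\alpha
% bounded uniformly in h when the a_\alpha are Lipschitz.
```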
From here we see that for $L$ linear elliptic of order $m$, with smooth coefficients,
if $u \in H^m(M)$ and $Lu = f \in C^\infty(M)$, then by induction we can prove $u \in H^k(M)$
for all $k \in \mathbb{Z}_+$, and analogously if instead $u \in W^{m,p}(M)$ and $1 < p < \infty$, or
$u \in C^{m,\alpha}(M)$ with $0 < \alpha < 1$. In the Hölder setting, then, we see directly that
$u$ is $C^\infty$. In the case of Sobolev spaces, one invokes a Sobolev embedding: if $kp > n$,
any $u \in W^{k,p}(M)$ can be represented by a continuous function $u \in C^0(M)$; in fact
if $\ell$ is a nonnegative integer, $(k-1)p \le n < kp$, and $0 < \alpha < 1$ with
$\alpha \le k - \frac{n}{p}$, then if $u \in W^{k+\ell,p}(M)$, it follows that
$u \in C^{\ell,\alpha}(M)$, and there is a constant $C > 0$ so that for all
$u \in W^{k+\ell,p}(M)$, $\|u\|_{C^{\ell,\alpha}(M)} \le C\|u\|_{W^{k+\ell,p}(M)}$. Section 7.2.2.1
discusses more such embeddings; see also [2; 86; 107], for example. We likewise
conclude that if $L$ is linear elliptic of order $m$ on $\Omega$, with smooth coefficients,
and $u \in W^{m,p}_{\mathrm{loc}}(\Omega)$ for some $1 < p < \infty$ with
$Lu = f \in C^\infty(\Omega)$, then $u \in C^\infty(\Omega)$.
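To make the numerology of the Sobolev embedding concrete, here are two instances with $p = 2$. The dimensions are chosen for illustration only.

```latex
% Instance 1: n = 3, p = 2. The condition (k-1)p \le n < kp reads 2(k-1) \le 3 < 2k, so k = 2.
% Then \alpha \le k - n/p = 2 - 3/2 = 1/2, and for every integer \ell \ge 0:
\[
  H^{2+\ell}(M^3) \subset C^{\ell,1/2}(M^3), \qquad
  \|u\|_{C^{\ell,1/2}(M)} \le C\,\|u\|_{H^{2+\ell}(M)} .
\]
% Instance 2: n = 2, p = 2. Here 2(k-1) \le 2 < 2k again gives k = 2, and k - n/p = 1,
% so every 0 < \alpha < 1 is allowed:
\[
  H^{2+\ell}(M^2) \subset C^{\ell,\alpha}(M^2) \quad \text{for all } 0 < \alpha < 1 .
\]
% In either case, u \in H^k(M) for all k implies u \in C^\ell(M) for all \ell, i.e. u is smooth.
```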

Remark 6-20. We have been working with a priori regularity for $u$, say
$u \in W^{m,p}_{\mathrm{loc}}$, for instance. We often naturally run into weak solutions that do
not have this regularity a priori, and in fact the following elliptic regularity result holds:
for a linear elliptic operator $L$ of order $m$ on $\Omega \subset \mathbb{R}^n$ with smooth
coefficients, if $u$ is a distribution with $Lu = f \in C^\infty(\Omega)$, then
$u \in C^\infty(\Omega)$. Since smoothness is a local question, we can localize support near
any point, and so consider a smooth bump function $\zeta \in C_c^\infty(\Omega)$ that is
identically 1 on $B_\varepsilon(p)$ compactly contained in $\Omega$, and let $\psi_1$ be a smooth
bump function identically 1 on $B_{\varepsilon/2}(p)$ supported in $B_\varepsilon(p)$. By
[195, Theorem 8.9], $\zeta u \in H^s(\mathbb{R}^n)$ for some $s$, so that since $\zeta u = u$
on the support of $\psi_1$, $L(\psi_1 u) - \psi_1 Lu \in H^{s-m+1}(\mathbb{R}^n)$. We can apply
elliptic regularity in these Sobolev spaces as in [93, Lemma 6.32] (cf. [139, Theorem
III.4.3] for a related approach) to conclude $\psi_1 u \in H^{s+1}(\mathbb{R}^n)$. The claim
follows by further localization and bootstrapping regularity.
We could compactify the localized problem to a closed manifold if we wish:
take an elliptic operator $\tilde{L}$ for which $\tilde{L} = L$ on $B_\varepsilon(p)$, while
$\tilde{L}$ has constant coefficients outside the support of $\zeta$; this can be done with a
smooth convex combination of $L$ with the constant-coefficient operator whose coefficients are
those of $L$ at $p$, by taking sufficiently small $\varepsilon$ if necessary. We can construe
$\tilde{L}(\psi_1 u) =: \tilde{f}$ on a domain compactified to an $n$-torus, at which point
Fourier series methods could be employed, as in [219, Chapter 6].
As we may want to apply linear theory in situations where the coefficients are
not necessarily smooth, it is very important to note the regularity required on the
coefficients of $L$ to obtain the elliptic estimates and regularity at a given level.
For (6.1.1)–(6.1.2), one can impose a continuity requirement, at least for the top-
order coefficients, to be able to locally approximate $L$ with a constant-coefficient
operator to which the estimates will apply. For the Schauder estimate (6.1.3),
Hölder continuity of the coefficients is sufficient; cf. Chapters 6, 8, and 9 of [107]
for such results in the second-order case. Given these results, for higher-order
estimates and regularity, we observe directly from the difference quotient method
above, that it is sufficient for $L$ to have, for $k \in \mathbb{Z}_+$, coefficients in
$C^{k-1,1}_{\mathrm{loc}}$ in order to conclude that $u \in H^{m+k}_{\mathrm{loc}}$ follows from
$u \in H^m_{\mathrm{loc}}$ and $Lu = f \in H^k_{\mathrm{loc}}$, and analogously for
$1 < p < \infty$. For Hölder regularity, given the basic estimate (6.1.3) holds for
$C^{0,\alpha}$ coefficients, we can then prove that for $u \in C^{m,\alpha}_{\mathrm{loc}}$, for $L$
with coefficients in $C^{k,\alpha}_{\mathrm{loc}}$, then from $Lu = f \in C^{k,\alpha}_{\mathrm{loc}}$
we can conclude $u \in C^{m+k,\alpha}_{\mathrm{loc}}$, cf. Exercise 6-60.
Some of the technicalities of the preceding discussion can be appreciated
with a couple of basic nonlinear examples. For a very simple example, suppose
we have a solution to the equation $Lu = h(u)$ for some smooth function $h$,
where $L$ is linear elliptic of order $m$ with smooth coefficients. If $u$ is
$C^{m,\alpha}_{\mathrm{loc}}$, say, then so is $h(u)$. By elliptic regularity, we have $u$, and
hence then $h(u)$, in $C^{2m,\alpha}_{\mathrm{loc}}$. We can continue from here to get that
$u \in C^\infty$. The equation $Lu = h(u)$ is semilinear: the highest order terms appear
linearly. Certain operators of interest are quasilinear: the operator is linear in the
highest-order derivatives, but the coefficients can depend on lower-order derivatives. A
classical example of this is the minimal surface equation (see Exercise 6-58). Suppose $u$
satisfies an equation of the form $a^{ij}(x, u, du)\,\partial^2 u/\partial x^i\partial x^j = f(x)$,
where $f$ is smooth, and $a^{ij}$ is smooth in its arguments. Given the solution $u$, we can
construe the coefficients as given and consider the equation in terms of linear theory;
suppose $(a^{ij}(x, u, du))$ is positive definite, so that the operator is elliptic. If $u$ is
$C^{2,\alpha}_{\mathrm{loc}}$, say, then the coefficients $a^{ij}(x, u, du)$ are
$C^{1,\alpha}_{\mathrm{loc}}$; by the above discussions, we see we can conclude
$u \in C^{3,\alpha}_{\mathrm{loc}}$; this in turn allows us to bootstrap the regularity
of the coefficients $a^{ij}(x, u, du)$ to $C^{2,\alpha}_{\mathrm{loc}}$, and continue.

6.1.6. The Fredholm alternative/Hodge decomposition. In this section we discuss
a result on solvability of elliptic equations. Suppose that $(M, g)$ is a closed
Riemannian manifold, and $P$ is a linear elliptic operator of order $m$ with smooth
coefficients. From Proposition 6-14, $P$ has a finite-dimensional kernel, which
by elliptic regularity consists of smooth sections, and thus we can write
the domain of $P$ (i.e., $W^{m,p}(M)$ or $C^{m,\alpha}(M)$) as $S \oplus \ker P$, where $S$ is a
closed subspace. Applying Proposition 6-15, we can easily conclude $P$ has closed range.
Thus in the target of $P$ (i.e., $L^p(M)$ or $C^{0,\alpha}(M)$) we have a direct sum of closed
subspaces $\operatorname{ran} P \oplus \ker P^*$, since by integration by parts any
$Pw = f \in L^2(M)$ annihilates the kernel of $P^*$, which is itself a finite-dimensional space
of smooth sections. In fact, the content of the Fredholm alternative/Hodge decomposition
is that this direct sum is the entire target space. The proof uses tools from
functional analysis to get a weak solution of $Ph = f$, for any $f$ in the target
space that annihilates $\ker P^*$ by integration, and then shows that the solution $h$
can be represented by an element in the respective space $W^{m,p}(M)$ or $C^{m,\alpha}(M)$,
by elliptic regularity. In fact, then, one has the following decompositions: for
$1 < p < \infty$, $0 < \alpha < 1$, and $k$ a nonnegative integer (again, we suppress the
metric from the notation):

$$W^{k,p}(M) = P(W^{m+k,p}(M)) \oplus \ker P^* \tag{6.1.5}$$
$$C^{k,\alpha}(M) = P(C^{m+k,\alpha}(M)) \oplus \ker P^* \tag{6.1.6}$$
$$C^\infty(M) = P(C^\infty(M)) \oplus \ker P^*. \tag{6.1.7}$$

These decompositions are valid as well in case the linear operator $P$ is
underdetermined-elliptic, i.e., $P^*$ is overdetermined-elliptic. To see this we
apply the decomposition to the elliptic operator $PP^*$: for $p = 2$, say, we get
$L^2(M) = PP^*(H^{2m}(M)) \oplus \ker(PP^*)$. Integrating
$0 = \int_M \langle u, PP^*u\rangle\,dv_g$ by parts shows that $\ker(PP^*) = \ker P^*$. Also,
$PP^*(H^{2m}(M)) \subset P(H^m(M)) \perp \ker P^*$, from which we conclude
$L^2(M) = P(H^m(M)) \oplus \ker P^*$. We highlight that
a solution $h$ of $Ph = f$ can be taken to be of the form $h = P^*u$, which is
$L^2(dv_g)$-orthogonal to $\ker P$.
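The two integration-by-parts steps used in this argument can be displayed explicitly. This is a sketch for $p = 2$; recall that elements of $\ker P^*$ are smooth, so the pairings make sense.

```latex
% Kernel: suppose PP^*u = 0 with u \in H^{2m}(M). Pairing with u and integrating by parts,
\[
  0 = \int_M \langle u, PP^*u\rangle\,dv_g = \int_M \langle P^*u, P^*u\rangle\,dv_g
    = \|P^*u\|_{L^2(M)}^2 ,
\]
% so P^*u = 0. The reverse inclusion \ker P^* \subset \ker(PP^*) is immediate.
% Orthogonality of the range: for w \in H^m(M) and v \in \ker P^*,
\[
  \int_M \langle Pw, v\rangle\,dv_g = \int_M \langle w, P^*v\rangle\,dv_g = 0 .
\]
% Orthogonality of h = P^*u to \ker P: for z \in \ker P,
\[
  \int_M \langle P^*u, z\rangle\,dv_g = \int_M \langle u, Pz\rangle\,dv_g = 0 .
\]
```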
As (6.1.5)–(6.1.7) are fundamental, we sketch a proof below, along lines that
will be useful for us later. Before doing so, we ask you to recall the following
basic fact that will be used in the proof.
Exercise 6-21. Suppose $X$ is a Banach space, and $x_i$ is a sequence converging
weakly to $x \in X$. Use the Hahn–Banach theorem to prove that
$\|x\|_X \le \liminf_{i\to\infty} \|x_i\|_X$. When $X$ is a Hilbert space with inner product
$\langle\,\cdot\,,\cdot\,\rangle$, give a more elementary proof by considering the sequence
$\langle x_i, x\rangle$.
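For the Hilbert space case, the hint leads to a two-line argument, sketched here.

```latex
% Weak convergence x_i \rightharpoonup x means \langle x_i, y\rangle \to \langle x, y\rangle for every y \in X.
% Taking y = x and using Cauchy--Schwarz:
\[
  \|x\|_X^2 = \lim_{i\to\infty} |\langle x_i, x\rangle|
  \le \liminf_{i\to\infty} \|x_i\|_X\,\|x\|_X .
\]
% Divide by \|x\|_X if x \ne 0; the case x = 0 is trivial.
```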
Proof of the Fredholm alternative/Hodge decomposition. We now give a proof of
(6.1.5) for the case $p = 2$. As we have seen, $K = \ker P^*$ is a finite-dimensional
space of smooth sections. Given an $f \in L^2(dv_g)$ which is $L^2$-orthogonal to
$K$, we seek to solve the equation $Ph = f$ for $h \in H^m(M)$. To do this, we
define the functional $G(u) = \int_M \big(\tfrac12 |P^*u|_g^2 - uf\big)\,dv_g$. Let
$S = \{u \in H^m(M) : \int_M uw\,dv_g = 0 \text{ for all } w \in K\}$, i.e.,
$S = H^m(M) \cap K^\perp$, where the orthogonal complement is taken in $L^2(dv_g)$. It is
easy to see $S$ is closed in $H^m(M)$ and that $H^m(M) = S \oplus K$.
184 6. S CALAR CURVATURE DEFORMATION AND THE E INSTEIN CONSTRAINT EQUATIONS

By Proposition 6-15 and Cauchy–Schwarz, there is a constant C such that


for all u ∈ S, G(u) ≥ C∥u∥2H m − ∥ f ∥ L 2 ∥u∥ L 2 . Thus we conclude that G is
bounded from below on S, and that an infimizing sequence u i ∈ S must be
H m -bounded. By Riesz Representation and compactness (Banach–Alaoglu,
Rellich), we can assume by re-indexing a suitable subsequence that there is a
u ∈ S (which we recall is closed) so that u i converges to u weakly in H m (M),
and in the $L^2$-norm. Thus $P^*u_i$ converges to $P^*u$ weakly in $L^2(dv_g)$, and then by Exercise 6-21 we have that $\|P^*u\|_{L^2} \le \liminf_{i\to\infty} \|P^*u_i\|_{L^2}$. Thus we see $G(u) \le \liminf_{i\to\infty} G(u_i)$, and hence $G(u)$ is the minimum value of $G$ on $S$.
We can thus compute the Euler–Lagrange equation: for $v \in S$, we have
$$0 = \frac{d}{dt}\Big|_{t=0} G(u + tv) = \int_M \big(\langle P^*u, P^*v\rangle - vf\big)\, dv_g.$$
At the same time we have $0 = \int_M \big(\langle P^*u, P^*v\rangle - vf\big)\, dv_g$ for $v \in K = \ker P^*$, by the assumption on $f$. Thus this integral identity holds for all $v \in H^m(M)$,
and hence PP ∗ u = f weakly. Since P is assumed to have smooth coefficients
and g is smooth, elliptic regularity will allow us to conclude the appropriate
regularity for any of the splittings above. For example, for f ∈ L 2 (M), we
conclude u ∈ H 2m (M), so h := P ∗ u ∈ H m (M) solves Ph = f as desired. See
Exercise 6-60 for a proof of the splitting for p > 1, and for the Hölder setting. □
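The Euler–Lagrange step in the proof above can be made explicit by expanding $G(u+tv)$ in powers of $t$. The following is a short sketch, assuming (as in the proof) a real-valued pointwise pairing $\langle\,\cdot\,,\cdot\,\rangle$; the nonnegative quadratic term also indicates why a critical point of $G$ on $S$ is a minimizer.

```latex
\begin{align*}
G(u+tv)
  &= \int_M \Big( \tfrac12 \,|P^*u + t\,P^*v|_g^2 - (u+tv)\,f \Big)\, dv_g \\
  &= G(u)
     + t \int_M \big( \langle P^*u, P^*v\rangle - v f \big)\, dv_g
     + \frac{t^2}{2} \int_M |P^*v|_g^2 \, dv_g .
\end{align*}
% The coefficient of t is the first variation, which must vanish at a minimizer.
% The coefficient of t^2 is nonnegative, so t -> G(u+tv) is a convex quadratic.
```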
Remark 6-22. We have observed that certain facts extend to overdetermined-
elliptic operators L on a closed manifold M, such as finite-dimensionality of the
kernel, which we gleaned by using a metric g to define P = L ∗ , and considering
the elliptic operator PP ∗ = L ∗ L. Of course, if we could establish the elliptic
estimates in Proposition 6-12 for overdetermined-elliptic operators, then the
proofs of various facts would follow from the estimates, such as Proposition 6-15,
which could then be used to prove the Fredholm/Hodge splitting directly for
L = P ∗ overdetermined-elliptic. In fact, the Fourier transform proof for the
L 2 -estimates works for injective symbol (cf. [114]); the scaling argument in [210]
for the Schauder estimates uses hypoellipticity, that the elements in the kernel
are smooth, which holds for overdetermined-elliptic operators. One could also
establish estimates for L overdetermined-elliptic via the corresponding estimates
for L ∗ L = PP ∗ , say in case p = 2 as a mapping L ∗ L : H m (M) → H −m (M),
for which suitable mapping properties and estimates are available via Fourier
analysis. For other spaces, see, e.g., [210] and [214, Chapters 13–14].

6.1.7. Eigenvalue decomposition for self-adjoint elliptic operators. Suppose $P$ is a linear self-adjoint ($P = P^*$) elliptic operator of order $m$ with smooth coefficients on a closed Riemannian manifold $(M, g)$. Then we have $L^2(M) = \ker P \oplus P(H^m(M))$. Furthermore for any $\lambda \in \mathbb{R}$, $P - \lambda I$ is also elliptic and self-adjoint, and so it either has nontrivial kernel $E_\lambda$ (which then must also be a finite-dimensional space of smooth sections), so that $\lambda$ is an eigenvalue, or $P - \lambda I : H^m(M) \to L^2(M)$ is invertible. Although real matrices can have complex eigenvalues, you will recall that real symmetric matrices have real eigenvalues. Similarly, even if we consider the self-adjoint operator $P$ operating on a Hermitian bundle, the same proof as in the matrix case will show that the eigenvalues can only be real. Let $\Sigma \subset \mathbb{R}$ be the set of eigenvalues. One can show, see, e.g., [139, Theorem III.5.8], that $\Sigma$ is an infinite discrete set, and $L^2(M) = \bigoplus_{\lambda\in\Sigma} E_\lambda$ as a Hilbert space direct sum of orthogonal subspaces.
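The matrix-style argument alluded to above can be sketched as follows. The block assumes an $L^2$ Hermitian inner product that is conjugate-linear in the second slot (a convention chosen here for illustration); it shows that eigenvalues are real and that eigenspaces for distinct eigenvalues are orthogonal.

```latex
% Reality of eigenvalues: suppose Pu = \lambda u with u \neq 0.
\[
\lambda \,\|u\|_{L^2}^2 = \langle Pu, u\rangle_{L^2}
  = \langle u, Pu\rangle_{L^2}
  = \bar{\lambda}\,\|u\|_{L^2}^2
  \quad\Longrightarrow\quad \lambda = \bar\lambda .
\]
% Orthogonality: suppose Pu = \lambda u and Pw = \mu w with \lambda \neq \mu (both real).
\[
\lambda \,\langle u, w\rangle_{L^2} = \langle Pu, w\rangle_{L^2}
  = \langle u, Pw\rangle_{L^2}
  = \mu\, \langle u, w\rangle_{L^2}
  \quad\Longrightarrow\quad \langle u, w\rangle_{L^2} = 0 .
\]
```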

6.1.8. Fredholm operators. We now recall the notion of a Fredholm operator, which will be used later. Let $T : X \to Y$ be a bounded linear operator between Banach spaces. The term cokernel refers to a quotient space, which in this context can either mean the algebraic cokernel $Y/T(X)$, or the Banach space cokernel $Y/\overline{T(X)}$. Of course these notions agree for operators with closed range. Recall
that by the Hahn–Banach theorem, if T (X ) is a closed, proper subspace of Y ,
there must be a nontrivial element θ in the dual space Y ∗ such that θ vanishes
on T (X ).

Exercise 6-23. Let T : X → Y be a bounded linear operator between Banach


spaces. Show that if there is a closed subspace W ⊂ Y so that Y = T (X )⊕W , then
T (X ) is closed in Y . In particular, conclude that if Y/T (X ) is finite-dimensional,
then T (X ) is closed.

Definition 6-24. A bounded linear operator T : X → Y between Banach spaces is


Fredholm if T has finite-dimensional kernel, closed range, and finite-dimensional
cokernel. We call ind(T ) = dim(ker T ) − dim(cok T ) the Fredholm index of T .

It follows that a bounded linear operator T : X → Y between Banach spaces


is Fredholm if and only if T has finite-dimensional kernel and finite-dimensional
algebraic cokernel Y/T (X ), but for clarity regarding the cokernel, the definition
is usually stated as above.
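Two standard examples may help fix the definition of the index. They are not taken from the text above: the first is the finite-dimensional case, and the second uses the shift operators on the sequence space $\ell^2(\mathbb{N})$, introduced here only for illustration.

```latex
% (a) Finite dimensions: T : R^n -> R^m linear of rank r. By rank-nullity,
\[
\dim\ker T = n - r, \qquad \dim\operatorname{cok} T = m - r,
\qquad \operatorname{ind}(T) = n - m ,
\]
% so the index depends only on the spaces, not on T.

% (b) Shifts on \ell^2(\mathbb{N}):
%     S_R(x_1, x_2, \dots) = (0, x_1, x_2, \dots), \quad S_L(x_1, x_2, \dots) = (x_2, x_3, \dots).
\[
\ker S_R = \{0\}, \quad S_R(\ell^2) = \{x : x_1 = 0\} \text{ (closed, codimension } 1),
\quad \operatorname{ind}(S_R) = -1 ,
\]
\[
\ker S_L = \operatorname{span}\{(1,0,0,\dots)\}, \quad S_L(\ell^2) = \ell^2,
\quad \operatorname{ind}(S_L) = +1 .
\]
```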
Elliptic operators between appropriate function spaces are often Fredholm, as
in the following fundamental proposition, a corollary of the above discussion.

Proposition 6-25. Suppose M is a closed manifold, and L is a linear elliptic


operator of order m with smooth coefficients. For an integer k ≥ m and 1 < p < ∞,
L : W k, p (M) → W k−m, p (M) is Fredholm; the analogous statement holds in
Hölder spaces (0 < α < 1). Moreover, by elliptic regularity, the index of L is
independent of k ≥ m (and of 1 < p < ∞, respectively 0 < α < 1).
6.1.9. The maximum principle and the Harnack inequality. The last stop on
this brief tour of elliptic PDE brings us to several important results for second-
order scalar operators. We will review in the next chapter how these properties
can be proved for the Euclidean Laplace operator using the mean value property.
For now, consider an operator
$$L = a^{ij}(x)\,\frac{\partial^2}{\partial x^i\,\partial x^j} + b^i(x)\,\frac{\partial}{\partial x^i} + c(x).$$
We should impose some ellipticity and boundedness conditions on the coefficients $a^{ij}$, $b^i$ and $c$. Assume that the coefficients are continuous on the closure $\overline{\Omega}$ of a bounded domain $\Omega \subset \mathbb{R}^n$, and that $(a^{ij})$ satisfies an ellipticity condition: there is $\lambda > 0$ so that for all $x \in \Omega$ and all $\xi \in \mathbb{R}^n$, $a^{ij}(x)\xi_i\xi_j \ge \lambda|\xi|^2$. (The continuity assumption, plus ellipticity, is sufficient to give the required bounds, but not necessary; see [107], for example. As noted earlier, we may arrange by symmetrization that $a^{ij} = a^{ji}$.)
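Weak maximum principle. We state first a weak maximum principle for $L$, which allows one to estimate certain $u$ on a bounded open set in terms of its boundary values. Suppose $u \in C^2(\Omega) \cap C^0(\overline{\Omega})$. We write $u = u^+ - u^-$, $|u| = u^+ + u^-$, where $u^+ = \max(u, 0) \ge 0$, $u^- = -\min(u, 0) \ge 0$ (note that in [107], $u^- = \min(u, 0)$ is the opposite of what we have taken here).
(i) Suppose $c = 0$ on $\Omega$. If $Lu \ge 0$ on $\Omega$, then $\max_{\overline\Omega} u = \max_{\partial\Omega} u$, while if $Lu \le 0$, then $\min_{\overline\Omega} u = \min_{\partial\Omega} u$.
(ii) Suppose $c \le 0$ in $\Omega$. If $Lu \ge 0$ on $\Omega$, then $\max_{\overline\Omega} u \le \max_{\partial\Omega} u^+$, while if $Lu \le 0$ in $\Omega$, then $\min_{\overline\Omega} u \ge -\max_{\partial\Omega} u^-$. Thus $\max_{\overline\Omega} |u| = \max_{\partial\Omega} |u|$ if $Lu = 0$.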
The idea behind these is simple. To illustrate, if $Lu > 0$, we observe that $u$ cannot have an interior maximum. If it does, say at $x_0 \in \Omega$, then each $\partial u/\partial x^i$ vanishes at $x_0$, and the Hessian matrix of $u$ is nonpositive definite (negative semidefinite) at $x_0$. If $c = 0$, we see, working in Cartesian coordinates diagonalizing $(a^{ij}(x_0))$ (if symmetric; else diagonalize the Hessian at $x_0$), that $Lu(x_0) \le 0$. This observation also works when $c \le 0$ and $u(x_0) \ge 0$, or in any case when $u(x_0) = 0$. This gives
a contradiction. The general case where the inequality satisfied by Lu is not
necessarily strict uses a perturbation to achieve the result; see [86, Chapter 6] or
[107, Chapter 3].
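The pointwise computation behind the strict case can be written out as follows. This is a sketch assuming $(a^{ij}(x_0))$ is symmetric; the orthogonal matrix $Q$, the eigenvalues $\mu_k$ and the rotated coordinates $y$ are notation introduced here for illustration.

```latex
% Diagonalize A = (a^{ij}(x_0)) = Q \,\mathrm{diag}(\mu_1,\dots,\mu_n)\, Q^T with \mu_k \ge \lambda > 0,
% and use rotated Cartesian coordinates y = Q^T x. At an interior maximum x_0 of u:
\[
\frac{\partial u}{\partial x^i}(x_0) = 0,
\qquad
\frac{\partial^2 u}{\partial y_k^2}(x_0) \le 0 \quad (k = 1,\dots,n),
\]
\[
Lu(x_0) = \sum_{k=1}^n \mu_k \,\frac{\partial^2 u}{\partial y_k^2}(x_0) + c(x_0)\,u(x_0)
\;\le\; c(x_0)\,u(x_0) .
\]
% The right-hand side is \le 0 if c = 0, or if c \le 0 and u(x_0) \ge 0,
% which contradicts Lu(x_0) > 0.
```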
The strong maximum principle and the Harnack inequality. We now state a
strong maximum principle for operators L as above, which is more subtle (see
the references just cited for proofs). It strengthens the weak maximum principle by ruling out certain interior extrema for nonconstant super- or subsolutions. Assume the bounded open set $\Omega$ is connected, and that $u \in C^2(\Omega) \cap C^0(\overline{\Omega})$.
(i) Suppose $c = 0$ on $\Omega$. If $Lu \ge 0$ on $\Omega$ and if for some $x_0 \in \Omega$, $u(x_0) = \max_{\overline\Omega} u$ (or if $Lu \le 0$ in $\Omega$ and for some $x_0 \in \Omega$, $u(x_0) = \min_{\overline\Omega} u$), then $u$ is constant.
(ii) Suppose $c \le 0$ on $\Omega$. If $Lu \ge 0$ on $\Omega$ and $u$ attains a nonnegative global maximum at a point $x_0 \in \Omega$, i.e., $u(x_0) = \max_{\overline\Omega} u \ge 0$ (or, if $Lu \le 0$ on $\Omega$ and $u$ attains a nonpositive global minimum $u(x_0) = \min_{\overline\Omega} u \le 0$ for some $x_0 \in \Omega$), then $u$ is constant. In fact, if the maximum (or minimum) satisfies $u(x_0) = 0$, then $u$ is identically zero, regardless of the sign of $c$.

One last fundamental result is the Harnack inequality [86, Theorem 6.5] (see also [107, Corollary 9.25]): for any connected, compactly contained open subset $W$ of $\Omega$, there is a $C > 0$ such that if $u \ge 0$ is a nonnegative $C^2$ solution of $Lu = 0$ in $\Omega$, then $\sup_W u \le C \inf_W u$. In particular, if the supremum is positive on $W$, the function $u$ is strictly positive in the connected component of $\Omega$ containing $W$.
The Harnack inequality can be established with weaker a priori assumptions
on u, and for operators L with coefficients satisfying modest assumptions on
ellipticity and boundedness; see [107, Chapter 9], for instance. For nonnegative
weak solutions of analogous divergence form equations (where the leading
order part of the operator has the form ∂i (a i j ∂ j u)), and under very general
assumptions of ellipticity, boundedness and measurability on the coefficients,
Moser established a Harnack inequality, from which he recovered oscillation
and Hölder continuity estimates first established in the breakthrough discovery
made independently by De Giorgi and Nash (see, for example, [107, Chapter 8] or [214, Chapter 14]). We will not go into the details here, but in light of the
discussion above (p. 182) about applications to nonlinear elliptic equations, the
reader should still be able to appreciate that this result is fundamental in going
from weak solutions to higher regularity. For applications to the mean curvature
operator one can consult [107; 108], for example.

6.1.10. The method of super- and subsolutions. As a final topic in this section,
we discuss a version of the method of super- and subsolutions that will suffice for
our purposes. We will want to solve semilinear elliptic PDE of the form $\Delta_g \varphi = f(x, \varphi)$, for $f : M \times I \to \mathbb{R}$, where $I \subset \mathbb{R}$ is an open interval. A subsolution $\varphi_-$ satisfies $\Delta_g \varphi_- \ge f(x, \varphi_-)$, and a supersolution $\varphi_+$ satisfies $\Delta_g \varphi_+ \le f(x, \varphi_+)$.
We reiterate that we take g to be smooth.

Theorem 6-26. Suppose $(M, g)$ is a closed Riemannian manifold, and suppose $f : M \times I \to \mathbb{R}$ is smooth. Suppose $\varphi_- \le \varphi_+$ are smooth sub- and supersolutions for $\Delta_g \varphi = f(x, \varphi)$, such that $[\inf_M \varphi_-, \sup_M \varphi_+] \subset I$. Then there is a smooth function $\varphi$, with $\varphi_- \le \varphi \le \varphi_+$, such that $\Delta_g \varphi = f(x, \varphi)$.
For more general formulations of this method, see, e.g., [86; 127; 151]. We
use the proof as a means to illustrate the utility of some of the tools for elliptic
PDE that we have introduced above.
Proof. We first use continuity and compactness to choose a constant $\rho > 0$ so that, if we write the function $f$ as $f(x, s)$, then $\rho - \partial f/\partial s$ is positive on $M \times [\inf_M \varphi_-, \sup_M \varphi_+] = M \times [\min_M \varphi_-, \max_M \varphi_+]$. We let
$$L = -\Delta_g + \rho \quad\text{and}\quad F(x, s) = -f(x, s) + \rho s,$$
so that $\Delta_g \varphi = f(x, \varphi)$ is equivalent to $L(\varphi) = F(x, \varphi)$. Note that $\partial F/\partial s > 0$ on $M \times [\min_M \varphi_-, \max_M \varphi_+]$.
Since we take $g$ to be smooth, the Fredholm alternative/Hodge decomposition of the space of smooth functions for the linear, self-adjoint, elliptic operator $L$ has the form $C^\infty(M) = \ker L \oplus \operatorname{ran} L$. Now, by considering $\int_M u\, Lu\, dv_g$ and integrating by parts, we see the operator $L$ has trivial kernel. Thus $L$ is surjective
as well. Since φ± is smooth, we can thus define the following sequence of
smooth functions recursively: φ1 is the unique solution to Lφ1 = F(x, φ+ ), and
for any k ∈ Z+ , φk+1 is the unique solution to Lφk+1 = F(x, φk ). We remark that
for this to be well-defined, we have to make sure the range of each φk is inside
the interval I . In fact, the sequence φk is not only well-defined, but satisfies

φ+ ≥ φ1 ≥ φ2 ≥ · · · ≥ φk ≥ φk+1 ≥ · · · ≥ φ− .

Indeed, since $\varphi_+$ is a supersolution, $L(\varphi_+ - \varphi_1) = L(\varphi_+) - F(x, \varphi_+)$ is nonnegative, i.e., $\Delta_g(\varphi_+ - \varphi_1) \le \rho(\varphi_+ - \varphi_1)$. Clearly this implies $(\varphi_+ - \varphi_1)$ cannot have a negative minimum, so $\varphi_+ \ge \varphi_1$. Similarly, $(\varphi_1 - \varphi_-)$ cannot have a negative minimum, given that $L(\varphi_1 - \varphi_-) \ge F(x, \varphi_+) - F(x, \varphi_-) \ge 0$ (since $\partial F/\partial s > 0$). Thus the range of $\varphi_1$ is inside the interval $I$, and $\varphi_2$ can be defined as indicated above. To establish the required inequalities for $\varphi_2$, note $L(\varphi_1 - \varphi_2) = F(x, \varphi_+) - F(x, \varphi_1) \ge 0$, and $L(\varphi_2 - \varphi_-) \ge F(x, \varphi_1) - F(x, \varphi_-) \ge 0$, so the same
considerations apply. We can proceed in the recursion, inductively establishing
the required estimates.
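Two elementary computations are used in the first part of the proof, and both can be spelled out in a line or two. The sketch below assumes the sign convention for the Laplacian in which $L = -\Delta_g + \rho$ is a positive operator, as in the proof; the function $w$ and the point $p$ are notation introduced here.

```latex
% (1) Trivial kernel of L = -\Delta_g + \rho on a closed manifold:
\[
\int_M u\,Lu \, dv_g = \int_M \big( |\nabla_g u|_g^2 + \rho\, u^2 \big)\, dv_g
\;\ge\; \rho \int_M u^2 \, dv_g ,
\qquad\text{so } Lu = 0 \;\Longrightarrow\; u = 0 .
\]
% (2) Comparison step: let w = \varphi_+ - \varphi_1, so Lw \ge 0, i.e. \Delta_g w \le \rho\, w.
%     At a point p where w attains its minimum over M we have \Delta_g w(p) \ge 0, hence
\[
\rho\, w(p) \;\ge\; \Delta_g w(p) \;\ge\; 0
\quad\Longrightarrow\quad \min_M w = w(p) \ge 0 .
\]
```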
We have a bounded monotonic sequence φk , which thus converges pointwise to
a limit function φ. We want to show this limit function is smooth and satisfies the
desired PDE. To do this, we apply elliptic estimates for the operator L. Indeed if
we use the Schauder estimate (6.1.3) for Hölder norms, say, then we have a $C > 0$ such that $\|u\|_{C^{2,\alpha}(M)} \le C\big(\|Lu\|_{C^{0,\alpha}(M)} + \|u\|_{C^0(M)}\big)$ for all $u \in C^{2,\alpha}(M)$. In our setting, $L$ has trivial kernel, in which case there is a constant $C > 0$ such that for all $u \in C^{2,\alpha}(M)$, $\|u\|_{C^{2,\alpha}(M)} \le C\|Lu\|_{C^{0,\alpha}(M)}$, by Proposition 6-15. We have the analogous Sobolev space estimates (6.1.2), but again since $L$ has trivial kernel, there is a $C > 0$ such that for all $u \in W^{2,p}(M)$, $\|u\|_{W^{2,p}(M)} \le C\|Lu\|_{L^p(M)}$.
With this in hand, we note that

$$\|\varphi_1\|_{W^{2,p}(M)} \le C\|F(x, \varphi_+)\|_{L^p(M)}, \qquad \|\varphi_{k+1}\|_{W^{2,p}(M)} \le C\|F(x, \varphi_k)\|_{L^p(M)}.$$
Applying continuity, compactness, and boundedness of the sequence $\varphi_k$, we see there is a $K > 0$ such that $\|\varphi_k\|_{W^{2,p}(M)} \le K$ for all $k$. To get pointwise bounds, we cite a form of the Sobolev embedding recalled earlier: for $p > \frac{n}{2}$, any $u \in W^{2,p}(M)$ can be represented by a unique continuous function, and there is a constant $C > 0$ such that for all $u \in W^{2,p}(M)$, $\|u\|_{C^0(M)} \le C\|u\|_{W^{2,p}(M)}$. Actually, for $0 < \gamma < \min\{1, 2 - \frac{n}{p}\}$, we can take $u \in C^{0,\gamma}(M)$, and there is a $C > 0$ for which $\|u\|_{C^{0,\gamma}(M)} \le C\|u\|_{W^{2,p}(M)}$. Thus from $\|\varphi_k\|_{W^{2,p}(M)} \le K$, we
see that φk is bounded in C 0,γ (M), so that Ascoli–Arzelà yields a C 0 -convergent
subsequence, and so the limit φ is continuous; moreover, by monotonicity, the
full sequence converges uniformly.
By applying the elliptic estimate judiciously, we can obtain more regularity
on φ. For instance, for k, ℓ > 1, we have

$$\|\varphi_k - \varphi_\ell\|_{W^{2,p}(M)} \le C\, \|F(x, \varphi_{k-1}) - F(x, \varphi_{\ell-1})\|_{L^p(M)}.$$
By continuity of $F$, compactness, and uniform convergence, we have that $F(x, \varphi_k)$ is Cauchy in $L^p(M)$, and so $\varphi_k$ is $W^{2,p}(M)$-Cauchy; the limit must be $\varphi$, and thus $\varphi \in W^{2,p}(M)$. Using the Sobolev embedding, we then have not only that $\varphi \in C^{0,\gamma}(M)$, but we also see that $\varphi_k$ is $C^{0,\gamma}(M)$-Cauchy, and so converges to $\varphi$ in $C^{0,\gamma}(M)$. From this and the smoothness of $F$, we can infer that $F(x, \varphi_k)$ is also $C^{0,\gamma}(M)$-Cauchy. Therefore, by the Schauder estimates, we have for $k, \ell > 1$, $\|\varphi_k - \varphi_\ell\|_{C^{2,\gamma}(M)} \le C\|F(x, \varphi_{k-1}) - F(x, \varphi_{\ell-1})\|_{C^{0,\gamma}(M)}$. Thus $\varphi_k$ is $C^{2,\gamma}(M)$-Cauchy, and hence converges in $C^{2,\gamma}(M)$, and the limit must be $\varphi$. That $\varphi$ solves the desired equation now follows from $L\varphi = \lim_{k\to\infty} L\varphi_k = \lim_{k\to\infty} F(x, \varphi_{k-1}) = F(x, \varphi)$. By elliptic regularity, we conclude $\varphi$ is smooth.

6.1.11. Application to conformal deformation of scalar curvature: Yamabe classes. Before we return to the constraint equations, we give a geometric application of the PDE theory just discussed. We show that the space of (smooth) Riemannian metrics on a closed manifold $M^n$, $n \ge 3$, can be broken up into a union of three mutually disjoint sets, $Y^+$, $Y^0$ and $Y^-$, each of which is a union of conformal classes. To do this, we begin by recalling the formula for the scalar curvature under a conformal change of metric.
Proposition 6-27. If $\hat g$ and $g$ are conformally related metrics on an $n$-manifold $M$ ($n \ge 3$), say $\hat g = u^{\frac{4}{n-2}} g$ with $u > 0$, then
$$R(\hat g) = -\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} \Big( \Delta_g u - \tfrac{n-2}{4(n-1)} R(g)\, u \Big). \tag{6.1.8}$$

The proof is a straightforward but somewhat laborious calculation, which we highly recommend to the reader (Exercise 2-51). From it one will observe that with $\hat g = u^q g$, the choice of exponent $q = \frac{4}{n-2}$ is made to avoid $|du|_g^2$-terms in (6.1.8).
The linear operator in parentheses in (6.1.8) is the conformal Laplacian $L_g$, given by $L_g u = \Delta_g u - \tfrac{n-2}{4(n-1)} R(g) u$ (cf. p. 103). For $(M, g)$ a closed Riemannian manifold, we define an energy associated to $L_g$ by
$$E_g(u) = \int_M -u\, L_g u \, dv_g = \int_M \Big( |\nabla_g u|_g^2 + \tfrac{n-2}{4(n-1)} R(g)\, u^2 \Big)\, dv_g .$$
If $\hat g = u^{\frac{4}{n-2}} g$ with $u > 0$, then $dv_{\hat g} = u^{\frac{2n}{n-2}}\, dv_g$, and we have an identity for the total scalar curvature of $\hat g$ using (6.1.8):
$$\mathcal{R}(\hat g) = \int_M R(\hat g)\, dv_{\hat g} = \int_M -\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} (L_g u)\, u^{\frac{2n}{n-2}}\, dv_g = \tfrac{4(n-1)}{n-2}\, E_g(u).$$
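The two exponent computations behind the total scalar curvature identity are short enough to record. This is a sketch in local coordinates, where $\sqrt{\det g}\,dx$ denotes the Riemannian volume element (notation introduced here).

```latex
% Volume element: \hat g = u^{4/(n-2)} g is an n x n rescaling, so
\[
dv_{\hat g} = \sqrt{\det \hat g}\; dx
  = \big( u^{\frac{4}{n-2}} \big)^{\frac{n}{2}} \sqrt{\det g}\; dx
  = u^{\frac{2n}{n-2}}\, dv_g .
\]
% Powers of u in the integrand R(\hat g)\, dv_{\hat g}:
\[
-\frac{n+2}{n-2} + \frac{2n}{n-2} = \frac{n-2}{n-2} = 1,
\qquad\text{so}\qquad
R(\hat g)\, dv_{\hat g} = -\tfrac{4(n-1)}{n-2}\, u \,(L_g u)\, dv_g .
\]
```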

For (M, g) Riemannian, Lg is a self-adjoint elliptic operator, and so as


in Section 6.1.7, we have that for M closed, Lg possesses a discrete set of
eigenvalues, each of finite multiplicity, and the eigenfunctions can be chosen
to form a complete orthonormal basis for L 2 (dvg ). As a consequence of the
following lemma, we will see that, upon writing the eigenvalue equation as
Lg u + λu = 0, the eigenvalues can be arranged in a monotonic sequence
λ1 < λ2 ≤ λ3 ≤ · · · , with limk→∞ λk = +∞, and for each k ∈ Z+ , there
is an eigenfunction u k with Lg u k + λk u k = 0. Of particular note, the first
eigenvalue is simple, and a first eigenfunction u 1 can be chosen so that u 1 > 0.
Recall that for $u$ locally integrable (e.g., $u \in L^2(dv_g)$), $du$ can be defined weakly, by requiring $\int_M du(X)\, dv_g = -\int_M u \operatorname{div}_g X \, dv_g$ for all smooth vector fields $X$. The Sobolev space $H^1(M, g)$ is comprised of those functions $u \in L^2(dv_g)$ with $|du|_g \in L^2(dv_g)$, forming a Hilbert space with inner product $\langle u, w\rangle_{H^1} = \int_M uw \, dv_g + \int_M \langle du, dw\rangle_g \, dv_g$.

Lemma 6-28. For $(M, g)$ a closed connected Riemannian manifold, $L_g$ has a smallest eigenvalue $\lambda_1$, which is a simple eigenvalue, and there is an eigenfunction $u$ with $u > 0$.

Proof. There is a constant $C > 0$ (depending on $n$ and $g$) so that $E_g(u) \ge -C \int_M u^2 \, dv_g$. For an eigenfunction $u_k$ for $\lambda_k$, $E_g(u_k) = \lambda_k \int_M u_k^2 \, dv_g$, from which we then conclude the eigenvalues are bounded from below. Moreover, we see there is a variational characterization of $\lambda_1$, as the infimum of
$$G_g(u) := \frac{E_g(u)}{\int_M u^2 \, dv_g}$$

over all nontrivial $u \in H^1(M, g)$; the lower bound on $E_g(u)$ shows that this infimum is finite. As such, we will argue that $\inf_{u \in H^1 \setminus \{0\}} G_g(u)$ is achieved by a minimizer $u \in H^1(M, g)$, and then that this minimizer is a smooth eigenfunction which does not change sign. Given this, the minimizer has eigenvalue $\lambda_1 = \inf_{u \in H^1 \setminus \{0\}} G_g(u)$, and furthermore this eigenvalue is simple: if there were
two linearly independent eigenfunctions for an eigenvalue, we could arrange
them to be orthogonal in L 2 (dvg ), but we will have shown that minimizers
(first eigenfunctions) cannot change sign, so that two such functions cannot be
orthogonal.
That a minimum is achieved can be proven using similar tools as in the proof
above of the Hodge decomposition, based on elementary functional analysis,
together with the fundamental compactness result from the Rellich lemma: the inclusion $H^1(M, g) \hookrightarrow L^2(dv_g)$ is compact, which we recall means that given an $H^1$-bounded sequence, there is a subsequence which converges in $L^2$. In addition, by Riesz representation for Hilbert spaces, together with the
Banach–Alaoglu theorem, we can choose the subsequence to converge weakly
in H 1 (M, g) as well (to the same limit).
With this in mind, consider an infimizing sequence $v_i \in H^1(M, g) \setminus \{0\}$ for $G_g$. Since $G_g(u) = G_g(cu)$ for a constant $c \ne 0$, we may take $\int_M v_i^2 \, dv_g = 1$. The fact that $G_g(v_i)$ approaches the finite infimum then implies that $\|v_i\|_{H^1}$ is a bounded sequence. Thus we may assume, reindexing the subsequence, that $v_i$ converges strongly in $L^2$ and weakly in $H^1$ to a function $u \in H^1(M, g)$. The limit $u$ is nontrivial because $\int_M u^2 \, dv_g = 1$. That $G_g(u)$ is the minimum of $G_g$ comes from the fact that the norm is weakly lower semicontinuous, i.e., $\|u\|_{H^1} \le \liminf_{i\to\infty} \|v_i\|_{H^1}$ (cf. Exercise 6-21). Together with the $L^2$-convergence of $v_i$ to $u$, we can now conclude that $G_g(u) \le \liminf_{i\to\infty} G_g(v_i)$, so that $G_g(u)$ realizes the minimum.
As with our variational formulation of the Einstein equation, for example, we can compute the Euler–Lagrange equation by setting
$$\frac{d}{dt}\Big|_{t=0} G_g(u + tv) = 0,$$
for any $v \in H^1(M, g)$ (for small enough $|t|$, $\|u + tv\|_{L^2} \ne 0$ since $\|u\|_{L^2} = 1$).
Exercise 6-29. Compute the Euler–Lagrange equation, deriving the following identity: with $u$ as above and $\lambda = G_g(u)$, we have for all $v \in H^1(M, g)$,
$$\int_M \lambda u v \, dv_g = \int_M \Big( \langle \nabla_g u, \nabla_g v\rangle + \tfrac{n-2}{4(n-1)} R(g)\, u v \Big)\, dv_g .$$
The right-hand side is equal to $\int_M (-L_g u)\, v \, dv_g$ for smooth enough $u$, and thus the Euler–Lagrange equation is the weak formulation of $L_g u + \lambda u = 0$. As $\Delta_g$, hence $L_g$, is elliptic, and $g$ is smooth, elliptic regularity gives that $u$ is given by a smooth function. Thus $u$ is an eigenfunction, and $\lambda = \lambda_1$, as desired.
We have only to show that $u$ has a definite sign. Note that since $|\nabla_g u|_g = \big|\nabla_g |u|\big|_g$ almost everywhere, $|u| \in H^1(M, g)$ [86; 107], and so $G_g(u) = G_g(|u|)$.
Thus |u| is a minimizer, and the above argument shows that |u| ≥ 0 is an
eigenfunction. That |u| cannot have a zero minimum value comes from the
strong maximum principle. □
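One way to organize the first-variation computation requested in Exercise 6-29 is via the quotient rule. The sketch below introduces the bilinear form $B$ and the normalization $N$ as shorthand (they are not notation from the text) and uses $N(u) = 1$ for the minimizer.

```latex
% Shorthand: B(u,v) = \int_M ( <\nabla_g u, \nabla_g v> + \tfrac{n-2}{4(n-1)} R(g) u v ) dv_g,
%            E_g(u) = B(u,u),  N(u) = \int_M u^2 dv_g,  G_g = E_g / N.
\[
\frac{d}{dt}\Big|_{t=0} G_g(u+tv)
  = \frac{2\,B(u,v)\,N(u) - 2\,E_g(u) \int_M uv \, dv_g}{N(u)^2}
  = \frac{2}{N(u)} \Big( B(u,v) - G_g(u) \int_M uv \, dv_g \Big).
\]
% Setting this to zero with N(u) = 1 and \lambda = G_g(u) gives, for all v in H^1(M,g),
\[
B(u,v) = \lambda \int_M uv \, dv_g .
\]
```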
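In the next proposition, we show the sign of $\lambda_1$ is invariant across a conformal class. Suppose $g$ is conformal to $\mathring g$, say $g = \theta^{\frac{4}{n-2}} \mathring g$ for $\theta > 0$. Then, for $u > 0$,
$$-\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} L_g u = R\big(u^{\frac{4}{n-2}} g\big) = R\big((u\theta)^{\frac{4}{n-2}} \mathring g\big) = -\tfrac{4(n-1)}{n-2}\, (u\theta)^{-\frac{n+2}{n-2}} L_{\mathring g}(u\theta),$$
i.e.,
$$L_g u = \theta^{-\frac{n+2}{n-2}} L_{\mathring g}(u\theta). \tag{6.1.9}$$
This identity extends to all smooth $u$ by a simple continuity argument about points where $u = 0$. Next we see that if $u > 0$ and $\hat g = u^{\frac{4}{n-2}} g = (u\theta)^{\frac{4}{n-2}} \mathring g$, then
$$\tfrac{4(n-1)}{n-2}\, E_g(u) = \mathcal{R}(\hat g) = \tfrac{4(n-1)}{n-2}\, E_{\mathring g}(u\theta). \tag{6.1.10}$$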
We are now ready to summarize the above into a key proposition.
Proposition 6-30. Suppose M n (n ≥ 3) is a closed connected manifold. Let C
be a conformal class of Riemannian metrics. For g ∈ C , let λ1 (g) be the lowest
eigenvalue of the conformal Laplacian Lg . Then the sign of λ1 (g) is the same
for all g ∈ C : it is positive (zero, negative) if and only if there is a metric ĝ ∈ C
for which R(ĝ) is positive (zero, negative).
Proof. We let u > 0 be a first eigenfunction of Lg for a metric g ∈ C . If λ1 (g) = 0,
and if g̊ ∈ C , then by (6.1.9) or (6.1.10), we see λ1 (g̊) ≤ 0; similarly, if λ1 (g) < 0,
then λ1 (g̊) < 0. Hence, if λ1 (g) = 0, then λ1 (g̊) = 0 too. Thus we can also
conclude that if λ1 (g) > 0, then λ1 (g̊) > 0 as well.
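The step "by (6.1.9) or (6.1.10)" can be unpacked using the variational characterization of the first eigenvalue from Lemma 6-28. The following sketch writes $g = \theta^{4/(n-2)}\mathring g$ and takes $u > 0$ to be a first eigenfunction of $L_g$, so that $E_g(u) = \lambda_1(g)\int_M u^2\, dv_g$.

```latex
% Use u\theta as a test function in the Rayleigh quotient for \mathring g, together with (6.1.10):
\[
\lambda_1(\mathring g)
  \;\le\; \frac{E_{\mathring g}(u\theta)}{\int_M (u\theta)^2 \, dv_{\mathring g}}
  \;=\; \frac{E_g(u)}{\int_M (u\theta)^2 \, dv_{\mathring g}}
  \;=\; \lambda_1(g)\, \frac{\int_M u^2 \, dv_g}{\int_M (u\theta)^2 \, dv_{\mathring g}} .
\]
% The last fraction is positive, so \lambda_1(g) \le 0 forces \lambda_1(\mathring g) \le 0,
% and \lambda_1(g) < 0 forces \lambda_1(\mathring g) < 0.
% Exchanging the roles of g and \mathring g gives the remaining implications.
```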
We again let $u > 0$ be a first eigenfunction of $L_g$. It follows from
$$R\big(u^{\frac{4}{n-2}} g\big) = -\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} L_g u = \tfrac{4(n-1)}{n-2}\, \lambda_1(g)\, u^{-\frac{4}{n-2}}$$
that for any metric $g$, there is a conformally related metric $\hat g = u^{\frac{4}{n-2}} g$ with scalar curvature whose sign agrees with that of $\lambda_1(g)$. We have only left to show that if $\mathring g$ is conformal to $g$, so that $R(\mathring g)$ has a definite sign (or is identically zero), then the sign is the same as that of $\lambda_1(g)$. To see this, if $g = \theta^{\frac{4}{n-2}} \mathring g$, and if $u > 0$ is a first eigenfunction of $L_g$, then we have
$$\Delta_{\mathring g}(u\theta) - \tfrac{n-2}{4(n-1)}\, u\theta\, R(\mathring g) = L_{\mathring g}(u\theta) = \theta^{\frac{n+2}{n-2}} L_g u = -\lambda_1(g)\, \theta^{\frac{n+2}{n-2}} u .$$
If $\lambda_1(g) > 0$, then by considering a point $p$ where $u\theta > 0$ has a minimum value on $M$ (and hence $\Delta_{\mathring g}(u\theta)|_p \ge 0$), we see $R(\mathring g)|_p > 0$. Similar consideration applies if $\lambda_1(g) < 0$. If $\lambda_1(g) = 0$, then $\Delta_{\mathring g}(u\theta) = \tfrac{n-2}{4(n-1)}\, u\theta\, R(\mathring g)$. By considering points where $u\theta$ obtains a maximum and minimum on $M$, respectively, we see that since $R(\mathring g)$ has a definite sign, then $R(\mathring g)$ must be identically zero. □
As a direct corollary, the set of Riemannian metrics on a closed and connected
manifold can be written as a disjoint union Y + ∪ Y 0 ∪ Y − of Yamabe classes,
where Y + is the set of all metrics which are conformally related to a metric with
positive scalar curvature, and analogously for Y 0 and Y − . Each of these three
sets is a union of conformal classes. We note that Y + and Y 0 might in fact be
empty, whereas Y − is always nonempty (see [22, p. 123-4], for example). For
example, if M = T3 is the three-torus, Y + is empty [198] (Theorem 6-56 below),
and if M is a compact hyperbolic manifold, then Y + ∪ Y 0 is empty (see [139,
Corollary IV.5.6], for instance), and similarly Y + ∪ Y 0 is empty for T3 # T3 (see
comment in Section 6.3.1).
Before we move back to the constraint equations, we note a simple application
of the method of super- and subsolutions.
Theorem 6-31. Suppose (M, g̊) is a closed Riemannian n-manifold (n ≥ 3). Let
C be the conformal class of g̊, and suppose that g̊ ∈ Y − . For any constant ξ < 0,
there is a metric g ∈ C such that R(g) = ξ .
Proof. By the preceding proposition, we may assume without loss of generality that $R(\mathring g) < 0$. By (6.1.8), we see we want to show the following PDE has a positive solution:
$$\Delta_{\mathring g} \varphi = \tfrac{n-2}{4(n-1)}\, \varphi \Big( R(\mathring g) - \varphi^{\frac{4}{n-2}}\, \xi \Big) =: F(x, \varphi).$$
For $\delta > 0$ sufficiently small, $F(x, \delta) < 0$, and so $\varphi_- = \delta$ is a subsolution. For $\Theta > 0$ sufficiently large, $F(x, \Theta) > 0$, so that we can choose $\varphi_+ = \Theta \ge \delta = \varphi_- > 0$ to be a supersolution. □
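The constants in this proof can be chosen explicitly. The sketch below assumes $R(\mathring g) < 0$ and $\xi < 0$ as in the proof, writes $F(x,\varphi) = \tfrac{n-2}{4(n-1)}\,\varphi\,\big(R(\mathring g) - \varphi^{4/(n-2)}\xi\big)$, and denotes the constant barriers by $\delta$ and $\Theta$.

```latex
% Since R(\mathring g) = -|R(\mathring g)| and \xi = -|\xi|, for a constant s > 0:
\[
F(x, s) = \tfrac{n-2}{4(n-1)}\; s \,\Big( s^{\frac{4}{n-2}}\,|\xi| - |R(\mathring g)(x)| \Big).
\]
% Hence F(x,\delta) < 0 for all x, and F(x,\Theta) > 0 for all x, as soon as
\[
\delta^{\frac{4}{n-2}} < \frac{\min_M |R(\mathring g)|}{|\xi|},
\qquad
\Theta^{\frac{4}{n-2}} > \frac{\max_M |R(\mathring g)|}{|\xi|},
\]
% and then automatically 0 < \delta < \Theta.
% Consistency check: if R(\mathring g) \equiv R_0 is constant, then the constant
% \varphi = (R_0/\xi)^{(n-2)/4} satisfies F(x,\varphi) = 0 = \Delta_{\mathring g}\varphi.
```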
From this and Proposition 6-30, we conclude that if a metric is in Y 0 ∪ Y − ,
there is a conformally related metric with constant scalar curvature. The Y + case
is much harder, but by the resolution of the Yamabe problem by R. M. Schoen
(following work of Yamabe, Aubin and Trudinger, cf. [143]) each conformal
class does contain a metric of constant scalar curvature.

6.2. Solving the constraint equations: the conformal method

In this section we discuss one method for producing solutions to the vacuum
Einstein constraint equations, the conformal method, which dates back to Lich-
nerowicz, Choquet-Bruhat, York and Ó Murchadha [50; 52; 145; 172; 228]. The
method has proved very useful in both theory and applications, and remains an
active area of research. We will not try to give an up-to-date account of the latest
developments or even an exhaustive list of references, but rather we will develop
some of the basic formulation and results of the method, and the interested reader
can find further results and references, such as [17; 64; 153; 154].
In terms of parametrizing the space of solutions (g, K ) to the Einstein con-
straint equations, the conformal method has proved successful in the constant
mean curvature (CMC) case, i.e., in case trg K is constant.5 The non-CMC case
is not completely understood, with some near-CMC results, and a number of
recent results which cast doubts on how successful the conformal method can be
for parametrizing the moduli space of solutions to the constraints. We will set
up the conformal method, and then study the CMC regime, following Isenberg’s
treatment [127], which unified a number of earlier results and completed the
analysis in the CMC case.
The basic idea is to prescribe part of the initial data (g, K ) as free data, and
solve for the other components. For instance, the Riemannian metric g is not
prescribed, but rather its conformal class is, and the method will involve solving
for the metric in the class. If g̊ is a metric in the conformal class, the method will
4
determine a function φ > 0 such that g = φ n−2 g̊ will be the desired metric. While
this part was simple enough to describe, it takes more finesse to understand how
to assemble the symmetric tensor K , part of which will be freely prescribed, with
the remainder to be determined by the constraints. The construction of K will be
motivated below with a discussion of the transverse-traceless (TT) decomposition
of symmetric tensors. To establish rigorously the TT decomposition and to solve
the equations that arise from the conformal formulation of the Einstein constraints,
we will draw on our above tour of elliptic PDE.

5 See [153; 154] for a discussion of the conformal method outside the CMC regime.

6.2.1. Transverse-traceless (TT) decomposition. We now show how to write a symmetric $(0, 2)$-tensor $\Psi$ on a closed Riemannian manifold $(M^n, g)$, $n \ge 2$, as a sum of three terms, $\Psi = \Psi^{TT} + \Psi^{L} + \Psi^{Tr}$. Here $\Psi^{Tr} = \frac{1}{n}(\operatorname{tr}_g \Psi)\, g$ is a pure-trace term, while $\Psi^{TT}$ is transverse (divergence-free) and traceless (trace-free). Hence we see that the longitudinal part $\Psi^{L}$ must be trace-free. We give an ansatz for $\Psi^{L}$ as $\Psi^{L} = L_g W$, where $L_g$ is the conformal Killing operator $(L_g W)_{ab} = W_{a;b} + W_{b;a} - \frac{2}{n}(\operatorname{div}_g W)\, g_{ab}$ (the indices are raised and lowered using $g$, $W_a = g_{ab} W^b$). You will note the conformal Killing operator is the trace-free part of the Lie derivative $(L_W g)_{ab} = W_{a;b} + W_{b;a}$. Sometimes $L_g W$ is written as a $(2, 0)$-tensor, so that
$$(L_g W)^{ab} = g^{ac} g^{bd} (L_g W)_{cd} = g^{bd}\, W^a{}_{;d} + g^{ac}\, W^b{}_{;c} - \tfrac{2}{n}(\operatorname{div}_g W)\, g^{ab}.$$

Recall that a vector field $W$ generates a flow $\phi_t : M \to M$ by $\frac{d}{dt}\big|_{t=0} \phi_t(p) = W(p)$;
a vector field W is a Killing field for g if the flow it generates preserves g, i.e.,
if each ϕt is an isometry, whereas W is a conformal Killing field (CKV) if ϕt
preserves the conformal class of g. It is well-known (see for instance [174; 218,
Appendix C]) that W is a Killing field if and only if L W g = 0, whereas W is a
CKV if and only if L g W = 0.
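That the conformal Killing operator is the trace-free part of the Lie derivative is a one-line check. The sketch below uses $\operatorname{div}_g W = g^{ab} W_{a;b}$ and $g^{ab} g_{ab} = n$.

```latex
% Trace of the conformal Killing operator:
\[
g^{ab} (L_g W)_{ab}
  = g^{ab} W_{a;b} + g^{ab} W_{b;a} - \tfrac{2}{n} (\operatorname{div}_g W)\, g^{ab} g_{ab}
  = 2 \operatorname{div}_g W - \tfrac{2}{n}\, n \operatorname{div}_g W = 0 .
\]
% Equivalently, the Lie derivative splits into its trace-free and pure-trace parts:
\[
L_W g = L_g W + \tfrac{2}{n} (\operatorname{div}_g W)\, g ,
\qquad \operatorname{tr}_g (L_W g) = 2 \operatorname{div}_g W .
\]
```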
As $L_g W$ is already trace-free, we need only impose the divergence-free condition on $\Psi^{TT}$, $\operatorname{div}_g(\Psi - \Psi^{L} - \Psi^{Tr}) = 0$, to obtain an equation for $W$:
$$\operatorname{div}_g(L_g W) = \operatorname{div}_g(\Psi - \Psi^{Tr}). \tag{6.2.1}$$
We let $P_g = \operatorname{div}_g \circ L_g$, and we write $P_g(W) = \operatorname{div}_g(L_g W)$ in components as
$$(P_g W)_a = (L_g W)_{ab;c}\, g^{bc} = W_{a;bc}\, g^{bc} + W_{b;ac}\, g^{bc} - \tfrac{2}{n}\, W^\ell{}_{;\ell c}\, g_{ab}\, g^{bc}. \tag{6.2.2}$$

We wish to commute the covariant derivatives on the second term, for which we
use the Ricci formula.
Lemma 6-32 (Ricci formula). If $X$ is a vector field on a semi-Riemannian manifold $(M, g)$, then
$$X^i{}_{;jk} - X^i{}_{;kj} = X^\ell R^{\,i}_{kj\ell}, \tag{6.2.3}$$
so that for a one-form $\alpha$,
$$\alpha_{i;jk} - \alpha_{i;kj} = \alpha_\ell R^{\,\ell}_{jki}. \tag{6.2.4}$$

Proof. For vector fields $X$, $Y$ and $Z$ and one-form field $\theta$,
\begin{align*}
\nabla(\nabla X)(\theta, Y, Z) &= \nabla_Z(\nabla X)(\theta, Y) \\
&= \nabla_Z(\nabla_Y X(\theta)) - \nabla X(\nabla_Z \theta, Y) - \nabla X(\theta, \nabla_Z Y) \\
&= (\nabla_Z(\nabla_Y X))(\theta) + \nabla_Y X(\nabla_Z \theta) - \nabla X(\nabla_Z \theta, Y) - \nabla X(\theta, \nabla_Z Y) \\
&= (\nabla_Z(\nabla_Y X))(\theta) - \nabla_{\nabla_Z Y} X(\theta).
\end{align*}
Therefore we have
\begin{align*}
\nabla(\nabla X)(\theta, Y, Z) - \nabla(\nabla X)(\theta, Z, Y)
&= (\nabla_Z(\nabla_Y X))(\theta) - \nabla_{\nabla_Z Y} X(\theta) - (\nabla_Y(\nabla_Z X))(\theta) + \nabla_{\nabla_Y Z} X(\theta) \\
&= R(Z, Y, X)(\theta),
\end{align*}
where we used the fact that $\nabla_Z Y - \nabla_Y Z = [Z, Y]$.
For the one-form version, we note that because $\nabla g = 0$, raising and lowering commute with covariant differentiation, and we use symmetries of the curvature tensor to deduce $R_{kj\ell i} = -R_{jk\ell i} = -R_{\ell i jk} = R_{i\ell jk} = R_{jki\ell}$. □
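We apply this in (6.2.2) to obtain
\begin{align*}
(P_g W)_a &= W_{a;bc}\, g^{bc} + \big( W_{b;ca} + W_\ell R^{\,\ell}_{acb} \big) g^{bc} - \tfrac{2}{n}\, W^\ell{}_{;\ell c}\, g_{ab}\, g^{bc} \\
&= W_{a;bc}\, g^{bc} + \tfrac{n-2}{n} (\operatorname{div}_g W)_{;a} + W_\ell R^{\,\ell}_{acb}\, g^{bc}.
\end{align*}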

The operator Pg enjoys several integral properties, which we collect in the


next proposition. For simplicity of exposition, we assume the vector and tensor
fields are smooth, though clearly less regularity is needed.
Proposition 6-33. Let $Z$ and $W$ be vector fields on a closed Riemannian manifold $(M, g)$. For any symmetric and trace-free $(0, 2)$-tensor field $S$,
$$\langle Z, \operatorname{div}_g S\rangle_{L^2(dv_g)} = -\tfrac12 \langle L_g Z, S\rangle_{L^2(dv_g)},$$
from which follows that
$$\langle W, P_g W\rangle_{L^2(dv_g)} = -\tfrac12 \|L_g W\|^2_{L^2(dv_g)}, \qquad \langle Z, P_g W\rangle_{L^2(dv_g)} = \langle P_g Z, W\rangle_{L^2(dv_g)}.$$
Thus $P_g$ is formally self-adjoint.


Proof. We apply integration by parts (divergence theorem) to obtain
\begin{align*}
\langle Z, \operatorname{div}_g S\rangle_{L^2(dv_g)} &= \int_M Z^a S_{ab;c}\, g^{bc} \, dv_g = -\int_M Z^a{}_{;c}\, S_{ab}\, g^{bc} \, dv_g \\
&= -\tfrac12 \int_M (Z_{a;b} + Z_{b;a})\, S^{ab} \, dv_g \\
&= -\tfrac12 \int_M (L_g Z)_{ab}\, S^{ab} \, dv_g = -\tfrac12 \langle L_g Z, S\rangle_{L^2(dv_g)},
\end{align*}
where in the last line we used that $S$ is trace-free, and in the previous line we used symmetry. Applying this with $S = L_g W$ we obtain $\langle Z, P_g W\rangle_{L^2(dv_g)} = -\tfrac12 \langle L_g Z, L_g W\rangle_{L^2(dv_g)}$, which shows we can switch the roles of $Z$ and $W$, to yield the self-adjoint property. If $Z = W$, we obtain $\langle W, P_g W\rangle_{L^2(dv_g)} = -\tfrac12 \|L_g W\|^2_{L^2(dv_g)}$. □

We note an immediate corollary, recalling from above that a conformal Killing
field (CKV) $W$ solves $\mathcal{L}_g W = 0$.

Corollary 6-34. On a closed Riemannian manifold (M, g), the image of $P_g$ is
$L^2(dv_g)$-orthogonal to its kernel. The kernel of $P_g$ is precisely the space of CKV
fields, each of which is $L^2(dv_g)$-orthogonal to the divergence of any symmetric
trace-free tensor field S.

We want to apply the Fredholm alternative/Hodge decomposition, for which
we show the operator $P_g$ is elliptic. Indeed, the principal symbol $\sigma$ of $P_g$ at
$\xi \in T_p^*M$ is a linear transformation $\sigma(\xi) : T_pM \to T_pM$ satisfying (recall the
factor $i^2 = -1$)
\[ -\sigma(\xi) : V \mapsto V^a \xi_b \xi_c\, g^{bc} + \tfrac{n-2}{n}\, V^\ell \xi_\ell \xi_c\, g^{ac} = |\xi|_g^2\, V + \tfrac{n-2}{n}\, \xi(V)\, \xi^\sharp. \tag{6.2.5} \]
If $\sigma(\xi)(V) = 0$, we apply $\xi$ to the equation to get
$|\xi|_g^2\, \xi(V) + \tfrac{n-2}{n}\, \xi(V)\, |\xi|_g^2 = 0$.
For ξ ̸= 0 (and n ≥ 2), we obtain ξ(V ) = 0, which together with (6.2.5) implies
V = 0. Thus for each ξ ∈ T p∗ M \ {0}, σ (ξ ) is an isomorphism, which is the
definition of ellipticity.
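
The step just described can be written out as a short worked computation. The following is only a sketch in the notation of (6.2.5), using the identity $\xi(\xi^\sharp) = |\xi|_g^2$:

```latex
\begin{aligned}
0 = \xi\bigl(-\sigma(\xi)V\bigr)
  &= |\xi|_g^2\,\xi(V) + \tfrac{n-2}{n}\,\xi(V)\,\xi(\xi^\sharp) \\
  &= \tfrac{2(n-1)}{n}\,|\xi|_g^2\,\xi(V).
\end{aligned}
% For \xi \neq 0 and n \geq 2 the coefficient 2(n-1)/n is nonzero, so \xi(V) = 0;
% then (6.2.5) reduces to |\xi|_g^2 V = 0, i.e. V = 0.
```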
We saw above that the image of Pg is orthogonal to the kernel; in fact the
image is complementary to the kernel. Indeed, from (6.1.5)–(6.1.7), the Fredholm
alternative for an elliptic operator like Pg gives a Hodge decomposition of a
suitable function space, either a Banach space of vector fields with Sobolev or
Hölder regularity, or by elliptic regularity, a splitting of the space of smooth
vector fields, as an L 2 -orthogonal direct sum ker P∗g ⊕ ran Pg . Now, because
Pg = P∗g (self-adjointness) this becomes ker Pg ⊕ ran Pg .

Corollary 6-35. For any symmetric (0, 2)-tensor $\Psi$ on a closed Riemannian
manifold (M, g), there is a vector field $W$ such that $P_g W = \operatorname{div}_g(\Psi - \Psi_{\mathrm{Tr}})$, and
$W$ is determined up to the addition of a CKV.

Proof. By Corollary 6-34, if $S = \Psi - \Psi_{\mathrm{Tr}}$, then $\operatorname{div}_g S$ is $L^2(dv_g)$-orthogonal to
$\ker P_g$. By the Fredholm alternative, then, $\operatorname{div}_g S$ must be in the image of $P_g$. □

Recalling (6.2.1), we summarize as follows.

Proposition 6-36. Any symmetric (0, 2)-tensor $\Psi$ on a closed Riemannian
manifold (M, g) can be uniquely decomposed into the $L^2(dv_g)$-orthogonal sum
$\Psi = \Psi_{TT} + \Psi_L + \Psi_{\mathrm{Tr}}$.

Proof. Since we have solved (6.2.1), the decomposition exists.
By the preceding corollary, the vector field $W$ giving $\Psi_L = \mathcal{L}_g W$ is determined
up to addition of a CKV, which plainly does not affect the term $\Psi_L$. □
We note that we could have done the analysis for a symmetric (2, 0)-tensor,
and the resulting TT decomposition is obtained from that of the corresponding
metrically equivalent (0, 2)-tensor. Furthermore, while we assumed M to be
closed, there are analogous results for certain noncompact manifolds, such as in
suitable weighted spaces on (M, g) asymptotically flat (see [39], for example).

6.2.2. The conformal data. We now define the conformal data on an n-manifold
(n ≥ 3). The essential idea is to prescribe certain parts of g and K , using the idea
of the decomposition we have just seen above, and to solve for the remaining
parts to constitute g and K satisfying the constraint equations. In fact we specify
the metric up to a conformal factor, prescribing a Riemannian metric g̊ such
that g will be in the conformal class of metrics C containing g̊. In light of the
decomposition of symmetric tensors we have discussed, we also prescribe a
symmetric TT (with respect to g̊) tensor σ along with a scalar function τ , as
giving part of the tensor K (up to conformal rescaling). We will specify K in
terms of this data and derive the corresponding form of the constraint equations,
but first we note some useful conformal identities.
6.2.2.1. Some conformal identities. We collect here some identities enjoyed by
metrics $g$ and $\mathring g$ conformally related by $g = \phi^q \mathring g$, where $\phi > 0$, with Levi-Civita
connections $\nabla$ and $\mathring\nabla$. The computations collected in this section are relatively
straightforward, with some care. We start with a simple but useful formula.
Exercise 6-37. Recall that the difference $T := \nabla - \mathring\nabla$ of two connections is
tensorial, and show that
\[ T^k_{ij} = \tfrac{q}{2}\bigl( \delta^k_j\, (d\log\phi)_i + \delta^k_i\, (d\log\phi)_j - \mathring g_{ij}\, (\operatorname{grad}_{\mathring g}\log\phi)^k \bigr). \]
Use this to show that if $X$ is a vector field, then $\operatorname{div}_g(\phi^{-qn/2} X) = \phi^{-qn/2} \operatorname{div}_{\mathring g} X$.
Recall that conformal Killing vector (CKV) fields are those fields whose flow
preserves the conformal class of the given metric. Thus a vector field W is a
CKV for g if and only if it is a CKV for g̊. This can also be seen through the
identity

\[ (\mathcal{L}_g W)^{ab} = \phi^{-q}\, (\mathcal{L}_{\mathring g} W)^{ab}. \tag{6.2.6} \]

Exercise 6-38. Prove the preceding identity. Leave indices on W up.


Remark 6-39. One must take care when proving identities for conformally
related metrics, in terms of raising and lowering indices. For example, if $W$ is a
given vector field, we can lower the index to get a one-form using either metric,
which generally would be called “$W$” with indices down when only one metric
is under consideration. For instance, if we write
\[ (\mathcal{L}_g W)_{ab} = W_{a;b} + W_{b;a} - \tfrac{2}{n}\, W^c{}_{;c}\, g_{ab} = g_{ac}\, g_{bd}\, (\mathcal{L}_g W)^{cd}, \]
we have $W_a = g_{ab} W^b$, whereas if we write the same formula with each $g$
replaced by $\mathring g$, then $W_a = \mathring g_{ab} W^b$ (and of course the semicolon would now
mean covariant differentiation using $\mathring\nabla$). Suppose to be clear we let
$\mathring\omega_a = \mathring g_{ab} W^b$ and $\omega_a = g_{ab} W^b = \phi^q \mathring g_{ab} W^b = \phi^q \mathring\omega_a$, and for a one-form $\theta$ we let
$(\mathcal{L}_g \theta)_{ab} = \theta_{a;b} + \theta_{b;a} - \tfrac{2}{n}\, \theta_{c;d}\, g^{cd} g_{ab}$. Then using the identity (6.2.6), we get
$(\mathcal{L}_g \omega)_{ab} = \phi^q (\mathcal{L}_{\mathring g} \mathring\omega)_{ab}$, i.e.,
\[ (\mathcal{L}_g \omega)_{ab} = g_{ac}\, g_{bd}\, (\mathcal{L}_g W)^{cd} = \phi^q\, \mathring g_{ac}\, \mathring g_{bd}\, (\mathcal{L}_{\mathring g} W)^{cd} = \phi^q\, (\mathcal{L}_{\mathring g} \mathring\omega)_{ab}. \]

We now turn to TT tensors in conformally related metrics. If $\sigma$ is a symmetric
(0, 2)-tensor on M which is trace-free with respect to $\mathring g$, then it is trace-free
with respect to any $g \in C$. The corollary following the next lemma says that an
analogue holds if $\sigma$ is TT with respect to $\mathring g$; note that the equality below is a
one-form identity (indices down).
Lemma 6-40. Let $g = \phi^q \mathring g$. For any symmetric (0, 2)-tensor $S$ on $(M, \mathring g)$, and
for any $\xi \in \mathbb{R}$,
\[ \phi^{q-\xi} \bigl(\operatorname{div}_g(\phi^{\xi} S)\bigr)_a
 = (\operatorname{div}_{\mathring g} S)_a - \tfrac{q}{2}\, (\operatorname{tr}_{\mathring g} S)\, (d\log\phi)_a + \bigl(\xi - \tfrac{q}{2}(2-n)\bigr)\, S_{ab}\, (d\log\phi)_c\, \mathring g^{bc}. \]
As we have seen (Proposition 6-27), we find it convenient to use $q = \tfrac{4}{n-2}$.
With this choice of $q$, the following corollary is immediate.
Corollary 6-41. Let $g = \phi^q \mathring g$ with $q = \tfrac{4}{n-2}$. For any trace-free symmetric (0, 2)-
tensor $\sigma$ on $(M, \mathring g)$, we have the one-form identity $\operatorname{div}_g(\phi^{-2}\sigma) = \phi^{-2-q} \operatorname{div}_{\mathring g} \sigma$.
Thus if $\sigma$ is TT for $\mathring g$, then $\phi^{-2}\sigma$ is TT for $g$.
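
To see why the corollary is immediate, one can substitute $\xi = -2$, $q = \tfrac{4}{n-2}$ and $\operatorname{tr}_{\mathring g}\sigma = 0$ into Lemma 6-40. The following is a sketch of that arithmetic only, not a proof of the lemma:

```latex
\begin{aligned}
\xi - \tfrac{q}{2}(2-n) &= -2 + \tfrac{2}{n-2}\,(n-2) = 0, \\
\phi^{\,q+2}\bigl(\operatorname{div}_g(\phi^{-2}\sigma)\bigr)_a
  &= (\operatorname{div}_{\mathring g}\sigma)_a
     - \tfrac{q}{2}\cdot 0\cdot (d\log\phi)_a
     + 0\cdot \sigma_{ab}\,(d\log\phi)_c\,\mathring g^{bc}
   = (\operatorname{div}_{\mathring g}\sigma)_a .
\end{aligned}
```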
Exercise 6-42. Prove Lemma 6-40. You might find Exercise 6-37 helpful.
There is an analogous formula for trace-free symmetric (2, 0)-tensors: if $\Xi$ is
a trace-free symmetric (2, 0)-tensor, then the following vector equation (indices
up) holds, again with $q = \tfrac{4}{n-2}$: $\operatorname{div}_g(\phi^{-2-2q}\, \Xi) = \phi^{-2-2q} \operatorname{div}_{\mathring g} \Xi$.
Exercise 6-43. Prove the preceding formula, based on the corresponding (0, 2)-
formula. You might let $S_{cd} = \Xi^{ab} g_{ac} g_{bd}$ and $\mathring S_{cd} = \Xi^{ab} \mathring g_{ac} \mathring g_{bd}$ be the correspond-
ing trace-free (0, 2)-tensors for $\Xi$. Note that $S = \phi^{2q} \mathring S$, and (as either form or
vector identities) $\operatorname{div}_g \Xi = \operatorname{div}_g S$ and $\operatorname{div}_{\mathring g} \Xi = \operatorname{div}_{\mathring g} \mathring S$. Use the preceding lemma
to compute $\operatorname{div}_g(\phi^{-2-2q}\, \Xi)$ (take care when raising and lowering indices).

6.2.2.2. The constraint equations. We will now determine the data $g$ and $K$
in terms of the conformal data on M. Fix $q = \tfrac{4}{n-2}$ for the rest of the chapter.
We specify a metric $\mathring g$ to be in the conformal class of $g$, a symmetric TT (for $\mathring g$)
tensor $\sigma$, along with a scalar function $\tau$. We can construe $\sigma$ to be (0, 2) or (2, 0),
depending on how we want to present $K$ (indices up or down). Given this data,
construct the physical data $(g, K)$ as follows: for $\phi > 0$ and $W$ a vector field, let

\[ g = \phi^q\, \mathring g \quad\text{and}\quad K = \phi^{-2}(\sigma + \mathcal{L}_{\mathring g} W) + \tfrac{\tau}{n}\, \phi^q\, \mathring g, \]

where $g$ and $K$ are (0, 2)-tensors (indices down). Note that $\operatorname{tr}_g K = \tau$. If we
let the indices on $K$ and $\sigma$ be up, the corresponding formulation would be
$K^{cd} = \phi^{-2-2q}\bigl(\sigma^{cd} + (\mathcal{L}_{\mathring g} W)^{cd}\bigr) + \tfrac{\tau}{n}\, \phi^{-q}\, \mathring g^{cd}$.
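We now want to evaluate the constraint map (with $\Lambda = 0$)

\[ \Phi(g, K) = \bigl( R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2,\ \operatorname{div}_g K - d(\operatorname{tr}_g K) \bigr) \]

for this conformally constituted data. Note that
$|K|_g^2 = \phi^{-4-2q}\, |\sigma + \mathcal{L}_{\mathring g} W|^2_{\mathring g} + \tfrac{\tau^2}{n}$
(the trace-free part is pointwise orthogonal to the pure trace part), so that using
(6.1.8) yields

\[ R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2
 = -\tfrac{4(n-1)}{n-2}\, \phi^{-\frac{n+2}{n-2}} \Bigl( \Delta_{\mathring g}\phi - \tfrac{n-2}{4(n-1)}\, R(\mathring g)\phi \Bigr)
 - \phi^{-4-2q}\, |\sigma + \mathcal{L}_{\mathring g} W|^2_{\mathring g} + \tfrac{n-1}{n}\, \tau^2. \]

Using Corollary 6-41 with $S = \sigma + \mathcal{L}_{\mathring g} W$, we have

\[ \operatorname{div}_g\bigl(\phi^{-2}(\sigma + \mathcal{L}_{\mathring g} W)\bigr) = \phi^{-2-q} \operatorname{div}_{\mathring g}(\sigma + \mathcal{L}_{\mathring g} W) = \phi^{-2-q} \operatorname{div}_{\mathring g}(\mathcal{L}_{\mathring g} W). \tag{6.2.7} \]

Since $\operatorname{div}_g\bigl(\tfrac{\tau}{n}\, g\bigr) = \tfrac{1}{n}\, d\tau$, we see that

\[ \operatorname{div}_g K - d(\operatorname{tr}_g K) = \phi^{-2-q} \operatorname{div}_{\mathring g}(\mathcal{L}_{\mathring g} W) - \tfrac{n-1}{n}\, d\tau. \]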

Remark 6-44. We see from (6.2.7) that the power of $\phi$ scaling $\mathcal{L}_{\mathring g} W$ in $K$ is
chosen so that there are no $d\phi$-terms in $\operatorname{div}_g K$; cf. Lemma 6-40. This gives the
momentum constraint a simple form, which will be exploited below. Strictly
speaking, we are employing Method A: indeed, there is another natural conformal
rescaling (Method B), which gives the resulting formulation a natural conformal
invariance, cf. Exercise 6-61.
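Thus the vacuum constraint equations $\Phi(g, K) = 0$ can be written as follows,
using $q + 1 = \tfrac{n+2}{n-2}$, i.e., $q + 2 = \tfrac{2n}{n-2}$ and $q + 3 = \tfrac{3n-2}{n-2}$:
\[ \Delta_{\mathring g}\phi - \tfrac{1}{q(n-1)}\, R(\mathring g)\phi + \tfrac{1}{q(n-1)}\, |\sigma + \mathcal{L}_{\mathring g} W|^2_{\mathring g}\, \phi^{-q-3} - \tfrac{1}{qn}\, \tau^2 \phi^{q+1} = 0, \tag{6.2.8} \]
\[ \operatorname{div}_{\mathring g}(\mathcal{L}_{\mathring g} W) = \tfrac{n-1}{n}\, \phi^{q+2}\, d\tau. \tag{6.2.9} \]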

The scalar equation (6.2.8) is known as the Lichnerowicz equation.
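
For readers who want to see where the coefficients in (6.2.8) come from, here is a sketch of the algebra. It assumes only the conformal formula for $R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2$ obtained from (6.1.8) and the identity $\tfrac{n-2}{4(n-1)} = \tfrac{1}{q(n-1)}$ for $q = \tfrac{4}{n-2}$:

```latex
% Multiply  R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2 = 0  by  -\tfrac{1}{q(n-1)}\,\phi^{q+1}:
\begin{aligned}
0 &= \Delta_{\mathring g}\phi - \tfrac{1}{q(n-1)}\,R(\mathring g)\,\phi
     + \tfrac{1}{q(n-1)}\,\phi^{q+1}\,\phi^{-4-2q}\,|\sigma+\mathcal{L}_{\mathring g}W|^2_{\mathring g}
     - \tfrac{1}{q(n-1)}\cdot\tfrac{n-1}{n}\,\tau^2\,\phi^{q+1} \\
  &= \Delta_{\mathring g}\phi - \tfrac{1}{q(n-1)}\,R(\mathring g)\,\phi
     + \tfrac{1}{q(n-1)}\,|\sigma+\mathcal{L}_{\mathring g}W|^2_{\mathring g}\,\phi^{-q-3}
     - \tfrac{1}{qn}\,\tau^2\,\phi^{q+1}.
\end{aligned}
% Likewise, multiplying  \operatorname{div}_g K - d(\operatorname{tr}_g K) = 0  by  \phi^{q+2}  gives (6.2.9).
```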


Whereas the constraint equations $\Phi(g, K) = 0$ can be construed as a nonlinear,
underdetermined-elliptic system, the equations above constitute a second-order,
semilinear, elliptic system for (φ, W ). The equations are coupled in general.
There are a number of known results about solving the system under certain
smallness conditions on dτ , which essentially make the coupling weak enough
for iteration schemes to converge to a solution; there are too many results to cite
here, but see [128] for a seminal such result. In fact, we will in the next section
proceed to discuss the case when dτ = 0, i.e., τ is constant (for M connected).
In this case, the system decouples: we can take W = 0 and proceed to study the
Lichnerowicz equation.

6.2.3. The CMC case. We consider a closed manifold M of dimension n ≥ 3.
As noted above, we will discuss the case of conformal data with $\tau$ constant; as
$\tau = \operatorname{tr}_g K$, this is the constant mean curvature (CMC) case of the conformal
method. We will follow the paper of Isenberg [127], which completed the CMC
case, building on works of Lichnerowicz, Choquet-Bruhat, Ó Murchadha and
York. Isenberg’s paper deals with the case n = 3, although the analysis easily
extends to n ≥ 3.
In case $\tau$ is constant, (6.2.9) becomes the linear equation $\operatorname{div}_{\mathring g}(\mathcal{L}_{\mathring g} W) = 0$. We
saw earlier that solutions to this equation are conformal Killing fields $W$, so that
$\mathcal{L}_{\mathring g} W = 0$. Hence $W$ does not in any way affect the solution $K$ or the
Lichnerowicz equation, and we might as well choose $W = 0$ (which might
be the only solution anyway). In the CMC case, we are therefore left with finding a
positive solution $\phi > 0$ to the Lichnerowicz equation (again, $q = \tfrac{4}{n-2}$):

\[ \Delta_{\mathring g}\phi = \tfrac{1}{q(n-1)}\, R(\mathring g)\phi - \tfrac{1}{q(n-1)}\, |\sigma|^2_{\mathring g}\, \phi^{-q-3} + \tfrac{1}{qn}\, \tau^2 \phi^{q+1}. \tag{6.2.10} \]

Remark 6-45. For the analysis it can be useful for $R(\mathring g)$ to have a definite sign,
and to arrange this we will change to another metric in the conformal class.
Bearing this in mind, we point out that while Method B (Remark 6-44 and
Exercise 6-61) enjoys a natural conformal invariance, the Lichnerowicz equation
for the CMC case of Method A also enjoys a conformal invariance property:
namely, given $(\mathring g, \sigma, \tau)$ with $\tau$ constant and $\sigma$ a (0, 2)-tensor, for any function
$\theta > 0$, the Lichnerowicz equation for $(\mathring g, \sigma, \tau)$ admits a solution $\phi > 0$ if and
only if the Lichnerowicz equation for $(\theta^q \mathring g, \theta^{-2}\sigma, \tau)$ admits a solution $\phi\theta^{-1} > 0$,
and these solutions lead to the same data $(g, K)$. Indeed, by Corollary 6-41, $\theta^{-2}\sigma$
is TT for $\theta^q \mathring g$, so that from (6.2.8) and (6.2.9), we see $g = \phi^q \mathring g = (\phi\theta^{-1})^q (\theta^q \mathring g)$
and $K = (\phi\theta^{-1})^{-2} (\theta^{-2}\sigma) + \tfrac{\tau}{n}\, (\phi\theta^{-1})^q (\theta^q \mathring g)$.

So in discussing the solvability of (6.2.10), we are free to change the conformal
data to equivalent data, which will produce the same physical data $(g, K)$, but
may have some useful property, such as constant scalar curvature, or scalar
curvature with a definite sign: see Proposition 6-30 and Theorem 6-31.

6.2.3.1. Main theorem on the CMC conformal method on closed manifolds. We
are now at long last ready to state the main result on the CMC conformal method
on closed manifolds. We will follow [127], using the method of super- and
subsolutions to solve the Lichnerowicz equation (6.2.10). Prior to Isenberg’s
work, fixed-point theorems were generally used to solve this semilinear elliptic
equation, covering all but one case of the CMC conformal method (see below).
Isenberg employed the barrier method (super- and subsolutions), and Maxwell
[151] later gave a streamlined approach with the barrier method (and with lower
regularity of the conformal data, which we do not discuss here).

Theorem 6-46 (Choquet-Bruhat, Ó Murchadha, York, Isenberg). On a closed
connected manifold $M^n$, n ≥ 3, let $(\mathring g, \sigma, \tau)$ be CMC conformal data ($\tau$ constant).
The Lichnerowicz equation (6.2.10) admits a positive solution $\phi$ as indicated in
the table below, where the indication is whether the tensor $\sigma$ is identically zero.
The solution is unique except in the case ($\mathring g \in Y^0$, $\sigma = 0$, $\tau = 0$), in which case
any positive constant is a solution.

              σ = 0    σ = 0    σ ≠ 0    σ ≠ 0
              τ = 0    τ ≠ 0    τ = 0    τ ≠ 0
  g̊ ∈ Y^+     no       no       yes      yes
  g̊ ∈ Y^0     yes      no       no       yes
  g̊ ∈ Y^−     no       yes      no       yes

Proof. We present a fairly complete proof, except for the uniqueness statement,
for which we refer you to [127; 151]. By Remark 6-45, we can move g̊ within a
conformal class without changing the required solvability. In particular, we may
assume without further comment that a representative g̊ of the conformal class
has been chosen so that R(g̊) has a definite sign (or vanishes identically). We
then seek a positive solution to the Lichnerowicz equation (6.2.10).
The nonexistence cases are readily shown. Consider the case $\sigma = 0$, so that
the equation reduces to $\Delta_{\mathring g}\phi = \tfrac{1}{q(n-1)}\, R(\mathring g)\phi + \tfrac{1}{qn}\, \tau^2 \phi^{q+1}$. If $\tau \neq 0$, then since by
Stokes’ theorem $\int_M \Delta_{\mathring g}\phi \, dv_{\mathring g} = 0$ for M closed, we see there can be no positive
solution $\phi$ if $R(\mathring g) \geq 0$. If $\tau = 0$ as well, we get $\Delta_{\mathring g}\phi = \tfrac{1}{q(n-1)}\, R(\mathring g)\phi$, from
which we again see there can be no positive solution $\phi > 0$ when $R(\mathring g) > 0$ or
$R(\mathring g) < 0$. Similar reasoning yields the other two nonexistence cases for $\sigma \neq 0$,
$\tau = 0$ and $\mathring g \notin Y^+$.
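
The integration argument behind these nonexistence cases can be made explicit. As a sketch for the case $\sigma = 0$, $\tau \neq 0$ with a conformal representative satisfying $R(\mathring g) \geq 0$, suppose $\phi > 0$ solved the reduced equation; then

```latex
0 = \int_M \Delta_{\mathring g}\phi \, dv_{\mathring g}
  = \int_M \Bigl( \tfrac{1}{q(n-1)}\,R(\mathring g)\,\phi + \tfrac{1}{qn}\,\tau^2\,\phi^{q+1} \Bigr)\, dv_{\mathring g}
  \;\ge\; \tfrac{\tau^2}{qn} \int_M \phi^{q+1}\, dv_{\mathring g} \;>\; 0,
% a contradiction.  When \tau = 0 and R(\mathring g) has a strict sign, the integrand
% \tfrac{1}{q(n-1)} R(\mathring g)\phi has that same strict sign, and the same contradiction results.
```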
There is one very easy existence case, when $\sigma = 0$, $\tau = 0$ and $\mathring g \in Y^0$, for
which we can assume $R(\mathring g) = 0$. Then the equation reduces to $\Delta_{\mathring g}\phi = 0$, whose
solutions are constants.
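Consider the case $\mathring g \in Y^-$, $\sigma = 0$, $\tau \neq 0$. By (6.1.8) and Theorem 6-31, we
can solve
\[ -\tfrac{4(n-1)}{n-2}\, \phi^{-\frac{n+2}{n-2}}\, L_{\mathring g}\phi = R(\phi^q \mathring g) = -\tfrac{n-1}{n}\, \tau^2 \]
for $\phi > 0$. This equation can easily be rewritten $\Delta_{\mathring g}\phi - \tfrac{1}{q(n-1)}\, R(\mathring g)\phi = \tfrac{1}{qn}\, \tau^2 \phi^{q+1}$,
which is the Lichnerowicz equation in this case.
For the case $\mathring g \in Y^-$, $\sigma \neq 0$, $\tau \neq 0$, we follow [151]. We assume we have applied
Theorem 6-31 to choose a conformal representative $\mathring g$ so that $R(\mathring g) = -\tfrac{n-1}{n}\, \tau^2$.
Hence the Lichnerowicz equation becomes
\[ \Delta_{\mathring g}\phi = -\tfrac{\tau^2}{qn}\, \phi - \tfrac{1}{q(n-1)}\, |\sigma|^2_{\mathring g}\, \phi^{-q-3} + \tfrac{1}{qn}\, \tau^2 \phi^{q+1}. \tag{6.2.11} \]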

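We make another preliminary conformal change: consider the linear operator
$S_{\mathring g}\theta := \Delta_{\mathring g}\theta - \tfrac{n-2}{4n}\, \tau^2 \theta$. $S_{\mathring g}$ is self-adjoint and elliptic, and from the equation
$0 = \int_M w\, S_{\mathring g} w \, dv_{\mathring g}$, we can conclude $w = 0$, so that $S_{\mathring g}$ has trivial kernel. We
can thus consider the solution $\theta$ of
$S_{\mathring g}\theta := \Delta_{\mathring g}\theta - \tfrac{n-2}{4n}\, \tau^2 \theta = -\tfrac{n-2}{4(n-1)}\, |\sigma|^2_{\mathring g}$, which
exists by the Fredholm alternative. By elliptic regularity, the solution $\theta$ is smooth,
for smooth $\mathring g$ and $\sigma$. Furthermore, we observe that $\theta$ cannot have a negative
minimum, so then by the strong maximum principle, $\theta > 0$. Recalling that
$q + 1 = \tfrac{n+2}{n-2}$ and $R(\mathring g) = -\tfrac{n-1}{n}\, \tau^2$, note that
\[ R(\theta^q \mathring g) = -\tfrac{4(n-1)}{n-2}\, \theta^{-\frac{n+2}{n-2}} \Bigl( \Delta_{\mathring g}\theta - \tfrac{n-2}{4(n-1)}\, R(\mathring g)\theta \Bigr)
 = -\tfrac{2(n-1)}{n}\, \tau^2 \theta^{-q} + \theta^{-q-1}\, |\sigma|^2_{\mathring g}. \]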

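We replace $\mathring g$ and $\sigma$ in the Lichnerowicz equation (6.2.10) with $\tilde g = \theta^q \mathring g$ and
$\tilde\sigma = \theta^{-2}\sigma$ (as (0, 2)-tensors), respectively, to obtain
\[ \Delta_{\tilde g}\phi = \tfrac{1}{q(n-1)} \Bigl( -\tfrac{2(n-1)}{n}\, \tau^2 \theta^{-q} + \theta^{-q-1}\, |\sigma|^2_{\mathring g} \Bigr)\phi - \tfrac{1}{q(n-1)}\, |\tilde\sigma|^2_{\tilde g}\, \phi^{-q-3} + \tfrac{1}{qn}\, \tau^2 \phi^{q+1}. \]

Since $|\sigma|^2_{\mathring g} = \theta^{2q+4}\, |\tilde\sigma|^2_{\tilde g}$, this becomes

\[ \Delta_{\tilde g}\phi = -\tfrac{2}{qn}\, \tau^2 \theta^{-q} \phi + \tfrac{1}{q(n-1)}\, \theta^{q+3}\, |\tilde\sigma|^2_{\tilde g}\, \phi - \tfrac{1}{q(n-1)}\, |\tilde\sigma|^2_{\tilde g}\, \phi^{-q-3} + \tfrac{1}{qn}\, \tau^2 \phi^{q+1}, \]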

which we write as $\Delta_{\tilde g}\phi = F(x, \phi)$. We will find constants $0 < \phi_- \leq \phi_+$ to
serve as sub- and supersolutions. Choose $\phi_- > 0$ so that $\phi_-^{q+4} \leq \min_M \theta^{-q-3}$
and $\phi_-^{q} \leq \min_M 2\theta^{-q}$. Then a little algebra shows $F(x, \phi_-) \leq 0$ as desired.
Likewise, choosing $\phi_+$ so that $\phi_+^{q+4} \geq \max_M \theta^{-q-3}$ and $\phi_+^{q} \geq \max_M 2\theta^{-q}$, we
have $\phi_+ \geq \phi_-$, and $F(x, \phi_+) \geq 0$. Thus we can solve the Lichnerowicz equation;
compare the equation we solved to (6.2.11) to see how the preliminary step added
a term that allowed a simple choice of super- and subsolutions.
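
The "little algebra" can be sketched by grouping the four terms of $F$ in pairs. The factorization below is a rearrangement of the right-hand side of the equation for $\Delta_{\tilde g}\phi$ derived above; it is offered as an illustration, not as a formula quoted from [151]:

```latex
F(x,\phi) = \tfrac{1}{q(n-1)}\,|\tilde\sigma|^2_{\tilde g}\,\phi^{-q-3}\bigl(\theta^{q+3}\phi^{q+4} - 1\bigr)
          + \tfrac{1}{qn}\,\tau^2\,\phi\,\bigl(\phi^{q} - 2\theta^{-q}\bigr).
% If \phi^{q+4} \le \theta^{-q-3} and \phi^{q} \le 2\theta^{-q} pointwise, both brackets are \le 0, so F \le 0;
% reversing both inequalities gives F \ge 0.  A constant \phi_\pm satisfies \Delta_{\tilde g}\phi_\pm = 0,
% so these sign conditions are exactly the sub- and supersolution inequalities.
```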
For the three cases with $\lambda_1(\mathring g) \geq 0$ and $\sigma \neq 0$, we again proceed following
Maxwell [151]. We arrange by conformal invariance to work with $\mathring g$ in the
conformal class with $R(\mathring g) > 0$ or $R(\mathring g) = 0$, and note that $R(\mathring g) + \tfrac{n-1}{n}\, \tau^2 > 0$ for
these three cases. We now define an operator $S_{\mathring g}$ analogous to that above:
\[ S_{\mathring g} := \Delta_{\mathring g} - \tfrac{n-2}{4(n-1)} \Bigl( R(\mathring g) + \tfrac{n-1}{n}\, \tau^2 \Bigr) = L_{\mathring g} - \tfrac{n-2}{4n}\, \tau^2. \]

Again $S_{\mathring g}$ is linear, elliptic, self-adjoint, and has trivial kernel. Thus, by the
Fredholm alternative and elliptic regularity, we can solve the following with $u$
smooth:
\[ S_{\mathring g} u = -\tfrac{n-2}{4(n-1)}\, |\sigma|^2_{\mathring g}. \]

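At a minimum point $p$ for $u$, $\Delta_{\mathring g} u|_p \geq 0$, so that by the strict positivity of
$R(\mathring g) + \tfrac{n-1}{n}\, \tau^2$, $u \geq 0$. As we did earlier, we can apply the strong maximum
principle to conclude $u > 0$. We use conformal invariance again, and transform
the conformal data to $\tilde g = u^q \mathring g$, $\tilde\sigma = u^{-2}\sigma$ (as (0, 2)-tensors), and $\tau$. Note that
\[
\begin{aligned}
R(\tilde g) &= -\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}}\, L_{\mathring g} u
 = -\tfrac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} \Bigl( \tfrac{n-2}{4n}\, \tau^2 u - \tfrac{n-2}{4(n-1)}\, |\sigma|^2_{\mathring g} \Bigr) \\
&= -u^{-(q+1)} \Bigl( \tfrac{n-1}{n}\, \tau^2 u - |\sigma|^2_{\mathring g} \Bigr) \\
&= |\tilde\sigma|^2_{\tilde g}\, u^{q+3} - \tfrac{n-1}{n}\, \tau^2 u^{-q}.
\end{aligned}
\]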

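We write the Lichnerowicz equation for $(\tilde g, \tilde\sigma, \tau)$ in the form
\[
\begin{aligned}
\Delta_{\tilde g}\phi &= \tfrac{n-2}{4(n-1)} \Bigl( R(\tilde g)\phi - |\tilde\sigma|^2_{\tilde g}\, \phi^{-q-3} + \tfrac{n-1}{n}\, \tau^2 \phi^{q+1} \Bigr) \\
&= \tfrac{1}{q(n-1)} \Bigl( |\tilde\sigma|^2_{\tilde g}\, (u^{q+3}\phi - \phi^{-q-3}) + \tfrac{n-1}{n}\, \tau^2 \phi\, (\phi^{q} - u^{-q}) \Bigr) =: F(x, \phi).
\end{aligned}
\]
Now we will again find constants $0 < \phi_- \leq \phi_+$ to be sub- and supersolutions.
Choose $\phi_- > 0$ so that $\phi_-^{q+4} \leq \min_M u^{-q-3}$ and $\phi_-^{q} \leq \min_M u^{-q}$. Likewise,
choosing $\phi_+$ so that $\phi_+^{q+4} \geq \max_M u^{-q-3}$ and $\phi_+^{q} \geq \max_M u^{-q}$, we have $\phi_+ \geq \phi_-$,
and a little algebra shows $F(x, \phi_-) \leq 0$ and $F(x, \phi_+) \geq 0$ as before. Thus we
can solve the Lichnerowicz equation. □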
Remark 6-47. Isenberg [127] used the resolution of the Yamabe problem to
handle some cases in Y + , in particular the case when σ ̸= 0 but τ = 0, which
was the one case not handled by previous works using fixed-point methods. We
briefly note Isenberg’s approach to this case. For simplicity, we work (as he
did) in case $n = 3$, so that $q = 4$. We use the resolution of the Yamabe problem
to pick a conformal representative $\mathring g$ with $R(\mathring g) = 8$. Then the Lichnerowicz
equation can be written $\Delta_{\mathring g}\phi = \phi - \tfrac18\, |\sigma|^2_{\mathring g}\, \phi^{-7} =: F(x, \phi)$. If $\sigma$ never vanishes,
then appropriate sub- and supersolutions are given by
\[ \phi_- := \sqrt[8]{\tfrac18 \min_M |\sigma|^2_{\mathring g}}, \qquad \phi_+ := \sqrt[8]{\tfrac18 \max_M |\sigma|^2_{\mathring g}}. \]
For the general case, however, one has to be more clever to pick the subsolution.
Following [127], define $\phi_+ = \max\bigl\{1, \tfrac18 \max_M |\sigma|^2_{\mathring g}\bigr\} \geq 1$. It is easy to check
$\phi_+$ is a supersolution. Just as in cases above, the operator $S_{\mathring g} = \Delta_{\mathring g} - 1$ is
self-adjoint elliptic with trivial kernel, so we can let $\phi_-$ be the solution to
$S_{\mathring g}(\phi_-) = -\tfrac18\, \phi_+^{-7}\, |\sigma|^2_{\mathring g}$. Again, we see $\phi_-$ cannot have a negative minimum, so
$\phi_- > 0$ by the strong maximum principle. Furthermore, since $\tfrac18\, \phi_+^{-7}\, |\sigma|^2_{\mathring g} \leq \phi_+$,
we have $(\Delta_{\mathring g} - 1)(\phi_-) = S_{\mathring g}(\phi_-) \geq -\phi_+$, from which we see that the maximum of
$\phi_-$ cannot be greater than $\phi_+$, i.e., $\phi_- \leq \phi_+$. Finally,
$\Delta_{\mathring g}\phi_- = \phi_- - \tfrac18\, \phi_+^{-7}\, |\sigma|^2_{\mathring g} \geq \phi_- - \tfrac18\, \phi_-^{-7}\, |\sigma|^2_{\mathring g} = F(x, \phi_-)$, as desired.
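
The supersolution check for $\phi_+ = \max\{1, \tfrac18 \max_M |\sigma|^2_{\mathring g}\}$, described above as easy, can be sketched as follows. It uses only $\phi_+ \geq 1$ and $\phi_+ \geq \tfrac18 \max_M |\sigma|^2_{\mathring g}$, together with the fact that a constant has vanishing Laplacian:

```latex
\phi_+^{8} \;\ge\; \phi_+ \;\ge\; \tfrac18 \max_M |\sigma|^2_{\mathring g}
\quad\Longrightarrow\quad
F(x,\phi_+) = \phi_+^{-7}\bigl(\phi_+^{8} - \tfrac18\,|\sigma|^2_{\mathring g}\bigr) \;\ge\; 0 = \Delta_{\mathring g}\phi_+ .
```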

6.3. Scalar curvature deformation on closed manifolds

In this section, we discuss two results on the image of the scalar curvature map on
closed manifolds. We will develop in detail how to prescribe perturbations of the
scalar curvature function on a closed manifold M using inverse function theorem
methods, following Fischer and Marsden [89]. First we note the following
theorem of Kazdan and Warner [135; 134] on the range of the scalar curvature
map.

6.3.1. The Kazdan–Warner classification.


Theorem 6-48 (Kazdan–Warner). Suppose M n , n ≥ 3 is a closed connected
manifold. There are three possibilities for the set S of scalar curvatures of smooth
Riemannian metrics on M. Indeed, a function f ∈ C ∞ (M) is a scalar curvature
of a smooth Riemannian metric on M according to one of the following three
cases:
(i) S = C ∞ (M).
(ii) f ∈ S if and only if { f < 0} ̸= ∅.
(iii) f ∈ S if and only if either { f < 0} ̸= ∅ or f is identically zero.
From this theorem, we immediately see that M admits a metric with positive
scalar curvature (PSC) if and only if it can carry any smooth function as a scalar
curvature for some metric. For an example of case (iii) above, the torus $T^n$ admits
a metric with zero scalar curvature (in fact a flat metric), but no metrics with
positive scalar curvature: see Theorem 6-56 (cf. [198]) for the three-dimensional
case, and for general dimension see [200; 205] for the Schoen–Yau minimal
hypersurface approach, or see [139, Theorem IV.5.5] for the Gromov–Lawson
approach via the Dirac operator. For case (ii), an example is provided by $T^n \# T^n$,
on which, as with the torus, there is no metric of positive scalar curvature, and
so by Corollary 6-53, any scalar-flat metric would be Ricci flat; however, by
Bochner’s theorem (see [182], for instance), a closed n-manifold with nonnegative
Ricci curvature has first Betti number at most n, whereas $T^n \# T^n$ has first Betti
number 2n, so it carries no scalar-flat metric.
The analogous result in dimension two also holds [135; 134]; in this case, the
three possibilities are governed by the Euler characteristic χ(M), consistent with
the Gauss–Bonnet formula.
Theorem 6-49 (Kazdan–Warner). Suppose M is a closed connected surface.
There are three possibilities for the set S of scalar curvatures of smooth Riemann-
ian metrics on M. Indeed, a function f ∈ C ∞ (M) is a scalar curvature of a
smooth Riemannian metric on M according to one of the following three cases:
(i) f ∈ S if and only if { f > 0} ̸= ∅, in case χ(M) > 0.
(ii) f ∈ S if and only if either f changes sign or f is identically zero, in case
χ (M) = 0.
(iii) f ∈ S if and only if { f < 0} ̸= ∅, in case χ(M) < 0.

6.3.2. The Fischer–Marsden theorem. A fruitful technique for studying the
scalar curvature on a manifold is to fix a metric $g$, and then consider metrics
conformal to $g$, i.e., to consider the scalar curvature functions across the confor-
mal class of $g$, as we have seen earlier. For our purposes here, we will consider
the full range of metric deformations, and consider the scalar curvature map
on the space of all metrics. For this we restate Lemma 2-7, recalling that a dot
between tensor fields denotes the metric contraction of their tensor product, e.g.,
$h \cdot k = g^{i\ell} g^{jm} h_{ij} k_{\ell m}$.
Lemma 6-50 (linearization of the scalar curvature). Let $h$ be a symmetric (0, 2)-
tensor such that, for some $\varepsilon > 0$, $g + th$ is a metric on M for $|t| < \varepsilon$. Then
\[ L_g h := \frac{d}{dt}\Big|_{t=0} R(g + th) = -\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g \operatorname{div}_g h - h \cdot \operatorname{Ric}(g). \]
In case M is compact, then any sufficiently smooth h can be used in the above
formula; if M were noncompact, the formula could be interpreted locally, or
additional conditions might be imposed on h (such as compact support) so that
g + th is a metric for small |t|.
We also recall equation (2.4.5): we can use integration by parts to define the
formal $L^2$-adjoint $L_g^* f$, which satisfies $\int_M h \cdot L_g^* f \, dv_g = \int_M f\, L_g h \, dv_g$ for all
$h$ compactly supported in the interior of M, so that
\[ L_g^* f = -(\Delta_g f)\, g + \operatorname{Hess}_g f - f \operatorname{Ric}(g). \]
Note that $L_g L_g^*$ is a fourth-order elliptic operator, with principal part $(n-1)\Delta_g^2 f$,
as we will show later in this section (you might try it now as an easy exercise).
We have seen the operator $L_g^*$ before, when we introduced static vacuum
metrics in Section 2.4.5. Such metrics are precisely those for which the kernel
of $L_g^*$ is nontrivial. We know from Corollary 2-45 (or by Exercise 2-52a), if
(M, g) is connected and static vacuum, then M has constant scalar curvature.
On a closed manifold we can say more, as in the following result from [89].
Proposition 6-51. Suppose $(M^n, g)$ (n ≥ 2) is a closed, connected Riemannian
manifold, for which there is a nontrivial function $f : M \to \mathbb{R}$ with $L_g^* f = 0$.
Then the scalar curvature is a nonnegative constant, the function $f$ is smooth
and is an eigenfunction of the Laplacian $\Delta_g$. In case $R(g) = 0$, we have that
$\operatorname{Ric}(g) = 0$.
Proof. We have already recalled that the scalar curvature is constant. Tracing the
equation $L_g^* f = 0$ we obtain $-(n-1)\Delta_g f - R(g) f = 0$, which shows $f$ is an
eigenfunction with eigenvalue $\tfrac{R(g)}{n-1}$, which must be nonnegative by Exercise 1-11e.
In case $R(g) = 0$, then, $\Delta_g f = 0$, so that $f$ is a constant, which by scaling we
can take to be $f = 1$. Since $L_g^*(1) = -\operatorname{Ric}(g)$, the claim follows. □
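
For completeness, the trace computation used in the proof can be written out. This is a sketch using only the formula $L_g^* f = -(\Delta_g f)\, g + \operatorname{Hess}_g f - f \operatorname{Ric}(g)$ recalled earlier in this section, together with $\operatorname{tr}_g g = n$ and $\operatorname{tr}_g \operatorname{Hess}_g f = \Delta_g f$:

```latex
\begin{aligned}
\operatorname{tr}_g(L_g^* f)
  &= -n\,\Delta_g f + \Delta_g f - f\,R(g) = -(n-1)\,\Delta_g f - R(g)\,f, \\
L_g^*(1) &= -(\Delta_g 1)\,g + \operatorname{Hess}_g 1 - \operatorname{Ric}(g) = -\operatorname{Ric}(g).
\end{aligned}
```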
We stress that being static vacuum is a very special condition on a metric g,
and that most metrics are not static. Careful formulations of this and related
results about the existence of Killing initial data (analogous kernel elements in
the case of the Einstein constraints, cf. Section 5.3.3.2) are established in [19].
We now state a scalar curvature deformation result for metrics which are not
static vacuum [89]. For a nonnegative integer $k$, we let $\|\cdot\|_k$ be the norm for
either a Sobolev space $W^{k,p}(M)$ for some $p \in (1, \infty)$, or for a Hölder space
$C^{k,\alpha}(M)$ for some $\alpha \in (0, 1)$. For the Sobolev case of the next result, we use
$k + 2 > \tfrac{n}{p}$ for the existence, and while we take $k > \tfrac{n}{p}$ for higher regularity for
convenience, the interested reader can check that $k + 1 > \tfrac{n}{p}$ suffices. We recall
that we take $g$ to be smooth.
Theorem 6-52 (Fischer–Marsden). Suppose $(M^n, g)$, for n ≥ 2, is a closed
Riemannian manifold for which $L_g^*$ has trivial kernel. There is an $\varepsilon > 0$ and $C > 0$
such that for all $S \in C^\infty(M)$ with $\|S\|_k < \varepsilon$, there is a smooth symmetric (0, 2)-
tensor field $h$ such that $g + h$ is a smooth Riemannian metric with $R(g + h) =
R(g) + S$, and $\|h\|_{k+2} \leq C\|S\|_k$. In fact, there is such an $h$ of the form $h = L_g^* f$
for some $f \in C^\infty(M)$ with $\|f\|_{k+4} \leq C\|S\|_k$.
The proof uses the implicit function theorem along with some elliptic PDE
theory. We sketch the main ideas of the proof below. We first observe the
following direct corollary of Theorem 6-52 and Proposition 6-51, attributed to

J.-P. Bourguignon [136], which we will utilize in the discussion of positive scalar
curvature and the positive energy theorem.
Corollary 6-53. Suppose (M, g) is a closed connected Riemannian manifold
with nonnegative scalar curvature. Either Ric(g) = 0 or M admits a metric with
positive scalar curvature.
As we noted above, the torus $T^n$ does not admit a metric with positive scalar
curvature, so that any zero scalar curvature metric on $T^n$ must be Ricci-flat. It is
not hard to show from here that the metric must actually be flat; see, e.g., [182,
Chapters 7 and 9]. Cf. [91] and [139, Chapter IV, Section 5] for related results.

6.3.3. Outline of the proof of Theorem 6-52.


6.3.3.1. Elliptic splitting/Hodge decompositions. Let $X^k$ denote either function
space $W^{k,p}(M)$ or $C^{k,\alpha}(M)$ (for $k$ a nonnegative integer, $1 < p < \infty$, and
$\alpha \in (0, 1)$). As we saw in Section 6.1, we can use the Fredholm alternative/Hodge
decomposition for the elliptic operator $L_g L_g^* : X^{k+4} \to X^k$ to split the relevant
function space $X^k$ as
\[ X^k = \operatorname{ran}(L_g L_g^*) \oplus \ker(L_g L_g^*), \]
where $\operatorname{ran}(L_g L_g^*) = (L_g L_g^*)(X^{k+4})$. Since $g$ is smooth, elliptic regularity implies
that this decomposition holds in the smooth setting. We observe that $\ker(L_g L_g^*) =
\ker L_g^*$, because multiplying the equation $L_g L_g^* u = 0$ by $u$ and integrating by
parts, we see $\int_M |L_g^* u|^2 \, dv_g = 0$. Then, since $\operatorname{ran} L_g$ is $L^2(dv_g)$-orthogonal to
$\ker L_g^*$, the splitting becomes
\[ X^k = \operatorname{ran}(L_g L_g^*) \oplus \ker L_g^* = \operatorname{ran} L_g \oplus \ker L_g^*. \tag{6.3.1} \]


We note that the kernel is finite-dimensional, and the range is closed.
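
The integration by parts behind the identity $\ker(L_g L_g^*) = \ker L_g^*$ is short enough to record. As a sketch, for $u$ in the kernel of $L_g L_g^*$ (smooth by elliptic regularity) on the closed manifold M, the defining property of the formal adjoint applied with $h = L_g^* u$ and $f = u$ gives:

```latex
0 = \int_M u \, L_g\bigl(L_g^* u\bigr)\, dv_g
  = \int_M \bigl(L_g^* u\bigr)\cdot\bigl(L_g^* u\bigr)\, dv_g
  = \int_M |L_g^* u|^2_g \, dv_g
\quad\Longrightarrow\quad L_g^* u = 0 .
```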
We can also split the corresponding function spaces $Y^k$, $k \geq 2$, of symmetric
(0, 2)-tensors (of class $W^{k,p}(M)$ or $C^{k,\alpha}(M)$) as (cf. [20], [22, Appendix I])
$Y^k = \operatorname{ran} L_g^* \oplus \ker L_g$, or more precisely,
\[ Y^k = L_g^*(X^{k+2}) \oplus \{ h_0 \in Y^k : L_g h_0 = 0 \}. \tag{6.3.2} \]
We note that whereas $\ker L_g^* = \ker(L_g L_g^*)$ is the kernel of an elliptic operator,
and thus consists of smooth sections and is independent of $k$,
the same may not be true of the distributional kernel of $L_g$.
To obtain the splitting of $Y^k$, for $h \in Y^k$, we have $L_g h \in X^{k-2}$. As we saw
with the earlier splitting of $X^{k-2}$, then, there is $f \in X^{k+2}$ such that $L_g h =
L_g L_g^* f$. Thus $h_0 := h - L_g^* f \in Y^k \cap \ker L_g$. Thus we have an algebraic
sum $Y^k = L_g^*(X^{k+2}) \oplus \{ h_0 \in Y^k : L_g h_0 = 0 \}$, which is direct because the
summands are $L^2(dv_g)$-orthogonal. Since $L^2(dv_g)$-orthogonality is preserved
under convergence in $Y^k$, we see that $L_g^*(X^{k+2})$ is closed; as the kernel of $L_g$
in $Y^k$ is closed, the direct sum is topological as well.
6.3.3.2. Local surjectivity: the implicit function theorem. From the splitting
(6.3.1), we see that to prove Theorem 6-52, we can basically invoke the implicit
function theorem [1], once we observe that the scalar curvature map is smooth in
the appropriate spaces. Let Y k and X k be as above; as noted earlier, the norms
can be defined with respect to a smooth background metric g̊, or via a suitable
covering. Let $\mathcal{M}^k$ denote either the set $\mathcal{M}^{k,p}$ of Riemannian metrics in $W^{k,p}(M)$
or the set $\mathcal{M}^{k,\alpha}$ of Riemannian metrics in $C^{k,\alpha}(M)$. By Sobolev embedding, for
$k > \tfrac{n}{p}$ and $0 < \alpha < \min\{1, k - \tfrac{n}{p}\}$, the map $W^{k,p}(M) \hookrightarrow C^{0,\alpha}(M)$ is a continuous
inclusion, and hence $\mathcal{M}^{k,p}$ is open in $W^{k,p}(M)$ and included in $\mathcal{M}^{0,\alpha}$.


In the Hölder setting with $k \geq 2$, one readily checks that the scalar curvature
map $R : \mathcal{M}^{k,\alpha} \to C^{k-2,\alpha}(M)$ is a smooth map of Banach manifolds. We have
the following proposition for the Sobolev setting.
Proposition 6-54. Assume $s \geq 2$ and $s > \tfrac{n}{p}$. Then the scalar curvature map
$R : \mathcal{M}^{s,p} \to W^{s-2,p}(M)$ is a smooth map of Banach manifolds.
Sketch of proof. In local coordinates, the scalar curvature is given by
\[ R(g) = g^{jk} \bigl( \Gamma^\ell_{jk,\ell} - \Gamma^\ell_{\ell k,j} + \Gamma^m_{jk} \Gamma^\ell_{\ell m} - \Gamma^m_{\ell k} \Gamma^\ell_{jm} \bigr). \]


The expression is polynomial in the derivatives of the metric components, with
coefficients that are rational functions in $g_{ij}$; in fact, the second derivatives of
metric components appear linearly, and the first derivatives appear quadratically;
the coefficients are rational, due to the presence of $(\det g)^{-1}$ in the inverse metric
and its derivative. In the Hölder regularity setting, smoothness of $g \mapsto R(g)$
is immediate. In the Sobolev setting with $s > \tfrac{n}{p}$, by judicious use of Sobolev
embedding, one proves that $W^{s,p}(M)$ is a ring under function multiplication.
The proof of this fact in [2, Theorem 5.23] can be readily adapted to show that
$(\det g)^{-1} \in W^{s,p}(M)$ and that the following multiplication maps are continuous:
\[ W^{s-1,p}(M) \times W^{s-1,p}(M) \to W^{s-2,p}(M), \qquad W^{s,p}(M) \times W^{s-2,p}(M) \to W^{s-2,p}(M). \]
This allows us to estimate the second derivative and the quadratic first derivative
terms in the local expression for the scalar curvature map, and smoothness
follows.
To illustrate, we consider the case $s = 2$. If $p > n$, then $W^{1,p}(M) \hookrightarrow C^0(M)$
by Sobolev embedding. If $1 \le p < n$, the Sobolev embedding takes the form of a
continuous inclusion $W^{1,p}(M) \hookrightarrow L^r(M)$ for any $1 \le r \le p^* = \frac{np}{n-p}$, while in the
borderline case $p = n$, $W^{1,n}(M) \hookrightarrow L^r(M)$ for $1 \le r < \infty$ (see Section 7.2.2.1;
cf. [2; 86; 107; 144], for more on Sobolev embedding). Now if $n > p > \frac{n}{2}$,
then $\frac{n}{n-p} > 2$. Thus we can choose $2 \le s_1 < \frac{n}{n-p}$, so that $1 < s_2 < \frac{n}{n-p}$, with
$s_2 = \frac{s_1}{s_1 - 1}$. Using Hölder's inequality, we obtain, for $u, v \in W^{1,p}(M)$,
\[
\int_M |u|^p |v|^p \, dv_{\mathring{g}} \le \Big( \int_M |u|^{s_1 p} \, dv_{\mathring{g}} \Big)^{1/s_1} \Big( \int_M |v|^{s_2 p} \, dv_{\mathring{g}} \Big)^{1/s_2}.
\]
The Sobolev embedding $W^{1,p}(M) \hookrightarrow L^{s_i p}(M)$ then allows us to estimate the
product $\|uv\|_{L^p} \le \|u\|_{L^{s_1 p}} \|v\|_{L^{s_2 p}} \le C \|u\|_{W^{1,p}} \|v\|_{W^{1,p}}$. Note that we could have
taken $s_1 = 2 = s_2$ above. The same argument works in the borderline case, or,
using the product rule, we could show that $\|uv\|_{W^{1,n/2}} \le C \|u\|_{W^{1,n}} \|v\|_{W^{1,n}}$, and
use the continuous embedding $W^{1,n/2}(M) \hookrightarrow L^{(n/2)^*}(M) = L^n(M)$.
Thus in all cases for $2 > n/p$, multiplication is a continuous bilinear mapping
$W^{1,p}(M) \times W^{1,p}(M) \to L^p(M)$, and likewise for $W^{2,p}(M) \times L^p(M) \to L^p(M)$.
That $W^{2,p}(M)$ is a ring, with continuous multiplication, follows by applying
the product rule. □
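As a concrete check of the exponent bookkeeping in the case $s = 2$, the following sketch works out one admissible choice. The values $n = 3$, $p = 2$ are an illustrative assumption; they are not singled out in the text and are chosen only because they satisfy $n > p > n/2$.

```latex
% Illustrative assumption: n = 3, p = 2, so that n > p > n/2.
% Then n/(n-p) = 3 > 2 and p^* = np/(n-p) = 6.
% Take s_1 = s_2 = 2 (conjugate exponents), so that s_i p = 4 \le p^* = 6.
\begin{align*}
\int_M |u|^2 |v|^2 \, dv_{\mathring g}
  &\le \Big( \int_M |u|^4 \, dv_{\mathring g} \Big)^{1/2}
       \Big( \int_M |v|^4 \, dv_{\mathring g} \Big)^{1/2}
  && \text{(H\"older with } s_1 = s_2 = 2\text{)} \\
\|uv\|_{L^2}
  &\le \|u\|_{L^4} \|v\|_{L^4}
   \le C \|u\|_{W^{1,2}} \|v\|_{W^{1,2}}
  && \text{(Sobolev: } W^{1,2}(M) \hookrightarrow L^4(M)\text{)}
\end{align*}
```

Any other pair of conjugate exponents with $s_1 p \le p^*$ and $s_2 p \le p^*$ would serve equally well; the symmetric choice simply reduces Hölder's inequality to Cauchy–Schwarz.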
Exercise 6-55. Consider the case $s = 3 > n/p$ of the above argument. Prove
that multiplication gives continuous maps $W^{3,p}(M) \times W^{3,p}(M) \to W^{3,p}(M)$,
$W^{2,p}(M) \times W^{2,p}(M) \to W^{1,p}(M)$ and $W^{3,p}(M) \times W^{1,p}(M) \to W^{1,p}(M)$.
In fact, observe that this can quickly be reduced to showing multiplication gives
a continuous mapping $W^{2,p}(M) \times W^{1,p}(M) \to W^{1,p}(M)$, and prove this using
Sobolev embedding in a similar manner as above.
To summarize, if $L_g^*$ has trivial kernel, as in the setting of Theorem 6-52,
then $L_g : Y^{k+2} \to X^k$ is surjective, by (6.3.1). The implicit function theorem
[1, Section 2.5] now implies that the map $R : \mathcal{M}^{k+2} \to X^k$ is a surjection
from a neighborhood of $g$ in $\mathcal{M}^{k+2}$ to a neighborhood of $R(g)$ in $X^k$. That is,
there is $\varepsilon > 0$ and $C > 0$ such that, given a function $S \in X^k$ with $\|S\|_k < \varepsilon$,
there is an $h \in Y^{k+2}$ with $\|h\|_{k+2} \le C \|S\|_k$ and such that $g + h \in \mathcal{M}^{k+2}$ with
$R(g + h) = R(g) + S$. In fact, by the splitting (6.3.2), $h$ can be chosen in the
form $h = L_g^* f$ for $f \in X^{k+4}$, and the estimate $\|f\|_{k+4} \le C \|S\|_k$ (adjusting $C$ if
necessary) then follows from Proposition 6-15 with $L = L_g L_g^*$ (or with $L = L_g^*$
using also Remark 6-22). This gives a localized deformation result in the finite
regularity spaces. To complete the proof, we next consider higher regularity.
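The linear step behind this statement can be summarized as follows. This is a sketch of the standard right-inverse construction, not a quotation from the text; the symbols $T$, $h_0$ and $f_0$ are introduced here only for illustration, and the invertibility of $L_g L_g^*$ is assumed to follow from $\ker L_g^* = \{0\}$ on the closed manifold.

```latex
% Sketch, assuming \ker L_g^* = \{0\}, so that L_g L_g^* : X^{k+4} \to X^k
% is invertible. T, h_0, f_0 are names introduced here for illustration.
\begin{align*}
T &:= L_g^* \, (L_g L_g^*)^{-1} : X^k \to Y^{k+2}, \\
L_g (T S) &= (L_g L_g^*)(L_g L_g^*)^{-1} S = S
  \qquad \text{for all } S \in X^k .
\end{align*}
% Thus h_0 = T S = L_g^* f_0, with f_0 = (L_g L_g^*)^{-1} S \in X^{k+4},
% solves the linearized equation L_g h_0 = S. The implicit function theorem
% then corrects h_0 to a solution h = L_g^* f of the nonlinear equation
% R(g + h) = R(g) + S, provided \|S\|_k is small.
```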
6.3.3.3. Regularity. In the setting of the preceding paragraph, we consider a
smooth metric $g$ with $\ker L_g^* = \{0\}$, a smooth $S$ with $\|S\|_k < \varepsilon$, and $f \in X^{k+4}$
with $R(g + L_g^* f) - R(g) = S$ and $\|f\|_{k+4} \le C \|S\|_k$. To motivate why $f$ should
be smooth, for small $S$ (and hence for small $f$), the left-hand side is roughly
$L_g L_g^* f$, which is an elliptic operator of order four on $f$, with smooth coefficients.
We note that if $\sim$ denotes equality up to terms in $f$ and its derivatives up to order
three, we have
\begin{align*}
L_g L_g^* f &\sim -\Delta_g\big( -(n-1)\Delta_g f \big) + \operatorname{div}_g \operatorname{div}_g\big( -(\Delta_g f) g + \operatorname{Hess}_g f \big) \\
&\sim (n-1)\Delta_g^2 f - g^{ij} f_{;ijk\ell}\, g_{rs} g^{kr} g^{\ell s} + f_{;ijk\ell}\, g^{ik} g^{j\ell} \\
&= (n-1)\Delta_g^2 f - g^{ij} f_{;ijk\ell}\, g^{\ell k} + f_{;ijk\ell}\, g^{ik} g^{j\ell} \\
&\sim (n-1)\Delta_g^2 f,
\end{align*}
where we commuted derivatives to cancel out the last two terms, either replacing
the derivatives with partial derivatives (up to $\sim$), or keeping the covariant
derivatives and using the Ricci identity $\alpha_{i;jk} - \alpha_{i;kj} = \alpha_\ell R^{\ell}_{jki}$ applied to $\alpha = df$;
in either case, the principal order term in $L_g L_g^*$ is $(n-1)\Delta_g^2$. We note that the
symbol of this operator is $(n-1)|\xi|_g^4$.
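The principal symbol can also be read off algebraically. The sketch below assumes the convention that the symbol is obtained by replacing each derivative $\partial_i$ with $\xi_i$ in the top-order terms (other references insert signs or factors of $i$), and it uses only the principal parts $L_g h \sim -\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g \operatorname{div}_g h$ and $L_g^* f \sim -(\Delta_g f) g + \operatorname{Hess}_g f$ that appear in the computation above.

```latex
% Symbol convention assumed here: \partial_i \mapsto \xi_i on top-order terms.
\begin{align*}
\sigma(L_g^*)(\xi) f &= \big( -|\xi|_g^2 \, g + \xi \otimes \xi \big) f =: h, \\
\operatorname{tr}_g h &= \big( -n|\xi|_g^2 + |\xi|_g^2 \big) f = -(n-1)|\xi|_g^2 f, \qquad
h(\xi^\sharp, \xi^\sharp) = \big( -|\xi|_g^4 + |\xi|_g^4 \big) f = 0, \\
\sigma(L_g)(\xi) h &= -|\xi|_g^2 \operatorname{tr}_g h + h(\xi^\sharp, \xi^\sharp)
  = (n-1)|\xi|_g^4 f .
\end{align*}
```

Under this convention the composite symbol is nonzero for $\xi \neq 0$ when $n \ge 2$, which is the ellipticity used in the regularity argument.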
With this in mind, to prove that $f$ is smooth, we write the left-hand side of
$R(g + L_g^* f) - R(g) = S$ as a quasilinear operator on $f$. Indeed, if we let $\gamma$ be
the metric $\gamma = g + L_g^* f$, then with a short computation we leave as an exercise,
\[
R(g + L_g^* f) - R(g) = \gamma^{i\ell} \gamma^{jk} g^{ab} \big( -g_{j\ell} f_{,abik} + g_{jk} f_{,abi\ell} \big) + \mathcal{S}(g, f),
\]
where $\mathcal{S}(g, f)$ depends smoothly on $(g, f)$ and involves derivatives of $f$ up to
third order. One readily observes that if $\gamma = g$, and the derivatives are covariant,
the leading order term is $(n-1)\Delta_g^2 f$. For $\varepsilon > 0$ small, $\|L_g^* f\|_{k+2} \le C\varepsilon$,
so that (by Sobolev embedding, if needed), $\|L_g^* f\|_{C^0}$ is small; thus $\gamma$ and $g$ are
sufficiently close so that the leading order term of $R(g + L_g^* f) - R(g)$ is elliptic as
a fourth-order operator, where the coefficients depend on lower-order derivatives
of $f$. We can treat the operator as linear in $f$, with coefficients depending
on $f$. It follows that $f$ is smooth: for some $0 < \alpha < 1$, $f \in C^{4,\alpha}(M)$ (by
Sobolev embedding, if needed). Then $L_g^* f \in C^{2,\alpha}(M)$, and $\mathcal{S}(g, f) \in C^{1,\alpha}(M)$.
Elliptic regularity on the fourth-order equation implies $f \in C^{5,\alpha}(M)$; hence
the coefficients of the principal term as well as $\mathcal{S}(g, f)$ now gain a degree of
regularity. We can thus bootstrap our way up to $f \in C^\infty(M)$; of course, if $S$
were less smooth, this process would terminate at a finite regularity stage. (If
allowing $g$ to be less smooth, one must be especially careful, since derivatives
of $g$ appear in the operators $L_g$ and $L_g^*$.)
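The bootstrap can be laid out schematically as below. This is only a sketch under the standing assumptions that $g$ and $S$ are smooth; the symbol $P_\gamma$ for the fourth-order principal part, and the name $\mathcal{S}(g,f)$ for the lower-order remainder (a function of $g$, $f$ and derivatives of $f$ up to third order), are notation chosen here for illustration.

```latex
% Schematic bootstrap (j \ge 4 is the current regularity index).
% Write the equation as P_\gamma f = S - \mathcal{S}(g,f), where P_\gamma is
% the fourth-order principal part with coefficients built from
% \gamma = g + L_g^* f. (P_\gamma is notation introduced here.)
\begin{align*}
f \in C^{j,\alpha}(M)
 &\;\Longrightarrow\; \gamma = g + L_g^* f \in C^{j-2,\alpha}(M), \quad
   \mathcal{S}(g,f) \in C^{j-3,\alpha}(M) \\
 &\;\Longrightarrow\; P_\gamma f = S - \mathcal{S}(g,f) \in C^{j-3,\alpha}(M),
   \text{ with coefficients of } P_\gamma \text{ in } C^{j-3,\alpha}(M) \\
 &\;\Longrightarrow\; f \in C^{j+1,\alpha}(M)
   \quad \text{(elliptic regularity, order four).}
\end{align*}
% Starting from j = 4 and iterating gives f \in C^{j,\alpha}(M) for every j.
```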

6.3.4. A topological obstruction to positive scalar curvature. The theorems of


Bonnet–Myers and of Bochner give classical topological obstructions to mani-
folds admitting Riemannian metrics with positive Ricci curvature; information
on the sectional or Ricci curvature can give some control on the behavior of
geodesics. While from Exercise 2-56 we see how the scalar curvature governs

the leading-order deviation of the volumes of small geodesic balls from their
Euclidean counterparts (see also [142, Theorem 1.16] for a comparison result on
the conjugate radius), its influence on the geometry, and certainly the topology,
seems less direct. Thus it was a breakthrough when Schoen and Yau established
a celebrated fundamental group obstruction to positive scalar curvature [198],
which we will turn to presently. At the time of their work, the only known
topological restrictions on positive scalar curvature involved harmonic spinors,
from the Dirac operator on spin manifolds, dating from works of Lichnerowicz
and Hitchin (as discussed in [139], where one can also explore Gromov and
Lawson’s development of this line of research). Of interest for us here will be
the connection made by Schoen and Yau in relating the positivity of the mass
of an asymptotically flat metric of nonnegative scalar curvature to topological
obstructions to positive scalar curvature (PSC), a topic we will explore in the
next chapter. For now we state one of the main theorems from [198].
Theorem 6-56 (Schoen–Yau). Suppose $M$ is a closed, connected, orientable
three-manifold. Suppose furthermore that either $\pi_1(M)$ contains a finitely
generated non-cyclic abelian subgroup, or that $\pi_1(M)$ contains a subgroup
abstractly isomorphic to the fundamental group of a closed orientable surface of
positive genus. Then $M$ admits no metric with positive scalar curvature, and any
metric on $M$ having nonnegative scalar curvature is flat.
We discuss some elements of the proof. If $\Sigma$ is a smooth, closed two-sided
minimally immersed hypersurface in $(M, g)$ with unit normal $\nu$ along the
immersion, and second fundamental form $A = A^\nu$, then by (X.14) on p. 224, the
second variation of area for variation field $V = \phi\nu$ is
\[
-\int_\Sigma \phi L_\Sigma \phi \, d\sigma = \int_\Sigma \Big( |\nabla^\Sigma \phi|^2 - \big( |A|^2 + \mathrm{Ric}_g(\nu, \nu) \big) \phi^2 \Big) \, d\sigma, \tag{6.3.3}
\]
where $L_\Sigma \phi = \Delta_\Sigma \phi + (|A|^2 + \mathrm{Ric}_g(\nu, \nu))\phi$ is the Jacobi operator on $\Sigma$, with $\Delta_\Sigma$,
$\nabla^\Sigma$, $|\cdot|$ and $d\sigma$ induced on $\Sigma$ from $(M, g)$. A minimal surface $\Sigma$ is called stable
if the second variation of area is nonnegative for all variations, which translates
into the stability inequality $-\int_\Sigma \phi L_\Sigma \phi \, d\sigma \ge 0$, for all sufficiently smooth $\phi$
on $\Sigma$. One of the key components of the proof of Theorem 6-56 is the following
beautiful observation [198], cf. [106], using the stability inequality.
Proposition 6-57. Let $(M, g)$ be a closed, oriented Riemannian three-manifold
with positive scalar curvature. Then $(M, g)$ admits no stable minimal immersion
from a closed orientable surface $\Sigma$ with positive genus.
Proof. Suppose $\Sigma$ is a closed oriented surface, with a stable minimal immersion
to $M$. Choose a local orthonormal frame $\{E_1, E_2, E_3\}$ adapted to $\Sigma$, with $E_1$
and $E_2$ tangential, and $E_3 = \nu$, the oriented unit normal to $\Sigma$. For $i, j = 1, 2$, let
$A_{ij} = g(\nabla_{E_i} E_j, E_3)$ be the components of the second fundamental form in the
frame. Let $K_{ij}$ denote the sectional curvature in $(M, g)$ of the $E_i$-$E_j$ two-plane,
so that $\mathrm{Ric}_g(\nu, \nu) = \mathrm{Ric}_g(E_3, E_3) = K_{13} + K_{23}$, and the scalar curvature is given
by $R(g) = 2(K_{12} + K_{13} + K_{23})$.
Let $K_\Sigma$ denote the Gauss curvature of $\Sigma$ (the sectional curvature of its tangent
plane), which by the Gauss equation satisfies $K_\Sigma = K_{12} + A_{11} A_{22} - A_{12}^2$. From
minimality we have $A_{11} + A_{22} = 0$, and using this along with symmetry $A_{12} = A_{21}$,
we get $K_\Sigma = K_{12} - \frac12 \sum_{i,j} A_{ij}^2 = K_{12} - \frac12 |A|^2$. Therefore, the coefficient in the
Jacobi operator can be written
\[
|A|^2 + \mathrm{Ric}_g(\nu, \nu) = \tfrac12 |A|^2 + K_{12} - K_\Sigma + K_{13} + K_{23} = \tfrac12 |A|^2 + \tfrac12 R(g) - K_\Sigma.
\]
Using (6.3.3), the stability inequality for the variation field $V = \phi\nu$ takes the
form
\[
\int_\Sigma \big( \tfrac12 |A|^2 + \tfrac12 R(g) - K_\Sigma \big) \phi^2 \, d\sigma \le \int_\Sigma |\nabla^\Sigma \phi|^2 \, d\sigma.
\]
We let $\phi = 1$, and apply the Gauss–Bonnet Theorem $\int_\Sigma K_\Sigma \, d\sigma = 2\pi\chi(\Sigma)$,
where $\chi(\Sigma)$ is the Euler characteristic of $\Sigma$, to obtain
\[
\frac12 \int_\Sigma \big( R(g) + |A|^2 \big) \, d\sigma \le 2\pi\chi(\Sigma). \tag{6.3.4}
\]
Since $R(g) > 0$, the left-hand side is positive. Thus $\chi(\Sigma)$ is positive, and so must equal 2; that is, $\Sigma$ has genus zero. □
By (6.3.4), we see that in case $R(g) \ge 0$, we still get $\chi(\Sigma) \ge 0$, and in case $\Sigma$
is a torus, we see $|A|^2 = 0$, i.e., the immersion is actually totally geodesic (see
Exercise 6-62 for more on this, as well as [92; 99; 37; 100]).
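As a sanity check on (6.3.4), consider a standard example that is not taken from the text: the totally geodesic equatorial two-sphere in the unit round three-sphere. The sketch below uses only the well-known values of the curvature of the unit round $S^3$ and shows that the inequality fails there, as it should, since the equator is minimal but not stable.

```latex
% Example (assumption: (M,g) is the unit round S^3, \Sigma the equatorial S^2).
% Facts used: R(g) = 6, Ric_g(\nu,\nu) = 2, A = 0, area(\Sigma) = 4\pi,
% \chi(\Sigma) = 2.
\begin{align*}
\frac12 \int_\Sigma \big( R(g) + |A|^2 \big) \, d\sigma
  &= \frac12 \cdot 6 \cdot 4\pi = 12\pi
  \;>\; 4\pi = 2\pi \chi(\Sigma), \\
-\int_\Sigma \phi L_\Sigma \phi \, d\sigma \, \Big|_{\phi = 1}
  &= -\int_\Sigma \big( |A|^2 + \mathrm{Ric}_g(\nu,\nu) \big) \, d\sigma
   = -8\pi < 0 .
\end{align*}
% So (6.3.4) fails, and the constant variation \phi = 1 decreases area to
% second order: the stability hypothesis cannot be dropped when deriving (6.3.4).
```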
To prove Theorem 6-56, then, Schoen and Yau establish the existence of a
stable minimally immersed surface of positive genus, given the condition on
the fundamental group. For example, given an abelian subgroup of rank two
in $\pi_1(M)$, one gets a continuous map of a torus $T^2$ into $M$ which maps $\pi_1(T^2)$
onto this subgroup as follows: consider a torus as a rectangle with opposite sides
suitably identified, and map these opposite sides to generators of the rank-two
subgroup; the boundary of this rectangle maps to a null-homotopic curve in $M$,
and so the continuous map extends to the interior of the rectangle, and hence
to the torus. (A similar procedure works for higher genus $g$ by representing the
surface as a suitable quotient of a $4g$-gon.) Amongst all maps inducing the same
action on $\pi_1(T^2)$ (a conjugation may be invoked to keep track of the base point),
one finds an energy-minimizing, hence harmonic, map. The energy is defined
with respect to a surface metric, and is conformally invariant. By varying across
conformal classes of $T^2$ (or the higher genus surface), one finds a map of least
energy whose action on $\pi_1(T^2)$ is the same as the original (and in particular,
then, the map is nontrivial). The energy-minimizer can be shown to be a stable
minimal immersion (without branch points). We can now apply the preceding
proposition. See [198] for details. By (6.3.4), one may allow $R(g) = 0$, for
example in case $(T^3, g)$ were the flat torus. By scalar curvature deformation
(Theorem 6-52 and Corollary 6-53, cf. [198]), $g$ must be Ricci-flat (hence flat in
dimension three), else the scalar curvature may be made positive.

Exercises

Exercise 6-58 (graphical minimal surface equation). Suppose $u : \Omega \subset \mathbb{R}^n \to \mathbb{R}$
is a smooth function on an open set $\Omega$, and let $\operatorname{grad} u$ be its Euclidean gradient
vector in $\mathbb{R}^n$. The graph of $u$ is a hypersurface $\Sigma \subset \mathbb{R}^{n+1}$ with unit normal field $\nu$
given by $\nu \sqrt{1 + |\operatorname{grad} u|^2} = -\operatorname{grad} u + \partial/\partial x^{n+1}$, and a basis coordinate frame
for $T\Sigma$ given by
\[
X_i = \frac{\partial}{\partial x^i} + \frac{\partial u}{\partial x^i} \frac{\partial}{\partial x^{n+1}},
\]
for $i \in \{1, \ldots, n\}$. In this frame, the first fundamental form $g$ has components
\[
g_{ij} = \langle X_i, X_j \rangle = \delta_{ij} + \frac{\partial u}{\partial x^i} \frac{\partial u}{\partial x^j},
\]
while the second fundamental form $K$ has components (where $\nabla$ is the ambient
Euclidean connection)
\[
K_{ij} = \langle X_j, -\nabla_{X_i} \nu \rangle = \langle \nabla_{X_i} X_j, \nu \rangle = \Big\langle X_i\Big[ \frac{\partial u}{\partial x^j} \Big] \frac{\partial}{\partial x^{n+1}}, \nu \Big\rangle = \frac{\dfrac{\partial^2 u}{\partial x^i \partial x^j}}{\sqrt{1 + |\operatorname{grad} u|^2}}.
\]

a. Find the components $g^{ij}$ explicitly, and use this to show that the mean curvature
$H = H(u) = g^{ij} K_{ij}$ satisfies
\[
(1 + |\operatorname{grad} u|^2)^{3/2} H(u) = (1 + |\operatorname{grad} u|^2) \Delta u - \sum_{i,j=1}^{n} \frac{\partial u}{\partial x^i} \frac{\partial u}{\partial x^j} \frac{\partial^2 u}{\partial x^i \partial x^j}.
\]
(Hint: Let $G = (g_{ij})$ and $B = (\operatorname{grad} u)(\operatorname{grad} u)^T$, where $\operatorname{grad} u$ is a column vector
(which may be zero). Find $GB$, and noting $G = I + B$, infer $G^{-1}$ from here.)
b. Show that
\[
H = \operatorname{div} \frac{\operatorname{grad} u}{\sqrt{1 + |\operatorname{grad} u|^2}},
\]
where $\operatorname{div}$ is the Euclidean divergence in $\mathbb{R}^n$, by comparing to the result of the
calculation in part a.
c. Show how the form of the minimal surface equation $H = 0$ using part b.
follows naturally from the variational characterization of minimal surfaces as
critical points of the area functional, using the fact that the area $A(u)$ of the graph of
$u$ is $A(u) = \int_\Omega \sqrt{1 + |\operatorname{grad} u|^2} \, dx$. (If the area is infinite, you can still consider
the variation, via the regularization expressed schematically as $A(u + tv) - A(u)$
for some $v$ compactly supported in $\Omega$.)
d. Show that the mean curvature operator $u \mapsto H = H(u)$ is elliptic, or equivalently,
$u \mapsto M(u) = (1 + |\operatorname{grad} u|^2)^{3/2} H(u)$ is elliptic.
Exercise 6-59. Suppose $(M^n, g)$ is Riemannian.
a. Using the form (2-8a)–(2-8b) for the curvature tensor in coordinates, show
the symbol of the linearization $P = D\mathrm{Ric}_g$ is given by (note that the symbol as
given in some references differs from this up to sign)
\[
\sigma(\xi)(h) = \tfrac12 \big( h |\xi|_g^2 + (\operatorname{tr}_g h) \, \xi \otimes \xi - h(\cdot, \xi^\sharp) \otimes \xi - \xi \otimes h(\xi^\sharp, \cdot) \big),
\]
or equivalently, in orthonormal coordinates at $p \in M$, this is just
\[
(\sigma(\xi)(h))_{ij} = \tfrac12 \Big( h_{ij} |\xi|_g^2 + \sum_{k=1}^{n} h_{kk} \xi_i \xi_j - \sum_{k=1}^{n} ( h_{ik} \xi_j \xi_k + h_{kj} \xi_i \xi_k ) \Big).
\]
b. Suppose $\xi \in T_p^* M \setminus \{0\}$, with $\sigma(\xi)(h) = 0$; this still holds if we rescale so that
$|\xi|_g = 1$. By using an orthonormal frame for $T_p M$ with the first vector $\xi^\sharp$, show
that $h$ vanishes on $(\xi^\sharp)^\perp \times (\xi^\sharp)^\perp$. Then show that there is a form $\eta \in T_p^* M$ such
that $h = \xi \otimes \eta + \eta \otimes \xi$. We know from Example 6-11 that indeed such $h$ are in
the kernel of $\sigma(\xi)$.
c. Adjust the preceding argument for the situation where $g$ is a Lorentzian metric
on $M$; you might break up into cases depending on the causal character of $\xi$.
Exercise 6-60 (elliptic estimates and regularity). Suppose $\Omega \subset \mathbb{R}^n$ is open. Let
$0 \le \varphi \le 1$ be a smooth function supported on the unit ball in $\mathbb{R}^n$, satisfying
$\int_{\mathbb{R}^n} \varphi(x) \, dx = 1$, and for any $\sigma > 0$, let $\varphi_\sigma(x) = \sigma^{-n} \varphi\big( \frac{x}{\sigma} \big)$. If $u \in L^p_{loc}$, and if we
define the convolution $u_\sigma = \varphi_\sigma * u$, then $u_\sigma$ is smooth, and $u_\sigma$ converges to $u$ as
$\sigma \searrow 0$ almost everywhere and in $L^p_{loc}$, while for $u \in C^k(\Omega)$ with $k$ a nonnegative
integer, $u_\sigma$ converges to $u$ in $C^k_{loc}(\Omega)$ [86]. Suppose $L$ is linear elliptic of order
$m$, and the domain is $\Omega$ in the first two parts of the exercise.
a. Suppose the coefficients of $L$ are constant, and that for some $p > 1$, $u \in L^p_{loc}$
weakly solves $Lu = f \in L^p_{loc}$. Consider a ball $B_1$ compactly contained in a ball
$B_2$, compactly contained in $\Omega$. Employ the interior elliptic estimate between $B_1$
and $B_2$ to show that the functions $u_\sigma$ form a bounded set in $W^{m,p}(B_1)$, and then
conclude that $u \in W^{m,p}_{loc}$. If instead $f \in L^q_{loc}$ for some $q > 1$, prove that $u \in W^{m,q}_{loc}$
(using the above argument and Sobolev embedding to handle the case $q > p$).
Finally, if moreover $f \in C^{0,\alpha}_{loc}$ for some $0 < \alpha < 1$, argue that $u \in C^{m,\alpha}_{loc}$. To do
this, you might use the preceding to first show $u$ is continuous (in fact, you can
get $u \in C^{m-1,\alpha}_{loc}$); argue that $f_\sigma$ is uniformly bounded in $C^{0,\alpha}$ on any compact
subset, and apply the interior Schauder estimate.
b. You may assume the Schauder estimate (6.1.3) holds for $L$ with $C^{0,\alpha}$ coefficients.
Suppose for some $0 < \alpha < 1$, $u \in C^{m,\alpha}_{loc}$ solves $Lu = f$. Suppose that for
some $k \in \mathbb{Z}_+$, the coefficients of $L$ as well as the function $f$ are in $C^{k,\alpha}_{loc}$. Show
that $u \in C^{m+k,\alpha}_{loc}$. To do this for $k = 1$, bound $L(\Delta_i^h u)$ in $C^{0,\alpha}(B)$ for a ball $B$
compactly contained in $\Omega$. Use interior estimates to show that for any $B'$ compactly
contained in $B$, and for some $\delta > 0$, the set $\{ \partial_x^\beta \Delta_i^h u : 0 < |h| < \delta, \ |\beta| \le m \}$
is bounded and equicontinuous on $B'$. From here, conclude $u \in C^{m+1,\alpha}_{loc}$.
c. Suppose $L$ is linear elliptic of order $m$ with smooth coefficients on a closed
Riemannian manifold $(M, g)$. The Fredholm alternative/Hodge decomposition
was established earlier in the $L^2$ and smooth cases. Prove the decomposition
$W^{k,p}(M) = L(W^{m+k,p}(M)) \oplus \ker L^*$ for $1 < p < \infty$ and $k$ a nonnegative
integer, as follows. Let ${}^\perp\mathcal{K} := \{ f \in L^p(M) : \int_M f w \, dv_g = 0 \text{ for all } w \in \mathcal{K} \}$,
where $\mathcal{K} = \ker L^*$. Prove that the subspace $({}^\perp\mathcal{K}) \cap C^\infty(M)$ is dense in the closed
subspace $({}^\perp\mathcal{K}) \cap W^{k,p}(M)$ of $W^{k,p}(M)$. Apply an approximation argument, the
$C^\infty(M)$-splitting, and elliptic estimates to show that for $f \in ({}^\perp\mathcal{K}) \cap W^{k,p}(M)$,
there is a solution $h \in W^{m+k,p}(M)$ of $Lh = f$; conclude that any distributional
solution $w$ of $Lw = f$ must lie in $W^{m+k,p}(M)$. By a similar method, show that
for $f \in ({}^\perp\mathcal{K}) \cap C^{k,\alpha}(M)$, with $k$ a nonnegative integer and $0 < \alpha < 1$, there is a
solution $h \in C^{m+k,\alpha}(M)$ of $Lh = f$; conclude that any distributional solution $w$
of $Lw = f$ must lie in $C^{m+k,\alpha}(M)$. (Hint for the last part: show that there is a
sequence $f_j \in ({}^\perp\mathcal{K}) \cap C^\infty(M)$ uniformly bounded in $C^{k,\alpha}(M)$ and converging
to $f$ in $C^k(M)$.)
(Note: Part a. holds more generally. Suppose the coefficients of $L$ were merely
(sufficiently) smooth. Then while $L u_\sigma - f_\sigma$ goes to zero distributionally, one
needs an estimate in $L^p_{loc}$ to apply the above argument, and in fact, K. O. Friedrichs
[98, (3.8)] showed that the limit is zero in $L^p_{loc}$. For the Hölder case, to go from
$u \in C^{m-1,\alpha}_{loc}$ to $u \in C^{m,\alpha}_{loc}$, one could localize and compactify as in Remark 6-20,
and then apply part c.)

Exercise 6-61 (conformal method B: conformally covariant split). The conformal
method introduced in Section 6.2 used a splitting of the tensor $K$ which has been
called the semidecoupling split, and indeed, the Hamiltonian and momentum
constraints completely decouple in the CMC case. This method (Method A) also
enjoys a natural conformal invariance in the CMC case, cf. Remark 6-45. There
is another splitting for the conformal method which enjoys a natural conformal
invariance more generally, and we introduce this method, Method B, now, cf.
[17; 65; 172]. In this method, the freely prescribed TT tensor gives, up to
conformal factor, the TT part of $K$, and the vector field $W$ is used to generate
the longitudinal part of $K$ itself.
We start with a Riemannian metric $\mathring{g}$ on $M^n$, a symmetric TT tensor $\sigma$, and a
function $\tau$. We let $q = \frac{4}{n-2}$.
a. For $\varphi > 0$, we let $K_{ab} = \varphi^{-2} \sigma_{ab} + \varphi^{q} (L_{\mathring{g}} W)_{ab} + \frac{\tau}{n} \varphi^{q} \mathring{g}_{ab}$. This last term is
pure trace. Observe that the first two summands are TT and longitudinal for
$g = \varphi^q \mathring{g}$, respectively. Show that the vacuum constraints $\Phi(g, K) = 0$ can be
written
\begin{align*}
\Delta_{\mathring{g}} \varphi &= \tfrac{1}{q(n-1)} R(\mathring{g}) \varphi + \tfrac{1}{qn} \tau^2 \varphi^{1+q} - \tfrac{1}{q(n-1)} |\sigma|_{\mathring{g}}^2 \varphi^{-3-q} \\
&\qquad - \tfrac{1}{q(n-1)} |L_{\mathring{g}} W|_{\mathring{g}}^2 \varphi^{1+q} - \tfrac{2}{q(n-1)} \langle L_{\mathring{g}} W, \sigma \rangle_{\mathring{g}} \varphi^{-1}, \\
(\operatorname{div}_{\mathring{g}} (L_{\mathring{g}} W))_a &= \tfrac{n-1}{n} d\tau_a - (q + 2) (L_{\mathring{g}} W)^{bc} (d \log \varphi)_c \, \mathring{g}_{ab}.
\end{align*}
b. Show that this method enjoys a natural conformal invariance: for $\theta > 0$,
$(\mathring{g}, \sigma, \tau)$ admits a solution $\varphi > 0$ and $W$ to the above system if and only if
$(\theta^q \mathring{g}, \theta^{-2} \sigma, \tau)$ admits a solution $\varphi_\theta = \varphi \theta^{-1} > 0$ and $W_\theta = W$ to the corresponding
system.
Exercise 6-62. Suppose that, in the setting of Proposition 6-57, we instead have
$R(g) \ge 0$. As remarked after the proof, in case there is a stable minimal immersion
from a torus $\Sigma$, then $\Sigma$ is also totally geodesic, and $R(g) = 0$ along $\Sigma$ as well, by
(6.3.4). Show that more is true: prove that $K_\Sigma = 0$, and conclude $\mathrm{Ric}_g(\nu, \nu) = 0$
along $\Sigma$. (Hint: Following [92], consider the functional $\int_\Sigma (|\nabla^\Sigma \phi|^2 + K_\Sigma \phi^2) \, d\sigma$.
Argue that $\phi = 1$ is a minimizer, and consider the Euler–Lagrange equation.)


Excursus:
First and second variation of area

The theory of minimal surfaces is fundamental in geometry, and plays a significant


role in aspects of mathematical relativity, in particular, in the study of the
geometry of initial data sets. The minimal surface equation arises as the Euler–
Lagrange equation for the area functional, as we now recall. The second variation
at a minimal surface is important for geometric analysis, as we have seen in
Section 6.3.4. The formulas we present here are standard, but they are so
important that we derive them in detail.

Variation of a submanifold. Let $\Sigma$ be a smooth manifold (with or without
boundary). Consider an immersion $f : \Sigma \to M$, where $\dim \Sigma = k < n = \dim M$.
We consider a variation of $\Sigma$ (or more precisely, of $f$) to be a smooth map
$F : I \times \Sigma \to M$, where $I \subset \mathbb{R}$ is an open interval around 0, so that $F(0, \cdot) = f$,
and so that for each $t \in I$, if we let $f_t = F(t, \cdot) : \Sigma \to M$, then $f_t$ is an immersion.
We note that if we only assume the immersion condition at $t = 0$, there would be
an open set $U \supset \{0\} \times \Sigma$ such that for each $(t, p) \in U$, the pushforward map $(f_t)_*$
would be injective on $T_p \Sigma$. In case $\Sigma$ were compact, or in case $f_t = f$ outside
a compact set, which are two cases of interest for studying the area functional,
then the set $U$ could be chosen to be a product $J \times \Sigma$, where $0 \in J \subset I$ is an
open interval. We assume going forward that we are in one of these cases, and
thus we have just arranged each $f_t$ to be an immersion.
If $\bar{g} = \langle \cdot, \cdot \rangle$ is a Riemannian metric on $M$, with Levi-Civita connection $\nabla$, we
let $g = f^* \bar{g}$ be the pullback metric on $\Sigma$ with Levi-Civita connection $\nabla^\Sigma$, and we
let $g(t) := (f_t)^* \bar{g}$, a smooth curve of metrics on $\Sigma$ with $g(0) = g$. If $\bar{g}$ is
semi-Riemannian, we make the hypothesis that for each $t$, $g(t)$ is semi-Riemannian
too. (We remark that even when $(M, \bar{g})$ is a spacetime, the variation variable $t$ is
just an independent parameter.) We let $d\sigma_{g(t)}$ be the area measure on $(\Sigma, g(t))$,
and if $\Sigma$ is compact, we let $A(t)$ be the area of $(\Sigma, g(t))$. We want to study
the variation of area $A'(t)$, which makes sense even when $\Sigma$ is noncompact but
$f_t = f$ outside a compact subset, by replacing $A(t)$ by $\int_\Sigma (d\sigma_{g(t)} - d\sigma_g)$.
R


We recall that a vector field $V$ along the map $F$ is a map $V : I \times \Sigma \to TM$
such that $V|_{(t,p)} \in T_{F(t,p)} M$ for all $(t, p) \in I \times \Sigma$. A vector field along $f_t$ is
defined analogously.
We let $x = (x^i)$ be local coordinates on $\Sigma$, so that $(t, x)$ give local coordinates
on $I \times \Sigma$. We let $\partial F/\partial t = F_*(\partial/\partial t)$, and $\partial F/\partial x^i = F_*(\partial/\partial x^i)$ be the pushforward
vectors, which define vector fields along the map $F$. Since $f = F(0, \cdot)$ is an
immersion, the vectors $E_i := \partial F/\partial x^i$ form a local frame for $T\Sigma$. In particular,
$E_i$ is nonzero, so for a vector field $W$ along $\Sigma$, we have $DW/\partial x^i = \nabla_{E_i} W$.
We start with a simple lemma. Since $f$ is locally an embedding, we can use it
to identify $T_p \Sigma$ inside $T_{f(p)} M$, as we do without additional notation in the next
lemma.

Lemma X-1. Let $f : \Sigma \to M$ be an immersion, and let $p \in \Sigma$. Suppose $L$ is a
scalar-valued or vector-valued bilinear form on $T_p \Sigma$ (i.e., a $(0, 2)$-tensor with
scalar values or values in $T_{f(p)} M$). If $\{E_1, \ldots, E_k\}$ is a basis for $T_p \Sigma$, then the
quantity $g^{ij} L(E_i, E_j)$ is independent of basis chosen.

Proof. Let $\{E_1, \ldots, E_k\}$ and $\{\hat{E}_1, \ldots, \hat{E}_k\}$ be bases for $T_p \Sigma$, so that the metric $g$
is represented by $g_{ij} = \langle E_i, E_j \rangle$ and $\hat{g}_{ij} = \langle \hat{E}_i, \hat{E}_j \rangle$. We have the change of basis
given by $E_i = M_i^j \hat{E}_j$, and $\hat{E}_j = \hat{M}_j^\ell E_\ell$, so that $M_i^j \hat{M}_j^\ell = \delta_i^\ell$, and $\hat{g}_{ij} = \hat{M}_i^m g_{m\ell} \hat{M}_j^\ell$.
Thus we see $\hat{g}^{jc} = M_a^j g^{ab} M_b^c$, and hence
\[
g^{ij} L(E_i, E_j) = g^{ij} M_i^m M_j^\ell L(\hat{E}_m, \hat{E}_\ell) = \hat{g}^{m\ell} L(\hat{E}_m, \hat{E}_\ell). \qquad \square
\]

As a corollary, for any vector field $W$ defined along $\Sigma$, we can define $\operatorname{div}_\Sigma W :=
g^{ij} \langle \nabla_{E_i} W, E_j \rangle$, corresponding to the $(0, 2)$-tensor $L(X, Y) = \langle \nabla_X W, Y \rangle$ defined
along $\Sigma$. We note that if $W$ is tangent to $\Sigma$, then $\operatorname{div}_\Sigma W = g^{ij} \langle \nabla^\Sigma_{E_i} W, E_j \rangle =
\operatorname{div}_g W$ is the usual divergence of a tangential vector field. For another example,
if $\Sigma$ is a hypersurface with unit normal field $\nu$ and respective mean curvature $H$,
then
\[
\operatorname{div}_\Sigma \nu = g^{ij} \langle \nabla_{E_i} \nu, E_j \rangle = -g^{ij} \langle \mathrm{II}(E_i, E_j), \nu \rangle = -H,
\]
where as before $\mathrm{II}(X, Y) = (\nabla_X Y)^N$ is the second fundamental form of $\Sigma$.
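To see the sign convention at work, here is a short check on a standard example not taken from the text: the round sphere of radius $r$ in Euclidean space, with the outward unit normal. The setup ($S^n_r \subset \mathbb{R}^{n+1}$, $\nu(x) = x/r$) is an assumption made only for this illustration.

```latex
% Example (assumption: \Sigma = S^n_r \subset \mathbb{R}^{n+1} is the round sphere
% of radius r, with outward unit normal \nu(x) = x/r).
% For X tangent to \Sigma, \nabla_X \nu = X/r, so in any tangent basis \{E_i\}:
\begin{align*}
\operatorname{div}_\Sigma \nu
  = g^{ij} \langle \nabla_{E_i} \nu, E_j \rangle
  = \frac{1}{r} \, g^{ij} g_{ij} = \frac{n}{r},
\qquad \text{hence} \qquad
H = -\frac{n}{r} .
\end{align*}
% Equivalently, the vector-valued mean curvature g^{ij} II(E_i, E_j) equals
% -(n/r)\nu: it points inward, toward the center of the sphere.
```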

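First variation. We compute the variation of the area element, written in local
coordinates on $\Sigma$ as $\sqrt{|\det(g_{ij}(t))|} \, dx = \sqrt{|\det g(t)|} \, dx$, as we have done in
(2.3.10):
\[
\frac{d}{dt} \sqrt{|\det g(t)|} = \frac12 g^{ij} \frac{d}{dt} \Big\langle \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Big\rangle \sqrt{|\det g(t)|},
\]
with
\begin{align*}
\frac{d}{dt} \Big\langle \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Big\rangle
&= \Big\langle \frac{D}{\partial t} \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle \frac{\partial F}{\partial x^i}, \frac{D}{\partial t} \frac{\partial F}{\partial x^j} \Big\rangle \\
&= \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle \frac{\partial F}{\partial x^i}, \frac{D}{\partial x^j} \frac{\partial F}{\partial t} \Big\rangle,
\end{align*}
where we used (2.3.8). By symmetry we conclude that
\[
\frac{d}{dt} \sqrt{|\det g(t)|} = g^{ij} \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle \sqrt{|\det g(t)|}. \tag{X.1}
\]
We evaluate this at $t = 0$, where $V := \frac{\partial F}{\partial t}\big|_{t=0}$ is the variation field. Since
$F(0, \cdot) = f$ is an immersion, we can write at $t = 0$
\[
\frac{d}{dt}\Big|_{t=0} \sqrt{|\det g(t)|} = g^{ij} \langle \nabla_{E_i} V, E_j \rangle \sqrt{|\det g(0)|}.
\]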
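Along $\Sigma$ we divide up the variation field $V$ into tangential and normal
components, $V = V^T + V^N$. Then we have
\begin{align*}
g^{ij} \langle \nabla_{E_i} V, E_j \rangle &= g^{ij} \langle \nabla_{E_i} (V^T), E_j \rangle + g^{ij} \langle \nabla_{E_i} (V^N), E_j \rangle \\
&= g^{ij} \langle \nabla^\Sigma_{E_i} (V^T), E_j \rangle - g^{ij} \langle V^N, \nabla_{E_i} E_j \rangle \\
&= \operatorname{div}_\Sigma (V^T) - g^{ij} \langle V^N, \mathrm{II}(E_i, E_j) \rangle \\
&= \operatorname{div}_\Sigma (V^T) - \langle V, H \rangle, \tag{X.2}
\end{align*}
where we recall the vector-valued mean curvature $H = g^{ij} \mathrm{II}(E_i, E_j)$.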


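Thus we arrive at the first variation of area formula
\begin{align*}
A'(0) &= \int_\Sigma \big( \operatorname{div}_\Sigma (V^T) - \langle V, H \rangle \big) \, d\sigma_g \\
&= -\int_\Sigma \langle V, H \rangle \, d\sigma_g + \int_{\partial\Sigma} \langle V^T, \eta \rangle \, d\xi_g, \tag{X.3}
\end{align*}
where $d\xi_g$ is the induced area element of $\partial\Sigma$ and $\eta$ is the outward-pointing unit
conormal vector along $\partial\Sigma$, the tangent vector to $\Sigma$ which is a unit outward-pointing
normal to $\partial\Sigma$. Note that the boundary term in (X.3) will pick up
changes to the area obtained by pushing the boundary along $\Sigma$. Since each $f_t$ is
an immersion, the derivation above can be applied at any $t$-value, and yields
\begin{align*}
A'(t) &= \int_\Sigma \Big( \operatorname{div}_\Sigma \Big( \Big( \frac{\partial F}{\partial t} \Big)^T \Big) - \Big\langle \frac{\partial F}{\partial t}, H_{\Sigma_t} \Big\rangle \Big) \, d\sigma_{g(t)} \\
&= -\int_\Sigma \Big\langle \frac{\partial F}{\partial t}, H_{\Sigma_t} \Big\rangle \, d\sigma_{g(t)} + \int_{\partial\Sigma} \Big\langle \Big( \frac{\partial F}{\partial t} \Big)^T, \eta_t \Big\rangle \, d\xi_{g(t)}, \tag{X.4}
\end{align*}
where $H_{\Sigma_t}$ is the mean curvature of the immersion $f_t$, and $\eta_t$ is the corresponding
conormal.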
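The first variation formula can be tested on an example where the area is known in closed form. The sketch below uses the dilation of a round sphere, a standard example that is not taken from the text; the map $F(t, x) = (1+t)x$ and the symbol $\omega_n$ for the area of the unit $n$-sphere are assumptions of this illustration.

```latex
% Example (assumption: \Sigma = S^n_r \subset \mathbb{R}^{n+1}, closed, with the
% dilation F(t,x) = (1+t)x). Here V = x = r\nu is normal, V^T = 0, the
% vector-valued mean curvature is H = -(n/r)\nu, and \omega_n denotes the
% area of the unit n-sphere.
\begin{align*}
A(t) &= (1+t)^n \omega_n r^n
  \quad \Longrightarrow \quad A'(0) = n \, \omega_n r^n, \\
-\int_\Sigma \langle V, H \rangle \, d\sigma_g
  &= -\int_\Sigma \Big\langle r\nu, -\frac{n}{r} \nu \Big\rangle \, d\sigma_g
   = n \cdot \omega_n r^n .
\end{align*}
```

Both computations agree, with no boundary term since the sphere is closed.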

By (X.3), we see that the first variation of area vanishes for tangential variations
that vanish along the boundary. This also follows from the following exercise,
which shows why one sometimes restricts to normal variations.
Exercise X-2. a. Suppose $W$ is a compactly supported tangent vector field to $\Sigma$,
which vanishes along $\partial\Sigma$ (if nonempty). Show that $W$ is the variation field for a
variation $\Phi : I \times \Sigma \to \Sigma$, where each $\varphi_t := \Phi(t, \cdot) : \Sigma \to \Sigma$ is a diffeomorphism.
b. Consider a variation of $\Sigma$ with area function $A(t)$ and variation field $V$ which
is compactly supported on $\Sigma$ with $V^T = 0$ along $\partial\Sigma$ (if nonempty). Show that
there is a variation of $\Sigma$ with the same area function $A(t)$ and such that the
variation field is $V^N$.

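Second variation. For the second variation we take the derivative of (X.1),
obtaining
\begin{align*}
\frac{d^2}{dt^2} \sqrt{|\det g(t)|}
= \sqrt{|\det g(t)|} \, \bigg[ & -g^{im} \frac{\partial g_{m\ell}}{\partial t} g^{\ell j} \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big( g^{ij} \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle \Big)^2 \\
& + g^{ij} \Big( \Big\langle \frac{D}{\partial t} \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{D}{\partial t} \frac{\partial F}{\partial x^j} \Big\rangle \Big) \bigg]. \tag{X.5}
\end{align*}
We evaluate the three terms inside the bracket at $t = 0$. The second is just
\[
\Big( g^{ij} \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle \Big)^2 \Big|_{t=0} = \big( \operatorname{div}_\Sigma (V^T) - \langle V, H \rangle \big)^2. \tag{X.6}
\]
For the term on the second line of (X.5) we apply (2.3.8) and (2.3.9):
\begin{align*}
& g^{ij} \Big( \Big\langle \frac{D}{\partial t} \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{D}{\partial t} \frac{\partial F}{\partial x^j} \Big\rangle \Big) \\
&\quad = g^{ij} \Big( \Big\langle \frac{D}{\partial x^i} \frac{D}{\partial t} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle R\Big( \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial t} \Big), \frac{\partial F}{\partial x^j} \Big\rangle + \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{D}{\partial x^j} \frac{\partial F}{\partial t} \Big\rangle \Big).
\end{align*}
Evaluation at $t = 0$ yields
\[
\operatorname{div}_\Sigma \frac{DV}{\partial t} - g^{ij} \langle R(E_i, V, V), E_j \rangle + g^{ij} \Big\langle \frac{DV}{\partial x^i}, \frac{DV}{\partial x^j} \Big\rangle. \tag{X.7}
\]
Just as in (X.2) above, we can write
\[
\operatorname{div}_\Sigma \Big( \frac{DV}{\partial t} \Big) = \operatorname{div}_\Sigma \Big( \Big( \frac{DV}{\partial t} \Big)^T \Big) + \operatorname{div}_\Sigma \Big( \Big( \frac{DV}{\partial t} \Big)^N \Big) = \operatorname{div}_\Sigma \Big( \Big( \frac{DV}{\partial t} \Big)^T \Big) - \Big\langle \frac{DV}{dt}, H \Big\rangle. \tag{X.8}
\]
As for the first term in brackets in (X.5), we get that it equals the negative of
\[
g^{im} \Big( \Big\langle \frac{D}{\partial x^m} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^\ell} \Big\rangle + \Big\langle \frac{\partial F}{\partial x^m}, \frac{D}{\partial x^\ell} \frac{\partial F}{\partial t} \Big\rangle \Big) g^{\ell j} \Big\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Big\rangle,
\]
which at $t = 0$ is
\[
g^{im} \Big( \Big\langle \frac{DV}{\partial x^m}, E_\ell \Big\rangle + \Big\langle E_m, \frac{DV}{\partial x^\ell} \Big\rangle \Big) g^{\ell j} \Big\langle \frac{DV}{\partial x^i}, E_j \Big\rangle. \tag{X.9}
\]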

We can simplify the expressions in (X.7) and (X.9) when the variation field is
normal to $\Sigma$. To facilitate this, we introduce some terminology. The normal Ricci
curvature $R$ is a linear operator on each normal space $N_p \Sigma$, defined for vectors
$W$ normal to $\Sigma$ by $R(W) = g^{ij} R(W, E_i, E_j)^N$, where $\{E_1, \ldots, E_k\}$ is a basis
for $T_p \Sigma$. This is well-defined by Lemma X-1. Note that by symmetry-by-pairs,
if $W$ is normal to $\Sigma$, $\langle R(W), W \rangle = g^{ij} \langle R(E_i, W, W), E_j \rangle$.
For each vector $W$ normal to $\Sigma$, we define a scalar-valued $(0, 2)$-tensor $A^W$
on $\Sigma$ by
\[
A^W(X, Y) = \langle -\nabla_X W, Y \rangle = \langle W, \nabla_X Y \rangle = \langle W, \mathrm{II}(X, Y) \rangle.
\]
If $\{E_1, \ldots, E_k\}$ is a basis for $T_p \Sigma$ as above, then $A^W(E_i, E_j) = \langle -\nabla_{E_i} W, E_j \rangle =
\langle -(\nabla_{E_i} W)^T, E_j \rangle$, so that $g^{jm} A^W(E_i, E_j) E_m = -(\nabla_{E_i} W)^T$. In case $W = \nu$ is
a unit vector, $A^\nu$ is the scalar-valued second fundamental form $K$ with respect
to $\nu$.
Finally, we recall the normal connection $\nabla^N$ on the normal bundle of $\Sigma$ as
follows: for $X$ tangent to $\Sigma$, and $W$ normal along $\Sigma$, we let $\nabla^N_X W = (\nabla_X W)^N$.
If we apply Lemma X-1 to $L(X, Y) = \langle \nabla^N_X W, \nabla^N_Y W \rangle$, we let the result be
$\langle \nabla^N W, \nabla^N W \rangle_g = g^{ij} \langle \nabla^N_{E_i} W, \nabla^N_{E_j} W \rangle$, or $|\nabla^N W|_g^2$ in the Riemannian case.
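So, if $V = \frac{\partial F}{\partial t}\big|_{t=0}$ is normal to $\Sigma$, then (X.7) can be written, using $\frac{DV}{\partial x^\ell} = \nabla_{E_\ell} V$
and (X.8), as
\begin{align*}
& \operatorname{div}_\Sigma \frac{DV}{\partial t} - \langle R(V), V \rangle + g^{ij} \big\langle (\nabla_{E_i} V)^T, (\nabla_{E_j} V)^T \big\rangle + g^{ij} \langle \nabla^N_{E_i} V, \nabla^N_{E_j} V \rangle \\
&\quad = \operatorname{div}_\Sigma \frac{DV}{\partial t} - \langle R(V), V \rangle + g^{ij} g^{a\ell} A^V(E_i, E_a) \, g^{sm} A^V(E_j, E_s) \, g_{\ell m} + g^{ij} \langle \nabla^N_{E_i} V, \nabla^N_{E_j} V \rangle \\
&\quad = \operatorname{div}_\Sigma \Big( \Big( \frac{DV}{\partial t} \Big)^T \Big) - \Big\langle \frac{DV}{dt}, H \Big\rangle - \langle R(V), V \rangle + \langle A^V, A^V \rangle_g + \langle \nabla^N V, \nabla^N V \rangle_g. \tag{X.10}
\end{align*}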

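At last we write (X.9) in the case that $V$ is normal to $\Sigma$ as
\begin{align*}
& g^{im} \big( \langle \nabla_{E_m} V, E_\ell \rangle + \langle E_m, \nabla_{E_\ell} V \rangle \big) g^{\ell j} \langle \nabla_{E_i} V, E_j \rangle \\
&\quad = g^{im} \big( \langle V, \nabla_{E_m} E_\ell \rangle + \langle V, \nabla_{E_\ell} E_m \rangle \big) g^{\ell j} \langle V, \nabla_{E_i} E_j \rangle \\
&\quad = g^{im} \big( \langle V, \mathrm{II}(E_m, E_\ell) \rangle + \langle V, \mathrm{II}(E_\ell, E_m) \rangle \big) g^{\ell j} \langle V, \mathrm{II}(E_i, E_j) \rangle \\
&\quad = 2 \langle A^V, A^V \rangle_g. \tag{X.11}
\end{align*}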
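Applying (X.11), (X.10) and (X.6) (with $V^T = 0$) to (X.5), we obtain for normal
variations $V$
\begin{align*}
A''(0) &= \int_\Sigma \Big( -2 \langle A^V, A^V \rangle_g + \operatorname{div}_\Sigma \Big( \Big( \frac{DV}{\partial t} \Big)^T \Big) - \Big\langle \frac{DV}{dt}, H \Big\rangle \\
&\qquad\quad - \langle R(V), V \rangle + \langle A^V, A^V \rangle_g + \langle \nabla^N V, \nabla^N V \rangle_g + \langle V, H \rangle^2 \Big) \, d\sigma_g \\
&= \int_\Sigma \big( \langle \nabla^N V, \nabla^N V \rangle_g - \langle A^V, A^V \rangle_g - \langle R(V), V \rangle \big) \, d\sigma_g \\
&\qquad - \int_\Sigma \Big( \Big\langle \frac{DV}{dt}, H \Big\rangle - \langle V, H \rangle^2 \Big) \, d\sigma_g + \int_{\partial\Sigma} \Big\langle \frac{DV}{\partial t}, \eta \Big\rangle \, d\xi_g. \tag{X.12}
\end{align*}
This is the second variation formula for normal variations.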


A number of comments are in order. By the first variation formula (X.3),
since any $V$ can be the variation field for some variation, the area is critical for
all compactly supported normal variations if and only if $H = 0$. If we consider
the second variation of such an immersion, and if either $\partial\Sigma$ is empty, or if the
variation $F = f$ on the boundary, i.e., $F(t, p) = f(p)$ for $(t, p) \in I \times \partial\Sigma$ (so
that the variation field $\frac{\partial F}{\partial t}$ vanishes there), then we just get
\[
A''(0) = \int_\Sigma \big( \langle \nabla^N V, \nabla^N V \rangle_g - \langle A^V, A^V \rangle_g - \langle R(V), V \rangle \big) \, d\sigma_g. \tag{X.13}
\]
We have not yet restricted the signatures of $\bar{g}$ and $g$, and thus the first two
terms above do not have a (semi)-definite sign. In case $g$ is Riemannian, then
$|A^V|_g^2 := \langle A^V, A^V \rangle_g \ge 0$. If $\bar{g}$ is Lorentzian and $\Sigma$ is a spacelike hypersurface,
then $\langle \nabla^N V, \nabla^N V \rangle_g = g^{ij} \langle \nabla^N_{E_i} V, \nabla^N_{E_j} V \rangle \le 0$.
In the case (6, g) is a Riemannian hypersurface, we consider a smooth unit
normal field ν to 6; either we assume 6 ⊂ M is two-sided and ν is defined
globally on 6, or ν is a local unit normal field. We let A = Aν , which is just
the scalar-valued second fundamental form K with respect to ν. Consider a
normal variation V = ϕν, where ϕ is a smooth function of compact support
on 6, vanishing on the boundary. In this case, ⟨R(V ), V ⟩ = ϕ 2 Ricḡ (ν, ν),
|A V |2g = ϕ 2 |A|2g , and ⟨∇ N V, ∇ N V ⟩g = ⟨ν, ν⟩|∇ 6 ϕ|2g , so that we have
Z
A′′ (0) = ⟨ν, ν⟩|∇ 6 ϕ|2g − ϕ 2 |A|2g − ϕ 2 Ricḡ (ν, ν) dσg . (X.14)

6

If (M, ḡ) is a spacetime satisfying the Einstein vacuum equation Ric(ḡ) = 0, or
more generally Ric_ḡ(ν, ν) ≥ 0 (which follows from the timelike convergence
condition, cf. Section 2.3.2), then we see from (X.14) why a spacelike hypersurface
Σ with H = 0 is sometimes called a maximal hypersurface (as opposed to
a minimal hypersurface in the Riemannian context). In the Riemannian setting,

we integrate (X.14) by parts and use the boundary condition on ϕ to obtain
A′′(0) = −∫_Σ ϕ L_Σ ϕ dσ_g, where L_Σ = Δ_Σ + |A|²_g + Ric_ḡ(ν, ν) is the Jacobi
operator, whose properties play an important role in applying minimal surface
theory to geometry.
One can express the Ricci term in the second variation in terms of scalar
curvature, using the Gauss equation, as in the proof of Proposition 6-57 and as
in the derivation of the Hamiltonian constraint equation. Indeed, using (5.2.4)–
(5.2.5) and accounting for signature, we have
\[
|A|_g^2 + \mathrm{Ric}_{\bar g}(\nu, \nu) = \tfrac12 \langle \nu, \nu\rangle \Big( R(\bar g) - R(g) + \langle \nu, \nu\rangle \big(|A|_g^2 + H^2\big) \Big),
\]
which simplifies at a minimal (H = 0) hypersurface.

Exercises

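Exercise X-3. Consider the three-dimensional Riemannian Schwarzschild manifold
(M, g_S), where
\[
g_S = \Big(1 + \frac{m}{2|x|}\Big)^4 g_{\mathbb{E}^3}
\quad\text{and}\quad
M = \begin{cases}
\{x \in \mathbb{R}^3 : x \neq 0\} & \text{if } m > 0,\\
\mathbb{R}^3 & \text{if } m = 0,\\
\{x \in \mathbb{R}^3 : |x| > -\tfrac{m}{2}\} & \text{if } m < 0.
\end{cases}
\]
For any r > max(0, −m/2), let S_r = {x ∈ R³ : |x| = r} ⊂ M.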


a. Find the second fundamental form and the mean curvature vector H of
S_r = {x : |x| = r} in the metric g_S. Conclude that S_r is minimal only for
r = m/2 > 0. (It is straightforward to do the computation using the relevant
Christoffel symbols for g_S in spherical coordinates; compare Exercise X-5, and
cf. Exercise 2-49.)
b. Let A(r) be the g_S-area of S_r. Show directly that A′(r) = −∫_{S_r} ⟨H, X⟩_{g_S} dσ,
where X = ∂/∂r and dσ is the area measure induced by g_S, thus verifying the
first variation of area formula (X.3).
c. The Hawking mass of a surface Σ with area |Σ| is given by
\[
m_H(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}} \Big( 1 - \frac{1}{16\pi} \int_\Sigma H^2\, d\sigma \Big).
\]
Show that m_H(S_r) = m.
d. Generalize the argument in part a. of Exercise 5-27 to show that if there is a
closed minimal surface Σ in (M, g_S), then m > 0 and Σ = S_{m/2}.
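The claims in parts a. and c. can be sanity-checked numerically before attempting a proof. The following is a minimal sketch, not part of the original exercise: it assumes the conformal mean curvature formula of Exercise X-5 b. with q = 4 and n = 3, takes the Euclidean mean curvature of S_r to be −2/r with respect to the outward normal (the convention H = g(H, ν) used in the text), and uses hypothetical function names chosen only for illustration.

```python
import math

def conformal_factor(m, r):
    """u(r) = 1 + m/(2r), so that g_S = u^4 g_E in dimension three."""
    return 1.0 + m / (2.0 * r)

def sphere_area(m, r):
    """g_S-area of S_r: the Euclidean area 4*pi*r^2 scaled by u^4."""
    return 4.0 * math.pi * r**2 * conformal_factor(m, r) ** 4

def mean_curvature(m, r):
    """Scalar mean curvature of S_r in g_S with respect to the outward normal.

    Assumes H = u^{-2} H0 - 4 u^{-3} du/dr (Exercise X-5 b. with q = 4, n = 3)
    and the Euclidean value H0 = -2/r. Only H^2 enters the Hawking mass.
    """
    u = conformal_factor(m, r)
    du_dr = -m / (2.0 * r**2)
    return u**-2 * (-2.0 / r) - 4.0 * u**-3 * du_dr

def hawking_mass(m, r):
    """Hawking mass of S_r; H is constant on S_r, so the integral is H^2 * area."""
    area = sphere_area(m, r)
    h = mean_curvature(m, r)
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - h**2 * area / (16.0 * math.pi))
```

For instance, hawking_mass(1.0, 3.0) agrees with m = 1 up to rounding, and mean_curvature(2.0, 1.0) vanishes, consistent with S_r being minimal at r = m/2.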
Exercise X-4. Suppose (M, g) is Riemannian. Let I ⊂ R be an open interval
containing 0. Suppose F : I × M → M is smooth, with F(0, · ) : M → M the

identity map. Suppose Ω ⊂ M is an open set with compact closure and smooth
hypersurface boundary Σ, and outward unit normal field ν along Σ. Let V(t)
be the volume of Ω_t := F({t} × Ω), let A(t) be the area of F({t} × Σ), and let
dσ be the induced surface measure.
a. Suppose that along Σ we have
\[
\frac{\partial F}{\partial t}\Big|_{t=0} := F_*\Big(\frac{\partial}{\partial t}\Big|_{t=0}\Big) = X^T + \phi\,\nu,
\]
where X^T is tangent to Σ. Show that V′(0) = ∫_Σ ϕ dσ. (Hint: Compute the
derivative of the pullback of the volume measure d(F(t, · )*(dv_g))/dt.)


b. Show that the condition that A′(0) = 0 for all F as above for which V′(0) = 0
means that Σ has constant mean curvature (CMC).
c. Show that the CMC condition from part b. can be slightly rephrased to
A′(0) = 0 for all F for which V(t) = V(0) is constant (i.e., the area is stationary
for all volume-preserving deformations). (Hint: Given ϕ on Σ with ∫_Σ ϕ dσ = 0,
construct a volume-preserving deformation as follows: start with an F with
variation field ϕν along {0} × Σ; you can even arrange F to be the identity off
of a neighborhood of {0} × Σ. Extend ν off of {0} × Σ and construct a suitable
O(t²) deformation of F; one suggestion is to employ the exponential map and
the implicit function theorem.)
Exercise X-5. Suppose M is an n-dimensional manifold, n ≥ 3, with Σ ⊂ M a
smooth embedded hypersurface. Let u > 0 be a smooth positive function on M.
Suppose g̊ is a metric on M with Levi-Civita connection ∇̊, and g = u^q g̊ is a
conformal metric, with Levi-Civita connection ∇; in case g̊ is semi-Riemannian,
we assume that g induces a semi-Riemannian metric on Σ. Note that for p ∈ Σ,
the splitting T_p M = T_p Σ ⊕ (T_p Σ)^⊥ is the same in both metrics. Let II̊ and II be
the second fundamental forms of Σ with respect to g̊ and g, respectively, i.e., for
X, Y tangent to Σ, II̊(X, Y) = (∇̊_X Y)^⊥ and II(X, Y) = (∇_X Y)^⊥, with respective
mean curvature vectors $\mathring{\mathbf{H}} = \operatorname{tr}_{\mathring g} \mathring{\mathrm{II}}$ and $\mathbf{H} = \operatorname{tr}_g \mathrm{II}$.
a. Verify that the tensor ∇ − ∇̊ (Exercise 1-6) is given by (cf. Exercise 6-37)
\[
\nabla_X Y - \mathring\nabla_X Y = \tfrac12\, q\, u^{-1} \big( du(X)\,Y + du(Y)\,X - \mathring g(X, Y)\, \operatorname{grad}_{\mathring g} u \big).
\]
From here it is easy to express II in terms of II̊, g̊ and u.
If ν̊ is a smooth local unit normal field to Σ with respect to g̊, then ν = u^{−q/2} ν̊ is
a smooth local unit normal field to Σ with respect to g. Recall our convention
on the mean curvature is $H = g(\mathbf{H}, \nu)$, $\mathring H = \mathring g(\mathring{\mathbf{H}}, \mathring\nu)$; the opposite convention
changes the sign in the next formula.

b. Prove that
\[
\mathbf{H} = u^{-q}\, \mathring{\mathbf{H}} - \tfrac12\, q(n-1)\, u^{-1-q}\, (\operatorname{grad}_{\mathring g} u)^{\perp},
\]
and conclude that
\[
H = u^{-q/2}\, \mathring H - \tfrac12\, q(n-1)\, u^{-1-q/2}\, \frac{\partial u}{\partial \mathring\nu}.
\]
c. Use the above formula to find the mean curvatures of the spheres S_r = {|x| = r}
in the Riemannian Schwarzschild manifold (M, g_S), with
\[
g_S = \Big(1 + \frac{m}{2|x|^{n-2}}\Big)^{\frac{4}{n-2}} g_{\mathbb{E}^n},
\]
and for m > 0, find r_m for which S_{r_m} is minimal in (M, g_S), cf. Exercise 2-49.
d. In the Riemannian Schwarzschild metric from part c., find the mean curvature
vector of the coordinate hyperplanes x^n = ±ξ.
Exercise X-6. Suppose (M, ḡ = ⟨ · , · ⟩) is a Riemannian or Lorentzian manifold
with Levi-Civita connection ∇, and f : Σ → M is an immersed hypersurface.
Suppose F : I × Σ → M is a variation of f, and let Σ_t be the immersed
surface given by f_t := F(t, · ) : Σ → M, with induced Riemannian metric
g(t), and with a smooth section ν of the pullback bundle F*(TM) whose values
ν(t, p) ∈ T_{F(t,p)} M give a unit normal to Σ_t along f_t. We now define a shape
operator along f_t. For any (t, p) ∈ I × Σ, there is a neighborhood V_t of p in Σ
on which f_t is an embedding, and we let Σ′_t = f_t(V_t). For q = F(t, p) ∈ Σ′_t, we
have (f_t)_*(T_p Σ) = T_q Σ′_t, and we let S_q : T_q Σ′_t → T_q Σ′_t be the shape operator of Σ′_t
at q, given by S_q(X) = (−∇_X ν)_q ∈ T_q Σ′_t. As in Exercise 5-25, the shape operator
S_q is a (1, 1)-tensor with associated bilinear form K on T_q Σ′_t. We can extend
S_q to T_q M by letting S(W) = −∇_X ν = S_q(X), for W = X + ⟨ν, ν⟩⟨W, ν⟩ν, i.e.,
X = W^T, which is compatible with the way we extended K in Section 5.3.3.1.
Define ∇_ν S by (∇_ν S)(W) = ∇_ν(S(W)) − S(∇_ν W), which is readily seen to
be tensorial in W.
a. Suppose the variation field is given by ∂F/∂t|_{t=0} = ν(0, · ). Show that at t = 0,
(∇_ν S)(W) is orthogonal to ν, and verify the Riccati equation along Σ
\[
\nabla_\nu S - S^2 = R(\,\cdot\,, \nu, \nu), \tag{X-6a}
\]
where S² = S ∘ S. You could do this by interpreting in terms of more general
equations we derived in Chapter 5.
b. Recall that the mean curvature vector field $\mathbf{H}$ along Σ is independent of (local)
orientation, and is given as the normal vector $\mathbf{H} = \operatorname{tr}_\Sigma(\mathrm{II})$. Note that
$\langle \mathbf{H}, \nu\rangle = \operatorname{tr}_\Sigma S = \operatorname{tr}_g K = H$. Consider a variation F : I × Σ → M as above, for which
∂F/∂t|_{t=0} = ν(0, · ). Let H(t, p) be the mean curvature of the immersion F(t, · ) at
p. Show that the variation in the mean curvature is ∂H/∂t|_{t=0} = |K|²_g + Ric_ḡ(ν, ν),
where the curvature term is evaluated at F(0, p) = f(p).

c. For an example of such a variation, suppose again that (M, ḡ = ⟨ · , · ⟩) is a
Riemannian or Lorentzian manifold, and Σ ⊂ M is an embedded hypersurface
with induced Riemannian metric g, and with smooth unit normal field ν. For
each p in Σ, let γ_p be the ḡ-geodesic in M with γ′_p(0) = ν|_p. We note that given
coordinates x = (x¹, . . . , x^k) for the hypersurface Σ, we let (x¹, . . . , x^k, t) ↦
γ_{p(x)}(t) = exp_{p(x)}(tν|_{p(x)}) ∈ M. This map gives local coordinates for M (Fermi
coordinates), and along Σ (i.e., t = 0), ∂/∂t = ν. By the geodesic equations,
∂/∂t is unit length, and by the Gauss Lemma (cf. [109], e.g.), ∂/∂t is orthogonal
to the level sets Σ_t of t. We can thus extend ν locally as ν = ∂/∂t, and the shape
operator S is defined for level sets of t in a neighborhood of Σ.
Prove that for q ∈ Σ_t and for all W ∈ T_q M, S(W) = −∇_W ν. Then verify the
Riccati equation ∇_ν S − S² = R( · , ν, ν) directly, without appealing to results
from Chapter 5. Conclude that ∂H/∂t = |K|²_g + Ric_ḡ(ν, ν) holds for all t.
d. Suppose Σ is a closed manifold (compact without boundary), and consider a
variation F with ∂F/∂t|_{t=0} = ν(0, · ). Let A(t) denote the area of Σ_t. Use the first
variation formula derived earlier to write A′(t) in terms of H(t, p), and then,
assuming A′(0) = 0, derive A′′(0) using the formula for ∂H/∂t|_{t=0} to see it agrees
with the second variation of area formula, (X.14).
e. Now suppose instead that the variation field is given by ∂F/∂t|_{t=0} = ϕν. Show
that ∂H/∂t|_{t=0} = ⟨ν, ν⟩Δ_Σ ϕ + (|K|²_g + Ric_ḡ(ν, ν))ϕ, which in the Riemannian case
(⟨ν, ν⟩ = 1) is L_Σ ϕ, where we recall L_Σ is the Jacobi operator. If Σ is closed,
if A(t) is the area function for the variation, and if A′(0) = 0, note that the above
identity is consistent with the second variation of area formula, (X.14). (Hint: To do
this, you might prove the variation of mean curvature formula on the open set
U = {ϕ ≠ 0} ⊂ Σ; it is clearly true on the open set Σ \ Ū where ϕ is identically
zero, and the identity extends to Ū by continuity. It might be instructive to
describe how ϕ can vanish at a point p ∈ Ū \ U where ∂H/∂t|_{t=0} does not vanish.)
Exercise X-7. We have from (X.14) that a hyperplane in Euclidean space is stable.
Let Σ be the totally geodesic plane x³ = 0 in the Riemannian Schwarzschild
manifold
\[
g_S = \Big(1 + \frac{m}{2|x|}\Big)^4 g_{\mathbb{E}^3},
\]
with smooth unit normal field ν_{g_S}. In case m > 0, show that Σ is unstable. In fact,
show this plane is unstable for volume-preserving deformations (cf. Exercise
X-4) by finding a compactly supported function ϕ with ∫_Σ ϕ dσ_{g_S} = 0 and such
that a variation with V = ϕν_{g_S} decreases area. (Hint: Construct ϕ by modifying
a suitable cutoff of the constant function 1, and then reflecting over the minimal
sphere. One needs to control the integral of |∇^Σ ϕ|²_{g_S}; to do this one can employ

a logarithmic cutoff log(θ²|x|^{−1})/log θ, which interpolates between 1 and 0 on
an annulus θ ≤ |x| ≤ θ², for θ > 1 sufficiently large, as in [199, p. 54]. One
can then modify ϕ by adding a suitable term supported on a large annulus to
arrange that ∫_Σ ϕ dσ_{g_S} = 0, cf. [79, Proposition 3.3] (thanks to Otis Chodosh for
the reference). We remark that the logarithmic cutoff is Lipschitz, which suffices
for the argument. If one feels the need to smooth it out, one can replace |x| with
ρ(x), where ρ(x) = θ for |x| ≤ θ + δ₀, ρ(x) = |x| for θ + δ₁ ≤ |x| ≤ θ² − δ₁, and
ρ(x) = θ² for |x| ≥ θ² − δ₀, where 0 < δ₀ < δ₁, and δ₀ and δ₁ are fixed, for all θ > 1
sufficiently large. For θ + δ₀ ≤ |x| ≤ θ + δ₁, we can let ρ interpolate smoothly
between θ and |x| by letting ρ′(r) = ψ(r − θ), where ψ ≥ 0 is smooth, ψ(t) = 0 for
t ≤ δ₀, ψ(t) = 1 for t ≥ δ₁, and ∫_{δ₀}^{δ₁} ψ(t) dt = δ₁. For θ² − δ₁ ≤ |x| ≤ θ² − δ₀, we
can interpolate analogously, with ρ′(r) = ψ(θ² − r) for θ² − δ₁ ≤ r ≤ θ² − δ₀.)
CHAPTER 7

Asymptotically flat solutions of the Einstein constraint equations

This chapter develops the geometry of and analysis on initial data sets that arise in
models of isolated gravitational systems. While such models should behave in the
far field like Minkowski spacetime, there are important physical and mathematical
issues to consider when specifying precisely what the spacetime asymptotic
behavior should be, and the devil is in the details. A desirable, but rather strong,
requirement would be to have the spacetime admit a conformal compactification
in the spirit of that of Minkowski spacetime from Section 1.4.2. (See, for instance,
[218, Section 11.1], [112], and [176] for the notion of asymptotically simple
spacetimes, ideas from the analysis of which are often used to model gravitational
radiation.) One often considers weaker notions that still capture in some sense
how the spacetime approaches the Minkowski spacetime.
A natural question to ask is what kind of spacetime behavior is inherited
from the evolution of initial data which approaches, say, Euclidean geometry,
or, initial data which approaches that of a hyperboloid in Minkowski spacetime,
in the far field. With the conformal compactification of Minkowski spacetime
in mind, hypersurfaces in the former class might be thought of as tending to
spacelike infinity, and those in the latter class as tending to null infinity, though
such notions would come from the spacetime evolution of the data. There are
results, cf. [56; 55; 97], which establish properties about the evolution of such
initial data. We will not discuss such results here, but rather we will focus on the
analysis and geometry of the initial data, with particular emphasis on features
of the asymptotic structure imposed by the constraint equations (vacuum, or
under the dominant energy condition), leaving the interested reader to pursue
the fascinating questions and results for the evolution problem, which has seen
spectacular progress in recent years.
Throughout this chapter, n ≥ 3, g_{E^n} is the Euclidean metric on R^n, g̊_{S^{n−1}} is
the unit round metric on the sphere, and we take the cosmological constant Λ to
be 0. We let dx be Euclidean volume measure, and let | · | be the Euclidean norm,
while for a submanifold of Euclidean space, dσ is Euclidean surface measure,


and ν is a Euclidean normal vector, whereas dv_g, | · |_g, dσ_g and ν_g will be the
analogous quantities with respect to a metric g.
The first two sections of the chapter contain some detailed discussion and
analysis involving the Laplace operator on Euclidean space and on asymptotically
flat manifolds. While it is not necessary to understand and fill in every detail on
a first pass, we hope that the reader can follow some of the discussion to get a
sense of what is important, and then be able to apply the results in later sections.
Indeed, rather than just state results and point to a panoply of references, we
have attempted to motivate the results and sketch proofs, hopefully providing a
reader’s guide to some of the references.

7.1. Harmonically flat solutions of the constraint equations

In Chapter 2 we obtained the Schwarzschild spacetime by imposing the vacuum
Einstein equation along with rotational symmetry. The resulting spacetime is
static and asymptotic to Minkowski spacetime in a far field regime. While static
metrics are rather special, there are many spacetime metrics which asymptote
to Minkowski spacetime, and are often used to model isolated gravitational
systems. In terms of initial data, certain spacelike slices of the Schwarzschild
spacetime have induced metric g_S and vanishing second fundamental form (cf.
Exercise 2-34), and hence give time-symmetric solutions to the vacuum constraint
equations. In the time-symmetric case, the vacuum constraints reduce to the
vanishing of the scalar curvature. For n ≥ 3, the metric g_S with mass m can be
written in isotropic coordinates as
\[
g_S = \Big(1 + \frac{m}{2|x|^{n-2}}\Big)^{\frac{4}{n-2}} g_{\mathbb{E}^n}. \tag{7.1.1}
\]
This metric is conformally flat, and asymptotically flat (or asymptotically Euclidean),
in that it approaches g_{E^n} as |x| → ∞. (Though it is not immediately
obvious, for m > 0 it is also asymptotically flat as |x| ↘ 0; see Exercise 2-49.)
If we consider g = u^{4/(n−2)} g_{E^n}, then the conformal change of scalar curvature
(6.1.8) simplifies, since R(g_{E^n}) = 0, to
\[
R(g) = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}}\, \Delta u,
\]
where Δ is the Euclidean Laplacian. Thus for the scalar curvature to vanish, we
require Δu = 0. We remark that if we only require the dominant energy condition
(Section 2.3.2, see also Section 5.2), then in the time-symmetric case we get
R(g) ≥ 0, which for g = u^{4/(n−2)} g_{E^n} translates into Δu ≤ 0, i.e., u is superharmonic.
The conformal factor u for the Riemannian Schwarzschild metric is, up to
constants, the fundamental solution for the Laplace operator on Rn . If more

generally we consider positive harmonic functions on domains Ω ⊂ R^n, we obtain
solutions (g, K) = (u^{4/(n−2)} g_{E^n}, 0) to the time-symmetric vacuum constraints on Ω.
Such metrics might be called harmonically flat, though the term is sometimes
reserved for the case when Ω is a neighborhood of infinity (i.e., contains the
complement of a compact subset), and when u limits to a positive constant (which
can be rescaled to unity) at infinity, so that g is also asymptotically flat; in this
case we say that g is harmonically flat at infinity [25].
Such metrics play an important role in the study of isolated gravitational
systems, as we will see. For now, we will review some properties of the Laplace
operator and harmonic functions on Rn , and apply them to harmonically flat
metrics.

7.1.1. On the Laplacian in R^n. As we have seen, the gravitational potential Φ
in Newtonian gravity satisfies Poisson’s equation ΔΦ = 4πGσ on R³, where σ
is the matter density. For an isolated system, we can consider σ to be compactly
supported, or more generally to decay suitably at infinity. We want to know
how the potential Φ behaves at infinity, and so we will begin by studying the
Laplace operator on R^n. In the case of a point mass σ = Mδ_c (a multiple of
the Dirac distribution at c), if we impose that Φ decays at infinity, we have
Φ(x) = −GM/|x − c|.
In this section, measurable functions are defined with respect to Lebesgue
measure on R^n, and for an integrable function, ∫_{R^n} f(x) dx is the integral with
respect to Lebesgue measure.

7.1.1.1. The fundamental solution. On a Riemannian manifold (M, g), the trace
of the covariant Hessian Δ_g u = div_g(∇_g u) = g^{ij} u_{;ij} can be expressed as (recall
the summation convention is in force) (cf. Exercise 1-10)
\[
\Delta_g u = \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^i} \Big( g^{ij} \sqrt{\det g}\, \frac{\partial u}{\partial x^j} \Big). \tag{7.1.2}
\]
We recall the rotationally symmetric harmonic functions on Euclidean space.
If we consider (7.1.2) for the Euclidean metric (R^n, g_{E^n} = dr² + r² g̊_{S^{n−1}}), applied
to a function u which only depends on the radial distance r = |x| from the origin,
we obtain (writing u = u(r))
\[
\Delta u = \frac{1}{r^{n-1}}\, \frac{d}{dr} \Big( r^{n-1} \frac{du}{dr} \Big) = u''(r) + \frac{n-1}{r}\, u'(r).
\]

For such u, to be harmonic is thus equivalent to d/dr (r^{n−1} du/dr) = 0, that is,
\[
u(r) = \begin{cases}
Ar + B & \text{for } n = 1,\\
A \log r + B & \text{for } n = 2,\\
A r^{2-n} + B & \text{for } n > 2,
\end{cases} \tag{7.1.3}
\]
for some constants A and B. In dimension one the function u extends contin-
uously to the whole line; for n = 2, log |x| is unbounded both as r = |x| tends
to 0 or ∞, whereas for n > 2, u(x) decays to 0 as |x| tends to infinity. We will
generally be interested in the case n ≥ 3, but will also include some discussion
of the n = 2 case for comparison; recall that in the plane, harmonic function
theory is intimately tied to the theory of holomorphic functions.
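The radial formula and the family (7.1.3) are easy to check numerically. The following is a minimal sketch using centered finite differences; the function names and the step size h are illustrative choices, not taken from the text.

```python
import math

def radial_laplacian(u, r, n, h=1e-3):
    """Approximate u''(r) + (n-1)/r * u'(r) by centered differences."""
    d1 = (u(r + h) - u(r - h)) / (2.0 * h)
    d2 = (u(r + h) - 2.0 * u(r) + u(r - h)) / h**2
    return d2 + (n - 1) / r * d1

def radial_harmonic(n, A=1.0, B=0.0):
    """The radial harmonic function of (7.1.3) in dimension n."""
    if n == 1:
        return lambda r: A * r + B
    if n == 2:
        return lambda r: A * math.log(r) + B
    return lambda r: A * r ** (2 - n) + B
```

For example, radial_laplacian(radial_harmonic(3), 2.0, 3) is zero up to discretization error, whereas the non-harmonic function r² in dimension three returns approximately 6 = 2n.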
These special harmonic functions are essentially the fundamental solutions
for the Laplace operator, as we now recall. First, note that these functions are
locally integrable, even around the origin. Therefore they define distributions,
and thus have distributional derivatives, and hence the Laplace operator can
be applied distributionally to these functions. To be precise, if f is locally
integrable, Δf is the distribution T defined as follows: for any ψ ∈ C_c^∞(R^n),
T(ψ) = ∫_{R^n} f Δψ dx.
Let Ω_n be the volume of the unit ball B in R^n, and ω_{n−1} the surface area of
the unit round sphere S^{n−1} = ∂B. Note that 2Ω_2 = 2π = ω_1, 3Ω_3 = 4π = ω_2,
and in general nΩ_n = ω_{n−1}. The proof of the following is a generalization of
that of (2.1.2), and is left as an exercise.
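Proposition 7-1. Let
\[
\Gamma(x - y) = \begin{cases}
\dfrac{1}{(2-n)\, n\Omega_n}\, |x - y|^{2-n} & \text{if } n > 2,\\[2ex]
\dfrac{1}{2\pi} \log |x - y| & \text{if } n = 2.
\end{cases}
\]
Then, for all u ∈ C_c²(R^n),
\[
u(x) = \int_{\mathbb{R}^n} \Gamma(x - y)\, \Delta u(y)\, dy. \tag{7.1.4}
\]
In other words, the following distributional identities hold: in dimension two,
\[
\Delta \Big( \frac{1}{2\pi} \log |x| \Big) = \delta_0,
\]
while for n > 2,
\[
\Delta \Big( \frac{1}{(2-n)\, n\Omega_n}\, |x|^{2-n} \Big) = \delta_0.
\]
Furthermore, ΔΓ(x − y) = δ_0(x − y), consistent with (7.1.4).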


We remark that ∂Γ(x)/∂x^i = (1/ω_{n−1}) x^i/|x|^n for all n ≥ 2.
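As a quick consistency check (a worked computation added here for illustration), this remark gives ∂Γ/∂r = (x^i/r) ∂Γ/∂x^i = 1/(ω_{n−1} r^{n−1}), so the flux of ∇Γ through any sphere centered at the origin equals 1, as one expects from ΔΓ = δ_0 and the divergence theorem:

```latex
\[
\int_{\{|x| = r\}} \frac{\partial \Gamma}{\partial r}\, d\sigma
= \frac{1}{\omega_{n-1}\, r^{n-1}} \cdot \omega_{n-1}\, r^{n-1} = 1
\qquad \text{for every } r > 0 .
\]
```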
7.1.1.2. The Poisson equation and the Newtonian potential. For functions f ∈
L^∞_loc(R^n) ∩ L¹(R^n), for instance f locally bounded, measurable and decaying
suitably at infinity, the integral N_f(x) = ∫_{R^n} Γ(x − y) f(y) dy converges, and
constitutes the Newtonian potential of f.
Exercise 7-2. Let n ≥ 3. Prove that the integral defining N_f converges for
functions f ∈ L^∞_loc(R^n) ∩ L¹(R^n). (Hint: Break up the integral into two regions and
exploit the fact that Γ is both locally integrable, and bounded near infinity.) Using
Young’s inequality for convolution, adapt the proof to show that if f ∈ L¹(R^n),
the integral for N_f converges a.e. to a locally integrable function.
For f integrable with compact support, a simple Fubini-type argument, along
with Proposition 7-1, yields that ΔN_f = f distributionally: for all ψ ∈ C_c^∞(R^n),
\[
\int_{\mathbb{R}^n} \Delta\psi(x)\, N_f(x)\, dx = \int_{\mathbb{R}^n} \psi(y)\, f(y)\, dy.
\]

Proposition 7-3. For n ≥ 3 and f ∈ L¹(R^n), ΔN_f = f distributionally.

Sketch of proof. For f ∈ L¹(R^n) and R > 0, we let f_R = f χ_{B_R(0)}, where χ_A
is the characteristic function of the set A. It is not hard to see that as R ↗ ∞,
f_R and N_{f_R} = f_R ∗ Γ converge as distributions to f and N_f; the latter fact can
be proven along the lines of Exercise 7-2. That ΔN_f = f distributionally then
follows from the analogous equation Δ(f_R ∗ Γ) = f_R, which is valid since f_R
has compact support. □
One can more generally define N f when f is, say, a compactly supported
distribution, and the distribution so defined also solves the Poisson equation
1N f = f . See [93] or [195, Theorem 6.37].
A fundamental issue is the regularity of the Newtonian potential N_f. For
instance, Proposition 7-1 shows us that N_f ∈ C²(R^n) for certain continuous
functions f, namely those f ∈ Δ(C_c²(R^n)). We stress, however, that for general
f ∈ C_c(R^n), while the following lemma indicates that the solution N_f ∈ C^{1,α}_loc(R^n)
for 0 < α < 1, N_f might fail to be in C²(R^n) (see [111, Example 4.4.4], for
instance; for a more general result see [115, Theorem 7.9.8], and compare [107,
Exercise 4.9]). Given f ∈ C_c(R^n) and Ω ⊂ R^n open such that N_f fails to be in
C²(Ω), there is in fact no solution v ∈ C²(Ω) of Δv = f: if there were such a
v, then the continuous function w = N_f − v would be harmonic, and hence a
smooth function by Weyl’s lemma (Lemma 6-18), which yields a contradiction.
We have the following lemma on the regularity of the Newtonian potential.
The interested reader can formulate a modification for the case n = 2.

Lemma 7-4. Let n ≥ 3. Suppose f ∈ L^∞_loc(R^n) ∩ L¹(R^n). For any 0 < α < 1,
N_f lies in C^{1,α}_loc(R^n), and for each j ∈ {1, 2, . . . , n},
\[
\partial_j N_f(x) = \int_{\mathbb{R}^n} \partial_j \Gamma(x - y)\, f(y)\, dy.
\]
If in fact f ∈ C^{0,α}_loc(R^n) ∩ L¹(R^n), then N_f ∈ C^{2,α}_loc(R^n), and ΔN_f = f holds
classically.
Sketch of proof. We will briefly indicate the issues involved in proving this
lemma. First, a straightforward dominated convergence argument shows that for
f ∈ C_c^k(R^n) and for any multi-index α with |α| ≤ k,
\[
\partial^\alpha N_f(x) = \int_{\mathbb{R}^n} \Gamma(y)\, \partial^\alpha f(x - y)\, dy = \int_{\mathbb{R}^n} \Gamma(x - y)\, \partial^\alpha f(y)\, dy = N_{\partial^\alpha f}(x).
\]
One can then use uniform continuity of ∂^α f to conclude that N_f ∈ C^k(R^n), and of
course, if f ∈ C_c^∞(R^n), then N_f ∈ C^∞(R^n). Suppose we dial the regularity down
to f ∈ C_c¹(R^n), so that ∂_j N_f(x) = ∫_{R^n} Γ(x − y) ∂_j f(y) dy, as above. Both Γ and ∂_j Γ
are locally integrable, and a very simple limiting argument around the singularity
as in the derivation of (2.1.2) justifies integration by parts to yield
\[
\partial_j N_f(x) = \int_{\mathbb{R}^n} \partial_j \Gamma(x - y)\, f(y)\, dy = \int_{\mathbb{R}^n} \partial_j \Gamma(y)\, f(x - y)\, dy.
\]
Since f ∈ C_c¹(R^n) and ∂_j Γ is locally integrable, dominated convergence can be
applied again to give ∂²_{ij} N_f(x) = ∫_{R^n} ∂_j Γ(y) ∂_i f(x − y) dy, which is likewise
seen to be continuous. Thus N_f ∈ C²(R^n).
While the C¹-regularity of f made this argument fairly simple, if we are
willing to do a bit more work, we actually can get even more with less. We
indicate how this works. Since ∂_j Γ is locally integrable, one can use this
along with a regularization argument (cf. [107, Lemma 4.1]) to show that f ∈
L^∞_loc(R^n) ∩ L¹(R^n) implies that ∂_j N_f(x) = ∫_{R^n} ∂_j Γ(x − y) f(y) dy, and indeed
that N_f ∈ C¹(R^n). Since ∂²_{ij} Γ is not locally integrable, we cannot run the
analogous argument again; however, if we have suitable control on the difference
|f(x) − f(y)|, such as from local Hölder continuity, say f ∈ C^{0,α}_loc(R^n) ∩ L¹(R^n),
then we can use the fact that ∂²_{ij} Γ(x − y)(f(y) − f(x)) is locally integrable to
establish N_f ∈ C²(R^n); see [107, Lemma 4.2]. As for Hölder continuity of N_f, we
sketch in Exercise 7-101 how to show N_f ∈ C^{1,α}_loc(R^n) for f ∈ L^∞_loc(R^n) ∩ L¹(R^n),
and leave the C^{2,α}_loc-proof to [107, Lemma 4.4]. □
With Proposition 7-1 in mind, we point out the obvious fact that even if Δu is
compactly supported, u generally will not be, even modulo an additive constant;
as an example, consider u constructed by smoothing out Γ(x) near x = 0. That

said, for n ≥ 3, if f is bounded, measurable and compactly supported, there is
a constant C such that |N_f(x)| ≤ C|x|^{2−n}; in particular, then, N_f(x) → 0 as
|x| → ∞. Thus, if f ∈ C_c^{0,α}(R^n), and if u ∈ C²(R^n) with Δu = f and such that
u tends to 0 at infinity, an application of the maximum principle (which we recall
in Proposition 7-10 below), along with the regularity and decay of u and N_f and
the fact that Δ(u − N_f) = 0, shows that (7.1.4) holds, i.e., u = N_f = N_{Δu}.
We have the following important corollary, for which we set up some notation.
Definition 7-5. For s ∈ R, let h(x) = O_∞(|x|^s) indicate that h is a function
defined on a domain Ω ⊂ R^n with compact complement, and that for all k ∈
Z₊ ∪ {0}, there is a constant C_k such that for all x ∈ Ω ∩ {|x| > 1} and all
multi-indices β with |β| = k, we have |∂_x^β h(x)| ≤ C_k |x|^{s−k}. For ℓ ∈ Z₊ ∪ {0}, we let
O_ℓ(|x|^s) be defined analogously, just for 0 ≤ k = |β| ≤ ℓ. We write O_ℓ(|x|^s; Ω)
if we want to specify the domain.
Recall that dσ is Euclidean surface measure. As usual, r = |x|, and ∂u/∂r =
(x^i/r) ∂u/∂x^i, and for B ∈ R^n, x · B = Σ_{i=1}^n x^i B_i.
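Corollary 7-6. Let n ≥ 3. Suppose v ∈ C²(R^n) is such that Δv has compact
support, and lim_{|x|→∞} v(x) = 0. Then v admits an expansion
\[
v(x) = \frac{A}{|x|^{n-2}} + \frac{x \cdot B}{|x|^n} + O_\infty(|x|^{-n}),
\]
for constants A ∈ R and B ∈ R^n, which are given by
\[
A = \frac{1}{(2-n)\,\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \frac{\partial v}{\partial r}\, d\sigma
  = \frac{1}{(2-n)\,\omega_{n-1}} \int_{\mathbb{R}^n} \Delta v\, dx, \tag{7.1.5}
\]
\[
B^i = -\frac{1}{\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \Big( x^i \frac{\partial v}{\partial r} - \frac{x^i}{r}\, v \Big)\, d\sigma
    = -\frac{1}{\omega_{n-1}} \int_{\mathbb{R}^n} x^i\, \Delta v\, dx. \tag{7.1.6}
\]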

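Exercise 7-7. Prove Corollary 7-6. (Hint: Use v = N_{Δv} and the expansion
\[
\begin{aligned}
|x - y|^{2-n} &= \big( |x|^2 - 2x \cdot y + |y|^2 \big)^{\frac{2-n}{2}}
 = |x|^{2-n} \Big( 1 - 2\, \frac{x}{|x|} \cdot \frac{y}{|x|} + \frac{|y|^2}{|x|^2} \Big)^{\frac{2-n}{2}} \\
&= |x|^{2-n} \Big( 1 + (n-2)\, \frac{x}{|x|} \cdot \frac{y}{|x|} + O_\infty(|x|^{-2}) \Big)
\end{aligned}
\]
for |x| large relative to |y|.)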
For such a function v as in the corollary, there is a neighborhood Ω of infinity
for which u = 1 + v > 0 and such that (Ω, u^{4/(n−2)} g_{E^n}) is harmonically flat at infinity.
In analogy with the Schwarzschild metric, we will identify m = 2A as a mass.
The vector B will be related to a center of mass in Exercise 7-28 below. At

this point it might prove instructive to compare the Newtonian analogue: recall
from (2.1.3) that the gravitational potential Φ satisfies the Poisson equation
ΔΦ = 4πσ = 4πρ/c², where ρ is the energy density, and σ = ρ/c² is the mass
density, and we have taken G = 1 for convenience. If we assume ρ is smooth and
compactly supported, and that lim_{|x|→∞} Φ(x) = 0, then Φ admits an expansion
of the above form, which we write Φ(x) = −m/|x| − β · x/|x|³ + O_∞(|x|^{−3}).
In fact, from (7.1.5) we see that
\[
m = \frac{1}{4\pi} \lim_{r\to\infty} \int_{\{|x|=r\}} \frac{\partial \Phi}{\partial r}\, d\sigma
  = \frac{1}{4\pi} \int_{\mathbb{R}^3} \Delta\Phi\, dx = \int_{\mathbb{R}^3} \sigma\, dx \tag{7.1.7}
\]
and from (7.1.6)
\[
\beta^i = \frac{1}{4\pi} \lim_{r\to\infty} \int_{\{|x|=r\}} \Big( x^i \frac{\partial \Phi}{\partial r} - \frac{x^i}{r}\, \Phi \Big)\, d\sigma
        = \frac{1}{4\pi} \int_{\mathbb{R}^3} x^i\, \Delta\Phi\, dx = \int_{\mathbb{R}^3} x^i \sigma\, dx. \tag{7.1.8}
\]
Thus m is in fact the total mass of the matter distribution, and β^i = mc^i is the
first moment of the distribution, giving (for m ≠ 0) the center of mass c^i.
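The roles of m and β in the expansion can be illustrated numerically. The sketch below makes one simplifying assumption that departs from the smooth setting of the text: the matter distribution is replaced by finitely many point masses, for which the potential is an explicit sum and the integrals in (7.1.7) and (7.1.8) become finite sums. The function names are hypothetical.

```python
import math

def potential(masses, x):
    """Exact potential (G = 1) at x of point masses [(m_k, c_k), ...] in R^3."""
    return -sum(m / math.dist(x, c) for m, c in masses)

def total_mass(masses):
    """Discrete analogue of (7.1.7): the sum of the masses."""
    return sum(m for m, _ in masses)

def first_moment(masses):
    """Discrete analogue of (7.1.8): beta^i = sum of m_k * c_k^i."""
    return tuple(sum(m * c[i] for m, c in masses) for i in range(3))

def dipole_expansion(masses, x):
    """The truncated expansion -m/|x| - beta . x / |x|^3."""
    r = math.hypot(*x)
    m = total_mass(masses)
    beta = first_moment(masses)
    beta_dot_x = sum(b * xi for b, xi in zip(beta, x))
    return -m / r - beta_dot_x / r**3
```

With masses = [(1.0, (0.5, 0.0, 0.0)), (2.0, (0.0, -0.3, 0.2))] and x = (100.0, 0.0, 0.0), the truncated expansion matches the exact potential far more closely than the monopole term −m/|x| alone, in line with the O(|x|^{−3}) remainder.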

7.1.2. Harmonic functions. We collect here some well-known facts about Euclidean
harmonic functions on domains in R^n, n ≥ 2, a reference for which is
[12]; see also [86; 107; 111]. We give some of the proofs, and refer others to
exercises or these references.
For a ∈ R^n and R > 0, let B_R(a) = {x ∈ R^n : |x − a| < R}, so that the closed
ball is B̄_R(a) = {x ∈ R^n : |x − a| ≤ R}, the boundary ∂B_R(a) of which is a sphere.
When a is the origin, we may abbreviate by omitting the center point.
If Ω is a bounded open set with smooth hypersurface boundary ∂Ω, ν will be
the outward unit normal, and for a differentiable function u, ∂u/∂ν is the directional
derivative in this normal direction. In this section, we may use a subscript on the
surface measure to denote the variable of integration.
Classically, one defines u to be harmonic in an open set Ω if u ∈ C²(Ω)
and Δu = 0 in Ω. As we mentioned for the more general Poisson equation
above, we could interpret Δu = 0 distributionally: if u is locally integrable, or
more generally a distribution, then Δu = 0 on Ω distributionally means for all
ψ ∈ C_c^∞(Ω), ∫_Ω uΔψ dx = 0. An important classical result is Weyl’s Lemma:
if such a u satisfies Laplace’s equation distributionally, then u can be represented
by a function in C^∞(Ω). In fact, more is true: harmonic functions are
real-analytic, and hence obey the unique continuation property [12]. In what follows,
then, functions harmonic in an open set can be taken to be smooth there, without

further mention; we may, however, note any assumptions about behavior of the
function up to the boundary, as in the following proposition.
Proposition 7-8 (mean value property, MVP). If u ∈ C⁰(B̄_R(a)) is harmonic on
B_R(a), then
\[
u(a) = \frac{1}{n\Omega_n R^{n-1}} \int_{\{|x-a|=R\}} u(x)\, d\sigma = \frac{1}{\Omega_n R^n} \int_{B_R(a)} u(x)\, dx.
\]
Exercise 7-9. Prove the MVP, as follows. Apply Green’s identity
\[
\int_\Omega (u \Delta v - v \Delta u)\, dx = \int_{\partial\Omega} \Big( u \frac{\partial v}{\partial \nu} - v \frac{\partial u}{\partial \nu} \Big)\, d\sigma
\]
with v = |x − a|^{2−n} (n > 2) and Ω = {x : ε < |x − a| < R − ε}. Let ε ↘ 0 and
use continuity. The case n = 2 is handled analogously.
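The spherical form of the mean value property can be observed numerically in the plane (n = 2, where nΩ_n R^{n−1} = 2πR). The following is a minimal sketch with an illustrative harmonic polynomial; the names circle_average and harmonic_example are hypothetical, and the circle integral is approximated by equally spaced samples.

```python
import math

def circle_average(u, a, R, samples=64):
    """Average of u over the circle |x - a| = R, using equally spaced samples."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += u(a[0] + R * math.cos(theta), a[1] + R * math.sin(theta))
    return total / samples

def harmonic_example(x, y):
    """x^2 - y^2 + 3x, a harmonic polynomial on R^2."""
    return x * x - y * y + 3.0 * x
```

For a = (0.7, −1.2) and R = 2.5, the circle average of harmonic_example agrees with its value at a up to rounding, while for the non-harmonic function x² + y² the average exceeds the central value by R².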
As a corollary of the MVP, we immediately obtain a maximum principle, in
both a weak and a strong form.
Proposition 7-10 (maximum principle). If u is harmonic on a connected open set
Ω, and if u attains an absolute maximum or minimum inside Ω, then u is constant.
If Ω is a bounded open set for which u is harmonic on Ω and continuous on Ω̄,
then the maximum and minimum of u are attained on the boundary of Ω.
7.1.2.1. Dirichlet problem. We now recall how to solve the Dirichlet problem
on a ball, to determine a harmonic function from its boundary values.
By the maximum principle (Proposition 7-10), there is at most one harmonic
function in C 0 (B R ) with given boundary values, and in fact there is precisely
one. For this we use the Poisson kernel
R 2 − |x|2
PR (x, y) = .
nn R|x − y|n
This is the generalization to all dimensions of the classical Poisson kernel in two
dimensions, in which case the kernel can be derived using the series expansion
for a holomorphic function on a disk [12].
Proposition 7-11 (Dirichlet problem). If $\varphi \in C^0(\partial B_R)$, then
\[
u(x) = \int_{\partial B_R} \varphi(y)\,P_R(x, y)\,d\sigma_y
\]
is harmonic on $B_R$, continuous on $\overline{B_R}$, and $u = \varphi$ on $\partial B_R$.


For the proof, see [12]. It is not hard to show that $\int_{\partial B_R} P_R(x, y)\,d\sigma_y = 1$ for
$x \in B_R$; i.e., the formula holds for $\varphi = 1$ and $u = 1$. Moreover, as $x$ approaches a
point $\xi$ on the boundary, $P_R(x, y)$ becomes sharply peaked about $\xi$, so that its unit
integral is concentrated around $y = \xi$, and thus $\int_{\partial B_R} \varphi(y)\,P_R(x, y)\,d\sigma_y \approx \varphi(\xi)$.
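Both facts about the Poisson kernel, the unit integral and the reproduction of a harmonic function from its boundary values, can be checked numerically in dimension two. The sketch below assumes the kernel is normalized so that $n\Omega_n = 2\pi$ when $n = 2$ (the circumference of the unit circle); the helper names and the test point are illustrative.

```python
import math

def poisson_kernel_2d(x, y, R):
    """P_R(x, y) = (R^2 - |x|^2) / (2*pi * R * |x - y|^2) for n = 2 (assumed normalization)."""
    dist2 = (x[0] - y[0])**2 + (x[1] - y[1])**2
    return (R * R - x[0]**2 - x[1]**2) / (2.0 * math.pi * R * dist2)

def poisson_integral_2d(phi, x, R, samples=4000):
    """Midpoint-rule approximation of the integral of phi(y) P_R(x, y) over |y| = R."""
    total = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * (k + 0.5) / samples
        y = (R * math.cos(t), R * math.sin(t))
        total += phi(y) * poisson_kernel_2d(x, y, R)
    return total * (2.0 * math.pi * R / samples)

R = 2.0
x = (0.5, -1.0)
unit_mass = poisson_integral_2d(lambda y: 1.0, x, R)
# Boundary data from the harmonic polynomial x1^2 - x2^2.
recovered = poisson_integral_2d(lambda y: y[0]**2 - y[1]**2, x, R)
expected = x[0]**2 - x[1]**2
print(unit_mass, recovered, expected)
```

The midpoint rule is used because it converges very quickly for smooth periodic integrands; any other quadrature on the circle would serve equally well.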

Exercise 7-12 (Green’s function). For any $x \in B_R$, let $h_x(y)$ solve $\Delta h_x(y) = 0$
on $B_R$ with $h_x(y) = -\Gamma(x - y)$ for $y \in \partial B_R$. Let $G(x, y) = \Gamma(x - y) + h_x(y)$
be the Green’s function for $B_R$. Show that $P_R(x, y) = \partial G/\partial\nu_y$ for $y \in \partial B_R$ by
using Green’s identity, applying what we know about the distributional Laplacian
of $\Gamma$ (Proposition 7-1), along with the fact that $G = 0$ on $\partial B_R$.

One can compute a formula for G, using inversion in the sphere, and then
derive the above relation to the Poisson kernel [86; 107; 111]; we do not do this
here, but we recall inversion in the sphere in Section 7.1.2.3 below.
We now arrive at a key fact about the behavior of harmonic functions at an
isolated singularity.

Proposition 7-13 (removable singularities theorem). An isolated singularity of a
bounded harmonic function is removable.

Proof. By translation and scaling we can, without loss of generality, assume that
$u$ is bounded and harmonic on $B \setminus \{0\}$, where $B = B_1(0)$. Let $v$ be harmonic on
$B$, continuous on $\overline{B}$, with the same boundary values as $u$. For $n > 2$ and $\varepsilon > 0$, let
$v_\varepsilon(x) = u(x) - v(x) + \varepsilon(|x|^{2-n} - 1)$; for $n = 2$, let $v_\varepsilon(x) = u(x) - v(x) - \varepsilon \log|x|$.
Then vε is harmonic on the punctured ball, with vε = 0 on ∂ B, while (since u and
v are bounded) vε (x) → +∞ as x → 0. By the maximum principle, vε (x) ≥ 0
on B \ {0}. Letting ε ↘ 0, we conclude u − v ≥ 0 on B \ {0}, and by replacing
u with (−u) in the argument, we can also conclude u − v ≤ 0 on this domain.
Hence u = v on B \ {0}, and v is harmonic in a neighborhood of the origin. □

7.1.2.2. Positive harmonic functions. We will often be solving for a conformal
factor for a metric, and we want such a factor to be positive. So we gather a few
basic facts about positive harmonic functions in this section.

Proposition 7-14 (Liouville’s theorem). Any bounded harmonic function on $\mathbb{R}^n$
is constant. In fact, any positive harmonic function on $\mathbb{R}^n$ is a constant. Thus
any harmonic function on $\mathbb{R}^n$ bounded either above or below must be constant.

Proof. Let $u > 0$ be harmonic on $\mathbb{R}^n$. For any $x \in \mathbb{R}^n$, let $R > |x|$ and use the
MVP to write
\[
u(x) - u(0) = \frac{1}{\Omega_n R^n} \Bigl( \int_{B_R(x)} u(y)\,dy - \int_{B_R(0)} u(y)\,dy \Bigr).
\]
Since $u > 0$ and $\bigl(B_R(x) \setminus B_R(0)\bigr) \cup \bigl(B_R(0) \setminus B_R(x)\bigr) \subset B_{R+|x|}(0) \setminus B_{R-|x|}(0)$,
there is a constant $C = C(n, |x|) > 0$ such that
\[
|u(x) - u(0)| \le \frac{1}{\Omega_n R^n} \int_{B_{R+|x|}(0) \setminus B_{R-|x|}(0)} u(y)\,dy
= \frac{1}{R^n} \bigl( (R+|x|)^n - (R-|x|)^n \bigr)\,u(0) \le C\,u(0)\,R^{-1}.
\]
Let $R \to \infty$ to conclude. □
Exercise 7-15. Liouville’s theorem implies that a bounded harmonic function on
Rn must be constant. What can you say about a harmonic function u ∈ L p (Rn ),
1 ≤ p < ∞?
The following corollary is a key property of positive harmonic functions, and
its extension to other second-order elliptic operators is a deep result (see [107,
Theorem 8.20], for example). The simple proof here just relies on the MVP
(cf. [111, p. 109–110]), so that a different proof must be used for other operators.
Proposition 7-16 (Harnack inequality). Suppose $\Omega \subset \mathbb{R}^n$ is a connected open
set, and that $K \subset \Omega$ is compact. There exists $C > 0$ such that for all nonnegative
harmonic functions $u$ on $\Omega$, and for all $x, y \in K$,
\[
C^{-1} u(y) \le u(x) \le C u(y).
\]

Sketch of proof. We indicate two ways to prove a local inequality; the proof can
be completed by a covering argument [12; 107].
Suppose $x \in B_R(y)$ and $B_{3R}(y) \subset \Omega$. Then $B_R(y) \subset B_{2R}(x) \subset B_{3R}(y)$, so
that since $u \ge 0$, we obtain
\[
\frac{1}{\Omega_n R^n} \int_{B_R(y)} u\,dx \le \frac{2^n}{\Omega_n (2R)^n} \int_{B_{2R}(x)} u\,dx \le \frac{3^n}{\Omega_n (3R)^n} \int_{B_{3R}(y)} u\,dx.
\]
Applying the MVP we obtain $u(y) \le 2^n u(x) \le 3^n u(y)$.
One could also use the Poisson kernel to get a local estimate. To illustrate,
suppose $u \ge 0$ is harmonic on $\Omega \supset \overline{B}$, where $B = B_1(0)$. From the Poisson
kernel, the MVP, and the sign of $u$, we obtain for $x \in B$ the inequality
\[
\frac{1 - |x|^2}{(1 + |x|)^n}\,u(0) \le \underbrace{\frac{1}{\omega_{n-1}} \int_{\partial B} \frac{1 - |x|^2}{|x - \xi|^n}\,u(\xi)\,d\sigma_\xi}_{u(x)} \le \frac{1 - |x|^2}{(1 - |x|)^n}\,u(0). \qquad \Box
\]
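The two-sided Poisson-kernel estimate is easy to test on a concrete example. The sketch below uses the function $u(x) = 2 + x_1$ on $\mathbb{R}^3$, which is harmonic and positive on a neighborhood of the closed unit ball; this test function and the sample points are illustrative choices only.

```python
import math

def harnack_bounds(x, u_at_origin, n):
    """Lower and upper bounds for u(x), |x| < 1, from the Poisson-kernel estimate."""
    r = math.sqrt(sum(c * c for c in x))
    lower = (1.0 - r * r) / (1.0 + r)**n * u_at_origin
    upper = (1.0 - r * r) / (1.0 - r)**n * u_at_origin
    return lower, upper

def u(x):
    # Harmonic (affine) and positive on the ball of radius 2 about the origin.
    return 2.0 + x[0]

sample_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (-0.9, 0.0, 0.0), (0.3, -0.4, 0.6)]
checks = []
for p in sample_points:
    lower, upper = harnack_bounds(p, u((0.0, 0.0, 0.0)), n=3)
    checks.append(lower <= u(p) <= upper)
print(checks)
```

Note how the bounds pinch together at the origin and spread apart as $|x| \to 1$, which is why a covering argument is needed to pass to general compact sets.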
For application to the asymptotics of harmonic functions, we consider positive
harmonic functions on Rn \ {0}. We start with a fundamental result, for proof of
which we refer to [12].
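Proposition 7-17 (Bôcher’s theorem). Let $\Omega \subset \mathbb{R}^n$ be open and $a \in \Omega$. Suppose
$u$ is harmonic on $\Omega \setminus \{a\}$ and positive in a punctured neighborhood of $a$. Then
there is a unique constant $b \ge 0$ and a function $v$ harmonic on $\Omega$, such that on
$\Omega \setminus \{a\}$,
\[
u(x) = \begin{cases} b \log(|x-a|^{-1}) + v(x) & \text{for } n = 2, \\ b\,|x-a|^{2-n} + v(x) & \text{for } n > 2. \end{cases}
\]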

Liouville’s theorem tells us that positive harmonic functions on $\mathbb{R}^n$ must
be constant. We can apply Bôcher’s theorem to understand positive harmonic
functions on punctured Euclidean space.

Proposition 7-18. If $u > 0$ is harmonic on $\mathbb{R}^n \setminus \{0\}$, where $n \ge 3$, there exist
nonnegative numbers $a$ and $b$ such that $u(x) = b|x|^{2-n} + a$.
A positive harmonic function on $\mathbb{R}^2 \setminus \{0\}$ must be constant.

Proof. If $n \ge 3$, by Bôcher’s theorem, we have a function $v$ harmonic in $\mathbb{R}^n$ such
that on $\mathbb{R}^n \setminus \{0\}$, $u(x) = b|x|^{2-n} + v(x)$. Thus $\liminf_{|x|\to\infty} v(x) \ge 0$, from which
we conclude that $v$ must be constant.
The case $n = 2$ follows directly from Liouville’s theorem applied to the
harmonic function $z \mapsto u(e^z)$ on $\mathbb{R}^2 \approx \mathbb{C}$, but we suggest as an exercise to prove
it using Bôcher’s theorem. □
Example 7-19. Let $n \ge 3$. For $u > 0$, we consider a metric $g = u^{4/(n-2)} g_{\mathbb{E}^n}$ with
zero scalar curvature on $\Omega \subset \mathbb{R}^n$. As we have seen, this translates into $u$ being a
positive harmonic function on $\Omega$. Thus with the Liouville and Bôcher theorems
in mind, we see that for such $u$ nonconstant, the largest set on which we can
define $u$ is $\mathbb{R}^n$ minus a point. By translation we move the puncture to the origin,
in which case $u(x) = b|x|^{2-n} + a$ for $a \ge 0$. If $a = 0$ then $b > 0$, and as one
should readily check, the metric $g$ is then (up to constant scaling) the pullback
of $g_{\mathbb{E}^n}$ under the inversion in the unit sphere, which we recall in the next section.
On the other hand, if $a > 0$, then up to a constant rescaling and a coordinate
shift, the metric is a Riemannian Schwarzschild metric (7.1.1).

7.1.2.3. Harmonic functions at ∞. In light of the preceding example, it is natural
to consider functions which are harmonic outside a compact set, as in
Corollary 7-6. As it turns out, there is a device which transforms such functions
into functions harmonic in a punctured neighborhood of the origin. To develop
this, we recall the formula for inversion in the unit sphere, as a mapping on
$\mathbb{R}^n \cup \{\infty\}$:
\[
x \mapsto x^* = \frac{x}{|x|^2}, \qquad 0 \mapsto \infty, \qquad \infty \mapsto 0.
\]
In the $n = 2$ case, with $\mathbb{R}^2 \approx \mathbb{C}$, this map can be expressed as $z \mapsto 1/\bar{z}$. We
leave it as an exercise to show that inversion in the sphere is a homeomorphism
on the one-point compactification of Rn (which by stereographic projection is
homeomorphic to Sn ), and is a conformal transformation, so pulling back the
Euclidean metric under inversion produces a conformally Euclidean metric.
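The conformality of inversion can be checked numerically: the pullback of the Euclidean metric should be $|x|^{-4}\delta_{ij}$, which for $n \ge 3$ is $u^{4/(n-2)}\delta_{ij}$ with $u = |x|^{2-n}$, as in Example 7-19. The sketch below approximates the Jacobian by central differences in $\mathbb{R}^3$; the helper names, the step size and the sample point are illustrative assumptions.

```python
def invert(x):
    """Inversion in the unit sphere: x -> x / |x|^2."""
    s = sum(c * c for c in x)
    return [c / s for c in x]

def jacobian_columns(f, x, h=1e-6):
    """cols[k][i] approximates d f_i / d x_k by central differences."""
    n = len(x)
    cols = []
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(n)])
    return cols

def pullback_of_euclidean(f, x):
    """Components (J^T J)_{ab} of the pullback of the Euclidean metric under f."""
    cols = jacobian_columns(f, x)
    n = len(x)
    return [[sum(cols[a][i] * cols[b][i] for i in range(n)) for b in range(n)]
            for a in range(n)]

x = [0.8, -1.1, 0.5]
g = pullback_of_euclidean(invert, x)
s = sum(c * c for c in x)
conformal_factor = 1.0 / s**2            # |x|^{-4}
max_deviation = max(abs(g[a][b] - (conformal_factor if a == b else 0.0))
                    for a in range(3) for b in range(3))
round_trip = invert(invert(x))
print(max_deviation, round_trip)
```

The round trip also illustrates that inversion is an involution, $(x^*)^* = x$.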
We now use inversion in the sphere to transform harmonic functions.
Definition 7-20 (Kelvin transform). For functions $u$ defined on an open set
$\Omega \subset \mathbb{R}^n \setminus \{0\}$, define $K[u]$ on $\Omega^* = \{x : x^* \in \Omega\}$ by
\[
K[u](x) = |x|^{2-n}\,u(x^*).
\]

Exercise 7-21 (a key exercise). Prove that $u$ is harmonic on $\Omega$ if and only if $K[u]$
is harmonic on $\Omega^*$. One way to do this is to show $\Delta(K[u])(x) = |x|^{-n-2}\,(\Delta u)(x^*)$
by direct computation; also see [12, Theorem 4.4].
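Before attempting the computation by hand, one can test the identity $\Delta(K[u])(x) = |x|^{-n-2}(\Delta u)(x^*)$ numerically. The sketch below works in $\mathbb{R}^3$ with a second-order finite-difference Laplacian, applied to a harmonic polynomial and to $|x|^2$ (whose Laplacian is the constant $2n = 6$). The test functions, the step size and the sample point are illustrative assumptions.

```python
import math

def kelvin(u, n):
    """Return the Kelvin transform K[u](x) = |x|^{2-n} u(x / |x|^2)."""
    def transformed(x):
        s = sum(c * c for c in x)
        return s ** ((2 - n) / 2.0) * u([c / s for c in x])
    return transformed

def laplacian(f, x, h=1e-4):
    """Second-order central-difference approximation of the Laplacian of f at x."""
    total = 0.0
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        total += (f(xp) - 2.0 * f(x) + f(xm)) / (h * h)
    return total

def harmonic_poly(x):
    # x1^2 - x2^2 + 3*x3 is harmonic on R^3.
    return x[0]**2 - x[1]**2 + 3.0 * x[2]

def squared_norm(x):
    # |x|^2 has constant Laplacian 2n = 6 on R^3.
    return sum(c * c for c in x)

point = [0.6, -0.7, 0.9]
r = math.sqrt(sum(c * c for c in point))
lap_kelvin_harmonic = laplacian(kelvin(harmonic_poly, 3), point)
lap_kelvin_quadratic = laplacian(kelvin(squared_norm, 3), point)
predicted_quadratic = r ** (-5) * 6.0      # |x|^{-n-2} * (Laplacian of |x|^2)
print(lap_kelvin_harmonic, lap_kelvin_quadratic, predicted_quadratic)
```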
The Kelvin transform preserves harmonicity, transforming a harmonic function
in a neighborhood of infinity to one defined in a neighborhood of the origin. We
analyze conditions on the harmonic function near infinity that are encoded in the
transform near the origin.
Definition 7-22. We say a function u is harmonic near infinity if there is a
compact set K such that u is harmonic on Rn \ K . If u is harmonic near infinity,
we say u is harmonic at infinity provided K [u] has a removable singularity at
the origin.
The function in Corollary 7-6 is harmonic at infinity:
Proposition 7-23. Let n ≥ 3 and suppose u is harmonic near infinity. Then u is
harmonic at infinity if and only if lim|x|→∞ u(x) = 0.
Proof. From u(x) = |x|2−n K [u](x ∗ ), one direction above is trivial. For the
converse, assuming lim|x|→∞ u(x) = 0, we can adapt the proof of the removable
singularities theorem above to conclude. □
We remark that the Kelvin transform can be used to solve the exterior Dirichlet
problem, to find a harmonic function on the exterior of a ball with given values on
the boundary sphere, by inverting the solution of the interior Dirichlet problem on
the ball. The result is a function which is, by definition, harmonic at infinity; by
the preceding proposition, it decays to zero at infinity. Thus, given a continuous
function on the boundary of a ball, there is a unique function that goes to zero at
infinity with given boundary values. Compare the following simple example.
Example 7-24. Let n ≥ 3 and B = B1 (0). The function u(x) = 1 − |x|2−n is
harmonic on |x| ̸= 0, and u = 0 on ∂ B. The unique function harmonic at infinity
which solves the exterior Dirichlet problem with zero boundary data on ∂ B is the
zero function. The harmonic function u is not harmonic at infinity, but u − 1 is.
Proposition 7-25. Let n ≥ 3, and let u be harmonic near infinity. The following
statements are equivalent:
• u is bounded near infinity.
• u is bounded above or below near infinity.
• There is a constant a such that u − a is harmonic at infinity.
• u has a finite limit as |x| → ∞.
Proof. Most of the steps are straightforward. One step involves using Bôcher’s
theorem: if $u > 0$ near infinity, then $K[u] > 0$ near the origin, so by Bôcher, there
is an $a$ such that $K[u](x) - a|x|^{2-n} = v(x)$ is harmonic near the origin. So
$K[v](x) = u(x) - a$ is harmonic at infinity. □
7.1.2.4. Spherical harmonic expansion. Let n ≥ 3. We note that for a function
harmonic near the origin, if you group the terms in the Taylor expansion into
homogeneous polynomials, each such is harmonic [12]. If v is harmonic at
infinity, by applying the Taylor expansion of $K[v]$ about $x^* = 0$ we have (recall
that $B \cdot x = \sum_{i=1}^n B^i x^i$),
\[
|x^*|^{2-n}\,v(x) = K[v](x^*) = A + B \cdot x^* + \cdots.
\]
Thus
\[
v(x) = A\,|x^*|^{n-2} + B \cdot x^*\,|x^*|^{n-2} + \cdots = \frac{A}{|x|^{n-2}} + \frac{B \cdot x}{|x|^n} + O_\infty(|x|^{-n});
\]
cf. Corollary 7-6. One can compute higher-order terms in the expansion, which
come from the homogeneous harmonic polynomials of degree higher than one.
Exercise 7-26. In the case n = 3, compute the next term in the expansion of v.
What is the dimension of the space of homogeneous harmonic polynomials of
degree two in dimension three? (The answer is five; see [12].)
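The count in the preceding exercise can be reproduced from a standard dimension formula: the space of homogeneous polynomials of degree $k$ in $n$ variables has dimension $\binom{n+k-1}{k}$, and the Laplacian maps it onto the homogeneous polynomials of degree $k-2$, so the harmonic ones form a space of dimension $\binom{n+k-1}{k} - \binom{n+k-3}{k-2}$ for $k \ge 2$. The sketch below simply evaluates this formula; the function names are illustrative.

```python
from math import comb

def dim_homogeneous(n, k):
    """Dimension of the homogeneous polynomials of degree k in n variables."""
    return comb(n + k - 1, k)

def dim_harmonic_homogeneous(n, k):
    """Dimension of the harmonic homogeneous polynomials of degree k in n variables."""
    if k < 2:
        # Constants and linear functions are all harmonic.
        return dim_homogeneous(n, k)
    return dim_homogeneous(n, k) - dim_homogeneous(n, k - 2)

dims_in_dimension_three = [dim_harmonic_homogeneous(3, k) for k in range(5)]
print(dims_in_dimension_three)
```

In dimension three the formula reduces to $2k + 1$, the familiar multiplicity of spherical harmonics of degree $k$.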
We next explain why this is called the spherical harmonic expansion.
Remark 7-27. The eigenfunctions of $\Delta_{\mathring{g}}$ on the round unit sphere $(S^n, \mathring{g})$ (i.e.,
nontrivial solutions of $\Delta_{\mathring{g}} u + \lambda u = 0$ for some constant $\lambda$) are given by restrictions
of the homogeneous harmonic polynomials on $\mathbb{R}^{n+1}$. The lowest eigenvalue
is $\lambda_0 = 0$ (constant eigenfunctions), and the next eigenvalue is $\lambda_1 = n$, with
eigenfunctions $x^i$, $i = 1, \ldots, n+1$, restricted to the sphere [12], cf. Exercise
2-43.
Suppose $g = u^{4/(n-2)} g_{\mathbb{E}^n}$ is harmonically flat at infinity. We may assume, rescaling
if necessary, that $\lim_{|x|\to\infty} u(x) = 1$. Then $u(x)$ admits a spherical harmonic
expansion near infinity, as $u(x) = 1 + A/|x|^{n-2} + B \cdot x/|x|^n + O_\infty(|x|^{-n})$, cf.
Corollary 7-6. As we remarked earlier, we let m = 2A in analogy with the
Schwarzschild metric, and the vector B is related to the center of mass, as we
point out in the next exercise. Note that metrics which are harmonically flat
at infinity asymptote to the Euclidean metric, with the difference on the order
O(|x|−(n−2) ). The leading order deviation in the metric g from the flat metric
comes from the mass term in the expansion. Harmonically flat metrics might be
thought of as asymptotic to Schwarzschild, since g − g S , for g S given in (7.1.1)
with mass m = 2A, decays as O(|x|−(n−1) ), one order better than the deviation
from Euclidean. Actually, if we recenter coordinates, we can make the decay
another order better, as in the next exercise.
Exercise 7-28. Suppose $g = u^{4/(n-2)} g_{\mathbb{E}^n}$ is harmonically flat at infinity, with
expansion $u(x) = 1 + A/|x|^{n-2} + B \cdot x/|x|^n + O_\infty(|x|^{-n})$ for $|x| > r_0$. Translate
coordinates to $y = x - a$, for $a \in \mathbb{R}^n$. For $|y + a| > r_0$, find the asymptotic
expansion of $u(x) = u(y + a)$ in terms of $y$. Show that in case $A \ne 0$, there is a
unique $c \in \mathbb{R}^n$ for which $u(y + c) = 1 + A/|y|^{n-2} + O(|y|^{-n})$.
With this important class of initial data sets in mind, we move to more general
asymptotically flat initial data in the next section.

7.2. Asymptotically flat initial data

We will give a starting definition of asymptotically flat initial data (g, K ), and
remark later on natural generalizations of the definition in terms of how the
regularity and decay assumptions are imposed. For simplicity, we will assume
g and K are smooth; the interested reader can discern when a lower regularity
level is sufficient. Recall the notation $O_\ell(|x|^s; \Omega)$ from Definition 7-5.
Definition 7-29. Let $n \ge 3$, $\ell \in \mathbb{Z}_+$, and $q > 0$. An $n$-dimensional Riemannian
manifold $(\mathcal{E}, g)$ is called an asymptotically flat (or asymptotically Euclidean) end,
with rate $q$ and order $\ell + 1$, if $\mathcal{E}$ admits asymptotically flat coordinates $x$ giving
a diffeomorphism of $\mathcal{E}$ and $\Omega := \mathbb{R}^n \setminus \{|x| \le 1\}$, in which, for $i, j \in \{1, \ldots, n\}$,
\[
g_{ij}(x) - \delta_{ij} = O_{\ell+1}(|x|^{-q}; \Omega).
\]
An asymptotically flat initial data set $(g, K)$ on $\mathcal{E}$ will consist of an asymptotically
flat metric $g$ on $\mathcal{E}$ as above, along with an associated symmetric $(0, 2)$-tensor
$K$ (equivalently, $\pi = K - (\mathrm{tr}_g K) g$) satisfying for $i, j \in \{1, \ldots, n\}$
\[
K_{ij}(x) = O_\ell(|x|^{-q-1}; \Omega).
\]
We remark that sometimes in the definition one also imposes some extra decay
requirement on $\Phi(g, K)$, i.e., on the energy-momentum density $(\rho, J)$ of (5.2.2)–
(5.2.3), beyond that which comes from the conditions on g and K above.
We may refer to either (E , g), (E , g, K ) or (E , g, π) as an asymptotically
flat end. (We emphasize that q here is different than in the conformal method
from last chapter.) Furthermore, while the definition makes sense for $q > 0$, we
will (unless otherwise noted) restrict to $q > \frac{n-2}{2}$, so as to have a well-defined
energy-momentum vector of an asymptotically flat end. We will often remind the
reader where computations in an end are done with respect to asymptotically flat
coordinates x, though we may carry on without explicit remark, and we note that
in the definition it is equivalent to employ asymptotic coordinates x for which
the end E corresponds to |x| > r0 ≥ 1.
Definition 7-30. A connected complete n-dimensional Riemannian manifold
(M, g) is called asymptotically flat (or asymptotically Euclidean) if there is a
compact set C ⊂ M and a k ∈ Z+ such that the components of M \ C are given
by the (open) sets E1 , . . . , Ek , each of which is an asymptotically flat end (E j , g).
An asymptotically flat initial data set (M, g, K ) (or (M, g, π)) will consist of
an asymptotically flat (M, g) along with an associated symmetric (0, 2)-tensor
K (or π = K − (trg K )g) so that g and K satisfy the asymptotic conditions in
Definition 7-29 on each end.
Metrics on Rn that are harmonically flat at infinity are also asymptotically flat
with q = n − 2. For example, the Riemannian Schwarzschild metric, given on
Rn \{0} by (7.1.1) with m > 0, is asymptotically flat with two ends (Exercise 2-37).
For $m < 0$, the Riemannian Schwarzschild metric $g_S$ on $\mathbb{R}^n \setminus \{|x| \le -\frac{m}{2}\}$
includes one asymptotically flat end, but as we discussed earlier, this space is
not complete. If we excise the singularity in $g_S$ for $m < 0$ by restricting to a
manifold $N = \mathbb{R}^n \setminus \{|x| \le \delta - \frac{m}{2}\}$ for any $\delta > 0$, we see the mean curvature
vector $H$ of $\partial N$ points out of $N$ (cf. Exercise X-3). On the other hand, in case
$m > 0$, if we let $N = \mathbb{R}^n \setminus \{|x| \le \frac{m}{2} - \delta\}$ for $0 < \delta < \frac{m}{2}$, then the mean curvature
vector of ∂ N points into N , in contrast with the m < 0 case. Thus should
one seek to formulate natural conditions to guarantee a nonnegative mass, this
example clearly shows if we allow boundary components in initial data sets, we
would need to impose some such condition on the boundary. For simplicity, we
will consider the case without boundary in this chapter (whereas asymptotically
flat metrics with minimal surface boundary components will play a role in the
formulation of the Riemannian Penrose inequality in Chapter 9). Even without
boundary, we also need to impose some condition on the constraints operator, i.e.,
on the energy-momentum density (ρ, J ), else we could simply patch together any
initial data (g, K ) to a Schwarzschild end in an elementary manner, producing
an asymptotically flat initial data set, with whatever mass m we choose. We will
discuss this further in the section on the positive mass theorem below, where
we will also see that under a natural energy condition, requiring a decay rate
q > n − 2 above that of harmonically flat metrics will yield only trivial solutions
to the constraints, i.e., data from a slice in Minkowski spacetime.
For certain results it will be equivalent to compute quantities in an asymptotic
end (possibly sufficiently far out in an end) with respect to g or with respect to the
background Euclidean metric induced from the asymptotically flat coordinates.
Sometimes we will just refer to such coordinates, but we could also refer to
a chosen background metric g̊ on M for which g̊i j = δi j in the asymptotic
coordinates near infinity in each end. For example, norms | · |g and | · |g̊ are
uniformly equivalent, and are asymptotic to each other near infinity in any
end. The difference tensor $(\nabla - \mathring{\nabla})$ between the two Levi-Civita connections is
easily seen to be $O(|x|^{-q-1})$, with norm measured equivalently with respect to
either metric, and so in some asymptotic calculations, the background Euclidean
connection can be used in place of the metric connection from g.
Likewise, for an asymptotically flat end (E , g), since g and g En are uniformly
equivalent, so are their corresponding volume measures. Thus, for a (Borel
measurable) function f , f ∈ L 1 (E , dvg ) if and only if f ∈ L 1 (E , d x), and so
we say f is integrable on E ( f ∈ L 1 (E )) under these equivalent conditions. For
(M, g) an asymptotically flat manifold, a locally integrable function f is in
L 1 (M, dvg ) precisely when it belongs to L 1 (E ) for each end E of M. In terms
of a chosen background metric g̊ on M for which g̊ = g En near infinity in each
end as above, f ∈ L 1 (M, dvg ) if and only if f ∈ L 1 (M, dvg̊ ), which we define
to be simply f ∈ L 1 (M). We note that with 2q + 2 > n, |x|−2q−2 ∈ L 1 (E ), so
that |K |2g ∈ L 1 (E ) for such asymptotically flat (E , g, K ). Thus from the Einstein
constraint equation (5.2.2) (with $\Lambda = 0$), we see that $R(g) \in L^1(\mathcal{E})$ if and only if
ρ ∈ L 1 (E ). In terms of modeling an isolated system, it seems to be a reasonable
assumption to pose that the total energy density of the matter fields is integrable
across an asymptotically flat spatial slice, and similarly, for momentum density
J , which is proportional to divg π by the constraint (5.2.3).
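The integrability claim above rests on the elementary fact that $|x|^{-s}$ is integrable on $\{|x| > 1\} \subset \mathbb{R}^n$ exactly when $s > n$, in which case polar coordinates give the value $\omega_{n-1}/(s-n)$. The sketch below encodes this, together with the resulting criterion $2q + 2 > n$ for the bound $|K|_g^2 = O(|x|^{-2q-2})$; the function names are illustrative.

```python
import math

def sphere_area(n):
    """Area omega_{n-1} of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def exterior_integral(n, s):
    """Integral of |x|^{-s} over {|x| > 1} in R^n (math.inf if it diverges)."""
    if s <= n:
        return math.inf
    return sphere_area(n) / (s - n)

def decay_bound_integrable(n, q):
    """True when |x|^{-2q-2} is integrable near infinity in R^n, i.e. 2q + 2 > n."""
    return 2.0 * q + 2.0 > n

print(exterior_integral(3, 4.0), decay_bound_integrable(3, 0.5), decay_bound_integrable(3, 1.0))
```

The borderline rate $q = \frac{n-2}{2}$ gives $2q + 2 = n$, which is exactly where integrability fails; this is one way to see why the strict inequality $q > \frac{n-2}{2}$ is imposed.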
Before moving on, we note that while the Einstein summation convention
remains in effect, when we have certain expressions that require a summation
notation symbol for at least one index, we may, for the sake of clarity, extend the
summation notation to indices that are already covered by the Einstein convention.

7.2.1. The ADM energy and momenta. We now want to define an energy-
momentum vector for an isolated gravitational system, to capture the energy-
momentum of the matter fields along with the gravitational field. This will not
amount, then, to simply integrating the energy density ρ, since, for example, the
Schwarzschild spacetime is vacuum, ρ = 0, while we have seen that far field
observers will detect what they might describe as a gravitational field outside a
massive object. As we will see, the energy-momentum vector will be defined as a
limit of flux integrals, for which (7.1.7) gives a Newtonian analogue. We require
$q > \frac{n-2}{2}$ in various places in this section, with or without explicit mention.
That said, one might look to a Hamiltonian to define an appropriate notion
of energy of a system. We saw the ADM Hamiltonian in Section 5.3.3.2, cf.
(5.3.10), and how it generated the equations of motion (5.3.18); we also noted
(cf. Remark 5-22 and Exercise 5-30) that the Hamiltonian is modified when
boundary terms need to be taken into account. In the asymptotically flat case,
metrics we want to consider with finite nonzero energy have decay rates which
necessitate modifying the Hamiltonian by such boundary terms (at infinity) (see
[16; 187], and earlier works [9; 10; 75]).
Indeed, we consider $g$ and $\pi = -\hat{\pi}$ with decay rates in asymptotic coordinates
of the orders $g_{ij} - \delta_{ij} = O_{\ell+1}(|x|^{-q})$ and $\pi^{ij} = O_\ell(|x|^{-q-1})$. Given a timelike
vector field $V$ along $M$, we can extend these coordinates ($x \mapsto \varphi(x) \in M$) to
spacetime coordinates $(t, x)$ such that $\partial/\partial t = V$ along $M$, for example by using
the exponential map $(t, x) \mapsto \exp_{\varphi(x)}(t V|_{\varphi(x)})$. Doing this with a timelike unit
normal field along $M$, we get Fermi (Gaussian normal) coordinates $(\mathring{t}, x)$, in
which the metric has the form $\bar{g} = -d\mathring{t}^2 + \gamma_{ij}\,dx^i dx^j$, with $\gamma_{ij}|_{\mathring{t}=0} = g_{ij}$. Given
such Fermi coordinates we let $V = N\,\partial/\partial\mathring{t} + X^i\,\partial/\partial x^i$ be a timelike vector field
along $M$, and let $(t, x)$ be adapted coordinates with $\partial/\partial t = V$. We extend $N$,
$X^i$ and $g_{ij}$ off of $M$ ($t = 0$) so that $N\,\partial/\partial\mathring{t} + X^i\,\partial/\partial x^i = \partial/\partial t$ and the metric $\bar{g}$
assumes the lapse-shift form $\bar{g} = -N^2 dt^2 + g_{ij}(dx^i + X^i dt) \otimes (dx^j + X^j dt)$;
Fermi coordinates correspond to $N = 1$, $X = 0$. More generally, we will
take $N - X^0_\infty = O_{\ell+1}(|x|^{-q})$, $X^i - X^i_\infty = O_{\ell+1}(|x|^{-q})$, for some constants
$X^\mu_\infty$; cf. [16]. We let $\mathring{g}$ be a background Riemannian metric as above, with
$\mathring{g}_{ij} = \delta_{ij}$ in asymptotically flat coordinates near infinity, and let $d\mathring{v} = dv_{\mathring{g}}$. As in
Section 5.3.3.2, we let $\theta_g = \sqrt{\det g / \det \mathring{g}}$, and $\tilde{\pi} = \theta_g \hat{\pi}$.

From the derivation that led to equation (5.3.18), we see that what is needed
to generate the correct equations of motion in the asymptotically flat context
is a Hamiltonian whose first variation in the direction $(h, \tilde{\sigma})$ is equal to
$\int_M (h, \tilde{\sigma}) \cdot_g D\tilde{\Phi}^*_{(g,\tilde{\pi})}(N, X)\,d\mathring{v}$. The variation $\delta H_{ADM}$ of $H_{ADM}$ is given by
$\delta H_{ADM} = (D H_{ADM})_{(g,\tilde{\pi})}(h, \tilde{\sigma}) = \int_M (N, X) \cdot D\tilde{\Phi}_{(g,\tilde{\pi})}(h, \tilde{\sigma})\,d\mathring{v}$. (There is another
complication, since for such data as above the integral for $H_{ADM}$ might not
converge; we can either restrict to data for which it does converge, or regularize
by subtracting off some background terms, as in [16].) To try to put $\delta H_{ADM}$
into the desired form, we integrate by parts, but for bounded (N , X ), and for h
decaying at the same rate as (g − g̊) and σ̃ decaying at the same rate as π̃, this
does not necessarily give us the required identity in general: if we use a spherical
exhaustion of the asymptotic ends, say, then integration by parts leaves terms on
the boundary that may not limit to zero as the spheres tend to infinity.
We claim that the terms from $\delta H_{ADM} = (D H_{ADM})_{(g,\tilde{\pi})}(h, \tilde{\sigma})$ that require
analysis are $\int_M -\bigl( N L_g h + 2(\mathrm{div}_g \hat{\sigma})_i X^i \bigr)\,dv_g$, with the other terms either not
involving derivatives of $(h, \tilde{\sigma})$ (and so not contributing a boundary term) or
decaying fast enough so that the contributed boundary integral upon integration
by parts tends to zero at infinity. For instance, we infer from equation (5-28a)
(Exercise 5-28) that the boundary terms to focus on will come from integration
by parts on $\int_M (N, X) \cdot D\hat{\Phi}_{(g,\hat{\pi})}(h, \hat{\sigma})\,dv_g$. It is easy to see the only boundary
terms from the $N$-component of the integrand will come from $-N L_g h$, cf.
equation (5-29b). As for the $X$-component, the relevant terms arise from the
variation of $-2(\mathrm{div}_g \hat{\pi})_i X^i = -2 g_{ij} X^i (\hat{\pi}^{jk}{}_{,k} + \hat{\pi}^{mk} \Gamma^j_{mk} + \hat{\pi}^{jm} \Gamma^k_{km})$, given by (see
(5.3.9)–(5.3.10))
\[
-2 h_{ij} X^i (\mathrm{div}_g \hat{\pi})^j - 2 X^i (\mathrm{div}_g \hat{\sigma})_i - 2 g_{ij} X^i \bigl( \hat{\pi}^{mk}\,\delta\Gamma^j_{mk} + \hat{\pi}^{jm}\,\delta\Gamma^k_{km} \bigr).
\]

The first of these terms will not contribute to the boundary integral, while for
the last term we recall that $\delta\Gamma^j_{mk} = \frac{1}{2} g^{js} (h_{sk;m} + h_{ms;k} - h_{mk;s})$. Integrating a
term like $2 g_{ij} X^i \hat{\pi}^{mk}\,\delta\Gamma^j_{mk}$ by parts results in several boundary integral terms, the
first of which is $X^i \hat{\pi}^{mk} h_{ik} (\nu_g)_m = O(|x|^{-2q-1})$, which tends to zero in a limit
of integrals over large coordinate spheres $\{|x| = r\}$ as $r \to \infty$ with $q > \frac{n-2}{2}$;
the other terms behave similarly.


As for the integral $\int_M -(N L_g h + 2(\mathrm{div}_g \hat{\sigma})_i X^i)\,dv_g$, the term $-N L_g h =
-N(-\Delta_g \mathrm{tr}_g h + g^{ik} g^{jm} h_{ij;km} - h \cdot_g \mathrm{Ric}(g))$ contributes a boundary integrand
\[
\bigl( N (\mathrm{tr}_g h)_{,j} - N_{,j}\,\mathrm{tr}_g h - N g^{ik} h_{ij;k} + N_{,m}\,g^{sm} h_{js} \bigr)\,\nu_g^j
= N \sum_{i,j=1}^n (h_{ii,j} - h_{ij,i})\,\nu^j + O(|x|^{-2q-1}),
\]
where we swapped out the normal vector for the Euclidean normal (see Exercise
7-35 below). For the divergence term, the boundary integrand is simply
\[
-2 \hat{\sigma}_{ij} X^i \nu_g^j = -2 \hat{\sigma}_{ij} X^i \nu^j + O(|x|^{-2q-1}).
\]


In order to get the desired linearization for the Hamiltonian, and subsequently
get the correct equations of motion, we need to add terms to HADM to cancel
out these boundary contributions, resulting in what we will call HRT , so that the
variation δ HRT in the direction (h, σ̃ ) = (h, θg σ̂ ) is given by, using Exercise 7-35
to exchange the measure dσg with the Euclidean measure dσ without affecting
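the existence and value of the limit,
\[
\delta H_{RT} = \lim_{r\to\infty} \int_{\{|x|=r\}} \Bigl( N \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\,\nu^j + 2 \hat{\sigma}_{ij} X^i \nu^j \Bigr)\,d\sigma + \delta H_{ADM}.
\]
Thus we define, when the limit exists,
\[
H_{RT} = \lim_{r\to\infty} \int_{\{|x|=r\}} \Bigl( N \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j + 2 \hat{\pi}_{ij} X^i \nu^j \Bigr)\,d\sigma + H_{ADM}.
\]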

We write
\[
H_{RT} = 2\kappa\,H^\kappa_{RT},
\]
so that $H^\kappa_{RT}$ has units of energy (see Remark 5-23). For solutions of the vacuum
constraint equations, we have $H_{ADM} = 0$, and so should the boundary integrals
exist, the value of the Hamiltonian $H^\kappa_{RT}$ would be
\[
X^0_\infty \cdot \frac{1}{2\kappa} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\,d\sigma
\;-\; X^i_\infty \cdot \frac{1}{\kappa} \lim_{r\to\infty} \int_{\{|x|=r\}} \pi_{ij}\,\nu^j\,d\sigma. \tag{7.2.1}
\]

We will henceforth use units for which $c = 1$, $G = 1$ and $\kappa = (n-1)\omega_{n-1}$.
If $X^\mu_\infty\,\partial/\partial x^\mu$ is a timelike unit vector (relative to the asymptotic background
Minkowski spacetime) tangent to the spacetime path associated to an observer
“at infinity” (far out in the asymptotic end), then (7.2.1) should represent the
energy the observer would ascribe to the isolated gravitational system. Thus we
can isolate boundary integrals which yield the ADM energy-momentum vector
associated to the isolated gravitational system, which we do below.
Before defining the ADM energy-momentum vector, we analyze the con-
vergence of the asymptotic boundary integrals more carefully, starting with an
expansion of the constraint operator in an asymptotic end.

Proposition 7-31. Suppose $(\mathcal{E}, g)$ is an asymptotically flat end, with asymptotically
flat coordinates $x$ with respect to which we take a background Euclidean
metric $g_{\mathbb{E}^n}$ with components $\delta_{ij}$, and we write $g_{ij} = \delta_{ij} + h_{ij}$. Then the scalar
curvature $R(g)$ of $g$ satisfies the expansion
\[
R(g) = \sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj}) + O(|x|^{-2q-2}) = L_{g_{\mathbb{E}^n}} h + O(|x|^{-2q-2}),
\]
where $L_{g_{\mathbb{E}^n}}$ is the linearization of the scalar curvature operator at the Euclidean
metric, $L_{g_{\mathbb{E}^n}} h = -\Delta_{g_{\mathbb{E}^n}} (\mathrm{tr}_{g_{\mathbb{E}^n}} h) + \mathrm{div}_{g_{\mathbb{E}^n}} \mathrm{div}_{g_{\mathbb{E}^n}} h$ (cf. Lemma 2-7).

Proof. The second formula follows directly from the first. In the asymptotic
coordinates, $\Gamma^k_{ij} = O(|x|^{-q-1})$. Thus for the scalar curvature we find
\[
R(g) = g^{ij} \bigl( \Gamma^k_{ij,k} - \Gamma^k_{ik,j} + \Gamma^k_{km} \Gamma^m_{ij} - \Gamma^m_{jk} \Gamma^k_{im} \bigr) = g^{ij} (\Gamma^k_{ij,k} - \Gamma^k_{ik,j}) + O(|x|^{-2q-2}).
\]
From $g^{km}{}_{,\ell} = -g^{ki} g_{ij,\ell}\,g^{jm} = O(|x|^{-q-1})$, we find
\[
\Gamma^k_{ij,\ell} = \tfrac{1}{2} g^{km} (g_{mj,i\ell} + g_{im,j\ell} - g_{ij,m\ell}) + O(|x|^{-2q-2}).
\]
Since $g_{ij} - \delta_{ij} = O(|x|^{-q})$ we have $g^{ij} - \delta^{ij} = O(|x|^{-q})$ as well, so that
\begin{align*}
R(g) &= g^{ij} (\Gamma^k_{ij,k} - \Gamma^k_{ik,j}) + O(|x|^{-2q-2}) = \delta^{ij} (\Gamma^k_{ij,k} - \Gamma^k_{ik,j}) + O(|x|^{-2q-2}) \\
&= \tfrac{1}{2} \delta^{ij} \delta^{km} \bigl( (g_{mj,ik} + g_{im,jk} - g_{ij,mk}) - (g_{mk,ij} + g_{im,kj} - g_{ik,mj}) \bigr) + O(|x|^{-2q-2}),
\end{align*}
from which the desired result follows. □

Exercise 7-32. Show that if $(\mathcal{E}, g, \pi)$ is an asymptotically flat end, then in
asymptotically flat coordinates $x$,
\[
\mathrm{div}_g \pi = \mathrm{div}_{g_{\mathbb{E}^n}} \pi + O(|x|^{-2q-2}).
\]
Thus we see that, with $q > \frac{n-2}{2}$, if $\rho$ and $J$ from the constraint equations are
integrable on $\mathcal{E}$, then so are $\sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj}) = L_{g_{\mathbb{E}^n}} h$ and $\mathrm{div}_{g_{\mathbb{E}^n}} \pi$, which
immediately gives us the following proposition.

Proposition 7-33. Suppose $(\mathcal{E}, g, K)$ is an asymptotically flat initial data set
on an end $\mathcal{E}$, with $q > \frac{n-2}{2}$ and $R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 = 2\kappa\rho$ and $\mathrm{div}_g \pi = \kappa J$
integrable on $\mathcal{E}$. Then the following limits, computed in asymptotically flat
coordinates $x$, exist, where $\nu^j = x^j/|x|$:
\[
\lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\,d\sigma, \qquad
\lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{j=1}^n \pi_{ij}\,\nu^j\,d\sigma.
\]

Proof. The proof is by application of the divergence theorem: for $r > r_0 > 1$,
\[
\int_{\{|x|=r\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\,d\sigma - \int_{\{|x|=r_0\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\,d\sigma
= \int_{\{r_0 \le |x| \le r\}} \sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj})\,dx.
\]
Since, as remarked just above, the integrand is integrable, we see that the difference
in flux integrals can be made small by taking $r_0$ sufficiently large. Similarly,
the second limit follows from
\[
\int_{\{|x|=r\}} \sum_{j=1}^n \pi_{ij}\,\nu^j\,d\sigma - \int_{\{|x|=r_0\}} \sum_{j=1}^n \pi_{ij}\,\nu^j\,d\sigma = \int_{\{r_0 \le |x| \le r\}} (\mathrm{div}_{g_{\mathbb{E}^n}} \pi)_i\,dx. \qquad \Box
\]

We now define the ADM energy-momentum vector.

Definition 7-34. Let $(\mathcal{E}, g, \pi)$ be an $n$-dimensional asymptotically flat end, with
asymptotically flat coordinates $x$, and with $\rho$ and $J$ integrable. We define the
ADM energy $E$ and linear momentum $P$ as follows:
\[
E = \frac{1}{2(n-1)\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\,d\sigma, \tag{7.2.2}
\]
\[
P_i = \frac{1}{(n-1)\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{j=1}^n \pi_{ij}\,\nu^j\,d\sigma. \tag{7.2.3}
\]
Here $\omega_{n-1}$ is the area of the unit $(n-1)$-sphere, so that in case $n = 3$, say, the
constant in front of the $E$-integral is $\frac{1}{16\pi}$.

If the ADM energy-momentum vector is future-pointing causal, i.e. $E \ge |P|$
($c = 1$), we let $m = \sqrt{E^2 - |P|^2}$ define the ADM mass $m$. In the time-symmetric
case $(\mathcal{E}, g)$, it is customary to call $m = E$ the ADM mass (regardless of its sign).
The integrand in (7.2.2) equals $\sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\,\nu^j$, for $h_{ij} = g_{ij} - \delta_{ij}$. As
a corollary of the next simple exercise, the normal vector and the measure in
the above integrals (7.2.2)–(7.2.3) can be replaced by $\nu_g$ and $d\sigma_g$, respectively,
without altering the limit, for $q > \frac{n-2}{2}$.

Exercise 7-35. a. Show that (as measures on $\{|x| = r\}$) $|d\sigma_g - d\sigma| \le O(r^{-q})\,d\sigma$.
b. Suppose that $\nu = x^j |x|^{-1}\,\partial/\partial x^j$, while $\nu_g$ is the $g$-normal unit vector field to
the coordinate spheres of constant $|x|$ in an asymptotic end $\mathcal{E}$, with $\langle \nu_g, \nu \rangle_g > 0$
(equivalently $\langle \nu_g, \nu \rangle_{g_{\mathbb{E}^n}} > 0$). Show that $|\nu_g - \nu| = O(|x|^{-q})$.
c. Conclude as noted above that $\nu$ and $d\sigma$ can be replaced by $\nu_g$ and $d\sigma_g$,
respectively, in (7.2.2)–(7.2.3) without changing the values of the limits.
As you might expect from the way in which the existence of the limits is
established, one can show that they can be computed on more general surfaces
tending to infinity, and moreover the values are independent of the asymptotically
flat coordinate chart in which they are computed; see, e.g., [15; 16].
For coordinates $x^\mu$ in which a spacetime metric asymptotes appropriately to a Minkowski metric with these as inertial coordinates, we let $E = P^0 = -P_0$ and $P^i = P_i$ be defined using these asymptotic coordinates on a constant time slice, and we define the ADM energy-momentum four-vector $P = P^\mu\, \partial/\partial x^\mu$. Using (7.2.1), we see that if $U_{\mathrm{obs}} = X_\infty^\mu\, \partial/\partial x^\mu$ is a timelike unit vector tangent to the spacetime path of an observer “at infinity”, then the energy of the isolated gravitational system as measured by the observer should be $E_{\mathrm{obs}} = -\langle U_{\mathrm{obs}}, P \rangle = -X_\infty^\mu P_\mu = X_\infty^0 E - X_\infty^i P_i$.
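As a worked instance of the formula for $E_{\mathrm{obs}}$, the following sketch takes an observer moving with constant velocity $v$ relative to the asymptotic frame. The velocity $v$, the Lorentz factor $\gamma$, and the signature convention $\eta = \mathrm{diag}(-1, 1, \dots, 1)$ are assumptions introduced here for illustration; the last step additionally assumes $E > |P|$.

```latex
% Assumed for illustration: eta = diag(-1,1,...,1), observer velocity v, |v| < 1.
\begin{align*}
X_\infty &= \gamma\,(1, v), \qquad \gamma = (1-|v|^2)^{-1/2}, \qquad
  \langle U_{\mathrm{obs}}, U_{\mathrm{obs}} \rangle = \gamma^2\,(-1 + |v|^2) = -1, \\
E_{\mathrm{obs}} &= X_\infty^0 E - X_\infty^i P_i = \gamma\,(E - v \cdot P). \\
\intertext{For the comoving observer $v = P/E$ (assuming $E > |P|$):}
\gamma &= \frac{E}{\sqrt{E^2 - |P|^2}}, \qquad
E_{\mathrm{obs}} = \frac{E}{\sqrt{E^2 - |P|^2}} \cdot \frac{E^2 - |P|^2}{E}
  = \sqrt{E^2 - |P|^2} = m.
\end{align*}
```

Under these assumptions, the ADM mass is the energy measured in the asymptotic rest frame, consistent with the definition $m = \sqrt{E^2 - |P|^2}$.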
In the next exercise we see the energy integral in fact picks up the mass m in
the Schwarzschild metric, and more generally it picks up the relevant coefficient
in the spherical harmonic expansion of the conformal factor for harmonically
flat metrics. Note that pairing K = 0 with each of these metrics gives solutions
to the vacuum constraint equations, and thus E can be nonzero even in vacuum.
Exercise 7-36. a. Show that $E = m$ for an asymptotically flat end in the Riemannian Schwarzschild metric $g_S$ (7.1.1).
b. Show that $E = 2A$ for any harmonically flat end $(\mathcal{E}, u^{\frac{4}{n-2}} g_{\mathbb{E}^n})$ with $\Delta u = 0$ and $u(x) = 1 + A/|x|^{n-2} + O_\infty(|x|^{-(n-1)})$.
c. Show that $E = 2ab$ for $(\mathcal{E}, u^{\frac{4}{n-2}} g_{\mathbb{E}^n})$, with $u > 0$, $\Delta u = 0$, and $u(x) = a + b/|x|^{n-2} + O_\infty(|x|^{-(n-1)})$. (Hint: Rescale to asymptotically flat coordinates.)
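As a quick consistency check of part a. against part b., one can read off the coefficient $A$ directly. The sketch below assumes that (7.1.1), which is not reproduced in this section, is the usual isotropic form of the Riemannian Schwarzschild metric.

```latex
% Assumption: (7.1.1) is the isotropic (conformally flat) form of g_S.
\begin{align*}
g_S &= \Bigl( 1 + \frac{m}{2|x|^{n-2}} \Bigr)^{\frac{4}{n-2}} g_{\mathbb{E}^n},
\qquad u(x) = 1 + \frac{m}{2|x|^{n-2}},
\qquad \Delta u = 0 \ \text{ for } x \neq 0, \\
A &= \frac{m}{2} \quad \Longrightarrow \quad E = 2A = m.
\end{align*}
```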
Part b. of the preceding exercise is a special case of a result about the effect
of a conformal change on the ADM energy that will be important later in the
chapter.
Lemma 7-37. Suppose $(\mathcal{E}, g)$ is an asymptotically flat end with ADM energy $E(g)$. Suppose $u > 0$ on $\mathcal{E}$, admitting an expansion in asymptotically flat coordinates $x$ of the form $u(x) = 1 + A/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma})$, for some $\gamma > 0$. If $\tilde g = u^{\frac{4}{n-2}} g$, then $E(\tilde g) = E(g) + 2A$.
Proof. This follows immediately from the following identity, which you should readily verify, with $\xi := \min(q, \gamma) > 0$:
\[ \sum_{i=1}^{n} (\tilde g_{ij,i} - \tilde g_{ii,j}) = \sum_{i=1}^{n} (g_{ij,i} - g_{ii,j}) - \frac{4(n-1)}{n-2} \frac{\partial u}{\partial x^j} + O(|x|^{-(\xi+n-1)}). \qquad \square \]
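For readers who want the intermediate step, here is a sketch of how the identity in the proof yields the shift by $2A$. It uses only the expansion of $u$ from the lemma, the normal $\nu = x/|x|$, and the area $\omega_{n-1} r^{n-1}$ of the coordinate sphere of radius $r$.

```latex
\begin{align*}
\sum_{j=1}^{n} \frac{\partial u}{\partial x^j}\,\nu^j
  &= \frac{\partial u}{\partial r}
   = -(n-2)\,A\,r^{-(n-1)} + O\bigl(r^{-(n-1)-\gamma}\bigr), \\
-\frac{4(n-1)}{n-2} \int_{\{|x|=r\}} \frac{\partial u}{\partial r}\, d\sigma
  &= 4(n-1)\,A\,\omega_{n-1} + O(r^{-\gamma}), \\
\int_{\{|x|=r\}} O\bigl(|x|^{-(\xi+n-1)}\bigr)\, d\sigma
  &= O(r^{-\xi}) \longrightarrow 0, \\
E(\tilde g) - E(g)
  &= \frac{4(n-1)\,A\,\omega_{n-1}}{2(n-1)\,\omega_{n-1}} = 2A.
\end{align*}
```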

The ADM energy-momentum has been defined in terms of the initial data,
whereas one might also like to understand these quantities relative to the spacetime
picture. For instance, one could ask how these quantities transform under a boost $y^\mu = \Lambda^\mu{}_\nu x^\nu$, or slightly more generally $y^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$, where $\Lambda^\mu{}_\nu$ is a proper (time-orientation preserving) Lorentz transformation, and $a^\mu$ is a constant spacetime vector representing a shift in coordinates (e.g., a time translation). Consider a suitably asymptotically flat (Minkowskian) spacetime with asymptotic coordinates $x^\mu$, in which the initial data set corresponds to $x^0 = 0$ (at least near infinity, say), and contains for some $\theta_0 > 0$ the spacelike hypersurfaces defined near infinity by $y^0 = 0$, for $y^\mu = \Lambda^\mu{}_\nu x^\nu$, for proper Lorentz transformations with boost angle $|\theta| < \theta_0$; cf. [56], e.g., for the existence of such boosted slices in the spacetime evolution from asymptotically Euclidean initial data. One can then show as in Exercise 7-98, following [211, Chapter 7], [161, Chapter 20], cf. [61, Appendix E], that the energy-momentum vector transforms as a Minkowskian vector under a Lorentz transformation: if $P^\mu$ are the components of the energy-momentum computed in the $x$-coordinate chart, and $\tilde P^\mu$ are those computed in the $y$-coordinate chart, then $\tilde P^\mu = \Lambda^\mu{}_\nu P^\nu$, i.e.,
\[ \tilde P^\mu \frac{\partial}{\partial y^\mu} = P^\nu \Lambda^\mu{}_\nu \frac{\partial}{\partial y^\mu} = P^\nu \frac{\partial y^\mu}{\partial x^\nu} \frac{\partial}{\partial y^\mu} = P^\nu \frac{\partial}{\partial x^\nu}. \]

See Exercise 7-97 for explicit confirmation of this for a family of boosted slices
in the Schwarzschild spacetime. In terms of thinking of the spacetime geometry
and boosted slices, it might be a good time to recall that in the conformal
compactification of Minkowski spacetime (see Section 1.4.2), spacelike curves
with r → +∞ all “end” at a single point i 0 , spacelike infinity. Finally, note
that if you simply switch the time orientation in the asymptotic regime, E does
not change, but since the second fundamental form changes sign, the linear
momentum changes sign.
As we see from the form of the expansion of $R(g)$ above, and from the Hamiltonian $H_{RT}$, the energy and momentum integrals are directly related to the linearized constraint operator $D\Phi_{(g_{\mathbb{E}^n},0)}(h, \sigma) = (L_{g_{\mathbb{E}^n}} h, \operatorname{div}_{g_{\mathbb{E}^n}} \sigma)$. By integration of the linearized constraints operator against elements of the kernel of the adjoint operator $D\Phi^*_{(g_{\mathbb{E}^n},0)}$ at the Minkowski initial data $(g_{\mathbb{E}^n}, 0)$ one obtains asymptotic conserved quantities. Recall that $D\Phi^*_{(g_{\mathbb{E}^n},0)}(N, X) = (0, 0)$ for $N$ a constant plus a linear combination of Cartesian coordinate functions $x^i$, and $X$ a Euclidean Killing field (a linear combination of generators of translations and rotations). Now, if $D\Phi^*_{(g_{\mathbb{E}^n},0)}(N, X) = (0, 0)$, then (with the dot product below induced by

the Euclidean metric) we have, by integration by parts,
\[ \int_{\{r_0 \le |x| \le r\}} (N, X) \cdot D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi)\, dx = \int_{\{r_0 \le |x| \le r\}} \Bigl( N \sum_{i,j=1}^{n} (h_{ij,i} - h_{ii,j})_{,j} + \sum_{i,j=1}^{n} X^i \pi_{ij,j} \Bigr)\, dx = B(r) - B(r_0), \]
where $B(r)$ is given by
\[ \int_{\{|x|=r\}} \sum_{j=1}^{n} \Bigl( \sum_{i=1}^{n} \pi_{ij} X^i + N \sum_{i=1}^{n} (h_{ij,i} - h_{ii,j}) - \sum_{i=1}^{n} (N_{,i} h_{ij} - N_{,j} h_{ii}) \Bigr)\, \nu^j\, d\sigma. \]

Here we used the fact that $X^i = X_i$ in Cartesian coordinates, so the components of the Lie derivative $L_X g_{\mathbb{E}^n}$ are given by $(L_X g_{\mathbb{E}^n})_{ij} = X_{i,j} + X_{j,i} = X^i_{,j} + X^j_{,i}$, which vanishes for a Euclidean Killing field. Therefore, by symmetry of $\pi$, $\sum_{i,j=1}^{n} X^i_{,j} \pi_{ij} = \sum_{i,j=1}^{n} \frac{1}{2} (X^i_{,j} + X^j_{,i}) \pi_{ij} = 0$. We also used that
\[ 0 = h \cdot L^*_{g_{\mathbb{E}^n}} N = h \cdot \bigl( -(\Delta_{g_{\mathbb{E}^n}} N)\, g_{\mathbb{E}^n} + \operatorname{Hess}_{g_{\mathbb{E}^n}} N \bigr) = \sum_{i,j=1}^{n} (-N_{,jj} h_{ii} + N_{,ij} h_{ij}). \]
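The integration by parts can be spelled out as follows. This is a sketch in Cartesian coordinates with summation over repeated indices; the shorthand $V_j := h_{ij,i} - h_{ii,j}$ is introduced here only for bookkeeping and is not notation from the text.

```latex
% Shorthand (introduced here): V_j := h_{ij,i} - h_{ii,j}, sums over repeated indices.
\begin{align*}
N\,V_{j,j} &= (N V_j)_{,j} - N_{,j} V_j, \\
N_{,j} V_j &= (N_{,i} h_{ij})_{,j} - (N_{,j} h_{ii})_{,j}
              - \bigl( N_{,ij} h_{ij} - N_{,jj} h_{ii} \bigr)
            = (N_{,i} h_{ij} - N_{,j} h_{ii})_{,j}, \\
X^i \pi_{ij,j} &= (X^i \pi_{ij})_{,j} - X^i_{,j}\, \pi_{ij} = (X^i \pi_{ij})_{,j}, \\
N\,V_{j,j} + X^i \pi_{ij,j}
  &= \bigl( \pi_{ij} X^i + N V_j - (N_{,i} h_{ij} - N_{,j} h_{ii}) \bigr)_{,j}.
\end{align*}
```

The second line uses the symmetry of $h$ and the vanishing of $N_{,ij} h_{ij} - N_{,jj} h_{ii}$; the third uses the Killing property of $X$ and the symmetry of $\pi$. Applying the divergence theorem on the annulus $\{r_0 \le |x| \le r\}$, whose outward normal is $\nu$ on the outer sphere and $-\nu$ on the inner sphere, gives $B(r) - B(r_0)$.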

The boundary integral for $E$ comes from $(N, X) = (1, 0)$, while the integral for $P_i$ comes from $(N, X) = (0, \partial/\partial x^i)$. If $(g, \pi) = (g_{\mathbb{E}^n} + h, \pi)$ is asymptotically flat, then in asymptotically flat coordinates $x$ on an asymptotic end, $\Phi(g, \pi) = \Phi(g_{\mathbb{E}^n}, 0) + D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) + O(|x|^{-2q-2}) = D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) + O(|x|^{-2q-2})$, where the error term comes from a sum of terms of the algebraic forms $h * \partial^2 h$, $\partial h * \partial h$, $\pi * \pi$, $\partial h * \pi$ and $h * \partial \pi$, the latter term not needed when $\pi$ is treated as a $(2, 0)$-tensor. Here “$*$” means a linear combination, with smooth bounded coefficients, of terms quadratic in the components (or their partials) of $h$ and $\pi$, as indicated. Thus if $(g, \pi)$ is a solution to the vacuum constraint $\Phi(g, \pi) = 0$, then $D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) = O(|x|^{-2q-2})$.

Exercise 7-38. Verify the stated form of the error term in the expansion of $\Phi(g, \pi)$ above, and confirm the order estimate.

These kernel elements of $D\Phi^*_{(g_{\mathbb{E}^n},0)}$ correspond to symmetries of the Minkowski spacetime, and so represent asymptotic symmetries of an asymptotically flat spacetime. Indeed $(N, X) = (1, 0)$ corresponds to time translation generated by $\partial/\partial t$, and $(N, X) = (0, \partial/\partial x^i)$ corresponds to spatial translation, and they are respectively associated to conserved quantities which are the energy and linear momentum.

We now consider the other remaining basic kernel elements of $D\Phi^*_{(g_{\mathbb{E}^n},0)}$. For instance, suppose $(N, X) = (x^k, 0)$, then
\[ B^k(r) = \int_{\{|x|=r\}} \sum_{i,j=1}^{n} \Bigl( x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Bigr)\, \nu^j\, d\sigma. \tag{7.2.4} \]

This integral need not converge as r → ∞ for general asymptotically flat initial
data, even for solutions of the vacuum constraints. However, when there are
asymptotically flat coordinates for which the data enjoys sufficient asymptotic
parity, then the above surface integral will converge. We illustrate with an
example.
Exercise 7-39. Suppose $g = u^{\frac{4}{n-2}} g_{\mathbb{E}^n} = g_{\mathbb{E}^n} + h$ is harmonically flat at infinity, with expansion $u(x) = 1 + A/|x|^{n-2} + B \cdot x/|x|^n + O_\infty(|x|^{-n})$ for $|x|$ sufficiently large. Show that
\[ \frac{4(n-1)}{n-2}\, \omega_{n-1} B^k = \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^{n} \Bigl( x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Bigr)\, \nu^j\, d\sigma. \tag{7-39a} \]
Thus for $m = 2A \neq 0$, the limit above is equal to $2(n-1)\omega_{n-1}\, m c^k$, where $c^k = \dfrac{B^k}{(n-2)A}$ gives the center of mass as in Exercise 7-28.
(n −2)A
Thus for an asymptotically flat metric $g$ with $m \neq 0$, with respect to an asymptotically flat chart for which $\lim_{r\to\infty} B^k(r)$, with $B^k(r)$ from (7.2.4), exists, we define the center of mass $c^k$ as follows (compare (7.1.8)):
\[ 2(n-1)\omega_{n-1}\, m c^k = \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^{n} \Bigl( x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Bigr)\, \nu^j\, d\sigma. \tag{7.2.5} \]

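Lastly we have the rotational Euclidean Killing fields, e.g., in dimension $n = 3$, we have three generators $Y_{(i)}$, $i = 1, 2, 3$, given by
\[ Y_{(1)} := x^2 \frac{\partial}{\partial x^3} - x^3 \frac{\partial}{\partial x^2}, \qquad Y_{(2)} := x^3 \frac{\partial}{\partial x^1} - x^1 \frac{\partial}{\partial x^3}, \qquad Y_{(3)} := x^1 \frac{\partial}{\partial x^2} - x^2 \frac{\partial}{\partial x^1}. \]
With $(N, X) = (0, Y_{(i)})$, then, we have the surface integrals
\[ B(r) = \int_{\{|x|=r\}} \pi_{jk}\, Y_{(i)}^k\, \nu^j\, d\sigma. \]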

Just as with the center of mass, when there is enough asymptotic parity symmetry
in the data, the surface integrals approach a limit for r → ∞, and we define the

angular momentum
\[ J_i = \frac{1}{8\pi} \lim_{r\to\infty} \int_{\{|x|=r\}} \pi_{jk}\, Y_{(i)}^k\, \nu^j\, d\sigma. \tag{7.2.6} \]
(In $n \ge 3$ dimensions, $8\pi$ is replaced by $(n-1)\omega_{n-1}$.) See [161, Chapter 19] for the appearance of the angular momentum in the spacetime asymptotics, and see Exercise 7-106 for initial data asymptotics.
Asymptotic parity coordinate conditions on initial data that is asymptotically flat at rate $q > \frac{n-2}{2}$, sufficient to guarantee convergence of the center of mass and angular momentum integrals, were proposed by Regge and Teitelboim [187]: for some $\ell \in \mathbb{Z}_+$,
\[ g_{ij}(x) - g_{ij}(-x) = O_{\ell+1}(|x|^{-q-1}), \tag{7.2.7} \]
\[ K_{ij}(x) + K_{ij}(-x) = O_{\ell}(|x|^{-q-2}). \tag{7.2.8} \]

The parity condition on $K_{ij}$ could be replaced by the analogue for $\pi_{ij}$. We note that the component conditions can be interpreted as relating $g$ or $K$ to the respective pullback of $g$ or $K$ under the antipodal map $x \mapsto -x$. Moreover under these conditions it is easy to see that we can replace $\pi_{jk}$ with $K_{jk}$ in (7.2.6). See Exercise 7-99 for the transformation rule for the angular momentum under boosts and translations of the center of mass, cf. Exercise 7-106, as well as [61, Appendix E] (watch sign conventions).

Exercise 7-40 (cf. [159]). From the proof of Proposition 7-33, the limit of integrals defining the energy converges if and only if $\lim_{r\to\infty} \int_{\{r_0 \le |x| \le r\}} R(g)\, dx$ exists. Show likewise that under condition (7.2.7) (with $\ell = 1$), the limit of integrals defining the center of mass as in (7.2.5) converges if and only if $\lim_{r\to\infty} \int_{\{r_0 \le |x| \le r\}} x^k R(g)\, dx$ exists, for each $k$.

Exercise 7-41. a. Verify the above claim that under the Regge–Teitelboim conditions, one can replace $\pi_{jk}$ by $K_{jk}$ in (7.2.6).
b. Consider the initial data on a constant $t$-slice in the Boyer–Lindquist coordinates used for the Kerr spacetime (2.4.6). Introduce asymptotically flat coordinates $x = r \cos\theta \sin\phi$, $y = r \sin\theta \sin\phi$, $z = r \cos\phi$, and show that the shift vector is given by
\[ X_1 = \frac{2yma}{r\rho^2}, \qquad X_2 = -\frac{2xma}{r\rho^2}, \qquad X_3 = 0. \]
c. Show that this initial data set satisfies the Regge–Teitelboim conditions (7.2.7)–
(7.2.8), so that the angular momentum flux integrals converge. To do this, you

might use the ADM equation (5.3.4). Show that the nonzero component of
angular momentum is given by J3 = am.
Remark 7-42. The Kerr spacetime can also be expressed in Kerr–Schild form as
\[ g_{\mu\nu} = \eta_{\mu\nu} + \frac{2m \tilde r^3}{\tilde r^4 + a^2 z^2}\, \theta_\mu \theta_\nu, \]
where $\eta = -dt^2 + dx^2 + dy^2 + dz^2$ is the Minkowski metric, and
\[ \theta_\mu\, dx^\mu = dx^0 - \frac{1}{\tilde r^2 + a^2} \Bigl[ \tilde r\, (x\, dx + y\, dy) + a\, (x\, dy - y\, dx) \Bigr] - \frac{z}{\tilde r}\, dz, \]
with $\tilde r$ defined implicitly as the solution of the equation
\[ \tilde r^4 - \tilde r^2 (x^2 + y^2 + z^2 - a^2) - a^2 z^2 = 0. \]

See [23, (37)] for the second fundamental form of the constant t-slice in these
coordinates, where it is shown that the leading order terms possess a form which
allows the angular momentum flux integrals to converge.

7.2.2. Weighted spaces. We will now introduce weighted function spaces that
are designed to capture growth or decay rates of functions and tensors near
infinity. One can generalize the notion of asymptotically flat initial data sets in
terms of these spaces. Moreover, we will want to utilize the properties of the
Laplace operator between such weighted spaces in order to achieve certain kinds
of conformal deformations on asymptotically flat manifolds. In particular we
will construct certain deformations of the scalar curvature on asymptotically flat
manifolds, with sufficient control on the change in the ADM mass induced by
such conformal transformations. We remark that in their proof of the Riemannian
positive mass theorem, Schoen and Yau [199] prove the necessary existence and
expansion of solutions to the PDE giving the required deformations of scalar
curvature directly, via a Green’s function expansion. As the theory of weighted
spaces has now been sufficiently developed and applied to the study of ADM
mass (cf. e.g., [15]), we will discuss the definitions, fundamental results, and
basic ideas behind the proofs.

Basic definitions and properties. We now introduce the basic definitions of weighted Sobolev and Hölder spaces on $\mathbb{R}^n$. Let $\sigma \ge 1$ be a smooth function identical to $r = |x|$ for $|x| \ge 2$. For weighted Sobolev spaces, we start with the $L^p_{-\tau}$-norm ($1 \le p < \infty$):
\[ \|u\|^p_{L^p_{-\tau}} = \int_{\mathbb{R}^n} |u|^p \sigma^{\tau p - n}\, dx = \int_{\mathbb{R}^n} (|u| \sigma^\tau)^p \sigma^{-n}\, dx. \]

For $\tau > 0$, the exponent on the weight factor is chosen to encourage decay of $u$ at infinity (and also to rescale the volume measure). The norm
\[ \|u\|_{W^{k,p}_{-\tau}} = \sum_{|\gamma| \le k} \|\partial^\gamma u\|_{L^p_{-\tau-|\gamma|}} \]
determines the spaces $W^{k,p}_{-\tau}(\mathbb{R}^n)$, considered as the closure of the space of smooth compactly supported functions in the given norm, or equivalently defined in terms of weak derivatives. Note that $W^{0,p}_{-\tau}(\mathbb{R}^n) = L^p_{-\tau}(\mathbb{R}^n)$, while $L^p(\mathbb{R}^n) = L^p_{-n/p}(\mathbb{R}^n)$, and moreover for $\ell := |\gamma| \le k$, the linear map $\partial^\gamma : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-\ell,p}_{-\tau-\ell}(\mathbb{R}^n)$ is bounded. The spaces $W^{k,\infty}_{-\tau}$ can be defined in the natural way starting from $\|u\|_{L^\infty_{-\tau}} = \|\sigma^\tau u\|_{L^\infty}$, so that $\|u\|_{W^{k,\infty}_{-\tau}} = \sum_{|\gamma| \le k} \|\partial^\gamma u\|_{L^\infty_{-\tau-|\gamma|}}$. Finally, the weighting conventions are not universal; we have chosen to follow those in [15].
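To see how the index $-\tau$ records a decay rate, it may help to test the norm on a pure power of the weight. The following sketch computes in polar coordinates on $\{|x| \ge 2\}$, where $\sigma = r$; the test function $\sigma^{-s}$ and the exponent $s \in \mathbb{R}$ are chosen here only for illustration.

```latex
% Test function sigma^{-s}, s real (illustrative choice); omega_{n-1} = area of the unit sphere.
\begin{align*}
\int_{\{|x| \ge 2\}} \bigl( \sigma^{-s} \bigr)^p\, \sigma^{\tau p - n}\, dx
  &= \omega_{n-1} \int_2^\infty r^{-sp}\, r^{\tau p - n}\, r^{n-1}\, dr
   = \omega_{n-1} \int_2^\infty r^{(\tau - s)p - 1}\, dr, \\
\sigma^{-s} \in L^p_{-\tau}(\mathbb{R}^n)
  &\iff (\tau - s)\,p < 0 \iff s > \tau .
\end{align*}
```

So membership in $L^p_{-\tau}$ requires decay strictly faster than $|x|^{-\tau}$ for power-type functions, whereas $\sigma^{-\tau}$ itself lies in $L^\infty_{-\tau}$.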
There are a number of basic properties of these spaces that we will need, some
of which will be given as exercises, which may be interspersed at appropriate
points throughout the section. For example, the next exercise is basic but essential.
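Exercise 7-43. Show that if $1 \le p \le q$ and $\tau_1 < \tau_2$, then $L^q_{-\tau_2}(\mathbb{R}^n) \subset L^p_{-\tau_1}(\mathbb{R}^n)$, and in fact there is a constant $C > 0$ such that $\|u\|_{L^p_{-\tau_1}} \le C \|u\|_{L^q_{-\tau_2}}$ for all $u \in L^q_{-\tau_2}(\mathbb{R}^n)$. Prove the following weighted version of Hölder’s inequality: if $p, q, r \ge 1$ and $\frac{1}{p} = \frac{1}{q} + \frac{1}{r}$, then if $u \in L^q_{-\tau_1}(\mathbb{R}^n)$ and $v \in L^r_{-\tau_2}(\mathbb{R}^n)$, then $uv \in L^p_{-(\tau_1+\tau_2)}(\mathbb{R}^n)$ and
\[ \|uv\|_{L^p_{-(\tau_1+\tau_2)}} \le \|u\|_{L^q_{-\tau_1}} \|v\|_{L^r_{-\tau_2}}. \tag{7-43a} \]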
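One possible route to (7-43a), for finite $q$ and $r$, is to split the weight so that the classical Hölder inequality applies with the conjugate exponents $q/p$ and $r/p$. The following is a sketch of that bookkeeping, not necessarily the argument the authors have in mind.

```latex
% Uses 1/p = 1/q + 1/r, i.e. n = np/q + np/r, and p/q + p/r = 1.
\begin{align*}
\int_{\mathbb{R}^n} |uv|^p\, \sigma^{(\tau_1+\tau_2)p - n}\, dx
  &= \int_{\mathbb{R}^n}
     \bigl( |u|^p \sigma^{\tau_1 p - \frac{np}{q}} \bigr)
     \bigl( |v|^p \sigma^{\tau_2 p - \frac{np}{r}} \bigr)\, dx \\
  &\le \Bigl( \int_{\mathbb{R}^n} |u|^q \sigma^{\tau_1 q - n}\, dx \Bigr)^{\frac{p}{q}}
       \Bigl( \int_{\mathbb{R}^n} |v|^r \sigma^{\tau_2 r - n}\, dx \Bigr)^{\frac{p}{r}}
   = \|u\|^p_{L^q_{-\tau_1}}\, \|v\|^p_{L^r_{-\tau_2}} .
\end{align*}
```

Taking $p$-th roots gives (7-43a) with constant $1$.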

We can identify the dual spaces to $L^p_{-\tau}(\mathbb{R}^n)$ via a dual pairing, as in the next exercise.
Exercise 7-44. Suppose that $p > 1$ and $q > 1$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Using the dual pairing (Riesz representation) between $L^p(dx)$ and $L^q(dx)$, identify the dual space of $L^p_{-\tau}(\mathbb{R}^n)$ as $L^q_{\tau-n}(\mathbb{R}^n)$ by considering a pairing $(u, v) \in L^p_{-\tau}(\mathbb{R}^n) \times L^q_{\tau-n}(\mathbb{R}^n) \mapsto \int_{\mathbb{R}^n} uv\, dx$.

In contrast to integral control from the weighted Sobolev norms, one can use weighted Hölder norms for direct pointwise control. Indeed, let $\tau \in \mathbb{R}$, $\alpha \in (0, 1]$, and let $\sigma(x, y) = \min(\sigma(x), \sigma(y))$. We define a weighted Hölder seminorm by
\[ [f]_{\alpha,-\tau} = \sup_{x \neq y} \sigma(x, y)^{\alpha+\tau}\, \frac{|f(x) - f(y)|}{|x - y|^\alpha}, \]
and a weighted norm for $C^k$-functions $f$ by
\[ \|f\|_{C^k_{-\tau}} = \sum_{|\gamma| \le k} \sup_{x \in \mathbb{R}^n} \sigma(x)^{\tau+|\gamma|}\, |\partial^\gamma f(x)| = \|f\|_{W^{k,\infty}_{-\tau}}. \]

The Banach space $C^k_{-\tau}(\mathbb{R}^n)$ is comprised of those $C^k$-functions for which the above norm is finite, while the corresponding weighted Hölder spaces $C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$ are given by those functions $f \in C^{k,\alpha}_{\mathrm{loc}}(\mathbb{R}^n)$ such that the norm
\[ \|f\|_{C^{k,\alpha}_{-\tau}} := \|f\|_{C^k_{-\tau}} + \sum_{|\gamma| = k} [\partial^\gamma f]_{\alpha,-\tau-k} \]
is finite. (This norm is equivalent to that defined in [15].) For $\ell := |\gamma| \le k$, $\partial^\gamma : C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \to C^{k-\ell,\alpha}_{-\tau-\ell}(\mathbb{R}^n)$ is a bounded linear map. Likewise, for $\tau' < \tau$ and any $p \ge 1$ there is a bounded inclusion map $C^k_{-\tau}(\mathbb{R}^n) \hookrightarrow W^{k,p}_{-\tau'}(\mathbb{R}^n)$.
We have a natural extension of order notation as follows: for $\alpha \in (0, 1]$ and $\ell \in \mathbb{Z}_+ \cup \{0\}$, the notation $f = O_{\ell,\alpha}(|x|^s)$ means $f = O_\ell(|x|^s)$ as in Definition 7-5, and furthermore there is a $C > 0$ such that for all multi-indices $\gamma$ with $|\gamma| = \ell$, and for all $|x|$ sufficiently large,
\[ \sup_{0 < |y-x| < \frac{|x|}{2}} \frac{|\partial^\gamma f(y) - \partial^\gamma f(x)|}{|y - x|^\alpha} \le C |x|^{s-\ell-\alpha}. \]
It is straightforward to check that $C^{\ell,\alpha}_{-\tau}(\mathbb{R}^n)$ is comprised of those $C^{\ell,\alpha}_{\mathrm{loc}}(\mathbb{R}^n)$ functions which are also of order $O_{\ell,\alpha}(|x|^{-\tau})$.
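Exercise 7-45. Suppose $k$ is a nonnegative integer, $\alpha \in (0, 1]$, and $\tau_1, \tau_2 \in \mathbb{R}$. Show there is a $C > 0$ such that $u \in C^{k,\alpha}_{-\tau_1}(\mathbb{R}^n)$ and $v \in C^{k,\alpha}_{-\tau_2}(\mathbb{R}^n)$ implies that $uv \in C^{k,\alpha}_{-(\tau_1+\tau_2)}(\mathbb{R}^n)$ and
\[ \|uv\|_{C^{k,\alpha}_{-(\tau_1+\tau_2)}} \le C \|u\|_{C^{k,\alpha}_{-\tau_1}} \|v\|_{C^{k,\alpha}_{-\tau_2}}. \]
If in addition $0 < \beta \le \alpha$ and $\tau_1 \le \tau_2$, prove that $C^{k,\alpha}_{-\tau_2}(\mathbb{R}^n) \subset C^{k,\beta}_{-\tau_1}(\mathbb{R}^n)$ is a continuous inclusion, i.e., there is a constant $C > 0$ such that for all $u \in C^{k,\alpha}_{-\tau_2}(\mathbb{R}^n)$,
\[ \|u\|_{C^{k,\beta}_{-\tau_1}} \le C \|u\|_{C^{k,\alpha}_{-\tau_2}}. \]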

Before moving on, we note that for tensor fields on Rn , one may use Cartesian
components to extend the above definitions to weighted spaces of tensor fields,
as well as the embeddings between the various spaces to which we will turn next.
7.2.2.1. Sobolev embeddings. There are classical inequalities associated with Sobolev embeddings (cf., e.g., [2; 86; 107; 144]), for which there are weighted analogues. We introduced certain such embeddings in Section 6.1, and used them in Section 6.1.10. We will recall a suite of such embeddings here, and then sketch how to obtain weighted analogues. The domains $\Omega$ to which these theorems apply, and to which we restrict our discussion here, include balls and annuli such as $A_R = \{x \in \mathbb{R}^n : R < |x| < 2R\}$; these suffice for our purposes, so we may assume that $\Omega$ is a ball or an annulus. For more details, please refer to the above references, to which we defer for the proofs of these embeddings.

Embeddings of $W^{1,p}(\Omega)$. We begin with embeddings of $W^{1,p}(\Omega)$, $1 \le p < \infty$. If $p > n$, and if $0 < \alpha \le 1 - \frac{n}{p}$, there is a constant $C > 0$ such that for any element of $W^{1,p}(\Omega)$, we can choose a representative $u \in C^{0,\alpha}(\Omega)$, and moreover, $\|u\|_{C^{0,\alpha}(\Omega)} \le C \|u\|_{W^{1,p}(\Omega)}$. Thus we have a continuous embedding $W^{1,p}(\Omega) \hookrightarrow C^{0,\alpha}(\Omega)$. In other words, as we used earlier in Section 6.1.10, suitable integral properties of a function and its derivative imply pointwise estimates.
There are also embeddings when $1 \le p \le n$. In case $1 \le p < n$, if we let $p^* = \frac{np}{n-p}$, we have the inequality $\|u\|_{L^{p^*}(\Omega)} \le C \|u\|_{W^{1,p}(\Omega)}$ and the associated embedding $W^{1,p}(\Omega) \hookrightarrow L^{p^*}(\Omega)$. We could use interpolation of the norms to get $\|u\|_{L^q(\Omega)} \le C \|u\|_{W^{1,p}(\Omega)}$ along with the associated embedding $W^{1,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $p \le q \le p^*$. In the borderline case $p = n$ (see [2; 144]), there is an inequality $\|u\|_{L^q(\Omega)} \le C \|u\|_{W^{1,n}(\Omega)}$ for $n \le q < \infty$, establishing the associated embedding $W^{1,n}(\Omega) \hookrightarrow L^q(\Omega)$. We make the trivial observation that since $\Omega$ has finite volume (being a ball or an annulus), Hölder’s inequality immediately allows us to extend the above inequality and embedding into $L^q(\Omega)$ with $1 \le q < p$.
Embeddings of $W^{k,p}_{-\tau}(\Omega)$. We now consider how to establish weighted analogues for the above embeddings. For example, if $n < p < \infty$, a function $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$ can be represented by a function in $C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$, where $0 < \alpha \le 1 - \frac{n}{p}$, and moreover there is a constant $C > 0$ such that for all $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$,
\[ \|u\|_{C^{0,\alpha}_{-\tau}} \le C \|u\|_{W^{1,p}_{-\tau}}. \tag{7.2.9} \]
To prove this estimate, we can consider for $R > 0$ the rescaled function $u_R(x) = u(Rx)$, and note that if $A_R = \{x \in \mathbb{R}^n : R < |x| < 2R\}$, then $\|u\|_{C^{0,\alpha}_{-\tau}(A_R)}$ (defined as above but using $x, y \in A_R$) and $R^\tau \|u_R\|_{C^{0,\alpha}(A_1)}$ are equivalent for $R \ge 1$: there is a constant $C > 1$ such that for all $R \ge 1$ and all $u \in C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$,
\[ C^{-1} \|u\|_{C^{0,\alpha}_{-\tau}(A_R)} \le R^\tau \|u_R\|_{C^{0,\alpha}(A_1)} \le C \|u\|_{C^{0,\alpha}_{-\tau}(A_R)}, \tag{7.2.10} \]
and similarly for all $R \ge 1$ and for all $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$,
\[ C^{-1} \|u\|_{W^{1,p}_{-\tau}(A_R)} \le R^\tau \|u_R\|_{W^{1,p}(A_1)} \le C \|u\|_{W^{1,p}_{-\tau}(A_R)} \tag{7.2.11} \]
(where the weighted Sobolev norms on $A_R$ are defined as above but only integrating over $A_R$).
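The mechanism behind (7.2.11) is a change of variables. The following sketch treats $R \ge 2$, where $\sigma(x) = |x|$ on $A_R$, and substitutes $x = Ry$ with $y \in A_1$; the remaining range $1 \le R < 2$ only changes the constants, since $\sigma$ and $|x|$ are uniformly comparable there.

```latex
% Substitution x = R y, dx = R^n dy, with u_R(y) = u(Ry) and (d_i u_R)(y) = R (d_i u)(Ry).
\begin{align*}
\|u\|^p_{L^p_{-\tau}(A_R)}
  &= \int_{A_R} |u(x)|^p\, |x|^{\tau p - n}\, dx
   = R^{\tau p} \int_{A_1} |u_R(y)|^p\, |y|^{\tau p - n}\, dy, \\
\|\partial_i u\|^p_{L^p_{-\tau-1}(A_R)}
  &= \int_{A_R} |\partial_i u(x)|^p\, |x|^{(\tau+1)p - n}\, dx
   = R^{\tau p} \int_{A_1} |\partial_i u_R(y)|^p\, |y|^{(\tau+1)p - n}\, dy .
\end{align*}
```

Since $1 < |y| < 2$ on $A_1$, the remaining powers of $|y|$ are bounded above and below by constants depending only on $n$, $p$ and $\tau$, so both quantities are comparable to $R^{\tau p}$ times the corresponding unweighted norms of $u_R$ on $A_1$, uniformly in $R$.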

Exercise 7-46. Verify the scaling relationships (7.2.10)–(7.2.11), and then deduce the corresponding weighted estimate (7.2.9).
Exercise 7-47 (cf. [15]). If $u \in C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$, then clearly $u(x) = O(|x|^{-\tau})$. Show that if $p > n$ and $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$, then in fact $u(x) = o(|x|^{-\tau})$ as $|x| \to \infty$, i.e., $\lim_{|x| \to \infty} |x|^\tau u(x) = 0$.

We can likewise establish embeddings $W^{1,p}_{-\tau}(\mathbb{R}^n) \hookrightarrow L^q_{-\tau}(\mathbb{R}^n)$ for $1 \le p < n$ and $p \le q \le p^* = \frac{np}{n-p}$; in the borderline case $p = n$, there is a continuous embedding $W^{1,n}_{-\tau}(\mathbb{R}^n) \hookrightarrow L^q_{-\tau}(\mathbb{R}^n)$ for any $n \le q < \infty$. One proves these as in [15], by using the inequality $\|u\|_{L^q(A_1)} \le C \|u\|_{W^{1,p}(A_1)}$ associated to the relevant embedding of $W^{1,p}(A_1)$ into $L^q(A_1)$ and rescaling as above to prove $\|u\|_{L^q_{-\tau}} \le C \|u\|_{W^{1,p}_{-\tau}}$. Indeed there are constants $C_1, C_2, C_3$ such that for $R \ge 1$ and for $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$, defining $u_R(x) = u(Rx)$ as above,
\[ \|u\|_{L^q_{-\tau}(A_R)} \le C_1 R^\tau \|u_R\|_{L^q(A_1)} \le C_2 R^\tau \|u_R\|_{W^{1,p}(A_1)} \le C_3 \|u\|_{W^{1,p}_{-\tau}(A_R)}. \]

We now proceed as in the following exercise.


Exercise 7-48 (cf. [15]). By writing the weighted Sobolev norm in terms of a series over nonoverlapping annuli, along with scaling, prove that in case $1 \le p \le n$, and for $p \le q < \infty$ (and $q \le p^*$ in case $1 \le p < n$), there is a $C > 0$ such that for all $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$, the following estimate holds: $\|u\|_{L^q_{-\tau}} \le C \|u\|_{W^{1,p}_{-\tau}}$. For $p > n$, this holds with $q = \infty$ by (7.2.9); show this inequality also holds for $n < p \le q < \infty$, first using the Sobolev embedding of $W^{1,p}(A_1)$ to $C^0(A_1)$ to establish a continuous embedding of $W^{1,p}(A_1)$ into $L^r(A_1)$, for $r \ge 1$.
There are analogous direct relations between embeddings for $W^{k,p}(\Omega)$ and $W^{k,p}_{-\tau}(\mathbb{R}^n)$ for $k \ge 1$; we state them for the weighted spaces. For $kp \le n$ we have embeddings into integral spaces: for $kp < n$, $W^{k,p}_{-\tau}(\mathbb{R}^n)$ embeds in $L^q_{-\tau}(\mathbb{R}^n)$ for $p \le q \le \frac{np}{n-kp}$; for $kp = n$, $W^{k,p}_{-\tau}(\mathbb{R}^n)$ embeds in $L^q_{-\tau}(\mathbb{R}^n)$ for $p \le q < \infty$. For $kp > n \ge (k-1)p$, we have embeddings into Hölder spaces: for $kp > n > (k-1)p$, and $0 < \alpha \le k - \frac{n}{p} < 1$, $W^{k,p}_{-\tau}(\mathbb{R}^n)$ embeds into $C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$; for $n = (k-1)p$, $W^{k,p}_{-\tau}(\mathbb{R}^n)$ embeds into $C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$ for $0 < \alpha < 1$. Moreover, if $n < kp < \infty$ and $u \in W^{k,p}_{-\tau}(\mathbb{R}^n)$, then $u(x) = o(|x|^{-\tau})$, just as in Exercise 7-47. We have extensions to higher regularity embeddings; for example, for $\ell \in \mathbb{Z}_+$, $(k-1)p \le n < kp$ and corresponding $\alpha$ as above, $W^{k+\ell,p}_{-\tau}(\mathbb{R}^n)$ embeds into $C^{\ell,\alpha}_{-\tau}(\mathbb{R}^n)$.

Compact embeddings. Many of the above embeddings are compact, the definition of which we recalled in Section 6.1.
For example, recall that we obtain some compact embeddings by employing the Ascoli–Arzelà theorem, a direct corollary of which is the following: if $\Omega$ is a bounded domain, then the space $C^{0,\alpha}(\Omega)$ ($0 < \alpha \le 1$) is compactly included in $C^0(\Omega)$. The Hölder continuity yields the required equicontinuity criterion for Ascoli–Arzelà. Combining this observation with just a little work, you can obtain the following.
Exercise 7-49. Show that for a bounded domain $\Omega$, $C^{0,\alpha}(\Omega)$ is compactly included in $C^{0,\beta}(\Omega)$, for $0 < \beta < \alpha \le 1$.

For $\Omega$ a convex, bounded domain, the mean value theorem can be used together with Ascoli–Arzelà to conclude that $C^1(\Omega)$ is compactly included in $C^{0,\beta}(\Omega)$ for $0 < \beta < 1$, and as well in $C^0(\Omega)$; likewise for $k$ a nonnegative integer, $C^{k+1}(\Omega)$ is compactly included in $C^k(\Omega)$ and in $C^{k,\beta}(\Omega)$ for $0 < \beta < 1$, and moreover, $C^{k,\alpha}(\Omega)$ is compactly included in $C^k(\Omega)$, and in fact in $C^{k,\beta}(\Omega)$ for $0 < \beta < \alpha \le 1$. These results extend to compact smooth manifolds-with-boundary, such as an annular region, by a simple covering argument.
There are also compact embeddings between certain Sobolev spaces, given by the Rellich–Kondrachov theorem [2; 86; 107; 144]. For an example of such an embedding, for suitable bounded domains $\Omega$, such as a ball or an annulus, the inclusion $W^{k,p}(\Omega) \hookrightarrow W^{j,p}(\Omega)$ for $0 \le j < k$, $1 \le p < \infty$, is compact. For a proof in the $p = 2$ case (Rellich’s lemma) using the Fourier transform and Ascoli–Arzelà, see, e.g., [139]. Some of the Sobolev embeddings mentioned in the preceding section are compact for suitable such domains $\Omega$, for example, for $kp > n$, with $(k-1)p \le n < kp$, and for $\alpha$ with $0 < \alpha < k - \frac{n}{p}$, then for $\ell \in \mathbb{Z}_+$, $W^{k+\ell,p}(\Omega)$ embeds compactly into $C^{\ell,\alpha}(\Omega)$ (and hence compactly into $C^\ell(\Omega)$ and $W^{\ell,q}(\Omega)$ for $q \ge 1$).
Lemma 7-50. For $1 \le p < \infty$, $k > j \ge 1$ and $\mu > 0$, $W^{k,p}_{-\tau-\mu}(\mathbb{R}^n)$ is compactly included in $W^{j,p}_{-\tau}(\mathbb{R}^n)$.
Exercise 7-51. Prove the preceding lemma, employing Rellich–Kondrachov and a diagonal argument.
Exercise 7-52. In a similar manner as the preceding, prove that $C^{0,\alpha}_{-\tau-\mu}(\mathbb{R}^n)$ is compactly included in $C^{0,\beta}_{-\tau}(\mathbb{R}^n)$, for $\mu > 0$ and $0 < \beta < \alpha \le 1$, and that $C^1_{-\tau-\mu}(\mathbb{R}^n)$ is compactly included in $C^{0,\beta}_{-\tau}(\mathbb{R}^n)$ for $0 < \beta < 1$. Formulate higher regularity versions as well.
7.2.2.2. Weighted spaces and asymptotically flat manifolds. One can extend
the definitions of weighted Sobolev spaces to an asymptotically flat manifold
(M, g), using asymptotically flat charts for each of the finitely many ends whose
union is M \ C , and a suitable finite covering by charts of the compact set
C ⊂ M; equivalently, one could define these spaces using the induced connection
and volume measure from a smooth background metric g̊ with g̊i j = δi j in the
asymptotically flat charts. Similar considerations apply for defining weighted
Hölder spaces on asymptotically flat manifolds, by using a suitable covering
by charts, and computing Hölder quotients for components, or using parallel
transport. We often focus on an asymptotically flat end (E , g), and can consider
weighted spaces defined on the end. We will sometimes omit the domain in the notation for a weighted function space, for example letting $W^{k,p}_{-\tau} = W^{k,p}_{-\tau}(M, g)$.

Suppose $(M, g)$ is asymptotically flat with rate $q > 0$ and order $\ell + 1$, $\ell \in \mathbb{Z}_+$. On each end $\mathcal{E}$, the component functions satisfy $(g_{ij} - \delta_{ij}) \in W^{\ell+1,p'}_{-\mu} \cap C^{\ell,\alpha}_{-q}$ for $0 < \alpha \le 1$, $\mu < q$ and $1 \le p' < \infty$; in this discussion we restrict to $p' < \infty$ without further comment. In fact, asymptotic flatness is sometimes defined by requiring the components $(g_{ij} - \delta_{ij})$ to lie in a suitable weighted space built using the chosen asymptotic chart on each end (equivalently, one might define weighted spaces in terms of a smooth background metric $\mathring g$ on $M$ as above, and require $g - \mathring g$ to lie in a weighted space); in addition, if this does not already follow from the weighted space condition, one often requires that $g_{ij}$ and $\delta_{ij}$ give uniformly equivalent metrics on each end; in some circumstances, there is also some overarching regularity assumption that $g$ be (sufficiently) smooth, e.g., $C^{k,\alpha}_{\mathrm{loc}}$ for some nonnegative integer $k$ and $0 < \alpha \le 1$. To illustrate, suppose $(g_{ij} - \delta_{ij}) \in W^{1,p'}_{-\mu}$ for $\mu > 0$ and $p' > n \ge 3$; then by Sobolev embedding, $(g_{ij} - \delta_{ij}) = o(|x|^{-\mu})$, so the metric components $g_{ij}(x)$ converge uniformly to $\delta_{ij}$ as $|x| \to \infty$.
The bottom line, of course, is that one should always state their assumptions
precisely when using the term “asymptotically flat”, whether formulated in
terms of weighted spaces or in terms of Definition 7-29, as we strive to do
in what follows. In any case, the regularity and decay assumptions on $g$, and in particular on the connection coefficients (equivalently on $\mathring\nabla(g - \mathring g) = \mathring\nabla g$) may restrict the spaces between which differential operators associated to $g$ act continuously. One might naturally propose using the induced connection and volume measure from $g$ to define the weighted spaces, noting the decay of the connection coefficients would be controlled by an asymptotic assumption on $\nabla(g - \mathring g) = -\nabla \mathring g$. Analogous to what we mentioned just above, the regularity and decay assumptions would impose a limit (again coming from the connection coefficients) on $k$ in defining, for instance, $W^{k,p}_{-\tau}$. We will now discuss two
basic calculations using the bookkeeping on the regularity and weights for an
asymptotically flat metric, the first involving the ADM mass, and the second
involving the Laplace operator.
Suppose we take $(g_{ij} - \delta_{ij}) \in W^{2,p'}_{-\mu}$ on each end $\mathcal{E}$, for $p' > n \ge 3$ and $\mu \ge \frac{n-2}{2}$, with $R(g) \in L^1(M, dv_g)$ (the mass decay conditions in [15]). Then one can expand the scalar curvature in asymptotic coordinates, and use the Sobolev embedding to show the ADM mass is defined. We refer to the proof of Proposition 4.1 in [15] to handle the borderline case, while noting here that in case $\mu > \frac{n-2}{2}$, one can expand simply as in the proof of Proposition 7-31. To handle the error terms, one notes, for example, that since $g_{ij,k} \in W^{1,p'}_{-\mu-1}$, we have from Sobolev embedding $g_{ij,k}(x) = o(|x|^{-\mu-1})$, so it is easy to check

$\partial g \in L^2(\mathcal{E})$ (the same holds at the borderline, using the proof of the Sobolev embedding). To complete the proof that the ADM mass definition extends to this setting, proceed as in the following exercise.
Exercise 7-53. Let $\mu > \frac{n-2}{2}$. Use $(g_{ij} - \delta_{ij}) = o(|x|^{-\mu})$ along with $g_{ij,k\ell} \in L^{p'}_{-\mu-2}$ and Hölder’s inequality to deduce that $(g_{ij} - \delta_{ij})\, g_{k\ell,ms} \in L^1(\mathcal{E})$. From here, conclude that $(g_{ij} - \delta_{ij})\, \Gamma^k_{\ell m,s} \in L^1(\mathcal{E})$; then, expanding as in the proof of Proposition 7-31 and using that $R(g)$ is integrable, show that the mass is defined.
Laplacian on weighted spaces. We now turn to the Fredholm theory for the Laplace operator on functions in weighted spaces. In preparation for this, we note that since the operator $\Delta_g$ involves the metric $g$, the regularity and decay of the components of $g$ in an asymptotic chart are reflected in the coefficients of the operator $\Delta_g$ and the weighted spaces between which it operates. It is simple to keep track of this in case the decay conditions are pointwise conditions; we illustrate more generally with the following lemma.
Lemma 7-54. Suppose $(M, g)$ is asymptotically flat, with $(g_{ij} - \delta_{ij}) \in W^{1,p'}_{-\mu}$ for $\mu > 0$ and $p' > n \ge 3$ on each asymptotic end. Then, for $1 \le p \le p'$,
\[ \Delta_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2} \]
is a bounded linear map. Moreover, $(\Delta - \Delta_g) : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-\mu-2}$ is also bounded on each end, or equivalently, $(\mathring\Delta - \Delta_g) : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-\mu-2}$ is bounded, where $\mathring\Delta$ is the Laplacian for a smooth background metric $\mathring g$ as above.
We remark that when $g$ is asymptotically flat of rate $q > 0$, $p'$ can be taken as large as desired, so there would be no restriction on $p \in [1, \infty)$ in the above.
Proof. The main step is to show that for $w \in W^{2,p}_{-\tau}$, $\Gamma^k_{ij}\,\partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$, with a suitable bound on its norm. Note that $1 \le p < p'$ implies $p < \frac{pp'}{p'-p}$; if $1 \le p < n$ as well, then $\frac{pp'}{p'-p} < p^* = \frac{np}{n-p}$. Thus when $1 \le p < p'$, we can apply Sobolev embedding to conclude that for $w \in W^{2,p}_{-\tau}$, $\partial w/\partial x^j \in W^{1,p}_{-1-\tau} \hookrightarrow W^{0,q}_{-1-\tau}$ for $q = \frac{pp'}{p'-p}$. Since metric components are uniformly bounded near infinity in any end by Sobolev embedding, the Christoffel symbols of $g$ in these coordinates satisfy $\Gamma^k_{ij} \in W^{0,p'}_{-\mu-1}$, and we can apply Hölder's inequality (7-43a) (p. 259) to conclude $\Gamma^k_{ij}\,\partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$. To handle the case $p = p'$, we use the Sobolev embedding $\partial w/\partial x^j \in W^{1,p}_{-1-\tau} \hookrightarrow C^{0,\alpha}_{-1-\tau}$ for $p > n$, where $\alpha = 1 - \frac{n}{p}$, to conclude $\Gamma^k_{ij}\,\partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$.
By the Sobolev embedding, $(g_{ij} - \delta_{ij}) = o(|x|^{-\mu})$, and so for $w \in W^{2,p}_{-\tau}$,
$$\Delta_g w = g^{ij}(w_{,ij} - \Gamma^k_{ij} w_{,k}) \in W^{0,p}_{-\tau-2},$$
$$(\Delta - \Delta_g)(w) = (\delta^{ij} - g^{ij})w_{,ij} + g^{ij}\Gamma^k_{ij} w_{,k} \in W^{0,p}_{-\tau-\mu-2}. \qquad \square$$
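The exponent bookkeeping behind the Hölder step can be checked in a few lines. The following is only a sketch: it assumes that the weighted Hölder inequality (7-43a) combines integrability exponents by $1/p = 1/p' + 1/q$ and adds decay rates, which is the form the proof above appears to use.

```latex
\begin{align*}
% Hölder exponents: with q = pp'/(p'-p),
\frac{1}{p'} + \frac{1}{q}
  &= \frac{1}{p'} + \frac{p'-p}{pp'} = \frac{p + (p'-p)}{pp'} = \frac{1}{p}. \\
% Sobolev range: for 1 \le p < n, the exponent q lies below p^* exactly when p' > n,
\frac{pp'}{p'-p} < \frac{np}{n-p}
  &\iff p'(n-p) < n(p'-p) \iff n < p'. \\
% Decay rates add under multiplication:
\Gamma^k_{ij} \in W^{0,p'}_{-\mu-1},\quad
\frac{\partial w}{\partial x^k} \in W^{0,q}_{-\tau-1}
  &\implies \Gamma^k_{ij}\,\frac{\partial w}{\partial x^k} \in W^{0,p}_{-\tau-\mu-2}.
\end{align*}
```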
Exercise 7-55. Let $k \ge 2$, and suppose $(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$ for $\mu > 0$ and $p' > n \ge 3$. Prove that for $2 \le k' \le k$ and $1 \le p \le p'$, $\Delta_g : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-2}$ and $(\mathring{\Delta} - \Delta_g) : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-\mu-2}$ are bounded linear maps. To analyze the term $\Gamma^m_{ij}\,\partial w/\partial x^m$, you can employ the product rule together with methods used above.
If instead $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}$ for $\mu > 0$ and $\alpha \in (0, 1]$, prove the analogous statements for $\Delta_g : C^{k',\beta}_{-\tau} \to C^{k'-2,\beta}_{-\tau-2}$ and $\mathring{\Delta} - \Delta_g : C^{k',\beta}_{-\tau} \to C^{k'-2,\beta}_{-\tau-\mu-2}$ in weighted Hölder spaces, with $0 < \beta \le \alpha$ and $2 \le k' \le k$. Compare Exercise 7-45.
We will now recall from [15; 155; 211] some fundamental results regarding
the Laplacian acting on scalar-valued functions in weighted spaces; see also
[39; 44; 45; 168]. We will mostly use these when solving for conformal defor-
mations of asymptotically flat metrics in the next section. You might review the
definition of a Fredholm operator from Section 6.1.

The Euclidean Laplacian. We start by revisiting the Euclidean Laplacian $\Delta$ on $\mathbb{R}^n$ ($n \ge 3$). Suppose $\Delta u = 0$ on $\mathbb{R}^n$. By the maximum principle, if $u$ goes to 0 at infinity, then $u$ is identically zero. By Proposition 7-14 (Liouville's theorem), if $u$ is bounded then $u$ must be constant. If $p \in [1, \infty)$, $\tau \ge 0$, and $u \in L^p_{-\tau}(\mathbb{R}^n)$ is harmonic, then an application of the mean value property shows that $u$ vanishes identically (Exercise 7-102, cf. Exercise 7-15). On the other hand, by [15, Corollary 1.9], if $1 < p < \infty$ and $\tau < 0$, then $u$ must be a harmonic polynomial lying in $L^p_{-\tau}(\mathbb{R}^n)$, so that its degree must be less than $-\tau$; as a corollary, for $\tau \le 0$, a harmonic function $u \in C^0_{-\tau}(\mathbb{R}^n)$ must be a harmonic polynomial of degree at most $-\tau$. For the rest of the section, we take $1 < p < \infty$ (with or without comment).
For $p > 1$ and any $\tau$, then, $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ has finite-dimensional kernel. Recalling what we have seen in Section 7.1.2.3, the set
$$\{\ell \in \mathbb{Z} : \ell \le 2-n \text{ or } \ell \ge 0\} = \{\ldots, -1-n, -n, 1-n, 2-n, 0, 1, 2, 3, \ldots\}$$
is comprised of growth rates of functions in weighted spaces which are Euclidean harmonic functions in a neighborhood of infinity. These numbers form a set of exceptional weights for the function theory of the Laplace operator, the complement of which gives the set of nonexceptional weights. The connection between the exceptional weights as growth rates for harmonic functions and the function theory for the Laplacian is laid bare in the following theorem. For $m$ a nonnegative integer, let $\mathcal{K}_m$ be the space of harmonic polynomials of degree at most $m$.
Theorem 7-56. Suppose $p > 1$. For $-\tau$ exceptional, $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ does not have closed range. If $-\tau$ is nonexceptional, this operator is Fredholm, and its index depends only on the interval inside the nonexceptional set that contains $-\tau$. More explicitly, if $m$ is a nonnegative integer, the operator $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ is surjective with kernel $\mathcal{K}_m$ if $m < -\tau < m + 1$, while for $(n - 2) + m < \tau < (n - 2) + m + 1$ the operator has trivial kernel, and closed range
$$\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n)) = \Big\{ f \in L^p_{-\tau-2}(\mathbb{R}^n) : \int_{\mathbb{R}^n} f(x)v(x)\,dx = 0 \text{ for all } v \in \mathcal{K}_m \Big\}.$$
For $0 < \tau < n - 2$, $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ is an isomorphism.
The theorem is proved in [155]; see also [168; 15]. (We note that the weighting conventions in [155] differ from [15]. For comparison, the weighted Sobolev spaces $M^p_{s,\delta}(\mathbb{R}^n)$ as defined in [155] are such that $W^{k,p}_{-\tau}(\mathbb{R}^n) = M^p_{k,\delta}(\mathbb{R}^n)$ for $\tau = \delta + \frac{n}{p}$, and so the Laplace operator $\Delta : M^p_{k,\delta}(\mathbb{R}^n) \to M^p_{k-2,\delta+2}(\mathbb{R}^n)$ is a bounded linear map.) We will not go into full details of the proof of Theorem 7-56 here, but we will make some observations and comments.
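To make the case distinction in Theorem 7-56 concrete, here is a small Python sketch that, for a given dimension $n$ and weight $\tau$, reports the kernel and cokernel dimensions the theorem predicts. The function names are hypothetical, and the dimension count for homogeneous harmonic polynomials of degree $k$, namely $\binom{n+k-1}{k} - \binom{n+k-3}{k-2}$, is a standard fact that is not stated in the text above.

```python
from math import comb, floor


def dim_homogeneous_harmonic(n, k):
    """Dimension of the homogeneous harmonic polynomials of degree k on R^n."""
    lower = comb(n + k - 3, k - 2) if k >= 2 else 0
    return comb(n + k - 1, k) - lower


def dim_K(n, m):
    """Dimension of K_m, the harmonic polynomials of degree at most m."""
    return sum(dim_homogeneous_harmonic(n, k) for k in range(m + 1))


def classify_weight(n, tau):
    """Return (dim kernel, dim cokernel) predicted by Theorem 7-56 for
    Delta : W^{2,p}_{-tau}(R^n) -> L^p_{-tau-2}(R^n), or None when -tau
    is an exceptional weight (the range is then not closed)."""
    rate = -tau
    if rate == floor(rate) and (rate >= 0 or rate <= 2 - n):
        return None
    if rate > 0:
        # m < -tau < m + 1: surjective with kernel K_m
        return dim_K(n, floor(rate)), 0
    if tau > n - 2:
        # (n-2) + m < tau < (n-2) + m + 1: injective, range is the annihilator of K_m
        return 0, dim_K(n, floor(tau - (n - 2)))
    # 0 < tau < n - 2: isomorphism
    return 0, 0
```

The Fredholm index is the difference of the two returned numbers, so it is constant on each interval of nonexceptional weights, as the theorem asserts.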
Comments on the proof. We discussed the kernel above, so we turn to the range $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n)) \subset L^p_{-\tau-2}(\mathbb{R}^n)$. Recall that the dual space of $L^p_{-\tau-2}(\mathbb{R}^n)$ can be identified with $L^q_{\tau+2-n}(\mathbb{R}^n)$, where $\frac{1}{p} + \frac{1}{q} = 1$ (Exercise 7-44), as follows: for $v \in L^q_{\tau+2-n}(\mathbb{R}^n)$, $\ell_v(w) = \int_{\mathbb{R}^n} v(x)w(x)\,dx$ defines a bounded linear functional on $L^p_{-\tau-2}(\mathbb{R}^n)$; as such, $\ell_v$ has a closed nullspace, which is codimension-one for nontrivial $v$. Now suppose $v \in L^q_{\tau+2-n}(\mathbb{R}^n)$ annihilates the range $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$, i.e., for all $u \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, $\int_{\mathbb{R}^n} v\,\Delta u\,dx = 0$; in particular this holds for all smooth, compactly supported $u$, and hence $\Delta v = 0$ distributionally, so that $v$ is a smooth harmonic function. In fact, by density, $v \in L^q_{\tau+2-n}(\mathbb{R}^n)$ annihilates $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ if and only if it annihilates $\Delta(C_c^\infty(\mathbb{R}^n))$. As we have seen, if $v \in L^q_{\tau+2-n}(\mathbb{R}^n)$ is a harmonic function, then $v$ must vanish identically in case $\tau + 2 - n \le 0$, while in case $\tau > n - 2$, $v \in \mathcal{K}_m$, where $m$ is the nonnegative integer such that $m < \tau + 2 - n \le (m + 1)$. Since for such $m$, $\mathcal{K}_m \subset L^q_{\tau+2-n}(\mathbb{R}^n)$, we see that the range $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ is contained in the closed subspace
$${}^{\perp}\mathcal{K}_m := \bigcap_{v \in \mathcal{K}_m} \ell_v^{-1}(0) = \Big\{ f \in L^p_{-\tau-2}(\mathbb{R}^n) : \int_{\mathbb{R}^n} f(x)v(x)\,dx = 0 \text{ for all } v \in \mathcal{K}_m \Big\}.$$
By the Hahn–Banach theorem, we conclude $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ is dense in $L^p_{-\tau-2}(\mathbb{R}^n)$ for $\tau \le n - 2$, while in case $\tau > n - 2$, with the nonnegative integer $m$ as above, the closure of $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ in $L^p_{-\tau-2}(\mathbb{R}^n)$ is precisely ${}^{\perp}\mathcal{K}_m$, a space with finite codimension ($\mathcal{K}_m$ is finite-dimensional).
In summary, then, the main issue for deriving the Fredholm property is establishing closed range. By the above discussion, the range of $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ will be as stated in Theorem 7-56, for all $\tau$ for which the range is closed. That said, to establish closed range for $-\tau < 0$ nonexceptional, one may proceed directly by solving the Poisson equation $\Delta u = f$, for $f$ in the appropriate space, either for $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ in case $0 < \tau < n - 2$, or for $f \in {}^{\perp}\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$ when $\tau > n - 2$; we will say a few words about this now. Once this is established, one may apply a duality argument to handle the case $-\tau > 0$ nonexceptional, cf. [155].
For $p > 1$, let $p' = \frac{p}{p-1}$, and suppose $a$ and $b$ satisfy $a + b > 0$. Consider the integral kernel $K'(x, y) = |x|^{-a}|x - y|^{a+b-n}|y|^{-b}$, and the related kernel $\widetilde{K}(x, y) = (1 + |x|)^{-a}|x - y|^{a+b-n}(1 + |y|)^{-b}$. A workhorse lemma for the analysis of the Laplacian $\Delta$ on weighted spaces is that $T'(u)(x) = \int_{\mathbb{R}^n} K'(x, y)u(y)\,dy$ defines a bounded linear operator $T'$ on $L^p(\mathbb{R}^n)$ if and only if $a < \frac{n}{p}$ and $b < \frac{n}{p'}$. For a proof, see [168, Lemma 2.1]. See also [155, Lemma A] for other references; as noted there, a straightforward reformulation is that $\widetilde{T}(u)(x) = \int_{\mathbb{R}^n} \widetilde{K}(x, y)u(y)\,dy$ defines a bounded linear operator on $L^p(\mathbb{R}^n)$ if and only if $a < \frac{n}{p}$ and $b < \frac{n}{p'}$.
We address the necessity of the condition on $a$ and $b$ in Exercise 7-103.


Let $c_n = 1/((2 - n)\omega_{n-1})$. From our discussion of the Poisson equation and the role of $c_n|x - y|^{2-n}$ as the fundamental solution for $\Delta$, it might not be surprising that the above integral kernels play a role in the analysis, at least for $a + b = 2$, which is the case stated in [15, Lemma 1.8]. If we let $a = -\tau + \frac{n}{p}$ and $b = \tau + 2 - \frac{n}{p}$, then the conditions $a < \frac{n}{p}$ and $b < \frac{n}{p'}$ translate into $0 < \tau < n - 2$. For $\tau$ in this range, the mapping $f \mapsto Nf$, where $Nf$ is the Newtonian potential $Nf(x) = \int_{\mathbb{R}^n} c_n|x - y|^{2-n} f(y)\,dy$, defines a bounded linear operator $N : L^p_{-\tau-2}(\mathbb{R}^n) \to L^p_{-\tau}(\mathbb{R}^n)$. To see this, we let
$$\tilde{f}(y) := f(y)(1 + |y|)^{\tau+2-\frac{n}{p}} = f(y)(1 + |y|)^{b},$$
and note that $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ if and only if $\tilde{f} \in L^p(\mathbb{R}^n)$, with $C^{-1}\|f\|_{L^p_{-\tau-2}} \le \|\tilde{f}\|_{L^p} \le C\|f\|_{L^p_{-\tau-2}}$ for some constant $C > 1$. Since
$$Nf(x) = c_n(1 + |x|)^{-\tau+\frac{n}{p}}\,\widetilde{T}(\tilde{f})(x) = c_n(1 + |x|)^{a}\,\widetilde{T}(\tilde{f})(x),$$
we conclude that $Nf \in L^p_{-\tau}(\mathbb{R}^n)$ and that $N$ is bounded as claimed. We will recall a weighted analogue of elliptic regularity in Proposition 7-61, from which we conclude $Nf \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, as desired. This gives us the theorem in the isomorphism range $0 < \tau < n - 2$, which is the case we will need.
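For the reader who wants to see the translation of the conditions on $a$ and $b$ spelled out, the following short computation may help. It is a sketch that assumes the weighted norm convention suggested by the definition of $\tilde{f}$, namely that $\|u\|_{L^p_{-\tau}}$ is comparable to $\|(1+|x|)^{\tau - n/p}u\|_{L^p}$.

```latex
\begin{align*}
a + b &= \Big(-\tau + \frac{n}{p}\Big) + \Big(\tau + 2 - \frac{n}{p}\Big) = 2 > 0, \\
a < \frac{n}{p} &\iff -\tau + \frac{n}{p} < \frac{n}{p} \iff \tau > 0, \\
b < \frac{n}{p'} &\iff \tau + 2 - \frac{n}{p} < n - \frac{n}{p} \iff \tau < n - 2, \\
\big\|(1+|x|)^{\tau - \frac{n}{p}}\, Nf\big\|_{L^p}
  &= \big\|(1+|x|)^{-a}\, Nf\big\|_{L^p}
   = |c_n|\,\big\|\widetilde{T}(\tilde{f})\big\|_{L^p}
   \le C\,\|\tilde{f}\|_{L^p}.
\end{align*}
```

The second line uses $\frac{n}{p'} = n - \frac{n}{p}$, and the last line is where the boundedness of $\widetilde{T}$ on $L^p(\mathbb{R}^n)$ enters.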
That said, it is instructive to make some comments about the other weight ranges. We recall that for $\tau_1 < \tau_2$, $L^p_{-\tau_2}(\mathbb{R}^n) \subset L^p_{-\tau_1}(\mathbb{R}^n)$, so that in particular, for $\tau \ge n - 2$, $L^p_{-\tau}(\mathbb{R}^n) \subset \bigcap_{0<\sigma<n-2} L^p_{-\sigma}(\mathbb{R}^n)$. For $\tau \ge n - 2$, it follows from the preceding discussion that $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ implies $f = \Delta u$ for $u = Nf \in \bigcap_{0<\sigma<n-2} W^{2,p}_{-\sigma}(\mathbb{R}^n)$. This space contains $W^{2,p}_{-\tau}(\mathbb{R}^n)$, but it is certainly possible that $Nf$ is not in $W^{2,p}_{-\tau}(\mathbb{R}^n)$, as in the following example.
Example 7-57. Let $\phi$ be a smooth function which is zero in a neighborhood of the origin, and is identically one outside a compact set. Suppose $v$ is a nontrivial homogeneous harmonic polynomial with degree $m$ and Kelvin transform $K[v]$ (Section 7.1.2.3). We let $f = \Delta u$ with $u(x) = \phi(x)K[v](x)$; for example if $m = 0$ and $v = 1$, we have $u(x) = \phi(x)|x|^{2-n}$. We have $u \in \bigcap_{0<\sigma<m+n-2} W^{2,p}_{-\sigma}(\mathbb{R}^n)$ but $u \notin W^{2,p}_{-\tau}(\mathbb{R}^n)$ if $\tau \ge m + n - 2$. Clearly $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ for any $\tau$ (as $f$ vanishes near infinity), and so $(u - Nf)$ is harmonic and lies in $L^p_{-\sigma}(\mathbb{R}^n)$ for some $\sigma > 0$. Hence $Nf = u \notin W^{2,p}_{-(m+n-2)}(\mathbb{R}^n)$.
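In the simplest case $m = 0$, $v = 1$, one can see explicitly why $f$ fails to be orthogonal to the constants. The computation below is a sketch: it assumes $\omega_{n-1}$ denotes the area of the unit sphere $S^{n-1}$ (the usual normalization for the fundamental solution), and it takes $R$ large enough that $\phi \equiv 1$ on $\{|x| \ge R\}$.

```latex
\begin{align*}
\int_{\mathbb{R}^n} f\,dx
  = \int_{B_R} \Delta u\,dx
  = \int_{\partial B_R} \frac{\partial}{\partial r}\big(r^{2-n}\big)\,dS
  = (2-n)R^{1-n}\cdot \omega_{n-1}R^{n-1}
  = (2-n)\,\omega_{n-1} = \frac{1}{c_n} \neq 0.
\end{align*}
```

The first equality holds because $f = \Delta u$ vanishes outside $B_R$, and the second is the divergence theorem applied to $u$, which is smooth on $B_R$ since $\phi$ vanishes near the origin.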
Exercise 7-58. Continuing the previous example, show that, whereas for $\tau > m + n - 2$, $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ is annihilated by $\mathcal{K}_m$ under the dual pairing, a simple application of Green's identity, together with the fact that $K[v](x) = |x|^{2-n-2m}v(x)$, shows that $\int_{\mathbb{R}^n} v(x) f(x)\,dx \neq 0$.

To prove Theorem 7-56 it remains to show that $Nf \in W^{2,p}_{-\tau}(\mathbb{R}^n)$ for all $f$ in ${}^{\perp}\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$, with $m + (n - 2) < \tau < m + 1 + (n - 2)$ and $m$ a nonnegative integer. As you might imagine by now, there is an elegant way to do this using the spherical harmonic expansion of $c_n|x - y|^{2-n}$, which for $|x| > |y|$ and $\hat{x} = x|x|^{-1}$ we write as
$$c_n|x - y|^{2-n} = c_n|x|^{2-n}\sum_{k=0}^{\infty} h_k(\hat{x}, y)|x|^{-k},$$
where for fixed $\hat{x} \in S^{n-1}$, $h_k(\hat{x}, y)$ is a homogeneous harmonic polynomial of degree $k$ with respect to $y$. Let $0 < \psi \le 1$ be a continuous function with $\psi(x) = |x|^{-1}$ for $|x| \ge 2$. For $m$ a nonnegative integer, we consider the functions
$$K'_m(x, y) = c_n|x - y|^{2-n} - c_n|x|^{2-n}\sum_{k=0}^{m} h_k(\hat{x}, y)|x|^{-k},$$
$$K_m(x, y) = c_n|x - y|^{2-n} - c_n(\psi(x))^{n-2}\sum_{k=0}^{m} h_k(\hat{x}, y)(\psi(x))^{k}.$$
$K'_m$ is used in [15], which works with weighted spaces defined on punctured Euclidean space, whereas [155] applies the integral kernel $K_m$ in weighted spaces on $\mathbb{R}^n$, as the use of $\psi(x)$ removes the singularity at $x = 0$ in the sum. Using the workhorse lemma on $\widetilde{T}$ cited above, with elementary estimates of $K_m$, one shows that $K_m : L^p_{-\tau-2}(\mathbb{R}^n) \to L^p_{-\tau}(\mathbb{R}^n)$ is bounded for $m + (n - 2) < \tau < m + 1 + (n - 2)$. Of course, if $f \in {}^{\perp}\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$, then $\int_{\mathbb{R}^n} K_m(x, y) f(y)\,dy = Nf(x)$, so that $Nf \in L^p_{-\tau}(\mathbb{R}^n)$, with $\Delta Nf = f \in L^p_{-\tau-2}(\mathbb{R}^n)$. Just as above, one can conclude in fact $Nf \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, as desired.
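The expansion above can be tested numerically in the special case $n = 3$. The Python sketch below assumes the classical Legendre generating-function identity, under which $h_k(\hat{x}, y) = |y|^k P_k(\hat{x} \cdot y/|y|)$ in three dimensions; this identification is not stated in the text, the constant $c_3$ is omitted, and all function names are hypothetical.

```python
import math


def legendre(k, t):
    """Legendre polynomial P_k(t), computed with Bonnet's recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, t
    for j in range(1, k):
        p_prev, p = p, ((2 * j + 1) * t * p - j * p_prev) / (j + 1)
    return p


def h(k, x_hat, y):
    """h_k(x_hat, y) = |y|^k P_k(x_hat . y/|y|), homogeneous of degree k in y (n = 3)."""
    r = math.sqrt(sum(c * c for c in y))
    if r == 0.0:
        return 1.0 if k == 0 else 0.0
    cos_gamma = sum(a * b for a, b in zip(x_hat, y)) / r
    return r ** k * legendre(k, cos_gamma)


def truncated_kernel(x, y, m):
    """|x|^{-1} * sum_{k <= m} h_k(x_hat, y) |x|^{-k}, i.e. the sum subtracted in K'_m."""
    rx = math.sqrt(sum(c * c for c in x))
    x_hat = [c / rx for c in x]
    return sum(h(k, x_hat, y) * rx ** (-k) for k in range(m + 1)) / rx


def exact_kernel(x, y):
    """|x - y|^{2-n} for n = 3."""
    return 1.0 / math.dist(x, y)
```

Since $|P_k| \le 1$ on $[-1, 1]$, the truncation error is at most $\rho^{m+1}/(|x|(1-\rho))$ with $\rho = |y|/|x| < 1$, which is the extra decay in $|x|$ that makes $K'_m$ and $K_m$ useful at the faster weights.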
That the range is not closed for −τ exceptional is addressed in Exercise 7-104.
For more details and the remainder of the proof, see [15; 155]. □
Remark 7-59. Using the regularity statement in Proposition 7-61 below, one can conclude that $\Delta : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$ is Fredholm if $-\tau$ is nonexceptional, $p > 1$, and $k \ge 2$. The map is surjective for $-\tau$ nonexceptional and $\tau < n - 2$, while for $m$ a nonnegative integer and $(n - 2) + m < \tau < (n - 2) + m + 1$, $\Delta(W^{k,p}_{-\tau}(\mathbb{R}^n)) = {}^{\perp}\mathcal{K}_m \cap W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$; the kernel is the same as in the $k = 2$ case. The analogous Fredholm properties for $\Delta : C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \to C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, with $k \ge 2$, $0 < \alpha < 1$ and $-\tau$ nonexceptional, can be established directly using the expansion of the Newtonian potential, along with weighted Schauder estimates and regularity, as in [157].
Exercise 7-60. Combine the expansion result in [157] with the Fredholm properties in weighted Sobolev spaces to obtain the Fredholm properties in weighted Hölder spaces. Begin by letting $\tau' < \tau$ be slightly less than $\tau$, so that $-\tau'$ is nonexceptional, and letting $p > 1$ be such that the Sobolev embedding $W^{2,p}_{-\tau'}(\mathbb{R}^n) \hookrightarrow C^{0,\alpha}_{-\tau'}(\mathbb{R}^n)$ holds; note that $C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \hookrightarrow W^{k,p}_{-\tau'}(\mathbb{R}^n)$. Use the weighted Hölder estimate and regularity as in Proposition 7-63 (cf. [157, Theorem 1]) to get that for $f \in \Delta(W^{k,p}_{-\tau'}(\mathbb{R}^n)) \cap C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, there is $w \in C^{k,\alpha}_{-\tau'}(\mathbb{R}^n)$ with $\Delta w = f$. Use [157, Theorem 2] to conclude that $w \in C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$. From here one can easily find the image of the map $\Delta : C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \to C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, and conclude that the map is Fredholm.

The Laplacian $\Delta_g$ on asymptotically flat $(M, g)$. The extension of the Fredholm property from Theorem 7-56 to the Laplacian $\Delta_g$ on an asymptotically flat manifold $(M, g)$, namely that $\Delta_g : W^{2,p}_{-\tau} \to L^p_{-\tau-2}$ is Fredholm for $-\tau$ nonexceptional (and more generally for operators which are asymptotic to $\Delta$ in a suitable sense), is established in [15, Proposition 1.14, Proposition 2.2]; see also [39]. For our purposes, the following fundamental weighted elliptic estimate (7.2.12) and regularity result for $\Delta_g$, and the weighted Hölder version in Proposition 7-63, will suffice; compare Lemma 7-54. We emphasize the stronger estimate (7.2.13) that holds in the weight range for which $\Delta_g$ is an isomorphism.
Proposition 7-61. Let $\tau \in \mathbb{R}$, $\mu > 0$, $k \ge 2$, $p' > n \ge 3$ and $1 < p \le p'$. Suppose $(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end in which $(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$. The bounded linear operator $\Delta_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ satisfies the following a priori weighted elliptic estimate: for some $C > 0$ and for $w \in W^{k,p}_{-\tau}$,
$$\|w\|_{W^{k,p}_{-\tau}} \le C\big(\|\Delta_g w\|_{W^{k-2,p}_{-\tau-2}} + \|w\|_{L^p_{-\tau}}\big). \tag{7.2.12}$$
Moreover, if $u \in L^p_{-\tau}$ and $\Delta_g u \in W^{k-2,p}_{-\tau-2}$, then $u \in W^{k,p}_{-\tau}$.
If in addition $\tau \in (0, n - 2)$, then $\Delta_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is an isomorphism, so in particular, there is a $C > 0$ such that for all $w \in W^{k,p}_{-\tau}$,
$$\|w\|_{W^{k,p}_{-\tau}} \le C\|\Delta_g w\|_{W^{k-2,p}_{-\tau-2}}. \tag{7.2.13}$$
Thus, if $0 < \tau_1 < \tau_2 < n - 2$, with $u \in L^p_{-\tau_1}$ and $\Delta_g u \in W^{k-2,p}_{-\tau_2-2}$, then $u \in W^{k,p}_{-\tau_2}$.
Remark 7-62. Aside from the various parameters in the proposition, the constant $C$ in (7.2.12) depends on the ellipticity constant $\lambda > 0$ such that $\lambda|\xi|^2 \le g^{ij}\xi_i\xi_j \le \lambda^{-1}|\xi|^2$ and on the norm $\|g_{ij} - \delta_{ij}\|_{W^{k-1,p'}_{-\mu}}$ (defined using a suitable covering), as reflected in the coefficients of $\Delta_g$. We could also write this in terms of the analogous norm of the difference $g - \mathring{g}$ for a background metric $\mathring{g}$ which in each asymptotic chart is given by $\mathring{g}_{ij} = \delta_{ij}$. Thus the constant $C$ can be chosen uniformly across a set of metrics, each asymptotically flat with respect to the same asymptotically flat coordinates and suitably controlled in $W^{k-1,p'}_{-\mu}$. The same is true for the constant in (7.2.13); see [15, Propositions 1.6 and 1.11, Corollary 1.16].
Outline of the proof of the proposition. To go from $\Delta$ to $\Delta_g$, we have to modify the operator, and in addition, for a general asymptotically flat manifold, the topology can be more complicated; for example, there may be more than one asymptotically flat end. The proof of the a priori estimates and regularity employs the standard interior elliptic estimate and regularity combined with a scaling argument akin to ones we have used earlier. There is however a key difference between the asymptotic setting and that of a closed manifold, with respect to the Fredholm property. In the latter case, from the basic elliptic estimate, one concludes the Fredholm property of the elliptic operator as in Exercise 6-25, by using an estimate of the form $\|u\|_{W^{k,p}} \le C(\|Lu\|_{W^{k-m,p}} + \|u\|_{L^p})$, where $L$ has order $m$ and $k \ge m$, say. In fact, one couples this estimate with the compactness of the embedding $W^{k,p}(M) \hookrightarrow L^p(M)$ to conclude the kernel of $L$ is finite-dimensional and the range of $L : W^{k,p}(M) \to W^{k-m,p}(M)$ is closed. This same strategy fails in the weighted case, since the weighted compactness results discussed earlier require the weights to change, e.g., for $\tau_1 < \tau_2$, the embedding $W^{1,p}_{-\tau_2} \hookrightarrow L^p_{-\tau_1}$ is compact. In [15] the Fredholm property is established via a "scale-broken" estimate for nonexceptional weights, [15, Theorem 1.10], in which the lower-order term $\|w\|_{L^p_{-\tau}}$ in (7.2.12) is replaced by $\|w\|_{L^p(\Omega)}$, where $\Omega$ is the compactly contained domain whose complement is a union of the exterior regions of the form $\{|x| \ge R\}$ in each asymptotic end, for sufficiently large $R$. The proof of the estimate works by decomposing functions into pieces each supported on an asymptotic end, plus a compactly supported piece. On the ends, where $\Delta_g$ is appropriately close to $\Delta$, one can use an injectivity estimate for $\Delta$ at nonexceptional weights, and then patch these together with interior estimates on the compactly supported piece. The key for the Fredholm property is that the restriction $W^{1,p}_{-\tau}(M) \hookrightarrow L^p(\Omega)$ is compact by Rellich–Kondrachov on $\Omega$. We will use this form of the estimate again in the proof of Proposition 7-73. □
We turn to a result analogous to Proposition 7-61 for Hölder spaces, and
we include a formulation of an isomorphism statement in the borderline case
τ = n − 2, following [211].
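Proposition 7-63. Let $n \ge 3$, $\tau \in \mathbb{R}$, $\mu > 0$, $\alpha \in (0, 1)$, and $k \ge 2$. Suppose $(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end in which $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}$. Then $\Delta_g : C^{k,\alpha}_{-\tau} \to C^{k-2,\alpha}_{-\tau-2}$ is a bounded linear operator satisfying the following a priori weighted elliptic estimate: there is a $C > 0$ such that for all $w \in C^{k,\alpha}_{-\tau}$,
$$\|w\|_{C^{k,\alpha}_{-\tau}} \le C\big(\|\Delta_g w\|_{C^{k-2,\alpha}_{-\tau-2}} + \|w\|_{C^{0}_{-\tau}}\big). \tag{7.2.14}$$
Moreover, if $w \in C^{0}_{-\tau}$ and $\Delta_g w \in C^{k-2,\alpha}_{-\tau-2}$, then $w \in C^{k,\alpha}_{-\tau}$.
If in addition, $\tau \in (0, n - 2)$, then $\Delta_g : C^{k,\alpha}_{-\tau} \to C^{k-2,\alpha}_{-\tau-2}$ is an isomorphism, so in particular, there is a $C > 0$ such that for all $w \in C^{k,\alpha}_{-\tau}$,
$$\|w\|_{C^{k,\alpha}_{-\tau}} \le C\|\Delta_g w\|_{C^{k-2,\alpha}_{-\tau-2}}. \tag{7.2.15}$$
Thus, if $0 < \tau_1 < \tau_2 < n - 2$, with $u \in C^{0}_{-\tau_1}$ and $\Delta_g u \in C^{k-2,\alpha}_{-\tau_2-2}$, then $u \in C^{k,\alpha}_{-\tau_2}$.
Finally, suppose $\tau > 0$. If $w \in C^{0}_{-\tau}$ and $\Delta_g w \in C^{k-2,\alpha}_{-n} \cap L^1$, then $w \in C^{k,\alpha}_{2-n}$. Moreover, there is a $C > 0$ such that for all $w \in C^{k,\alpha}_{2-n}$,
$$\|w\|_{C^{k,\alpha}_{2-n}} \le C\big(\|\Delta_g w\|_{C^{k-2,\alpha}_{-n}} + \|\Delta_g w\|_{L^1}\big). \tag{7.2.16}$$
Given any $f \in C^{k-2,\alpha}_{-n} \cap L^1$, there is a unique $w \in C^{k,\alpha}_{2-n}$ such that $\Delta_g w = f$.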
Comments on the proof. As with the Sobolev case, (7.2.14) is established using interior estimates and scaling. Given what we saw earlier about the role of the relevant integral kernels in the analysis of $\Delta$ on weighted Sobolev spaces, it should not be surprising that the injectivity estimates (7.2.15) and (7.2.16) can be obtained using the fundamental solution, as we will indicate shortly for the Euclidean case. Thus to establish these two estimates in general, one approach is to show there is a fundamental solution for $\Delta_g$ with the same asymptotics as that for $\Delta$, and compute with it as we will indicate below for the Euclidean metric, cf. [211].
On the proof of (7.2.15) and (7.2.16) for $\Delta$. Recall that the Newtonian potential is defined by $Nf(x) = \int_{\mathbb{R}^n} c_n|x - y|^{2-n} f(y)\,dy$.

Lemma 7-64. For $0 < \tau < n - 2$, there is a constant $C > 0$ such that for all $f \in C^{0}_{-\tau-2}(\mathbb{R}^n)$, it follows that $f(y)|x - y|^{2-n} \in L^1(\mathbb{R}^n, dy)$, and in fact $Nf \in C^{0}_{-\tau}(\mathbb{R}^n)$ with $\|Nf\|_{C^{0}_{-\tau}} \le C\|f\|_{C^{0}_{-\tau-2}}$.

Exercise 7-65. Prove the preceding lemma. Note that this claim involves deriving a supremum estimate and establishing continuity of $Nf$. To show existence and continuity of $Nf$, you might break up the integral for $Nf(x)$ into two parts centered around $x$; continuity will follow from an application of dominated convergence. To show the $C^{0}_{-\tau}$-estimate, for $|x| > 0$, you might break the integral into three parts, integrating over $\{y : |x - y| \le \frac{|x|}{2}\}$, $\{y : \frac{|x|}{2} \le |x - y| \le 2|x|\}$, and $\{y : |x - y| \ge 2|x|\}$. Remark where you use the conditions $0 < \tau$ and $\tau < n - 2$.
The case $\tau = n - 2$ is handled similarly, as in the next lemma.
Lemma 7-66. There is a constant $C > 0$ such that for all $f \in C^{0}_{-n}(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, it follows that $f(y)|x - y|^{2-n} \in L^1(\mathbb{R}^n, dy)$, and in fact $Nf \in C^{0}_{2-n}(\mathbb{R}^n)$ with $\|Nf\|_{C^{0}_{2-n}} \le C(\|f\|_{C^{0}_{-n}} + \|f\|_{L^1})$.

Exercise 7-67. Prove the preceding lemma. To do this, for $|x| > 0$, you might break the integral into two parts: over $\{y : |x - y| \le \frac{|x|}{2}\}$ and $\{y : |x - y| \ge \frac{|x|}{2}\}$. Continuity of $Nf$ can be shown with a dominated convergence argument, or by showing $Nf$ is differentiable.
With these lemmas in hand, what remains to be shown is that if $0 < \alpha < 1$, $0 < \tau < n - 2$ and $f \in C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, or if $\tau = n - 2$ and $f \in C^{k-2,\alpha}_{-n}(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, then $Nf \in C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$ and $\Delta Nf = f$. A straightforward approach is to compute the partial derivatives of $Nf$ directly, using the Hölder continuity of $f$ when computing the second partials of $Nf$, as in [107, Chapter 4], for example, to conclude $\Delta Nf = f$. From here, the claim follows from the regularity result one gets from the basic estimate (7.2.14). A slight modification of this approach is as follows, assuming you know $\Delta N\phi = \phi$ for $\phi \in C_c^{0,\alpha}(\mathbb{R}^n)$. For $R \ge 1$ we let $\psi_R(x) = \psi(|x|/R)$, where $\psi : \mathbb{R} \to \mathbb{R}$ is a smooth function with $0 \le \psi \le 1$, with $\psi(t) = 1$ for $t \le 1$, and $\psi(t) = 0$ for $t \ge 2$. Let $f_R = \psi_R f \in C_c^{k-2,\alpha}(\mathbb{R}^n)$. Then $\Delta Nf_R = f_R$, and $f_R$ converges to $f$ in $C^{k-2,\alpha}_{-\tau'-2}(\mathbb{R}^n)$ for $0 < \tau' < \tau$. By applying Lemma 7-64 to the difference $f - f_R$, we have that $Nf_R$ converges to $Nf$ in $C^{0}_{-\tau'}(\mathbb{R}^n)$; using (7.2.14) we conclude that $Nf_R$ is Cauchy in $C^{k,\alpha}_{-\tau'}(\mathbb{R}^n)$ as $R \nearrow \infty$, so that the limit $Nf$ is in $C^{k,\alpha}_{-\tau'}(\mathbb{R}^n)$, and $\Delta Nf = f$ as desired. Since $Nf \in C^{0}_{-\tau}(\mathbb{R}^n)$, we conclude from Proposition 7-63 that $Nf \in C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$. From here the isomorphism statements are immediate. □
Before moving on, we urge the reader to analyze the proofs of the preceding lemmas in case $f \in C^{0}_{-\tau-2}(\mathbb{R}^n)$ for $\tau > n - 2$. Note for such $f$, we have $f \in L^1(\mathbb{R}^n)$ as well, but we cannot conclude $Nf \in C^{0}_{-\tau}(\mathbb{R}^n)$. Where does the obstruction appear in the proofs of the lemmas? We see that with $\tau > n - 2$, for general $f \in C^{0,\alpha}_{-\tau-2}(\mathbb{R}^n)$, one can conclude $Nf \in C^{2,\alpha}_{2-n}(\mathbb{R}^n)$ and $\Delta Nf = f$, but we cannot conclude $Nf \in C^{2,\alpha}_{-\tau}(\mathbb{R}^n)$. The reader should keep in mind that $|x|^{2-n}$ is harmonic on $\mathbb{R}^n \setminus \{0\}$, and gives the leading order rate of decay for functions which are harmonic at infinity in Euclidean space; as such, you can let $w$ be a smooth function which is equal to $|x|^{2-n}$ outside a ball, and $f = \Delta w$ is in any of the weighted spaces under discussion. If $\Delta v = f$ with $\lim_{|x|\to\infty} v(x) = 0$, then $w = v$ by the maximum principle applied to the harmonic function $w - v$. For more on the asymptotics of solutions of the Poisson equation in weighted Hölder spaces, please see [157]. □
The isomorphism result is often applied as follows:
Corollary 7-68. Suppose $0 < \tau < n - 2$, $\mu > 0$, $k \ge 2$, $p' > n \ge 3$ and $1 < p \le p'$. Suppose $(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end in which $(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$. Furthermore, suppose $\nu > 2$ and $h \in W^{k-2,\infty}_{-\nu}$. If the operator $\Delta_g - h : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is injective, such as is the case for $\|h\|_{W^{k-2,\infty}_{-\nu}}$ sufficiently small, then it is an isomorphism, and so there is a $C > 0$ such that, for all $w \in W^{k,p}_{-\tau}$,
$$\|w\|_{W^{k,p}_{-\tau}} \le C\|(\Delta_g - h)w\|_{W^{k-2,p}_{-\tau-2}}. \tag{7.2.17}$$
Proof. We show that the operator $\Delta_g - h$ is a compact perturbation of an isomorphism, from which we can conclude it is Fredholm of index zero. Indeed, for $u \in W^{k-2,p}_{-\tau}$ and $h \in W^{k-2,\infty}_{-\nu}$, a simple argument using the product rule gives
$$\|hu\|_{W^{k-2,p}_{-\tau-2}} \le C\|h\|_{W^{k-2,\infty}_{-\nu}}\|u\|_{W^{k-2,p}_{-\tau-2+\nu}} \le C\|h\|_{W^{k-2,\infty}_{-\nu}}\|u\|_{W^{k-2,p}_{-\tau}}. \tag{7.2.18}$$
In particular, then, the function $T_h$ given by $T_h(u) = hu$ gives a bounded linear operator $T_h : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$. We show that $T_h$ is compact. Consider a bounded sequence in $W^{k,p}_{-\tau}$; by Lemma 7-50, we can assume by taking a subsequence that it converges in $W^{k-2,p}_{-\tau+\nu-2}$. Let $u_i$ be such a subsequence. By applying the first inequality in (7.2.18) with $u = u_i - u_j$, we see that $T_h(u_i) = hu_i$ is Cauchy, hence convergent, in $W^{k-2,p}_{-\tau-2}$. Thus $\Delta_g - h$ is Fredholm of index zero, and so it is an isomorphism whenever it is injective.
If $\|h\|_{W^{k-2,\infty}_{-\nu}}$ is sufficiently small, then by (7.2.18) the operator $\Delta_g - h$ is a small perturbation of an isomorphism, and hence is an isomorphism. □
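The last step of the proof rests on the standard Neumann series argument, sketched below for convenience. The abbreviations $X = W^{k,p}_{-\tau}$, $Y = W^{k-2,p}_{-\tau-2}$ and $A = \Delta_g : X \to Y$ are introduced here only for illustration; $T_h$ is multiplication by $h$, as in the proof.

```latex
\begin{align*}
&\text{If } \|A^{-1}\|_{Y \to X}\,\|T_h\|_{X \to Y} < 1, \text{ then }
  A - T_h = A\,(I - A^{-1}T_h), \qquad
  (I - A^{-1}T_h)^{-1} = \sum_{j=0}^{\infty} (A^{-1}T_h)^{j}, \\
&\text{so } A - T_h \text{ is invertible, with }
  \|w\|_{X} \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|T_h\|}\,\|(A - T_h)w\|_{Y}
  \quad \text{for all } w \in X.
\end{align*}
```

By (7.2.18), $\|T_h\|_{X \to Y} \le C\|h\|_{W^{k-2,\infty}_{-\nu}}$, so the smallness hypothesis on $h$ is exactly what makes the series converge.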
There is of course an analogous statement in Hölder spaces.
Corollary 7-69. Let $n \ge 3$, $\mu > 0$, $\alpha \in (0, 1)$, and $k \ge 2$. Suppose $(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end in which $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}$. Suppose furthermore that $\nu > 2$ and $h \in C^{k-2,\alpha}_{-\nu}$. If $0 < \tau < n - 2$ and if the operator $\Delta_g - h : C^{k,\alpha}_{-\tau} \to C^{k-2,\alpha}_{-\tau-2}$ is injective, as for example when $\|h\|_{C^{k-2,\alpha}_{-\nu}}$ is sufficiently small, then it is an isomorphism, so that there is a $C > 0$ such that for all $w \in C^{k,\alpha}_{-\tau}$,
$$\|w\|_{C^{k,\alpha}_{-\tau}} \le C\|(\Delta_g - h)w\|_{C^{k-2,\alpha}_{-\tau-2}}. \tag{7.2.19}$$
Now suppose that the only solution $w$ of $(\Delta_g - h)w = 0$ with $w \in C^{k,\alpha}_{2-n}$ and $\Delta_g w \in L^1$ is the trivial solution $w = 0$. Then for any $f \in C^{k-2,\alpha}_{-n} \cap L^1$, there is a unique $w \in C^{k,\alpha}_{2-n}$ such that $(\Delta_g - h)w = f$. There is a constant $C > 0$ such that for all $w \in C^{k,\alpha}_{2-n}$,
$$\|w\|_{C^{k,\alpha}_{2-n}} \le C\big(\|(\Delta_g - h)w\|_{C^{k-2,\alpha}_{-n}} + \|(\Delta_g - h)w\|_{L^1}\big). \tag{7.2.20}$$
The proof is similar to that of the preceding corollary, and is sketched in Exercise 7-105.

Asymptotic expansion. Before we move back to geometry and the Einstein constraint equations, we will discuss a partial expansion near infinity for solutions of a Poisson equation in weighted spaces; cf. [15, Theorem 1.17]. The proof, which we recall below, uses the fact that $\Delta_g$ is asymptotic to $\Delta$ in an appropriate sense (the coefficients of $(\Delta - \Delta_g)$ decay near infinity in any asymptotic end), where $\Delta$ is the Euclidean Laplacian, along with the spherical harmonic expansion of Euclidean harmonic functions discussed earlier (p. 244).

Proposition 7-70. Suppose $E$ is an asymptotically flat end in $(M^n, g)$, admitting asymptotic coordinates $x$ in which $(g_{ij} - \delta_{ij}) \in W^{1,p'}_{-\mu}(E)$ for $\mu > 0$ and $p' > n \ge 3$. If $\frac{n}{2} < p \le p'$, $\tau \in (0, n - 2)$, and if $v \in W^{0,p}_{-\tau}(E)$ satisfies $\Delta_g v = f \in W^{0,p}_{-\beta}(E)$ for some $\beta > n$, there is a constant $A$ and a $\gamma > 0$ such that
$$v(x) = \frac{A}{|x|^{n-2}} + o(|x|^{-(n-2)-\gamma}).$$
Proof. By multiplying $v$ by a cutoff function which is supported in $E$ and is identically one near infinity, if necessary, we may assume that we are working on an asymptotically flat metric on $\mathbb{R}^n$. By Proposition 7-61, we have $v \in W^{2,p}_{-\tau}$. We choose $\delta > 0$ small enough so that $\delta \le \mu$, $0 < \delta + \tau < n - 2$ and $\delta < \beta - n$. We can further restrict $\delta$ so that $-(a\delta + \tau)$ is nonexceptional for all $a \in \mathbb{Z}_+$. Therefore $\Delta : W^{2,p}_{-a\delta-\tau} \to W^{0,p}_{-a\delta-\tau-2}$ is Fredholm, and there is an $\ell \in \mathbb{Z}_+$ such that $\ell\delta + \tau + 2 < n < (\ell + 1)\delta + \tau + 2 < \beta$. Since $\delta \le \mu$, by Lemma 7-54, we have $\Delta v = (\Delta - \Delta_g)(v) + f$, with $(\Delta - \Delta_g)(v) \in W^{0,p}_{-\delta-\tau-2}$.
We also observe that $f \in W^{0,p}_{-\beta} \subset W^{0,p}_{-(\ell+1)\delta-\tau-2} \subset W^{0,p}_{-\delta-\tau-2}$. Thus $\Delta v \in W^{0,p}_{-\delta-\tau-2}$, and so $v \in W^{2,p}_{-\delta-\tau}$, from which we again use $\delta \le \mu$ to conclude that $(\Delta - \Delta_g)(v) \in W^{0,p}_{-2\delta-\tau-2}$. Hence $\Delta v \in W^{0,p}_{-2\delta-\tau-2}$, and so if $\ell \ge 2$, we have $v \in W^{2,p}_{-2\delta-\tau}$.
We can continue this way until we get to $\Delta v \in W^{0,p}_{-(\ell+1)\delta-\tau-2}$. By the Fredholm property, the range of $\Delta : W^{2,p}_{-(\ell+1)\delta-\tau} \to W^{0,p}_{-(\ell+1)\delta-\tau-2}$ has finite codimension, which together with the density of smooth functions with compact support, implies there is a finite-dimensional subspace $S \subset C_c^\infty$ with $\Delta(W^{2,p}_{-(\ell+1)\delta-\tau}) \oplus S = W^{0,p}_{-(\ell+1)\delta-\tau-2}$. Thus we conclude there is an $R > 0$ and a $w \in W^{2,p}_{-(\ell+1)\delta-\tau}$ such that $\Delta(v - w) = 0$ on $E_R := \{|x| > R\}$. As $v$ and $w$ both decay, $v - w$ is harmonic at infinity and so as in Section 7.1.2.4, $v - w$ admits a spherical harmonic expansion in $E_R$, with the first term in the expansion of the form $A/|x|^{n-2}$ for some constant $A$.
Now we apply Sobolev embedding (using $p > \frac{n}{2}$) to complete the argument. Note that since $A/|x|^{n-2} \in W^{2,p}_{-\ell\delta-\tau}(E_R) \setminus W^{2,p}_{-(\ell+1)\delta-\tau}(E_R)$ if $A \neq 0$, we cannot in general go further due to the degree of the leading harmonic. □
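The weight bookkeeping in the bootstrap can be made explicit with exact rational arithmetic. The Python sketch below is only an illustration: the function name and interface are hypothetical, and the caller is assumed to have already chosen $\delta$ with $\delta \le \mu$ and $\delta < \beta - n$. It computes $\ell$ and the successive decay rates $\tau + a\delta$ for $a = 0, \ldots, \ell + 1$, rejecting a $\delta$ for which one of these rates is exceptional.

```python
from fractions import Fraction
from math import floor


def bootstrap_rates(n, tau, delta):
    """Return (ell, rates) where rates[a] = tau + a*delta for a = 0..ell+1 and
    ell*delta + tau + 2 < n < (ell+1)*delta + tau + 2.
    Raises ValueError if delta is inadmissible for this bookkeeping."""
    tau, delta = Fraction(tau), Fraction(delta)
    gap = Fraction(n - 2) - tau  # distance from tau to the threshold n - 2
    if not (0 < delta < gap):
        raise ValueError("need delta > 0 and delta + tau < n - 2")
    ell = floor(gap / delta)
    rates = [tau + a * delta for a in range(ell + 2)]
    for r in rates:
        # -(a*delta + tau) is exceptional when a*delta + tau is an integer >= n - 2
        if r.denominator == 1 and r >= n - 2:
            raise ValueError("exceptional weight encountered; perturb delta")
    return ell, rates
```

For example, with $n = 3$, $\tau = 1/4$ and $\delta = 1/5$ the rates climb through $1/4, 9/20, 13/20, 17/20$ and finally $21/20$, which is the first rate past the threshold $n - 2 = 1$ set by the leading harmonic $|x|^{2-n}$.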
Of particular importance in applying this below is the ability to differentiate
the expansion and improve decay by a power of |x|−1 for each successive partial
derivative. We can formulate the above proposition in appropriate higher regu-
larity weighted spaces, so that the error term and some number of its derivatives
enjoy suitable decay. The pointwise estimates will come via Sobolev embedding.
Exercise 7-71. a. Assume that $p > n$ in Proposition 7-70. Prove that $v(x) = A/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma})$ for some $A$ and $\gamma > 0$.
b. Formulate and prove an analogue of Proposition 7-70 whose conclusion is that $v(x) = A/|x|^{n-2} + O_2(|x|^{-(n-2)-\gamma})$ for some $A$ and $\gamma > 0$.
We can also get pointwise estimates using weighted Hölder spaces and
Schauder estimates from Proposition 7-63; see [157, Theorem 2].
Proposition 7-72. Suppose $E$ is an asymptotically flat end in $(M^n, g)$, admitting asymptotic coordinates $x$ in which $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}(E)$ for some $\mu > 0$ and $k \ge 2$. If $\beta > n \ge 3$ and $\tau \in (0, n - 2)$, and if $v \in C^{0}_{-\tau}(E)$ and $\Delta_g v \in C^{k-2,\alpha}_{-\beta}(E)$, then there is a constant $A$ and $\gamma > 0$ such that
$$v(x) = \frac{A}{|x|^{n-2}} + O_{k,\alpha}(|x|^{-(n-2)-\gamma}).$$
Sketch of proof. Argue as in the proof of Proposition 7-70, with $\delta$ as chosen there. Then $\Delta v \in C^{k-2,\alpha}_{-\delta-\tau-2}$. Recall Remark 7-59. □

7.2.3. Application to scalar curvature deformation. We now develop some controlled scalar curvature deformation results, using conformal and nonconformal techniques, on asymptotically flat manifolds.
Suppose $M$ admits an asymptotically flat metric, i.e., $M$ can be obtained from a closed and connected manifold by removing a finite nonempty set of points. Then there is a (smooth) metric $\mathring{g}$ on $M$ for which $\mathring{g}_{ij} = \delta_{ij}$ in each of the end charts. For $kp > n$ and $\tau > 0$, we let $\mathcal{M}^{k,p}_{-\tau}$ denote the space of metrics on $M$ for which, relative to the family of charts for the ends, $g_{ij} - \delta_{ij} \in W^{k,p}_{-\tau}$ on each end. By Sobolev embedding, if $g \in \mathcal{M}^{k,p}_{-\tau}$, there is $\varepsilon > 0$ such that if $h \in W^{k,p}_{-\tau}$ is a symmetric two-tensor field with $\|h\|_{W^{k,p}_{-\tau}} < \varepsilon$ (norm taken with respect to $g$ or $\mathring{g}$), then $g + h \in \mathcal{M}^{k,p}_{-\tau}$.
Suppose $k \ge 2$. If furthermore $k > \frac{n}{p}$, judicious use of the Sobolev embedding shows the scalar curvature map is smooth as a map from the set of metrics $\mathcal{M}^{k,p}_{-\tau}$ to $W^{k-2,p}_{-\tau-2}$. Indeed, the scalar curvature is given by
$$R(g) = g^{jk}\big(\Gamma^{\ell}_{jk,\ell} - \Gamma^{\ell}_{\ell k,j} + \Gamma^{m}_{jk}\Gamma^{\ell}_{\ell m} - \Gamma^{m}_{\ell k}\Gamma^{\ell}_{jm}\big).$$
Thus it suffices to prove that the multiplication maps $W^{k,p}_{-\tau} \times W^{k,p}_{-\tau} \to W^{k,p}_{-\tau}$, $W^{k,p}_{-\tau} \times W^{k-2,p}_{-\tau-2} \to W^{k-2,p}_{-\tau-2}$ and $W^{k-1,p}_{-\tau-1} \times W^{k-1,p}_{-\tau-1} \to W^{k-2,p}_{-\tau-2}$ are continuous maps. We leave this as an exercise for the reader, as these are weighted versions of the analogous statements analyzed in the proof of Proposition 6-54.
We now state an asymptotic version of Theorem 6-52; cf. [89].

Proposition 7-73 (Fischer–Marsden). Suppose $n \ge 3$, $p > 1$ and $k \ge 2$ satisfy $kp > n$, and suppose $\tau \in (0, n - 2)$. The scalar curvature map $R : \mathcal{M}^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$ maps a neighborhood of the Euclidean metric onto a neighborhood of the zero function. Moreover, any $h_0 \in W^{k,p}_{-\tau}(\mathbb{R}^n)$ which satisfies $DR_{g_{\mathbb{E}^n}}(h_0) = 0$ is tangent to a path of metrics $g(t) \in \mathcal{M}^{k,p}_{-\tau}(\mathbb{R}^n)$ with $R(g(t)) = 0$.

Proof. Let $L = DR_{g_{\mathbb{E}^n}}$, so that $Lh = -\Delta(\operatorname{tr} h) + \operatorname{div}\operatorname{div} h$. We show that
$L : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$ is surjective. This is a plausible claim, since $L^*$ is
easily seen to be injective on weighted spaces which promote decay at infinity.
Indeed, if we trace $0 = L^*N = -(\Delta N)g_{\mathbb{E}^n} + \operatorname{Hess} N$, we obtain $\Delta N = 0$, and
thus $\operatorname{Hess} N = 0$, so that $N \in \operatorname{span}\{1, x^1, \ldots, x^n\}$.
Recall that for $\tau \in (0, n-2)$, $\Delta : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$ is an isomorphism.
With the identity $L(w g_{\mathbb{E}^n}) = -(n-1)\Delta w$ in mind, given any $f \in W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$,
we let $w = -\frac{1}{n-1}\Delta^{-1} f \in W^{k,p}_{-\tau}(\mathbb{R}^n)$. Then $h := w g_{\mathbb{E}^n} \in W^{k,p}_{-\tau}(\mathbb{R}^n)$ satisfies
$Lh = f$, as desired.
It follows that the space of symmetric tensor fields has a splitting $W^{k,p}_{-\tau}(\mathbb{R}^n) =
\{w g_{\mathbb{E}^n} : w \in W^{k,p}_{-\tau}(\mathbb{R}^n)\} \oplus \{h_0 \in W^{k,p}_{-\tau}(\mathbb{R}^n) : Lh_0 = 0\}$. It is easy to see that these
subspaces are both closed. Since the scalar curvature map is smooth, we can
conclude from here by the implicit function theorem. □
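The identity $L(w g_{\mathbb{E}^n}) = -(n-1)\Delta w$ and the trace computation for $L^*$ used in the proof can be checked directly in Euclidean coordinates. The following short verification uses only the formulas for $L$ and $L^*$ stated above.

```latex
\begin{align*}
\operatorname{tr}(w\, g_{\mathbb{E}^n}) &= n\,w, \qquad
\operatorname{div}\operatorname{div}(w\, g_{\mathbb{E}^n}) = \partial_i \partial_j (w\,\delta_{ij}) = \Delta w,\\
L(w\, g_{\mathbb{E}^n}) &= -\Delta(n\,w) + \Delta w = -(n-1)\,\Delta w,\\
\operatorname{tr}(L^* N) &= -n\,\Delta N + \Delta N = -(n-1)\,\Delta N .
\end{align*}
```

In particular, $L^*N = 0$ forces $\Delta N = 0$ (as $n \geq 3$), and substituting back into $L^*N = 0$ gives $\operatorname{Hess} N = 0$.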
278 7. A SYMPTOTICALLY FLAT SOLUTIONS OF THE E INSTEIN CONSTRAINT EQUATIONS

The directions transverse to the kernel in the above proof are tangent to
conformal deformations. For a splitting analogous to (6.3.2), see Exercise 7-109.

Exercise 7-74. Let $L = DR_{g_{\mathbb{E}^n}}$ as above.
a. Show directly (and in one line) that if the symmetric (0, 2)-tensor $h$ has
compact support, and if $Lh \geq 0$, then $Lh = 0$.
b. Show by an elementary construction that there exists an infinite-dimensional
space of smooth symmetric TT tensors (trace-free, divergence-free) on $(\mathbb{R}^3, g_{\mathbb{E}^3})$
with compact support. Such tensors automatically satisfy $Lh = 0$. (The existence
follows also from more general results of [24].)

As discussed following Corollary 6-53, the only metrics on $\mathbb{T}^n$ with nonnegative
scalar curvature are flat. If $h$ is a smooth symmetric (0, 2)-tensor on $\mathbb{R}^n$
with $Lh = 0$ (compactly supported, or in a suitable weighted space as above),
Proposition 7-73 yields a path of scalar-flat metrics $g(t)$ tangent to $h$ at $g_{\mathbb{E}^n}$.
Even if $h$ has compact support, then, we see that if $g(t)$ is not isometric to
the Euclidean metric, then it cannot be Euclidean near infinity either, so the
perturbation $g(t) - g_{\mathbb{E}^n}$ is not compactly supported (else we could use $g(t)$
to construct a non-flat, scalar-flat quotient $\mathbb{T}^n$). In the next proposition, we
show the analogue of the local surjectivity result in Proposition 7-73 holds near
any asymptotically flat metric $g$, and again, we note that from the conformal
deformation used, the metric perturbation $\gamma - g$ so obtained will not in general
have compact support.

Proposition 7-75. Let $p > 1$ and $k \geq 2$ be such that $kp > n \geq 3$, and consider a
(smooth) asymptotically flat manifold $(M^n, g)$ (of rate $q > 0$ and order $k$). For
$\tau \in (0, n-2)$ with $\tau < q$, there is an $\varepsilon > 0$ such that if $\|S\|_{W^{k-2,p}_{-\tau-2}} < \varepsilon$, there is
a function $u > 0$ with $u(x) \to 1$ as $|x| \to \infty$, and a smooth symmetric (0, 2)-tensor
$h$ compactly supported in $M$, such that $\gamma := u^{\frac{4}{n-2}} g + h$ is a metric and
$R(\gamma) = R(g) + S$. If $q + \tau > n - 2$, and $S \in W^{0,p'}_{-\delta'}$ for some $p' > \frac{n}{2}$ and $\delta' > n$,
then $u$ admits an expansion $u(x) = 1 + A/|x|^{n-2} + O(|x|^{-\beta})$ in any end, for
some $\beta > n - 2$.
Proof. Let $c_n = \frac{n-2}{4(n-1)}$. Let $T$ be the nonlinear operator defined on the open
subset of $W^{k,p}_{-\tau}$ where $v > -1$ by
\[
T(v) = R\big((1+v)^{\frac{4}{n-2}} g\big) = (1+v)^{-\frac{n+2}{n-2}} \Big( R(g)(1+v) - c_n^{-1}\Delta_g(1+v) \Big).
\]
$T$ is a smooth map to $W^{k-2,p}_{-\tau-2}$, by arguments similar to those showing that the
scalar curvature map is smooth. Of course T (0) = R(g), and the linearization
A SYMPTOTICALLY FLAT INITIAL DATA 279

$DT|_0$ at $v = 0$ is given by
\[
DT|_0(w) = R(g)w - c_n^{-1}\Delta_g w - \frac{n+2}{n-2}\, w R(g) = -\Big( c_n^{-1}\Delta_g w + \frac{4}{n-2}\, w R(g) \Big).
\]
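The linearization formula can be recovered by differentiating the expression for $T$ along the path $v = tw$ at $t = 0$. The following is a sketch of that computation, using only the product rule and $\Delta_g 1 = 0$.

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} T(tw)
 &= -\frac{n+2}{n-2}\, w \,\big( R(g) - c_n^{-1}\Delta_g 1 \big)
    + \big( R(g)\,w - c_n^{-1}\Delta_g w \big) \\
 &= \Big( 1 - \frac{n+2}{n-2} \Big)\, w\,R(g) - c_n^{-1}\Delta_g w
  = -\Big( c_n^{-1}\Delta_g w + \frac{4}{n-2}\, w\,R(g) \Big).
\end{align*}
```

The first term comes from differentiating the prefactor $(1+tw)^{-\frac{n+2}{n-2}}$, the second from differentiating the bracket.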


For $S$ suitably small, we would like to solve $T(v) = R(g) + S$, for $v$ small,
using the inverse function theorem. Let $2 \leq k' \leq k$. By the asymptotics of $g$,
$w \mapsto wR(g)$ gives a bounded linear map from $W^{k',p}_{-\tau}$ to $W^{k'-2,p}_{-\tau-q-2}$, which is readily
seen to be compact as a map to $W^{k'-2,p}_{-\tau-2}$ by Rellich's lemma (Lemma 7-50), as
in the proof of Corollary 7-68. Thus, $DT|_0 : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-2}$ is a Fredholm
operator of zero index, as a compact perturbation of the isomorphism $-c_n^{-1}\Delta_g$.
We use a method from [69] to find $h$. We first show that the linearization
$L_g := DR_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$ of the scalar curvature operator is surjective,
for $\tau \in (0, n-2)$. The range of $L_g$ contains $DT|_0(W^{2,p}_{-\tau})$, and so it has finite
codimension, and hence is closed. Suppose a dual element $f \in W^{0,q'}_{\tau+2-n}$, $q' = \frac{p}{p-1}$,
annihilates the range of $L_g$. Then $0 = L_g^* f = -(\Delta_g f)g + \operatorname{Hess}_g f - f\operatorname{Ric}(g)$.
By elliptic regularity, $f$ is smooth and as we proved earlier (see Section 2.4.5),
if $f$ were nontrivial, $R(g)$ would be constant (since $M$ is connected), so that by
asymptotic flatness, $R(g) = 0$. However in case $R(g)$ vanishes identically, we
see $\Delta_g f = 0$, and so $f = 0$ (since $0 < n - 2 - \tau < n - 2$). Thus in any case we
have $f = 0$, showing that $L_g$ is surjective as desired.
By the density of $C_c^\infty$, we let $\omega_1, \ldots, \omega_\ell$ be compactly supported smooth
(0, 2)-tensors for which $\operatorname{span}\{L_g\omega_1, \ldots, L_g\omega_\ell\}$ is a complementary subspace
to $DT|_0(W^{2,p}_{-\tau})$ in $W^{0,p}_{-\tau-2}$. This span is easily seen to form a complementary
subspace to $DT|_0(W^{k,p}_{-\tau})$ in $W^{k-2,p}_{-\tau-2}$, by elliptic regularity (Proposition 7-61)
applied to $DT|_0(w) + \sum_{i=1}^{\ell} c_i L_g\omega_i \in W^{k-2,p}_{-\tau-2}$. (Alternatively, since a distributional
solution of $L_g^* f = 0$ is smooth, essentially the same argument as above
shows $L_g := DR_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is surjective.)
We let $X$ be a complementary space to the kernel of $DT|_0$, and we define,
for suitably chosen neighborhoods $Y$ of $0 \in X$ and $Z$ of $0 \in \operatorname{span}\{\omega_1, \ldots, \omega_\ell\}$,
the map $\mathcal{T} : Y \oplus Z \to W^{k-2,p}_{-\tau-2}$ by $\mathcal{T}(v, h) = R\big((1+v)^{\frac{4}{n-2}} g + h\big)$. Note that
$\mathcal{T}(v, 0) = T(v)$, so that $D\mathcal{T}|_{(0,0)}(w, 0) = DT|_0(w)$, while $D\mathcal{T}|_{(0,0)}(0, \omega) = L_g\omega$.
Thus by design, $\mathcal{T}$ is a smooth map for which $D\mathcal{T}|_{(0,0)}$ is an isomorphism. From
here, the result follows from the inverse function theorem, and the standard
asymptotic expansion follows as in Proposition 7-70 (see Exercise 7-108). □

This result gives us good control on u and h for small S; however, for our
purposes in studying the ADM energy, the above result does not suffice, since we
do not get control on the sign of A. As we will see, for suitably small R(g), or
for R(g) ≥ 0, we can avoid the difficulty encountered in the above proof, which
accounts for the possibility that the linearization is not invertible. Note that when
we work in the smooth setting and carry out constructions like that above, we
should analyze the regularity of the result. In the preceding, for example, one
can readily show that if S were smooth, then the solution v would be smooth
outside the support of h. To guarantee this inside the support of h, one could
increase k if needed and use Sobolev embedding (to control the nonlinear terms)
and elliptic bootstrapping.
We now consider what happens to the ADM energy if we conformally trans-
form away the matter density in a time-symmetric asymptotically flat initial data
set. In the time-symmetric setting, the ADM linear momentum vanishes, and we
will let m(g) be the ADM mass (energy) for a chosen end of an asymptotically
flat $(M^n, g)$.
Proposition 7-76. Suppose $(M^n, g)$, $n \geq 3$, is asymptotically flat at rate $q > \frac{n-2}{2}$
and order $k + 1 \geq 3$, with nonnegative scalar curvature $R(g) \in L^1(M)$. There is
a conformally related metric $\bar g = u^{\frac{4}{n-2}} g$ which is asymptotically flat (of order $k$)
and has $R(\bar g) = 0$, with $m(\bar g) \leq m(g)$ (in each end).
Proof. Recall the conformal Laplacian $\mathcal{L}_g = \Delta_g - \frac{n-2}{4(n-1)} R(g)$, a self-adjoint
linear operator. By (6.1.8), for $u > 0$ and $\bar g = u^{\frac{4}{n-2}} g$, $R(\bar g)$ vanishes if and only
if $\mathcal{L}_g u = 0$. This is a linear equation, and we study $\mathcal{L}_g$ acting on a weighted
function space.
Consider $\tau \geq \frac{n-2}{2}$, and suppose that $v \in C^{k,\alpha}_{-\tau}$ solves $\mathcal{L}_g v = 0$. We have
$R(g)v \in C^{k-2,\alpha}_{-q-2-\tau}$ and $q + \tau + 2 > n$. From Proposition 7-72, we see that
$v\, \partial v/\partial x^j = O(|x|^{-2n+3})$ in asymptotic coordinates $x$, so that since $n \geq 3$,
$\lim_{r\to\infty} \int_{\{|x|=r\}} v\, (\partial v/\partial\nu)\, d\sigma = 0$, and hence $\lim_{r\to\infty} \int_{\{|x|=r\}} v\, (\partial v/\partial\nu_g)\, d\sigma_g = 0$
by Exercise 7-35. Integrating by parts we have
\[
-\int_M |\nabla_g v|^2 \, dv_g = \frac{n-2}{4(n-1)} \int_M R(g)\, v^2 \, dv_g \geq 0.
\]
Thus $v$ is a constant, and must vanish by the decay of $v$. (While this follows
more generally for $\tau > 0$ by the maximum principle, the argument above is also
useful; cf. Remark 7-78.)
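For completeness, the integration by parts can be written out on a large compact region before passing to the limit. In the sketch below, $M_r$ denotes the region of $M$ bounded by the coordinate spheres $\{|x| = r\}$ in the ends, and $c_n = \frac{n-2}{4(n-1)}$; the symbol $M_r$ is notation introduced only for this illustration.

```latex
\begin{equation*}
c_n \int_{M_r} R(g)\, v^2 \, dv_g
 = \int_{M_r} v\, \Delta_g v \, dv_g
 = -\int_{M_r} |\nabla_g v|^2 \, dv_g
   + \int_{\partial M_r} v\, \frac{\partial v}{\partial \nu_g}\, d\sigma_g .
\end{equation*}
```

The first equality is the equation $\Delta_g v = c_n R(g) v$ multiplied by $v$. As $r \to \infty$ the boundary term tends to zero by the decay estimate above, so a nonnegative quantity equals a nonpositive one, and both must vanish.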
We take $\tau \leq q$ with $\tau \in \big[\frac{n-2}{2}, n-2\big)$, and apply Corollary 7-69 to the operator
$\mathcal{L}_g$: we solve $\mathcal{L}_g v = \frac{n-2}{4(n-1)} R(g)$ uniquely for $v \in C^{k,\alpha}_{-\tau}$, and then let $u = 1 + v$.
We see that $u \to 1$ on approach to infinity in any end, and that $\mathcal{L}_g u = 0$, and thus
$u$ is smooth. We only need to show that $u > 0$ to have an honest conformal factor
and the asymptotically flat metric $\bar g$ (of order $k$) with zero scalar curvature. By
the weak maximum principle, $0 \leq u \leq 1$, and by the strong maximum principle
(or the Harnack inequality), $u > 0$.
Suppose first that $R(g) \in C^{0,\alpha}_{-\beta}$ for some $\beta > n$ (such as would hold in
case $(g, K)$ satisfies the vacuum constraints). Then, since $u \leq 1$, we see that
$A \leq 0$ in the expansion from Proposition 7-72, so that by Lemma 7-37, we have
$m(\bar g) = m(g) + 2A \leq m(g)$. For more general $R(g)$, the mass estimate follows
in the case of one end by showing that
\[
m(\bar g) - m(g) = -\frac{1}{2(n-1)\omega_{n-1}} \int_M R(g)\, u \, dv_g, \tag{7.2.21}
\]
which can be derived by an expansion similar to that in the proof of Lemma 7-37,
along with integration by parts as in the following remark. For the case of
multiple ends, see, e.g., [142, pp. 88–89]. □
Remark 7-77. For a nontrivial case of the preceding proposition (meaning that
R(g) does not vanish identically), the mass of each end strictly decreases. When
there is a single asymptotic end, this follows by (7.2.21), whereas in the case
of multiple ends, we can conclude from here that the mass of at least one end
strictly decreases.
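Consider the case $R(g) \in C^{0,\alpha}_{-\beta}$, $\beta > n$, when the conformal factor $u$ admits
an expansion. In the case of one asymptotic end, we find, using Exercise 7-35 to
switch to the Euclidean surface measure,
\[
\frac{n-2}{4(n-1)} \int_M R(g)\, u \, dv_g = \int_M \Delta_g u \, dv_g
 = \lim_{r\to\infty} \int_{\{|x|=r\}} \frac{\partial u}{\partial \nu_g} \, d\sigma_g
 = \lim_{r\to\infty} \int_{\{|x|=r\}} \frac{\partial u}{\partial \nu} \, d\sigma = -(n-2)\,\omega_{n-1} A.
\]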
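As a consistency check, this boundary computation reproduces the relation $m(\bar g) = m(g) + 2A$ from the proof of Proposition 7-76 when it is substituted into (7.2.21). The following lines carry out that arithmetic.

```latex
\begin{align*}
\int_M R(g)\, u \, dv_g
 &= \frac{4(n-1)}{n-2} \cdot \big( -(n-2)\,\omega_{n-1} A \big)
  = -4(n-1)\,\omega_{n-1} A, \\
m(\bar g) - m(g)
 &= -\frac{1}{2(n-1)\,\omega_{n-1}} \cdot \big( -4(n-1)\,\omega_{n-1} A \big)
  = 2A .
\end{align*}
```

Since $R(g) \geq 0$ and $u > 0$, the integral is nonnegative, which is another way to see that $A \leq 0$.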

In the case of multiple ends, one could isolate one end at a time, by considering
a comparison principle on a manifold-with-boundary which contains only one
asymptotic end, the chosen asymptotic end, see Exercise 7-94. The exercise
invokes a solution to a certain Dirichlet boundary value problem with asymptotic
conditions, and we note that one can modify the theory of weighted spaces to
allow boundary; see, e.g., [152]. In fact we remark that Schoen and Yau [199]
give a direct proof for the existence and expansion of solutions of the analogous
Neumann boundary value problem.
Note that the above integration by parts formula would hold for a conformal
factor $u$ satisfying the PDE $\mathcal{L}_g u = 0$ on a manifold with compact boundary
and one asymptotic end, such that $u$ goes to 1 at infinity in the end, and with
$\partial u/\partial\nu = 0$ on the boundary. One could apply this together with (7.2.21) to
handle the multiple end case; cf. [142, pp. 88–89].
Remark 7-78. We adapt an argument from [199] to show that the conformal
Laplacian $\mathcal{L}_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$, where $\tau \in (0, n-2)$, $p > \frac{n}{2}$, is invertible for
$g$ asymptotically flat with $R(g) \geq 0$, or in fact, whenever $R(g)$ has suitably
small negative part $R(g)_-$. The operator $\mathcal{L}_g$ is Fredholm of index zero, since
it is a compact perturbation of the Laplacian. For any $w$ in the kernel, we
obtain by integration by parts, using the decay of $w$ and the Hölder inequality
(Exercise 7-95), that
\[
\|\nabla w\|^2_{L^2(dv_g)} \leq c\, \|R(g)_-\|_{L^{n/2}(dv_g)} \|w\|^2_{L^{2^*}(dv_g)},
\]
where $2^* = \frac{2n}{n-2}$ is the Sobolev conjugate exponent to 2, in dimension $n > 2$.
For $R(g)_-$ small in $L^{n/2}$, we see that $w$ must be zero by the Sobolev inequality:
$\|w\|_{L^{2^*}(dv_g)} \leq C\|\nabla w\|_{L^2(dv_g)}$, and therefore $\mathcal{L}_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$ is an
isomorphism.
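The smallness condition in Remark 7-78 can be made explicit by chaining the two inequalities. In the sketch below, $c$ and $C$ are the constants from the kernel estimate and the Sobolev inequality as they appear in the remark.

```latex
\begin{align*}
\|\nabla w\|^2_{L^2(dv_g)}
 &\leq c\, \|R(g)_-\|_{L^{n/2}(dv_g)}\, \|w\|^2_{L^{2^*}(dv_g)}
  \leq c\, C^2\, \|R(g)_-\|_{L^{n/2}(dv_g)}\, \|\nabla w\|^2_{L^2(dv_g)} .
\end{align*}
```

Hence if $c\,C^2\,\|R(g)_-\|_{L^{n/2}(dv_g)} < 1$, then $\nabla w = 0$, so $w$ is constant and vanishes by its decay. An injective Fredholm operator of index zero is an isomorphism, which gives the conclusion.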

For analyzing the question of the sign of the ADM mass of time-symmetric
initial data sets with the dominant energy condition, it thus suffices to study the
vacuum case. In the next section we reduce further to the case of harmonically
flat asymptotics.

7.3. Harmonically flat asymptotics

Schoen and Yau [202] showed that asymptotically flat metrics with zero scalar
curvature can be approximated by metrics which are harmonically flat on the
asymptotic ends. We sketch the proof of a modestly strengthened statement,
slightly modifying and improving the treatment from [68]. In this section and
the next, asymptotically flat metrics have a rate $q > \frac{n-2}{2}$.
For the next proposition, we let $X^2_{-\tau}$ be either $C^{2,\alpha}_{-\tau}(M)$ (only if the order is at
least 3) or $W^{2,p}_{-\tau}(M)$, with $p > \frac{n}{2}$, $\frac{n-2}{2} < \tau < \min(q, n-2)$, $0 < \alpha < 1$. Likewise
$X^0_{-\tau-2}$ stands for either $C^{0,\alpha}_{-\tau-2}(M)$ or $W^{0,p}_{-\tau-2}(M)$, as appropriate.

Proposition 7-79 (Schoen–Yau). Suppose $(M, g)$ is asymptotically flat with
nonnegative $R(g) \in L^1(M)$. For any $\varepsilon > 0$, there is a metric $\bar g$ within $\varepsilon$ of $g$ in
$X^2_{-\tau}$, with $R(\bar g) \geq 0$, and which is harmonically flat near infinity in each end, with
$|m(g) - m(\bar g)| < \varepsilon$. There is also a metric $\tilde g$ with $R(\tilde g) = 0$ which is harmonically
flat near infinity in each end, with $m(\tilde g) < m(g) + \varepsilon$.

Proof. By Proposition 7-76, the second claim follows from the first, the proof of
which will in fact yield R(ḡ) = 0 in the case R(g) = 0 to start.
Let $0 \leq \psi \leq 1$ be a smooth cutoff function so that $\psi(t) = 1$ for $t < 1$ and
$\psi(t) = 0$ for $t > 2$. On each end we choose asymptotically flat coordinates defined
for $|x| > 1$. For $\theta > 1$, let $\psi_\theta(x) = \psi(|x|\theta^{-1})$; $\psi_\theta$ extends smoothly from the ends
to all of $M$. Now consider the metric $g_\theta(x) = \psi_\theta(x)g(x) + (1 - \psi_\theta(x))g_{\mathbb{E}^n}(x)$.
This metric is identical to the Euclidean metric for $|x| > 2\theta$, and (as $q > \tau$)
$g_\theta$ can be made arbitrarily close to $g$ in $X^2_{-\tau}$, for $\theta$ sufficiently large. Note
that $R(g_\theta) \geq 0$ except possibly for $\theta < |x| < 2\theta$, and we will use a conformal
deformation to impose nonnegative scalar curvature.
From (6.1.8) we have
\[
R\big(u^{\frac{4}{n-2}} g_\theta\big) = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} \Big( \Delta_{g_\theta} u - \frac{n-2}{4(n-1)} R(g_\theta)\, u \Big).
\]
We want to impose $R\big(u^{\frac{4}{n-2}} g_\theta\big) = \psi_\theta\, u^{-\frac{4}{n-2}} R(g)$, i.e.,
\[
\Delta_{g_\theta} u - \frac{n-2}{4(n-1)} \big( R(g_\theta) - \psi_\theta R(g) \big)\, u = 0.
\]
With $u = 1 + v$ this becomes
\[
\Delta_{g_\theta} v - \frac{n-2}{4(n-1)} \big( R(g_\theta) - \psi_\theta R(g) \big)\, v = \frac{n-2}{4(n-1)} \big( R(g_\theta) - \psi_\theta R(g) \big). \tag{7.3.1}
\]
Note that $h_\theta := \frac{n-2}{4(n-1)} \big( R(g_\theta) - \psi_\theta R(g) \big)$ vanishes unless $\theta < |x| < 2\theta$, with
$|h_\theta| \leq C\theta^{-2-q}$. In fact since $q > \tau$, for sufficiently large $\theta$, $h_\theta$ has small norm
in $X^0_{-\tau-2}$. Thus, $\Delta_{g_\theta} - h_\theta$ is a small perturbation of the invertible operator
$\Delta_g : X^2_{-\tau} \to X^0_{-\tau-2}$, and thus it is an isomorphism, with inverse bounded
uniformly in $\theta$, cf. Corollaries 7-68 and 7-69; in particular, the constant in
(7.2.17) or (7.2.19) can be chosen uniformly in $h$ suitably small. Thus for $\theta$
large enough, we can solve (7.3.1), and the solution $v_\theta$ will be smooth and close
to zero in $X^2_{-\tau}$, so that $u_\theta = 1 + v_\theta > 0$, while $\bar g_\theta = u_\theta^{4/(n-2)} g_\theta$ is close in the
weighted norm to $g_\theta$ and hence to $g$, and is harmonically flat near infinity in
each end.
Upon further restricting to $\tau > \frac{n-2}{2}$, the masses will also be close for large $\theta$.
We will sketch the idea of the proof following [202; 197] using the Hölder space
for $v_\theta$, cf. [78; 119] for the Sobolev case. From (7.2.2) and Propositions 7-33
and 7-31, we know that $m(g)$ equals
\[
\frac{1}{2(n-1)\omega_{n-1}} \bigg( \int_{\{|x|=r_1\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\, \nu^j \, d\sigma + \int_{\{r_1 \leq |x|\}} \big( R(g) + Q_g \big) \, dx \bigg),
\]
where we have set $\sum_{i,j=1}^{n} (g_{ij,ij} - g_{ii,jj}) = R(g) + Q_g$.
From the expansion of the scalar curvature (Proposition 7-31; cf. Exercise
7-38) we have $Q_g(x) = O(|x|^{-2q-2})$, so that $R(g) + Q_g$ is integrable on the
end. Given $\varepsilon > 0$, we can bound the second integral by $\frac{\varepsilon}{3}$ by choosing $r_1$
sufficiently large. We replace $g$ by $\bar g_\theta$ in the integrals above to compute $m(\bar g_\theta)$.
By our control on $v_\theta$ in $C^{2,\alpha}_{-\tau}(M)$, and hence on $\bar g_\theta$, we can uniformly estimate
$R(\bar g_\theta) = \psi_\theta\, u_\theta^{-4/(n-2)} R(g)$, and we can see that $|Q_{\bar g_\theta}(x)| \leq C|x|^{-2\tau-2}$, where $C$
is uniform in $\theta$ large. Thus we can bound the corresponding integral for $\bar g_\theta$ by $\frac{\varepsilon}{3}$
by choosing an $r_1$ large enough, uniformly for $\theta \geq \theta_0$ for some sufficiently large
$\theta_0$. By using (7.2.19), we see that for $\theta$ sufficiently large, $h_\theta$ and hence $v_\theta$ and
$\partial v_\theta$ are small enough that
\[
\bigg| \int_{\{|x|=r_1\}} \bigg( \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j}) - \sum_{i,j=1}^{n} \big( (\bar g_\theta)_{ij,i} - (\bar g_\theta)_{ii,j} \big) \bigg) \nu^j \, d\sigma \bigg| < \frac{\varepsilon}{3}. \qquad \Box
\]

Bray [25] observed that the Schoen–Yau approximation could be modified
to produce ends that are precisely Schwarzschild, while preserving nonnegative
scalar curvature. To present his argument, we first recall a standard setup for
mollification. Let $\phi : \mathbb{R} \to \mathbb{R}$ be a smooth, nonnegative function with support
$[-1, 1]$, which is constant near the origin. Let $\varphi(x) = \phi(|x|)$ be the associated
smooth, rotationally symmetric bump function supported in the unit ball, and
by scaling we may assume $\int_{\mathbb{R}^n} \varphi(x) \, dx = 1$. Consider the approximate identity
$\varphi_\sigma(x) = \sigma^{-n}\varphi\big(\frac{x}{\sigma}\big)$ for $\sigma \downarrow 0$; note that $\varphi_\sigma$ has unit integral, and is supported on
$\{|x| \leq \sigma\}$. We can use $\varphi_\sigma$ to mollify functions by convolution: $(\varphi_\sigma * w)(x) =
\int_{\mathbb{R}^n} \varphi_\sigma(y)\, w(x - y) \, dy$.

We now state and prove Bray’s approximation. The closeness of the metrics g
and g̃ can be measured in a weighted norm as above, or, as Bray states it, as an
$\varepsilon$-quasi-isometry, meaning that for all nonzero $v \in TM$, the ratio $g(v, v)/\tilde g(v, v)$
lies in $(e^{-\varepsilon}, e^{\varepsilon})$.

Proposition 7-80 (Bray). Suppose $(M, g)$ is asymptotically flat with nonnegative
$R(g) \in L^1(M)$. For any $\varepsilon > 0$, there is a metric $\tilde g$ with $R(\tilde g) \geq 0$, which
is $\varepsilon$-close to $g$, which is isometric to a Riemannian Schwarzschild metric near
infinity in each end of $M$, and for which $|m(g) - m(\tilde g)| < \varepsilon$.

Proof. By applying Proposition 7-79, we may modify the metric so that each end
is harmonically flat. For any chosen end $E$, we may choose asymptotic coordinates
along with an $r_0 > 0$, so that for $|x| > r_0$, the metric has the form $g_{ij}(x) =
u(x)^{\frac{4}{n-2}} \delta_{ij}$, with $\Delta u = 0$, and $u(x) = 1 + \frac{1}{2} m(g)/|x|^{n-2} + O_\infty(|x|^{-(n-1)})$. We
will handle each end individually, so we do not introduce notation to distinguish
the mass on different ends. Now for any $R > r_0$ and $\delta > 0$, consider the harmonic
function $\tilde u(x) = C_1 + C_2/|x|^{n-2}$, with $C_1$ and $C_2$ chosen so that
\[
C_1 + \frac{C_2}{R^{n-2}} = \max_{\{|x|=R\}} u(x) + \delta \quad \text{and} \quad C_1 + \frac{C_2}{(2R)^{n-2}} = \min_{\{|x|=2R\}} u(x) - \delta.
\]
By the choice of $C_1$ and $C_2$, the function $w$ defined by
\[
w(x) = \begin{cases} u(x) & \text{if } |x| < R, \\ \min(u(x), \tilde u(x)) & \text{if } R \leq |x| \leq 2R, \\ \tilde u(x) & \text{if } |x| > 2R \end{cases}
\]
is continuous. The expansion of $u$ also implies that, for any $\eta > 0$, if we
take $R$ sufficiently large and $\delta R^{n-2}$ sufficiently small, then $|C_1 - 1| < \eta$ and
$\big| C_2 - \frac{1}{2} m(g) \big| < \eta$. Since $u$ and $\tilde u$ are harmonic, and the minimum of harmonic
functions is (weakly) superharmonic, we have that $w$ is superharmonic. If we
convolve $w$ with a spherically symmetric mollifier $\varphi_\sigma$ supported in $\{y : |y| \leq \sigma\}$
($0 < \sigma < R - r_0$), we produce a smooth, superharmonic function $\tilde w = \varphi_\sigma * w$
which satisfies, by the mean value property, $\tilde w(x) = u(x)$ for $|x| < R - \sigma$, and
$\tilde w(x) = \tilde u(x)$ for $|x| > 2R + \sigma$. Thus if we let $\tilde g_{ij} = \tilde w^{\frac{4}{n-2}} \delta_{ij}$ on $E$, then $\tilde g$ agrees
with $g$ for $r_0 < |x| < R - \sigma$, and $\tilde g$ is precisely Schwarzschild on $|x| > 2R + \sigma$,
with mass given by (see Exercise 7-36) $m(\tilde g) = 2C_1C_2 \approx m(g)$. □
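The value $m(\tilde g) = 2C_1C_2$ can be seen by rescaling the coordinates. The following is a sketch, assuming $C_1 > 0$ and comparing with the Schwarzschild form $\big(1 + \frac{m}{2|y|^{n-2}}\big)^{\frac{4}{n-2}}\delta_{ij}$ used in this chapter; the rescaled coordinate $y$ is notation introduced only for this illustration.

```latex
\begin{align*}
y &= C_1^{\frac{2}{n-2}}\, x, \qquad
\sum_i (dx^i)^2 = C_1^{-\frac{4}{n-2}} \sum_i (dy^i)^2, \qquad
|x|^{n-2} = C_1^{-2}\, |y|^{n-2}, \\
\tilde u &= C_1 + \frac{C_2}{|x|^{n-2}} = C_1 \Big( 1 + \frac{C_1 C_2}{|y|^{n-2}} \Big), \\
\tilde u^{\frac{4}{n-2}} \sum_i (dx^i)^2
 &= \Big( 1 + \frac{C_1 C_2}{|y|^{n-2}} \Big)^{\frac{4}{n-2}} \sum_i (dy^i)^2
 \quad\Longrightarrow\quad \frac{m}{2} = C_1 C_2 .
\end{align*}
```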
We emphasize that applied to an end which is already harmonically flat, this
construction is local to the end.
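To make the choice of $C_1$ and $C_2$ concrete, here is a minimal numerical sketch that solves the two linear conditions from the proof and evaluates the resulting mass $2C_1C_2$. The function names `fit_schwarzschild_tail` and `schwarzschild_mass` and the sample data are hypothetical, chosen only for illustration; the inputs `u_max_R` and `u_min_2R` stand for $\max_{\{|x|=R\}} u$ and $\min_{\{|x|=2R\}} u$.

```python
def fit_schwarzschild_tail(u_max_R, u_min_2R, R, delta, n=3):
    """Solve for (C1, C2) in u_tilde(x) = C1 + C2 / |x|^(n-2).

    The two conditions are the ones stated in the proof:
        C1 + C2 / R^(n-2)    = max_{|x|=R}  u + delta
        C1 + C2 / (2R)^(n-2) = min_{|x|=2R} u - delta
    """
    a = u_max_R + delta
    b = u_min_2R - delta
    inv_R = R ** (-(n - 2))
    inv_2R = (2.0 * R) ** (-(n - 2))
    c2 = (a - b) / (inv_R - inv_2R)  # subtract the two equations
    c1 = a - c2 * inv_R              # back-substitute into the first
    return c1, c2


def schwarzschild_mass(c1, c2):
    """Mass of the metric (C1 + C2/|x|^(n-2))^(4/(n-2)) delta_ij, namely 2*C1*C2."""
    return 2.0 * c1 * c2


# Illustrative data (hypothetical): u is exactly 1 + m/(2|x|) in dimension 3.
m, R, delta = 0.5, 100.0, 1e-6
c1, c2 = fit_schwarzschild_tail(1 + m / (2 * R), 1 + m / (4 * R), R, delta)
print(c1, c2, schwarzschild_mass(c1, c2))
```

With $\delta = 0$ and $u$ exactly Schwarzschild, the fit returns $C_1 = 1$ and $C_2 = m/2$; for $\delta > 0$ the deviation of $C_2$ is proportional to $\delta R^{n-2}$, which is why the proof asks for that product to be small.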

7.4. On the positive mass theorem

As we saw earlier, to each end in an asymptotically flat initial data set with
integrable mass and momentum densities (ρ, J ), we can associate an energy
E and linear momentum vector P, which together form the ADM energy-
momentum vector for the asymptotic end. The constraint equations have the
form $\Phi(g, \pi) = (2\kappa\rho, \kappa J)$ (we take $\Lambda = 0$). Without a model for, or condition
on, $(\rho, J)$, any $(g, \pi)$ satisfies this system by definition, and indeed by simple
patching one can construct asymptotically flat initial data sets with negative
energy. We want a condition that will imply that the ADM energy-momentum
vector is future-pointing causal, i.e., $E \geq |P|$ (with $c = 1$), in which case we
define the ADM mass as $m = \sqrt{E^2 - |P|^2}$. The positive mass theorem asserts
that indeed $E \geq |P|$ for an asymptotically flat initial data set $(M, g, \pi)$ with
the dominant energy condition in the form $\rho \geq |J|_g$ (cf. Section 5.2), while the
positive energy theorem asserts that $E \geq 0$.
Sometimes these statements are called the spacetime positive mass theorem and
spacetime positive energy theorem, in contrast to the situation where the object of
interest is an asymptotically flat manifold (M, g). Indeed, in the time-symmetric
case ($K = 0$, i.e., $\pi = 0$), we have $P = 0$, and so we often write $E = m$, which
we note can be defined for any asymptotically flat $(M, g)$ with $R(g) \in L^1(M)$.
The dominant energy condition in the time-symmetric case, or more generally
in the maximal ($\operatorname{tr}_g K = 0$) case, reduces to $R(g) \geq 0$. The Riemannian positive
mass theorem for an asymptotically flat $(M, g)$ with $R(g) \geq 0$ is then $m \geq 0$.
In each situation above, the theorem is really comprised of an inequality,
together with a rigidity statement which characterizes the ground state. In the
positive energy theorem, if we find $E = 0$, we would like to conclude our initial
data is a slice in Minkowski spacetime, and in the Riemannian positive mass
theorem we would like to conclude that if $m = 0$, then $g$ is Euclidean. Finally in
the general case if we find $E = |P|$, we would again like to conclude the initial
data is a slice in Minkowski spacetime.
The inequalities and rigidity statements have a long and storied history which
we will not attempt to detail here, but we give some references for further
exploration. The Riemannian positive mass theorem in dimensions less than
eight was established in the seminal works of Schoen and Yau [199; 201; 202],
cf. [197]. The extension to all dimensions has recently appeared in [205], where
one can also find references to an approach in a series of papers by Lohkamp.
The positive energy theorem in dimension three was established in [203], the
methods of which have been extended to dimensions less than eight by Eichmair
[77]. The inequality in the positive mass theorem for dimensions less than eight
was proven in [78], to which we refer the reader for further background and
discussion, while the corresponding rigidity statement recently appeared in [120].
We also want to point out that in the case M is spin, there is an approach to
the positive mass theorem discovered by Witten [224], cf. [15; 175].

On the proof of the Riemannian PMT. We now present a useful observation
due to Lohkamp. We give a somewhat simpler proof than in [146], using Bray's
Proposition 7-80.
Proposition 7-81 (Lohkamp). Suppose the end (E , g) is harmonically flat with
negative mass. Then there is a metric on E which has nonnegative scalar curva-
ture (and which is not identically zero), which agrees with g near ∂ E and which
is flat outside a compact set.
Proof. By applying the construction in the proof of Proposition 7-80, we may
assume that in appropriate asymptotically flat coordinates for $E$, we have
\[
g_{ij}(x) = \Big( 1 + \frac{m}{2|x|^{n-2}} \Big)^{\frac{4}{n-2}} \delta_{ij},
\]
with $m < 0$. For $R > -\frac{m}{2}$ large enough that $E$ contains a neighborhood of
$\{|x| = R\}$ in coordinates, we consider the positive continuous function $U$ on $E$
given by
\[
U(x) = \min\Big( 1 + \frac{m}{2|x|^{n-2}},\ 1 + \frac{m}{2R^{n-2}} \Big) =
\begin{cases} 1 + \dfrac{m}{2|x|^{n-2}} & \text{if } |x| \leq R, \\[2mm] 1 + \dfrac{m}{2R^{n-2}} & \text{if } |x| \geq R; \end{cases}
\]
note where $m < 0$ has been used. Since $U$ is the minimum of two (Euclidean)
harmonic functions, it is superharmonic. We use a spherically symmetric mollifier
$\varphi_\varepsilon$ as above to mollify $U$: let $\widetilde U = \varphi_\varepsilon * U$. $\widetilde U$ is well-defined on $E$ for $\varepsilon > 0$
small enough, and it is smooth and positive. Indeed for $|x| < R - \varepsilon$, $\widetilde U(x) = U(x)$,
by the mean value property of harmonic functions, and for $|x| > R + \varepsilon$, $\widetilde U(x) =
1 + \frac{1}{2} m/R^{n-2}$ is a positive constant. The mollification preserves superharmonicity,
and thus, as it is clear that $\widetilde U$ is not harmonic on all of $E$, there is a region
where $\Delta\widetilde U < 0$. Using the chosen coordinates, the metric $\tilde g_{ij} = \widetilde U^{\frac{4}{n-2}} \delta_{ij}$ has
nonnegative scalar curvature which is not identically zero, and it is a flat metric
on $\{|x| > R + \varepsilon\}$. □
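The superharmonicity of $U$ can be illustrated numerically through the mean value inequality: a continuous function is superharmonic when its average over small spheres is at most its value at the center. The sketch below does this in dimension three only, for hypothetical sample parameters $m = -1$, $R = 2$; the function names `lohkamp_U` and `spherical_mean_3d` are introduced here for illustration and are not from the text.

```python
import math


def lohkamp_U(r, m, R, n=3):
    """U as a function of r = |x|: the minimum of two Euclidean-harmonic functions."""
    return min(1.0 + m / (2.0 * r ** (n - 2)), 1.0 + m / (2.0 * R ** (n - 2)))


def spherical_mean_3d(f, rho, eps, steps=20000):
    """Mean of the radial function y -> f(|y|) over the sphere of radius eps
    centered at a point at distance rho from the origin (dimension 3 only).

    For the uniform measure on a 2-sphere, t = cos(angle) is uniformly
    distributed on [-1, 1], and |y|^2 = rho^2 + eps^2 + 2*rho*eps*t.
    The integral over t is approximated by the midpoint rule.
    """
    total = 0.0
    for k in range(steps):
        t = -1.0 + (2.0 * k + 1.0) / steps
        total += f(math.sqrt(rho * rho + eps * eps + 2.0 * rho * eps * t))
    return total / steps


def U(r):
    # Illustrative parameters (hypothetical): m = -1, R = 2, so that R > -m/2.
    return lohkamp_U(r, m=-1.0, R=2.0)


for rho in (1.0, 2.0, 3.0):
    print(rho, U(rho), spherical_mean_3d(U, rho, eps=0.5))
```

Away from $\{|x| = R\}$ the spherical mean agrees with the central value (the function is harmonic there), while for spheres centered on $\{|x| = R\}$ the mean is strictly smaller. This is the region where the mollified function has negative Laplacian.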
A basic version of the positive mass theorem follows as a corollary of this
result, using the Schoen–Yau topological obstruction to positive scalar curvature.
Theorem 7-82 (Riemannian positive mass theorem). Suppose (M, g) is an
asymptotically flat three-manifold with nonnegative scalar curvature R(g). Then
the ADM mass of any end is nonnegative. If the mass of any end is zero, then
$(M, g)$ is isometric to $(\mathbb{R}^3, g_{\mathbb{E}^3})$.
Proof. From Proposition 7-79, we may assume that g is harmonically flat at the
ends, and while nonnegative scalar curvature would suffice for the argument, we
can also arrange R(g) = 0. (From the proof of Proposition 7-33, if R(g) ≥ 0
is not integrable on an end, the mass of that end is +∞.) In fact, we will now
further reduce to the case where there is only one asymptotically flat end.
If $M$ has more than one end, choose one, say $E$, and let $u$ be a harmonic
function, $\Delta_g u = 0$, with $u(x) \to 1$ as $|x| \to \infty$ in $E$, and $u(x) \to 0$ as $|x| \to \infty$
in the other ends. To find such a $u$, we fix $w \in C^\infty(M)$ with $w = 1$ near infinity
in the end $E$, and $w = 0$ near infinity in the other ends. Then $\Delta_g w \in C_c^\infty(M)$,
and so we can solve $\Delta_g v = -\Delta_g w$, for $v$ in a weighted space (so that $v$ decays
to zero in each end). Let $u = v + w$. By the maximum principle, $0 < u < 1$ on $M$.
Since $R(g) = 0$, the metric $u^4 g$ has vanishing scalar curvature, and near
infinity in any end, it can be written $u^4 g = U^4 g_{\mathbb{E}^3}$ for $U > 0$, where $\Delta U = 0$.
In the ends other than $E$, $U$ tends to zero at infinity, and so if we write $U$ in
spherical harmonics in these ends, we have $U(x) = c/|x| + O_\infty(1/|x|^2)$. The
higher spherical harmonics are not everywhere-positive, so that $c > 0$. A simple
calculation using the Kelvin transform $x \mapsto x/|x|^2$ shows that $(M, u^4 g)$ can be
completed to a smooth asymptotically flat manifold $(M, \bar g)$, with each end apart
from $E$ compactified with an additional point. Since $u < 1$, the mass $m(\bar g)$ is no
more than that of $(E, g)$, as in the proof of Proposition 7-76.
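The Kelvin transform calculation mentioned above can be sketched as follows in dimension three. Here $y = x/|x|^2$ is the inverted coordinate and $U^\sharp$ denotes the Kelvin transform of $U$; both symbols are notation introduced only for this sketch.

```latex
\begin{align*}
y &= \frac{x}{|x|^2}, \qquad |x| = \frac{1}{|y|}, \qquad
\sum_i (dx^i)^2 = |y|^{-4} \sum_i (dy^i)^2, \\
U^\sharp(y) &:= |y|^{-1}\, U\Big( \frac{y}{|y|^2} \Big) = c + O(|y|), \qquad
\Delta_y U^\sharp = 0 \ \text{ for } 0 < |y| \ll 1, \\
U(x)^4 \sum_i (dx^i)^2 &= \big( |y|\, U^\sharp(y) \big)^4\, |y|^{-4} \sum_i (dy^i)^2
 = U^\sharp(y)^4 \sum_i (dy^i)^2 .
\end{align*}
```

Since $U^\sharp$ is harmonic and bounded on a punctured ball, it extends smoothly across $y = 0$ (removable singularity), with $U^\sharp(0) = c > 0$. The metric therefore extends smoothly and stays positive definite at the added point.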
If the ADM mass of $(E, g)$ were negative, then the ADM mass of $(M, \bar g)$
would also be negative. Now we apply the preceding proposition to assert
the existence of a metric on $M$ with nonnegative scalar curvature which is
flat outside a compact set. We can thus consider a region $W \subset M$ such that
$\partial W = \{x : |x^i| = r_0,\ i = 1, 2, 3\}$ is a large cube in the region where the metric
is flat. The metric thus descends to a metric with nonnegative (not identically
zero) scalar curvature on the quotient manifold $\check M$ obtained by identifying the
opposite coordinate faces in pairs. $\check M$ can be expressed as a connected sum of
a closed three-manifold with the torus $\mathbb{T}^3$, and in particular there is a copy of
$\mathbb{Z} \oplus \mathbb{Z}$ in $\pi_1(\check M)$. We clearly have a contradiction to the Schoen–Yau obstruction
to positive scalar curvature (Theorem 6-56), in the case when $M$, and hence $\check M$,
is orientable. In the non-orientable case, one can either apply Exercise 7-93 and
replace $(M, g)$ by an asymptotically flat orientable double cover, and argue as
above from here, or one can argue that an orientable double cover $\widehat M$ of $\check M$ has
a copy of $\mathbb{Z} \oplus \mathbb{Z}$ inside $\pi_1(\widehat M)$, as $\pi_1(\widehat M)$ injects under the covering map as an
index two subgroup in $\pi_1(\check M)$.
We address the rigidity case below. □
The minimal hypersurface proof of the positive mass theorem given by Schoen
and Yau in [199; 197] does not use Theorem 6-56 directly, but the proofs
share some of the same ideas (and note that more recent work of Schoen and
Yau [205] does employ a compactification argument in higher dimensions).
A simple calculation in asymptotically flat coordinates for a metric $g_{ij}(x) =
(1 + 2m/|x|)\delta_{ij} + O_1(|x|^{-2})$ on a three-manifold yields the Christoffel symbol
$\Gamma^3_{ij} = (m x^3/|x|^3)\delta_{ij} + O(|x|^{-3})$, for $i, j = 1, 2$. By tracing over $i, j = 1, 2$ (and
estimating the difference between $\partial/\partial x^3$ and the unit normal), we can conclude
that if the mass were negative, the coordinate hyperplanes $x^3 = \pm\xi$ for large
enough $\xi > 0$ would have mean curvature vector $H$ with $g(H, \pm\partial/\partial x^3) < 0$; cf.
Exercise X-5 where the mean curvature calculation is done for the Schwarzschild
metric, of which $g$ is a perturbation near infinity. These hyperplanes can thus be
used as barriers for finding a stable minimal hypersurface asymptotic to a plane,
by solving a Plateau problem on large cylinders with axis along the $x^3$-direction.
The stability inequality and Gauss–Bonnet yield a contradiction, in a similar
manner as in the proof of Proposition 6-57; cf. [197]. We remark that in [199]
Schoen and Yau consider asymptotically flat manifolds-with-boundary. The
mean curvature vector along all boundary components should point inward, so
that each boundary component also serves as a barrier. Indeed, the negative mass
Schwarzschild metric is incomplete, and if you excise a ball centered around
the singularity, you obtain an asymptotically flat manifold-with-boundary with
vanishing scalar curvature, but the mean curvature vector along the boundary
sphere points outward.
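The Christoffel symbol quoted in this discussion follows from the standard coordinate formula. The lines below sketch the computation for $i, j \in \{1, 2\}$, keeping only the leading term; the error terms use $g^{3\ell} = \delta^{3\ell} + O(|x|^{-1})$ and the fact that derivatives of the $O_1(|x|^{-2})$ remainder are $O(|x|^{-3})$.

```latex
\begin{align*}
\Gamma^{3}_{ij}
 &= \tfrac{1}{2}\, g^{3\ell} \big( \partial_i g_{j\ell} + \partial_j g_{i\ell} - \partial_\ell g_{ij} \big),
 \qquad i, j \in \{1, 2\}, \\
 &= -\tfrac{1}{2}\, \partial_3 \Big( 1 + \frac{2m}{|x|} \Big)\, \delta_{ij} + O(|x|^{-3})
  = \frac{m\, x^3}{|x|^3}\, \delta_{ij} + O(|x|^{-3}), \\
\sum_{i=1}^{2} \Gamma^{3}_{ii}
 &= \frac{2m\, x^3}{|x|^3} + O(|x|^{-3}) .
\end{align*}
```

On the plane $x^3 = \xi$ the leading term of the trace has the sign of $m\xi$, which is the source of the sign condition on the mean curvature vector when $m < 0$.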
On the proof of rigidity. We now address the rigidity statement: if the mass
of any end vanishes, then $(M^3, g)$ is isometric to $(\mathbb{R}^3, g_{\mathbb{E}^3})$; in particular, the
vanishing of the mass rules out nontrivial topology in the manifold. We compare
to a simple analogue in dimension $n \geq 2$ as remarked following Proposition 7-73,
by compactification to $\mathbb{T}^n$: any metric $g$ on $\mathbb{R}^n$ with nonnegative scalar curvature
and which outside a compact set is isometric to the Euclidean metric is actually
isometric to $g_{\mathbb{E}^n}$. (Thus, for $n \geq 3$, the argument of the preceding section implies
$m \geq 0$ for $(\mathbb{R}^n, g)$ asymptotically flat with nonnegative scalar curvature.)
We indicate a rigidity proof in the next exercise, given the Riemannian positive
mass theorem in higher dimensions [197; 205]. The proof below uses volume
comparison; for a proof using harmonic coordinates, see [196, Proposition 2],
and for further discussion, see [142].
Exercise 7-83. Suppose $(M^n, g)$, $n \geq 3$, is asymptotically flat with nonnegative
scalar curvature and vanishing ADM mass. By Remark 7-77 (see Exercise 7-95),
$g$ has vanishing scalar curvature. Suppose $M$ has a single asymptotic end.
Suppose $h$ is a smooth, symmetric (0, 2)-tensor with compact support in
$M$. Let $\gamma_\epsilon = g + \epsilon h$, which is a metric for $|\epsilon|$ small. By Corollary 7-69
and Proposition 7-72, there is $u_\epsilon = 1 + v_\epsilon > 0$ so that $g_\epsilon := u_\epsilon^{4/(n-2)} \gamma_\epsilon$ has
$R(g_\epsilon) = 0$, with $v_\epsilon \in C^{2,\alpha}_{-\tau}(M)$, so that $g_\epsilon$ is asymptotically flat, and $u_\epsilon(x) =
1 + \frac{1}{2} m(\epsilon)/|x|^{n-2} + O_2(|x|^{-(n-2)-\gamma})$, for some $\gamma > 0$. Lemma 7-37 and the
hypothesis on $g$ imply that $m(g_\epsilon) = m(\epsilon)$, and as in Remark 7-77, we have
\[
m(\epsilon) = -\frac{1}{2(n-1)\omega_{n-1}} \int_M R(\gamma_\epsilon)\, u_\epsilon \, dv_{\gamma_\epsilon}.
\]
a. Argue that the map $\epsilon \mapsto v_\epsilon \in C^{2,\alpha}_{-\tau}(M)$ is differentiable in $\epsilon$, then obtain an
identity from computing $m'(0)$. (Differentiability at $\epsilon = 0$ suffices for this.)
b. Let $\zeta$ be any compactly supported smooth bump function. Let $h = \zeta \operatorname{Ric}(g)$
in the above. Conclude that $\operatorname{Ric}(g) = 0$.
c. Use the Bishop–Gromov volume comparison theorem to conclude that $(M^n, g)$
is isometric to Euclidean space, cf. [182, Chapter 9, Exercise 5].
We remark that one way to handle multiple ends is to follow an argument
in [199]. Namely, given $M$ as above with an end $E$ with vanishing mass,
let $N \subset M$ be an asymptotically flat manifold-with-boundary containing the one
asymptotic end $E$, and with boundary given by a union of large coordinate spheres
from the remaining ends, chosen so large that the mean curvature vector of these
spheres points into $N$. Now run the argument indicated in the exercise, solving
for the conformal factor $u_\epsilon$ which solves the same PDE but has the Neumann
boundary condition; see Remark 7-77. The conformal metric $g_\epsilon = u_\epsilon^{4/(n-2)} \gamma_\epsilon$ will
be asymptotically flat with vanishing scalar curvature, and with mean curvature
vector still pointing into $N$ along $\partial N$, by Exercise X-5. Thus by Schoen–Yau
[199], the mass of $g_\epsilon$ must be nonnegative. From here you can conclude $N$ is
Ricci-flat, and then that $M$ is Ricci-flat, and conclude as above.
We see that the decay rate to the Euclidean metric of an asymptotically flat
metric with nonnegative scalar curvature is constrained by the mass: any such
metric which admits asymptotically flat coordinates with rate q > n − 2 must
in fact be flat. We point out that this holds for complete asymptotically flat
manifolds. One can easily write down a nontrivial harmonically flat end with
zero mass, but just as with analogous negative mass examples, that end cannot be
completed to an asymptotically flat manifold with nonnegative scalar curvature.

7.5. Localized scalar curvature deformation and asymptotics

In the final section of this chapter we will address the following question [204,
p. 371]: are there nontrivial asymptotically flat solutions of the vacuum Einstein
constraint equations on Rn which reduce to a Riemannian Schwarzschild metric
outside a compact set? Here we put in the condition nontrivial to rule out the
simple case of the Euclidean metric, with vanishing ADM mass. We saw in
Proposition 7-80 that if the vacuum condition is removed and only the dominant
energy condition in the form R(g) ≥ 0 is enforced, then the answer to the
question is yes, there are lots of such solutions; this sufficed for the proof of
the Riemannian positive mass theorem. In the purely Riemannian case of the
vacuum constraints (time-symmetric, K = 0), the Hamiltonian constraint is
simply R(g) = 0. If we consider conformal methods to construct solutions to
the constraints, unique continuation for harmonic functions suggests that maybe
the answer is no: maybe the condition of vanishing scalar curvature and being
identical to Schwarzschild near infinity is a rigid condition, cf. Example 7-19.
The strategy we will outline here is to investigate the question using a localized
version of the Fischer–Marsden scalar curvature deformation. The idea is to
patch together an asymptotically flat metric with zero scalar curvature to a
Schwarzschild end, using a smooth cutoff function. In an annular transition
region the scalar curvature may fail to vanish, and we seek to reimpose the
scalar curvature constraint by considering localized deformations whose support,
unlike in the case of conformal deformations in general, does not extend to the
Schwarzschild end.
Thus as a first step, we have to address to what extent the metric deformation
in the Fischer–Marsden theorem, which has been discussed earlier both in the
closed case (Theorem 6-52) and near the Euclidean metric in the asymptotically
flat case (Proposition 7-73), can be localized. We state a sufficient version of a
localized scalar curvature deformation (cf. [66; 70]). We introduce a term before
doing so: if M is a smooth manifold, an open subset $\Omega \subset M$ is a precompact
smooth domain in M provided the closure $\overline{\Omega} \subset M$ is a compact
manifold-with-boundary embedded in M via inclusion, with manifold interior $\Omega$ and smooth
boundary $\partial\Omega$, which is then a smooth embedded submanifold (hypersurface) in
M. Let $L_g = DR_g$, the linearization of the scalar curvature map at g.
Theorem 7-84. Let $0 < \alpha < 1$. Suppose (M, g) is a smooth Riemannian manifold,
and $\Omega \subset M$ is a precompact smooth domain in M, so that $L_g^*$ has trivial kernel
on $\Omega$. For any open $\Omega_0$ compactly contained in $\Omega$, there are $\varepsilon_0 > 0$ and $C_0 > 0$
such that for any $S \in C_c^\infty(\Omega_0)$ with $\|S\|_{C^{0,\alpha}} < \varepsilon_0$, there is a smooth symmetric
(0, 2)-tensor h on M, with support contained in $\overline{\Omega}$, with $\|h\|_{C^{2,\alpha}} \le C_0 \|S\|_{C^{0,\alpha}}$
and $R(g + h) = R(g) + S$.
A number of remarks are in order. First, this is a localized scalar curvature
deformation: it is a perturbation result under a nondegeneracy condition, with
a localization of the support of the deformation. We note the nondegeneracy
condition states precisely that $(\Omega, g)$ does not support any nontrivial static
potentials (cf. Section 2.4.5). Just as with Theorem 6-52, the need for some
nondegeneracy condition is illustrated at a flat metric, which is static: there is
no non-flat metric on Rn with nonnegative scalar curvature which is Euclidean
outside a compact set, hence there certainly cannot be a localized deformation
of the type in the above theorem about this metric. For other rigidity results for
static metrics, see [34; 185]. For the purpose of comparison and inspiration, we
mention that Lohkamp showed [146] that there is no obstruction to deforming
the scalar curvature downward in a compact subdomain. As we noted above,
conformal deformations appear ill-suited for localized deformation results, and
in fact Yuan [230] shows certain localized deformations must leave the conformal
class. We note that the round sphere is static (Exercise 2-43), but the spherical
case is different from the case of Euclidean space or the flat torus. Indeed,
Brendle, Marques and Neves [35] constructed a metric on the sphere $S^n$ with
scalar curvature R(g) ≥ n(n−1), with strict inequality on a nonempty subset,
and for which g agrees with the round unit sphere metric in a neighborhood of a
closed hemisphere, thereby giving a counterexample to the Min-Oo conjecture.
We have seen that $L_g^*$ is overdetermined-elliptic, and there are results that
indicate that the presence of nontrivial static potentials (kernel elements of $L_g^*$)
is non-generic: for example, by taking the divergence of $L_g^* f = 0$ and using
the Ricci formula, it follows that a connected open set with a nontrivial static
potential must have constant scalar curvature (Corollary 2-45, Exercise 2-52a.;
see also [19]). Whereas, say, the Laplacian would have an infinite-dimensional
kernel parametrized by boundary values, if $\Omega$ is an n-dimensional connected
domain, the kernel of $L_g^*$ has dimension at most n+1, since the equation $L_g^* f = 0$
induces an ODE along geodesics emanating from a point; from this we also see
that on a connected manifold, no nontrivial static potential can vanish on an open
set (Proposition 2-44). We can even consider weak solutions to $L_g^* f = 0$, but
by elliptic regularity (assuming g is smooth), we will obtain that f is smooth,
and in fact, it is a simple exercise to use the induced ODE to show that a static
potential extends smoothly to the boundary [66; 70].
In the remaining sections, we will give an idea of the proof of Theorem 7-84
and then answer the question posed at the start of the section in the form of
Theorem 7-89. We will focus on the linear theory, while leaving a number of
technical details to the references. Also, the discussion will be somewhat breezy,
as the goal is to map out the big ideas of the proofs.

7.5.1. On the proof of Theorem 7-84. We now take a precompact smooth
domain $\Omega$ in a Riemannian manifold (M, g),⁶ and assume that $L_g^*$ has trivial
kernel on $\Omega$. For functions $S \in C_c^\infty(\Omega)$ sufficiently small, we will construct a
smooth metric g + h on M, with h extending smoothly by zero across $\partial\Omega$, so that
R(g + h) = R(g) + S. We briefly remark that this paraphrasing of Theorem 7-84
is correct in spirit, but is not precise enough in regards to the smallness condition
on S. As we will see in the outline of the proof, certain weighted spaces are
used to control the support of the deformation h. One can state a version of
Theorem 7-84 in such spaces, cf. [70]. Because of the nature of the weights,
when formulating the theorem in simple terms as stated above, the $\varepsilon_0$ depends
on $\Omega_0$ (and in particular on the distance $d(\Omega_0, \partial\Omega)$). In applications one might
indeed be working on a fixed subdomain $\Omega_0$, in which case this paraphrasing
is reasonably faithful. Since $L_g^*$ has trivial kernel on each component of $\Omega$, we
will assume that $\Omega$ is connected with nonempty boundary (which may or may
not be connected).
The approach we take follows the spirit of the proof of Theorem 6-52. The
trivial kernel condition should translate into a surjectivity property for L g , and
then a local surjectivity result for the nonlinear operator. The way we proved
the linearized surjectivity above was through a Hodge splitting, which we chose
to prove variationally; that variational approach can be adapted suitably for
Theorem 7-84. At this point, in contrast with the case of a closed manifold,
we do not have a functional framework to simply invoke the inverse function
theorem. That said, the idea of the proof of the implicit/inverse function theorem
is to solve the nonlinear problem by iterating linear corrections (Newton–Picard
iteration). For the proof of Theorem 7-84, we run the iteration by hand and show
it converges to a solution.
⁶ Actually, the proof does not require M extending $\overline{\Omega}$, so we could formulate the problem on
the manifold-with-boundary $(\overline{\Omega}, g)$.
To do this, we employ ideas from our analysis in the previous chapter. First,
we get the elliptic estimate used to make the variational method for establishing
linear surjectivity succeed. We then iterate the process of linear corrections,
using interior estimates to control the sequence of approximate solutions, and
establish convergence. At this point, this should sound reasonable and somewhat
unremarkable. That said, we have not yet indicated how we will impose the
decay of the deformation tensor h for which we solve. We will define weighted
spaces to accomplish this, and broach some of the technical hurdles involved
and how these are addressed in the analysis.
The linearized problem. We used elliptic estimates to serve as the foundation for
much of the elliptic PDE theory that we have employed. The elliptic estimates
in Section 6.1 were on closed manifolds, or the related interior estimates. If
you study, say, the Laplace operator on a bounded domain, the relevant elliptic
estimate will have to include a term for the boundary data. In our present setting,
we do not want to solve a boundary value problem per se, because we want
our deformation tensor to vanish on $\partial\Omega$, along with derivatives (a finite number,
or all derivatives, depending on how regular we want h to be across $\partial\Omega$). The
operator $L_g^*$ is overdetermined-elliptic, and the structure of the operator allows
us to get an absolute elliptic estimate on any domain $\Omega$, without boundary terms.
This is easy for this operator, because $\mathrm{tr}_g(L_g^* f) = -(n-1)\Delta_g f - f R(g)$, so
that
\[ \mathrm{Hess}_g f = L_g^* f - \frac{1}{n-1}\big(\mathrm{tr}_g L_g^* f + f R(g)\big)\, g + f\, \mathrm{Ric}(g). \]
So on any domain $\Omega$ where the Ricci curvature is bounded, say on a compactly contained domain,
we immediately have a constant C such that for all f,
\[ \|f\|_{H^2(\Omega)} \le C\big(\|L_g^* f\|_{L^2(\Omega)} + \|f\|_{H^1(\Omega)}\big). \tag{7.5.1} \]
(For a different proof, see [66, Appendix].)
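The algebra behind the Hessian identity can be made explicit. The sketch below assumes the standard formula $L_g^* f = -(\Delta_g f)\, g + \mathrm{Hess}_g f - f\, \mathrm{Ric}(g)$ for the formal adjoint, which is consistent with the trace identity quoted in the text.

```latex
% Assumption: L_g^* f = -(\Delta_g f) g + Hess_g f - f Ric(g), with tr_g g = n.
\begin{align*}
\mathrm{tr}_g(L_g^* f)
  &= -n\,\Delta_g f + \Delta_g f - f R(g) = -(n-1)\Delta_g f - f R(g), \\
\Delta_g f
  &= -\frac{1}{n-1}\big( \mathrm{tr}_g(L_g^* f) + f R(g) \big), \\
\mathrm{Hess}_g f
  &= L_g^* f + (\Delta_g f)\, g + f\, \mathrm{Ric}(g)
   = L_g^* f - \frac{1}{n-1}\big( \mathrm{tr}_g(L_g^* f) + f R(g) \big)\, g + f\, \mathrm{Ric}(g).
\end{align*}
% With |Ric(g)| \le \Lambda on \Omega (hence |R(g)| \le \sqrt{n}\,\Lambda), one gets pointwise
\[
|\mathrm{Hess}_g f| \le C(n, \Lambda)\big( |L_g^* f| + |f| \big),
\]
% and integrating the square over \Omega bounds the second-order part of the H^2 norm.
```

The point of the computation is that the full Hessian, not just the Laplacian, is controlled algebraically by $L_g^* f$ and f, which is why no boundary term is needed.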


Suppose $\Omega$ is a precompact smooth domain, so we can apply Rellich compactness
to the inclusion $H^2(\Omega) \hookrightarrow H^1(\Omega)$. If $L_g^*$ has trivial kernel on $\Omega$, we
can then promote the estimate to

\[ \|f\|_{H^2(\Omega)} \le C \|L_g^* f\|_{L^2(\Omega)}, \tag{7.5.2} \]

with possibly a different constant; cf. Proposition 6-15.


We introduce a weight function to help achieve the desired support of the
deformation tensor h. Given $0 < \tau_0 < \tau_1$, we let $\tilde\rho : (0, +\infty) \to \mathbb{R}$ be a smooth
function that satisfies $\tilde\rho(t) = 1$ for $t \ge \tau_1$, and for $0 < t < \tau_0$, either $\tilde\rho(t) = t^N$
for some N > 0, or $\tilde\rho(t) = e^{-1/t}$, and finally for all t > 0, $\tilde\rho'(t) \ge 0$. Let
$\mathring{g}$ be a smooth background metric on $\overline{\Omega}$. The weight function in $\Omega$ is then
$\rho(x) = \tilde\rho(d(x))$, where $d(x) = d_{\mathring{g}}(x, \partial\Omega) = \inf_{y \in \partial\Omega} d_{\mathring{g}}(x, y)$. Since $\partial\Omega$ is a
compact, embedded smooth hypersurface, d is smooth in a neighborhood of $\partial\Omega$
in $\overline{\Omega}$, with nowhere-vanishing differential; i.e., d is a defining function for the
boundary. Thus for small $\epsilon$, the level sets $\{x \in \Omega : d(x) = \epsilon\}$ are smooth, and by
choosing $\tau_1$ small, the weight function $\rho$ is smooth in $\Omega$ as well. Such $\rho$ will be
used to achieve the decay of h at the boundary, and hence the smoothness of the
extension of h by zero outside of $\Omega$, where h will be our constructed solution to
R(g + h) = R(g) + S. The power weight suffices for finite regularity solutions
(using sufficiently large N), and the exponential weight can be used for smooth
solutions.
The linearized problem we want to solve is $L_g h = S$. We will solve for h
of the form $h = \rho L_g^* u$; you may recall in the closed setting the range of $L_g$
is indeed the range of $L_g L_g^* \sim (n-1)\Delta_g^2$. In the situation here, the equation
$L_g(\rho L_g^* u) = S$ is not strictly elliptic. To fix this, we rewrite the equation as
$\rho^{-1} L_g(\rho L_g^* u) = \rho^{-1} S$, which is strictly elliptic, but there are a few points to
note. First, the coefficients of the lower-order terms will be on the order of some
power of $d^{-1}$. As it turns out, one can actually handle this in the elliptic estimates
by suitable scaling [70, Appendix A]; the interested reader can compare this to
the formulation of the interior Schauder estimates in [107, Chapter 6]. Secondly,
the right-hand side $\rho^{-1} S$ needs to be reasonably well-behaved: for example, S
can be compactly supported in the interior $\Omega$, or more generally S must decay fast
enough at the boundary. Finally, as we will see presently, the solution u need
not be a priori bounded on $\Omega$, but will behave well enough so that $\rho L_g^* u$ decays
suitably at the boundary.
To solve $L_g h = S$ variationally, we define
\[ \mathcal{G}(u) = \int_\Omega \Big( \tfrac{1}{2} \rho |L_g^* u|^2 - uS \Big)\, dv_g. \]
We will identify a suitable space on which to minimize $\mathcal{G}$, such that if u is in the
space and v is smooth and compactly supported, then u + tv is in the space for
any t. If u is critical for $\mathcal{G}$ under such perturbations, the Euler–Lagrange equation
is derived from
\[ 0 = \frac{d}{dt}\Big|_{t=0} \mathcal{G}(u + tv) = \int_\Omega \big( \langle \rho L_g^* u, L_g^* v \rangle - Sv \big)\, dv_g. \]
This is a weak formulation of $L_g(\rho L_g^* u) = S$, as desired.
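The first variation above can be read off from an exact expansion of the functional along the line u + tv. The following sketch uses only that the functional is quadratic plus linear in u and that $L_g^*$ is the formal $L^2$-adjoint of $L_g$; the symbol $\mathcal{G}$ for the functional is our notation here.

```latex
% Exact expansion in t (no remainder, since the functional is quadratic in u):
\begin{align*}
\mathcal{G}(u + tv)
 = \mathcal{G}(u)
 + t \int_\Omega \big( \rho \langle L_g^* u, L_g^* v \rangle - Sv \big)\, dv_g
 + \frac{t^2}{2} \int_\Omega \rho\, |L_g^* v|^2 \, dv_g .
\end{align*}
% The coefficient of t must vanish at a critical point. For v smooth with
% compact support in \Omega, integrating by parts (no boundary terms) gives
\[
\int_\Omega \rho \langle L_g^* u, L_g^* v \rangle \, dv_g
 = \int_\Omega v \, L_g(\rho L_g^* u) \, dv_g ,
\]
% to be interpreted distributionally when u is only in a Sobolev space.
```

The nonnegative coefficient of $t^2$ is also the source of the convexity used later to get uniqueness of the minimizer.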


In order to get the functional framework to produce a minimizer, we need to
promote (7.5.2) to a weighted estimate. To do this, we define certain weighted
L 2 -Sobolev spaces. The definition we will give is used in [66; 70], as it is simple
and suffices for our purposes here; in [67] we compare these spaces to another
suite of weighted L 2 -spaces that are naturally defined.
We say $u \in L^2_\rho(\Omega)$ if $\|u\|^2_{L^2_\rho(\Omega)} = \int_\Omega |u|^2 \rho \, dv_g$ is finite, and we define $H^k_\rho(\Omega)$
similarly for any nonnegative integer k, where
$\|u\|^2_{H^k_\rho(\Omega)} = \sum_{\ell=0}^{k} \|\nabla^\ell u\|^2_{L^2_\rho(\Omega)}$;
we note $H^0_\rho(\Omega) = L^2_\rho(\Omega)$. It is a simple exercise to show that $H^k_\rho(\Omega)$ is a Hilbert
space. Furthermore, since $\rho$ is bounded, we have by [69, Lemma 2.1] that
$C^\infty(\overline{\Omega})$ is dense in $H^k_\rho(\Omega)$; this can be useful in establishing certain relations
or estimates, by proving on the dense subset and then taking limits. As a set,
$H^k_\rho(\Omega)$ is the same even if the metric g, and hence the norm, changes, and we will
usually suppress the metric from the notation. We remark that, considering the
behavior of the weight function near the boundary, it would be natural to define
the weighted spaces with a different weighting on different order derivatives, cf.
[67; 61].
We state the key estimate: with $\overline{\Omega}$ a compact smooth manifold-with-boundary,
assuming $L_g^*$ has no kernel on the manifold interior $\Omega$, there is a constant C > 0
such that for all $f \in H^2_\rho(\Omega)$

\[ \|f\|_{H^2_\rho(\Omega)} \le C \|L_g^* f\|_{L^2_\rho(\Omega)}. \tag{7.5.3} \]

We sketch the proof in the form of two exercises.


Exercise 7-85. Prove that with $\Omega_\epsilon = \{x \in \Omega : d(x) > \epsilon\}$, there is a C > 0 such that
for all small enough $\epsilon > 0$, $\|f\|_{H^2(\Omega_\epsilon)} \le C \|L_g^* f\|_{L^2(\Omega_\epsilon)}$ holds. The argument can
be carried out by contradiction, assuming a sequence $\epsilon_j \searrow 0$ and $f_j \in H^2(\Omega_{\epsilon_j})$
with $\|f_j\|_{H^2(\Omega_{\epsilon_j})} > j \|L_g^* f_j\|_{L^2(\Omega_{\epsilon_j})}$. One can extend $f_j$ to $\tilde f_j \in H^2(\Omega)$ with
$\|\tilde f_j\|_{H^2(\Omega)} \le D \|f_j\|_{H^2(\Omega_{\epsilon_j})}$, with a constant D uniform in j, then normalize to
$\|\tilde f_j\|_{H^1(\Omega)} = 1$. Using (7.5.1) on $\Omega_{\epsilon_j}$ and an application of Rellich will yield the
contradiction, cf. [66; 70].
The next exercise indicates how to get from here to (7.5.3).
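Exercise 7-86. Establish (7.5.3) as follows. Since $\tilde\rho' \ge 0$, we have

\[ \tilde\rho'(\epsilon) \|f\|^2_{H^2(\Omega_\epsilon)} \le C^2 \tilde\rho'(\epsilon) \|L_g^* f\|^2_{L^2(\Omega_\epsilon)}. \]

Foliate a neighborhood of $\partial\Omega$ with smooth level sets of d, containing
$\{x \in \Omega : 0 \le d(x) \le d_1\}$ for some small $d_1 > 0$. Let $C_0 = \int_0^{d_1} \tilde\rho'(t)\, dt = \tilde\rho(d_1) > 0$. Use
the co-area formula to argue
\[ \int_0^{d_1} \tilde\rho'(\epsilon) \|f\|^2_{H^2(\Omega_\epsilon \setminus \Omega_{d_1})}\, d\epsilon = \|f\|^2_{H^2_\rho(\Omega \setminus \Omega_{d_1})}. \]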
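The identity at the end of Exercise 7-86 comes down to exchanging the order of integration. A hedged sketch, writing $F = |f|^2 + |\nabla f|^2 + |\nabla^2 f|^2$ (our shorthand) and assuming $\tilde\rho(t) \to 0$ as $t \to 0^+$, which holds for both the power and the exponential weight:

```latex
% \Omega_\epsilon \setminus \Omega_{d_1} = \{ \epsilon < d(x) \le d_1 \}, and \rho = \tilde\rho \circ d.
\begin{align*}
\int_0^{d_1} \tilde\rho'(\epsilon) \int_{\{\epsilon < d(x) \le d_1\}} F(x)\, dv_g(x)\, d\epsilon
 &= \int_{\{0 < d(x) \le d_1\}} F(x) \Big( \int_0^{d(x)} \tilde\rho'(\epsilon)\, d\epsilon \Big) dv_g(x) \\
 &= \int_{\{0 < d(x) \le d_1\}} F(x)\, \tilde\rho(d(x))\, dv_g(x)
  = \int_{\Omega \setminus \Omega_{d_1}} F \rho \, dv_g .
\end{align*}
% The exchange is justified by Tonelli's theorem, since all integrands are nonnegative.
```

The same computation applied to $|L_g^* f|^2$ in place of F converts the right-hand side of the unweighted inequality into a weighted $L^2_\rho$ norm near the boundary.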

For the full constraint operator the analogous weighted estimates for $D\Phi^*_{(g,\pi)}$
hold, but are somewhat harder to establish, without having an analogue of the
absolute (unweighted) estimate (7.5.2), cf. [69; 67; 61].
We now see that (7.5.3) gives us a coercivity bound on the functional $\mathcal{G}$:
assuming the kernel of $L_g^*$ on $\Omega$ is trivial and setting $\|S\|^2_{L^2_{\rho^{-1}}(\Omega)} = \int_\Omega |S|^2 \rho^{-1}\, dv_g$,
we obtain for $u \in H^2_\rho(\Omega)$

\[ \mathcal{G}(u) \ge C' \|u\|^2_{H^2_\rho(\Omega)} - \|u\|_{L^2_\rho(\Omega)} \|S\|_{L^2_{\rho^{-1}}(\Omega)} \ge C' \|u\|^2_{H^2_\rho(\Omega)} - \|u\|_{H^2_\rho(\Omega)} \|S\|_{L^2_{\rho^{-1}}(\Omega)}. \]

In applying Cauchy–Schwarz, we incorporated the weight on u, and thus we had
to incorporate a dual weight on S. We already saw above the indication that S
will have to decay suitably near the boundary.
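For concreteness, the two steps behind the coercivity bound can be written out. In this sketch C denotes the constant in (7.5.3), and the functional is written $\mathcal{G}$; the explicit value $C' = 1/(2C^2)$ is one admissible choice, not a claim about the optimal constant.

```latex
\begin{align*}
\Big| \int_\Omega u S \, dv_g \Big|
 &= \Big| \int_\Omega (\rho^{1/2} u)(\rho^{-1/2} S)\, dv_g \Big|
  \le \|u\|_{L^2_\rho(\Omega)} \, \|S\|_{L^2_{\rho^{-1}}(\Omega)}
  && \text{(Cauchy--Schwarz)}, \\
\mathcal{G}(u)
 &= \tfrac{1}{2} \|L_g^* u\|^2_{L^2_\rho(\Omega)} - \int_\Omega u S \, dv_g \\
 &\ge \frac{1}{2C^2} \|u\|^2_{H^2_\rho(\Omega)}
     - \|u\|_{L^2_\rho(\Omega)} \, \|S\|_{L^2_{\rho^{-1}}(\Omega)}
  && \text{(by (7.5.3))}.
\end{align*}
```

In particular the functional is bounded below and tends to infinity as the weighted $H^2$ norm of u grows, which is what the direct method needs.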
By applying standard functional analysis just like in the proof of the Hodge
decomposition in Section 6.1, one can prove the existence of a minimizer u for $\mathcal{G}$
on $H^2_\rho(\Omega)$. We note there is a unique minimizer, as a corollary of the following
exercise.
Exercise 7-87. Under the trivial kernel condition in the setting above, show
that for $u_1 \ne u_2 \in H^2_\rho(\Omega)$, the function $s \mapsto \mathcal{G}((1 - s)u_1 + s u_2)$ is strictly
convex.
For this unique minimizer u, we note that $\mathcal{G}(u) \le \mathcal{G}(0) = 0$. This implies
\[ \|\rho L_g^* u\|^2_{L^2_{\rho^{-1}}(\Omega)} = \|L_g^* u\|^2_{L^2_\rho(\Omega)} \le 2 \|S\|_{L^2_{\rho^{-1}}(\Omega)} \|u\|_{L^2_\rho(\Omega)}. \]
Now we can apply (7.5.3) to infer
$\|\rho L_g^* u\|_{L^2_{\rho^{-1}}(\Omega)} \le C' \|u\|_{H^2_\rho(\Omega)} \le C'' \|S\|_{L^2_{\rho^{-1}}(\Omega)}$.
We summarize what we have discussed.
Proposition 7-88. Let $0 < \alpha < 1$. Suppose (M, g) is a smooth Riemannian
manifold, and $\Omega \subset M$ is a precompact smooth domain in M, so that $L_g^*$ has
trivial kernel on $\Omega$. There are positive constants $C'$ and $C''$ such that for any
$S \in L^2_{\rho^{-1}}(\Omega)$, there is unique $u \in H^2_\rho(\Omega)$ such that for $h = \rho L_g^* u$, we have $L_g h = S$,
with the following estimate: $\|h\|_{L^2_{\rho^{-1}}(\Omega)} \le C' \|u\|_{H^2_\rho(\Omega)} \le C'' \|S\|_{L^2_{\rho^{-1}}(\Omega)}$.
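The chain of estimates in the proposition can be traced with explicit constants. This is a sketch only: C is the constant in (7.5.3), the functional is written $\mathcal{G}$, and the numerical factors are simply what this particular argument yields.

```latex
% Start from \mathcal{G}(u) \le \mathcal{G}(0) = 0 for the minimizer u, with h = \rho L_g^* u.
\begin{align*}
\|L_g^* u\|^2_{L^2_\rho}
 &\le 2 \int_\Omega u S \, dv_g
  \le 2\, \|S\|_{L^2_{\rho^{-1}}} \|u\|_{H^2_\rho}, \\
\|u\|^2_{H^2_\rho}
 &\le C^2 \|L_g^* u\|^2_{L^2_\rho}
  \le 2 C^2 \|S\|_{L^2_{\rho^{-1}}} \|u\|_{H^2_\rho}
  \quad\Longrightarrow\quad
  \|u\|_{H^2_\rho} \le 2 C^2 \|S\|_{L^2_{\rho^{-1}}}, \\
\|h\|^2_{L^2_{\rho^{-1}}}
 &= \int_\Omega \rho^2 |L_g^* u|^2 \rho^{-1} \, dv_g
  = \|L_g^* u\|^2_{L^2_\rho}
  \le 4 C^2 \|S\|^2_{L^2_{\rho^{-1}}} .
\end{align*}
```

So the solution operator $S \mapsto h$ is bounded from $L^2_{\rho^{-1}}(\Omega)$ to itself, with norm at most 2C in this sketch.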

On the nonlinear problem. If we assume S and g in the preceding proposition
are smooth, then by elliptic regularity, u, and hence h, will be smooth in $\Omega$
as well. What remains to be understood for the proof of Theorem 7-84 is for
which S we can infer that h decays sufficiently fast on approach to the boundary
that we can extend it by zero to (and past) the boundary smoothly. We also
expect that if we choose S suitably small, we will be able to control h to be
pointwise small, so that in particular g + h will be a metric. In this case we see
that $R(g + h) = R(g) + L_g h + Q_g(h) = R(g) + S + Q_g(h)$, where $Q_g(h)$ is the Taylor
remainder in the expansion of the scalar curvature map. Solving the linearized
problem results in an approximate solution where the error is quadratic in the
perturbation, after which one expects to be able to solve the nonlinear equation
with a Newton–Picard-type iteration.
The pointwise estimates, including the smallness condition and the behavior
near the boundary, are carried out with interior Schauder estimates and scaling.
This can all be done, in the spirit of [66] (with correction to the norm needed
on S as noted in [69], and cf. [70] for a cleaner formulation). When S has
compact support, if we assume it is $C^{0,\alpha}$-small, then it will also be small in the
required weighted norm, depending on the support (in particular, on a lower
bound of $\rho$ on the support of S), which explains the way in which Theorem 7-84
is stated. We note here that given k, by taking a power weight $\rho(x) = (d(x))^N$
near the boundary, for N sufficiently large we can make h extend by zero in a
$C^k$ fashion across the boundary; an exponential weight can be used to produce a
$C^\infty$-solution. The solution just constructed is supported on $\overline{\Omega}$; we can arrange
the support to lie strictly in $\Omega$ by running the above construction, replacing $\Omega$ by
$\Omega' = \Omega_{\epsilon'}$, for $\epsilon' > 0$ sufficiently small.
There are numerous technical issues remaining in the proof of Theorem 7-84,
which we will not go into here, hoping we have at least given the spirit of the
argument. Instead we spend the remaining section in this chapter discussing
what modifications need to be made to address the question posed at the start
of the section: can we construct a zero scalar curvature metric by gluing an
asymptotically flat metric of zero scalar curvature to a Schwarzschild metric and
doing a localized perturbation to make the scalar curvature vanish?

7.5.2. Asymptotic gluing construction. We now state a result that answers the
question posed at the start of the section. For simplicity we continue to work
with metrics that are smooth, and for asymptotically flat metrics we require the
rate $q > \frac{n-2}{2}$, with order ℓ ≥ 3, though we can replace this with a weighted
$C^{2,\alpha}_{-q}$-assumption; in any case, we can use this weighted norm to measure the
closeness of two such metrics as in the next theorem.
Theorem 7-89. Suppose (E , g) is an asymptotically flat end with vanishing
scalar curvature, nonzero ADM mass m(g), and asymptotically flat coordinates
x. There is θ0 > 0 so that for all θ ≥ θ0 , there is a metric ḡ on E with R(ḡ) = 0,
ḡ = g for |x| ≤ θ , and ḡ is a Schwarzschild metric for |x| ≥ 2θ . Given ε > 0, for
large enough θ , ḡ is ε-close to g, and |m(ḡ) − m(g)| < ε.
The theorem applies to ends of negative or positive mass. Moreover, if E ⊂ M is
an asymptotic end in M, then since the construction is local to the end, ḡ extends
smoothly to all of M. The remainder of the chapter will be spent illustrating
the steps in the proof. The basic strategy is simple to lay out, but a number of
technical issues will make the discussion in parts a bit cumbersome.
With respect to the asymptotic coordinates x, we let
\[ \mathring{g}_{m,c}(x) = \Big( 1 + \frac{m}{2|x - c|^{n-2}} \Big)^{\frac{4}{n-2}} g_{\mathbb{E}^n} \]
be the Riemannian Schwarzschild metric of mass m and center c. Let 0 ≤ ψ ≤ 1
be a smooth cutoff function so that ψ(t) = 1 for t < 1 and ψ(t) = 0 for
t > 2. For $\theta \ge r_0 > 1$, let $\psi_\theta(x) = \psi(|x| \theta^{-1})$. Now consider the metric
$g_\theta(x) = \psi_\theta(x) g(x) + (1 - \psi_\theta(x)) \mathring{g}_{m,c}(x)$. We assume that |c| ≤ θ/2, so that
|x − c| ≥ θ/2 on |x| ≥ θ; in particular the metric is well-defined on E (for θ large
enough in case m < 0).
The scalar curvature $R(g_\theta)$ is supported in the annulus $A_\theta$ given by θ ≤ |x| ≤ 2θ.
Since the metric g is asymptotically flat, the scalar curvature $R(g_\theta)$ goes to zero
as θ → ∞; in fact, a simple exercise shows that $R(g_\theta) = O(\theta^{-q-2})$. Here and
below, it is important that such estimates hold uniformly for c as above and
for m in a bounded set. In terms of the localized scalar curvature deformation
result discussed above, the direction seems clear from here: apply a localized
deformation supported in the annulus $A_\theta$ to perturb the small scalar curvature
back to zero. Upon a closer look, we find a delicate issue: namely, we have to get
the size $O(\theta^{-q-2})$ of the desired perturbation to be within $\varepsilon_0$ from Theorem 7-84.
To do this, we would like to choose θ large. Doing so changes the metric, and
hence the $\varepsilon_0$. In fact, we really get stuck here, because as θ increases, the metric
$g_\theta$ in the annulus approaches the Euclidean metric, which is static vacuum.
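To understand this more clearly, we rescale the metric $g_\theta$ to $\tilde g_\theta$ on the fixed
annulus $A_1$, via the map $\varphi : A_1 \to A_\theta$ given by $\varphi(x) = \theta x$: $\tilde g_\theta = \theta^{-2} \varphi^* g_\theta$. Then
$\partial_x^\beta \big( (\tilde g_\theta)_{ij}(x) - \delta_{ij} \big) = O(\theta^{-q})$ for 0 ≤ |β| ≤ ℓ, so $\|R(\tilde g_\theta)\|_{C^{0,\alpha}(A_1)} = O(\theta^{-q})$.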
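The rescaling bookkeeping behind these orders can be spelled out. The sketch below assumes asymptotic flatness in the form $\partial_y^\beta\big((g_\theta)_{ij} - \delta_{ij}\big)(y) = O(|y|^{-q-|\beta|})$ for $|\beta| \le \ell$ on the annulus θ ≤ |y| ≤ 2θ, uniformly in θ, with y = θx.

```latex
\begin{align*}
(\tilde g_\theta)_{ij}(x)
 &= \theta^{-2} \cdot \theta^{2}\, (g_\theta)_{ij}(\theta x) = (g_\theta)_{ij}(\theta x), \\
\partial_x^\beta \big( (\tilde g_\theta)_{ij}(x) - \delta_{ij} \big)
 &= \theta^{|\beta|}\, \big( \partial_y^\beta ( (g_\theta)_{ij} - \delta_{ij} ) \big)(\theta x)
  = \theta^{|\beta|}\, O(\theta^{-q-|\beta|}) = O(\theta^{-q}), \\
R(\tilde g_\theta)(x)
 &= \theta^{2}\, R(g_\theta)(\theta x) = \theta^{2}\, O(\theta^{-q-2}) = O(\theta^{-q}).
\end{align*}
```

Each derivative in x gains a factor θ, which exactly cancels the extra decay of each derivative in y; this is why the rescaled metric is uniformly close to Euclidean in $C^\ell(A_1)$ at order $\theta^{-q}$.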


Now if $g_\theta$ does not admit any nontrivial static potentials on $A_\theta$ (actually if
it did, then we could already have concluded $R(g_\theta) = 0$), then neither does $\tilde g_\theta$
(on $A_1$), in which case we have the estimate $\|f\|_{H^2_\rho(A_1)} \le C \|L^*_{\tilde g_\theta} f\|_{L^2_\rho(A_1)}$. The
problem is that the constant C is not uniform in θ large. This is easy to see:
just let f = 1 or $f = x^j$ for a coordinate function on $A_1$; the span of these
functions is basically an obstruction space to being able to carry out our desired
construction. Handling the obstruction is a two-step process: in the next section,
we will discuss solving the problem transverse to the obstruction, and then finish
by addressing how to correct the resulting finite-dimensional error.
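To see quantitatively why the constant degenerates, one can test the estimate on affine functions. This sketch assumes the standard formula $L_g^* f = -(\Delta_g f)\, g + \mathrm{Hess}_g f - f\, \mathrm{Ric}(g)$ and writes $L^*$ for the operator at the Euclidean metric; the constant $C_1$ is hypothetical and independent of θ.

```latex
% At the Euclidean metric, for affine f = a_0 + \sum_j a_j x^j:
\[
L^* f = -(\Delta f)\, g_{\mathbb{E}^n} + \mathrm{Hess}\, f - f \cdot 0 = 0 .
\]
% The coefficients of L^*_{\tilde g_\theta} - L^* involve \tilde g_\theta - \delta and its
% first two derivatives, all of size O(\theta^{-q}). Hence for affine f:
\[
\|L^*_{\tilde g_\theta} f\|_{L^2_\rho(A_1)}
 = \|(L^*_{\tilde g_\theta} - L^*) f\|_{L^2_\rho(A_1)}
 \le C_1\, \theta^{-q}\, \|f\|_{H^2_\rho(A_1)} ,
\]
% so any C with \|f\|_{H^2_\rho(A_1)} \le C \|L^*_{\tilde g_\theta} f\|_{L^2_\rho(A_1)} must satisfy
\[
C \ge \theta^{q} / C_1 \to \infty \quad \text{as } \theta \to \infty .
\]
```

This is the sense in which the span of 1 and the coordinate functions acts as an approximate kernel.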

The projected problem. We expect to be able to formulate function spaces so
that the image of $L = L_{g_{\mathbb{E}^n}}$ should be transverse to the kernel $\mathring{K}$ of $L^* = L^*_{g_{\mathbb{E}^n}}$,
$\mathring{K} = \mathrm{span}\{1, x^1, \ldots, x^n\}$. As motivation, consider the following exercise, for
which we introduce some more notation. Let 0 ≤ ζ ≤ 1 be a smooth, rotationally
symmetric bump function supported in $A_1$ and that equals 1 near $|x| = \frac{3}{2}$,
say. Let $K = \zeta \mathring{K}$, and write $L^2(A_1, dx)$ as the orthogonal decomposition
$L^2(A_1, dx) = K^\perp \oplus K$, and let $\mathring{\Pi}$ be the projection onto the $K^\perp$-factor.
Exercise 7-90. Suppose $\mathring\sigma \in K^\perp$, and let
$\mathring{\mathcal{G}}(u) = \int_{A_1} \big( \tfrac{1}{2} |L^* u|^2 - \mathring\sigma u \big)\, dx$, for
$u \in H^2(A_1)$. Suppose $u \in H^2(A_1) \cap K^\perp$ minimizes $\mathring{\mathcal{G}}$ restricted to the Banach
space $H^2(A_1) \cap K^\perp$. Argue that $u \in H^4_{\mathrm{loc}}(A_1)$ and $L L^* u - \mathring\sigma \in K$, so that
$L L^* u \in L^2(A_1, dx)$ and $\mathring\Pi(L L^* u) = \mathring\sigma$.

Even if $L^*_{\tilde g_\theta}$ has trivial kernel, $\mathring{K}$ can be thought of as an approximate kernel for
large θ, an obstruction to uniform estimates. We proceed as in Proposition 6-15:
we get uniform estimates for $L^*_{\tilde g_\theta}$ if we work transverse to $\mathring{K}$. To this end, let
\[ {}^\perp K = \Big\{ u \in H^2_\rho(A_1) : \int_{A_1} u \zeta \, dx = 0 = \int_{A_1} u \zeta x^i \, dx, \ i = 1, \ldots, n \Big\}. \]
Since ζ is compactly supported away from the boundary of the annulus, integration
against an element of K defines a continuous linear functional on $H^2_\rho(A_1)$, and
so ${}^\perp K$ is closed, which also follows from $H^2_\rho(A_1) = {}^\perp K \oplus K$. We remark that
$H^2_\rho(A_1) = {}^\perp K \oplus \mathring{K}$ as well, since $\mathring{K} \cap {}^\perp K = \{0\}$, and $\mathring{K}$ has the same dimension
as $H^2_\rho(A_1) / {}^\perp K \cong K$. To be explicit, let $x^0 := 1$; we can decompose $u \in H^2_\rho(A_1)$
as $u = R_0(u) + P_0(u)$, where
\[ P_0(u) = \sum_{j=0}^{n} \frac{\langle u, \zeta x^j \rangle_{L^2}}{\langle x^j, \zeta x^j \rangle_{L^2}}\, x^j \in \mathring{K}. \]
We can also write $u = R(u) + P(u)$, with
\[ P(u) = \sum_{j=0}^{n} \frac{\langle u, \zeta x^j \rangle_{L^2}}{\langle \zeta x^j, \zeta x^j \rangle_{L^2}}\, \zeta x^j \in K. \]
It is easy to see that $R_0(u)$ and $R(u)$ are in ${}^\perp K$.
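The membership of the two remainders in the transverse space can be checked directly. The sketch below relies on the rotational symmetry of ζ and of the annulus, which makes the functions $x^0 = 1, x^1, \ldots, x^n$ mutually orthogonal against the weights ζ and $\zeta^2$; this orthogonality is the only ingredient.

```latex
% By rotational symmetry (odd integrands under x^j -> -x^j), for j \ne k in \{0, 1, ..., n\}:
\[
\langle x^j, \zeta x^k \rangle_{L^2} = \int_{A_1} \zeta\, x^j x^k \, dx = 0,
\qquad
\langle \zeta x^j, \zeta x^k \rangle_{L^2} = \int_{A_1} \zeta^2 x^j x^k \, dx = 0 .
\]
% Hence, for each k,
\begin{align*}
\langle R_0(u), \zeta x^k \rangle_{L^2}
 &= \langle u, \zeta x^k \rangle_{L^2}
  - \frac{\langle u, \zeta x^k \rangle_{L^2}}{\langle x^k, \zeta x^k \rangle_{L^2}}
    \langle x^k, \zeta x^k \rangle_{L^2} = 0, \\
\langle R(u), \zeta x^k \rangle_{L^2}
 &= \langle u, \zeta x^k \rangle_{L^2}
  - \frac{\langle u, \zeta x^k \rangle_{L^2}}{\langle \zeta x^k, \zeta x^k \rangle_{L^2}}
    \langle \zeta x^k, \zeta x^k \rangle_{L^2} = 0 .
\end{align*}
```

The denominators are positive because ζ ≥ 0 is not identically zero, and the pairings with u are finite because ζ has compact support in the open annulus, where the weight ρ is bounded below.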

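Surjectivity for the linearized projected problem. Let
\[ {}^\perp K_\theta = \Big\{ u \in H^2_\rho(A_1) : \int_{A_1} u \zeta \, dv_{\tilde g_\theta} = 0 = \int_{A_1} u \zeta x^i \, dv_{\tilde g_\theta}, \ i = 1, \ldots, n \Big\}. \]
Since $\mathring{K} \cap {}^\perp K_\theta = \{0\}$ for large θ, we have $H^2_\rho(A_1) = {}^\perp K_\theta \oplus K = {}^\perp K_\theta \oplus \mathring{K}$. Let
$\mathcal{G}(u) = \int_{A_1} \big( \tfrac{1}{2} \rho |L^*_{\tilde g_\theta} u|^2 - u\sigma \big)\, dv_{\tilde g_\theta}$ for $\sigma \in L^2_{\rho^{-1}}(A_1)$. If we restrict $\mathcal{G}$ to the set
${}^\perp K_\theta$, we can get a coercivity bound using an injectivity estimate transverse to
the kernel by adapting the proof of (7.5.3) to obtain that there is a constant C > 0
such that for all θ large and $f \in {}^\perp K_\theta$, $\|f\|_{H^2_\rho(A_1)} \le C \|L^*_{\tilde g_\theta} f\|_{L^2_\rho(A_1)}$. We use
this as before to produce a minimizer $u \in {}^\perp K_\theta$ for $\mathcal{G}$ on ${}^\perp K_\theta$. For any smooth,
compactly supported $v \in {}^\perp K_\theta$, we have
\[ 0 = \frac{d}{dt}\Big|_{t=0} \mathcal{G}(u + tv) = \int_{A_1} \big( \langle \rho L^*_{\tilde g_\theta} u, L^*_{\tilde g_\theta} v \rangle - v\sigma \big)\, dv_{\tilde g_\theta}. \]
Since $C_c^\infty(A_1) = ({}^\perp K_\theta \cap C_c^\infty(A_1)) \oplus K$, we infer the distributional equation
$L_{\tilde g_\theta}(\rho L^*_{\tilde g_\theta} u) - \sigma \in K$. By elliptic regularity, if $\sigma \in C^\infty(A_1)$, then $u \in C^\infty(A_1)$.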

Remark 7-91. We have not used anything about $\tilde g_\theta$ except that it is near the
Euclidean metric, and so the results just obtained extend to g suitably near $g_{\mathbb{E}^n}$.

Nonlinear projected problem. We summarize what the above analysis gives
us. We consider the projected operator $g \mapsto \mathring\Pi R(g)$, whose linearization is
$\mathring\Pi \circ L_g$. For g near $g_{\mathbb{E}^n}$ this linear operator surjects onto $L^2_{\rho^{-1}}(A_1) \cap K^\perp$, with
$\mathring\Pi L_g h = \mathring\sigma \in L^2_{\rho^{-1}}(A_1) \cap K^\perp$, where $h = \rho L_g^* u \in L^2_{\rho^{-1}}(A_1) \cap H^2_{\mathrm{loc}}(A_1)$. Just
as in Theorem 7-84 (see also Proposition 7-88), h is bounded in terms of σ̊ ,
first in a weighted L 2 -norm from the variational estimates, then in a Hölder
norm by Schauder estimates to obtain regularity and decay of h at the boundary,
as in [66; 70] (see also [69; 61; 67]). We do point the reader to a suggestive
linear model problem in Exercise 7-110. In any case, such pointwise estimates
are then used in a Newton–Picard iteration to solve the nonlinear problem
Π̊(R(g+h)− R(g)) = σ̊ for σ̊ ∈ K⊥ small, and suitably decaying (e.g., compactly
supported). We will not go into the details of the proofs, but we indicate in the
final section how to use this to accomplish the gluing to Schwarzschild near
infinity.
Remark 7-92. There is one technical point that could have been raised in the
proof of Theorem 7-84, but it was not needed there. In the next section, we will
need to know that the solution tensor h depends continuously on σ̊ and on g.
This is generally straightforward: we can proceed as in [6, Appendix A.7] and
the proof of [70, Proposition 3.7], and note that we chose to use a fixed weight ρ
which does not change with g (cf. [67, Remark 6.7]).
Handling the obstruction space. For large θ, we have $\tilde g_\theta$ near the Euclidean
metric on the annulus, with $\|R(\tilde g_\theta)\|_{C^{0,\alpha}(A_1)} = O(\theta^{-q})$. In the preceding section
we presented the linear theory needed to solve the problem $\mathring\Pi(R(\tilde g_\theta + h_\theta)) = 0$
for $h_\theta$, by letting $\mathring\sigma = -\mathring\Pi R(\tilde g_\theta)$. As mentioned in Remark 7-92, $h_\theta$ can be
constructed continuously in θ, as well as with respect to m and c, and such that
it extends smoothly by zero outside $A_1$, together with a bound
$\|h_\theta\|_{C^{2,\alpha}(A_1)} \le C \|R(\tilde g_\theta)\|_{C^{0,\alpha}(A_1)} = O(\theta^{-q})$, cf. Theorem 7-84.
Thus we have solved $R(\tilde g_\theta + h_\theta) \in K$. We now want to see how, for θ large
enough, by choosing m and c appropriately, we will in fact have $R(\tilde g_\theta + h_\theta) = 0$;
upon rescaling, we will have finished the proof of Theorem 7-89. Now since
$K \cap \mathring{K}^\perp = \{0\}$, we need only arrange $\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\, dx = 0$, for k ∈ {0, 1, . . . , n},
where recall $x^0 := 1$. We proceed to indicate how to estimate these integrals.
Before we do, note that near the boundary {|x| = 2} of $A_1$,
\[ (\tilde g_\theta)_{ij}(x) = \Big( 1 + \frac{m/\theta^{n-2}}{2|x - c/\theta|^{n-2}} \Big)^{\frac{4}{n-2}} \delta_{ij}, \]
which is Schwarzschild of mass $m/\theta^{n-2}$ and center c/θ. We also recall that
$\nu^j = x^j/|x|$ and that dσ is Euclidean surface measure.
The analysis begins with an expansion of $R(\tilde g_\theta + h_\theta)$, cf. Proposition 7-31:
\begin{align*}
R(\tilde g_\theta + h_\theta) &= R(\tilde g_\theta) + L_{\tilde g_\theta} h_\theta + O(\theta^{-2q}) \\
&= \sum_{i,j=1}^{n} \big( (\tilde g_\theta)_{ij,ij} - (\tilde g_\theta)_{ii,jj} \big) + L h_\theta + O(\theta^{-2q}).
\end{align*}
Note that we replaced $L_{\tilde g_\theta}$ with the operator L at the Euclidean metric, up to
error terms, since $\tilde g_\theta$ is close to the Euclidean metric. Integrating by parts, using
$L^*(x^k) = 0$ and the fact that $h_\theta$ and $\partial h_\theta$ vanish at the boundary, we get
\[ \int_{A_1} x^k R(\tilde g_\theta + h_\theta)\, dx = \int_{A_1} x^k \sum_{i,j=1}^{n} \big( (\tilde g_\theta)_{ij,ij} - (\tilde g_\theta)_{ii,jj} \big)\, dx + O(\theta^{-2q}). \tag{7.5.4} \]
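The integration by parts that removes the term $L h_\theta$ can be made explicit. This sketch assumes the Euclidean linearization takes the form $L h = \sum_{i,j} (h_{ij,ij} - h_{ii,jj})$, consistent with the leading term in the expansion above, and uses only that $x^k$ is affine.

```latex
% h = h_\theta with h and \partial h vanishing on \partial A_1; x^0 = 1, and k \in \{0, ..., n\}.
\begin{align*}
\int_{A_1} x^k \, L h \, dx
 &= \int_{A_1} x^k \sum_{i,j=1}^{n} ( h_{ij,ij} - h_{ii,jj} ) \, dx \\
 &= \int_{A_1} \sum_{i,j=1}^{n} \big( (\partial_i \partial_j x^k)\, h_{ij} - (\partial_j \partial_j x^k)\, h_{ii} \big) \, dx
    && \text{(two integrations by parts, no boundary terms)} \\
 &= 0 && \text{(second derivatives of $x^k$ vanish).}
\end{align*}
% Error budget: (L_{\tilde g_\theta} - L) h_\theta = O(\theta^{-q}) \cdot O(\theta^{-q}), and the
% quadratic Taylor remainder is O(\|h_\theta\|_{C^2}^2) = O(\theta^{-2q}).
```

Both error contributions are of order $\theta^{-2q}$, matching the remainder recorded in (7.5.4).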

We first consider k = 0. Recall that $\frac{n-2}{2} < q \le n - 2$; the upper bound holds
since m(g) ≠ 0 by assumption. From the fact that g is asymptotically flat with
vanishing scalar curvature, we know that $\sum_{i,j=1}^{n} (g_{ij,ij} - g_{ii,jj})(x) = O(|x|^{-2-2q})$,
from which we conclude
\[ \int_{\{|x| \ge \theta\}} \sum_{i,j=1}^{n} (g_{ij,ij} - g_{ii,jj})\, dx = O(\theta^{n-2-2q}), \]
and similarly for $\mathring{g}_{m,c}$ (with q = n − 2). Thus by (7.5.4) and (7.2.2) (see also
Proposition 7-33), we have
\begin{align*}
\int_{A_1} R(\tilde g_\theta + h_\theta)\, dx
&= \Big( \int_{\{|x|=2\}} - \int_{\{|x|=1\}} \Big) \sum_{i,j=1}^{n} \big( (\tilde g_\theta)_{ij,i} - (\tilde g_\theta)_{ii,j} \big) \nu^j \, d\sigma + O(\theta^{-2q}) \\
&= \theta^{2-n} \Big( \int_{\{|x|=2\theta\}} \sum_{i,j=1}^{n} \big( (\mathring{g}_{m,c})_{ij,i} - (\mathring{g}_{m,c})_{ii,j} \big) \nu^j \, d\sigma
 - \int_{\{|x|=\theta\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j}) \nu^j \, d\sigma \Big) + O(\theta^{-2q}) \\
&= \theta^{2-n} \Big( 2(n-1)\omega_{n-1} \big( m - m(g) \big) + O(\theta^{n-2-2q}) \Big). \tag{7.5.5}
\end{align*}
Up to scaling, this projection integral is governed to leading order by the mass
difference of the glued metrics.
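The power of θ in (7.5.5) and the appearance of the masses can be traced as follows. This is a sketch assuming the normalization $m(g) = \frac{1}{2(n-1)\omega_{n-1}} \lim_{r \to \infty} \int_{\{|y|=r\}} \sum_{i,j} (g_{ij,i} - g_{ii,j}) \nu^j \, d\sigma$ for the ADM mass, which matches the constant in (7.5.5); the radius s and the variable y = θx are our notation.

```latex
% Scaling of a flux integral over the sphere |x| = s, with y = \theta x:
% (\tilde g_\theta)_{ij,i}(x) = \theta\, (g_\theta)_{ij,i}(y), and d\sigma_x = \theta^{1-n} d\sigma_y.
\[
\int_{\{|x|=s\}} \sum_{i,j=1}^{n} \big( (\tilde g_\theta)_{ij,i} - (\tilde g_\theta)_{ii,j} \big) \nu^j \, d\sigma_x
 = \theta^{2-n} \int_{\{|y|=s\theta\}} \sum_{i,j=1}^{n} \big( (g_\theta)_{ij,i} - (g_\theta)_{ii,j} \big) \nu^j \, d\sigma_y .
\]
% On |y| = \theta the metric g_\theta agrees with g, and on |y| = 2\theta with \mathring g_{m,c},
% to all orders. By the divergence theorem and the decay of the integrand,
\[
\int_{\{|y|=\theta\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j}) \nu^j \, d\sigma_y
 = 2(n-1)\omega_{n-1}\, m(g) - \int_{\{|y| \ge \theta\}} \sum_{i,j=1}^{n} (g_{ij,ij} - g_{ii,jj}) \, dy
 = 2(n-1)\omega_{n-1}\, m(g) + O(\theta^{n-2-2q}),
\]
% and likewise on |y| = 2\theta with \mathring g_{m,c} and m in place of g and m(g).
```

Note also that $O(\theta^{-2q}) = \theta^{2-n} \, O(\theta^{n-2-2q})$, so the remainder from (7.5.4) is absorbed into the error term inside the parentheses.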
From Corollary 7-6 and equations (7-39a) and (7.2.5) (p. 256), we know to
expect that the other projection integrals should involve the center of mass. That
said, without further assumption, $g$ may fail to have a well-defined center of
mass integral; see (7.2.7). With $h_{ij} = g_{ij} - \delta_{ij}$ and $k \in \{1, \dots, n\}$, we have
\[
\int_{\{r_0 \le |x| \le \theta\}} x^k \sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj})\,dx
= \Big( \int_{\{|x|=\theta\}} - \int_{\{|x|=r_0\}} \Big) \sum_{i,j=1}^n \Big[ x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Big] \nu^j\,d\sigma.
\]

Using $\sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj})(x) = O(|x|^{-2-2q})$ again, we see the integrand
on the left is $O(|x|^{-1-2q})$. If $n - 1 \ne 2q$, then
\[
\int_{r_0}^{\theta} t^{-1-2q} \cdot t^{n-1}\,dt = \frac{1}{n-1-2q} \big( \theta^{n-1-2q} - r_0^{n-1-2q} \big),
\]
whereas if $n - 1 = 2q$ (so $q = 1$ if $n = 3$), $\int_{r_0}^{\theta} t^{-1-2q} \cdot t^{n-1}\,dt = \log\theta - \log r_0$. For
$t > 1$, let $\gamma(t) = \max(1, t^{n-1-2q})$ in case $n - 1 \ne 2q$, and let $\gamma(t) = 1 + \log t$ for
$n - 1 = 2q$. Since for the range of $q$ we are considering, we have $n - 1 - 2q < 1$,
in all cases, $\lim_{t\to\infty} \gamma(t)/t = 0$, i.e., $\gamma(t) = o(t)$. Thus we conclude there is a
$C$ such that for $\theta \ge r_0$,
\[
\Big| \int_{\{|x|=\theta\}} \sum_{i,j=1}^n \Big[ x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Big] \nu^j\,d\sigma \Big| \le C\gamma(\theta). \tag{7.5.6}
\]

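We now compute, with $(\mathring h_{m,c})_{ij} = (\mathring g_{m,c})_{ij} - \delta_{ij}$ and $(\tilde h_\theta)_{ij} = (\tilde g_\theta)_{ij} - \delta_{ij}$, and
using (7.5.4),
\[
\begin{aligned}
\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx
&= \Big( \int_{\{|x|=2\}} - \int_{\{|x|=1\}} \Big) \sum_{i,j=1}^n \Big[ x^k \big( (\tilde h_\theta)_{ij,i} - (\tilde h_\theta)_{ii,j} \big) - \big( \delta^k_i (\tilde h_\theta)_{ij} - \delta^k_j (\tilde h_\theta)_{ii} \big) \Big] \nu^j\,d\sigma + O(\theta^{-2q}) \\
&= \theta^{1-n} \int_{\{|x|=2\theta\}} \sum_{i,j=1}^n \Big[ x^k \big( (\mathring h_{m,c})_{ij,i} - (\mathring h_{m,c})_{ii,j} \big) - \big( \delta^k_i (\mathring h_{m,c})_{ij} - \delta^k_j (\mathring h_{m,c})_{ii} \big) \Big] \nu^j\,d\sigma \\
&\quad - \theta^{1-n} \int_{\{|x|=\theta\}} \sum_{i,j=1}^n \Big[ x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii}) \Big] \nu^j\,d\sigma + O(\theta^{-2q}).
\end{aligned} \tag{7.5.7}
\]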

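Since $R(\mathring g_{m,c}) = 0$, we have $\sum_{i,j=1}^n \big( (\mathring g_{m,c})_{ij,ij} - (\mathring g_{m,c})_{ii,jj} \big) = Q_{\mathring g_{m,c}}$, where $Q_g$
is defined on page 283 (cf. Proposition 7-31). Using (7-39a) (p. 256) we then
obtain
\[
\begin{aligned}
&\int_{\{|x|=2\theta\}} \sum_{i,j=1}^n \Big[ x^k \big( (\mathring h_{m,c})_{ij,i} - (\mathring h_{m,c})_{ii,j} \big) - \big( \delta^k_i (\mathring h_{m,c})_{ij} - \delta^k_j (\mathring h_{m,c})_{ii} \big) \Big] \nu^j\,d\sigma \\
&\qquad = 2(n-1)\omega_{n-1}\, m c^k - \lim_{r\to\infty} \int_{\{2\theta \le |x| \le r\}} x^k \sum_{i,j=1}^n \big( (\mathring g_{m,c})_{ij,ij} - (\mathring g_{m,c})_{ii,jj} \big)\,dx \\
&\qquad = 2(n-1)\omega_{n-1}\, m c^k - \lim_{r\to\infty} \int_{\{2\theta \le |x| \le r\}} x^k Q_{\mathring g_{m,c}}(x)\,dx.
\end{aligned} \tag{7.5.8}
\]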

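While $Q_{\mathring g_{m,c}}(x) = O(|x|^{2-2n})$, a closer analysis of $Q_{\mathring g_{m,c}}(x)$ shows that it takes
the form
\[
\mathring h_{m,c} * \partial^2 \mathring g_{m,c} + \partial \mathring g_{m,c} * \partial \mathring g_{m,c} = \mathring h_{m,c} * \partial^2 \mathring h_{m,c} + \partial \mathring h_{m,c} * \partial \mathring h_{m,c},
\]
which denotes a linear combination of terms of the form $(\mathring h_{m,c})_{ij} (\partial^2 \mathring h_{m,c})_{k\ell}$ and
$(\partial \mathring h_{m,c})_{ij} (\partial \mathring h_{m,c})_{k\ell}$, with bounded coefficients which are rational expressions in
the components of $\mathring g_{m,c}$. By explicit expansion, $Q_{\mathring g_{m,c}}(x) = \widetilde Q_{\mathring g_{m,c}}(x) + O(|x|^{1-2n})$,
where $\widetilde Q_{\mathring g_{m,c}}(-x) = \widetilde Q_{\mathring g_{m,c}}(x)$; cf. (7.2.7) and the discussion of approximate parity
symmetry.
Thus we see for any $r > 2\theta$, $\int_{\{2\theta \le |x| \le r\}} x^k \widetilde Q_{\mathring g_{m,c}}(x)\,dx = 0$, while we also
have the estimate $\int_{\{|x| \ge 2\theta\}} x^k \big( Q_{\mathring g_{m,c}}(x) - \widetilde Q_{\mathring g_{m,c}}(x) \big)\,dx = O(\theta^{2-n})$. From (7.5.6)–(7.5.8), we have
\[
\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = \theta^{1-n} \Big( 2(n-1)\omega_{n-1}\, m c^k + O(\gamma(\theta)) \Big).
\]
We saw above that the rescaled Schwarzschild indeed has mass times center
equal to $(m/\theta^{n-2})(c/\theta) = \theta^{1-n} m c$. With $\hat c^k = c^k/\theta$, recalling that $\gamma(\theta) = o(\theta)$,
we obtain
\[
\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = \theta^{2-n} \Big( 2(n-1)\omega_{n-1}\, m \hat c^k + o(1) \Big).
\]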

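Thus even though the metric $g$ may not have had a convergent center of mass
integral, we still have the following vector identity (recall (7.5.5)):
\[
\begin{aligned}
\frac{\theta^{n-2}}{2(n-1)\omega_{n-1}} \Big( \int_{A_1} R(\tilde g_\theta + h_\theta)\,dx,\ \int_{A_1} x\, R(\tilde g_\theta + h_\theta)\,dx \Big)
&=: \big( m - m(g),\ m\hat c \big) + \big( \xi^0(m, \hat c),\ \xi(m, \hat c) \big) \\
&= \big( m - m(g),\ m\hat c \big) + o(1),
\end{aligned} \tag{7.5.9}
\]
where $(\xi^0, \xi) := (\xi^0, \xi^1, \dots, \xi^n)$, so that each $\xi^k$ is a continuous function of $m$
and $\hat c$ (and $\theta$).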

This is spectacular: as we might have surmised at the start, we cannot expect
to attach any Schwarzschild metric end and locally perturb to obtain vanishing
scalar curvature. We have to tune it so that the mass is roughly equal to
$m(g)$: by (7.5.5), $m - m(g) = O(\theta^{n-2-2q})$, which we can make as small as
we like by choosing $\theta$ large enough. Now, for the second component we have
$m\hat c \approx m(g)\hat c$, and as we have assumed $m(g) \ne 0$, by varying $\hat c$, the center of mass
of the rescaled Schwarzschild, we can cover the $o(1) = O(\gamma(\theta)/\theta)$ error from
(7.5.9). To do this, we are varying $m$ and $\hat c$ (equivalently $c$) together, since the error
terms depend on both parameters and on $\theta$; recall also that we have assumed
$|c| \le \theta/2$, i.e., $|\hat c| \le \frac12$. The leading order term of this projection map is a local
diffeomorphism for $m(g) \ne 0$, the error term is small, and as we noted earlier,
the map is continuous with respect to $\theta$, $m$ and $c = \theta\hat c$ (Remark 7-92). Thus,
applying the Brouwer fixed point theorem or a homological degree argument,
we conclude that for some $m \approx m(g)$ and some $\hat c$ with $|\hat c| \le \frac12$, $R(\tilde g_\theta + h_\theta) = 0$.
To illustrate, let $\Theta = \{ (m, \hat c) : |m - m(g)| \le \frac12 |m(g)|,\ |\hat c| \le \frac12 \}$. The function
\[
F(m, \hat c) = \Big( m(g) - \xi^0(m, \hat c),\ -\frac{\xi(m, \hat c)}{m(g) - \xi^0(m, \hat c)} \Big)
\]
defines a continuous map $F : \Theta \to \Theta$, since $\xi = o(1)$, uniformly for $\theta$ large
and $(m, \hat c) \in \Theta$. The Brouwer fixed point theorem yields an $(m, \hat c) \in \Theta$ such
that $F(m, \hat c) = (m, \hat c)$, which translates into $\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = 0$ for $k \in \{0, 1, \dots, n\}$ as desired.
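The fixed-point reformulation can be illustrated with a toy computation. This is only a sketch under stated assumptions: the error terms `xi0` and `xi` below are hypothetical small functions standing in for the $o(1)$ quantities, and `M_G`, `EPS` are made-up values. Brouwer's theorem gives existence only; plain iteration converges here because the toy error terms make $F$ a contraction, which the argument in the text does not need.

```python
import math

def make_F(m_g, xi0, xi):
    """Build F(m, c_hat) = (m_g - xi0, -xi / (m_g - xi0)) from given error terms."""
    def F(m, c_hat):
        new_m = m_g - xi0(m, c_hat)
        new_c = tuple(-x / new_m for x in xi(m, c_hat))
        return new_m, new_c
    return F

def iterate(F, m, c_hat, steps=60):
    """Apply F repeatedly, starting from (m, c_hat)."""
    for _ in range(steps):
        m, c_hat = F(m, c_hat)
    return m, c_hat

EPS = 1e-2   # hypothetical size of the o(1) error terms
M_G = 1.0    # hypothetical value of m(g), nonzero as the argument requires

def xi0(m, c):
    # Hypothetical continuous error term for the mass component.
    return EPS * math.sin(m + c[0])

def xi(m, c):
    # Hypothetical continuous error terms for the center components (n = 3).
    return (EPS * math.cos(m * c[1]), EPS * c[0] * c[2], EPS * m)

F = make_F(M_G, xi0, xi)
m_star, c_star = iterate(F, M_G, (0.0, 0.0, 0.0))

# At a fixed point, m - m(g) + xi^0 = 0 and m * c_hat^k + xi^k = 0 for each k.
residual = (m_star - M_G + xi0(m_star, c_star),) + tuple(
    m_star * ck + xk for ck, xk in zip(c_star, xi(m_star, c_star))
)
```

The vanishing residual corresponds to the vanishing of all $n+1$ projection integrals in (7.5.9).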
Upon rescaling the metric back out from A1 , the center of mass becomes
c = θ ĉ, which can be “large”, but which is relatively smaller than the scale θ
at which the gluing is happening; for instance in the case n = 3, q = 1, we
have c = O(log θ ). This might be expected, since we did not assume g had a
convergent center of mass. As in [66] (note there are some unfortunate differences
between the normalization there and here), if the original metric g satisfies better
asymptotics, we can work a bit harder to get the final center of mass c to be close
to the original center c(g). In [66], we assumed asymptotically Schwarzschild
asymptotics, but the same would hold under Regge–Teitelboim asymptotics
(7.2.7), cf. [69; 61].

Exercises

Exercise 7-93. Let $(M, g)$ be complete, noncompact and connected. Suppose
there is a compact set $\mathcal{C} \subset M$ such that $M \setminus \mathcal{C}$ is a disjoint union $\bigcup_\alpha \mathcal{E}_\alpha$ of ends,
where each end $\mathcal{E}_\alpha$ is diffeomorphic to $\{|x| > 1\}$ in $\mathbb{R}^n$, and that there is a $C > 0$
with $|g_{ij}(x) - \delta_{ij}| \le C|x|^{-q}$ ($q > 0$, $|x| > 1$) on each end. Let $d : M \times M \to [0, \infty)$
be the distance function induced by $g$. By the Hopf–Rinow theorem, a set $S \subset M$
is $d$-bounded if and only if its closure $\overline{S}$ is compact.
a. Show that $S \subset M$ is $d$-bounded if and only if there is an $r_0 > 1$ such that
$S \cap \mathcal{E} \subset \{ p \in \mathcal{E} : |x(p)| \le r_0 \}$ for each end $\mathcal{E}$.
b. Prove that even if we had a priori allowed infinitely many ends, the number $k$
of ends of $(M, g)$ must be finite and is well-defined.
c. Show that in the definition of an asymptotically flat manifold with ends
$\mathcal{E}_1, \dots, \mathcal{E}_k$, we can arrange for each $\mathcal{E}_j$ as well as the compact set $\mathcal{C} \subset M$ with
$M \setminus \mathcal{C} = \bigcup_{j=1}^k \mathcal{E}_j$ to be smooth manifolds-with-boundary, with the asymptotically
flat coordinates on each $\mathcal{E}_j$ extending to a neighborhood of $\overline{\mathcal{E}_j}$, and with $\partial\mathcal{C} = \bigcup_{j=1}^k \partial\mathcal{E}_j$ a disjoint union of (topological) spheres.

Before the next part, you might recall as an example the $\mathbb{RP}^n$-geon from
Remark 2-39. We also recall that if $M$ is nonorientable, there is a connected
orientable double cover $\pi : \widehat M \to M$.
d. Suppose $\pi : \widehat M \to M$ is a connected double cover, with the covering metric $\hat g$.
Show that $(\widehat M, \hat g)$ is asymptotically flat, with $2k$ asymptotic ends. Indeed, let
$\mathcal{C} \subset M$ be as in part c., and let $\widehat{\mathcal{C}} = \pi^{-1}(\mathcal{C}) \subset \widehat M$. Use covering arguments (path
lifting properties) to prove that the path components of $\widehat M \setminus \widehat{\mathcal{C}}$ give the ends of
$\widehat M$. It may be useful to prove the following: if $p \in M \setminus \mathcal{C}$ is a point in an end of
$M$, and if $\pi^{-1}(p) = \{\hat p_1, \hat p_2\} \subset \widehat M \setminus \widehat{\mathcal{C}}$, then since any asymptotic end is simply
connected, any path from $\hat p_1$ to $\hat p_2$ in $\widehat M$ must hit $\widehat{\mathcal{C}}$.

Exercise 7-94. Refer to Remark 7-77 for the setting of the exercise. Suppose
$(M, g)$ is asymptotically flat and let $u = 1 + v$ be smooth with $0 < u < 1$ and $u \to 1$
at infinity in each asymptotic end, with $v$ subharmonic and lying in a suitable
weighted space, admitting an expansion $v(x) = A/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma})$
on each end, for some $\gamma > 0$. Recall that these conditions come from solving
$\Delta_g u - \frac{n-2}{4(n-1)} R(g) u = 0$, where $R(g) \ge 0$ is nontrivial and has suitable decay.
Let $E$ be the manifold-with-boundary corresponding to $\{|x| \ge r_0\}$ in asymptotic
coordinates on a chosen end. Suppose $w$ solves the boundary value problem
$\Delta_g w = 0$, $w|_{\partial E} = \max_{\partial E} v < 0$ and $w(x) \to 0$ as $|x| \to \infty$ in the end of $E$, with
$w(x) = B/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma'})$ for some $\gamma' > 0$. Argue that $A \le B$ and
$B < 0$. (Hint: use Hopf's lemma; see [107, Lemma 3.4] or [86, Chapter 6].)

Exercise 7-95. a. Complete the proof sketched in Remark 7-78. To check the
integration by parts argument, first give a reasonably direct argument for $p > n$;
to extend to $p > \frac{n}{2}$, one can use a suitable exhaustion of the ends, choosing
$r_i \nearrow \infty$ for which $r_i^{1-n} \int_{\{|x|=r_i\}} |\nabla w|^p\,d\sigma$ decays suitably.
b. Prove Corollary 7-68 in case $k = 2$ if the assumption on $h$ is changed to
$h \in W^{0,r}_{-\nu}$, with $n < p \le r$. Generalize this to $k \ge 2$.
c. Consider metrics $g$ with $g_{ij} - \delta_{ij} \in W^{2,p}_{-\tau}$, $p > n$ and $\tau > \frac{n-2}{2}$, e.g. in case $g$ is
asymptotically flat of rate $q > \frac{n-2}{2}$ and order 2. The energy integral converges for
such $g$ with $R(g) \in L^1$ (Exercise 7-53). Prove an analogue of Proposition 7-76
holds for such metrics, carrying out the proof with $u - 1 = v \in W^{2,p}_{-\tau}$. Comment
on the asymptotics of the resulting metric $\bar g$.
d. Prove that the ADM mass (energy) is nonnegative for the metrics $g$ in part
c. with nonnegative scalar curvature. (You may assume the metrics are smooth,
so the condition is just specifying the decay; metrics with only $W^{2,p}$ regularity
require extra care; see, e.g., [142, Theorem 3.43].)
e. Use Remark 7-77 together with parts c. and d. to argue that if the mass of
an end vanishes, then so does the scalar curvature: the mechanism is to remove
energy density (scalar curvature), but applying part c. can fail to preserve order 2.
An alternative approach is to instead observe that you need not remove all the
energy density: modify equation (7.3.1) by replacing $g_\theta$ with $g$, and $\psi_\theta$ with
$1 - \psi_\theta$. What can you conclude?

Exercise 7-96. In this exercise you will develop some further properties of
Euclidean harmonic functions on punctured domains in $\mathbb{R}^n$.
a. Show that if $u$ is harmonic in a punctured ball around the origin, then the
isolated singularity at $x = 0$ is in fact removable if $\lim_{x\to 0} |x|^{n-2} u(x) = 0$ in
case $n > 2$, and in case $n = 2$, if $\lim_{x\to 0} \frac{u(x)}{\log|x|} = 0$.
b. Suppose $n > 2$ and $\Omega$ is a domain containing the origin. If $u$ is harmonic
on $\Omega \setminus \{0\}$, and $\liminf_{x\to 0} |x|^{n-2} u(x) > -\infty$, there exist $v$ harmonic in $\Omega$ and
$b \in \mathbb{R}$ such that $u(x) = b|x|^{2-n} + v(x)$ on $\Omega \setminus \{0\}$. Formulate an analogue in case
$n = 2$.
c. What can you say about a positive harmonic function on $\mathbb{R}^n \setminus \{0, a\}$, $a \in \mathbb{R}^n \setminus \{0\}$?

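Exercise 7-97. The goal of this exercise is to verify directly that the ADM energy-momentum
vector of Schwarzschild spacetime transforms correctly under boosts
as a Lorentz-invariant vector. We consider the four-dimensional Schwarzschild
metric $\bar g_S$ given by
\[
\begin{aligned}
\bar g_S &= -\left( \frac{1 - \frac{m}{2r}}{1 + \frac{m}{2r}} \right)^{2} dt^2 + \left( 1 + \frac{m}{2r} \right)^{4} (dx^2 + dy^2 + dz^2) \\
&= -\left( 1 - \frac{2m}{r} \right) dt^2 + \left( 1 + \frac{2m}{r} \right) (dx^2 + dy^2 + dz^2) + O_\infty(r^{-2}),
\end{aligned}
\]
with $r = \sqrt{x^2 + y^2 + z^2}$ (and $G = 1$, $c = 1$). If $\Sigma$ is the $t = 0$ slice with induced
metric $g_S$, then as we have seen, $\Sigma$ is time-symmetric (totally geodesic) with two
asymptotic ends, which have ADM energy $m$ and vanishing linear momentum in
the above coordinates. For $0 < \alpha < 1$, consider a boosted slice $\Sigma_\alpha$ given in an
asymptotic end by $t = \alpha x$, with induced metric $g$. The corresponding Lorentz
transformation is given by $\tau = (t - \alpha x)/\sqrt{1 - \alpha^2}$, $\xi = (-\alpha t + x)/\sqrt{1 - \alpha^2}$; the
worldline of the corresponding (asymptotic) observer is $\xi = 0$, i.e., $x = \alpha t$, and
$\Sigma_\alpha$ is given by $\tau = 0$.
a. Show that the future-pointing unit normal to $\Sigma_\alpha$ is given by
\[
n = \left( 1 + \frac{m}{2r} \right)^{-2} \frac{\psi(r)\,\frac{\partial}{\partial t} + \alpha\,\frac{\partial}{\partial x}}{\sqrt{\psi(r) - \alpha^2}}
= \frac{\left( 1 + \frac{2m}{r} \right) \frac{\partial}{\partial t} + \alpha \left( 1 - \frac{2m}{r} \right) \frac{\partial}{\partial x}}{\sqrt{\left( 1 + \frac{2m}{r} \right) - \alpha^2 \left( 1 - \frac{2m}{r} \right)}} + O(r^{-2}),
\]
where $\psi(r) = \left( 1 - \frac{m}{2r} \right)^{-2} \left( 1 + \frac{m}{2r} \right)^{6} = 1 + \frac{4m}{r} + O(r^{-2})$.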


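b. Note that
\[
\frac{\partial}{\partial\tau} = \frac{1}{\sqrt{1 - \alpha^2}} \left( \frac{\partial}{\partial t} + \alpha \frac{\partial}{\partial x} \right)
\quad\text{and}\quad
\frac{\partial}{\partial\xi} = \frac{1}{\sqrt{1 - \alpha^2}} \left( \frac{\partial}{\partial x} + \alpha \frac{\partial}{\partial t} \right).
\]
Show that the metric $\bar g_S$ in the coordinates $(\tau, \xi, y, z)$ has the matrix representation
\[
[\bar g_S] = \begin{pmatrix}
-1 + \frac{1+\alpha^2}{1-\alpha^2} \cdot \frac{2m}{r} & \frac{\alpha}{1-\alpha^2} \cdot \frac{4m}{r} & 0 & 0 \\
\frac{\alpha}{1-\alpha^2} \cdot \frac{4m}{r} & 1 + \frac{1+\alpha^2}{1-\alpha^2} \cdot \frac{2m}{r} & 0 & 0 \\
0 & 0 & 1 + \frac{2m}{r} & 0 \\
0 & 0 & 0 & 1 + \frac{2m}{r}
\end{pmatrix} + O_\infty(r^{-2}).
\]
This shows that these are asymptotically Minkowskian coordinates. We caution
however that $\partial r/\partial\tau \ne 0$. Along $\Sigma_\alpha$, we have $t = \alpha x$, and hence $\xi = x\sqrt{1 - \alpha^2}$,
so that $r^2 = \alpha^2\xi^2/(1 - \alpha^2) + \rho^2$, where $\rho^2 := \xi^2 + y^2 + z^2 = r^2 - \alpha^2 x^2$. Note that
$(1 - \alpha^2) \le \rho^2/r^2 \le 1$. The coordinates $(\xi, y, z)$ are thus seen to be asymptotically
flat coordinates on $\Sigma_\alpha$.
c. Using the observations in part b., compute the ADM energy using the asymptotically
flat coordinates $(\xi, y, z)$ along $\Sigma_\alpha$, and show it is equal to $m/\sqrt{1 - \alpha^2}$.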
d. Compute the second fundamental form
\[
K_{ij} = \bar g_S \Big( \nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j},\ n \Big)
\]
of $\Sigma_\alpha$ in the asymptotically flat coordinates $(\xi, y, z)$. It might be instructive to
do this in two ways: (i) compute the relevant Christoffel symbols directly from
the metric $\bar g_S$ in $(t, x, y, z)$ coordinates above, and recall the above formula for
$\partial/\partial\xi$ when computing $\nabla_{\frac{\partial}{\partial\xi}} \frac{\partial}{\partial x^j}$; and (ii) use the ADM equation (5.3.4) and the
form of the metric above to compute $K_{ij}$ from the lapse and shift form of the
metric $\bar g_S$ in the $(\tau, \xi, y, z)$ coordinates; we again emphasize that $\partial r/\partial\tau \ne 0$. In
either case you should find that the matrix representation of $K$ in the $(\xi, y, z)$
coordinates is
\[
[K] = \begin{pmatrix}
\frac{\alpha(\alpha^2 - 3)}{(1-\alpha^2)^2} \frac{m\xi}{r^3} & -\frac{\alpha}{1-\alpha^2} \frac{2my}{r^3} & -\frac{\alpha}{1-\alpha^2} \frac{2mz}{r^3} \\
-\frac{\alpha}{1-\alpha^2} \frac{2my}{r^3} & \frac{\alpha}{1-\alpha^2} \frac{m\xi}{r^3} & 0 \\
-\frac{\alpha}{1-\alpha^2} \frac{2mz}{r^3} & 0 & \frac{\alpha}{1-\alpha^2} \frac{m\xi}{r^3}
\end{pmatrix} + O(r^{-3}),
\]
so that the trace of $K$ is $\operatorname{tr}_g K = -\frac{\alpha(1 + \alpha^2)}{(1 - \alpha^2)^2} \frac{m\xi}{r^3} + O(r^{-3})$.
e. Find the momentum tensor $\pi_{ij} = K_{ij} - (\operatorname{tr}_g K) g_{ij} = K_{ij} - (\operatorname{tr}_g K)\delta_{ij} + O(r^{-3})$
in the $(\xi, y, z)$ coordinates, noting in particular that
\[
\pi_{\xi\xi} = -\frac{2\alpha}{1 - \alpha^2} \frac{m\xi}{r^3} + O(r^{-3}).
\]
Now compute the ADM momentum, showing that $P_1 = -m\alpha/\sqrt{1 - \alpha^2}$ and
$P_2 = P_3 = 0$. Note that this is consistent with the Lorentz transformation of
coordinates, and makes sense on physical grounds in terms of the relative motion
of the two observers with worldlines $x = 0$ and $\xi = 0$ in the asymptotic region.
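
The value of $P_1$ in part e. can be sanity-checked numerically. The sketch below assumes the normalization $P_i = \frac{1}{8\pi}\lim \int \pi_{ij}\nu^j\,d\sigma$ (that is, $\kappa = 8\pi$ with $G = 1$) and uses only the leading-order entries of $\pi$ from parts d. and e.; the function names and the sample values of $m$ and $\alpha$ are illustrative. On the coordinate sphere $\rho = \text{const}$ with $\nu = (\xi, y, z)/\rho$, those entries give $\pi_{\xi j}\nu^j = -\frac{2\alpha}{1-\alpha^2}\, m\rho/r^3$ to leading order, with $r^2 = \rho^2(1 + a u^2)$, $a = \alpha^2/(1-\alpha^2)$ and $u = \xi/\rho$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def adm_momentum_P1(m, alpha):
    """Leading-order flux (1/8pi) * integral of pi_{xi j} nu^j over a sphere rho = const.

    The surface measure is rho^2 dphi du, so the rho-dependence cancels and only
    the angular integral of (1 + a u^2)^(-3/2) over u in [-1, 1] remains.
    """
    a = alpha ** 2 / (1 - alpha ** 2)
    angular = 2 * math.pi * simpson(lambda u: (1 + a * u * u) ** -1.5, -1.0, 1.0)
    return -(2 * alpha / (1 - alpha ** 2)) * m * angular / (8 * math.pi)

# Illustrative values: m = 1 and boost parameter alpha = 0.6.
m, alpha = 1.0, 0.6
P1_numeric = adm_momentum_P1(m, alpha)
P1_expected = -m * alpha / math.sqrt(1 - alpha ** 2)
```

The angular integral equals $4\pi\sqrt{1-\alpha^2}$ in closed form, which is where the factor $1/\sqrt{1-\alpha^2}$ in $P_1$ comes from.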

Exercise 7-98 (Lorentz invariance of ADM energy-momentum). Consider a
Lorentzian manifold $(\mathcal{S}, \bar g)$. We will be interested in the case when $\bar g$ is asymptotically
Minkowskian, so we will assume that $\mathcal{S}$ is $\mathbb{R}^{n+1}$ (or a suitable open
subset), and let $\eta$ be a background Minkowski metric, so that the coordinates $x^\mu$
($x^0 = t$, $c = 1$) on $\mathbb{R}^{n+1}$ are inertial coordinates for $\eta$:
\[
\eta = \eta_{\mu\nu}\,dx^\mu dx^\nu = -(dx^0)^2 + (dx^1)^2 + \cdots + (dx^n)^2.
\]
Given a symmetric tensor $h$, for any point in $\mathcal{S}$, there is a neighborhood on which
$\eta + \epsilon h$ is a Lorentzian metric for sufficiently small $\epsilon$. We define
\[
DG|_\eta(h) = \frac{d}{d\epsilon}\Big|_{\epsilon=0} G(\eta + \epsilon h),
\]
where $G(\bar g) = \mathrm{Ric}(\bar g) - \frac12 R(\bar g)\bar g =: \kappa T$ is the Einstein tensor. We likewise let
$D\mathrm{Ric}_\eta(h)$ and $DR_\eta(h)$ be the respective linearizations of the Ricci and scalar
curvatures.
a. Verify that in any inertial coordinate system (i.e., $\eta_{0\nu} = -\delta_{0\nu}$ and $\eta_{\mu j} = \delta_{\mu j}$ for
$j \ge 1$), we have $(D\mathrm{Ric}_\eta(h))_{\mu\nu} = \frac12 \eta^{\lambda\alpha} (h_{\alpha\mu,\lambda\nu} - h_{\lambda\alpha,\mu\nu} + h_{\alpha\nu,\lambda\mu} - h_{\mu\nu,\lambda\alpha})$, and
thus $DR_\eta(h) = \eta^{\mu\nu}\eta^{\lambda\alpha} (h_{\alpha\mu,\lambda\nu} - h_{\lambda\alpha,\mu\nu})$. Note then that $DG_\eta(h) = D\mathrm{Ric}_\eta(h) - \frac12 DR_\eta(h)\eta$.
Define $\tau$ by $\kappa\tau^{\mu\nu} = \eta^{\mu\alpha}\eta^{\nu\beta} (DG_\eta(h))_{\alpha\beta}$. Then $\kappa\tau$ is comprised of the terms
in the Einstein tensor that are linear in $h$.
b. Show that $\operatorname{div}_\eta \tau = 0$, i.e., in any inertial coordinate system, $\tau^{\mu\nu}{}_{,\nu} = 0$. You can
show this directly from the formula above, or use the fact that $\operatorname{div}_{\bar g} G(\bar g) = 0$.
c. Show that in any inertial coordinate system, for $i \in \{1, \dots, n\}$,
\[
\begin{aligned}
\kappa\tau^{00} &= \frac12 \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})_{,j}, \\
\kappa\tau^{0i} &= \frac12 \sum_{j=1}^n \Big( h_{0i,j} - h_{ij,0} - h_{0j,i} + \sum_{k=1}^n h_{kk,0}\,\delta_{ij} \Big)_{,j} \\
&= \frac12 \sum_{j=1}^n \Big( h_{0i,j} - h_{ij,0} + \sum_{k=1}^n (h_{kk,0} - h_{0k,k})\,\delta_{ij} \Big)_{,j}.
\end{aligned}
\]
Let $y^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$ for some proper Lorentz matrix $\Lambda^\mu{}_\nu$, and some vector
with components $a^\mu$. Then the volume form is $\omega = dx^0 \wedge dx^1 \wedge \cdots \wedge dx^n = dy^0 \wedge dy^1 \wedge \cdots \wedge dy^n$.
Consider spacelike hyperplanes $M$ and $\widetilde M$ in $\mathbb{R}^{n+1}$, where $M$
is given by $x^0 = 0$ and $\widetilde M$ by $y^0 = 0$. Let $C_R = \{ x \in \mathbb{R}^{n+1} : \sum_{i=1}^n (x^i)^2 \le R^2 \}$. Let
$M_R = M \cap C_R$, $\widetilde M_R = \widetilde M \cap C_R$; then $C_R$, $M_R$ and $\widetilde M_R$ bound a solid spacetime
region $W_R$, the boundary of which is the union of $M_R$, $\widetilde M_R$ and a timelike
hypersurface $\Sigma_R \subset \partial C_R$. For each $\mu$, let $I_\mu$ be the $n$-index increasing from 0 to
$n$, but omitting $\mu$, and let $dx^{I_\mu}$ be the corresponding $n$-form. For a vector field
$X = X^\mu\,\partial/\partial x^\mu$, note that $i_X\omega = \sum_{\mu=0}^n (-1)^{\mu} X^\mu\,dx^{I_\mu}$.
d. For any $\lambda$, let $X = \tau(\,\cdot\,, dx^\lambda)$. Observe that along $M$, the pullback of
$i_X\omega$ is $\tau(dx^0, dx^\lambda)\,dx^1 \wedge \cdots \wedge dx^n$, and along $\widetilde M$, the pullback of $i_X\omega$ is
$\tau(dy^0, dx^\lambda)\,dy^1 \wedge \cdots \wedge dy^n$. What is $d(i_X\omega)$, and what does Stokes' theorem
give you for $\int_{W_R} d(i_X\omega)$?

For the remainder of the problem, we consider an asymptotically Minkowskian
metric $\bar g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $h_{\mu\nu} = O_2(|x|^{-q})$ in $\eta$-inertial coordinates, $q > \frac{n-2}{2}$,
where $|x|^2 = \sum_{i=1}^n (x^i)^2$. We further assume that $G(\bar g) = \kappa T = O(|x|^{-2q-2})$.
e. Observe that $\tau = O(|x|^{-2q-2})$, so that it is integrable over $M$ and $\widetilde M$. Prove
that (with $dx$ and $dy$ the induced Euclidean volume measures on $M$ and $\widetilde M$,
respectively)
\[
\int_M \tau(dx^0, dx^\lambda)\,dx = \int_{\widetilde M} \tau(dy^0, dx^\lambda)\,dy.
\]
Conclude that $\widehat P := \big( \int_{\{x^0=0\}} \tau(dx^0, dx^\lambda)\,dx \big)\,\partial/\partial x^\lambda$ is Lorentz-invariant and that
$\widehat P^0 = \frac{1}{2\kappa} \lim_{r\to\infty} \int_{\{x^0=0,\,|x|=r\}} \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\nu^j\,d\sigma = E$, the ADM energy.

f. Write the metric $\bar g$ in lapse-shift form
\[
\bar g = -N^2\,dt^2 + g_{ij} (dx^i + X^i\,dt) \otimes (dx^j + X^j\,dt),
\]
so that $N - 1 = O_2(|x|^{-q})$ and $X^j = O_2(|x|^{-q})$. Show that along $M$,
\[
\begin{aligned}
\frac12 \Big( h_{0i,j} - h_{ij,0} - h_{0j,i} + \sum_{k=1}^n h_{kk,0}\,\delta_{ij} \Big)
&= \pi_{ij} + (X_{k,k}\delta_{ij} - X_{\ell,i}\delta_{\ell j}) + O(|x|^{-2q-1}) \\
&= \frac12 \Big( h_{0i,j} - h_{ij,0} + \sum_{k=1}^n (h_{kk,0} - h_{0k,k})\,\delta_{ij} \Big) + \frac12 (X_{k,k}\delta_{ij} - X_{\ell,i}\delta_{\ell j}) + O(|x|^{-2q-1}),
\end{aligned}
\]
where $\pi_{ij} = K_{ij} - (\operatorname{tr}_g K) g_{ij}$; cf. (5.3.4). Use these formulas to compute $\pi_{ij}$ for
a boosted slice $\Sigma_\alpha$ in Schwarzschild as in Exercise 7-97; in the notation of that
problem, $\partial r/\partial\tau \ne 0$.
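g. Prove that the ADM momentum flux integral converges, and is given by
\[
P_i = \lim_{r\to\infty} \int_{\{x^0=0,\,|x|=r\}} \Theta_{ij}\nu^j\,d\sigma = \widehat P_i,
\]
where $\kappa\Theta_{ij}$ can be taken to be either $\frac12 \big( h_{0i,j} - h_{ij,0} - h_{0j,i} + \sum_{k=1}^n h_{kk,0}\,\delta_{ij} \big)$
or $\frac12 \big( h_{0i,j} - h_{ij,0} + \sum_{k=1}^n (h_{kk,0} - h_{0k,k})\,\delta_{ij} \big)$. (You might note that in $\mathbb{R}^3$, if
$F = \langle 0, -X_3, X_2 \rangle$, then $\nabla \times F = \langle X_{k,k} - X_{1,1}, -X_{2,1}, -X_{3,1} \rangle$.)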


h. Summarize the preceding to formulate the Lorentz invariance of the ADM
energy-momentum vector for an asymptotically flat metric ḡ with decay rates
as above. Note that given the asymptotic spacetime end (which is assumed
to contain a suitable family of spacelike surfaces obtained from boosting the
asymptotically flat end given by x 0 = 0 in asymptotic coordinates), for a given
pair of spacelike slices we want to compare, we can choose a suitable asymptotic
region $\Omega$ containing the ends of the slices, and extend the metric ḡ as a Lorentzian
metric on Rn+1 ; this will not affect the flux integrals defining the ADM energy-
momentum, and allows us to apply the preceding arguments.

Exercise 7-99. We use the notation from Exercise 7-98. Assume in addition to
the asymptotically flat assumptions in that exercise that there are asymptotically
flat coordinates $x^\mu$ in which $\bar g$ and $T$ are even to one order better than their
asymptotic decay rates (and $\partial\bar g$ is odd to one order better):
\[
\begin{aligned}
\bar g_{\mu\nu}(x) - \bar g_{\mu\nu}(-x) &= O(|x|^{-q-1}), \\
\bar g_{\mu\nu,\sigma}(x) + \bar g_{\mu\nu,\sigma}(-x) &= O(|x|^{-q-2}), \\
T_{\mu\nu}(x) - T_{\mu\nu}(-x) &= O(|x|^{-2q-3})
\end{aligned}
\]
(cf. [61, Appendix E]). Compare to the Regge–Teitelboim conditions on initial
data sets (7.2.7)–(7.2.8).
a. Show that $J^{\nu\lambda} := \lim_{r\to\infty} \int_{\{x^0=0,\,|x|\le r\}} (x^\nu\tau^{0\lambda} - x^\lambda\tau^{0\nu})\,dx$ converges. Show
that if $y^\mu = x^\mu + a^\mu$ where $a^0 = 0$ (spatial translation, so $x^0 = y^0$),
\[
\begin{aligned}
\tilde J^{\nu\lambda} &:= \lim_{r\to\infty} \int_{\{y^0=0,\,|y|\le r\}} (y^\nu\tau^{0\lambda} - y^\lambda\tau^{0\nu})\,dy \\
&= \lim_{r\to\infty} \int_{\{x^0=0,\,|x|\le r\}} (y^\nu\tau^{0\lambda} - y^\lambda\tau^{0\nu})\,dx = J^{\nu\lambda} + a^\nu P^\lambda - a^\lambda P^\nu.
\end{aligned}
\]

µ
b. Let y µ = 3 ν x ν + a µ , and let

M = {x ∈ Rn+1 : x 0 = 0}, e = {x ∈ Rn+1 : y 0 = 30ν x ν + a 0 = 0},


M

and define M R and M eR as in Exercise 7-98. Using the vector field X µ ∂/∂ x µ
given by X µ = (x ν τ (d x µ , d x λ ) − x λ τ (d x µ , d x ν )), prove that
Z
νλ
J = lim (x ν τ (d x 0 , d x λ ) − x λ τ (d x 0 , d x ν )) d x
r →∞ M
Z r
= lim (x ν τ (dy 0 , d x λ ) − x λ τ (dy 0 , d x ν )) dy,
r →∞ M
er

so that
∂ ∂
J νλ ν ⊗ λ
∂ x Z∂ x
∂ ∂

α α 0 β β β 0 α
= lim ((y − a )τ (dy , dy ) − (y − a )τ (dy , dy )) dy α
⊗ β
r →∞ Mer ∂y ∂y
∂ ∂
= ( J˜αβ − a α P̃β + a β P̃α ) α ⊗ β ,
∂y ∂y
β β
i.e., J˜αβ = 3αν J νλ 3 λ + a α 3 λ Pλ − a β 3αν Pν .
c. For k ∈ {1, . . . , n}, show that J k0 gives the center of mass integral from
(7.2.5). Give a physical interpretation of the result of part b. in case the Lorentz
transformation is the identity, and a i = 0 for i > 0.
d. For $\ell, m \in \{1, \dots, n\}$, show that $J^{\ell m} = \lim_{r\to\infty} \int_{\{x^0=0,\,|x|=r\}} \Theta^{\ell m}_j \nu^j\,d\sigma$,
where
\[
\begin{aligned}
\kappa\Theta^{\ell m}_j &= \frac12 (x^\ell h_{0m,j} - x^\ell h_{mj,0} - x^m h_{0\ell,j} + x^m h_{\ell j,0} + h_{0\ell}\delta_{mj} - h_{0m}\delta_{\ell j}) \\
&= (x^\ell K_{mj} - x^m K_{\ell j}) + D^{\ell m}_j + E^{\ell m}_j,
\end{aligned}
\]
where in terms of the lapse $N$ and shift $X$, $D^{\ell m}_j = -\frac12 x^\ell X_{j,m} + \frac12 x^m X_{j,\ell} + \frac12 X_\ell\delta_{mj} - \frac12 X_m\delta_{\ell j}$,
and $E^{\ell m}_j = x * X * \Gamma + (N - 1) * x * K$, where $*$ denotes a linear
combination of products of components of the indicated factors. By the parity assumptions,
$N - 1$ and $X$ are each even up to respective terms of order $O(|x|^{-q-1})$,
and the Christoffel symbols and second fundamental form are odd up to respective
terms of order $O(|x|^{-q-2})$. Conclude that $\lim_{r\to\infty} \int_{\{x^0=0,\,|x|=r\}} E^{\ell m}_j \nu^j\,d\sigma = 0$.
Show that $\lim_{r\to\infty} \int_{\{x^0=0,\,|x|=r\}} D^{\ell m}_j \nu^j\,d\sigma = 0$ as well, and conclude that in case
$n = 3$, $J^{23} = J_1$, $J^{31} = J_2$ and $J^{12} = J_3$ from (7.2.6).


Exercise 7-100. Let $n \ge 3$. Recall the Newtonian potential,
\[
N f(x) = (f * \Gamma)(x) = \int_{\mathbb{R}^n} \Gamma(x - y) f(y)\,dy.
\]
For $R > 0$, let $\chi_R = \chi_{B_R(0)}$ be the characteristic function of the ball of radius $R$
about the origin.
a. Show that for $p > \frac{n}{2}$, there are functions $f \in L^p(\mathbb{R}^n)$ for which the integral
defining the Newtonian potential does not converge. What can you say about the
borderline case $p = \frac{n}{2}$?
b. Observe that for $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n)$, the convolution integral $(\chi_1\Gamma) * f$ converges
everywhere, whereas by Young's inequality, for $1 \le p \le \infty$ and $f \in L^p(\mathbb{R}^n)$,
$(\chi_1\Gamma) * f \in L^p(\mathbb{R}^n)$, so that in particular the convolution integral converges
almost everywhere.
c. Prove that for $1 \le p < \frac{n}{2}$ and $f \in L^p(\mathbb{R}^n)$, the convolution $((1 - \chi_1)\Gamma) * f$
converges everywhere to a bounded, measurable function.
d. Generalize Proposition 7-3: prove that for $1 \le p < \frac{n}{2}$, $\Delta N f = f$ distributionally.

Exercise 7-101. Suppose $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, and suppose we have established
$\partial_j N f(x) = \int_{\mathbb{R}^n} \partial_j\Gamma(x - y) f(y)\,dy$. Let $\Omega \subset \mathbb{R}^n$ be a bounded open set;
the goal is to estimate
\[
\partial_j N f(x_1) - \partial_j N f(x_2) = \int_{\mathbb{R}^n} (\partial_j\Gamma(x_1 - y) - \partial_j\Gamma(x_2 - y)) f(y)\,dy
\]
for points $x_1$ and $x_2$ in $\Omega$. For notational convenience, given a measurable set
$E \subset \mathbb{R}^n$, let $I(E) = \int_E (\partial_j\Gamma(x_1 - y) - \partial_j\Gamma(x_2 - y)) f(y)\,dy$, and for $r > 0$ and
$a \in \mathbb{R}^n$, let $B_r(a) = \{ x : |x - a| < r \}$. Let $\rho = 2|x_1 - x_2|$; for any $R \ge 2\operatorname{diam}(\Omega)$,
we have $R > \rho$, and note that $\Omega \subset B_R(x_1)$.
a. Argue that for any $r > 0$ and $a \in \mathbb{R}^n$, $\int_{B_r(a)} |y|^{-(n-1)}\,dy \le \int_{B_r(0)} |y|^{-(n-1)}\,dy$.
Conclude that there is a $c$ depending only on $n$ so that for $\rho > 0$, $|I(B_\rho(x_1))| \le c\,|x_1 - x_2| \sup_{x \in B_\rho(x_1)} |f(x)|$.
b. Let $x^\lambda = (1 - \lambda)x_1 + \lambda x_2$. Show that for any $y$ with $|y - x_1| \ge \rho$, and for all
$\lambda \in [0, 1]$, $|y - x^\lambda| \ge \frac12 |y - x_1| \ge |x_1 - x_2|$.
c. Conclude that there are constants $K$ and $L$, depending only on $n$, such that for
all $x_1, x_2 \in \Omega$, $|y - x_1| > \rho$ and $\lambda \in [0, 1]$, we have $|x^\lambda - y| > 0$, and moreover
$|x_1 - y|^n\,|\partial^2_{ij}\Gamma(x^\lambda - y)| \le K$, from which it follows for suitably chosen $L$ that
$|x_1 - y|^n\,|\partial_j\Gamma(x_1 - y) - \partial_j\Gamma(x_2 - y)| \le L|x_1 - x_2|$.
d. Show that $|I(\mathbb{R}^n \setminus B_R(x_1))| \le L R^{-n} |x_1 - x_2|\,\|f\|_{L^1(\mathbb{R}^n \setminus B_R(x_1))}$.
e. Finally, for $\rho > 0$, derive the following estimate for $C$ depending only on $n$:
$|I(B_R(x_1) \setminus B_\rho(x_1))| \le C \sup_{x \in B_R(x_1)} |f(x)|\,|x_1 - x_2| (\log R - \log\rho)$. Conclude
that $N f \in C^{1,\alpha}_{\mathrm{loc}}(\mathbb{R}^n)$ for any $0 < \alpha < 1$.

Exercise 7-102. Suppose $1 \le p < \infty$.
a. Suppose $\Delta u = 0$ on $\mathbb{R}^n$, and that $u \in L^p_{-\tau}(\mathbb{R}^n)$. Show that if $\tau \ge 0$, then $u$ is
identically zero. (Hint: Use the mean value property.)
b. If $\tau < 0$, then if $u$ is a polynomial lying in $L^p_{-\tau}(\mathbb{R}^n)$, show that $\deg u < -\tau$.

Exercise 7-103. Let $K'(x, y) = |x|^{-a} |x - y|^{a+b-n} |y|^{-b}$, with $a + b > 0$. Let $p > 1$
and $p' = \frac{p}{p-1}$. Prove that if $T(u)(x) = \int_{\mathbb{R}^n} K'(x, y) u(y)\,dy$ defines a bounded
linear operator $T$ on $L^p(\mathbb{R}^n)$, then $a < \frac{n}{p}$ and $b < \frac{n}{p'}$. To do this, first observe that
$f(x) = \int_{\{|y| \le 1\}} K'(x, y)\,dy$ gives a function $f \in L^p(\mathbb{R}^n)$. Furthermore, argue that
the map $u \in L^p(\mathbb{R}^n) \mapsto \int_{\{|x| \le 1\}} \int_{\mathbb{R}^n} K'(x, y) u(y)\,dy\,dx$ defines a bounded linear
functional on $L^p(\mathbb{R}^n)$, and conclude that $g(y) = \int_{\{|x| \le 1\}} K'(x, y)\,dx$ defines a
function $g \in L^{p'}(\mathbb{R}^n)$. Now examine the asymptotic rates of $f$ and $g$. We remark
that the same analysis holds for the integral kernel
$\widetilde K(x, y) = (1 + |x|)^{-a} |x - y|^{a+b-n} (1 + |y|)^{-b}$.

Exercise 7-104. Let $m$ be a nonnegative integer and $1 < p < \infty$. Show that
$\Delta : W^{2,p}_m(\mathbb{R}^n) \to L^p_{m-2}(\mathbb{R}^n)$ does not have closed range. (A duality argument
then shows that $\Delta : W^{2,p}_{2-n-m}(\mathbb{R}^n) \to L^p_{-n-m}(\mathbb{R}^n)$ does not have closed range; cf.
[155].)
(Hint: Let $\mathcal{K}_m$ be the space of homogeneous harmonic polynomials on $\mathbb{R}^n$ of degree
at most $m$, so that $\mathcal{K}_{-1} = \{0\}$. The kernel of $\Delta$ in $W^{2,p}_m(\mathbb{R}^n)$ is precisely $\mathcal{K}_{m-1}$.
Let $0 \le \varphi \le 1$ be smooth with compact support in $\{x : |x| \le 2\}$, and with $\varphi(x) = 1$
for all $|x| \le 1$, and let $u \in \mathcal{K}_m \setminus \mathcal{K}_{m-1}$, i.e., $u$ is a nontrivial homogeneous harmonic
polynomial of degree $m$. For $k \in \mathbb{Z}_+$, let $u_k(x) = (\log k)^{-1/p} \varphi(x/k) u(x)$. Show
that $\inf_{k \in \mathbb{Z}_+} \inf_{h \in \mathcal{K}_{m-1}} \|u_k - h\|_{W^{2,p}_m} > 0$. To do this, first argue that there are
positive constants $c_1$ and $c_2$ such that for all $k \in \mathbb{Z}_+$, $c_1 \le \|u_k\|_{L^p_m} \le c_2$. Show
that if $\inf_{k \in \mathbb{Z}_+} \inf_{h \in \mathcal{K}_{m-1}} \|u_k - h\|_{W^{2,p}_m} = 0$, then for some subsequence $k_j \in \mathbb{Z}_+$,
and for some $v \in \mathcal{K}_{m-1}$, $u_{k_j} \to v$ in $L^p_m(\mathbb{R}^n)$; then argue that $v = 0$, which gives
a contradiction. Argue carefully that $\Delta u_k \to 0$ in $L^p_{m-2}(\mathbb{R}^n)$. From here, an
elementary functional analysis argument shows that the range is not closed. After
you have finished the proof, note where you used that $u_k$ is harmonic, and check
directly that if you tried to run an analogous proof in $L^p_\mu(\mathbb{R}^n)$ for $m - 1 < \mu < m$,
you can renormalize $u_k$ in a reasonable way to get $c_1$ and $c_2$ as above, but the
analogous proof does not go through (the issue then must lie with the estimate
of $\Delta u_k$).)

Exercise 7-105. In parts a. and b. you will prove Corollary 7-69.
a. Prove the claim for the weight range $0 < \tau < n - 2$, which follows along the
lines of the proof of Corollary 7-68.
b. For the borderline case $\tau = n - 2$, you should first establish an estimate of the
form $\|hu\|_{L^1} \le C \|h\|_{C^0_{-\nu}} \|u\|_{C^0_{-\beta}}$ for $\nu > 2$, for some $n - \nu < \beta < n - 2$. Conclude
that $T_h(u) = hu$ gives a compact linear mapping $T_h : C^0_{2-n} \to L^1$. Following
[211], we let $D^{k-2,\alpha}_{-n} = C^{k-2,\alpha}_{-n} \cap L^1$, with $\|f\|_{D^{k-2,\alpha}_{-n}} = \|f\|_{C^{k-2,\alpha}_{-n}} + \|f\|_{L^1}$, and
we let $E^{k,\alpha}_{2-n} = \{ u \in C^{k,\alpha}_{2-n} : \Delta_g u \in L^1 \}$, with norm $\|u\|_{E^{k,\alpha}_{2-n}} = \|u\|_{C^{k,\alpha}_{2-n}} + \|\Delta_g u\|_{L^1}$.
Use (7.2.19) to show $(\Delta_g - h) : E^{k,\alpha}_{2-n} \to D^{k-2,\alpha}_{-n}$ is an isomorphism.
c. We note that in [211], a slightly different norm for $E^{k,\alpha}_{2-n}$ is used. Consider a
background metric $\mathring g$ as we have done before, where $\mathring g_{ij} = \delta_{ij}$ on the asymptotic
charts for $g$ on each end. Show that for $u \in C^2_{2-n}$, $\Delta_g u \in L^1$ if and only if
$\Delta_{\mathring g} u \in L^1$, by estimating $\|\Delta_g u - \Delta_{\mathring g} u\|_{L^1}$. (This shows that the functions in the
space $E^{k,\alpha}_{2-n}$ are the same for all metrics which satisfy the same asymptotically
flat condition as $g$ does in the asymptotic charts for $g$.) Conclude that one gets
an equivalent norm on $E^{k,\alpha}_{2-n}$ by replacing $\|\Delta_g u\|_{L^1}$ in $\|u\|_{E^{k,\alpha}_{2-n}}$ with $\|\Delta_{\mathring g} u\|_{L^1}$.

Exercise 7-106 (expansion in harmonic asymptotics). Define the operator
\[
(L_g X)_{ij} = X_{i;j} + X_{j;i} - X^k{}_{;k}\,g_{ij}.
\]
If $\gamma$ is a metric on $M^3$ and $u > 0$, let $g_{ij} = u^4\gamma_{ij}$ and let $\pi_{ij} = u^2 (L_\gamma X)_{ij}$.
a. Compute the constraints map $\Phi(g, \pi) = (R(g) - |\pi|^2_g + \frac12 (\operatorname{tr}_g\pi)^2,\ \operatorname{div}_g\pi)$,
and in case $\gamma = g_{\mathbb{E}^3}$, show that the vacuum constraints $\Phi(g, \pi) = (0, 0)$ can
be written, in a Cartesian coordinate system for the background $g_{\mathbb{E}^3}$, as follows
(subscripts on operators at the flat metric are omitted):
\[
\begin{aligned}
8\Delta u &= u \big( -|LX|^2 + \tfrac12 (\operatorname{tr}(LX))^2 \big), \\
\Delta X^i + 4u^{-1} u_{,j} (LX)^j{}_i - 2u^{-1} u_{,i} \operatorname{tr}(LX) &= 0.
\end{aligned}
\]

b. Suppose the equations in part a. hold on an asymptotic end of an asymptotically
flat manifold $(M, g)$, where $u$ and $X$ have expansions
\[
u(x) = 1 + A/|x| + O_2(|x|^{-2}), \qquad X^i(x) = B^i/|x| + O_2(|x|^{-2}).
\]
Show that
\[
\pi_{ij} = -\frac{B^i x^j + B^j x^i}{|x|^3} + \sum_{k=1}^3 \frac{B^k x^k}{|x|^3}\,\delta_{ij} + O(|x|^{-3})
\]
and that $P^i = -\frac12 B^i$ is the ADM linear momentum.


c. Justify the expansions of $u$ and $X^i$ above, assuming $(u - 1)$ and $X^i$ are in
$W^{2,p}_{-q}$ for some $p > n$ and $q > \frac{n-2}{2}$. To do this, use Proposition 7-72 to get initial
expansions of $u$ and $X^i$. To obtain the improved asymptotics of $u$, show you
can subtract off a function $w$ defined near infinity, with $w(x) = O(|x|^{-2})$ and
$\Delta(u - w) = O(|x|^{-4-\gamma})$ for some $\gamma > 0$. Conclude from here using weighted
Schauder estimates as in Proposition 7-63; cf. [157, Theorems 1 and 2]. See [78,
Proposition 24] for a solution if you get stuck; as you will see there, the case
$n > 3$ follows more readily, whereas the case $n = 3$ requires the argument outlined
here. The claim for $X^i$ follows analogously.
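d. Show in a manner similar to part c. that the odd parts of $u$ and $X$ admit
expansions
\[
\begin{aligned}
u^{\mathrm{odd}}(x) &= \sum_{k=1}^3 \frac{\beta^k x^k}{|x|^3} + O_2(|x|^{-3}), \\
(X^i)^{\mathrm{odd}}(x) &= \sum_{k=1}^3 \frac{d^k_{(i)} x^k}{|x|^3} + O_2(|x|^{-3}).
\end{aligned}
\]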

e. We saw in (7-39a) and (7.2.5) how to relate $\beta$ from part d. to the center of
mass. Show the components of the ADM angular momentum can be expressed as
linear combinations of components of $d_{(i)}$: $J^{23} = J_1 = \frac12 (d^3_{(2)} - d^2_{(3)})$, $J^{31} = J_2 = \frac12 (d^1_{(3)} - d^3_{(1)})$
and $J^{12} = J_3 = \frac12 (d^2_{(1)} - d^1_{(2)})$. Use this along with an expansion
of $X^i(y - a)$ to show that under translation of asymptotically flat coordinates
$y = x + a$, the angular momentum changes by $J_i \mapsto J_i + (a \times P)_i$.
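
The flux computation behind $P^i = -\frac12 B^i$ in part b. can be checked pointwise. The sketch below assumes the normalization $P_i = \frac{1}{8\pi}\lim \int \pi_{ij}\nu^j\,d\sigma$ and keeps only the leading-order part of $\pi_{ij}$ from part b.; the function names and the sample vector `B` are illustrative. Contracting with $\nu^j = x^j/|x|$ gives $\pi_{ij}\nu^j = -B^i/|x|^2$, and integrating over a sphere of area $4\pi|x|^2$ yields $-4\pi B^i$.

```python
import math

def pi_leading(B, x):
    """Leading-order pi_ij = -(B^i x^j + B^j x^i)/|x|^3 + (B . x)/|x|^3 * delta_ij."""
    r = math.sqrt(sum(c * c for c in x))
    Bx = sum(b * c for b, c in zip(B, x))
    return [
        [(-(B[i] * x[j] + B[j] * x[i]) + (Bx if i == j else 0.0)) / r ** 3
         for j in range(3)]
        for i in range(3)
    ]

def flux_density(B, x):
    """Contraction pi_ij nu^j with the outward unit normal nu = x/|x|."""
    r = math.sqrt(sum(c * c for c in x))
    P = pi_leading(B, x)
    return [sum(P[i][j] * x[j] / r for j in range(3)) for i in range(3)]

# Illustrative data: an arbitrary coefficient vector B and a sample point x.
B = (0.3, -1.2, 2.0)
x = (1.5, -0.7, 2.2)
r2 = sum(c * c for c in x)
density = flux_density(B, x)

# The density is constant on each sphere, so the flux is (area) * (density).
P_adm = [4 * math.pi * r2 * d / (8 * math.pi) for d in density]
```

Since the leading-order density does not depend on the direction of $x$, no quadrature is needed for the limit.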

Exercise 7-107 (DEC and PMT). This example from [130] illustrates a difference
between the dominant energy condition and the weak energy condition (recall
Section 2.3.2); at the level of the constraints operator on initial data sets, we are
comparing the condition ρ ≥ |J |g versus ρ ≥ 0.
Consider an initial data set $(\mathbb{R}^3, \mathring g, \pi)$, where $\mathring g$ is the Euclidean metric, and
where the momentum tensor $\pi$ is pure-trace, $\pi = \frac{p}{3}\mathring g$, where $\operatorname{tr}_{\mathring g}\pi = p$ is a chosen
(smooth) function on $\mathbb{R}^3$. As usual we let $(2\rho, J) = \Phi(g, \pi)$ be the constraints
operator.
a. Show that ρ ≥ 0 for such an initial data set.
b. The ADM energy E of this initial data set vanishes. Give an example of p for
which ρ ∈ L 1 (R3 ), and the ADM linear momentum P does not vanish; thus the
conclusion of the positive mass theorem does not hold for such an initial data set.
c. Show that if the DEC holds, and if p is compactly supported, then p is
identically zero. To do this, translate the DEC into a differential inequality in
the radial direction. Derive the same conclusion if ρ ∈ L 1 (R3 ).

Exercise 7-108. This exercise refers to the proof of Proposition 7-75.
a. Finish the proof of Proposition 7-75 by establishing the expansion. Recall
that $q + \tau + 2 > n$, and outside a compact set, we have
\[
R\big( (1 + v)^{\frac{4}{n-2}} g \big) = R(g) + S, \quad\text{with } v \in W^{k,p}_{-\tau},\ S \in W^{0,p'}_{-\delta'}.
\]
Show that $\Delta_g v \in W^{0,s}_{-\beta}$ for some $s > \frac{n}{2}$ and $\beta > n$. Note that $\frac{np}{n - (k-2)p} > \frac{n}{2}$ if
$(k - 2)p \le n$ and $p > \frac{n}{k}$.
b. In the proof of Proposition 7-75, we argue that a solution of $L^*_g f = 0$ must
be harmonic, and thus in a suitable weighted space it must be trivial. More
generally, suppose $f \in W^{0,r}_{\tau+2-n}$ ($r > 1$, $0 < \tau < n - 2$) solves a Hessian equation
$\operatorname{Hess}_g f = f\xi$, where $\xi \in O_{k-2}(|x|^{-q-2})$, with $\tau < q$. Bootstrap the regularity
and decay of $f$ as in the proof of Proposition 7-70, but utilize the Hessian
equation to show that for any $\theta$, $f \in W^{k,r}_{-\theta}$. Thus if $kr > n$ (e.g., $k > n$), then $f$
vanishes to all orders at infinity.

Exercise 7-109. Let $L = DR_{g_{\mathbb{E}^n}}$. Furnish another proof of Proposition 7-73,
in the case $\tau \in (0, n-2) \setminus \{1, 2\}$, by splitting the function space of symmetric
tensors as $W^{k,p}_{-\tau}(\mathbb{R}^n) = L^*\bigl(W^{k+2,p}_{-\tau+2}(\mathbb{R}^n)\bigr) \oplus \{h_0 \in W^{k,p}_{-\tau}(\mathbb{R}^n) : L h_0 = 0\}$. Use the
fact that $L L^* = (n-1)\Delta^2$. To show the image of $L^*$ is closed, one might use
a “scale-broken” estimate for the Laplacian (see [15, Theorem 1.10] and the
comments after Proposition 7-61 above) to get an injectivity estimate for $L^*$ on
a space transverse to $W^{k+2,p}_{-\tau+2}(\mathbb{R}^n) \cap \ker L^* \subset \operatorname{span}\{1, x^1, \dots, x^n\}$.

Exercise 7-110 (solving with localized support: a linear model problem). In
this exercise, we present a model problem from [186], and recounted in [68],
for understanding the mechanism of the proof of Theorem 7-84. Suppose f is
smooth with compact support on a ball B in $\mathbb{R}^n$. Our goal is to solve div X = f,
so that X extends by zero in $C^k$ across $\partial B$. A necessary condition is that
$\int_B f\,dx = 0$, which you should interpret thus: f is orthogonal to the kernel
of the adjoint of div. For a weight function $\rho$ as considered on page 294, let
$\mathcal{G}(u) = \int_B \bigl(\tfrac12 |\nabla u|^2 \rho - u f\bigr)\,dx$. Let $0 \le \zeta \le 1$ be a nontrivial smooth bump
function of compact support in B.
a. Show that the restriction of $\mathcal{G}$ to the set $\bigl\{u \in H^1_\rho(B) : \int_B u\zeta\,dx = 0\bigr\}$ has a
minimizer in this set. Is it unique? To do this, you should establish a suitable
$H^1_\rho$-estimate, a weighted Poincaré inequality to play the role of the weighted
injectivity estimate (7.5.3).
b. Using the minimizer u from part a., show that $X = -\rho\nabla u$ solves $\operatorname{div}(X) =
f + \lambda\zeta$ for some $\lambda \in \mathbb{R}$. Argue that $\lambda = 0$. Write the resulting PDE as a
second-order linear PDE in u.
c. Let $\rho = (d(\,\cdot\,, \partial B))^N$ near $\partial B$, for N to be chosen sufficiently large. Use interior
Schauder estimates to get the required pointwise control on X. In particular,
if $x \in B$ with $d(x, \partial B) = d$ is such that $B_{2d/3}(x)$ is outside the support of f,
apply the interior Schauder estimate for $B_{d/3}(x) \subset B_{2d/3}(x)$.
Exercise 7-111. We thank Xin Zhou for posing the question addressed in this
exercise.
a. Show that in the setting of Theorem 7-89, one can carry out the argument to
achieve $\bar{g} = g$ for $|x| \ge 2\theta$, and $\bar{g}$ is Schwarzschild for $|x| \le \theta$ (and $x \ne c$).
b. Consider the metric g on $\mathbb{R}^n \setminus \{|x| \le 1\}$ given by
\[ g(x) = \Bigl(1 + \frac{x^k x^\ell}{|x|^{n+2}}\Bigr)^{\frac{4}{n-2}} g_{\mathbb{E}^n}, \]
for any $k \ne \ell$. Show that this contains an asymptotically flat end, of zero scalar
curvature, with vanishing mass and center of mass.
c. The metric in part b. is exactly parity-symmetric in x, i.e., the map $x \mapsto -x$
is an isometry of g. Argue that the gluing construction in Theorem 7-89 can be
achieved in this case. Observe that if the construction of part a. is carried out, the
mass m of the attached Schwarzschild must be negative.
CHAPTER 8

On the center of mass and constant mean curvature surfaces of asymptotically flat initial data sets

8.1. Introduction

Many deep results in mathematical general relativity concern the interplay between
globally conserved quantities and the geometric structure of initial data sets.
Examples include the minimal surface approach by R. Schoen and S.-T. Yau [199;
203] and the spinor method by E. Witten [224] in the proof of the Riemannian
positive mass theorem; the inverse mean curvature flow by G. Huisken and
T. Ilmanen [125] and the conformal flow by H. Bray [26] in the proof of the
Penrose inequality; and the constant mean curvature foliation by G. Huisken and
S.-T. Yau [126] (cf. R. Ye [225]) in establishing a geometric notion of center of
mass.
In a broad sense, this chapter is intended to introduce some aspects of the
connections between globally conserved physical quantities, such as the center of
mass and angular momentum, and the geometric structure of the manifold, using
analysis of the scalar curvature, or more generally the full constraint equations
derived from the spacetime Einstein equation. The chapter focuses on constant
mean curvature foliations and the geometric center of mass of asymptotically
flat initial data sets. This research program was initiated by Huisken and Yau in
1996 and has drawn great interest in recent years; see, for example, [31; 32; 33;
79; 80; 81; 117; 156; 166; 184; 225]. This chapter begins with a partial survey
of the classical results of constant mean curvature surfaces and introduces the
now standard concept of stability. We then discuss some recent progress on the
constant mean curvature surfaces in asymptotically flat initial data sets and the
geometric center of mass. In the last part, we adopt a more analytic approach to
study the center of mass and angular momentum from the Einstein constraint
equations.

The author, Lan-Hsuan Huang, was partially supported by NSF through grants DMS-1308837
and DMS-1452477. This chapter is based upon two mini-courses presented in the 2012 Summer
School on Mathematical General Relativity at MSRI and the 2013 Summer School on Mathematical
General Relativity in Cortona, Italy. The author is very grateful to the organizers Justin Corvino
and Pengzi Miao for their warm hospitality, which made the summer schools memorable. Sincere
appreciation goes to Justin Corvino for valuable comments.

A spacetime is an (n + 1)-dimensional smooth manifold equipped with a
Lorentzian metric g of signature $(- + \cdots +)$. The Einstein equation is the tensor
equation (in appropriate units)

\[ \operatorname{Ric}(g) - \tfrac12 R(g)\, g = T, \]

where the energy-momentum tensor T represents the energy-momentum density
of matter. A spacetime is called vacuum if it satisfies the Einstein equation with
T = 0. The prototype vacuum spacetime is Minkowski space $\mathbb{R}^{n+1}$ equipped
with the Minkowski metric $g = -(dx^0)^2 + (dx^1)^2 + \cdots + (dx^n)^2$. For a general
energy-momentum tensor T, we assume the dominant energy condition, which
is known to hold for physically reasonable matter fields. When expressed in
terms of local coordinates, the left-hand side of the Einstein equation forms
a second-order system in the metric components $g_{\alpha\beta}$. The seminal work of
Choquet-Bruhat [49] proved that the left-hand side of the Einstein equation can
be expressed as a nonlinear hyperbolic operator by using the so-called wave
coordinates. Finding a spacetime that satisfies the Einstein equation can then be
viewed as the evolution problem for a given initial data set. Thus, it is important
to understand the physical and geometric structure of initial data sets.
An initial data set for the Einstein equation is a triple (M, g, k), where (M, g)
is an n-dimensional Riemannian manifold and k is a symmetric (0, 2) tensor
on M. The Gauss–Codazzi equations for submanifolds, along with the Einstein
equation, imply that if M is a submanifold in a spacetime with the induced metric
g and the induced second fundamental form k, then (M, g, k) must satisfy the
constraint equations

\[ R(g) - |k|_g^2 + (\operatorname{tr}_g k)^2 = 2\mu \quad\text{and}\quad \operatorname{div}_g k - d(\operatorname{tr}_g k) = J, \]

where µ is the energy density and J is the momentum density. More specifically,
let T be the energy-momentum tensor and let ν be the future-directed timelike
normal to M. We define µ := T (ν, ν) and J := T (ν, ·). The dominant energy
condition on the tensor T reduces to the inequality µ ≥ |J |g at each point of M.
When k ≡ 0, (M, g) is called a time-symmetric (or Riemannian) initial data set. It
is simple to see that in the time-symmetric case the system of constraint equations
becomes a single equation R(g) = 2µ. Thus the dominant energy condition
coincides with the condition that the scalar curvature of g is nonnegative, which
is a condition that naturally appears in Riemannian geometry. (However, for
general k the dominant energy condition involves a system of equations and is
more complicated.)
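
The time-symmetric reduction mentioned above can be written out in one line. The following is only a worked restatement of the constraint equations with $k \equiv 0$, using the symbols $\mu$, $J$, $R(g)$ already introduced; nothing beyond the displayed constraints is assumed.

```latex
% Time-symmetric case: set k \equiv 0 in the constraint equations.
\begin{align*}
  2\mu &= R(g) - |k|_g^2 + (\operatorname{tr}_g k)^2 = R(g), \\
  J    &= \operatorname{div}_g k - d(\operatorname{tr}_g k) = 0, \\
  \mu \ge |J|_g \ &\Longleftrightarrow\ R(g) \ge 0 .
\end{align*}
```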
One family of commonly studied models of isolated gravitational systems
is the set of asymptotically flat initial data sets. We say that an initial data set
(M, g, k) is asymptotically flat (with one end) if there is a compact set K ⊂ M
and a coordinate diffeomorphism $x : M \setminus K \to \mathbb{R}^n \setminus B$ for some closed ball
$B \subset \mathbb{R}^n$ such that, for $i, j = 1, 2, \dots, n$,

\[ g_{ij}(x) - \delta_{ij} = O(|x|^{-q}), \qquad k_{ij}(x) = O(|x|^{-1-q}), \]

and such that

\[ \mu(x) = O(|x|^{-n-q_0}), \qquad J_i(x) = O(|x|^{-n-q_0}), \]

where $q > \frac{n-2}{2}$ and $q_0 > 0$. Here the expression $f(x) = O(|x|^{-q})$ stands for a
function satisfying $|f(x)| \le C|x|^{-q}$ for a constant C depending only on g, k.
When a subscript k appears in the expression $f = O_k(|x|^{-q})$, it indicates additional
fall-off rates on the derivatives $|\partial^I f(x)| \le C|x|^{-q-|I|}$ for $|I| = 0, 1, \dots, k$,
but in this chapter we often omit the subscript k and avoid the discussion about
the optimal assumption on regularity.
Note that by definition an asymptotically flat initial data set has trivial topology
outside a compact set, but it was shown by J. Isenberg, R. Mazzeo, and D. Pollack
[129] that there are no topological obstructions within the compact set.
It is known that asymptotically flat initial data sets possess globally conserved
physical quantities. In 1962, R. Arnowitt, S. Deser, and C. W. Misner [10]
proposed the definitions of the (total) energy E and the linear momentum P of
an asymptotically flat initial data set (M, g, k) as follows, for $i = 1, 2, \dots, n$:
\begin{align*}
E &= \frac{1}{2(n-1)\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{j,k=1}^{n} \Bigl( \frac{\partial g_{jk}}{\partial x^k} - \frac{\partial g_{kk}}{\partial x^j} \Bigr) \nu_0^j \, d\mathcal{H}_0^{n-1}, \\
P_i &= \frac{1}{(n-1)\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{j=1}^{n} \pi_{ij}\, \nu_0^j \, d\mathcal{H}_0^{n-1}.
\end{align*}
Here, the integrals are computed in the coordinate chart $M \setminus K \cong_x \mathbb{R}^n \setminus B$;
$\nu_0^j = x^j/|x|$; $\pi_{ij} = k_{ij} - (\operatorname{tr}_g k) g_{ij}$; $\mathcal{H}_0^{n-1}$ is the (n−1)-dimensional Euclidean
Hausdorff measure; and $\omega_{n-1}$ is the volume of the standard unit sphere in $\mathbb{R}^n$.
The work of R. Bartnik [15] and P. T. Chruściel [57] showed that the scalar E and
the vector $(P_1, \dots, P_n)$ are geometric invariants. The celebrated positive mass
conjecture asserts that $E \ge |P|$ [199; 202; 203; 78], so that in particular $E \ge 0$
and as well we can define the mass $m = \sqrt{E^2 - |P|^2}$. In the time-symmetric
case, we may unambiguously use the ADM mass m to denote the energy E,
since $|P| = 0$.
There are also the notions of center of mass and angular momentum for an
asymptotically flat initial data set. T. Regge and C. Teitelboim [187] and R. Beig
and N. Ó Murchadha [18] proposed the following definitions of the center of
mass $C_{\mathrm{BORT}}$ and the angular momentum J (if $E \ne 0$), for $k, \ell = 1, 2, \dots, n$:
\begin{align*}
C_\ell &= \frac{1}{2(n-1)E\,\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \Bigl[ x^\ell \sum_{i,j=1}^{n} \Bigl( \frac{\partial g_{ij}}{\partial x^i} - \frac{\partial g_{ii}}{\partial x^j} \Bigr) \nu_0^j - \sum_{i=1}^{n} \bigl( g_{i\ell}\, \nu_0^i - g_{ii}\, \nu_0^\ell \bigr) \Bigr] d\mathcal{H}_0^{n-1}, \tag{8.1.1} \\
J_{(k\ell)} &= \frac{1}{(n-1)E\,\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^{n} \pi_{ij}\, Y_{(k\ell)}^i\, \nu_0^j \, d\mathcal{H}_0^{n-1}, \tag{8.1.2}
\end{align*}
where $Y_{(k\ell)} = x^k\, \partial/\partial x^\ell - x^\ell\, \partial/\partial x^k$ are the Euclidean rotational vector fields.$^{7}$
To distinguish the above definitions from other notions of center of mass and
angular momentum (e.g. [126; 48]), we refer to the integrals (8.1.1) and (8.1.2)
as the BORT center of mass and the ADM angular momentum, respectively.
In contrast to the ADM energy-momentum, the integrals of $C_{\mathrm{BORT}}$ and J are
less well understood and may not even converge in general. In fact, explicit
examples of asymptotically flat initial data sets such that the integrals diverge
have been constructed [18; 43; 47; 46; 118]. Nevertheless, the author shows
that if one assumes the following Regge–Teitelboim conditions, then (8.1.1) and
(8.1.2) converge and transform correctly with respect to different coordinate
charts [116].
An initial data set (M, g, k) is said to satisfy the Regge–Teitelboim conditions
if it is asymptotically flat and, in the coordinate chart $M \setminus K \cong_x \mathbb{R}^n \setminus B$,
\[ g_{ij}(x) - g_{ij}(-x) = O(|x|^{-1-q}), \qquad k_{ij}(x) + k_{ij}(-x) = O(|x|^{-2-q}), \]
and
\[ \mu(x) - \mu(-x) = O(|x|^{-n-q_0-1}), \qquad J_i(x) - J_i(-x) = O(|x|^{-n-q_0-1}). \]
Footnote 7: In the literature (see [69], for example), the BORT center of mass and angular momentum are
sometimes defined as $E\,C_\ell$ and $E\,J_{(k\ell)}$, respectively.

Example 8-1 (three-dimensional Schwarzschild manifolds). A fundamental
example in general relativity is the Schwarzschild spacetime, which describes
the exterior gravitational field of a static, spherically symmetric body. The totally
geodesic time-slice outside the apparent horizon of the Schwarzschild spacetime
of mass m > 0 can be expressed as a Riemannian manifold $M = (2m, \infty) \times \mathbb{S}^2$
endowed with the metric
\[ (1 - 2ms^{-1})^{-1}\, ds^2 + s^2 g_{\mathbb{S}^2}, \]
where $g_{\mathbb{S}^2}$ is the round metric on the unit sphere. One can readily check that M
is the manifold interior of an asymptotically flat initial data set with a minimal
boundary and one end, and it has zero scalar curvature. Mathematically one
can extend M to a complete asymptotically flat initial data set of zero scalar
curvature by “doubling” M across its minimal boundary. The complete two-ended
asymptotically flat initial data set can be expressed as a conformally flat
metric $(\mathbb{R}^3 \setminus \{a\}, g_{m,a})$, where $g_{m,a} = u^4 g_{\mathbb{E}}$ and
\[ u(x) = 1 + \frac{m}{2|x - a|}, \]
where $g_{\mathbb{E}}$ is the Euclidean metric. We generally suppress “a” from the notation
and write $g_m = g_{m,a}$. One computes directly that m is the ADM energy and a is
the BORT center of mass. The asymptotic expansion of $g_m$ for |x| large is
\[ g_m = \Bigl( 1 + \frac{2m}{|x|} + \frac{2m\, a \cdot x}{|x|^3} + \frac{3m^2}{2|x|^2} + O(|x|^{-3}) \Bigr) g_{\mathbb{E}}. \]
It follows that m appears in the $|x|^{-1}$-term of the expansion and the BORT center
of mass appears in the odd part of the $O(|x|^{-2})$-term. This demonstrates that
appropriate assumptions need to be imposed on the leading-order terms of the
data in order for the integrals (8.1.1) and (8.1.2) to converge. It explains the
motivation behind the definition of the Regge–Teitelboim conditions. We also
note that the BORT center of mass of a Schwarzschild manifold is not a point of
the manifold. □
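
The claim in Example 8-1 that m is the ADM energy and a is the BORT center of mass can be checked numerically on a large coordinate sphere. The sketch below is an illustration only, with n = 3 and the conformally flat metric $g = u^4 g_{\mathbb{E}}$, $u = 1 + m/(2|x-a|)$. For such a metric the energy integrand reduces to $-2\,\partial_\nu(u^4)$ and the center-of-mass integrand in (8.1.1) to $-2x^\ell\,\partial_\nu(u^4) + 2u^4\nu_0^\ell$. The function name `schwarzschild_charges`, the midpoint quadrature grid, and the finite radius `R` are choices made here, not part of the text; at finite `R` the values agree with m and a only up to an error of order 1/R.

```python
import math

def schwarzschild_charges(m, a, R, n_theta=200, n_phi=400):
    """Approximate the ADM energy E and BORT center C on the sphere |x| = R
    for g = u^4 * (Euclidean metric), u = 1 + m / (2 |x - a|), in dimension 3.
    Hypothetical helper: midpoint rule in the spherical angles (theta, phi)."""
    flux_E = 0.0
    flux_C = [0.0, 0.0, 0.0]
    dth = math.pi / n_theta
    dph = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        th = (i + 0.5) * dth
        st, ct = math.sin(th), math.cos(th)
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            nu = (st * math.cos(ph), st * math.sin(ph), ct)  # Euclidean unit normal
            x = tuple(R * c for c in nu)
            d = tuple(xi - ai for xi, ai in zip(x, a))
            dist = math.sqrt(sum(c * c for c in d))
            u = 1.0 + m / (2.0 * dist)
            # normal derivative of u: grad u = -m (x - a) / (2 |x - a|^3)
            du_nu = sum(-m * dc / (2.0 * dist ** 3) * nc for dc, nc in zip(d, nu))
            u4 = u ** 4
            du4_nu = 4.0 * u ** 3 * du_nu
            dS = R * R * st * dth * dph
            # energy integrand for g_ij = u^4 delta_ij: -2 d(u^4)/d(nu)
            e_density = -2.0 * du4_nu
            flux_E += e_density * dS
            for l in range(3):
                # integrand of (8.1.1) for g_ij = u^4 delta_ij
                flux_C[l] += (x[l] * e_density + 2.0 * u4 * nu[l]) * dS
    E = flux_E / (16.0 * math.pi)  # 2 (n - 1) omega_{n-1} = 16 pi when n = 3
    C = tuple(c / (16.0 * math.pi * E) for c in flux_C)
    return E, C
```

For instance, `schwarzschild_charges(1.0, (0.3, -0.2, 0.5), 2000.0)` should return values close to m = 1 and a = (0.3, −0.2, 0.5); the grid is symmetric under the antipodal map so that the large odd terms cancel, which mirrors the role of the Regge–Teitelboim parity conditions.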

This chapter is organized as follows. Section 8.2 is devoted to the classical
Alexandrov theorem about embedded constant mean curvature surfaces in
Euclidean space. In Section 8.3 we introduce variational formulas and stability
of constant mean curvature surfaces, and then discuss the classical result of
Barbosa and do Carmo about uniqueness of stable constant mean curvature
surfaces in Euclidean space. In Section 8.4 we show existence of constant mean
curvature surfaces in asymptotically flat initial data sets that are asymptotic to
Schwarzschild. We also prove that the geometric center of mass, to be defined in
(8.4.8), coincides with the BORT center of mass. Section 8.5 presents methods
to analyze the spectrum of the stability operator and show that the constant
mean curvature surfaces constructed in the previous section are stable and form
a smooth foliation. In Section 8.6, we discuss density results for the Einstein
constraint equations and an application to arbitrarily specifying the BORT center
of mass and the ADM angular momentum.

8.2. Uniqueness of embedded CMC surfaces

A fundamental problem in differential geometry is to characterize the constant
mean curvature hypersurfaces in a Riemannian manifold. A classical result
due to Alexandrov asserts that the only embedded and closed constant mean
curvature surfaces in Euclidean space are the round spheres. The original proof
of Alexandrov is based on the arguments which came to be known as the method
of moving planes. We instead present another proof due to S. Montiel and A. Ros
[164, Section 6.4].
Let (M, g) be an orientable Riemannian manifold, and let $\Sigma^n \subset M^{n+1}$ be an
immersed two-sided hypersurface, i.e., there exists a globally defined smooth unit
normal vector field $\nu$ along $\Sigma$. We let $d\mu$ be the induced hypersurface measure.
The mean curvature of $\Sigma$ with respect to $\nu$ is defined by $H := \operatorname{div}_\Sigma \nu$. The mean
curvature detects how the (extrinsic) normal vector varies along $\Sigma$. According to
our convention, the mean curvature of a Euclidean n-sphere is n with respect to
the outward unit normal vector. An immersed submanifold is said to be closed
if it is compact and has no boundary, and is said to be embedded if in addition it
has no self-intersection. We first review some basic integral formulas involving
mean curvature for closed hypersurfaces.
A conformal vector field X is a vector field on M that satisfies
\[ L_X g = 2 f g, \]
for some function $f : M \to \mathbb{R}$, where $L_X$ is the Lie derivative.
Example 8-2 (see [31]). Consider the n-dimensional Schwarzschild metric
with ADM mass m > 0 of the form $g_m = (1 - 2ms^{2-n})^{-1} ds^2 + s^2 g_{\mathbb{S}^{n-1}}$ on
$(s_0, \infty) \times \mathbb{S}^{n-1}$, where $s_0 = (2m)^{1/(n-2)}$. We change variables and set
\[ r(s) = \int_{s_0}^{s} (1 - 2m\tau^{2-n})^{-\frac12} \, d\tau. \]
Define h(r) = s(r). We then rewrite the Schwarzschild metric in the form
$g_m = dr^2 + h^2(r)\, g_{\mathbb{S}^{n-1}}$. Define the vector field $X = h(r)\, \partial/\partial r$. By direct
computation,
\begin{align*}
L_X g_m &= L_X(dr) \otimes dr + dr \otimes L_X(dr) + X(h^2)\, g_{\mathbb{S}^{n-1}} \\
        &= 2h'(r)\, dr \otimes dr + 2(h(r))^2 h'(r)\, g_{\mathbb{S}^{n-1}} \\
        &= 2h'(r)\, g_m.
\end{align*}
Thus X is a conformal vector field that satisfies $L_X g_m = 2 f g_m$, where
\[ f(r) = h'(r) = \Bigl( \frac{dr}{ds} \Bigr)^{-1} = (1 - 2ms^{2-n})^{1/2}. \]
Also note that f satisfies the static potential equation
\[ (\Delta_{g_m} f)\, g_m - \operatorname{Hess}_{g_m} f + f \operatorname{Ric}_{g_m} = 0. \qquad \square \]

In what follows, $d\mu$ will generally denote the induced surface measure on
$\Sigma \subset (M, g)$.

Theorem 8-3 (generalized Minkowski integral formula [31, Proposition 2.3]).
Suppose that $(M^{n+1}, g)$ has a conformal vector field X such that $L_X g = 2 f g$ for
some function f. Let $\Sigma^n$ be a closed two-sided hypersurface in M, and let H be
the mean curvature of $\Sigma$ with respect to the unit normal vector $\nu$. Then
\[ \int_\Sigma \bigl( n f - H g(X, \nu) \bigr) \, d\mu = 0. \]

Proof. Recall that the Lie derivative $L_X g$ in a local frame $\{e_1, \dots, e_{n+1}\}$ has the
expression
\[ (L_X g)(e_i, e_j) = g(\nabla_{e_i} X, e_j) + g(e_i, \nabla_{e_j} X) \]
for $i, j = 1, 2, \dots, n + 1$. We decompose the conformal vector field along $\Sigma$
into $X = X' + g(X, \nu)\nu$, where $X' \in T\Sigma$. Suppose further that $\{e_1, \dots, e_n\}$ is a
local orthonormal frame on $\Sigma$, and let $\nabla$ be the covariant derivative of g. Then,
at each point of $\Sigma$,
\begin{align*}
\operatorname{div}_\Sigma X' + H g(X, \nu) &= \operatorname{div}_\Sigma X' + g(X, \nu) \sum_{i=1}^{n} g(\nabla_{e_i} \nu, e_i) = \sum_{i=1}^{n} g(\nabla_{e_i} X, e_i) \\
&= \frac12 \sum_{i=1}^{n} (L_X g)(e_i, e_i) = \sum_{i=1}^{n} f g(e_i, e_i) = n f.
\end{align*}

Integrating on 6 and applying the divergence theorem yields the desired integral
formula. □
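
As a quick numerical illustration of Theorem 8-3 in the simplest case, take the plane ($n = 1$) with the position vector field as conformal field ($f = 1$) and let $\Sigma$ be an ellipse. The formula then says that the length of the ellipse equals $\int \kappa \langle X, \nu\rangle\, ds$. The sketch below assumes the parametrization $(a\cos t, b\sin t)$, for which the speed is $v = (a^2\sin^2 t + b^2\cos^2 t)^{1/2}$, the curvature is $\kappa = ab/v^3$ and the support function is $\langle X, \nu\rangle = ab/v$; the function name `minkowski_sides` and the trapezoidal quadrature are choices made for this illustration.

```python
import math

def minkowski_sides(a, b, n_pts=2000):
    """Return (length, integral of kappa * <X, nu> ds) for the ellipse
    (a cos t, b sin t), using the periodic trapezoidal rule in t."""
    dt = 2.0 * math.pi / n_pts
    length = 0.0
    weighted = 0.0
    for i in range(n_pts):
        t = i * dt
        v = math.sqrt((a * math.sin(t)) ** 2 + (b * math.cos(t)) ** 2)  # speed |gamma'(t)|
        kappa = a * b / v ** 3        # curvature w.r.t. the outward normal
        support = a * b / v           # <X, nu> for the position vector X
        length += v * dt              # integral of n f d(mu) with n = f = 1
        weighted += kappa * support * v * dt
    return length, weighted
```

The two returned numbers should agree to high accuracy for any a, b > 0, which is the statement $\int_\Sigma (n f - H g(X,\nu))\,d\mu = 0$ in this planar model.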

We finish this section with two theorems on closed embedded hypersurfaces
$\Sigma^n$ in $\mathbb{R}^{n+1}$. We recall that by the Jordan–Brouwer separation theorem, any such
$\Sigma = \partial\Omega$, where $\Omega$ is a bounded open set; in particular, then, $\Sigma$ is two-sided.

Theorem 8-4 (The Heintze–Karcher inequality; see [31, Theorem 3.5]). Let
$\Sigma^n$ be a closed, embedded hypersurface in $\mathbb{R}^{n+1}$, with $\Sigma = \partial\Omega$, where $\Omega$ is a
bounded region in $\mathbb{R}^{n+1}$ with volume $\operatorname{Vol}\Omega$. Suppose that the mean curvature H
is positive with respect to the outward unit normal. Then
\[ n \int_\Sigma \frac{1}{H} \, d\mu \ge (n + 1) \operatorname{Vol}\Omega, \]
with equality if and only if $\Sigma$ is a round sphere.


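Proof. Consider a deformation $F : \Sigma \times [0, \infty) \to \mathbb{R}^{n+1}$ given by

\[ F(x, t) = x - t\nu(x), \]

where $\nu$ is the outward unit normal on $\Sigma$. Let $\Sigma_t := F(\Sigma, t)$, with surface
measure $d\mu$ (suppressing the dependence on t). Let $d_\Sigma(p)$ denote the distance
of $p \in \mathbb{R}^{n+1}$ to $\Sigma$. For t sufficiently small, $\Sigma_t = d_\Sigma^{-1}(t) \cap \Omega$ is smooth, but $\Sigma_t$
may begin to have self-intersection for some t. Hence, instead of working on $\Sigma_t$,
we consider

\[ \Sigma_t^* = \{F(x, t) : d_\Sigma(F(x, t + \delta)) = t + \delta \text{ for some } \delta > 0\}. \]

This is a smooth hypersurface contained in $\Sigma_t$. Since $\frac{\partial F}{\partial t} = -\nu$, the variational
formulas (8.3.1) and (8.3.2) yield
\[ \frac{\partial}{\partial t} d\mu = -H \, d\mu, \qquad \frac{\partial H}{\partial t} = |A|^2 \ge \frac{H^2}{n}, \qquad \frac{\partial H^{-1}}{\partial t} = -H^{-2}|A|^2 \le -\frac{1}{n}. \]
Define $Q(t) := n \int_{\Sigma_t^*} H^{-1} \, d\mu$. Then
\begin{align*}
Q'(t) &= n \int_{\Sigma_t^*} \bigl( -H^{-2}|A|^2 + H^{-1}(-H) \bigr) \, d\mu \\
      &\le n \int_{\Sigma_t^*} \Bigl( -\frac{1}{n} - 1 \Bigr) d\mu = -(n + 1) \int_{\Sigma_t^*} d\mu.
\end{align*}
Thus, for $\tau \in (0, \infty)$, we have
\[ Q(0) - Q(\tau) = -\int_0^\tau Q'(t) \, dt \ge (n + 1) \int_0^\tau \int_{\Sigma_t^*} d\mu \, dt = (n + 1) \int_{\{x \in \Omega \,:\, d_\Sigma(x) \le \tau\}} dx. \]
Since $Q(\tau) \ge 0$, we obtain the desired inequality by letting $\tau \to \infty$. It is
straightforward to verify that $\Sigma$ is umbilic if the equality holds. □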
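
As a sanity check of Theorem 8-4, one can test the inequality by hand on a planar ellipse, reading the statement with n = 1 (a closed embedded curve in the plane, with H the curvature). The parametrization and the symbols a, b, v below are introduced only for this illustration; the computation shows the inequality and its equality case explicitly.

```latex
% Ellipse \Sigma = \{(a\cos t,\, b\sin t)\},\ a, b > 0, bounding \Omega with \operatorname{Vol}\Omega = \pi a b.
% Speed v(t) = (a^2\sin^2 t + b^2\cos^2 t)^{1/2},\quad d\mu = v\,dt,\quad H = ab/v^3.
\begin{align*}
  n\int_\Sigma \frac{1}{H}\,d\mu
    &= \int_0^{2\pi} \frac{v^4}{ab}\,dt
     = \frac{\pi\,(3a^4 + 2a^2b^2 + 3b^4)}{4ab}, \\
  n\int_\Sigma \frac{1}{H}\,d\mu - (n+1)\operatorname{Vol}\Omega
    &= \frac{\pi\,(3a^4 + 2a^2b^2 + 3b^4)}{4ab} - 2\pi ab
     = \frac{3\pi\,(a^2 - b^2)^2}{4ab} \;\ge\; 0,
\end{align*}
% with equality exactly when a = b, i.e. when the ellipse is a round circle.
```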
Theorem 8-5 (Alexandrov’s theorem). Let $\Sigma^n$ be a closed, embedded, connected
hypersurface in $\mathbb{R}^{n+1}$ with constant mean curvature. Then $\Sigma$ is a round sphere.
Remark 8-6. The theorem fails without the assumption that $\Sigma$ is embedded.
H. Wente [222] produced an immersed torus of constant mean curvature in $\mathbb{R}^3$.
Immersed surfaces of higher genus have been constructed by N. Kapouleas [133].
Proof. The position vector field $X = (x^1, \dots, x^{n+1})$ in $\mathbb{R}^{n+1}$ is a conformal
vector field (with f = 1). By the Minkowski integral formula (Theorem 8-3) and
the divergence theorem, with $\Sigma = \partial\Omega$,
\[ n \int_\Sigma \frac{1}{H} \, d\mu = \int_\Sigma \langle X, \nu \rangle \, d\mu = \int_\Omega \operatorname{div}_{\mathbb{R}^{n+1}} X \, dx = (n + 1) \operatorname{Vol}\Omega. \]
Thus, we obtain equality in the Heintze–Karcher inequality, which implies that
$\Sigma$ is a round sphere. □
This theorem has been generalized to a large class of warped manifolds,
including the Schwarzschild manifolds, by Brendle [31, Theorem 1.1].

8.3. Stable CMC surfaces

8.3.1. Variational formulas. Let $\Sigma^n$ be a smooth closed two-sided hypersurface
in $(M^{n+1}, g)$. We are interested in how some geometric quantities on $\Sigma$, such as
mean curvature, surface area, and enclosed volume, change under deformations
of $\Sigma$. The relevant formulas are called the variational formulas. Consider a
deformation of $\Sigma$ along its normal direction $F : \Sigma \times (-\epsilon, \epsilon) \to M$ satisfying
\[ \frac{\partial}{\partial t} F(x, t) = \eta(x, t)\nu(x, t), \qquad F(x, 0) = x, \]
where $\nu(x, t)$ is a unit normal to $\Sigma_t := F(\Sigma, t)$. Define $F_t(x) = F(x, t)$. We
further suppose that $F_t : \Sigma \to M$ is an immersion. By direct computation, the
first variation formula says
\[ \frac{d}{dt}\Big|_{t=0} \mathcal{H}^n(\Sigma_t) = \int_\Sigma H \eta \, d\mu, \tag{8.3.1} \]
where $H = \operatorname{div}_\Sigma \nu$. If one allows arbitrary deformations, then we can strictly
decrease the volume of $\Sigma$ by deforming $\Sigma$ along $-H\nu$, unless $\Sigma$ is a minimal
hypersurface ($H \equiv 0$). Thus, minimal hypersurfaces are the critical points of the
area functional.
The second variation formula at a minimal hypersurface says
\[ \frac{d^2}{dt^2}\Big|_{t=0} \mathcal{H}^n(\Sigma_t) = \int_\Sigma \bigl( |\nabla^\Sigma \eta|^2 - (|A|^2 + \operatorname{Ric}(\nu, \nu))\eta^2 \bigr) \, d\mu, \tag{8.3.2} \]
where A is the second fundamental form of $\Sigma$ and Ric is the Ricci tensor of
(M, g). Now define the stability operator

\[ L_\Sigma := -\Delta_\Sigma - (|A|^2 + \operatorname{Ric}(\nu, \nu)). \]

A minimal hypersurface is said to be stable if $\int_\Sigma \eta L_\Sigma \eta \, d\mu \ge 0$ for all smooth
functions $\eta$.
If we restrict our attention to a smaller class of deformations on $\Sigma$, hypersurfaces
of constant mean curvature $H \ne 0$ can appear as the critical points of
the functional $\mathcal{H}^n(\Sigma_t)$. Consider the (n+1)-dimensional signed volume V(t)
between $\Sigma_t$ and $\Sigma$. The volume function satisfies the following variational
formula.
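Proposition 8-7. Let $\Sigma^n$ be a smooth closed two-sided hypersurface in $(M^{n+1}, g)$.
Let $F : \Sigma \times (-\epsilon, \epsilon) \to M$ satisfy
\[ \frac{\partial}{\partial t} F(x, t) = \eta(x, t)\nu(x, t), \qquad F(x, 0) = x, \]
where $\nu(x, t)$ is a unit normal to $\Sigma_t := F(\Sigma, t)$. Define the volume function by
\[ V(t) = \int_{\Sigma \times [0, t]} F^* \, d\mathrm{vol}_M. \]
Then
\[ \frac{d}{dt}\Big|_{t=0} V(t) = \int_\Sigma \eta \, d\mu. \]

Proof. Let $p \in \Sigma$. Let $\{\nu, e_1, \dots, e_n\}$ be a local oriented orthonormal frame
about F(p, 0). Then $F^* d\mathrm{vol}_M = a(t, p) \, dt \wedge d\mu$, where
\begin{align*}
a(t, p) &= F^* d\mathrm{vol}_M\Bigl( \frac{\partial}{\partial t}, e_1, \dots, e_n \Bigr) = d\mathrm{vol}_M\Bigl( \frac{\partial F}{\partial t}, dF_t(e_1), \dots, dF_t(e_n) \Bigr), \\
a(0, p) &= g\Bigl( \frac{\partial F}{\partial t}\Big|_{(p,0)}, \nu(p, 0) \Bigr) = \eta(p, 0).
\end{align*}
It follows that
\[ \frac{d}{dt}\Big|_{t=0} V(t) = \int_\Sigma a(0, p) \, d\mu = \int_\Sigma \eta \, d\mu. \qquad \square \]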

A variation such that V(t) = V(0) for all $t \in (-\epsilon, \epsilon)$ is called volume-preserving.
Proposition 8-7 shows that if a variation satisfies $\int_\Sigma \eta \, d\mu = 0$, then
it preserves the volume between $\Sigma_t$ and $\Sigma$ “infinitesimally” at t = 0. Conversely,
any smooth function $\eta(x)$ on $\Sigma$ such that $\int_\Sigma \eta \, d\mu = 0$ gives rise to a
volume-preserving variation [13; 14].
One can readily see that for volume-preserving variations, the hypersurfaces
of constant mean curvature are the critical points of the area functional. The
second variation formula for volume-preserving variations at hypersurfaces of
constant mean curvature becomes
\[ \frac{d^2}{dt^2}\Big|_{t=0} \mathcal{H}^n(\Sigma_t) = \int_\Sigma \eta L_\Sigma \eta \, d\mu + H \frac{d^2}{dt^2}\Big|_{t=0} V(t) = \int_\Sigma \eta L_\Sigma \eta \, d\mu. \]
Therefore, a hypersurface $\Sigma$ of constant mean curvature is said to be stable if
and only if
\[ \int_\Sigma \eta L_\Sigma \eta \, d\mu \ge 0 \]
for all $\eta \in C^\infty(\Sigma)$ such that $\int_\Sigma \eta \, d\mu = 0$. More specifically, define
\[ \mu_0 := \inf\Bigl\{ \int_\Sigma \eta L_\Sigma \eta \, d\mu : \eta \in C^\infty(\Sigma),\ \|\eta\|_{L^2(\Sigma)} = 1,\ \int_\Sigma \eta \, d\mu = 0 \Bigr\}. \tag{8.3.3} \]
Then $\Sigma$ is stable if $\mu_0 \ge 0$. From the discussion above, it follows that stable
hypersurfaces are the local minimizers of the area functional $\mathcal{H}^n(\Sigma_t)$ among
volume-preserving variations.
Example 8-8. The n-dimensional sphere $S_r$ in $\mathbb{R}^{n+1}$ of radius r > 0 centered at
the origin is a stable hypersurface of constant mean curvature n/r. The stability
operator on $S_r$ is
\[ L_0 = -\Delta_0 - (|A|^2 + \operatorname{Ric}(\nu, \nu)) = -\Delta_0 - \frac{n}{r^2}, \]
where $\Delta_0$ is the Laplace operator on $S_r$. Because $\mu_0$ is also an eigenvalue of $-\Delta_0$,
by analyzing the eigenvalues of $-\Delta_0$, we obtain $\mu_0 = 0$ with the eigenspace
spanned by the coordinate functions $\{x^1, \dots, x^{n+1}\}$ restricted to $S_r$. Also note
that $L_0$ is self-adjoint and its cokernel equals its kernel.
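
The value $\mu_0 = 0$ in Example 8-8 can be traced through the spectrum of the round sphere. The short computation below relies on the standard fact (not stated in the text, so treat it as an imported assumption) that the eigenvalues of $-\Delta_0$ on the sphere of radius r in $\mathbb{R}^{n+1}$ are $k(k+n-1)/r^2$, with the degree-one eigenspace spanned by the coordinate functions.

```latex
% Principal curvatures of S_r \subset \mathbb{R}^{n+1} all equal 1/r; the ambient Ricci tensor vanishes:
\[ |A|^2 = \frac{n}{r^2}, \qquad \operatorname{Ric}(\nu,\nu) = 0 . \]
% Spectrum of -\Delta_0 on S_r (degree-k spherical harmonics):
\[ \frac{k(k+n-1)}{r^2}, \qquad k = 0, 1, 2, \dots \]
% The constraint \int_{S_r}\eta\,d\mu = 0 removes the constants (k = 0), so the infimum in (8.3.3) is attained at k = 1:
\[ \mu_0 = \frac{1\cdot(1+n-1)}{r^2} - \frac{n}{r^2} = 0 . \]
```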
Example 8-9. Consider the Schwarzschild manifold $M = (\mathbb{R}^3 \setminus \{C_{\mathrm{BORT}}\}, g_m)$,
where $g_m = u^4 g_{\mathbb{E}}$ and
\[ u(x) = 1 + \frac{m}{2|x - C_{\mathrm{BORT}}|}. \]
For each r > 0, let $S_r = \{x \in \mathbb{R}^3 : |x - C_{\mathrm{BORT}}| = r\}$, a constant mean curvature sphere
homologous to the minimal sphere.
We recall a transformation formula for conformal metrics. Let $g_1$, $g_2$ be
metrics on an n-dimensional manifold, related by $g_2 = u^{4/(n-2)} g_1$. If $\nu_1$ is a
unit normal to a hypersurface with respect to $g_1$, then $\nu_2 = u^{-2/(n-2)} \nu_1$ is a unit
normal with respect to $g_2$. The corresponding mean curvatures $H_1$ and $H_2$ of the
hypersurface are related by
\[ H_2 = u^{-\frac{2}{n-2}} \Bigl( H_1 + \frac{2(n-1)}{n-2}\, u^{-1} \nabla_{\nu_1} u \Bigr). \tag{8.3.4} \]
It is not hard to see that umbilicity is preserved under conformal transformation,
so the sphere $S_r$ is umbilic in M and has constant mean curvature $(2r - m)/(r^2 u^3)$.
This implies that $S_{m/2}$ is a minimal surface and $S_{(2+\sqrt{3})m/2}$ has the largest mean
curvature: the mean curvature of $S_r$ is increasing in r for $m/2 \le r \le (2 + \sqrt{3})m/2$,
and decreasing in r if $r \ge (2 + \sqrt{3})m/2$. The stability operator on $S_r$ is given by
\[ L_{S_r} = -\Delta_{S_r} - (|A|^2 + \operatorname{Ric}_{g_m}(\nu, \nu)) = -u^{-4} \Delta_0 + \frac{-4r^2 + 8rm - m^2}{2r^4 u^6}, \tag{8.3.5} \]
where $\Delta_0$ is the Laplacian operator of the round sphere of radius r. The smallest
eigenvalue of $L_{S_r}$ is
\[ \lambda_0 = \frac{-4r^2 + 8rm - m^2}{2r^4 u^6} = -\frac{2}{r^2} + \frac{10m}{r^3} + O(r^{-4}), \]
with the corresponding eigenspace spanned by constant functions. The next
eigenvalues are
\[ \lambda_1 = \lambda_2 = \lambda_3 = \frac{6m}{r^3 u^6} = \frac{6m}{r^3} + O(r^{-4}), \]
with corresponding eigenspace spanned by the coordinate functions $\{x^1, x^2, x^3\}$
restricted to $S_r$. Thus, if m > 0, $S_r$ is a stable hypersurface of constant mean
curvature (with respect to volume-preserving variations). The spheres $\{S_r\}$ form a
smooth foliation of constant mean curvature spheres with common center $C_{\mathrm{BORT}}$.
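
The closed-form expressions in Example 8-9 can be cross-checked against one another. The sketch below evaluates (8.3.4) with n = 3, $H_1 = 2/r$ and $\partial_r u = -m/(2r^2)$, and compares the result with $(2r-m)/(r^2u^3)$; it also recovers $\lambda_1$ from (8.3.5) by adding $u^{-4}\cdot 2/r^2$ to $\lambda_0$, where $2/r^2$ is the first nonzero eigenvalue of $-\Delta_0$ on the round sphere of radius r (a standard fact assumed here). The function names are hypothetical and chosen only for this illustration.

```python
import math

def u(r, m):
    """Conformal factor of Schwarzschild on the sphere of coordinate radius r."""
    return 1.0 + m / (2.0 * r)

def mean_curvature(r, m):
    """Mean curvature of S_r from (8.3.4) with n = 3:
    H_2 = u^{-2} (H_1 + 4 u^{-1} du/dr), with H_1 = 2/r, du/dr = -m / (2 r^2)."""
    du = -m / (2.0 * r * r)
    return u(r, m) ** -2 * (2.0 / r + 4.0 * du / u(r, m))

def lambda0(r, m):
    """Smallest eigenvalue of the stability operator (8.3.5), on constants."""
    return (-4.0 * r * r + 8.0 * r * m - m * m) / (2.0 * r ** 4 * u(r, m) ** 6)

def lambda1(r, m):
    """Next eigenvalue: lambda0 plus u^{-4} times the first nonzero
    eigenvalue 2 / r^2 of -Delta_0 on the round sphere of radius r."""
    return 2.0 / (r * r * u(r, m) ** 4) + lambda0(r, m)
```

With m = 1, for example, `mean_curvature(0.5, 1.0)` vanishes (the minimal sphere at r = m/2), the maximum of `mean_curvature` sits at r = (2 + √3)/2, and `lambda1(r, 1.0)` agrees with 6m/(r³u⁶), which is positive exactly when m > 0.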

8.3.2. Uniqueness of stable CMC surfaces. A classical result of Barbosa and
do Carmo [13] characterizes stable hypersurfaces in Euclidean space.
Theorem 8-10 (Barbosa–do Carmo [13]). The only closed, stable, connected,
two-sided hypersurfaces of constant mean curvature in Euclidean space are
round spheres.
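Proof. Let $\Sigma^n$ be a hypersurface in $\mathbb{R}^{n+1}$ of constant mean curvature H that satisfies
the assumptions in the theorem. Consider the deformation $F : \Sigma \times (-\epsilon, \epsilon) \to
\mathbb{R}^{n+1}$ given by
\[ \frac{\partial F}{\partial t} = \eta\nu, \]
where $\eta = n - H\langle X, \nu \rangle$ and $X = (x^1, \dots, x^{n+1})$ is the position vector of $\Sigma$. Then
$\int_\Sigma \eta \, d\mu = 0$ by the Minkowski integral formula (with f = 1). Let $\{e_1, \dots, e_n\}$ be
a local orthonormal frame along $\Sigma$. Note that $\nabla_{e_i} X = e_i$, where $\nabla$ is the ambient
connection. For any point in $\Sigma$, we can choose the frame so that $\nabla^\Sigma_{e_i} e_j = 0$ at
the point, at which we find
\begin{align*}
\Delta_\Sigma \langle X, \nu \rangle &= \sum_{i=1}^{n} e_i e_i \langle X, \nu \rangle \\
&= \sum_{i=1}^{n} e_i \bigl( \langle \nabla_{e_i} X, \nu \rangle + \langle X, \nabla_{e_i} \nu \rangle \bigr) \\
&= \sum_{i=1}^{n} \bigl( \langle \nabla_{e_i} e_i, \nu \rangle + \langle e_i, \nabla_{e_i} \nu \rangle + \langle \nabla_{e_i} X, \nabla_{e_i} \nu \rangle + \langle X, \nabla_{e_i} \nabla_{e_i} \nu \rangle \bigr) \\
&= \sum_{i=1}^{n} \bigl( \langle e_i, \nabla_{e_i} \nu \rangle + \langle X, \nabla_{e_i} \nabla_{e_i} \nu \rangle \bigr) \\
&= H - |A|^2 \langle X, \nu \rangle, \tag{8.3.6}
\end{align*}
where in the last step we have used the equality $\sum_{i=1}^{n} \langle e_k, \nabla_{e_i} \nabla_{e_i} \nu \rangle = 0$, which
holds for each k because H is constant.
Let $L_\Sigma$ be the stability operator on $\Sigma$. The computation above implies that
with H constant and $\eta = n - H\langle X, \nu \rangle$,

\[ L_\Sigma \eta = -\Delta_\Sigma \eta - |A|^2 \eta = H^2 - n|A|^2. \]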
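
The identity for $L_\Sigma \eta$ follows from (8.3.6) in two lines. The expansion below only spells out that step, using that H is constant and that the ambient Ricci tensor of Euclidean space vanishes; no new notation is introduced.

```latex
\begin{align*}
  \Delta_\Sigma \eta
    &= -H\,\Delta_\Sigma\langle X,\nu\rangle
     = -H^2 + H|A|^2\langle X,\nu\rangle, \\
  L_\Sigma \eta
    &= -\Delta_\Sigma\eta - |A|^2\eta
     = H^2 - H|A|^2\langle X,\nu\rangle - |A|^2\bigl(n - H\langle X,\nu\rangle\bigr)
     = H^2 - n|A|^2 .
\end{align*}
```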

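Since $\Sigma$ is stable, and since H is constant and $\int_\Sigma \eta \, d\mu = 0$, we have
\begin{align*}
0 \le \int_\Sigma \eta L_\Sigma \eta \, d\mu &= \int_\Sigma (H^2 - n|A|^2)\eta \, d\mu \\
&= -n \int_\Sigma |A|^2 (n - H\langle X, \nu \rangle) \, d\mu \\
&= -n \int_\Sigma (n|A|^2 - H^2) \, d\mu,
\end{align*}
where in the last equality we used $\int_\Sigma (H - |A|^2 \langle X, \nu \rangle) \, d\mu = 0$, which is implied
by (8.3.6). Because $n|A|^2 \ge H^2$ with equality if and only if $\Sigma$ is umbilic, we
conclude that $\Sigma$ is a sphere. □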
Remark 8-11. The uniqueness result has been generalized to an ambient Riemannian
manifold which is complete, simply connected with constant sectional
curvature [14]. More precisely, the only stable closed hypersurfaces of constant
mean curvature in a complete simply-connected Riemannian manifold with
constant sectional curvature are the geodesic spheres.

8.4. Existence of CMC surfaces in asymptotically flat initial data sets

We have seen in Example 8-9 that the BORT center of mass of a Schwarzschild
manifold of positive mass is the common geometric center of the (unique)
foliation of the stable constant mean curvature surfaces. In 1996, motivated
by the goal of finding a geometric description of the center of mass in general
relativity, Huisken and Yau [126] initiated a program to study stable constant
mean curvature surfaces in more general asymptotically flat initial data sets.
Throughout this section, we consider three-dimensional asymptotically flat
initial data sets; in addition, we impose the Regge–Teitelboim conditions where
$C_{\mathrm{BORT}}$ is used.
We recall that the three-dimensional Schwarzschild metric of mass m is
denoted by $g_m = \bigl( 1 + \frac{m}{2|x|} \bigr)^4 g_{\mathbb{E}}$. Here we are interested in the exterior region of
the manifold, so the metric is valid for all $m \in \mathbb{R}$, and not only for m > 0. For
most of the results presented here, we focus on an asymptotically flat manifold
that is close to some Schwarzschild manifold in the following sense.

Definition 8-12. A three-dimensional asymptotically flat initial data set (M, g)
is said to be $C^k$-asymptotic to Schwarzschild of mass m if there is a compact
subset $K \subset M$ and a diffeomorphism $M \setminus K \cong_x \mathbb{R}^3 \setminus B$ for a closed ball $B \subset \mathbb{R}^3$,
satisfying
\[ \sum_{|I| \le k} |x|^{2+|I|}\, \bigl| \partial^I \bigl( g_{ij}(x) - (g_m)_{ij}(x) \bigr) \bigr| \le C \]
for i, j = 1, 2, 3 and some constant C > 0.

Remark 8-13. The assumptions on k for regularity $C^k$ vary among different
results that we discuss below, but we omit the precise assumptions on $C^k$ in their
statements.

Theorem 8-14 (Huisken–Yau [126]). If (M, g) is asymptotic to Schwarzschild
of mass m > 0, there exists a foliation of stable constant mean curvature surfaces
in the exterior region of M.

The proof by Huisken and Yau consists of two parts. For the existence part,
they use the volume-preserving mean curvature flow to evolve a sufficiently round
initial surface into a constant mean curvature surface. Next, using the estimates
obtained from the flow, they analyze the eigenvalues of the stability operator and
show that the constant mean curvature surfaces are stable and form a smooth
foliation. We sketch the method of volume-preserving mean curvature flow in
Section 8.4.1 and discuss the eigenvalue estimates in Section 8.5. A different
approach by Ye [225] uses the inverse function theorem for the existence part,
which we discuss in Section 8.4.2.

8.4.1. Volume-preserving mean curvature flow. The volume-preserving mean curvature flow is a normalized mean curvature flow. It was first introduced by Huisken in the Euclidean setting [124]. The flow is designed specifically to keep the enclosed volume the same and to decrease the surface area under the flow.
Let $(M, g)$ be asymptotic to Schwarzschild. Denote by $S_r = \{x : |x| = r\}$ the coordinate sphere in $M$. For each $r$ sufficiently large, we define the volume-preserving mean curvature flow $F_r : S^2 \times [0, T) \to M$ as follows, for $t \ge 0$, $p \in S^2$:
\[
\begin{aligned}
\frac{\partial}{\partial t} F_r(p, t) &= (\bar H - H)\,\nu(p, t), \\
F_r(p, 0) &= r p \in S_r,
\end{aligned}
\tag{8.4.1}
\]
where $\bar H = |\Sigma_t|^{-1} \int_{\Sigma_t} H \, d\mu$, $\Sigma_t = F_r(S^2, t)$, and $|\Sigma_t|$ is the area of $\Sigma_t$. By Proposition 8-7, the flow keeps the signed volume between $\Sigma_t$ and $S_r$ the same. Furthermore, the first variation formula implies
\[
\frac{d}{dt} |\Sigma_t| = - \int_{\Sigma_t} (H - \bar H)^2 \, d\mu.
\]
Thus the area of $\Sigma_t$ is strictly decreasing unless $H$ is a constant. Therefore, if the flow exists for all time, $\Sigma_t$ converges to a constant mean curvature surface.
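The area formula can be checked in a few lines. The following is a minimal sketch, assuming the first variation of area under a normal speed $f\nu$ is $\int H f \, d\mu$ (the sign convention in which round spheres have $H > 0$ with respect to the outward normal) and that the flow speed in (8.4.1) is $f = \bar H - H$:

```latex
\begin{aligned}
\frac{d}{dt}|\Sigma_t|
  &= \int_{\Sigma_t} H\,(\bar H - H)\, d\mu \\
  &= \int_{\Sigma_t} (H - \bar H)(\bar H - H)\, d\mu
     + \bar H \int_{\Sigma_t} (\bar H - H)\, d\mu \\
  &= -\int_{\Sigma_t} (H - \bar H)^2\, d\mu .
\end{aligned}
```

The last step uses $\int_{\Sigma_t} (\bar H - H)\, d\mu = 0$, which is just the definition of the mean value $\bar H$.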
Note that the volume-preserving mean curvature flow (8.4.1) is a quasilinear
parabolic system, so it has a unique short-time solution for a smooth initial
surface. However, the flow may develop singularities at a finite time. For
surfaces in Euclidean space that are uniformly convex, the flow exists for all
time and converges to a round sphere [124]. For surfaces in an initial data set
that is asymptotic to Schwarzschild, we have the following result.
Theorem 8-15 (Huisken–Yau [126]). Let $(M, g)$ be asymptotic to Schwarzschild of mass $m > 0$. There exist positive constants $r_0$ and $C$, depending only on $g$, such that for all $r \ge r_0$, the volume-preserving mean curvature flow (8.4.1) has a unique smooth solution for all time. Furthermore, $\Sigma_t$ converges exponentially fast to an embedded surface $\Sigma$ of constant mean curvature $H$, and, for $x \in \Sigma$,
\[
\big|\, |x| - r \,\big| \le C \quad \text{and} \quad \left| H - \frac{2}{r} + \frac{4m}{r^2} \right| \le C r^{-2}.
\]
The main ingredient of the proof is to show that the solution $\Sigma_t$ stays in a class of sufficiently round surfaces, and hence it does not develop singularities along the flow.
For a surface $\Sigma$ in $(M, g)$, let $g_\Sigma$ be the induced metric, and denote by $\mathring{A} := A - \tfrac12 H g_\Sigma$ the traceless part of the second fundamental form $A$ of $\Sigma$. Let $r$ be large and $B_0, B_1, B_2$ be positive. Define the class $\mathcal{B}_r(B_0, B_1, B_2)$ of smooth closed surfaces of genus zero in $(M, g)$ by
\[
\mathcal{B}_r(B_0, B_1, B_2) = \Big\{ \Sigma \subset M : |\mathring{A}| \le B_1 r^{-3},\ |\nabla \mathring{A}| \le B_2 r^{-4},\ \text{and } \big|\, |x| - r \,\big| \le B_0 \text{ for all } x \in \Sigma \Big\}.
\]

By carefully choosing the constants $B_0, B_1, B_2$ (depending on the a priori estimates of $|\mathring{A}|$ and $|\nabla \mathring{A}|$ along the flow), one can find $r_0$ sufficiently large (depending on $B_0, B_1, B_2$ and $g$) such that for each $r \ge r_0$ the solution $\Sigma_t$ to (8.4.1) remains in $\mathcal{B}_r(B_0, B_1, B_2)$. This then implies long-time existence of the solutions. Details can be found in the original paper [126, Section 3].
We now explain the motivation behind the smallness assumptions on $|\mathring{A}|$ and $|\nabla \mathring{A}|$ in the definition of $\mathcal{B}_r(B_0, B_1, B_2)$. Note that if $|\mathring{A}| = 0$, then all the principal curvatures at each point of the hypersurface are equal and hence the hypersurface is umbilic. It is known that the only closed umbilic hypersurfaces in Euclidean space are round spheres. (We applied this fact earlier in the proof of Theorem 8-10.) For surfaces that are almost umbilic, there are several quantitative estimates that measure how far the surfaces are from being round; see, for example, [72]. Below we provide a simple version of the quantitative estimates.
We define the area radius for a closed surface $\Sigma$ in $(M, g)$ by
\[
r_\Sigma := \sqrt{\frac{|\Sigma|}{4\pi}}.
\]
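Proposition 8-16 [126, Proposition 2.1]. There exists an absolute constant $C > 0$ such that the following holds. Let $\Sigma$ be a closed surface in $\mathbb{R}^3$ of genus zero. Let $B_1$ and $B_2$ be positive numbers such that
\[
|\mathring{A}| \le B_1 r_\Sigma^{-3} \quad \text{and} \quad |\nabla \mathring{A}| \le B_2 r_\Sigma^{-4}.
\]
If $r_\Sigma > C(\sqrt{B_1} + \sqrt{B_2})$, then the principal curvatures $\lambda_1, \lambda_2$ satisfy, for $i = 1, 2$,
\[
\left| \lambda_i - \frac{\bar H}{2} \right| \le C(B_1 + B_2)\, r_\Sigma^{-3}.
\]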
Proof. In the proof, $C$ is assumed to be an absolute constant and may change from line to line. As a consequence of the Codazzi equation [123, Lemma 2.2], we have
\[
|\nabla A|^2 \ge \tfrac34 |\nabla H|^2.
\]
Together with the assumption on $|\nabla \mathring{A}|$, this implies an upper bound on $|\nabla H|$:
\[
|\nabla \mathring{A}|^2 = |\nabla A|^2 - \tfrac12 |\nabla H|^2 \ge \tfrac14 |\nabla H|^2.
\]
Let $x_0 \in \Sigma$ be such that $H(x_0) = \bar H$. By the mean value theorem and the above bound on $|\nabla H|$, we have
\[
|H(x) - \bar H| \le \sup_{x \in \Sigma} |\nabla H(x)|\, d \le 2 B_2 r_\Sigma^{-4}\, d, \tag{8.4.2}
\]
where $d$ is the intrinsic diameter of $\Sigma$, which can be estimated in terms of the mean curvature [215, Theorem 1.1] as follows:
\[
d \le C \int_\Sigma |H| \, d\mu \le C \big( |\bar H|\, r_\Sigma^2 + B_2 r_\Sigma^{-2}\, d \big),
\]
where we applied (8.4.2) in the last inequality. Choosing $r_\Sigma \ge \sqrt{2 C B_2}$ yields $d \le 2C |\bar H|\, r_\Sigma^2$. Using (8.4.2) again, we have for all $x \in \Sigma$
\[
|H(x) - \bar H| \le C B_2 r_\Sigma^{-2} |\bar H|.
\]
By the Gauss–Bonnet theorem and the assumption on $|\mathring{A}|$, we have
\[
|\bar H| \le |\Sigma|^{-\frac12} \left( \int_\Sigma H^2 \, d\mu \right)^{\frac12} \le |\Sigma|^{-\frac12} \left( \int_\Sigma 2|\mathring{A}|^2 \, d\mu + 4 \int_\Sigma K \, d\mu \right)^{\frac12} \le C r_\Sigma^{-1},
\]
provided $r_\Sigma \ge \sqrt{B_1}$. Thus we obtain
\[
|H(x) - \bar H| \le C B_2 r_\Sigma^{-3}.
\]
By the assumption that $|\mathring{A}| \le B_1 r_\Sigma^{-3}$, we conclude, for $i = 1, 2$, that
\[
\left| \lambda_i - \frac{\bar H}{2} \right| \le C(B_1 + B_2)\, r_\Sigma^{-3}. \qquad \square
\]
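The Gauss–Bonnet step in the proof rests on a pointwise identity for surfaces in $\mathbb{R}^3$. As a sketch, write $\lambda_1, \lambda_2$ for the principal curvatures, so that $H = \lambda_1 + \lambda_2$, $K = \lambda_1 \lambda_2$, and $|A|^2 = \lambda_1^2 + \lambda_2^2$ (standard conventions, assumed here to match the text), and use $|\Sigma| = 4\pi r_\Sigma^2$ from the definition of the area radius:

```latex
\begin{aligned}
|\mathring{A}|^2 &= |A|^2 - \tfrac12 H^2 = \tfrac12 (\lambda_1 - \lambda_2)^2, \\
H^2 &= (\lambda_1 - \lambda_2)^2 + 4 \lambda_1 \lambda_2 = 2 |\mathring{A}|^2 + 4K, \\
\int_\Sigma H^2 \, d\mu &= 2 \int_\Sigma |\mathring{A}|^2 \, d\mu + 16\pi
  \quad \text{(Gauss--Bonnet, genus zero)}, \\
\int_\Sigma 2 |\mathring{A}|^2 \, d\mu &\le 2 B_1^2 r_\Sigma^{-6} \cdot 4\pi r_\Sigma^2
  = 8\pi B_1^2 r_\Sigma^{-4}.
\end{aligned}
```

Since $|\Sigma|^{-1/2} = (4\pi)^{-1/2} r_\Sigma^{-1}$, the bound $|\bar H| \le C r_\Sigma^{-1}$ follows as soon as $B_1^2 r_\Sigma^{-4}$ is bounded by an absolute constant.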
8.4.2. The inverse function theorem. An alternative method to construct a
constant mean curvature surface is by graphically perturbing an initial surface
whose mean curvature is almost constant.
Let $(M, g)$ be asymptotic to Schwarzschild of mass $m$. Let
\[
p_{ij}(x) := g_{ij}(x) - \left( 1 + \frac{2m}{|x|} \right) \delta_{ij}.
\]
For $r$ sufficiently large, let $S_r(a)$ be a coordinate sphere defined by $S_r(a) = \{x \in M : |x - a| = r\}$. Set $\rho^i = (x^i - a^i)/r$. By direct computation [116, (5.1)], the mean curvature of $S_r(a)$ at $x$ in $S_r(a)$ is
\[
\begin{aligned}
H_{S_r(a)} &= \frac{2}{r} - \frac{4m}{r^2} + \frac{6m (x - a) \cdot a}{r^4} + \frac{9m^2}{r^3}
  + \frac12 p_{ij,k}(x) \rho^i \rho^j \rho^k + 2\, \frac{p_{ij}(x)}{r} \rho^i \rho^j \\
&\quad - p_{ij,i}(x) \rho^j - \frac{p_{ii}(x)}{r} + \frac12 p_{ii,j}(x) \rho^j + O\big(r^{-4}(1 + |a|)\big) \\
&=: \frac{2}{r} - \frac{4m}{r^2} + \frac{6m (x - a) \cdot a}{r^4} + \frac{9m^2}{r^3} + G_r(x, a).
\end{aligned}
\tag{8.4.3}
\]
Throughout this section, we use the Einstein summation convention and sum over repeated indices, and a comma denotes a partial derivative. From (8.4.3), the mean curvature of the coordinate sphere is almost constant, up to terms of order $O(r^{-3})$. We show below that if $m \neq 0$, one can find a surface of constant mean curvature near $S_r(a)$ for a suitably chosen vector $a$ (depending on $r$). Moreover, the center $a$ will converge to the BORT center of mass as $r$ tends to infinity.
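As a sanity check on the explicit terms of (8.4.3), consider the model case $g = (1 + 2m/|x|)\, g_E$ exactly (so that $p_{ij} = 0$) with $a = 0$. The sketch below assumes the standard conformal transformation rule in dimension three: if $g = u^4 g_E$, then a surface with Euclidean mean curvature $H_0$ and Euclidean unit normal $\nu_0$ has mean curvature $H = u^{-2} (H_0 + 4 u^{-1} \partial_{\nu_0} u)$. The symbol $y$ is shorthand introduced here.

```latex
\begin{aligned}
u^4 &= 1 + y, \qquad y := \frac{2m}{r}, \qquad
4\, \frac{\partial_r u}{u} = \frac{\partial_r (u^4)}{u^4} = -\frac{1}{r}\, \frac{y}{1 + y}, \\
H_{S_r(0)} &= (1 + y)^{-1/2}\, \frac{2}{r} \left( 1 - \frac{y}{2(1 + y)} \right) \\
  &= \frac{2}{r} \left( 1 - \tfrac12 y + \tfrac38 y^2 \right)
     \left( 1 - \tfrac12 y + \tfrac12 y^2 \right) + O(r^{-4}) \\
  &= \frac{2}{r} \left( 1 - y + \tfrac98 y^2 \right) + O(r^{-4})
   = \frac{2}{r} - \frac{4m}{r^2} + \frac{9m^2}{r^3} + O(r^{-4}).
\end{aligned}
```

This reproduces the terms $2/r - 4m/r^2 + 9m^2/r^3$ of (8.4.3); the term involving $(x - a) \cdot a$ vanishes because $a = 0$.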
Theorem 8-17 (Ye [225], Huang [116]). Let $(M, g)$ be asymptotic to Schwarzschild of mass $m \neq 0$. There exist positive constants $r_0$ and $C$, depending only on $g$, such that for each $r \ge r_0$, there exists a surface $\Sigma_r$ of constant mean curvature $\frac{2}{r} - \frac{4m}{r^2}$, and $\Sigma_r$ can be expressed as a normal graph over the coordinate sphere,
\[
\Sigma_r = \big\{ x + \phi(x)\, \nu_{S_r}(x) : x \in S_r(C_{\mathrm{BORT}}) \big\},
\]
for some $\phi \in C^{2,\alpha}(S_r(C_{\mathrm{BORT}}))$ satisfying
\[
\sum_{|I| \le 2} r^{|I|} |\partial^I \phi| + \sum_{|I| = 2} r^{2+\alpha} [\partial^I \phi]_\alpha \le C r^{-1},
\]
where $\nu_{S_r}$ is the outward unit normal vector on $S_r$ with respect to $g$.

Sketch of proof. We will suppress the subscript $r$ in $\Sigma_r$ when the context is clear. Fix an asymptotically flat coordinate system in the exterior region of $M$. Let $\Sigma$ be a graph over the coordinate sphere: for $\phi \in C^{2,\alpha}(S_r(a))$ suitably small,
\[
\Sigma = \big\{ x + \phi\, \nu_{S_r} : x \in S_r(a) \big\}.
\]
Fix $r$ sufficiently large, which will be specified later. Denote by $H_r(a, \phi) : \mathbb{R}^3 \times C^{2,\alpha}(S_r(a)) \to C^{0,\alpha}(S_r(a))$ the mean curvature operator that sends the function $\phi$ to the mean curvature of the normal graph $\Sigma$ in $(M, g)$. By Taylor expansion in the $\phi$-component,
\[
H_r(a, \phi) = H_r(a, 0) + dH_r(a, 0)(\phi) + \int_0^1 \big( dH_r(a, s\phi) - dH_r(a, 0) \big)(\phi) \, ds, \tag{8.4.4}
\]
where $dH_r$ is the first Fréchet derivative with respect to the second component. Specifically, $dH_r(a, 0)$ is the stability operator on $S_r(a)$ with respect to $g$:
\[
dH_r(a, 0) = -\Delta_{S_r(a)} - \big( |A|^2 + \mathrm{Ric}(\nu_{S_r}, \nu_{S_r}) \big) =: L_{S_r(a)}.
\]
The term $H_r(a, 0)$ in (8.4.4) is the mean curvature of the coordinate sphere computed as in (8.4.3). Thus, solving $H_r(a, \phi) = \frac{2}{r} - \frac{4m}{r^2}$ for some $(a, \phi)$ is equivalent to solving
\[
L_{S_r(a)} \phi = -\frac{6m (x - a) \cdot a}{r^4} - \frac{9m^2}{r^3} - G_r(x, a) - \int_0^1 \big( dH_r(a, s\phi) - dH_r(a, 0) \big)(\phi) \, ds. \tag{8.4.5}
\]

Since $(M, g)$ is asymptotic to Schwarzschild, on the coordinate sphere $S_r(a)$ we have
\[
|A|^2 = \frac{2}{r^2} + O(r^{-3}), \qquad \mathrm{Ric}(\nu_{S_r}, \nu_{S_r}) = O(r^{-3}),
\]
provided that $|a|$ is bounded by a constant independent of $r$. Hence we replace the stability operator $L_{S_r(a)}$ by the stability operator $L_0 := -\Delta_0 - \frac{2}{r^2}$ on the Euclidean round sphere, where $\Delta_0$ is the Laplace operator on the Euclidean round sphere $S_r(a)$. We then rewrite (8.4.5) as a differential equation on the Euclidean round sphere (compare (8.3.5)):
\[
\begin{aligned}
L_0 \phi &= -\frac{6m (x - a) \cdot a}{r^4} - \frac{9m^2}{r^3} - G_r(x, a)
  + O\big( r^{-1} |\partial^2 \phi| + r^{-2} |\partial \phi| + r^{-3} |\phi| \big) \\
&=: F_r(x, a, \phi, \partial \phi, \partial^2 \phi).
\end{aligned}
\tag{8.4.6}
\]
A necessary condition for this equation to have a solution is that $F_r$ be perpendicular to the cokernel of $L_0$, which is spanned by $\{x^1 - a^1, x^2 - a^2, x^3 - a^3\}$ restricted on $S_r(a)$ (see Example 8-8). By the following lemma, the parameter $a$ can be chosen to accomplish this.
Lemma 8-18 (Huang [116, Lemma 5.1]). Let $(M, g)$ be asymptotic to Schwarzschild of mass $m$. There exists $r_0$ sufficiently large such that, for each $r \ge r_0$ and each $i = 1, 2, 3$,
\[
\int_{S_r(a)} (x^i - a^i)\, G_r(x, a) \, d\mu_0 = -8\pi m\, C^i_{\mathrm{BORT}} + O(r^{-1}),
\]
where $G_r(x, a)$ is the remainder term in (8.4.3) and $d\mu_0$ is the area measure of the Euclidean round sphere.
Remark 8-19. Lemma 8-18 has been generalized to initial data sets with the
Regge–Teitelboim conditions. We prove this in Lemma 8-22 below.
By Lemma 8-18 and direct computation, we obtain
\[
\int_{S_r(a)} F_r(x, a, \phi, \partial \phi, \partial^2 \phi)\, (x^i - a^i) \, d\mu_0 = -8\pi m\, (a^i - C^i_{\mathrm{BORT}}) + O(r^{-1} \|\phi\|_{C^2}),
\]
for $i = 1, 2, 3$. If $m \neq 0$, we choose $a = C_{\mathrm{BORT}} + O(r^{-1} \|\phi\|_{C^2})$ such that the above integral vanishes. Thus, $F_r(x, a, \phi, \partial \phi, \partial^2 \phi)$ belongs to the range of $L_0$.
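The coefficient $-8\pi m$ in front of $a^i$ comes from an elementary moment computation on the round sphere. A sketch, using only the symmetry of $S_r(a)$ and its Euclidean area $4\pi r^2$, and writing $y = x - a$ (notation introduced here):

```latex
\begin{aligned}
\int_{S_r(a)} y^i \, d\mu_0 &= 0, \qquad
\int_{S_r(a)} y^i y^j \, d\mu_0 = \frac{4\pi r^4}{3}\, \delta_{ij}, \\
\int_{S_r(a)} (x^i - a^i) \left( -\frac{6m\, (x - a) \cdot a}{r^4} - \frac{9m^2}{r^3} \right) d\mu_0
  &= -\frac{6m}{r^4} \cdot \frac{4\pi r^4}{3}\, a^i - 0 = -8\pi m\, a^i .
\end{aligned}
```

Combined with Lemma 8-18 for the $G_r$ term, which enters $F_r$ with a minus sign, this gives the factor $a^i - C^i_{\mathrm{BORT}}$.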
Next we use the Schauder fixed point theorem (see [107, Chapter 11], for
example) to find a solution to (8.4.6).
Theorem 8-20 (Schauder fixed point theorem). Let B be a compact convex subset
in a Banach space, and let T : B → B be a continuous map. Then T has a fixed
point, that is, T x = x for some x ∈ B.
Define the convex subset $B \subset C^2(S_r(a))$ by $B := \{u \in C^2(S_r(a)) : \|u\|_{C^{2,\alpha}} \le 1\}$. Note that $B$ is compact by the Arzelà–Ascoli theorem. Given $w \in C^2(S_r(a))$, we have shown that there exists a vector $a$ such that $F_r(x, a, w, \partial w, \partial^2 w)$ belongs to the range of $L_0$. Hence there exists a solution $v \in C^{2,\alpha}(S_r(a))$ such that
\[
L_0 v = F_r(x, a, w, \partial w, \partial^2 w). \tag{8.4.7}
\]
Define the map $T : B \to C^2(S_r(a))$ by $T(w) = v$, where $v$ is the unique solution to (8.4.7) such that $v$ is perpendicular to the kernel of $L_0$. One can verify that $T$ is continuous. By the Schauder estimates for solutions perpendicular to the kernel, we obtain
\[
\|v\|_{C^{2,\alpha}(S_r(a))} \le C\, \|F_r(x, a, w, \partial w, \partial^2 w)\|_{C^{0,\alpha}(S_r(a))} \le C r^{-1} \|w\|_{C^{2,\alpha}(S_r(a))},
\]
where the constant $C$ depends only on the metric $g$. Choose $r$ such that $r \ge C$. It follows that $T$ maps $B$ into itself. Thus, by the Schauder fixed point theorem, $T$ has a fixed point $\phi$. Then $\phi$ solves the desired equation
\[
H_r(a, \phi) = \frac{2}{r} - \frac{4m}{r^2},
\]
where $a = C_{\mathrm{BORT}} + O(r^{-1} \|\phi\|_{C^2})$. □
Let $\{\Sigma_r\}$ be the family of constant mean curvature surfaces constructed in the previous theorem, and let $\{x^1, x^2, x^3\}$ be the coordinate functions. The geometric center of mass of $(M, g)$ proposed by Huisken–Yau is defined as follows, for $i = 1, 2, 3$:
\[
C^i_{\mathrm{Geom}} = \lim_{r \to \infty} \frac{\int_{\Sigma_r} x^i \, d\mu_0}{\int_{\Sigma_r} d\mu_0}, \tag{8.4.8}
\]
where $d\mu_0$ is the Euclidean area measure.


Corollary 8-21 (Huang [116, Theorem 2]). If $(M, g)$ is asymptotic to Schwarzschild of mass $m \neq 0$, these two notions of the center of mass coincide:
\[
C_{\mathrm{BORT}} = C_{\mathrm{Geom}}.
\]

This corollary has been generalized to the class of asymptotically flat initial
data sets that satisfy the Regge–Teitelboim conditions [117].

8.4.3. Another notion of center of mass. In this section we show the following
identity between the mean curvature of the coordinate spheres and the BORT
center of mass.
Lemma 8-22 (Huang [117]). Let $(M, g, k)$ be an asymptotically flat initial data set satisfying the Regge–Teitelboim conditions. Given $a \in \mathbb{R}^3$, denote by $S_r(a) = \{x \in M : |x - a| = r\}$ the coordinate sphere centered at $a$ of radius $r$. Then we have
\[
\int_{S_r(a)} (x^\alpha - a^\alpha) \left( H - \frac{2}{r} \right) d\mu_0 = 8\pi E\, (a^\alpha - C^\alpha_{\mathrm{BORT}}) + O(r^{1-2q}) \tag{8.4.9}
\]
for $\alpha = 1, 2, 3$, where $H$ is the mean curvature of $S_r(a)$ with respect to $g$ and $d\mu_0$ is the area measure of the Euclidean round sphere $S_r(a)$.
Proof. Write $h_{ij} = g_{ij} - \delta_{ij}$ and $\rho^i = (x^i - a^i)/r$. Employing the Einstein convention to sum over all repeated indices, we find by direct computation [117, Lemma 2.1] that, for $x \in S_r(a)$,
\[
H(x) = \frac{2}{r} + \frac12 h_{ij,k}(x) \rho^i \rho^j \rho^k + 2 h_{ij}(x) \frac{\rho^i \rho^j}{r} - h_{ij,i}(x) \rho^j + \frac12 h_{ii,j}(x) \rho^j - \frac{h_{ii}(x)}{r} + E_0(x),
\]
where $E_0(x) = O(r^{-1-2q})$ and $E_0(x) - E_0(-x) = O(r^{-2-2q})$. We will use the key identity
\[
\begin{aligned}
\int_{S_r(a)} (x^\alpha - a^\alpha)\, \tfrac12 h_{ij,k}(x) \rho^i \rho^j \rho^k \, d\mu_0
&= \int_{S_r(a)} (x^\alpha - a^\alpha) \left( \frac12 h_{ij,i}(x) \rho^j - 2 h_{ij}(x) \frac{\rho^i \rho^j}{r} \right) d\mu_0 \\
&\quad + \int_{S_r(a)} \frac12 \big( h_{ii}(x) \rho^\alpha + h_{i\alpha}(x) \rho^i \big) \, d\mu_0 .
\end{aligned}
\tag{8.4.10}
\]
Assuming (8.4.10), we obtain
\[
\begin{aligned}
\int_{S_r(a)} (x^\alpha - a^\alpha) \left( H(x) - \frac{2}{r} \right) d\mu_0
&= -\frac12 \int_{S_r(a)} (x^\alpha - a^\alpha)(h_{ij,i} - h_{ii,j}) \rho^j \, d\mu_0
   + \frac12 \int_{S_r(a)} (h_{i\alpha} \rho^i - h_{ii} \rho^\alpha) \, d\mu_0 + O(r^{1-2q}) \\
&= -\frac12 \int_{S_r(a)} \Big[ x^\alpha (g_{ij,i} - g_{ii,j}) \rho^j - (g_{i\alpha} \rho^i - g_{ii} \rho^\alpha) \Big] d\mu_0 \\
&\quad + \frac12 a^\alpha \int_{S_r(a)} (g_{ij,i} - g_{ii,j}) \rho^j \, d\mu_0 + O(r^{1-2q}) \\
&= -\frac12 \int_{S_r(a)} \left[ x^\alpha (g_{ij,i} - g_{ii,j}) \frac{x^j}{r} - \left( g_{i\alpha} \frac{x^i}{r} - g_{ii} \frac{x^\alpha}{r} \right) \right] d\mu_0 \\
&\quad + \frac12 a^\alpha \int_{S_r(a)} (g_{ij,i} - g_{ii,j}) \frac{x^j}{r} \, d\mu_0 + O(r^{1-2q}),
\end{aligned}
\]
where we used the Regge–Teitelboim conditions in all the equalities. The desired identity follows from the definitions of the ADM energy and the BORT center of mass.
It remains to prove (8.4.10). Our original proof uses a density theorem
(Theorem 8-34) which states that initial data sets with harmonic asymptotics
are dense among initial data sets with the Regge–Teitelboim conditions in a
suitable topology such that the ADM energy and the BORT center of mass vary
continuously. It is then straightforward to verify that (8.4.10) holds for initial data
sets with harmonic asymptotics. Eichmair and Metzger later gave the following
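proof [81]. For each $\alpha$, define the vector field $X_{(\alpha)} = (x^\alpha - a^\alpha)\, h_{ij} \rho^i \partial_j$. By the first variation formula,
\[
\int_{S_r(a)} \mathrm{div}_0 X_{(\alpha)} \, d\mu_0 = \int_{S_r(a)} H_0 \langle X_{(\alpha)}, \rho \rangle \, d\mu_0 = \int_{S_r(a)} 2 (x^\alpha - a^\alpha) \frac{h_{ij} \rho^i \rho^j}{r} \, d\mu_0,
\]
where $\mathrm{div}_0$ is the divergence operator on the Euclidean round sphere $S_r(a)$. Then (8.4.10) follows from a direct computation:
\[
\begin{aligned}
\mathrm{div}_0 X_{(\alpha)} &= (\delta_{ij} - \rho^i \rho^j)\, \partial_i X^j_{(\alpha)} \\
&= h_{i\alpha} \rho^i + (x^\alpha - a^\alpha) \left( \frac{h_{ii}}{r} - 2 \frac{h_{ij}}{r} \rho^i \rho^j + h_{ij,j} \rho^i - h_{ij,k} \rho^i \rho^j \rho^k \right). \qquad \square
\end{aligned}
\]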
r r
Lemma 8-22 gives us a new notion of center of mass that involves the mean curvature: for $\alpha = 1, 2, 3$,
\[
C^\alpha_H = \lim_{r \to \infty} -\frac{1}{8\pi E} \int_{S_r(0)} x^\alpha H \, d\mu_0,
\]
where $H$ is the mean curvature of the coordinate sphere $S_r(0)$ with respect to $g$, and $d\mu_0$ is the area measure of a Euclidean round sphere.
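The formula for $C_H$ is Lemma 8-22 specialized to $a = 0$. A sketch of the rearrangement, assuming $E \neq 0$ and that the Regge–Teitelboim decay rate satisfies $q > 1/2$ (an assumption stated here, since the precise range of $q$ is fixed elsewhere in the text), so that the error term tends to zero:

```latex
\begin{aligned}
\int_{S_r(0)} x^\alpha \left( H - \frac{2}{r} \right) d\mu_0
  &= -8\pi E\, C^\alpha_{\mathrm{BORT}} + O(r^{1-2q}), \\
\int_{S_r(0)} x^\alpha\, \frac{2}{r} \, d\mu_0 &= 0
  \quad \text{(odd integrand on a sphere centered at the origin)}, \\
-\frac{1}{8\pi E} \int_{S_r(0)} x^\alpha H \, d\mu_0
  &= C^\alpha_{\mathrm{BORT}} + O(r^{1-2q}).
\end{aligned}
```

Letting $r \to \infty$ in the last line identifies the limit with $C^\alpha_{\mathrm{BORT}}$.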
Corollary 8-23. Let (M, g, k) be an asymptotically flat initial data set satisfying
the Regge–Teitelboim conditions. Then these notions of center of mass coincide:

C H = CBORT .

8.5. Stability and foliations

After having obtained a family of constant mean curvature surfaces in Section 8.4,
we now discuss their properties in this section. We continue to restrict our
discussions to three-dimensional asymptotically flat initial data sets throughout
this section, unless otherwise specified.

8.5.1. Analyzing the stability operator. We have shown in Section 8.4 the existence of a family of constant mean curvature surfaces in an initial data set $(M, g)$ asymptotic to Schwarzschild. From the construction, each member of the family of constant mean curvature surfaces $\{\Sigma_r\}$ can be expressed as a normal graph over the corresponding coordinate sphere at a common center $a$:
\[
\Sigma_r = \big\{ x + \phi\, \nu_{S_r} : x \in S_r(a) \big\}, \tag{⋆}
\]
where $\phi \in C^{2,\alpha}(S_r(a))$ depends on $r$ and
\[
\sum_{|I| \le 2} r^{|I|} |\partial^I \phi| + \sum_{|I| = 2} r^{2+\alpha} [\partial^I \phi]_\alpha \le C r^{-1}.
\]
In particular, each $\Sigma_r$ satisfies
\[
\begin{aligned}
H &= \frac{2}{r} - \frac{4m}{r^2}, & |\Sigma_r| &= 4\pi r^2 + O(r), \\
|A|^2 + \mathrm{Ric}(\nu, \nu) &= \frac{2}{r^2} - \frac{10m}{r^3} + O(r^{-4}), & K &\ge \frac{1}{r^2} - \frac{2m}{r^3} - C r^{-4},
\end{aligned}
\tag{⋆⋆}
\]
where $K$ is the Gaussian curvature of $\Sigma_r$ and $\nu$ is the unit normal to $\Sigma_r$, and where $f = O(r^q)$ denotes a function satisfying $|f| \le C r^q$ on $\Sigma_r$, for all $r$, where $C > 0$ depends only on $g$.
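The expansions in (⋆⋆) can be checked at leading order. The sketch below assumes that the surface is umbilic to the order shown (so $|A|^2 = H^2/2$ up to $O(r^{-4})$), that $\mathrm{Ric}(\nu, \nu) = -2m/r^3 + O(r^{-4})$ and the scalar curvature is $R = O(r^{-4})$ as for a metric asymptotic to Schwarzschild, and it uses the Gauss equation $2K = R - 2\,\mathrm{Ric}(\nu, \nu) + H^2 - |A|^2$:

```latex
\begin{aligned}
|A|^2 &= \tfrac12 H^2 + O(r^{-4})
  = \tfrac12 \left( \frac{2}{r} - \frac{4m}{r^2} \right)^2 + O(r^{-4})
  = \frac{2}{r^2} - \frac{8m}{r^3} + O(r^{-4}), \\
|A|^2 + \mathrm{Ric}(\nu, \nu) &= \frac{2}{r^2} - \frac{10m}{r^3} + O(r^{-4}), \\
K &= \tfrac12 \left( \frac{4m}{r^3} + \tfrac12 H^2 \right) + O(r^{-4})
  = \frac{1}{r^2} - \frac{2m}{r^3} + O(r^{-4}).
\end{aligned}
```

These are consistency checks under the stated assumptions, not a substitute for the estimates derived from (⋆).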
We next show that for a family of surfaces satisfying the properties (⋆⋆), there exists $r_0$ sufficiently large such that for each $r \ge r_0$, the constant mean curvature surface $\Sigma_r$ is stable. We first recall a classical estimate on the first nonzero eigenvalue of the Laplace operator.
Lemma 8-24 (Lichnerowicz). Let $\Sigma$ be an $n$-dimensional closed Riemannian manifold. Let $\lambda_{\mathrm{Lap}}$ be the first nonzero eigenvalue of the Laplace operator $-\Delta$. If the Ricci curvature satisfies
\[
\mathrm{Ric}(\xi, \xi) \ge (n - 1) \kappa |\xi|^2,
\]
for some constant $\kappa > 0$ and all $\xi \in T\Sigma$, then $\lambda_{\mathrm{Lap}} \ge n\kappa$.

Remark 8-25. The equality $\lambda_{\mathrm{Lap}} = n\kappa$ holds in the preceding lemma if and only if $\Sigma$ is isometric to the $n$-sphere of constant sectional curvature $\kappa$ [173].
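Proof. Recall the Bochner–Lichnerowicz identity:
\[
\tfrac12 \Delta |\nabla u|^2 = |\mathrm{Hess}\, u|^2 + \langle \nabla u, \nabla \Delta u \rangle + \mathrm{Ric}(\nabla u, \nabla u).
\]
Let $u$ be an eigenfunction corresponding to $\lambda_{\mathrm{Lap}}$ such that $-\Delta u = \lambda_{\mathrm{Lap}} u$. Integrating the identities over $\Sigma$ and using $|\mathrm{Hess}\, u|^2 \ge (\Delta u)^2 / n$, we obtain $\int_\Sigma |\nabla u|^2 \, d\mu = \lambda_{\mathrm{Lap}} \int_\Sigma u^2 \, d\mu$ and
\[
\begin{aligned}
0 = \frac12 \int_\Sigma \Delta |\nabla u|^2 \, d\mu
&\ge \int_\Sigma \left( \frac{(\Delta u)^2}{n} + \langle \nabla u, \nabla \Delta u \rangle + \mathrm{Ric}(\nabla u, \nabla u) \right) d\mu \\
&\ge \left( \frac{\lambda_{\mathrm{Lap}}^2}{n} + \big( (n - 1)\kappa - \lambda_{\mathrm{Lap}} \big) \lambda_{\mathrm{Lap}} \right) \int_\Sigma u^2 \, d\mu.
\end{aligned}
\]
Since $\lambda_{\mathrm{Lap}} > 0$, the desired inequality follows. □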
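The last step of the proof is a short piece of algebra. Dividing the final inequality by $\lambda_{\mathrm{Lap}} \int_\Sigma u^2 \, d\mu > 0$ and assuming $n \ge 2$:

```latex
\begin{aligned}
0 &\ge \frac{\lambda_{\mathrm{Lap}}}{n} + (n - 1)\kappa - \lambda_{\mathrm{Lap}}
   = (n - 1) \left( \kappa - \frac{\lambda_{\mathrm{Lap}}}{n} \right)
\quad \Longrightarrow \quad
\lambda_{\mathrm{Lap}} \ge n \kappa .
\end{aligned}
```

For a surface ($n = 2$) this reads $\lambda_{\mathrm{Lap}} \ge 2\kappa$, with $\kappa$ a positive lower bound for the Gauss curvature.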


Theorem 8-26 (Huisken–Yau [126]). Let $(M, g)$ be asymptotic to Schwarzschild of mass $m$. Suppose $\{\Sigma_r\}$ is a family of surfaces satisfying the properties (⋆⋆). Then there exists $C > 0$, depending only on $g$, such that for each $\Sigma_r$
\[
\mu_0 \ge \frac{6m}{r^3} - C r^{-4},
\]
where $\mu_0$ is defined by (8.3.3). As a consequence, if $m > 0$, there exists $r_0$ sufficiently large such that for each $r \ge r_0$, we have $\mu_0 > 0$ and hence $\Sigma_r$ is stable.
Proof. Applying Lemma 8-24 to a two-dimensional surface $\Sigma$ yields $\lambda_{\mathrm{Lap}} \ge 2\kappa$, where $\kappa$ is the minimum of the Gauss curvature of $\Sigma$. On $\Sigma_r$ we have, by (⋆⋆),
\[
\lambda_{\mathrm{Lap}} \ge \frac{2}{r^2} - \frac{4m}{r^3} - C r^{-4}.
\]
The theorem follows from the definition of $\mu_0$ and the properties (⋆⋆). □
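To see where $6m/r^3$ comes from, here is the arithmetic, under the assumption (the definition (8.3.3) is not reproduced in this section) that $\mu_0$ is the infimum of $\int_{\Sigma_r} u\, L_{\Sigma_r} u \, d\mu$ over functions $u$ with zero mean and unit $L^2$ norm:

```latex
\begin{aligned}
\mu_0 &\ge \lambda_{\mathrm{Lap}} - \sup_{\Sigma_r} \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \\
  &\ge \left( \frac{2}{r^2} - \frac{4m}{r^3} - C r^{-4} \right)
     - \left( \frac{2}{r^2} - \frac{10m}{r^3} + C r^{-4} \right)
   = \frac{6m}{r^3} - C r^{-4}.
\end{aligned}
```

The first inequality is the Poincaré inequality $\int |\nabla u|^2 \ge \lambda_{\mathrm{Lap}} \int u^2$ for mean-zero $u$; the constant $C$ changes from line to line.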

8.5.2. Invertibility. To show that the stability operator is invertible, we analyze the eigenvalues of the operator.
Theorem 8-27 (cf. Huisken–Yau [126, Theorem 4.1]). Let $(M, g)$ be asymptotic to Schwarzschild of mass $m$. Suppose $\{\Sigma_r\}$ is a family of surfaces satisfying the properties (⋆⋆). Let $\lambda_0$ and $\lambda_1$ be the lowest and next lowest eigenvalues of the stability operator $L_{\Sigma_r}$ on $\Sigma_r$, respectively. Then there exists $C > 0$, depending only on $g$, such that
\[
\lambda_0 = -\frac{2}{r^2} + \frac{10m}{r^3} + O(r^{-4}) \quad \text{and} \quad \lambda_1 \ge \frac{6m}{r^3} - C r^{-4}.
\]
As a consequence, if $m > 0$, there exists $r_0$ sufficiently large such that for each $r \ge r_0$, the stability operator $L_{\Sigma_r} : C^{2,\alpha}(\Sigma_r) \to C^{0,\alpha}(\Sigma_r)$ is a linear isomorphism.
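Proof. Let $w$ be an eigenfunction for $\lambda_0$, so that
\[
L_{\Sigma_r} w = -\Delta w - \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) w = \lambda_0 w. \tag{8.5.1}
\]
Multiplying by $w$ and integrating over $\Sigma_r$, we have
\[
\begin{aligned}
\lambda_0 \int_{\Sigma_r} w^2 \, d\mu
&= \int_{\Sigma_r} \Big( |\nabla w|^2 - \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) w^2 \Big) d\mu \\
&\ge \left( -\frac{2}{r^2} + \frac{10m}{r^3} - C r^{-4} \right) \int_{\Sigma_r} w^2 \, d\mu,
\end{aligned}
\]
where we used $|\nabla w| \ge 0$ and applied (⋆⋆) to the term involving $|A|^2 + \mathrm{Ric}(\nu, \nu)$. On the other hand, using a constant function in the Rayleigh quotient yields
\[
\lambda_0 \le -\frac{2}{r^2} + \frac{10m}{r^3} + C r^{-4}.
\]
Thus we have shown that
\[
\lambda_0 = -\frac{2}{r^2} + \frac{10m}{r^3} + O(r^{-4}). \tag{8.5.2}
\]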
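The function $w$ is almost constant in the $L^2$ sense. Indeed, let $\bar w = |\Sigma_r|^{-1} \int_{\Sigma_r} w \, d\mu$ denote the mean value of $w$. Multiply (8.5.1) by $(w - \bar w)$ and integrate to obtain
\[
\int_{\Sigma_r} |\nabla (w - \bar w)|^2 \, d\mu = \int_{\Sigma_r} \big( \lambda_0 + |A|^2 + \mathrm{Ric}(\nu, \nu) \big) (w - \bar w)^2 \, d\mu + \int_{\Sigma_r} \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \bar w (w - \bar w) \, d\mu.
\]
Using the estimates of $\lambda_{\mathrm{Lap}}$, $\lambda_0$ and the properties (⋆⋆), we see
\[
\left( \frac{2}{r^2} - C r^{-3} \right) \int_{\Sigma_r} |w - \bar w|^2 \, d\mu \le C r^{-4} \int_{\Sigma_r} \big( |w - \bar w|^2 + |\bar w| |w - \bar w| \big) \, d\mu.
\]
From the elementary inequality $|\bar w| |w - \bar w| \le \epsilon r^2 |w - \bar w|^2 + C(\epsilon) r^{-2} \bar w^2$, we obtain
\[
\|w - \bar w\|_{L^2(\Sigma_r)} \le C r^{-2} |\bar w| |\Sigma_r|^{\frac12}. \tag{8.5.3}
\]
In particular, if $w$ is not constant, then $\bar w \neq 0$.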
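Let $u$ be an eigenfunction with respect to $\lambda_1$. Then
\[
L_{\Sigma_r}(u - \bar u) = \lambda_1 (u - \bar u) + \big( \lambda_1 + |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \bar u.
\]
We multiply the above identity by $(u - \bar u)$ and integrate over $\Sigma_r$. Since $u - \bar u$ has zero mean value, we apply Theorem 8-26 and obtain
\[
\begin{aligned}
\left( \frac{6m}{r^3} - C r^{-4} \right) \int_{\Sigma_r} |u - \bar u|^2 \, d\mu
&\le \int_{\Sigma_r} (u - \bar u) L_{\Sigma_r}(u - \bar u) \, d\mu \\
&= \lambda_1 \int_{\Sigma_r} (u - \bar u)^2 \, d\mu + \int_{\Sigma_r} \big( \lambda_1 + |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \bar u (u - \bar u) \, d\mu \\
&\le \lambda_1 \int_{\Sigma_r} (u - \bar u)^2 \, d\mu + C r^{-4} \int_{\Sigma_r} |\bar u| |u - \bar u| \, d\mu \\
&\le \lambda_1 \int_{\Sigma_r} (u - \bar u)^2 \, d\mu + C r^{-4} |\bar u| |\Sigma_r|^{1/2} \|u - \bar u\|_{L^2}.
\end{aligned}
\]
To estimate the last term on the right, we note that
\[
0 = \int_{\Sigma_r} u w \, d\mu = \int_{\Sigma_r} (u - \bar u)(w - \bar w) \, d\mu + \int_{\Sigma_r} u \bar w \, d\mu.
\]
By the Hölder inequality and the $L^2$ bound of $(w - \bar w)$ in (8.5.3),
\[
|\bar u| |\Sigma_r| = \left| \int_{\Sigma_r} u \, d\mu \right| \le |\bar w|^{-1} \|u - \bar u\|_{L^2} \|w - \bar w\|_{L^2} \le C r^{-2} |\Sigma_r|^{1/2} \|u - \bar u\|_{L^2}.
\]
Putting these inequalities together, we have
\[
\left( \frac{6m}{r^3} - C r^{-4} \right) \|u - \bar u\|_{L^2}^2 \le \lambda_1 \|u - \bar u\|_{L^2}^2 + C r^{-6} \|u - \bar u\|_{L^2}^2,
\]
which shows that
\[
\lambda_1 \ge \frac{6m}{r^3} - C r^{-4}.
\]
This implies that if $m > 0$, for $r$ sufficiently large, $L_{\Sigma_r} : C^{2,\alpha}(\Sigma_r) \to C^{0,\alpha}(\Sigma_r)$ is injective. By the Fredholm alternative, $L_{\Sigma_r}$ is surjective. Hence, it is a linear isomorphism. □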

8.5.3. Foliations. Let $(M, g)$ be asymptotic to Schwarzschild of mass $m > 0$. Let $\Sigma_r$ be a surface in the family $\{\Sigma_r\}$ that satisfies the properties (⋆) and (⋆⋆). As before, we define the mean curvature operator $H : C^{2,\alpha}(\Sigma_r) \to C^{0,\alpha}(\Sigma_r)$ to be the differential operator that maps $\phi$ to the mean curvature of the normal graph $\{x + \phi\nu : x \in \Sigma_r\}$. Theorem 8-27 says that the linearized operator $dH = L_{\Sigma_r}$ is a linear isomorphism for $r$ sufficiently large. We now recall:

Theorem 8-28 (inverse function theorem). Let $E$ and $F$ be Banach spaces, and let $U$ be an open subset of $E$. Suppose $f : U \subset E \to F$ is of class $C^k$, $k \ge 1$. Let $x_0 \in U$. Suppose that $Df(x_0)$ is a linear isomorphism. Then $f$ is a $C^k$ diffeomorphism of some neighborhood of $x_0$ onto some neighborhood of $f(x_0)$.

Fix $r$ such that $dH$ is a linear isomorphism on the surface $\Sigma_r$ of constant mean curvature $h_0$. The inverse function theorem implies that there exists $\epsilon > 0$ such that for each constant $h \in (h_0 - \epsilon, h_0 + \epsilon)$, there is a unique normal graph over $\Sigma_r$ that has constant mean curvature $h$. This also implies that we can define a differentiable deformation $F : \Sigma_r \times (h_0 - \epsilon, h_0 + \epsilon) \to M$ by sending $(\Sigma_r, h)$ to the unique normal graph over $\Sigma_r$ that has constant mean curvature $h$. Let $H(h) = h$ denote the mean curvature of the normal graph $F(\Sigma_r, h)$. Since each surface has constant mean curvature, only the normal component of $\partial F/\partial h$ contributes to the evolution of $H(h)$. Thus,
\[
1 = \frac{d}{dh}\Big|_{h = h_0} H(h) = L_{\Sigma_r} \phi, \tag{8.5.4}
\]
where $\phi = g\big( (\partial F/\partial h)|_{h = h_0}, \nu \big)$. In the following we show that $\phi$ has a sign, from which it follows that members of the family of constant mean curvature surfaces do not intersect, and in fact form a foliation.
We recall below a standard application of Moser iteration and include the
proof since our setting is slightly different from [107, Theorem 8.17].

Proposition 8-29. Let $(\Sigma, g)$ be a two-dimensional closed Riemannian manifold. Let $v$ be a $C^2$ solution to
\[
-\Delta_\Sigma v - Q v = f, \tag{8.5.5}
\]
where $Q$ and $f$ are in $L^\infty(\Sigma)$ and $Q \ge 0$. Then, for any $p_0 \ge 2$,
\[
\sup_\Sigma |v| \le C_0 \big( \|v\|_{L^{p_0}(\Sigma)} + k \big),
\]
where $k = \|f\|_{L^\infty}$ and $C_0$ depends on $p_0$, $\Sigma$, $g$, $\|Q\|_{L^\infty}$.


Proof. Let $v^+ := \max_\Sigma \{v, 0\}$. Note $v^+ \in W^{1,2}(\Sigma)$ and
\[
\nabla v^+ =
\begin{cases}
\nabla v & \text{if } v > 0, \\
0 & \text{if } v \le 0.
\end{cases}
\]
Let $w = v^+ + k$, where $k$ is defined as in the proposition. For any real number $p \ge 1$, we multiply the differential equation (8.5.5) by $w^p$ and integrate over $\Sigma$:
\[
\begin{aligned}
\int_\Sigma |\nabla (w^{\frac{p+1}{2}})|^2 \, d\mu
&= \frac{(p + 1)^2}{4p} \int_\Sigma (Q v w^p + f w^p) \, d\mu \\
&\le p \int_\Sigma (Q + 1) w^{p+1} \, d\mu \\
&\le p \max_\Sigma (Q + 1) \int_\Sigma w^{p+1} \, d\mu,
\end{aligned}
\tag{8.5.6}
\]
where in the first inequality we used the assumption that $Q \ge 0$. This computation establishes an upper bound of the $W^{1,2}$-norm of $w^{(p+1)/2}$ by purely the $L^2$-norm of $w^{(p+1)/2}$. To begin the iteration procedure, we need to relate the higher-order $L^q$-norms to the $W^{1,2}$-norm. For manifolds of higher dimensions, the standard procedure is to apply the Gagliardo–Nirenberg–Sobolev inequality: for $1 \le p < n$,
\[
\|u\|_{L^{p^*}} \le C_0 \|u\|_{W^{1,p}},
\]
where $p^* = \frac{np}{n - p}$ and $n$ is the dimension of the manifold. However, $n = 2$ in our case and $p = 2$ is the borderline case of the Gagliardo–Nirenberg–Sobolev inequality, so we use another inequality specifically for a two-dimensional manifold $\Sigma$ [204, p. 193]: for any $1 \le q < \infty$,
\[
\|u\|_{L^q(\Sigma)} \le C_0\, q\, \|u\|_{W^{1,2}(\Sigma)}, \tag{8.5.7}
\]
where $C_0$ depends on $(\Sigma, g)$.
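The first line of (8.5.6) comes from testing (8.5.5) against $w^p$ and integrating by parts. A sketch, using that $\nabla w = \nabla v^+$ and that $\nabla v \cdot \nabla w = |\nabla w|^2$ almost everywhere:

```latex
\begin{aligned}
\int_\Sigma (-\Delta_\Sigma v)\, w^p \, d\mu
  &= \int_\Sigma \nabla v \cdot \nabla (w^p) \, d\mu
   = p \int_\Sigma w^{p-1} |\nabla w|^2 \, d\mu
   = \frac{4p}{(p + 1)^2} \int_\Sigma \big| \nabla \big( w^{\frac{p+1}{2}} \big) \big|^2 \, d\mu, \\
\text{since} \quad \nabla \big( w^{\frac{p+1}{2}} \big) &= \frac{p + 1}{2}\, w^{\frac{p-1}{2}}\, \nabla w . \\
\text{Moreover} \quad \frac{(p + 1)^2}{4p} &\le p \ \ (p \ge 1), \qquad
Q v w^p \le Q w^{p+1}, \qquad f w^p \le k w^p \le w^{p+1}.
\end{aligned}
```

The last line uses $v \le v^+ \le w$, $Q \ge 0$, and $w \ge k$, and accounts for the first inequality in (8.5.6).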
Apply (8.5.7) with $u = w^{(p+1)/2}$, and let $q = 2\kappa > 2$ be a fixed real number. Together with (8.5.6) and enlarging $C_0$ if necessary, we obtain, for any $1 \le p < \infty$,
\[
\|w\|_{L^{(p+1)\kappa}} \le C_0^{\frac{1}{p+1}} (p + 1)^{\frac{1}{p+1}} \|w\|_{L^{p+1}}, \tag{8.5.8}
\]
where $C_0$ depends on $\kappa$, $\Sigma$, $g$, $\|Q\|_{L^\infty}$. Now we define a sequence of numbers
\[
p_0 = p + 1, \qquad p_i = p_{i-1} \kappa = (p + 1) \kappa^i, \quad i = 1, 2, \ldots.
\]
The estimate (8.5.8) implies that
\[
\|w\|_{L^{p_{i+1}}} \le C_0^{\sum_{j=0}^{i} \frac{1}{p_j}} \prod_{j=0}^{i} p_j^{\frac{1}{p_j}}\, \|w\|_{L^{p_0}}.
\]
As $i$ tends to infinity, $\|w\|_{L^{p_{i+1}}}$ converges to $\|w\|_{L^\infty}$ and the factor multiplying $\|w\|_{L^{p_0}}$ converges because $\kappa > 1$. This implies that, for $\kappa$ fixed and for any $p_0 \ge 2$, and enlarging $C_0$ from one inequality to the next if necessary,
\[
v \le \sup_\Sigma w \le C_0 \|w\|_{L^{p_0}} \le C_0 \big( \|v^+\|_{L^{p_0}} + k \big) \le C_0 \big( \|v\|_{L^{p_0}} + k \big),
\]
where $C_0$ depends on $p_0$, $\Sigma$, $g$, $\|Q\|_{L^\infty}$. Replacing $v$ with $-v$ yields
\[
-v \le C_0 \big( \|v\|_{L^{p_0}} + k \big).
\]
The desired estimate follows. □
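The convergence of the constants in the iteration is a geometric-series computation. A short sketch with $p_j = p_0 \kappa^j$ and $\kappa > 1$:

```latex
\begin{aligned}
\sum_{j=0}^{\infty} \frac{1}{p_j}
  &= \frac{1}{p_0} \sum_{j=0}^{\infty} \kappa^{-j}
   = \frac{1}{p_0} \cdot \frac{\kappa}{\kappa - 1}, \\
\log \prod_{j=0}^{\infty} p_j^{1/p_j}
  &= \sum_{j=0}^{\infty} \frac{\log p_0 + j \log \kappa}{p_0\, \kappa^{j}}
   = \frac{\log p_0}{p_0} \cdot \frac{\kappa}{\kappa - 1}
     + \frac{\log \kappa}{p_0} \cdot \frac{\kappa}{(\kappa - 1)^2} < \infty .
\end{aligned}
```

Both the exponent of $C_0$ and the infinite product are therefore finite, with bounds depending only on $p_0$ and $\kappa$.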


We now use a scaling argument to factor out the dependence of the constant $C_0$ in Proposition 8-29 from the family of surfaces $\{\Sigma_r\}$.
Proposition 8-30. Let $(M, g)$ be asymptotic to Schwarzschild of mass $m$. Suppose $\{\Sigma_r\}$ is a family of surfaces satisfying the property (⋆). For each $r$, let $v$ be a $C^2$ solution to
\[
-\Delta_{\Sigma_r} v - Q v = f,
\]
where $Q$ and $f$ are in $L^\infty(\Sigma_r)$ and $Q \ge 0$. Then, for any $p_0 \ge 2$,
\[
\sup_{\Sigma_r} |v| \le C_0 \big( r^{-\frac{2}{p_0}} \|v\|_{L^{p_0}(\Sigma_r)} + k \big),
\]
where $k = \|f\|_{L^\infty}$ and $C_0$ depends on $g$, $p_0$, $\|Q\|_{L^\infty}$ (but is independent of $r$).

Proof. The property (⋆) gives a family of smooth diffeomorphisms $F_r : S^2 \to \Sigma_r$ such that $|dF_r - r\, \mathrm{Id}| = O_2(1)$, where $\mathrm{Id}$ is the identification map from $TS^2$ to $T\Sigma_r$. This implies the pullback metric satisfies
\[
\|F_r^* g_{\Sigma_r} - g_{S^2}\|_{C^2} \le C,
\]
where $C$ depends only on $g$. Considering the pullback of the differential equation for $v \circ F_r$ and applying Proposition 8-29 on the fixed geometry $(S^2, g_{S^2})$, we obtain
\[
\sup_{S^2} |v \circ F_r| \le C_0 \big( \|v \circ F_r\|_{L^{p_0}(S^2)} + k \big),
\]
where $C_0$ depends on $g$, $p_0$, $k = \|f \circ F_r\|_{L^\infty(S^2)}$, and $Q$. Using the area formula for the $L^{p_0}$-norm, we have the desired estimate. □
Theorem 8-31. Let $(M, g)$ be asymptotic to Schwarzschild of mass $m > 0$. Suppose $\{\Sigma_r\}$ is a family of surfaces satisfying the properties (⋆) and (⋆⋆). Let $u \in C^{2,\alpha}(\Sigma_r)$ satisfy
\[
L_{\Sigma_r} u := -\Delta_{\Sigma_r} u - \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) u = c,
\]
for some constant $c$. Then there exists $r_0$ large enough so that, for each $r \ge r_0$,
\[
\sup_{\Sigma_r} |u - \bar u| \le C r^{-1} |\bar u|,
\]
where $C$ depends only on $g$. As a consequence, for $r_0$ sufficiently large, the solution $u$ is either positive or negative for each $r \ge r_0$.
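Proof. Note that $(u - \bar u)$ satisfies the equation $L_{\Sigma_r}(u - \bar u) = c + (|A|^2 + \mathrm{Ric}(\nu, \nu)) \bar u$. By Theorem 8-26 and the estimate on $|A|^2 + \mathrm{Ric}(\nu, \nu)$ from (⋆⋆), we obtain
\[
\begin{aligned}
\left( \frac{6m}{r^3} - \frac{C}{r^4} \right) \int_{\Sigma_r} |u - \bar u|^2 \, d\mu
&\le \int_{\Sigma_r} (u - \bar u) L_{\Sigma_r}(u - \bar u) \, d\mu \\
&= \int_{\Sigma_r} \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \bar u (u - \bar u) \, d\mu \\
&\le \frac{C}{r^4} \int_{\Sigma_r} |\bar u| |u - \bar u| \, d\mu \\
&\le \frac{C}{r^4} \left( \int_{\Sigma_r} |u - \bar u|^2 \, d\mu \right)^{\frac12} \left( \int_{\Sigma_r} |\bar u|^2 \, d\mu \right)^{\frac12}.
\end{aligned}
\]
This implies
\[
\|u - \bar u\|_{L^2(\Sigma_r)} \le C m^{-1} r^{-1} |\bar u| |\Sigma_r|^{\frac12}.
\]
By Proposition 8-30,
\[
\sup_{\Sigma_r} |u - \bar u| \le C \big( r^{-1} \|u - \bar u\|_{L^2(\Sigma_r)} + k \big),
\]
where $k = \max_{\Sigma_r} \big| c + (|A|^2 + \mathrm{Ric}(\nu, \nu)) \bar u \big|$ and the constant $C$ depends on $\sup_{\Sigma_r} (|A|^2 + \mathrm{Ric}(\nu, \nu))$. To estimate $c$, we integrate $L_{\Sigma_r} u = c$ over $\Sigma_r$ and use (⋆⋆) to obtain
\[
\begin{aligned}
|\Sigma_r| |c| &\le \left| \int_{\Sigma_r} \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big)(u - \bar u) \, d\mu \right| + \left| \int_{\Sigma_r} \big( |A|^2 + \mathrm{Ric}(\nu, \nu) \big) \bar u \, d\mu \right| \\
&\le C \big( r^{-4} \|u - \bar u\|_{L^2} |\Sigma_r|^{\frac12} + |\bar u| \big).
\end{aligned}
\]
By the above estimates and the properties (⋆⋆), we have
\[
\sup_{\Sigma_r} |u - \bar u| \le C r^{-1} |\bar u|. \qquad \square
\]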
Applying Theorem 8-31 to (8.5.4), we obtain the following result.

Corollary 8-32. Let $(M, g)$ be asymptotic to Schwarzschild of mass $m > 0$. Suppose $\{\Sigma_r\}$ is a family of surfaces satisfying the properties (⋆) and (⋆⋆). Then there exists $r_0 > 0$ such that the family of surfaces $\{\Sigma_r\}$ for $r \ge r_0$ forms a foliation.

8.6. Density theorems

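8.6.1. Weighted Sobolev spaces. We introduce a topology on the space of asymptotically flat initial data sets using the following weighted norm. Let $B$ be a ball in $\mathbb{R}^n$ centered at the origin. For $k \in \{0, 1, \ldots\}$, $p \ge 0$, and $q \in \mathbb{R}$, we define the weighted Sobolev space $W^{k,p}_{-q}(\mathbb{R}^n \setminus B)$ to be the set of functions $f \in W^{k,p}_{\mathrm{loc}}(\mathbb{R}^n \setminus B)$ with
\[
\|f\|_{W^{k,p}_{-q}(\mathbb{R}^n \setminus B)} := \left( \int_{\mathbb{R}^n \setminus B} \Big( \sum_{|I| \le k} |\partial^I f(x)|\, |x|^{|I| + q} \Big)^p |x|^{-n} \, dx \right)^{\frac1p} < \infty.
\]
When $p = \infty$,
\[
\|f\|_{W^{k,\infty}_{-q}(\mathbb{R}^n \setminus B)} := \sum_{|I| \le k} \operatorname*{ess\,sup}_{\mathbb{R}^n \setminus B} |\partial^I f|\, |x|^{|I| + q}.
\]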
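As an illustration of how the weight $q$ encodes decay, consider the model function $f(x) = |x|^{-s}$ on $\mathbb{R}^n \setminus B$ with $s > 0$ and $1 \le p < \infty$ (an example chosen here, not taken from the text). Each derivative satisfies $|\partial^I f(x)| \le C_I |x|^{-s - |I|}$, so in polar coordinates, with $\rho_0$ the radius of $B$ and $\omega_{n-1}$ the area of the unit sphere (both names introduced for this sketch):

```latex
\begin{aligned}
\|f\|_{W^{k,p}_{-q}(\mathbb{R}^n \setminus B)}^p
  &\le C \int_{|x| \ge \rho_0} |x|^{(q - s)p}\, |x|^{-n} \, dx
   = C\, \omega_{n-1} \int_{\rho_0}^{\infty} \rho^{(q - s)p - 1} \, d\rho, \\
\int_{\rho_0}^{\infty} \rho^{(q - s)p - 1} \, d\rho &< \infty
  \quad \Longleftrightarrow \quad s > q .
\end{aligned}
```

So $|x|^{-s}$ has finite norm when $s > q$, and the $|I| = 0$ term alone already diverges when $s \le q$: membership in $W^{k,p}_{-q}$ expresses decay slightly faster than $|x|^{-q}$, with each derivative gaining one power of $|x|^{-1}$.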

Suppose $M$ is a smooth manifold such that there is a compact set $K \subset M$ and a diffeomorphism $M \setminus K \cong \mathbb{R}^n \setminus B$. Choose an atlas for $M$ that consists of the diffeomorphism $M \setminus K \cong \mathbb{R}^n \setminus B$ and finitely many precompact charts on $K$. We define the $W^{k,p}_{-q}(M)$ norm on $M$ by summing over the $W^{k,p}_{-q}$ norm on the noncompact chart and the $W^{k,p}$ norm on the precompact charts. The definition extends to the tensor bundles of $M$ by considering the components with respect to these charts, and can also easily extend to an asymptotically flat manifold with a finite number of ends. We sometimes write $W^{k,p}_{-q}$ for $W^{k,p}_{-q}(M)$.
It is known that the ADM energy and linear momentum are continuous functions with respect to the appropriate weighted Sobolev topology.

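Theorem 8-33. Let $p > n \ge 3$, $q \in \big( \frac{n-2}{2}, n - 2 \big)$, $q_0 > 0$. Let $(g, k)$ and $(\bar g, \bar k)$ be $C^2_{\mathrm{loc}} \times C^1_{\mathrm{loc}}$ asymptotically flat initial data sets such that
\[
(g - g_0, k),\ (\bar g - g_0, \bar k) \in W^{2,p}_{-q} \times W^{1,p}_{-1-q},
\]
where $g_0$ is a smooth symmetric $(0, 2)$ tensor that coincides with $g_E$ on $M \setminus K$, and such that
\[
\mu, J, \bar\mu, \bar J \in W^{0,p}_{-n-q_0}.
\]
Let $\epsilon > 0$. There exists $\delta > 0$ such that if
\[
\|g - \bar g\|_{W^{2,p}_{-q}} \le \delta \quad \text{and} \quad \|k - \bar k\|_{W^{1,p}_{-1-q}} \le \delta,
\]
then
\[
|E - \bar E| < \epsilon \quad \text{and} \quad |P - \bar P| < \epsilon.
\]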
The proof of this fact goes back to [202, p. 50] for E only and to [69, p.
198] in the vacuum case. The proof of the general case can be found in [119,
Proposition 2.4] and [78, Proposition 19].
On the other hand, the BORT center of mass and the ADM angular momentum
may not be defined in general for asymptotically flat initial data sets since
the integrals (8.1.1) and (8.1.2) may diverge [117; 47; 43; 46]. In fact, the
BORT center of mass and the ADM angular momentum are discontinuous
with respect to the above topology (see Theorem 8-39 below). Nevertheless, if we
consider a topology that incorporates the Regge–Teitelboim conditions, we have
an analogous continuity result for the center of mass and angular momentum.
We let f odd (x) = ( f (x) − f (−x))/2 and f even (x) = ( f (x) + f (−x))/2 with
respect to an asymptotically flat coordinate chart.
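Theorem 8-34 (cf. Huang [119, Proposition 2.4], [116, Theorem 2.2]). Let $p > n \ge 3$, $q \in \big( \frac{n-2}{2}, n - 2 \big)$, $q_0 > 0$. Let $(g, k)$, $(\bar g, \bar k)$, $(\mu, J)$, $(\bar\mu, \bar J)$ satisfy the assumptions in Theorem 8-33. Suppose they also satisfy
\[
(g^{\mathrm{odd}}_{ij}, k^{\mathrm{even}}_{ij}),\ (\bar g^{\mathrm{odd}}_{ij}, \bar k^{\mathrm{even}}_{ij}) \in W^{2,p}_{-1-q}(M \setminus K) \times W^{1,p}_{-2-q}(M \setminus K)
\]
and
\[
\mu^{\mathrm{odd}}, J^{\mathrm{odd}}_i, \bar\mu^{\mathrm{odd}}, \bar J^{\mathrm{odd}}_i \in W^{0,p}_{-n-q_0-1}(M \setminus K).
\]
Let $\epsilon > 0$. There exists $\delta > 0$ such that if
\[
\|g^{\mathrm{odd}} - \bar g^{\mathrm{odd}}\|_{W^{2,p}_{-1-q}(M \setminus K)} \le \delta \quad \text{and} \quad \|k^{\mathrm{even}} - \bar k^{\mathrm{even}}\|_{W^{1,p}_{-2-q}(M \setminus K)} \le \delta,
\]
then
\[
|C_{\mathrm{BORT}} - \bar C_{\mathrm{BORT}}| < \epsilon \quad \text{and} \quad |J - \bar J| < \epsilon.
\]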

8.6.2. Scalar curvature equation8 . We discuss a density result for the scalar
curvature equation due to Schoen and Yau. The density argument is used in
the proof of the Riemannian positive mass theorem and enables them to reduce
the case of the general asymptotically flat metrics to the case that the metrics
are scalar flat and conformally flat at infinity. In what follows, we consider an
asymptotically flat manifold M of dimension n ≥ 3.
8 See p. 280–284 for more details on the results in this section, which we include in order to
keep the chapter self-contained.

Theorem 8-35 (Schoen–Yau [202]). Let $(M, g)$ be an $n$-dimensional asymptotically
flat initial data set with nonnegative scalar curvature and the ADM
mass $m$. Given $\epsilon > 0$, there exists an asymptotically flat metric $\bar g$ with zero scalar
curvature such that, outside a compact set of $M$, the metric has the form
\[ \bar g_{ij} = u^{\frac{4}{n-2}} \delta_{ij} \]
with $u = 1 + \frac{\bar m}{2} r^{2-n} + O(r^{1-n})$, where $\bar m$ is the ADM mass of $\bar g$ and
$\bar m \le m + \epsilon$.
For the proof of the theorem, we establish the following lemma. The analysis
can be carried out with q0 > 0, with an error term that will in general have
slightly weaker decay; see [78, Proposition 24]. This remark applies also to
Proposition 8-37 below.
Lemma 8-36 (Schoen–Yau [199, Lemma 3.3]). Let $(M, g)$ be an $n$-dimensional
asymptotically flat initial data set with the ADM mass $m$. Suppose the scalar
curvature $R(g) \ge 0$ is positive somewhere, and $R(g) = O(|x|^{-n-q_0})$ for some
$q_0 > 1$. Then there exist a constant $A < 0$ and a unique metric $\bar g = u^{4/(n-2)} g$
with zero scalar curvature such that
\[ u = 1 + \frac{A}{2} r^{2-n} + O(r^{1-n}). \]
Furthermore, the ADM mass $\bar m$ of $\bar g$ satisfies $\bar m = m + A < m$.
Proof. Let $\bar g = u^{4/(n-2)} g$. The scalar curvatures of $g$ and $\bar g$ are related by
\[ R(\bar g) = u^{-\frac{n+2}{n-2}} \left( R(g)u - \frac{4(n-1)}{n-2} \Delta_g u \right). \]
Denote the conformal Laplace operator by $L = \Delta_g - \frac{n-2}{4(n-1)} R(g)$. Let $p > n$,
$q \in \left(\frac{n-2}{2}, n-2\right)$. Then $L : W^{2,p}_{-q} \to W^{0,p}_{-2-q}$ is a Fredholm operator of index zero,
by [15, Proposition 1.14]. To find a solution to the inhomogeneous equation
$Lv = f$ for $f \in W^{0,p}_{-2-q}$, it suffices to prove that $L$ has a trivial kernel by the
Fredholm alternative. Let $v \in W^{2,p}_{-q}$ satisfy $Lv = 0$. Multiplying the equation
$Lv = 0$ by $v$ and applying the divergence theorem, we have
\[ 0 \le \int_M |\nabla v|^2 \, d\sigma = -\frac{n-2}{4(n-1)} \int_M R(g) v^2 \, d\sigma + \lim_{r \to \infty} \int_{\{|x|=r\}} v \frac{\partial v}{\partial \nu} \, d\mu = -\frac{n-2}{4(n-1)} \int_M R(g) v^2 \, d\sigma \le 0, \]
where $d\sigma$ is the volume measure of $(M, g)$ and we have used the fall-off rates
of $v$ and $\partial v$ to compute the boundary term. We conclude that $v \equiv 0$. Therefore
there is a unique solution to $Lv = \frac{n-2}{4(n-1)} R(g)$. Let $u = v + 1$. Then $u$ satisfies
$Lu = 0$; moreover, $u > 0$ everywhere by the strong maximum principle. Thus
$\bar g = u^{4/(n-2)} g$ is the desired metric. The asymptotic expansion of $u$ follows from
[15, Theorem 1.17]; compare [78, Proposition 24].
To show that $A < 0$, we integrate $Lu = 0$ over a large ball and apply the
divergence theorem:
\[ 0 < \lim_{r \to \infty} \frac{n-2}{4(n-1)} \int_{\{|x| \le r\}} R(g) u \, d\sigma = \lim_{r \to \infty} \int_{\{|x|=r\}} \frac{\partial u}{\partial \nu} \, d\mu = \lim_{r \to \infty} \int_{\{|x|=r\}} -\frac{(n-2)A}{2} r^{1-n} \, d\mu = -\frac{n-2}{2} \omega_{n-1} A. \qquad \square \]
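The boundary computation at the end of the proof can be checked numerically. The following is a minimal sketch, assuming we keep only the model expansion $u = 1 + \frac{A}{2} r^{2-n}$ (the $O(r^{1-n})$ remainder is dropped) and approximate $\partial u/\partial r$ by a central difference; the function names are ours and purely illustrative.

```python
import math

def sphere_area(n, r):
    """Area of the Euclidean sphere {|x| = r} in R^n, i.e. omega_{n-1} * r^(n-1)."""
    omega = 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)
    return omega * r ** (n - 1)

def flux(n, A, r, h=1e-6):
    """Flux of du/dr through {|x| = r} for the model u = 1 + (A/2) r^(2-n)."""
    u = lambda s: 1.0 + 0.5 * A * s ** (2 - n)
    du_dr = (u(r + h) - u(r - h)) / (2.0 * h)  # central difference in r
    return du_dr * sphere_area(n, r)

def expected_flux(n, A):
    """Closed form -(n-2)/2 * omega_{n-1} * A appearing in the proof."""
    return -0.5 * (n - 2) * sphere_area(n, 1.0) * A

print(flux(3, -0.7, 5.0), expected_flux(3, -0.7))
```

For the model function the flux is independent of $r$, since $r^{2-n}$ is harmonic away from the origin, and it is positive exactly when $A < 0$, which is the sign used in the lemma.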
Proof of Theorem 8-35. By Lemma 8-36, we may assume that $g$ has zero scalar
curvature. For $\lambda \ge 1$ large, we define the cutoff metric
\[ \hat g_\lambda := \chi_\lambda g + (1 - \chi_\lambda) g_E, \]
where $\chi_\lambda(x) = \chi(x/\lambda)$ and $\chi$ is a smooth cutoff function on $\mathbb{R}^n$ that is 1 on
$\{|x| \le 1\}$ and 0 on $\{|x| \ge 2\}$. Note that the cutoff metric has zero scalar curvature
everywhere except the interpolating region $\lambda \le |x| \le 2\lambda$ and $R(\hat g_\lambda) = O(\lambda^{-n})$
there. We would like to find a metric $\bar g = u^{4/(n-2)} \hat g_\lambda$ with zero scalar curvature in
the conformal class of $\hat g_\lambda$. By the transformation formula of the scalar curvature,
it suffices to find a positive function $u$ that tends to 1 at infinity and satisfies
\[ \Delta_{\hat g_\lambda} u - \frac{n-2}{4(n-1)} R(\hat g_\lambda) u = 0. \]
Note that $R(\hat g_\lambda)$ may not be nonnegative everywhere, so the proof of Lemma 8-36
cannot be applied. The approach of Schoen and Yau relies on a Sobolev inequality
and requires $\|R(\hat g_\lambda)_-\|_{L^{n/2}(M)}$ to be sufficiently small, which is achieved by choosing
$\lambda$ sufficiently large. We refer to [199, Lemma 3.2] for the details.
The last statement that $\bar m \le m + \epsilon$ follows from the continuity of the ADM
mass by Theorem 8-33. □

8.6.3. Einstein constraint equations. Let $(M, g, k)$ be an initial data set. Define
the momentum tensor
\[ \pi = k - (\operatorname{tr}_g k) g. \]
It can be convenient to express initial data in terms of $\pi$ rather than $k$. We refer
to $(M, g, \pi)$ as an initial data set in this section and define the constraint map
\[ \Phi(g, \pi) = (2\mu, J) = \left( R(g) - |\pi|_g^2 + \tfrac{1}{n-1} (\operatorname{tr}_g \pi)^2,\ \operatorname{div}_g \pi \right). \]

We say that $(M, g, \pi)$ has harmonic asymptotics if there exist a smooth
function $u$ and a smooth vector field $X$ such that $u \to 1$, $X \to 0$ at infinity and,
outside a compact set of $M$, we have
\[ g = u^{\frac{4}{n-2}} g_E \quad\text{and}\quad \pi = u^{\frac{2}{n-2}} (\mathcal{L}_{g_E} X), \]
where the operator $\mathcal{L}_g$ is defined by $\mathcal{L}_g X = L_X g - (\operatorname{div}_g X)\, g$ and $L_X g$ is the Lie
derivative. Throughout this section, we denote by $g_0$ a smooth symmetric (0, 2)
tensor on $M$ that coincides with $g_E$ on $M \setminus K$.
The term “harmonic” is justified by the following proposition, which shows that the
leading-order terms of the function $u$ and the vector field $X$ are harmonic.

Proposition 8-37 (Corvino–Schoen [69]; see also [78, Proposition 24]). Let
$p > n$, $q \in \left(\frac{n-2}{2}, n-2\right)$, $q_0 > 1$. Suppose that $(M^n, g, \pi)$ is an asymptotically
flat initial data set that satisfies
\[ (g - g_0, \pi) \in W^{2,p}_{-q}(M) \times W^{1,p}_{-1-q}(M) \]
and
\[ (\mu, J) \in W^{0,p}_{-n-q_0}, \]
such that $(g, \pi)$ has harmonic asymptotics:
\[ g = u^{\frac{4}{n-2}} g_E, \qquad \pi = u^{\frac{2}{n-2}} \mathcal{L}_{g_E} X, \tag{8.6.1} \]
outside a compact set, for some $(u - 1, X) \in W^{2,p}_{-q}$. Then $(u, X)$ admits an
expansion of the form
\[ u(x) = 1 + a|x|^{2-n} + O(|x|^{1-n}), \qquad X^i(x) = b^i |x|^{2-n} + O(|x|^{1-n}), \]
where $X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i}$.
A generalization of Theorem 8-35 to the full constraint equations states that ini-
tial data sets with harmonic asymptotics are dense among general asymptotically
flat initial data sets:

Theorem 8-38 (Corvino–Schoen [69, Theorem 1]). Let $p > n$, $q \in \left(\frac{n-2}{2}, n-2\right)$.
Let $(g, \pi)$ be a vacuum asymptotically flat initial data set satisfying
\[ (g - g_0, \pi) \in W^{2,p}_{-q} \times W^{1,p}_{-1-q}. \]
Let $\epsilon > 0$. There exists a vacuum asymptotically flat initial data set $(\bar g, \bar\pi)$ with
harmonic asymptotics such that
\[ \|g - \bar g\|_{W^{2,p}_{-q}} < \epsilon, \qquad \|\pi - \bar\pi\|_{W^{1,p}_{-1-q}} < \epsilon \]
and
\[ |E - \bar E| < \epsilon, \qquad |P - \bar P| < \epsilon. \]

Proof. For λ ≥ 1 large, define the cutoff initial data

(ĝλ )i j = χλ gi j + (1 − χλ )δi j , π̂λ = χλ π,

where χλ (x) = χ (x/λ) and χ is a smooth cutoff function on Rn such that χ is 1


on {|x| ≤ 1} and 0 on {|x| ≥ 2}. In the following, we suppress the subscript λ
when the context is clear.
The Einstein constraint equations form an underdetermined system: the num-
ber of unknowns is greater than the number of the equations that determine them.
One reason to introduce a function u and a vector field X in the expression of
harmonic asymptotics is to obtain a determined-elliptic system. Let
\[ \tilde g = u^{\frac{4}{n-2}} \hat g, \qquad \tilde\pi = u^{\frac{2}{n-2}} (\hat\pi + \mathcal{L}_{\hat g} X). \]
We would like to find $u$ tending to 1 and $X$ tending to 0 at infinity such that
$(\tilde g, \tilde\pi)$ satisfies the vacuum constraints.
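We define a map $T_{(\hat g, \hat\pi)} : (W^{2,p}_{-q} + 1) \times W^{2,p}_{-q} \to W^{0,p}_{-2-q}$ via the constraint operator
$T_{(\hat g, \hat\pi)}(u, X) = \Phi(\tilde g, \tilde\pi)$, and we define $T_{(g, \pi)} : (W^{2,p}_{-q} + 1) \times W^{2,p}_{-q} \to W^{0,p}_{-2-q}$
analogously; these are both smooth maps. The linearization of $T_{(\hat g, \hat\pi)}$ at $(1, 0)$ is
\[ DT_{(\hat g, \hat\pi)}|_{(1,0)}(v, Z) = \Big( -\tfrac{4}{n-2} \big[ (n-1) \Delta_{\hat g} v + \big( R_{\hat g} - |\hat\pi|_{\hat g}^2 + \tfrac{1}{n-1} (\operatorname{tr}_{\hat g} \hat\pi)^2 \big) v \big] - 4 Z_{k;\ell} \hat\pi^{k\ell} + \tfrac{2}{n-1} \operatorname{tr}_{\hat g} \hat\pi \, \operatorname{div}_{\hat g} Z, \]
\[ \qquad \operatorname{div}_{\hat g} (\mathcal{L}_{\hat g} Z)_j + \tfrac{2(n-1)}{n-2} v_{,k} \hat\pi^k_j - \tfrac{2}{n-2} v_{,j} \operatorname{tr}_{\hat g} \hat\pi - \tfrac{2}{n-2} (\operatorname{div}_{\hat g} \hat\pi)_j v \Big), \]
where indices are raised and covariant derivatives are taken with respect to $\hat g$.
Because $q \in \left(\frac{n-2}{2}, n-2\right)$ and $p > n$, $DT_{(\hat g, \hat\pi)}|_{(1,0)}$ and $DT_{(g, \pi)}|_{(1,0)}$ are Fredholm
operators of index 0 for $\lambda$ sufficiently large [15]. Instead of proving the
linearization has a trivial kernel as in the proof of Theorem 8-35, which seems
difficult for the system, we use the following argument.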
Let $K_1$ be a complementing subspace for $\ker(DT_{(g, \pi)}|_{(1,0)})$ in $W^{2,p}_{-q} \times W^{1,p}_{-1-q}$.
Since $DT_{(g, \pi)}|_{(1,0)}$ is Fredholm and the linearization $D\Phi|_{(g, \pi)} : W^{2,p}_{-q} \times W^{1,p}_{-1-q} \to W^{0,p}_{-2-q}$
is surjective (see [69, Proposition 3.1]; cf. [120, Lemma 2.10]), we can
find smooth compactly supported symmetric (0, 2)-tensors $\{(h_k, w_k)\}_{k=1}^N$ whose
images $\{D\Phi|_{(g, \pi)}(h_k, w_k)\}_{k=1}^N$ form a basis for a complementing subspace of
$\operatorname{ran}(DT_{(g, \pi)}|_{(1,0)})$ in $W^{0,p}_{-2-q}$. Let $K_2 = \operatorname{span}\{(h_k, w_k)\}_{k=1}^N$. For $(u - 1, X) \in K_1$
and $(h, w) \in K_2$, define the maps $\overline{T}_{(\hat g, \hat\pi)}$, $\overline{T}_{(g, \pi)}$ by
\[ \overline{T}_{(\hat g, \hat\pi)}(u, X, h, w) = \Phi\big( u^{\frac{4}{n-2}} \hat g + h,\ u^{\frac{2}{n-2}} (\hat\pi + \mathcal{L}_{\hat g} X) + w \big), \]
\[ \overline{T}_{(g, \pi)}(u, X, h, w) = \Phi\big( u^{\frac{4}{n-2}} g + h,\ u^{\frac{2}{n-2}} (\pi + \mathcal{L}_g X) + w \big). \]

By construction, $D\overline{T}_{(\hat g, \hat\pi)}|_{(1,0,0,0)}$ is an isomorphism for $\lambda$ sufficiently large.
Using that $(\hat g, \hat\pi)$ converges to $(g, \pi)$ in $W^{2,p}_{-q} \times W^{1,p}_{-q-1}$ as $\lambda \to \infty$, it is easy
to see that $D\overline{T}_{(\hat g, \hat\pi)}|_{(u,X,h,w)}$ converges to $D\overline{T}_{(g, \pi)}|_{(u,X,h,w)}$ as $\lambda \to \infty$, locally
uniformly in $(u, X, h, w)$ in the strong operator topology. By the inverse function
theorem, for all $\lambda \ge 1$ sufficiently large, $\overline{T}_{(\hat g, \hat\pi)}$ restricts to a diffeomorphism
defined on an open neighborhood of $(1, 0, 0, 0)$ (independent of $\lambda \ge 1$) and
onto an open neighborhood containing a ball centered at $(0, 0)$ in $W^{0,p}_{-2-q}$. The
preimage $\overline{T}^{-1}_{(\hat g, \hat\pi)}(0, 0)$ gives the desired solution.
Finally, it follows from Theorem 8-33 that

|E − Ē| < ϵ, |P − P̄| < ϵ. □

The vacuum assumption in Theorem 8-38 can be replaced by appropriate


assumptions on (µ, J ). In fact, using a more delicate perturbation argument, one
can prove that if (g, π ) satisfies the dominant energy condition, it is possible to
obtain a strict dominant energy condition for the approximate data (ḡ, π̄) with
harmonic asymptotics [78, Theorem 18]. This fact is used in the proof of the
spacetime positive mass theorem to reduce the general case of the theorem to the
special case of initial data that has harmonic asymptotics with a strict dominant
energy condition [78].

8.6.4. Applications to the center of mass and angular momentum. Generalizing
the proof of Theorem 8-38, we show that one can arbitrarily specify the
BORT center of mass and the ADM angular momentum.

Theorem 8-39 (Huang–Schoen–Wang [122, Theorem 3]). Let $(M, g, \pi)$ be a
three-dimensional vacuum asymptotically flat initial data set satisfying the
Regge–Teitelboim conditions and $E > |P|$. Given any constant vectors $\vec\alpha_0, \vec\gamma_0 \in \mathbb{R}^3$ and
$\epsilon > 0$, there is a vacuum initial data set $(\bar g, \bar\pi)$ satisfying the Regge–Teitelboim
conditions such that
\[ \|g - \bar g\|_{W^{2,p}_{-q}} \le \epsilon, \qquad \|\pi - \bar\pi\|_{W^{1,p}_{-1-q}} \le \epsilon \]
and
\[ \bar E = E, \qquad \bar P = P, \qquad \bar J = J + \vec\alpha_0, \qquad \bar C_{\mathrm{BORT}} = C_{\mathrm{BORT}} + \vec\gamma_0. \]
Analogous to the positive mass conjecture, there is a conjectured inequality
between the ADM energy and angular momentum. It is known that $E \ge |J|$
for axially symmetric asymptotically flat black hole initial data sets ([58; 60; 63;
71; 206; 232]; see also [231]). However, Theorem 8-39 shows this inequality
does not hold in general for asymptotically flat data sets without axial symmetry.
The proof of Theorem 8-39 goes along the same lines as that of Theorem 8-38,
but different cutoff data sets are employed. Let $\sigma, \tau$ be symmetric (0, 2)-tensors
on $\mathbb{R}^3$. Suppose further that $\sigma, \tau$ are compactly supported on $\{1 \le |x| \le 2\}$
satisfying the linearized constraint equations (at the trivial data)
\[ \sum_{i,j} (\sigma_{ij,ij} - \sigma_{ii,jj}) = 0, \qquad \sum_i \tau_{ij,i} = 0 \quad\text{for } j = 1, 2, 3. \]
Consider
\[ \hat g_\lambda = g + \sigma_\lambda \quad\text{and}\quad \hat\pi_\lambda = \pi + \tau_\lambda, \]
where $\sigma_\lambda = \sigma(x/\lambda)$, $\tau_\lambda = \tau(x/\lambda)$. To specify the center of mass and angular
momentum, the proof centers on constructing the tensors $\sigma, \tau$ with certain desired
properties. For the angular momentum, we find $\sigma, \tau$ with components satisfying
$\sigma_{ij}(x) = \sigma_{ij}(-x)$, $\tau_{ij}(x) = \tau_{ij}(-x)$ and so that, for a given $\vec\alpha = (\alpha_1, \alpha_2, \alpha_3) \in \mathbb{R}^3$,
\[ \int_{\{1 \le |x| \le 2\}} \sum_{i,j,l} \Big( \tfrac{1}{2} \tau_{ij,l} Y^l_{(k)} + \tau_{il} (Y^l_{(k)})_{,j} \Big) \sigma^{ij} \, dx = \alpha_k \]
for each $k = 1, 2, 3$, where $Y_{(k)}$ is the rotation vector field $Y_{(k)} = \frac{\partial}{\partial x^k} \times \vec x$.
To specify the center of mass, we construct a divergence-free and trace-free
tensor $\sigma$ such that for a given $\vec\gamma = (\gamma_1, \gamma_2, \gamma_3) \in \mathbb{R}^3$,
\[ \int_{\{1 \le |x| \le 2\}} x_k \sum_{i,j,l} (\sigma_{ij,l})^2 \, dx = \gamma_k, \]
for each $k = 1, 2, 3$. For the construction of those tensors, see Theorems 2.1
and 2.2 of [122].
CHAPTER 9

On the Riemannian Penrose inequality

9.1. Introduction

In 1973 Roger Penrose [180] proposed an inequality which states that, in an
isolated gravitational system with nonnegative local energy density, the total
mass of the system must be at least as much as that contributed by any black
holes contained within. The Penrose inequality takes the form
\[ m \ge \sqrt{\frac{A}{16\pi}}, \tag{9.1.1} \]
where m is the total mass of a spacelike slice of a spacetime and A is the total
area of the black holes within the spacetime, as viewed from the spacelike
slice. Penrose’s original motivation for proposing this inequality is the fact
that a counterexample of this inequality would produce a counterexample for
cosmic censorship, one of the central open questions in relativity, related to the
deterministic character of the theory [178; 180; 181].
The Penrose inequality in full generality remains an open problem. In this chap-
ter we focus on proofs of an important special case of the statement, the so-called
Riemannian Penrose inequality (RPI). This scenario arises when considering the
inequality on time-symmetric spacetimes. (For a discussion of the full Penrose
inequality for general spacetimes, see the survey paper by Marc Mars [150].) In
this case, the Penrose inequality is formulated on an asymptotically flat, three-
dimensional Riemannian manifold (M, g) with nonnegative scalar curvature that
contains “black holes”, i.e., closed minimal surfaces. The inequality retains the
same form as (9.1.1), where m is the ADM mass of (M, g) and A is the area of
the outermost minimal surface in M.
We consider three approaches to the Riemannian Penrose inequality. In 2001
Huisken and Ilmanen [125] proved it using Geroch’s monotonicity formula for
the inverse mean curvature flow. Their proof has an unavoidable limitation: the
term A in (9.1.1) is the area of any single component of the outermost minimal
surface. In the same year, Bray [26] proved the general statement of the RPI
using a conformal flow and the positive mass theorem. (Bray’s proof was later
generalized by Bray and Lee to work up to dimension seven [28].) Finally, Lam
[138] proved the RPI for graphical manifolds, in arbitrary dimensions, in 2010.
Related results in different ambient spaces appear in [95].
In this chapter we present the main ideas of these three proofs of the RPI.
We start with Lam’s proof, which is the simplest; we move on to Huisken and
Ilmanen’s proof, which is the one that builds on Geroch’s original argument; and
we end with Bray’s proof, which gives the most general result to date.

This chapter arose from class notes for the minicourse “On the Penrose Inequality” taught by
Fernando Schwartz in the 2012 MSRI Summer School on Mathematical Relativity and the 2013
Summer School in Cortona, Italy. Brian Allen attended the MSRI course and contributed to the
writing, having carefully typed the contents of the course and improved the exposition, particularly
in Section 9.4.

9.2. Preliminaries

The Riemannian Penrose inequality is an important special case of the general
Penrose inequality. It arises when we consider Cauchy hypersurfaces M that are
time-symmetric, i.e. those for which the second fundamental form of M inside
a spacetime L is identically zero. If we also assume that the Lorentzian space-
time L satisfies the dominant energy condition, a desirable physical hypothesis,
then it can be shown using the Einstein constraint equations that M must have
nonnegative scalar curvature. The Riemannian Penrose inequality deals with
the effects of black holes on the total mass of an isolated system. A seemingly
simple, and closely related, question is to determine the effect of nonnegative
local mass density on the total mass of such an isolated system. It turns out
that this question is quite subtle in itself, and for non-spin manifolds in arbitrary
dimensions remained an open problem whose resolution was announced only
relatively recently [205] (see also [147]). It is known as the positive mass theorem,
the Riemannian version of which (in dimension three) is stated as follows.
Theorem 9-1 (positive mass theorem, PMT). Let (M 3 , g) be an asymptotically
flat Riemannian manifold with nonnegative scalar curvature and ADM mass m.
Then m ≥ 0, with equality if and only if M is isometric to flat Euclidean space.
The PMT was first proved by Schoen and Yau [199] using minimal surfaces.
Their original proof works through dimension seven. The dimensional restriction
comes from the fact that (stable) minimal surfaces can have singularities in
high ambient dimensions, a well-known complication coming from geometric
measure theory. A clever proof of the PMT for spin manifolds, which works
in all dimensions, was found by Witten [224]. In this chapter we will provide
a simple proof of the PMT by Lam [138], which covers the case when M is
isometric to the graph of a function defined on Euclidean space.
The statement of the Penrose inequality in the time-symmetric case can
be written in purely Riemannian terms. In the context of asymptotically flat
Riemannian manifolds, the Penrose inequality deals with the outermost minimal
surface, since this is the Riemannian counterpart of the event horizon of a
black hole in this case. In analogy with the positive mass theorem, there is a
precise characterization of the case of equality, as achieved by the Riemannian
Schwarzschild manifold of mass $m$ given by $\left( \mathbb{R}^3 \setminus B_{m/2},\ \left( 1 + \frac{m}{2r} \right)^4 \delta \right)$, where
$r = |x|$, $B_{m/2} = \{ x \in \mathbb{R}^3 : |x| < \frac{m}{2} \}$, and $\delta$ is the Euclidean metric. More
precisely:
Theorem 9-2 (Riemannian Penrose inequality). Let $(M^3, g)$ be an asymptotically
flat Riemannian manifold with nonnegative scalar curvature and ADM
mass $m$. Let $\Sigma^2 \subset M$ be the outermost minimal hypersurface in $M$, and denote
by $|\Sigma|$ its area. Then
\[ m \ge \sqrt{\frac{|\Sigma|}{16\pi}}. \]
Equality is achieved if and only if $(M^3, g)$ is isometric (outside $\Sigma$) to the
Riemannian Schwarzschild manifold of mass $m$.
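As a sanity check on the equality case, one can verify numerically that the boundary sphere $\{|x| = m/2\}$ of the Riemannian Schwarzschild manifold has area $16\pi m^2$, so that both sides of the inequality agree. The sketch below uses only the conformally flat form of the metric quoted above; the function names and the sample mass are illustrative choices, not taken from the text.

```python
import math

def coordinate_sphere_area(m, r):
    """Area of {|x| = r} in the metric (1 + m/(2r))^4 * delta on R^3."""
    return 4.0 * math.pi * r**2 * (1.0 + m / (2.0 * r)) ** 4

def penrose_bound(area):
    """Right-hand side sqrt(|Sigma| / (16 pi)) of the inequality."""
    return math.sqrt(area / (16.0 * math.pi))

m = 1.5                  # sample mass (arbitrary positive value)
horizon = m / 2.0        # radius of the boundary sphere
print(penrose_bound(coordinate_sphere_area(m, horizon)), m)
```

The area function $r \mapsto 4\pi r^2 (1 + m/(2r))^4$ has its minimum at $r = m/2$, which is consistent with the boundary sphere being a minimal surface.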
Throughout this chapter we will discuss three different proofs of the Rie-
mannian Penrose inequality that apply to three slightly different, precisely stated
versions of the inequality above. Each one of the proofs gives us a different
perspective on the problem, but only Bray’s proof fully establishes Theorem 9-2.
In Section 9.3 we discuss Lam’s proof of the PMT, as well as his argument
for proving the RPI for a manifold that is a graph over Rn . Both proofs work in
arbitrary dimensions.
In Section 9.4 we discuss Huisken and Ilmanen’s proof of the Riemannian
Penrose inequality. Their proof develops a weak setting in which Geroch’s
monotonicity formula for the Hawking mass under inverse mean curvature flow
can be applied. The techniques contain many interesting ideas of independent
interest, some of which have had a broader impact in geometric analysis and
general relativity (see e.g. [167; 213]).
Finally, in Section 9.5 we discuss Bray’s proof of the Riemannian Penrose
inequality. Bray’s argument fully proves Theorem 9-2. It uses the PMT together
with a novel conformal flow of the metric and a mass-capacity inequality.
Our goal in this chapter is to provide a gentle introduction to the main ideas,
central arguments, and motivating calculations that go into the aforementioned
proofs. In particular both Huisken and Ilmanen’s proof and Bray’s proof are quite
involved and require deep knowledge and lengthy calculations from elliptic PDE
and geometric measure theory, for instance. Thus, we shall not give a complete
account of their arguments here. Rather, we hope that the interested reader will
be intrigued enough by the sheer beauty of the results, ultimately visiting the
original works in search of their full exposition.

9.3. Lam’s proof of the RPI (and PMT) for graphs, in arbitrary dimensions

For this section we will adapt the definition of asymptotically flat manifolds to
the setting of graphs over $\mathbb{R}^n$.
Definition 9-3. Let $\Omega \subset \mathbb{R}^n$, $n \ge 3$, be a bounded open set (possibly empty),
and let $f : \mathbb{R}^n \setminus \Omega \to \mathbb{R}$ be a smooth function. Let $M = \{(x, f(x)) : x \in \mathbb{R}^n \setminus \Omega\}$
denote the graph of $f$ in $\mathbb{R}^{n+1}$, and we denote partial derivatives with subscripts.
We say that $M$ is asymptotically flat if
\[ f_i(x) = O(|x|^{-p/2}), \qquad |x| |f_{ij}(x)| + |x|^2 |f_{ijk}(x)| = O(|x|^{-p/2}), \]
as $|x| \to \infty$, for some $p > \frac{n-2}{2}$.

Let us state now the positive mass theorem and the Riemannian Penrose
inequality for the case of graphs.
Theorem 9-4 (PMT for graphs). Let $(M^n, g)$ be as in Definition 9-3, with $\Omega = \emptyset$.
Then the ADM mass of $M$ satisfies
\[ m = \frac{1}{2(n-1)\omega_{n-1}} \int_M \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g, \]
where $R(g)$ and $dV_g$ are the scalar curvature and volume measure of $(M, g)$,
respectively, $\omega_{n-1}$ is the area of the $(n-1)$-dimensional unit sphere and $|\nabla f|^2 =
f_1^2 + \cdots + f_n^2$.
Remark 9-5. If $R(g) \ge 0$ above, then $m \ge 0$. Also, assuming $R(g) \ge 0$, then
$m = 0$ if and only if $f$ is constant. In other words, $m = 0$ if and only if $M$ is flat
Euclidean space; see [121].
Lam’s formula is quite general and does not need an assumption on the scalar
curvature. This observation has been used for extending the PMT to different
settings (see [160], for example). The RPI for graphs is the following inequality.
Theorem 9-6 (Riemannian Penrose inequality for graphs). Let $(M, g)$ be isometric
to the smooth asymptotically flat graph of $f : \mathbb{R}^n \setminus \Omega \to \mathbb{R}$, where $\Omega \subset \mathbb{R}^n$
is a smooth bounded open set that is the union of its finitely many components,
the closure of each of which is a convex smooth compact set. Assume that $f$ is
constant on each component of $\partial\Omega$ and $|\nabla f(x)| \to \infty$ as $x \to \Sigma := \partial\Omega$. Then if
$|\Sigma|$ is the area of $\Sigma$, and if $R(g)$ and $dV_g$ are the scalar curvature and volume
measure of $g$, the ADM mass of $M$ satisfies
\[ m \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} + \frac{1}{2(n-1)\omega_{n-1}} \int_M \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g. \]
Hence, if $R(g) \ge 0$, then $m \ge \frac{1}{2} \left( |\Sigma| / \omega_{n-1} \right)^{\frac{n-2}{n-1}}$.
Remark 9-7. Lam’s proof of the RPI does not deal with the case of equality.
Nevertheless, Huang and Wu establish it in [121]. Theorem 9-6 can be generalized
to the case of $\Sigma$ mean-convex and outer-minimizing. This follows from estimates
by Freire and Schwartz [95], as we will see at the end of this section.
We let the mean curvature $H$ of a hypersurface $\Sigma$ with respect to a smooth unit
normal field $\nu$ be the divergence of $\nu$ taken along $\Sigma$ (note the sign convention). A
standard formula for the mean curvature of graphs [41] gives the mean curvature
$H$ of $\Sigma$ with respect to the induced metric as
\[ H = \frac{H_0}{\sqrt{1 + |\nabla f|^2}}, \]
where $H_0$ is the mean curvature of $\Sigma$ inside Euclidean space. So the assumption
that $|\nabla f| \to \infty$ on $\Sigma$ guarantees that $H = 0$; i.e., $\Sigma$ is a minimal surface.
Before we can prove the above theorems we will need some preliminary
results, the proofs of which can be found in [41; 138]. In what follows, we will
use the convention of summing over pairs of repeated indices.
Lemma 9-8. The induced metric of a manifold $M^n$ that is the graph of a function
$f : \mathbb{R}^n \to \mathbb{R}$ is given by
\[ g_{ij} = \delta_{ij} + f_i f_j, \qquad g^{ij} = \delta_{ij} - \frac{f_i f_j}{1 + |\nabla f|^2}. \]
The Christoffel symbols of this metric are
\[ \Gamma^k_{ij} = \frac{f_{ij} f_k}{1 + |\nabla f|^2}, \]
\[ \Gamma^k_{ij,k} = \frac{f_{ijk} f_k}{1 + |\nabla f|^2} + \frac{f_{ij} f_{kk}}{1 + |\nabla f|^2} - \frac{2 f_{ij} f_{kl} f_k f_l}{(1 + |\nabla f|^2)^2}. \]
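The formula for the inverse metric in Lemma 9-8 is easy to confirm numerically at a single point. The following minimal sketch multiplies the two matrices for an arbitrarily chosen gradient vector; the helper names and the sample values are ours.

```python
def graph_metric(grad):
    """g_ij = delta_ij + f_i f_j for a gradient vector grad = (f_1, ..., f_n)."""
    n = len(grad)
    return [[(1.0 if i == j else 0.0) + grad[i] * grad[j] for j in range(n)]
            for i in range(n)]

def graph_metric_inverse(grad):
    """g^ij = delta_ij - f_i f_j / (1 + |grad f|^2)."""
    n = len(grad)
    w = 1.0 + sum(c * c for c in grad)
    return [[(1.0 if i == j else 0.0) - grad[i] * grad[j] / w for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

grad = [0.3, -1.2, 2.0, 0.5]  # arbitrary sample values of f_1, ..., f_4
product = matmul(graph_metric(grad), graph_metric_inverse(grad))
print(product)  # should be the identity matrix up to rounding
```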
Combining the formula
\[ R(g) = g^{ij} \left( \Gamma^k_{ij,k} - \Gamma^k_{ik,j} + \Gamma^l_{ij} \Gamma^k_{kl} - \Gamma^l_{ik} \Gamma^k_{jl} \right) \]
for the scalar curvature $R(g)$ in terms of the Christoffel symbols with Lemma 9-8
we obtain the following.
Lemma 9-9. The scalar curvature $R(g)$ of a manifold $M$ that is the graph of a
function $f : \mathbb{R}^n \to \mathbb{R}$ is given by
\[ R(g) = \frac{1}{1 + |\nabla f|^2} \left( f_{ii} f_{jj} - f_{ij} f_{ij} - \frac{2 f_j f_k}{1 + |\nabla f|^2} (f_{ii} f_{jk} - f_{ij} f_{ik}) \right). \]
Lemma 9-10. The scalar curvature of a graph over $\mathbb{R}^n$ is given by
\[ R(g) = \operatorname{div}_0 \left( \frac{1}{1 + |\nabla f|^2} (f_{ii} f_j - f_{ij} f_i) \frac{\partial}{\partial x_j} \right), \]
where $\operatorname{div}_0$ is the divergence with respect to the Euclidean metric.
The proof, a simple calculation using Lemma 9-9, is left as an exercise.
We will need one more lemma relating the volume form and mean curvature
with respect to g to the volume form and mean curvature with respect to the
Euclidean metric.
Lemma 9-11. Let $dV_g$ be the volume measure of $(M, g)$, and let $dV_0$ be the
Euclidean volume measure on $\mathbb{R}^n$. Using pullback under graphical coordinates,
we have
\[ dV_g = \sqrt{1 + |\nabla f|^2} \, dV_0. \]
Also, if $\Sigma$ is a regular level set of a smooth function $f : \mathbb{R}^n \setminus \Omega \to \mathbb{R}$, its mean
curvature $H_0$ with respect to the Euclidean metric is given by
\[ H_0 = \frac{1}{|\nabla f|^2} (f_{ii} f_j - f_{ij} f_i) \nu_j, \]
where $\nu = \nu_j \frac{\partial}{\partial x_j}$ is a unit normal vector to $\Sigma$.

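Proof. The first equation follows directly from the fact $dV_g = \sqrt{\det g} \, dV_0$ together
with $\det(\delta_{ij} + f_i f_j) = 1 + |\nabla f|^2$. The
second equality comes from the fact that, for a set defined as a level set of a
function, a unit normal vector is given by $\nu = \pm \frac{\nabla f}{|\nabla f|}$, and the mean curvature is
just $H_0 = \operatorname{div}_0(\nu)$. Hence we find
\[ H_0 = \pm \frac{\partial}{\partial x_i} \left( \frac{f_i}{|\nabla f|} \right) = \pm \frac{1}{|\nabla f|^3} (f_{ii} f_j f_j - f_{ij} f_i f_j) = \frac{1}{|\nabla f|^2} (f_{ii} f_j - f_{ij} f_i) \nu_j, \]
as desired. □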
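The volume identity rests on the determinant formula $\det(\delta_{ij} + f_i f_j) = 1 + |\nabla f|^2$, a standard fact about rank-one perturbations of the identity. A quick numerical check at a sample point can be sketched as follows; the cofactor-expansion helper is only meant for small $n$, and all names here are illustrative.

```python
import math

def det(a):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)
    return total

def volume_density(grad):
    """sqrt(det g) for the graph metric g_ij = delta_ij + f_i f_j."""
    n = len(grad)
    g = [[(1.0 if i == j else 0.0) + grad[i] * grad[j] for j in range(n)]
         for i in range(n)]
    return math.sqrt(det(g))

grad = [0.3, -1.2, 2.0]  # arbitrary sample values of f_1, f_2, f_3
expected = math.sqrt(1.0 + sum(c * c for c in grad))
print(volume_density(grad), expected)
```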
Remark 9-12. That $|\nabla f(x)| \to \infty$ as $x \to \Sigma$ is a technicality that is not discussed
in Lam’s original paper. It can be dealt with by approximating $\Sigma$ by (sufficiently
close) parallel surfaces $\Sigma_\epsilon$ on its outside; see [121]. All integrals over $\Sigma$ should
actually be construed as limits of integrals over $\Sigma_\epsilon$; we avoid further comment
here for the sake of simplicity.
Proof of Theorems 9-4 and 9-6. We will prove both theorems at once where we
note that for Theorem 9-4 we will use the assumption that $M$ is a graph over all
of $\mathbb{R}^n$, whereas for Theorem 9-6 we will use the assumption that $M$ is a graph
over $\mathbb{R}^n \setminus \Omega$, and $\Omega \ne \emptyset$. The mass of $(M, g)$ is given by the following where
$S_r = \{x \in \mathbb{R}^n : |x| = r\}$ and $dS_r$ is the Euclidean surface measure associated to
this set:
\[ m = \lim_{r \to \infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_r} (g_{ij,i} - g_{ii,j}) \nu_j \, dS_r \]
\[ \phantom{m} = \lim_{r \to \infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_r} (f_{ii} f_j - f_{ij} f_i) \nu_j \, dS_r \]
\[ \phantom{m} = \lim_{r \to \infty} \frac{1}{2(n-1)\omega_{n-1}} \int_{S_r} \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \right) \nu_j \, dS_r, \]
where the last equality follows from the fact that
\[ \frac{1}{1 + |\nabla f(x)|^2} = 1 + O(|x|^{-p}) \quad\text{as } |x| \to \infty, \]
by Definition 9-3. To prove Theorem 9-4 we use the divergence theorem. We
notice that when $M$ is a graph over all of $\mathbb{R}^n$ there is no boundary term. Applying
the divergence theorem gives
\[ m = \frac{1}{2(n-1)\omega_{n-1}} \int_{\mathbb{R}^n} \operatorname{div}_0 \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \frac{\partial}{\partial x_j} \right) dV_0 = \frac{1}{2(n-1)\omega_{n-1}} \int_{\mathbb{R}^n} \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g, \]
as desired. For the case where the asymptotic end of $M$ is a graph over $\mathbb{R}^n \setminus \Omega$,
with $\Omega \ne \emptyset$, we apply the divergence theorem and this time we get a boundary
term. Since $|\nabla f(x)| \to \infty$ as $x \to \Sigma$, we must proceed carefully.
By virtue of Lemma 9-11 we see that
\[ \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \nu_j = \frac{|\nabla f|^2}{1 + |\nabla f|^2} H_0. \]
The behavior of $|\nabla f(x)|^2$ as $x \to \Sigma$ is no cause for concern since $\frac{|\nabla f|^2}{1 + |\nabla f|^2} \to 1$.
With $\nu$ pointing into $\Omega$ along $\Sigma$, we find
\[ m = \frac{1}{2(n-1)\omega_{n-1}} \left[ \int_{\mathbb{R}^n \setminus \Omega} \operatorname{div}_0 \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \frac{\partial}{\partial x_j} \right) dV_0 + \int_\Sigma \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \nu_j \, d\mu_0 \right] \]
\[ \phantom{m} = \frac{1}{2(n-1)\omega_{n-1}} \left[ \int_{\mathbb{R}^n \setminus \Omega} \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g + \int_\Sigma H_0 \, d\mu_0 \right], \]
where $d\mu_0$ is the surface measure of $\Sigma$ with respect to the Euclidean metric.
The proof now follows from the next result. □

Theorem 9-13 (Alexandrov–Fenchel inequality [4; 5]). Let $\Sigma$ be a convex
hypersurface in $\mathbb{R}^n$. If we let $H$ denote its mean curvature (with respect to the
outward normal), we have
\[ \frac{1}{2(n-1)\omega_{n-1}} \int_\Sigma H \, d\mu_0 \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}}. \]
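Round spheres give equality in Theorem 9-13, which is a useful check on the normalizing constants. The sketch below evaluates both sides for the sphere of radius $R$ in $\mathbb{R}^n$, using $H = (n-1)/R$ (the divergence of the outward unit normal, as in the sign convention above) and $|\Sigma| = \omega_{n-1} R^{n-1}$; the function names are ours.

```python
import math

def omega(k):
    """Area of the unit sphere S^k in R^(k+1)."""
    return 2.0 * math.pi ** ((k + 1) / 2) / math.gamma((k + 1) / 2)

def af_sides_for_sphere(n, R):
    """Both sides of the inequality for the round sphere of radius R in R^n."""
    area = omega(n - 1) * R ** (n - 1)
    H = (n - 1) / R                      # mean curvature w.r.t. outward normal
    lhs = H * area / (2.0 * (n - 1) * omega(n - 1))
    rhs = 0.5 * (area / omega(n - 1)) ** ((n - 2) / (n - 1))
    return lhs, rhs

for n in (3, 4, 5, 6):
    print(n, af_sides_for_sphere(n, 2.0))
```

Both sides reduce to $R^{n-2}/2$. In dimension $n = 3$ with $R = 2m$ this equals $m$, which matches the horizon radius of the Schwarzschild metric written in its standard (areal) radial coordinate.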
See [138] for a short proof. We will show in Theorem 9-26 a generalization of
Theorem 9-13, due to Freire and Schwartz, which works for mean-convex,
outer-minimizing domains $\Sigma$. It relates to Huisken and Ilmanen’s proof of the RPI,
in that it uses their work on inverse mean curvature flow in arbitrary
dimensions [125]. Theorem 9-26 can be applied to Lam’s result, readily generalizing
Theorem 9-6.

9.4. Huisken and Ilmanen’s proof of the RPI using IMCF

We now give an overview of Huisken and Ilmanen’s proof of the Riemannian
Penrose inequality using inverse mean curvature flow (IMCF).
Theorem 9-14 (Huisken–Ilmanen [125]). Let $(M^3, g)$ be a complete, asymptotically
flat manifold with compact, minimal surface boundary $\partial M$, $R(g) \ge 0$ and
ADM mass $m$. Then
\[ m \ge \sqrt{\frac{|\Sigma|}{16\pi}} \]
whenever $\Sigma$ is a connected component of $\partial M$, and there are no other closed
minimal surfaces in $M \setminus \partial M$. Equality is achieved if and only if $(M^3, g)$ is
isometric to the Riemannian Schwarzschild manifold.
Huisken and Ilmanen’s proof uses an important geometric evolution equation
called inverse mean curvature flow (IMCF), which is interesting in its own right.
It was first introduced by Geroch [105], and later developed by Jang and Wald
[131]. The precise definition is the following.
Let $M^{n+1}$ and $\Sigma^n$ be Riemannian manifolds. (Think of $\Sigma^n$ as a submanifold
of $M^{n+1}$.)
Definition 9-15. A smooth solution to IMCF is a one-parameter family of
embeddings $F : \Sigma \times [0, T) \to M$ such that
\[ \frac{\partial F}{\partial t} = \frac{\nu}{H}, \tag{9.4.1} \]
where $\nu$ is a smooth unit normal field to $\Sigma_t := F(\Sigma, t)$ and $H > 0$ is the mean
curvature of $\Sigma_t$ with respect to the choice of normal.

[Figure 8. An ellipsoid becoming more round along IMCF. Labels in the figure: small H, large speed; large H, small speed; after some time.]

For the purposes of this chapter we will always be dealing with $\Sigma \subset M$ a
closed hypersurface, and in case $\Sigma$ bounds a region we will choose $\nu$ to be the
outward pointing normal vector. Then $H$ is positive for (small) geodesic
spheres in $M$ under our convention.
IMCF defined in this way is a degenerate nonlinear parabolic PDE, since the
denominator on the right side of (9.4.1) can vanish. Being parabolic, the IMCF
should behave like the heat equation provided we can get control on the mean
curvature of $\Sigma_t$. Heuristically, if we think of curvature as heat and the IMCF
as the heat equation, then curvature (or heat) should become uniform over time
along IMCF. This is to say, we can expect that under IMCF a suitably chosen
initial condition $\Sigma_0$ should evolve into a surface of constant curvature as $t \to \infty$.
Example 9-16 (IMCF with a round sphere as initial condition). We start out by
giving an example of how $S^n(r) \subset \mathbb{R}^{n+1}$ (the round sphere of radius $r$) evolves
under IMCF. Since the mean curvature of $S^n(r)$ is constant and equal to $n/r$, it
follows from (9.4.1) that a smooth solution to IMCF with initial condition $S^n(r)$
is given by $\Sigma_t = S^n(r(t))$, where $r(t)$ satisfies the ODE $r'(t) = \frac{r(t)}{n}$ with $r(0) = r$. (Check
this!) This ODE, in turn, has solution $r(t) = r e^{t/n}$. By uniqueness, the above
solution is the only solution of the IMCF with initial condition $S^n(r)$. In other
words, the round sphere remains a round sphere under IMCF, and its radius
expands exponentially fast in time.
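The ODE in this example can also be integrated numerically as a check. The following minimal sketch uses a classical fourth-order Runge–Kutta step (our choice of integrator, not part of the text) and compares the result with the closed form $r(t) = r_0 e^{t/n}$ for initial radius $r_0$; it also confirms that the area of $S^n(r(t))$, which scales like $r^n$, grows by the factor $e^t$.

```python
import math

def evolve_radius(r0, n, t_end, steps=1000):
    """Integrate r'(t) = r(t)/n from r(0) = r0 with classical RK4."""
    f = lambda r: r / n
    dt = t_end / steps
    r = r0
    for _ in range(steps):
        k1 = f(r)
        k2 = f(r + 0.5 * dt * k1)
        k3 = f(r + 0.5 * dt * k2)
        k4 = f(r + dt * k3)
        r += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return r

r0, n, t_end = 2.0, 2, 3.0           # sample initial radius, dimension, time
r_numeric = evolve_radius(r0, n, t_end)
r_exact = r0 * math.exp(t_end / n)
area_ratio = (r_numeric / r0) ** n   # |Sigma_t| / |Sigma_0| for S^n(r)
print(r_numeric, r_exact, area_ratio, math.exp(t_end))
```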
A natural question that arises from this example is to determine which surfaces
behave like spheres under IMCF. This is to say, for which initial conditions on
$\Sigma_0$ do we find exponentially fast convergence to round spheres as $t \to \infty$? Since
IMCF is a nonlinear and degenerate flow (see (9.4.2) below), we do not expect
this behavior to hold for all initial hypersurfaces. For example, singularities such
as $H \to 0$ or $|A|^2 \to \infty$ as $t \to T < \infty$ will cause the flow to no longer remain
an embedding at finite values of $t$. Gerhardt [102] and Urbas [216] separately
give certain sufficient conditions:
Theorem 9-17 (Gerhardt [102], Urbas [216]). If $\Sigma_0 \subset \mathbb{R}^{n+1}$ is star-shaped with
$H > 0$, the IMCF with initial condition $\Sigma_0$ has a smooth solution for all time,
and the rescaled embeddings $\tilde F(t) = e^{-t/n} F(t)$ converge exponentially fast to a
smooth embedding $\tilde F_\infty$ for which $\tilde F_\infty(\Sigma) = S^n(r_\infty)$, where $r_\infty = (|\Sigma_0| / \omega_n)^{1/n}$.
e∞ for which F

Now we switch gears back to the Riemannian Penrose inequality, and define
the Hawking mass of a hypersurface. This quantity is a crucial ingredient in
Huisken and Ilmanen’s proof of the RPI.

Definition 9-18. The Hawking mass of a hypersurface $\Sigma^2 \subset (M^3, g)$ (with
surface measure $d\mu$) is
\[ m_H(\Sigma) := \sqrt{\frac{|\Sigma|}{(16\pi)^3}} \left( 16\pi - \int_\Sigma H^2 \, d\mu \right). \]
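To get a feeling for this definition, one can evaluate the Hawking mass on the coordinate spheres of the Riemannian Schwarzschild manifold $(1 + \frac{m}{2r})^4 \delta$. The sketch below assumes the standard formula for the mean curvature under a conformal change $g = \phi^4 \delta$ in dimension three, namely $H = \phi^{-2}(H_0 + 4 \partial_\nu \phi / \phi)$; this formula is not derived in the text, and the function names are illustrative.

```python
import math

def hawking_mass(area, H):
    """m_H for a surface of the given area with constant mean curvature H."""
    willmore = H * H * area              # integral of H^2 when H is constant
    return math.sqrt(area / (16.0 * math.pi) ** 3) * (16.0 * math.pi - willmore)

def schwarzschild_sphere(m, r):
    """Area and mean curvature of {|x| = r} in (1 + m/(2r))^4 * delta on R^3."""
    phi = 1.0 + m / (2.0 * r)
    area = 4.0 * math.pi * r**2 * phi**4
    # Assumed conformal-change formula in dimension 3 (g = phi^4 * delta):
    # H = phi^(-2) * (H_0 + 4 * (d phi / dr) / phi), with H_0 = 2/r.
    dphi = -m / (2.0 * r**2)
    H = (2.0 / r + 4.0 * dphi / phi) / phi**2
    return area, H

m = 1.5  # sample mass
for r in (m / 2.0, 1.0, 5.0, 100.0):
    print(r, hawking_mass(*schwarzschild_sphere(m, r)))
```

Under these assumptions every coordinate sphere has Hawking mass equal to $m$, and the sphere $r = m/2$ has $H = 0$, so its Hawking mass reduces to $\sqrt{|\Sigma|/(16\pi)}$.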

The outline of the proof of the RPI envisioned by Geroch [105] and further
developed by Jang and Wald [131] is based on the key facts collected in the
following proposition, from which Theorem 9-14 follows.

Proposition 9-19. Suppose $(M^3, g)$ is asymptotically flat with nonnegative
scalar curvature and $\Sigma_0$ is a spherical hypersurface with smooth normal field $\nu$.
If the smooth IMCF exists for all time, then

(i) for $H = 0$ (that is, if $\Sigma_0$ is a minimal surface), we have $m_H(\Sigma_0) = \sqrt{|\Sigma_0| / (16\pi)}$;
(ii) $m_H(\Sigma_t)$ is monotone nondecreasing under IMCF; and
(iii) $\lim_{t \to \infty} m_H(\Sigma_t) = m_{ADM}(M)$.

Hence
\[ \sqrt{\frac{|\Sigma_0|}{16\pi}} = m_H(\Sigma_0) \le m_H(\Sigma_t) \le \lim_{t \to \infty} m_H(\Sigma_t) = m_{ADM}(M) = m. \]

Conclusion (i) follows by Definition 9-18, and (ii) follows from the Geroch
monotonicity formula in Section 9.4.1 below. Underlying (iii) is a result (similar
to Theorem 9-17 above) on weak convergence to a round sphere, in which
Huisken and Ilmanen are able to relate the Hawking mass to a limit of integrals
over coordinate spheres. From these three key facts the proof of the RPI follows,
so long as the IMCF has a smooth solution for all time, which does not happen
in general. Huisken and Ilmanen’s contribution consists in the monumental task
of developing a weak existence setting for which the above facts still remain
true.
H UISKEN AND I LMANEN ’ S PROOF OF THE RPI USING IMCF 367

9.4.1. Geroch’s monotonicity formula. We want to compute the evolution equation
of the mean curvature of a hypersurface along the IMCF. For this, we use the
Riccati equation together with the definition of IMCF and obtain (as in [125]):

\[
\frac{\partial H}{\partial t} = -\Delta\Bigl(\frac{1}{H}\Bigr) - \frac{|A|^2}{H} - \frac{\mathrm{Rc}(\nu,\nu)}{H},
\qquad
\frac{\partial}{\partial t}\,d\mu_t = d\mu_t,
\]

where $\Delta$ is the Laplacian on $\Sigma_t$, $A$ is the second fundamental form of $\Sigma_t$,
$\mathrm{Rc}(\cdot,\cdot)$ is the Ricci curvature of $(M, g)$, and $d\mu_t$ is the induced area measure
on $\Sigma_t$. Notice that the second equation implies that $|\Sigma_t| = |\Sigma_0|e^t$. We also need
the following formula, which is a consequence of the Gauss equation, where $\sigma_\Sigma$
is the sectional curvature of $T_x\Sigma$ in $\Sigma$ (i.e. the Gauss curvature of $\Sigma$), $\bar\sigma_\Sigma$ is the
sectional curvature of $T_x\Sigma$ in $M$, $\bar R$ is the scalar curvature of $M$, and $\lambda_1, \lambda_2$ are
the principal curvatures of $\Sigma$ in $M$:

\[
\sigma_\Sigma = \bar\sigma_\Sigma + \lambda_1\lambda_2 = \tfrac12\bar R - \mathrm{Rc}(\nu,\nu) + \tfrac12\bigl(H^2 - |A|^2\bigr).
\]

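This is a general formula that holds on $\Sigma_t$ as well.

Putting all these together we get

\begin{align*}
\frac{\partial}{\partial t}\int_{\Sigma_t} H^2\,d\mu_t
&= \int_{\Sigma_t}\Bigl(2H\,\frac{\partial H}{\partial t}\,d\mu_t + H^2\,\frac{\partial}{\partial t}\,d\mu_t\Bigr) \\
&= \int_{\Sigma_t}\Bigl(-2H\,\Delta\Bigl(\frac{1}{H}\Bigr) - 2|A|^2 - 2\,\mathrm{Rc}(\nu,\nu) + H^2\Bigr)\,d\mu_t \\
&= \int_{\Sigma_t}\Bigl(-2\,\frac{|\nabla H|^2}{H^2} - |A|^2 - \bar R + 2\sigma_{\Sigma_t}\Bigr)\,d\mu_t \\
&= 4\pi\chi(\Sigma_t) + \int_{\Sigma_t}\Bigl(-2\,\frac{|\nabla H|^2}{H^2} - \tfrac12 H^2 - \tfrac12(\lambda_1-\lambda_2)^2 - \bar R\Bigr)\,d\mu_t \\
&\le \frac12\Bigl(16\pi - \int_{\Sigma_t} H^2\,d\mu_t\Bigr),
\end{align*}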

where we integrated by parts and used the Gauss equation in the third equality,
and then used Gauss–Bonnet and the fact that $|A|^2 = \frac12 H^2 + \frac12(\lambda_1 - \lambda_2)^2$ in the
fourth. The last inequality follows from the assumption $\bar R \ge 0$ and the fact that
if (smooth) IMCF exists for all $t \in [0, \infty)$, the topology of $\Sigma_t$ remains the same
for all $t$. Hence, if $\Sigma_0$ is a sphere (topologically), then $\Sigma_t$ remains a topological
sphere for all $t$; in particular it remains connected, and $\chi(\Sigma_t) = 2$ for all $t$; of
course if $\Sigma_t$ is a closed, connected, orientable surface, then $\chi(\Sigma_t) \le 2$ and the
inequality would also persist.
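The two pointwise facts used in the fourth equality and in the final inequality can be checked directly. The following is a short verification sketch; it assumes the conventions $H = \lambda_1 + \lambda_2$ and $|A|^2 = \lambda_1^2 + \lambda_2^2$ for the principal curvatures, and a closed, connected, orientable surface.

```latex
\[
\tfrac12 H^2 + \tfrac12(\lambda_1-\lambda_2)^2
= \tfrac12(\lambda_1+\lambda_2)^2 + \tfrac12(\lambda_1-\lambda_2)^2
= \lambda_1^2 + \lambda_2^2 = |A|^2,
\]
\[
\int_{\Sigma_t} 2\,\sigma_{\Sigma_t}\,d\mu_t = 4\pi\,\chi(\Sigma_t) \le 8\pi = \tfrac12\cdot 16\pi .
\]
```

The remaining terms in the integrand are nonpositive once the ambient scalar curvature is nonnegative, which is where the curvature hypothesis enters.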

With this, we deduce
\[
\frac{\partial}{\partial t}\Bigl(16\pi - \int_{\Sigma_t} H^2\,d\mu_t\Bigr) \ge -\frac12\Bigl(16\pi - \int_{\Sigma_t} H^2\,d\mu_t\Bigr).
\]
Integrating this differential inequality we obtain that the quantity
\[
e^{t/2}\Bigl(16\pi - \int_{\Sigma_t} H^2\,d\mu_t\Bigr)
\]
is nondecreasing in $t$. Using $|\Sigma_t|^{1/2} = |\Sigma_0|^{1/2}e^{t/2}$ in the preceding expression
proves that the Hawking mass is nondecreasing along IMCF. This fact is known
as Geroch’s monotonicity formula.
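The integration step can be spelled out as follows. This is a sketch in which $Q(t)$ is shorthand introduced here (not in the text) for $16\pi$ minus the integral of $H^2$ over the evolving surface at time $t$, so that the differential inequality reads $Q' \ge -\tfrac12 Q$.

```latex
\begin{align*}
\frac{d}{dt}\Bigl(e^{t/2}\,Q(t)\Bigr) &= e^{t/2}\Bigl(Q'(t) + \tfrac12\,Q(t)\Bigr) \ge 0, \\
m_H(\Sigma_t) &= \frac{|\Sigma_t|^{1/2}}{(16\pi)^{3/2}}\,Q(t)
 = \frac{|\Sigma_0|^{1/2}}{(16\pi)^{3/2}}\;e^{t/2}\,Q(t).
\end{align*}
```

The Hawking mass is therefore a fixed positive multiple of a nondecreasing quantity.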
Remark 9-20. If we could prove that a smooth solution of IMCF starting from
a minimal surface exists for all time, we would have a proof of the RPI. This
is the heuristic argument that originally motivated Geroch to pursue IMCF.
Unfortunately, this cannot be expected, as there are counterexamples where
smooth solutions of IMCF do not exist for all time. Indeed, if we start the flow
with initial condition consisting of two spheres far apart (far away black holes),
both surfaces will expand exponentially fast under IMCF, as we have seen. Thus,
they eventually meet, and this produces a “jump” in the topology. More precisely,
embeddedness of the flow no longer holds [125]. For another example, take for
initial condition a thin torus, that is, one for which the inner radius is much smaller
than the outer radius. Then $H > 0$, so under IMCF the torus will expand, but it
cannot expand forever. Eventually $H \to 0$ somewhere and $|A|^2 \to \infty$ [125].

9.4.2. IMCF in the weak setting. The weak setting developed by Huisken and
Ilmanen in [125] produces a flow that exists for all time starting off with any
reasonable initial condition. Their weak solution is unique and has “jumps”, but
nevertheless associated quantities such as the area of the flowing hypersurface
remain continuous (except possibly at time zero). Other quantities, such as
the total mean curvature, remain monotonic along the weak flow. Here we
present the main ideas in the construction of weak solutions and mention some
of its consequences, while omitting details and heavy calculations, such as those
involving geometric measure theory.
The first step in defining weak solutions to IMCF is to rewrite the IMCF
equation as a (degenerate) elliptic PDE. This is accomplished by defining a level
set formulation for which $\Sigma_t$ is the level set of a function. More precisely, we
let $u\colon \Omega \subset M \to \mathbb{R}$, with $\Omega \subset M$ open, and define
\[
E_t := \{u < t\}, \quad \Sigma_t := \partial E_t, \quad E_t^+ := \operatorname{int}\{u \le t\}, \quad \Sigma_t^+ := \partial E_t^+,
\]
where $\partial$ and $\operatorname{int}$ denote the topological boundary and interior of a set.

If $x(t)$ is a path with $u(x(t)) = t$ (i.e., $x(t) \in \Sigma_t$), then as $\nu = \nabla u/|\nabla u|$, we see the
normal speed of a moving level set is given by
\[
\Bigl(\frac{dx}{dt}\Bigr)^{N} = \frac{1}{|\nabla u(x)|}.
\]
If the level sets were to satisfy IMCF, so that $\frac{dx}{dt} = \frac{\nu}{H}$ with $x(t) = F(p, t)$, we
would obtain
\[
\operatorname{div}_g\Bigl(\frac{\nabla u}{|\nabla u|}\Bigr) = |\nabla u|, \tag{9.4.2}
\]
as the term on the left is the mean curvature of a level set.
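To see the level-set equation (9.4.2) at work, one can test it on expanding round spheres in Euclidean $\mathbb{R}^{n+1}$. The function below and the reference radius $r_0 > 0$ are illustrative choices made here, not taken from the text.

```latex
\[
u(x) = n\log\frac{|x|}{r_0}
\;\Longrightarrow\;
\nabla u = \frac{n\,x}{|x|^2},\qquad
|\nabla u| = \frac{n}{|x|},\qquad
\operatorname{div}\Bigl(\frac{\nabla u}{|\nabla u|}\Bigr)
= \operatorname{div}\Bigl(\frac{x}{|x|}\Bigr) = \frac{n}{|x|} = |\nabla u| .
\]
```

The level set $\{u = t\}$ is the sphere of radius $r_0 e^{t/n}$, so this recovers the exponentially expanding sphere solution of IMCF.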
Huisken and Ilmanen define weak solutions of IMCF by constructing a functional
whose Euler–Lagrange equation is (almost) (9.4.2). Actually, they want to
find a functional whose Euler–Lagrange equation “freezes” the right-hand side
of (9.4.2), so to speak. The idea is the following. We note that the difference
between $\Sigma_t$ and $\Sigma_t^+$ from above arises when $u$ is constant on a set with nonempty
interior. Let us define the values $\hat t$ for which $\Sigma_{\hat t} \neq \Sigma_{\hat t}^+$, and call them jump times
since $\Sigma_t$ will not evolve smoothly into $\Sigma_{\hat t}^+$; rather, $\Sigma_{\hat t}$ will jump (instantly) to $\Sigma_{\hat t}^+$
in the weak setting. With this in mind, we define weak solutions to the IMCF
as follows. Fix a locally Lipschitz function $u$ and a compact set $K \subset \Omega$, and
consider the functional $J_u^K$ defined as
\[
J_u^K(v) = \int_K \bigl(|\nabla v| + v\,|\nabla u|\bigr)\,dv_g,
\]
where $v$ is locally Lipschitz, $\{v \neq u\}$ is precompact in $\Omega$ and contained in $K$, and $dv_g$ is the measure
associated to $(M, g)$. Then, it follows that the Euler–Lagrange equation for this
functional is given by
\[
\operatorname{div}_g\Bigl(\frac{\nabla v}{|\nabla v|}\Bigr) = |\nabla u|,
\]
which is (9.4.2) with the right-hand side frozen (see [125]). From this, we see
that if $u$ minimizes $J_u^K$, i.e. if $J_u^K(u) \le J_u^K(v)$ for all such $v$, then we have found
a weak level-set solution to (9.4.2), as desired.
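The Euler–Lagrange equation quoted above can be derived formally as follows. This is only a sketch under simplifying assumptions that are not part of the weak theory: $v$ is taken smooth with nonvanishing gradient, and $\phi$ is a smooth test function compactly supported in the interior of the compact set over which the functional is integrated.

```latex
\begin{align*}
\frac{d}{ds}\Big|_{s=0} J_u^K(v + s\phi)
&= \int_K \Bigl(\frac{\langle \nabla v, \nabla\phi\rangle}{|\nabla v|} + \phi\,|\nabla u|\Bigr)\,dv_g \\
&= \int_K \phi\,\Bigl(-\operatorname{div}_g\Bigl(\frac{\nabla v}{|\nabla v|}\Bigr) + |\nabla u|\Bigr)\,dv_g .
\end{align*}
```

Requiring this first variation to vanish for every such $\phi$ gives the frozen equation; note that $|\nabla u|$ is treated as a fixed weight and is not varied.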
The problem has been reduced to finding a function $u$ that minimizes $J_u^K(v)$.
Huisken and Ilmanen show, using the co-area formula, that this is equivalent to
having $E_t = \{u < t\}$ minimize the functional
\[
J_u^K(F) = |\partial^* F \cap K| - \int_{F \cap K} |\nabla u|\,dv_g,
\]
where $F$ has locally finite perimeter. Here, $|\partial^* F \cap K|$ is the Hausdorff measure
of $\partial^* F \cap K$, and $\partial^*$ stands for the reduced boundary; this quantity is defined for sets of locally
finite perimeter, and coincides with the usual notion of boundary on smooth sets.
(See [108] for more on these geometric measure theoretical notions.)
Huisken and Ilmanen are able to prove using standard regularity results from
geometric measure theory that $\Sigma_t$ is at least $C^{1,\alpha}$ [125] (ambient dimension less
than eight). Thus, we will assume in what remains of this section that all sets we
deal with are at least this regular. Now we will describe an intuitive, geometric
way of understanding weak solutions of IMCF using the definitions from above.
Since equation (9.4.2) may be degenerate when |∇u| = 0, we need to regularize
the PDE in order to prove existence of solutions. It turns out that this process
also holds the key for showing that the monotonicity of the Hawking mass can
be applied in the weak setting, and so it is important to understand how to
accomplish this.
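To regularize the PDE (9.4.2) we take $\epsilon > 0$, and we consider the equation
\[
\operatorname{div}_g\Bigl(\frac{\nabla u_\epsilon}{\sqrt{|\nabla u_\epsilon|^2 + \epsilon^2}}\Bigr) = \sqrt{|\nabla u_\epsilon|^2 + \epsilon^2}. \tag{9.4.3}
\]
If $u_\epsilon$ is a smooth solution of (9.4.3), and if we define $U_\epsilon\colon \Omega \times \mathbb{R} \to \mathbb{R}$
by $U_\epsilon(x, z) = u_\epsilon(x) - \epsilon z$, then $U_\epsilon$ is a smooth solution to (9.4.2), but in one
dimension higher. In fact if we let $\tilde\Sigma_t^\epsilon := \{U_\epsilon = t\}$, then one can show that
\[
\tilde\Sigma_t^\epsilon = \operatorname{graph}\Bigl(\frac{u_\epsilon}{\epsilon} - \frac{t}{\epsilon}\Bigr),
\]
and hence $\tilde\Sigma_t^\epsilon$ is a translating solution of IMCF inside $\Omega \times \mathbb{R}$ (see Figure 9).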
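The claim that the lifted function solves (9.4.2) in one dimension higher can be verified directly. The sketch below assumes the product metric $g + dz^2$ on the product of the domain with the real line, and uses that $u_\epsilon$ does not depend on $z$.

```latex
\begin{align*}
\nabla U_\epsilon &= (\nabla u_\epsilon,\,-\epsilon), \qquad
|\nabla U_\epsilon| = \sqrt{|\nabla u_\epsilon|^2 + \epsilon^2}, \\
\operatorname{div}_{g+dz^2}\Bigl(\frac{\nabla U_\epsilon}{|\nabla U_\epsilon|}\Bigr)
&= \operatorname{div}_g\Bigl(\frac{\nabla u_\epsilon}{\sqrt{|\nabla u_\epsilon|^2+\epsilon^2}}\Bigr)
 + \partial_z\Bigl(\frac{-\epsilon}{\sqrt{|\nabla u_\epsilon|^2+\epsilon^2}}\Bigr)
 = \operatorname{div}_g\Bigl(\frac{\nabla u_\epsilon}{\sqrt{|\nabla u_\epsilon|^2+\epsilon^2}}\Bigr), \\
\{U_\epsilon = t\} &= \Bigl\{(x,z) : z = \frac{u_\epsilon(x)}{\epsilon} - \frac{t}{\epsilon}\Bigr\}.
\end{align*}
```

So (9.4.2) for the lifted function is exactly (9.4.3) for $u_\epsilon$, and increasing $t$ translates the graph downward at constant speed $1/\epsilon$.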
Huisken and Ilmanen show that existence of solutions to (9.4.3) is guaranteed
by the existence of a subsolution of (9.4.2) which can be taken to be an
exponentially expanding coordinate sphere, e.g., $u(x) = C \log |x|$, for some $C > 0$.
(Since this is a strong subsolution it is defined in the usual way for (9.4.2).)
The estimates in [125] also give that for a suitable subsequence $\epsilon_i \to 0^+$, we have
that $U_{\epsilon_i}(x, z) \to u(x)$ as well as $\tilde\Sigma_t^{\epsilon_i} \to \Sigma_t \times \mathbb{R}$, locally in $C^{1,\alpha}$. This last fact is
key for proving the monotonicity of the Hawking mass under weak solutions, as
we see below.

Figure 9. Level set graph $\tilde\Sigma_t^\epsilon$ translating downwards [149].
Now we turn to developing an intuitive, geometric way of understanding weak
solutions of IMCF. In order to do that, we define the following notion.
Definition 9-21. Let $\Omega \subset M$ be open. We say that $E \subset M$ is a minimizing hull
if it minimizes area on the outside (in $\Omega$). That is, $E$ satisfies

\[
|\partial^* E \cap K| \le |\partial^* F \cap K|
\]

for any $F$ such that $E \subset F$ and $F \setminus E$ is precompact in $\Omega$, and any compact set
$K \subset \Omega$ such that $F \setminus E \subset K$. Additionally, we say that $E$ is a strictly minimizing
hull if equality above implies that $F \cap \Omega = E \cap \Omega$ almost everywhere.
Minimizing hulls minimize area amongst all competing sets that contain
them, and are thus called outer-minimizing sets. Finding the minimizing hull
containing some set is sometimes called the shrink wrap problem, since it amounts
to finding the least area enclosure of a given region. By the first variation formula
for the area, an outer-minimizing hull must have nonnegative (weak) mean
curvature everywhere. (Otherwise, we could choose a compactly supported,
outward-pointing deformation of the hull with lesser area.) In particular, an
outer-minimizing, closed minimal surface is a minimizing hull. This suggests
that we should be able to run the weak IMCF with such a surface as the initial
condition, as it will immediately flow to a surface with positive (weak) mean
curvature.
Proposition 9-22 (Huisken–Ilmanen [125, Proposition 1.4]). Consider a weak
solution to IMCF as above, and assume that $E_t$ is precompact. Then:
(i) $E_t$ is a minimizing hull in $M$ for all $t > 0$.
(ii) $E_t^+$ is a strictly minimizing hull in $M$ for all $t \ge 0$.
(iii) $|\Sigma_t| = |\Sigma_t^+|$ for all $t \ge 0$, provided that $E_0$ is a minimizing hull.
Using this, we can portray a (heuristic) geometric characterization of weak
solutions to the flow:
• $E_t$ flows by the smooth IMCF as long as $E_t$ is a strictly minimizing hull.
• $E_t$ jumps to its strictly minimizing hull $E_t^+$ when it is not a strictly
minimizing hull.
Figure 10. Time to jump [125]. The figure labels two regions $u < t$, the region $u > t$, and the jump region $u = t$.

In Figure 10 we see two spheres that are flowing under weak IMCF until
the moment when they can be enclosed by a “peanut” — two spherical caps
joined by a catenoidal bridge — of the same area. The weak solution will then
instantly jump across the region between these two surfaces, and resume from
the outermost surface.
(There is an alternative notion of weak solutions to IMCF using viscosity
solutions [53]. Such solutions will agree with smooth IMCF until a singularity
is formed; but the problem arises that, after a singular time, $\Sigma_t$ may no longer
be a hypersurface. Therefore, a viscosity solution of the IMCF is less
desirable, since it is not clear how to preserve, or even make sense of, the
monotonicity of the Hawking mass.)
From this intuitive geometric picture we can expect at jump times under
weak IMCF that (1) the area of the flowing hypersurface remains continuous
($|\partial E_t^+| = |\partial E_t|$), and (2) the mean curvature does not increase. Therefore, the
following should hold at jump times:
\[
\int_{\partial E_t^+} H^2\,d\mu_t \le \int_{\partial E_t} H^2\,d\mu_t.
\]

Remark 9-23. To prove these statements one considers $\tilde\Sigma_t^\epsilon$, the smooth
translating solutions of IMCF in $M^3 \times \mathbb{R}$, for which the smooth monotonicity of the
Hawking mass calculation holds. With this, weak monotonicity of the Hawking
mass follows by taking a limit as $\epsilon \to 0^+$ so long as $\chi(\Sigma_t) \le 2$. This, in turn, is
accomplished by showing that if $\Sigma_0$ is connected, the weak solution of IMCF
will remain connected for all times. See Sections 4 and 5 of [125] for details.
The proof of part (iii) of Proposition 9-19, in the weak setting, follows from a
weak convergence result for weak solutions of IMCF.
Theorem 9-24 (Huisken–Ilmanen [125, p. 417]). Suppose $(M^3, g)$ is asymptotically
flat. Let $\Sigma_t \subset M$ be a weak solution to IMCF and let $r(t)$ be the area radius,
i.e., the quantity defined implicitly by $|\Sigma_t| = 4\pi r(t)^2$. Then $\frac{1}{r(t)}\Sigma_t$ converges in
$C^1$ to the round unit sphere as $t \to \infty$.

Note that since we are rescaling by the area radius, all rescaled solutions here
converge to the unit sphere. This is in contrast to the result of Theorem 9-17,
where we have rescaled to keep area constant to show that all star-shaped
hypersurfaces converge to a sphere with the same area as $\Sigma_0$.
Using Theorem 9-24 it can be shown that $r(t)H_{\Sigma_t}$ converges in $L^2$ to the
analogous quantity for the round sphere. This is helpful for showing that the
Hawking mass of the flowing hypersurface (in the weak setting) converges to the
ADM mass as $t \to \infty$, since the ADM mass is defined as the limit of integrals
over spheres of increasing radius [15]. Generally speaking, the fact that $m_H(\Sigma_t)$
tends to $m_{ADM}(M)$ as $t \to \infty$ is expected, since these two quantities are equal in
the case of the (Riemannian) Schwarzschild manifold. (Compare this to the case
of asymptotically hyperbolic manifolds [167] where this asymptotic property is
not true.) The proof of this statement is somewhat lengthy, though; we give only
a brief overview here.
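The equality of the two masses in the Schwarzschild case can be checked by hand on coordinate spheres. The following sketch assumes the isotropic form $g = u^4\delta$ of the Riemannian Schwarzschild metric with $u = 1 + m/(2|x|)$, and the standard conformal transformation law $H_g = u^{-2}\bigl(H_\delta + 4\,\partial_\nu \log u\bigr)$ for surfaces in a 3-manifold; $S_\rho$ denotes the coordinate sphere $|x| = \rho$ (notation introduced here).

```latex
\begin{align*}
|S_\rho|_g &= 4\pi\rho^2 u^4, \qquad
H = u^{-2}\Bigl(\frac{2}{\rho} + \frac{4u'}{u}\Bigr)
  = \frac{2}{\rho\,u^3}\Bigl(1 - \frac{m}{2\rho}\Bigr),
  \qquad u = 1 + \frac{m}{2\rho},\quad u' = -\frac{m}{2\rho^2}, \\
16\pi - \int_{S_\rho} H^2\,d\mu
 &= 16\pi\,\frac{\bigl(1+\frac{m}{2\rho}\bigr)^2 - \bigl(1-\frac{m}{2\rho}\bigr)^2}{u^2}
 = \frac{32\pi\,m}{\rho\,u^2}, \\
m_H(S_\rho) &= \sqrt{\frac{4\pi\rho^2u^4}{(16\pi)^3}}\cdot\frac{32\pi\,m}{\rho\,u^2}
 = \frac{\rho\,u^2}{32\pi}\cdot\frac{32\pi\,m}{\rho\,u^2} = m .
\end{align*}
```

Every coordinate sphere thus has Hawking mass exactly $m$, including the horizon $\rho = m/2$, where $H = 0$.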
Let us denote various geometric quantities of $\Sigma_t$ with respect to $(M, g)$ by
$H, A, \nu, d\mu_t, \nabla$, and the corresponding ones with respect to $(\mathbb{R}^3, \delta)$ by $H_0, A_0, \nu_0,
d\mu_t^0, \nabla^0$. If we recall that $|p(x)| \to 0$ as $|x| \to \infty$, where $p_{ij} = g_{ij} - \delta_{ij}$ (because
of our assumption of asymptotic flatness, which also gives us a specified rate of
convergence $|p| \le C/|x|$), we obtain [125, p. 418]

\begin{align*}
H - H_0 &= -h^{ik}p_{kl}h^{lj}A_{ij} + \tfrac12 H\nu^i\nu^j p_{ij} - h^{ij}\nabla_i p_{jl}\,\nu^l + \tfrac12 h^{ij}\nabla_l p_{ij}\,\nu^l
 \pm C|p||\nabla p| \pm C|p|^2|A|, \\
H_0^2\,(d\mu_t - d\mu_t^0) &= \bigl(\tfrac12 H^2 h^{ij}p_{ij} \pm |p|^2|A|^2 \pm C|\nabla p|^2\bigr)\,d\mu_t,
\end{align*}

where $h$ is the metric of $\Sigma_t$.

Now write
\[
\int_{\Sigma_t} H^2\,d\mu_t = \int_{\Sigma_t} \Bigl( H_0^2\,d\mu_t^0 + H_0^2\,(d\mu_t - d\mu_t^0) + \bigl(2H(H - H_0) - (H - H_0)^2\bigr)\,d\mu_t \Bigr).
\]

The last term converges to zero by the $L^2$ convergence of $H$ to $H_0$. Using
that $\int_\Sigma H_0^2\,d\mu^0 \ge 16\pi$ for $C^{1,\alpha}$ hypersurfaces in $\mathbb{R}^3$ [223], we obtain control
of the first term in terms of the $16\pi$ factor that we encounter in the definition
of the Hawking mass. After integrating by parts, the remaining terms cancel,
leaving error terms as well as, remarkably, terms in the definition of the ADM
mass. More specifically, using the above expressions the factor $h^{ij}\nabla_i p_{jl}\,\nu^l$ in
$H - H_0$ recovers terms in the definition of the ADM mass, plus error terms that
integrate to zero in the limit as $t \to \infty$, where we note that $\nabla_i^0 p_{jl} = \nabla_i^0 g_{jl}$ and
$\nabla = \nabla^0 \pm C|p||\nabla p|$. This completes our overview of the proof of asymptotic
convergence of the Hawking mass to the ADM mass in the weak setting. Using
it, we obtain a proof of Theorem 9-14. The interested reader is directed to [125]
for details.

Remark 9-25. One could try to run IMCF on a spacelike slice of spacetime with
initial condition a codimension-two surface, in order to prove the full Penrose
inequality. Huisken and Ilmanen point out in [125] that such a construction
produces a flow that is a forward-backward system of PDE, and hence does not
possess good existence theory.

9.4.3. The generalized Alexandrov–Fenchel inequality. Now we state and
outline the proof of the promised generalization of Theorem 9-13.

Theorem 9-26 (Freire–Schwartz [95]). Let $\Omega \subset \mathbb{R}^n$ be open, with smooth, mean
convex, outer-minimizing boundary $\Sigma$ with surface measure $d\mu$. Then
\[
\frac{1}{2(n-1)\,\omega_{n-1}} \int_\Sigma H\,d\mu \ge \frac12 \Bigl(\frac{|\Sigma|}{\omega_{n-1}}\Bigr)^{\frac{n-2}{n-1}}.
\]

Equality is achieved if and only if $\Omega$ is a round ball.
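The equality case can be confirmed by a direct computation on a round ball of radius $r$ in $\mathbb{R}^n$. This check assumes that $\omega_{n-1}$ denotes the area of the unit sphere $S^{n-1}$, so that the boundary sphere has $H = (n-1)/r$ and area $\omega_{n-1} r^{n-1}$.

```latex
\[
\frac{1}{2(n-1)\,\omega_{n-1}} \cdot \frac{n-1}{r} \cdot \omega_{n-1} r^{n-1}
= \frac{r^{n-2}}{2}
= \frac12 \Bigl(\frac{\omega_{n-1} r^{n-1}}{\omega_{n-1}}\Bigr)^{\frac{n-2}{n-1}} .
\]
```

Both sides equal $r^{n-2}/2$, so the inequality is sharp on round balls.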

Proof. The full proof of this theorem requires weak solutions to IMCF, so we
will only prove this theorem for the case where IMCF starting from $\Sigma$ is smooth
for all time, e.g. when $\Sigma$ is mean convex and star-shaped. For the rest of the
proof in the weak case see [95].
Claim: Let $\Sigma$ be as above, and consider a (weak) solution of IMCF in $\mathbb{R}^n$ with
initial condition $\Sigma_0 = \Sigma$. Then
\[
\int_{\Sigma_t} H\,d\mu_t \le \Bigl(\int_\Sigma H\,d\mu\Bigr)\, e^{\frac{n-2}{n-1}t}.
\]

To see this in the smooth case, since $(n-1)|A|^2 \ge H^2$, we have

\[
H - \frac{|A|^2}{H} - \frac{n-2}{n-1}\,H = \frac{1}{(n-1)H}\bigl(H^2 - (n-1)|A|^2\bigr) \le 0.
\]

Using this and the Riccati equation we obtain

\[
\frac{d}{dt}\int_{\Sigma_t} H\,d\mu_t = \int_{\Sigma_t}\Bigl(H - \frac{|A|^2}{H}\Bigr)\,d\mu_t \le \frac{n-2}{n-1}\int_{\Sigma_t} H\,d\mu_t.
\]

Integrating gives the claim.


B RAY ’ S PROOF 375

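Now if we let $f(t) = |\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H\,d\mu_t$, then using $\frac{d}{dt}|\Sigma_t| = |\Sigma_t|$, we find

\begin{align*}
f'(t) &= \frac{d}{dt}\Bigl(|\Sigma_t|^{-\frac{n-2}{n-1}}\Bigr) \int_{\Sigma_t} H\,d\mu_t + |\Sigma_t|^{-\frac{n-2}{n-1}}\,\frac{d}{dt}\int_{\Sigma_t} H\,d\mu_t \\
&\le -\frac{n-2}{n-1}\,|\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H\,d\mu_t + \frac{n-2}{n-1}\,|\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H\,d\mu_t \\
&= 0.
\end{align*}
Then we see that
\[
|\Sigma|^{-\frac{n-2}{n-1}} \int_\Sigma H\,d\mu = f(0) \ge \lim_{t\to\infty} f(t) = (n-1)\,\omega_{n-1}^{1/(n-1)},
\]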

where the last equality follows from the fact that IMCF converges to a round
sphere. We are omitting the technical part of the argument, which is to show the
convergence in the weak setting for the above quantities. This is not automatic
since we are not integrating $H^2$ as in Huisken and Ilmanen’s work; rather, we are
integrating $H$. See [95] for details. □
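For completeness, here is how the final inequality of the proof rearranges into the statement of Theorem 9-26. This is a purely algebraic sketch: multiply through by the appropriate power of the area and divide by $2(n-1)\omega_{n-1}$.

```latex
\[
\int_\Sigma H\,d\mu \;\ge\; (n-1)\,\omega_{n-1}^{\frac{1}{n-1}}\,|\Sigma|^{\frac{n-2}{n-1}}
\;\Longrightarrow\;
\frac{1}{2(n-1)\,\omega_{n-1}} \int_\Sigma H\,d\mu
\;\ge\; \frac12\,\omega_{n-1}^{\frac{1}{n-1}-1}\,|\Sigma|^{\frac{n-2}{n-1}}
= \frac12 \Bigl(\frac{|\Sigma|}{\omega_{n-1}}\Bigr)^{\frac{n-2}{n-1}} ,
\]
```

using that $\frac{1}{n-1} - 1 = -\frac{n-2}{n-1}$.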
The Freire–Schwartz theorem allows us to generalize Theorem 9-6 to the case
where $\partial\Omega$ is a union of smooth, mean-convex, outer-minimizing sets. Related
results in [95] include a generalized Pólya–Szegő inequality, as well as mass-
capacity and volumetric Penrose inequalities for the conformally flat case. The
proofs of these results work in arbitrary dimensions and are independent of the
positive mass theorem. It is worth pointing out that the proof of the mass-capacity
inequality is the only part of Bray’s work [26] where the positive mass theorem
is used; cf. Theorems 9-32 and 9-33 below.

9.5. Bray’s proof of the RPI using PMT

We now turn our attention to an overview of Bray’s proof of the Riemannian


Penrose inequality. The precise statement is the following:
Theorem 9-27 (Bray [26]). Let $(M^3, g)$ be a complete, smooth, asymptotically
flat Riemannian manifold with $R \ge 0$, with ADM mass $m$ and with
outer-minimizing minimal surface $\Sigma$ with area $|\Sigma|$. Then
\[
m \ge \sqrt{\frac{|\Sigma|}{16\pi}},
\]
and equality is achieved if and only if $(M^3, g)$ is isometric (outside $\Sigma$) to the
Riemannian Schwarzschild manifold of mass $m$.
The main advantage of Bray’s proof is that it can handle horizons
(outer-minimizing minimal surfaces) that are not connected. Another is that it does not
use the Geroch monotonicity formula, which in turn relies on Gauss–Bonnet.
Thus, it can be (and has been) generalized to higher dimensions [28]. On the
downside, Bray’s proof relies on the positive mass theorem and on geometric
measure theory quite heavily, so proving the RPI in dimensions eight and above
with this approach does not follow directly (and remains open, though recent
developments [205] might prove promising in this direction).
Bray’s proof of the RPI can be broken down into four main steps:
1. Define an appropriate conformal flow of metrics $\{g_t\}_{t\ge0}$ on $M_t := M \setminus \Omega_t$,
where $\Omega_t$ is the region inside the horizon of $g_t$.
2. Prove that the area of the horizon of $g_t$, given by $A(t)$, is constant.
3. Prove that the mass of $M_t$, given by $m(t)$, is nonincreasing.
4. Prove that $g_t$ converges to the Schwarzschild metric as $t \to \infty$.
Using these steps, Bray concludes that
\[
\sqrt{\frac{A(0)}{16\pi}} = \sqrt{\frac{A(\infty)}{16\pi}} = m(\infty) \le m(0),
\]
which proves the RPI. The second equality here comes from the fact that the RPI
is an equality for the Schwarzschild metric.
We now go over some elements of the proofs of the steps outlined above. We
begin with a definition (which can be extended to multiple asymptotic ends).
Definition 9-28. We say that $(M, g)$ is harmonically flat at infinity if for some
compact set $K \subset M$ the complement $M \setminus K$ has zero scalar curvature and is
conformal to $(\mathbb{R}^3 \setminus B_1(0), \delta)$; i.e., $g = u_0^4\,\delta$ on $M \setminus K$, with $u_0(x) \to a > 0$ as
$|x| \to \infty$.
The justification for this name is that the conformal factor $u_0$ above is harmonic.
This follows from the well-known formula for the transformation of the scalar
curvature under conformal deformations:

\[
R(u^4 g) = u^{-5}\bigl(R(g)\,u - 8\Delta_g u\bigr), \tag{9.5.1}
\]

where $\Delta_g$ is the Laplacian in the metric $g$.


If we write $g = u_0^4\,\delta$ outside a compact set, then the preceding formula gives
that $\Delta_\delta u_0 = 0$. Using spherical harmonics, we obtain the expansion
\[
u_0(x) = a + \frac{b}{|x|} + O\Bigl(\frac{1}{|x|^2}\Bigr),
\]
where $a, b \in \mathbb{R}$ are constants, so that $u_0(x) \to a$ as $|x| \to \infty$. Using this, we
readily compute that the mass of $(M, g)$ is $2ab$.
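One way to see the value $2ab$ is to rescale coordinates so that the conformal factor tends to 1 at infinity. The sketch below assumes the normalization in which a metric of the form $(1 + m/(2|y|) + O(|y|^{-2}))^4\,\delta$ has ADM mass $m$; the rescaled coordinate $y = a^2 x$ is introduced here for the computation.

```latex
\begin{align*}
g = u_0^4\,\delta_x
 &= \Bigl(a + \frac{b}{|x|} + O(|x|^{-2})\Bigr)^4 \delta_x
  = \Bigl(1 + \frac{b}{a|x|} + O(|x|^{-2})\Bigr)^4 a^4\,\delta_x \\
 &= \Bigl(1 + \frac{ab}{|y|} + O(|y|^{-2})\Bigr)^4 \delta_y,
 \qquad y = a^2 x, \\
\frac{m}{2} &= ab \;\Longrightarrow\; m = 2ab .
\end{align*}
```

Here $\delta_y = a^4\,\delta_x$ and $b/(a|x|) = ab/|y|$, which is all the rescaling uses.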

The following theorem justifies our interest in harmonically flat manifolds.

Theorem 9-29 (Schoen–Yau [203]). Let $(M^3, g)$ be an asymptotically flat
manifold with $R(g) \ge 0$. For any $\epsilon > 0$, there exists a metric $g_0$ on $M$ that is
harmonically flat at infinity, has $R(g_0) \ge 0$, and satisfies
\[
(1 - \epsilon)\,g(V, V) \le g_0(V, V) \le (1 + \epsilon)\,g(V, V)
\]
for all $V \in T_pM$ and $p \in M$; moreover
\[
|m_0 - m| \le \epsilon,
\]
where $m_0$ and $m$ are the ADM masses of $(M^3, g_0)$ and $(M^3, g)$.
Using Theorem 9-29 above we see that the proof of the RPI can be reduced to
the harmonically flat case [26]. Let us now define the conformal flow that Bray
introduced to prove the RPI for these manifolds.

Steps 1 and 2 (see the outline above). Let $\Sigma_t$ be the outermost area-minimizing
enclosure of $\Sigma = \Sigma_0$ in $(M^3, g_t)$ and $g_t$ be a one-parameter family of metrics
defined by
\[
g_t = u_t(x)^4\,g_0 \quad \text{for } t \ge 0,
\]
where $u_0(x) \equiv 1$, and $u_t(x)$ is given by
\[
u_t(x) = 1 + \int_0^t v_s(x)\,ds.
\]
Here, we choose $v_t(x)$ so that it satisfies
\[
\begin{cases}
\Delta_{g_0} v_t(x) = 0 & \text{outside } \Sigma_t, \\
v_t = 0 & \text{on } \Sigma_t, \\
\lim_{|x|\to\infty} v_t(x) = -e^{-t}.
\end{cases}
\]

Extending $v_t$ by 0 inside $\Sigma_t$, we see $v_t$, and hence $u_t$, is superharmonic, and
hence $u_t > 0$ on $M$. The transformation law (9.5.1) for the scalar curvature under
conformal deformations yields $R(g_t) = u_t^{-5}\bigl(R(g_0)\,u_t - 8\Delta_{g_0}u_t\bigr) \ge 0$ outside $\Sigma_t$.
Note also that $g_t$ remains asymptotically flat, since $u_t(x) \to e^{-t}$ when $|x| \to \infty$.
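The limiting value of the conformal factor at infinity follows by integrating the boundary condition for the velocity in time. This is a formal sketch: it assumes the limit in $x$ may be exchanged with the time integral, and the superharmonicity is understood in the distributional sense after the velocity is extended by zero.

```latex
\[
\lim_{|x|\to\infty} u_t(x) = 1 + \int_0^t \lim_{|x|\to\infty} v_s(x)\,ds
= 1 - \int_0^t e^{-s}\,ds = e^{-t},
\qquad
\Delta_{g_0} u_t = \int_0^t \Delta_{g_0} v_s\,ds \le 0 .
\]
```

The exponential decay of the boundary value at infinity is what keeps the conformal factor positive there for all time.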
The proof of existence and regularity of solutions of the conformal flow
defined above is based on an approximating scheme, and uses deep notions from
geometric measure theory. It is rather involved, and we omit it here. Along the
way, Bray is able to prove that $\Sigma_t$ is enclosed by (but disjoint from) $\Sigma_{t^*}$ whenever
$t < t^*$. From this and the definition of $v_t$ he shows that the area of the horizon
does not increase infinitesimally, concluding $|\Sigma_t| = |\Sigma_0|$ for all $t \in (0, \infty)$. This
fact is vital to his proof of the RPI, but is technical in the weak setting.

Exercise 9-30. Prove that $d|\Sigma_t|/dt = 0$ in the smooth case; this calculation uses
and motivates the definition of $v_t$.

Step 3. We next sketch the proof that m(t) is nonincreasing. This involves
a generalization of the classical notion of electrostatic capacity of bodies in
Euclidean space (see [183]).
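Definition 9-31. Let $(M^3, g)$ be an asymptotically flat manifold with boundary
$\Sigma$. The capacity of $\Sigma$ is
\[
C(\Sigma) = \inf_{\varphi \in \mathcal{M}_0^1} \Bigl\{ \frac{1}{4\pi} \int_M |\nabla\varphi|^2\,dV_g \Bigr\},
\]
where $\mathcal{M}_0^1$ denotes the space of smooth functions $\varphi$ such that $\varphi|_\Sigma = 0$ and $\varphi \to 1$
as $|x| \to \infty$, and $\nabla$ is the gradient operator of the metric $g$.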
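As an illustration of the definition, one can compute the capacity of a round sphere of radius $R$ bounding the exterior region $\{|x| \ge R\}$ in Euclidean $\mathbb{R}^3$. This example is chosen here for simplicity (the sphere is not a minimal surface), and it uses the standard fact that the infimum is attained by the harmonic function with the prescribed boundary values; $S_R$ is notation introduced for this example.

```latex
\[
\varphi(x) = 1 - \frac{R}{|x|}, \qquad
|\nabla\varphi|^2 = \frac{R^2}{|x|^4}, \qquad
C(S_R) = \frac{1}{4\pi} \int_R^\infty \frac{R^2}{r^4}\,4\pi r^2\,dr
= R^2 \int_R^\infty \frac{dr}{r^2} = R .
\]
```

This recovers the classical electrostatic capacity of a ball, which is the notion the definition generalizes.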
Bray uses a mass-capacity inequality to prove that m(t) is nonincreasing, as
we see below. The precise statement is the following.
Theorem 9-32 (mass-capacity inequality [26]). Let $(M^3, g)$ be an asymptotically
flat manifold with nonnegative scalar curvature and boundary $\Sigma$, which is a
minimal surface. Then
\[
m \ge C(\Sigma),
\]
where $m$ is the ADM mass of $M$. Equality is achieved if and only if $(M^3, g)$ is
isometric to the Riemannian Schwarzschild manifold.
The idea behind the proof of Theorem 9-32 is to use a trick due to Bunting
and Masood-ul-Alam [36], which consists in reflecting the metric of $M$ across
the boundary of $M$ in order to obtain an asymptotically flat metric with two ends.
This reflection is nontrivial since, in general, the reflected metric is not smooth
across the boundary. Bray overcomes this difficulty by solving an ODE system,
as we see below. Once a suitable metric has been constructed in the reflected
manifold, Bray applies the following theorem:
Theorem 9-33. Let $(M^3, g)$ be a Riemannian manifold with nonnegative scalar
curvature, multiple asymptotically flat ends and no boundary. Let $E$ be one of its
ends. Then
\[
\tfrac12\,m(E) \ge C(E) := \inf_{\varphi \in \mathcal{M}_0^1} \Bigl\{ \frac{1}{4\pi} \int_M |\nabla\varphi|^2\,dV_g \Bigr\},
\]
where $\mathcal{M}_0^1$ is the set of all smooth functions in $M$ that approach 1 at infinity in
the end $E$ and approach 0 in all the other ends.
The proof depends on the following observation:
Claim. The function $\phi \in \mathcal{M}_0^1$ that realizes the infimum of $C(E)$ has the special
form $\phi(x) = 1 - C(E)/|x| + O(|x|^{-2})$.
The Euler–Lagrange equation for the capacity is the Laplace equation, and
indeed the infimum $C(E)$ is realized by the harmonic function with the
appropriate end behavior (and analogously for $C(\Sigma)$). Since $\phi$ is harmonic and tends
to 1 at infinity in the chosen end, we have
\[
C(E) = \frac{1}{4\pi} \int_\Sigma \frac{d\phi}{d\nu}\,d\mu_g
\]
after integrating by parts over the manifold. Here, $\Sigma$ is the smooth compact
boundary of a set containing all the ends of $M$ except for the chosen end $E$, and
$\nu$ is the unit normal vector to $\Sigma$. Letting $\Sigma$ be an arbitrarily large sphere in $E$,
the claim follows.
Proof of Theorem 9-33. Let $k$ be the number of ends of $M$ other than $E$. Without
loss of generality let us assume that $(M, g)$ is harmonically flat at infinity, and
consider the metric $\tilde g$ defined on $M$ as $\tilde g = \phi(x)^4 g$, where $\phi(x)$ is the function
that realizes the infimum in the calculation of $C(E)$ [26]. Since $\phi$ goes to zero
in all the ends except for $E$, we can use the removable singularity theorem to
extend the metric $\tilde g$ to $\tilde M = M \cup \{\infty_1\} \cup \cdots \cup \{\infty_k\}$. Thus $\tilde M$ is the manifold
obtained from compactifying all ends of $M$ other than $E$. It follows that $\tilde g$ is a
harmonically flat metric on $\tilde M$ with one end, and nonnegative scalar curvature.
To keep track of the mass $\tilde m$ of $(\tilde M, \tilde g)$ in terms of $m = m(E)$, we use the
expansion of $u$ from above. More precisely, since $g$ is harmonically flat, we
know it has the form $g = u^4\delta$, where $u(x) = 1 + \frac12 m/|x| + O(|x|^{-2})$. We can
similarly write $\tilde g = \tilde u^4\delta = \phi^4 u^4 \delta$, where $\tilde u(x) = 1 + \frac12 \tilde m/|x| + O(|x|^{-2})$. Using
the spherical harmonic expansion for $\phi$ we find that

\[
m - 2C(E) = \tilde m \ge 0,
\]

where the last inequality follows from the positive mass theorem. Putting this
together yields $m \ge 2C(E)$, as desired. See [26] for the case of equality. □
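The mass identity used in the proof comes from multiplying the two asymptotic expansions. As a short sketch, with $\tilde u = \phi\,u$ and the expansion of $\phi$ taken from the Claim:

```latex
\begin{align*}
\tilde u(x) = \phi(x)\,u(x)
 &= \Bigl(1 - \frac{C(E)}{|x|} + O(|x|^{-2})\Bigr)\Bigl(1 + \frac{m}{2|x|} + O(|x|^{-2})\Bigr) \\
 &= 1 + \frac{\tfrac12 m - C(E)}{|x|} + O(|x|^{-2}),
\end{align*}
\[
\tfrac12\,\tilde m = \tfrac12\,m - C(E) \;\Longrightarrow\; \tilde m = m - 2C(E).
\]
```

Comparing the coefficient of $1/|x|$ with the normalization of $\tilde u$ gives the stated relation.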
Proof of the mass-capacity inequality, Theorem 9-32. Bray uses a reflection
argument inspired by the work of Bunting and Masood-ul-Alam [36] in order to
double $M$ and reflect the metric smoothly across the boundary. The process is
depicted in Figure 11.
The main idea is to double $(M, g)$ using a specific construction. We glue
cylindrical tubes of the form $(\Sigma \times [0, 2\delta], G)$ onto the two copies of the inner
boundaries. We define coordinates $(z, t)$ on the tube $\Sigma \times [0, 2\delta]$, so that points
of the form $(z, 0)$ are glued to one copy of $M$, and points of the form $(z, 2\delta)$
land on the other copy. The difficulty of this process is to define the
metric $G$ in such a way that the doubled manifold $(\tilde M_\delta, \tilde g_\delta)$ not only has a smooth
metric, but also has nonnegative scalar curvature. Furthermore, we want
the mass of either end of $\tilde M_\delta$ to be arbitrarily close to $m$.

Figure 11. Bunting and Masood-ul-Alam’s reflection trick [26].

In order to do this, Bray solves an ODE, as we see below. We start by defining
the following components of $G$:

\[
G(\partial_t, \partial_t) = 1, \qquad G(\partial_t, \partial_{z_1}) = 0, \qquad G(\partial_t, \partial_{z_2}) = 0,
\]

where $\{\partial_t, \partial_{z_1}, \partial_{z_2}\}$ are the coordinate vector fields. We still need to define the
metric over all slices $\Sigma \times \{t\}$. To do this, consider the unique solution of the
ODE system
\[
\frac{d}{dt}\,G_{ij}(z, t) = 2\,G_{ik}(z, t)\,A^k_j(z, t),
\]
with smooth initial conditions $G_{ij}(z, 0) = g_{ij}(z, 0)$, where $A_{ij}(z, 0)$ is the second
fundamental form of $\Sigma$ in $M$ for $i, j$ indexing $\{\partial_{z_1}, \partial_{z_2}\}$, and where $A^k_j(z, t) =
-A^k_j(z, 2\delta - t)$ extends smoothly to $\tilde M_\delta$. Then, using this definition of $G$, the
second fundamental form of each slice is automatically $A_{ij}(z, t)$, and $G$ is
symmetric about $t = \delta$; i.e., $G_{ij}(z, t) = G_{ij}(z, 2\delta - t)$. Bray then conformally
deforms the metric once more, and obtains a metric on $\tilde M_\delta$ whose scalar curvature
is nonnegative and whose mass $\tilde m_\delta$ converges to $m$ as $\delta \to 0^+$.
Finally, let $\varphi$ be the function that realizes $C(\Sigma)$, so that
\[
\begin{cases}
\lim_{x\to\infty} \varphi(x) = 1, \\
\Delta_g \varphi = 0 & \text{in } M, \\
\varphi(x) = 0 & \text{on } \Sigma.
\end{cases}
\]
For an end $E_\delta$ of $\tilde M_\delta$, let $\phi = \phi_\delta$ be the function which realizes $C(E_\delta)$. Then $\phi$
satisfies
\[
\begin{cases}
\lim_{x\to\infty} \phi(x) = 1, \\
\Delta_{\tilde g_\delta} \phi = 0 & \text{in } \tilde M_\delta, \\
\lim_{x\to-\infty} \phi(x) = 0.
\end{cases}
\]
(Here we use $-\infty$ to denote the reflected end in $\tilde M_\delta$.) Letting $\delta \to 0^+$ we find
that, by symmetry, $\phi(x) = \frac12$ on $\Sigma$. Hence by uniqueness of harmonic functions
we find that $\phi(x) = \frac12(\varphi(x) + 1)$ on $M$, and so $C(E) = \frac12 C(\Sigma)$, as required. The
proof of the theorem follows from the fact that $m \ge 2C(E) = C(\Sigma)$ because
of Theorem 9-33. The characterization of the case of equality can be found
in [26]. □
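The factor of one half relating the two capacities can be seen from the energy integrals. The following is a heuristic sketch in the limit of vanishing tube width: it writes $M'$ for the reflected copy of $M$ (a notation introduced here), assumes the limiting harmonic function equals $\tfrac12(\varphi + 1)$ on $M$ and, by symmetry, $\tfrac12(1 - \varphi)$ on $M'$, and ignores the contribution of the collapsing tube.

```latex
\[
C(E) = \frac{1}{4\pi} \int_{M \cup M'} |\nabla\phi|^2\,dV
= 2 \cdot \frac{1}{4\pi} \int_M \tfrac14\,|\nabla\varphi|^2\,dV_g
= \tfrac12\,C(\Sigma),
\qquad
m \ge 2C(E) = C(\Sigma).
\]
```

Each copy contributes a quarter of the original energy, and there are two copies.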
To finalize the proof of Step 3, we see that from the definition of $u_t$ it follows
that $du_t/dt = v_t$. The same argument as for the Claim preceding the proof of Theorem 9-33 yields
\[
v_0(x) = -1 + \frac{C(\Sigma_0)}{|x|} + O\Bigl(\frac{1}{|x|^2}\Bigr).
\]
We can write the initial metric in harmonically flat format as $g_0 = U_0^4\,\delta$, where $U_0$
goes to 1 at infinity, and so that $U_0(x) = 1 + \frac{m(0)}{2|x|} + O\bigl(\frac{1}{|x|^2}\bigr)$. Using this we write
$g_t$ as both $g_t = U_t(x)^4\,\delta$ and $g_t = u_t(x)^4 g_0$; this shows that $U_t(x) = u_t(x)\,U_0(x)$.
Now let $a(t)$ and $b(t)$ be the functions defined by $u_t(x) = a(t) + \frac{b(t)}{|x|} + O\bigl(\frac{1}{|x|^2}\bigr)$.
Using the above calculations it follows that
\[
U_t(x) = a(t) + \Bigl(b(t) + \frac{m(0)}{2}\,a(t)\Bigr)\frac{1}{|x|} + O\Bigl(\frac{1}{|x|^2}\Bigr),
\]
so $m(t) = 2a(t)\bigl(b(t) + \frac12 m(0)\,a(t)\bigr)$. By the expansions of $u_t(x)$ and $v_0(x)$ from above,
and the fact that $u_0(x) \equiv 1$, we obtain $a(0) = 1$, $b(0) = 0$, $a'(0) = -1$, and
$b'(0) = C(\Sigma_0)$. From this, it follows that
\[
m'(0) = 2C(\Sigma_0) - 2m(0) \le 0,
\]
where the inequality is the mass-capacity inequality of Theorem 9-32, as desired.
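The derivative of the mass at the initial time can be computed explicitly from the product rule. As a sketch, using the initial values $a(0) = 1$, $b(0) = 0$, $a'(0) = -1$, and $b'(0)$ equal to the capacity of the initial horizon:

```latex
\begin{align*}
m(t) &= 2\,a(t)\,b(t) + m(0)\,a(t)^2, \\
m'(0) &= 2\,a'(0)\,b(0) + 2\,a(0)\,b'(0) + 2\,m(0)\,a(0)\,a'(0)
 = 0 + 2\,C(\Sigma_0) - 2\,m(0) .
\end{align*}
```

The sign then follows from the mass-capacity inequality, which says the initial mass is at least the capacity of the initial horizon.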

Step 4. The last step is to show that $(M, g_t)$ converges (in some sense) to
the Riemannian Schwarzschild metric as $t \to \infty$. Bray is able to show that the
rescaled horizon converges to a coordinate sphere of radius $\frac{m}{2}$ in $\mathbb{R}^3$ and that the
rescaled solution of the flow $\tilde u_t$ converges to $1 + \frac{m}{2|x|}$ as $t \to \infty$. The interested
reader is directed to Bray’s paper [26] for more details. □
References

[1] R. Abraham, J. E. Marsden, and T. Ratiu, Manifolds, tensor analysis, and applications, 2nd
ed., Applied Math. Sciences 75, Springer, New York, 1988.
[2] R. A. Adams, Sobolev spaces, Pure and Applied Math. 65, Academic Press, San Diego, 1975.
[3] S. Agmon, A. Douglis, and L. Nirenberg, “Estimates near the boundary for solutions of elliptic
partial differential equations satisfying general boundary conditions, II”, Comm. Pure Appl. Math.
17 (1964), 35–92.
[4] A. D. Alexandrov, “On the theory of mixed volumes, II: New inequalities between mixed
volumes and their applications”, Mat. Sbornik (N.S.) 2 (1937), 1205–1238. In Russian.
[5] A. D. Alexandrov, “On the theory of mixed volumes, III: Extension of two theorems of
Minkowski on convex polyhedra to arbitrary convex surfaces”, Mat. Sbornik (N.S.) 3 (1938),
27–46. In Russian.
[6] J. Anderson, J. Corvino, and F. Pasqualotto, “Multi-localized time-symmetric initial data for
the Einstein vacuum equations”, J. Reine Angew. Math. 808 (2024), 67–110.
[7] L. Andersson and V. Moncrief, “Elliptic-hyperbolic systems and the Einstein equations”, Ann.
Henri Poincaré 4:1 (2003), 1–34.
[8] V. I. Arnold, Mathematical methods of classical mechanics, Graduate Texts in Math. 60,
Springer, New York, 1978.
[9] R. Arnowitt, S. Deser, and C. W. Misner, “Coordinate invariance and energy expressions in
general relativity”, Phys. Rev. (2) 122 (1961), 997–1006.
[10] R. Arnowitt, S. Deser, and C. W. Misner, “The dynamics of general relativity”, pp. 227–265
in Gravitation: an introduction to current research, edited by L. Witten, Wiley, New York, 1962.
[11] A. Ashtekar, B. K. Berger, J. Isenberg, and M. MacCallum (editors), General relativity and
gravitation: a centennial perspective, Cambridge Univ. Press, Cambridge, 2015.
[12] S. Axler, P. Bourdon, and W. Ramey, Harmonic function theory, Graduate Texts in
Mathematics 137, Springer, 1992.
[13] J. L. Barbosa and M. do Carmo, “Stability of hypersurfaces with constant mean curvature”,
Math. Z. 185:3 (1984), 339–353.
[14] J. L. Barbosa, M. do Carmo, and J. Eschenburg, “Stability of hypersurfaces of constant mean
curvature in Riemannian manifolds”, Math. Z. 197:1 (1988), 123–138.
[15] R. Bartnik, “The mass of an asymptotically flat manifold”, Comm. Pure Appl. Math. 39:5
(1986), 661–693.
[16] R. Bartnik, “Phase space for the Einstein equations”, Comm. Anal. Geom. 13:5 (2005),
845–885.

[17] R. Bartnik and J. Isenberg, “The constraint equations”, pp. 1–38 in The Einstein equations
and the large scale behavior of gravitational fields, edited by P. T. Chruściel and H. Friedrich,
Birkhäuser, Basel, 2004.
[18] R. Beig and N. Ó Murchadha, “The Poincaré group as the symmetry group of canonical
general relativity”, Ann. Physics 174:2 (1987), 463–498.
[19] R. Beig, P. T. Chruściel, and R. Schoen, “KIDs are non-generic”, Ann. Henri Poincaré 6:1
(2005), 155–194.
[20] M. Berger and D. Ebin, “Some decompositions of the space of symmetric tensors on a
Riemannian manifold”, J. Differential Geometry 3 (1969), 379–392.
[21] A. N. Bernal and M. Sánchez, “Globally hyperbolic spacetimes can be defined as ‘causal’
instead of ‘strongly causal’”, Classical Quantum Gravity 24:3 (2007), 745–749.
[22] A. L. Besse, Einstein manifolds, Ergebnisse der Math. (3) 10, Springer, Berlin, 1987.
[23] E. Bonning, P. Marronetti, D. Neilsen, and R. Matzner, “Physics and initial data for multiple
black hole spacetimes”, Phys. Rev. D (3) 68:4 (2003), 044019, 17.
[24] J.-P. Bourguignon, D. G. Ebin, and J. E. Marsden, “Sur le noyau des opérateurs
pseudo-différentiels à symbole surjectif et non injectif”, C. R. Acad. Sci. Paris Sér. A-B 282:16 (1976),
Aii, A867–A870.
[25] H. L. Bray, The Penrose inequality in general relativity and volume comparison theorems
involving scalar curvature, Ph.D. thesis, Stanford University, 1997,
https://www.proquest.com/docview/304386501. arXiv 0902.3241
[26] H. L. Bray, “Proof of the Riemannian Penrose inequality using the positive mass theorem”, J.
Differential Geom. 59:2 (2001), 177–267.
[27] H. L. Bray, “On dark matter, spiral galaxies, and the axioms of general relativity”, pp. 1–64
in Geometric analysis, mathematical relativity, and nonlinear partial differential equations,
Contemp. Math. 599, Amer. Math. Soc., Providence, RI, 2013.
[28] H. L. Bray and D. A. Lee, “On the Riemannian Penrose inequality in dimensions less than
eight”, Duke Math. J. 148:1 (2009), 81–106.
[29] H. Bray and F. Morgan, “An isoperimetric comparison theorem for Schwarzschild space and
other manifolds”, Proc. Amer. Math. Soc. 130:5 (2002), 1467–1472.
[30] H. L. Bray and A. R. Parry, “Modeling wave dark matter in dwarf spheroidal galaxies”, J.
Phys. Conf. Series 615 (2015), art. id. 012001.
[31] S. Brendle, “Constant mean curvature surfaces in warped product manifolds”, Publ. Math.
Inst. Hautes Études Sci. 117 (2013), 247–269.
[32] S. Brendle and M. Eichmair, “Isoperimetric and Weingarten surfaces in the Schwarzschild
manifold”, J. Differential Geom. 94:3 (2013), 387–407.
[33] S. Brendle and M. Eichmair, “Large outlying stable constant mean curvature spheres in initial
data sets”, Invent. Math. 197:3 (2014), 663–682.
[34] S. Brendle and F. C. Marques, “Scalar curvature rigidity of geodesic balls in $S^n$”,
J. Differential Geom. 88:3 (2011), 379–394.
[35] S. Brendle, F. C. Marques, and A. Neves, “Deformations of the hemisphere that increase
scalar curvature”, Invent. Math. 185:1 (2011), 175–197.
[36] G. L. Bunting and A. K. M. Masood-ul-Alam, “Nonexistence of multiple black holes in
asymptotically Euclidean static vacuum space-time”, Gen. Relativity Gravitation 19:2 (1987),
147–154.
[37] M. Cai and G. J. Galloway, “Rigidity of area minimizing tori in 3-manifolds of nonnegative
scalar curvature”, Comm. Anal. Geom. 8:3 (2000), 565–573.
[38] J. J. Callahan, The geometry of spacetime: an introduction to special and general relativity,
Springer, New York, 2000.
[39] M. Cantor, “Elliptic operators and the decomposition of tensor fields”, Bull. Amer. Math. Soc.
(N.S.) 5:3 (1981), 235–262.
[40] M. Carfora and A. Marzuoli, Einstein constraints and Ricci flow: a geometrical averaging of
initial data sets, Springer, Singapore, 2023.
[41] M. P. do Carmo, Riemannian geometry, Birkhäuser, Boston, 1992.
[42] S. Carroll, Spacetime and geometry: An introduction to general relativity, Addison Wesley,
San Francisco, 2004.
[43] C. Cederbaum and C. Nerz, “Explicit Riemannian manifolds with unexpectedly behaving
center of mass”, Ann. Henri Poincaré 16:7 (2015), 1609–1631.
[44] A. Chaljub-Simon, “Systèmes elliptiques linéaires dans des espaces de fonctions höldériennes
à poids”, Rend. Circ. Mat. Palermo (2) 30:2 (1981), 300–310.
[45] A. Chaljub-Simon and Y. Choquet-Bruhat, “Problèmes elliptiques du second ordre sur une
variété euclidienne à l’infini”, Ann. Fac. Sci. Toulouse Math. (5) 1:1 (1979), 9–25.
[46] P.-Y. Chan and L.-F. Tam, “A note on center of mass”, Comm. Anal. Geom. 24:3 (2016),
471–486.
[47] P.-N. Chen, L.-H. Huang, M.-T. Wang, and S.-T. Yau, “On the validity of the definition of
angular momentum in general relativity”, Ann. Henri Poincaré 17:2 (2016), 253–270.
[48] P.-N. Chen, M.-T. Wang, and S.-T. Yau, “Conserved quantities in general relativity: from the
quasi-local level to spatial infinity”, Comm. Math. Phys. 338:1 (2015), 31–80.
[49] Y. Choquet-Bruhat (as Fourès-Bruhat), “Théorème d’existence pour certains systèmes
d’équations aux dérivées partielles non linéaires”, Acta Math. 88 (1952), 141–225.
[50] Y. Choquet-Bruhat, “New elliptic system and global solutions for the constraints equations in
general relativity”, Comm. Math. Phys. 21 (1971), 211–218.
[51] Y. Choquet-Bruhat, General relativity and the Einstein equations, Oxford Univ. Press, 2009.
[52] Y. Choquet-Bruhat and J. W. York, Jr., “The Cauchy problem”, pp. 99–172 in General
relativity and gravitation, vol. 1, edited by A. Held, Plenum Press, New York, 1980.
[53] B. Chow and R. Gulliver, “Aleksandrov reflection and geometric evolution of hypersurfaces”,
Comm. Anal. Geom. 9:2 (2001), 261–280.
[54] D. Christodoulou, “Global solutions of nonlinear hyperbolic equations for small initial data”,
Comm. Pure Appl. Math. 39:2 (1986), 267–282.
[55] D. Christodoulou and S. Klainerman, The global nonlinear stability of the Minkowski space,
Princeton Mathematical Series 41, Princeton University Press, 1993.
[56] D. Christodoulou and N. Ó Murchadha, “The boost problem in general relativity”, Comm.
Math. Phys. 80:2 (1981), 271–300.
[57] P. T. Chruściel, “Boundary conditions at spatial infinity from a Hamiltonian point of view”,
pp. 49–59 in Topological properties and global structure of space-time (Erice, 1985), NATO Adv.
Sci. Inst. Ser. B Phys. 138, Plenum, New York, 1986.
[58] P. T. Chruściel, “Mass and angular-momentum inequalities for axi-symmetric initial data sets.
I. Positivity of mass”, Ann. Physics 323:10 (2008), 2566–2590.
[59] P. T. Chruściel, “Elements of causality theory”, preprint, 2011. arXiv 1110.6706v1
[60] P. T. Chruściel and J. L. Costa, “Mass, angular-momentum and charge inequalities for
axisymmetric initial data”, Classical Quantum Gravity 26:23 (2009), 235013, 7.
[61] P. T. Chruściel and E. Delay, On mapping properties of the general relativistic constraints
operator in weighted function spaces, with applications, Mém. Soc. Math. Fr. (N.S.) 94, 2003.
[62] P. T. Chruściel and J. D. E. Grant, “On Lorentzian causality with continuous metrics”,
Classical Quantum Gravity 29:14 (2012), art. id. 145001.
[63] P. T. Chruściel, Y. Li, and G. Weinstein, “Mass and angular-momentum inequalities for
axi-symmetric initial data sets. II. Angular momentum”, Ann. Physics 323:10 (2008), 2591–2613.
[64] P. T. Chruściel, G. J. Galloway, and D. Pollack, “Mathematical general relativity: a sampler”,
Bull. Amer. Math. Soc. (N.S.) 47:4 (2010), 567–638.
[65] G. B. Cook, “Initial data for numerical relativity”, Living Rev. Relativ. 3 (2000), art. id.
2000–5.
[66] J. Corvino, “Scalar curvature deformation and a gluing construction for the Einstein constraint
equations”, Comm. Math. Phys. 214:1 (2000), 137–189.
[67] J. Corvino and L.-H. Huang, “Localized deformation for initial data sets with the dominant
energy condition”, Calc. Var. Partial Differential Equations 59:1 (2020), Paper No. 42, 43.
[68] J. Corvino and D. Pollack, “Scalar curvature and the Einstein constraint equations”,
pp. 145–188 in Surveys in geometric analysis and relativity, Adv. Lect. Math. (ALM) 20, International
Press, Somerville, MA, 2011.
[69] J. Corvino and R. M. Schoen, “On the asymptotics for the vacuum Einstein constraint
equations”, J. Differential Geom. 73:2 (2006), 185–217.
[70] J. Corvino, M. Eichmair, and P. Miao, “Deformation of scalar curvature and volume”, Math.
Ann. 357:2 (2013), 551–584.
[71] S. Dain, “Proof of the angular momentum-mass inequality for axisymmetric black holes”, J.
Differential Geom. 79:1 (2008), 33–67.
[72] C. De Lellis and S. Müller, “Optimal rigidity estimates for nearly umbilical surfaces”, J.
Differential Geom. 69:1 (2005), 75–110.
[73] D. M. DeTurck, “Existence of metrics with prescribed Ricci curvature: local theory”, Invent.
Math. 65:1 (1981/82), 179–207.
[74] D. M. DeTurck, “Deforming metrics in the direction of their Ricci tensors”, J. Differential
Geom. 18:1 (1983), 157–162.
[75] B. S. DeWitt, “Quantum theory of gravity, I: The canonical theory”, Phys. Rev. 160:5 (1967),
1113–1148.
[76] A. Douglis and L. Nirenberg, “Interior estimates for elliptic systems of partial differential
equations”, Comm. Pure Appl. Math. 8 (1955), 503–538.
[77] M. Eichmair, “The Jang equation reduction of the spacetime positive energy theorem in
dimensions less than eight”, Comm. Math. Phys. 319:3 (2013), 575–593.
[78] M. Eichmair, L.-H. Huang, D. A. Lee, and R. Schoen, “The spacetime positive mass theorem
in dimensions less than eight”, J. Eur. Math. Soc. 18:1 (2016), 83–121.
[79] M. Eichmair and J. Metzger, “On large volume preserving stable CMC surfaces in initial data
sets”, J. Differential Geom. 91:1 (2012), 81–102.
[80] M. Eichmair and J. Metzger, “Large isoperimetric surfaces in initial data sets”, J. Differential
Geom. 94:1 (2013), 159–186.
[81] M. Eichmair and J. Metzger, “Unique isoperimetric foliations of asymptotically flat manifolds
in all dimensions”, Invent. Math. 194:3 (2013), 591–630.
[82] A. Einstein, “Zur Elektrodynamik bewegter Körper”, Ann. der Physik 17 (1905), 891–921.
Translated as “On the electrodynamics of moving bodies”, pp. 35–65 in The principle
of relativity: a collection of original memoirs on the special and general theory of relativity,
Methuen, London, 1923; reprinted Dover, New York, 1952. Essentially the same text is posted at
http://www.fourmilab.ch/etexts/einstein/specrel/www/. A new translation appeared in
pp. 123–160 of [212].
[83] A. Einstein, “Die Grundlage der allgemeinen Relativitätstheorie”, Ann. der Physik 49:7
(1916), 769–822. Translated as “The foundation of the general theory of relativity”, pp. 109–164
in The principle of relativity: a collection of original memoirs on the special and general theory
of relativity, Methuen, London, 1923; reprinted Dover, New York, 1952.
[84] A. Einstein, Relativity: the special and the general theory, Methuen, London, 1920. Reprinted
by Crown Publishers, New York, 1961.
[85] A. Einstein, The meaning of relativity, Princeton Univ. Press, Princeton, 1922.
[86] L. C. Evans, Partial differential equations, Graduate Studies in Math. 19, Amer. Math. Soc.,
Providence, RI, 1998.
[87] C. F. W. Everitt et al., “The Gravity Probe B test of general relativity”, Classical Quantum
Gravity 32:22 (2015), art. id. 224001.
[88] A. E. Fischer and J. E. Marsden, “Linearization stability of the Einstein equations”, Bull.
Amer. Math. Soc. 79 (1973), 997–1003.
[89] A. E. Fischer and J. E. Marsden, “Deformations of the scalar curvature”, Duke Math. J. 42:3
(1975), 519–547.
[90] A. E. Fischer and J. E. Marsden, The initial value problem and the dynamical formulation of
general relativity, edited by S. W. Hawking and W. Israel, Cambridge Univ. Press, Cambridge,
1979.
[91] A. E. Fischer and J. A. Wolf, “The structure of compact Ricci-flat Riemannian manifolds”, J.
Differential Geometry 10 (1975), 277–288.
[92] D. Fischer-Colbrie and R. Schoen, “The structure of complete stable minimal surfaces in
3-manifolds of nonnegative scalar curvature”, Comm. Pure Appl. Math. 33:2 (1980), 199–211.
[93] G. B. Folland, Introduction to partial differential equations, 2nd ed., Princeton University
Press, 1995.
[94] T. Frankel, Gravitational curvature: An introduction to Einstein’s theory, W. H. Freeman,
San Francisco, 1979.
[95] A. Freire and F. Schwartz, “Mass-capacity inequalities for conformally flat manifolds with
boundary”, Comm. PDE 39 (2014), 98–119.
[96] A. P. French, Special relativity, W. W. Norton, New York, 1968.
[97] H. Friedrich, “On the existence of n-geodesically complete or future complete solutions of
Einstein’s field equations with smooth asymptotic structure”, Comm. Math. Phys. 107:4 (1986),
587–609.
[98] K. O. Friedrichs, “The identity of weak and strong extensions of differential operators”,
Trans. Amer. Math. Soc. 55 (1944), 132–151.
[99] G. J. Galloway, “Least area tori, black holes and topological censorship”, pp. 113–123 in
Differential geometry and mathematical physics (Vancouver, 1993), edited by J. K. Beem and
K. L. Duggal, Contemp. Math. 170, Amer. Math. Soc., Providence, RI, 1994.
[100] G. J. Galloway, “Stability and rigidity of extremal surfaces in Riemannian geometry and
general relativity”, pp. 221–239 in Surveys in geometric analysis and relativity, edited by H. L.
Bray and W. P. Minicozzi, Adv. Lect. Math. (ALM) 20, International Press, Somerville, MA,
2011.
[101] G. J. Galloway, Notes on Lorentzian causality: ESI-EMS-IAMP Summer School on
Mathematical Relativity (Vienna, 2014), 2014. http://hdl.handle.net/10385/2167.
[102] C. Gerhardt, “Flow of nonconvex hypersurfaces into spheres”, J. Diff. Geom. 32 (1990),
299–314.
[103] R. Geroch, “What is a singularity in general relativity?”, Ann. Phys. 48:3 (1968), 526–540.
[104] R. Geroch, “Domain of dependence”, J. Mathematical Phys. 11 (1970), 437–449.
[105] R. Geroch, “Energy extraction”, Ann. New York Acad. Sci. 224 (1973), 108–117.
[106] G. W. Gibbons, “The time symmetric initial value problem for black holes”, Comm. Math.
Phys. 27 (1972), 87–102.
[107] D. Gilbarg and N. S. Trudinger, Elliptic partial differential equations of second order, 2nd
ed., Grundlehren der Math. Wiss. 224, Springer, Berlin, 1983.
[108] E. Giusti, Minimal surfaces and functions of bounded variation, Monographs in Math. 80,
Birkhäuser, Basel, 1984.
[109] A. Gray, Tubes, Addison-Wesley, Redwood City, CA, 1990.
[110] O. Grøn, “Space geometry in rotating frames: a historical appraisal”, pp. 285–334 in
Relativity in rotating frames: relativistic physics in rotating reference frames, edited by G. Rizzi
and M. L. Ruggiero, Kluwer, Dordrecht, 2004.
[111] Q. Han, A basic course in partial differential equations, Graduate Studies in Mathematics
120, American Mathematical Society, 2011.
[112] S. W. Hawking and G. F. R. Ellis, The large scale structure of space-time, Cambridge
Monog. Math. Phys. 1, Cambridge Univ. Press, London, 1973.
[113] S. W. Hawking and G. T. Horowitz, “The gravitational Hamiltonian, action, entropy and
surface terms”, Classical Quantum Gravity 13:6 (1996), 1487–1498.
[114] L. Hörmander, “Pseudo-differential operators and non-elliptic boundary problems”, Ann. of
Math. (2) 83 (1966), 129–209.
[115] L. Hörmander, The analysis of linear partial differential operators, I: Distribution theory
and Fourier analysis, Grundlehren der Math. Wiss. 256, Springer, 1990.
[116] L.-H. Huang, “On the center of mass of isolated systems with general asymptotics”,
Classical Quantum Gravity 26:1 (2009), 015012, 25.
[117] L.-H. Huang, “Foliations by stable spheres with constant mean curvature for isolated systems
with general asymptotics”, Comm. Math. Phys. 300:2 (2010), 331–373.
[118] L.-H. Huang, “Solutions of special asymptotics to the Einstein constraint equations”,
Classical Quantum Gravity 27:24 (2010), 245002, 10.
[119] L.-H. Huang, “On the center of mass in general relativity”, pp. 575–591 in Fifth International
Congress of Chinese Mathematicians, AMS/IP Stud. Adv. Math. 51, pt. 1, Amer. Math. Soc.,
2012.
[120] L.-H. Huang and D. A. Lee, “Equality in the spacetime positive mass theorem”, Comm.
Math. Phys. 376:3 (2020), 2379–2407.
[121] L.-H. Huang and D. Wu, “The equality case of the Penrose inequality for asymptotically
flat graphs”, Trans. Amer. Math. Soc. 367:1 (2015), 31–47.
[122] L.-H. Huang, R. Schoen, and M.-T. Wang, “Specifying angular momentum and center of
mass for vacuum initial data sets”, Comm. Math. Phys. 306:3 (2011), 785–803.
[123] G. Huisken, “Flow by mean curvature of convex surfaces into spheres”, J. Differential Geom.
20:1 (1984), 237–266.
[124] G. Huisken, “The volume preserving mean curvature flow”, J. Reine Angew. Math. 382
(1987), 35–48.
[125] G. Huisken and T. Ilmanen, “The inverse mean curvature flow and the Riemannian Penrose
inequality”, J. Differential Geom. 59:3 (2001), 353–437.
[126] G. Huisken and S.-T. Yau, “Definition of center of mass for isolated physical systems and
unique foliations by stable spheres with constant mean curvature”, Invent. Math. 124:1-3 (1996),
281–311.
[127] J. Isenberg, “Constant mean curvature solutions of the Einstein constraint equations on
closed manifolds”, Classical Quantum Gravity 12:9 (1995), 2249–2274.
[128] J. Isenberg and V. Moncrief, “A set of nonconstant mean curvature solutions of the Einstein
constraint equations on closed manifolds”, Classical Quantum Gravity 13:7 (1996), 1819–1847.
[129] J. Isenberg, R. Mazzeo, and D. Pollack, “On the topology of vacuum spacetimes”, Ann.
Henri Poincaré 4:2 (2003), 369–383.
[130] P. S. Jang, “On the positive energy conjecture”, J. Mathematical Phys. 17:1 (1976), 141–145.
[131] P. S. Jang and R. M. Wald, “The positive energy conjecture and the cosmic censor
hypothesis”, J. Math. Phys. 18 (1977), 41–44.
[132] T. Kaluza, “Zur Relativitätstheorie”, Physikal. Z. 11 (1910), 977–978. Translated in
https://en.wikisource.org/wiki/Translation:On_the_Theory_of_Relativity_(Kaluza).
[133] N. Kapouleas, “Constant mean curvature surfaces constructed by fusing Wente tori”, Invent.
Math. 119:3 (1995), 443–518.
[134] J. L. Kazdan and F. W. Warner, “A direct approach to the determination of Gaussian and
scalar curvature functions”, Invent. Math. 28 (1975), 227–230.
[135] J. L. Kazdan and F. W. Warner, “Existence and conformal deformation of metrics with
prescribed Gaussian and scalar curvatures”, Ann. of Math. (2) 101 (1975), 317–331.
[136] J. L. Kazdan and F. W. Warner, “Prescribing curvatures”, pp. 309–319 in Differential
geometry (Stanford, 1973), vol. 2, Proc. Sympos. Pure Math. 27, 1975.
[137] D. Kleppner and R. J. Kolenkow, An introduction to mechanics, McGraw Hill, New York,
1973.
[138] M.-K. G. Lam, “The graph cases of the Riemannian positive mass and Penrose inequalities
in all dimensions”, preprint, 2010. arXiv 1010.4256v1
[139] H. B. Lawson, Jr. and M.-L. Michelsohn, Spin geometry, Princeton Math. Series 38,
Princeton Univ. Press, 1989.
[140] J. M. Lee, Riemannian manifolds: an introduction to curvature, Graduate Texts in Math.
176, Springer, New York, 1997.
[141] J. M. Lee, Introduction to smooth manifolds, Graduate Texts in Math. 218, Springer, New
York, 2003.
[142] D. A. Lee, Geometric relativity, Graduate Studies in Mathematics 201, Amer. Math. Soc.,
Providence, RI, 2019.
[143] J. M. Lee and T. H. Parker, “The Yamabe problem”, Bull. Amer. Math. Soc. (N.S.) 17:1
(1987), 37–91.
[144] G. Leoni, A first course in Sobolev spaces, Graduate Studies in Math. 105, Amer. Math.
Soc., Providence, RI, 2009.
[145] A. Lichnerowicz, “L’intégration des équations de la gravitation relativiste et le problème
des n corps”, J. Math. Pures Appl. (9) 23 (1944), 37–63.
[146] J. Lohkamp, “Scalar curvature and hammocks”, Math. Ann. 313:3 (1999), 385–407.
[147] J. Lohkamp, “The higher dimensional positive mass theorem, II”, preprint, 2017. arXiv
1612.07505v2
[148] D. Lovelock and H. Rund, Tensors, differential forms, and variational principles, Wiley, New
York, 1975.
[149] T. Marquardt, The inverse mean curvature flow for hypersurfaces with boundary, Ph.D.
thesis, Freie Universität Berlin, 2012.
[150] M. Mars, “Present status of the Penrose inequality”, Proc. Amer. Math. Soc. 132:1 (2004),
217–222.
[151] D. Maxwell, “Rough solutions of the Einstein constraint equations on compact manifolds”,
J. Hyperbolic Differ. Equ. 2:2 (2005), 521–546.
[152] D. Maxwell, “Solutions of the Einstein constraint equations with apparent horizon
boundaries”, Comm. Math. Phys. 253:3 (2005), 561–583.
[153] D. Maxwell, “The conformal method and the conformal thin-sandwich method are the
same”, Classical Quantum Gravity 31:14 (2014), 145006, 34.
[154] D. Maxwell, “Initial data in general relativity described by expansion, conformal
deformation and drift”, Comm. Anal. Geom. 29:1 (2021), 207–281.
[155] R. C. McOwen, “The behavior of the Laplacian on weighted Sobolev spaces”, Comm. Pure
Appl. Math. 32:6 (1979), 783–795.
[156] J. Metzger, “Foliations of asymptotically flat 3-manifolds by 2-surfaces of prescribed mean
curvature”, J. Differential Geom. 77:2 (2007), 201–236.
[157] N. Meyers, “An expansion about infinity for solutions of linear elliptic equations”, J. Math.
Mech. 12 (1963), 247–264.
[158] P. Miao, “Quasi-local mass via isometric embeddings: a review from a geometric
perspective”, Classical Quantum Gravity 32:23 (2015), art. id. 233001.
[159] P. Miao and L.-F. Tam, “Evaluation of the ADM mass and center of mass via the Ricci
tensor”, Proc. Amer. Math. Soc. 144:2 (2016), 753–761.
[160] H. Mirandola and F. Vitório, “The positive mass theorem and Penrose inequality for
graphical manifolds”, Comm. Anal. Geom. 23:2 (2015), 273–292.
[161] C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation, Freeman, San Francisco, 1973.
[162] C. Møller, The theory of relativity, Clarendon Press, Oxford, 1952.
[163] V. Moncrief, “Spacetime symmetries and linearization stability of the Einstein equations, I”,
J. Mathematical Phys. 16 (1975), 493–498.
[164] S. Montiel and A. Ros, Curves and surfaces, 2nd ed., Graduate Studies in
Mathematics 69, American Mathematical Society, Providence, RI; Real Sociedad Matemática
Española, Madrid, 2009. Translated from the 1998 Spanish original by Montiel and edited by
Donald Babbitt.
[165] J. R. Munkres, Elements of algebraic topology, Addison-Wesley, Menlo Park, CA, 1984.
[166] C. Nerz, “Foliations by stable spheres with constant mean curvature for isolated systems
without asymptotic symmetry”, Calc. Var. Partial Differential Equations 54:2 (2015), 1911–1946.
[167] A. Neves, “Insufficient convergence of inverse mean curvature flow on asymptotically
hyperbolic manifolds”, J. Differential Geom. 84:1 (2010), 191–229.
[168] L. Nirenberg and H. F. Walker, “The null spaces of elliptic partial differential operators
in $\mathbb{R}^n$”, J. Math. Anal. Appl. 42 (1973), 271–301. Collection of articles dedicated to Salomon
Bochner.
[169] J. Norton, “What was Einstein’s principle of equivalence?”, Stud. Hist. Philos. Sci. 16:3
(1985), 203–246.
[170] J. D. Norton, “General covariance and the foundations of general relativity: eight decades
of dispute”, Rep. Progr. Phys. 56:7 (1993), 791–858.
[171] J. D. Norton, “Mach’s principle before Einstein”, pp. 9–57 in Mach’s principle: from
Newton’s bucket to quantum gravity, edited by J. Barbour and H. Pfister, Einstein Studies 6,
Birkhäuser, 1995.
[172] N. Ó Murchadha and J. W. York, Jr., “Initial-value problem of general relativity, I: General
formulation and physical interpretation”, Phys. Rev. D (3) 10 (1974), 428–436.
[173] M. Obata, “Certain conditions for a Riemannian manifold to be isometric with a sphere”, J.
Math. Soc. Japan 14 (1962), 333–340.
[174] B. O’Neill, Semi-Riemannian geometry, with applications to relativity, Pure and Applied
Math. 103, Academic Press, New York, 1983.
[175] T. Parker and C. H. Taubes, “On Witten’s proof of the positive energy theorem”, Comm.
Math. Phys. 84:2 (1982), 223–238.
[176] R. Penrose, “Asymptotic properties of fields and space-times”, Phys. Rev. Lett. 10 (1963),
66–68.
[177] R. Penrose, “Gravitational collapse and space-time singularities”, Phys. Rev. Lett. 14 (1965),
57–59.
[178] R. Penrose, “Gravitational collapse: the role of general relativity”, Nuovo Cimento 1:(numero
speciale) (1969), 252–276.
[179] R. Penrose, Techniques of differential topology in relativity, CBMS Regional Conf. Series
Appl. Math. 7, Society for Industrial and Applied Mathematics, Philadelphia, 1972.
[180] R. Penrose, “Naked singularities”, Ann. New York Acad. Sci. 224 (1973), 125–134.
[181] R. Penrose, “Some unresolved problems in classical general relativity”, pp. 631–668 in
Seminar on Differential Geometry, edited by S.-T. Yau, Annals of Math. Stud. 102, Princeton
University Press, Princeton, NJ, 1982.
[182] P. Petersen, Riemannian geometry, 2nd ed., Graduate Texts in Math. 171, Springer, 2006.
[183] G. Pólya and G. Szegö, Isoperimetric inequalities in mathematical physics, Annals of
Mathematics Studies 27, Princeton University Press, Princeton, NJ, 1951.
[184] J. Qing and G. Tian, “On the uniqueness of the foliation of spheres of constant mean
curvature in asymptotically flat 3-manifolds”, J. Amer. Math. Soc. 20:4 (2007), 1091–1110
(electronic).
[185] J. Qing and W. Yuan, “On scalar curvature rigidity of vacuum static spaces”, Math. Ann.
365:3-4 (2016), 1257–1277.
[186] W. Qiu, “Interior regularity of solutions to the isotropically constrained Plateau problem”,
Comm. Anal. Geom. 11:5 (2003), 945–986.
[187] T. Regge and C. Teitelboim, “Role of surface integrals in the Hamiltonian formulation of
general relativity”, Ann. Physics 88 (1974), 286–318.
[188] R. Resnick, Introduction to special relativity, Wiley, 1968.
[189] W. Rindler, Introduction to special relativity, 2nd ed., Oxford Univ. Press, New York, 1991.
[190] H. Ringström, The Cauchy problem in general relativity, European Mathematical Society,
Zürich, 2009. Errata at https://people.kth.se/~hansr/errata.html.
[191] H. Ringström, On the topology and future stability of the universe, Oxford Univ. Press,
2013.
[192] H. P. Robertson, “Postulate versus observation in the special theory of relativity”, Rev.
Modern Phys. 21 (1949), 378–382.
[193] H. L. Royden, Real analysis, 3rd ed., Macmillan, New York, 1988.
[194] W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill, New York, 1987.
[195] W. Rudin, Functional analysis, 2nd ed., McGraw-Hill, New York, 1991.
[196] R. Schoen, “Conformal deformation of a Riemannian metric to constant scalar curvature”, J.
Differential Geom. 20:2 (1984), 479–495.
[197] R. M. Schoen, “Variational theory for the total scalar curvature functional for Riemannian
metrics and related topics”, pp. 120–154 in Topics in calculus of variations (Montecatini Terme,
1987), Lecture Notes in Math. 1365, Springer, 1989.
[198] R. Schoen and S. T. Yau, “Existence of incompressible minimal surfaces and the topology of
three-dimensional manifolds with nonnegative scalar curvature”, Ann. of Math. (2) 110:1 (1979),
127–142.
[199] R. Schoen and S. T. Yau, “On the proof of the positive mass conjecture in general relativity”,
Comm. Math. Phys. 65:1 (1979), 45–76.
[200] R. Schoen and S. T. Yau, “On the structure of manifolds with positive scalar curvature”,
Manuscripta Math. 28:1-3 (1979), 159–183.
[201] R. M. Schoen and S. T. Yau, “Complete manifolds with nonnegative scalar curvature and
the positive action conjecture in general relativity”, Proc. Nat. Acad. Sci. U.S.A. 76:3 (1979),
1024–1025.
[202] R. Schoen and S. T. Yau, “The energy and the linear momentum of space-times in general
relativity”, Comm. Math. Phys. 79:1 (1981), 47–51.
[203] R. Schoen and S. T. Yau, “Proof of the positive mass theorem, II”, Comm. Math. Phys. 79:2
(1981), 231–260.
[204] R. Schoen and S.-T. Yau, Lectures on differential geometry, Conference Proceedings and
Lecture Notes in Geometry and Topology, I, International Press, Cambridge, MA, 1994.
[205] R. Schoen and S.-T. Yau, “Positive scalar curvature and minimal hypersurface singularities”,
pp. 441–480 in Surveys in differential geometry, 2019: Differential geometry, Calabi–Yau
theory, and general relativity, vol. 2, Surv. Differ. Geom. 24, International Press, Boston, 2022.
arXiv 1704.05490
[206] R. Schoen and X. Zhou, “Convexity of reduced energy and mass angular momentum
inequalities”, Ann. Henri Poincaré 14:7 (2013), 1747–1773.
[207] B. F. Schutz, A first course in general relativity, Cambridge Univ. Press, Cambridge, 1990.
[208] J. M. M. Senovilla and D. Garfinkle, “The 1965 Penrose singularity theorem”, Classical
Quantum Gravity 32:12 (2015), 124008, 45.
[209] L. Simon, Theorems on regularity and singularity of energy minimizing maps, Birkhäuser,
Basel, 1996.
[210] L. Simon, “Schauder estimates by scaling”, Calc. Var. Partial Differential Equations 5:5
(1997), 391–407.
[211] B. Smith and G. Weinstein, “Quasiconvex foliations and asymptotically flat metrics of
non-negative scalar curvature”, Comm. Anal. Geom. 12:3 (2004), 511–551.
[212] J. Stachel, Einstein’s miraculous year: five papers that changed the face of physics, Princeton
Univ. Press, 1998.
[213] J. D. Streets, “Quasi-local mass functionals and generalized inverse mean curvature flow”,
Comm. Anal. Geom. 16:3 (2008), 495–537.
[214] M. E. Taylor, Partial differential equations, III: Nonlinear equations, Applied Math.
Sciences 117, Springer, New York, 1997.
[215] P. Topping, “Relating diameter and mean curvature for submanifolds of Euclidean space”,
Comment. Math. Helv. 83:3 (2008), 539–546.
[216] J. Urbas, “On the expansion of starshaped hypersurfaces by symmetric functions of their
principal curvatures”, Math. Z. 205 (1990), 355–372.
[217] J. W. Vick, Homology theory: an introduction to algebraic topology, Pure and Applied
Math. 53, Academic Press, New York, 1973.
[218] R. M. Wald, General relativity, University of Chicago Press, 1984.
[219] F. W. Warner, Foundations of differentiable manifolds and Lie groups, Graduate Texts in
Mathematics 94, Springer-Verlag, New York-Berlin, 1983. Corrected reprint of the 1971 edition.
[220] R. H. Wasserman, Tensors and manifolds, with applications to mechanics and relativity,
Oxford Univ. Press, New York, 1992.
[221] S. Weinberg, Gravitation and cosmology, Wiley, New York, 1972.
[222] H. C. Wente, “Counterexample to a conjecture of H. Hopf”, Pacific J. Math. 121:1 (1986),
193–243.
[223] T. Willmore, Total curvature in Riemannian geometry, Wiley, 1982.
[224] E. Witten, “A new proof of the positive energy theorem”, Comm. Math. Phys. 80:3 (1981),
381–402.
[225] R. Ye, “Foliation by constant mean curvature spheres on asymptotically flat manifolds”, pp.
369–383 in Geometric analysis and the calculus of variations, International Press, Cambridge,
MA, 1996.
[226] J. W. York, Jr., “Gravitational degrees of freedom and the initial-value problem”, Phys. Rev.
Lett. 26 (1971), 1656–1658.
[227] J. W. York, Jr., “Role of conformal three-geometry in the dynamics of gravitation”, Phys.
Rev. Lett. 28:16 (1972), 1082–1085.
[228] J. W. York, Jr., “Conformally invariant orthogonal decomposition of symmetric tensors on
Riemannian manifolds and the initial-value problem of general relativity”, J. Mathematical Phys.
14 (1973), 456–464.
[229] J. W. York, Jr., “Boundary terms in the action principles of general relativity”, Found. Phys.
16:3 (1986), 249–257.
[230] W. Yuan, “Brown-York mass and compactly supported conformal deformations of scalar
curvature”, J. Geom. Anal. 27:1 (2017), 797–816.
[231] X. Zhang, “Angular momentum and positive mass theorem”, Comm. Math. Phys. 206:1
(1999), 137–155.
[232] X. Zhou, “Mass angular momentum inequality for axisymmetric vacuum data with small
trace”, Comm. Anal. Geom. 22:3 (2014), 519–571.
Index

acausal hypersurface, 150
achronal set, 112
adjoint operator, 175
ADM
  energy-momentum, 252, 321, 348
  equations, 151, 153
    symplectic form, 160
  Hamiltonian, 157
  mass, 252
    Lam’s formula, 360
Alexandrov’s theorem, 326
Alexandrov–Fenchel inequality, 364
  generalized, 374
angular momentum, 257, 322
anti-de Sitter spacetime, 82, 142
area radius, 334
Ascoli–Arzelà theorem, 172
asymptotic to Schwarzschild, 245, 332
asymptotically flat/Euclidean
  coordinates, 245
  end, 246
  initial data, 246, 321
asymptotically simple spacetime, 231

Bôcher’s theorem, 242
Barbosa–do Carmo theorem, 330
Bianchi identity, 61
Birkhoff’s theorem, 86
Bochner–Lichnerowicz identity, 341
bootstrapping, 181, 182
BORT center of mass, 322
Boyer–Lindquist coordinates, 99
Bunting–Masood-ul-Alam reflection, 379

capacity, 378
Cartan structure equations, 41
Cauchy
  development, 117
  horizon, 119
  hypersurface, 114, 150
causal
  curve, 114
  future, 110
  past, 109
  vector, 17
causality relations, 109
center of mass, 238
  BORT, 256, 322, 323
  geometric, 338
  mean curvature, 340
Christoffel symbols, xxii
chronology condition, 110
CKV, see conformal Killing field
clock hypothesis, 23
closed manifold, xix
CMC
  case of conformal method, 194
  hypersurface, stable, 329
Codazzi equation, 139
cokernel, 185
compact map, 171
conformal
  data, 198
  flow, 377
  Killing operator/field, 195
  Laplacian, 103, 190
  method, 194
    CMC case, 194, 201
  vector field, 324
connection
  affine, xx
  Levi-Civita, xx
  normal, 223
constant mean curvature, see CMC
constraint
  Einstein–Maxwell, 143
  Hamiltonian, 140
  momentum, 140
constraints operator, 144
coordinates
  Boyer–Lindquist, 99
  Fermi, 69, 228
  harmonic, 145
  inertial, 7
  wave, 145
correspondence principle, 59
cosmological constant, 62, 74
Coulomb gauge, 136
covariance
  general, 15
  special, 2, 49
Cramer’s rule, 70
curvature
  Gauss, 162
  mean, 137
  principal, 162
  reduced Ricci, 145
  Ricci, xxi
  Riemann, xxi
  scalar, xxii

de Sitter spacetime, 33, 82
density theorem for scalar curvature, 282, 284, 349
difference quotient lemma, 180
domain of dependence, 117
dominant energy condition, 65, 140, 144, 321
Doppler shift, 19

edge point, 113
eigenvalue decomposition, 185
Einstein
  constraint equations, 140, 141, 320
    maximal, 144
    time-symmetric, 144
  equation, 63
    initial data set, 143
    reduced, 146
    vacuum, 74
  manifold, 73
  static universe, 34, 83
  summation convention, xix
  tensor, 62, 74
Einstein–Hilbert action, 70
elliptic
  estimate, 177
    weighted, 270, 272
  operator, 173, 174
    self-adjoint, 184
  over/underdetermined, 175
  regularity, 180
ellipticity
  strict, 173
  strong, 174
  uniform, 173
embedding, Sobolev, 181
energy condition
  dominant, 65, 140, 144
  null, 65
  strong, 65
  weak, 65
energy-momentum
  ADM, 252
  vector, 25
equation
  ADM, 151, 153
  Codazzi, 139
  Einstein, 63
  Einstein constraint, 140, 141
  Gauss, 138, 139
  Hamilton’s, 155, 159
  Jacobi, 66
  Klein–Gordon, 75
  Lichnerowicz, 201
  Mainardi, 152
  Poisson, 235
  Raychaudhuri, 126
  reduced Einstein, 146
  Riccati, 125, 227
  Weingarten, 162
equicontinuity, 172
equivalence principle, 50
estimate
  elliptic, 177
  Schauder, 177
exceptional weights, 266

Faraday tensor, 13
Fermi coordinates, 69, 228
first variation of area, 221
Fischer–Marsden theorem, 207
FLRW spacetime, 83, 142
focal point, 124
frame (of reference), 2
  dragging, 99
  inertial, 6
Fredholm
  alternative, 183
  operator, index, 185
Freire–Schwartz theorem, 374
fundamental solution, 235
future-directed, 108

Gagliardo-Nirenberg-Sobolev inequality, 345
Galilean transformation, 2
Gauss
  curvature, 162
  equation, 138, 139
  theorema egregium, 163
Gauss–Bonnet theorem, 80
general
  covariance, 15, 57, 59
  relativity, 58
generalized Minkowski integral formula, 325
geon, 96
Geroch monotonicity formula, 368
globally hyperbolic, 111
gravitational
  mass/potential, 47
  redshift, 51
Green’s function, 240

Hölder space, 168
  weighted, 260
Hamilton’s equations, 155, 159
Hamiltonian constraint, 140
harmonic
  asymptotics, 352
  at/near infinity, 243
  coordinates, 145
harmonically flat at infinity, 233, 376
Harnack inequality, 187, 241
Hawking mass, 366
Heintze–Karcher inequality, 325
Hessian, xxii, 39
Hodge decomposition, 183
Hopf’s lemma, 305
Huisken–Yau Theorem, 332
hypersurface
  acausal, 150
  Cauchy, 114, 150

incomplete causal geodesic, 129
index, Fredholm, 185
inequality
  Harnack, 187, 241
  mass-capacity, 378, 379
  stability, 212
  weighted Hölder, 259
inertial
  coordinates, 7
  frame, 2, 6
    momentarily comoving, 22
  mass, 47
  observer, 2, 7
inextendible causal curve, 114
initial data
  conformal, 198
initial data set, 133, 143, 320
  asymptotic to Schwarzschild, 332
  asymptotically flat, 321
  asymptotically flat (Euclidean), 246
  harmonic asymptotics, 352
  time-symmetric, 321
invariant hyperbola, 12
inverse function theorem, 344
inverse mean curvature flow, 364
  level set formulation, 368
  weak solution, 368
inversion in the sphere, 243
isoperimetric inequality, 102
isotropic, 7

Jacobi
  equation, 66
  field, 124
  operator, 212, 225

Kazdan–Warner theorem, 205
Kelvin transform, 243
Kerr metric, 99
Killing
  field, 195
  initial data (KID), 161
Klein–Gordon equation, 75
Kruskal extension, 89

Laplace operator, xxii, 39
lapse function, 150
lemma
  difference quotient, 180
  Rellich, 172, 191
  Weyl, 179, 239
Levi-Civita connection, xx
Lichnerowicz
  equation, 201
  theorem, 341
lightcone, 8
Liouville theorem, 240
local rest frame, 22
Lorentz
  contraction, 6, 20
  force, 3, 32
  group, 16
  transformation, 9
    proper, 16
Lorenz gauge, 134

Mach’s principle, 59
Mainardi equation, 152
manifold, convention on, xix
mass, 238
  ADM, 252
  decay conditions, 264
  gravitational, 47
  Hawking, 366
  inertial, 47
  rest, 25
mass-capacity inequality, 378, 379
maximal hypersurface, 225
maximum principle, 186, 187
  strong, 239
  weak, 239
Maxwell’s equations, 14
mean
  curvature, 137, 324
  value property, 239
metric
  Kerr, 99
  Lorentzian, xx
  Minkowski, 16
  Riemannian, xx
  Schwarzschild, 88, 96, 97, 101–103
  semi-Riemannian, xx
  static vacuum, 98
minimal surface, 212
  stable, 212, 327
minimizing hull, 371
Minkowski metric/spacetime, 16
  conformal compactification, 34
momentarily comoving inertial frame, 22
momentum
  constraint, 140
  tensor, 144

Newtonian potential, 235, 268
normal
  connection, 223
  coordinates, xxiii
  Ricci curvature, 223
null vector, 16
nullcone, 8

observer, 2, 7
operator
  conformal Killing, 195
  constraints, 144
  elliptic, 173, 174
  formal adjoint, 175
  Fredholm, 185
  Jacobi, 212, 225
  Laplace, xxii, 39
  overdetermined-elliptic, 175
  quasilinear, 182
  self-adjoint elliptic, 184
  semilinear, 182
  shape, 161, 162, 227
  tidal force, 123
  underdetermined-elliptic, 175
  wave, xxii, 39, 75
outermost minimal surface, 357
overdetermined-elliptic, 175

parallel transport, 38
past-directed, 108
Penrose
  inequality, see RPI
  singularity theorem, 130
PMT, see positive mass theorem, 360
Poincaré group, 16
Poisson
  equation, 235
  kernel, 239
positive
  energy theorem, 285
  mass theorem, 285, 358
    for graphs, 360
    Riemannian, 285
precompact smooth domain, 291
principal
  curvatures, 162
  symbol, 174
principle of relativity, 2
proper Lorentz group/transformation, 16

quasilinear, 182

Raychaudhuri equation, 126
reflection trick, 379
Regge–Teitelboim
  conditions, 257, 322
  Hamiltonian, 250
Regge–Wheeler coordinate, 90
relativity
  general, 58
  of simultaneity, 20
  special, 2
Rellich lemma, 172, 191
Rellich–Kondrachov theorem, 172
  weighted, 263
rest mass, 25
reversed triangle inequality, 18
Riccati equation, 125, 227
Ricci
  curvature, xxi
    reduced, 145
  formula, 41, 195
Riemann curvature, xxi
  symmetry-by-pairs, 60
Riemannian Penrose inequality, see RPI
RP^n-geon, 96
RPI, 357, 359
  Bray formulation, 375
  for graphs, 360
  Huisken–Ilmanen, 364

scalar curvature, xxii
  conformal deformation, 102
  linearization, 72, 176, 206
Schauder
  estimate, 177
  fixed point theorem, 337
Schoen–Yau theorem, 212
Schwarzschild
  manifold, 322
  metric, 88, 96, 97, 101–103
  spacetime, 85, 97, 142
    isotropic coordinates, 93
    Kruskal extension, 89
scri, 35
second
  fundamental form, 137, 162
  variation of area, 212, 224, 327
sectional curvature, xxi
self-adjoint, 184
semilinear, 182
shape operator, 161, 162, 227
shift vector field, 150
smooth, xix
Sobolev
  embedding, 181
    weighted, 260
  space, 168
    weighted, 258, 348
spacelike
  infinity, 35
  vector, 16
spacetime, 109
  anti-de Sitter, 82, 142
  asymptotically simple, 231
  de Sitter, 33, 82
  FLRW, 83, 142
  globally hyperbolic, 111
  homogeneous, 7
  incomplete, 129
  isotropic, 7
  Kerr, 99
  Minkowski, 16
  Schwarzschild, 85, 97, 142
  static, 86
  stationary, 99, 161
special
  covariance, 2, 49
  relativity, 2
spherical harmonic expansion, 244
stability
  inequality, 212
  operator, 327
static
  potential equation, 325
  spacetime, 86
  vacuum equations, 96
  vacuum metric, 98
stationary spacetime, 99, 161
stress-energy tensor, 27, 76
  divergence, 28, 77
  dust, 29
  electromagnetic, 30
  energy conditions, 65
  perfect fluid, 29
strictly minimizing hull, 371
strong
  causality, 111
  maximum principle, 187
sub/supersolution, 187
submanifold
  closed, 324
  embedded, 324
  immersed, 324

tensor
  Einstein, 62, 74
  Faraday, 13
  momentum, 144
  Ricci, xxi
  Riemann curvature, xxi
  stress-energy, 27, 76
  torsion, 38
theorem
  Alexandrov’s, 326
  Ascoli–Arzelà, 172
  Bôcher, 242
  Barbosa–do Carmo, 330
  Birkhoff, 86
  Fischer–Marsden, 207
  Freire–Schwartz, 374
  Gauss, 163
  Gauss–Bonnet, 80
  inverse function, 344
  Kazdan–Warner, 205
  Lichnerowicz, 341
  Liouville, 240
  Penrose singularity, 130
  positive mass/energy, 285, 358
  Rellich–Kondrachov, 172
  removable singularity, 240
  Schauder fixed point, 337
  Schoen–Yau, 212
theorema egregium, 163
tidal force operator, 123
time dilation, 6, 20
time-orientation, 107
timecone, 107
timelike
  convergence condition, 65
  future, 109
  past/future, 109
  vector, 16
torsion tensor, 38
total scalar curvature, 70
transformation
  Galilean, 2
  Lorentz, 9
transverse-traceless (TT), 194, 195, 197
trapped surface, 130
triangle inequality, reversed, 18
twin paradox, 18
two-sided hypersurface, 324

underdetermined-elliptic, 175

variation of area
  first, 221, 327
  second, 212, 224, 327
  volume-preserving, 328
variation of volume, 328
vector
  causal, 17
  energy-momentum, 25
  null, 16
  spacelike, 16
  timelike, 16
velocity addition
  Galilean, 3
  relativistic, 12
volume-preserving mean curvature flow, 332

wave
  coordinates, 145
  operator, xxii, 39, 75
weak
  maximum principle, 186
  solution, 175
weighted space
  Hölder, 260
  Sobolev, 258, 348
Weingarten equations, 162
Weyl’s lemma, 179, 239

Yamabe class, 193
