(Lecture Notes in Physics) Fulvio Ricci, Massimo Bassan - Experimental Gravitation-Springer (2022)
Fulvio Ricci
Massimo Bassan
Experimental
Gravitation
Lecture Notes in Physics
Founding Editors
Wolf Beiglböck
Jürgen Ehlers
Klaus Hepp
Hans-Arwed Weidenmüller
Volume 998
Series Editors
Roberta Citro, Salerno, Italy
Peter Hänggi, Augsburg, Germany
Morten Hjorth-Jensen, Oslo, Norway
Maciej Lewenstein, Barcelona, Spain
Angel Rubio, Hamburg, Germany
Wolfgang Schleich, Ulm, Germany
Stefan Theisen, Potsdam, Germany
James D. Wells, Ann Arbor, MI, USA
Gary P. Zank, Huntsville, AL, USA
The series Lecture Notes in Physics (LNP), founded in 1969, reports new
developments in physics research and teaching - quickly and informally, but with a
high quality and the explicit aim to summarize and communicate current knowledge
in an accessible way. Books published in this series are conceived as bridging
material between advanced graduate textbooks and the forefront of research and to
serve three purposes:
• to be a compact and modern up-to-date source of reference on a well-defined
topic;
• to serve as an accessible introduction to the field to postgraduate students and
non-specialist researchers from related areas;
• to be a source of advanced teaching material for specialized seminars, courses
and schools.
Both monographs and multi-author volumes will be considered for publication.
Edited volumes should however consist of a very limited number of contributions
only. Proceedings will not be considered for LNP.
Volumes published in LNP are disseminated both in print and in electronic
formats, the electronic archive being available at springerlink.com. The series
content is indexed, abstracted and referenced by many abstracting and information
services, bibliographic networks, subscription agencies, library networks, and
consortia.
Proposals should be sent to a member of the Editorial Board, or directly to the
responsible editor at Springer:
Dr Lisa Scalone
Springer Nature
Physics
Tiergartenstrasse 17
69121 Heidelberg, Germany
Experimental Gravitation
Fulvio Ricci
Dipartimento di Fisica
Sapienza Università di Roma
Rome, Italy

Massimo Bassan
Dipartimento di Fisica
Università di Roma, Tor Vergata
Rome, Italy
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To our patient wives Simonetta and Stefania
Preface
typically in the audio band, and ν for carrier frequencies, most often optical. In all
cases, it is always understood that ω = 2π · frequency.
This book derives from the lecture notes of the course of “Experimental Grav-
itation” that we both taught in our respective universities. We are grateful to all
our students, who provided motivation for these notes and stimulated the study in
further depth of some topics. We take the opportunity here to express our gratitude
to friends and colleagues who offered help, advice and proof-reading: at the risk
of forgetting someone, we mention Pia Astone, Massimo Bianchi, Marta Burgay,
Valerio Ferroni, David Maria Lucchesi, Mehr Un Nisa, Paola Puppo, Giuseppe
Pucacco, Alberto Sesana, Fabio Spizzichino, Angelo Tartaglia, Jean Yves Vinet
and Massimo Visco. A special mention for Stefano Braccini, who taught this
course at the University of Pisa: sadly, he could not join us in this venture.
Gravitation is the oldest and, at the same time, the most lively and vibrant field
of experimental physics: the awarding of three Nobel prizes in the last decade
(2011, 2017, 2020) is the proof, if ever needed. Our wish is to provide, with
this textbook, young and prospective experimental physicists with some of the
necessary tools to continue in this search.
Table 1 A very limited selection of constants, conversion factors and astronomical quantities.
Several constants, including c, kB, ħ, e, etc., have been defined as exact since 2018: this has implied
redefining metre, second and kilogram in terms of these new “yardsticks”. Note that G, the most
important constant for this book, is the least well determined, with a relative uncertainty
u_r = 1.5 · 10⁻⁵

Name                            Symbol    Value                   Units        Notes

Fundamental constants
Boltzmann constant              kB        1.380649 · 10⁻²³        J/K          Exact (SI 2018)
Planck’s constant               ħ         1.054571817 · 10⁻³⁴     J s          Exact
Speed of light (...and of GW)   c         2.99792458 · 10⁸        m/s          Exact
Gravitational constant          G         6.67430(15) · 10⁻¹¹     m³/(s² kg)   σG/G = 1.5 · 10⁻⁵
Elementary e.m. charge          e         ±1.602176634 · 10⁻¹⁹    C            Exact

Derived constants
Supercond. flux quantum         φ0        2.067833... · 10⁻¹⁵     Wb           Exact
Fine structure constant         α         1/137.036...                         σα/α = 1.5 · 10⁻¹⁰

Non-SI units and conversion factors
Modified Julian Date            MJD                                            Days since midnight of November 17, 1858
(Julian) year                   yr        3.15576 · 10⁷           s            365.25 days
Sidereal day                    sid.day   86164.09                s            0.99727 solar day
Astronomical unit               AU        1.495978707 · 10¹¹      m            Exact (IAU 2012)
Parsec                          pc        ∼3.08567758 · 10¹⁶      m
                                          648000/π                AU           Exact (IAU 2015)
Light-year                      ly        ∼9.460730 · 10¹⁵        m
                                          ∼0.3066                 pc
                                          ∼63241                  AU
Mega electron volt              MeV       1.60217 · 10⁻¹³         J            10⁶ e
Kilogram                        kg        5.61 · 10²⁹             MeV          c²/(10⁶ e)

A few data about the solar system
Distance from galactic centre   dgal      2.35 · 10²⁰             m
Sun speed around the galaxy     vgal      2.2 · 10⁵               m/s
Mass of the Sun                 M⊙        1.99 · 10³⁰             kg
Mean radius of the Sun          R⊙        1.3909 · 10⁹            m
Sun quadrupole moment           J2,⊙      2.21 · 10⁻⁷
Mass of the Earth               M⊕        5.9736 · 10²⁴           kg
Mean radius of the Earth        R⊕        6.378137 · 10⁶          m
Earth quadrupole moment         J2,⊕      1.0826 · 10⁻³
Mass of the Moon                M☾        7.349 · 10²²            kg
Contents
1 Classical Gravity
1.1 Kepler and Newton
1.2 Newton’s Gravitational Constant G
1.3 Motion on a Keplerian Orbit
1.4 The Gravitational Field and Potential
1.5 Measurement Units in Experimental Gravitation
1.6 Multipole Expansion of Φ(r)
1.7 Mapping the Earth Gravitational Potential
1.8 Gravitation and Tides
1.8.1 Modelling Tides
1.8.2 Tides Classification
1.9 Active and Passive Masses
1.9.1 The Experiment of Kreuzer
1.9.2 Lunar Laser Ranging
1.9.3 The Bartlett-Van Buren Moon Experiment
References
2 The Torsion Pendulum
2.1 The Torsion Balance
2.2 The Torsion of the Wire
2.3 Measurement Strategies
2.4 Read-Out for the Torsion Balance
2.4.1 Pendulums with Feedback
2.5 A Detailed Model for the Torsion Balance
References
3 The Equivalence Principle
3.1 Statement of the Equivalence Principle
3.2 Tests of Einstein’s Equivalence Principle
3.3 Verification of EEP at the Atomic Level
3.4 Experiments in Microgravity Conditions
References
Acronyms
OB Optical Bench
PDH Pound-Drever-Hall
PK Post-Keplerian
PNS Proto-Neutron Star
PPN Parametrized Post-Newtonian
PR Pound and Rebka
PR Power Recycling
PRCL Power Recycling Cavity Length
PSD Power Spectral Density
PTA Pulsar Timing Array
Pulsar PULSAting Radio source
QND Quantum Non-demolition
QSO Quasi-stellar Object
RAAN Right Ascension of the Ascending Node
RLG Ring Laser Gyroscopes
RM Recycling Mirror
SASI Standing Accretion Shock Instability
SEP Strong Equivalence Principle
SGWB Stochastic Gravitational Wave Background
SI Système International d’unités
SLR Satellite Laser Ranging
SMBH Super Massive Black Hole
SMBHB Super Massive Black Hole Binary
SN Supernova
SNR Signal-to-Noise Ratio
SQUID Superconducting Quantum Interference Device
SR Signal Recycling
SR Special Relativity
SSB Solar System Barycentre
TDI Time Delay Interferometry
TM Test Mass
TOA Time Of Arrival
TT Transverse and Traceless
UFF Universality of Free Fall
VLBI Very Long Baseline Interferometry
WD White Dwarf
WEP Weak Equivalence Principle
WFSM Weak Field and Slow Motion
WK Wiener-Kolmogorov
Acronyms of Experiments and Satellites
The cosmological vision of the ancient world was dominated by the philosophy of
Ptolemy (100–178 AD), describing the Earth as the centre of the universe, with
several celestial spheres spinning around it with different angular speeds. Although a
heliocentric vision had been proposed as early as the third century BC by Aristarchus
of Samos, the Ptolemaic universal order, so well depicted by Dante in the Divine
Comedy, was only reversed by Nicolaus Copernicus in 1532, when he wrote in his
famous text De Revolutionibus Orbium Coelestium: The Sun is set at the centre
of all things. Which other position could this source of light have in the cosmos,
a wonderful temple, if not the centre from which it can illuminate every single thing
at the same time? Therefore, the Sun is not improperly named by some “the lamp of
the universe”, “its corresponding mind” by others, and even “the metronome” by
some others.
It is interesting to notice the time line of the scientific discoveries and how they
have been progressively integrated to constitute the framework of classical gravita-
tional theory. The key scientists of such a revolution are depicted in Fig. 1.1.
Johannes Kepler assumed the Copernican point of view in his analysis of the plan-
etary motion. This analysis was based on data, incredibly accurate for the times,
collected by Tycho Brahe over more than 20 years of observation. In Astronomia
Nova, published in 1609, Kepler announced his first two laws, describing the motion
of a generic planet:
1. All the planets move along elliptical orbits with the Sun occupying one of the two
foci.
Fig. 1.1 Scientists who contributed to defining the classical theory of Gravitation
2. During equal time intervals the line segment connecting the Sun to a planet sweeps
equal areas.
The third law was published 10 years later (1619) in Harmonices Mundi and, unlike
the previous two, states a scaling law for all planets orbiting the same central mass.
3. The ratio of the square of the orbital period to the cube of major semi-axis is a
constant for all the planets.
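As a quick numerical illustration of the third law, the following sketch evaluates the ratio T²/a³ for a few planets. The orbital data are rounded textbook values assumed here for illustration (they do not appear in the text); with periods in years and semi-axes in astronomical units, the ratio should come out close to 1 for every planet.

```python
# Semi-major axis a (AU) and orbital period T (years).
# Rounded textbook values, assumed for illustration only.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def kepler_ratio(a_au, period_yr):
    """Return T^2 / a^3, which the third law predicts to be the same for all planets."""
    return period_yr**2 / a_au**3


ratios = {name: kepler_ratio(a, t) for name, (a, t) in planets.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} T^2/a^3 = {ratio:.4f} yr^2/AU^3")
```

The small residual scatter reflects the rounding of the input data rather than a failure of the law.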
These laws describe the kinematic properties of the planetary motion: they are exact,
as long as we deal with a simple two-body (Sun and planet, or Earth and Moon)
problem. Deviations from the ideal behaviour occur when other perturbations are
considered, as we shall see in the following chapters. It is customary to define an
orbit via the Keplerian Elements: six parameters that describe the shape and size of
the ellipse, its orientation in space and the planet position, as shown in Fig. 1.2:
– The orbit inclination i with respect to a reference plane (the solar ecliptic for
planets, the equatorial plane for Earth satellites).
– The longitude of ascending node Ω defines the position of the “node”, i.e. the
intersection of the orbital plane and the reference plane. It is an angle measured
from a reference direction (the Vernal point or the Greenwich meridian). When
using celestial coordinates, the right ascension takes the place of the longitude.
– The argument of pericentre (or periastron) ω specifies the orientation in such
plane of the major semi-axis, measured from the line of nodes.
– The major semi-axis a gives the size of the orbit.
– The eccentricity e tells us, instead, about its shape.
– Finally, the position of the planet on the orbit at a given time or epoch t0 is given as
the angle, seen from the central body, between the periastron and the planet. This angle is
the true anomaly ν(t0).
[Fig. 1.2: the Keplerian elements Ω, i, ω and ν(t) of an orbit; labels in the figure: ecliptic plane, Sun, apoastron]
For a Keplerian two-body system, the first five elements are all constant. We shall see in Chap. 5 how
general relativity introduces perturbations resulting in secular shifts for some of these
elements. Binary systems in strong gravitational fields (Sect. 12.3) will also require
five additional post-Keplerian parameters.
Kepler’s laws were essential for Newton’s formulation of the Law of Universal
Gravitation. Nevertheless, in order to derive them, it is necessary to find a relation
between the description of the motion and the cause generating it. To fill this gap,
the formulation of Galileo’s inertia principle was crucial: a body upon which no
force is acting either remains at rest or moves with a constant velocity, along with the
formulation of the second law of dynamics by Newton:
$$\vec{F} = m\,\vec{a} \tag{1.1}$$
The curved trajectories of the planets imply the existence of a force, whose expression
must be compatible with the kinematic properties stated by Kepler. These logical steps are
today straightforward, if we just apply the elementary concepts of classical
mechanics. With the simplifying assumption of neglecting the
small eccentricity of the planet orbits (i.e. assuming circular orbits), Kepler’s third
law states that, for all the planets, the square of the orbital period T is proportional
to the third power of the orbital radius r:
$$T^2 = K r^3$$
Therefore, recalling that the centripetal acceleration is $a_c = \frac{4\pi^2}{T^2}\, r$ and applying (1.1),
the corresponding attraction force that bends the planets’ trajectories turns out to be
proportional to the inverse square of the orbital radius:
$$F_c = \frac{4\pi^2}{K}\,\frac{m_P}{r^2}$$
The constant K must depend on the mass of the attracting body m_S: it is a consequence
of the third principle of dynamics, which implies the equality of the force that the
star S exerts upon the planet P with the one that P exerts upon S. The force must therefore
be proportional to both masses, i.e. 4π²/K = G m_S. Hence, we derive
Newton’s law of gravitation in its familiar form:
$$\vec{F} = -G\,\frac{m_S\, m_P}{r^3}\,\vec{r} \tag{1.2}$$
1 There is nothing anomalous about the anomaly. It comes in three varieties: true, eccentric or mean.
The value of the gravitational constant G was first measured by Newton from
astronomical observations of planetary motion; these measurements have been repeated
in recent times, benefitting from improved accuracy in the value of the orbital radii.
However, we invariably end up measuring G M⊙, as there is no way to disentangle
this product from orbit observations, and the mass of the Sun is not well known.
Therefore, these measurements provide results of limited accuracy, with relative
uncertainty no better than 10⁻².
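The reason why orbital data only give access to the product G M⊙ can be made explicit. Comparing the centripetal force derived from Kepler’s third law with Eq. (1.2) gives K = 4π²/(G m_S), so that an orbit of radius r and period T determines G m_S = 4π² r³/T², and nothing else. The following is a minimal sketch of this estimate, assuming a circular Earth orbit of radius 1 AU and a period of one Julian year (values from Table 1); the function name is ours.

```python
import math

AU = 1.495978707e11      # m, astronomical unit (Table 1)
YEAR = 365.25 * 86400.0  # s, Julian year (Table 1)
G = 6.67430e-11          # m^3 kg^-1 s^-2 (Table 1)


def gm_from_orbit(radius, period):
    """Product G*M of the central body from Kepler's third law, T^2 = K r^3."""
    return 4.0 * math.pi**2 * radius**3 / period**2


# Orbit observations fix the product G*M of the Sun...
gm_sun = gm_from_orbit(AU, YEAR)

# ...while the solar mass follows only once G is known from a laboratory measurement.
m_sun = gm_sun / G

print(f"G*M_sun = {gm_sun:.4e} m^3/s^2")
print(f"M_sun   = {m_sun:.3e} kg")
```

Any relative error on G propagates directly into the mass inferred for the Sun, which is why the two quantities cannot be separated by astronomical observations alone.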
From the data available at the time, Newton managed to approximately calculate
the Universal Gravitation constant, finding
$$G \simeq 6 \cdot 10^{-11}\ \mathrm{\frac{N\,m^2}{kg^2}}.$$
We need to wait till the first experiment of modern experimental gravitation, based
upon the torsion balance, to obtain a measurement of G of acceptable accuracy. In
1798, as the conclusion of an elegant series of laboratory measurements, Cavendish
(1798) “weighed the Earth”. Only in 1811 did S. Poisson introduce the gravitational
constant (that he called f), which, using Cavendish data, was evaluated at
$$G = 6.74 \cdot 10^{-11}\ \mathrm{\frac{N\,m^2}{kg^2}}.$$
The value currently recommended by CODATA (see Table 1) carries a relative uncertainty
of 1.5 · 10⁻⁵, to be compared with that of other fundamental constants, in the 10⁻⁹ range (Fig. 1.3).
Fig. 1.3 16 different measurements of the gravitational constant G, with their 1σ error bars,
performed between 1982 and 2018. These results were considered by the CODATA committee, in a
weighted average, for the 2018 recommended value. Credits: reprinted figure with permission from
Tiesinga (2021). Copyright 2021 by the American Physical Society
The motion of celestial bodies along a Keplerian, elliptic orbit is a classical exercise
of analytical mechanics and can be found in most first-year physics textbooks. We
summarize here the results that will be needed in other chapters. The relevant points
and distances are shown in Fig. 1.4.
Although the description is quite general, we shall refer, to simplify definitions, to
a planet orbiting the Sun. The only assumptions are to deal with a two-body problem,
and M⊙ ≫ M_planet, so that we can take the system centre of mass to coincide with
the Sun centre. The orbit is described by the simple equation in polar (r, φ) form
$$r(\phi) = \frac{a(1 - e^2)}{1 + e\cos\phi} \tag{1.4}$$
where the attracting body (i.e. the Sun) is at the main focus F (r = 0) and φ is
measured from the semi-major axis directed towards the perihelion A.
Fig. 1.4 Relevant points to define the anomalies along an elliptical orbit: O is the ellipse centre,
a, b are the semi-axes. F is the focus, where the attracting body (the Sun) is, A is the periastron
(perihelion), P the position of the orbiting planet. The red circumference is tangent to the ellipse
at the extremes of the major axis. The X- and Y-axes of the Solar Barycentre Coordinate System
are shown in red. The true anomaly is the angle ν = ∠AFP, the eccentric anomaly is the angle
ψ = ∠AOP′
Table 1.1 The six Keplerian parameters that define an orbit and the position of the planet (or
satellite) along it
Symbol   Definition for planets and stars (for Earth satellites)
i        Inclination over the ecliptic (over the equatorial plane)
Ω        Right ascension (Longitude) of ascending node
ω        Argument of periastron (of perigee)
a        Major semi-axis
e        Eccentricity
ν(t0)    True anomaly at epoch t0
2 The Committee on Data of the International Science Council (CODATA) releases every 4 years an
update on the value of several hundred physical constants. The 2018 release (Tiesinga 2021) has
revolutionized the SI unit system, defining several fundamental constants (c, ħ, kB, φ0, e, eV...) as
exact. This means redefining metre, second and kilogram in terms of these new “yardsticks”. The
values of constants are also available at https://physics.nist.gov/cuu/Constants/index.html.
The above equation is of little help when we need to describe the position of an
orbiting body versus time, i.e. to follow its motion along the orbit. We thus need
a different way of expressing the trajectory. First, we must revisit the definition of
anomaly: as mentioned above, it comes in three varieties:
– Mean Anomaly: it is simply 2πt/T, the angle described by a body that hypothetically
travels on a circular orbit with the same period T as the orbit considered.
It corresponds to a constant, mean angular velocity 2π/T.
– True Anomaly: it is the angle, seen from the focus F, between the perihelion
and the planet position, the angle ν = ∠AFP in Fig. 1.4. The true anomaly is the
sixth Keplerian parameter listed in Table 1.1.
– Eccentric Anomaly: With reference to Fig. 1.4, consider the point P′, intersection
of the line HP, perpendicular to the semi-major axis and passing through the body
position P, with the circumference tangent to the orbit (i.e. with radius equal to
the semi-major axis length a): the eccentric anomaly is the angle ψ = ∠AOP′,
between the semi-major axis and the line connecting P′ with the ellipse centre O.
The relevance of the eccentric anomaly stems from Kepler’s equation, which offers an
explicit time dependence:
$$\psi - e\sin\psi = \frac{2\pi t}{T} \tag{1.5}$$
There is no closed-form solution of this equation for ψ, but nowadays ψ(t) is easily
obtained numerically.
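One common way to obtain ψ(t) numerically is Newton-Raphson iteration on Kepler’s equation. The sketch below is only an illustration under stated assumptions: the function names, the starting guess and the tolerance are our choices, and the mean anomaly is assumed to lie between 0 and 2π. The second function uses the standard in-plane relations x = a(cos ψ − e), y = a√(1 − e²) sin ψ, with the origin at the focus F and the X-axis towards the perihelion; these follow from the geometry of Fig. 1.4 but are not written out in the text.

```python
import math


def eccentric_anomaly(mean_anomaly, e, tol=1e-12, max_iter=50):
    """Solve psi - e*sin(psi) = mean_anomaly for psi by Newton-Raphson (0 <= e < 1)."""
    # Starting guess: the mean anomaly itself, or pi for very eccentric orbits.
    psi = mean_anomaly if e < 0.8 else math.pi
    for _ in range(max_iter):
        residual = psi - e * math.sin(psi) - mean_anomaly
        step = residual / (1.0 - e * math.cos(psi))
        psi -= step
        if abs(step) < tol:
            return psi
    raise RuntimeError("Kepler's equation did not converge")


def orbital_position(psi, a, e):
    """In-plane coordinates of the body, origin at the focus F, X-axis towards perihelion."""
    x = a * (math.cos(psi) - e)
    y = a * math.sqrt(1.0 - e * e) * math.sin(psi)
    return x, y


# Example: a quarter of a period after perihelion on an orbit with e = 0.3, a = 1.
psi = eccentric_anomaly(math.pi / 2.0, 0.3)
x, y = orbital_position(psi, 1.0, 0.3)
print(psi, x, y)
```

The distance √(x² + y²) and the angle atan2(y, x) obtained in this way satisfy the polar equation (1.4), which provides a convenient consistency check.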
Then, we introduce Cartesian coordinates, as shown in Fig. 1.2, with X-Y lying
on the ecliptic plane and the origin at the Sun position F. If we were to place the
origin at the centre of mass of the gravitating system (a negligible difference under
our assumptions) we would have the Solar Barycentric Coordinate System.
All this considered, and recalling the definition of orbit inclination i, we can write
the time-dependent equations of motion for the orbiting body:
where the time dependence is in ψ(t), through Eq. 1.5. It looks awkward, but it is the
best we can do to describe the motion of a planet or satellite.
Classical gravitational theory is linear (i.e. the force is proportional to the source
“charge”), thus the superposition principle applies to it. We therefore deduce that the
acceleration field on a point mass identified by the position vector r, generated by n
point masses m k , with k = 1, . . . , n, placed at positions rk can be expressed as
$$\vec{a}(\vec{r}) = -G \sum_{k=1}^{n} m_k\, \frac{\vec{r} - \vec{r}_k}{|\vec{r} - \vec{r}_k|^3} \tag{1.7}$$
Just as for electrostatics, we define a gravitational field g as the force acting on the
unit test mass.
The reader will notice that this field coincides with the gravitational acceleration
felt by the test mass: this identity is the cornerstone of the principle of equivalence
(see Chap. 3), the founding concept of general relativity.
This acceleration field is conservative and a potential Φ can be associated to it,
such that $\vec{a} = -\vec{\nabla}\Phi$, where
$$\Phi(\vec{r}) = -G \sum_{k=1}^{n} \frac{m_k}{|\vec{r} - \vec{r}_k|} \tag{1.8}$$
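The superposition sums of Eqs. (1.7) and (1.8) translate directly into code. The sketch below is a minimal illustration: the function names are ours, the finite-difference step is an arbitrary choice, and the Earth-Moon distance used in the example (3.844 · 10⁸ m) is an assumed round value that does not appear in the text. The last function checks numerically that the acceleration is minus the gradient of the potential.

```python
import math

G = 6.67430e-11  # m^3 kg^-1 s^-2 (Table 1)


def acceleration(r, sources):
    """Eq. (1.7): acceleration at r due to point masses given as [(m_k, (x, y, z)), ...]."""
    ax = ay = az = 0.0
    for m_k, (xk, yk, zk) in sources:
        dx, dy, dz = r[0] - xk, r[1] - yk, r[2] - zk
        d3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        ax -= G * m_k * dx / d3
        ay -= G * m_k * dy / d3
        az -= G * m_k * dz / d3
    return (ax, ay, az)


def potential(r, sources):
    """Eq. (1.8): potential at r due to the same set of point masses."""
    return -G * sum(m_k / math.dist(r, r_k) for m_k, r_k in sources)


def minus_gradient(r, sources, h=1.0):
    """Central-difference estimate of -grad(potential), to be compared with Eq. (1.7)."""
    result = []
    for axis in range(3):
        plus = list(r)
        minus = list(r)
        plus[axis] += h
        minus[axis] -= h
        result.append(-(potential(plus, sources) - potential(minus, sources)) / (2.0 * h))
    return tuple(result)


# Earth at the origin and Moon on the x-axis (masses from Table 1, distance assumed).
earth_moon = [(5.9736e24, (0.0, 0.0, 0.0)), (7.349e22, (3.844e8, 0.0, 0.0))]
point = (1.0e7, 2.0e7, -5.0e6)
print(acceleration(point, earth_moon))
print(minus_gradient(point, earth_moon))
```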
The Newtonian force field of a point-like mass has the same central behaviour and the
same 1/r² dependence as the electrostatic field due to a point charge. It follows that
we can prove the analogue of Gauss’ theorem so that, when we deal with a mass
distribution described by the volumetric density function ρ(r), we have
$$\vec{\nabla} \cdot \vec{a} = -4\pi G \rho \tag{1.9}$$
Hence, the general solution for the gravitational field is found by solving the Poisson
equation:
$$\nabla^2 \Phi = 4\pi G \rho \tag{1.10}$$
However, the attractive character of the gravitational force (there is no negative
mass) entails significant differences in the observed effects, for instance, the lack of
gravitational screening.
In the case of a spatially continuous mass, distributed over a volume swept by the
variable r′, the general solution of Eq. (1.10) takes the form
$$\Phi(\vec{r}) = -G \int \frac{\rho(\vec{r}\,')\, d^3r'}{|\vec{r} - \vec{r}\,'|} \tag{1.11}$$
The symbol g denotes a mean value of the acceleration produced by gravity on the
Earth surface, at sea level:
$$g = \frac{G M_\oplus}{R_\oplus^2}$$
Actually, the value of the gravitational acceleration changes from place to place,
depending on the latitude, altitude and local geological structure: it is indeed the
modulus of the vector field introduced in Eq. (1.7), i.e. $g = |\vec{a}(\vec{r})|$. In aerospace
engineering, the symbol g is also used to indicate the measurement unit of
acceleration, implicitly assuming a reference value of
1 g = 9.80665 m s⁻².
This practice, widespread among engineers, causes confusion since the same
symbol g denotes the gram, a measurement unit of mass. In locations with a given
latitude p, the conventional value of the gravitational acceleration at sea level can be
derived from the internationally accepted relation (International Gravity Formula):
We can add a final correction: −3.086 · 10⁻³ h, where h is the elevation in metres,
that can easily be derived by series expansion of Newton’s law in the vicinity of
r = R⊕.
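The series expansion mentioned above can be sketched as follows. For a spherical Earth, g(R⊕ + h) = g (R⊕/(R⊕ + h))² ≈ g (1 − 2h/R⊕), so that the first-order change of g with elevation is −2g/R⊕ per metre. The short example below evaluates this gradient with the values of g and R⊕ quoted in the text; the function names are ours, and rotation and flattening of the Earth are neglected, which is why the result differs slightly from the conventional coefficient.

```python
G_STD = 9.80665        # m s^-2, conventional value of g quoted in the text
R_EARTH = 6.378137e6   # m, Earth radius from Table 1


def free_air_gradient(g=G_STD, radius=R_EARTH):
    """First-order change of g with elevation, dg/dh = -2 g / R (spherical Earth)."""
    return -2.0 * g / radius


def g_at_height(h, g=G_STD, radius=R_EARTH):
    """Newtonian scaling g(h) = g * (R / (R + h))**2 for a spherical Earth."""
    return g * (radius / (radius + h)) ** 2


gradient = free_air_gradient()
print(f"dg/dh = {gradient:.4e} s^-2")

# The linear correction reproduces the exact scaling closely at 1 km elevation.
exact_change = g_at_height(1000.0) - G_STD
linear_change = gradient * 1000.0
print(exact_change, linear_change)
```

The magnitude, about 3 · 10⁻⁶ m s⁻² per metre, is the vertical gravity gradient of roughly 3000 E discussed below.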
The unit Gal, written with capital G to avoid confusion with the unit of volume
gal so appreciated in the Anglo-Saxon world (the gallon), was introduced in honour
of Galileo Galilei. It has been adopted for the measurements of local variations in the
Earth’s gravitational acceleration (gravity anomalies). A Gal is equal to 1 cm s⁻² =
1.0197 · 10⁻³ g, that is, 1 mGal ∼ 10⁻⁵ m s⁻² ∼ 10⁻⁶ g (Table 1.2).
Table 1.2 Acceleration and gradient units, expressed both in SI units and in terms of g =
9.80665 m s⁻². In some cases, the approximation g ≃ 10 m s⁻² is made
Unit      In terms of g       SI units        Other
1 Gal     1.02 · 10⁻³ g       10⁻² m s⁻²
1 mGal    10⁻⁶ g              10⁻⁵ m s⁻²
1 E       10⁻¹⁰ g m⁻¹         10⁻⁹ s⁻²        10⁻⁷ Gal m⁻¹
1 mE                          10⁻¹² s⁻²
To get acquainted with the new unit of measurement let us note that the mean
acceleration produced by gravity is equal to ḡ = 9.81 · 10⁵ mGal and that it varies
from 9.781 · 10⁵ mGal at the Equator up to 9.832 · 10⁵ mGal at the poles.
Variations due to the land’s orography are of the order of 10–100 mGal.
The unit of measurement Eötvös is used in Geophysics to measure the gradient of
gravitational acceleration along an axis. An Eötvös is defined as 1 E = 10⁻⁷ Gal/m:
in the International System of Units (SI) it is 10⁻⁹ s⁻². The gravity gradient along the
vertical is the largest component, being ∼3000 E at ground level (this means that
g decreases by ∼3 · 10⁻⁶ m s⁻² per metre of elevation). The horizontal components
change roughly half as much and the off-diagonal gradient terms ∂²Φ/∂x_j∂x_k are
of the order of 100 E. Anomalies in the gravity gradient can exhibit values of up to
10³ E in mountain regions.
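The conversions between these units and SI are simple scale factors, which a few lines of code can make explicit. This is only a sketch: the constant and function names are ours, and the numerical factors are the ones given in the text and in Table 1.2.

```python
GAL = 1.0e-2       # m s^-2, 1 Gal = 1 cm s^-2
MGAL = 1.0e-5      # m s^-2, 1 mGal
EOTVOS = 1.0e-9    # s^-2, 1 E = 1e-7 Gal/m
G_STD = 9.80665    # m s^-2, conventional value of g


def to_mgal(acceleration_si):
    """Convert an acceleration from m s^-2 to mGal."""
    return acceleration_si / MGAL


def to_eotvos(gradient_si):
    """Convert an acceleration gradient from s^-2 to Eotvos."""
    return gradient_si / EOTVOS


print(f"g             = {to_mgal(G_STD):.0f} mGal")
print(f"1 Gal         = {GAL / G_STD:.4e} g")
print(f"3e-6 s^-2     = {to_eotvos(3.0e-6):.0f} E")
```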
When we need to compute the potential in points very far from the source, we
choose the origin of coordinates within or close to the source (see Fig. 1.5) and we
have |r| ≫ |r′|.
We can then expand the denominator of Eq. (1.11) in Taylor series about the origin
(x′1, x′2, x′3) = 0
$$\frac{1}{\sqrt{\sum_i (x_i - x'_i)^2}} = \frac{1}{r} + \sum_{i=1,3} \frac{x_i x'_i}{r^3} + \frac{1}{2} \sum_{i,j} \left(3 x'_i x'_j - r'^2 \delta_{ij}\right) \frac{x_i x_j}{r^5} + \cdots \tag{1.12}$$
where $r = |\vec{r}| = \sqrt{x_1^2 + x_2^2 + x_3^2}$ and the same for r′.
Using this expansion, the potential of Eq. (1.11) can be expressed as
$$\Phi(\vec{r}) = -\frac{GM}{r} - \frac{G}{r^3} \sum_k x_k D_k - \frac{G}{2} \sum_{k,l} Q_{kl}\, \frac{x_k x_l}{r^5} + \cdots \tag{1.13}$$
This expansion is useful, as it allows us to factorize contributions due to the source (M,
D_k, Q_kl ...) and terms due to the position of the test particle (x_k, x_l ...) while in
Eq. 1.11 they are mixed in the 1/|r − r′| term.
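The convergence of the multipole expansion can be checked numerically on a simple source. The sketch below compares the exact potential of two point masses with the sum of the monopole, dipole and quadrupole terms. It assumes the sign convention that follows from Φ = −G ∫ ρ/|r − r′|, namely Φ ≈ −GM/r − G Σ x_k D_k/r³ − (G/2) Σ Q_kl x_k x_l/r⁵, with D_k = Σ m x′_k and Q_kl = Σ m (3x′_k x′_l − r′² δ_kl) for discrete masses. The masses, positions and function names are arbitrary illustrative choices.

```python
import math

G = 6.67430e-11  # m^3 kg^-1 s^-2 (Table 1)


def exact_potential(r, sources):
    """Direct sum over point masses, as in Eq. (1.8)."""
    return -G * sum(m / math.dist(r, rk) for m, rk in sources)


def moments(sources):
    """Total mass M, dipole D_k and quadrupole Q_kl of a set of point masses."""
    M = sum(m for m, _ in sources)
    D = [sum(m * rk[k] for m, rk in sources) for k in range(3)]
    Q = [[sum(m * (3.0 * rk[k] * rk[l] - (k == l) * sum(c * c for c in rk))
              for m, rk in sources)
          for l in range(3)] for k in range(3)]
    return M, D, Q


def multipole_terms(r, sources):
    """Monopole, dipole and quadrupole contributions to the potential at r."""
    M, D, Q = moments(sources)
    rn = math.sqrt(sum(c * c for c in r))
    mono = -G * M / rn
    dip = -G * sum(r[k] * D[k] for k in range(3)) / rn**3
    quad = -0.5 * G * sum(Q[k][l] * r[k] * r[l]
                          for k in range(3) for l in range(3)) / rn**5
    return mono, dip, quad


# Two unequal masses near the origin, observed from about 70 length units away.
sources = [(1.0, (0.0, 0.0, 1.0)), (3.0, (0.5, 0.0, -0.2))]
point = (30.0, 40.0, 50.0)

exact = exact_potential(point, sources)
mono, dip, quad = multipole_terms(point, sources)

# Relative error after keeping one, two and three terms of the expansion.
errors = [abs((approx - exact) / exact)
          for approx in (mono, mono + dip, mono + dip + quad)]
print(errors)
```

Each additional term reduces the residual by roughly a factor r′/r, as expected for an expansion in powers of the ratio between source size and distance.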
where
$$M = \int \rho(\vec{r}\,')\, d^3x' \tag{1.14}$$
is the total mass of the source;
$$D_k = \int \rho(\vec{r}\,')\, x'_k\, d^3x' \tag{1.15}$$
is the mass dipole: it can always be set to vanish by choosing the centre of mass as
the origin of the coordinates;
$$Q_{ij} = \int \rho(\vec{r}\,') \left(3 x'_i x'_j - r'^2 \delta_{ij}\right) d^3x' \tag{1.16}$$
is the quadrupole term, which vanishes in the case of a spherically symmetric mass
distribution. For a rotation ellipsoid, the shape of most astronomical bodies, the
quadrupole moment can be expressed in terms of the equatorial and polar radius,
R_e and R_p, respectively:
$$Q_{i \neq j} = 0; \qquad Q_{xx} = Q_{yy} = -\frac{1}{2} Q_{zz}; \qquad Q_{zz} = \frac{2M}{5}\left(R_e^2 - R_p^2\right) = I_p - I_e \tag{1.17}$$
where I_p and I_e are the moments of inertia with respect to the rotation axis ẑ and
to an equatorial diameter, respectively. Moreover, if the deviation from sphericity is
small, ΔR = R_e − R_p ≪ R, with R the mean radius, we can further simplify the
expression:
$$Q_{zz} \simeq \frac{4M}{5}\, R\, \Delta R$$
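The quality of the small-flattening approximation is easily checked with numbers. The sketch below takes the expression for Q_zz in Eq. (1.17) as printed and compares it with the simplified form (4M/5) R ΔR, using the Earth mass and equatorial radius of Table 1 and the difference ΔR = 21.4 km between equatorial and polar radii quoted later in this section. The function names are ours.

```python
M_EARTH = 5.9736e24   # kg (Table 1)
R_EQ = 6.378137e6     # m, equatorial radius (Table 1)
DELTA_R = 21.4e3      # m, R_e - R_p as quoted in the text


def q_zz_exact(mass, r_e, r_p):
    """Q_zz = (2M/5) (R_e^2 - R_p^2), as written in Eq. (1.17)."""
    return 0.4 * mass * (r_e**2 - r_p**2)


def q_zz_small(mass, r_mean, delta_r):
    """Small-flattening form Q_zz ~ (4M/5) R dR."""
    return 0.8 * mass * r_mean * delta_r


r_pol = R_EQ - DELTA_R
exact = q_zz_exact(M_EARTH, R_EQ, r_pol)

# With R taken as the arithmetic mean of the two radii the two forms coincide,
# since R_e^2 - R_p^2 = (R_e - R_p)(R_e + R_p).
with_mean_radius = q_zz_small(M_EARTH, 0.5 * (R_EQ + r_pol), DELTA_R)

# Using the equatorial radius for R instead costs a relative error of order dR / (2R).
with_equatorial_radius = q_zz_small(M_EARTH, R_EQ, DELTA_R)

print(exact, with_mean_radius, with_equatorial_radius)
print("dR/R =", DELTA_R / R_EQ)
```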
The normal modes with the same l resonate in principle at the same frequency,
but the degeneracy of modes with different azimuthal m index is resolved if the body
is rotating. Therefore, one can deduce, from the frequency splitting of these modes,
the angular speed of the inner layers of the Sun, address the issue of the deformation
of the star and finally compute the quadrupole moment (Duval 1984).
The currently accepted value, based on more recent helioseismic observations
(Mecheri 2004) from satellites, is J2,⊙ = 2.21 · 10⁻⁷ (Table 1).
Knowledge of J2,⊙ is crucial for all the experiments measuring precession to verify
a gravitational theory. Indeed, all these experiments rely on the subtraction of this
classical contribution. We shall return to discuss the Sun quadrupole moment, and its
implications for gravitational theories, in Sect. 6.6.2.
The Earth too has a non-zero quadrupole moment, J2,⊕ = 1.0826 · 10⁻³ (Table 1),
caused by the difference between its polar and equatorial radii, namely, ΔR = 21.4
km. This in turn corresponds to a flattening of the Earth’s rotational ellipsoid of
ΔR/R = 3.35 · 10⁻³ (IERS 2020).
For a complex and structure-rich solid such as the Earth, the quadrupole is just the
most basic correction to the spherical approximation. As indicated by the dots in
Eq. 1.13, the expansion can be carried out to an arbitrary degree. However, carrying
out the Taylor series expansion beyond the quadrupole term becomes increasingly
cumbersome: a more manageable, and widely used, approach consists in expanding
these corrections in terms of spherical harmonics:
$$\Phi(\vec{r}) = \frac{GM}{r} \left[ 1 + \sum_{n=2}^{N_{max}} \left(\frac{R}{r}\right)^n \sum_{m=0}^{n} \bar{P}_{nm}(\sin\phi) \left[ \bar{C}_{nm} \cos m\lambda + \bar{S}_{nm} \sin m\lambda \right] \right] \tag{1.19}$$
where we made explicit the spherical harmonics Y_lm as the product of normalized
associated Legendre functions P̄_nm in the latitude φ and sinusoidal functions of the
longitude λ. C̄_nm and S̄_nm are the normalized, dimensionless, spherical harmonic
coefficients or Stokes’ coefficients. For the Earth, R ≡ R⊕ = 6378136.3 m is the mean
equatorial radius of the reference ellipsoid, defined in Sect. 14.4. This expansion can be
used for any celestial body: recent and future space missions measure the first multipoles
for other planets: BepiColombo for Mercury, Cassini for Saturn, Juno for Jupiter...
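To see how Eq. (1.19) is used in practice, the sketch below evaluates it truncated at its first term, n = 2, m = 0. It relies on two assumptions that are standard in geodesy but are not stated in the text: the fully normalized Legendre function is P̄₂₀(x) = √5 (3x² − 1)/2, and the normalized coefficient is related to the quadrupole moment by C̄₂₀ = −J₂/√5. The overall sign of the potential is kept as printed in Eq. (1.19), and J₂ and M⊕ are taken from Table 1; the function names are ours.

```python
import math

G = 6.67430e-11        # m^3 kg^-1 s^-2 (Table 1)
M_EARTH = 5.9736e24    # kg (Table 1)
R_EARTH = 6378136.3    # m, mean equatorial radius of the reference ellipsoid
J2 = 1.0826e-3         # Earth quadrupole moment (Table 1)

# Assumed geodesy convention: fully normalized zonal coefficient of degree 2.
C20 = -J2 / math.sqrt(5.0)


def p20_normalized(x):
    """Fully normalized Legendre function of degree 2, order 0 (assumed convention)."""
    return math.sqrt(5.0) * 0.5 * (3.0 * x * x - 1.0)


def potential_j2(r, latitude):
    """Eq. (1.19) truncated at n = 2, m = 0, with the overall sign as printed."""
    correction = (R_EARTH / r) ** 2 * p20_normalized(math.sin(latitude)) * C20
    return G * M_EARTH / r * (1.0 + correction)


monopole = G * M_EARTH / R_EARTH
at_equator = potential_j2(R_EARTH, 0.0)
at_pole = potential_j2(R_EARTH, math.pi / 2.0)

# Relative quadrupole correction on the reference sphere: +J2/2 at the Equator, -J2 at the poles.
print(at_equator / monopole - 1.0, at_pole / monopole - 1.0)
```

Since the correction scales as (R/r)², it fades quickly with distance, which is why low-orbit satellites are the most sensitive probes of the higher multipoles.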
In the last few decades, a large effort has been devoted, mainly by the geophysics and
geodesy communities, to measuring the multipoles of the Earth gravitational field to
higher and higher order, to generate a detailed model of the Newtonian field of our
planet. This is achieved by measuring the magnitude of the perturbations induced on
the trajectories of artificial satellites orbiting the Earth: these geodetic surveys started
as early as 1958 (just 1 year after the very first satellite, Sputnik) with Sputnik 2 that
measured the Earth oblateness. They continue, with a variety of technologies and
with increasing accuracy, to the present day. The method is, in principle, simple: track the
satellite position, interpolate positions to calculate its orbit and evaluate deviations
from a pure Keplerian orbit. Then, solving an inverse problem, derive perturbations
to the spherical symmetry of the source mass, i.e. the multipoles, that cause such
deviations. The main techniques of satellite tracking are as follows:
– Satellite Laser Ranging (SLR): The satellite is tracked by a network of laser
ranging stations on the Earth: each station shines light pulses on it, and measures the
travel time of the reflected beam, deriving the distance between each station and the
satellite. Orbits are determined with an accuracy better than a cm.
This is a powerful technique that, by looking at the position of satellites in the
sky, derives a wealth of information about the Earth: non-sphericity (multipoles),
the European Terrestrial Reference System (ETRS89), drift of continental plates,
motion of the tracking stations, motion of the North pole, changes of rotation period
and of the Length-Of-Day, definition of Universal Time and more. In addition, the
gravitational perturbations of satellite orbits are measured: effects of the Moon, the Sun
and other planets, oceanic tides and solid tides and, last but not least, general
relativistic perturbations. Indeed, the most studied geodetic satellites are the two LAGEOS3 :
they are completely passive, almost spherical (60 cm in diameter) objects, with the
surface covered with retro-reflectors. They are still in operation and are expected to
orbit the Earth for the next 8 million years. Analysis of their orbits has led to the
first measurement of the Lense-Thirring relativistic precession, as we shall see in
Chap. 5.
– Global Navigation Satellite System (GNSS): It is based on a network of synchronized
radio-emitting satellites and can locate, within a few metres, any object equipped
with a receiver. The first and most famous of these systems is the US network GPS
(Global Positioning System). Today we also have the Russian GLONASS, the European
Galileo and the Chinese BeiDou networks. The operation of GNSS networks,
and their connection with special and general relativity, are the subject of Chap. 14.
Fig. 1.6 Map of the Earth’s gravity anomalies, from GRACE data. Credits: NASA
More complex instruments can measure the full gravity gradient tensor $\partial^2\Phi/\partial x_j\partial x_k$: this is
actually the tidal tensor (see Sect. 1.8), symmetric and, being solution of the Laplace
equation, traceless. Thus, it has five independent components and a minimum of
five accelerometer pairs is needed to measure it. Gravity gradients can be integrated
and, together with altitude data gathered by other instruments, return a detailed
gravitational map of the land and sea that the satellite flies over. The foremost example
of these surveyors was the satellite GOCE.4
– Satellite to Satellite Tracking: Two satellites in Low Earth Orbit (LEO) track
their relative position by exchanging a radio link. Variations in their distance are due
to irregularities of the gravity field. The twin-satellite mission GRACE5 was so
successful that a new mission, GRACE Follow-On, was launched in 2018, with identical
instrumentation plus a laser link between the two satellites, realizing the first
spacecraft-to-spacecraft interferometer. We shall return to this in Chap. 11, dedicated
to the LISA space mission.
Fig. 1.7 Ice loss in Greenland, as measured from GRACE data: in the period 2002–2016, 280 gigatons
of ice per year were lost, causing global sea level to rise by 0.8 mm per year. Credits: NASA
This wealth of information is used, by a large geodesy and geophysics community,
to evaluate and keep track of geodynamic phenomena (Fig. 1.6), such as crustal motion,
Earth rotation and polar motion, to measure gravity field parameters, tidal deformations
of the Earth, coordinates and velocities of SLR stations, and other substantial geodetic
data. Impressive data were recorded, just to give an example, on the steady reduction of
Arctic ice (Fig. 1.7). Earth Gravitational Models (EGM) are periodically updated, with
ever-increasing accuracy and detail: EGM2008, largely based on data from GRACE, GOCE
and the German CHAMP,6 provides the Stokes’ coefficients of Eq. 1.19 up to Nmax = 2160,
with a spatial resolution of 5' × 5' (minutes of arc), requiring
$\sum_{n=0}^{N_{max}} (2n+1)$ = over 4.6 million coefficients.
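The coefficient count quoted for EGM2008 can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the half-wavelength rule 180°/Nmax used for the angular resolution is a common rule of thumb assumed here, not a statement taken from the text.

```python
def stokes_coefficient_count(n_max):
    """Sum of (2n + 1) for n = 0..n_max: n + 1 cosine and n sine terms per degree."""
    return sum(2 * n + 1 for n in range(n_max + 1))


def half_wavelength_arcmin(n_max):
    """Shortest resolved half-wavelength, 180 degrees / n_max, in minutes of arc."""
    return 180.0 * 60.0 / n_max


N_MAX = 2160
print(stokes_coefficient_count(N_MAX))   # same as (N_MAX + 1) ** 2
print(half_wavelength_arcmin(N_MAX))
```

The closed form (Nmax + 1)² makes it clear why doubling the resolution of a gravity model quadruples the number of coefficients to be estimated.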
Fig. 1.8 Tide forecast for Venice (Italy). The tide heights are expressed in cm and refer to the
Zero Mareografico (tide-gauge zero) of Punta Salute (Z.M.P.S. 1897). The average sea level has been
steadily rising since 1897 and is currently about 31 cm higher than this reference. The predictions
apply for normal weather conditions. Credits: Comune di Venezia
the Roche limit.7 The most relevant manifestation of this phenomenon is the rise and
fall of sea level caused by the combined effects of the gravitational forces exerted
by the Moon and the Sun, and the rotation of the Earth. Tides vary on timescales
ranging from hours to years due to a number of factors, which determine the lunitidal
interval. To make accurate records, tide gauges at fixed stations measure water level
over time. These gauges are designed to be insensitive to variations caused by
waves, with periods shorter than minutes (Fig. 1.8). The classical theory of gravitation
opens the way to a detailed understanding of tidal prediction. It explains, just to
give the most basic example, why tides appear twice a day. However, the dynamics
of tides is more complex than what gravitation itself can predict: the actual tide at
a given location is determined by an accumulation effect of those forces acting on
the body of water over many days, by the depth and shape of the ocean basins (their
bathymetry), and by the coastline shape.
7 The Roche limit, named after Édouard Roche (1820–83), is the minimum distance to which a
large satellite can approach its primary body without tidal forces overcoming the internal gravity
holding the satellite together. If the satellite and the primary body are of similar composition, the
theoretical limit is about 2.5 times the radius of the larger body. Inside the Roche limit, orbiting
material disperses and forms rings, as we can see around Saturn. Its rings may be the debris of a
demolished moon. Artificial satellites are too small to develop substantial tidal stresses.
To introduce the mathematical study of the tides, we consider a spherical test mass E
(the Earth), of mass M E and radius R E moving under the effect of the gravitational
force due to a source S (the Moon or the Sun) with mass M S .8 We call this motion
free fall, as it is solely determined by gravity and by the initial conditions. We assume
the source S to have spherical symmetry; Sect. 1.4 tells us how to correct this hypothesis,
if needed. In the case of an infinitesimally small elastic sphere, the effect of a tidal
force is to distort the shape of the body without any change in volume. The sphere
becomes an ellipsoid with two bulges, pointing towards and away from the other
body. Larger objects distort into an ovoid, and are slightly compressed, which is
what happens to the Earth’s oceans under the action of the Moon.
To evaluate this phenomenon, we position the origin of a reference frame at the
centre of the mass E, so that the source S is at distance $R_{se}$ along the ẑ-axis (Fig. 1.9).
We focus our attention on two points positioned at distance ±z ($z \le R_E \ll R_{se}$)
from the centre of E (and distant $R_{se} \pm z$ from S). The acceleration in these
points can be computed:
8 In general, and elsewhere in this book, the Earth is labelled with the traditional astronomical
symbol ⊕, the Sun with ☉ and the Moon with ☾. Here, for the sake of clarity and generality (our source
S can be either the Sun or the Moon), we use E and S, respectively.
1.8 Gravitation and Tides 19
$$a_z(0,0,\pm z) = -\frac{G M_S}{(R_{se} \pm z)^2} \simeq -\frac{G M_S}{R_{se}^2} \pm 2\,\frac{G M_S\, z}{R_{se}^3} \tag{1.20}$$
The residual term $\delta a_z = \pm 2\,(G M_S z/R_{se}^3)$ is the tidal acceleration along the axis
joining the centres of the two masses: while the whole body “falls” in the field of
the source mass $M_S$ with a common acceleration $a_0 = -G M_S/R_{se}^2$, the particles on
its surface feel an additional acceleration $\delta a_z$ due to the non-uniformity of the field.
A similar calculation carried out for two particles positioned on the equatorial plane
normal to ẑ, say at ±x from the centre, shows that they feel a relative acceleration
that is half as large, and with opposite sign, with respect to that along ẑ:
$$\delta a_x(\pm x,0,0) = -\frac{G M_S}{R_{se}^2 + x^2}\,\frac{\pm x}{R_{se}} \simeq \mp\, G\,\frac{M_S\, x}{R_{se}^3}$$
So, the tidal force pushes away the particles along the axis pointing to the external
mass $M_S$, and pulls closer together those on the plane normal to it. We shall encounter
this behaviour again in Chap. 7, when dealing with gravitational waves.
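To get a feeling for the size of the effect, the following sketch compares the exact difference of accelerations with the linearized tidal term 2GM_S z/R_se³ at one Earth radius from the centre, for the Moon and for the Sun. The masses, distances and the Earth radius are round values assumed here; they are not quoted in the text.

```python
G = 6.674e-11          # m^3 kg^-1 s^-2
R_EARTH = 6.371e6      # m, assumed mean radius

# (mass in kg, distance in m): assumed round values
SOURCES = {"Moon": (7.35e22, 3.844e8), "Sun": (1.989e30, 1.496e11)}


def tidal_exact(mass, distance, z):
    """Acceleration at distance (distance + z) from the source minus that at the centre."""
    a_point = -G * mass / (distance + z) ** 2
    a_centre = -G * mass / distance ** 2
    return a_point - a_centre


def tidal_linear(mass, distance, z):
    """First-order tidal acceleration 2 G M z / R^3."""
    return 2.0 * G * mass * z / distance ** 3


results = {
    name: (tidal_exact(m, d, R_EARTH), tidal_linear(m, d, R_EARTH))
    for name, (m, d) in SOURCES.items()
}
for name, (exact, linear) in results.items():
    print(f"{name}: exact = {exact:.3e} m/s^2, linear = {linear:.3e} m/s^2")
```

Under these assumptions the lunar term is of order 10⁻⁶ m/s², roughly twice the solar one, and the linear approximation is accurate at the per cent level because z/R_se is small.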
We can now generalize the analysis, computing the gravitational acceleration in
a generic point near the surface of E. We shall neglect the common acceleration a0 ,
but include, in the potential, the local gravity of body E and the tidal term due to S.
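The equation of motion of a particle in a gravitational potential $\Phi(\vec x)$ is
$$\frac{d^2 x^i}{dt^2} = -\hat x_i \cdot \nabla\Phi(\vec x) = -\frac{\partial\Phi(\vec x)}{\partial x^i} = -\sum_{k=1}^{3} \delta^{ik}\,\frac{\partial\Phi(\vec x)}{\partial x^k} \tag{1.21}$$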
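where $x^i$ is the component of the position vector $\vec x$ of the test particle in the direction
$\hat x_i$. To describe the tidal effect of differential gravity acceleration, we focus on two
test particles falling in the field of the source mass: we define the vector $\vec\xi = \vec x_2 - \vec x_1$
as the separation between the two particles. While the motion of the first particle is
described by Eq. 1.21, for the second we have
$$\frac{d^2 (x^i + \xi^i)}{dt^2} = -\sum_k \delta^{ik}\,\frac{\partial\Phi(\vec x + \vec\xi)}{\partial x^k} \tag{1.22}$$
Expanding the potential to first order in the separation $\vec\xi$, this becomes
$$\frac{d^2 (x^i + \xi^i)}{dt^2} = -\sum_k \delta^{ik}\left[\frac{\partial\Phi(\vec x)}{\partial x^k} + \frac{\partial}{\partial x^j}\left(\frac{\partial\Phi(\vec x)}{\partial x^k}\right)\xi^j + \cdots\right] \tag{1.23}$$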
Thus, the Newtonian geodesic deviation equation for the separation vector $\xi^i$ is
$$\frac{d^2 \xi^i}{dt^2} = -\sum_{k,j} \delta^{ik}\,\frac{\partial^2\Phi(\vec x)}{\partial x^k \partial x^j}\,\xi^j \tag{1.24}$$
This last equation shows how the distance between test particles changes due to a
non-uniform field. We introduce the Newtonian tidal tensor9 :
$$T^i{}_j = \sum_k \delta^{ik}\,\frac{\partial^2\Phi(\vec x)}{\partial x^k \partial x^j} \tag{1.25}$$
In the particular case of two particles falling in the gravitational field of an isolated,
spherically symmetric mass distribution, i.e. $\Phi = \Phi_E = -G M_E/r$, we have
$$T^i{}_j = \frac{G M_E}{r^3}\left(\delta^i{}_j - 3\, n^i n_j\right)$$
where $n^i = x^i/r$ are the components of the unit vector in the radial direction.
Note that
$$\sum_i T^i{}_i = \nabla^2\Phi = 4\pi G\rho \tag{1.26}$$
is the way to write the Poisson equation via the tidal tensor, and similarly the equation
for ξ, the geodesic deviation of the two particles in the gravitational field is
$$\frac{d^2 \xi^i}{dt^2} = -\sum_j T^i{}_j\,\xi^j \tag{1.27}$$
9 We shall see in Chap. 4 how general relativity extends the concept of geodesic deviation equation.
Indeed, the tensor defined in the right-hand side of Eq. 1.25 is related to the Riemann tensor of
relativity by $T^i{}_j = c^2 R^i{}_{0j0}$.
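The tidal tensor of a point mass and the geodesic deviation of Eq. 1.27 can be illustrated numerically. The sketch below is written in plain Python with illustrative function names; the lunar GM, the Earth-Moon distance and the Earth radius are assumed round values. It builds T for a field point on the ẑ-axis, checks that it is traceless in vacuum, and evaluates the relative acceleration of two separations, one along the axis and one normal to it.

```python
import math


def tidal_tensor(gm, x):
    """T_ij = GM/r^3 (delta_ij - 3 n_i n_j) for the potential Phi = -GM/r."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    return [
        [gm / r ** 3 * ((1.0 if i == j else 0.0) - 3.0 * n[i] * n[j]) for j in range(3)]
        for i in range(3)
    ]


def relative_acceleration(t, xi):
    """Right-hand side of Eq. 1.27: d^2 xi^i / dt^2 = - sum_j T_ij xi^j."""
    return [-sum(t[i][j] * xi[j] for j in range(3)) for i in range(3)]


GM_MOON = 4.9028e12   # m^3/s^2, assumed value
DISTANCE = 3.844e8    # m, assumed Earth-Moon distance
R_EARTH = 6.371e6     # m, assumed mean radius

# Field point on the z-axis, at DISTANCE from the source placed at the origin
T = tidal_tensor(GM_MOON, [0.0, 0.0, DISTANCE])
trace = T[0][0] + T[1][1] + T[2][2]
a_along = relative_acceleration(T, [0.0, 0.0, R_EARTH])    # separation along the axis
a_across = relative_acceleration(T, [R_EARTH, 0.0, 0.0])   # separation normal to it

print("trace of T:", trace)
print("along the axis  :", a_along[2], "m/s^2 (stretching)")
print("normal to the axis:", a_across[0], "m/s^2 (compression)")
```

The stretching along the axis is twice as large as the compression in the normal plane, and of opposite sign, which is the same pattern found above for δa_z and δa_x.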
$$Q_{ij} = -\frac{2}{3}\,k_2\,\frac{R^5}{G}\,T_{ij}$$
where $Q_{ij}$ is the quadrupole moment induced in the body by the tidal field, R is the
body’s radius, the factor of 2/3 is conventional and the dimensionless constant $k_2$ is the
tidal Love number for a quadrupolar deformation. The Love numbers, introduced by
A.E.H. Love in 1909, measure the rigidity of a planetary body, i.e. the susceptibility of
its shape to change in response to a tidal potential. In other words, they measure the
ratio between the response of the real Earth and the theoretical response of a perfect
fluid sphere.10 For the elastic Earth, $k_2 \sim 0.3$.
It follows that the gravitational potential near the surface of body E, at a distance r
from its centre, is composed of three terms: the monopole, depending on the source
mass M E , the external tidal field due to S and the body’s own response to the tidal
interaction
$$\Phi_E(\vec r) = -\frac{G M_E}{r} + \frac{1}{2}\left[1 + 2 k_2 \left(\frac{R_E}{r}\right)^5\right] \sum_{i,j} T_{ij}\, x^i x^j \tag{1.28}$$
The quadrupole term is the next-to-leading order term of the Taylor expansion of the
potential Φ. If we include additional terms we have to consider tidal moments of
higher multipole orders, and higher powers of the coordinates $x^i$. Thus, the
generalization to higher orders of the gravitational potential outside the body leads to a more
complex formula in which the coefficients weighing the high order tidal moments
are
$$\frac{1}{l(l-1)}\left[1 + 2 k_l \left(\frac{R}{r}\right)^{2l+1}\right] \tag{1.29}$$
Here $k_l$ is the Love number for the l-th multipole configuration, a constant of proportionality
between the tidal field applied to the body and the resulting multipole moment of its
mass distribution.
Tidal Love numbers are introduced also in general relativity: in reference to
the gravito-electromagnetic formalism, described in Chap. 5, they are catalogued
as electric-type Love numbers $k_{el}$, having a direct analogy with the Newtonian Love
number introduced here, and magnetic ones $k_{mag}$, which are associated with a purely
relativistic effect.
Current procedure for analysing tides follows the method of harmonic analysis intro-
duced in the 1860s by W. Thomson. It is based on the principle that the combined
motion of Sun and Moon contains a large number of component frequencies, and at
each frequency there is a component of force acting to produce tidal motion. At each
place of interest on the Earth, the tides respond to each frequency with an amplitude
and phase peculiar to that locality. These components are, for the most part, the six
basic frequencies, and their harmonics, of the Moon-Earth-Sun dynamics.
The different harmonic content of the tide-generating potential is specified using
the Doodson numbers, a system proposed by Arthur Thomas Doodson in 1921.
It is based on six coefficients, indicating the harmonic of each of these six basic
periodic angles:
1 τ: the Greenwich Hour Angle of the mean11 Moon plus 12 h, with a period of
1.04 days. It can be expressed as τ = θM + π − s, where θM is the Sidereal Time
on the Earth, whose period is one sidereal day, or 0.9973 days, and s is defined in the next item.
2 s: the mean longitude of the Moon, with a period of one lunar month, or 27.32 days.
3 h: the mean longitude of the Sun. Its period is 1 year, 365.25 days.
4 pM: the mean longitude of the Moon’s perigee. It has a period of 3231.5 days.
5 N′ = −N: the negative of the longitude N of the Moon’s mean ascending node on the
ecliptic, with a period of 6798.4 days.
6 pS: the longitude of the Sun’s mean perigee: its periodicity is extremely long,
112000 years, and is not used in describing practical tides.
As an example, the well-known principal solar semidiurnal tide S2 , which takes place
exactly twice a day, has the Doodson number 273.555, meaning that it has frequency
components at twice the mean lunar time, twice the lunar month and −2 times the
year, and no component of the remaining three periodicities. An exhaustive table of
many tides constituents, with period, Doodson coefficients and other information,
can be found in https://en.wikipedia.org/wiki/Theory_of_tides.
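The decoding of a Doodson number into its six multipliers, and the resulting tidal frequency, can be sketched in a few lines of Python. The periods are those quoted in the list above; the rate of τ is obtained from τ = θM + π − s. The function names are illustrative, the M2 constituent (Doodson number 255.555) is added here only for comparison, and the simple parser assumes that every digit lies between 0 and 9.

```python
SIDEREAL_DAY = 0.9973            # days, period of theta_M
LUNAR_MONTH = 27.32              # days, period of s
YEAR = 365.25                    # days, period of h
LUNAR_PERIGEE = 3231.5           # days, period of p_M
LUNAR_NODE = 6798.4              # days, period of N'
SOLAR_PERIGEE = 112000 * YEAR    # days, period of p_S


def doodson_coefficients(number):
    """Decode e.g. '273.555' into the six integer multipliers.

    The first digit is used as it is; 5 is subtracted from each of the
    others, so that negative multipliers can be encoded.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) != 6:
        raise ValueError("a Doodson number must contain six digits")
    return [digits[0]] + [d - 5 for d in digits[1:]]


def basic_rates():
    """Rates of (tau, s, h, p_M, N', p_S) in cycles per day."""
    rate_s = 1.0 / LUNAR_MONTH
    rate_tau = 1.0 / SIDEREAL_DAY - rate_s   # from tau = theta_M + pi - s
    return [rate_tau, rate_s, 1.0 / YEAR, 1.0 / LUNAR_PERIGEE,
            1.0 / LUNAR_NODE, 1.0 / SOLAR_PERIGEE]


def frequency_cycles_per_day(number):
    """Frequency of the constituent identified by the given Doodson number."""
    return sum(k * f for k, f in zip(doodson_coefficients(number), basic_rates()))


for name, number in (("S2", "273.555"), ("M2", "255.555")):
    f = frequency_cycles_per_day(number)
    print(name, doodson_coefficients(number), f"period = {24.0 / f:.3f} h")
```

For S2 the lunar contributions cancel (2τ + 2s − 2h = 2θM − 2h up to a constant phase), leaving two cycles per solar day, as stated in the text.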
Tidal phenomena are not limited to the oceans, but can occur in other systems
whenever a gravitational field that varies in time and space is present. For example,
the shape of the solid part of the Earth is slightly affected by Earth tide, though this
is not as evident as the water tidal movements.
Tidal effects become particularly pronounced near small bodies of high mass, such
as neutron stars or black holes, where they are responsible for the spaghettification of
infalling matter. When we deal with a binary system of two neutron stars, their orbital
motion produces emission of gravitational waves, which remove energy and angular
momentum from the system. This causes the orbits to decrease in radius and increase
in frequency, and leads to the inspiralling motion of the compact bodies. When the
orbital separation decreases sufficiently and the tidal effects become important, the
neutron stars acquire a tidal deformation, and this affects their gravitational field
and orbital motion. The effect is revealed in the shape and phase of the gravitational
waves releasing information regarding the compactness of each body, as well as its
equation of state.
When introducing Newton’s law (1.2) we have overlooked some subtle concep-
tual aspects. First, we directly assumed that the physical property determining the
gravitational attraction of a body, the gravitational mass, cannot be different from
the physical property of the same body affected by this attraction. Let us examine
this concept more closely. Newton’s law implies that the mass is the source of the
gravitational force. Let us denote this source by active gravitational mass $m_a$. On the other
hand, we denote the property of the body affected by such a force by passive gravitational
mass $m_p$. These two quantities might not coincide. Consider the force exerted by the
Earth upon the Moon: we could assume that the gravitational interaction is proportional
to the active gravitational mass of the Earth and to the passive gravitational
mass of the Moon. Conversely, the force exerted by the Moon upon the Earth would
depend on $m_p^{(E)}$ of the Earth and $m_a^{(M)}$ of the Moon.
Moreover, in the last section, we have derived the dependence of F on the moving
planet’s mass using the second law of dynamics, whence the passive mass is identified
with the inertial one, m i ; this latter being the physical quantity defined by Eq. (1.1),
and characterizing the dynamical response of the body to the forces acting on it.
The identification of the gravitational mass with the inertial one is the foundation
of the equivalence principle which we will discuss in detail in Chap. 3. For the time
being, we shall continue our analysis by introducing in an operational fashion the
three quantities m i , m a and m p and examining the implications that result from such
distinction.
Consider a two-body system, the first of which plays the role of a reference unit
mass, say 1 kg, $m_1^{(1\,kg)}$. In the absence of external forces, the action-reaction
principle states that
$$m_i \overset{def}{=} m_1^{(1\,kg)}\,\frac{|\vec a_{1kg}|}{|\vec a_i|}, \tag{1.31}$$
that is, the ratio between the inertial masses of the two bodies can be directly obtained
from the ratio of their respective accelerations.
Having thus defined the inertial mass, we can proceed to define the passive grav-
itational mass, assuming there is a unitary test mass of active type which attracts
the mass m p due to gravitational interaction. Hence, by applying the second law of
dynamics
$$G\,\frac{m_a^{(1\,kg)}\, m_p}{r^2}\,\frac{\vec r}{r} = m_i\, \vec a \tag{1.32}$$
we end up with an expression for $m_p$ that depends on the product $r^2 a(r)$, which
turns out to be a conserved quantity, according to Kepler’s third law:
$$m_p \overset{def}{=} \lim_{r\to\infty} \frac{m_i\, |\vec a|\, r^2}{G\, m_a^{(1\,kg)}} \tag{1.33}$$
We proceed in the same way for defining the active gravitational mass, assuming
there is a unitary test mass of passive type subject to the gravitational influence of
the active mass m a . We thus derive that the operational definition of m a is
$$m_a \overset{def}{=} \lim_{r\to\infty} \frac{m_i\, |\vec a|\, r^2}{G\, m_p^{(1\,kg)}} \tag{1.34}$$
We can now apply these definitions to the case of two interacting masses, $m_1$ and
$m_2$. Applying the third law of dynamics
$$|\vec F_1| = |\vec F_2|$$
where
$$\vec F_1 = m_{1i}\,\vec a_1 = -G\, m_{1p}\, m_{2a}\,\frac{\vec r_1 - \vec r_2}{|\vec r_1 - \vec r_2|^3} \tag{1.35}$$
$$\vec F_2 = m_{2i}\,\vec a_2 = -G\, m_{2p}\, m_{1a}\,\frac{\vec r_2 - \vec r_1}{|\vec r_2 - \vec r_1|^3} \tag{1.36}$$
we find $m_{1p}\, m_{2a} = m_{2p}\, m_{1a}$ and conclude that the ratio between active and passive
masses of each body is a constant, that is, active and passive gravitational masses are
proportional quantities:
$$\frac{m_p}{m_a} = \beta \tag{1.37}$$
1.9 Active and Passive Masses 25
The identification of active mass with its passive counterpart was verified in an
elegant experiment by Kreuzer (1968). His idea was to generate a gravitational force by
a Teflon cylinder (76% fluorine) submerged in a mixture of trichloroethene and
dibromomethane (76% bromine). The two materials, one liquid and the
other solid, are chosen because they are chemically inert and their nuclear
composition is very different (Fig. 1.10).
The Teflon cylinder, which is immersed in the liquid, is connected to a motor
by a thin nylon wire and a pulley system. The motor forces the cylinder to slowly
oscillate back and forth (with a period of 400 s). If there is a small density difference
between the fluid and the Teflon block, the gravitational force of the liquid+solid
system exerted upon the mass of a torsion pendulum placed in front of the container
can be measured experimentally, see Fig. 1.11 (details of the torsion pendulum as
gravitational detector are discussed in the next chapter). The density of the liquid
is then changed, varying the temperature, up to the point where the densities of the
two materials are equal: if passive and active masses coincide, the gravitational force
must vanish. Note that the density measurements of the mixture and the Teflon block
depend on the passive mass. They are indeed obtained by measuring the effect of the
Earth’s gravitational field in samples of these materials.
As the difference in density of materials changes with temperature, comparing
the measurements of density difference versus T to those of the gravitational force
exerted on the torsion pendulum, we can obtain an upper limit on the fractional
difference between active and passive masses by analysing the way the two functions
cross zero. The two plots of Fig. 1.11 show the original data versus temperature:
Fig. 1.11 Data of the Kreuzer experiment: the right panel shows a plot of the density difference
between liquid and solid versus temperature, while the left panel depicts the corresponding depen-
dence of the signal from the torsion pendulum. Credits: reprinted figures with permission from
(Kreuzer 1968). Copyright 1968 by the American Physical Society
The zero crossing of the two functions occurs at a temperature marked by a thermistor
reading of 1130 Ω: analysing these data, Kreuzer claimed the ratio between active
and passive masses to be 1, with an accuracy level of $5 \cdot 10^{-5}$.
A different limit, accurate but model dependent, was derived by observing the
motion of the Moon. The Moon orbit has been monitored, for the last 50 years, by
means of the powerful LLR technique that we briefly describe here.
Fig. 1.12 Left: the Matera Laser Ranging Observatory of ASI; photo courtesy of F. Ambrico. Right:
one of the retro-reflectors placed on the Moon’s surface by the Apollo space program. Credits: NASA
Lunar Laser Ranging (LLR) is a technique that is conceptually very simple: evaluate
the distance of the Moon by measuring the time of flight of a light beam. Short pulses
of laser light, of order 10–100 ps, are sent from a network of tracking stations on the
Earth to the Moon, reflecting off arrays of corner cube retro-reflectors placed on the
Moon’s surface (Fig. 1.12). The round-trip time is accurately measured, from which
the Earth-Moon distance may be deduced. There are five retro-reflector arrays on
the Moon's surface that still serve as useful targets: three left by the Apollo 11, 14 and
15 astronauts, and two, French built, deployed by the Soviet Lunokhod unmanned
rovers (Fig. 1.13). The analysis of LLR data requires a sophisticated model of the
solar system ephemeris that also includes all the significant effects that contribute
to the range between the Earth stations and the lunar retro-reflectors. These models
compute a range prediction and the partial derivatives of range with respect to each
model parameter at the epoch of each measurement. The model predictions take into
account orbital parameters, attraction to the Sun and planets, relativistic corrections,
as well as tidal distortions, plate motion and other phenomena that affect the position
of both the retro-reflector and the ground station relative to the centres of mass
of the Earth and Moon. Some of these parameters are measured by other means, but
most are estimated from a huge fit to the LLR data. The range measurements are
corrected for atmospheric delay and a weighted least squares analysis is performed to
estimate the ∼170 parameters in the model, most of which are initial conditions and
masses of solar system bodies. LLR data are often combined with other spacecraft
and planetary tracking data to further constrain the estimates or remove degeneracy.
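The time-of-flight principle at the heart of LLR reduces to a one-line conversion, sketched below in Python. The mean Earth-Moon distance is an assumed round value, the function names are illustrative, and the atmospheric delay and all the modelling effects listed above are deliberately ignored.

```python
C = 299_792_458.0          # m/s, speed of light (exact by definition)
MEAN_DISTANCE = 3.844e8    # m, assumed mean Earth-Moon distance


def range_from_round_trip(t_round_trip):
    """One-way range in metres from a round-trip time in seconds (vacuum)."""
    return 0.5 * C * t_round_trip


def timing_for_range_precision(delta_range):
    """Round-trip timing precision (s) needed for a given range precision (m)."""
    return 2.0 * delta_range / C


round_trip = 2.0 * MEAN_DISTANCE / C
print(f"round-trip time: {round_trip:.3f} s")
print(f"timing needed for 2 cm: {timing_for_range_precision(0.02) * 1e12:.0f} ps")
```

Centimetre-level ranging therefore requires timing the round trip at the level of about a hundred picoseconds, which is consistent with the pulse lengths quoted above.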
Many observatories around the world routinely make centimetre-level range mea-
surements. In the last 50 years, the ground station technology has improved to the
point that the range measurements have a precision of ∼2 cm, limited by the lunar
retro-reflectors. Once the data are fitted with a computed orbit, the post-fit residuals
(difference between observed and computed) are at the millimetre level. The contin-
uous record of laser range measurements between Earth and Moon, dating back to
1969, has provided an unprecedented set of data by which to understand dynamics
within the solar system and to test the fundamental nature of gravity.
This technique is analogous to the Satellite Laser Ranging (SLR), mentioned in
Sect. 1.7, that is used to track the motion of artificial satellites covered with retro-
reflectors.
The Moon test of the equivalence of active and passive mass is directly related to
a possible violation of the action-reaction law of Newtonian dynamics. Bartlett and Van
Buren modelled the Moon as a system made of two components: a spherical mantle
of density $\rho_{Fe}$ = 3350 kg/m³, whose centre is displaced by t = 10 km from the
geometrical centre of the spherical outer shell of density $\rho_{Al}$ = 2900 kg/m³ (see
Fig. 1.14). This simplified model accounts for the asymmetrical composition of the
two faces of the lunar surface: the one facing the Earth, which is rich in iron, and
the opposite one, which is rich in aluminium. Additionally, the centre of mass of
the Moon turns out to be displaced from the geometrical centre by s = 1.98 ± 0.06
km along the direction $\theta_E$ = 14° ± 1° to the East of the vector pointing towards the
Earth.
Fig. 1.13 Left: the position of the various retro-reflectors deployed on the Moon's surface. Credits:
NASA. Right: a model of the Soviet Moon rover Lunokhod used to deploy the French-built
retro-reflectors. Photo courtesy of P. Milosević
If the action-reaction principle is violated, we should detect an effect due to the
existence of a residual force FAl,Fe , which would depend on the two components.
This force has the direction of the segment connecting the two geometrical centres
C Al , C Fe and it is applied in B:
The most significant effect on the motion of the Moon would be determined by the
component of the force tangent to the orbit,
which would in turn produce a variation in the angular speed of the moon ω. A
simple calculation allows us to evaluate the magnitude of such an effect: compute
the work done by the force Ft in a full rotation cycle (a lunar month of observations);
this must be equal to the change in kinetic energy of the Earth-Moon system. Since
the latter is one half of its potential energy, we have
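$$2\pi r F_t = \frac{1}{2}\, G\,\frac{M_\oplus M_{\leftmoon}}{r^2}\,\delta r = \frac{1}{2}\, F_{\oplus,\leftmoon}\,\delta r \tag{1.40}$$
where $F_{\oplus,\leftmoon}$ is the gravitational force between Earth and Moon. So we derive that
$$\frac{\delta r}{r} = 4\pi\,\frac{F_t}{F_{\oplus,\leftmoon}} \tag{1.41}$$
Since, by Kepler's third law, $|\delta\omega/\omega| = \frac{3}{2}\,\delta r/r$, and the tangential
component is $F_t = F_{Al,Fe}\sin\theta_E$, we obtain
$$\frac{\delta\omega}{\omega} = 6\pi\,\frac{F_{Al,Fe}}{F_{\oplus,\leftmoon}}\,\sin\theta_E \tag{1.42}$$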
Finally, we need to evaluate FAl,Fe for this two-component model of the Moon.
We introduce the factor S(Al, Fe), the difference of the ratios between active and
passive masses of the iron and aluminium components
$$S(Al, Fe) = \frac{M_{Al}(A)}{M_{Al}(P)} - \frac{M_{Fe}(A)}{M_{Fe}(P)}$$
that depends on the different nuclear composition of the two materials.
$\vec F_{Al}$ ($\simeq -\vec F_{Fe}$) is the internal force that the aluminium component exerts upon the
iron component. We compute its modulus by integrating the field of gravitational
force per unit mass $\vec f_{Al}$ of the aluminium body12 on a mass element of the iron
sphere:
$$\vec F_{Al} = \rho_{Fe} \int_{Fe} \vec f_{Al}\, dV \tag{1.44}$$
In order to deduce a Al and calculate Eq. 1.44, we model the Moon as follows:
• The entire volume of the Moon is filled with material with density ρ Al .
• The iron sphere is made of a sphere of density ρ Fe .
• Superimposed on the latter, there is an identical sphere of density −ρ Al .
$$\vec a_G = \frac{4}{3}\,\pi G \rho_{Al}\, z\, \hat k \tag{1.45}$$
Substituting this field in Eq. 1.44, we obtain
$$F_{Al} = \frac{4}{3}\,\pi G \rho_{Al}\,\rho_{Fe}\, t\, V_{Fe} \tag{1.46}$$
where VFe is the volume of the iron sphere and t is the distance between the centres
C Al and C Fe .
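The product $t\,V_{Fe}$ is deduced using the information of the Moon mass and the
position s of the barycentre. This yields
$$\frac{F_{Al,Fe}}{F_{\oplus,\leftmoon}} = S(Al, Fe)\,\frac{M_{\leftmoon}}{M_\oplus}\left(\frac{d_{\oplus,\leftmoon}}{R_{\leftmoon}}\right)^2 \frac{s}{R_{\leftmoon}}\,\frac{\rho_{\leftmoon}}{\Delta\rho} \tag{1.48}$$
where we have introduced the distance $d_{\oplus,\leftmoon}$ between the Earth and the Moon, the
radius of the Moon $R_{\leftmoon}$, its mean density $\rho_{\leftmoon}$ and the density difference
$\Delta\rho = \rho_{Fe} - \rho_{Al}$.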
12 The gravitational force per unit mass f coincides with the gravitational acceleration g if the
Principle of Equivalence (see Chap. 3) holds.
Comparing this last expression with Eq. 1.42 we derive an upper limit for
S(Al, Fe), which depends on the accuracy of our knowledge of $\delta\omega/\omega$, i.e. less than
$1 \cdot 10^{-12}$.
This value is based on the monthly LLR measurements of ω̇ = −25.3 arcsec/century²,
where the known effects that distort the lunar orbit due to the Earth’s ocean
tides have already been subtracted. The upper bound reads
$$S(Al, Fe) < \frac{1}{5}\,\frac{1}{6\pi}\,\frac{1}{\sin\theta_E}\,\frac{\delta\omega}{\omega} \simeq 5 \cdot 10^{-14} \tag{1.49}$$
Its validity rests on the assumptions made on Moon composition and its simplified
model. However, in the second part of the same paper, the authors (Bartlett and Van Buren 1986)
generalize the Moon structure by assuming a multi-component composition, thus
providing a more general bound for S.
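The order of magnitude of this bound can be reproduced with a short calculation. The sketch below reads the geometrical factor of Eq. 1.48 as (M_Moon/M_Earth)(d/R_Moon)²(s/R_Moon)(ρ_Moon/Δρ), with Δρ = ρFe − ρAl, which is an assumption about the form of that equation; the Moon/Earth mass ratio, the Earth-Moon distance and the lunar radius and mean density are assumed round values not quoted in the text, while s, θE, the two densities and the limit on δω/ω are those given above.

```python
import math

# Values quoted in the text
S_OFFSET = 1.98e3                # m, displacement of the centre of mass
THETA_E = math.radians(14.0)     # direction of the displacement
RHO_FE = 3350.0                  # kg/m^3
RHO_AL = 2900.0                  # kg/m^3
DOMEGA_OVER_OMEGA = 1.0e-12      # limit on the fractional change of omega

# Assumed round values (not given in the text)
MASS_RATIO = 0.0123              # M_Moon / M_Earth
D_EARTH_MOON = 3.844e8           # m
R_MOON = 1.7374e6                # m
RHO_MOON = 3344.0                # kg/m^3


def force_ratio_per_unit_s():
    """Geometrical factor multiplying S(Al, Fe) in the force ratio (assumed form)."""
    return (MASS_RATIO * (D_EARTH_MOON / R_MOON) ** 2
            * (S_OFFSET / R_MOON) * RHO_MOON / (RHO_FE - RHO_AL))


def upper_bound_on_s(domega_over_omega):
    """Bound on S obtained by combining the force ratio with Eq. 1.42."""
    return domega_over_omega / (6.0 * math.pi * math.sin(THETA_E)
                                * force_ratio_per_unit_s())


print(f"geometrical factor: {force_ratio_per_unit_s():.2f}")
print(f"upper bound on S(Al, Fe): {upper_bound_on_s(DOMEGA_OVER_OMEGA):.1e}")
```

Under these assumptions the geometrical factor is close to 5, which accounts for the factor 1/5 in Eq. 1.49, and the resulting bound is a few times 10⁻¹⁴.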
In conclusion, the experiment supports the assumption that the passive mass of a
body is proportional to its inertial one.
$$\frac{m_p}{m_i} = \alpha \tag{1.50}$$
$$m_p = m_a = m_i .$$
References
Bartlett, D.F., Van Buren, D.: Equivalence of active and passive gravitational mass using the moon.
Phys. Rev. Lett. 57 (1986)
Cavendish, H.: XXI. Experiments to determine the density of the Earth. Philos. Trans. R. Soc. 88,
469 (1798)
Dicke, R.H.: The solar oblateness and the gravitational quadrupole moment. Ap. J. 159, 1 (1970)
Dicke, R.H., Goldenberg, H.M.: Solar oblateness and general relativity. Phys. Rev. Lett. 18, 313
(1967)
Kreuzer, L.B.: Experimental evidence of the equivalence of active and passive gravitational mass.
Phys. Rev. 169, 1007 (1968)
Duvall, T.L., Harvey, J.W., et al.: Rotational frequency splitting of solar oscillations and Internal
rotation of the Sun. Nature 310(19–22), 22–25 (1984)
International Earth Rotation and Reference System Service: IERS Conventions (2020). IERS Tech-
nical note n. 36
Mecheri, R., et al.: New values of gravitational moments J2 and J4 deduced from helioseismology.
Solar Phys. 222, 191–197 (2004)
Rothleitner, S., Schlamminger, S.: Measurements of the Newtonian constant of gravitation. Rev.
Sci. Inst. 88, 111101 (2017)
Tiesinga, E., et al.: CODATA recommended values of the fundamental physical constants: 2018.
Rev. Mod. Phys. 93, 025010 (2021)
Further Reading
Curtis, H.D.: The two body problem. In: Orbital Mechanics for Engineering Students, 2nd ed.
Butterworth-Heinemann (2010)
Gillies, G.T.: The Newtonian gravitational constant: recent measurements and related studies.
Rep. Prog. Phys. 60, 151–225 (1997)
13 A rigorous treatment of the fundamental laws of gravitation and of the complex problem of
the four-momentum conservation in the framework of general relativity is given in Chap. 11 of C.
Møller, The Theory of Relativity, Clarendon Press, Oxford 1972.
2 The Torsion Pendulum
The torsion pendulum is the main tool of gravitational physics, used in a variety of
configurations and operating modes. Cavendish was the first to use it for exploring
the gravitational field, but the merit of its conception and development must be
attributed to Coulomb and Michell, whose main goal was to measure electric forces:
they carried out, in 1777, what is considered the first experiment of modern physics.
Indeed, Coulomb’s law dates to 1785, some 13 years before Cavendish’s claim of
“having weighed the Earth” (Figs. 2.1 and 2.2).
They realized that the horizontal equilibrium condition made it possible to remove almost
completely the dominant pull of the Earth, even that associated with the centrifugal force
coming from the Earth’s rotation, but not the background noise of gravitational (or,
for Coulomb, electrostatic) origin. Additionally, they became aware that to increase
the sensitivity of the measurement, long observation (i.e. integration) times were
needed, and that the pendulum should have a small damping coefficient, i.e. a long
time decay constant.
At that time, the thermal origin of the intrinsic noise was unknown and therefore
the experimentalists were unable to estimate the magnitude of such noise, once
isolated from the seismic and environmental contributions. This instrument was
extensively used during the whole nineteenth and twentieth centuries, in all kinds
of measurements requiring detection of small forces; just to mention a few
investigations: to determine the value of G (more than 200 experiments since Cavendish,
almost all of them using a pendulum), to test the equivalence principle (see Chap. 3),
to verify several properties of the gravitational field such as the superposition
principle and the inverse square law, to reveal gravitational anomalies, to search for a rest
mass of the photon, to measure the Casimir force and to search for postulated weak
forces of either long or short range, the so-called fifth force.
Fig. 2.1 The torsion balance used by Cavendish for “weighing the Earth”. The lettering on the
drawing refers to the description given in Cavendish (1798). On the right: an old print shows
Cavendish reading the torsion angles through the optical lever, from outside the laboratory
The pendulum still today plays a central role in the physics of experimental
gravitation; besides, it is at the basis of the working principle of several versions of
the gravimeters, which are widely employed in geophysics and geodesy.
The torsion pendulum has been built in countless versions: the suspension wire can be
a quartz fibre or a tungsten wire, with a diameter from just a few microns
up to millimetres; the inertial member, or test mass, can range from 10 kg
down to a fraction of a gram. It has been operated in vacuum chambers, inside
tanks for submersion in water, on mountain tops and in deep caves; it has been
cooled to cryogenic temperatures and heated up to incandescence. It is a robust and
very versatile instrument; it can detect very
small forces over a dynamic range of several orders of magnitude (Fig. 2.3).
In all these versions, its intrinsic noise sources have been carefully studied: thermal
motion and the perturbations due to the transducer converting mechanical signals
into electromagnetic ones, both in free-running and feedback set-ups. Clearly, the
intrinsic noise is only observable if the apparatus is effectively isolated from external
noise sources, like ground vibrations. Few other scientific instruments, based on a
mechanical device so simple in appearance, have been the subject of studies for such
a long time. Its evolution still continues (Gillies and Ritter 1993). The persisting
discrepancies among the different measurements of G, still observed today,
motivate improvements in the performance of this instrument, thus increasing
Fig. 2.3 A sophisticated, recent torsion pendulum, built and operated in the Eöt-Wash group at
the University of Washington in Seattle for a measurement of G (Gundlach and Merkowitz 2000).
Courtesy of Eöt-Wash group. On the right a diagram describing its components. Credits: reprinted
figure with permission from (Gundlach and Merkowitz 2000). Copyright 2000 by the American
Physical Society
the comprehension of its physical limits. In order to reduce the contributions from
thermal noise and to increase the isolation from vibrations, the experimentalists
have followed different strategies, like cooling the pendulum down to cryogenic
temperatures or loading the suspension fibre, by reducing its diameter, until it reaches
values close to the maximum safe load. Under these extreme conditions, non-linear
effects start to be significant so that it is crucial to understand and optimize the
anelastic properties of the system, which is a topic at the frontiers of current research
on the dissipative properties of materials.
A dumbbell, i.e. a rigid rod with two masses set at its ends, is suspended by a long
thin wire keeping the rod horizontally balanced. Applying a torque to the dumbbell,
the rod rotates by an angle θ and a new equilibrium condition is reached, where
the applied torque is balanced by the elastic one coming from the torsion wire. For
simplicity, we will treat the wire as a cylinder of constant cross section: its torsion is
a typical non-homogeneous shear deformation. We recall here that a generic
deformation of an elastic body can be decomposed as the sum of
(a) a simple volume deformation, that is, a deformation which changes the volume
of the body but not its shape;
(b) a simple slip deformation, that is, a deformation which leaves the body’s
volume unchanged, but changes its shape.
We consider the second case: referring to Fig. 2.4, the shear angle γ, for small
deformations (tan γ ≈ γ), is given by γ = CC′/AC.
Under elastic deformation conditions, the angle γ is proportional to the applied
shear stress (tangential force per unit area) τ:
$$\gamma = \frac{CC'}{AC} = \frac{\tau}{\mu} \tag{2.1}$$
It can be proven (Love 1994) that the proportionality coefficient μ, called the shear
modulus, is related to Young’s modulus E and Poisson’s ratio σ by
the relation
$$\mu = \frac{E}{2(1+\sigma)} \tag{2.2}$$
Considering the torsion of a cylinder, each section perpendicular to its axis (slice)
remains on a plane, when subject to a twisting horizontal torque, due to its circular
form; however, it ends up rotated with respect to the adjacent slices. We now consider
a slice of elementary height dz, and we focus on a circular sector of length dr and
width r dφ placed on either of the two bases of the cylinder (see Fig. 2.5).
Next, we assume that the corresponding sector on the other base is rotated with
respect to the first one by an angle which linearly grows with the cylinder height.
For a height dz of the elementary cylinder, such an angle is then αdz, where α is
the angle of rotation per unit length. For small deformations in the radial position r
of the reference section, the shear angle γ is obtained from the ratio of the arc r αdz
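and the height of the cylinder dz. Equation 2.1 gives
$$\gamma = r\alpha = \frac{\tau}{\mu} \tag{2.3}$$
The shear stress τ acts on the annulus of area 2πr dr with lever arm r; integrating over
the cross section of radius R, the total torque is
$$M_\tau = \int_0^R \tau\,(2\pi r)\, r\, dr = 2\pi\mu \int_0^R r^3 \alpha\, dr = \frac{1}{2}\pi\mu R^4 \alpha$$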
Finally, we introduce the total torsion angle θ, that is, our measurable quantity, obtained by
multiplying α, the rotation angle per unit length, by the wire length L (θ = αL):
$$M_\tau = \frac{\pi}{2}\,\mu\,\frac{R^4}{L}\,\theta = k_\tau \theta \tag{2.4}$$
This equation, clearly valid for small angles θ, defines the torsion constant kτ. In a
sensitive instrument, where a small torque should produce a large response, we want
a small torsion constant kτ.
Left free to oscillate, the pendulum is driven by the elastic restoring torque of the wire:
$$I\ddot{\theta} = -k_\tau \theta \tag{2.5}$$
where I is the moment of inertia of the pendulum load about the rotation axis.
The corresponding moment of inertia of the fibre itself is assumed negligible. From
Eq. 2.5, a simple harmonic oscillator, we derive the resonance frequency ντ of free
oscillations of the torsion pendulum:
$$\nu_\tau = \frac{1}{2\pi}\sqrt{\frac{k_\tau}{I}} = \frac{1}{4\sqrt{\pi}}\,\frac{R^2}{\sqrt{IL}}\,\sqrt{\frac{E}{1+\sigma}} \tag{2.6}$$
where we have used Eqs. (2.2), (2.4) and (2.5). In Eq. 2.6, the term $R^2/\sqrt{IL}$ summarizes
the geometric properties of the pendulum. In order to have a very low natural
frequency, approaching the free motion (where no restoring force or torque exists),
we need a large I and a small fibre diameter 2R. However, these are competing
requirements, because the moment of inertia is typically proportional to the load
mass M, and the fibre size must be large enough to hold the weight Mg. The safe
load of a fibre scales with its cross section, i.e. with R 2 , while the torsion constant is
proportional to R 4 ; therefore, it is beneficial to have a lightweight inertial member
and a very thin fibre. State-of-the-art torsion pendulums have natural frequencies in
the mHz range.
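As a rough numerical illustration of Eqs. (2.2), (2.4) and (2.6), the following sketch evaluates the torsion constant and the free-oscillation frequency. The material constants (approximate handbook values for tungsten) and the fibre and load dimensions are assumptions chosen only for illustration; they do not describe a specific experiment.

```python
import math

def shear_modulus(E, sigma):
    """Eq. (2.2): shear modulus from Young's modulus E and Poisson's ratio sigma."""
    return E / (2.0 * (1.0 + sigma))

def torsion_constant(mu, R, L):
    """Eq. (2.4): k_tau = (pi/2) * mu * R**4 / L for a fibre of radius R and length L."""
    return 0.5 * math.pi * mu * R**4 / L

def natural_frequency(k_tau, I):
    """Eq. (2.6): nu_tau = sqrt(k_tau / I) / (2*pi)."""
    return math.sqrt(k_tau / I) / (2.0 * math.pi)

# Assumed, illustrative values
E, sigma = 4.1e11, 0.28   # Pa and dimensionless: approximate values for tungsten
R, L = 10e-6, 1.0         # m: fibre radius and fibre length
I = 1.0e-4                # kg m^2: moment of inertia of the suspended load

mu = shear_modulus(E, sigma)
k_tau = torsion_constant(mu, R, L)
nu = natural_frequency(k_tau, I)
print(f"k_tau = {k_tau:.2e} N m/rad")
print(f"nu_tau = {nu * 1e3:.2f} mHz (period {1.0 / nu:.0f} s)")
```

With these assumed numbers the frequency falls in the mHz range; since kτ scales as R⁴, doubling the fibre radius would raise kτ by a factor of 16.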
Despite its conceptual simplicity the torsion balance turns out to be, upon a detailed
analysis, a complex device. Here, we will try to summarize some basic concepts,
assuming that the system is characterized by a single fundamental mode of oscil-
lation. More thorough treatments, which also consider the presence of several res-
onances, in both the free and forced regimes, as well as the non-linear effects, are
given in Gillies and Ritter (1993) and in Metherell and Speake (1989).
Strategies for measuring the torsion angle of the balance can be grouped in two
big categories:
(a) Direct measurements of the torsion angle and
(b) Measurements of the oscillation period.
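[Figure: torsion pendulum with optical read-out. Labels: test masses m1 and m2 at the ends of arms of length L/2, mirror, laser, quadrant photodiode]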
displacement from the equilibrium position of the system. The sensitivity of the
instrument is defined as S = dθ/dMτ. From Eq. (2.4), we simply have S = 1/kτ.
High-performance pendulums have a sensitivity of the order of $10^{8}$ rad/(N m).
To derive the order of magnitude of the smallest measurable signal Mτ , we need to
specify the method for measuring the torsion angle θ. The measurement of θ must
be performed using a system without mechanical contact, which in most devices is
done applying either an optical lever method or an electrostatic read-out (Fig. 2.6).
A light beam, emitted (in modern times) by a laser source, is directed towards a
mirror placed at the centre of the rod and the reflected beam is aimed at a receiver:
the light intensity is detected using a photodiode divided into four sensitive zones
(quadrant photodiode). The cross section of the light beam falls differently upon
the four sectors of the photodiode, whose electric signals can be easily processed to
localize the centre of the beam. A quadrant photodiode can measure beam changes
in two directions; for planar motion, as is the case of a torsion balance, we are only
concerned with one angular motion (Fig. 2.7).
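The processing of the four quadrant signals can be sketched as follows, using the combinations quoted in the caption of Fig. 2.7. The function name and the sample readings are hypothetical; the result is a dimensionless, normalized position, and converting it into a displacement would require a calibration that depends on the beam size.

```python
def quadrant_position(A, B, C, D):
    """Normalized beam position from the four quadrant signals.

    Horizontal: (A + C) - (B + D); vertical: (A + B) - (C + D);
    both are divided by the total detected power A + B + C + D.
    """
    total = A + B + C + D
    if total <= 0:
        raise ValueError("no light detected")
    x = ((A + C) - (B + D)) / total
    y = ((A + B) - (C + D)) / total
    return x, y

# Hypothetical readings (arbitrary units)
print(quadrant_position(1.0, 1.0, 1.0, 1.0))  # beam centred on the detector
print(quadrant_position(1.2, 0.8, 1.2, 0.8))  # beam shifted towards quadrants A and C
```

Dividing by the total power makes the reading insensitive, to first order, to fluctuations of the laser intensity.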
A small change δθ in the equilibrium position of the balance causes an equal
change δi = δθ in the incidence angle i of the light beam on the mirror. As shown
in Fig. 2.8, for small oscillations the position x of the centre of the reflected beam
on the receiver depends on the distance travelled by the reflected beam s, the angle
φ between the normal to the photodiode surface and the incident beam and i, the
incidence angle of the beam on the mirror (Cook 1979):
Fig. 2.7 The principle of operation of an optical lever with quadrant photodiode detector: if the
beam (red) hits the surface not exactly at its centre, the four quadrants detect different amounts of
light. In this example, the horizontal position is deduced by combining the readings (A + C) − (B
+ D), while the vertical one by (A + B) − (C + D). The reading is then normalized by the total
detected power (A+B+C+D).
[Figure label: plane of the photodiode]
x = s · sin(π − 2i − φ) (2.7)
As the torsion angle changes, a small variation in the incidence angle δi causes the
beam to move by δx on the photodiode; the sensitivity of the method depends on the
minimum value of the displacement δxmin detected by the photodiode.
$$\delta x = \frac{2s}{\cos(\varphi + 2 i_o)}\,\delta i \tag{2.8}$$
Using a quadrant photodiode, changes in the position of the light beam of order
$\delta x_{min} \sim 10^{-10}$ m can be observed. Assuming for laboratory experiments s ∼ 1 m,
the minimum measurable value of the torsion angle is $\delta\theta_{min} \sim 10^{-8}$ rad. This value
corresponds, from previous considerations (Eq. 2.4), to a minimum detectable torque:
$$M_{\tau,min} = k_\tau\, \delta\theta_{min} \sim 10^{-16}\ \mathrm{N\,m}$$
In practice, other effects, like thermal noise or the finite integration time, limit the sensitivity
to the order of fN m.
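The order-of-magnitude estimate above is a one-line calculation, reproduced here with the values quoted in the text (sensitivity S of order 10^8 rad/(N m) and a minimum angle of order 10^-8 rad). The function name is ours.

```python
def minimum_torque(sensitivity, delta_theta_min):
    """Smallest detectable torque M = k_tau * delta_theta, with k_tau = 1 / S (Eq. 2.4)."""
    k_tau = 1.0 / sensitivity
    return k_tau * delta_theta_min

S = 1e8                  # rad/(N m), sensitivity quoted in the text
delta_theta_min = 1e-8   # rad, minimum measurable angle quoted in the text
print(f"M_min = {minimum_torque(S, delta_theta_min):.1e} N m")
```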
With an adequate optical configuration for recording the light shining on the
photodiode, it is also possible to distinguish the translational degrees of freedom of
the mirror from the rotational ones. We refer to Fig. 2.9 for the calculation of the
displacement x of the laser spot on a plane placed at a distance D from a converging
lens of focal length f, following a rotation δi and a translation z of the mirror
placed at a distance L from the lens. We shall assume that, at rest (δi = 0), the
reflected beam travels along the optical axis of the lens.
To first order, it is sufficient to linearly add the two effects. We assume that,
initially, the light beam falls on the mirror with an incidence angle equal to $i_o$.
We recall the equation that relates source (p = L) and image (q) positions for a thin
lens:
$$\frac{1}{L} + \frac{1}{q} = \frac{1}{f} \tag{2.9}$$
where q is the coordinate of the mirror image plane. By simple application of elementary
geometry to the diagram of Fig. 2.9, we deduce that
$$x \simeq 2z \sin i_o \left(1 - \frac{D}{f}\right) + 2\tan(\delta i)\left[L - D\left(\frac{L}{f} - 1\right)\right] \tag{2.10}$$
Fig. 2.9 Diagram for the calculation of the transfer function between either a rotation (a) or a
translation (b), of a mirror placed at a distance L from a convergent lens of focal distance f and the
corresponding displacement of a light beam on a plane orthogonal to the optical axis at a distance
D behind the lens
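[Figure: torsion pendulum operated in feedback. Labels: test masses m1 and m2 at the ends of arms of length L/2, mirror, laser, quadrant photodiode, feedback, output]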
It follows that, if D = f, the displacement x only depends on the value δi, the mirror
rotation angle. Therefore, by placing the photodiode in the focal plane of the lens,
the device will only be sensitive to the system’s rotations, but not to its translations.
On the other hand, note that by placing the photodiode in the image plane of the mirror,
i.e. at a distance D from the lens equal to
$$D = \frac{Lf}{L - f}$$
the term related to the rotation vanishes and the device is only sensitive to the translations
of the mirror.
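The two special detector positions can be checked numerically. The sketch below evaluates the two terms of Eq. (2.10), translation term 2z sin i_o (1 − D/f) and rotation term 2 tan(δi) [L − D(L/f − 1)], for an assumed geometry; all numerical values are illustrative assumptions.

```python
import math

def spot_displacement(z, delta_i, i_o, L, D, f):
    """First-order spot displacement of Eq. (2.10) on a plane at distance D behind
    a thin lens of focal length f; the mirror is at distance L from the lens."""
    translation = 2.0 * z * math.sin(i_o) * (1.0 - D / f)
    rotation = 2.0 * math.tan(delta_i) * (L - D * (L / f - 1.0))
    return translation + rotation

# Assumed, illustrative geometry
L_mirror, f = 0.50, 0.20      # m
i_o = math.radians(30.0)      # rad, incidence angle at rest
z, delta_i = 1e-6, 1e-6       # m and rad: small translation and rotation

# Photodiode in the focal plane: only the rotation contributes
x_focal = spot_displacement(z, delta_i, i_o, L_mirror, f, f)

# Photodiode in the image plane of the mirror: only the translation contributes
D_image = L_mirror * f / (L_mirror - f)
x_image = spot_displacement(z, delta_i, i_o, L_mirror, D_image, f)

print(f"focal plane: x = {x_focal:.3e} m, 2 f tan(delta_i) = {2 * f * math.tan(delta_i):.3e} m")
print(f"image plane: x = {x_image:.3e} m")
```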
plates. The error signal from the feedback circuit V is added to the bias voltage in
one capacitor while it is subtracted from the other one. If A is the surface of each
plate, in the case of a balanced capacitance transducer with Ca = Cb = C, we have
a total force acting on the mass m2, which is proportional to the error signal V:
$$F_{fb} = \frac{1}{2}\frac{C^2}{\varepsilon_0 A}\left[(V_0 + V)^2 - (V_0 - V)^2\right] = 2\,\frac{C^2 V_0}{\varepsilon_0 A}\,V \tag{2.11}$$
Thus we have a dynamical system that, when excited by an external perturbation M(t)
with frequency components in the bandwidth of the control system, remains locked
at the operating point. This is due to the compensation produced by the electrostatic
force, which in turn is proportional to the error signal V used as the output signal
of the instrument. Compared with the method of free oscillations, this way of measuring
the response of the torsion balance increases the dynamic
range of the instrument, i.e. it makes it possible to detect external forces or torques in a wider
range of strength, while keeping the sensitivity of the quadrant photodiode constant.
Besides, it overcomes a problem, suggested in the literature (Kuroda 1995), of a
non-linear response of the fibre under torsion, that is, a breakdown of the simple
proportionality law of Eq. (2.4), with a more complex, and not easily manageable
coefficient kτ = kτ(θ). In feedback operation, the torsion angle remains “nailed” to
zero, and kτ is bound to remain constant.
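The linearity of the feedback force in the error signal V can be verified with a few lines of code. The sketch compares the difference of the two capacitor forces with the closed form of Eq. (2.11); the capacitance, plate area and voltages are assumptions chosen only for illustration.

```python
EPS0 = 8.8541878128e-12  # F/m, vacuum permittivity

def feedback_force(V, V0, C, A):
    """Net force from two balanced capacitors biased at V0 + V and V0 - V (Eq. 2.11)."""
    return 0.5 * C**2 / (EPS0 * A) * ((V0 + V)**2 - (V0 - V)**2)

def feedback_force_linear(V, V0, C, A):
    """Closed form of Eq. (2.11): 2 C^2 V0 V / (eps0 A), linear in V."""
    return 2.0 * C**2 * V0 * V / (EPS0 * A)

# Assumed, illustrative values
C = 10e-12   # F, capacitance of each transducer
A = 1e-4     # m^2, plate area
V0 = 10.0    # V, bias voltage

for V in (0.01, 0.02, 0.04):
    print(f"V = {V:.2f} V: F = {feedback_force(V, V0, C, A):.3e} N, "
          f"linear form = {feedback_force_linear(V, V0, C, A):.3e} N")
```

The quadratic terms in V cancel in the difference, which is why the bias voltage V0 is needed to obtain a force proportional to V.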
A deeper analysis of the sensitivity limits of this device would require a much more
extended discussion than that provided in the previous sections, since there are
many possible causes that could affect its sensitivity.
Two unavoidable noise sources are intrinsic to the apparatus: the noise of the
read-out, i.e. the system converting the variations of the torsion angle δθ into electric
signals, and the thermal noise of the balance. Additional noise from the detection
amplifier circuit can be accounted for in the overall transducer noise while, in the case
of optical transducers, we will need to consider the shot noise, the noise associated
with radiation pressure, the amplitude and phase noises of the laser source, in addition
to those related to the current and voltage fluctuations of the photodiode. We will
discuss all these sources of noise when we study the interferometers for the detection
of gravitational waves. In a system with feedback, the possible contributions
from the feedback circuit must also be considered.
The thermal noise of the torsional oscillator plays a central role in defining the
sensitivity limit of the instrument, as it is the case for a large part of gravitational
experiments: Chap. 9 later on in this book discusses thermal noise in detail. Here,
we will just recall that the power spectrum of the acceleration thermal noise is
proportional to the temperature and decreases by reducing the anelastic dissipation
of the system. For this reason, a careful choice of the material of the torsion wire and
of the method of fastening it at both ends is required to reduce this noise contribution.
Here, we will try to analyse a particular interaction of the system with its environment
(anthropic noise), a cause limiting the sensitivity different from the above-mentioned
intrinsic noise sources (Fan et al. 2008).
In principle, the seismic motion of the ground and hence of the connection point
of the torsion wire is filtered by the torsion pendulum. Such filtering action is
most efficient if the fastening system of the wire is carefully designed.
Assume, as a limiting case, that the connection region of the suspension wire on the
rod with its end masses mi is reduced to a perfectly centred, geometric point: there will
be no spurious couplings between translational and rotational degrees of freedom.
In a real apparatus this is not the case. As a consequence, a horizontal acceleration of
the connection zone entails a pendular motion which, in turn, introduces a torsional
motion of the rod. This is why the seismic noise places a limit on the sensitivity of
the experiment. To express this argument quantitatively, we report here a detailed
model for the torsion balance and derive its Lagrangian equations of motion (Ying
2004). A simpler but still realistic and comprehensive approach is developed in
(Bassan 2013).
We refer to Fig. 2.11 and introduce two Cartesian reference frames including the
rotation angles with respect to their respective axes: the first one O0, X, Y, Z, αo, βo
follows the connection point of the wire and is fixed to the laboratory, while the
second O1, X1, Y1, Z1, α1, β1, γ1 follows the rod and its origin coincides with the
connection point of the wire to the rod. 2h, 2w and 2l are the height, width and
length of the rod; lo is the length of the torsion wire and l1 is the distance between the
centre of mass of the system O2 and the attachment point O1.
Fig. 2.11 Diagram of a more realistic torsion pendulum. Oo is the upper attachment point of the
torsion wire, O1 the attachment point from the wire to the pendulum bar, O2 is the system centre of
mass. We have introduced two Cartesian reference frames which include the rotation angles with
respect to their respective axes: the first one O0, X, Y, Z, αo, βo moves with the attachment point
of the wire, while the second O1, X1, Y1, Z1, α1, β1, γ1 moves along with the rod
Under the hypothesis that the point Oo is fixed, the degrees of freedom of this
system are 5, namely, the angles αo , βo , α1 , β1 , γ1 . If we want to include in the
analysis the effects of the seismic noise, we should consider the (stochastic) variables
X , Y , Z , describing the motion of the suspension attachment point.
Let us express x, y, z, the coordinates of a generic point P on the rod, as functions
of the degrees of freedom. To this end, we write the equations that transform the
variables in the reference system O1 , X 1 , Y1 , Z 1 into the ones corresponding to the
system O0 , X , Y , Z :
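$$\begin{aligned}
y ={}& l_o \cos\beta_o \sin\alpha_o \\
&+ x_1(\cos\alpha_1 \sin\gamma_1 - \cos\gamma_1 \sin\beta_1 \sin\alpha_1) \\
&+ y_1(\cos\alpha_1 \cos\gamma_1 - \sin\gamma_1 \sin\beta_1 \sin\alpha_1) \\
&- z_1 \cos\beta_1 \sin\alpha_1
\end{aligned} \tag{2.13}$$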
From these equations, we derive the expressions of ẋ, ẏ, ż (with ẋ1 = ẏ1 = ż1 = 0, since P is fixed on the rigid rod).
We then integrate the kinetic and potential energies of the mass element ρ dx1 dy1 dz1
over the volume of the rod (−l ≤ x1 ≤ l, −w ≤ y1 ≤ w, l1 − h ≤ z1 ≤ l1 + h), thus
obtaining the expression for the Lagrangian of the system:
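$$\begin{aligned}
L ={}& \frac{1}{2} m l_o^2 \cos^2\beta_o\, \dot{\alpha}_o^2 + \frac{1}{2} m l_o^2 \dot{\beta}_o^2 \\
&+ \left[\frac{1}{6} m (w^2 + l^2) + \frac{1}{6} m (3 l_1^2 + h^2 - w^2)\cos^2\beta_1 + \frac{1}{6} m (w^2 - l^2)\cos^2\beta_1 \cos^2\gamma_1\right]\dot{\alpha}_1^2 \\
&+ \left[\frac{1}{6} m (h^2 + w^2 + 3 l_1^2) + \frac{1}{6} m (l^2 - w^2)\cos^2\gamma_1\right]\dot{\beta}_1^2 \\
&+ \frac{1}{6} m (w^2 + l^2)\,\dot{\gamma}_1^2 \\
&+ m l_o l_1 \cos\beta_o \cos\beta_1 \cos(\alpha_o - \alpha_1)\,\dot{\alpha}_o \dot{\alpha}_1 \\
&+ m l_o l_1 \cos\beta_o \sin\beta_1 \sin(\alpha_o - \alpha_1)\,\dot{\alpha}_o \dot{\beta}_1 \\
&+ \dots
\end{aligned} \tag{2.15}$$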
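In the small-angle approximation, the Lagrangian reduces to
$$\begin{aligned}
L ={}& \frac{1}{2} m l_o^2 \dot{\alpha}_o^2 + \frac{1}{2} m l_o^2 \dot{\beta}_o^2 \\
&+ \frac{1}{6} m (h^2 + w^2 + 3 l_1^2)\,\dot{\alpha}_1^2 \\
&+ \frac{1}{6} m (h^2 + l^2 + 3 l_1^2)\,\dot{\beta}_1^2 \\
&+ \frac{1}{6} m (w^2 + l^2)\,\dot{\gamma}_1^2 \\
&+ m l_o l_1 \dot{\alpha}_o \dot{\alpha}_1 + m l_o l_1 \dot{\beta}_o \dot{\beta}_1 \\
&+ m g \left[ l_o\left(1 - \frac{1}{2}\alpha_o^2 - \frac{1}{2}\beta_o^2\right) + l_1\left(1 - \frac{1}{2}\alpha_1^2 - \frac{1}{2}\beta_1^2\right)\right] \\
&- \frac{1}{2} k \gamma_1^2
\end{aligned} \tag{2.16}$$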
Comparing Eq. (2.15) with (2.16) we can see that, in the small-angle
approximation, every significant coupling between the different degrees of
freedom cancels out. Beginning instead with the complete Lagrangian of Eq. (2.15),
we now derive the equations of motion using the well-known Lagrange equations of
analytical mechanics:
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0 \tag{2.17}$$
where qi is a generic degree of freedom. Applying it to the torsional degree of
freedom γ1 , we obtain an equation that includes coupling terms
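$$\begin{aligned}
&\frac{1}{3} m (w^2 + l^2)\,\ddot{\gamma}_1 - \frac{1}{3} m (l^2 - w^2)\cos^2\gamma_1 \sin 2\gamma_1\, \dot{\alpha}_1^2 + \frac{1}{6} m (l^2 - w^2)\sin 2\gamma_1\, \dot{\beta}_1^2 \\
&\quad = -\frac{1}{3} m \left[(l^2 - w^2)\cos 2\gamma_1 + (l^2 + w^2)\right]\dot{\alpha}_1 \dot{\beta}_1 + \frac{1}{3} m (l^2 + w^2)\sin\beta_1\, \ddot{\alpha}_1
\end{aligned} \tag{2.18}$$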
For small values of the angles α1 and β1, this equation reduces to the form
$$\ddot{\gamma}_1 + \omega_T^2 \gamma_1 = -\frac{2 l^2 \dot{\alpha}_1 \dot{\beta}_1}{l^2 + w^2} - \beta_1 \ddot{\alpha}_1 = F \tag{2.19}$$
where $\omega_T = \sqrt{k/I_z}$ and $I_z = \frac{1}{3} m (l^2 + w^2)$ is the moment of inertia with respect to
the z-axis of the rod.
From this relation we infer that the swaying motion, caused, for instance, by
seismic noise and described by the degrees of freedom α1 and β1, can act as a forcing
term F of the twisting motion through the coupling terms. Analogously, we can
see that α1 and β1 depend on the degrees of freedom αo and βo and on the possible
translational motion of the suspension point O.
We conclude that these coupling terms set a serious sensitivity limitation
for the torsion balance, associated with the seismic motion of the attachment point of
the system O.
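To get a feeling for the size of the forcing term, the sketch below evaluates F = −2l² α̇1 β̇1/(l² + w²) − β1 α̈1, as written in Eq. (2.19), for swing motions that we assume to be harmonic, α1 = a cos(ωp t) and β1 = b cos(ωp t + φ). The swing frequency, the phase lag and the rod dimensions are illustrative assumptions, not values from the references.

```python
import math

def forcing_term(t, a, b, omega_p, phase, l, w):
    """F(t) of Eq. (2.19) for alpha_1 = a cos(omega_p t), beta_1 = b cos(omega_p t + phase)."""
    alpha_dot = -a * omega_p * math.sin(omega_p * t)
    alpha_ddot = -a * omega_p**2 * math.cos(omega_p * t)
    beta = b * math.cos(omega_p * t + phase)
    beta_dot = -b * omega_p * math.sin(omega_p * t + phase)
    return -2.0 * l**2 * alpha_dot * beta_dot / (l**2 + w**2) - beta * alpha_ddot

# Assumed, illustrative values
omega_p = 2.0 * math.pi * 0.5   # rad/s, angular frequency of the swing motion
l, w = 0.10, 0.01               # m, half-length and half-width of the rod
phase = math.pi / 2             # rad, phase lag between the two swing modes

def peak_forcing(a, b, n=2000):
    """Largest |F| over one swing period, sampled at n points."""
    period = 2.0 * math.pi / omega_p
    return max(abs(forcing_term(k * period / n, a, b, omega_p, phase, l, w))
               for k in range(n))

for amp in (1e-6, 1e-5):
    print(f"swing amplitude {amp:.0e} rad -> peak F = {peak_forcing(amp, amp):.2e} rad/s^2")
```

The forcing is bilinear in the two swing amplitudes: it vanishes if either mode is absent and grows by a factor of four when both amplitudes double.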
References
Bassan, M., et al.: Torsion pendulum revisited. Phys. Lett. A 377, 1555–1562 (2013)
Cavendish, H.: XXI. Experiments to determine the density of the earth. Phil Trans. R. Soc. 88, 469
(1798)
Cook, R.O., Hamm, C.W.: Fiber optic lever displacement transducer. Appl. Optics 18, 3230 (1979)
Fan, X.-D., et al.: Coupled modes of the torsion pendulum. Phys. Lett. A 372, 547–552 (2008)
Gillies, G.T., Ritter, R.C.: Torsion balances, torsion pendulums, and related devices. Rev. Sci.
Instrum. 64, 283 (1993)
Gundlach, J.H., Merkowitz, S.M.: Measurement of Newton’s constant using a torsion balance with
angular acceleration feedback. Phys. Rev. Lett. 85, 2869 (2000)
Kuroda, K.: Does the time-of-swing method give a correct value of the Newtonian gravitational
constant? Phys. Rev. Lett. 75, 2796 (1995)
Love, A.E.H.: A Treatise on the Mathematical Theory of Elasticity. Dover, New York (1994)
Metherell, A.J.F., Speake, C.C.: The dynamics of the double-pan beam balance. Metrologia 19, 109
(1989)
Ying, Tu., et al.: An abnormal mode of torsion pendulum and its suppression. Phys. Lett. A 331,
354–360 (2004)
Further Reading
Adelberger, E.G., Gundlach, J.H., Heckel, B.R., Hoedl, S., Schlamminger, S.: Torsion balance
experiments: A low-energy frontier of particle physics. Progress in Particle and Nuclear Physics
62, 102–134 (2009)
3 The Equivalence Principle
The equivalence principle is the basis of the Newtonian theory of gravitation as well
as of Einstein’s. It can be stated as follows.
The property of a body that determines its response to any applied force ($m_i$ =
inertial mass) coincides with the property of the body that determines its response
to gravitational attraction ($m_g$ = gravitational mass):
$$m_i = m_g \tag{3.1}$$
This last consideration implies that the elementary space-time interval $ds^2$, in the absence
of gravity, can be written in terms of the symmetric tensor
$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{3.2}$$
as
$$ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu \tag{3.3}$$
It also follows that the time intervals measured by a clock in motion and in absence
of gravity only depend on its velocity and not on the acceleration.
The slowdown of clocks in motion is a phenomenon that is well demonstrated
experimentally. One of the experiments devoted to this end was performed in 1977
at the European Organization for Nuclear Research (CERN) (Bailey et al. 1977). In this
case, subatomic particles played the role of natural clocks, making use of the fact
that their mean lifetime τ̄ is an intrinsic property of the particle. The mean lifetimes
of positive and negative muons were measured. The muon is an unstable particle that
decays spontaneously into an electron and two different neutrinos after $2.2 \times 10^{-6}$ s,
on average. Muons with a speed of v = 0.9994 c, corresponding to a Lorentz
factor of $\gamma = \left[1 - (v^2/c^2)\right]^{-1/2} \approx 30$, circulated in a ring of 14 m in
diameter with a centripetal acceleration of $10^{18}$ times the gravitational acceleration
g. The measured lifetime was found in excellent agreement with the well-known
relation:
$$\bar{\tau} = \gamma\, \bar{\tau}_o \tag{3.4}$$
where τ̄ is the mean lifetime of the muons measured in the laboratory and τ̄o is the
muon’s mean lifetime at rest.
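The numbers quoted for the muon experiment can be cross-checked with a short calculation. With the rounded speed v = 0.9994 c the Lorentz factor comes out close to 29, i.e. of the order of the value of about 30 quoted above. The acceleration in the muon rest frame is computed as γ²v²/r; that the quoted 10^18 g refers to this proper acceleration is our assumption.

```python
import math

C_LIGHT = 299_792_458.0  # m/s
G_ACC = 9.81             # m/s^2, standard gravitational acceleration (rounded)

def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2), with beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

beta = 0.9994        # v / c, as quoted in the text
tau_rest = 2.2e-6    # s, muon mean lifetime at rest
radius = 7.0         # m, ring of 14 m diameter

gamma = lorentz_factor(beta)
tau_lab = gamma * tau_rest                 # Eq. (3.4)
a_lab = (beta * C_LIGHT)**2 / radius       # centripetal acceleration in the laboratory
a_proper = gamma**2 * a_lab                # acceleration in the muon rest frame (assumed meaning)

print(f"gamma = {gamma:.1f}")
print(f"dilated mean lifetime = {tau_lab * 1e6:.1f} microseconds")
print(f"proper acceleration = {a_proper / G_ACC:.1e} g")
```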
The information we obtain with this experiment is connected with the time transformation
given by the previous equation: the time interval between two events
that occur in the same place in a moving system (injection and decay
of the muon, in the example mentioned), if observed in the laboratory, is
dilated by the factor γ with respect to the corresponding time interval $t_o$ measured
by an observer in the moving reference system (at rest with the muons):
$$t = \gamma\, t_o \tag{3.5}$$
depends on the velocity just as predicted by Eq. (3.5). At present, in any high energy
experiment, where elementary particles collide at velocities close to c, the events are
analysed using kinematic laws predicted by special relativity.
Experimental evidence also shows that acceleration does not play any role in
modifying the rate of clocks. In fact, at equal speeds, the mean lifetimes of rectilinear
muon beams (with no acceleration) and those from the muons inside an accumulation
ring (which withstand a huge acceleration) are stretched in the same way.
Another milestone experiment on the slowing down of moving clocks was carried
out by Hafele and Keating (1972) using caesium atomic clocks. They were initially
accurately synchronized, then loaded on two commercial airliners to
perform a complete trip around the world. One plane travelled to the East while
the other flew to the West. After each flight, the clocks were compared with similar
clocks that remained on the ground. It was found that, in comparison with these
latter clocks, the clocks travelling to the East suffered a delay of 59 ± 10 ns, while
those travelling to the West were 273 ± 7 ns in advance. Such a result agrees with the theoretical
expectations. Note that the kinematic effect has opposite sign for the two directions.
This is easily explained using a reference system at rest with respect to the distant stars:
for clocks travelling to the East, the airplane speed and that of the Earth’s rotation add
up, while we must subtract them for the westbound clocks.
In this experiment, we also need to account for the effect of the Earth’s gravitational
field, which depends on the flight altitude and changes the rate of the travelling
clocks with respect to the ones that rested on the ground. In Chap. 4, we will revisit
this effect in detail.
$$m_i = m_g \tag{3.6}$$
The equivalence of mass and energy, as stated by special relativity, tells us that
there will be several contributions to the inertial mass of the body, which will depend
on the nature of the body itself: electromagnetic energy, energy due to weak and
strong interactions, …
We shall therefore investigate, for each of these contributions, whether an associated
violation of the EEP of a different type exists. We then write:
$$m_i = m_g + \sum_k \eta^k E^k / c^2 \tag{3.7}$$
where the index k identifies the type of contribution to the energy, $E^k$ is the k-th type
of internal energy of the body and $\eta^k$ is a dimensionless parameter that measures the
magnitude of the violation. Let us now consider two bodies that fall with different
accelerations:
$$\eta = \frac{2|a_1 - a_2|}{|a_1 + a_2|} = \sum_k \eta^k \left(\frac{E^k_{(1)}}{m_{(1)} c^2} - \frac{E^k_{(2)}}{m_{(2)} c^2}\right)$$
where η is known as the Eötvös coefficient, in honour of the Hungarian Baron Lorand
von Eötvös who used the torsion balance to test the equivalence principle (Fig. 3.1).
As mentioned in the previous chapter, to this day this instrument, redesigned in
a modern way, still allows us to obtain the highest accuracy when verifying this
principle.
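As a minimal numerical illustration, the sketch below evaluates the Eötvös coefficient for two free-fall accelerations, assuming the usual definition η = 2|a1 − a2|/|a1 + a2|. The accelerations used are hypothetical.

```python
def eotvos_coefficient(a1, a2):
    """eta = 2 |a1 - a2| / |a1 + a2| for the free-fall accelerations of two bodies."""
    return 2.0 * abs(a1 - a2) / abs(a1 + a2)

g = 9.81  # m/s^2
print(eotvos_coefficient(g, g))                   # identical accelerations: no violation
print(eotvos_coefficient(g, g * (1.0 + 1e-8)))    # hypothetical fractional difference of 1e-8
```

For small differences, η is simply the fractional difference between the two accelerations.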
A rod, with two masses of different composition attached to its extremes, is
suspended by a quartz wire. In principle, the verification of WEP can be accomplished
by assembling two masses of different nature at the ends of the transverse bar. The
angle at which the restoring torque exerted by the wire balances the external torque
is the observable that lets us estimate the effect under investigation (Fig. 3.2).
[Figure: forces acting on a test mass on the rotating Earth. Labels: gravitational force; centrifugal force; component of the centrifugal force along the zenith axis; component of the centrifugal force along the horizontal N-S axis]
The experiment is done in a laboratory rotating with the Earth, and hence the
force acting on each of the two masses is the vector sum of the centrifugal force due
to Earth’s rotation, directed away from the rotation axis, and the gravitational force
directed towards the Earth’s centre. The centrifugal force has a component parallel to
the tangent plane of the Earth’s surface (i.e. horizontal). We introduce a reference
frame with spherical coordinates, with the origin at the Earth’s centre and the polar
axis (passing through the Earth’s poles) coinciding with the rotation axis. The laboratory
is positioned at a polar angle (colatitude) θ (its geographic latitude is then
φ = π/2 − θ). In this system, each of the test masses of the torsion balance is subject
to
– the gravitational force that depends, obviously, on the gravitational mass m g and
is purely radial, directed along −ẑ, where ẑ denotes the local vertical;
– the centrifugal force that depends on m i has both a radial and a polar (horizontal)
component, oriented along the local N-S direction; both depend on the latitude
of the laboratory on the Earth’s surface:
$$F_z = -G\,\frac{m_g M_E}{R_E^2} + m_i \omega^2 R_E \sin^2\theta = m_g g - m_i a_z \tag{3.10}$$
$$F_{N-S} = m_i \omega^2 R_E \sin\theta \cos\theta = m_i a_x \tag{3.11}$$
These two components of the force, acting on each of the test masses, located
at distances r A and r B from the suspension fibre can induce rotation around two
different axes, the horizontal component around the vertical axis and vice versa (see
Fig. 3.3).
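The size of the two centrifugal components can be estimated with the sketch below, which evaluates ω²R sin²θ and ω²R sin θ cos θ from Eqs. (3.10) and (3.11). The Earth rotation rate and mean radius are standard rounded values; the choice of colatitudes is arbitrary.

```python
import math

OMEGA_EARTH = 7.2921e-5  # rad/s, sidereal rotation rate of the Earth
R_EARTH = 6.371e6        # m, mean radius of the Earth

def centrifugal_components(colatitude):
    """Magnitudes (m/s^2) of the vertical and N-S horizontal centrifugal accelerations
    of Eqs. (3.10) and (3.11) at polar angle `colatitude`, given in radians."""
    a0 = OMEGA_EARTH**2 * R_EARTH
    a_vertical = a0 * math.sin(colatitude)**2
    a_horizontal = a0 * math.sin(colatitude) * math.cos(colatitude)
    return a_vertical, a_horizontal

for deg in (30.0, 45.0, 60.0, 90.0):
    a_v, a_h = centrifugal_components(math.radians(deg))
    print(f"colatitude {deg:4.0f} deg: vertical = {a_v * 100:.2f} cm/s^2, "
          f"horizontal = {a_h * 100:.2f} cm/s^2")
```

The horizontal component is largest at a colatitude of 45°, where it amounts to about 1.7 cm s⁻², and it vanishes at the equator and at the poles.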
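The corresponding torques are, for the horizontal components,
$$M_z = m_{iA}\, a_x r_A - m_{iB}\, a_x r_B \tag{3.12}$$
and, for the vertical components,
$$M_x = (m_{gA}\, g - m_{iA}\, a_z)\, r_A - (m_{gB}\, g - m_{iB}\, a_z)\, r_B \tag{3.13}$$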
By an adequate choice of r B we can prevent the system from rotating around the
x-axis. In other words, we choose r B such that Mx = 0:
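$$r_B = r_A\,\frac{m_{gA}\, g - m_{iA}\, a_z}{m_{gB}\, g - m_{iB}\, a_z} \tag{3.14}$$
Substituting this value of $r_B$ into Eq. (3.12), we obtain
$$M_z = m_{iA}\, a_x r_A\, \frac{m_{gB}/m_{iB} - m_{gA}/m_{iA}}{m_{gB}/m_{iB} - a_z/g} \tag{3.15}$$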
When $M_z = 0$, the system does not deviate from the equilibrium condition, that
is,
$$\frac{m_{gB}}{m_{iB}} = \frac{m_{gA}}{m_{iA}}.$$
From formula (3.15), we can see that by rotating the pendulum bar by π,
if and only if WEP is violated, the non-vanishing torque changes sign and the appa-
ratus changes its equilibrium position.
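The argument can be followed numerically: we first choose r_B from Eq. (3.14) so that the torque about the horizontal axis vanishes, and then evaluate the torque about the vertical axis from Eq. (3.12). All numerical values, including the size of the hypothetical WEP violation, are assumptions for illustration.

```python
def balance_arm(m_gA, m_iA, m_gB, m_iB, r_A, g, a_z):
    """Eq. (3.14): arm r_B for which the torque M_x of Eq. (3.13) vanishes."""
    return r_A * (m_gA * g - m_iA * a_z) / (m_gB * g - m_iB * a_z)

def torque_z(m_iA, m_iB, r_A, r_B, a_x):
    """Eq. (3.12): torque about the vertical axis."""
    return m_iA * a_x * r_A - m_iB * a_x * r_B

# Assumed, illustrative values (SI units)
g, a_z, a_x = 9.81, 0.017, 0.017
r_A = 0.10
m_iA = m_iB = 0.10

def residual_torque(delta):
    """M_z when body B has m_g/m_i = 1 + delta while body A has m_g/m_i = 1."""
    m_gA, m_gB = m_iA, m_iB * (1.0 + delta)
    r_B = balance_arm(m_gA, m_iA, m_gB, m_iB, r_A, g, a_z)
    return torque_z(m_iA, m_iB, r_A, r_B, a_x)

print(residual_torque(0.0))    # WEP respected: no torque about the vertical axis
print(residual_torque(1e-6))   # hypothetical WEP violation: a residual torque appears
```

Rotating the bar by π reverses the sign of both lever arms and therefore of the residual torque, which is the signature exploited in the experiment.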
In 1909 Baron Lorand von Eötvös, after devising an apparatus capable of detecting
rotation angles of $10^{-11}$ rad (see Fig. 3.4), obtained an upper limit on the value of
the η parameter, measuring WEP violations, of order $10^{-8}$.
Similar experiments were performed by Roll et al. (1964) in Princeton (USA)
and by Braginsky and Panov (1972) in Moscow (Russia). In both cases, they made
use of the acceleration produced by the Sun’s gravity, rather than the Earth’s, which
modulates the torsion signal with a period of 24 h. Although the horizontal
component of the solar acceleration is weaker than the Earth’s, 0.59 cm s$^{-2}$
versus 1.67 cm s$^{-2}$, the advantage of not having to rotate the apparatus manually,
combined with other experimental advantages offered by the technology of the
1960s, allowed an improvement of three orders of magnitude with respect to Eötvös’
result.
Fig. 3.4 The right panel shows a picture of one of the experimental devices used by Eötvös. The
left panel shows a diagram of the working mechanism. Images from The Eötvös Loránd Virtual Museum
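The two accelerations quoted above can be estimated from standard constants. This is a rough sketch, not the authors' calculation: the solar term is taken as $GM_\odot/d^2$ at one astronomical unit, and the Earth term as the horizontal centrifugal component of Eq. (3.11) evaluated at a colatitude of 45°, which is an assumption (the value in the text presumably refers to the latitude of the laboratory).

```python
import math

GM_SUN = 1.32712e20       # m^3 s^-2, solar gravitational parameter
AU = 1.495979e11          # m, mean Sun-Earth distance
OMEGA_EARTH = 7.2921e-5   # rad s^-1, sidereal rotation rate of the Earth
R_EARTH = 6.371e6         # m, mean Earth radius


def solar_acceleration():
    """Acceleration towards the Sun at the Earth's distance."""
    return GM_SUN / AU**2


def centrifugal_horizontal(colatitude_deg):
    """Horizontal (N-S) centrifugal acceleration, omega^2 R sin(theta) cos(theta)."""
    theta = math.radians(colatitude_deg)
    return OMEGA_EARTH**2 * R_EARTH * math.sin(theta) * math.cos(theta)


a_sun = solar_acceleration()
a_ns = centrifugal_horizontal(45.0)   # assumed colatitude
print(f"{a_sun * 100:.2f} cm/s^2 (Sun), {a_ns * 100:.2f} cm/s^2 (Earth, horizontal)")
```

Under these assumptions the estimates come out at about 0.59 and 1.69 cm s$^{-2}$, close to the values quoted in the text.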
For simplicity’s sake, let us consider the system as simply composed of two masses
A and B of different materials. The resulting torque due to the solar acceleration $g$
is
$$M_S = (m_{gA}\, g - m_{iA}\, a)\, r_A - (m_{gB}\, g - m_{iB}\, a)\,\frac{m_{iA}\, r_A}{m_{iB}}$$
or, after reducing terms:
$$M_S = m_{iA}\, r_A \left(\frac{m_{gA}}{m_{iA}} - \frac{m_{gB}}{m_{iB}}\right) g$$
which vanishes when the ratio between inertial and gravitational mass is independent
of the nature of the bodies.
Note that the direction of the solar acceleration changes with time; therefore, if WEP
is violated, a modulation with a period of 24 h is expected in the response of the system.
In general, the modulation technique significantly improves the signal-to-noise ratio
and avoids having to rotate the whole system by π rad.
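Why a known modulation period helps can be illustrated with a toy synchronous demodulation. This sketch is not the analysis used in the experiments described here: it simply projects a simulated record onto a cosine and a sine at the expected 24 h period. The signal amplitude, the noise level, the sampling and the function name `demodulate` are all assumptions made for the example.

```python
import math
import random


def demodulate(samples, times, period):
    """Amplitude of the component at the given period, from projections on
    cosine and sine (accurate when the record spans whole periods)."""
    w = 2 * math.pi / period
    n = len(samples)
    cos_part = 2 / n * sum(s * math.cos(w * t) for s, t in zip(samples, times))
    sin_part = 2 / n * sum(s * math.sin(w * t) for s, t in zip(samples, times))
    return math.hypot(cos_part, sin_part)


random.seed(0)                 # reproducible toy data
PERIOD = 24.0                  # hours, modulation period
AMPLITUDE = 1.0                # arbitrary units, assumed signal amplitude
NOISE_RMS = 5.0                # assumed white noise, five times the signal

times = [0.01 * k for k in range(24000)]      # ten days of samples
samples = [
    AMPLITUDE * math.cos(2 * math.pi * t / PERIOD) + random.gauss(0.0, NOISE_RMS)
    for t in times
]

estimate = demodulate(samples, times, PERIOD)
print(f"recovered amplitude: {estimate:.2f}")
```

Even though each sample is dominated by noise, averaging coherently at the known period recovers the amplitude; the uncertainty shrinks roughly as the inverse square root of the number of samples.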
The experimental device of Roll, Krotkov and Dicke is depicted in Fig. 3.5. It is
composed of three masses: two made of aluminium and one of gold. The materials were
chosen to differ as much as possible in composition; in this case, the authors
considered the ratio of the number of neutrons to protons, $N_n/N_p$,
the ratio of the energy of the K-shell electron to its rest mass, $E_e(K)/m_e$, and the
ratio of the nuclear electrostatic energy to the atomic mass, $E_{Nu}(\mathrm{electr.})/M_A$. Table 3.1
reports the corresponding values to illustrate the microscopic differences between
the chosen materials.
As shown in Fig. 3.5, the gold mass is placed between the plates of a capacitor.
This allows the device to operate in a closed feedback loop. The light
beam is reflected by the mirror and detected by the photo-multiplier. A slit is placed
between the mirror and the photo-multiplier in such a way that, when the light beam
moves, the recorded light intensity also changes. However, the dynamic range of
the apparatus is limited by the transverse size of the beam and the width of the slit. To
increase the dynamic range, the photo-multiplier signal is used to generate an error
signal that drives the voltage across the capacitor where the gold mass is inserted. In
this configuration, one can thus exert a force that maintains the mass at rest.² The
detection is therefore derived from measurements of the amplitude and phase of
the error signal itself. Fig. 3.6 reports the detection diagram published
in the original paper by Roll, Krotkov and Dicke.
Fig. 3.5 The torsion pendulum of Dicke’s experiment. Adapted from Roll et al. (1964)
Table 3.1 Differences in material parameters for the test masses (Roll-Krotkov-Dicke experiment)
With this apparatus, Roll, Krotkov and Dicke were able to obtain a limit
$\eta \le 0.96 \cdot 10^{-11}$ (Roll et al. 1964).
2 This condition only holds for the Fourier components that lie within the characteristic frequency
range of the control system.
Fig. 3.6 Scheme of the implementation and detection system of the Roll, Krotkov and Dicke
experiment. Credits: (Roll et al. 1964)
At the Lomonosov University in Moscow, Braginsky and Panov (1972) developed
a similar device but with a longer wire and a more complex configuration of
suspended masses. The purpose was to increase the sensitivity (longer wire) and to
reduce the spurious gravitational couplings with the surrounding environment by
making the system more symmetric, which means reducing the quadrupolar,
sextupolar and octopolar mass moments of the system. In practice, compensation masses
must be added appropriately. In their published work, Braginsky and Panov did
not provide many details about the experimental configuration and the reduction of
systematic errors. This, in addition to the political climate of the iron curtain era,
generated considerable suspicion in Western scientific circles about the validity of
the experiment. In any case, they concluded their paper with the
claim of having obtained a limit on η of order $1 \cdot 10^{-12}$.
In a series of experiments since 1999, always using torsion pendulums, the
“Eöt-Wash” gravity group of E. Adelberger at the University of Washington
(Seattle, USA) obtained a value of $\eta \sim 1.4 \cdot 10^{-13}$ (Baeßler et al. 1999, Adelberger
2001). They used the Earth’s gravitational attraction, like Eötvös, and modulation of
the signal, like Dicke: this was achieved by operating the torsion pendulum on a
platform rotating at a frequency of order mHz, much higher than the 1-day modulation
exploited by Dicke and therefore much less affected by spurious effects with 1-day
periodicity. By combining the turntable frequency with those of the Earth’s rotation
and revolution, they have also set upper limits on η for the attraction towards the Sun
and towards the Galactic Centre: the latter
is of interest because nothing is known about WEP and Dark Matter (Wagner et al. 2012).
Among the many ingenious ideas they implemented in the apparatus, we mention a
careful arrangement of the compensation masses to balance the local gravity gradients.
This type of apparatus is limited by the thermal noise of the suspension wire;
that is why there have been efforts to develop a torsion pendulum operating at low
temperatures.
3.3 Verification of EEP at the Atomic Level
In November 2004, Sebastian Fray and collaborators from the Max Planck Institute
for Quantum Optics at Garching and the Tübingen and Munich Universities
compared the acceleration of two isotopes of rubidium in the Earth’s gravitational
field (Fray et al. 2004), using an atom interferometer. In agreement with the
equivalence principle, the atoms are accelerated in the same way. This type of experiment
is part of the effort to verify to what extent the theory of gravity is compatible with
the laws of physics in the microscopic world. According to some theories, when
gravitational experiments are done on quantum objects like atoms, it might be
possible to observe a violation of the equivalence principle, thus unveiling a new type
of physics.
The German group based its experiment on the use of an atom interferometer, a
technique already developed to measure the Earth’s gravity “g” with a precision of
$10^{-9}$.
In principle, an atom interferometer is similar to its optical counterpart, but the
interfering waves are matter waves; in other words, it uses beams made of atoms instead of
light. The role of the beam splitter is played by standing electromagnetic waves,
which are used both to divide and to recombine the atom beams.
The experiment works by capturing in a magneto-optical trap³ about $2 \cdot 10^9$ atoms
of the isotopes $^{85}$Rb or $^{87}$Rb. Using laser beams, the atoms are accelerated upwards.
When the beams are turned off, the atoms fall under the influence of gravity.
The interferometer made it possible to measure the acceleration of both types of atoms,
$g_{85}$ and $g_{87}$, with a result in agreement with the equivalence principle:
$$\frac{g_{85} - g_{87}}{g_{85}} = (1.2 \pm 1.7) \cdot 10^{-7}$$
Fray and his collaborators also compared the accelerations of $^{85}$Rb
atoms in two different quantum states, finding that they coincide within the
experimental error.
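The statement that the rubidium result agrees with the equivalence principle amounts to checking that the measured value lies within its quoted uncertainty of zero. A minimal sketch of that check follows; the helper names are hypothetical and the one-sigma criterion is an assumption about what "agreement" means here.

```python
def sigma_deviation(value, uncertainty, expected=0.0):
    """Distance of a measurement from the expected value, in units of sigma."""
    return abs(value - expected) / uncertainty


def is_compatible(value, uncertainty, expected=0.0, n_sigma=1.0):
    """True if the measurement lies within n_sigma of the expected value."""
    return sigma_deviation(value, uncertainty, expected) <= n_sigma


# Values quoted in the text for (g85 - g87) / g85.
rb_deviation = sigma_deviation(1.2e-7, 1.7e-7)
print(f"deviation from zero: {rb_deviation:.2f} sigma")
```

The quoted result sits about 0.7 standard deviations from zero, so no violation is indicated at this level of precision.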
A similar experiment, carried out in 2014 by the group of G. Tino in Florence
(Italy) (Tarallo et al. 2014), tested the UFF for two different isotopes of strontium:
the bosonic isotope $^{88}$Sr, which has no spin, versus the fermionic isotope $^{87}$Sr, which has
a half-integer spin. This search is of interest because it can probe recently
proposed theories predicting some spin-gravity coupling. The limit set on the η parameter
for this particular form of coupling is $\eta < 2 \cdot 10^{-7}$.
3 A magneto-optical trap (MOT) is a sophisticated apparatus where atoms are slowed down (to energies
corresponding to μK) by elastic collisions with photons of laser beams; these “cold” atoms are then
trapped by a magnetic field, through the Zeeman effect, in a small region of space: localized clouds
of up to $10^8$ atoms can be generated and maintained indefinitely. The techniques of laser cooling
were awarded the Nobel Prize in Physics in 1997.
3.4 Experiments in Microgravity Conditions
the test masses down to 1.8 K, in order to detect their residual displacements between
magnetic coils, via the use of SQUIDs.
The project, called STEP (Satellite Test of the Equivalence Principle) (Overduin
et al. 2012), has been pursued for many years under the direction and supervision of
both NASA and ESA: the goal was to measure the equivalence between inertial and
gravitational mass with an accuracy between $1 \cdot 10^{-17}$ and $1 \cdot 10^{-18}$.
To this purpose, the accelerations of four pairs of test masses in orbit would
be compared. The test masses in free fall would be placed in an ultra-high vacuum
environment, isolated from external perturbations, within a cryostat equipped
with superconducting magnetic shields. The differential accelerations of the test masses
would be measured by a superconducting circuit coupled to a SQUID with a sensitivity of
$10^{-22}$ Wb in a 1 Hz band. The material for the test masses was to be chosen among
niobium, platinum-iridium and beryllium. This limited choice is dictated by
their physical properties, such as the proton-to-neutron ratio and the nuclear
binding energy, which need to be quite different in order to maximize the effect of
a violation of the equivalence principle. Besides, the materials must be machinable
with high mechanical accuracy. The platinum-iridium test masses are placed at the
centre, while the beryllium ones remain on the outer part, thus avoiding
a position difference too large to be compensated by the actuation systems of the
apparatus.
The test masses of different materials, along with the position transducers, would
be placed inside the satellite (see Fig. 3.7), whose attitude is controlled by
micro-thrusters. These serve the purpose of cancelling the accelerations caused by the
residual atmosphere, by radiation pressure and by the solar wind, which could influence
the motion of the masses. Such a satellite set-up is called “drag free”.
This technique reduces the low-frequency noise in the acceleration produced by
non-gravitational interactions. The perturbations associated with gravity gradients
are eliminated by precisely positioning the centres of mass of
both test masses at the same point. Figure 3.8 shows a pair of masses in their operating configuration.
The orbit was designed to be approximately circular and Sun-synchronous,
in order to minimize temperature fluctuations, with an optimal height close
to 550 km. The expected duration of the mission is 6 months. However, despite its
potential and its elegant technology, the joint ESA-NASA mission is, to this day, far
from being approved.
The MICROSCOPE experiment (Touboul et al. 2012) is a mainly French
(CNES-DLR-ESA) mission with a less ambitious goal, $\eta \sim 1 \cdot 10^{-15}$, but it achieved its final
scientific target. It was launched on 26 April 2016, operated for over 2 years in an orbit
at a height of 712 km, and was deactivated on 18 October 2018, after completing all
its scientific goals (Fig. 3.9).
The satellite payload is composed of two almost identical differential
micro-accelerometers made of concentric cylindrical test masses. The first instrument has
masses of the same material (Pt) and is dedicated to evaluating the experimental
accuracy of the measurement (verification instrument). The second
accelerometer, on the other hand, has masses of different materials (Pt-Ti), and is thus suited to test
the equivalence principle, looking for different behaviour in the field of the Earth
(Fig. 3.10).
Fig. 3.7 Diagram of the working principle of the STEP satellite. Credits: Overduin et al. (2012)
The attitude of the satellite and its thermal resistance are actively controlled,
so that the satellite follows, using FEEP (field emission electric propulsion) thrusters,
the two test masses in their drag-free gravitational motion. Testing this
micro-propulsion system is the other technological target of the mission. One
important technological achievement of MICROSCOPE is the proof that the differential
accelerometers, built by ONERA, can detect the effects of gravity gradients due to
the difference between the attraction exerted by the Earth on the internal and on the
external masses.
Fig. 3.9 The principle of operation of MICROSCOPE, and similar EP satellite experiments: if WEP
is violated, the two cylindrical test masses experience different accelerations (red arrows) as the
satellite orbits the Earth. The differential acceleration is measured by the feedback voltages applied
to the test masses to keep them in equilibrium. Credits: Touboul and Rodrigues (2001)
Fig. 3.10 On the left, a sketch of the differential accelerometer of the MICROSCOPE experiment,
with the two test masses surrounded by electrodes to measure their relative position, both in radial
and axial directions. On the right, a prototype of the accelerometer core with an inner gold-coated
silica cylinder carrying the radial electrodes. Credits: Touboul and Rodrigues (2001)
While the analysis of data is still underway, the French team has published (Touboul
et al. 2017) a preliminary result confirming the validity of the equivalence principle: the
two test masses (Ti and Pt) fall, in the Earth’s field, with the same acceleration within the
1σ statistical confidence.
Last, we mention the Italian proposal for the satellite Galileo Galilei (GG) (Nobili
et al. 2009), still far from being approved. GG is a proposal for a small satellite in
low orbit dedicated to the verification of the equivalence principle. The preliminary
study phase of GG, directed by A. Nobili from Pisa University, aims to investigate
how it would be possible to test the equivalence principle to $\eta \sim 1 \cdot 10^{-17}$ using
a device operating at room temperature. The main novelty of GG compared with
STEP and MICROSCOPE is to modulate the possible differential signals of WEP
violations at a relatively high frequency (∼ 2 Hz), by forcing the test masses
(concentric cylinders, just as in STEP and MICROSCOPE) to rotate. The modulation
frequency is increased, with respect to other experiments, by a factor of more than
$10^4$, thus reducing the low-frequency noise, due to many sources associated with the
electrical and mechanical systems, which has a typical 1/f dependence.
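The factor of more than $10^4$ can be made plausible with a back-of-the-envelope estimate. The sketch below assumes that, without rotation, a WEP-violating signal would appear at the orbital frequency, and it assumes an altitude of 600 km (the text only says "low orbit"); both are illustrative assumptions, as are the function names.

```python
import math

GM_EARTH = 3.986004e14   # m^3 s^-2, Earth's gravitational parameter
R_EARTH = 6.371e6        # m, mean Earth radius


def orbital_frequency(altitude_m):
    """Frequency (Hz) of a circular orbit at the given altitude."""
    a = R_EARTH + altitude_m
    return math.sqrt(GM_EARTH / a**3) / (2 * math.pi)


def one_over_f_gain(f_low, f_high):
    """Ratio S(f_low) / S(f_high) for a noise power spectrum S proportional to 1/f."""
    return f_high / f_low


ALTITUDE = 600e3          # m, assumed altitude
SPIN_FREQUENCY = 2.0      # Hz, modulation frequency quoted in the text

f_orb = orbital_frequency(ALTITUDE)
gain = one_over_f_gain(f_orb, SPIN_FREQUENCY)
print(f"orbital frequency {f_orb:.2e} Hz, frequency ratio {gain:.0f}")
```

Under these assumptions the orbital frequency is about $1.7 \cdot 10^{-4}$ Hz, so moving the signal to 2 Hz raises its frequency by slightly more than $10^4$, and a 1/f noise spectrum is lower by the same factor.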
To conclude this chapter on an amusing note, we mention a qualitative experiment
(i.e. with accuracy not assessable) performed in 1971 by NASA astronaut D. Scott
of the Apollo 15 mission: following Galileo’s suggestion, he dropped a hammer
and a feather on the Moon’s surface, observing how, in the absence of atmosphere, the
two bodies fell with the same acceleration. The video clip is available on YouTube
(“Hammer vs Feather”).
References
Bailey, J., et al.: Measurements of relativistic time dilatation for positive and negative muons in a
circular orbit. Nature 268, 301–305 (1977)
Rossi, B., Hall, D.: Variation of the rate of decay of Mesotrons with momentum. Phys. Rev. 59, 223
(1941)
Hafele, J.C., Keating, R.E.: Around-the-world atomic clocks. Science 177, 166 (1972)
Roll, P.G., Krotkov, R., Dicke, R.H.: The equivalence of inertial and passive gravitational mass.
Ann. Phys. 26, 442–517 (1964)
Braginsky, V.B., Panov, V.I.: Verification of equivalence of inertial and gravitational masses. Sov.
Phys. JETP 34, 463–466 (1972)
Baeßler, S., et al.: Improved test of the equivalence principle for gravitational self-energy. Phys.
Rev. Lett. 83, 3585 (1999)
Adelberger, E.: New tests of Einstein’s equivalence principle and Newton’s inverse-square law.
Class. Quantum Grav. 18, 2397 (2001)
Wagner, T.A., et al.: Torsion-balance tests of the weak equivalence principle. Class. Quantum Grav.
29, 184002 (2012)
Fray, S., Alvarez Diez, C., Hänsch, T.W., Weitz, M.: Atomic interferometer with amplitude gratings
of light and its applications to atom based tests of the equivalence principle. Phys. Rev. Lett. 93,
240404 (2004)
Tarallo, M.G., et al.: Test of Einstein equivalence principle for 0-spin and half-integer-spin atoms:
search for spin-gravity coupling effects. Phys. Rev. Lett. 113, 023005 (2014)
Overduin, J., Everitt, F., Worden, P., Mester, J.: STEP and fundamental physics. Class. Quantum
Grav. 29, 184012 (2012)
Touboul, P., Metris, G., Lebat, V., Robert, V.: The MICROSCOPE experiment, ready for the in-orbit
test of the equivalence principle. Class. Quantum Grav. 29, 184010 (2012)
Touboul, P., Rodrigues, M.: The MICROSCOPE space mission. Class. Quantum Grav. 18, 2487
(2001)
Touboul, P., et al.: The MICROSCOPE mission: first results of a space test of the equivalence
principle. Phys. Rev. Lett. 119, 231101 (2017)
Nobili, A., et al.: “Galileo Galilei” (GG) a small satellite to test the equivalence principle of Galileo,
Newton and Einstein. Exp. Astron. 23, 689–710 (2009)
Further Reading
Nordtvedt, K.: Equivalence principle for massive bodies I phenomenology and II theory. Phys. Rev.
169, 1014 (1968)
4 Principles of Metric Theories
4.1 Introduction
In the previous chapter, we discussed the equivalence principle (EP) and underlined
how it constitutes the foundation of Newton’s as well as Einstein’s theory. We recall
here its weak formulation (WEP), following Will (2014):
the trajectory of a freely falling test body in a given point of space-time (event) is
independent of its internal structure and composition.
and we remark that this postulate is not sufficient to support Einstein’s theory. Indeed,
this is based on a more extended postulate that takes the name of Einstein Equivalence
Principle (EEP):
(EEP), except for substituting, in the statement defining LLI and LPI, the words
non-gravitational with both gravitational and non-gravitational. In other words, we
extend the EP to all phenomena that include self-gravitating objects, or bodies that
exert a gravitational interaction with themselves, like stars, planets, black holes... or
torsion balances.
EEP implies that gravity must be described by a metric theory, i.e. a theory where:
In the previous section, we have outlined the reasons why LLI and LPI are the foun-
dations of all metric theories of gravitation. We must now discuss the experimental
basis that those principles rest on.
We have recalled, in Chap. 3, that particle physics has provided countless
experimental verifications of the Lorentz transformations for time intervals, proving that they are
determined by the relative velocity and not by the acceleration of bodies. Besides, LLI has been
experimentally verified in many different fields of physics. First of all, we recall the
experiment of Brillet and Hall (1979): it is an improved version of the glorious
experiment of Michelson and Morley that proved the isotropy of the speed of light,
with a sensitivity 4000 times better than several previous Michelson-like experiments.
Brillet and Hall measured the beat frequency between two single-mode lasers, one
positioned on a rotating platform and the other on the ground. They set a limit on the
anisotropy of space: $3 \cdot 10^{-15}$. Actually, we should note that, in order to measure the
frequency of the rotating laser, Brillet and Hall used a Fabry-Perot interferometer
or, more exactly, an etalon: thus, light travels back and forth. The limit that can be
set with this experiment on the round-trip anisotropy is $(v/c)^2 < 3 \cdot 10^{-15}$, to be
compared with the value $\sim 10^{-8}$ of Michelson-like experiments, using the velocity
v of the Earth.
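The value $\sim 10^{-8}$ can be reproduced if one reads it as $(v/c)^2$ evaluated with the Earth's orbital speed; that reading, and the speed of 29.8 km/s used below, are assumptions made for this short illustration.

```python
C = 299_792_458.0           # m/s, speed of light
V_EARTH_ORBIT = 29.8e3      # m/s, assumed mean orbital speed of the Earth


def beta_squared(speed):
    """Second-order velocity term (v/c)^2."""
    return (speed / C) ** 2


earth_term = beta_squared(V_EARTH_ORBIT)
improvement = earth_term / 3e-15     # compared with the Brillet-Hall limit
print(f"(v/c)^2 = {earth_term:.2e}, ratio to 3e-15: {improvement:.1e}")
```

With this reading, the Brillet-Hall limit lies more than six orders of magnitude below the second-order term associated with the Earth's motion.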
Hughes et al. (1960) and Drever (1961) independently set an upper limit of about $10^{-20}$,
with two very sophisticated experiments exploiting the magnetic resonance of the
$^7$Li nucleus in its ground state, characterized by a nuclear spin I = 3/2.
galactic center; $P_2$ is the Legendre polynomial of order 2. The same prediction also
holds in the nuclear case, if the nucleus is modeled as a single particle in a quadratic
potential with spherical symmetry. In an external magnetic field, the ground
state of the particle is split into four equispaced levels: there is then only
one observable nuclear transition line. Moreover, the direction of acceleration of
the particle depends on the direction of the external magnetic field. Therefore, if an
anisotropy exists, such degeneracy should be removed and, with enough resolution,
a triplet should be observed, with a separation changing as a function of sidereal
time $T_s$, as the direction toward the galactic center is $\theta = \theta(T_s)$.
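The passage from a single line to a triplet can be illustrated with a toy level scheme for spin I = 3/2. The sketch assumes that the anisotropy adds to the Zeeman energy a quadrupole-like shift proportional to $3m^2 - I(I+1)$; this functional form and the numerical coefficients are assumptions made for illustration, not the expression used in the text. Exact fractions are used so that equal spacings compare as equal.

```python
from fractions import Fraction

SPIN = Fraction(3, 2)


def sublevels(spin):
    """Magnetic quantum numbers m = -I, ..., +I."""
    count = int(2 * spin) + 1
    return [-spin + k for k in range(count)]


def level_energy(m, zeeman, anisotropy, spin=SPIN):
    """Zeeman term plus an assumed shift proportional to 3 m^2 - I(I+1)."""
    return -zeeman * m + anisotropy * (3 * m * m - spin * (spin + 1))


def transition_lines(zeeman, anisotropy, spin=SPIN):
    """Distinct energies of the transitions between adjacent sublevels."""
    energies = [level_energy(m, zeeman, anisotropy, spin) for m in sublevels(spin)]
    return sorted({abs(upper - lower) for lower, upper in zip(energies, energies[1:])})


isotropic = transition_lines(Fraction(1), Fraction(0))
anisotropic = transition_lines(Fraction(1), Fraction(1, 1000))
print(len(isotropic), len(anisotropic))
```

With no anisotropy the three transitions coincide in a single line; with a small assumed anisotropy they separate into three lines, the outer two displaced symmetrically about the central one.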
The limit $10^{-20}$ was set by Hughes and Drever by observing the broadening of
the spectral line of the hyperfine transition versus the direction of the magnetic field
with respect to the galactic reference frame centered on the Earth.
Note, however, that these experiments cannot provide indications on the nature
of the metric tensor $g_{\mu\nu}$ of space-time and cannot ensure the existence of a global
Lorentz reference frame where the metric tensor is $g_{\mu\nu} = \eta_{\mu\nu}$.
4.3 The Experiment of Pound–Rebka
An experimental proof of EP can be performed with a torsion balance with two end
masses of different materials. It implies that bodies follow lines in the event space
that can be identified as geodesics of the metric tensor $g_{\mu\nu}$. These lines can indeed be
defined as geodesics if the motion of the bodies is not accelerated with respect to the
origin of a local Lorentz frame. We know from special relativity that the choice of a
Lorentz frame requires, beyond the three spatial coordinates, also a reference clock:
therefore, this measurement of relative motion inevitably requires knowledge of whether
and how the flow of time can change from one point to another, when a gravitational
field is present. The experiment of Pound and Rebka (PR) (Pound and Rebka 1960) shed light
on this issue: the idea is to compare the behaviour of two “reference clocks” in two
different positions in a uniform gravitational field (g on the Earth’s surface). They used
a tower on the Harvard campus, with two measuring stations with height difference
h = 22.6 m, and exploited the Mössbauer effect. In this case, the (atomic) clocks are
provided by the energy levels of the excited nuclear spin state I = 5/2 in $^{57}$Fe nuclei.
The nuclei are prepared in this state starting from the source isotope $^{57}$Co that, by
electron capture, transforms into $^{57}$Fe. This nuclide has two possible decays (see
the lower panel in Fig. 4.1): the main one provides γ radiation with energy E = 14.4
keV.
The Lorentzian shape of this emission line, although extremely narrow ($\delta\nu_{Lor}/\nu =
1.13 \cdot 10^{-12}$), is still 200 times larger than the frequency shift expected from the
gravitational red-shift, as shown below in this section: $\delta\nu_{GR}/\nu = gh/c^2 = 2.4 \cdot 10^{-15}$.
For this reason, in order to compare the emission lines of isotopes in identical
conditions, we need to change the energy of the γ rays emitted by the source by inducing
Doppler shifts.
So, the γ source is moved with respect to the detector until absorption
takes place, which happens when the emitted energy is exactly the required one.
For this reason, the radioactive source is put into oscillation, on a loudspeaker core,
with an oscillating speed v of a few mm/s. In this way, one can compare the frequency
of γ rays emitted downward with that of γ rays emitted upward, sweeping across the linewidth
with a frequency span of the order of the ratio v/c, i.e. $\sim 3 \cdot 10^{-11}$.
The PR experiment was carried out at the Jefferson physics laboratory of Harvard
University, with two pairs of Mössbauer emitters and detectors, separated by a vertical
distance h = 74 feet ≃ 22.56 m. They measured a relative frequency difference,
between upward and downward photons, given by
$$\frac{\Delta\nu}{\nu} = 4.92 \cdot 10^{-15} \qquad (4.2)$$
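The orders of magnitude involved can be reproduced in a few lines. The sketch below assumes standard gravity for g and uses the height and relative linewidth quoted in the text; it evaluates the one-way shift $gh/c^2$, the up-minus-down difference, the source speed whose first-order Doppler shift equals the one-way gravitational shift, and the ratio between the linewidth and the two-way shift.

```python
C = 299_792_458.0       # m/s, speed of light
G_ACC = 9.80665         # m/s^2, standard gravity (assumed value for g)
HEIGHT = 22.56          # m, height difference quoted in the text
LINEWIDTH = 1.13e-12    # relative Lorentzian linewidth quoted in the text


def redshift(height, g=G_ACC):
    """Relative frequency shift g h / c^2 over a height difference."""
    return g * height / C**2


one_way = redshift(HEIGHT)
two_way = 2 * one_way                 # upward minus downward photons
doppler_speed = one_way * C           # speed giving the same first-order Doppler shift
width_ratio = LINEWIDTH / two_way

print(f"one-way {one_way:.3e}, two-way {two_way:.3e}")
print(f"equivalent source speed {doppler_speed:.2e} m/s, linewidth/shift {width_ratio:.0f}")
```

The equivalent speed is below one micrometre per second, far smaller than the mm/s sweep of the source, which shows why the shift has to be extracted as a tiny asymmetry of the absorption line.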
To understand this result, we need quantum theory, WEP and energy conservation.
The particle emitting the photon at the top of the tower (above) loses a fraction of mass
$\Delta m_a c^2 = -2\pi\hbar\nu_a$, where $2\pi\hbar$ is Planck’s constant.¹ When the photon is absorbed
at the bottom (below), a freely falling observer will see his apparatus increase its
inertia by $\Delta m_b c^2 = 2\pi\hbar\nu_b$. However, conserving the total energy of the above +
below system, also including the potential energy of the gravitational field $m\Phi$, we get:
$$\Delta m_a c^2 + \Phi_a \Delta m_a + \Delta m_b c^2 + \Phi_b \Delta m_b = 0 \qquad (4.3)$$
and finally:
$$\frac{\nu_a}{\nu_b} = \frac{1 + \Phi_b/c^2}{1 + \Phi_a/c^2} \simeq 1 + \frac{\Phi_b - \Phi_a}{c^2} = 1 + \frac{gh}{c^2} \qquad (4.5)$$
1 Although $2\pi\hbar$ might seem awkward, we try to stay away from h, which in this book has very different
meanings.
The photon, in the gravitational field, undergoes a frequency (and energy) shift
of opposite sign when it goes up (red shifted) with respect to the photon going
down (blue shifted). We then expect the two receivers, above and below, to measure γ
radiation with a frequency difference due to the Doppler shift given by $2\,(gh/c^2) \simeq
4.95 \cdot 10^{-15}$.
The agreement between theory and the Pound–Rebka experiment is
$$\frac{\Delta\nu_{exp}}{\Delta\nu_{theo}} = 1.05 \pm 0.10 \qquad (4.6)$$
This experiment was repeated in 1965 by Pound and Snider (1965), achieving an
accuracy of 1%. The formally rigorous interpretation of this experiment in the light of
GR requires advanced notions, such as hyperbolic motion (see Misner et al. (1973)
Sects. 6.2–6.6). One must describe the motion of an object subject to a constant
acceleration with respect to either an inertial reference system co-moving with the
object, or to inertial systems that must change at any instant. We try to provide here
a simpler explanation:
In the PR experiment the two clocks at the top and bottom of the tower are at rest
in the laboratory system $(t_s, x_s)$, and their proper time flows according to:
$$c\,d\tau = \sqrt{g_{\mu\nu}\,dx^\mu dx^\nu} \qquad (4.7)$$
As the clock is stationary in the lab frame ($dx^i = 0$), we have
$$dt = \frac{d\tau}{\sqrt{g_{00}}} \qquad (4.8)$$
The coordinate time interval Δt between two pulses is the same at the points of
emission and reception: indeed the two e.m. pulses travel along world lines that, although
not at 45° (space-time is not flat), are identical and just displaced by Δt, because the
geometry does not depend on time. Thus, for the two clocks in positions where the
value of $g_{00}$ is different, the proper time will flow in two different ways:
$$\Delta t = \frac{\Delta\tau_a}{\sqrt{g_{00}(x_a)}} = \frac{\Delta\tau_b}{\sqrt{g_{00}(x_b)}}$$
from this we deduce
$$\frac{\nu_a}{\nu_b} = \frac{\Delta\tau_b}{\Delta\tau_a} = \sqrt{\frac{g_{00}(x_b)}{g_{00}(x_a)}}$$
We shall see in the next chapter that, in the case of a weak and uniform gravitational
field, the approximation
$$g_{00}(x) \simeq 1 - 2\Phi(x)/c^2$$
holds, and we can conclude that in a uniform gravitational field, as the two clocks
are positioned at a height difference $x_a - x_b = h$, we have
$$\frac{\nu_a - \nu_b}{\nu_b} = \frac{gh}{c^2} \qquad (4.9)$$
$$\frac{\nu_a - \nu_b}{\nu_b} = \frac{gh}{c^2}\,(1 + \alpha_{rs}) \qquad (4.10)$$
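If Eq. (4.10) is read as a parametrization of a possible deviation from the predicted red-shift, the ratio between measured and predicted shift directly bounds the parameter $\alpha_{rs}$. The sketch below applies this reading to the Pound–Rebka ratio of Eq. (4.6); treating that ratio as $1 + \alpha_{rs}$ is an assumption of the example, and the function name is hypothetical.

```python
def lpi_parameter(ratio, ratio_error):
    """Deviation parameter and its error from the measured/predicted ratio,
    assuming ratio = 1 + alpha as in Eq. (4.10)."""
    return ratio - 1.0, ratio_error


alpha, alpha_err = lpi_parameter(1.05, 0.10)
print(f"alpha = {alpha:.2f} +/- {alpha_err:.2f}")
```

With the numbers of Eq. (4.6), the parameter is compatible with zero within its uncertainty of about 10%.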
the equations describing the free fall of a hydrogen atom should also produce equa-
tions describing the energy levels of Hydrogen in a gravitational field, thus setting
the typical periodicity of a Hydrogen maser clock. It follows that, in a WEP violating
theory, an EEP violation could also appear as a violation of LPI: WEP is, therefore,
sufficient to imply EEP.
Around 1960, L. Schiff hypothesized that this type of link could be a necessary
feature of any self-consistent theory of gravity. More precisely, the Schiff
conjecture states that any self-consistent complete theory of gravity incorporating WEP
necessarily embodies EEP.
In other words, the validity of WEP guarantees the validity of both Local Lorentz and
Local Position Invariance, and therefore of EEP. Moreover, if Schiff’s conjecture is
correct, the experiments of Eötvös could be interpreted as an experimental
verification of EEP, and therefore be the foundation of the hypothesis that interprets gravity
as a curvature phenomenon of space-time.
A rigorous proof of this conjecture seems impossible; it is supported only by
a number of robust and plausible arguments.
An elegant example of these arguments, due to Dicke, Haugan and Nordtvedt,
is based on energy considerations. It is a simple conceptual experiment (gedanken
experiment), where the core assumption is that the energy of a closed system, at the
end of a series of transformations, must invariably have the same value as at
the beginning. We now try to explain it.
Consider a system in position $\vec{x}$, moving at speed $|\vec{v}| \ll c$: the total energy of
the system has the general form:
$$E_c = M_R c^2 - M_R U(\vec{x}) + \frac{1}{2} M_R v^2 + \dots \qquad (4.11)$$
where $M_R$ is the rest mass of the system, U the external potential and v the absolute
value of the velocity. The dots “...” indicate possible terms of higher order in U and $v^2$.
Assume now that our system is composed of n bound elementary particles of
mass $m_0$; there is a binding energy $E_B$ associated with it. The rest energy can be
written as the difference between the rest energy of the n isolated particles and the
binding energy $E_B$: if we have an EEP violation, the binding energy can depend, in
general, on position $\vec{x}$ (LPI violation) and/or on speed $\vec{v}$ (LLI violation):
$$M_R c^2 = n\, m_0 c^2 - E_B(\vec{x}, \vec{v}) \qquad (4.12)$$
Under the reasonable assumption of a weak EEP violation, we can write the binding
energy as a series expansion in the mass anomalies due to the violating terms:
$$E_B(\vec{x}, \vec{v}) = E_B^0 + \sum_{i=1}^{3}\sum_{j=1}^{3}\left(\delta m_p^{ij}\, U^{ij} + \delta m_I^{ij}\, v^i v^j\right) + \dots \qquad (4.13)$$
From now on we shall omit the explicit sum symbol, following Einstein’s
convention of summation over repeated indices. Latin indices (i, j, k ...) run over spatial
coordinates 1, 2, 3.
We now consider two different quantum systems, e.g. two excited states of the
$^7$Li nucleus, in an external magnetic field. These states are characterized by different
azimuthal quantum numbers: if EEP is not violated, the energy spacing between the
two levels is the same. Besides, the binding energy $E_B^0$ is, to zeroth order, the same
for both states.
The systems at rest ($v^i = v^j = 0\ \forall\, i, j$) in the gravitational potential U can
perform transitions, emitting photons of frequency proportional to the energy change
of the corresponding system. From (4.13) it follows that the frequencies emitted by
the two systems will be different, due to the presence of the terms $\delta m_p^{ij} U^{ij}$. This
shows how a violation of WEP can induce a violation of LPI.
We report a gedanken experiment (Haugan 1979) for a quantum system subject
to a cyclic transformation.
The quantum system of n identical, non-interacting particles, initially at rest at a
height z = h, is endowed with an energy
$$E_{free}(z = h, v = 0) = n\, m_0 \left[c^2 - U(h)\right]$$
(for simplicity of notation, we shall drop in the following the dependence on v,
inessential to our purposes). This is the energy that must be conserved during all the
transformations that we now consider:
1. We allow the particles to interact, creating a new composite system. This will
release a binding energy $E_B$ (see Eq. 4.13) and reduce its energy to
$$E_{composite}(h) = \left[n\, m_0 c^2 - E_B(h)\right]\left[1 - \frac{U(h)}{c^2}\right]$$
2. We now let both systems fall to the reference height z = 0: the composite system
with an acceleration $\vec{a}$ (not necessarily equal to $\vec{g}$), while the free particles fall,
by definition, with acceleration $\vec{g} = -\nabla U$.
3. We bring the two systems to rest, converting and storing their kinetic energy. We
now have available the energies:
$$E_{composite}(0) + \left[n\, m_0 - \frac{E_B(0)}{c^2}\right] \vec{a} \cdot \vec{h} + \delta m_I^{ij}\, g^i h^j$$
4. We now disassemble the composite system into the original n particles: this takes
up the energy
$$E_{free}(0) = E_{composite}(0) + E_B(0).$$
5. Finally, we raise the non-interacting system again to height h, using an energy $n\, m_0\, g h$.
We have recomposed the initial system, with its energy $E_{free}(h)$, plus the
following energy terms:
$$\left[E_B(h) - E_B(0)\right] - \left[n\, m_0 - \frac{E_B(0)}{c^2}\right] (\vec{a} - \vec{g}) \cdot \vec{h} + \delta m_I^{ij}\, g^i h^j \qquad (4.14)$$
If energy conservation holds, this amount must be zero. We rewrite the term
in the square bracket with the help of Eq. 4.13 and define, as a shorthand,
$M_c \equiv n m_0 - E_B(0)/c^2$:
$$\delta m_P^{ij}\left[U_{ij}(h) - U_{ij}(0)\right] - \delta m_I^{ij} g_i h_j - M_c (a^k - g^k) h^k \tag{4.15}$$
$$a^k = g^k + \frac{\delta m_P^{ij}}{M_c}\frac{\partial U_{ij}}{\partial x^k} + \frac{\delta m_I^{kj}}{M_c} g_j \tag{4.16}$$
This proves that a violation of LPI (second term on the r.h.s.) or a violation of LLI
(third term) will produce a violation of WEP ($\vec a \neq \vec g$).
$$g_A = g\left[1 + \alpha\frac{E_A}{m_A c^2}\right], \qquad g_B = g\left[1 + \alpha\frac{E_B}{m_B c^2}\right], \qquad E_B - E_A = h\nu \tag{4.17}$$
If we want to bring our system back to its initial condition, before the emission
and at height $H$, we must use the kinetic energy of the system on the ground, converted
from the potential energy $m_B g_B H$, and the energy of the photon $h\nu'$. In other words, we
$$Z = \frac{\nu' - \nu}{\nu'} = g(1 + \alpha)\frac{H}{c^2} = (1 + \alpha)\frac{\Delta U}{c^2} \tag{4.18}$$
This proves that a violation of WEP ($\alpha \neq 0$) necessarily entails a violation of LPI.
Figure 4.2 shows a diagrammatic representation of the relations underlying EEP.
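As a rough numerical illustration of Eq. 4.18, the sketch below evaluates the fractional frequency shift $Z$ for a photon travelling through a height difference $H$ near the Earth's surface. The value of $g$ and the 22.5 m height are illustrative assumptions, not values taken from the text; `redshift` is a hypothetical helper name.

```python
# Fractional frequency shift Z = (1 + alpha) * g * H / c**2 (Eq. 4.18).
C = 299_792_458.0   # speed of light, m/s
G_LOCAL = 9.81      # local gravitational acceleration, m/s^2 (assumed value)


def redshift(height_m, alpha=0.0, g=G_LOCAL):
    """Return Z for a height difference `height_m`; alpha = 0 is the EEP prediction."""
    return (1.0 + alpha) * g * height_m / C**2


# 22.5 m is an assumed, illustrative tower height.
z_eep = redshift(22.5)
z_violating = redshift(22.5, alpha=0.01)   # a hypothetical 1% WEP violation
print(f"Z (alpha = 0)    = {z_eep:.3e}")
print(f"Z (alpha = 0.01) = {z_violating:.3e}")
```

With these inputs the shift is of order $10^{-15}$, which gives a feeling for the frequency resolution a laboratory test of LPI needs.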
References
Brillet, A., Hall, J.L.: Improved laser test of the isotropy of space. Phys. Rev. Lett. 42, 549–552
(1979)
Cocconi, G., Salpeter, E.E.: A search for anisotropy of inertia. Nuovo Cimento 10, 646 (1958)
Delva, P., et al.: Gravitational redshift test using eccentric Galileo satellites. Phys. Rev. Lett. 121,
231101 (2018)
Drever, R.W.P.: A search for anisotropy of inertial mass using a free precession technique. Philos.
Mag. 6, 683 (1961)
Haugan, M.P.: Energy conservation and the principle of equivalence. Ann. Phys. 118, 156 (1979)
Hughes, V.W., et al.: Upper limit for the anisotropy of inertial mass from nuclear resonance exper-
iments. Phys. Rev. Lett. 4, 342 (1960)
Misner, C.W., Thorne, K.S., Wheeler, J.A.: Gravitation. Freeman, San Francisco, USA (1973)
Muller, H., Peters, A., Chu, S.: A precision measurement of the gravitational redshift by the inter-
ference of matter waves. Nature 463, 926 (2010)
Pound, R.V., Rebka, G.A.: Apparent weight of photons. Phys. Rev. Lett. 4, 337 (1960)
Pound, R.V., Snider, J.L.: Effect of gravity on gamma radiation. Phys. Rev. 140, B788 (1965)
Uzan, J.P.: The fundamental constants and their variation: observational and theoretical status. Rev.
Mod. Phys. 75, 403 (2003)
Vessot, R.F.C., Levine, M.W., et al.: Test of relativistic gravitation with a space-borne hydrogen
maser. Phys. Rev. Lett. 45, 2081–2084 (1980)
Webb, J.K., et al.: Indications of a spatial variation of the fine structure constant. Phys. Rev. Lett.
107, 191101 (2011)
Will, C.M.: The confrontation between general relativity and experiment. Living Rev. Relativ. 17,
4 (2014)
Further Reading
Bertotti, B. (ed.): Gravitazione Sperimentale–Experimental Gravitation; Atti dei convegni Lincei
34, Roma (1977)
Ciufolini, I., Wheeler, J.A.: Gravitation and Inertia. Princeton University Press, Princeton (1995)
5 Tests of Gravity at First Post-Newtonian Order
5.1 Introduction
The criteria to assess the reliability of a theory are its completeness and self-
consistency. A theory is complete when all the equations needed to describe how
a system behaves in given conditions can be derived from first principles. A theory is
self-consistent when the prediction of the outcome of a given experiment, obtained
with different methods, is unique: as an example that we will discuss further on,
regardless of whether the light is considered a wave or massless particle, both pre-
dictions about the light deflection due to the gravitational field should coincide.
Furthermore, its reliability is corroborated if it makes correct predictions in
the Newtonian and the relativistic limits. In other words, in the so-called Weak Field
and Slow Motion (WFSM) limit, when the gravitational field is weak (see the next
section for a quantitative assessment of what we mean by weak) and the particles
move with velocity $v \ll c$, the theory must recover the laws of Newtonian physics.
On the other hand, in the absence of a gravitational field, the laws of the theory should
reduce to those of Special Relativity (SR). In previous chapters, we have discussed the
essential pillars (such as WEP-LLI-LPI, that together yield EEP) that support the
hypothesis that gravity is a metric theory. In such a theory, matter creates fields that,
along with matter itself, generate the metric, but in turn, matter motion is determined
by the metric.
Based on all the considerations made so far about the EEP, the laws of physics
in covariant form can be formulated using SR, and generalized when accounting for
space curvature. The idea is therefore to derive the laws of SR from an action
containing the pseudo-Euclidean (Minkowskian) metric tensor $\eta_{\mu\nu}$. The generalization
will take place through a generic transformation of the coordinates $x^\mu = x^\mu(x^\alpha)$.
By transforming in a general form the vectors, tensors, differentials and integration
elements, we end up rewriting the action in the new reference system of curved
coordinates. It is interesting to note that at the end of such a transformation, the form
of the action remains the same, provided that the pseudo-Euclidean metric tensor is
replaced by the metric tensor gμν , and the derivatives by covariant derivatives. These
simple mnemonic rules are the formal statement of Einstein’s Equivalence Principle.
Several theories have been formulated, and they differ only in how the metric
is generated: that is because, in any case, if EEP is valid, matter only couples to
the metric. In Einstein’s General Theory of Relativity (this is the name he gave),
there exists a unique gravitational field generated by the stress-energy tensor which
contains contributions of both matter and other fields. We just mention, as examples,
two other theories. In the Brans-Dicke-Jordan theory (Brans and Dicke 1961), matter
and fields generate a scalar field that, along with matter and fields, generates the
metric. In Ni's theory (Ni 1973), a flat metric is assumed to exist throughout the
Universe, where a proper time exists. This flat metric contributes, with matter and
non-gravitational fields, to generate a scalar field. All these fields then combine to
generate the physical metric $g_{\mu\nu}$, the metric that enters into the equivalence principle.
Einstein’s General Relativity (GR) has emerged in the last century as the best and
most robust theory of gravity, also due to the precise experimental verifications of a
few, precisely predicted phenomena, observed in the Solar System, that do not have
Newtonian counterparts. Three of these, the so-called classical tests, were proposed
by Einstein himself (1916):
All these phenomena are conveniently described using GR in the Weak Field, Slow
Motion (WFSM) approximation, that we introduce in the following sections. Other
effects, like generation of gravitational waves that takes place in regions of strong
gravity, require instead a more complete GR approach. In the next chapter, we will
expand the focus to other theories, and recall the Parametrized Post-Newtonian
(PPN) formalism: a framework general enough to represent in a comprehensive form the different
predictions of a broad class of metric theories in a weak field limit, in order to compare
them with the experimental observations.
In this section, we briefly review the main relations of Einstein’s GR, taking a top-
down approach, i.e. starting with the field equations that are often the arrival point
of an introduction to GR.1
1 It is assumed that the reader is already familiar with GR: the list of equations we report here cannot
$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi G}{c^4} T_{\mu\nu} \tag{5.1}$$
$$T_\mu{}^\nu{}_{;\nu} = 0 \tag{5.2}$$
with μ, ν = 0, 1, 2, 3. We recall here a few of the math tools and concepts of GR, in
order to get an insight into the meaning of these equations.
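$$v_{\mu;\nu} = \frac{\partial v_\mu}{\partial x^\nu} - \Gamma^\rho_{\mu\nu} v_\rho \tag{5.4}$$
• where $\Gamma^\rho_{\mu\nu}$ are the Christoffel symbols (or affine connections) of the second kind:
$$\Gamma^\rho_{\mu\nu} = \frac{1}{2} g^{\rho\alpha}\left(\frac{\partial g_{\nu\alpha}}{\partial x^\mu} + \frac{\partial g_{\alpha\mu}}{\partial x^\nu} - \frac{\partial g_{\mu\nu}}{\partial x^\alpha}\right) \tag{5.5}$$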
The Christoffel symbols, made out of derivatives of the metric tensor, are often
considered as the fields of gravitational theory. Just as the fields of electrodynam-
ics, they are not invariant (are not tensors) under coordinate transformation.
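• The Riemann tensor $R^\alpha{}_{\beta\mu\nu}$ is also called the curvature tensor in a four-dimensional
space:
$$R^\alpha{}_{\beta\mu\nu} = \Gamma^\alpha_{\beta\nu,\mu} - \Gamma^\alpha_{\beta\mu,\nu} + \Gamma^\alpha_{\mu\rho}\Gamma^\rho_{\beta\nu} - \Gamma^\alpha_{\nu\rho}\Gamma^\rho_{\beta\mu} \tag{5.6}$$
• $R_{\mu\nu}$ is the Ricci tensor, obtained by contracting two indices of the Riemann
tensor:
$$R_{\mu\nu} = \Gamma^\alpha_{\mu\nu,\alpha} - \Gamma^\alpha_{\mu\alpha,\nu} + \Gamma^\alpha_{\mu\nu}\Gamma^\beta_{\alpha\beta} - \Gamma^\beta_{\mu\alpha}\Gamma^\alpha_{\nu\beta} \tag{5.7}$$
$R$ is the curvature scalar: it is derived from the Ricci tensor by further contracting the
two indices. Thus, the field equations (Eq. 5.1) are a set of ten coupled, non-linear,
second-order differential equations in the potentials $g_{\mu\nu}$.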
A gravitational theory is valid and verified if we can make predictions on the motion
of matter and compare them with measurements. We must therefore measure gμν ,
i.e. determine the motion of a matter element with respect to another, and not just to
a reference system.
This is a fundamental peculiarity of gravitation and of GR.
To state it differently, it is not sufficient to consider the motion of a freely gravitating
particle with respect to a given reference frame: such motion is described by the
geodesic equation, that is simply the generalization of the inertia principle to a
curved space-time.
$$\frac{d^2 x^\mu}{ds^2} + \Gamma^\mu_{\beta\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0 \tag{5.8}$$
Indeed, this is the explicit expression of the covariant derivative of the 4-vector
velocity for a particle freely moving along a trajectory described by the curvilinear
coordinate s (else, it can be considered as Newton’s law for a particle in flat space-
time, with the force field given by the second term, i.e. by the curvature).
However, as mentioned above, this equation does not allow us to deduce the
fundamental nature of the phenomena under examination: we need to describe the motion of a
particle with respect to another massive body. To this purpose, we now consider two
particles moving along two geodesic lines, described by coordinates s and s + ds; let
ξ μ be the vector connecting the two geodesics. Choosing a locally Galilean reference
frame, we compute the variation of the geodesic equation with respect to this vector,
obtaining the equation of geodesic deviation:
$$\frac{\partial^2 \xi^\gamma}{\partial s^2} + u^\mu u^\nu \xi^\rho R^\gamma{}_{\mu\nu\rho} = 0 \tag{5.9}$$
We now return to the field Eq. 5.1 and look for a solution of Einstein’s equations in
the case where we have matter, possibly in a stationary state of rotation. Assume we
are in conditions of small perturbation of the flat (or Minkowskian) space-time:
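where $|h_{\mu\nu}| \ll 1$ is the definition of Weak Field, and we compute the perturbed metric
in the linear approximation. We insert the perturbed metric tensor in the definitions
of the affine connections (Eq. 5.5) and of the Ricci tensor, limiting our expansion to
first order in $h$:
$$\Gamma^\lambda_{\mu\nu} \simeq \frac{1}{2}\eta^{\lambda\rho}\left[\frac{\partial}{\partial x^\mu} h_{\rho\nu} + \frac{\partial}{\partial x^\nu} h_{\rho\mu} - \frac{\partial}{\partial x^\rho} h_{\mu\nu}\right] \tag{5.12}$$
$$R_{\mu\nu} \simeq \frac{\partial}{\partial x^\nu}\Gamma^\lambda_{\lambda\mu} - \frac{\partial}{\partial x^\lambda}\Gamma^\lambda_{\mu\nu} \tag{5.13}$$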
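Note that, in agreement with the assumption of linear approximation, all tensor
indices are raised or lowered by the flat space-time metric $\eta_{\mu\nu}$ rather than $g_{\mu\nu}$. We
have
$$\eta^{\lambda\rho} h_{\rho\nu} = h^\lambda{}_\nu, \qquad \eta^{\lambda\rho}\frac{\partial}{\partial x^\rho} = \frac{\partial}{\partial x_\lambda}$$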
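We now combine the two expressions, Eqs. 5.12 and 5.13, to obtain
$$R_{\mu\nu} \simeq \frac{1}{2}\left[\frac{\partial^2}{\partial x^\lambda \partial x_\lambda} h_{\mu\nu} - \frac{\partial^2}{\partial x^\lambda \partial x^\mu} h^\lambda{}_\nu - \frac{\partial^2}{\partial x^\lambda \partial x^\nu} h^\lambda{}_\mu + \frac{\partial^2}{\partial x^\mu \partial x^\nu} h^\lambda{}_\lambda\right] \tag{5.14}$$
$$\frac{\partial^2}{\partial x^\lambda \partial x_\lambda} h_{\mu\nu} - \frac{\partial^2}{\partial x^\lambda \partial x^\mu} h^\lambda{}_\nu - \frac{\partial^2}{\partial x^\lambda \partial x^\nu} h^\lambda{}_\mu + \frac{\partial^2}{\partial x^\mu \partial x^\nu} h^\lambda{}_\lambda = -\frac{16\pi G}{c^4} T_{\mu\nu} \tag{5.15}$$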
The right-hand side of the Einstein equations (Eq. 5.1) includes the energy and momentum of
the gravitational field: $T_{\mu\nu} = T_{\mu\nu}(\text{matter, fields}) + t_{\mu\nu}(h)$. In this non-linearity lies
the beauty and the complexity of GR! These contributions are, as in all field theories,
quadratic (at least) in the field amplitude h μν (see e.g. in Chap. 7, the energy and
momentum carried by gravitational waves). In the WFSM approximation, however,
we only retain first-order corrections to ημν and therefore neglect the gravitational
self-energy contribution.
Note that the tensor $T_{\mu\nu}$, defined to first order in $h_{\mu\nu}$, satisfies the ordinary
(non-covariant derivative) conservation law:
$$\frac{\partial}{\partial x^\mu} T^\mu{}_\nu = 0 \tag{5.16}$$
Also note that the first term of the left-hand side of Eq. 5.15 is the D’Alembert
operator, applied to the metric perturbation. We now try, via a suitable coordinate
transformation, to set the other terms to zero: this is equivalent to what we do to get
the Lorentz gauge transformation in e.m.
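First, we change variable, defining the trace-reversed metric perturbation:
$$\tilde h_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu} h \tag{5.17}$$
$h$ is the trace, i.e. the scalar obtained by contracting the two indices: $h = h^\mu{}_\mu = \eta^{\mu\nu} h_{\mu\nu}$.
The inverse transformation is
$$h_{\mu\nu} = \tilde h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\tilde h \tag{5.18}$$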
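$$\tilde h'_{\mu\nu} = \tilde h_{\mu\nu} - \frac{\partial \epsilon_\mu}{\partial x^\nu} - \frac{\partial \epsilon_\nu}{\partial x^\mu}$$
It can be shown that, due to gauge invariance, $\tilde h'_{\mu\nu}$ is still a solution of the linearized
field equations. By choosing $\epsilon_\mu$ so as to satisfy the four Lorentz gauge conditions,
$$\frac{\partial \tilde h^\mu{}_\nu}{\partial x^\mu} = 0 \tag{5.19}$$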
2 Note that, with our choice of metric signature $(+,-,-,-)$, we have $\frac{\partial^2}{\partial x^\lambda \partial x_\lambda} = \frac{\partial^2}{\partial x_0^2} - \nabla^2 = -\Box$,
with opposite sign w.r.t. the conventional definition of the D'Alembert operator.
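$$-\frac{\partial^2 \tilde h_{\mu\nu}}{\partial x^\lambda \partial x_\lambda} = \Box\,\tilde h_{\mu\nu} = \frac{16\pi G}{c^4} T_{\mu\nu} \tag{5.20}$$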
There is an almost perfect formal analogy with the corresponding e.m. equation for
the 4-potential $A_\mu^{(em)} \equiv (\Phi^{(em)}, \vec A^{(em)})$ and the 4-current $j_\mu^{(em)} \equiv (c\rho^{(em)}, \vec j^{(em)})$,
expressed in Gaussian units3:
$$\Box A_\mu^{(em)} = \frac{4\pi}{c} j_\mu^{(em)} \tag{5.21}$$
The tensor potential $\tilde h_{\mu\nu}$ plays the role of the e.m. vector potential $A_\mu$, while the
energy-momentum tensor $T_{\mu\nu}$ plays the role of the source, which in e.m. is
the 4-current $j_\mu^{(em)}$. Using this analogy, the solution of Eq. 5.20 can be
immediately written as
$$\tilde h_{\mu\nu}(\vec r, t) = -\frac{4G}{c^4}\int \frac{T_{\mu\nu}(\vec r\,', t)}{|\vec r - \vec r\,'|}\, d^3 r' \tag{5.22}$$
We note that the stress-energy tensor should be computed at the retarded time
$t_{ret} = t - |\vec r - \vec r\,'|/c$. This requirement is relaxed here, as we assume stationary
motion of the sources.
We now need to identify the various components of $T_{\mu\nu}$: to this purpose, we apply
analogous approximations to the continuity equation of the stress-energy tensor:
$$T^{\mu\nu}{}_{,\nu} = \frac{\partial}{\partial x^0} T^{\mu 0} + \frac{\partial}{\partial x^k} T^{\mu k} = 0$$
and compare these equations to the Euler equations of fluid dynamics for the motion
of a perfect fluid (an ensemble of non-interacting particles) in the absence of gravity:
$$\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\vec v) = 0$$
$$\rho\frac{d\vec v}{dt} = -\nabla p$$
where $\rho$ is the fluid density, $p$ its pressure and $\vec v$ its velocity, and $\frac{d}{dt} = \frac{\partial}{\partial t} + \vec v\cdot\nabla$.
The components of the matter tensor $T^{\mu\nu}$ are therefore identified in the following
way
$$T^{00} = \rho c^2, \qquad T^{0j} = c\rho v^j, \qquad T^{jk} = \rho v^j v^k + p\,\delta^{jk} \tag{5.23}$$
where $v^j$ are the components of the velocity $\vec v$.
3 The Gauss unit system of e.m. (Jackson 1975), although unfamiliar these days, best holds the
analogy with GR. In this system, $\epsilon_0 = 1$; $\mu_0 = 1$ and the fields $\vec E$ and $\vec B$ have the same units.
It is now sufficient to plug the $T^{\mu\nu}$ values as given by Eq. 5.23 into the general
solution Eq. 5.22 to obtain the components of the metric perturbation $\tilde h_{\mu\nu}$
$$\tilde h_{00}(\vec r) = -\frac{4G}{c^2}\int \frac{\rho(\vec r\,', t)}{|\vec r - \vec r\,'|}\, d^3 r' \tag{5.24}$$
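$$\tilde h_{00} = -\frac{4GM}{c^2 r}; \qquad \tilde h_{0j} = -\frac{2G}{c^3}\frac{(\vec r\times\vec J)_j}{r^3}; \qquad \tilde h_{ij} = O\!\left(\frac{v^2}{c^2}, \frac{1}{r^3}\right) \tag{5.27}$$
$$g_{\mu\nu} = \eta_{\mu\nu} + \tilde h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\tilde h \tag{5.28}$$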
4 Derivation of the second of Eqs. 5.27 from Eq. 5.25 is not obvious, but is an interesting exercise: use
local Cartesian coordinates and assume rotation around the z axis: $v_j = (\hat z\times\vec r)_j$. Then integrate using
the axial symmetry of the source: $\int d^3r\,\rho(\vec r)\,x^k = 0\ \forall k$ and $\int d^3r\,\rho(\vec r)\,x^2 = \int d^3r\,\rho(\vec r)\,y^2 = \frac{1}{2} I_{zz}$.
and, noting that the trace is $\tilde h = -4GM/(rc^2)$, we convert back to the metric $g_{\mu\nu}$ using
Eq. 5.18, and finally obtain, to leading order in $GM/(rc^2)$,
$$h_{00} = -\frac{2GM}{rc^2}; \qquad h_{ij} = -\frac{2GM}{rc^2}\delta_{ij}; \qquad h_{0j} = -\frac{2G}{c^3}\frac{(\vec r\times\vec J)_j}{r^3} \tag{5.29}$$
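The step from the trace-reversed perturbation of Eq. 5.27 to the components of Eq. 5.29 can be checked numerically. The following is a minimal sketch for the non-rotating case ($J = 0$), working in units of the dimensionless quantity $GM/(rc^2)$; the helper name `trace_reverse` and the sample value of `u` are illustrative choices, not part of the text.

```python
# Numerical check of Eq. 5.29 from Eq. 5.27 via Eq. 5.18:
#   h_mu_nu = htilde_mu_nu - (1/2) * eta_mu_nu * htilde
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]   # flat metric, signature (+ - - -)


def trace_reverse(ht):
    """Apply Eq. 5.18 to a 4x4 perturbation given with lower indices."""
    # eta is its own inverse, so the trace is sum over eta[m][n] * ht[m][n].
    trace = sum(ETA[m][n] * ht[m][n] for m in range(4) for n in range(4))
    return [[ht[m][n] - 0.5 * ETA[m][n] * trace for n in range(4)]
            for m in range(4)]


u = 1.0e-6                       # GM / (r c^2); arbitrary small sample value
htilde = [[0.0] * 4 for _ in range(4)]
htilde[0][0] = -4.0 * u          # Eq. 5.27 with J = 0
h = trace_reverse(htilde)

print("h_00 / u =", h[0][0] / u)
print("h_11 / u =", h[1][1] / u)
```

Both ratios should come out close to $-2$, reproducing $h_{00} = h_{ii} = -2GM/(rc^2)$.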
The GR metric, to first order in the WFSM approximation, can now be expressed in
spherical coordinates ($x^0 = ct$, $x^1 = r$, $x^2 = \theta$ and $x^3 = \phi$):
$$ds^2 = \left(1 - \frac{2GM}{rc^2}\right)c^2dt^2 - \left(1 + \frac{2GM}{rc^2}\right)dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2) - \frac{4GJ}{c^2 r}\sin^2\theta\,dt\,d\phi \tag{5.30}$$
We remark that this metric, derived in the WFSM approximation, is indeed the
weak field limit of the more general Kerr metric, that describes space-time around a
rotating, axisymmetric mass.
For a non-rotating source mass, we can drop from Eq. 5.30 the last term,
$\frac{4GJ}{c^2 r}\sin^2\theta\,dt\,d\phi$ (which we shall use in Sect. 5.5 to describe the Lense-Thirring effect):
$$ds^2 = \left(1 - \frac{2GM}{rc^2}\right)c^2dt^2 - \left(1 + \frac{2GM}{rc^2}\right)dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2) \tag{5.31}$$
which is the first-order expansion of the Schwarzschild solution
$$ds^2 = \left(1 - \frac{2GM}{rc^2}\right)c^2dt^2 - \frac{dr^2}{1 - \frac{2GM}{rc^2}} - r^2(d\theta^2 + \sin^2\theta\,d\phi^2) \tag{5.32}$$
where $ds^2$ defines the proper time $(c\,d\tau)^2$ for massive particles, while $ds^2 = 0$ for
photons.
In the following, we shall alternatively use one metric or the other.
Although we stay with our SI measurement units (i.e. do not use $c = G = 1$, as
many theorists do), we find it convenient, in the following sections, to use the shorthand
notation:
$$R = \frac{2GM}{c^2} \tag{5.33}$$
With this notation, we measure mass in metres: for the Earth we then have $R = 8.8$ mm
and for the Sun $R = 2.95$ km. $R$ is known as the Schwarzschild radius of a celestial
body. Table 5.1 shows, for a few familiar quasi-spherical bodies, the gravitational
potential on the surface and the Schwarzschild radius related to their mass.

Table 5.1 Schwarzschild radius and normalized gravitational potential $\Phi^{(g)}/c^2 = GM/(r_b c^2)$ on
the surface of some spherical bodies ($r_b$ is the radius of the body)

Body           Mass (kg)      Radius (m)    2GM/c^2 (m)    Potential/c^2 at surface
Proton         1.7 × 10^-27   10^-15        2 × 10^-54     10^-39
Bowling ball   7.3            0.22          10^-26         2 × 10^-26
Earth          6 × 10^24      6 × 10^6      9 × 10^-3      7 × 10^-10
Sun            2 × 10^30      7 × 10^8      3 × 10^3       2 × 10^-6
White dwarf    10^30          10^6          1 × 10^3       7 × 10^-4
Neutron star   3 × 10^30      10^4          4 × 10^3       0.2
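The entries of Table 5.1 follow directly from Eq. 5.33 and from the surface potential $GM/(r_b c^2)$. The sketch below recomputes the Earth and Sun rows; the masses and radii are standard rounded values supplied here as assumptions, and the function names are illustrative.

```python
# Schwarzschild radius (Eq. 5.33) and normalized surface potential.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s


def schwarzschild_radius(mass_kg):
    """R = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def surface_potential(mass_kg, radius_m):
    """Dimensionless G M / (r_b c^2) at the surface of a body of radius r_b."""
    return G * mass_kg / (radius_m * C**2)


# (mass in kg, radius in m): rounded reference values, assumed for illustration.
bodies = {"Earth": (5.97e24, 6.371e6), "Sun": (1.989e30, 6.96e8)}

for name, (mass, radius) in bodies.items():
    print(f"{name:6s} R = {schwarzschild_radius(mass):.3e} m, "
          f"potential/c^2 = {surface_potential(mass, radius):.2e}")
```

The output is a quick sanity check on the orders of magnitude quoted in the table: millimetres for the Earth, kilometres for the Sun.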
We now have the appropriate tools to address the Solar System tests of GR.
The classic experimental tests of general relativity are based on observations
performed in the weak gravitational field regime. In this approximation, we can
predict the perturbation effects on the propagation of massless particles such as photons,
in order to perform a comparison with the experimental data. There are two main
effects to be considered:
(a) the time delay in photon propagation, also known as Shapiro delay,
(b) the bending of light trajectories, also known as gravitational lensing effect.
We start from the Schwarzschild solution Eq. 5.32 of the Einstein equations for
the empty space-time outside a static spherical body of mass M.
We immediately note that if we let $M = 0$, Eq. 5.32 reduces to the line element of
Minkowski space-time in spherical coordinates, while for $M \neq 0$ both the time and the
radial coordinate are distorted. The logical consequence is that, for a clock at a fixed
position in space ($r, \theta, \phi$ = constant), just as we found for the red-shift of clocks
in Chap. 4, we have
$$d\tau^2 = \left(1 - \frac{R}{r}\right)dt^2 \tag{5.34}$$
while the proper length $\ell$ of a segment at rest, radially oriented in the gravitational field of the
mass $M$ ($t, \theta, \phi$ = constant), is computed by integrating the following differential:
$$d\ell^2 = \left(1 - \frac{R}{r}\right)^{-1}dr^2 \tag{5.35}$$
it does not perturb the metric), radially positioned at $(r_2, \theta_0, \phi_0)$. The photon is
reflected back by the body towards the observer, who receives it at a later time. From
an experimenter’s point of view, this is realized by emitting a radio beam from the
Earth, reflecting it off Mercury, or Venus, when it is aligned with the Earth and the
Sun (planets in conjunction). We calculate the elapsed time between transmission
and subsequent reception of a photon by the observer in the simpler, although less
interesting case of inferior conjunction, when the alignment is Sun-Mercury-Earth.
The photon travels in radial direction at the speed of light. In this case, being dτ = 0
and dθ = dφ = 0 in the line element expressed by the Eq. 5.32, we have
$$\left(1 - \frac{R}{r}\right)(c\,dt)^2 = \left(1 - \frac{R}{r}\right)^{-1}dr^2 \tag{5.36}$$
which implies that the radial coordinate velocity of light is
$$\frac{dr}{dt} = \pm c\left(1 - \frac{R}{r}\right) \tag{5.37}$$
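This equation can be integrated to get the round trip travel time of the photon
$$\Delta t = \frac{1}{c}\left[-\int_{r_1}^{r_2}\frac{dr}{1 - \frac{R}{r}} + \int_{r_2}^{r_1}\frac{dr}{1 - \frac{R}{r}}\right] = \frac{2}{c}\int_{r_2}^{r_1}\frac{dr}{1 - \frac{R}{r}} \tag{5.38}$$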
The experimentally observable quantity is the proper time $\Delta\tau$ recorded at $r_1$; so, we
need to express this elapsed time $\Delta t$ in terms of the observer's proper time. Thus,
using Eq. 5.34, we have
$$\Delta\tau = \left(1 - \frac{R}{r_1}\right)^{1/2}\Delta t = \frac{2}{c}\left(1 - \frac{R}{r_1}\right)^{1/2}\left[r_1 - r_2 + R\ln\frac{r_1 - R}{r_2 - R}\right] \tag{5.39}$$
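Let us compare this result with the time interval we would expect simply from
light travelling the proper distance $\Delta\ell$ from $r_1$ to $r_2$ and back:
$$\Delta\tilde\tau = \frac{\Delta\ell}{c} = \frac{2}{c}\int_{r_2}^{r_1}\left(1 - \frac{R}{r}\right)^{-1/2}dr = \frac{2}{c}\left[\sqrt{r_1(r_1 - R)} - \sqrt{r_2(r_2 - R)} + R\ln\frac{\sqrt{r_1} + \sqrt{r_1 - R}}{\sqrt{r_2} + \sqrt{r_2 - R}}\right]$$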
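To get a feeling for the size of the effect, the sketch below evaluates the closed form of the integral in Eq. 5.38, $\Delta t = \frac{2}{c}\left[r_1 - r_2 + R\ln\frac{r_1 - R}{r_2 - R}\right]$, and cross-checks the excess over the flat-space round trip $2(r_1 - r_2)/c$ with a simple midpoint-rule integration. The orbital radii of the Earth and Mercury and the solar value of $R$ are rounded, assumed inputs; the conversion to the observer's proper time (Eq. 5.34) is deliberately left out.

```python
import math

C = 299_792_458.0    # speed of light, m/s
R_SUN = 2953.0       # 2GM/c^2 for the Sun, m (rounded, assumed)


def coordinate_round_trip(r1, r2, R=R_SUN):
    """Closed form of Eq. 5.38 for a radial path between r2 < r1."""
    return (2.0 / C) * (r1 - r2 + R * math.log((r1 - R) / (r2 - R)))


def excess_numeric(r1, r2, R=R_SUN, n=20_000):
    """Midpoint-rule integral of 1/(1 - R/r) - 1 = R/(r - R), times 2/c."""
    step = (r1 - r2) / n
    total = sum(R / (r2 + (i + 0.5) * step - R) for i in range(n)) * step
    return 2.0 * total / C


r_earth, r_mercury = 1.496e11, 5.79e10   # mean orbital radii, m (assumed)
flat = 2.0 * (r_earth - r_mercury) / C
excess_closed = coordinate_round_trip(r_earth, r_mercury) - flat

print(f"flat round trip : {flat:.3f} s")
print(f"excess (closed) : {excess_closed * 1e6:.2f} microseconds")
print(f"excess (numeric): {excess_numeric(r_earth, r_mercury) * 1e6:.2f} microseconds")
```

For this radial geometry the relativistic excess is a few tens of microseconds on a round trip of about ten minutes, which is why precise radar timing is needed.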
One famous prediction of General Relativity is that null geodesics in the
vicinity of a mass are curved. This implies that the photon trajectory deviates from
a straight line when passing near a star. To estimate the size of the effect, we start from
the Lagrangian of a generic particle propagating in a gravitational field
$$L \equiv \frac{1}{2} g_{\mu\nu}\dot x^\mu \dot x^\nu \tag{5.42}$$
where the dot denotes the derivative with respect to the proper time $\tau$ for a
massive particle and with respect to the affine parameter $s$ for the massless photon. We write it, using
spherical coordinates, in the vicinity of a static spherical object of mass $M$:
$$L = \frac{1}{2}\left[-c^2\dot t^2\left(1 - \frac{R}{r}\right) + \dot r^2\left(1 - \frac{R}{r}\right)^{-1} + r^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2)\right] \tag{5.43}$$
The Euler-Lagrange equations then give us four geodesic equations:
$$\frac{d}{ds}\frac{\partial L}{\partial \dot x^\mu} - \frac{\partial L}{\partial x^\mu} = 0$$
Without loss of generality, we assume that the motion is confined to the equatorial
plane, i.e. $\theta = \pi/2$, and we end up with the second equation, corresponding to the index
$\mu = 1$, of the form
$$\ddot r\left(1 - \frac{R}{r}\right)^{-1} + \frac{GM}{r^2}\dot t^2 - \dot r^2\frac{R}{2r^2}\left(1 - \frac{R}{r}\right)^{-2} - r\dot\phi^2 = 0 \tag{5.44}$$
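Integrating the two other equations, $\mu = 0$ and $\mu = 3$, in the case $\theta = \pi/2$, we have
$$\frac{\partial L}{\partial \dot t} \propto \left(1 - \frac{R}{r}\right)\dot t = k \equiv \text{constant} \tag{5.45}$$
$$r^2\dot\phi = h \equiv \text{constant} \tag{5.46}$$
where $k$ and $h$ are integration constants.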
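Equation 5.44 is a second-order, non-linear equation: it is simpler to use instead
the relation that can be obtained from the invariant interval $ds^2$ of Eq. 5.32, for massive particles
$$-c^2\dot t^2\left(1 - \frac{R}{r}\right) + \dot r^2\left(1 - \frac{R}{r}\right)^{-1} + r^2\dot\phi^2 = -c^2 \tag{5.47}$$
and for photons
$$-c^2\dot t^2\left(1 - \frac{R}{r}\right) + \dot r^2\left(1 - \frac{R}{r}\right)^{-1} + r^2\dot\phi^2 = 0 \tag{5.48}$$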
The constraint Eq. 5.46 is the analogue of the conservation of angular momentum,
while Eqs. 5.47 and 5.48 lead to relations echoing the conservation of energy.
Substituting, in Eq. 5.47 (for massive particles) and in Eq. 5.48 (for photons), the
values of $\dot\phi$ and $\dot t$ as given by Eqs. 5.45 and 5.46, we end up with two simpler
equations in the variable $u = 1/r$:
$$\left(\frac{du}{d\phi}\right)^2 + u^2 = E + \frac{2GM}{h^2}u + \frac{2GM}{c^2}u^3 \quad \text{for particles} \tag{5.49}$$
$$\left(\frac{du}{d\phi}\right)^2 + u^2 = F + \frac{2GM}{c^2}u^3 \quad \text{for photons} \tag{5.50}$$
Fig. 5.1 Deflection of light rays by the Sun. The effect is vastly exaggerated
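quantity $\epsilon \ll 1$, in the case of photons emitted by a far star and detected by a telescope
on the Earth when grazing the Sun image. In this condition, with $b$ equal to the solar radius $R_\odot$, the typical value
for $\epsilon$ is $\sim 4 \cdot 10^{-6}$.
Eq. 5.50 takes the form
$$\left(\frac{d\tilde u}{d\phi}\right)^2 + \tilde u^2 = \frac{F}{u_o^2} + \epsilon\tilde u^3 \tag{5.51}$$
At the point of the photon trajectory closest to the Sun, we have, by symmetry,
$$\left.\frac{d\tilde u}{d\phi}\right|_{\phi = \pi/2} = 0; \qquad \tilde u\left(\phi = \frac{\pi}{2}\right) = 1; \qquad \frac{F}{u_o^2} = 1 - \epsilon$$
so that
$$\left(\frac{d\tilde u}{d\phi}\right)^2 + \tilde u^2 = 1 - \epsilon + \epsilon\tilde u^3 \tag{5.52}$$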
We note that, for $M = 0$, we would have $\epsilon = 0$ and $\tilde u = \sin\phi$. We then look for a solution
of Eq. 5.52 in the form
$$\tilde u = \sin\phi + \epsilon f(\phi)$$
where $f(\phi)$ is a function to be determined. Using the initial condition $\tilde u(\phi = 0) = 0$,
which requires $f(\phi = 0) = 0$, the solution for $f(\phi)$ to first order in $\epsilon$ is
$$f(\phi) = \frac{1}{2}(1 + \cos^2\phi - \sin\phi) - \cos\phi$$
so that, for $\tilde u$, we have (Fig. 5.1)
$$\tilde u = \left(1 - \frac{1}{2}\epsilon\right)\sin\phi + \frac{1}{2}\epsilon(1 - \cos\phi)^2 \tag{5.53}$$
The consequence of this solution is that the end point of the photon trajectory is
no longer $\phi(t \to \infty) = \pi$, but differs from it by a small quantity $\delta\phi$, i.e. it is $\pi + \delta\phi$.
Thus, let us evaluate Eq. 5.53 at the end point of the trajectory, i.e. for $\tilde u = 0$ and
$\phi = \pi + \delta\phi$. To first order in the small angular quantity $\delta\phi$, we have $\sin(\pi + \delta\phi) \simeq -\delta\phi$
and $\cos(\pi + \delta\phi) \simeq -1$, and we find
$$\delta\phi = 2\epsilon = \frac{4GM}{bc^2} = \frac{2R}{b} \tag{5.54}$$
The deflection is inversely proportional to the impact parameter b. In the case of
light passing through the gravitational field of the Sun, the smallest value that b can
assume is the Sun radius. In this case, we have for the total deflection δϕ = 1.75” .
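The quoted value can be reproduced in a few lines. The sketch below evaluates Eq. 5.54 for a ray grazing the Sun and, as a consistency check, locates numerically the zero of the perturbative solution Eq. 5.53 just beyond $\phi = \pi$. The solar mass and radius are rounded, assumed inputs, and the function names are illustrative.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg (assumed)
B_SUN = 6.96e8       # impact parameter = solar radius, m (assumed)


def deflection_angle(mass, b):
    """Eq. 5.54: total deflection 4GM/(b c^2), in radians."""
    return 4.0 * G * mass / (b * C**2)


def u_tilde(phi, eps):
    """Perturbative trajectory of Eq. 5.53."""
    return (1.0 - 0.5 * eps) * math.sin(phi) + 0.5 * eps * (1.0 - math.cos(phi))**2


def end_point_shift(eps):
    """Bisection for the zero of u_tilde just beyond phi = pi; returns delta_phi."""
    lo, hi = math.pi, math.pi + 10.0 * eps   # u_tilde > 0 at lo, < 0 at hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if u_tilde(mid, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - math.pi


delta = deflection_angle(M_SUN, B_SUN)
arcsec = math.degrees(delta) * 3600.0
eps = 2.0 * G * M_SUN / (B_SUN * C**2)       # epsilon = R / b

print(f"deflection = {arcsec:.3f} arcsec")
print(f"zero of Eq. 5.53 at pi + {end_point_shift(eps):.4e} rad (2*eps = {2 * eps:.4e})")
```

The two estimates of $\delta\phi$ agree to within corrections of relative order $\epsilon$, as expected from a first-order expansion.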
The method involves photographing the stars around the Sun during a total eclipse
(when the stars near the Sun's edge can be seen), and comparing the photograph with
one of the same star field taken several months later. This method suffers from several
limitations. Among them, we cite
– the relevant changes in conditions which occur when bright sunlight changes to
the semidarkness of an eclipse
– the time lapse of several months, which makes it difficult to reproduce similar
conditions when taking the comparison photograph
– the smallness of the effect, which pushes photography to its limits
Nevertheless, this deviation was measured for the first time on 29 May 1919, during a
total solar eclipse. The British astronomers Frank Watson Dyson and Arthur Stanley
Eddington carried out two observation expeditions, one to the West African island
of Principe and the other to the Brazilian town of Sobral. Following the return of the
expeditions, the data were presented by Eddington to the Royal Society of London,
and the news of these experimental results led to worldwide fame for Einstein and his
General Theory of Relativity.
There are two other effects, altering the dynamics of mass particles, that can be
explained within the weak field approximation, and that have been indeed observed:
its axes gradually rotate, in what is called a precession.5 However, once all these
classical contributions are accounted for and subtracted, there remains a discrepancy of
about 43 arcseconds per century, or $5.0 \cdot 10^{-7}$ rad per orbit. The direction of the effect
is in the forward direction: if we could observe the Mercury orbit from above the
ecliptic, watching it move in counterclockwise direction, then the gradual rotation of
the major axis is also counterclockwise. An intuitive way to look at this phenomenon
is: the planet spends more time near perihelion than it is classically predicted and
when it moves away from the sun, the Keplerian orbit is rotated counterclockwise.
This effect, tiny in the case of the gravitation interaction between Sun and Mercury,
is enhanced enormously in the case of a mass in the vicinity of a black hole, that
spends an almost infinite time (due to the slowing of proper time) at its perihelion.
To compute this effect in the GR framework, we consider the equatorial ($\theta = \pi/2$)
motion of a test particle in the field of a spherical object of mass $M$, set at the origin
of our coordinates. We restart from Eqs. 5.49, 5.45 and 5.46, which we repeat here
for convenience, recalling that the derivatives are computed with respect to the proper
time $\tau$ of the massive particle:
$$\left(1 - \frac{R}{r}\right)\dot t = k; \qquad r^2\dot\phi = h$$
$$\left(\frac{du}{d\phi}\right)^2 + u^2 = E + \frac{2GM}{h^2}u + \frac{2GM}{c^2}u^3 \tag{5.55}$$
where $u = 1/r$ and $E \equiv c^2(k^2 - 1)/h^2$.
Compare this with the corresponding equation of Newtonian dynamics:
$$\left(\frac{du}{d\phi}\right)^2 + u^2 = E + \frac{2GM}{h^2}u \tag{5.56}$$
5 Astronomers had hypothesized the existence of an additional planet perturbing Mercury's orbit:
they had computed its mass and orbital radius (inside the orbit of Mercury) and had even
given it a name: Vulcan. Such a planet, obviously, was never found.
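$$\epsilon\tilde u^3 - \tilde u^2 + \frac{2GM}{u_o h^2}\tilde u + \frac{E}{u_o^2} = \epsilon(\tilde u - \tilde u_1)(\tilde u_2 - \tilde u)(\tilde u_3 - \tilde u) = 0 \tag{5.59}$$
where $\tilde u_1$, $\tilde u_2$, $\tilde u_3$ are the roots of the new cubic equation in $\tilde u$. Applying the same
normalization to Eq. 5.55 and using Eq. 5.59, we have
$$\left(\frac{d\tilde u}{d\phi}\right)^2 = \epsilon(\tilde u - \tilde u_1)(\tilde u - \tilde u_2)(\tilde u - \tilde u_3) \tag{5.60}$$
$$\left(\frac{d\tilde u}{d\phi}\right)^2 = (\tilde u - \tilde u_1)(\tilde u_2 - \tilde u)\left[1 - \epsilon(2 + \tilde u)\right] \tag{5.61}$$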
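From the previous equation, we can isolate $d\phi/d\tilde u$ and compute an expansion to first
order in $\epsilon$:
$$\frac{d\phi}{d\tilde u} = \frac{1 + \frac{1}{2}\epsilon(2 + \tilde u)}{\sqrt{(\tilde u - \tilde u_1)(\tilde u_2 - \tilde u)}} = \frac{\frac{1}{2}\epsilon(\tilde u - 1) + 1 + \frac{3}{2}\epsilon}{\sqrt{\beta^2 - (\tilde u - 1)^2}} \tag{5.62}$$
where in the last step we have introduced $\beta = \frac{1}{2}(\tilde u_2 - \tilde u_1)$ and made repeated use of
the relation $\tilde u_1 + \tilde u_2 = 2$.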
By integrating Eq. 5.62 over the interval $[\tilde u_1, \tilde u_2]$, we can compute the angle $\Delta\phi$
between the aphelion and the following perihelion:
$$\Delta\phi = \int_{\tilde u_1}^{\tilde u_2}\frac{\frac{1}{2}\epsilon(\tilde u - 1) + 1 + \frac{3}{2}\epsilon}{\sqrt{\beta^2 - (\tilde u - 1)^2}}\,d\tilde u = \left(1 + \frac{3}{2}\epsilon\right)\pi$$
$2\Delta\phi$ gives the angle between successive perihelia, and we conclude that for
each orbit the perihelion advance is
$$\langle\delta\psi\rangle_{(orbit)} = 3\epsilon\pi = \frac{3GM}{c^2}\left(\frac{1}{r_1} + \frac{1}{r_2}\right)\pi = \frac{6\pi GM}{c^2 a(1 - e^2)} \tag{5.63}$$
In the last expression, we used the relations $r_{1,2} = a(1 \pm e)$ relating apoastron ($r_1$)
and periastron ($r_2$) to the semi-major axis and eccentricity of the ellipse. For Mercury, which
is the planet of our solar system closest to the Sun, Eq. 5.63 predicts an advance of
42.98 arcseconds per century, in excellent agreement with the observed precession.
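The number quoted for Mercury can be checked directly from Eq. 5.63. In the sketch below, the solar gravitational parameter and Mercury's orbital elements are standard reference values supplied as assumptions; `advance_per_orbit` is an illustrative helper name.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter G*M, m^3/s^2 (assumed)
C = 299_792_458.0           # speed of light, m/s


def advance_per_orbit(a, e, gm=GM_SUN):
    """Eq. 5.63: periastron advance per orbit, in radians."""
    return 6.0 * math.pi * gm / (C**2 * a * (1.0 - e**2))


# Mercury's orbital elements (assumed reference values).
A_MERCURY = 5.7909e10       # semi-major axis, m
E_MERCURY = 0.20563         # eccentricity
PERIOD_DAYS = 87.969        # orbital period, days

per_orbit = advance_per_orbit(A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0

print(f"advance per orbit   = {per_orbit:.3e} rad")
print(f"advance per century = {arcsec_per_century:.2f} arcsec")
```

The same function can be reused for any bound orbit in the weak-field regime by changing `a`, `e` and `gm`.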
We conclude by noting that the formula derived above is useful when dealing with
weak solar system effects. If we consider binary systems of two neutron stars, as in the
case of the pulsar PSR 1913+16, the periastron advance is much larger, ∼ 4.2 degrees
per year: that means that, in a single day, we observe the same precession as Mercury's
perihelion accumulates in a century. This is due to a much stronger gravitational field,
and this approximate method is no longer applicable: the periastron advance must
be computed using a fully relativistic approach.
5.6 Gravitoelectromagnetism
In this section, we shall make use of the similarity between the equations of elec-
tromagnetism and gravity outlined in Eq. 5.21 and following. Before venturing into
this analogy, we remark that, from a fundamental point of view, e.m. and gravity,
even in its linearized form, are two very different theories, the main difference being
the equivalence principle: gravitational effects can be locally removed by a choice
of reference frame, the e.m. effects can’t. Therefore, the analogy must just be taken
for a formal likeness of the equations and a useful tool for computation.
We return to Eq. 5.27, which we derived for the metric perturbation in the weak field
limit:
$$\tilde h_{00} = -\frac{4GM}{c^2 r}; \qquad \tilde h_{0j} = -\frac{2G}{c^3}\frac{(\vec r\times\vec J)_j}{r^3}$$
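We can push the analogy between GR and the electromagnetic theory further by
defining a four-component gravitational potential
$$\Phi^{(g)} = \frac{c^2}{4}\tilde h_{00} = -\frac{GM}{r} \tag{5.64}$$
$$A_j^{(g)} = -\frac{c^2}{2}\tilde h_{0j} = \frac{G}{c}\frac{(\vec r\times\vec J)_j}{r^3}$$
and the corresponding fields
$$\vec E^{(g)} = -\nabla\Phi^{(g)} - \frac{1}{2c}\frac{\partial\vec A^{(g)}}{\partial t}, \qquad \vec B^{(g)} = \nabla\times\vec A^{(g)} \tag{5.65}$$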
and the gravitational Lorentz gauge condition, Eq. 5.19, becomes
$$\frac{1}{2}\nabla\cdot\vec A^{(g)} + \frac{1}{c}\frac{\partial\Phi^{(g)}}{\partial t} = 0 \tag{5.66}$$
We remark that the analogy of the e.m. and weak field gravitation is not perfect. The
most relevant difference is a factor 1/2 present in the gravitational Eqs. 5.66. This
factor is a consequence of the GR linear approximation procedure applied to a tensor
theory, i.e. a spin-2 field compared to the classical electrodynamics, a spin-1 field.6
We must also note that we must assume a negative expression for $\Phi^{(g)}$ in order
to preserve the attractive nature of the gravitational interaction.
With these caveats in mind, we end up with a set of equations similar to
Maxwell's (in Gaussian units):
$$\nabla\cdot\vec E^{(g)} = -4\pi G\rho$$
$$\nabla\times\frac{1}{2}\vec B^{(g)} = \frac{1}{c}\frac{\partial\vec E^{(g)}}{\partial t} - \frac{4\pi G}{c}\vec j^{(g)} \tag{5.67}$$
$$\nabla\times\vec E^{(g)} = -\frac{1}{2c}\frac{\partial\vec B^{(g)}}{\partial t}$$
$$\nabla\cdot\vec B^{(g)} = 0$$
Again, despite the apparent similarity with Maxwell's equations, there are differences
that are worth underlining:
6 In other words, this means that the effective gravitomagnetic charge is twice the gravitoelectric
one.
We push the analogy further, recalling the magnetic potential and field generated by
the magnetic moment $\vec\mu$ of a charge current (as stated by Ampère's equivalence), and
deriving the corresponding equations for the gravitomagnetic field generated by the
angular momentum $\vec J$ of a rotating mass current
$$\vec A^{(em)} = \frac{\vec\mu\times\vec r}{r^3}; \qquad \vec A^{(g)} = \frac{G}{c}\frac{\vec r\times\vec J}{r^3} \tag{5.68}$$
$$\vec B^{(em)} = \frac{3\hat r(\hat r\cdot\vec\mu) - \vec\mu}{r^3}; \qquad \vec B^{(g)} = -\frac{G}{c}\frac{3\hat r(\hat r\cdot\vec J) - \vec J}{r^3} \tag{5.69}$$
To complete the analogy, we need the equation of motion for a particle in these
fields, paralleling the role of the Lorentz equation $\vec F = q\left(\vec E + \frac{\vec v}{c}\times\vec B\right)$ in the e.m. case.
To this purpose, we shall evaluate the geodesic Eq. 5.8 in the particular limit of the
space-time around a rotating body, in the WFSM and stationary case, as described
by the metric (Eq. 5.30)
$$ds^2 = \left(1 - \frac{2GM}{c^2 r}\right)c^2dt^2 - \left(1 + \frac{2GM}{c^2 r}\right)dr^2 - r^2d\theta^2 - r^2\sin^2\theta\,d\phi^2 - \frac{4GJ}{c^2 r}\sin^2\theta\,d\phi\,dt$$
In agreement with our WFSM assumptions ($R \ll r$ and $v \ll c$), we can assume the
interval $ds^2$ to be dominated by the time-like term: $ds \simeq \sqrt{g_{00}}\,c\,dt$. As a consequence,
we can take $dx^i/ds \simeq v^i/c$ and neglect terms in $v^i v^h/c^2$. Stationarity of the metric
also implies $g_{\alpha\beta,0} = 0$. With these approximations, after some algebra, the geodesic
equation takes the form:
$$\frac{d^2\vec{r}}{dt^2} = -\nabla\Phi^{(g)} + \frac{2}{c}\,\vec{v}\times\left(\nabla\times\vec{A}^{(g)}\right) \qquad (5.70)$$
echoing, again, the equation of motion of a charged particle in the electromagnetic
field. The first term on the right-hand side is the Newtonian interaction, responsible
for “Keplerian orbits”, while the second, a small perturbation O(v/c), represents a
precession of the angular momentum, as discussed in the next section.
As an interesting application of this analogy, we consider the precession of the
angular momentum of a charged particle orbiting about a magnetic dipole and that
of the angular momentum S of the massive particle orbiting in the gravitomagnetic
field B (g) .
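In an external magnetic field $\vec{B}^{(em)}$, a magnetic dipole $\vec{\mu}$ feels a torque $\vec{\mu}\times\vec{B}$ and
precesses around the $\hat{B}$ direction (Larmor precession). In a similar way, a torque $\vec{M}$
is applied to the orbiting massive particle
$$\vec{M} = \frac{d\vec{S}}{dt} = \frac{1}{c}\,\vec{S}\times\vec{B}^{(g)} \equiv \vec{\Omega}_P\times\vec{S} \qquad (5.71)$$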
Table 5.2 Precession mechanisms in General Relativity. The effects are referred, for definiteness,
to the system where they were first observed but, obviously, hold for any gravitationally bound
system

Effect name            Source               Interacting with      Measured on
Schwarzschild          Sun mass M           Planet orbital L      Mercury orbit
De Sitter (geodetic)   Earth orbital L⊕     Gyroscope spin S      Earth + satellite
Lense-Thirring         Earth spin J⊕        Satellite orbital L   LAGEOS (e.g.) orbits
Pugh-Schiff            Earth spin J⊕        Gyroscope spin S      GP-B space experiment
$$\vec{\Omega}_P = \frac{G}{c^2 r^3}\left[3(\vec{J}\cdot\hat{r})\hat{r} - \vec{J}\right] \qquad (5.72)$$
This formalism is the starting point of Sect. 5.7 where we discuss and review the
experimental measurements of the gravitomagnetic phenomena.
Gravitomagnetism deals a lot with spin-spin interactions: to keep notation straight,
here and in the following section we use J for the spin of the mass generating the
gravitational field, and S or L for the angular momentum (spin or orbital) of the
orbiting test object. This distinction will obviously fail when dealing with binary
systems of comparable mass and angular momentum, as in Chap. 12.
As we have seen in previous sections, the motion of celestial (and man-made) bodies
is subject, in GR, to a number of effects that have no correspondence in Newtonian
gravity. We shall discuss here the precessions, that affect planets, satellites and stars,
orbiting a massive central source.
Schiff pointed out (Schiff 1960) that precessions are the only measurements that
can really test General Relativity in weak field; both the red shift and the deflection
of light (Shapiro time delay was yet to be discovered) can be deduced on the basis
of the equivalence principle + special relativity (Table 5.2).
We have already encountered in Sect. 5.5 the precession of periastron in the field of
a central source mass
$$\langle\delta\psi\rangle_{(Schwarz)} = \frac{6\pi G M}{c^2 a(1 - e^2)}$$
The case of Mercury is the most famous, but any orbiting body is subject to this effect.
Recent observations have shown Schwarzschild precession in the star S2 orbiting
Fig. 5.2 2D analogy of the parallel transport of a vector along a geodesic: we start in position A and
parallel transport the vector to N along the meridian A − N (green vectors). If, instead, we move
the vector to position B along the parallel A − B, and then to N along another meridian B − N ,
the end vector (red) is rotated with respect to the previous transport. Credits: https://i.stack.imgur.com/H2xCM.png
the galactic black hole Sgr A∗ (The Gravity Collaboration 2020). The precession has
also been accurately measured for geodetic satellites orbiting the Earth (Lucchesi
and Peron 2010). We note that this precession changes the angle ω defining the
periastron: it is a rotation of the axes of the ellipse, and does not change the orbital plane:
it is an in-plane effect. Moreover, we remark that it is a gravitoelectric effect, as
it depends on the mass M of the central source (in general, on the gravitoelectric
potential $\Phi^{(g)}$) and not on its rotation.
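As a quick numerical check of the periastron formula quoted above, the following sketch evaluates it for Mercury. The orbital elements and the solar gravitational parameter are standard approximate values inserted here as assumptions (they are not given in the text), and the function name is hypothetical.

```python
import math

GM_SUN = 1.32712e20       # solar gravitational parameter, m^3 s^-2 (assumed)
C = 2.99792458e8          # speed of light, m/s
A_MERCURY = 5.7909e10     # semi-major axis of Mercury, m (assumed)
E_MERCURY = 0.2056        # eccentricity of Mercury (assumed)
T_MERCURY_DAYS = 87.969   # orbital period of Mercury, days (assumed)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def periastron_advance(gm, a, e):
    """Advance per orbit, in rad: 6 pi G M / (c^2 a (1 - e^2))."""
    return 6.0 * math.pi * gm / (C**2 * a * (1.0 - e**2))


per_orbit = periastron_advance(GM_SUN, A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / T_MERCURY_DAYS
arcsec_per_century = per_orbit * orbits_per_century * ARCSEC_PER_RAD
print(f"{per_orbit:.3e} rad/orbit, {arcsec_per_century:.1f} arcsec/century")
```

The result, about 5 · 10⁻⁷ rad per orbit, accumulates to roughly 43 arcsec per century.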
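$$\frac{d\vec{S}}{dt} = \vec{\Omega}_{DS}\times\vec{S}; \qquad \text{with} \quad \vec{\Omega}_{DS} \equiv \frac{3}{2c^2}\,\nabla\Phi^{(g)}\times\vec{v} \qquad (5.73)$$
and we leave a somewhat technical derivation to the next section.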
The angular momentum of a spinning gyroscope orbiting a central source mass
experiences a precession around a vector normal to the orbital plane. After one
complete orbit, the in-plane component $S^\phi$ does not return to the initial value + 2π, but
5.7 Spin Precession in Gravitation 99
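$$\frac{d}{dt}\phi_{DS} = -\frac{3}{2c^2}\,\frac{(GM)^{3/2}}{r^{5/2}} \qquad (5.75)$$
Finally, recalling that the orbital velocity is $v = \sqrt{GM/r}$, we can re-express the
above rate as
$$\frac{d}{dt}\phi_{DS} = -\frac{3}{2c^2}\,\frac{GM}{r^2}\,v \qquad (5.76)$$
The precession takes place in the orbital $(\hat{r}, \hat{\phi})$ plane, i.e. around the vector
$$\vec{\Omega}_{DS} = \frac{3}{2c^2}\,\frac{GM}{r^2}\,\hat{r}\times\vec{v} = \frac{3}{2c^2}\,\nabla\Phi^{(g)}\times\vec{v} \qquad (5.77)$$
normal to that plane. The time evolution of the spin $\vec{S}$ is then described by
$$\frac{d\vec{S}_{DS}}{dt} = \frac{3}{2c^2}\left(\nabla\Phi^{(g)}\times\vec{v}\right)\times\vec{S} \qquad (5.78)$$
Just as a curiosity, we can identify $\vec{r}\times\vec{v}$ in Eq. 5.77 as the orbital angular momentum
$\vec{L}/m$ of the gyroscope and write the precession in yet another way:
$$\frac{d\vec{S}_{DS}}{dt} = \frac{3GM}{2mc^2r^3}\,\vec{L}\times\vec{S} \qquad (5.79)$$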
that is suggestive of a spin-orbit interaction, like in atomic systems.
We have seen that the angular momentum of a spinning and orbiting body pre-
cesses around the radial direction. We discuss in the next section the experimental
confirmations obtained by the GP-B experiment, with a spinning quartz sphere in
orbit close to the Earth surface: Eq. 5.75 predicts a tiny precession of 30 µrad/year,
or 6 arcsec/year.
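The order of magnitude quoted above can be checked with a short sketch that evaluates the magnitude of the rate of Eq. 5.75 for a circular orbit. The Earth's gravitational parameter, its mean radius and the 642 km altitude of GP-B (mentioned later in the chapter) are inserted as assumptions; oblateness and other corrections are ignored.

```python
import math

GM_EARTH = 3.986004e14      # Earth gravitational parameter, m^3 s^-2 (assumed)
R_EARTH = 6.371e6           # mean Earth radius, m (assumed)
ALTITUDE = 6.42e5           # GP-B altitude, m
C = 2.99792458e8            # speed of light, m/s
SECONDS_PER_YEAR = 3.15576e7
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def de_sitter_rate(gm, r):
    """Magnitude of Eq. 5.75, in rad/s: (3/2) (GM)^(3/2) / (c^2 r^(5/2))."""
    return 1.5 * gm**1.5 / (C**2 * r**2.5)


r_orbit = R_EARTH + ALTITUDE
rate_rad_per_year = de_sitter_rate(GM_EARTH, r_orbit) * SECONDS_PER_YEAR
rate_urad_per_year = rate_rad_per_year * 1e6
rate_arcsec_per_year = rate_rad_per_year * ARCSEC_PER_RAD
print(f"{rate_urad_per_year:.1f} urad/yr = {rate_arcsec_per_year:.2f} arcsec/yr")
```

Under these assumptions the sketch gives about 32 µrad/year, i.e. about 6.6 arcsec/year, in line with the rounded figures in the text.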
However, we can consider the Earth + any of its satellites as a gigantic spinning
gyro that moves in the field of the Sun. S is, in this case, the angular momentum
of the satellite orbiting the Earth, while L refers to the Earth (+ satellite, negligible)
moving around the Sun: as S precesses, the satellite's orbital plane around the
Earth changes, with a change in the Right Ascension of the Ascending Node (RAAN), the
Keplerian parameter defined in Sect. 1.1.
A word of caution is in order about notation: the Right Ascension (or Longitude, in
terrestrial coordinates) of Ascending Node is an angle that astronomers traditionally
label with the symbol Ω. To avoid confusion between the RAAN and the precession
frequency Ω of Eqs. 5.72 and 5.77, we shall use in this chapter the acronym RAAN
instead of Ω.
Clearly, the effect is the same for any satellite orbiting the Earth: Eq. 5.75 predicts,
using the Sun as the source ($M = M_\odot$, $r = 1$ AU), a precessional rate
$$\Omega_{DS} = \frac{3}{4}\,\frac{R}{r}\,\frac{2\pi}{T} = 93\ \text{nrad/yr} = 19.2\ \text{marcsec/yr} \qquad (5.80)$$
We should consider, however, that this precession takes place, as mentioned above,
around a direction normal to the ecliptic; observing it in an Earth-based reference
frame, the RAAN rate we measure is the component in the equatorial plane:
$$\frac{d}{dt}\,\mathrm{RAAN} = \Omega_{DS}\cdot\cos(\varepsilon) = 17.6\ \text{mas/yr} \qquad (5.81)$$
where $\varepsilon$ = 23°27′ is the inclination of the equator on the ecliptic. This effect was first
observed on the precession of the Moon perigee (Bertotti et al. 1987), using Lunar
Laser Ranging, with an accuracy of 10%, later improved by other experiments, to
the current level of agreement, between theory and measurement, of 0.2%.
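The two numbers of Eqs. 5.80 and 5.81 can be reproduced with a few lines. In this sketch we assume that R is the Schwarzschild radius of the Sun, $2GM_\odot/c^2$, and that T is one year; the numerical constants are standard approximate values not taken from the text.

```python
import math

GM_SUN = 1.32712e20             # solar gravitational parameter, m^3 s^-2 (assumed)
C = 2.99792458e8                # speed of light, m/s
AU = 1.495978707e11             # astronomical unit, m
OBLIQUITY_DEG = 23.0 + 27.0 / 60.0   # 23 deg 27 arcmin, as in the text
MAS_PER_RAD = 180.0 * 3600.0 * 1000.0 / math.pi

r_schwarzschild = 2.0 * GM_SUN / C**2            # about 2.95 km
# Eq. 5.80 with T = 1 yr, so the result is directly in rad/yr
omega_ds = 0.75 * (r_schwarzschild / AU) * 2.0 * math.pi
omega_ds_mas = omega_ds * MAS_PER_RAD
# Eq. 5.81: projection by the cosine of the obliquity
raan_rate_mas = omega_ds_mas * math.cos(math.radians(OBLIQUITY_DEG))
print(f"{omega_ds * 1e9:.1f} nrad/yr = {omega_ds_mas:.1f} mas/yr")
print(f"RAAN rate = {raan_rate_mas:.1f} mas/yr")
```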
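where $\omega = d\phi/dt$ is the orbital coordinate (i.e. measured from far away) angular
frequency. Using the normalization condition $g_{\mu\nu}u^\mu u^\nu = c^2$, we find that there is a
centrifugal contribution to coordinate time:
$$\left(\frac{dt}{d\tau}\right)^2 = \left(1 + \frac{r^2\omega^2}{c^2}\right)\left(1 - \frac{R}{r}\right)^{-1} \simeq \left(1 - \frac{R}{r} - \frac{r^2\omega^2}{c^2}\right)^{-1}$$
$$\Longrightarrow \quad \frac{dt}{d\tau} = \left(1 - \frac{3R}{2r}\right)^{-1/2} \equiv A \qquad (5.83)$$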
to first order in $R/r$, and where Kepler’s law $\omega^2 = GM/r^3$ was used in the last
expression. We further simplify the analysis by taking the initial spin angular momentum
in the $\hat{r}$ direction ($\vec{S}_{in} = [0, S, 0, 0]$), and solve Eq. 5.82 for the time evolution
of all its components $\vec{S}(t) = [S^0, S^r, S^\theta, S^\phi]$. We find the time component of $S^\mu$
with the help of the above-mentioned orthogonality condition:
$$g_{\mu\nu}S^\mu u^\nu = c^2\left(1 - \frac{R}{r}\right)u^0 S^0 - r^2\, u^\phi S^\phi = 0 \qquad (5.84)$$
that yields
$$S^0 = \frac{r^2\omega}{c^2}\left(1 - \frac{R}{r}\right)^{-1} S^\phi$$
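$$\frac{dS^r}{d\tau} + \Gamma^r_{00}\,S^0 u^0 + \Gamma^r_{\phi\phi}\,S^\phi u^\phi = 0$$
$$\frac{dS^\theta}{d\tau} = 0; \qquad \frac{dS^\phi}{d\tau} + \Gamma^\phi_{r\phi}\,S^r u^\phi = 0 \qquad (5.85)$$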
we see that, as expected, $S^\theta$ is null at all times; we end up with two coupled, first-order
differential equations for the time derivatives of the $S^r$, $S^\phi$ components of the spin.
Computing the Christoffel symbols for the Schwarzschild metric is a tedious but
straightforward calculation: they can also be found in many textbooks of GR (see,
e.g. Hartle 2003) and in Mathematica® notebooks; those relevant for our equatorial
problem are
$$\Gamma^{0}_{r0} = \frac{GM}{c^2r^2}\left(1 - \frac{R}{r}\right)^{-1}, \qquad \Gamma^{r}_{00} = \frac{GM}{c^2r^2}\left(1 - \frac{R}{r}\right) \qquad (5.86)$$
$$\Gamma^{r}_{\phi\phi} = -r\left(1 - \frac{R}{r}\right), \qquad \Gamma^{\phi}_{r\phi} = \frac{1}{r}$$
Substituting these into Eq. 5.85 and recalling the relation Eq. 5.83, we find
$$\frac{dS^r}{d\tau} - \frac{r\omega}{A}\,S^\phi = 0 \qquad (5.87)$$
$$\frac{dS^\phi}{d\tau} + \frac{A\omega}{r}\,S^r = 0 \qquad (5.88)$$
It is immediate, by eliminating either variable from the system, to obtain, for both
components, the equation of a simple harmonic oscillator with angular frequency ω.
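For completeness, the elimination step can be written out explicitly. The following is only a sketch of the algebra implied by Eqs. 5.87 and 5.88, treating $r$, $\omega$ and $A$ as constants along the circular orbit:

```latex
\begin{align*}
\frac{d^2 S^r}{d\tau^2}
  &= \frac{r\omega}{A}\,\frac{dS^\phi}{d\tau}
   = \frac{r\omega}{A}\left(-\frac{A\omega}{r}\,S^r\right)
   = -\omega^2 S^r , \\
\frac{d^2 S^\phi}{d\tau^2}
  &= -\frac{A\omega}{r}\,\frac{dS^r}{d\tau}
   = -\frac{A\omega}{r}\left(\frac{r\omega}{A}\,S^\phi\right)
   = -\omega^2 S^\phi .
\end{align*}
```

The factors $A$ and $r$ cancel in the product of the two coupling coefficients, which is why the proper-time frequency is exactly $\omega$.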
The solution of these equations, with initial conditions $\vec{S} = S\hat{r}$ at $t = 0$, is
$$S^r = \frac{C}{A}\cos(\omega\tau) \qquad S^\phi = \frac{C}{r}\sin(-\omega\tau) \qquad (5.89)$$
where C is a constant of integration. We also see that the phases of $S^r$ and of $S^\phi$ evolve
in opposite directions with time. These equations, expressed in terms of proper time
and LFI, are not very helpful, because we are interested in the spin evolution as seen
from large distance, in a coordinate reference frame. We then recall the relation, Eq. 5.83,
between t and τ to re-express
$$\omega\tau = \frac{\omega t}{A} = \left(1 - \frac{3R}{2r}\right)^{1/2}\cdot\omega t \equiv \omega' t \qquad (5.90)$$
$$S^r(t) = \frac{C}{A}\cos(\omega' t) \qquad S^\phi(t) = \frac{C}{Ar}\left(\frac{\omega'}{\omega}\right)\sin(-\omega' t) \qquad (5.91)$$
We have found that, even if we start with the spin aligned in the $\hat{r}$ direction, an $S^\phi$
component will arise during the revolution, although reduced by a factor $\omega'/\omega$. At
completion of a revolution, after a time $T = 2\pi/\omega$, the phase of the component $S^\phi$ has
grown to $\omega' T = 2\pi\left(1 - \frac{3R}{2r}\right)^{1/2}$. Therefore, after one complete orbit, the phase of $S^\phi$
does not return to 2π, but there is a small (negative) extra angle, a precession:
$$\phi = \omega' T - 2\pi = 2\pi\left[\left(1 - \frac{3R}{2r}\right)^{1/2} - 1\right] \simeq -\frac{3\pi G M}{c^2 r} \qquad (5.92)$$
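A short numerical sketch shows how good the first-order expansion in Eq. 5.92 is for a low Earth orbit. Here R is taken to be the Schwarzschild radius $2GM/c^2$ of the Earth, and the orbit radius is an assumed GP-B-like value; both function names are hypothetical.

```python
import math

GM_EARTH = 3.986004e14   # Earth gravitational parameter, m^3 s^-2 (assumed)
C = 2.99792458e8         # speed of light, m/s
R_ORBIT = 7.013e6        # assumed orbit radius (about 642 km altitude), m


def extra_angle_exact(gm, r):
    """Per-orbit angle of Eq. 5.92 before expanding: 2 pi [(1 - 3R/2r)^(1/2) - 1]."""
    r_s = 2.0 * gm / C**2   # Schwarzschild radius R
    return 2.0 * math.pi * (math.sqrt(1.0 - 1.5 * r_s / r) - 1.0)


def extra_angle_first_order(gm, r):
    """First-order form of Eq. 5.92: -3 pi G M / (c^2 r)."""
    return -3.0 * math.pi * gm / (C**2 * r)


exact = extra_angle_exact(GM_EARTH, R_ORBIT)
approx = extra_angle_first_order(GM_EARTH, R_ORBIT)
rel_diff = abs(exact - approx) / abs(approx)
print(f"exact = {exact:.4e} rad/orbit, first order = {approx:.4e} rad/orbit")
```

Since $R/r \sim 10^{-9}$ here, the two expressions agree to within floating-point noise, at a few nanoradians per orbit.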
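$$\frac{d\vec{S}}{d\tau} = \vec{\Omega}_{LT}\times\vec{S} = \frac{G}{c^2 r^3}\left[3(\vec{J}\cdot\hat{r})\hat{r} - \vec{J}\right]\times\vec{S} \qquad (5.93)$$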
Fig. 5.4 The Earth, spinning with angular velocity ω and angular momentum J, creates a
gravitomagnetic field with field lines analogous to those of a magnetic dipole. A gyroscope with spin
angular momentum S orbits the Earth along a polar geodesic orbit (thick continuous line). The
Lense-Thirring effect produces a precession of S around the direction given by the field lines of $\vec{B}^{(g)}$.
Credits: courtesy of C. Lämmerzahl
frames: the central body drags, in this picture, the space-time along its rotation, in
a sort of frictional coupling (without dissipation !). Distortion of the metric thus
gives rise to the LT precession. The orbits of satellites are affected by this relativistic
phenomenon, hard to detect because it is masked by numerous other effects (Fig. 5.5).
We first consider the LT effect in the case where the angular momentum of the Test
Mass is orbital: for a satellite of negligible dimensions and mass m, in a circular orbit
5.8 Lense-Thirring Measurements on the Orbits of the LAGEOS … 105
of radius r around the Earth, with the help of Kepler’s third law
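$$\vec{L} = m\,\vec{r}\times\vec{v} = m\sqrt{G M_\oplus r}\;\hat{L} \qquad (5.94)$$
$$\langle\vec{\Omega}_{LT}\rangle = \frac{G}{r^3c^2}\,\langle 3(\hat{r}\cdot\vec{J})\hat{r} - \vec{J}\rangle = \frac{3GJ}{2r^3c^2}\left[\cos i\,\sin i,\ 0,\ \sin^2 i - \frac{2}{3}\right] \qquad (5.95)$$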
The averaged precession depends on the orbit inclination i:
– For a polar orbit, $i = \pi/2$, we find $\langle\vec{\Omega}_{LT}\rangle = \frac{GJ}{2c^2r^3}\,\hat{z}$
– For an equatorial orbit, $i = 0$, we have $\langle\vec{\Omega}_{LT}\rangle = -\frac{GJ}{c^2r^3}\,\hat{z}$ but, as the satellite
angular momentum is also directed along $\hat{z}$, no precession takes place.
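The orbit average in Eq. 5.95 can be verified numerically. The sketch below works in units where $GJ/(r^3c^2) = 1$, takes $\vec{J}$ along $\hat{z}$, and parametrises a circular orbit of inclination i by two orthonormal in-plane vectors; this particular choice of axes is our assumption, made so that the result can be compared with the bracket in Eq. 5.95.

```python
import math


def averaged_bracket(inclination, n_steps=3600):
    """Orbit average of 3 (r_hat . z_hat) r_hat - z_hat over a circular orbit."""
    e1 = (0.0, 1.0, 0.0)                                     # in-plane unit vector
    e2 = (math.cos(inclination), 0.0, math.sin(inclination))  # second in-plane vector
    total = [0.0, 0.0, 0.0]
    for k in range(n_steps):
        u = 2.0 * math.pi * k / n_steps          # orbital phase
        r_hat = [math.cos(u) * a + math.sin(u) * b for a, b in zip(e1, e2)]
        proj = r_hat[2]                          # r_hat . z_hat
        for idx in range(3):
            total[idx] += 3.0 * proj * r_hat[idx]
    avg = [t / n_steps for t in total]
    avg[2] -= 1.0                                # subtract z_hat (J direction)
    return avg


def closed_form(inclination):
    """(3/2) [cos i sin i, 0, sin^2 i - 2/3], the bracket of Eq. 5.95 with its prefactor."""
    s, c = math.sin(inclination), math.cos(inclination)
    return [1.5 * c * s, 0.0, 1.5 * (s * s - 2.0 / 3.0)]


polar = averaged_bracket(math.pi / 2.0)
equatorial = averaged_bracket(0.0)
print("polar:", [round(x, 6) for x in polar])
print("equatorial:", [round(x, 6) for x in equatorial])
```

The polar and equatorial cases reproduce the factors 1/2 and −1 of the two limiting cases listed above.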
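Consider now a satellite in a polar orbit around the Earth: in one period of
revolution $T = 2\pi\sqrt{r^3/GM_\oplus}$ the Ascending Node moves by
$$\delta\Omega_L = T\,\left|\langle\vec{\Omega}_{LT}\rangle\times\hat{L}\right| = \frac{\pi J_\oplus}{c^2}\sqrt{\frac{G}{M_\oplus r^3}} = 4.3\cdot 10^{-11}\ \text{rad/orbit}$$
a really tiny amount (the numerical value refers to $r \simeq R_\oplus$). Fortunately, this effect accumulates with time so that, for a
satellite in LEO (Low Earth Orbit, $r \simeq R_\oplus$), in 1 year the orbital plane rotates by:
$$\frac{\Delta\,\mathrm{RAAN}}{\Delta t} = 0.25\ \mu\text{rad/year}$$
i.e. a displacement of about 1.6 m at the Earth's radius. It will take 23 million years to complete a rotation of the orbital plane.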
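These orders of magnitude can be reproduced with a short sketch based on the polar-orbit average $GJ/(2c^2r^3)$ given above. The value of the Earth's spin angular momentum (`J_EARTH`) is an assumption not quoted in the text, and the orbit radius is simply set equal to the mean Earth radius, so the results should only be compared with the text at the 10% level.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s
GM_EARTH = 3.986004e14   # Earth gravitational parameter, m^3 s^-2 (assumed)
J_EARTH = 5.86e33        # assumed Earth spin angular momentum, kg m^2 s^-1
R_EARTH = 6.371e6        # mean Earth radius, m (assumed)
SECONDS_PER_YEAR = 3.15576e7

r = R_EARTH                                          # LEO idealised as r = R_earth
omega_lt = G * J_EARTH / (2.0 * C**2 * r**3)         # polar-orbit average, rad/s
period = 2.0 * math.pi * math.sqrt(r**3 / GM_EARTH)  # Kepler's third law, s
per_orbit = omega_lt * period                        # rad/orbit
per_year = omega_lt * SECONDS_PER_YEAR               # rad/yr
shift_metres = per_year * R_EARTH                    # arc length at Earth's radius
years_per_turn = 2.0 * math.pi / per_year
print(f"{per_orbit:.2e} rad/orbit, {per_year * 1e6:.2f} urad/yr")
print(f"{shift_metres:.1f} m/yr, {years_per_turn / 1e6:.0f} Myr per full turn")
```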
We have derived these results in the GEM framework, by analogy with the field
of a magnetic moment. The same results can be obtained, in a more formal way, by
computing the change of direction of the L vector under parallel transport in the field
of a rotating mass. The LT effect is responsible for precession of both the longitude
of the ascending node and of the argument ω of the perigee.
7 We can also evaluate it in a frame with the orbit on the x − y plane and S along $\hat{z}$; we would find
$\langle\vec{\Omega}_{LT}\rangle = \frac{GJ}{r^3c^2}\left[\frac{1}{2}\sin i,\ 0,\ -\cos i\right]$.
Fig. 5.6 Left: Picture of the LAGEOS II: the satellite is composed of two aluminium hemispheres,
60 cm diameter and 117 kg total mass, hosting an internal Brass core of 190 kg. 43% of the surface is
covered by 426 corner cube retroreflectors (CCR): 422 made of Silicon and 4 made of Germanium,
to reflect infrared beams; each CCR is 1.9 cm in diameter and 33.2 g in mass. Image credits:
NASA/GSFC. Right: schematic diagram of the orbits of the two LAGEOS satellites
The ideal condition to measure the precession of the line of nodes would be to
have a pair of satellites in orbits with supplementary inclinations i: in this
way, one would obtain perfect cancellation of the dominant, classical precession.
Unfortunately, these satellites were conceived and launched for other (geodetic)
applications (the GR measurements are a sort of free bonus), and a different
inclination was eventually chosen for LAGEOS-2. Nevertheless, I. Ciufolini and E. Pavlis
in 2004 combined the data to extract the classical precession effect. Using the
measurements of the Right Ascension of the Ascending Node of the two satellites, and a
sophisticated orbit-tracing program, they solved a system of equations with two
unknowns: the LT precession and $\bar{C}_{2,0}$.
This was possible thanks to a detailed model of the Newtonian field, based on
geodetic surveys of satellites like CHAMP and GRACE, as mentioned in Sect. 1.6.
With this technique, Ciufolini and Pavlis verified the relativistic precession to coin-
cide with the prediction of GR within an error bar of 6% (Ciufolini and Pavlis 2004).
On February 13, 2012, a third satellite, named LARES, was launched by ASI,
the Italian Space Agency, on the inaugural flight of the VEGA launcher, on an orbit
actually supplementary to that of LAGEOS, but at a much lower altitude: about
1450 km above the Earth surface, versus 5800 km for the two LAGEOS. Unlike its
predecessors, LARES is composed of one solid sphere of Tungsten.
Observation of the orbits of three satellites allows one to remove from the problem the
errors contributed by both the quadrupole and the octupole moment, thus refining
the measurement. Recent analyses (Ciufolini et al. 2016; Lucchesi et al. 2020) have
thus reduced the uncertainty on the LT measurement to 5% and 1% respectively.
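The precession of a gyroscope that is, at the same time, orbiting and spinning in the
field of a rotating central mass is known as the Pugh-Schiff effect. This problem was
examined by L. Schiff in a famous two-page paper (Schiff 1960). He considered at
the same time three precessions of the spin $\vec{S}$:
$$\frac{d\vec{S}}{dt} = \left(\vec{\Omega}_{DS} + \vec{\Omega}_{LT} + \vec{\Omega}_{Th}\right)\times\vec{S} \qquad (5.96)$$
where $\vec{\Omega}_{DS}$ and $\vec{\Omega}_{LT}$ are the de Sitter and Lense-Thirring precession rates, as given
by Eqs. 5.77 and 5.95:
$$\vec{\Omega}_{DS} = \frac{3GM}{2c^2r^3}\,[\vec{r}\times\vec{v}] \qquad \vec{\Omega}_{LT} = \frac{G I_\oplus}{c^2 r^3}\,\left[3\hat{r}(\hat{r}\cdot\vec{\omega}_\oplus) - \vec{\omega}_\oplus\right]$$
where $I_\oplus\vec{\omega}_\oplus = \vec{J}_\oplus$ and $\vec{v}$ is the satellite velocity. $\vec{\Omega}_{Th}$ is a special relativity effect,
similar to Thomas precession in atomic physics, due to interaction with any
non-gravitational force $\vec{F}$:
$$\vec{\Omega}_{Th} = \frac{\vec{F}\times\vec{v}}{2mc^2}$$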
Such force is, by necessity, present in any laboratory experiment: this is the driv-
ing motivation for measuring the first two precessions in an orbiting, free-falling
experiment.
The rotation frequency of a gyro is constant, and the spinning particle behaves
like a clock. This frequency exhibits Doppler and gravitational shifts when observed
from outside, just like that of a more conventional clock.
Computation of the resulting precession in the general case is quite involved, but
a manageable result can be obtained if we choose a polar orbit ($\vec{v}$, $\vec{r}$, $\vec{\omega}_\oplus$ in the same
plane) and the radial initial direction for $\vec{S}$.
An intuitive explanation of the precession can be given in Schiff’s words (Schiff
1960): ... think of the moving earth as “dragging” the metric with it to some extent.
At the poles, this tends to drag the spin around in the same direction as the rotation
of the earth. But at the equator, since the gravitational field falls off with increasing
r, the side of the spinning particle nearest the earth is dragged more than the side
away from the earth, so that the spin precesses in the opposite direction.
W.M. Fairbank accepted the challenge and started the experiment Gravity Probe
B (GP-B): the development, carried out at Stanford University under the lead of F.
Everitt, lasted over 45 years.
The goals of GP-B, for an altitude of 642 km above the Earth surface, were
– the measurement of $\Omega_{LT}$ = 0.041 arcsec/year, due to the Earth's rotation, with a nominal
precision of 1%
– the measurement of $\Omega_{DS}$ = 6.6 arcsec/year with precision ∼$10^{-5}$
Clearly the LT precession, being 160 times smaller than the DS one, would stand
no hope of being detectable: nevertheless, with the chosen orbit and spin direction,
the two effects are orthogonal: the resulting precessions of the gyroscope spin in two
orthogonal directions can thus be separated and detected (Fig. 5.7).
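The two GP-B design values, and their ratio, can be estimated from the formulas already introduced: the magnitude of Eq. 5.75 for the de Sitter rate and the polar-orbit average $GJ/(2c^2r^3)$ for the Lense-Thirring rate. This is only a sketch for a spherical Earth on a circular orbit; the Earth's spin angular momentum, gravitational parameter and mean radius are assumed values not given in the text.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s
GM_EARTH = 3.986004e14   # Earth gravitational parameter, m^3 s^-2 (assumed)
J_EARTH = 5.86e33        # assumed Earth spin angular momentum, kg m^2 s^-1
R_ORBIT = 6.371e6 + 6.42e5   # assumed mean Earth radius + 642 km altitude, m
SECONDS_PER_YEAR = 3.15576e7
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

to_arcsec_per_year = SECONDS_PER_YEAR * ARCSEC_PER_RAD

# de Sitter (geodetic) rate, magnitude of Eq. 5.75
omega_ds = 1.5 * GM_EARTH**1.5 / (C**2 * R_ORBIT**2.5) * to_arcsec_per_year
# Lense-Thirring rate averaged over a polar orbit
omega_lt = G * J_EARTH / (2.0 * C**2 * R_ORBIT**3) * to_arcsec_per_year
ratio = omega_ds / omega_lt
print(f"DS = {omega_ds:.2f} arcsec/yr, LT = {omega_lt:.3f} arcsec/yr, ratio = {ratio:.0f}")
```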
The satellite was launched on April 20, 2004, (the project had begun in 1961) and
placed on a polar orbit at 642 km altitude. It contained a superfluid He (T = 1.8 K)
cryostat with 4 spherical quartz rotors (3.8 cm in diameter), coated with a supercon-
ducting Niobium layer (Niobium becomes superconducting at temperatures Tc < 9
K). Each of these small spheres was spun by He gas jets, and became a superconduct-
ing gyroscope, with a magnetic moment parallel to their angular momentum. The
tiny magnetic fields produced by the rotors were picked up by superconducting coils
coupled with SQUID magnetometers (see Sect. 8.2.5), based on Josephson effect.
When the spheres precessed, a current change was induced in the coils and
measured through the SQUIDs. In order to detect both orthogonal precession signals,
the whole spacecraft, and the coils with it, was continuously rotated around its axis.
The satellite was also equipped with a telescope aimed at the distant star IM Pegasi,
which constituted the basic element of a control system that commanded the
satellite trim correctors. The aiming system was based on VLBI measurements of the
IM Pegasi motion relative to distant quasars. In this way, a co-moving reference
system was realized, having axes with fixed orientation with respect to the distant
stars. Precessions of the gyroscopes were measured with respect to this star-fixed
reference.
Data were collected during the period August 28, 2004–August 14, 2005, when all the
liquid Helium had finally evaporated and the rotors were no longer superconducting.
The data were analysed for a long time, trying to achieve the announced goal of
verifying the predicted LT precession effect to a 1% precision. Among the many
problems faced, we mention the effort to subtract the residual effect due to the electric
dipole of the spheres: apparently, in the process of evaporating niobium on quartz,
the level of uniformity of the metal layer was insufficient to prevent the formation
of “charge patches” that produced a residual electric dipole moment. The detailed
modelling of this electric dipole proved considerably difficult, and it affected the
error on the measurements of the precession effect.
After long and intense efforts, the experimental results were officially announced
in May 2011 (Everitt et al. 2011): the measured precessions were in accordance with
General Relativity, although with larger uncertainties than expected:
Despite the partial accomplishment of the initial goals, GP-B was a very success-
ful experiment: it did verify the Schiff precession, and tested, for the first time in
space, cryogenics, superconductivity, attitude control by microthrusters and other
sophisticated techniques.
The Earth-Moon system can also be seen as a gyroscope with its axis perpendicular
to the Moon orbital plane. The de Sitter precession on this system is about 2 arcsec
per century. The Earth-Moon system also exhibits a gravitomagnetic precession. The
Earth, moving around the Sun, produces a gravitomagnetic field, and the Moon moves
through this field, experiencing a Lorentz-like force perpendicular to both its velocity
and the gravitomagnetic field. This effect produces ∼6 m/yr amplitude terms in the
Moon’s orbit as evaluated in the Solar System Barycenter (SSB) frame.
The quality of Lunar Laser Ranging (see Sect. 1.9.2) data has settled to the level
of ∼2 cm range uncertainty, so that the de Sitter effect has been measured by LLR
(Turyshev and Williams 2007) within 0.6% accuracy.
Analysis of the relative motion of the Earth-Moon system has provided a success-
ful probe to explore many other gravitational effects (Merkowitz 2010):
– the best test, to date, of the Strong Equivalence Principle: $\eta_s < 4.5\cdot 10^{-4}$
– the best test, to date, of the time constancy of Newton’s gravitational constant:
$\dot{G}/G < 10^{-12}\ \text{yr}^{-1}$
– a test of the Inverse Square Law: $\alpha < 10^{-10}$ at the $\lambda \sim 10^{8}$ m length scale
– WEP: $\Delta a/a < 1.3\cdot 10^{-13}$
– gravitomagnetism to ∼0.1%.
References
Bertotti, B., Bender, P., Ciufolini, I.: New test of general relativity: measurement of de Sitter geodetic
precession rate for lunar perigee. Phys. Rev. Lett. 58, 1062 (1987)
Brans, C., Dicke, R.H.: Mach’s principle and a relativistic theory of gravitation. Phys. Rev. 124,
925 (1961)
Ciufolini, I., et al.: A test of general relativity using the LARES and LAGEOS satellites and a
GRACE Earth gravity model. Eur. Phys. J. C 76, 119 (2016)
Ciufolini, I., Pavlis, E.C.: A confirmation of the general relativistic prediction of the Lense-Thirring
effect. Nature 431(7011), 958 (2004)
Einstein, A.: Die Grundlage der allgemeinen Relativitätstheorie (The foundation of the General
Theory of Relativity). Annalen der Physik 49, 769–822 (1916)
Everitt, C.W.F., et al.: Gravity probe B: final results of a space experiment to test general relativity.
Phys. Rev. Lett. 106, 221101 (2011)
Hartle, J.B.: Gravity: An Introduction to Einstein’s General Relativity. Addison Wesley (2003)
Jackson, J.D.: Classical Electrodynamics. Wiley (1975)
Lense, J., Thirring, H.: Über den Einfluß der Eigenrotation der Zentralkörper auf die Bewegung der
Planeten und Monde nach der Einsteinschen Gravitationstheorie (On the Influence of the Proper
Rotation of Central Bodies on the Motions of Planets and Moons According to Einstein’s Theory
of Gravitation). Physikalische Zeitschrift 19, 156 (1918)
Lucchesi, D.M., et al.: A 1% measurement of the gravitomagnetic field of the Earth with laser-
tracked satellites. Universe 6, 139 (2020)
Lucchesi, D.M., Peron, R.: Accurate measurement in the field of the earth of the general-relativistic
precession of the LAGEOS II pericenter and new constraints on non-Newtonian gravity. Phys.
Rev. Lett. 105, 231103 (2010)
Merkowitz, S.M.: Tests of gravity using lunar laser ranging. Living Rev. Relat. 13, 7 (2010)
Möller, C.: The Theory of Relativity, 2nd edn. Oxford University Press, Oxford (1972)
Ni, W.T.: A new theory of gravity. Phys. Rev. D 7, 2880 (1973)
Schiff, L.: Possible new experimental test of general relativity theory. Phys. Rev. Lett. 4, 215 (1960)
Schiff, L.: On experimental tests of the general theory of relativity. Am. J. Phys. 28, 349 (1960)
The Gravity Collaboration: Detection of the Schwarzschild precession in the orbit of the star S2
near the Galactic centre massive black hole. A&A 636, L5 (2020)
Turyshev, S.G., Williams, J.G.: Space-based tests of gravity with laser ranging. Int. J. Mod. Phys.
D 16, 2165 (2007)
6 Gravity at the Second Post Newtonian Order
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 113
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_6
• A symmetric metric tensor gμν describes the geometry of space-time, such that
ds 2 = gμν d x μ d x ν .
This determines lengths and proper times in the ways we already saw from Special
and General Relativity.
• Such geometry is locally approximated by a Lorentz space-time.
• Matter in free fall only responds to the metric, according to the same equation of
motion, i.e. the geodesic equation:
$$\frac{d^2x^\mu}{ds^2} + \Gamma^\mu_{\nu\lambda}\,\frac{dx^\nu}{ds}\,\frac{dx^\lambda}{ds} = 0 \qquad (6.1)$$
* {gμν } In purely tensorial theories, like Einstein’s General Relativity, there exists
a unique gravitational field gμν . It is generated by the stress-energy tensor which
contains contributions of matter and other fields. What differs from theory to
theory are the field equations, i.e. the particular way in which matter, and possibly
other fields, generate the metric.
In these theories, local gravitational physics is independent of the position and
velocity of the local reference frame: being gμν the only field coupled to the envi-
ronment, it is always possible to find a coordinate system in which gμν takes the
form ημν at the boundary between the local system and the external environment,
neglecting the tidal potentials. One class of such theories postulates a gravitational
Lagrangian density that is a general function of the Ricci scalar, rather than the
6.1 Gravitation as a Metric Theory 115
Ricci scalar itself; these are called f (R) theories, devised to alter the behaviour
of gravity on cosmological scales. Another class of theories adds quadratic and
higher-order curvature terms to the general relativistic Lagrangian density; this
alters the behaviour of the metric on short scales, and the higher-order terms are
sometimes interpreted as representing quantum corrections to classical general
relativity.
* {$g_{\mu\nu}$, $\phi$} In scalar-tensor theories, matter and fields also generate one or more
scalar fields. In such theories, the local gravitational physics can depend on the position
of the frame but is independent of its velocity. The scalar field $\phi(x^\mu)$ may
depend on the space-time position, but not on the velocity, because a scalar field
is Lorentz-invariant. The scalar field can vary in time because of cosmological
evolution, or it can vary in space because of the proximity of matter outside the
quasi-local Lorentz frame.
* {gμν , K μ , B μν , . . .} In the vector-tensor theories, other fundamental fields con-
tribute to generate the metric: vector, tensor or both. The local gravitational physics
may have both position and velocity-dependent effects. For example we consider
a time-like vector field K μ , whose value depends on the distribution of matter
in the universe: in a given quasi-local Lorentz reference frame it has only a time
component K 0 . Consider now a Lorentz boost to a frame moving with velocity
v relative to the first: the asymptotic form of K μ has now spatial components
K j ∝ K 0 v j , and these velocity-dependent components can then contribute to
the form of the local metric.
* Bimetric and stratified theories: In Ni’s theory (Ni 1973), for example, one assumes
the existence of a flat metric throughout the Universe, where a proper time exists. This
flat metric contributes, with matter and non-gravitational fields, to generate a
scalar field $\phi$. All these fields then combine to generate the physical metric $g_{\mu\nu}$,
the metric that is relevant for the equivalence principle.
For a comprehensive list and description of the many theories proposed since 1922,
we refer to the classical and thorough source (Will 2018).1
6.2 Notation
In this chapter we introduce the PPN formalism, that extends the WFSM limit to
a further degree of approximation, largely following the reasoning outlined in Will
(1974). Such formalism, mostly developed by Nordtvedt and Will in the 1970s, has
evolved in the years to a standardized notation, that is adopted in most textbooks and
papers. These papers use the natural, or theorist’s units, where G = c = 1. Here,
we shall try and maintain our notation with SI units and all universal constants at
their place, with one exception: the symbol U is used for a sort of dimensionless,
relativity.
$$\epsilon \equiv v/c \ll 1$$
where v is the velocity of the body under scrutiny, the test mass, with respect to the
local reference frame.
The 1PN approximation of GR in the Weak Field Slow Motion hypothesis, that
we discussed in Chap. 5, assumed $v^2/c^2 \ll 1$ and $U \ll 1$: the virial theorem assures
us that kinetic and potential energy are of the same order, i.e. $O(\epsilon^2)$. Being the
lowest order of correction, expansion of the metric elements up to $O(\epsilon^2)$ is called
linear (see footnote 3).
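The statement that $v^2/c^2$ and $U$ are of the same order can be illustrated with the Earth's orbit around the Sun. In this sketch U is taken as the dimensionless potential $GM/(rc^2)$, and the solar gravitational parameter, the astronomical unit and the Earth's mean orbital speed are standard approximate values assumed here, not taken from the text.

```python
GM_SUN = 1.32712e20     # solar gravitational parameter, m^3 s^-2 (assumed)
AU = 1.495979e11        # astronomical unit, m (assumed)
C = 2.99792458e8        # speed of light, m/s
V_EARTH = 2.978e4       # mean orbital speed of the Earth, m/s (assumed)

epsilon_sq = (V_EARTH / C) ** 2      # (v/c)^2
u_potential = GM_SUN / (AU * C**2)   # dimensionless potential U = GM/(r c^2)
ratio = epsilon_sq / u_potential
print(f"(v/c)^2 = {epsilon_sq:.3e}, U = {u_potential:.3e}, ratio = {ratio:.3f}")
```

For a circular orbit the two quantities coincide, both being about $10^{-8}$, so counting powers of $\epsilon^2$ or powers of U is equivalent.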
For consistency, also the stress-energy tensor elements were expanded to order
$\epsilon^2$: we considered a perfect fluid, described by:
$$T^{\mu\nu} = \left(\rho + \rho\Pi + \frac{p}{c^2}\right)u^\mu u^\nu + p\,g^{\mu\nu} \qquad (6.3)$$
where p is the fluid pressure, ρ the mass density and $\rho\Pi c^2$ the internal energy
(thermal, radiative, nuclear, compressional…) density: Π is the ratio of internal
energy density to rest mass density. This had effect on both the pressure and the
internal energy density:
$$\frac{p}{\rho c^2} \sim O(\epsilon^2); \qquad \Pi \sim O(\epsilon^2)$$
Stated differently, we require both pressure and internal energy to be small with
respect to the gravitational energy of the system:
$$\frac{p}{\rho c^2} \leq U; \qquad \Pi \leq U$$
These assumptions are reasonable: $p/\rho c^2 \sim 10^{-5}$ in the Sun, $\sim 10^{-10}$ on the Earth.
Similar values hold for Π.
On the other hand, in compact objects the expansion in powers of $\epsilon$ may fail: in
a neutron star $\Pi \sim 0.2$ while $p/\rho c^2$ can be as large as 0.5 and $U \sim 0.2$.
Finally, as a consequence of $v \ll c$, the time derivatives are small with respect to
the space derivatives:
$$\vec{a} = \frac{d^2\vec{x}}{dt^2} = c^2\nabla U \qquad \text{or} \qquad a^j = c^2\frac{dU}{dx^j} = c^2 U_{,j}$$
this must be recovered as a weak field limit for any gravitational theory.
The geodesic equation can be rewritten in a local $(x^j, t)$ frame, applying the
requirement of Eq. 6.4:
$$a^j = -c^2\,\Gamma^j_{00} = \frac{1}{2}\,c^2\, g^{jk}\, g_{00,k}$$
The Newtonian limit is recovered if $g_{jk} = -\delta_{jk}$ and $g_{00} = 1 - 2U$. This simple
substitution leads to the Newtonian formulation, that is adequate to describe many
phenomena in the Solar System, with accuracy up to $10^{-5}$. However, further
corrections must be introduced to explain the precession of the perihelion of Mercury
($\sim 5\cdot 10^{-7}$ rad/orbit) or the other classical tests of GR. We learned in the previous
chapter that the Schwarzschild metric of GR
We can therefore interpret the expression under square root as a Lagrangian for a
single particle in a gravitational field described by the metric $g_{\alpha\beta}$:
$$L = c\left(g_{00} + 2g_{0j}\,\frac{v^j}{c} + g_{jk}\,\frac{v^j v^k}{c^2}\right)^{1/2} \qquad (6.6)$$
We cannot rule out the hypothesis that gravitation is more complicated than GR and
is governed by a more complex theory: what we saw so far could then be simply a
first approximation to the real metric, and we can hypothesize further, higher order
correction terms, that would characterize the other, competing theories.
Indeed, the first attempts at such generalization were made by Eddington in 1922,
Robertson and Schiff; they simply hypothesized, for a central, spherical source, a
power expansion to the next order in U :
In this formalism, β accounts for the amount of non-linearity of the theory, while
γ measures the amount of space curvature produced by the source. We shall see
that this physical interpretation is retained for the corresponding coefficients of the
general PPN formulation. The parameter α is normally neglected because it can
always be absorbed in the coupling constant G; e.g. for an isolated central mass:
2αU = 2[αG ]M/r c2 ≡ 2G M/r c2 , where G corresponds to our present knowledge
of the gravitational constant.
Looking for a more general formulation of the theory, what form should these additional
terms have? Since the first-order correction to the Minkowski metric is the gravitational
potential U, we can expect that further terms would be similarly fabricated "potentials",
fanciful combinations of mass density and distance. We shall then require the
following rules:
1. (Eq. 6.4) Time derivatives and velocities v are O(ε) with respect to spatial ones
and to c, respectively. This means that an extension of Eq. 6.6 to O(ε⁴) demands
expanding g_00 up to order ε⁴, g_0j to order ε³ and g_ij to order ε²:
$$ g_{00} \simeq 1 + O(\epsilon^2) + O(\epsilon^4) + \cdots $$
$$ g_{0j} \simeq O(\epsilon^3) + \cdots $$
$$ g_{ij} \simeq -\delta_{ij} + O(\epsilon^2) + \cdots $$
For consistency, we will also expand the stress-energy tensor of a perfect fluid,
Eq. 6.3, up to order ε⁴.
2. The metric elements can be dimensionless functions of matter properties p, ρ, ρΠ
and of its velocity v, but not of their gradients.
3. The metric generated by an isolated source must be locally Lorentzian. That means
|gμν − ημν | → 0 when r → ∞.
4. The general expression of a PPN metric should be valid in any quasi-Lorentzian
frame. This implies that the functional dependence of the metric on the potentials
and velocities (some theories postulate a preferred frame of the universe) of the
theory should be the same in any quasi-Lorentzian frame. In other words, the
metric must be invariant for spatial translations, for spatial rotations, for Lorentz
boosts. A way to satisfy this requirement is to impose that all terms in g00 be
scalars under translations, rotations and boosts. Similarly, terms in g0 j and gi j
should be vectors and tensors, respectively, under the above transformations.
According to these guidelines, we shall add the following terms to the metric:
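$$ V_j = \frac{G}{c^3} \int \frac{\rho(\mathbf{x}',t)\, v'_j}{|\mathbf{x}-\mathbf{x}'|}\, d^3x' \qquad (6.9) $$
$$ W_j = \frac{G}{c^3} \int \frac{\rho(\mathbf{x}',t)\, [\mathbf{v}'\cdot(\mathbf{x}-\mathbf{x}')]\,(x-x')_j}{|\mathbf{x}-\mathbf{x}'|^3}\, d^3x' \qquad (6.10) $$
$$ w_j U , \qquad w^i U_{ij} \qquad (6.11) $$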
W_j and V_j are two possible potentials whose existence is allowed by our rules. The
vector w_j, assumed to be O(ε¹), represents the velocity of the adopted reference
frame with respect to the rest frame of the Universe. So, w_j U and w^i U_ij are also
O(ε³).
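• Expand g_00 to order ε⁴. This metric element should be a scalar under rotation;
therefore it can be composed of any of the following scalar, O(ε⁴) terms
(primed quantities are evaluated at x′):
$$ \Phi_1 = \frac{G}{c^4}\int \frac{\rho'\, v'^2}{|\mathbf{x}-\mathbf{x}'|}\, d^3x' , \qquad \Phi_2 = \frac{G}{c^2}\int \frac{\rho'\, U'}{|\mathbf{x}-\mathbf{x}'|}\, d^3x' $$
$$ \Phi_3 = \frac{G}{c^2}\int \frac{\rho'\, \Pi'}{|\mathbf{x}-\mathbf{x}'|}\, d^3x' , \qquad \Phi_4 = \frac{G}{c^4}\int \frac{p'}{|\mathbf{x}-\mathbf{x}'|}\, d^3x' $$
$$ A = \frac{G}{c^4}\int \frac{\rho'\, [\mathbf{v}'\cdot(\mathbf{x}-\mathbf{x}')]^2}{|\mathbf{x}-\mathbf{x}'|^3}\, d^3x' $$
$$ B = \frac{G}{c^4}\int \frac{\rho'}{|\mathbf{x}-\mathbf{x}'|}\,(\mathbf{x}-\mathbf{x}')\cdot\frac{d\mathbf{v}'}{dt}\, d^3x' $$
U, U_ij, V_j, W_j, Φ_W, Φ_1, Φ_2, Φ_3, Φ_4, A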
Some theories admit a preferred reference frame, the rest frame of the Universe, so
also w j must be retained. The most general metric will be composed of a linear com-
bination of these potentials, with ten coefficients to be determined by experiments:
6.4 A General Theory at Second Post-Newtonian Order 121
We could choose to pair each potential with a parameter, but we can just as well pair
them with linear combinations of these ten coefficients. This second choice, although
less intuitive, allows us to give physical meaning to the ten PPN parameters:
γ, β, ξ, α1 , α2 , α3 , ζ1 , ζ2 , ζ3 , ζ4
The metric in every particular theory can be obtained by setting these parameters
to particular values, often zero.
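The stress-energy tensor will also be expanded beyond what is stated in Eq. 6.3, to
include the gravitational energy U:
$$ T^{00} = \rho\left(1 + \Pi + \frac{v^2}{c^2} - 2U\right) c^2 $$
$$ T^{0j} = \rho\left(1 + \Pi + \frac{v^2}{c^2} - 2U + \frac{p}{\rho c^2}\right) c\, v^j $$
$$ T^{jk} = \rho\left(1 + \Pi + \frac{v^2}{c^2} - 2U + \frac{p}{\rho c^2}\right) v^j v^k + p\,\delta^{jk}(1 + 2\gamma U) \qquad (6.15) $$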
In the following list, we summarize the physical meaning of the 10 PPN param-
eters, and their role in the metric theories of gravity:
The process of applying the PPN formalism to a theory is lengthy and algebra-intensive:
5. Substitute these forms into the field equations, operating to a consistent order.
Find solutions for h_μν. Substitute the perfect fluid stress tensor for the matter
sources.
6. Solve for h_00 to O(ε²). This must tend to zero far from the system; compute the
form h_00 = −2U. Solve for h_ij to O(ε²) and h_0j to O(ε³).
7. Solve for h_00 to O(ε⁴). This is the most complex step, involving all the nonlinearities
in the field equations. The stress–energy tensor must also be expanded to
sufficient order.
8. Convert to local quasi-Cartesian coordinates.
9. Read off the PPN parameter values by comparing the result for g_μν with the values
of Eq. 6.14.
6.5 PPN Applied to General Relativity 123
We now apply the above recipe to the particular, but very important case of GR,
to recover the equations derived in Chap. 5. The starting point is the Einstein field
equations
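$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu} \qquad (6.16) $$
If we multiply this equation by g^{μν}, we easily obtain R = −(8πG/c⁴) T, with T = g^{μν} T_μν.
This allows us to rewrite Eq. 6.16 in the form
$$ R_{\mu\nu} = \frac{8\pi G}{c^4}\left(T_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T\right) \qquad (6.17) $$
$$ g_{\mu\nu} \simeq \eta_{\mu\nu} + h_{\mu\nu} ; \quad \text{with } h_{\mu\nu} \ll 1 $$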
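$$ R_{00} = \frac{1}{2}\nabla^2 h_{00} - \frac{1}{2}\left(h_{jj,00} - 2h_{j0,j0}\right) + \frac{1}{2} h_{00,j}\left(h_{jk,k} - \frac{1}{2} h_{kk,j}\right) - \frac{1}{4}|\nabla h_{00}|^2 + \frac{1}{2} h_{jk} h_{00,jk} $$
$$ R_{0j} = -\frac{1}{2}\left(\nabla^2 h_{0j} - h_{k0,jk} + h_{kk,0j} - h_{kj,0k}\right) \qquad (6.18) $$
$$ R_{ij} = -\frac{1}{2}\left(\nabla^2 h_{ij} - h_{00,ij} + h_{kk,ij} - h_{ki,kj} - h_{kj,ki}\right) $$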
while the stress-energy tensor is given by Eq. 6.3.
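First, we compute h_00 and T_00 to O(ε²):
$$ R_{00} = \frac{1}{2}\nabla^2 h_{00} , \qquad T_{00} = T = \rho c^2 \qquad (6.19) $$
Plugging these into Einstein's equation (6.17) gives
$$ \nabla^2 h_{00} = \frac{8\pi G}{c^2}\,\rho \quad \text{which has solution:} \quad h_{00} = -2U \qquad (6.20) $$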
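The step from Eq. 6.19 to Eq. 6.20 can be spelled out as follows. This is only a sketch: it assumes the dimensionless Newtonian potential $U = (G/c^2)\int \rho(\mathbf{x}',t)/|\mathbf{x}-\mathbf{x}'|\, d^3x'$ (consistent with U = GM/rc² used above for an isolated mass), so that $\nabla^2 U = -4\pi G\rho/c^2$, and it takes g_00 ≃ 1 at this order.

```latex
\begin{aligned}
R_{00} &= \frac{8\pi G}{c^4}\left(T_{00} - \tfrac{1}{2}\, g_{00}\, T\right)
        \simeq \frac{8\pi G}{c^4}\left(\rho c^2 - \tfrac{1}{2}\rho c^2\right)
        = \frac{4\pi G}{c^2}\,\rho , \\
\tfrac{1}{2}\nabla^2 h_{00} &= \frac{4\pi G}{c^2}\,\rho
  \;\;\Longrightarrow\;\;
  \nabla^2 h_{00} = \frac{8\pi G}{c^2}\,\rho = -2\,\nabla^2 U
  \;\;\Longrightarrow\;\;
  h_{00} = -2U ,
\end{aligned}
```

where the last implication uses the boundary condition that h_00 vanishes far from the source.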
To compute h_ij to O(ε²), recall that indices are raised and lowered using the flat-space
metric: h^μ_α ≡ η^{μβ} h_βα. Impose the three gauge conditions:
$$ h^{\mu}{}_{i,\mu} - \frac{1}{2}\, h^{\mu}{}_{\mu,i} = 0 \qquad (6.21) $$
124 6 Gravity at the Second Post Newtonian Order
$$ h^{\mu}{}_{0,\mu} - \frac{1}{2}\, h^{\mu}{}_{\mu,0} = -\frac{1}{2}\, h_{00,0} \qquad (6.23) $$
and Eq. 6.17 takes the form
$$ \nabla^2 h_{0j} + U_{,0j} = -\frac{16\pi G}{c^2}\,\rho\, v_j \qquad (6.24) $$
The solution of Eq. 6.24 is not straightforward and we refer to textbooks such as Weinberg
(1972) or Poisson and Will (2014). In essence, one has to relate the derivatives of U
to the potentials V_j, W_j defined in Eqs. 6.9 and 6.10 to finally obtain the solution
$$ h_{0j} = \frac{7}{2} V_j + \frac{1}{2} W_j \qquad (6.25) $$
To iterate the expansion to O(ε⁴), under the same gauge conditions, we make use of the
solutions for h_μν to order O(ε²). We eventually find
$$ R_{00} = \frac{1}{2}\nabla^2\left(h_{00} - 2U^2\right) + 4U\nabla^2 U $$
and
$$ T_{00} - \frac{1}{2}\, g_{00}\, T = \frac{1}{2}\rho c^2\left[1 + 2\left(\frac{v^2}{c^2} - U + \frac{1}{2}\Pi + \frac{3}{2}\frac{p}{\rho c^2}\right)\right] $$
Plugging again the last two expressions into the Einstein equations (6.17) and making
use of the relationships Eq. 6.13 for the potentials Φ_j (j = 1 … 4) one eventually
reaches the solution
In conclusion, the GR metric in the PPN formulation is, from Eqs. 6.22, 6.25, 6.26:
We shall just mention here the most famous among the many gravitational theories
that have been proposed as competitors to GR: the Jordan–Brans–Dicke theory.5
R. Dicke and his student C. Brans, building on previous work of P. Jordan, proposed
(Brans 1961) a scalar-tensor theory of gravity that incorporated Mach's principle
into gravitation. This principle states that inertial forces are gravitational effects
of distant matter: who decides whether a reference frame is inertial or accelerated?
According to Mach, inertial frames are those that are not accelerated with respect
to the distant stars. In this way, one could envision a preferred reference frame, in
contrast with Lorentz invariance. The issue is almost philosophical and not resolved
yet. The consequence of this assumption is that matter must play an additional role,
beside determining the geometry. Hence, the simplest and most famous of the many
wordings of Mach’s principle:
“matter there influences inertia here”.
Mach’s principle had a profound influence on the development of Einstein’s thought;
however, as Brans and Dicke noted, the principle is imperfectly satisfied in General
Relativity, so they moved to remedy this situation. To this purpose, another formulation
of Mach's principle proves useful:
Newton’s Gravitational Constant G is a Dynamical Field.
This formulation is influenced by Dirac's Large Number Hypothesis (see also
Sect. 6.6.4): it states that, calling M_u the estimated mass of the universe and 2R_u
its diameter, the fact that
$$ \frac{G M_u}{R_u c^2} \sim 1 $$
cannot be a coincidence, resulting from cancellation of extremely large (O(10⁴⁰))
numbers. So, the theory allows 1/G ∼ M_u/R_u c² to be a scalar field φ(x^μ), rather
than a constant.
The Brans–Dicke theory is thus a scalar-tensor theory containing a parameter ω,
that measures the coupling of the two fields.
For GR, the field equations can be derived from the variational principle (Landau
and Lifshitz 1951)
$$ \delta \int d^4x\, \sqrt{|g|}\left[R + \frac{16\pi G}{c^4}\, L_m\right] = 0 \qquad (6.28) $$
with R the curvature scalar, g the determinant of the metric and Lm the Lagrangian
density of matter, including all non gravitational fields. Brans and Dicke extended
this action to include the additional scalar field φ, allowing Newton’s constant to vary.
5 An excellent, readable overview of this theory and related experimental tests can be found online:
They added the standard Lagrangian density of a scalar field, L_φ ∝ φ_{,μ} φ^{,μ}. Moreover,
they chose a constant, dimensionless coupling parameter ω. By multiplying
Eq. 6.28 by G⁻¹ = φ, we obtain the resulting action:
$$ \delta \int d^4x\, \sqrt{|g|}\left[\phi R + \frac{16\pi}{c^4}\, L_m + \omega\,\frac{\phi_{,\mu}\phi^{,\mu}}{\phi}\right] = 0 \qquad (6.29) $$
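The factor φ in the denominator of the third term is introduced so that the coupling
constant ω is dimensionless. The field equations are derived by varying φ, φ_{,μ} in
Eq. 6.29:
$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi}{\phi c^4}\, T_{\mu\nu} + \frac{\omega}{\phi^2}\left(\phi_{,\mu}\phi_{,\nu} - \frac{1}{2} g_{\mu\nu}\,\phi_{,\alpha}\phi^{,\alpha}\right) + \frac{1}{\phi}\left[\phi_{,\mu;\nu} - g_{\mu\nu}\,\Box\phi\right] $$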
where φ can play either the role of additional “matter” source, if positioned, as here,
on the right hand side of the equation, or the role of additional gravitational field, if
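we move it to the left. By expanding these equations in the weak field limit, we would
find:
$$ g_{00} = 1 - \frac{2M}{\phi_0 c^2 r}\left(1 + \frac{1}{3 + 2\omega}\right) $$
$$ g_{ij} = -\left[1 + \frac{2M}{\phi_0 c^2 r}\left(1 - \frac{1}{3 + 2\omega}\right)\right]\delta_{ij} $$
$$ \phi(x^\mu) = \phi_0\left(1 + \frac{1}{3 + 2\omega}\right) $$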
where φ0 is the asymptotic value of the scalar field away from our source M.
By comparison with the usual PPN value g_00 = 1 − 2GM/rc² we derive the
present, i.e. measured here and now, value of Newton's constant:
$$ G_{now} = \frac{2\omega + 4}{2\omega + 3}\,\frac{1}{\phi_0} $$
Moreover, comparison with g_ij^PPN of Eq. 6.14 yields:
$$ \gamma = \frac{\omega + 1}{\omega + 2} \qquad (6.30) $$
In the limit ω → ∞ we have γ → 1. Therefore, for large values of ω, the predictions
of Brans–Dicke become more and more indistinguishable from those of GR. Dicke
hypothesized a value ω ∼ 5, to fit the data of Mercury's perihelion precession (see
following section). Recent tests (Bertotti et al. 2003) push the limit toward ω >
40000. This determined the decline of interest in Brans–Dicke theory.
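To get a feeling for how quickly Brans–Dicke approaches GR, the relation γ = (ω + 1)/(ω + 2) and the expression for the locally measured Newton constant can be evaluated numerically. The following is a minimal sketch; the function names are illustrative choices, not taken from the text.

```python
def gamma_brans_dicke(omega: float) -> float:
    """PPN gamma in Brans-Dicke theory, Eq. 6.30: (omega + 1) / (omega + 2)."""
    return (omega + 1.0) / (omega + 2.0)


def g_now_times_phi0(omega: float) -> float:
    """Locally measured Newton constant in units of 1/phi_0: (2w + 4) / (2w + 3)."""
    return (2.0 * omega + 4.0) / (2.0 * omega + 3.0)


def omega_lower_bound(max_one_minus_gamma: float) -> float:
    """Smallest omega compatible with 1 - gamma <= bound, using 1 - gamma = 1/(omega + 2)."""
    return 1.0 / max_one_minus_gamma - 2.0


for omega in (5.0, 500.0, 40000.0):
    g = gamma_brans_dicke(omega)
    print(f"omega = {omega:8.0f}   gamma = {g:.6f}   1 - gamma = {1.0 - g:.2e}")
```

With ω ∼ 5 the deviation 1 − γ is about 0.14, while ω > 40000 corresponds to 1 − γ below roughly 2.5 × 10⁻⁵, which is why bounds on γ at the 10⁻⁵ level leave so little room for the theory.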
Using the PPN expansion of each theory in the equations of motion we find clues
and traces to perform experimental verifications. In particular, we report here the
fundamental relations that are the bases to deduce the bounds of the PPN parameters
in some of the classical observational tests of general relativity. We will then
mention the tests of violation of the strong equivalence principle (SEP). The analytical
description of these effects within the PPN formalism is carried out following the
logic paths illustrated in the previous chapter. The significant change is the use of the
PPN metric g_μν given by (6.14) instead of the Schwarzschild metric. For a complete
derivation of the following formulas we refer to Will's textbook (Will 2018).
Light Deflection
Historically, the first confirmation of General Relativity came from observations of
the deflection of photon trajectories in the proximity of the solar gravitational field,
as discussed in Sect. 5.4.2. We consider the case of an observer at rest on the Earth,
who receives two light rays: the first from a targeted source, the second from another
star, used as reference and located in a different position of the sky, as sketched in
Fig. 6.1. ϕ is the angle between the directions of the two incoming rays, ϕ0 being the
value when there is no gravitational perturbation: that is, when the massive object
interposed between the target source and the observer is far from the line of sight. The
difference δϕ = ϕ − ϕ0 is the measurement of the light deflection due to lensing:
the calculation in the PPN framework is carried out similarly to Sect. 5.4.2, inserting
the PPN metric elements into the starting Eq. 5.43.
Fig. 6.1 Sketch of the geometry for the measurements of light deflection
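$$ \delta\varphi = \frac{1+\gamma}{2}\,\frac{4GM}{bc^2}\,\frac{1+\cos\varphi_0}{2} \qquad (6.31) $$
For light grazing the Sun, that means:
$$ \delta\varphi_{max} = \frac{1+\gamma}{2}\; 1.75'' \qquad (6.32) $$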
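The 1.75″ figure for a ray grazing the solar limb can be checked directly from the deflection formula. The sketch below uses commonly quoted values for the solar gravitational parameter and the nominal solar radius; these constants and the function name are assumptions introduced for illustration, not values given in the text.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, solar gravitational parameter (assumed value)
R_SUN = 6.957e8             # m, nominal solar radius (assumed value)
C = 299_792_458.0           # m/s, speed of light
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0


def deflection_arcsec(b_m: float, gamma: float = 1.0, phi0_rad: float = 0.0) -> float:
    """PPN light deflection: (1+gamma)/2 * 4GM/(b c^2) * (1+cos(phi0))/2, in arcsec."""
    return (0.5 * (1.0 + gamma)
            * 4.0 * GM_SUN / (b_m * C ** 2)
            * 0.5 * (1.0 + math.cos(phi0_rad))
            * RAD_TO_ARCSEC)


limb_deflection = deflection_arcsec(R_SUN)            # GR value, gamma = 1
half_deflection = deflection_arcsec(R_SUN, gamma=0.0)  # no space-curvature contribution
print(f"gamma = 1: {limb_deflection:.3f} arcsec, gamma = 0: {half_deflection:.3f} arcsec")
```

Setting γ = 0 halves the result, which is the contribution of g_00 alone; the other half comes from the spatial curvature measured by γ.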
In analogy to Eddington’s pioneering experiment, the measurements are carried
out during solar eclipses, when the solar disc is obscured and it is possible to observe
stars with angular position very close to the Sun. Values of the deviation δϕ_max are
averaged by observing the position of several stars in the sky. The main limitation
to this type of measurement is due to fluctuations in the refractive index of the
atmosphere. This effect can be avoided using astrometric data taken by a telescope
in space. In fact, the most accurate measurement to date of the light bending effect
is a byproduct of the satellite mission Hipparcos,7 a project aimed at accurately
measuring the positions of celestial objects on the sky. The Hipparcos limit on γ
was obtained by accumulating accurate measurements of star coordinates at various
elongations from the Sun. The optical observations, taken over 37 months, did not
rely on Solar eclipses. The data were taken at very large angular distance from the
Sun, between 47 and 133 degrees. At first glance this could be a drawback, but
apparently this method allows a better randomisation of the systematic errors. The
Hipparcos limit on γ is (Froeschlé et al. 1997):
The follow-up astrometric mission is GAIA: this ESA satellite was launched on
December 19th, 2013 and it is orbiting around the Sun-Earth L2 Lagrange point.8
There is great expectation to achieve, using its data, a dramatic improvement in the
γ limit with respect to the Hipparcos result of σ_γ = 3 × 10⁻³. Simulations predict
a gain of at least two orders of magnitude (Vecchiato 2014), under the assumption
that the error in a single measurement of the light deflection effect will be on average
100 µas for at least 1 million stars, observed about 400 times during 5 years. GAIA
is indeed expected to observe and catalog more than a billion stars.
Actually, the measurement is best carried out by observing electromagnetic radiation
of longer wavelength, such as radio waves or microwaves. In fact, the Sun's radio
emission is weak, and many well localized sources emitting in the radio band can be
observed applying an interferometric technique. The technique consists of combining
7 Hipparcos: HIgh Precision PARallax COllecting Satellite ESA 1989–1993. Named after the Greek
both gravitational forces and of the centrifugal force cancels out. L2 is about 1.5 Gm beyond the
Earth along the Sun-Earth axis.
6.6 Experimental Limits of the PPN Parameters 129
[Fig. 6.2 labels: radio signals from a distant source (plane wave); k]
Fig. 6.2 Schematic of the operation for a VLBI antenna array: cross-correlation of the two recorded
time series yields the delay τ in reception. Hence, via the relation cos β = cτ /B one can measure
either the baseline B or the beam inclination β
the signals collected by two radio-telescopes located tens of kilometres apart: this is
Long Baseline Interferometry (LBI). As sketched in Fig. 6.2, by cross-correlating the
outputs of two antennas, one can determine the time delay τ. However, an exact and
synchronized timekeeping at the two (or more) stations is required: this is achieved by
linking the clock that time-stamps the observational data to a superstable oscillator,
such as an H maser and, more recently, to the GPS network, for long-term stability.
Knowledge of τ and of the length of the ideal line joining the two antennas (the
baseline B), gives, through the relation cτ = B cos β the inclination angle β of the
incoming plane wave. Conversely, observation of radio signals from a known source
can determine the baseline length B. This method is easily extended to a network
of several antennas (in analogy with multi-beam interferometry), where the phase
closure technique can be applied. In oversimplified terms, the sum of the signal phase
differences from one antenna to the next must be equal to the total phase difference
between the first and the last:
Applying this method, the phase noise contributions of the individual observing
stations cancel out (Jennison 1958).
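The two ideas above, the geometric relation cτ = B cos β and the cancellation of station phase errors in a closed loop, can be illustrated with a few lines of code. This is a simplified sketch: the three-station loop and the error model (each measured baseline phase equals the true phase plus the error of the first station minus the error of the second) are assumptions made for illustration, and all function names are hypothetical.

```python
import math
import random

C = 299_792_458.0  # m/s, speed of light


def inclination_from_delay(tau_s: float, baseline_m: float) -> float:
    """Return the inclination beta (rad) from c * tau = B * cos(beta)."""
    cos_beta = C * tau_s / baseline_m
    if abs(cos_beta) > 1.0:
        raise ValueError("|c * tau| cannot exceed the baseline length")
    return math.acos(cos_beta)


def baseline_from_delay(tau_s: float, beta_rad: float) -> float:
    """Return the baseline length B (m) for a source of known inclination."""
    return C * tau_s / math.cos(beta_rad)


def closure_phase(true_phases, station_errors):
    """Sum the measured baseline phases around the loop 1-2, 2-3, 3-1.

    Assumed model: measured_ij = true_ij + err_i - err_j.
    """
    psi_12, psi_23, psi_31 = true_phases
    e1, e2, e3 = station_errors
    return (psi_12 + e1 - e2) + (psi_23 + e2 - e3) + (psi_31 + e3 - e1)


# A 35 km baseline and a source 60 degrees from the baseline direction.
tau = 35e3 * math.cos(math.radians(60.0)) / C
beta_deg = math.degrees(inclination_from_delay(tau, 35e3))

# Arbitrary station phase errors drop out of the closed-loop sum.
rng = random.Random(0)
errors = [rng.uniform(-math.pi, math.pi) for _ in range(3)]
true_phases = (0.3, -1.1, 0.5)
closure = closure_phase(true_phases, errors)
print(f"tau = {tau * 1e6:.2f} us, beta = {beta_deg:.3f} deg, closure = {closure:.6f} rad")
```

Whatever the station errors, the closed-loop sum equals the sum of the true baseline phases, since each error enters once with each sign.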
In 1975 Fomalont and Sramek (1976) used a baseline of 35 km, obtaining as
limiting value
γ = 1.007 ± 0.009 - LBI, 1975
With an angular resolution of 0.01″ they observed distant sources such as the quasars
0111+02, 0119+11, and 0116+08. The main limitation of this method was
Fig. 6.3 Corrections for the timing of the radio signal. Between the times of arrival at the first
VLBI station (left) and the second (right), the Earth has translated (with velocity v_e) and rotated
(with tangential velocity w_2 at station 2)
related to the modelling of the solar corona, whose effects are subtracted by observing
electromagnetic signals at different frequencies (in their case 2.695 and 8.085 GHz).
Clearly, the accuracy of these measurements increases with the baseline B. This
consideration led to the establishment of Very Long Baseline Interferometry (VLBI), where the
antennas are thousands of km apart. With the increase of the baseline length, additional
care must be taken in time-keeping. The measurement is carried out in the
Solar System Barycentric (SSB) reference frame, to "make simple" the path of the
radio beam: therefore the timing must take into account both rotation and revolution
of the Earth, as shown in Fig. 6.3.
At present, the US-operated Very Long Baseline Array (see Fig. 6.4) can rely on ten
antennas with a maximum baseline (Mauna Kea, Hawaii to St. Croix, US Virgin
Islands) of 8611 km, not counting the possible extension to a site in Germany. The
recent best result obtained with this technique is (Lambert and Le Poncin-Lafitte
2011):
γ = 1 − (8 ± 12) × 10⁻⁵ - VLBI, 2011
Still in the domain of classical tests of relativity, we discussed the Shapiro time delay,
i.e. the delay in the propagation time of electromagnetic signals traveling through a
gravitational field. Here too, we just report the final expression of the delay time in the
PPN framework. This relation allows us to establish new bounds on the parameter
γ. The guiding principle for the experiments is to measure the round-trip travel time
of an electromagnetic signal sent towards a target that plays the role of a mirror. Such
an experiment is carried out within the Solar System, and the reflecting target is a planet
or a satellite: therefore, the retarding gravitational field of interest is that generated
by the Sun. Consider a reference frame with the origin at the Sun center, as shown
in Fig. 6.5. Let x_⊕ and x_t be the position vectors of the Earth and the target body; n̂
is the unit vector oriented from the target to the Earth, and r_⊕, r_t are the distances along
n̂ from the point P of closest approach to the Sun to the Earth and to the target body,
respectively. Performing the calculation with the PPN metric, the Shapiro delay for
a round trip is
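$$ \delta t = \frac{1+\gamma}{2}\,\frac{4GM_\odot}{c^3}\,\ln\frac{4 r_\oplus r_t}{b^2} = 247\ \mu\text{s} \quad \text{if } b \simeq R_\odot \qquad (6.33) $$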
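The quoted value of 247 µs can be reproduced numerically from the round-trip delay formula above. The sketch below assumes that the target is Mars at a heliocentric distance of 1.56 AU (the value quoted later in this section), that the ray grazes the solar limb, and commonly used values for the physical constants; none of these choices is stated alongside the formula itself.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, solar gravitational parameter (assumed value)
R_SUN = 6.957e8             # m, nominal solar radius (assumed value)
AU = 1.495978707e11         # m, astronomical unit
C = 299_792_458.0           # m/s, speed of light


def shapiro_round_trip_s(r_earth_m: float, r_target_m: float, b_m: float,
                         gamma: float = 1.0) -> float:
    """Round-trip delay: (1+gamma)/2 * 4GM/c^3 * ln(4 r_earth r_target / b^2), in seconds."""
    return (0.5 * (1.0 + gamma)
            * 4.0 * GM_SUN / C ** 3
            * math.log(4.0 * r_earth_m * r_target_m / b_m ** 2))


# Earth at 1 AU, target at 1.56 AU (assumed: Mars), ray grazing the solar limb.
delay_us = shapiro_round_trip_s(1.0 * AU, 1.56 * AU, R_SUN) * 1e6
print(f"round-trip Shapiro delay: {delay_us:.1f} us")
```

Because the dependence on the distances is only logarithmic, changing the target distance by a few percent moves the result by well under 1 µs; the impact parameter b matters far more.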
We should also consider the time derivative of this equation, that is related to the
Doppler frequency shift of the received signal:
$$ \frac{\delta f}{f} = \frac{d\,\delta t}{dt} = -4\,\frac{1+\gamma}{2}\,\frac{GM_\odot}{c^3 b}\,\frac{db}{dt} \qquad (6.34) $$
The first measurements were performed reflecting radio beams off the surface of
Mercury, Venus and Mars. Figure 6.6, top, shows results obtained in 1967 by Shapiro
and coworkers observing two superior conjunctions of Mercury. Planets however
have rough surfaces: problems related to the orography can produce errors in the
propagation time of the order of 5 µs. These pioneering measurements thus confirmed
GR (γ = 1) at the 5% level.
Artificial satellites offer a good alternative: the uncertainty in their position is
∼50 m (150 ns) due to spurious accelerations caused by the position control system
or by the solar wind. To overcome this problem, it is more common to reflect off
artificial satellites orbiting around a planet, where their passive orbit can be precisely
predicted and monitored.
Prior limits on γ were derived by analysing the radar echo delays obtained from
ranging to the Mariner 9 spacecraft, the first artificial satellite placed in orbit around
Mars. The smallest uncertainty associated with the γ values was about 2%.
A serious problem with the experiments carried out in solar conjunction is related
to density fluctuations in the solar corona: the radio beams grazing the Sun pass
through regions where rapid changes in the electron density cause the delay to vary
in a non-controllable fashion, adding a hard-to-model delay of up to 100 µs. A two-fold
substantial improvement in the Shapiro delay measurement was achieved with
the Viking mission (Shapiro 1977), devoted again to the exploration of Mars. In this
case NASA sent two spacecraft to the Red Planet, Viking 1 and 2, each made of
two main parts: an orbiter designed to photograph the surface of Mars while orbiting
it, and a lander designed to study the planet from the surface. The two Vikings began
orbiting Mars in the summer of 1976 and spent about a month surveying landing
sites. They then released their landers, which touched down on flat lowland sites in the
northern hemisphere of Mars about 6,500 km apart. Scientists used the radar echo to
the landers to test GR: radio signals were sent to the landers on Mars, reflected back
to the orbiters and from these to the Earth. The improvement in accuracy, over the
Mariner experiment, of the measurement of the echo delays transponded by the spacecraft
was essentially due to the fact that the two Viking landers were anchored to the
surface of Mars, so that their trajectories were precisely determined. In addition,
while the two Viking landers transmitted to the orbiters only in S-band (∼2.3 GHz),
the spacecraft transmitted signals to Earth in both S- and X-band (∼8.4 GHz), both
coherent with the single S-band signal received from the lander. This dual-band,
one-way ranging allowed estimation of the contribution to the echo delays from the
solar-corona plasma.
Mars passed in superior conjunction with the Sun and the Earth on November
25, 1976, a few months after the arrival of the Vikings. From those data the following
limit was obtained (Reasenberg et al. 1979):
γ = 1.000 ± 0.002 - Viking ranging, 1976
Note that the distance of Mars from the Sun is d_⊙−Mars = 1.56 AU; therefore
the round-trip time from the Earth to Mars, when in superior conjunction, is
2 · d_⊕−Mars/c = 2 · 2.56 AU/c = 2552 s. The detection of the Shapiro delay, 247 µs, requires
a timing accuracy of one part in 10⁷. Measuring γ at the 0.2% level means measuring
the transit time to better than 2 · 10⁻¹⁰. This difficult requirement poses a threefold
challenge to the experimenter:
– clock stability (amply satisfied by H masers and, more recently, by atomic clocks,
now in the 10⁻¹⁵ range),
– ability to describe and predict the orbits to better than cδt ≃ 150 m,
– precise knowledge of the positioning of the ranging stations on Earth, as discussed
in Sect. 1.7.
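The orders of magnitude quoted above follow from simple arithmetic, sketched below. The sketch treats a 0.2% error on γ as a 0.2% error on the 247 µs delay, which is how the estimate in the text appears to be obtained; the variable names and the value of the astronomical unit are assumptions introduced here.

```python
C = 299_792_458.0      # m/s, speed of light
AU = 1.495978707e11    # m, astronomical unit

# Earth-Mars distance at superior conjunction: 1 AU + 1.56 AU = 2.56 AU.
round_trip_s = 2.0 * 2.56 * AU / C

shapiro_delay_s = 247e-6
detection_fraction = shapiro_delay_s / round_trip_s    # accuracy needed to see the delay

gamma_rel_error = 0.002                                 # 0.2% level on gamma
timing_error_s = gamma_rel_error * shapiro_delay_s      # assumed: same relative error on the delay
transit_fraction = timing_error_s / round_trip_s        # required relative timing accuracy
orbit_error_m = C * timing_error_s                      # equivalent range uncertainty

print(f"round trip          : {round_trip_s:.0f} s")
print(f"detection fraction  : {detection_fraction:.1e}")
print(f"transit-time goal   : {transit_fraction:.1e}")
print(f"orbit knowledge goal: {orbit_error_m:.0f} m")
```

The results, about 2550 s, 1 × 10⁻⁷, 2 × 10⁻¹⁰ and 150 m, match the figures quoted in the text to within rounding.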
The best limit, to date, on the γ parameter was set by the analysis of the radio
signal tracking the Cassini12 spacecraft, during its trip to Saturn.
The mission contributed to studies of Jupiter for six months in 2000 before reaching
its destination, Saturn, in 2004 and starting a series of flybys of Saturn's moons.
That same year it released the Huygens lander on Saturn's moon Titan to conduct a
study of the moon's atmosphere and surface composition. On its way toward Saturn,
Cassini passed superior conjunction with Sun and Earth on July 7, 2002: the radio
signals passed near the Sun at a distance of d = 1.6 R_⊙. Bertotti, Iess and Tortora
exploited a sophisticated and innovative radio link, with two frequencies, 7.175 GHz
(X-band) and 34.316 GHz (Ka-band), uplink (transmitted from ground to the spacecraft)
and three frequencies downlink, to suppress the corona noise down to 10⁻⁴ of
the relativistic signal. While the experiment with the Vikings measured the excess
transit time (range measurement), the analysis carried out on the data of Cassini
focused on the time derivative of the range data (range-rate), as shown in Eq. 6.34.
Indeed, in superior conjunction, the distance b from the Sun reaches a minimum, and
hence the delaying effect an extremum, so that its time derivative is most sensitive. This
procedure was possible only because the corona noise was efficiently suppressed. Indeed,
the signal in the lower panel of Fig. 6.6 (Cassini) is proportional to the time derivative
of the upper one. They produced (Bertotti et al. 2003) what is still today (2021) the
best measurement of propagation delay, constraining γ:
Finally, we mention the use of the Shapiro time delay also in the measurement
of radio waves emitted by quasars, or other extragalactic sources. It is obvious that
the classical Shapiro formula, Eq. 6.33, requiring knowledge of the target position,
is of little use when looking at a source at cosmological distances. However, if
one simultaneously observes two sources with a similar impact parameter b, and
computes the difference of the two Shapiro delays, many common terms drop out
and the relation simplifies to (Hellings 1986):
$$ \tau_{grav} = (1+\gamma)\,\frac{GM_\odot}{c^3}\,\log\frac{|\mathbf{r}_1| + (\mathbf{r}_1\cdot\hat{k})}{|\mathbf{r}_2| + (\mathbf{r}_2\cdot\hat{k})} $$
where r j is the position of the jth VLBI station, in Solar System Barycentric coor-
dinates, and k̂ is the unit vector in the direction of the source: if the source is extra-
galactic, the direction is the same for both antennas. With this method, a dedicated 1-
day (as opposed to years) observation session was devoted to two sources, 0229+131
12 Cassini–Huygens, ESA/ NASA / ASI. 1997–2017. Named after the XVII century astronomers
Giovanni Domenico Cassini, discoverer of four Saturn moons and Christiaan Huygens, discoverer
of Titan.
and 0234+164, whose position is known to better than 200 µas. By observing them at
angular distances from the Sun of 1.15◦ to 2.6◦ , the team (Titov et al. 2018) achieved
a limit:
γ − 1 = (2.72 ± 0.92) × 10⁻⁴ - delay + VLBI, 2018
This accuracy approaches, but does not reach, that of Cassini reported above.
Estimates for the parameter β can be deduced from the measurements of the precession
of the perihelion of the planets, but we will need the prior knowledge of the
value of γ, which we have derived in the last subsection.
Let us now return to the precession of Mercury's perihelion ω_Mercury, which we
analyzed, in the framework of GR, in Sect. 5.5.
From the observational point of view, this precession measured from an Earth-based
observer is actually very large: 5599.74″ per century. But we need to subtract a number of
classical corrections:
– the precession of Earth equinoxes, 5025.64″ (a period of 25 700 years): this has
nothing to do with Mercury's precession, it is just a change of coordinates to refer
the measurement to an inertial frame (ECI frame, see Chap. 14),
– perturbations to the orbit due to Venus (277.8″), Jupiter (153.6″), Earth (90.0″)
and other planets (10.0″), inferred from Newtonian mechanics and observation of
the orbits.
When all these terms are considered, we are left with a residual precession of
0.1035″/revolution, or 42.98″/century, almost exactly the result that Einstein had
computed.13
$$ \dot{\omega}^{GR}_{Mercury} = \frac{6\pi G M_\odot}{a(1-e^2)c^2\,T} = 42.98''/\text{century} $$
T, e and a are the period, the eccentricity and the semi-major axis of the planet's
orbit. This was the first, amazing success for his new theory14: the excess motion
of Mercury's perihelion of 43″ per century had been observed by Le Verrier in
1859 and had remained enigmatic up to the formulation of GR.
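Both the classical budget and the GR prediction can be reproduced with a short calculation. The sketch below uses the rounded contributions listed above and commonly quoted orbital elements for Mercury; the orbital elements, the physical constants and all names are assumptions introduced for illustration.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, solar gravitational parameter (assumed value)
C = 299_792_458.0           # m/s, speed of light
A_MERCURY = 5.7909e10       # m, semi-major axis (assumed value)
E_MERCURY = 0.2056          # eccentricity (assumed value)
T_MERCURY_DAYS = 87.969     # orbital period in days (assumed value)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0

# Classical budget, arcsec per century, using the rounded values quoted in the text.
observed = 5599.74
equinox_precession = 5025.64
planetary = {"Venus": 277.8, "Jupiter": 153.6, "Earth": 90.0, "others": 10.0}
residual = observed - equinox_precession - sum(planetary.values())

# GR prediction: 6 pi G M / (a (1 - e^2) c^2) radians per revolution.
per_orbit_rad = 6.0 * math.pi * GM_SUN / (A_MERCURY * (1.0 - E_MERCURY ** 2) * C ** 2)
per_orbit_arcsec = per_orbit_rad * RAD_TO_ARCSEC
orbits_per_century = 36525.0 / T_MERCURY_DAYS
per_century_arcsec = per_orbit_arcsec * orbits_per_century

print(f"residual from rounded budget: {residual:.2f} arcsec/century")
print(f"GR: {per_orbit_arcsec:.4f} arcsec/orbit, {per_century_arcsec:.2f} arcsec/century")
```

The rounded contributions leave about 42.7″ per century, close to the 42.98″ quoted in the text (the small difference presumably reflects the rounding of the individual terms), while the GR formula gives 0.1035″ per revolution and 42.98″ per century.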
So, the GR prediction exactly matches observation: what else is there to learn from
Mercury's precession? As mentioned in Sect. 1.6, this stunning result could be
spoiled by one additional correction:
13 Einstein had computed 43″/century, compatible with the less precise knowledge of c and a of
those times.
14 Einstein wrote: "for a few days, I was beside myself with joyous excitement".
$$ \dot{\omega}^{J_2}_{Mercury} = -\frac{3\pi R_\odot^2 J_2}{a^2(1-e^2)^2\,T} = 1.275\times10^{5} \cdot J_2\ \text{arcsec/century} \qquad (6.35) $$
The dimensionless parameter J_2 is defined in Eq. 1.18, but can also be expressed,
for a spheroid like the Sun, as J_2 = (I_C − I_A)/(M R²), with I_C and I_A the two principal
inertia moments of the Sun. Therefore, a J_2 value of the order of 10⁻⁵ or more
could give a substantial contribution to the excess precession, and thus invalidate
the "perfect match" between theory and measurement. This objection was raised
by R. Dicke, who had just proposed his theory alternative to GR. He actually
carried out a measurement (Dicke and Goldenberg 1967), with a result, J_2 =
(2.47 ± 0.23) × 10⁻⁵, that seemed to support this interpretation.
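To see why a J_2 of order 10⁻⁵ matters, the magnitude of the coefficient in Eq. 6.35 can be evaluated numerically. The sketch below computes only the magnitude of the effect (the sign convention is left aside) and uses commonly quoted values for the solar radius and Mercury's orbital elements; these numbers and the names are assumptions, so the result agrees with the 1.275 × 10⁵ of the text only to within a fraction of a percent.

```python
import math

R_SUN = 6.957e8             # m, nominal solar radius (assumed value)
A_MERCURY = 5.7909e10       # m, semi-major axis (assumed value)
E_MERCURY = 0.2056          # eccentricity (assumed value)
T_MERCURY_DAYS = 87.969     # orbital period in days (assumed value)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0


def j2_coefficient_arcsec_per_century() -> float:
    """Magnitude of 3 pi R^2 / (a^2 (1 - e^2)^2 T), in arcsec/century per unit J2."""
    per_orbit_rad = 3.0 * math.pi * R_SUN ** 2 / (A_MERCURY ** 2 * (1.0 - E_MERCURY ** 2) ** 2)
    orbits_per_century = 36525.0 / T_MERCURY_DAYS
    return per_orbit_rad * RAD_TO_ARCSEC * orbits_per_century


coefficient = j2_coefficient_arcsec_per_century()
dicke_contribution = coefficient * 2.47e-5   # Dicke and Goldenberg (1967) value of J2
print(f"coefficient: {coefficient:.3e} arcsec/century per unit J2")
print(f"J2 = 2.47e-5 -> {dicke_contribution:.2f} arcsec/century")
```

With Dicke's value the contribution is about 3″ per century, a sizeable fraction of the 43″ relativistic excess, which explains why the solar quadrupole moment became a central issue.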
This was sufficient motivation to work out the derivation with the PPN metric:
μ and M are the reduced mass and the total mass of the Sun-Mercury system.
The first factor, multiplying the parenthesis, is the GR result. Inside the parenthesis
we have two terms: the first is the one that allows us to evaluate β. Its value is unity for
GR. The second term in Eq. 6.36 depends on the parameters α_i and ζ_2. However, this
term is weighted by the ratio μ/M between reduced and total mass of the system.
For Mercury this is
$$ \frac{\mu}{M} \sim \frac{M_{Mercury}}{M_\odot} \sim 2\cdot10^{-7} $$
Therefore, its contribution to the precession is well below the experimental uncertainty.
The last term is due to the Sun's quadrupole moment, which had raised warm
expectations among the supporters of the Brans–Dicke theory. However, many
measurements of the solar J_2 were made after Dicke's (see Sect. 1.6), and improved
data have dispelled these allegations, pushing the corresponding β value toward the
GR limit. Recent determinations based on helioseismology, mostly from the space
missions SOHO15 and RHESSI16 give
15 Solar and Heliospheric Observatory, NASA. Orbiting the L1 Lagrange point; 1995–ongoing.
16 Reuven Ramaty High Energy Solar Spectroscopic Imager, NASA. Geocentric 2002–2018.
way too low to produce any appreciable contribution to ω̇. An interesting historical
review of the quest for J2 and its implications in relativity can be found in Rozelot
and Damiani (2011).
Once all the classical effects have been accounted for, a bound for the parameter
β is obtained. A first bound was (Shapiro 1976):
(1/3)(2 + 2γ − β) = 1.003 ± 0.005 - Mercury perihelion, 1966−1976
A significant boost in the accuracy of the measurements of the perihelion advance
was obtained thanks to the data of the MESSENGER mission. In 2011, MESSENGER
became the first artificial satellite set in orbit around Mercury. It orbited for four years,
ending its life with a controlled crash of the spacecraft on Mercury's surface. The range
and Doppler (i.e. range rate) data of MESSENGER improved our knowledge of
Mercury's orbit around the Sun. Although the most striking relativistic feature, the
perihelion precession, only depends on the combination (2 + 2γ − β), the detailed
dynamics of the planet has a more complex dependence (see Will 2018) that allows the
two parameters to be pinpointed separately. Analysis of the MESSENGER data yielded
these bounds on γ and β (Verma et al. 2014):
The parameter β is also sensitive to the Nordtvedt effect, which we will discuss in the
next section. Using MESSENGER data and assuming the existence of the Nordtvedt
effect, the derived limit on |β − 1| was set to
The MESSENGER data also yielded an improved estimate for the solar quadrupole
moment (Genova et al. 2018)
To obtain experimental bounds for the other PPN parameters we need to consider other aspects of metric theories. To reveal a possible violation of the strong equivalence principle (SEP), we apply the PPN formalism to self-gravitating systems, i.e. systems in which the gravitational energy of the body itself is not negligible.
We recall that EEP already incorporates, besides the universality of free fall (UFF), the principles of local Lorentz invariance (LLI: non-existence of preferred reference frames) and of local position invariance (LPI: independence of the outcome of non-gravitational experiments from their position in space-time). The SEP extends the LPI and LLI concepts to experiments with a strong self-gravitating contribution. General Relativity incorporates SEP, while other theories can violate it.
Consider a SEP violation, in which gravitational energy falls with a different acceleration than other forms of energy: a body of mass m and gravitational self-energy $E_g$, in a gravitational potential Φ, falls with acceleration

$$\mathbf{a} = -\left(1 - \eta\,\frac{E_g}{m c^2}\right)\nabla\Phi$$
The violation parameter η quantifies the difference between the gravitational mass $m_g$ and the inertial mass $m_i$ of the test body

$$\frac{m_g}{m_i} = 1 - \eta\,\frac{E_g}{m_i c^2}$$
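In the PPN formalism, η is given by

$$\eta = 4\beta - \gamma - 3 - \frac{10}{3}\,\xi - \alpha_1 + \frac{2}{3}\,\alpha_2 - \frac{2}{3}\,\zeta_1 - \frac{1}{3}\,\zeta_2 \tag{6.37}$$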
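and the gravitational self-energy per unit mass is

$$\frac{E_g}{m} = G\,\frac{1}{\int \rho(\mathbf{r})\,d^3r}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d^3r\,d^3r' \tag{6.38}$$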
where ρ is the body mass density. For laboratory systems, the values of the gravitational self-energy are of the order $E_g/mc^2 \sim 10^{-27}$. To have larger values of this ratio, we should consider systems such as the Moon ($\sim 2\cdot 10^{-11}$), the Earth ($\sim 4.6\cdot 10^{-10}$), or even better, the Sun ($\sim 10^{-5}$). If SEP is violated, the difference in self-gravitational energy content of the Earth and the Moon will force the two bodies to fall differently towards the Sun. As a consequence, we would observe a polarization of the Earth-Moon orbit. This is known as the Nordtvedt effect (Nordtvedt 1968). A proper calculation of the deformation in the Earth-Moon orbit yields the following magnitude for the effect:
$$\delta r = 13.1\,\eta\,\cos\left[(\omega_{\rm Moon} - \omega_{\odot})\,t\right]$$

where δr is measured in metres, and $\omega_{\rm Moon}$ and $\omega_{\odot}$ are the angular frequencies of the Moon’s and the Sun’s revolutions, assumed circular and observed from a reference frame fixed to the Earth. A violation of SEP would displace the lunar orbit along the Earth–Sun line by an amount δr, producing a range signature having a 29.53 day synodic period (different from the lunar orbit period of 27.32 days).
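The orders of magnitude quoted above can be checked with a short sketch. It assumes a uniform-density sphere, for which the self-energy (each pair of mass elements counted once) is (3/5)GM²/R; real bodies are centrally condensed, so the Earth value comes out somewhat below the figure quoted in the text. The masses, radii, orbital periods and function names are our own illustrative inputs, and the value of η used at the end is only an example taken from Table 6.1.

```python
import math

G = 6.674e-11            # m^3 kg^-1 s^-2
C = 299_792_458.0        # m/s


def self_energy_ratio_uniform(mass_kg, radius_m):
    """E_g / (m c^2) for a uniform sphere: (3/5) G M / (R c^2)."""
    return 0.6 * G * mass_kg / (radius_m * C**2)


def synodic_period_days(sidereal_month_days=27.321661,
                        year_days=365.256363):
    """Period of cos[(omega_Moon - omega_Sun) t], in days."""
    return 1.0 / (1.0 / sidereal_month_days - 1.0 / year_days)


def nordtvedt_amplitude_m(eta):
    """Amplitude of the range signature, delta_r = 13.1 * eta (metres)."""
    return 13.1 * eta


moon = self_energy_ratio_uniform(7.342e22, 1.7374e6)
earth = self_energy_ratio_uniform(5.972e24, 6.371e6)
lab = self_energy_ratio_uniform(1.0, 0.05)     # a 1 kg, 5 cm sphere
synodic = synodic_period_days()
amplitude = nordtvedt_amplitude_m(3e-4)        # eta at the Table 6.1 level

print(f"Moon  E_g/mc^2 ~ {moon:.1e}")
print(f"Earth E_g/mc^2 ~ {earth:.1e}")
print(f"lab   E_g/mc^2 ~ {lab:.1e}")
print(f"synodic period = {synodic:.2f} d, delta_r = {amplitude * 1e3:.1f} mm")
```

An η of a few times 10⁻⁴ thus corresponds to a range oscillation of a few millimetres, which is why sub-centimetre ranging accuracy is required.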
Lunar Laser Ranging measurements have been performed regularly since 1969, and are discussed in Sect. 1.9. Their accuracy is better than 1 cm, that is, about 50 ps in the round-trip travel time of an e.m. signal from the Earth to the Moon.
A comprehensive model took into account the classical orbit, the perturbation
effects due to the classical field from the Sun and from the other planets, the tidal
interactions, lunar librations, atmospheric effects, etc. The analysis of the residuals
yielded the following limit (Hofmann et al. 2010):
The study of the orbit of Mercury is the other method well suited to test SEP via the Nordtvedt effect. A SEP violation would cause an indirect perturbation of Mercury’s orbit. It has been noted that current planetary ephemerides compute the position of the Solar System Barycenter (SSB) under the hypothesis that gravitational and inertial masses are equal, so that a violation of the SEP may lead to an intrinsic mismodeling of the SSB position.
The data confirm the validity of the strong equivalence principle with a significantly refined uncertainty of the Nordtvedt parameter (Genova et al. 2018)
Violations of LLI and LPI are related, in the PPN framework, to the values of the parameters
α1 , α2 , α3 , ξ
We recall that LPI states that identical experiments must have identical results regard-
less of where and when they are performed: i.e. they must not depend on their location
x μ in space-time. Therefore, non-zero values of these parameters would imply a local
variation of G in time and/or its dependence on the position in space. The idea that
G, as well as other universal constants, might not be really constants is rather old.
Indeed, it goes back to the 1930s, when Paul Dirac (1938) presented his Large Number Hypothesis (LNH). He was probably influenced by some numerical coincidences, such as the ratio of the radius of the Universe $cT_U$ to the classical radius of the electron $r_e = e^2/(4\pi\epsilon_0 m_e c^2)$:

$$\frac{cT_U}{r_e} \sim 10^{40}$$
being so similar to the ratio of the electrical to the gravitational forces between a proton and an electron:

$$\frac{e^2}{4\pi\epsilon_0\,G\,m_p m_e} \sim 10^{40}$$
In the LNH Dirac argued that five dimensionless combinations of universal constants (pure numbers) should be considered as variable parameters characterising the state of the universe. One consequence of this phenomenological approach is that the gravitational coupling constant should change as the universe evolves, as follows from supposing that the numerical coincidence cited above hides a fundamental physical principle.19 It is then reasonable to speculate that the time dependence of G is directly connected to the expansion rate of the Universe, given by the Hubble parameter $H_0 = 70 \pm 5$ km/(s Mpc),20 i.e. $\dot G/G \propto H_0$. We must also note that, if fundamental constants are time dependent, the present mystery of the so-called dark energy might be related to this, because dark energy dominates the present expansion of the universe.
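The two large numbers, and the scale that $\dot G/G \propto H_0$ would set, are easy to evaluate. The sketch below uses CODATA-like values of the constants; taking the age of the Universe as $T_U \sim 1/H_0$ is our own simplifying assumption, as are the function names.

```python
import math

E_CHARGE = 1.602176634e-19      # C
EPS0 = 8.8541878128e-12         # F/m
G = 6.67430e-11                 # m^3 kg^-1 s^-2
C = 299_792_458.0               # m/s
M_E = 9.1093837015e-31          # kg
M_P = 1.67262192369e-27         # kg
MPC_M = 3.0856775814913673e22   # metres in one megaparsec
YEAR_S = 3.15576e7              # seconds in a Julian year


def classical_electron_radius():
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * M_E * C**2)


def force_ratio():
    """Electric over gravitational force between a proton and an electron."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * G * M_P * M_E)


def hubble_rate_per_year(h0_km_s_mpc=70.0):
    """H0 expressed in yr^-1."""
    return h0_km_s_mpc * 1.0e3 / MPC_M * YEAR_S


def size_ratio(h0_km_s_mpc=70.0):
    """c T_U / r_e, with the assumption T_U ~ 1/H0."""
    t_u_seconds = MPC_M / (h0_km_s_mpc * 1.0e3)
    return C * t_u_seconds / classical_electron_radius()


print(f"force ratio  ~ {force_ratio():.1e}")
print(f"c T_U / r_e  ~ {size_ratio():.1e}")
print(f"H0           ~ {hubble_rate_per_year():.1e} per year")
```

Both ratios come out within about one order of magnitude of $10^{40}$, and $H_0$ corresponds to a rate of order $10^{-10}$ yr⁻¹, which is the natural scale against which experimental bounds on $\dot G/G$ are compared.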
Dirac’s argument opened a research field with a significant number of experiments of different types, such as the search for anomalies in the Earth tides, anomalous contributions to the planetary and lunar orbits, self-accelerations of pulsars or variations of the Earth’s spin. These measurements, from which upper bounds on the time derivative of G (and thus on the PPN parameters) are derived, can be roughly classified in three categories:

$$\frac{\dot G}{G} \lesssim 10^{-13}\ \mathrm{yr}^{-1}$$
– Observations over times of the order of $10^{9}$–$10^{10}$ years, not necessarily including the early universe. In this category we classify the bounds obtained by palaeontological or stellar astrophysics observations, with the latter including also the observations of helioseismology, pulsars in binary systems and globular clusters. The bounds obtained are

$$\frac{\dot G}{G} \sim 10^{-11}\text{–}10^{-12}\ \mathrm{yr}^{-1}$$
19 Dirac focused on Newton’s constant G: since then, a possible time dependence of the speed of light, the Planck constant and the fine structure constant has also been considered.
20 Several estimates of $H_0$ are available today, and there is tension among them. We chose an uncertainty large enough to contain all determinations.
– The third class consists of experiments performed on the time scale of a human life, i.e., of the order of decades. These include possible variations in the planetary orbits, in the orbital motion of a binary system containing a pulsar and/or in the oscillations of white dwarfs. The bounds obtained are:

$$\frac{\dot G}{G} \sim 10^{-10}\text{–}10^{-11}\ \mathrm{yr}^{-1}$$
Although the results of the first category place the most stringent bounds on the long term variations of G, it must be noted that they depend on the cosmological model used and on the corresponding model for the variation of G.
The best result to date is derived from a global fit performed on orbits of planets and satellites (Verma et al. 2014):

$$\frac{\dot G}{G} = (-5 \pm 3)\cdot 10^{-14}\ \mathrm{yr}^{-1} \qquad \text{(Solar System orbits)}$$
However, as in all observations in the Solar System, the measured quantity is actually the product GM:

$$\frac{1}{GM}\,\frac{d(GM)}{dt} = \frac{\dot G}{G} + \frac{\dot M}{M} \le (-6 \pm 4)\cdot 10^{-14}\ \mathrm{yr}^{-1}$$

thus, that limit depends on the estimate of the Sun’s mass loss $\dot M/M \sim (-1.12 \pm 0.25)\cdot 10^{-13}\ \mathrm{yr}^{-1}$ due to solar radiation and wind.
Actually, the change in time of any fundamental constant would be evidence of LPI violation. There is a wide variety of measurements on this subject (Uzan 2003). We just mention here the bound set on the fine structure constant, $\dot\alpha_{em}/\alpha_{em} \le 2\cdot 10^{-8}$, derived from geophysical surveys of the abundances of U isotopes measured in a natural nuclear reactor, active about $2\cdot 10^{9}$ years ago, located in Oklo, Gabon.
We finish this rather long section on the experimental bounds of the PPN parameters,
reporting some results about the tests of conservation laws. These tests involve the
parameters
α3 , ζ1 , ζ2 , ζ3 , ζ4 .
A limit on ζ3 can be derived from the Lunar Laser Ranging tracking of the Moon’s orbit. We discussed in Chap. 1 the experiment of Bartlett and Van Buren, which probed the difference between active and passive mass: it set a bound of $m_A/m_P - 1 \le 4\times 10^{-12}$ (Bartlett and Van Buren 1986). If we assume this violation to be associated with the electrostatic energy contribution of the nucleus $E_{el}$, we can translate this limit into a limit on ζ3:

$$\zeta_3 = 2\,\frac{m_A - m_P}{E_{el}} \le 1\cdot 10^{-8}$$
A significant limit on |ζ2 + α3| is deduced from the observations of binary systems of neutron stars such as PSR1913+16, which we shall describe in detail in Chap. 12. The violation of the conservation of linear momentum in a binary system would cause a self-acceleration of the system’s centre of mass

$$\mathbf{a}_{cm} = -\frac{1}{2}\,(\zeta_2 + \alpha_3)\,\frac{m_1 - m_2}{a^3}\,\frac{m_1 m_2}{m_1 + m_2}\,\frac{e}{(1 - e^2)^{3/2}}\,\mathbf{n} \tag{6.39}$$

where $m_1$ and $m_2$ are the star masses, a the semi-major axis of the orbit, e its eccentricity, and n is a unit vector oriented from the centre of mass to the point of periastron of $m_1$. From the observed data of the pulsar PSR1913+16, one of the best upper bounds on this combination has been deduced:

$$|\alpha_3 + \zeta_2| < 4\cdot 10^{-5}$$
Finally, we recall a consideration by Will (2018): in any reasonable theory there
must be a relation between the quantities ρv 2 , ρ, p (something like Bernoulli’s
theorem in classical fluid dynamics), that leads to a linear relation among the PPN
parameters: 6ζ4 = 3α3 + 2ζ1 − 3ζ3 . This allows us to reduce the number of param-
eters, disregarding ζ4 .
6.7 Gravitomagnetism and PPN Parameters
The gravitomagnetic effect can also be studied using the PPN metric. Consider the PPN predictions for the two precessions studied in Sects. 5.6 and 5.8:
– the geodetic precession:

$$\boldsymbol{\Omega}_{Geo} = \frac{1}{2}\,(1 + 2\gamma)\,\mathbf{v}\times\nabla\Phi \tag{6.40}$$

– the Lense-Thirring precession, which is related to the dragging of inertial frames caused by the spin-spin coupling with the central body. In the PPN framework, it depends on the parameters γ and α1:

$$\boldsymbol{\Omega}_{LT} = -\frac{1}{2}\left(1 + \gamma + \frac{\alpha_1}{4}\right)\frac{G}{r^3 c^2}\left[\mathbf{L} - 3\,(\mathbf{n}\cdot\mathbf{L})\,\mathbf{n}\right] \tag{6.41}$$
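To get a feeling for the size of the geodetic term, the following sketch evaluates the magnitude of Eq. 6.40 for a gyroscope on a circular orbit around the Earth, where v is perpendicular to ∇Φ. We restore the factor 1/c² explicitly (an assumption about the units of Eq. 6.40), and the orbital radius, corresponding to a low polar orbit of about 650 km altitude, is an illustrative choice of ours; the function name is hypothetical.

```python
import math

GM_EARTH = 3.986004418e14   # m^3 s^-2
C = 299_792_458.0           # m/s
YEAR_S = 3.15576e7          # s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def geodetic_rate_arcsec_per_year(orbit_radius_m, gamma=1.0):
    """|Omega_Geo| = (1 + 2*gamma)/2 * v * |grad Phi| / c^2, circular orbit."""
    v = math.sqrt(GM_EARTH / orbit_radius_m)       # orbital speed
    grad_phi = GM_EARTH / orbit_radius_m**2        # |grad Phi| = GM / r^2
    omega = 0.5 * (1.0 + 2.0 * gamma) * v * grad_phi / C**2   # rad/s
    return omega * YEAR_S * ARCSEC_PER_RAD


rate_gr = geodetic_rate_arcsec_per_year(7.027e6)               # gamma = 1
rate_no_curvature = geodetic_rate_arcsec_per_year(7.027e6, gamma=0.0)
print(f"geodetic precession (GR): {rate_gr:.2f} arcsec/yr")
```

The γ-dependent part carries two thirds of the effect in General Relativity, so a measurement of the geodetic rate translates directly into a bound on γ.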
6.8 Conclusions
We have briefly reviewed over fifty years of measurements and observations of the
deflection and delay of e.m. radiation, precession, geophysical and astronomical
effects. They have all been repeatedly performed, with ever-increasing accuracy, and
they are all leading to the conclusion that:
The values of the parameters of the PPN formalism are consistent with those expected
by General Relativity.
In Table 6.1 we report a summary of the situation as of 2020. Nevertheless, exper-
imental efforts to find a flaw in the predictions of Einstein’s theory continue, with
the hope of opening the way to a description of the universe that can address the
fundamental questions, such as the nature of dark matter and dark energy, that still
remain unanswered.
In the foreseeable future, improvements in PPN testing are expected, on the weak
field side, from new space projects, with improved sensitivity for tracking and gravity
measurements.
Many missions have been proposed, approved or launched, such as
Table 6.1 Upper bounds, measured in the weak field regime, i.e. in the Solar System, for the PPN parameters (as of 2020). Since the values of all these parameters are consistent with zero, we report the experimental error, as a bound to each value. See Imperi et al. (2018), Will (2018) and references therein

Parameter       σ                     Solar system observation
γ − 1           $2.3\cdot 10^{-5}$    Time delay, Cassini radio tracking
β − 1           $3.9\cdot 10^{-5}$    Mercury precession, MESSENGER tracking
ξ               $10^{-3}$             Terrestrial tides, gravimeters
α1              $6\cdot 10^{-6}$      Solar System precession, LLR
α2              $2\cdot 10^{-6}$      Precession of solar spin, ecliptic alignment
α3              $2\cdot 10^{-7}$      Perihelion shift, Lunar Laser Ranging
ζ3              $10^{-8}$             Newton’s 3rd law, Bartlett and Van Buren (Chap. 1)
η               $3\cdot 10^{-4}$      Nordtvedt effect, Moon orbit, LLR
Ġ/G (yr⁻¹)      $4\cdot 10^{-14}$     Mars ephemeris, MRO ranging
These missions have probed, or will probe, effects that we discussed here, like Mer-
cury’s perihelion precession and Shapiro delay, as well as others, like planetary
perturbations on the ranging, Lense-Thirring effect, Compton wavelength of the
graviton and non-gravitational forces (De Marchi and Cascioli 2020). It is worth
noting that all these missions, as well as CASSINI and MESSENGER mentioned
above, carry numerous complex instruments for a variety of different measurements.
The relativistic measurements, typically based on the analysis of tracking and timing
data, are a byproduct of much more expensive and complex projects.
The strongest input for setting bounds (or finding the values, should they turn out to be different from zero) on the PPN parameters has come in recent years from pulsar observations and timing and from gravitational wave detection. In both cases, observations are carried out on objects where gravity is much stronger: $U \sim 10^{-3}$–$10^{-1}$ near pulsars, as opposed to $U \sim 10^{-5}$ in the Solar System. Therefore, the expansion in powers of U and other potentials can easily break down, and a full general-relativistic analysis might be required. The PPN parameters, as coefficients of such an expansion, might therefore lose meaning. For this reason, the PPN parameters as measured in strong field are labeled with a “hat” ˆ in the literature. In Chap. 12 we will overview the limits set in the strong field regime.
References
Bartlett, D.F., Van Buren, D.: Equivalence of active and passive gravitational mass using the Moon. Phys. Rev. Lett. 57 (1986)
Bertotti, B., Iess, L., Tortora, P.: A test of general relativity using radio links with the Cassini spacecraft. Nature 425, 374–376 (2003)
Brans, C., Dicke, R.H.: Mach’s principle and a relativistic theory of gravitation. Phys. Rev. 124,
925 (1961)
De Marchi, F., Cascioli, G.: Testing general relativity in the solar system: present and future per-
spectives. Class. Quantum Grav. 37, 095007 (2020)
Dicke, R.H., Goldenberg, H.M.: Solar oblateness and general relativity. Phys. Rev. Lett. 18, 313 (1967)
Dirac, P.A.M.: A new basis for cosmology. Proc. R. Soc. Lond. Ser. A 165, 199–208 (1938)
Fomalont, E.B., Sramek, R.A.: Measurements of the solar gravitational deflection of radio waves
in agreement with general relativity. Phys. Rev. Lett. 36, 1475 (1976)
Froeschlé, M., Mignard, F., Arenou, F.: Determination of the PPN parameter gamma with the
HIPPARCOS data. In: ESA Symposium Hipparcos - Venice 97, May 1997, Venice, Italy, pp.
49–52
Genova, A., Mazarico, E., Goossens, S., Lemoine, F.G., Neumann, G.A., David, E., Smith, D.E.,
Zuber, M.T.: Solar system expansion and strong equivalence principle as seen by the NASA
MESSENGER mission. Nat. Commun. 9, 289 (2018)
Hellings, R.W.: Relativistic effects in astronomical timing measurements. Astron. J. 91, 650 (1986)
Hofmann, F., Müller, J., Biskupek, L.: Lunar laser ranging test of the Nordtvedt parameter and a
possible variation in the gravitational constant. Astron. Astrophys. 522, L5 (2010)
Imperi, L., Iess, L., Mariani, M.J.: An analysis of the geodesy and relativity experiments of Bepi-
Colombo. Icarus 301, 9–25 (2018)
Jennison, R.C.: A phase sensitive interferometer technique for the measurement of the Fourier
transforms of spatial brightness distributions of small angular extent. MNRAS 118, 276–284
(1958)
Lambert, S.B., Le Poncin-Lafitte, C.: Improved determination of γ by VLBI. A&A 529, A70 (2011)
Landau, L., Lifshitz, E.: Classical Theory of Fields. Addison-Wesley, Reading (1951)
Ni, W.T.: A new theory of gravity. Phys. Rev. D 7, 2880 (1973)
Nordtvedt, K.: Testing relativity with laser ranging to the Moon. Phys. Rev. 170, 1186 (1968)
Poisson, E., Will, C.M.: Gravity. Cambridge University Press, Cambridge (2014)
Reasenberg, R.D., et al.: Viking relativity experiment - verification of signal retardation by solar gravity. Ap. J. 234, L219–L221 (1979)
Rozelot, J.P., Damiani, C.: History of solar oblateness measurements and interpretation. Eur. Phys.
J. H. 36, 407–436 (2011)
Shapiro, I.I., et al.: Fourth test of general relativity: preliminary results. Phys. Rev. Lett. 20, 1265
(1968)
Shapiro, I.I., Counselman, C.C., III., King, R.W.: Verification of the principle of equivalence for
massive bodies. Phys. Rev. Lett. 36, 555 (1976)
Shapiro, I.I., et al.: The Viking relativity experiment. J. Geophys. Res. 82, 4329–4334 (1977)
Titov, O., et al.: Testing general relativity with geodetic VLBI. A&A 618, A8 (2018)
Uzan, J.P.: The fundamental constants and their variation: observational and theoretical status. Rev.
Modern Phys. 75, 403 (2003)
Vecchiato, A.: Astrometric tests of general relativity in the solar system: mathematical and compu-
tational scenarios. J. Phys.: Conf. Ser. 490, 012241 (2014)
Verma, A.K., Fienga, A., Laskar, J., Manche, H., Gastineau, M.: Use of MESSENGER radioscience data to improve planetary ephemeris and to test general relativity. A&A 561, A115 (2014). arXiv:1306.5569
Weinberg, S.: Gravitation and Cosmology: Principles and Applications of the General Theory of
Relativity. Wiley, New York (1972)
Will, C.M.: The theoretical tools of experimental gravitation, pp. 1–110. In: Bertotti, B. (ed.)
Proceedings of the International School Enrico Fermi “Experimental Gravitation”. Academic,
Cambridge (1974)
Will, C.M.: The confrontation between general relativity and experiment. Living Rev. Relativ. 17,
4 (2014)
Will, C.M.: Theory and Experiment in Gravitational Physics. Cambridge University Press, Cam-
bridge (2018)
Further Reading
Wambsganss, J.: Gravitational Lensing in Astronomy. Living Reviews in Relativity 1, 12 (1998)
Will, C.M., Nordtvedt, K. Jr.: Conservation laws and preferred frames in relativistic gravity. I.
Preferred-frame theories and an extended PPN formalism. Ap. J. 177, 774 (1972)
7 Gravitational Waves
The first mention of the remarkable concept of gravitational waves dates back to the end of the 19th century, when Heaviside published his book Electromagnetic Theory (Heaviside 1893) and assumed that gravitational radiation should exist, based on an analogy with electromagnetic propagation.
In 1905 Henri Poincaré, in his long paper La Dynamique de l’Electron for the proceedings of the Mathematical Circle of Palermo (Poincaré 1905), wrote that all forces have to be subject to the Lorentz transformations, as in the electromagnetic case. If this condition is also imposed on gravitation, it follows that the gravitational interaction also propagates at the speed of light. It should be emphasized that this work was written before A. Einstein’s famous paper on Special Relativity.
The explicit derivation of the wave propagation of the gravitational interaction was presented by A. Einstein on June 22nd, 1916 in Berlin, at the meeting of the Königlich Preussische Akademie der Wissenschaften (Einstein 1916). On that occasion Einstein proved that the equations of the gravitational field can be linearized under the condition of small perturbations $|h_{\mu\nu}| \ll 1$ of the Minkowski metric $\eta_{\mu\nu}$.
We recall once more (see previous chapters) that the elementary space-time interval $ds^2$ is related to the metric tensor by the relation:

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu \tag{7.1}$$

with

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \quad \text{and} \quad |h_{\mu\nu}| \ll 1 \tag{7.2}$$
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 147
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_7
In this approximation, the field equations are linearized and the quantities $h_{\mu\nu}$ are obtained by writing the analogue of the retarded-potential solution of electromagnetism (see Chap. 5).
A year later Einstein wrote a second paper (Einstein 1918). This is a complete review of topics such as the absorption of incident waves by mechanical systems. There we can find the expression (with a computational error, corrected only later on) of the power radiated in gravitational waves by a moving system, i.e. its gravitational luminosity $L_G$:
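$$L_G = \frac{G}{5c^5}\sum_{kh}\left(\frac{d^3 D_{kh}}{dt^3}\right)^2 \tag{7.3}$$

where G and c are the usual universal constants and $D_{kh}$ (with k, h = 1, 2, 3) is the quadrupole moment of the emitter:

$$D_{kh} = \int_V \rho\left(x_k x_h - \frac{1}{3}\,\delta_{kh}\,x^2\right)dV \tag{7.4}$$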
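Equations 7.3 and 7.4 can be evaluated numerically for a simple source. The sketch below is an illustration under our own assumptions: a binary on a circular Keplerian orbit, reduced to a single point mass μ (the reduced mass) at the relative separation a, with the third time derivative estimated by central finite differences. The result is compared with the closed form 32Gμ²a⁴ω⁶/(5c⁵) that follows from the same two equations. All function names and the chosen masses and separation are hypothetical.

```python
import math

G = 6.67430e-11        # m^3 kg^-1 s^-2
C = 299_792_458.0      # m/s
M_SUN = 1.98847e30     # kg


def quadrupole(t, mu, a, omega):
    """D_kh of Eq. 7.4 for a point mass mu on a circle of radius a (x-y plane)."""
    x = (a * math.cos(omega * t), a * math.sin(omega * t), 0.0)
    r2 = x[0]**2 + x[1]**2 + x[2]**2
    return [[mu * (x[k] * x[h] - (r2 / 3.0 if k == h else 0.0))
             for h in range(3)] for k in range(3)]


def third_derivative(func, t, step):
    """Central finite-difference estimate of d^3/dt^3 of a 3x3 matrix function."""
    fp2, fp1 = func(t + 2 * step), func(t + step)
    fm1, fm2 = func(t - step), func(t - 2 * step)
    return [[(fp2[k][h] - 2 * fp1[k][h] + 2 * fm1[k][h] - fm2[k][h])
             / (2 * step**3) for h in range(3)] for k in range(3)]


def luminosity_numeric(mu, a, omega, t=0.0):
    """Eq. 7.3 with the third derivative taken numerically."""
    step = (2.0 * math.pi / omega) / 2000.0
    d3 = third_derivative(lambda s: quadrupole(s, mu, a, omega), t, step)
    total = sum(d3[k][h]**2 for k in range(3) for h in range(3))
    return G / (5.0 * C**5) * total


def luminosity_circular(mu, a, omega):
    """Closed form obtained from Eqs. 7.3 and 7.4 for a circular orbit."""
    return 32.0 / 5.0 * G * mu**2 * a**4 * omega**6 / C**5


m1 = m2 = 1.4 * M_SUN                    # assumed neutron-star masses
a = 2.0e9                                # assumed separation, metres
mu = m1 * m2 / (m1 + m2)
omega = math.sqrt(G * (m1 + m2) / a**3)  # Kepler's third law
l_num = luminosity_numeric(mu, a, omega)
l_ref = luminosity_circular(mu, a, omega)
print(f"numeric: {l_num:.3e} W, closed form: {l_ref:.3e} W")
```

The steep dependence on ω⁶ (equivalently on $a^{-5}$ for a Keplerian orbit) is what makes compact binaries the strongest known emitters.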
The issue of whether GW are real and detectable, and not a mere mathematical artifact, was debated for about 40 years after Einstein’s 1916 paper.1 It was tackled at the 1957 Chapel Hill conference “On the Role of Gravitation in Physics”, when F. Pirani proved that, in the presence of a gravitational wave, a set of freely-falling particles would experience actual motions with respect to one another. Thus, gravitational waves must be real and measurable. Pirani (1956) considered two test masses linked by a spring and H. Bondi suggested (1959) also inserting a dashpot, to extract energy from the GW. While interferometers eventually replaced the spring with a laser beam (the laser was yet to be invented at the time), it was J. Weber of the University of Maryland who began the extraordinary effort of bringing the problem of gravitational waves into the field of experimental physics (Weber 1960) (Fig. 7.1).
Then, research in experimental gravitation intensified, as did, in parallel, the theoretical activity. During the 1960s R. Penrose published his work on the spinorial theory of General Relativity (Penrose 1960; Penrose and Rindler 1984) and later, the scalar-tensor theory of Brans–Dicke (1961) was formulated as an alternative to Einstein’s theory. As mentioned in Chap. 6, many other theories have since been proposed: their benchmark is the observable post-Newtonian effects, translated in terms of PPN parameter values.
This theoretical framework is also used to analyze the observations conducted on binary systems in which the post-Newtonian effects are significantly larger. Such binary systems turn out to be perfect laboratories to test PN gravitational effects. Indeed, the first ever observation of the effects of gravitational wave emission (the change in orbital period due to energy loss) was made in the first and most famous of these binaries, PSR 1913+16, discovered by Taylor and Hulse (1975).
1 Einstein himself, in a 1937 paper (flawed, and not accepted by Physical Review) had lifted
Fig. 7.1 Joseph Weber and one of his resonant gravitational antennas, equipped in the central section
with piezoelectric transducers. Credits Special Collections and University Archives, University of
Maryland Libraries, https://hdl.handle.net/1903.1/32732
This observation, although indirect, of the effect of gravitational waves gave new momentum to the activity toward direct detection. The cryogenic resonant antennas were developed in the years between 1975 and 2005: Chap. 8 describes these detectors. In the first decade of the new century, interferometers with km-long arms began to operate, both in the USA and in Italy. In the following decade the construction of the second generation interferometers, with advanced configuration, was finalized and on September 14th, 2015 the two aLIGO interferometers detected the first gravitational wave signal, emitted by the coalescence of two black holes. The announcement of the discovery was given in the USA and in Italy by the LIGO and Virgo collaborations: it was February 11th, 2016, one hundred years after the publication of the article by A. Einstein on General Relativity. On the same day, a landmark scientific paper was published in Physical Review Letters (Abbott et al. 2016), containing the results and the details of the data analysis jointly carried out by the LIGO and Virgo collaborations over five frantic months. For this long awaited discovery, the Nobel prize for physics was awarded in 2017 to three of the leading figures of the LIGO project: B. C. Barish, K. S. Thorne and R. Weiss. Then, in August 2017 the three instruments sent an “alarm” to the astronomical community: they had detected a signal from the coalescence of two Neutron Stars (NS), located in an area of ∼30 square degrees in the southern sky at a distance from the Earth ranging from 85 to 160 million light years. The hunt for the optical counterpart started immediately afterwards, carried on by an impressive number of telescopes covering the entire electromagnetic spectrum. The source was identified as a Kilonova (Metzger 2010) in the galaxy NGC 4993, marking the birth of multi-messenger astronomy.
At the moment of writing, while the two LIGOs in the US and Virgo in Europe continue observations, detecting gravitational signals at a rate of about one per week, a fourth interferometer of kilometer scale is about to join the network: KAGRA, in Japan, will have new advanced features like underground location and cryogenic mirrors. A fifth interferometer in India will also be ready in 2024–25. These systems will constitute a planetary network aimed at writing a new chapter of Gravitational Astronomy and Gravitodynamics. Chapter 9 analyzes in detail these amazing instruments, capable of measuring strains as tiny as $h \sim 10^{-21}$.
In the early 2030s, the spaceborne detector LISA (Laser Interferometer Space
Antenna) will scan the sky, observing waves in the mHz frequency band that carry a
huge potential of astrophysical information, complementary to that in the 20–1000 Hz
band observed by Earth-based instruments.
Many key technologies for LISA have been successfully verified by a dedicated space mission of the European Space Agency: LISA-Pathfinder flew in 2016–2017, exceeding the expected performance in terms of noise immunity and readout accuracy. Details are reported in Chap. 11.
Finally, the next decade will also witness the birth of gravitational wave obser-
vation in the nHz range through Pulsar Timing Arrays, thanks to the coordinated
effort of IPTA, a world-wide network of radio telescopes. An overview of these
observatories and related techniques can be found in Chap. 12.
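7.2 Gravitational Wave Properties
We can start our discussion from Eq. 5.20, where we have shown how the modified metric perturbation $\tilde h_{\mu\nu}$ satisfies the D’Alembert equation2

$$\frac{\partial^2 \tilde h_{\mu\nu}}{\partial x^\lambda\,\partial x_\lambda} = \frac{16\pi G}{c^4}\,T_{\mu\nu} \tag{7.5}$$

with

$$g_{\mu\nu} = \tilde h_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu}\tilde h + \eta_{\mu\nu} \tag{7.6}$$

Lorentz’s gauge condition, $\partial \tilde h^{\mu}{}_{\nu}/\partial x^{\mu} = 0$, that we used to obtain Eq. 7.5, does not yet uniquely fix the choice of the reference system. Indeed, we must consider that any change of coordinates $x'^{\alpha} = x^{\alpha} + \epsilon^{\alpha}(x)$ subject to the condition:

$$\frac{\partial^2 \epsilon^{\mu}}{\partial x^\lambda\,\partial x_\lambda} = 0 \tag{7.7}$$

preserves the Lorentz gauge condition and transforms the perturbation as

$$\tilde h_{\mu\nu} \rightarrow \tilde h_{\mu\nu} - \frac{\partial \epsilon_\mu}{\partial x^\nu} - \frac{\partial \epsilon_\nu}{\partial x^\mu} + \eta_{\mu\nu}\,\frac{\partial \epsilon^\lambda}{\partial x^\lambda} \tag{7.8}$$

2 Note that, with our choice of the signature (+, −, −, −), $\partial^2/\partial x^\lambda \partial x_\lambda$ has the opposite sign with respect to the usual definition of the D’Alembertian operator $\Box = \partial^2/\partial x_j \partial x_j - (1/c^2)\,\partial^2/\partial t^2$.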
In conclusion, the 16 components of the tensor h μν are reduced to 10 because the
tensor is symmetric. Moreover, the choice of Lorentz’s gauge eliminates 4 degrees
of freedom, and the choice of reference system (Eq. 7.8) 4 more, so that we conclude
that there are only two independent components of h μν .
To study the properties of waves, we consider the problem of propagation in empty space, i.e. where the energy-momentum tensor is null. The field equations (7.5) take the form:

$$\frac{\partial^2 \tilde h_{\mu\nu}}{\partial x^\lambda\,\partial x_\lambda} = 0 \tag{7.9}$$

The simplest solution of this equation is a plane wave, of the form:

$$\tilde h_{\mu\nu} = \mathrm{Re}\left[A_{\mu\nu}\exp(i k_\alpha x^\alpha)\right] \tag{7.10}$$

Here the analogy with the case of electromagnetism is almost perfect. We easily get:

$$k_\alpha k^\alpha = 0$$

which shows that $k^\alpha$ is a null vector: this is typical of fields with massless exchange particles (gravitons) and proves that the wave propagates at the speed of light. Imposing the Lorentz condition $\partial \tilde h^{\mu}{}_{\nu}/\partial x^{\mu} = 0$ on the plane wave solution we get

$$k^\alpha A_{\nu\alpha} = 0$$
We now use the freedom offered by the choice of reference frame to set the trace of the perturbation to zero:

$$\tilde h^{\mu}{}_{\mu} = \tilde h = 0$$

$$h_{0k} = 0$$

and, since the three $h_{0k}$ are set to zero, this implies that $h_{00}$ is constant in time: it corresponds to the static (Newtonian) background gravitational field, and can be set to zero for the wave solution: now all $h_{0\mu} = 0$, and the Lorentz conditions reduce to $\partial h_{ij}/\partial x_i = 0$. This last condition, applied to Eq. 7.10, yields $k^i A_{ij} = 0$, showing that gravitational waves are transverse, i.e. have no component in the direction of propagation.3
Summarizing, the above constraints allow us to have:

$$h = 0; \quad h_{\mu\nu} = \tilde h_{\mu\nu}; \quad h_{\mu 0} = 0; \quad h_{kk} = 0; \quad \frac{\partial h_{ij}}{\partial x_i} = 0$$

In this gauge $h_{\mu\nu}$ is a traceless tensor in which only two spatial components survive. This is called the TT gauge (Transverse and Traceless); in the case of a plane wave propagating along the x direction, the most general solution can be cast in the form:

$$h^{TT}_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & h_{yy} & h_{yz} \\ 0 & 0 & h_{yz} & -h_{yy} \end{pmatrix} \tag{7.11}$$
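The generic plane wave can be expressed as a linear superposition of just two components:

$$h_{yy} = h_+ = \mathrm{Re}\left[A_+ \exp\left(i\omega(t - x/c)\right)\right] \tag{7.12}$$

$$h_{yz} = h_\times = \mathrm{Re}\left[A_\times \exp\left(i\omega(t - x/c)\right)\right] \tag{7.13}$$

corresponding to two independent polarization states. The reason for these symbols will become clear shortly. The two states are usually represented through the tensors $e_+$, $e_\times$:

$$e_+ = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \qquad e_\times = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \tag{7.14}$$

and the two scalar quantities $a_+$, $a_\times$ so that4

$$A_+ = a_+\,e_+ \qquad A_\times = a_\times\,e_\times \tag{7.15}$$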
By taking into account the metric perturbation Eq. 7.11, a space-time interval
becomes, in presence of gravitational waves traveling along the x axis:
$$\frac{\partial^2 \xi^\mu}{\partial s^2} + u^\alpha u^\beta \xi^\gamma R^{\mu}{}_{\alpha\beta\gamma} = 0 \tag{7.17}$$

To further simplify the problem we set the origin of the reference frame on particle A, so that $\xi^i = x^i_B$, and express the variation of the geodesic line in terms of changes in proper time dτ. Within the usual WFSM approximation $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ we get from Eq. 7.17

$$\frac{d^2 x^i_B}{d\tau^2} \simeq \frac{1}{2}\,\frac{\partial^2 h^{i(TT)}_{k}}{\partial \tau^2}\,x^k_B \tag{7.18}$$

Spelled out for a wave propagating along the x axis, Eq. 7.18 reads (we now drop the (TT) superscript):

$$\frac{d^2 y_B}{d\tau^2} = \frac{1}{2}\left(\frac{\partial^2 h_+}{\partial \tau^2}\,y_B + \frac{\partial^2 h_\times}{\partial \tau^2}\,z_B\right), \qquad \frac{d^2 z_B}{d\tau^2} = \frac{1}{2}\left(\frac{\partial^2 h_\times}{\partial \tau^2}\,y_B - \frac{\partial^2 h_+}{\partial \tau^2}\,z_B\right) \tag{7.19}$$

Thus, the motion of the particle B seen from A appears as due to an acceleration field proportional to the second time derivative of h. Integration of Eq. 7.18 yields:

$$x^i_B(\tau) = x^k_B(0)\left[\delta^i_k + \frac{1}{2}\,h^{i(TT)}_{k}\right] \tag{7.20}$$

If, for example, our test-particles A and B are located along the y axis, with unperturbed distance $l_{AB}$, and the incoming (along x) wave has polarization $h_+$, Eq. 7.20 simply means:

$$\frac{\delta l_{AB}}{l_{AB}} = \frac{h_+}{2}$$
Two test particles along the z axis would move accordingly, but with a π phase difference ($h_{zz} = -h_{yy}$). On the other hand, if the $h_\times$ polarization is involved, it can be seen from Eq. 7.19 that the same effect would be experienced by two test particles positioned at π/4 with respect to the y, z axes.
This analysis allows us to derive the effect of a gravitational wave on a system of
point-like masses distributed along a circle normal to the propagation direction. In
Fig. 7.2 we show the field lines for the two polarization states of the wave, with the
typical quadrupolar structure of a tidal force.
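The pattern just described can be reproduced with a few lines of code. The sketch below applies Eq. 7.20 to test particles in the y-z plane, for a wave travelling along x; it returns displacements rather than positions, because for $h \sim 10^{-21}$ the change is far below double-precision resolution when added to the unperturbed coordinate. The function names, the 3 km baseline and the ring radius are illustrative assumptions.

```python
import math


def tt_displacement(y0, z0, h_plus=0.0, h_cross=0.0):
    """Displacement (dy, dz) of a particle at (y0, z0) relative to the origin,
    from Eq. 7.20 with h_yy = -h_zz = h_plus and h_yz = h_zy = h_cross."""
    dy = 0.5 * (h_plus * y0 + h_cross * z0)
    dz = 0.5 * (h_cross * y0 - h_plus * z0)
    return dy, dz


def ring_displacements(radius, n_points, h_plus=0.0, h_cross=0.0):
    """Displacements of n_points particles on a circle normal to the x axis."""
    result = []
    for i in range(n_points):
        phi = 2.0 * math.pi * i / n_points
        y0, z0 = radius * math.cos(phi), radius * math.sin(phi)
        result.append(((y0, z0), tt_displacement(y0, z0, h_plus, h_cross)))
    return result


arm = 3000.0                                              # assumed baseline, m
along_y = tt_displacement(arm, 0.0, h_plus=1e-21)         # stretched
along_z = tt_displacement(0.0, arm, h_plus=1e-21)         # squeezed
diagonal = tt_displacement(1.0, 1.0, h_cross=1e-21)       # particle at pi/4

print(f"delta l along y: {along_y[0]:.2e} m")
for (y0, z0), (dy, dz) in ring_displacements(1.0, 8, h_plus=1e-21):
    print(f"({y0:+.2f}, {z0:+.2f}) -> ({dy:+.2e}, {dz:+.2e})")
```

For a strain of $10^{-21}$ on a kilometre-scale baseline the displacement is of order $10^{-18}$ m, and the two arms along y and z change length with opposite signs, which is the quantity a Michelson-type detector is sensitive to.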
$$R^{(1)}_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu}R^{(1)} = \frac{8\pi G}{c^4}\left(T_{\mu\nu} + t_{\mu\nu}\right) \tag{7.21}$$
The source of the field is expressed by the term $T_{\mu\nu} + t_{\mu\nu}$ and it depends on $h_{\mu\nu}$, as discussed after Eq. (5.15). At the same time, the metric perturbation is generated by the total mass and by the flow of energy and momentum so that we reasonably interpret $t_{\mu\nu}$ as the energy-momentum tensor of the wave itself.

Fig. 7.3 The lines of force of the + and × polarizations of gravitational waves. It can be easily seen that the two polarizations differ by a rotation of π/4
$$t_{\mu\nu} = \frac{c^4}{8\pi G}\left[R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R - R^{(1)}_{\mu\nu} + \frac{1}{2}\,\eta_{\mu\nu} R^{(1)}\right] \tag{7.22}$$
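$t_{\mu\nu}$ is deduced by a series expansion for small values of $h$, up to $h^2$:
$$t_{\mu\nu} \simeq \frac{c^4}{8\pi G}\left[R^{(2)}_{\mu\nu} - \frac{1}{2}\,\eta_{\mu\nu}\,\eta^{\alpha\beta} R^{(2)}_{\alpha\beta}\right] \tag{7.23}$$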
The explicit expansion is a rather laborious but instructive exercise and we refer
the reader to the textbook of Weinberg (1972) for details. Averaging over many
wavelengths of radiation results in a relatively simple expression
$$\langle t_{\mu\nu} \rangle = \frac{c^4}{16\pi G}\, k_\mu k_\nu \left(h_+^2 + h_\times^2\right) \tag{7.24}$$
from which it is finally possible to derive the expression of the wave intensity (or
flux):
$$F = \frac{c^3}{16\pi G}\left[\left(\frac{dh_+}{dt}\right)^2 + \left(\frac{dh_\times}{dt}\right)^2\right] \tag{7.25}$$
We have previously seen that the verification and comparison of different theories
of gravitation are based on the post-Newtonian phenomena, i.e. in the WFSM limit
and on comparison of the relative values of the PPN parameters. We now need to
remark that alternative theories to General Relativity are associated with different
predictions for the wave field properties. We will limit ourselves here to discussing the
predictions of a few metric theories of gravity.
In a metric theory the role of non-gravitational fields is to help define the space-
time curvature associated with the metric. Matter can generate these fields and they
can help generate the metric, but they cannot directly interact with the matter that
only responds to the metric itself. It follows that the metric and the equation of motion
for matter are the milestones of the theory. Then the difference between alternative
metric theories is the particular way that matter (and other non-gravitational fields)
have to generate the metric of space-time.
Some theories of gravity postulate the existence of a dynamic scalar gravitational
field φ in addition to the metric tensor g, and the Brans–Dicke theory (Brans and Dicke
1961) can be considered a special case in this category. We have here two arbitrary
functions of φ: the cosmological function λ(φ) and the coupling function ω(φ).
In the case of the Brans–Dicke theory we assume ω = constant and λ = 0, and in the
ω → ∞ limit the theory returns General Relativity.
The cosmological function expresses the characteristic length of interaction l of
the scalar field. Indeed, for an isolated system, it can be shown that the gravitational
potential contains a Yukawa-like term as well as the Newtonian one:
$$U(\vec{x}, t) = \int \frac{\rho(\vec{\xi}, t)\left(a + b\, e^{-|\vec{x} - \vec{\xi}|/l}\right)}{|\vec{x} - \vec{\xi}|}\, d^3\xi$$
We note that scalar-tensor theories are conservative and, in general, for large
values of ω their predictions differ from those of Einstein’s theory by corrections of
order O(1/ω), both for the post-Newtonian limit and for the gravitational wave
characteristics.
A different case is represented by vector-tensor theories, where a gravitational
4-vector field $K^\mu$ is added to the metric tensor. The theories in this category
ultimately differ only in the values of four parameters. In the limit where these
parameters tend to zero, the theories reduce to the Einstein theory.
These semi-conservative theories, in the post-Newtonian limit, produce effects
compatible with the existence of privileged reference systems. Moreover, the
linearized field equations are more complex than the Einsteinian and scalar-tensor
ones, and the tensor propagation solution for $h_{\mu\nu}$ is influenced by the value of the
cosmological background field K. In general there are as many as ten different wave
solutions, each with its own characteristic polarization and velocity condition.
In general, there are two ways in which the velocity of gravitational waves $v_g$ can differ
from the speed of light. The first is its dependence on cosmological parameters and the
second on the local distribution of matter. Observations in the solar system constrain some
of the parameters that define $v_g$, so that it differs by less than $10^{-3}$ from the value
of the speed of light. A crucial test for these theories was provided by the comparison
of the arrival times of the gravitational wave and electromagnetic signals emitted
in the same astrophysical process and detected on Earth. The GW event, due to the
merging of two neutron stars at 40 Mpc from the Earth, was detected on August 17th,
2017, by the LIGO-Virgo network. The electromagnetic signal was received within
1.7 s by the two satellites Fermi and Integral, equipped with gamma ray detectors.
The limits set on the ratio $(c_{GW} - c_{em})/c_{em}$ range between $-3 \times 10^{-15}$ and
$+7 \times 10^{-16}$ (Abbott et al. 2017b).
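The order of magnitude of this bound can be recovered with a back-of-the-envelope estimate. The sketch below assumes, for illustration only, that both signals were emitted simultaneously and that the whole 1.7 s delay accumulated over the 40 Mpc path; the published limits rely on more conservative assumptions about the distance and the emission delay, so they are not reproduced exactly. The function name and the rounded constants are choices made here.

```python
C = 299_792_458.0   # speed of light, m/s
MPC = 3.0857e22     # one megaparsec in metres (rounded)


def fractional_speed_difference(delay_s, distance_mpc):
    """Rough |v_GW - c| / c implied by an arrival delay over a given distance.

    Assumes simultaneous emission, so the delay is entirely a propagation effect.
    """
    travel_time_s = distance_mpc * MPC / C
    return delay_s / travel_time_s


if __name__ == "__main__":
    print(f"{fractional_speed_difference(1.7, 40.0):.1e}")
```

The result is a few parts in $10^{16}$, the same order as the quoted upper limit.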
As for the polarization of gravitational radiation, it must be said that the Einstein
prediction of having only two independent states is probably unique with respect to
other metric theories. The most general prediction is of six polarization states which
are expressed by means of six Riemann tensor components $R_{0i0j}$ which determine
the forces applied to the detector.
Eardley et al. (1973) developed a classification method based on the study of
the symmetry properties of the amplitudes of gravitational waves when a rotation is
applied around the wave propagation direction (helicity).
In principle, assuming a wave coming from a single source, with a known direction
of propagation, and using the information obtained by two non-parallel
interferometers, it is possible to discriminate the class of metric theories that fits the
polarization state of the detected wave.
– a generation zone with $r \lesssim r_I$, where $r_I$ is the typical internal radius of the source,
– a local area with $r_I \lesssim r \lesssim r_{ex}$, where $r_{ex}$ defines the extension of the local zone,
and finally
– the wave zone (propagation zone) for $r \geq r_{ex}$.
The boundary lengths, $r_I$ and $r_{ex}$, are also related to the wavelength of the emitted
radiation. The internal radius must be larger than the reduced wavelength of the
radiation, that is $r_I \gg \lambda_g/2\pi$; at the same time $r_I$ is the edge of a region where the
gravity field of the source is relatively weak, and $r_I \gg L$, where $L$ is a measure of the
linear dimensions of the source. For the external radius we will have
$$r_{ex} - r_I \gg \lambda/2\pi$$
However, if $r_{ex}$ is too large we should not ignore the phase shift δ due to the
gravitational field of the source when the wave is passing through the local area. So,
we must also impose that
$$\delta = \frac{2\pi R_g}{\lambda}\,\ln\frac{r_{ex}}{r_I} \ll 1 \tag{7.26}$$
We need all these conditions to handle propagation in the local area as in the case
of quasi-flat metric space-time. We use again, as a starting point, the field equations
(7.9) and the Lorentz gauge condition:
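$$\frac{\partial^2 h^{\mu\nu}}{\partial x^\alpha\, \partial x_\alpha} = -\frac{16\pi G}{c^4}\, T^{\mu\nu}\,; \qquad \frac{\partial h^{\mu\alpha}}{\partial x^\alpha} = 0$$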
and again we proceed in close analogy with the electromagnetic case:
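$$\frac{\partial^2 A^\mu}{\partial x^\alpha\, \partial x_\alpha} = \frac{4\pi}{c}\, J^\mu\,; \qquad \frac{\partial A^\mu}{\partial x^\mu} = 0$$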
where Aμ is the 4-potential and Jμ is the 4-current. When we combine the first
equation and the second in both systems we get
$$\frac{\partial T^{\mu\alpha}}{\partial x^\alpha} = 0\,; \qquad \frac{\partial J^\mu}{\partial x^\mu} = 0 \tag{7.27}$$
Both equations represent conservation laws: the conservation of charge in the
electromagnetic case and the laws of conservation in mechanics, in particular the
conservation of 4-momentum, in the case of gravity. Both solutions are given in terms of
retarded potentials. However, when we perform the multipolar expansion of the solution,
in the case of gravity, due to the conservation of momentum, the dipolar term is
zero and the first non-zero contribution to the radiation field is the quadrupolar one.
It follows that the simplest calculation techniques related to the problem of gravi-
tational radiation generation are based on the so-called quadrupolar formalism. In
quadrupole approximation, it can be shown that the metric tensor expressed in the TT
gauge, is a function of the second time derivative of the mass quadrupole moment:
$$h^{(TT)}_{jk} = \frac{2}{r}\,\frac{\partial^2}{\partial t^2}\, D^{(TT)}_{jk}(t - r/c) \tag{7.28}$$
where r is the distance from the central point of the source, t is the proper time
measured by an observer at rest with respect to the source, (t − r /c) is the retarded
time and D jk is the quadrupole moment already defined in Eq. 7.4. Moreover, if
the internal motion of the source is slow, it is possible to neglect, in the local area,
the propagation delay (see Eq. 7.26) and to approximate the field considering the
Newtonian potential
$$U = \frac{c^2}{2}\,(1 - g_{00}) \tag{7.29}$$
From Eq. 7.28 we conclude that the most powerful sources should be characterized
by large internal masses in high-speed non-spherical motion. However, we have the
most interesting case when the gravity field of the source is very intense, i.e. when
the quadrupole approximation is not sufficient. To deal with this problem, various
$$L_G = \frac{G}{5c^5} \sum_{k,h} \left(\frac{d^3 D_{kh}}{dt^3}\right)^2 \tag{7.30}$$
Here we see that $L_G$ depends on the third derivative of the quadrupole moment
$D_{hk}$ and is proportional to the coefficient $G/5c^5 = 5.5 \times 10^{-54}$ m$^{-2}$ kg$^{-1}$ s$^3$. The
coefficient is extremely small, and rules out, as we shall show in the next section,
any hope of generating on Earth a measurable amount of gravitational waves. We are
then forced to consider as detectable sources of gravitational waves only those of
astrophysical origin.
Assuming an experimental point of view, we should correlate the luminosity
with the impinging flux on Earth F and the value of the amplitude of the metric
perturbation tensor h. If we assume the simplifying hypothesis of isotropic emission
we get:
$$F = \frac{L_G}{4\pi r^2} \tag{7.31}$$
where r is here the distance of the detector from the source. It is reasonable to expect
most of the sources to rotate around an axis, and thus emit with a non-symmetric
radiation pattern, so the factor 4π is an indicative value. When we consider the simple
case of a wave with a single Fourier component of angular frequency ω, the relation
between F and h is deduced from Eqs. 7.28, 7.30 and 7.31 and we conclude that
$$h = \frac{1}{\omega}\sqrt{\frac{16\pi G}{c^3}\, F} \tag{7.32}$$
Note that a value of $h \sim 10^{-20}$ corresponds to a large value of continuous energy
flow to Earth (about 20 W/m$^2$ for a frequency of 1 kHz) and a giant amount of
power emitted at astronomical distances: of the order of $10^{35}$ W for a source 1 pc
away.
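These orders of magnitude can be checked directly from Eqs. 7.31 and 7.32. The sketch below is only an illustration: the function names and the rounded values of the constants are choices made here, and a single monochromatic component with isotropic emission is assumed, as in the text.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
PARSEC = 3.086e16  # one parsec in metres


def flux_from_strain(h, freq_hz):
    """Energy flux (W/m^2) of a monochromatic wave of amplitude h (Eq. 7.32 inverted)."""
    omega = 2.0 * math.pi * freq_hz
    return C**3 * omega**2 * h**2 / (16.0 * math.pi * G)


def strain_from_flux(flux, freq_hz):
    """Amplitude h for a given flux, Eq. 7.32."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(16.0 * math.pi * G * flux / C**3) / omega


def isotropic_luminosity(flux, distance_m):
    """Source luminosity (W) for isotropic emission, Eq. 7.31 solved for L_G."""
    return 4.0 * math.pi * distance_m**2 * flux


if __name__ == "__main__":
    flux = flux_from_strain(1e-20, 1e3)
    print(f"flux = {flux:.1f} W/m^2")
    print(f"luminosity at 1 pc = {isotropic_luminosity(flux, PARSEC):.1e} W")
```

With these inputs the flux comes out at a few tens of W/m$^2$ and the luminosity at a few times $10^{35}$ W, the same orders of magnitude as quoted above.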
Having introduced the basic relations for the emission, we now review the
astrophysical processes that emit gravitational radiation that could be detectable on Earth.
We divide the sources into two broad categories:
Examples of continuous sources are the double star systems that rotate around each
other (rather, around their center of mass) or compact stars with some degree of
asymmetry, quickly rotating around their axis.
In this same category we also include stochastic signals of cosmological nature,
the gravitational analog of the microwave cosmic background of electromagnetic
radiation.
In the class of impulsive emitters we classify all the stellar collapse processes,
like the explosions of supernovae and the coalescence of binary systems.
In principle, one would hope to build in the laboratory the simplest source of contin-
uous signal, as a simple mechanical device: for example, we might consider a rod or
a dumbbell rotating around an axis perpendicular to its length, at angular velocity ω.
The calculation of the rod quadrupole moment in Cartesian coordinates is relatively
simple and we suggest it as an exercise. Applying the definition of luminosity (7.30),
we obtain
$$L_G = \frac{32}{5}\,\frac{G}{c^5}\, I^2 \omega^6 \tag{7.33}$$
where I is the moment of inertia with respect to the rotation axis. As a practical exam-
ple, we calculate the emission of a rotor actually built to calibrate the EXPLORER
gravitational wave detector, installed at CERN (Astone et al. 1991, 1993). A similar
rotor was also used to set a limit on the existence of a Yukawa-like “fifth force”
(Astone et al. 1998). This rotor is a 14 kg aluminum bar with a profile that mini-
mizes centrifugal forces. The bar rotates at 462 Hz, close to the tear-apart limit of the
material due to centrifugal forces. Its moment of inertia, relative to the rotation axis,
is I = 0.15 kg m2 . This rotor emits radiation at twice the rotation frequency, but its
luminosity is a mere $L_{GW} = 2.4 \times 10^{-33}$ W. This value is so small that it makes it
impossible, in the light of current technology, to attempt to detect the emitted
gravitational wave. It is therefore clear that only astrophysical (i.e. very massive and
compact) objects rotating at high angular velocity can provide a significantly large
signal; and even in this case, the signals are still (to date) not large enough to be
measurable.
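The rotor figure follows directly from Eq. 7.33 with the parameters given above (I = 0.15 kg m$^2$, rotation at 462 Hz). The snippet below is a minimal numerical check; the function name and the rounded constants are choices made here.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # speed of light, m/s


def rotor_luminosity(moment_of_inertia, rotation_hz):
    """Gravitational luminosity (W) of a rotating rod, Eq. 7.33."""
    omega = 2.0 * math.pi * rotation_hz  # angular velocity of the rotor, rad/s
    return 32.0 / 5.0 * G / C**5 * moment_of_inertia**2 * omega**6


if __name__ == "__main__":
    print(f"G/(5 c^5) = {G / (5.0 * C**5):.2e} m^-2 kg^-1 s^3")
    print(f"L_GW      = {rotor_luminosity(0.15, 462.0):.2e} W")
```

The steep $\omega^6$ dependence is what makes spinning faster attractive in principle, but the prefactor $G/c^5$ dominates for any laboratory-scale moment of inertia.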
Rotating neutron stars can be sources of gravitational waves, if their crust has any
distortion. We shall consider them here only as GW sources.
More details about these peculiar stars can be found in Chap. 12. These compact
objects are the result of the evolutionary process of a star of large mass: once the
nuclear processes that sustain it are exhausted, the star tends to collapse under the
action of gravitational force. In this phase the protons react, through inverse β decay,
with the electrons to form neutrons and neutrinos. The neutrinos carry an
enormous amount of energy towards the upper layers of the star, propagating a devastating
shock wave and expelling the star mantle in a gigantic explosion. The light pulse that
is generated is known as the supernova, which hides the collapsed core inside,
that is, a neutron star of radius of the order of 10 km and mass of the order of $1.5\,M_\odot$.
While drastically decreasing its radius, the collapsed object begins to rotate more
quickly on itself due to the conservation of angular momentum. Some of these stars
have a very high associated magnetic field ($10^{11}$ to $10^{15}$ gauss). When the axis of
rotation does not coincide with the magnetic axis, the star emits electromagnetic
waves along the direction of the field. Such a star is called a Pulsar (pulsating radio
source) and emits radio waves through the combined effect of rapid rotation and intense
magnetic field on charged particles in its atmosphere. The waves are emitted within
a small cone around its magnetic axis: this, on the other hand, is tilted with respect
to the axis of rotation, so that the radio wave sweeps the sky at each turn, as happens
for the light emitted from the lighthouse of a port. An observer on Earth intercepts
the beam at regular time intervals and then receives radio pulses with a recurrence
period equal to that of the star’s rotation (Fig. 7.4).
Fig. 7.4 Composite (X-Ray and optical) image of the pulsar in the Crab nebula. Credits
NASA/HST/CXC/ASU/J. Hester et al. On the left a sketch that highlights the lighthouse effect.
Courtesy M. Kramer
In stars with higher angular velocity the decrease of the rotation frequency
(spin-down), when observed, is generally extremely small. This implies a low emission of
gravitational waves and consequently a high symmetry of the star. However, given
the fact that the intense electromagnetic emission of a pulsar can only be explained
by the presence of an extremely strong multipolar magnetic field, its asymmetry can
contribute significantly to its structural deformation. Moreover, the observation of
sudden changes in the rotation period of some pulsars is interpreted as a change in
the structure of the crust (a star-quake) and a settlement of its residual distortion.
To evaluate the luminosity of this potential source of gravitational waves, we
consider a star of mass M, having a simple ellipsoidal geometry with eccentricity e
$$e = \frac{a - b}{\sqrt{a \cdot b}}$$
$$I_3 = \frac{M}{5}\,(a^2 + b^2)$$
is the main moment of inertia with respect to the axis perpendicular to the equatorial
plane. The axis of rotation of the star is rotated with respect to the axis of $I_3$ by an
angle $\phi = \Omega t$, where $\Omega$ is the angular velocity.
Consider the matrix of the main moments of inertia:
$$I_{jk} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix} \tag{7.34}$$
We obtain:
$$R_{ik} R_{jl} I_{kl} = \begin{pmatrix} I_1 \cos^2\phi + I_2 \sin^2\phi & -\sin\phi\cos\phi\,(I_2 - I_1) & 0 \\ -\sin\phi\cos\phi\,(I_2 - I_1) & I_2 \cos^2\phi + I_1 \sin^2\phi & 0 \\ 0 & 0 & I_3 \end{pmatrix} \tag{7.36}$$
$$h^{TT}_{jk} = \frac{2G}{r c^4}\,\Lambda_{jklm}\,\frac{d^2}{dt^2}\, D_{lm}\!\left(t - \frac{r}{c}\right)$$
5 In practice $\Lambda_{jklm}$ is obtained by applying twice the tensor $P_{ij} = \delta_{ij} - n_i n_j$, which projects a
vector in the plane perpendicular to the direction of the unit vector n: $\Lambda_{jklm} = P_{jl} P_{km} - \frac{1}{2} P_{jk} P_{lm}$.
$$\Lambda_{jklm} = (\delta_{jl} - n_j n_l)(\delta_{km} - n_k n_m) - \frac{1}{2}\,(\delta_{jk} - n_j n_k)(\delta_{lm} - n_l n_m)$$
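The action of this projector is easy to verify numerically. The sketch below builds the transverse-traceless projector from the standard definition $P_{ij} = \delta_{ij} - n_i n_j$, with the index placement $\Lambda_{jklm} = P_{jl}P_{km} - \frac{1}{2}P_{jk}P_{lm}$ assumed here, and applies it to an arbitrary symmetric matrix. The function names are illustrative and not taken from the text.

```python
import numpy as np


def lambda_tt(n):
    """Rank-4 TT projector for propagation direction n (normalised internally)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    P = np.eye(3) - np.outer(n, n)  # projects onto the plane orthogonal to n
    return (np.einsum("jl,km->jklm", P, P)
            - 0.5 * np.einsum("jk,lm->jklm", P, P))


def tt_project(A, n):
    """Transverse-traceless part of a symmetric 3x3 matrix A with respect to n."""
    return np.einsum("jklm,lm->jk", lambda_tt(n), np.asarray(A, dtype=float))


if __name__ == "__main__":
    A = np.array([[1.0, 0.5, 0.2],
                  [0.5, -0.3, 0.1],
                  [0.2, 0.1, 2.0]])
    # For n along z only the upper-left 2x2 block survives, made traceless.
    print(np.round(tt_project(A, [0.0, 0.0, 1.0]), 3))
```

The projected matrix is symmetric, traceless and orthogonal to n, and projecting it a second time leaves it unchanged.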
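$$h^{TT}_{jk} = h_o \begin{pmatrix} -\cos(2\Omega t_r) & -\sin(2\Omega t_r) & 0 \\ -\sin(2\Omega t_r) & \cos(2\Omega t_r) & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{7.38}$$
where $t_r = t - r/c$ is the retarded time and the amplitude $h_o$ is
$$h_o = \frac{4 G \Omega^2}{c^4 r}\, I_3\, e \tag{7.39}$$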
Note that the gravitational signal is emitted at twice the rotation frequency, $\omega_g = 2\Omega$;
its amplitude depends linearly on $e$ and on the inverse square of the rotation period
$T = 2\pi/\Omega$.
A convenient rearrangement of this equation gives the amplitude in terms of
typical parameters:
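$$h_o = 4.21 \times 10^{-24} \left(\frac{1\ \mathrm{ms}}{T}\right)^2 \left(\frac{1\ \mathrm{kpc}}{r}\right) \left(\frac{I_3}{10^{38}\ \mathrm{kg\,m^2}}\right) \left(\frac{e}{10^{-6}}\right)$$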
Finally the luminosity is given by the equation
$$L_{GW} = \frac{32}{5}\,\frac{G}{c^5}\, I_3^2\, e^2\, \Omega^6 \tag{7.40}$$
An interesting example is the PSR 0532 pulsar in the Crab nebula, rotating at $\Omega \sim 2\pi \times$
29.67 Hz at a distance r = 2 kpc from the Earth,⁶ with e estimated between $10^{-6}$
and $10^{-8}$: we obtain a value of $h_o$ in the range $10^{-26}$ to $10^{-28}$.
Despite the low value of the metric perturbation, the group of the Tokyo University
and the KEK laboratory (Tsubono 1991) tried to measure the gravitational emission
using a resonant detector, specially shaped to resonate at 59.35 Hz and, not detecting
any signal, set a first upper limit: $h < 2 \times 10^{-21}$.
In recent years the LSC and Virgo collaborations, jointly analyzing the data of
the interferometers of the LIGO and Virgo projects, set a limit of $h \leq 2 \times 10^{-25}$. It
corresponds to an ellipticity of the order of $10^{-4}$. This is a first important result of
astrophysical nature, since the value is a factor 7 lower than the limit on the total
energy flow lost by the pulsar as derived from the radio observation of the slow-down
of its rotational frequency.
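Equation 7.39 can be evaluated directly, both to check the numerical prefactor of the scaling formula and to convert an upper limit on h into a limit on the ellipticity. In the sketch below the function names are illustrative, the constants are rounded, and the moment of inertia of the Crab pulsar is assumed equal to the reference value $10^{38}$ kg m$^2$, which the text does not state explicitly.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s
KPC = 3.0857e19  # one kiloparsec in metres


def pulsar_strain(period_s, distance_m, i3, ellipticity):
    """Amplitude h_o of Eq. 7.39 for a star rotating with period T = 2*pi/Omega."""
    omega = 2.0 * math.pi / period_s
    return 4.0 * G * omega**2 * i3 * ellipticity / (C**4 * distance_m)


def ellipticity_limit(h_limit, period_s, distance_m, i3):
    """Ellipticity corresponding to an upper limit on h (h_o is linear in e)."""
    return h_limit / pulsar_strain(period_s, distance_m, i3, 1.0)


if __name__ == "__main__":
    print(f"{pulsar_strain(1e-3, KPC, 1e38, 1e-6):.2e}")
    # Crab pulsar: 29.67 Hz rotation, 2 kpc; I3 = 1e38 kg m^2 is an assumed value.
    print(f"{ellipticity_limit(2e-25, 1.0 / 29.67, 2.0 * KPC, 1e38):.1e}")
```

The first value reproduces the prefactor of the scaling formula to within rounding of the constants; the second is of order $10^{-4}$, consistent with the ellipticity quoted for the LIGO-Virgo limit.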
Neutron stars can rotate at very high angular speeds. In fact, pulsars with
rotational frequency up to 716 Hz have been observed. These are objects produced
by accretion onto old neutron stars or by the collapse of a white dwarf
in a double star system, and we know that about 50% of the stars belong to multiple
systems. Current theoretical estimates predict that there should be more than $10^6$
double stars in our galaxy.
6 1 pc ≃ 3.086 × 10^16 m.
R. Wagoner suggested (Wagoner 1984) that high-frequency monochromatic
waves could be generated when the neutron star reaches a point of instability, at
which hydrodynamic waves are generated on the surface, thus determining the emis-
sion of gravitational radiation. In this case the emission counterbalances the increase
in rotational speed and it should be proportional to the flow of X-rays coming from
the star. Some observations in the X band indeed tend to confirm the existence of
growth of angular velocity; however, this process would critically depend on the
viscosity of the star.
We want to study the continuous wave emitted by two massive bodies rotating around
their center of mass and describing a circular orbit of radius a, with angular frequency
ω.
The system is conveniently described by the masses of the two bodies $m_1$ and $m_2$ or
by its total mass $M_{tot}$ and its reduced mass μ:
$$M_{tot} = m_1 + m_2\,; \qquad \mu = \frac{m_1 m_2}{m_1 + m_2}$$
$$L_{GW} = \frac{32}{5}\,\frac{G}{c^5}\, \mu^2 a^4 \omega^6 \tag{7.42}$$
In the assumption that the orbital parameters, a and ω, do not change significantly in
the time interval of the observation, a condition known as adiabatic approximation,
we write the third Kepler law in the form
$$\omega^2 a^3 = G M_{tot} \tag{7.43}$$
Substituting in (7.42), the luminosity, i.e. the rate of energy loss of the system, takes
the form:
$$L_{GW} = -\frac{dE}{dt} = \frac{32}{5}\,\frac{G^4}{c^5}\,(m_1 m_2)^2 (m_1 + m_2)\,\frac{1}{a^5} \tag{7.44}$$
We can also differentiate the orbital (kinetic + potential) energy of the system
E = −(Gm 1 m 2 )/(2a) and, comparing, find that the diminishing distance a(t) of
the two bodies follows the law:
$$\frac{da}{dt} = -\frac{64}{5}\,\frac{G^3}{c^5}\,(m_1 m_2)(m_1 + m_2)\,\frac{1}{a^3} \tag{7.45}$$
Integration of this relation with respect to time yields a(t)
$$a^4(t) = a_o^4 - \frac{256}{5}\,\frac{G^3}{c^5}\,(m_1 m_2)(m_1 + m_2)\, t$$
that we can rewrite as
$$a(t) = a_o \left(1 - \frac{t}{\tau_{coal}}\right)^{1/4} \tag{7.46}$$
where ao is the distance at the initial time and τcoal the characteristic time to coales-
cence of the phenomenon
$$\tau_{coal} = \frac{5}{256}\,\frac{c^5 a_o^4}{G^3 (m_1 m_2)(m_1 + m_2)} \tag{7.47}$$
From Eq. 7.45 and differentiating Eq. 7.43 we also see how the angular velocity (or
the rotation period) of the system varies over time:
$$\frac{d\omega}{dt} = \frac{96}{5}\left[\frac{G (m_1 + m_2)}{c^2 a}\right]^{3/2} \frac{G^2 m_1 m_2}{c^2 a^4} = \frac{3}{2}\,\omega\,\frac{L_{GW}}{|E|} \tag{7.48}$$
Solving Eq. 7.43 for ω and substituting Eq. 7.46 for a(t), we derive the orbital
angular frequency:
$$\omega = \omega_o \left(1 - \frac{t}{\tau_{coal}}\right)^{-3/8} \tag{7.49}$$
where the initial angular frequency is:
$$\omega_o = \sqrt{\frac{G M_{tot}}{a_o^3}} \tag{7.50}$$
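The adiabatic inspiral described by Eqs. 7.45 to 7.50 can be evaluated with a few lines of code. The sketch below is illustrative only: the function names, the rounded constants and the sample system (two 1.4 solar-mass stars at an initial separation of $10^9$ m) are assumptions made here, not values from the text.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg


def coalescence_time(m1, m2, a0):
    """Time to coalescence for a circular orbit of initial radius a0, Eq. 7.47."""
    return 5.0 * C**5 * a0**4 / (256.0 * G**3 * m1 * m2 * (m1 + m2))


def separation(t, m1, m2, a0):
    """Orbital separation a(t), Eq. 7.46 (valid for t < tau_coal)."""
    return a0 * (1.0 - t / coalescence_time(m1, m2, a0)) ** 0.25


def angular_frequency(t, m1, m2, a0):
    """Orbital angular frequency omega(t), Eqs. 7.49 and 7.50."""
    omega0 = math.sqrt(G * (m1 + m2) / a0**3)
    return omega0 * (1.0 - t / coalescence_time(m1, m2, a0)) ** (-3.0 / 8.0)


if __name__ == "__main__":
    m1 = m2 = 1.4 * M_SUN  # hypothetical binary neutron star
    a0 = 1.0e9             # hypothetical initial separation, m
    tau = coalescence_time(m1, m2, a0)
    print(f"tau_coal = {tau / 3.156e7:.2e} yr")
    for frac in (0.0, 0.5, 0.9, 0.99):
        t = frac * tau
        print(frac, f"{separation(t, m1, m2, a0):.3e} m",
              f"{angular_frequency(t, m1, m2, a0):.3e} rad/s")
```

Because $\tau_{coal}$ scales as $a_o^4$, halving the initial separation shortens the time to coalescence by a factor of 16; the frequency rises slowly for most of the evolution and diverges formally as t approaches $\tau_{coal}$.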
The relation (7.48) explains, although only qualitatively, the radio observations for
the PSR 1913+16 system. In the case of an elliptic orbit, the calculations are less
straightforward, but lead to similar results. The luminosity increases rapidly with
eccentricity and the emission is more intense near the periastron, where the effect
The signal frequency 2/T = 6 mHz is still too low for terrestrial detectors, but
well within the band of the future LISA observatory, which will operate in the frequency
range $10^{-4}$ to $10^{-1}$ Hz.
There is also a large population of binary systems in the galaxy that emit at
frequencies between $10^{-7}$ and $10^{-4}$ Hz. For the sources closest to us the intensity
is such as to produce waves with $h \sim 10^{-21}$. The number of sources is so high
as to produce a random overlap of signals as the dominant effect. This results in a
background noise of gravitational origin analogous to the cosmological background
noise.
We have mentioned in the previous paragraph the formula of luminosity for a binary
system in a circular orbit (see Eq. 7.42).
The loss by radiation reduces the orbital radius and causes the stars to move closer
until they collide. The intensity of the gravitational signal, averaged over all
possible detector orientations with respect to the source, is:
$$h \simeq 10^{-23} \left(\frac{M_{tot}}{M_\odot}\right)^{2/3} \left(\frac{\mu}{M_\odot}\right) \left(\frac{\nu}{100\ \mathrm{Hz}}\right)^{2/3} \left(\frac{100\ \mathrm{Mpc}}{r}\right)$$
The time scale for reduction of the orbital radius is expressed by the relation:
$$\tau = \frac{\nu}{\dot{\nu}} \simeq 8\ \mathrm{s}\, \left(\frac{\mu}{M_\odot}\right)^{-1} \left(\frac{M_{tot}}{M_\odot}\right)^{-2/3} \left(\frac{\nu}{100\ \mathrm{Hz}}\right)^{-8/3}$$
The wave emitted by the system appears similar to a sinusoidal function in which
both frequency and amplitude increase over time up to the moment when the stars
merge (last stable orbit). This waveform is called chirp (Fig. 7.5).
Fig. 7.5 The chirp, i.e. a gravitational wave signal emitted by the coalescence of a double neutron
star system in the absence of spin and matter transfer
7 In the case of the PSR 1913+16, the most intense Fourier component is the eighth harmonic.
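The 8 s prefactor of the time scale can be checked against the quadrupole result for circular orbits. The sketch below assumes the standard relation $\nu/\dot{\nu} = \frac{5}{96}\,\pi^{-8/3}\,(G/c^3)^{-5/3}\,(\mu M_{tot}^{2/3})^{-1}\,\nu^{-8/3}$ for the gravitational-wave frequency ν, which follows from Eq. 7.48 with ω = πν; the function name and the rounded constants are choices made here.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg


def chirp_timescale(m_tot, mu, nu_hz):
    """nu / nu_dot in seconds for a circular binary emitting at GW frequency nu_hz."""
    mass_factor = (G / C**3) ** (5.0 / 3.0) * mu * m_tot ** (2.0 / 3.0)
    return 5.0 / 96.0 / (math.pi ** (8.0 / 3.0) * mass_factor * nu_hz ** (8.0 / 3.0))


if __name__ == "__main__":
    # Reference point of the scaling relation: M_tot = mu = 1 solar mass, nu = 100 Hz.
    # (Not a physical binary, since mu <= M_tot / 4; used only to read off the prefactor.)
    print(f"{chirp_timescale(M_SUN, M_SUN, 100.0):.2f} s")
```

At the reference point the function returns approximately 8 s, and the steep $\nu^{-8/3}$ dependence shows why the signal sweeps through the upper part of a ground-based detector band in a fraction of a second.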
To describe the type and frequency evolution of this signal, it is customary to
introduce the chirp-mass:
$$\mathcal{M} = \left[\frac{(m_1 m_2)^3}{m_1 + m_2}\right]^{1/5} = \left[(m_1 m_2)^2\, \mu\right]^{1/5} \tag{7.51}$$
As we shall see, this combination of the masses m 1 and m 2 of the binary system,
largely determines how rapidly the frequency of the signal evolves over time. The two
components of the metric perturbation h are found, from the relations of the previous
section, with a lengthy but straightforward calculation (we refer, for instance, to the
book of Maggiore 2018):
$$h_+(t) = 2\,\frac{(G\mathcal{M})^{5/3}\,(\pi\nu(t))^{2/3}}{c^4 r}\,(1 + \cos^2 i)\,\cos(2\Phi(t) + \Phi_0)$$
$$h_\times(t) = -4\,\frac{(G\mathcal{M})^{5/3}\,(\pi\nu(t))^{2/3}}{c^4 r}\,\cos i\,\sin(2\Phi(t) + \Phi_0) \tag{7.52}$$
where $\nu = 2/T_{orbit}$ is the frequency of emission of gravitational waves, $i$ is the angle between the
orbital angular momentum and the observer’s line of sight, and $\Phi(t) \equiv 2\pi\nu(t - r/c)$ is
the orbital phase of the equivalent one-body system around the center of mass of the
binary system.
Equations 7.52 contain the instantaneous orbital frequency ν(t) and terms
dependent on the phase Φ(t), whose dominant components oscillate at twice the orbital
frequency:
$$\Phi(t) - \Phi_0 = 2\pi \int_0^t 2\nu(t')\, dt' \tag{7.53}$$
In the adiabatic approximation and for circular orbits, using the formulas related to
the evolution over time of the orbital frequency, Eqs. 7.47 and 7.49, we obtain
$$\Phi(t) - \Phi_0 = -2 \left[\frac{c^3 (\tau_{coal} - t)}{5 G \mathcal{M}}\right]^{5/8} \tag{7.54}$$
This relation shows how the measurement of the wave phase allows one to obtain direct
information on the chirp mass $\mathcal{M}$.
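As a minimal illustration of this statement, the sketch below evaluates the chirp mass of Eq. 7.51, the phase of Eq. 7.54, and the inversion of Eq. 7.54 that returns the chirp mass from a measured phase offset. The function names, the rounded constants and the sample values are assumptions made here; in practice the chirp mass is obtained by fitting the whole waveform, not from a single phase value.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg


def chirp_mass(m1, m2):
    """Chirp mass of Eq. 7.51, written as (m1 m2)^(3/5) / (m1 + m2)^(1/5)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def phase_offset(t, tau_coal, m_chirp):
    """Phi(t) - Phi_0 from Eq. 7.54 (valid for t < tau_coal)."""
    return -2.0 * (C**3 * (tau_coal - t) / (5.0 * G * m_chirp)) ** (5.0 / 8.0)


def chirp_mass_from_phase(t, tau_coal, dphi):
    """Chirp mass obtained by inverting Eq. 7.54 for a given phase offset dphi < 0."""
    return C**3 * (tau_coal - t) / (5.0 * G) * (-dphi / 2.0) ** (-8.0 / 5.0)


if __name__ == "__main__":
    mc = chirp_mass(1.4 * M_SUN, 1.4 * M_SUN)  # hypothetical equal-mass system
    print(f"chirp mass = {mc / M_SUN:.3f} M_sun")
    dphi = phase_offset(0.0, 10.0, mc)         # 10 s before coalescence
    print(f"recovered  = {chirp_mass_from_phase(0.0, 10.0, dphi) / M_SUN:.3f} M_sun")
```

For equal masses m the chirp mass reduces to $m / 2^{1/5}$, about 0.87 m.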
In a more rigorous approach we should also consider higher order amplitude
corrections that contain other harmonics (terms containing a phase of the type nΦ(t),
n being a positive integer). Furthermore, the expressions (7.52) are valid for a system
on an almost circular orbit, an assumption that is not entirely realistic. A detailed
post-Newtonian calculation must take into account spin-orbit and spin-spin couplings
between the two components of the binary system; they produce a characteristic
modulation of the emitted gravitational signal.
The final frequency of the gravitational signal, at the end of the spiralling, when
the two massive objects collide, increases up to (Thorne 1987):
$$f^* \sim 5 \left(\frac{M_{tot}}{M_\odot}\right)^{-1} \mathrm{kHz} \tag{7.55}$$
The relation (7.55) only holds if the system is composed of compact objects. When
the two objects collide (merging) and in the following dynamic evolution of the
new object, the masses are moving at relativistic speeds in a regime of extreme
gravitational fields. Merging implies a particularly violent dynamics that can lead
to the formation of a black hole, with a significant release of energy in the form
of gravitational radiation. The post-Newtonian approximation is certainly no longer
valid and numerical simulations need to be developed to predict the emission. The
characteristic time interval of the merging phase is very short: from a few milliseconds
in the case of stellar mass black holes to a few seconds in the case of large mass black
holes. During this phase, if a lighter star falls toward a heavier and more compact
one, a significant amount of matter will likely have a high angular momentum such as
to counteract the infall. In the most accredited models the process of matter transfer
determines the formation of an accretion disk around the black hole, which can feed
an intense jet of gamma rays along the rotation axis (Gamma Ray Burst, GRB).
After the merger phase of the two progenitor stars (neutron stars and/or black
holes), the new compact object evolves toward an equilibrium state. In this final
phase it vibrates at its fundamental modes, emitting gravitational radiation at the
frequencies of those oscillation modes that have a quadrupole symmetry (ringdown
phase). The gravitational wave flux emitted after merging is calculated by considering
the overlap of quasi-normal modes of the final object. The ringdown duration depends
on the mass of the final object: in practice it can consist of two or three cycles only.
The coalescence phase of the binary systems is now well modelled and the ampli-
tudes of the two components of the wave emitted are expressed by the relations
(7.52). Since Φ(t), the phase of the signal, is known from the post-Newtonian
calculation, it is possible to measure $\mathcal{M}$, ν, $t_0$ and $\Phi_0$ by properly analyzing the data.
The remaining unknown parameters of those equations can be inferred thanks to the
different response of the antennas, depending on the polarization and the direction of
wave propagation. Thus, from the observations of a network of three non-co-located
detectors, we extract three independent combinations for the polarization state and
two time delays of the observed signal.
The quantities ν, h and $\nu/\dot{\nu}$ are measurable, and Schutz (1986) noted that in the
product $h\,\nu/\dot{\nu}$ all masses cancel out, and we are left with an equation relating just
frequency ν and distance r. It is then possible to derive r independently from the
values of the stellar masses. This ability to measure distances is unique in Astronomy.
From the knowledge of these parameters and by adequately considering the response
function of the detectors, it is possible to determine the distance between the source
and the detectors with an accuracy of Δr/r ∼ 1–10%.
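The cancellation of the masses can be made explicit with the two orientation-averaged scaling relations for h and $\nu/\dot{\nu}$ given earlier in this section: their product is $8 \times 10^{-23}$ s $(\nu/100\ \mathrm{Hz})^{-2}\,(100\ \mathrm{Mpc}/r)$, which can be solved for r. The sketch below is an illustration under those approximate relations only; the function names and the sample system are assumptions made here, and a real analysis would fit the full waveform and the detector response.

```python
def strain_scaling(m_tot_sun, mu_sun, nu_hz, r_mpc):
    """Orientation-averaged amplitude from the scaling relation (masses in solar units)."""
    return (1e-23 * m_tot_sun ** (2.0 / 3.0) * mu_sun
            * (nu_hz / 100.0) ** (2.0 / 3.0) * (100.0 / r_mpc))


def timescale_scaling(m_tot_sun, mu_sun, nu_hz):
    """nu / nu_dot in seconds from the scaling relation (masses in solar units)."""
    return (8.0 / mu_sun * m_tot_sun ** (-2.0 / 3.0)
            * (nu_hz / 100.0) ** (-8.0 / 3.0))


def distance_mpc(h, tau_s, nu_hz):
    """Distance in Mpc from the measurable quantities h, nu / nu_dot and nu."""
    return 100.0 * 8e-23 * (100.0 / nu_hz) ** 2 / (h * tau_s)


if __name__ == "__main__":
    # Hypothetical source: M_tot = 2.8, mu = 0.7 solar masses, at 40 Mpc, seen at 150 Hz.
    h = strain_scaling(2.8, 0.7, 150.0, 40.0)
    tau = timescale_scaling(2.8, 0.7, 150.0)
    print(f"recovered distance = {distance_mpc(h, tau, 150.0):.1f} Mpc")
```

The masses enter `strain_scaling` and `timescale_scaling` with opposite powers, so `distance_mpc` needs no mass information at all.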
The scale of astronomical distances is based on the measurement of the Doppler
effect of electromagnetic signals, i.e. on the redshift of the source $z = \nu_{(em)}/\nu_{(obs)} - 1$.
The astronomical redshift is measured rather easily, because the emission and
absorption spectra of the various atoms are very distinct and very well known. The redshift
value is not obtained directly from the gravitational signals associated with the
coalescence of binary systems. This is because the contributions of the parameter z,
which affects in different ways the quantities that define the amplitude h(t), cancel
out. In fact, when the source is at redshift z, Eq. 7.52
must be rewritten mapping $\mathcal{M} \to \mathcal{M}_z = (1 + z)\mathcal{M}$ and $\nu \to \nu_{(em)}/(1 + z)$. The
phase Φ(t), being by definition the time integral of the angular frequency 2πν(t), exhibits a
similar cancellation effect and is independent of z.
Therefore, to study the evolution rate of the universe, z must be extracted by
detecting an electromagnetic counterpart of the gravitational signal, for example, by
identifying the host galaxy of the event. Knowing the redshift of the electromagnetic
counterpart and deriving r from the corresponding gravitational signal, it is possible to
measure the Hubble constant, through the relation $H_0 = c\,z/r$, and also to study the
evolution of the star formation rate. Combining e.m. observations (the
redshift of galaxy NGC 4993) and GW data from the coalescence of two neutron
stars (Abbott et al. 2017a) a value $H_0 = 70^{+12}_{-8}$ km s$^{-1}$ Mpc$^{-1}$ has been determined,
in fair agreement with CMB and Supernovae observations.
As discussed above, modelling the merger phase of collapse is difficult. In the
case of a pair of stars with significantly different mass, we expect the smaller body,
once a given distance from the other star is reached, to trigger mass transfer due to
tidal interactions. Angular momentum is also transferred from one star to the other
and in practice the secondary star tends to turn into a thick axis-symmetrical disk
that orbits around the primary star.
The time scale of the evolution of the system depends on the transport mechanisms
of matter and the dissipation of angular momentum from the disk: it is longer when
the process is entirely dominated by matter viscosity. In the case where the initial
mass ratio is large, the mass transfer rate would approach a limit due to the emission of
gravitational radiation. However, this scenario appears unlikely, due to the enormous
value of matter flow.
We have already observed that the steady state of mass transfer is typical of a dou-
ble system with a large mass ratio. We also need to point out that the case of a binary
system of two neutron stars with a large mass ratio is not likely. The transfer process
leads to the interesting possibility that the smaller star, losing mass, is reduced to
a minimum value below which free expansion is unstable and therefore this mass
could explode before colliding with the more massive and voluminous companion.
Most of the energy released in the final collapse is converted in production of neu-
trinos, which are very difficult to detect due to the distance of the event from Earth.
Thus, one can hypothesize a scenario in which the gravitational radiation is the only
observable signal produced by these phenomena.
Indeed, in the coalescence or collapse of compact objects, GW emission is roughly
simultaneous with the release of neutrinos and high energy photons in the explosion
phase. Measuring the arrival time of GW and of the gamma burst (GRB) events has
allowed us to measure the velocity of gravitational waves with an accuracy limited
only by the intrinsic delays to the collapse mechanism, as discussed in Sect. 7.3.
Figure 7.6 sketches a summary of the potential evolutionary scenarios of a
binary system. Depending on the masses and the state of rotation of the components
of the binary system, the merger can evolve in multiple directions (Bartos et al. 2013).
Each of them is associated with a different type of both gravitational and gamma ray emission.
In the case of systems formed by two neutron stars (NS), the various evolutionary
paths are selected by comparing the total mass of the system $M_{tot} = m_1 + m_2$ with
the threshold value of $3\,M_\odot$, also considering the difference $|m_1 - m_2|$ between
the masses of the individual components. Another discriminating element is the
comparison of $M_{tot}$ with the maximum mass of a non-rotating neutron star,
$M_{NS}^{(static)}$. In the case of a neutron star-black hole system (NS-BH), the evolutionary
scheme differs on the basis of the comparison between the radius of the innermost
stable circular orbit relative to the coalescence phase, $R_{ISCO}$, and the typical distance
$R_{TD}$ at which the neutron star begins to be torn apart by the intense tidal forces (tidal
disruption, TD); this value depends on the equation of state of the star. Note that
mass transfer can begin long before reaching the critical distance $R_{TD}$. The formation
of the accretion disk around the black hole seems essential to create the conditions
for GRB emission. Its formation depends on the masses and on the state of rotation
of the objects. If the neutron star is disrupted by tidal forces before falling into the
black hole, the mass of the accretion disk that is formed depends on the spin of the BH,
on the initial alignment of the spins of the two objects and on the ratio $m_{BH}/m_{NS}$.
Fig. 7.6 Possible evolutionary scenarios of a binary system. Credits: Bartos et al. (2013)
If the detector has sufficient bandwidth, the signal-to-noise ratio for the chirps
can be drastically improved by filtering the data with an appropriate template, as
discussed further on. The assumed shape of the signal plays a fundamental role in
the design of such templates.
• The reference event for a type I supernova is the nuclear detonation of a white
dwarf, occurring after accretion from a companion. The most optimistic models
suggest that a significant fraction of the white dwarf mass is involved in forming
a neutron star. In any case, formation of a black hole is inhibited by the insufficient
mass of the progenitor. Furthermore, the presence of surrounding matter at the
time of collapse can be the cause of the limited optical emission associated with
the event. Thus, to classify such events, the term optically silent supernova was
introduced.
• The type II supernova is associated with the collapse of the central region of a
star; the shock wave determines the ejection of the optically luminous outer
mantle.
Fig. 7.7 The sky before and after the explosion of supernova 1987A. Credits: Anglo-Australian Observatory
with the non-spherical motion. The fundamental quantity that can play an important
role in limiting the sphericity of collapse is the rotation of the star. If its angular
momentum is initially high, the central part involved in the collapse should increase
its velocity. Depending on the initial conditions, the system can evolve by breaking
up into two or more parts that, quickly spiralling, would tend to merge again, with
emission of gravitational radiation. If the critical mass value is exceeded (between
1.4 and 3 solar masses $M_\odot$), the collapse continues until a black hole is formed.
Once its formation is complete, the new object can still emit gravitational radiation
at the frequencies of its own normal oscillation modes, excited by the infall of matter.
The frequency range of interest is very large: from 0.1 to 10 kHz. The low frequencies
will be excited during the initial phase of collapse, while the high frequencies will
show up in the later phase, mainly due to the excitation of the normal modes of the
collapsed star. Different authors have derived the characteristics of the gravitational
signal; we report here a rough estimate of the value of h and of the emission frequency
ν, as given by Thorne (1987):
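$$\nu \approx \frac{c^3}{5\pi G M} \approx 1.3 \cdot 10^4 \,\frac{M_\odot}{M}\ \mathrm{Hz} \tag{7.56}$$
$$h \approx \sqrt{\frac{15\,\epsilon}{2\pi}}\;\frac{G M}{c^2 r} \tag{7.57}$$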
The value of the conversion efficiency $\epsilon$ of the rest mass into gravitational radiation
is uncertain. Theoretical evaluations are in disagreement and have gone from the
initial, optimistic estimates of $\epsilon \sim 10^{-2}$ to the more modern ones (Dimmelmeier et
al. 2001) of $\epsilon \sim 10^{-7}$. This is due to the fact that the collapse models have evolved over
time and these estimates also vary according to the values of the dynamic quantities
characterizing the initial state, such as the angular momentum of the body.
Assuming that the signal is emitted at a frequency of 1 kHz from a body of $13\,M_\odot$
with a conversion efficiency $\epsilon = 10^{-2}$, we would have $h \simeq 10^{-17}$ in the case of a
source at the galactic center (10 kpc). This event would have been detectable even
by the resonant detectors that operated continuously until a few years ago. The
rate of formation of massive black holes is, however, probably very low. It has therefore
been necessary to push the sensitivity of the detectors to a level that can detect events
produced in the Virgo cluster of galaxies (∼2500 galaxies at 10 Mpc from the Earth,
$h \simeq 10^{-20}$): within that population, we expect such an event to occur with reasonable
probability within the lifetime of gravitational wave detectors.
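The numbers quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that Eq. 7.56 reads ν ≈ c³/(5πGM) and that Eq. 7.57 reads h ≈ sqrt(15ε/2π) GM/(c²r); the function names and the rounded constants are illustrative choices, not part of the text.

```python
import math

# Physical constants (SI, rounded)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # one kiloparsec, m


def collapse_frequency(mass_kg):
    """Characteristic emission frequency, nu ~ c^3 / (5 pi G M), in Hz."""
    return C**3 / (5.0 * math.pi * G * mass_kg)


def collapse_strain(mass_kg, efficiency, distance_m):
    """Strain estimate h ~ sqrt(15 eps / 2 pi) * G M / (c^2 r) (assumed form)."""
    geometric_factor = G * mass_kg / (C**2 * distance_m)
    return math.sqrt(15.0 * efficiency / (2.0 * math.pi)) * geometric_factor


mass = 13.0 * M_SUN
nu = collapse_frequency(mass)
h_galactic = collapse_strain(mass, 1e-2, 10.0 * KPC)    # source at 10 kpc
h_virgo = collapse_strain(mass, 1e-2, 10.0e3 * KPC)     # source at 10 Mpc

print(f"nu         ~ {nu:.0f} Hz")
print(f"h (10 kpc) ~ {h_galactic:.1e}")
print(f"h (10 Mpc) ~ {h_virgo:.1e}")
```

Under these assumptions the frequency comes out close to 1 kHz and the strain close to 10^-17 at 10 kpc, dropping by a factor of 1000 at 10 Mpc, in line with the orders of magnitude quoted in the text.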
According to the models based on perturbative calculations, for slow rotations
the efficiency $\epsilon$ depends on the fourth power of the angular momentum. However,
as the speed of rotation increases, conditions arise that could lead to disruption. In this
case the efficiency of emission by the non-axisymmetric residual parts could be
much higher. Saenz and Shapiro (1979, 1981) have developed a model of collapse
starting from an initial geometry of a homogeneous spheroid which rebounds on itself
until it leads to the formation of a neutron star. An effort has been made to derive
$$\Omega_{GW}(\nu) = \frac{1}{\rho_c}\frac{d\rho_{GW}}{d\log(\nu)} = \frac{\nu}{\rho_c}\frac{d\rho_{GW}}{d\nu} \tag{7.58}$$
which is the ratio of the gravitational-wave energy density per unit logarithmic
frequency to the critical energy density, i.e. the value of density needed for a closed
universe:
$$\rho_c = \frac{3 c^2 H_o^2}{8\pi G}$$
where $H_o$ is the present value of the Hubble–Lemaître constant.
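To give a feeling for the scale involved, the critical energy density can be evaluated numerically. This is a minimal sketch: the value of 70 km/s/Mpc for the Hubble–Lemaître constant, the sample value of Ω_GW and the function names are assumptions made for illustration only.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
MPC = 3.086e22    # one megaparsec, m


def critical_energy_density(hubble_km_s_mpc):
    """rho_c = 3 c^2 H_o^2 / (8 pi G), in J/m^3."""
    hubble_si = hubble_km_s_mpc * 1.0e3 / MPC   # convert to s^-1
    return 3.0 * C**2 * hubble_si**2 / (8.0 * math.pi * G)


def energy_density_per_log_frequency(omega_gw, hubble_km_s_mpc):
    """Invert Eq. 7.58: d rho_GW / d log(nu) = Omega_GW * rho_c."""
    return omega_gw * critical_energy_density(hubble_km_s_mpc)


rho_c = critical_energy_density(70.0)                      # assumed H_o
rho_gw = energy_density_per_log_frequency(1.0e-9, 70.0)    # hypothetical Omega_GW

print(f"rho_c = {rho_c:.2e} J/m^3")
print(f"d rho_GW / d log(nu) = {rho_gw:.2e} J/m^3")
```

With the assumed H_o the critical energy density is slightly below 10^-9 J/m³, so even a background with Ω_GW of order 10^-9 corresponds to an extremely small energy density per logarithmic frequency interval.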
The sources of the stochastic background can be broadly divided into two cate-
gories:
– The physical processes that occurred at the earliest moments of the universe cer-
tainly created a stochastic background that survives, to some extent, until today.
This is analogous to the cosmic microwave background, which is an electromag-
netic record of the early universe.
– SGWB is also created by the superposition of a large number of independent
sources, such as binary black hole and binary neutron star mergers during the history
of the universe. In other words, the incoherent sum of numerous unresolved
gravitational wave signals should result in a stochastic background of gravitational
waves.
where $\alpha$ is the spectral index, and $\nu_{ref}$ and $\Omega_\alpha$ are two theory-dependent parameters.
Some papers, inspired by superstring theory, suggest that cosmic strings may
have been produced in the early universe and then expanded to cosmic sizes (see, for
example, Polchinski (2004) for an overview).
Fig. 7.8 Constraints on the SGWB and the projected design sensitivity of the LIGO-Virgo
interferometer network, assuming two years of coincident data. We also report the constraints
resulting from other kinds of measurements: CMB measurements at low multipole moments, indirect
limits from the Cosmic Microwave Background (CMB) and Big-Bang Nucleosynthesis, pulsar timing, and
the ringing of Earth’s normal modes. Note the $\nu^{-2}$ slope at frequencies below $10^{-16}$ Hz, predicted
by some inflationary models. Credits: Christensen (2017)
In Fig. 7.8 we show the limit set on SGWB using the LIGO-Virgo data collected
during the first observation run of the advanced detectors, as well as constraints from
previous analyses, theoretical predictions, the expected limits at design sensitivity for
Advanced LIGO and Advanced Virgo, and the expected sensitivity of the proposed
Laser Interferometer Space Antenna (LISA).
Since the resulting cosmic strings only interact gravitationally, their effects are
scaled by the product of Newton’s constant and the string tension, Gμ. Because the
energy stored in the strings is eventually converted into gravitational waves, strings
would produce a large stochastic gravitational wave background. There are several
possibilities for the geometry of the compact dimensions of string theory, including
localised branes, and these warped extra dimensions allow the string tension to be
much lower than the Planck scale, anything above the TeV scale.
An ensemble of such objects could produce a background accessible to the inter-
ferometers on the Earth and/or in space.
LIGO-Virgo network). This rate, added to the background rate from binary black
holes (a much more reliable estimate, thanks to the numerous GW observations),
increases the amplitude of the total astrophysical background.
The AGWB nature is expected to be significantly different from its cosmological
counterpart, which is expected to be, at least for inflation, stationary, unpolarised,
statistically almost Gaussian, and isotropic, by analogy with the cosmic microwave
background.
Recently, particular attention has been devoted to predicting SGWB polarisation
(Cusin et al. 2019a) and AGWB anisotropies (Cusin et al. 2019b). Since the
astrophysical sources are located in cosmic structures and are not uniformly
distributed, the GW energy flux from them (whether resolved or unresolved) should
not be constant across the sky and should depend on the direction of observation.
The fact that the sources are not uniformly distributed is just one of the causes of the
anisotropy; the second one is related to the deflection effect due to massive structures
present along the GW propagation path. For a background of cosmological origin,
lensing by large scale structures is the main source of anisotropy and, actually, GW
and CMB anisotropies are correlated.
Note that, when a flux of unpolarised radiation coming from a given direction
impinges on a massive object, the outgoing radiation can be polarised, due to the
dependence of the absorption cross section on the polarisation. Thus, to characterise
the polarisation change induced by the diffusion by massive structures, in analogy
with the electromagnetic case, we will refer to the change in the GW Stokes param-
eters, which are defined as
$$I = |h_+|^2 + |h_\times|^2 \qquad Q = |h_+|^2 - |h_\times|^2 \tag{7.60}$$
$$U = 2\,\mathrm{Re}(h_+^* h_\times) \qquad V = 2\,\mathrm{Im}(h_+^* h_\times) \tag{7.61}$$
where $I$ is proportional to the intensity of the GW and $Q$ to the difference between the
intensities of the two polarised contributions.
It can be shown that $U$ provides the same information as $Q$ in a frame rotated by
π/8, while the parameter $V$ describes a phase difference between $h_+$ and $h_\times$, which
results in circular polarisation. In general these parameters depend on the propagation
direction of the outgoing GW radiation, so that they are the basic quantities used to
model the angular power spectrum of the AGWB, which will hopefully be a real observable
for future GW detectors. For an astrophysical background in the PTA and LISA
frequency bands, the above cited authors predict that the amount of polarisation
generated is suppressed by a factor $10^{-4}$ to $10^{-5}$ with respect to anisotropies. For
a cosmological background, predictions are even more discouraging: an additional
suppression factor of two orders of magnitude is indeed expected.
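The Stokes parameters of Eqs. 7.60 and 7.61 are easy to evaluate for complex amplitudes, and the statement that U carries the same information as Q in a frame rotated by π/8 can be checked numerically. The sketch below assumes one particular sign convention for the rotation of the polarisation basis (h₊ → h₊ cos 2ψ + h× sin 2ψ, h× → −h₊ sin 2ψ + h× cos 2ψ); the function names and sample amplitudes are hypothetical.

```python
import math


def stokes(h_plus, h_cross):
    """Return (I, Q, U, V) as defined in Eqs. 7.60-7.61."""
    intensity = abs(h_plus) ** 2 + abs(h_cross) ** 2
    q = abs(h_plus) ** 2 - abs(h_cross) ** 2
    product = complex(h_plus).conjugate() * h_cross
    return intensity, q, 2.0 * product.real, 2.0 * product.imag


def rotate_basis(h_plus, h_cross, psi):
    """Rotate the polarisation basis by psi (spin-2: amplitudes mix with 2*psi)."""
    c, s = math.cos(2.0 * psi), math.sin(2.0 * psi)
    return h_plus * c + h_cross * s, -h_plus * s + h_cross * c


# Circular polarisation: h_cross is h_plus shifted by a quarter period.
i_c, q_c, u_c, v_c = stokes(1.0 + 0.0j, 1.0j)

# Arbitrary (hypothetical) amplitudes, before and after a pi/8 rotation.
hp, hx = 0.7 + 0.2j, -0.3 + 0.5j
i_0, q_0, u_0, v_0 = stokes(hp, hx)
i_r, q_r, u_r, v_r = stokes(*rotate_basis(hp, hx, math.pi / 8.0))

print("circular:", i_c, q_c, u_c, v_c)
print("Q after rotation:", q_r, " U before rotation:", u_0)
```

With this convention Q in the rotated frame equals U in the original one, while I and V are unchanged by the rotation; the opposite sign convention for ψ would simply flip the sign of the exchanged quantities.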
As mentioned above, the primary target of the Earth-based interferometric detec-
tors will be the background from unresolved binary mergers of stellar-mass black
holes and/or neutron stars throughout the universe. In an optimistic scenario, this
detection can take place with the present, second generation of detectors (Advanced
LIGO and Advanced Virgo). The detection of this background will provide information
about stellar-mass binary-black-hole populations at much larger distances
than those accessible for the resolved mergers. It will also complement searches for
gravitational-wave backgrounds at much lower frequencies, such as the probable
detections by PTA before the end of the decade, and possibly by CMB searches.
Finally, we mention that a huge number of ultra-compact binaries in our galaxy
(millions of sources) will form an unresolved foreground signal in LISA (2013). It
will appear as noise, but being modulated in amplitude during the yearly rotation,
it will be detectable. The overall strength will tell us about the distribution of the
sources in the Galaxy, as the different Galactic components (thin disc, thick disc,
halo) contribute differently to the modulation. Their relative amplitudes will be used
to set upper limits on the halo population, which is still largely unknown.
References
Abbott, B., et al.: Observation of gravitational waves from a binary black hole merger. Phys. Rev.
Lett. 116, 061102 (2016)
Abbott, B., et al.: GW170817: observation of gravitational waves from a binary neutron star inspiral.
Phys. Rev. Lett. 119, 161101 (2017)
Abbott, B., et al.: Gravitational waves and gamma-rays from a binary neutron star merger:
GW170817 and GRB 170817A. Astrophys. J. Lett. 848, L13 (2017)
Allen, B.: Stochastic gravity-wave background in inflationary-universe models. Phys. Rev. D 37,
2078 (1988)
Astone, P., et al.: Evaluation and preliminary measurement of the interaction of a dynamical gravi-
tational near field with a cryogenic gravitational wave antenna. Z. Phys. C 50, 21 (1991)
Astone, P., et al.: Long term operation of the “EXPLORER” cryogenic gravitational wave detector.
Phys. Rev. D 47, 362–375 (1993)
Astone, P., et al.: Experimental study of the dynamic Newtonian field with a cryogenic gravitational
wave antenna. Eur. Phys. J. C 5, 651 (1998)
Bartos, I., Brady, P., Marka, S.: How gravitational-wave observations can shape the gamma-ray
burst paradigm. Class. Quantum Grav. 30, 123001 (2013)
Bondi, H., Pirani, F.A.E., Robinson, I.: Gravitational Waves in General Relativity. III. Exact Plane
Waves. Proc. R. Soc. Lond. A 251, 519 (1959)
Brans, C., Dicke, R.H.: Mach’s principle and a relativistic theory of gravitation. Phys. Rev. 124,
925 (1961)
Christensen, N.L.: Searching for the stochastic gravitational-wave background with Advanced LIGO
and Advanced Virgo. In: Auge, E., Dumarchez, J. (eds.) 52nd Rencontres de Moriond on
Gravitation. Jean Tran Thanh Van (2017)
Cusin, G., Durrer, R., Ferreira, P.G.: Polarization of a stochastic gravitational wave background
through diffusion by massive structures. Phys. Rev. D 99, 023534 (2019)
Cusin, G., Dvorkin, I., Pitrou, C., Uzan, J.-P.: Properties of the stochastic astrophysical gravitational
wave background: astrophysical sources dependencies. Phys. Rev. D 100, 063004 (2019)
Dimmelmeier, H., Font, J.A., Müller, E.: Gravitational waves from relativistic rotational core
collapse. Astrophys. J. Lett. 560, L163–L166 (2001)
Dimmelmeier, H., et al.: Gravitational wave burst signal from core collapse of rotating stars. Phys.
Rev. D 78, 064056 (2008)
Eardley, D.M., Lee, D.L., Lightman, A.P.: Gravitational-wave observations as a tool for testing
relativistic gravity. Phys. Rev. D 8, 3308 (1973)
Einstein, A.: Näherungsweise Integration der Feldgleichungen der Gravitation (Approximate
integration of the field equations of gravitation). Königl. Preuss. Akad. der Wissenschaften,
Sitzungsberichte, Erster Halbband, 688 (1916)
Einstein, A.: Gravitationswellen (On Gravitational Waves). Königl. Preuss. Akad. der
Wissenschaften, Sitzungsberichte, Erster Halbband, 154 (1918)
Gravitational Radiation and Supernova: In: Wheeler, J.C., Piran, T., Weinberg, S. (eds.) Supernovae.
World Scientific Publishing, Singapore (1991)
Heaviside O.: Electromagnetic Theory, Appendix 1 and 2. London (1893)
Hulse, R.A., Taylor, J.H.: Discovery of a pulsar in a binary system. Astrophys. J. Lett. 195, 251
(1975)
Maggiore, M.: Gravitational Waves, vol.1: Theory and Experiments. Oxford University Press,
Oxford (2018)
Metzger, B.D., et al.: Electromagnetic counterparts of compact object mergers powered by the
radioactive decay of r-process nuclei. MNRAS 406, 2650 (2010)
Muller, E.: The collapse of rotating stellar cores: the amount of gravitational radiation predicted by
various numerical models. In: Bancell, D., Signore, M. (eds.) Problems of Collapse and Numerical
Relativity. Springer, Dordrecht (1984)
Penrose, R.: A spinor approach to general relativity. Ann. of Phys. (New York) 10, 171 (1960)
Penrose, R., Rindler, W.: Spinors and Space-Time. Cambridge University Press, Cambridge (1984)
Piran, T., Stark, R.F.: Numerical relativity, rotating gravitational collapse and gravitational radiation.
In: Centrella, J.M. (ed.) Dynamical Spacetime and Numerical Relativity. Cambridge University
Press, Cambridge (1986)
Pirani, F.A.E.: On the physical significance of the Riemann tensor. Acta Physica Polonica 15, 389
(1956)
Poincaré, E.H.: Sur la Dynamique de l’Electron. Rendiconti del Circolo matematico di Palermo 21,
129–176 (1905)
Polchinski, J.: Cosmic superstrings revisited. AIP Conf. Proc. 743, 331 (2004)
Saenz, R.A., Shapiro, S.L.: Gravitational and neutrino radiation from stellar collapse: improved
ellipsoidal model calculations. Astrophys. J. 229, 1107 (1979)
Saenz, R.A., Shapiro, S.L.: Gravitational radiation from stellar collapse: III damped ellipsoidal
models. Astrophys. J. 244, 1033 (1981)
Schutz, B.F.: Determining the Hubble constant from gravitational wave observations. Nature 323,
310 (1986)
Takiwaki, T., Kotake, K., Suwa, Y.: Three-dimensional simulations of rapidly rotating
core-collapse supernovae: finding a neutrino-powered explosion aided by non-axisymmetric flows.
MNRAS Lett. 461, L112–L116 (2016)
The LISA Consortium: The Gravitational Universe. arXiv:1305.5720 (2013)
Thorne, K.: Gravitational radiation. In: Hawking, S.W., Israel, W. (eds.) 300 Years of Gravitation.
Cambridge University Press, Cambridge (1987)
Tsubono, K.: Detection of continuous waves. In: Blair, D.G. (ed.) The Detection of Gravitational
Waves. Cambridge University Press, Cambridge (1991)
Wagoner, R.V.: Gravitational radiation from accreting neutron stars. Astrophys. J. 278, 345 (1984)
Weber, J.: Detection and generation of gravitational waves. Phys. Rev. 117, 306 (1960)
Weinberg, S.: Photons and gravitons in perturbation theory: derivation of Maxwell’s and Einstein’s
equations. Phys. Rev. 138, B988 (1965)
Weinberg, S.: Gravitation and Cosmology: Principles and Applications of the General Theory of
Relativity. Wiley, New York (1972)
Further Reading
Sathyaprakash, B.S., Schutz, B.F.: Physics, Astrophysics and Cosmology with Gravitational Waves.
Living Rev. Relat. 12, 2 (2009)
8 The Resonant Detector
8.1 Introduction
For several decades, roughly between 1960 and 2005, the experimental search for
gravitational waves was mainly carried out with resonant detectors or “Weber bars”.
These detectors were eventually phased out when long-baseline interferometers came
to maturity and outperformed them both in sensitivity and bandwidth. Nevertheless,
besides their historical relevance, there is still real interest in studying these
sophisticated and ingenious apparatuses, on which many senior GW scientists were
trained, because many technologies still in use today were conceived
and developed for these detectors. Moreover, resonant detector methods and strategies
have recently been implemented (Goryachev et al. 2021) in searches for GW in
the MHz frequency region. We refer to Aguiar (2010) for a readable and thorough
historical review of the many experimental projects that have involved resonant GW
detectors, with an extensive bibliography.
As mentioned in Sect. 7.1, the idea of an elastic detector of GW was suggested
by F. Pirani in 1956. Joseph Weber (and J. A. Wheeler) immediately considered
the elastic detector and worked out a detailed analysis by 1960. Weber then went
on to build several of these detectors and was followed by many groups around the
world: we count three generations of Weber bars (room temperature, cryogenic and
ultra-cryogenic), which we briefly describe here.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_8
Fig. 8.1 Schematic of Weber’s oscillator composed of two point masses connected by a spring
A useful, although misleading, shortcut in dealing with the effect of GW on free
(or bound) masses is the following: GW cause, as we saw in Chap. 7, relative motion
of two test masses at distance $\ell$: $x(t) = \frac{1}{2}\,\ell\, h(t)$. In classical mechanics, force is what
causes motion of masses, according to $F = m\ddot{x}$. If we take the second time derivative
of the first relation, and compare it with the one-dimensional form of Newton’s second law, we
can define a GW pseudo-force
$$F_{GW} = \frac{1}{2}\, m\, \ell\, \ddot{h}(t) \tag{8.1}$$
We shall use this relation to derive the antenna response to GW. It is however clear
from it that the acceleration produced by GW grows with the linear dimension of
the detector, but does not depend on its mass m, in accordance with the Equivalence
Principle. Large mass detectors were developed not to increase the signal, but, as we
shall show, to reduce the thermal noise.
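The scaling just described can be made concrete with a few lines of code. This is a minimal sketch for a sinusoidal wave h(t) = h_o sin(ωt), for which the pseudo-force per unit mass has amplitude ½ ℓ ω² h_o; the strain, frequency and baselines used below are hypothetical sample values.

```python
import math


def tidal_displacement(baseline_m, strain):
    """Relative displacement x = (1/2) * l * h of two free test masses."""
    return 0.5 * baseline_m * strain


def tidal_acceleration_amplitude(baseline_m, strain_amplitude, frequency_hz):
    """Amplitude of F_GW / m = (1/2) * l * omega^2 * h_o for a sinusoidal wave."""
    omega = 2.0 * math.pi * frequency_hz
    return 0.5 * baseline_m * omega**2 * strain_amplitude


h_o = 1.0e-21        # hypothetical strain amplitude
f_gw = 1.0e3         # hypothetical wave frequency, Hz

a_bar = tidal_acceleration_amplitude(1.2, h_o, f_gw)      # metre-scale baseline
a_arm = tidal_acceleration_amplitude(3.0e3, h_o, f_gw)    # kilometre-scale baseline

print(f"acceleration, 1.2 m baseline: {a_bar:.2e} m/s^2")
print(f"acceleration, 3 km baseline:  {a_arm:.2e} m/s^2")
print(f"displacement, 3 km baseline:  {tidal_displacement(3.0e3, h_o):.2e} m")
```

The mass never enters the acceleration, and the ratio of the two results is just the ratio of the baselines, which is why long-baseline detectors gain in signal while heavy bars gain only in noise.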
A gravitational wave is a tidal effect. As such, any detector needs to extend over a
finite region of space to feel such effect. In the case of e.m. beam detectors, test masses
are set far from each other (km, for Earth-based interferometers, Gm for spaceborne
detectors) in an almost free falling condition and the space-time is probed by light
bouncing between them.
J. Weber in 1960 modified the equation of the geodesic deviation to include a term
of elastic interaction between two freely gravitating masses¹ and derived a solution.
Consider a simple harmonic oscillator, as shown in Fig. 8.1: two point masses
under the driving pseudo-force of a GW $h(t)$. For simplicity, we take the wave as
impinging normal to the line (along an x-axis) joining the two test masses. The
equation of motion for either test mass of such a system is
$$\ddot{x} + \frac{2}{\tau}\,\dot{x} + \omega_o^2\, x = \frac{1}{2}\,\ell\,\ddot{h}(t) \tag{8.2}$$
1 The spring schematizes the restoring effect to the state of stable equilibrium of the system; this
effect is essentially due to the internal electromagnetic forces among the microscopic components of
the elastic body. It could therefore be argued that the changing metric of space-time also influences
these binding forces. This effect would be relevant only if the gravitational energy density (including
the rest energy) were comparable with the electromagnetic energy density of the system. We deduce
that in a first approximation this effect is negligible.
Fig. 8.2 Extension of the oscillator to a macroscopic, continuous resonator. The gravitational wave
acts on the whole cylinder: it pushes and pulls each section of the bar. On each pair of sections,
symmetric with respect to the centre, the effect is computed in the same way as in the case of Fig. 8.1
where we have introduced the dissipation factor 1/τ and the restoring force, charac-
terized by the angular resonance frequency ωo .
We solve this differential equation in the frequency domain, introducing the
Fourier transforms $H(\omega)$ for the forcing term and $X(\omega)$ for the variable $x$ (Fig. 8.2):
$$X(\omega) = -\frac{\ell}{2}\,\frac{\omega^2 H(\omega)}{\omega_o^2 - \omega^2 + i\,\frac{2\omega}{\tau}} \tag{8.3}$$
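The frequency response of Eq. 8.3 can be explored numerically. The sketch below takes the transfer function X(ω)/H(ω) that follows from Fourier-transforming Eq. 8.2 and checks that, on resonance, its modulus equals ℓQ/2 with Q = ω_o τ/2; the resonance frequency, quality factor and equivalent length are hypothetical sample values.

```python
import math


def bar_response(omega, omega_o, tau, ell):
    """X(omega)/H(omega) for x'' + (2/tau) x' + omega_o^2 x = (ell/2) h''."""
    return -0.5 * ell * omega**2 / (omega_o**2 - omega**2 + 2j * omega / tau)


omega_o = 2.0 * math.pi * 900.0    # hypothetical resonance, rad/s
quality = 1.0e7                    # hypothetical quality factor Q
tau = 2.0 * quality / omega_o      # decay time, from Q = omega_o * tau / 2
ell = 1.2                          # hypothetical equivalent length, m

gain_on_resonance = abs(bar_response(omega_o, omega_o, tau, ell))
gain_off_resonance = abs(bar_response(1.01 * omega_o, omega_o, tau, ell))

print(f"|X/H| on resonance:    {gain_on_resonance:.3e} m")
print(f"|X/H| 1% off resonance: {gain_off_resonance:.3e} m")
```

Even a 1% detuning reduces the response by several orders of magnitude for such a high Q, which illustrates how narrow the mechanical resonance is.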
To move beyond the conceptual experiment, and design a truly feasible device, we
need to replace the idealized harmonic oscillator with an elastic, extended body.
Weber’s choice was a metal cylinder, a strategy replicated in most successive experi-
ments. A detailed treatment of the detector implies modifying the elasticity equation
for a solid, dissipative body by introducing a tidal term due to the impinging
gravitational wave. Here we limit ourselves to reporting the results for a thin rod of
mass $M$ and length $L$, of a material with sound velocity $v_s$, vibrating along its axis
(the x-axis, see Fig. 8.2) in its first longitudinal vibration mode.
In the case of a plane GW propagating perpendicular to the cylinder axis, the
extremes of the rod behave as a simple harmonic oscillator (Weber 1961) with
equivalent rest length $\ell$ and equivalent mass $M_{eq}$² given by
$$\ell = \frac{4}{\pi^2}\,L \qquad M_{eq} = \frac{M}{2} \tag{8.4}$$
2 The equivalent mass of an oscillating body depends on the vibrational mode considered (not
all elements of the extended solid oscillate with the same amplitude) and on the point where the
vibration is measured: the equivalence is based on the kinetic energy: $\frac{1}{2} M_{eq}\,\dot{\ell}^2 = \frac{1}{2}\int_{body} \rho\,\dot{x}^2\, dV$.
This equivalence, which simplifies the analysis of the detector, is valid near the
resonance frequency of the first longitudinal mode of the bar, $\omega_o = \pi v_s/L$. Suppose
the antenna is excited by a pulse of the form
$$x(t) = -\frac{L}{\pi^2}\, h_o\, \tau_g\, \omega_o\, e^{-t/\tau} \sin\omega_o t \tag{8.5}$$
In the case of excitation due to a monochromatic wave at the resonant frequency
of the bar, we have
$$x(t) = -\frac{2L}{\pi^2}\, Q\, h_o \sin\omega_o t \tag{8.6}$$
where $Q = \omega_o \tau/2$ is the quality factor of the oscillator. This is a first hint of the
importance of using bars of materials with low dissipation, i.e. large $Q$.
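The relations above translate directly into numbers for a realistic bar. The following sketch evaluates the resonance frequency from ω_o = π v_s/L, the equivalent length and mass of Eq. 8.4, and the on-resonance amplitude of Eq. 8.6. The bar length and mass are sample values of the order used in cryogenic detectors; the sound velocity, quality factor and strain amplitude are assumptions chosen only for illustration.

```python
import math


def bar_parameters(length_m, mass_kg, sound_speed_m_s):
    """Return (f_o, ell, M_eq) for the first longitudinal mode of a thin rod."""
    f_o = sound_speed_m_s / (2.0 * length_m)    # from omega_o = pi * v_s / L
    ell = 4.0 * length_m / math.pi**2           # equivalent rest length, Eq. 8.4
    m_eq = mass_kg / 2.0                        # equivalent mass, Eq. 8.4
    return f_o, ell, m_eq


def resonant_amplitude(length_m, quality, strain_amplitude):
    """Amplitude of Eq. 8.6: (2 L / pi^2) * Q * h_o."""
    return 2.0 * length_m * quality * strain_amplitude / math.pi**2


# Sample bar: 3 m, 2300 kg; v_s = 5400 m/s is an assumed value for an Al alloy.
f_o, ell, m_eq = bar_parameters(3.0, 2300.0, 5400.0)
x_max = resonant_amplitude(3.0, 1.0e7, 1.0e-21)   # hypothetical Q and h_o

print(f"f_o  = {f_o:.0f} Hz")
print(f"ell  = {ell:.3f} m, M_eq = {m_eq:.0f} kg")
print(f"x_max = {x_max:.2e} m")
```

With these assumptions the bar resonates near 900 Hz, and even with Q of order 10^7 the end-face displacement for a strain of 10^-21 is only a few femtometres, which sets the scale of the readout problem.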
In the most sensitive experiments, the rod is a massive metal body weighing a few
tons. It hangs, through ingenious suspension systems that isolate it from external
mechanical noise, in a vacuum chamber, to reduce air friction. The detectors of
the second and third generation had metal cylinders, $L \simeq 3$ m and $M \simeq 2300$ kg, cooled
to low (4 K) and ultra-low (0.1 K) temperature to reduce the thermal noise and increase
the bar $Q$: in those cases, the vacuum chamber (“experimental vacuum”) was part of
a complex cryostat with many (six, in NAUTILUS; Astone et al. 1991) nested shells.
The choice of material fell, for most cryogenic detectors, on aluminium alloy Al
5056, characterized by low acoustic dissipation, $Q \sim 10^7$, at liquid helium
temperature. The group at the University of Western Australia chose instead to cool a cylinder
of ∼1000 kg of Niobium, a very expensive material that exhibits, at low temperature,
$Q \sim 2 \cdot 10^8$. The cylindrical shape of the antenna was the most common choice for
these detectors.
However, as early as 1976, Wagoner and Paik (1977) had introduced the idea
of a spherical antenna: by exploiting its five quadrupolar oscillation modes, this type
of detector has the advantage of omni-directionality, while cylinders have a peaked
$\sin^4(\theta)$ antenna pattern.³ Spherical antennas, made of an Al-Cu alloy, 60 cm in
diameter and cooled to low (4 K) and very low temperature (10–100 mK), were built
in Brazil (Aguiar et al. 2006) and in the Netherlands (Gottardi et al. 2007). Pictures
of these detectors are shown in Fig. 8.3.
Fig. 8.3 Left: a diagram of the Minigrail spherical antenna, developed at the University of Leiden
(NL). The large pumping system serves the specially designed dilution refrigerator, capable of
cooling the sphere to 65 mK. Right: photo of the Brazilian antenna Mario Schenberg. Its design is
similar to Minigrail, without the ultra-low temperature cooling stage. Courtesy of the Schenberg
and Minigrail groups
f (t) = αq(t)
This means that any noise current $i_n = dq_n/dt$ in the input circuit is transformed
into a fluctuating force $f_n$ acting on the antenna: this effect is called back action and
is negligible in all but the most sensitive applications, such as GW detectors.
A different, more general description of a transducer is given in terms of a linear
two-port network, with two mechanical variables, the force f (t) and the velocity
ẋ(t), and two electrical ones, the circulating current i(t) and the voltage v(t) across
the output electrical impedance. These quantities are related, just as in an all electric
two-port, by the following relations:
$$f(t) = Z_{11}\,\dot{x}(t) + Z_{12}\,i(t), \qquad v(t) = Z_{21}\,\dot{x}(t) + Z_{22}\,i(t)$$
4 In an alternative derivation, we could write the Lagrangian of the electric + mechanical system:
the transduction effect would be found in the interaction part of the Lagrangian.
The $Z_{ij}$ terms of the so-called transduction matrix are, in general, linear differential
operators. Clearly, $Z_{22}$ is the output impedance of the electrical circuit, and $Z_{11}$ is
the mechanical impedance of the oscillator. $Z_{21}$ and $Z_{12}$ are the direct and reverse
transduction coefficients, respectively. They are not independent: $Z_{21} = -Z_{12}^*$ and
in a lossless transducer, where the real part is negligible, they are equal. We recover
the previous description with the relation $\alpha = i\omega Z_{21}$, but the impedance matrix is
richer in details. For our example, in the capacitive transducer, we have
$$Z_{12} = Z_{21} = \frac{V_0}{i\omega d}; \qquad Z_{11} = \frac{F(\omega)}{\dot{X}(\omega)} = M_{eq}\,\omega_o; \qquad Z_{22} = \frac{1}{i\omega C_0}$$
Fig. 8.4 Left: a capacitive transducer, with two large surfaces (150 mm diameter) facing a plate
resonating at $\omega_o/2\pi = 930$ Hz. Center: an inductance-modulation transducer, with the flat “pancake”
superconducting coils. Right: a parametric transducer composed of two highly reentrant Niobium
cavities, resonating at 480 MHz, facing the two sides of a vibrating plate
Fig. 8.5 Block diagram of the readout for a resonant antenna equipped with a resonant capacitive
transducer, biased to a d.c. voltage $V_p$, and a SQUID amplifier. The total impedance $Z_{el}$ is the
proper composition of the transducer capacitance ($Z_{22} = 1/i\omega C_0$), the decoupling capacitor $C_d$,
the matching transformer $M_t$, and the SQUID input transformer $M_s$ ($Z_{in} = i\omega M_s$)
as an ideal one (noiseless, with open input circuit), plus two sources of noise: current
and voltage noise. Due to fundamental considerations, both sources are invariably
present in any amplifier. Besides, the input impedance of a real voltage amplifier is
not infinite but has a large value $Z_{in}$ in parallel to the input.⁶ The input impedance $Z_{in}$
contributes, with the transducer impedance $Z_{22}$ and any possible matching circuit,
to the total circuit impedance $Z_{el}$. We show in Fig. 8.5, as an example, the complex
readout of Explorer, NAUTILUS, AURIGA, where a superconducting transformer
was required to impedance match the high impedance of the capacitive transducer
to the low input impedance of a SQUID amplifier. So, $Z_{el}$ turned into a resonant LC
circuit.
An amplifier has a voltage noise⁷ source $v_n = \sqrt{S_v(\nu)}$, measured in V/$\sqrt{\mathrm{Hz}}$,
and a current noise source $i_n = \sqrt{S_i(\nu)}$, measured in A/$\sqrt{\mathrm{Hz}}$: one of these is added to the output
and the other generates noise currents that circulate in the input circuit, giving rise
to the back-action effect mentioned above. These noise spectra are assumed white,
i.e. with an amplitude distribution independent of frequency, in the small band of
interest around the mechanical resonance $\omega_o$. We make the assumption, not always
verified but which simplifies the representation, that the two random variables, $v_n$
and $i_n$, are statistically independent.
6 For a current amplifier we would have: the input impedance of a real current amplifier is not null
but has a small value $Z_{in}$ in series with the input.
7 We assume the reader to be familiar with the basic notions of noise and its statistical description in
terms of frequency spectra. A quick recap of these concepts is given in the first sections of Chap. 10.
We shall focus here on a voltage amplifier model, as shown in Fig. 8.6, although a
dual model for current amplifier is also encountered.8 The voltage source represents
the output noise: it entirely drops on the (almost) infinite input impedance and appears
unmodified (but for the gain) at the output. The current source gives rise to input
noise, i.e. noise currents circulating in the input circuit.
As an equivalent description, the two following parameters are often used:
i n vn vn
Tn = ; λ= (8.9)
kB i n |Z el |
where k B is the Boltzmann constant and Tn is called the amplifier noise temperature:
k B Tn is the minimum signal energy detectable, with SNR = 1. λ is a matching
parameter: it measures how far the actual impedance Z el is from the noise match
impedance vn /i n ; the optimum value is λ ∼ 1.
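As a minimal numerical sketch of Eq. 8.9, the two parameters can be evaluated from the noise spectral densities. The function names and the amplifier values below are hypothetical, chosen only to exercise the formulas; they do not describe any real device.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def noise_temperature(s_v, s_i):
    """Eq. 8.9: T_n = v_n * i_n / k_B, with v_n = sqrt(S_v), i_n = sqrt(S_i)."""
    return math.sqrt(s_v * s_i) / K_B


def matching_parameter(s_v, s_i, z_el_abs):
    """Eq. 8.9: lambda = v_n / (i_n * |Z_el|)."""
    return math.sqrt(s_v / s_i) / z_el_abs


# Hypothetical amplifier spectral densities (illustrative assumptions).
s_v = 1.0e-18  # V^2/Hz
s_i = 1.0e-26  # A^2/Hz
z_noise_match = math.sqrt(s_v / s_i)  # noise-match impedance v_n / i_n, ohm

print("T_n [K]:", noise_temperature(s_v, s_i))
print("lambda at noise match:", matching_parameter(s_v, s_i, z_noise_match))
```

When the circuit impedance equals the noise-match impedance the matching parameter is 1, which is the optimum quoted in the text.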
8 The voltage amplifier model is suitable even when dealing with SQUIDs that are typically described
as current amplifier.
• Cooper pairs: In a superconductor (SC) the carriers of electric current are Cooper pairs of electrons, with spin 0 and charge q = 2e.
• Flux quantization: In a SC ring, magnetic flux is quantized: $\Phi(B) = n\Phi_o$, where $\Phi_o = h/2e = 2.07 \cdot 10^{-15}$ Wb is the flux quantum.
• Order parameter: The collective motion of SC pairs can be described, following the Ginzburg-Landau approach, with a macroscopic order parameter (a sort of collective wave function) $\psi = |\psi| \exp(i s(\vec r))$ such that $|\psi|^2 = n_s$ represents the carrier density. The assumption is made that only the phase depends on position. The current density is then $\vec J_s = \frac{2e\hbar}{m} |\psi|^2 \nabla s(\vec r)$.
• Minimum coupling: In the presence of an e.m. field, the prescription of quantum mechanics is to apply the minimal substitution $\vec p \to \vec p - q\vec A$ or $\nabla \to \nabla - \frac{2ie}{\hbar}\vec A$, where $\vec A$ is the e.m. vector potential.
• Josephson effect: Consider a SC interrupted by a thin layer of normal conductor or insulator: a Josephson junction (JJ). ψ can tunnel through the barrier and maintain continuity, but there is a voltage drop across the junction (typically some $10^{-4}$ V), and the phase oscillates. The phenomenon is described by two equations:
$$I = I_c \sin s \qquad (8.10)$$
$$\frac{ds}{dt} = \frac{2eV}{\hbar} \qquad (8.11)$$
where $I_c$ is the critical current, i.e. the maximum current that the JJ can carry before superconductivity is disrupted. Equation 8.11 shows that the phase across the JJ oscillates at a very high frequency:
$$\omega_J = \frac{2eV}{\hbar} = 2\pi \cdot 483.6\ \mathrm{MHz}/\mu\mathrm{V} \cdot V$$
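The two numbers that set the scale of SQUID physics, the flux quantum h/2e and the Josephson frequency per unit voltage 2e/h, follow directly from the exact SI values of h and e. The short sketch below evaluates them; the function names are ours, introduced only for illustration.

```python
# Exact SI values of the Planck constant and of the elementary charge.
H_PLANCK = 6.62607015e-34    # J s
E_CHARGE = 1.602176634e-19   # C


def flux_quantum():
    """Flux quantum Phi_o = h / 2e, in Wb."""
    return H_PLANCK / (2.0 * E_CHARGE)


def josephson_frequency_hz(voltage):
    """Oscillation frequency omega_J / 2 pi = 2 e V / h of a junction biased at V volts."""
    return 2.0 * E_CHARGE * voltage / H_PLANCK


print("Phi_o [Wb]:", flux_quantum())
print("f_J at 1 microvolt [Hz]:", josephson_frequency_hz(1.0e-6))
```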
Consider now a ring with two JJs as shown in Fig. 8.7: we require the phase s to be single-valued around the loop: $\Delta s = \oint ds = s_1 - s_2 = 2n\pi$. This phase relation implies a phase-locking between the two JJs, accomplished thanks to a current circulating in the ring, $I_{circ} = I_2 - I_1$, that is superimposed on the bias current $I_T = I_2 + I_1$. If we now apply a magnetic field to the ring we have
$$\Delta s = s_1 - s_2 - \frac{2e}{\hbar}\oint \vec A \cdot d\vec\ell = 2n\pi \qquad (8.12)$$
The added integral is easily identified with the magnetic flux $\Phi(B)$ in the ring. So, a changing applied flux will modify the phase relations and the voltage across the ring. Equation 8.10 shows that this effect is periodic, with flux periodicity $\Phi_o$.
A tight analogy exists between this device and a Young two-slit interferometer: the two rapidly oscillating light fields in the slits are replaced by the oscillating phases across the two JJs; the superposition of the two fields gives rise to a d.c. periodic response: proportional to $\sin(2\pi \delta x/\lambda)$ in one case, to $\sin(2\pi \Phi_{ext}/\Phi_o)$ in the other.
Besides this analogy, we suggest the following physical interpretation: suppose we linearly increase an external magnetic flux $\Phi_{ext}$: a shield current is generated in the SC ring to keep the magnetic flux $\Phi_{ext} + \Phi_{circ} = n\Phi_o$ constant inside it; when the shield flux exceeds the value $\Phi_o/2$, the JJs go normal and let one flux quantum in, and we restart with a smaller current, a more energetically favourable condition.
Fig. 8.7 On the left, the conceptual scheme of a DC SQUID: the dimensions of the Josephson junctions are exaggerated. On the right, V versus $\Phi_{ext}$, showing the periodic response typical of interference
Although the dynamics of a d.c. SQUID is quite complex (we should also account for capacitive and resistive shunts across the JJs, the inductance of the ring, mutual inductance to the outside world, noise sources...) and must be solved numerically, the response to small signals can be well modelled with an electric two-port (see Sect. 8.2.3) having as input variables the applied flux $\Phi_{ext}$ and the bias current $I_T$; the output variables are the voltage drop V and the circulating current $I_{circ}$, the latter giving rise to the back action. So, strictly speaking, SQUIDs are transducers, as they convert changes in magnetic flux into voltage signals. SQUIDs can be used as very sensitive current amplifiers, simply by coupling the input current to the ring via a mutual inductance, as schematically shown in Fig. 8.5.
The SQUID sensitivity, expressed in units of the quantum of magnetic flux $\Phi_o = \pi\hbar/e = 2.068 \cdot 10^{-15}$ Wb, can be as low as $10^{-7}\,\Phi_o/\sqrt{\mathrm{Hz}}$. Inserting the SQUID in a feedback loop, with the error signal provided by an additional inductive coupling, substantially extends the useful linear response regime.
A great benefit of d.c. SQUIDs is their extremely low input noise: it has been mea-
sured only in a special, suitably degraded device. Therefore, back action is negligible
for these devices. For a detailed presentation of SQUIDs and their applications, we
refer to Clarke (2010).
We shall distinguish between intrinsic and external noise sources. In the second
class we list the disturbances due to environmental noise, which is mainly due to
seismic and acoustic vibrations, as well as to the boil-off of refrigerating liquids
(Astone 1992): they propagate to the mechanical oscillator and excite its vibrations.
To mitigate these disturbances, the antenna is suspended in a vacuum by a system
of mechanical filters that provide attenuation of the order of 250 dB at the resonant
frequency.
The intrinsic noise sources, that represent the fundamental limitation to the sen-
sitivity of the detectors, are of thermal and electronic origin.
Thermal or Nyquist noise is due to the Brownian motion of the atoms of the bar and is the mechanical equivalent of Johnson noise in a resistor in thermal equilibrium at temperature T; it consists of a random force with white monolateral spectrum
$$S_f = 4 k_B T \frac{M_{eq}\,\omega_o}{Q} \qquad (8.13)$$
where $Q = \omega_o\tau/2$ is the quality factor. It causes a displacement of the bar ends that is filtered by the resonant transfer function of the antenna
$$S_x = \frac{S_f}{M_{eq}^2}\,\frac{1}{(\omega_o^2-\omega^2)^2 + 4\left(\frac{\omega}{\tau}\right)^2} \qquad (8.14)$$
a Lorentzian shape, with a very narrow width if the decay time τ is large. In the time-domain language, this is equivalent to saying that the amplitude of vibration of the oscillator changes very slowly with time, i.e. is highly correlated. Equation 8.14 shows why a large oscillator mass $M_{eq}$ is important.
The displacement autocorrelation function is deduced from the inverse Fourier transform of Eq. 8.14:
$$R_{xx}(t) = \frac{S_f\,\tau}{4 M_{eq}^2\,\omega_o^2}\, e^{-t/\tau}\cos(\omega_o t) \qquad (8.15)$$
$$\sigma_x^2 = \frac{k_B T}{M_{eq}\,\omega_o^2} \qquad (8.16)$$
that is just a fancy way to express a basic concept: the mean kinetic energy of the vibration mode is provided by the equipartition theorem: $\frac{1}{2} M_{eq}\langle v^2\rangle = \frac{1}{2} k_B T$.
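To get a feeling for the size of the Brownian motion described by Eq. 8.16, the rms displacement can be evaluated numerically. This is only a sketch: the equivalent mass (taken here as half of a 2300 kg bar) and the 900 Hz resonance are illustrative assumptions, and the function name is ours.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rms_thermal_displacement(temperature, m_eq, f0):
    """Eq. 8.16: sigma_x = sqrt(k_B T / (M_eq * omega_o^2)), in metres."""
    omega0 = 2.0 * math.pi * f0
    return math.sqrt(K_B * temperature / (m_eq * omega0 ** 2))


# Assumed values, for illustration only.
M_EQ = 1150.0  # kg (assumption: half of the 2300 kg bar mass)
F0 = 900.0     # Hz (assumption)

for temp in (300.0, 4.2, 0.1):
    print(f"T = {temp:6.1f} K  ->  sigma_x = {rms_thermal_displacement(temp, M_EQ, F0):.2e} m")
```

Since the rms displacement scales as the square root of T, cooling from room temperature to 0.1 K reduces it by a factor of about 55.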
At the output of a linear transducer, we have the voltage $v_{th}$ due to thermal noise, with variance
$$\sigma_{V_{th}}^2 = \alpha^2 \sigma_x^2 \qquad (8.17)$$
However, if a resonant transducer is employed, an additional thermal noise source must be added. This term is relevant, due to the small mass of the second oscillator; however, the gain in transduction efficiency makes its use worthwhile. An optimum mass for the second oscillator exists, balancing these two competing effects.
The reduction of thermal noise is pursued by reducing the antenna temperature. In the last generation of resonant detectors (AURIGA and NAUTILUS) the Al body of M = 2300 kg was cooled to temperatures close to 100 mK, at that time the lowest temperature ever achieved by an object of mass $\sim 10^3$ kg (Astone et al. 1991) (Fig. 8.8).
• Amplifier wideband (output) noise: The wideband noise voltage vn does not
take part in the dynamics of the detector but is added to its output. It has a
white spectrum (at least in the small bandwidth of interest) and its effect is then
proportional to the bandwidth used.
• Amplifier input (resonant) noise (back action): This is the noise current $i_n$, generated by the amplifier or by lossy elements in the readout circuit. This noise source is usually negligibly small, but GW resonant detectors are extremely sensitive devices and the input noise appears in a twofold way. First, the noise current, circulating in the circuit impedance $Z_{el}$, generates an additional noise voltage:
$$S_0 = S_v + |Z_{el}|^2 S_i = S_v (1 + \lambda^{-2}) \qquad (8.18)$$
The second effect is the back action: as shown by Eq. 8.7, the noise current that flows into the transducer produces, by reverse conversion, a noise force acting on the mechanical oscillators: $S_f^{(b.a.)} = |Z_{12}|^2 S_i$. Its spectral characteristics are hardly distinguishable from those of thermal noise, and its variance is much smaller, so its effect is often accounted for simply as a correction to thermal noise, raising the thermodynamic temperature T to a slightly higher value $T_e$. From Eq. 8.13 we have
$$T_e = T + \frac{|Z_{12}|^2\,\tau}{4 M_{eq} k_B}\, S_i \qquad (8.19)$$
Although in the cryogenic antennas, equipped with SQUID, the back-action effect is
negligible, this contribution to the linear readout measurement cannot be eliminated.
A linear transducer is conceived to continuously monitor the position (or the
momentum) of the oscillator, providing a signal at the output of the system. This
mechanism of back action is the classical equivalent of the influence of the mea-
surement on a quantum state (see, for example, Caves 1983; Braginsky et al. 2003).
In fact, the sensitivity achieved with linear measurement schemes has the ultimate
limit related to the quantum nature of the oscillator and it is due to the Heisenberg
principle: this is the Standard Quantum Limit.
This barrier can be circumvented by applying quantum non-demolition (QND)
measurement strategies. In the equivalent classical case, i.e. when the system
is affected by the back action noise, it has been experimentally demonstrated that,
with a back-action evading (BAE) system (Cinquegrana et al. 1993) based on a
specially designed parametric transducer, the limit can be surpassed.
Summarizing, the total noise voltage spectrum in the detector is the sum of white amplifier noise $S_0$, whose integrated contribution is thus proportional to the measurement bandwidth $\Delta\nu = \Delta\omega/2\pi$, and a resonant term, due to thermal and back-action noise, that has the Lorentzian shape of Eq. 8.14:
$$S_n(\omega) = \alpha^2 S_x(T_e, \omega) + S_0 \qquad (8.20)$$
or, integrating over frequency:
$$\sigma_{V_{tot}}^2 = \alpha^2 \sigma_x^2(T_e) + S_0\,\Delta\nu \qquad (8.21)$$
where the explicit dependence on $T_e$ reminds us that the back action must be included in the resonant noise.
An antenna affected only by thermal noise would have, in principle, infinite band-
width: indeed, thermal noise and GW signal act on the detector in the same way, and
appear at the output through the same transfer function (i.e. the antenna resonance).
Therefore their ratio (SNR) is a constant independent of frequency. The presence of
amplifier noise, which has a white spectrum, modifies this framework, as shown in
Fig. 8.9: the response to a GW (and to thermal noise) is visible only near resonance,
where it can peak above the white amplifier noise.
Fig. 8.9 The interplay between resonant thermal noise and wideband amplifier noise determines
the detection bandwidth that can be much wider than the resonant bandwidth
The width of this region depends on the relative strength of these two noise sources and is determined by the two frequencies where the noise spectrum Eq. 8.20 is reduced to half the maximum value
$$\delta\omega_d = \frac{2}{\tau}\sqrt{\frac{1+\Gamma}{\Gamma}} \quad \text{with} \quad \Gamma \equiv \frac{S_0/\tau}{\sigma_{V_{th}}^2} \qquad (8.22)$$
where Γ is the ratio between the wideband noise in the resonance bandwidth 1/τ and the integrated resonant noise of Eq. 8.17.
In cryogenic detectors we had $\Gamma \sim 10^{-6}$ so that the SNR bandwidth was several
Hz wide, much larger than the resonance bandwidth $2/\tau \sim$ mHz.
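The widening of the useful band with respect to the bare resonance can be checked numerically. The sketch below assumes that Eq. 8.22 has the form $\delta\omega_d = (2/\tau)\sqrt{(1+\Gamma)/\Gamma}$ and uses illustrative values of Q and of the resonant frequency; the function name is ours.

```python
import math


def detection_bandwidth(tau, gamma):
    """Assumed form of Eq. 8.22: delta_omega_d = (2 / tau) * sqrt((1 + Gamma) / Gamma), rad/s."""
    return (2.0 / tau) * math.sqrt((1.0 + gamma) / gamma)


# Illustrative parameters (assumptions): Q = 1e6 at 900 Hz, with Q = omega_o * tau / 2.
Q = 1.0e6
OMEGA_O = 2.0 * math.pi * 900.0
TAU = 2.0 * Q / OMEGA_O  # decay time, s

resonance_bw_hz = (2.0 / TAU) / (2.0 * math.pi)
detection_bw_hz = detection_bandwidth(TAU, 1.0e-6) / (2.0 * math.pi)
print(f"resonance bandwidth: {resonance_bw_hz:.2e} Hz")
print(f"detection bandwidth: {detection_bw_hz:.2e} Hz")
```

For small Γ the gain over the resonance bandwidth is simply $1/\sqrt{\Gamma}$, i.e. three orders of magnitude for $\Gamma = 10^{-6}$.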
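As $\Gamma \ll 1$, we shall approximate $1 + \Gamma \sim 1$ in the following expressions. So we are led to distinguish between two very different time (or bandwidth) scales: the narrow resonance bandwidth $2/\tau$, set by the decay time τ, and the much wider detection bandwidth $\delta\omega_d$.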
The use of a resonant transducer does not modify this picture, although it improves
it quantitatively.
Nevertheless, a band of tens of Hertz around the mechanical resonance of the
detector (1 kHz in the case of the cryogenic cylindrical bars) is still insufficient to
extract any information on the spectral shape of a GW signal. Thus, most of the
searches focused on looking for sudden energy innovations, i.e. featureless δ-like
signals.
The mean energy in the antenna is proportional to the $\sigma_{V_{tot}}^2$ of Eq. 8.21. Proper filtering of data can greatly improve the sensitivity of these detectors. The basic idea is to look for a sudden change in the energy of the detector, against a slowly changing (as shown by Eq. 8.15) thermal noise. This is achieved by taking a time derivative of the output or, in first approximation, a finite difference at a time interval $\Delta t$. We have
$$\Delta E = k_B T_e \frac{\Delta t}{\tau} + k_B T_n (1 + \lambda^{-2}) \frac{1}{\omega_0\,\beta\,\Delta t} \qquad (8.23)$$
where β is the coupling coefficient defined in Eq. 8.8. This shows that there exists an optimum sampling time, balancing the contributions of thermal innovation and amplifier noise. Recalling that Nyquist’s sampling theorem requires, for a desired bandwidth $\Delta\nu$, a sampling time no longer than $\Delta t = \frac{1}{2\Delta\nu}$, the optimum sampling time recovers the detection bandwidth $\delta\omega_d$ of Eq. 8.22.
$\Delta E_{min}$, often expressed as $k_B T_{eff}$, is a measure of the sensitivity integrated in frequency over the detector bandwidth.
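The balance between thermal innovation and amplifier noise can be made explicit with a few lines of code. This is a sketch under a stated assumption: we take Eq. 8.23 in the form $\Delta E = k_B T_e\,\Delta t/\tau + k_B T_n (1+\lambda^{-2})/(\omega_0 \beta \Delta t)$, whose minimum is found by setting the derivative with respect to $\Delta t$ to zero. All numerical values are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def energy_innovation(dt, t_e, t_n, lam, tau, omega0, beta):
    """Assumed form of Eq. 8.23: thermal term growing with dt, amplifier term falling as 1/dt."""
    thermal = K_B * t_e * dt / tau
    amplifier = K_B * t_n * (1.0 + lam ** -2) / (omega0 * beta * dt)
    return thermal + amplifier


def optimum_sampling_time(t_e, t_n, lam, tau, omega0, beta):
    """Sampling time that minimises energy_innovation (zero of its derivative in dt)."""
    return math.sqrt(tau * t_n * (1.0 + lam ** -2) / (t_e * omega0 * beta))


# Hypothetical detector parameters, for illustration only.
params = dict(t_e=0.1, t_n=1.0e-6, lam=1.0, tau=350.0,
              omega0=2.0 * math.pi * 900.0, beta=1.0e-3)

dt_opt = optimum_sampling_time(**params)
t_eff = energy_innovation(dt_opt, **params) / K_B
print(f"optimum sampling time: {dt_opt:.3e} s, effective temperature: {t_eff:.3e} K")
```

At the optimum the two contributions are equal, so the minimum is twice either term.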
In a more sophisticated approach, we can further improve the sensitivity by applying the Wiener-Kolmogorov filter, which is described in some detail in Sect. 10.11. In this case, the filtered output responds to a δ-like excitation of the detector as (Bonifazi 1978)
$$s(t) = \frac{V_s}{\sqrt{2}} \exp\left(-\frac{|t|}{t_{wk}}\right) \qquad (8.24)$$
where $V_s$ is the maximum amplitude at the filter output that needs to be calibrated, and
$$t_{wk} = \tau\sqrt{\Gamma} = \frac{1}{\delta\omega_d} \qquad (8.25)$$
The filtered signal s(t) has a double exponential shape as it depends on |t|, and peaks at t = 0: we conclude that the filter response is neither advanced nor delayed with respect to the incoming signal. The Fourier transform of s(t) is a complex function with square modulus and phase:
$$S^2(\omega) \propto \frac{1}{\frac{1}{t_{wk}^2} + \omega^2}; \qquad \phi = \arctan(-\omega t_{wk}) \qquad (8.26)$$
It is straightforward to verify that $\delta\omega_d = t_{wk}^{-1}$ has the meaning of detector bandwidth.
Following this procedure an expression of the minimum observable energy change at the detector output is derived:
$$\Delta E_{min} = 4 k_B T_e \sqrt{\Gamma} \qquad (8.27)$$
Equation 8.27 shows that the sensitivity gain for a δ-like signal achieved by the filter depends on the spectral ratio Γ, which must therefore be as low as possible: electronic noise in the detector bandwidth must be small compared to thermal noise.
Fig. 8.10 An example of achieved spectral sensitivities of the resonant antennas NAUTILUS, AURIGA and EXPLORER. In this case the spectral strain sensitivity is about $10^{-21}/\sqrt{\mathrm{Hz}}$. The sensitivities are compared to what Virgo had achieved in 2005. Credits: Acernese et al. (2008)
More refined methods, operating in the frequency domain, were developed over the years, leading to burst sensitivities of the order of $T_{eff} \simeq 1$ mK: a gain of 4000 over the thermodynamic, Brownian, mean energy.
A particular source of noise is the acoustic effect produced by cosmic rays that affect the mechanical oscillator. This noise is both external and fundamental, i.e. unavoidable, although it could be reduced by locating the detector in an underground laboratory. Indeed, both extensive air showers and very energetic single particles have been shown capable of exciting the antenna through a mechanism well explained by the so-called thermo-acoustic model: loss of energy in the bulk of the massive bar → local heating → thermal expansion → excitation of longitudinal vibrations. The model predicts an amount of vibrational energy E given by (Coccia 1995)
$$E = \frac{4}{9\pi}\,\frac{\gamma^2}{\rho L v_s^2}\left(\frac{dW}{dx}\right)^2 \cdot f(\vec r) \qquad (8.28)$$
where $\frac{dW}{dx}$ is the ionization energy loss, $f(\vec r)$ is a scalar function of the cosmic ray trajectory ($0 \le f \le 1$), ρ is the bar density and γ is the Grüneisen parameter: $\gamma = V (dP/dE)_V$ describes the coupling between changing volume and vibrational excitation in a solid. One of the thermodynamical definitions of the Grüneisen parameter is $\gamma = \alpha_{th} v_s^2 / C_p$, where $\alpha_{th}$ is the thermal expansion coefficient and $C_p$ the heat capacity at constant pressure.
This disturbance became significant when the sensitivity reached the level $h \sim 10^{-19}$. For this reason the most advanced resonant detectors were equipped with
a cosmic ray telescope to veto the events induced by high energy particles hitting
the bar (Astone et al. 2000). Analysis of these events allowed us to discover that
some very energetic particles caused a local transition from the superconducting to
the normal state of aluminium (cooled to 0.14 K), producing extremely large signals
in the detector (Astone et al. 2008).
Resonant antennas have also been used to search for periodic signals, such as those
emitted by pulsars. In the 1970s, the Japanese group of Tokyo University and KEK
studied different geometries in order to maximize the antenna response. Their goal
was to select a vibrational mode with quadrupole symmetry resonating at a frequency
close to 60 Hz, i.e. twice the rotation rate of the pulsar in the Crab nebula (Crab
pulsar). They first developed a square-shaped detector with cuts, obtaining a sort of
four-leaf clover (see Fig. 8.11). A second resonator, dumbbell-shaped and cooled to
4 K, was later built, whose first torsional resonance mode was again tuned to 60 Hz
(Tsubono 1991).
If the expected signal frequency is close to resonance, the detection strategy is relatively simple. We can assume that the dominant noise is due to thermal noise (Brownian and back-action noise), which corresponds to the $\Gamma \ll 1$ condition. This can always be achieved, as shown in Eq. 8.21, by extending the integration time $t_{obs}$, i.e. narrowing the bandwidth around the frequency of interest: that is because the power of electronic noise is proportional to the measurement bandwidth.
We compute the explicit expression of the SNR for a sinusoidal signal with amplitude $h_0$ and frequency $\omega \neq \omega_o$ by expanding the noise spectrum of Eq. 8.20 and comparing it with the signal of Eq. 8.6, but off-resonance:
$$SNR(\omega) = \left(\frac{L h_0}{\pi^2}\right)^2 \frac{\omega^4 M_{eq}\, Q\, t_{obs}}{\omega_o k_B T_e} \cdot \frac{1}{1 + \Gamma\left[Q^2\left(1 - \frac{\omega^2}{\omega_o^2}\right)^2 + \frac{\omega^2}{\omega_o^2}\right]} \qquad (8.29)$$
This same relation, setting $SNR(\omega) = 1$ and $t_{obs} = 1$ s (unitary bandwidth) and solving for $h_0$, yields the spectral sensitivity curve of Sect. 8.4.3.
The detection band, measured as the full width at half maximum (FWHM) of Eq. 8.29, is
$$\Delta\omega_d = \frac{\omega_o}{Q}\,\frac{1}{\sqrt{\Gamma}} \qquad (8.30)$$
in agreement with Eq. 8.22: with the parameter values quoted above ($\omega_o = 2\pi \cdot 900$ Hz; $Q \sim 10^6$; $\Gamma \sim 10^{-4} - 10^{-7}$) this allows a useful band of the order of a few tens of Hertz.
– Modifying the detector geometry by machining it, raising or lowering the resonant
frequency
– Adding small masses at appropriate points of the oscillating body: this can only
lower the mechanical frequency
– By exerting static forces on the antenna, installing electric or magnetic actuators,
or taking advantage of the electric stiffness of the transducers themselves. Mag-
netic transducers have a positive magnetic stiffness (raise the resonant frequency),
while electric devices have it negative (they lower ωo )
Fig. 8.12 The cold damping method for widening the detector band without increasing the thermal
noise through a 4 K cooled resistance
In a long-term search for periodic signals, the detector must be able to track the
frequency of the signal, if this changes. Indeed, the frequency of the gravitational
wave signal can change over time due to various reasons:
• The Doppler effect caused by the orbital and the diurnal motion of the Earth, of
the order of ±0.03 Hz.
• A pulsar generally shows a small slowing of the rotation frequency (spin down)
of the order of ∼0.01 Hz/year.
• The signal can exhibit glitches, i.e. sudden jumps in the rotation frequency
(Chap. 12 has more details on pulsar frequency and its tracking).
• Furthermore, the resonance frequency of the antenna not only changes with tem-
perature, mainly due to thermal dilation, but also due to changes in the speed of
sound. The magnitude of the effect depends on the vibration mode considered and
on the antenna material, as well as on the operating temperature. At cryogenic
temperatures, this effect is reduced, but slow variations are observed, related to
the level of the cryogenic liquids or to changes in the electric or magnetic field
present in the transducer, that change its stiffness.
If the antenna Q is too high, the signal can easily fall out of the useful bandwidth $\Delta\omega_d$ during the search. On the other hand, a low Q factor degrades the sensitivity of the detector, and should be avoided.
To overcome these difficulties, researchers at Tokyo University first used, in their
search for a 60 Hz signal from the Crab pulsar, a very ingenious system to obtain a
widening of the detection band without altering the signal to noise ratio (Hirakawa
et al. 1977). Consider the scheme shown in Fig. 8.12. The harmonic oscillator represents the antenna, at room temperature ($T \simeq 280$ K), equipped with an electrostatic transducer to which the resistance R is connected. The equations that characterize the electromechanical system are
$$M\left(\frac{d^2x}{dt^2} + \frac{\omega_o}{Q_A}\frac{dx}{dt} + \omega_o^2 x\right) + E_o q = 0$$
$$R\frac{dq}{dt} + \frac{1}{C}q + E_o x = 0 \qquad (8.31)$$
where $\omega_o$, $Q_A$, M are, as usual, the antenna resonant (angular) frequency, quality factor and mass, respectively; $E_o$ is the average electric field in the transducer capacitor C, q its charge and R is the readout resistor.
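Combining these two equations, with some approximations, we get
$$\frac{d^2x}{dt^2} + \omega_o\left[\frac{1}{Q_A} + \beta\frac{RC\omega_o}{1 + (RC\omega_o)^2}\right]\frac{dx}{dt} + \omega_o^2\left[1 - \beta\frac{1}{1 + (RC\omega_o)^2}\right]x = 0 \qquad (8.32)$$
where β is the electromechanical coupling factor of the capacitive transducer:
$$\beta = \frac{C E_o^2}{M\omega_o^2} \qquad (8.33)$$
Equation 8.32 describes the motion of an oscillator with a modified quality factor:
$$\frac{1}{Q^*} = \frac{1}{Q_A} + \frac{1}{Q_R} \quad \text{with} \quad \frac{1}{Q_R} = \beta\frac{RC\omega_o}{1 + (RC\omega_o)^2} \qquad (8.34)$$
It follows that
$$\frac{T^*}{Q^*} = \frac{T_A}{Q_A} + \frac{T_R}{Q_R} \qquad (8.36)$$
where $T_A$ and $T_R$ are the temperatures of the antenna and of the resistor R, respectively. Combining Eqs. 8.34 and 8.36 we get
$$T^* = \frac{\left[1 + (RC\omega_o)^2\right] T_A + \beta Q_A RC\omega_o T_R}{1 + (RC\omega_o)^2 + \beta Q_A RC\omega_o} \qquad (8.37)$$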
In the case $\beta Q_A \gg 1$, and choosing $RC\omega_o = 1$, we can simplify this relation to get the equivalent temperature
$$T^* = T_R + \frac{1}{\beta Q_A}\, 2T_A \qquad (8.38)$$
that can be substantially lower than $T_A$. The thermal fluctuations of the antenna vibration x now take place at the equivalent temperature $T^*$:
$$\sigma_x^2 = \frac{k_B T^*}{M\omega_o^2} \qquad (8.39)$$
Such lowering of the equivalent temperature of the system is called cold damping.
Inevitably, this comes with a lowering of the detector’s Q. Therefore the signal
decreases along with the noise and the SNR does not change. This method makes it
possible to extend the detection bandwidth, thus making it easier to track the periodic
signal in frequency.
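The trade-off between equivalent temperature and quality factor can be explored numerically from Eqs. 8.34 and 8.36. The sketch below uses hypothetical values (antenna at 280 K, resistor at 4 K, $\beta Q_A = 100$, $RC\omega_o = 1$); the function name and all numbers are assumptions chosen only for illustration.

```python
def cold_damping(t_a, t_r, q_a, beta, rc_omega):
    """Return (Q*, T*) from Eqs. 8.34 and 8.36 for a resistor at temperature t_r."""
    inv_q_r = beta * rc_omega / (1.0 + rc_omega ** 2)  # 1/Q_R, Eq. 8.34
    q_star = 1.0 / (1.0 / q_a + inv_q_r)               # loaded quality factor
    t_star = q_star * (t_a / q_a + t_r * inv_q_r)      # equivalent temperature, Eq. 8.36
    return q_star, t_star


# Hypothetical parameters, for illustration only.
T_A, T_R = 280.0, 4.0     # antenna and resistor temperatures, K
Q_A, BETA = 1.0e5, 1.0e-3  # unloaded Q and coupling factor (beta * Q_A = 100)

q_star, t_star = cold_damping(T_A, T_R, Q_A, BETA, rc_omega=1.0)
approx = T_R + 2.0 * T_A / (BETA * Q_A)  # large beta*Q_A limit, Eq. 8.38
print(f"Q* = {q_star:.1f}, T* = {t_star:.2f} K, approximation = {approx:.2f} K")
```

With these numbers the equivalent temperature drops from 280 K to below 10 K, while the quality factor falls by about a factor of 50, which is the band-widening effect exploited in the search.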
References
Aguiar, O.D.: The past, present and future of the resonant-mass gravitational wave detectors (2010).
arxiv:1009.1138
Aguiar, O.D., et al.: The Brazilian gravitational wave detector Mario Schenberg: status report. Class.
Quantum Grav. 23, S239 (2006)
Astone, P., et al.: First cooling below 0.1 K of the new gravitational-wave antenna “Nautilus” of
the Rome group. Europhys. Lett. 16, 231 (1991)
Astone, P., et al.: Noise behaviour of the Explorer gravitational wave antenna during the λ transition
to the superfluid phase. Cryogenics 32, 668 (1992)
Astone, P., et al.: Cosmic rays observed by the resonant gravitational wave detector NAUTILUS.
Phys. Rev. Lett. 84, 14 (2000)
Astone, P., et al.: Detection of high energy cosmic rays with the resonant gravitational wave detectors
NAUTILUS and EXPLORER. Astroparticle Physics 30, 200 (2008) and references therein
Bonifazi, P., Ferrari, V., Frasca, S., Pallottino, G.V., Pizzella, G.: Data analysis algorithms for
gravitational-wave experiments. Nuovo Cimento 1C, 465 (1978)
Braginsky, V.B., et al.: The noise in gravitational wave detectors and other classical force measure-
ments is not influenced by test mass quantization. Phys. Rev. D 67, 082001 (2003)
Caves, C.M.: Quantum non-Demolition Measurements. In: Meystre, P., Scully, M. (eds.) Quantum
Optics, Experimental Gravity, and Measurement Theory. Springer, Boston (1983)
Cinquegrana, C., Majorana, E., Rapagnani, P., Ricci, F.: Back-action-evading transducing scheme
for cryogenic gravitational wave antennas. Phys. Rev. D 48, 448 (1993)
Goryachev, M., et al.: Rare events detected with a bulk acoustic wave high frequency gravitational
wave antenna. Phys. Rev. Lett. 127, 071102 (2021)
Coccia, E., Marini, A., Mazzitelli, G., Modestino, G., Ricci, F., Ronga, F., Votano, L.: A cosmic-ray
veto system for the gravitational wave detector NAUTILUS. Nucl. Inst. & Meth. Phys. Res. A
355, 624 (1995)
Clarke, J.: SQUIDs: then and now. Int. J. Mod. Phys. B 24, 3999–4038 (2010)
Acernese, F., et al.: First joint gravitational wave search by the AURIGA-EXPLORER-NAUTILUS-Virgo collaborations. Class. Quantum Grav. 25, 205007 (2008)
Gottardi, L., et al.: Sensitivity of the spherical gravitational wave detector MiniGRAIL operating
at 5 K. Phys. Rev. D 76, 102005 (2007)
Hirakawa, H., Hiramatsu, S., Ogawa, Y.: Damping of Brownian motion by cold load. Phys. Lett. A
63, 199 (1977)
Paik, H.J.: Superconducting tunable diaphragm transducer for sensitive acceleration measurements.
J. Appl. Phys. 47, 1168–1178 (1976)
Ricci, F.: Mechanical noise and low temperature physics aspects of the gravitational wave experiment. In: Posada, E., Violini, G. (eds.) The Search of Gravitational Waves, p. 157. World Scientific,
Singapore (1982)
Tsubono, K.: Detection of continuous waves. In: Blair, D.G. (ed.) The Detection of Gravitational
Waves. Cambridge University Press, Cambridge (1991)
Wagoner, R.V., Paik, H.J.: Multi-mode Detection of Gravitational Waves by a Sphere. B. Bertotti
ed. Accademia Nazionale dei Lincei, Roma (1977), p. 257
Weber, J.: General relativity and gravitational waves. Interscience Tracts on Physics and Astronomy.
Interscience Publishers Inc, New York (1961)
Interferometric Detectors of
Gravitational Waves 9
9.1 Introduction
The pioneering idea of using a light signal that travels between two freely gravitating
masses and characterizes the metric of space is already present in Pirani’s work (Pirani
1956).
J. Weber, the scientist who devised and built the first resonant detectors (Chap. 8),
together with R. Forward, first considered the possibility of replacing the spring with
a beam of laser light. Forward, with L. Miller and G. Moss, performed an experiment
of this type at the Hughes research lab (Moss et al. 1971; Forward 1978).
Several classes of detectors can be traced back to this principle: spacecraft Doppler tracking, a method based on the monitoring of Doppler signals for tracking artificial satellites; planetary ranging, or the measurement of planetary orbits; the combined observation of the arrival times of the radio signals emitted by pulsars; and finally the long-baseline Michelson interferometers.
In each of these techniques the effect of the gravitational wave is that of influencing the propagation of the e.m. signal and the test masses. In the case of pulsar timing the signals are simply received at Earth (the first gravitating mass) and accurately compared with a phase reference derived from the best available (in terms of stability) atomic clocks. In the other cases, the signal originates from a mass and is then reflected backwards (or re-transmitted while maintaining phase coherence) from the second mass. Since the seismic noise is absent, these techniques can be used to detect signals in the frequency band from $10^{-8}$ to 1 Hz.
Finally, it should be emphasized that the idea of the interferometric detection of
gravitational waves was taken up and literally re-invented in 1972 by R. Weiss at
M.I.T. (Weiss 1972). In fact, he carried out a real feasibility study of the experimental
configuration, laying the foundations of the current Virgo and LIGO detectors.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 205
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_9
$$h_+(\vec r, t) \to h_0 \cos(\omega_g t - kz + \phi) \qquad (9.1)$$
We now series expand $(1 + h_+)^{-1/2}$ and integrate, assuming for h(t) a waveform like Eq. 9.1:
$$2L \simeq c(t_r - t_0) - \frac{c\,h_0}{2\omega_g}\left[\sin(\omega_g t_r) - \sin(\omega_g t_0)\right] \qquad (9.4)$$
As we are limiting our solution to first order in $h_0$, we can take $t_r \simeq t_0 + 2L/c$ in the argument of the sine function in Eq. 9.4; so we get for the round trip time
$$t_r - t_0 = \frac{2L}{c} \pm h_0 \frac{L}{c}\,\frac{\sin\eta}{\eta}\cos(\omega_g t + \eta) \qquad (9.5)$$
where $\eta = \frac{\omega_g L}{c}$ and the “−” sign holds for photons travelling along the x-axis (and “+” for the y-axis).$^1$
Clearly, the round trip travel time is, in the unperturbed ($h_+ = 0$) case, $(t_r - t_0)^{(0)} = 2L/c$; also, if the distance L is short enough, and $h_+(t)$ changes slowly enough to be considered constant during the travel time, we simply have $(t_r - t_0)^{(1)} = (2 + h_+(t))L/c$ so that the GW produces a perturbation $\delta(T_{rt}) = \frac{L}{c} h_0$. This is also the simple, intuitive result we would get by computing the effect in a detector-based reference frame. Note also that the travel time depends on time, via the phase of the gravitational wave.
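The first-order term of Eq. 9.5 is easy to evaluate numerically, and doing so shows how the factor $\sin\eta/\eta$ reduces the response when the arm is not short compared with the gravitational wavelength. The sketch below returns only the perturbation, not the full round trip time, because in double precision the perturbation would be lost when added to 2L/c. Function name and parameter values are illustrative assumptions.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s


def round_trip_perturbation(t, arm_length, h0, f_gw, sign=1.0):
    """First-order term of Eq. 9.5: sign * h0 * (L/c) * sinc(eta) * cos(omega_g t + eta).

    `sign` selects the arm (the text assigns opposite signs to the x and y axes).
    """
    omega_g = 2.0 * math.pi * f_gw
    eta = omega_g * arm_length / C_LIGHT
    sinc = math.sin(eta) / eta if eta != 0.0 else 1.0
    return sign * h0 * (arm_length / C_LIGHT) * sinc * math.cos(omega_g * t + eta)


# Illustrative values (assumptions): a 3 km arm and a strain amplitude of 1e-21.
ARM, H0 = 3000.0, 1.0e-21
for f_gw in (100.0, 1.0e4, C_LIGHT / (2.0 * ARM)):
    print(f"f = {f_gw:10.1f} Hz -> delta t(0) = {round_trip_perturbation(0.0, ARM, H0, f_gw):.3e} s")
```

The last frequency, c/2L, corresponds to $\eta = \pi$: the wave completes one full period during the round trip and the net effect cancels.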
We now compute the effect of the GW on the electromagnetic field of a monochromatic light wave, of frequency $\omega_L/2\pi$, travelling between TM1 and TM2. Call $A(t_0) = A_0 e^{-i\omega_L t_0}$ the (complex) amplitude at the beginning of the round trip, and $B(t_0) = A(t_r)$ its value at the end. To evaluate B(t) we substitute in the exponential $t_r$ as given by Eq. 9.5 and we expand to first order in $h_0$.
To simplify the notation, from now on we drop the index “0”, for a generic initial time, and drop “+” from $h_+$, and we introduce the function$^2$ $\mathrm{sinc}(x) = \frac{\sin(x)}{x}$
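$$\frac{1}{2} h_o e^{-i(\omega_L - \omega_g)t}\left[A_1 e^{-2i\eta} \pm i\xi\,\mathrm{sinc}(\eta)\,e^{-i\eta} A_0\right] + \frac{1}{2} h_o e^{-i(\omega_L + \omega_g)t}\left[A_2 e^{2i\eta} \mp i\xi\,\mathrm{sinc}(\eta)\,e^{i\eta} A_0\right] \qquad (9.8)$$
where
$$D = e^{2i\xi}\begin{pmatrix} 1 & 0 & 0 \\ \pm i\xi\,\frac{\sin(\eta)}{\eta}\,e^{-i\eta} & e^{-2i\eta} & 0 \\ \pm i\xi\,\frac{\sin(\eta)}{\eta}\,e^{i\eta} & 0 & e^{2i\eta} \end{pmatrix} \qquad (9.10)$$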
The matrix formalism is a powerful tool to analyse complex optical systems; Eq. 9.10 gives us the opportunity to treat the interaction between the GW and the travelling light within this formalism as well: each of the optical components of the system is described by a matrix, an element of a non-commutative algebra in which we define a global operator O that represents the detector as a whole (Vinet 2020).
Fig. 9.2 The e.m. fields in the different points inside the Michelson interferometer
A more general analysis of the Michelson interferometer can be done using the optical matrix formalism. Here we refer to the fields as defined in Fig. 9.2. The two mirrors have reflectivities, $r_1$ and $r_2$, that can be different; $r_{bs}$ and $t_{bs}$ indicate the reflectivity and transmittivity of the beamsplitter, respectively; $\psi_i$, i = 1...n, is the e.m. field at the different points inside the interferometer; and $\psi_{in}$ and $\psi_{out}$ are the e.m. fields at the input and output port of the Michelson set-up. Using the matrix formalism developed in the previous sections, we have
$$P_{in} \propto |\psi_{in}|^2$$
$$\psi_1 = t_{bs}\,\psi_{in} \qquad\qquad \psi_5 = i r_{bs}\,\psi_{in}$$
$$\psi_2 = e^{-ikL_x}\,\psi_1 \qquad\qquad \psi_6 = e^{-ikL_y}\,\psi_5$$
$$\psi_3 = i r_1\,\psi_2 \qquad\qquad \psi_7 = i r_2\,\psi_6$$
$$\psi_4 = e^{-ikL_x}\,\psi_3 \qquad\qquad \psi_8 = e^{-ikL_y}\,\psi_7$$
$$\psi_{out} = i r_{bs}\,\psi_4 + t_{bs}\,\psi_8$$
$$P_{out} \propto |\psi_{out}|^2$$
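Carrying through the products we finally find the e.m. power collected at the output port, which is proportional to the square modulus of the output field and is shown in the same figure:
$$P_{out} \propto |\psi_{out}|^2 = P_{in}\, r_{bs}^2 t_{bs}^2 (r_1^2 + r_2^2)\left[1 + \frac{2 r_1 r_2}{r_1^2 + r_2^2}\cos 2k\delta L\right] \qquad (9.11)$$
In this ideal model the coefficient of the cosine term, $C = \frac{2 r_1 r_2}{r_1^2 + r_2^2}$, plays the role of the contrast of the interference pattern.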
C varies between 1, for perfectly reflecting mirrors (r1 = r2 = 1), and 0, when no
interference takes place. However, even with perfect mirrors, we can have C < 1, due
to an imperfect alignment of the recombined beams. In addition to the output beam,
directed towards the detector, another beam is sent back from the interferometer
towards the laser source. The field of this beam is in phase opposition with respect
to that at the output port (just count the reflections, each reflection adds π/2 to
the phase), complying with energy conservation. This implies that when the output
power is at its minimum value, the beam reflected back at the input port is at its
maximum.
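The chain of field relations above can be checked numerically. The following Python sketch is only illustrative: it assumes that every reflection, including the one at the beamsplitter, contributes a factor i, it takes δL = L_x − L_y, and it uses hypothetical function names and arbitrary units (wavelength equal to 1). Under these assumptions the propagated field reproduces the closed form of Eq. 9.11.

```python
import cmath
import math


def michelson_output_power(p_in, r1, r2, r_bs, t_bs, k, l_x, l_y):
    """Return |psi_out|^2 obtained by propagating the field through both arms."""
    psi_in = math.sqrt(p_in)
    # x arm: transmission at the beamsplitter, round trip 2*l_x, reflection i*r1
    psi4 = cmath.exp(-2j * k * l_x) * (1j * r1) * t_bs * psi_in
    # y arm: reflection at the beamsplitter (assumed factor i), round trip 2*l_y
    psi8 = cmath.exp(-2j * k * l_y) * (1j * r2) * (1j * r_bs) * psi_in
    # recombination: x beam reflected, y beam transmitted towards the output port
    psi_out = 1j * r_bs * psi4 + t_bs * psi8
    return abs(psi_out) ** 2


def michelson_closed_form(p_in, r1, r2, r_bs, t_bs, k, delta_l):
    """Closed form of Eq. 9.11 with delta_l = l_x - l_y."""
    contrast = 2 * r1 * r2 / (r1**2 + r2**2)
    envelope = p_in * r_bs**2 * t_bs**2 * (r1**2 + r2**2)
    return envelope * (1 + contrast * math.cos(2 * k * delta_l))


if __name__ == "__main__":
    k = 2 * math.pi  # wave number for a wavelength of 1 (arbitrary units)
    bs = math.sqrt(0.5)  # lossless 50/50 beamsplitter (illustrative choice)
    for delta_l in (0.0, 0.1, 0.25):
        direct = michelson_output_power(1.0, 0.99, 0.98, bs, bs, k, 2.0 + delta_l, 2.0)
        closed = michelson_closed_form(1.0, 0.99, 0.98, bs, bs, k, delta_l)
        print(f"delta_l = {delta_l:.2f}  direct = {direct:.6f}  closed = {closed:.6f}")
```

Working in units of the wavelength keeps the phases small, so the comparison is not spoiled by rounding errors that would appear with kilometre-scale arms and micrometre wavelengths.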
We now return to the problem of the interaction of a GW with a light beam: in Sect. 9.2
we had taken the simplifying hypothesis that the two test masses be in free fall. In a
real interferometric system installed on the ground, the test masses are mirrors suspended
so as to form pendulums. We now calculate the phase difference measured by this system
by applying the equation of geodesic deviation for the three masses with respect to
the centre of mass of the whole system. As a result, the action of the gravitational
wave can be seen as an acceleration field $a(t, \xi_{cms})$ to which the three suspended
masses are subjected:
\[ a^{(n)}_{j}\big(t, \xi^{(n)}_{cms}\big) = \frac{1}{2}\,\frac{\partial^2 h^{j\,(TT)}_{k}}{\partial t^2}\,\xi^{k\,(n)}_{cms} \]
where
\[ \xi^{(n)}_{cms} \equiv \big(x^{(n)} - x_{cms},\; y^{(n)} - y_{cms},\; z^{(n)} - z_{cms}\big) \]
indicates the position vector of the n-th pendulum (n = 0, 1, 2, see Fig. 9.3). If the
origin of our reference is placed on the beam separator (beamsplitter), we have the
equations of motion for the various masses:
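\[
\begin{aligned}
\ddot{x}_0 + \tau^{-1}\dot{x}_0 + \omega_p^2\, x_0 &= -\tfrac{1}{2}\left(\ddot{h}_{xx}\, x_{cms} + \ddot{h}_{xy}\, y_{cms}\right) \\
\ddot{y}_0 + \tau^{-1}\dot{y}_0 + \omega_p^2\, y_0 &= -\tfrac{1}{2}\left(\ddot{h}_{yx}\, x_{cms} + \ddot{h}_{yy}\, y_{cms}\right) \\
\ddot{x}_1 + \tau^{-1}\dot{x}_1 + \omega_p^2\, x_1 &= \tfrac{1}{2}\left(\ddot{h}_{xx}\,(L - x_{cms}) - \ddot{h}_{xy}\, y_{cms}\right) \\
\ddot{y}_2 + \tau^{-1}\dot{y}_2 + \omega_p^2\, y_2 &= \tfrac{1}{2}\left(-\ddot{h}_{yx}\, x_{cms} + \ddot{h}_{yy}\,(L - y_{cms})\right)
\end{aligned}
\]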
For the sake of simplicity we have omitted the (TT) symbol; we introduced both the
relaxation time τ and the characteristic angular frequency ω_p, which we assume, for
simplicity, equal for all pendulums. The phase difference measured by the detector is
\[ \delta\phi = \frac{4\pi}{\lambda}\,(\Delta x - \Delta y) \tag{9.13} \]
where Δx = x0 − x1 and Δy = y0 − y2.
This quantity is deduced by combining the linear differential equations previously
written:
\[ \delta\ddot{\phi} + \tau^{-1}\,\delta\dot{\phi} + \omega_p^2\,\delta\phi = \frac{4\pi}{\lambda}\,\ddot{h}_{xx}\, L \tag{9.14} \]
The solution of this equation depends on the form of $\ddot{h}_{xx}$.
If the metric perturbation h(t) is a burst of amplitude h_o and duration Δt ≪ 1/ω_p,
we have
\[ \delta\phi(t) \simeq \frac{4\pi}{\lambda} L\, h(t) + \frac{4\pi}{\lambda} L\, h_o \left[\omega_p \Delta t\, \sin(\omega_p t)\, e^{-t/\tau}\right] \tag{9.15} \]
which holds for large values of the oscillator quality factor Q ≡ ω_p τ/2 ≫ 1.
This equation shows how the detector output is a measure of h(t); the second
term represents, for times t > Δt, the memory effect on the pendulum system of the
past interaction with the gravitational wave. This term, being weighted by the
factor ω_p Δt ≪ 1, can usually be neglected.
Fig. 9.3 Michelson interferometer scheme for the calculation of Sect. 9.4
For a sinusoidal signal
\[ h(t) = h_o\, e^{i\omega_g t} \]
we have, for ω_g ≫ ω_p and Q ≫ 1,
\[ \delta\phi = \frac{4\pi L}{\lambda}\, h_o\, e^{i\omega_g t} \tag{9.16} \]
This means that the interferometer can detect sinusoidal signals at frequencies higher
than the resonance of the pendulum suspension.
The approach followed here is equivalent to the matrix treatment, where the
mirrors are considered in free fall, as long as the resonance frequency of the mirror
suspension is lower than the characteristic frequencies of the signal (Fig. 9.3).
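The statement above can be made quantitative with a short numerical sketch. Substituting a sinusoidal strain of angular frequency ω_g into Eq. 9.14 gives a steady-state response δφ/h = (4πL/λ) ω_g² / (ω_g² − ω_p² − iω_g/τ); this substitution, the function name and the numerical values below (in particular the relaxation time) are assumptions made only for illustration.

```python
import math


def phase_response(omega_g, omega_p, tau, arm_length, wavelength):
    """Steady-state ratio delta_phi / h for a sinusoidal strain (from Eq. 9.14)."""
    free_mass_gain = 4 * math.pi * arm_length / wavelength
    return free_mass_gain * omega_g**2 / (omega_g**2 - omega_p**2 - 1j * omega_g / tau)


if __name__ == "__main__":
    omega_p = 2 * math.pi * 0.6   # pendulum resonance, 0.6 Hz
    tau = 1.0e3                   # hypothetical relaxation time, in seconds
    arm, lam = 3000.0, 1.064e-6   # illustrative arm length and wavelength
    gain = 4 * math.pi * arm / lam
    for freq in (0.06, 0.6, 10.0, 100.0):
        ratio = abs(phase_response(2 * math.pi * freq, omega_p, tau, arm, lam)) / gain
        print(f"{freq:7.2f} Hz  |response| / (4 pi L / lambda) = {ratio:.4g}")
```

Well above the pendulum resonance the ratio tends to 1, i.e. the mirrors respond as free masses; well below it the response is suppressed roughly as (ω_g/ω_p)².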
In this section, we have consistently neglected the factor sinc(ωg L/c), see, for
example, Eq. 9.8, accounting for the signal reduction when the light travel time inside
the interferometer arm is comparable with half of the GW period. Although this is
certainly acceptable for a simple Michelson,3 we shall recall this correcting factor
in Sect. 9.5 where we discuss interferometers with optically extended arms.
3 Even at the highest frequency of 5 kHz, the GW wavelength is λgw = 60 km, and the biggest
interferometer, LIGO, has arms 4 km long.
9.5 The Experimental Configuration 213
– The multi-pass Michelson, which includes in each arm an optical delay line (Delay
Line—DL). DL consists of two spherical mirrors arranged in a quasi-confocal
configuration (Herriott et al. 1964). The light beam enters through a hole made
in the first (input) mirror and travels back and forth for N round trips, reflecting
N times off the end mirror and N − 1 times off the input one. The beam hits the
mirrors always in different points, and finally exits from another hole in the input
mirror and recombines on the beamsplitter with the other beam coming out of the
second arm. The recombined beam impinges on a photodiode that monitors the
interference state of the system. In this way, we extend the effective path length to
2N L, at the cost of attenuating the light power by a factor ∼ (ri re ) N , where ri , re
are the reflectivities (close to unity) of the input and end mirror, respectively.
– The Michelson–Fabry-Perot (FP): the interferometer is obtained by placing, in
each of the two arms, besides the (virtually) perfectly reflecting mirror at the far
end, a partially transmitting mirror after the beamsplitter so as to compose an
optically resonant Fabry-Perot cavity (Pérot and Fabry 1899). The light returning
from the two cavities is recombined in phase opposition, as it is done in the simple
Michelson. For a detailed discussion on Fabry-Perot, see Appendix C.
A few parameters are useful to characterize a FP cavity; we first define the amplitude
reflectivities ri, re (just as above) and the transmissivities ti, te of the input and
end mirrors, respectively; then
\[ P_{eff} = \frac{2F}{\pi}\, P_{in} \tag{9.18} \]
– the frequency spacing of the resonance lines, free spectral range (FSR),
\[ \nu_{FSR} = \frac{c}{2L} \tag{9.19} \]
– the width of the resonance lines, full-width half-maximum (FWHM),
\[ \nu_{FWHM} = \frac{\nu_{FSR}}{F} \tag{9.20} \]
– the time that a light wavefront (or packet) remains in the cavity, the storage time τs,
\[ \tau_s^{FP} = \frac{2L}{c}\,\frac{(r_i r_e)^{1/2}}{1 - r_i r_e} = F\,\frac{2L}{\pi c} \tag{9.21} \]
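The cavity parameters listed above are easily evaluated for a given geometry. In the sketch below the finesse is taken as F = π (r_i r_e)^{1/2} / (1 − r_i r_e), which is the expression implied by equating the two forms of the storage time in Eq. 9.21; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def fabry_perot_parameters(length, r_i, r_e):
    """Return finesse, FSR, FWHM and storage time of a lossless FP cavity."""
    finesse = math.pi * math.sqrt(r_i * r_e) / (1.0 - r_i * r_e)  # implied by Eq. 9.21
    fsr = C_LIGHT / (2.0 * length)                                # Eq. 9.19
    fwhm = fsr / finesse                                          # Eq. 9.20
    storage_time = finesse * 2.0 * length / (math.pi * C_LIGHT)   # Eq. 9.21
    return finesse, fsr, fwhm, storage_time


if __name__ == "__main__":
    # Illustrative values: 3 km cavity, input mirror r_i = 0.99, ideal end mirror
    finesse, fsr, fwhm, tau_s = fabry_perot_parameters(3000.0, 0.99, 1.0)
    print(f"finesse       = {finesse:.1f}")
    print(f"FSR           = {fsr / 1e3:.2f} kHz")
    print(f"FWHM          = {fwhm:.1f} Hz")
    print(f"storage time  = {tau_s * 1e3:.3f} ms")
    print(f"P_eff / P_in  = {2 * finesse / math.pi:.1f}")  # Eq. 9.18
```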
\[ \delta\phi_{DL} = \frac{4\pi N L}{\lambda}\, h_o\, \mathrm{sinc}(\omega_g N L/c) \tag{9.22} \]
\[ \delta\phi_{FP} \simeq \frac{8 F L}{\lambda}\, h_o\, \frac{1}{\sqrt{1 + \omega_g^2 \tau_s^2}}\; \mathrm{sinc}(\omega_g L/c) \tag{9.23} \]
We note two main differences with respect to the phase change δφ_Mich = h_0 4πL/λ in
a single-pass Michelson (see Eq. 9.16):
– The optical path length (2L) is now extended to 2NL or 8F L/π, respectively.
– The response is no longer independent of frequency: the factors sinc(ωg N L/c)
for the DL configuration, and
(1 + ωg2 τs2 )−1/2 for FP, produce a slow reduction of the phase change signal when
the frequency increases.
The factor sinc(ωg L/c) for the FP, which we have reintroduced for completeness,
is on the other hand irrelevant: its value is virtually 1 for physical arm lengths of a
few kilometres and frequencies below 10 kHz.
In the relations above, and in particular in Eq. 9.23, we have neglected the possible
losses of the mirrors.
If we compare the performance at equal storage time, under the reasonable hypothesis
ω_g τ_s ≪ π, it turns out that the FP configuration is only a factor of two more
sensitive than DL.
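This comparison can be sketched numerically. The example below assumes that the delay-line gain is the single-pass Michelson value 4πL/λ multiplied by N, that the DL storage time is 2NL/c, and that sinc(x) = sin(x)/x; with these assumptions both responses can be written in terms of the common storage time τ_s. Function names and numerical values are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def sinc(x):
    """Unnormalized sinc, sin(x)/x."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def dl_phase_per_strain(omega_g, tau_s, wavelength):
    """Delay line: (4 pi N L / lambda) sinc(omega N L / c) with 2 N L = c tau_s."""
    return 2 * math.pi * C_LIGHT * tau_s / wavelength * sinc(omega_g * tau_s / 2)


def fp_phase_per_strain(omega_g, tau_s, wavelength):
    """Fabry-Perot: (8 F L / lambda) / sqrt(1 + (omega tau_s)^2) with 2 F L = pi c tau_s."""
    return 4 * math.pi * C_LIGHT * tau_s / wavelength / math.sqrt(1 + (omega_g * tau_s) ** 2)


if __name__ == "__main__":
    tau_s, lam = 1.0e-3, 1.064e-6  # illustrative storage time and wavelength
    for freq in (1.0, 10.0, 100.0, 1000.0):
        omega = 2 * math.pi * freq
        ratio = fp_phase_per_strain(omega, tau_s, lam) / dl_phase_per_strain(omega, tau_s, lam)
        print(f"{freq:7.1f} Hz  FP / DL = {ratio:.3f}")
```

At low frequency the ratio tends to 2; at higher frequencies the two configurations roll off differently, and the ratio diverges near the zeros of the delay-line sinc factor.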
However, the choice between these configurations also requires an analysis of
the difficulties encountered in carrying out each one. In the case of delay lines the
main problem is the effect of light scattered by the mirror, which overlaps with the
main beam, producing spurious signals. In particular, the condition of avoiding any
overlap between the reflection spots of the beam on each mirror of the DL imposes
the use of spherical mirrors of large diameter and excellent quality, especially on the
edge. In the FP cavity, on the other hand, the beam is also reflected many times, but
always from the central section of the mirror, at normal incidence; therefore a smaller
diameter is required for the mirrors, as well as for the vacuum pipes connecting them:
this means, for a many-kilometre ultra-high vacuum system, a relevant reduction of
costs.
The difference between the static length of optical paths in the two arms is a
problem common to the two configurations because it makes them sensitive to fluctuations
in the laser frequency. In the DL case this difference may be due to different
values of the curvature radii of the mirrors in the two arms, while in the FP interferometer
it may arise from different values of finesse in the two FP cavities, i.e. from
differences in either the radius of curvature or the reflectivity of the mirrors, so that
the FP configuration can place even more demanding conditions on the reduction of the
frequency noise of the laser light.
In conclusion, the choice of the FP system compared to the DL is due to two main
reasons:
– The FP configuration needs mirrors with diameters five times smaller than the
DL device.
– The light inside the FP cavity travels back and forth along the same path so that
the problem of the scattered light is greatly reduced.
For these reasons, the LIGO, Virgo and KAGRA detectors adopted the FP configuration.
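\[ P_{out} = \frac{P_{in}}{2}\left[1 + C\cos(\alpha + \phi_{gw})\right] \tag{9.24} \]
where Pin is the input power. Equation 9.24 allows us to deduce the sensitivity of the
instrument to a gravitational signal:
\[ \delta P_{gw} = \frac{\partial P_{out}}{\partial \phi_{gw}}\,\delta\phi_{gw} = \frac{P_{in}}{2}\, C \sin(\alpha)\, \delta\phi_{gw} \tag{9.25} \]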
Thus, the signal is maximized by the choice α = (2k + 1)π/2 with k integer. This
would naively suggest adjusting the interferometer so that the output light is halfway
between the conditions of constructive (maximum light power, bright fringe) and
destructive (minimum light, dark fringe) interference: this is called the condition of half
fringe, or grey fringe.
But the correct approach is, rather than maximizing the signal, to configure the
interferometer in order to optimize the signal to noise ratio (SNR). Although many
noise sources must be included in this optimization, it is instructive here to evaluate
the SNR considering only the optical, or readout, noise components, which limit
the sensitivity in the higher range of the detector bandwidth. Other noises, and their
role in determining the sensitivity, will be mentioned at the end of this chapter.
So, we must compare the signal δ Pgw of Eq. 9.25 with the fluctuations δ Pshot
in the output power caused by a fundamental noise source: granular noise (or shot
noise) on the output photodiode.
Gravitational waves of the same amplitude and frequency but arriving from different
directions will produce different length changes in the interferometer. For example,
a GW incident perpendicular to the interferometer plane will produce strain on both
arms, and the two effects add up at the interferometer output; if instead the same GW
is incident in line with one arm, it does not produce a length change in this arm (GW
in GR are transverse) but only in the other, so that the signal at the interferometer
output is half of that in the previous case. Additionally, if a GW is incident from a
non-orthogonal angle, each arm will see the projection of the GW onto its axis and therefore
measure a reduced amount of strain. Thus, the detector sensitivity depends also on
the propagation direction of the GW with respect to the interferometer; besides, this
effect on the detector is different from one polarization to another. In analogy with
the electromagnetic antennas, we define the GW antenna patterns of a Michelson
interferometer: we assume a coordinate system determined by the detector and by the
gravitational wave source location. The detector arms lie along the x and y axes with
the beamsplitter at the origin O. The sky location of the source is specified by two
polar angles φ and θ as defined in Fig. 9.5. In general, the gravitational radiation has
an arbitrary polarization. We introduce an additional coordinate system for a plane
gravitational radiation: X and Y lie on the wavefront plane and Z is the propagation
direction. The polarization angle ψ is defined between the X -axis and the line of
nodes, which is the intersection of the wavefront plane and Earth equatorial plane.
The dashed line in Fig. 9.5 is thus the North-South rotation axis of the Earth.
To proceed further, it is better to separately analyse the two states of polarization
of the radiation. Consider first the + polarization tensor, when the polarization angle
is ψ = 0
\[ e^{+}_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{9.26} \]
Fig. 9.5 The coordinate systems used to compute the GW antenna pattern of a Michelson
interferometer: the origin is set at the beamsplitter, the arms lie along the x and y axes, the source is
identified by two polar angles θ and φ
We then convert it in the detector coordinate frame, using the rotation matrix
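\[ R_{\alpha\beta} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi & \sin\phi & 0 \\ 0 & -\cos\theta\sin\phi & \cos\theta\cos\phi & \sin\theta \\ 0 & \sin\theta\sin\phi & -\sin\theta\cos\phi & \cos\theta \end{pmatrix} \tag{9.27} \]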
where the suffix d reminds us that the polarization tensor is now expressed in the
detector frame.
We now start from the simplest and optimal case, already discussed: the antenna
response to a gravitational wave of amplitude h 0 , polarization +, when the directions
of the axes (X, Y, Z) coincide with (x, y, z), modulo rotations of π/2 around the Z -
axis, which transform the wave into itself.
Focus now on the optical path difference⁴ Δ = L_x − L_y: it can be rewritten in
a convenient form by introducing the detector response matrix $A_{\alpha\beta}$
\[ A_{\alpha\beta} = L \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{9.29} \]
where L is the arm length of the interferometer. Using this matrix we can write
\[ \Delta = \frac{1}{2}(h_{xx} - h_{yy})\, L = \frac{1}{2}\, h_0\, e^{+}_{\alpha\beta} A^{\alpha\beta} = h_0 L \tag{9.30} \]
Extend now this result to obtain the antenna response for a + polarized wave incident
along a generic direction: just replace in Eq. 9.30 $e^{+}_{\alpha\beta}$ with $e^{+d}_{\alpha\beta}$:
\[ \Delta_{+} = \frac{1}{2}\, h_0\, e^{+d}_{\alpha\beta} A^{\alpha\beta} = \frac{1}{2}\, h_0 L\, (1 + \cos^2\theta)\cos 2\phi \tag{9.31} \]
We can compute, following the same logic path, the antenna response for GW with
polarization ×, i.e. ψ = π/4
\[ e^{\times}_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{9.32} \]
In the detector coordinates we have
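\[ e^{\times d}_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & -\sin 2\phi\cos\theta & \cos 2\phi\cos\theta & \cos\phi\sin\theta \\ 0 & \cos 2\phi\cos\theta & \sin 2\phi\cos\theta & \sin\phi\sin\theta \\ 0 & \cos\phi\sin\theta & \sin\phi\sin\theta & 0 \end{pmatrix} \tag{9.33} \]
\[ \Delta_{\times} = \frac{1}{2}\, h_0\, e^{\times d}_{\alpha\beta} A^{\alpha\beta} = -h_0 L \cos\theta \sin 2\phi \tag{9.34} \]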
We can now compute the general case, i.e. the superposition of two polarized waves
of different amplitudes h + and h × , with polarization angle ψ.
4 We consider here, for the sake of clarity, a simple Michelson interferometer, with $\Delta = \frac{\lambda}{2\pi}\phi_{gw}$,
neglecting the signal enhancement factors due to multi-pass configurations discussed in Sect. 9.5.
Fig. 9.7 The antenna pattern of a Michelson interferometer for the two polarizations + and ×. The
third figure is the total combined antenna response averaged over two polarization states
\[
\begin{aligned}
h^{d}_{+} &= \cos 2\psi\; h_{+} + \sin 2\psi\; h_{\times} \\
h^{d}_{\times} &= -\sin 2\psi\; h_{+} + \cos 2\psi\; h_{\times}
\end{aligned}
\tag{9.36}
\]
\[ \frac{\Delta}{L} = F^{+} h^{d}_{+} + F^{\times} h^{d}_{\times} \tag{9.37} \]
where the general form factors of the antenna for the two polarizations are
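\[ F^{+}(\theta, \phi, \psi) = \frac{1}{2}(1 + \cos^2\theta)\cos 2\phi \cos 2\psi - \cos\theta \sin 2\phi \sin 2\psi \tag{9.38} \]
\[ F^{\times}(\theta, \phi, \psi) = \frac{1}{2}(1 + \cos^2\theta)\cos 2\phi \sin 2\psi + \cos\theta \sin 2\phi \cos 2\psi \]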
In Fig. 9.7 we show the antenna pattern of a Michelson interferometer for the two
polarizations + and ×, as well as the total combined antenna response averaged over
the two polarization states. The distance from the origin of a point on the surface
is proportional to the amplitude of the detector response to waves coming from that
direction: it is largest in the directions normal to the plane of the interferometer. The
secondary response characteristics of the antenna are along the four tangential lobes
aligned with the antenna arms.
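The construction of the antenna response can be reproduced with a few lines of Python. The sketch below is illustrative only: it keeps just the spatial 3×3 blocks of the matrices (the time components vanish), it assumes that the polarization tensor is brought to the detector frame as Rᵀ e R, and it uses hypothetical function names. With this convention the contraction with the detector matrix reproduces the closed forms of Eqs. 9.31 and 9.34.

```python
import numpy as np

E_PLUS = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]])   # Eq. 9.26
E_CROSS = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])   # Eq. 9.32
A_OVER_L = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]])  # Eq. 9.29 / L


def rotation(theta, phi):
    """Spatial block of the rotation matrix of Eq. 9.27."""
    ct, st, cp, sp = np.cos(theta), np.sin(theta), np.cos(phi), np.sin(phi)
    return np.array([[cp, sp, 0.0],
                     [-ct * sp, ct * cp, st],
                     [st * sp, -st * cp, ct]])


def path_difference_over_h0_l(e_wave, theta, phi):
    """Delta / (h0 L) for a wave with polarization tensor e_wave (psi = 0)."""
    r = rotation(theta, phi)
    e_detector = r.T @ e_wave @ r  # assumed transformation convention
    return 0.5 * np.sum(e_detector * A_OVER_L)


def form_factors(theta, phi, psi):
    """F+ and Fx as written in Eq. 9.38."""
    a = 0.5 * (1 + np.cos(theta) ** 2) * np.cos(2 * phi)
    b = np.cos(theta) * np.sin(2 * phi)
    return (a * np.cos(2 * psi) - b * np.sin(2 * psi),
            a * np.sin(2 * psi) + b * np.cos(2 * psi))


if __name__ == "__main__":
    theta, phi = 0.7, 0.3
    print(path_difference_over_h0_l(E_PLUS, theta, phi),
          0.5 * (1 + np.cos(theta) ** 2) * np.cos(2 * phi))
    print(path_difference_over_h0_l(E_CROSS, theta, phi),
          -np.cos(theta) * np.sin(2 * phi))
```

Evaluating `form_factors` on a grid of θ and φ gives the surfaces plotted in the antenna pattern figure.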
The shot noise of laser light arises from the fluctuations in the number of detected
photons (photon counting) and it is a standard example of a measurement of a random
variable following the Poisson statistics. The average output power is $\bar{P}_{out} = \bar{n}\,\hbar\omega_L$,
where $\bar{n}$ is the average rate of detected photons. The probability to measure N photons
in the time interval τ is
\[ p(N) = \frac{\bar{N}^{N} e^{-\bar{N}}}{N!} \tag{9.39} \]
The expectation value is $\bar{N} = \bar{n}\tau$ and the variance $\sigma_N^2 = \bar{N}$: its square root $\sigma_N$
measures the quantum fluctuation of the e.m. radiation, so we can see that the relative
error on the light power measurement, in the given time interval,
\[ \frac{\sigma_N}{\bar{N}} = \frac{\sqrt{\bar{n}\tau}}{\bar{n}\tau} = \frac{1}{\sqrt{\bar{n}\tau}} \tag{9.40} \]
decreases using more laser power ($\bar{n}$) and a longer integration time. The power
fluctuations in the time interval τ are then characterized by
\[ \delta P_{shot} = \hbar\omega_L \frac{\sigma_N}{\tau} = \hbar\omega_L \sqrt{\frac{P_{out}}{\hbar\omega_L \tau}} = \sqrt{\frac{P_{in}}{2\tau}\,(1 + C\cos\alpha)\,\hbar\omega_L} \tag{9.41} \]
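The Poisson scaling of Eq. 9.40 can be illustrated with a small Monte Carlo experiment. The sketch below draws photon counts with NumPy's random generator and compares the measured relative fluctuation with 1/√(n̄τ); the function name, the seed and the numerical values are arbitrary choices made for illustration.

```python
import numpy as np


def relative_fluctuation(rate, tau, trials=200_000, seed=0):
    """Simulate photon counting and return sigma_N / mean(N)."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam=rate * tau, size=trials)  # N photons per interval tau
    return counts.std() / counts.mean()


if __name__ == "__main__":
    rate = 1.0e6  # hypothetical detected photon rate, in photons per second
    for tau in (1e-4, 1e-3, 1e-2):
        simulated = relative_fluctuation(rate, tau)
        expected = 1.0 / np.sqrt(rate * tau)  # Eq. 9.40
        print(f"tau = {tau:.0e} s  simulated = {simulated:.5f}  expected = {expected:.5f}")
```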
Note that the amount of power fluctuations depends on the integration time τ,
which defines the instrument bandwidth (τ = 1/2Δf). Therefore we can write again
Eq. 9.41 as
\[ (\delta P_{shot})^2 = 2 P_{out}\,\hbar\omega_L\,\Delta f \tag{9.42} \]
This noise power does not depend on the observation frequency, but is proportional
to the bandwidth Δf chosen: it is an example of white noise, and we define
the unilateral shot noise power spectral density (PSD) (see Chap. 10)
\[ S_P^{(shot)} \equiv \frac{(\delta P_{shot})^2}{\Delta f} = 2\hbar\omega_L P_{out} \quad [\mathrm{W^2/Hz}] \]
We can now evaluate the signal to noise ratio, comparing Eqs. 9.25 and 9.41
\[ SNR = \frac{\delta P_{gw}}{\delta P_{shot}} = \frac{1}{2}\sqrt{\frac{P_{in}}{\hbar\omega_L \Delta f}}\;\frac{C\sin\alpha}{\sqrt{1 + C\cos\alpha}}\;\frac{4\pi L}{\lambda}\, h(t) \tag{9.43} \]
value very close, for C ≃ 1, to the condition of darkness (cos α_opt = −1, dark
fringe). Besides, operation on the dark fringe is required when a radiofrequency
modulation is added to the laser light, as will be discussed in Appendix A.
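The position of the optimum can be explored numerically. The α dependence of the SNR is contained in the factor C sin α / √(1 + C cos α); setting its derivative to zero gives cos α_opt = (−1 + √(1 − C²))/C, which tends to −1 for C → 1. The sketch below, with hypothetical function names, compares a simple grid search with this closed form.

```python
import math


def snr_shape(alpha, contrast):
    """Alpha-dependent factor of the SNR: C sin(alpha) / sqrt(1 + C cos(alpha))."""
    return contrast * math.sin(alpha) / math.sqrt(1.0 + contrast * math.cos(alpha))


def optimal_alpha(contrast, steps=200_000):
    """Grid search for the working point in (0, pi) that maximizes the SNR."""
    grid = (math.pi * (i + 0.5) / steps for i in range(steps))
    return max(grid, key=lambda alpha: snr_shape(alpha, contrast))


def optimal_alpha_closed_form(contrast):
    """Stationary point of snr_shape, valid for 0 < C <= 1."""
    return math.acos((-1.0 + math.sqrt(1.0 - contrast**2)) / contrast)


if __name__ == "__main__":
    for contrast in (0.9, 0.99, 0.999):
        numeric = optimal_alpha(contrast)
        closed = optimal_alpha_closed_form(contrast)
        print(f"C = {contrast}: alpha_opt = {numeric:.4f} rad (closed form {closed:.4f} rad)")
```

As the contrast approaches 1, the optimum moves from the half fringe towards α = π, i.e. towards the dark fringe.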
Nevertheless, for the sake of simplicity, we shall evaluate the sensitivity of the Michelson
interferometer with the naive assumption of operating at the maximum signal
response (sin α = 1). The minimum detectable GW amplitude is simply found by
setting SNR = 1 in Eq. 9.43:
\[ \sigma_h = \frac{\lambda}{4\pi L}\sqrt{\frac{2\hbar\omega_L}{\eta P_{in}\tau}} = \frac{1}{L}\sqrt{\frac{\hbar c \lambda}{4\pi \eta P_{in} \tau}} \tag{9.45} \]
where we have introduced, for completeness, the efficiency η of the detection photodiode.
From this (using again τ = 1/2Δf) we can define a shot noise spectrum referred
to the input, i.e. measured in h units:
\[ S_h^{(shot)}(\nu) = \frac{1}{L^2}\,\frac{\hbar c \lambda}{2\pi \eta P_{in}} \tag{9.46} \]
We find the obvious confirmation that sensitivity increases with the arm length L,
with observation time τ (for periodic or long-lasting signals) and with the input power
Pin. It is customary to express all noise sources, acting at different points of the
apparatus, in units of h, as done here: this is useful as it allows us to compare noises, to
evaluate their effect on the sensitivity of the instrument and to add them up to obtain
a comprehensive noise spectrum.
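We report here the following formula, useful to estimate the order of magnitude
of the shot noise contribution that limits the measurement of the GW signal:
\[ \sigma_h = 5.2\cdot 10^{-19} \left(\frac{1000\ \mathrm{m}}{L}\right) \sqrt{\frac{\lambda}{1.064\ \mu\mathrm{m}}}\, \sqrt{\frac{10\ \mathrm{W}}{P_{in}}}\, \sqrt{\frac{1\ \mathrm{ms}}{\tau}}. \tag{9.47} \]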
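The shot noise limit of Eqs. 9.45 and 9.46 is straightforward to evaluate. The sketch below uses hypothetical function names and the reference values L = 1000 m, λ = 1.064 µm, P_in = 10 W, τ = 1 ms with η = 1; the constants are the CODATA values of ħ and c.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 299_792_458.0  # speed of light, m/s


def shot_noise_sigma_h(arm_length, wavelength, p_in, tau, eta=1.0):
    """Minimum detectable strain of Eq. 9.45 for an integration time tau."""
    return math.sqrt(HBAR * C_LIGHT * wavelength / (4 * math.pi * eta * p_in * tau)) / arm_length


def shot_noise_psd_h(arm_length, wavelength, p_in, eta=1.0):
    """Shot noise power spectral density in h units, Eq. 9.46 (1/Hz)."""
    return HBAR * C_LIGHT * wavelength / (2 * math.pi * eta * p_in) / arm_length**2


if __name__ == "__main__":
    sigma_h = shot_noise_sigma_h(1000.0, 1.064e-6, 10.0, 1.0e-3)
    amplitude = math.sqrt(shot_noise_psd_h(1000.0, 1.064e-6, 10.0))
    print(f"sigma_h       = {sigma_h:.3e}")
    print(f"sqrt(S_h)     = {amplitude:.3e} 1/sqrt(Hz)")
```

The two functions are related by σ_h² = S_h Δf with Δf = 1/(2τ), which provides a quick consistency check.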
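\[ \sigma_h^{(DL)} = \frac{1}{2NL}\left[\frac{\hbar c \lambda\, \Delta f}{\pi \eta P_{in}\, r^{2N-1}}\right]^{1/2} \cdot \frac{1}{\mathrm{sinc}(\omega_g \tau_s/2)} \tag{9.48} \]
Using Eq. 9.23, a similar limit is obtained in the case of the FP cavities with
negligible losses:
\[ \sigma_h^{(FP)} = \frac{1}{4FL}\left[\frac{\pi \hbar \lambda c\, \Delta f}{\eta P_{in}}\right]^{1/2} \cdot \sqrt{1 + (\omega_g \tau_s)^2}\; \frac{1}{\mathrm{sinc}(\omega_g L/c)} \tag{9.49} \]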
Unlike the simple Michelson, for both configurations the shot limit is a function of the
angular frequency of the gravitational signal ωg and in particular for ωg > (2πτs )−1
the sensitivity decreases.
In practical interferometers, to limit the contribution of the very low frequency
noise that characterizes the radiation emitted by the laser, the light frequency is modu-
lated at a frequency of the order of MHz. High efficiency and high speed modulators
are realized exploiting the electro-optical and acousto-optical effects in crystals,
whose refractive index can be changed by applying an electric field or mechanical
stress. The modulators based on these effects realize either phase modulation (direct
or through polarization variations) or frequency modulation (see the dedicated Appendix A on
modulation techniques). The modulated signals of the interferometer are then used
to control the interferometer itself. Under modulation conditions, the gravitational
signal appears in one of the side bands of the light signal spectrum and the maximum
of the SNR occurs again when the signal on the interferometer output photodiode
is at extinction (destructive interference, or dark fringe). In this condition the
light reflected backwards from the interferometer towards the laser (second output)
is maximum.
The net result is that the shot-noise-limited sensitivity is enhanced by the square
root of this factor K .
An exhaustive discussion of the power recycling system should include optical
losses. Indeed, the increase in power inside the recycling interferometer is limited
by the losses, mainly due to the absorption and scattering of the cavity mirrors. It
can be shown (Meers 1988) that the configuration gain offered by the recycling of
light, with respect to the non-recycled case, depends on the storage time inside the
recycling cavity that, in the end, is limited by the losses.
Signal Recycling. The search for signals in a given frequency region can be optimized
by the signal recycling configuration (Pegoraro et al. 1978; Drever 1983),
which consists in improving the sensitivity in a narrower (a few hundred Hz or less)
frequency band, centred around a frequency of interest νg, typically 300–500 Hz.
We recall here that, when the interferometer is operated on the dark fringe, the light
at the laser frequency is not present at the photodiode output, while the sidebands
containing the GW signal information are completely transmitted. It follows that, in
principle, we can send back into the interferometer the signal deprived of the carrier,
i.e. the laser frequency component of the light. In other words, the strategy consists
of storing the light carrying the GW information in one arm for half of the gravitational
wave cycle π/ωg. Then, instead of extracting it towards the photodiode, we feed
it back into the other arm. During this second half of the wave cycle, since in this
second arm both the phase of the signal and the sign of the response are inverted,
the phase shift of the light adds up, in the same direction of the first half cycle. The
signal recycling with this modality can be repeated many times until the losses of
the mirror prevail.
In practice, to achieve this recycling effect on the signal, an additional mirror
is used, the sixth of the dual recycling Fabry-Perot configuration (not counting the
beamsplitter). This last signal recycling (SR) mirror is placed between the beamsplit-
ter and the detection photodiode and constitutes, together with the power recycling
interferometer considered as a whole, a cavity that is designed to resonate at a fre-
quency νsr , on or near the frequency ν L of the laser light (Fig. 9.8).
This additional SR cavity will affect the phase of the carrier circulating in the
interferometer by a factor (Vajente 2014)
\[ \phi = \frac{2\pi}{\lambda}\, L^{(SRC)} + \frac{\pi}{4} \tag{9.50} \]
where
\[ L^{(SRC)} = l_{SR} + (l_x + l_y)/2 \]
is the length of the SR cavity: the distance $l_{SR}$ between the SR mirror and the
beamsplitter, plus the mean value of the Michelson arm lengths (see Fig. 9.8), which
is typically of the order of a few tens of metres.
The computation of the response of a dual (power + signal) recycled interferome-
ter is conceptually simple, but algebraically cumbersome: however, we can consider
the power-recycled interferometer as a single box and then treat it as a virtual mirror
with a frequency-dependent reflectivity in front of the SR.
Fig. 9.8 A diagram of a Michelson interferometer with both power recycling mirror (PRM) and
signal recycling mirror (SRM). The input mode cleaner is an additional optical cavity, working as
a filter, needed to improve the light spectral purity. Courtesy of Virgo collaboration
We have seen that the quantized nature of light implies an uncertainty in the measurement
of the relative position of the mirrors, which we have attributed to the shot
noise. However, it is also inevitable that the quantum fluctuations of light result in
fluctuations in the radiation pressure of the light beam and in the momentum transferred
to the mirrors.
Fig. 9.9 Dual recycling response for different φ values (credit Vajente (2014))
To estimate the effects of this second contribution related to the quantum nature
of radiation, we consider the force f exerted by an electromagnetic wave of power
$P_{eff}$ (∼ F · Pin in the case of a FP cavity) on a mirror of mass M:
\[ f = \frac{P_{eff}}{c}. \tag{9.51} \]
This is a static force that causes a constant shift in the equilibrium position of each
suspended mirror. But a fluctuation of the number of photons impinging on the mirror
implies a fluctuation in the force applied on the mirror surface:
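\[ \sigma_f = \frac{\sigma_P}{c} = \frac{\hbar\omega_L}{c}\,\frac{\sigma_N}{\tau} = \sqrt{\frac{\pi \hbar P_{eff}}{c \lambda \tau}} \tag{9.52} \]
The spectral density of this force is then
\[ S_f(\nu) = \frac{2\pi \hbar P_{eff}}{c\lambda}. \tag{9.53} \]
The oscillation amplitude of each suspended mirror is
\[ x(\nu) = \frac{f(\nu)}{M (2\pi\nu)^2} = \frac{1}{M\nu^2}\sqrt{\frac{\hbar P_{eff}}{8\pi^3 c \lambda}}. \tag{9.54} \]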
Since the fluctuations in the two interferometer arms are anticorrelated (a positive
fluctuation on one arm corresponds to a negative fluctuation on the other), the change
δL is twice x(ν). As a consequence, the radiation pressure noise referred to the input
port and expressed in terms of the spectral density of h is
\[ S_{rp}(\nu) = \frac{2x(\nu)}{L} = \frac{1}{M\nu^2 L}\sqrt{\frac{\hbar P_{eff}}{2\pi^3 c \lambda}}. \tag{9.55} \]
Note that, in order to reduce the effects of this noise, one should increase the mass
M of the mirrors and decrease the incident power Pin . This last request is opposite
to that associated with the reduction of the shot noise. There is therefore an optimum
value for the light power that balances the contribution of these two noise sources.
At present, the available light power on the interferometer mirror is still smaller
than the optimum value so that the dominant contribution remains that of shot noise.
However, the next generation of detectors, for which we foresee a considerable
increase in the power of the input laser and of the finesse of the FP cavities, will
likely approach that condition.
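The trade-off between the two noise sources can be illustrated numerically. The sketch below evaluates the shot noise amplitude, i.e. the square root of Eq. 9.46, and the radiation pressure term as written in Eq. 9.55, for a simple Michelson; it assumes P_eff = P_in and η = 1, and the function names and numerical values are hypothetical. Equating the two expressions gives the balancing power P = π c λ M ν², which depends on the observation frequency.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 299_792_458.0  # speed of light, m/s


def shot_amplitude_h(arm_length, wavelength, power, eta=1.0):
    """Square root of Eq. 9.46, in 1/sqrt(Hz)."""
    return math.sqrt(HBAR * C_LIGHT * wavelength / (2 * math.pi * eta * power)) / arm_length


def radiation_pressure_amplitude_h(nu, mass, arm_length, wavelength, power):
    """Radiation pressure term as written in Eq. 9.55, with P_eff = power."""
    return math.sqrt(HBAR * power / (2 * math.pi**3 * C_LIGHT * wavelength)) / (mass * nu**2 * arm_length)


def balancing_power(nu, mass, wavelength):
    """Power at which the two contributions are equal at frequency nu."""
    return math.pi * C_LIGHT * wavelength * mass * nu**2


if __name__ == "__main__":
    mass, arm, lam = 42.0, 3000.0, 1.064e-6  # illustrative values
    for nu in (10.0, 100.0):
        p_bal = balancing_power(nu, mass, lam)
        shot = shot_amplitude_h(arm, lam, p_bal)
        print(f"nu = {nu:6.1f} Hz  balancing power = {p_bal:.3e} W  noise = {shot:.3e} 1/sqrt(Hz)")
```

For powers below the balancing value the shot noise dominates at that frequency, consistently with the remark that present detectors are still shot-noise limited.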
As we have seen in the two previous paragraphs, two different types of noise are both
related to the quantum nature of electromagnetic radiation, which is the basis of
our system of monitoring the dynamic state of the mirrors. We therefore consider
these two sources of noise together as a single entity, referred to by the
name optical readout noise:
Fig. 9.10 The optical readout noise in a simple Michelson for two values of power stored in the
arms. The green line is the locus of points representing the standard quantum limit
Not surprisingly, the product of these two noises shows that the ultimate limit in
the strategy of a classic monitoring system is given by the Heisenberg uncertainty
principle
\[ \sigma_p \cdot \sigma_{\delta L} \geq \frac{\hbar}{2} \]
it, it is worth mentioning here some specific aspects of this noise in connection with
the sensitivity of interferometric GW detectors.
Thermal noise has two main contributions: normal modes of the mirror and vibra-
tion modes of the suspension fibres. Each mirror is a thick disk of fused silica (350 mm
in diameter and 200 mm thick, for a total mass of 42 kg in Advanced Virgo). It is sus-
pended by four fused silica fibres of circular cross section: they are thin (∼200 µm
in diameter) in the long middle section, and about twice as thick near the two ends,
where they are fastened to the mirror and to the previous suspension stage. The fibres
are 60 cm long and the resulting pendulum frequency is ν p ∼ 0.6 Hz. The geometry
is chosen so as to keep high the other fibre transverse mode frequencies (the first
overtone is 500 Hz) and to keep low the vertical stretching mode frequency (9 Hz for
the chosen mirror mass).
Each mode of the suspended mirror (both for oscillation and vibration) has an
associated fluctuation energy equal to k B T , where k B is the Boltzmann constant and
T is the equilibrium temperature of the mirror. Some of these vibration modes have
a displacement field extending over regions of the mirror surface hit by the light
beam: The thermal fluctuations of the mirror surface give rise to an ill-defined mirror
position, and therefore to a displacement noise. To quantitatively assess this effect,
one should also consider that the wavefront of the light mode resonating in the FP
cavity (the T E M00 mode) has a Gaussian intensity profile: the vibration of a region
of the mirror surface is more relevant if hit by a portion of the wavefront with higher
intensity. Therefore, the computation of the spectral density of the displacement
fluctuations due to the thermal noise must include the convolution of the spatial
distributions, on the mirror surface, of the mechanical and electromagnetic modes.
The thermal noise of the mirror pendulum resonances, localized on the suspension
wires, is also a source of noise relevant for the sensitivity to GW.
The computation of this contribution is based on the fluctuation-dissipation theorem,
stating that the unilateral spectral noise densities⁵ of the force and displacement
fluctuations of a mechanical system at the equilibrium temperature T are, respectively,
\[ S_f = 4 k_B T\, \Re[Z(\omega)] \qquad S_x = \frac{4 k_B T}{\omega^2}\, \Re\!\left[\frac{1}{Z(\omega)}\right] \tag{9.59} \]
where k_B is the Boltzmann constant and Z(ω) = F(ω)/iωX(ω) is the impedance
of the mechanical system defined via the Fourier transforms of the force and the
velocity.⁶
We then assume that each mode v(t) follows the typical equation of the damped
harmonic oscillator,
\[ M_i \frac{dv}{dt} + \frac{M_i}{\tau_i}\, v + M_i \omega_i^2 \int v(t)\, dt = f(t) \tag{9.60} \]
5 The bi-lateral spectrum is defined in the frequency domain (−∞ , + ∞). It is, obviously, half as
much as the unilateral one that extends only over positive frequencies, i.e. in the domain (0, + ∞).
6 Not to be confused with the transfer function (TF), that is the ratio between the Fourier transforms
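where $M_i$, $\omega_i$ and $M_i/\tau_i$ are the effective mass, angular frequency and coefficient of
dissipation⁷ related to the i-th vibration mode we are considering.
After Fourier transforming Eq. 9.60, we find the mechanical impedance:
\[ Z(\omega) = \frac{M_i}{\tau_i} + i M_i \left(\omega - \frac{\omega_i^2}{\omega}\right) \]
\[ S_f = \frac{4 k_B T M_i}{\tau_i} \tag{9.61} \]
\[ S_x(\omega) = \frac{S_f}{\omega^2 |Z(\omega)|^2} = \frac{4 k_B T}{M_i \tau_i}\, \frac{1}{(\omega^2 - \omega_i^2)^2 + (\omega/\tau_i)^2} \tag{9.62} \]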
Note the double role played by the coefficient τi : it is both the measure of dissi-
pation in the oscillation and the source of the thermal noise force. Hence the name
fluctuation-dissipation theorem.
Generally, the amount of dissipation is assessed via the quality factor of each
resonant mode Q i = ωi τi .
We can now apply to Eq. 9.62 the usual approximations for frequencies much
higher or much lower than the resonant frequency ω_i. Thus, if we consider a GW
signal at a frequency ω > ω_p, higher than the pendulum resonance frequency ω_p, we can
deduce a sensitivity limit in h due to the thermal noise of the pendulum mode of
suspension:
\[ S_h^{(therm.\,p.)} = \frac{1}{L}\,\frac{1}{\omega^2}\sqrt{\frac{4 k_B T \omega_p}{M Q_p}} \tag{9.63} \]
where the mass of this mode coincides with the mirror mass M and Q_p accounts for
several different dissipation mechanisms. This noise becomes less and less relevant
as the observation frequency ω is increased, typically above 20 Hz.
On the other hand, the modes of the mirror bulk oscillation are at frequencies
above 5 kHz, i.e. higher than the ω of interest and in this case (ω < ω_i) we have
\[ S_h^{(therm.\,m.)} = \frac{1}{L}\,\frac{1}{\omega_i^2}\sqrt{\frac{4 k_B T \omega_i}{M_i Q_i}} \tag{9.64} \]
These limits depend on the nature of the material as well as on the geometry and on
the modes of vibration considered.
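The two limits can be compared with the full spectrum of Eq. 9.62. The sketch below uses hypothetical function names and illustrative parameters (a 42 kg mirror at 300 K, a 3 km arm, Q = 10⁶) and takes τ_i = Q_i/ω_i, as in the definition of the quality factor given above.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def displacement_psd_viscous(omega, omega_i, tau_i, mass, temperature):
    """Displacement spectral density of Eq. 9.62 (viscous damping), in m^2/Hz."""
    resonance_term = (omega**2 - omega_i**2) ** 2 + (omega / tau_i) ** 2
    return 4 * K_B * temperature / (mass * tau_i) / resonance_term


def pendulum_limit_h(omega, omega_p, q_p, mass, temperature, arm_length):
    """High-frequency limit, Eq. 9.63."""
    return math.sqrt(4 * K_B * temperature * omega_p / (mass * q_p)) / (arm_length * omega**2)


def mirror_limit_h(omega_i, q_i, mass_i, temperature, arm_length):
    """Low-frequency limit, Eq. 9.64 (independent of omega)."""
    return math.sqrt(4 * K_B * temperature * omega_i / (mass_i * q_i)) / (arm_length * omega_i**2)


if __name__ == "__main__":
    omega = 2 * math.pi * 100.0
    omega_p, q_p = 2 * math.pi * 0.6, 1.0e6
    full = math.sqrt(displacement_psd_viscous(omega, omega_p, q_p / omega_p, 42.0, 300.0)) / 3000.0
    print(f"pendulum, full = {full:.3e}  limit = {pendulum_limit_h(omega, omega_p, q_p, 42.0, 300.0, 3000.0):.3e}")
    omega_i, q_i = 2 * math.pi * 1.0e4, 1.0e6
    full = math.sqrt(displacement_psd_viscous(omega, omega_i, q_i / omega_i, 42.0, 300.0)) / 3000.0
    print(f"mirror,   full = {full:.3e}  limit = {mirror_limit_h(omega_i, q_i, 42.0, 300.0, 3000.0):.3e}")
```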
7 $\tau_i$ as used here is the energy dissipation time constant. Elsewhere, the amplitude time constant
might be used. As the energy decay is proportional to $A^2(t)$, we have $\tau_{ampl} = 2\tau_{energy}$.
Assuming $Q_i \simeq 10^6$, the thermal noise contributions can be kept below the threshold
value of $S_h^{1/2} \sim 10^{-23}\ \mathrm{Hz}^{-1/2}$ for a mass M ∼ 100 kg and angular frequency
ω ∼ 2π · 160 Hz.
Measuring the contribution of thermal noise in a gravitational interferometer is
a delicate task. The envelope of the two main contributions of thermal noise, that
of the pendulum and that of the modes of the mirror, should contribute to defining
the spectral region of maximum sensitivity of the instrument. In this case, to reach
$S_h^{1/2} \simeq 10^{-23}\ \mathrm{Hz}^{-1/2}$, the Q values of all the modes we are considering should lie
in the range $10^6$ to $10^8$. This is quite a difficult goal to achieve at room temperature.
In particular, the dissipation at the connection points of the suspensions was found
to influence the overall dissipation of both the mirror and the suspensions. However,
with a careful selection of materials and careful design of the connection points,
a significant increase in the Q values has been achieved. In addition, these
values are also enhanced, for crystalline materials, by cooling the system to low
temperature. This is one of the reasons to support the use of cryogenic techniques
for the KAGRA observatory and for future-generation detectors.
We must note that the formulas reported above have been computed assuming in Eq. 9.60 a viscous force (proportional to velocity). Saulson (1990) pointed out that calculations of thermal noise based on velocity-damping models could be in error. In particular, the internal dissipation in materials obeys an extension of Hooke’s law in which losses are modelled by a complex spring constant,⁸ which includes a loss angle $\phi(\omega)$ that determines the lag of the response x of the spring when driven by a sinusoidal force:
By far the most common functional form for the loss angle is φ(ω) = constant over
a large band of frequencies and its typical values are φ << 1. These results of the
acoustic dissipation have been obtained mainly for bulk materials where the loss
mechanism is associated with dislocations. Under this assumption the equation of
any damped harmonic oscillator, resonating at ω0 , becomes
$$M\,\frac{d^2 x}{dt^2} + M\omega_0^2\,(1 + i\phi)\,x = f(t)$$
and the corresponding mechanical impedance is
$$Z(\omega) = M\phi\,\frac{\omega_0^2}{\omega} + iM\left(\omega - \frac{\omega_0^2}{\omega}\right)$$
so that the spectral densities of the force and displacement are
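$$S_f = 4 k_B T M \phi\,\frac{\omega_0^2}{\omega}\,; \qquad S_x(\omega) = \frac{4 k_B T\,\omega_0^2\,\phi}{M\omega\left[(\omega^2 - \omega_0^2)^2 + (\omega_0^2\phi)^2\right]} \qquad (9.66)$$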
This model yields a frequency dependence significantly different from the viscous
case. This is evident far from the resonance, in the two limits discussed above,
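$\omega_g \gg \omega_p$ and $\omega_g \ll \omega_i$:
$$S_h^{1/2\,(\mathrm{therm.\,p.})} = \frac{1}{L}\,\sqrt{\frac{4 k_B T\,\omega_p^2\,\phi_p}{M\,\omega^5}} \quad \text{for } \omega_g \gg \omega_p \qquad (9.67)$$
$$S_h^{1/2\,(\mathrm{therm.\,m.})} = \frac{1}{L}\,\sqrt{\frac{4 k_B T\,\phi_i}{M_i\,\omega_i^2}}\;\frac{1}{\sqrt{\omega}} \quad \text{for } \omega_g \ll \omega_i \qquad (9.68)$$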
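The two limits can be checked numerically against the full structural-damping spectrum of Eq. 9.66. The following is a minimal sketch: the arm length (3 km), the pendulum frequency (0.6 Hz) and the loss angles ($10^{-6}$) are illustrative assumptions, not values quoted in the text, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def sx_structural(omega, omega0, phi, mass, temperature):
    """Displacement power spectrum of Eq. 9.66 (constant loss angle phi)."""
    num = 4.0 * K_B * temperature * omega0**2 * phi
    den = mass * omega * ((omega**2 - omega0**2) ** 2 + (omega0**2 * phi) ** 2)
    return num / den


def sh_pendulum_limit(omega, omega_p, phi_p, mass, temperature, arm_length):
    """Strain amplitude of Eq. 9.67, valid for omega >> omega_p."""
    sx = 4.0 * K_B * temperature * omega_p**2 * phi_p / (mass * omega**5)
    return math.sqrt(sx) / arm_length


def sh_mirror_limit(omega, omega_i, phi_i, mass_i, temperature, arm_length):
    """Strain amplitude of Eq. 9.68, valid for omega << omega_i."""
    sx = 4.0 * K_B * temperature * phi_i / (mass_i * omega_i**2 * omega)
    return math.sqrt(sx) / arm_length


# Illustrative (assumed) parameters
T = 300.0                      # room temperature, K
L = 3000.0                     # arm length, m (assumption)
M = 100.0                      # mirror mass, kg
omega = 2.0 * math.pi * 160.0  # observation frequency, rad/s
omega_p = 2.0 * math.pi * 0.6  # pendulum mode (assumption)
omega_i = 2.0 * math.pi * 5e3  # lowest mirror bulk mode
phi = 1e-6                     # loss angle (assumption)

# Full expression (Eq. 9.66) converted to strain amplitude
exact_p = math.sqrt(sx_structural(omega, omega_p, phi, M, T)) / L
exact_i = math.sqrt(sx_structural(omega, omega_i, phi, M, T)) / L

limit_p = sh_pendulum_limit(omega, omega_p, phi, M, T, L)
limit_i = sh_mirror_limit(omega, omega_i, phi, M, T, L)

print(f"pendulum: exact {exact_p:.3e}, limit {limit_p:.3e}")
print(f"mirror:   exact {exact_i:.3e}, limit {limit_i:.3e}")
```

Far from both resonances the asymptotic expressions agree with the full spectrum to better than one per cent, which is why only the limits are normally quoted.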
In today’s advanced detectors, the dominant noise in the frequency region of highest
sensitivity is thermal noise in the mirrors. The main cause is dissipation in the thin
film of materials used to produce the reflective coating of the mirrors. Here two
mechanisms are related to thermal fluctuations:
– The thermo-elastic effect: the coupling between mechanical and thermal fluctu-
ations produces local deformations in the mirrors and therefore a displacement
noise
– The thermo-refractive effect: temperature fluctuations in both the bulk and the coating of the mirrors produce a change in the refractive index of the materials, and again a displacement noise.
In addition, mechanical loss in the bulk fused silica is responsible for the substrate Brownian noise term.
For this reason a wide research programme on low-loss coating materials is underway.
In a real detector the list of sources of noise, limiting the sensitivity to GW, is very
long: careful design and construction is required for all components.
Residual gas noise
The whole interferometer is hosted in a huge vacuum enclosure, in order to suppress
the noise effects associated with the presence of air along the beam trajectory. Indeed,
the presence of any residual gas implies:
The upper limit for the residual pressure is derived mainly by considering the last
noise mechanism listed above. The light beam propagates back and forth in a vacuum
enclosure, a tube a few kilometres long and ∼1 m in diameter. The density fluctuations of
the residual molecules in the tube determine fluctuations of the effective refractive
index of the medium crossed by the laser beam. This noise can be modelled by
calculating the change in the phase of the Fabry-Perot cavity field as a molecule
moves through the beam, and by integrating over the molecular velocity distribution.
The noise power spectrum, expressed in h units, due to the residual pressure, is (Accadia et al. 2012):
$$S_h^{\mathrm{vacuum}}(\omega) = \frac{8\pi\,\alpha^2}{L^2\,v_0}\int_0^L \frac{\rho(z)}{w(z)}\,\exp\left(-\frac{\omega\,w(z)}{v_0}\right)dz \qquad (9.69)$$
where $\rho(z)$ is the residual density of molecules as a function of the longitudinal coordinate z of the beam tube, $w(z)$ is the transverse dimension of the light beam at z, $\alpha$ is the optical polarizability of the molecules, and $v_0 = \sqrt{2 k_B T/m}$ is the thermal velocity of the molecules, with T the temperature and $k_B$ the Boltzmann constant. The goal is to keep
the pressure-related noise one order of magnitude below the shot noise: for Advanced Virgo this implies that the strain noise due to pressure fluctuations must be below $10^{-25}\ \mathrm{Hz}^{-1/2}$, corresponding to a residual gas pressure of about $10^{-7}$ Pa in the case of hydrogen molecules; for hydrocarbons, the limit is much more stringent, $10^{-11}$ Pa. In the case of residual gas dominated by higher polarizability gases, the limit pressure should be lowered by one order of magnitude. Reaching such low values of partial pressure is a real challenge: they must be achieved in a tube of 1 m diameter and of kilometric length. The volume of the vacuum tube determines the pump-out time, but it is the wall surface that determines the limit pressure $p_{\mathrm{lim}}$, which is given by the relation
$$p_{\mathrm{lim}} = \frac{S\,R}{\Sigma} \qquad (9.70)$$
where (we report here numerical values for Virgo) $S = 25000\ \mathrm{m}^2$ is the wall surface, $\Sigma$ is the pumping speed and $R \sim 10^{-10}$ Pa·l/(cm²·s) is the degassing rate. Typical values of degassing rate measured on Stainless Steel 304L samples are of the order of $10^{-10}$ Pa·l/(cm²·s).
The required $p_{\mathrm{lim}} = 10^{-7}$ Pa would then demand, with a small safety margin, the impressive pumping speed of $10^6$ l/s, i.e. a huge investment in pumping systems, to be distributed along the kilometre-long tube. Therefore, the only way to keep the pumping system at an economically affordable level is to reduce the wall outgassing rate. This can be obtained by firing each tube module, 12 m long, before the installation: firing involves heating the tube modules in air to 450 °C,⁹ a procedure aimed at increasing the mobility of the hydrogen atoms. These atoms are trapped inside the metal during the production process at a weight concentration of a few parts per million. When cooled to room temperature, the hydrogen outgassing rate is reduced by more than two orders of magnitude. After this procedure, the modules are welded to form the long tube and this huge volume is pumped out while keeping the tube temperature at 150 °C for several days (bake-out procedure). This procedure is particularly
9 The firing temperature is chosen to be well below the brittle temperature for 304L stainless steel.
234 9 Interferometric Detectors of Gravitational Waves
effective in reducing the residual water vapour adsorbed on the inner surface of the tube,
and is the standard procedure applied to all ultra-high vacuum systems.
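The pumping-speed estimate above is a simple ratio (Eq. 9.70) once the wall surface is converted from m² to cm². A minimal sketch, with hypothetical function names, using the Virgo numbers quoted in the text; the factor 100 applied to the outgassing rate represents the "two orders of magnitude" gained by firing and is an illustrative lower bound:

```python
import math


def required_pumping_speed(wall_area_m2, outgassing_rate, target_pressure):
    """Pumping speed (l/s) needed to reach target_pressure (Pa), from Eq. 9.70.

    outgassing_rate is in Pa*l/(cm^2*s); the wall area is given in m^2
    and converted to cm^2 (1 m^2 = 1e4 cm^2).
    """
    wall_area_cm2 = wall_area_m2 * 1.0e4
    return wall_area_cm2 * outgassing_rate / target_pressure


S = 25000.0    # wall surface, m^2 (value quoted for Virgo)
R = 1.0e-10    # degassing rate of untreated 304L steel, Pa*l/(cm^2*s)
P_LIM = 1.0e-7 # target limit pressure, Pa

speed_untreated = required_pumping_speed(S, R, P_LIM)
speed_fired = required_pumping_speed(S, R / 100.0, P_LIM)

print(f"untreated steel: {speed_untreated:.2e} l/s")
print(f"after firing:    {speed_fired:.2e} l/s")
```

The bare ratio gives a few times $10^5$ l/s for untreated steel; the $10^6$ l/s figure in the text includes a safety margin. Reducing the outgassing rate lowers the requirement in direct proportion.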
Moreover, the following technical aspects should be highlighted:
• The mirrors installed in the vacuum chambers are sensitive to any kind of con-
tamination: pumps must be oil-free and must not generate residual particles.
• The pumping stations should produce low acoustic, seismic and electromagnetic
noise.
In addition, the residual gas in the test mass vacuum chambers will contribute to the
damping of the test mass suspension, potentially increasing the suspension thermal
noise. This damping effect is increased by the relatively narrow gap between the
test mass and its suspended reaction mass, an effect called proximity-enhanced gas
damping. Gas damping noise is most significant in the 10–40 Hz band, and then it
falls off with frequency. An approximate expression of the noise spectral density in
the relevant region is
$$S_h^{(\mathrm{vac})} \simeq 7\times10^{-25}\left(\frac{P}{10^{-7}\ \mathrm{Pa}}\right)^{1/2} \qquad (9.71)$$
Seismic noise In the low-frequency range, i.e. below 10 Hz, the sensitivity is limited
by seismic noise: ground motion drives the structure holding the apparatus, thus
coupling a displacement noise to the mirrors. The strategy to fight this noise input uses the property of harmonic oscillators to attenuate all disturbances at frequencies above their resonance. The solution adopted is to suspend each mirror from a chain of several stages in series, each composed of a pendulum, and connected by vertical springs. This technique, called super-attenuator (Ballardin 2001) and proposed by A. Giazotto, extends the useful bandwidth of the detector down to a few Hz.
The super-attenuator is a rather complex device: the loading masses of each pen-
dulum are equipped with vertical springs at the end of which the pendulum wires are
fastened (Fig. 9.11).
In this way each pendulum attenuates the horizontal vibrations by a factor $(\omega_h/\omega_g)^2$, where $\omega_h$ is the resonance frequency of the pendulum and $\omega_g$ is the frequency of interest. The springs set at the connecting point of the pendulum wire determine the attenuation of vertical vibrations by a similar factor $(\omega_v/\omega_g)^2$, where $\omega_v$ is the resonant frequency of the springs.
With a proper design of the pendulums and of the vertical springs, a total attenuation of the suspension system of the order of $10^{-14}$ at 10 Hz has been achieved, both for horizontal and vertical vibrations. Each mirror in the Virgo interferometer is suspended by one of these systems.
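Because the stages act in series, their attenuation factors multiply. The sketch below illustrates how quickly a cascade of $(\omega_h/\omega_g)^2$ factors builds up; the number of stages (six) and the common resonance frequency (0.7 Hz) are purely hypothetical choices for illustration, not the actual super-attenuator parameters, and the simple power law only holds well above every resonance.

```python
import math


def stage_attenuation(f_resonance, f_signal):
    """Attenuation of a single ideal stage well above its resonance."""
    return (f_resonance / f_signal) ** 2


def chain_attenuation(resonances, f_signal):
    """Total attenuation of stages in series: the product of the factors."""
    total = 1.0
    for f_res in resonances:
        total *= stage_attenuation(f_res, f_signal)
    return total


# Hypothetical chain: six identical stages resonating at 0.7 Hz
stages = [0.7] * 6
single = stage_attenuation(1.0, 10.0)
total = chain_attenuation(stages, 10.0)

print(f"one 1 Hz stage at 10 Hz: {single:.1e}")
print(f"six-stage chain at 10 Hz: {total:.1e}")
```

Each additional stage with sub-hertz resonance gains more than two orders of magnitude at 10 Hz, which is how a passive chain reaches attenuations of the order quoted above.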
Laser fluctuations A complex optical layout and several nested feedback loops are
used to stabilize the intensity and frequency of the laser and reduce the beam jitter. An
important element is a suspended, triangular Fabry-Perot cavity, which cleans up the
spatial profile of the laser beam, suppresses input beam jitter, cleans polarization and
helps stabilize the laser frequency before sending the beam into the interferometer.
Fig. 9.11 On the right is a picture, taken from below, of the Virgo super-attenuator. The pendulum
cables and the Maraging steel blades for the attenuation of the horizontal and vertical seismic motion
are indicated. On the left, a measurement of the spectrum of the seismic noise, measured on the
ground and on the mass suspended by the super-attenuator. Courtesy of the Virgo Collaboration
(Braccini 2010)
Gravity gradient noise is one of the limiting noise sources of the Advanced Detec-
tors design in the frequency range 10–20 Hz. Strategies, based on measurements and
dedicated algorithms, to subtract this particular noise from the data are currently
under study and test in Virgo.
Scattered light noise A fraction of the laser light leaves the volume occupied by
the main beam through scattering or reflections from the various optics. This light
will hit and scatter from surfaces that are typically not as well mechanically isolated
as the suspended optics, picking up large phase fluctuations relative to the main
interferometer light. Thus, even very small levels of scattered light can be a relevant
noise source if it recombines with the main beam. To limit these noise contributions,
originated by the acoustic and seismic vibration of the vacuum pipe anchored to
the ground, a photon killing system has been developed. The scattering process and
the related noise were evaluated via ray tracing and Monte Carlo simulations, to
design absorbing elements (baffles) to be distributed along the vacuum pipes. The
best known absorbing material is coated glass for welding protection: this solution,
however, cannot be adopted because of its fragility. The adopted solution consists of
stainless steel diaphragms with a conical profile that reduce the useful diameter from
1.2 to 0.9 m. They have an absorption coefficient close to 50 % and the remaining
light is reflected on a wide field.
Fig. 9.12 Virgo sensitivity curve achieved in the year 2011. The quantity plotted versus frequency
is amplitude spectral sensitivity H (ω/2π). Courtesy of the Virgo Collaboration
Fig. 9.13 Virgo sensitivity curve achieved in 2019 during the O3 observation run. The lower curve
is the projected sensitivity for an upcoming run when the signal recycling will be in place. Courtesy
of the Virgo Collaboration
vice versa. We refer to Hild (2014) for an introduction to squeezing, and to Danilishin
and Khalili (2012) for an in-depth analysis of quantum optics of the GW interferom-
eters.
The alternation of periods dedicated to hardware upgrading with periods of Universe observations will continue for the foreseeable future, pushing these instruments
to their ultimate limits, both in terms of durability and performance, to the point
where no further significant sensitivity improvements will be possible. Thus, a new
generation of detectors will be needed: a third generation (3G), counting Virgo-LIGO
as the first and aLIGO-Advanced Virgo as the second.
The design of such complex detectors, planned for the mid-2030s, has already
begun: the Einstein Telescope (ET) is the European 3rd generation effort, while
the Cosmic Explorer (CE) project is in preparation in the USA. Both will rely on
new research infrastructures designed to observe the entire Universe with the GW
messenger. ET and CE are conceived following the same basic design of LIGO and
Virgo: a Michelson interferometer with Fabry-Perot cavities in the arms plus power
and signal recycling. Both new instruments will have longer arms: the present plan
calls for 40 km on surface for CE and 10 km underground for ET.
Underground operation will allow the frequency band of the observatory to be extended down to a few Hz, thanks to the reduction of seismic noise and of the gravity gradient noise induced by seismic waves.
There is a dichotomy in the optimization of a high-sensitivity interferometer:
high light power is needed to reduce the shot noise, dominant at high frequencies. At
9.7 Conclusive Notes 239
the same time, due to the inevitable optical losses, high power increases the mirror
temperature and the thermal noise, relevant at low frequencies. In ET this issue is
solved with a xylophone, i.e. by building a pair of complementary interferometers:
– ET-LF, more sensitive at low frequencies, thanks to the use of cryogenic mirrors
to reduce the thermal noise
– ET-HF, operated at room temperature and at higher light power stored in the
optical cavities for optimum performance at higher frequencies.
References
Accadia, T., et al.: Virgo: a laser interferometer to detect gravitational waves. JInst 7, P03012 (2012)
Ballardin: Measurement of the VIRGO superattenuator performance for seismic noise suppression. Rev. Sci. Instrum. 72, 3643 (2001)
Braccini, S., et al.: Measurements of superattenuator seismic isolation by Virgo interferometer.
Astropart. Phys. 33, 182–189 (2010)
Danilishin, S.L., Khalili, F.Y.: Quantum measurement theory in gravitational-wave detectors. Liv.
Rev. Relat. 15, 5 (2012)
Drever, R.W.P.: Interferometric detectors for gravitational radiation. In: Deruelle, N., Piran, T. (eds.)
Gravitational Radiation. North Holland (1983)
Forward, R.L.: Wideband laser-interferometer gravitational-radiation experiment. Phys. Rev. D 17, 379 (1978)
Herriott, D., Kogelnik, H., Kompfner, R.: Off-axis paths in spherical mirror interferometers. Appl.
Opt. 3, 523 (1964)
Hild, S.: A basic introduction to quantum noise and quantum-non-demolition techniques. In: Bassan,
M. (ed.) Advanced Interferometers and the Search for Gravitational Waves. Springer Int. Publish.
(2014)
Maggiore, M.: Gravitational Waves, vol.1: Theory and Experiments. Oxford University Press,
Oxford (2018)
Meers, B.: Recycling in laser-interferometric gravitational-wave detectors. Phys. Rev. D 38, 2317
(1988)
Meers, B.J.: The frequency response of interferometric gravitational wave detectors. Phys. Lett. A
142, 465 (1989)
Moss, G.E., Miller, L.R., Forward, R.L.: Photon-noise-limited laser transducer for gravitational
antenna. Appl. Opt. 10, 2495 (1971)
Pegoraro, F., Picasso, Radicati, G.: On the operation of a tunable electromagnetic detector for
gravitational waves. J. Phys. A 11, 1949 (1978)
Pérot, A., Fabry, C.: On a new form of interferometer. Ap. J. 9, 87 (1899)
Pirani, F.A.E.: On the physical significance of the Riemann tensor. Acta Physica Polonica 15, 389
(1956)
Saulson, P.: Thermal noise in mechanical experiments. Phys. Rev. D 42, 2437 (1990)
Strain, A., Meers, B.J.: Experimental demonstration of dual recycling for interferometric
gravitational-wave detectors. Phys. Rev. Lett. 66, 1391 (1990)
Vajente, G.: Interferometer Configuration. In “Advanced Interferometers and the Search for Gravi-
tational Waves”, Bassan M. ed. Springer Int. Publish (2014)
Vinet, J.-Y.: The VIRGO Physics Book, OPTICS and related TOPICS (revision 2020)
Vinet, J.Y., Meers, B., Man, C.N., Brillet, A.: Optimization of long-baseline optical interferometers for gravitational-wave detection. Phys. Rev. D 38, 433 (1988)
Weiss, R.: Electromagnetically coupled Broadband Gravitational Wave Antenna. Quarterly Progress
Report, MIT Research Laboratory of Electronics 105, 54 (1972). Reprinted as LIGO-P720002-
00-R
Further Reading
Saulson, P.R.: Fundamentals of Interferometric Gravitational Wave Detectors. World Scientific,
Singapore (1994)
Bond, C., Brown, D., Freise, A., Strain, K.A.: Interferometer techniques for gravitational-wave
detection. Living Reviews in Relativity 19, 3 (2016)
Danilishin, S.L., Khalili, F.Y., Miao, H.: Advanced quantum techniques for future gravitational-
wave detectors. Living Reviews in Relativity 22, 2 (2019)
Data Analysis
10
10.1 Introduction
Several physical processes targeted by the present research studies suffer from large
background contamination that needs to be filtered out. Statistical analysis of the
data plays a crucial role in boosting the signal-to-noise ratio (SNR) and improving the sensitivity. Scientific methods for data processing and analysis in the area of precision experiments have progressed impressively during the last decades, continuously
developing dedicated data analysis strategies: they range from very simple and robust
methods to more sophisticated and targeted procedures. The experimental field that
most needs these tools, and the one where the most important developments have
arisen, is that of the search for GW with ground-based interferometers: we shall
specialize our overview to these techniques, bearing in mind that many are suitable
for application anywhere a small signal needs to be detected in the presence of large
noise. Two excellent references on these topics are the classical books by Papoulis
(1965) and Zubakov and Wainstein (1962). For applications to the data analysis of
GW detectors, see the book by Schutz (1989).
In this chapter, we summarize basic data analysis tools,¹ starting with a few fundamental concepts for noise characterization. Then, we will review a few analysis strategies for different classes of signals, to offer a flavour of the complexity of this important topic of experimental gravitation.
1 The reader is expected to be familiar with the basic concepts and tools of statistics. To fill possible
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_10
Observing the different realizations at the instant $t_0$, we are studying the statistics of the random variable $X(t_0)$: these realizations are the values that the process assumes at the instant $t_0$ in the $k = 1, 2, \dots$ replicas of our experiment.
To characterize X we make use of the probability density function: focusing on the realizations x(t) taken at the generic instant t, we derive the estimate of the first-order probability density function $f_t^{(1)}$ (see Fig. 10.1).
The mean, i.e. the first-order moment of the distribution, is
$$\mu_x(t) \equiv E[X(t)] = \int_{-\infty}^{+\infty} x\,f_t^{(1)}(x)\,dx \qquad (10.1)$$
When we consider two time instants, $t'$ and $t''$, we use the joint probability density function $f^{(2)}_{t',t''}$ to define the second-order moment of the distribution, which allows us to evaluate the interdependence of the random variable realizations at different times:
2 Formally, a random variable is a function that can take on either a finite number of values, each
with an associated probability, or an infinite number of values, whose probabilities are summarized
by a density function.
10.3 Stationary Processes 243
Fig. 10.1 The figure shows four realizations of the stochastic process xk (t). From many of these
realizations, we extract their values at the time parameter t0 . The ensemble of these values constitutes
the random variable X (t0 ). The histogram on the right is the estimate of the probability density
function of the process for the variable X (t0 )
In general, the probability densities of all orders change as the points in time are
changed.
We define a fundamental property of a class of processes, the stationarity: an
arbitrary time translation of the entire realizations does not change its statistics.
We distinguish processes that are stationary in a strict (narrow, strong) sense or in a wide (weak) sense.
A process is stationary in the strict sense if its statistics do not change under any
time shift, i.e. when:
For a process to be stationary in the wide or weak sense, it is sufficient to meet the
first two requirements.
averages. With reference to Fig. 10.1, this is like saying that the average value of
x(t0 ) can be estimated by the time average of any realization xk . This allows us
to calculate statistical averages from a single and generic realization of the process
through time averages.
We define the auto-correlation function Rx x ,
$$R_{xx}(t, t+\tau) \equiv \lim_{T\to\infty}\frac{1}{2T}\int_{t-T}^{t+T} x(t')\,x(t'+\tau)\,dt' \qquad (10.4)$$
which gives information on the memory of the stochastic process, i.e. it allows us to infer how much the observation at time t is correlated with that at t + τ.
In the case of ergodicity, we have
$$\mu_x = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} x(t)\,dt \qquad (10.5)$$
If the process is ergodic in its statistical auto-correlation, all the realizations of the
process have the same auto-correlation function,
$$R_{xx}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} x(t)\,x(t+\tau)\,dt = E[X(t)\,X(t+\tau)] \qquad (10.6)$$
this is just Eq. 10.3 rewritten as a time average. Moreover, in the case μx = 0,
showing that the variance is the sum of random contributions at all frequencies.
• When the ergodic process i(t) is the input of a linear system with transfer function
H (ω) (see Appendix B), the output is also an ergodic process o(t) with auto-
correlation function Roo (τ ) and spectral density:
$$R_{xy}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} x(t)\,y(t+\tau)\,dt \qquad (10.11)$$
$$S_{xy}(\omega) = \int_{-\infty}^{+\infty} R_{xy}(\tau)\,\exp(-j\omega\tau)\,d\tau \qquad (10.12)$$
In previous chapters, we have used the simplified notation Sx (ω) instead of Sx x (ω).
This is often adopted when no cross-spectrum or cross-correlation is involved so that
the double subscript is redundant.
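For an ergodic process, the time averages of Eqs. 10.5 and 10.6 can be estimated from a single sampled realization. The following is a minimal sketch with hypothetical function names; it uses simulated white Gaussian noise of unit variance as the realization, for which the mean and the auto-correlation at non-zero lag should be close to zero and $R_{xx}(0)$ close to the variance.

```python
import random


def time_average(samples):
    """Discrete analogue of Eq. 10.5: mean over one realization."""
    return sum(samples) / len(samples)


def autocorrelation(samples, lag):
    """Discrete analogue of Eq. 10.6: time-averaged x[k] * x[k + lag]."""
    n = len(samples) - lag
    return sum(samples[k] * samples[k + lag] for k in range(n)) / n


# One realization of zero-mean, unit-variance white Gaussian noise
rng = random.Random(1)
realization = [rng.gauss(0.0, 1.0) for _ in range(20000)]

mean_est = time_average(realization)
r0 = autocorrelation(realization, 0)  # estimates the variance when the mean is zero
r5 = autocorrelation(realization, 5)  # white noise has no memory at non-zero lag

print(f"mean ~ {mean_est:.3f}, R(0) ~ {r0:.3f}, R(5) ~ {r5:.3f}")
```

The statistical error of these estimates decreases as the inverse square root of the number of samples, which is the practical content of the $T \to \infty$ limit.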
The notion of filtering raw data to extract the signal embedded in the noise originated in the context of communication theory: N. Wiener in the 1940s broadened the concept, interpreting filter theory on a statistical basis. Filter theory is still evolving today, targeting new methods for the design of optimum filters. In the
following, we will focus on the optimum linear filter,3 although robust, non-linear
methods, like those based on machine learning (Ross 2021), are now being developed
and implemented.
The matched filter is the best linear approach to extract a signal of known shape
when it is embedded in a stationary Gaussian noise.
We shall apply the following considerations to the gravitational signal h(t) and
the noise n(t) from an interferometer, as discussed in Chap. 9, but the analysis,
originally derived for e.m. antennas, is very general and is applied to a large class of
experiments. Consider a data stream, a function i(t) containing a linear superposition
of both the signal h(t) and the noise n(t):
3 The filter operator F , relates an input i(t) to the output o(t) by o = F {i}. F is linear when the
two properties of superposition and scaling are verified: F {ai 1 + bi 2 } = a F {i 1 } + bF {i 2 }.
We feed this function i(t) to a linear filter so that the output o(t) is the convolution
of i(t) with the filter function k(t):
$$o(t) = \int_{-\infty}^{+\infty} i(t-\tau)\,k(\tau)\,d\tau \qquad (10.13)$$
If the input signal is a very short pulse i(t) → Aδ(t), the output tends to coincide
with the filter function: hence the name “filter impulse response” for k(t).
Since we are dealing with a linear combination of noise and signal, we shall have4
$$o(t) = \int_{-\infty}^{+\infty} [h(t-\tau) + n(t-\tau)]\,k(\tau)\,d\tau = o_h(t) + o_n(t) \qquad (10.14)$$
Our goal is then to find a filter that will maximize the output SNR at a given time
t0 . Intuitively, we look for a linear time-invariant filter whose output will be much
larger if the signal h(t) is present than when it is absent. To do this, the filter should
make the instantaneous power in oh (t0 ), as large as possible compared to the average
power on (t): this is equivalent to maximizing the standard definition of the SNR.
A well-known property of Fourier transform is the mapping of convolution in the
time domain to a simple product in the frequency domain. Therefore, the analysis
is simpler in the latter, and we introduce the Fourier transform of the functions i(t),
o(t), k(t), denoted by the corresponding capital letters:
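$$I(\omega) = \frac{1}{2\pi}\int i(t)\,e^{-i\omega t}\,dt\,; \qquad K(\omega) = \frac{1}{2\pi}\int k(t)\,e^{-i\omega t}\,dt\,;$$
$$O(\omega) = \frac{1}{2\pi}\int o(t)\,e^{-i\omega t}\,dt \qquad (10.15)$$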
4 To avoid ambiguities: in the following, the word signal is reserved for the component h(t), not to
the entire input i(t); same for the signal output oh (t).
10.5 The Matched Filter 247
Since n(t) is a stochastic process, the noise contribution at the output of the filter is computed by applying the Wiener-Khinchin theorem: the power spectra of the noise process at the input, $S_n(\omega)$, and at the output of the filter, $S_{on}$, are:
The output noise spectrum $S_{on}$ is easily evaluated when the signal is rare, thus usually absent in the output (as in the detection of GW): in this case $S_{on}$ simply coincides with the spectrum of the filter output $S_o$.
Recalling the property of the power spectral density of a stationary process with
null average, the expected value of the noise variance is
$$E[o_n^2] = \frac{1}{2\pi}\int_{-\infty}^{+\infty} S_{on}(\omega)\,d\omega \qquad (10.18)$$
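To find the optimum K(ω), we proceed by maximizing the SNR, applying the Cauchy-Schwarz inequality, written here for two generic functions A(ω) and B(ω):
$$\left|\int_{-\infty}^{+\infty} A(\omega)\,B(\omega)\,d\omega\right|^2 \le \int_{-\infty}^{+\infty} |A(\omega)|^2\,d\omega \int_{-\infty}^{+\infty} |B(\omega)|^2\,d\omega \qquad (10.20)$$
Setting
$$A(\omega) = K(\omega)\,\sqrt{S_n(\omega)}\,e^{i\omega t_0}\,, \qquad B(\omega) = \frac{H(\omega)}{\sqrt{S_n(\omega)}}$$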
we can rewrite the numerator of Eq. 10.19 as
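$$\left|\int_{-\infty}^{+\infty} K(\omega)\,H(\omega)\,e^{i\omega t_0}\,d\omega\right|^2 \le \int_{-\infty}^{+\infty} |K(\omega)|^2\,S_n(\omega)\,d\omega \int_{-\infty}^{+\infty} \frac{|H(\omega)|^2}{S_n(\omega)}\,d\omega \qquad (10.21)$$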
Substituting this relation into Eq. 10.19, we obtain an upper bound
$$SNR_{t_0} \le \frac{1}{2\pi}\int_{-\infty}^{+\infty} \frac{|H(\omega)|^2}{S_n(\omega)}\,d\omega \qquad (10.22)$$
Equation 10.21 shows that the upper bound is achieved if we choose as filter the
function
$$K(\omega) = C\,e^{-i\omega t_0}\,\frac{H^*(\omega)}{S_n(\omega)} \qquad (10.23)$$
we notice that the first block in square brackets divides the input by the spectrum of
the noise. At the output of this first block the noise no longer has any spectral feature,
but a constant spectrum: Sn (ω) = constant. In jargon, this operation is called data
whitening.
When we deal with Gaussian noise and white power spectrum, the frequency domain
filter for which we have the maximum SNR is simply proportional to the Fourier
transform H (ω) of the signal that we want to detect. The filtering process will pass
through only those frequencies that contribute to the signal:
h(t) is a real function so H ∗ (ω) = H (−ω) and the signal at the filter output is
$$o_h(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} H(-\omega)\,H(\omega)\,e^{i\omega(t-t_0)}\,d\omega \qquad (10.26)$$
Using the Fourier transform of $h(t)$,
$$H(\omega) = \int_{-\infty}^{+\infty} h(t')\,e^{-i\omega t'}\,dt' \qquad (10.27)$$
5 The matched filter is, indeed, matched to a known, or presumed, form of the input signal h(t).
10.6 The Case of Signal in White Noise 249
Fig. 10.2 On the top left, the signal embedded in the detector noise in the time domain. On the
right, the waveform h(t) used in the matched filter: a chirp, emitted by a binary neutron star system
in the final coalescence phase. On the bottom left, the filter output versus time. The signal in the
data produces a sharp peak in the matched-filter output at the termination time of the chirp signal
that we rewrite as
$$o_h(\tau) = \int_{-\infty}^{+\infty} h(t')\,h(t'-\tau)\,dt' \qquad (10.29)$$
where τ = (t − t0 ), shows the delay introduced by the filter. We read in this expres-
sion the definition of the auto-correlation function of the signal.
In the presence of the whole input i(t) = h(t) + n(t), the matched filter then acts
as
$$o(t) = \int_{-\infty}^{+\infty} i(\tau)\,h(\tau - t)\,d\tau \qquad (10.30)$$
This result tells us that the matched filter simply works as a correlator: it acts on the input i(t) by cross-correlating it with the known signal.
An example of the application of the matched filter to GW data is shown in Fig. 10.2. We convolve the strain data i(t) with the waveform h(t) shown in the figure: the result is a sharp, narrow peak occurring at a time that almost exactly coincides
with the termination time of the chirp signal (merging time). This is conventionally
taken as the chirp arrival time.
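The correlator of Eq. 10.30 is easy to try on synthetic data. The sketch below is a toy illustration under stated assumptions: white Gaussian noise, a made-up chirp-like template (not a physical waveform), and hypothetical function names. In this discrete form the output index marks the position where the template starts in the data; adding the template length gives the termination time of the chirp.

```python
import math
import random


def chirp_template(n_samples, f_start=0.05, f_end=0.25):
    """Toy chirp: frequency (cycles/sample) and amplitude both grow in time."""
    phase = 0.0
    template = []
    for k in range(n_samples):
        frac = k / (n_samples - 1)
        freq = f_start + (f_end - f_start) * frac
        phase += 2.0 * math.pi * freq
        template.append((0.5 + 0.5 * frac) * math.cos(phase))
    return template


def matched_filter(data, template):
    """Discrete form of Eq. 10.30: o[k] = sum_j data[k + j] * template[j]."""
    n = len(template)
    return [sum(data[k + j] * template[j] for j in range(n))
            for k in range(len(data) - n + 1)]


rng = random.Random(0)
template = chirp_template(64)
offset = 200  # sample at which the signal is injected

# White Gaussian noise plus the injected signal
data = [rng.gauss(0.0, 0.3) for _ in range(512)]
for j, value in enumerate(template):
    data[offset + j] += value

output = matched_filter(data, template)
peak_index = max(range(len(output)), key=lambda k: output[k])

print(f"injected at sample {offset}, filter output peaks at {peak_index}")
```

For coloured noise the data and the template would first be whitened, i.e. divided by the noise spectrum in the frequency domain, before the same correlation is applied.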
The first direct detections of GW signals concerned chirp signals, oscillating func-
tions with frequency and amplitude that grow over time, as it was shown in Fig. 10.2.
Thus, let us discuss the matched filtering technique in this case. As we discussed
in Chap. 9, the use of the resonant cavities in the interferometer modifies the shape
of the sensitivity curve in the high-frequency range of the detector bandwidth. The
shot noise contribution rises above a frequency νkn , that was 400 Hz for the LIGO
detectors. At lower frequencies, below νs ∼ 20 Hz, the sensitivity is limited by the
seismic noise so that a crude, but effective approximation for the spectral density of
the overall noise is
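$$S_n(\nu) = \begin{cases} \dfrac{1}{2}\,\sigma_n(\nu_{kn})\left[1 + \left(\dfrac{\nu}{\nu_{kn}}\right)^2\right] & \nu > \nu_s \\[2ex] \infty & \nu < \nu_s \end{cases} \qquad (10.31)$$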
We then use, as a model for the signal, the waveform generated by two point masses in circular orbit as described in Sect. 7.5.2. Here, we follow a standard assumption of the community of GW hunters and deal with signals emitted by two identical neutron stars of $1.4\,M_\odot$ (see the example in Schutz 1991).
We use a simplified expression of the amplitude of the chirp wave (Eq. 7.52)
impinging on the detector along the optimal direction and with the most favourable
polarization. Note that in this chapter we are focused on GW signal analysis so that
ν identifies the signal frequency: for rotating sources, this is twice the frequency of
rotation ν of Chap. 7.
$$h(t) = A_h[\nu(t)]\,\cos\left(2\pi\int_{t_a}^{t}\nu(\tilde t)\,d\tilde t + \Phi\right) \qquad (10.32)$$
where $t_a$ is the arrival time of the signal, when its instantaneous frequency is equal to $\nu_a$, and $\Phi$ is the phase at the same time. The amplitude increases slowly over time⁶
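$$\frac{d\nu}{dt} = 13\left(\frac{\mathcal{M}}{M_\odot}\right)^{5/3}\left(\frac{\nu}{100\ \mathrm{Hz}}\right)^{11/3}\ \mathrm{Hz\,s^{-1}} \qquad (10.34)$$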
6 In this equation and in the rest of this long section, where numerical reference values appear, the frequencies are normalized in units of $10^2$ Hz, the distances r in units of $10^2$ Mpc and the time in units of 1 s.
10.7 The Detection of a Chirp Signal 251
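From this differential equation, we deduce how the frequency changes with time:
$$\frac{\nu(t)}{100\ \mathrm{Hz}} = \left[\left(\frac{\nu_a}{100\ \mathrm{Hz}}\right)^{-8/3} - 0.33\left(\frac{\mathcal{M}}{M_\odot}\right)^{5/3}\left(\frac{t - t_a}{1\ \mathrm{s}}\right)\right]^{-3/8} \qquad (10.35)$$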
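One additional integration yields the expression of the phase of Eq. 10.32:
$$\Phi(t) = 2\pi\int_{t_a}^{t}\nu(\tilde t)\,d\tilde t = \qquad (10.36)$$
$$= 3000\left(\frac{\mathcal{M}}{M_\odot}\right)^{-5/3}\left\{\left(\frac{\nu_a}{100\ \mathrm{Hz}}\right)^{-5/3} - \left[\left(\frac{\nu_a}{100\ \mathrm{Hz}}\right)^{-8/3} - 0.33\left(\frac{\mathcal{M}}{M_\odot}\right)^{5/3}\left(\frac{t - t_a}{1\ \mathrm{s}}\right)\right]^{5/8}\right\}$$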
Using Eq. 10.35, we can compute the duration of the chirp signal, starting from the moment the frequency is equal to $\nu_a$ and stopping at the end of the chirp: in principle, when ν → ∞ but, in practice, when the last stable orbit is reached.
$$T_{\mathrm{coal}}(\nu_a) = 3\left(\frac{\mathcal{M}}{M_\odot}\right)^{-5/3}\left(\frac{\nu_a}{100\ \mathrm{Hz}}\right)^{-8/3}\ \mathrm{s} \qquad (10.37)$$
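Equations 10.35 and 10.37 translate directly into a few lines of code. This is a minimal sketch with hypothetical function names; the chirp mass is computed with the standard definition $\mathcal{M} = (m_1 m_2)^{3/5}/(m_1 + m_2)^{1/5}$, which is assumed here rather than taken from this section, and all masses are in solar masses.

```python
import math


def chirp_mass(m1, m2):
    """Standard chirp mass definition (solar masses in, solar masses out)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def chirp_frequency(t, nu_a, m_chirp, t_a=0.0):
    """Instantaneous frequency in Hz at time t (s), from Eq. 10.35."""
    bracket = (nu_a / 100.0) ** (-8.0 / 3.0) \
        - 0.33 * m_chirp ** (5.0 / 3.0) * (t - t_a)
    if bracket <= 0.0:
        raise ValueError("t is beyond the end of the chirp")
    return 100.0 * bracket ** (-3.0 / 8.0)


def coalescence_time(nu_a, m_chirp):
    """Time to coalescence in s from frequency nu_a (Hz), Eq. 10.37."""
    return 3.0 * m_chirp ** (-5.0 / 3.0) * (nu_a / 100.0) ** (-8.0 / 3.0)


# Two identical 1.4 solar-mass neutron stars entering the band at 20 Hz
m_c = chirp_mass(1.4, 1.4)
t_coal = coalescence_time(20.0, m_c)
nu_half = chirp_frequency(0.5 * t_coal, 20.0, m_c)

print(f"chirp mass = {m_c:.3f} Msun, time in band = {t_coal:.0f} s")
print(f"frequency halfway through = {nu_half:.1f} Hz")
```

Since the coefficients 0.33 and 3 are rounded, the frequency of Eq. 10.35 diverges at a time about one per cent later than the $T_{\mathrm{coal}}$ of Eq. 10.37. The steep $\nu_a^{-8/3}$ dependence also shows why lowering the low-frequency cut-off of the detector lengthens the observable signal so much.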
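$$|H(\nu)| \simeq 3.7\cdot10^{-24}\left(\frac{\mathcal{M}}{M_\odot}\right)^{5/6}\left(\frac{\nu}{100\ \mathrm{Hz}}\right)^{-7/6}\left(\frac{100\ \mathrm{Mpc}}{r}\right)\ \mathrm{Hz}^{-1/2} \qquad (10.38)$$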
With this, we can evaluate the SNR:
$$SNR = \int_{-\infty}^{\infty}\frac{|H(\nu)|^2}{S_n(\nu)}\,d\nu \qquad (10.39)$$
An interesting event for the detection of the gravitational signal is defined when the
filtered variable exceeds an assigned threshold value. This event will be characterized
by the arrival time tarr and the amplitude of the signal. The accuracy of the arrival
time, which depends in part on the efficiency of the filter, is an essential feature
in order to determine delays of the signal seen by a network of gravitational wave
detectors and extract information on the direction of arrival.
The time resolution, i.e. the standard deviation of $t_{arr}$, can be approximately evaluated, when the signal is narrow band around the value $\nu_o$, as in Schutz (1991):
$$\sigma_{t_{arr}} \simeq \frac{1}{2\pi\,\nu_o \cdot SNR} \qquad (10.40)$$
The efficiency of the filter is excellent in the statistical sense mentioned above, if
the template assumed in the filter coincides with the real signal. The template depends
on several parameters: for example in the simplest case in addition to the position in
the sky of the emitting system, the polarization angle, the total mass $m_1 + m_2$ and
the chirp mass M must be taken into account, determining both the amplitude and
the frequency of the signal.
7 The template bank could be seen naively as a uniform grid in the parameter space. However, to
appreciate how this is not, in general, a trivial task, just consider the case of placing dots on a sphere
at equal distances.
10.8 The Template Bank 253
\[ \theta_{ref} \equiv \{\theta_{ref}^{\,i}\}, \quad i = 1 \ldots N \]
We begin by calibrating the filter, feeding it with a simulated signal that perfectly
matches the template $\theta_{ref}$, i.e. a signal with FT $H(\omega; \theta_{ref})$, as prescribed by
Eq. 10.25. Clearly, this combination, signal and template perfectly matched, gives
the maximum possible SNR, but our filter is also sensitive to signals corresponding to
nearby parameters, although with a reduced SNR. If we define the acceptable loss in
SNR, usually up to 5%, we can compute the volume of the parameter space covered
by that template, and the distance to the adjacent ones, in order to efficiently tile the
whole space. Consider then one of these nearby templates, $\theta_{near} = \theta_{ref} + \Delta\theta$: the
loss in SNR with respect to the optimal case introduces the ambiguity function
\[ \zeta \equiv \frac{SNR(\theta_{near})}{SNR(\theta_{ref})} \qquad (10.41) \]
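Applying the filter with $H(\omega; \theta_{ref})$ to nearby points, we can express, by power series
expansion near its maximum, the reduction in SNR:
\[ \zeta(\theta_{ref} + \Delta\theta) \simeq 1 + \frac{1}{2} \left. \frac{\partial^2 \zeta}{\partial\theta^i\, \partial\theta^j} \right|_{\Delta\theta^k = 0} \Delta\theta^i \Delta\theta^j + \cdots \qquad (10.42) \]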
Using the tools of differential geometry8, we can interpret the coefficient of the
expansion as the metric tensor of the parameter space:
\[ g_{ij} = -\frac{1}{2} \left. \frac{\partial^2 \zeta}{\partial\theta^i\, \partial\theta^j} \right|_{\Delta\theta^k = 0} \qquad (10.43) \]
by means of which we quantify the squared distance between two neighbouring templates
with
\[ 1 - \zeta = g_{ij}\, \Delta\theta^i \Delta\theta^j \qquad (10.44) \]
Based on the metric 10.43, we map the template placement using the smallest possible
number of templates, to fully cover the parameter space (i.e. leave no unexplored
regions).
A few main template placement strategies have been developed. We cite here just
the construction of a geometric quasi-regular lattice of points and the stochastic
construction built from a set of random proposals.
Under the geometrical approach, a set of coordinates is first identified and then a
quasi-regular distribution of patches is placed.
The stochastic template placement algorithm is built starting from a set of seed
points drawn from a uniform statistical distribution over the parameter space. Then,
other points are randomly proposed and accepted only if they lie at a distance
$D > \sqrt{1 - \zeta}$ from the others, previously drawn.
The random choice process continues until a pre-set convergence threshold is
reached for the full coverage of the parameter space. The stochastic method is robust
and can be implemented in higher dimensional curved parameter spaces.
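The stochastic placement just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the text: a constant (flat) metric, a rectangular two-dimensional parameter space, and a fixed number of proposals in place of a convergence criterion. The names `stochastic_bank`, `min_match` and the example metric are hypothetical.

```python
import numpy as np


def stochastic_bank(metric, lo, hi, min_match=0.97, n_proposals=2000, seed=0):
    """Accept random proposals whose squared metric distance (Eq. 10.44)
    from every accepted template exceeds 1 - min_match."""
    rng = np.random.default_rng(seed)
    d_min2 = 1.0 - min_match
    bank = []
    for _ in range(n_proposals):          # assumption: fixed number of proposals
        cand = rng.uniform(lo, hi)        # uniform proposal over the rectangle
        if all((cand - t) @ metric @ (cand - t) > d_min2 for t in bank):
            bank.append(cand)
    return np.array(bank)


metric = np.diag([4.0, 1.0])              # assumed flat metric g_ij
bank = stochastic_bank(metric, np.zeros(2), np.ones(2))
print(f"{len(bank)} templates accepted")
```

In a real search the metric varies across the parameter space, so the distance would have to be evaluated with the local $g_{ij}$ or directly from the ambiguity function.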
In general, the dimensionality of the parameter space is large and the number of
parameters depends on the complexity of the waveform we are assuming. Even in
the simplest case of GW signals emitted by a spinless coalescing binary system, the
parameter vector is nine dimensional: two BH masses, inclination angle i, polarization angle ψ,
phase at coalescence $\phi_c$, right ascension α and declination δ, luminosity distance $d_L$
and time of coalescence t. The dimensionality rises to 15 (adding two spin vectors)
for spinning stellar systems, and the number of parameters is even larger when more
detailed physics is included in the gravitational waveform.
To give an idea of the bank dimension, we refer to the case of the first direct
detection of a GW signal from a binary black hole system (Abbott et al. 2016a). At that
time, the bank was constructed by targeting compact binary systems whose individual
masses were restricted so that the total mass did not exceed 100 $M_\odot$. The dimensionless
aligned-spin magnitude9 of the individual objects was limited to 0.99: the search
was carried out on a bank consisting of almost 250000 templates.
The goal of data analysis is to reconstruct the information, hidden in the experiment
output, by comparing the observational data with theoretical waveform of the signal
and extracting the parameter values. When dealing with a transient signal emitted
by an astrophysical source, the possibility to repeat the identical experiment several
times is precluded: we have just a unique event. It follows that the assessment of
statistical confidence on the event parameters cannot be based on a strict frequentist
statistical approach.
Such assessment is instead achieved applying Bayesian inference. The statistical
pillar of the inference is the Bayes theorem, which describes the probability of an
event, based on prior knowledge of conditions that might be related to the event
(Thrane and Talbot 2019).
Let us first introduce the notation.
• θ and i are two variables; for our purposes θ represents the model parameters and
i the data of the event.
9 In a black hole, the angular momentum can have a maximum value $|J| \leq G M^2/c$. It is therefore
natural to define a dimensionless spin vector with magnitude $|j| = c|J|/(G M^2) \leq 1$.
10.9 Bayesian Inference 255
• p(θ ) and p(i) are the probabilities of observing θ and i, respectively, without any
given conditions, i.e. they are the prior probabilities.
• p(θ |i) is the conditional probability of the event θ occurring given that i is true.
This quantity is also called the posterior probability of θ given i.
• p(i|θ ) is also a conditional probability: the probability of event i occurring given
that θ is true. This can be interpreted as the likelihood function of θ for a given i:
p(i|θ ) = L(i|θ ).
• the unconditioned probability p(i) can be expressed as the weighted sum of all
conditioned probabilities:
\[ p(i) = \int L(i|\theta)\, p(\theta)\, d\theta \equiv Z \qquad (10.45) \]
Bayes' theorem (or Bayes' rule) states, in a deceptively simple formulation, that
\[ p(\theta|i) = \frac{1}{p(i)}\, p(i|\theta)\, p(\theta) \qquad (10.46) \]
as long as $p(i) \neq 0$.
An example from the medical field and a diagram may help in gaining insight into this
important tool. Consider a human population that may or may not have some sort of
cancer and a medical test that returns positive or negative for detecting that disease.
We know that 6% of the population is affected by this cancer.
So we set θ = Disease is True and i = Test is Positive: the parameters here are not
only discrete variables but also binary (true/false). The prior is p(θ) = 6%.
Diagnostic tests are not perfect: in 5% of the tests a patient will not have cancer,
but the test gives a positive result (false detection, or false positive); conversely, the
test might not detect a cancer that is present (false dismissal, or false negative) in 18%
of the cases. The likelihood is thus L(i|θ) = 0.82. The situation is summarized
in the tree diagram of Fig. 10.3. By working backward through this tree, we can answer the
question:
Fig. 10.3 A tree diagram summarizing the probabilities involved in the example given in the text
256 10 Data Analysis
• what is the probability that a patient has the disease, given that he tested positive?
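The question can be answered numerically by applying Eq. 10.46 to the numbers quoted above. The short Python sketch below assumes that the 5% figure is the probability of a positive test for a healthy patient; the variable names are illustrative.

```python
prior = 0.06                  # p(theta): fraction of the population with the disease
p_pos_given_disease = 0.82    # likelihood L(i|theta) = 1 - false-dismissal rate
p_pos_given_healthy = 0.05    # assumed meaning of the 5% false-positive rate

# Evidence p(i): total probability of a positive test (discrete form of Eq. 10.45)
evidence = p_pos_given_disease * prior + p_pos_given_healthy * (1.0 - prior)

# Posterior p(theta|i) from Bayes' theorem (Eq. 10.46)
posterior = p_pos_given_disease * prior / evidence
print(f"p(disease | positive test) = {posterior:.2f}")
```

Under these assumptions a positive test raises the probability of disease from the 6% prior to roughly one half: the low prior keeps the posterior far from certainty even with a fairly reliable test.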
where from now on θ = (θ1 . . . θn ) is a vector whose components are the waveform
parameters of a model under scrutiny.
It sometimes happens that we are mostly interested in assessing the value of one of
these parameters, giving up any information on the others: the posterior
probability of a single parameter $\theta_k$ is obtained by applying the marginalization procedure,
which simply means integrating over all possible values of all the other components
of θ
\[ p(\theta_k|i) = \frac{1}{Z}\, L(i|\theta_k)\, p(\theta_k) = \int p(\theta|i) \prod_{n \neq k} d\theta_n \qquad (10.49) \]
Fig. 10.4 Posterior probability density functions for the masses $m_1$ and $m_2$ (it is assumed by
convention that $m_2^{source} < m_1^{source}$) of the two black holes of the coalescent binary system, which
emitted the GW150914 signal. In the plot, we show the results obtained with two different families
of waveforms: IMRPhenom in blue and EOBNR in red. In solid black, the overall posterior is
reported. The dashed vertical lines mark the 90% credible interval for the overall posterior. The
two lines of the contour plot show the 50% and 90% credible regions of the posterior probability
density function. Credits: Abbott et al. (2016b). Creative Commons License
The evidence of Eq. 10.45 can be used to define to what extent this signal model is
statistically supported by the data. Let us consider the case of a signal embedded in
Gaussian noise:
\[ L(i|\theta) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{|i - h(\theta)|^2}{2\sigma^2} \right) \]
where $h(\theta)$ is a given GW signal template and $\sigma^2$ the noise variance. We can assign
evidences both in presence of a signal, $Z_h$, and in its absence, i.e. the presence of pure
noise, $Z_n$:
\[ Z_h \equiv \int L(i|\theta)\, p(\theta)\, d\theta, \qquad Z_n \equiv L(i|0) \]
The ratio of the two evidences, the Bayes factor $BF^h_n = Z_h / Z_n$, is used
to indicate which model is preferred over the other: in this case, for negative $\ln(BF^h_n)$
noise prevails over signal. A typical threshold for strong evidence of signal presence
is $\ln(BF^h_n) = 8$. A similar approach can be used to discriminate among different
models: for example, to compare the case of waveforms with or without spins (case
A and B). If the two priors $p(\theta_A)$ and $p(\theta_B)$ are significantly different, the Bayes
ratio is modified by weighting the two evidences with the priors
\[ BF^A_B = \frac{Z_A\, p(\theta_A)}{Z_B\, p(\theta_B)} \qquad (10.51) \]
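A toy computation of the signal-versus-noise Bayes factor may help to fix ideas. The sketch below is written under assumptions that are not in the text: real-valued data samples with independent Gaussian noise, a template of known shape with a single unknown amplitude, and a uniform prior on that amplitude between 0 and `a_max`. The function name and all numerical values are illustrative.

```python
import numpy as np


def ln_bayes_factor(data, shape, sigma, a_max=5.0, n_grid=2001):
    """ln(Z_h / Z_n) for the template a * shape, uniform prior on a in [0, a_max]."""
    amps = np.linspace(0.0, a_max, n_grid)
    resid = data[None, :] - amps[:, None] * shape[None, :]
    # ln[L(i|a) / L(i|0)]: the Gaussian normalisation cancels in the ratio
    ln_ratio = -(np.sum(resid**2, axis=1) - np.sum(data**2)) / (2.0 * sigma**2)
    shift = ln_ratio.max()                      # guard against overflow
    integrand = np.exp(ln_ratio - shift)
    da = amps[1] - amps[0]
    integral = da * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return shift + np.log(integral / a_max)     # prior density is 1 / a_max


t = np.arange(200) / 200.0
shape = np.sin(2 * np.pi * 5 * t)               # assumed template shape
ln_bf_signal = ln_bayes_factor(3.0 * shape, shape, sigma=1.0)
ln_bf_empty = ln_bayes_factor(np.zeros(200), shape, sigma=1.0)
```

When the data contain the template the logarithm of the Bayes factor is large and positive; for data without any signal content it is negative, because the signal model pays for its extra parameter.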
The matched filter theory we have presented assumes the stochastic process n(t) to
follow a Gaussian statistic and to be stationary. It is a matter of fact, however, that
the detectors are often affected by non-Gaussian, non-stationary noise sources. As
a consequence, the signal searches suffer from reduced sensitivity, so it is crucial to
understand how to deal with the non-Gaussian effects in order to maximize the yield
of the available data. The non-Gaussian behaviour can be manifested as a change in
the noise power spectral density (PSD): we refer to slow and continuous adiabatic
drifts in the power spectrum occurring over minutes or hours; procedures to correct,
in first approximation, the effect of the variation in the PSD of the background
have been developed (Zackay et al. 2019; Abbott et al. 2020). The adiabatic drift
of the power spectrum can be treated as a locally stationary process, and a simpler
approach is to divide the data into small chunks of time centred on time $t_i$, and
compute a smoothed estimate of the power spectrum for each chunk, $S_n(\omega, t_i)$. If
the chunk is short, a time-frequency representation of the signal based on the Fourier
transform is poor and inefficient (there are too few points for computing the
discrete Fourier transform of the signal). A more efficient approach is provided by
a representation of the detector output using the wavelet transform (Graps 1995), which
extracts local spectral and temporal information simultaneously, while the Fourier
transform captures global frequency information, i.e. frequencies that persist over
the entire output stretch. In practice, the wavelet transform decomposes a function
into a set of wavelets, wave-like oscillations that are localized in time. The formal
mathematical definition of a continuous wavelet transform of the function f (t) is
\[ W(s, \tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} f(t)\, \psi^*\left(\frac{t - \tau}{s}\right) dt \qquad (10.52) \]
where the parameters $s \neq 0$ and τ are called the scale and translation parameters
and ψ is a function of choice. Usually, s is associated to a sort of frequency content
and τ to the location in the time domain. If we decrease s the wavelet becomes more
squeezed, while τ shifts the wavelet along the signal f(t). In practice, the basic idea
is to compute how much of a wavelet is in a signal for a particular scale component
and location. The calculation of the continuous wavelet transform is implemented
10.10 Signals in Non-Gaussian Noise 259
by taking discrete values for the scaling parameter s and translation parameter τ .
The resulting wavelet coefficients are called wavelet series. The simplest and most
efficient discretization method for practical purposes leads to the construction of an
orthonormal wavelet basis
\[ \psi_{m,n}(t) = s_0^{-m/2}\, \psi\left(s_0^{-m} t - n\tau_0\right) \]
where m and n control the wavelet dilation and translation. Then, the wavelet series
are calculated as
\[ W_{m,n} = \int_{-\infty}^{+\infty} f(t)\, \psi^*_{m,n}(t)\, dt \qquad (10.53) \]
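To make the definition of Eq. 10.52 concrete, here is a minimal Python sketch that evaluates the integral by direct summation on a sampled signal. The choice of a Morlet mother wavelet with central angular frequency `omega0 = 6`, the relation between scale and frequency, and all names are assumptions made for this example; production analyses rely on much faster algorithms.

```python
import numpy as np


def morlet(x, omega0=6.0):
    """Assumed mother wavelet: complex Morlet with central frequency omega0."""
    return np.pi ** -0.25 * np.exp(1j * omega0 * x) * np.exp(-0.5 * x**2)


def cwt(f, t, scales, omega0=6.0):
    """Direct discretisation of Eq. 10.52; returns an array (n_scales, n_times)."""
    dt = t[1] - t[0]
    out = np.empty((len(scales), len(t)), dtype=complex)
    for a, s in enumerate(scales):
        # rows: translation tau, columns: integration variable t
        psi = morlet((t[None, :] - t[:, None]) / s, omega0)
        out[a] = (np.conj(psi) @ f) * dt / np.sqrt(s)
    return out


fs = 512                                   # sampling frequency in Hz
t = np.arange(fs) / fs                     # one second of data
f = np.sin(2 * np.pi * 20.0 * t)           # 20 Hz test tone
freqs = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
scales = 6.0 / (2 * np.pi * freqs)         # Morlet scale matched to each frequency
W = cwt(f, t, scales)
```

The modulus of `W` at mid-record is largest for the scale matched to 20 Hz, which is how a scalogram localizes spectral content in both time and frequency.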
Any wavelet transform for which the wavelets are discretely sampled represents
a valid discrete wavelet transform. It decomposes a signal into a set of mutually
orthogonal wavelet basis functions. These functions differ from sinusoidal basis
functions in that they are localized in time: they differ from zero only over part of
the total signal length. Figure 10.5 shows the first five Morlet wavelets, a popular
example of basis.
The quantitative assessment of non-stationarity is made by using discrete, orthogonal
wavelet transforms. These can be visualized using a scalogram, see Fig. 10.6,
Fig. 10.6 A discrete wavelet transform was used to produce the scalogram of non-stationary data.
The data were first whitened using a spectral density estimated from a stretch of 256 s data centred
around the time marked here as t = 0. The scalogram shows that, for about 30 s around t = 0, the
data had a significant spectral content in the 50–150 Hz band. Courtesy of the Virgo-Roma group
showing the amplitudes of the wavelet basis functions at each discrete time and
frequency pixel. A second type of non-Gaussian behaviour is encountered when we
have abrupt noise transients, called glitches in jargon, that can be caused by either
environmental disturbance or instrumental malfunction. These non-Gaussian noise
transients are tricky, as they manifest as bursts of excess power in the detectors
above and beyond what would be expected from stationary Gaussian noise alone:
therefore, glitches can be mistaken for real GW signals if they occur simultaneously
in multiple detectors, and they limit the search sensitivity to many astrophysical signals.
To mitigate this disturbance, a time-frequency excess power search is used to identify
high-amplitude, short-duration transients that are not already vetoed on the basis of
extra information provided, for example, by environmental monitors. To discrimi-
nate glitches from GW signals a χ 2 test is applied: the data and the GW waveform
are sliced in several different frequency bands. Then, given a specific number of
frequency bands n, a reduced χ 2 is computed:
\[ \chi_r^2 = \frac{n}{2n - 2}\, \frac{1}{\langle h|h \rangle} \sum_{k=1}^{n} \left| \langle i|h_k \rangle - \frac{\langle i|h \rangle}{n} \right|^2 \]
where, for reasons of compactness, we have borrowed Dirac's bra-ket notation for the
inner product of the functions
\[ \langle a|b \rangle = \int \frac{a(\omega)\, b^*(\omega)}{S_n(\omega)}\, d\omega \]
$|i\rangle$ is the FFT of the input time series, $|h\rangle$ is the template and $|h_k\rangle$ is the
sub-template corresponding to the k-th frequency band. Values of $\chi_r^2$ near unity indicate
a very good match between data and model.
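The band-splitting test can be sketched as follows in Python. Several choices here are assumptions made for the illustration and are not stated in the text: the integral is replaced by a sum over frequency bins, only the real part of the inner product is used, and the band edges are chosen so that each band carries an equal share of the template power. The flat-spectrum template and the "glitch" are toy inputs.

```python
import numpy as np


def inner(a, b, psd):
    """Discrete, real-valued version of <a|b> (assumption: real part only)."""
    return np.real(np.sum(a * np.conj(b) / psd))


def reduced_chi2(data_f, templ_f, psd, n_bands):
    # Band edges chosen so that each band holds 1/n of <h|h> (assumed convention)
    power = np.cumsum(np.abs(templ_f) ** 2 / psd)
    targets = power[-1] * np.arange(1, n_bands) / n_bands
    edges = np.searchsorted(power, targets, side="right")
    hh = inner(templ_f, templ_f, psd)
    ih = inner(data_f, templ_f, psd)
    total = 0.0
    for idx in np.split(np.arange(len(templ_f)), edges):
        h_k = np.zeros_like(templ_f)
        h_k[idx] = templ_f[idx]                 # sub-template of the k-th band
        total += (inner(data_f, h_k, psd) - ih / n_bands) ** 2
    return n_bands / (2 * n_bands - 2) * total / hh


psd = np.ones(64)
templ_f = np.ones(64, dtype=complex)            # toy template with flat spectrum
glitch_f = np.zeros(64, dtype=complex)
glitch_f[:16] = 4.0                             # power confined to the first band
chi2_signal = reduced_chi2(2.0 * templ_f, templ_f, psd, n_bands=4)
chi2_glitch = reduced_chi2(glitch_f, templ_f, psd, n_bands=4)
```

Noise-free data proportional to the template distribute their power among the bands exactly as the template does, so the statistic vanishes; the toy glitch, whose power sits in a single band, gives a large value even though its overall overlap with the template is substantial.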
10.11 The Wiener-Kolmogorov Filter 261
Early gravitational wave data analysis was concerned with the detection of bursts
originating from supernova explosions; at the epoch of the Weber-like resonant bars,
the analysis was limited essentially to looking for impulsive events modelled by
a Dirac-δ signal. This was also due to the relatively narrow detection band of the
detector, that made it difficult to distinguish the shape of any transient signal other
than a δ. For Earth-based interferometers, the detection band extends from a lower
limit of a few Hz up to 10 kHz. The broad band makes the detector more versatile and
permits the study of waveforms of transient signals beyond the δ function. To detect a
delta-like signal in resonant GW detectors, the Wiener-Kolmogorov
(WK) filter was adopted. This filter was independently introduced by two eminent mathematicians:
N. Wiener in the Western world and A.N. Kolmogorov in the USSR. The problem
concerns how to perform a prediction in the presence of stationary stochastic processes.
In Wiener’s particular case, the application studied was the prediction of the trajectory
of an aircraft in order to direct the anti-aircraft shooting. In fact the filter theory had
a strong development during World War II for the detection of radar signals, signals
of known form: the received signal must have a shape similar to the emitted one and
measuring the delay in reception we can establish the distance of the target.
Assume we have a signal h(t), while n(t) is noise, i.e. a random variable added
to the output. We want to filter the stationary process i(t) = h(t) + n(t) in order to
extract information about the nature of the signal h(t).
$\hat{h}$ is the signal estimate based on a linear combination of the process realizations
$i_1, i_2, \ldots, i_n$ weighted by the coefficients $w_k$, $k = 1, 2, \ldots, n$:
\[ \hat{h}(t) = w_1 i_1 + w_2 i_2 + \cdots + w_n i_n \]
Our goal is to find the optimum value of ĥ at the time t using the information available
about the process i.
The optimization is based on minimizing the variance of the estimation error
$\epsilon(t) = h(t) - \hat{h}(t)$:
\[ E[\epsilon^2(t)] = E\left[ \left| h(t) - \sum_{k=-\infty}^{+\infty} w_k i_k \right|^2 \right] \qquad (10.54) \]
This leads to the so-called orthogonality principle, which states that the minimum
error is uncorrelated with each of the available data:
\[ E\left[ \left( h(t) - \hat{h}(t) \right) i_k \right] = 0 \quad \forall\, i_k \]
In other words, the variance of the error is minimal when $\epsilon$ is orthogonal to the
samples of the data available at the filter's input.
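The orthogonality principle can be checked numerically on a small example. In the Python sketch below the expectation values are replaced by sample averages and the weights are obtained by ordinary least squares; the synthetic signal, the noise level, the number of weights and the circular delays are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_taps, n_samples = 5, 4000

# Assumed toy signal: smoothed random sequence; input = signal + white noise
h = np.convolve(rng.standard_normal(n_samples), np.ones(8) / 8, mode="same")
i = h + 0.5 * rng.standard_normal(n_samples)

# Column k holds the input delayed by k samples (circular delay for simplicity)
X = np.column_stack([np.roll(i, k) for k in range(n_taps)])

# Weights minimising the sample mean of |h - sum_k w_k i_k|^2
w, *_ = np.linalg.lstsq(X, h, rcond=None)
err = h - X @ w

# Sample estimate of E[(h - h_hat) i_k] for each k: zero within rounding errors
correlation = X.T @ err / n_samples
print(np.max(np.abs(correlation)))
```

The residual of the optimal linear estimate is uncorrelated with every data sample used to build it, and its mean square value cannot exceed that of the unfiltered input.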
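If the stochastic variable is continuous and so are its realizations (e.g. in the case
of analog measurements), the estimation takes the form of an integral
\[ \hat{h}(t) = \int_{\alpha}^{\beta} w(\xi)\, i(t - \xi)\, d\xi \]
and the application of the principle of orthogonality takes the form:
\[ E\left[ \left( h(t) - \int_{\alpha}^{\beta} w(\xi)\, i(t - \xi)\, d\xi \right) i(t) \right] = 0 \quad \forall\, \alpha < \xi < \beta \]
This can be written as
\[ R_{hi}(t) = \int_{\alpha}^{\beta} R_{ii}(t - \xi)\, w(\xi)\, d\xi \]
\[ W(\omega) = \frac{S_{hi}}{S_{ii}} \]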
In this formulation, the filter is based on all the available data (the realizations),
acquired both before and after the time t. With an abuse of language, this filter is
also referred to as non-causal or unworkable. The Wiener filter is insensitive to the
input signal phase, since the function w(t) is tied to auto-correlation functions. We
report an example with historical flavour: the search for burst signals with resonant-
mass GW antennas (see Chap. 8). The search consisted in detecting a sudden change
of vibration amplitude against a background given by the sum of slowly varying
(because of high Q) thermal noise and white amplifier noise. The best estimate of the
pure noise output at a given time was given by a WK filter that used, with weights
as shown in Fig. 10.7, both past and future samples: a non-causal filter.
The first attempts to use sophisticated statistical filters to improve the detection
SNR were carried out in the search for GW signals with resonant detectors. The first
algorithms applied by Joseph Weber were conceived to monitor the derivative of
the detector output, to enhance the energy innovation in the detector. At the time of
cryogenic resonant detectors, a few decades later, the data samples of the detector
output were analysed by applying the Wiener-Kolmogorov filter (Bonifazi et al.
1978). As we have shown in the previous section, the filter aims to extract an estimate
of a signal sequence from an observable data sequence by minimizing the mean
square error (MSE). In addition the filter is defined under the assumptions of a noise,
additive to the signal, which is treated as a stationary linear stochastic process with
known spectral characteristics or known auto-correlation. The signal and the noise
10.12 Matched and WK Filter 263
Fig. 10.7 The weights wn of the Wiener Kolmogorov filter applied to the data of resonant-mass
GW antennas in a search for short bursts. Applying these weights to the output data stream yields
the best estimate of the noise at the sample n = 0. In abscissa, the sample number: numbers n > 0
refer to the future of the considered sample
are mutually independent and the stochastic process has zero mean. At first glance
matched and Wiener-Kolmogorov filters seem very similar. Both are linear, but the
matched filter is conceived to maximize the output SNR, while the Wiener filter
minimizes the mean square error. MSE is a measure of the deviation between the
desired and actual response, while SNR includes an additive noise term as well,
which is missing in the MSE.
Generally, the two filters are applied having in mind different goals: the matched
filter is used to assess the detection of a known signal embedded in the noise,
whereas the Wiener filter applies to the estimation of an unknown signal with known
power spectrum. In either case, the spectrum of the signal is assumed to be known.
We also notice a couple of additional differences: when we consider the filter effect
on the signal, we realize that the Wiener filter partially affects it, while the matched
filter does not. We can say that the matched filter is derived from the transient signal
in the time domain, whereas the Wiener filter is derived from the signal and noise
covariances.
When the waveforms of impulsive GW signals are poorly predicted, matched
filtering, as used for the detection of inspiraling binaries, is clearly of little use,
and robust methods for detecting this kind of signal are then required. In general,
we are considering the case of signals with durations from milliseconds to seconds,
frequencies from 100 Hz to a few kHz, and a large range of waveforms.
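\[ P(|x_i| \geq s\sigma_n) = 2 \int_{s}^{\infty} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx \]
The number of bins $N_c$ which are by chance above threshold follows a binomial
distribution and the probability to have $N_c = n$ is
\[ P(N_c = n) = \binom{N}{n}\, p^n \left(1 - p\right)^{N - n} \]
with $p = \mathrm{erfc}(s/\sqrt{2})$.10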
It can be shown that the reduced random variable $\hat{N}_c = (N_c - \mu_c)/\sigma_c$, with $\mu_c = Np$
and $\sigma_c^2 = Np(1 - p)$, follows a distribution that is approximately normal when
$Np > 5$ and $N(1 - p) > 5$. This method is reasonably simple and demands just two
arbitrary choices, the window length T and the threshold s; using this statistic
we can assess the significance of a GW event by computing the probability of observing
the event by chance.
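The counting statistic described above is easy to evaluate numerically. The Python sketch below computes the per-bin probability p, the chance probability of finding at least a given number of bins above threshold, and the reduced variable; the function names and the example numbers (N = 100 bins, threshold s = 2) are illustrative assumptions.

```python
from math import comb, erfc, sqrt


def bin_probability(s):
    """Probability that a Gaussian sample exceeds s sigma in absolute value."""
    return erfc(s / sqrt(2.0))


def chance_probability(n_total, n_above, s):
    """Binomial probability of at least n_above bins above threshold by chance."""
    p = bin_probability(s)
    return sum(comb(n_total, n) * p**n * (1.0 - p) ** (n_total - n)
               for n in range(n_above, n_total + 1))


def reduced_count(n_total, n_above, s):
    """Reduced variable (N_c - mu_c) / sigma_c defined in the text."""
    p = bin_probability(s)
    return (n_above - n_total * p) / sqrt(n_total * p * (1.0 - p))


p_tail = chance_probability(100, 15, 2.0)   # 15 or more bins out of 100 above 2 sigma
z = reduced_count(100, 15, 2.0)
print(f"p = {bin_probability(2.0):.4f}, chance probability = {p_tail:.2e}, z = {z:.1f}")
```

The exact binomial sum remains usable when the conditions for the normal approximation are not met, as happens for high thresholds or short windows.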
Several more sophisticated methods have been proposed, such as those based on
time-frequency strategies, a standard procedure in many areas of signal analysis. If
we have hints on the duration and frequency band of the signal to be detected, we can
adapt the excess power filter by comparing the power of the data, in the estimated
frequency band $\Delta f$ and for the estimated duration $\Delta t$, to the known statistical
distribution of noise power. Under the usual hypothesis of stationary Gaussian noise
of the detector output, the output power, sampled and whitened during the time
window of observation, follows a $\chi^2$ distribution.
10 $\mathrm{erfc}(x) = 1 - \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\, dt$ is the complementary error function.
10.13 Unbiased Methods to Detect Signals 265
– select $H_1$ when $H_0$ is true, i.e. we decide that the signal is present when it is
not: false alarm with probability $Q_0$
– select $H_0$ when $H_1$ is true, i.e. we decide that the signal is absent when it is present:
false dismissal with probability $Q_1$
\[ \Lambda(i) = \frac{p(H_1|i)}{p(H_0|i)} \]
We accept $H_1$ and reject $H_0$ when $\Lambda$ is larger than a threshold fixed by a specified
value of $Q_0$.
In the classical case of signal embedded in Gaussian white noise with zero mean,
the probability densities are
11 A χ 2 distribution is the sum of n independent terms, each one being the square of a gaussian
random variable, with zero mean and unity variance. When the normally distributed variables have
a mean other than zero, then the corresponding sum of squared terms yields a non-central χ 2
distribution of n degrees of freedom and the non-centrality parameter being the sum of the squared
means of the normally distributed quantities.
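\[ p(H_0|i) = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{i_k^2}{2\sigma^2} \right) \]
\[ p(H_1|i) = \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{|i_k - \xi_k|^2}{2\sigma^2} \right) \qquad (10.56) \]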
where σ is the noise standard deviation, $i_k$ is the generic component of the vector
i and $\xi_k$ the corresponding component of the signal. As usual, when dealing with
products or ratios of exponential functions, it is convenient to define the logarithm
of the likelihood ratio
\[ \log \Lambda = \log \frac{p(H_1|i)}{p(H_0|i)} = \frac{1}{\sigma^2} \sum_{k=1}^{N} \left( i_k \xi_k - \frac{1}{2} \xi_k^2 \right) \qquad (10.57) \]
We then look for the maximum (with respect to all possible templates $\xi_i$) of
the functional of Eq. 10.57
\[ \Lambda_M(i) = \max \frac{p(H_1|i)}{p(H_0|i)} \]
and compare it to the chosen threshold. This test is called the maximum likelihood
ratio test.
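A direct transcription of Eq. 10.57 and of the maximization over templates is given in the Python sketch below. The function names and the three toy templates are assumptions made for the example, and the noise is taken to be white with known standard deviation, as in the text.

```python
import numpy as np


def log_likelihood_ratio(i, xi, sigma):
    """Eq. 10.57: (1 / sigma^2) * sum_k (i_k xi_k - xi_k^2 / 2)."""
    return np.sum(i * xi - 0.5 * xi**2) / sigma**2


def max_log_likelihood_ratio(i, templates, sigma):
    """Index and value of the template maximising the log-likelihood ratio."""
    values = [log_likelihood_ratio(i, xi, sigma) for xi in templates]
    best = int(np.argmax(values))
    return best, values[best]


# Toy template family (assumed for illustration)
templates = [np.ones(4),
             np.array([1.0, -1.0, 1.0, -1.0]),
             np.array([2.0, 0.0, 0.0, 0.0])]
data = templates[1].copy()                 # noise-free data equal to template 1
best, value = max_log_likelihood_ratio(data, templates, sigma=1.0)
print(best, value)
```

The selected value would then be compared with the threshold fixed by the desired false alarm probability.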
A relevant example: in order to assess the first detection of gravitational waves,
GW150914, where we could not afford to be wrong, the false detection probability
was set at $Q_0 < 2 \cdot 10^{-7}$, corresponding to 5.1 σ of the noise, or 1 spurious event
every $2 \cdot 10^{5}$ years. The matched filter for binary coalescences yielded an SNR =
23.6.
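The correspondence between a false alarm probability and a number of standard deviations can be verified with a one-line computation. The Python sketch below assumes the one-sided Gaussian convention, which is an assumption of this example rather than a statement from the text.

```python
from math import erfc, sqrt


def one_sided_tail(n_sigma):
    """Probability that a unit Gaussian variable exceeds n_sigma (one-sided)."""
    return 0.5 * erfc(n_sigma / sqrt(2.0))


q0 = one_sided_tail(5.1)
print(f"tail probability at 5.1 sigma = {q0:.2e}")
```

The result is slightly below $2 \cdot 10^{-7}$, consistent with the figures quoted above.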
12 In coherence with our notation, we should denote the reconstructed input as i(t). However, in this
section we will abide by the use of the GW community that invariably refers to the reconstructed
h(t).
10.14 Hunting Gravitational Wave Signals 267
Rotating neutron stars, the main target of searches for continuous GWs, are extremely
stable emitters, as we discuss in Chap. 12: we expect the emission of a pure, sinusoidal
signal. Nevertheless, the task we face is trying to detect an almost monochromatic
signal: almost, because, due to the relative motion of the detector with respect to
the emitting source, the spectral purity of the signal is contaminated by the Doppler
effect. As a third step in complexity, we will finally address the issue of signals with
intrinsic frequency change (spin down). If the Doppler modulation were negligible,
the analysis would be limited to the calculation of the Fourier transform of the data and
the identification of a possible peak at a given frequency. The frequency resolution of
a Fourier transform is set by the length of the observation time, $\Delta\nu_m = 1/T_{obs}$. By
computing high-resolution spectra, we can assume the noise spectral density $S_n(\nu)$
not to vary within the resolution bandwidth, so that $\sigma_n^2 = S_n \cdot \Delta\nu_m$. The periodic signal,
of amplitude $h_0$, will emerge above the noise with a power SNR:
\[ SNR_{power} = \frac{|h_0|^2}{S_n}\, T_{obs} \qquad (10.58) \]
and the amplitude SNR grows as $SNR_h \propto \sqrt{T_{obs}}$.
Thus, we should consider stretches of data several months long to allow the
signal to rise above the noise. However, we will see that data stretches so long pose
a significant computational burden.
The Doppler effect complicates the analysis, modulating the signal in frequency
and amplitude, thus spreading it over a frequency range. We need to determine the
positions in the celestial sphere of the sources that are compatible with the Doppler
correction to the signal frequency ν:
\[ \nu(t) = \nu_o \left( 1 + \frac{|\vec{v}|}{c} \cos\phi \right) \qquad (10.59) \]
where $\vec{v}$ is the velocity of the detector with respect to the source and φ is the angle
between this vector and the line of sight of the source. $\vec{v}$ is the sum of two components:
the orbital revolution around the centre of gravity of the solar system (SSB), with
$|\vec{v}_{orb}| \sim 32$ km/s, and the daily spinning of the Earth around its axis, with
$|\vec{v}_{spin}| \sim 0.45$ km/s at the equator. In the long term (of the order of a year), the
significant effect is given by the orbital motion, which determines a variation
\[ \Delta\nu_{orbital} \sim 10^{-4}\, \nu_o \]
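The orders of magnitude quoted above follow directly from Eq. 10.59. The short Python sketch below evaluates the maximum fractional Doppler shift for the two velocity components and compares the orbital shift at 100 Hz with the frequency resolution of a one-day observation; the function names and the 100 Hz, one-day example are illustrative assumptions.

```python
C_KM_S = 299_792.458          # speed of light in km/s


def doppler_frequency(nu0, v_km_s, cos_phi):
    """Eq. 10.59: received frequency for detector speed v and angle phi."""
    return nu0 * (1.0 + v_km_s / C_KM_S * cos_phi)


def max_fractional_shift(v_km_s):
    """Largest |delta nu| / nu0, reached for cos(phi) = +1 or -1."""
    return v_km_s / C_KM_S


orbital = max_fractional_shift(32.0)      # orbital speed quoted in the text
spin = max_fractional_shift(0.45)         # equatorial spin speed quoted in the text
resolution = 1.0 / 86400.0                # Hz, frequency bin for T_obs = 1 day
print(f"orbital: {orbital:.1e}, spin: {spin:.1e}")
print(f"orbital shift at 100 Hz = {100.0 * orbital:.1e} Hz, bin = {resolution:.1e} Hz")
```

Even at 100 Hz the orbital shift spans many frequency bins of a one-day spectrum, which is why the Doppler modulation cannot be ignored in long coherent integrations.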
In the short term (from 1 hour to 1 day), the variation is dominated by the spin motion,
even though the spin velocity magnitude is much smaller than that associated with
the orbital motion around the Sun.
It is therefore convenient to subtract from the data the Doppler effect associated
with the motion of the Earth, calculating it in the reference system with origin in the
SSB.13 In practice the Doppler effect is efficiently corrected in the time domain by
changing the time stamp t of data samples according to the “Römer correction”:
\[ t' = t + \frac{r(t)}{c} \cos\phi \qquad (10.60) \]
13 The Doppler correction depends on the location in the sky of the source, because the angular
momentum of the Earth associated with its orbital motion around the Sun is not parallel to the
Earth’s spin axis.
14 We neglect here possible orbital motion of the source and all relativistic effects, that are instead
This search is performed dividing the sky into $N_z$ zones, the extension of which is
chosen taking into account the angular resolution of the detector, that increases with
the SNR that, in turn, improves with the observation time. The resolution depends
on the square of the signal frequency and increases with the fourth power of the
observation time, $\nu^2 / \Delta\nu_m^2$. On each patch, the coherent search requires us to apply the
Doppler correction, perform a huge FFT on the whole stretch of data and finally look
for bins with excess power. Schutz calculated, for a search at 1 kHz and a relatively
short observation time (∼ 1 day), a number of zones Nz ∼ 1013 (Schutz 1991). We
conclude that the coherent method for a blind search, optimal from the point of view
of signal theory, is not viable due to the enormous computing power required.
It follows that it is necessary to develop sub-optimal statistical techniques, reduc-
ing as much as possible the losses in terms of SNR. All methods use the power from
the Fourier transforms of short stretches of data, ignoring the phase information of
the signal. In this sense, all these approaches are classified as incoherent methods.
In the frequency bins where the signal is present, there should systematically be an
excess of power. As the Doppler effect imposes a frequency modulation on the signal,
we expect the signal related excess power to migrate from bin to bin in successive
Fourier transforms. The search algorithm must chase such frequency change.
In the so-called stack-slide method, one shifts the frequency bins of each Fourier
transform to align the signal peaks and then adds the power.
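The stack-slide idea can be illustrated with a toy time-frequency map. In the Python sketch below each row is the power spectrum of one short data stretch, the signal wanders sinusoidally by a few bins, and the rows are shifted back before being summed. The noise model (exponentially distributed power), the signal strength and the circular shift are assumptions of this example.

```python
import numpy as np


def stack_slide(power, bin_shifts):
    """Shift each spectrum back by its expected Doppler offset (in bins) and sum."""
    aligned = [np.roll(row, -int(shift)) for row, shift in zip(power, bin_shifts)]
    return np.sum(aligned, axis=0)


rng = np.random.default_rng(0)
n_seg, n_bins, f0_bin = 50, 128, 40

# Assumed Doppler track: the signal bin oscillates by +/- 5 bins around f0_bin
shifts = np.round(5 * np.sin(2 * np.pi * np.arange(n_seg) / n_seg)).astype(int)

power = rng.exponential(1.0, size=(n_seg, n_bins))    # noise-only spectra
power[np.arange(n_seg), f0_bin + shifts] += 3.0       # weak signal in each stretch

stacked = stack_slide(power, shifts)
plain_sum = power.sum(axis=0)                         # no sliding, for comparison
print(int(np.argmax(stacked)))
```

After sliding, all the signal power accumulates in the bin of the emitted frequency, whereas the plain sum spreads it over the bins visited by the Doppler track.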
We mention here another method based on the Hough transform (Palomba et
al. 2005). The Hough transform is a method of digital image processing, useful
to identify geometric shapes in images: it was invented in 1959 as an automatic
tool to read out ionization traces left by charged particles in bubble chambers, and
was later extended to identify arbitrary shapes such as circles or ellipses. It is a
robust parameter estimator of multidimensional patterns in images and it finds many
applications in astronomical data analysis. In essence, the Hough transform is an
algorithm that converts from one set of variables, to another, more suited for the
analysis of interest. In the original application, the conversion was performed from
the coordinates {xi , yi } of the points, in the bubble chamber image, darkened above a
threshold, to the description {slope, intercept} of the best lines15 connecting those
points.
To show how the Hough transform is used in this context of continuous GW
searches, we consider a signal emitted at a given frequency ν and modulated, at
reception, according to Eq. 10.59.
We project the position of the detector on the celestial sphere and use equatorial
celestial coordinates: Declination δ and Right Ascension α;
15 Note the plural in lines: if it were only one line, the problem would be trivially solved by a linear
fit.
270 10 Data Analysis
• α and δ are the right ascension and declination variables associated with the
instantaneous value of the frequency that changes due to Doppler effect
• The angle φ, as defined in Eq. 10.59 is the angle between the detector velocity
and the line of sight of the source and is computed using the relation:
In this way, the points, or more appropriately the pixels, identified by the pair of values
α, δ, for which the identity Eq. 10.64 is verified, will compose a circular annulus. In
practice, using Eq. 10.64 we build a correspondence (a Hough transform) between the
observational space, { frequency and time}, and the parameter space, the coordinates
{α, δ} of the source in the celestial sphere.
In the light of this observation, we proceed by dividing the data, related to a
long period of observation, in many sub-periods. We choose the length of the
sub-periods to be short enough16 that the discrete spectrum, obtained by calculating
the spectral power of the interferometer signal h(t), the periodogram, has a low
frequency resolution: a bin large enough that the signal does not suffer from the
Doppler effect. Any signal will then be contained in a single frequency bin of the
discrete spectrum; in the next periodogram the signal will move from one frequency
bin to an adjacent one. Plotting together all the periodograms produced, we can build
a time-frequency map. In that map, in the presence of a signal with large SNR, a
signature like that shown in Fig. 10.8 will appear.
We now focus on those particular frequency values of the discrete
Fourier spectrum that exceed a predetermined threshold value. For each of these
frequency values, the corresponding closed curve on the celestial sphere is traced. By
performing repeated independent observations, at different times or with different
detectors, several circles are produced, and the various curves will intersect at one
point of the celestial sphere, identifying the position of the source.
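The voting procedure just described can be sketched numerically. The Python snippet below is a minimal illustration, not the pipeline used by the collaborations. It assumes that the Doppler modulation has the first-order form ν_obs = ν (1 + (v/c) cos φ) and that the emitted frequency ν is known; the function names, the sky grid, the detector velocities and the bin width are all hypothetical choices made for the example.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s


def sky_unit_vectors(alphas, deltas):
    """Unit vectors n(alpha, delta) on a grid, shape (len(alphas), len(deltas), 3)."""
    a, d = np.meshgrid(alphas, deltas, indexing="ij")
    return np.stack([np.cos(d) * np.cos(a), np.cos(d) * np.sin(a), np.sin(d)], axis=-1)


def hough_map(nu_obs, velocities, nu0, bin_width, n_hat):
    """Count, for every sky pixel, how many observations are compatible with it.

    Assumes the first-order Doppler law nu_obs = nu0 * (1 + v.n / c).
    Each observation votes for the annulus of pixels whose predicted
    frequency falls inside the observed frequency bin.
    """
    votes = np.zeros(n_hat.shape[:2], dtype=int)
    for nu, v in zip(nu_obs, velocities):
        predicted = nu0 * (1.0 + n_hat @ v / C)
        votes += np.abs(predicted - nu) <= bin_width / 2
    return votes


# Hypothetical example: 1-degree sky grid and a source sitting on a grid point.
alphas = np.deg2rad(np.arange(0, 360))
deltas = np.deg2rad(np.arange(-89, 90))
n_hat = sky_unit_vectors(alphas, deltas)
i_src, j_src = 120, 60            # alpha = 120 deg, delta = -29 deg
nu0, bin_width = 100.0, 1e-4      # Hz

# Illustrative detector velocities (30 km/s, deliberately not coplanar).
directions = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [-1, 2, 0.5]], float)
velocities = 3.0e4 * directions / np.linalg.norm(directions, axis=1, keepdims=True)

# Simulated observed frequencies, then the Hough map built from them.
nu_obs = nu0 * (1.0 + velocities @ n_hat[i_src, j_src] / C)
votes = hough_map(nu_obs, velocities, nu0, bin_width, n_hat)
best = np.unravel_index(np.argmax(votes), votes.shape)
print("pixel with most votes:", best, "votes:", votes[best])
```

Each observation alone selects a whole annulus; only the pixel where all the annuli overlap collects one vote per observation. In this sketch the velocities are deliberately not coplanar: velocities all lying in one plane would leave a mirror ambiguity with respect to that plane.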
Fig. 10.8 Left: time-frequency map of simulated data in the presence of a signal with SNR > 100.
Right: the data on the left, mapped into the {α, δ} plane by the Hough transform, show a closed
curve. Courtesy of the Virgo-Roma group
Finally, we address the additional complication that we face when the signal is not
intrinsically monochromatic. The signal might exhibit an intrinsic frequency drift,
as expected from emitting neutron stars which spin down with time, or modulation
due to its motion (nutation of the axis, or rotation around a companion). This is a
very relevant complication: if there is no spin-down, the star is not losing energy, and
does not emit GW! These effects must also be included in the analysis. This is done
by assuming, as a general model for the source frequency drifts, a power expansion of
the GW signal frequency in terms of its time derivatives
$$\nu(t) = \nu(t_0) + \dot{\nu}\, t + \frac{1}{2}\, \ddot{\nu}\, t^2 + \dots$$
The unknown values of the source frequency and its time derivatives are additional
dimensions of the parameter space, increasing the search complexity and the
computational burden.
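To get a feeling for why the spin-down enlarges the search, the truncated expansion can be evaluated for a trial source. This is only a sketch: the frequency, the spin-down value and the sub-period length used below are hypothetical numbers chosen for illustration, and `gw_frequency` is a name introduced here.

```python
def gw_frequency(t, nu0, nu_dot=0.0, nu_ddot=0.0):
    """Truncated expansion nu(t) = nu0 + nu_dot*t + 0.5*nu_ddot*t**2 (t measured from t0)."""
    return nu0 + nu_dot * t + 0.5 * nu_ddot * t**2


# Hypothetical source: 100 Hz with a spin-down of -1e-9 Hz/s, observed for one year.
one_year = 365.25 * 86400.0          # s
t_fft = 1800.0                       # assumed length of one sub-period, s
drift = gw_frequency(one_year, 100.0, nu_dot=-1e-9) - 100.0
bins_crossed = abs(drift) * t_fft    # the frequency bin width is 1 / t_fft
print(f"drift = {drift:.4f} Hz, about {bins_crossed:.0f} frequency bins")
```

With these numbers the signal crosses tens of frequency bins in a year, so a search that ignores the first derivative would lose track of it; each additional derivative adds one more dimension to be scanned.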
As an example, we report here the upper limit on GW emission from isolated
spinning neutron stars. In Fig. 10.9 we show the upper limit on the amplitude of the
signal from isolated spinning neutron stars of unknown position in the sky (Abbott et al. 2019).
It is based on data collected by the two Advanced LIGO detectors; the data have been
analysed by three different search methods covering a frequency band from 20
to 1922 Hz. None of these searches has found clear evidence for a continuous GW
signal. However, this result provides an interesting astrophysical limit on the neutron
star ellipticity. We recall here Eq. 7.39, linking the amplitude $h_0$ of the emitted
wave to the ellipticity e of the compact spinning star:
$$e = \frac{c^4\, r\, h_0}{4\, G\, I_3\, \Omega^2} \tag{10.65}$$
where r is the distance of the source from the Earth, $I_3$ is the moment of inertia, assumed
to be $10^{38}$ kg m$^2$, and Ω = πν is the rotation angular frequency of the source. It
follows that for sources at 1 kpc emitting at 500 Hz the ellipticity is below $10^{-6}$, and
at 10 kpc the limit rises to $10^{-5}$.
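The scaling of the ellipticity limit with distance can be checked with a few lines of Python. This is a sketch under stated assumptions: the strain value `h0_limit` is an illustrative number of the order of the limits in Fig. 10.9, not a value quoted in the text, and the physical constants are rounded.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m/s
KPC = 3.086e19         # one kiloparsec, m


def ellipticity(h0, distance_m, nu_gw, inertia=1e38):
    """Eq. 10.65: e = c^4 r h0 / (4 G I3 Omega^2), with Omega = pi * nu_gw."""
    omega = math.pi * nu_gw
    return C**4 * distance_m * h0 / (4.0 * G * inertia * omega**2)


h0_limit = 2.6e-25     # illustrative strain upper limit, not a value quoted in the text
for r_kpc in (1, 10):
    print(f"r = {r_kpc:2d} kpc -> e < {ellipticity(h0_limit, r_kpc * KPC, 500.0):.1e}")
```

Since e is proportional to r at fixed $h_0$, the same strain limit is ten times less constraining for a source ten times farther away.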
Fig. 10.9 Upper limits on the strain amplitude $h_0$ at the 95% confidence level obtained from
LIGO data, using three different incoherent methods. Note that the vertical axis reports not
the spectral amplitude $\sqrt{S_h}$, measured in Hz$^{-1/2}$, but the dimensionless amplitude $h_0$. Credits:
Reprinted figure with permission from Abbott et al. (2019). Copyright 2019 by the American
Physical Society
It should now be clear to the reader that the detection of transient gravitational wave
signals must be performed by exploiting information from multiple detectors.
To trace back from the detected signal to all the physical parameters that
characterize the emission process, that is to solve the inverse problem, it is necessary to
combine the data of five interferometers, taking into account their different position
and orientation on the Earth. In terms of the SNR, the linear combination of the
output signals must take into account the different noise levels of the various antennas.
This is done by combining the whitened data of each detector, or by combining in the
time domain, after an inverse Fourier transform, the frequency-domain data of each
detector divided by the corresponding noise spectral density. The consistent
combination of these data must take into account:
– the arrival time of the signal and therefore the phase in each detector,
– the amplitude of the signal response, which depends on the antenna orientation
at that moment.
To the constructive combination of data that exceed the detection threshold at a given
moment, there must correspond the absence of a signal in the null stream, the
combination of detector outputs in which a genuine GW contribution cancels. The most
powerful algorithm to detect signals in an unbiased way is the coherent Wave Burst
(CWB) method (Klimenko et al. 2008). In coherent methods, a statistic is built up as a
coherent sum over detector responses. In general, it is expected to be “more optimal”
(better sensitivity for the same false alarm rate) than the detection statistics of the
individual detectors that make up the network. The method that we summarize here
was developed in the context of the GW search for burst signals: it combines all data
streams of the independent detectors into one coherent statistic constructed in the
framework of the constrained maximum likelihood analysis (Klimenko et al. 2005).
To be coherent, the data are combined taking into account the phase. The total
likelihood ratio of the detector network to be maximized, similar to the definition of
Eq. 10.57, is
$$\log(L) = \sum_{k=1}^{D} \sum_{i=1}^{N} \frac{1}{\sigma_k^2} \left( x_{ki}\, \xi_{ki} - \frac{1}{2}\, \xi_{ki}^2 \right) \tag{10.66}$$
where k runs over the D detectors in the network and i runs over the N samples in the
data sequence; $x_{ki}$ are the data samples, $\xi_{ki}$ the detector responses to the GW signal
and $\sigma_k$ the standard deviation of the noise of detector k. We also assume that the
noises in different detectors are uncorrelated. The likelihood method introduced
requires a high computational load and a large use of memory.
To reduce the computational burden, in the Coherent Wave Burst algorithm the data are
analysed by applying a wavelet transform, which generates time-frequency pixels (s, τ),
i.e. time intervals and frequency bands centred on a definite time and frequency. The
square of the coefficient d(s, τ) is the energy of the signal in the time-frequency
pixel associated with the pair (s, τ). CWB uses different decomposition levels
for the same data stream, so as to have different characterizations of the signal and to
find the optimal one. After the application of the wavelet transform, the algorithm
selects the most energetic pixels (core) and their neighbours. Core pixels are chosen
if the corresponding energy is above a threshold which depends on the noise level.
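The double sum of Eq. 10.66 is straightforward to evaluate once the data and a trial response are available as arrays. The sketch below is only an illustration of the formula, not the CWB implementation: the function name and the toy numbers are hypothetical, and the squared response in the second term follows the form given by Klimenko et al. (2005).

```python
import numpy as np


def network_log_likelihood(x, xi, sigma):
    """Eq. 10.66 for D detectors and N samples.

    x     : array (D, N), detector data
    xi    : array (D, N), assumed detector responses to the trial signal
    sigma : array (D,), noise standard deviation of each detector
    """
    x, xi = np.asarray(x, float), np.asarray(xi, float)
    weights = 1.0 / np.asarray(sigma, float)[:, None] ** 2
    return float(np.sum(weights * (x * xi - 0.5 * xi**2)))


# Toy network: two detectors, two samples each (hypothetical numbers).
x = [[1.0, 2.0], [3.0, 4.0]]
sigma = [1.0, 2.0]
matched = network_log_likelihood(x, x, sigma)                  # response equal to the data
no_signal = network_log_likelihood(x, np.zeros((2, 2)), sigma)
print(matched, no_signal)
```

The weights 1/σ_k² make a noisier detector count less in the sum. If the responses were free parameters, the sum would simply be maximized by ξ = x; the analysis is constrained because the responses in all detectors must derive from the same incoming wave.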
The significant advantages of this coherent method are:
– the sensitivity is not limited by the least sensitive detector in the network;
– the maximum likelihood ratio statistic represents the total SNR of the GW signal
detected in the network;
– the null data stream allows one to distinguish genuine signals from noise artefacts.
The coherent analysis maximizes the SNR of the antenna network, but requires
huge computational power. A much more manageable approach was the threshold
coincidence method, used by Weber (1980) for the analysis of the data of his antennas
and in all subsequent resonant detector experiments.
The analysis method applied by Weber is conceptually simple. The filtered output
of each detector is examined to determine when the preset threshold is exceeded.
This is how each group produces the list of events that is then exchanged with the
other groups operating the various detectors. These lists are compared and searched
for coincidences, i.e. events falling within a coincidence window $\Delta t$ set a priori,
that is before starting the analysis. This window is chosen taking into account the overall
time resolution of the antenna network and all the possible directions of arrival of
the gravitational signals.
This comparison is performed $N_{shift}$ times, introducing at each step a time delay
$\Delta t_{shift}$ in one list with respect to the other. This procedure is needed to derive the
probability of random coincidences, exploiting the same data with their possible
non-stationary characteristics. In essence, we will find a given number of coincidences
with zero delay, $n_c$, which we compare with those obtained for $\Delta t_{shift} \neq 0$. In other
words, we check whether
$$n_c(\Delta t_{shift} = 0) \gg n_c(\Delta t_{shift} > 0) \tag{10.67}$$
$$p_{exp} = \frac{f}{N_{shift}}$$
If the data selected by two antennas were realizations of a stationary process, the mean
coincidence number $\langle n_c \rangle_{random}$ could be deduced by applying Poisson
statistics:
$$\langle n_c \rangle_{random} = N_1 N_2\, \frac{\Delta t}{t_m} \tag{10.68}$$
where $N_1$ and $N_2$ are the numbers of selected events in the first and second antenna,
and $t_m$ is the total coincidence data collection time (actual measurement time of the
detector network). The possible discrepancy between the observed number of coincidences
and that predicted by Eq. 10.68 will then be subjected to a statistical
test to quantitatively assess its consistency.
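The time-shift procedure and the Poisson estimate of Eq. 10.68 can be sketched as follows. Several choices here are assumptions made for the example and are not taken from the text: the coincidence window is taken as a total width Δt (two events coincide if they differ by at most Δt/2), the shifts are applied circularly over the measurement time, and f is taken to be the number of shifts whose coincidence count is at least the zero-delay one. The event lists are invented.

```python
import numpy as np


def count_coincidences(t1, t2, window):
    """Number of pairs (one event per list) with |t1 - t2| <= window / 2."""
    t1, t2 = np.asarray(t1, float), np.sort(np.asarray(t2, float))
    first = np.searchsorted(t2, t1 - window / 2, side="left")
    last = np.searchsorted(t2, t1 + window / 2, side="right")
    return int(np.sum(last - first))


def shifted_counts(t1, t2, window, dt_shift, n_shift, t_m):
    """Coincidence counts after delaying the second list by k * dt_shift, k = 1..n_shift."""
    t2 = np.asarray(t2, float)
    return [count_coincidences(t1, (t2 + k * dt_shift) % t_m, window)
            for k in range(1, n_shift + 1)]


def expected_random(n1, n2, window, t_m):
    """Eq. 10.68: mean number of accidental coincidences for stationary data."""
    return n1 * n2 * window / t_m


# Hypothetical event lists (seconds) over t_m = 100 s, window of 1 s.
t1 = [10.0, 20.0, 30.0]
t2 = [10.2, 25.0, 30.4]
n_zero = count_coincidences(t1, t2, window=1.0)
background = shifted_counts(t1, t2, window=1.0, dt_shift=5.0, n_shift=2, t_m=100.0)
f = sum(n >= n_zero for n in background)    # assumed meaning of f
p_exp = f / len(background)
print(n_zero, background, p_exp, expected_random(len(t1), len(t2), 1.0, 100.0))
```

In a real analysis the number of shifts is large, so that the distribution of the shifted counts gives an empirical background estimate that does not rely on the stationarity assumed by Eq. 10.68.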
The coincidence method is certainly faster and more robust, has a negligible computing cost and
can be applied to data produced by detectors of a different nature, such as resonant
bars and interferometers (different in sampling time, bandwidth, filtering etc.). Some
disadvantages of this method are:
– it cannot reveal events that are poorly reconstructed by filters and that are still
immersed in noise;
– it works correctly only if the filters and detection thresholds applied to the different
detectors produce data and events of comparable sensitivity. This is not always
possible, e.g. when the detectors are very different: one possible criterion is then
to produce lists where the events are selected not on their energy, but by
requiring a chosen number (the same for all detectors) of events per unit time,
with the intent to build an almost stationary data set of events. In any case, the
standardization of the event selection procedure among the various
detectors is a crucial aspect for the application of the method.
10.16 Assessing the Detection in a Network of Antennas 275
Fig. 10.10 The EXPLORER resonant detector of the Universities of Rome, installed at CERN. The
metal cube on the left contains the calibrator, an aluminium quadrupole of 14 kg rotating at up to 460 Hz
17 Antenna engineers like to express the energy associated with a stochastic signal in kelvin,
through E = $k_B$ T: 1 K corresponds to $1.38 \cdot 10^{-23}$ J. This is just what high-energy physicists do
with the electron-volt: 1 eV = |e| · 1 V = $1.6 \cdot 10^{-19}$ J.
In 2017, with Virgo also in operation, the three interferometers identified the first
coalescence event of neutron stars in a much narrower area, ∼30 square degrees,
ushering in the era of multi-messenger astronomy.
$$\Omega_{GW} = \frac{8\pi G}{3 H_0^2 c^2}\, \nu\, \frac{d\rho_{GW}}{d\nu} \tag{10.69}$$
$$\Omega_{GW} = \frac{4\pi^2}{3 H_0^2}\, \nu^3\, S_{h_1,h_2}(\nu) \tag{10.71}$$
18 The typical assumption made when presenting the upper limit values of the LIGO-Virgo network
is $H_0$ = 67.9 km s$^{-1}$ Mpc$^{-1}$, the Hubble constant value from the observations of the Planck
satellite.
To search for a SGWB signal we consider the data stretches of two detectors, $i_1(t)$ and
$i_2(t)$, and a matched filter q. In order to find the filter that maximizes the SNR,
we introduce a figure of merit, called the estimator of the SGWB signal: the cross
correlation of $i_1(t)$ and $i_2(t')$ weighted by the filtering function $q_\alpha(t, t')$. The latter is
built on the hypothesis that the SGWB is expressed by the power law of spectral index
α, given by Eq. 7.59. Thus, the SGWB estimator is
$$Y_{GW} = \int_{-T_{obs}/2}^{T_{obs}/2} dt \int_{-T_{obs}/2}^{T_{obs}/2} i_1(t)\, i_2(t')\, q_\alpha(t, t')\, dt' \tag{10.72}$$
Since $T_{obs}$ is much longer than the wave travel time between the two detectors,
we are justified in changing the limits of the second integral of Eq. 10.72 from
$(-T_{obs}/2, T_{obs}/2)$ to $(-\infty, +\infty)$.
Here too, it is simpler to compute this estimator in the frequency domain:
$$Y_{GW} = \int_{-\infty}^{\infty} d\nu \int_{-\infty}^{\infty} \delta_T(\nu - \nu')\, I_1^*(\nu)\, I_2(\nu')\, Q_\alpha(\nu')\, d\nu' \tag{10.73}$$
where
$$\delta_T(\nu - \nu') = \int_{-T_{obs}/2}^{T_{obs}/2} e^{-i 2\pi (\nu - \nu') t}\, dt = \frac{\sin[\pi (\nu - \nu') T_{obs}]}{\pi (\nu - \nu')} \tag{10.74}$$
$$Q_\alpha(\nu) = \lambda_\alpha\, \frac{\gamma(\nu)}{\nu^3\, S_{n_1}(\nu)\, S_{n_2}(\nu)}\, S_{h_1,h_2} \tag{10.75}$$
where $\lambda_\alpha$ is a constant which depends on the spectral index α of the selected SGWB
model. $S_{n_1}(\nu)$ and $S_{n_2}(\nu)$ are the noise spectral densities of the two detectors. γ(ν) is
the overlap reduction function, which accounts for the reduction in sensitivity due to the
separation and relative misalignment between the two detectors. It is a dimensionless
function with maximum value γ(ν) = 1, attained for two co-located and co-aligned
detectors (Fig. 10.11). An explicit expression of γ(ν) requires a lengthy derivation
(Flanagan 1993). Here we just report the result and discuss its physical meaning.
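$$\gamma(\nu) = \frac{5}{8\pi} \sum_{k=+,\times} \oint_{S^2} F_1^k(\hat{n})\, F_2^k(\hat{n})\, e^{i 2\pi \nu\, \hat{n} \cdot \Delta\vec{x} / c}\, ds \tag{10.76}$$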
The normalization factor 5/8π ensures that γ(ν) = 1 for co-located and co-aligned detectors at
all frequencies, and the sum over the polarizations k is appropriate for an unpolarized
stochastic background. We note also that the integration over the whole two-sphere
solid angle holds for an isotropic stochastic background, while the exponential factor
accounts for the phase shift due to the propagation time of the wave between the two
detectors. Finally, the quantity $F_1^k(\hat{n}) F_2^k(\hat{n})$ is the product of the angular responses
(antenna patterns) of the two detectors to both polarizations of the wave (see Eq. 9.38).
The frequency dependence of the overlap function can be explained as follows: if
the wavelength is comparable to or smaller than the separation between two detec-
tors, the detectors will see different phases of the wave at the same time, and this
phase difference will depend on the direction of propagation of the wave. Since the
stochastic GW background is assumed to be isotropic, averaging over different prop-
agation directions suppresses the sensitivity of a pair of detectors to high-frequency
waves. For example, a wave whose wavelength is twice the distance between the two
detectors will drive them π out of phase if it travels along the line separating them,
but in phase if its direction of propagation is perpendicular to this line.
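The example of the last sentence can be verified directly. In this short sketch the separation of 3000 km is a hypothetical round number (of the order of the distance between the two LIGO sites), and the function name is introduced here for illustration.

```python
import math


def phase_difference(separation, wavelength, angle):
    """Phase lag between two detectors for a plane wave.

    angle is measured between the propagation direction and the line joining
    the detectors, so the extra path is separation * cos(angle).
    """
    return 2.0 * math.pi * separation * math.cos(angle) / wavelength


d = 3.0e6                       # hypothetical separation, m
lam = 2.0 * d                   # wavelength equal to twice the separation
along = phase_difference(d, lam, 0.0)               # wave travelling along the baseline
across = phase_difference(d, lam, math.pi / 2)      # wave travelling perpendicular to it
print(along / math.pi, across / math.pi)
print("frequency with wavelength 2d:", 299_792_458.0 / lam, "Hz")
```

For this separation the wavelength equals twice the baseline at about 50 Hz, which sets the scale above which the averaging over directions starts to suppress the correlated signal.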
Just as in the case of periodic signals discussed in Sect. 10.15, a SGWB is always
present in the detectors. Thus, it is not surprising that, to improve the SNR, we have
to rely on a long observation time $T_{obs}$, i.e. long stretches of data. In order to evaluate
the SNR, we assume the optimal choice of the function Q(ν) and compute the variance
of the estimator Eq. 10.73:
$$\sigma^2_{Y_{GW}} = T_{obs} \int_{-\infty}^{\infty} S_1(\nu)\, S_2(\nu)\, |Q(\nu)|^2\, d\nu \tag{10.77}$$
Note that SNR$^2$ depends linearly on $T_{obs}$. It follows that our accuracy in determining
the spectral properties of $S_h$ grows with $\sqrt{T_{obs}}$, and that of the strain amplitude of the SGWB
with $\sqrt[4]{T_{obs}}$.
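The growth of the SNR with the square root of the observation time can be illustrated with a toy Monte Carlo. This is a deliberately simplified sketch, not the optimal-filter analysis: it assumes white, unit-variance, independent noise in the two detectors, a common white Gaussian background, co-located and co-aligned detectors, and a constant filter. The function name and all numbers are hypothetical.

```python
import numpy as np


def cross_correlation_snr(n_samples, signal_var, n_trials, rng):
    """Monte Carlo SNR of the zero-lag cross-correlation of two detectors.

    Each detector sees the same Gaussian background (variance signal_var)
    plus independent unit-variance white noise; the filter is a constant.
    """
    estimates = np.empty(n_trials)
    for trial in range(n_trials):
        background = rng.normal(0.0, np.sqrt(signal_var), n_samples)
        i1 = background + rng.normal(size=n_samples)
        i2 = background + rng.normal(size=n_samples)
        estimates[trial] = np.mean(i1 * i2)
    return estimates.mean() / estimates.std(ddof=1)


rng = np.random.default_rng(0)
snr_short = cross_correlation_snr(2_500, 0.05, 500, rng)
snr_long = cross_correlation_snr(10_000, 0.05, 500, rng)   # 4 times more data
print(f"SNR ratio = {snr_long / snr_short:.2f} (expected about 2)")
```

The background is far below the noise of each detector in every single sample, yet the mean of the product of the two outputs converges to the background variance, while its fluctuations shrink as the number of samples grows.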
– Assuming a flat frequency dependence, namely α = 0, the constraint is $\Omega_{GW}(\nu) < 1.7 \times 10^{-7}$
– For spectral index α = 2/3 we have $\Omega_{GW}(\nu) < 1.3 \times 10^{-7}$
– Finally, for α = 3 the bound is $\Omega_{GW}(\nu) < 1.7 \times 10^{-8}$. This was computed using a
bandwidth extending up to 300 Hz, and setting to $\nu_{ref}$ = 25 Hz the parameter appearing
in Eq. 7.59.
Today, special and general-purpose software systems for different research
projects and with different features have been established in various laboratories
around the world, providing a solid base for gravitational experiments. Some of
these tools are available to everybody, also outside the collaborations, so that anyone
can try their hand at finding, in the public-domain data, undetected mergers or
periodic signals, or at analysing the background.
See for instance:
https://www.gw-openscience.org/about/
https://asd.gsfc.nasa.gov/archive/astrogravs/docs/mldc/lisa_data_analysis.html
References
Abbott, R., et al.: Upper limits on the isotropic gravitational-wave background from Advanced LIGO and
Advanced Virgo’s third observing run. Phys. Rev. D 104, 022004 (2021)
Abbott, B., et al.: GW151226: observation of gravitational waves from a 22-solar-mass binary black
hole coalescence. Phys. Rev. Lett. 116, 241103 (2016a)
Abbott, B., et al.: (the LIGO and Virgo collaborations): properties of the binary black hole merger
GW150914. Phys. Rev. Lett. 116, 241102 (2016b)
Abbott, B., et al.: All-sky search for continuous gravitational waves from isolated neutron stars
using Advanced LIGO O2 data. Phys. Rev. D 100, 024004 (2019)
Abbott, B.P., et al.: A guide to LIGO Virgo detector noise and extraction of transient gravitational-
wave signals. Class. Quantum Grav. 37, 055002 (2020)
Astone, P., et al.: Search for gravitational radiation with the Allegro and Explorer detectors. Phys.
Rev. D 59, 122001 (1999)
Bonifazi, P., Ferrari, V., Frasca, S., Pallottino, G.V., Pizzella, G.: Data analysis algorithms for
gravitational-wave experiments. Nuovo Cimento 1C, 465 (1978)
Dhurandhar, S.V., Schutz, B.F.: Filtering coalescing binary signals: issues concerning narrow band-
ing, thresholds, and optimal sampling. Phys. Rev. D 50, 2390 (1994)
Flanagan, É.É.: Sensitivity of the laser interferometer gravitational wave observatory to a stochastic
background, and its dependence on the detector orientations. Phys. Rev. D 48, 2389 (1993)
Graps, A.: An introduction to wavelets. IEEE Comput. Sci. Eng. 54, 50 (1995)
Klimenko, S., Mohanty, S., Rakhmanov, M., Mitselmakher, G.: Constraint likelihood analysis for
a network of gravitational wave detectors. Phys. Rev. D 72, 122002 (2005)
Klimenko, S., Yakushin, I., Mercer, A., Mitselmakher, G.: A coherent method for detection of
gravitational wave bursts. Class. Quantum Grav. 25, 114029 (2008)
Maggiore, M.: Gravitational Waves, vol.1: Theory and Experiments. Oxford University Press,
Oxford (2018)
Palomba, C., Astone, P., Frasca, S.: Adaptive Hough transform for the search of periodic sources.
Class. Quantum Grav. 22, S1255 (2005)
Papoulis, A.: Probability, Random Variables and Stochastic Processes. McGraw-Hill, New York
(1965)
Ross, S.W.: Introduction to Probability and Statistics for Engineers and Scientists, 6th edn. Academic
Press (2021)
Schutz, B.F. (ed.): Gravitational Wave Data Analysis. NATO ASI Series, Springer Science+Business
(1989)
Schutz, B.F.: Data processing, analysis, and storage for interferometric antennas. In: Blair, D.G.
(ed.) The Detection of Gravitational Waves. Cambridge Univ. Press, Cambridge (1991)
Thrane, E., Talbot, C.: An introduction to Bayesian inference in gravitational-wave astronomy:
parameter estimation, model selection, and hierarchical models. Publ. Astron. Soc. Aust. 36,
e010 (2019)
Weber, J.: The search for gravitational radiation. In: Held, A. (ed.) General Relativity and
Gravitation, vol. 2, p. 435. Plenum Press, New York (1980)
Zackay, B., et al.: Detecting Gravitational Waves in Data with Non-Gaussian Noise (2019).
arXiv:1908.05644
Zubakov, V.D., Wainstein, L.A.: Extraction of Signals from Noise. Dover Publication (1962)
Space Detectors of GW
11
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_11
carried out from space: a space-based detector is immune from these noise sources
and can be made very long, thus exploring the frequency range down to 20 µHz. We
recall that an antenna is best matched to waves with λ ∼ 2L; to observe GW in the
sub-Hz band we then need L ∼ c/2f ∼ $10^{10}$ m: such a length can only be achieved
in open space. Besides, a space interferometer will not require an arm-long vacuum
enclosure!
LISA will complement ground-based observatories such as LIGO-Virgo-KAGRA,
which detect gravitational waves in the Hz to kHz range.
The region of the spectrum below 1 Hz is very rich in interesting GW sources, and
crucial advances in the understanding of strong-field phenomena are expected from
their observation. The survey of the mHz sky promises to detect tens of thousands
of individual astrophysical sources ranging from White Dwarf binaries in the Milky
Way to mergers of massive black holes (BHs) at red-shifts extending to the very early
universe, beyond the epoch of reionization. Before we detail the scientific goals of a
LISA mission, we need to recall Eq. 7.56
$$\nu \approx \frac{c^3}{5\pi G M} \approx 1.3 \cdot 10^4\ \mathrm{Hz}\ \frac{M_\odot}{M} \tag{11.1}$$
that shows how binary systems of mass larger than $10^4\, M_\odot$ emit GW at frequencies
below 1 Hz. A space-based, low frequency detector is thus suited to observe massive
and supermassive ($M > 10^6\, M_\odot$) black holes.
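A few lines of Python make the mass scaling of Eq. 11.1 concrete. This is only a numerical check of the formula with rounded constants; the function name and the list of masses are chosen here for illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg


def max_gw_frequency(mass_in_solar_masses):
    """Eq. 11.1: nu ~ c^3 / (5 pi G M) for a binary system of mass M."""
    return C**3 / (5.0 * math.pi * G * mass_in_solar_masses * M_SUN)


for m in (1.0, 30.0, 1e4, 1e6):
    print(f"M = {m:8.0e} M_sun -> nu ~ {max_gw_frequency(m):.3g} Hz")
```

The frequency scales as 1/M: stellar-mass systems end up in the band of ground-based detectors, while systems of $10^4\, M_\odot$ and above fall at about 1 Hz and below, in the band accessible only from space.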
The GW astrophysics achievable in the mHz band can be described in terms
of the eight Scientific Objectives that were spelled out in the proposal (Danzmann
et al. 2017), submitted to ESA in 2017.
The physics, astrophysics and cosmology that can be explored by LISA are the subject
of a vast and ever-evolving literature. A thorough review is Amaro-Seoane (2012). For
the purpose of discussing the sources described in this section and their detectability,
it helps to anticipate the expected sensitivity curve for LISA, Fig. 11.1, which will
be derived and discussed in the following sections.
• [Science Objective 1]: Study the formation and evolution of compact binary
stars in the Milky Way.
Millions of compact binaries in the Milky Way galaxy emit continuous
GW signals that are nearly monochromatic in the source reference frame.
This population of Galactic Binaries consists mostly of white dwarfs, but also
of combinations of neutron stars and stellar-mass black holes. The orbital motion
of the detector imparts a characteristic frequency and amplitude modulation (as
described in Chap. 12) on the GW signal, which allows us to constrain the extrinsic
properties of some of the systems. Based on current estimates for the population
of compact stars and assuming the LISA sensitivity shown in Fig. 11.1, tens of
thousands of Galactic Binaries will be individually resolved, with measured masses,
orbital parameters and 3D locations.
Several binaries are currently known, and many more will be discovered in the
near future by GAIA and other astronomy space missions. These systems are
completely characterized by e.m. observations, and their masses and orbital
periods $P = 2/f_{GW}$ are measured to high accuracy, so that the GW signal from
these sources can be accurately predicted. They are guaranteed multi-messenger
sources, called verification binaries, that can be used to calibrate the detector. At
low frequencies, Galactic Binaries are thought to be so numerous that individual
detections will be limited by confusion with other binaries, yielding a stochastic
foreground or confusion signal: LISA will have a GW noise! As shown in Sect. 7.6,
a chirp GW signal passes through the sensitive band of Earth-based
interferometers in the last fraction of a second before merging and can be observed for
up to a few hundred oscillations. A low frequency detector like LISA can instead
observe sources orbiting with periods from hours down to a few seconds for a very
long time, accumulating up to $10^5$ oscillations and harvesting a wealth of information
about the “early” inspiral phase.
• [Science Objective 2]: Trace the origin, growth and merger history of Massive
BHs.
The origin of Massive BHs powering active nuclei and lurking at the centres of
today’s galaxies is unknown. LISA observations will probe massive black holes
over a wide, almost unexplored, range of red-shift and mass, covering essentially
all important epochs of their evolutionary history. LISA sources are coalescing
massive binary black holes, among the loudest sources of GW in the Universe.
They are expected to appear at the cosmic dawn, at red-shift z 11, when the
first galaxies started to form. LISA will also explore black holes at red-shifts
$3 \lesssim z \lesssim 5$, when the star formation rate in the Universe and the activity of Quasi
Stellar Objects (QSOs) and Active Galactic Nuclei (AGN) were highest.
Current studies predict masses for their seeds in the interval $10^3 - 10^5\, M_\odot$ and
formation red-shifts $10 \lesssim z \lesssim 15$. They then grow up to $10^9\, M_\odot$ and
more through two mechanisms: accretion and repeated merging, thus participating
in the clustering of cosmic structures. These events show up in the LISA
frequency spectrum, from a few $10^{-5}$ Hz to a few $10^{-1}$ Hz. Mergers and
accretion influence the spin of Massive BHs in different ways, thus informing us about
their way of growing. The GW signal is transient, lasting from months to days
down to hours. The signal encodes information on the inspiral and merger of the
two spinning BHs and the ring-down of the new Massive BH that is formed. LISA
will provide opportunities to probe the birth and growth of massive black holes
and their host galaxies at red-shift ranges and for halo mass ranges that are not
accessible with other techniques.
• [Science Objective 3]: Probe the dynamics of dense nuclear clusters using
EMRIs.
Extreme Mass Ratio Inspirals (EMRIs) describe the long-lasting (months to years)
inspiral and plunge of stellar-mass black holes or neutron stars, into a Massive
BH of $10^5 - 10^6\, M_\odot$ in the centre of galaxies. The large difference in mass
between the two objects results in a highly complex, highly relativistic orbit with
multiple, time dependent frequency components. The stellar object spends $10^3$
to $10^5$ orbits in close vicinity of the Massive BH, and the orbit displays extreme
forms of relativistic precessions, as described in Chap. 5. The large number of
orbital cycles allows ultra precise measurements of the parameters of the binary
system as the GW signal encodes information about the space-time of the central
massive object. LISA will observe tens to hundreds of these EMRI events, yielding
very precise tests of General Relativity in the strong-field regime (Babak et al.
2017) and also providing unique insight into the population and dynamics of these
objects in galaxies of the local Universe (z < 2). LISA may also be able to detect
GWs from the capture, and eventual disruption, of individual White Dwarfs by
Massive BHs in the nearby Universe, leading to an exciting new multi-messenger
source.
• [Science Objective 4]: Understand the astrophysics of stellar-mass black holes.
The LIGO-Virgo discovery of stellar-mass black holes in the mass range from 10
to 30 $M_\odot$, merging in binary systems in the nearby Universe, has stimulated a
new science objective for a space detector: based on the inferred rates from the
LIGO-Virgo detections, LISA will be able to observe hundreds of such Stellar-
mass BH binaries, but with larger orbital separations, i.e. days or months before
the merging. These rotating sources lose energy by emitting GW, and that leads
to a slow decrease of their orbital separation and to an increase of the orbital fre-
quency. Stellar-mass BHs will therefore be studied by LISA for years before their
orbital frequency increases to the point that they exit the LISA band and enter the
measurement band of ground-based detectors, where their final orbits and merger
will be observed. It will be therefore possible to observe the same coalescence
with both space and ground instruments, offering the intriguing possibility of
multi-band GW observations.
• [Science Objective 5]: Explore the fundamental nature of gravity and black
holes.
As mentioned before, LISA measurements of Massive BH Binaries and EMRIs
will enable us to perform tests of GR in the strong field and dynamical regime.
Measurements that can be performed include:
– Observation of the post-coalescence ring-down of Massive BH binaries will
verify if the resulting objects are the black holes of General Relativity, testing the
no-hair theorem.
– Test of the presence of emission channels not contemplated by GR, like dipole
radiation, to unprecedented accuracy. It will be done by detecting Stellar-mass
BH binaries, which appear in both the LISA and LIGO-Virgo frequency bands.
– Verify the propagation properties of GW: speed and dispersion.
– Dark Matter search: by precise observation of the orbital motion of the “light”
element in an EMRI, we can investigate the presence of non-spherical halos, e.g.
of axions, around the central black hole. These precision tests require loud signals,
that is, Massive BH Binaries with SNR > 100 in the post-merger phase or EMRIs
with SNR > 50.
Observations with LISA will offer opportunities to explore topics of fundamental
physics in the extreme gravity regime, where the gravitational interaction is
enormous and curvatures are large and dynamically changing: the nature of black holes,
the speed of gravity, the Equivalence Principle, the nature of Dark Matter and Dark Energy
and more are discussed in a thorough review (Barausse 2020).
• [Science Objective 6]: Probe the rate of expansion of the Universe.
Let us introduce the concept of standard sirens: some GW sources, like the inspiral
of supermassive black hole binaries, can yield a measurement of the distance,
as pointed out by Schutz (1986) and discussed in Sect. 7.6. In analogy with the
standard candles of e.m. astronomy, these sources are called standard sirens, with
reference to the fact that measuring GW on Earth resembles sound recording more
than image taking. Simultaneous observation in the e.m. spectrum can also
provide the red-shift z and a measure of the Hubble constant, as was done for
the event GW170817 (Abbott et al. 2017b). Observation of a number of binaries
with a total mass $M > 50\, M_\odot$ in the near universe will allow determination of the
Hubble constant to better than 0.02% without e.m. counterparts. Different sources
can provide data at different scales: Stellar-mass BH binaries for z < 0.2, EMRIs
at intermediate scales, z < 1.5, and Massive BH Binaries for the largest distances,
z < 6.
• [Science Objective 7]: Understand stochastic GW backgrounds and their
implications for the early Universe and TeV-scale particle physics.
One of the LISA goals is the direct detection of a stochastic GW background of
cosmological origin in the presence of stochastic foregrounds. Probing a stochastic
GW background of cosmological origin provides information on new physics in
the early Universe. The signal spectrum gives an indication of its origin, while an
upper limit allows us to constrain models of the early Universe and particle physics
beyond the standard model. For these investigations, the availability
of a particular combination of the LISA signals (the Sagnac combination) that
is, at least partially, insensitive to the GW signal is crucial: it allows the GW
background to be separated from instrument noise. The sensitivity should be enough to constrain
the spectral shape of the GW background in the LISA band, as well as to probe
the gaussianity, polarization and level of anisotropy of the measured background.
• [Science Objective 8]: Search for GW bursts and unforeseen sources.
The potential for discovery is probably the strongest motivation for exploring
this spectral region, a yet-to-be observed window on the Universe. Possibili-
ties include both astrophysical sources, like intermediate-mass black holes, and
cosmological sources, like possible GW backgrounds from inflation or early uni-
verse phase transitions, and cosmic string bursts, etc. Distinguishing unforeseen,
unmodeled signals from possible instrumental artifacts will be crucial in explor-
ing new astrophysical systems or unexpected cosmological sources: this is one of
the main challenges of the mission.
The science addressed by LISA is extremely rich and covers many different
domains of astrophysics. As shown in Fig. 11.1, many of the LISA sources have
extremely high SNR, so that their detection will happen even with minor changes
in sensitivity or in the astrophysical estimates of the signals.
The LISA science case is robust, rich and bound to investigate new science.
The idea of a space-based GW detector was first proposed in 1982 by P. Bender and
J. Faller. A six-page paper with hand-drawn figures (Faller et al. 1985) describes the
project, named LAGOS,3 and works out most of the key ideas that are still today
at the heart of a space interferometer, including an arm length of $10^6$ km (i.e. 1 Gm),
a heliocentric orbit at 1 AU, drag-free operation (see below), down to the details of
data transmission to Earth. However, they proposed a LIGO-like interferometer, with
a central station trailing the Earth on the ecliptic, and the two end-mirror stations
rotating around it. Several missions were proposed to both ESA and NASA in the
1990s: they were based on a six-spacecraft design, orbiting either the Earth, like
SAGITTARIUS4 or OMEGA,5 or the Sun, like LISAG.6 The latter was
proposed as a medium-size mission to the European Space Agency in 1993. The
design was then refined to a triangular configuration of three identical spacecraft
(S/C) with three 5 Gm arms, orbiting the Sun. In 1997 the two space agencies joined
forces on LISA. The design of the mission progressed, in the frame of ESA’s Cosmic
Fig. 11.2 Sketches of some pre-LISA proposals: from the left: the Gravitational Wave Interfer-
ometer (GWI), with an Aluminium structure to be extruded in space, 1 km armlength; OMEGA,
orbiting the Earth and LISAG, close to the final design, but with 6 spacecrafts. Credits: NASA
ences around the three sides of the triangle, we obtain a Sagnac signal (see Chap. 13)
that is insensitive to GW. Unlike terrestrial detectors, whose output (at
present) contains GW information only in rare, sporadic short bursts, LISA will be flooded
with continuous GW signals; for this reason, the Sagnac combination offers the
opportunity to better characterize the instrumental noise and calibrate the detector
without the signal. This triple interferometer configuration is so promising that it is
also being considered for the Einstein Telescope, the next generation of Earth-based
interferometers.
A key ingredient of the LISA design is the so-called smart orbit: the triangular configuration
is maintained, in an almost rigid way, by placing the spacecraft on three
independent orbits with the following characteristics (Fig. 11.3)
$$e \sim \frac{\alpha}{\sqrt{3}} + \frac{2\alpha^2}{3}$$
(f) The three orbits are identical, but shifted by 2π/3 with respect to each other.
(g) The distance from the Earth, about 20° or 50 Gm, is chosen as a compromise
between long-term stability of the constellation (minimization of Earth pertur-
bations) and communications requirements.
We have reviewed in Sect. 1.3 how the equations of motion of a generic body
orbiting the Sun, like any one of our three spacecraft, can be expressed in
terms of the eccentric anomaly ψ, defined by the equation
$$\psi - e \sin\psi = \Omega t,$$
where $\Omega = 2\pi/T = \sqrt{GM/R^3}$ is the mean orbital angular velocity.
[Fig. 11.3 labels: Earth; Sun; ℓ = 2.5 Gm; R = 1 AU (150 Gm)]
Fig. 11.3 Left: a sketch of the LISA constellation. On the right, a sketch of its motion of rotation
and heliocentric revolution. The constellation size and the orbit inclination are not to scale
Each spacecraft of the LISA constellation has its own eccentric anomaly, phase
shifted (see previous item f) by ±2π/3 with respect to the others:
$$\psi_k - e \sin\psi_k = \Omega t - \frac{2\pi(k-1)}{3} \qquad (k = 1, 2, 3) \qquad (11.2)$$
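Kepler's equation in the form of Eq. 11.2 has no closed-form solution, but for the tiny eccentricity of the LISA orbits it is easily solved numerically. The following Python lines are a minimal sketch, not the procedure used by the mission: they assume the definition α = ℓ/(2R) (not stated in this passage), a sidereal year of 365.25 days, and use a plain Newton iteration. All function and constant names are illustrative.

```python
import math

ARM = 2.5e9                 # arm length l in metres (value quoted in Fig. 11.3)
R_ORBIT = 1.496e11          # 1 AU in metres
ALPHA = ARM / (2 * R_ORBIT)  # ASSUMPTION: alpha = l / (2R)
ECC = ALPHA / math.sqrt(3) + 2 * ALPHA**2 / 3   # eccentricity, as in the text
YEAR = 365.25 * 86400.0     # orbital period T in seconds (assumed)
OMEGA = 2 * math.pi / YEAR  # mean orbital angular velocity


def eccentric_anomaly(mean_anomaly, e, tol=1e-14, max_iter=50):
    """Solve psi - e*sin(psi) = mean_anomaly for psi with Newton's method."""
    psi = mean_anomaly  # good starting guess when e is small
    for _ in range(max_iter):
        residual = psi - e * math.sin(psi) - mean_anomaly
        step = residual / (1.0 - e * math.cos(psi))
        psi -= step
        if abs(step) < tol:
            break
    return psi


def anomalies(t):
    """Eccentric anomalies of spacecraft k = 1, 2, 3 at time t (Eq. 11.2)."""
    return [
        eccentric_anomaly(OMEGA * t - 2 * math.pi * (k - 1) / 3, ECC)
        for k in (1, 2, 3)
    ]


if __name__ == "__main__":
    print("eccentricity:", ECC)
    print("anomalies after a quarter orbit:", anomalies(0.25 * YEAR))
```

Because e is of order 5 × 10⁻³, the iteration converges in a few steps from the starting guess ψ = Ωt − 2π(k − 1)/3.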
• The plane of the triangle maintains a constant inclination of π/3 over the ecliptic
plane,10 with constant illumination from the Sun, thus avoiding periodic thermal
effects that could affect the sensitive components like optics and sensors.
• The triangle centre moves on the ecliptic, trailing the Earth by an angle chosen
at ∼20°, or 50 Gm. The distance to Earth is rather stable, thereby reducing the
9 The motion of the LISA constellation can be visualized through the animated gif: https://it.
[Fig. 11.4 axes: arm lengths L_ij (Gm) and line-of-sight velocities dL_ij/dt (m/s), both versus t (years)]
Fig. 11.4 Right: Flexing of LISA arms over a 6-year period: the three arm lengths change in
different ways. The oscillations are not purely periodic due to the drift away effect. Left: velocities
$\dot L_{ij}$ of the spacecraft along the line of sight. Adapted from Pucacco et al. (2010)
However, the word almost in the previous sentence must be taken seriously:
the tidal forces on the constellation (the spacecraft have different instantaneous
distances from Earth and Sun) cause the distances between the satellites to change by
roughly 2%, resulting in a differential velocity, in the direction towards the opposite
spacecraft, of up to 10 m/s and in slight changes of the constellation shape (Fig. 11.4).
This change of arm length, called flexing, has important consequences for the design
of the detector:
In the next section we shall see how to deal with these issues.
A refined analysis must also consider perturbations: the Earth and other planets,
the Sun quadrupole, tides, relativistic effects. Modern integrator codes solve the
orbits numerically to the required accuracy, but the methods of celestial mechanics
also allow an analytical evaluation of these effects.
11.5 LISA: The Instrument 291
Mission Concept: Each spacecraft houses two test masses (TM), kept as close as
possible to free fall, that form the reference points for an interferometric measurement
of the inter-spacecraft distance. The TMs take the place of the mirrors in ground
interferometers: a change in the distance between two distant TMs is the signature
of the passage of a GW. The hardware on the spacecraft must provide shielding from
external, non-GW disturbances and measure the TM separations.
Building an interferometer in space is not a simple task, and each LISA spacecraft is a
complex instrument: a detailed description of the numerous sub-units can be found
in Jennrich (2009). We shall follow, in this section, the instrument description as
outlined in the proposal (Danzmann et al. 2017), but the reader should be aware that
the technologies are continuously evolving.
Each of the three spacecraft contains two identical scientific units, serving the
two arms pointing at the other two spacecraft. As shown in the simplified drawing
of Fig. 11.5, each unit contains:
– a free-falling test mass (TM) inside a gravitational reference sensor (GRS)
– an infrared laser system
– an optical bench to condition the light and form optical beat notes between the
local laser and lasers from other units
– a phasemeter to measure the phase evolution of each of these beat signals
– a telescope of 30 cm diameter to send and receive the laser beams to and from the
other spacecraft.
The two assemblies are mounted in a common frame that allows rotation of each
assembly about the vertical axis by about 2° in order to track the breathing of the
constellation vertex angles.
The Observable: Phase Shift Versus Frequency Shift
Ground-based interferometers are operated in the Long Wavelength Limit, $L \ll \lambda$,
and the response to GW is usually defined, by the geodesic deviation equation, in
the frame associated with the beam splitter. We then chose, in Sect. 9.6.1, an inertial
frame covering the whole instrument, which simplifies the calculations.
For LISA, the Long Wavelength Limit does not hold over much of its frequency
band: $10^{9}\ \mathrm{m} \le \lambda \le 10^{13}\ \mathrm{m}$. The computed detector response must then account for
time delays in the response of the instrument to the waves, and travel times along
beams in the instrument. It is convenient to describe the interaction in the “TT”
(transverse-traceless) frame which can be seen as “co-moving” with the GW: in this
frame the coordinate distance between the Test Masses does not change (but the
metric does!).
In standard interferometers the light from the two arms is combined optically and
the phase of each beam impinging on the recombining beamsplitter is not known.
In LISA, instead, each incoming beam is combined optically with a reference beam
Fig. 11.5 Left: schematic block diagram of one instrumentation unit of a LISA mission: each
spacecraft contains two such units, pointing to and receiving light from the two other spacecraft.
Credit: The LISA Consortium (2013). On the right, a sketch of the spacecraft constellation: each
spacecraft has two units, connected by a fibre optics back link, a simplified science (or long-arm)
interferometer and the telescopes pointing at the other, distant spacecraft. The grey squares represent
the Test Masses. Credit: Jennrich (2009)
individually, so that the phase of the incoming light is individually measured and
recorded. Recombination of the phases of the two beams is performed off-line. Thus,
we will regard LISA not as constituting one or more conventional Michelson inter-
ferometers, but rather as an array of six one-arm delay lines between the test masses.
The detector response is best described in terms of observed differential frequency
shifts (Doppler shifts) of the laser frequency, rather than in terms of phase shifts as
used in ground-based interferometry; these data streams can obviously be derived
one from the other.
$$\frac{f_{rec} - f_{emit}}{f} = \frac{1}{2}\left[h_{rec}(t) - h_{emit}(t - L/c)\right] \qquad (11.4)$$
Fig. 11.6 Schematics of the distance measurement between two test masses, split in three different
interferometers. Adapted from Jennrich (2009)
For practical reasons, the distance between TMs is measured by summing the signals
from three interferometers: the position of each TM is measured with respect to its
own optical bench, and a third interferometer measures the inter-satellite distance,
from one optical bench to the other. Such a partition of the measurement would
be normally avoided, as it increases the detection noise. However, the noise budget
for LISA is dominated by the contribution of the shot noise in the measurement
between the spacecraft, and the amount of noise added by this technique is negligible
(Fig. 11.6).
Reflection interferometry, as used in LIGO-Virgo, cannot work for a Gm path
length, for the following reason: to point the light beam towards the far spacecraft,
LISA will use infrared (λ = 1064 nm) laser light, with an emitted power PE = 2 W,
and a telescope with diameter D = 30 cm. The beam divergence, for a diffraction
limited telescope, is
$$\theta_d = \frac{\lambda}{D} = \frac{1.064\ \mu\mathrm{m}}{30\ \mathrm{cm}} = 3.5\ \mu\mathrm{rad}$$
When the beam emitted by spacecraft 1 (S/C1) reaches S/C2 at the far end of the
arm, the beam diameter is $2\theta_d L \sim 18$ km, and the telescope collects a fraction $P_{in}$ of
the emitted power
$$P_{in} = P_E \left(\frac{D}{2\theta_d L}\right)^2 = P_E\,\frac{D^4}{4\lambda^2 L^2} \sim 0.7\ \mathrm{nW} \qquad (11.5)$$
making it unthinkable to reflect the light back. The adopted alternative approach is
the transponder scheme with offset phase locking (Fig. 11.7).
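The numbers in this link budget are easy to reproduce. The short Python sketch below evaluates the diffraction angle, the beam diameter at the far spacecraft and the collected power of Eq. 11.5, using the values quoted in the text and assuming L = 2.5 Gm; the variable names are illustrative.

```python
WAVELENGTH = 1.064e-6   # laser wavelength in metres
P_EMIT = 2.0            # emitted power P_E in watts
D_TEL = 0.30            # telescope diameter D in metres
ARM = 2.5e9             # arm length L in metres (assumed)

# Diffraction-limited divergence of the transmitted beam
theta_d = WAVELENGTH / D_TEL

# Beam diameter when it reaches the far spacecraft
beam_diameter = 2 * theta_d * ARM

# Collected power: geometric ratio of telescope to beam diameter, squared
p_in = P_EMIT * (D_TEL / beam_diameter) ** 2

# Same quantity from the closed form of Eq. 11.5
p_in_closed = P_EMIT * D_TEL**4 / (4 * WAVELENGTH**2 * ARM**2)

print(f"theta_d       = {theta_d * 1e6:.2f} urad")
print(f"beam diameter = {beam_diameter / 1e3:.1f} km")
print(f"P_in          = {p_in * 1e9:.2f} nW")
```

With these inputs the collected power is a fraction of a nanowatt, roughly ten orders of magnitude below the emitted 2 W, which is why a passive reflection is ruled out.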
The received light on spacecraft S/C2 is combined on the optical bench with light
derived from the local, transmitting laser (2), to produce a heterodyne beat note in
the MHz region. The phase of the beat note can be measured with high precision
(µrad/√Hz) by an electronic phasemeter: it contains the information about the
Fig. 11.7 A simplified scheme of the phase-lock system aboard the two spacecraft of GRACE-Follow On.
The light beam emitted by laser 1 (master) has frequency $f_1$; it is received on S/C2
with a Doppler frequency shift $f_D = v_{rel}/\lambda$; here, a second shift $f_{off}$ is added to it. Laser 2
(transponder) is locked to this frequency, and sends it back to S/C1; here it is received with an
additional shift $f_D$ (identical, if the light travel time is short); interfering the received beam with
the local oscillator $f_1$ leaves a low frequency (MHz) signal at frequency $2 f_D + f_{off}$. As $f_{off}$ is
known, the measurement yields the value of $2 f_D$, hence the relative velocity $v_{rel}$. Courtesy of G.
Heinzel (Albert Einstein Institute)
Doppler shift from the relative motion of the two spacecraft. By controlling the
frequency of the laser (2), we can make the phase of the transmitted beam (2) a true
copy of the phase of the received light. When received back on spacecraft S/C1
and recombined with the local oscillator, the resulting beat note contains $f_{off}$, the
Doppler shift $f_D$ due to arm flexing and the signal $f_{GW}$ due to the GW, as well as
some noise due to local accelerations.
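The bookkeeping of the transponder scheme can be made concrete with a few lines of Python. This is only an illustrative sketch of the relations given in the caption of Fig. 11.7 (one-way Doppler shift v_rel/λ, round-trip beat note at 2 f_D + f_off); the offset frequency of 5 MHz used in the example is a hypothetical value, not a mission parameter, and GW and noise terms are ignored.

```python
WAVELENGTH = 1.064e-6  # laser wavelength in metres


def doppler_shift(v_rel):
    """One-way Doppler shift f_D = v_rel / lambda (Hz) for a line-of-sight velocity in m/s."""
    return v_rel / WAVELENGTH


def round_trip_beat(v_rel, f_off):
    """Beat-note frequency seen on the master spacecraft: 2*f_D + f_off."""
    return 2 * doppler_shift(v_rel) + f_off


def velocity_from_beat(f_beat, f_off):
    """Invert the relation above: recover v_rel from the measured beat note."""
    return (f_beat - f_off) * WAVELENGTH / 2


if __name__ == "__main__":
    f_off = 5.0e6   # HYPOTHETICAL offset frequency in Hz
    v_rel = 10.0    # line-of-sight velocity in m/s, the upper bound quoted for flexing
    beat = round_trip_beat(v_rel, f_off)
    print(f"one-way Doppler shift: {doppler_shift(v_rel) / 1e6:.2f} MHz")
    print(f"round-trip beat note : {beat / 1e6:.2f} MHz")
    print(f"recovered velocity   : {velocity_from_beat(beat, f_off):.3f} m/s")
```

A velocity of 10 m/s corresponds to a one-way shift of about 9.4 MHz, which shows why the beat notes fall in the MHz region.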
With the same procedure, we can phase-lock the two lasers on the same spacecraft
and, in the end, all six lasers can be locked to a single, designated master laser. To
operate such a scheme, we need two phase measurements for each arm and one
between the two lasers of each spacecraft. In addition, the position of each test mass
with respect to the optical bench must be measured interferometrically, requiring
six more phase measurements. A total of fifteen phase measurements thus carry the
complete information on the relative motion of the spacecrafts.
This operating mode makes it convenient to consider LISA not as one or more
conventional Michelson interferometers, but rather as a closed array of six one-arm
delay lines between the test masses.
Communication among LISA spacecraft is achieved by imprinting, with an
Electro-Optic Modulator, several weak auxiliary modulations on the transmitted
laser light: clock noise, bi-directional timestamp synchronization, absolute range
distances and other data are exchanged among the spacecraft.
The transponder technique has been tested in the lab and, more recently, in space:
as mentioned in Sect. 1.7, the geodesy mission GRACE-Follow On (Abich et al.
2019), launched in 2018, has an inter-satellite optical link that works just like the one
designed for LISA. Although the distance between the satellites is only 250 km, the
instrument successfully passed the crucial tests of beam acquisition, phase locking
The Interferometric Measuring System uses optical benches (OBs) that are constructed,
in the laboratories of the University of Glasgow, using the hydroxide-catalysis bonding
technique (van Veggel and Killow 2014): all components (mirrors, beam splitters,
plates, ...), made of fused silica, are aligned and bonded to a fused silica baseplate
via a procedure that uses an aqueous hydroxide solution to form a chemical
bond between oxide or oxidisable materials (e.g. SiO2, sapphire, silicon and
SiC). This method forms strong, thin bonds without using epoxy or glue: the resulting
bench is a quasi-monolithic object, with components aligned to ∼10 µm precision.
The technique has been used in space for GP-B (Sect. 5.8.1) and LISA-Pathfinder
(Sect. 11.7) as well as in LIGO, GEO and Virgo for the mirror suspensions.
Each optical bench (Fig. 11.8) hosts one science interferometer for the light received
from the distant spacecraft, one local, or test mass, interferometer which monitors
the position and orientation of the test mass, and a reference interferometer.
Construction techniques for the optical bench with the required alignment accuracy
(∼10 µm) and pathlength stability in orbit (pm/√Hz) have been demonstrated with
LISA-Pathfinder (LPF). The main laser field is injected via a single mode optical
fibre and distributed via several beam splitters and mirrors to the different interferometers
and additional sensors such as power monitors. A signal of a few mW is also
exchanged between the two optical benches on each spacecraft via a bi-directional
Fig. 11.8 The optical bench of LISA-Pathfinder: all optical elements are bonded with the hydroxide-catalysis
technique. In the background the Test Mass can be seen, past the optical window of the
vacuum enclosure and past the coupling hole of the Electrode Housing. The picture is taken from
the position that eventually hosted the second GRS. Integration of the sub-systems was carried out
by Airbus Defence and Space GmbH (Friedrichshafen), which also provided this photo
backlink: either an optical fibre, or a free beam path between both OBs. The OB
has optical interfaces with the test mass (local interferometer) on one side and the
telescope (science interferometer) on the other side. Each telescope has an aperture
of about 30 cm diameter and serves simultaneously both to transmit and receive the
beams along the respective arm. The amount of backscattered transmitted light into
the receiving path must be minimized, and this is achieved through an off-axis design
with up to 6 curved reflectors, some aspherical, requiring a surface figure accuracy of
∼ 30 nm. The test mass interferometer measures the distance between the test mass
and the optical bench by reflecting light off the TM and combining this measurement
beam with a local oscillator on the optical bench.
Besides the breathing of the constellation mentioned above, another issue needs
to be addressed when doing interferometry in space: the light travel time between
spacecraft is L/c ∼ 8 s. Light should therefore be sent not towards the present position
of the far spacecraft, identified by the wavefront of the incoming beam, but in
the direction where the far spacecraft will be in a time L/c from now. This task is
performed by the so-called Point Ahead Angle Mechanism, an actuator based on a
rotatable mirror that must steer the beam without introducing any extra optical path
or angular jitter.
Differential Wavefront Sensing
The laser beams in the science and local interferometers are monitored by InGaAs
quadrant photodiodes: each device has a sensitive area (2 mm in diameter) divided into 4
quadrants, read separately. The phasemeter processes the signals from each segment
using the Differential Wavefront Sensing (DWS) technique (Pierce et al. 2008):
proper combinations of the four outputs $\phi_i$ provide both the optical path difference,
as a sum of all quadrants, and the angle, in its horizontal and vertical components, between
the interfering wavefronts, as shown in Fig. 11.9.
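The quadrant combinations can be sketched in a few lines of Python. The quadrant layout (A top-left, B top-right, C bottom-left, D bottom-right), the use of an average rather than a plain sum, and the toy linear wavefront model are assumptions made for illustration; the real phasemeter also applies calibration factors that convert phase differences into angles, which are omitted here.

```python
def dws_signals(phi_a, phi_b, phi_c, phi_d):
    """Combine the four quadrant phases (rad) into three DWS observables.

    ASSUMED layout: A top-left, B top-right, C bottom-left, D bottom-right.
    Returns (longitudinal, horizontal, vertical):
      longitudinal: average of the four phases, tracks the optical path difference
      horizontal  : right half minus left half, tracks the horizontal tilt
      vertical    : top half minus bottom half, tracks the vertical tilt
    """
    longitudinal = (phi_a + phi_b + phi_c + phi_d) / 4.0
    horizontal = (phi_b + phi_d) - (phi_a + phi_c)
    vertical = (phi_a + phi_b) - (phi_c + phi_d)
    return longitudinal, horizontal, vertical


def tilted_wavefront_phases(phi0, grad_x, grad_y):
    """Toy model: phase phi0 + grad_x*x + grad_y*y sampled at quadrant centres (+-1, +-1)."""
    centres = {"A": (-1, 1), "B": (1, 1), "C": (-1, -1), "D": (1, -1)}
    return {q: phi0 + grad_x * x + grad_y * y for q, (x, y) in centres.items()}


if __name__ == "__main__":
    p = tilted_wavefront_phases(phi0=0.3, grad_x=0.01, grad_y=-0.02)
    print(dws_signals(p["A"], p["B"], p["C"], p["D"]))
```

In this toy model a pure piston phase appears only in the first output, while a pure tilt leaves the first output unchanged and shows up only in the corresponding difference signal.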
The Test Masses (TMs) are the nominal reference points of the experiment, the
elements in free fall along geodesic lines, whose distance is modified by the GW.
Geodesic motion is made possible by the technique called drag-free motion of satellites:
Fig. 11.9 Top: a quadrant photodiode (Otron Sensor) and a schematic indicating the four outputs.
Bottom: schematic of the Differential Wavefront Sensing. Two beams (flat profile) interfere on
the photodetector at an angle α: the phase difference is constant on average but varies along the
photodiode surface: one quadrant photodiode reads the path difference and two inclination angles
between beam 1 and beam 2. If the two beams have slightly different frequencies (δf ∼ 10 MHz),
all signals oscillate at δf, and a demodulation scheme can be applied
• If the TM moves with respect to the spacecraft, it is the spacecraft that must be
recentered around the Test Mass.
Fig. 11.10 The Test Mass (left) and the Electrode Housing (right) of LISA-Pathfinder. On the TM
the recesses hosting the grabbing mechanism (on the edges) and for the soft release (centre of top
face) are visible. On the electrode housing we can see some of the x and y electrodes and the entry
holes for the plunger and the laser beam at the face centres, for the caging fingers at the edges.
Photos courtesy of OHB-Italia
for high reflectivity, as it constitutes one end mirror of the local interferometer, which
measures its position with respect to the OB, i.e. to the spacecraft. It has a mass
$m_{TM} = 1.96$ kg and is machined with a few indentations that mate with the caging
device that keeps it in position during launch. The TM and the Electrode Housing
used in LISA-Pathfinder are shown in Fig. 11.10 and no relevant modifications are
expected for LISA.
The Gravitational Reference Sensor (GRS) is composed of the TM, the Electrode
Housing surrounding it, and some additional devices to serve numerous purposes:
• It shields the TM and limits stray forces, to meet the free-fall requirement on the
x axis: ∼ 3 fm/s²/Hz^{1/2} at 1 mHz.
• It is padded with capacitor plates: two plates facing each TM side form six capacitance
bridges (two bridges for each pair of cube sides) with the TM as the central electrode.
Six other plates are needed to inject a 100 kHz “pump” voltage used to
polarize the bridges. The bridges are used simultaneously for both readout and actuation:
different combinations of the twelve bridge signals provide information
and exert electrostatic force or torque on five DoFs. The x DoF is read by the
local interferometer and no force is actuated on it, to preserve its free fall, but the
electrostatic readout is available as a backup.
• The capacitor gaps are wide: 3–4 mm depending on the axis, to reduce force noise
from stray electrostatics and residual gas effects;
• The actuation authority,12 when operating in science mode (with maximum sensitivity
and reduced range, see below) must be enough to compensate linear accelerations
of nm/s² and angular accelerations of 10 nrad/s².
• The GRS comprises the Grab-Release-Position Mechanism mentioned above: a
first set of “large fingers” to hold the TM in position with a force of ∼ 1 kN at
launch, and a second “soft” set (the plunger) to release it into free fall. This must
$$a = \frac{F_{tm}}{m} - \omega_s^2 X_{tm} \qquad (11.8)$$
• The GRS is complemented with a number of auxiliary sensors that monitor tem-
perature, magnetic fields, cosmic ray flux.
The Front End Electronics. Specially designed electronics measure the capacitance
change in all the bridges and provide, to the same capacitors, audio band
voltages to generate the electrostatic force or torque actuations, as required by the
feedback loop. When operated in its high sensitivity “Science Mode”, with the range
limited to ±100 µm, its noise level is roughly 1 aF/√Hz ($10^{-18}$ F/√Hz!). This
corresponds to a displacement sensitivity as low as 1.2 nm/√Hz and a rotation
sensitivity as low as 83 nrad/√Hz.
– Colloid thrusters: small droplets of an ionic liquid are produced and ionized by
an electro-spray process, accelerated in an electric field and ejected from the
thruster. Developed at NASA JPL and tested on LPF, they are now commercially
available. However, they provide a small amount of thrust, no more than 15 µN.
– Field Emission Electric Propulsion: a method based on field ionization of a liquid
metal (Caesium, Indium or Mercury) and subsequent acceleration of the ions by
a strong electric field. Emission from a tip or a slit is achieved with fields of
∼ $10^9$ V/m. They were considered for LPF but, due to reliability concerns, were
eventually replaced by cold gas thrusters.
– Cold Gas thrusters: they use the expansion of an inert, pressurized gas through
a nozzle to generate thrust. It is a simple and reliable propulsion system, with
the drawback of needing a larger mass of propellant than the other considered
technologies. They were space-tested by GAIA and later adopted by Microscope
and LPF.
– Radio-frequency Ion Thrusters: an r.f. field ionizes xenon atoms to form a plasma.
The heavy positive ions are then accelerated by an electrostatic field and ejected to
produce thrust. The plasma is then neutralized by adding electrons from a neutralizer,
which prevents the satellite from becoming charged. Still under development,
they would require substantially less propellant mass than cold gas.
The spacecraft and the two TMs constitute an 18-DoF system, which must be properly
kept centred and oriented, i.e. towards the incoming beam. The thrusters and
the control law (see Appendix B) commanding them constitute the Drag-Free and
Attitude Control System (DFACS), a key component of LISA.
The sensitivity of LISA is limited by three main noise sources in different frequency
regions, as summarized by the spectral sensitivity of Fig. 11.11.
allowing the noise to grow below 400 µHz, where the 1/f behaviour occurs, and
above 8 mHz, where cross-talk from other DoFs can leak into the x sensing.
Detailed noise budget analyses count dozens of sources of acceleration noise. We
mention here the most relevant:
– Brownian motion, due to the impact of residual gas molecules on the TM. Due to the
small gaps (mm) between the TM and its housing, multiple bounces on the walls
enhance this effect.
– Electrostatic forces: the TM and the facing electrodes are overall equipotential, but
inevitably exhibit patch effects, i.e. the presence of charge domains on the surfaces,
due to contamination and/or varying work function. As charge accumulates and
fluctuates on the TM because of cosmic rays, fluctuating stray forces appear.
[Fig. 11.11 plot: spectral amplitude $\sqrt{S_h}$ (Hz$^{-1/2}$) versus frequency (Hz), from $10^{-4}$ to $10^{0}$ Hz; annotations: acceleration noise (3 fm/s²/√Hz), shot noise (10 pm/√Hz), ν = c/4L]
Fig. 11.11 The expected sensitivity curve $\sqrt{S_h(\nu)}$ for LISA
Hence the need for periodically discharging the Test Mass and for compensating
d.c. stray fields.
– Thermal gradients can appear across the TM, due to thermal radiation pressure or
radiometric effects. A temperature difference turns into a net force. A fluctuating
temperature difference across the TM of 10 µK/√Hz would be a concern.
– Magnetic forces, arising from fluctuating fields, of either interplanetary or local
(spacecraft) origin. These fields produce a fluctuating magnetic moment in the
TM, which interacts with any magnetic gradient present in the spacecraft.
– Gravity fluctuations: any change in local gravity (caused, e.g. by the consumption
of thruster fuel, or by settling of the structures) will produce a pull on the TM.
– DC forces and torques: continuous forces and torques are applied to the TMs to
compensate for the different local gravity and, in general, for all the DC terms
affecting the TM. A small fraction of these feedbacks, applied to any DoF, can
leak into the “GW sensitive” x axis. This is a particularly serious issue for the
actuation on the φ rotation, because it uses the same electrodes as x, and for the
y translation, due to the fact that the y directions in the two TMs are not parallel,
as shown in Fig. 11.5. Moreover, non-diagonal stiffness elements can couple any
DoF of a noisy spacecraft motion into x motion.
Many of these noise sources have been tested, measured and analysed in the LISA-
Pathfinder mission, described in Sect. 11.7.
- Several noise sources, related to the optical measuring system, are summarized
under the label Shot Noise: the three interferometers (long-arm, TM readout, reference),
thermal noise in the optical components, etc. The shot noise, see Sect. 9.6.3,
has a white spectrum with a $1/f^2$ tail below ∼ 2 mHz, and the most relevant term is
that of the inter-spacecraft interferometer. When expressed in GW h units, it takes
the form of Eq. 9.46:
$$S_h^{(shot)}(\nu) = \frac{1}{L^2}\,\frac{\hbar c \lambda}{2\pi \eta P_{in}}$$
where Pin is the detected light power and η the photodetector quantum efficiency. It
is worth noticing that, for LISA and any Gm long interferometer with a given emitted
power PE and telescope diameter D, the shot noise contribution is independent of
the arm length L: indeed, by combining this last relation with Eq. 11.5 we obtain
$$S_h^{(shot)}(\nu) = \frac{2\hbar c \lambda^3}{\pi \eta P_E D^4} \qquad (11.9)$$
showing that the shot noise spectrum only depends on the emitted laser power
and the telescope diameter. The interferometer displacement sensitivity is around
10 pm/√Hz for each arm, dominated by shot noise in the long-arm interferometer.
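The cancellation of the arm length can be verified numerically. The Python sketch below evaluates the shot-noise strain spectrum in two ways: from the received power of Eq. 11.5, and from the closed form of Eq. 11.9. It assumes that the prefactor contains the reduced Planck constant ħ, as required on dimensional grounds, and uses a hypothetical quantum efficiency η = 0.8; both are assumptions of this example.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s (ASSUMED to enter the prefactor)
C = 299792458.0          # speed of light in m/s
WAVELENGTH = 1.064e-6    # m
P_EMIT = 2.0             # emitted power P_E in W
D_TEL = 0.30             # telescope diameter D in m
ETA = 0.8                # HYPOTHETICAL photodetector quantum efficiency


def received_power(arm):
    """Collected power P_in from Eq. 11.5 for an arm of length `arm` (m)."""
    return P_EMIT * D_TEL**4 / (4 * WAVELENGTH**2 * arm**2)


def shot_strain_psd(arm):
    """Shot-noise strain PSD (1/Hz) written in terms of P_in and L."""
    return HBAR * C * WAVELENGTH / (2 * math.pi * ETA * received_power(arm)) / arm**2


def shot_strain_psd_closed():
    """Closed form of Eq. 11.9: no dependence on the arm length."""
    return 2 * HBAR * C * WAVELENGTH**3 / (math.pi * ETA * P_EMIT * D_TEL**4)


if __name__ == "__main__":
    for arm in (1.0e9, 2.5e9, 5.0e9):
        print(f"L = {arm / 1e9:.1f} Gm  sqrt(S_h) = {math.sqrt(shot_strain_psd(arm)):.3e} /sqrt(Hz)")
    print(f"closed form  sqrt(S_h) = {math.sqrt(shot_strain_psd_closed()):.3e} /sqrt(Hz)")
```

The three arm lengths give the same value: a longer arm collects less light, but the 1/L² loss in power is exactly compensated by the 1/L² gain in strain per unit displacement.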
11.6 Noise and Sensitivity 303
This is about nine orders of magnitude worse than what is achieved by ground-based
detectors in the “sweet spot” above 100 Hz, due to the different environments of
the two detectors: in LISA the test masses are in motion, and the received light is
measured in nW rather than tens of W. The shot noise from the local interferometers
and from the back link is negligible with respect to this. Equation 11.9 and current
technology set the shot noise limited GW sensitivity at $10^{-20}/\sqrt{\mathrm{Hz}}$. This noise is
dominant for intermediate frequencies, between approximately 3 and 30 mHz.
The sensitivity is reduced for frequencies above 30 mHz: not because of the onset of
other noise sources, but because of the transfer function of the detector. Computing the antenna
pattern of a space detector is a more complex task than what was done in Sect. 9.6.2 for terrestrial
interferometers, for two main reasons: we cannot apply the long-wavelength
approximation, and the detector moves (revolution and rotation) during the measurement,
which can last months. One has to be satisfied with computing quantities
averaged over time, directions and polarizations, and even so a part of the calculation
must be carried out numerically (Larson et al. 2000). The essential feature of this
computation is that the effect of gravitational waves starts to cancel out as soon as
the optical path in the detector (2L) approaches a multiple of the half wavelength of
the GW. As mentioned above, for λ < 4L multiple wavelengths of the GW fit into
the arms, causing partial cancellation of the signal. Actually, the detector response
should be zero at all frequencies that satisfy the relation ν = n c/4L with n integer.
This is not quite the case, as can be seen in Fig. 11.11: the peaks rise well above
the noise level, but remain finite.13 This is the only benefit of arm flexing: due to the
unequal length of the 3 arms in LISA, signal cancellation does not take place on all
the links.
We are now ready to discuss the value of the inter-satellite distance, or arm length
L. Intuitively, it should pay off to extend the distance as much as is practical (e.g.,
the 5 Gm of the original 1998 LISA proposal); however, we have just shown that the
high frequency rise and the peaks of $S_h$ start at a frequency ν = c/4L, inversely
proportional to the length. On the other hand, the acceleration noise $\delta L_{acc}$ at low
frequency is due to local disturbances, and is independent of the distance L from the
other spacecraft: hence, the strain noise $\delta L_{acc}/L$ scales with 1/L. A longer arm
improves sensitivity at the low end of the band but reduces the bandwidth at the
high end. The shot noise has no role in this optimization, being independent of L, as
shown in Eq. 11.9. The choice L = 2.5 Gm balances these opposite requirements,
and yields the expected noise curve for LISA shown in Fig. 11.11.
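The two competing scalings of this trade-off can be tabulated with a short Python sketch. It only evaluates the relations quoted above, ν = c/4L for the start of the high-frequency degradation and δL_acc/L for the low-frequency strain noise; the three arm lengths and the normalization to the 2.5 Gm case are choices made for illustration.

```python
C = 299792458.0  # speed of light in m/s


def corner_frequency(arm):
    """Frequency nu = c / (4 L) at which the response starts to degrade (Hz)."""
    return C / (4 * arm)


def relative_acceleration_strain(arm, reference_arm=2.5e9):
    """Low-frequency strain noise delta_L_acc / L, relative to the reference arm."""
    return reference_arm / arm


if __name__ == "__main__":
    for arm in (1.0e9, 2.5e9, 5.0e9):
        print(
            f"L = {arm / 1e9:.1f} Gm: corner at {corner_frequency(arm) * 1e3:.1f} mHz, "
            f"low-frequency strain noise x{relative_acceleration_strain(arm):.2f}"
        )
```

For L = 2.5 Gm the corner sits at about 30 mHz, consistent with the frequency above which the sensitivity is said to degrade; doubling the arm halves both the corner frequency and the low-frequency strain noise.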
13 We remind the reader that the noise spectral density Sh ( f ) is closely related to the inverse of the
SNR.
The noise breakdown described so far has overlooked the frequency fluctuations of
the laser sources; to include them, we rewrite Eq. 11.7, adding a term that depends
on frequency noise $f_n$:
$$\frac{\dot f_{rec} - \dot f_{emit}}{f}(t) = \frac{1}{2}\left[\dot h_{rec}(t) - \dot h_{emit}(t - L/c)\right] + \frac{1}{c}\left[a_{rec}(t) - a_{emit}(t - L/c)\right] + \frac{1}{f}\left[\dot f_{n,rec}(t) + \dot f_{n,emit}(t - 2L/c)\right] \qquad (11.10)$$
where L/c = 8.3 s is the one-way travel time between two spacecraft.
Let us briefly recap the influence of the laser frequency noise in equal arm interferometers:
the relevant quantity measured by an interferometer is the phase difference
between the two recombined beams: $\Delta\phi = \frac{4\pi}{c} f\,\Delta L$, where f is, as usual, the laser
frequency and ΔL the static arm length difference. Any time-varying change of
frequency δf mimics a displacement δL as: $\delta f / f = \delta L / \Delta L$.
If we now move from time domain fluctuations to their spectral densities, $\delta f \to \sqrt{S_f}$
and $\delta L / L \to \sqrt{S_h}$, we can see the contribution of frequency fluctuations to the
overall detector noise:
$$\sqrt{S_h^{freq}} = \frac{\sqrt{S_f}}{f}\,\frac{\Delta L}{L} \qquad (11.11)$$
For ground-based detectors, ΔL is controlled at a fixed value of the order of 1 m
(the Schnupp asymmetry, defined in Appendix B.5). The laser in Virgo has reached
a frequency fluctuation spectrum as low as $\sqrt{S_f} \sim 10^{-6}\ \mathrm{Hz}/\sqrt{\mathrm{Hz}}$, thanks to a
complex system that includes a mode cleaner cavity 144 m long and even the lock
to a 3 km long Fabry-Perot arm cavity. This produces an equivalent strain noise
$\sqrt{S_h^{freq}} \sim 10^{-24}/\sqrt{\mathrm{Hz}}$.
Very different are the operating conditions √in LISA: the laser frequency is stabilized
by an on-board cavity to about 30 H z/ H z at the operating frequencies (mHz).
Moreover, arm flexing forces path differences as large as L ∼ 107 m, yielding
√
∼ 10−15 / H z, orders of magnitude larger than LISA
f r eq
an equivalent strain Sh
√ √
target sensitivity Sh = 10−20 / H z
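As a rough numerical check of Eq. 11.11, the short Python sketch below evaluates the equivalent strain noise for the two cases just discussed. The laser wavelength of 1064 nm (which fixes f) and all function and variable names are assumptions made for illustration; the remaining inputs are the order-of-magnitude values quoted in the text.

```python
C = 299_792_458.0  # speed of light, m/s


def strain_from_frequency_noise(sqrt_sf, delta_l, arm_length, wavelength=1064e-9):
    """Eq. 11.11: sqrt(S_h^freq) = (sqrt(S_f) / f) * (Delta L / L).

    sqrt_sf    : laser frequency noise, Hz/sqrt(Hz)
    delta_l    : arm length difference, m
    arm_length : arm length L, m
    wavelength : laser wavelength, m (1064 nm is an assumption)
    """
    f_laser = C / wavelength
    return sqrt_sf / f_laser * delta_l / arm_length


# Virgo-like case: 1e-6 Hz/sqrt(Hz), Delta L ~ 1 m, L = 3 km
virgo = strain_from_frequency_noise(1e-6, 1.0, 3e3)
# LISA-like case: 30 Hz/sqrt(Hz), Delta L ~ 1e7 m, L = 2.5 Gm
lisa = strain_from_frequency_noise(30.0, 1e7, 2.5e9)

print(f"Virgo: {virgo:.1e} /sqrt(Hz)")
print(f"LISA : {lisa:.1e} /sqrt(Hz)")
```

With these inputs the Virgo-like case lands near $10^{-24}/\sqrt{\mathrm{Hz}}$, while the LISA-like case sits several orders of magnitude above the $10^{-20}/\sqrt{\mathrm{Hz}}$ target.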
Locking to the long-arm interferometer, in analogy to what is done in LIGO
and Virgo, has been considered, but it would not give all the needed improvement.
This apparently hopeless situation has been saved by an ingenious method of data
processing (Tinto and Dhurandhar 2014): Time Delay Interferometry (TDI). In
essence, TDI consists in appropriately time-shifting, in the off-line data post-processing,
the single-link Doppler signals coming from different arms, and recombining them in
such a way as to synthesize an equal-arm Michelson interferometer.
To catch the flavour of TDI, let us consider the round trips of two light beams
collected in spacecraft 3 (S/C3): one in LISA arm 1, connecting S/C1 − S/C3, and one in
arm 2, reaching S/C2¹⁴: their lengths are $L_1 = cT_1$ and $L_2 = cT_2$ respectively.
The two beams are collected, on S/C3 , at time t: each carries the Doppler information
about frequency noise, that we indicate with Fn , imprinted on the beam at the time
of emission and at the time of detection. Moreover, there will be a Doppler signature
h i from the GW, and n i from any other noise source. We then have two time series:
The two beams, emitted at different times, carry a different frequency noise, that is not
cancelled as it happens in standard interferometry. Indeed, by taking the difference
of the two signals, we see that only the frequency noise at detection time t cancels
out:
$$y_1(t - 2T_2) - y_2(t - 2T_1) = F_n(t - 2T_1) - F_n(t - 2T_2) + h_1(t - 2T_2) - h_2(t - 2T_1) + n_1(t - 2T_2) - n_2(t - 2T_1) \tag{11.14}$$
This combination has exactly the same frequency noise as the previous one,
Eq. 11.13. If we now subtract Eq. 11.14 from Eq. 11.13, we generate a new data set
that is immune to laser frequency fluctuations:
This defines the first-generation TDI variable X, the first and simplest of an ample,
and ever growing, family of TDI data sets. Analogous definitions can be given for
the beams “recombined” at the other spacecraft, called Y and Z. In order to gain
some physical insight, we rearrange the terms of this definition of the data set X:
We can interpret X as the difference of the Doppler signals arising from two fabricated
light paths, not as simple as the usual round trip along one arm, as shown in
Fig. 11.12b: the first term in square brackets can be read as the path of a light beam
generated in S/C3, reflected¹⁵ off S/C2, returning to S/C3, reflected off S/C1 and
finally detected at S/C3. The second term represents a beam reflected first off S/C1
and then off S/C2. The travel time of these two synthetic beams is the same and,
when they are recombined, the frequency noise cancels exactly.
Fig. 11.12 a A pictorial image of first-generation TDI, balancing the light paths for stationary
end stations. Adapted from Tinto and Dhurandhar (2014) under Creative Commons (2021). b A
different sketch of first-generation TDI, with a space-time visualization. Adapted from Muratore et
al. (2020). Both diagrams have been modified to adhere to the spacecraft naming convention used
here
14 Here we derogate from the usual TDI convention of naming the arm with the number of the
opposite spacecraft; we are just presenting an example, and favour clarity and immediacy over
completeness.
We remark that the lines of Fig. 11.12 do not represent real beam paths: in each
arm we only have two beams (one in each direction). The lines can rather be intended
as the paths of relative frequency changes. The algebra of TDI is usually expressed
by means of the time-shift operators:
up to 24 links: in this case, the delay operators no longer commute. A detailed analysis
of TDI is beyond the scope of this book; a comprehensive description of the
algebraic approach is given in Tinto and Dhurandhar (2014).
TDI has been experimentally verified in the lab, with delays in fibre of 3 km
(Mitryk et al. 2012), but the exhaustive test, over the Gm distances, will have to wait
for the flight of LISA.
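The cancellation mechanism of first-generation TDI can be checked numerically with a toy model. The Python sketch below is only an illustration under stated assumptions: the single-arm data are taken in the standard form $y_i(t) = F_n(t - 2T_i) - F_n(t) + h_i(t)$ (frequency noise at emission minus frequency noise at detection, plus signal), delays are integer numbers of samples on a periodic grid, and all names and numbers are hypothetical. X is built, as described above, by subtracting the time-shifted difference of Eq. 11.14 from the plain difference of the two signals.

```python
import numpy as np


def delay(x, k):
    """Time-shift operator: returns x(t - k) on a periodic sample grid."""
    return np.roll(x, k)


def tdi_x(y1, y2, d1, d2):
    """First-generation X: [y1(t) - y2(t)] - [y1(t - 2T2) - y2(t - 2T1)]."""
    return (y1 - y2) - (delay(y1, d2) - delay(y2, d1))


rng = np.random.default_rng(0)
n, d1, d2 = 4096, 83, 85          # d1, d2: round-trip delays 2*T1, 2*T2 in samples (hypothetical)
t = np.arange(n)
F = rng.normal(size=n)            # laser frequency noise F_n(t), arbitrary units
h1 = 1e-3 * np.sin(2 * np.pi * 8 * t / n)   # toy GW Doppler signature, arm 1 only

y1 = delay(F, d1) - F + h1        # assumed form: F_n(t - 2T1) - F_n(t) + h_1(t)
y2 = delay(F, d2) - F             # assumed form: F_n(t - 2T2) - F_n(t)

michelson = y1 - y2               # unequal-arm difference: laser noise survives
X = tdi_x(y1, y2, d1, d2)         # TDI combination: laser noise cancels
expected = h1 - delay(h1, d2)     # what is left of the toy signal in X

print("rms of y1 - y2          :", michelson.std())
print("max |X - expected signal|:", np.abs(X - expected).max())
```

In this idealized setting the laser noise drops out of X to machine precision, while a (delayed and differenced) copy of the signal survives. Real data require fractional-delay interpolation and, for flexing arms, the higher-generation combinations mentioned above.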
The experimental challenges outlined in the previous section are huge, and great
would be the risk of launching LISA without properly testing the technological
solutions. Here is a first list:
All these issues require free fall for a proper validation. Extensive tests were performed
on torsion pendulums, which allow quasi-free fall on one DoF to evaluate residual forces
(Cavalleri et al. 2009). A special pendulum, with two soft DoFs (one rotation and
one translation) was developed (Bassan et al. 2016) to investigate possible cross-
talk between x and other DoFs. Although relevant upper limits were set, these were
partial tests compared to real free fall on 6 DoFs. Fortunately, all of these noise
contributions are independent of the long-arm length of LISA as they occur within
each single spacecraft. That is why ESA promoted LISA-Pathfinder, a space mission
with the purpose of paving the way for LISA: testing all of the above technologies
and verifying the noise levels that can be achieved in space.
The driving concept of LISA-Pathfinder (LPF) was to squeeze a LISA arm from
2.5 Gm down to 38 cm: hosting two TMs (call them TM1 and TM2) on the same
spacecraft, it was possible to monitor their distance and validate both the optical
measuring system and the free fall inside the GRS, by measuring the residual force
(actually, acceleration) on the TMs. One interferometer measured the position $x_1$ of
TM1 with respect to the optical bench, and another the distance $\Delta x$ between TM1
and TM2, both along the x direction connecting the two centres of mass:
$$o_1 = x_1 - x_{sc} + n_1, \qquad o_{12} = x_2 - x_1 + n_2 \tag{11.18}$$
where $x_{sc}$ is the spacecraft displacement with respect to an inertial frame and $n_i$ are
additive readout noise contributions, mostly laser shot noise.
LPF was launched on Dec. 3, 2015 and reached, in a 50 day trip, a wide orbit around
the L1 Lagrange point of the Earth-Sun System.17 Then, the two Test Masses were
successfully released into free fall and locked into a complex feedback system: as
described above, both TMs are electrostatically controlled on 5 DoFs, but free (in
principle) along the x direction. However, the spacecraft cannot follow both TMs in
their motion along x, thus only one TM can be kept in geodesic motion:
- TM1 was left in free fall, with the micro-thrusters controlling the spacecraft posi-
tion. The drag-free control loop, with 1 Hz bandwidth, nulls the output o1 of the first
interferometer.
- TM2 was instead actuated by the GRS electrodes in order to keep constant the
distance TM1-TM2. The electrostatic suspension, a control loop with 1 mHz band-
width, nulls o12 .
We have here a dynamical system with 18 DoFs, controlled by the Drag-Free Atti-
tude Control System, a sophisticated control law loaded on the on-board computer.
The diagram of Fig. 11.13 may help clarify the set-up.
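The main scientific goal of LPF is assessing the differential acceleration of the
two TMs:
$$\Delta g(t) \equiv \frac{F_2(t)}{m} - \frac{F_1(t)}{m} \tag{11.19}$$
$$\begin{aligned} \ddot x_1 &= \frac{F_1}{m} - \omega_{s1}^2\,(x_1 - x_{sc}) \\ \ddot x_2 &= \frac{F_2}{m} - \omega_{s2}^2\,(x_2 - x_{sc}) + \frac{F_{es}}{m} \end{aligned} \tag{11.20}$$
where $\omega_{si}^2$ (i = 1, 2) are the stiffnesses, mentioned in Sect. 11.5.3, that feed residual
spacecraft motion to the TM.
Combining Eqs. 11.19, 11.20 and 11.18 leads us to:
$$\Delta g = \ddot o_{12} - \frac{F_{es}}{m} + (\omega_{s2}^2 - \omega_{s1}^2)\, o_1 + \omega_{s2}^2\, o_{12} + n_{IFO} \tag{11.21}$$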
where $n_{IFO}$ is the combined noise of the two interferometers, which contains, among
other terms, the second time derivative of $n_2$. Equation 11.21 shows that, in order to
have a fair estimate of the differential acceleration, one must subtract all known and
measurable contributions, like those introduced by the stiffnesses (that are evaluated
with a calibration procedure) and by the applied feedback. Other effects, not reported
in this equation but accounted for in the analysis, are the centrifugal forces due to
spacecraft rotation (about 2°/day, due to orbiting around L1) and pick-up of the
noisy spacecraft motion on other DoFs, due to residual misalignment of Electrode
Housings and optical bench.
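To make the structure of Eq. 11.21 concrete, here is a minimal Python sketch that builds the differential acceleration from sampled interferometer outputs, with the readout noise term $n_{IFO}$ left out. The function name, the finite-difference scheme for the second derivative and every number below are hypothetical choices made for illustration; they are not mission values or the LPF analysis pipeline.

```python
import numpy as np


def delta_g(o12, o1, f_es_over_m, w1_sq, w2_sq, dt):
    """Eq. 11.21 without n_IFO, evaluated on the interior samples.

    The second time derivative of o12 is approximated with a
    central second difference, so the first and last samples are dropped.
    """
    o12_ddot = (o12[2:] - 2.0 * o12[1:-1] + o12[:-2]) / dt**2
    inner = slice(1, -1)
    return (o12_ddot - f_es_over_m[inner]
            + (w2_sq - w1_sq) * o1[inner] + w2_sq * o12[inner])


# Hypothetical, illustrative inputs (not mission values)
dt = 1.0                                  # sampling time, s
t = np.arange(0.0, 2000.0, dt)
w = 2 * np.pi * 1e-3                      # 1 mHz test tone
A, B, Cf = 1e-9, 1e-9, 1e-13              # m, m, m/s^2
w1_sq, w2_sq = -5.0e-7, -4.5e-7           # stiffnesses, s^-2
o12 = A * np.sin(w * t)
o1 = B * np.cos(w * t)
f_es = Cf * np.sin(w * t)                 # applied feedback force per unit mass

numeric = delta_g(o12, o1, f_es, w1_sq, w2_sq, dt)
analytic = ((-A * w**2 - Cf + w2_sq * A) * np.sin(w * t)
            + (w2_sq - w1_sq) * B * np.cos(w * t))[1:-1]
print("max error [m/s^2]:", np.abs(numeric - analytic).max())
```

The comparison with the closed-form result for sinusoidal inputs only verifies the bookkeeping of the subtraction; with real data the stiffnesses come from the calibration procedure mentioned above.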
The LPF mission requirement relaxed the LISA goal on the measured residual
acceleration by a factor 10 both in frequency and in noise spectral density:
$30\ \mathrm{fm/s^2/\sqrt{Hz}}$ at 1 mHz.
One reason for the relaxed sensitivity goal is the presence of the electrostatic
suspension; indeed, it will not be present in LISA, where the spacecraft will be free
to follow each TM along its x axis, the direction of the long arm; the feedback controls
were expected to be the limiting source of noise at low frequency, proportional to the
amount of DC fields, mainly gravitational, they balance. In LPF these fields turned
out to be small as a result of accurate design and positioning of the masses in the
spacecraft; the reduced electrostatic suspension controls allowed for the excellent
performances of the LPF mission.
Figure 11.14 shows the acceleration noise measured by LPF. From its first day of
operation (blue curve), the mission requirement, marked in dark grey, was exceeded
by a wide margin, and the sensitivity approached the level required for LISA.
The mission lasted 15 months, and numerous experiments were carried out to
identify and characterize the various sources of noise. The apparatus was fine tuned,
e.g. by lowering the pressure via continuous venting to outer space, and by reducing
the feedback authority on TM2.¹⁸ The ultimate performance of LPF (Armano et al.
2018) is shown by the red curve in Fig. 11.14: it was achieved also by lowering the
temperature by 12 degrees, to about 284 K. The interferometer noise was measured
to be $\sqrt{S_0} = 35\ \mathrm{fm}/\sqrt{\mathrm{Hz}}$, well below the LISA requirement: when converted to
acceleration (our variable of interest here) it grows with frequency as $\sqrt{S_0}\,\omega^2$ and
dominates the spectrum for f > 20 mHz, barely visible at the right edge of Fig. 11.14.
Fig. 11.14 Initial (blue) and best (red) performance of LISA-Pathfinder. Unlike sensitivities usually
shown in the GW community, here the plotted variable is the differential acceleration $\Delta g$, i.e. the
deviation from perfect free fall. The shaded areas show the acceleration noise requirements for
LISA and, relaxed by a factor 10, for LPF. Credits: reprinted figure with permission from Armano
et al. (2018). Copyright 2018 by the American Physical Society
18 This was possible because the local gravity turned out to be extremely well balanced: no serious
gradient was measured on TM2.
LISA-Pathfinder was an extremely successful mission, that cleared the field of
many potential show-stoppers for LISA. Some technological issues could not be tackled in a
single-spacecraft mission, e.g.:
All these will be tested with the full three-spacecraft, long-arm constellation, but the
road towards LISA has been definitely smoothed by LPF and GRACE-FO.
The Europe-USA project LISA is, at the moment, the most advanced space project for
observing GWs, but by no means the only one. The Chinese space-GW community
is simultaneously pursuing two different missions (Gong et al. 2021), TianQin and
Taiji, while the Japanese community has proposed the ambitious DECIGO observatory;
we briefly describe these projects.
The amount of effort devoted to these studies might close the gap with LISA, so
that the exciting prospect of having two or more observatories active at the same
time (mid 2030s) is not ruled out.
11.8.1 DECIGO
11.8.2 TianQin
• the thermal stability of the spacecraft, affected by the variation in the sunlight
direction relative to the orbital plane or eclipses due to the Earth or the Moon
temporarily blocking the sunlight
• the constellation stability: distortions in the equilateral triangle are induced by
the gravitational disturbances, especially from the nearby Earth-Moon system
• the problem of a steady power supply because of the eclipses of the Earth and the
Moon when the constellation passes through their shadows.
All these issues are the object of careful, ongoing studies by the TianQin team and will
probably impact the duty cycle of the observatory operations.
The TianQin project started in 2016 with an aggressive roadmap in four steps. Step
0, laser ranging technology, has been fulfilled with the launch in 2018 of a laser
ranged satellite to the L2 Lagrange point and the construction of an Earth ranging
station with single-photon sensitivity. Step 1: TianQin-1 is a satellite testing, with
in-orbit experiments, inertial sensing, micronewton propulsion, drag-free control and
laser interferometry technologies; besides, it aims to test the temperature control
technologies and centre-of-mass measurements of the satellite. It was successfully
launched in 2019 and achieved all its goals, measuring a residual acceleration of
$5 \cdot 10^{-11}\ \mathrm{m/s^2/\sqrt{Hz}}$ at 50 mHz. Step 2 is a two-satellite mission similar to GRACE-FO,
to test inter-satellite interferometry and map the gravity field. It is scheduled for
launch in 2025. The final Step 3 is the full constellation, planned for deployment
in 2035, with an exciting prospect of simultaneous observations with LISA.
11.8.3 Taiji
Taiji is a project of the Chinese Academy of Sciences, very similar
to LISA: a three-satellite constellation in a heliocentric orbit ahead of the Earth by
about 18–20° (unlike LISA, that trails the Earth), as a compromise between orbit
injection costs and disturbances from the Earth-Moon system. The arm length, $L = 3 \cdot 10^{9}$ m,
is longer than LISA's by 20% and allows a better sensitivity at frequencies around 0.01–
0.1 Hz. Goals for displacement noise and acceleration noise are roughly the same
as LISA’s, and so are the scientific objectives.
A pre-pathfinder mission, Taiji-1, was launched in 2019. Taiji-2, a pair of technology
demonstration satellites designed to cover almost all of the Taiji technologies
except TDI, is scheduled to launch in 2024. They will fly at a distance $L = 5 \cdot 10^{8}$ m
on the same orbit, and carry the same payload, as the final Taiji observatory.
Taiji is expected to launch around 2033: if both Taiji and LISA launch according to
plan, they will have a few years of overlapping observing time, leading to amazing
capabilities in source localization: while LISA or Taiji alone have an angular resolution
of about 1 deg², the combination of 2 or 3 networks could lead to resolutions
of the order of $10^{-3}$ deg² because they will operate as a VLBI interferometer (see
Sect. 6.6.1), with a baseline of $10^{11}$ m.
We must remark that, while the noise spectrum for LISA rests on solid experimental
background, sensitivities and timelines for other detectors appear, at the present state
of technology, quite optimistic. Observation of GW from space is an exciting prospect
for the mid 2030s. The quantity and variety of sources has been outlined in Sect.
11.2. The long signal duration (days to months for many sources) will allow detailed
determination of the source parameters, and a precise estimate of the time and sky
position of a coalescence that will occur months or even years later, with uncertainties
better than 10 s and 1 deg² (Sesana 2016). This prediction capability will give advance
warning to ground GW detectors and will allow the prepointing of telescopes to realize
coincident GW and multiwavelength electromagnetic observations of the final seconds
before merging. Time coincidence is critical, as was shown by the first multi-messenger
detection (Abbott et al. 2017a) of a Binary Neutron Star Merger. Figure 11.16 pictorially
shows this concept: the green lines show the frequency evolution of signals from BH binary
coalescences.
Fig. 11.16 A sketch of combined sensitivities for the GW observatories of the 2030s. The green
lines mark the signal evolution of black hole binaries, that will first cross the LISA sensitivity band,
to end up, weeks or years later, in the VIRGO-LIGO band. The SNR for some TianQin sources
is also shown. Note the y-axis units, the “characteristic amplitude” $\sqrt{\nu \cdot S_h}$. Credits Gong et al.
(2021)
Multimessenger astronomy will be even more fruitful when signals from neutrino
detectors complement GW and electromagnetic observations.
References
Abbott, B.P., et al.: Multi-messenger observations of a binary neutron star merger. ApJL 848(L12),
1–59 (2017a)
Abbott, B., et al.: GW170817: observation of gravitational waves from a binary neutron star inspiral.
Phys. Rev. Lett. 119, 161101 (2017b)
Abich, K., et al.: In-orbit performance of the GRACE follow-on laser ranging interferometer. Phys.
Rev. Lett. 123, 031101 (2019)
Amaro-Seoane, P., et al.: Low-frequency gravitational-wave science with eLISA/NGO. Class.
Quantum Grav. 29, 124016 (2012)
Armano, M., et al.: Sub-Femto-g free fall for space-based gravitational wave observatories: LISA
pathfinder results. Phys. Rev. Lett. 116, 231101 (7pp) (2016)
Armano, M., et al.: Beyond the required LISA free-fall performance: new LISA pathfinder results
down to 20 μHz. Phys. Rev. Lett. 120, 061101 (10 pp) (2018)
Babak, S., et al.: Science with the space-based interferometer LISA. V: extreme mass-ratio inspirals.
Phys. Rev. D 95, 103012(21) (2017)
Babak, S., Hewitson, M., Petiteau, A.: LISA sensitivity and SNR calculations. Technical Note
LISA-LCST-SGS-TN-001. arXiv:2108.01167 (2021)
Barausse, E., et al.: Prospects for fundamental physics with LISA. Gen Relativ Grav. 52, 81 (2020)
Bassan, M., et al.: Approaching free fall on two degrees of freedom: simultaneous measurement of
residual force and torque on a double torsion pendulum. Phys. Rev. Lett. 116, 051104 (2016)
Cavalleri, A., et al.: Direct force measurements for testing the LISA Pathfinder gravitational refer-
ence sensor. Class. Quantum Grav. 26, 094012 (10pp) (2009)
Danzmann, K., et al.: LISA Laser Interferometer Space Antenna - A proposal in response to the
ESA call for L3 mission concepts (2017). arXiv:1702.00786
De Bra, D.B., Dassoulas, J., Kershner, R.B.: A satellite freed of all but gravitational forces: “TRIAD
I”. J. Spacecr. 11, 637 (1974)
Faller, J.E., et al.: Space antenna for gravitational wave astronomy. In: Proc. Colloquium “Kilometric
Optical Arrays in Space” ESA-SP226 (1985)
Gong, Y., Luo, J., Wang, B.: Concepts and status of Chinese space gravitational wave detection
projects. Nat. Astron. 5, 881–889 (2021)
Jennrich, O.: LISA technology and instrumentation. Class. Quantum Grav. 26, 153001 (2009)
Kawamura, S., et al.: Current status of space gravitational wave antenna DECIGO and B-DECIGO.
Prog. Theor. Exp. Phys. 2021, 05A105 (2021)
Lange, B.: The drag-free satellite. AIAA J. 2, 1590 (1964)
Larson, S.L., Hiscock, W.A., Hellings, R.W.: Sensitivity curves for spaceborne gravitational wave
interferometers. Phys. Rev. D 62, 062001 (2000)
Mitryk, S.J., Mueller, G., Sanjuan, J.: Hardware-based demonstration of time-delay interferometry
and TDI-ranging with spacecraft motion effects. Phys. Rev. D 86, 122006 (2012)
Muratore, M., Vetrugno, D., Vitale, S.: Revisitation of time delay interferometry combinations that
suppress laser noise in LISA. Class. Quantum Grav. 37, 185019 (18pp) (2020)
Pierce, R., et al.: Intersatellite range monitoring using optical interferometry. Appl. Opt. 47, 5007–
5019 (2008)
Pucacco, G., Bassan, M., Visco, M.: Autonomous perturbations of LISA orbits. Class. Quantum
Grav. 27, 235001 (2010)
Schutz, B.F.: Determining the Hubble constant from gravitational wave observations. Nature 323,
310 (1986)
Sesana, A.: Prospects for multiband gravitational-wave astronomy after GW150914. Phys. Rev.
Lett. 116, 231102 (2016)
The LISA Consortium: The Gravitational Universe (2013). arXiv:1305.5720
Tinto, M., Dhurandhar, S.V.: Time-delay interferometry. Living Rev. Relativ. 24, 1 (2021)
van Veggel, A.M.A., Killow, C.J.: Hydroxide catalysis bonding for astronomical instruments.
Advan. Opt. Technol. 3, 293–307 (2014)
12 Pulsar as Gravitational Laboratory
Pulsars are the most advanced laboratory for gravitational physics. While tests in
the Solar System probe the “weak field” regime, with a compactness parameter
$U = 2GM/(rc^2) \lesssim 10^{-5}$, and observations of gravitational waves provide information
about the strong regime only in the very few instants preceding the merging, pulsars offer
long term observations of gravity in the “intermediate” or “not weak” regime, with
U ∼ 0.3 (“strong” is reserved to black hole physics, where compactness reaches
unity). Pulsars have proven, in recent years, to be fantastic tools to probe relativistic
gravity, offering measurements of unprecedented accuracy.
We list here the topics addressed in this chapter, together with some in-depth
review papers to expand and integrate our brief overview:
– we first recap our still vague knowledge of pulsar structure: see the reviews by
Lorimer (2008), Lattimer (2015)
– we describe the method of pulsar timing and introduce the Post-Keplerian orbital
elements: see Becker et al. (2018), the review paper (Taylor 1992) and the book by
Lorimer and Kramer (2005).
– we show how accurate pulsar timing has allowed us to collect a wealth of significant
experimental results, setting stringent limits on PPN parameters and reinforcing our
faith in GR: see Wex (2014) and Manchester (2015)
– finally, we describe the Pulsar Timing Array (PTA) projects to detect gravitational
waves at nHz frequencies: see Lommen (2015), Perrodin and Sesana (2018), Tiburzi
(2018) or Dahal (2020).
Pacini (1967) predicted that a rotating neutron star with a magnetic field would emit
a beam of electromagnetic radiation that can be detected when the beam propagates
along the line of sight to the Earth: a pulsar. Pacini’s paper was published 30 years after the
speculations of Walter Baade and Fritz Zwicky about the existence of stars
super-compressed by gravity, the neutron stars (Baade and Zwicky 1934). At the same
time, in the summer of 1967, Jocelyn Bell Burnell, a graduate student at Cambridge
University, made the first detection of a radio pulsar. Bell and her adviser Antony
Hewish published their findings in February 1968 (Hewish et al. 1968). The discovery
set off a race to find more pulsars and, in 1974, Hewish (but not Bell) was awarded
a share of the Nobel Prize in Physics for the discovery.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_12
A pulsar is a compact star spinning around an axis different from that of its
magnetic field, so that the emitted beam of electromagnetic radiation sweeps the sky
like a lighthouse beam (Fig. 12.1). When the beam crosses the line of sight of the
observatory, a pulse is detected. The pulse profile encodes information about neutron
star physics, its magnetic field, the interstellar medium and much else.
Pulsars are born, following the gravitational collapse of a giant ($10$–$25\,M_\odot$)
stellar progenitor, with rotation periods of the order of fractions of a second and with
magnetic fields of $B \sim 10^{8}$–$10^{10}$ T (Shapiro et al. 1983). The current model describes
pulsar emission assuming a rotating dipole magnetic field, due to a misalignment
of the rotation and dipole axes. Particles, mostly $e^{\pm}$, accelerated by the field, emit
radiation, according to the Larmor equation, and this radiated energy produces a loss
of rotational kinetic energy.
It is widely accepted that the rapid rotation of the neutron star, combined with its
super-strong magnetic field, leads to a large electric field that accelerates particles
to extremely high energies. These primary particles emit gamma rays which decay
into secondary electron-positron pairs, and the emission in the radio bandwidth
is attributed to these outflowing secondary pairs. But the scientific debate on the
emission mechanism of a pulsar is still open.
However, the radiation pulses are too bright to be explained in terms of incoherent
emission (plasma particles radiating independently of each other). The brightness
values support the hypothesis of a coherent emission. In the early 1970s, two emission
models were proposed: the coherent curvature emission and the relativistic
1 The Langmuir waves are rapid oscillations of electron density in the plasma, whose frequency
depends only weakly on the wavelength of the oscillation. The plasmon is the quasi-particle resulting
from the quantization of these oscillations.
binary systems are ideal systems to test the effects of General Relativity and to look
for deviations from the predictions of Einstein’s theory. Foster and Backer (1990)
showed how a comparison of timing observations from multiple millisecond pulsars
(a spatial array of pulsars) could be used to search for gravitational waves (GWs),
building on previous intuitions by Sazhin (1978) and Detweiler (1979).
Pulsars are often referred to as cosmic clocks, as it is possible to predict the pulse
times-of-arrival (TOA) with high accuracy: this is the key quantity of interest in the
powerful technique known as pulsar timing. The best pulsars to be exploited are
those with a short rotation period, so that many pulses can be averaged in a given
observing time, and those with high electromagnetic flux S, i.e. higher signal-to-noise ratio.
Pulsar timing consists of regularly monitoring the TOAs of a known pulsar for
several years, on a weekly or monthly basis (Perrodin and Sesana 2018). For high-
precision experiments, pulsars should have predictable spin evolution. There are
three main requirements that need to be met, to enable accurate predictions of the
arrival times:
• The pulsar must be a very stable clock but, as mentioned in Sect. 12.1, it loses
rotational energy via magnetic dipole radiation, and therefore it spins down.
Millisecond pulsars have slower spin-down rates than classic pulsars, so that their
rotation is significantly more stable. In addition, the spin-down might suffer from
irregularities: for example, there are phenomena of sudden accelerations, called
glitches, that can be explained by a slight decoupling of the crust from the core
and therefore by differential rotation, at the end of which the star resumes its
slowdown.
• The shape of the integrated pulse profile must be stable in time: pulsars are intrin-
sically weak radio sources and, typically, the individual pulse signal is below the
radiometer noise of the telescope. Therefore, it is usually necessary to add up
thousands of individual pulses, with a technique called folding, that increases the
signal-to-noise ratio. Remarkably, for a given pulsar, even though the individual
pulses have different shapes, the integrated profile is usually very stable, on time
scale of years, for any observation at the same radio frequency.
• The timing model of the pulsar, the ephemeris, must be well known. It is a set of
parameters that describe the pulsar spin and spin-down, its orbital parameters, its
astrometry, and the dispersive influence of the ionized InterStellar Medium (ISM)
along the line of sight to the pulsar. Pulsars emit mainly in the radio band: at these
frequencies, ISM influences the TOA of the signals, through two main effects:
– Dispersion: pulses observed at higher radio frequencies arrive earlier at the
telescope than their lower frequency counterparts. This is due to the frequency-
dependent dispersive effect of the ionized ISM on radio pulses. This is quanti-
fied by the dispersion measure (DM), defined as the integrated column density
of free electrons along the line of sight:
$$DM = \int_0^d n_e \, dl, \tag{12.1}$$
where d is the distance to the pulsar and $n_e$ is the free electron number density.²
– Scintillation: in addition to being magnetized and ionized, the ISM is highly
turbulent and inhomogeneous. These irregularities cause variations in the signal
intensity. This effect is known as interstellar scintillation. Since the effect
decreases with the signal frequency as $1/f^4$, the problem is tackled by dividing
the wide observation radio band in many narrower bands, where the shape
remains stable (Lommen 2015);
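The folding technique mentioned in the list above can be illustrated with a toy simulation. In the Python sketch below a weak periodic pulse, buried in noise twice as large as its peak, is recovered by averaging many periods. Everything here is an assumption made for illustration: the period is an exact integer number of bins, all pulses have an identical Gaussian shape (real single pulses vary), and the noise is white and Gaussian.

```python
import numpy as np


def fold(series, period_bins):
    """Average a time series modulo the pulse period (integer number of bins)."""
    n_pulses = series.size // period_bins
    return series[: n_pulses * period_bins].reshape(n_pulses, period_bins).mean(axis=0)


rng = np.random.default_rng(1)
period_bins, n_pulses = 64, 10_000
phase = np.arange(period_bins)
# Toy pulse: peak amplitude 0.5 at bin 32, radiometer noise with unit rms
template = 0.5 * np.exp(-0.5 * ((phase - 32) / 1.5) ** 2)
series = np.tile(template, n_pulses) + rng.normal(0.0, 1.0, period_bins * n_pulses)

profile = fold(series, period_bins)
off_pulse_rms = profile[:16].std()      # bins far from the pulse

print("single-pulse SNR ~", 0.5 / 1.0)
print("folded SNR       ~", profile.max() / off_pulse_rms)
```

For white noise the off-pulse rms of the folded profile falls roughly as the inverse square root of the number of pulses, which is why thousands of pulses are accumulated before a TOA is extracted.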
Generating ephemeris for a given pulsar is an ongoing, iterative effort: the first
version is typically obtained at the time of discovery and provides approximate
estimates of the pulsar spin, position, and DM. A precise knowledge of the timing
model can be achieved through an iterative procedure of data correction and pulsar
parameters estimations. When a stable set of parameters is obtained, we can compute
expected TOAs based on our best-known models for the pulsar motion. Taking the
difference between the observed TOAs and the expected ones, we obtain the timing
residuals that, if the model is correct, should be scattered around a zero mean. Model
parameters are continuously adjusted under this constraint to minimize the timing
residuals.
We now try to clarify this method through an example (Perrodin and Sesana
2018): assume that an observing campaign is performed, with a radio telescope, on
a given pulsar. For each observation, we first obtain an integrated pulse profile PP
that is statistically stable and has a suitably high signal-to-noise ratio (Fig. 12.3).
In pulsar timing, average TOAs for each observation are computed via a
cross-correlation of each integrated pulse profile with a high-SNR reference template S,
which is typically obtained from the superposition of many earlier observations at
that particular observing frequency (Eq. 12.4). PP(t) is a scaled and shifted version
of the template S(t), with added noise n(t):
$$PP(t) = a + b\,S(t - \tau) + n(t), \tag{12.2}$$
where a is an arbitrary offset and b is a scaling factor. The time shift τ, that must
be added to the observation time to optimize the overlap between the profile and
the template, yields the TOA. However, our observations are performed from a
non-inertial frame: we use telescopes located on the Earth, which spins and orbits the
Sun. Before any analysis, we need to transfer the TOAs measured with the observatory
clock (topocentric arrival times) to an inertial reference frame: conventionally,
we use the Solar System Barycentre (SSB), where the Temps Coordonnée Barycentrique
is defined. We must, therefore, carry out the following barycentric correction,
the transformation between the TOA measured at the telescope, $TOA_{topo}$, and
$TOA_{SSB}$:
$$TOA_{SSB} = TOA_{topo} - k\,\frac{DM}{f^2} + \Delta_R + \Delta_E + \Delta_S \tag{12.3}$$
The first correction is the de-dispersion discussed above: k is a constant, DM is the
dispersion measure (Eq. (12.1)) and f is the observing frequency. Unfortunately, the
interstellar medium is not static, so DM must be measured at different epochs and
updated.
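To give a feeling for the size of the de-dispersion term $k\,DM/f^2$ in Eq. 12.3, the Python sketch below evaluates it at a few observing frequencies. The numerical value of the constant (about $4.15 \cdot 10^{3}$ s MHz² pc⁻¹ cm³, the value commonly adopted in pulsar timing) is an assumption here, since the text only states that k is a constant; the dispersion measure and the frequencies are hypothetical.

```python
K_DM = 4.148808e3  # dispersion constant, s MHz^2 pc^-1 cm^3 (assumed standard value)


def dispersion_delay(dm, f_mhz):
    """Dispersive delay k*DM/f^2 in seconds.

    dm    : dispersion measure, pc cm^-3
    f_mhz : observing frequency, MHz
    """
    return K_DM * dm / f_mhz**2


dm = 10.0  # hypothetical dispersion measure, pc cm^-3
for f in (400.0, 1400.0, 3000.0):
    print(f"{f:6.0f} MHz: {1e3 * dispersion_delay(dm, f):8.3f} ms")
```

Even for this modest DM the delay is tens of milliseconds at 1.4 GHz and several times larger at 400 MHz, far above the sub-microsecond precision sought in timing, which is why DM has to be tracked epoch by epoch.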
The term $\Delta_R$, the dominant correction, is the Roemer delay and accounts for the
varying distance of the telescope from the SSB:
$$\Delta_R = \frac{\vec r \cdot \hat n}{c} + \frac{(\vec r \cdot \hat n)^2 - |\vec r|^2}{2cd} \tag{12.4}$$
where $\vec r$ is the vector pointing from the SSB to the Earth and $\hat n$ is the unit vector pointing
from the SSB to the pulsar. The second term in Eq. 12.4 is due to the propagation of
the spherical radio waves from the source to the telescope and contains the pulsar
distance d: so far, it has been accurately measured for only five pulsars.
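The two terms of Eq. 12.4 differ enormously in size, as the following Python sketch shows for an observer at 1 AU from the SSB. The geometry (Earth placed exactly 1 AU from the barycentre, a pulsar at 1 kpc) and the names used are illustrative assumptions, not a real ephemeris calculation.

```python
import numpy as np

C = 299_792_458.0      # speed of light, m/s
AU = 1.495978707e11    # astronomical unit, m
KPC = 3.0857e19        # kiloparsec, m


def roemer_delay(r, n_hat, d):
    """Eq. 12.4: r.n/c + [(r.n)^2 - |r|^2] / (2 c d), in seconds.

    r     : vector from the SSB to the observer, m
    n_hat : unit vector from the SSB to the pulsar
    d     : pulsar distance, m
    """
    r = np.asarray(r, dtype=float)
    n_hat = np.asarray(n_hat, dtype=float)
    proj = r @ n_hat
    return proj / C + (proj**2 - r @ r) / (2.0 * C * d)


# Observer displaced along the line of sight: only the first term contributes
along = roemer_delay([AU, 0.0, 0.0], [1.0, 0.0, 0.0], KPC)
# Observer displaced perpendicular to it: only the wavefront-curvature term is left
across = roemer_delay([0.0, AU, 0.0], [1.0, 0.0, 0.0], KPC)

print(f"along the line of sight : {along:.3f} s")
print(f"across the line of sight: {across:.3e} s")
```

The first term reaches about 500 s (the light travel time over 1 AU), while the distance-dependent term is of the order of a microsecond at 1 kpc, which gives an idea of why so few pulsar distances have been measured this way.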
The term $\Delta_E$ accounts for the relativistic delay due to Earth motion and the
gravitational red-shift caused by other celestial bodies in the Solar System. Its time
derivative is (Burgay et al. 2021)
$$\frac{d\Delta_E}{dt} = \sum_k \frac{G m_k}{c^2\, d_{k,\oplus}} + \frac{v_\oplus^2}{2c^2} - \mathrm{const} \tag{12.5}$$
Fig.12.4 Diagram showing the basic concept of pulsar timing observation. Credits: Lorimer (2008)
The term $\Delta_S$ accounts for the Shapiro delay (see Sect. 5.4.1) acquired by photons
travelling in the gravitational field of the Sun and planets (mostly Jupiter).
All these corrections produce our best estimate of the Solar System barycentric TOA: due
to the variability of $DM$, clock corrections must regularly be updated (Lynch 2015).
Once the barycentric TOAs $t$ have been derived, we can compute the pulse number
$N$ that represents a “counter” for the number of pulsar rotations, i.e. of received radio
pulses
$$ N(t) = N_0 + \nu_0 (t - t_0) + \frac{1}{2}\dot{\nu}_0 (t - t_0)^2 + \frac{1}{6}\ddot{\nu}_0 (t - t_0)^3 + \dots \tag{12.6} $$
where $N_0$ is the pulse number at the reference time $t_0$, while $\nu_0$ and $\dot{\nu}_0$, $\ddot{\nu}_0$ are
respectively the pulsar spin frequency and its time derivatives at $t_0$. The right-hand side
of Eq. (12.6) is the Taylor expansion of the pulsar rotational phase, in turns, about $t_0$.
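A minimal sketch of Eq. (12.6) follows, truncated at the cubic term. The spin frequency and its first derivative are the values listed for PSR 1913+16 in Table 12.1; the one-day interval, the choice $N_0 = 0$ and the function name are illustrative assumptions.

```python
def pulse_number(t, t0, n0, nu0, nu_dot=0.0, nu_ddot=0.0):
    """Eq. (12.6): rotational phase in turns at barycentric time t (s)."""
    dt = t - t0
    return n0 + nu0 * dt + 0.5 * nu_dot * dt**2 + nu_ddot * dt**3 / 6.0


NU0 = 16.940537785677   # Hz, signal frequency from Table 12.1
NU_DOT = -2.4733e-15    # Hz/s, signal frequency derivative from Table 12.1
DAY = 86400.0           # s

# Number of rotations accumulated in one day, with and without spin-down
n_day = pulse_number(DAY, 0.0, 0.0, NU0, NU_DOT)
n_day_no_spindown = pulse_number(DAY, 0.0, 0.0, NU0)

print(f"rotations in one day: {n_day:.6f}")
print(f"turns lost to spin-down in one day: {n_day_no_spindown - n_day:.2e}")
```

Over a single day the spin-down term changes the phase by only about $10^{-5}$ turns, but since it grows quadratically with time it cannot be ignored over a multi-year data set.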
The timing procedure requires several parameters (position, proper motion, rota-
tion frequency and higher derivative, etc.) that are not known a priori (or they are
initially known with limited precision). Given a minimal set of starting parameters,
a least-squares fit is needed to match the measured arrival times to pulse numbers,
according to Eq. (12.6). Therefore, we minimize the expression
$$ \chi^2 = \sum_i \left( \frac{N(t_i) - n_i}{\sigma_i} \right)^2 , \tag{12.7} $$
where $n_i$ is the nearest integer to $N(t_i)$ and $\sigma_i$ is the TOA uncertainty in units of
pulse period. The aim is to obtain a phase-coherent solution that accounts for every
single rotation of the pulsar between two observations. Starting off with a small set
of TOAs, the data set is gradually expanded (Fig. 12.4).
Ideally the residuals $r_i = N(t_i) - n_i$ should be randomly distributed with zero
average. A different trend can suggest that one or more parameters are incorrectly
evaluated: for example a linear trend (the residuals increase linearly over time),
implies a wrong assumption about the frequency ν0 . A periodic modulation of the
residuals with a period ∼1 year implies that the position of the pulsar, which appears
in the Roemer delay, is not well identified (Fig. 12.5).
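The behaviour of the residuals described above can be reproduced with a toy model. The sketch below builds noiseless TOAs from an assumed "true" spin frequency, then computes the residuals $r_i$ and the $\chi^2$ of Eq. (12.7) for the correct frequency and for one that is wrong by $10^{-9}$ Hz. The TOA spacing, the frequency offset, the uncertainty of $10^{-4}$ turns and all names are assumptions chosen only to make the linear trend visible.

```python
def pulse_number(t, nu0, t0=0.0, n0=0.0):
    """Eq. (12.6) truncated at the linear term (no spin-down in this toy)."""
    return n0 + nu0 * (t - t0)


def residuals(toas, nu0):
    """r_i = N(t_i) - n_i, with n_i the nearest integer to N(t_i)."""
    out = []
    for t in toas:
        n = pulse_number(t, nu0)
        out.append(n - round(n))
    return out


def chi2(res, sigmas):
    """Eq. (12.7), with residuals and uncertainties in turns."""
    return sum((r / s) ** 2 for r, s in zip(res, sigmas))


NU_TRUE = 16.940537785677   # Hz, assumed true spin frequency
DELTA_NU = 1e-9             # Hz, deliberate error in the model frequency

# Noiseless TOAs: arrival times of pulses number 1e6, 3e6, ..., 1.5e7
toas = [k / NU_TRUE for k in range(1_000_000, 17_000_000, 2_000_000)]
sigmas = [1e-4] * len(toas)  # assumed TOA uncertainty, in turns

res_good = residuals(toas, NU_TRUE)
res_bad = residuals(toas, NU_TRUE + DELTA_NU)

# With a wrong frequency the residuals grow linearly: r_i = DELTA_NU * t_i
slopes = [r / t for r, t in zip(res_bad, toas)]

print("chi2 (correct nu0):", chi2(res_good, sigmas))
print("chi2 (wrong nu0):  ", chi2(res_bad, sigmas))
print("slope of residuals (Hz):", slopes[-1])
```

Note that the rounding step only works while the accumulated phase error stays below half a turn: this is the phase-coherence requirement mentioned in the text.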
Moreover, since we are considering the signals emitted and received in two iner-
tial frames, we have also to evaluate and subtract the contribution due to the galactic
acceleration acting both on SSB and the pulsar (Burgay et al. 2021). Summarizing,
the goal of pulsar timing is to develop a model of the pulse phase as a function of
time, so that all future pulse arrival times can be predicted with good accuracy. All
these continuous estimates and corrections allow for extremely precise measurement
of the model parameters, illustrating the power of pulsar timing.
If the pulsar is part of a binary system, and orbits with its companion star around
their centre of mass, we have to extend the timing model to take into account the
modulation caused by the pulsar motion (Fig. 12.6). Therefore, TOAs must be con-
verted from the pulsar proper time to the Binary System Barycenter (BSB) time.
Other terms must be added to Eq. 12.3
where the terms on the right represent corrections similar to those seen for the Solar
System (Earth motion, red-shift, Shapiro delay) but are now referred to the pulsar-
companion system. In the following, we label with the subscript p all quantities
referred to the pulsar, and with c those referred to the companion star, while b
indicates the binary system.
Modelling the TOAs of a binary pulsar allows us to derive the value of five
Keplerian parameters (see Chap. 1): the pulsar orbital period $P_b$; the semi-major
axis (projected on a plane normal to the line of sight) $x$; the eccentricity
$e$; the longitude of periastron $\omega$ and the epoch of passage of periastron $T_0$. These
parameters are the same as those obtained with normal spectroscopic binaries, but
12.3 Post Keplerian Parameters 325
the use of pulsar timing allows a much higher precision (Burgay et al. 2021). Kepler’s
third law allows us to derive a relation between these parameters and the masses of
the stars, $m_p$ and $m_c$, called the Mass Function
$$ f(M) = \frac{(m_c \sin i)^3}{M^2} = \frac{4\pi^2 x^3}{G P_b^2}\cdot(1 - e^2)^{3/2} \tag{12.9} $$
where $M = m_p + m_c$ and $i$ is the angle between the plane of the orbit and the
plane defined above. Moreover, pulsar binary systems are often in a regime of
strong gravity, and relativistic phenomena take place that cannot be described by
Newtonian-Keplerian dynamics, such as the precession of periastron. Therefore, additional
parameters have to be included in the timing model: they are called Post Keplerian (PK)
Parameters. We list them here with their expression derived in the framework of
General Relativity, expressing the star masses in units of solar mass and defining
$T_\odot \equiv G M_\odot / c^3 = 4.92549$ µs. The five PK parameters are:
– the rate of advance of periastron
$$ \dot{\omega} = 3 \left( \frac{2\pi}{P_b} \right)^{5/3} (T_\odot M)^{2/3}\, \frac{1}{(1 - e^2)} \tag{12.10} $$
We repeat here the warning given in Chap. 1: in this context, $\omega$ is an angle, not an
angular frequency!
– the gravitational red-shift and time dilation parameter (no relation with the PPN
parameter $\gamma$!)
$$ \gamma^* = e \left( \frac{P_b}{2\pi} \right)^{1/3} T_\odot^{2/3}\, M^{-4/3}\, m_c (m_p + 2 m_c) \tag{12.11} $$
Two parameters are related to the Shapiro delay, which can be expressed, in terms
of the pulsar anomaly $\theta$, by:
$$ \Delta_S^{bin} = 2\, r \log \left[ \frac{1 + e \cos\theta}{1 - s \sin(\omega + \theta)} \right] $$
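As a numerical check of Eqs. (12.10) and (12.11), the following sketch evaluates $\dot\omega$ and $\gamma^*$ for PSR 1913+16 from the masses and Keplerian parameters listed in Table 12.1, and the Shapiro delay expression above for PSR J0737−3039A with the $r$, $s$, $e$ and $\omega$ of Table 12.2. The function names are hypothetical, the logarithm is assumed to be natural, and the orbital phase chosen for the Shapiro delay ($\omega + \theta = 90°$, where the delay is largest) is an illustrative choice.

```python
import math

T_SUN = 4.92549e-6        # G*M_sun/c^3 in seconds, as quoted in the text
DAY = 86400.0             # s
YEAR = 365.25 * DAY       # s


def omega_dot(pb_s, e, m_tot):
    """Eq. (12.10): periastron advance in rad/s (masses in solar units)."""
    return (3.0 * (2.0 * math.pi / pb_s) ** (5.0 / 3.0)
            * (T_SUN * m_tot) ** (2.0 / 3.0) / (1.0 - e**2))


def gamma_star(pb_s, e, m_p, m_c):
    """Eq. (12.11): red-shift and time dilation parameter, in seconds."""
    m_tot = m_p + m_c
    return (e * (pb_s / (2.0 * math.pi)) ** (1.0 / 3.0) * T_SUN ** (2.0 / 3.0)
            * m_tot ** (-4.0 / 3.0) * m_c * (m_p + 2.0 * m_c))


def shapiro_delay(r_s, s, e, omega_rad, theta_rad):
    """Binary Shapiro delay in seconds (natural logarithm assumed)."""
    return 2.0 * r_s * math.log((1.0 + e * math.cos(theta_rad))
                                / (1.0 - s * math.sin(omega_rad + theta_rad)))


# PSR 1913+16, values from Table 12.1
PB = 0.322997448918 * DAY
ECC = 0.6171340
M_P, M_C = 1.4414, 1.3867

omega_dot_deg_yr = math.degrees(omega_dot(PB, ECC, M_P + M_C)) * YEAR
gamma_ms = gamma_star(PB, ECC, M_P, M_C) * 1e3

# PSR J0737-3039A, values from Table 12.2, at the phase omega + theta = 90 deg
omega_a = math.radians(73.805)
theta_max = math.pi / 2.0 - omega_a
shapiro_max_us = shapiro_delay(5.6e-6, 0.9995, 0.087779, omega_a, theta_max) * 1e6

print(f"omega_dot = {omega_dot_deg_yr:.3f} deg/yr")
print(f"gamma*    = {gamma_ms:.2f} ms")
print(f"max Shapiro delay (J0737-3039A) = {shapiro_max_us:.0f} microseconds")
```

The computed $\dot\omega$ and $\gamma^*$ can be compared with the measured values in Table 12.1; small differences in $\gamma^*$ are expected because the table combines masses and PK parameters from two different analyses.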
Fig. 12.7 Left: the shift in the periastron passage of the double neutron star system PSR 1913+16,
plotted over four decades. The shift results from orbital energy loss due to the emission of gravita-
tional radiation. The plot shows the perfect agreement between the observed orbital decay (black
dots) and the prediction by GR (solid line). The lack of data in the period 1992–1998 was due to
an overhaul of the Arecibo observatory. Credits: Weisberg and Huang (2016). Right: Mass-Mass
diagram for PSR 1913+16. The grey zone is forbidden by the Mass Function and the constraint
$\sin i \le 1$. The three curves, describing the values of $\dot{\omega}$, $\gamma^*$, $\dot{P}_b$ as computed within GR, meet in one
point, yielding precise values for the two star masses. Adapted from Weisberg and Huang (2016)
The first binary pulsar, PSR 1913+16, was discovered in 1974 by R. Hulse and J.
Taylor using the Arecibo Telescope (Hulse and Taylor 1974). It is the most famous
and most extensively studied binary system, also because it brought the first indirect
confirmation of the existence of GWs.
PSR 1913+16 is a pulsar located about 5 kpc away from the Earth, in the Aquila
constellation. The radio pulse has a repetition time $P \approx 59$ ms, but this varies by
about one part in 1,000 with a periodicity of 7.75 hours. These variations were
interpreted as a Doppler shift associated with the orbital motion about a
companion, with velocities of the order of $10^{-3}c$. At first, Hulse and Taylor derived
a velocity curve versus time, which deviated from a sinusoidal dependence. This
distortion of the curve was the crucial element that allowed them to compute the orbital
parameters of the system. By a detailed fit of the velocity curve under the assumption of a
Keplerian two-body orbit, they derived the parameters reported in Table 12.1.
The orbit is an ellipse of eccentricity 0.62 and the binary system is composed of
two neutron stars, each having a mass of about 1.4 solar masses, with a semi-major
axis separation of only 2.8 solar radii.3 The minimum separation, at periastron, is
about 1.1 solar radii, while the maximum separation, at apoastron, is 4.8 solar radii.
The orbit is inclined at about 45° with respect to the line of sight.
3 Another odd unit of astronomers: measuring star dimensions in terms of the solar radius, $R_\odot =
6.957 \cdot 10^8$ m $\approx$ 1/215 AU, as defined in 2015 by the IAU. It is indeed suggestive to think of a
two-star system rotating in an orbit that contains just 3 Suns.
Table 12.1 Coordinates, Keplerian orbital elements and post-Keplerian parameters of PSR
1913+16. In parentheses, the uncertainty on the last digits. Adapted from Weisberg et al. (2010),
Weisberg and Huang (2016)

Parameter                                                   Value
Observed parameters
  Epoch $T_0$                                               MJD 52984.0
  Right ascension $\alpha$                                  19h 15m 27s.99928(9)
  Declination $\delta$                                      16° 00′ 27″.3871(13)
  Signal frequency $f$                                      16.940537785677(3) Hz
  Signal frequency derivative $\dot{f}$                     $-2.4733(1) \cdot 10^{-15}$ Hz/s
Fitted Keplerian parameters
  Orbital period $P_b$                                      0.322997448918(3) days
  Eccentricity $e$                                          0.6171340(4)
  Projected semi-major axis $x \equiv a_p \sin i / c$       2.341776(2) s
  Argument of periastron $\omega_0$                         292.54450(8) deg
  Orbital velocity of stars at periastron $v_p$             450 km/s
    (with respect to the centre of mass)
  Orbital velocity of stars at apoastron $v_a$              10 km/s
    (with respect to the centre of mass)
Post-Keplerian parameters
  Rate of advance of periastron $\dot{\omega}$              4.226585(4) deg/yr
  Red-shift parameter $\gamma^*$                            4.307(4) ms
  Orbital period derivative $\dot{P}_b$                     $-2.423(1) \cdot 10^{-12}$
  Mass of companion $m_c$                                   1.3867(2) $M_\odot$
  Mass of pulsar $m_p$                                      1.4414(2) $M_\odot$
Soon after the pulsar discovery, it was realized that the system is a natural
laboratory for testing GR to high precision. The first example is the
relativistic precession of periastron: the orientation of periastron changes by about
4.2° per year in the direction of the orbital motion (in January 1975, it was oriented so
that periastron occurred perpendicular to the line of sight from Earth).
This amount can be compared with the GR perihelion advance of Mercury,
43 arcsec per century: the precession this system achieves in 1 day is as much
as Mercury does in a century.
Table 12.1 reports recent determinations of Keplerian and post-Keplerian
parameters for the binary system PSR 1913+16, referred to the epoch MJD 52984.0.4 As
the authors wrote: “It is interesting to note that since the value of Newton’s constant
G is known to a fractional accuracy of only $\sim 10^{-4}$, masses can be expressed more
accurately in solar masses than in kg”. Indeed, the accuracy on $T_\odot$ is $\sim 10^{-6}$.
4 The Modified Julian Date (MJD) counts the number of days since midnight on November 17,
A binary star system is a rotating mass quadrupole: therefore, GR predicts it to
emit gravitational radiation; the loss of energy due to the GW emission will result in a
shrinking of the orbit, and in a decrease of the orbital period at the rate $\dot{P}_b$. The measurements
of the orbital decay span from 1974 to the present: since the early 1980s, it was clear
that the observed orbital decay of this binary system was in agreement with the GR
predictions, an agreement that, today, is better than 0.5%. The total power of the GWs
emitted by this system is calculated to be $7.35 \times 10^{24}$ W. With this energy loss, the
orbital period decreases by 76.5 µs/year, the semi-major axis decreases by 3.5 metres
per year and the calculated lifetime up to the final merger of the two neutron stars is
some 300 million years.
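The figure of 76.5 µs/year follows directly from the measured $\dot{P}_b$ of Table 12.1, as the short sketch below shows. It also estimates the cumulative shift of the periastron passage time, the quantity plotted in Fig. 12.7 (left), using the standard kinematic relation $\Delta t \approx \frac{1}{2} (\dot{P}_b / P_b)\, t^2$ for a uniformly changing period; this relation is not given in the text and is an assumption of the example, as are the variable names and the 30-year span.

```python
DAY = 86400.0            # s
YEAR = 365.25 * DAY      # s

PB = 0.322997448918 * DAY   # orbital period of PSR 1913+16, Table 12.1
PB_DOT = -2.423e-12         # orbital period derivative (s/s), Table 12.1

# Change of the orbital period accumulated in one year
period_change_per_year = PB_DOT * YEAR


def periastron_shift(t_s):
    """Cumulative shift (s) of periastron time after t_s seconds, assuming a
    constant PB_DOT (standard kinematic relation, not taken from the text)."""
    return 0.5 * PB_DOT / PB * t_s**2


shift_30yr = periastron_shift(30.0 * YEAR)

print(f"period change: {period_change_per_year * 1e6:.1f} microseconds/year")
print(f"periastron shift after 30 years: {shift_30yr:.1f} s")
```

Although the period changes by only tens of microseconds per year, the shift of the periastron time grows quadratically and reaches tens of seconds over the decades covered by the observations.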
This observation put an end, 30 years before the direct detection in 2015, to the scientific
debate on the existence of gravitational waves: the results proved that GWs are a
physical phenomenon, not just a mathematical feature of the Einstein field equations.
It played a central role in the approval
of the construction of the Earth-based interferometers. Hulse and Taylor received the
1993 Nobel Prize in Physics in recognition of their achievement.
Inserting these values in Eq. 12.12, we derive the GR prediction $\dot{P}_b^{GR}$. Comparing
with the observed values $\dot{P}_b^{obs}$, we get (Weisberg 2016)
$$ \frac{\dot{P}_b^{obs}}{\dot{P}_b^{GR}} = 0.9983 \pm 0.0016 \tag{12.15} $$
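The GR prediction can be reproduced, at least approximately, with the standard quadrupole-formula expression for the orbital decay of an eccentric binary (Peters 1964). Eq. 12.12 is not reproduced in this excerpt, so the expression coded below is an assumption about its content rather than a transcription; the masses and Keplerian parameters are those of Table 12.1 and the function name is hypothetical.

```python
import math

T_SUN = 4.92549e-6   # G*M_sun/c^3 in seconds, as quoted in the text
DAY = 86400.0        # s


def pb_dot_gr(pb_s, e, m_p, m_c):
    """Orbital period derivative from the quadrupole formula
    (assumed form of Eq. 12.12; masses in solar units, result in s/s)."""
    m_tot = m_p + m_c
    ecc_factor = (1.0 + 73.0 / 24.0 * e**2 + 37.0 / 96.0 * e**4) / (1.0 - e**2) ** 3.5
    return (-192.0 * math.pi / 5.0
            * (2.0 * math.pi * T_SUN / pb_s) ** (5.0 / 3.0)
            * ecc_factor * m_p * m_c / m_tot ** (1.0 / 3.0))


# PSR 1913+16, values from Table 12.1
PB = 0.322997448918 * DAY
ECC = 0.6171340
M_P, M_C = 1.4414, 1.3867
PB_DOT_TABLE = -2.423e-12   # value listed in Table 12.1

pb_dot_pred = pb_dot_gr(PB, ECC, M_P, M_C)
raw_ratio = PB_DOT_TABLE / pb_dot_pred

print(f"GR prediction: {pb_dot_pred:.4e}")
print(f"ratio of tabulated to predicted value: {raw_ratio:.4f}")
```

The raw ratio obtained here is slightly above one: the tabulated $\dot{P}_b$ still contains the kinematic contribution of the galactic acceleration mentioned earlier, which has to be subtracted before arriving at a comparison such as Eq. (12.15).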
The best binary pulsar system to test gravity theories is, to this day, PSR J0737-3039
A and B, for several reasons: the two stars are very close to each other, $x_A/c =
1.42$ s, $x_B/c = 1.51$ s, leading to motion at almost relativistic velocities in a very
strong gravity field; their orbital plane is almost parallel to the line of sight, $i \approx 88°$,
so that the system is seen edge-on, and the effect of the Shapiro delay is enhanced; last
and foremost, both neutron stars have been seen as pulsars and therefore we had two
“clocks” to perform timing on. To give one example of how strong relativistic effects
are in this binary system, we mention the precession of periastron: $\dot{\omega} = 16.9°$/yr.
This is more than $10^5$ times larger than the 43 arcsec/century of Mercury!
For PSR J0737-3039, all five PK parameters are measured, and Fig. 12.8 shows
how well their determination agrees with GR. In this system, one additional
relativistic effect has been measured, the rate of the geodetic precession $\Omega_B$ for the spin
of star B.
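The two comparisons with Mercury made in this chapter are simple unit conversions, sketched below with the periastron advance rates quoted in the text and in Table 12.1. The constant and variable names are illustrative.

```python
ARCSEC_PER_DEG = 3600.0
DAYS_PER_YEAR = 365.25

MERCURY_ARCSEC_PER_CENTURY = 43.0   # GR perihelion advance of Mercury
OMEGA_DOT_1913 = 4.226585           # deg/yr, PSR 1913+16 (Table 12.1)
OMEGA_DOT_0737 = 16.9               # deg/yr, PSR J0737-3039


def deg_per_year_to_arcsec_per_century(rate_deg_yr):
    """Convert a precession rate from deg/yr to arcsec/century."""
    return rate_deg_yr * ARCSEC_PER_DEG * 100.0


# PSR 1913+16: precession accumulated in one day, in arcsec
arcsec_per_day_1913 = OMEGA_DOT_1913 * ARCSEC_PER_DEG / DAYS_PER_YEAR

# Double Pulsar: ratio to Mercury's rate
ratio_0737 = (deg_per_year_to_arcsec_per_century(OMEGA_DOT_0737)
              / MERCURY_ARCSEC_PER_CENTURY)

print(f"PSR 1913+16: {arcsec_per_day_1913:.1f} arcsec per day")
print(f"PSR J0737-3039 / Mercury: {ratio_0737:.2e}")
```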
PSR J0737-3039 was discovered in May 2003. Just 3 months later, many parame-
ters of both pulsars had been determined to a great accuracy, as shown in Table 12.2.
In binary systems with only one visible pulsar, it takes years of accumulated data to
achieve these measurements. The precision of these values has been largely improved
since this first determination.
Since March 2008, the pulses of star B are no longer observable: due to the spin
precession, the radio beam is no longer shining on the Earth. But don’t despair: it
should reappear in 2035.
More than 2800 pulsars have been discovered since 1967,5 and about 10 % of these
belong to a binary system. At least one pulsar, J0337+1715 is known to belong to a
triple system (Voisin et al. 2020). Each one has its peculiarity, and timing observations
have played a role in improving our knowledge of astrophysics and gravitational
theory. Just to mention a few:
• PSR J1738+0333 has been used to set limits on PPN parameters, as discussed in
the next section;
• the triple system PSR J0337+1715 has constrained SEP violations and improved
the constraint on the Brans-Dicke parameter ω > 140 000;
• PSR J0348+0432 and its white-dwarf companion have set limits on scalar-tensor
theories;
• PSR J0348+0432, with a massive (2 $M_\odot$) pulsar in a relativistic orbit, has
constrained the emission of dipolar GW;
• PSR J0437-4715 gives interesting bounds on $\dot{G}$.
In Chap. 6, we have described the PPN formalism to test and compare different the-
ories of gravity in the weak-field regime of the Solar System. Pulsar observations
do not directly constrain the parameters γ or β but can provide a wealth of
information useful in constraining many of the other PPN parameters, which GR sets to
zero: indeed, these observations currently place the strongest constraints on several
parameters. However, the PPN formalism cannot be applied, as is, to binary systems
Table 12.2 Keplerian and post-Keplerian parameters for the Double Pulsar, evaluated within 3
months (Lyne et al. 2004) from the discovery of PSR J0737−3039B. The distance is estimated from the
dispersion measure

Parameter                                          PSR J0737−3039A        PSR J0737−3039B
Observed parameters
  Right ascension (J2000)                          07h 37m 51s.247(2)     −
  Declination (J2000)                              −30° 39′ 40″.74(3)     −
  Pulse period $P$ (ms)                            22.69937855615(6)      2773.4607474(4)
  Pulse period derivative $\dot{P}$                $1.74(5) \cdot 10^{-18}$   $0.88(13) \cdot 10^{-15}$
  Dispersion measure DM (cm$^{-3}$ pc)             48.914(2)              48.7(2)
Fitted Keplerian parameters
  Orbital period $P_b$ (day)                       0.102251563(1)         −
  Eccentricity $e$                                 0.087779(5)            −
  Epoch of periastron $T_0$ (MJD)                  52870.0120589(6)       −
  Longitude of periastron $\omega$ (deg)           73.805(3)              73.805 + 180.0
  Projected semi-major axis $x = a \sin i / c$ (s) 1.41504(2)             1.513(4)
Post-Keplerian parameters
  Advance of periastron $\dot{\omega}$ (deg/yr)    16.90(1)               −
  Gravitational red-shift parameter $\gamma^*$ (ms) 0.38(5)
  Shapiro delay parameter $s$                      0.9995(−32, +4)
  Shapiro delay parameter $r$ (µs)                 5.6(−12, +18)
  Orbital inclination from Shapiro $s$ (deg)       87(3)
  Distance from SSB (kpc)                          ∼0.6
  RMS timing residual (µs)                         27                     2660
  Stellar mass ($M_\odot$)                         1.337(5)               1.250(5)
of compact objects, because the hypotheses of weak field and slow motion break
down in these strongly gravitating systems. However, we can extend this formalism
to the binary stars, by applying some straightforward changes (Will 2018), if the
following conditions are verified: (a) the two stars are sufficiently far away from
each other (distance much larger than their diameters), so that tidal interactions can
be neglected and (b) their velocities are well below c.
It is thus customary to label the strong-field parameters with a hat: so, e.g. the
parameters αi are modified as α̂i . This is the standard choice in the literature and
could cause a notation conflict, as we use the hat to indicate unit vectors. We rely on
the common sense of the reader to distinguish in the very few ambiguous cases.
In the weak field limit, we then recover the usual PPN values αi .
The parameters $\alpha_1$, $\alpha_2$, $\alpha_3$ account for possible violation of the Local Lorentz
Invariance (LLI). This means that, in the case of any $\hat{\alpha}_i \neq 0$,
there should be an observable
effect due to the relative motion with velocity w between the reference frame and
the universal preferred rest frame. The review by Shao and Wex (2012) has more
details on this topic.
In this game, we first need to determine the preferred reference system to evaluate
w.
The most obvious choice is the frame in which the Cosmic Microwave Background
(CMB) shows no dipole anisotropy: we assume the CMB to be at rest in this frame.
The pulsar barycentre velocity with respect to the CMB is measured in two steps:
first, the velocity of the pulsar with respect to the SSB and then, the velocity of the
SSB with respect to the CMB.
The timing technique, described above, does not provide information about the radial
velocity of the system; therefore, experiments to test the LLI, performed over a binary
system with a pulsar and a White Dwarf (WD) are to be preferred, so that the radial
velocity can be obtained from the spectroscopic analysis of the visible companion.
We need to recall here the definition of the eccentricity vector e: a dimensionless
vector with magnitude equal to the orbit’s eccentricity e and direction pointing from
apoapsis to periapsis. In a tightly interacting binary system, e rotates due to the pre-
cession of periastron, but its modulus is constant.
We call $\Psi$ the angle between $\mathbf{w}$ and the unit vector $\hat{J}$ of the total angular momentum.
So, the total eccentricity $e = |\mathbf{e}_R + \mathbf{e}_F|$ changes its magnitude in time. Therefore,
12.5 Limits on PPN Parameters from Binary Pulsars 333
the binary orbit changes from a less to a more eccentric configuration and back on
the time scale of the periastron advance:
$$ T_e \approx \frac{2\pi}{\dot{\omega}} \approx 10^4\ \mathrm{years} \cdot \left( \frac{P_b}{1\ \mathrm{day}} \right)^{5/3} \left( \frac{2 M_\odot}{M} \right)^{2/3} $$
This perturbation of the binary orbit is actually an extension of the Nordtvedt effect
discussed in Chap. 6.
Unfortunately, the angle between $\mathbf{w}$ and $\hat{J}$ depends on the longitude of the ascending node,
which is not, usually, a directly measurable variable. Moreover, the angle between
$\mathbf{e}_F$ and $\mathbf{e}_R$ is unknown; therefore it is not possible to predict how the value of $e$ will
change. However, if the system is old and $\dot{\omega}$ is large enough so that the periastron
has completed several turns, this angle can be assumed as a uniformly distributed
variable between 0 and 2π.
The best binary pulsar system to test the value of $\hat{\alpha}_1$ is J1738+0333 (PSR
J1012+5307 is pretty good too): it has a short orbital period, $P_b \sim 0.35$ days,
extremely low orbital eccentricity $e < 10^{-7}$ and a precession rate $\dot{\omega} \approx 1.6°$/yr: this
means that during the observation time, now over a decade, the rotating eccentricity
has swept an angle wide enough to make a change in eccentricity measurable. The
analysis of the PSR J1738+0333 data sets the limit
$$ \hat{\alpha}_1 = -0.4^{+3.7}_{-3.1} \times 10^{-5} \tag{12.18} $$
This result is five times better than that obtained with observations in the Solar
System with LLR and, even more important, also constrains strong-field effects.
This precession would cause the plane of the orbit, normal to J, to change its orienta-
tion with respect to the line of sight, leading to a change of the value of the projected
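semi-major axis $x = a \sin i$:
$$ \frac{\dot{x}}{x} = -\frac{\hat{\alpha}_2}{4}\, \frac{2\pi}{P_b} \left( \frac{w}{c} \right)^2 \frac{\cos i}{\sin i}\, \sin 2\Psi \cos\theta \tag{12.20} $$
where $\Psi$ is the angle between $\mathbf{w}$ and $\mathbf{J}$, and $\theta$ is the angle between the direction of
the ascending node and the projected component of $\mathbf{w}$ on the orbital plane. $x$ and $\dot{x}$ in
Eq. 12.20 are directly measurable by the timing procedure; the inclination angle $i$ can be
obtained by rewriting the mass function, Eq. 12.9, as a function of the mass ratio $m_p/m_c$.
This latter ratio can be derived from the star velocities of the binary system, since it is
composed of a pulsar and a WD. This precession could be masked by other effects that cause
a variation in the observed $x$
$$ \left( \frac{\dot{x}}{x} \right)_{TOT} = \left( \frac{\dot{x}}{x} \right)_{\dot{P}_b} + \left( \frac{\dot{x}}{x} \right)_{PM} + \left( \frac{\dot{x}}{x} \right)_{GR} + \left( \frac{\dot{x}}{x} \right)_{\hat{\alpha}_2} \tag{12.21} $$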
The first term accounts for the orbit shrinking due to the loss of energy via GW
emission. The contribution of this term can be estimated from Kepler’s third law:
being proportional to $\dot{P}_b / P_b$, it is usually of order $10^{-19}$ s$^{-1}$, about 4 orders of
magnitude smaller than the relevant scale. The second term accounts for the proper
motion of the system. Since a typical pulsar describes an arc of significant extension
on the celestial sphere when observed from Earth, its orbital inclination with respect
to the plane of the sky is constantly changing.
The term $(\dot{x}/x)_{GR}$ accounts for other gravity effects that could cause the
precession of the total angular momentum, such as the quadrupole moment of the WD
and the Lense-Thirring effect. However, they are negligible unless the WD has a
very high spin ($P_{WD} < 200$ s), which is not the case for the systems discussed here.
From the combined analysis of PSR J1012+5307 and PSR J1738+0333 an upper
limit was set on $\hat{\alpha}_2$
$$ |\hat{\alpha}_2| < 1.8 \times 10^{-4} \tag{12.22} $$
which is three orders of magnitude larger than the limit obtained in the Solar System.
A better constraint is obtained by observing isolated pulsars. We still use Eq. 12.20,
replacing the orbital period by the spin period. The precession of the angular
momentum would change the direction of radio emission, causing a change in the shape
of the pulse profile. Observations of PSR B1937+21 and PSR J1744-1134 for over 15
years show no changes in the pulse profile, leading to a combined constraint (Wex
2014)
$$ |\hat{\alpha}_2| < 1.6 \times 10^{-9} \tag{12.23} $$
Modelling the pulse profile in order to quantify the changes is a difficult task. As we
are considering isolated pulsars, no information about the radial value of w can be
derived.
Tests of $\hat{\alpha}_3$ can be derived from both binary and single pulsars, using slightly different
techniques. A non-zero value would imply a violation of local Lorentz invariance as
well as non-conservation of momentum. It would cause a rotating body to experience
a self-acceleration in a direction orthogonal to both its spin angular velocity6 $\mathbf{\Omega}_p$
and its absolute velocity $\mathbf{w}$.
$$ \mathbf{a}_{CM} = -\frac{\hat{\alpha}_3}{3}\, U_p\, \mathbf{w} \times \mathbf{\Omega}_p \tag{12.24} $$
where $U_p = 2 G m_p / (r_p c^2)$ is (up to factors of order unity) the compactness parameter
and $\mathbf{\Omega}_p$ the angular velocity of the pulsar. The corresponding term for the companion
can be neglected since the $\hat{\alpha}_3$ test is performed on systems with white dwarfs, which have
a significantly smaller compactness. It is interesting to note that the double nature
of $\hat{\alpha}_3$ is shown, in this equation, by the presence of both a self-acceleration and the
velocity $\mathbf{w}$.
The self-acceleration determines a polarization of the orbit that, as in the case of
$\hat{\alpha}_1$, can be read as adding a forcing term that modifies the eccentricity vector, again
much like the Nordtvedt effect:
$$ |\mathbf{e}_F| = \hat{\alpha}_3\, \frac{U_p |\mathbf{w}|}{24\pi}\, P_b^2\, \Omega_p\, \frac{c^2}{G (m_p + m_c)}\, \sin\beta \tag{12.25} $$
The best constraints on $\hat{\alpha}_3$ are obtained by selecting a sample of binary systems with
large orbital periods $P_b$ and small eccentricity. Following this approach, the achieved
limit is
$$ |\hat{\alpha}_3| < 5.5 \times 10^{-20} \tag{12.26} $$
This shows that $\hat{\alpha}_3$ is the most tightly constrained PPN parameter.
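To give an idea of the orders of magnitude involved in Eq. (12.24), the sketch below evaluates the self-acceleration for a value of $\hat{\alpha}_3$ equal to the limit (12.26). The pulsar mass, radius and spin period, the magnitude and direction of $\mathbf{w}$ (taken of the order of the Solar System velocity with respect to the CMB) and the function names are all assumptions made for illustration only.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2
C = 299_792_458.0      # m/s
M_SUN = 1.989e30       # kg


def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compactness(m_kg, r_m):
    """U_p = 2 G m_p / (r_p c^2), as defined after Eq. (12.24)."""
    return 2.0 * G * m_kg / (r_m * C**2)


def self_acceleration(alpha3, u_p, w, omega_p):
    """Eq. (12.24): a_CM = -(alpha3/3) U_p (w x Omega_p), in m/s^2."""
    wxo = cross(w, omega_p)
    return tuple(-alpha3 / 3.0 * u_p * comp for comp in wxo)


# Illustrative pulsar (assumed values): 1.4 M_sun, 10 km radius, 5 ms period
u_p = compactness(1.4 * M_SUN, 1.0e4)
omega_p = (0.0, 0.0, 2.0 * math.pi / 5.0e-3)   # spin along z, rad/s
w = (3.7e5, 0.0, 0.0)                          # |w| ~ 370 km/s along x (assumed)

a_cm = self_acceleration(5.5e-20, u_p, w, omega_p)
a_mag = math.sqrt(dot(a_cm, a_cm))

print(f"compactness U_p = {u_p:.3f}")
print(f"|a_CM| at the limit (12.26): {a_mag:.2e} m/s^2")
```

By construction the acceleration is orthogonal to both $\mathbf{w}$ and $\mathbf{\Omega}_p$; its effect is therefore largest for rapidly spinning pulsars, which is consistent with the use of millisecond pulsars for this test (Table 12.3).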
A non-zero value of the Post Newtonian Parameter $\hat{\xi}$ also implies a violation of the
Local Position Invariance (LPI). As a consequence, the dynamics of a self-gravitating body could
depend on its position with respect to the gravitational field of the galaxy and therefore
on the overall distribution of matter.
A non-vanishing $\hat{\xi}$ would cause the precession of the total angular momentum
around the direction of the external gravitational field
$$ \Omega_p = \hat{\xi}\, \frac{2\pi}{P_b} \left( \frac{v_G}{c} \right)^2 \cos\chi \tag{12.27} $$
where $\chi$ is the angle between the direction pointing to the galactic centre and the
orbital velocity $v_G$ due to the galactic gravitational field. This equation is formally
identical to Eq. 12.20: therefore the same kind of analysis can be performed (Wex
2014), yielding
$$ |\hat{\xi}| < 3.1 \times 10^{-4} \tag{12.28} $$
In this case as well, we can get more stringent limits by looking at isolated pulsars:
then Pb is replaced by the pulsar spin period and we get
Because of the rotation of $\hat{e}_p$, the unit vector pointing from the centre of mass to
the periastron, this self-acceleration could be detected via the change in the pulsar
frequency $f$, due to the changing Doppler shift between SSB and BSB. This rate of
change is given by
$$ \frac{\dot{f}}{f} \approx -\frac{\mathbf{a}_{cm} \cdot \hat{n}}{c} $$
where $\hat{n}$ is a unit vector aligned with the line of sight.
The measurement of $\dot{f}$ does not effectively constrain the value of $\hat{\zeta}_2$ since $\dot{f}$ is
largely caused by the pulsar spin down due to the dipole emission of electromagnetic
waves.
The second time derivative of $f$, instead, is a better quantity to test the
existence of a self-acceleration term since it is much less sensitive to other factors.
Considering the second time derivative, and neglecting higher-order terms like
$O((\dot{f}/f)^2)$, $O(\dot{P}_b)$, $O(\ddot{P}_b)$, we can derive the relation
$$ \frac{\ddot{f}}{f} = \frac{\hat{\zeta}_2}{2}\, c\, T_\odot\, \frac{m_p (m_p - m_c)}{(m_p + m_c)^2} \left( \frac{2\pi}{P_b} \right)^2 \frac{e}{(1 - e^2)^{3/2}}\, \frac{d\omega}{dt}\, \sin i \cos\omega \tag{12.31} $$
The contribution due to e.m. emission is roughly two orders of magnitude smaller
than the term of interest and that due to the galactic acceleration is, for the observed
pulsars, as much as three orders of magnitude smaller. A third possible contribution
could arise due to the acceleration from a nearby third body. In the worst case, this
acceleration could cancel that due to ζ̂2 , but it is statistically unlikely that this would
happen for every observed pulsar.
Table 12.3 Upper bounds for the PPN parameters (as of 2020), measured in the strong or quasi-
strong field regime, in isolated or binary pulsars. Since the values of all these parameters are
consistent with zero, we report the experimental error, as a bound to each value. Data from: Imperi
et al. (2018), Will (2018) and references therein

Parameter           σ                        Limits from compact stars
$\hat{\xi}$         $3.9 \cdot 10^{-9}$      PSRs B1937+21 and J1744-1134
$\hat{\alpha}_1$    $3.7 \cdot 10^{-5}$      Pulsar-White dwarf binaries PSRs J1012+5307 and J1738+0333
$\hat{\alpha}_2$    $1.6 \cdot 10^{-9}$      PSRs B1937+21 and J1744-1134
$\hat{\alpha}_3$    $4 \cdot 10^{-20}$       Binary ms pulsars
$\hat{\zeta}_2$     $4 \cdot 10^{-5}$        Binary motion $\ddot{P}$ in PSR 1913+16
$\eta$              $1.8 \cdot 10^{-6}$      Triple system PSR J0337+1715
$\dot{G}/G$         $1.1 \cdot 10^{-12}$     21-year timing of J1713+0747
We finally note that in Eq. 12.31 the term $\cos\omega$ is time dependent, due to
precession. In the data analysis, two different approaches have been pursued: use a reference
value (usually $\cos\omega$ at half of the observation time) or use a value for $\cos\omega$ uniformly
distributed between all values. The two methods provide very similar limits for $\hat{\zeta}_2$:
$$ |\hat{\zeta}_2|_{reference} < 2.7 \times 10^{-5} \qquad |\hat{\zeta}_2|_{distributed} < 2.6 \times 10^{-5} \tag{12.32} $$
Note that Solar System experiments and observations set a much weaker limit,
$\xi < 10^{-3}$, while no bounds have been set on $\zeta_2$ (Table 12.3).
Assuming that all the pulsar timing parameters have been accurately estimated and
all the noise sources (described below, in Sect. 12.7) have been minimized, from the
so-called timing residuals we can extract a signal due to the metric change of a GW
crossing the radio signal path (Lynch 2015). In practice, the radio wave emitted by
the pulsar and received on the Earth probes the distance between these two bodies in a
way similar to light bouncing between two mirrors set at an enormous distance. This
technique is substantially identical to that proposed in the 1970s to detect GWs by
Doppler tracking of artificial satellites with radio beams (Estabrook and Wahlquist
1975), and it is not too different from that used to measure the Shapiro time delay
(see Sect. 5.4.1). The use of pulsar signals offers an enormously longer baseline,
and an extremely stable radio signal, making the Pulsar-Earth system an attractive
detector.
Fig. 12.10 Sensitivity curve of GW experiments to different sources of gravitational waves. Cour-
tesy of Alberto Sesana
Just as in e.m. antennas, a GW detector works best for wavelengths of the order
of its dimensions, $\lambda \approx 2L$; otherwise, the signal is reduced, because the two mirrors move
in phase and the differential motion is smaller.
The Earth-based interferometers, such as LIGO and Virgo, with their arms of 3–4 km,
detect GWs in the frequency range $10 - 10^4$ Hz. The Laser Interferometer
Space Antenna (LISA) will have arms of 2.5 million kilometres and is aimed
at detecting GWs in the frequency range $10^{-4} - 10^{-1}$ Hz. By analyzing the timing
signals emitted by an array of pulsars (Pulsar Timing Array - PTA), it is possible to
probe the space metric on distances of the order of parsecs and detect gravitational
waves of very low frequencies, $10^{-9} - 10^{-7}$ Hz. LIGO and Virgo detect GWs from
stellar-mass objects just before their merging, while PTAs are sensitive to GWs emitted
by supermassive black holes in the early stage of their inspirals. Therefore, PTAs
provide a view of the gravitational wave sky complementary to the Earth-based and
space-based interferometers (see Fig. 12.10). This makes them suited to investigate
the galaxy merger rate and the population of supermassive black hole binaries in the
Universe. PTAs also provide an opportunity to test the theory of gravitation in the
nanohertz regime (Dahal 2020). Just as in the case of ground-based interferometers,
the electromagnetic signals that travel along the arms of the detector acquire varying
delays, owing to the influence of the GW. In the case of LIGO-Virgo, the e.m. signal
is a laser light, while for pulsar timing the e.m. signal is the radio beam from a
pulsar. Figure 12.11 shows only a few pulsars, but PTA scientists actually monitor
many dozens of isolated pulsars (Lommen 2017).
12.7 Sensitivity and Noise in Pulsar Timing 339
Fig. 12.11 Artist’s rendition (not to scale) of pulsar timing: PTAs detect the changing time-of-flight
of the electromagnetic beam emitted by a pulsar as a GW passes through the beam propagation
path. Courtesy of David J. Champion
There are several issues to be considered when we plan to use pulsars as gravitational
wave detectors:
• The accuracy of the Pulsar Timing model: given an initial set of parameter
estimates, the accuracy of the model is improved by acquiring new data. However,
if we analyze a single pulsar, we would be unable to distinguish between data that
are actually improving our model and the ones that are affected by a GW crossing
the beam path. The main solution of this problem, in all detection schemes, is to
combine the signals of several pulsars. The drawback of this approach is that the
response of each pulsar to a GW depends on two terms: one associated with the
interaction of GW with the Earth, common to all the pulsars, and one due to the
interaction of the GW with the pulsar beam, different for each pulsar. We will
address this problem in Sect. 12.8.
• Wide-bandwidth: The second key issue in PTAs is the consequence of the
enormous gains in the receiver bandwidth of the radio telescopes in the present
configurations. This fact brings both advantages and disadvantages. The SNR in radio
telescopes, when the receiver thermal noise dominates, follows a simple scaling
rule (Lommen 2015)
$$ SNR \propto \frac{A \sqrt{\tau\, \Delta f}}{T_{sys}} \tag{12.33} $$
where $A$ is the collecting area of the telescope, $\tau$ is the integration time, $\Delta f$ is
the bandwidth and $T_{sys}$ is the temperature of the receiver. Therefore, quadrupling
the bandwidth is the same as doubling the collecting area of the telescope, or
quadrupling the observation time, the latter two of which come at a very high
cost: this is a great advantage. However, the signal profile of a pulsar changes
with frequency: as seen in Sect. 12.2, the scintillation and dispersion effects
grow as the observing frequency decreases. The integrated pulse profile is indeed
stable in time but, because of these frequency-dependent effects, it varies
across a very large bandwidth. To address this issue, astronomers divide the
observation band into sub-bands and perform an independent analysis for each of
these frequency sub-intervals.
• Noise: timing errors are usually divided into those pertinent to the time tagging
of pulses (i.e. TOA measurement by template fitting) and those relative to the
physical properties of pulsar or interstellar medium. Among the latter, there is an
intrinsic noise of the pulsars, with characteristics both white (flat spectrum) and red
(peaked around a given frequency). The white noise contribution is associated with
the emission mechanism of the signals and depends on the unmodeled variation
of the rotation frequency of pulsars: it causes a jitter effect, i.e. a deviation from
true periodicity of the signal. Indeed, individual pulses can jitter in phase at the
level of a single pulse width, and their amplitude can change by more than 100 % from
pulse to pulse, thus requiring an average over several hundred pulses. Jitter
currently only affects a relatively small number of the PTA pulsars, but in the
upcoming era of larger and more sensitive telescopes, jitter will become a major
limitation for PTA research.
The red noise is instead associated with star-quakes or interactions between the
crust of the pulsars and the innermost superfluid: they take place with a maximum
frequency of ∼1/year and they cause sudden changes in the rotation period,
pulsar glitches, and consequently large variations in TOAs.
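The scaling rule of Eq. (12.33) can be checked with a few lines of Python. This is only a sketch: the function name `relative_snr` and the unit reference configuration are illustrative assumptions, and the overall proportionality constant is dropped, so only ratios are meaningful.

```python
import math

def relative_snr(area, integration_time, bandwidth, t_sys):
    """Radiometer scaling of Eq. (12.33), up to an overall constant."""
    return area * math.sqrt(integration_time * bandwidth) / t_sys

# Reference configuration in arbitrary (hypothetical) units.
base = relative_snr(area=1.0, integration_time=1.0, bandwidth=1.0, t_sys=1.0)

gain_bandwidth = relative_snr(1.0, 1.0, 4.0, 1.0) / base  # 4x bandwidth
gain_area = relative_snr(2.0, 1.0, 1.0, 1.0) / base       # 2x collecting area
gain_time = relative_snr(1.0, 4.0, 1.0, 1.0) / base       # 4x integration time

print(gain_bandwidth, gain_area, gain_time)
```

The three gains coincide, which is the equivalence between bandwidth, collecting area and observing time stated in the wide-bandwidth item above.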
When a GW crosses the line of sight from the Earth to the pulsar, the TOAs change:
to extract the GW information they are measured and compared with predictions
for the arrival times based on a pulsar timing model. The differences between the
predictions and the measurements are known as the pulsar timing residuals. We can
connect the residual in a pulsar signal due to the GW, Rgw to the amplitude h of the
GW causing it through the relation
\[ R_{gw}(t) = \frac{1}{2}\,[1 + \cos\mu]\,[\,r_{+}(t)\cos(2\psi) + r_{\times}(t)\sin(2\psi)\,] \tag{12.34} \]
where μ is the angle between the GW source and the pulsar, as seen from the Earth (see
footnote 7), while ψ is the GW polarization angle. The subscripts + and × refer to the two
polarizations of the GW. The amplitudes r_+ and r_× are each composed of two terms:
\[ r_{+,\times}(t) = r^{e}_{+,\times}(t) - r^{p}_{+,\times}(t) \tag{12.35} \]
one related to the gravitational strain h^e_{+,×}(t) at the Earth (the Earth term) and one
to the gravitational strain at the pulsar (the pulsar term):
\[ r^{e}_{+,\times}(t) = \int_0^t h^{e}_{+,\times}(\tau)\,d\tau \tag{12.36} \]
\[ r^{p}_{+,\times}(t) = \int_0^t h^{p}_{+,\times}\!\left(\tau - \frac{d}{c}\,(1 - \cos\mu)\right) d\tau \tag{12.37} \]
where d is the distance between Earth and the pulsar. Note that the pulsar term
h^p_{+,×}(t) has the same functional shape as the Earth term but evaluated at a delayed
time. The delay is the time between two events: the first is when the GW arrives at
the Earth, while the second is when the Earth receives the information that the GW
has arrived at the pulsar.
The amount of the delay depends on the angle μ between the pulsar, the Earth
and the GW source and on the distance d to the pulsar emitting the e.m. signal.
Thus, in the case of a GW source emitting a monochromatic signal, the Earth term
is coherent (fixed phase relationship) for all the pulsars in an array, while the pulsar
terms are all delayed by different amounts. Therefore, in general, the pulsar term
is an unknown quantity assuming different values for each observed pulsar and in
practice appears as noise in the measurement, unless the delay can be determined
and accounted for in all the elements of the pulsar array. Unfortunately, only a few pulsar
distances have been measured, using VLBI, to better than 20% (20–200 pc), while
the needed accuracy is a fraction of the GW wavelength, i.e. a fraction of a pc. Ideally,
a single very bright GW source would allow us to solve for the distances to all the
pulsars.
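To get a feeling for the numbers, the delay of the pulsar term in Eq. (12.37) can be evaluated directly. The sketch below is not part of the original treatment: the function names, the 1 kpc distance, the 100 pc distance error and the 10-year GW period are illustrative assumptions.

```python
import math

LY_PER_PC = 3.2616  # light-years per parsec (rounded)

def pulsar_term_delay_years(distance_pc, mu_rad):
    """Delay (d/c)(1 - cos mu) of Eq. (12.37), expressed in years."""
    return distance_pc * LY_PER_PC * (1.0 - math.cos(mu_rad))

def phase_error_cycles(distance_error_pc, mu_rad, gw_period_years):
    """Uncertainty of the pulsar-term phase, in GW cycles, due to a distance error."""
    return pulsar_term_delay_years(distance_error_pc, mu_rad) / gw_period_years

# Assumed example: pulsar at 1 kpc, seen at 90 degrees from the GW source.
delay = pulsar_term_delay_years(1000.0, math.pi / 2)
# Assumed 100 pc distance uncertainty and a GW period of 10 years.
cycles = phase_error_cycles(100.0, math.pi / 2, 10.0)
print(delay, cycles)
```

With these assumed values the delay amounts to thousands of years and the phase of the pulsar term is uncertain by many GW cycles, which is why it behaves as noise unless the distance is known to a fraction of a wavelength.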
The amplitude of the Earth term depends on the wave direction as [1 + cos (μ)]:
for μ = 0 the pulsar and the GW source are aligned and the term has a maximum; for
μ = π the two test masses (Earth and pulsar) are anti-aligned in the sky and we have
zero response, while for μ = π/2, the amplitude is half of the maximum. However,
when we also consider the pulsar term, Eq. (12.35) shows that the difference between
the two contributions equals zero, Rgw (t) = 0, when we have a perfect alignment of
the GW source with the line of sight of the e.m. signal. This agrees with the transverse
character of the GW: it cannot give a non-zero response from pulsars oriented along
the direction of propagation. Thus, we need a misalignment large enough for the
GW to pass some fraction of a wavelength away from the pulsar (say, about a light
year) in order to have R_gw ≠ 0. We can conclude that the optimum detection is when
the pulsar is nearly aligned with the GW source.

7 Care must be taken in defining the angle μ: Maggiore (2008, Chap. 23) offers a detailed derivation
of Eq. 12.34, but considers the direction from the GW source to the Earth, opposite to our choice,
resulting in a change of sign in front of cos μ.
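The interplay of Earth and pulsar terms in Eqs. (12.34)-(12.37) can be illustrated for a monochromatic wave. The sketch below assumes a purely +-polarized strain h(t) = h0 cos(2π f t) with the same waveform at the Earth and at the pulsar; the function name, the default amplitude, frequency and distance are hypothetical choices, and the integrals are written in closed form.

```python
import math

def timing_residual(t, mu, psi=0.0, h0=1e-15, f_gw=0.1, d_ly=3262.0):
    """R_gw(t) of Eq. (12.34) for h_+(t) = h0*cos(2*pi*f_gw*t) and h_x = 0.

    Units: t in years, f_gw in cycles per year, d_ly in light-years,
    so that c = 1 light-year per year. The residual is returned in years.
    """
    w = 2.0 * math.pi * f_gw
    delay = d_ly * (1.0 - math.cos(mu))          # (d/c)(1 - cos mu)
    r_earth = h0 * math.sin(w * t) / w           # Eq. (12.36)
    r_pulsar = h0 * (math.sin(w * (t - delay)) + math.sin(w * delay)) / w  # Eq. (12.37)
    r_plus = r_earth - r_pulsar                  # Eq. (12.35)
    return 0.5 * (1.0 + math.cos(mu)) * r_plus * math.cos(2.0 * psi)

for mu_deg in (0, 10, 90, 180):
    print(mu_deg, timing_residual(3.0, math.radians(mu_deg)))
```

For exact alignment (μ = 0) the two terms cancel, and for μ = π the geometric factor vanishes, in agreement with the discussion above; at intermediate angles the residual depends on the poorly known pulsar distance through the delay.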
The nature of gravitational waves depends on the sources producing them and this
determines the structure of the timing residuals. Figure 12.10 shows the sensitivity curve
of the PTA and its complementarity with those of the LISA and LIGO detectors. The
same figure also shows candidate GW sources detectable by PTAs, which we
briefly review:
• Binary systems of super-massive black holes (M > 10^8 M⊙): our current
understanding of galaxy evolution suggests that every galaxy contains a super-massive
black hole (SMBH). The hierarchical or bottom-up formation scenario predicts
that larger galaxies are generated by the merger of smaller galaxies at high
redshift. When two galaxies merge, we expect that the two SMBHs at their centres
form a binary system. Binary systems are characterized by a non-vanishing
second time derivative of their mass quadrupole moment, thus they are continuous
sources of GWs. The expected number of SMBH binaries (SMBHBs) is extremely
large, up to 10^6 depending on the redshift and the mass range of the involved
BHs. The incoherent superposition of the GW signals emitted from such a
population of SMBHBs causes a stochastic background of GWs, usually considered
isotropic. This is the most likely GW signal that PTAs hope to detect.
• Cosmic strings: they are the most controversial sources in the PTA band. They are
hypothetical one-dimensional topological defects that may have formed during a
symmetry-breaking phase transition in the early universe. If they exist, the cosmic
strings will oscillate and emit GW signals of short duration and large amplitude.
• GW cosmic background: it consists of the relic GWs from the early evolution
of the universe. The Big Bang is expected to be a prime candidate for the
production of the many random processes needed to generate these stochastic GWs,
which therefore may carry information about the origin and history of the
universe. If these GWs actually originated in the Big Bang, they should have been
stretched as the universe expanded and they can tell us about the very beginning
of the universe. They would have been produced between approximately 10^−36
and 10^−32 seconds after the Big Bang, whereas the CMB was produced
approximately 300,000 years after the Big Bang. Though widely believed to exist, a relic
background will probably not be detected, since it is several orders of magnitude
below the unresolved background due to SMBHBs (Maggiore 2000).
12.10 Detection Procedure
The correlation C, shown in Fig. 12.12, starts out at a maximum when the two
pulsars are seen in the same direction (zero angular separation); it has a zero when
γ = 0.855 rad = 49°; then it reaches a minimum at 1.431 rad = 82° and finally
grows up to half of its initial zero-separation value at π rad. The Hellings and Downs
curve is computed using the Earth term only (Eq. (12.36)), because the pulsar terms
are not correlated, as discussed in Sect. 12.8. The geometry shown in Fig. 12.13 can
help us understand the physics of the Hellings and Downs curve. For γ = 0, i.e.
for a given pair of pulsars at the same “sky location”, aligned on the same line of
sight, the correlation between their TOAs will be maximum: the timing residuals
due to crossing of GW (Eq. (12.34)) will be the same. On the other hand, when the
angular separation is γ ∼ π/2, we have negative correlation: as explained in Sect. 7.2,
GWs alternatively stretch and compress space-time along orthogonal directions.
Therefore, when the TOA from pulsar 1 is affected by the space-time dilation due
to the GW, the TOA from pulsar 2 is affected by the corresponding contraction, and vice versa. Searches for stochastic
gravitational-wave backgrounds using pulsar timing arrays effectively compare the
measured correlations with the expected values from the Hellings and Downs curve
to determine whether or not a signal from an isotropic, non-polarized background is
present in the data.
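The shape of the curve described above can be reproduced numerically. The sketch below uses the standard closed form of the Hellings and Downs correlation for two distinct pulsars, normalized to 1/2 at zero separation; this expression is quoted from the general literature and is an assumption here, since it is not written out in this section. The helper names are illustrative.

```python
import math

def hellings_downs(gamma):
    """Expected correlation for two distinct pulsars separated by gamma (rad)."""
    x = 0.5 * (1.0 - math.cos(gamma))
    if x <= 0.0:
        return 0.5                      # limit x*ln(x) -> 0 at zero separation
    return 0.5 + 1.5 * x * math.log(x) - 0.25 * x

def zero_crossing(lo=0.1, hi=1.4, tol=1e-12):
    """Bisection on the interval where the curve decreases through zero."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hellings_downs(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def minimum_location(steps=100_000):
    """Brute-force search of the minimum on a uniform grid over [0, pi]."""
    grid = (math.pi * i / steps for i in range(steps + 1))
    return min(grid, key=hellings_downs)

gamma_zero = zero_crossing()
gamma_min = minimum_location()
print(math.degrees(gamma_zero), math.degrees(gamma_min), hellings_downs(math.pi))
```

The zero crossing and the minimum land close to the angles quoted in the text, and the value at γ = π is half of the zero-separation value.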
Fig. 12.12 The Hellings and
Downs correlation curve.
Reproduced from Jenet and
Romano (2015), with the
permission of the American
Association of Physics
Teachers
Fig. 12.13 Geometry for the Hellings and Downs correlation function. The Earth (rather, the Solar
System Barycenter, but the difference is irrelevant: 1 AU ≪ 1 kpc = 2 · 10^8 AU) is located at the
origin, pulsar 1 on the z-axis and pulsar 2 in the xz-plane. The angle γ is defined by the two
directions from the Earth toward the two pulsars. Reproduced from Jenet and Romano (2015), with
the permission of the American Association of Physics Teachers
Foster and Backer (1990) initiated the observing program called the Pulsar Timing
Array (PTA) program, to observe three pulsars using the US National Radio Astronomy
Observatory telescope (Hobbs and Shi 2017). Nowadays, the PTA approach is
pursued by different groups and the number of pulsars in the array has expanded
(Tiburzi 2018; Dahal 2020). At present, there are three main collaborations:
• The Parkes Pulsar Timing Array (PPTA): this project began in 2004 making use
of the Parkes Observatory in Australia. It is currently taking observations of 25
pulsars. As the southernmost telescope currently used for PTA observation, Parkes
is able to see many Southern Hemisphere millisecond pulsars that are not visible
at NANOGrav and EPTA telescopes.
The three PTAs collaborate in the framework of the International Pulsar Timing
Array (IPTA), which aims to facilitate GW scientific research through the sharing
of data and software codes.
Efforts to establish new PTA collaborations are ongoing in India, China and South
Africa. The Indian PTA observes millisecond pulsars with the Ooty Radio Telescope
and the Giant Metrewave Radio Telescope. The Chinese PTA had its kick-off in 2007: it
operates several 100-meter class telescopes (e.g. NSRT, Kunming, Tianma), together
with the two most important facilities: the world's largest, the Five-hundred-meter Aperture
Spherical Telescope (FAST), and the QiTai Radio Telescope. Finally, MeerTIME is an
approved proposal dedicated to pulsar timing, which will use the MeerKAT telescope
(South Africa).
No detection of gravitational waves has been made to date with the Pulsar Timing
Array technique, but PTA upper limits have already contributed to ruling out some models
of galaxy formation. Continuous efforts are made to improve the sensitivity and to
understand the astrophysical implications of the (so far) null detection (Hobbs and
Shi 2017). The primary limitation is simply the small number of pulsars in the
arrays. In 2019, the IPTA second data release (IPTA dr2) was published (Perera et al.
2019): it consists of regularly observed high-precision timing data for 65 millisecond
pulsars (see Fig. 12.14), 16 more than the previous data release.
The primary goal of the IPTA is to detect and characterize low-frequency GWs
using high-precision pulsar timing. The collaborations (Lommen 2015) confidently
state that PTA will detect GWs: the question is when, and what needs to be done now
to take full advantage of the array in the future.
Obvious steps toward increasing the detection probability of GW are:
• To increase the SNR of the detected signals, using larger radio telescopes.
• To increase the observation frequency of the ms pulsar array, i.e. coordinating
more radio telescopes together. In this respect, IPTA is a crucial tool for collabo-
ration.
• To discover more and more millisecond pulsars, in order to have larger and larger
arrays and increase the accuracy in the measurement of time residual caused by
GW.
• To identify and correct for corrupting effects on the TOAs. At present, most of the
ongoing research focuses on the study of several long-period processes affecting
the residuals, such as inaccuracies in the Solar System ephemerides, intrinsic
pulsar-timing noise and instrumental instabilities.
The new generation of radio telescopes that are now coming online will greatly
increase our sensitivity to low-frequency GWs. The most sensitive instrument, the
Square Kilometer Array (SKA), should begin operations in 2025. It will be the world's
largest telescope, with the potential of discovering all the pulsars in our galaxy whose
beam sweeps across the Earth. In preparation for SKA, several pathfinders have
been deployed, such as MeerKAT and the MWA (Western Australia), LWA (New
Mexico, USA), and the LOw Frequency ARray (LOFAR, in Europe). They are
essential tools to tackle some of the mentioned main challenges. For example, the
low-frequency facilities such as LOFAR, LWA and MWA, are useful instruments to
monitor the turbulent ionized interstellar medium and its effects on pulsar timing. In
particular, DM variations are among the main sources of red noise in the TOA time
series. Ionized interstellar medium studies at low frequencies will provide precious
insights to improve the red noise model, and to disentangle its contribution from
intrinsic timing noise generated from instabilities in the pulsar spin. The science
reward of this effort, detection by Pulsar Timing Arrays at low frequency, will be an
invaluable insight into galaxy mergers and black hole dynamics, the exploration of
fundamental physics problems like the cosmological constant, and high sensitivity
tests of General Relativity. Over the next decades, PTAs will open a window over
a new and unique regime of the GW universe, nicely complementing ground and
space-based interferometers.
References
Baade, W., Zwicky, F.: On super-novae. Proc. Nat. Acad. Sci. USA 20, 254–259 (1934)
Baade, W., Zwicky, F.: Cosmic rays from super-novae. Proc. Nat. Acad. Sci. USA 20, 259–263 (1934)
Becker, W., Kramer, M., Sesana, A.: Pulsar timing and its application for navigation and gravitational
wave detection. Space Sci. Rev. 214, 30 (2018)
Breton, R.P., et al.: Relativistic spin precession in the double pulsar. Science 321, 104 (2008)
Burgay, M., Perrodin, D., Possenti, A.: Timing neutron stars: pulsations, oscillations and explosions.
In: Belloni, T.M., Méndez, M., Zhang, C. (eds.) General Relativity Measurements from Pulsars.
Springer, Berlin, Heidelberg (2021)
Dahal, P.K.: Review of pulsar timing array for gravitational wave research. J. Astrophys. Astr. 41,
8 (2020)
Detweiler, S.: Pulsar timing measurements and the search for gravitational waves. Ap. J. 234,
1100–1104 (1979)
Estabrook, F.B., Wahlquist, H.D.: Response of Doppler spacecraft tracking to gravitational radiation.
Gen Relat. Gravit. 6, 439–447 (1975)
Foster, R.S., Backer, D.C.: Constructing a pulsar timing array. Ap. J. 361, 300 (1990)
Hellings, R.W., Downs, G.S.: Upper limits on the isotropic gravitational radiation background from
pulsar timing. Astrophys. J. Lett. 265, L39-42 (1983)
Hewish, A., Bell, S.J., Pilkington, J.D.H., Scott, P.F., Collins, R.A.: Observation of a rapidly pul-
sating radio source. Nature 217, 709–713 (1968)
Hobbs, G., Shi, D.: Gravitational wave research using pulsar timing arrays. Natl. Sci. Rev. 4, 707
(2017)
Hulse, R.A., Taylor, J.H.: A high-sensitivity pulsar survey. Ap. J. 191, L59 (1974)
Imperi, L., Iess, L., Mariani, M.J.: An analysis of the geodesy and relativity experiments of Bepi-
Colombo. Icarus 301, 9–25 (2018)
Jenet, F.A., Romano, J.D.: Understanding the gravitational-wave Hellings and Downs curve for
pulsar timing arrays in terms of sound and electromagnetic waves. Am. J. Phys. 83, 635 (2015)
Lattimer, J.M.: Introduction to neutron stars. AIP Conf. Proc. 1645, 61 (2015)
Lommen, A.: Pulsar timing arrays: the promise of gravitational wave detection. Rep. Prog. Phys.
78, 124901 (2015)
Lommen, A.: Pulsar timing for gravitational wave detection. Nat. Astron. 1, 809 (2017)
Lorimer, D.R.: Binary and millisecond pulsars. Living Rev. Relativ. 11, 8 (2008)
Lorimer, D.R., Kramer, M.: Handbook of Pulsar Astronomy. Cambridge University Press, Cam-
bridge (2005)
Lynch, R.S.: Pulsar timing arrays. J. Phys. Conf. Ser. 610, 012017 (2015)
Lyne, A.G., et al.: A double-pulsar system: a rare laboratory for relativistic gravity and plasma
physics. Science 303, 1153 (2004)
Maggiore, M.: Gravitational wave experiments and early universe cosmology. Phys. Reports 331,
283–367 (2000)
Maggiore, M.: Gravitational Waves, vol.2: Astrophysics and Cosmology, chap. 23. Oxford Univer-
sity Press, Oxford (2008)
Manchester, R.N.: Pulsars and gravity. Int. J. Mod. Phys. D 24, 1530018 (2015)
Pacini, F.: Energy emission from a neutron star. Nature 216, 567–568 (1967)
Perera, B.B.P., et al.: The international pulsar timing array: second data release. MNRAS 490,
4666–4687 (2019)
Perrodin, D., Sesana, A.: Radio pulsars: testing gravity and detecting gravitational waves. In: Rez-
zolla, L., et al. (eds.) The Physics and Astrophysics of Neutron Stars. Astrophysics and Space
Science Library, vol. 457. Springer, Cham (2018)
Sazhin, M.V.: Opportunities for detecting ultralong gravitational waves. Astronomicheskii Zhurnal
55, 565–568 (1978). Translated in: Sov. Astron. 22, 36–38 (1978)
Shao, L., Wex, N.: New tests of local Lorentz invariance of gravity with small-eccentricity binary
pulsars. Class. Quantum Grav. 29, 215018 (2012)
Shapiro, S.L., Teukolsky, S.A.: Black Holes, White Dwarfs, and Neutron Stars: The Physics of
Compact Objects. Wiley-Interscience Publication. Wiley, New York (1983)
Stairs, I.H.: Testing general relativity with pulsar timing. Living Rev. Relat. 6, 5 (2003)
Taylor, J.: Pulsar timing and relativistic gravity. Phil. Trans. R. Soc. Lond. A 341, 117–134 (1992)
Tiburzi, C.: Pulsars probe the low-frequency gravitational sky: pulsar timing arrays basics and
recent results. Publ. Astron. Soc. Aust. 35, e013 (2018)
Voisin, G., et al.: Optimization of long-baseline optical interferometers for gravitational-wave detec-
tion. A&A 638, A24 (2020)
Weisberg, J.M., Huang, Y.: Relativistic measurements from timing the binary pulsar PSR B1913+16.
Ap. J. 829, 55 (2016)
Weisberg, J.M., Nice, D.J., Taylor, J.H.: Timing measurements of the relativistic binary pulsar PSR
B1913+16. Ap. J. 722, 1030 (2010)
Wex, N.: Testing relativistic gravity with radio pulsars. In: Kopeikin, S.M. (ed.) Applications and
Experiments, Frontiers in Relativistic Celestial Mechanics, vol. 2. Walter de Gruyter GmbH,
Berlin/Boston (2014). arXiv:1402.5594
Will, C.M.: Theory and Experiment in Gravitational Physics. Cambridge University press (2018)
Further Reading
Lyne, A.G.: A review of the double pulsar PSR J0737–3039. Chin. J. Astron. Astrophys. Suppl.
2, 162 (2006)
Miao, X. et al.: Tests of conservation laws in post-Newtonian gravity with binary pulsars. Ap. J.
898, 69 (2020)
Sesana, A., Vecchio, A., Colacino, C.N.: The stochastic gravitational-wave background from mas-
sive black hole binary systems: implications for observations with Pulsar Timing Arrays. MNRAS
390, 192–209 (2008)
13 The Sagnac Effect
13.1 Introduction
Galileo first observed that linear uniform motion does not influence mechanical
phenomena, concluding that it is not possible to distinguish the relative motion of
(inertial) observers by studying phenomena such as the free fall of bodies or the
swing of a pendulum. Non-inertial reference frames, instead, are inherently distinguishable:
in them a free body follows a curved trajectory, the oscillation plane of a pendulum
rotates in space and a falling massive body deviates from the vertical direction.
Although these examples refer to the motion of bodies, which follow the laws of
mechanics, electromagnetic phenomena also allow one to discriminate
between inertial and accelerated reference frames. In 1913, the French physicist
Georges Sagnac devised an interferometric device that was able to highlight the
rotational state of the reference system in which the apparatus rests. He noticed
that the rotation of the interferometer induces a delay, and consequently a phase
difference, between two light waves that propagate in opposite directions along a
closed optical path at rest in the rotating reference frame (Fig. 13.1). Its output is
an interference figure from which it is possible to derive the angular velocity of
the rotation. This is the Sagnac effect. Indeed, the first description of the Sagnac
effect in the framework of special relativity was given by Laue (1911), two years
before Sagnac performed his experiment (Sagnac 1913a) and observed the correlation
between the angular velocity and the phase shift of the light.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_13

To derive the phase difference produced by the Sagnac effect, we consider a circular
optical path of radius R, obtained for example by propagating the beam in a planar
loop created in a platform that can be rotated around its symmetry axis, with angular
velocity Ω.
Referring to Fig. 13.2, the light beam enters from point A, where it is divided
by a beam splitter into two beams that propagate one in the clockwise and the
other in the counter-clockwise direction. The two rays recombine at the interferometer
output, the point A′, which has moved from the position A due to the rotation of
the entire system. In the case of a simple circular loop, the path length is L = 2πR,
along which the light propagates in a time t = L/c = 2πR/c if Ω = 0, i.e. when
the platform does not rotate. As usual, c is the propagation velocity of the laser light.
When the platform rotates at Ω, we must compute different travel times for the
beam propagating in the clockwise direction (same as the platform), t1, and for
the beam travelling counter-clockwise, t2. At the recombination point A′ (Fig. 13.2)
we have
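\[ c\,t_1 = 2\pi R + \Omega R\,t_1 , \qquad c\,t_2 = 2\pi R - \Omega R\,t_2 \tag{13.1} \]
\[ t_1 = \frac{2\pi R}{c - \Omega R} , \qquad t_2 = \frac{2\pi R}{c + \Omega R} \tag{13.2} \]
Therefore, the time delay between the two rays is
\[ \Delta T = t_1 - t_2 = \frac{4\pi R^2\,\Omega}{c^2 - \Omega^2 R^2} \simeq \frac{4\pi R^2\,\Omega}{c^2} \tag{13.3} \]
\[ \Delta L = c\,\Delta T \simeq \frac{4\pi R^2\,\Omega}{c} \tag{13.4} \]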
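As a quick numerical check of Eqs. (13.1)-(13.3), the two travel times can be computed separately and their difference compared with the closed form. This is a sketch under assumed values: the loop is taken along the Earth's equator and rotates with the Earth, a choice made here only because it gives a delay large enough to avoid floating-point cancellation; the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def travel_times(radius, omega):
    """Co- and counter-rotating loop times t1, t2 of Eq. (13.2)."""
    path = 2.0 * math.pi * radius
    return path / (C - omega * radius), path / (C + omega * radius)

def sagnac_delay(radius, omega):
    """Eq. (13.3): exact closed form and small-velocity approximation."""
    exact = 4.0 * math.pi * radius**2 * omega / (C**2 - (omega * radius) ** 2)
    approx = 4.0 * math.pi * radius**2 * omega / C**2
    return exact, approx

# Assumed illustrative values: a loop along the Earth's equator.
R_EARTH = 6.378e6        # m
OMEGA_EARTH = 7.292e-5   # rad/s

t1, t2 = travel_times(R_EARTH, OMEGA_EARTH)
exact, approx = sagnac_delay(R_EARTH, OMEGA_EARTH)
print(t1 - t2, exact, approx)
```

For laboratory-size loops the same functions apply, but the direct subtraction t1 - t2 loses precision in double arithmetic, so the closed form is the safer choice.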
It can be proven that the above formula, derived for a circular path normal to the
rotation axis, is valid for all closed optical paths. Therefore, the Sagnac effect is
represented in terms of the general equation:
\[ \Delta T = 4\,\frac{\vec{\Omega}\cdot\vec{S}}{c^2} \tag{13.5} \]
where the vector S is oriented perpendicular to the surface enclosed by the optical
path, with magnitude equal to its area S.
This time difference is associated with a phase difference ΔΦ between the two
waves, given by the following expression:
\[ \Delta\Phi = \omega\,\Delta T = \frac{\omega}{c}\,\Delta L = \frac{4\,\omega}{c^2}\,\vec{\Omega}\cdot\vec{S} \tag{13.6} \]
where ω is the angular frequency of the laser light.
The difference in optical path that is generated by the rotation of the interferometer
is generally a very small effect. Consider, as an example, the simple case of a loop of
R = 10 cm, rotating with Ω parallel to S at the slow pace of 2 revolutions per
hour: we have for ΔL a value of about 1.5 · 10^−12 m, about a millionth of the typical
wavelength of laser light (0.4−1 µm). To increase ΔL, and with it the phase
difference ΔΦ, a long optical fibre, of the order of a kilometre, is used as a guide for
the light. The fibre is wound to form a coil, in order to multiply the total area which
determines the Sagnac effect by the number of coil turns. Expressing the
area in terms of the total length L of the fibre, we obtain:
\[ \Delta\Phi = 4\pi\,\frac{R\,L}{c\,\lambda}\,\Omega \tag{13.7} \]
where λ = 2πc/ω is the light wavelength. To determine the intensity of the resulting
beam at the interferometer output, we assume, for now, that the waves at the input
A are monochromatic and in phase:
s1 = s2 = S0 sin(ωt)
At the end of the two paths, ignoring possible losses, the light signals differ in
phase by ΔΦ and their superposition gives the intensity
\[ I = I_0 \cos^2\frac{\Delta\Phi}{2} = \frac{I_0}{2}\,(1 + \cos\Delta\Phi) \]
I0 is the maximum of the light intensity that we have for ΔΦ = 0 (signals in
phase and non-rotating interferometer). The minimum values of intensity (I = 0)
are obtained for ΔΦ = ±mπ, with m odd.
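The numbers of the worked example and the fibre-coil formula of Eq. (13.7) can be reproduced with a short script. It is a sketch only: the function names, the 1 km fibre length and the 1 µm wavelength are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def path_difference(radius, omega):
    """Optical path difference of Eq. (13.4) for a single circular loop."""
    return 4.0 * math.pi * radius**2 * omega / C

def fibre_phase(radius, fibre_length, omega, wavelength):
    """Sagnac phase of Eq. (13.7) for a coil of total fibre length fibre_length."""
    return 4.0 * math.pi * radius * fibre_length * omega / (C * wavelength)

def output_intensity(delta_phi, i0=1.0):
    """Interferometer output I = I0 cos^2(delta_phi / 2)."""
    return i0 * math.cos(0.5 * delta_phi) ** 2

omega_slow = 2.0 * 2.0 * math.pi / 3600.0   # 2 revolutions per hour, in rad/s
delta_l = path_difference(0.10, omega_slow)               # R = 10 cm, single loop
phase_coil = fibre_phase(0.10, 1000.0, omega_slow, 1.0e-6)  # assumed 1 km fibre, 1 um light
print(delta_l, phase_coil, output_intensity(phase_coil))
```

For a single turn (fibre length 2πR) the coil formula reduces to 2π ΔL/λ, which is a convenient consistency check between Eqs. (13.4), (13.6) and (13.7).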
In summary, the phase change due to the rotation of the interferometer has the
effect of inducing an interference figure and the Sagnac phase ΔΦ can be determined
directly from the interference figure. However, this method is extremely sensitive to
changes in the intensity of the light focused in the fibre. The usual technique that
addresses this problem consists in phase modulating the signal at a frequency ωm,
by means of a moving mirror that adds a time varying phase:
\[ \Phi(t) = \Phi_m \cos(\omega_m t) \]
The two waves that propagate in opposite directions undergo the same modulation
but at different time instants, t and t − τ, where τ is a fixed, but yet to be determined
delay.
This induces an additional phase difference Φ(t) − Φ(t − τ) between the two rays.
The signal intensity at the interferometer output is now time dependent:
\[ I(t) = \frac{1}{2}\,I_0\,\bigl\{1 + \cos\bigl[\Delta\Phi + \Phi_m\cos\omega_m t - \Phi_m\cos\omega_m(t-\tau)\bigr]\bigr\} \]
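After a few trigonometric manipulations, and defining the modulation depth (see
Appendix A) φc ≡ 2Φm sin(ωm τ/2), we can rewrite the intensity as:
\[ I(t) = \frac{I_0}{2}\,\Bigl\{1 + \cos\Delta\Phi\,\cos\bigl[\varphi_c \sin\omega_m(t-\tfrac{\tau}{2})\bigr] + \sin\Delta\Phi\,\sin\bigl[\varphi_c \sin\omega_m(t-\tfrac{\tau}{2})\bigr]\Bigr\} \]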
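Expanding the trigonometric functions in series of Bessel functions J_n, we obtain
\[ I(t) = \frac{1}{2}\,I_0\,\bigl[1 + J_0(\varphi_c)\cos\Delta\Phi\bigr]
 + I_0\cos\Delta\Phi \sum_{k=1}^{\infty} J_{2k}(\varphi_c)\cos(2k\,\omega_m t - k\,\omega_m\tau)
 + I_0\sin\Delta\Phi \sum_{k=1}^{\infty} J_{2k-1}(\varphi_c)\sin\bigl[(2k-1)\,\omega_m t - (k-\tfrac{1}{2})\,\omega_m\tau\bigr] \tag{13.10} \]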
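When we compare the terms of the Bessel expansion with those of the Fourier one
\[ I(t) = S_0 + \sum_{k=1}^{\infty} S_k \cos(k\,\omega_m t + \alpha_k) \]
where φc = 2Φm sin(ωm τ/2), we end up with the following identities:
\[ \begin{aligned}
S_0 &= \tfrac{1}{2}\,I_0\,\bigl[1 + J_0(\varphi_c)\cos\Delta\Phi\bigr] \\
S_1 &= I_0\,J_1(\varphi_c)\sin\Delta\Phi \\
S_2 &= I_0\,J_2(\varphi_c)\cos\Delta\Phi \\
S_3 &= I_0\,J_3(\varphi_c)\sin\Delta\Phi \\
S_4 &= I_0\,J_4(\varphi_c)\cos\Delta\Phi
\end{aligned} \]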
The even harmonics are proportional to the cosine of the Sagnac phase, while the
odd ones are proportional to the sine. In the absence of rotation (ΔΦ = 0) we have
S_{2k−1} = 0 and the lowest harmonic at the interferometer output is at twice the
modulation frequency: at the photodiode output we have a broadband signal that includes
only the even harmonics (2ωm, 4ωm, 6ωm, ...), whose amplitudes are determined by that of
the modulation. When the gyroscope rotates all harmonic components are present
(ωm, 2ωm, 3ωm, 4ωm, ...).
\[ \frac{S_1}{S_3} = \frac{J_1(\varphi_c)}{J_3(\varphi_c)} \]
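The harmonic identities and the ratio S1/S3 can be verified numerically without any special-function library. The sketch below assumes the modulated intensity in the form I = (I0/2)[1 + cos(ΔΦ − φc sin θ)], with θ the modulation phase; the harmonic amplitudes do not depend on the sign convention of the sine term. The Bessel functions are evaluated from Bessel's integral; all names and the sample values of ΔΦ and φc are illustrative.

```python
import math

def bessel_j(n, x, steps=2000):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin t) dt."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))   # trapezoid end points
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def harmonic_amplitude(k, delta_phi, phi_c, i0=1.0, samples=4096):
    """Amplitude of the k-th harmonic of I(theta) over one modulation period."""
    a = b = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        intensity = 0.5 * i0 * (1.0 + math.cos(delta_phi - phi_c * math.sin(theta)))
        a += intensity * math.cos(k * theta)
        b += intensity * math.sin(k * theta)
    return 2.0 * math.hypot(a, b) / samples

delta_phi, phi_c = 0.3, 1.8   # assumed Sagnac phase and modulation depth (rad)
s1 = harmonic_amplitude(1, delta_phi, phi_c)
s2 = harmonic_amplitude(2, delta_phi, phi_c)
s3 = harmonic_amplitude(3, delta_phi, phi_c)
print(s1 / s3, bessel_j(1, phi_c) / bessel_j(3, phi_c))
```

Since the ratio S1/S3 does not contain the Sagnac phase, it can be used to monitor the modulation depth φc independently of the rotation rate.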
The first commercial devices based on the Sagnac effect were the Ring Laser Gyroscopes
(RLG), first demonstrated in the early 1960s (for a review, see Anderson 1994):
a sealed ring-cavity with very high-quality mirrors, filled with He-Ne, operates as an
active resonator for two counter-propagating laser beams. These can then re-circulate
a great number of times, depending on the cavity finesse. The two waves resonate at
different frequencies, because the storage time is different, and the output has a beat
note that is simply related to the rotation frequency. To avoid issues with too low
beat frequencies, when Ω < 1°/min, a dithering is applied to the cavity, modulating
the signal to higher frequency. This degrades the performance and adds mechanical
noise.
More recently, Fibre Optic Gyroscopes (FOG) have become popular: in a FOG,
the ring is not a part of the laser. Rather, an external solid-state semiconductor
laser injects counter-propagating beams into a fibre ring, where rotation causes a
relative phase shift between those beams when interfered at the exit. The Sagnac
effect can be enhanced by using a fibre coil with many turns. However, in such a
passive interferometer, the signal is a phase (and not frequency) difference between
the counter-propagating waves. This implies some additional signal processing and
a lower sensitivity, because the phase difference does not grow with time. While
based on the same physical principle, and having similar theoretical performance, the
technology for the two devices has important differences, mostly to the FOG's advantage:
a FOG consists of few components, standard in the communication industry; there is
no dithering (no moving parts, no mechanical noise); there is no gas as active medium
(no leakage, better durability and reliability); it is scalable just by adding turns to
the fibre coil. Commercial FOGs now have thousands of turns in a diameter of a few
mm. These devices have a sensitivity of 10 nrad/s/√Hz in the rotation range 0.01–
100 Hz. The sensitivity can be enhanced at will by using longer and longer fibres,
but with a trade-off: increased thermal phase noise and increased shot noise (due to
fibre attenuation) limit the long term stability and portability of these devices. FOGs
have replaced mechanical (spinning) gyroscopes in most applications of inertial
navigation.
Today, the Sagnac interferometer, in either version, is a tool of current technology.
Its foremost use is in inertial guidance systems, where high sensitivity to rotations is
of foremost importance. Global navigation satellite systems (GNSSs), such as GPS,
GLONASS, COMPASS or Galileo, need to account for the rotation of the Earth in
the procedures for synchronising clocks by multilateral exchanges of radio signals
(see Chap. 14).
nature, can suppress the quantum back-action noise for displacement measurements
over a broad frequency band (Chen 2003).
There is a close connection between the Sagnac effect, seen in previous sections, and
Gravitomagnetism, which we discussed in Sect. 5.7. We shall show here that an effect
quite similar to Sagnac's is generated if the light beam propagates in regions of space
where a gravitomagnetic field, generated by rotating source masses, exists.
We consider an interferometer where counter-propagating e.m. beams describe a
closed path in space-time, at a mean distance R from a central source M, endowed
with angular momentum J. The closed path need not encircle the source. This
situation can apply to the next generation of GPS and Galileo satellites orbiting the
Earth, which will exchange radio signals to synchronize their clocks, or to the LISA
constellation orbiting around the Sun. We shall assume that the interferometer path
is in free fall at a distance R from the source: this implies a circular orbit around the
source.
We start from a general expression for the metric line element and recall that, for
a photon, such interval is null:
\[ ds^2 = g_{00}\,c^2dt^2 + 2\,g_{0i}\,c\,dt\,dx^i + g_{ij}\,dx^i dx^j = 0 \tag{13.11} \]
We can solve this equation for dt, i.e. the coordinate time interval for a space
displacement vector {dx^i} along a light ray. The result is:
\[ c\,dt = -\frac{g_{0i}\,dx^i}{g_{00}} + \frac{\sqrt{(g_{0i}\,dx^i)^2 - g_{00}\,g_{ij}\,dx^i dx^j}}{g_{00}} \tag{13.12} \]
We have chosen the sign so that the propagation always be to the future, dt > 0,
irrespective of the sign of the d x i ’s, i.e. of the direction of propagation along the
space path.
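We can integrate Eq. 13.12 along the closed three-dimensional path ℓ to find the
time of flight for a loop travel from event A (injection) to event B (extraction).
\[ t_{AB} = -\oint_{\ell} \frac{g_{0i}}{c\,g_{00}}\,\frac{dx^i}{d\ell}\,d\ell
 + \oint_{\ell} \frac{\sqrt{\left(g_{0i}\,\frac{dx^i}{d\ell}\right)^2 - g_{00}\,g_{ij}\,\frac{dx^i}{d\ell}\frac{dx^j}{d\ell}}}{c\,g_{00}}\,d\ell \tag{13.13} \]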
Notice that the first integral depends on the direction of propagation, while the second
does not. Therefore, when considering the time of flight for two beams counter-
propagating along the same closed path, the first integral gives different contributions,
and we obtain two different results, say t+ for signal co-rotating with the reference
frame, and t− for the counter-rotating signal.
\[ \delta t = |t_+ - t_-| = -\frac{2}{c}\oint_{\ell} \frac{g_{0i}(\vec{x})}{g_{00}(\vec{x})}\,dx^i \tag{13.14} \]
The difference is due to the off-diagonal components g0i of the metric tensor, those
related to the source rotation.
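As a consistency check, Eq. (13.14) can be evaluated in flat space-time, in a frame rotating at angular velocity Ω, where it should reproduce the classical Sagnac delay of Eq. (13.3). The sketch below assumes the standard rotating-frame metric, g00 = 1 − Ω²r²/c² and g0φ = −Ωr²/c, which is not written in this section; the function name and the sample values are illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def sagnac_from_metric(radius, omega, steps=1000):
    """Eq. (13.14) on a circle of given radius, in a frame rotating at omega.

    Assumed flat-space rotating-frame metric components:
    g00 = 1 - (omega*r/c)^2 and g0phi = -omega*r^2/c.
    """
    g00 = 1.0 - (omega * radius / C) ** 2
    g0phi = -omega * radius**2 / C
    dphi = 2.0 * math.pi / steps
    integral = sum(g0phi / g00 * dphi for _ in range(steps))  # loop integral over phi
    return -2.0 / C * integral

radius, omega = 0.10, 3.5e-3   # assumed: 10 cm loop, about 2 revolutions per hour
closed_form = 4.0 * math.pi * radius**2 * omega / (C**2 - (omega * radius) ** 2)
print(sagnac_from_metric(radius, omega), closed_form)
```

The agreement shows that the off-diagonal term g0i generated by the rotation of the frame plays the same role as the one generated by a rotating source.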
Evaluating Eq. 13.14 is rather complex in the general case: we outline here the
necessary steps to evaluate the time difference in the WFSM limit (Chap. 5), referring
to Ruggiero and Tartaglia (2019) for details.
1. Choose the metric to work with. For instance, the WFSM limit of Kerr’s metric
of Eq. 5.30. We allow ourselves the additional, simplifying liberty of considering
only motion in the equatorial plane (θ = π/2):
$$ds^2 = \left(1 - \frac{2GM}{rc^2}\right)c^2dt^2 - \frac{dr^2}{1 - \frac{2GM}{rc^2}} - r^2 d\varphi^2 - \frac{4GJ}{c^3 r}\,c\,dt\,d\varphi$$
$$g_{00} = 1 - \frac{2GM}{rc^2}, \qquad g_{0i} = -\frac{2GJ}{c^3 r}\,\delta_{i\varphi} \tag{13.15}$$
2. Convert from coordinate time to the observer’s proper time, where the beams are
recombined: $\delta\tau = \sqrt{g_{00}(\vec{x}_{obs})}\,\delta t$.
3. Convert to the rotating frame where the interferometer is at rest: $\varphi' = \varphi - \Omega t$,
where $\Omega = \sqrt{GM/R^3}$.
4. Apply a Lorentz boost at the Keplerian speed $\vec{V} = \Omega R\,\hat{\varphi}$: it affects both the time
t and the azimuthal distance Rφ.
5. Expand $g_{00}(\vec{x})$ and $g_{0j}(\vec{x})$, to be integrated along the closed path ℓ, as well as
$g_{00}(\vec{x}_{obs})$, to first order in $\frac{2GJ}{c^3 r^2} \ll 1$ and $\frac{GM}{rc^2} \ll 1$.
The calculation generates a number of “first order terms”: among them we can
extract the classical Sagnac effect previously discussed. We focus here, instead, on
the gravitomagnetic term due to $g_{0j}$ that, to first order, simply reduces to
$$\delta\tau = \frac{4}{c^4}\oint_\ell \frac{GJ}{r}\,d\varphi \tag{13.16}$$
For the LISA constellation, rotating around the Sun with a one year period,
$\Omega = 2\pi/(1\ \text{year})$, this amounts to
Finally, it is worth noting that the line integral of Eq. 13.14 can be converted, by virtue
of Stokes’ theorem, into a surface integral of the gravitomagnetic field $B_g$, given by
the curl of the vector potential $-2g_{0i}/c^2$ (Eq. 5.64). In this way, the connection with
the encircled area typical of Sagnac phenomena is made evident.
References
Laue, Max von (1911), On an Experiment on the Optics of Moving Bodies - Münchener Sitzungsberichte, 405–412
Sagnac, G. L’éther lumineux démontré par l’effet du vent relatif d’éther dans un interféromètre en rotation uniforme (The demonstration of the luminiferous aether by an interferometer in uniform rotation) - Comptes Rendus Académie des Sciences, Paris, 157, 708-710 (1913)
Anderson R., Bilger H. R., and Stedman G. E. Sagnac effect: A century of Earth-rotated interferometers - Am. Jou. Phys., 62, 975 (1994)
Di Virgilio A. et al. Underground Sagnac gyroscope with sub-prad/s rotation rate sensitivity: Toward general relativity tests on Earth - Phys. Rev. Research 2, 032069(R) (2020)
Shaddock D. A., Operating LISA as a Sagnac interferometer - Phys. Rev. D 69, 022001 (2004)
Sun K.-X., Fejer M.M., Gustafson E., Byer R.L. Sagnac Interferometer for Gravitational-Wave Detection - Phys. Rev. Lett. 76, 3053 (1996)
Chen Y., Sagnac Interferometer as a Speed-Meter-Type, Quantum-Nondemolition Gravitational-Wave Detector - Phys. Rev. D 67, 122004 (2003)
Ruggiero, M.L., Tartaglia, A.: Test of gravitomagnetism with satellites around the Earth - Eur. Phys. Jour. Plus 134, 205 (2019)
14 GPS and Relativity
14.1 Introduction
In 1973, the U.S. Department of Defense approved a project to create a global satellite
navigation system: this marked the birth of the Navigation System for Timing And
Ranging (NAVSTAR) network, more simply called the Global Positioning System
(GPS). The system was designed to determine the position of a vehicle on the globe
with sufficient accuracy to avoid collision hazards and to guide it to its destination.
A receiver of the GPS navigation signals measures, on a local clock, the arrival
times of signals from four or more different satellites. The receiver then uses these
measurements along with information in the navigation messages to solve for four
unknowns: user position (x, y, z) and the receiver clock offset from GPS time. In
this computation, a GPS receiver must apply two relativistic corrections in order to
provide time or position to the user: the first due to special relativity, the second to
GR. In the following we present the GPS system and discuss these corrections, with
a main focus on estimating the geometric range delay, i.e. the time that GPS signals
take to propagate from the transmitter to the receiver.
A constellation of satellites orbiting the Earth is the backbone of the GPS system.
The constellation consists of 24 active satellites, arranged in 6 almost circular orbits
inclined by 55° relative to the equatorial plane (Fig. 14.1). All satellites orbit with a period
of half a sidereal day, T = 43082 s. Kepler’s laws tell us that they orbit at an average
altitude of 20184 km and an average speed vs = 3873.8 m/s. There are 4 satellites
in each orbit, placed at regular distances. This choice of orbits responds to precise
criteria: it allows an even distribution of satellites around the globe, with orbits that
come close enough to the poles to ensure that the system works properly even in those
remote regions of the Earth, albeit with a slight loss of precision. Moreover, with
the current arrangement, a receiver anywhere on Earth can receive signals
from at least five satellites.
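The altitude and speed quoted above follow from Kepler's third law for a circular orbit. The short sketch below reproduces them; the value of GM for the Earth and the use of the equatorial radius as reference are assumptions not stated at this point of the text, and the function name is purely illustrative.

```python
import math

GM_EARTH = 3.986004418e14   # m^3/s^2, assumed value (not quoted in the text)
R_EQUATOR = 6.378137e6      # m, equatorial radius used as reference (assumption)
T_ORBIT = 43082.0           # s, half a sidereal day (from the text)

def circular_orbit(period_s, gm=GM_EARTH):
    """Return (radius, speed) of a circular orbit with the given period."""
    # Kepler's third law: r^3 = GM T^2 / (4 pi^2)
    radius = (gm * period_s**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)
    speed = 2.0 * math.pi * radius / period_s
    return radius, speed

radius, speed = circular_orbit(T_ORBIT)
altitude_km = (radius - R_EQUATOR) / 1e3
print(f"altitude ~ {altitude_km:.0f} km, speed ~ {speed:.1f} m/s")
```

With these inputs the altitude comes out within a few km of 20184 km and the speed close to 3873.8 m/s.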
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 359
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_14
Fig. 14.1 Left: a drawing of the orbits of the 24 GPS satellites, arranged in pairs on the same orbit.
The red segments indicate the signal propagation to a receiver on Earth. Right: artistic rendering of
a GPS satellite. Credits: public domain images
All satellites experience disturbances, such as perturbing gravitational fields and the solar
wind, that make their control challenging. In addition to the 24 satellites that constitute
the operating network, there are now 7 spares in orbit, plus several that belong to previous
generations or are malfunctioning.
The first generation of satellites was launched between 1978 and 1985 and formed
the system called GPS I. At that stage, the orbits of the satellites were tilted by 63°
relative to the equatorial plane; the remaining characteristics were the same as the
current ones, i.e. 24 satellites arranged, in groups of four, on six orbital planes.
The satellites were 5.3 m wide, weighing 759 kg, each equipped with one atomic
Caesium (Cs) clock and two Rubidium (Rb) clocks. Their expected operational life
was 5 years, although many remained in service longer. They could be operated with
3 control axes and a hydrazine (N2H4) propulsion system. The energy was provided
by two solar panels capable of supplying ∼400 W of power, and by NiCd batteries
when orbiting in the Earth's shadow. The navigation data were transmitted on two L-band
frequencies, chosen because they are able to penetrate clouds, fog, rain, storms,
and vegetation. They are called L1, at 1575.42 MHz and L2-P5, at 1227.60 MHz,
both generated by upconverting, 154 and 120 times respectively, the local oscillator
frequency f0 = 10.23 MHz.
The second generation (1990–1997) satellites were equipped with one additional
Cs atomic clock. In the latest generations, the satellites are larger and heavier and have
a life expectancy of about 12 years. They are equipped with a hydrogen MASER as
time reference.
Finally, we have the Block III satellites, the third generation of GPS satellites
(GPS III). They began operation in 2009 and are expected to remain in service until
2030 and beyond.
14.2 Clock Fluctuations and the Allan Variance
The role of clocks is crucial in the use of GPS, and it is worthwhile to briefly summarise
their noise characteristics. Crystal oscillators and atomic clocks are affected
by phase noise, which has several different spectral components; the two main
contributions are white frequency noise and flicker frequency noise. The latter is a type of
electronic noise with a 1/f power spectral density and it is particularly troublesome,
being more and more significant as we go to lower frequencies. At very low
frequencies, we can think of the noise as becoming a drift, although the mechanisms causing
drifts are usually different from flicker noise. There is a practical consequence to
this behaviour: if we try to evaluate the noise using traditional statistical tools, such
as the standard deviation, which implies an averaging operation, the estimator will not
converge. To overcome this difficulty, and to assess the frequency stability of a clock,
a suitable tool is the Allan variance, also known as the two-sample variance. We
note that the term Allan variance is a bit ambiguous, because there exist several
different definitions. Here we refer to the following:
The Allan variance is one half of the time average of the squares of the differ-
ences between successive measurements of the frequency deviation sampled over the
sampling period.
To understand this definition, we must first introduce the concept of M-variance.
In mathematical terms, assume we sample the clock reading every T seconds;
denoting by x(kT) and x(kT + τ) the clock readings taken at kT and kT + τ, we define
the incremental ratio computed over the delay time τ:
$$\bar{y}_k = \frac{x(kT + \tau) - x(kT)}{\tau} \tag{14.1}$$
Suppose we collect M measurements. The M-variance is defined as (Fig. 14.2):
$$\sigma_y^2(M, T, \tau) = \frac{1}{M-1}\left\{\sum_{k=0}^{M-1}\bar{y}_k^{\,2} - \frac{1}{M}\left[\sum_{k=0}^{M-1}\bar{y}_k\right]^2\right\} \tag{14.2}$$
The Allan variance is a particular case of M-variance, for M = 2 and T = τ:
$$\sigma_y^2(\tau) = \sigma_y^2(2, \tau, \tau) \tag{14.3}$$
and it depends only on the variable τ. The Allan deviation is $\sigma_y(\tau)$, the square root
of the Allan variance. To clarify the meaning of this quantity and give a quantitative
example of a clock frequency stability, let us assume that $\sigma_y(\tau) = 1.3 \times 10^{-9}$ at an
observation time τ = 1 s. If the clock signal is an oscillation at 10 MHz, it means
that we are dealing with a clock frequency instability equivalent to 13 mHz RMS.
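The definitions above can be turned into a few lines of code. The following is a minimal sketch, assuming T = τ and a short list of made-up clock readings; the function names are illustrative and the estimator shown (a plain average over successive pairs) is only one of the several variants mentioned in the text.

```python
import math

def fractional_frequency(x, tau):
    """Eq. 14.1 with T = tau: y_k = (x[k+1] - x[k]) / tau."""
    return [(x[k + 1] - x[k]) / tau for k in range(len(x) - 1)]

def m_variance(y):
    """Eq. 14.2 evaluated on the M samples contained in y."""
    m = len(y)
    return (sum(v * v for v in y) - sum(y) ** 2 / m) / (m - 1)

def allan_variance(y):
    """Time average of the two-sample variances (M = 2) of successive pairs."""
    pairs = [m_variance(y[k:k + 2]) for k in range(len(y) - 1)]
    return sum(pairs) / len(pairs)

# Hypothetical clock readings (in seconds), sampled every tau = 1 s
tau = 1.0
x = [0.0, 1.0e-9, 2.5e-9, 3.0e-9, 4.5e-9, 5.0e-9]
y = fractional_frequency(x, tau)
adev = math.sqrt(allan_variance(y))
print(f"Allan deviation at tau = {tau} s: {adev:.3e}")

# Numerical example of the text: sigma_y = 1.3e-9 on a 10 MHz oscillator
freq_instability_hz = 1.3e-9 * 10e6
print(f"frequency instability: {freq_instability_hz * 1e3:.0f} mHz RMS")
```

For M = 2 the M-variance reduces to one half of the squared difference of the two samples, which is why averaging it over successive pairs matches the verbal definition given earlier.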
In Fig. 14.3 the stability of quartz and Caesium atomic clocks versus the observation
time τ are compared.
From the plots we can see that quartz clocks are more stable for short observation times,
up to tens of seconds, while atomic clocks have a much better stability at longer
times.
Fig. 14.2 Definition of the M-variance: x are the clock readings taken at constant time interval. The
frequency fluctuation is evaluated by computing the M-variance of the y variable. This variance
depends on the chosen observational delay τ , the sampling time T and the number of averaged
measurements M, as shown in Eq. 14.2
Fig. 14.3 Allan deviation of the frequency of a quartz oscillator and a Caesium atomic clock vs
the time interval τ between measurements. Credits: Ashby (2003), CC BY-NC-ND 2.0
Once the satellites are in the correct orbit, it is necessary to carefully monitor
their telemetry and operation to ensure the proper functioning of the system.1 For
this purpose, many monitoring stations have been built and operated around the world,
first by the U.S. military and later by other agencies (see Fig. 14.4).
1 Factors such as the solar wind or the gravitational attraction of other bodies in the universe can
divert satellites from their intended trajectory. These are sources of error in the data of the navigation
system.
Fig. 14.4 The network of Ground Stations for the control of the GPS system. Credits: GPS.gov
Finally, the signals of GPS satellites are detected on the ground by the receivers,
which consist of an antenna and a processing unit, capable of receiving and interpreting
the signals sent by the satellites, plus a common quartz clock, which is constantly
synchronised to the atomic clocks of the satellites.
At present, inexpensive receivers are available, capable of receiving and processing
the signals of up to 12 satellites in parallel. There are two main types of receivers:
for civil or military use. The former are only able to receive the L1 signal, while the
latter receive and decode both L1 and L2 signals, thus providing higher accuracy.
Indeed, one of the largest causes of errors is the slowing down of signals when crossing
the atmosphere: military receivers can correct for this problem by comparing
the propagation times of the two signals, while civilian receivers must instead rely on
external systems. Even the L1 signals were degraded to a precision of 100 m, by the
so-called Selective Availability, implemented by the US military for security reasons.
Many smart methods were then devised to circumvent this limitation, and finally, in
2000, the Selective Availability was turned off for good.
14.3 The Trilateration Method
The process used to determine the location is called trilateration and is based
on the knowledge of the position of some reference points to determine that of the
point sought. We first give a conceptual approach. Assume any reference system with
origin at the Earth center O. Consider a GPS receiver D (as Detector), located at an
unknown point rD. The satellite S1 sends a first signal, encoding its emission time
and its position r1, to the receiver. The distance d1 = |rD − r1| from the transmitter
is calculated by measuring the travel time t1 of the signal: d1 = ct1. The receiver
can conclude that its position rD is anywhere on the spherical surface of radius d1,
centered in r1. A second satellite S2 sends another signal, whereby the receiver is
able to trace a second sphere of radius d2 = |rD − r2| = ct2. The allowed positions
reduce now to the intersection of these two spheres, i.e. a circle. With the
signal of the third satellite S3 we finally reduce the allowed positions to just
two points on this circle, as shown in Fig. 14.5. This final degeneracy is easily removed because
one of the two points is normally at an unacceptably high altitude.
In essence, all satellites simultaneously send a signal every millisecond that contains
navigation parameters, from which their positions can be determined. The
receiver must detect at least four of these signals: for the jth satellite, the position
rj corresponds to the center of the sphere, while the radius, i.e. the distance from the
receiver, is given by
$$c\,t_j = |\vec{r}_D - \vec{r}_j| \tag{14.4}$$
where tj is the propagation time of the signal from the satellite to the receiver, whose
position is located by the vector rD.
The need to use at least four equations, and therefore four satellites, stems from
the fact that there are four unknowns in the problem: the time and the three space
coordinates that locate the event in space-time corresponding to the reception of the
signal simultaneously sent by the satellites.2
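To make the four-unknown problem concrete, the sketch below solves Eq. 14.4 for the receiver position and its clock offset by Newton iteration on four pseudoranges. It is only an illustration under stated assumptions: the satellite positions, the true receiver position and the clock bias are invented numbers, relativistic and atmospheric corrections are ignored, and the function names are hypothetical rather than taken from any receiver software.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def locate(sats, pseudoranges, iterations=10):
    """Newton iteration for receiver position (m) and clock bias (s)."""
    est = [0.0, 0.0, 0.0, 0.0]   # x, y, z, c * bias; start at the Earth center
    for _ in range(iterations):
        jac, res = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            d = math.sqrt(dx * dx + dy * dy + dz * dz)
            jac.append([dx / d, dy / d, dz / d, 1.0])   # d(model)/d(unknowns)
            res.append(rho - (d + est[3]))              # measured minus model
        step = solve_linear(jac, res)
        est = [e + s for e, s in zip(est, step)]
    return est[:3], est[3] / C

# Hypothetical geometry: four satellites (m) and a receiver near the equator
sats = [(26.56e6, 0.0, 0.0), (15.0e6, 20.0e6, 8.0e6),
        (15.0e6, -18.0e6, 12.0e6), (18.0e6, 5.0e6, -19.0e6)]
truth = (6378137.0, 1000.0, -2000.0)
bias_true = 1.0e-4   # receiver clock offset, s
pseudoranges = [math.dist(truth, s) + C * bias_true for s in sats]

position, bias = locate(sats, pseudoranges)
print(position, bias)
```

The fourth unknown enters each equation in the same way (as c times the clock bias), which is why a fourth satellite suffices to remove it.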
In order to obtain a position with a precision of the order of one meter, the travel
time of the signals must be measured with an accuracy of the order of 1 ns. For this
level of accuracy it is essential to consider and compensate for the relativistic effects that
characterise the space-time around the Earth, as well as the fact that the signals do
not reach the receiver at the same time, depending on the distance of their emitters
from the receiver. These effects are the consequence of the following fundamental features:
– synchronisation: signals are not received at the same time by a moving (if nothing
else, with the Earth) detector;
– time dilation due to special relativity: satellites move at 3.9 km/s relative to Earth;
– time difference due to general relativity: emitting satellites and receiver are
at different heights and their clocks are at different points of the gravitational
field. Clocks on GPS satellites run faster than clocks at rest on the Earth’s surface
(gravitational redshift). Thus, GPS satellite clock frequencies need to be adjusted
to compensate for this effect.
2 The issue is actually more complex, because the timing of a typical receiver (e.g. a smartphone)
is much less precise than the Caesium clocks on the satellites. A fourth satellite and additional
computation (see e.g. Ashby 2003) are thus required to remove the receiver clock bias.
The first correction depends on the locations of the satellites and the receiver and
is therefore not constant: we need to correct the equations when we refer
the result to the receiver. As for the remaining two effects, the satellites' altitude
and velocity remain roughly constant along an orbit.
As a next step, we must choose the reference frame to compute the time intervals
of the light propagation and, in doing that, we should keep in mind the need to include
these relativistic corrections.
3 The receiver must also account for ionospheric and tropospheric delay corrections, which we do
not consider here.
The Earth-Centered, Earth-Fixed (ECEF) reference system has its origin in the Earth center
and rotates with it. It is generally traced back to the reference system WGS-84(G873), with a constant
angular velocity for the rotation around the Earth axis of $\omega_\oplus = 7.292115 \cdot 10^{-5}$
rad/s (≃ 2π/sidereal day). This is clearly a non-inertial frame and, as a consequence,
we cannot directly apply Eq. (14.4), used to determine the position and time of the receiver.
We then need to compute these quantities in the Earth-Centered Inertial (ECI) frame, where the equations
of special relativity are valid. The frame is redefined by the receiver at each cycle of
the calculation of the space-time coordinates, so it is instantaneously inertial, i.e. in
free fall with the Earth in the gravitational field of the masses of the universe.
In essence, the computation steps are: (a) solve the signal propagation equation in
the ECI frame, (b) transform the results into the ECEF frame to obtain the coordinates
of the Earth-related position.
In the following, we shall run our calculations with the usual spherical coordinates
(r, θ, φ). However, GPS receivers output the position using geographical
coordinates, and it is thus appropriate to recap the differences and the conversion
between the two systems.
The geographical coordinates are:
– Longitude λ: the angle measured from the prime meridian, defined as the semicircle
that joins the two poles passing through Greenwich (UK).
Same as the azimuthal angle φ of spherical coordinates.
– Latitude ϕ (or β): the polar angle, just as θ in spherical coordinates, but measured
from the equator rather than from the North pole. In this notation, the North
pole has latitude +π/2 and the South pole has −π/2. We convert from
one to the other with: latitude = π/2 − θ. In practice, both latitude and
longitude are usually expressed in degrees, rather than radians.
– Altitude, or elevation h: the height above the surface of the Reference Ellipsoid,
a geometrical figure (an oblate ellipsoid of revolution) that best approximates the
shape of the Earth.
The altitude is thus defined as h = r − R⊕(θ), where r is the usual radial coordinate
and R⊕ is the distance of the ellipsoid surface from the Earth center of mass
(Table 14.1):
$$R_\oplus(\theta) = \sqrt{\frac{a^2 b^2}{a^2\sin^2\theta + b^2\cos^2\theta}} \tag{14.5}$$
where θ is here the angle measured from the equator (the pseudo-spherical angle of
Table 14.1), so that R⊕ = a at the equator; $a = 6.378137 \cdot 10^6$ m is the equatorial radius
(or semi-major axis) and $b = 6.356752 \cdot 10^6$ m is the polar radius, or semi-minor axis (Fig. 14.6).
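As a quick check of Eq. 14.5 and of the latitude conversion, the sketch below evaluates the ellipsoid radius with the angle measured from the equator, which is the convention under which the formula returns a at the equator and b at the poles. The function names are illustrative only.

```python
import math

A_EQ = 6.378137e6    # m, equatorial radius a (from the text)
B_POL = 6.356752e6   # m, polar radius b (from the text)

def ellipsoid_radius(lat_rad):
    """Eq. 14.5 with the angle measured from the equator (geocentric latitude)."""
    s, c = math.sin(lat_rad), math.cos(lat_rad)
    return math.sqrt((A_EQ * B_POL) ** 2 / ((A_EQ * s) ** 2 + (B_POL * c) ** 2))

def latitude_from_polar(theta_rad):
    """latitude = pi/2 - theta, with theta measured from the North pole."""
    return math.pi / 2.0 - theta_rad

def altitude(r, lat_rad):
    """h = r - R(latitude): height above the Reference Ellipsoid."""
    return r - ellipsoid_radius(lat_rad)

for lat_deg in (0.0, 45.0, 90.0):
    radius_km = ellipsoid_radius(math.radians(lat_deg)) / 1e3
    print(f"latitude {lat_deg:4.1f} deg: R = {radius_km:.3f} km")
```

If the polar angle θ of spherical coordinates is used instead, it has to be converted first with latitude_from_polar.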
Table 14.1 Symbols, range and units for various coordinates on a sphere. We also recall the
cylindrical coordinates that will be used in the following
Coordinates      | Radial      | Polar angle             | Azimuthal angle
Spherical        | 0 < r < ∞   | 0 ≤ θ ≤ π               | 0 ≤ φ < 2π
Pseudo-spherical | 0 < r < ∞   | −π/2 ≤ θ ≤ π/2          | 0 ≤ φ < 2π
Geographical     | R⊕ ≤ h < ∞  | −90° ≤ ϕ (or β) ≤ 90°   | 0 ≤ λ < 360°
Cylindrical      | 0 < r < ∞   | −∞ < z < ∞              | 0 ≤ φ < 2π
Fig. 14.6 Left: Graphical representation of the Reference Ellipsoid for the Earth (rotation axis and
equatorial plane indicated). Credits: Cmglee, Creative Commons BY-SA 4.0. Right: Graphical
representation of pseudo-spherical and Cartesian coordinates of point P
The ellipsoid is a geometrical approximation to the Earth, but we actually need a
description of its gravitational potential: the geoid is defined as the locus of points,
on or near the Earth surface, with the same value of the potential $\Phi_g$. It has the shape
that the free surface of water (if there were no tides, winds, streams etc.) would take.
Because the geoid is defined in the rotating ECEF frame, we must also include the
centrifugal potential. By truncating the multipole expansion of $\Phi_g$ to the quadrupole
moment, a geoid is defined as the surface where
$$\tilde{\Phi}(r, \theta) = -\frac{GM_\oplus}{r}\left[1 - J_2\left(\frac{a}{r}\right)^2 P_2(\cos\theta) + \ldots\right] - \frac{1}{2}\,\omega_\oplus^2 r^2 \sin^2\theta \tag{14.6}$$
has a constant value $\tilde{\Phi}_0$. Such value can be computed on the equator:
$$\tilde{\Phi}_0 = \tilde{\Phi}(r = a, \theta = \pi/2) = -\frac{GM_\oplus}{a}\left(1 + \frac{J_2}{2}\right) - \frac{\omega_\oplus^2 a^2}{2}$$
In dimensionless units, $\tilde{\Phi}_0/c^2 = -6.969 \cdot 10^{-10}$: the centrifugal term accounts for
about 1/500 of the monopole term, the quadrupole for about 1/2000. Higher order terms are
even more negligible. We shall use the geoid later on in this chapter,
dealing with GR timing corrections.
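The value of the geoid potential on the equator can be checked numerically. In the sketch below, GM for the Earth and the quadrupole coefficient J2 are assumed values that are not quoted in the text; the other constants are those given in this chapter.

```python
GM_EARTH = 3.986004418e14   # m^3/s^2, assumed value
J2 = 1.0826e-3              # Earth quadrupole coefficient, assumed value
A_EQ = 6.378137e6           # m, equatorial radius (from the text)
OMEGA_EARTH = 7.292115e-5   # rad/s (from the text)
C = 299_792_458.0           # m/s

# The three contributions to the potential at r = a, theta = pi/2
monopole = -GM_EARTH / A_EQ
quadrupole = monopole * J2 / 2.0                 # uses P2(0) = -1/2
centrifugal = -0.5 * (OMEGA_EARTH * A_EQ) ** 2

phi0_over_c2 = (monopole + quadrupole + centrifugal) / C**2
print(f"Phi0/c^2 = {phi0_over_c2:.4e}")
print(f"centrifugal/monopole = 1/{monopole / centrifugal:.0f}")
print(f"quadrupole/monopole  = 1/{monopole / quadrupole:.0f}")
```

With these inputs the total agrees with the quoted −6.969 · 10⁻¹⁰, and the two small terms come out at roughly the 1/500 and 1/2000 levels mentioned above.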
14.5 GPS and Special Relativity
Two main corrections concerning the GPS system directly derive from special relativity:
the effect of time dilation and the problem of the simultaneity of events.
It is well known that a clock in motion, with velocity u with respect to an ECI frame, runs
slower than a clock at rest in that frame: its proper time advances more slowly than the
time measured in the ECI.
This is the time dilation effect due to the relative velocity, also called the second-order
Doppler effect. To have an idea of the quantities involved, for the speed of
GPS satellites vs = 3874 m/s, we have $\beta = u/c = 1.3 \cdot 10^{-5}$ and $\gamma(v_s) - 1 = 8.23 \cdot 10^{-11}$.
This is the fractional frequency offset needed to compensate for this effect,
relative to the rate of clocks on the Earth. The main frequency of the clocks is 10.23
MHz (then upconverted to L band) and the frequency error accumulates over time.
In other words, if we do not take this effect into account, within a day we accumulate
an error of ∼7.2 µs, which results in an error on the receiver position of ∼2 km.
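The orders of magnitude above are easy to reproduce. The sketch below is a rough check, not the authors' computation: it evaluates γ − 1 for the satellite speed alone, and also the difference with respect to a receiver moving with the Earth at the equator (0.464 km/s, a value quoted later in the chapter). Under these assumptions the first quantity comes out near 8.35 · 10⁻¹¹, which matches the ∼7.2 µs/day figure, while the second reproduces the quoted 8.23 · 10⁻¹¹.

```python
import math

C = 299_792_458.0          # m/s
V_SAT = 3874.0             # m/s, satellite speed (from the text)
V_RECEIVER = 464.0         # m/s, equatorial speed of the Earth surface (from the text)
SECONDS_PER_DAY = 86400.0

beta = V_SAT / C
# gamma - 1 computed without cancellation: gamma = (1 - beta^2)^(-1/2)
gamma_minus_1 = math.expm1(-0.5 * math.log1p(-beta * beta))

# Second-order Doppler shift relative to a receiver on the rotating Earth
relative_shift = (V_SAT**2 - V_RECEIVER**2) / (2.0 * C**2)

daily_error_s = gamma_minus_1 * SECONDS_PER_DAY
position_error_m = C * daily_error_s

print(f"beta = {beta:.3e}, gamma - 1 = {gamma_minus_1:.3e}")
print(f"relative to an equatorial receiver: {relative_shift:.3e}")
print(f"daily error: {daily_error_s * 1e6:.2f} us -> {position_error_m / 1e3:.2f} km")
```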
Fig. 14.7 Satellites A and B move along a straight trajectory at constant velocity u relative to the
receiver. R and R’ are the rest reference frames of the receiver and satellites
$$c^2 (t'_D - t'_A)^2 = (x'_D - x'_A)^2 + (y'_D - y'_A)^2 \tag{14.8}$$
from which we infer the detection time $t'_D$:
$$t'_D = t'_A + \frac{1}{c}\sqrt{\frac{1}{4}(x'_B - x'_A)^2 + h^2} \tag{14.9}$$
where we have introduced the altitude h, with $h^2 = (y'_D - y'_A)^2$, and we have imposed that
in R’ the detection point is the midpoint of the satellites,
$$(x'_D - x'_A)^2 = \frac{1}{4}(x'_B - x'_A)^2 \tag{14.10}$$
$$x_D = \gamma\,(x'_D + \beta c\,t'_D) = \gamma\left[x'_D + \beta c\,t'_A + \beta\sqrt{\frac{1}{4}(x'_B - x'_A)^2 + h^2}\,\right] \tag{14.11}$$
This position differs significantly from that of the midpoint $x_M = (x_A + x_B)/2$. Indeed,
we can express $x_M$ using $t'_A = t'_B$ and the fact that, in R’, the detection is at the midpoint
$x'_D = x'_M = \frac{1}{2}(x'_A + x'_B)$:
$$x_M = \frac{1}{2}\,\gamma\left[x'_A + \beta c\,t'_A + x'_B + \beta c\,t'_B\right] = \gamma\left[x'_D + \beta c\,t'_A\right] \tag{14.12}$$
So, the difference between the space coordinates of the detection point $x_D$ and the
midpoint $x_M$ is
$$x_D - x_M = \gamma\beta\sqrt{\frac{1}{4}(x'_B - x'_A)^2 + h^2}$$
In the case of h ∼ 20200 km, $x'_A = 0$, $x'_B = 1000$ km, $u/c = 1.3 \cdot 10^{-5}$ and $t'_A = t'_B = 0$,
we get $x_D = 500.263$ km versus $x_M = 500.007$ km, for an error of the order
of 250 m.
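The size of this offset can be checked directly from the expression for x_D − x_M. The sketch below plugs in the numbers of the example, interpreting the satellite coordinates as those measured in the satellites' rest frame R’ (an assumption on our part, since the primes are not always legible in the formulas).

```python
import math

BETA = 1.3e-5               # u/c (from the text)
H_KM = 20200.0              # altitude h, km
XA_KM, XB_KM = 0.0, 1000.0  # satellite positions in R', km (assumed frame)

gamma = 1.0 / math.sqrt(1.0 - BETA**2)

# Signal path from satellite A to the midpoint detector, in R'
path_km = math.hypot(0.5 * (XB_KM - XA_KM), H_KM)

# Offset between detection point and midpoint, x_D - x_M
offset_km = gamma * BETA * path_km

# Detection point with t'_A = 0, from the Lorentz transformation
x_d_km = gamma * 0.5 * (XA_KM + XB_KM) + offset_km

print(f"path = {path_km:.1f} km, offset = {offset_km * 1e3:.0f} m, x_D = {x_d_km:.3f} km")
```

The offset evaluates to about 260 m, consistent with the "order of 250 m" quoted above and with x_D ≈ 500.263 km.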
In the simplifying hypothesis that the speed of signal propagation remains constant
while crossing the atmosphere, we must also take into account that the receiver is on
the Earth, and moving with it. As mentioned before, the equations considered so far
hold for an inertial frame: we need to compare the clock time of the satellites with
that of the receiver, rotating around the Earth’s axis with angular velocity ω⊕ . At
the equator, the linear velocity due to the Earth rotation is a·ω⊕ = 0.464 km/s, to be
compared with the satellite velocity u ∼ 3.9 km/s. In addition, the propagation time
of the signal depends on the position of the satellite relative to the receiver. Locali-
sation of the detector requires receiving signals from at least four satellites: because
they arrive from different positions, these signals will not arrive in coincidence: the
arrival time of the last signal can be delayed by up to 19 ms with respect to the first,
as shown in Fig. 14.9. As discussed above, this is the basis of trilateration, but there
is a catch: during this delay time, the Earth rotates approximately 1 µrad and the
satellite moves by more than 30 m above a stationary receiver in the ECEF reference
frame and more than 60 m in the ECI reference frame. This is a relevant effect, and
we need to discuss its contribution.
We thus consider the photon travel time as measured in the non-inertial rotating
frame. It is convenient to perform the conversion from the inertial frame to the rotating
frame, where the variables are marked as primed variables. In cylindrical coordinates,
we have:
$$r' = r; \qquad z' = z; \qquad \phi' = \phi - \omega t; \qquad t' = t$$
Fig. 14.9 A GPS satellite orbits at an altitude of 20200 km (67 ms). The transmitted signal takes
between 67 and 86 ms to propagate to points on the Earth surface in the satellite cone of visibility
(labels in the figure: d_min ≃ c · 67 ms, d_max ≃ c · 86 ms).
Credits: public domain image by Rogibert
The time transformation t' = t in the transformation above is deceptively simple. It means that the
time variable t' for the rotating frame is really determined in the underlying inertial
frame.
In the inertial ECI frame we start from the usual Minkowski metric; expressed in the
rotating coordinates (from here on we drop the primes), it reads
$$ds^2 = \left(1 - \frac{r^2\omega^2}{c^2}\right)c^2 dt^2 - d\sigma^2 - 2r^2\omega\,d\phi\,dt \tag{14.13}$$
where $d\sigma^2 = dr^2 + r^2 d\phi^2 + dz^2$ is the Euclidean spatial line element.
Consider now two or more observers in the rotating frame who want to synchronize
their clocks using the Einstein procedure (that is, using the principle that c is constant).
Photons travel along a null worldline, $ds^2 = 0$. Neglecting second order terms in the
small quantity ωr/c, we can rewrite the previous metric as:
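The intermediate expressions are not legible at this point; the following is a reconstruction sketched from the surrounding text (the null condition applied to Eq. 14.13, and the later reference to a "first integral" σ/c and to the swept area A_z), not a verbatim copy of the original equations. Equation numbers are deliberately omitted.

```latex
\begin{aligned}
0 &= c^2 dt^2 - d\sigma^2 - 2 r^2 \omega \, d\phi \, dt
     + O\!\left(\frac{\omega^2 r^2}{c^2}\right) \\
c \, dt &\simeq d\sigma + \frac{\omega r^2}{c} \, d\phi \\
\int_\ell dt &\simeq \frac{1}{c} \int_\ell d\sigma
     + \frac{2\omega}{c^2} \int_\ell dA_z ,
\qquad dA_z = \tfrac{1}{2} \, r^2 \, d\phi
\end{aligned}
```

The second line follows from solving the quadratic equation in dt and keeping only terms linear in ωr/c; the third integrates along the path ℓ and identifies ½ r² dφ with the element of area swept by the position vector, projected on a plane of constant z.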
The first integral, σ/c, is the time that naive observers, ignoring that they are
rotating with the Earth, would use to synchronize their clocks. The second term is
the error we would incur, if we neglected the effect of Earth rotation, in computing
the propagation time. The integral is easily recognised as the area $A_z$ swept by
the position vector along the path ℓ, projected on a plane with z = const (e.g., the
equator). Observing from an inertial frame, this can be regarded as the additional
travel time required by the light to catch up with the rotating frame. This difference
can lead to significant errors: on the Earth, for an observer synchronizing a set of
clocks around the equator (ℓ = 2πR⊕), this amounts to:
$$\frac{2\omega_\oplus}{c^2}\int dA_z = \frac{2\omega_\oplus}{c^2}\,\pi a^2 = 207.4\ \text{ns}$$
This result echoes the Sagnac effect, discussed in Chap. 13: Eq. 13.3 has an extra
factor 2 due to the two counterpropagating beams, which double the effect.
Some authors interpret the Sagnac effect as light not travelling at speed c, without
violating the postulate of SR because the effect is analyzed in a non-inertial frame.
It is however simpler to consider that light does not travel, in the ECEF frame,
along a straight path, but rather spirals due to the rotation of the frame.
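The equatorial synchronization error quoted above can be verified in a couple of lines. The sketch below evaluates 2ω⊕A_z/c² for the area enclosed by the equator, using the constants given in this chapter; the function name is illustrative.

```python
import math

OMEGA_EARTH = 7.292115e-5   # rad/s (from the text)
A_EQ = 6.378137e6           # m, equatorial radius (from the text)
C = 299_792_458.0           # m/s

def sagnac_delay(area_m2, omega=OMEGA_EARTH):
    """Synchronization error 2 * omega * A_z / c^2 for a projected area A_z."""
    return 2.0 * omega * area_m2 / C**2

equator_ns = sagnac_delay(math.pi * A_EQ**2) * 1e9
print(f"Sagnac correction around the equator: {equator_ns:.1f} ns")
```

The same function can be used for any path, provided the area is the one swept by the position vector and projected on the equatorial plane.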
GPS can be used to compare times on two earth-fixed clocks when a single satellite
is in view from both locations. This is the “common-view” method of comparison
of primary standards, whose locations on Earth’s surface are usually known very
accurately in advance from ground-based surveys. Signals from a single GPS satellite
in common view of receivers at the two locations provide enough information to
determine the time difference between the two local clocks. The Sagnac effect is very
important in making such comparisons, as it can amount to hundreds of nanoseconds,
depending on the geometry. In 1984 GPS satellites 3, 4, 6, and 8 were used in
simultaneous common view between three pairs of Earth timing centers, to perform
an around-the-world Sagnac experiment. The centers were the National Bureau of
Standards (NBS) in Boulder, CO, Physikalisch-Technische Bundesanstalt (PTB) in
Braunschweig, West Germany, and Tokyo Astronomical Observatory (TAO). The
size of the Sagnac correction varied from 240 to 350 ns. Enough data were collected
to perform 90 independent circumnavigations. The actual mean value of the error,
obtained after adding the three pairs of time differences, was 5 ns, which is less than
2% of the magnitude of the calculated total Sagnac effect (Allan 1985).
14.6 GPS and General Relativity
Satellites orbit at a mean altitude h = 20184 km and are subject to a weaker gravitational
field than a receiver on Earth. In previous chapters, we discussed experiments
that highlight how time slows down as the gravitational field grows. Therefore, time
runs faster for a clock placed in a satellite than in a receiver on Earth. It is therefore
necessary to slow down the clocks of satellites, in order to ensure that the frequencies
of the signals sent correspond, at the reception, to those of the local receiver clock.
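According to General Relativity, the time interval $\Delta t_e$ measured by the emitting clock
(on the spacecraft) is related to the time interval $\Delta t_D$ measured at detection (on Earth) by
$$\Delta t_e = \Delta t_D\left(1 + \frac{\Delta\Phi}{c^2}\right) \tag{14.18}$$
where ΔΦ > 0 is the difference in gravitational potential between the satellite and the
receiver, so that
$$\frac{1}{\Delta t_e} \simeq \frac{1}{\Delta t_D}\left(1 - \frac{\Delta\Phi}{c^2}\right) \tag{14.19}$$
The relative change in frequency is then $-5.284 \cdot 10^{-10}$ for the h = 20184 km
altitude.
This implies that, in a day, the atomic clocks of satellites, oscillating at 10.23
MHz, would run ahead of ground clocks by a staggering 45.705 µs. In the absence of a correction,
this would cause a positional error of 13 km/day, cumulating with time.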
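A rough numerical check of the gravitational shift is sketched below, keeping only the monopole term of the Earth potential. GM for the Earth is an assumed value, and the receiver is placed at the equatorial radius, which is also an assumption; with a slightly different receiver radius the daily figure shifts in the second decimal, which presumably explains the small difference from the 45.705 µs quoted in the text.

```python
GM_EARTH = 3.986004418e14   # m^3/s^2, assumed value
R_RECEIVER = 6.378137e6     # m, receiver at the equatorial radius (assumption)
H_SAT = 20184.0e3           # m, mean satellite altitude (from the text)
C = 299_792_458.0           # m/s
SECONDS_PER_DAY = 86400.0

r_sat = R_RECEIVER + H_SAT

# Delta Phi / c^2 between satellite and receiver (monopole term only)
grav_shift = GM_EARTH / C**2 * (1.0 / R_RECEIVER - 1.0 / r_sat)

daily_gain_us = grav_shift * SECONDS_PER_DAY * 1e6
position_error_km = C * grav_shift * SECONDS_PER_DAY / 1e3

print(f"Delta Phi / c^2 = {grav_shift:.4e}")
print(f"daily gain = {daily_gain_us:.2f} us -> {position_error_km:.1f} km/day")
```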
We note that the Special Relativity correction related to time dilation (Eq. 14.7)
and that associated with General Relativity have opposite signs. Therefore, they
partially compensate.
Summarizing, the total correction is:
$$\frac{1}{\Delta t_e} = 10.23\ \text{MHz} \cdot \left(1 - 5.284 \cdot 10^{-10} + 8.229 \cdot 10^{-11}\right)$$
6 We neglect here, for the sake of simplicity, the contribution from the Earth quadrupole moment
$$\Phi(r, \theta) = -\frac{GM_\oplus}{r}\left[1 - J_2\left(\frac{R_\oplus}{r}\right)^2 P_2(\sin\theta)\right]$$
that is usually included in the computation.
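Combining the two fractional offsets quoted above gives the net adjustment of the on-board oscillator. The sketch below simply carries out this arithmetic with the numbers of the text; the variable names are illustrative.

```python
F0_HZ = 10.23e6            # nominal clock frequency (from the text)
GR_TERM = -5.284e-10       # gravitational term (from the text)
SR_TERM = 8.229e-11        # time dilation term (from the text)
SECONDS_PER_DAY = 86400.0

net_fraction = GR_TERM + SR_TERM
offset_hz = F0_HZ * net_fraction           # frequency offset to apply on board
adjusted_hz = F0_HZ + offset_hz            # resulting oscillator setting
daily_drift_us = -net_fraction * SECONDS_PER_DAY * 1e6   # drift if uncorrected

print(f"net fractional offset = {net_fraction:.4e}")
print(f"frequency offset = {offset_hz * 1e3:.3f} mHz")
print(f"uncorrected drift = {daily_drift_us:.1f} us/day")
```

The net effect, about 38.5 µs per day if left uncorrected, is dominated by the gravitational term.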
We conclude by pointing out that, in principle, the time correction for gravitational
effects is not constant because, on one hand, the Earth gravitational field is not
uniform; on the other hand, the satellites are also exposed to the time-dependent
gravitational fields of external bodies, such as the Sun and, to a larger extent, the Moon.
However, the errors associated with these contributions are of the order of a centimetre.
These results can also be derived using the simpler argument by Mehr Un Nisa. We
will analyze the ticking of clocks using the Schwarzschild metric, seen in Chap. 5.
Consider a coordinate time t, at rest at infinity (with the distant stars), and the proper
times ticking with the GPS satellite, $\tau_s$, and with the receiver at rest on the Earth, $\tau_D$.
For each of these we can write the metric as:
$$ds^2 = \left(1 - \frac{2GM_\oplus}{rc^2}\right)(c\,dt)^2 - \left(1 - \frac{2GM_\oplus}{rc^2}\right)^{-1} dr^2 - r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) = (c\,d\tau)^2 \tag{14.21}$$
Both the receiver D and the satellite S move at constant distance from the Earth (R⊕
and $r_s$ = R⊕ + h, respectively) on almost circular orbits, and therefore we can set
dr = dθ = 0. We can divide Eq. 14.21 by $(c\,dt)^2$ and write two equations, for the
satellite and the detector proper times:
$$\left(\frac{d\tau_s}{dt}\right)^2 = 1 - \frac{2GM_\oplus}{r_s c^2} - r_s^2 \sin^2\theta_s \left(\frac{d\phi_s}{c\,dt}\right)^2 \tag{14.22}$$
$$\left(\frac{d\tau_D}{dt}\right)^2 = 1 - \frac{2GM_\oplus}{R_\oplus c^2} - R_\oplus^2 \sin^2\theta_D \left(\frac{d\phi_D}{c\,dt}\right)^2 \tag{14.23}$$
In the last term, we easily identify r sin θ dφ/dt as the velocity along the circular
orbit. Dividing the two above equations we get:
$$\left(\frac{d\tau_s}{d\tau_D}\right)^2 = \frac{1 - \frac{2GM_\oplus}{r_s c^2} - \left(\frac{v_s}{c}\right)^2}{1 - \frac{2GM_\oplus}{R_\oplus c^2} - \left(\frac{v_D}{c}\right)^2} \tag{14.24}$$
This equation gives the ratio of the proper times of the clocks at two different
distances from the centre of the Earth.
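Equation 14.24 can be evaluated numerically to see both effects at once. The sketch below is only a check under stated assumptions: GM for the Earth is an assumed value, the receiver sits on the equator at radius a and moves with the Earth rotation, and the satellite speed is the one quoted in the Introduction.

```python
import math

GM_EARTH = 3.986004418e14   # m^3/s^2, assumed value
C = 299_792_458.0           # m/s
R_EARTH = 6.378137e6        # m, receiver on the equator (assumption)
H_SAT = 20184.0e3           # m, satellite altitude (from the text)
V_SAT = 3873.8              # m/s, satellite speed (from the text)
OMEGA_EARTH = 7.292115e-5   # rad/s (from the text)

r_sat = R_EARTH + H_SAT
v_receiver = OMEGA_EARTH * R_EARTH

def rate_squared(radius, speed):
    """(d tau / dt)^2 for a circular path of given radius and speed."""
    return 1.0 - 2.0 * GM_EARTH / (radius * C**2) - (speed / C) ** 2

# Eq. 14.24: ratio of satellite and receiver proper-time rates, minus one
ratio_minus_1 = math.sqrt(rate_squared(r_sat, V_SAT)
                          / rate_squared(R_EARTH, v_receiver)) - 1.0

print(f"d tau_s / d tau_D - 1 = {ratio_minus_1:.4e}")
```

The result is positive (the satellite clock runs fast overall) and its magnitude matches the combination 5.284 · 10⁻¹⁰ − 8.229 · 10⁻¹¹ found in the previous sections.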
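We can separately analyse the kinematic effect, or second order Doppler shift:
$$\frac{d\tau_s}{d\tau_D} = \sqrt{\frac{1 - \left(\frac{v_s}{c}\right)^2}{1 - \left(\frac{v_D}{c}\right)^2}} \simeq 1 - \frac{v_s^2}{2c^2} + \frac{v_D^2}{2c^2} = 1 - 8.23 \cdot 10^{-11}$$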
Thus, we end up with the same corrections derived in the previous sections.
To be more refined, we should substitute the last term (the Earth monopole potential)
with the potential on the geoid $\tilde{\Phi}_0/c^2$ introduced in Sect. 14.4. Numerically,
it makes little difference (0.2%) but it is important to remark that all clocks on the
geoid tick at the same frequency.
As usual, life is more complicated than this: we did not take into account, in
this simplified analysis, several issues like the ellipticity (small, but non zero) of the
satellite orbits, the non-sphericity of the Earth, the velocity of the receiver with respect
to the ECEF (negligible for a hiker, less so for a car, definitely not for an airplane), the
Shapiro time delay and the ray bending due to the Earth, time dependent perturbations
due to the Moon and other planets, variations in the Earth rotation rate... All these
perturbations, discussed in Ashby (2003), are responsible for corrections below one
meter, and can be neglected in a primer like this chapter.
References
Ashby, N.: Relativity in the Global Positioning System. Living Rev. Relativity 6, 1 (2003).
http://www.livingreviews.org/lrr-2003-1
Allan, D.W., Weiss, M., Ashby, N.: Around-the-world relativistic Sagnac experiment. Science
228(4695), 69–70 (1985)
Appendix A: Modulation Techniques
A.1 Introduction
– The size of the antennas would be enormous: the optimum length of an antenna is
λ/2, and the wavelength corresponding to a frequency of 5 kHz is λ = c/(5 kHz) ≃
60 km. Therefore the antennas, to have a good efficiency, would have to be
30 km long.
– The power required to drive an antenna of this size would be enormous.
– The transmitter would be heavy and voluminous.
– The frequency band would be the same (0–5 kHz) for all users. In other words,
the e.m. channel, in the absence of modulation, would be unique: all the users in
the world would talk and listen at the same time on the same channel, making any
communication absolutely impossible.
– The communications being de facto public, there would not be, and there could
not be, any form of “IT privacy”.
These considerations show the need to translate the signal in frequency, allocating
the transmissions of different users to different channels and thus overcoming the
above-listed disadvantages. As we will discuss in the following section, by transmitting
in Frequency Modulation (FM), i.e. around 100 MHz, the length of the antennas
is 75 cm, the required power is much lower and the size of the transmitter is minimal.
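The antenna sizes quoted in this introduction follow from λ = c/f. A minimal sketch of the arithmetic is given below; the function names are illustrative. Note that, at 100 MHz, a length of 75 cm corresponds to a quarter of the wavelength, while the half-wave length is about 1.5 m.

```python
C = 299_792_458.0  # speed of light, m/s


def wavelength(frequency_hz):
    """Free-space wavelength in metres."""
    return C / frequency_hz


def antenna_length(frequency_hz, fraction=0.5):
    """Antenna length as a fraction of the wavelength (0.5 = half-wave)."""
    return fraction * wavelength(frequency_hz)


for label, f in (("audio, 5 kHz", 5e3), ("FM band, 100 MHz", 100e6)):
    print(f"{label}: lambda = {wavelength(f):.4g} m, "
          f"lambda/2 = {antenna_length(f):.4g} m, "
          f"lambda/4 = {antenna_length(f, 0.25):.4g} m")
```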
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_A
Using different values of the frequency ratio factor (the ratio between the signal
and the carrier frequencies), we can use different frequencies for each transmission,
which can then occur at the same time.
In essence, modulation adapts the spectrum of the signal to be transmitted so that
it passes well through the channel and, at the same time, allows multiplexing, i.e.
the transmission of many signals on the same channel without interference.
A modulation process is characterized by two frequency components
[Figure: modulation methods, analog and numeric (digital); labels AM, FM, PM]
The carrier propagates to a distant receiver the information contained in the low
frequency signal, called the modulator. We shall assume, for ease of calculation
and without losing generality, that the modulator is also a sine wave of frequency
$f_m = \omega_m/2\pi$. A sinusoidal modulation is not very interesting, because it carries only
two pieces of information: its amplitude and frequency; nevertheless, Fourier’s theorem
assures us that any periodic (or even aperiodic) signal can be decomposed into the sum
of a number, often infinite, of sinusoidal waves.
In Fig. A.2 the three signals shown are, from top to bottom:
Fig. A.2 The three waveforms: the modulating signal, the carrier wave, the modulated signal
are the modulating and carrier signals, respectively. The resulting modulated signal
is:
$$ v_{AM}(t) = (V_m\cos\omega_m t + V_c)\cos\omega_c t = (m\cos\omega_m t + 1)\,V_c\cos\omega_c t \tag{A.2} $$
Here we introduced the modulation index m (also known as the modulation depth),
defined as the ratio of the amplitude of the modulating signal to the amplitude of the
carrier signal
$$ m = \frac{V_m}{V_c} $$
Applying the well-known trigonometric relation $\cos\alpha\cos\beta = \frac{1}{2}[\cos(\alpha-\beta) + \cos(\alpha+\beta)]$,
the expression A.2 takes the form
$$ v_{AM}(t) = V_c\left[\cos\omega_c t + \frac{m}{2}\cos(\omega_c - \omega_m)t + \frac{m}{2}\cos(\omega_c + \omega_m)t\right] $$
that is, the signal is composed of the sum of three sine waves: the first coincides
with the unperturbed carrier, while the others are two sinusoids of amplitude m Vc /2
oscillating at frequencies given by the sum and difference of the carrier and the
modulation frequencies (Fig. A.3).
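The equivalence between the product form of Eq. A.2 and the sum of three spectral lines can be verified numerically. The following sketch evaluates both expressions on a grid of times and prints the largest difference; the carrier and modulator frequencies, the amplitude and the value m = 0.5 are illustrative assumptions.

```python
import math


def v_am(t, v_c, m, f_c, f_m):
    """Amplitude-modulated signal in the product form of Eq. A.2."""
    return (1 + m * math.cos(2 * math.pi * f_m * t)) * v_c * math.cos(2 * math.pi * f_c * t)


def v_three_lines(t, v_c, m, f_c, f_m):
    """Same signal written as carrier plus lower and upper side lines."""
    carrier = math.cos(2 * math.pi * f_c * t)
    lower = 0.5 * m * math.cos(2 * math.pi * (f_c - f_m) * t)
    upper = 0.5 * m * math.cos(2 * math.pi * (f_c + f_m) * t)
    return v_c * (carrier + lower + upper)


# Illustrative values (assumptions): 1 V carrier at 1 kHz, 50 Hz modulator, m = 0.5
V_C, M, F_C, F_M = 1.0, 0.5, 1000.0, 50.0
times = [k / 20000.0 for k in range(2000)]
max_diff = max(abs(v_am(t, V_C, M, F_C, F_M) - v_three_lines(t, V_C, M, F_C, F_M))
               for t in times)
print(f"largest difference between the two forms: {max_diff:.2e}")
```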
We note that the process of modulation results in the modulating signal at $f_m$ being
translated to higher frequency by an amount given by $f_c$. In addition, the modulated
signal extends over a bandwidth that is twice the modulating frequency $f_m$.
The modulation index m usually ranges between 0 and 1. Equation A.2 shows that
m = 0 means that no modulated signal is transmitted, while the carrier still occupies
the channel. When m > 1 we observe at the output a strong distortion of the
modulated signal (crossover), as shown in Fig. A.4.
Fig. A.4 In the plots we show the signal in the time domain $v_{AM}(t)$ for the four cases m = 0, 0.5, 1,
and for m > 1 (overmodulation)
Since the amplitude modulated signal consists of three distinct components in the
frequency domain, its total time averaged power is the sum of the powers of the
three contributions:
only the two sidebands contain the transmitted information (the same in both sidebands),
while the third, more intense, is the carrier, which has no information content.
This observation suggests that amplitude modulation has a low informational performance:
to transmit one of the two sidebands, which alone contains all the information,
we are forced to transmit two more lines, one of which carries much larger power.
We define the modulation yield as the ratio of the power of the transmitted information
signal, contained in only one of the two side lines, to all the power that must
be transmitted, which is due to all the three lines. By applying the relationship A.4,
we have
$$ \eta = \frac{P_{c-m}}{P_{AM}} = \frac{m^2}{2(m^2+2)} \tag{A.5} $$
Equation A.5 gives us the range of available performance as we vary m between
its limit values, 0 < m < 1:
$$ 0 < \eta < \frac{1}{6} $$
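The yield of Eq. A.5 can be recovered from the amplitudes of the three lines. The sketch below assumes that a sinusoidal line of amplitude V dissipates a time-averaged power V²/(2Z) on a load Z; the load Z = 1 and the function names are assumptions introduced for the illustration.

```python
def am_line_powers(m, v_c=1.0, z=1.0):
    """Time-averaged power of the carrier and of one side line on a load z."""
    p_carrier = v_c**2 / (2 * z)
    p_side = (m * v_c / 2) ** 2 / (2 * z)
    return p_carrier, p_side


def am_yield(m):
    """Power in one side line divided by the total power of the three lines."""
    p_carrier, p_side = am_line_powers(m)
    return p_side / (p_carrier + 2 * p_side)


for m in (0.25, 0.5, 1.0):
    closed_form = m**2 / (2 * (m**2 + 2))
    print(f"m = {m:4.2f}: eta = {am_yield(m):.4f}, Eq. A.5 gives {closed_form:.4f}")
```

At m = 1 the yield reaches its upper limit of 1/6: five sixths of the transmitted power carry no additional information.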
The modulator is usually a periodic signal that can be expanded in Fourier series. Just
as we did in the previous case of AM transmission, we shall consider for simplicity’s
sake just the lowest harmonic of the modulating signal vm (t) = Vm cos ωm t, where
we assume that ωc >> ωm .
In frequency modulation, the amplitude of the modulated signal is constant and
equal to the amplitude of the carrier Vc . On the other hand, the frequency changes pro-
portionally to the instantaneous value of the modulating signal; this can be achieved,
[Fig. A.5: modulator, carrier and FM signal versus time]
where
$$ \omega_{FM}(t) = \omega_c + K_F V_m\cos\omega_m t = \omega_c + 2\pi\,\Delta f\cos\omega_m t \tag{A.7} $$
where $K_F$ is a normalization constant that renders $K_F V_m$ in units of [rad/s]. The
peak frequency deviation $\Delta f$ is the maximum distance from the carrier frequency:
$$ \Delta f = \frac{K_F V_m}{2\pi} \tag{A.8} $$
In FM radio systems we have $\Delta f = 75$ kHz (Fig. A.5).
The modulated signal phase $\phi_{FM}(t)$ is a function of time, obtained by integrating
over time the periodic function $\omega_{FM}(t)$:
$$ \phi_{FM}(t) = \int \omega_{FM}\,dt = \int (\omega_c + K_F V_m\cos\omega_m t)\,dt = \omega_c t + \frac{K_F V_m}{\omega_m}\sin\omega_m t \tag{A.9} $$
This shows that the modulation index in this case is
$$ m = \frac{K_F V_m}{\omega_m} = \frac{\Delta f}{f_m} \tag{A.10} $$
[Fig. A.6: Bessel functions $J_0(m)$, $J_1(m)$, $J_2(m)$, $J_4(m)$, $J_{10}(m)$ versus the modulation index m]
This periodic function can be expanded in a series of Bessel functions (Fig. A.6):
It appears from Eq. A.12 that the modulated signal spectrum is symmetric with respect
to $\omega_c$ and consists of a series of lines with equal spacing $\omega_m$ (Fig. A.7). The amplitude
of each line is given by the value $J_i(m)$ of the Bessel function of the ith order for
the given value m of the modulation index.²
Let us consider the following practical example: we assume $f_c = 100$ MHz,
$f_m = 15$ kHz, $\Delta f = 45$ kHz, $V_c = 100$ V, which corresponds to m = 3, as given by
Eq. (A.10). On the Bessel function diagram of Fig. A.6, the intersections of the vertical
line at the chosen value of m with the curves $J_0, J_1, J_2, \ldots$, multiplied by $V_c$,
give the voltage amplitudes of the corresponding spectral lines.
In FM the bandwidth of a modulated signal is defined on the basis of the set of
spectral lines having an appreciable amplitude. A typical threshold is to consider as
detectable the lines as large as 1% of the carrier in absence of modulation: this has
a normalized amplitude $v_{FM}^{max}/V_c = J_0(m = 0) = 1$.
In our example, this corresponds to excluding the contribution of all Bessel functions
with order $i \geq 6$, leaving ±5 sidebands separated by $f_m = 15$ kHz. In general, the
FM bandwidth is
$$ B = 2 N f_m \tag{A.13} $$
i.e., with N = 5, we have B = 150 kHz.
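The line amplitudes of the practical example can also be computed instead of being read off a diagram. The sketch below evaluates $J_n(m)$ from the integral representation $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$ with a trapezoid rule, so that only the standard library is needed; the helper name `bessel_j` and the number of integration steps are assumptions of this illustration.

```python
import math


def bessel_j(n, x, steps=400):
    """J_n(x) from (1/pi) * integral over [0, pi] of cos(n*tau - x*sin(tau)), trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # end points tau = 0 and tau = pi
    for k in range(1, steps):
        tau = k * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


# Values of the example in the text
F_M, DELTA_F, V_C = 15e3, 45e3, 100.0
m = DELTA_F / F_M  # modulation index, Eq. A.10

for i in range(8):
    amplitude = V_C * bessel_j(i, m)
    print(f"lines at f_c +/- {i * F_M / 1e3:5.0f} kHz: J_{i}({m:g}) * V_c = {amplitude:8.3f} V")
```

In this computation the sixth line comes out at about 1.1 V, i.e. right at the 1% threshold, so the cut between N = 5 and N = 6 is marginal for m = 3.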
In the case when the modulating signal has a continuous or broad spectrum of frequencies,
the bandwidth required for transmission can be approximately estimated
by Carson’s bandwidth rule
$$ B_C = 2\left(\Delta f + f_m^{max}\right) \tag{A.14} $$
where $\Delta f$ is the peak frequency deviation, see Eq. A.8, and $f_m^{max}$ is the highest frequency
in the Fourier spectrum of the modulating signal. This formula best approximates
the value computed by Eq. A.13 for larger m values. Indeed, applying it to
the previous example, where we have $\Delta f = 45$ kHz and $f_m^{max} = f_m = 15$ kHz, the
bandwidth calculated using the Carson rule is $B_C = 120$ kHz, which is 20% lower
than the number computed using Eq. A.13. A more fitting example is FM radio
broadcast that, with $\Delta f = 75$ kHz, allows transmission of frequencies up to 53 kHz.
Equation A.14 then yields $B_C = 256$ kHz. In principle, radio stations are assigned
carrier frequencies 400 kHz apart, in order to accommodate the full bandwidth, with
margin.
² The Bessel functions (of the first kind) are a family of special functions, solutions of the Laplace
equation in cylindrical coordinates. The numerical values (just as the values of trigonometric functions)
are tabulated but also directly available in modern computing tools.
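The two bandwidth estimates can be compared directly. The sketch below takes Carson's rule in its usual form, $B_C = 2(\Delta f + f_m^{max})$, which reproduces both numerical values quoted in the text; the function names are illustrative.

```python
def line_count_bandwidth(n_lines, f_m):
    """Eq. A.13: B = 2 N f_m, with N side lines kept on each side of the carrier."""
    return 2 * n_lines * f_m


def carson_bandwidth(delta_f, f_m_max):
    """Carson's rule: B_C = 2 (delta_f + f_m_max)."""
    return 2 * (delta_f + f_m_max)


print("Eq. A.13, N = 5, f_m = 15 kHz :", line_count_bandwidth(5, 15e3) / 1e3, "kHz")
print("Carson, 45 kHz deviation      :", carson_bandwidth(45e3, 15e3) / 1e3, "kHz")
print("Carson, FM broadcast          :", carson_bandwidth(75e3, 53e3) / 1e3, "kHz")
```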
Just as in the AM case, the power transmitted is computed by summing all squared
amplitudes in the signal:
$$ P_c = \frac{V_c^2}{2Z}\left[J_0^2(m) + 2\sum_{i=1}^{\infty} J_i^2(m)\right] \tag{A.15} $$
To this purpose, we can use an important sum rule for Bessel functions³:
$$ J_0^2(m) + 2\sum_{i=1}^{\infty} J_i^2(m) = 1 \tag{A.16} $$
This relation, applied to Eq. A.12, shows that the energy of a transmitted signal is
conserved, and the choice of m just changes the way it is distributed among the
various frequency lines. It is therefore advantageous to minimize the amount of
power contained in the carrier, which transmits no information. This is achieved by
choosing a value of m where $J_0$ vanishes: the first zeros of $J_0$ are found for m =
2.4, 5.5, 8.7, 11.8, ... In this condition of carrier suppression, the modulation
yield (extending the definition Eq. A.5 to all the bands on one side of the carrier)
reaches η = 50%.
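Both the sum rule and the carrier suppression at the zeros of $J_0$ can be checked numerically. The sketch below reuses a trapezoid-rule evaluation of the Bessel integral (standard library only); the helper names, the truncation of the sum at 40 terms and the sample values of m are assumptions of this illustration.

```python
import math


def bessel_j(n, x, steps=400):
    """J_n(x) from (1/pi) * integral over [0, pi] of cos(n*tau - x*sin(tau)), trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        tau = k * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def power_fractions(m, n_max=40):
    """Fractions of the total power in the carrier and in all side lines (both sides)."""
    carrier = bessel_j(0, m) ** 2
    side_lines = 2 * sum(bessel_j(i, m) ** 2 for i in range(1, n_max + 1))
    return carrier, side_lines


for m in (1.0, 2.405, 3.0, 5.520):
    carrier, side_lines = power_fractions(m)
    print(f"m = {m:5.3f}: carrier {carrier:.4f}, side lines {side_lines:.4f}, "
          f"sum {carrier + side_lines:.6f}")
```

Near m = 2.405 and m = 5.520 the carrier fraction is negligible, so each side of the spectrum carries half of the power.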
Mixers are non-linear devices, used in telecommunications systems, that accept two
input signals of different frequency and produce, at the output, signals oscillating at
various combinations of the two frequencies:
$$ v_i(t) = V_i\sin(2\pi f_i t + \phi_i), \qquad i = 1, 2 $$
³ This relation is the “Bessel equivalent” of $\sin^2(\alpha) + \cos^2(\alpha) = 1$ for trigonometric functions.
and one way to obtain sum and difference of these frequencies $f_i$ is to multiply the
two signals together. In fact, by applying the trigonometric identity
$$ \sin\alpha\sin\beta = \frac{1}{2}\left[\cos(\alpha-\beta) - \cos(\alpha+\beta)\right] \tag{A.17} $$
and assuming, for simplicity, that the two signals are in phase ($\phi_1 = \phi_2 = 0$), we
have
$$ v_1(t)\cdot v_2(t) = \frac{V_1 V_2}{2}\left[\cos 2\pi(f_1 - f_2)t - \cos 2\pi(f_1 + f_2)t\right] \tag{A.18} $$
producing both the sum frequency ($f_1 + f_2$) and the difference ($f_1 - f_2$).
In order to multiply two signals, we need a non-linear element. In the radio
frequency band this is usually a simple diode. Diodes are characterized by a
voltage-current (v − i) relationship
$$ i = i_o\left(e^{\frac{qv}{nkT}} - 1\right) \tag{A.19} $$
In the case of small voltage signals v(t) applied to the diode input, we expand around
zero the exponential in Eq. A.19
$$ e^x - 1 \simeq x + x^2/2 + \cdots $$
If we now apply both signals at the device input $v(t) = v_1(t) + v_2(t)$, the output
voltage (on a resistive load) will exhibit
$$ v_{out}(t) = (v_1 + v_2) + \frac{1}{2}(v_1 + v_2)^2 + \ldots = v_1 + v_2 + \frac{1}{2}(v_1^2 + v_2^2) + v_1 v_2 + \ldots \tag{A.20} $$
The two linear terms $v_1$ and $v_2$ contribute to the output with signals at the $f_1$ and
$f_2$ frequencies, the quadratic terms $v_1^2$ and $v_2^2$ give rise to a (time averaged) constant
term and terms at $2f_1$ and $2f_2$, while the product $v_1 v_2$ gives rise to signals at the sum
and difference of $f_1$ and $f_2$. Then, the frequency band of interest must be filtered
through an appropriate band pass circuit.
For example, in the case that f 1 , f 2 >> | f 1 − f 2 | (two signals with relatively
close frequencies), the difference signal will be easily filtered compared to all other
components because it oscillates at a much lower frequency than the others: in radio
receivers, the mixer is used to extract this slow signal. In this way, a high frequency
signal is demodulated down to audio frequencies.
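The spectral content of Eq. A.20 can be made explicit with a small numerical experiment: two tones are passed through a linear plus quadratic characteristic, and the amplitude of each expected line is measured with a single-bin Fourier sum. The tone frequencies (scaled down to 100 Hz and 90 Hz), the unit amplitudes, the sampling rate and the helper name are assumptions chosen so that every line falls on an exact frequency bin.

```python
import cmath
import math


def line_amplitude(samples, freq_hz, sample_rate_hz):
    """Amplitude of the spectral line at freq_hz (single-bin discrete Fourier sum)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
              for k, x in enumerate(samples))
    scale = 1.0 if freq_hz == 0 else 2.0
    return scale * abs(acc) / n


# Illustrative assumptions: unit-amplitude tones at 100 Hz and 90 Hz, 1 s sampled at 1 kHz
F1, F2, RATE = 100.0, 90.0, 1000.0
out = []
for k in range(1000):
    t = k / RATE
    v = math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)
    out.append(v + 0.5 * v * v)  # linear plus quadratic term, as in Eq. A.20

for f in (0.0, F1 - F2, F2, F1, 2 * F2, F1 + F2, 2 * F1):
    print(f"{f:6.1f} Hz: amplitude {line_amplitude(out, f, RATE):.3f}")
```

The 10 Hz difference line is well separated from all the others, which is why a simple low-pass or band-pass filter is enough to isolate it.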
The heterodyne, also called the Armstrong regenerator, is a signal processing tech-
nique, largely used in radio equipment that shifts one frequency range into another
(shifting up in transmission, down in reception). It was anticipated in 1896 by Nikola
Tesla, developed by Reginald Fessenden in 1901 and perfected in 1913 by Edwin
Armstrong during his studies on the operation of the triode. Modern implementations
of this principle, taking advantage of the mixer, are called super-heterodyne.
The signal from the antenna is processed by a broadband radio frequency (RF)
filter centred on a frequency $f_s$ and then fed to the input of a mixer together with a
signal generated by a local oscillator at a frequency $f_{LO}$ such that their difference
is a preset intermediate frequency:
$$ f_i = |f_{LO} - f_s| \tag{A.21} $$
signal are superimposed in the mixer, which typically is a photodiode. This responds
to the intensity of the incoming beam and therefore its response to the amplitude of
the signal is non-linear.
The input electromagnetic fields are
$$ E_s(t) = E_s\cos(\omega_s t + \phi) $$
$$ E_{LO}(t) = E_{LO}\cos\omega_{LO} t $$
The output of the detector is proportional to the light intensity, that is, to the square
of the incident field:
$$ I \propto \left[E_s\cos(\omega_s t + \phi) + E_{LO}\cos\omega_{LO} t\right]^2 $$
By expanding this expression we have
$$ \frac{E_s^2}{2}\left[1 + \cos(2\omega_s t + 2\phi)\right] + \frac{E_{LO}^2}{2}\left[1 + \cos 2\omega_{LO} t\right] + E_s E_{LO}\left\{\cos[(\omega_s + \omega_{LO})t + \phi] + \cos[(\omega_s - \omega_{LO})t + \phi]\right\} \tag{A.22} $$
and by collecting terms in the above equation, we identify the three different contributions:
• the static components: $\frac{E_s^2 + E_{LO}^2}{2}$
• the high frequency components: $\frac{E_s^2}{2}\cos(2\omega_s t + 2\phi) + \frac{E_{LO}^2}{2}\cos(2\omega_{LO} t) + E_s E_{LO}\cos[(\omega_{LO} + \omega_s)t + \phi]$
• the beat component: $E_s E_{LO}\cos[(\omega_{LO} - \omega_s)t - \phi]$
In heterodyne detection, static and high frequency components are filtered out, so that
only the beat term survives. As can be seen from Eq. A.22, the amplitude of the
latter component is proportional to the amplitude of the input signal $E_s$.
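The proportionality between the beat amplitude and the signal field can be illustrated with a square-law detector model. In the sketch below the optical frequencies are replaced by 110 Hz and 100 Hz so that the record can be sampled directly; these values, the phase, the sampling rate and the function names are assumptions of the illustration.

```python
import cmath
import math


def line_amplitude(samples, freq_hz, sample_rate_hz):
    """Amplitude of the spectral line at freq_hz (single-bin discrete Fourier sum)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def photodiode_beat(e_s, e_lo=1.0, f_s=110.0, f_lo=100.0, phi=0.3, rate=1000.0, n=1000):
    """Amplitude of the beat line |f_s - f_lo| in the intensity (E_s + E_LO)^2."""
    intensity = []
    for k in range(n):
        t = k / rate
        field = (e_s * math.cos(2 * math.pi * f_s * t + phi)
                 + e_lo * math.cos(2 * math.pi * f_lo * t))
        intensity.append(field * field)
    return line_amplitude(intensity, abs(f_s - f_lo), rate)


for e_s in (0.001, 0.01, 0.1):
    print(f"E_s = {e_s:5.3f}: beat amplitude = {photodiode_beat(e_s):.5f}")
print(f"stronger local oscillator (E_LO = 5): {photodiode_beat(0.01, e_lo=5.0):.5f}")
```

The beat amplitude equals $E_s E_{LO}$: it is linear in the weak signal field and is amplified by a strong local oscillator.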
A peculiar type of demodulator is the lock-in amplifier, also known as phase sensitive
detector: it is an instrument capable of picking a signal of interest, with given frequency
and phase, when it is buried in a large noise. It can achieve a Signal-to-Noise
Ratio as large as $10^6$. It is mainly used in an extended audio band (0–100 kHz), but
there are also models working in the MHz range, while mixers are usually employed
in radio frequency bands. The other important difference with respect to a simple
mixer is the phase sensitivity of the lock-in.
The lock-in building blocks are as indicated in Fig. A.8:
• an input amplifier, with an optional broad band-pass filter to reject most of the
unwanted spectral components, and thus increasing the dynamic range
• a reference input, where a sine wave of the desired frequency is applied. This is
fed to an adjustable phase shifter and then to
• an audio mixer, where the two signals are multiplied according to Eq. A.17.
• a low-pass filter
• an output amplifier
$$ v_i(t) = A_i\sin(\omega_i t + \phi_i) $$
The amplitude $A_i$, much smaller than the rms value of the whole input, is constant,
or slowly varying with time. The reference signal is a pure sine wave:
$$ v_{ref}(t) = A_{ref}\sin(\omega_{ref} t + \phi_{ref}) $$
where $\phi_{ref}$ can be changed at the operator’s will. The instrument is usually adjusted
so that the reference amplitude is included in the overall gain G, so that we can
neglect it in the following. At the mixer, use of Eq. A.17, with unequal phases, yields
Fig. A.8 Block diagram of a lock-in amplifier: band-pass filter, mixer, low-pass filter of bandwidth
$B_{lp}$, output amplifier of gain G, reference oscillator with adjustable phase. The input amplifier is not shown
$$ v_{mixer}(t) = \frac{1}{2}A_i\left[\cos\big((\omega_i - \omega_{ref})t + \Delta\phi\big) - \cos\big((\omega_i + \omega_{ref})t + \phi_i + \phi_{ref}\big)\right] \tag{A.23} $$
where $\Delta\phi = \phi_i - \phi_{ref}$. The low-pass filter, with a bandwidth $B_{lp}$, filters away the
high frequency (sum) component, while the low frequency (difference) is almost
unchanged if $|\omega_i - \omega_{ref}| < 2\pi B_{lp}$. Note that this difference frequency can also be
negative: it just means that the phase evolves in the opposite (clockwise) direction.
The bandwidth $B_{lp}$ is chosen on the basis of the frequency content of $A_i(t)$, typically
between 30 mHz and 10 Hz.
The original signal at $\omega_i$ is now translated down to a low frequency $\omega_i - \omega_{ref}$,
that can be brought to zero by careful tuning of $\omega_{ref}$. We are left with
$$ v_{out} = \frac{G A_i}{2}\cos\Delta\phi \tag{A.24} $$
Note that $v_{out}$ is no longer an oscillating signal, but a constant voltage, as long as the
input amplitude $A_i$ is steady. Adjusting the phase $\phi_{ref}$ to a maximum of $v_{out}$ will
give us both the amplitude $A_i$ and the phase, relative to the reference, of the input
signal component $v_i(t)$.
There is a catch we have overlooked so far: the reference signal is not always available
as a clean sine wave. It can be a complex or noisy signal, derived from another
apparatus. For this reason, the reference is conditioned within the lock-in into a
square wave, which contains odd harmonics of the base frequency:
$$ v_{ref} = \frac{4}{\pi}A_{ref}\left[\sin\omega_{ref}t + \frac{1}{3}\sin 3\omega_{ref}t + \frac{1}{5}\sin 5\omega_{ref}t + \cdots\right] $$
We can set up a second channel, identical to this, but with the phase of the reference
signal shifted by π/2; we call this additional output y(t):
$$ y(t) = \frac{4}{\pi\tau}\int_{-\infty}^{t} dt'\, v_{in}(t')\,\sin(\omega_{ref}t')\,e^{-\frac{t-t'}{\tau}} $$
Therefore, we have at our disposal both the in-phase (cosine) and quadrature (sine)
components of the input signal: x(t) and y(t) are to be considered as the two, time
dependent, components of the complex vector, or phasor, representing amplitude
and phase of the input signal $v_i(t)$. A more common representation of a phasor is in
polar coordinates:
$$ r(t) = \sqrt{x^2(t) + y^2(t)}; \qquad \tan(\phi) = \frac{y(t)}{x(t)} $$
Lock-ins let the user choose between “cartesian” and “polar” output: in the polar
form, the “r(t)” output always gives the magnitude of the input (at the desired fre-
quency) regardless of the phase. This is useful if the phase jitters or drifts in time.
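A software version of the two-channel scheme can be sketched in a few lines. The example below multiplies a noisy record by sine and cosine references and averages the products; the plain average over the whole record is a simplification that stands in for the exponential low-pass filter of time constant τ used above. The signal parameters, the noise level, the random seed and the function name are assumptions of this illustration.

```python
import math
import random


def lock_in(samples, f_ref, rate):
    """Return (x, y): in-phase and quadrature outputs, averaged over the whole record."""
    n = len(samples)
    x = y = 0.0
    for k, v in enumerate(samples):
        arg = 2 * math.pi * f_ref * k / rate
        x += v * math.sin(arg)  # in-phase channel
        y += v * math.cos(arg)  # reference shifted by pi/2
    return 2 * x / n, 2 * y / n


# Illustrative assumptions: 0.1 V signal at 137 Hz, phase 0.5 rad, unit-rms Gaussian noise
random.seed(1)
AMPLITUDE, PHASE, F_SIG, RATE, N = 0.1, 0.5, 137.0, 10_000.0, 200_000
record = [AMPLITUDE * math.sin(2 * math.pi * F_SIG * k / RATE + PHASE) + random.gauss(0.0, 1.0)
          for k in range(N)]

x, y = lock_in(record, F_SIG, RATE)
r = math.hypot(x, y)    # magnitude, independent of the reference phase
phi = math.atan2(y, x)  # phase of the signal relative to the reference
print(f"x = {x:.4f}, y = {y:.4f}")
print(f"r = {r:.4f} (true amplitude {AMPLITUDE}), phi = {phi:.3f} rad (true phase {PHASE})")
```

Although the signal is about ten times smaller than the rms noise, 20 s of averaging should recover amplitude and phase to within a few percent.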
The invention of the lock-in is yet another contribution to experimental physics
by the eminent scientist Robert Dicke,⁴ although he said he had found the idea in a
previous paper.
A typical use is in solid-state experiments: the laser beam is modulated in a simple
way by a chopper: a rotating disc with a slit, so that light is shone on the sample
intermittently at the rotation frequency, and the response is expected with the same
pace: the generator driving the rotating disc is fed as a reference to a lock-in amplifier,
and the output only contains that frequency component.
Lock-ins are somewhat obsolete these days, as Digital Signal Processing allows one
to perform the same task in software on a sampled signal. However, they are still
used when frequencies are too high for real-time processing. In any case, software-based
lock-ins work on exactly the operating principle described here.
4 We have encountered prof. Dicke in Chap. 3 (Roll-Krotkov-Dicke experiment of EP) and in Chap. 6
for the Brans-Dicke theory of gravity. He is also credited, among many other things, for predicting
the existence of cosmic microwave background and for the correct interpretation of its serendipitous
discovery by Penzias and Wilson (who were using a “Dicke radiometer”). He also founded Princeton
Applied Research, the first company to produce commercial lock-ins.
where ψ1 is the field in one arm and ψ2 the modulated field in the second arm, 12
is the static optical path difference, eom is the length of the EOM, whose refractive
index n 2 (t) is modulated in time. Therefore the intensity of the combined beams will
be
$$ I(t) = \frac{c\,\varepsilon_0 A^2}{2}\left[1 + \cos\big(\phi_0 + \phi(t)\big)\right] $$
where $\phi_0$ is a static phase difference and $\phi(t)$ the phase modulation.
Modulation is often a key element of control systems: it is an effective way to
stabilize the working point of a non-linear instrument. Take for example the output
of a Michelson interferometer (see Chap. 9): the optical path difference might slowly
drift, moving the interferometer output away from the dark fringe condition. A fast
(kHz to MHz) modulation of the path difference, using an EOM, by a small fraction
of a fringe, will produce a symmetric light output, at twice the modulation frequency
(the response around a minimum is quadratic) (Fig. A.9). If the working point moves
away from the dark fringe, a light wave with a component at the modulation frequency
appears: this can be used in a feedback loop to lock the working point onto the dark
fringe. The topic of modulation in control systems is dealt with in greater detail in
Appendix B.
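The appearance of an error signal at the modulation frequency can be reproduced numerically. The sketch below assumes a normalized intensity $I = 1 + \cos(\phi_0 + a\sin\omega t)$, so that in this convention the dark fringe sits at $\phi_0 = \pi$; the modulation depth a = 0.3 rad, the 10 Hz modulation frequency, the sampling rate and the function name are illustrative assumptions. The record is demodulated at the modulation frequency and at twice that frequency, as a lock-in would do.

```python
import math


def harmonics(phi0, a_mod=0.3, f_mod=10.0, rate=1000.0, n=1000):
    """Return the (1f, 2f) components of I = 1 + cos(phi0 + a_mod * sin(2 pi f_mod t)).

    The 1f component is demodulated with sin(2 pi f t), the 2f one with cos(4 pi f t).
    """
    one_f = two_f = 0.0
    for k in range(n):
        theta = 2 * math.pi * f_mod * k / rate
        intensity = 1.0 + math.cos(phi0 + a_mod * math.sin(theta))
        one_f += intensity * math.sin(theta)
        two_f += intensity * math.cos(2 * theta)
    return 2 * one_f / n, 2 * two_f / n


for offset in (-0.05, 0.0, 0.05):
    one_f, two_f = harmonics(math.pi + offset)
    print(f"offset from dark fringe {offset:+.2f} rad: 1f = {one_f:+.5f}, 2f = {two_f:+.5f}")
```

On the dark fringe only the 2f component is present; away from it the 1f component grows linearly with the offset and changes sign with it, which is what makes it usable as an error signal.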
Fig. A.9 In a Michelson interferometer, modulating the optical path difference around the dark
fringe (φ = π/2) produces a light signal at twice the modulation frequency. If the operating point
drifts (in our figure, to the right) a signal component at the modulation frequency appears that can
be used as an error signal to stabilize operations on the dark fringe. In this figure, the amplitude of
modulation Amod = 0.3 rad is vastly exaggerated for clarity
Appendix B: Feedback and Controls
B.1 Introduction
In this chapter we briefly outline the basic principles of feedback systems and the
concept of stability, without any pretense of thoroughness. For that, we refer to
textbooks of system theory and controls. For a comprehensive tutorial see Bechhoefer
(2005).
To explain the meaning of a feedback process, we consider a very general system,
called plant, which linearly transfers a signal from the input to the output port. This
linearity relation is usually expressed in the frequency domain:
$$ V^{(o)}(\omega) = A(\omega)\,V^{(i)}(\omega) \tag{B.1} $$
where A is the transfer function that relates input and output of the system. $V^{(o)}$, $V^{(i)}$
can be physical quantities of any kind, but in the following we shall focus on electric
circuits, so that A represents the open-loop gain of the network.
The plant is in a feedback configuration if a fraction of the output signal is mea-
sured, processed and added back to the signal input. If the feedback signal is applied
to the input with opposite phase with respect to the incoming signal, we have a
negative feedback.
The crucial advantage of the feedback scheme consists, as we shall see, in an
increased stability of the input-output relation (the transfer function) that is, in general,
subject to change due to fluctuations, e.g. thermal, of the plant components.
Feedback is a concept that goes far beyond its applications in Electronics. Let
us consider an example: imagine you must deliver a glass of water without spilling
its content: your eye (the “sensor”) continuously monitors the deviation of the water
level from the horizontal (the “error signal”), and the brain (the “processor”) sends
a command to the supporting arm (the “actuator”) to correct the tilting of the glass.
In this way, the attitude of the glass (the “plant”) is continuously monitored and
corrected, to maintain it horizontal and preserve its content.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_B
Nature uses feedback widely and many functions of the human body are carried
out by controlled systems (the regulation of body temperature, the rhythm of the
heartbeat, the ear sensitivity, etc.).
In Fig. B.1 we show the scheme of a plant characterized by the transfer function
A, inserted into a feedback loop. The feedback branch, characterized by the transfer
function β, is then added, with inverted sign, to the input signal.
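We infer the following equation:
$$ V^{(o)} = A\left[V^{(i)} - \beta V^{(o)}\right] \tag{B.2} $$
As a result, the gain of the feedback amplifier $A_f$ is given by the expression:
$$ A_f = \frac{V^{(o)}}{V^{(i)}} = \frac{A}{1 + A\beta} = \frac{1}{\beta}\,\frac{1}{1 + \frac{1}{A\beta}} \tag{B.3} $$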
– stabilize the gain, i.e. make it less sensitive to changes in the values of circuit
components (e.g. due to temperature changes)
– adjust the values of the input and output resistances of the amplifier stage, close
to the ideal ones
– extend the frequency band of the system.
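The stabilizing effect of negative feedback on the gain follows directly from Eq. B.3 and can be checked with numbers. In the sketch below the open-loop gain A = 10⁵, the feedback fraction β = 0.01 and the 10% drift are illustrative assumptions.

```python
def closed_loop_gain(a, beta):
    """Eq. B.3: A_f = A / (1 + A * beta)."""
    return a / (1 + a * beta)


A_NOMINAL, BETA = 1e5, 0.01  # illustrative values (assumptions)

nominal = closed_loop_gain(A_NOMINAL, BETA)
drifted = closed_loop_gain(1.1 * A_NOMINAL, BETA)  # open-loop gain drifts by +10 %
relative_change = (drifted - nominal) / nominal

print(f"closed-loop gain : {nominal:.4f} (1/beta = {1 / BETA:.1f})")
print(f"after +10% drift : {drifted:.4f}")
print(f"relative change  : {relative_change:.2e}")
```

A 10% change of the open-loop gain moves the closed-loop gain by about one part in 10⁴: as long as Aβ ≫ 1, the gain is set by the feedback network alone, $A_f \simeq 1/\beta$.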
Fig. B.2 Gain curve of a system in open-loop (red) and with feedback (blue). The vertical axis gives
the gain A, from 10 to $10^5$, i.e. from 20 to 100 dB
Fig. B.3 Bode plots for a harmonic oscillator, for various values of the quality factor Q (Q = 1, 5,
10, 50): amplitude (top, in dB) and phase (bottom, in degrees) versus frequency (in rad/s). The plot
is zoomed on the resonance frequency $\omega_0 = 100$ rad/s
the closed-loop configuration the gain is reduced by a factor 1000 and the band
is expanded by the same factor. This is because, lowering the gain, we also move
the point where the gain starts to drop off, as shown in Fig. B.2: indeed the blue,
“closed-loop” line crosses the red, “open-loop gain” curve at ∼100 kHz.
Before venturing further into the basics of controls, we need to recall a few
concepts and math tools:
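Fig. B.4 Diagram of the Nyquist criterion in the complex plane Re[A(s)β(s)], Im[A(s)β(s)]
with
$$ G(\omega) = \sqrt{\mathrm{Re}[A\beta]^2 + \mathrm{Im}[A\beta]^2} \tag{B.4} $$
$$ \phi(\omega) = \arctan\frac{\mathrm{Im}[A\beta]}{\mathrm{Re}[A\beta]} \tag{B.5} $$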
• The Bode plot is a graph of the frequency response of a transfer function (or, in
general, of a linear system). It consists of two plots: magnitude versus frequency
and phase versus frequency. Usually, the horizontal axis carries the frequency
(either f [Hz], or ω [rad/s]) in logarithmic scale, the amplitude is expressed in dB
and the phase in degrees. Figure B.3 shows the Bode plots for a simple harmonic
oscillator.
• Other graphical representations of the function Aβ exist: while a Bode graph
draws two functions (magnitude and phase) versus frequency, one can condense
the information in one single figure: the same variables can be plotted in polar
coordinates, as shown in Fig. B.7, or in cartesian coordinates, with φ on the abscissa
and G on the ordinate: this is called a Nichols plot. Note that in a polar Bode plot
we must abandon the dB units for the amplitude, because a magnitude cannot be
negative.
• We can also plot Im(Aβ) versus Re(Aβ), in what is called a Nyquist plot, as
shown in Fig. B.4. In this plane, varying the variable ω, the function A(ω)β(ω)
describes a path (ω). The path is closed if we include negative frequencies
−∞ < ω < ∞.
We shall see how the stability criteria have different statements depending on the
graphical representation.
• In the Laplace transform, the conjugate variable is the complex frequency s =
σ + iω. A function f(t) is transformed according to
$$ F(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt $$
The Fourier transform is therefore a particular case of the Laplace transform, with
σ = 0. The latter is more suited to describe both periodicity (ω) and damping
(when σ < 0) as well as the response of the plant to transients. In the following
we shall consider the function A f (s).
• A Transfer Function A(ω) is, most often, a rational function of ω, i.e. the ratio of
two polynomials. A polynomial can always be factored in terms of its roots
and, as the roots are in general complex, it is convenient to replace ω → s. We
thus write
$$ A(\omega) = \frac{a_n\omega^n + a_{n-1}\omega^{n-1} + \cdots + a_0}{b_m\omega^m + b_{m-1}\omega^{m-1} + \cdots + b_0} $$
that becomes:
$$ A(s) = \frac{(s - z_1)(s - z_2)\cdots(s - z_n)}{(s - p_1)(s - p_2)\cdots(s - p_m)} \tag{B.6} $$
$$ A_{dB} = 20\sum_{j=1}^{n}\log_{10}|s - z_j| - 20\sum_{k=1}^{m}\log_{10}|s - p_k| $$
• Zeros and poles are called the break points of the Bode plot: around these points
the Bode plots exhibit the largest change in slope. Restricting again to s = iω, and
using a rule of thumb, the amplitude Bode plot can be approximated with a series of
straight segments connecting the frequencies of break points: the slope increases
by 20 dB/decade at each zero, i.e. for $\omega = |z_j|$, and decreases by 20 dB/decade
at each pole, i.e. for $\omega = |p_k|$.
A plant is stable if, due to the feedback, the output stays close to a given value (e.g.
zero) independently of the value of the input. The issue of stability for a system with
feedback can be traced back to the analysis of the complex poles of the function
A f (s). A general result of complex analysis states:
to ensure the system stability, $A_f(s)$ must not have poles with a positive real component.
Two stability criteria frequently used are those of Bode and Nyquist, because they
are based on easily measured quantities.
We shall use an example to illustrate the criteria and their graphic representation.
Consider the two following Transfer Functions:
$$ H_1(s) = \frac{15}{(1 + s)\cdot(1 + s/2)\cdot(1 + s/3)} $$
$$ H_2(s) = \frac{2}{(1 + s)\cdot(1 + s/2)\cdot(1 + s/3)} $$
they are identical, except for the low frequency gain.⁵ In Figs. B.5, B.6 and B.7 the
two functions are shown, H1 in green and H2 in blue, in the various representations
discussed above.
The Nyquist criterion is based on the properties of the denominator of A f , the
analytic function 1 + A(s)β(s). This function has an obvious zero at A(s)β(s) = −1,
and that is where the system will show instability.
We now plot the above functions H1,2 (ω) = Aβ in a Nyquist plot, see Fig. B.5,
as described in the previous section, and state the criterion:
The system is stable if the path does not contain the coordinate point [−1, i0]
within it.⁶
The Bode criterion.
We now consider the Bode plots for the function A(ω)β(ω). As noted above, we have
instability if this function takes the value −1, i.e. G(ω) = 1, φ(ω) = −π = −180◦ .
We introduce two particular angular frequencies:
1. The phase crossover frequency ωπ where the phase takes the value −180◦ , i.e.
the feedback changes sign:
φ(ωπ ) = −180◦ .
2. The gain crossover frequency ω1 where the amplitude gain is unity:
G(ω1 ) = |A(ω1 )β(ω1 )| = 1 = 0 dB
Bode stability criterion is expressed in terms of these two frequencies:
Fig. B.5 Nyquist stability criterion visualized in a Nyquist plot: the green Open-Loop Transfer
Function is unstable, as it circles the instability point in clockwise sense as ω increases. In blue a
similar, stable system. The symmetric branch at Im(Aβ) < 0 is described by negative frequencies.
On the right, a zoom around the instability point [−1, i0] (red dot)
If, at the phase crossover frequency, the corresponding gain is less than unity:
G(ωπ ) < 0 dB, then the feedback system is stable.
In a stable system, we generally have ω1 < ωπ .
These two frequencies are also used to define:
– The gain margin: find on the phase plot the frequency ωπ, where φ = −180°.
Then read on the amplitude plot the value of G at this frequency. The gain margin
measures, in dB units, how far G(ωπ) is from the unity value:
$$\text{gain margin} = -20\log_{10} G(\omega_\pi)$$
– The phase margin: analogously, find on the amplitude plot the frequency ω1 such
that G(ω1) = 1. Then, read on the phase plot φ(ω1). If φ(ω1) > −180°, the system
is stable. The phase margin measures, in degrees, how far φ(ω1) is from −180°:
$$\text{phase margin} = \varphi(\omega_1) + 180^\circ$$
Both phase and gain margins are measures of how close the system comes to
instability.
In these terms we can state: a system is stable if both phase and gain margins are
positive.
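As a numerical illustration, the sketch below evaluates both margins for the two example functions H1 and H2. It assumes the usual definitions (gain margin = −20 log10 G(ωπ), phase margin = φ(ω1) + 180°); the helper names `open_loop`, `unwrapped_phase`, `bisect` and `margins` are hypothetical and introduced only for this example.

```python
import math

def open_loop(gain, omega):
    """Open-loop function K / ((1+s)(1+s/2)(1+s/3)) evaluated at s = i*omega."""
    s = 1j * omega
    return gain / ((1 + s) * (1 + s / 2) * (1 + s / 3))

def unwrapped_phase(omega):
    """Phase in radians of the three-pole function, continuous from 0 to -3*pi/2."""
    return -(math.atan(omega) + math.atan(omega / 2) + math.atan(omega / 3))

def bisect(func, lo, hi, iterations=200):
    """Root of func in [lo, hi]; func(lo) and func(hi) must have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def margins(gain):
    """Return (gain margin in dB, phase margin in degrees) for a low-frequency gain."""
    omega_pi = bisect(lambda w: unwrapped_phase(w) + math.pi, 1e-3, 1e3)
    omega_1 = bisect(lambda w: abs(open_loop(gain, w)) - 1.0, 1e-3, 1e3)
    gain_margin_db = -20 * math.log10(abs(open_loop(gain, omega_pi)))
    phase_margin_deg = 180 + math.degrees(unwrapped_phase(omega_1))
    return gain_margin_db, phase_margin_deg

gm1, pm1 = margins(15)  # H1: low-frequency gain 15
gm2, pm2 = margins(2)   # H2: low-frequency gain 2
print(f"H1: gain margin {gm1:.2f} dB, phase margin {pm1:.1f} deg")
print(f"H2: gain margin {gm2:.2f} dB, phase margin {pm2:.1f} deg")
```

For this denominator the phase reaches −180° at ω = √11, where the function equals −K/10: the curve crosses the negative real axis at −1.5 for H1 (beyond the point −1) and at −0.2 for H2, so both margins come out negative for H1 and positive for H2.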
Fig. B.6 Bode stability criteria visualized in Bode plots: on the left the Open-Loop Transfer
Function is unstable: the gain at ωπ exceeds unity. On the right a similar, stable system with the
phase margin and gain margin. The only difference between the two functions is the low frequency gain
Finally, we can use a Bode polar plot, as in Fig. B.7. The pole of the Nyquist diagram,
[−1, i0] in Cartesian coordinates, corresponds to [1, −π] in polar coordinates.
The stability criterion can be stated as follows:
if the function Aβ crosses the unit circle at a phase angle −π < φ < 0, the system
is stable.
Indeed, its gain will be less than one at φ = −π.
Fig. B.8 A scheme for evaluating noise in a feedback system. We are assuming that there are
sources of voltage noise at the input $v_n^{(i)}$, at the output $v_n^{(o)}$ and on the reaction branch $v_n^{(fb)}$
When we compose a feedback system, we add some circuit elements that, invariably,
contribute to the overall system noise. To assess this effect, we refer to Fig. B.8, where
the two blocks consisting of the amplifier A and the feedback circuit β are shown
together with the noise sources that characterize each component. Thus, the circuit
includes the noise voltage generators at the input $v_n^{(i)}$, at the output $v_n^{(o)}$ and on the
reaction branch $v_n^{(fb)}$; we shall assume these random variables to be uncorrelated.
As before, $V^{(i)}$ is the input voltage, i.e. the signal to be detected, $V^{(o)}$ is the output
signal, and $V^{(fb)}$ is the feedback signal. Moreover, $V^{(s)}$ is the voltage at the entry
of the plant A, past the feedback node. We can now generalize Eq. B.2 to include
the noise generators. We have
$$V^{(o)} = \frac{A}{1+\beta A}\,V^{(i)} + \frac{1}{1+\beta A}\left[A\,v_n^{(i)} - A\beta\,v_n^{(fb)} + v_n^{(o)}\right] \tag{B.8}$$
The first term is the signal, conditioned by feedback as seen in Sect. B.1; the
following terms show how the various noise voltages appear at the output, due to the
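effect of feedback. We now let $V^{(i)} = 0$, to focus on the total noise at the output:
$$\left(v_n^{(tot)}\right)^2 = \left(\frac{A}{1+\beta A}\right)^2\left[\left(v_n^{(i)}\right)^2 + \frac{\left(v_n^{(o)}\right)^2}{A^2} + \beta^2\left(v_n^{(fb)}\right)^2\right] \tag{B.9}$$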
We conclude that
– The feedback has the same effect on the input noise of the amplifier $v_n^{(i)}$ and
on the input signal $V^{(i)}$ (Eq. B.8): therefore it cannot improve the Signal-to-Noise
Ratio (SNR) of a measurement.
– By choosing a large value of the product βA, the feedback can reduce the contribution
of the noise $v_n^{(o)}$ acting at the output.
– The feedback branch contributes an additional noise $v_n^{(fb)}$, that appears with unity
gain at the output.
Finally, we rewrite the total noise by referring it to the system input: this is the noise
that is to be compared with the incoming signal, to assess the Signal-to-Noise Ratio
of the measurement.7 This can be done by solving Eq. B.7 with respect to V (i) or,
more directly, by multiplying Eq. B.10 by the inverse of the input-output transfer
function: in the same approximation $\beta A \gg 1$, this is just $1/\beta^2$. We also introduce
power spectral densities of the noise sources: $S_{v_n}^{(fb)}$ is the spectrum of $v_n^{(fb)}$ and
similarly for $v_n^{(i)}$ and $v_n^{(o)}$. We have
$$S_{tot}^{(\text{at input})} = S_{v_n}^{(i)} + \frac{S_{v_n}^{(o)}}{A^2} + \beta^2 S_{v_n}^{(fb)} \tag{B.11}$$
which highlights the contribution of the noise generator in the feedback branch to
the total noise of the closed-loop system.
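The noise budget of Eqs. B.9 and B.11 can be sketched numerically. The function names and all numerical values below are hypothetical, chosen only for illustration; A and β are taken as real, frequency-independent gains, and the three noise terms are treated as uncorrelated powers.

```python
import math

def output_noise_power(A, beta, s_in, s_out, s_fb):
    """Total noise power at the output, following Eq. B.9."""
    closed_loop_gain = A / (1 + beta * A)
    return closed_loop_gain**2 * (s_in + s_out / A**2 + beta**2 * s_fb)

def input_referred_noise_power(A, beta, s_in, s_out, s_fb):
    """Total noise power referred to the input, following Eq. B.11."""
    return s_in + s_out / A**2 + beta**2 * s_fb

# Hypothetical values, in arbitrary units of V^2/Hz.
A, beta = 1.0e4, 0.1
s_in, s_out, s_fb = 1.0, 1.0e6, 0.0  # noiseless feedback branch

# Referred to the input, the noise is the same with and without feedback.
open_loop = input_referred_noise_power(A, 0.0, s_in, s_out, s_fb)
closed_loop = input_referred_noise_power(A, beta, s_in, s_out, s_fb)
print(open_loop, closed_loop)

# The output-stage noise alone, seen at the output, without and with feedback.
out_stage_no_fb = output_noise_power(A, 0.0, 0.0, s_out, 0.0)
out_stage_fb = output_noise_power(A, beta, 0.0, s_out, 0.0)
print(out_stage_no_fb, out_stage_fb)
```

With a noiseless feedback branch the input-referred noise is unchanged by the loop, while the output-stage noise seen at the output drops by the factor (1 + βA)². The signal at the output is reduced by the same factor, which is why the signal-to-noise ratio does not improve.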
We conclude this section by mentioning a famous theorem, stating that feed-
back cannot, by any means, improve the Signal-to-Noise Ratio of a measurement.
If applied correctly, feedback can improve other features (like the bandwidth, as we
have seen, or the dynamics) and have little influence on the noise.
7 The SNR can be evaluated at any position of the detection chain. Computing the SNR at the input
is often convenient because no conditioning (transfer function) needs to be applied to the signal.
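$$y_n = \frac{1}{a_0}\left[b_0 x_n + b_1 x_{n-1} + \dots + b_M x_{n-M} - a_1 y_{n-1} - a_2 y_{n-2} - \dots - a_Q y_{n-Q}\right] \tag{B.12}$$
$$y_n = \frac{1}{a_0}\left[-\sum_{k=1}^{Q} a_k\,y_{n-k} + \sum_{h=0}^{M} b_h\,x_{n-h}\right] \tag{B.13}$$
$$\sum_{k=0}^{Q} a_k\,y_{n-k} = \sum_{h=0}^{M} b_h\,x_{n-h} \tag{B.14}$$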
If we apply a short impulse at the input, this type of filter can give a non-zero
output even when the input has long returned to zero. For this reason it is called
an Infinite Impulse-Response (IIR) filter.
Conversely, for the so-called Finite Impulse-Response (FIR) filters, Eq. B.12
only carries the b_k terms. In other words, the output only depends on the input, and not
on previous output values. For this reason, they are called filters with no feedback.
FIR filters are by definition stable, while the output of an IIR filter can grow indefinitely.
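The difference equation B.13 translates almost literally into code. The following is a minimal sketch, not an optimized implementation; the function name `difference_filter` and the coefficient values are hypothetical.

```python
def difference_filter(b, a, x):
    """Apply y_n = (1/a0) [sum_h b_h x_{n-h} - sum_{k>=1} a_k y_{n-k}] (Eq. B.13)."""
    y = []
    for n in range(len(x)):
        # Feed-forward part: present and past inputs.
        acc = sum(b[h] * x[n - h] for h in range(len(b)) if n - h >= 0)
        # Feedback part: past outputs (absent in a FIR filter, where a = [a0]).
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y

impulse = [1.0] + [0.0] * 7

fir = difference_filter([0.5, 0.5], [1.0], impulse)        # two-point average
iir = difference_filter([1.0], [1.0, -0.5], impulse)       # y_n = x_n + 0.5 y_{n-1}
unstable = difference_filter([1.0], [1.0, -1.5], impulse)  # y_n = x_n + 1.5 y_{n-1}

print(fir)       # non-zero only for the first two samples
print(iir)       # decays geometrically but never returns exactly to zero
print(unstable)  # grows without bound
```

The FIR response vanishes after as many samples as there are b coefficients, while the two IIR examples keep responding to the single input impulse; whether the response decays or grows depends on the feedback coefficient.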
In most cases feedback control loops are designed assuming continuous signals,
described in the complex frequency domain s by their Laplace transform. When we
deal instead with discrete sequences of data x_k, sampled at time intervals T,
we need to introduce the Z transform, defined as
$$X(z) = \sum_{k=0}^{\infty} x_k\,z^{-k} \tag{B.15}$$
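where the complex variable z is linked to the Laplace variable s by the relationship
$$z = e^{sT} \tag{B.16}$$
Taking the Z transform of Eq. B.14 gives the transfer function of the filter:
$$\frac{Y(z)}{X(z)} = \frac{\sum_{h=0}^{M} b_h\,z^{-h}}{\sum_{k=0}^{Q} a_k\,z^{-k}}$$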
If the filter function is known in the s domain, its transformation in the z space
can be simply performed by creating a correspondence map based on a bi-linear
transform derived from (B.16). We have
$$z = e^{sT} = \frac{e^{sT/2}}{e^{-sT/2}} \approx \frac{1 + sT/2}{1 - sT/2} \tag{B.17}$$
$$s = \frac{1}{T}\log(z) \approx \frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}} \tag{B.18}$$
In this way, if we know the transfer function Hc (s) of a filter in the continuous domain
of the s variable, we can infer the approximate function Hd (z) of the equivalent digital
filter through the relationship:
$$H_d(z) = H_c\!\left(\frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}}\right)$$
Because of these approximations, the digital filter deviates from its analog equivalent.
In order to assess the magnitude of this deviation, we take $s = i\omega$ and then
$z = \exp(i\omega T)$. In this case, the function $H_d(z)$ is
$$H_d\!\left(e^{i\omega T}\right) = H_c\!\left(\frac{2}{T}\,\frac{e^{i\omega T/2} - e^{-i\omega T/2}}{e^{i\omega T/2} + e^{-i\omega T/2}}\right) = H_c\!\left(i\,\frac{2}{T}\tan\frac{\omega T}{2}\right) = H_c(i\omega_c)$$
where $i\omega_c$ is the variable in the continuum space:
$$\omega_c = \frac{2}{T}\tan\frac{\omega T}{2} \tag{B.19}$$
If we don’t use the Z transform, a digital filter designed using continuous-time
methods (Laplace transform) will produce an inaccurate transfer function: poles and
zeros will be found at positions different from those desired.
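The frequency warping of Eq. B.19 can be checked numerically. The sketch below assumes a hypothetical single-pole low-pass filter as the analog prototype H_c(s); the sampling period and frequencies are illustrative values only.

```python
import cmath
import math

def analog_lowpass(s, omega_p):
    """Hypothetical analog prototype H_c(s) = 1 / (1 + s/omega_p)."""
    return 1 / (1 + s / omega_p)

def bilinear_digital(z, T, omega_p):
    """Digital filter H_d(z) obtained by substituting the bilinear map into H_c."""
    s_equivalent = (2 / T) * (1 - 1 / z) / (1 + 1 / z)
    return analog_lowpass(s_equivalent, omega_p)

def warped_frequency(omega, T):
    """Continuous-domain angular frequency of Eq. B.19."""
    return (2 / T) * math.tan(omega * T / 2)

T = 1.0e-4                      # sampling period in seconds (10 kHz sampling)
omega_p = 2 * math.pi * 500.0   # pole of the analog prototype
omega = 2 * math.pi * 2000.0    # frequency at which the filters are compared

h_digital = bilinear_digital(cmath.exp(1j * omega * T), T, omega_p)
h_analog_warped = analog_lowpass(1j * warped_frequency(omega, T), omega_p)
h_analog_same = analog_lowpass(1j * omega, omega_p)

print(abs(h_digital), abs(h_analog_warped), abs(h_analog_same))
```

The digital response at ω coincides with the analog response at the warped frequency ω_c, not at ω itself; the discrepancy grows as ω approaches half the sampling rate, where tan(ωT/2) diverges.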
The control of optical systems is a field where optics, electronics and mechanics
come together to achieve stability of the instrument: in a GW interferometer, lengths
need to be controlled to much better than one wavelength, sometimes to nanometer
level. The error signal is optical, processing is done by electronics and actuation is
performed by micro- or nano-mechanical devices. The Fabry–Perot optical resonator
has a dedicated appendix, with a detailed description of the sophisticated Pound–
Drever–Hall technique to keep its length on resonance. Here we focus on the controls
for the Michelson interferometer, crucial for the operation of the large GW optical
detectors. We recall the basic relations for the simplest Michelson interferometer, as
discussed in Chap. 9. A static difference in the length of the arms $\Delta L = l_1 - l_2$ causes
a phase mismatch between the recombined beams on the beam splitter: $\phi = 2k\Delta L$.
The power collected on the photodiode is
$$P_{d.c.} = \frac{P_{in}}{2}\,\frac{r_1^2 + r_2^2}{2}\,\left[1 + C\cos(\phi + \phi_s)\right] \tag{B.20}$$
For control purposes, we must choose the operation point by selecting a φ value.
The choice is based on an optimization criterion. If we just want to maximize the
signal, then it is obvious to choose sin φ = 1 → φ = π/2, that is, to lock the inter-
ferometer at half fringe, the intermediate position between the maximums of light
and dark, where the slope of the instrument response is largest. However, in precision
experiments, the correct quantity to maximize is the signal-to-noise ratio (SNR) and
not simply the signal. We shall assume that the dominant noise is the shot noise of
light power, with spectral density
$$S_{PP} = 2\,\frac{hc}{\lambda}\,\bar{P}_{d.c.} \tag{B.22}$$
and that the incoming GW signal has Fourier transform H(f). It follows that the
Fourier transform of the phase is $\phi_s(f) = 2kL\,H(f)$. Using Eq. B.22 for the noise
and Eq. B.21, we have
$$SNR = kL\,\sqrt{\frac{P_{in}\lambda}{hc}}\;\frac{C\sin\phi}{\sqrt{1 + C\cos\phi}}\;H(f) \tag{B.23}$$
so that the optimum condition turns out to be cos φ = −C with C ≃ 1: the dark fringe point.
Up to here, we have not introduced any modulation of the light field: the interferometer
sits still at the chosen working point and φs produces the only time
dependent output. This configuration is referred to as D.C. detection: it has the
advantage of avoiding the phase noise of the oscillator generating the modulating
signal.
However, laser sources are affected by the so-called 1/f noise, i.e. they are much
noisier at low frequencies (φs is expected to vary at 10–10⁴ Hz) than in the MHz
region. In order to overcome this problem, the frontal modulation technique is
used. The operational configuration is that of the heterodyne detector (Fig. B.10).
The incoming light, of angular frequency ω, is phase modulated at the angular
frequency Ω. The resulting electric field is
∞
where m is the modulation index. A more detailed account can be found in
Appendix A. We now focus on the first three frequency components: the carrier
at ω and the two side bands at ω ± Ω. The Michelson interferometer can then be
configured in such a way that the length of the two arms has a static difference
$L_S = l_1 - l_2$, called Schnupp asymmetry, chosen in such a way that the
outgoing carrier satisfies the condition of destructive interference (dark fringe)
$\omega L_S/c = (2k+1)\pi$. In this way, the output signal of the carrier is suppressed,
while the side bands are transmitted, as they do not satisfy the dark fringe condition.
Indeed, in traversing the interferometer, the sidebands accumulate an additional phase
Fig. B.10 A simple Michelson control scheme based on the heterodyne method. Courtesy of the
Virgo collaboration
difference $\pm 2\Omega L_S/c$. This allows us to choose the modulation frequency so as
to maximize the sidebands amplitude:
$$\Omega = \frac{c}{2L}\,(n+1)\pi$$
If the reflectivities of the end mirrors r1 and r2 are different, the dark condition for
the carrier is only partially verified, the light intensity is at a minimum, but non-zero:
the output signal of the photodiode will also have a constant term proportional to
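$(r_1 - r_2)^2$. In fact, the wave function at the output of the interferometer is the sum
of three fields:
$$\Psi_u(m,t) = \Psi_0 + \Psi_+ + \Psi_-$$
with
$$\Psi_0 = -J_0\left[e^{i\,\delta l\,\omega t/c} + e^{-i\,\delta l\,\omega t/c}\right]$$
$$\Psi_\pm = \pm J_1\left[e^{i\,\delta l\,(\omega\pm\Omega)t/c} + e^{-i\,\delta l\,(\omega\pm\Omega)t/c}\right]e^{\pm i\Omega(l_1+l_2)/c}$$
where we added to the static Schnupp asymmetry a gravitational signal $\delta l_{gw}$, so
that $\delta l = L_S + \delta l_{gw}$.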
It follows that the light power incident on the photodiode is proportional to
$$|\Psi|^2 = DC + A\,\sin\!\left(\frac{\omega}{c}L_S\right)\sin\!\left(2\,\frac{\omega}{c}\,\delta l_{gw}\right)\cdot\left[A_1\sin\Omega t + A_2\cos\Omega t\right] \tag{B.26}$$
where A, A1, A2 are constant quantities, while $DC = |\Psi_0|^2 + |\Psi_-|^2 + |\Psi_+|^2$ is the
fraction of the light power hitting the photodiode that is constant in time. Equation B.26
tells us that the gravitational signal will be detectable at the modulation
frequency as long as the Schnupp asymmetry is different from zero. In addition,
the same output signal can be used to keep the interferometer in the dark fringe
condition by acting on one of the two end mirrors or on the beam splitter.
We briefly recall here the complex optical scheme of Advanced Virgo, the European
interferometric gravitational wave detector, described in greater detail in Chap. 9.
• The basic, simple Michelson consists of the two 3 km long arms, along the North
and West directions, instrumented with a beam splitter (BS) and two end mirrors,
North-End (NE) and West-End (WE).
• Each arm hosts an additional input mirror (WI, NI) that forms with the respective
End mirror a Fabry–Perot (FP) cavity, where the light bounces back and forth. This
increases the equivalent optical path and amplifies the gravitational signal. In order
to maximize the signal-to-noise ratio, the length difference of two arms (North
and West) is adjusted so that the rays destructively interfere at the interferometer
output (dark fringe condition). In this set-up virtually all the light is reflected back
towards the laser.
• By inserting an additional, partially reflecting mirror, the recycling mirror (RM),
between the laser and the BS, a new optical cavity is formed, where light resonates.
This is called power recycling cavity (RC) and consists of the RM mirror and all
the interferometer that operates reflecting back towards the laser all the light, like
a virtual mirror. By taking advantage of the resonance of the recycling cavity, the
circulating light power can reach 1 kW on the beam splitter with only 20 W of
laser power.
• Finally, we have the Signal Recycling Mirror, to enhance the interferometer
response at a chosen frequency. For the sake of simplicity, we shall not consider
it here.
In this configuration (see Fig. B.11) the interferometer is a coupled cavity system: in
addition to the dark fringe condition, all of these cavities must be locked in resonance
in order to achieve a proper operating configuration for the detector.
Each mirror is suspended like a pendulum to isolate it from seismic noise: this
behaves as a mechanical low-pass filter with a characteristic frequency of a few Hz.
However, the residual, unfiltered excitation at very low frequency will make the
mirror oscillate around its equilibrium position, with an amplitude of the order of
Fig. B.11 Simplified optical diagram of Virgo, without the signal recycling cavity. FP cavities are
made up of pairs of mirrors: NI + NE and WI + WE. BS and RC are the beam splitter and the
power recycling mirror. EOM indicates the electro-optical modulator used to frequency modulate
the light. Courtesy of the Virgo collaboration
a few tens of micrometres. A local control system, partly based on accelerometers and
partly on displacement sensors anchored to the ground, reduces this amplitude to a
fraction of a wavelength, still large enough to prevent stabilization of the resonance
conditions for the optical cavities and Michelson dark fringe. In essence, purely
local systems are not sufficient to keep the interferometer in the working position:
the locking of the detector working point requires a system of global control of mirror
distances, based on signals extracted from the interferometer itself.
Referring to Fig. B.11, there are four independent lengths that need to be con-
trolled:
Thus, we need to extract four independent error signals for these four lengths, in
order to implement actuation on the various mirrors. The control system must have
a high gain at very low frequency (up to a few Hz) and should not introduce noise in
the frequency bandwidth of the detector (10 Hz–5 kHz).
To get an idea of the challenge in the design of the control system, consider
a numerical example relative to the initial Virgo configuration: the laser light has a
wavelength of λ = 2πc/ω = 1.05 µm and the Finesse of the various resonant optical
cavities is F ∼ 50. In order to keep the 3 km long FP cavities in resonant condition,
$$\pi \pm 2\,\frac{\Omega}{c}\,l_r + \Phi_{FP}(\omega \pm \Omega) = 2n\pi \tag{B.28}$$
A detailed analysis of these conditions shows that there is an optimum value for the
Schnupp asymmetry, a value that maximizes the transmission of the sidebands at the
photodiode output (Vinet 2020):
$$\cos\!\left(\frac{\Omega\,(l_1 - l_2)}{c}\right) = r_r\,r_{ITF}$$
where $r_r$ is the reflection coefficient of the recycling mirror and $r_{ITF}$ that of the
second, virtual mirror of the RC, composed by the entire interferometer. The physical
meaning of this condition is that the power extracted from the side bands should match
that lost within the recycling cavity. In Virgo jargon, this is the optimal matching
condition at the output. As mentioned above, in order to control the interferometer
we need four independent signals related to the four lengths to stabilize. In practice,
experimenters chose to control four degrees of freedom that are linear combinations
of these lengths. In Fig. B.12 we sketch these motions:
The error signals used to control these quantities are the photocurrents produced
in the photodiodes shown in Fig. B.13. The numbering of photodiodes, apparently
bizarre, reflects the historical use adopted in the Virgo experiment.
Fig. B.12 The four degrees of freedom of the Virgo interferometer: a DARM: Fabry–Perot dif-
ferential mode. b CARM: Fabry–Perot common mode. c MICH: Michelson differential mode. d
PRCL: Michelson common mode. Courtesy of the Virgo collaboration
The output signal at the photodiode 1, where the information about the GW signal
is imprinted, is mainly determined by the differential mode of the 3 km FP cavity
(δL 1 − δL 2 ).
On the other hand, the signal from photodiode 2 mainly depends on the common
mode of the FP cavities (δL 1 + δL 2 ) and is much less influenced by the Michelson
common mode (δl1 + δl2 ).
The roles in the control strategy of the photodiodes 5 and 7 are similar: photodiode
5 controls the North arm FP cavity in reflection (and 7 in transmission); its output
signal also strongly depends on the length of the West arm cavity. This effect is due
to the coupling between the two FP cavities, induced by the recycling mirror. As a
result, it is sensitive to the FP common mode and its dependence on the FP differential
mode is much weaker. Photodiode 8 has an equivalent function in transmission of
the West cavity.
In conclusion, most of the signals are more sensitive to changes in the length
of the FP cavity than to that of the Michelson, whose arm lengths are smaller by a
factor of ∼10². To extract information about the differential motion of the Michelson
and the length of the recycling cavity, we need to compute the difference between
two or more signals. This reduces the SNR because the independent noises of the
various photodiodes add up quadratically. In order to get around this problem in
part, we can increase the information redundancy by using also higher harmonics
of the modulation frequency, obtained by demodulating and filtering at the proper
frequencies the output signal from photodiodes.
Fig. B.13 The Virgo longitudinal control. Courtesy of the Virgo collaboration
References
Bechhoefer, J.: Feedback for physicists: a tutorial essay on control. Rev. Mod. Phys. 77, 783–835
(2005)
Vinet, J.-Y.: The VIRGO Physics Book, OPTICS and related TOPICS (revision 2020).
https://artemis.oca.eu/fr/rechercheartemis/projets/virgo/2081-the-darkf-optical-simulation-code
Appendix C: The Fabry–Pérot Cavity
C.1 Introduction
A Fabry–Pérot (FP) cavity is a linear optical resonator consisting of two highly
reflecting mirrors: the light bounces between the two reflecting surfaces and
is transmitted only for well-defined wavelengths. It is widely used in telecommunications,
laser technology and astrophysics. It is also employed as a high-resolution
optical spectrometer, exploiting the fact that the transmission through such a resonator
exhibits sharp resonances and is very small between them. As we shall see,
in gravitational research these devices are used to extend the path of light in the
interferometer arms, to keep their lengths equal to within a few picometers, and to
stabilize the laser wavelength.
The FP consists of two mirrors positioned at a distance l from each other: as
shown in the upper right pane of Fig. C.1, IM is the input mirror, with reflectivity,
transmittivity and losses indicated by the symbols r1 , t1 and p1 , respectively. EM is
the second (end) mirror of the cavity, with corresponding quantities r2 , t2 and p2 .
We assume that the cavity is illuminated by a monochromatic plane light wave of
frequency ν
$$\psi(x,t) = K\,e^{i(kx - 2\pi\nu t)} = A(t)\,e^{ikx}$$
where A is the complex (and time dependent) amplitude, k = 2π/λ and λ = c/ν
the wavelength. This wave, once propagated over a distance l, acquires an additional
phase factor $e^{ikl}$. Moreover, at each reflection, the complex amplitude of the optical
field changes by a factor (i r) and at each transmission by a factor t. If we call ψin
the field at the cavity input, the transmitted ψt and reflected ψr fields are deduced
by taking into account all these contributions. Figure C.1 shows how we combine the
various contributions inside and outside the cavity. In high sensitivity cavities we
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2022
F. Ricci and M. Bassan, Experimental Gravitation, Lecture Notes in Physics 998,
https://doi.org/10.1007/978-3-030-95596-0_C
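Fig. C.1 Fields in the Fabry–Pérot cavity. Field relations labelled in the figure:
$\psi_{in} = A\,e^{ikl}$, $\psi_1 = ?$, $\psi_2 = e^{ikl}\psi_1$, $\psi_3 = i r_2\,\psi_2$, $\psi_4 = e^{ikl}\psi_3$; l is the mirror separation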
The resulting field is the superposition of an infinite series of waves, with geometrically
decreasing amplitudes. Summing up all the contributions we have, e.g. for the
field ψ1 inside the cavity:
$$\psi_1 = t_1\psi_{in} + (ir_1)(ir_2)\,t_1 e^{2ikl}\psi_{in} + \dots + (ir_1)^n(ir_2)^n\,t_1 e^{2nikl}\psi_{in} + \dots = \psi_{in}\,\frac{t_1}{1 + r_1 r_2\,e^{i2kl}} \tag{C.2}$$
$$\psi_t = \psi_{in}\,\frac{t_1 t_2\,e^{-ikl}}{1 + r_1 r_2\,e^{i2kl}} \tag{C.3}$$
$$\psi_r = i\,\psi_{in}\,\frac{r_1 + r_2(1 - p_1)\,e^{i2kl}}{1 + r_1 r_2\,e^{i2kl}} \tag{C.4}$$
We can also derive the fields from their mutual relations, as shown in Fig. C.1.
These relations show that when ei2kl = −1, i.e. when 2l = (n + 1/2)λ, we have a
resonance phenomenon.
Suppose now we keep the mirrors at a fixed distance l and change the light
frequency ν = c/λ. The cavity exhibits resonance peaks, one after the other, every
time that the frequency equals
$$\nu_n = \left(n + \frac{1}{2}\right)\frac{c}{2l}, \qquad n = 1, 2, 3, \dots \tag{C.5}$$
The frequency difference between two adjacent resonances is the Free Spectral Range
(FSR)
$$\nu_{FSR} = \frac{c}{2l} \tag{C.6}$$
The inverse of $\nu_{FSR}$ is the travel time of a round trip of the light inside the
cavity, so that $\Phi = 2\pi\nu/\nu_{FSR}$ is the phase accumulated by the wave at each round
trip. As an example, in the Virgo gravitational wave detector, where l ≃ 3000 m, we
have $\nu_{FSR} \simeq 50$ kHz. Using a laser light with λ = 1.06 µm, i.e. $\nu \simeq 3\cdot 10^{14}$ Hz,
the resonance condition Eq. C.5 implies $n \sim 10^{9}$.
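The numbers quoted in this example follow directly from Eqs. C.5 and C.6, as the short sketch below shows; the function and variable names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def free_spectral_range(length):
    """Free Spectral Range c / (2 l) of a cavity of length l (Eq. C.6)."""
    return C_LIGHT / (2 * length)

length = 3000.0        # arm cavity length in metres, as quoted in the text
wavelength = 1.06e-6   # laser wavelength in metres

nu_fsr = free_spectral_range(length)
nu_laser = C_LIGHT / wavelength
round_trip_time = 1 / nu_fsr
mode_number = nu_laser / nu_fsr - 0.5   # n from Eq. C.5

print(f"FSR             = {nu_fsr / 1e3:.2f} kHz")
print(f"laser frequency = {nu_laser:.3e} Hz")
print(f"round trip time = {round_trip_time * 1e6:.1f} microseconds")
print(f"mode number n   = {mode_number:.2e}")
```

The mode number comes out at a few times 10⁹, in line with the order of magnitude given above.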
The ratio S = ψ1/ψin, called overvoltage factor,8 on resonance takes the value
$$S_{(max)} = \frac{t_1}{1 - r_1 r_2} \tag{C.7}$$
$$\Phi = (2n + 1)\pi + 2\pi\,\frac{\delta\nu}{\nu_{FSR}} \tag{C.8}$$
Neglecting, for the moment, the losses $p_i$, we define the cavity finesse as
$$\mathcal{F} = \pi\,\frac{\sqrt{r_1 r_2}}{1 - r_1 r_2} \tag{C.10}$$
Note that $\mathcal{F}$ only depends on the combined reflectance $r_1 r_2$, just as $\nu_{FSR}$ only
depends on the mirror distance l: so, $[\mathcal{F}, \nu_{FSR}]$ represent an equivalent description
of the FP cavity.
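Assuming $\delta\nu \ll \nu_{FSR}$, Eq. C.9 rewrites in a simpler form:
$$|S|^2 = \frac{S_{(max)}^2}{1 + \left(\dfrac{2\mathcal{F}}{\pi}\right)^2\sin^2\!\left(\pi\,\dfrac{\delta\nu}{\nu_{FSR}}\right)} \tag{C.11}$$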
This transmittance function of a FP interferometer is also referred to as the Airy
function, named after the British astronomer George Biddell Airy (1801–1892). In
the vicinity of the resonance, it is well approximated by
$$|S| \simeq S_{(max)}\,\frac{1}{\sqrt{1 + \left(2\mathcal{F}\,\dfrac{\delta\nu}{\nu_{FSR}}\right)^2}} \tag{C.12}$$
8 A similar definition is used in the case of resonant electrical circuits, hence the name.
From the relation (C.11), it is evident that the resonance of the Airy curve, measured at
half of its maximum value, has a Full Width at Half Maximum (FWHM)
$$\delta\nu_{FWHM} = \frac{\nu_{FSR}}{\mathcal{F}} \tag{C.13}$$
This shows that a large finesse produces a narrow resonance: for this reason F is
considered a sort of quality factor for a FP cavity.
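A quick numerical check of Eqs. C.10, C.11 and C.13 is sketched below. The mirror reflectivities are those quoted for Fig. C.2; the function names are hypothetical, and the Airy function is normalized to its peak value.

```python
import math

def finesse(r1, r2):
    """Cavity finesse from the amplitude reflectivities (Eq. C.10)."""
    r = r1 * r2
    return math.pi * math.sqrt(r) / (1 - r)

def airy(detuning_over_fsr, F):
    """Normalized stored power |S|^2 / S_max^2 versus delta_nu / nu_FSR (Eq. C.11)."""
    return 1 / (1 + (2 * F / math.pi) ** 2 * math.sin(math.pi * detuning_over_fsr) ** 2)

F = finesse(0.98, 0.9998)

# At a detuning of half the FWHM, i.e. delta_nu = nu_FSR / (2 F), the power is halved.
half_power = airy(0.5 / F, F)

print(f"finesse           = {F:.1f}")
print(f"|S|^2 at FWHM / 2 = {half_power:.5f}")
print(f"|S|^2 at FSR / 2  = {airy(0.5, F):.2e}")
```

Midway between two resonances the stored power is suppressed by the factor 1 + (2F/π)², about 10⁴ for this finesse.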
We now focus our attention on the field ψr reflected back by the cavity, as expressed
by Eq. C.4. We consider the whole cavity as a virtual mirror, with a reflectivity R
given by the ratio ψr /ψin . We compute this ratio substituting into Eq. C.4 the phase
expressed by Eq. C.8, and observe its behaviour versus the frequency shift δν from
resonance:
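$$R = i\,\frac{r_1 + r_2(1 - p_1)\,e^{i\Phi}}{1 + r_1 r_2\,e^{i\Phi}} \tag{C.14}$$
$$|R|^2 = \frac{1}{\left[1 + (r_1 r_2)^2 + 2 r_1 r_2\cos\Phi\right]^2}\cdot\left\{\left[r_1\left(1 + r_2^2(1 - p_1)\right) + r_2\left(r_1^2 + (1 - p_1)\right)\cos\Phi\right]^2 + \left[r_2\left((1 - p_1) - r_1^2\right)\sin\Phi\right]^2\right\} \tag{C.15}$$
$$\Phi_R \equiv \arg\{R\} = \arctan\frac{\left[r_1^2 - (1 - p_1)\right] r_2\sin\Phi}{r_1\left[1 + r_2^2(1 - p_1)\right] + r_2\left[(1 - p_1) + r_1^2\right]\cos\Phi} \tag{C.16}$$
For δν = 0, i.e. Φ = (2n + 1)π, the function has an absorption peak and its phase
undergoes a swift change of 2π.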
We can change the point of view: keep the light frequency ν (and the wavelength
λ = c/ν) constant, and allow the length of the cavity to change. The resonance
condition Eq. C.5, expressed for the cavity length, becomes
$$l_n = \left(n + \frac{1}{2}\right)\frac{\lambda}{2} \tag{C.17}$$
We will then talk about a Free Spectral Range length $l_{FSR} = \lambda/2$ and a Full Width
Half Maximum length $l_{FWHM} = \lambda/(2\mathcal{F})$.
Consider again Eq. C.8, this time for l close to the resonance condition:
$$\Phi = 2kl = \pi + 2\pi\,\frac{\delta\nu}{\nu_{FSR}} \mod(2\pi) \tag{C.18}$$
Fig. C.2 The reflectivity of a FP resonator versus the intra-cavity phase change Φ = 2kl, normalized
to π. Top: amplitude reflection, for a finesse F = 154, achieved with r1 = 0.98, r2 = 0.9998.
Bottom: phase of the light reflected by the FP resonator
To simplify the notation we restrict Φ to the interval [0, 2π[ and introduce the reduced
frequency $f \equiv \delta\nu/\delta\nu_{FWHM} = \mathcal{F}\,\delta\nu/\nu_{FSR}$, so that the phase takes the compact
form
$$\Phi = \pi + 2\pi\,\frac{f}{\mathcal{F}}$$
For high-quality cavities, as in the case of gravitational experiments, the losses
are small, $p_1, p_2 \ll 1$, and the finesse is high: $\mathcal{F} > 50$ for the cavities discussed here (and
up to $10^5$ in special cases). As remarked above, $\mathcal{F}$ only depends on the product $r_1 r_2$.
Inverting Eq. C.10 and neglecting terms $O(\pi/\mathcal{F})^2 \ll 1$, we get
$$r_1 r_2 \simeq 1 - \frac{\pi}{\mathcal{F}} \tag{C.19}$$
We substitute this relation into Eq. C.14 and define $(1 - p_1)\,r_2^2 = 1 - p$, where
the parameter p includes all the cavity losses.
The cavity reflection coefficient R is well approximated by
$$r_2 R \simeq -\frac{1 - p\mathcal{F}/\pi + 2if}{1 - 2if} = -\frac{1 - \sigma + 2if}{1 - 2if} \tag{C.20}$$
In Eq. C.20 we have also introduced the coupling parameter σ of the cavity:
$$\sigma \equiv p\,\mathcal{F}/\pi \tag{C.21}$$
$$0 < \sigma < 2$$
$$R(f = 0) = \sigma - 1 \tag{C.22}$$
This relation shows the physical meaning of the coupling parameter σ:
– For σ = 1 we have maximum light storage in the cavity and no reflection. This
is the condition of optimum coupling.
– For 0 < σ < 1 the cavity is over-coupled. As σ grows in this range, the stored
light intensity increases.
– For 1 < σ < 2 the cavity is under-coupled and the stored light intensity decreases
to the point, for σ = 2, where the condition of total reflection takes place.
In other words, as σ increases in the range 1−2 the stored light decreases and the
reflected signal becomes progressively less sensitive to the cavity status. This can be
seen by studying R versus f. Its square modulus is
$$|R|^2 = 1 - \frac{\sigma(2 - \sigma)}{1 + 4f^2} \tag{C.23}$$
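To get a feeling for the quality of this approximation, the sketch below compares the square modulus of r2 R computed from the exact expression C.14 with the approximate form C.23. The mirror parameters are illustrative assumptions (p1 = 0, so that p = 1 − r2² comes entirely from the end mirror), and the function names are hypothetical.

```python
import cmath
import math

def finesse(r1, r2):
    """Cavity finesse (Eq. C.10)."""
    r = r1 * r2
    return math.pi * math.sqrt(r) / (1 - r)

def coupling(r1, r2, p1):
    """Coupling parameter sigma = p F / pi, with (1 - p1) r2^2 = 1 - p."""
    p = 1 - (1 - p1) * r2**2
    return p * finesse(r1, r2) / math.pi

def exact_reflectance(r1, r2, p1, f):
    """|r2 R|^2 from Eq. C.14, with the phase written as pi + 2 pi f / F."""
    phase = math.pi + 2 * math.pi * f / finesse(r1, r2)
    e = cmath.exp(1j * phase)
    r2_R = 1j * (r1 * r2 + r2**2 * (1 - p1) * e) / (1 + r1 * r2 * e)
    return abs(r2_R) ** 2

def approx_reflectance(sigma, f):
    """Approximate |R|^2 of Eq. C.23."""
    return 1 - sigma * (2 - sigma) / (1 + 4 * f**2)

# First pair: strongly over-coupled cavity; second pair: nearly optimal coupling.
for r1, r2 in [(0.98, 0.9998), (0.999, 0.999)]:
    sigma = coupling(r1, r2, 0.0)
    for f in (0.0, 0.5, 2.0):
        exact = exact_reflectance(r1, r2, 0.0, f)
        approx = approx_reflectance(sigma, f)
        print(f"sigma={sigma:.4f} f={f:.1f} exact={exact:.5f} approx={approx:.5f}")
```

For these parameters the two expressions agree to a few parts in a thousand, the residual being of the order of the neglected terms π/F. A symmetric lossless cavity (r1 = r2) sits very close to σ = 1 and reflects essentially no light on resonance.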
$$S = \frac{\sigma(2 - \sigma)}{p\,(1 + 4f^2)} \tag{C.25}$$
and its maximum occurs at σ = 1. Inverting Eq. C.21, to express the coupling parameter
in terms of the finesse $\mathcal{F}$, we obtain
$$S_{max} = \frac{1}{p} = \frac{2\mathcal{F}}{\pi} \tag{C.26}$$
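It is interesting to evaluate how rapidly the phase changes around f = 0: this is indeed
the error signal used to keep the cavity on resonance. Equation C.24, expanded near
resonance to first order in f, has the form
$$\Phi_R \simeq 2\,\frac{2 - \sigma}{1 - \sigma}\,f = 2\,\frac{2 - \sigma}{1 - \sigma}\,\mathcal{F}\,\frac{\delta\nu}{\nu_{FSR}} \tag{C.27}$$
Combining this last equation with Eq. C.21, we obtain the desired relation:
$$\frac{d\Phi_R}{d\delta\nu} = 2\sigma\,\frac{2 - \sigma}{1 - \sigma}\,\frac{\pi}{p\,\nu_{FSR}} \tag{C.28}$$
$$\frac{d\Phi_R}{d\delta l} = 2\sigma\,\frac{2 - \sigma}{1 - \sigma}\,\frac{2\pi}{p\,\lambda} \tag{C.29}$$
In the second equation we used the equivalence $\delta\nu/\nu_{FSR} = \delta l/l_{FSR}$ and $l_{FSR} = \lambda/2$.
In the case of low losses and low cavity coupling, we can approximate with
$$\frac{d\Phi_R}{d\delta l} = \frac{8\mathcal{F}}{\lambda} \tag{C.30}$$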
As a last step, we use this last simplified equation to compare the rate of phase
change for the light reflected by a FP cavity with that of an optical delay line.
The latter is a cavity of two mirrors where the light wave is bounced $n_{DL}$ times,
avoiding overlap of the beams. The phase change of the beam exiting the delay line
is $\Phi_{DL} = (4\pi n_{DL}\,l)/\lambda$ and its derivative
$$\frac{d\Phi_{DL}}{dl} = \frac{4\pi n_{DL}}{\lambda} \tag{C.31}$$
Comparing Eqs. C.31 and C.30 we infer that, as regards the phase delay, the FP cavity
is equivalent to an optical delay line where the light is bounced a number of times
equal to
$$n_{FP\,eff} = \frac{2\mathcal{F}}{\pi} \tag{C.32}$$
This conclusion can be used to define an equivalent optical path of light in the case
in which the FP cavity is used as the arm of interferometric detector of gravitational
waves.
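The slope 8F/λ and the equivalent number of bounces can be cross-checked against the exact reflection coefficient of Eq. C.14. The sketch below assumes an idealized cavity with a perfectly reflecting, lossless end mirror (r2 = 1, p1 = 0) and a hypothetical input mirror reflectivity r1 = 0.94; the phase slope is obtained by a symmetric finite difference around a resonant length.

```python
import cmath
import math

def reflection(r1, length, wavelength):
    """Eq. C.14 for r2 = 1 and p1 = 0, with intra-cavity phase 2 k l."""
    e = cmath.exp(2j * (2 * math.pi / wavelength) * length)
    return 1j * (r1 + e) / (1 + r1 * e)

wavelength = 1.06e-6
r1 = 0.94
F = math.pi * math.sqrt(r1) / (1 - r1)        # finesse, Eq. C.10 with r2 = 1

l_res = (1000 + 0.5) * wavelength / 2          # resonant length, Eq. C.17 with n = 1000
dl = 1.0e-13                                   # small length change in metres

# The phase of the ratio of two nearby reflection coefficients avoids 2*pi wrapping.
ratio = reflection(r1, l_res + dl, wavelength) / reflection(r1, l_res - dl, wavelength)
phase_slope = cmath.phase(ratio) / (2 * dl)

print(f"finesse               = {F:.1f}")
print(f"numerical slope       = {phase_slope:.4e} rad/m")
print(f"8 F / lambda          = {8 * F / wavelength:.4e} rad/m")
print(f"equivalent bounces    = {2 * F / math.pi:.1f}")
print(f"delay-line equivalent = {phase_slope * wavelength / (4 * math.pi):.1f}")
```

For this finesse of about 50, the cavity responds to a length change like a delay line with roughly 32 bounces.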
The Fabry–Pérot (FP) resonator is usually presented, for the sake of clarity, in the
configuration composed of two parallel plane mirrors. However, the FP analysis of the
previous sections is general enough to hold validity in the general case of cavities
made by curved mirrors. The name optical resonator or optical cavity refers to this
general case.
In practice, an optical resonator is composed of two or more mirrors configured
in such a way to confine the light, which is reflected many times in the cavity,
producing standing waves for given resonant frequencies. The patterns of standing
wave produced are called cavity normal modes. The longitudinal modes differ only in
frequency, while the transverse modes have also different intensity patterns across the
beam transversal section. To have a stable configuration, the geometry of the cavity
must be chosen so as to prevent the transverse size of the beam from growing continuously
with multiple reflections.
The stability depends on the geometrical parameters of the cavity, that are the
curvature of the mirrors R1 and R2 and the intra-cavity distance L. In the case of
an unstable cavity, the beam transversal dimension increases up to the point that it
becomes larger than the mirror diameter, and the light escapes from the cavity. In an
unstable resonator, the light, regardless of the initial direction, will leave the cavity,
while in a stable configuration rays are reflected back towards the system centre. We
state the stability condition as follows:
the light rays remain in the vicinity of the optical axis during the propagation
when the number of light bounces in the cavity tends to infinity.
To study the stability of a two-mirror cavity, we define for each mirror the stability
parameter
$$g_i = 1 - \frac{L}{R_i} \qquad \text{with } i = 1, 2$$
We then apply the formalism of ray transfer matrix, or optical matrix9 : by repeat-
edly applying the mirror-propagation-mirror matrices to our rays, we can derive the
stability requirement, which states that the hyperbolic function g1 g2 = ±1 define
9 This ray-tracing formalism works in the case of paraxial rays: each optical element is described
by a 2 × 2 transfer matrix, called ABCD matrix, applied to a 2 × 1 vector describing position and
inclination of the input light ray. The result is the output vector describing the output ray:
$$\begin{pmatrix} x_{out} \\ \theta_{out} \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} x_{in} \\ \theta_{in} \end{pmatrix}$$
with the matrix elements A, B, C, D peculiar to each optical element (lens, mirror, propagation,
etc.). The computation of the light in the cavity resonator is done by applying the mirror matrix a
number of times equal to the number of light bounces in the cavity.
$$0 \le g_1 g_2 \le 1$$
The region of stable resonators is limited by the coordinate axes and the two hyperbolas.
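The link between the ray-matrix formalism and the g-parameters can be made explicit with a small sketch. It uses the standard paraxial matrices for free propagation and for reflection on a mirror of radius R, and relies on the known result that the half-trace of the round-trip matrix equals 2 g1 g2 − 1; the function names and the cavity parameters are hypothetical.

```python
import math

def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (m[0][0] * n[0][0] + m[0][1] * n[1][0], m[0][0] * n[0][1] + m[0][1] * n[1][1]),
        (m[1][0] * n[0][0] + m[1][1] * n[1][0], m[1][0] * n[0][1] + m[1][1] * n[1][1]),
    )

def propagation(L):
    """ABCD matrix for free propagation over a distance L."""
    return ((1.0, L), (0.0, 1.0))

def mirror(R):
    """ABCD matrix for reflection on a mirror with radius of curvature R."""
    return ((1.0, 0.0), (-2.0 / R, 1.0))

def round_trip(L, R1, R2):
    """Propagate, reflect on mirror 2, propagate back, reflect on mirror 1."""
    m = propagation(L)
    for element in (mirror(R2), propagation(L), mirror(R1)):
        m = matmul(element, m)
    return m

def g_product(L, R1, R2):
    """Product g1 g2 of the stability parameters g_i = 1 - L / R_i."""
    return (1 - L / R1) * (1 - L / R2)

def is_stable(L, R1, R2):
    return 0.0 <= g_product(L, R1, R2) <= 1.0

# Hypothetical cavities: (length, R1, R2) in metres; math.inf is a plane mirror.
for L, R1, R2 in [(1.0, 2.0, 2.0), (1.0, 0.4, 0.4), (1.0, math.inf, 0.5)]:
    m = round_trip(L, R1, R2)
    half_trace = (m[0][0] + m[1][1]) / 2
    print(L, R1, R2, g_product(L, R1, R2), half_trace, is_stable(L, R1, R2))
```

Rays stay bounded under repeated round trips when the half-trace lies between −1 and +1, which is the same statement as 0 ≤ g1 g2 ≤ 1.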
Let us note a few special cases (Fig. C.3):
and looking for stationary solutions for the electromagnetic field, separate the wave-
function U in its time and spatial dependence: U = A(x, y, z)T (t) and, because of
the paraxial approximation, let $A(x, y, z) = u(x, y)\,e^{ikz}$, where z is the optical axis
of the cavity. The solutions of the paraxial case of Eq. C.33 are given by superpo-
sition of Hermite–Gaussian modes when the amplitude profiles are described using
where:
For the first transverse mode, n = m = 0, i.e. TEM00, the intensity distribution
along the z coordinate is
$$I(z) = I_0 \exp\left(-\frac{2r^2}{w^2(z)}\right)$$
This implies that the intensity profiles of the light, measured on all the planes
perpendicular to the cavity axis, have a Gaussian distribution, but with a width that
varies along the axis, as shown by the presence of w(z) in the variance. Due to
diffraction, this Gaussian beam first converges to its minimum transverse size at the
beam waist, w(z) = w0, and then expands from there with a divergence angle
θ = λ/(πw0), as shown in Fig. C.5.
The beam radius w(z) depends on z as
$$w(z) = w_0 \sqrt{1 + \left(\frac{\lambda z}{\pi w_0^2}\right)^2}$$
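The beam relations above can be checked numerically with a short Python sketch. The function names and the example waist of 1 mm are hypothetical; the wavelength of 1.05 μm is borrowed from the caption of Fig. C.7, and the Rayleigh range is taken to be z_R = πw0²/λ, the usual definition, which is assumed here rather than quoted from the text.

```python
import math


def rayleigh_range(w0, wavelength):
    """Assumed standard definition z_R = pi * w0**2 / lambda."""
    return math.pi * w0 ** 2 / wavelength


def beam_width(z, w0, wavelength):
    """Beam radius w(z) at distance z from the waist."""
    return w0 * math.sqrt(1.0 + (wavelength * z / (math.pi * w0 ** 2)) ** 2)


def divergence(w0, wavelength):
    """Far-field divergence angle theta = lambda / (pi * w0)."""
    return wavelength / (math.pi * w0)


def intensity(r, z, w0, wavelength, i0=1.0):
    """TEM00 profile I0 * exp(-2 r^2 / w(z)^2), with I0 treated as a constant."""
    w = beam_width(z, w0, wavelength)
    return i0 * math.exp(-2.0 * r ** 2 / w ** 2)


if __name__ == "__main__":
    wavelength = 1.05e-6  # metres, value from the Fig. C.7 caption
    w0 = 1.0e-3           # metres, hypothetical waist
    z_r = rayleigh_range(w0, wavelength)
    print("Rayleigh range (m):", z_r)
    print("w(z_R) / w0:", beam_width(z_r, w0, wavelength) / w0)
    print("divergence (rad):", divergence(w0, wavelength))
```

At z = z_R the width has grown by a factor √2, and far from the waist w(z)/z approaches the divergence angle θ.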
Fig. C.4 Hermite–Gauss optical modes TEM_mn. The subscripts m and n count the number of nodal
lines in the x and y direction, respectively. Credits Creative Commons
Fig. C.5 Profile of a Gaussian beam: it is defined by the waist w0 and the Rayleigh range z_R
Finally, we derive the resonance frequencies of the modes associated with the cavity.
We recall that the phase of the transverse modes is
$$\phi(0, 0, z) = kz - (1 + n + m)\arctan\frac{z}{z_R} = kz - (1 + n + m)\,\zeta(z) \tag{C.36}$$
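A minimal Python sketch of Eq. C.36 follows, to make the role of the term ζ(z) = arctan(z/z_R) explicit: modes of different order n + m accumulate a different on-axis phase over the same distance. The function names and the numerical values in the usage lines are hypothetical.

```python
import math


def gouy_phase(z, z_r):
    """zeta(z) = arctan(z / z_R), as in Eq. C.36."""
    return math.atan(z / z_r)


def on_axis_phase(z, wavelength, z_r, n=0, m=0):
    """On-axis phase k z - (1 + n + m) zeta(z) of the TEM_mn mode."""
    k = 2.0 * math.pi / wavelength
    return k * z - (1 + n + m) * gouy_phase(z, z_r)


if __name__ == "__main__":
    wavelength = 1.05e-6  # metres, value from the Fig. C.7 caption
    z_r = 3.0             # metres, hypothetical Rayleigh range
    z = 1.0               # metres, hypothetical distance from the waist
    extra = on_axis_phase(z, wavelength, z_r) - on_axis_phase(z, wavelength, z_r, n=1)
    print("phase lag of TEM10 relative to TEM00 (rad):", extra)
```

The phase difference between two modes depends only on the difference of their orders and on ζ(z), not on k.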
Then, we impose the phase matching condition after a round trip of the light in
the cavity of length L:
10 This is the same R. Pound we encountered in Sect. 4.3 for the Pound–Rebka experiment. He is
also credited for the discovery of Nuclear Magnetic Resonance.
If the modulation index is small, $m \ll 1$, we only retain the carrier and the first
order sidebands and approximate, to first order in m: $J_0(m) \simeq 1$ and
$J_{\pm 1}(m) \simeq \pm m/2$, so that the relevant spectral components are limited to
$$\psi_{in} \simeq A_0 \left[ e^{-i\omega t} + i\,\frac{m}{2}\, e^{-i(\omega+\Omega)t} + i\,\frac{m}{2}\, e^{-i(\omega-\Omega)t} \right] \tag{C.40}$$
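The quality of this truncation can be checked numerically. The sketch below assumes that the phase-modulated field has the form A0 exp[−i(ωt − m cos Ωt)], which reproduces the three components of Eq. C.40 but is not quoted from the text above; the common factor A0 exp(−iωt) is divided out and only the envelope is compared. All function names are hypothetical.

```python
import cmath
import math


def modulated_envelope(m, phase):
    """Exact envelope exp(i m cos(phase)), with phase standing for Omega * t."""
    return cmath.exp(1j * m * math.cos(phase))


def three_component_envelope(m, phase):
    """Carrier plus first-order sidebands, using J0 ~ 1 and J1 ~ m / 2."""
    upper = 0.5j * m * cmath.exp(-1j * phase)  # component at omega + Omega
    lower = 0.5j * m * cmath.exp(+1j * phase)  # component at omega - Omega
    return 1.0 + upper + lower


def max_truncation_error(m, samples=1000):
    """Largest deviation between the two envelopes over one modulation period."""
    worst = 0.0
    for j in range(samples):
        phase = 2.0 * math.pi * j / samples
        diff = modulated_envelope(m, phase) - three_component_envelope(m, phase)
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    for m in (0.01, 0.1, 0.5):
        print(m, max_truncation_error(m))
```

The neglected terms are of second order in m, so the error stays below m²/2.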
We can consider the light field as the superposition of three fields oscillating at
(slightly) different frequencies. The reflected field is
$$\psi_r \simeq A_0\, e^{-i\omega t} \left[ R_0 + \frac{im}{2}\, R_+\, e^{-i\Omega t} + \frac{im}{2}\, R_-\, e^{+i\Omega t} \right] \tag{C.41}$$
where R0 is the cavity reflectivity at the carrier frequency ω and R± the reflectivities
at ω ± Ω.
11 Frequency and phase modulation are similar and yield the same results. But phase modulation is
easier to implement, e.g. by varying the optical path with an electro-optic modulator such as a
Pockels cell.
We have set the amplitude of this signal to I0 and will include in it, from here on,
all constant terms introduced by the following stages, such as conversion efficiency,
amplifier gains, reference signal amplitude, etc.
The photocurrent is then processed by a mixer that extracts the amplitude
contribution at frequency Ω.
Formally, the demodulation process consists of multiplying the incoming signal
by the reference signal, a cosine at frequency Ω:
$$D(t) \propto e^{i(\Omega t+\theta)} + e^{-i(\Omega t+\theta)}$$
and then low-pass filtering the result, in order to reject higher harmonics. As shown,
the process can introduce a phase delay θ, thus we must consider both the component
in phase with the demodulating signal (θ = 0) and that shifted by θ = π/2, called
quadrature. This is what a lock-in amplifier does, see Sect. A.6. The demodulated
current is
ideally half-way between our resonance peak and the adjacent ones. The sidebands
are then completely reflected, as shown by Eq. C.18 or by Fig. C.2. So, by choosing
$$\Omega \sim 2\pi\,\frac{c}{4l} \;\Rightarrow\; R_\pm \simeq 1, \tag{C.45}$$
and dropping the subscript demod, Eq. C.43 simplifies to:
$$I_p = m\, I_0\, \operatorname{Im}(R_0), \qquad I_q = 0 \tag{C.46}$$
The reflectivity near resonance, derived in the previous section (Eq. C.20 with
r2 ∼ 1), allows us to finally evaluate the PDH response:
$$R_0 \simeq -\frac{1 - \sigma + 2if}{1 - 2if} \;\Rightarrow\; I_{demod} = -I_0\, \frac{4m(2 - \sigma)}{1 + 4f^2}\, f \tag{C.47}$$
$$I_p = -2 I_0\, J_0(m) J_1(m)\, (t_1 t_2)^2\, r_1 r_2\, \frac{\sin 2kx}{1 + (r_1 r_2)^4 - 2 (r_1 r_2)^2 \cos 4kx}, \qquad I_q = 0 \tag{C.48}$$
where x is the displacement from resonance. The slope of the linear region, near
x = 0, can be evaluated:
$$I_p \simeq I_0\, \frac{4 m F}{\pi}\, \frac{x}{\lambda}$$
having used Eq. C.10 that relates finesse and mirror reflectivities. A large value of
F improves sensitivity, yielding a steep response, but on the other hand narrows the
linearity range where the PDH method can be applied.
Figure C.7 shows how this function makes abrupt jumps when the condition 2l =
nλ/2 is met. In the lower panel, a zoom around the central resonance shows the linear
response in a ±2 nm interval.
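A curve of this kind can be reproduced with a short Python sketch that evaluates Eq. C.48 as written. The values λ = 1.05 μm and r1 r2 = 0.99 come from the caption of Fig. C.7; everything else is an assumption made for illustration: equal mirrors, lossless mirrors with t² = 1 − r², a modulation index m = 0.1, the small-m forms J0 ≈ 1 and J1 ≈ m/2, and the function name itself.

```python
import math


def pdh_in_phase(x, wavelength, r1, r2, m, i0=1.0):
    """In-phase PDH signal of Eq. C.48 versus displacement x from resonance."""
    k = 2.0 * math.pi / wavelength
    rho = r1 * r2
    t1_sq = 1.0 - r1 ** 2   # assumption: lossless mirror 1
    t2_sq = 1.0 - r2 ** 2   # assumption: lossless mirror 2
    j0, j1 = 1.0, 0.5 * m   # small-m approximations of the Bessel functions
    numerator = math.sin(2.0 * k * x)
    denominator = 1.0 + rho ** 4 - 2.0 * rho ** 2 * math.cos(4.0 * k * x)
    return -2.0 * i0 * j0 * j1 * (t1_sq * t2_sq) * rho * numerator / denominator


if __name__ == "__main__":
    wavelength = 1.05e-6     # metres, from the Fig. C.7 caption
    r = math.sqrt(0.99)      # equal mirrors with r1 * r2 = 0.99 (assumption)
    for x_nm in (-10.0, -2.0, -0.5, 0.0, 0.5, 2.0, 10.0):
        print(x_nm, pdh_in_phase(x_nm * 1e-9, wavelength, r, r, m=0.1))
```

The printed values change sign at x = 0 and are antisymmetric in x, which is the property that makes this signal usable as an error signal for feedback.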
[Fig. C.7 plot, two panels: I_p (a.u.) versus displacement from resonance (nm); upper panel from −750 to 750 nm, lower panel from −15 to 15 nm]
Fig. C.7 Simulated I_p response of a PDH scheme for λ = 1.05 μm, r1 r2 = 0.99. The sharp
segments occur every λ/4 ≈ 260 nm. The zoom on the resonance x = 0 shows the linearity region
used for feedback
References
Black, E.D.: An introduction to Pound–Drever–Hall laser frequency stabilization. Am. J. Phys.
69, 79 (2001)
Drever, R.W.P., et al.: Laser phase and frequency stabilization using an optical resonator. Appl.
Phys. B 31, 97–105 (1983)
Pound, R.V.: Electronic frequency stabilization of microwave oscillators. Rev. Sci. Instrum. 17, 490
(1946)
Vajente, G.: Readout, sensing and control. In: Bassan, M. (ed.) Advanced Interferometers and the
Search for Gravitational Waves. Springer Int. Publish (2014)