Lionel J. Mason and Nicholas Michael John Woodhouse



LONDON MATHEMATICAL SOCIETY MONOGRAPHS

NEW SERIES

Previous volumes of the LMS Monographs were published by Academic Press, to
whom all enquiries should be addressed. Volumes in the New Series will be published
by Oxford University Press throughout the world.

NEW SERIES

1. Diophantine inequalities R. C. Baker

2. The Schur multiplier Gregory Karpilovsky
3. Existentially closed groups Graham Higman and Elizabeth Scott
4. The asymptotic solution of linear differential systems M. S. P. Eastham
5. The restricted Burnside problem Michael Vaughan-Lee
6. Pluripotential theory Maciej Klimek
7. Free Lie algebras Christophe Reutenauer
8. The restricted Burnside problem (2nd edition) Michael Vaughan-Lee
9. The geometry of topological stability Andrew du Plessis and Terry Wall
10. Spectral decompositions and analytic sheaves J. Eschmeier and M. Putinar
11. An atlas of Brauer characters C. Jansen, K. Lux, R. Parker, and R. Wilson
12. Fundamentals of semigroup theory John M. Howie
13. Area, lattice points, and exponential sums M. N. Huxley
14. Super-real fields H. G. Dales and W. H. Woodin
15. Integrability, self-duality, and twistor theory L. Mason and
N. M. J. Woodhouse
16. Categories of symmetries and infinite-dimensional groups Yu. A. Neretin
Integrability,
Self-Duality, and
Twistor Theory
L. J. Mason
and
N. M. J. Woodhouse
The Mathematical Institute, Oxford, UK

CLARENDON PRESS · OXFORD

1996
Oxford University Press, Walton Street, Oxford OX2 6DP
Oxford New York
Athens Auckland Bangkok Bombay
Calcutta Cape Town Dar es Salaam Delhi
Florence Hong Kong Istanbul Karachi
Kuala Lumpur Madras Madrid Melbourne
Mexico City Nairobi Paris Singapore
Taipei Tokyo Toronto
and associated companies in
Berlin Ibadan

Oxford is a trade mark of Oxford University Press

Published in the United States by

Oxford University Press Inc., New York

© L. J. Mason and N. M. J. Woodhouse, 1996

All rights reserved. No part of this publication may be
reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, without the prior permission in writing of Oxford
University Press. Within the UK, exceptions are allowed in respect of any
fair dealing for the purpose of research or private study, or criticism or
review, as permitted under the Copyright, Designs and Patents Act, 1988, or
in the case of reprographic reproduction in accordance with the terms of
licences issued by the Copyright Licensing Agency. Enquiries concerning
reproduction outside those terms and in other countries should be sent to
the Rights Department, Oxford University Press, at the address above.

This book is sold subject to the condition that it shall not,
by way of trade or otherwise, be lent, re-sold, hired out, or otherwise
circulated without the publisher's prior consent in any form of binding
or cover other than that in which it is published and without a similar
condition including this condition being imposed
on the subsequent purchaser.

A catalogue record for this book is available from the British Library

Library of Congress Cataloging in Publication Data

(Data applied for)

ISBN 0-19-853498-1

Typeset by the authors using LaTeX

Printed in Great Britain by
Bookcraft Ltd., Midsomer Norton, Avon
Preface

This book grew out of a series of lectures that one of us (LJM) gave in Oxford
three years ago. It had become increasingly clear to us that the connections
between integrability theory and Roger Penrose's twistor construction are deep
and significant, and it seemed timely to try to draw them together in a review.
As is inevitable, our ideas shifted as the project took shape, and like almost all
books, it is longer than we originally intended.
Over the past twenty-five years, the study of integrability has grown into a
significant branch of mathematics. Examples of integrable systems have been
found in fields ranging from fluid dynamics, nonlinear optics, particle physics,
and general relativity to differential and algebraic geometry, and topology. Their
special significance is that they combine tractability with nonlinearity, so they
make it possible to explore nonlinear phenomena while working with explicit
solutions: in many integrable systems one can even obtain detailed information
about the structure of the entire space of solutions. Integrability theory has also
had an impact on other branches of mathematics through the application of its
techniques, for example, in statistical mechanics and in the theory of cellular
automata.
Our book is not an exhaustive survey of this huge and growing catalogue of
theory and example. Rather we present a unified point of view on what might be
termed the core of the theory, adopting an approach that is strongly influenced
by ideas of Richard Ward (1985, 1990a). He drew attention to the unifying
role of the self-dual Yang-Mills equations, which contain many of the familiar
examples of integrable equations as symmetry reductions. We have two central
themes.
(1) The symmetries of the self-duality equations (the self-dual Yang-Mills
equations, the self-dual Einstein equations, and various generalizations of them)
provide a natural classification scheme for integrable systems, albeit one that
is not yet complete.
(2) The twistor theory of the self-duality equations is a natural framework within
which to study the geometry of some of the powerful general constructions,
such as the inverse scattering method, and the connections between them.
Our aim is to present a systematic account of the basic theory of integrable
equations from this point of view, in a way that makes the origin of the standard
constructions less mysterious. We hope that our book will be of use to the
beginner who wants to learn the subject from scratch as well as to the expert.
But it is not intended to be just a text on a maturing branch of mathematics: we
claim that the body of results presented here, some new, some well known, lends
powerful support to the thesis that integrability is characterized by the existence
of a twistor construction.
There are topics that we wanted to include, but did not through lack of time
and space: some of these are mentioned at the end of Chapter 1. A project for
the future will be to explore in more depth the connection between the ideas pre-
sented here and the construction of integrable systems from the representation
theory of infinite-dimensional groups, as in Kostant (1979), Adler and van Moer-
beke (1980), Symes (1980), Sato (1981), Date et al. (1983), Ueno and Nakamura
(1983), Kac and Wakimoto (1989), as well as the R-matrix formulation described
in Faddeev and Takhtajan (1987), and the beautiful ideas in Novikov (1994).
We thank M. J. Ablowitz, M. F. Atiyah, G. Calvert, A. Carey, S. Chakravarty,
P. A. Clarkson, M. Dunajski, K. C. Hannabuss, N. J. Hitchin, G. P. Kelly, E.
T. Newman, R. Penrose, G. B. Segal, M. A. Singer, G. A. J. Sparling, I. A. B.
Strachan, K. P. Tod, and, in particular, R. S. Ward and the Press's anonymous
reader for contributions and encouragement. We thank the Isaac Newton Insti-
tute and NATO (CRG numbers 950300 and 901086) for support while this book
was being written.

The Mathematical Institute, Oxford L. J. M

January 1996 N. M. J. W
Contents

1 Introduction
1.1 Examples of integrability
1.2 Outline of the book
Notes on Chapter 1

I REDUCTIONS OF THE ASDYM EQUATION

2 Mathematical background I
2.1 Gauge theories
2.2 Space-time
2.3 Differential forms
2.4 Conformal transformations and compactified space-time
2.5 Bundles, connections, and curvature
2.6 The Yang-Mills equations
Notes on Chapter 2
3 The ASD Yang-Mills equation
3.1 ASD electromagnetic fields
3.2 Lax pairs
3.3 Yang's equation and the K-matrix
3.4 Lagrangians for the ASDYM equation
3.5 The Hamiltonian formalism
Notes on Chapter 3
4 Reduction of the ASDYM equation
4.1 Classification of reductions
4.2 Reductions of the linear ASD equation
4.3 Conformal reduction in the non-Abelian case
4.4 Invariant connections and Higgs fields
4.5 The space of orbits
4.6 Bäcklund transformations
Notes on Chapter 4
5 Reduction to three dimensions
5.1 The Bogomolny equation
5.2 Hyperbolic monopoles and other generalizations
5.3 Reduction by a null translation
Notes on Chapter 5
6 Reduction to two dimensions
6.1 Two-dimensional groups of conformal motions
6.2 Reductions by $H_{++}$
6.3 Reduction by $H_{+0}$
6.4 Reduction by $H_{SD}$
6.5 Reduction by $H_{ASD}$
6.6 The Ernst equation
6.7 Reduction of Yang's equation
6.8 Liouville's equation
Notes on Chapter 6
7 Reductions to one dimension
7.1 Abelian reduction to one dimension
7.2 Nahm's equations and tops
7.3 The motion of an n-dimensional rigid body
7.4 The Painlevé equations
7.5 Non-Abelian reductions
Notes on Chapter 7
8 Hierarchies
8.1 The KdV flows
8.2 The recursion operator for the ASDYM equation
8.3 Hamiltonian formalism
8.4 ASDYM and Bogomolny hierarchies
8.5 Reductions of the ASDYM flows
8.6 The generalized ASDYM equation
Notes on Chapter 8

II TWISTOR METHODS

9 Mathematical background II
9.1 Projective spaces and flag manifolds
9.2 Twistor space
9.3 Birkhoff's factorization theorem
9.4 Holomorphic vector bundles: the Čech description
9.5 $\bar\partial$-operators
9.6 Cohomology
9.7 The Grassmannian
9.8 Scattering on the real line
9.9 Spinors
Notes on Chapter 9
10 The twistor correspondence
10.1 The concrete form of the Penrose-Ward transform
10.2 The abstract form of the transform
10.3 The Painlevé property
10.4 Global solutions in Euclidean signature
10.5 Global solutions in ultrahyperbolic signature
10.6 The GASDYM equation
10.7 The truncated GASDYM hierarchy
10.8 The linear Penrose transform
Notes on Chapter 10
11 Reductions of the Penrose-Ward transform
11.1 Symmetries of the twistor correspondence
11.2 Symmetries of the twistor bundle
11.3 Reduced twistor spaces
11.4 The KdV and NLS equations
11.5 The initial value problem and inverse scattering
11.6 Isomonodromy and the Painlevé equations
11.7 The Schlesinger equation
Notes on Chapter 11
12 Twistor construction of hierarchies
12.1 Transformations of the patching matrix
12.2 DS operators and the GASDYM hierarchy
12.3 The twistor construction of the DS flows
12.4 Explicit construction of solutions from twistor data
12.5 Hamiltonian formalism
12.6 The KP equation and the KP hierarchy
Notes on Chapter 12
13 ASD metrics
13.1 Self-duality in curved space-time
13.2 The Levi-Civita connection
13.3 Spinors and the correspondence space
13.4 ASD conformal structures
13.5 Curved twistor spaces
13.6 Reductions
13.7 ASDYM fields and the switch map
Notes on Chapter 13

A Active and passive gauge transformations
B The Drinfeld-Sokolov construction
Notes on Appendix B

C Poisson and symplectic structures

Notes on Appendix C
D Reductions of the ASDYM equation
References
A note on notation
Index of notation
Index
1
Introduction

It is easier to give examples of 'integrability' than to say precisely what it means.
The mathematical literature contains an impressive body of well-developed the-
ory, but no single effective characterization that covers all the known cases. The
difficulty is that the theoretical development has been driven very largely by the
study of particular examples, and that the mathematical tools that are used to
deal with one class of examples do not always carry over to another. As the
catalogue of theory and example grows, older definitions that capture the exact
meaning of integrability in finite-dimensional systems are seen to lack precision
and universality.

1.1 EXAMPLES OF INTEGRABILITY

In classical mechanics, 'integrability' is a clearcut concept, although even here
one must take care not to make the definition too wide. A Hamiltonian system
with a 2n-dimensional phase space is integrable if it is possible to find explic-
itly n constants of the motion in involution, that is, with vanishing Poisson
brackets. This is fewer than one might expect to have to find, because in gen-
eral 2n - 1 constants are needed to integrate a flow in 2n dimensions. But in
Hamiltonian systems, n constants in involution are sufficient to reduce the prob-
lem to quadratures, because the constants play two roles: first, they determine
a family of n-dimensional manifolds ('level surfaces') tangent to the flow, and
so effectively they halve the number of dependent variables in the equations of
motion; second, they generate Abelian symmetry groups that determine special
coordinate systems within the level surfaces in which the evolution is linear. The
coordinate vectors are the Hamiltonian vector fields of the conserved quantities,
which commute and are tangent to the level surfaces. In fact, by evaluating in-
tegrals, one can introduce new canonical coordinates, the action-angle variables,
in which the entire flow of an integrable system is linear in t.
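As a minimal illustration of the last statement, the sketch below takes the simplest integrable system, a single harmonic oscillator with Hamiltonian H = (p^2 + omega^2 q^2)/2, and checks that the action stays constant while the angle advances linearly in t. The oscillator, the function names, and the sample values are assumptions chosen for illustration; they are not taken from the text.

```python
import math

def oscillator_flow(q0, p0, omega, t):
    """Exact flow of H = (p**2 + omega**2 * q**2) / 2 after time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return q0 * c + (p0 / omega) * s, p0 * c - omega * q0 * s

def action_angle(q, p, omega):
    """Action I = H / omega and angle theta, where
    omega * q = r sin(theta) and p = r cos(theta)."""
    action = (p * p + (omega * q) ** 2) / (2.0 * omega)
    angle = math.atan2(omega * q, p)
    return action, angle

def angle_advance(q0, p0, omega, t):
    """Change in the angle variable between time 0 and time t, mod 2*pi."""
    _, theta0 = action_angle(q0, p0, omega)
    q, p = oscillator_flow(q0, p0, omega, t)
    _, theta1 = action_angle(q, p, omega)
    return (theta1 - theta0) % (2.0 * math.pi)

if __name__ == "__main__":
    q0, p0, omega, t = 0.3, 1.1, 1.5, 2.0  # illustrative values only
    q, p = oscillator_flow(q0, p0, omega, t)
    print("action before/after:", action_angle(q0, p0, omega)[0],
          action_angle(q, p, omega)[0])
    print("angle advance vs omega*t:", angle_advance(q0, p0, omega, t),
          (omega * t) % (2.0 * math.pi))
```

In these variables the level surfaces are the circles of constant action, and the flow on each circle is the linear motion theta -> theta + omega t.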
Every 2n-dimensional Hamiltonian system admits n local constants of the
motion in involution. What is required for integrability is that the constants
should be global in an appropriately strong sense and that they should be given
explicitly in terms of the original coordinates of the problem. In bound systems, a
standard globality condition is that the level surfaces should be compact. They
then determine a foliation of phase space by affine manifolds tangent to the
Hamiltonian flow. The leaves are tori, and the flow in each leaf is linear. In
unbound systems, it is difficult to translate the requirements into a simple geo-
metric condition. The problem is that it is often very simple to define implicitly
constants of the motion in involution, such as the initial values of the position
coordinates, which are 'global', but are not easily expressed as 'explicit functions'
of the phase space coordinates at a general time.
When we turn to systems of partial differential equations, in which there are
infinitely many degrees of freedom, the definition is harder to formulate precisely.
There are direct extensions of the finite-dimensional Hamiltonian theory, such
as that given by Faddeev and Takhtajan (1987), but they require a phase space
formulation and therefore the specification of boundary conditions. This seems
unduly restrictive, as one would like to see integrability as a property of a set
of equations, independent of any particular choice of a class of solutions. Also
there is no simple extension of the Hamiltonian approach to elliptic equations
nor to systems in space-times with other non-Lorentzian signatures, where the
notion of 'evolution' is inappropriate.
One can, however, identify common features of the systems that are
'integrable' in one accepted sense or another. First, the equations are to some
degree soluble. In many cases a large or even dense class of solutions can be
found explicitly and there exist general methods for constructing solutions. Sec-
ond, it is often possible to find nonlinear superpositions of solutions. Third,
ergodic behaviour is ruled out by the existence of a large number of constants of
the motion; chaotic behaviour is certainly evidence of nonintegrability. Fourth,
they have the Painlevé property, which we shall explain shortly. These might
be regarded as 'integrable properties'. On the other hand the nontrivial examples
have genuine nonlinear properties, such as the existence of bound solutions
that do not disperse. Such solutions do not exist for equations which are merely
nonlinear transforms of linear equations.
The following examples should serve to illustrate these remarks (they are
discussed in more detail later on).
The Euler top. The dynamical evolution of an asymmetric body spinning
about its centre of mass is given by the geodesic flow of a left-invariant metric
on SO(3). This is an integrable Hamiltonian system: the equations of motion
can be solved by elliptic functions, and the phase space is foliated by Lagrangian
tori, on which the flow is linear: the tori are the common level surfaces of the
energy, the total angular momentum, and one of the components of the angular
momentum.
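The conservation laws behind this foliation can be checked numerically. The sketch below is an illustration under stated assumptions rather than the book's construction: it integrates Euler's equations for the angular velocity in the body frame with a classical Runge-Kutta step and monitors the energy and the squared total angular momentum. The moments of inertia, the initial data, and all function names are hypothetical, and the third constant (a space-frame component of the angular momentum) is not visible in this body-frame description.

```python
def euler_rhs(w, inertia):
    """Right-hand side of Euler's equations I_1 dw_1/dt = (I_2 - I_3) w_2 w_3
    (and cyclic permutations) in the body frame."""
    w1, w2, w3 = w
    i1, i2, i3 = inertia
    return (
        (i2 - i3) * w2 * w3 / i1,
        (i3 - i1) * w3 * w1 / i2,
        (i1 - i2) * w1 * w2 / i3,
    )

def rk4_step(w, inertia, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, scale):
        return tuple(wi + scale * ki for wi, ki in zip(w, k))

    k1 = euler_rhs(w, inertia)
    k2 = euler_rhs(shifted(k1, 0.5 * h), inertia)
    k3 = euler_rhs(shifted(k2, 0.5 * h), inertia)
    k4 = euler_rhs(shifted(k3, h), inertia)
    return tuple(
        wi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

def energy(w, inertia):
    """Kinetic energy (1/2) * sum_i I_i w_i^2."""
    return 0.5 * sum(i * wi * wi for i, wi in zip(inertia, w))

def angular_momentum_squared(w, inertia):
    """Squared length of the angular momentum, sum_i (I_i w_i)^2."""
    return sum((i * wi) ** 2 for i, wi in zip(inertia, w))

def integrate(w0, inertia, h, steps):
    """Advance the angular velocity w0 by the given number of RK4 steps."""
    w = w0
    for _ in range(steps):
        w = rk4_step(w, inertia, h)
    return w

if __name__ == "__main__":
    inertia = (1.0, 2.0, 3.0)   # hypothetical principal moments
    w0 = (1.0, 0.1, 0.5)        # hypothetical initial angular velocity
    w = integrate(w0, inertia, 1e-3, 5000)
    print("energy drift:", energy(w, inertia) - energy(w0, inertia))
    print("L^2 drift:", angular_momentum_squared(w, inertia)
          - angular_momentum_squared(w0, inertia))
```

Both drifts stay at the level of the integration error, which is consistent with the motion being confined to the common level sets of the two quantities.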
Two other top configurations are also integrable: the Lagrange top, in which
the fixed point and the centre of gravity lie on the axis of symmetry, and the
Kovalevskaya top, which has principal moments at 2A, 2A and A at the fixed
point, with the centre of mass lying in the plane of the principal axes of moment
2A. The `third constant' in the Kovalevskaya case is not easy to find.
The Korteweg-de Vries (KdV) equation. This celebrated equation,
\[ 4u_t - u_{xxx} - 6uu_x = 0, \]
was derived by Korteweg and de Vries in their investigation of the behaviour of

water waves in a shallow channel. By building on the work of Boussinesq and
Rayleigh, they used it to explain the empirical observations of Scott-Russell,
who had followed a large solitary wave while riding alongside a canal and had
later reproduced the phenomenon in laboratory experiments. A solitary wave is
modelled by a soliton solution to the KdV equation
\[ u = 2\kappa^2 \operatorname{sech}^2(\kappa x + \kappa^3 t - \kappa x_0), \]
where $\kappa$ and $x_0$ are constants. This is able to retain its size and shape as $t$
increases because of the balance between the effects of the dispersive term $u_{xxx}$
and of the nonlinear term $6uu_x$, which on its own would cause the wave to bunch
up and break.
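The balance can be checked numerically. The sketch below assumes the normalization 4u_t - u_xxx - 6uu_x = 0 for the KdV equation and the soliton profile u = 2 kappa^2 sech^2(kappa x + kappa^3 t - kappa x_0); it evaluates the left-hand side with central finite differences and finds a residual that vanishes up to discretization error. The step size, the sample points, and the function names are illustrative assumptions.

```python
import math

def soliton(x, t, kappa, x0):
    """One-soliton profile u = 2 kappa^2 sech^2(kappa x + kappa^3 t - kappa x0)."""
    theta = kappa * x + kappa ** 3 * t - kappa * x0
    return 2.0 * kappa ** 2 / math.cosh(theta) ** 2

def kdv_residual(x, t, kappa=1.0, x0=0.0, h=1e-3):
    """Finite-difference estimate of 4 u_t - u_xxx - 6 u u_x at (x, t)."""
    def u(xx, tt):
        return soliton(xx, tt, kappa, x0)

    # Second-order central differences for each derivative.
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    u_xxx = (u(x + 2 * h, t) - 2.0 * u(x + h, t)
             + 2.0 * u(x - h, t) - u(x - 2 * h, t)) / (2.0 * h ** 3)
    return 4.0 * u_t - u_xxx - 6.0 * u(x, t) * u_x

if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.3, 1.7):
        print(x, kdv_residual(x, t=0.4))
```

The individual terms are of order one at these points, so a residual several orders of magnitude smaller indicates that the dispersive and nonlinear terms cancel against the time derivative.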
One of the integrable properties of the KdV equation is that solitons can be
superposed: there are solutions for which $u$ has the initial form of a number of
widely separated solitary waves which pass through each other as t increases and
re-emerge after interaction as separate solitons with the same size and shape as
the originals, but displaced (see, for example, Drazin 1983). The existence of
such solutions is essentially a nonlinear phenomenon.
Linear systems. The `integrability' of the KdV equation can be traced to the
existence of a Lax pair: the equation is the condition that the two differential
operators
\[ L = \partial_x^2 + u \quad\text{and}\quad M = \partial_t - \partial_x^3 - \tfrac{3}{2}u\,\partial_x - \tfrac{3}{4}u_x \]
should commute. The existence of a Lax pair, or, more generally, a reformula-
tion of an equation as the consistency condition for an associated linear system
of overdetermined differential equations, does not by itself characterize integra-
bility, but most general methods for solving integrable systems begin with the
introduction of a linear system.¹
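The commutator can be worked out directly. The derivation below is a sketch that reads the two operators as $L = \partial_x^2 + u$ and $M = \partial_t - \partial_x^3 - \tfrac{3}{2}u\,\partial_x - \tfrac{3}{4}u_x$; the coefficients $\tfrac{3}{2}$ and $\tfrac{3}{4}$ are the ones for which $[L, M]$ contains no derivative operators, and with them the commutator is a multiple of the KdV equation in the normalization $4u_t - u_{xxx} - 6uu_x = 0$.

```latex
% Write M = \partial_t - A with A = \partial_x^3 + \tfrac32 u\,\partial_x + \tfrac34 u_x.
\begin{align*}
[L, \partial_t] &= -u_t, \\
[\partial_x^2,\ \tfrac32 u\,\partial_x + \tfrac34 u_x]
  &= 3u_x\,\partial_x^2 + 3u_{xx}\,\partial_x + \tfrac34 u_{xxx}, \\
[u,\ \partial_x^3]
  &= -3u_x\,\partial_x^2 - 3u_{xx}\,\partial_x - u_{xxx}, \\
[u,\ \tfrac32 u\,\partial_x + \tfrac34 u_x]
  &= -\tfrac32 u u_x .
\end{align*}
% The derivative terms cancel, leaving a multiplication operator:
\begin{align*}
[L, A] &= -\tfrac14 u_{xxx} - \tfrac32 u u_x, \\
[L, M] &= [L, \partial_t] - [L, A]
        = -u_t + \tfrac14 u_{xxx} + \tfrac32 u u_x
        = -\tfrac14\left(4u_t - u_{xxx} - 6uu_x\right).
\end{align*}
```

So $[L, M] = 0$ holds exactly when $u$ satisfies the KdV equation in this normalization.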
A central example of such a method is the inverse-scattering transform. To
apply it to the KdV equation, we impose rapidly decreasing boundary conditions
on $u$ as $|x| \to \infty$. We interpret $L\alpha = \xi\alpha$ at each fixed value of $t$ as a
time-independent Schrödinger equation $\partial_x^2\alpha + u\alpha = -k^2\alpha$, in which $u$ is the potential
and $\xi = -k^2$ ('the spectral parameter') is an eigenvalue. An eigenfunction such
that
\[ \alpha \sim e^{-ikx} \]
for large negative $x$ for some real value of $k$ has the asymptotic form
\[ \alpha \sim T^{-1}\left(e^{-ikx} + R\,e^{ikx}\right) \]
for large positive x, where R(k) and T(k) are the reflection and transmission
coefficients. The transmission coefficient T has an analytic continuation over the
upper half $k$ plane and can have poles at a finite number of points $k_1, \ldots, k_n$
on the positive imaginary axis, which make up the `discrete spectrum'. At these
values of k, one of the eigenfunctions of L is square-integrable.
The 'scattering data' of $u$ are $R(k, t)$, $k_1(t), \ldots, k_n(t)$, and $n$ complex numbers
$c_i(t)$ determined by the eigenfunctions of the discrete spectrum.² The data can
be prescribed freely at t = 0. From the commutativity of L and M, one deduces
that their time dependence is governed by the linear equations
\[ \dot{R} = -2ik^3 R, \qquad \dot{k}_i = 0, \qquad \dot{c}_i = -2ik_i^3 c_i, \]
where the dot denotes the time derivative. In simple cases, u can be recovered
explicitly from the scattering data; more generally, one must solve a Riemann-
Hilbert problem. The Riemann-Hilbert problem is equivalent, in turn, to a linear
integral equation. So the construction of the solution comes down to a sequence
of linear procedures. This is one of the senses in which the KdV equation is
`soluble'.
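The linearity of the time evolution is easy to make concrete for the reflection coefficient. The sketch below assumes the evolution law dR/dt = -2ik^3 R for real k, reading the dot as a time derivative, and integrates it in closed form. The initial value and the function name are hypothetical.

```python
import cmath

def reflection_coefficient(r0, k, t):
    """Solution of dR/dt = -2i k^3 R with R(k, 0) = r0, for real wave number k."""
    return r0 * cmath.exp(-2j * k ** 3 * t)

if __name__ == "__main__":
    r0, k = 0.3 + 0.4j, 1.2  # hypothetical initial datum and wave number
    for t in (0.0, 0.5, 1.0):
        r = reflection_coefficient(r0, k, t)
        print(t, r, abs(r))
```

For real k the evolution only rotates the phase of R, so its modulus does not change in time.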
The inverse-scattering transform relates directly to the Hamiltonian meaning
of 'integrability' since the KdV equation determines a Hamiltonian flow with
respect to the symplectic structure

\[ \Omega(u, u') = \int \left(q'\,\partial_x q - q\,\partial_x q'\right) dx, \]

where $u = 2\partial_x q$, $u' = 2\partial_x q'$. The $k_i$, the $|c_i|^2$, and the values of $|R(k)|^2$ for
different real values of $k$ are the constants of the motion, and one can interpret
the complete scattering data in terms of action-angle variables.
The inverse-scattering transform leads to an elegant picture in which the
general solution is decomposed into a `superposition' of k solitons, the shapes
and velocities of which are determined by (ci, ki), and a radiative or dispersive
part, the shape of which is determined by R(k, t).
Other soliton equations
The nonlinear Schrodinger (NLS) equation
\[ i\partial_t\psi = -\tfrac{1}{2}\partial_x^2\psi \pm |\psi|^2\psi \]
for a complex-valued wave function $\psi$ behaves in much the same way as the
KdV equation (one physical context in which it arises is in the modelling of
propagation of light along optical fibres). The sine-Gordon equation
\[ \partial_x^2\phi - \partial_t^2\phi = \sin\phi \]
also has very similar properties (with a plus sign between the two terms on the
left-hand side, it is the equation for harmonic maps into $S^2$), as do the soliton
equations of Drinfeld and Sokolov (Appendix B), and of Kac and Wakimoto
(1989). Each corresponds to a choice of a Kac-Moody algebra, together with
certain other algebraic data.
The ASDYM equation
Here the dependent variable is a connection on a vector bundle over space-time.
The anti-self-dual Yang-Mills (ASDYM) equation is the condition that the cur-
vature should be anti-self-dual (we work with the anti-self-dual rather than the
self-dual equation because it is more natural in the context of Kahler geometry
and because it fits more easily with the standard conventions of twistor theory,
but the two equations are equivalent by reversing the orientation of space-time).
There is a natural linear system since anti-self-duality is the integrability condi-
tion for the existence of covariantly constant sections of the vector bundle over
totally null 2-planes with self-dual tangent bivectors. This interpretation is the
basis of what we call the Penrose-Ward transform,$^3$ by which the solutions are
parametrized by holomorphic vector bundles over (parts of) twistor space $\mathbb{CP}^3$,
and hence by holomorphic patching matrices. Some solutions can be found ex-
plicitly by using the inverse transform in simple cases; in general, as with the
inverse-scattering transform, the method reduces the solution of the ASDYM
equation to a Riemann-Hilbert problem. The Penrose-Ward transform is par-
ticularly well suited to our purpose of understanding integrability as a property
of equations, as opposed to equations together with boundary conditions, be-
cause it can be applied to solutions in arbitrary neighbourhoods in space-time
(subject only to certain convexity conditions); this is in contrast to the inverse-
scattering transform and the Hamiltonian theory, both of which apply only to
various classes of global solutions.
One `integrable property' of the ASDYM equation is a four-dimensional ver-
sion of the Painleve property, which we come to in the next example.
The Painleve equations
An important general distinction between linear and nonlinear ordinary differ-
ential equations is in the way in which the singularities of their solutions depend
on the constants of integration. A solution to the linear equation
\[ \frac{d^2y}{dt^2} + q(t)\frac{dy}{dt} + r(t)y = 0, \]
in which q and r are holomorphic functions of the complex variable t, can have
singularities only at the singularities of q and r, and so the location of the
singularities in the complex plane is independent of the constants of
integration. On the other hand, the general solution to a typical nonlinear
equation has movable singularities. For example, the solution to
\[ \frac{dy}{dt} + y^3 = 0 \]
is $y = (2(t - c))^{-1/2}$, which has a branch point that moves when the value of
the constant c is changed.
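The movable branch point can be seen by separating variables. This is a short worked check, taking the equation to be $dy/dt + y^3 = 0$ with $c$ the constant of integration:

```latex
\begin{align*}
\frac{dy}{dt} = -y^3
\;&\Longrightarrow\; -\frac{dy}{y^3} = dt
\;\Longrightarrow\; \frac{1}{2y^2} = t - c \\
&\Longrightarrow\; y = \pm\bigl(2(t - c)\bigr)^{-1/2}.
\end{align*}
```

The square-root singularity sits at $t = c$, so its position is fixed by the initial data rather than by the equation.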
Certain very special nonlinear equations are unusual in this respect in that
their behaviour is similar to that of linear equations; although their solutions
have both fixed and movable singularities, the movable singularities are all poles.
These equations are said to be of Painleve type. The first-order equations were
investigated by Fuchs, who found that all the cases that he considered could
either be reduced to linear equations or else be solved in terms of elliptic functions
(Ince 1956, Chapter 13). For example the generalized Riccati equations have the
Painleve property: they are
\[ y' = a(t)y^2 + b(t)y + c(t), \]
where $a$, $b$, and $c$ are holomorphic. But here the property is a trivial consequence
of the reduction to the linear form
\[ aw'' - (a' + ab)w' + a^2cw = 0, \]
by the substitution $y = -w'/(aw)$.
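For completeness, here is a sketch of the computation behind the linearization, assuming the Riccati equation in the form $y' = ay^2 + by + c$ and the substitution $y = -w'/(aw)$:

```latex
\begin{align*}
y' &= -\frac{w''}{aw} + \frac{a'w'}{a^2w} + \frac{(w')^2}{aw^2}, \\
ay^2 + by + c &= \frac{(w')^2}{aw^2} - \frac{bw'}{aw} + c .
\end{align*}
```

The quadratic terms $(w')^2/(aw^2)$ cancel on equating the two lines. Multiplying what remains by $-a^2w$ gives $aw'' - a'w' = abw' - a^2cw$, that is, $aw'' - (a' + ab)w' + a^2cw = 0$, so the coefficient of $w$ comes out as $a^2c$.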
The second-order examples are rather more interesting: Painleve and others
found fifty canonical classes of equations of Painleve type of the form
\[ y'' = F(y', y, t), \]
with F rational in y' and y. Forty-four of these could be integrated in terms
of known functions, but six defined new transcendental functions-the so-called
Painleve transcendents.$^4$
The six transcendental equations are integrable in a somewhat broader sense
than that of the Hamiltonian theory. They can be `solved' by a twistor construction
and the Painleve property is itself an example of an `integrable property'. We
shall see that the connection between the six transcendental equations and the
isomonodromy problem is an example of the association between linear systems
and integrable equations.
The Painleve property plays an important part in the analysis of other inte-
grable systems. Equations which can be reduced to Riemann-Hilbert problems
will in general have a form of the Painleve property. In the reverse direction,
it is the basis of the `Painleve test'. In the simplest form of the test, one at-
tempts to construct power series solutions to the equations with branching type
singularities. If the singularity is forced by the equations to be either rational or
fixed, then the equations are said to pass the Painleve test.$^5$ Despite the lack of
theoretical justification in the converse direction, the test is remarkably success-
ful at distinguishing between integrable and nonintegrable equations. Indeed,
Kovalevskaya discovered her integrable top by requiring that the equations of
motion should have the Painleve property.
Burgers' equation and explicitly soluble equations
Some equations are integrable by virtue of the fact that one can transform them
to linear equations or even just write down the general solution. The Riccati
equations above are examples of such ODEs. A notable example of such a PDE
is Burgers' equation
\[ u_w = 2uu_z + u_{zz}, \tag{1.1.1} \]
which is satisfied by $u = \phi_z/\phi$ for any solution $\phi$ to the heat equation
\[ \phi_w - \phi_{zz} = 0. \]
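The relation between the heat equation and Burgers' equation can be checked numerically. The sketch below assumes the substitution $u = \phi_z/\phi$ (the Cole-Hopf form, which is our reading of the text) and Burgers' equation in the form $u_w = 2uu_z + u_{zz}$. The sample solution `phi`, the constant `A`, and the step size `h` are hypothetical choices made only for this illustration.

```python
import math

A = 1.3  # hypothetical constant, chosen only for this illustration


def phi(z, w):
    """A sample solution of the heat equation phi_w = phi_zz."""
    return 1.0 + math.exp(A * z + A * A * w)


def u(z, w):
    """The substitution u = phi_z / phi; here phi_z = A * (phi - 1)."""
    p = phi(z, w)
    return A * (p - 1.0) / p


def burgers_residual(z, w, h=1e-3):
    """Central-difference estimate of u_w - 2 u u_z - u_zz at (z, w)."""
    u_w = (u(z, w + h) - u(z, w - h)) / (2 * h)
    u_z = (u(z + h, w) - u(z - h, w)) / (2 * h)
    u_zz = (u(z + h, w) - 2 * u(z, w) + u(z - h, w)) / (h * h)
    return u_w - 2 * u(z, w) * u_z - u_zz


if __name__ == "__main__":
    for point in [(0.0, 0.0), (0.5, -0.2), (-1.0, 0.3)]:
        print(point, burgers_residual(*point))
```

The residual is zero up to discretization error, which is a finite-difference check at a few points and not a proof.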
Another example of an equation to which the general solution can be found
explicitly is the Liouville equation
\[ u_{z\bar z} = -2e^u, \]
which is the condition that the metric $e^u\,dz\,d\bar z$ should have unit scalar curvature.
Up to holomorphic coordinate transformations, $z \mapsto v(z)$, the only such metric
with unit scalar curvature in two dimensions is $(1 + v\bar v)^{-2}\,dv\,d\bar v$. Thus by putting
$v = v(z)$, we obtain the general solution
\[ u = \log\left((1 + v\bar v)^{-2}\,v_z\bar v_{\bar z}\right), \]
where $v$ is an arbitrary holomorphic function of $z$.
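The general solution can be verified directly. The sketch assumes the Liouville equation in the form $u_{z\bar z} = -2e^u$ and takes $u = \log\bigl((1 + v\bar v)^{-2}v_z\bar v_{\bar z}\bigr)$ with $v$ holomorphic; the sign of the logarithm is our reading of the text.

```latex
\begin{align*}
u &= \log v_z + \log \bar v_{\bar z} - 2\log(1 + v\bar v), \\
u_{z\bar z} &= -2\,\partial_z\!\left(\frac{v\,\bar v_{\bar z}}{1 + v\bar v}\right)
 = -2\,\frac{v_z\bar v_{\bar z}}{(1 + v\bar v)^2} = -2e^u .
\end{align*}
```

The first two terms of $u$ drop out because $\log v_z$ is holomorphic and $\log\bar v_{\bar z}$ is antiholomorphic, so only the conformal factor of the round metric contributes.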

Unlike the earlier examples, equations such as these are in some sense trivial:
the construction of the solutions does not require a nonlocal transform.
Theoretical developments
These are only a small sample from the long list of integrable systems that have
been studied in detail and they illustrate only a few aspects of the general theory.
There are many other approaches. For example, Sato set up a correspondence be-
tween germs of solutions of integrable equations and certain infinite-dimensional
Grassmannians in such a way that the flows generated by the constants of the
motion correspond to standard flows on the Grassmannians. Jimbo and Miwa
extended Sato's theory by embedding the Grassmannians in the highest-weight
representation spaces of certain Kac-Moody Lie algebras and by interpreting
the solutions in terms of the representation theory of the Lie algebras. There
are connections between integrability and two-dimensional quantum field theory.
There are also many other connections with algebraic geometry: for example
through the Penrose-Ward transform and through the Krichever construction,
which expresses certain special solutions of integrable equations in terms of theta
functions constructed from the Jacobians of Riemann surfaces.
1.2 OUTLINE OF THE BOOK
In the following chapters, we shall draw together some of these ideas through the
theory of the self-duality equations. In the first part, we shall introduce the anti-
self-dual Yang-Mills (ASDYM) equation and catalogue some of its reductions.
In the second part, we shall develop various versions of the Penrose-Ward trans-
form, under which solutions to the ASDYM equation correspond to holomorphic
vector bundles over parts of the complex projective space $\mathbb{CP}^3$ and solutions to
the anti-self-dual (ASD) Einstein equation correspond to curved twistor spaces.
We have begun each part with a brief summary of the mathematical background.
This is intended to do no more than establish notation and conventions, and to
record for easy reference some important results. The summaries should be
sufficient for a first reading of the main body of the text, but not for all the examples
and notes. We have deliberately written different sections of the book at differ-
ent technical levels. In the main text of Part I, for example, we have assumed
only a very basic knowledge of differential geometry and vector bundle theory,
although more sophisticated topics appear in the notes and some examples. In
the first half of Part II, we develop twistor theory mostly in coordinates, with-
out using spinor formalism, but the later chapters make rather deeper use of the
underlying geometric ideas and require more familiarity with spinor calculus.
We follow the mathematical background chapter in Part I by introducing
the ASDYM equation in Chapter 3; we describe various potential forms of the
equation and show that they can be derived from Lagrangians. In Chapter 4,
we consider the geometric techniques needed to make symmetry reductions, and
in Chapters 5-7, we apply them to construct integrable systems in three, two,
and one dimensions. In Chapter 8, we introduce the recursion operator and
flows on the solution space of the ASDYM equation, which characterize it as
an `integrable system', and show how they induce the corresponding structures
on the solution spaces of the KdV and NLS equations. Finally, we consider
extensions to higher dimensions.
In Part II, we follow the mathematical background chapter with an intro-
duction to the Penrose-Ward transform for the ASDYM equation (Chapter 10).
We use it to derive the Painleve property of the equation and to construct the
instanton solutions in Euclidean space. We then show that it leads to natu-
ral boundary conditions at infinity for solutions in ultrahyperbolic space-time
and construct twistor data for the general solution. In this last application, we
see a typical pattern: that the data have identifiable `solitonic' and `scattering'
components. We then consider (Chapter 11) reductions of the transform, and
demonstrate connections with the inverse-scattering transform for the NLS and
KdV equations, and with the isomonodromy problem for ODEs. In Chapter 12,
we use the twistor geometry to construct the KdV hierarchy and to give a new
derivation of the Drinfeld-Sokolov theory, and we consider how the transform
should be extended to deal with the Kadomtsev-Petviashvili (KP) and
Davey-Stewartson equations. In Chapter 13, we extend the ideas to curved space-time:
we develop the twistor theory of ASD conformal structures and consider the
symmetry reductions of various ASD conditions on a metric.
The appendices (on gauge transformations, on the Drinfeld-Sokolov con-
struction, and on Poisson and symplectic structures) contain material which is
referred to throughout the book. Each chapter contains a section of notes at the
end, sometimes containing background material, and sometimes giving detailed
derivations that might unduly interrupt the development of the theory if they
were included in the main text. At the end, there is an index of reductions of
the ASDYM equation, with the reduced equations written in a standard form,
and an index of notation.
One point that should be emphasized is that we use `reduction' in a stronger
sense than usual. In the case of the ASDYM equation, for example, we re-
quire that gauge-equivalent potentials with the appropriate symmetry should
give the same solution of the reduced equation, possibly up to some residual
gauge freedom in the reduced system, and that inequivalent potentials should
give different solutions. This contrasts with a common usage, in which 'reduc-
tion' means `substitution of a particular form of the gauge potential into the
self-dual Yang-Mills equations'. A reduction in this latter sense is a reduction of
the self-duality equations together with a particular gauge condition. In extreme
cases, the Yang-Mills equations play no part at all, and all that is being reduced
is a gauge condition on a flat connection. For example, Burgers' equation is the
condition that
\[ A = u\,dz + (u^2 + u_z)\,dw \]
should satisfy the ASD condition in electromagnetic theory (3.1.2). In a weak
sense, it is a reduction of the linear ASD equation; but the electromagnetic field
generated by A vanishes whenever u satisfies Burgers' equation, so it is not a
reduction according to our more demanding criteria.
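The vanishing of the field can be made explicit. In this sketch we assume that $u$ depends only on $z$ and $w$ and that the potential is $A = u\,dz + (u^2 + u_z)\,dw$ (our reading of the garbled display):

```latex
\begin{align*}
dA &= d\bigl(u\,dz + (u^2 + u_z)\,dw\bigr) \\
   &= u_w\,dw\wedge dz + (2uu_z + u_{zz})\,dz\wedge dw \\
   &= \bigl(u_w - 2uu_z - u_{zz}\bigr)\,dw\wedge dz .
\end{align*}
```

So $dA = 0$ exactly when $u_w = 2uu_z + u_{zz}$: Burgers' equation here expresses flatness of $A$ and carries no information about a nontrivial ASD field.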
The two most notable omissions from the list of reductions of the ASDYM
equation are the KP equation and the Landau-Lifschitz equation, both of which
are key examples in that their theory has wide ramifications, and both of which
have a degree of universality of their own. There are also many other integrable
equations that are not symmetry reductions of the self-duality equations in four
dimensions, simply because of the number of independent variables involved; for
example, the equations of the generalized ASDYM hierarchy (Chapter 8). We
obtain most integrable systems in one and two dimensions as reductions of the
self-duality equations in four dimensions because of the high degree of freedom
allowed in low dimensions by the choices available for the gauge group and the
symmetry group. The central point, however, is not that the systems are reduc-
tions of self-duality equations, but that they inherit the twistor correspondence
from the ASDYM equation: it is this that underlies their integrability and that
motivates the study of many of the higher-dimensional examples.
There are two ways to approach these other equations within the framework
that we develop here. The first is to express them as reductions of the self-
duality equations, but in a more general sense than the reductions that we have
allowed. The second is to include them by extending the twistor theory of the
self-duality equations, which in our view is the more fundamental strategy. The
first approach is the one adopted by Strachan (1992, 1994), who obtained the
KP equations from a Poisson bracket formulation of Einstein's equations by
replacing the Poisson bracket by a Moyal bracket. Others have also considered
using alternative infinite-dimensional Lie algebras, combined, perhaps, with a
limiting process or other extensions of the ASDYM equations, to obtain the KP
equations from the ASDYM equation (Mason 1990, Chakravarty and Ablowitz
1992, Ablowitz et al. 1993). It seems clear that such constructions can be made to
work. It also seems likely that one can obtain the Landau-Lifschitz equation by
using the gauge group SL(8, C ). In neither case, however, does the construction
seem entirely natural because it does not lead to a simple twistor correspondence.
It is the second strategy that we follow at the end of Chapter 12; it is also the one
adopted by Mason (1995) for the KP equation, and by Carey et al. (1993), where
the twistor theory of the Landau-Lifschitz equation is based on the replacement
of the Riemann sphere of the standard twistor construction by an elliptic curve.

NOTES ON CHAPTER 1
1. The Einstein vacuum equation in general relativity is a notable example of a system
which is not integrable in any accepted sense, but for which there is a linear system,
since the equation $R_{ab} = 0$ is the consistency condition for a `potential modulo gauge'
form of the spin-3/2 massless field equation. In 2-component spinor notation, the linear
system is
\[ \nabla^{AA'}\gamma^C_{A'B'} = 0, \qquad \gamma^C_{A'B'} = \gamma^C_{B'A'}, \]
where two $\gamma$s are identified whenever they differ by a gauge term of the form $\nabla^C_{B'}\nu_{A'}$
for some solution of the Weyl neutrino equation $\nabla^{AA'}\nu_{A'} = 0$. The exact meaning
of 'consistency' in this case raises some subtle issues, which are discussed by Penrose
(1992).
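2. The solutions $\alpha_{1+}$ and $\alpha_{2-}$ such that $\alpha_{1+} \sim e^{ikx}$ as $x \to \infty$, and $\alpha_{2-} \sim e^{-ikx}$ as
$x \to -\infty$, for real $k$, have analytic continuations to the upper half of the complex $k$
plane. The $c_i$s are defined by
\[ c_i = \int_{-\infty}^{\infty} \alpha_{1+}\,dx \]
at $k = k_i$.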
3. It is difficult to find the correct term for this transform. It was first explored in Roger
Penrose's research group in Oxford in the 1970s, with significant contributions from
M. F. Atiyah, N. J. Hitchin, I. M. Singer, and other temporary and permanent members
of the Mathematical Institute. The linear form of the transform, and its nonlinear
extension to the self-dual Einstein equations, are clearly due to Penrose, although one
can see some of his contour integral formulas in much earlier work, for example in that
of Bateman (1910). The extension to the self-dual Yang-Mills equation first appeared
in Richard Ward's D. Phil. thesis, and was published by him (Ward 1977). An early
significant application was to the instanton problem (Atiyah and Ward 1977). We hope
that the use of the term 'Penrose-Ward transform' to cover both the linear and the
nonlinear transforms correctly reflects both the origin of the central ideas and Ward's
discovery of their application to the self-dual Yang-Mills equation. We remark that
the transform for the self-dual Yang-Mills equation is called the 'Ward construction'
by Penrose and Rindler (1986) and is unnamed in Ward and Wells (1990).
4. For a comprehensive list of references, see Ince (1956), Chapter 14, and also Ablowitz
and Clarkson (1991).
5. One must show that the singular solution so obtained is sufficiently general and does
not admit essential singularities. See Ablowitz and Clarkson (1991), and references
therein.
Part I
Reductions of the ASDYM equation

2 Mathematical background I

In this chapter, we summarize the mathematical background to the Yang-Mills
equations. In §2.1 we give an informal overview of gauge theories and their
relation to the anti-self-duality condition. In §2.2-§2.4 we give brief notes on the
geometry of real and complex space-time, and on conformal transformations,
followed in §2.5-§2.6 by a more formal discussion of bundles, connections, cur-
vature and the Yang-Mills equations. These topics are covered more fully by
Ward and Wells (1990).

2.1 GAUGE THEORIES

The Yang-Mills equations are partial differential equations in four independent
variables-the four space-time coordinates. There is one system of equations
for each choice of a certain Lie group G, which is called the gauge group, and
different choices of G can result in systems with rather different properties. So,
for example, the equations are linear when G is Abelian and nonlinear when G
is non-Abelian. In the special case in which G = U(1), they reduce to Maxwell's
equations.
In their quantized form, the Yang-Mills equations determine the behaviour
of the strong and weak forces between elementary particles in much the same
way that Maxwell's equations determine electromagnetic interactions. They are
therefore fundamental to our understanding of the nature of matter. As a sys-
tem of partial differential equations, they are also of considerable interest from
a purely mathematical point of view, most notably because they provide new
connecting links between analysis, geometry, and topology in four dimensions.
The step from electromagnetism to a general gauge theory requires two
changes to the elementary interpretation of the equations of electrodynamics.
The first is to take as the fundamental variables not the electric and magnetic
fields E and B, but the components of the 4-potential A 'modulo gauge transfor-
mations'. The second is to regard A as a connection 1-form; that is, to encode it
in a differential operator D = d + iA, which acts on wave functions taking values
in a complex line bundle. Maxwell's equations are then equations on the cur-
vature 2-form of D. This geometric description is invariant under the `external'
symmetries of the electromagnetic field, Lorentz and conformal transformations
of space-time, as well as the `internal symmetries', gauge transformations of the
potential combined with phase transformations of the wave function.
In a general gauge theory, the external symmetries are retained, while the
group of gauge transformations is enlarged by replacing the line bundle, with its
U(1) structure group, by a vector bundle with some other structure group. The
Yang-Mills equations are differential equations on the curvature of a connection
on the vector bundle.
In four dimensions, a curvature 2-form F has a conformally invariant decom-
position as the sum of a self-dual 2-form F+ and an anti-self-dual 2-form F-.
This is reflected in Maxwell's theory in the decomposition of a real source-free
electromagnetic field into a superposition of two complex fields, the self-dual
(SD) part with B = iE and the anti-self-dual (ASD) part with B = -iE. The
two parts separately satisfy Maxwell's equations. In physical terms, they are the
two circularly polarized components of the field.
While this decomposition does not extend to the fields in non-Abelian gauge
theories, it is possible to find solutions to the Yang-Mills equations for which
one half of the curvature vanishes, so that the curvature 2-form is either self-
dual or anti-self-dual. Furthermore, every connection with self-dual curvature
automatically satisfies the Yang-Mills equations. It is this self-duality condition,
rather than the full Yang-Mills equations, with which we shall principally be
concerned. Unlike the full Yang-Mills equations, the self-duality condition is an
integrable system of equations.$^1$
2.2 SPACE-TIME
The space-time of special relativity is a four-dimensional affine space. Its ge-
ometric structure is determined by the Minkowski metric, which has signature
(+ - - -), and a choice of orientation. The various relativistic wave equations,
such as Maxwell's equations in vacuo, are invariant under translations of space-
time and proper Lorentz transformations-that is, linear transformations that
preserve the metric and the orientation.
There are no real nontrivial self-dual or anti-self-dual Maxwell fields in a
space-time with Lorentz signature because the condition B = ±iE is incompat-
ible with the reality of E and B. The same is true in the Yang-Mills case if we
take the gauge group to be a real form of GL(n, C) or SL(n, C ). Thus we shall
be interested either in complex forms of the equation or in real forms on spaces
on which the metric is either positive definite, the Euclidean case, or has split
signature (+ + - -), the ultrahyperbolic case; in Euclidean and ultrahyperbolic
signature, the reality condition does not force a self-dual or anti-self-dual
2-form to vanish. It will be convenient to deal with all these cases within a common
framework by allowing the coordinates to take complex values. That is, we shall
think of the three real spaces as being embedded in complex Minkowski space
$\mathbb{CM}$.
Double-null coordinates
This device also gives us freedom to simplify the equations by using complex as
well as real coordinate transformations. The anti-self-duality condition takes a
particularly simple form in double null coordinates, in which the metric on $\mathbb{CM}$
is$^2$
\[ ds^2 = 2\left(dz\,d\tilde z - dw\,d\tilde w\right), \]
and the volume element is
\[ \nu = dw\wedge d\tilde w\wedge dz\wedge d\tilde z. \tag{2.2.1} \]
The coordinate vectors $\partial_w$, $\partial_z$, $\partial_{\tilde w}$, $\partial_{\tilde z}$ form a null tetrad at each point of
space-time. A general null tetrad is a basis of 4-vectors $\{W, Z, \tilde W, \tilde Z\}$ such that
\[ \eta(Z, \tilde Z) = -\eta(W, \tilde W) = 1, \qquad 24\,\nu(W, \tilde W, Z, \tilde Z) = 1, \tag{2.2.2} \]
where $\eta$ is the metric tensor, and such that all the other inner products vanish.
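We recover the various real spaces (or `real slices') by imposing reality
conditions on $w$, $\tilde w$, $z$, $\tilde z$, as follows.

($\mathbb{E}$) On the Euclidean real slice, $\mathbb{E}$,
\[ \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix} x^0 + ix^1 & -x^2 + ix^3 \\ x^2 + ix^3 & x^0 - ix^1 \end{pmatrix}, \]
where $x^0$, $x^1$, $x^2$, $x^3$ are real Cartesian coordinates. That is, $\mathbb{E}$ is picked out
by the reality conditions $\tilde w = -\bar w$, $\tilde z = \bar z$.
($\mathbb{M}$) On the Minkowski real slice, $\mathbb{M}$,
\[ \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix} x^0 + x^1 & x^2 - ix^3 \\ x^2 + ix^3 & x^0 - x^1 \end{pmatrix}, \]
where $x^0$, $x^1$, $x^2$, $x^3$ are real inertial coordinates. The reality conditions are
that $z$ and $\tilde z$ should be real, and that $\tilde w = \bar w$.
($\mathbb{U}$) On the ultrahyperbolic real slice, $\mathbb{U}$,
\[ \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} = \frac{1}{\sqrt 2}\begin{pmatrix} x^0 + ix^1 & x^2 - ix^3 \\ x^2 + ix^3 & x^0 - ix^1 \end{pmatrix}, \]
where $x^0$, $x^1$, $x^2$, $x^3$ are real. The reality conditions are
\[ \tilde z = \bar z, \qquad \tilde w = \bar w. \]
Another possibility is to take $w$, $z$, $\tilde w$, and $\tilde z$ to be real, which gives a different
real slice, but a metric of the same signature.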
Volume forms
It is important to keep in mind that a volume element that is real on $\mathbb{M}$ is
imaginary on $\mathbb{E}$ and $\mathbb{U}$, and conversely. This is one reason why the self-duality
equations have very different characters on the different real slices. Most of the
real reductions that we shall look at will be in $\mathbb{U}$ or $\mathbb{E}$, so rather than use different
volume elements in different examples, we shall work throughout with $\nu$, defined
by (2.2.1). This is real on $\mathbb{U}$ and $\mathbb{E}$, where it coincides with
\[ d^4x = dx^0\wedge dx^1\wedge dx^2\wedge dx^3 \]
in Cartesian coordinates, but is imaginary on $\mathbb{M}$.
General coordinates
We shall denote a general coordinate system on $\mathbb{CM}$ by $x^a$ ($a = 0, 1, 2, 3$), and
we shall adopt the range and summation conventions for the lower case indices
$a, b, c, \ldots = 0, 1, 2, 3$. In this notation,
\[ ds^2 = \eta_{ab}\,dx^a dx^b \quad\text{and}\quad \nu = \nu_{abcd}\,dx^a\wedge dx^b\wedge dx^c\wedge dx^d. \]
Here the $\eta_{ab}$s are the components of the metric tensor $\eta$, and
\[ \nu_{abcd} = \tfrac{1}{24}\Delta\,\varepsilon_{abcd}, \]
where $\Delta^2 = \det(\eta_{ab})$ and $\varepsilon_{abcd}$ is the four-dimensional alternating symbol
(defined by $\varepsilon_{abcd} = \varepsilon_{[abcd]}$ and $\varepsilon_{0123} = 1$). We shall use $\eta_{ab}$ and its inverse $\eta^{ab}$
(defined by $\eta^{ab}\eta_{bc} = \delta^a_c$) to lower and raise indices.
In many examples, we shall be more interested in the equations themselves
than in particular properties of their solutions, and it will not be necessary to
state explicitly whether we are working with complex coordinates on $\mathbb{CM}$, or with
their restrictions to one or other of these real slices. It should be understood
that the various functions involved are either holomorphic, in the complex case,
or smooth, in one or other of the real cases. Of course there are other contexts
in which the signature does play a critical role, and in which we shall have to be
more specific.
2.3 DIFFERENTIAL FORMS
We shall use the following conventions for differential forms. The components
of a $p$-form $\beta$ are its components $\beta_{ab\ldots}$ as a skew-symmetric covariant tensor,
so that $\beta(X, Y, \ldots) = \beta_{ab\ldots}X^aY^b\cdots$ for vector fields $X, Y, \ldots$. The exterior
derivative $d\beta$ has components
\[ \partial_{[a}\beta_{bc\ldots d]}, \]
and the exterior product $\beta\wedge\gamma$ of $\beta$ with a $q$-form $\gamma$ has components
\[ \beta_{[ab\ldots c}\gamma_{de\ldots k]}. \]
Here $\partial_a = \partial/\partial x^a$ and the square brackets denote skew-symmetrization.$^3$ With
these definitions,

and for a vector field X,

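\[ \mathcal{L}_X\beta = X\lrcorner\,d\beta + d(X\lrcorner\,\beta), \]
where the contraction $X\lrcorner\,\gamma$ of $X$ with a $q$-form $\gamma$ is the $(q - 1)$-form with
components
\[ (X\lrcorner\,\gamma)_{b\ldots c} = qX^a\gamma_{ab\ldots c}. \]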
The Hodge star operator
In $n$ dimensions, the dual of a $p$-form $\beta$ is the $(n - p)$-form $*\beta$ with components
\[ \beta^*_{ab\ldots c} = \frac{1}{(n - p)!}\,\Delta\,\varepsilon_{ab\ldots c}{}^{de\ldots f}\beta_{de\ldots f}. \]
The $*$-operator has particular importance for 2-forms: if $\beta$ is a 2-form with
components $\beta_{ab}$, then $*\beta$ is also a 2-form, with components
\[ \beta^*_{ab} = \tfrac{1}{2}\Delta\,\varepsilon_{abcd}\,\eta^{ce}\eta^{df}\beta_{ef}. \]
In this case the action of $*$ is conformally invariant; it is also an involution, that is,
$*^2 = 1$. Thus the space of 2-forms decomposes into the direct sum of eigenspaces
of $*$ with eigenvalues $\pm 1$. We say that $\beta$ is self-dual (SD) whenever $*\beta = \beta$ or
anti-self-dual (ASD) whenever $*\beta = -\beta$. The three forms
\[ \alpha = dw\wedge dz, \quad \tilde\alpha = d\tilde w\wedge d\tilde z, \quad\text{and}\quad \omega = dw\wedge d\tilde w - dz\wedge d\tilde z \tag{2.3.1} \]
span the space of SD 2-forms; and
\[ dw\wedge d\tilde z, \quad d\tilde w\wedge dz, \quad\text{and}\quad dw\wedge d\tilde w + dz\wedge d\tilde z \tag{2.3.2} \]
span the space of ASD forms. Note that $\omega\wedge\omega = -2\nu$, and that $\omega$ is the Kahler
form on the Euclidean real slice, multiplied by $2i$.
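The stated normalization, and the fact that SD and ASD forms wedge to zero, follow from a short computation. The sketch uses $\omega = dw\wedge d\tilde w - dz\wedge d\tilde z$ and $\nu = dw\wedge d\tilde w\wedge dz\wedge d\tilde z$, where the tildes are our reading of the garbled coordinate labels, together with the fact that 2-forms commute under the exterior product.

```latex
\begin{align*}
\omega\wedge\omega
  &= -\,dw\wedge d\tilde w\wedge dz\wedge d\tilde z
     - dz\wedge d\tilde z\wedge dw\wedge d\tilde w
   = -2\nu, \\
\omega\wedge\bigl(dw\wedge d\tilde w + dz\wedge d\tilde z\bigr)
  &= dw\wedge d\tilde w\wedge dz\wedge d\tilde z
     - dz\wedge d\tilde z\wedge dw\wedge d\tilde w
   = 0 .
\end{align*}
```

The second line is one instance of the general statement that the exterior product of an SD 2-form with an ASD 2-form vanishes.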

Example 2.3.1 In Euclidean space, $\Delta = 1$ in a positively oriented Cartesian
coordinate system. If $\beta$ has components
\[ (\beta_{ab}) = \begin{pmatrix} 0 & X_1 & X_2 & X_3 \\ -X_1 & 0 & Y_3 & -Y_2 \\ -X_2 & -Y_3 & 0 & Y_1 \\ -X_3 & Y_2 & -Y_1 & 0 \end{pmatrix} \]
in these coordinates, then $*\beta$ is given by interchanging $X = (X_1, X_2, X_3)$ and
$Y = (Y_1, Y_2, Y_3)$, and $\beta$ is SD whenever $X = Y$ and ASD whenever $X = -Y$.
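The interchange of $X$ and $Y$ can be confirmed by brute force. The sketch below assumes the Euclidean formula $\beta^*_{ab} = \tfrac12\varepsilon_{abcd}\beta_{cd}$ (that is, $\Delta = 1$ and all indices lowered with the identity metric); the helper names are hypothetical.

```python
from itertools import product


def levi_civita(indices):
    """Alternating symbol: +1 or -1 for even or odd permutations, else 0."""
    if len(set(indices)) != len(indices):
        return 0
    inversions = sum(1 for i in range(len(indices))
                     for j in range(i + 1, len(indices))
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1


def two_form(X, Y):
    """Component matrix of Example 2.3.1 built from X and Y."""
    X1, X2, X3 = X
    Y1, Y2, Y3 = Y
    return [[0, X1, X2, X3],
            [-X1, 0, Y3, -Y2],
            [-X2, -Y3, 0, Y1],
            [-X3, Y2, -Y1, 0]]


def hodge_star(beta):
    """Euclidean dual with Delta = 1: (1/2) eps_{abcd} beta_{cd}."""
    star = [[0] * 4 for _ in range(4)]
    for a, b, c, d in product(range(4), repeat=4):
        star[a][b] += levi_civita((a, b, c, d)) * beta[c][d] / 2
    return star


if __name__ == "__main__":
    beta = two_form((1, 2, 3), (4, 5, 6))
    print(hodge_star(beta) == two_form((4, 5, 6), (1, 2, 3)))
```

Applying `hodge_star` twice returns the original matrix, which is the statement $*^2 = 1$ in this signature.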
Example 2.3.2 In Minkowski space, $\Delta = i$ in a positively oriented inertial
coordinate system. If $F_{ab}$ is an electromagnetic field tensor, then
\[ (F_{ab}) = i\begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix} \]
in these coordinates, where $E$ and $B$ are the electric and magnetic fields. The
dual field $*F$ has components
\[ (F^*_{ab}) = \begin{pmatrix} 0 & B_1 & B_2 & B_3 \\ -B_1 & 0 & E_3 & -E_2 \\ -B_2 & -E_3 & 0 & E_1 \\ -B_3 & E_2 & -E_1 & 0 \end{pmatrix}. \]
The field is SD whenever $B = iE$ and ASD whenever $B = -iE$. (If the
Lorentzian volume element is used instead of $\nu$ to define $*$, then $*$ is a real
operator in real Minkowski space, but its square is $-1$, its eigenvalues are $\pm i$,
and again it has no real eigenvectors.)
There are no real SD 2-forms in $\mathbb{M}$, but in Euclidean and ultrahyperbolic
signature the spaces of real SD and ASD forms are both three-dimensional. For
example, the forms in equations (2.3.1) and (2.3.2) are real when the double-null
coordinates are real (an ultrahyperbolic real slice) and $dw\wedge d\tilde z - d\tilde w\wedge dz$ is real
on the Euclidean real slice.

Decomposition of the exterior derivative

Given a double-null coordinate system, we can decompose a general 1-form
\[ \beta = \beta_w\,dw + \beta_z\,dz + \beta_{\tilde w}\,d\tilde w + \beta_{\tilde z}\,d\tilde z \]
into the sum $\beta = \beta_{(1,0)} + \beta_{(0,1)}$ of a (1,0)-part $\beta_{(1,0)} = \beta_w\,dw + \beta_z\,dz$ and a
(0,1)-part $\beta_{(0,1)} = \beta_{\tilde w}\,d\tilde w + \beta_{\tilde z}\,d\tilde z$. The decomposition depends on the choice of
coordinates, but is invariant under transformations which preserve the foliation
by surfaces of constant $w$, $z$ and the foliation by surfaces of constant $\tilde w$, $\tilde z$. We
note that
\[ *\beta = \left(\beta_{(1,0)} - \beta_{(0,1)}\right)\wedge\omega. \]
The decomposition extends to forms of higher degree: a $k$-form is said to be
of type $(p, q)$ (relative to the choice of foliations), where $p + q = k$, if it is a
combination of exterior products of $p$ (1,0)-forms and $q$ (0,1)-forms. Every
$k$-form can be written uniquely as a sum of forms of types $(k, 0)$, $(k - 1, 1)$, ...,
$(0, k)$.
The exterior derivative similarly decomposes into a sum $d = \partial + \tilde\partial$ of two
operators
\[ \partial = dw\,\partial_w + dz\,\partial_z, \qquad \tilde\partial = d\tilde w\,\partial_{\tilde w} + d\tilde z\,\partial_{\tilde z}. \]
On $\mathbb{E}$ and $\mathbb{U}$, the choice of coordinates determines a complex structure in which
$w$ and $z$ are holomorphic coordinates, and the decomposition of $d$ is the standard
decomposition into the sum of the operators $\partial$ and $\bar\partial$ (see §9.5).

Null 2-planes
We say that a 2-plane in space-time is null (more properly, totally null) if
$\eta(A, B) = 0$ for every pair of tangent vectors $A$, $B$. With each null 2-plane
$\Pi$ we associate a tangent bivector $\pi = A\wedge B$ with components $\pi^{ab} = A^{[a}B^{b]}$,
where $A$ and $B$ are independent tangent vectors.$^4$ The tangent bivector
determines the tangent space to the 2-plane, and is determined by it up to a nonzero
scalar multiple.
Lemma 2.3.3 If $\Pi$ is a null 2-plane, then $\pi_{ab}\pi^{ab} = 0$, and $\pi_{ab}\,dx^a\wedge dx^b$ is
either SD or ASD.

Proof We can characterize $\pi$, up to a nonzero scalar factor, by $\pi^*_{ab}P^a = 0$ for
every $P \in \Pi$. On the other hand $\pi_{ab} = A_{[a}B_{b]}$, where $A$ and $B$ span $\Pi$. Since $A$
and $B$ are null and orthogonal, $\pi_{ab}P^a = 0$ whenever $P$ is a linear combination of
$A$ and $B$. Hence $\pi = \mu\pi^*$ for some $\mu \neq 0$. But the eigenvalues of the $*$ operator
are $\pm 1$. Therefore either $\pi_{ab} = \pi^*_{ab}$ or $\pi_{ab} = -\pi^*_{ab}$. Since $\pi_{ab} = A_{[a}B_{b]}$, we have
$\pi_{ab}\pi^{ab} = 0$.
We call $\Pi$ an $\alpha$-plane whenever $\pi$ is self-dual and a $\beta$-plane whenever $\pi$ is
anti-self-dual. In double-null coordinates, the surfaces of constant $w$, $z$ and the surfaces
of constant $\tilde w$, $\tilde z$ are $\alpha$-planes.
Every $\alpha$-plane through the origin, apart from the plane $w = z = 0$, has a
unique tangent bivector of the form
\[ \pi^{ab} = L^{[a}M^{b]}, \]
where
\[ L = \partial_w - \zeta\partial_{\tilde z}, \qquad M = \partial_z - \zeta\partial_{\tilde w}, \tag{2.3.3} \]
for some $\zeta \in \mathbb{C}$. Conversely, for each $\zeta \in \mathbb{C}$, the span of $L$ and $M$ is an $\alpha$-plane
through the origin. If we include the point at infinity by mapping $\zeta = \infty$ to
the space spanned by $\partial_{\tilde w}$ and $\partial_{\tilde z}$, then we obtain a one-to-one correspondence
$\Pi \mapsto \zeta$ between $\alpha$-planes through the origin and points of the Riemann sphere.
In the twistor construction, a general $\alpha$-plane, not necessarily passing through
the origin, is labelled by three complex coordinates: the parameter $\zeta$, which
determines the tangent space, together with $\zeta w + \tilde z$ and $\zeta z + \tilde w$, which are
constant over the $\alpha$-plane. The entire space of $\alpha$-planes, including those at
infinity, is $\mathbb{CP}^3$. We shall look at this in detail in Chapter 10.
A 2-form $\gamma$ is ASD whenever it is orthogonal to the SD bivectors, that is,
whenever
$$ \gamma(\partial_w, \partial_z) = \gamma(\partial_{\tilde w}, \partial_{\tilde z}) = \gamma(\partial_w, \partial_{\tilde w}) - \gamma(\partial_z, \partial_{\tilde z}) = 0 , $$
and SD whenever it is orthogonal to the ASD bivectors, that is, whenever
$$ \gamma(\partial_w, \partial_{\tilde z}) = \gamma(\partial_z, \partial_{\tilde w}) = \gamma(\partial_w, \partial_{\tilde w}) + \gamma(\partial_z, \partial_{\tilde z}) = 0 . $$
Thus the eigenspaces of $*$ are orthogonal. The ASD condition can be expressed
more compactly as the condition
$$ \gamma(L, M) = 0 , $$
identically in $\zeta$, where $L$ and $M$ are defined by (2.3.3). As $\zeta$ varies over the
Riemann sphere, $L \wedge M$ varies over all the tangent bivectors to the $\alpha$-planes
through a point, so the ASD condition is that $\gamma$ should vanish on restriction to
$\alpha$-planes. Later on, we shall interpret $\zeta$ as the `spectral parameter'.
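The statement that ASD 2-forms vanish on $\alpha$-planes can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the metric $ds^2 = 2(dz\,d\tilde z - dw\,d\tilde w)$, the vectors $L = \partial_w - \zeta\partial_{\tilde z}$ and $M = \partial_z - \zeta\partial_{\tilde w}$, and the coordinate ordering $(w, z, \tilde w, \tilde z)$. The helper names (`eta`, `wedge`, `evaluate`, `check`) are hypothetical.

```python
# Coordinates are ordered (w, z, w~, z~); this ordering is a choice made for the sketch.
W, Z, WT, ZT = range(4)

def eta(a, b):
    """Metric ds^2 = 2(dz dz~ - dw dw~) evaluated on two vectors."""
    return a[Z]*b[ZT] + a[ZT]*b[Z] - a[W]*b[WT] - a[WT]*b[W]

def wedge(i, j):
    """Components of the basis 2-form dx^i ^ dx^j as a skew 4x4 array."""
    g = [[0]*4 for _ in range(4)]
    g[i][j], g[j][i] = 1, -1
    return g

def add(g, h, sign=1):
    """Entrywise g + sign*h."""
    return [[g[i][j] + sign*h[i][j] for j in range(4)] for i in range(4)]

def evaluate(g, a, b):
    """gamma(a, b) = sum over i, j of gamma_ij a^i b^j."""
    return sum(g[i][j]*a[i]*b[j] for i in range(4) for j in range(4))

def lax_vectors(zeta):
    L = [1, 0, 0, -zeta]   # d_w - zeta d_z~
    M = [0, 1, -zeta, 0]   # d_z - zeta d_w~
    return L, M

# Bases read off from the ASD and SD conditions stated in the text.
asd_forms = [wedge(W, ZT), wedge(Z, WT), add(wedge(W, WT), wedge(Z, ZT))]
sd_forms = [wedge(W, Z), wedge(WT, ZT), add(wedge(W, WT), wedge(Z, ZT), -1)]

def check(zeta):
    L, M = lax_vectors(zeta)
    null = all(eta(a, b) == 0 for a, b in [(L, L), (L, M), (M, M)])
    asd_vanish = all(evaluate(g, L, M) == 0 for g in asd_forms)
    sd_values = [evaluate(g, L, M) for g in sd_forms]
    return null, asd_vanish, sd_values
```

For any $\zeta$, the span of $L$ and $M$ is totally null and every ASD basis form vanishes on it, whereas the SD basis forms take the values $1$, $-\zeta^2$ and $-2\zeta$, which do not all vanish.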
2.4 CONFORMAL TRANSFORMATIONS AND COMPACTIFIED SPACE-TIME
Conformal transformations are of special significance in four-dimensional gauge
theories because they preserve the tensor field $\nu_{ab}{}^{cd}$ and hence the duality
operator on 2-forms. Therefore the ASD condition on the curvature of a connection
is conformally invariant.
Proper conformal transformations
A proper conformal transformation $\rho$ of real or complex space-time is characterized
by the conditions $\rho^*\eta = \Omega^2\eta$ and $\rho^*\nu = \Omega^4\nu$ for some function $\Omega$. Here $\rho$
denotes a mapping of space-time to itself and $\rho^*$ denotes the pull-back action
on covariant tensors. At the infinitesimal level, conformal transformations are
given by conformal Killing vectors, that is, by vector fields $K$ such that
$$ \partial_{(a}K_{b)} = \tfrac{1}{4}\eta_{ab}\,\partial_cK^c . \tag{2.4.1} $$
This is the condition that $\mathcal{L}_K\eta_{ab} \propto \eta_{ab}$, or equivalently, that the flow of $K$ should
be a one-parameter family of conformal transformations. When $\partial_cK^c = 0$, $K$ is
a Killing vector, and the transformations are isometries.
The space of conformal Killing vectors is fifteen-dimensional, since the general
solution to (2.4.1) is
$$ K^a = T^a + L^{ab}x_b + Rx^a + x^bx_bS^a - 2S^bx_bx^a , $$
where the coefficients are constant, with $L^{ab} = -L^{ba}$. The components of $T$
label the translations (four parameters), the components of $L$ label the rotations
and Lorentz transformations (six parameters), $R$ (one parameter) labels the
dilatations, and the components of $S$ label the special conformal transformations
(four parameters). See Penrose and Rindler (1986), p. 83.
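The general solution can be tested against the conformal Killing equation numerically. The sketch below is illustrative only. It assumes complex Cartesian coordinates in which $\eta_{ab} = \delta_{ab}$ (so index positions can be ignored), takes the equation in the form $\partial_{(a}K_{b)} = \tfrac14\eta_{ab}\partial_cK^c$, and uses central differences, which are exact up to rounding because $K$ is quadratic in $x$. The parameter values and function names are hypothetical.

```python
def conformal_killing(T, Lmat, R, S):
    """K(x) = T + L x + R x + (x.x) S - 2 (S.x) x, with eta_ab = delta_ab."""
    def K(x):
        xx = sum(c*c for c in x)
        Sx = sum(s*c for s, c in zip(S, x))
        return [T[a] + sum(Lmat[a][b]*x[b] for b in range(4))
                + R*x[a] + xx*S[a] - 2*Sx*x[a] for a in range(4)]
    return K

def jacobian(K, x, h=1e-3):
    """J[a][b] = dK_a/dx^b by central differences."""
    J = [[0.0]*4 for _ in range(4)]
    for b in range(4):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        Kp, Km = K(xp), K(xm)
        for a in range(4):
            J[a][b] = (Kp[a] - Km[a]) / (2*h)
    return J

def killing_defect(K, x):
    """Largest entry of d_(a K_b) - (1/4) delta_ab d_c K^c at the point x."""
    J = jacobian(K, x)
    div = sum(J[c][c] for c in range(4))
    return max(abs(0.5*(J[a][b] + J[b][a]) - 0.25*(a == b)*div)
               for a in range(4) for b in range(4))

# Arbitrary sample parameters: 4 + 6 + 1 + 4 = 15 constants in all.
T = [1.0, -2.0, 0.5, 3.0]
Lmat = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]  # skew
R = 0.7
S = [0.3, -0.1, 0.2, 0.4]

K = conformal_killing(T, Lmat, R, S)
defect = killing_defect(K, [0.2, -0.5, 1.1, 0.3])
```

The defect is zero to rounding error because the symmetrized derivative of $K$ reduces to $(R - 2S^bx_b)\eta_{ab}$, while the divergence is $4(R - 2S^bx_b)$.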

The complex conformal group

The only conformal transformations that are defined globally on $\mathbb{CM}$, or on
one of its real slices, are combinations of isometries and dilatations (constant
rescalings of the coordinates). More general examples, such as combinations of
inversions and reflections, map a light-cone or a null hyperplane to infinity. In
order to have a group action on space-time, we adjoin a light cone at infinity to
obtain compactified Minkowski space, which we denote by $\mathbb{CM}^\#$. This has a conformal
structure and an orientation, and the proper conformal transformations
$\mathbb{CM}^\# \to \mathbb{CM}^\#$ form a fifteen-dimensional group, which we call the complex conformal
group (or simply the `conformal group' when `complex' is obvious from
the context). Every proper conformal transformation $\rho: U \to \rho(U)$, where $U$ is
open in $\mathbb{CM}$ or in one of its real slices, extends uniquely to a global transformation
$\mathbb{CM}^\# \to \mathbb{CM}^\#$, and every conformal Killing vector on $\mathbb{CM}$ extends to
$\mathbb{CM}^\#$, and determines an element of the Lie algebra of the conformal group.
The complex conformal group is isomorphic to the projective general linear
group $PGL(4, \mathbb{C}) = GL(4, \mathbb{C})/\mathbb{C}^\times$, by a construction that is central to twistor
theory, in which $\mathbb{CM}^\#$ is identified with the Klein quadric in $\mathbb{CP}_5$. We shall
look at the underlying projective geometry in more detail in Chapter 9. For the
moment, all we shall need is the following explicit description of the isomorphism.
Let $x = (x^{\alpha\beta})$, $\alpha, \beta = 0, 1, 2, 3$, be a $4 \times 4$ skew-symmetric complex matrix
with zero determinant. Provided that $x^{23} \neq 0$, $x$ is a nonzero complex multiple
of$^5$
$$ \begin{pmatrix} 0 & s & -w & \tilde z \\ -s & 0 & -z & \tilde w \\ w & z & 0 & 1 \\ -\tilde z & -\tilde w & -1 & 0 \end{pmatrix} $$
for some $w, z, \tilde w, \tilde z$, where $s = z\tilde z - w\tilde w$. Moreover, we have
$$ \varepsilon_{\alpha\beta\gamma\delta}\,dx^{\alpha\beta}dx^{\gamma\delta} = \mu\,(dz\,d\tilde z - dw\,d\tilde w) , $$
where $\varepsilon_{\alpha\beta\gamma\delta}$ denotes the four-dimensional alternating symbol and $\mu$ is a scalar.

It follows that any transformation $x \mapsto \rho x\rho^t$, where $\rho \in GL(4, \mathbb{C})$, induces a
conformal transformation of space-time, with the multiples of the identity acting
trivially, and in fact every proper conformal transformation arises in this way.
Since the nonzero multiples of $\rho$ all induce the same transformation, there is no
loss of generality in taking $\det\rho = 1$.
When $x^{23} = 0$, some or all of the space-time coordinates are infinite. In the
projective-geometric interpretation, we can append these `points at infinity' by
regarding the six entries $x^{\alpha\beta}$, $\alpha < \beta$, as homogeneous coordinates on $\mathbb{CP}_5$, and
defining the conformal compactification of $\mathbb{CM}$ to be the Klein quadric
$$ \mathbb{CM}^\# = \{\varepsilon_{\alpha\beta\gamma\delta}x^{\alpha\beta}x^{\gamma\delta} = 0\} \subset \mathbb{CP}_5 . $$
The conformal group acts globally on $\mathbb{CM}^\#$. The points at infinity are those for
which $x^{23} = 0$, the remaining points are in one-to-one correspondence with the
points of $\mathbb{CM}$, and the conformal structure is determined by identifying the null
geodesics in $\mathbb{CM}^\#$ with the projective lines of $\mathbb{CP}_5$ that lie in $\mathbb{CM}^\#$.
Real forms
Various real forms of the conformal group are obtained by requiring that the
transformation should preserve one or other of the real slices.
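($\mathbb{E}$) The Euclidean slice is invariant under $x \mapsto \rho x\rho^t$ if
$$ \bar\rho \begin{pmatrix} \epsilon & 0 \\ 0 & \epsilon \end{pmatrix} = \begin{pmatrix} \epsilon & 0 \\ 0 & \epsilon \end{pmatrix} \rho , \qquad \text{where } \epsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , $$
that is, if $\rho \in GL(2, \mathbb{H})$.
($\mathbb{M}$) The Minkowski slice is invariant if
$$ \rho \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \bar\rho^t = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , $$
where the matrix is in $2 \times 2$ block form; that is, if $\rho \in U(2,2)$.
($\mathbb{U}$) The ultrahyperbolic slice on which $z$, $w$, $\tilde z$ and $\tilde w$ are real is invariant when
$\rho \in GL(4, \mathbb{R})$.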
At the infinitesimal level, each $4 \times 4$ matrix $A$, that is each element of $gl(4, \mathbb{C})$,
generates a conformal Killing vector $K$, which can be found by equating $\delta x$ to a
scalar multiple of $Ax + xA^t$. If we decompose $A$ into $2 \times 2$ blocks by writing
$$ A = \begin{pmatrix} \Lambda & \tau \\ \sigma & -\tilde\Lambda^t \end{pmatrix} , $$
then the entries in $\tau$ generate translations, the entries in $\sigma$ generate special
conformal transformations, and $\Lambda$ and $\tilde\Lambda$ are the left and right components of
an infinitesimal rotation (see below), together with a dilatation. When $A$ has a
one in the $\alpha\beta$ entry and zeros elsewhere, $K$ is given by Table 2.1 (note that the
conformal Killing vectors labelled by $\alpha\beta = 00, 11, 22, 33$ sum to zero, so there
are only fifteen independent generators).$^6$
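Table 2.1 The generators of the conformal group
$$
\begin{array}{llll}
\alpha\beta & \text{Generator} & \alpha\beta & \text{Generator} \\
\hline
00 & w\partial_w + \tilde z\partial_{\tilde z} & 02 & \partial_{\tilde z} \\
10 & \tilde z\partial_{\tilde w} + w\partial_z & 12 & \partial_{\tilde w} \\
01 & z\partial_w + \tilde w\partial_{\tilde z} & 03 & \partial_w \\
11 & z\partial_z + \tilde w\partial_{\tilde w} & 13 & \partial_z \\
20 & -\tilde ww\partial_z - \tilde z^2\partial_{\tilde z} - \tilde zw\partial_w - \tilde z\tilde w\partial_{\tilde w} & 22 & -\tilde w\partial_{\tilde w} - \tilde z\partial_{\tilde z} \\
30 & -z\tilde z\partial_{\tilde w} - w\tilde z\partial_{\tilde z} - wz\partial_z - w^2\partial_w & 32 & -z\partial_{\tilde w} - w\partial_{\tilde z} \\
21 & -z\tilde z\partial_w - \tilde w^2\partial_{\tilde w} - \tilde w\tilde z\partial_{\tilde z} - \tilde wz\partial_z & 23 & -\tilde z\partial_w - \tilde w\partial_z \\
31 & -\tilde ww\partial_{\tilde z} - z\tilde w\partial_{\tilde w} - zw\partial_w - z^2\partial_z & 33 & -z\partial_z - w\partial_w
\end{array}
$$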
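The entries of Table 2.1 can be regenerated from the rule $\delta x \propto Ax + xA^t$. The sketch below is an illustration under stated assumptions, not the authors' computation: the scalar multiple is taken to be 1, the matrix $x$ is normalized so that $x^{23} = 1$, and the coordinates are read off as $w = -x^{02}/x^{23}$, $z = -x^{12}/x^{23}$, $\tilde w = x^{13}/x^{23}$, $\tilde z = x^{03}/x^{23}$. Each generator is returned as its components $(\delta w, \delta z, \delta\tilde w, \delta\tilde z)$. All function names are hypothetical.

```python
def x_matrix(w, z, wt, zt):
    """Skew matrix x for the point (w, z, w~, z~), normalized so that x[2][3] = 1."""
    s = z*zt - w*wt
    return [[0, s, -w, zt],
            [-s, 0, -z, wt],
            [w, z, 0, 1],
            [-zt, -wt, -1, 0]]

def generator(alpha, beta, point):
    """Components (dw, dz, dw~, dz~) of K when A has a one in the (alpha, beta) entry."""
    x = x_matrix(*point)
    dx = [[0]*4 for _ in range(4)]
    for d in range(4):
        dx[alpha][d] += x[beta][d]   # A x: row alpha receives row beta of x
        dx[d][alpha] += x[d][beta]   # x A^t: column alpha receives column beta of x
    c = dx[2][3]                     # change in the normalizing entry x^{23}
    # Variation of the ratios x^{ij}/x^{23}, with the signs used to read off coordinates.
    dw = -(dx[0][2] - x[0][2]*c)
    dz = -(dx[1][2] - x[1][2]*c)
    dwt = dx[1][3] - x[1][3]*c
    dzt = dx[0][3] - x[0][3]*c
    return (dw, dz, dwt, dzt)

def table_2_1(point):
    """Table 2.1 as components (dw, dz, dw~, dz~), keyed by (alpha, beta)."""
    w, z, wt, zt = point
    return {
        (0, 0): (w, 0, 0, zt),            (0, 2): (0, 0, 0, 1),
        (1, 0): (0, w, zt, 0),            (1, 2): (0, 0, 1, 0),
        (0, 1): (z, 0, 0, wt),            (0, 3): (1, 0, 0, 0),
        (1, 1): (0, z, wt, 0),            (1, 3): (0, 1, 0, 0),
        (2, 0): (-zt*w, -wt*w, -zt*wt, -zt*zt), (2, 2): (0, 0, -wt, -zt),
        (3, 0): (-w*w, -w*z, -z*zt, -w*zt),     (3, 2): (0, 0, -z, -w),
        (2, 1): (-z*zt, -wt*z, -wt*wt, -wt*zt), (2, 3): (-zt, -wt, 0, 0),
        (3, 1): (-z*w, -z*z, -z*wt, -wt*w),     (3, 3): (-w, -z, 0, 0),
    }

point = (2, 3, 5, 7)   # integer sample point, so all arithmetic is exact
table = table_2_1(point)
mismatches = [key for key, value in table.items()
              if generator(key[0], key[1], point) != value]
diagonal_sum = tuple(sum(generator(a, a, point)[i] for a in range(4)) for i in range(4))
```

At the sample point every computed generator agrees with the tabulated one, and the four diagonal generators sum to zero, which is why only fifteen of the sixteen are independent.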

Left and right rotations

Every proper isometry of space-time, that is, every linear transformation that
preserves $\eta$ and $\nu$, is a combination of a translation and a (complex,
four-dimensional) rotation. In a complex Cartesian coordinate system,
$$ ds^2 = (dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2 $$
and $\nu = dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$. In these coordinates, the rotations are given by the
complex proper orthogonal matrices, and so we have the standard isomorphism
between the rotation group and $SO(4, \mathbb{C})$. A choice of double-null coordinates,
on the other hand, leads to a different isomorphism. It reveals a central feature
of four-dimensional geometry, that every rotation can be represented by a pair
$(\Lambda, \tilde\Lambda) \in SL(2, \mathbb{C}) \times SL(2, \mathbb{C})$, uniquely up to the identification of $(\Lambda, \tilde\Lambda)$ with
$(-\Lambda, -\tilde\Lambda)$. Thus$^7$
$$ SO(4, \mathbb{C}) = (SL(2, \mathbb{C}) \times SL(2, \mathbb{C}))/\mathbb{Z}_2 . $$
The two components $\Lambda$, the left rotation, and $\tilde\Lambda$, the right rotation, act linearly
on space-time by
$$ \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} \mapsto \Lambda \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} , \qquad \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} \mapsto \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} \tilde\Lambda^t . $$
Because they leave invariant the determinant $z\tilde z - w\tilde w$, they are isometries, and,
because $SL(2, \mathbb{C})$ is connected, they lie in the identity component of the complex
rotation group and therefore preserve $\nu$. Clearly left and right rotations
commute.
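The two facts used here, invariance of the determinant and commutativity of left and right rotations, follow from $\det(AB) = \det A\,\det B$ and associativity of matrix multiplication. The sketch below illustrates them with exact integer arithmetic. It assumes the point is encoded as the $2 \times 2$ matrix with rows $(\tilde z, w)$ and $(\tilde w, z)$, and that the right rotation acts by multiplication on the right by the transpose; the two checks do not depend on that last convention. The sample matrices and function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0]*a[1][1] - a[0][1]*a[1][0]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def neg(a):
    return [[-e for e in row] for row in a]

def point_matrix(w, z, wt, zt):
    """Matrix whose determinant is z z~ - w w~."""
    return [[zt, w], [wt, z]]

def left(lam, X):
    """Left rotation: X -> lam X."""
    return matmul(lam, X)

def right(lam_tilde, X):
    """Right rotation: X -> X lam_tilde^t (convention assumed for this sketch)."""
    return matmul(X, transpose(lam_tilde))

X = point_matrix(2, 3, 5, 7)
lam = [[2, 1], [3, 2]]         # determinant 1
lam_tilde = [[1, 4], [1, 5]]   # determinant 1

rotated_a = left(lam, right(lam_tilde, X))
rotated_b = right(lam_tilde, left(lam, X))
rotated_c = left(neg(lam), right(neg(lam_tilde), X))
```

Here `rotated_a` and `rotated_b` coincide, and `rotated_c` shows that $(\Lambda, \tilde\Lambda)$ and $(-\Lambda, -\tilde\Lambda)$ give the same rotation, which is the origin of the quotient by $\mathbb{Z}_2$.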
The Lie algebra of the complex four-dimensional rotation group is the space
of linear maps $\kappa: \mathbb{CM} \to \mathbb{CM}$ such that
$$ \eta(\kappa A, B) + \eta(A, \kappa B) = 0 $$
for all vectors $A, B$. Thus $(A, B) \mapsto \eta(A, \kappa B)$ defines a 2-form, with components
$\kappa_{ab} = \eta_{ac}\kappa^c{}_b$. At the Lie algebra level, the decomposition of $\kappa$ into left and right
rotations reflects the decomposition of this 2-form into its SD and ASD parts:
an element of the Lie algebra is uniquely the sum of left and right infinitesimal
rotations, and the corresponding 2-forms are, respectively, ASD and SD.
The action on null two-planes
Left and right rotations can also be characterized by the way in which they act on
the null two-planes in space-time. Left rotations leave invariant $\alpha$-planes through
the origin and right rotations leave invariant $\beta$-planes through the origin. A right
rotation
$$ \tilde\Lambda = \begin{pmatrix} a & b \\ c & d \end{pmatrix} $$
acts on $L$ and $M$ by
$$ L \mapsto (a + \zeta c)\partial_w - (b + \zeta d)\partial_{\tilde z}, \qquad M \mapsto (a + \zeta c)\partial_z - (b + \zeta d)\partial_{\tilde w} , $$
and so maps $\Pi_\zeta$ to $\Pi_{\zeta'}$, where $\zeta' = (b + \zeta d)/(a + \zeta c)$. Thus right rotations act on
the Riemann sphere of $\alpha$-planes through the origin by Möbius transformations.
The flow along a conformal Killing vector $K$ moves $\alpha$-planes into $\alpha$-planes
and $\beta$-planes into $\beta$-planes. We say that $K$ is self-dual (anti-self-dual) if the
2-form $\partial_{[a}K_{b]}$ is everywhere SD (ASD). The flow along an ASD conformal Killing
vector maps $\alpha$-planes to parallel $\alpha$-planes, and the flow along a SD conformal
Killing vector maps $\beta$-planes to parallel $\beta$-planes.

2.5 BUNDLES, CONNECTIONS, AND CURVATURE

The following is not intended to be a complete treatment, but simply a brief
informal sketch, which should serve to establish elementary terminology and
to highlight some key definitions. It should be sufficient background for the
first eight chapters. In the second part of the book, we shall assume rather
more extensive familiarity with the theory of bundles, particularly holomorphic
bundles.
Vector bundles
A rank-$n$ vector bundle $E$ on a manifold $M$ is a family of $n$-dimensional complex
vector spaces $E_x$, labelled by $x \in M$, and varying smoothly or holomorphically
with $x$, according to the context. More precisely, it is a manifold $E$ (the total
space) together with a projection map $\pi: E \to M$, such that each fibre $\pi^{-1}(x)$
has the structure of an $n$-dimensional vector space. The projection is required
to be locally trivial, in the sense that each $x \in M$ has a neighbourhood $U$ such
that $E_U = \pi^{-1}(U)$ can be represented as the product $U \times \mathbb{R}^n$ or $U \times \mathbb{C}^n$.
A (local) section of $E$ is a map $s: U \subset M \to E$ such that $\pi(s(x)) = x$ for
every $x \in U$; it is global if $U = M$. A $p$-form with values in $E$ is a skew-symmetric
multilinear map $\alpha$ that assigns a section $\alpha(X, Y, \ldots, Z)$ to each ordered set of $p$
vector fields $X, Y, \ldots, Z$ on $M$. If $M$ is a real manifold, then the fibres can be
either real or complex vector spaces, and the maps are required to be smooth. If
$M$ is complex, then the fibres must be complex vector spaces and the maps are
required to be holomorphic; in this case, we call $E$ a holomorphic vector bundle.
When $n = 1$ and the fibres are complex, we call $E$ a line bundle.
We denote the space of sections of $E$ over $U$ by $\Gamma(U, E)$, or simply by $\Gamma(E)$
when there is no possibility of ambiguity. A key fact is that when $M$ is a compact
complex manifold and $E$ is holomorphic, $\Gamma(M, E)$ is a finite-dimensional vector
space (see, for example, Wells 1973, p. 156).
Many of the elementary constructions of linear algebra extend in an obvious
way to vector bundles. In particular, if $E$ and $E'$ are vector bundles over $M$,
then their direct sum and tensor product are the bundles $E \oplus E'$ and $E \otimes E'$
with fibres $E_x \oplus E'_x$ and $E_x \otimes E'_x$, respectively.

Local trivializations
A local trivialization of a vector bundle is the same as a choice of a local frame
field, that is, a family of local sections $e_1, \ldots, e_n$ such that $\{e_i(x)\}$ is a basis in
$E_x$ at each $x$. There is no canonical choice, and in general it is not possible to
extend a local frame field to the whole of $M$ because of topological or analytic
obstructions. Given a local frame field, we represent a local section by a column
vector with components $s_1, \ldots, s_n$ by writing $s = s_je_j$, and we represent a
$p$-form with values in $E$ by a column vector of $p$-forms in the ordinary sense.
We shall not always make a careful distinction between sections and their local
representatives.
Two local frame fields are related by $\hat e_j = e_ig_{ij}$ ($i, j, \ldots = 1, \ldots, n$, with
summation), and the corresponding vector representatives of a section $s$ are
related by $s_i = g_{ij}\hat s_j$. The transition function or patching matrix $g = (g_{ij})$ takes
values in the $n \times n$ matrices and is defined on the overlap of the domains of
the two frame fields. It may be that the bundle has some additional structure,
such as a Hermitian metric in each fibre. In that case, the choice of basis can be
restricted and the transition functions will take values in some subgroup of the
general linear group, for example $U(n)$ in the Hermitian case. The subgroup is
called the structure group of the bundle (when there is no additional structure,
the structure group is $GL(n)$).
There are topological obstructions to the existence of a global trivialization of
a general vector bundle. However, the vector bundles that we shall encounter in
gauge theories in the next few chapters will be globally trivial bundles over open
subsets of real or complex space-time, that is, they will be products $E = U \times E_0$,
where $E_0$ is a fixed vector space. The important point about the `vector bundle'
terminology in this context is that a trivial bundle need not have a preferred
trivialization.

Principal fibre bundles

Associated with a vector bundle $E \to M$ we have the corresponding principal
fibre bundle $P \to M$. This is the manifold of pairs $(x, e_j)$, where $x \in M$ and $e_j$
is a basis in $E_x$. The projection onto $M$ is the obvious map $(x, e_j) \mapsto x$. Thus a
local trivialization of $E$ is the same thing as a local section of $P$.
There is a natural action of the general linear group on $P$ (on the right) by
$(x, e_j) \mapsto (x, e_ih_{ij})$, $h \in GL(n)$. This is transitive on the fibres, so locally we can
identify $P$ with $M \times GL(n)$, but not in a canonical way. When $E$ has additional
structure, the bases $e_j$ are restricted in the appropriate way, for example, to
orthonormal bases when $E$ has a Hermitian structure. Then the structure group
acts on $P$, and $P$ is locally a product of $M$ with the structure group.
Conversely, given $P$ and a representation of the structure group on a vector
space $V$, we can construct an associated vector bundle $E \to M$ with fibre $V$ by
using the representations of the $g_{ij}$s as transition functions.

Connections
A connection on $E$ is a first-order differential operator $D$ that maps sections of
$E$ to 1-forms with values in $E$. In a local trivialization,
$$ Ds = D_as\,dx^a = ds + \Phi s , $$
where $\Phi = \Phi_a\,dx^a$ is a matrix-valued 1-form, called the connection or gauge
potential, or simply the potential. We denote $X \lrcorner\, Ds$, where $X$ is a vector field,
by $D_Xs$. A section $s$ is parallel along a curve with tangent $T$ if $D_Ts = 0$, in
which case $s$ is determined on the curve by its value at one point (the values of
$s$ at other points of the curve are said to be given by `parallel transport').
A connection determines a `covariant exterior derivative' for forms with values
in $E$ by
$$ D\alpha = d\alpha + \Phi \wedge \alpha , $$
where $\Phi \wedge \alpha$ is the standard matrix product, but with the ordinary multiplication
rule for components replaced by exterior multiplication of differential forms. It
also determines a connection $D^*$ on the dual bundle $E^*$ (the bundle whose fibres
are the dual spaces to those of $E$). In the dual trivialization,
$$ D^* = d - \Phi^t . $$
When $E$ has an additional structure that picks out a special class of local
trivializations, we can impose a compatibility condition on $D$ by requiring that $\Phi$
should take its values in the Lie algebra of the corresponding structure group;
this is an invariant condition. For example, if $E$ is a complex vector bundle over
a real manifold, and if there is a Hermitian metric $(\cdot\,, \cdot)$ on each fibre, then the
structure group is $U(n)$, and the compatibility condition is that the components
$\Phi_a$ should be skew-Hermitian when the local frame field is orthonormal. An
equivalent condition is
$$ d(s, s) = (Ds, s) + (s, Ds) , $$
for any section $s$. If $E$ also has a complex volume element on each fibre, then
the structure group reduces to $SU(n)$, and compatibility requires the further
condition that the components of $\Phi$ should be trace-free.
Gauge transformations
In gauge theories, a local trivialization is a `gauge' and a structure group is the
`gauge group', although this last term has a different meaning in the mathe-
matics literature (see Appendix A). The use of the word `gauge' in this context
is rather odd: it comes from Weyl's unsuccessful attempt to unify gravity and
electromagnetism in a single geometric theory in which the lengths of vectors
were allowed to change under parallel transport.
When the local frame is changed to $\hat e_j = e_ig_{ij}$, the connection 1-form undergoes
a gauge transformation. The local representatives of sections transform by
$s \mapsto \hat s = g^{-1}s$, and $\Phi$ is replaced by
$$ \hat\Phi = g^{-1}\Phi g + g^{-1}dg , $$
so that
$$ (d + \hat\Phi)\hat s = g^{-1}(d + \Phi)s . $$
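The covariance of $d + \Phi$ under a change of frame can be verified in a few lines. The following worked step is a sketch under the assumptions that $\hat s = g^{-1}s$ and $\hat\Phi = g^{-1}\Phi g + g^{-1}dg$, with $g$ an invertible matrix-valued function, so that $d(g^{-1}) = -g^{-1}(dg)g^{-1}$. The hats, marking quantities in the new frame, are a notational choice made here.

```latex
\begin{align*}
(d + \hat\Phi)\hat s
  &= d(g^{-1}s) + \bigl(g^{-1}\Phi g + g^{-1}dg\bigr)g^{-1}s \\
  &= -g^{-1}(dg)g^{-1}s + g^{-1}ds + g^{-1}\Phi s + g^{-1}(dg)g^{-1}s \\
  &= g^{-1}(d + \Phi)s .
\end{align*}
```

The inhomogeneous term $g^{-1}dg$ in the transformation of the potential is exactly what cancels the derivative of $g^{-1}$.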
Curvature and integrability
The curvature of $D$ is the matrix-valued 2-form $F = F_{ab}\,dx^a \wedge dx^b$, where
$$ F_{ab} = \partial_a\Phi_b - \partial_b\Phi_a + [\Phi_a, \Phi_b] . $$
It measures the extent to which the operators $D_a$ fail to commute, since
$$ (D_aD_b - D_bD_a)s = F_{ab}s $$
for any section $s$. For forms with values in $E$, $D^2\alpha = F \wedge \alpha$. If $D$ is compatible
with some additional structure on $E$, then $F$ takes values in the Lie algebra of
the structure group.
Under a gauge transformation the curvature transforms by conjugation, that
is, $\hat F = g^{-1}Fg$. Thus the curvature is an obstruction to finding a gauge in which
$\Phi = 0$ since if there exists a frame in which $\Phi = 0$, then $F$ must be zero in all
frames. Conversely, if $F = 0$, then there exists a local gauge such that $\Phi = 0$
since the vanishing of $F$ is the local Frobenius integrability condition for the
system of linear equations
$$ D_ae_i = 0 . $$
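The statement that a potential which is gauge-equivalent to zero has vanishing curvature can be illustrated numerically. The sketch below is an assumption-laden example, not taken from the text. It works in two dimensions with coordinates $(x, y)$, takes the hypothetical choice $g = (1 + xA)(1 + yB)$ with nilpotent $2 \times 2$ matrices $A$ and $B$, forms the pure-gauge potential $\Phi = g^{-1}dg$, and evaluates $F_{xy} = \partial_x\Phi_y - \partial_y\Phi_x + [\Phi_x, \Phi_y]$ by central differences.

```python
N = 2  # matrix size

def mm(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def add(a, b, s=1.0):
    """Entrywise a + s*b."""
    return [[a[i][j] + s*b[i][j] for j in range(N)] for i in range(N)]

def scale(m, c):
    return [[c*e for e in row] for row in m]

def comm(a, b):
    return add(mm(a, b), mm(b, a), -1.0)

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.0, 1.0], [0.0, 0.0]]   # nilpotent: A A = 0
B = [[0.0, 0.0], [1.0, 0.0]]   # nilpotent: B B = 0

def g(x, y):
    return mm(add(I2, A, x), add(I2, B, y))      # (1 + xA)(1 + yB)

def g_inv(x, y):
    return mm(add(I2, B, -y), add(I2, A, -x))    # (1 - yB)(1 - xA)

def potential(x, y, h=1e-3):
    """Components (Phi_x, Phi_y) of Phi = g^{-1} dg."""
    gi = g_inv(x, y)
    dg_dx = scale(add(g(x + h, y), g(x - h, y), -1.0), 1/(2*h))
    dg_dy = scale(add(g(x, y + h), g(x, y - h), -1.0), 1/(2*h))
    return mm(gi, dg_dx), mm(gi, dg_dy)

def curvature(x, y, h=1e-3):
    """F_xy = d_x Phi_y - d_y Phi_x + [Phi_x, Phi_y]."""
    phi_x, phi_y = potential(x, y)
    dphi_y_dx = scale(add(potential(x + h, y)[1], potential(x - h, y)[1], -1.0), 1/(2*h))
    dphi_x_dy = scale(add(potential(x, y + h)[0], potential(x, y - h)[0], -1.0), 1/(2*h))
    return add(add(dphi_y_dx, dphi_x_dy, -1.0), comm(phi_x, phi_y))

phi_x, phi_y = potential(0.4, -0.7)
F = curvature(0.4, -0.7)
```

The potential itself is not zero (for instance $\Phi_x$ has a unit entry), and the two components do not commute, yet the curvature vanishes to rounding error, as the integrability argument requires.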
The adjoint bundle
From a more geometric point of view, the curvature is a 2-form with values in
the adjoint bundle, $\mathrm{adj}(E)$. The fibre of $\mathrm{adj}(E)$ at $x$ is the Lie algebra of the
structure group. When the structure group is $GL(n)$, we have
$$ \mathrm{adj}(E_x) = E_x \otimes E_x^* , $$
which is the Lie algebra of linear transformations of $E_x$. Sections of $\mathrm{adj}(E)$
are represented locally by matrix-valued functions $\phi$, with the transformation
rule $\phi \mapsto g^{-1}\phi g$ under change of local trivialization, which is the behaviour of
the curvature form under gauge transformations. The connection extends in a
natural way to sections of $\mathrm{adj}(E)$ and to forms with values in $\mathrm{adj}(E)$. If $\phi$ is a
section of $\mathrm{adj}(E)$ and $\Omega$ is a $p$-form with values in $\mathrm{adj}(E)$, then
$$ D\phi = d\phi + [\Phi, \phi], \qquad D\Omega = d\Omega + \Phi \wedge \Omega - (-1)^p\,\Omega \wedge \Phi $$
(in the second equation, $\Phi$ and $\Omega$ are matrices of forms, and $\Phi \wedge \Omega$ is their matrix
product, except that the multiplication between entries is the exterior product).
For the curvature form, we define
$$ D_aF_{bc} = \partial_aF_{bc} + [\Phi_a, F_{bc}] $$
in linear coordinates, so that
$$ DF = D_{[a}F_{bc]}\,dx^a \wedge dx^b \wedge dx^c = dF + \Phi \wedge F - F \wedge \Phi . $$

However, the Jacobi identity implies that
$$ [D_a, [D_b, D_c]] + [D_b, [D_c, D_a]] + [D_c, [D_a, D_b]] = 0 , $$
which yields
$$ DF = D_{[a}F_{bc]}\,dx^a \wedge dx^b \wedge dx^c = 0 . $$
This is the Bianchi identity. It can also be proved in another way by picking
a point $x$, and by making a gauge transformation by $g$. With an appropriate
choice for the partial derivatives of $g$ at $x$, it is possible to make $\Phi$ vanish at $x$
(of course $\Phi$ will not vanish at other points unless $F = 0$). Then $F = d\Phi$ and
$DF = d^2\Phi = 0$ at $x$. Since $DF = 0$ is a gauge-invariant equation, we conclude
that $F$ satisfies the Bianchi identity at $x$; but $x$ can be any point, so the identity
holds everywhere.
Pull-backs
If $E \to M$ is a vector bundle and $\rho: M' \to M$ is a smooth or holomorphic map
(depending on the context), then the pull-back of $E$ to $M'$ is the bundle $E' = \rho^*E$
defined by $E'_{x'} = E_{\rho(x')}$. A local trivialization of $E$ over $U \subset M$ determines a
local trivialization of $E'$ over $U' = \rho^{-1}(U)$. A connection $D = d + \Phi$ on $E$
determines a pull-back connection $\rho^*D = d + \rho^*\Phi$ on $E'$. Clearly the curvatures
are related, in corresponding local trivializations, by $F' = \rho^*F$.
If $M' \subset M$ and $\rho$ is the inclusion map, then $\rho^*E$ is the restriction of $E$ to $M'$
and $\rho^*D$ is the restricted connection. We denote the restricted bundle by $E|_{M'}$
or by $E_{M'}$.
Lifts
Suppose that $H$ is a group that acts on $M$ by diffeomorphisms or by biholomorphic
transformations, depending on the context. A lift of the action of $H$ to $E$
is a rule that assigns to each $\rho \in H$ a map $\rho_*: E \to E$ such that:
(a) $\pi(\rho_*(e)) = \rho(\pi(e))$, for all $e \in E$, where $\pi$ is the projection $E \to M$;
(b) for each $m \in M$, the restricted map $\rho_*: E_m \to E_{\rho(m)}$ is linear;
(c) $\rho \mapsto \rho_*$ is a group homomorphism.
A lift determines a `pull-back' action of $H$ on sections: for a section $s$, we define
$\rho^*s$ by
$$ \rho^*s(m) = \rho_*^{-1}(s(\rho(m))) . $$
There is a natural extension to forms with values in $E$ such that, for a product $\alpha s$,
where $s$ is a section and $\alpha$ is a form in the ordinary sense, $\rho^*(\alpha s) = \rho^*(\alpha)\rho^*(s)$.
(The notation and definitions are modelled on the properties of the `derived map'
on tangent vectors and the `pull-back' map on covectors and forms.)
Lie derivatives
At the infinitesimal level, in place of a group action, we have an action by a
Lie algebra $\mathfrak{h}$, that is, a linear map that assigns a vector field $X$ on $M$ to each
element of $\mathfrak{h}$ and that preserves Lie brackets. To avoid notational complication,
we shall usually identify $\mathfrak{h}$ with the corresponding algebra of vector fields on $M$.
In this context, a lift is a map that assigns a `Lie derivative' $\mathcal{L}_X$ to each $X \in \mathfrak{h}$.
The Lie derivative acts on sections of $E$ and is given in a local trivialization by
$$ \mathcal{L}_Xs = X(s) + \theta_Xs , $$
where $\theta_X$ is a matrix-valued function on $M$. It has the properties:
(i) $\mathcal{L}_X(fs) = X(f)s + f\mathcal{L}_Xs$, where $f$ is any function and $X(f)$ denotes its
derivative along $X$;
(ii) $\mathcal{L}_{aX+bY} = a\mathcal{L}_X + b\mathcal{L}_Y$ for every $X, Y \in \mathfrak{h}$ and for every constant $a, b$;
(iii) $\mathcal{L}_{[X,Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$ for every $X, Y \in \mathfrak{h}$.
Under gauge transformations,
$$ \theta_X \mapsto g^{-1}X(g) + g^{-1}\theta_Xg . $$
The Lie derivative extends to forms with values in $E$ in such a way that
$\mathcal{L}_X(\alpha s) = \mathcal{L}'_X(\alpha)s + \alpha\mathcal{L}_Xs$, where $\alpha$ is a form in the ordinary sense and $\mathcal{L}'_X\alpha$ is its Lie
derivative in the ordinary sense (we use the prime here to avoid notational
confusion).

Exponentiation
An action of a Lie group $H$ determines an action of its Lie algebra $\mathfrak{h}$, and a lift of
an action of $H$ determines a lift of the action of $\mathfrak{h}$. When $H$ is connected, we
recover its lift to $E$ from the Lie derivatives along its generators by exponentiating
the vector fields on $E$ given by
$$ X^a\frac{\partial}{\partial x^a} - \theta^i{}_jz^j\frac{\partial}{\partial z^i} , \tag{2.5.1} $$
where $(\theta^i{}_j) = \theta_X$ and the $z^i$s are the linear coordinates on the fibres of $E$.
Conversely, the lift determines these vector fields and hence the Lie derivatives.
Invariant gauges
An invariant gauge for an action of $\mathfrak{h}$ is one in which the frame field satisfies
$\mathcal{L}_Xe_i = 0$ for every $X \in \mathfrak{h}$. Under this condition, the $\theta_X$s vanish and the Lie
derivatives are given by ordinary differentiation along the generators. It is always
possible to construct a local invariant gauge when the infinitesimal action of $\mathfrak{h}$
on $M$ is free, which means that if $X \in \mathfrak{h}$ does not vanish identically, then it has
no zeros in $U$. In this case, the vector fields (2.5.1) are transverse to the fibres
in $E|_U$, and it is possible to find local sections of $E$ that are tangent to all these
vector fields, and which are therefore invariant under the action of $\mathfrak{h}$.
However if the action of $\mathfrak{h}$ is not free, then an invariant gauge may not exist
since it can happen that there is no invariant frame in a region containing a
zero of one of the generators. In fact, if $X = 0$ at $m$, then $m$ is a fixed point
of the one-parameter subgroup $\{\exp(tX)\} \subset H$, and any lift of the action of $H$
determines a representation of $\{\exp(tX)\}$ on the fibre $E_m$. If the representation
is nontrivial, then there is no invariant frame in any neighbourhood containing
$m$. In general, different lifts generate inequivalent representations of $\{\exp(tX)\}$
at $m$.
2.6 THE YANG-MILLS EQUATIONS
The simplest gauge theory involves the interaction between a classical
electromagnetic field and a complex wave function on space-time. The electromagnetic
4-potential $A$ is encoded in a connection $D = d + iA$ on a complex line bundle $L$
with structure group $U(1)$ and the wave functions are represented by sections of
$L$. A gauge transformation of the potential, $A \mapsto \hat A = A + df$, is accompanied by
a change in the phase of the wave function by $\psi \mapsto \hat\psi = e^{-if}\psi$. This preserves
$D$ since
$$ (d + i\hat A)\hat\psi = e^{-if}(d + iA)\psi . $$
The curvature of $D$ is the electromagnetic field. Since the structure group is
Abelian, the curvature is invariant under gauge transformations, so that it is a
2-form in the ordinary sense. In terms of the electric and magnetic field vectors,
$E$ and $B$, $F = F_{ab}\,dx^a \wedge dx^b$, where
$$ (F_{ab}) = i\begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix} . \tag{2.6.1} $$
In the absence of sources, Maxwell's equations are
$$ \operatorname{div}B = 0, \qquad \operatorname{curl}E + \frac{\partial B}{\partial t} = 0 , $$
which are equivalent to the Bianchi identity $dF = 0$, and
$$ \operatorname{div}E = 0, \qquad \operatorname{curl}B - \frac{\partial E}{\partial t} = 0 , $$
which are equivalent to the equation $d{*F} = 0$.
In a general gauge theory, we simply replace $L$ and $D$ by a general vector
bundle and connection, and we replace Maxwell's equations by the Yang-Mills
equations $DF = 0$ and $D{*F} = 0$, that is,
$$ D_{[a}F_{bc]} = 0, \qquad D^aF_{ab} = 0 . \tag{2.6.2} $$
The first is the Bianchi identity and the second is the Euler-Lagrange equation
of the Lagrangian density $-\operatorname{tr}(F_{ab}F^{ab})$, regarded as a function of the potential
and its derivatives. However it is important to note two differences from the
electromagnetic case, which arise from the way in which the potential appears
in (2.6.2) in the operators $D_a$. First, when the structure group is non-Abelian,
the equations are nonlinear. Second, we must now take the dependent variable
to be the connection, or the potential modulo gauge, and not the curvature $F$ or
its constituent vectors $E$ and $B$. In electromagnetic theory, the Bianchi identity
$dF = 0$ is a sufficient condition for the existence of a local potential, but there
is no analogous way in the non-Abelian case to express the existence of $\Phi$ as a
simple condition on the 2-form $F$.
The Yang-Mills equations are conformally invariant in the sense that if $D$
is a solution and if $\rho$ is a proper conformal transformation, then $\rho^*D$ is also a
solution, because the duality operator on 2-forms is itself invariant. However, to
compare $D$ to its pull-back under $\rho$, it is necessary to lift the action of $\rho$ to the
vector bundle $E$ (see Chapter 4).
In real Minkowski space, we can write (2.6.2) as evolution equations for a
`vector potential' $A$ and an `electric component of the field' $E$. We first make
a space-time decomposition of $\Phi$, by following the example of the linear theory.
We write
$$ \Phi = \phi\,dt - A_1\,dx - A_2\,dy - A_3\,dz , $$
where $A$ is a vector with matrix-valued components, and we define$^8$
$$ B = \operatorname{curl}A - A \times A, \qquad E = -\partial_tA - \nabla\phi + A\phi - \phi A . $$
Then the Yang-Mills equations become
$$ \partial_tA = -E - \nabla\phi - [\phi, A], \qquad \partial_tE = \operatorname{curl}B - A \times B - B \times A - [\phi, E] , \tag{2.6.3} $$
which determine the evolution of $A$ and $E$.
NOTES ON CHAPTER 2
1. The ASDYM equations are equivalent to the self-duality equations by reversing the
orientation. We have taken the ASD equations as basic because they arise from the
natural choice of orientation and conformal structure on a Kähler manifold.
2. The minus sign in the metric is chosen so that the squared distance of $(w, z, \tilde w, \tilde z)$
from the origin is
$$ 2(z\tilde z - w\tilde w) = 2\det\begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} . $$
This choice greatly simplifies the introduction of spinors; it also leads to a symmetrical
form for the operators $L$ and $M$ in the Lax pair form of the ASDYM equation (eqns
3.2.1).
3. For a $p$-index tensor,
$$ T_{[ab\ldots c]} = \frac{1}{p!}\sum_\sigma (-1)^{|\sigma|}\,T_{\sigma(a)\sigma(b)\ldots\sigma(c)} , $$
where the sum is over all permutations $\sigma$ of $\{1, \ldots, p\}$, and $|\sigma|$ is 0 or 1 as the
permutation is even or odd. Symmetrization is defined similarly: $T_{(ab\ldots c)}$ is given by the
same formula, but omitting the factors of $-1$.
4. A bivector is a skew-symmetric 2-index contravariant tensor.
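5. In spinor notation,
$$ (x^{\alpha\beta}) = \begin{pmatrix} s\,\varepsilon^{AB} & x^{A}{}_{B'} \\ -x_{A'}{}^{B} & \varepsilon_{A'B'} \end{pmatrix} , $$
where
$$ x^{AA'} = \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} . $$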

6. Note that if $[A, A'] = A''$ in $gl(4, \mathbb{C})$, then the corresponding conformal Killing
vectors are related by $K'' = -[K, K']$. The sign reversal arises because $GL(4, \mathbb{C})$ is
acting on space-time on the left.
7. Different choices for the orthonormal tetrad or for the null tetrad give conjugate
isomorphisms.
8. The vector product of two vectors $X$ and $Y$ with matrix components is defined by
$$ X \times Y = (X_2Y_3 - X_3Y_2,\; X_3Y_1 - X_1Y_3,\; X_1Y_2 - X_2Y_1) . $$
Note that $A \times A \neq 0$ if the components of $A$ do not commute.
3
The ASD Yang-Mills equation

The anti-self-dual Yang-Mills (ASDYM) equation on a connection is the condition
$$ F = -*F $$
on its curvature form. An equivalent formulation, which is the basis of the
Penrose-Ward transform, is that the curvature should vanish on restriction to
$\alpha$-planes. As the terminology suggests, a solution to the ASDYM equation
necessarily satisfies the full Yang-Mills equations (2.6.2), since the Bianchi identity
$DF = 0$ holds for any connection, and since this implies that $D{*F} = 0$ whenever
$F = -*F$.
In this chapter, we shall look at various ways of expressing the ASD condi-
tion, first as a system of first-order equations on the components of the gauge
potential, and then as the commutativity condition on a Lax pair of operators.
We shall then look at two second-order forms: the first involves a matrix poten-
tial that has become known as Yang's J-matrix, although in the Euclidean and
ultrahyperbolic cases, it has an older interpretation as a Hermitian metric on a
holomorphic vector bundle; the second involves another second-order potential,
which we call the K-matrix. The resulting partial differential equations will be
the central objects of study in the rest of the book. Both Yang's equation and
the K-matrix equation are derived from Lagrangians, and therefore their solu-
tion spaces have symplectic structures. In the first section, we set the scene by
looking at the corresponding equations in Maxwell's theory.

3.1 ASD ELECTROMAGNETIC FIELDS

In the electromagnetic case, the ASD condition on the field tensor (2.6.1) in
Minkowski space is $B = -iE$, and when it holds Maxwell's equations reduce to
$$ \operatorname{div}E = 0, \qquad \operatorname{curl}E - i\,\frac{\partial E}{\partial t} = 0 . \tag{3.1.1} $$
The solutions in real space-time are necessarily complex, although they have a
direct physical interpretation in quantum theory as photon wave functions with
circular polarization. On the Euclidean and ultrahyperbolic slices, on the other
hand, the equations are real. For example we can restrict to $\mathbb{E}$ by replacing $t$
by $-it$, and then by requiring that the coordinates be real. The result is the
Euclidean form of the ASD equations,
$$ \operatorname{div}E = 0, \qquad \operatorname{curl}E + \frac{\partial E}{\partial t} = 0 , $$
which do admit real solutions.
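The reduction can be made explicit. The following worked step is a sketch that assumes the source-free Maxwell equations in the form $\operatorname{div}B = 0$, $\operatorname{curl}E + \partial B/\partial t = 0$, $\operatorname{div}E = 0$, $\operatorname{curl}B - \partial E/\partial t = 0$, and substitutes the ASD condition $B = -iE$.

```latex
\begin{align*}
\operatorname{div} B &= -i\,\operatorname{div} E, \\
\operatorname{curl} E + \frac{\partial B}{\partial t}
  &= \operatorname{curl} E - i\,\frac{\partial E}{\partial t}, \\
\operatorname{curl} B - \frac{\partial E}{\partial t}
  &= -i\Bigl(\operatorname{curl} E - i\,\frac{\partial E}{\partial t}\Bigr).
\end{align*}
```

So the four equations collapse to the pair $\operatorname{div}E = 0$, $\operatorname{curl}E - i\,\partial E/\partial t = 0$. Replacing $t$ by $-it$ turns $\partial/\partial t$ into $i\,\partial/\partial t$, which changes the second equation to $\operatorname{curl}E + \partial E/\partial t = 0$, with real coefficients.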
Other real forms are found by transforming to double null coordinates (2.3.2)
and by replacing the dependent variables by a second-order potential which
satisfies a complex form of the wave equation. In double null coordinates, the
ASD condition is
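$$\partial_w A_z - \partial_z A_w = 0, \qquad \partial_{\tilde w} A_{\tilde z} - \partial_{\tilde z} A_{\tilde w} = 0,$$
$$\partial_z A_{\tilde z} - \partial_{\tilde z} A_z - \partial_w A_{\tilde w} + \partial_{\tilde w} A_w = 0, \tag{3.1.2}$$
where $A = A_w\,dw + A_z\,dz + A_{\tilde w}\,d\tilde w + A_{\tilde z}\,d\tilde z$. We interpret the first two equations
as integrability conditions: they imply the existence of functions $u$ and $v$ such
that
$$A = \partial_w u\,dw + \partial_z u\,dz + \partial_{\tilde w} v\,d\tilde w + \partial_{\tilde z} v\,d\tilde z.$$
We are free to replace $A$ by the equivalent potential $A - du$, so nothing is lost
by imposing the gauge condition $A_w = A_z = 0$. Then $A = \partial_{\tilde w} v\,d\tilde w + \partial_{\tilde z} v\,d\tilde z$.
We interpret $v$ as a `second-order potential' for the electromagnetic field: it
determines the field completely, and is determined by it up to
$$v \mapsto v + f(w,z) + \tilde f(\tilde w,\tilde z),$$
where $f$ and $\tilde f$ are arbitrary functions of two variables. As a condition on $v$, the
ASD condition reduces to the single differential equation
$$\frac{\partial^2 v}{\partial z\,\partial\tilde z} - \frac{\partial^2 v}{\partial w\,\partial\tilde w} = 0. \tag{3.1.3}$$
Here two `real forms' are evident: when $\tilde w = -\bar w$, $\tilde z = \bar z$, (3.1.3) is the four-dimensional
Laplace equation; and when $\tilde w = \bar w$, $\tilde z = \bar z$, or alternatively when
$(w, z, \tilde w, \tilde z)$ are all real, it is the `ultrahyperbolic' wave equation on $\mathbb{U}$.¹ In the
first case, an imaginary solution to Laplace's equation in $\mathbb{E}$ gives a real solution to
the ASD equations in $\mathbb{E}$; in the second case, a real solution to the ultrahyperbolic
wave equation gives a real ASD 2-form in $\mathbb{U}$.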
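As a quick worked check of (3.1.3), consider an exponential ansatz. The constants $a$, $b$, $\tilde a$, $\tilde b$ below are introduced purely for illustration and do not appear in the text; the sketch only shows how the single equation on $v$ constrains a trial solution and why the remaining ASD conditions are then automatic.

```latex
% Illustrative ansatz (constants a, b, \tilde a, \tilde b are assumptions, not from the source)
\begin{align*}
v &= \exp\bigl(a w + b z + \tilde a\,\tilde w + \tilde b\,\tilde z\bigr), \\
% substitute into (3.1.3)
\frac{\partial^2 v}{\partial z\,\partial\tilde z} - \frac{\partial^2 v}{\partial w\,\partial\tilde w}
  &= (b\tilde b - a\tilde a)\,v ,
  \qquad\text{so (3.1.3) holds iff } b\tilde b = a\tilde a, \\
% potential in the gauge A_w = A_z = 0
A &= \tilde a\,v\,d\tilde w + \tilde b\,v\,d\tilde z , \\
% the second equation of (3.1.2) is satisfied identically
\partial_{\tilde w}A_{\tilde z} - \partial_{\tilde z}A_{\tilde w}
  &= (\tilde a\tilde b - \tilde b\tilde a)\,v = 0 .
\end{align*}
```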

3.2 LAX PAIRS

We now turn to the general case of the ASDYM equation. Let D be a connection
on a complex rank-n vector bundle E over some region U in real or complex
space-time, and let F be its curvature 2-form. In a local trivialization, F takes
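values in the $n \times n$ matrices. If $D = d + \Phi$, then $F = F_{ab}\,dx^a \wedge dx^b$, where
$$F_{ab} = \partial_a\Phi_b - \partial_b\Phi_a + [\Phi_a, \Phi_b].$$
In double null coordinates, the ASD condition on $F$ becomes
$$\partial_z\Phi_w - \partial_w\Phi_z + [\Phi_z, \Phi_w] = 0,$$
$$\partial_{\tilde z}\Phi_{\tilde w} - \partial_{\tilde w}\Phi_{\tilde z} + [\Phi_{\tilde z}, \Phi_{\tilde w}] = 0,$$
$$\partial_z\Phi_{\tilde z} - \partial_{\tilde z}\Phi_z - \partial_w\Phi_{\tilde w} + \partial_{\tilde w}\Phi_w + [\Phi_z, \Phi_{\tilde z}] - [\Phi_w, \Phi_{\tilde w}] = 0, \tag{3.2.1}$$
on the components of the potential. If we write
$$D_w = \partial_w + \Phi_w, \quad D_z = \partial_z + \Phi_z, \quad D_{\tilde w} = \partial_{\tilde w} + \Phi_{\tilde w}, \quad D_{\tilde z} = \partial_{\tilde z} + \Phi_{\tilde z},$$
then these are
$$[D_z, D_w] = 0, \qquad [D_{\tilde z}, D_{\tilde w}] = 0, \qquad [D_z, D_{\tilde z}] - [D_w, D_{\tilde w}] = 0.$$
An equivalent condition is that the Lax pair of operators
$$L = D_w - \zeta D_{\tilde z}, \qquad M = D_z - \zeta D_{\tilde w}, \tag{3.2.2}$$
should commute for every value of the complex `spectral parameter' $\zeta$, where
$L$ and $M$ act on vector-valued functions of the space-time coordinates. This
formulation in terms of a linear system is central to the theory of integrability
and to its connections with twistor theory. (Note that, in some contexts, we
shall also use $L$ and $M$ to denote the pair of vector fields $\partial_w - \zeta\partial_{\tilde z}$ and $\partial_z - \zeta\partial_{\tilde w}$,
as well as other Lax pairs.)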
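The equivalence between the Lax pair condition and the three ASD equations can be seen by expanding the commutator in powers of the spectral parameter. The following short computation is a sketch of that step, using only the operators defined above:

```latex
\begin{align*}
[L, M] &= [\,D_w - \zeta D_{\tilde z},\; D_z - \zeta D_{\tilde w}\,] \\
       &= [D_w, D_z]
          + \zeta\bigl([D_z, D_{\tilde z}] - [D_w, D_{\tilde w}]\bigr)
          + \zeta^2\,[D_{\tilde z}, D_{\tilde w}] .
\end{align*}
% The three coefficients of 1, \zeta, \zeta^2 vanish separately iff the three
% commutator equations [D_z,D_w]=0, [D_z,D_{\tilde z}]-[D_w,D_{\tilde w}]=0,
% [D_{\tilde z},D_{\tilde w}]=0 hold, i.e. iff (3.2.1) holds.
```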

3.3 YANG'S EQUATION AND THE K-MATRIX

The ASD condition F = -*F on a gauge potential is coordinate-independent
and manifestly invariant under gauge transformations as well as under conformal
isometries of space-time. However, as in the linear case, there are other more
tractable forms of the equation which break one or other of these symmetries.
The J-matrix
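The first two equations (3.2.1) are the local integrability conditions for the existence
of two matrix-valued functions $h$ and $\tilde h$ of the space-time coordinates such
that
$$\partial_w h + \Phi_w h = 0, \qquad \partial_z h + \Phi_z h = 0,$$
$$\partial_{\tilde w}\tilde h + \Phi_{\tilde w}\tilde h = 0, \qquad \partial_{\tilde z}\tilde h + \Phi_{\tilde z}\tilde h = 0.$$
They are determined uniquely by $\Phi$ up to $h \mapsto hP$, $\tilde h \mapsto \tilde h\tilde P$, where $P$ depends
only on $\tilde w$ and $\tilde z$, and $\tilde P$ depends only on $w$ and $z$. If $\Phi$ is replaced by the gauge
equivalent potential $g^{-1}\Phi g + g^{-1}dg$, then $h$ and $\tilde h$ can be replaced by $g^{-1}h$ and
$g^{-1}\tilde h$, which leaves $\tilde h^{-1}h$ unchanged.
The matrix $J = \tilde h^{-1}h$ is Yang's matrix (Yang 1977). It is determined by $D$
up to the freedom $J \mapsto \tilde P^{-1}JP$, and it determines $D$ since $\Phi$ is equivalent to
$$J^{-1}\tilde\partial J = J^{-1}\partial_{\tilde w}J\,d\tilde w + J^{-1}\partial_{\tilde z}J\,d\tilde z \tag{3.3.1}$$
by the gauge transformation $\Phi \mapsto h^{-1}\Phi h + h^{-1}dh$. The first two ASD equations
(3.2.1) are satisfied identically by (3.3.1); the third holds if and only if $J$ satisfies
Yang's equation
$$\partial_w(J^{-1}\partial_{\tilde w}J) - \partial_z(J^{-1}\partial_{\tilde z}J) = 0. \tag{3.3.2}$$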
Yang's equation is equivalent to the ASD equations, but it is not covariant under
coordinate transformations which change the 2-planes spanned by $\partial_w$ and $\partial_z$ and
by $\partial_{\tilde w}$ and $\partial_{\tilde z}$. It can also be written in the form
$$\partial(J^{-1}\tilde\partial J) \wedge \omega = 0,$$
where $\omega = dw \wedge d\tilde w - dz \wedge d\tilde z$, and $\partial$ and $\tilde\partial$ are defined in §2.3.
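The two steps used above (the passage to the gauge in which the potential is $J^{-1}\tilde\partial J$, and the rewriting of Yang's equation as a 4-form identity) can be checked directly. The sketch below assumes the ordering $J = \tilde h^{-1}h$, which is the ordering for which the transformed potential comes out as $J^{-1}\tilde\partial J$, and writes $A = J^{-1}\tilde\partial J$ as shorthand (a label introduced here).

```latex
% Gauge transformation by g = h, with h = \tilde h J
\begin{align*}
\Phi'_w &= h^{-1}(\partial_w h + \Phi_w h) = 0, \qquad
\Phi'_z = h^{-1}(\partial_z h + \Phi_z h) = 0, \\
\Phi'_{\tilde w} &= h^{-1}(\partial_{\tilde w} h + \Phi_{\tilde w} h)
   = J^{-1}\tilde h^{-1}\bigl[(\partial_{\tilde w}\tilde h + \Phi_{\tilde w}\tilde h)\,J
     + \tilde h\,\partial_{\tilde w}J\bigr]
   = J^{-1}\partial_{\tilde w}J ,
\end{align*}
% and similarly for the \tilde z component.
% Wedge form: with A = A_{\tilde w} d\tilde w + A_{\tilde z} d\tilde z and
% \omega = dw \wedge d\tilde w - dz \wedge d\tilde z,
\begin{align*}
\partial A \wedge \omega
  = \bigl(\partial_z A_{\tilde z} - \partial_w A_{\tilde w}\bigr)\,
    dw \wedge d\tilde w \wedge dz \wedge d\tilde z ,
\end{align*}
% so \partial(J^{-1}\tilde\partial J)\wedge\omega = 0 is the same condition as (3.3.2).
```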
We can write Yang's equation in general linear coordinates in the form
$$\partial^a(J^{-1}\partial_a J) + 2\omega^{ab}\partial_a(J^{-1}\partial_b J) = 0,$$
from which we see that, in the non-Abelian case, the equation is invariant only
under conformal isometries that preserve $\omega$ up to scale. In the U(1) case, $J = e^{u}$
and the equation is covariant, but the relationship between the solutions and the
corresponding electromagnetic fields is not.
The construction of $J$ can be understood from a more geometric point of
view, as follows. By making a transformation first from the original gauge by
$g = h$, and second from the original gauge by $g = \tilde h$, one obtains equivalent gauge
potentials with vanishing $w$ and $z$ components (in the first case) or vanishing $\tilde w$
and $\tilde z$ components (in the second case). If the corresponding frame fields of the
vector bundle are $\{e_1, \ldots, e_n\}$ and $\{\tilde e_1, \ldots, \tilde e_n\}$, then
$$D_w e_i = 0, \qquad D_z e_i = 0, \tag{3.3.3}$$
$$D_{\tilde w}\tilde e_i = 0, \qquad D_{\tilde z}\tilde e_i = 0, \tag{3.3.4}$$
for $i = 1, 2, \ldots, n$. Moreover $e_j = \tilde e_i J_{ij}$, so $J$ is the linear transformation in
the fibres from a frame field satisfying (3.3.4) to one satisfying (3.3.3). The
connection potentials in the frames $e_i$ and $\tilde e_i$ are, respectively,
$$J^{-1}\tilde\partial J \quad\text{and}\quad J\,\partial J^{-1},$$
and the freedom in the construction of $J$ from $D$ is the freedom to transform the
first frame by $P$ and the second by $\tilde P$.
In the case of a U(n) bundle over Euclidean or ultrahyperbolic space, the
fibres have Hermitian metrics and the relationship between J and D has an-
other interpretation in terms of a general result that a Hermitian structure on a
holomorphic vector bundle determines a unique connection (Griffiths and Harris
1978, p. 73). By taking $w$ and $z$ to be holomorphic coordinates, and $\tilde w$ and $\tilde z$
to be anti-holomorphic, we can identify $\mathbb{E}$ or $\mathbb{U}$ with the flat (pseudo-)Kähler
manifold $\mathbb{C}^2$, with Kähler form $-2i\omega$. If $D$ satisfies the ASDYM equation, then
the bundle is holomorphic, with its local holomorphic frames given by the solutions
to (3.3.4). Suppose that $D$ is also compatible with the Hermitian structure
and that we take $\tilde e_i$ to be holomorphic and $e_i$ to be the dual frame, that is,
$(e_i, \tilde e_j) = \delta_{ij}$. Then (3.3.3) also holds and $e_j = \tilde e_i J_{ij}$, where
$$(J^{-1})_{ij} = (\tilde e_j, \tilde e_i)$$
(we take the inner product $(\ ,\ )$ to be linear in the first entry and antilinear in
the second). Thus $J^{-1}$ is the matrix of inner products of the vectors making up
a holomorphic frame field. Conversely, a Hermitian structure on a holomorphic
vector bundle determines a connection, given by $d + J\,\partial J^{-1}$ in a holomorphic
local trivialization, where $J^{-1}$ is the matrix of the inner product in the fibres.
Example 3.3.1 Single instanton. The following example is the `pseudoparticle'
of Belavin et al. (1975). The gauge group is SL(2, C) and the potential is given
in complex null coordinates by
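$$\Phi_w = \frac{f}{2}\begin{pmatrix} -\tilde w & 0 \\ -2\tilde z & \tilde w \end{pmatrix}, \qquad
\Phi_{\tilde w} = \frac{f}{2}\begin{pmatrix} w & -2z \\ 0 & -w \end{pmatrix},$$
$$\Phi_z = \frac{f}{2}\begin{pmatrix} -\tilde z & 2\tilde w \\ 0 & \tilde z \end{pmatrix}, \qquad
\Phi_{\tilde z} = \frac{f}{2}\begin{pmatrix} z & 0 \\ 2w & -z \end{pmatrix},$$
where $f = (1 - w\tilde w + z\tilde z)^{-1}$. Note that $f$ is nonsingular if we impose the reality
condition ($\tilde w = -\bar w$, $\tilde z = \bar z$), so this is a global solution in Euclidean space (in
fact it extends to the compactification $S^4$). We can take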
h_ (z (I + w
32)), h-1 Zux (s2w z(1 z r2) )
where $r^2 = w\tilde w$, $s^2 = z\tilde z$. Then

J= rsf ( s2s2- r2 -r2s2(s2 - r2 + 2)

r
Note that det J = 1.
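The defining linear equations for $h$ and $\tilde h$ can be solved in closed form for the single-instanton potential. The following is a worked sketch under stated assumptions and is not claimed to be the pair printed in the example: it takes the potential components to be those written out in the block, uses the abbreviation $g = 1/f$ (introduced here), fixes the freedom in $P$ and $\tilde P$ by a convenient choice, and takes $J = \tilde h^{-1}h$. A different choice of $P$, $\tilde P$ gives a different but equivalent $J$.

```latex
% Assumed potential, with g := 1/f = 1 - w\tilde w + z\tilde z, r^2 = w\tilde w, s^2 = z\tilde z
\begin{align*}
\Phi_w &= \tfrac{f}{2}\begin{pmatrix} -\tilde w & 0 \\ -2\tilde z & \tilde w \end{pmatrix}, &
\Phi_z &= \tfrac{f}{2}\begin{pmatrix} -\tilde z & 2\tilde w \\ 0 & \tilde z \end{pmatrix}, \\
\Phi_{\tilde w} &= \tfrac{f}{2}\begin{pmatrix} w & -2z \\ 0 & -w \end{pmatrix}, &
\Phi_{\tilde z} &= \tfrac{f}{2}\begin{pmatrix} z & 0 \\ 2w & -z \end{pmatrix}.
\end{align*}
% One solution of  \partial_w h + \Phi_w h = 0 = \partial_z h + \Phi_z h :
\begin{align*}
h &= g^{-1/2}\begin{pmatrix} 1 + z\tilde z & -z\tilde w \\ w\tilde z & 1 - w\tilde w \end{pmatrix},
\qquad \det h = g^{-1}\bigl[(1+s^2)(1-r^2) + r^2 s^2\bigr] = 1 .
\end{align*}
% One solution of the tilded equations; it happens to be the inverse of h:
\begin{align*}
\tilde h &= g^{-1/2}\begin{pmatrix} 1 - w\tilde w & z\tilde w \\ -w\tilde z & 1 + z\tilde z \end{pmatrix}
          = h^{-1} .
\end{align*}
% Sample check (first column of h, w-derivative of the top entry):
%   \partial_w\bigl[g^{-1/2}(1+s^2)\bigr] = \tfrac12 \tilde w\, g^{-3/2}(1+s^2)
%   = -(\Phi_w h)_{11} .
% Resulting Yang matrix for this choice:
\begin{align*}
J = \tilde h^{-1} h = h^2
  = f\begin{pmatrix}
      (1+s^2)^2 - r^2 s^2 & -z\tilde w\,(2 + s^2 - r^2) \\
      w\tilde z\,(2 + s^2 - r^2) & (1-r^2)^2 - r^2 s^2
    \end{pmatrix},
\qquad \det J = (\det h)^2 = 1 .
\end{align*}
```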
The K-matrix
Another form of the ASD equation in complex space-time can be obtained by
exploiting the existence of the frame $e_i$ in a different way. In the local frame $e_i$,
we have $\Phi_w = \Phi_z = 0$ and the ASD equations read
$$\partial_{\tilde z}\Phi_{\tilde w} - \partial_{\tilde w}\Phi_{\tilde z} + [\Phi_{\tilde z}, \Phi_{\tilde w}] = 0, \qquad \partial_z\Phi_{\tilde z} - \partial_w\Phi_{\tilde w} = 0. \tag{3.3.5}$$
This time we interpret the second equation as an integrability condition: it
implies the existence of a matrix-valued function $K$ such that
$$\Phi = \partial_z K\,d\tilde w + \partial_w K\,d\tilde z. \tag{3.3.6}$$
Clearly $K$ determines $D$, and, conversely, $K$ is determined by $D$ up to
$$K \mapsto g^{-1}Kg + z\,g^{-1}\partial_{\tilde w}g + w\,g^{-1}\partial_{\tilde z}g + c,$$
where $g$ and $c$ depend only on $\tilde w$ and $\tilde z$. The freedom to choose $g$ comes from the
gauge freedom in the frame $e_i$, while $c$ is an integration constant of the second
of equations (3.3.5). As a condition on $K$, the ASD equation is
$$\partial_z\partial_{\tilde z}K - \partial_w\partial_{\tilde w}K + [\partial_w K, \partial_z K] = 0, \tag{3.3.7}$$
by substituting (3.3.6) into the first of eqns (3.3.5) (Newman 1978).
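The stated freedom in $K$ can be verified in one line per component. The sketch below uses only the gauge transformation law $\Phi \mapsto g^{-1}\Phi g + g^{-1}dg$ and the relation between $\Phi$ and $K$; the primed symbols $K'$, $\Phi'$ are labels introduced here for the transformed quantities.

```latex
% g and c depend only on \tilde w, \tilde z, so \partial_w g = \partial_z g = 0
% and the gauge \Phi_w = \Phi_z = 0 is preserved.
\begin{align*}
K' &= g^{-1}Kg + z\,g^{-1}\partial_{\tilde w}g + w\,g^{-1}\partial_{\tilde z}g + c, \\
\partial_z K' &= g^{-1}(\partial_z K)\,g + g^{-1}\partial_{\tilde w}g
              = g^{-1}\Phi_{\tilde w}\,g + g^{-1}\partial_{\tilde w}g = \Phi'_{\tilde w}, \\
\partial_w K' &= g^{-1}(\partial_w K)\,g + g^{-1}\partial_{\tilde z}g
              = g^{-1}\Phi_{\tilde z}\,g + g^{-1}\partial_{\tilde z}g = \Phi'_{\tilde z}.
\end{align*}
% Substituting \Phi_{\tilde w} = \partial_z K, \Phi_{\tilde z} = \partial_w K into the
% first equation of the pair gives
%   \partial_z\partial_{\tilde z}K - \partial_w\partial_{\tilde w}K + [\partial_w K, \partial_z K] = 0 .
```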
3.4 LAGRANGIANS FOR THE ASDYM EQUATION
Both Yang's equation for the J-matrix and eqn (3.3.7) for the K-matrix can be
derived from Lagrangians, as we shall show in this section. Thus the two poten-
tial representations give two families of Lagrangians for the ASDYM equation,
labelled by the different possible choices for the double-null coordinates.
The K-matrix Lagrangian is the most straightforward. Written in general
linear coordinates, the K-matrix equation is
$$\eta^{ab}\partial_a\partial_b K - \tfrac{1}{2}\alpha^{ab}[\partial_a K, \partial_b K] = 0,$$
where $\alpha$ is defined in eqn (2.3.1). This is the Euler-Lagrange equation for the
action
$$S[K] = \int \operatorname{tr}\left(\tfrac{1}{2}\eta^{ab}\partial_a K\,\partial_b K + \tfrac{1}{3}\alpha^{ab}K\,\partial_a K\,\partial_b K\right)\nu.$$
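The relation between the coefficients in the K-matrix action and in its Euler-Lagrange equation can be checked for a general quadratic-plus-cubic action. The sketch below keeps the two coefficients as free constants $c_1$, $c_2$ (names introduced here), assumes only that $\eta^{ab}$ is symmetric and constant and $\alpha^{ab}$ is antisymmetric and constant, and discards boundary terms.

```latex
\begin{align*}
S &= \int \operatorname{tr}\bigl(c_1\,\eta^{ab}\partial_a K\,\partial_b K
      + c_2\,\alpha^{ab} K\,\partial_a K\,\partial_b K\bigr)\,\nu , \\
% quadratic term, after one integration by parts
\delta S_{\text{quad}} &= -2c_1 \int \operatorname{tr}\bigl(\delta K\,\eta^{ab}\partial_a\partial_b K\bigr)\,\nu , \\
% cubic term: each of the three variations contributes the same amount,
% using cyclicity of the trace and \alpha^{ab} = -\alpha^{ba}
\delta S_{\text{cubic}} &= 3c_2 \int \operatorname{tr}\bigl(\delta K\,\alpha^{ab}\partial_a K\,\partial_b K\bigr)\,\nu
   = \tfrac{3}{2}c_2 \int \operatorname{tr}\bigl(\delta K\,\alpha^{ab}[\partial_a K, \partial_b K]\bigr)\,\nu , \\
% Euler-Lagrange equation
0 &= \eta^{ab}\partial_a\partial_b K - \frac{3c_2}{4c_1}\,\alpha^{ab}[\partial_a K, \partial_b K] .
\end{align*}
% With c_1 = 1/2 and c_2 = 1/3 the coefficient of the commutator term is 1/2.
```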
The J-matrix Lagrangian is more subtle and does not have an invariant first-
order representation. It was written down for gauge group SU(2) in Pohlmeyer
(1980) in the form given in Proposition 3.4.1 below, and in a general abstract
form by Donaldson (1985). Donaldson observed that one can define a closed
1-form $\Lambda$ on the space of matrix-valued functions on space-time by
$$\Lambda(\delta J) = -\int \operatorname{tr}\bigl(\phi\,\partial(J^{-1}\tilde\partial J)\bigr) \wedge \omega, \qquad \phi = J^{-1}\delta J,$$
where $d = \partial + \tilde\partial$ is the decomposition of the exterior derivative defined in §2.3,
the integral is over some bounded four-dimensional contour, and the proof of
closure involves the formal application of Stokes' theorem to discard boundary
terms.² Consequently we can write
$$\Lambda(\delta J) = \delta S$$
for some functional $S$ of the J matrices. The variational equation $\delta S = 0$ is
equivalent to Yang's equation in the form
$$\partial(J^{-1}\tilde\partial J) \wedge \omega = 0.$$
There is more than one way to represent the action $S$ as the integral of
a Lagrangian density because of the freedom to add boundary terms, but one
possibility, due to Donaldson, is
$$S = i\int K \operatorname{tr}(F \wedge F),$$
where $F$ is the curvature 2-form of the connection $d + J^{-1}\tilde\partial J$ and $K$ is any
function such that $\omega = \partial\tilde\partial K$ (in the Kähler case, $K$ is a multiple of the Kähler
potential). Another is given by the following proposition.³
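Proposition 3.4.1 Write $J = UL^{-1}$, where $L$ is lower triangular and $U$ is the
sum of the identity matrix and a strictly upper triangular matrix (i.e. $U$ has ones
down the diagonal and zeros below it). Then Yang's equation is derived from the
action
$$S[J] = \tfrac{1}{2}\int \operatorname{tr}\bigl(2U^{-1}\tilde\partial U \wedge L^{-1}\partial L - L^{-1}\tilde\partial L \wedge L^{-1}\partial L\bigr) \wedge \omega.$$
Proof Put
$$\lambda = L^{-1}\partial L, \qquad \tilde\lambda = L^{-1}\tilde\partial L, \qquad \mu = U^{-1}\tilde\partial U.$$
In terms of these 1-forms, the action is
$$S = \tfrac{1}{2}\int \operatorname{tr}(2\mu \wedge \lambda - \tilde\lambda \wedge \lambda) \wedge \omega,$$
and Yang's equation is
$$(\partial\mu + \tilde\partial\lambda + \lambda \wedge \mu + \mu \wedge \lambda) \wedge \omega = 0.$$
We have to show that Yang's equation is the same as the variational equation
$\delta S = 0$. Put $u = U^{-1}\delta U$ and $e = L^{-1}\delta L$, and note that
$$\delta\mu = \tilde\partial u + [\mu, u], \qquad \delta\lambda = \partial e + [\lambda, e], \qquad \delta\tilde\lambda = \tilde\partial e + [\tilde\lambda, e].$$
Consider first a variation of $J$ with $\delta L = 0$. Then
$$\delta S = -\int_D \operatorname{tr}\bigl(u(\tilde\partial\lambda + \lambda \wedge \mu + \mu \wedge \lambda)\bigr) \wedge \omega + \oint_{\partial D} \operatorname{tr}(u\lambda) \wedge \omega,$$
where $D$ is the region of integration. Similarly, with $\delta U = 0$, the variation is
$$\delta S = \tfrac{1}{2}\int_D \operatorname{tr}\bigl(2\mu \wedge (\partial e + [\lambda, e]) - \tilde\partial e \wedge \lambda - \tilde\lambda \wedge \partial e\bigr) \wedge \omega$$
$$= \int_D \operatorname{tr}\bigl(e(\partial\mu + \lambda \wedge \mu + \mu \wedge \lambda + \tilde\partial\lambda)\bigr) \wedge \omega - \oint_{\partial D} \operatorname{tr}\bigl(e\mu + \tfrac{1}{2}e(\lambda - \tilde\lambda)\bigr) \wedge \omega,$$
where to obtain the second equation we have noted that $\partial\tilde\lambda + \tilde\partial\lambda = -\lambda \wedge \tilde\lambda - \tilde\lambda \wedge \lambda$
is strictly lower triangular and that the trace of a strictly lower triangular matrix
vanishes. On setting both the variations to zero, and on noting that the trace of
a strictly upper triangular matrix vanishes, we obtain Yang's equation.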
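The two partial variations in the proof combine into a single expression in terms of $J$, which makes the link with Donaldson's 1-form explicit. The sketch below assumes $J = UL^{-1}$ with $\lambda = L^{-1}\partial L$, $\tilde\lambda = L^{-1}\tilde\partial L$, $\mu = U^{-1}\tilde\partial U$, $u = U^{-1}\delta U$, $e = L^{-1}\delta L$; the symbol $Y$ is a label introduced here for the bracket in Yang's equation, and boundary terms are dropped.

```latex
\begin{align*}
% potential and perturbation in terms of the triangular factors
J^{-1}\tilde\partial J &= L(\mu - \tilde\lambda)L^{-1}, \qquad
\phi := J^{-1}\delta J = L(u - e)L^{-1}, \\
% derivative of the potential, using \partial(L\beta L^{-1})
%   = L(\partial\beta + \lambda\wedge\beta + \beta\wedge\lambda)L^{-1} for a 1-form \beta
\partial(J^{-1}\tilde\partial J)
  &= L\bigl(\partial\mu - \partial\tilde\lambda + \lambda\wedge\mu + \mu\wedge\lambda
      - \lambda\wedge\tilde\lambda - \tilde\lambda\wedge\lambda\bigr)L^{-1} \\
  &= L\,Y\,L^{-1}, \qquad
  Y := \partial\mu + \tilde\partial\lambda + \lambda\wedge\mu + \mu\wedge\lambda , \\
% here the flatness of L^{-1}dL was used:
%   \partial\tilde\lambda + \tilde\partial\lambda
%     + \lambda\wedge\tilde\lambda + \tilde\lambda\wedge\lambda = 0 .
% total variation, up to boundary terms
\delta S &= -\int_D \operatorname{tr}\bigl((u - e)\,Y\bigr)\wedge\omega
          = -\int_D \operatorname{tr}\bigl(\phi\,\partial(J^{-1}\tilde\partial J)\bigr)\wedge\omega .
\end{align*}
% Since u is strictly upper triangular and e is lower triangular, u - e ranges over
% all matrices, so \delta S = 0 for all variations forces Y \wedge \omega = 0.
```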
3.5 THE HAMILTONIAN FORMALISM
In relativistic field theory in real Minkowski space, the Lagrangian density L
determines both the field equations and the Legendre transformation to their
Hamiltonian form. Looked at from a geometric point of view, the boundary
terms in the variational calculation define a potential 1-form $\Theta$ for the symplectic
structure $\Omega = d\Theta$ on the solution space. If the Lagrangian density is translation
invariant and depends only on the field components $\xi_i$ and their space-time
derivatives $\xi_{ia} = \partial_a\xi_i$, that is, if $L = L(\xi_i, \xi_{ia})$, then we have
$$\Theta(\delta\xi) = \int \frac{\partial L}{\partial\xi_{ia}}\,\delta\xi_i\,\Sigma_a,$$
where $\delta\xi$ is a solution to the linearized field equations (a tangent vector to the
solution space), $\Sigma_a = \partial_a \lrcorner\, \nu$, and the integral is taken over a Cauchy surface.
Clearly $\Omega = d\Theta$ is closed; under appropriate boundary conditions, it is also independent
of the choice of Cauchy surface since the integrand is a closed 3-form on
space-time whenever $\xi$ satisfies the field equations and $\delta\xi$ satisfies the linearized
field equations. It is not always nondegenerate, but can be made so by taking
the quotient by the characteristic distribution of $\Omega$ (see Appendix C); in the
case of Maxwell's equations, for example, the passage to the quotient identifies
gauge-equivalent potentials (see Woodhouse 1992b). When $L$ is independent of
the coordinates $x^a$, the translations in space-time determine Hamiltonian flows
on the solution space (or its quotient), with translation along the constant vector
field $V = V^a\partial_a$ generated by the Hamiltonian
$$H_V = \int V^b\left(\frac{\partial L}{\partial\xi_{ia}}\,\xi_{ib} - \delta^a_b L\right)\Sigma_a.$$
Again as a consequence of the field equations, the integrand is a closed 3-form
on space-time, and so $H_V$ is independent of the choice of hypersurface.
In the case of the ASDYM equation, we have two Lagrangian formulations,
and therefore two symplectic structures for a given choice of double-null coordi-
nates. Before describing them explicitly, we shall first consider what we mean
by the `solution space' in this context. We shall work in the complex, and not
consider here the issue of boundary conditions, but simply derive the formal ex-
pressions for the symplectic forms as integrals over unspecified hypersurfaces. 4
The solution space
By the `solution space' of the ASDYM equation we mean the quotient $\mathcal{M} = \mathcal{C}/\mathcal{G}$,
where $\mathcal{C}$ is the set of ASD connections on a fixed vector bundle $E \to U$, and $\mathcal{G}$ is
the group of active gauge transformations (see Appendix A). Two connections
$D, D' \in \mathcal{C}$ determine the same point of $\mathcal{M}$ whenever they are equivalent in the
sense that there is an automorphism $g: E \to E$ such that $g(E_x) = E_x$ for every
$x \in U$, and such that
$$D(gs) = gD's$$
for every section $s$. We define the projection $\mathcal{C} \to \mathcal{M}$ by mapping $D$ to its
equivalence class $[D]$. If $\mathcal{J}$ and $\mathcal{K}$ denote the respective solution spaces of eqns
(3.3.2) and (3.3.7), then we also have projections
$$\mathcal{J} \to \mathcal{M}, \qquad \mathcal{K} \to \mathcal{M},$$
which map the J and K matrices to the corresponding solutions to the ASDYM
equation. The symplectic structures are defined in the first instance on $\mathcal{J}$ and
$\mathcal{K}$; at a formal level, they can be transferred to $\mathcal{M}$, but this can be done in a
rigorous way only if we are given sections of the last two projection maps, that
is, if we have some way of picking out unique J and K matrices for each ASD
connection.
The linearized equations
If $D = d + \Phi$ is a connection on $E$ and $\Psi$ is a 1-form with values in adj($E$), then
the curvature of $D + \Psi$ is
$$F + 2D\Psi,$$
to the first order in $\Psi$, where $F$ is the curvature of $D$ and $D\Psi = d\Psi + \Phi \wedge \Psi + \Psi \wedge \Phi$.
Thus the linearized ASDYM equation is⁵
$$D\Psi = -{*}D\Psi. \tag{3.5.1}$$
This form of the equation takes account of gauge transformations of the connection
in a natural way. A solution represents a tangent vector to $\mathcal{C}$ at $D$ and
hence, by projection, an element of $T_{[D]}\mathcal{M}$. The projection vanishes whenever
$\Psi$ is given by an infinitesimal gauge transformation of the connection, that is,
whenever
$$\Psi = d\phi + [\Phi, \phi] = D\phi \tag{3.5.2}$$
for some section $\phi$ of the adjoint bundle ($\phi$ is an infinitesimal automorphism of
$E$). Thus a tangent vector to $\mathcal{M}$ is represented by a class of gauge equivalent
1-forms $\Psi$, where we regard $\Psi$ and $\Psi'$ as gauge-equivalent whenever they differ
by $D\phi$ for some $\phi$. By making an appropriate choice for $\phi$, we can always choose
a representative from the class such that
$$\Psi = \Psi_{\tilde w}\,d\tilde w + \Psi_{\tilde z}\,d\tilde z.$$
This is unique up to the addition of $D\phi$, where $D_w\phi = 0 = D_z\phi$.

Perturbations of the J and K matrices

With an appropriate choice of gauge, an ASDYM potential can be written in
either of the two forms
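$$\Phi = J^{-1}\partial_{\tilde w}J\,d\tilde w + J^{-1}\partial_{\tilde z}J\,d\tilde z = \partial_z K\,d\tilde w + \partial_w K\,d\tilde z, \tag{3.5.3}$$
where $J$ and $K$ satisfy eqns (3.3.2) and (3.3.7), respectively. A significant feature
of the ASDYM system is that the linearized forms of these equations are the
same. If we put $\delta J = J\phi$ and $\delta K = \phi$, where $\phi$ is a matrix-valued function
on space-time, then $\delta J$ and $\delta K$ satisfy the respective linearized equations if and
only if
$$\partial_w\partial_{\tilde w}\phi - \partial_z\partial_{\tilde z}\phi + [\Phi_{\tilde w}, \partial_w\phi] - [\Phi_{\tilde z}, \partial_z\phi] = 0. \tag{3.5.4}$$
Under gauge transformations, $\phi$ behaves as a section of the adjoint bundle and
(3.5.4) is the background-coupled wave equation
$$D{*}D\phi = 0. \tag{3.5.5}$$
Here $D = d + [\Phi, \cdot\,]$ is the connection on the adjoint bundle and $*$ is the Hodge
operator on 1-forms, which maps a 1-form $\beta$ to the 3-form
$${*}\beta = (\beta_w\,dw + \beta_z\,dz - \beta_{\tilde w}\,d\tilde w - \beta_{\tilde z}\,d\tilde z) \wedge \omega, \tag{3.5.6}$$
where $\omega = dw \wedge d\tilde w - dz \wedge d\tilde z$.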
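The claim that the two linearizations coincide can be checked directly. The sketch below works in the gauge $\Phi_w = \Phi_z = 0$ and uses the identifications $\Phi_{\tilde w} = J^{-1}\partial_{\tilde w}J = \partial_z K$ and $\Phi_{\tilde z} = J^{-1}\partial_{\tilde z}J = \partial_w K$; nothing beyond Yang's equation and the K-matrix equation is assumed.

```latex
% J-matrix: \delta J = J\phi
\begin{align*}
\delta(J^{-1}\partial_{\tilde w}J) &= \partial_{\tilde w}\phi + [\Phi_{\tilde w}, \phi], \qquad
\delta(J^{-1}\partial_{\tilde z}J) = \partial_{\tilde z}\phi + [\Phi_{\tilde z}, \phi], \\
0 &= \partial_w\bigl(\partial_{\tilde w}\phi + [\Phi_{\tilde w},\phi]\bigr)
   - \partial_z\bigl(\partial_{\tilde z}\phi + [\Phi_{\tilde z},\phi]\bigr) \\
  &= \partial_w\partial_{\tilde w}\phi - \partial_z\partial_{\tilde z}\phi
   + [\Phi_{\tilde w}, \partial_w\phi] - [\Phi_{\tilde z}, \partial_z\phi]
   + [\partial_w\Phi_{\tilde w} - \partial_z\Phi_{\tilde z},\, \phi] ,
\end{align*}
% and the last commutator vanishes by Yang's equation.
% K-matrix: \delta K = \phi in
%   \partial_z\partial_{\tilde z}K - \partial_w\partial_{\tilde w}K + [\partial_w K, \partial_z K] = 0
\begin{align*}
0 &= \partial_z\partial_{\tilde z}\phi - \partial_w\partial_{\tilde w}\phi
   + [\partial_w\phi, \partial_z K] + [\partial_w K, \partial_z\phi] \\
  &= -\bigl(\partial_w\partial_{\tilde w}\phi - \partial_z\partial_{\tilde z}\phi
   + [\Phi_{\tilde w}, \partial_w\phi] - [\Phi_{\tilde z}, \partial_z\phi]\bigr) .
\end{align*}
```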
The tangent spaces
We denote the solution space to the background-coupled wave equation of a
given ASD connection $D$ by $W_D$, and we put
$$\partial_D = dw\,D_w + dz\,D_z, \qquad \tilde\partial_D = d\tilde w\,D_{\tilde w} + d\tilde z\,D_{\tilde z},$$
so that $D = \partial_D + \tilde\partial_D$ (on a Euclidean real slice, $\partial_D$ and $\tilde\partial_D$ are the (1,0) and
(0,1) parts of $D$ with respect to the complex structure in which $w$ and $z$ are
holomorphic coordinates). Within $W_D$, we have two subspaces $W_{D+}$ and $W_{D-}$,
picked out respectively by the conditions $\tilde\partial_D\phi = 0$ and $\partial_D\phi = 0$. By writing
$\delta J = J\phi$ and $\delta K = \phi$, and by constructing $D$ from $J$ or $K$ as above, we have
$$T_J\mathcal{J} = W_D = T_K\mathcal{K}.$$
We also have two projection maps
$$T_J\mathcal{J} \to T_{[D]}\mathcal{M}, \qquad T_K\mathcal{K} \to T_{[D]}\mathcal{M},$$
defined by mapping $\delta J = J\phi$ or $\delta K = \phi$ to the corresponding solution to the
linearized ASDYM equation, which is given in the respective cases by
$$\Psi = D_{\tilde w}\phi\,d\tilde w + D_{\tilde z}\phi\,d\tilde z \quad\text{and}\quad \Psi' = D_z\phi\,d\tilde w + D_w\phi\,d\tilde z.$$
If $\phi \in W_{D+}$, then $\Psi = 0$, while if $\phi \in W_{D-}$, then $\Psi$ is an infinitesimal gauge
transformation and $\Psi' = 0$.
The symplectic forms
Our Lagrangian densities for eqns (3.3.2) and (3.3.7) determine symplectic structures
on $\mathcal{J}$ and $\mathcal{K}$, and hence two bilinear forms on $W_D$ for each solution to the
ASDYM equation. In fact these bilinear forms coincide, and are both given by
$$\Omega(\phi, \phi') = \tfrac{1}{2}\int \operatorname{tr}\bigl(\phi\,\partial_D\phi' - \phi\,\tilde\partial_D\phi' - \phi'\,\partial_D\phi + \phi'\,\tilde\partial_D\phi\bigr) \wedge \omega. \tag{3.5.7}$$

In particular, they depend only on D, and are unchanged when J or K is replaced

by another potential for the same connection.
In the case of the K-matrix, this follows by direct calculation, by writing
V = 2&a'abK. In the case of the J-matrix, the result is less obvious. We have
to show that the exterior derivative of the 1-form $\Theta$ defined on $\mathcal{J}$ by
$$\Theta(\delta J) = \int \operatorname{tr}\bigl(u\lambda - e\mu - \tfrac{1}{2}e(\lambda - \tilde\lambda)\bigr) \wedge \omega$$
coincides at each solution with $\Omega$ (the right-hand side is given by the boundary
terms in the variational calculation). To do this, we choose two commuting
variations $\delta$ and $\delta'$, that is, two commuting vector fields on the solution space of
Yang's equation, and define $\phi$, $u$ and $e$ as before. We put $\psi = \delta J\,J^{-1}$, and use
primes to denote the corresponding quantities with $\delta$ replaced by $\delta'$. We then
have
$$\phi = L(u - e)L^{-1}, \qquad \psi = U(u - e)U^{-1},$$
and hence

tr((u-e)a(u -e')+(u-e)a(u' -e')+(u-e)[A-it,u -e'])

On the other hand, we have
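$$\operatorname{tr}\bigl(\delta(u'\lambda) - \delta'(u\lambda)\bigr) = \operatorname{tr}\bigl(u'(\partial e + [\lambda, e]) - u(\partial e' + [\lambda, e']) - [u, u']\lambda\bigr)$$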
and so on. By using Stokes' theorem to discard exact differentials, and by again
noting that the trace of a strictly upper triangular or strictly lower triangular
matrix vanishes, we conclude that

6(e(b'J)) - 6'(O(6J)) = ftr(a' + a ' - 'a ) Aw. (3.5.8)

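Now $\partial_D\phi = \partial\phi$ and
$$\tilde\partial_D\phi = \tilde\partial\phi + [J^{-1}\tilde\partial J, \phi] = J^{-1}\,\tilde\partial(J\phi J^{-1})\,J.$$
Hence $\Omega(\phi, \phi')$ is also given by the right-hand side of (3.5.8) and so $d\Theta = \Omega$.⁶
In the case of the K-matrix formulation the Hamiltonian generating translations
along $V$ is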
/
Hv = J trVb(aaKabK -
3a°cK1abK,acK1 - 46 a`dKa, KadK)EQ
Note that the integrand does not yield a symmetric energy-momentum tensor
since the Lagrangian depends not only on the metric, but also on $\alpha$. The J-matrix
Hamiltonians can similarly be written out explicitly.
NOTES ON CHAPTER 3
1. The ultrahyperbolic wave operator is the difference between two two-dimensional
Laplacians, or, equivalently, the difference between two two-dimensional wave opera-
tors. It was studied by John (1938); see also Woodhouse (1992a).
2. On a compact Kähler manifold, the integral can be taken over the whole manifold:
the operator $\tilde\partial$ becomes the $\bar\partial$-operator, and the formal element in the proof can be
eliminated.
3. From an unpublished paper by Mason and Sparling.
4. The Hamiltonian theory has also been considered in a rather different way by J.
Schiff in an unpublished paper, SDYM and the hamiltonian structures of integrable
systems (hep-th/9211070).
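5. Written out in full, (3.5.1) is equivalent to
$$\partial_z\Psi_w + [\Phi_z, \Psi_w] - \partial_w\Psi_z - [\Phi_w, \Psi_z] = 0,$$
$$\partial_{\tilde z}\Psi_{\tilde w} + [\Phi_{\tilde z}, \Psi_{\tilde w}] - \partial_{\tilde w}\Psi_{\tilde z} - [\Phi_{\tilde w}, \Psi_{\tilde z}] = 0,$$
and
$$\partial_z\Psi_{\tilde z} + [\Phi_z, \Psi_{\tilde z}] - \partial_{\tilde z}\Psi_z - [\Phi_{\tilde z}, \Psi_z] - \partial_w\Psi_{\tilde w} - [\Phi_w, \Psi_{\tilde w}] + \partial_{\tilde w}\Psi_w + [\Phi_{\tilde w}, \Psi_w] = 0.$$
These are the linearizations of (3.2.1).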
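6. We can use the same calculations to show directly that $\Omega$ determines closed forms
on $\mathcal{J}$ and $\mathcal{K}$. In the first case, we have to show that for three variations $\delta$, $\delta'$ and $\delta''$,
$$\sum\Bigl(\delta\bigl(\Omega(J^{-1}\delta'J, J^{-1}\delta''J)\bigr) - \Omega\bigl(J^{-1}\delta(\delta'J), J^{-1}\delta''J\bigr) + \Omega\bigl(J^{-1}\delta'(\delta J), J^{-1}\delta''J\bigr)\Bigr) = 0,$$
where the sum is over the three cyclic permutations of the variations. However, this
follows from (3.5.8) by using
$$\delta'\phi = J^{-1}\delta'(\delta J) - J^{-1}\delta'J\,J^{-1}\delta J$$
and so on.
The similar result for $K$ follows by first writing
$$\Omega(\delta K, \delta'K) = \tfrac{1}{2}\int \operatorname{tr}\bigl(\delta K\,{*}d\delta'K - \delta'K\,{*}d\delta K - 2\,\delta K\,[dK, \delta'K] \wedge d\tilde w \wedge d\tilde z\bigr)$$
and then by taking a further variation of the right-hand side, summing, and discarding
an exact form from the integrand.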
4
Reduction of the ASDYM equation

In Chapters 5-7, we shall consider the classification of the integrable systems

that arise as symmetry reductions of the anti-self-dual Yang-Mills (ASDYM)
equation. In this chapter, we discuss general aspects of the reduction process.¹
We describe a framework for classification of the reduced equations, and we give
examples of conformal reduction in the Abelian case to illustrate how some of
the basic linear equations of mathematical physics arise from the ASD Maxwell
equations. This is followed by a treatment of symmetry reduction in the non-
Abelian case. We investigate the invariance of connections, define the associated
Higgs fields and discuss their properties. Finally we look at other `hidden' sym-
metries of the ASDYM equation, which do not arise from point transformations
of space-time.
In Appendix A we shall consider the more technical problem of passing from
the invariance, up to gauge, of $D$ to the invariance of $\Phi$, and address some
geometric issues that are important for understanding the sense in which different
choices of the action of the symmetry group can be inequivalent.

4.1 CLASSIFICATION OF REDUCTIONS

Whatever the definition of integrability, the process of reducing an integrable
system of equations by imposing symmetry or by specializing the parameters
leads to another integrable system. Thus we have a partial ordering: system A
is less than system B if A is a reduction of B.
The existence of the ordering has led to the search for a `maximal element'-
an integrable system from which all others can be derived. Such a `universal
integrable system' has not been found, but it has emerged that the ASDYM
equation yields almost all known systems in one and two dimensions, and many
important systems in three dimensions (the most significant omission is a fam-
ily that includes the KP and Davey-Stewartson equations in three dimensions).
Thus we shall not lose much generality by restricting our attention to reduc-
tions of the ASDYM equation. These we can classify by considering the various
ingredients in the reduction process, which are:
(a) a group H of conformal isometries;
(b) a gauge or structure group G;
(c) a lift of the action of H to the bundle E;
(d) a choice of `constants of integration', which may include the conjugacy classes
of some of the Higgs fields;
(e) a choice of gauge for the ASDYM connection or, equivalently, a set of invari-
ants that determine the connection.
The group H in (a) is a symmetry group of the ASDYM equation because proper
conformal isometries of space-time map ASD 2-forms to ASD 2-forms and there-
fore preserve the ASD condition. They are, in fact, the only space-time transfor-
mations with this property, so invariance under point transformations of space-
time is consistent with the ASDYM equations only if the transformations are
conformal.²
The choice involved in (b) is clear. We shall explain (c) in this chapter: it is
particularly important when the action of H on space-time is not free, in which
case there may be a number of different ways of lifting the action to the bundle.
Also algebraic and differential constraints can arise, and there may be a number
of different ways of satisfying them, with different choices leading to different
equations.
The choices under (d) depend on the details of the particular case. It may
happen that some of the reduced equations can be integrated directly to yield
first integrals, which will generally be disposable functions depending on one
variable less than the number of independent variables in the reduced system.
They can appear in the final form of the reduced system, so different choices
of first integral can lead to different equations. It can also happen that the
conjugacy class of a Higgs field is constant, in which case different reductions
can be distinguished by different choices for the normal form of the Higgs field.
We shall give examples in later chapters.
The choices under (e) arise because most of the integrable systems that we
consider are expressed as equations on dependent variables that are not subject to
gauge transformations. On the other hand, even after we have made the choices
in (a)-(d), there will be residual gauge freedom in the gauge potential. This must
be fixed either by choosing a standard gauge or, equivalently, by finding a set
of gauge invariants that completely determine the connection. Different choices
can lead to different reduced equations, the solutions to which are related by
gauge transformations of the corresponding solutions to the ASDYM equation.
For example, we show in §6.3 that the Heisenberg ferromagnet equation and the
nonlinear Schrödinger equation are related in this way. We shall not attempt
to catalogue all possible choices, which would require us to consider all possible
gauge transformations of a given system, but simply restrict ourselves to the
choices that give the standard forms of known integrable equations.³
Although this approach to classifying integrable equations has the obvious
drawback that it is restricted to examples that are known to be reductions of the
ASDYM equations (or the ASD Einstein equations or more general ASD equa-
tions), it does open the way to the application of powerful geometric techniques.
Furthermore, we shall see that it is much less restrictive than might have been
expected.

4.2 REDUCTIONS OF THE LINEAR ASD EQUATION

The following examples illustrate how some of the classical equations of mathe-
matical physics arise by reduction of the linear ASD equation. They are obtained
by requiring that the solutions to the ASD equation should be invariant under
various symmetries of space-time. In some cases, this involves no more than a
requirement that the solution to the wave equation (3.1.3) should be indepen-
dent of one or more of the space-time coordinates, but this is not always the best
starting point because the gauge condition $A_w = A_z = 0$ is not compatible with
all the symmetries of (3.1.1).
Laplace's equation
When $E$ is static, that is, independent of $t$, eqn (3.1.1) is $\operatorname{div} E = 0 = \operatorname{curl} E$. If
we put $E = -\operatorname{grad}\phi$, then the ASD equation reduces to the three-dimensional
Laplace equation
$$\nabla^2\phi = 0.$$
Each electrostatic potential $\phi$ generates a time-independent solution to the ASD
equation.
Two-dimensional equations
In §3.1, we showed that, with an appropriate choice of gauge, the ASD condition
on a complex electromagnetic field comes down to the single linear equation
$$\frac{\partial^2 v}{\partial z\,\partial\tilde z} - \frac{\partial^2 v}{\partial w\,\partial\tilde w} = 0.$$
Under either of the reality conditions (i) $\tilde w = \bar w$, $\tilde z = \bar z$, or (ii) $w$, $z$, $\tilde w$, $\tilde z$ are real,
this is the ultrahyperbolic wave equation, which reduces to:
(a) Laplace's equation in two dimensions if we take the first reality condition
and require that $v$ should be independent of $w$ and $\tilde w$;
(b) the wave equation in two space-time dimensions when we impose either reality
condition and require that $v$ should depend only on $w + \tilde w$ and $z + \tilde z$;
(c) the heat equation $\psi_t = \psi_{xx}$ when we impose the second reality condition
and require that $v$ should be of the form
$$v(w, \tilde w, z, \tilde z) = \psi(x, t)\,e^{\tilde z},$$
where $t = z$ and $x = w + \tilde w$.
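Each of the three reductions can be confirmed by substituting the ansatz into the ultrahyperbolic equation $\partial_z\partial_{\tilde z}v - \partial_w\partial_{\tilde w}v = 0$. The following sketch does this; the auxiliary variables $p$ and $q$ in case (b) are labels introduced here.

```latex
\begin{align*}
% (a) v independent of w and \tilde w, with \tilde z = \bar z
\text{(a)}\quad & \partial_z\partial_{\bar z}v = 0
   \quad\text{(the two-dimensional Laplace equation, up to a factor } \tfrac14\text{)}, \\
% (b) v = v(p, q), p = w + \tilde w, q = z + \tilde z
\text{(b)}\quad & \partial_z\partial_{\tilde z}v = \partial_q^2 v, \quad
   \partial_w\partial_{\tilde w}v = \partial_p^2 v
   \;\Longrightarrow\; \partial_q^2 v - \partial_p^2 v = 0 , \\
% (c) v = \psi(x, t) e^{\tilde z}, t = z, x = w + \tilde w
\text{(c)}\quad & \partial_z\partial_{\tilde z}v = \psi_t\,e^{\tilde z}, \quad
   \partial_w\partial_{\tilde w}v = \psi_{xx}\,e^{\tilde z}
   \;\Longrightarrow\; \psi_t = \psi_{xx} .
\end{align*}
% In case (c), \partial_{\tilde z}v = v and (\partial_w - \partial_{\tilde w})v = 0.
```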
In each case, the symmetry group is the group of translations generated by two
constant vectors $X$ and $Y$. In the first case, $X = \partial_w$ and $Y = \partial_{\tilde w}$; in the second
$X = \partial_w - \partial_{\tilde w}$ and $Y = \partial_z - \partial_{\tilde z}$; and in the third, $X = \partial_w - \partial_{\tilde w}$ and $Y = \partial_{\tilde z}$.
In the first two cases, $v$ is constant along $X$ and $Y$, which translates into the
condition that the Lie derivative of the potential $A$ along $X$ and $Y$ should vanish.
In the third case, $X(v) = 0$ and $Y(v) = v$, which translates into the symmetry
condition on $A$ that
$$\mathcal{L}_X A = 0, \qquad \mathcal{L}_Y A = A,$$
that is, the electromagnetic field is invariant under translation along $X$ and is
rescaled under translation through $Y$.⁴
The first two cases are distinguished by the signature of the metric on the
2-plane spanned by X and Y. In the third case, Y is null and the 2-plane spanned
by X and Y has a degenerate metric. Reduction by non-null translations gives
hyperbolic or elliptic equations, while null reductions can give parabolic equa-
tions. The same is true in the nonlinear theory.
Ordinary differential equations
By imposing symmetry under a three-dimensional group of space-time transfor-
mations, we can reduce the ASD condition to an ordinary differential equation.
For example, we can look for electrostatic potentials of the form
$$\phi(x_1, x_2, x_3) = e^{kx_3 + in\theta}\,y(r),$$
where $x_1 + ix_2 = re^{i\theta}$, and $k$ and $n$ are constant. Then the reduction is Bessel's
equation
$$r^2\frac{d^2y}{dr^2} + r\frac{dy}{dr} + (k^2r^2 - n^2)y = 0.$$
The corresponding static solution to the ASD equations has cylindrical symmetry:
it is invariant up to scale under translation in $x_3$, and under rotation about
the $x_3$-axis.
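The reduction to Bessel's equation follows from writing the three-dimensional Laplacian in cylindrical coordinates. The short derivation below is a sketch of that step, with $\phi = e^{kx_3 + in\theta}y(r)$ and $x_1 + ix_2 = re^{i\theta}$:

```latex
\begin{align*}
\nabla^2\phi
  &= \frac{\partial^2\phi}{\partial r^2} + \frac{1}{r}\frac{\partial\phi}{\partial r}
   + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\theta^2}
   + \frac{\partial^2\phi}{\partial x_3^2} \\
  &= e^{kx_3 + in\theta}\left(y'' + \frac{1}{r}\,y' - \frac{n^2}{r^2}\,y + k^2 y\right) ,
\end{align*}
% so \nabla^2\phi = 0 is equivalent, after multiplying by r^2, to
%   r^2 y'' + r y' + (k^2 r^2 - n^2) y = 0 ,
% which is Bessel's equation of order n in the variable kr.
```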
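We obtain a less familiar example, which has an interesting nonlinear counterpart
in the reduction of the ASDYM equation with gauge group SL(2, C) to
the sixth Painlevé equation (§7.4), by requiring that the solution to the ultrahyperbolic
wave equation
$$\frac{\partial^2 v}{\partial z\,\partial\tilde z} - \frac{\partial^2 v}{\partial w\,\partial\tilde w} = 0$$
should be of the form
$$v = s^{2\ell}\lambda^k\,y(\lambda)\,e^{2ik\theta + 2im\varphi},$$
where $z = re^{i\theta}$, $w = se^{i\varphi}$, $\lambda = r^2/s^2$, and $k$, $\ell$, and $m$ are constant. Here we
obtain the hypergeometric equation
$$\lambda(1 - \lambda)\frac{d^2y}{d\lambda^2} + \bigl((2k+1)(1 - \lambda) + 2\ell\lambda\bigr)\frac{dy}{d\lambda} + (m^2 - \ell^2 - k^2 + 2\ell k)\,y = 0.$$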
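The hypergeometric reduction can be reproduced in three steps. The sketch below assumes the reality condition $\tilde z = \bar z$, $\tilde w = \bar w$ with $z = re^{i\theta}$, $w = se^{i\varphi}$, so that each mixed derivative is a quarter of a two-dimensional Laplacian, and writes $F(\lambda) = \lambda^k y(\lambda)$ (a label introduced here). With these assumptions the coefficient of $y'$ comes out as $(2k+1)(1-\lambda) + 2\ell\lambda$.

```latex
% v = s^{2\ell} F(\lambda) e^{2ik\theta + 2im\varphi}, \lambda = r^2/s^2
\begin{align*}
4\,\partial_z\partial_{\bar z}v
  &= \Bigl(\partial_r^2 + \tfrac1r\partial_r + \tfrac1{r^2}\partial_\theta^2\Bigr)v
   = 4\,s^{2\ell-2}\,e^{2ik\theta + 2im\varphi}
     \Bigl(\lambda F'' + F' - \tfrac{k^2}{\lambda}F\Bigr), \\
4\,\partial_w\partial_{\bar w}v
  &= \Bigl(\partial_s^2 + \tfrac1s\partial_s + \tfrac1{s^2}\partial_\varphi^2\Bigr)v
   = 4\,s^{2\ell-2}\,e^{2ik\theta + 2im\varphi}
     \Bigl(\lambda^2 F'' - (2\ell-1)\lambda F' + (\ell^2 - m^2)F\Bigr), \\
% difference of the two lines
0 &= \lambda(1-\lambda)F'' + \bigl(1 + (2\ell-1)\lambda\bigr)F'
     - \Bigl(\tfrac{k^2}{\lambda} + \ell^2 - m^2\Bigr)F , \\
% substitute F = \lambda^k y
0 &= \lambda(1-\lambda)\,y'' + \bigl((2k+1)(1-\lambda) + 2\ell\lambda\bigr)\,y'
     + (m^2 - \ell^2 - k^2 + 2\ell k)\,y .
\end{align*}
```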

4.3 CONFORMAL REDUCTION IN THE NON-ABELIAN CASE
We now turn to our central theme, which is the study of the differential equations
that arise from the nonlinear ASD condition by conformal reduction, that
is, by requiring that an ASD connection $D = d + \Phi$ should be invariant under
a subgroup of the conformal group. A simple example is the subgroup of time
translations
$$(t, x_1, x_2, x_3) \mapsto (t + a, x_1, x_2, x_3).$$
Here we can impose invariance by writing
$$\Phi = \Phi_0\,dt + \Phi_1\,dx_1 + \Phi_2\,dx_2 + \Phi_3\,dx_3,$$
and by requiring that the components $\Phi_a$ should be independent of the `ignorable
coordinate' $t$. A general gauge transformation $\Phi \mapsto g^{-1}\Phi g + g^{-1}dg$, where
$g$ depends on $t$ as well as on $x_1$, $x_2$, and $x_3$, transforms a potential that is
invariant in this straightforward sense into one that is not, so time-independence
as a condition on the components of $\Phi$ is a restriction both on the connection
and on the gauge in which it is presented. Only the transformations for which $g$
is independent of $t$ preserve the invariance of the gauge.
A connection can be invariant even though it is presented in a gauge that is
not invariant. This is the reason that we introduce in the next sections a gen-
eral definition of invariance, based on the gauge-free notion of a `Lie derivative'
operator. Such operators differentiate the sections of E along the generators of
$H$. In the example, the operator corresponding to the generator of the time
translations is $\partial_t$ in the original gauge, but is given by
$$\partial_t + g^{-1}\partial_t g$$
in a general gauge. Often it is convenient to study an invariant connection in a
general gauge, and sometimes it is actually necessary to do this because invariant
gauges may not exist, for example when the symmetry group does not act freely
on space-time. From the geometric point of view that we shall now describe,
we shall see that the selection of a preferred class of `invariant gauges' involves
choice: it amounts to the selection of an action of the symmetry group on the
vector bundle on which the connection is defined. Inequivalent choices of the
action are possible when the group does not act freely and these can lead to
different reductions.⁵

4.4 INVARIANT CONNECTIONS AND HIGGS FIELDS

The general geometric framework that we shall use to discuss the action of a
group of conformal symmetries on solutions to the ASDYM equations was de-
scribed in §2.5. Let E be a vector bundle over an open subset U of real or
complex space-time and let H be a subgroup of the conformal group that acts
on U. Suppose that we are given a connection D on E, and a lift to E of the
action on $U$. For $\rho \in H$, we define the pull-back connection $\rho^*D$ by
$$(\rho^*D)(\psi) = \rho^*\bigl(D(\rho_*\psi)\bigr).$$
To see what this means in concrete terms, suppose that $e_i$ is a local frame field.
Then $\rho^*e_i$ is also a local frame field. If $D = d + \Phi$ in the local trivialization
determined by $e_i$, then $\rho^*D = d + \rho^*\Phi$ in the trivialization determined by $\rho^*e_i$.
The curvature of $\rho^*D$ is $\rho^*(F)$, where $F$ is the curvature of $D$, so $\rho^*D$ satisfies
the self-duality equation whenever $D$ does.
Note that in general the action of $\rho^*$ changes both the potential and the local
trivialization. If, however, we can choose the trivialization so that the local frame
field is invariant under the lifted action of $\rho$, then $e_i = \rho^*e_i$ and $\rho^*D$ is given
more simply by pulling back the entries in the potential by $\rho$.

Lie derivatives
At the infinitesimal level, the lift assigns a `Lie derivative' operator $\mathcal{L}_X$ to each
conformal Killing vector $X \in \mathfrak{h}$ (the Lie algebra of $H$). This acts on sections of
$E$ and forms with values in $E$ by
$$\mathcal{L}_X s = X(s) + \theta_X s, \qquad \mathcal{L}_X(\alpha s) = (\mathcal{L}'_X\alpha)s + \alpha\,\mathcal{L}_X s,$$
where $s \in \Gamma(E)$, $\alpha$ is a complex form, $\mathcal{L}'$ is the ordinary Lie derivative on
forms, and $\theta_X$ is a matrix-valued function on space-time, which transforms under
changes of gauge by
$$\theta_X \mapsto g^{-1}X(g) + g^{-1}\theta_X g.$$

Invariant connections
We say that the connection is invariant if it is preserved by the action of $H$, that
is, if $\rho^*D = D$ for every $\rho \in H$. At the Lie algebra level, the condition is that
$$\mathcal{L}_X(Ds) = D(\mathcal{L}_X s) \tag{4.4.1}$$
for every $X \in \mathfrak{h}$ and for every $s \in \Gamma(E)$, where $\mathcal{L}_X$ is the Lie derivative on
sections. This still makes sense as a definition of invariance when we are given
only the infinitesimal action of $\mathfrak{h}$ on $U$ and $E$. In terms of the potential, it is
$$\mathcal{L}'_X\Phi + [\theta_X, \Phi] = d\theta_X, \tag{4.4.2}$$
where $\mathcal{L}'$ denotes the ordinary Lie derivative operator on differential forms,
applied to $\Phi$ entry by entry. In an invariant gauge, the condition reduces to
$$\mathcal{L}'_X\Phi = 0.$$
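The step from the operator condition to the condition on the potential can be spelled out as follows. This is a sketch in a local trivialization, assuming $Ds = ds + \Phi s$ and $\mathcal{L}_X s = X(s) + \theta_X s$, and using the fact that the ordinary Lie derivative commutes with $d$.

```latex
\[
\begin{aligned}
\mathcal{L}_X(Ds) &= d\bigl(X(s)\bigr) + \theta_X\,ds
   + (\mathcal{L}'_X\Phi)\,s + \Phi\,X(s) + \theta_X\Phi\,s, \\
D(\mathcal{L}_X s) &= d\bigl(X(s)\bigr) + (d\theta_X)\,s + \theta_X\,ds
   + \Phi\,X(s) + \Phi\,\theta_X\,s, \\
\mathcal{L}_X(Ds) - D(\mathcal{L}_X s)
   &= \bigl(\mathcal{L}'_X\Phi + [\theta_X, \Phi] - d\theta_X\bigr)\,s .
\end{aligned}
\]
```

The difference vanishes for every section $s$ exactly when the bracketed expression is zero, which is the condition on the potential; setting $\theta_X = 0$ gives the invariant-gauge form.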
The action of the operators $\mathcal{L}_X$ extends to sections of adj($E$) by
$$\mathcal{L}_X\phi = X(\phi) + [\theta_X, \phi].$$
When $D$ is invariant, $\mathcal{L}_X$ commutes with $D$, and therefore also with $D^2$. It
follows that
$$\mathcal{L}_X F = \mathcal{L}'_X F + [\theta_X, F] = 0,$$
and hence that invariance is consistent with the ASD condition.
If $H$ is a group of translations, then the action is necessarily free and it is
always possible to find an invariant gauge. In this case, the invariance condition
is simply that the components of $\Phi$ in linear coordinates should be constant
along the generators of $H$.

Higgs fields
When the connection is invariant under translation along one of the coordinate
vectors in a linear coordinate system, the corresponding component of the
potential transforms by conjugation under a change from one invariant gauge to
another. This `Higgs field' is significant for two reasons.⁶ First, its conjugacy
class at each point of space-time is independent of the choice of the invariant
gauge, and can be used to distinguish between different cases of reduction.
Second, it depends only on the other three coordinates, and therefore it is natural
to use it as one of the dependent variables in the reduced equations.
Now consider a general subgroup $H$ of the conformal group and suppose
that $D$ is invariant under some lift of $H$ to $E$. There is then a Higgs field $\phi_X$
associated with each conformal Killing vector $X \in \mathfrak{h}$ (the Lie algebra of $H$),
which measures the difference between the covariant derivative along $X$ and the
Lie derivative along $X$. It is defined by
$$\phi_X s = D_X s - \mathcal{L}_X s$$
for every section $s$ of $E$. The right-hand side is linear (over functions) in $s$, so
the value of the left-hand side at a point $m \in M$ depends only on $s(m)$, and is
linear in $s(m)$. In a local trivialization,
$$\mathcal{L}_X s = X(s) + \theta_X s \quad\text{and}\quad D_X s = X(s) + (X \lrcorner \Phi)s.$$
Therefore the Higgs field has the matrix representation
$$\phi_X = X \lrcorner \Phi - \theta_X.$$
Under a general gauge transformation, $\phi_X$ transforms by conjugation by $g$, that
is, it behaves as a section of adj($E$). In an invariant gauge, $\theta_X = 0$ and $\phi_X =
X \lrcorner \Phi$, so the new general definition extends the one we gave in the particular
example of translation and can be used in any gauge.
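The conjugation rule follows directly from the two transformation laws already given. The following is a sketch, assuming the gauge transformation acts by $\Phi \mapsto g^{-1}\Phi g + g^{-1}dg$ and $\theta_X \mapsto g^{-1}X(g) + g^{-1}\theta_X g$, with primes denoting transformed quantities.

```latex
\[
\begin{aligned}
\phi'_X &= X \lrcorner \Phi' - \theta'_X \\
        &= g^{-1}(X \lrcorner \Phi)\,g + g^{-1}X(g)
           - g^{-1}X(g) - g^{-1}\theta_X\,g \\
        &= g^{-1}\,\phi_X\,g .
\end{aligned}
\]
```

The inhomogeneous terms $g^{-1}X(g)$ cancel, which is why the conjugacy class of the Higgs field is meaningful in any gauge.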
From (4.4.2),⁷
$$X(\phi_Y) + [\theta_X, \phi_Y] = \phi_{[X,Y]} \tag{4.4.3}$$
for every $X, Y \in \mathfrak{h}$. This can be written in the explicitly gauge-invariant form
$$\mathcal{L}_X\phi_Y = \phi_{[X,Y]}.$$

4.5 THE SPACE OF ORBITS

The most straightforward reductions arise when $\mathfrak{h}$ is Abelian and acts freely on
some open subset $U$ of space-time. In this case, $\mathfrak{h}$ has a basis of commuting
vector fields that span the tangent space to the orbit of $\mathfrak{h}$ through each point
of $U$, and we can choose the gauge to be invariant. Then, because the vector
fields commute, the corresponding Higgs fields are constant on the orbits, and
so we can think of them as functions on the space $S$ of orbits.⁸ They determine
the components of the connection along the orbits, so the remaining dependent
variables are the components of $\Phi$ in directions transverse to the orbits, which
are also constant on the orbits.⁹
Other cases are less simple and require a closer look at the geometry of the
symmetry condition. When $\mathfrak{h}$ is not Abelian, the Higgs fields are generally not
constant on the orbits, and when the action is not free, there are differential and
algebraic constraints.
Curvature identities
If $D$ is invariant under the flow of $X$, then
$$X \lrcorner d\Phi = \mathcal{L}'_X\Phi - d(X \lrcorner \Phi) = -[\theta_X, \Phi] - d\phi_X,$$
where $\mathcal{L}'_X$ is the ordinary Lie derivative of forms, applied to $\Phi$ entry by entry.
It follows that for any vector field $V$,
$$F(V, X) = 2\,d\Phi(V, X) + [V \lrcorner \Phi, X \lrcorner \Phi] = D_V\phi_X. \tag{4.5.1}$$
By combining this with (4.4.3), we have that
$$F(X, Y) = \phi_{[X,Y]} + [\phi_X, \phi_Y] \tag{4.5.2}$$
whenever $D$ is invariant under the flows of $X$ and $Y$.
It follows from (4.5.1) that $D_a\phi_X = X^bF_{ab}$, and hence that $D^aD_a\phi_X =
\partial^{[a}X^{b]}F_{ab}$. If $D$ is ASD and invariant along $X$, and if the 2-form $\partial_{[a}X_{b]}$ is SD, then $\phi_X$ is a
solution to the background-coupled wave equation $D(*D\phi) = 0$. In particular,
this holds when $X$ is a translation, an observation that will be significant in the
treatment of hierarchies (Chapter 8).
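The passage from (4.5.1) to the wave equation can be unpacked in index notation. This is a sketch under stated assumptions: flat space-time in linear coordinates (so that the derivative of $X^b$ is $\partial^a X^b$), and the standard fact that an ASD connection satisfies the Yang-Mills equation $D^aF_{ab} = 0$ by the Bianchi identity.

```latex
\[
\begin{aligned}
D^a D_a \phi_X &= D^a\bigl(X^b F_{ab}\bigr)
   = (\partial^a X^b)\,F_{ab} + X^b\,D^a F_{ab} \\
  &= \partial^{[a}X^{b]}\,F_{ab} + X^b\,D^a F_{ab}
   \qquad\text{(since } F_{ab} \text{ is skew).}
\end{aligned}
\]
```

The second term vanishes by the Yang-Mills equation, and the first vanishes when the skew part of $\partial^aX^b$ is self-dual, because the contraction of a self-dual 2-form with an anti-self-dual one is zero. For a translation, $\partial^aX^b = 0$ and both terms vanish at once.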
Kinematic constraints
Suppose now that $D$ is defined on an open subset $U$ in space-time and that it
is invariant under a subgroup $H$ of the conformal group with Lie algebra $\mathfrak{h}$. For
each point $x \in U$, we denote the stabilizer of $x$ by $H_x$ and its Lie algebra by $\mathfrak{h}_x$.
That is,
$$\mathfrak{h}_x = \{X \in \mathfrak{h} \mid X(x) = 0\}.$$
When the action of $H$ is not free, $H_x$ is nontrivial for some points of $U$. In this
case eqns (4.5.1) and (4.5.2) lead to differential and algebraic constraints on the
Higgs fields. The differential constraints are
$$D\phi_X = 0 \ \text{ at } x, \quad \forall X \in \mathfrak{h}_x, \tag{4.5.3}$$
and the algebraic constraints are
$$[\phi_X, \phi_Y] + \phi_{[X,Y]} = 0 \ \text{ at } x, \quad \forall X \in \mathfrak{h}_x,\ Y \in \mathfrak{h}. \tag{4.5.4}$$
We refer to (4.5.3) and (4.5.4) as kinematic constraints because they arise purely
from the action of the symmetry group and do not involve the ASDYM equation.
If $X, Y \in \mathfrak{h}_x$, then $F(X, Y) = 0$, and (4.5.4) implies that $X \mapsto \phi_X(x)$ is a
representation of the Lie algebra $\mathfrak{h}_x$. It is the infinitesimal form of the
representation of $H_x$ on $E_x$ determined by the action of $H$ on $E$ (each $g \in H_x$ fixes $x$, and
so maps $E_x$ to itself). Different lifts can result in inequivalent representations.
Transversals to the orbits
If the dimension of $\mathfrak{h}_x$ is constant in $U$, and if $U$ has been chosen so that the
orbits of $H$ foliate $U$, then we can pick out a submanifold $S \subset U$ that intersects
each orbit transversely at a single point and we can identify $S$ with the quotient
$U/H$. In this case, we put $E' = E|_S$ and let $D' = D|_S$. Then the curvature of $D'$
is $F' = F|_S$ and the restrictions of the Higgs fields to $S$ are sections of adj($E'$).
At points of $S$, the curvature of $D$ can be found from
$$F(V, W) = F'(V, W), \qquad F(V, X) = D'_V\phi_X, \qquad F(X, Y) = \phi_{[X,Y]} + [\phi_X, \phi_Y], \tag{4.5.5}$$
where $V$, $W$ are tangent to $S$ and $X, Y \in \mathfrak{h}$. If the ASD condition holds on
$S$, then, by invariance, it holds throughout $U$. Thus we can express the ASD
condition on an invariant connection $D$ as a system of PDEs for dependent
variables defined on $S$. These are (i) the restricted connection $D'$ and (ii) the
restrictions to $S$ of the Higgs fields.
If $\mathfrak{h}_x \neq 0$ for $x \in S$, then there are differential and algebraic constraints. They
are most easily derived when, for every $x \in S$, $\mathfrak{h}_x = \mathfrak{h}_0$ for some fixed subalgebra
$\mathfrak{h}_0 \subset \mathfrak{h}$; if the orbits are identical homogeneous spaces, then the stabilizers of the
points of $U$ are conjugate in $H$, and we can arrange for $\mathfrak{h}_x$ to be constant
by choosing $S$ to pass through an appropriate base point on each orbit. In this
case, the constraints are that at every point of $S$,
$$D'\phi_X = 0, \qquad [\phi_X, \phi_Y] + \phi_{[X,Y]} = 0 \tag{4.5.6}$$
for all $X \in \mathfrak{h}_0$, $Y \in \mathfrak{h}$. The first of these reduces the structure group of $D'$ from
$G$ to the subgroup that preserves $\phi_X$ for every $X \in \mathfrak{h}_0$.
Example 4.5.1 Let $H = SO(3, \mathbb{C})$ be the subgroup generated by
$$X = x^2\partial_3 - x^3\partial_2, \qquad Y = x^3\partial_1 - x^1\partial_3, \qquad Z = x^1\partial_2 - x^2\partial_1$$
in complex Cartesian coordinates. Here $[X, Y] = -Z$, and so on, and the orbits
are complex 2-surfaces in the hyperplanes of constant $x^0$. We take $U$ to be the
complement of the fixed point set of $H$ and $S$ to be the surface $x^1 = x^2 = 0$,
at each point of which the stabilizer is the subgroup $H_0$ generated by $Z$. We
put $B = \phi_X + i\phi_Y$, $\tilde B = \phi_X - i\phi_Y$, and $C = i\phi_Z$ (restricted to $S$). Then the
algebraic constraints are
$$B + [B, C] = 0, \qquad \tilde B - [\tilde B, C] = 0$$
and the differential constraint is $D'C = 0$. (See §6.8 for the corresponding
reduction of the ASDYM equations.)
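The algebraic constraints of this example can be recovered by a short calculation. The following sketch assumes the kinematic constraint $[\phi_X, \phi_Y] + \phi_{[X,Y]} = 0$ applied with the stabilizer generator $Z$ in the first slot, the cyclic brackets $[X,Y] = -Z$, $[Y,Z] = -X$, $[Z,X] = -Y$, and the definitions $B = \phi_X + i\phi_Y$, $\tilde B = \phi_X - i\phi_Y$, $C = i\phi_Z$.

```latex
\[
\begin{aligned}
&[\phi_Z, \phi_X] + \phi_{[Z,X]} = [\phi_Z, \phi_X] - \phi_Y = 0, \qquad
 [\phi_Z, \phi_Y] + \phi_{[Z,Y]} = [\phi_Z, \phi_Y] + \phi_X = 0, \\
&[\phi_Z, B] = \phi_Y - i\phi_X = -iB, \qquad
 [\phi_Z, \tilde B] = \phi_Y + i\phi_X = i\tilde B, \\
&[C, B] = B, \quad [C, \tilde B] = -\tilde B
 \;\Longrightarrow\;
 B + [B, C] = 0, \quad \tilde B - [\tilde B, C] = 0 .
\end{aligned}
\]
```

So $B$ and $\tilde B$ act as raising and lowering operators for the eigenvalues of $C$, which is what drives the normal forms found when the gauge group is specified.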
Example 4.5.2 Let $H$ be the complex Euclidean group generated by
$$X = \partial_w, \qquad \tilde X = \partial_{\tilde w}, \qquad Z = iw\partial_w - i\tilde w\partial_{\tilde w}$$
in double-null coordinates. Then $[X, Z] = iX$, $[\tilde X, Z] = -i\tilde X$, and the orbits of
$H$ are the 2-planes of constant $z$, $\tilde z$. We take $S$ to be the transversal 2-plane
$w = \tilde w = 0$ and $H_0$ to be the subgroup generated by $Z$, and again put $B = \phi_X$,
$\tilde B = \phi_{\tilde X}$, and $C = -i\phi_Z$ (restricted to $S$). Then the constraints are
$$B + [B, C] = 0, \qquad \tilde B - [\tilde B, C] = 0, \qquad D'C = 0, \tag{4.5.7}$$
which are the same as in the first example. However, we shall see in Example
6.2.1 that the reduced ASDYM equations are different.
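When the gauge group is $SL(2, \mathbb{C})$, the constraints in these examples can be
satisfied with nonzero $B$ or $\tilde B$ only if $C$ has eigenvalues $\pm\tfrac12$. In that case there
is a gauge in which
$$C = \frac12\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & \phi \\ 0 & 0 \end{pmatrix}, \qquad
\tilde B = \begin{pmatrix} 0 & 0 \\ \tilde\phi & 0 \end{pmatrix}, \tag{4.5.8}$$
and the potential of $D'$ is diagonal, so the gauge group has been reduced to the
Abelian group $\mathbb{C}^\times \subset SL(2, \mathbb{C})$.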
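When the gauge group is $GL(n, \mathbb{C})$, the constraints again impose severe
restrictions on the algebraic form of the Higgs fields. If $B$ and $\tilde B$ have maximal
rank $(n - 1)$, then there must be a gauge in which
$$C = \mathrm{diag}(c + n,\ c + n - 1,\ \ldots,\ c + 1),$$
for some constant $c$, and
$$B = \begin{pmatrix}
0 & \phi_1 & 0 & \cdots & 0 \\
0 & 0 & \phi_2 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \phi_{n-1} \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}, \qquad
\tilde B = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 \\
\tilde\phi_1 & 0 & \cdots & 0 & 0 \\
0 & \tilde\phi_2 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & \tilde\phi_{n-1} & 0
\end{pmatrix} \tag{4.5.9}$$
for nonzero functions $\phi_i$, $\tilde\phi_i$, again with the potential of $D'$ diagonal.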
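The normal forms above can be checked mechanically at a single point of $S$. The sketch below assumes that the constraints are $B + [B, C] = 0$ and $\tilde B - [\tilde B, C] = 0$, as in (4.5.7), and replaces the functions $\phi_i$, $\tilde\phi_i$ by arbitrary nonzero sample numbers; the helper names (`higgs_normal_form`, `satisfies_constraints`) are hypothetical and are not taken from the text.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def higgs_normal_form(n, c, phi, phi_tilde):
    """Build C, B and B~ in the shape of (4.5.9) from sample values.

    phi and phi_tilde are lists of n - 1 nonzero numbers standing in for
    the values of the functions at one point (an assumption of this sketch).
    """
    C = [[(c + n - i) if i == j else 0 for j in range(n)] for i in range(n)]
    B = [[phi[i] if j == i + 1 else 0 for j in range(n)] for i in range(n)]
    Bt = [[phi_tilde[j] if i == j + 1 else 0 for j in range(n)]
          for i in range(n)]
    return C, B, Bt


def satisfies_constraints(C, B, Bt):
    """Check B + [B, C] = 0 and B~ - [B~, C] = 0 entry by entry."""
    n = len(C)
    bc, btc = commutator(B, C), commutator(Bt, C)
    first = all(B[i][j] + bc[i][j] == 0 for i in range(n) for j in range(n))
    second = all(Bt[i][j] - btc[i][j] == 0 for i in range(n) for j in range(n))
    return first and second
```

Taking `n = 2` and `c = Fraction(-3, 2)` gives the $SL(2, \mathbb{C})$ case, in which `C` is the diagonal matrix with entries $\pm\tfrac12$. Exact rational arithmetic is used so that the comparison with zero is not affected by rounding.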
Discrete symmetries
We have derived the differential and algebraic constraints associated with the
actions of the identity components of the stabilizer subgroups. Constraints can
also arise from the action of transformations outside the identity components
when the stabilizers are not connected.
Suppose that $\rho \in H$ and that $\rho(x) = x$ for every $x \in S$. Let $\rho_*: E \to E$
be the lift of $\rho$ to $E$ (see §4.4). The restriction of $\rho_*$ to $E|_S$ induces a linear
transformation $\Omega_x: E_x \to E_x$ for each $x \in S$. Given a choice of gauge, $\Omega$ is a
matrix-valued function on $S$, which behaves as a section of the adjoint bundle
adj($E_S$) under gauge transformations. The invariance of $D$ under $H$ implies that
at $S$
$$\Omega\phi_X = \phi_{\rho_*X}\Omega, \qquad (d\Omega + [\Phi, \Omega])|_S = 0, \tag{4.5.10}$$
for any $X \in \mathfrak{h}$.
Let $H_0$ be a subgroup of $H$ and suppose that $S$ is a transversal to the orbits
such that $H_x = H_0$ for every $x \in S$. Then for each $\rho \in H_0$, we have a section $\Omega_\rho$
of adj($E_S$) such that (4.5.10) holds, and such that
$$\Omega_{\rho\rho'} = \Omega_\rho\Omega_{\rho'}$$
for every $\rho, \rho' \in H_0$.

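Example 4.5.3 In Example 4.5.2, $H$ was the group of translations and rotations
in the $w$, $\tilde w$-plane. We now replace the rotation subgroup by a finite group of
rotations in the $w$ plane through integer multiples of $2\pi/n$. That is, we take $H$
to be the group generated by the translations along
$$X = \partial_w, \qquad \tilde X = \partial_{\tilde w}$$
and the rotation
$$\rho: \ w \mapsto \alpha w, \quad \tilde w \mapsto \alpha^{-1}\tilde w, \quad z \mapsto z, \quad \tilde z \mapsto \tilde z,$$
where $\alpha = e^{2\pi i/n}$. Let $S$ be the transversal given by $w = \tilde w = 0$. Then $H_0$ is the
finite group
$$H_0 = \{\rho^k \mid 0 \le k < n\}.$$
With $B$ and $\tilde B$ defined as in Example 4.5.2, the constraints are now
$$\Omega^{-1}B\Omega = \alpha^{-1}B, \qquad \Omega^{-1}\tilde B\Omega = \alpha\tilde B, \qquad (d\Omega + [\Phi, \Omega])|_S = 0,$$
where $\Omega$ is a matrix such that $\Omega^n = 1$.
If $B$ and $\tilde B$ have maximal rank, then there exists a gauge in which
$$\Omega = \mathrm{diag}(\alpha^n,\ \alpha^{n-1},\ \alpha^{n-2},\ \ldots,\ \alpha)$$
and $B$ and $\tilde B$ are of the form
$$B = \begin{pmatrix}
0 & \phi_1 & 0 & \cdots & 0 \\
0 & 0 & \phi_2 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \phi_{n-1} \\
\phi_n & 0 & 0 & \cdots & 0
\end{pmatrix}, \qquad
\tilde B = \begin{pmatrix}
0 & 0 & \cdots & 0 & \tilde\phi_n \\
\tilde\phi_1 & 0 & \cdots & 0 & 0 \\
0 & \tilde\phi_2 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & \tilde\phi_{n-1} & 0
\end{pmatrix}$$
for some nonzero functions $\phi_j$, $\tilde\phi_j$, again with the potential of $D'$ diagonal.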
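Which matrix entries survive a constraint of this kind can be worked out with integer arithmetic alone. The sketch below assumes that $\Omega$ is diagonal with entries that are powers of a primitive $n$-th root of unity $\alpha$, and that the constraints are $\Omega^{-1}B\Omega = \alpha^{-1}B$ and $\Omega^{-1}\tilde B\Omega = \alpha\tilde B$; the function names are hypothetical, and positions are indexed from zero.

```python
def allowed_entries(exponents, weight):
    """Positions (j, k) at which a matrix M may be nonzero when
    Omega^{-1} M Omega = alpha**weight * M, where
    Omega = diag(alpha**e for e in exponents) and alpha is a primitive
    n-th root of unity with n = len(exponents).

    The (j, k) entry of Omega^{-1} M Omega is alpha**(e_k - e_j) * M[j][k],
    so an entry survives exactly when e_k - e_j = weight (mod n).
    """
    n = len(exponents)
    return {(j, k)
            for j in range(n) for k in range(n)
            if (exponents[k] - exponents[j] - weight) % n == 0}


def cyclic_pattern(n):
    """Allowed positions for B (weight -1) and B~ (weight +1) when
    Omega = diag(alpha**n, alpha**(n-1), ..., alpha)."""
    exponents = [n - j for j in range(n)]
    return allowed_entries(exponents, -1), allowed_entries(exponents, +1)
```

Under these assumptions the allowed positions for $B$ are the superdiagonal together with the lower-left corner, and those for $\tilde B$ are the subdiagonal together with the upper-right corner. The corner entries are what distinguish the finite rotation group from the continuous one, where only the superdiagonal and subdiagonal survive.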
Z2 symmetry
When $H_0 = \mathbb{Z}_2$, we can extend the sense in which $D$ is invariant by allowing the
discrete symmetries to interchange $D$ and $D^*$, where
$$D^* = d - \Phi^t$$
is the dual connection on $E^*$. Then, if $\rho$ is the generator of $H_0$, in place of
(4.5.10), we have
$$\Omega\phi_X^t = -\phi_{\rho_*X}\Omega, \qquad (d\Omega + \Phi\Omega + \Omega\Phi^t)|_S = 0,$$
where $X \in \mathfrak{h}$ and $\Omega^2 = 1$. When $\Omega = 1$, in an appropriate gauge, the gauge
group of $D'$ is reduced to the complex orthogonal group, although the Higgs
fields are not necessarily in the Lie algebra of the orthogonal group.¹⁰
Example 4.5.4 Let $H$ be the group generated by the translations
$$X = \partial_w - \partial_{\tilde w}, \qquad Y = \partial_z, \qquad Z = \partial_{\tilde z}$$
and the reflection
$$\rho: \ w \mapsto w, \quad z \mapsto -z, \quad \tilde z \mapsto -\tilde z, \quad \tilde w \mapsto \tilde w.$$
The reflection reverses $Y$ and $Z$, but preserves $X$. If we take
$$S = \{z = 0 = \tilde z,\ w = \tilde w\}$$
and $\Omega = 1$, then the algebraic constraints are that the restrictions to $S$ of the
Higgs fields $\phi_Y$ and $\phi_Z$ should be symmetric, that the components of the
potential of $D'$ should be skew-symmetric, and that $\phi_X$ should be skew-symmetric.
This example arises in the reduction to the classical integrable cases of a spinning
top (§7.2).
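The symmetry properties stated in this example follow by setting $\Omega = 1$ in the $\mathbb{Z}_2$ constraints. This is a sketch, assuming the constraints take the form $\Omega\phi_X^t = -\phi_{\rho_*X}\Omega$ and $(d\Omega + \Phi\Omega + \Omega\Phi^t)|_S = 0$, and using $\rho_*X = X$, $\rho_*Y = -Y$, $\rho_*Z = -Z$, which restates that the reflection preserves $X$ and reverses $Y$ and $Z$.

```latex
\[
\phi_X^{t} = -\phi_{\rho_* X} = -\phi_X, \qquad
\phi_Y^{t} = -\phi_{\rho_* Y} = \phi_Y, \qquad
\phi_Z^{t} = -\phi_{\rho_* Z} = \phi_Z, \qquad
(\Phi + \Phi^{t})\big|_S = 0 .
\]
```

The last relation says that the potential restricted to $S$ takes values in the orthogonal Lie algebra, while two of the three Higgs fields are symmetric and so lie outside it.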

Constraints and boundary conditions

In some cases, the stabilizer is the identity except on some lower-dimensional
subset of U. If we excise this subset, then we can impose the ASD condition
in the remainder of U and solve the reduced equations without encountering
constraints, although the solutions may be multivalued if the resulting region is
no longer simply connected. The algebraic constraints reappear, however, in the
form of boundary conditions when we try to extend the solutions to the excised
points of U. An example is the `hyperbolic monopole' reduction in §5.2.

Dynamic constraints
When the action of a symmetry group is not free, the kinematic constraints
contain algebraic restrictions on the components of the potential. It is also
possible that some of the field equations reduce to purely algebraic conditions
even when the symmetry group acts freely. This happens when the tangent
spaces to the orbits contain $\alpha$-planes.
Consider a point $x$ and suppose that the vector fields $X, Y \in \mathfrak{h}$ span the
tangent space to an $\alpha$-plane through $x$. We can choose the gauge so that $\Phi$
is invariant along $X$ and $Y$, since $X$ and $Y$ are necessarily independent in a
neighbourhood of $x$, and we can choose the coordinates so that $X = \partial_w$ and
$Y = \partial_z$ at $x$. Since $\mathcal{L}'_X\Phi = \mathcal{L}'_Y\Phi = 0$, we have at the point $x$ that
$$\partial_w\Phi_b = X^a\partial_a\Phi_b = -\Phi_a\partial_bX^a, \qquad \partial_z\Phi_b = Y^a\partial_a\Phi_b = -\Phi_a\partial_bY^a.$$
So the first of the ASDYM eqns (3.2.1) becomes the algebraic condition that
$$(\partial_zX^a - \partial_wY^a)\Phi_a + [\Phi_z, \Phi_w]$$
should vanish at $x$. We call such conditions dynamic constraints.

Example 4.5.5 Suppose that $X = \partial_w$, $Y = \partial_z$ everywhere. Then
$$\partial_aX^b = 0 = \partial_aY^b,$$
and the dynamic constraint is $[P, Q] = 0$, where $P$ and $Q$ are the Higgs fields of
$X$ and $Y$.
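The origin of the dynamic constraint can be traced step by step. The sketch below assumes that the first ASDYM equation of (3.2.1), which lies outside this section, reads $\partial_z\Phi_w - \partial_w\Phi_z + [\Phi_z, \Phi_w] = 0$, and that $X = \partial_w$, $Y = \partial_z$ at the point $x$.

```latex
\[
\begin{aligned}
&(\mathcal{L}'_X\Phi)_b = X^a\partial_a\Phi_b + \Phi_a\,\partial_b X^a = 0, \qquad
 (\mathcal{L}'_Y\Phi)_b = Y^a\partial_a\Phi_b + \Phi_a\,\partial_b Y^a = 0, \\
&\partial_w\Phi_z = -\Phi_a\,\partial_z X^a, \qquad
 \partial_z\Phi_w = -\Phi_a\,\partial_w Y^a \qquad\text{at } x, \\
&\partial_z\Phi_w - \partial_w\Phi_z + [\Phi_z, \Phi_w]
  = (\partial_z X^a - \partial_w Y^a)\,\Phi_a + [\Phi_z, \Phi_w] .
\end{aligned}
\]
```

No derivatives of the potential remain on the right-hand side, so the field equation has become a purely algebraic condition at $x$. When $X$ and $Y$ have constant components, only the commutator survives.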

4.6 BÄCKLUND TRANSFORMATIONS

The ASDYM equation also has a number of non-point `hidden' symmetries that
can be used to generate groups of transformations on the solution space of the
full equation and its various symmetry reductions. For example, the Kinnersley-
Chitre transformations of the Ernst equation in general relativity arise in this
way. Such transformations have been found by many people in greater or lesser
generality in a wide variety of contexts, see the papers cited in Woodhouse (1987),
and in Woodhouse and Mason (1988).
Hidden symmetries of Yang's equation
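Consider Yang's form of the ASDYM equation, with gauge group $GL(n, \mathbb{C})$. A
generic $n \times n$ matrix $J$ can be decomposed in the block form
$$J = \begin{pmatrix} A^{-1} - \tilde B\tilde A B & -\tilde B\tilde A \\ \tilde A B & \tilde A \end{pmatrix}
= \begin{pmatrix} 1 & \tilde B \\ 0 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} A^{-1} & 0 \\ 0 & \tilde A \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ B & 1 \end{pmatrix}, \tag{4.6.1}$$
where $A$ is a $k \times k$ nonsingular matrix, $\tilde A$ is a $\tilde k \times \tilde k$ nonsingular matrix ($k + \tilde k = n$),
$B$ is a $\tilde k \times k$ matrix, and $\tilde B$ is a $k \times \tilde k$ matrix; the qualification `generic' rules out,
for example, the case in which the second diagonal block in $J$ is not invertible.
With this decomposition, Yang's equation is the coupled system
$$\partial_z(\tilde A B_{\tilde z} A) - \partial_w(\tilde A B_{\tilde w} A) = 0,$$
$$\partial_{\tilde z}(A \tilde B_z \tilde A) - \partial_{\tilde w}(A \tilde B_w \tilde A) = 0,$$
$$\partial_{\tilde z}(\tilde A^{-1}\tilde A_z)\tilde A^{-1} - \partial_{\tilde w}(\tilde A^{-1}\tilde A_w)\tilde A^{-1} + B_{\tilde z} A \tilde B_z - B_{\tilde w} A \tilde B_w = 0,$$
$$A^{-1}\partial_{\tilde z}(A_z A^{-1}) - A^{-1}\partial_{\tilde w}(A_w A^{-1}) + \tilde B_z \tilde A B_{\tilde z} - \tilde B_w \tilde A B_{\tilde w} = 0.$$
We interpret the first two of these as integrability conditions. They imply the
existence of $B'$ and $\tilde B'$ such that
$$\partial_z \tilde B' = \tilde A B_{\tilde w} A, \qquad \partial_w \tilde B' = \tilde A B_{\tilde z} A,$$
$$\partial_{\tilde z} B' = A \tilde B_w \tilde A, \qquad \partial_{\tilde w} B' = A \tilde B_z \tilde A.$$
If we also put $A' = \tilde A^{-1}$, $\tilde A' = A^{-1}$, $k' = \tilde k$, $\tilde k' = k$, then by replacing $A$, $\tilde A$,
$B$, $\tilde B$ by the primed variables, we obtain a new solution $J'$. We denote this
transformation by
$$\beta_k: J \mapsto J'. \tag{4.6.2}$$

Clearly, $\beta_{n-k} \circ \beta_k$ is the identity.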
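Two steps in this construction can be verified by direct block multiplication. The following is a sketch, assuming the factorization of $J$ into upper-triangular, block-diagonal and lower-triangular factors with blocks $A$, $\tilde A$, $B$, $\tilde B$ ($A$ of size $k \times k$, $\tilde A$ of size $\tilde k \times \tilde k$), and assuming the primed variables are defined by $A' = \tilde A^{-1}$, $\tilde A' = A^{-1}$, $\partial_z\tilde B' = \tilde A B_{\tilde w}A$, $\partial_w\tilde B' = \tilde A B_{\tilde z}A$.

```latex
\[
\begin{pmatrix} 1 & \tilde B \\ 0 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} A^{-1} & 0 \\ 0 & \tilde A \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ B & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & -\tilde B \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} A^{-1} & 0 \\ \tilde A B & \tilde A \end{pmatrix}
=
\begin{pmatrix} A^{-1} - \tilde B\tilde A B & -\tilde B\tilde A \\
                \tilde A B & \tilde A \end{pmatrix},
\]
\[
\partial_{\tilde z}\bigl(A'\,\tilde B'_z\,\tilde A'\bigr)
 - \partial_{\tilde w}\bigl(A'\,\tilde B'_w\,\tilde A'\bigr)
 = \partial_{\tilde z}\bigl(\tilde A^{-1}\tilde A B_{\tilde w} A A^{-1}\bigr)
 - \partial_{\tilde w}\bigl(\tilde A^{-1}\tilde A B_{\tilde z} A A^{-1}\bigr)
 = B_{\tilde w\tilde z} - B_{\tilde z\tilde w} = 0 .
\]
```

The second line shows that one of the integrability conditions for the primed variables holds identically: what was a field equation for the original variables becomes the equality of mixed partial derivatives after the transformation.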

Irregular gauge transformations
We can understand the transformation in another way (Mason et al. 1988). If

h= (AB` 0
i t h= ( 0 A ') ,
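then, because of the particular form of $J$, we can choose the gauge so that
$$\Phi_w = -h_wh^{-1}, \qquad \Phi_z = -h_zh^{-1},$$
$$\Phi_{\tilde z} = -\tilde h_{\tilde z}\tilde h^{-1}, \qquad \Phi_{\tilde w} = -\tilde h_{\tilde w}\tilde h^{-1}$$
and hence so that the linear system is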
A-1A,,,
L- aw (aj+ -BwA -(A
(BiAA;;) '

M = a - (ew + A-'A, (Bw (4.6.3)

( -BZA -(A- Aw )
This is related to the linear system of the transformed ASDYM field by
L' = g-1Lg, M' = g-1Mg, (4.6.4)
where

g- (
0
A
(0 Y
and eqns (4.6.4) are operator equalities, in which $g$ and $g^{-1}$ act on column
vectors by left multiplication. Thus the transformation from the original potential
(determined by $J$) to the new one (determined by $J'$) is an `irregular gauge
transformation', in the sense that it depends on $\zeta$, and has a pole at $\zeta = \infty$ and
zero determinant at $\zeta = 0$. Nevertheless it has the property that $L'$ and $M'$ are
linear in $\zeta$ and so can be interpreted as a Lax pair for an ASDYM field: they
still commute because they are related to $L$ and $M$ by conjugation by $g$.
In either approach, one has either to work with a particular choice of J, or
to make a special choice of the gauge. In general, different starting points will
result in different transformed ASDYM fields. We can exploit this to generate
new solutions, for example by combining the transformations ik with conjugation
of J by constant matrices. In §12.1, we shall see that the result is an action on
the solution space of the loop group LGL(n, C ).
NOTES ON CHAPTER 4
1. See also Forgacs and Manton (1980).
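2. Every conformal isometry preserves $\varepsilon_{ab}{}^{cd}$ up to sign. The converse follows from
the fact that the metric tensor $g_{ab}$ is characterized up to scale by the condition
that $\varepsilon_{ab}{}^{ef}g_{ec}g_{fd}$ should be skew in $a, b, c, d$. The proof of this is straightforward in
2-component spinor notation: the key step is to show that if
$$\varepsilon_{ab}{}^{ef}h_{ec}h_{fd} = \varepsilon_{abcd},$$
where $h_{ab} = h_{ba}$, then $h_{ab} = \pm g_{ab}$. This is immediate on writing $h_{ab} = h_{ABA'B'}$ and
$$\varepsilon_{abcd} = \varepsilon_{AC}\varepsilon_{BD}\varepsilon_{A'D'}\varepsilon_{B'C'} - \varepsilon_{AD}\varepsilon_{BC}\varepsilon_{A'C'}\varepsilon_{B'D'},$$
and contracting with $\varepsilon^{BD}\varepsilon^{D'A'}$.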
3. There is a point of principle here also. Given an ASDYM connection in general
gauge, the task of transforming to a standard gauge often amounts to the solution of
a differential equation that is supplementary to the ASDYM equation. If this
supplementary equation is chosen to be nonintegrable, then the gauge condition will lead to
a nonintegrable equation despite the integrability of the ASDYM system. A similar
problem arises with the choice of coordinates for the ASD vacuum equations. Indeed
there is a reduction of the ASD vacuum equations that fails the Painlevé property for
a given choice of coordinates (Atiyah and Hitchin 1988, Ablowitz and Clarkson 1991).
However, when other coordinates are chosen, the system reduces to one satisfying the
Painlevé test. When we come to the twistor construction, we shall see that there exists
a class of gauges that can be obtained algebraically from the solutions to the twistor
Riemann-Hilbert problem and these will lead to integrable equations. The reductions
in this book all use gauges from this class.
4. Maxwell's equations are invariant under constant rescaling of the potential, a
symmetry that is not present in non-Abelian gauge theories. However we can understand
it within the general framework of gauge theory by changing the gauge group of
electromagnetism to an Abelian subgroup of $GL(2, \mathbb{C})$. This also solves another problem,
that there may be a topological obstruction to representing a given electromagnetic
field by a global $U(1)$ connection (Dirac 1931, Kostant 1970). The device is to identify
$A$ with the gauge potential
$$\Phi = \begin{pmatrix} 0 & A \\ 0 & 0 \end{pmatrix}.$$
The structure group is the additive group $\mathbb{C}$, in the representation
$$g = \begin{pmatrix} 1 & z \\ 0 & 1 \end{pmatrix}, \qquad z \in \mathbb{C},$$
and, with this choice, there are no topological obstructions. If the structure group is
extended to include the diagonal subgroup of $SL(2, \mathbb{C})$, then the rescaling symmetry
$A \mapsto \lambda A$ becomes a gauge transformation by a constant diagonal matrix.
5. Note that it is also possible for a connection to be invariant under the same group
with respect to more than one choice of action. For example a time-independent $U(1)$
potential
$$\Phi = i(\phi\,dt - A_1dx - A_2dy - A_3dz),$$
is invariant under time translation in the gauge in which it is presented (and therefore
also in any gauge obtained from this one by a time-independent gauge
transformation). It is also independent of $t$ in the gauge related to the given one by the gauge
transformation $g = e^{ikt}$ for any constant $k$. Different values of $k$ label different actions
of the translation group on the bundle on which the connection is defined. However,
this is a feature of Abelian connections and, more generally, reducible connections. If
an irreducible connection, that is, a connection that does not preserve any nontrivial
sub-bundles, with structure group contained in SL(n, C) is invariant under the action
of a group, then that action is necessarily unique.
6. The terminology comes from Kaluza-Klein theory, in which one considers a pure
Yang-Mills theory in 4+n dimensions with symmetry under a group with n-dimensional
orbits. This gives a Yang-Mills theory in four dimensions coupled to a collection of
'Higgs fields' that, in certain models, play the role of the Higgs field of the standard
model.
7. Since the connection is invariant, $\mathcal{L}_X(D_Ys) = D_{[X,Y]}s + D_Y(\mathcal{L}_Xs)$. By the definition
of a lift, $\mathcal{L}_{[X,Y]}s = \mathcal{L}_X(\mathcal{L}_Ys) - \mathcal{L}_Y(\mathcal{L}_Xs)$. Also $D_Y(\mathcal{L}_Xs) = \mathcal{L}_Y(\mathcal{L}_Xs) + \phi_Y(\mathcal{L}_Xs)$.
Hence
$$D_{[X,Y]}s - \mathcal{L}_{[X,Y]}s = \mathcal{L}_X(\phi_Ys) - \phi_Y(\mathcal{L}_Xs) = \mathcal{L}_X(\phi_Y)s.$$
8. The orbits of $\mathfrak{h}$ in $U$ are the connected components of the intersections with $U$ of the
orbits of $H$. That is, they are the leaves of the foliation spanned by the vector fields in $\mathfrak{h}$.
9. By a `component transverse to the orbits' we mean the contraction of $\Phi$ with an
invariant vector field which is not tangent to the orbits.
10. One can include this construction in the previous one by combining $D$ and $D^*$ into
a single connection on $E \oplus E^*$. In the real case, there are further extensions in which
the $\mathbb{Z}_2$ symmetry is allowed to interchange $D$ and the Hermitian adjoint. The gauge
group of $E|_S$ can then be reduced to the unitary group.
5
Reduction to three dimensions

The ASD condition reduces to a system of equations in three dimensions when
we take the quotient of a neighbourhood in space-time by the action of a
one-dimensional group generated by a single conformal Killing vector. The general
features of the various cases are determined by the properties of the generator:
whether or not it is null, whether or not it is hypersurface orthogonal, and
how it behaves near its zeros (the fixed points of the action). We shall see two
significant phenomena in the case of a null translation. First, the invariants of
the Higgs field are constant on the $\alpha$-planes orthogonal to the generator: these
`first integrals' classify further reductions. Second, we find that the reduced
system acquires an unexpected infinite-dimensional point symmetry group.

5.1 THE BOGOMOLNY EQUATION

The basic example is the static form of the ASD condition, which is the reduction
by the one-parameter group of translations along the Killing vector $T = \partial_t$, in
Cartesian coordinates $t, x, y, z$. Whatever choice we make for the action on $E$, we
can construct a trivialization in which the frame field $e_i$ is invariant by picking
any frame on the initial hypersurface $S = \{t = 0\}$, and by using the invariance
condition $\mathcal{L}_Te_i = 0$ as a propagation equation. We then have
$$\Phi = \phi\,dt + A_1dx + A_2dy + A_3dz = \phi\,dt + A,$$
where $A = A_1dx + A_2dy + A_3dz$, and the matrices $\phi$, $A_i$ are independent of $t$.
The Higgs field is $\phi = \phi_T = T \lrcorner \Phi$.
In electromagnetic theory, the $t$-independent solutions to the ASD equation
are given by Laplace's equation. In the case of a general gauge group, the static
reduction is the Bogomolny equation, in which the Higgs field $\phi$ plays the role of
the electrostatic potential. If we put
$$F_{ij} = \partial_iA_j - \partial_jA_i + [A_i, A_j] \quad\text{and}\quad D_i\phi = \partial_i\phi + [A_i, \phi],$$
where $i, j = 1, 2, 3$ and $\partial_1 = \partial_x$ and so on, then the Bogomolny equation is
$$F_{jk} = \varepsilon_{jkl}D_l\phi.$$
We regard two solutions as equivalent if they are related by a gauge
transformation
$$\phi \mapsto g^{-1}\phi g, \qquad A \mapsto g^{-1}Ag + g^{-1}dg,$$
where $g = g(x, y, z)$. This is the residual gauge freedom in the potential $\Phi$.
From a more geometric point of view, $D' = d + A$ is a connection on $E_S = E|_S$
and the $F_{ij}$s are the components of its curvature $F'$, the `magnetic part' of the
curvature of $D$; the Higgs field is a section of the adjoint bundle adj($E_S$) and the
Bogomolny equation is
$$F' = 2{*}D'\phi,$$
where $D'\phi = d\phi + [A, \phi]$ and $*$ is the Hodge operator in three dimensions: it
sends a 1-form with components $\alpha_i$ to the 2-form with components $\tfrac12\varepsilon_{ijk}\alpha_k$. In
the electromagnetic case, the Bianchi identity forces $\phi$ to be harmonic and in
general yields $D'{*}D'\phi = 0$.
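In components, the last statement is a one-line consequence of the Bianchi identity. This is a sketch, assuming the component form $F_{jk} = \varepsilon_{jkl}D_l\phi$ of the Bogomolny equation, the Bianchi identity for the three-dimensional connection, and summation over repeated indices.

```latex
\[
\varepsilon_{ijk}\,D_iF_{jk} = 0
\quad\text{and}\quad
F_{jk} = \varepsilon_{jkl}\,D_l\phi
\;\Longrightarrow\;
\varepsilon_{ijk}\,\varepsilon_{jkl}\,D_iD_l\phi = 2\,D_iD_i\phi = 0 ,
\]
% using \varepsilon_{ijk}\varepsilon_{jkl} = 2\,\delta_{il}.
```

For an Abelian gauge group the commutator terms drop out, $D_i$ reduces to $\partial_i$, and the equation becomes Laplace's equation for $\phi$.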
Monopoles
The monopole solutions arise when the space-time is real, $S$ is three-dimensional
Euclidean space, $G = SU(2)$, and we impose the boundary condition that $\phi$ has
eigenvalues $\pm i$ at infinity. Under this condition, the restriction of the Higgs field
to a sphere of large radius in $S$ is a map into the unit sphere
$$\mathrm{tr}(\phi^2) = -2$$
in the Lie algebra of $SU(2)$. Its degree as a map $S^2 \to S^2$ is a topological
invariant, called the monopole number. At large distances, $E_S$ has preferred
local trivializations in which $\phi$ is diagonal, with diagonal entries $i$, $-i$. These
are related by gauge transformations which take values in the diagonal $U(1)$
subgroup of $SU(2)$. Thus at large distances, the gauge symmetry is `broken',
and $F'$ behaves like an electromagnetic field; in fact, the field of the appropriate
number of magnetic monopoles.¹
The complex Bogomolny equations
In complex space-time, we can choose the double-null coordinates so that
$$X = \partial_w - \partial_{\tilde w},$$
and we can choose the invariant gauge so that $\Phi_w = 0 = \Phi_z$. If we write
$$\Phi = -P\,d\tilde w + Q\,d\tilde z,$$
where $P$ and $Q$ are functions of $x = w + \tilde w$, $z$, and $\tilde z$, then $P$ is the Higgs field
of $X$ and the reduced ASDYM equations take the form
$$\partial_xQ + \partial_{\tilde z}P + [Q, P] = 0, \qquad \partial_zQ + \partial_xP = 0, \tag{5.1.1}$$
which are the commutation conditions for the Lax pair
$$L = \partial_x - \zeta(\partial_{\tilde z} + Q), \qquad M = \partial_z - \zeta(\partial_x - P).$$
Equations (5.1.1) are the complex Bogomolny equations.
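The statement that the reduced equations are the commutation conditions for the Lax pair can be checked by expanding the commutator in powers of $\zeta$. This is a sketch, assuming $L = \partial_x - \zeta(\partial_{\tilde z} + Q)$ and $M = \partial_z - \zeta(\partial_x - P)$, with $P$ and $Q$ acting by matrix multiplication.

```latex
\[
\begin{aligned}
[L, M] &= -\zeta\,[\partial_x,\ \partial_x - P]
          - \zeta\,[\partial_{\tilde z} + Q,\ \partial_z]
          + \zeta^2\,[\partial_{\tilde z} + Q,\ \partial_x - P] \\
       &= \zeta\,\bigl(\partial_x P + \partial_z Q\bigr)
          - \zeta^2\,\bigl(\partial_x Q + \partial_{\tilde z} P + [Q, P]\bigr).
\end{aligned}
\]
```

The commutator vanishes for all $\zeta$ exactly when both coefficients vanish, which gives the pair of equations labelled (5.1.1).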
5.2 HYPERBOLIC MONOPOLES AND OTHER GENERALIZATIONS
Similar reductions occur when the generator is a rotation in $\mathbb{E}$ (Atiyah 1987) or
some other non-null conformal Killing vector.

Reduction by a rotation
For a reduction by a rotational Killing vector, we write the metric in the real
form
$$ds^2 = dx^2 + dy^2 + dr^2 + r^2d\theta^2 = r^2\left(\frac{dx^2 + dy^2 + dr^2}{r^2} + d\theta^2\right)$$
and put $X = \partial_\theta$, which generates an isometric action of $H = S^1$. We then see
that $ds^2$ is conformal to $d\sigma^2 + d\theta^2$, where
$$d\sigma^2 = \frac{dx^2 + dy^2 + dr^2}{r^2}$$
is the metric of the upper half-space ($r > 0$) model of hyperbolic 3-space, $\mathbb{H}^3$.
The $x$, $y$ and $r$ components of $\Phi$ determine a connection $D'$ on a bundle over
$\mathbb{H}^3$, and the Higgs field $\phi = X \lrcorner \Phi = \Phi_\theta$ is a section of the corresponding adjoint
bundle. Again the reduced equation is $F' = 2{*}D'\phi$, where $F'$ is the curvature of
$D'$ and $*$ is now the Hodge operator on forms on $\mathbb{H}^3$.
There is, however, an important difference from the previous case in that
here the choice of lift plays a nontrivial role: if m is a point of the 2-plane in
Euclidean space on which $r = 0$, then $X(m) = 0$ and $H = S^1$ acts by linear
transformations on the fibre $E_m$. The different lifts are parametrized by the
conjugacy classes of the corresponding homomorphisms $S^1 \to G$. When $G =
SU(2)$, the classes are labelled by an integer p > 0. Invariant instanton solutions
on Euclidean space reduce to finite action 'monopole' solutions of the hyperbolic
Bogomolny equations (Atiyah 1987).² The magnetic charge $k$ is related to the
(anti-)instanton number $c_2$ of the Yang-Mills field by $c_2 = 2kp$.
Reduction by a non-null conformal Killing vector
Reduction by a general non-null conformal Killing vector differs only in that
the generator need not be hypersurface orthogonal and that its twist must be
included as an additional term in the reduced equation.
Let $X$ be a conformal Killing vector with a given Lie derivative operator and
let $U$ be a region of space-time on which $X^aX_a \neq 0$. We shall suppose that
the space of integral curves of $X$ in $U$ is a Hausdorff manifold $S$. This quotient
space has a natural metric (analogous to the hyperbolic metric in the previous
example), which is defined by first fixing the scaling of the space-time metric so
that the norm of $X$ is equal to $\pm 1$, and then by transferring the rescaled metric
to $S$ by using the identification of the tangent spaces to $S$ with the 3-spaces in
$TU$ orthogonal to $X$, that is, by taking the Riemannian quotient.
The bundle $E$ descends to a vector bundle $E_S \to S$, defined by taking the
quotient of $E|_U$ by the flow of (2.5.1), so that the sections of $E_S$ are the same
as the invariant sections of $E$. As before, the connection on $E$ descends to a
connection $D'$ on $E_S$, which is defined as follows. Let $Y'$ be a vector field on
$S$, let $\psi'$ be a section of $E_S$, and let $\psi$ be the corresponding invariant section of
$E$. Let $Y$ be the vector field on $U$ orthogonal to $X$ that projects onto $Y'$. Then
$D_Y\psi$ is also invariant and we take $D'_{Y'}\psi'$ to be the corresponding section of $E_S$.
The Higgs field $\phi = X \lrcorner \Phi$ is constant along $X$ in an invariant gauge, and,
under our correspondence, determines a section of the adjoint bundle adj($E_S$).
The twist of $X$ is a 2-form $\omega$ on $S$, which is defined by the property that its
pullback to $U$ is $d\alpha$, where $\alpha$ is the 1-form
$$\alpha = \frac{X_a\,dx^a}{X_bX^b}.$$
Note that $\alpha$ and $\omega$ are independent of the scaling of the space-time metric, and
that, since $X \lrcorner \alpha = 1$ and $\mathcal{L}_X\alpha = 0$, we have $X \lrcorner d\alpha = 0 = \mathcal{L}_Xd\alpha$, which is the
condition for $d\alpha$ to be the pullback of a form on $S$.
The analogue of the Bogomolny equation in this case is
$$2{*}D'\phi = \phi\,\omega + F',$$
where $F'$ is the curvature of $D'$ and $*$ is the Hodge duality operator of the metric
on $S$.

Reduction of Yang's equation

In some cases, it is also possible to make the reduction by requiring that the
$J$-matrix should be constant along $X$. Two conditions are needed to ensure the
existence of one invariant gauge in which $\Phi_w = \Phi_z = 0$ and another in which
$\Phi_{\tilde w} = \Phi_{\tilde z} = 0$.
(a) The flow of $X$ must preserve the two families of 2-planes
\[ w, z = \text{constant}, \qquad \tilde w, \tilde z = \text{constant}. \]
Equivalently, the flow must preserve $\omega = dw \wedge d\tilde w - dz \wedge d\tilde z$ up to scale.
(b) The two triples $X, \partial_w, \partial_z$ and $X, \partial_{\tilde w}, \partial_{\tilde z}$ must both be linearly independent,
otherwise the invariance conditions may be incompatible with the gauge
conditions.
These can be satisfied for generic $X$ for some choice of coordinates and conformal
factor. For example, if $X$ is a non-null translation in $U$, then the reduction is
\[ (\eta^{ij} + v_k\varepsilon^{ijk})\,\partial_i(J^{-1}\partial_j J) = 0, \]
where $\eta_{ij}$ ($i, j = 1, 2, 3$) is the translation-invariant metric on the 3-space orthogonal
to $X$ with signature $(+\,-\,-)$ and $v_k$ is a constant vector with $v_k v^k = \pm 1$. This
equation has been studied in Ward (1988a, b) when $v_i$ is space-like ($\eta^{ij}v_i v_j = -1$)
and in Manakov and Zakharov (1981) when $\eta^{ij}v_i v_j = 1$. For $G = SU(n)$, we
have that $J$ is unitary in the former case and Hermitian in the latter. The Hamiltonians
in §3.5 reduce to give conserved quantities. In the former case, the one
generating time translation is non-negative definite and is given by the standard
expression for the chiral model,
\[ \mathrm{tr}\big( (J^{-1}\partial_i J)\,t^i t^j\,(J^{-1}\partial_j J) - \tfrac{1}{2}\,t^i t_i\,(J^{-1}\partial_k J)(J^{-1}\partial^k J) \big), \]
where $t^i$ is a timelike vector such that $t^i v_i = 0$.

5.3 REDUCTION BY A NULL TRANSLATION

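When the conformal Killing vector is null, the reduced equations have a rather
different character. A simple example is the group of null translations, generated
by $Y = \partial_{\tilde z}$. In an invariant gauge the components of the potential are
independent of $\tilde z$, the ASDYM equations are
\[
\begin{aligned}
\partial_z\Phi_w - \partial_w\Phi_z + [\Phi_z, \Phi_w] &= 0,\\
\partial_{\tilde w}\Phi_{\tilde z} + [\Phi_{\tilde w}, \Phi_{\tilde z}] &= 0,\\
\partial_z\Phi_{\tilde z} + \partial_{\tilde w}\Phi_w - \partial_w\Phi_{\tilde w} + [\Phi_z, \Phi_{\tilde z}] - [\Phi_w, \Phi_{\tilde w}] &= 0,
\end{aligned} \tag{5.3.1}
\]
and the Higgs field is $Q = \Phi_{\tilde z}$. The second of eqns (5.3.1) implies that the
eigenvalues of $Q$ and the invariants $\mathrm{tr}(Q^k)$ are independent of $\tilde w$, and so are
functions only of $w$ and $z$. They are the `first integrals' alluded to earlier.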

Point symmetries
To see the second major feature of null reductions, the infinite-dimensional group
of point symmetries, we simplify the equations further as follows. We use the first
of eqns (5.3.1) to deduce that we can make $\Phi_w$ and $\Phi_z$ vanish by a transformation
to another invariant gauge. With this choice, the reduced equations come down
to
\[ \partial_{\tilde w}Q + [P, Q] = 0, \qquad \partial_z Q - \partial_w P = 0, \tag{5.3.2} \]
where $P = \Phi_{\tilde w}$, with the residual gauge freedom
\[ Q \mapsto g^{-1}Qg, \qquad P \mapsto g^{-1}Pg + g^{-1}\partial_{\tilde w}g, \]
where $g$ depends on $\tilde w$ alone.
We make a further simplification by interpreting the second equation in
(5.3.2) as an integrability condition. It implies that there is a potential $K$ such
that $Q = \partial_w K$, $P = \partial_z K$. We are then left with the single equation
\[ \partial_{\tilde w}\partial_w K + [\partial_z K, \partial_w K] = 0, \tag{5.3.3} \]
which we could also have derived directly from (3.3.7).
Equation (5.3.3) has a coordinate symmetry which is not at all obvious in
the original system of equations. It is invariant under
\[ w \mapsto f(w, z), \qquad \tilde w \mapsto \tilde w, \qquad K \mapsto K \]
for arbitrary $f$. This induces the transformation $Q \mapsto f_w^{-1}Q$ of the Higgs field.
Now suppose that the gauge group is $SL(2,\mathbb{C})$ and that $Q$ is nonsingular.
Then the eigenvalues of $Q$ are $\pm\lambda$, where $\lambda$ is a nonvanishing function of $w$
and $z$. Under the coordinate symmetry, $\lambda \mapsto f_w^{-1}\lambda$. Therefore, since $f$ is arbitrary,
$\lambda$ is not constrained by (5.3.2) and can be any chosen function of $w$ and
$z$. In the $SL(2,\mathbb{C})$ case, therefore, the choice of first integral can be absorbed
into the coordinate freedom by considering solutions to be the same whenever
they are related by combinations of gauge transformations and these new point
transformations.
Zakharov's system
Strachan (1992) observed that when the potential has a certain special form, the
reduction is equivalent to the complexification of a system of equations introduced
by Zakharov (1980). Thus Zakharov's system is embedded in the ASDYM
equations. Strachan's ansatz is equivalent to choosing $\lambda = 1$ and is obtained
explicitly by making a different choice of invariant gauge for the Yang-Mills potential
from the one above. The argument is typical of many that we shall meet
later, so we give it in detail.
We choose first a new gauge in which $Q = \Phi_{\tilde z} = \mathrm{diag}(1, -1)$. This is possible
because $Q$ transforms by conjugation. From the ASD condition, $[Q, \Phi_{\tilde w}] = 0$,
so $\Phi_{\tilde w}$ is also diagonal in this gauge. A further diagonal gauge transformation
reduces $\Phi_{\tilde w}$ to zero, leaving $Q$ unchanged. Then, again from the ASD condition,
\[ \partial_{\tilde w}\Phi_w + [\Phi_z, Q] = 0. \tag{5.3.4} \]
Therefore the diagonal elements in $\Phi_w$ are independent of $\tilde w$. So, by a diagonal
gauge transformation depending only on $w$ and $z$, we can make $\Phi_w$ anti-diagonal,
while leaving $Q$ and $\Phi_{\tilde w}$ unchanged. It follows that there is an invariant gauge
such that
\[ Q = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \Phi_w = \begin{pmatrix} 0 & q \\ p & 0 \end{pmatrix}, \qquad \Phi_{\tilde w} = 0, \]
which is Strachan's starting point. Equation (5.3.4) implies that
\[ \Phi_z = \frac{1}{4}\begin{pmatrix} -V & 2q_{\tilde w} \\ -2p_{\tilde w} & V \end{pmatrix} \]
for some function $V$. On substituting this into the ASD condition, we obtain the
complex form of Zakharov's system
\[ V_w = -2(pq)_{\tilde w}, \qquad 2q_z = q_{w\tilde w} + qV, \qquad 2p_z = -p_{w\tilde w} - pV. \tag{5.3.5} \]
The real form
\[ i\psi_t = \psi_{xy} + V\psi, \qquad V_x = 2(|\psi|^2)_y \]
is given by putting $\psi = q$, $t = iz$, $x = w$, $y = \tilde w$, and by taking real
values for $t$, $x$, $y$, that is, by reducing an $SU(2)$ connection on $U$.
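The step from (5.3.4) to the form of $\Phi_z$ rests on a single commutator. As a minimal sketch (plain Python with exact rational arithmetic; the helper names `mat_mul` and `commutator` and the sample entries are illustrative assumptions, not notation from the text), one can check that for $Q = \mathrm{diag}(1,-1)$ the commutator $[\Phi_z, Q]$ has vanishing diagonal. This is why (5.3.4) forces the diagonal of $\Phi_w$ to be independent of $\tilde w$, and why it fixes only the off-diagonal entries of $\Phi_z$.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Q in the gauge used in the text.
Q = [[1, 0], [0, -1]]

# A sample trace-free matrix standing in for Phi_z (entries are arbitrary).
a, b, c = Fraction(3, 7), Fraction(-2, 5), Fraction(11, 3)
Phi_z = [[a, b], [c, -a]]

comm = commutator(Phi_z, Q)
# The diagonal vanishes; the off-diagonal entries are -2b and 2c.
print(comm)
```

Only the off-diagonal entries of $\Phi_z$ enter, so its diagonal entry is left free; up to normalization it is the function $V$ of (5.3.5).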
This observation, together with the coordinate symmetry of the reduced equation,
implies that the complex Zakharov system is equivalent to the $SL(2,\mathbb{C})$
reduction when the Higgs field is nonsingular: a solution to the reduced ASD
equations is labelled by a solution to (5.3.5), together with a choice of the eigenvalue
$\lambda(w, z)$. Given $p$, $q$, $V$ and $\lambda$, we can recover the connection by making an
appropriate choice for $f(w, z)$ in the coordinate transformation.
However, note that here we are not merely thinking of the connection as
being defined modulo gauge but also modulo the coordinate freedom described
above. Something similar happens in the case of reductions to the KdV and
NLS equations, but it is not so simple to deal in the same way with larger gauge
groups: the coordinate symmetry is sufficient in the SL(2, C) case to reduce Q
to standard form only because its eigenvalues are equal and opposite.

First integrals and the case $\mathrm{tr}(Q^2) = 0$

For larger gauge groups another point of view is required. Equations (5.3.2)
imply that
\[ \partial_{\tilde w}\,\mathrm{tr}(Q^k) = 0 \qquad (k = 1, \ldots, n-1), \]
in the case of gauge group $SL(n,\mathbb{C})$. Thus the invariants $\mathrm{tr}(Q^k)$ are first integrals:
they are constant on the $\alpha$-planes orthogonal to $Y$, but can have arbitrary
dependence on $w$ and $z$. They are disposable functions in the general solution,
and once they are fixed, one can eliminate $n - 1$ of the equations. The number
of remaining equations is the same as the number of remaining unknowns
less the number of degrees of gauge freedom. If we fix the gauge freedom, then
the resulting systems of differential equations will be classified (in part) by the
choices made for the $\mathrm{tr}(Q^k)$. These free functions will appear as coefficients
in the reduced equations. If we require an autonomous reduction, that is, if we
want the system to be invariant under translations, then we must choose the
$\mathrm{tr}(Q^k)$ to be constants.
For example, another interesting system is obtained when the gauge group
is chosen to be $SL(2,\mathbb{C})$ and $\mathrm{tr}(Q^2)$ is taken to be zero. Then there exists a gauge
in which
\[ \Phi_{\tilde w} = 0, \qquad Q = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]
The residual gauge freedom is generated by lower triangular matrices with ones
on the diagonal and an arbitrary function $g(w, z)$ in the bottom left entry. If we
set
\[ \Phi_w = \begin{pmatrix} q & p \\ r & -q \end{pmatrix}, \qquad \Phi_z = \begin{pmatrix} v & u \\ y & -v \end{pmatrix}, \]
then we find that the remaining equations in (5.3.1) reduce to
\[ \partial_{\tilde w}p = 0, \qquad u = -\partial_{\tilde w}q, \qquad 2v = \partial_{\tilde w}r, \qquad \partial_{\tilde w}(pr + q^2 + \partial_w q + \tilde w\,\partial_z p) = 0, \]
and
\[ \partial_w v - \partial_z q + py - ur = 0, \qquad \partial_w y - \partial_z r + 2vr - 2yq = 0, \]
so that $p$ is another first integral depending only on $w$ and $z$. To obtain an
autonomous reduction, we choose $p = -1$ (alternatively we can make the coordinate
transformation $w \mapsto f(w, z)$ to rescale $p$ to $-1$). Then $r - \partial_w q - q^2$
is independent of $\tilde w$, and can be set equal to zero by using the gauge freedom.
With these choices, all the unknowns can be expressed in terms of $q$, and the
equations come down to
\[ 4q_{wz} - q_{www\tilde w} - 8q_w q_{w\tilde w} - 4q_{\tilde w}q_{ww} = 0 \tag{5.3.6} \]
(Schiff 1992). The residual gauge freedom is $q \mapsto q + a(z)$, where $a$ is any function
of $z$. Thus there is a one-to-one correspondence between (i) solutions to the
ASDYM equation with structure group $SL(2,\mathbb{C})$, null translational symmetry,
and $\mathrm{tr}(Q^2) = 0$, modulo the coordinate freedom $w \mapsto f(w, z)$ and (ii) solutions
to (5.3.6), modulo the addition of an arbitrary function of $z$.
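Equation (5.3.6) can be spot-checked numerically on a simple trial function. The sketch below is an illustration under stated assumptions, not part of the derivation: the travelling profile $q = k\tanh\big(k(w + \tilde w + k^2 z)\big)$ is a hypothetical test solution chosen here, `wt` stands for $\tilde w$, and the derivatives are approximated by nested central differences in plain Python.

```python
import math

K = 0.5  # wave number of the trial profile (an arbitrary illustrative choice)

def q(w, wt, z):
    """Trial profile q = K tanh(K (w + wt + K^2 z)); wt stands for w-tilde."""
    return K * math.tanh(K * (w + wt + K * K * z))

def partial(f, index, h=1e-2):
    """Central-difference partial derivative of f in argument number `index`."""
    def df(*x):
        up = list(x)
        dn = list(x)
        up[index] += h
        dn[index] -= h
        return (f(*up) - f(*dn)) / (2 * h)
    return df

W, WT, Z = 0, 1, 2  # argument positions of w, w-tilde and z

q_w = partial(q, W)
q_wt = partial(q, WT)
q_ww = partial(q_w, W)
q_w_wt = partial(q_w, WT)
q_wz = partial(q_w, Z)
q_www_wt = partial(partial(q_ww, W), WT)

def residual(w, wt, z):
    """Left-hand side of (5.3.6), with derivatives replaced by differences."""
    p = (w, wt, z)
    return (4 * q_wz(*p) - q_www_wt(*p)
            - 8 * q_w(*p) * q_w_wt(*p) - 4 * q_wt(*p) * q_ww(*p))

point = (0.4, 0.3, 0.2)
leading_term = 4 * q_wz(*point)
res = residual(*point)
print(res, leading_term)
```

The residual should be small compared with the individual terms, up to discretization error. Adding any function of $z$ to `q` leaves it unchanged, in line with the residual freedom $q \mapsto q + a(z)$.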
Other systems in three dimensions
In the context of the reductions of the ASDYM equation, Zakharov's system and
eqn (5.3.6) are the natural 2 + 1-dimensional generalizations of the Korteweg-de
Vries (KdV) and nonlinear Schrödinger (NLS) equations: we shall see in the next
chapter that when we impose a further translational symmetry along a non-null
vector orthogonal to $Y$, Zakharov's system reduces to the NLS equation and eqn
(5.3.6) reduces to the KdV equation. In the literature, however, rather more
attention has been paid to two other 2 + 1 generalizations: the Kadomtsev-Petviashvili
(KP) equation
\[ (4u_z - u_{www} - 6uu_w)_w \pm 3u_{\tilde w\tilde w} = 0, \]
which generalizes the KdV equation, and the Davey-Stewartson equation, which
generalizes the NLS equation, and has the real form
\[ i\psi_t = \psi_{xy} + V\psi, \qquad V_{xx} + V_{yy} = (|\psi|^2)_{xy}. \]
These are not equivalent to our two reductions of the $SL(2,\mathbb{C})$ ASDYM equations,
at least as far as their Lax pairs are concerned.

NOTES ON CHAPTER 5
1. Although the Bogomolny equations describe static monopoles, they can be used to
analyse their dynamical behaviour in the low-energy limit (Manton 1981, Atiyah and
Hitchin 1985). There are interesting connections between the Bogomolny equations
and Nahm's equations (§7.2); see, for example, Corrigan (1986) and Donaldson (1984).
2. By an `instanton', we mean a solution to the ASDYM equation on E that extends
to $S^4$ (§10.5). If the gauge group is simple, then the bundles over $S^4$ are classified by
a single integer, called the instanton number. See Ward and Wells (1990), §5.2. There
are also solutions to the full Yang-Mills equations on $S^4$ which are not global minima
of the action, and which are therefore not constrained to be ASD.
6
Reduction to two dimensions

We now come to the central examples of reduction, in which the symmetry
group has two-dimensional orbits in space-time and the reduced system has two
independent variables. First we shall look at the two-dimensional translation
groups, from which we obtain the ASD equations on a Riemann surface, the
chiral equations, harmonic map equations, the KdV and NLS equations, and
various parts of the nKdV hierarchies. When we add additional discrete sym-
metries, we get the extended Toda field equation, other cases of the harmonic
condition on maps to homogeneous spaces, and the n-wave equation. Second,
we shall consider other two-dimensional groups that act freely on space-time.
A notable example is the reduction to the Ernst equation in general relativity,
where the group is generated by a translation and a rotation. We also consider
the possibility that the symmetry group has higher dimension, but does not act
freely. An obvious example is the rotation group SO(3), acting on hyperplanes in
E. Here the nonsingular orbits are 2-spheres, but the group is three dimensional.
One reduction by this symmetry is Liouville's equation. Another example is
the complex Euclidean group, which has two-dimensional orbits with nontrivial
isotropy. Here the ASDYM equation reduces to the Toda field equation. 1
Again we shall see that the symmetry groups of the reduced equations are
larger than might have been expected and are often infinite dimensional.

6.1 TWO-DIMENSIONAL GROUPS OF CONFORMAL MOTIONS

We shall consider first a two-dimensional subgroup of the conformal group, generated
by two conformal Killing vectors $X$ and $Y$, which span the tangent space
to the orbit through each point. We shall use two methods to derive the reductions,
the first of which is to impose the symmetry on the potential by requiring
that the Lie derivative of $\Phi$ along $X$ and $Y$ should vanish. When the symmetry
group is Abelian, $[X, Y] = 0$, and there exists a coordinate system in which $X$
and $Y$ are the first two coordinate vectors and the components of $\Phi$ are functions
only of the second pair of coordinates. In this case, we construct the reduced
equations by changing the coordinates in (3.2.1) and discarding the derivatives
with respect to the two ignorable coordinates. When $X$ and $Y$ are translations
we can do this more simply by adding to the linear system,
\[ Ls = Ms = 0, \]
the condition that $s$ should be independent of the ignorable coordinates. The
result is a reduced linear system, for which the reduced equations are the compatibility
conditions. We can construct the reduced linear system in the same
direct way whenever the symmetry transformations map $\alpha$-planes to parallel
$\alpha$-planes (as do translations and left rotations), but for a general symmetry group,
we require further geometric tools (§11.2).
The second method is to impose the symmetry on the solutions to Yang's
equation
\[ \partial_z(J^{-1}\partial_{\tilde z}J) - \partial_w(J^{-1}\partial_{\tilde w}J) = 0, \]
an approach that is not always quite as straightforward as it appears since it is
not possible in every case to impose the symmetry condition on $D$ in the obvious
form that $J$ should be constant along $X$ and $Y$.
Two-dimensional groups of translations
A two-dimensional translation group $H$ is generated by two constant independent
vectors $X$ and $Y$. In complex space-time, there are four possibilities for $H$, up to
conjugacy, which we shall denote by $H_{++}$, $H_{+0}$, $H_{SD}$, $H_{ASD}$. If $\Pi$ denotes the
2-plane spanned by $X$ and $Y$, then the four cases are as follows:
($H_{++}$) the metric on $\Pi$ is nondegenerate;
($H_{+0}$) the metric on $\Pi$ has rank 1;
($H_{SD}$) the metric on $\Pi$ vanishes and $\pi^{ab} = X^{[a}Y^{b]}$ is self-dual;
($H_{ASD}$) the metric on $\Pi$ vanishes and $\pi^{ab} = X^{[a}Y^{b]}$ is anti-self-dual.
In the real case, some of these are either impossible or else admit further subdivision,
depending on the signature of the metric. In the Euclidean case, only
the first is possible; in the Lorentzian case, the first case subdivides according to
the sign of $\pi_{ab}\pi^{ab}$ (which distinguishes the two possible signatures of the metric
on $\Pi$), and the third and fourth are impossible. All four are possible for an
ultrahyperbolic metric, where again the first has two subcases.
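6.2 REDUCTIONS BY $H_{++}$
In complex space-time, we can choose the coordinates so that $H_{++}$ is generated
by the two null vectors $X = \partial_w$, $Y = \partial_{\tilde w}$. Then $\Phi$ depends only on $z$ and $\tilde z$, the
reduced Lax pair is
\[ L = \Phi_w - \zeta(\partial_{\tilde z} + \Phi_{\tilde z}), \qquad M = \partial_z + \Phi_z - \zeta\Phi_{\tilde w}, \tag{6.2.1} \]
and the reduced equations are
\[ \partial_z\Phi_w + [\Phi_z, \Phi_w] = 0, \qquad \partial_{\tilde z}\Phi_{\tilde w} + [\Phi_{\tilde z}, \Phi_{\tilde w}] = 0, \tag{6.2.2} \]
and
\[ \partial_z\Phi_{\tilde z} - \partial_{\tilde z}\Phi_z + [\Phi_z, \Phi_{\tilde z}] - [\Phi_w, \Phi_{\tilde w}] = 0. \tag{6.2.3} \]
There remains the freedom to make gauge transformations of the form
\[ \Phi_w \mapsto g^{-1}\Phi_w g, \qquad \Phi_{\tilde w} \mapsto g^{-1}\Phi_{\tilde w}g, \]
\[ \Phi_{\tilde z} \mapsto g^{-1}\Phi_{\tilde z}g + g^{-1}\partial_{\tilde z}g, \qquad \Phi_z \mapsto g^{-1}\Phi_z g + g^{-1}\partial_z g, \]
where $g$ depends only on $z$, $\tilde z$. We can understand this in geometric terms by
interpreting
\[ D' = d + \Phi_z\,dz + \Phi_{\tilde z}\,d\tilde z \]
as a connection on a bundle $E'$ over the $z, \tilde z$-space, and the Higgs fields $P = \Phi_w$
and $Q = \Phi_{\tilde w}$ as sections of the corresponding adjoint bundle.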
The self-duality equations on a Riemann surface
On putting $\phi = \Phi_{\tilde w}\,dz$ and $\tilde\phi = \Phi_w\,d\tilde z$, the reduced equations become
\[ D'\phi = 0, \qquad D'\tilde\phi = 0, \qquad F' + \phi\wedge\tilde\phi + \tilde\phi\wedge\phi = 0, \]
where $F'$ is the curvature of $D'$ (a 2-form with values in the adjoint bundle) and
the reduced linear system can be written as a single linear equation
\[ D's - (\zeta\phi + \zeta^{-1}\tilde\phi)s = 0, \]
where $s$ depends on $z$ and $\tilde z$. The operator on the left-hand side maps sections
of $E'$ to 1-forms with values in $E'$, and can be written in terms of $L$ and $M$ as
the combination $dz\,M - \zeta^{-1}d\tilde z\,L$.
Under the Euclidean reality condition, $\tilde z = \bar z$, the equations become
Hitchin's (anti-) self-duality equations on a Riemann surface: in this form, they
exhibit an unexpected invariance under arbitrary conformal mappings of the
complex variable $z$ and can therefore be transferred to a general Riemann surface
(Hitchin 1987a).2
The complex equations
The complex form of the equations can be written entirely in terms of the Higgs
fields by making a special choice of gauge. By putting $\zeta = -1$ in the Lax pair,
we obtain
\[ [\partial_{\tilde z} + \Phi_{\tilde z} + \Phi_w,\ \partial_z + \Phi_z + \Phi_{\tilde w}] = 0, \]
from which it follows that there is an invariant gauge in which $\Phi_{\tilde z} + \Phi_w$ and
$\Phi_z + \Phi_{\tilde w}$ both vanish, and in which the equations therefore come down to
\[ \partial_z P = [Q, P] = -\partial_{\tilde z}Q, \]
where $P$ and $Q$ are the Higgs fields of $X$ and $Y$.
Harmonic maps and the chiral equation
When the metric is real and ultrahyperbolic, there are two real reductions of
Yang's equation by $H_{++}$, corresponding to the two possible signatures of the
metric on $\Pi$. When it is real and Euclidean, there is just one possible signature.
We can see the different nature of the reductions by looking at Yang's equation.
To avoid the problem mentioned above, we adapt the double-null coordinates to
the action of $H_{++}$ in a different way. We now choose the coordinates so that
\[ X = \partial_z - \partial_{\tilde w}, \qquad Y = \partial_w + \partial_{\tilde z}, \]
and put $u = z + \tilde w$, $v = w - \tilde z$. Then $X$, $Y$, $\partial_w$, and $\partial_z$ are independent, as are
$X$, $Y$, $\partial_{\tilde w}$, and $\partial_{\tilde z}$. If $J$ depends only on $u$ and $v$, then Yang's equation reduces
to
\[ \partial_u(J^{-1}\partial_v J) + \partial_v(J^{-1}\partial_u J) = 0, \tag{6.2.4} \]
and we obtain three different reductions by imposing three different reality conditions,
the first two of which are ultrahyperbolic and the third Euclidean.
(a) We put $w = \bar z$, $\tilde w = -\bar{\tilde z}$, so that $X = \bar Y$, $u = \bar v$, and the metric is
\[ ds^2 = 2(dz\,d\tilde z + d\bar z\,d\bar{\tilde z}). \]
The reduction is the equation for harmonic maps $J: \mathbb{C} \to G$,
\[ \partial_u(J^{-1}\partial_{\bar u}J) + \partial_{\bar u}(J^{-1}\partial_u J) = 0 \tag{6.2.5} \]
(Ward 1985).
(b) We take $w, z, \tilde w, \tilde z$ to be real, so that the reduced equation is (6.2.4), but with
$u$ and $v$ real, that is, it is the chiral equation (Ward 1985).
(c) We take the gauge group to be any connected real Lie group $G$ and we impose
the Euclidean reality condition $\tilde w = -\bar w$ and $\tilde z = \bar z$, so that $X = \bar Y$, $u = -\bar v$,
and the equation is again (6.2.5). Since the distribution $\{\partial_w, \partial_z\}$ is complex,
any frame $e_i$ such that $D_z e_i = D_w e_i = 0$ is also necessarily complex, so $J$
takes values in the complexified gauge group, $G_{\mathbb{C}}$. If we put $\tilde e_i = \bar e_i$, then we
have $\bar J = J^{-1}$, which implies that
\[ J = h\bar h^{-1}, \]
where $h \in G_{\mathbb{C}}$ is uniquely determined by $J$ up to multiplication on the
right by an element of $G$. Thus $J$ determines a map into the homogeneous
space $G_{\mathbb{C}}/G$. The reduced equation is the condition that this map should
be harmonic. (In the unitary case, $\tilde e_i$ can be chosen to be the dual basis to
$e_i$, in which case $J$ is Hermitian.)
Note that these forms of the $H_{++}$ reductions also exhibit conformal invariance.
Additional symmetries
We now turn to examples in which we enlarge $H_{++}$ by the inclusion of additional
space-time symmetries, but with the same orbits in space-time, so that the enlarged
group does not act freely. The stabilizer of a typical base-point is then a
nontrivial subgroup $H_0 \subset H$, and the action of $H_0$ leads to algebraic constraints
on the Higgs fields (see §4.5).
The new feature is that it is no longer possible to find an invariant gauge
in which the symmetry can be imposed by the straightforward requirement that
the potential should be independent of the ignorable coordinates. Instead, we
choose a transverse section $S$ of the orbits and use as the dependent variables
the values of the Higgs fields on $S$ and the restriction $D'$ of the connection to
$E|_S$. If the ASD condition is satisfied at $S$, then it is satisfied throughout $U$.
So the derivation of the reduced equations requires only the calculation of the
components of the curvature $F$ of $D$ at points of $S$ in terms of these variables.
For this, we use (4.5.5).
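Example 6.2.1 The Toda field equation. Consider again Example (4.5.2) of
reduction by translations and rotations in $w$ and $\tilde w$. In the notation introduced
there, $S$ is the 2-plane $w = \tilde w = 0$ and $H_0$ is the circle group generated by $Z$.
We shall use $z$ and $\tilde z$ as coordinates on $S$ and write $D' = d + A\,dz + \tilde A\,d\tilde z$.
From (4.5.5), we have
\[ F_{z\tilde z} = \partial_z\tilde A - \partial_{\tilde z}A + [A, \tilde A], \qquad F_{w\tilde w} = [B, \tilde B], \]
\[ F_{zw} = \partial_z B + [A, B], \qquad F_{\tilde z\tilde w} = \partial_{\tilde z}\tilde B + [\tilde A, \tilde B], \]
where $B$ and $\tilde B$ are the Higgs fields of $\partial_w$ and $\partial_{\tilde w}$, respectively. Therefore the
reduction of the ASD condition is
\[ \partial_z B + [A, B] = 0, \qquad \partial_{\tilde z}\tilde B + [\tilde A, \tilde B] = 0, \qquad [B, \tilde B] + \partial_{\tilde z}A - \partial_z\tilde A - [A, \tilde A] = 0, \]
to which we must add the constraints derived in Example (4.5.2), which are
\[ B + [B, C] = 0, \qquad \tilde B - [\tilde B, C] = 0, \qquad D'C = 0. \tag{6.2.6} \]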
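One way to satisfy these with gauge group $GL(n,\mathbb{C})$ is to take $C$, $B$ and $\tilde B$ as
in (4.5.9), with $A = \mathrm{diag}(a_1, \ldots, a_n)$ and $\tilde A = -\mathrm{diag}(\tilde a_1, \ldots, \tilde a_n)$. Then we have
\[ \partial_z\log\phi_1 = a_2 - a_1, \quad \ldots, \quad \partial_z\log\phi_{n-1} = a_n - a_{n-1}, \tag{6.2.7} \]
together with
\[
\begin{aligned}
\phi_1\tilde\phi_1 + \partial_{\tilde z}a_1 + \partial_z\tilde a_1 &= 0,\\
\phi_2\tilde\phi_2 - \phi_1\tilde\phi_1 + \partial_{\tilde z}a_2 + \partial_z\tilde a_2 &= 0,\\
&\;\;\vdots\\
\phi_{n-1}\tilde\phi_{n-1} - \phi_{n-2}\tilde\phi_{n-2} + \partial_{\tilde z}a_{n-1} + \partial_z\tilde a_{n-1} &= 0,\\
-\phi_{n-1}\tilde\phi_{n-1} + \partial_{\tilde z}a_n + \partial_z\tilde a_n &= 0,
\end{aligned} \tag{6.2.8}
\]
and the same equations with tilded and untilded variables interchanged. By
putting $u_i = \log(\phi_i\tilde\phi_i)$ and eliminating the $a$s, we obtain the Toda field equation
\[ \partial_z\partial_{\tilde z}u_i + \sum_{j=1}^{n-1}K_{ij}e^{u_j} = 0 \qquad (i = 1, 2, \ldots, n-1), \]
for the functions $u_i = \log(\phi_i\tilde\phi_i)$, where
\[
K = \begin{pmatrix}
2 & -1 & 0 & 0 & \cdots & 0\\
-1 & 2 & -1 & 0 & \cdots & 0\\
0 & -1 & 2 & -1 & \cdots & 0\\
0 & 0 & -1 & 2 & \cdots & 0\\
\vdots & & & & \ddots & \vdots\\
0 & 0 & 0 & 0 & \cdots & 2
\end{pmatrix}
\]
is the $(n - 1) \times (n - 1)$ Cartan matrix of $SU(n)$. A solution $u_i$ determines $\Phi$ to
within a gauge transformation. Conversely, the $u_i$s are invariantly determined by
the eigenvalues of $B\tilde B$. When $n = 2$, the equation is the Liouville equation. By
putting $\tilde z = \bar z$ and $\tilde\phi_i = \bar\phi_i$, we see that the real form of the Toda field equation
is a reduction in Euclidean space, with $G = U(n)$.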
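The structure of $K$ and the $n = 2$ case can be illustrated with a short sketch. Two things here are assumptions made for illustration only: the helper names, and the trial function $u = -2\log(1 + z\tilde z)$, which is a standard Liouville solution chosen as a test case. The code builds the Cartan matrix and checks by finite differences that the trial function satisfies the Toda field equation as printed above with $n = 2$, that is, $\partial_z\partial_{\tilde z}u + 2e^u = 0$; `zt` stands for $\tilde z$.

```python
import math

def cartan_matrix(n):
    """(n-1) x (n-1) Cartan matrix of SU(n): 2 on the diagonal, -1 beside it."""
    size = n - 1
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(size)] for i in range(size)]

def u(z, zt):
    """Trial solution u = -2 log(1 + z*zt); zt stands for z-tilde."""
    return -2.0 * math.log(1.0 + z * zt)

def mixed_derivative(f, z, zt, h=1e-3):
    """Central-difference approximation to the mixed derivative d^2 f / dz dzt."""
    return (f(z + h, zt + h) - f(z + h, zt - h)
            - f(z - h, zt + h) + f(z - h, zt - h)) / (4 * h * h)

K2 = cartan_matrix(2)  # the 1 x 1 matrix [[2]]
z0, zt0 = 0.3, 0.5
liouville_residual = mixed_derivative(u, z0, zt0) + K2[0][0] * math.exp(u(z0, zt0))
print(cartan_matrix(4), liouville_residual)
```

For this trial function both terms equal $\mp 2/(1 + z\tilde z)^2$, so the residual vanishes up to discretization error.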
Example 6.2.2 Toda field equation: extended case. We arrive at a similar equation,
but with a different Cartan matrix, by replacing the rotation subgroup in
the previous example by a finite group of rotations through $2\pi/n$ in the $w$ plane.
That is, we now take $H$ to be as in Example (4.5.3) and use the solution to the
constraints given there. The reduced ASDYM equations are the same as before,
except that we must add
\[ \partial_z\log\phi_n = a_1 - a_n \]
to (6.2.7) and replace the first and final equations in (6.2.8) by
\[ \phi_1\tilde\phi_1 - \phi_n\tilde\phi_n + \partial_{\tilde z}a_1 + \partial_z\tilde a_1 = 0, \qquad \phi_n\tilde\phi_n - \phi_{n-1}\tilde\phi_{n-1} + \partial_{\tilde z}a_n + \partial_z\tilde a_n = 0, \]
so that the equations now have cyclic symmetry. By putting $u_i = \log(\phi_i\tilde\phi_i)$ and
by eliminating the $a_i$s, we again obtain the Toda field equation
\[ \partial_z\partial_{\tilde z}u_i + \sum_{j=1}^{n}K_{ij}e^{u_j} = 0 \qquad (i = 1, 2, \ldots, n), \]
but this time we have for $n > 2$
\[
K = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0 & 0 & -1\\
-1 & 2 & -1 & \cdots & 0 & 0 & 0\\
0 & -1 & 2 & \cdots & 0 & 0 & 0\\
\vdots & & & \ddots & & & \vdots\\
0 & 0 & 0 & \cdots & -1 & 2 & -1\\
-1 & 0 & 0 & \cdots & 0 & -1 & 2
\end{pmatrix},
\]
which is an extended Cartan matrix. The case $n = 2$ contains the sinh-Gordon
equation, by taking $u_1 = -u_2$.
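The cyclic structure of the extended matrix, and the way the $n = 2$ case collapses to a single $\sinh$ term, can be made concrete with a small sketch. The function names are hypothetical. For $n = 2$ both neighbours of each index coincide, so the accumulation below yields off-diagonal entries $-2$; this is the usual convention for the extended matrix and is assumed here, since the text displays only $n > 2$.

```python
import math

def extended_cartan_matrix(n):
    """n x n extended Cartan matrix with cyclic nearest-neighbour coupling."""
    k = [[0] * n for _ in range(n)]
    for i in range(n):
        k[i][i] += 2
        k[i][(i + 1) % n] -= 1  # neighbour to the right, cyclically
        k[i][(i - 1) % n] -= 1  # neighbour to the left, cyclically
    return k

def toda_sums(k, u):
    """The sums over j of K_ij exp(u_j) that appear in the Toda field equation."""
    return [sum(kij * math.exp(uj) for kij, uj in zip(row, u)) for row in k]

K2 = extended_cartan_matrix(2)
K5 = extended_cartan_matrix(5)

# With u_1 = -u_2 = U the two sums are +4 sinh(U) and -4 sinh(U).
U = 0.7
sums = toda_sums(K2, [U, -U])
print(K2, sums, 4 * math.sinh(U))
```

Each row of the extended matrix sums to zero, so adding the same constant to every $u_i$ merely rescales all the exponential terms by a common factor. With $u_1 = -u_2 = U$ the two components give the same equation, $\partial_z\partial_{\tilde z}U + 4\sinh U = 0$.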
Example 6.2.3 Harmonic maps to Riemann symmetric spaces. We showed
above that under invariance along
\[ X = \partial_z - \partial_{\tilde w}, \qquad Y = \partial_w + \partial_{\tilde z}, \]
and under the reality condition $w = \bar z$, $\tilde w = -\bar{\tilde z}$, Yang's equation reduces to the
harmonic map equation (6.2.5), in which $u = z + \tilde w$. We now take the gauge
group to be $SU(n)$ and make a further reduction by imposing symmetry under
the $\mathbb{Z}_2$ action
\[ \sigma: (w, z, \tilde w, \tilde z) \mapsto (-\tilde z, \tilde w, z, -w). \]
We represent the quotient space of the translations by the fixed-point set of $\sigma$,
which is
\[ S = \{w + \tilde z = 0,\ \tilde w - z = 0\}, \]
and we let $\Sigma$ denote the lift of $\sigma$ to the vector bundle at $S$. We take $e_i$ to be a
frame that is invariant under translation along $X$ and $Y$ and satisfies
\[ D_{\tilde z}e_i = 0, \qquad D_{\tilde w}e_i = 0. \]
Then $\tilde e_i = \sigma_* e_i$ is also invariant and satisfies
\[ D_z\tilde e_i = 0, \qquad D_w\tilde e_i = 0. \]
Moreover, if $J(u, \bar u)$ is the corresponding $J$ matrix, then we have $\Sigma(e_j) = \tilde e_i J_{ij}$
at $S$. Hence $J^2 = 1$ at $S$, and therefore $J^2 = 1$ everywhere since $J$ is constant
along $X$ and $Y$. It follows that the eigenvalues of $J$ are 1 and $-1$. Suppose
that the corresponding eigenspaces are $k$-dimensional and $(n - k)$-dimensional
for some constant $k$. These spaces are orthogonal with respect to the Hermitian
inner product on $E$, and so $J$ determines, and is determined by, a map from the
complex $u$-plane to the Grassmannian $\mathrm{Gr}_k(\mathbb{C}^n)$ of $k$-planes in $\mathbb{C}^n$ for some $k$.
Moreover, the embedding of $\mathrm{Gr}_k(\mathbb{C}^n)$ in $SU(n)$ is totally geodesic, and so $J$ can
be identified with a harmonic map $\mathbb{C} \to \mathrm{Gr}_k(\mathbb{C}^n)$.
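The correspondence between a $k$-plane and a matrix with $J^2 = 1$ can be made concrete. The sketch below assumes the encoding $J = 2\pi - 1$, where $\pi$ is the orthogonal projector onto the plane. This is one standard way to realize the correspondence and is not taken from the text; the sample vector and helper names are illustrative.

```python
def projector(vectors, n):
    """Orthogonal projector onto the span of orthonormal vectors in C^n."""
    return [[sum(v[i] * v[j].conjugate() for v in vectors) for j in range(n)]
            for i in range(n)]

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 3
s = 2 ** -0.5
plane = [[s, 1j * s, 0]]  # a single unit vector: a 1-plane in C^3 (k = 1)

pi = projector(plane, n)
J = [[2 * pi[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]

J_squared = mat_mul(J, J)
trace_J = sum(J[i][i] for i in range(n))
print(J_squared, trace_J)
```

The trace of $J$ is $2k - n$, so $k$ can be read off from $J$, and the $+1$ eigenspace recovers the plane.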
The construction extends to harmonic maps into a general Riemannian symmetric
space, as follows. We suppose that the symmetric space $N$ is constructed
from a real Lie group $G$, with an invariant metric and an involutive automorphism
$\tau: G \to G$, by taking $N$ to be
\[ N = \{g \in G \mid \tau(g) = g^{-1}\}. \]
(For full definitions, see Burstall and Rawnsley 1990, and Helgason 1962.) We
then consider an ASDYM field with gauge group $G$, with the same reality condition
on space-time and the same translational symmetry as before, but we
impose the additional symmetry condition that $\sigma$ should lift to a $\mathbb{Z}_2$ action $\sigma_*$
on the associated principal bundle (the frame bundle), in such a way that it
coincides with the action of $\tau$ on the fibres over the fixed point set $S$. If a section
$g$ of the principal bundle satisfies $D_{\tilde z}g = D_{\tilde w}g = 0$, then $\tilde g = \sigma_*(g)$ satisfies
$D_z\tilde g = D_w\tilde g = 0$, and the argument above extends to show that the corresponding $J$
matrix $g\tilde g^{-1}$ takes values in $N$ and satisfies the harmonic map equation.
Remark. If we attempt to generalize this construction to the case of a $\mathbb{Z}_n$ action
with $n > 2$, we can still obtain a representation of the field in terms of a function
with values in some more general homogeneous space, but the equations will no
longer reduce directly to the harmonic map equation.

6.3 REDUCTION BY $H_{+0}$

When the metric on $\Pi$ has rank 1, we can choose the generators of $H$ so that $X$
is non-null and $Y$ is null and orthogonal to $X$. Then, with an appropriate choice
of double-null coordinates in complex space-time,
\[ X = \partial_w - \partial_{\tilde w}, \qquad Y = \partial_{\tilde z}. \]
By introducing the linear coordinates $x = w + \tilde w$ and $t = z$, which are
constant along $X$ and $Y$, and therefore well defined on the quotient space, we can
write the reduced linear system in the form
\[ L = \partial_x + \Phi_w - \zeta Q, \qquad M = \partial_t + \Phi_z - \zeta(\partial_x + \Phi_{\tilde w}), \tag{6.3.1} \]
where the components of $\Phi$ are functions of $x$ and $t$ alone and $Q = Y \lrcorner \Phi$ is the
Higgs field of $Y$. The second Higgs field is $P = X \lrcorner \Phi = \Phi_w - \Phi_{\tilde w}$. From the
compatibility condition $[L, M] = 0$, we find that the reduced ASDYM equations
are
\[ Q_x + [\Phi_{\tilde w}, Q] = 0, \qquad [\partial_x + \Phi_w, \partial_t + \Phi_z] = 0, \]
\[ P_x + [\Phi_w, P] + Q_t + [\Phi_z, Q] = 0. \tag{6.3.2} \]
We can write them in an equivalent, but more geometric, form by introducing
the connection
\[ D' = d + \Phi_w\,dx + \Phi_z\,dt \]
on a bundle $E'$ over the $x, t$-space (the space of orbits), and by interpreting $P$
and $Q$ as sections of the adjoint bundle $\mathrm{adj}(E')$. We then have
\[ F' = 0, \qquad D'_x Q = [P, Q], \qquad D'_x P + D'_t Q = 0, \]
where $F'$ is the curvature of $D'$. In particular, it follows that
\[ \partial_x(\mathrm{tr}\,Q^k) = 0, \qquad (k + 1)\,\partial_x(\mathrm{tr}(PQ^k)) + \partial_t(\mathrm{tr}\,Q^{k+1}) = 0, \tag{6.3.3} \]
so that $\mathrm{tr}(Q^k)$ is independent of $x$ and $\mathrm{tr}(PQ^k)$ is linear in $x$. Thus $\mathrm{tr}(Q^k)$ and
the coefficients of $x^0$ and $x^1$ in $\mathrm{tr}(PQ^k)$ are functions of $t$ alone. When $Q$ is
semisimple, these are the first integrals of the reduced system that we discussed
in §4.1.
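The first identity in (6.3.3) is purely algebraic. Because $D'_x Q = [P, Q]$, the $x$-derivative of $\mathrm{tr}(Q^k)$ equals $k\,\mathrm{tr}(Q^{k-1}[P, Q])$, since the connection term is also a commutator with $Q$ and drops out of the trace. This expression vanishes by cyclicity of the trace. A minimal sketch with exact integer arithmetic follows; the sample matrices and helper names are arbitrary illustrative choices.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, k):
    """k-th power of a square matrix (k >= 0)."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def trace_power_derivative(P, Q, k):
    """k tr(Q^(k-1) [P, Q]), the x-derivative of tr(Q^k) when D'_x Q = [P, Q]."""
    n = len(Q)
    PQ, QP = mat_mul(P, Q), mat_mul(Q, P)
    comm = [[PQ[i][j] - QP[i][j] for j in range(n)] for i in range(n)]
    return k * trace(mat_mul(mat_pow(Q, k - 1), comm))

# Arbitrary trace-free 3x3 samples standing in for the Higgs fields.
P = [[1, 2, -3], [0, 4, 5], [7, -1, -5]]
Q = [[2, -1, 0], [3, 1, 4], [-2, 5, -3]]

derivatives = [trace_power_derivative(P, Q, k) for k in range(1, 5)]
print(derivatives)
```

The vanishing holds for any pair of matrices, so it does not depend on the gauge group.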

Gauge conditions
The geometric form of the reduced equations suggests two natural gauge choices.
From the first of equations (6.3.2), we see that the conjugacy class of $Q$ is independent
of $x$, and therefore that $Q$ can be reduced by a gauge transformation to
a standard normal form in which it depends only on $t$. Moreover we can choose
the gauge transformation so that in addition $\Phi_{\tilde w} = 0$. We say that the resulting
form of the potential is in normal gauge. On the other hand, the second of eqns
(6.3.2) implies that it is possible to make a different choice of the invariant gauge
to reduce the potential to the form
\[ \Phi = Q\,d\tilde z - P\,d\tilde w, \tag{6.3.4} \]
so that $D'$ is explicitly trivial. We call this a Higgs gauge, because the potential
is expressed entirely in terms of the Higgs fields. In a Higgs gauge, the reduced
linear system is
\[ L = \partial_x - \zeta Q, \qquad M = \partial_t - \zeta(\partial_x - P), \tag{6.3.5} \]
and the reduced equations are
\[ \partial_x Q = [P, Q], \qquad \partial_t Q + \partial_x P = 0. \tag{6.3.6} \]
The only residual gauge freedom is $P \mapsto g^{-1}Pg$ and $Q \mapsto g^{-1}Qg$, where $g$ is
a constant matrix. We shall use a normal gauge to derive most of the reduced
equations, and a Higgs gauge to derive the Heisenberg ferromagnet equation,
and the recursion operators in Chapter 8.

The SL(2, C) case

When the gauge group is $SL(2,\mathbb{C})$, two families of solutions are of particular
significance. In the first, $P$ and $Q$ are conjugate to
\[ \begin{pmatrix} 0 & \tilde\psi \\ \psi & 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \]
respectively, where $\psi$ and $\tilde\psi$ satisfy the complex NLS equation. In the second,
$Q$ is conjugate to
\[ \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \]
and $u = -\mathrm{tr}(P^2)$ satisfies the KdV equation. Thus two of the most celebrated
integrable equations are embedded in the $SL(2,\mathbb{C})$ ASDYM equation. The embeddings
are given by particular choices of the first integrals. However, more
than this is true: the reduced ASD equation has an additional coordinate freedom
that can be used to transform all its nontrivial solutions into solutions for
which the Higgs fields have one or other of these special forms, according to
whether $Q$ is semi-simple or nilpotent. Thus the $H_{+0}$ reductions of the $SL(2,\mathbb{C})$
equations are essentially equivalent to the KdV or NLS equation.

Point symmetries
We shall look at the coordinate symmetry first in the context of a general gauge
group. The second of eqns (6.3.6) is the condition for the existence of a matrix-valued
potential $K(x, t)$ such that $Q = \partial_x K$ and $P = -\partial_t K$ in the Higgs gauge.
With $K$ as the dependent variable, the ASD condition comes down to the single
nonlinear equation
\[ \partial_x^2 K = [\partial_x K, \partial_t K] \tag{6.3.7} \]
(which we could also have obtained directly from (3.3.7)). Each solution determines
an ASD connection with the required symmetry; conversely every such
ASD connection determines a solution of (6.3.7), uniquely up to $K \mapsto g^{-1}Kg + c$,
where $g$ and $c$ are constant.
The coordinate symmetries are the nonlinear Galilean transformations
\[ t \mapsto \hat t = f(t), \qquad x \mapsto \hat x = \dot f(t)\,x + m(t), \tag{6.3.8} \]
where $f$ and $m$ are arbitrary functions of $t$, and the dot denotes differentiation
with respect to $t$. These leave (6.3.7) invariant and induce the transformations
\[ Q \mapsto \hat Q = \partial_{\hat x}K = \dot f^{-1}Q, \qquad P \mapsto \hat P = -\partial_{\hat t}K = \dot f^{-1}P + \dot f^{-2}(\ddot f x + \dot m)Q. \tag{6.3.9} \]
Combined with (6.3.8), (6.3.9) is a symmetry of (6.3.6).3
In a general gauge, the transformations can be understood as the motions
of the $(x, t)$ plane that preserve $dt \otimes \partial_x$, which is the tensor representation of a
degenerate two-dimensional $*$-operator from 1-forms to 1-forms, defined by
\[ \alpha = \alpha_x\,dx + \alpha_t\,dt \mapsto {*}\alpha = \alpha_x\,dt. \]
This is clear if we introduce the 1-form $\phi = Q\,dx - P\,dt$, which takes values in
$\mathrm{adj}(E')$, and write the reduced linear system in the invariant form
\[ F' = 0, \qquad D'\phi = 0, \qquad {*}D'\phi + \phi\wedge\phi = 0. \]
Alternatively, the reduced equations are the integrability conditions for
\[ D's - \zeta(\phi + {*}D')s = 0, \]
where $s$ is a section of $E'$. The operator on the left-hand side maps sections of $E'$
to 1-forms with values in $E'$, and is given in terms of the original linear system
by $dx\,L + dt\,M$. Either way, it is clear that the reduced system is invariant under
any motion that preserves $dt \otimes \partial_x$.
We now specialize to the case $G = SL(2,\mathbb{C})$, in which $P$ and $Q$ are $2 \times 2$
trace-free matrices. It follows from (6.3.3) that
\[ \partial_x(\mathrm{tr}\,Q^2) = 0, \qquad \partial_x(\mathrm{tr}(PQ)) = -\tfrac{1}{2}\,\partial_t(\mathrm{tr}\,Q^2). \]
Therefore $\mathrm{tr}(Q^2)$ is independent of $x$ and $\mathrm{tr}(PQ)$ depends linearly on $x$. The
scalars $\mathrm{tr}(Q^2)$ and $\mathrm{tr}(PQ)$ are gauge invariant and characterize the reduced equations.
Under (6.3.9),
\[ \mathrm{tr}(Q^2) \mapsto \mathrm{tr}(\hat Q^2) = \dot f^{-2}\,\mathrm{tr}(Q^2), \]
\[ \mathrm{tr}(PQ) \mapsto \mathrm{tr}(\hat P\hat Q) = \dot f^{-2}\,\mathrm{tr}(PQ) + \dot f^{-3}(\ddot f x + \dot m)\,\mathrm{tr}(Q^2). \]
By making an appropriate choice of $f$ and $m$, we can bring the invariants to a
standard form, and so simplify the reduced equations. We consider two subcases
separately, (i) $Q$ is semisimple and (ii) $Q$ is nilpotent. The other possibility, that
$Q$ vanishes, gives the trivial reduction $\partial_x P = 0$ in the Higgs gauge. In both
subcases, a significant part is played by the third gauge-invariant, $u = -\mathrm{tr}(P^2)$.
The NLS equation
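Suppose that we are given a solution to eqns (6.3.2), with $G = SL(2,\mathbb{C})$. Let
us assume also that $\mathrm{tr}(QP) = 0$ and $\mathrm{tr}(Q^2) = -2$. Then we can find a normal
gauge in which
\[ Q = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad P = \begin{pmatrix} 0 & \tilde\psi \\ \psi & 0 \end{pmatrix}, \]
for some functions $\psi$ and $\tilde\psi$ of $x$ and $t$, and in which the reduced linear system
becomes
\[ L = \partial_x + \begin{pmatrix} 0 & \tilde\psi \\ \psi & 0 \end{pmatrix} - \zeta\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad M = \partial_t + \begin{pmatrix} a & b \\ c & -a \end{pmatrix} - \zeta\partial_x, \tag{6.3.10} \]
where $a$, $b$, $c$ are functions of $x$ and $t$. This gauge is unique, up to a further
transformation by a diagonal matrix $\mathrm{diag}(\lambda, \lambda^{-1})$, under which
\[ \psi \mapsto \lambda^2\psi, \qquad \tilde\psi \mapsto \lambda^{-2}\tilde\psi, \tag{6.3.11} \]
where $\lambda$ is any function of $t$ alone. We claim that $\lambda$ can be chosen, uniquely up
to a constant factor, so that $\psi$ and $\tilde\psi$ satisfy the NLS equation. In fact we have
the following proposition.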

Proposition 6.3.1 The $H_{+0}$-invariant solutions to the ASDYM equation, with
gauge group $SL(2,\mathbb{C})$, such that $\det Q \neq 0$ are parametrized by the solutions to
the complex NLS equation
\[ i\psi_t = -\tfrac{1}{2}\psi_{xx} + \psi^2\tilde\psi, \qquad i\tilde\psi_t = \tfrac{1}{2}\tilde\psi_{xx} - \tilde\psi^2\psi, \]
together with two arbitrary functions of $t$. The parametrization is bijective, up
to the equivalence $(\psi, \tilde\psi) \sim (\lambda^2\psi, \lambda^{-2}\tilde\psi)$ for constant $\lambda$.

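Proof Suppose that $\operatorname{tr}(PQ) = 0$ and $\det Q = 1$. Then the ASD condition is the
compatibility condition for the reduced Lax pair (6.3.10). The constant terms
in $\zeta$ give
\[ \partial_x a = b\tilde\psi - c\psi , \qquad \partial_x b - \partial_t\psi = 2a\psi , \qquad
   \partial_x c - \partial_t\tilde\psi = -2a\tilde\psi , \tag{6.3.12} \]
and the linear terms give
\[ \partial_x\psi - 2ib = 0 , \qquad \partial_x\tilde\psi + 2ic = 0 . \tag{6.3.13} \]
It follows from these and the first of eqns (6.3.12) that $\partial_x(2a + i\psi\tilde\psi) = 0$. We
can now tie down the choice of $\lambda$ to within a constant factor by requiring that
$2a = -i\psi\tilde\psi$. Then on eliminating $a$, $b$, $c$ between (6.3.12) and (6.3.13), we obtain
the complex NLS equations
\[ i\psi_t = -\tfrac12\psi_{xx} + \psi^2\tilde\psi , \qquad
   i\tilde\psi_t = \tfrac12\tilde\psi_{xx} - \tilde\psi^2\psi . \]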

When $\operatorname{tr}(PQ)$ and $\det Q$ are general, we first choose $f$ and $m$ so that $\operatorname{tr}(PQ)$
vanishes and $\det Q = 1$. This is always possible provided that $Q$ is nonsingular.
We then construct $\psi$ and $\tilde\psi$ in the same way from the transformed $Q$ and $P$:
$\psi$ and $\tilde\psi$ are uniquely determined by the original connection $D$, up to
$\psi \mapsto \lambda^2\psi$, $\tilde\psi \mapsto \lambda^{-2}\tilde\psi$, where $\lambda$
is now a constant, and they satisfy the complex NLS equation with the transformed $x$ and $t$ as
the independent variables. We label the connection by $\psi$, $\tilde\psi$, together with $f^2(t)$
and $\dot m(t)$ (the two arbitrary functions of $t$).
Going in the other direction, we start with a pair of functions $\psi(x, t)$ and
$\tilde\psi(x, t)$ that satisfy the complex NLS equation. We put $2a = -i\psi\tilde\psi$ and use
(6.3.13) to define $b$ and $c$. Then (6.3.10) is a compatible linear system for the
ASDYM equations with $H_{+0}$ symmetry, $\det Q = 1$ and $\operatorname{tr}(PQ) = 0$. The other
connections labelled by $\psi$ and $\tilde\psi$ are obtained by applying (6.3.8) for different
choices of $f^2$ and $m$. $\Box$
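Remark. It follows from this proof that if $\psi$ and $\tilde\psi$ satisfy the complex NLS
equation, then
\[ L = \partial_x + \begin{pmatrix} 0 & \psi \\ \tilde\psi & 0 \end{pmatrix}
       - \zeta \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad
   M = \partial_t + \frac{1}{2i}\begin{pmatrix} \psi\tilde\psi & \psi_x \\ -\tilde\psi_x & -\psi\tilde\psi \end{pmatrix}
       - \zeta\partial_x \tag{6.3.14} \]
is the reduced linear system of an ASDYM field. The corresponding solution to
(6.3.6) in the Higgs gauge is
\[ Q = g \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} g^{-1} , \qquad
   P = g_x g^{-1} = g \begin{pmatrix} 0 & \psi \\ \tilde\psi & 0 \end{pmatrix} g^{-1} , \]
where
\[ g^{-1}g_t = \frac{1}{2i}\begin{pmatrix} \psi\tilde\psi & \psi_x \\ -\tilde\psi_x & -\psi\tilde\psi \end{pmatrix} . \]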

Real forms and the Heisenberg ferromagnet equation

The real forms of the NLS equation are obtained by restricting to real values of
$w$, $\tilde w$, $z$ and $\tilde z$, that is, by making the reduction in $\mathbb{U}$, and by requiring that the
structure group should reduce to $SU(2)$ or $SU(1, 1)$ on the real slice.
(a) If the gauge group is $SU(2)$, then $P$ is anti-Hermitian in the normal gauge
and $\tilde\psi = -\bar\psi$. In this case, the reduction is the attractive NLS equation
\[ i\psi_t = -\tfrac12\psi_{xx} - |\psi|^2\psi . \]
(b) If the gauge group is $SU(1, 1)$, then $\tilde\psi = \bar\psi$ and the reduction is the repulsive
NLS equation
\[ i\psi_t = -\tfrac12\psi_{xx} + |\psi|^2\psi . \]
(c) If $\psi$, $\tilde\psi$ are real and $t$ is imaginary, then we obtain another real form with
gauge group $SU(1, 1)$. The equations are a pair of coupled heat equations,
one with time reversed. They are badly behaved in the sense that the generic
solution is singular, and that singularities can develop from regular data in
an arbitrarily short time.
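In the $SU(2)$ case, we obtain a different but gauge equivalent reduction by using
the Higgs gauge in eqn (6.3.4) and by putting $Q = i\mathbf{q}\cdot\boldsymbol\sigma$, $P = i\mathbf{p}\cdot\boldsymbol\sigma$, where $\boldsymbol\sigma$ is
the Pauli 3-vector, with components
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
   \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
   \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
Then eqns (6.3.6) become
\[ \partial_x\mathbf{q} = 2\,\mathbf{q}\times\mathbf{p} , \qquad \partial_t\mathbf{q} + \partial_x\mathbf{p} = 0 , \]
with the constraints $\det Q = \mathbf{q}\cdot\mathbf{q} = 1$, $\operatorname{tr}(PQ) = -2\,\mathbf{p}\cdot\mathbf{q} = 0$. They are equivalent
to the Heisenberg ferromagnet equation
\[ 2\,\partial_t\mathbf{q} = \mathbf{q}\times\partial_x^2\mathbf{q} , \]
with $\mathbf{p}$ defined by $2\mathbf{p} = -\mathbf{q}\times\partial_x\mathbf{q}$ (Faddeev and Takhtajan 1987, Lakshmanan
1977).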
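
The equivalence with the Heisenberg ferromagnet equation can be checked in a few lines. The following is a sketch rather than the authors' own argument; it uses only the constraint $\mathbf{q}\cdot\mathbf{q} = 1$, the definition $2\mathbf{p} = -\mathbf{q}\times\partial_x\mathbf{q}$, and the standard identity $\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b})$.

```latex
% Sketch: the pair of reduced equations versus the Heisenberg ferromagnet equation.
\begin{align*}
\mathbf{q}\cdot\mathbf{q} = 1 \;&\Longrightarrow\; \mathbf{q}\cdot\partial_x\mathbf{q} = 0, \\
2\,\mathbf{q}\times\mathbf{p}
  &= -\,\mathbf{q}\times(\mathbf{q}\times\partial_x\mathbf{q})
   = -\,\mathbf{q}\,(\mathbf{q}\cdot\partial_x\mathbf{q}) + (\mathbf{q}\cdot\mathbf{q})\,\partial_x\mathbf{q}
   = \partial_x\mathbf{q}, \\
2\,\partial_t\mathbf{q}
  &= -2\,\partial_x\mathbf{p}
   = \partial_x(\mathbf{q}\times\partial_x\mathbf{q})
   = \partial_x\mathbf{q}\times\partial_x\mathbf{q} + \mathbf{q}\times\partial_x^2\mathbf{q}
   = \mathbf{q}\times\partial_x^2\mathbf{q}.
\end{align*}
```

The second line shows that the first reduced equation holds identically once $\mathbf{p}$ is defined in this way, and the constraint $\mathbf{p}\cdot\mathbf{q} = 0$ is automatic because $\mathbf{p}$ is a cross product with $\mathbf{q}$.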

The KdV equation

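In case (ii), $\det Q = 0$, $Q \neq 0$, and the analysis goes along the same lines. Since
$\operatorname{tr}(Q^2) = 0$, we have that $\operatorname{tr}(PQ)$ depends only on $t$. We consider first the case in
which we are given a solution to eqns (6.3.2) such that $\operatorname{tr}(PQ) = -1$, $\operatorname{tr}(Q^2) = 0$,
with gauge group $SL(2, \mathbb{C})$. Then we can find a normal gauge in which
\[ Q = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
   \Phi_w = \begin{pmatrix} q & -1 \\ r & -q \end{pmatrix}, \qquad
   \Phi_z = \begin{pmatrix} a & b \\ c & -a \end{pmatrix}, \tag{6.3.15} \]
where $a$, $b$, $c$, $q$, and $r$ are functions of $x$ and $t$, and in which
\[ L = \partial_x + \begin{pmatrix} q & -1 \\ r & -q \end{pmatrix}
       - \zeta\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
   M = \partial_t + \begin{pmatrix} a & b \\ c & -a \end{pmatrix} - \zeta\partial_x , \tag{6.3.16} \]
with the residual gauge freedom to conjugate $L$ and $M$ by
\[ g = \begin{pmatrix} 1 & 0 \\ \beta & 1 \end{pmatrix}, \tag{6.3.17} \]
where $\beta$ is a function of $t$ alone. The commutativity condition $[L, M] = 0$ gives
the reduced equations
\[ r_x = 2a , \qquad q_x = -b , \qquad q_t - a_x = -c - rb , \]
\[ b_x = -2a - 2qb , \qquad r_t - c_x = 2ar - 2qc . \tag{6.3.18} \]
In the following, we shall make use of the fact that $b$ and $u = 2r - 2q^2$ are
independent of the choice of gauge: they can be expressed in terms of the Higgs
fields in a general invariant gauge by
\[ 2b = -\operatorname{tr}([P, Q]\, D'Q) = \operatorname{tr}(D'Q\, D'P) , \qquad u = -\operatorname{tr}(P^2) . \]
It follows from the reduced equations that
\[ \partial_x(r - q^2 + b) = 0 \]
and hence that $r - q^2 + b$ is a function of $t$ alone. The remaining reduced equations
come down to the KdV equation.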
Proposition 6.3.2 The $H_{+0}$-invariant solutions to the ASDYM equation with
gauge group $SL(2, \mathbb{C})$ such that $\operatorname{tr}(Q^2) = 0$, $\operatorname{tr}(PQ) \neq 0$, are parametrized by the
solutions to the KdV equation
\[ 4u_t - u_{xxx} - 6uu_x = 0 , \]
together with two arbitrary functions of $t$.
Proof Let us suppose that we are given a solution to (6.3.2) with gauge group
$SL(2, \mathbb{C})$ such that $\operatorname{tr}(PQ) \neq 0$, $\operatorname{tr}(Q^2) = 0$. Then $\operatorname{tr}(PQ)$ is a function of $t$
alone. By putting
\[ f^2 = -\operatorname{tr}(PQ) , \qquad m = 0 , \]
in (6.3.8), we can obtain a new solution with $\operatorname{tr}(PQ) = -1$, $\operatorname{tr}(Q^2) = 0$. The
reduced Lax pair in this case is gauge-equivalent to (6.3.16). Under a further
transformation (6.3.8) with $f = 1$, the value of $\operatorname{tr}(PQ)$ is unchanged, while
\[ \operatorname{tr}(P^2) \mapsto \operatorname{tr}(P^2) - 2\dot m , \qquad b \mapsto b + \dot m . \]
So, by making the appropriate choice of $m$, we can further reduce to the case
$b = q^2 - r$. Then $u = -2b = 2q_x$, and we deduce that
\[ 2c_x - u_t + u_x r - 2q\,r_{xx} = 0 . \]
By eliminating $c_x$ between this and the $x$-derivative of the third of eqns (6.3.18),
and by using $r = -b + q^2 = q_x + q^2$, we arrive at the KdV equation
\[ 4u_t - u_{xxx} - 6uu_x = 0 . \]
Each solution determines an invariant connection, uniquely up to gauge, such
that
\[ \operatorname{tr}(Q^2) = 0 , \qquad \operatorname{tr}(D'Q\, D'P) = \operatorname{tr}(P^2) , \qquad \operatorname{tr}(PQ) = -1 . \tag{6.3.19} \]
Conversely, for every invariant connection satisfying these conditions, we see that
$u = -\operatorname{tr}(P^2)$ is a solution to the KdV equation.
The general $H_{+0}$-symmetric ASD connection such that $\operatorname{tr}(Q^2) = 0$ and
$\operatorname{tr}(PQ)$ is nonzero can be reduced to one satisfying (6.3.19) by a transformation
(6.3.8). So we have the required parametrization by solutions of the KdV
equation, together with the functions $f^2$ and $\dot m$ that appear in the transformation. $\Box$
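
The elimination in the proof can be written out explicitly. The sketch below is not taken from the text; it assumes the reduced equations in the form $r_x = 2a$, $q_x = -b$, $q_t - a_x = -c - rb$, together with the normalization $b = q^2 - r$, and uses subscripts for partial derivatives.

```latex
% Sketch: from the reduced equations (6.3.18) with b = q^2 - r to the KdV equation.
\begin{align*}
b = -q_x,\quad b = q^2 - r \;&\Longrightarrow\; r = q_x + q^2, \qquad u = 2r - 2q^2 = 2q_x, \\
c = a_x - q_t + r q_x,\quad a = \tfrac12 r_x \;&\Longrightarrow\;
   2c_x = r_{xxx} - u_t + 2 r_x q_x + 2 r q_{xx}, \\
0 = 2c_x - u_t + u_x r - 2q\, r_{xx}
  &= r_{xxx} + 2 r_x q_x + 4 r q_{xx} - 2 q r_{xx} - 2u_t \\
  &= q_{xxxx} + 12\, q_x q_{xx} - 2u_t \\
  &= \tfrac12\bigl(u_{xxx} + 6 u u_x - 4 u_t\bigr).
\end{align*}
```

The fourth line follows by substituting $r = q_x + q^2$, after which all terms containing undifferentiated $q$ cancel; this is why the final equation involves only $u = 2q_x$.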
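Remark. It follows from the proof that, if $u$ is a solution to the KdV equation
and if $q_x = \tfrac12 u$, then
\[ L = \partial_x + \begin{pmatrix} q & -1 \\ q_x + q^2 & -q \end{pmatrix}
       - \zeta\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]
\[ M = \partial_t + \begin{pmatrix} \tfrac12 q_{xx} + qq_x & -q_x \\ c & -\tfrac12 q_{xx} - qq_x \end{pmatrix}
       - \zeta\partial_x \]
is a linear system for an ASDYM field, where $c = \tfrac14 q_{xxx} + \tfrac12 q_x^2 + q^2q_x + qq_{xx}$,
with the residual gauge freedom $q \mapsto q + \beta(t)$. The corresponding solution to
(6.3.6) in the Higgs gauge is
\[ Q = g\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}g^{-1} , \qquad
   P = g_x g^{-1} = g\begin{pmatrix} q & -1 \\ r & -q \end{pmatrix}g^{-1} , \]
where $r = q_x + q^2$ and
\[ g^{-1}g_t = \begin{pmatrix} \tfrac12 q_{xx} + qq_x & -q_x \\ c & -\tfrac12 q_{xx} - qq_x \end{pmatrix} . \]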
The modified KdV equation
Another possibility in the KdV case, $\operatorname{tr}(Q^2) = 0$, $\operatorname{tr}(PQ) = -1$, is to choose
the gauge so that $Q$ has the same form as in (6.3.15), but $\Phi_w$ and $\Phi_z$ are
upper triangular. Because the second of eqns (6.3.2) holds, we can do this by
transforming from the original normal gauge by (6.3.17), with $\beta$ a suitably chosen
function of $x$ and $t$. The first of eqns (6.3.2) then implies that $\Phi_{\tilde w}$ is proportional
to $Q$, and hence that
\[ L = \partial_x + \begin{pmatrix} p & -1 \\ 0 & -p \end{pmatrix}
       - \zeta\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
   M = \partial_t + \begin{pmatrix} c & b \\ 0 & -c \end{pmatrix}
       - \zeta\left(\partial_x + \begin{pmatrix} 0 & 0 \\ a & 0 \end{pmatrix}\right), \]
for some functions $p$, $a$, $b$ and $c$ of $x$ and $t$. The residual gauge freedom is the
same as before. If we now require that the invariant $\operatorname{tr}(D'Q\, D'P - P^2)$ should
vanish, then the commutativity condition $[L, M] = 0$ gives
\[ 2a = -p^2 - p_x , \qquad 2b = -p_x + p^2 , \qquad 4c = p_{xx} - 2p^3 , \]
and hence that $p$ satisfies the modified KdV equation
\[ 4p_t = p_{xxx} - 6p^2p_x . \]
The corresponding solution to the KdV equation, $u = -\operatorname{tr}(P^2)$, is
\[ u = p_x - p^2 . \]
This map from solutions to the modified KdV equation to solutions to the KdV
equation has become known as the Miura transformation (we shall look at the
extended form of the transformation in the context of the Drinfeld-Sokolov
construction in Chapter 12). From the point of view of the ASDYM equation, it is
simply a gauge transformation.
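
That the Miura map sends solutions of the modified KdV equation to solutions of the KdV equation can be verified directly. The following is a sketch of the standard computation, with the normalizations $4u_t = u_{xxx} + 6uu_x$ and $4p_t = p_{xxx} - 6p^2p_x$ used in this section; the symbol $E[p]$ is introduced here only as shorthand.

```latex
% Sketch: the KdV expression factorizes through the mKdV expression under u = p_x - p^2.
\begin{align*}
u &= p_x - p^2, \qquad E[p] := 4p_t - p_{xxx} + 6p^2 p_x, \\
4u_t &= (\partial_x - 2p)(4p_t), \\
u_{xxx} + 6uu_x
  &= p_{xxxx} - 2p\,p_{xxx} - 12\,p\,p_x^2 - 6p^2 p_{xx} + 12\,p^3 p_x
   = (\partial_x - 2p)\bigl(p_{xxx} - 6p^2 p_x\bigr), \\
4u_t - u_{xxx} - 6uu_x &= (\partial_x - 2p)\,E[p].
\end{align*}
```

So $E[p] = 0$ implies the KdV equation for $u$. The converse fails in general, since $(\partial_x - 2p)$ has a nontrivial kernel.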
Other gauge groups
We have a complete description of the $H_{+0}$ reductions when the gauge group
is $SL(2, \mathbb{C})$, or one of its real forms: the nontrivial cases come down to either
the NLS equation or to the KdV equation or to a gauge equivalent system. The
classification problem here is straightforward because the invariants and normal
forms of $2 \times 2$ matrices are simple, so that the consequences of the various choices
can be analysed very easily.
When the gauge group is larger, it is very much harder to make a clear
general statement and we are only able to list some particular cases. It is shown
in Mason and Singer (1994) that the hierarchy of nKdV equations that emerges
from the Drinfeld-Sokolov construction (Appendix B) is embedded in a hierarchy
of ASD equations with gauge group $SL(n, \mathbb{C})$. It follows from the details of the
embedding that the $(n + 1)$st equation of the nKdV hierarchy determines an
ASDYM field with $H_{+0}$ symmetry.$^4$ This can be seen by noting that the $(n+1)$th
nKdV flow is determined by the commutation of the operators
\[ \partial_x + A - \Lambda , \qquad \partial_t + B - \zeta(\partial_x + C) , \tag{6.3.20} \]
in eqns (B.1) and (B.4), which are of the same form as those in the reduced Lax
pair for the $H_{+0}$ reduction of the ASDYM equation. The Higgs fields have the
form
\[ Q = \begin{pmatrix}
   0 & 0 & \cdots & 0 \\
   0 & 0 & \cdots & 0 \\
   \vdots & \vdots & & \vdots \\
   0 & 0 & \cdots & 0 \\
   1 & 0 & \cdots & 0
   \end{pmatrix} \tag{6.3.21} \]
and they are special because of
(a) the special normal form of $Q$, and
(b) the particular values of the invariants
\[ \operatorname{tr}(PQ) = \operatorname{tr}(P^2Q) = \cdots = \operatorname{tr}(P^{n-2}Q) = 0 , \qquad \operatorname{tr}(P^{n-1}Q) = (-1)^{n-1} . \]
(The invariants are constants of integration when $Q^2 = 0$.)
When $Q$ is diagonalizable with distinct constant eigenvalues, an analogous
analysis relates the hierarchies of Zakharov and Shabat (1974, 1979) to the
generalized ASDYM hierarchy of §8.6. A simple example with $Q$ diagonal, but with
some coincident roots, is provided by the linear system (6.3.14), but where now
$\tilde\psi$ is a row vector of length $n$, $\psi$ is a column vector, and the matrices are in block
form, with an $n \times n$ block in the top left-hand corner and a $1 \times 1$ block in the
bottom right-hand corner. If we impose the reality condition $\psi = \tilde\psi^\dagger$, where $\dagger$
denotes Hermitian conjugation, then we obtain the reduced equation in the form
\[ i\psi_t = -\tfrac12\psi_{xx} + (\psi^\dagger\psi)\psi . \]
See, for example, §11.1.4 of Faddeev and Takhtajan (1987).
6.4 REDUCTION BY HSD
When $\Pi$ is null and the tangent bivector $\pi$ is self-dual, we can choose the
coordinates so that the generators of $H$ are
\[ X = \partial_{\tilde w} \quad\text{and}\quad Y = \partial_{\tilde z} . \]
In this case, $P = X \lrcorner\, \Phi = \Phi_{\tilde w}$, $Q = Y \lrcorner\, \Phi = \Phi_{\tilde z}$, and the reduced Lax pair is
\[ L = \partial_x + \Phi_w - \zeta Q , \qquad M = \partial_t + \Phi_z - \zeta P , \tag{6.4.1} \]
where the components of $\Phi$ depend only on $x = w$ and $t = z$. The vanishing of
the $\zeta^2$ term in $[L, M]$ gives $[P, Q] = 0$, while the vanishing of the $\zeta^0$ term implies
the existence of an invariant gauge in which $\Phi_w = \Phi_z = 0$: in this gauge, the
ASD equations are
\[ [P, Q] = 0 , \qquad \partial_x P = \partial_t Q \tag{6.4.2} \]
(the first is a dynamic constraint). The only residual gauge freedom is to
conjugate $P$ and $Q$ by a constant matrix. By setting $\phi = Q\,dx + P\,dt$, we see that the
equations become
\[ \phi \wedge \phi = 0 , \qquad d\phi = 0 , \]
and therefore that they are invariant under general diffeomorphisms.
In the $SL(2, \mathbb{C})$ case the equations can be solved explicitly. The first equation
implies that $P$ and $Q$ are proportional, so that we can write $\phi = g\alpha$ for some
matrix function $g(x, t)$ and for some complex-valued 1-form $\alpha$. Furthermore
$\alpha$ and $g$ can be scaled so that $d\alpha = 0$, from which it follows that $\alpha = df$
for some $f$. The second equation then gives $dg \wedge df = 0$, which is satisfied
whenever $g$ is a function of $f$. Thus the general solution is given by (i) an
arbitrary scalar function $f$ of $x$ and $t$, and (ii) an arbitrary matrix function $g$ of
$f$. Because of the invariance under the diffeomorphisms, we can choose $f$ to be
one of the coordinates. Thus the reduction only becomes interesting for larger
gauge groups.
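
The explicit $SL(2, \mathbb{C})$ solution can be checked against both reduced equations in two lines. This is a sketch in which $g'$ denotes the derivative of the matrix function $g$ with respect to its scalar argument $f$; the matrix-valued 1-form is $Q\,dx + P\,dt$ with $Q = g(f)f_x$ and $P = g(f)f_t$.

```latex
% Sketch: g(f) df solves the reduced H_SD equations for any scalar f(x,t) and matrix function g.
\begin{align*}
Q\,dx + P\,dt &= g(f)\,df, \\
\bigl(g(f)\,df\bigr)\wedge\bigl(g(f)\,df\bigr) &= g(f)^2\; df\wedge df = 0
   \quad\Longleftrightarrow\quad [P, Q] = 0, \\
d\bigl(g(f)\,df\bigr) &= g'(f)\; df\wedge df = 0
   \quad\Longleftrightarrow\quad \partial_x P = \partial_t Q .
\end{align*}
```

Both equations hold because $df \wedge df = 0$, which is also why no condition at all is imposed on $f$.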
Example 6.4.1 The topological chiral model. In a gauge in which $\Phi_w$ and $\Phi_z$
vanish, the integrability of equation (6.4.1) at $\zeta = -1$ implies that $P$ and $Q$ are
of the form
\[ P = g^{-1}\partial_t g , \qquad Q = g^{-1}\partial_x g , \]
for some nonsingular matrix-valued function $g$ of $x$ and $t$, which is determined
by the connection up to left and right multiplication by constant matrices. Then
the pair of equations (6.4.2) is equivalent to
\[ \partial_x(g^{-1}\partial_t g) - \partial_t(g^{-1}\partial_x g) = 0 . \]
That is, $d(g^{-1}dg) = 0$, which is the field equation of the topological chiral model.
Unlike the chiral equation in §6.2, this equation does not involve a metric and is
invariant under general transformations of the independent variables.
Other properties of this reduction are more easily seen in the $K$-matrix
formulation. The second equation implies the existence of a potential $K$ such that
$P = \partial_t K$, $Q = \partial_x K$. With $K$ as the dependent variable, the ASDYM equation
comes down to
\[ [\partial_x K, \partial_t K] = 0 . \]
By itself, this is not a very strong constraint on $K$. It is solved, for example,
by any diagonal matrix, with arbitrary functions of $x$ and $t$ as diagonal entries.
Further large families of solutions can be generated by using the invariance under
general coordinate transformations in the $x, t$-plane. Thus the reduced equations
are not deterministic. However, some interesting deterministic equations are
embedded in (6.4.2), and can be recovered by requiring that $P$ and $Q$ should
have additional special properties.
Example 6.4.2 The Boussinesq equation and its generalizations. For k < n,
the linear system of the kth nKdV equation is of the form (6.4.1) (see eqn B.5).
In this case, if the gauge is chosen so that 0 = tt, then
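\[ Q = gE^{n-1}g^{-1} , \qquad P = gE^{n-k}g^{-1} , \]
where
\[ E = \begin{pmatrix}
   0 & 0 & \cdots & 0 & 0 \\
   1 & 0 & \cdots & 0 & 0 \\
   0 & 1 & \cdots & 0 & 0 \\
   \vdots & & \ddots & & \vdots \\
   0 & 0 & \cdots & 1 & 0
   \end{pmatrix}, \qquad
   g^{-1}\partial_t g = \begin{pmatrix}
   * & -1 & 0 & \cdots & 0 \\
   * & * & -1 & \cdots & 0 \\
   \vdots & & \ddots & \ddots & \vdots \\
   * & * & \cdots & * & -1 \\
   * & * & \cdots & * & *
   \end{pmatrix}, \]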

with det g = 1. When n = 3, k = 2, we obtain the Boussinesq equation (Example

B.4).
The n-wave equation
Chakravarty and Ablowitz (1992) showed that the $n$-wave equation is embedded
in the ASDYM equation by making a special ansatz for the potential. Their
choice can be seen as an example of the $Z_2$ construction described in §4.5: the
special form of the potential is a consequence of the algebraic constraints. In the
notation of §4.5, we take $H$ to be the group generated by $H_{SD}$ and the reflection
\[ \tilde w \mapsto -\tilde w , \qquad \tilde z \mapsto -\tilde z . \]
The infinitesimal symmetries are generated by $X = \partial_{\tilde w}$ and $Y = \partial_{\tilde z}$. We take $S$
to be the surface $\tilde w = \tilde z = 0$, and Q = 1. We write
\[ D' = d + \Phi_x\,dx + \Phi_t\,dt , \]
where $x = w$ and $t = z$ are coordinates on $S$. Then the constraints imply that $\Phi_x$
and $\Phi_t$ are skew-symmetric functions of $x$ and $t$ alone, and that the restrictions
to $S$ of the Higgs fields $P$ and $Q$ are symmetric functions of $x$ and $t$. Therefore
the reduced system is the compatibility condition for the Lax pair
\[ L = \partial_x + \Phi_x - \zeta Q , \qquad M = \partial_t + \Phi_t - \zeta P . \]
The second-order term in the compatibility condition $[L, M] = 0$ is
\[ [P, Q] = 0 . \]
In the generic case, we can reduce the Higgs fields to the diagonal forms
$P = \operatorname{diag}(a_1, \ldots, a_n)$, $Q = \operatorname{diag}(b_1, \ldots, b_n)$, where the $a$s and $b$s are distinct, by
making a complex orthogonal gauge transformation.
We now require that the $a$s and $b$s should be constant. The vanishing of the
$\zeta$ term in $[L, M] = 0$ is equivalent to
\[ [\Phi_t, Q] - [\Phi_x, P] = 0 , \]
and hence to
\[ (\Phi_t)_{ij} = \lambda_{ij}(\Phi_x)_{ij} = \lambda_{ij}w_{ij} , \]
where $\lambda_{ij} = (a_i - a_j)/(b_i - b_j)$, $w_{ij}$ is skew-symmetric, and we have suspended
the summation convention. The only remaining equation (the $\zeta^0$ term) is the
$n$-wave equation
\[ \partial_t w_{ij} - \lambda_{ij}\partial_x w_{ij} = \sum_{k=1}^{n} (\lambda_{ik} - \lambda_{kj})\,w_{ik}w_{kj} . \]
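
The structure of the quadratic coupling in the $n$-wave equation can be made concrete with a small computation. The following is a minimal sketch, not part of the original text: the function names and the sample eigenvalues are hypothetical, and exact rational arithmetic is used so that the skew-symmetry of the right-hand side (which is what makes the equation consistent with a skew-symmetric $w_{ij}$) can be checked exactly.

```python
from fractions import Fraction


def coupling_constants(a, b):
    """Return lam[i][j] = (a_i - a_j)/(b_i - b_j) for i != j (None on the diagonal)."""
    n = len(a)
    return [[None if i == j else Fraction(a[i] - a[j], b[i] - b[j])
             for j in range(n)] for i in range(n)]


def nwave_rhs(w, lam):
    """Quadratic right-hand side sum_k (lam_ik - lam_kj) w_ik w_kj.

    Terms with k == i or k == j are skipped: they carry a factor
    w_ii = 0 or w_jj = 0, and lam is undefined on the diagonal.
    """
    n = len(w)
    rhs = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rhs[i][j] = sum(
                ((lam[i][k] - lam[k][j]) * w[i][k] * w[k][j]
                 for k in range(n) if k not in (i, j)),
                Fraction(0),
            )
    return rhs


# Hypothetical sample data: distinct constant eigenvalues and a skew matrix w.
a = [1, 2, 4]
b = [1, 3, 4]
w = [[0, 1, 2],
     [-1, 0, 3],
     [-2, -3, 0]]

lam = coupling_constants(a, b)
rhs = nwave_rhs(w, lam)
```

Because $\lambda_{ij}$ is symmetric and $w_{ij}$ is skew, the computed `rhs` is again skew, so the flow preserves the skew-symmetry of $w$. For $n = 3$ each entry contains a single term, which is the familiar three-wave interaction.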
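6.5 REDUCTION BY HASD
We can choose the coordinates so that $H_{ASD}$ is generated by
\[ X = \partial_w , \qquad Y = \partial_{\tilde z} . \]
Then $\Phi$ depends only on $z$ and $\tilde w$ and the Higgs fields are $P = \Phi_w$ and $Q = \Phi_{\tilde z}$.
The other two components of $\Phi$ determine a connection $D' = d + \Phi_z\,dz + \Phi_{\tilde w}\,d\tilde w$
on a bundle over the $(z, \tilde w)$-plane, and the linear system reduces to
\[ L = P - \zeta Q \quad\text{and}\quad M = D'_z - \zeta D'_{\tilde w} . \]
The reduced field equations are
\[ D'_z P = 0 , \qquad D'_{\tilde w} Q = 0 \quad\text{and}\quad D'_z Q - D'_{\tilde w} P = 0 . \]
We know of no interesting integrable systems arising from this reduction.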
6.6 THE ERNST EQUATION

There are also significant reductions by two-dimensional conformal groups other
than translations. One of considerable physical interest is generated by
\[ X = w\partial_w - \tilde w\partial_{\tilde w} , \qquad Y = \partial_z + \partial_{\tilde z} . \]
If we adapt the space-time coordinates to $X$ and $Y$ by putting $w = re^{i\theta}$,
$\tilde w = re^{-i\theta}$, $z = t - x$, $\tilde z = t + x$, then the metric is conformal to
\[ ds^2 = dt^2 - dx^2 - dr^2 - r^2d\theta^2 , \]
and the symmetries are time translations $t \mapsto t + t_0$ and rotations $\theta \mapsto \theta + \theta_0$.
On a Minkowski real slice, the coordinates are real and the spatial metric is in
cylindrical polar form, so we are looking at stationary axisymmetric solutions to
the ASD equation and at their continuations to complex space-time. Apart from
their role in Yang-Mills theories, they are important because of a coincidence
noticed by L. Witten, that the reduced equation is equivalent to the Ernst
equation for stationary axisymmetric gravitational fields in general relativity (Witten
1979, Ward 1983).
Stationary axisymmetric solutions
We shall derive the stationary axisymmetric reduction by constructing Yang's
matrix for an invariant potential; in the next section, we shall give a more general
construction that covers this and other similar symmetry groups.
It is possible to choose the invariant gauge so that$^5$
\[ \Phi = -P\,\frac{d\tilde w}{\tilde w} + Q\,d\tilde z , \tag{6.6.1} \]
where the Higgs fields $P$ and $Q$ depend only on $x$, $r$. Then, on substituting into
eqns (3.2.1), we obtain the reduced ASDYM equations in the form
\[ P_x + rQ_r + 2[Q, P] = 0 , \qquad P_r - rQ_x = 0 . \]
The first implies the existence of a Yang's matrix $J(x, r)$ such that
\[ 2P = -rJ^{-1}J_r , \qquad 2Q = J^{-1}J_x . \tag{6.6.2} \]
With $J$ as the dependent variable, the first equation is satisfied identically, and
the second becomes
\[ r\partial_x(J^{-1}\partial_x J) + \partial_r(rJ^{-1}\partial_r J) = 0 . \tag{6.6.3} \]
Every solution to this reduced form of Yang's equation determines a stationary
axisymmetric ASDYM field, and every stationary axisymmetric ASDYM field
can be obtained in this way; $J$ determines the connection uniquely, and is
determined by it up to $J \mapsto AJB$, where $A$ and $B$ are constant matrices. When the
gauge group is $\mathbb{C}^\times$, $J$ is a scalar function, and (6.6.3) is simply the axisymmetric
form of Laplace's equation for $\log J$.
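
The scalar case gives a quick numerical sanity check of (6.6.3). The sketch below is illustrative only: the function names, the step size and the sample point are assumptions, and the test function $\log J = (x^2 + r^2)^{-1/2}$ is simply the potential of a point source on the axis, chosen because it is a known axisymmetric harmonic function. The residual of $r\,\partial_x^2\phi + \partial_r(r\,\partial_r\phi)$ is estimated by central finite differences.

```python
import math


def axisymmetric_residual(phi, x, r, h=1e-3):
    """Finite-difference value of r*phi_xx + d/dr(r*phi_r) at (x, r)."""
    phi_xx = (phi(x + h, r) - 2.0 * phi(x, r) + phi(x - h, r)) / h**2
    # Flux form of d/dr(r * phi_r), evaluated at the half-steps r +/- h/2.
    flux_plus = (r + 0.5 * h) * (phi(x, r + h) - phi(x, r)) / h
    flux_minus = (r - 0.5 * h) * (phi(x, r) - phi(x, r - h)) / h
    return r * phi_xx + (flux_plus - flux_minus) / h


def log_J(x, r):
    """Hypothetical sample: log J = 1/sqrt(x^2 + r^2), a point-source potential."""
    return 1.0 / math.sqrt(x * x + r * r)


def not_a_solution(x, r):
    """Control case: x^2 is not harmonic, so the residual should be 2r."""
    return x * x


res_good = axisymmetric_residual(log_J, 0.7, 1.3)
res_bad = axisymmetric_residual(not_a_solution, 0.7, 1.3)
```

Here `res_good` vanishes up to discretization error, so $J = \exp\bigl((x^2 + r^2)^{-1/2}\bigr)$ is a scalar solution of (6.6.3) away from the origin, while the control function leaves a residual of $2r$.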
Reduction of Einstein's equation
The way in which eqn (6.6.3) also arises by reduction of Einstein's equations
can be seen as follows. Let $g_{ab}$ be a metric tensor in $n + s$ dimensions, either
real or complex, and let $X_i^a$, $i = 1, \ldots, n$, be commuting Killing vectors that
generate an orthogonally transitive isometry group with non-null $n$-dimensional
orbits. This means that the distribution of $s$-plane elements orthogonal to the
orbits is integrable; that is, $[U, V]$ is orthogonal to the orbits whenever $U$ and $V$
are.$^6$
Put $J = (J_{ij}) = (g_{ab}X_i^aX_j^b)$ and let $\nabla$ denote the Levi-Civita connection.
Because the Killing vectors commute, $J$ is constant along the orbits. Also, since
\[ 0 = \mathcal{L}_{X_i}g_{ab} = \nabla_a X_{ib} + \nabla_b X_{ia} \]
for each $i$, and since the Lie brackets of the Killing vectors vanish,
\[ X_j^a\nabla_a X_{ib} = X_i^a\nabla_a X_{jb} = -X_j^a\nabla_b X_{ia} = -X_i^a\nabla_b X_{ja} = -\tfrac12\partial_b(J_{ij}) . \]
Moreover, for any vector fields $U$ and $V$ orthogonal to the orbits,
\begin{align*}
U^aV^b\nabla_a X_{ib} - V^aU^b\nabla_a X_{ib} &= -X_{ib}U^a\nabla_a V^b + X_{ib}V^a\nabla_a U^b \\
  &= -X_{ib}(U^a\nabla_a V^b - V^a\nabla_a U^b) \\
  &= 0
\end{align*}
by orthogonal transitivity. Therefore we have
\[ \nabla_a X_{ib} = \tfrac12 J^{jk}\bigl((\partial_a J_{ki})X_{jb} - (\partial_b J_{ki})X_{ja}\bigr) , \tag{6.6.4} \]
where $J^{ij}J_{jk} = \delta^i_k$. Now for a Killing vector $X$,
\[ \nabla_b\nabla_c X_d = R_{abcd}X^a , \]
where $R_{abcd}$ is the Riemann tensor. Therefore, by taking the covariant derivative
of (6.6.4), we have
\[ R_{ab}X_i^aX_j^b = -\tfrac12 J_{ik}\,g^{-1/2}\partial_a\bigl(g^{1/2}g^{ab}J^{kl}\partial_b J_{lj}\bigr) , \]
where $g = \det g_{ab}$ and $R_{ab} = R^c{}_{acb}$ is the Ricci tensor. If Einstein's vacuum
equation holds, then $R_{ab} = 0$ and
\[ \partial_a\bigl(g^{1/2}g^{ab}J^{-1}\partial_b J\bigr) = 0 . \tag{6.6.5} \]
This can be written as an equation on $S$, where $S$ is the quotient space, identified
with any one of the $s$-surfaces orthogonal to the orbits. If $h_{ab}$ is the metric on
$S$ and $D$ is the corresponding Levi-Civita connection, then $g = -r^2\det(h_{ab})$,
where $r^2 = -\det J$, and (6.6.5) becomes
\[ D_a(rJ^{-1}D^a J) = 0 , \tag{6.6.6} \]
where the indices now run over $1, \ldots, s$, and are lowered and raised by $h_{ab}$ and
its inverse. By taking the trace, we deduce that $D_aD^a r = 0$, and hence that $r$
is a harmonic function on $S$. We shall assume that its gradient is non-null.
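
The step from (6.6.6) to the harmonicity of $r$ is short but worth writing out. The following sketch uses only the matrix identity $\operatorname{tr}(J^{-1}\,dJ) = d\log\det J$ and the definition $r^2 = -\det J$.

```latex
% Sketch: the trace of (6.6.6) gives D_a D^a r = 0.
\begin{align*}
\operatorname{tr}\bigl(J^{-1}D^a J\bigr) &= D^a\log\det J = D^a\log(-r^2) = \frac{2}{r}\,D^a r, \\
0 = \operatorname{tr}\,D_a\bigl(r\,J^{-1}D^a J\bigr)
  &= D_a\Bigl(r\cdot\frac{2}{r}\,D^a r\Bigr) = 2\,D_aD^a r .
\end{align*}
```

This is the reason $r$ can be used as one of a pair of conjugate harmonic coordinates on $S$ when $s = 2$.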
When $s = 2$, we can write the metric on $S$ in the standard form
\[ \pm\Omega^2(dr^2 + dx^2) , \]
where $x$ is the harmonic conjugate to $r$. Then (6.6.6) reduces to (6.6.3).
Therefore we have the following.
Proposition 6.6.1 Let $g_{ab}$ be a solution to Einstein's vacuum equation in $n + 2$
dimensions. Suppose that it admits $n$ independent commuting Killing vectors
with orbits orthogonal to a family of non-null surfaces, and that the gradient of
$r$ is non-null. Then $J(x, r)$ is the Yang's matrix of a stationary axisymmetric
solution to the ASDYM equation with gauge group $GL(n, \mathbb{C})$.
That this gives a useful technique for solving Einstein's equations follows from
the partial converse: every real solution to (6.6.3) such that (i) $\det J = -r^2$, and
(ii) $J$ is symmetric, determines a solution to Einstein's vacuum equation. This
is true because, if we reconstruct a metric from a given $J$ and $\Omega$, then (6.6.3) is
equivalent to the vanishing of the components of $R_{ab}$ along the Killing vectors
(as we have shown), and the remaining components of the vacuum equation come
down to
\[ 2\partial_\xi(\log r\Omega^2) = r\operatorname{tr}(\partial_\xi J^{-1}\,\partial_\xi J) , \]
where $\xi = x + ir$, together with the complex conjugate equation (when $x$ and $r$
are real). Given $J$ subject to (6.6.3) and to the constraint $\det J = -r^2$, these
are integrable, and they determine $\Omega$ to within a multiplicative constant. The
constraint is not important because we can always satisfy it by multiplying $J$ by
$e^u$, where $u$ is a scalar solution to the axisymmetric Laplace equation
\[ \partial_r(r\partial_r u) + r\partial_x^2 u = 0 . \]
The second condition, $J = J^t$, can be interpreted as a further $Z_2$ symmetry of
the ASD connection.
By specializing further to the case where $n = s = 2$, we can write $J$ in the
form
\[ J = \begin{pmatrix} f\alpha^2 - r^2f^{-1} & -f\alpha \\ -f\alpha & f \end{pmatrix} , \]
where $f$ and $\alpha$ are functions of $x$ and $r$, to obtain the space-time metric in the
form
\[ ds^2 = f(dt - \alpha\,d\theta)^2 - f^{-1}r^2d\theta^2 - \Omega^2(dr^2 + dx^2) . \]
When $f$ and $\alpha$ are real for real $x$, $r$, this is a stationary axisymmetric gravitational
field, written in Weyl canonical coordinates. Other reality conditions correspond
to cylindrical gravitational waves, to the interaction region of a pair of colliding
plane waves, or to the Gowdy cosmological models, see Kramer et al. (1980).
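
The two constraints on $J$ used above, symmetry and $\det J = -r^2$, can be read off from the metric. The following minimal sketch builds the matrix of inner products of the Killing vectors $\partial_\theta$ and $\partial_t$ for the metric $ds^2 = f(dt - \alpha\,d\theta)^2 - f^{-1}r^2d\theta^2 - \ldots$ and checks both constraints exactly at one point. The function names and the sample values of $f$, $\alpha$ and $r$ are hypothetical.

```python
from fractions import Fraction


def killing_gram_matrix(f, alpha, r):
    """J for ds^2 = f (dt - alpha dtheta)^2 - r^2/f dtheta^2 - ..., ordered (theta, t)."""
    return [[f * alpha**2 - r**2 / f, -f * alpha],
            [-f * alpha, f]]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical sample values of f, alpha and r at a single point.
f, alpha, r = Fraction(3, 2), Fraction(-2, 5), Fraction(7, 3)
J = killing_gram_matrix(f, alpha, r)
```

The determinant is independent of $f$ and $\alpha$: the $f\alpha^2$ contribution cancels against the square of the off-diagonal entry, which leaves $-r^2$.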
Solution generation
Expressed in terms of $f$ and $\alpha$, eqn (6.6.3) is the coupled system
\[ r^2\nabla^2\log f + (f\partial_r\alpha)^2 + (f\partial_x\alpha)^2 = 0
   = \partial_r(r^{-1}f^2\partial_r\alpha) + \partial_x(r^{-1}f^2\partial_x\alpha) . \]
The second equation implies the existence of a function $\psi(x, r)$ such that
\[ r\partial_x\psi + f^2\partial_r\alpha = 0 = r\partial_r\psi - f^2\partial_x\alpha . \]
If we replace $\alpha$ by $\psi$ as one of the dependent variables, then the system again
reduces to eqn (6.6.3), but with $J$ replaced by
\[ J' = \frac{1}{f}\begin{pmatrix} \psi^2 + f^2 & \psi \\ \psi & 1 \end{pmatrix} . \]
So we can also find solutions to Einstein's vacuum equation by solving (6.6.3)
for $J'$, subject to the constraints $\det J' = 1$, $J' = J'^t$. In this context, eqn (6.6.3)
is called the Ernst equation. The transformation

JH (0 w) J' (0 w)
which maps solutions of Yang's equation to solutions of Yang's equation, is a
special case of the Backlund transformation (4.6.2), with
A=-r-2f, A= f, B=-B=a.
In the relativity literature, the complex function $E = f + i\psi$ is called the Ernst
potential. Its construction does not treat the two symmetries on an equal footing
since the transformation $J \mapsto J'$ is not covariant under linear transformations
in the space of Killing vectors. This failing can be put to good use to find new
solutions from a given seed, a technique that has been extensively exploited in
general relativity (Geroch 1971, 1972, Kinnersley 1977, Kinnersley and Chitre
1977-8, Hoenselaers and Dietz 1984). We start with $J'$, and recover $J$ by solving
for $\alpha$ in terms of $\psi$ and $f$; we then replace $J$ by $C^tJC$, where $C \in SL(2, \mathbb{C})$,
and construct a new $J'$ from this solution $J$. We can also find new solutions
by replacing $J'$ by $D^tJ'D$, for $D \in SL(2, \mathbb{C})$. Successive applications of these
two procedures generate an infinite-parameter family of solutions to Einstein's
vacuum equation from the original seed (if $C$ and $D$ are real, then the
transformations preserve the real stationary axisymmetric solutions). It was shown in
Woodhouse and Mason (1988) that this family is the orbit of the seed under a
natural action of the loop group $LSL(2, \mathbb{C})$ on the twistor patching matrices.

The Einstein-Maxwell equations

Again with n = s = 2, there is a further correspondence between 3 x 3 matrix-
valued solutions to eqn (6.6.3), and four-dimensional solutions to the Einstein-
Maxwell equations with two commuting Killing vectors. In appropriate units,
the equations are
Rab = 2k(FacFcb - FacFcb) , V1.Fbcl = 0 = V1aFbcj ,
where F is the electromagnetic 2-form, F* is its dual, and k is a constant. 7 We
make the same assumptions as before about the metric and the Killing vectors,
but now add symmetry conditions on the electromagnetic field. We suppose that
GX; F = 0 and that
2k-1
F + F' = 2k-1 do, F - F' = dq5,

for some complex potential 1-forms 0 = Oadxa and = &dxa, which are in-
variant under the symmetries and which vanish on the surfaces transverse to the
orbits. That is,
0. = J"biXja , -0a = J"OiX ja ,

where the contractions Oi = X;q5a, )i = are constant along the orbits. It

then follows by contracting the first Einstein-Maxwell equation with the Killing
vectors that
Jik9-1aa(Jklg1gababJlj)
= (6.6.7)
With our special choice for the form of the electromagnetic field, the second and
third equations (Maxwell's equations in the curved background) are
aa(949abJijaboj) =
0, a. (g 9ab jij
.15
0. (6.6.8)
It follows from the vanishing of the trace of the right-hand side of the first
Einstein-Maxwell equation that det J = -r2, where r is a harmonic function
on S. Under the assumption that the gradient of r is non-null, we can combine
eqns (6.6.7) and (6.6.8) as the reduced form of Yang's equation (6.6.3) on the
$3 \times 3$ matrix
Jtj +
JEM = ( 0i 1 I

Again, the other components of the first Einstein-Maxwell equation are trivial:
they come down to an expression for $\Omega$ as an indefinite integral of quantities
constructed from $J$, $\phi$, and $\tilde\phi$.$^8$ So we have a correspondence between a class
of solutions to the Einstein-Maxwell equations and a class of reductions of the
ASDYM equation with gauge group GL(3, C), although in this case the symme-
try and reality conditions that characterize the latter class are more complicated
than in the vacuum case, see Woodhouse (1990).
6.7 REDUCTION OF YANG'S EQUATION
In this section we shall consider a general class of reductions to two dimensions.
We shall show that the Ernst equation is contained in the ASDYM equation in
more than one way.
Let $H$ be a two-dimensional group of conformal transformations that
preserves the two families of coordinate null planes, the planes of constant $w$, $z$
and the planes of constant $\tilde w$, $\tilde z$. Then $H$ is generated by two conformal Killing
vectors,$^9$
\[ X = a\partial_w + b\partial_z + \tilde a\partial_{\tilde w} + \tilde b\partial_{\tilde z} , \qquad
   Y = c\partial_w + d\partial_z + \tilde c\partial_{\tilde w} + \tilde d\partial_{\tilde z} , \]
where $a$, $b$, $c$, $d$ depend only on $w$ and $z$, and $\tilde a$, $\tilde b$, $\tilde c$, $\tilde d$ depend only on $\tilde w$ and $\tilde z$.
We impose two further conditions on $H$:
(a) the two quadruples $X, Y, \partial_w, \partial_z$ and $X, Y, \partial_{\tilde w}, \partial_{\tilde z}$ are both linearly
independent (at least on some open subset of $\mathbb{CM}$); and
(b) the induced metric on the orbits of $H$ is nondegenerate, that is
\[ \rho^2 = X_aX^aY_bY^b - (X^aY_a)^2 \neq 0 . \]
The first is needed for the construction of invariant solutions to Yang's equation:
without it, the conditions that a frame should be invariant and that it should
be covariantly constant on one or other of the families of coordinate null planes
can be incompatible. When it holds, the $H$-invariant ASD connections can be
obtained by requiring that $J$ in (3.3.2) should be constant along the orbits of $H$;
two such $J$s determine the same invariant connection whenever they are related
by $J \mapsto \tilde PJP$ for constant $P$, $\tilde P$.
Let $S$ denote the space of orbits in some suitable open subset of complex
space-time. We define coordinates $u$, $v$ on $S$ by fixing $\tilde w_0$, $\tilde z_0$, and by labelling
by $u$, $v$ the orbit of $H$ through $w = u$, $z = v$, $\tilde w = \tilde w_0$, $\tilde z = \tilde z_0$. A function $f$ on
complex space-time which is constant along the orbits can be expressed in terms
of $u$ and $v$. When $\tilde w = \tilde w_0$, $\tilde z = \tilde z_0$,
d)-1(c
(aZf)=(avf)' (awf)=-(c d)(avf
By applying these to J, which is constant on the orbits, we find that the reduction
of Yang's equation is
ai(GijJ-1ajJ) = 0, (6.7.1)
where i, j = 1, 2, a, = au, a2 = aU,
d)-1
=(a Cc d)(0 01)'
with a, b, c, d evaluated at tuo, io, and GijGjk = 6'. Some examples are given
in Table 6.1. In the first, third, fourth, and fifth cases, wo = zo = 1; while in
the other two wo = 1 and zo = 0. In all six cases, w = u and z = v. The first
example also appeared in §6.2.
We can express the reduced equation (6.7.1) in an invariant form
\[ d{*}(\rho J^{-1}dJ) = d(hJ^{-1}dJ) , \]
where $d$ is the exterior derivative on $S$ and $*$ is the two-dimensional duality
operator on 1-forms, and $h = 2\omega(X, Y)$, with the conformal metric on $S$ defined
by identifying the tangent spaces to $S$ with the 2-spaces orthogonal to the orbits
of $H$.$^{10}$ The function $\rho$ measures the ratio of an invariant area element on the
orbit to the space-time area element; the function $h$ measures the twist of the
orbits: since the ASD parts of $d(X_a dx^a)$ and $d(Y_a dx^a)$ are proportional to $\omega$, $h$
vanishes whenever the two `twist scalars'
\[ \varepsilon_{abcd}X^aY^b\nabla^cX^d , \qquad \varepsilon_{abcd}X^aY^b\nabla^cY^d , \]
vanish, which is the condition for the orbits to be orthogonally transitive (i.e.
orthogonal to a family of 2-surfaces). When $X$ and $Y$ are commuting Killing
vectors, the orbits are necessarily orthogonally transitive and $h = 0$. In this case
$\rho$ is a harmonic function on $S$ ($d{*}d\rho = 0$). If it has non-null gradient, then the
reduced equation is equivalent to the Ernst equation by a suitable coordinate
transformation on $S$.
In both the second and third examples, the reduced equation is the Ernst
equation, in spite of the fact that the corresponding symmetry groups are not
conjugate in the conformal group. In the second case, r2 = u, x = -2v (this is
the case considered in §6.6). In the third, $r^2 = -uv$, $x = \tfrac12(u + v)$ (Fletcher and

Table 6.1 Some two-dimensional reductions

1. x=as - aw
Y =a:+a,;,
au(J-'&J) +&(J-'auJ) = 0
2. X = w%,, - zba,;,
Y=a:+a:
+av(J-'&J) = 0
3. X=waw-zba1,
Y = za: - ia:
0

4. X = wa,,, + za: - zba,;, - iai

Y = wa: + ia,;,
a a.(uJ-'auJ) + ap((u + 0

5. X =waw+za=+zba;,+ia=
Y=waw-wa;,
a aV 0

6. X=waw+za:+wa;,+za2
Y = a: + ai
au(uJ-'auJ+ vJ-'avJ) - av(J-'aaJ) = 0

Woodhouse 1990). In the fourth example, the gradient of $\rho$ is a null vector field
on $S$, so the equation is not a coordinate transform of the Ernst equation. In
the fifth, $X$ is a dilatation and $h \neq 0$. In the sixth, $X$ and $Y$ do not commute,
so $H$ is not Abelian.
6.8 LIOUVILLE'S EQUATION
We now turn to a final example of a reduction by a three-dimensional group
with two-dimensional orbits. We consider Example (4.5.1) of reduction by the
complex orthogonal group. We choose the double-null coordinates so that

Cv'' z) 2 (xI +ix2 20' ix 2)

Then, in the notation of Example (4.5.1), $S$ is the surface $w = \tilde w = 0$, and at $S$,
we have
X+iY=(z-2)8,,,, X-iY=(z-z)aw, Z=0.
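We use $z$ and $\tilde z$ as coordinates on $S$, and write $D' = d + \Phi'$, where $\Phi' = A\,dz + \tilde A\,d\tilde z$.
From the first and third of eqns (4.5.5),
\[ (z - \tilde z)^2F_{w\tilde w} = 2C + [B, \tilde B] , \qquad
   F_{z\tilde z} = \partial_z\tilde A - \partial_{\tilde z}A + [A, \tilde A] \]
at $S$, where
\[ B = \phi_X + i\phi_Y , \qquad \tilde B = \phi_X - i\phi_Y , \qquad C = i\phi_Z . \]
The vanishing of $F_{w\tilde w} - F_{z\tilde z}$ gives
\[ 2C + [B, \tilde B] = (z - \tilde z)^2\bigl(\partial_z\tilde A - \partial_{\tilde z}A + [A, \tilde A]\bigr) . \]
By using the second of eqns (4.5.5), the vanishing of $F_{wz}$ and $F_{\tilde w\tilde z}$ on $S$ gives
\[ B_z + [A, B] = 0 , \qquad \tilde B_{\tilde z} + [\tilde A, \tilde B] = 0 . \]
To these we must add the constraints derived in Example (4.5.1),
\[ B + [B, C] = 0 , \qquad \tilde B - [\tilde B, C] = 0 , \qquad D'C = 0 . \]
When the gauge group is $SL(2, \mathbb{C})$, the constraints can be satisfied with nonzero
$B$ or $\tilde B$ only if there is a gauge in which $C$, $B$ and $\tilde B$ are of the form (4.5.8), with
$A = \operatorname{diag}(a, -a)$, $\tilde A = \operatorname{diag}(\tilde a, -\tilde a)$, where $a$, $b$, $\tilde a$, and $\tilde b$ are functions of $z$ and $\tilde z$.
In this case, the reduced equations are
\[ (z - \tilde z)^2(\tilde a_z - a_{\tilde z}) = 1 + b\tilde b , \qquad b_z + 2ab = 0 , \qquad
   \tilde b_{\tilde z} - 2\tilde a\tilde b = 0 . \]
On eliminating $a$ and $\tilde a$, we obtain
\[ \partial_z\partial_{\tilde z}\rho = 2e^{\rho} , \]
where $\rho$ is given in terms of the gauge-invariant quantity $b\tilde b = \operatorname{tr}(B\tilde B)$ by
$\rho = \log\bigl(b\tilde b/(z - \tilde z)^2\bigr)$. This is the complex form of Liouville's equation; to obtain
the real form in Chapter 1, one imposes the Euclidean reality conditions $\tilde z = \bar z$,
and restricts the gauge group to $SU(2)$ (Witten 1977).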
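
The elimination that leads to Liouville's equation can be spelled out as follows. This is a sketch under one assumption about notation: the reduced equations are read as $b_z + 2ab = 0$ and $\tilde b_{\tilde z} - 2\tilde a\tilde b = 0$, which is the placement of tildes that reproduces the stated result.

```latex
% Sketch: eliminating a and \tilde a from the reduced equations.
\begin{align*}
b_z + 2ab = 0,\quad \tilde b_{\tilde z} - 2\tilde a\tilde b = 0
  \;&\Longrightarrow\;
  a = -\tfrac12\,\partial_z\log b, \qquad \tilde a = \tfrac12\,\partial_{\tilde z}\log\tilde b, \\
\tilde a_z - a_{\tilde z} &= \tfrac12\,\partial_z\partial_{\tilde z}\log(b\tilde b), \\
(z - \tilde z)^2\,\partial_z\partial_{\tilde z}\log(b\tilde b) &= 2\,(1 + b\tilde b), \\
\partial_z\partial_{\tilde z}\log(z - \tilde z)^2 &= \frac{2}{(z - \tilde z)^2}, \\
\partial_z\partial_{\tilde z}\rho
  &= \frac{2\,(1 + b\tilde b)}{(z - \tilde z)^2} - \frac{2}{(z - \tilde z)^2}
   = \frac{2\,b\tilde b}{(z - \tilde z)^2} = 2e^{\rho}.
\end{align*}
```

The constant term $1$ on the right-hand side of the first reduced equation is exactly cancelled by the contribution of the conformal factor $(z - \tilde z)^{-2}$ in the definition of $\rho$.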

NOTES ON CHAPTER 6
1. Other examples are given by Ivanova and Popov (1992).
2. The stationary axisymmetric reduction of the ASDYM equations also acquires an
unexpected extra invariance under hyperbolic transformations of the upper half-plane.
It can therefore be transferred to a Riemann surface with genus greater than 1 (such a
surface has a canonical hyperbolic metric). See Mason (1992a) for more details of this
and other examples of unexpectedly large symmetry groups for the reduced ASDYM
equations.
3. The linear coordinate transformations in this class have a clear origin in the space-
time geometry: they arise from dilatations of space-time and from the freedom to
change the generators of H by adding a constant multiple of X (the null generator) to
Y (the non-null generator). However, the nonlinear transformations cannot be deduced
from the symmetries of the original ASDYM equations. As noted above in the cases of
reduction by a single null Killing vector or H++, this phenomenon is not unusual-see
also the HSD reduction and Mason (1992a).
4. The equation at level k = n + 1 in the sequence is generated by the application of
the Drinfeld-Sokolov construction to the loop algebra of sl(n, C ), as in Appendix B.
5. In an invariant gauge, the Lie derivatives of the potential along X and Y vanish,
and so
eie(PW , D:, a-i° I , 4iz
depend only on x and r. Under transformations of the invariant gauge by g(x, r),

9-'-,D.9 + Ze-peg-'gr , 1D. '--I 2g-'9a

As in the general case, we can interpret the first equation in (3.2.1), which now takes
the form
-ax4'w - e-'°t9,4 + 2[ ' ,'w] = 0, (6.8.1)
as the condition for the existence of an invariant gauge in which 4i,,, and 4iz vanish.
6. When $n = s = 2$, orthogonal transitivity is equivalent to the vanishing of the two twist scalars
\[ \varepsilon_{abcd}X_1^aX_2^b\nabla^cX_1^d, \qquad \varepsilon_{abcd}X_1^aX_2^b\nabla^cX_2^d, \]
where $\varepsilon$ is the alternating tensor. If the Ricci tensor vanishes, then the twist scalars are constant (Kundt and Trümper 1966, Kramer et al. 1980, p. 163). Therefore they vanish everywhere if some combination of $X_1$ and $X_2$ has a zero, for example on the symmetry axis in a stationary axisymmetric space-time.
7. Our definition of the dual is that of §2.3, which differs by a factor of i from the
one that is natural in the context of general relativity. When gab is a real metric of
Lorentzian signature, and F is a real electromagnetic field, our F' is imaginary.
8. The other components of the first Einstein-Maxwell equation are equivalent to
8ar log Q = r tr(KI - x2) - 4r-' - 2r tr(L'ILI J - LZL2J)
4azlog1l = rtr(KIK2) -2rtr(L1L2J)
where KI = J-'Jr, K2 = J-'J,, and L1, L2, LI and L2 are the row vectors with the
respective entries (labelled by i = 1, 2)
JijarVj , Ji'amw) , J'ja4i , Jijazwj .
9. Given that $X$ and $Y$ both satisfy the conformal Killing equation, the condition that $H$ preserves the two sets of null 2-planes is equivalent to the condition that the SD parts of $d(X_a\,dx^a)$ and $d(Y_a\,dx^a)$ should both be proportional to
\[ \omega = dw\wedge d\tilde w - dz\wedge d\tilde z. \]
On the Euclidean real slice, $\tilde z = \bar z$, $\tilde w = -\bar w$, $H$ preserves the complex structure determined by the coordinates $w$, $z$, and $\omega$ is the Kähler form, multiplied by $i/2$.
In terms of the isomorphism described in §2.4, $H$ is the projection into PGL(4, C) of a subgroup of GL(4, C) of matrices of the form
\[ \begin{pmatrix} A & B \\ 0 & D \end{pmatrix}, \]
where $A$, $B$, $D$ are $2\times 2$ matrices and $D$ is diagonal. It follows that the transformations in $H$ are combinations of isometries and dilatations.
10. To derive an expression for ds2, one puts
U=19,,,+aX+0Y, V=az+ryX+5Y,
where a, 6 are chosen so that U and V are orthogonal to the orbits. We then
have

H _Y)

where
H = ( XaYa YaY l
X°Xa
X°Ya
The metric on S is ds2 = U°Uadu2 + 2Ua Vadudv + Va Vadv2, from which we obtain
ds2 = (du dv) G (dv)

on discarding a conformal factor. Note that G is not symmetric in general, so the

metric tensor is not G;,, but G(;j). If we put gig = G(i,) and g = det(gi,), then we
have
z
g- P GIii) = h f) G(") = P g g;j
(bc-ad)z' ad-bc be - ad
where e`3 is the two-dimensional alternating symbol and g`3 is the contravariant metric
(i.e. g"g;k = bk). By substituting in (3.3.2), and by using the fact that ad - be is
constant on S, we obtain the conformally invariant form of the reduced equation.
7
Reductions to one dimension

Amongst the reductions to one-dimensional systems are two central families of examples. The first, the integrable motions of symmetric tops found by Euler, Lagrange, and Kovalevskaya, are historically important for their part in stimulating the development of the subject. This is particularly true of Kovalevskaya's example, which she discovered by requiring that the solutions to the equations of motion should have no movable critical points, that is, the location of singularities other than poles should not depend on the initial conditions. Her idea was closely related to work by Painlevé and others on the classification of second-order ordinary differential equations of the form
\[ y'' = F(y, y', t), \]
where $F$ is rational in $y$ and $y'$, with the same Painlevé property, that the critical points of the solutions should be fixed (Painlevé 1900). Amongst the fifty equivalence classes of such equations, there are six that required new transcendental functions for their solution: these make up the second family of examples that we consider in this chapter. They play a central role in the modern theory of integrability through the various forms of the Painlevé test (see Chapter 1). Both the classical integrable top equations and the six Painlevé equations are reductions of the ASDYM equation by three-dimensional Abelian groups of symmetries, the first by translations and the second by what we call the Painlevé groups. In this chapter, we shall describe these reductions, and also consider briefly some examples of non-Abelian reduction. Other aspects of the Painlevé equations (their connection with the isomonodromy problem and their role in the construction of Bianchi metrics with ASD conformal structure) will be considered in Chapters 11 and 13.$^1$

7.1 ABELIAN REDUCTION TO ONE-DIMENSION

A three-dimensional Abelian group of conformal symmetries is generated by three commuting conformal Killing vectors $X$, $Y$, $Z$. For a reduction to one dimension, the orbits must be three dimensional, and so $X$, $Y$, $Z$ must be independent, at least in an open subset of space-time. We can then introduce local coordinates $p$, $q$, $r$, $t$ such that
\[ X = \partial_p, \qquad Y = \partial_q, \qquad Z = \partial_r. \tag{7.1.1} \]
By making a gauge transformation to eliminate the $dt$ component, we can bring the general invariant Yang-Mills potential into the form
\[ \Phi = P\,dp + Q\,dq + R\,dr, \tag{7.1.2} \]
where the Higgs fields $P$, $Q$, and $R$ are functions of $t$ alone. If the orbits are non-null, then the ASDYM equation becomes a system of ODEs for $P$, $Q$, and $R$ as functions of $t$. In the null case, however, there will be a tangent $\alpha$-plane at each point of every orbit, and the system will be singular: combinations of the ASDYM equations will reduce to algebraic restrictions on the dependent variables $P$, $Q$, $R$, and it will not be possible to solve for the $t$ derivatives of all the Higgs fields. The restrictions are the `dynamic constraints' that we described in §4.5.
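Example 7.1.1 The three conformal Killing vectors
\[ X = \partial_{\tilde z}, \qquad Y = z\,\partial_{\tilde w} + w\,\partial_{\tilde z}, \qquad Z = z\,\partial_z + \tilde w\,\partial_{\tilde w} \]
commute, and are independent almost everywhere. They generate a three-dimensional conformal group, with the null hyperplanes of constant $w$ as orbits. We can take
\[ p = \tilde z - \frac{w\tilde w}{z}, \qquad q = \frac{\tilde w}{z}, \qquad r = \log z, \qquad t = w. \]
Then
\[ \Phi = P\,dp + Q\,dq + R\,dr = \frac{Q - tP}{z}\,d\tilde w + \frac{t\tilde wP - \tilde wQ + zR}{z^2}\,dz - \frac{\tilde wP}{z}\,dw + P\,d\tilde z. \]
On substituting into the ASDYM equations (3.2.1), we obtain the singular system
\[ [R, Q - tP] = 0, \qquad P' = 0, \qquad Q' + [R, P] = 0, \]
where the primes denote $t$ derivatives. The first equation is the dynamic constraint that arises because $Y - wX$ and $Z$ span an $\alpha$-plane at each point.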
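The adapted coordinates of Example 7.1.1 can be checked numerically. The following is a minimal sketch, assuming the generators are $X = \partial_{\tilde z}$, $Y = z\partial_{\tilde w} + w\partial_{\tilde z}$, $Z = z\partial_z + \tilde w\partial_{\tilde w}$ and restricting to real values with $z > 0$; the function names and the ordering `(w, z, wt, zt)` of the coordinates are illustrative choices, not notation from the text. It applies each vector field to $p$, $q$, $r$, $t$ by central differences and should return the rows of (7.1.1).

```python
import math

# Point ordering (w, z, wt, zt); wt and zt stand for the tilded coordinates.
def adapted_coords(w, z, wt, zt):
    """Return (p, q, r, t) as in Example 7.1.1 (real z > 0 assumed)."""
    return (zt - w * wt / z, wt / z, math.log(z), w)

# Components of each generator along (d/dw, d/dz, d/dwt, d/dzt).
def X(w, z, wt, zt):
    return (0.0, 0.0, 0.0, 1.0)

def Y(w, z, wt, zt):
    return (0.0, 0.0, z, w)

def Z(w, z, wt, zt):
    return (0.0, z, wt, 0.0)

def apply_field(field, point, h=1e-6):
    """Directional derivative of (p, q, r, t) along the field at the point."""
    v = field(*point)
    plus = adapted_coords(*[x + h * c for x, c in zip(point, v)])
    minus = adapted_coords(*[x - h * c for x, c in zip(point, v)])
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def jacobian_rows(point=(0.3, 1.7, -0.4, 0.9)):
    """Rows are X, Y, Z applied to (p, q, r, t)."""
    return [apply_field(f, point) for f in (X, Y, Z)]
```

Up to finite-difference error, the three rows are the first three unit vectors, so that $X = \partial_p$, $Y = \partial_q$, $Z = \partial_r$ and all three fields annihilate $t$.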
The system in the example is underdetermined because the algebraic con-
straint does not fix all the components of the Higgs field R-in fact when the
gauge group is Abelian, R can be specified freely. This is a general feature of the
null reductions, although they can nevertheless be reduced further to interesting
deterministic systems by fixing the undetermined degrees of freedom, in a way
that is analogous to the choice of first integrals that we discussed in Chapter 4.
For an example, see §7.3.
Three-dimensional Abelian conformal groups
A three-dimensional subgroup of the conformal group can be identified, by the construction in §2.4, with a three-dimensional subgroup of PGL(4, C) = GL(4, C)/C$^\times$, and therefore with the quotient of a four-dimensional subgroup of GL(4, C) by the multiples of the identity (which act trivially on space-time). To classify the Abelian reductions to one dimension, therefore, we must list the conjugacy classes of four-dimensional Abelian subalgebras of gl(4, C) that contain the multiples of the identity. By considering the common eigenvectors of the generators, it is immediate that each class contains a representative of the form $\mathfrak{a} \oplus \mathfrak{n}$, where the elements of $\mathfrak{a}$ are diagonal and the elements of $\mathfrak{n}$ are upper triangular. Armed with this observation, it is possible to list all the distinct cases by looking at the Jordan canonical forms of the various nilpotent generators. There are fourteen conjugacy classes of such groups in all, and they can be grouped into four types.
Degenerate groups. In two cases, the generators in space-time are everywhere linearly dependent and the orbits are two-dimensional and totally null: these are the two subgroups
\[ \begin{pmatrix} d & c & b & a \\ 0 & d & 0 & 0 \\ 0 & 0 & d & 0 \\ 0 & 0 & 0 & d \end{pmatrix}, \qquad \begin{pmatrix} d & 0 & 0 & a \\ 0 & d & 0 & b \\ 0 & 0 & d & c \\ 0 & 0 & 0 & d \end{pmatrix} \]
of GL(4, C), where $a$, $b$, $c$, $d$ are complex parameters labelling the matrices in the subgroup. In these cases, there are kinematic constraints, and the reductions are PDEs in two independent variables. The generators in space-time can be read off from Table 2.1. They are, respectively,
\[ X = \partial_w, \qquad Y = \partial_{\tilde z}, \qquad Z = z\,\partial_w + \tilde w\,\partial_{\tilde z}, \]
for which the orbits are $\beta$-planes, and
\[ X = \partial_w, \qquad Y = \partial_z, \qquad Z = -\tilde z\,\partial_w - \tilde w\,\partial_z, \]
for which the orbits are $\alpha$-planes. In each case, $X$, $Y$, $Z$ generate the flows of the parameters $a$, $b$, $c$; the flow of the fourth parameter $d$ is trivial because the multiples of the identity act trivially on space-time.
Null groups. There are six cases in which the orbits are three-dimensional, but
null. Here the reduced equations are singular and there are dynamic constraints.
The null groups are listed in Table 7.1. The first is the group of translations
parallel to a null hyperplane, for which the reductions include the Euler top,
by a route that generalizes to give integrable motions of an n-dimensional rigid
body. The second entry in the first line is Example 7.1.1.
The translation group. Here the generators are $X = \partial_w - \partial_{\tilde w}$, $Y = \partial_{\tilde z}$ and $Z = \partial_z$, and the reductions include Nahm's equation and the various top equations (see below). The translations are parallel to the non-null hyperplane $w + \tilde w = 0$.
The Painleve groups. The remaining conjugacy classes are listed in Table
7.2. Apart from the translations, they are the only Abelian groups of conformal
symmetries with three-dimensional non-null orbits. We call them the Painleve
groups: when the gauge group is SL(2, C ), the corresponding reductions are
the six Painleve equations. The table shows the Painleve equation that arises
in each case, together with a representative of the four-parameter subgroup of GL(4, C) and the generators in space-time of the flows of the parameters $a$, $b$, $c$; the flow of the fourth parameter $d$ is a linear combination of these with constant coefficients.

[Table 7.1 The null groups. Columns: subgroup of GL(4, C); generators in space-time.]

[Table 7.2 The Painlevé groups. Rows: $P_{I,II}$, $P_{III}$, $P_{IV}$, $P_{V}$, $P_{VI}$. Columns: subgroup of GL(4, C); generators in space-time.]

7.2 NAHM'S EQUATIONS AND TOPS

Suppose that $G$ is the group generated by translations parallel to the hyperplanes of constant $t = w + \tilde w$. Then we can choose an invariant gauge in which
\[ \Phi = i(B + iC)\,dz - iA\,(dw - d\tilde w) + i(B - iC)\,d\tilde z, \]

where $A$, $B$, $C$ are matrix-valued functions of $t$; they are the Higgs fields of the generators of translations parallel to the $x$, $y$, and $z$ axes of a Cartesian coordinate system $t$, $x$, $y$, $z$. On substituting into eqn (3.2.1), we obtain the reduced equations in the form of Nahm's equations
\[ A' = [B, C], \qquad B' = [C, A], \qquad C' = [A, B], \tag{7.2.1} \]
where the prime denotes differentiation with respect to $t$ (Nahm 1983, Hitchin 1983; see also Ward 1985, Ivanova and Popov 1991).
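Example 7.2.1 The Euler-Poinsot top. It follows from (7.2.1) that the three gauge-invariants $\mathrm{tr}(AB)$, $\mathrm{tr}(BC)$, and $\mathrm{tr}(CA)$ are constant. They can be set to zero by making a rotation of the $x$, $y$, $z$ coordinates in space-time to diagonalize the symmetric matrix
\[ \begin{pmatrix} \mathrm{tr}(A^2) & \mathrm{tr}(AB) & \mathrm{tr}(AC) \\ \mathrm{tr}(AB) & \mathrm{tr}(B^2) & \mathrm{tr}(BC) \\ \mathrm{tr}(AC) & \mathrm{tr}(BC) & \mathrm{tr}(C^2) \end{pmatrix} \]
(the trace-free part of this matrix is also constant and gauge invariant). If we take the gauge group to be SL(2, C), and make a suitable choice of basis in sl(2, C), we then have
\[ A = \begin{pmatrix} 0 & a \\ a & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -b \\ b & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} c & 0 \\ 0 & -c \end{pmatrix}, \]
where $a$, $b$, $c$ are functions of $t$ satisfying
\[ a' = 2bc, \qquad b' = -2ca, \qquad c' = 2ab. \]
With appropriate scaling, these are Euler's equations for a top.$^2$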
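As an illustration of the reduced system $a' = 2bc$, $b' = -2ca$, $c' = 2ab$, the sketch below integrates it with a classical fourth-order Runge-Kutta step and monitors the two quadratic quantities $a^2 + b^2$ and $b^2 + c^2$, which are constant because $(a^2)' = 4abc = -(b^2)' = (c^2)'$. The step size, initial data, and function names are arbitrary illustrative choices, not taken from the text.

```python
def euler_rhs(state):
    """Right-hand side of a' = 2bc, b' = -2ca, c' = 2ab."""
    a, b, c = state
    return (2 * b * c, -2 * c * a, 2 * a * b)

def rk4_step(state, h):
    """One classical Runge-Kutta step for the autonomous system."""
    def shifted(base, k, s):
        return tuple(x + s * y for x, y in zip(base, k))
    k1 = euler_rhs(state)
    k2 = euler_rhs(shifted(state, k1, h / 2))
    k3 = euler_rhs(shifted(state, k2, h / 2))
    k4 = euler_rhs(shifted(state, k3, h))
    return tuple(x + h / 6 * (p + 2 * q + 2 * r + s)
                 for x, p, q, r, s in zip(state, k1, k2, k3, k4))

def invariants(state):
    """The two conserved quadratic combinations."""
    a, b, c = state
    return (a * a + b * b, b * b + c * c)

def max_drift(state=(1.0, 0.5, 0.2), h=1e-3, steps=2000):
    """Largest change in either invariant along the numerical orbit."""
    start = invariants(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        worst = max(worst, *(abs(x - y) for x, y in zip(invariants(state), start)))
    return worst
```

The drift returned by `max_drift()` measures only the integrator error, since both invariants are exact constants of the continuous flow; they bound $a$, $b$, $c$, so the orbit stays on a compact curve.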
With a different choice of invariant gauge, we can arrange that the $dw$ component of $\Phi$ vanishes. Then
\[ \Phi = Q\,d\tilde z - P\,d\tilde w + R\,dz, \]
where $P$, $Q$, $R$ are the Higgs fields of the Killing vectors $X = \partial_w - \partial_{\tilde w}$, $Y = \partial_{\tilde z}$, $Z = \partial_z$. In this case, we have
\[ R' = 0, \qquad Q' + [Q, P] = 0, \qquad P' + [R, Q] = 0, \tag{7.2.2} \]
where the prime denotes the derivative with respect to $t = w + \tilde w$. By taking the gauge group to be an orthogonal group, by imposing further discrete symmetry, and by making particular choices for some of the invariants of the Higgs fields, we obtain various classical integrable rigid body systems, and their generalizations to higher dimensions.$^3$
Example 7.2.2 Kovalevskaya's top. We take the double-null space-time coordinates to be real, that is, we work in an ultrahyperbolic space, and we choose the gauge group to be SO(3, 2), so that $P$, $Q$, and $R$ are real matrices of the block form
\[ \begin{pmatrix} A & B \\ B^t & C \end{pmatrix}, \]
where $A$ is a skew-symmetric $3\times 3$ matrix and $C$ is a skew-symmetric $2\times 2$ matrix.
We require further that Q and R should be symmetric, and that P should be
skew-symmetric, which is equivalent to the imposition on the ASDYM connection
of a further discrete symmetry under the Z2-action
z -z, w-w, wow
with ci = 1 in the chosen gauge (see Example 4.5.4). With these symmetry and
gauge conditions,
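\[ P = \begin{pmatrix} 0 & -L_3 & L_2 & 0 & 0 \\ L_3 & 0 & -L_1 & 0 & 0 \\ -L_2 & L_1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & c \\ 0 & 0 & 0 & -c & 0 \end{pmatrix} \]
and
\[ Q = \begin{pmatrix} 0 & 0 & 0 & e_1 & f_1 \\ 0 & 0 & 0 & e_2 & f_2 \\ 0 & 0 & 0 & e_3 & f_3 \\ e_1 & e_2 & e_3 & 0 & 0 \\ f_1 & f_2 & f_3 & 0 & 0 \end{pmatrix}, \qquad R = -\begin{pmatrix} 0 & 0 & 0 & g_1 & h_1 \\ 0 & 0 & 0 & g_2 & h_2 \\ 0 & 0 & 0 & g_3 & h_3 \\ g_1 & g_2 & g_3 & 0 & 0 \\ h_1 & h_2 & h_3 & 0 & 0 \end{pmatrix}, \]
where $c$ and the real 3-vectors $L$, $e$, and $f$ are functions of $t$, and $g$ and $h$ are constant. Finally, we choose the values of certain invariants of the Higgs fields. It follows from the second equation in (7.2.2) that $Q$ evolves by conjugation, and hence that the gauge-invariant coefficients $\alpha$, $\beta$ of the characteristic polynomial $\det(\lambda - Q) = \lambda^5 - \alpha\lambda^3 + \beta\lambda$ are constant. With our choice of gauge,
\[ \alpha = e\cdot e + f\cdot f, \qquad \beta = (e\cdot e)(f\cdot f) - (e\cdot f)^2, \]
so that $4\beta \le \alpha^2$.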

The Kovalevskaya top is the extreme case in which this is an equality. With an appropriate scaling of $t$, it is the case $e\cdot e = f\cdot f = 1$, $e\cdot f = 0$. On substituting into (7.2.2), we obtain the reduced ASDYM equations
\[ e' = \omega\wedge e, \qquad f' = \omega\wedge f, \qquad L' = f\wedge h + e\wedge g, \]
where $\omega = L + ck$, $k = e\wedge f$, and $c = -\gamma + L\cdot k$, with $\gamma$ constant. These are the equations of motion of a symmetric charged top with angular momentum $L$ and angular velocity $\omega$, rotating about a fixed point in a gravitational field $g$ and an electric field $h$. The vectors $e$, $f$, $k$ make up an orthonormal triad fixed in the top, and the components of $L$ and $\omega$ are related by
Li=J ,+7k;,
where J is the inertia tensor and ryk is the gyrostatic momentum. In the triad
fixed in the top,
1 0 0
J= 0 1 0
0 0 1

Thus the top has principal moments 1,1, 2, and symmetry axis along k. The
mass and charge are both equal to 1, and the centre of mass and the centre of
charge are at the points with respective position vectors e and f from the fixed
point (i.e. in the plane orthogonal to the symmetry axis). In the standard case,
h and -y are both zero.
Example 7.2.3 Lagrange's top. If instead we take the gauge group to be the Lorentz group SO(3, 1), and impose the same discrete space-time symmetry, then
\[ P = \begin{pmatrix} 0 & -L_3 & L_2 & 0 \\ L_3 & 0 & -L_1 & 0 \\ -L_2 & L_1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]
and
\[ Q = \begin{pmatrix} 0 & 0 & 0 & e_1 \\ 0 & 0 & 0 & e_2 \\ 0 & 0 & 0 & e_3 \\ e_1 & e_2 & e_3 & 0 \end{pmatrix}, \qquad R = -\begin{pmatrix} 0 & 0 & 0 & g_1 \\ 0 & 0 & 0 & g_2 \\ 0 & 0 & 0 & g_3 \\ g_1 & g_2 & g_3 & 0 \end{pmatrix}, \]
where the real 3-vectors $L$ and $e$ are functions of $t$, and $g$ is constant. This time, on substituting into eqns (7.2.2), we obtain
\[ L' = e\wedge g, \qquad e' = L\wedge e. \]
Note that $e\cdot e$ and $e\cdot L$ are constants of the motion. We take $e$ to be a unit vector (which again is a special choice for an invariant of the Higgs field $Q$), and denote the second constant by $Cn$. We then have the equation of motion of a symmetric top with principal moments $(1, 1, C)$, rotating about a fixed point on its axis of symmetry in a constant gravitational field $g$. The centre of mass is at the point with position vector $e$ from the fixed point, and $n$ is the component of the angular velocity along $e$.
Example 7.2.4 The Toda lattice. Another system of ODEs that arises as a reduction by three translations together with an additional discrete symmetry is the original Toda lattice. This can be obtained from Example 6.2.2 by imposing an extra translational symmetry along $\partial_z - \partial_{\tilde z}$.
7.3 THE MOTION OF AN n-DIMENSIONAL RIGID BODY
Euler's equations for a top also arise from the reduction by the group of translations parallel to the null hyperplane $\tilde w = 0$. Here the generators are
\[ X = \partial_w, \qquad Y = \partial_{\tilde z}, \qquad Z = \partial_z, \]
and the reduced equations are
\[ Q' + [\omega, Q] = 0, \qquad P' + [\omega, P] - [Q, R] = 0, \qquad [R, P] = 0, \tag{7.3.1} \]
where $P$, $Q$, $R$ are the Higgs fields, $\omega = \Phi_{\tilde w}$, and the prime denotes differentiation with respect to $t = \tilde w$. We shall take the gauge group to be SL(n, C) and consider the generic case in which $P$ has distinct eigenvalues. If then we choose a gauge in which $P$ is diagonal, we deduce from the last equation that $R$ must also be diagonal. The remaining freedom to make diagonal gauge transformations can be fixed by requiring that the diagonal entries of $\omega$ should vanish. We then have from the diagonal entries of the second equation that $P$ is constant, while the off-diagonal entries determine $\omega$ in terms of $P$, $Q$, and $R$.
We can choose P to be any constant diagonal matrix, with distinct entries,
and R to be any diagonal function of t. We then express w in terms of the
unknown variable Q, and use the first equation as a propagation equation for
Q. The reduced equations do not constrain the diagonal entries in R, but they
become deterministic once we have made a particular choice for R.
We can make a further reduction to obtain the integrable motions of the $n$-dimensional rigid body (Ward 1986), by requiring that $Q$ and $\omega$ should be skew-symmetric, which amounts to imposing the additional $Z_2$ symmetry
\[ (z, w, \tilde z, \tilde w) \mapsto (-z, -w, \tilde z, \tilde w), \]
as in Example 4.5.4, and by choosing $R$ to be constant, with distinct diagonal entries. We can then identify $Q$ with the angular momentum of an $n$-dimensional rigid body, and $\omega$ with its angular velocity. The second of eqns (7.3.1) determines the $ij$th entry of $\omega$ in terms of that of $Q$ by
\[ \omega_{ij} = \frac{r_i - r_j}{p_i - p_j}\,Q_{ij} \tag{7.3.2} \]
(without summation), where the $p_i$s and $r_i$s are the diagonal entries in $P$ and $R$, while the first of eqns (7.3.1) becomes
\[ Q' = -[\omega, Q], \]
which is the Euler-Arnold-Manakov equation for an $n$-dimensional spinning rigid body, or alternatively the equation of the geodesic flow of a left-invariant diagonal metric on SO(n) (Manakov 1976, Arnold 1984).
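The rigid body flow can be explored numerically. The sketch below is a minimal illustration for $n = 3$, assuming the relation $\omega_{ij} = (r_i - r_j)Q_{ij}/(p_i - p_j)$ and the evolution $Q' = -[\omega, Q]$; the values chosen for the diagonal entries `p`, `r`, the initial skew matrix, and all function names are hypothetical. Because $Q$ evolves by conjugation, $\mathrm{tr}(Q^2)$ is constant, and $\mathrm{tr}(\omega Q)$ (twice the kinetic energy up to sign) is constant as well.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def omega_from(Q, p, r):
    """Angular velocity with entries (r_i - r_j) / (p_i - p_j) * Q_ij."""
    n = len(Q)
    return [[0.0 if i == j else (r[i] - r[j]) / (p[i] - p[j]) * Q[i][j]
             for j in range(n)] for i in range(n)]

def rhs(Q, p, r):
    """Q' = -[omega, Q] = Q omega - omega Q."""
    w = omega_from(Q, p, r)
    wq, qw = matmul(w, Q), matmul(Q, w)
    n = len(Q)
    return [[qw[i][j] - wq[i][j] for j in range(n)] for i in range(n)]

def axpy(A, B, s):
    """Return A + s * B entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def rk4_step(Q, p, r, h):
    k1 = rhs(Q, p, r)
    k2 = rhs(axpy(Q, k1, h / 2), p, r)
    k3 = rhs(axpy(Q, k2, h / 2), p, r)
    k4 = rhs(axpy(Q, k3, h), p, r)
    out = Q
    for k, wgt in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
        out = axpy(out, k, h * wgt / 6)
    return out

def invariants(Q, p, r):
    """Return (tr Q^2, tr(omega Q))."""
    n = len(Q)
    q2, wq = matmul(Q, Q), matmul(omega_from(Q, p, r), Q)
    return (sum(q2[i][i] for i in range(n)), sum(wq[i][i] for i in range(n)))

def max_drift(h=1e-3, steps=1000):
    p, r = (1.0, 2.0, 4.0), (3.0, 5.0, 6.0)
    Q = [[0.0, 0.3, -0.5], [-0.3, 0.0, 0.8], [0.5, -0.8, 0.0]]
    start = invariants(Q, p, r)
    worst = 0.0
    for _ in range(steps):
        Q = rk4_step(Q, p, r, h)
        worst = max(worst, *(abs(x - y) for x, y in zip(invariants(Q, p, r), start)))
    return worst
```

The coefficient $(r_i - r_j)/(p_i - p_j)$ is symmetric in $i$ and $j$, so $\omega$ is skew whenever $Q$ is, which is why the flow stays in the space of skew-symmetric matrices.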

7.4 THE PAINLEVE EQUATIONS

The six Painlevé equations $P_I$-$P_{VI}$ are shown in Table 7.3. They are the reductions of the SL(2, C) ASDYM equation by the five subgroups of the conformal group shown in Table 7.2. In this section, we shall give a proof of this fact.$^4$
The transformations from the standard double-null coordinates to the coordinates $p$, $q$, $r$, $t$ in which the generators take the form (7.1.1) are given in Table 7.4. By first making a gauge transformation to bring $\Phi$ into the form (7.1.2), and then by transforming back to the original double-null coordinates, we can express the $w$, $z$, $\tilde w$, $\tilde z$ components of an invariant potential in terms of the Higgs fields $P$, $Q$, $R$. On substituting the result into eqns (3.2.1), we obtain the reduced ASDYM equation as systems of ODEs for $P$, $Q$ and $R$ as functions of $t$. There are four gauge-invariant constants of the motion, $k$, $\ell$, $m$, $n$. In each case,
\[ P' = 0, \qquad k^2 = \tfrac12\mathrm{tr}(P^2). \]

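Table 7.3 The Painlevé equations

$P_{I}$: \[ y'' = 6y^2 + t \]
$P_{II}$: \[ y'' = 2y^3 + ty + \alpha \]
$P_{III}$: \[ y'' = \frac{y'^2}{y} - \frac{y'}{t} + \frac{\alpha y^2 + \beta}{t} + \gamma y^3 + \frac{\delta}{y} \]
$P_{IV}$: \[ y'' = \frac{y'^2}{2y} + \frac{3y^3}{2} + 4ty^2 + 2(t^2 - \alpha)y + \frac{\beta}{y} \]
$P_{V}$: \[ y'' = \left(\frac{1}{2y} + \frac{1}{y - 1}\right)y'^2 - \frac{y'}{t} + \frac{(y - 1)^2}{t^2}\left(\alpha y + \frac{\beta}{y}\right) + \frac{\gamma y}{t} + \frac{\delta y(y + 1)}{y - 1} \]
$P_{VI}$: \[ y'' = \frac12\left(\frac1y + \frac{1}{y - 1} + \frac{1}{y - t}\right)y'^2 - \left(\frac1t + \frac{1}{t - 1} + \frac{1}{y - t}\right)y' + \frac{y(y - 1)(y - t)}{t^2(t - 1)^2}\left(\alpha + \frac{\beta t}{y^2} + \frac{\gamma(t - 1)}{(y - 1)^2} + \frac{\delta t(t - 1)}{(y - t)^2}\right) \]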

Table 7.4 The coordinate transformations

p q r t

P1,11 w + ii(i - z) - 2w3 i - 2202 t0 z-202

Pill -z/iu i/w - log to ur-' iz - ww
Ptv i- 2w2 w/z log z fu - w/z
Pv i/iu log(w - iz/t1) log(z/w) i/t1 - w/z
Pvj - log w - log i log(w/i) iz/ww

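Table 7.5 The equations and constants

$P_{I,II}$: $Q' = [R, P]$, $R' = [tP + R, Q]$; $\ell = \mathrm{tr}(PQ)$, $m = \mathrm{tr}(PR + \tfrac12Q^2)$, $n = \mathrm{tr}(QR)$
$P_{III}$: $tQ' = 2[Q, R]$, $R' = 2t[Q, P]$; $\ell^2 = \tfrac12\mathrm{tr}(Q^2)$, $m = \mathrm{tr}(PR)$, $n = \mathrm{tr}(QR)$
$P_{IV}$: $Q' = [P, R + tQ]$, $R' = [Q, R]$; $\ell = \mathrm{tr}(PQ)$, $m^2 = \tfrac12\mathrm{tr}(R^2)$, $n = \mathrm{tr}(PR + \tfrac12Q^2)$
$P_{V}$: $Q' = [P, R]$, $tR' = [R, tP + Q]$; $\ell = \mathrm{tr}(PQ)$, $m^2 = \tfrac12\mathrm{tr}(R^2)$, $n^2 = \tfrac12\mathrm{tr}((Q + R)^2)$
$P_{VI}$: $tQ' = [R, Q]$, $t(1 - t)R' = [tP + Q, R]$; $\ell^2 = \tfrac12\mathrm{tr}(Q^2)$, $m^2 = \tfrac12\mathrm{tr}(R^2)$, $n^2 = \tfrac12\mathrm{tr}((P + Q + R)^2)$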
The other two equations and the other three constants are different in each case:
they are given in Table 7.5.
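A quick numerical experiment makes the role of the constants concrete. The sketch below takes the $P_{I,II}$ system in the form $P' = 0$, $Q' = [R, P]$, $R' = [tP + R, Q]$ and checks that $\ell = \mathrm{tr}(PQ)$, $m = \mathrm{tr}(PR + \tfrac12Q^2)$, $n = \mathrm{tr}(QR)$ stay fixed along a short Runge-Kutta orbit. The matrices are stored as 4-tuples `(m11, m12, m21, m22)`; the initial data, step size, and helper names are illustrative assumptions rather than anything prescribed by the text.

```python
def mul(a, b):
    return (a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
            a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3])

def comm(a, b):
    return tuple(x - y for x, y in zip(mul(a, b), mul(b, a)))

def axpy(a, b, s):
    """Return a + s * b entrywise."""
    return tuple(x + s * y for x, y in zip(a, b))

def tr(a):
    return a[0] + a[3]

def rhs(t, Q, R, P):
    """Q' = [R, P], R' = [tP + R, Q] with P constant."""
    return comm(R, P), comm(axpy(R, P, t), Q)

def rk4_step(t, Q, R, P, h):
    k1q, k1r = rhs(t, Q, R, P)
    k2q, k2r = rhs(t + h / 2, axpy(Q, k1q, h / 2), axpy(R, k1r, h / 2), P)
    k3q, k3r = rhs(t + h / 2, axpy(Q, k2q, h / 2), axpy(R, k2r, h / 2), P)
    k4q, k4r = rhs(t + h, axpy(Q, k3q, h), axpy(R, k3r, h), P)
    for kq, kr, wgt in ((k1q, k1r, 1), (k2q, k2r, 2), (k3q, k3r, 2), (k4q, k4r, 1)):
        Q, R = axpy(Q, kq, h * wgt / 6), axpy(R, kr, h * wgt / 6)
    return Q, R

def constants(P, Q, R):
    """Return (l, m, n) as defined for the P_I,II reduction."""
    return (tr(mul(P, Q)),
            tr(mul(P, R)) + 0.5 * tr(mul(Q, Q)),
            tr(mul(Q, R)))

def max_drift(h=1e-3, steps=200):
    P = (0.7, 0.0, 0.0, -0.7)      # semi-simple form diag(k, -k), k = 0.7
    Q = (0.3, 0.5, -0.2, -0.3)     # trace-free initial data
    R = (0.1, -0.4, 0.6, -0.1)
    start, t, worst = constants(P, Q, R), 0.0, 0.0
    for _ in range(steps):
        Q, R = rk4_step(t, Q, R, P, h)
        t += h
        worst = max(worst, *(abs(x - y) for x, y in zip(constants(P, Q, R), start)))
    return worst
```

The integration interval is kept short on purpose: solutions of Painlevé-type systems can develop poles on the real axis, and the sketch makes no attempt to detect or step around them.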
The coordinate transformations are not unique, since we are free to add
functions of t to p, q, and r. We then have to conjugate the Higgs fields by an
SL(2, C )-valued function of t to remove the t-component of the potential in the
new coordinate system. We are also free to replace t by any function of t. Apart
from these freedoms, the systems of ODEs are determined by the conjugacy
class of the corresponding Painleve subgroup, independently of any choice of
coordinates or gauge. 5
The particular transformations in Table 7.4 have been chosen in each case (with the benefit of hindsight) so that one of the reduced equations is $P' = 0$, and so that $t$ coincides with the independent variable in the standard form of the corresponding Painlevé equation. Having made these choices, $P$, $Q$, and $R$ are determined by the Yang-Mills connection up to conjugation by a constant matrix. This residual gauge freedom can be exploited to reduce $P$ to one or other of the standard forms
\[ P = \begin{pmatrix} k & 0 \\ 0 & -k \end{pmatrix} \qquad\text{or}\qquad P = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
according to whether $k \ne 0$ (the semi-simple case) or $k = 0$ (the nilpotent case). We then substitute
\[ Q = \begin{pmatrix} \lambda & \mu \\ \nu & -\lambda \end{pmatrix}, \qquad R = \begin{pmatrix} \rho & \sigma \\ \tau & -\rho \end{pmatrix} \]
into the appropriate reduced equation. In each case, the equations for the unknowns $\lambda$, $\mu$, $\nu$, $\rho$, $\sigma$, and $\tau$ come down to a single second-order ODE, the corresponding Painlevé equation. A solution to this, together with the values of the constants, determines $P$, $Q$, and $R$ to within the only remaining gauge freedom, which is to conjugate $Q$ and $R$ by a constant matrix that commutes with $P$. We shall consider the semi-simple and nilpotent cases separately.
The semi-simple cases
The equations of motion and conserved quantities are written out in terms of the unknown functions in Table 7.6. From these, the reductions to the Painlevé equations are straightforward, if not in every case quick and obvious. The transcendents themselves are gauge-invariants of the original ASDYM equation: except in the case of $P_V$, they are constructed in a simple way from one of the roots of the gauge-invariant quadratic in $s$
\[ \det([P, sQ - R]) = 0. \tag{7.4.1} \]

$P_{I,II}$. We put $y = \sigma/\mu$, which is one of the roots of (7.4.1). Then
\[ ky' = 2k\rho - \ell y + 2k^2y^2 + 2k^2t, \]
\[ k^2\rho' = -4k^3\rho y + 2mk^2y - \tfrac12\ell^2y + \ell k\rho - nk^2. \]

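Table 7.6 The reduced equations in the semi-simple cases

$P_{I,II}$: $\lambda' = 0$, $\mu' = -2k\sigma$, $\nu' = 2k\tau$; $\rho' = \nu\sigma - \mu\tau$, $\sigma' = 2(\rho\mu - \lambda\sigma) + 2kt\mu$, $\tau' = 2(\lambda\tau - \nu\rho) - 2kt\nu$; $\ell = 2k\lambda$, $m = 2k\rho + \mu\nu + \lambda^2$, $n = 2\rho\lambda + \mu\tau + \sigma\nu$

$P_{III}$: $t\lambda' = 2(\mu\tau - \sigma\nu)$, $t\mu' = 4(\lambda\sigma - \rho\mu)$, $t\nu' = 4(\rho\nu - \lambda\tau)$; $\rho' = 0$, $\sigma' = -4kt\mu$, $\tau' = 4kt\nu$; $\ell^2 = \lambda^2 + \mu\nu$, $m = 2k\rho$, $n = 2\rho\lambda + \mu\tau + \nu\sigma$

$P_{IV}$: $\lambda' = 0$, $\mu' = 2k(\sigma + t\mu)$, $\nu' = -2k(\tau + t\nu)$; $\rho' = \mu\tau - \sigma\nu$, $\sigma' = 2(\lambda\sigma - \mu\rho)$, $\tau' = 2(\rho\nu - \lambda\tau)$; $\ell = 2k\lambda$, $m^2 = \rho^2 + \sigma\tau$, $n = 2k\rho + \mu\nu + \lambda^2$

$P_{V}$: $\lambda' = 0$, $\mu' = 2k\sigma$, $\nu' = -2k\tau$; $t\rho' = \sigma\nu - \mu\tau$, $t(\sigma' + \mu') = 2(\mu\rho - \lambda\sigma)$, $t(\tau' + \nu') = 2(\lambda\tau - \rho\nu)$; $\ell = 2k\lambda$, $m^2 = \rho^2 + \sigma\tau$, $n^2 = (\lambda + \rho)^2 + (\sigma + \mu)(\tau + \nu)$

$P_{VI}$: $t\lambda' = \sigma\nu - \mu\tau$, $t\mu' = 2(\rho\mu - \lambda\sigma)$, $t\nu' = 2(\lambda\tau - \rho\nu)$; $(1 - t)\rho' + \lambda' = 0$, $(1 - t)\sigma' + \mu' = 2k\sigma$, $(1 - t)\tau' + \nu' = -2k\tau$; $\ell^2 = \lambda^2 + \mu\nu$, $m^2 = \rho^2 + \sigma\tau$, $n^2 = (k + \lambda + \rho)^2 + (\mu + \sigma)(\nu + \tau)$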

On elimination of $\rho$, we obtain
\[ 8k^4y'' = (4k^2y - \ell)^3 + (4k^2y - \ell)(16k^4t - 3\ell^2 + 8k^2m) + 32k^4\alpha, \]
where
\[ \alpha = \frac{4k^2\ell m - 8k^4n - \ell^3 + 8k^5}{16k^4}. \]
This is equivalent to the second Painlevé equation ($P_{II}$) by affine transformations of $y$ and $t$.
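The elimination leading to $P_{II}$ can be sanity-checked pointwise. The sketch below assumes the semi-simple $P_{I,II}$ system in components, $\lambda' = 0$, $\mu' = -2k\sigma$, $\rho' = \nu\sigma - \mu\tau$, $\sigma' = 2(\rho\mu - \lambda\sigma) + 2kt\mu$, with $y = \sigma/\mu$; it differentiates $y$ twice by the quotient rule and compares $8k^4y''$ with the cubic expression in $4k^2y - \ell$. The function name and argument order are illustrative.

```python
def p2_identity(k, lam, mu, nu, rho, sigma, tau, t):
    """Return (8 k^4 y'', right-hand side) for y = sigma / mu."""
    # Constants of the motion in the semi-simple P_I,II case.
    ell = 2 * k * lam
    m = 2 * k * rho + mu * nu + lam ** 2
    n = 2 * rho * lam + mu * tau + sigma * nu

    # First derivatives from the reduced equations.
    d_mu = -2 * k * sigma
    d_rho = nu * sigma - mu * tau
    d_sigma = 2 * (rho * mu - lam * sigma) + 2 * k * t * mu

    # Second derivatives, differentiating once more along the flow.
    dd_mu = -2 * k * d_sigma
    dd_sigma = (2 * (d_rho * mu + rho * d_mu - lam * d_sigma)
                + 2 * k * mu + 2 * k * t * d_mu)

    # Quotient rule for y = sigma / mu.
    y = sigma / mu
    dd_y = (dd_sigma / mu - 2 * d_sigma * d_mu / mu ** 2
            - sigma * dd_mu / mu ** 2 + 2 * sigma * d_mu ** 2 / mu ** 3)

    alpha = (4 * k ** 2 * ell * m - 8 * k ** 4 * n - ell ** 3 + 8 * k ** 5) / (16 * k ** 4)
    u = 4 * k ** 2 * y - ell
    rhs = u ** 3 + u * (16 * k ** 4 * t - 3 * ell ** 2 + 8 * k ** 2 * m) + 32 * k ** 4 * alpha
    return 8 * k ** 4 * dd_y, rhs
```

For instance, `p2_identity(1, 0, 1, 1, 1, 1, 0, 1)` gives $y'' = 28$ on the left and $64 + 160 = 224$ on the right, so the two sides agree at that point.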
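$P_{III}$. We put $y = \sigma/t\mu$. Then
\[ kty' = -4k^2t - 4kt\lambda y^2 + (2m - k)y, \]
\[ kt\lambda' = -4kt(\ell^2 - \lambda^2)y + 2nk - 2m\lambda. \]
On eliminating $\lambda$, we obtain the third Painlevé equation ($P_{III}$) with $\alpha = -8n$, $\beta = 8(m - k)$, $\gamma = 16\ell^2$, $\delta = -16k^2$. In this case, $ty$ is a root of (7.4.1).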
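$P_{IV}$. We put $y = \sigma/\mu$ (again a root of (7.4.1)). Then
\[ ky' = \ell y - 2k\rho - 2k^2(y^2 + ty), \]
\[ \rho' = \frac{m^2 - \rho^2}{y} - \frac{y(4k^2n - \ell^2 - 8k^3\rho)}{4k^2}. \]
On eliminating $\rho$, we get
\[ y'' = \frac{y'^2}{2y} + 6k^2y^3 + 4y^2(2k^2t - \ell) + \frac{y(2k^2t - \ell)^2}{2k^2} - 2k\alpha y + \frac{\beta}{2y}, \]
where $\alpha = (4k^3 + \ell^2 - 4k^2n)/4k^3$, and $\beta = -4m^2$. By the affine transformation $2k^2t - \ell \to 2k^{3/2}t$, $y \to y/2k^{1/2}$, this comes down to the fourth Painlevé equation ($P_{IV}$).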
Pv. Here we put s = (p + A + n)/(a +,u), w = Q/(p + m), and y = sw. We
then have
sp' 2sp 2s2(p + A)(p + m)w
s = -
p+A+n t t(p+A+n)
wp
W1
= 2p(p + A + n) - 2w(pt + A) - 2kw -
st(p+m) p+m'
p' = -t(p+m)(p+A-n)+- (p-m)(p+A+n).
The first two of these equations imply that
2 1
y = t (y-1)2+y t ((A+m-n)y-A-n+m)-2ky,
and on eliminating p between this and the third equation, we find that y satisfies
the fifth Painleve equation (Pv) with a = z (m + n - A)2, 0 = (m + n + A)2, -
ry = -2k(1 + 2n - 2m)2, 6 = 4k2, A = e/2k. In this case as wellz the Painleve
transcendent y is an invariant of the Higgs fields. It can be expressed as a
cross-ratio of eigenvectors of P, Q + R, and R, regarded as points of C Pl .
Pvi. We put
Y
_ tµ A p(1-t)
,a -(t-1)Q y y-t
so that (y - t)/y(t - 1) is a root of (7.4.1). Then
, 2(h - k)(y - t)y y(y - 1)
Y/
t(t - 1) + t(t - 1) '

h
_ 2y-y2-t) (2 ht(t-1 y' by e2(y-1)
- m2(y --1)1) + n2
t(t - 1)(y h - k2y(y - t) t(y - t) y2(t - 1)

(y - t)2 t(y - 1)
On eliminating It, these come down to the sixth Painleve equation (Pvi), with
a=2k2+2,0=-2e2,ry=2n2,6=2m2+2
The nilpotent cases
We can deal with the nilpotent cases $P_{III}$ and $P_{VI}$ without further work. We pick one of the roots of the quadratic (7.4.1), define $y$ in terms of this in the same way as in the corresponding semi-simple case, and put $k = 0$ in the equation satisfied by $y$. When $P$ has the standard nilpotent form, the quadratic is
\[ 0 = \det([P, sQ - R]) = -(s\nu - \tau)^2, \]
which has coincident roots. In the case of $P_{III}$, we deduce that $y = \tau/t\nu$ satisfies the third Painlevé equation, with $\delta = 0$. In the case of $P_{VI}$, $y = t\nu/(\nu + \tau - t\tau)$ satisfies the sixth Painlevé equation with $\alpha = \tfrac12$.
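The other three cases are less straightforward because the parameters in the Painlevé equations become singular when $k \to 0$. We have to return to the reduced equations: with $P$ in the standard nilpotent form, these are given in Table 7.7.

Table 7.7 The reduced equations in the nilpotent cases

$P_{I,II}$: $\lambda' = -\tau$, $\mu' = 2\rho$, $\nu' = 0$; $\rho' = \nu\sigma - \mu\tau + \nu t$, $\sigma' = 2(\rho\mu - \lambda\sigma) - 2\lambda t$, $\tau' = 2(\lambda\tau - \nu\rho)$; $\ell = \nu$, $m = \tau + \lambda^2 + \mu\nu$, $n = \mu\tau + \sigma\nu + 2\rho\lambda$

$P_{IV}$: $\lambda' = \tau + t\nu$, $\mu' = -2\rho - 2t\lambda$, $\nu' = 0$; $\rho' = \mu\tau - \nu\sigma$, $\sigma' = 2(\lambda\sigma - \rho\mu)$, $\tau' = 2(\nu\rho - \lambda\tau)$; $\ell = \nu$, $m^2 = \rho^2 + \sigma\tau$, $n = \tau + \lambda^2 + \mu\nu$

$P_{V}$: $\lambda' = \tau$, $\mu' = -2\rho$, $\nu' = 0$; $t\rho' = -t\tau + \nu\sigma - \mu\tau$, $t\sigma' = 2t\rho + 2(\mu\rho - \lambda\sigma)$, $t\tau' = 2(\lambda\tau - \nu\rho)$; $\ell = \nu$, $m^2 = \rho^2 + \sigma\tau$, $n^2 = (\rho + \lambda)^2 + (\mu + \sigma)(\nu + \tau)$

$P_{I,II}$. Here we put $y = \tau/\nu$, which is again a root of (7.4.1). On elimination of the other dependent variables, we obtain
\[ \ell y'' = 2(\tfrac13m^2 - \ell n) - 6(\ell y - \tfrac13m)^2 - 2\ell^2t. \tag{7.4.2} \]
By making affine transformations of $y$ and $t$, (7.4.2) reduces to the first Painlevé equation ($P_I$).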

$P_{IV}$. Here the equations come down to
\[ y'' = \frac{y'^2}{2y} + 2ny - 2\ell ty - 4\ell y^2 - \frac{2m^2}{y}, \]

where again y = -r/v. This is equivalent to PII with a = 2m - 2 by the trans-

formation y t t, where 6
y = -(2f)-1(9Z + yz + zt), t=(2e)i(t-a-In).
Pv. Here we have,
t (t7-,)' -- et2Ti2 - 2m2 (r + e)e + 2(7- + t ft + 2en2
r 2r2(r+t) T2 ) T +e'
which is equivalent to the fifth Painleve equation on y = r/(r+t), with a = 2n2,
,Q = -2m2, -y = 2v, 6 = 0. Fokas and Ablowitz (1982) show that when 6 = 0,
Pv can be reduced to PIII

Additional symmetries
The reductions of the ASDYM equations by the Painlevé groups reveal symmetries that are not obvious in the Painlevé equations themselves. For example, $P_{III}$ is transformed to itself by $X \to Y$, $Y \to X$, $Z \to -Z$, and $P_{VI}$ is symmetric under any permutation of $X$, $Y$, $Z$, and $-X - Y - Z$.
Reductions to the Painlevé equations
We can also see in the construction some of the reductions of two-dimensional integrable systems to the Painlevé equations. For example, in the case $P_{I,II}$, the vector fields $X$ and $Y$ generate $H_{+0}$. It follows that the KdV equation has a reduction to $P_I$ (the nilpotent case) and that the NLS equation has a reduction to $P_{II}$ (the semi-simple case). Another example is the Ernst equation, which is obtained by reducing the ASDYM equation by either of the subalgebras
\[ \begin{pmatrix} a & 0 & b & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} a & 0 & 0 & 0 \\ 0 & -b & 0 & 0 \\ 0 & 0 & a - b & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]
of the conformal Lie algebra, identified with the quotient of gl(4, C) by the multiples of the identity. The first is conjugate to the subalgebra spanned by the generators $X + Y$ and $Z$ of the Painlevé group $P_{III}$; the second generates a subgroup of the Painlevé group $P_{VI}$. Consequently the Ernst equation has reductions both to $P_{III}$ and to $P_{VI}$.$^7$
7.5 NON-ABELIAN REDUCTIONS
One can also reduce the ASDYM condition to a system of ordinary differential
equations by imposing invariance under a non-Abelian subgroup of the confor-
mal group. There are many possibilities, and we shall not attempt to list them.
Instead, we shall just consider two obvious examples, which illustrate the reduc-
tion technique that we introduced in Chapter 4. These are the groups of left
and right rotations, which act transitively on hypersurfaces in space-time (in
the Euclidean case, the orbits are 3-spheres). In the complex, both groups are
isomorphic to SL(2, C ); see §2.4.
Example 7.5.1 The left rotations act on /complex space-time by
Cw A1 w z)
z)
where A E SL(2, C ). The action is generated by the three Killing vectors
X =28, +w8Z, Y=za,,,+woZ, Z=wB,,,-w8w-z8Z+z8Z,
which have the standard Lie brackets 8
[X, Y] = Z, [Y, Z] = 2Y, [Z, X] = 2X.
The first step is to choose a transversal S to the orbits. We take
S= {w=tu=0, z=z}
and we choose the gauge so that Dls = 0. We let t denote the parameter on S
defined by t = z = z and, as usual, we denote the Higgs fields of X, Y, and Z
by P, Q, and R. Evaluated on S, these are functions of t.
The next step is to calculate the curvature components at points of S. By
using (4.5.5),

2F(X, Y) + tF(T, Z) = -t2Fwti, + t2F 5 = 2tR' + 2R + 2[P, Q],

F(X, Z + tT) = t2Fb, = -2tP' - 2P + [P, R],
F(Y, Z - tT) _ -t2FFZ = 2tQ' + 2Q + [Q, RI
on $S$, where $T = \partial_z + \partial_{\tilde z}$ and the prime denotes differentiation with respect to $t$. Hence the reduced equations are
\[ (tR)' + [P, Q] = 0, \qquad 2(tP)' + [R, P] = 0, \qquad 2(tQ)' + [Q, R] = 0, \]
which are equivalent to Nahm's equations.$^9$
Example 7.5.2 The calculation for the right rotations is very similar. This
time, the generators are
X = -zaw - wali Y =-za,;,-wai, Z=waw+za- waw - zai.
With the same choice of S, we arrive at an equivalent reduced system.

NOTES ON CHAPTER 7
1. There are many examples in the literature of reductions of integrable equations to one
or other of the Painlevé equations. Ablowitz and Clarkson (1991) give a list, and note
that in many cases the integrable equations are themselves reductions of the ASDYM
equation. They remark that representatives of all six Painlevé families of ODEs appear
in this way as two-stage reductions of the ASDYM equation, although it is not true
in every case in their list that the end result is the most general form of the Painlevé
equation. The results on the Painlevé equations in this chapter go beyond the analysis
of Ablowitz and Clarkson because (i) they show that the most general form of the
Painlevé equations can be obtained in each case, and (ii) they establish the essential
equivalence of the Painlevé equations with the SL(2, C) reductions by the Painlevé
groups. They first appeared in Mason and Woodhouse (1993).
2. This example is taken from Ward (1985) who gives a general algebraic method for
obtaining systems of ODEs from Nahm's equations and conjectures that all systems
obtained by his method are integrable.
3. The various choices are discussed by Bobenko et al. (1989), from whom we have
taken our examples. They express the reduced equations in the Lax form [L, 0] = 0,
where
L=at-(Q, 0 = (L + M = R + (P - (2Q
(their M is our L, and their L is our C-10). The interpretation of their construction
in terms of the ASDYM equation was suggested by Chakravarty et al. (1992).
4. We draw here on the calculations in the appendix to Jimbo and Miwa (1981), which
gives details of the way in which the Painlevé equations arise from the different cases
of the isomonodromy problem.
5. In fact, it is simpler in cases $P_{\mathrm{III}}$ and $P_{\mathrm{V}}$ to begin the derivation of the reduced
equations by first replacing the subgroup in Table 7.2 by a conjugate one, as in Mason
and Woodhouse (1993).
6. See Ince (1956), eqn XXXIV, p. 340. We are grateful to Peter Clarkson for this
observation.
7. Chandrasekhar (1986), Persides and Xanthopoulos (1988), Léauté and Marcilhacy
(1979). G. Calvert points out to us that, in the reductions to $P_{\mathrm{V}}$ in these papers,
the values of the parameters are such that the Painlevé equation can be transformed
to $P_{\mathrm{III}}$. Calvert has also derived reductions to $P_{\mathrm{VI}}$, as has Cosgrove (1977), although
Cosgrove did not point out until later that the ODE to which Einstein's equations
reduce is in fact $P_{\mathrm{VI}}$.
8. In the Euclidean case, we can represent four-dimensional space as the Cartesian
product of the w and z complex planes. Then iZ generates $w \mapsto e^{i\theta}w$, $z \mapsto e^{-i\theta}z$. That
is, it generates rotations in opposite senses in the two planes.
9. These equations are used in Kronheimer (1990a, b) to study the nilpotent variety
in the complexified Lie algebra and complex coadjoint orbits, and to introduce
hyper-Kähler structures thereon.
8
Hierarchies

So far, our investigation of integrability and self-duality has concentrated on

the connection between Lax pairs and reductions of the ASDYM linear system.
In this chapter, we shall look at some other fundamental features of integrable
equations and consider the extent to which they reflect the geometry of the
underlying ASD equations.
In Chapter 3 we saw that the ASDYM equation has two Lagrangians, one for
the J-matrix form of the equation, and the other for the K-matrix form. These
give rise to two symplectic forms on the solution space and two Hamiltonians for
each translation of space-time. We shall show in this chapter that the two sym-
plectic forms are compatible in the sense that they determine a bi-Hamiltonian
structure (see Appendix C), and hence a recursion operator and hierarchies of
commuting flows. We shall look in detail at these structures and at the way in
which they are inherited by the various reduced equations. 1
8.1 THE KdV FLOWS
Three key properties of integrable partial differential equations are that
(a) the time-evolution is one of an infinite hierarchy of commuting flows;
(b) there are an infinite number of conserved quantities, constant along all the
flows in the hierarchy; and
(c) the evolution is Hamiltonian with respect to more than one Poisson struc-
ture.
A central example is the infinite sequence of flows on the solution space of the
KdV equation
$$4u_t - u_{xxx} - 6uu_x = 0.$$
Their action on a solution u embeds it in a family of new solutions, labelled by
the parameters $t_1, t_2, t_3, \ldots$. The first two flows are simply the space and time
translations, $u(x, t) \mapsto u(x + t_1, t + t_2)$, which map solutions to solutions for any
constant $t_1$ and $t_2$. But the higher flows, with parameters $t_3, t_4, \ldots$, generate
new solutions in a less trivial way.
The tangents to the flows at u form a sequence of solutions to the linearized
KdV equation
$$4v_t - v_{xxx} - 6uv_x - 6u_xv = 0,$$
of which the first few terms are
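$$v_1 = u_x,$$
$$v_2 = \tfrac32 uu_x + \tfrac14 u_{xxx},$$
$$v_3 = \tfrac{15}{8}u^2u_x + \tfrac58 uu_{xxx} + \tfrac54 u_xu_{xx} + \tfrac1{16}u_{xxxxx},$$
$$v_4 = \tfrac{35}{16}u^3u_x + \tfrac{35}{32}u^2u_{xxx} + \tfrac{35}{8}uu_xu_{xx} + \tfrac{35}{32}u_x^3 + \tfrac{7}{32}uu_{xxxxx} + \tfrac{35}{32}u_{xx}u_{xxx} + \tfrac{21}{32}u_xu_{xxxx} + \tfrac1{64}u_{xxxxxxx}.$$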
We interpret these as vector fields on the space of solutions to the KdV equation.
They generate the flows in the sense that we can recover the dependence of u on
the $t_1, t_2, \ldots$ by solving successively the equations of the KdV hierarchy,
$$\partial_{t_1}u = v_1, \quad \partial_{t_2}u = v_2, \quad \partial_{t_3}u = v_3, \quad \ldots$$
for $u(x + t_1, t + t_2, t_3, t_4, \ldots)$; the equations are mutually consistent because the
flows commute. 2
By fixing a value of t, we can identify the solution space with a space V
of functions of a single variable x. Under suitable decay conditions at infinity,
V has two Poisson structures $\{\cdot,\cdot\}$ and $\{\cdot,\cdot\}'$ determined by the two Poisson
operators
$$L_u = \partial_x \quad\text{and}\quad M_u = \tfrac14\partial_x^3 + u\,\partial_x + \tfrac12 u_x,$$
with respect to the inner product
$$g(v, v') = \int_{-\infty}^{\infty} vv'\,dx.$$

Although the inner product and the identification with V depend on t, the
resulting Poisson brackets do not: they are natural structures on the solution
space. They are compatible in the sense that $\alpha\{\cdot,\cdot\} + \beta\{\cdot,\cdot\}'$ is also a Poisson
structure for any constants $\alpha$, $\beta$ (see Appendix C). The vector fields $v_i$ are Hamiltonian
with respect to both; they are related by $v_{i+1} = Rv_i$, where R is the
recursion operator
$$R = M_u \circ L_u^{-1} = u + \tfrac14\partial_x^2 + \tfrac12 u_x\partial_x^{-1},$$
and they are generated by the Hamiltonians
$$h_1 = \int_{-\infty}^{\infty} \tfrac12 u^2\,dx,$$
$$h_2 = \int_{-\infty}^{\infty} \tfrac18\,(2u^3 - u_x^2)\,dx,$$
$$h_3 = \int_{-\infty}^{\infty} \tfrac1{32}\,(5u^4 - 10uu_x^2 + u_{xx}^2)\,dx,$$
$$h_4 = \int_{-\infty}^{\infty} \tfrac1{128}\,(14u^5 - 70u^2u_x^2 + 14uu_{xx}^2 - u_{xxx}^2)\,dx,$$
and so on. With respect to the first Poisson structure, $h_i$ generates $v_i$; with
respect to the second, $h_i$ generates $v_{i+1}$.
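The statement that $h_i$ generates $v_i$ through the first Poisson operator and $v_{i+1}$ through the second can be checked with exact arithmetic. The following is a minimal sketch, not the authors' method: it assumes the operators are d/dx and (1/4)d^3/dx^3 + u d/dx + (1/2)u_x and that the densities of h_1, h_2, h_3 are u^2/2, (2u^3 - u_x^2)/8 and (5u^4 - 10uu_x^2 + u_xx^2)/32. The small differential-polynomial helpers (`add`, `mul`, `dx`, `euler`, ...) are hypothetical names written for this illustration.

```python
from fractions import Fraction as F

# A differential polynomial in u, u_x, u_xx, ... is a dict {monomial: coefficient}.
# A monomial is a tuple of exponents (e0, e1, ...) meaning u^e0 * u_x^e1 * ...,
# stored without trailing zeros so that equal monomials have equal keys.


def trim(m):
    m = list(m)
    while m and m[-1] == 0:
        m.pop()
    return tuple(m)


def add(p, q, c=1):
    """Return p + c*q, dropping zero coefficients."""
    r = dict(p)
    for m, a in q.items():
        r[m] = r.get(m, 0) + c * a
        if r[m] == 0:
            del r[m]
    return r


def mul(p, q):
    r = {}
    for m1, a in p.items():
        for m2, b in q.items():
            n = max(len(m1), len(m2))
            m = trim([(m1 + (0,) * n)[i] + (m2 + (0,) * n)[i] for i in range(n)])
            r = add(r, {m: a * b})
    return r


def u(k=0, power=1, c=1):
    """The monomial c * (k-th x-derivative of u)**power."""
    return {(0,) * k + (power,): F(c)}


def partial(p, k):
    """Partial derivative with respect to the jet variable u_k."""
    r = {}
    for m, a in p.items():
        if k < len(m) and m[k]:
            n = list(m)
            n[k] -= 1
            r = add(r, {trim(n): a * m[k]})
    return r


def order(p):
    return max((len(m) for m in p), default=0)


def dx(p, times=1):
    """Total x-derivative, applied `times` times."""
    for _ in range(times):
        q = {}
        for k in range(order(p)):
            q = add(q, mul(partial(p, k), u(k + 1)))
        p = q
    return p


def euler(h):
    """Variational derivative: sum_k (-1)^k D_x^k (dh/du_k)."""
    r = {}
    for k in range(order(h)):
        r = add(r, dx(partial(h, k), k), (-1) ** k)
    return r


def L(f):  # first Poisson operator (assumed): d/dx
    return dx(f)


def M(f):  # second Poisson operator (assumed): (1/4) d^3 + u d + (1/2) u_x
    r = add({}, dx(f, 3), F(1, 4))
    r = add(r, mul(u(0), dx(f)))
    return add(r, mul(u(1), f), F(1, 2))


h1 = u(0, 2, F(1, 2))
h2 = add(u(0, 3, F(1, 4)), u(1, 2, F(1, 8)), -1)
h3 = add(add(u(0, 4, F(5, 32)), mul(u(0), u(1, 2)), F(-10, 32)), u(2, 2, F(1, 32)))

v1, v2, v3 = (L(euler(h)) for h in (h1, h2, h3))
v2_expected = add(mul(u(0, 1, F(3, 2)), u(1)), u(3, 1, F(1, 4)))

print(v1 == u(1), v2 == v2_expected)
print(v2 == M(euler(h1)), v3 == M(euler(h2)))
```

Working with exponent tuples and `Fraction` coefficients keeps every comparison exact, so an equality such as `v3 == M(euler(h2))` is a genuine identity of differential polynomials rather than a numerical coincidence.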

There are a number of ways to derive the sequence of vis and his, which
differ in the extent to which they generalize to other integrable systems. The
one we have just considered, which uses the bi-Hamiltonian structure of the KdV
system, is due to Magri (1978); see also Magri (1980), Gel'fand and Dikii (1977),
and Olver (1986). Another is to regard u as an element of the dual Lie algebra of
the Virasoro algebra or the loop algebra of SL(2, R) (Segal 1991). In this formulation
the conserved quantities are the coefficients of the characteristic polynomial
of the Schrödinger operator of the KdV Lax pair. A third straightforward,
but specialized, method predates these. It exploits the Miura-Gardner-Kruskal
transformation (Miura et al. 1968); see also Ablowitz and Clarkson (1991, p. 23).
It goes as follows. If we put
$$u = 2w + \lambda^{-1}w_x - \lambda^{-2}w^2, \qquad (8.1.1)$$
where w depends on x, t, and an additional parameter $\lambda$, then
$$4u_t - u_{xxx} - 6uu_x = (2 + \lambda^{-1}\partial_x - 2\lambda^{-2}w)\,(4w_t + 6w_x(\lambda^{-2}w^2 - 2w) - w_{xxx}).$$
Therefore u satisfies the KdV equation whenever
$$4w_t + 6w_x(\lambda^{-2}w^2 - 2w) - w_{xxx} = 0$$
for all $\lambda$. It follows from this evolution equation that
$$\int_{-\infty}^{\infty} w\,dx$$
is independent of t. By expanding the integral in powers of $\lambda^{-1}$, one obtains
a sequence of conserved quantities for the KdV equation. The odd coefficients
vanish, while the coefficient of $\lambda^{-2k}$ is a constant multiple of $h_k$.
This gives a recursive procedure for finding the Hamiltonians. We substitute a
formal power series $w = \sum_j \lambda^{-j}w_j$ into (8.1.1), and solve successively for the $w_j$s
by equating coefficients. The first few are
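$$w_0 = \tfrac12 u,$$
$$w_1 = -\tfrac14 u_x,$$
$$w_2 = \tfrac18\,(u_{xx} + u^2),$$
$$w_3 = -\tfrac1{16}\,(u_{xxx} + 4uu_x),$$
$$w_4 = \tfrac1{32}\,(u_{xxxx} + 6uu_{xx} + 5u_x^2 + 2u^3),$$
$$w_5 = -\tfrac1{64}\,(u_{xxxxx} + 18u_xu_{xx} + 16u^2u_x + 8uu_{xxx}),$$
$$w_6 = \tfrac1{128}\,(u_{xxxxxx} + 19u_{xx}^2 + 28u_xu_{xxx} + 50uu_x^2 + 30u^2u_{xx} + 10uu_{xxxx} + 5u^4).$$

The odd ws are derivatives and integrate to zero; while, for example, $h_3$ is a
multiple of
$$\int_{-\infty}^{\infty} w_6\,dx = \frac1{128}\int_{-\infty}^{\infty} (5u^4 - 10uu_x^2 + u_{xx}^2)\,dx.$$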
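The two steps used in this third method can be written out as follows. This is a sketch under the assumption that (8.1.1) reads $u = 2w + \lambda^{-1}w_x - \lambda^{-2}w^2$ and that the series for w starts at $j = 0$; the closed recursion formula is our own rearrangement of "equating coefficients".

```latex
% The evolution equation for w is in conservation form,
4w_t = \partial_x\!\left(w_{xx} + 6w^2 - 2\lambda^{-2}w^3\right),
% so \int w\,dx is constant when w and its derivatives decay at infinity.
% Substituting w = \sum_{j \ge 0} \lambda^{-j} w_j into (8.1.1) and comparing powers of \lambda^{-1}:
w_0 = \tfrac12 u, \qquad w_1 = -\tfrac12\,\partial_x w_0, \qquad
w_j = \tfrac12\Big(\sum_{k=0}^{j-2} w_k\,w_{j-2-k} - \partial_x w_{j-1}\Big) \quad (j \ge 2).
% Example: w_2 = \tfrac12\left(w_0^2 - \partial_x w_1\right) = \tfrac18\left(u^2 + u_{xx}\right).
```

Integrating by parts, the density $w_6$ differs from $\tfrac1{128}(5u^4 - 10uu_x^2 + u_{xx}^2)$ only by an exact derivative, which is how the multiple of $h_3$ arises.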
A fourth, more generally applicable, method is the construction of Drinfeld and
Sokolov, which we outline in Appendix B, and consider from a twistor point of
view in §12.3.

8.2 THE RECURSION OPERATOR FOR THE ASDYM EQUATION

In the remainder of this chapter, we shall look at another way of understanding
the origin of the flows and the symplectic structures. We shall show that they
are inherited from analogous structures on the solution space of the ASDYM
equation. In Chapter 12, we shall see that these, in turn, arise in a very natural
way from the underlying twistor geometry.
We first construct the ASDYM recursion operator, and then, in the next sec-
tion, we show that it connects the two Hamiltonian formulations of the ASDYM
equation arising from the J and K-matrix Lagrangians.
The recursion operator
In §3.5, we defined the `solution space' of the ASDYM equation by $M = C/\mathcal{G}$,
where C is the set of ASD connections on a fixed vector bundle $E \to U \subset \mathbb{CM}$,
and $\mathcal{G}$ is the group of active gauge transformations (see Appendix A). Our aim
is to construct a recursion operator on the tangent space to M which, like
the recursion operator of the KdV equation, generates an infinite hierarchy of
commuting vector fields on M from the `seed flows' given by translations in
space-time.
We explained in §3.5 how a solution $\phi$ to the background-coupled wave equation
determines linearized solutions $\delta J = J\phi$ and $\delta K = \phi$ to Yang's equation
and the K-matrix equation, and hence two, generally distinct, solutions to the
linearized ASDYM equation. In a general gauge, they are, respectively,
$$\Psi = D_{\tilde w}\phi\,d\tilde w + D_{\tilde z}\phi\,d\tilde z \quad\text{and}\quad \Psi' = D_z\phi\,d\tilde w + D_w\phi\,d\tilde z.$$
This gives us a way to generate new linearized solutions from old ones: given
one linearized solution $\Psi$ such that $\Psi_w = \Psi_z = 0$, we solve the first of these
equations for $\phi$, and then substitute this $\phi$ into the second equation to find a
new perturbation of the connection. If we then put the new perturbation on the
left-hand side of the first equation, and solve again, then we get a new solution $\phi'$
to eqn (3.5.5), which is related to $\phi$ by
$$D_{\tilde w}\phi' = D_z\phi, \quad D_{\tilde z}\phi' = D_w\phi.$$
Since D is ASD, the integrability condition for the existence of $\phi'$, given $\phi$, is
(3.5.5), and any $\phi'$ defined in this way necessarily satisfies (3.5.5). We call the
map
$$R: \phi \mapsto \phi'$$
the recursion operator. Since any tangent to M is gauge-equivalent to one such
that $\Psi_w = \Psi_z = 0$, we can think of R as a linear operator in the tangent spaces
to M, except that in the absence of boundary conditions at infinity, or some
other restriction, $R(\phi)$ is not unique because of the ambiguity in the inversion of
$D_{\tilde w}$ and $D_{\tilde z}$. Also, its definition depends on the choice of double null coordinates,
although we get the same operator if we make a null coordinate transformation
that preserves the two 2-forms
$$\alpha = dw \wedge dz \quad\text{and}\quad \tilde\alpha = d\tilde w \wedge d\tilde z \qquad (8.2.1)$$
up to the same scalar factor. Translations, left rotations, and dilatations are
examples. We shall show that under reduction, R turns into the recursion operators
of various integrable equations.
We summarize this in the following diagrams, in which $\mathcal{J}$ and $\mathcal{K}$ denote
the respective solution spaces to (3.3.2) and (3.3.7): they give the following
relationships between the various equations, and their linearizations:

    J  →  C  ←  K            T_J J  →  T_D C  ←  T_K K
          ↓                              ↓
          M                          T_[D] M

The recursion operator is the composite
$$T_{[D]}M \to T_J\mathcal{J} \to T_K\mathcal{K} \to T_{[D]}M,$$
although the first map here, which is the inverse of $T_J\mathcal{J} \to T_{[D]}M$, is not well
defined without the imposition of boundary conditions.
The recursion relations
By iterating R, we generate an infinite sequence $\phi_0, \phi_1, \ldots$ of solutions to the
background-coupled wave equation from a given initial solution $\phi_0$, and hence
a sequence of solutions to the linearized ASDYM equation. The $\phi_i$s satisfy the
recursion relations
$$D_{\tilde w}\phi_{i+1} = D_z\phi_i, \quad D_{\tilde z}\phi_{i+1} = D_w\phi_i \qquad (8.2.2)$$
and the corresponding solutions to (3.5.1) are given by
$$\Psi_i = D_{\tilde w}\phi_i\,d\tilde w + D_{\tilde z}\phi_i\,d\tilde z.$$
If we introduce the formal power series $\psi = \sum_{i=0}^{\infty}\zeta^{-i}\phi_i$, then the $\phi_i$s are determined
by
$$D_w\psi - \zeta D_{\tilde z}(\psi - \phi_0) = 0, \quad D_z\psi - \zeta D_{\tilde w}(\psi - \phi_0) = 0.$$
If the summation is extended to minus infinity, with the same recursion relations,
then we obtain a solution to the Lax pair, acting on sections of adj(E).
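The passage from the recursion relations to the equations for the generating function is a one-line resummation. The sketch below assumes that the recursion relations read $D_{\tilde w}\phi_{i+1} = D_z\phi_i$, $D_{\tilde z}\phi_{i+1} = D_w\phi_i$ and that $\psi = \sum_{i \ge 0}\zeta^{-i}\phi_i$.

```latex
% Insert D_w\phi_i = D_{\tilde z}\phi_{i+1} term by term and shift the summation index:
D_w\psi = \sum_{i \ge 0} \zeta^{-i} D_{\tilde z}\phi_{i+1}
        = \zeta \sum_{i \ge 1} \zeta^{-i} D_{\tilde z}\phi_i
        = \zeta\, D_{\tilde z}(\psi - \phi_0),
% and in the same way D_z\psi = \zeta\, D_{\tilde w}(\psi - \phi_0).
% If the sum runs over all integers i, the \phi_0 terms drop out:
(D_w - \zeta D_{\tilde z})\psi = 0, \qquad (D_z - \zeta D_{\tilde w})\psi = 0 .
```

The two operators in the last line are the Lax pair in its adjoint action, which is why the doubly infinite sum gives a solution of the linear system.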
8.3 HAMILTONIAN FORMALISM
We now turn to the bi-Hamiltonian interpretation of the recursion operator. We
showed in §3.5 that the solutions to the ASDYM equation have two symplectic
structures, generated by the Lagrangians for Yang's equation and the K-matrix
equation. When we identify the tangent spaces to $\mathcal{J}$ and $\mathcal{K}$ with the space $W_D$
of solutions to the background-coupled wave equation, both coincide with the
natural bilinear form on $W_D$ defined by eqn (3.5.7). We shall show here that when
we transfer the two symplectic forms back to M, they are related by the recursion
operator, and that further applications of R generate an infinite sequence of
symplectic forms, each given by an integral over a hypersurface in space-time.
When we impose symmetry, the integrands reduce to those for the symplectic
structures of the reduced system. To a certain extent these considerations are
formal since the identification of a tangent vector to M with an element $\phi$ of $W_D$
involves a free choice of a pair of free functions of two variables. Also we work in
the complex, without specifying the hypersurface, although the definitions are
formally independent of the choice. The results can be made rigorous, however,
by fixing the identification, either by choosing appropriate boundary conditions
or by imposing at least two symmetries.
Recursion in $W_D$
The basic result that underlies the bi-Hamiltonian theory is the following proposition,
in which R is regarded as a linear map on $W_D$ and $\Omega$ is defined by eqn
(3.5.7).
Proposition 8.3.1 Let $\phi, \phi' \in W_D$. Then $\Omega(R\phi, \phi') = \Omega(\phi, R\phi')$.
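Proof In the notation of §3.5,
$$\tilde\partial_D(R\phi) \wedge \omega = \partial_D\phi \wedge \tilde\alpha, \quad \partial_D\phi \wedge \omega = -\tilde\partial_D(R\phi) \wedge \alpha,$$
where $\alpha$, $\tilde\alpha$ and $\omega$ are the ASD forms defined by (2.3.1). Since $D = \partial_D + \tilde\partial_D$,
$$\int \mathrm{tr}(\phi\,\tilde\partial_D\phi' + \phi\,\partial_D\phi' + \phi'\,\tilde\partial_D\phi + \phi'\,\partial_D\phi) \wedge \omega = \int d(\mathrm{tr}(\phi\phi')\,\omega) = 0.$$
Therefore, we have
$$\Omega(\phi, \phi') = \int \mathrm{tr}(\phi\,\partial_D\phi' + \phi'\,\tilde\partial_D\phi) \wedge \omega,$$
by using (3.5.6). This use of Stokes' theorem is one formal element in the proof.
Hence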
S2(RO, 0') =

f A

f Aa- A a)

a second formal application of Stokes' theorem. □

It follows that the 2-forms on $W_D$ defined by
$$\Omega_k(\phi, \phi') = \Omega(R^k\phi, \phi'), \qquad (8.3.1)$$
for k positive or negative, are skew-symmetric.


The ASDYM equation

A solution to the linearized ASDYM equation can be represented by
$$\Psi = D_{\tilde w}\phi\,d\tilde w + D_{\tilde z}\phi\,d\tilde z,$$
where $\phi$ satisfies (3.5.5). We can use this to transfer the forms $\Omega_k$ to the tangent
spaces to M, but only in a formal sense since (i) $\Psi$ is determined by the
perturbation of the connection only up to the addition of $Df$, where
$$D_wf = D_zf = 0;$$
and (ii) $\phi$ is determined by $\Psi$ only up to the addition of $\tilde f$ such that
$$D_{\tilde w}\tilde f = D_{\tilde z}\tilde f = 0.$$
Thus $\phi$ is determined by the perturbation only up to the addition of $f + \tilde f$.
There are further choices in applying the recursion operator and thus in defining
$\Omega_k$ from $\Omega_0$. Only under special conditions, that is, under suitable boundary
conditions or when we reduce the system by at least two symmetries, will any
of the forms on M be independent of these choices. An exception is the form
$\Omega_{-1} = \Omega_0(R^{-1}\,\cdot\,, \cdot)$. By using the calculation in the proof of Proposition (8.3.1),
we find that
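$$\Omega_{-1}(\Psi, \Psi') = \int \mathrm{tr}(\phi'_{-1}\,\partial_D\phi_{-1} - \phi\,\tilde\partial_D\phi') \wedge \omega$$
where $\phi_{-1} = R^{-1}\phi$, $\phi'_{-1} = R^{-1}\phi'$. Given $\Psi$ and $\Psi'$, the right-hand side is
independent of the choices made for $\phi$, $\phi'$, $\phi_{-1}$, $\phi'_{-1}$, since it is unchanged by
the addition of $\tilde f$ to $\phi$ or to $\phi'$, and of $f$ to $\phi_{-1}$ or to $\phi'_{-1}$, where $\partial_Df = 0$ and
$\tilde\partial_D\tilde f = 0$. Moreover, if $\partial_D\phi' = 0$, then a formal application of Stokes' theorem
reduces the right-hand side to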
2 tr(OD(R-10) Aw) = 2 Jtr(D(R2)) Aa,

J
which vanishes after another application of Stokes' theorem. Thus if $\Psi'$ is an
infinitesimal gauge transformation, then it annihilates $\Omega_{-1}$. So under boundary
conditions that permit the applications of Stokes' theorem, $\Omega_{-1}$ is well defined
on M.

Reduction
If $\phi_1$ and $\phi_2$ are invariant under a one- or two-dimensional group H of translations,
then we can construct a form on the quotient space of space-time by
contracting
$$\mathrm{tr}(\phi_1\,{*D\phi_2} - \phi_2\,{*D\phi_1})$$
with a basis of generators of H. The integral of this over a surface or curve in
the quotient gives rise to a closed 2-form on the solution space of the reduced
equations.

8.4 ASDYM AND BOGOMOLNY HIERARCHIES

One way to generate solutions to (3.5.1) is to exploit the invariance of the
ASDYM equation under conformal transformations. Given a conformal Killing
vector Y and an ASD connection $D = d + \Phi$ on a bundle E, we can construct a
one-parameter family of ASD connections by dragging $\Phi$ along the flow of Y in
space-time. We put
$$\Psi_0 = \partial_{t_0}\Phi = \mathcal{L}_Y\Phi, \qquad (8.4.1)$$
where $t_0$ is the parameter and $\mathcal{L}$ is the ordinary Lie derivative on 1-forms, taken
entry-by-entry. Then $\Psi_0$ necessarily satisfies (3.5.1).
Implicit in this is a choice of lift of the flow of Y to E: if we start with a
different lift, then we shall obtain a different, but gauge-equivalent, linearized
solution. A lift is determined by the matrix-valued function $\theta_Y$ (see §4.2), and
the linearized solution generated by Y is given by (8.4.1) only in an invariant
gauge, in which $\theta_Y = 0$. In a general gauge,
$$\Psi_0 = \mathcal{L}_Y\Phi + [\theta_Y, \Phi] - d\theta_Y.$$
Under gauge transformations $\theta_Y \mapsto g^{-1}Y(g) + g^{-1}\theta_Yg$, and $\Psi_0$ transforms by
$$\Psi_0 \mapsto g^{-1}\Psi_0\,g,$$
which is the transformation law for a 1-form with values in adj(E).
Commuting flows
A flow on M is generated by a vector field; that is, by a map that assigns
an equivalence class of solutions to the linearized ASDYM equation to each
$D \in C$. We can choose a representative in the class at each point of C such that
$\Psi_w = \Psi_z = 0$; then
$$\Psi = D_{\tilde w}\phi\,d\tilde w + D_{\tilde z}\phi\,d\tilde z$$
for some solution $\phi$ to the background-coupled wave equation. By iterating the
recursion operator, we construct an infinite sequence of such vector fields from
the given flow. The sequence is not unique, because of the ambiguity in the
definition of the recursion operator and the freedom in the choice of $\phi$, but it is
remarkable that if the original flow is generated by translating D along a constant
vector in space-time (with an appropriate choice of lift), then the sequence can
be chosen so that the vector fields on M integrate to commuting flows on M.
This is an important sense in which the ASDYM equation is integrable.
To make this precise in a way that takes account of the fact that the recursion
operator is not well defined on the tangent spaces to M, we work on $\mathcal{J}$ and $\mathcal{K}$
rather than on M. Suppose that J(x, t) and K(x, t) are matrices depending on
$x \in \mathbb{CM}$ and a sequence of parameters $t = (t_0, t_1, \ldots)$. We say that J and K
satisfy the recursion equations if the following hold:
(a) for each t, J satisfies eqn (3.3.2),
(b) for each t, K satisfies (3.3.7),
(c) for each t, $\partial_zK = J^{-1}\partial_{\tilde w}J$, $\partial_wK = J^{-1}\partial_{\tilde z}J$,
(d) $\partial_{i+1}K = J^{-1}\partial_iJ$, where $\partial_i = \partial/\partial t_i$, for $i = 0, 1, 2, \ldots$.

Under these conditions, J and K are potentials for a family of solutions D to the
ASDYM equation, labelled by t. The flows of the $t_i$s commute, and the tangents
to the flows in M are related by the recursion operator. To see this, we note
that the tangents to the flows are given by
$$\phi_i = J^{-1}\partial_iJ = \partial_{i+1}K$$
and the recursion relations arise by taking the derivative of the first equation in
(c) with respect to $t_i$, which yields
$$\partial_z\phi_{i-1} = \partial_z\partial_iK = \partial_i(J^{-1}\partial_{\tilde w}J) = \partial_{\tilde w}\phi_i + [\Phi_{\tilde w}, \phi_i], \qquad (8.4.2)$$
where
$$\Phi = J^{-1}\partial_{\tilde w}J\,d\tilde w + J^{-1}\partial_{\tilde z}J\,d\tilde z, \qquad (8.4.3)$$
which is a potential for D. It follows from this and from a similar calculation for
the other coordinates that
$$D_{\tilde w}\phi_i = D_z\phi_{i-1}, \quad D_{\tilde z}\phi_i = D_w\phi_{i-1},$$
since $D_z = \partial_z$, $D_w = \partial_w$ in this gauge. Hence the $\phi_i$s satisfy the recursion
relations.
Proposition 8.4.1 Let Y be a constant vector field on space-time and let $D^{(0)}$ be
a solution to the ASDYM equation. Then there exists a solution J(x, t), K(x, t)
to the recursion equations such that (i) J(x, 0) is a potential for $D^{(0)}$ and (ii)
$Y(J) = \partial_0J$ for all t.
This asserts the existence of the flows generated by recursion from translation
along Y, and implies that they commute. There is a very simple proof that uses
the Penrose-Ward transform, which we shall give in §12.1. This is also a natural
framework within which to consider the domain in the parameter space on which
J and K are defined.
However, we can also understand the construction of the flows in a more
direct way. If J and K satisfy the recursion equations, then
$$\partial_{j+1}\phi_i - \partial_{i+1}\phi_j = (\partial_{j+1}\partial_{i+1} - \partial_{i+1}\partial_{j+1})K = 0,$$
and
$$\partial_i\phi_j - \partial_j\phi_i + [\phi_i, \phi_j] = J^{-1}(\partial_i\partial_j - \partial_j\partial_i)J = 0$$
for all $i, j \ge 0$. If $\partial_0J = Y(J)$, then these are equivalent to the evolution
equations
$$\partial_i\phi_j = Y(\phi_{i+j}) - \sum_{m=0}^{j}[\phi_{i+j-m}, \phi_m], \qquad (8.4.4)$$
which determine the $\phi_i$s from their values at t = 0. They can also be written in
a compact form by introducing the generating function $\psi = \sum_{i=0}^{\infty}\zeta^{-i}\phi_i$, and by
putting $\psi_k = \sum_{i=0}^{k-1}\zeta^{-i}\phi_i$. Then (8.4.4) is equivalent to
$$\zeta^{-k}\partial_k\psi = Y(\psi - \psi_k) - [\psi, \psi_k] \quad (k = 1, 2, \ldots),$$
which is to be interpreted formally, by equating coefficients of powers of $\zeta$. It is
possible to establish the existence of the commuting flows directly from (8.4.4). 3
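The step from the two commutation identities to the evolution equations (8.4.4) can be spelled out as follows. This is a sketch that assumes $\phi_i = J^{-1}\partial_iJ = \partial_{i+1}K$, that Y is a constant vector field (so that it commutes with every $\partial_k$), and that the upper limit of the sum in (8.4.4) is j.

```latex
% Step 1: the first identity with j lowered by one gives
\partial_j\phi_i = \partial_{i+1}\phi_{j-1} \qquad (j \ge 1).
% Step 2: substitute into the second identity and iterate j times:
\partial_i\phi_j = \partial_{i+1}\phi_{j-1} - [\phi_i, \phi_j]
                 = \partial_{i+j}\phi_0 - \sum_{m=1}^{j} [\phi_{i+j-m}, \phi_m].
% Step 3: the second identity with (i, j) \to (i+j, 0), and \partial_0\phi_k = Y(\phi_k), give
\partial_{i+j}\phi_0 = Y(\phi_{i+j}) - [\phi_{i+j}, \phi_0],
% which supplies the m = 0 term of the sum in (8.4.4).
```

The generating-function form then follows by multiplying by $\zeta^{-(k+j)}$ and summing over j, using the antisymmetry $\sum_{m=0}^{n}[\phi_{n-m}, \phi_m] = 0$ to rewrite the commutator sum as $[\psi, \psi_k]$.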
Remarks. (i) We can think of the $\phi_i$s and the components of $\Phi$ as the dependent
variables in an infinite sequence of nonlinear differential equations with independent
variables $w, z, \tilde w, \tilde z, t_0, t_1, \ldots$. Then the recursion equations are equivalent to
the condition that the operators
$$L = D_w - \zeta D_{\tilde z}, \quad M = D_z - \zeta D_{\tilde w}, \quad \partial_j - \zeta(\partial_{j-1} + \phi_{j-1}),$$
where j = 1, 2, ..., should all commute with each other.

(ii) In a gauge in which $\Phi_w = 0 = \Phi_z$, the components of the potential themselves
satisfy the background-coupled wave equation. If we put $\phi = \Phi_{\tilde w}$, then we have
$$\partial_{\tilde w}\Phi_{\tilde w} = D_{\tilde w}\phi, \quad \partial_{\tilde w}\Phi_{\tilde z} = D_{\tilde z}\phi,$$
by using one of the field equations. It follows that $\phi$ generates translation along
$\partial_{\tilde w}$. Similarly, $\Phi_{\tilde z}$ generates translation along $\partial_{\tilde z}$. These two flows seed the
ASDYM hierarchy.
(iii) Any conformal Killing vector generates a flow on C and so seeds an infinite
hierarchy of flows.
The ASDYM hierarchy
It is natural to consider together the two sequences of flows generated by recursion
from the translations along $\partial_{\tilde w}$ and $\partial_{\tilde z}$, and the two sequences generated by
the inverse of the recursion operator from translation along $\partial_w$ and $\partial_z$. These
have a particularly simple representation in terms of the Penrose-Ward trans-
form. The fact that they commute implies that the ASDYM equation can be
embedded in an infinite system of overdetermined partial differential equations,
in the sense that every solution to the ASDYM equation can be extended to
a simultaneous solution of the infinite system. The equations involve arbitrar-
ily many independent variables, but because they are overdetermined, initial
data can be specified freely only on a 3-surface (this follows from the twistor
correspondence).
We shall not prove here that the flows in the four sequences commute with
each other because it is an immediate consequence of the following proposition,
which itself is a direct consequence of the twistor construction (see §12.1).
Proposition 8.4.2 Let $D^{(0)}$ be a local analytic solution to the ASDYM equation.
Then there exists a family of solutions D, labelled by parameters $x^{Ai}$, $i \in \mathbb{Z}$,
A = 0, 1, with the following properties. If $x^{Ai} = 0$ for all i, A, then $D = D^{(0)}$.
For each solution, there are matrix-valued functions J and K (depending on the
space-time coordinates and the parameters) such that in some gauge
(a) $D = d + J^{-1}\partial_{\tilde w}J\,d\tilde w + J^{-1}\partial_{\tilde z}J\,d\tilde z = d + \partial_zK\,d\tilde w + \partial_wK\,d\tilde z$,
(b) $\partial_{00}J = \partial_{\tilde z}J$, $\partial_{10}J = \partial_{\tilde w}J$,
(c) $\partial_{01}J = \partial_wJ$, $\partial_{11}J = \partial_zJ$,
(d) $\partial_{A,i+1}K = J^{-1}\partial_{Ai}J$, for all i, A,
where $\partial_{Ai} = \partial/\partial x^{Ai}$.

For each A, the flows labelled by consecutive values of i are related by the
recursion operator. For A = 0 and A = 1, the flows for positive i are generated
by the translations along $\partial_{\tilde z}$ and $\partial_{\tilde w}$, respectively, and for negative i, they are
generated by the inverse of the recursion operator from the translations along
$\partial_w$ and $\partial_z$. The flows for A = 0, 1, i = 0, 1 are translations, but the other flows
are generally nontrivial.
It follows from properties (b) and (c) that, for each A, i,
$$\phi_{Ai} = J^{-1}\partial_{Ai}J$$
is constant along the vector fields
$$\partial_{00} - \partial_{\tilde z}, \quad \partial_{10} - \partial_{\tilde w}, \quad \partial_{01} - \partial_w, \quad \partial_{11} - \partial_z$$
on $\mathbb{CM} \times X$, where X is the parameter space. By projecting along these vector
fields onto X, we can represent the $\phi$s as functions of the parameters $x^{Ai}$ alone,
with the dependence on the space-time coordinates recovered by substituting
$\tilde z + x^{00}$ for $x^{00}$, and so on. Interpreted in this way, the dependence of the $\phi$s on
the parameters is determined by the condition that the operators
$$L_{Ai} = \partial_{Ai} - \zeta(\partial_{A,i-1} + \phi_{A,i-1}), \quad A = 0, 1, \ i \in \mathbb{Z},$$
should commute for every value of the spectral parameter $\zeta$, when acting on
column-vector-valued functions on X. The coefficient of $\zeta^0$ in the commutator
$[L_{A,i+1}, L_{B,j+1}]$ necessarily vanishes. The $\zeta$-term is
$$\partial_{B,j+1}\phi_{Ai} - \partial_{A,i+1}\phi_{Bj} = (\partial_{B,j+1}\partial_{A,i+1} - \partial_{A,i+1}\partial_{B,j+1})K = 0,$$
and the $\zeta^2$-term is
$$[\partial_{Ai} + \phi_{Ai}, \partial_{Bj} + \phi_{Bj}] = J^{-1}(\partial_{Ai}\partial_{Bj} - \partial_{Bj}\partial_{Ai})J = 0.$$
Conversely, suppose that we are given $\phi_{Ai}(x)$, $i \in \mathbb{Z}$, A = 0, 1, such that the
operators $L_{Ai}$ commute. We can pull the $\phi$s back to functions on $\mathbb{CM} \times X$, and
obtain a family of connections labelled by the parameters by putting
$$\Phi = \phi_{10}\,d\tilde w + \phi_{00}\,d\tilde z.$$
Then $[L_{01}, L_{11}] = 0$ is the condition that the connections should be ASD, and
the other commutation conditions are equivalent to the recursion relations. The
full set of commutation conditions $[L_{Ai}, L_{Bj}] = 0$ is an infinite system of partial
differential equations for the dependent variables $\phi_{Ai}$, which we call the ASDYM
hierarchy.
We can write the operators in the more general form
$$L_{Ai} = \partial_{Ai} + \Phi_{Ai} - \zeta(\partial_{A,i-1} + \phi_{A,i-1}),$$
and allow simultaneous gauge transformations
$$\Phi_{Ai} \mapsto g^{-1}\partial_{Ai}g + g^{-1}\Phi_{Ai}g, \quad \phi_{Ai} \mapsto g^{-1}\partial_{Ai}g + g^{-1}\phi_{Ai}g.$$
The vanishing of the $\zeta^0$ terms in the commutation conditions implies the existence
of a gauge in which $\Phi_{Ai} = 0$, and in which the hierarchy reduces to its
original form. When we represent the solution by its J-matrix, the equations of
the hierarchy reduce to
$$\partial_{B,j+1}(J^{-1}\partial_{Ai}J) - \partial_{A,i+1}(J^{-1}\partial_{Bj}J) = 0$$
and when we represent it by its K-matrix, they reduce to
$$\partial_{Ai}\partial_{B,j+1}K - \partial_{Bj}\partial_{A,i+1}K + [\partial_{A,i+1}K, \partial_{B,j+1}K] = 0.$$

The Bogomolny hierarchy

Any flow on the solution space of the ASDYM equation generates a hierarchy of
flows by recursion. So far, we have considered the `seed flows' given by translation
in space-time. For solutions that are invariant along a self-dual conformal Killing
vector X, another possibility is to seed the recursion by the corresponding Higgs
field P. If D is invariant then P necessarily satisfies the background-coupled wave
equation, and so generates a solution to the linearized ASDYM equation. By
applying the recursion operator, we generate from P a sequence of flows on the
space of invariant solutions to the ASDYM hierarchy. When X is a translation,
this sequence is called the Bogomolny hierarchy (Mason and Sparling 1989, 1992).
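Consider the case that X is a non-null translation. Then we can choose the
coordinates so that
$$X = \partial_w - \partial_{\tilde w}$$
and we can choose an invariant gauge so that $\Phi_w = \Phi_z = 0$, and so that $\Phi_{\tilde w} = -P$
and $\Phi_{\tilde z}$ are functions of z, $\tilde z$ and $x = w + \tilde w$.
Let us put $\phi_0 = \Phi_{\tilde z}$, $\phi_1 = -P$, and define $\phi_j$ for j > 1 by recursion. We
can choose the $\phi_j$s so that they also depend only on z, $\tilde z$ and x and so that the
corresponding flows preserve the symmetry of D.
By the remark above (p. 120), $\phi_0$ generates the flow along $\partial_{\tilde z}$. Also, from the
ASDYM equation,
$$D_{\tilde z}\phi_1 = \partial_{\tilde z}\Phi_{\tilde w} + [\Phi_{\tilde z}, \Phi_{\tilde w}] = \partial_{\tilde w}\Phi_{\tilde z} = \partial_w\phi_0,$$
so $\phi_1 = R\phi_0$. By writing these in another way,
$$D_{\tilde w}\phi_1 = \partial_{\tilde w}\Phi_{\tilde w}, \quad D_{\tilde z}\phi_1 = \partial_{\tilde w}\Phi_{\tilde z}.$$
Therefore $\phi_1$ generates the flow along $\partial_{\tilde w}$, or, equivalently along $\partial_x$, since the
potential depends on $\tilde w$ only through $x = w + \tilde w$. Finally, the recursion equation
$\phi_2 = R\phi_1$ is equivalent to
$$D_{\tilde w}\phi_2 = \partial_z\Phi_{\tilde w}, \quad D_{\tilde z}\phi_2 = \partial_z\Phi_{\tilde z},$$
since $\Phi_{\tilde w} = \phi_1$, and $\partial_w\Phi_{\tilde w} = \partial_z\Phi_{\tilde z}$ by the ASDYM equation. Therefore, $\phi_2$
generates the flow along $\partial_z$. So the first three flows of the sequence in this case
are the translations
$$\Phi(\tilde z, x, z) \mapsto \Phi(\tilde z + t_0, x + t_1, z + t_2).$$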
If we put $x^0 = \tilde z$, $x^1 = x$, $x^2 = z$, $x^3 = t_3$, and so on, and use the flows for $j \ge 3$ to
define the dependence of the $\phi_j$s on $x^3, x^4, \ldots$, then the ASDYM equation and the
recursion equations are equivalent to the commutation conditions $[L_j, L_k] = 0$
for the operators
$$L_j = \partial_j - \zeta(\partial_{j-1} + \phi_{j-1}), \quad j = 1, 2, \ldots.$$

This infinite system of equations for the unknowns $\phi_j$ is the Bogomolny hierarchy.
By truncating the sequence, we obtain a finite system of equations $[L_j, L_k] = 0$
(j, k = 1, 2, ..., m) for the unknowns $\phi_0, \ldots, \phi_{m-1}$ as functions of $x^0, \ldots, x^m$.
We call this the Bogomolny hierarchy up to level m, and denote it by B(m); in
particular, B(2) is the complex form of the Bogomolny equations.
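For the lowest level the commutation condition can be expanded by hand. The sketch below assumes the operators $L_j = \partial_j - \zeta(\partial_{j-1} + \phi_{j-1})$ and the identifications $x^0 = \tilde z$, $x^1 = x$, $x^2 = z$, $\phi_0 = \Phi_{\tilde z}$, $\phi_1 = -P$.

```latex
% Level m = 2, with L_1 = \partial_1 - \zeta(\partial_0 + \phi_0) and L_2 = \partial_2 - \zeta(\partial_1 + \phi_1):
[L_1, L_2] = \zeta\,(\partial_2\phi_0 - \partial_1\phi_1)
           + \zeta^2\,(\partial_0\phi_1 - \partial_1\phi_0 + [\phi_0, \phi_1]).
% Both coefficients must vanish. In terms of \Phi_{\tilde z} and P:
\partial_z\Phi_{\tilde z} + \partial_x P = 0, \qquad
\partial_{\tilde z}P + \partial_x\Phi_{\tilde z} + [\Phi_{\tilde z}, P] = 0 .
```

When the fields are in addition independent of $\tilde z$ and we write $Q = \Phi_{\tilde z}$, $t = z$, these two equations become $\partial_tQ + \partial_xP = 0$ and $\partial_xQ = [P, Q]$.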
8.5 REDUCTIONS OF THE ASDYM FLOWS
Suppose that we are given an ASD connection $D = d + \Phi$ on a vector bundle E,
and a constant vector Y on space-time. Then we know from Proposition (8.4.1)
that we can embed D in a family of ASD connections labelled by parameters
$t_i$, i = 0, 1, ..., with the dependence on $t_0$ given by dragging along Y in an
appropriate gauge.
The question that we shall consider now is the following: if D is invariant
under some group H of conformal transformations, is it possible to choose the
embedding so that all the connections in the family are also invariant under
H? If it is, then the reduced equations will inherit a hierarchy of flows from the
recursion operator of the ASDYM equation. We shall see that the NLS and KdV
hierarchies emerge in this way. 4
The reductions by $H_{+0}$
Consider the reductions by the group $H_{+0}$, which is generated by translation
along
$$X = \partial_w - \partial_{\tilde w}, \quad Y = \partial_{\tilde z}.$$
Here the potential of an invariant connection can be put in the form
$$\Phi = \Phi_{\tilde w}\,d\tilde w + \Phi_{\tilde z}\,d\tilde z = Q\,d\tilde z - P\,d\tilde w,$$
where the Higgs fields Q and P are functions of t = z and $x = w + \tilde w$, and the
ASD condition is equivalent to
$$\partial_xQ = [P, Q], \quad \partial_tQ + \partial_xP = 0 \qquad (8.5.1)$$
(see §6.3). The tangent vectors to the solution space satisfy the corresponding
linearized equations,
$$\partial_x(\delta Q) = [\delta P, Q] + [P, \delta Q], \quad \partial_t(\delta Q) + \partial_x(\delta P) = 0. \qquad (8.5.2)$$
If we require that the $\phi$s should also depend only on x and t, then the
recursion operator reduces in this gauge to $R: \phi \mapsto \phi'$, where
$$[Q, \phi'] = \partial_x\phi, \quad \partial_x\phi' - [P, \phi'] = \partial_t\phi,$$
and (3.5.5) reduces to
$$\phi_{xx} = [Q, \phi_t] + [P, \phi_x]. \qquad (8.5.3)$$

A solution $\phi(x, t)$ generates perturbations of Q and P by
$$\delta Q = [Q, \phi], \quad \delta P = [P, \phi] - \partial_x\phi. \qquad (8.5.4)$$

These preserve the conjugacy class of Q and satisfy (8.5.2). Conversely, any
linearized solution represented by a perturbation $\delta Q$, $\delta P$ that preserves the conjugacy
class of Q is generated by some $\phi(x, t)$ satisfying (8.5.3); $\phi$ is unique up
to the addition of a constant multiple of Q. 5
Example 8.5.1 The linearized equations (8.5.2) are satisfied by $\delta Q = -mQ_x$,
$\delta P = \dot mQ - mP_x$ for any function m(t). This is the perturbation generated by
the coordinate symmetry $x \mapsto x + m$, $t \mapsto t$ (§6.3). A possible choice for $\phi$ is
$$\phi = mP - \dot m\,xQ.$$
The KdV recursion operator
Now specialize to the case in which the gauge group is SL(2, C), and Q and P
satisfy the constraints
$$\mathrm{tr}(Q^2) = 0, \quad \mathrm{tr}(P^2) = \mathrm{tr}(\partial_xQ\,\partial_xP), \quad \mathrm{tr}(QP) = -1. \qquad (8.5.5)$$
Then the invariant $u = -\mathrm{tr}(P^2)$ satisfies the KdV equation
$$4u_t - u_{xxx} - 6uu_x = 0.$$
In the notation of §6.3,
$$Q = g\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}g^{-1}, \quad P = (\partial_xg)g^{-1} = g\begin{pmatrix} q & -1 \\ r & -q \end{pmatrix}g^{-1},$$
where $r = q^2 + q_x$ (the second constraint) and $u = 2q_x$. If we write $\phi$ in eqn
(8.5.3) as
$$\phi = g\begin{pmatrix} \alpha & \beta \\ \gamma & -\alpha \end{pmatrix}g^{-1},$$
then the perturbation given by (8.5.4) is consistent with the constraints whenever
$$\beta_x + 2q\beta + 2\alpha = 0 \quad\text{and}\quad 2\gamma_x + r_x\beta - 2q\alpha_x - \alpha_{xx} = 0. \qquad (8.5.6)$$
Such a $\phi$ generates a solution to the linearized KdV equation
$$4v_t - v_{xxx} - 6uv_x - 6u_xv = 0$$
by $v = 2\,\mathrm{tr}(P\,\partial_x\phi) = 4q\alpha_x + 2r\beta_x - 2\gamma_x$.
We want to show that the recursion operator takes solutions of the KdV
equation to solutions of the KdV equation. Thus, the question is the following:
is it possible to solve the recursion equations
$$[Q, \phi_{i+1}] = \partial_x\phi_i, \quad \partial_x\phi_{i+1} - [P, \phi_{i+1}] = \partial_t\phi_i \qquad (8.5.7)$$
so that the corresponding sequence of perturbations to Q and P is consistent
with (8.5.5)? Let us write
$$\phi_i = g\begin{pmatrix} \alpha_i & \beta_i \\ \gamma_i & -\alpha_i \end{pmatrix}g^{-1}.$$
Then we can solve the first of eqns (8.5.7) for $\phi_{i+1}$ if and only if $\alpha_i$ and $\beta_i$ satisfy
the first of the constraints (8.5.6). When it is satisfied, the solution to the first
of eqns (8.5.7) is
$$\beta_{i+1} = r\beta_i + \gamma_i - \partial_x\alpha_i, \quad \alpha_{i+1} = r\alpha_i - q\gamma_i + \tfrac12\partial_x\gamma_i, \qquad (8.5.8)$$
with $\gamma_{i+1}$ undetermined. However, $\alpha_{i+1}$ and $\beta_{i+1}$ again satisfy the first constraint
in (8.5.6) if and only if $\alpha_i$, $\beta_i$ and $\gamma_i$ also satisfy the second constraint in
(8.5.6). Therefore if the first equation in (8.5.7) is to hold at fixed t, then the
$\phi_i$s must be determined recursively as follows. Given $\alpha_i$ and $\beta_i$ such that the
first of eqns (8.5.6) holds, we use the second of eqns (8.5.6) to determine $\gamma_i$ up
to a constant, and then determine $\alpha_{i+1}$ and $\beta_{i+1}$ by (8.5.8). These again satisfy
the first of eqns (8.5.6). The second recursion equation in (8.5.7) determines the
t-dependence of the $\phi_i$s. To start the recursion, we need only find a suitable
solution $\phi_0$ to (8.5.3) such that the constraints (8.5.6) hold.
With the $\phi_i$s defined in this way, the sequence of perturbations to $Q$ and $P$
is consistent with (8.5.5). Therefore the $\phi_i$s determine a sequence of solutions to
the linearized KdV equation by $v_i = 2\operatorname{tr}(P\,\partial_x\phi_i)$. It follows from the recursion
relations and the constraints on the $\alpha_i$s and $\beta_i$s that $v_{i-1} = 2\partial_x\beta_i$ and that
$$v_i = 2q_xv_{i-1} + \tfrac{1}{4}\partial_x^2v_{i-1} + 2q_{xx}\beta_i.$$
But $u = 2q_x$. Hence the successive $v_i$s are related by the recursion operator
$$R = u + \tfrac{1}{4}\partial_x^2 + \tfrac{1}{2}u_x(\partial_x)^{-1}.$$
We conclude that the KdV recursion operator is a reduction of the ASDYM
recursion operator.
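The action of $R$ on the simplest linearized solution can be checked mechanically. The following Python sketch is an illustration only, not part of the construction above. It represents polynomials in $x$ by coefficient lists, takes the constant of integration in $(\partial_x)^{-1}$ to be zero (an assumption), and confirms for a sample polynomial $u$ with $u(0) = 0$ that $Ru_x$ equals $\tfrac14(u_{xxx} + 6uu_x)$, the value of $u_t$ given by the KdV equation. The helper names are hypothetical.

```python
from fractions import Fraction

# A polynomial c0 + c1*x + c2*x**2 + ... is stored as [c0, c1, c2, ...].

def add(*polys):
    n = max(len(p) for p in polys)
    return [sum((p[i] for p in polys if i < len(p)), Fraction(0)) for i in range(n)]

def scale(c, p):
    return [Fraction(c) * a for a in p]

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def d(p):
    # Derivative with respect to x.
    return [i * p[i] for i in range(1, len(p))] or [Fraction(0)]

def integrate(p):
    # Antiderivative with zero constant of integration (an assumption).
    return [Fraction(0)] + [Fraction(a) / (i + 1) for i, a in enumerate(p)]

def is_zero(p):
    return all(a == 0 for a in p)

def recursion(u, v):
    # R v = u v + (1/4) v_xx + (1/2) u_x (d/dx)^(-1) v
    return add(mul(u, v),
               scale(Fraction(1, 4), d(d(v))),
               scale(Fraction(1, 2), mul(d(u), integrate(v))))

def kdv_ut(u):
    # u_t as determined by 4 u_t - u_xxx - 6 u u_x = 0.
    return scale(Fraction(1, 4), add(d(d(d(u))), scale(6, mul(u, d(u)))))

def difference(u):
    # R(u_x) - u_t, which should vanish identically when u(0) = 0.
    return add(recursion(u, d(u)), scale(-1, kdv_ut(u)))
```

For example, `is_zero(difference([0, 1, 3, 2]))` tests the identity for $u = x + 3x^2 + 2x^3$. The restriction $u(0) = 0$ is only there so that the chosen antiderivative of $u_x$ is $u$ itself.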
A natural choice is to take $\phi_0 = Q$. Then we can take $\phi_1 = -P$, which
generates translation along $\partial_x$. The higher flows are those of the Bogomolny
hierarchy. In particular, we have

$$\phi_2 = g\begin{pmatrix} -\tfrac{1}{2}r_x & q_x\\ -\tfrac{1}{2}r_xq - \tfrac{1}{4}r_{xx} & \tfrac{1}{2}r_x\end{pmatrix}g^{-1},$$

for which the corresponding flow is translation along $\partial_t$.
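The claim about the second flow can be tested directly on polynomial data. The sketch below is an illustration under stated assumptions: it takes the entries of $\phi_2$ in the frame $g$ to be $\alpha = -\tfrac12 r_x$, $\beta = q_x$, $\gamma = -\tfrac12 qr_x - \tfrac14 r_{xx}$ with $r = q^2 + q_x$ (the constant of integration in $\gamma$ is set to zero), and checks that they satisfy both constraints (8.5.6) and that $v = 4q\alpha_x + 2r\beta_x - 2\gamma_x$ equals $u_t = \tfrac14(u_{xxx} + 6uu_x)$ with $u = 2q_x$. Function names are hypothetical.

```python
from fractions import Fraction

# A polynomial c0 + c1*x + ... is stored as the coefficient list [c0, c1, ...].

def add(*polys):
    n = max(len(p) for p in polys)
    return [sum((p[i] for p in polys if i < len(p)), Fraction(0)) for i in range(n)]

def scale(c, p):
    return [Fraction(c) * a for a in p]

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def d(p):
    return [i * p[i] for i in range(1, len(p))] or [Fraction(0)]

def is_zero(p):
    return all(a == 0 for a in p)

def phi2_entries(q):
    # Assumed frame components of phi_2 (see the lead-in).
    r = add(mul(q, q), d(q))
    alpha = scale(Fraction(-1, 2), d(r))
    beta = d(q)
    gamma = add(scale(Fraction(-1, 2), mul(q, d(r))),
                scale(Fraction(-1, 4), d(d(r))))
    return r, alpha, beta, gamma

def constraints(q):
    # Left-hand sides of the two constraints (8.5.6).
    r, alpha, beta, gamma = phi2_entries(q)
    c1 = add(d(beta), scale(2, mul(q, beta)), scale(2, alpha))
    c2 = add(scale(2, d(gamma)), mul(d(r), beta),
             scale(-2, mul(q, d(alpha))), scale(-1, d(d(alpha))))
    return c1, c2

def flow_defect(q):
    # v - u_t, where v = 4 q alpha_x + 2 r beta_x - 2 gamma_x and u = 2 q_x.
    r, alpha, beta, gamma = phi2_entries(q)
    v = add(scale(4, mul(q, d(alpha))), scale(2, mul(r, d(beta))),
            scale(-2, d(gamma)))
    u = scale(2, d(q))
    u_t = scale(Fraction(1, 4), add(d(d(d(u))), scale(6, mul(u, d(u)))))
    return add(v, scale(-1, u_t))
```

Calling `constraints(q)` and `flow_defect(q)` on any coefficient list `q` should return polynomials that vanish identically, since the identities hold for arbitrary $q(x)$.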

The KdV symplectic structures
A symplectic form on the solution space of the KdV equation can be written as
a bilinear expression in $\phi$ and $\phi'$, where $\phi$ and $\phi'$ are two solutions to (8.5.3)
representing tangents to the solution manifold of (8.5.1). We can construct such
forms by applying the reduction procedure to the $\Omega_k$s defined by eqn (8.3.1).
For $k = 0$, we contract the 3-form
$$\operatorname{tr}(\phi * D\phi' - \phi' * D\phi)$$
with $X$ and $Y$ (the generators of $H_{+0}$), and integrate over a curve in the $x, t$-plane.
We shall take the curve to be a line of constant $t$. Then, on putting
D = d + Qdz - Pdw and dropping an inessential constant factor, the result is

$$\Omega_0(\phi, \phi') = \int_{-\infty}^{\infty}\operatorname{tr}(\phi[Q, \phi'])\,dx = \int_{-\infty}^{\infty}(\beta'\beta_x - \beta\beta'_x)\,dx.$$

Provided that $\phi$ and $\phi'$ behave appropriately at large values of $|x|$, this is
independent of $t$.
The other forms in the sequence are obtained by applying the recursion
operator, which satisfies $\Omega_0(R\phi, \phi') = \Omega_0(\phi, R\phi')$. Thus

$$\Omega_k(\phi, \phi') = \Omega_0(R^k\phi, \phi') = \Omega_0(\phi_i, \phi'_j),$$

where $i + j = k$, and $\phi_i = R^i\phi$, $\phi'_j = R^j\phi'$. In particular,

$$\Omega_2(\phi, \phi') = \int_{-\infty}^{\infty}(\beta'_1\beta_{1x} - \beta_1\beta'_{1x})\,dx.$$

But $\partial_x\beta_i = \tfrac{1}{2}v_{i-1}$. Therefore, as a 2-form on the space of solutions to the KdV
equation, $\Omega_2$ is the same as

$$\Omega_2(v, v') = \int_{-\infty}^{\infty}(\pi'\pi_x - \pi\pi'_x)\,dx,$$

where $\pi_x = \tfrac{1}{2}v$ and $\pi'_x = \tfrac{1}{2}v'$, which is the 2-form associated with one of the
standard symplectic structures on the solution space of the KdV equation (see
eqn C.12). Thus the bi-Hamiltonian structure of the KdV equation is inherited
from the sequence of closed forms on the solution space of the ASDYM equation.
The NLS equation
If, instead, we impose the constraints $\operatorname{tr}(QP) = 0$, $\det Q = 1$, then we can write
$Q$ and $P$ in the form

$$Q = g\begin{pmatrix} i & 0\\ 0 & -i\end{pmatrix}g^{-1}, \qquad P = g_xg^{-1} = g\begin{pmatrix} 0 & \psi\\ \tilde\psi & 0\end{pmatrix}g^{-1},$$

where $\psi$ and $\tilde\psi$ satisfy the complex NLS equation. We recall from §6.3 that
this gives a one-to-one correspondence between solutions to (8.5.1) such that
$\operatorname{tr}(QP) = 0$, $\det Q = 1$, modulo conjugation of $Q$ and $P$ by a constant matrix,
on the one hand, and solutions to the complex NLS equation, modulo the
identification of $\psi$, $\tilde\psi$ with $\lambda\psi$, $\lambda^{-1}\tilde\psi$ for constant $\lambda$, on the other.
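For the linearized equations, we have a correspondence via eqn (8.5.4) between
solutions to (8.5.3) such that
$$\operatorname{tr}(Q\,\partial_x\phi) = 0$$
and solutions to the linearized NLS equation,
$$i\,\delta\psi_t = -\tfrac{1}{2}\delta\psi_{xx} + 2\psi\tilde\psi\,\delta\psi + \psi^2\delta\tilde\psi, \qquad i\,\delta\tilde\psi_t = \tfrac{1}{2}\delta\tilde\psi_{xx} - 2\psi\tilde\psi\,\delta\tilde\psi - \tilde\psi^2\delta\psi.$$
If we write
$$\phi = g\begin{pmatrix} \alpha & \beta\\ \gamma & -\alpha\end{pmatrix}g^{-1},$$
then the correspondence is given by

$$\alpha_x = \beta\tilde\psi - \gamma\psi, \qquad \beta_x = 2\alpha\psi + \delta\psi, \qquad \gamma_x = -2\alpha\tilde\psi - \delta\tilde\psi. \tag{8.5.9}$$
We have to identify two solutions to the linearized NLS equation that differ by
a constant multiple of $\delta\psi = \psi$, $\delta\tilde\psi = -\tilde\psi$, and we have to identify two solutions
to (8.5.3) that differ by a constant matrix.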
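In this case, the recursion relations
$$[Q, \phi_{j+1}] = \partial_x\phi_j, \qquad \partial_x\phi_{j+1} - [P, \phi_{j+1}] = \partial_t\phi_j$$
come down to
$$2i\beta_{j+1} = \partial_x\beta_j - 2\alpha_j\psi, \qquad 2i\gamma_{j+1} = -\partial_x\gamma_j - 2\alpha_j\tilde\psi.$$
At each stage, $\alpha_j$ must satisfy the constraint
$$\partial_x\alpha_j = \beta_j\tilde\psi - \gamma_j\psi.$$
The corresponding sequence of solutions to the linearized NLS equation is given
by
$$\delta_j\tilde\psi = -\partial_x\gamma_j - 2\alpha_j\tilde\psi = 2i\gamma_{j+1}, \qquad \delta_j\psi = \partial_x\beta_j - 2\alpha_j\psi = 2i\beta_{j+1}.$$

We have

$$\Omega_0(\phi, \phi') = \int_{-\infty}^{\infty}\operatorname{tr}(\phi[Q, \phi'])\,dx = 2i\int_{-\infty}^{\infty}(\gamma\beta' - \beta\gamma')\,dx.$$

Therefore $\Omega_2$ coincides with the NLS symplectic form

$$\Omega_2(\delta\psi, \delta'\psi) = \frac{1}{2i}\int_{-\infty}^{\infty}(\delta\tilde\psi\,\delta'\psi - \delta'\tilde\psi\,\delta\psi)\,dx.$$
As with the KdV equation, the NLS recursion operator and symplectic structures
are inherited from the ASDYM equation.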
8.6 THE GENERALIZED ASDYM EQUATION
It is only in four dimensions that the Hodge duality operator $\alpha \mapsto *\alpha$ maps
2-forms to 2-forms, and so determines the decomposition of the curvature of a
connection into its self-dual and anti-self-dual parts. For this reason, it is only
in four dimensions that the self-duality equations appear completely natural and
geometric. There are generalizations to higher dimensions, but they all require
additional geometric structures of one sort or another, which obstruct the direct
generalization of the twistor construction.
However, some of the theory that we have introduced in this chapter does
have a straightforward extension to higher dimensions. We shall describe here
a generalization of the ASDYM hierarchy, which will be important in our
treatment of the Drinfeld-Sokolov construction in Chapter 12 (higher-dimensional
analogues of the self-duality equations are also considered by Ward 1984b,
Pedersen and Poon 1988, Popov 1992, Glazebrook et al. 1994).
We begin by introducing the generalized anti-self-dual Yang-Mills (GASDYM)
equation on a connection $D$ on a bundle $E \to U$, where $U$ is an open
subset of the `space-time' $\mathbb{C}^{2k}$ ($k \geq 2$). We denote the space-time coordinates
by $x^1, \ldots, x^k, \tilde x^1, \ldots, \tilde x^k$, and the connection, as usual, by $D = d + \Phi$. We put
$\partial_A = \partial/\partial x^A$ and $\tilde\partial_A = \partial/\partial\tilde x^A$, and denote by $\Phi_A$ and $\tilde\Phi_A$ the components of $\Phi$
along $\partial_A$ and $\tilde\partial_A$, respectively, for $A = 1, \ldots, k$. The GASDYM equation is the
condition that the curvature of $D$ should vanish on every $k$-plane of the form
$$\zeta x^A + \tilde x^A = \text{constant}$$
($\zeta \in \mathbb{C}$, $A = 1, 2, \ldots, k$). By analogy with the four-dimensional case, we call these
$\alpha$-planes. Equivalently, $D$ satisfies the GASDYM equation whenever $[L_A, L_B]$
vanishes for all $A, B$ and for all $\zeta \in \mathbb{C}$, where the $L_A$s are the operators on
$\Gamma(U, E)$ defined by
$$L_A = D_A - \zeta\tilde D_A.$$
As in four dimensions, the equation is invariant under gauge transformations
$$\Phi \mapsto g^{-1}\Phi g + g^{-1}dg,$$
that is, under changes in the local trivialization of $E$. When the equation is
satisfied, we can choose the gauge so that $\tilde\Phi_A = 0$; then the remaining equations
on the $\Phi_A$s are
$$\tilde\partial_A\Phi_B - \tilde\partial_B\Phi_A = 0, \qquad \partial_A\Phi_B - \partial_B\Phi_A + [\Phi_A, \Phi_B] = 0. \tag{8.6.1}$$
We have in all $k(k-1)$ first-order equations in $k$ unknown matrix-valued functions
of $2k$ variables, so the system is overdetermined for $k > 2$. We shall see from the
twistor construction, however, that the equations are consistent, and propagate
data from $(k + 1)$-surfaces.
For any pair of indices, (8.6.1) is the same as the ASDYM equation: for
each $A \neq B$, the connection $D$ determines a family of solutions to the ASDYM
equation parametrized by the remaining coordinates; conversely, any solution to
the ASDYM equation determines a solution to the GASDYM equation by taking
$\Phi_A$ and $\Phi_B$ to be independent of the other coordinates, and by taking the other
components of $\Phi$ to be zero.
The linear coordinate transformations that preserve the equation are

$$\begin{pmatrix} x^1 & \tilde x^1\\ x^2 & \tilde x^2\\ \vdots & \vdots\\ x^k & \tilde x^k\end{pmatrix} \mapsto \Lambda\begin{pmatrix} x^1 & \tilde x^1\\ x^2 & \tilde x^2\\ \vdots & \vdots\\ x^k & \tilde x^k\end{pmatrix}A,$$

where $\Lambda \in GL(k, \mathbb{C})$ and $A \in GL(2, \mathbb{C})$. In contrast to the four-dimensional case,
we cannot characterize these as the linear isometries of a space-time metric.
When $k$ is even, and $\Lambda$ and $A$ are unitary, the transformations preserve the
hyper-Kähler geometry of $\mathbb{C}^{2k}$ (see Atiyah and Hitchin 1988). In general, the
symmetries preserve the paraconformal geometry (Bailey and Eastwood 1991).
The twistor construction reveals additional symmetry under the larger group
$PGL(k + 2, \mathbb{C})$; see §10.7.

The J and K potentials

As in four dimensions, we can take either the first or the second of eqns (8.6.1) as
an integrability condition, and so write $\Phi$ in terms of either a $K$ or a $J$ potential,
$$\tilde\Phi_A = 0, \qquad \Phi_A = \tilde\partial_AK = J^{-1}\partial_AJ.$$
The GASDYM equation then assumes one or other of the forms
$$\tilde\partial_{[A}(J^{-1}\partial_{B]}J) = 0, \qquad 2\partial_{[A}\tilde\partial_{B]}K + [\tilde\partial_AK, \tilde\partial_BK] = 0.$$
In both cases, there are $\tfrac{1}{2}k(k - 1)$ equations in one unknown matrix-valued
function of $2k$ variables.

The recursion operator

As before, the linearized forms are the same: if we write $\phi = J^{-1}\delta J = \delta K$, then
they are both equivalent to
$$D_{[A}\tilde D_{B]}\phi = 0,$$
where $\phi$ is interpreted as a section of $\operatorname{adj}(E)$, and $D$ denotes the connection
$d + [\Phi, \,\cdot\,]$ on $\operatorname{adj}(E)$; here we have $\tfrac{1}{2}k(k-1)$ independent equations in general, and
just one equation when $k = 2$. The coincidence between the linearized equations
allows us to define the recursion operator on tangents to the GASDYM equation
by $R: \phi \mapsto \phi'$, where
$$\tilde D_A\phi' = D_A\phi.$$
Much of the rest of the theory goes through in this more general context, in-
cluding the construction of hierarchies of commuting flows from translations of
space-time.

The GASDYM hierarchy

In particular, by considering the translations along OA, we obtain the flows of
the GASDYM hierarchy on the solution space. Any solution to the GASDYM
equation generates, uniquely up to the ambiguity in the integrals that define
$R$, a solution to a hierarchy of equations for the unknowns $\Phi_{Ai}$ as functions of
the variables $x^{Ai}$, $A = 1, \ldots, k$, $i \in \mathbb{Z}$. The equations are the commutation
conditions for
$$L_{Ai} = \partial_{Ai} - \zeta(\partial_{A,i-1} + \Phi_{A,i-1}),$$
with the embedding of the original equation in the system given by $x^{A0} = \tilde x^A$,
$x^{A1} = x^A$, and $\Phi_{A0} = \tilde\Phi_A$. For $k = 2$, it is the ASDYM hierarchy, and for any
pair $A, B$ we have an embedding of the ASDYM hierarchy (by holding the other
$x^{Ai}$s constant). In the same way that we can think of the GASDYM equation
as a family of interleaved ASDYM equations, we can think of the GASDYM
hierarchy as a family of interleaved ASDYM hierarchies.
If we allow gauge transformations $L_{Ai} \mapsto g^{-1}L_{Ai}g$, where $g$ is a matrix-valued
function of all the $x^{Ai}$, regarded here as a multiplication operator on sections of
$E$, then in a general gauge,

$$L_{Ai} = \partial_{Ai} + P_{Ai} - \zeta(\partial_{A,i-1} + Q_{A,i-1}),$$

with $Q_{A0} = \tilde\Phi_A$, $P_{A1} = \Phi_A$.

The truncated hierarchy

Embedded within the GASDYM hierarchy are many other systems of differential
equations. For any $k > 0$, $p > 0$, the commutation conditions
$$[L_{Ai}, L_{Bj}] = 0, \qquad A, B = 1, 2, \ldots, k, \quad i, j = 1, 2, \ldots, p$$
give a system of equations for the unknown matrices
$$P_{Ai}, \quad 1 \leq A \leq k, \quad 1 \leq i \leq p,$$
$$Q_{Ai}, \quad 1 \leq A \leq k, \quad 0 \leq i \leq p - 1,$$
as functions of the independent variables $x^{Ai}$, $1 \leq A \leq k$, $0 \leq i \leq p$. We denote
this system by $H(k, p)$. Two solutions are regarded as equivalent whenever they
are related by a gauge transformation
$$P_{Ai} \mapsto g^{-1}P_{Ai}g + g^{-1}\partial_{Ai}g, \qquad Q_{Ai} \mapsto g^{-1}Q_{Ai}g + g^{-1}\partial_{Ai}g,$$
for some $g$ depending on $x^{Ai}$, $1 \leq A \leq k$, $0 \leq i \leq p$, and taking values in
the gauge group. For any 2 < k, p > 0, a solution to the GASDYM hierarchy
contains a family of solutions to H(2, p), parametrized by xAi, A > k, i < 0 or
i > p. In this sense, p) is a truncation of the GASDYM hierarchy. Note that
H(1, oo) is the Bogomolny hierarchy, H(k, 1) is the GASDYM equation, H(2, 1)
is the ASDYM equation.
We can obtain other systems by reduction. We shall not consider the general
theory, which would involve a detailed investigation of the symmetries of H(k, p),
but simply note that we can impose translational symmetry by requiring that
the dependent variables and the gauge transformations should be constant along
some fixed vector in the space of independent variables. If the system is
invariant in this sense along $\partial_{Ai}$, then the Higgs fields $Q_{Ai}$ and $P_{Ai}$ transform by
conjugation under gauge transformations.
Finally we note that for any $A, B, i, j$, the condition $[L_{Ai}, L_{Bj}] = 0$ is
equivalent to the condition that
$$\Phi = P_{Ai}\,dw + Q_{A,i-1}\,d\tilde z + P_{Bj}\,dz + Q_{B,j-1}\,d\tilde w$$
should be the potential of an ASD connection on the four-dimensional
space-time with coordinates $w = x^{Ai}$, $\tilde z = x^{A,i-1}$, $z = x^{Bj}$ and $\tilde w = x^{B,j-1}$, with the
remaining $x$s regarded as parameters. If $A = B$ and $j = i + 1$, then the ASD
connection is invariant under translation along $\partial_w - \partial_{\tilde w}$.

The nKdV hierarchy

In Appendix B, we describe how the Drinfeld-Sokolov construction generates
an infinite sequence of commuting operators $M_j = \partial_j - H_j$ depending on the
variables $t_1, t_2, \ldots$ from a trace-free lower triangular matrix $A(x)$. The sequence
is characterized, up to a linear transformation of the independent variables, by
the conditions on p. 326.
The nKdV hierarchy is a central special case of the system $H(n - 1, \infty)$ with
gauge group $SL(n, \mathbb{C})$, by the following correspondence (Mason and Singer 1994).
Given the operators $M_k$ (see p. 326), we put $x^{Ai} = t_{n(i-1)+A}$, for $n > A > 0$,
$i > 0$, that is, $t_1 = x^{11}$, $t_2 = x^{21}$, ..., and define
$$L_{Ai} = M_{n(i-1)+A} - \zeta M_{n(i-2)+A} = \partial_{Ai} - \zeta\partial_{A,i-1} - H_{n(i-1)+A} + \zeta H_{n(i-2)+A}, \qquad i \geq 2,$$
$$L_{A1} = M_A - \zeta\partial_{A0} = \partial_{A1} - \zeta\partial_{A0} - H_A. \tag{8.6.2}$$
Then the $L_{Ai}$s commute and are linear in $\zeta$, and so determine a solution to
$H(n - 1, \infty)$.
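The relabelling $x^{Ai} = t_{n(i-1)+A}$ is simple index bookkeeping, and a few lines of Python make it concrete. This is a sketch for illustration only; the function names `t_index` and `x_index` are hypothetical. It also shows which times $t_j$ are not reached: since $A$ runs from $1$ to $n - 1$, the indices $j$ divisible by $n$ have no label $(A, i)$.

```python
def t_index(A, i, n):
    """Index j of the time t_j identified with x^{Ai}, for 0 < A < n and i > 0."""
    if not (1 <= A <= n - 1) or i < 1:
        raise ValueError("need 0 < A < n and i > 0")
    return n * (i - 1) + A

def x_index(j, n):
    """Inverse map: the pair (A, i) with t_j = x^{Ai}, or None if n divides j."""
    if j < 1:
        raise ValueError("need j > 0")
    if j % n == 0:
        return None  # t_n, t_2n, ... are not among the x^{Ai}
    return (j % n, j // n + 1)
```

For $n = 3$, for instance, `t_index(1, 1, 3)` and `t_index(2, 1, 3)` give the indices of $t_1 = x^{11}$ and $t_2 = x^{21}$, while `x_index(3, 3)` returns `None`.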
The solutions to $H(n - 1, \infty)$ that arise from the Drinfeld-Sokolov
construction of the nKdV hierarchy are special in three ways. First, they are independent
of $x^{A0}$, for $A = 1, \ldots, n - 1$. Second, in the given gauge, the corresponding Higgs
fields $Q_{A0}$ have the special form

$$Q_{A0} = \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 & \cdots & 0\\ \vdots & & & & & & \vdots\\ 1 & 0 & \cdots & 0 & 0 & \cdots & 0\\ * & 1 & & & & & \vdots\\ \vdots & \ddots & \ddots & & & & \\ * & \cdots & * & 1 & 0 & \cdots & 0\end{pmatrix}$$

with ones on the diagonal ending in the $A$th entry in the last row, and zeros
above it. Third, certain invariants have special values. These are defined as
follows. Any solution to $H(n - 1, \infty)$ which is invariant along $\partial_{A0}$ can be put in
the form
$$L_{A1} = \partial_{A1} + P_{A1} - \zeta(\partial_{A0} + Q_{A0}),$$
$$L_{Ai} = \partial_{Ai} + P_{Ai} - \zeta\partial_{A,i-1}, \qquad i > 1, \tag{8.6.3}$$
by a gauge transformation which is unique up to a constant element of $SL(n, \mathbb{C})$.
The commutation conditions then imply that the invariants $\operatorname{tr}(P_{A1}Q_{B0})$ are
constant. The third property of the Drinfeld-Sokolov solutions is that
$$\operatorname{tr}(P_{n-A,1}Q_{A0}) = -A. \tag{8.6.4}$$
Conversely, any solution to $H(n - 1, \infty)$ with these three properties arises
from a solution to the nKdV hierarchy (i) by making a linear transformation
of the variables $t_j$ and (ii) by constructing the $L_{Ai}$ from operators $M_j$, as in
(8.6.2). To see this, we first choose an invariant gauge for the solution to
$H(n - 1, \infty)$ such that (8.6.3) holds. The Higgs fields are constant in this gauge
as a consequence of the commutation conditions and can therefore be reduced
to their special form by a constant gauge transformation, which leaves the other
$Q$s zero. Again from the commutation conditions, we have
$$[P_{A1}, Q_{B0}] = [P_{B1}, Q_{A0}].$$
From this and the third condition on the invariants, we deduce that
$$P_{A1} = \begin{pmatrix} * & \cdots & * & -1 & 0 & 0 & \cdots & 0\\ * & \cdots & * & * & -1 & 0 & \cdots & 0\\ * & \cdots & * & * & * & -1 & & \\ \vdots & & & & & & \ddots & \end{pmatrix}$$
with $-1$ on the diagonal beginning in the $A$th entry in the first row, and zeros
above it. We can then define the $M_j$s by solving (8.6.2) for the $H$s, and complete
the proof by appealing to the characterization of the nKdV hierarchy on p.
326 (this leaves out the $M_j$s for $j = nr$, but they generate trivial flows in the
Drinfeld-Sokolov construction).
The three conditions that characterize the nKdV solutions are more general
than might appear. The Higgs fields commute. So if they are independent
and if one of them is nilpotent of maximal rank, then they can be brought into
the required form by a linear transformation of the coordinates $x^{A0}$; and if the
invariants on the left-hand side of (8.6.4) are nonzero, then they can be set to
the required values by rescaling the coordinates $x^{A1}$. So, given the symmetries,
the only condition on a general solution to $H(n - 1, \infty)$ is on the character of
the subalgebra of $sl(n, \mathbb{C})$ generated by the Higgs fields.
There is no similarly straightforward correspondence between H(n - 1, m)
and the first nm levels of the nKdV hierarchy. We can go in one direction in the
same way: every solution to the truncated nKdV hierarchy can be regarded as
a solution to H(n - 1, m); but it is rather harder to characterize the solutions to
H(n - 1, m) that come from the Drinfeld-Sokolov construction.
NOTES ON CHAPTER 8
1. Related ideas can be found in Forgacs et al. (1981, 1983), Chau et al. (1982).
2. The parameters $t_1, t_2, t_3, \ldots$ here are the parameters $t_1, t_3, t_5, \ldots$ in the
Drinfeld-Sokolov construction in Chapter 12.
3. We shall show that there exists a family of connections D = d +4 , labelled by t, and,
for each connection in the family, a sequence Oi of solutions to the recursion relations
such that
(a) when t = 0, D = D(0);
(b) for all t, and for all i >, 0,
ai(D= D,i,,Idw + Di0i di, ai+1D = D.0i dfu + Dwgi di ,
(c) for some choice of gauge, £y(D = 804) for all values of the parameters.
Choose a gauge such that
4)"i = 4)w-°)dw + 4)z°)di
and choose 00 such that
1'(4,T)) = DbOo, 1'(4=°)) = D:Oo.
Then solve (8.2.2) with D = 1)(0), by using co to start the recursion. We take the
solutions as the initial values in solving the evolution equations
aj(D = D,yO7 ,

aiwj = Y(Oi+)) - [Oi+j-m, 0m] (i,j >- 0)

M=O
for D = 4),;,dw+4)idz and the Os as functions of the parameters. These are consistent.
Note that with i = 0, the third equation is 8oOj = Y(Oj).
We shall now show that the remaining conditions on D and the Os hold because
they hold at the origin in parameter space, and because they are preserved by the
evolution. Put ai = Si+14),;, - a.0, and b = Y((;,) - 8o4 . We have to show that
ai = 0 = b. From the evolution equations,
ajb=[b,Oj] (j>0).
However, b = 0 at the origin of the parameters, so b = 0 everywhere. Also from the
evolution equations, for j, k > 0,
akaj -,9jak + [4k,aj] + [ak,Oj] = 0, 8k+laj - 8j+lak = 0,
from which we deduce that for i, j > 0,
I
aiaj = Y(ai+)) - E ([ai+j-m, Om] + [Oi+)-m, am]) .

m=0\
Again, ai = 0 at the origin of the parameters, so ai = 0 everywhere. Taken with
the same argument for the (Di, this establishes that 8i+1 I = 8z4i, 8i+1c: = a.0i,
8°c,-. = Y((;,), ao4'i = Y(1i).
Finally, we have to show that 1 = satisfies the ASDYM equation for
all values of the parameters. However, if F denotes the curvature, then F., vanishes,
8,F,,i = Dz(aj(D i) - Di(aj1DW) = IF,oi,OjJ
and
,9j+1(Fww - F.i) = (8.a: - 8=au,% = 0
for j > 0. Also 8oF = LyF. We deduce that F is ASD from the fact that F is ASD
at the origin of the parameters.
4. It is possible to study the reduced hierarchies directly as systems of partial differential
equations. In Mason and Sparling (1992), it is shown that B(n) reduces to the NLS
and KdV hierarchies under translational symmetry along 8/8x°.
5. There are, of course, linearized solutions such that $Q$ and $Q + \delta Q$ are not conjugate,
to the first order. These can be generated by solutions to (3.5.5), but with $\phi$ having
nontrivial dependence on z.
Part II
Twistor methods
9
Mathematical background II

In this chapter, we shall gather together some basic mathematical results that
we shall need to apply the Penrose-Ward transform to integrable systems. The
topics covered are: projective spaces and flag manifolds, the geometry of the
twistor correspondence, Birkhoff's factorization theorem, holomorphic bundles,
and 2-component spinors and their relation to the twistor correspondence. As in
the first part, the intention is not to give a complete or detailed exposition, but
rather to define the scope of the tools that we shall need, to introduce notation
and terminology, and to record in a convenient form a few basic propositions.
Only the first two sections and the statement of the Birkhoff factorization theorem
are needed as essential background for the construction of the Penrose-Ward
transform in the next chapter.
9.1 PROJECTIVE SPACES AND FLAG MANIFOLDS
Let $V$ be a vector space of dimension $n + 1$. The projective space $\mathbb{P}V$ is the set
of one-dimensional subspaces of $V$. There is a natural projection
$$V - \{0\} \to \mathbb{P}V$$
which maps nonzero $Z \in V$ to its linear span $[Z]$. By picking a basis in $V$, we can
label the points of $\mathbb{P}V$ by the corresponding linear coordinates $(z^0, z^1, \ldots, z^n)$,
subject to the identification
$$(z^0, z^1, \ldots, z^n) \sim (\lambda z^0, \lambda z^1, \ldots, \lambda z^n)$$
for any nonzero scalar $\lambda$. These are the homogeneous coordinates, although they
are not coordinates on $\mathbb{P}V$ in the standard sense since the $z^a$s labelling a given
one-dimensional subspace are not unique. We can instead use inhomogeneous or
affine coordinates
$$\zeta^1 = \frac{z^1}{z^0}, \quad \zeta^2 = \frac{z^2}{z^0}, \quad \ldots, \quad \zeta^n = \frac{z^n}{z^0},$$
which are unique labels, but only on the coordinate patch $z^0 \neq 0$. By dividing
instead by $z^1$, or by $z^2$, and so on, we cover $\mathbb{P}V$ with $n + 1$ systems of
inhomogeneous coordinates, and give it the structure of an $n$-dimensional manifold.
In the particular cases $V = \mathbb{R}^{n+1}$ and $V = \mathbb{C}^{n+1}$, we denote the corresponding
projective spaces, respectively, by $\mathbb{RP}^n$ (a compact real manifold) and $\mathbb{CP}^n$ (a
compact complex manifold). When $n = 1$, the projective space is the (real or
complex) projective line; when $n = 2$, it is the projective plane. The complex
projective line is the same as the Riemann sphere, with the affine coordinate
$\zeta = z^1/z^0$ determining the stereographic projection onto the Argand plane.
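The passage between homogeneous and affine coordinates is easy to make concrete. The sketch below is for illustration only, and the function name `affine` is hypothetical. It divides by the chosen coordinate $z^{\text{patch}}$ using exact rational arithmetic, and fails on points outside the patch, where that coordinate vanishes.

```python
from fractions import Fraction

def affine(z, patch=0):
    """Affine coordinates of the point [z] on the patch where z[patch] != 0.

    z is a sequence of n + 1 homogeneous coordinates (integers or Fractions).
    The result has n entries: every z[k] / z[patch] with k != patch.
    """
    if z[patch] == 0:
        raise ValueError("point does not lie in this coordinate patch")
    return tuple(Fraction(c) / z[patch] for k, c in enumerate(z) if k != patch)
```

Rescaling the homogeneous coordinates leaves the result unchanged, so `affine((1, 2, 3))` and `affine((2, 4, 6))` label the same point, while a point such as `(0, 1, 2)` needs a different patch, for example `affine((0, 1, 2), patch=1)`.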
We can also consider subspaces of other dimensions. The set of $n$-dimensional
subspaces in $V$ is the dual projective space $\mathbb{P}V^*$. By identifying such subspaces
with their annihilators in $V^*$, we see that $\mathbb{P}V^*$ is the same as the projective space
of the dual space, as the notation suggests.
Given an ordered sequence $k = (k_1, \ldots, k_m)$ of positive integers such that
$k_1 < k_2 < \cdots < k_m < n + 1$, we define the flag manifold $\mathbb{F}_kV$ to be the set of sequences of
subspaces
$$E_1 \subset E_2 \subset \cdots \subset E_m \subset V, \qquad \dim E_i = k_i.$$
The projective spaces and the flag manifolds are homogeneous spaces for the
general linear group $GL(V)$, which acts on $V$ and hence on its subspaces. By
choosing a basis, and by considering the matrices that leave invariant the flag in
which $E_1$ is spanned by the first $k_1$ vectors, $E_2$ by the first $k_2$ vectors, and so
on, we see that
$$\mathbb{F}_kV = GL(n + 1)/G_k$$
where $G_k$ is the group of $(n + 1) \times (n + 1)$ matrices of the block form
$$\begin{pmatrix} A_1 & * & \cdots & *\\ 0 & A_2 & & \vdots\\ \vdots & & \ddots & *\\ 0 & \cdots & 0 & A_{m+1}\end{pmatrix}$$
where $A_i$ is a $(k_i - k_{i-1}) \times (k_i - k_{i-1})$ matrix (with $k_0 = 0$ and $k_{m+1} = n + 1$).
9.2 TWISTOR SPACE
In §2.3, we showed that a null 2-plane $Z \subset \mathbb{CM}$ is either an $\alpha$-plane or a
$\beta$-plane, according to whether its tangent bivector is self-dual or anti-self-dual. In
a double-null coordinate system, a general $\alpha$-plane has equations of the form
$$\zeta w + \tilde z = \lambda, \qquad \zeta z + \tilde w = \mu, \tag{9.2.1}$$
where $\lambda$ and $\mu$ are constant. Its tangent space is spanned by the vectors
$$\partial_w - \zeta\partial_{\tilde z}, \qquad \partial_z - \zeta\partial_{\tilde w},$$
or else by $\partial_{\tilde z}$ and $\partial_{\tilde w}$ in the limiting case that $\zeta$ is infinite. Thus the $\alpha$-planes
in space-time (other than those on which $\zeta$ is infinite) are labelled by the three
complex coordinates $\lambda, \mu, \zeta$, and the set of $\alpha$-planes through a given point has
the structure of a Riemann sphere, with affine coordinate $\zeta$.
The twistor space of C M
The set of all $\alpha$-planes in complex space-time is a three-dimensional complex
manifold, which we call the twistor space of $\mathbb{CM}$. We can understand its global
geometry by writing the equation of an $\alpha$-plane in the homogeneous form
$$\tilde zZ^2 + wZ^3 = Z^0, \qquad \tilde wZ^2 + zZ^3 = Z^1, \tag{9.2.2}$$
where the $Z^\alpha$s, $\alpha = 0, 1, 2, 3$, are complex constants; their order is determined by
the standard conventions of twistor theory (see p. 168). Provided that $Z^2 \neq 0$,
(9.2.2) is equivalent to (9.2.1), with
$$\lambda = \frac{Z^0}{Z^2}, \qquad \mu = \frac{Z^1}{Z^2}, \qquad \zeta = \frac{Z^3}{Z^2}.$$
When $Z^2 = 0$, $Z^3 \neq 0$, we have that $\zeta$ is infinite and the tangent space is spanned
by $\partial_{\tilde z}$ and $\partial_{\tilde w}$. If we include these $\alpha$-planes of constant $w$, $z$ and interpret the
$Z^\alpha$s as homogeneous coordinates, then we have an identification of the twistor
space of $\mathbb{CM}$ with an open subset of $\mathbb{CP}^3$. The points of $\mathbb{CP}^3$ that are excluded
are those on the line
$$I = \{Z^2 = Z^3 = 0\}.$$
Thus the twistor space of $\mathbb{CM}$ is the complex manifold $\mathbb{CP}^3 - \mathbb{CP}^1$. It can
be covered by two coordinate patches $V$ and $\tilde V$, where $V$ is the complement of
$\zeta = \infty$ (the plane $Z^2 = 0$) and $\tilde V$ is the complement of $\zeta = 0$ (the plane $Z^3 = 0$).
On $V$ we use the coordinates $\lambda, \mu, \zeta$, and on $\tilde V$, we use
$$\tilde\lambda = \frac{Z^0}{Z^3}, \qquad \tilde\mu = \frac{Z^1}{Z^3}, \qquad \tilde\zeta = \frac{Z^2}{Z^3}.$$
On the overlap $V \cap \tilde V$,
$$\tilde\lambda = \lambda/\zeta, \qquad \tilde\mu = \mu/\zeta, \qquad \tilde\zeta = 1/\zeta.$$
We denote by $\mathbb{T}$ the copy of $\mathbb{C}^4$ on which $(Z^0, Z^1, Z^2, Z^3)$ are linear coordinates,
and by $\mathbb{PT}$ the corresponding copy of the projective space $\mathbb{CP}^3$.
The twistor space of $U \subset \mathbb{CM}$
Let $U \subset \mathbb{CM}$ and suppose that its intersection with each $\alpha$-plane is connected
(but possibly empty). We define the twistor space of $U$ to be the subset¹
$$P = \{Z \in \mathbb{PT} \mid Z \cap U \neq \emptyset\}$$
of $\mathbb{PT}$. If $U$ is open, then so is $P$; and if $U = \mathbb{CM}$, then $P$ is the complement of
the line $I$. We shall see in §10.3 that the excluded points of $I$ can be interpreted as
`$\alpha$-planes at infinity' and that the action of the conformal group on the $\alpha$-planes is
given by the natural action of $GL(4, \mathbb{C})$ on $\mathbb{CP}^3$. The entire projective space $\mathbb{PT}$
is the `twistor space' of the conformal compactification of $\mathbb{CM}$. This definition
is standard in differential geometry, but differs slightly from the terminology
in relativity, in which $\mathbb{T}$ is `twistor space' and $\mathbb{PT}$ is `projective twistor space'
(Penrose and Rindler 1986).
Lines in PT
There is another way to read eqns (9.2.2): if we hold fixed the space-time
coordinates $w, z, \tilde w, \tilde z$, and allow $Z^\alpha$ to vary, then the equations determine a
2-dimensional subspace of $\mathbb{T}$, and hence a projective line in $\mathbb{PT}$. This is the
Riemann sphere of $\alpha$-planes through the space-time point with coordinates $w, z, \tilde w, \tilde z$.
We denote the projective line corresponding to $x \in \mathbb{CM}$ by $\hat x$, that is, $\hat x$ is the
twistor space of $\{x\}$.

Fig. 9.1. The correspondence between U and its twistor space P.

Two points $x, y \in \mathbb{CM}$ are null-separated if and only if they lie on an
$\alpha$-plane, that is, if and only if $\hat x \cap \hat y \neq \emptyset$. Thus two lines in twistor space intersect
whenever the corresponding space-time points are separated by a null vector,
and so the conformal geometry of space-time is encoded in the linear geometry
of $\mathbb{PT}$. See Fig. 9.1.
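The equivalence between null separation and intersection of twistor lines can be tested numerically. The following Python sketch is an illustration under stated assumptions: it takes the line of a point $(w, z, \tilde w, \tilde z)$ to be spanned by the twistors $(\tilde z, \tilde w, 1, 0)$ and $(w, z, 0, 1)$, which solve (9.2.2), and it uses the metric $ds^2 = 2(dz\,d\tilde z - dw\,d\tilde w)$ for the null condition. Two lines in $\mathbb{CP}^3$ meet exactly when the four spanning vectors are linearly dependent, so the test is the vanishing of a $4 \times 4$ determinant. All function names are hypothetical.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def line_of_point(w, z, wt, zt):
    """Two twistors spanning the line of the point (w, z, w~, z~)."""
    return [(zt, wt, 1, 0), (w, z, 0, 1)]

def lines_meet(x, y):
    """True when the lines of x and y intersect in projective twistor space."""
    return det(line_of_point(*x) + line_of_point(*y)) == 0

def null_separated(x, y):
    """True when x - y is null for ds^2 = 2(dz dz~ - dw dw~)."""
    dw, dz, dwt, dzt = (a - b for a, b in zip(x, y))
    return dz * dzt - dw * dwt == 0
```

With integer coordinates the arithmetic is exact, so `lines_meet(x, y)` and `null_separated(x, y)` can be compared directly on sample points.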

The correspondence space

In passing back and forth between a subset $U$ of space-time and its twistor space,
it is helpful to make use of the correspondence space $\mathcal{F}$, which is the set of pairs
$(x, Z)$, in which $x$ is a point of $U$ and $Z$ is an $\alpha$-plane through $x$. It is fibred
over $U$ and $P$ by the projections

$$U \xleftarrow{\;q\;} \mathcal{F} \xrightarrow{\;p\;} P,$$

where $p$ maps $(x, Z)$ to $Z$ and $q$ maps $(x, Z)$ to $x$; both these maps are surjective. If we
label the points of the correspondence space by $(w, z, \tilde w, \tilde z, \zeta)$ (including $\zeta = \infty$),
then the two projections are
$$p: (w, z, \tilde w, \tilde z, \zeta) \mapsto (\lambda, \mu, \zeta) = (\zeta w + \tilde z, \zeta z + \tilde w, \zeta),$$
$$q: (w, z, \tilde w, \tilde z, \zeta) \mapsto (w, z, \tilde w, \tilde z).$$
The tangent spaces to the leaves of the fibration $p$ are spanned at each point by
the vector fields
$$\ell = \partial_w - \zeta\partial_{\tilde z}, \qquad m = \partial_z - \zeta\partial_{\tilde w}$$
on $\mathcal{F}$. The space $\mathcal{F}$ is a subset of the flag manifold $\mathbb{F} = \mathbb{F}_{(1,2)}\mathbb{T}$, since a point of $\mathbb{F}$
is a pair of subspaces $E_1 \subset E_2 \subset \mathbb{T}$, with $\dim E_1 = 1$ and $\dim E_2 = 2$. The first
subspace determines a point of $\mathbb{PT}$, and hence an $\alpha$-plane $Z$, while the second
determines a line $\hat x \subset \mathbb{PT}$, and hence a point $x \in \mathbb{CM}$. The inclusion relation
implies that $Z$ passes through $x$.
A function on $P$ is a function of the three twistor coordinates $(\lambda, \mu, \zeta)$; by
pulling it back by $p: \mathcal{F} \to P$, we can represent it as a function on the
correspondence space, constant along $\ell$ and $m$.
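The statement that pulled-back twistor functions are constant along the two vector fields can be seen directly from the coordinate formulas. The sketch below is illustrative; the names `p`, `q`, `flow_l` and `flow_m` are hypothetical, and the vector fields are assumed to be $\partial_w - \zeta\partial_{\tilde z}$ and $\partial_z - \zeta\partial_{\tilde w}$. Moving a point of the correspondence space along either flow changes its image under $q$ but not its image under $p$.

```python
def p(w, z, wt, zt, zeta):
    """Projection to twistor space: (lambda, mu, zeta)."""
    return (zeta * w + zt, zeta * z + wt, zeta)

def q(w, z, wt, zt, zeta):
    """Projection to space-time."""
    return (w, z, wt, zt)

def flow_l(point, s):
    """Move a parameter distance s along d/dw - zeta d/dz~."""
    w, z, wt, zt, zeta = point
    return (w + s, z, wt, zt - zeta * s, zeta)

def flow_m(point, s):
    """Move a parameter distance s along d/dz - zeta d/dw~."""
    w, z, wt, zt, zeta = point
    return (w, z + s, wt - zeta * s, zt, zeta)
```

For any starting point `pt`, the values `p(*flow_l(pt, s))` and `p(*flow_m(pt, s))` agree with `p(*pt)`, which is the coordinate form of the fact that the leaves of the fibration $p$ are the $\alpha$-planes lifted to the correspondence space.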

Reality structures
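Each of the real slices in complex space-time can be characterized as the fixed
point set of an antiholomorphic involution $\sigma: \mathbb{CM} \to \mathbb{CM}$. In double-null
coordinates, $\sigma$ is defined as follows, with two natural choices for the coordinate
representation in the ultrahyperbolic case:
($\mathbb{E}$)   $\sigma(w, z, \tilde w, \tilde z) = (-\bar{\tilde w}, \bar{\tilde z}, -\bar w, \bar z)$
($\mathbb{U}$)$_1$  $\sigma(w, z, \tilde w, \tilde z) = (\bar{\tilde w}, \bar{\tilde z}, \bar w, \bar z)$
($\mathbb{U}$)$_2$  $\sigma(w, z, \tilde w, \tilde z) = (\bar w, \bar z, \bar{\tilde w}, \bar{\tilde z})$
($\mathbb{M}$)   $\sigma(w, z, \tilde w, \tilde z) = (\bar{\tilde w}, \bar z, \bar w, \bar{\tilde z})$
Note that in the second representation of the ultrahyperbolic conjugation, the
coordinates are real on the real slice. In the Minkowski case, $\sigma$ interchanges
$\alpha$-planes and $\beta$-planes. However, it picks out a real hypersurface $\mathbb{PN} \subset \mathbb{PT}$, by
the condition $Z \cap \sigma(Z) \neq \emptyset$. If $Z \in \mathbb{PN} - I$, then $Z \cap \sigma(Z)$ is a null geodesic
in real Minkowski space, which, in turn, uniquely determines $Z$. Thus $\mathbb{PN} - I$ is
the space of real null geodesics.²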
In the Euclidean and ultrahyperbolic cases, $\sigma$ maps $\alpha$-planes to $\alpha$-planes, and
therefore induces an involution $\sigma: \mathbb{PT} \to \mathbb{PT}$, which is also antiholomorphic. This
is given in homogeneous and the inhomogeneous coordinates by the following:
($\mathbb{E}$)   $\sigma(Z^\alpha) = (\bar Z^1, -\bar Z^0, \bar Z^3, -\bar Z^2)$ and $\sigma(\lambda, \mu, \zeta) = (\bar\mu/\bar\zeta, -\bar\lambda/\bar\zeta, -1/\bar\zeta)$
($\mathbb{U}$)$_1$  $\sigma(Z^\alpha) = (\bar Z^1, \bar Z^0, \bar Z^3, \bar Z^2)$ and $\sigma(\lambda, \mu, \zeta) = (\bar\mu/\bar\zeta, \bar\lambda/\bar\zeta, 1/\bar\zeta)$
($\mathbb{U}$)$_2$  $\sigma(Z^\alpha) = (\bar Z^0, \bar Z^1, \bar Z^2, \bar Z^3)$ and $\sigma(\lambda, \mu, \zeta) = (\bar\lambda, \bar\mu, \bar\zeta)$
Although the definitions look very similar, there is an important difference. In
the Euclidean case, $\sigma$ has no fixed points, because $\sigma^2 = -1$ on the nonprojective
space, and so for each $Z \in \mathbb{PT}$, there is a unique line joining $Z$ to $\sigma(Z)$. These
are the real lines: the line $I$ and the lines corresponding to the points of the
Euclidean real slice in space-time. Since no two points of $\mathbb{E}$ are null-separated,
no two real lines in $\mathbb{PT} - I$ intersect, and so the real lines are the fibres of
a nonholomorphic fibration $\mathbb{PT} - I \to \mathbb{E}$ (in fact the fibration extends to the
whole of $\mathbb{CP}^3$, with $I$ interpreted as the fibre over the point at infinity in the
compactification of $\mathbb{E}$ to $S^4$).³ In the ultrahyperbolic case, on the other hand,
$\sigma$ leaves invariant the $\alpha$-planes on which $\zeta = e^{i\theta}$ and $\lambda = e^{i\theta}\bar\mu$, in the first
representation, or those with real values of $(\lambda, \mu, \zeta)$ in the second representation.
There are therefore fixed points of the action of $\sigma$ on $\mathbb{PT}$. They correspond to
the $\alpha$-planes that intersect $\mathbb{U}$ in two-dimensional null planes, the `real' $\alpha$-planes
in ultrahyperbolic space: there is a circle's worth of real $\alpha$-planes through each
point of $\mathbb{U}$. A general complex $\alpha$-plane $Z$, that is, one that is not fixed by $\sigma$,
intersects $\mathbb{U}$ in the point corresponding to the line joining $Z$ to $\sigma(Z)$. The second
representation of the nonprojective action shows that the fixed point set of $\sigma$ is
$\mathbb{RP}^3 \subset \mathbb{CP}^3$.
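The difference between the two cases can be illustrated with a short computation. The sketch below rests on an assumption: it takes the Euclidean conjugation on nonprojective twistor space to be $Z^\alpha \mapsto (\bar Z^1, -\bar Z^0, \bar Z^3, -\bar Z^2)$, which is one reading of the formulas above, and the second ultrahyperbolic conjugation to be componentwise complex conjugation. A point of projective space is fixed when $\sigma(Z)$ is proportional to $Z$, which is tested here through the $2 \times 2$ minors. Function names are hypothetical.

```python
def sigma_E(Z):
    """Assumed Euclidean conjugation on nonprojective twistor space."""
    Z0, Z1, Z2, Z3 = Z
    return (Z1.conjugate(), -Z0.conjugate(), Z3.conjugate(), -Z2.conjugate())

def sigma_U2(Z):
    """Second ultrahyperbolic conjugation: componentwise complex conjugation."""
    return tuple(c.conjugate() for c in Z)

def proportional(Z, W):
    """True when Z and W represent the same point of projective space."""
    return all(Z[a] * W[b] - Z[b] * W[a] == 0
               for a in range(4) for b in range(a + 1, 4))
```

For the assumed Euclidean map, the minor formed from the first two components of $Z$ and $\sigma(Z)$ is $-(|Z^0|^2 + |Z^1|^2)$, and the one formed from the last two is $-(|Z^2|^2 + |Z^3|^2)$, so they cannot both vanish for nonzero $Z$. This is the same conclusion that the text draws from $\sigma^2 = -1$. For `sigma_U2`, any twistor with real components is fixed.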
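Remark. The first form of the ultrahyperbolic reality condition yields the
following explicit formulas. Consider the ultrahyperbolic slice $\mathbb{U} \subset \mathbb{CM}$ given by
the reality conditions $\tilde w = \bar w$, $\tilde z = \bar z$, on which the metric takes the pseudo-Kähler
form
$$ds^2 = 2(dz\,d\bar z - dw\,d\bar w).$$
An $\alpha$-plane in $\mathbb{CM}$ with tangent space spanned by
$$L = \partial_w - \zeta\partial_{\tilde z}, \qquad M = \partial_z - \zeta\partial_{\tilde w}$$
meets $\mathbb{U}$ in a single point if $|\zeta| \neq 1$; but if $|\zeta| = 1$, then the intersection is a real
$\alpha$-plane, that is a real totally null 2-plane with self-dual tangent bivector, given
by an equation of the form
$$e^{i\theta/2}w + e^{-i\theta/2}\bar z = \kappa$$
where $e^{i\theta} = \zeta$, and $\kappa \in \mathbb{C}$. If we write $\kappa = \alpha + i\beta$, then
$$\alpha, \quad \beta, \quad \cos(\tfrac{1}{2}\theta), \quad \sin(\tfrac{1}{2}\theta)$$
are homogeneous coordinates on $\mathbb{RP}^3$.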

The global correspondence and compactified Minkowski space

In §2.4, we showed that the Klein quadric $\mathbb{CM}^\#$ is a compactification of complex
space-time: the additional points are those on a `light-cone at infinity', with
vertex $I$. In the twistor picture, the extra points are represented by lines in
twistor space that meet $I$, and the correspondence between lines in $\mathbb{CP}^3$ and
points of compactified space-time is the classical Klein correspondence, which
goes as follows.
Each line $\hat x \subset \mathbb{CP}^3$ is associated with a bivector $X^{\alpha\beta} \in \Lambda^2\mathbb{C}^4 = \mathbb{C}^6$, which is
uniquely determined by the line up to scale: if $Z^\alpha$ and $\tilde Z^\alpha$ are the homogeneous
coordinates of distinct points of $\hat x$, then
$$X^{\alpha\beta} = Z^{[\alpha}\tilde Z^{\beta]}.$$

Conversely, every bivector such that

$$X^{[\alpha\beta}X^{\gamma\delta]} = 0$$
is simple and so determines a unique line. We therefore have a one-to-one
correspondence between lines in $\mathbb{CP}^3$ and points of the Klein quadric $\mathbb{CM}^\# \subset \mathbb{CP}^5$,
which we defined by this equation in §2.4.
If $\hat x$ is a line corresponding to the finite point of space-time with coordinates
$w, z, \tilde w, \tilde z$, then we can take
$$(Z^\alpha) = (\tilde z, \tilde w, 1, 0), \qquad (\tilde Z^\alpha) = (w, z, 0, 1),$$
which are the homogeneous coordinates of two $\alpha$-planes through the point. Thus
under the Klein correspondence, the finite points of space-time are given by
$$(X^{\alpha\beta}) = \begin{pmatrix} 0 & s & -w & \tilde z\\ -s & 0 & -z & \tilde w\\ w & z & 0 & 1\\ -\tilde z & -\tilde w & -1 & 0\end{pmatrix}, \qquad s = z\tilde z - w\tilde w;$$
and the points at infinity are those for which $\varepsilon_{\alpha\beta\gamma\delta}X^{\alpha\beta}I^{\gamma\delta} = 0$, where
$$(I^{\alpha\beta}) = \begin{pmatrix} 0 & 1 & 0 & 0\\ -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\end{pmatrix}$$
and $\varepsilon_{\alpha\beta\gamma\delta}$ is the four-dimensional alternating symbol.⁴ Each $Z^\alpha$ determines a
plane $Z \subset \mathbb{CM}^\#$ by
$$X^{[\alpha\beta}Z^{\gamma]} = 0.$$

If the corresponding point of $\mathbb{CP}^3$ does not lie on $I$, that is $(Z^2, Z^3) \neq 0$, then $Z$
intersects $\mathbb{CM} \subset \mathbb{CM}^\#$ in an $\alpha$-plane. Otherwise, $Z$ consists entirely of points
at infinity, that is, it is an $\alpha$-plane in the null cone at infinity.
Real forms of the global correspondence
The conjugations of $\mathbb{T}$ induce conjugations of $\Lambda^2\mathbb{T} = \mathbb{C}^6$, which pick out copies of $\mathbb{R}^6$ on which the symmetric form $\varepsilon_{\alpha\beta\gamma\delta}x^{\alpha\beta}x^{\gamma\delta}$ is real. If the real space-time metric has signature $(p, q)$, then the real symmetric form has signature $(p+1, q+1)$, that is $(1,5)$ in the Euclidean case, $(2,4)$ in the Lorentzian case, and $(3,3)$ in the ultrahyperbolic case. In each case, the corresponding real slice in $\mathbb{CM}^\#$ is given by forming the intersection of the unit sphere in $\mathbb{R}^6$ with the null cone
\[ \varepsilon_{\alpha\beta\gamma\delta}x^{\alpha\beta}x^{\gamma\delta} = 0, \]
and then by taking the quotient by the $\mathbb{Z}_2$ action $x^{\alpha\beta} \mapsto -x^{\alpha\beta}$. The resulting compactification of the real space-time has topology $S^p \times S^q/\mathbb{Z}_2$. In the Euclidean case, the topology is $S^4$.
Explicit formulas for the compactification of U
If we take the second representation of the ultrahyperbolic conjugation, then the real subspace $\mathbb{R}^6 \subset \Lambda^2\mathbb{C}^4$ is the set of bivectors with real components $x^{\alpha\beta}$. It follows that the compactification of $\mathbb{U}$ is the quotient of the set of nonzero real decomposable bivectors by the equivalence $x^{\alpha\beta} \sim r x^{\alpha\beta}$, $r \in \mathbb{R} - \{0\}$. That is, it is the real Klein quadric, the space of real projective lines in $\mathbb{RP}^3$. We can write any real bivector in the form
\[ (x^{\alpha\beta}) = \begin{pmatrix} 0 & p_1 + q_1 & p_2 + q_2 & p_3 + q_3 \\ -p_1 - q_1 & 0 & p_3 - q_3 & -p_2 + q_2 \\ -p_2 - q_2 & -p_3 + q_3 & 0 & p_1 - q_1 \\ -p_3 - q_3 & p_2 - q_2 & -p_1 + q_1 & 0 \end{pmatrix} \]
where $p, q \in \mathbb{R}^3$. Then $x^{\alpha\beta}$ is decomposable, that is, has vanishing determinant, if and only if $p\cdot p = q\cdot q$. If we scale $p, q$ so that $p\cdot p = 1$, then we have a representation of the compactification as the quotient of the set of pairs of unit vectors $(p, q)$ by the involution
\[ (p, q) \mapsto (-p, -q). \]
Thus the topology of the conformal compactification is $S^2 \times S^2/\mathbb{Z}_2$.
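The decomposability criterion can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper names `bivector` and `pfaffian4` are illustrative and do not come from the text. It builds the antisymmetric matrix from $p$ and $q$ and confirms that its Pfaffian is $p\cdot p - q\cdot q$, so that the determinant (the square of the Pfaffian) vanishes exactly when $p\cdot p = q\cdot q$.

```python
import numpy as np

def bivector(p, q):
    """Antisymmetric 4x4 array x[alpha, beta] built from p, q in R^3."""
    p1, p2, p3 = p
    q1, q2, q3 = q
    return np.array([
        [0.0,       p1 + q1,  p2 + q2,  p3 + q3],
        [-p1 - q1,  0.0,      p3 - q3, -p2 + q2],
        [-p2 - q2, -p3 + q3,  0.0,      p1 - q1],
        [-p3 - q3,  p2 - q2, -p1 + q1,  0.0],
    ])

def pfaffian4(x):
    """Pfaffian of an antisymmetric 4x4 matrix; its square is the determinant."""
    return x[0, 1] * x[2, 3] - x[0, 2] * x[1, 3] + x[0, 3] * x[1, 2]

rng = np.random.default_rng(0)
p, q = rng.normal(size=3), rng.normal(size=3)
x = bivector(p, q)

# Generic p, q: the Pfaffian is p.p - q.q and det = Pf^2.
pf_generic = pfaffian4(x)
det_generic = np.linalg.det(x)

# Unit vectors: p.p = q.q = 1, so the bivector is decomposable (rank 2).
p_unit, q_unit = p / np.linalg.norm(p), q / np.linalg.norm(q)
x_unit = bivector(p_unit, q_unit)
rank_unit = np.linalg.matrix_rank(x_unit, tol=1e-9)
```

The involution $(p, q) \mapsto (-p, -q)$ simply changes the sign of the bivector, which is why it acts trivially on the projective quadric.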
The conformal structure is found by making stereographic projections from the $p$ and $q$ spheres onto two copies of the complex plane. We then have
\[ ds^2 = \frac{dv\,d\bar v}{(1+v\bar v)^2} - \frac{du\,d\bar u}{(1+u\bar u)^2}, \]
where $u$ and $v$ are affine coordinates on the two spheres, and the involution is given by the two antipodal maps, that is
\[ (u, v) \mapsto (-1/\bar u, -1/\bar v). \]
Here $w, z, \tilde w, \tilde z$ are double-null coordinates in which the conjugation is given by (U)1, that is, $\tilde w = \bar w$ and $\tilde z = \bar z$ on $\mathbb{U}$. The points such that $|uv| = 1$ are at infinity in $\mathbb{U}$; if we excise them, then the conformal mapping onto $\mathbb{U}$ is given by
\[ (v, u) \mapsto (w, z) = \frac{1}{1-|uv|^2}\bigl(u(1+|v|^2),\ v(1+|u|^2)\bigr). \]
We note that $(w, z)$ is unchanged by the involution; also, when $v = 0$, we have $w = u$, and when $u = 0$, we have $z = v$. Therefore the first sphere in the product $S^2 \times S^2$, less its point at infinity, is mapped holomorphically to the plane $z = 0$, and the second sphere is similarly mapped to the 2-plane $w = 0$.
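The invariance of $(w, z)$ under the involution is easy to confirm numerically. The sketch below assumes the reading $(w, z) = (1-|uv|^2)^{-1}\bigl(u(1+|v|^2),\ v(1+|u|^2)\bigr)$ of the mapping and takes the involution to be $(u, v) \mapsto (-1/\bar u, -1/\bar v)$; the function names are hypothetical.

```python
def to_double_null(u, v):
    """Map affine coordinates (u, v) on the two spheres to (w, z); needs |u v| != 1."""
    denom = 1.0 - abs(u * v) ** 2
    w = u * (1.0 + abs(v) ** 2) / denom
    z = v * (1.0 + abs(u) ** 2) / denom
    return w, z

def antipodal(u, v):
    """The involution given by the antipodal map on each sphere."""
    return -1.0 / u.conjugate(), -1.0 / v.conjugate()

u, v = 0.3 + 0.4j, -1.2 + 0.5j          # sample point with |u v| != 1
w, z = to_double_null(u, v)
w_inv, z_inv = to_double_null(*antipodal(u, v))

# On the first sphere (v = 0) the map reduces to w = u, z = 0.
w_first, z_first = to_double_null(u, 0j)
```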
The real $\alpha$-planes in $S^2 \times S^2/\mathbb{Z}_2$ are given by equations of the form
\[ u = \frac{a\bar v + b}{\bar b\bar v - \bar a}, \qquad (9.2.3) \]
where $a\bar a + b\bar b = 1$, that is, they are the graphs of orientation-reversing isometries $S^2 \to S^2$, projected into the quotient space by $\mathbb{Z}_2$; it is immediate that every such graph determines a totally null 2-surface, and that it is invariant under the involution. The real and imaginary parts of $a$ and $b$ are homogeneous coordinates on $\mathbb{RP}^3$, so the whole of $\mathbb{RP}^3$ is the real twistor space of the compactified space-time.
At each point of the compactification at which $u$ and $v$ are finite, we can define a null tetrad by putting
\[ W = (1+u\bar u)\partial_u, \quad Z = (1+v\bar v)\partial_v, \quad \tilde W = (1+u\bar u)\partial_{\bar u}, \quad \tilde Z = (1+v\bar v)\partial_{\bar v}. \]
Then the $\alpha$-plane defined by (9.2.3) has tangent bivector $L^{[a}M^{b]}$, where
\[ L = W - e^{i\theta}\tilde Z, \qquad M = Z - e^{i\theta}\tilde W, \qquad e^{i\theta} = \frac{\bar b\bar v - \bar a}{bv - a}. \qquad (9.2.4) \]
The tetrad is not constant in the coordinates $w, z$ on $\mathbb{U}$, and so the complex parameter $\zeta = e^{i\theta}$ varies from point to point on a given $\alpha$-plane. However, at each point of space-time, $\zeta$ is related to the original spectral parameter by a Möbius transformation that preserves the unit circle.

9.3 BIRKHOFF'S FACTORIZATION THEOREM

Let $F(\theta)$ be a smooth complex-valued function on the unit circle $S^1 = \{\zeta = e^{i\theta}\}$ in the complex $\zeta$-plane. By expanding $F$ in a Fourier series, and by separating the positive and negative powers of $\zeta$, we can write $F = f - \tilde f$, where
\[ f = \sum_{j=0}^{\infty} a_j\zeta^j, \qquad \tilde f = \sum_{j=0}^{\infty} \tilde a_j\zeta^{-j}. \]
The positive frequency part $f$ is the limit of a holomorphic function on the disc $|\zeta| < 1$ and the negative frequency part $\tilde f$ is the limit of a holomorphic function $\tilde f$ on the exterior $|\zeta| > 1$, including the point $\zeta = \infty$, where it is regular as a function of $\tilde\zeta = \zeta^{-1}$. This splitting of $F$ into the difference of $f$ and $\tilde f$ is unique, apart from the freedom to apportion the constant term in the Fourier series between $f$ and $\tilde f$; that is, it is unique up to $f \mapsto f + c$, $\tilde f \mapsto \tilde f + c$ for $c \in \mathbb{C}$.
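For a loop sampled at equally spaced points of the circle, the additive splitting can be carried out with a discrete Fourier transform. The following is a minimal sketch, assuming NumPy and a loop with rapidly decaying Fourier coefficients; the function name `additive_split` is illustrative, and the constant term is assigned to $f$ (one of the permitted choices).

```python
import numpy as np

def additive_split(F_samples):
    """Split samples of F on the unit circle as F = f - f_tilde.

    f keeps the modes zeta**j with j >= 0 (including the constant term);
    f_tilde is minus the sum of the modes with j < 0.
    """
    N = len(F_samples)
    coeffs = np.fft.fft(F_samples) / N            # coeffs[j] multiplies zeta**j
    freqs = np.fft.fftfreq(N, d=1.0 / N)          # integer mode numbers
    f = N * np.fft.ifft(np.where(freqs >= 0, coeffs, 0.0))
    f_tilde = -N * np.fft.ifft(np.where(freqs < 0, coeffs, 0.0))
    return f, f_tilde

N = 64
zeta = np.exp(2j * np.pi * np.arange(N) / N)
F = 2.0 + zeta + 3.0 * zeta**-2                   # a simple test loop
f, f_tilde = additive_split(F)
```

For this test loop the positive part is $2 + \zeta$ and the negative part is $-3\zeta^{-2}$, which is regular at $\zeta = \infty$.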
Riemann-Hilbert problems
A Riemann-Hilbert problem is to find an analogous splitting when $F$ takes values not in the additive group of complex numbers, but in some more general complex Lie group. In the case of the multiplicative group $\mathbb{C}^*$, the problem is as follows. Given a smooth nonvanishing function $F$ on the unit circle, we must find smooth nonvanishing functions $f$ and $\tilde f$ on $|\zeta| \le 1$ and $|\zeta| \ge 1$, respectively, such that $f$ is holomorphic for $|\zeta| < 1$, $\tilde f$ is holomorphic for $|\zeta| > 1$ (including $\zeta = \infty$), and $F = \tilde f^{-1}f$ on $S^1$. In contrast to the additive case, this does not always have a solution. If it does, then
\[ \oint_{S^1}\frac{dF}{F} = \oint_{S^1}\frac{df}{f} - \oint_{S^1}\frac{d\tilde f}{\tilde f} = 0, \]
by Cauchy's theorem. Thus a factorization exists only if the winding number
\[ k = \frac{1}{2\pi i}\oint_{S^1}\frac{dF}{F} \in \mathbb{Z} \]
vanishes. In this case, $\log F$ is single valued, and we can construct $f$ and $\tilde f$ by splitting its Fourier series, and then exponentiating.
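The scalar construction can be sketched numerically: compute the winding number $k$, remove it by multiplying by $\zeta^{-k}$ (the product then has zero winding number), split the logarithm into positive and negative frequencies, and exponentiate. The code below is an illustration under the assumption that the loop is sampled finely enough for the phase to be unwrapped reliably; the function names are hypothetical.

```python
import numpy as np

def winding_number(F_samples):
    """Winding number about 0 of a nonvanishing loop sampled at equally spaced points."""
    closed = np.append(F_samples, F_samples[0])
    phase = np.unwrap(np.angle(closed))
    return int(np.rint((phase[-1] - phase[0]) / (2.0 * np.pi)))

def scalar_birkhoff(F_samples):
    """Return (k, f, f_tilde) with F = f_tilde**-1 * zeta**k * f on the sample points."""
    N = len(F_samples)
    zeta = np.exp(2j * np.pi * np.arange(N) / N)
    k = winding_number(F_samples)
    G = F_samples * zeta ** (-k)                               # zero winding number
    log_G = np.log(np.abs(G)) + 1j * np.unwrap(np.angle(G))    # continuous branch
    coeffs = np.fft.fft(log_G)
    freqs = np.fft.fftfreq(N, d=1.0 / N)
    log_pos = np.fft.ifft(np.where(freqs >= 0, coeffs, 0.0))   # holomorphic in the disc
    log_neg = np.fft.ifft(np.where(freqs < 0, coeffs, 0.0))    # holomorphic outside
    return k, np.exp(log_pos), np.exp(-log_neg)

N = 512
zeta = np.exp(2j * np.pi * np.arange(N) / N)
F = zeta**2 * (2.0 + zeta) * (1.0 + 0.5 / zeta)    # test loop with winding number 2
k, f, f_tilde = scalar_birkhoff(F)
```

In this example the factor $2 + \zeta$ is nonvanishing on the disc and $1 + \tfrac12\zeta^{-1}$ is nonvanishing outside it, so the computed $f$ should reproduce $2 + \zeta$.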
Whatever the value of $k$, $\zeta^{-k}F$ has zero winding number, and can therefore be factorized. Thus a nonvanishing smooth function on the circle can always be written
\[ F = \tilde f^{-1}\zeta^k f \]
where $k$ is the winding number, and $f$ and $\tilde f$ are nonvanishing holomorphic functions on the inside and the outside of the circle, respectively. Birkhoff's theorem extends this result to other Lie groups. We shall use the theorem in the form proved by Pressley and Segal (1986) for $GL(n,\mathbb{C})$. To state it, we need some definitions. We denote the loop group of $GL(n,\mathbb{C})$ by $LGL(n,\mathbb{C})$. This is the group of smooth maps or loops
\[ F\colon S^1 \to GL(n,\mathbb{C}) \]
under pointwise matrix multiplication. The subsets of loops that are boundary values of holomorphic maps on
\[ \{|\zeta| < 1\} \quad\text{and}\quad \{|\zeta| > 1\} \cup \{\infty\}, \]
respectively, will be denoted by $LGL_+(n,\mathbb{C})$ and $LGL_-(n,\mathbb{C})$. The loop group is an infinite-dimensional Lie group. As a manifold, it is modelled on the topological vector space $E$ of smooth maps $A\colon S^1 \to gl(n,\mathbb{C})$, with the topology of uniform convergence in $A$ and its $k$th derivative for each $k$. Charts are defined by mapping small neighbourhoods of the origin in $E$ to neighbourhoods of loops $F \in LGL(n,\mathbb{C})$ by $A \mapsto F\exp A$ (these define both the manifold structure and the topology of the loop group).
Theorem 9.3.1 Birkhoff's factorization theorem. Any loop $F \in LGL(n,\mathbb{C})$ can be factorized
\[ F = \tilde f^{-1}\Delta f \]
where $f \in LGL_+(n,\mathbb{C})$, $\tilde f \in LGL_-(n,\mathbb{C})$, and $\Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n})$ for some $k_i \in \mathbb{Z}$. The $k_i$s are unique up to permutation. The loops for which $\Delta = 1$ are a dense open subset of the identity component of $LGL(n,\mathbb{C})$, and for these loops the factorization is unique up to $f \mapsto cf$, $\tilde f \mapsto c\tilde f$ for some constant $c \in GL(n,\mathbb{C})$.
The final statement in the theorem is a consequence of Liouville's theorem. For if $F = \tilde f^{-1}f$ and $F = \tilde g^{-1}g$ are two factorizations, then, with $c$ defined by
\[ c = gf^{-1} = \tilde g\tilde f^{-1}, \]
we have that $c$ is a global holomorphic map from the Riemann sphere into $GL(n,\mathbb{C})$: the first equality shows that $c$ is holomorphic inside the disc, the second that it is holomorphic outside the disc. Therefore $c$ is constant. We shall use this argument many times.
Pressley and Segal explain the extension of the theorem to more general loop groups. We shall not use their wider results, other than to note that we can replace $GL(n,\mathbb{C})$ by $SL(n,\mathbb{C})$ and require all the matrices in the statement of the theorem to have unit determinant (so that, in particular, $\sum k_i = 0$); and to note that the theorem remains true if we work with polynomials in $\zeta$ and $\zeta^{-1}$, rather than holomorphic functions, or with rational functions of $\zeta$, or with analytic functions of $\zeta$.

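Example 9.3.2 Let $w \in \mathbb{C}$ and put
\[ F = \begin{pmatrix} \zeta & w \\ 0 & \zeta^{-1} \end{pmatrix}. \]
Then whenever $w \neq 0$, we have the Birkhoff factorization $F = \tilde f^{-1}f$, where
\[ \tilde f = \begin{pmatrix} \zeta^{-1} & -w \\ w^{-1} & 0 \end{pmatrix}, \qquad f = \begin{pmatrix} 1 & 0 \\ w^{-1}\zeta & 1 \end{pmatrix}. \]
For $w = 0$, however, the factorization is $F = \tilde f^{-1}\Delta f$, with $f = \tilde f = 1$, and $\Delta = \mathrm{diag}(\zeta, \zeta^{-1})$.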
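A factorization of this kind can be verified pointwise on the unit circle. The sketch below takes $F$ to be the upper triangular loop with diagonal entries $\zeta$, $\zeta^{-1}$ and off-diagonal entry $w$, and uses one valid choice of factors for $w \neq 0$, namely $\tilde f$ with rows $(\zeta^{-1}, -w)$, $(w^{-1}, 0)$ and $f$ with rows $(1, 0)$, $(w^{-1}\zeta, 1)$; since factors are only determined up to a constant matrix, this normalization is an assumption rather than a canonical form.

```python
import numpy as np

def F(zeta, w):
    return np.array([[zeta, w], [0.0, 1.0 / zeta]], dtype=complex)

def f(zeta, w):
    """Entries polynomial in zeta: holomorphic on the disc, with det f = 1."""
    return np.array([[1.0, 0.0], [zeta / w, 1.0]], dtype=complex)

def f_tilde(zeta, w):
    """Entries polynomial in 1/zeta: holomorphic outside the disc, with det = 1."""
    return np.array([[1.0 / zeta, -w], [1.0 / w, 0.0]], dtype=complex)

w = 0.7 - 0.2j                                      # any nonzero value
errors = []
for theta in np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False):
    zeta = np.exp(1j * theta)
    product = np.linalg.inv(f_tilde(zeta, w)) @ f(zeta, w)
    errors.append(np.abs(product - F(zeta, w)).max())
max_error = max(errors)
```

Both factors involve $w^{-1}$, which makes visible why this particular factorization cannot be continued to $w = 0$.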
Example 9.3.3 Suppose that $F = CR$, where $C\colon \mathbb{C} \to GL(n,\mathbb{C})$ is entire and $R$ is a rational matrix-valued function of $\zeta$. If $R = 1$, then we have a Birkhoff factorization with $\tilde f = \Delta = 1$, $f = C$. There is a similar trivial factorization when all the poles of $R$ and all the zeros of $r = \det R$ lie outside the unit circle. We shall consider the opposite extreme, that they all lie inside the circle. Then, in general, we can construct the factorization with $\Delta = 1$ explicitly (the qualification excludes singular special cases). We shall use this factorization in the construction of solutions from the Segal-Wilson ansatz.
The determinant $r = \det R$ must have equal numbers of poles and zeros in the unit disc, otherwise $\det F$ has nonzero winding number and a factorization with $\Delta = 1$ is not possible. So we assume that
\[ r = \prod_{i=1}^{k}\frac{\zeta - \alpha_i}{\zeta - \beta_i}, \]
where $|\alpha_i| < 1$ and $|\beta_i| < 1$, and that $R$ is holomorphic except at the points $\beta_i$. Furthermore, we assume that, for each $i = 1, 2, \dots, k$
(a) $A_i = R(\alpha_i)$ has rank $n - 1$,
(b) $B_i = \lim_{\zeta\to\beta_i}(\zeta - \beta_i)R(\zeta)$ exists and has rank 1.
These hold for almost all choices of $R$. For each $i$, we choose nonzero $a_i, b_i \in \mathbb{C}^n$ such that $a_i^tA_i = 0$ and such that $b_i \in \mathbb{C}^n$ lies in the image of $B_i$. Here the `t' denotes the transpose; note that $i$ labels the different vectors in $\mathbb{C}^n$, and not the components of a single vector. The factorization is constructed by taking $\tilde f$ to be of the form
\[ \tilde f = 1 + \sum_{i=1}^{k}\frac{x_iy_i^t}{\zeta - \alpha_i}, \]
where $x_i, y_i \in \mathbb{C}^n$. We must choose $x_i$ and $y_i$ so that $f = \tilde fCR$ is holomorphic and nonsingular everywhere inside the unit circle. For $f$ to be holomorphic at the poles and zeros of $r$, we must have for each $j$ that
\[ x_jy_j^tC(\alpha_j)A_j = 0, \qquad \Bigl(1 + \sum_{i=1}^{k}\frac{x_iy_i^t}{\beta_j - \alpha_i}\Bigr)C(\beta_j)B_j = 0. \]
These we can satisfy by putting $y_j^t = a_j^tC(\alpha_j)^{-1}$ and by choosing the $x_i$s so that
\[ C(\beta_j)b_j + \sum_{i=1}^{k}x_iM_{ij} = 0 \]
where $M$ is the $k \times k$ matrix
\[ M_{ij} = \frac{y_i^tC(\beta_j)b_j}{\beta_j - \alpha_i}. \]
We must, of course, make the further assumption that $M$ is nonsingular. The effective freedom that we then have is to rescale the $a_i$s and $b_i$s; but this leaves $\tilde f$ unaltered. Thus $\tilde f$ is uniquely determined by $C$ and by the data consisting of the points $\alpha_i$, $\beta_i$, together with the one-dimensional subspaces of $\mathbb{C}^n$ spanned by the vectors $a_i$, $b_i$.
The form of $\tilde f$ implies that $\det\tilde f = p_k(\zeta)/q_k(\zeta)$, where $p_k$ and $q_k$ are polynomials of degree $k$. On the other hand, by construction, $\det\tilde f$ has poles at the points $\alpha_i$ and zeros at the points $\beta_i$. Therefore $\det\tilde f$ is a constant multiple of $r^{-1}$. It follows that
\[ \det f = \det(\tilde f)\det(C)\,r \]
is nonzero throughout the unit disc.
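The construction can be tried out in the smallest nontrivial case. The sketch below is an illustration only: it assumes $n = 2$, $k = 1$, a particular $R = 1 + (\beta - \alpha)uv^t/(\zeta - \beta)$ with $v^tu = 1$ (so that $\det R = (\zeta - \alpha)/(\zeta - \beta)$), and a simple entire $C$; all numerical values and names are hypothetical choices, and $y^t = a^tC(\alpha)^{-1}$ is used for the row vector in $\tilde f$. Holomorphy of $f = \tilde fCR$ inside the disc is tested by checking that its negative Fourier modes on the unit circle vanish.

```python
import numpy as np

alpha, beta = 0.3 + 0.1j, -0.2 + 0.4j        # zero and pole of det R, inside the disc
u = np.array([1.0, 2.0], dtype=complex)
v = np.array([0.5, 0.25], dtype=complex)     # chosen so that v.u = 1

def C(zeta):
    """An entire matrix function with det C = 1 (illustrative choice)."""
    return np.array([[1.0, zeta], [0.0, 1.0]], dtype=complex)

def R(zeta):
    """Rational loop with R(alpha) = 1 - u v^t (rank 1) and residue (beta - alpha) u v^t."""
    return np.eye(2) + (beta - alpha) * np.outer(u, v) / (zeta - beta)

a, b = v, u                                   # a^t R(alpha) = 0; b spans the image of B
y = a @ np.linalg.inv(C(alpha))               # y^t = a^t C(alpha)^{-1}
M = (y @ C(beta) @ b) / (beta - alpha)        # the 1 x 1 matrix M_11
x = -(C(beta) @ b) / M                        # solves C(beta) b + x M = 0

def f_tilde(zeta):
    return np.eye(2) + np.outer(x, y) / (zeta - alpha)

def f(zeta):
    return f_tilde(zeta) @ C(zeta) @ R(zeta)

N = 256
circle = np.exp(2j * np.pi * np.arange(N) / N)
samples = np.array([f(z) for z in circle])    # shape (N, 2, 2)
coeffs = np.fft.fft(samples, axis=0) / N
freqs = np.fft.fftfreq(N, d=1.0 / N)
negative_modes = np.abs(coeffs[freqs < 0]).max()
det_inside = np.linalg.det(f(0.1 + 0.0j))
```

With these choices $\det\tilde f = (\zeta - \beta)/(\zeta - \alpha)$, so $\det f$ should equal 1 at every point of the disc.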

Jumping points
It is a consequence of the way in which the theorem is proved by Pressley and Segal that, if we are given a loop $F(w, \zeta)$ depending smoothly on some additional parameters $w = (w_1, w_2, \dots)$, and if a factorization with $\Delta = 1$ exists at some $w$, then a factorization with $\Delta = 1$ exists in an open neighbourhood of $w$, and the factors $f$ and $\tilde f$ can be chosen to depend smoothly on the parameters. The same is true with `smooth' replaced by `holomorphic' in the case that $F$ depends holomorphically on $\zeta$ (in a neighbourhood of the unit circle) and on complex parameters $w_i$. As we try to extend the $\Delta = 1$ factorization throughout the parameter space, the typical behaviour is that it fails on a submanifold of codimension 1, on which $\Delta$ `jumps' to a value other than the identity. The more that the set of integers $k_i$ differs from zero, the larger the codimension of the set on which $\Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n})$.
In the Ward construction, the parameters are coordinates on space-time and the jumping points give rise to singularities in the ASDYM potential. In the holomorphic case, the jumping singularities are at worst poles, as we shall deduce from the following proposition. In the statement, $V$ and $\tilde V$ form a two-set open cover of the Riemann sphere; $V$ is a neighbourhood of $\zeta = 0$, $\tilde V$ is a neighbourhood of $\zeta = \infty$, and $A = V \cap \tilde V$ is an annulus in the complex plane containing the unit circle.

Proposition 9.3.4 Ward (1984a). Let $W$ be an open ball in $\mathbb{C}^k$ and let
\[ F\colon W \times A \to GL(n,\mathbb{C}) \]
be holomorphic. Suppose that for some point of $W$, there is a Birkhoff factorization of $F$ as a function of $\zeta$ with $\Delta = 1$. Then there exist holomorphic maps $f$, $\tilde f$ from $W \times V$ and $W \times \tilde V$, respectively, into the $n \times n$ matrices such that
(i) $\tilde fF = f$ on $W \times A$; and
(ii) for almost all $w \in W$, $\det f \neq 0$ and $\det\tilde f \neq 0$ on $V$ and $\tilde V$, respectively.
The proof involves a reinterpretation of a theorem of Grauert and Remmert (1958) on coherent analytic sheaves. We shall indicate how it goes in §9.4.
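Example 9.3.5 In Example 9.3.2, we take $W = \mathbb{C}$ and put
\[ \tilde f = \begin{pmatrix} 1 & 0 \\ \zeta^{-1} & -w \end{pmatrix}, \qquad f = \begin{pmatrix} \zeta & w \\ 1 & 0 \end{pmatrix}. \]
These satisfy (i) in the statement of the proposition: for all values of $w$ (including $w = 0$), we have $\tilde fF = f$. They form a Birkhoff factorization everywhere except at $w = 0$, where $f$ is not invertible.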
Since the jumping points give rise to singularities, we should like to know how
to avoid them. One way is to choose F to be close to the identity for all values
of the parameters. We also have the following result.
Proposition 9.3.6 Gohberg and Krein (1958). Suppose that $F \in LGL(n,\mathbb{C})$ and that $F + F^\dagger$ is positive definite. Then $\Delta = 1$.
Proof Put
\[ A = f\tilde f^\dagger, \qquad P = \tilde f(F + F^\dagger)\tilde f^\dagger, \]
and, as before, write $\Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n})$. Then $P$ is positive definite and
\[ P = \Delta A + A^\dagger\Delta^\dagger. \]
Now the Fourier series of the entries in $A$ and $A^\dagger$ contain, respectively, no negative and no positive powers of $\zeta$, while the diagonal entries in $P$ are positive real functions. It follows that $k_i \le 0$ for each $i$. A similar argument applied to $(f^{-1})^\dagger(F + F^\dagger)f^{-1}$ gives $k_i \ge 0$.

9.4 HOLOMORPHIC VECTOR BUNDLES: THE ČECH DESCRIPTION

A holomorphic vector bundle $E$ on a complex manifold $M$ can be described in two ways. First, in terms of its patching data, the Čech description, and second in terms of its $\bar\partial$-operator, the Dolbeault description. We shall look at patching data in this section, and at $\bar\partial$-operators in the next.
Patching data
The patching data are the patching matrices or transition functions between local holomorphic trivializations (the latter term is more appropriate for line bundles). The manifold is covered by open sets $V_\sigma$, and on each $V_\sigma$, there is given a holomorphic frame field $e_{\sigma i}$ ($i = 1, \dots, n$, where $n$ is the rank of the bundle). On the nonempty intersections,
\[ (e_{\tau 1}, \dots, e_{\tau n}) = (e_{\sigma 1}, \dots, e_{\sigma n})F_{\sigma\tau} \]
for some holomorphic map
\[ F_{\sigma\tau}\colon V_\sigma \cap V_\tau \to GL(n,\mathbb{C}). \]
We call $F_{\sigma\tau}$ the patching matrix from $V_\sigma$ to $V_\tau$. The patching data satisfy three conditions:
(a) each patching matrix is holomorphic and nonsingular;
(b) $F_{\sigma\tau} = F_{\tau\sigma}^{-1}$ whenever $V_\sigma \cap V_\tau \neq \emptyset$;
(c) $F_{\sigma\tau}F_{\tau\nu}F_{\nu\sigma} = 1$ whenever $V_\sigma \cap V_\tau \cap V_\nu \neq \emptyset$.
Any collection of patching matrices satisfying these determines a holomorphic bundle.
Two holomorphic bundles $E$ and $E'$ are equivalent if there exists a biholomorphic map $E \to E'$ that sends the fibres of $E$ linearly onto the corresponding fibres of $E'$. Such a map exists if and only if there exist systems of local trivializations for $E$ and $E'$, with the same open sets $V_\sigma$, such that their patching matrices are related by
\[ F'_{\sigma\tau} = h_\sigma^{-1}F_{\sigma\tau}h_\tau \]
for some family of holomorphic maps $h_\sigma\colon V_\sigma \to GL(n,\mathbb{C})$. In particular, $E$ is trivial (equivalent to a product bundle) if and only if its patching matrices can be factorized in the form $F_{\sigma\tau} = h_\sigma^{-1}h_\tau$.

Bundles over $\mathbb{CP}^1$

Birkhoff's theorem has an obvious interpretation in this context as a statement about holomorphic vector bundles on the Riemann sphere. Suppose that $V$, $\tilde V$ form a two-set open cover of the Riemann sphere, with $V$ a neighbourhood of $\zeta = 0$, $\tilde V$ a neighbourhood of $\zeta = \infty$, and $A = V \cap \tilde V$ an annular neighbourhood of the unit circle. Any holomorphic vector bundle over an open subset of the complex plane is necessarily trivial; thus any holomorphic bundle $E \to \mathbb{CP}^1$ is determined by a holomorphic function $F\colon A \to GL(n,\mathbb{C})$, namely the patching matrix from $V$ to $\tilde V$, which is defined by
\[ (e_1, \dots, e_n) = (\tilde e_1, \dots, \tilde e_n)F, \]
where the sections $e_1, \dots, e_n$ and $\tilde e_1, \dots, \tilde e_n$ form holomorphic frame fields in $V$ and $\tilde V$ respectively. A general local holomorphic section $s$ has components $s_i$ in the trivialization over $V$ and components $\tilde s_i$ in the trivialization over $\tilde V$. On the overlap,
\[ \tilde s_i = F_{ij}s_j, \]
with summation, where the $F_{ij}$s are the entries in $F$. Applied to $F$, the factorization theorem can be read as the assertion that $E$ is equivalent to the bundle with patching matrix $\Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n})$ for some integers $k_i$.

Homogeneous functions
The line bundle with transition function $\zeta^{-k}$ is denoted by $\mathcal{O}(k)$, and it has a natural interpretation in terms of the geometry of $\mathbb{CP}^1$. This is most easily understood by going in the reverse direction, and by using the representation of the Riemann sphere as the projective line to construct a family of line bundles with transition functions $\zeta^{-k}$. If $z^0$ and $z^1$ are linear coordinates on $\mathbb{C}^2$, then $\zeta = z^1/z^0$ is an affine (stereographic) coordinate on $\mathbb{CP}^1$. For each value of $\zeta$, including $\zeta = \infty$, we have a one-dimensional subspace $L_\zeta \subset \mathbb{C}^2$. As $\zeta$ varies, the $L_\zeta$s form a line bundle $L \to \mathbb{CP}^1$, which is called the tautological bundle, and which is the same as $\mathcal{O}(-1)$. We can take $V = \{z^0 \neq 0\}$, $\tilde V = \{z^1 \neq 0\}$, and define local trivializations of $L$ over $V$ and $\tilde V$, respectively, by
\[ e = (1, \zeta), \qquad \tilde e = (\zeta^{-1}, 1). \]
Then $e = \zeta\tilde e$, so the transition function is $F = \zeta$. For other integer values of $k$, we define $\mathcal{O}(k)$ by taking the fibre over $\zeta$ to be
\[ \{h\colon L_\zeta - \{0\} \to \mathbb{C} \mid h(tz^0, tz^1) = t^kh(z^0, z^1),\ 0 \neq t \in \mathbb{C}\}. \]
With this definition, there is a natural representation of the holomorphic sections of $\mathcal{O}(k)$ by holomorphic homogeneous functions of $z^0$, $z^1$ of degree $k$, which is the origin of the notation: $\mathcal{O}(k)$ more properly denotes the sheaf of germs of such functions. We can use the sections
\[ e = (z^0)^k, \qquad \tilde e = (z^1)^k \]
to define the trivializations of $\mathcal{O}(k)$ in $V$ and $\tilde V$. Since $e = \zeta^{-k}\tilde e$, the transition function is $\zeta^{-k}$.
A global holomorphic section $h$ of $\mathcal{O}(k)$ is represented locally by holomorphic functions $f\colon V \to \mathbb{C}$ and $\tilde f\colon \tilde V \to \mathbb{C}$, with the transition relation $f = \zeta^k\tilde f$. Now $f$ is a power series in $\zeta$ and $\tilde f$ is a power series in $\zeta^{-1}$. So if $k < 0$, then $f = \tilde f = 0$, and therefore there are no global sections for negative values of $k$. For $k \ge 0$, $f$ is a polynomial of degree at most $k$. By multiplying by $(z^0)^k$, we deduce that $h$ is a homogeneous polynomial of degree $k$ in $z^0$ and $z^1$. For $k \ge 0$, the space of global sections of $\mathcal{O}(k)$ has dimension $k + 1$.
If $x$ and $y$ are independent global sections of $L^{-1} = \mathcal{O}(1)$, then $x/y$ is an affine coordinate on $\mathbb{CP}^1$.

Tangent and cotangent bundles

Note that $\mathcal{O}(k) \otimes \mathcal{O}(k') = \mathcal{O}(k + k')$ and that $\mathcal{O}(k) = L^{-k}$ (the $k$th power of a line bundle is defined by taking the $k$th power of its transition functions). A holomorphic tangent vector field on $\mathbb{CP}^1$ is a section of the holomorphic tangent bundle $T\mathbb{CP}^1$, which is a line bundle because $\dim\mathbb{CP}^1 = 1$. The dual objects, the holomorphic differentials, are the sections of the holomorphic cotangent bundle $T^*\mathbb{CP}^1$, which is also a line bundle. If we put $\tilde\zeta = \zeta^{-1}$, then
\[ \frac{\partial}{\partial\tilde\zeta} = -\zeta^2\frac{\partial}{\partial\zeta}, \qquad d\tilde\zeta = -\zeta^{-2}\,d\zeta. \]
We can absorb the minus signs in these transition relations into the local trivialization on $\tilde V$, and so deduce that
\[ T\mathbb{CP}^1 = \mathcal{O}(2) = L^{-2}, \qquad T^*\mathbb{CP}^1 = \mathcal{O}(-2) = L^2. \]
The unit section of $T\mathbb{CP}^1 \otimes T^*\mathbb{CP}^1 = \mathbb{C}$ is a natural global 1-form on $\mathbb{CP}^1$ with values in $T\mathbb{CP}^1$. We denote it by $\tau$. In the local trivialization, $\tau = d\zeta \otimes \partial/\partial\zeta$; or, in a coordinate-free characterization, $X \lrcorner\, \tau = X$ for any holomorphic tangent $X$.
Grothendieck's theorem
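Theorem 9.4.1 Grothendieck. Let $E \to \mathbb{CP}^1$ be a rank-$n$ holomorphic vector bundle. Then
\[ E = L^{k_1} \oplus \cdots \oplus L^{k_n} = \mathcal{O}(-k_1) \oplus \cdots \oplus \mathcal{O}(-k_n) \]
for some integers $k_1, \dots, k_n$, which are unique up to permutation.
The bundle with patching matrix
\[ \Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n}) \]
is the direct sum $\mathcal{O}(-k_1) \oplus \cdots \oplus \mathcal{O}(-k_n)$. It follows that Birkhoff's theorem implies Grothendieck's theorem.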
Bundles on projective space
In exactly the same way as in the case of the projective line, we define the tautological bundle $L$ and the line bundles $\mathcal{O}(n)$ on the projective spaces $\mathbb{CP}^N$. Again $\mathcal{O}(-k) = L^k$, and again the global sections of $\mathcal{O}(k)$ correspond to holomorphic functions on $\mathbb{C}^{N+1}$, homogeneous of degree $k$. When $N > 1$, the tangent and cotangent bundles are not line bundles, but the top exterior power $\Lambda^NT^*\mathbb{CP}^N$ is a line bundle, and is equivalent to $\mathcal{O}(-N - 1)$. It is called the canonical bundle.
Proof of Ward's proposition
We remark finally on Ward's proof of Proposition 9.3.4, although it takes us into areas outside the scope of this book. In the geometric terms we have just introduced, the parametrized family of patching matrices $F(w, \zeta)$ determines a holomorphic bundle $E \to W \times \mathbb{CP}^1$. Let $\pi\colon W \times \mathbb{CP}^1 \to W$ denote the projection onto the first factor. Grauert and Remmert prove that the direct image under $\pi$ of the sheaf of sections of a holomorphic vector bundle on the product space is coherent analytic. The combination of this result and the fact that $W$ is a Stein manifold implies that there is a finite family of holomorphic sections of the dual bundle $E^*$ with the property that any other holomorphic section of $E^*$ over a set of the form $W' \times \mathbb{CP}^1$, where $W' \subset W$ is open, is a combination of these with holomorphic functions as coefficients. Now a global section of $E^*$ is represented by column vectors $s$ and $\tilde s$ on $W \times V$ and $W \times \tilde V$, respectively, such that
\[ F^t\tilde s = s \]
on the overlap. On the other hand, the fact that $F$ can be factorized with $\Delta = 1$ at $w_0$ implies that it can also be factorized in this way for $w \in W_0$, where $W_0$ is some open neighbourhood of $w_0$. It follows that the restriction of $E$ to $W_0 \times \mathbb{CP}^1$ is trivial, and hence its space of holomorphic sections can be generated by a set of $n$ elements. Therefore, amongst the finite set of generators of sections of $E^* \to W \times \mathbb{CP}^1$, there must be $n$ sections that are independent throughout $W_0 \times \mathbb{CP}^1$, provided that $W_0$ is chosen appropriately. If we assemble the corresponding column vectors $s$ and $\tilde s$ into square matrices $f^t$ and $\tilde f^t$, then $f$ and $\tilde f$ are holomorphic on $W \times V$ and $W \times \tilde V$, respectively, and are nonsingular on $W_0 \times V$ and $W_0 \times \tilde V$. They are therefore nonsingular almost everywhere on $W \times V$ and $W \times \tilde V$, respectively. On the overlap, $\tilde fF = f$.
Jumping lines
A family of holomorphic maps $F(w, \zeta)$ from $W \times A$ to $GL(n,\mathbb{C})$ determines a family of vector bundles over $\mathbb{CP}^1$, labelled by the parameter $w$; alternatively, as above, we can think of this as a single holomorphic bundle $E \to W \times \mathbb{CP}^1$, with the parameters labelling the different copies of $\mathbb{CP}^1$ in the product space. If, at some value of $w$, $F$ has a factorization with $\Delta = 1$, then the restriction of the bundle to the projective line labelled by $w$ will be trivial for almost all values of $w$, but there may be `jumping lines' on which $\Delta \neq 1$ and the restricted bundle is nontrivial. By `nontrivial', we mean `nontrivial as a holomorphic bundle': the restricted bundle remains trivial as a topological or smooth bundle even on the jumping lines. Triviality in the holomorphic category is a more restrictive, and more subtle, property.

9.5 $\bar\partial$-OPERATORS
By taking the real and imaginary parts of holomorphic coordinates, a complex manifold $M$ of dimension $N$ can be represented as a real manifold of dimension $2N$. It is distinguished from a general even-dimensional real manifold by an additional structure, namely the operator $f \mapsto \bar\partial f$, where $\bar\partial f$ denotes the $(0,1)$-part of $df$. This operator is of the form $\bar\partial = \tfrac12(1 + iJ)d$, where $J$ is an almost complex structure, that is, an endomorphism of $TM$ (and $T^*M$) such that $J^2 = -1$, which implies that $J$ has eigenvalues $\pm i$. It has the characteristic property that the local consistency conditions for the linear system $\bar\partial f = 0$ are satisfied. A $2N$-dimensional real manifold on which there is given such an integrable operator has the structure of a complex manifold: the holomorphic functions on $M$ are picked out as the local solutions to $\bar\partial f = 0$.
The $\bar\partial$-operator extends to differential forms on $M$: a smooth $k$-form $\alpha$ on $M$ can be written as a sum of terms of the form
\[ f(z, \bar z)\,dz^{a_1} \wedge \cdots \wedge dz^{a_p} \wedge d\bar z^{b_1} \wedge \cdots \wedge d\bar z^{b_q} \qquad (9.5.1) \]
where $p + q = k$ and the $z^a$s are local holomorphic coordinates. We say that $\alpha$ is of type $(p, q)$ if $p$ and $q$ are the same in all these terms. Clearly any $k$-form can be written uniquely as the sum of forms of types $(0, k), (1, k - 1), \dots, (k, 0)$. If we apply the decomposition to the exterior derivative, then we have $d = \partial + \bar\partial$. For a $(p, q)$-form, $\partial\alpha$ is the $(p + 1, q)$-part of $d\alpha$, and $\bar\partial\alpha$ is the $(p, q + 1)$-part of $d\alpha$. For a function $f$, that is, a 0-form, $\bar\partial f$ is the same as above. If $\alpha$ is given by (9.5.1), then
\[ \partial\alpha = \frac{\partial f}{\partial z^c}\,dz^c \wedge dz^{a_1} \wedge \cdots \wedge dz^{a_p} \wedge d\bar z^{b_1} \wedge \cdots \wedge d\bar z^{b_q}, \]
\[ \bar\partial\alpha = \frac{\partial f}{\partial\bar z^c}\,d\bar z^c \wedge dz^{a_1} \wedge \cdots \wedge dz^{a_p} \wedge d\bar z^{b_1} \wedge \cdots \wedge d\bar z^{b_q}. \]
Clearly $\partial^2 = 0 = \bar\partial^2$ and $\partial\bar\partial = -\bar\partial\partial$.

Forms with values in a bundle

We have the same decomposition when $\alpha$ takes values in a complex vector bundle, and when the bundle is holomorphic, the $\bar\partial$-operator is well defined (although one needs additional structure to define $\partial$). In fact in the same way that one thinks of a complex manifold as a real manifold with a $\bar\partial$-operator on functions, one can think of a holomorphic vector bundle $E \to M$ as a smooth complex vector bundle $E \to M$, together with an operator $\bar\partial_E$ that maps smooth sections of $E$ to $(0, 1)$-forms with values in $E$. In a local holomorphic trivialization, $\bar\partial_E$ is defined component-by-component, by applying $\bar\partial$ to the entries in the column vector representing a local section. It is independent of the choice of trivialization since if $F$ is the holomorphic patching matrix between two local trivializations, then
\[ \begin{pmatrix} \bar\partial\tilde\psi_1 \\ \vdots \\ \bar\partial\tilde\psi_n \end{pmatrix} = F\begin{pmatrix} \bar\partial\psi_1 \\ \vdots \\ \bar\partial\psi_n \end{pmatrix}, \]
where $\tilde\psi = F\psi$.

In a general smooth trivialization,
\[ \bar\partial_E = \bar\partial + \Phi \]
for some matrix-valued $(0, 1)$-form $\Phi$, which undergoes gauge transformations of the form
\[ \Phi \mapsto g^{-1}\Phi g + g^{-1}\bar\partial g \]
under change of smooth trivialization. The characteristic property of $\bar\partial_E$ is the partial flatness condition
\[ \bar\partial\Phi + \Phi \wedge \Phi = 0, \]
where $\bar\partial$ is applied to $\Phi$ entry-by-entry. A smooth complex vector bundle on which there is given an operator $\bar\partial_E = \bar\partial + \Phi$ with this property has the structure of a holomorphic vector bundle. The holomorphic sections are the solutions $s$ to $\bar\partial_E s = 0$, with the partial flatness condition ensuring that there are enough local solutions to determine a trivialization.
The operator $\bar\partial_E$ is a partial connection: it allows us to differentiate sections of $E$ along $(0, 1)$-vectors. Apart from the fact that it is defined only for a restricted class of directions, it has all the properties of a flat connection. It extends in an obvious way to forms with values in $E$. We shall usually drop the subscript, and denote the operator simply by $\bar\partial$.
We say that a form $\alpha$ with values in $E$ is $\bar\partial$-closed whenever $\bar\partial\alpha = 0$, and that it is $\bar\partial$-exact whenever $\alpha = \bar\partial\beta$ for some form $\beta$ with values in $E$. It is clear from the local expression for the $\bar\partial$-operator that every $\bar\partial$-exact form is $\bar\partial$-closed.

9.6 COHOMOLOGY
The linear Penrose transform maps the first holomorphic cohomology group of a
vector bundle over twistor space to the solution space of a linear wave equation
in space-time. We shall not need the full apparatus of cohomology theory to
introduce the transform in the context in which we shall use it, so we merely
sketch here some basic ideas that should be sufficient to explain the few parts of
the general theory that we shall call upon.
Čech cohomology
Suppose that $M$ is a complex manifold, that $E \to M$ is a holomorphic vector bundle, and that $\{V_\tau\}$ is an open cover of $M$, indexed by $\tau$. In the Čech theory, an element of the first cohomology group of $E$ relative to the cover is represented by a map that assigns a holomorphic section $g_{\sigma\tau} \in \Gamma(V_\sigma \cap V_\tau, E)$ to each nonempty intersection, such that
\[ g_{\tau\sigma} = -g_{\sigma\tau}, \qquad g_{\sigma\tau} + g_{\tau\rho} + g_{\rho\sigma} = 0 \]
(the second condition is the cocycle relation). Two such maps $g$ and $g'$ are equivalent, written $g \sim g'$, whenever
\[ g'_{\sigma\tau} - g_{\sigma\tau} = h_\tau - h_\sigma, \]
where $h_\tau$ is a holomorphic section of $E$ over $V_\tau$ (when this holds, $g' - g$ is called a coboundary). The first cohomology group is the quotient of the additive group $\{g_{\sigma\tau}\}$ by this relation. There are similar definitions for the higher cohomology groups: for the $n$th group, the $g$s are sections of $E$ on $(n + 1)$-fold intersections of open sets in the cover. If we replace the cover by a refinement, that is by a second cover every set of which is contained in some set of the original cover, then we can map the cohomology groups of the first cover to the cohomology groups of the second by restricting the $g$s from sets of the first cover to sets of the second cover. The $n$th cohomology group $H^n(M, E)$ of $M$ with coefficients in $E$ is then defined by taking a limit over successive refinements. We shall not be concerned with the details of this construction because it is always possible to choose $\{V_\tau\}$ so that the cohomology groups of the cover coincide with the limits, by choosing the cover so that the $V$s are Stein manifolds (see, for example, Field 1982, I, p. 142). In fact, in every case that we shall consider, it is possible to choose a two-set cover $V$, $\tilde V$ with this property. We can then use the concrete definition
\[ H^1(M, E) = \frac{\Gamma(V \cap \tilde V, E)}{\Gamma(V, E) + \Gamma(\tilde V, E)} \]
where $\Gamma(V, E)$ and $\Gamma(\tilde V, E)$ are mapped into $\Gamma(V \cap \tilde V, E)$ by restriction.
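Example 9.6.1 Suppose that $M = \mathbb{CP}^1$, $E = \mathcal{O}(k)$, and that $V$ and $\tilde V$ are as in §9.4. In the trivialization over $V$, we can represent $g \in \Gamma(V \cap \tilde V, E)$ by a holomorphic function $f$ on $\mathbb{C} - \{0\}$. In the trivialization over $\tilde V$, $g$ is represented by $\tilde f = \zeta^{-k}f$. For $k \ge -1$, the first cohomology group vanishes since if we expand $f$ in a Laurent series
\[ f = \sum_{i=-\infty}^{\infty} f_i\zeta^i, \]
then we can write $f = h - \zeta^k\tilde h$ where
\[ h = \sum_{i=0}^{\infty} f_i\zeta^i, \qquad \tilde h = -\sum_{i=1}^{\infty} f_{-i}\zeta^{-i-k}. \]
Since $h$ is holomorphic in $V$ and $\tilde h$ is holomorphic in $\tilde V$ (including the point at infinity), we have $g \sim 0$. However, for $k \le -2$, we have $f = h - \zeta^k\tilde h + f_0$, where now
\[ h = \sum_{i=0}^{\infty} f_i\zeta^i, \qquad \tilde h = -\sum_{i=-k}^{\infty} f_{-i}\zeta^{-i-k}, \qquad f_0 = \sum_{i=1}^{-k-1} f_{-i}\zeta^{-i}. \]
Again $h$ extends to $V$, and $\tilde h$ extends to $\tilde V$, so $g \sim g_0$, where $g_0$ is given by $f_0$ in the trivialization over $V$. The coefficients $f_{-1}, f_{-2}, \dots, f_{k+1}$ uniquely determine the class of $g$, so we conclude that
\[ H^1(\mathbb{CP}^1, \mathcal{O}(k)) = \begin{cases} \mathbb{C}^{-k-1} & \text{for } k \le -2 \\ \{0\} & \text{for } k > -2. \end{cases} \]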
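The bookkeeping in this example is purely a matter of sorting Laurent coefficients, which the following sketch makes explicit. It assumes the representative is given as a finite dictionary of coefficients `{i: f_i}` and splits it as $f = h - \zeta^k\tilde h + f_0$; the function names `split_cocycle` and `dim_h1` are illustrative.

```python
def split_cocycle(f, k):
    """Split Laurent coefficients {i: f_i} as f = h - zeta**k * h_tilde + f0.

    Each returned dict maps a power of zeta to its coefficient.
    """
    h, h_tilde, f0 = {}, {}, {}
    for i, c in f.items():
        if i >= 0:
            h[i] = c                 # nonnegative powers: holomorphic on V
        elif i <= k:
            h_tilde[i - k] = -c      # power i - k <= 0: regular at infinity
        else:
            f0[i] = c                # k < i < 0: cannot be absorbed
    return h, h_tilde, f0

def dim_h1(k):
    """Dimension of H^1(CP^1, O(k)) read off from the example."""
    return max(0, -k - 1)

f = {-3: 1.0, -2: 2.0, -1: 3.0, 0: 4.0, 2: 5.0}
h, h_tilde, f0 = split_cocycle(f, k=-3)          # obstruction: powers -2 and -1
_, _, f0_trivial = split_cocycle(f, k=-1)        # everything can be absorbed
```

The surviving powers in `f0` are exactly $\zeta^{-1}, \dots, \zeta^{k+1}$, so their number agrees with `dim_h1(k)`.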

The Dolbeault isomorphism

For any $g \in \Gamma(V \cap \tilde V, E)$, we can always find nonholomorphic sections $s$ and $\tilde s$ of $E$ on $V$ and $\tilde V$, respectively, such that $g = s - \tilde s$ on $V \cap \tilde V$, but we can choose $s$, $\tilde s$ to be holomorphic sections only if $g \sim 0$. However, even when $s$, $\tilde s$ are not holomorphic, we still have that $\bar\partial g = 0$, and hence that
\[ \bar\partial s = \bar\partial\tilde s. \]
The two sides of this equation define a global $(0,1)$-form $\gamma$ with values in $E$ such that $\bar\partial\gamma = 0$. Clearly $\gamma$ depends only on the equivalence class of $g$, and it is independent of the choice of $s$ and $\tilde s$ up to the addition of $\bar\partial\sigma$ for some global nonholomorphic section $\sigma$ of $E$ over $M$. Thus we have a map from $H^1(M, E)$ to the first Dolbeault cohomology group of $E$, which is defined to be the space of equivalence classes of $\bar\partial$-closed $(0, 1)$-forms on $M$ with values in $E$, modulo $\bar\partial$-exact forms.
If we start with a general cover by open Stein manifolds, then we write $g_{\sigma\tau} = s_\tau - s_\sigma$, where $s_\tau$ is a nonholomorphic section of $E$ over $V_\tau$, and put $\gamma = \bar\partial s_\tau$. Since $\bar\partial s_\sigma = \bar\partial s_\tau$ on the overlap, $\gamma$ is a well-defined global $(0, 1)$-form. It satisfies $\bar\partial\gamma = 0$, and it contains the same information as the equivalence class of $g$ since we can write $\gamma = \bar\partial s_\tau$ on each $V_\tau$, and recover $g$ up to equivalence by putting $g_{\sigma\tau} = s_\tau - s_\sigma$ (it is a basic result that if $\bar\partial\gamma = 0$ on a Stein manifold, then $\gamma = \bar\partial f$ for some $f$). Thus we have the Dolbeault isomorphism from $H^1(M, E)$ to the first Dolbeault cohomology group of $E$. The $k$th cohomology group is defined similarly in the Dolbeault theory as the quotient of the space of $\bar\partial$-closed $(0, k)$-forms, modulo the space of $\bar\partial$-exact forms, and is similarly isomorphic to the $k$th cohomology group of the Čech theory; see Griffiths and Harris (1978), Wells (1973), Field (1982).

9.7 THE GRASSMANNIAN

There is another interpretation of Birkhoff's theorem which is important in the
theory of integrable systems. Let $H$ denote the Hilbert space $H = L^2(S^1, \mathbb C^n)$ of square-integrable functions $\alpha: S^1 \to \mathbb C^n$, with the inner product
$$\langle\alpha, \beta\rangle = \int_0^{2\pi} \bar\alpha^{\rm t}\beta\, d\theta.$$
By splitting the Fourier expansion
$$\alpha = \sum_{k=-\infty}^{\infty} \alpha_k e^{ik\theta},$$
we can write any $\alpha \in H$ uniquely as the sum $\alpha_+ + \alpha_-$ of its positive and negative frequency parts, where
$$\alpha_+ = \sum_{k \ge 0} \alpha_k e^{ik\theta}, \qquad \alpha_- = \sum_{k < 0} \alpha_k e^{ik\theta}.$$
This gives a direct sum decomposition $H = H_+ \oplus H_-$, where the positive and negative frequency subspaces $H_+$ and $H_-$ are closed and orthogonal.
The loop group $LGL(n, \mathbb C)$ acts on $H$ by multiplication: if $F \in LGL(n, \mathbb C)$, then $F: \alpha \mapsto F\alpha$, where
$$(F\alpha)(\theta) = F(\theta)\alpha(\theta).$$
In general this does not preserve the decomposition into positive and negative frequency parts. In fact, $F(H_+) = H_+$ if and only if $F \in LGL_+(n, \mathbb C)$, and $F(H_-) = H_-$ if and only if $F \in LGL_-(n, \mathbb C)$; only the constant loops preserve both $H_+$ and $H_-$.
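The splitting can be illustrated numerically. The sketch below samples a loop in $\mathbb C^2$ at equally spaced angles, separates its Fourier modes with $k \ge 0$ from those with $k < 0$, and then multiplies an element of $H_+$ by the loop $e^{-i\theta}$ times the identity, which is not in $LGL_+$. The sample loop, the grid size, and the helper names `split_frequencies` and `inner` are assumptions made for the illustration.

```python
import numpy as np


def split_frequencies(samples):
    """Split a sampled loop (rows = angles) into parts with k >= 0 and k < 0."""
    n = samples.shape[0]
    coeffs = np.fft.fft(samples, axis=0) / n        # Fourier coefficients a_k
    k = np.fft.fftfreq(n, d=1.0 / n)                # integer frequencies
    plus = np.where((k >= 0)[:, None], coeffs, 0.0)
    minus = coeffs - plus
    return np.fft.ifft(plus, axis=0) * n, np.fft.ifft(minus, axis=0) * n


def inner(a, b):
    """Discretised inner product: integral over [0, 2 pi] of conj(a)^t b."""
    n = a.shape[0]
    return (2.0 * np.pi / n) * np.sum(np.conj(a) * b)


n = 64
theta = 2.0 * np.pi * np.arange(n) / n
alpha = np.stack(
    [np.exp(2j * theta) + 3.0 * np.exp(-1j * theta), 1.0 + np.exp(-3j * theta)],
    axis=1,
)
a_plus, a_minus = split_frequencies(alpha)

# Multiplication by e^{-i theta} (times the identity) moves part of H_+ into H_-.
F = np.exp(-1j * theta)[:, None]
moved_plus, moved_minus = split_frequencies(F * a_plus)

print(abs(inner(a_plus, a_minus)))                  # orthogonality of the two parts
print(np.sqrt(inner(moved_minus, moved_minus).real))  # nonzero negative-frequency part
```

The second printed number is nonzero because the constant mode in the second component of $\alpha_+$ is shifted to frequency $-1$.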
The Grassmannian $\mathrm{Gr}_n$ is the set of closed subspaces of $H$ that can be obtained from $H_+$ by the action of the loop group. It is the homogeneous space
$$\mathrm{Gr}_n = LGL(n, \mathbb C)/LGL_+(n, \mathbb C).$$
Pressley and Segal (1986) characterize the subspaces $W \subset H$ that make up $\mathrm{Gr}_n$. Their properties include $e^{i\theta}(W) \subset W$, where $e^{i\theta}$ acts on $H$ by multiplication. In this context, Birkhoff's theorem is a statement about the orbits of $LGL_-(n, \mathbb C)$ in the Grassmannian: it states that each orbit contains a subspace $W_k \in \mathrm{Gr}_n$, where $W_k$, $k = (k_1, \ldots, k_n)$, is the space spanned by elements of $H$ of the form
$$\begin{pmatrix} e^{ia_1\theta} \\ \vdots \\ e^{ia_n\theta} \end{pmatrix}, \qquad a_1 \ge k_1,\ a_2 \ge k_2,\ \ldots,\ a_n \ge k_n.$$
Moreover, $W_k$ is unique up to the ordering of the $k_i$s.
The Grassmannian $\mathrm{Gr}_n$ is contained in a larger Grassmannian $\mathrm{Gr}$ of spaces $W \subset H$ that are close to $H_+$ in the sense that the orthogonal projections $W \to H_+$ and $W \to H_-$ are, respectively, Fredholm and Hilbert-Schmidt.
9.8 SCATTERING ON THE REAL LINE
In this section, we look at some background material that we use in treating the
inverse-scattering theory of the KdV and NLS equations.
The KdV scattering problem
The properties of the solutions to the time-independent Schrödinger equation
$$\alpha_{xx} + u\alpha = \zeta\alpha$$
play an important part in the analysis of the KdV equation. We shall recall here some basic facts about them (for detailed proofs, see Deift and Trubowitz 1979). We take $x$ to be real, $\zeta$ to be a complex parameter, and we suppose that $u$ is a real function of $x$ such that
$$\int_{-\infty}^{\infty} |u(x)|(1 + x^2)\,dx < \infty.$$
For each real $k \ne 0$, there exist four solutions $\alpha_{1\pm}$ and $\alpha_{2\pm}$ with $\zeta = -k^2$ such that
$$\alpha_{1+} \sim e^{ikx}, \qquad \alpha_{2+} \sim e^{-ikx}$$
as $x \to \infty$ and
$$\alpha_{1-} \sim e^{ikx}, \qquad \alpha_{2-} \sim e^{-ikx}$$
as $x \to -\infty$. Since the solution space is two-dimensional, these must be connected by two linear relations. We define the reflection and transmission coefficients by
$$T_-\alpha_{1+} = \alpha_{1-} + R_-\alpha_{2-}, \qquad T_+\alpha_{2-} = \alpha_{2+} + R_+\alpha_{1+}.$$
By taking the Wronskians with $\alpha_{2-}$ and $\alpha_{1+}$, respectively, and by evaluating the right-hand sides at large $|x|$, we have
$$T_+ = T_- = \frac{2ik}{\mathrm{Wr}(\alpha_{1+}, \alpha_{2-})}. \qquad (9.8.1)$$
We shall be concerned with the way in which the solutions and the scattering coefficients $T_\pm$, $R_\pm$ depend on $k$. From the definitions, $\alpha_{1+}(x, k) = \alpha_{2+}(x, -k)$ and $\alpha_{1-}(x, k) = \alpha_{2-}(x, -k)$. Other properties are more easily expressed in terms of the two Jost functions, $m_+ = \alpha_{1+}e^{-ikx}$ and $m_- = \alpha_{2-}e^{ikx}$. The first of these satisfies the Volterra equation
$$m_+(x, k) = 1 - \frac{1}{2ik}\int_x^{\infty} \bigl(e^{2ik(y - x)} - 1\bigr)u(y)m_+(y, k)\,dy$$
and the second satisfies a similar equation. By considering the convergence of the iterates of the integral equations, one can show that $\alpha_{1+}$ and $\alpha_{2-}$ extend holomorphically to the upper half-plane, and one can derive estimates on the behaviour of solutions for large $k$. In particular, as $k \to \infty$ in the upper half-plane, $m_+ \to 1$.
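The convergence of the iterates can be seen in a small numerical experiment. The following is a minimal sketch, not a substitute for the estimates in Deift and Trubowitz: it discretizes the Volterra equation for $m_+$ on a uniform grid with the trapezoid rule, truncates the integral at the right-hand end of the grid, and applies Picard iteration. The Gaussian potential, the grid, the number of iterations, and the function name `jost_plus` are all assumptions made for the illustration, and only real $k$ is used.

```python
import numpy as np


def jost_plus(u_vals, x, k, iterations=40):
    """Picard iteration for m_+(x, k) on a uniform grid x (integral cut off at x[-1])."""
    dx = x[1] - x[0]
    diff = x[None, :] - x[:, None]                   # diff[i, j] = y_j - x_i
    kernel = np.triu(np.exp(2j * k * diff) - 1.0)    # keep only y >= x
    weights = np.full(x.size, dx)
    weights[-1] = dx / 2.0                           # trapezoid weight at the cut-off
    A = kernel * (weights * u_vals)[None, :] / (2j * k)
    m = np.ones(x.size, dtype=complex)
    change = np.inf
    for _ in range(iterations):
        m_new = 1.0 - A @ m                          # one Volterra iterate
        change = np.max(np.abs(m_new - m))
        m = m_new
    return m, change


x = np.linspace(-10.0, 10.0, 801)
u = 0.5 * np.exp(-x**2)                              # assumed sample potential

m_small_k, change_small_k = jost_plus(u, x, 1.0)
m_large_k, change_large_k = jost_plus(u, x, 50.0)

print(change_small_k)                                # size of the last correction
print(np.max(np.abs(m_large_k - 1.0)))               # m_+ is close to 1 for large k
```

Because the kernel vanishes on the diagonal and is supported on $y \ge x$, the discrete operator is strictly upper triangular, which is the finite-dimensional counterpart of the factorial convergence of the Volterra series.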
It follows from (9.8.1) that $T = T_+ = T_-$ extends to a meromorphic function on the upper half-plane. It can be shown that $T$ is continuous on the closed half-plane, and that $T \to 1$ as $k \to \infty$. The poles of $T$ in the open half-plane are the values of $k$ for which $\alpha_{1+}$ and $\alpha_{2-}$ are linearly dependent. There are a finite number of poles, they are all simple, and they lie on the imaginary axis at the values of $k$ for which $\zeta = -k^2 > 0$ is an eigenvalue of the Schrödinger operator. The reflection coefficients need not extend to complex values of $k$, but the rate at which they fall off for large real $k$ is controlled by the smoothness of $u$ (one of many ways in which $R_\pm$ behave like Fourier coefficients).
We define the scattering matrix
$$S(k) = \begin{pmatrix} -R_+ & T_- \\ T_+ & -R_- \end{pmatrix}. \qquad (9.8.2)$$
It is immediate from the relationship between the $\alpha$s that $S(-k) = S(k)^{-1}$.

The NLS scattering problem

In the case of the NLS equation, we look instead at a pair of linear equations
$$\alpha_x + \tilde\psi\beta = i\zeta\alpha, \qquad \beta_x + \psi\alpha = -i\zeta\beta,$$
for the unknown entries $\alpha$, $\beta$ in a column vector $s$. Provided that $\psi$ and $\tilde\psi$ fall off sufficiently fast as $x \to \pm\infty$, there are solutions $s_{i\pm}$ to eqns (11.5.2) ($i = 1, 2$) such that for $t = 0$ and $x, \zeta \in \mathbb R$,
$$s_{1+} \sim \begin{pmatrix} e^{ix\zeta} \\ 0 \end{pmatrix}, \qquad s_{2+} \sim \begin{pmatrix} 0 \\ e^{-ix\zeta} \end{pmatrix}$$
as $x \to \infty$, and
$$s_{1-} \sim \begin{pmatrix} e^{ix\zeta} \\ 0 \end{pmatrix}, \qquad s_{2-} \sim \begin{pmatrix} 0 \\ e^{-ix\zeta} \end{pmatrix}$$
as $x \to -\infty$. For each $\zeta$, the four special solutions to eqns (11.5.2) must be connected by two linear relations because the solution space to the linear system is two-dimensional. Thus we can write
$$s_{1-} = \tilde a(\zeta)s_{1+} + b(\zeta)s_{2+}, \qquad s_{2-} = \tilde b(\zeta)s_{1+} + a(\zeta)s_{2+}, \qquad (9.8.3)$$
where the functions $a$, $b$, $\tilde a$, $\tilde b$ are the scattering or transmission coefficients. Since the Wronskian $\alpha\beta' - \alpha'\beta$ is constant for a pair of solutions $s$, $s'$, we have
$$a\tilde a - b\tilde b = 1.$$
It is shown by Faddeev and Takhtajan (1987) that the matrix $r_u$ with columns $s_{1+}$ and $s_{2-}$ extends holomorphically as a function of $\zeta$ to the upper half $\zeta$-plane and that the matrix $r_\ell$ with columns $s_{1-}$ and $s_{2+}$ similarly extends to the lower half $\zeta$-plane. Since
$$a = \det r_u, \qquad \tilde a = \det r_\ell,$$
by substituting from (9.8.3) and by evaluating the determinants in the limit $x \to \pm\infty$, it follows that $a$ extends to the upper half-plane and that $\tilde a$ extends to the lower half-plane. Faddeev and Takhtajan also deduce that $a, \tilde a \to 1$ as $\zeta \to \infty$ in the respective half-planes (in general, the $b$s do not extend off the real axis). On the real axis, we have
$$r_u\begin{pmatrix} \tilde a & -\tilde b \\ 0 & 1 \end{pmatrix} = r_\ell\begin{pmatrix} 1 & 0 \\ -b & a \end{pmatrix}.$$

9.9 SPINORS
In §2.5, we introduced the isomorphism
SO(4,C) = SL(2,C) x SL(2, C )/Z2, (9.9.1)
under which complex orthogonal transformations in four dimensions are decom-
posed into products of left and right rotations (uniquely apart from a sign am-
biguity, which is the reason for the Z2 quotient on the right-hand side). Spinor
calculus exploits this in a way that is particularly well suited to the analysis of
self-duality conditions. In spinor calculus, we replace tensors, which are charac-
terized by the transformation rules for their components under orthogonal trans-
formations of the complex space-time coordinates, by spinors, which are defined
in terms of transformation rules under left and right rotations. To put this more
precisely, we denote by S and S', respectively, the fundamental representation
spaces for the two SL (2, C) factors in (9.9.1).
Definition 9.9.1 A spinor of type $(m, n, m', n')$ is an element of the tensor product
$$\underbrace{S \otimes \cdots \otimes S}_{m} \otimes \underbrace{S^* \otimes \cdots \otimes S^*}_{n} \otimes \underbrace{S' \otimes \cdots \otimes S'}_{m'} \otimes \underbrace{S'^* \otimes \cdots \otimes S'^*}_{n'},$$
where $*$ denotes the dual space.
We denote the components of elements of $S$, $S^*$, $S'$, $S'^*$ by $\alpha^A$, $\beta_A$, $\gamma^{A'}$ and $\delta_{A'}$, respectively. The lower indices are used in the dual spaces, and the primed indices are used in $S'$ and $S'^*$. All four `spin' spaces are two-dimensional, and all four types of index run over the two values 0, 1, although it is conventional to leave the primes in place when giving specific values to $A'$, $B'$, and so on. Thus the two components of an element of $S'$ are $(\gamma^{0'}, \gamma^{1'})$, rather than $(\gamma^0, \gamma^1)$. A general spinor of type $(m, n, m', n')$ has components
$$\alpha^{A\ldots C\,A'\ldots D'}{}_{D\ldots F\,E'\ldots K'}$$
with $m$ upper unprimed indices, $n$ lower unprimed indices, $m'$ upper primed indices, and $n'$ lower primed indices.
We represent left rotations by matrices $\Lambda = (\Lambda^A{}_B)$, and right rotations by $\tilde\Lambda = (\tilde\Lambda^{A'}{}_{B'})$. The unprimed and primed indices keep track of the different transformation rules: for elements of $S$ and $S'$, these are
$$\alpha^A \mapsto \Lambda^A{}_B\alpha^B, \qquad \gamma^{A'} \mapsto \tilde\Lambda^{A'}{}_{B'}\gamma^{B'},$$
with the usual summation convention. For the dual spaces, they are
$$\beta_A \mapsto \Gamma^B{}_A\beta_B, \qquad \delta_{A'} \mapsto \tilde\Gamma^{B'}{}_{A'}\delta_{B'},$$
where $\Gamma = \Lambda^{-1}$ and $\tilde\Gamma = \tilde\Lambda^{-1}$. The rules extend in the standard way to general spinors.

Tensors as spinors

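The isomorphism (9.9.1) is built into a canonical identification $T = S \otimes S'$, where $T$ is the space of complex 4-vectors in $\mathbb{CM}$. In double-null coordinates, the displacement vector from the origin has components $(x^a) = (w, z, \tilde w, \tilde z)$. This is identified with the 2-index spinor $x^{AA'}$, where
$$(x^{AA'}) = \begin{pmatrix} x^{00'} & x^{01'} \\ x^{10'} & x^{11'} \end{pmatrix} = \begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix}.$$
By taking duals, we have the identification $T^* = S^* \otimes S'^*$, under which a covector $\alpha_a$ becomes a spinor $\alpha_{AA'}$ with two lower indices. In particular, the coordinate derivatives $(\partial_a) = (\partial_w, \partial_z, \partial_{\tilde w}, \partial_{\tilde z})$ become the spinor operator $\partial_{AA'}$, given by
$$(\partial_{AA'}) = \begin{pmatrix} \partial_{00'} & \partial_{01'} \\ \partial_{10'} & \partial_{11'} \end{pmatrix} = \begin{pmatrix} \partial_{\tilde z} & \partial_w \\ \partial_{\tilde w} & \partial_z \end{pmatrix} = \begin{pmatrix} \tilde Z & W \\ \tilde W & Z \end{pmatrix},$$
where $(W, Z, \tilde W, \tilde Z)$ is the coordinate null tetrad.
Under the transformation from one double-null system to another with the same orientation,
$$\begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix} \mapsto \Lambda\begin{pmatrix} \tilde z & w \\ \tilde w & z \end{pmatrix}\tilde\Lambda^{\rm t}, \qquad (9.9.2)$$
for some $\Lambda, \tilde\Lambda \in SL(2, \mathbb C)$, which are uniquely determined by the transformation up to an overall sign. In spinor notation, $x^{AA'} \mapsto \Lambda^A{}_B\tilde\Lambda^{A'}{}_{B'}x^{BB'}$. By taking the dual, we have
$$\begin{pmatrix} \tilde Z & W \\ \tilde W & Z \end{pmatrix} \mapsto \Lambda^{-1\,{\rm t}}\begin{pmatrix} \tilde Z & W \\ \tilde W & Z \end{pmatrix}\tilde\Lambda^{-1}.$$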
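The fact that (9.9.2) is an orthogonal transformation can be checked numerically: the determinant of the matrix of coordinates, $z\tilde z - w\tilde w$, is unchanged when the matrix is multiplied on the left and right by matrices of unit determinant. The sketch below assumes the arrangement of the coordinates in the matrix as $(\tilde z, w; \tilde w, z)$; the helper names `random_sl2c` and `spinor_matrix` and the random test data are ours.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled to have unit determinant."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))


def spinor_matrix(w, z, wt, zt):
    """Matrix (x^{AA'}) of a displacement; wt and zt stand for w-tilde and z-tilde."""
    return np.array([[zt, w], [wt, z]], dtype=complex)


lam, lam_t = random_sl2c(rng), random_sl2c(rng)     # left and right rotations
x = spinor_matrix(*(rng.normal(size=4) + 1j * rng.normal(size=4)))

x_new = lam @ x @ lam_t.T                           # the transformation (9.9.2)
x_flipped = (-lam) @ x @ (-lam_t).T                 # overall sign change of the pair

print(np.linalg.det(x), np.linalg.det(x_new))
```

The last two lines of the computation also show the sign ambiguity: replacing both matrices by their negatives gives the same transformation of the coordinates.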
A general tensor, with $m$ upper indices and $n$ lower indices, determines a spinor of type $(m, n, m', n')$, with $n' = n$ and $m' = m$. It is conventional to keep track of this correspondence by associating tensor indices $a, b, \ldots$ with the corresponding pairs of capital spinor indices $AA', BB', \ldots$. Thus the spinor equivalent of $T^a{}_{bc}$ is written as $T^{A}{}_{BC}{}^{A'}{}_{B'C'}$ or, more elegantly, as $T^{AA'}{}_{BCB'C'}$.
The metric and alternating tensors
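The $SL(2, \mathbb C)$ transformations of $S$ and $S'$ are symplectic: they preserve the skew-symmetric 2-index spinors $\varepsilon_{AB}$, $\varepsilon_{A'B'}$, $\varepsilon^{AB}$, $\varepsilon^{A'B'}$ with components
$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$
which we use to raise and lower indices. Because the $\varepsilon$s are skew symmetric, it is important to keep track of the order of the indices. The conventions are
$$\alpha_B = \alpha^A\varepsilon_{AB}, \qquad \gamma^A = \varepsilon^{AB}\gamma_B,$$
together with the same rules for primed indices. Since $\varepsilon_{AB}\varepsilon^{CB} = \delta_A^C$, if we lower an index, and then raise it again, then we arrive back at the starting point. Note, however, that $\alpha_A\beta^A = -\alpha^A\beta_A$.
With this notation, the spinor equivalents of the Minkowski metric tensor $\eta_{ab}$ and the alternating tensor $\varepsilon_{abcd}$ are
$$\eta_{ABA'B'} = \varepsilon_{AB}\varepsilon_{A'B'},$$
$$\varepsilon_{ABCDA'B'C'D'} = \varepsilon_{AC}\varepsilon_{BD}\varepsilon_{A'D'}\varepsilon_{B'C'} - \varepsilon_{AD}\varepsilon_{BC}\varepsilon_{A'C'}\varepsilon_{B'D'},$$
so that
$$ds^2 = \varepsilon_{AB}\varepsilon_{A'B'}\,dx^{AA'}\,dx^{BB'}.$$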
This is consistent with the rules for raising and lowering indices: one arrives at
the same result by lowering a tensor index and then taking the spinor equiv-
alent as by first constructing the spinor equivalent, and then lowering the two
corresponding spinor indices.
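The index conventions are easy to get wrong, so a small numerical check may be useful. The sketch below assumes that all four $\varepsilon$ spinors have components $(0, 1; -1, 0)$, lowers an index by contracting on the first index of $\varepsilon_{AB}$, and raises it by contracting on the second index of $\varepsilon^{AB}$. The function names `lower` and `raise_index` and the sample spinors are introduced for the illustration only.

```python
import numpy as np

eps = np.array([[0.0, 1.0], [-1.0, 0.0]])           # assumed components of epsilon


def lower(v):
    """alpha_B = alpha^A eps_{AB}: contract on the first index of epsilon."""
    return v @ eps


def raise_index(v):
    """alpha^A = eps^{AB} alpha_B: contract on the second index of epsilon."""
    return eps @ v


alpha = np.array([2.0 + 1.0j, -0.5j])                # upper-index components
beta = np.array([1.0 - 3.0j, 4.0])

round_trip = raise_index(lower(alpha))               # lower, then raise again
lhs = lower(alpha) @ beta                            # alpha_A beta^A
rhs = alpha @ lower(beta)                            # alpha^A beta_A

print(round_trip, alpha)
print(lhs, -rhs)
print(lower(np.array([1.0, 0.0])), lower(np.array([0.0, 1.0])))
```

The last line prints the lower-index components of the spinors with upper components $(1, 0)$ and $(0, 1)$, namely $(0, 1)$ and $(-1, 0)$.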
Spin frames
A spin frame in $S$ is a basis $o^A$, $\iota^A$ such that
$$o_A\iota^A = \varepsilon_{AB}o^A\iota^B = 1.$$
There is a similar definition for primed spinors, with unprimed replaced by primed indices. A choice of spin frame in $S$ and $S'$ determines a null tetrad by
$$o^Ao^{A'} = \tilde Z^{AA'}, \qquad o^A\iota^{A'} = W^{AA'},$$
$$\iota^A\iota^{A'} = Z^{AA'}, \qquad \iota^Ao^{A'} = \tilde W^{AA'}.$$
Conversely, a null tetrad determines a pair of spin frames in $S$ and $S'$, uniquely up to an overall sign.
If we take $o^A$ and $o^{A'}$ to have components $(1, 0)$, and $\iota^A$ and $\iota^{A'}$ to have components $(0, 1)$, then for upper index spinors, we have, for example,
$$\alpha^A = \alpha^0o^A + \alpha^1\iota^A.$$
But for a lower index spinor,
$$\pi_{A'} = \pi_{1'}o_{A'} - \pi_{0'}\iota_{A'}.$$
This potential pitfall is a consequence of the use of a symplectic form to raise and lower indices: the components of $o_{A'}$ and $\iota_{A'}$ are, respectively, $(0, 1)$ and $(-1, 0)$. Note, however, that $\pi_{0'} = \pi_{A'}o^{A'}$ and $\pi_{1'} = \pi_{A'}\iota^{A'}$.
Homogeneous functions on the Riemann sphere
From a geometric point of view, the spectral parameter $\zeta$ in the linear system of the ASDYM equation is an affine coordinate on $\mathbb PS' = \mathbb{CP}_1$. If we denote an element of $S'$ by $\pi_{A'}$, then the components $\pi_{0'}$ and $\pi_{1'}$ are homogeneous coordinates, and $\zeta = \pi_{1'}/\pi_{0'}$. For $\zeta \ne \infty$, we can take $\pi_{0'} = 1$, $\pi_{1'} = \zeta$. Then $\pi_{A'} = \zeta o_{A'} - \iota_{A'}$ in the standard spin frame.
A symmetric spinor $\phi^{A'\ldots C'}$ with $k$ indices determines a function of the homogeneous coordinates by
$$\phi^{A'\ldots C'}\pi_{A'}\cdots\pi_{C'}.$$
This is a homogeneous polynomial of degree $k$, and is therefore a global section of $\mathcal O(k)$. Since the only global holomorphic sections are homogeneous polynomials in $\pi_{A'}$, every global holomorphic section is of this form (§9.4).
SD and ASD 2-forms
A 2-form $\gamma_{ab}$ has spinor equivalent $\gamma_{ABA'B'}$, where
$$\gamma_{ABA'B'} = \gamma_{(AB)[A'B']} + \gamma_{[AB](A'B')}$$
since $\gamma_{ab}$ is skew symmetric. However, any skew 2-index spinor is necessarily a multiple of $\varepsilon$, because the second exterior power of a two-dimensional space is one-dimensional. Therefore
$$\gamma_{ABA'B'} = \phi_{AB}\varepsilon_{A'B'} + \psi_{A'B'}\varepsilon_{AB},$$
where $\phi$ and $\psi$ are symmetric. This is simply the decomposition of $\gamma$ into its ASD and SD parts: it follows from the spinor expression for the alternating tensor that $\phi_{AB}\varepsilon_{A'B'}$ is the spinor equivalent of an ASD form, and that $\psi_{A'B'}\varepsilon_{AB}$ is the spinor equivalent of an SD form.
Now suppose that $\phi_{AB} = 0$, so that $\gamma$ is SD. The symmetric spinor $\psi_{A'B'}$ determines a homogeneous quadratic $Q = \psi^{A'B'}\pi_{A'}\pi_{B'}$. If we put
$$\pi_{A'} = \zeta o_{A'} - \iota_{A'}, \qquad \psi^{A'B'} = c\,o^{A'}o^{B'} + 2b\,o^{(A'}\iota^{B')} + a\,\iota^{A'}\iota^{B'},$$
then $Q = a\zeta^2 + 2b\zeta + c$. Let $\alpha$ and $\beta$ be the two roots of $Q$ and put
$$\alpha_{A'} = \alpha o_{A'} - \iota_{A'}, \qquad \beta_{A'} = \beta o_{A'} - \iota_{A'},$$
so that $\psi_{A'B'} = a\,\alpha_{(A'}\beta_{B')}$. Then $\gamma_{ab}\gamma^{ab} = 2\psi_{A'B'}\psi^{A'B'} = -a^2(\alpha_{A'}\beta^{A'})^2$, so $\gamma$ is null if and only if $\alpha_{A'}$ and $\beta_{A'}$ are proportional, and therefore, without loss of generality, equal.
The Lie algebra
An element of the Lie algebra of right rotations is a trace-free matrix $\psi = (\psi^{A'}{}_{B'})$. The trace-free condition is equivalent to the symmetry of $\psi^{A'B'}$, so $\psi$ is determined by the homogeneous quadratic $Q = \psi^{A'B'}\pi_{A'}\pi_{B'}$. We have
$$(\psi^{A'}{}_{B'}) = \begin{pmatrix} -b & c \\ -a & b \end{pmatrix}, \qquad Q = a\zeta^2 + 2b\zeta + c.$$
If $\psi'' = [\psi, \psi']$, then the corresponding quadratics are related by
$$Q'' = Q\,\partial_\zeta Q' - Q'\,\partial_\zeta Q.$$
We shall use this representation of the Lie algebra in our decomposition of the
Levi-Civita connection in Chapter 13.
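The relation between the commutator and the quadratics can be verified directly. The following sketch assumes the matrix convention in which the quadratic $a\zeta^2 + 2b\zeta + c$ corresponds to the trace-free matrix with rows $(-b, c)$ and $(-a, b)$; this is our reading of the correspondence, and the helper names `matrix_of` and `quadratic_of` and the sample coefficients are ours.

```python
import numpy as np
from numpy.polynomial import Polynomial


def matrix_of(a, b, c):
    """Trace-free matrix assumed to correspond to Q = a z^2 + 2 b z + c."""
    return np.array([[-b, c], [-a, b]], dtype=float)


def quadratic_of(m):
    """Inverse of matrix_of: read off (a, b, c) and build the quadratic."""
    a, b, c = -m[1, 0], m[1, 1], m[0, 1]
    return Polynomial([c, 2.0 * b, a])               # coefficients in increasing degree


psi = matrix_of(1.0, 2.0, 3.0)
psi_prime = matrix_of(-1.0, 0.5, 4.0)
commutator = psi @ psi_prime - psi_prime @ psi

Q, Q_prime = quadratic_of(psi), quadratic_of(psi_prime)
Q_bracket = Q * Q_prime.deriv() - Q_prime * Q.deriv()   # Q dQ'/dz - Q' dQ/dz
Q_from_commutator = quadratic_of(commutator)

zs = np.linspace(-2.0, 2.0, 9)
print(Q_bracket(zs) - Q_from_commutator(zs))         # differences at sample points
```

The cubic terms cancel in the bracket of the two quadratics, so the result is again a quadratic, as it must be if it is to correspond to a trace-free matrix.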

α-planes
The spinor equivalent of a null SD 2-form is necessarily of the form $\pi_{A'}\pi_{B'}\varepsilon_{AB}$. An α-plane through the origin is labelled, up to scale, by a nonzero null SD 2-form, and hence by a nonzero spinor $\pi_{A'}$, again up to scale. Thus the α-planes through the origin correspond to the points of $\mathbb PS'$. If we put $\pi_{A'} = \zeta o_{A'} - \iota_{A'}$, then we obtain the same labelling by $\zeta$ as in §2.3.

Spin structures

Given a double-null coordinate system, we can think of a spinor $\alpha \in S$ as an object having components $\alpha^A$, with the transformation rule $\alpha^A \mapsto \Lambda^A{}_B\alpha^B$ under change of coordinates. The components of a `primed' spinor in $S'$ transform in a similar way under the right-hand component $\tilde\Lambda^{A'}{}_{B'}$. However, we have to bear in mind that $\Lambda$ and $\tilde\Lambda$ are not uniquely determined by the coordinate transformation, so we do not know whether to transform the components of $\alpha$ by $\alpha^A \mapsto \Lambda^A{}_B\alpha^B$ or by $\alpha^A \mapsto -\Lambda^A{}_B\alpha^B$. So long as we work only in flat space-time,
without considering in detail the behaviour at infinity, it is legitimate to take
a pragmatic approach to this issue: we ignore the ambiguity and rely on the
general principle that it disappears whenever we construct tensors by forming
spinors with equal numbers of primed and unprimed indices. In other contexts,
when the geometry of space-time is nontrivial, it is necessary to specify the ge-
ometric structure in a more precise language, by formulating a definition that
captures the idea that the introduction of spinors involves making a particular
choice between the two possible pairs of SL(2, C) transformations on the overlap
of the domains of two local null tetrads, but allows for a notion of equivalence
between different choices.
We consider a complex four-dimensional space-time $M$, on which there is given a holomorphic metric $ds^2$ and a metric volume form $\nu$ (the existence of $\nu$ is already a constraint on the global geometry).${}^8$ The null tetrads in $M$ form a principal bundle $P$ with structure group $SL(2, \mathbb C) \times SL(2, \mathbb C)/\mathbb Z_2$. A spin structure is a double cover $\tilde P \to P$ by a principal $SL(2, \mathbb C) \times SL(2, \mathbb C)$ bundle such that the covering map is equivariant with respect to the actions of the two structure groups and such that the diagram formed by the covering map $\tilde P \to P$ and the projections of $\tilde P$ and $P$ onto $M$ commutes. Spin structures need not exist, since there is a topological obstruction in $H^2(M, \mathbb Z_2)$, and when they do exist, they need not be unique.
Given a spin structure, we can construct two rank-2 vector bundles $S \to M$ and $S' \to M$ associated to the two $SL(2, \mathbb C)$ factors in the product $SL(2, \mathbb C) \times SL(2, \mathbb C)$, and we can identify $S \otimes S'$ with the tangent bundle $TM$. Such a structure allows the unambiguous use of spinor notation and determines the correspondence between vector fields $X^a$ (sections of $TM$) and 2-index spinors $X^{AA'}$ (sections of $S \otimes S'$), as well as between general tensors and multi-index spinors. It also determines a natural connection on the spin bundles, which is defined by pulling back from $P$ to $\tilde P$ the horizontal subspaces of the Levi-Civita connection on $P$. The corresponding covariant derivative maps sections of spinor bundles to 1-forms with values in the spinor bundles. By converting the 1-form index to a pair of spinor indices, it can be represented as a spinor operator $\nabla_{AA'}$. Since the connection preserves the structure group of $\tilde P$, the spinors $\varepsilon$ are covariantly constant. That is,
$$\nabla_{AA'}\varepsilon_{BC} = 0,$$
and so on.
Spinor forms of the self-duality equations
The ASDYM equation on a connection $D$ is the condition that its curvature should have spinor equivalent of the form
$$F_{ABA'B'} = \phi_{AB}\varepsilon_{A'B'},$$
where $\phi_{AB} = \phi_{(AB)}$ is a matrix of symmetric spinors. The spinor equivalent of the potential is related to Yang's matrix by
$$\Phi_{AA'} = -J^{-1}\iota_{A'}o^{B'}\partial_{AB'}J.$$
Yang's equation then takes the form
$$\iota^{A'}\partial^A{}_{A'}\bigl(o^{B'}J^{-1}\partial_{AB'}J\bigr) = 0.$$
Real slices and spinor conjugations
On a real slice of $\mathbb{CM}$, the structure group of $P$ reduces to one of the following real forms by imposing an appropriate reality condition on the double-null coordinates:
(E) $SU(2) \times SU(2)/\mathbb Z_2$,
(U) $SL(2, \mathbb R) \times SL(2, \mathbb R)/\mathbb Z_2$,
(M) $SL(2, \mathbb C)/\mathbb Z_2$.
The structure group of $\tilde P$ is the corresponding double cover (i.e. without the quotient by $\mathbb Z_2$). In the Euclidean case, the reality condition is $\tilde w = -\bar w$, $\tilde z = \bar z$, which is preserved provided that the left and right rotations are in $SU(2)$; in the ultrahyperbolic case it is that all four coordinates should be real, a condition that is preserved provided that the left and right rotations are real;${}^9$ and in the Minkowski case, it is that $z$ and $\tilde z$ should be real, and $\tilde w = \bar w$, which is preserved provided that the left rotation is the complex conjugate of the right rotation.
We can encode the reductions in the additional structure of `complex conjugation' on the spin spaces. In the Euclidean and ultrahyperbolic cases, the conjugate of a spinor is a spinor of the same type. We define these as follows.
(E) If $\alpha^A$ has components $(x, y)$, then $\hat\alpha^A$ has components $(\bar y, -\bar x)$;
(U) If $\alpha^A$ has components $(x, y)$, then $\bar\alpha^A$ has components $(\bar x, \bar y)$.
We use the same definitions for primed spinors. In the case of real Minkowski space, the conjugation interchanges primed and unprimed indices. That is, it maps $S \to S'$ and $S' \to S$, antilinearly.
(M) If $\alpha^A$ and $\beta^{A'}$ have components $(x, y)$ and $(p, q)$, respectively, then $\bar\alpha^{A'}$ has components $(\bar x, \bar y)$ and $\bar\beta^A$ has components $(\bar p, \bar q)$.
For lower index spinors, the conjugations are defined so that conjugation commutes with raising and lowering. They then extend to the general multi-index spinors by taking tensor products. In all three cases, the conjugation is preserved by the real structure group, and it maps spinors with equal numbers of primed and unprimed indices to spinors of the same type. In each case the vector equivalent of $X^{AA'}$ is real whenever $X^{AA'} = \bar X^{AA'}$. In the Euclidean case, $(\alpha, \alpha) = \alpha^A\hat\alpha_A$ is an inner product on $S$, preserved by $SU(2)$. Only in the ultrahyperbolic case are there `real spinors', since only in this case is the reality condition on the components of a spinor preserved by the structure group. In the Euclidean case, the equation $\alpha^A = \hat\alpha^A$ has no nonzero solutions because of the minus sign in the definition of the conjugation, which gives us $\hat{\hat\alpha}^A = -\alpha^A$, so that $\alpha^A = \hat\alpha^A$ only if $\alpha^A = -\alpha^A$. In the Minkowski case, it does not make sense to set a spinor in $S$ equal to its complex conjugate because the complex conjugate is an element of a different space.
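The two properties of the Euclidean conjugation used here can be checked numerically: it commutes with $SU(2)$ and it squares to minus the identity. The sketch assumes the convention that the conjugate of $(x, y)$ is $(\bar y, -\bar x)$; the names `conj_E` and `su2` and the sample data are introduced for the illustration only.

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]], dtype=complex)


def conj_E(alpha):
    """Assumed Euclidean conjugation: (x, y) -> (conj(y), -conj(x))."""
    return J @ np.conj(alpha)


def su2(p, q):
    """Element of SU(2) built from a pair (p, q), normalised to unit length."""
    norm = np.sqrt(abs(p) ** 2 + abs(q) ** 2)
    p, q = p / norm, q / norm
    return np.array([[p, q], [-np.conj(q), np.conj(p)]])


alpha = np.array([1.0 + 2.0j, -0.5 + 0.3j])
U = su2(0.4 - 1.1j, 2.0 + 0.7j)

twice = conj_E(conj_E(alpha))                        # conjugating twice
commutes = np.allclose(conj_E(U @ alpha), U @ conj_E(alpha))

print(twice, -alpha)
print(commutes)
```

Since the conjugation squares to minus the identity, a spinor equal to its own conjugate would have to equal its own negative, which is the argument given in the text.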
Real spin structures
To define spinors on a real four-dimensional space-time M with a metric of any
signature, two conditions must hold.
(1) The structure group of the tangent bundle must reduce to the appropriate
real subgroup of SL(2, C) x SL(2, C )/Z2. For example, in the Lorentzian
case, M must be orientable and time orientable.
(2) It must be possible to construct the appropriate double cover of the associ-
ated principal bundle.
A spin structure on the complexification of M does not always induce a spin
structure on M, nor can spin structures be analytically continued from one
real slice to another. For example, compactified Minkowski space has two spin structures, neither of which coincides with the analytic continuation of the unique spin structure on the compactification of $\mathbb E$; and the compactification of $\mathbb U$ does not admit a spin structure at all.
Geometry of the twistor correspondence
In its spinor form, the correspondence between the conformal geometry of com-
plex Minkowski space and the linear geometry of twistor space is more obviously
intrinsic since it does not involve a special choice of coordinates. It goes as
follows.
The tangent bivector to an α-plane is null and self-dual, and therefore its spinor equivalent is of the form
$$\pi^{A'}\pi^{B'}\varepsilon^{AB}$$
for some $\pi^{A'}$. We call $\pi^{A'}$ the tangent spinor. It determines the tangent space and is determined by it up to multiplication by a nonzero complex number, since the tangent space is the set of solutions $T$ to the linear equation
$$T^{AA'}\pi_{A'} = 0;$$
the solutions are the null vectors of the form $T^{AA'} = \pi^{A'}\sigma^A$, where $\sigma^A$ is arbitrary. Thus $x$ and $y$ lie on an α-plane with tangent spinor $\pi^{A'}$ if and only if
$$y^{AA'}\pi_{A'} = x^{AA'}\pi_{A'}.$$
Let $Z$ be an α-plane with tangent spinor $\pi^{A'}$, and put $\omega^A = x^{AA'}\pi_{A'}$, where $x$ is any point of $Z$. Then $\omega^A$ and $\pi_{A'}$ are determined by $Z$, uniquely up to the equivalence
$$(\omega^A, \pi_{A'}) \sim (\lambda\omega^A, \lambda\pi_{A'}),$$
for nonzero complex $\lambda$. Conversely, each pair of spinors $(\omega^A, \pi_{A'})$, with $\pi_{A'} \ne 0$, determines an α-plane $Z$: the points of $Z$ are the solutions to
$$x^{AA'}\pi_{A'} = \omega^A, \qquad (9.9.3)$$
which is a pair of linear equations in the coordinates of $x$ (see eqns 9.2.2 and 9.2.1). Thus each α-plane in complex Minkowski space determines a point in the complex projective space on which the four components of $\omega^A$ and $\pi_{A'}$ are homogeneous coordinates. We must exclude the line $I$, defined by $\pi_{A'} = 0$, but every other point of the projective space corresponds to an α-plane in $\mathbb{CM}$. Each $x \in \mathbb{CM}$ determines a line $\hat x$ in the projective space by reading (9.9.3) as a linear equation in the homogeneous coordinates for fixed $x$; and conversely any line that does not meet $I$ can be written uniquely in the form (9.9.3), and so determines a point of $\mathbb{CM}$.
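The incidence relation can be explored numerically. The sketch below picks a point $x$ and a value of $\zeta$, forms the twistor $(\omega^A, \pi_{A'})$ incident with $x$, and then moves along the α-plane by adding a vector of the form $\sigma^A\pi^{A'}$. It assumes the matrix arrangement $(\tilde z, w; \tilde w, z)$ for $x^{AA'}$, the components $(0, 1; -1, 0)$ for $\varepsilon$, and the normalization $\pi_{0'} = 1$, $\pi_{1'} = \zeta$; all variable names and the random data are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = np.array([[0.0, 1.0], [-1.0, 0.0]])           # assumed components of epsilon

w, z, wt, zt = rng.normal(size=4) + 1j * rng.normal(size=4)
x = np.array([[zt, w], [wt, z]])                    # x^{AA'}, rows A, columns A'

zeta = 0.3 - 0.8j
pi_low = np.array([1.0, zeta])                      # pi_{A'} = (pi_0', pi_1')
pi_up = eps @ pi_low                                # pi^{A'} = eps^{A'B'} pi_{B'}

omega = x @ pi_low                                  # omega^A = x^{AA'} pi_{A'}

sigma = rng.normal(size=2) + 1j * rng.normal(size=2)
y = x + np.outer(sigma, pi_up)                      # another point of the same alpha-plane

print(y @ pi_low - omega)                           # incidence with the same twistor
print(np.linalg.det(y - x))                         # null separation of x and y
print(omega, (zeta * w + zt, zeta * z + wt))        # omega in coordinates
```

The displacement between two points of the α-plane has rank one as a matrix, so its determinant vanishes, which is the statement that the two points are null separated.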
Equation (9.9.3) is the condition that the point $x$ and the twistor $Z$ are incident and can be read either as an equation in twistor space or in space-time: the α-planes determined by the points of $\hat x$ are those that pass through $x$ in space-time, and the points of an α-plane correspond to the lines that pass through the corresponding point in the projective space. Since two points $x$, $y$ in complex space-time are null separated if and only if they lie on an α-plane, we can recover the conformal structure of $\mathbb{CM}$ from the linear geometry of the projective space by characterizing null separation as the condition that $\hat x$ and $\hat y$ should intersect (see the discussion in §9.2).
In this description of the twistor correspondence, the only choice we have made is of the origin in $\mathbb{CM}$: if we translate the origin by $T$, then the tangent spinors are unchanged, but we have to subtract $T^{AA'}\pi_{A'}$ from $\omega^A$. Thus a change of origin must be accompanied by a linear transformation,
$$(\omega^A, \pi_{A'}) \mapsto (\omega^A - T^{AB'}\pi_{B'}, \pi_{A'}),$$
of the homogeneous coordinates. This preserves the linear geometry of the projective space and leaves invariant the line $I$.
We link the spinor treatment with our coordinate-based description of twistor space by introducing a double-null coordinate system and its associated spin frames, and by putting $\pi_{0'} = 1$ and $\pi_{1'} = \zeta$. Then $o^A\pi^{A'}$ and $\iota^A\pi^{A'}$ are the spinor equivalents of the vectors
$$-\partial_w + \zeta\partial_{\tilde z}, \qquad -\partial_z + \zeta\partial_{\tilde w},$$
and $(\omega^0, \omega^1) = (\zeta w + \tilde z, \zeta z + \tilde w) = (\lambda, \mu)$. Therefore
$$(Z^0, Z^1, Z^2, Z^3) = (\omega^0, \omega^1, \pi_{0'}, \pi_{1'}),$$
so the homogeneous coordinates are the same as before. This explains the apparently eccentric order that we chose for the $Z^\alpha$s: it was determined by the conventional representation of a twistor (a point of $\mathbb T$) as an ordered pair of spinors $(\omega^A, \pi_{A'})$.
In spinor notation, the linear system of an ASDYM connection is the pair of operators $\pi^{A'}D_{AA'}$ ($A = 0, 1$).
NOTES ON CHAPTER 9
1. We can also define twistor spaces for subsets of space-time that do not satisfy the
connectivity condition, but in order to make the Penrose-Ward transform work in a
natural way, it is necessary to count each connected component of the intersection of
each a-plane with U as a separate point of P. In such cases P is a covering of a subset
of PT, and its topology need not be Hausdorff.
2. The space of β-planes is the dual projective space, and one can think of the conjugation as being determined by an antilinear map $\mathbb T \to \mathbb T^*$, $Z^\alpha \mapsto \bar Z_\alpha$. The indefinite form $Z^\alpha\bar Z_\alpha$ has signature $++--$, and reduces the complex conformal group to its real form $SU(2, 2)/\mathbb Z_4$. The space $\mathbb{PN}$ is picked out as the set of `null twistors' such that $Z^\alpha\bar Z_\alpha = 0$. This `reality structure' on twistor space is central to many of the original applications in Minkowski space (Penrose 1976, Penrose and MacCallum 1972, Penrose and Rindler 1986).
3. We can identify $\mathbb{PT} - I$ as a $\mathbb{CP}_1$ bundle over $\mathbb E$ with the projective primed spin bundle $F$, a construction that extends to $S^4$ when we include $I$ as the fibre over the point at infinity. This observation is the starting point for the construction of the twistor space of an ASD Riemannian metric in Atiyah et al. (1978a), in which $F$ is given a complex structure by taking $\partial/\partial\bar\zeta$ and $\pi^{A'}\nabla_{AA'}$ as the $(0,1)$-vectors (see §10.4).
4. Note that $\varepsilon$ is invariant under the action of $SL(4, \mathbb C)$, but not that of $GL(4, \mathbb C)$. However, any conformal transformation of space-time can be represented (uniquely modulo $\mathbb Z_4$) by an element of the special linear group, so $\varepsilon$ is a conformally invariant structure on twistor space.
5. This follows from the identity
$$\det(1 + A) = 1 + A^a{}_a + A^{[a}{}_aA^{b]}{}_b + \cdots + A^{[a}{}_aA^b{}_b \cdots A^{c]}{}_c,$$
where the last term contains $n$ factors. If 1 + A is of the form of j, then the series terminates at the term of degree $k$ in $A$.
6. The integrability condition is the vanishing of the torsion tensor. In real coordinates on $M$, this is defined by
$$N^a{}_{bc} = 4(\partial_dJ^a{}_{[c})J^d{}_{b]} + 4J^a{}_d\partial_{[c}J^d{}_{b]}.$$
That the vanishing of $N$ is sufficient for the existence of local holomorphic coordinates is quite straightforward when $J$ is assumed to be real analytic, but is the hard theorem of Newlander and Nirenberg under the more general condition that $J$ is smooth. See Kobayashi and Nomizu (1969).
7. An element of the first cohomology group $H^1(M, E)$ can also be understood in a more geometric way through the classification of affine bundles $A \to M$ associated with a given vector bundle $E$. The fibres of $A$ are affine spaces modelled on the fibres of $E$, that is, $A_m$ is a copy of $E_m$, except that the origin in $A_m$ is not specified. More formally, $A$ is a complex manifold together with a projection $\pi: A \to M$ and, for each $m \in M$, a map $v: A_m \times A_m \to E_m$, where $A_m = \pi^{-1}(m)$, such that
(i) $v(a, b) + v(b, c) = v(a, c)$ for $a, b, c \in A_m$;
(ii) for each $m \in M$, there is a neighbourhood $V \ni m$ and a holomorphic map $\rho: V \to A$ such that (a) $\pi \circ \rho$ is the identity and (b) $a \mapsto v(\rho(\pi(a)), a)$ is a biholomorphic map from $\pi^{-1}(V)$ onto $E|_V$.
Thus locally we can identify $A$ with $E$ by using $\rho$ to pick out an origin in each fibre $A_m$, but on the overlap of two such open sets $V$ and $\tilde V$, we shall have that the corresponding maps $\rho$ and $\tilde\rho$ are related by $v(\rho, \tilde\rho) = g$ for some section $g$ of $E$ over $V \cap \tilde V$. The $g$s determine a class in $H^1(M, E)$, which is uniquely associated with $A$ independently of the choices made for the $V$s and $\rho$s. Conversely, every class in $H^1(M, E)$ determines an affine bundle such that the local representatives of the class give the transitions between local choices for the origins in the fibres of $A$. Thus $H^1(M, E)$ can be identified with the set of equivalence classes of affine bundles over $M$ modelled on the fibres of $E$, with the obvious definition of equivalence.
8. It is also possible to introduce spin structures in the more general context of complex conformal geometry. Here we are given the metric only up to a complex conformal factor, and a choice of `orientation', that is, a global duality tensor $\varepsilon_{abcd}$ (the duality tensor is determined locally by the conformal structure up to sign; we are assuming that the signs can be chosen consistently over the whole of space-time, which is a nontrivial global constraint). We define an oriented conformal null tetrad at a point to be a basis $\{W, Z, \tilde W, \tilde Z\}$ for the tangent space such that
(i) $g(W, \tilde W) = -g(Z, \tilde Z)$, and all the other inner products vanish;
(ii) $W^{[a}Z^{b]}$ is self-dual.
These form a principal bundle $C$ with structure group $G/\mathbb Z_2$, where $G$ is the subgroup of $GL(2, \mathbb C) \times GL(2, \mathbb C)$ on which the determinants of the two matrices are equal. A complex conformal spin structure is a principal $G$-bundle $\tilde C$ together with an equivariant projection $\tilde C \to C$ with the same property as before. This is sufficient to allow the introduction of the spinor bundles $S$ and $S'$, and the isomorphism $S \otimes S' \to TM$, but not the spinors $\varepsilon_{AB}$ and $\varepsilon_{A'B'}$. It is possible to have a conformal spin structure for a given global metric in a situation in which a spin structure does not exist.
9. We are using the form (U)$_2$ of the reality condition; if we use (U)$_1$, then $\tilde w = \bar w$ and $\tilde z = \bar z$ on the real slice, in which case the structure group reduces to the isomorphic real form $SU(1, 1) \times SU(1, 1)/\mathbb Z_2$.
10
The twistor correspondence

In this chapter, we shall derive the Penrose-Ward transform, by which a solution

to the ASDYM equation on a domain U in complex space-time is shown to
determine, and be determined by, a holomorphic vector bundle on the twistor
space P of U. For a general local analytic solution, the bundle can be represented
by a patching matrix, a matrix-valued function F of three complex variables. The
variables are coordinates on P, and F is the patching matrix of the holomorphic
vector bundle relative to a two-set open cover of P. We explain how F can
be constructed from the linear system, and how the solution to the ASDYM
equation can be recovered from F by solving a Riemann-Hilbert problem.
We introduce the transform first in a concrete form as a correspondence be-
tween potentials and patching matrices, and then in an abstract geometric form
as a correspondence between connections and holomorphic bundles over P. To illustrate
the power of the construction, we give Ward's derivation of the Painlevé
property of the ASDYM equation, and the complete solution to the instanton
problem due to Atiyah et al. (1978b), that is, the problem of determining the
space of global solutions to the ASDYM equation on S4. By methods of algebraic
geometry, the problem is reduced to equations on finite-dimensional matrices.
There is an analogous global problem in ultrahyperbolic signature and again the
corresponding bundles can be described precisely. The description involves an
arbitrary map from RP3 into the complexified gauge group together with addi-
tional algebraic data, and so here the space of solutions is infinite dimensional.
We extend the transform to the GASDYM equation and its hierarchy, which
were introduced in Chapter 8.
At the end of the chapter we discuss the linearization of the transform, which
identifies the solution space WD of the background-coupled wave equation with
a cohomology group constructed from E. We show that the symplectic form
on WD has a straightforward expression on twistor space. The patching matrix
will play a central part in our treatment of the Hamiltonian theory of integrable
systems because the recursion operator takes a particularly simple form when it
is expressed as a linear operator on infinitesimal variations in F.

10.1 THE CONCRETE FORM OF THE PENROSE-WARD TRANSFORM

The Penrose-Ward transform is derived from the observation that the ASDYM
equation is equivalent to the vanishing of the curvature on every α-plane (see
Chapter 3). The basic step is to attach a vector space $E'_Z$, the space of covariantly
constant sections on the α-plane Z, to each point $Z \in P$. We shall see in the
next section that this defines a holomorphic vector bundle $E' \to P$ and that the
ASDYM field can be reconstructed from E'. Given a suitable covering of P, any
bundle can be characterized by its patching matrix. In this section we derive the
patching matrix of the bundle directly from the solutions to the linear system
and we show that it determines the ASDYM field.
The fundamental solutions
Let $U \subset \mathbb{CM}$ and let $D = d + \Phi$ be an ASD connection on a vector bundle
$E \to U$ with fibre $\mathbb{C}^n$. Suppose that U is open and that each α-plane that meets
U intersects it in a connected and simply-connected set; for example, U might
be an open ball. In this and the next section, P will denote the twistor space of
U, that is, the space of α-planes that meet U, as defined in §9.2. We shall denote
by $V$, $\tilde V$ a two-set open cover of P, such that $V$ is contained in the complement
of $\zeta = \infty$ and $\tilde V$ is contained in the complement of $\zeta = 0$.1
The compatibility condition for the Lax pair
$L = D_w - \zeta D_{\tilde z}, \qquad M = D_z - \zeta D_{\tilde w}$
implies that the linear system
Ls=0, Ms=0
can be integrated for each fixed value of $\zeta$. Here s is a section of E, represented
by a column vector of length n. We can put together n independent solutions
to form an n x n matrix fundamental solution f: the columns of f form a frame
field for E, made up of sections that are covariantly constant on the α-planes
tangent to $\partial_w - \zeta\partial_{\tilde z}$ and $\partial_z - \zeta\partial_{\tilde w}$. They are single valued because the α-planes
intersect U in simply-connected sets.
Written in full, the equations satisfied by f are
$(\partial_w + \Phi_w)f - \zeta(\partial_{\tilde z} + \Phi_{\tilde z})f = 0,$
$(\partial_z + \Phi_z)f - \zeta(\partial_{\tilde w} + \Phi_{\tilde w})f = 0.$ (10.1.1)
As $\zeta$ varies over the complex plane, we can make f depend holomorphically on $\zeta$,
as well as on the space-time coordinates $w, z, \tilde w, \tilde z$. We cannot, however, extend
f to a regular function on the whole $\zeta$-Riemann sphere (by `regular', we mean
`holomorphic with non-vanishing determinant'), because if f were regular for all
$\zeta$, including $\zeta = \infty$, then, by Liouville's theorem, it would be independent of
$\zeta$. In that case (10.1.1) would imply that the columns of f were covariantly
constant, so the connection would be flat.
Given the choice of gauge, f is unique up to $f \mapsto fH$, where H is a nonsingular
matrix-valued function of $\zeta$ and the space-time coordinates such that
$\partial_w H - \zeta\,\partial_{\tilde z} H = 0, \qquad \partial_z H - \zeta\,\partial_{\tilde w} H = 0.$ (10.1.2)
That is, H can be expressed as a function of $\lambda = \zeta w + \tilde z$, $\mu = \zeta z + \tilde w$ and $\zeta$
(in the notation of §9.2). We can think of f as a function on a subset of the
correspondence space $\mathcal{F}$ and of H as the pull-back of a holomorphic function
on V, with eqn (10.1.2) expressing the constancy of H along the leaves of the
fibration $p: \mathcal{F} \to P$.
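As a check on this statement, the following short computation may help. It is a sketch in the coordinates used above, with $\lambda = \zeta w + \tilde z$, $\mu = \zeta z + \tilde w$, and with $\ell$ and $m$ used as shorthand for the two operators in (10.1.2). It confirms that $\lambda$ and $\mu$ are constant along the leaves, so that any differentiable function of $(\lambda, \mu, \zeta)$ satisfies (10.1.2) by the chain rule.

```latex
\begin{align*}
\ell &= \partial_w - \zeta\,\partial_{\tilde z}, &
m &= \partial_z - \zeta\,\partial_{\tilde w},\\
\ell\lambda &= \partial_w(\zeta w + \tilde z) - \zeta\,\partial_{\tilde z}(\zeta w + \tilde z)
             = \zeta - \zeta = 0, &
m\lambda &= 0,\\
\ell\mu &= 0, &
m\mu &= \partial_z(\zeta z + \tilde w) - \zeta\,\partial_{\tilde w}(\zeta z + \tilde w)
      = \zeta - \zeta = 0,\\
\ell H(\lambda,\mu,\zeta) &= H_\lambda\,\ell\lambda + H_\mu\,\ell\mu = 0, &
m H(\lambda,\mu,\zeta) &= H_\lambda\,m\lambda + H_\mu\,m\mu = 0.
\end{align*}
```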
When D is not flat, it is impossible to choose f so that it is regular at $\zeta = \infty$,
as well as for finite values of $\zeta$. We can, however, find another fundamental
solution $\tilde f$ which is holomorphic in $\zeta$ on the whole Riemann sphere, except at
$\zeta = 0$, by setting $\tilde\zeta = 1/\zeta$ and solving the linear system in the form
$\tilde\zeta D_w\tilde f - D_{\tilde z}\tilde f = 0, \qquad \tilde\zeta D_z\tilde f - D_{\tilde w}\tilde f = 0.$ (10.1.3)
This solution is unique up to $\tilde f \mapsto \tilde f\tilde H$, where $\tilde H$ is holomorphic on $\tilde V$.
The patching matrix
On the overlap of the domains of $f$ and $\tilde f$ in $\mathcal{F}$, we have
$f = \tilde f F$
where F satisfies equation (10.1.2), so that it is the pull-back by p of a holomorphic
function on $V \cap \tilde V$. We call F the patching matrix associated with D. It
is determined by D up to the equivalence $F \sim \tilde H^{-1}FH$, where $H$ is regular on
$V$, and $\tilde H$ is regular on $\tilde V$. The matrices in the equivalence class of F are the
patching data of D. When F lies in the class of the identity function, that is
when F can be factorized in the form
$F = \tilde H^{-1}H$,
with $H$ regular in $V$ and $\tilde H$ regular in $\tilde V$, we have a fundamental solution $fH^{-1} =
\tilde f\tilde H^{-1}$ which is global in $\zeta$. In this case the curvature vanishes. When such a
factorization does not exist, the curvature is nonzero. In fact, as we shall show
next, the patching matrix encodes the ASDYM field since D can be recovered
from F.
Under a gauge transformation
$\Phi \mapsto \Phi' = g^{-1}\Phi g + g^{-1}dg$, (10.1.4)
where g is a function of $w, z, \tilde w, \tilde z$ with values in the gauge group, we can construct
fundamental solutions for the new potential $\Phi'$ by replacing $f$ and $\tilde f$ by $g^{-1}f$
and $g^{-1}\tilde f$. This leaves the patching matrix unchanged.
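The invariance can be seen in two lines. The sketch below writes the patching relation as $f = \tilde f F$, so that $F = \tilde f^{-1}f$, and uses only the transformation rule (10.1.4); the first line shows that $g^{-1}f$ solves the linear system of $\Phi'$ whenever $f$ solves that of $\Phi$, and the second that the factors of $g$ cancel in $F$.

```latex
\begin{align*}
(d + \Phi')(g^{-1}f)
  &= -g^{-1}dg\,g^{-1}f + g^{-1}df + (g^{-1}\Phi g + g^{-1}dg)\,g^{-1}f
   = g^{-1}(d + \Phi)f,\\
F' &= (g^{-1}\tilde f)^{-1}(g^{-1}f) = \tilde f^{-1}g\,g^{-1}f = \tilde f^{-1}f = F.
\end{align*}
```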

The map that assigns the patching data to an ASDYM field is the forward
Penrose-Ward transform. In §10.2, we shall give the geometric definition,
in which F is the patching matrix for a pair of trivializations of a holomorphic
vector bundle on the overlap of any pair of open sets covering P, or in which
there may be a family of patching matrices $F_{\sigma\tau}$ defined on the overlaps $V_\sigma \cap V_\tau$
of a general open cover $\{V_\sigma\}$.

The reverse transform

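For each fixed $(w, z, \tilde w, \tilde z)$, we have that
$F(\zeta w + \tilde z, \zeta z + \tilde w, \zeta) = \tilde f^{-1}f$,
which is a Birkhoff factorization. We also have that $\Phi$ is given in terms of $f$ and
$\tilde f$ by
$\Phi_w - \zeta\Phi_{\tilde z} = (-\partial_w f + \zeta\partial_{\tilde z}f)f^{-1} = (-\partial_w\tilde f + \zeta\partial_{\tilde z}\tilde f)\tilde f^{-1}$ (10.1.5)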

together with a similar formula for the other two components. By the uniqueness
statement in Birkhoff's theorem, any other factorization must be given by
$f' = g^{-1}f, \qquad \tilde f' = g^{-1}\tilde f$,
where g is independent of $\zeta$. If we define a new potential $\Phi'$ by substituting $f'$
and $\tilde f'$ for $f$ and $\tilde f$ in (10.1.5), then we shall have $\Phi' = g^{-1}\Phi g + g^{-1}dg$. Hence
F determines $\Phi$ up to gauge transformations.
Now suppose that we start with a given holomorphic matrix $F(\lambda, \mu, \zeta)$ with
nonvanishing determinant, defined on $V \cap \tilde V$. By applying Birkhoff's theorem at
each point of space-time, we can factorize F in the form
$F(\zeta w + \tilde z, \zeta z + \tilde w, \zeta) = \tilde f^{-1}\Delta f$, (10.1.6)
where $f(w, z, \tilde w, \tilde z, \zeta)$ is regular for $|\zeta| < 1$, $\tilde f(w, z, \tilde w, \tilde z, \zeta)$ is regular for $|\zeta| > 1$,
including $\zeta = \infty$, and $\Delta = \mathrm{diag}(\zeta^{k_1}, \dots, \zeta^{k_n})$ for some integers $k_1, \dots, k_n$, which
may not be constant. If F is chosen so that $\Delta = 1$ at some point of space-time,
then $\Delta = 1$ in an open set U of space-time. We want to show that under this
condition, F is the patching matrix associated with some solution to the ASDYM
equation. Since F is constant along $\partial_w - \zeta\partial_{\tilde z}$, we have
$(\partial_w f - \zeta\partial_{\tilde z}f)f^{-1} = (\partial_w\tilde f - \zeta\partial_{\tilde z}\tilde f)\tilde f^{-1}$
at every point of U, for all $\zeta$ in some neighbourhood of the unit circle. The
left-hand side is holomorphic for $|\zeta| < 1$ and the right-hand side is holomorphic
for $|\zeta| > 1$, except for a simple pole at infinity. It follows by an extension of
Liouville's theorem that both sides must be of the form $-\Phi_w + \zeta\Phi_{\tilde z}$, where $\Phi_w$
and $\Phi_{\tilde z}$ are independent of $\zeta$. We take these to be two of the components of $\Phi$,
and we use the same argument with $\partial_w - \zeta\partial_{\tilde z}$ replaced by $\partial_z - \zeta\partial_{\tilde w}$ to construct
the other two components. We then have
$D_w f - \zeta D_{\tilde z}f = 0, \qquad D_z f - \zeta D_{\tilde w}f = 0$,
where $D = d + \Phi$ acts on the columns of f. It follows that the linear system
associated with $\Phi$ is integrable and hence that $D = d + \Phi$ is ASD.
Thus not only can D be recovered from its patching matrix, but any given
patching matrix such that $\Delta = 1$ at some point of space-time generates a solution
to the ASDYM equation on an open subset of space-time.
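In the scalar case (rank n = 1) the factorization at a fixed space-time point reduces to splitting the Laurent series of log F on the unit circle into its non-negative and negative powers of $\zeta$. The following is a minimal numerical sketch of that special case only, not of the matrix Riemann-Hilbert problem. The function name `scalar_birkhoff`, the sample count and the test function are illustrative assumptions, and the method assumes that F has winding number zero and that log F stays on the principal branch.

```python
import numpy as np

def scalar_birkhoff(values):
    """Split samples of a scalar F on |zeta| = 1 as F = f_tilde_inv * f.

    f collects the non-negative powers of zeta (regular for |zeta| < 1);
    f_tilde_inv collects the negative powers (regular for |zeta| > 1).
    Assumes winding number zero and log F on the principal branch.
    """
    n = len(values)
    coeffs = np.fft.fft(np.log(values)) / n      # Laurent coefficients of log F
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer powers of zeta
    log_f = n * np.fft.ifft(np.where(k >= 0, coeffs, 0.0))
    log_f_tilde_inv = n * np.fft.ifft(np.where(k < 0, coeffs, 0.0))
    return np.exp(log_f), np.exp(log_f_tilde_inv)

# Hypothetical test function at one space-time point: F = exp(zeta + 2/zeta).
n = 64
zeta = np.exp(2j * np.pi * np.arange(n) / n)
F = np.exp(zeta + 2.0 / zeta)
f, f_tilde_inv = scalar_birkhoff(F)
```

Here the constant term of log F is assigned to f; moving it to the other factor corresponds to the freedom $f \mapsto g^{-1}f$, $\tilde f \mapsto g^{-1}\tilde f$ with g independent of $\zeta$.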
The J and K potentials
Given $f$ and $\tilde f$ we can write down the solution to the ASDYM equation more
directly by use of the following lemma (the notation is explained in §2.3).
Lemma 10.1.1 The gauge potential is given in terms of $f$ and $\tilde f$ by
$\Phi = h\,\partial h^{-1} + \tilde h\,\tilde\partial\tilde h^{-1}$,
where $h = f|_{\zeta=0}$ and $\tilde h = \tilde f|_{\zeta=\infty}$.
Proof The first two equations are derived by putting $\zeta = 0$ in (10.1.1), and the
second two by putting $\tilde\zeta = 0$ in (10.1.3).
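Spelled out, the proof amounts to the following. The sketch assumes that $\partial = dw\,\partial_w + dz\,\partial_z$ and $\tilde\partial = d\tilde w\,\partial_{\tilde w} + d\tilde z\,\partial_{\tilde z}$, which is how the notation of §2.3 is read here, and uses $\partial(hh^{-1}) = 0$ to move the derivative onto the inverse.

```latex
\begin{align*}
\zeta = 0 \text{ in (10.1.1):}\quad
  & (\partial_w + \Phi_w)h = 0,\quad (\partial_z + \Phi_z)h = 0
  \;\Longrightarrow\;
  \Phi_w = -(\partial_w h)h^{-1} = h\,\partial_w h^{-1},\quad
  \Phi_z = h\,\partial_z h^{-1},\\
\tilde\zeta = 0 \text{ in (10.1.3):}\quad
  & (\partial_{\tilde z} + \Phi_{\tilde z})\tilde h = 0,\quad
    (\partial_{\tilde w} + \Phi_{\tilde w})\tilde h = 0
  \;\Longrightarrow\;
  \Phi_{\tilde z} = \tilde h\,\partial_{\tilde z}\tilde h^{-1},\quad
  \Phi_{\tilde w} = \tilde h\,\partial_{\tilde w}\tilde h^{-1}.
\end{align*}
```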
The factorization of the patching matrix also determines J and K matrices for
D directly without integration as follows. Since the matrices $h$ and $\tilde h$ in Lemma
10.1.1 satisfy the same identities as those in the definition of J in §3.3, we have
$J = \tilde h^{-1}h$. Thus J is determined uniquely from F by evaluating its factors
at $\zeta = 0$ and $\zeta = \infty$. To determine K, we look at the next terms in the
Taylor expansions of $f$ and $\tilde f$. At points where $\Delta = 1$, $\Phi$ is nonsingular and the
factorization can be fixed uniquely by imposing the gauge condition $f = 1$ at
$\zeta = 0$. Then $\Phi_w = \Phi_z = 0$ and
$f = 1 + \zeta K + O(\zeta^2), \qquad \tilde f = J^{-1} + O(\zeta^{-1})$
for some matrix-valued function K. By substituting the first of these into
(10.1.1), and by picking out the terms of order $\zeta$, we find that
$\Phi_{\tilde z} = \partial_w K, \qquad \Phi_{\tilde w} = \partial_z K$.
It follows that K is the same as the K-matrix introduced in §3.3.
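The order-by-order bookkeeping behind this step can be sketched as follows. It uses the first equation of (10.1.1) in the gauge where $\Phi_w = \Phi_z = 0$, with $f = 1 + \zeta K + O(\zeta^2)$; the second equation is handled in the same way.

```latex
\begin{align*}
0 &= \partial_w f - \zeta(\partial_{\tilde z} + \Phi_{\tilde z})f
   = \zeta\,\partial_w K - \zeta\,\Phi_{\tilde z} + O(\zeta^2)
   &&\Longrightarrow\quad \Phi_{\tilde z} = \partial_w K,\\
0 &= \partial_z f - \zeta(\partial_{\tilde w} + \Phi_{\tilde w})f
   = \zeta\,\partial_z K - \zeta\,\Phi_{\tilde w} + O(\zeta^2)
   &&\Longrightarrow\quad \Phi_{\tilde w} = \partial_z K.
\end{align*}
```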
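Example 10.1.2 The Atiyah-Ward ansatz.2 Consider the patching matrix

$F = \begin{pmatrix} \zeta & \gamma \\ 0 & \zeta^{-1} \end{pmatrix}$
where $\gamma$ is an arbitrary scalar holomorphic function on $V \cap \tilde V$. By putting $\lambda =
\zeta w + \tilde z$ and $\mu = \zeta z + \tilde w$ and by expanding $\gamma$ in a Laurent series in $\zeta$, we can write
$\gamma = \sum_{i=-\infty}^{\infty} \gamma_i\zeta^{-i} = \gamma_+ + \phi + \gamma_-$
where the $\gamma_i$s are functions of the space-time coordinates, $\gamma_+$ and $\gamma_-$ contain
only positive and negative powers of $\zeta$, respectively, and $\phi = \gamma_0$ is the term of
order zero in $\zeta$. Since $\ell\gamma = m\gamma = 0$, where $\ell = \partial_w - \zeta\partial_{\tilde z}$ and $m = \partial_z - \zeta\partial_{\tilde w}$, we have the recursion relations
$\partial_w\gamma_i = \partial_{\tilde z}\gamma_{i+1}, \qquad \partial_z\gamma_i = \partial_{\tilde w}\gamma_{i+1}$,
from which it follows that each of the $\gamma_i$s satisfies the scalar wave equation.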
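The last step is a one-line consequence of the recursion relations $\partial_w\gamma_i = \partial_{\tilde z}\gamma_{i+1}$ and $\partial_z\gamma_i = \partial_{\tilde w}\gamma_{i+1}$. The sketch below assumes that, in these double-null coordinates, the scalar wave operator is proportional to $\partial_z\partial_{\tilde z} - \partial_w\partial_{\tilde w}$; the overall normalization and sign are a convention not fixed in this section.

```latex
\begin{align*}
\partial_{\tilde w}\partial_w\gamma_i
  &= \partial_{\tilde w}\partial_{\tilde z}\gamma_{i+1}
   = \partial_{\tilde z}\bigl(\partial_{\tilde w}\gamma_{i+1}\bigr)
   = \partial_{\tilde z}\partial_z\gamma_i,\\
\text{hence}\quad
(\partial_z\partial_{\tilde z} - \partial_w\partial_{\tilde w})\gamma_i &= 0
  \quad\text{for every } i.
\end{align*}
```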
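The Birkhoff factorization is $F = \tilde f^{-1}f$ where
$f = \frac{1}{\sqrt{\phi}}\begin{pmatrix} \zeta & \phi + \gamma_+ \\ -1 & -\zeta^{-1}\gamma_+ \end{pmatrix}, \qquad \tilde f = \frac{1}{\sqrt{\phi}}\begin{pmatrix} 1 & -\zeta\gamma_- \\ -\zeta^{-1} & \phi + \gamma_- \end{pmatrix}.$
It can be seen that this factorization is nondegenerate whenever $\phi \neq 0$ (when
$\phi = 0$ we must take $\Delta = \mathrm{diag}(\zeta, \zeta^{-1})$). From Lemma 10.1.1 and the recursion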
relations, we have

$\Phi = \frac{1}{2\phi}\begin{pmatrix} \tilde\partial\phi - \partial\phi & 2(\phi_z\,d\tilde w + \phi_w\,d\tilde z) \\ 2(\phi_{\tilde z}\,dw + \phi_{\tilde w}\,dz) & \partial\phi - \tilde\partial\phi \end{pmatrix}.$
The ASDYM condition on $\Phi$ is precisely the wave equation on $\phi$. This correspondence
between ASDYM solutions and solutions of the wave equation has
become known as the 't Hooft ansatz, and was used to generate the first examples
of instantons (see §10.4). With higher powers of $\zeta$ and $\zeta^{-1}$ on the diagonal, the
result is a similar relationship between ASD connections and solutions to the
zero-rest-mass field equations with higher helicity.
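The factorization in this example can be spot-checked numerically. The sketch below takes the factors to be $f = \phi^{-1/2}\begin{pmatrix}\zeta & \phi + \gamma_+\\ -1 & -\zeta^{-1}\gamma_+\end{pmatrix}$ and $\tilde f = \phi^{-1/2}\begin{pmatrix}1 & -\zeta\gamma_-\\ -\zeta^{-1} & \phi + \gamma_-\end{pmatrix}$, with $F = \begin{pmatrix}\zeta & \gamma\\ 0 & \zeta^{-1}\end{pmatrix}$ and $\gamma = \gamma_+ + \phi + \gamma_-$, and verifies $\tilde f F = f$ at one point. The function name and the sample values are hypothetical; the check is purely algebraic and does not test the holomorphy of the factors.

```python
import numpy as np

def atiyah_ward_factors(zeta, gamma_plus, phi, gamma_minus):
    """Return (f, f_tilde, F) at one point for given values of the pieces of gamma."""
    root = np.sqrt(complex(phi))
    f = np.array([[zeta, phi + gamma_plus],
                  [-1.0, -gamma_plus / zeta]], dtype=complex) / root
    f_tilde = np.array([[1.0, -zeta * gamma_minus],
                        [-1.0 / zeta, phi + gamma_minus]], dtype=complex) / root
    F = np.array([[zeta, gamma_plus + phi + gamma_minus],
                  [0.0, 1.0 / zeta]], dtype=complex)
    return f, f_tilde, F

# Arbitrary (hypothetical) sample values; any nonzero zeta and phi will do.
f, f_tilde, F = atiyah_ward_factors(zeta=0.3 + 0.4j, gamma_plus=0.7 - 0.2j,
                                    phi=1.5 + 0.1j, gamma_minus=-0.4 + 0.9j)
residual = np.abs(f_tilde @ F - f).max()   # should vanish up to rounding
det_f = np.linalg.det(f)                   # both factors have unit determinant
det_f_tilde = np.linalg.det(f_tilde)
```

Both determinants equal 1 identically, which is why the normalizing factor $\phi^{-1/2}$ appears and why the factorization degenerates where $\phi = 0$.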
Remark. The connection in the example has a particularly simple geometric
interpretation in terms of spinors. If we rescale the flat metric by $\phi$, then the
primed spinor connection becomes
$\nabla_{AA'}\alpha_{B'} = \partial_{AA'}\alpha_{B'} - \alpha_{A'}\partial_{AB'}\log\phi + \tfrac{1}{2}\alpha_{B'}\partial_{AA'}\log\phi$.

The ASD part of the curvature of this connection is precisely the scalar curvature
of the rescaled metric which, according to standard formulas (Penrose and
Rindler 1986) vanishes if and only if $\phi$ satisfies the wave equation.

10.2 THE ABSTRACT FORM OF THE TRANSFORM

Our presentation of the Penrose-Ward transform has been explicit, but gives the
false impression that the choices made for the cover $V$, $\tilde V$, and for the coordinates
play a special role in the construction. We shall see in this section that, viewed
abstractly, the transform is one between ASDYM fields on U and holomorphic
vector bundles on P, and that the choices are simply those required to repre-
sent the bundle in a concrete way by the transition matrix between two local
trivializations.
In place of a stereographic coordinate, we can equally well use the components
of a spinor $\pi_{A'}$ as homogeneous coordinates on the $\zeta$-Riemann sphere. The
points $\zeta = 0$ and $\zeta = \infty$ then have no special significance; they are needed to
construct J and K, but not the gauge potential itself. In its geometric form, the
correspondence between ASD connections and holomorphic bundles is transparently
covariant under space-time coordinate transformations.
We shall see that the abstract version of the transform gives us considerable
flexibility in specifying data on twistor space. Instead of describing the bundle by
its patching functions, we can, for example, specify the $\bar\partial$-operator, as in the AHS
construction below (§10.4); see also §12.6 and Woodhouse (1985), as well as
the formulations in Newman (1986) and in Sparling (1991). Other possibilities
are the algebraic descriptions in the ADHM construction and in §12.4.
Holomorphic bundles on P
The patching matrix F of an ASD connection D on U determines a rank-n
holomorphic vector bundle $E' \to P$, by interpreting F as the transition matrix
between holomorphic trivializations over $V$ and $\tilde V$. Since F has a Birkhoff
factorization with $\Delta = 1$ at each space-time point in U, the restriction of E' to each
line in P corresponding to a point of U is holomorphically trivial. The holomorphic
bundle is uniquely determined up to equivalence by D since the freedom in
the construction of F from D is precisely the freedom in the choice of the two
local holomorphic trivializations. Conversely, given $E' \to P$, we can recover F
and hence D provided that we can find holomorphic frame fields for E' in the
open sets $V$ and $\tilde V$; this is always possible because these two open sets can be
chosen to be Stein manifolds. We therefore have the following theorem, which
we shall prove by a direct geometric method. The theorem also holds for other
complex gauge groups, such as SL(n, C), and can be proved in the same way.
Theorem 10.2.1 Ward (1977). Let $U \subset \mathbb{CM}$ be an open set such that the intersection
of U with every α-plane that meets U is connected and simply connected.
Then there is a one-to-one correspondence between solutions to the ASDYM equation
on U with gauge group GL(n, C) and holomorphic vector bundles $E' \to P$
such that $E'|_{\hat x}$ is trivial for every $x \in U$.
Proof To go in the forward direction, we start with an ASD connection D on
a rank-n bundle $E \to U$ and we define the fibre of E' at $Z \in P$ by
$E'_Z = \{\, s \in \Gamma(Z \cap U, E) \mid Ds|_{Z \cap U} = 0 \,\}$,
where $\Gamma(Z \cap U, E)$ is the space of sections of E over $Z \cap U$. Because D is ASD, its
restriction to Z has zero curvature, and because $Z \cap U$ is connected and simply
connected, the covariantly constant sections on $Z \cap U$ are single valued, and
are uniquely determined by their values at any one point. Therefore $E'_Z$ is an
n-dimensional complex vector space. Clearly, it varies holomorphically with Z.
To go in the other direction, we suppose that we are given $E' \to P$ with the
stated properties, and we define a bundle $E \to U$ by
$E_x = \Gamma(\hat x, E')$,
where $\Gamma$ denotes the space of holomorphic sections. By hypothesis, the restriction
of E' to $\hat x$ is the product bundle $\hat x \times \mathbb{C}^n$. In this trivialization, global sections of
$E'|_{\hat x}$ are holomorphic maps $\hat x \to \mathbb{C}^n$. By Liouville's theorem they are constant,
and hence $E_x$ is a vector space of dimension n.
We want to construct a connection D on E such that for each $Z \in P$, $E'_Z$
is the space of covariantly constant sections of E over $Z \cap U$. This requirement
contains a geometric characterization of parallel transport over α-planes: for each
$Z \in P$, we identify the fibres $E_x$, $x \in Z \cap U$, with $E'_Z$ by evaluation at Z (an
element of $E_x$ is by definition a section of $E'|_{\hat x}$). If D exists, then the covariantly
constant sections over $Z \cap U$ are those that are constant in $E'_Z$. Therefore, if the
connection exists, then it is unique since null vectors tangent to α-planes span
the tangent space at each point of space-time.
To establish the existence of D, we shall work on the correspondence space
$\mathcal{F}$, and make another application of Liouville's theorem. Consider the pull-back
$p^*E'$, which is a bundle over $\mathcal{F}$; by construction, we also have $p^*E' = q^*E$. Let
$Z \in P$. By the definition of the pull-back, the fibre of $p^*E'$ at each point of
$p^{-1}(Z) \subset \mathcal{F}$ is identified with $E'_Z$. Therefore the restriction of $p^*E'$ to $p^{-1}(Z)$
is a product bundle. Now the leaves of the projection $p: \mathcal{F} \to P$ are spanned by
the vector fields
$\ell = \partial_w - \zeta\partial_{\tilde z}, \qquad m = \partial_z - \zeta\partial_{\tilde w}$
on $\mathcal{F}$. We define a partial connection D that allows us to differentiate the sections
of $p^*E'$ along the twistor fibration by requiring that on each leaf $p^{-1}(Z)$ we
should have
$D_\ell s = \ell(s), \qquad D_m s = m(s)$
in the trivialization $p^*E' = p^{-1}(Z) \times E'_Z$. The sections for which $D_\ell s$ and $D_m s$
vanish are the pull-backs to $\mathcal{F}$ of local sections of E'.
We now pick a local trivialization of E over some open subset of U. This
determines a local trivialization of $p^*E'$, in which
$D_\ell = \ell + \Phi_\ell, \qquad D_m = m + \Phi_m$,
where $\Phi_\ell$ and $\Phi_m$ are matrix-valued functions of $\zeta$ and the space-time coordinates.
From their definition, we see that $\zeta^{-1}\ell$ and $\zeta^{-1}m$ are non-singular at
$\zeta = \infty$. By considering the partial connection along these two rescaled vector
fields, we see that the same must be true of $\zeta^{-1}\Phi_\ell$ and $\zeta^{-1}\Phi_m$. Therefore $\Phi_\ell$ and
$\Phi_m$ are holomorphic over the whole $\zeta$ sphere, except for simple poles at $\zeta = \infty$.
It follows that
$\Phi_\ell = \Phi_w - \zeta\Phi_{\tilde z}, \qquad \Phi_m = \Phi_z - \zeta\Phi_{\tilde w}$,
where $\Phi = \Phi_w\,dw + \Phi_z\,dz + \Phi_{\tilde w}\,d\tilde w + \Phi_{\tilde z}\,d\tilde z$ is independent of $\zeta$. The required
connection is $D = d + \Phi$.
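The step from "holomorphic with at most a simple pole at $\zeta = \infty$" to "linear in $\zeta$" is the extension of Liouville's theorem used throughout this chapter. A sketch of the argument for one of the two coefficients, at a fixed space-time point, with $a_k$ denoting the (matrix-valued) Taylor coefficients:

```latex
\begin{align*}
\Phi_\ell(\zeta) &= \sum_{k \ge 0} a_k\,\zeta^k
  \quad\text{(entire in } \zeta\text{)},\\
\zeta^{-1}\Phi_\ell(\zeta) \ \text{regular at}\ \zeta = \infty
  &\;\Longrightarrow\; a_k = 0 \ \text{for}\ k \ge 2
  \quad\text{(Cauchy estimates)},\\
\Phi_\ell(\zeta) &= a_0 + a_1\zeta =: \Phi_w - \zeta\,\Phi_{\tilde z}.
\end{align*}
```

The names $\Phi_w = a_0$ and $\Phi_{\tilde z} = -a_1$ are then definitions; the same argument applied to the other coefficient yields $\Phi_z$ and $\Phi_{\tilde w}$.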
Remarks. (i) We can also recover the potential when E' is presented in terms of
a general system of local trivializations. Suppose that E' has patching matrices
$F_{\sigma\tau}: V_\sigma \cap V_\tau \to GL(n, \mathbb{C})$,
where $\{V_\sigma\}$ is an open cover of P. We can then recover $\Phi$ by solving the factorization
problem
$F_{\sigma\tau} \circ p = f_\sigma^{-1}f_\tau$,
where $f_\sigma$ is regular on $p^{-1}(V_\sigma)$. Then,
(D,u dw + 4)Z dz - ((4),i, dw + 4)i dz) = foc7ln -
for all $\sigma$. If $p^{-1}(V_0)$ contains $\zeta = 0$ and $p^{-1}(V_1)$ contains $\zeta = \infty$, then we can
introduce J and K potentials by
i

where the factorization is fixed by imposing the condition $f_0 = 1$ at $\zeta = 0$.

(ii) In spinor notation, the linear system of an ASDYM connection is the pair
of operators $\pi^{A'}D_{AA'}$ (A = 0, 1), and the gauge potential is recovered from the
factorization of the patching matrix by
$\pi^{A'}\Phi_{AA'} = -\pi^{A'}(\partial_{AA'}f)f^{-1} = -\pi^{A'}(\partial_{AA'}\tilde f)\tilde f^{-1}$.
(iii) Theorem 10.2.1 remains true when we include points at infinity: it gives a
correspondence between solutions on an open subset $U \subset \mathbb{CM}^\#$ and holomorphic
vector bundles over P (the subset of $\mathbb{PT}$ of α-planes that intersect U). The only
part of the proof that does not clearly extend is the construction of $\Phi$ from the
partial connection along $\ell$ and m. However, the argument is local in space-time,
so we can apply it in a neighbourhood of a point x at infinity by first making a
conformal transformation to map x to a finite point of $\mathbb{CM}$.
Reality conditions
We now turn to the conditions on E' that will ensure that D is real in the sense
that its structure group reduces to a real form of GL(n, C) or SL(n, C) on $\mathbb{E}$ or
$\mathbb{U}$; these we shall formulate in terms of the behaviour of E' under the involution.
We require that U and P should be invariant under the involution $\sigma$ defined
in §9.2. We then define a reality structure on E' to be an antiholomorphic map
$\tau: E' \to E''$, where $E''$ is either $E'$ or the dual bundle $E'^*$, such that the diagram
$\begin{array}{ccc} E' & \xrightarrow{\ \tau\ } & E'' \\ \downarrow & & \downarrow \\ P & \xrightarrow{\ \sigma\ } & P \end{array}$
commutes, and such that for each Z, $\tau$ restricts to an antilinear map from $E'_Z$
onto $E''_{\sigma(Z)}$. If $x \in U$ lies on the real slice, then $\hat x$ is invariant under $\sigma$, and
$\tau$ induces an antilinear map from $\Gamma(\hat x, E')$ to $\Gamma(\hat x, E'')$; that is, from $E_x$ to $E_x$
or $E_x^*$, according to the definition of $E''$. There are several possibilities; for
example, the following.3
Unitary real structures. $E'' = E'^*$ and the map $E_x \to E_x^*$ determines a
Hermitian form on E for each real x. The structure group reduces to U(p, q) or
to SU(p, q) for some p, q with p + q = n.
Real vector bundles. $E'' = E'$ and the fixed points of the conjugation $E_x \to
E_x$ form a rank-n real sub-bundle over the real slice. In this case the structure
group reduces to GL(n, R) or SL(n, R).
Other possibilities, such as reduction to an orthogonal or symplectic group, re-
quire a combination of the second type of structure with a reduction of the
structure group of the bundle on twistor space to either the complex orthogo-
nal or to the complex symplectic group, that is, they require the existence of a
holomorphic skew or symmetric form on each fibre.

10.3 THE PAINLEVÉ PROPERTY

One of the characteristic properties of integrable systems is that their reductions
to ODEs pass the Painlevé test, that is, the singularities of the reduced solutions
are either poles or else they are fixed, in the sense that they do not move with variations in
the initial data (see Chapter 1). Put another way, the solutions are meromorphic
for all choices of initial data on the complement of the set of fixed singularities.
Ward (1984a) proved an analogous property of the ASDYM equation which
explains why the reductions of the ASDYM equation pass the Painlevé test.
The key to his proof is in the geometry of the twistor correspondence. If we
start with a solution in a suitable open set U in complex space-time, then its
Penrose-Ward transform is a holomorphic bundle over the set P of α-planes that
intersect U. It may be, however, that the set of lines contained in P is larger
than U, in which case the reverse transform will extend the solution to a larger
set. The extended solution need not be regular everywhere because some of the
points in the extended region may be jumping lines. However it follows from
Lemma 10.1.1 and Proposition 9.3.4 that, in the gauge determined by $f$ and $\tilde f$,
the jumping singularities are at worst poles in the potential.
In the following statement of Ward's result, `meromorphic' should be understood
in the following sense: in some neighbourhood of each point, there is
a gauge in which the potential $\Phi$ is a meromorphic function of the space-time
coordinates.
Proposition 10.3.1 Ward (1984a). Let S be a non-null complex hypersurface
in $\mathbb{CM}$ given by the vanishing of a holomorphic function t, let $W \subset \mathbb{CM}$ be the
intersection of {Re(t) < 0} with an open ball centred on a point $x \in S$, and let
D be a solution to the ASDYM equation on W. Then D can be extended to a
meromorphic solution on W', where $W' \supset W$ contains an open neighbourhood
of x.
Proof The set W satisfies the conditions of Theorem 10.2.1, and since S is
non-null, every α-plane through x intersects W. Thus the twistor space P of W
contains the line $\hat x$ (see Fig. 10.1). Let W' be the intersection of the open ball
with the space of lines in P. Then W' is an open set containing W and x, and,
if $y \in W'$, then every α-plane through y intersects W. By making a Birkhoff
factorization of the patching matrix of a bundle over P, we can therefore extend
D to W'. It follows from the remarks above that the potential is meromorphic
in W' in the gauge determined by the factorization.

10.4 GLOBAL SOLUTIONS IN EUCLIDEAN SIGNATURE

In this section and the next we consider boundary conditions at infinity, in
this section in Euclidean space and in the next in ultrahyperbolic space-time.
Under mild topological conditions on $U \subset \mathbb{CM}^\#$, the Penrose-Ward transform
gives a correspondence between solutions to the ASDYM equation on U and
holomorphic vector bundles over the twistor space of U. By representing the
bundles by their patching matrices and by solving a Riemann-Hilbert problem,
we can generate meromorphic solutions on U. However, this direct use of the
transform is not well adapted to global problems because it is not easy to locate
jumping singularities in advance, and because it is not easy to express boundary
conditions on the solution as restrictions on the patching matrix.
We shall look at two examples of how we can use the transform to solve
boundary value problems by choosing different representations of the holomor-
phic bundles. The first is the ADHM construction of instantons; the second, in
the next section, is the construction of global solutions on the compact ultra-
hyperbolic space S2 x S2. First we discuss a simplification that occurs in the
Penrose-Ward correspondence in Euclidean signature.

Fig. 10.1. The proof of Proposition 10.3.1.

The AHS version of the correspondence

Atiyah et al. (1978a) observed that the Penrose-Ward transform from space-time
to twistor space is particularly straightforward for fields on $\mathbb{E}$, since in this case
there is a simple description of the construction in terms of $\bar\partial$-operators. It goes as
follows.
In §9.2, we showed that the twistor space $P = \mathbb{PT} - I$ is fibred nonholomorphically
over $\mathbb{E}$ by the map $b: P \to \mathbb{E}$ that sends the twistor Z to the point $b(Z) \in \mathbb{E}$
corresponding to the line joining Z to $\sigma(Z)$. Thus we can represent P as the
product $\mathbb{E} \times \mathbb{CP}^1$, and use as coordinates the three complex variables $z, w, \zeta$,
although they are not holomorphic. The complex structure on P is determined
by the condition that
$\ell = \partial_w - \zeta\partial_{\tilde z}, \qquad m = \partial_z - \zeta\partial_{\tilde w}, \qquad \partial/\partial\bar\zeta$
should span the anti-holomorphic tangent space, that is, the eigenspace of the
complex structure with eigenvalue $-i$.4 The fibration extends to the conformal
compactification by adding a single point at $\infty$ to $\mathbb{E}$ and by adding the line I
to P. We then have $b: \mathbb{CP}^3 \to S^4$, as in §9.2. The construction is conformally
invariant, as can be seen by identifying P with the projective primed spinor
bundle, and identifying the (0, 1) vectors with the projections into P of the
vector fields $\pi^{A'}\partial_{AA'}$ and $\partial/\partial\bar\pi_{A'}$ (§9.9).
Now let D be an ASD connection on a bundle $E \to \mathbb{E}$. We construct a
smooth bundle $E' \to P$ by putting $E' = b^*E$ and give it a holomorphic structure
by defining the $\bar\partial$-operator to be the (0, 1)-part of $b^*D$. In other words, the local
holomorphic sections are the solutions to
$D_\ell s = 0, \qquad D_m s = 0, \qquad \partial s/\partial\bar\zeta = 0$,
for which the integrability conditions hold by the ASD condition in the form
$[D_\ell, D_m] = 0$ (the other commutators vanish trivially). By applying the reverse
transform, we recover from E' a holomorphic solution to the ASDYM equation on a
neighbourhood of $\mathbb{E}$ in $\mathbb{CM}$, which coincides with the one we started with on
restriction to $\mathbb{E}$. In particular, it follows that every solution to the ASDYM
equation on $\mathbb{E}$ is analytic.
Instantons
Instantons are solutions to the ASDYM equations on $\mathbb{E}$ with gauge group SU(n)
such that the action
$S = -\tfrac{1}{2}\int \mathrm{tr}(F_{ab}F^{ab})\,d^4x$

is finite (note that S is nonnegative since the trace is a negative definite inner
product on the Lie algebra of SU(n)). By an application of Uhlenbeck's (1982)
removable singularity theorem, they can be characterized geometrically as the
solutions that extend to the compactification $S^4$. Uhlenbeck's theorem states
that a finite action solution to the Yang-Mills equation on B - {0}, where B is
the open unit ball in $\mathbb{R}^4$, extends to the origin for some choice of gauge; by
making an inversion, and by noting that the action is conformally invariant, the
theorem also applies in a neighbourhood of the point at infinity.
Unless F = 0, the bundle $E \to S^4$ on which the solution is defined is necessarily
topologically nontrivial. This follows from the fact that $\tfrac{1}{2}F_{ab}F^{ab}\,d^4x = F \wedge {*F}$
so that, if $F = -{*F} \neq 0$, then
$\frac{S}{8\pi^2} = \frac{1}{8\pi^2}\int \mathrm{tr}(F \wedge F) \neq 0.$

The right-hand side is an integer-valued topological invariant, which we denote
by k and call the instanton number. It is the second Chern class of E.5
Under the Penrose-Ward transform, an instanton is mapped to a holomorphic
vector bundle $E' \to \mathbb{CP}^3$ with no real jumping lines. When $F \neq 0$, E' is
nontrivial even as a topological bundle, since its topological class is that of the
pull-back of E by the projection $\mathbb{CP}^3 \to S^4$. It has Chern classes $c_1 = c_3 = 0$,
$c_2 = k$.
Monads and holomorphic bundles on $\mathbb{CP}^3$
The construction and classification of global holomorphic vector bundles on pro-
jective space is a classical problem in algebraic geometry. A powerful technique
for generating such bundles is the monad construction of Horrocks. By using
Barth's characterization of the bundles that arise from the Horrocks construc-
tion, Atiyah et al. (1978b) showed that the monad construction, combined with
the Penrose-Ward transform, gives all instantons (this is the ADHM construc-
tion).

Definition 10.4.1 A monad is a sequence

$A \xrightarrow{\ \rho_Z\ } B \xrightarrow{\ \tau_Z\ } C$
in which A and C are k-dimensional complex vector spaces, B is a (2k + n)-dimensional
complex vector space, and $\rho_Z$ and $\tau_Z$ are linear maps, depending linearly
on $Z \in \mathbb{T}$, such that for every Z, (i) $\tau_Z \circ \rho_Z = 0$, (ii) $\rho_Z$ and $\tau_Z$ have maximal
rank.
A monad determines a rank-n holomorphic vector bundle $E' \to \mathbb{CP}^3$, with fibres
$E'_{[Z]} = \ker(\tau_Z)/\mathrm{im}(\rho_Z)$, where $[Z] \in \mathbb{CP}^3$ denotes the equivalence class of Z.
By the first condition, $\mathrm{im}(\rho_Z) \subset \ker(\tau_Z)$, and by the second, the fibres are n-dimensional.
Because the maps depend linearly on Z, the fibre is independent
of the choice of scaling of $Z \in [Z]$. (After being careful in the statement of the
definitions, we shall now revert to our normal abuse of convention, in which we
do not distinguish between $Z \in \mathbb{T}$ and $[Z] \in \mathbb{CP}^3$.)
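The dimension count behind the statement that the fibres are n-dimensional is short enough to write out. In the sketch below, "maximal rank" is read as: $\rho_Z$ is injective and $\tau_Z$ is surjective, for each nonzero Z.

```latex
\begin{align*}
\dim\ker(\tau_Z) &= \dim B - \operatorname{rank}\tau_Z = (2k + n) - k = k + n,\\
\dim\operatorname{im}(\rho_Z) &= \operatorname{rank}\rho_Z = k,\\
\dim E'_{[Z]} &= \dim\ker(\tau_Z) - \dim\operatorname{im}(\rho_Z) = (k + n) - k = n.
\end{align*}
```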
All the holomorphic vector bundles over $\mathbb{CP}^3$ corresponding to instantons arise
from monads. The proof has two parts: the first is to use the linear Penrose
transform to show that if E' is the transform of an instanton, then 6
$H^1(\mathbb{CP}^3, E'(-2)) = 0$,
where we use the notation $E'(q) = E' \otimes \mathcal{O}(q)$. The second is an explicit construction
of the monad under the assumption that this condition holds: the vector
spaces are the cohomology groups
$A = H^1(\mathbb{CP}^3, E'(1) \otimes \Omega^2), \quad B = H^1(\mathbb{CP}^3, E' \otimes \Omega^1), \quad C = H^1(\mathbb{CP}^3, E'(-1))$,
where $\Omega^p$ is the bundle of holomorphic p-forms. The maps $\rho_Z$ and $\tau_Z$ are induced
from
$\Omega^2 \otimes \mathcal{O}(1) \to \Omega^1 \to \mathcal{O}(-1)$,
where Z is regarded as a constant vector field on $\mathbb{T}$, and hence, by projection,
as a global section of $T\mathbb{CP}^3 \otimes \mathcal{O}(-1)$.7
Construction of the solution
To construct a monad one needs only to solve the algebraic relations implicit
in Definition 10.4.1, which is a problem in finite-dimensional matrix algebra.
Given the monad data, therefore, the construction of the solution on space-time
is purely algebraic and much simpler than using Birkhoff factorization to find
the solution from the patching matrices of the bundle.
Suppose that we are given a monad
$A \xrightarrow{\ \rho_Z\ } B \xrightarrow{\ \tau_Z\ } C$.
For each independent pair $Z, \tilde Z \in \mathbb{T}$, we put $x^{\alpha\beta} = Z^{[\alpha}\tilde Z^{\beta]}$ and define
$E_x = \ker(\tau_Z) \cap \ker(\tau_{\tilde Z}) \subset B, \qquad F_x = (\rho_Z A) \cap (\rho_{\tilde Z}A)$
and
$\Delta_x = \tau_Z \circ \rho_{\tilde Z}: A \to C$.
The notation is justified by the fact that $E_x$, $F_x$, and $\Delta_x$ depend only on x,
and not on Z and $\tilde Z$ individually, as a consequence of the linear dependence of
$\rho$ and $\tau$ on Z, and of the condition (i) in Definition 10.4.1, which implies that
$\tau_Z\rho_{\tilde Z} = -\tau_{\tilde Z}\rho_Z$. When x is replaced by $\lambda x$, where $\lambda$ is a nonzero complex number,
the linear map $\Delta_x$ is replaced by $\lambda\Delta_x$, while $E_x$ and $F_x$ are unchanged. We
shall also denote by x the projection of x into $\mathbb{CM}^\#$. At each $x \in \mathbb{CM}^\#$, the
monad determines a subspace $E_x \subset B$. We denote by U the subset of $\mathbb{CM}^\#$ on
which $\Delta_x$ is nonsingular. The following lemma implies that $\dim(E_x) = n$ and
hence that $E \to U$ is a holomorphic vector bundle of rank n.
Lemma 10.4.2 For all $x \in U$,
$B = E_x \oplus \mathrm{im}(\rho_Z) \oplus \mathrm{im}(\rho_{\tilde Z})$.
For all $x \in \mathbb{CM}^\#$, $F_x = 0$.
Proof For each x E U, put
Px = 1 - pZO;1T2 +piA 17-Z:B -* B.
Then $P_x$ also depends only on x, and not on Z and $\tilde Z$ individually, and is
unchanged when x is rescaled. Moreover, $P_x^2 = P_x$ and $P_x(B) = E_x$, so $P_x$ is a
projection operator onto $E_x$. The lemma follows from the fact that
P2A(x)-'7-Z
P., Pzi(x)-'r2 ,

are projection operators from B onto the first, second, and third summands,
respectively.
If $F_x \neq 0$ for some $x \in \mathbb{CM}^\#$, then there exists nonzero $b \in B$ such that
$b \in \rho_Z A$ for all $Z \in \hat x$. But then $\rho_Z^{-1}b$ is a nonzero holomorphic section of $A \otimes
\mathcal{O}(-1) \to \hat x$, where A is the product bundle and $\mathcal{O}(-1)$ is the tautological
line bundle over $\mathbb{CP}^1$ (note that $\rho_Z^{-1}b$ is well-defined because the $\rho$s all have
maximal rank). But $\Gamma(A \otimes \mathcal{O}(-1)) = 0$, so this is not possible, and therefore
$F_x = 0$ everywhere.
We construct a connection D on E by using $P_{x+\delta x}$ to define parallel transport
from $E_x$ to $E_{x+\delta x}$. If we trivialize E in a small neighbourhood of U by identifying
$E_x$ with $E_0 = E_{x_0}$ by
$P_{0x} = P_{x_0}|_{E_x}: E_x \to E_0$
for some base point $x_0 \in U$, then $D = d + \Phi$, where
$\Phi = -P_{0x}(dP_x)P_{0x}^{-1}$
(note that $P_{0x}$ is invertible for x near $x_0$). We claim that (i) E and D are
the same as the bundle and connection obtained from E' by the Penrose-Ward
transform and (ii) U is precisely the complement of the set of jumping lines.
To establish (i), let $x \in U$ and let $Z$ be an $\alpha$-plane through $x$. Then by the
lemma the kernel of the restriction of $P_x$ to $\ker(\tau_Z)$ is $\mathrm{im}(\rho_Z)$. Therefore, for
each $x$ on $Z \cap U$, $P_x$ determines an isomorphism
$$E'_Z = \ker(\tau_Z)/\mathrm{im}(\rho_Z) \to E_x .$$

It follows that $E'|_{\hat x} = \hat x \times E_x$ for every $x \in U$, and hence that the space of global
sections of $E'|_{\hat x}$ is identified with $E_x$, and also that a section of $E$ over $Z \cap U$ that
is parallel with respect to the connection on $E$ determined by the Penrose-Ward
transform is also parallel with respect to $D$. Therefore the two connections are the same.
To establish (ii), suppose that $\Delta_x$ is singular at some $x \in \mathbb{CM}^\#$. Choose
$a \in A$ such that $a \neq 0$ and $\Delta_x a = 0$. Then $\rho_Z a \in \ker(\tau_{\tilde Z})$ for every $\tilde Z \in \hat x$,
and therefore $\rho_Z a$ determines a global holomorphic section of $E'|_{\hat x}$. This section
vanishes at $Z$, by construction, but cannot vanish at any other $\tilde Z \in \hat x$, since
otherwise $\rho_Z a$ would be a nonzero element of $F_x$. It follows that $E'|_{\hat x}$ cannot be
trivial.

A more explicit description

We can write down compact explicit formulas for the gauge potential by using a
spinor notation, which is closely related to the quaternion formalism in Atiyah
(1979).
We suppose that $\Delta_x$ is nonsingular for some $x \in \mathbb{CM}^\#$, and, without loss of
generality, we choose this point to be $x = \infty$, that is, $x^{\alpha\beta} = I^{\alpha\beta}$. We then use
$\Delta_\infty$ to identify $A$ with $C$, and we use the decomposition in Lemma (10.4.2), with
$Z = (o^A, 0)$, $\tilde Z = (\iota^A, 0)$, to identify $B$ with $E_\infty \oplus (V \otimes S)$, where $V = A$ and
$E_\infty = E|_\infty$. We then have
$$A = C = V, \qquad B = E_\infty \oplus (V \otimes S),$$
where $V$ and $E_\infty$ are of dimension $k$ and $n$ respectively, so the elements of $B$ are
of the form $(e, v_A)$, where $e \in E_\infty$ and $v_A$ is a spinor with components in $V$. If
we write $Z \in \mathbb T$ in terms of its spinor parts $\omega^A$, $\pi_{A'}$, then the conditions in the
definition (10.4.1) imply that the maps $\rho_Z$ and $\tau_Z$ are of the form
$$\rho_Z v = (\pi_{A'}R^{A'}v,\; \omega_A v + \pi_{A'}S_A{}^{A'}v)$$
$$\tau_Z(e, v_A) = \pi_{A'}T^{A'}e + \omega^A v_A + \pi_{A'}S^{AA'}v_A,$$
where $v \in V$, $(e, v_A) \in B$, and the components of the spinors $R^{A'}$, $S^{AA'}$, and
$T^{A'}$ are linear maps $V \to E_\infty$, $V \to V$, and $E_\infty \to V$, respectively. Since $\tau_Z\rho_Z = 0$
we have that
$$T^{(A'}R^{B')} + S^{A(A'}S_A{}^{B')} = 0, \tag{10.4.1}$$
as an indexed family of linear maps $V \to V$. Going in the other direction, suppose
that we are given linear maps such that (10.4.1) holds. Then we can reconstruct
$\rho_Z$ and $\tau_Z$. If they have maximal rank for all $Z$, then we have a monad.
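We can write down the corresponding potential. For each point of $\mathbb{CM}$, we
take $Z$ and $\tilde Z$ to be $(x^{AB'}o_{B'}, o_{A'})$ and $(x^{AB'}\iota_{B'}, \iota_{A'})$, and we define
$$M_x : V \otimes S \to V \otimes S'$$
by
$$v_A \mapsto M^{AA'}v_A, \qquad M^{AA'} = x^{AA'} + S^{AA'}.$$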
This notation should be understood as follows: given a basis in $V$, $M_x$ is a $2k \times 2k$
matrix, with entries that are linear functions of the space-time coordinates; the
four $k \times k$ blocks are the matrices of the individual components of $M^{AA'}$, which
are linear maps from $V$ to $V$. We denote the inverse of $M_x$ by $N_x$: this is a map
of the form
$$N_x : V \otimes S' \to V \otimes S : v^{A'} \mapsto N_{AA'}v^{A'}.$$
With this notation,
$$\Delta_x = T_{A'}R^{A'} + M_{AA'}M^{AA'} : V \to V,$$
and, for each $x \in \mathbb{CM}$, we have that $E_x$ is the subspace of $B$ defined by
$$M^{AA'}v_A = -T^{A'}e.$$
So we can define $\alpha_x : E_\infty \to E_x$ by $e \mapsto (e, v_A)$, where $v_A$ is defined by this
equation. The result is a trivialization of $E$ over part of $\mathbb{CM}$.
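If $Z$ is given by the spinor pair $(\hat x^{AB'}\pi_{B'}, \pi_{A'})$, then
$$\tau_Z\alpha_x e = \pi_{A'}T^{A'}e + \pi_{A'}\hat x^{AA'}v_A + \pi_{A'}S^{AA'}v_A = \pi_{A'}(\hat x^{AA'} - x^{AA'})N_{AB'}T^{B'}e.$$
Consequently, in our trivialization the parallel propagator
$$P_{x+\delta x} : E_x \to E_{x+\delta x}$$
is given by
$$e \mapsto \left(1 - o_{C'}R^{C'}\Delta_x^{-1}\iota_{A'}\delta x^{AA'}N_{AB'}T^{B'} + \iota_{C'}R^{C'}\Delta_x^{-1}o_{A'}\delta x^{AA'}N_{AB'}T^{B'}\right)e$$
to the first order in $\delta x$. We conclude that
$$D = d + R_{A'}\Delta_x^{-1}N_{AB'}T^{B'}\,dx^{AA'}.$$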
Remark. Other explicit formulas can be obtained directly within this frame-
work. For example, the vector space C = V is the space of global solutions
to the background-coupled Dirac equation, and from this observation, one can
write down the solutions.
Reality conditions
The instanton bundles are determined from monads that satisfy two further
conditions. First they must have no real jumping lines, that is, $\Delta_x$ must be
nonsingular for every $x \in S^4$. Second $E'$ must admit a unitary real structure,
that is, we must have an identification of $\bar E'^{*}_{\sigma Z}$ with $E'_Z$ for every $Z \in \mathbb{CP}^3$, where
$\sigma$ is the Euclidean involution.
If $E'$ has monad
$$A \xrightarrow{\ \rho\ } B \xrightarrow{\ \tau\ } C$$
then $E'^{*}$ and $\sigma^*\bar E'$ have respective monads
$$C^* \xrightarrow{\ \tau^*\ } B^* \xrightarrow{\ \rho^*\ } A^*, \qquad \bar A \xrightarrow{\ \sigma^*\bar\rho\ } \bar B \xrightarrow{\ \sigma^*\bar\tau\ } \bar C.$$
A bundle determines its monad, so if the reality condition holds, then these two
must be isomorphic, according to the obvious definition of isomorphism, and we
have $B^* = \bar B$ and $C^* = \bar A$. If the reality condition is to reduce the structure
group of $E$ to $SU(n)$, then the pairing $(b, \bar b)$ must be a positive definite Hermitian
metric on $B$, and so we must have $\tau_Z = \rho^{*}_{\sigma Z}$.
We can construct the monads for which these conditions hold by taking $V$ and
$E_\infty$ to be Hermitian vector spaces, and imposing the conditions $T_{A'} = (R^{A'})^\dagger$,
$S_{AA'} = (S^{AA'})^\dagger$, where $\dagger$ denotes the Hermitian conjugate combined with the
appropriate spinor conjugation.
Example 10.4.3 The 't Hooft ansatz. To construct an instanton of charge $k$,
we choose $k$ points $x_i \in \mathbb E$ and $k$ nonzero real numbers $\lambda_i$, $i = 1, \ldots, k$. We set
$V = \mathbb C^k$ and $E_\infty = S'$, and we put
$$S^{AA'} = -\mathrm{diag}(x_1^{AA'}, \ldots, x_k^{AA'}), \qquad R^{A'} = T^{A'\dagger} = (\lambda_1, \ldots, \lambda_k)\,\delta^{A'}_{B'}.$$

The monad conditions are satisfied since each term in equation (10.4.1) vanishes
separately. By working through the various formulas, we find that
AAA, D'C' = _6D' .16D'

where $\phi$ is the solution to the four-dimensional Laplace equation

$$\phi = 1 + \sum_{i=1}^{k} \frac{\lambda_i^2}{(x - x_i)^2}.$$

The result is a $5k$ parameter family of instanton solutions ($\phi$ also has a singular
point at $\infty$, which could be moved to an arbitrary point). Atiyah et al. (1978a)
prove that the moduli space of instantons for each $k$ has dimension $8k - 3$.
However, for $k = 1$ and for $k = 2$, they can be reduced to the above form,
perhaps after a conformal motion (Hartshorne 1978). These are the same as the
examples discussed at the end of §10.1, but with the special form of $\phi$ above.
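
As a numerical sanity check on the 't Hooft ansatz, the following minimal sketch evaluates a potential of the form $\phi = 1 + \sum_i \lambda_i^2/(x - x_i)^2$ on Euclidean $\mathbb R^4$ and applies a finite-difference Laplacian away from the centres. The function names, the sample centres and scales, and the step size are illustrative assumptions and are not taken from the text; the parameter count $5k$ is simply four coordinates plus one scale per centre.

```python
def thooft_phi(x, centres, scales):
    """phi(x) = 1 + sum_i lambda_i^2 / |x - x_i|^2 on Euclidean R^4."""
    total = 1.0
    for x_i, lam in zip(centres, scales):
        r2 = sum((a - b) ** 2 for a, b in zip(x, x_i))
        total += lam ** 2 / r2
    return total


def laplacian4(func, x, h=1e-3):
    """Second-order central-difference Laplacian in four dimensions."""
    centre = func(x)
    acc = 0.0
    for axis in range(4):
        fwd = list(x)
        bwd = list(x)
        fwd[axis] += h
        bwd[axis] -= h
        acc += (func(fwd) + func(bwd) - 2.0 * centre) / h ** 2
    return acc


# Hypothetical data for a charge-2 configuration (k = 2).
centres = [(0.0, 0.0, 0.0, 0.0), (1.0, -2.0, 0.5, 3.0)]
scales = [0.7, 1.3]
point = [2.0, 1.0, -1.0, 0.5]  # a point away from both centres

residual = laplacian4(lambda x: thooft_phi(x, centres, scales), point)
parameter_count = 5 * len(centres)  # 4 position components + 1 scale per centre
```

The residual is zero only up to discretization and rounding error, since each term $1/(x - x_i)^2$ is harmonic in four dimensions away from $x_i$.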

10.5 GLOBAL SOLUTIONS IN ULTRAHYPERBOLIC SIGNATURE

In the Euclidean case, the geometrization of the boundary conditions at infinity
is possible because $S^4$ is a natural one-point compactification of $\mathbb E$. Because the
twistor space of $S^4$ is compact, the bundles over it can be constructed by methods
of algebraic geometry. In the ultrahyperbolic case, the picture is somewhat
different. Here the conformal compactification is
$$\mathbb U^\# = S^2 \times S^2/\mathbb Z_2 \subset \mathbb{CM}^\#$$
and the `points at infinity' form a null hypersurface (§9.2). We cannot deduce, as
we could in the Euclidean case, that solutions that satisfy some natural asymptotic
condition extend smoothly across this hypersurface; indeed we shall see that
nonzero solutions to the ASD Maxwell equations cannot be extended. Nor can
we apply the Ward theorem (Theorem 10.2.1) directly to $\mathbb U^\#$ because the real
$\alpha$-planes in $\mathbb U$ compactify to copies of $\mathbb{RP}^2$, which is not simply connected. Instead
we shall work with the boundary condition that the solution should extend to
the double cover $\tilde{\mathbb U}^\# = S^2 \times S^2$ of $\mathbb U^\#$. Here there is a large class of solutions to
the linear ASD Maxwell equations, the $\alpha$-planes are simply connected, and the
twistor space is compact.
In this section, we shall explore the geometry of $\tilde{\mathbb U}^\#$ and the corresponding
global real version of the Penrose-Ward transform.
The twistor space and the correspondence spaces
We construct three spaces from $\mathbb U^\#$:
(i) the complex correspondence space $\mathcal F$, which is the $\mathbb{CP}^1$-bundle of complex
self-dual null bivectors, modulo scale (that is, $F$ and $\lambda F$ are identified for
$\lambda \neq 0$);
(ii) the real correspondence space $\mathcal F_{\mathbb R}$, which is the $S^1$ bundle of real self-dual
null bivectors, modulo scale;
(iii) the real twistor space $\mathbb{RP}^3$, which is the 3-manifold of $\alpha$-surfaces, that is the
compact totally null 2-surfaces in $\mathbb U^\#$ with self-dual tangent bivectors (we
explained in §9.2 that this is the real form of the Klein correspondence).
An $\alpha$-surface in $\mathbb U^\#$ lifts to $\mathcal F_{\mathbb R}$, by taking its tangent bivector at each point,
and $\mathcal F_{\mathbb R}$ is fibred over $\mathbb{RP}^3$ by the lifted surfaces. An $\alpha$-plane in $\mathbb{CM}^\#$ meets $\mathbb U^\#$
either in a single point or in a real $\alpha$-surface. Those in the second category make
up a copy of $\mathbb{RP}^3$ in $\mathbb{CP}^3$, which we can identify with the real twistor space of
$\mathbb U^\#$; those in the first category can be identified with points of $\mathcal F - \mathcal F_{\mathbb R}$, by taking
their tangent bivectors at the point of intersection with $\mathbb U^\#$. Thus we have a
fibration
$$q : \mathbb{CP}^3 - \mathbb{RP}^3 \to \mathbb U^\# .$$
The fibres are the intersections with the complement of $\mathbb{RP}^3$ of the invariant lines
under the ultrahyperbolic conjugation $\sigma : \mathbb{CP}^3 \to \mathbb{CP}^3$.
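We can also construct these three spaces for the double cover $\tilde{\mathbb U}^\#$. The real
twistor space is the same because the projection gives a one-to-one correspondence
between $\alpha$-surfaces in $\tilde{\mathbb U}^\#$ and $\alpha$-surfaces in $\mathbb U^\#$. The correspondence spaces,
which we denote by $\tilde{\mathcal F}_{\mathbb R}$ and $\tilde{\mathcal F}$, are double covers of $\mathcal F_{\mathbb R}$ and $\mathcal F$. Each $\alpha$-surface in
$\tilde{\mathbb U}^\#$ has topology $S^2$, and is the double cover of an $\alpha$-surface with topology $\mathbb{RP}^2$
in $\mathbb U^\#$. However, $\tilde{\mathcal F} - \tilde{\mathcal F}_{\mathbb R}$ is not connected: it consists of two copies of $\mathbb{CP}^3 - \mathbb{RP}^3$,
the common boundary $\tilde{\mathcal F}_{\mathbb R}$ being fibred over $\mathbb{RP}^3$, which is the real twistor space
of $\tilde{\mathbb U}^\#$. So we now have two fibrations
$$q_+ : \mathbb{CP}^3 - \mathbb{RP}^3 \to \tilde{\mathbb U}^\# , \qquad q_- : \mathbb{CP}^3 - \mathbb{RP}^3 \to \tilde{\mathbb U}^\# ,$$
given by identifying the two components of $\tilde{\mathcal F} - \tilde{\mathcal F}_{\mathbb R}$ with $\mathbb{CP}^3 - \mathbb{RP}^3$; the fibres
in each case are complex half-planes. We denote the components by $\tilde{\mathcal F}_+$ and $\tilde{\mathcal F}_-$,
and their closures by $\mathcal B_+$ and $\mathcal B_-$: these are manifolds with common boundary
$\tilde{\mathcal F}_{\mathbb R}$. We have smooth surjections

$$p_+ : \mathcal B_+ \to \mathbb{CP}^3, \qquad p_- : \mathcal B_- \to \mathbb{CP}^3,$$

which restrict to bijections on $\tilde{\mathcal F}_\pm$.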
We shall see that a global ASDYM field on $\tilde{\mathbb U}^\#$ is transformed to a pair of
holomorphic vector bundles over $\mathbb{CP}^3$, together with a map that identifies their
restrictions to $\mathbb{RP}^3$: the data combine algebraic information (the global bundles
over complex projective space) with what one can think of as `scattering data'
(the identification map).8 There is therefore a very suggestive analogy with
the behaviour of the scattering transform for solutions to the KdV and NLS
equations (§11.5), where there is a corresponding decomposition into algebraic
`solitonic' data and smooth scattering data.

Coordinate expressions
We introduce coordinates on $\mathbb U^\#$, less the hypersurface at infinity, by choosing
$(w, z, \tilde w, \tilde z)$ so that they are real on $\mathbb U$. By writing self-dual null bivectors as
multiples of $l \wedge m$, in the notation of §2.3, we can extend the coordinates to the
correspondence space $\mathcal F$ by using $\zeta$ as a complex fibre coordinate on $\mathcal F$; the
real correspondence space is given by real values of $\zeta$, including $\zeta = \infty$, and the
$\alpha$-surfaces in $\mathbb U$ are given by
$$wZ^3 + \tilde z Z^2 = Z^0, \qquad zZ^3 + \tilde w Z^2 = Z^1, \tag{10.5.1}$$
where the $Z^\alpha$ are real homogeneous coordinates on the real twistor space $\mathbb{RP}^3$;
the fibration $\mathcal F_{\mathbb R} \to \mathbb{RP}^3$ is given by these relations, together with $\zeta = Z^3/Z^2$,
and its leaves are spanned by the vector fields
$$l = \partial_w - \zeta\partial_{\tilde z}, \qquad m = \partial_z - \zeta\partial_{\tilde w} \tag{10.5.2}$$
for real values of $\zeta$. For complex values of $\zeta$, the vectors $l$, $m$ and $\partial_{\bar\zeta}$ span the
anti-holomorphic tangent space at each point of $\mathcal F - \mathcal F_{\mathbb R}$, which has a complex structure
from its identification with $\mathbb{CP}^3 - \mathbb{RP}^3$. We can use these same local coordinates
for the spaces constructed from $\tilde{\mathbb U}^\#$, with the components $\tilde{\mathcal F}_\pm$ distinguished by
the sign of $\mathrm{Im}(\zeta)$.

The Penrose-Ward transform for local solutions

Before considering global solutions, we shall look at the patching matrix form of
the Penrose-Ward transform for local solutions in U: it is particularly natural
in this context, and it has a straightforward extension to nonanalytic solutions.
Let $U \subset \mathbb U$, let $P_{\mathbb R} \subset \mathbb{RP}^3$ denote the set of real $\alpha$-planes that meet $U$, and,
as usual, let $P$ be the set of complex $\alpha$-planes in $\mathbb{CM}$ that meet $U$. If $x \in U$,
then the line $\hat x \subset P$ intersects $P_{\mathbb R}$ in a copy of $\mathbb{RP}^1$, that is, in a circle. In
our coordinates, this is the real axis on the $\zeta$-Riemann sphere, together with
the point at infinity. In the notation of §9.2, it is the circle given by the third
equation in (9.2.4), where ( = e'0 is related to t; by a Mobius transformation.
To find an ASD connection from a patching matrix $F(\lambda, \mu, \zeta)$, we substitute
$$\lambda = \zeta w + \tilde z, \qquad \mu = \zeta z + \tilde w,$$
and make a Birkhoff factorization $F = \tilde f^{-1}f$, with $f$ and $\tilde f$ holomorphic with respect
to $\zeta$ in the upper and lower half-planes, respectively, including the point at
infinity.9 We then find $D$ by exploiting the fact that $f$ and $\tilde f$ are solutions to the
corresponding linear system. It is clear, however, that to find the gauge potential
at points of $U$, it is only necessary to know $F$ at real values of $(\lambda, \mu, \zeta)$. Indeed, it
is possible to follow through the steps in the construction to obtain a solution to
the ASDYM equations when $F$ has been chosen as an arbitrary smooth function
defined on $P_{\mathbb R}$ with no holomorphic extension to a neighbourhood of $P_{\mathbb R}$ in $P$,
but in this case the connection is generally not analytic.
To go in the reverse direction, and construct a patching matrix from a given
smooth solution to the ASDYM equations on $U$, we must find fundamental
solutions $f$ and $\tilde f$ to the linear system, which are smooth on the real axis in the
$\zeta$-plane, including the point at infinity, and which extend holomorphically in $\zeta$
to the upper and lower half-planes, respectively. It will follow from arguments
that we give later in a global context that this can be done.10
When the gauge group is real, we can choose $\tilde f = \bar f$, where
$$\bar f(w, z, \tilde w, \tilde z, \zeta) = \overline{f(w, z, \tilde w, \tilde z, \bar\zeta)}.$$
The patching matrix then satisfies the reality condition $\bar F = F^{-1}$. For unitary
gauge group, we can take $\tilde f = f^{\dagger -1}$, in which case $F = f^\dagger f$ is Hermitian.
Remark. When the gauge group is $U(n)$, the patching matrix is positive definite.
Going in the opposite direction, if $F$ is positive definite, then the lines
corresponding to real points of $U$ cannot be jumping lines, by the results of
Gohberg and Krein (1958) (see §9.3), and so the solution must be regular
everywhere.

Global ultrahyperbolic solutions

One class of global solutions can be constructed by extending this construction
directly to the compactification. By applying the local construction in a
neighbourhood of each point, any smooth patching matrix $F : \mathbb{RP}^3 \to GL(n, \mathbb C)$
generates a smooth solution to the ASDYM equation on $S^2 \times S^2$, but one that
does not, in general, descend to the quotient space $\mathbb U^\#$. In $\mathbb U^\#$, the factorization
cannot be made globally because $\mathcal F - \mathcal F_{\mathbb R}$ is connected, and so it is not possible to
make a global distinction between the upper and lower halves of the $\zeta$-plane.11
However, if $F$ is close to the identity or positive definite, then the solution will
be smooth on $S^2 \times S^2$, with no jumping singularities.
In the case of the ASD Maxwell equations, one can see in another way that no
nonzero solution on $S^2 \times S^2$ descends to the quotient. If $F_{ab}$ is a global solution
and if $\omega_{ab}$ is the form
$$\omega = \frac{dv \wedge d\bar v}{(1 + v\bar v)^2} - \frac{du \wedge d\bar u}{(1 + u\bar u)^2},$$
then $\phi = F^{ab}\omega_{ab}$ is a global harmonic function on $S^2 \times S^2$ (here $u$ and $v$ are
complex stereographic coordinates on the two spheres and $\omega$ is a multiple of
the pseudo-Kähler form, see §9.2). If $F$ descends to the quotient, then $\phi$ must
be odd with respect to the involution, since $\omega$ is odd. However the space of
harmonic functions on $S^2 \times S^2$ is spanned by those of the form $\phi = \phi_u\phi_v$,
where $\phi_u$ is a spherical harmonic on the $u$-sphere and $\phi_v$ is a spherical harmonic
on the $v$-sphere, with the same eigenvalue as $\phi_u$, as an eigenfunction of the
two-dimensional Laplacian. Every such product is even with respect to the
involution, because the spherical harmonics in the product are either both even
or both odd. Therefore $\phi$ vanishes; but then so does $F_{ab}$, by the ASD condition
combined with Liouville's theorem (the remaining components of $F_{ab}$ determine
a global holomorphic 2-form on $S^2 \times S^2$ thought of as $\mathbb{CP}^1 \times \mathbb{CP}^1$).
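
The parity step in this argument can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the involution to be the simultaneous antipodal map on both spheres, and it uses zonal harmonics $P_l(a \cdot x)$ about hypothetical axes as representatives of the degree-$l$ spherical harmonics; the function names are not from the text. Two factors with the same Laplace eigenvalue $l(l+1)$ have the same degree $l$, so the product picks up $(-1)^{2l} = 1$.

```python
def legendre(l, t):
    """Legendre polynomial P_l(t) computed from Bonnet's recurrence."""
    p_prev, p_curr = 1.0, t
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p_curr = p_curr, ((2 * n + 1) * t * p_curr - n * p_prev) / (n + 1)
    return p_curr


def zonal(l, axis, point):
    """Degree-l zonal harmonic about `axis`, evaluated at the unit vector `point`."""
    return legendre(l, sum(a * b for a, b in zip(axis, point)))


def product_mode(l, axis_u, axis_v, u, v):
    """Product of two degree-l harmonics, one on each sphere."""
    return zonal(l, axis_u, u) * zonal(l, axis_v, v)


def antipode(point):
    """Antipodal map on the unit sphere."""
    return tuple(-c for c in point)


# Hypothetical sample points and axes (all unit vectors).
u = (0.6, 0.0, 0.8)
v = (0.0, 0.6, -0.8)
axis_u = (0.0, 0.0, 1.0)
axis_v = (0.0, 1.0, 0.0)

# Each factor has parity (-1)^l, so the product is even for every l.
parity_defects = [
    abs(product_mode(l, axis_u, axis_v, u, v)
        - product_mode(l, axis_u, axis_v, antipode(u), antipode(v)))
    for l in range(6)
]
```

Every entry of `parity_defects` should vanish up to rounding, which is the statement that no product mode is odd under the involution.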
The natural space on which to look for global solutions is therefore the unreduced
product $S^2 \times S^2$, and a natural boundary condition on a local solution
on $\mathbb U$ is that it should extend to this compact space. The solutions generated
by maps $F : \mathbb{RP}^3 \to GL(n, \mathbb C)$ are special: we can characterize them loosely as
the solutions with `zero instanton number'. We have the following more general
result.
Proposition 10.5.1 (Mason 1992b). There is a bijection between (i) the space
of solutions to the ASDYM equations on $S^2 \times S^2$, modulo gauge and (ii) triples
$(E_+, E_-, F)$, where $E_+$ and $E_-$ are holomorphic vector bundles on $\mathbb{CP}^3$ and
$F : E_+|_{\mathbb{RP}^3} \to E_-|_{\mathbb{RP}^3}$ is an isomorphism of real bundles, subject to a triviality
condition.
Proof In one direction, the proof is a direct generalization of the construction
above. By pulling $E_+$ and $E_-$ back to $\tilde{\mathcal F}$, we have two bundles over the closures
of $\tilde{\mathcal F}_+$ and $\tilde{\mathcal F}_-$, respectively, together with a map that identifies their restrictions
to $\tilde{\mathcal F}_{\mathbb R}$. By using the identification to piece together the two bundles, we obtain
a smooth bundle $\hat E \to \tilde{\mathcal F}$. Its restriction to each fibre over space-time has the
structure of a holomorphic bundle (made up of two holomorphic bundles over
the upper and lower half-planes, patched by $F$ on the real axis); the `triviality
condition' is that each such restriction should be trivial. When it holds, the
standard factorization construction gives a global solution to the ASDYM
equation, with no jumping singularities. When $E_+$ and $E_-$ are product bundles, the
construction reduces to the one above.
To go in the other direction, we start with an ASD connection $D$ on a bundle
$E \to S^2 \times S^2$, and reconstruct $E_+$, $E_-$ and $F$. We take $\hat E$ to be the pull-back of
$E$ from space-time, and we note that there are natural holomorphic structures
on the restrictions of $\hat E$ to $\tilde{\mathcal F}_\pm$, given by the AHS construction in the previous
section: the local holomorphic sections are the solutions to
$$D_l s = 0 = D_m s, \qquad \partial_{\bar\zeta}s = 0, \tag{10.5.3}$$

where $D$ is pulled back to a connection on $\hat E$. This gives us the restrictions of $E_+$
and $E_-$ to the complement of $\mathbb{RP}^3$ in $\mathbb{CP}^3$. We also have a smooth bundle $E_{\mathbb R} \to \mathbb{RP}^3$,
of which the fibre over a point of $\mathbb{RP}^3$ is the space of covariantly constant
sections of $E$ over the corresponding real $\alpha$-surface (it is here that the simple
connectivity of the real $\alpha$-surfaces is important). The only nontrivial step is to
show that $E_+$ and $E_-$ can both be extended canonically to holomorphic bundles
over $\mathbb{CP}^3$ by adding the fibres of $E_{\mathbb R}$. If we define the holomorphic structures by
their $\bar\partial$-operators, then we have to show that for some neighbourhood $W$ of each
point of $\mathbb{RP}^3$, there exists a local trivialization of $\hat E$ in $\tilde W = p_+^{-1}(W)$ in which
eqns (10.5.3) can be written
$$l \,\lrcorner\, (ds + \Phi s) = m \,\lrcorner\, (ds + \Phi s) = \partial_{\bar\zeta} \,\lrcorner\, (ds + \Phi s) = 0$$
where $\Phi = p_+^*(\theta)$ for some $(0,1)$-form $\theta$ on $W$, together with a similar statement
for $p_-$. Because eqns (10.5.3) determine the holomorphic sections of $\hat E$ on the
interior $\tilde{\mathcal F}_+$, it will necessarily hold that $\theta$ satisfies the integrability condition
for a $\bar\partial$-operator.
The key observation is that a sufficient condition for a smooth function $h$ on
$\tilde W$ to be the pull-back by $p_+$ of a smooth function on $W$ is that $h$ and all its
partial derivatives should vanish on $\tilde{\mathcal F}_{\mathbb R} \cap \tilde W$. Since $l$ and $m$ commute with $\partial_{\bar\zeta}$,
it is sufficient to establish the existence of the extension to find a matrix-valued
function $g : \tilde W \to GL(n, \mathbb C)$ for some neighbourhood $W$ of each point of $\mathbb{RP}^3$ such
that
$$D_l g, \qquad D_m g, \qquad \partial_{\bar\zeta}g,$$
and all their partial derivatives vanish at $\tilde{\mathcal F}_{\mathbb R}$. If the columns of $g$ are taken as
the frame for a local trivialization of $\hat E$, then $\Phi$ has the required property. To
construct $g$, first we choose $g_0$ with nonvanishing determinant on $\tilde{\mathcal F}_{\mathbb R} \cap \tilde W$ such
that $D_l g_0 = 0 = D_m g_0$. This is possible if $W$ is chosen appropriately because
the $\alpha$-surfaces in $S^2 \times S^2$ are simply connected. We work in local coordinates
and write $\zeta = a + ib$. We set $g = g_0$ at $b = 0$, and compute recursively the
coefficients of the Taylor series of $g$ in powers of $b$ by setting

$$\partial_b^k\,\partial_{\bar\zeta}\,g = \tfrac12\,\partial_b^k(\partial_a + i\partial_b)\,g = 0, \qquad k = 0, 1, 2, \ldots,$$
at $b = 0$. We repeat this in different coordinate patches, and hence determine a
Taylor series for $g$ at each point of $\tilde W \cap \tilde{\mathcal F}_{\mathbb R}$. We then choose $g$ in a neighbourhood
of the boundary so that it has this Taylor series. This is possible by a standard
result, due to Borel, that any such Taylor series on a submanifold of a manifold
is the Taylor series of a smooth function in a neighbourhood of the submanifold
(Hörmander 1990). By construction, $\partial_{\bar\zeta}g$ and all its partial derivatives vanish
on the boundary. Since $D_l$ commutes with $\partial_{\bar\zeta}$, the same is true of $\partial_{\bar\zeta}D_l g$ and
hence of $D_l g$ since $D_l g$ vanishes at $b = 0$. The same argument applies with $m$
replacing $l$.
Having constructed $E_+$, and similarly $E_-$, in this way, we find $F$ from the
fact that the sections of the restrictions of both $E_+$ and $E_-$ are identified with
the solutions to $D_l s = 0 = D_m s$ on $\tilde{\mathcal F}_{\mathbb R}$. $\square$

Reality conditions. If the ASDYM field has gauge group $GL(n, \mathbb R)$, then $E_+$
determines $E_-$, and there are also reality conditions on $F$. Complex conjugation
interchanges $\tilde{\mathcal F}_\pm$, but preserves the fibration over $\tilde{\mathbb U}^\#$, and gives an identification
$E_- = \bar E_+$. In this case, we have $\bar F = F^{-1}$. In the case of a unitary gauge group,
$E_- \cong \bar E_+^{*}$, and $F$ is Hermitian and positive definite.12
Topological conditions. The vector bundle $E$ on $S^2 \times S^2$ has second Chern
class $c_2(E) = c_2(E_+) + c_2(E_-)$. We see this by choosing smooth connections on
each of the two copies of $\mathbb{CP}^3$ which are trivial in a neighbourhood of $\mathbb{RP}^3$, and
pulling them back to $\tilde{\mathcal F}$. The evaluation of the second Chern number of $\hat E$ is
then given by the sum of the two integrals over $\mathbb{CP}^3$ that determine the Chern
numbers of $E_+$ and $E_-$. This argument also shows that $c_3(E_\pm) = 0$.
Example 10.5.2 The simplest examples with nontrivial $E_\pm$ arise from the
Atiyah-Ward ansatz. We put $E_\pm = \mathcal O(-1) \oplus \mathcal O(1)$ and take $F$ to be the
upper-triangular matrix
$$F = \begin{pmatrix} 1 & ih \\ 0 & 1 \end{pmatrix}, \qquad h = \frac{1}{(Z^0)^2 + (Z^1)^2 + (Z^2)^2 + (Z^3)^2}.$$

This satisfies the appropriate reality conditions to give an $SL(2, \mathbb R)$ bundle on
space-time. According to the comments in Example 10.1.2, $D$ in this case
coincides with the Levi-Civita connection on the primed spinor bundle, provided
that the conformal scale is chosen so that
$$ds^2 = \frac{du\,d\bar u}{(1 + u\bar u)^2} - \frac{dv\,d\bar v}{(1 + v\bar v)^2},$$
that is, so that $\phi = 1$, where $\phi$ is the conformally weighted solution to the wave
equation generated by applying the linear Penrose-Ward transform to $h$; note
that $\phi$ is obtained from $h$ by the X-ray transform.13 We can generate other
solutions by adding to $h$ arbitrary functions of homogeneity degree $-2$. These
give rise to global solutions, provided that the function is small enough not to
violate the triviality condition.
Analogy with the inverse-scattering transform
Two striking features of the inverse-scattering transform are, first, that the scat-
tering data are not subject to any `gauge freedom', and, second, that the data
split naturally into two parts, one algebraic in nature, the solitonic part, and the
other smooth, the values of the reflection coefficients. The transform that we
have just described has the same features. First, the `gauge freedom', that is the
freedom in the representation of the triple $(E_+, E_-, F)$, is removed by choosing
standard presentations of the bundles on $\mathbb{CP}^3$, which are necessarily algebraic.
The representative of $F$ is uniquely determined up to constant factors, a feature
of the global correspondence that we do not see when the patching function is
constructed locally. The solitonic data are the algebraic data needed to describe
$E_\pm$ in the standard presentation; the smooth part is $F$.
The existence of the continuous part of the data in the inverse-scattering
transform could be anticipated from the linear theory: in the linearized limit,
it reduces to the Fourier transform of the initial data. This is not true of the
solitonic part, which is necessarily trivial in the linearized limit. We see similar
behaviour in the case of ASDYM fields on $S^2 \times S^2$. Here, the solitonic data in
the bundles $E_\pm$ are necessarily trivial in the linearized limit. For Maxwell fields,
the relationship between the continuous part of the data, $F$, and the space-time
field is a higher spin extension of the classical X-ray transform.13
10.6 THE GASDYM EQUATION
The Penrose-Ward transform extends in a straightforward way to the generalized
ASD Yang-Mills equation (§8.6), to give a correspondence between solutions on
an open subset $U$ of $\mathbb C^{2k}$ and holomorphic bundles over a corresponding open
set $P \subset \mathbb{CP}^{k+1}$. Here the inhomogeneous twistor coordinates are
$$\mu^A = \zeta x^A + \tilde x^A \qquad (A = 1, \ldots, k)$$
together with $\zeta$. The corresponding homogeneous coordinates are introduced by
writing the equation of a general $\alpha$-plane, that is, a $k$-plane in space-time on
which the $\mu$'s are constant, in the form
$$\tilde x^A Z^{k+1} + x^A Z^{k+2} = Z^A \qquad (A = 1, \ldots, k)$$
where $\zeta = Z^{k+2}/Z^{k+1}$.
We shall only sketch the theory of the extended transform, since the minor
modifications necessary to replace $\mathbb{CP}^3$ by $\mathbb{CP}^{k+1}$ should be self-evident. As in
the standard case, the fibres of $E'$ are the covariantly constant sections of $E$
over the $\alpha$-planes. A solution to the GASDYM equation determines a holomorphic
patching matrix $F(\mu^A, \zeta)$, with values in the gauge group, defined on some
neighbourhood of $|\zeta| = 1$ in twistor space, uniquely up to the same equivalence
as before. It is recovered from $F$ by making a Birkhoff factorization
$$F(\zeta x^A + \tilde x^A, \zeta) = \tilde f^{-1}f$$
as a function of $\zeta$ at each fixed set of values of the space-time coordinates $x^A$, $\tilde x^A$,
and by putting
$$\Phi_A = h\,\partial_A h^{-1}, \qquad \tilde\Phi_A = \tilde h\,\tilde\partial_A \tilde h^{-1},$$
where
$$h = f|_{\zeta = 0}, \qquad \tilde h = \tilde f|_{\zeta = \infty}, \qquad \partial_A = \partial/\partial x^A, \qquad \tilde\partial_A = \partial/\partial\tilde x^A.$$
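
For a scalar (abelian) patching function the Birkhoff factorization reduces to splitting the Laurent series of $\log F$ on $|\zeta| = 1$ into its nonnegative and negative powers. The following minimal sketch illustrates this special case only; the sample choice $\log F = \mu^2/\zeta$ with $\mu = \zeta x + \tilde x$, the function names, and the truncation parameters are assumptions made for illustration, and the matrix case needs genuinely different methods.

```python
import cmath
import math


def laurent_coefficients(g, n_modes, n_samples=256):
    """Approximate Laurent coefficients c_n, |n| <= n_modes, of g on |zeta| = 1
    by the trapezoidal rule on equally spaced points of the unit circle."""
    coeffs = {}
    for n in range(-n_modes, n_modes + 1):
        total = 0j
        for j in range(n_samples):
            zeta = cmath.exp(2j * math.pi * j / n_samples)
            total += g(zeta) * zeta ** (-n)
        coeffs[n] = total / n_samples
    return coeffs


def birkhoff_scalar(log_F, n_modes=6):
    """Return (f, f_tilde) with F = f / f_tilde on |zeta| = 1, where f is built
    from the nonnegative powers and f_tilde from the negative powers."""
    c = laurent_coefficients(log_F, n_modes)

    def f(zeta):  # holomorphic for |zeta| <= 1
        return cmath.exp(sum(c[n] * zeta ** n for n in range(0, n_modes + 1)))

    def f_tilde(zeta):  # holomorphic for |zeta| >= 1, tends to 1 at infinity
        return cmath.exp(-sum(c[n] * zeta ** n for n in range(-n_modes, 0)))

    return f, f_tilde


# Hypothetical space-time point (k = 1) and patching function.
x, x_tilde = 0.4, -0.3
log_F = lambda zeta: (zeta * x + x_tilde) ** 2 / zeta

f, f_tilde = birkhoff_scalar(log_F)
zeta0 = cmath.exp(0.7j)            # a sample point on the unit circle
F_value = cmath.exp(log_F(zeta0))
h = f(0.0)                         # h = f at zeta = 0; here exp(2 * x * x_tilde)
```

In this example $\log F = \zeta x^2 + 2x\tilde x + \tilde x^2/\zeta$, so $h = \exp(2x\tilde x)$ depends on the space-time point, as the formula $\Phi_A = h\,\partial_A h^{-1}$ requires for a nontrivial potential.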

We can express this in a way that more obviously respects the linear symmetries
of the GASDYM equation by writing the homogeneous twistor coordinates as
$(Z^\alpha) = (\omega^A, \pi_{A'})$ ($A = 1, \ldots, k$, $A' = 0, 1$), and the space-time coordinates as
$x^{AA'}$, with $x^{A0} = \tilde x^A$ and $x^{A1} = x^A$. Then the patching matrix is a function
$F(\omega^A, \pi_{A'})$, homogeneous of degree zero, and the solution is recovered by making
a Birkhoff factorization of $F(x^{AA'}\pi_{A'}, \pi_{A'})$ as a function on the $\pi_{A'}$-Riemann
sphere, with $x$ fixed.
It is worth noting that this generalization of the Penrose-Ward transform
reveals additional nonlinear symmetries of the GASDYM equation under the
action of $GL(k + 2, \mathbb C)$ on $\mathbb{CP}^{k+1}$, which generalize the conformal symmetries of
the ASDYM equation. In order for this group to act on space-time, $\mathbb C^{2k}$ must be
compactified to become the Grassmannian of complex 2-planes in $\mathbb C^{k+2}$ (Bailey
and Eastwood 1991; see also Pedersen and Poon 1988).
10.7 THE TRUNCATED GASDYM HIERARCHY

We explained in §8.6 how the ASDYM equation and its generalization were
special cases of the system $H(k, p)$ (with $p = 1$). When we take the Penrose-Ward
transform of $H(k, 1)$, the number of twistor variables increases with $k$ in
an unsurprising way: in passing from $k = 2$ (the ASDYM case) to a general
value of $k$, we replace $\mathbb{CP}^3$ by $\mathbb{CP}^{k+1}$. There is also a twistor construction for
general values of $p$, but as $p$ increases the dimension of the twistor space remains
constant: all that changes is its topology. This powerful feature underpins the
use of the GASDYM hierarchy to construct commuting flows on the solution
manifolds of integrable equations. It reflects the fact that the flows given by
higher values of $p$ propagate data from initial surfaces of the same dimension as
those of the original equation.
In the notation of §8.6, $H(k, p)$ is a system of equations in the independent
variables $x^{Ai}$, $1 \le A \le k$, $0 \le i \le p$. We think of the $x^{Ai}$s as coordinates
on the space-time $\mathbb C^{k(p+1)}$, and, as usual, we interpret the equations as the
integrability conditions for a connection on $\alpha$-planes; but here the $\alpha$-planes are
$pk$-dimensional, and are defined by
$$\mu^A = x^{A0} + \zeta x^{A1} + \zeta^2 x^{A2} + \cdots + \zeta^p x^{Ap}, \tag{10.7.1}$$
for constant $\zeta$, $\mu^A$ ($A = 1, \ldots, k$). Such a $pk$-plane is spanned by the vectors
$$\partial_{A1} - \zeta\partial_{A0}, \quad \partial_{A2} - \zeta\partial_{A1}, \quad \ldots, \quad \partial_{Ap} - \zeta\partial_{A,p-1}.$$
Therefore the system $H(k, p)$, which is the commutation condition for the
operators
$$L_{Ai} = \partial_{Ai} + P_{Ai} - \zeta(\partial_{A,i-1} + Q_{A,i-1}), \qquad 1 \le A \le k, \quad 1 \le i \le p,$$
is precisely the condition that the linear system $L_{Ai}s = 0$ should be integrable
on every $\alpha$-plane.
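
The claim that the vectors $\partial_{Ai} - \zeta\partial_{A,i-1}$ are tangent to the $\alpha$-planes amounts to the statement that they annihilate the twistor coordinate $\mu^A = \sum_i \zeta^i x^{Ai}$. A minimal numerical sketch for a single value of $A$ follows; the helper names and the sample values of $p$ and $\zeta$ are illustrative assumptions.

```python
def mu(zeta, x):
    """mu = x_0 + zeta*x_1 + ... + zeta^p*x_p for one fixed value of the index A."""
    return sum(zeta ** i * x_i for i, x_i in enumerate(x))


def plane_vector(i, zeta, p):
    """Components of d/dx_i - zeta*d/dx_{i-1} in the basis d/dx_0, ..., d/dx_p."""
    vec = [0j] * (p + 1)
    vec[i] = 1 + 0j
    vec[i - 1] = -zeta
    return vec


p = 4
zeta = 0.3 - 1.1j

# mu is linear in x, so its derivative along a vector v is mu evaluated on v:
# the i-th entry is zeta^i - zeta * zeta^(i-1), which should vanish.
derivatives = [mu(zeta, plane_vector(i, zeta, p)) for i in range(1, p + 1)]
```

There are $p$ such vectors for each of the $k$ values of $A$, which accounts for the dimension $pk$ of the $\alpha$-plane.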
We can include the $\alpha$-planes on which $\zeta = \infty$ by writing (10.7.1) in the
alternative form
$$\tilde\mu^A = \tilde\zeta^p x^{A0} + \tilde\zeta^{p-1}x^{A1} + \tilde\zeta^{p-2}x^{A2} + \cdots + x^{Ap}, \tag{10.7.2}$$
where $\tilde\zeta = \zeta^{-1}$ and $\tilde\mu^A = \zeta^{-p}\mu^A$: this system is regular at $\tilde\zeta = 0$, but singular at
$\zeta = 0$ ($\tilde\zeta = \infty$). Between them, the two coordinate systems $\zeta, \mu^A$ and $\tilde\zeta, \tilde\mu^A$ cover
the whole complex manifold of $\alpha$-planes and the transformation from one to the
other respects the projection $(\mu^A, \zeta) \mapsto \zeta$ onto the $\zeta$-Riemann sphere. For each
$A$, the transformation from $\mu^A$ to $\tilde\mu^A$ is the same as that of the fibre coordinate
of the line bundle $\mathcal O(p) \to \mathbb{CP}^1$. Therefore twistor space is the total space of the
vector bundle
$$\underbrace{\mathcal O(p) \oplus \cdots \oplus \mathcal O(p)}_{k} \to \mathbb{CP}^1,$$

and we can read (10.7.2) as defining a correspondence between points of space-time
and sections of this bundle.
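
The transition rule $\tilde\mu^A = \zeta^{-p}\mu^A$ between the two coordinate patches can be checked directly on a sample space-time point. This is a minimal sketch for one value of $A$; the function names and sample values are assumptions for illustration.

```python
def mu(zeta, x):
    """Fibre coordinate in the patch zeta != infinity: sum_i zeta^i x_i."""
    return sum(zeta ** i * x_i for i, x_i in enumerate(x))


def mu_tilde(zeta_tilde, x):
    """Fibre coordinate in the patch zeta != 0: sum_i zeta_tilde^(p-i) x_i."""
    p = len(x) - 1
    return sum(zeta_tilde ** (p - i) * x_i for i, x_i in enumerate(x))


# Hypothetical space-time point for one value of A, with p = 4.
x = [0.5 + 0.2j, -1.0, 0.3j, 2.0, -0.7 + 0.1j]
p = len(x) - 1
zeta = 0.8 + 0.6j

lhs = mu_tilde(1 / zeta, x)          # evaluated in the second patch
rhs = zeta ** (-p) * mu(zeta, x)     # O(p) transition rule applied to the first patch
value_at_infinity = mu_tilde(0.0, x)  # regular at zeta = infinity, equal to x_p
```

The two patch expressions agree on the overlap, and the section stays finite at $\tilde\zeta = 0$, which is what it means for a space-time point to define a global section of $\mathcal O(p)$.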
We now proceed in exactly the same way as before. Given an open set
$U \subset \mathbb C^{k(p+1)}$, we define the twistor space $P \subset \bigoplus^k \mathcal O(p)$ to be the set of all
$\alpha$-planes that intersect $U$. Each point $x \in U$ corresponds to a holomorphic section
$\hat x \subset P$, that is, a holomorphic section of the projection $P \to \mathbb{CP}^1$.
Proposition 10.7.1 Suppose that the intersection of every $\alpha$-plane with $U$ is
connected and simply connected. Then there is a natural correspondence
between gauge-equivalence classes of solutions to $H(k, p)$ on $U$, with gauge group
$GL(n, \mathbb C)$, and rank $n$ holomorphic vector bundles $E' \to P$, such that $E'|_{\hat x}$ is
trivial for all $x \in U$.
We shall not give the full details of the by now standard argument that proves
this. The fibre of $E'$ is the space of solutions to the linear system over the
corresponding $\alpha$-plane. Going in the other direction, we trivialize $E'$ on each
of the coordinate neighbourhoods (one a neighbourhood of $\zeta = 0$, the other a
neighbourhood of $\zeta = \infty$), and so represent it by a patching matrix $F(\mu^A, \zeta)$.
We then substitute for the $\mu^A$s from (10.7.1), and make a Birkhoff factorization
$F = \tilde f^{-1}f$, treating the left-hand side as a function of $\zeta$ at each fixed point of
$U$. Because $F$ depends on the coordinates $x^{Ai}$ only through $\mu^A$,
$$(\partial_{Ai}f - \zeta\partial_{A,i-1}f)f^{-1} = (\partial_{Ai}\tilde f - \zeta\partial_{A,i-1}\tilde f)\tilde f^{-1}.$$
By the same application of Liouville's theorem as in §10.1, both sides are linear
functions of $\zeta$; so we can recover the commuting operators
$$L_{Ai} = \partial_{Ai} + P_{Ai} - \zeta(\partial_{A,i-1} + Q_{A,i-1})$$
by putting $(\partial_{Ai}f - \zeta\partial_{A,i-1}f)f^{-1} = -P_{Ai} + \zeta Q_{A,i-1}$.
An important special case (Mason and Sparling 1992) is $k = 1$: the proposition
then gives a correspondence between solutions to the truncated Bogomolny
hierarchy and holomorphic vector bundles over an open subset of $\mathcal O(p)$.

10.8 THE LINEAR PENROSE TRANSFORM

In Chapter 8, we constructed a sequence of symplectic structures for the ASDYM
equation from a bilinear form on the solution space $W_D$ of the background-coupled
wave equation
$$D^{*}D\phi = 0. \tag{10.8.1}$$
We now turn to the twistor representation of $W_D$ and its symplectic structure.
We shall show that, under the linear version of the Penrose transform,
$$W_D = H^1(P, A), \tag{10.8.2}$$
where $E'$ is the Penrose-Ward transform of $D$ and $A = \mathrm{adj}(E') \otimes \mathcal O(-2)$ (here
$\mathcal O(-2)$ is one of the standard line bundles over $\mathbb{CP}^3$, see §9.4). We shall also
show that the bilinear form on $W_D$ has a simple and natural expression when the
elements of the cohomology group are represented by differential forms, by using
the Dolbeault isomorphism. The isomorphism in (10.8.2) is closely related to
the linearization of the Penrose-Ward transform, which gives a correspondence
between perturbations of $E'$, represented by elements of $H^1(P, \mathrm{adj}(E'))$, and
linear perturbations of $D$. In our treatment of the recursion operator in Chapter
12, we shall represent the corresponding perturbations of $D$ by perturbations of
the $J$ or $K$ matrices, and hence by elements of $W_D$.
The cohomology group

We shall show that for a suitable open set $U \subset \mathbb{CM}$, the linear transform is an
isomorphism
$$G : H^1(P, A) \to W_D,$$
where $W_D$ is the space of holomorphic solutions to (10.8.1) on $U$. To begin with,
we shall work with a particular Čech representation of the cohomology group.
Given our standard two-set cover $V$, $\tilde V$ of $P$, where $V$ and $\tilde V$ are Stein manifolds,
we can represent $H^1(P, A)$ as the quotient of $\Gamma(V \cap \tilde V, A)$ by the equivalence
relation
$$\gamma \sim \gamma + \alpha - \zeta^{-2}F^{-1}\beta F \tag{10.8.3}$$
where $\gamma$ is a holomorphic matrix-valued function on $V \cap \tilde V$ representing a section
of $A$ in the trivialization over $V$, and $\alpha$ and $\beta$ are two other matrix-valued
functions with $\alpha$ extending to $V$, and $\beta$ extending to $\tilde V$; the factors $F^{-1}$ and $F$, where
$F$ is the patching matrix of $E'$, and $\zeta^{-2}$ come from the transition relations for $\mathrm{adj}(E')$
and $\mathcal O(-2)$.
The background-coupled wave equation
We construct $G(\gamma) \in W_D$ by first making a Birkhoff factorization $F = \tilde f^{-1}f$
and by putting
$$\psi = \zeta f\,\gamma(\zeta w + \tilde z,\ \zeta z + \tilde w,\ \zeta)\,f^{-1},$$
so that $\psi$ depends on $\zeta$ (in some neighbourhood of the unit circle) and the
space-time coordinates. In geometric terms, $\psi$ is a section of the pull-back
bundle $p^*A$ on the correspondence space $\mathcal{F}$; it is expressed in a trivialization
pulled back from one of $E \to U$. Next, by differentiating along $\ell$ and $m$, and by
using the fact that $f$ is a solution to the linear system of $D$, we have
$$D_w\psi - \zeta D_{\tilde z}\psi = 0, \qquad D_z\psi - \zeta D_{\tilde w}\psi = 0. \qquad (10.8.4)$$
We expand $\psi$ as a Laurent series in $\zeta$,
$$\psi = \sum_{j=-\infty}^{\infty} \phi_j\,\zeta^j,$$
where the coefficients are functions of the space-time coordinates. The $\phi_j$s
behave as sections of $\mathrm{adj}(E)$ under gauge transformations, and they all satisfy the
background-coupled wave equation because we have
$$D_{\tilde z}\phi_j = D_w\phi_{j+1}, \qquad D_{\tilde w}\phi_j = D_z\phi_{j+1} \qquad (10.8.5)$$
from eqn (10.8.4).
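
The step from the recursion relations (10.8.5) to the wave equation can be sketched as follows. Two assumptions are made here that are not spelled out in this section: that, for an ASD connection in the double-null coordinates $w, z, \tilde w, \tilde z$, the operator in (10.8.1) is proportional to $D_zD_{\tilde z} - D_wD_{\tilde w}$, and that the ASD condition includes the vanishing of the curvature component $F_{zw} = [D_z, D_w]$. Here $\phi_j$ denotes the $j$th Laurent coefficient of the section constructed from the cohomology representative.

```latex
\begin{align*}
D_z D_{\tilde z}\phi_j - D_w D_{\tilde w}\phi_j
  &= D_z D_w\phi_{j+1} - D_w D_z\phi_{j+1}
     && \text{by (10.8.5)} \\
  &= [F_{zw},\, \phi_{j+1}]
     && \text{since the } \phi_j \text{ are sections of } \mathrm{adj}(E) \\
  &= 0
     && \text{since } F_{zw} = 0 \text{ for an ASD connection.}
\end{align*}
```

Under these assumptions every coefficient, and in particular $\phi_0$, is a solution of the background-coupled wave equation.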

When we replace $\gamma$ by an equivalent section, we replace $\psi$ by
$$\psi + \zeta f\beta f^{-1} - \zeta^{-1}\tilde f\tilde\beta\tilde f^{-1}.$$
Since $\beta$ is holomorphic in $\zeta$ inside the unit circle, and $\tilde\beta$ is holomorphic in $\zeta$
outside the unit circle, this leaves $\phi_0$ unchanged. It follows that if we define $G$
by
$$G: \gamma \mapsto \phi = \phi_0,$$
then $G$ is a linear map from $H^1(P, A)$ to $W_D$.

The inverse
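To go in the other direction, we suppose that we are given a solution $\phi$ on $U$.
We then construct $\psi$ and hence $\gamma$ by solving the following two pairs of
inhomogeneous linear equations for $\theta$ and $\tilde\theta$, which are unknown matrices representing
$\zeta$-dependent holomorphic sections of $\mathrm{adj}(E)$,
$$D_w\theta - \zeta D_{\tilde z}\theta = D_{\tilde z}\phi, \qquad D_z\theta - \zeta D_{\tilde w}\theta = D_{\tilde w}\phi,$$
$$\zeta^{-1}D_w\tilde\theta - D_{\tilde z}\tilde\theta = D_w\phi, \qquad \zeta^{-1}D_z\tilde\theta - D_{\tilde w}\tilde\theta = D_z\phi,$$
which are integrable whenever $D$ is ASD and $\phi$ satisfies (10.8.1). These equations
determine the covariant derivatives of $\theta$ and $\tilde\theta$ along α-planes, and hence they
determine $\theta$ and $\tilde\theta$ uniquely in terms of their values at a chosen base point on
each α-plane. If $\theta$ and $\tilde\theta$ are solutions, and if we put
$\psi = \phi + \zeta\theta - \zeta^{-1}\tilde\theta$, then we have
$$D_w\psi - \zeta D_{\tilde z}\psi = 0, \qquad D_z\psi - \zeta D_{\tilde w}\psi = 0,$$
and hence that $\gamma = \zeta^{-1}f^{-1}\psi f$ is a holomorphic function of the twistor
coordinates $\lambda, \mu, \zeta$ on some subset of $P$.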
Suppose that we can find solutions such that $\theta$ is holomorphic in $\zeta$ inside
the unit circle, and $\tilde\theta$ is holomorphic outside the unit circle. With appropriate
choice of $V$ and $\tilde V$ we then have that $\gamma$ is holomorphic on $V \cap \tilde V$. If we identify
$\gamma$ with a section of $A$ over $V \cap \tilde V$ by using the trivialization over $V$, then the
corresponding class in $H^1(P, A)$ will generate $\phi$. A different choice of $\theta$ and $\tilde\theta$ will
give an equivalent $\gamma$, and therefore the same cohomology class, since solutions
to the inhomogeneous equation differ by solutions to the homogeneous equation.
Thus the construction of $G^{-1}$ comes down to the problem of picking a base
point on each α-plane to fix the values of $\theta$ and $\tilde\theta$ in such a way that they are
holomorphic with respect to $\zeta$ in the required domains. It is not hard to do this
for simple regions $U$, and more generally, one can adapt this argument to prove
that $G$ is an isomorphism whenever $Z \cap U$ is connected and simply-connected
for each α-plane $Z \in P$, but instead we shall give an invariant geometric proof
below in which it is easier to understand the topological conditions on $U$.

The Dolbeault representation

Under these topological conditions on $U$, it is always possible to find a smooth
nonholomorphic map $b: P \to U$ such that $b(Z) \in Z$ for every $Z \in P$. For
example, when $U = \mathbb{CM}$, we can take $b$ to be the projection onto $\mathbb{E} \subset \mathbb{CM}$: this
is not holomorphic, but it satisfies the condition because every α-plane intersects
$\mathbb{E}$ in a unique point, the one corresponding to the line joining $Z$ to $\sigma(Z)$. Given
such a map $b$, we lift it to the correspondence space by defining $a: P \to \mathcal{F}$ by
$a(Z) = (b(Z), Z)$. We shall use $a$ and $b$ to construct a different representation of
the elements of $H^1(P, A)$ by equivalence classes of $(0, 1)$-forms, as in §9.6.
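Suppose that we are given a representative $\gamma$ of a class in $H^1(P, A)$ relative to
our two-set open cover $V$, $\tilde V$, and that we define $\psi$ from $\gamma$ as above by choosing
a factorization of $F$. Then $\psi = \zeta\theta + \phi - \zeta^{-1}\tilde\theta$, where
$$\theta = \sum_{j=1}^{\infty}\phi_j\,\zeta^{j-1}, \qquad \tilde\theta = -\sum_{j=1}^{\infty}\phi_{-j}\,\zeta^{-j+1}.$$
Hence if we put $\beta = (f^{-1}\alpha f)\circ a$ and $\tilde\beta = (\tilde f^{-1}\tilde\alpha\tilde f)\circ a$, where $\alpha$ and $\tilde\alpha$ are the
matrix-valued functions on the correspondence space defined by
$$\alpha = \theta + \frac{\bar\zeta\,\phi}{1 + \zeta\bar\zeta}, \qquad \tilde\alpha = \tilde\theta - \frac{\zeta\,\phi}{1 + \zeta\bar\zeta},$$
then $\gamma = \beta - \zeta^{-2}F^{-1}\tilde\beta F$. Moreover $\beta$ is smooth in $V$ and $\tilde\beta$ is smooth in
$\tilde V$, although they are not holomorphic with respect to the twistor coordinates
because $b$ is not holomorphic, and because of the explicit dependence on $\bar\zeta$.
Nonetheless, we can use them to define a global $(0, 1)$-form $\Gamma$ on $P$ with values
in $A$ by putting (i) $\Gamma = \bar\partial\beta$ in the trivialization of $E'$ over $V$ and (ii) $\Gamma = \bar\partial\tilde\beta$ in
the trivialization over $\tilde V$. The definitions are consistent on the overlap because
$\bar\partial\gamma = 0$, and together they determine a representative of $\gamma$ under the Dolbeault
isomorphism. A different choice of $b$ will give an equivalent representative.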
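
The identity relating the smooth representatives to the Čech representative can be checked directly. The sketch below assumes the definitions $\alpha = \theta + \bar\zeta\phi/(1+\zeta\bar\zeta)$ and $\tilde\alpha = \tilde\theta - \zeta\phi/(1+\zeta\bar\zeta)$, together with $\psi = \zeta\theta + \phi - \zeta^{-1}\tilde\theta$, $\gamma = \zeta^{-1}f^{-1}\psi f$ and the factorization $F = \tilde f^{-1}f$; these are our reading of the formulas in this section, with $\zeta$ the spectral parameter.

```latex
\begin{align*}
\alpha - \zeta^{-2}\tilde\alpha
  &= \theta - \zeta^{-2}\tilde\theta
     + \frac{\bar\zeta + \zeta^{-1}}{1 + \zeta\bar\zeta}\,\phi \\
  &= \theta + \zeta^{-1}\phi - \zeta^{-2}\tilde\theta
     && \text{since } \bar\zeta + \zeta^{-1} = \zeta^{-1}(1 + \zeta\bar\zeta) \\
  &= \zeta^{-1}\psi , \\[4pt]
\beta - \zeta^{-2}F^{-1}\tilde\beta F
  &= f^{-1}\bigl(\alpha - \zeta^{-2}\tilde\alpha\bigr)f
     && \text{since } F^{-1}\tilde f^{-1} = f^{-1},\ \tilde f F = f \\
  &= \zeta^{-1}f^{-1}\psi f = \gamma .
\end{align*}
```

The factor $1/(1+\zeta\bar\zeta)$ is what makes the split smooth at both $\zeta = 0$ and $\zeta = \infty$, at the cost of holomorphicity.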

The symplectic form

We now turn to the symplectic structure on $W_D$. Given two solutions $\phi$ and $\phi'$
to (10.8.1), we have two cohomology classes $\gamma, \gamma' \in H^1(P, A)$, and hence two
$\bar\partial$-closed $(0, 1)$-forms $\Gamma, \Gamma'$ on $P$ with values in $A$, which are determined by $\gamma$ and
$\gamma'$ up to the addition of $\bar\partial$-exact forms. There is an intrinsic holomorphic 3-form
on $P$ with values in $\mathcal{O}(4)$, given in homogeneous coordinates by
$$\xi = \tfrac{1}{6}\,\varepsilon_{\alpha\beta\gamma\delta}\,Z^\alpha\,dZ^\beta \wedge dZ^\gamma \wedge dZ^\delta,$$
where $\varepsilon_{\alpha\beta\gamma\delta}$ is the four-dimensional alternating symbol. In the trivialization of
$\mathcal{O}(4)$ over $V$ determined by the section $(Z^2)^4$, we have
$$\xi = d\lambda \wedge d\mu \wedge d\zeta.$$
Consider the complex (scalar-valued) 5-form,
$$\Omega = \mathrm{tr}(\Gamma' \wedge \Gamma) \wedge \xi.$$
This is independent of the choices of local trivialization; it is closed because
$\bar\partial\Gamma = 0 = \bar\partial\Gamma'$, and it is independent of the choice of $\Gamma$ and $\Gamma'$ up to the addition
of an exact form.
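Our aim is to express the symplectic form on the solutions to the wave equation
as an integral of $\Omega$. We suppose that $\Gamma$ and $\Gamma'$ are constructed from $b$ as
above, so that
$$\Gamma = \bar\partial\beta, \qquad \Gamma' = \bar\partial\beta'$$
on $V$, with similar expressions in $\tilde V$. Because $P$ is three-dimensional, the exterior
product of more than three $(1, 0)$-forms vanishes, so $\bar\partial$ can be replaced by $d$ in the
presence of the $(3, 0)$-form $\xi$, and we have
$$\Omega = \mathrm{tr}(d\beta' \wedge d\beta) \wedge \xi = a^*\,\mathrm{tr}(D\alpha' \wedge D\alpha) \wedge d\lambda \wedge d\mu \wedge d\zeta,$$
where $\lambda = \zeta w + \tilde z$, $\mu = \zeta z + \tilde w$, and where we define $D$ in terms of the gauge
potential and the exterior derivative $d$ on the correspondence space by
$$D = d + [\Phi,\ \cdot\ ].$$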
Now we have
DaAdAAdu =ct d(AdAAdµ+DmadAAdzAdw-Deadu AdwAd2,
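modulo terms in $d\zeta$, where $\ell = \partial_w - \zeta\partial_{\tilde z}$ and $m = \partial_z - \zeta\partial_{\tilde w}$. But, by using
$D_\ell\theta = D_{\tilde z}\phi$, we also have
$$D_\ell\alpha = \frac{D_{\tilde z}\phi + \bar\zeta D_w\phi}{1 + \zeta\bar\zeta}. \qquad (10.8.6)$$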

It follows that
(Dw-O dw - DiO dz + (DpO dw - ZDw46 dz A
Dea dA A dz A dw A d(- w
1 + (C
where $\omega = dw \wedge d\tilde w - dz \wedge d\tilde z$, together with a similar expression for $D_m\alpha$.
Suppose that $\Sigma$ is a real 3-manifold in complex space-time in the image of
$b: P \to U$. If we put $\Sigma' = b^{-1}(\Sigma)$, $\Sigma'' = q^{-1}(\Sigma)$, where $q$ is the projection
$\mathcal{F} \to \mathbb{CM}$, then we have
f(r' A F)A =f
' ,l
tr(Da'AD)AdAAdtiAd(

= 7ri J tr(o o ' - 0BD.O') A w .

E
Therefore the bilinear form is given by

1J
27ri E'
(r' A r) A = 1 f tr(-0*D(k' - ¢'*D(k).
So the symplectic structures on the solution space to the ASDYM equation, and
hence the symplectic structures on its reductions to integrable systems, have
their origin in the simple and natural integral of the twistor representatives on
the left-hand side. Note that the derivation of this equality does not involve any
formal applications of Stokes' theorem, although the Dolbeault representatives
$\Gamma$ and $\Gamma'$ have been constructed in a particular way that involves $\Sigma$. The
left-hand side is independent of the choice of these representatives up to boundary
terms, but to make the boundary terms vanish it is necessary to impose boundary
conditions at infinity in space-time.

NOTES ON CHAPTER 10
1. We do not need to specify $V$ and $\tilde V$ precisely, beyond requiring that $V \cap \tilde V$ should
intersect each projective line in $P$ in an annulus containing the circle $|\zeta| = 1$ and that
they should be Stein.
2. There is no systematic method for performing the Birkhoff factorization, but there
exist a large number of special constructions in which the factorization can be carried
out algorithmically for patching data of a particular form. See §9.3 for a
factorization method that we shall use in a later example. The example here is one of a
class introduced by Atiyah and Ward (1977) and developed and generalized in Ward
(1981), Woodhouse (1983) and Ivancovitch et al. (1990), in all of which the patching
matrices are upper triangular. The examples are nontrivial, however, in the sense that
they do not lead to upper triangular solutions to the ASDYM equations when the
entries on the diagonal have nonzero winding number on the annulus in $\mathbb{CP}^1$. Indeed,
it is shown in Ivancovitch et al. (1990) that the solutions obtained in this way are dense
in the space of all local solutions in the Weierstrass sense, that is, in the same sense
that polynomials are dense in the space of smooth functions. We give only the simplest
case here, in which the gauge group is SL(2, C) and the diagonal entries in the patching
matrix have winding number ±1. Ward and Wells (1990, §8.2) give a full discussion of
the general SU(2) case.
3. Atiyah et al. (1978a) remark that when the real slice is Euclidean and D is irreducible
(i.e. there are no nontrivial sub-bundles preserved by parallel transport), positive real
forms are essentially unique whenever they exist; so such a reduction of the structure
group on the real slice is really a property of the holomorphic bundle E', rather than
an additional structure on it.
4. Atiyah et al. (1978a) identify P with the sphere bundle of radius 2 in the bundle of
self-dual 2-forms and observe that each real self-dual 2-form of norm 2 determines a
complex structure on the horizontal tangent space.
5. These solutions are, in normal terminology, anti-instantons. The instanton number
is usually defined to be -k.
6. The linear Penrose transform gives an isomorphism between $H^1(\mathbb{CP}^3, E'(-2))$ and
solutions to the conformally invariant Laplace equation
$$-D_aD^a\phi + \tfrac{1}{6}R\phi = 0$$
on sections of $E$ (see the discussion of the corresponding transform for $\mathrm{adj}(E)$ in
§10.8). The Laplacian is a positive operator on $S^4$ and so the equation has no nontrivial
solutions by the standard integration by parts argument.
7. The full proof is quite involved and requires spectral sequence arguments and lengthy
diagram chases; see the detailed survey by Atiyah (1979) or Drinfeld and Manin (1978).
See Beilinson (1978) or Okonek et al. (1980) for more details on the point of view that
is sketched in the following.
We consider the Cartesian product $\mathbb{CP}^3 \times \mathbb{CP}^3$ with homogeneous coordinates $Z^\alpha$
on the first factor and $Y^\alpha$ on the second. The idea is to pull back the bundle from
the second factor, restrict to the diagonal, and push down to the first factor. This
obviously yields the same bundle. What makes the construction nontrivial is that the
push-down can be computed from the spectral sequence that arises from the following
resolution of the sheaf of sections of the restriction of the bundle to the diagonal $\Delta$:
$$0 \to E \otimes \Omega^3_Y(-2,2) \to E \otimes \Omega^2_Y(-1,1) \to E \otimes \Omega^1_Y \to E(1,-1) \to E|_\Delta \to 0.$$
Here the same symbol is used for a bundle and its sheaf of sections. The symbol $\Omega^i_Y$
denotes the pull-back of the sheaf of holomorphic $i$-forms from the second factor, and
adjoining $(p, q)$ to a sheaf denotes the tensor product with the sheaf of functions of
homogeneity degree $p$ on the first factor and $q$ on the second. The second, third and
fourth maps are contraction of the form with the vector field $Z^\alpha\,\partial/\partial Y^\alpha$. The image of
the fourth map is the ideal sheaf of the diagonal tensored with $E(1, -1)$: it maps into
the ideal sheaf since the contraction of $Y^\alpha\,\partial/\partial Y^\alpha$ into any form yields zero, so that
contraction with $Z^\alpha\,\partial/\partial Y^\alpha$ yields zero when $Z^\alpha$ is proportional to $Y^\alpha$. The map is
surjective since $\Omega^1_Y/(Z^\alpha\,\partial/\partial Y^\alpha \lrcorner\, \Omega^2_Y)$ is the conormal bundle of the diagonal. Thus the
image of the fifth map is the restriction of $E(1, -1)$ to the diagonal, which is isomorphic
to $E$ since $\mathcal{O}(1, -1)|_\Delta = \mathcal{O}|_\Delta$.
By standard sheaf cohomology arguments there is a spectral sequence that
converges to the sheaf $E$ on the first factor. The $i$th row in the first level of the spectral
sequence is the $i$th push-down of the resolution above onto the first factor, and there
are horizontal maps which are those induced on cohomology from the maps in the
resolution above. However, most of these groups vanish. The zeroth and third cohomology
all vanish for elementary reasons (the zeroth because sections of bundles of negative
Chern class vanish on restriction to generic lines, and the third by Serre duality). The
group $H^1(E \otimes \Omega^3(2)) = H^1(E(-2))$ vanishes by assumption, and the vanishing of the
second cohomology can be deduced from the vanishing of $H^1(E(-2))$, together with
Serre duality and other standard arguments. Thus the array reduces to
$$\begin{array}{ccccc}
0 & \to & 0 & \to & 0 \\
\mathcal{O}(-1) \otimes H^1(E \otimes \Omega^2(1)) & \to & H^1(E \otimes \Omega^1) & \to & \mathcal{O}(1) \otimes H^1(E(-1)) \\
0 & \to & 0 & \to & 0.
\end{array}$$
The spectral sequence converges to $E$ at the second level, that is, $E$ is the cohomology
of the above sequence. It follows that we have constructed a monad for $E$.
Conversely one can start with $E$ as defined by a monad, compute its cohomology,
and deduce that the vector spaces and maps can be identified as above. See Atiyah
(1979), Okonek et al. (1980) and references therein for more details of these arguments.
8. Another way to look at the twistor transform in this case is to construct a
non-Hausdorff twistor space for $S^2 \times S^2$ by gluing together two copies of $\mathbb{CP}^3$ on open
neighbourhoods of the two copies of $\mathbb{RP}^3$: the global analytic solutions on $S^2 \times S^2$
are then represented by bundles over the resulting non-Hausdorff complex manifold
(Mason 1995).
9. The real axis can be turned into the unit circle by means of the Möbius transformation
$\zeta \mapsto (\zeta + i)/(\zeta - i)$ if desired.
10. See also Lerner (1992) and the extension of the Dolbeault lemma in Woodhouse
(1992a).
11. For the same reason, U# does not have a spin structure (§9.9): if it did, one
could identify F with the projective prime-spin bundle and distinguish two connected
components of F - FR by the sign of Im (7r A' TA'), which is nonzero if 7rA' O TA'. One
can construct the prime-spin bundle, and the ultrahyperbolic conjugation aA' TA',
but not the spinor CA'B'.
12. The complex conjugate $\bar E \to \mathbb{CP}^3$ of a holomorphic bundle is the holomorphic
bundle with fibre $\overline{E_{\sigma Z}}$ at $Z \in \mathbb{CP}^3$, where $\sigma$ is the ultrahyperbolic conjugation.
13. The X-ray transform gives a correspondence between weight $-2$ functions $f$ on
$\mathbb{RP}^3$ and solutions $\phi$ of the ultrahyperbolic wave equation in which $\phi$ is the integral of
$f$ along the lines in $\mathbb{RP}^3$. This will be considered in a forthcoming paper, The Funk
transform as a Penrose transform, by T. N. Bailey, M. G. Eastwood, A. R. Gover, and
L. J. Mason; see also Woodhouse (1992a).

14. We can also construct $G^{-1}$ by adapting the method of Eastwood (1982). The
argument is only given in outline, because although it is straightforward to fill in the
details, the complete proof requires a rather more careful discussion of the definition
of the cohomology groups than our limited use of the theory would justify. We shall
prove the following proposition. Suppose that the intersection of $U$ with each α-plane
is connected and simply-connected. Then the linear Penrose transform $\gamma \mapsto \phi$ is an
isomorphism $H^1(P, A) \to W_D$. Given $D$ and $\phi$, we construct a complex affine space
$\mathcal{A}_Z$ for each $Z \in P$ as follows. Over each α-plane $Z$, we have a natural line bundle
$\mathcal{O}(-1) \to Z$, of which the fibre at each point is the one-dimensional space of spinors $\pi^{A'}$
tangent to $Z$. By taking tensor powers, we can construct the line bundles $\mathcal{O}(k) \to Z$.
There is a natural flat connection on $\mathcal{O}(-1)$, determined by the space-time covariant
derivative. We define $\mathcal{A}_Z$ to be the set of solutions on $Z \cap U$ to the inhomogeneous
equations
$$\pi^{A'}D_{AA'}\theta_{B'} = D_{AB'}\phi, \qquad \theta_{B'}\pi^{B'} = \phi,$$
where $\theta_{A'}$ is a section over $Z \cap U$ of $\mathcal{O}(-1) \otimes S' \otimes \mathrm{adj}(E)$. The integrability condition
for the first equation is satisfied as a consequence of (10.8.1), and the contraction with
$\pi^{A'}$ of the first equation gives the derivative of the second, so the two equations are
compatible. If $\theta_{A'}$ and $\theta'_{A'}$ are solutions, then
$$\theta_{A'} - \theta'_{A'} = \psi\,\pi_{A'} \qquad (10.8.7)$$
for some $\psi \in \Gamma(Z \cap U, \mathrm{adj}(E) \otimes \mathcal{O}(-2))$ such that $\pi^{A'}D_{AA'}\psi = 0$, that is, $\psi$ is an
element of $A_Z$. Because of the topological conditions on $U$, the solutions are uniquely
determined by their values at any one point of $Z \cap U$, and therefore $\mathcal{A}_Z$ is an affine
space modelled on the Lie algebra of the gauge group. As we allow $Z$ to vary in $P$,
we obtain a holomorphic affine bundle $\mathcal{A} \to P$, such that the corresponding vector
bundle is $A$. However an affine bundle modelled on $A$ is determined by an element of
$H^1(P, A)$, so we have a map from $W_D$ to $H^1(P, A)$ (see §9.6). It coincides with the
one defined above because we can construct solutions to the inhomogeneous equation
from $\theta$ and $\tilde\theta$ by putting
OA, + O7rA' + OoA' = 0, OA' + (-2 O7rA' + (_' OLA' = 0,
so that =7rA,(0+(_'O-C_z©).

OA' -OA'
To show this geometric definition gives an isomorphism is straightforward. In the
forward direction, we can choose $\theta_{A'}$ so that it varies holomorphically with $Z$, but only
locally in $P$ since there are no global functions homogeneous of degree $-1$. The different
local choices are a family of sections $\theta^i_{A'}$ of $\mathcal{A}$ over the open sets $U_i$ of some cover of $P$.
By (10.8.7), their differences $\theta^i_{A'} - \theta^j_{A'}$ determine a representative $\psi_{ij}$ of an element
of $H^1(P, A)$. In the other direction we recover the $\theta_{A'}$s from the $\psi_{ij}$s by interpreting
(10.8.7) as a splitting formula, by exploiting the fact that $H^1(\mathbb{CP}^1, \mathcal{O}(-1)) = 0$ and
by noting that if $\psi$ is homogeneous of degree $-2$ in $\pi_{A'}$, then the two components of
$\pi_{A'}\psi$ are both homogeneous of degree $-1$.
The proposition can also be proved as a corollary to Theorem 10.2.1 since if we put
$\gamma = \zeta^{-1}F^{-1}\delta F$, then the transform maps a perturbation $\delta F$ of the patching matrix to
$\phi = J^{-1}\delta J \in W_D$.
11
Reductions of the Penrose-Ward
transform

We showed in the last chapter that the Penrose-Ward transform gives a corre-
spondence between solutions to the ASDYM equation and holomorphic vector
bundles over twistor space. We showed how the transform can be used to rep-
resent ASD gauge fields by patching matrices: the patching matrix of an ASD
connection D generates the connection itself by Birkhoff factorization, and, as we
shall see in the next chapter, it embeds D in an infinite family of new solutions,
which are related to each other by the commuting flows of the ASDYM hierarchy.
Our central interest, however, is not in the ASDYM equations themselves, but in
the various integrable systems that can be obtained from them by reduction. We
are therefore interested in how the transform can be used to construct solutions
that are invariant under subgroups of the conformal group.
The proper conformal transformations of complex space-time map α-planes
to α-planes, and therefore induce holomorphic motions of twistor space, which
coincide with those of the natural action of $GL(4, \mathbb{C})$ on $\mathbb{CP}^3$. If a given ASDYM
field is invariant under a group of conformal symmetries, then its transform,
a bundle over twistor space, is invariant under the corresponding subgroup of
$GL(4, \mathbb{C})$. There are a number of different ways of representing invariant
bundles, and which one is most convenient depends on the symmetry group, and in
particular on the size of its singular set $\Sigma$, which is the set of α-planes that are
fixed by a non-trivial subgroup. If $\Sigma$ is small, then we can excise it, if necessary
by reducing the domain of the solution, and construct a reduced twistor space by
taking the quotient of $P - \Sigma$ by the symmetry group. In this case, the invariant
bundles are the pull-backs of unconstrained holomorphic bundles on the reduced
space and we can construct a reduced Penrose-Ward transform by factoring out
the actions of the symmetry group on space-time, on the correspondence space,
and on twistor space. However, if $\Sigma$ is larger, then it may intersect $P$ whatever
choice we make for the domain $U$. In this case nontrivial information about
the solutions is encoded in the action of the symmetries on the fibres over $\Sigma$,
and the symmetric solutions cannot be obtained simply by discarding ignorable
coordinates and by working on a reduced twistor space: we have to work with
invariant bundles over a larger space.
In the next section, we shall explain how the conformal symmetries of space-
time act on twistor space. We shall then consider various ways of imposing
symmetry on a holomorphic vector bundle over twistor space.

11.1 SYMMETRIES OF THE TWISTOR CORRESPONDENCE

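In §9.2, we introduced two spaces associated with an open subset $U \subset \mathbb{CM}$:
the twistor space $P \subset \mathbb{PT}$, which is the set of α-planes that meet $U$, and the
correspondence space $\mathcal{F} \subset \mathbb{F}$, which is fibred over both $U$ and $P$:
$$U \xleftarrow{\ q\ } \mathcal{F} \xrightarrow{\ p\ } P.$$
A point of $\mathcal{F}$ is a pair $(x, Z)$, where $x \in U$ and $Z \in P$ is an α-plane that passes
through $x$.
We shall label the points of $\mathcal{F}$ by the space-time coordinates $w, z, \tilde w, \tilde z$ of
$x$ and by the stereographic coordinate $\zeta$ of the tangent bivector to $Z$. We then
have that $P$ is the quotient of $\mathcal{F}$ by the flows of the vector fields
$$\ell = \partial_w - \zeta\partial_{\tilde z}, \qquad m = \partial_z - \zeta\partial_{\tilde w}. \qquad (11.1.1)$$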
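
The statement that twistor space is the quotient by these two flows can be illustrated numerically. The sketch below assumes the twistor coordinates $\lambda = \zeta w + \tilde z$, $\mu = \zeta z + \tilde w$ used in §10.8 and the vector fields $\ell = \partial_w - \zeta\partial_{\tilde z}$, $m = \partial_z - \zeta\partial_{\tilde w}$; the function names and the sample point are illustrative only. Because the coefficients of $\ell$ and $m$ are constant along their own flows, each flow is an explicit translation.

```python
# Points of the correspondence space are tuples (w, z, wt, zt, zeta),
# where wt and zt stand for the tilded coordinates. Names are illustrative.

def twistor_coords(w, z, wt, zt, zeta):
    """Return (lambda, mu, zeta) with lambda = zeta*w + zt, mu = zeta*z + wt."""
    return (zeta * w + zt, zeta * z + wt, zeta)

def flow_l(point, t):
    """Move a parameter distance t along l = d/dw - zeta d/dzt."""
    w, z, wt, zt, zeta = point
    return (w + t, z, wt, zt - zeta * t, zeta)

def flow_m(point, t):
    """Move a parameter distance t along m = d/dz - zeta d/dwt."""
    w, z, wt, zt, zeta = point
    return (w, z + t, wt - zeta * t, zt, zeta)

def close(u, v, tol=1e-12):
    """Compare two tuples of complex numbers entry by entry."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

# An arbitrary sample point and arbitrary complex flow parameters.
point = (0.3 + 0.1j, -1.2 + 0.5j, 0.7 - 0.4j, 2.0 + 1.5j, 0.6 - 0.8j)
moved = flow_m(flow_l(point, 0.9 - 0.2j), -1.3 + 0.7j)

# lambda, mu and zeta agree at the two points, so both lie on one alpha-plane.
same_alpha_plane = close(twistor_coords(*point), twistor_coords(*moved))
```

The three functions $\lambda, \mu, \zeta$ are therefore constant on the leaves and serve as local coordinates on the quotient.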
In spinor notation, $\mathcal{F}$ is the complement of the zero section in the bundle
$\mathbb{S}' \to U$, modulo the equivalence relation $(x, \pi_{A'}) \sim (x, \lambda\pi_{A'})$ for nonzero $\lambda \in \mathbb{C}$.
The twistor space $P$ is the space of leaves of the foliation spanned by the
projections of two horizontal vector fields $\pi^{A'}\partial_{AA'}$, $A = 0, 1$. These span an
integrable distribution on $\mathbb{S}'$, which descends to $\mathcal{F}$ under the quotient by $\sim$.
We shall transfer the action of the conformal group from space-time to twistor
space by lifting the conformal Killing vectors from $U$ to $\mathcal{F}$, and then by
projecting them into $P$. The coordinate representation of $\mathcal{F}$ is useful in making this
construction explicit, while in the spinor formalism it is more obviously covariant
and is clearly well defined at $\zeta = \infty$.
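Lifts of symmetries to $\mathcal{F}$
A space-time vector field
$$X = a\partial_w + b\partial_z + \tilde a\partial_{\tilde w} + \tilde b\partial_{\tilde z}$$
is a conformal Killing vector if the flow of $X$ preserves the metric up to scale,
that is, if $\partial_{(c}X_{d)}$ is proportional to the metric tensor. In terms of the components
of $X$, this condition is
$$\partial_w a + \partial_{\tilde w}\tilde a = \partial_z b + \partial_{\tilde z}\tilde b, \quad \partial_z a = \partial_{\tilde w}\tilde b, \quad \partial_{\tilde w}a = \partial_{\tilde z}b = 0, \quad \partial_{\tilde z}a = \partial_{\tilde w}b,$$
together with the same equations with the tilded and untilded variables
interchanged. Suppose that the condition holds. Then, for each fixed $\zeta$,
$$[X, \ell] = Q\partial_{\tilde z}, \qquad [X, m] = Q\partial_{\tilde w}$$
modulo $\ell$, $m$, where
$$Q = \zeta^2\partial_{\tilde z}a + \zeta(\partial_{\tilde z}\tilde b - \partial_w a) - \partial_w\tilde b,$$
and $X$ is regarded as a vector field on $\mathcal{F}$ with no $\partial_\zeta$ component. By the conformal
Killing equation, $Q$ is constant along $\ell$ and $m$. So if we define the lift of $X$ to $\mathcal{F}$
to be the vector field
$$X'' = a\partial_w + b\partial_z + \tilde a\partial_{\tilde w} + \tilde b\partial_{\tilde z} + Q\partial_\zeta,$$
then we have $[X'', \ell] = 0 = [X'', m]$, modulo combinations of $\ell$ and $m$, and
$q_*X'' = X$. The flow along $X''$ in the correspondence space determines the
behaviour of α-planes under the flow along $X$ in space-time.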
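
The vanishing of the bracket of the lifted field with the first spanning field can be spelled out in one line. The sketch assumes that the lift is the space-time field $X$ plus $Q\,\partial_\zeta$, that the spanning field is $\ell = \partial_w - \zeta\partial_{\tilde z}$ with $\zeta$ the spectral parameter, and that $[X, \ell] = Q\,\partial_{\tilde z}$ modulo $\ell$ and $m$, as stated above; the calculation for $m$ is identical with $\partial_{\tilde w}$ in place of $\partial_{\tilde z}$.

```latex
\begin{align*}
[X'', \ell]
  &= [X, \ell] + [Q\,\partial_\zeta,\ \partial_w - \zeta\,\partial_{\tilde z}] \\
  &\equiv Q\,\partial_{\tilde z}
     - Q\,\partial_{\tilde z} - \ell(Q)\,\partial_\zeta
     && \text{modulo } \ell,\ m \\
  &= -\,\ell(Q)\,\partial_\zeta \\
  &= 0
     && \text{since } Q \text{ is constant along } \ell .
\end{align*}
```

The $\partial_\zeta$ component of the lift is thus exactly what is needed to cancel the failure of $X$ alone to preserve the distribution.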
Flows on P
Since the flow of $X''$ preserves the distribution spanned by $\ell$ and $m$, its projection
$X' = p_*X''$ is a well-defined holomorphic vector field on twistor space: the flow
of $X'$ gives the action on α-planes of the conformal motions generated by $X$. In
the coordinates $\lambda, \mu, \zeta$,
$$X' = (\zeta a + \tilde b + wQ)\,\partial_\lambda + (\zeta b + \tilde a + zQ)\,\partial_\mu + Q\,\partial_\zeta, \qquad (11.1.2)$$
where the components are constant along $\ell$ and $m$ and are therefore functions
of $\lambda, \mu, \zeta$ alone.
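
As a quick check of (11.1.2), the sketch below evaluates its components for a rotation $X = w\partial_w - \tilde w\partial_{\tilde w}$ and a translation $Y = \partial_z + \partial_{\tilde z}$, and compares them with the twistor-space fields $-\mu\partial_\mu - \zeta\partial_\zeta$ and $\partial_\lambda + \zeta\partial_\mu$ quoted for these generators in Example 11.2.5. It assumes the reading $Q = \zeta^2\partial_{\tilde z}a + \zeta(\partial_{\tilde z}\tilde b - \partial_w a) - \partial_w\tilde b$ of the lift coefficient and the coordinate $\mu = \zeta z + \tilde w$; all function and variable names are hypothetical.

```python
def lift_coefficient(zeta, a_zt, bt_zt, a_w, bt_w):
    """Q built from first derivatives of the components of X (assumed form):
    Q = zeta^2 * da/dzt + zeta * (dbt/dzt - da/dw) - dbt/dw."""
    return zeta**2 * a_zt + zeta * (bt_zt - a_w) - bt_w

def twistor_field(point, a, b, at, bt, Q):
    """Components of X' along d/dlambda, d/dmu, d/dzeta, following (11.1.2)."""
    w, z, wt, zt, zeta = point
    return (zeta * a + bt + w * Q, zeta * b + at + z * Q, Q)

def close(u, v, tol=1e-12):
    """Compare two tuples of complex numbers entry by entry."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

# Arbitrary sample point (w, z, wt, zt, zeta); wt, zt are the tilded coordinates.
point = (0.3 + 0.1j, -1.2 + 0.5j, 0.7 - 0.4j, 2.0 + 1.5j, 0.6 - 0.8j)
w, z, wt, zt, zeta = point
mu = zeta * z + wt

# Rotation: a = w, at = -wt, b = bt = 0, so da/dw = 1 and Q = -zeta.
Q_rot = lift_coefficient(zeta, a_zt=0, bt_zt=0, a_w=1, bt_w=0)
rot = twistor_field(point, a=w, b=0, at=-wt, bt=0, Q=Q_rot)

# Translation: b = bt = 1, all derivatives vanish, so Q = 0.
Q_tr = lift_coefficient(zeta, a_zt=0, bt_zt=0, a_w=0, bt_w=0)
tr = twistor_field(point, a=0, b=1, at=0, bt=1, Q=Q_tr)

rotation_matches = close(rot, (0, -mu, -zeta))
translation_matches = close(tr, (1, zeta, 0))
```

In both cases the components depend on the sample point only through $\mu$ and $\zeta$, as the text requires of a field that descends to twistor space.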
Every vector field on $\mathbb{PT}$ generated in this way from a conformal Killing vector
can be written in homogeneous coordinates in the form
$$X' = A^\alpha{}_\beta\,Z^\beta\,\frac{\partial}{\partial Z^\alpha}, \qquad (11.1.3)$$
where $A$ is a constant $4 \times 4$ matrix, determined uniquely by $X$ up to the addition
of a multiple of the identity; the right-hand side is homogeneous of degree zero,
and is therefore well defined on the projective space. The explicit correspondence
between $A$, a generator of $GL(4, \mathbb{C})$, and $X$ is given in Table 2.1.
Spinor formulation
In the spinor description of the correspondence space, $\mathcal{F}$ is the quotient of $\mathbb{S}'$
by the flow of the Euler vector field $\pi_{A'}\,\partial/\partial\pi_{A'}$ and $X''$ is the image under the
projection $\mathbb{S}' \to \mathcal{F}$ of the vector field
$$X^a\frac{\partial}{\partial x^a} - \pi^{A'}\phi_{A'B'}\frac{\partial}{\partial\pi_{B'}}, \qquad (11.1.4)$$
where $\phi_{A'B'} = \tfrac{1}{2}\partial_{AA'}X^A{}_{B'}$. The image is single-valued because (11.1.4) is
homogeneous of degree zero in $\pi_{A'}$; and the fact that the flow of $X''$ preserves the
fibration of $\mathcal{F}$ over $\mathbb{PT}$ is a consequence of the vanishing, modulo $\pi^{A'}\partial_{AA'}$, of
the Lie bracket of (11.1.4) with $\pi^{A'}\partial_{AA'}$. For a translation or a left rotation,
$\phi_{A'B'} = 0$; for a right rotation, $\phi_{A'B'}$ is constant, and is the generator of the
SL(2, C) action on the constant primed spinors.
11.2 SYMMETRIES OF THE TWISTOR BUNDLE
Let $U$ be a neighbourhood in complex space-time satisfying the condition in
Theorem 10.2.1, let $E' \to P$ be the Penrose-Ward transform of an ASDYM
connection $D$ on a vector bundle $E \to U$, and let $H$ be a subgroup of the
conformal group. From the preceding section, we know that $H$ acts (locally) on
$U$ and $P$, and that each element of the Lie algebra $\mathfrak{h}$ gives rise to a conformal
Killing vector $X$ on $U$ and to a holomorphic vector field $X'$ on $P$.
If $D$ is invariant under a lift of the action of $H$ to $E$, then each $h \in H$ maps
parallel sections of $E$ over an α-plane $Z$ to parallel sections over the α-plane
$h(Z)$, and so the action of $H$ on $P$ lifts to $E'$. Conversely, if $E'$ is invariant in
the sense that such a holomorphic lift exists, then $H$ acts on $E$ and preserves $D$,
by the construction of $E$ and the geometric characterization of $D$ in the proof
of Theorem 10.2.1. By arguing in the same way at the infinitesimal level for the
action of $\mathfrak{h}$, we have the following.
Proposition 11.2.1 An ASDYM connection on a bundle $E \to U$ is invariant
under a Lie algebra $\mathfrak{h}$ of conformal transformations if and only if the holomorphic
action of $\mathfrak{h}$ on $P$ lifts to the Penrose-Ward transform $E' \to P$.
If $U$ is invariant under $H$, then the same statement holds for the group actions.

Invariant bundles
In the twistor picture, there are two useful ways to make explicit the invariance
condition, since the invariance of $E' \to P$ under $\mathfrak{h}$ is equivalent to either of the
following.

(a) There exists a representation $X' \mapsto \mathcal{L}_{X'}$ of $\mathfrak{h}$ by Lie derivative operators on
the local sections of $E'$.
(b) Let $F$ be a patching matrix for $E'$. Then for each $X' \in \mathfrak{h}$ there exist
matrix-valued functions $\theta_{X'}$ and $\tilde\theta_{X'}$, holomorphic on $V$ and $\tilde V$, respectively, such
that
$$X'(F) = F\theta_{X'} - \tilde\theta_{X'}F, \qquad (11.2.1)$$
and, for each $X', Y' \in \mathfrak{h}$,
$$X'(\theta_{Y'}) - Y'(\theta_{X'}) + [\theta_{X'}, \theta_{Y'}] = \theta_{[X',Y']},$$
together with the same identity for the $\tilde\theta$s.

In (a), we are using the formalism of §2.5 to express the condition that the
vector fields generating the action of $\mathfrak{h}$ on $P$ should lift to $E'$, with the same
commutation relations. In (b), we are using an infinitesimal form of the condition
that a transformation $\rho: P \to P$ should preserve $E'$: the patching matrices $F$
and $F \circ \rho$ determine equivalent bundles if $F \circ \rho = \tilde H^{-1}FH$, where $H$ is regular
on $V$ and $\tilde H$ is regular on $\tilde V$. (In fact, this last statement needs to be interpreted
with some caution because $\rho$ will move the covering sets as well as changing the
patching matrix. It is only true if $\rho$ is close to the identity.)
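
The passage from the finite equivalence condition to its infinitesimal form (11.2.1) can be sketched as a first-order expansion. The notation below is assumed for illustration: $\rho$ is the flow of $X'$ through a small parameter $\epsilon$, and $\theta_{X'}$, $\tilde\theta_{X'}$ are the first-order parts of the two gauge factors, holomorphic on the two sets of the cover.

```latex
\begin{align*}
\rho &= \exp(\epsilon X'), \qquad
H = 1 + \epsilon\,\theta_{X'} + O(\epsilon^2), \qquad
\tilde H = 1 + \epsilon\,\tilde\theta_{X'} + O(\epsilon^2), \\
F\circ\rho &= F + \epsilon\,X'(F) + O(\epsilon^2), \\
\tilde H^{-1}FH &= (1 - \epsilon\,\tilde\theta_{X'})\,F\,(1 + \epsilon\,\theta_{X'}) + O(\epsilon^2)
  = F + \epsilon\,\bigl(F\theta_{X'} - \tilde\theta_{X'}F\bigr) + O(\epsilon^2).
\end{align*}
```

Equating the terms of order $\epsilon$ gives $X'(F) = F\theta_{X'} - \tilde\theta_{X'}F$, which is (11.2.1).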
The two formulations are connected through the local expressions for the Lie
derivative
$$\mathcal{L}_{X'} = X' + \theta_{X'}, \qquad \mathcal{L}_{X'} = X' + \tilde\theta_{X'} \qquad (11.2.2)$$
in the trivializations on $V$ and $\tilde V$, respectively. There is a straightforward
modification of (b) when $E'$ is represented by a system of patching matrices $F_{\alpha\beta}$,
relative to some general open cover $V_\alpha$ of $P$.
Lie derivatives

Suppose that $E'$ is invariant under the action of $\mathfrak{h}$. Then $E$ and $D$ are also
invariant, and for each generator in $\mathfrak{h}$, we have Lie derivative operators $\mathcal{L}_X$, $\mathcal{L}_{X''}$,
and $\mathcal{L}_{X'}$ acting, respectively, on sections of the Yang-Mills bundle over
space-time, $E \to U$, on sections of its pull-back $E'' = q^*E \to \mathcal{F}$ to the correspondence
space, and on sections of $E' \to P$. In space-time, the Lie derivatives preserve the
connection. We shall now consider how the local expressions for these operators
are related.
Suppose that the Lie derivatives on $E' \to P$ are given by (11.2.2) on $V$ and
$\tilde V$, respectively. If $F$ is the patching matrix, then
$$\theta_{X'} = F^{-1}X'(F) + F^{-1}\tilde\theta_{X'}F, \qquad (11.2.3)$$
on the overlap. Now pull $F$ and the $\theta$s back to the correspondence space, and
make a Birkhoff factorization $F = \tilde f^{-1}f$ of $F$, where $f$ and $\tilde f$ are functions of
$\zeta$ and the space-time coordinates. The factorization determines a local
trivialization of $E$ in which the gauge potential is given by Lemma 10.1.1. From
(11.2.3),
$$f\theta_{X'}f^{-1} - X''(f)f^{-1} = \tilde f\tilde\theta_{X'}\tilde f^{-1} - X''(\tilde f)\tilde f^{-1}.$$
The left-hand side is holomorphic in $\zeta$ on $p^{-1}(V)$ and the right-hand side is
holomorphic in $\zeta$ on $p^{-1}(\tilde V)$ (including $\zeta = \infty$). By Liouville's theorem, therefore,
both sides are independent of $\zeta$. So if we put
$$\theta_X = f\theta_{X'}f^{-1} - X''(f)f^{-1}, \qquad (11.2.4)$$
then $\theta_X$ depends only on the space-time coordinates.
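
The identity obtained from (11.2.3) can be derived in a few lines. The sketch assumes the factorization $F = \tilde f^{-1}f$ and uses the fact that, once $F$ is pulled back to the correspondence space, the derivative $X'(F)$ is computed by the lifted field $X''$, since $X''$ projects to $X'$.

```latex
\begin{align*}
X''(F) &= X''(\tilde f^{-1}f)
        = -\tilde f^{-1}X''(\tilde f)\,\tilde f^{-1}f + \tilde f^{-1}X''(f), \\
f\,F^{-1}X''(F)\,f^{-1}
       &= \tilde f\,X''(F)\,f^{-1}
        = -X''(\tilde f)\,\tilde f^{-1} + X''(f)\,f^{-1}, \\
f\,F^{-1}\tilde\theta_{X'}F\,f^{-1}
       &= \tilde f\,\tilde\theta_{X'}\,\tilde f^{-1}, \\
f\theta_{X'}f^{-1} - X''(f)f^{-1}
       &= \tilde f\tilde\theta_{X'}\tilde f^{-1} - X''(\tilde f)\tilde f^{-1}.
\end{align*}
```

The last line follows by conjugating (11.2.3) with $f$ and substituting the two lines above it; here $fF^{-1} = \tilde f$ and $Ff^{-1} = \tilde f^{-1}$.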
By pulling back the space-time trivialization of $E$ by $q$ and by pulling back
the trivialization of $E'$ over $V$ by $p$, we construct two trivializations of
$$E'' = q^*E = p^*E' \to \mathcal{F}.$$
The transition matrix from the first to the second is $f$. Hence it follows from
the form of the right-hand side of (11.2.4) that
$$\mathcal{L}_X = X + \theta_X, \qquad \mathcal{L}_{X''} = X'' + \theta_X,$$
in the space-time trivialization.
Higgs fields
Suppose that $X' = 0$ at some point $Z \in P \subset \mathbb{PT}$. Then $X$ is tangent to the
α-plane $Z$, and at points $(x, Z) \in \mathcal{F}$, where $x \in Z$, the lifted vector field $X''$ is
a linear combination of $\ell$ and $m$. The value of $\theta_{X'}$ at $Z$ determines an element
of $\mathrm{adj}(E')_Z$, which is independent of the choice of local trivialization: when we
change the local holomorphic frame, $\theta_{X'}$ transforms by
$$\theta_{X'} \mapsto H^{-1}\theta_{X'}H + H^{-1}X'(H).$$
However, the second term vanishes at $Z$, and the first term is the transformation
law for the adjoint bundle. Thus the Lie derivative defines a linear map
$\theta_{X'}(Z): E'_Z \to E'_Z$, that is, a linear transformation of the space of covariantly
constant sections of $E$ over $Z \cap U$. If we denote the Higgs field of $X$ in space-time
by $\phi_X$, then we have the following.
Lemma 11.2.2 If $X' = 0$ at $Z \in P$, then the map $E'_Z \to E'_Z$ coincides with the
action of $-\phi_X$ on the fibres of $E$ over $Z \cap U$.
Proof This is a direct consequence of the geometric definition of the
Penrose-Ward transform, but we can also demonstrate it explicitly by using the Birkhoff
factorization of the patching matrix. We can express the gauge potential in
terms of $f$ by
$$\Phi_w - \zeta\Phi_{\tilde z} = -f_wf^{-1} + \zeta f_{\tilde z}f^{-1}, \qquad \Phi_z - \zeta\Phi_{\tilde w} = -f_zf^{-1} + \zeta f_{\tilde w}f^{-1}.$$
Equivalently,
$$\ell \lrcorner\, q^*\Phi = -\ell(f)f^{-1}, \qquad m \lrcorner\, q^*\Phi = -m(f)f^{-1},$$
where $\ell$ and $m$ are the vector fields on $\mathcal{F}$ defined by (11.1.1). Therefore, at points
where $X''$ is a combination of $\ell$ and $m$,
$$X \lrcorner\, \Phi = X'' \lrcorner\, q^*\Phi = -X''(f)f^{-1} = -f\theta_{X'}f^{-1} + \theta_X.$$
Hence
$$\phi_X = X \lrcorner\, \Phi - \theta_X = -f\theta_{X'}f^{-1}. \qquad (11.2.5)$$
We again note that $f$ is the transition matrix from the space-time trivialization
of $E'' = q^*E$ to the pull-back of the trivialization of $E'$ over $V$. The lemma
follows.
From this we obtain a basic proposition about the Lie derivatives at points in
twistor space at which the symmetry Lie algebra fails to act freely.
Definition 11.2.3 The singular set of $\mathfrak{h}$ is the subset $\Sigma \subset \mathbb{PT}$ on which the
isotropy algebra
$$\mathfrak{h}_Z = \{X' \in \mathfrak{h} \mid X'(Z) = 0\}$$
is nonzero.
At points of the singular set, $\mathfrak{h}_Z$ acts linearly on the fibres of $E'$. On the other
hand, for any $Z \in P$, there is a natural identification of $E'_Z$ with the space of
covariantly constant sections of $E$ over the intersection of the corresponding
α-plane with $U$, by the definition of the Penrose-Ward transform. We can therefore
identify $E|_{Z \cap U}$ with $(Z \cap U) \times E'_Z$. Hence if $Z$ is in the singular set, then $\mathfrak{h}_Z$ also
acts linearly on the fibres of $E$ at points of $Z \cap U$. The following is an immediate
consequence of the lemma.
Proposition 11.2.4 For any $Z \in \Sigma$, the action of $\mathfrak{h}_Z$ on $E_x$, $x \in Z \cap U$, is
generated by $X \mapsto -\phi_X$.
The reduced linear system
Away from the singular set in PT, there exist local invariant sections of E'. Their
pull-backs to .'F by p are the simultaneous solutions to
210 Reductions of the Penrose- Ward transform
Des=O, D,,,s=0, L .,s=0,
where X, Y, ... are the generators of H, and we have pulled back D to a connection
on E'' by q. The first two equations express the constancy of s on
$\alpha$-planes, and thus the fact that it descends to a local section of E'; the remaining
equations express its symmetry.
We can use the symmetry equations to discard the ignorable space-time
coordinates; the first two equations then become a reduced linear system, for which
the compatibility condition is equivalent to the reduced ASDYM equation. We
did this in Chapter 6 to make reductions by two-dimensional translation groups,
but in the general case, we have to take account of the fact that the spectral
parameter $\zeta$ is not constant along the generators on F because a general conformal
motion does not map $\alpha$-planes to parallel $\alpha$-planes. To make the reduction
in general, we introduce coordinates on the quotient space of space-time by H
(these are functions which are constant along the generators), and an invariant
spectral parameter $\sigma$, which is a function on an open subset of F. We require
that $\sigma$ should be constant along X'', Y'', ..., but not on the fibres of $q: F \to$ CM.
The symmetry equations then allow us to express s as a function of $\sigma$ and the
nonignorable coordinates (the coordinates on the quotient space), and hence to
find the reduced linear system, by substituting into the first two equations.
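Example 11.2.5 Stationary axisymmetric solutions. In §6.5, we obtained the
Ernst equation by reducing by the two commuting Killing vectors
$$X = w\partial_w - \tilde w\partial_{\tilde w}, \qquad Y = \partial_z + \partial_{\tilde z}.$$
In this case,
$$X'' = w\partial_w - \tilde w\partial_{\tilde w} - \zeta\partial_\zeta, \qquad Y'' = \partial_z + \partial_{\tilde z},$$
$$X' = -\mu\partial_\mu - \zeta\partial_\zeta, \qquad Y' = \partial_\lambda + \zeta\partial_\mu.$$
As in §6.5, put $w = re^{i\theta}$, $\tilde w = re^{-i\theta}$, $z = t - x$, $\tilde z = t + x$, and choose the
invariant gauge so that (6.6.1) holds. Then x, r are the coordinates on the quotient
space, and the invariant local sections of the twistor bundle are represented by
simultaneous solutions to the linear equations
$$\partial_w s - \zeta(\partial_{\tilde z} + Q)s = 0, \quad \partial_z s - \zeta(\partial_{\tilde w} - \tilde w^{-1}P)s = 0, \quad X''(s) = 0, \quad Y''(s) = 0,$$
where s is a function of $\zeta$ and the space-time coordinates. For the invariant
spectral parameter, we take $\sigma = \zeta e^{i\theta}$; then the symmetry conditions are that s
depends only on x, r, $\sigma$, and the remaining two equations on s are
$$(\partial_r - \sigma\partial_x + r^{-1}\sigma\partial_\sigma)s - \sigma(J^{-1}J_x)s = 0,$$
$$(\partial_x + \sigma\partial_r - r^{-1}\sigma^2\partial_\sigma)s + \sigma(J^{-1}J_r)s = 0,$$
where J(x, r) is defined by (6.6.2). This is a linear system for the reduced form
of Yang's equation (6.6.3), that is, it is integrable if and only if eqn (6.6.3) holds.
Another possibility for the invariant spectral parameter is
$$\tau = \tfrac12(\lambda - \zeta^{-1}\mu) = x + \tfrac12 r(\sigma - \sigma^{-1}), \qquad (11.2.6)$$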
Reduced twistor spaces 211

which has the advantage that it is also constant along $\ell$, $m$, and so descends
to a function on twistor space. If we express s as a function of x, r, $\tau$, then the
reduced linear system takes the simpler form
$$(\partial_r - \sigma\partial_x)s - \sigma(J^{-1}J_x)s = 0, \qquad (\partial_x + \sigma\partial_r)s + \sigma(J^{-1}J_r)s = 0, \qquad (11.2.7)$$
where now $\sigma$ is defined as a function of x, r, $\tau$ by (11.2.6); J satisfies the reduced
form of Yang's equation if and only if this is integrable for every $\tau$. We shall see
in the next section that $\tau$ is a coordinate on the reduced twistor space.
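The passage from the three-term operators in (x, r, sigma) to the simpler form (11.2.7) rests on the fact that tau in (11.2.6) is annihilated by the first-order part of each operator, so the sigma-derivative drops out once tau is used as a coordinate. The following Python sketch checks this numerically by central differences. It assumes that (11.2.6) reads tau = x + (r/2)(sigma - 1/sigma) and that the operators are d_r - sigma d_x + (sigma/r) d_sigma and d_x + sigma d_r - (sigma^2/r) d_sigma; the function names are illustrative and not taken from the text.

```python
# Sketch: check that tau(x, r, sigma) is killed by the first-order parts of
# the reduced linear system written in the coordinates (x, r, sigma).

def tau(x, r, sigma):
    """Assumed form of (11.2.6): tau = x + (r/2)(sigma - 1/sigma)."""
    return x + 0.5 * r * (sigma - 1.0 / sigma)

def partial(f, args, i, h=1e-6):
    """Central-difference partial derivative of f in its i-th argument."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)

def residuals(x, r, sigma):
    """Apply both first-order operators to tau; both results should vanish."""
    tx = partial(tau, (x, r, sigma), 0)
    tr = partial(tau, (x, r, sigma), 1)
    ts = partial(tau, (x, r, sigma), 2)
    first = tr - sigma * tx + (sigma / r) * ts
    second = tx + sigma * tr - (sigma ** 2 / r) * ts
    return first, second

if __name__ == "__main__":
    # A generic complex value of the spectral parameter.
    print(residuals(0.3, 1.7, 0.4 + 0.9j))
```

Both residuals vanish up to finite-difference error, which is the chain-rule statement that a function of (x, r, tau) is differentiated by d_r - sigma d_x and d_x + sigma d_r alone.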

11.3 REDUCED TWISTOR SPACES

The most straightforward reductions of the Penrose-Ward transform occur when
$\mathfrak{h}$ acts freely on P. If $\dim\mathfrak{h} = k$, then there are k holomorphic vector fields
X', Y', ... on P which are everywhere independent and which generate the action
of $\mathfrak{h}$. We can construct invariant local trivializations of an invariant vector bundle
$E' \to P$ by solving
$$\mathcal{L}_{X'}s = 0, \qquad \mathcal{L}_{Y'}s = 0, \quad \ldots$$
to find invariant sections. Since the generating vector fields are independent
and their algebra is closed under Lie brackets, there are enough solutions in
a neighbourhood of each point to define a frame field. In this case, we can
characterize the invariant bundles by the condition that their transition matrices
should be constant along the generators.
From a more geometric point of view, the invariant bundles are the pull-backs
to P of unconstrained vector bundles $E_r \to R$ over the reduced twistor space R,
which we define to be the quotient of P by $\mathfrak{h}$, that is, R is the space of leaves
of the foliation spanned by the vector fields X', Y', .... The fibre of $E_r$ over a
point of R is the solution space to
$$\mathcal{L}_{X'}s = 0, \qquad \mathcal{L}_{Y'}s = 0, \quad \ldots$$
on the corresponding leaf of the foliation.
We can also take the quotients of U and F by the action of $\mathfrak{h}$, to construct
$S = U/H$ and $F_r = S \times \mathbb{CP}^1$. We then have a reduced form of the twistor
correspondence with no ignorable variables, based on a reduced version of the
double fibration,

$$S \xleftarrow{\ q\ } F_r \xrightarrow{\ p\ } R.$$

The reduced linear system determines a partial connection along the leaves of p,
and its integrability is equivalent to the reduced ASDYM equation; the solutions
to the linear system are identified with the local sections of $E_r$.
In this section, we shall look at three examples: minitwistor space, which is
the reduction by a non-null translation (Hitchin 1982b, Ward and Wells 1990),
Atiyah's modification of this construction in which the translation is replaced by
a rotation (Atiyah 1987), and the simultaneous reduction by a translation and a
commuting rotation, which is used in the analysis of the Ernst equation (Wood-
house and Mason 1988). The second and third example illustrate a significant
point: that although E' can be represented by invariant patching matrices, it
may not be possible to do this with only a two-set open cover. We have seen
already in Chapter 10 that global solutions may require covers with more than
two sets, but in the third example, the reduced twistor space of any open subset
of C M is non-Hausdorff, and more than two sets are required to determine even
local solutions in space-time from invariant patching data. (See Tod 1990 for
another example of a non-Hausdorff twistor space.)
Minitwistor space
In the first example, k = 1 and the symmetries are translations along a constant
non-null vector X. The corresponding reduced twistor space comes up in a
number of contexts: it is called minitwistor space, and we shall denote it by
MT. We choose the coordinates so that $X = \partial_w - \partial_{\tilde w}$. Then

in the coordinates on P introduced in §10.1; the first expression is valid on $V$
and the second on $\tilde V$. The coordinates $\zeta$ and $\tilde\zeta$ are constant along X', as are the
two holomorphic functions

We shall use $\gamma, \zeta$ and $\tilde\gamma, \tilde\zeta$ as two coordinate systems on MT. The two patches
cover the whole of MT, and on the overlap,
$$\tilde\gamma = \zeta^{-2}\gamma, \qquad \tilde\zeta = \zeta^{-1},$$
which are the transformation rules for the coordinates on the total space of the
line bundle $O(2) \to \mathbb{CP}^1$, with $\zeta$ and $\tilde\zeta$ as affine (stereographic) coordinates on
the base, and $\gamma$ and $\tilde\gamma$ as linear coordinates in the fibres. In spinor notation, the
projection PT $-\, I \to$ MT $= O(2)$ is
$$(\omega^A, \pi_{A'}) \mapsto (X_B{}^{B'}\omega^B\pi_{B'}, \pi_{A'}),$$
where we think of the components of $\pi_{A'}$ as homogeneous coordinates on the
Riemann sphere and of the expression $X_B{}^{B'}\omega^B\pi_{B'}$, which is homogeneous of
degree two in the twistor variables, as an element of the fibre of $O(2)$ over $[\pi_{A'}]$.
Thus MT is the line bundle $O(2)$, the holomorphic tangent bundle of the
Riemann sphere. If we combine this with the remarks above and with our
identification of the complex form of the Bogomolny equations with the reduction
by X of the ASDYM equation (§5.1), then the Penrose-Ward transform gives a
correspondence between solutions to the complex Bogomolny equations in CM,
modulo gauge, and holomorphic vector bundles over MT. Such vector bundles
can be trivialized over the two coordinate domains in MT, and therefore
represented by patching matrices $F(\gamma, \zeta)$. We recover the solution from F by
substituting
$$\gamma = \zeta\mu + \lambda = \tilde z + \zeta x + \zeta^2 z, \qquad (11.3.1)$$

where $x = w + \tilde w$, and by making a Birkhoff factorization $F = \tilde f^{-1} f$ for each
fixed $z, \tilde z, x$. If we choose the gauge so that $f = 1$ when $\zeta = 0$, then
is uniquely determined by F as a function of z, z, x, and
P = (8=h)h-1, Q= -(OZh)h-i

is a solution of the complex Bogomolny equation, in the form (5.1.1).

This construction is, in fact, a special case of the twistor solution of the
truncated GASDYM hierarchy H(1, 2) (§10.7), and we can interpret our solution
to the Bogomolny equations in the same geometric way. Equation (11.3.1) gives
a correspondence between points (z, z, x) in the quotient of space-time by X and
global sections of $O(2)$, and the recovery of the solution involves trivializing the
restriction to sections of the bundle $E_r \to O(2) =$ MT.
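The statement that a point of the quotient space corresponds to a global section of O(2) can be checked directly: the quadratic in the first patch must turn into a quadratic in the second patch under the transition rules. The Python sketch below does this numerically. It assumes that (11.3.1) reads gamma = zt + zeta*x + zeta^2*z (with zt standing for the tilded coordinate) and that the transition rules are gamma~ = gamma/zeta^2, zeta~ = 1/zeta; all names are illustrative.

```python
# Sketch: a point (z, x, zt) gives a section of O(2) over the Riemann sphere.

def gamma(zeta, z, x, zt):
    """Section in the patch with coordinate zeta (assumed form of 11.3.1)."""
    return zt + zeta * x + zeta ** 2 * z

def gamma_tilde(zeta_t, z, x, zt):
    """The same section written in the patch with coordinate zeta~ = 1/zeta."""
    return zt * zeta_t ** 2 + x * zeta_t + z

def transition_defect(zeta, z, x, zt):
    """gamma~(1/zeta) - gamma(zeta)/zeta^2, which should vanish on the overlap."""
    return gamma_tilde(1.0 / zeta, z, x, zt) - gamma(zeta, z, x, zt) / zeta ** 2

if __name__ == "__main__":
    print(abs(transition_defect(0.7 - 1.3j, 0.2 + 0.1j, -0.5, 1.1j)))
```

Because both local expressions are polynomials, the section is holomorphic on each patch, including zeta = 0 and zeta~ = 0, which is what makes it global.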
Real solutions
We could characterize solutions to the unitary Bogomolny equations on $E^3$
(3-dimensional Euclidean space) by the behaviour of the reduced bundle $E_r$ under
the antiholomorphic involution $\zeta \mapsto -\bar\zeta^{-1}$ of the Riemann sphere, which induces
an antiholomorphic involution of $T\mathbb{CP}^1$. However, Hitchin (1982b) gave a more
direct construction, in which the reduced twistor space is constructed from the
geometry of $E^3$. Each point of MT is identified with an integral curve of X'
in PT, and hence with a one (complex) parameter family of parallel $\alpha$-planes in
CM. These make up a 3-plane $N \subset$ CM with tangent space spanned by
$$X, \qquad \partial_w - \zeta\partial_{\tilde z}, \qquad \partial_z - \zeta\partial_{\tilde w}$$
for constant $\zeta$ (with the usual modification when $\zeta = \infty$). In an invariant gauge,
the fibre of the twistor bundle over a point of MT is identified with the solution
space to
$$\partial_w s - \partial_{\tilde w} s = 0, \qquad D_w s - \zeta D_{\tilde z} s = 0, \qquad D_z s - \zeta D_{\tilde w} s = 0, \qquad (11.3.2)$$
on N.
We can think of $E^3$ as the intersection of the Euclidean slice with the complex
hyperplane $w = \tilde w$. Each N intersects $E^3$ in a real line, and every real line
arises in two distinct ways, with the two values of $\zeta$ related by the conjugation
$\zeta \mapsto -\bar\zeta^{-1}$. So we have a one-to-two correspondence between lines in $E^3$ and
points of MT (see Fig. 11.1). However, if we identify the sphere of unit tangent
vectors to the oriented lines through a point in $E^3$ with the Riemann $\zeta$-sphere,
then we can distinguish the two values of $\zeta$ by associating them with the two
orientations of a given line.
Each solution to the system (11.3.2) is uniquely determined by its restriction
to $N \cap E^3$. We conclude that (i) MT is the same as the space of oriented lines
in $E^3$, and (ii) if $E_r \to$ MT is the transform of a solution to the Bogomolny
equations on $E^3$, then the fibres of $E_r$ can be identified with the solution spaces
of ODEs on the lines in $E^3$. In the notation of §5.1, the ODE on a line with unit
tangent t is
$$D_t s = i\phi s.$$

Fig. 11.1. Each line in Euclidean 3-space is the intersection with a hyperplane N in
complex space-time made up of parallel a-planes (shown as lines at 45°).

This construction was used to find monopole solutions; we shall not go through
it in any more detail, because it is fully described both in the original paper
(Hitchin 1982b) and in the book of Ward and Wells (1990).

Real solutions in the Lorentzian case

The same reduced twistor space arises for the reduction of the ASDYM equation
on U by a non-null translation. We saw in §5.2 that the reduction of Yang's
equation in this case gives the Manakov-Zakharov model and Ward's chiral model
on the three-dimensional space-time $R^3$ with a Lorentzian metric of signature
(+ + -). Here the antiholomorphic involution is $(\zeta, \gamma) \mapsto (\bar\zeta, \bar\gamma)$, which leaves
invariant the real hypersurface $T\mathbb{RP}^1 \subset$ MT. The points of the hypersurface
correspond to null 2-planes in $R^3$ and the other points of MT at which $\zeta$ is not
real correspond to timelike lines in $R^3$, which are future pointing or past
pointing, according to the sign of $\mathrm{Im}(\zeta)$. When $\zeta$ is real, but $\gamma$ is not, the line is at
infinity (Ward 1989).
It was noted in Chapter 5 that these systems admit a conserved positive
energy density, which suggests a natural boundary condition, that the total
energy should be finite. Ward has constructed finite-energy lump solutions which
behave like solitons in the sense that they preserve their form and have trivial
interaction (Ward 1988b). There are also two-lump solutions that scatter at
right angles. The corresponding bundles over MT are those that extend to
the compactification $P(O \oplus O(2))$, which is a Hirzebruch surface (Ward 1990c).

Such bundles have been completely classified (Buchdahl 1987). For a given
lump number, however, the space of such bundles is finite dimensional, while
the general solution space is infinite dimensional because it is possible to choose
arbitrary initial data with finite energy.
More general solutions can be obtained by a construction analogous to that
in §10.5 from pairs consisting of a holomorphic vector bundle E on $P(O \oplus O(2))$
together with a map F that identifies E and its conjugate $\bar E$ on the real hypersurface. This
construction yields all the solutions obtainable from the construction of Manakov
and Zakharov (1981) and the solutions all have finite energy. However, it is not
known whether the finite energy condition characterizes such solutions.
Hyperbolic monopoles
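If we replace the translation Killing vector by
$$X = w\partial_w - \tilde w\partial_{\tilde w},$$

then the symmetries are (complex) rotations, and the action on twistor space is
generated by
$$X' = -\mu\partial_\mu - \zeta\partial_\zeta = \tilde\lambda\partial_{\tilde\lambda} + \tilde\zeta\partial_{\tilde\zeta}.$$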
This time the two coordinate systems on the reduced twistor space are
$$\alpha = \lambda, \quad \beta = \zeta\mu^{-1} \qquad\text{and}\qquad \tilde\alpha = \tilde\zeta\tilde\lambda^{-1}, \quad \tilde\beta = \tilde\mu,$$
with the transition relations
$$\tilde\alpha = \alpha^{-1}, \qquad \tilde\beta = \beta^{-1},$$

which are the same as those for the affine coordinates on the product of two
Riemann spheres. So the reduced twistor space is the quadric $R = \mathbb{CP}^1 \times \mathbb{CP}^1$. In
Atiyah's construction of hyperbolic monopoles (§5.2), R is identified in the same
way as in the previous case with the space of directed geodesics in 3-dimensional
hyperbolic space, which is a product of two spheres because a directed geodesic
is determined by its ordered pair of endpoints on the sphere at infinity.
The Ernst equation
In the reduction to the Ernst equation, $\mathfrak{h}$ is the Abelian Lie algebra generated
by
$$X = w\partial_w - \tilde w\partial_{\tilde w}, \qquad Y = \partial_z + \partial_{\tilde z},$$
so in this case, both the previous symmetries are present. We derived the
reduction in §6.5 and the reduced linear system in Example 11.2.5, and we shall use
the notation introduced there.
In this case, R is one dimensional, and there is just one invariant coordinate,
which we can take to be
$$\tau = \tfrac12(\lambda - \zeta^{-1}\mu) = \tfrac12(\tilde\zeta^{-1}\tilde\lambda - \tilde\mu).$$
The surfaces of constant $\tau$ make up a pencil of quadrics in PT, including the
plane pair $\zeta = 0, \infty$, on which $\tau = \infty$.
A point of R is a leaf of the foliation of P spanned by X' and Y'; it is
labelled by a fixed value of $\tau$, and determines a two-parameter family of
$\alpha$-planes in space-time, namely the orbit of an $\alpha$-plane under the flows along X and Y.
The corresponding fibre of $E_r$ is the space of solutions to (11.2.7): this defines
the forward transform from solutions of the stationary axisymmetric reduction
of Yang's equation (6.6.3) to holomorphic vector bundles over R. For the reverse
transform, we pull back $E_r$ to P, and represent it by a family of patching matrices
that depends only on $\tau$. We then recover J by substituting from (11.2.6), with
$\sigma = e^{i\theta}\zeta$, and by making a Birkhoff factorization.
The new feature in this case is the non-Hausdorff topology of R, which one
can see as follows. Suppose that J is defined on $U \subset$ CM, where U satisfies the
conditions in Theorem 10.2.1, and that P is the twistor space of U. Consider the
surface in P on which $\tau$ takes some constant value. This is a family of $\alpha$-planes
in space-time, two of which pass through a general point of U, corresponding to
the two roots of the quadratic equation
$$r\sigma^2 + 2(x - \tau)\sigma - r = 0$$
for $\sigma$. If, keeping $\tau$ fixed, we can continuously change one root into the other by
moving the point of U around a closed loop, then there is just one leaf of the
foliation for this value of $\tau$, and $\tau$ labels a single point of R; otherwise, $\tau$ labels
two points of R. By examining the discriminant, it follows that for each $\tau$, there
is one point of R if $\tau = x \pm ir$ for some point of U, and two otherwise; there
are always two points for $\tau = \infty$ (the leaves on which $\zeta = 0$ and $\zeta = \infty$). Thus
R is a compact Riemann surface covering $\mathbb{CP}^1$, but it is not Hausdorff: the two
points of R at which $\tau = x \pm ir$ for some point on the boundary of U cannot be
separated in the quotient topology.
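The discriminant argument can be made concrete with a few lines of Python. The sketch assumes that the quadratic is r*sigma^2 + 2(x - tau)*sigma - r = 0, which is what (11.2.6) gives after multiplying through by 2*sigma; the function names are illustrative. The two roots are the two alpha-planes of the family through the point (x, r); they coincide exactly when tau = x + ir or tau = x - ir, which are the branch values around which one root can be continued into the other.

```python
import cmath

def tau_of(x, r, sigma):
    """Invariant coordinate tau = x + (r/2)(sigma - 1/sigma), as assumed from (11.2.6)."""
    return x + 0.5 * r * (sigma - 1.0 / sigma)

def sigma_roots(x, r, tau):
    """Both roots of r*sigma**2 + 2*(x - tau)*sigma - r = 0 (r nonzero)."""
    root = cmath.sqrt((x - tau) ** 2 + r ** 2)  # half the square root of the discriminant
    return ((tau - x) + root) / r, ((tau - x) - root) / r

if __name__ == "__main__":
    x, r = 0.3, 1.7
    s1, s2 = sigma_roots(x, r, 0.9 + 0.2j)
    print(s1, s2, s1 * s2)                 # two distinct roots for generic tau
    print(sigma_roots(x, r, x + 1j * r))   # the roots merge at the branch value
```

For generic tau the product of the roots is -1, so the two alpha-planes are exchanged by sigma -> -1/sigma; at tau = x + ir the double root is sigma = i.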
The structure of R depends on U; it is described in Woodhouse and Mason
(1988) and in Fletcher and Woodhouse (1990). We give here just one application,
in which the non-Hausdorff reduced space is assumed to consist of two copies
of $\mathbb{CP}^1$ identified in an open disc. With this simplification, the solutions are
determined by two pieces of information: a set of integers that determines the
restrictions of $E_r$ to each $\mathbb{CP}^1$, and a holomorphic matrix that determines the
identification of the fibres of the restricted bundles in the open disc. If the
restricted bundles are trivial, then the corresponding solution is regular on some
interval of the axis of symmetry; otherwise, J is singular, but has standard
asymptotic behaviour as $r \to 0$ determined by the integers, although in the
general case, the ASDYM field itself is regular on the axis. For simplicity, we
shall take the gauge group to be GL(2, C).

Example 11.3.1 Suppose that U is a neighbourhood in space-time of a point
at which x, r are real, and let $V, \tilde V$ be our standard two-set cover of P ($V, \tilde V$ are
neighbourhoods of $\zeta = 0$ and $\zeta = \infty$ such that $V \cap \tilde V$ intersects each line in P
in an annular neighbourhood of the unit circle). On a line in P representing a
point of U,
$$2\tau = 2x + r(e^{i\theta}\zeta - e^{-i\theta}\zeta^{-1}).$$

By considering the behaviour of $\tau$ as a function of $\zeta$ at the point where r and
x are real, we see that $p: (\lambda, \mu, \zeta) \mapsto \tau$ maps both $V$ and $\tilde V$ onto the Riemann
sphere T on which $\tau$ is a holomorphic coordinate. A particular class of solutions
to the Ernst equation (the `regular solutions' in Fletcher and Woodhouse 1990)
is constructed by assuming that both $E'|_V$ and $E'|_{\tilde V}$ are pull-backs of mutually
dual vector bundles over T.
To find these solutions, we have to take a four-set open cover of P. We cover
T with two sets $T_0$ (a neighbourhood of $\tau = 0$) and $T_\infty$ (a neighbourhood of
$\tau = \infty$), and put
$$V_0 = p^{-1}(T_0) \cap V, \qquad V_\infty = p^{-1}(T_\infty) \cap V,$$
$$\tilde V_0 = p^{-1}(T_0) \cap \tilde V, \qquad \tilde V_\infty = p^{-1}(T_\infty) \cap \tilde V.$$
Note that the points $\zeta = 0$ are in $V_\infty$ and the points $\zeta = \infty$ are in $\tilde V_\infty$. By
Grothendieck's theorem, the bundles over T can be trivialized over $T_0$ and $T_\infty$
so that their patching matrices on $T_0 \cap T_\infty$ are of the form $\mathrm{diag}(\tau^k, \tau^{k'})$ and
$\mathrm{diag}(\tau^{-k}, \tau^{-k'})$. With this choice, the patching matrices of E' on the three
intersections $V_\infty \cap V_0$, $V_0 \cap \tilde V_0$, $\tilde V_\infty \cap \tilde V_0$ are, respectively,
$$\begin{pmatrix} \tau^k & 0 \\ 0 & \tau^{k'} \end{pmatrix}, \qquad F(\tau), \qquad \begin{pmatrix} \tau^{-k} & 0 \\ 0 & \tau^{-k'} \end{pmatrix},$$
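where F is a regular function of $\tau$ in some neighbourhood in the $\tau$ plane.
However, we can write $\tau = h_\infty h_0^{-1} = \tilde h_\infty \tilde h_0^{-1}$, where
$$h_0 = \zeta, \qquad h_\infty = \tfrac12(\zeta\lambda - \mu), \qquad \tilde h_0 = \tilde\zeta, \qquad \tilde h_\infty = \tfrac12(\tilde\lambda - \tilde\zeta\tilde\mu),$$
are regular functions on $V_0$, $V_\infty$, $\tilde V_0$ and $\tilde V_\infty$, respectively. Therefore E' is also
given by the single patching matrix
$$\begin{pmatrix} \zeta^k & 0 \\ 0 & \zeta^{k'} \end{pmatrix} F(\tau) \begin{pmatrix} \zeta^{-k} & 0 \\ 0 & \zeta^{-k'} \end{pmatrix}$$
between the two sets of the standard open cover $V, \tilde V$. However, in general E'
cannot be described by a single patching matrix that depends only on $\tau$, which
is the point at which the curious topology of R exerts an influence.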
In fact, the solution J of Yang's equation recovered by Birkhoff factorization
from this single patching matrix depends on $\theta$, although the solution to the
ASDYM equation itself must of course be invariant. The dependence on $\theta$ can
be removed, however, by replacing J by the equivalent J-matrix
$$\begin{pmatrix} w^k & 0 \\ 0 & w^{k'} \end{pmatrix} J \begin{pmatrix} \tilde w^k & 0 \\ 0 & \tilde w^{k'} \end{pmatrix},$$
which generates the same connection. One can combine these two steps in a
construction of Ward's (1983): choose a regular matrix-valued function $F(\tau)$,
substitute for $\tau$ in terms of x, r, $\sigma$ from (11.2.6), and make a Birkhoff factorization
$$\begin{pmatrix} (r\sigma)^k & 0 \\ 0 & (r\sigma)^{k'} \end{pmatrix} F(\tau) \begin{pmatrix} r^k\sigma^{-k} & 0 \\ 0 & r^{k'}\sigma^{-k'} \end{pmatrix} = \tilde f^{-1} f,$$

with respect to o for fixed x, r. Put h = f 1 a=o, Then
J(x,r) = h-1h

is a solution to (6.6.3). Ward's construction is useful in finding solutions to Ein-

stein's equations (Ward 1983) and in analysing their global geometry (Fletcher
and Woodhouse 1990).

11.4 THE KdV AND NLS EQUATIONS

In §6.3, we showed that the KdV and NLS equations are reductions of the
ASDYM equation with gauge group SL(2, C) by the Lie algebra $\mathfrak{h}$ generated
by the two translations
$$X = \partial_w - \partial_{\tilde w}, \qquad Y = \partial_{\tilde z}.$$
Both equations are derived from the linear system
$$L = D'_x - \zeta(\partial_y + Q), \qquad M = D'_t - \zeta(D'_x - P), \qquad (11.4.1)$$
under appropriate algebraic conditions on Q, where P and Q are the Higgs fields
of X and Y, $x = w + \tilde w$, $t = z$, $y = \tilde z$, and
$$D' = d + \Phi_w\,dx + \Phi_z\,dt$$
is a connection on a bundle over the complex x, t-plane. The Higgs fields P, Q
and the connection components $\Phi_w$, $\Phi_z$ are functions of x and t alone. Under
gauge transformations of the connection, P and Q behave as sections of the
adjoint bundle. It is possible to choose the gauge so that either $\Phi_w = P$ and
Q takes a standard form (the normal gauge) or so that $\Phi_w = \Phi_z = 0$ (the
Higgs gauge). For finite values of $\zeta$, the linear system has solutions which are
independent of y, and on which L and M are given by (6.3.1), but any nonzero
solution holomorphic at $\zeta = \infty$ must have nontrivial y-dependence.
On twistor space, the symmetries are generated by the flows along

in the two standard systems of inhomogeneous coordinates (§9.2), and the
singular set is the plane $\zeta = \infty$ ($\tilde\zeta = 0$). Every line in twistor space intersects this
plane, so there is no space-time domain U such that $\mathfrak{h}$ acts freely on P.
Consequently it is not possible to characterize the invariant bundles as the pull-backs
from a completely reduced twistor space, however much we restrict U. The best
that we can do is to impose the symmetry in two stages, by first taking the
quotient by X' to construct MT, and then by imposing the further symmetry
along Y' as a constraint on the bundle over MT; that is, we can remove one
'ignorable coordinate' from twistor space, but not both.
Invariant patching data
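Let $E' \to P$ be the Penrose-Ward transform of an ASD connection D on $E \to U$
with gauge group SL(2, C), and suppose that D is invariant under X and Y.
Introduce the usual two-set open cover $V, \tilde V$ of P. Since X' has no zeros in $\tilde V$
and neither X' nor Y' has zeros in $V$, we can choose the trivializations of E' so
that $\mathcal{L}_{X'} = X'$ in both sets and
$$\mathcal{L}_{Y'} = \begin{cases} Y' & \text{in } V \\ Y' + \theta & \text{in } \tilde V. \end{cases} \qquad (11.4.2)$$

The patching matrix is then constant along X', and can therefore be expressed
either as a function $F(\gamma, \zeta)$ or as a function $F(\tilde\gamma, \tilde\zeta)$, where
$$\gamma = \lambda + \mu\zeta = y + x\zeta + t\zeta^2, \qquad \tilde\gamma = y\tilde\zeta^2 + x\tilde\zeta + t. \qquad (11.4.3)$$
There remains the freedom to change the frame in $V$ by a transformation
depending on $\zeta$ alone, and to change the frame in $\tilde V$ by a transformation depending
on $\tilde\gamma$ and $\tilde\zeta$ alone. This has the effect
$$F \mapsto \tilde H^{-1} F H, \qquad \theta \mapsto \tilde H^{-1}\theta\tilde H + \tilde H^{-1} Y'(\tilde H), \qquad (11.4.4)$$
where $\tilde H = \tilde H(\tilde\gamma, \tilde\zeta)$ is regular in $\tilde V$ and $H = H(\zeta)$ is regular in $V$.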
Construction of the patching matrix
If we start with the commuting operators of the full linear system (11.4.1),
then we construct a patching matrix by finding fundamental solutions $f$ and
$\tilde f$ to the linear system $Lf = 0 = Mf$, which determine trivializations of E'.
The fundamental solutions are functions of x, y, t, and $\zeta$, with $f$ holomorphic
with respect to $\zeta$ for finite values of $\zeta$, and $\tilde f$ holomorphic with respect to $\zeta$ in a
neighbourhood of $\zeta = \infty$. On the overlap of their domains, $F = \tilde f^{-1} f$, where
the patching matrix F depends only on $\gamma$ and $\zeta$. We have $\mathcal{L}_{X'} = X'$ in both
trivializations, and if we choose $f$ so that $f_y = 0$, then $\mathcal{L}_{Y'}$ is given by (11.4.2),
with $\theta = \tilde f^{-1}\tilde f_y$. By putting $\tilde\zeta = 0$ in the linear equation $L\tilde f = 0$ we have
$$\theta = -\tilde f^{-1} Q \tilde f \quad \text{at } \tilde\zeta = 0.$$
In the NLS and KdV cases, Q is nonzero, and so we cannot eliminate the
y-dependence of $\tilde f$. Therefore $\theta \neq 0$ and F has nontrivial dependence on $\gamma$.
Lie derivatives and Higgs fields
We can say more about the relationship between the Higgs fields and $\theta$. Suppose
that we are given E' with patching matrix $F(\gamma, \zeta)$ and the Lie derivative $\mathcal{L}_{Y'}$ in
(11.4.2), where $\theta$ is holomorphic in $\tilde V$, and is related to F by eqn (11.2.1). We
find solutions to the linear system of the corresponding ASDYM field, and hence
D, by substituting for $\gamma$ from (11.4.3), and by making a Birkhoff factorization
$F = \tilde f^{-1} f$. Since $\tilde f$ satisfies the linear system, we have
$$\partial_y \tilde f = \zeta^{-2} D'_t \tilde f + \zeta^{-1} P \tilde f - Q \tilde f.$$
By differentiating $F = \tilde f^{-1} f$ with respect to y, we have
$$\tilde f \theta \tilde f^{-1} = \tilde f_y \tilde f^{-1} - f_y f^{-1}.$$
We can choose the factorization, and hence the gauge, so that $f$ is independent
of y at $\zeta = 0$. Then by expanding both sides in the Laurent series in $\zeta$, we
deduce that $f_y = 0$, and hence that the components of $\Phi$ are independent of y.
By eliminating $\partial_y \tilde f$ between the two equations, we have the following lemma.
Lemma 11.4.1 The Lie derivative along Y' is given by
$$\mathcal{L}_{Y'} = \tilde\zeta^2 \frac{\partial}{\partial\tilde\gamma} + \theta(\tilde\gamma, \tilde\zeta),$$
where for fixed t, x, y,
$$\theta = -\tilde f^{-1}(Q - \tilde\zeta P)\tilde f + O(\tilde\zeta^2),$$
as $\tilde\zeta \to 0$. Here $F = \tilde f^{-1} f$ is a Birkhoff factorization, and P and
Q are the Higgs fields of X and Y in the gauge determined by the factorization.
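The elimination behind Lemma 11.4.1 can be written out in three lines. The sketch below uses only the two displayed equations of this subsection, with the tildes restored on the assumption that $\tilde f$ is the factor regular near $\zeta = \infty$ and $\tilde\zeta = \zeta^{-1}$; the subscript on $D'_t$ is likewise an assumption drawn from the form of the operator M in (11.4.1).

```latex
\begin{align*}
\tilde f\,\theta\,\tilde f^{-1}
  &= \tilde f_y \tilde f^{-1} - f_y f^{-1}
   = \tilde f_y \tilde f^{-1}
  && \text{(since } f_y = 0\text{)},\\
\partial_y \tilde f
  &= \zeta^{-2} D'_t \tilde f + \zeta^{-1} P \tilde f - Q \tilde f
  && \text{(linear system)},\\
\theta
  &= \tilde f^{-1}\partial_y \tilde f
   = -\tilde f^{-1}\,(Q - \tilde\zeta P)\,\tilde f
     + \tilde\zeta^{2}\,\tilde f^{-1} D'_t \tilde f .
\end{align*}
```

Because $\tilde f$ is regular and invertible at $\tilde\zeta = 0$, the last term is $O(\tilde\zeta^2)$, which is the asymptotic form stated in the lemma.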

In Chapter 12, we shall characterize the bundles that correspond to the KdV
case by giving a more refined asymptotic formula. In the Segal-Wilson ansatz,
the patching matrix is chosen so that the Lie derivative is reduced to a standard
form.

11.5 THE INITIAL VALUE PROBLEM AND INVERSE SCATTERING

The initial value problem for the KdV and NLS equations has a celebrated
solution in the inverse-scattering (IS) method, which is analogous to the solution
of the heat equation $v_t = v_{xx}$ by Fourier analysis: the Fourier transform in the
x-variable of the heat equation is the ODE
$$\hat v_t = -k^2 \hat v,$$
which can be integrated directly, so the evolution of v is determined by taking the
Fourier transform, by integrating, and then by taking the inverse transform. In
the IS method, one similarly replaces the dependent variables by the scattering
data, which also have a straightforward time-dependence, and then solves the
original evolution problem by the inverse-scattering transform.
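The three-step pattern (transform, integrate the trivial ODE, invert) is easy to see in a small Python sketch for the heat equation on a periodic interval. This is only an illustration of the analogy, not part of the twistor construction: the periodic setting, the naive discrete Fourier transform and the function names are all assumptions made for the example.

```python
import cmath
import math

def dft(v):
    """Discrete Fourier coefficients of a periodic sample (normalised by 1/n)."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
            for k in range(n)]

def idft(c):
    """Inverse of dft: rebuild the samples from the coefficients."""
    n = len(c)
    return [sum(c[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n))
            for j in range(n)]

def evolve_heat(v0, t, length=2 * math.pi):
    """Solve v_t = v_xx on a periodic interval: each mode decays like exp(-k^2 t)."""
    n = len(v0)
    evolved = []
    for k, c in enumerate(dft(v0)):
        m = k if k <= n // 2 else k - n          # signed mode number
        wave = 2 * math.pi * m / length          # wave number k of this mode
        evolved.append(c * math.exp(-wave ** 2 * t))
    return [z.real for z in idft(evolved)]

if __name__ == "__main__":
    n = 16
    xs = [2 * math.pi * j / n for j in range(n)]
    v0 = [math.cos(2 * x) for x in xs]
    print(evolve_heat(v0, 0.1)[:3])
```

In the inverse-scattering method the scattering data play the role of the Fourier coefficients, and the Riemann-Hilbert problem replaces the inverse transform.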
In this section, we shall approach the initial value problem for the H+o reduc-
tions of the ASDYM equation by using the Penrose-Ward transform in the same
spirit. We shall seek to determine E' from the initial data, and then to find the
field variables at other values of t by Birkhoff factorization. Both the inverse-
scattering method and the Fourier solution to the heat equation fit naturally
within this general scheme.
The linear system
The idea is to determine fundamental solutions $f_\alpha(x, y, t, \zeta)$ to the linear system
(11.4.1) in various domains of the $\zeta$-sphere (labelled by $\alpha$) from their initial
values $f_\alpha(x, 0, 0, \zeta)$ at y = t = 0. Knowing the $f_\alpha$s, we can write down the
ASDYM potential and hence the solution to the KdV or NLS equation.
Now the transition matrices for E' are given by
$$F_{\alpha\beta}(\gamma, \zeta) = f_\beta^{-1} f_\alpha,$$
where $\gamma = y + \zeta x + \zeta^2 t$. Since $F_{\alpha\beta}$ depends on x, y, t only through $\gamma$, we have
$$f_\beta^{-1}(x, y, t, \zeta) f_\alpha(x, y, t, \zeta) = f_\beta^{-1}(\zeta^{-1}\gamma, 0, 0, \zeta) f_\alpha(\zeta^{-1}\gamma, 0, 0, \zeta).$$
If we are given the initial values of the $f_\alpha$s that appear on the right-hand side, then
we can find the unknown values $f_\alpha(x, y, t, \zeta)$ on the left-hand side by interpreting
this as a Riemann-Hilbert problem. So the initial value problem can in principle
be solved in two steps: the first is to find from the initial data the initial values
of enough solutions to the linear system to cover all values of $\zeta$ on the Riemann
sphere (including $\zeta = \infty$); the second is to solve a Riemann-Hilbert problem.
Unfortunately, there is a difficulty in the first step since, knowing only the
initial values of the potential, it is not always possible to identify functions
$f(x, 0, 0, \zeta)$ that (i) extend to solutions to the linear system and (ii) are regular
at infinity. We can see why by writing the full linear system for the reductions
in the normal gauge
$$Lf = \partial_x f + \Phi_w f - \zeta\partial_y f - \zeta Q f = 0,$$
$$Mf = \partial_t f + \Phi_z f - \zeta\partial_x f = 0,$$
where $\Phi_w$ and $\Phi_z$ are functions of x, t, and the Higgs field $Q = \Phi_{\tilde z}$ of $Y = \partial_{\tilde z}$
is some standard constant matrix, which is characteristic of the reduction. For
example,
Q= (t) i) ,
Q=
(0 0 J (11.5.1)

in the NLS and KdV cases, respectively. For finite values of $\zeta$, we can find the
initial values of y-independent fundamental solutions $f$ by solving $Lf = 0$ at
t = 0; the second equation $Mf = 0$ then determines $f$ at other values of t.
However, the equations for y-independent solutions are singular at $\zeta = \infty$ since
$Lf = 0$ degenerates to the algebraic condition $Qf = 0$ at $\zeta = \infty$ and has no
regular solutions independent of y.
The best that we can do at $\zeta = \infty$ is to look for solutions with some standard
y-dependence. If we require that $f^{-1}\partial_y f = Q$, then the equations for $f$ are
still singular at $\zeta = \infty$, but it is possible to find solutions that are regular in
various sectors around $\zeta = \infty$. We can then proceed by modifying the first step:
instead of seeking solutions to the linear system that are holomorphic in $\zeta$ in a
neighbourhood of $\zeta = \infty$, we require only a standard asymptotic behaviour as
$\zeta \to \infty$. Finding the initial values of such solutions in various sectors of the
$\zeta$-plane is in fact sufficient to set up a well-posed Riemann-Hilbert problem, which
turns out to be equivalent to that of the IS method.
In implementing this strategy, we shall see the parabolic nature of the nonlinear
equations, in that the initial data at large values of x influences the solution
everywhere for arbitrarily small nonzero values of t. The reason is that in order
to integrate $Mf = 0$ to find $f(x, t, \zeta)$ from $f(x, 0, \zeta)$ for large $\zeta$, we have to know
the values of $f$ at t = 0 for large x. Thus there is a direct connection between
the behaviour that we require of $f$ at large $\zeta$, and its behaviour at large x (see
Fig. 11.2). We shall see that the solutions to the linear system that approach a
standard form as $x \to \pm\infty$ are also the solutions that are regular in the various
sectors at $\zeta = \infty$.

Fig. 11.2. For large $|\zeta|$, the lines of constant $x + \zeta t$ through (x, t) intersect t = 0 at
large $|x|$.

In the twistor picture, the solutions to the linear system with the standard
asymptotic behaviour at $\zeta = \infty$ determine frames for E' in which $\mathcal{L}_{Y'} = Y' + \kappa$,
where $\kappa$ is some standard linear expression in $\zeta$. For the NLS and KdV equations,
respectively, the standard forms are

KKdV (0 O) , KNLS = - 1 0 Oi

Given E' and $\mathcal{L}_{Y'}$, such frames are found by solving a linear ordinary differential
equation with a singularity at $\zeta = \infty$. The sectors in the $\zeta$-plane are separated
by the Stokes' lines of the ODE, and the smooth parts of the scattering data are
encoded in the Stokes' matrices that relate the solutions in the different sectors;
these can in turn be interpreted as transition matrices for E. We shall illustrate
how this works in the two basic examples.
The NLS equation
The complex NLS equation
$$i\psi_t = -\tfrac12\psi_{xx} + \psi^2\tilde\psi, \qquad i\tilde\psi_t = \tfrac12\tilde\psi_{xx} - \tilde\psi^2\psi$$
is equivalent to the ASDYM equation [L, M] = 0, where

L=ax+(7p 0)- (ab-((0 0)

M at+ 2i () C8x
When $\psi = \tilde\psi = 0$ we can define a solution to the linear system by
$$f_0 = \exp(\zeta(x + \zeta t)\sigma_3),$$
where

We note that $f_0$ is independent of y and regular for finite values of $\zeta$, but, as
expected, it is singular at $\zeta = \infty$.
When $\psi$ and $\tilde\psi$ are nonzero, we look for y-independent fundamental solutions
to the linear system that have the same asymptotic behaviour as $f_0$ for large
positive or negative real values of x. The columns of $f$ satisfy the linear equation
$Ls = 0 = Ms$, so that at each t,
ax Qx+Va=-i[;f, (11.5.2)
where $\alpha$ and $\beta$ are the first and second entries in s. Provided that $\psi$ and $\tilde\psi$ fall
off sufficiently fast as $x \to \pm\infty$, there are solutions $s_{i\pm}$ to (11.5.2) (i = 1, 2)
which coincide with those constructed in §9.8 at t = 0 for real x, $\zeta$. That is, $s_{1+}$
behaves like the first column of $f_0$ for large positive x, $s_{2+}$ behaves like the
second column of $f_0$ for large positive x, and so on.
We now form two fundamental solutions to the linear system by putting
$$r_u = (s_{1+}, s_{2-}), \qquad r_\ell = (s_{1-}, s_{2+}).$$
It follows from the discussion in §9.8 that $r_u$ extends holomorphically as a
function of $\zeta$ to the upper half $\zeta$-plane and $r_\ell$ similarly extends to the lower half
$\zeta$-plane. On the real axis, we have

(11.5.3)
ru (0 16) = rt (1b a) '

where $a, b, \tilde a, \tilde b$ are the scattering coefficients at t = 0, $x \in R$.

The patching data

The two fundamental solutions $r_u$ and $r_\ell$ solve the linear system for all finite
values of $\zeta$, but are singular at $\zeta = \infty$, and have vanishing determinants at the
zeros of $a$ or $\tilde a$, that is, at the eigenvalues of (11.5.2). Any other fundamental
solution $f$ to the linear system is of the form $f = r_u H_u(\gamma, \zeta)$ for $\zeta$ in the upper
half-plane, or $f = r_\ell H_\ell(\gamma, \zeta)$ for $\zeta$ in the lower half-plane, where $H_u$ and $H_\ell$ are
holomorphic, but possibly singular at the zeros of $a$ or $\tilde a$. We define two such
solutions by
\[ f_u = r_u\,a^{-1}\exp(-i\gamma\sigma_3), \qquad f_\ell = r_\ell\exp(-i\gamma\sigma_3). \]
These are fundamental solutions to the linear system, with f_u regular in ζ on the
upper half-plane, except at the zeros of a, and f_ℓ regular in ζ on the lower
half-plane, except at the zeros of ã, where det f_ℓ = 0. Both satisfy the asymptotic
condition f → 1 as ζ → ∞ in their respective domains, and both extend to the
real axis in the ζ-plane, on which they are related by f_u = f_ℓ F_{uℓ}, where
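\[ F_{u\ell}(\gamma,\zeta) = \exp(i\gamma\sigma_3)\,r_\ell^{-1} r_u\,a^{-1}\exp(-i\gamma\sigma_3) = \frac{1}{1+b\tilde b}\begin{pmatrix} 1 & \tilde b\,e^{2i\gamma} \\ -b\,e^{-2i\gamma} & 1 \end{pmatrix}. \tag{11.5.4} \]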
224 Reductions of the Penrose- Ward transform,
We note that f_u and f_ℓ determine holomorphic frames for E' on domains that
cover almost all of twistor space, and that the transition matrix between them
can be written down from the scattering coefficients b, b̃, which in turn are found
by solving eqns (11.5.2) on the real x-axis at t = 0. In these frames, the Lie
derivative operator ℒ_{Y'} takes its normal form
\[ \mathcal{L}_{Y'} = Y' + \kappa_{\mathrm{NLS}}. \]

Nonsolitonic solutions
By a nonsolitonic solution, we mean one for which a has no zeros in the closure
of the upper half-plane and ã has no zeros in the closure of the lower half-plane
(which is always the case when we have the reality condition ψ̃ = ψ̄ for real x
and t). For these solutions, the domains of the frames determined by f_u and f_ℓ
are the closed subsets of twistor space defined by
\[ V_u = \{\operatorname{Im}(\zeta) \ge 0\}, \qquad V_\ell = \{\operatorname{Im}(\zeta) \le 0\} \]
(including the points at ζ = ∞). These cover twistor space, and so the transition
relation (11.5.4) determines E'. By a simple extension of the previous argument
to take account of the fact that the cover is closed, we can deduce that the
solution is given by solving the Riemann-Hilbert problem
\[ \frac{1}{1+b\tilde b}\begin{pmatrix} 1 & \tilde b\,e^{2i\gamma} \\ -b\,e^{-2i\gamma} & 1 \end{pmatrix} = f_\ell^{-1} f_u, \]

where γ = ζx + ζ²t, f_u(x, t, ζ) is regular in the upper half-plane, f_ℓ(x, t, ζ) is
regular in the lower half-plane, and both satisfy f → 1 as ζ → ∞ in the respective
domains. These conditions determine f_u and f_ℓ uniquely, and hence ψ and ψ̃, by
using the fact that fu exp(-yo3) and ft exp(ya3) solve the linear system. Thus we
have reduced the initial-value problem to a Riemann-Hilbert problem. Moreover,
any functions b and b̃ on the real axis will determine a solution; the criterion of
Gohberg and Krein (1958) can be used here to deduce that the Riemann-Hilbert
problem has a solution (Proposition 9.3.6).

Solitonic solutions
Suppose that b = b̃ = 0 for large |ζ|. If we put f = f_u on the upper half-plane
and f = f_ℓ on the lower half-plane, then f is regular as a function of ζ in a
neighbourhood of infinity and so determines a frame for E' in a neighbourhood
V of ζ = ∞. To trivialize E' over the rest of twistor space, we can take f̃ to be
a y-independent fundamental solution to the linear system for finite values of ζ.
We then have a patching matrix of the form
\[ F = \exp(i\gamma\sigma_3)\,g(\zeta)^{-1}, \]
where g takes values in GL(2, ℂ). We can recover the solution directly by the
standard application of the Penrose-Ward transform. In particular, if we fix f̃
by the condition f̃ = 1 at x = t = 0 for all ζ, then

\[ g = \begin{cases} a^{-1}\,r_u|_{x=t=0}, & \operatorname{Im}(\zeta) \ge 0, \\ r_\ell|_{x=t=0}, & \operatorname{Im}(\zeta) \le 0, \end{cases} \]
which gives a solution to the initial value problem.
To obtain the solitonic solutions, we specialize to the case that ψ̃ = −ψ̄ for
real values of the coordinates, and make the further assumption that a has a
finite number of zeros, at the points ζ = ζ_i (i = 1, ..., k) in the upper half-plane.
Because aã = 1 on the real axis and because both tend to unity at infinity, we
then have
\[ a = \tilde a^{-1} = \prod_{i=1}^{k} \frac{\zeta - \zeta_i}{\zeta - \bar\zeta_i}. \]
In this case, g is rational, and is regular except at the points ζ_i and ζ̄_i.
Therefore the solution can be found explicitly, by using the factorization method in
Example 9.3.3, in terms of the scattering data, which are (i) the points ζ_i and
(ii) the nonzero constants a_i such that

\[ r_u(x, \zeta_i)\begin{pmatrix} 1 \\ a_i \end{pmatrix} = 0 \]
(a_i determines the eigenfunction of (11.5.2) corresponding to the eigenvalue ζ_i).
We can also combine the solitonic and nonsolitonic constructions. We then
have a patching matrix constructed from b and b̃ between two fundamental
solutions f_u and f_ℓ near infinity, and a further patching matrix constructed from
a rational function of ζ between these and the fundamental solution f̃ in the rest
of the ζ-plane. We recover the solution to the initial value problem by solving a
'Riemann-Hilbert problem with zeros' (see, for example, Ablowitz and Clarkson
1991 or Faddeev and Takhtajan 1987).

The KdV equation

The KdV equation
\[ 4u_t - u_{xxx} - 6uu_x = 0 \]
is equivalent to the ASDYM equation [L, M] = 0, where

L 8x+(4x+92 -9)-(av_((0
0)
1 M=at+29xx+99x -qx _Ox
C - 29xx - qqx
with u = 2qx and c = 4gxxx + 29x +929x +ggxx (§6.3). For finite (, we can drop
the dependence on y, but if Lf = Mf = 0 and if f is regular at ζ = ∞, then it must
depend nontrivially on y. We shall consider solutions for which u is real for real
values of x and t.
If we think of L and M as acting on column vectors, then the linear system
is
\[ L\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0 = M\begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \]
When α and β are independent of y, the first equation is equivalent to
\[ \alpha_{xx} + u\alpha = \zeta\alpha, \tag{11.5.5} \]
with β defined by β = α_x + qα. The second equation determines the t dependence
of α and β. Thus any solution to the time-independent Schrödinger equation
(11.5.5) at t = 0 determines a solution to the linear system; every y-independent
solution to the linear system arises uniquely in this way.
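When q = 0, eqn (11.5.5) at t = 0 has solutions
\[ \alpha_1 = e^{ikx}, \qquad \alpha_2 = e^{-ikx}, \]
where k² = −ζ. These evolve to give the fundamental solution to the linear
system given by
\[ r_0 = \begin{pmatrix} e^{ik(x+\zeta t)} & e^{-ik(x+\zeta t)} \\ ik\,e^{ik(x+\zeta t)} & -ik\,e^{-ik(x+\zeta t)} \end{pmatrix}. \]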
Because of the sign ambiguity in k, r_0 is not single-valued, and it is regular
neither at ζ = 0 (where det r_0 = 0), nor at ζ = ∞, where there is an essential
singularity. Nonetheless, any other fundamental solution must be of the form
r_0 H(γ, ζ), where H is holomorphic in γ and ζ. By appropriate choices of H, we
can find solutions that are regular in different parts of the Riemann ζ-sphere.
For example,
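\[ \tilde H_0 = \begin{pmatrix} 1 & 1 \\ ik & -ik \end{pmatrix}^{-1} \quad\text{and}\quad H_0 = \begin{pmatrix} e^{-i\gamma/k} & e^{i\gamma/k} \\ ik\,e^{-i\gamma/k} & -ik\,e^{i\gamma/k} \end{pmatrix}^{-1} \tag{11.5.6} \]
give solutions that are single-valued and regular at ζ = 0 and ∞, respectively.
They determine the patching matrix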
F = Ho 1Ho = exp('7A/() ,
where
A = 00 0)
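The regularity claims for the two frames can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it takes k² = −ζ, γ = y + xζ + tζ², and reads (11.5.6) as H̃_0 = (1 1; ik −ik)^{-1} and H_0 = (e^{−iγ/k} e^{iγ/k}; ik e^{−iγ/k} −ik e^{iγ/k})^{-1}. All function names are hypothetical helpers, not notation from the text. Under these assumptions r_0 H̃_0 should be unchanged by k → −k (so it is single-valued in ζ), and r_0 H_0 should depend on the coordinates only through y/k (so it stays bounded as k → ∞).

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def r0(k, x, t):
    """Trivial-solution frame r_0, with zeta = -k^2 (assumed convention)."""
    zeta = -k * k
    e = cmath.exp(1j * k * (x + zeta * t))
    return [[e, 1 / e], [1j * k * e, -1j * k / e]]


def h0_tilde(k):
    """Assumed reading of the first matrix in (11.5.6)."""
    return inv2([[1, 1], [1j * k, -1j * k]])


def h0(k, x, t, y):
    """Assumed reading of the second matrix in (11.5.6)."""
    zeta = -k * k
    gamma = y + x * zeta + t * zeta ** 2
    e = cmath.exp(1j * gamma / k)
    return inv2([[1 / e, e], [1j * k / e, -1j * k * e]])


def frame_near_zero(k, x, t):
    return matmul(r0(k, x, t), h0_tilde(k))


def frame_near_infinity(k, x, t, y):
    return matmul(r0(k, x, t), h0(k, x, t, y))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    k, x, t, y = 0.7 + 0.3j, 0.4, 0.2, 0.5
    # Evenness in k: the frame is a function of zeta = -k^2 only.
    print(max_abs_diff(frame_near_zero(k, x, t), frame_near_zero(-k, x, t)))
    # Dependence on y/k only: compare with the closed form in theta = y/k.
    th = y / k
    closed = [[cmath.cos(th), cmath.sin(th) / k],
              [-k * cmath.sin(th), cmath.cos(th)]]
    print(max_abs_diff(frame_near_infinity(k, x, t, y), closed))
```

Both printed differences should be at rounding-error level. The check says nothing about the choice of signs in the patching matrix, which is left as in the text.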
The solution f
In the general case, provided that u falls off sufficiently fast as x → ±∞ for real
x at t = 0, we can pick out solutions to (11.5.5) with the same asymptotic form
as α_1 and α_2 for large real values of x. For each real k ≠ 0, we denote by s_{1+},
s_{2+} the y-independent solutions to the linear system such that
\[ \alpha_{1+} \sim e^{ikx}, \qquad \alpha_{2+} \sim e^{-ikx} \]
as x → ∞ on the real axis at t = 0, where α_{1+} and α_{2+} are the first entries in
s_{1+} and s_{2+}. Similarly, we denote by s_{1−} and s_{2−} the two solutions such that
\[ \alpha_{1-} \sim e^{ikx}, \qquad \alpha_{2-} \sim e^{-ikx} \]
as x → −∞ on the real axis at t = 0. We then have that

T R+ = (sl-,s2-) 1 0
(s1+ ,s2+) 0 1
R- T

where T, R_± are the scattering coefficients of u(x, 0), defined in §9.8. It follows
from the scattering theory outlined in §9.8 that s_{1+} and s_{2−} extend
holomorphically to Im(k) > 0.
We denote by k = √(−ζ) the square root defined by Im(k) > 0 with a cut
along the negative real axis in the ζ-plane, so that k has a positive limit as ζ
approaches the cut from below and a negative limit as it approaches the cut from
above. We put
\[ r(x, t, \zeta) = \begin{pmatrix} \alpha_{1+} & \alpha_{2-} \\ \beta_{1+} & \beta_{2-} \end{pmatrix}. \]
Then r is a regular fundamental solution to the linear system except on the real
axis in the ζ-plane. Its determinant vanishes at the eigenvalues on the positive
real axis in the ζ-plane, it is discontinuous across the cut on the negative real
axis and it has exponential behaviour as ζ → ∞. On the cut,
\[ r(\zeta_+) = r(\zeta_-)\,S(|k|), \]
where r(ζ_+) and r(ζ_−) denote the limits of r as ζ approaches the cut from above
and from below, and S is the scattering matrix (9.8.2).
Any other solution to the linear system must be of the form f = rH, where
H is a holomorphic function of γ = y + xζ + tζ² and ζ. In particular, we can
define a y-dependent solution f by
\[ f = r\begin{pmatrix} 1 & 0 \\ 0 & T \end{pmatrix} H_0. \]
At t = y = 0, and real values of x, the entries in the first row of f are
\[ \frac{m_+ + T m_-}{2}, \qquad \frac{m_+ - T m_-}{2ik}, \]
where m_+ = α_{1+}e^{iγ/k} and m_− = α_{2−}e^{−iγ/k} are the Jost functions of the
potential u(x, 0) (see §9.8). From the general results of scattering theory, we have
that f is holomorphic throughout the ζ-plane, apart from the singularities at
the poles of T on the positive real axis and a discontinuity across the cut on
the negative real axis. Moreover, f is bounded as k → ∞ (from the asymptotic
properties of T and the Jost functions).
For real k, the scattering matrix S(k) satisfies the identity S(k) = S(-k)-1.
It follows that
-R(-k) _ 1 -R(k) 0 1
T(-k) 0 T(k) 1 0 '

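where R = R_+. We also have
\[ H_0(-k) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} H_0(k). \]
Hence if we define η by
\[ \eta = r\begin{pmatrix} 1 & -R \\ 0 & T \end{pmatrix} H_0 = f\,H_0^{-1}\begin{pmatrix} 1 & -R \\ 0 & 1 \end{pmatrix} H_0, \tag{11.5.7} \]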
then we have that η is an even function of k, for real k. This determines
and hence the discontinuity in f across the real axis, in terms
of the reflection coefficient R.

Generalized patching data

In the twistor picture, the columns of f form a frame for the Penrose-Ward
transform E' in which the Lie derivative along Y' takes the standard
form
\[ \mathcal{L}_{Y'} = Y' - \zeta^{-1}A. \]
It is holomorphic on the open dense subset of twistor space on which ζ is not
real. It has poles at the poles of T for positive real values of ζ, where its residues
are determined by the corresponding eigenfunctions of (11.5.5) at t = 0, and its
discontinuity at negative real values of ζ is determined by the reflection coefficient
R. It is possible to reconstruct E' from the behaviour of the frame at its poles and
on the cut, and therefore from scattering data of (11.5.5) at t = 0: all we have to
do to find a system of local trivializations is to transform the frame by
matrix-valued functions of γ and ζ to correct for the singular behaviour in different
neighbourhoods in twistor space. We can therefore think of the scattering data
as 'generalized patching data'.

Riemann-Hilbert problem
The condition that η should be even, together with the asymptotic properties
of the Jost functions and T, is sufficient to determine m_+ and m_− for all t, and
hence to determine the evolution of u from the scattering data at t = 0. If we
define a and b by
\[ a = m_+(1 - \hat R) + T m_-, \qquad b = m_+(1 + \hat R) - T m_-, \]
where R̂ = Re^{−2iγ/k}, then because the entries in the first row of η are even, we
have that a and b are respectively even and odd functions of k for real k, and
for all x, t. Hence for real k, we have
\[ T(k)\,m_-(k) = \hat R(k)\,m_+(k) + m_+(-k), \]
where we have suppressed the dependence on x and t. Given that m_+ and m_−
are holomorphic in the upper half k-plane, and asymptotic to unity as k → ∞,
this determines them uniquely, as the solution to a Riemann-Hilbert
boundary-value problem (Ablowitz and Clarkson 1991, p. 73).4

Reflectionless potentials
There is one class of solutions for which we can write down the patching matrix
directly from the initial data. This is the class for which R_+ = R_− = 0, that is,
for which the initial value of u is a reflectionless potential. In this case,

r((+) = r((-) (T 0 )

on the cut, and f is single-valued on the cut. Since f is bounded as k → ∞,
it must be holomorphic in ζ at ζ = ∞. We can therefore construct a patching
matrix by taking f̃ to be a y-independent fundamental solution to the linear system
such that f̃ = 1 at x = t = 0 for all finite ζ. This gives

F = f-1(0,0,'Y,C) = exp('yA/()Ho 1 (0 T ') r-I(0,0,5),

which is the patching matrix exp(γA/ζ) of the trivial solution 'dressed' by
multiplication on the right by a single-valued function g(ζ) (which is irregular at
the origin and at the eigenvalues). We also have a patching matrix of this form
whenever R_+ and R_− vanish for large |k|. We shall look at such matrices in
more detail in our treatment of the Segal-Wilson ansatz (§12.4), where we shall
see that the soliton solutions are given by taking g to be rational.

Scattering data and Stokes' matrices

The fundamental solutions to the KdV and NLS linear systems that we defined
for large ζ determine frames for E' in which ℒ_{Y'} takes a standard form. Another
way to interpret (part of) the scattering data is to consider the problem of
constructing such a frame for a given holomorphic vector bundle E' -. MT with
structure group SL(2, C) with a symmetry £y, along 87.
The symmetry along Y = 8z is generated in a neighbourhood of ( = oo by
the Lie derivative
Gy, _ -2 +0

where B is holomorphic as a function of y = (2y+(x+t and = t;-1. The problem

is to choose the frame so that Ly' = Y' + #c, where r. = Ko + C't is a standard
linear expression in (. The constant matrices Ko and ki are characteristic of the
reduction. For the two principal cases, we want
0
Gy- = Y' - 1 (KdV case)
0

Gy, =Y'- Ii
\\\0
0)
i
(NLS case).

To solve this problem, we must find a holomorphic SL(2, C )-valued function

H(', r;) such that

This is an ODE in the independent variable ry, which is well behaved everywhere
except at ( = 0, where it is singular.
There is an obvious algebraic obstruction to the existence of a solution regular
at ( = 0. Clearly it is necessary that there should exist H = ho(5') + (hl (5') such
that
H-'BH = Ico + (, .1+0(( 2) (11.5.8)
as C -. 0. If we expand 9 as a Taylor series
00

in , then (11.5.8) is equivalent to

Bo = horc0ho', 91 = horc1ho1 + [h1ho',90).
These in turn are equivalent to the two conditions (i) Bo is conjugate to rco for
all 5' and (ii) tr(9091) = tr(rcorcl), which by Lemma 11.4.1, we can translate into
conditions on the Higgs fields of the corresponding ASD connection. In the KdV
case, Q is everywhere conjugate to
CO

0) 1 0

and tr(PQ) = -1. In the NLS case, Q is everywhere conjugate to 0`3 and
tr(PQ) = 0. When the appropriate version holds, we assume without loss of
generality that 9 = rc0 + (rcl + (2r and that H = 1 at (= 0. We then look for
H such that
8H +c2TH=-k
2ay
From this, we see that there is a further obstruction in the KdV case: since the
left-hand side is 0((2), we must have
)++O(2),
H = (q
i
where Q is lower triangular, with equal entries on the diagonal, and q,13 are
independent of r;. But then there cannot be a solution unless the upper right-
hand entry in r vanishes at = 0, that is, we must have tr(rcr) = 0 at S = 0.
This is the origin of the second constraint in (6.3.19).
When this further obstruction vanishes, we can develop the solution as an
asymptotic power series in ζ̃, and locally we can find solutions that have the
required asymptotic form as ζ̃ → 0 for fixed x, t, y. In general, however, the
asymptotic series will not converge at ζ̃ = 0, and will only be a valid asymptotic
expansion in a certain sector of the ζ̃-plane at ζ̃ = 0. This is an example of the
Stokes' phenomenon (see, for example, Berry and Mount 1972): it arises because
there are solutions to (11.5.9) that behave like a-OH as C - 0, where H{ is an
eigenvector of
H [C-2ic, H)
with eigenvalue . In the KdV case, the nonzero eigenvalues are ±2iζ̃^{−3/2}; in
the NLS case, they are ±2iζ̃^{−2}. The corresponding solutions are exponentially
decreasing and invisible in the asymptotic expansion for some values of arg(ζ̃),
but exponentially increasing for others. The different regions defined by their
asymptotic behaviour are the Stokes' sectors: they correspond to the regions in
which the solutions f_u and f_ℓ were defined in the NLS case, and f was defined
Isomonodromy and the Painleve equations 231

in the KdV case, and we can interpret the Stokes' matrices, which connect the
solutions with asymptotic form in the different sectors, as the continuous part
of the scattering data.
In our treatment of the Drinfeld-Sokolov construction (§12.2), the reduction
of the Lie derivative to a standard form is the starting point for the construction
of the nKdV hierarchy.
11.6 ISOMONODROMY AND THE PAINLEVE EQUATIONS
We showed in Chapter 7 that the Painlevé equations are the reductions of the
ASDYM equation by five Abelian subgroups of the conformal group. These
'Painlevé groups' are three dimensional, and they all have open orbits in ℂℙ³.
In the generic case, P_VI, the group is the diagonal subgroup of PGL(4, ℂ) and
it acts transitively on the complement of the four planes in ℂℙ³ that make up
the singular set; in the other cases, the planes come into coincidence in various
combinations.
We can read off the generators from Table 2.1, where each group is represented
as a four-dimensional subgroup of GL(4, ℂ). For example, in the case of
P_V, the flows along the four parameters a, b, c, d are generated by
\[ Z^1\partial_0, \qquad Z^0\partial_0 + Z^1\partial_1, \qquad Z^2\partial_2, \qquad Z^3\partial_3 \]
in the homogeneous coordinates Z^α. The first three of these project onto the
vectors on ℂℙ³ in Table 11.1, while the fourth projects onto a linear combination
of the first three. The three are independent except on the singular set,
which is characterized by the vanishing of the determinant
\[ \begin{vmatrix} Z^1 & 0 & 0 & 0 \\ Z^0 & Z^1 & 0 & 0 \\ 0 & 0 & Z^2 & 0 \\ 0 & 0 & 0 & Z^3 \end{vmatrix} = (Z^1)^2 Z^2 Z^3, \]
and which is therefore the union of the planes Z¹ = 0 (double root), Z² = 0,
and Z³ = 0.
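The determinant that characterizes the singular set in the P_V case can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and the rows of the matrix are taken to be the components of the four generators Z¹∂_0, Z⁰∂_0 + Z¹∂_1, Z²∂_2, Z³∂_3 in the coordinates Z⁰, ..., Z³.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for ints)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def pv_generator_matrix(z0, z1, z2, z3):
    """Rows: components of the four P_V generators at the point (z0, z1, z2, z3)."""
    return [
        [z1, 0, 0, 0],   # Z^1 d_0
        [z0, z1, 0, 0],  # Z^0 d_0 + Z^1 d_1
        [0, 0, z2, 0],   # Z^2 d_2
        [0, 0, 0, z3],   # Z^3 d_3
    ]


if __name__ == "__main__":
    point = (5, 2, 3, 7)
    z0, z1, z2, z3 = point
    print(det(pv_generator_matrix(*point)), z1 ** 2 * z2 * z3)
```

The two printed numbers agree for any integer point, and the determinant vanishes exactly when Z¹, Z² or Z³ vanishes, independently of Z⁰.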
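In Table 11.1, we list the generators of the Painlevé groups in the standard
inhomogeneous coordinate systems λ, μ, ζ and λ̃, μ̃, ζ̃ on twistor space: the three
vector fields in each case generate the flows of the parameters a, b, c. We also
list the determinants Δ = det G and Δ̃ = det G̃ = −ζ̃⁴Δ of the matrices
\[ G = \begin{pmatrix} X'_\lambda & X'_\mu & X'_\zeta \\ Y'_\lambda & Y'_\mu & Y'_\zeta \\ Z'_\lambda & Z'_\mu & Z'_\zeta \end{pmatrix}, \qquad \tilde G = \begin{pmatrix} X'_{\tilde\lambda} & X'_{\tilde\mu} & X'_{\tilde\zeta} \\ Y'_{\tilde\lambda} & Y'_{\tilde\mu} & Y'_{\tilde\zeta} \\ Z'_{\tilde\lambda} & Z'_{\tilde\mu} & Z'_{\tilde\zeta} \end{pmatrix}, \]
where the subscripts denote the components of X', Y', Z'; Δ vanishes on the
part of Σ on which ζ is finite, and Δ̃ vanishes on the part on which ζ̃ is finite.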
Here we are at the opposite extreme to the reduced twistor space constructions:
in the Penrose-Ward transform of the Painlevé equations, all the information
is contained in the action of the symmetries at the singular set. In this
section, we shall explain how the explicit representation of the Painlevé-invariant
Table 11.1 The action of the Painleve groups on twistor space
Generators on C P3 0 0

P1.11 x' _ (aa = as -(4 1

Z' = µaa + a, - ((aaa + µa + (a()

= µaa + (aµ + a

x' = -((aaa + pav + (a< = a( (202 -µ2

Pill
Y' = pa, = µaa

P,v x' = as = Caa -( 3

Y' =µa.\ + a,, =µaa + Caµ
Z' = -(a( = aaa + µaµ + (-a<
-(µ2 µ2(
Pv x' = µaa = IBS
Y' =aaa + µa = aaa + µaµ
Z' - Pa'. (a(

Pvi x'-aaa-Ag aµC

Y' -)aaa - - (a< = Cat
iu9pp

Z' = Aa" = A,

bundles over 𝒫 ⊂ ℂℙ³ is related to the theory of singularities of ODEs, and
gives rise to the classical solution to the isomonodromy problem by Painlevé
transcendents. 5
The family of ODEs
Suppose, to begin with, that we are given a rank-n holomorphic vector bundle
E' → 𝒫, where 𝒫 is a neighbourhood of a line in ℂℙ³. Suppose that E' is
invariant under the Lie algebra of one of the Painlevé groups in Table 11.1 and
that it satisfies the usual triviality condition on lines. In 𝒫 − Σ, the three
generating vector fields X', Y', Z' are independent, and so we can define a
holomorphic connection ∇ on E'|_{𝒫−Σ} by
\[ \nabla_{X'} = \mathcal{L}_{X'}, \qquad \nabla_{Y'} = \mathcal{L}_{Y'}, \qquad \nabla_{Z'} = \mathcal{L}_{Z'}, \]
which is flat because the Lie derivatives commute. In any local trivialization,
the connection form has a pole on Σ, at which it behaves like Δ^{-1} or Δ̃^{-1}.
Let x ∈ U and let x̂ be the corresponding projective line in 𝒫, with stereographic
coordinate ζ, and let T be the tangent to x̂ such that T(ζ) = 1. Then
E'|_{x̂} is trivial; and in a global trivialization, ∇_T is an operator on local ℂⁿ-valued
functions on x̂ of the form
\[ \nabla_T = \partial_\zeta - A(\zeta), \]

where A is a matrix-valued function of ζ, with poles at the intersections of x̂
with Σ, the order of the pole being equal to the multiplicity of the intersection.
Thus for each point of U, we have an ODE
\[ \frac{du}{d\zeta} = A(\zeta)u, \qquad u(\zeta) \in \mathbb{C}^n, \tag{11.6.1} \]
with four singular points (which need not be distinct and may be at infinity). It
is uniquely determined by E' and by its symmetry group up to the ambiguity in
the global trivialization of E'|_{x̂} and the choice of the stereographic coordinate
ζ; that is, up to gauge transformations A ↦ g^{-1}Ag, u ↦ g^{-1}u, where g is
independent of ζ, and Möbius transformations of ζ.
The right-hand side of the ODE can be determined directly from the Higgs
fields of the corresponding solution to the ASDYM equation. We use the same
technique and notation as in the construction of invariant linear systems (§11.2).
In an invariant gauge the pull-back to ℱ of an invariant local section of E' is a
solution to the linear equations
\[ D_\ell u = 0, \quad D_m u = 0, \quad X''(u) = 0, \quad Y''(u) = 0, \quad Z''(u) = 0, \]
where X'', Y'', Z'' are the lifts to ℱ of the conformal Killing vectors. Now the
five vector fields ℓ, m, X'', Y'', Z'' span the tangent space to ℱ everywhere except
on p^{-1}(Σ); by eliminating the four space-time derivatives at each x, we obtain
an explicit form for the ODE satisfied by u on x̂.
At each point of x̂, we have
\[ T = \Delta^{-1}(\alpha X' + \beta Y' + \gamma Z'), \]
where α, β, γ are polynomials in λ, μ, and ζ. On the fibre above x in ℱ,
therefore, αX'' + βY'' + γZ'' is a linear combination of ℓ and m, and so
\[ \partial_\zeta u = T(u) = -\Delta^{-1}(\alpha X + \beta Y + \gamma Z)\,\lrcorner\,\Phi\,u = -\Delta^{-1}(\alpha P + \beta Q + \gamma R)u, \]
where P, Q, R are the Higgs fields. It follows that
\[ A = -\Delta^{-1}(\alpha P + \beta Q + \gamma R). \]
The poles of the ODE are at the zeros of Δ (including ζ = ∞ if ζ̃ = 0 is a zero
of Δ̃); they are distinct in the case of P_VI, and coincident in the case of P_I,II. In
the other cases, there are two double poles (P_III), one single and one triple pole
(P_IV), or two single and one double pole (P_V).
We determine the coefficients α, β, γ in terms of λ = ζw + z̃, μ = ζz + w̃ and
ζ, by using
\[ T(\lambda - \zeta w - \tilde z) = 0, \qquad T(\mu - \zeta z - \tilde w) = 0, \qquad T(\zeta) = 1, \]
where the space-time coordinates are held fixed at their values at x. The result
is
\[ (\alpha, \beta, \gamma) = \Delta\,(w, z, 1)\,G^{-1}, \]
where G is the matrix defined above. For example, in the case P_I,II,
C-4 [(t(2
A= + ((w - 1)2)P + ((w - 1)(Q + (2R]
where t is as in Table 7.4.
Isomonodromic deformations
Equation (11.6.1) is a four-parameter family of ODEs, labelled by x ∈ U. When
the coordinate t in Table 7.4 takes the same values at x_1 and x_2, the corresponding
ODEs are equivalent in the sense that one can be transformed into the other
by a gauge transformation combined with a Möbius transformation of ζ, because
if t(x_1) = t(x_2), then there is an element of the symmetry group that maps x_1 to
x_2. The ODEs at different values of t are still closely related in that they have
the same monodromy data.
To explain what this means, we shall consider a general ODE of the form
(11.6.1), where A is a rational function of ζ with poles at ζ_1, ζ_2, ...; we also
count poles at infinity: the ODE has a pole of order r at ζ = ∞ if the leading
term in A is of order ζ^{r−2}. In general, a fundamental solution f will be singular
at the poles ζ_0, ζ_1, ... and will be multivalued on their complement in ℂℙ¹. Part
of the monodromy data measures the way in which f fails to be single-valued.
If Γ: [0,1] → ℂℙ¹ − {ζ_j} is a closed curve in the complement of the poles, then
the values of f at the initial and final points of Γ are related by f_0 = f_1 M_Γ, for
some M_Γ ∈ GL(n, ℂ) depending only on the homotopy class of Γ. The first part
of the data is the monodromy representation
\[ M : \pi_1(\mathbb{CP}^1 - \{\zeta_j\}) \to \mathrm{GL}(n, \mathbb{C}), \]
which is determined by the ODE, uniquely up to conjugation by a fixed matrix.
When all the poles are simple, this is all there is: we say that two equations with
the same number of simple poles are isomonodromic whenever their monodromy
representations are conjugate. To extend this to the general case, we must add
more detailed information about the behaviour of f at the multiple poles (Jimbo
et al. 1981, Its and Novokshenov 1986).
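The simplest instance of a monodromy matrix is the scalar case n = 1 with simple poles, where continuing a solution once around a pole with residue a multiplies it by exp(2πia). The sketch below checks this numerically; it is an illustration under stated assumptions and not part of the construction in the text. The coefficient A(ζ) = a/ζ + b/(ζ − 3), the residue values, and all function names are hypothetical choices. With the convention f_0 = f_1 M_Γ used above, M_Γ is the inverse of the ratio computed here.

```python
import cmath
import math


def transport_around_circle(coefficient, centre, radius, steps=4000):
    """Continue a scalar solution of du/dzeta = coefficient(zeta) * u once
    around the counterclockwise circle zeta = centre + radius * exp(i*theta),
    using classical RK4 in theta. Returns u(end) / u(start)."""
    h = 2 * math.pi / steps

    def rhs(theta, u):
        phase = cmath.exp(1j * theta)
        zeta = centre + radius * phase
        dzeta_dtheta = 1j * radius * phase
        return coefficient(zeta) * dzeta_dtheta * u

    u, theta = 1 + 0j, 0.0
    for _ in range(steps):
        k1 = rhs(theta, u)
        k2 = rhs(theta + h / 2, u + h / 2 * k1)
        k3 = rhs(theta + h / 2, u + h / 2 * k2)
        k4 = rhs(theta + h, u + h * k3)
        u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        theta += h
    return u


# Hypothetical residues at the simple poles zeta = 0 and zeta = 3.
RESIDUE_AT_0 = 0.3 + 0.1j
RESIDUE_AT_3 = 0.7


def example_coefficient(zeta):
    return RESIDUE_AT_0 / zeta + RESIDUE_AT_3 / (zeta - 3)


if __name__ == "__main__":
    around_zero = transport_around_circle(example_coefficient, 0, 1)
    print(around_zero, cmath.exp(2j * math.pi * RESIDUE_AT_0))
    # A loop that encloses no pole gives trivial monodromy.
    print(transport_around_circle(example_coefficient, 1.5, 0.5))
```

For matrix-valued A the same transport would be applied to a fundamental matrix, and the result is then defined only up to the conjugation mentioned in the text.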
Suppose that ζ = 0 is a pole of order r + 1, and that the leading coefficient
in A at ζ = 0 is diagonal, so that
\[ A = \zeta^{-r-1}\operatorname{diag}(a_1, a_2, \ldots, a_n) + O(\zeta^{-r}), \]
where the eigenvalues a_j are distinct, and no two differ by an integer in the case
r = 0. We can expand in powers of ζ to construct a fundamental solution f(ζ)
of the form
\[ f = (1 + \zeta F_1 + \zeta^2 F_2 + \cdots)\exp\Bigl(T\log\zeta + \sum_{j=1}^{r} S_j\zeta^{-j}\Bigr), \]
where the matrices T, S_j, F_j are independent of ζ, and T and the S_j's are
diagonal (when r = 0, we omit the S_j terms). The coefficients F_j, T, S_j are
unique, although in general the series does not converge for r > 0, so f is a

solution only in a formal sense. It does, however, determine the asymptotic
behaviour as ζ → 0 of actual solutions in various sectors of the ζ-plane bounded
by lines of constant arg(ζ). We define the separation rays to be the lines through
the origin on which
\[ \operatorname{Re}\bigl[(a_i - a_j)\zeta^{-r}\bigr] \]
changes sign for some i, j, and we define a fundamental sector to be an open set
of the form S = {θ_0 < arg(ζ) < θ_1} such that (i) S is bounded by separation
rays, (ii) θ_1 − θ_0 > π/r, and (iii) S is minimal in the sense that it does not
properly contain any other sector satisfying (i) and (ii). Since the eigenvalues
are assumed to be distinct, it is clear that we can cover any neighbourhood of the
origin, less the origin itself, by 2r fundamental sectors. It is proved by Birkhoff
that for any fundamental sector S, there is a unique fundamental solution f_S
that is asymptotic to the formal solution as ζ → 0 in S (see Theorem 12.3 and the
remarks on p. 84 in Wasow 1976, and also Jimbo et al. 1981). The asymptotic
behaviour of the fundamental solutions changes from sector to sector because
terms that fall off exponentially in one sector can grow exponentially in another,
and only for very special equations is it true that the solutions in different sectors
are the same.
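The separation rays are easy to tabulate for given data. Writing ζ = ρe^{iθ} and φ = arg(a_i − a_j), the quantity Re[(a_i − a_j)ζ^{-r}] is proportional to cos(φ − rθ), which changes sign where φ − rθ = π/2 + mπ. The sketch below lists the resulting directions; it is an illustration only, the function name is hypothetical, and the sample eigenvalues are arbitrary.

```python
import cmath
import math


def separation_rays(eigenvalues, r):
    """Directions theta in [0, 2*pi) on which Re[(a_i - a_j) * zeta**(-r)]
    changes sign for some pair i != j, with zeta = rho * exp(i * theta)."""
    rays = set()
    n = len(eigenvalues)
    for i in range(n):
        for j in range(i + 1, n):
            phi = cmath.phase(eigenvalues[i] - eigenvalues[j])
            # cos(phi - r*theta) vanishes when phi - r*theta = pi/2 + m*pi.
            for m in range(2 * r):
                theta = (phi - math.pi / 2 - m * math.pi) / r
                rays.add(round(theta % (2 * math.pi), 12))
    return sorted(rays)


if __name__ == "__main__":
    # Two eigenvalues, pole of order r + 1 = 2: rays along the imaginary axis.
    print(separation_rays([1, -1], 1))
    # Same eigenvalues with r = 2: four rays, at odd multiples of pi/4.
    print(separation_rays([1, -1], 2))
```

Each pair of distinct eigenvalues contributes 2r rays. Swapping i and j shifts φ by π and gives the same set, so only pairs with i < j are scanned.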
The construction of f_S uses a special gauge in which the leading coefficient
of A at the pole is diagonal, and a particular choice of coordinate at the pole. In
a general local gauge in a neighbourhood V of the pole, and with general choice
of coordinate ζ on V such that ζ = ζ_0 at the pole, the fundamental solutions f_S
are characterized by the condition
r
Tlo g((- (o) + Si((- (o)i I

as ζ → ζ_0, with ζ − ζ_0 ∈ S, where T and the matrices S_j are diagonal; T is called
the exponent of formal monodromy at the pole. This condition determines f_S
uniquely up to the choice of the branch of the logarithm (see Balser et al. 1979).
The monodromy data of the equation are (i) the exponents of formal monodromy
at each pole and (ii) the constant matrices f_{S'}^{-1} f_S that relate the solutions at the
various sectors at the various poles. There is some redundancy here, and Jimbo
et al. (1981) explain how to reduce the data to a minimal set.
It is immediate that two ODEs constructed from a given Painlevé-invariant
holomorphic vector bundle have the same monodromy representation since the
monodromy representation coincides with the holonomy of ∇ in the complement
of the singular set. We want to show that they also have the same monodromy
data. This is not quite a well-posed problem because we have defined the
'monodromy data' only under generic conditions on A. Rather than try to extend
the definition to cover the singular cases, at which the eigenvalues coincide, we
shall prove a stronger statement that implies the proposition in the cases that
the data are well defined, but still makes sense when they are not. We say that
two ODEs

\[ \frac{du}{d\zeta} = A(\zeta)u, \qquad \frac{du}{d\zeta'} = A'(\zeta')u, \]
with rational coefficients on the Riemann spheres x̂ and x̂' are strongly
isomonodromic whenever there exist connected open sets V ⊂ x̂, V' ⊂ x̂', and a
biholomorphic map ρ: ζ ↦ ζ' from V onto V' such that V contains the poles of the
first equation, V' contains the poles of the second equation, and
\[ A'(\zeta') = g^{-1}Ag - g^{-1}\partial_\zeta g \tag{11.6.2} \]
on V for some holomorphic g: V → GL(n, ℂ). Clearly if this holds, and if
the data for the two equations are well defined, then the data are equivalent
because ρ and g will transform the solutions of the first equation with the special
asymptotic form into solutions of the second equation with the same asymptotic
form. If V is the whole of x̂, then ρ is a Möbius transformation and g is constant,
but ρ and g can be more general when V is not the whole sphere.
Proposition 11.6.1 The ODEs constructed from a Painlevé-invariant
holomorphic vector bundle E' → 𝒫 are strongly isomonodromic.
Proof Suppose that x, x' ∈ U. When x and x' are close, the existence of V and
ρ is a consequence of the following. Let 𝔨 be a two-dimensional subalgebra of 𝔥.
Then at almost every point of x̂, the 2-plane in the tangent space to 𝒫 spanned
by the generators of 𝔨 is transverse to x̂; the exceptional points are given by the
roots of a quadratic in ζ. If 𝔨 is generated by
\[ a_i\partial_\lambda + b_i\partial_\mu + c_i\partial_\zeta, \qquad i = 1, 2, \]
then the exceptional points are the points at which T = w∂_λ + z∂_μ + ∂_ζ is
a linear combination of these two vectors (together, possibly, with the point
ζ = ∞). That is, they are the roots of
\[ \begin{vmatrix} w & z & 1 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix}, \]
which is a quadratic in ζ (if the coefficient of ζ² vanishes, then we include ζ = ∞
as a root). We calculate this quadratic from Table 11.1 for the subalgebras
generated, respectively, by X' and Y', Y' and Z', and Z' and X' in each case.
Up to scale, the results are
P1,ll : (2 (2(2 - z) - 2t-v( - 1 (('w( - 1)
-V)2 (2
Pill : ((z( + w) (z( + I
Ptv: 1 ((wz(+ z2(- w) (
Pv: (z(+ iu)2 ( z(+ w
Pvl: ( z( + w ((z( + zu)
where w, z, w̃, z̃ are the coordinates of x. In each case, the three quadratics are
independent, so by a suitable choice of 𝔨, the two exceptional points (the two
roots) can be any two chosen points on the sphere. We choose them so that they
do not coincide with any of the poles. We can then find an open set N ⊂ 𝒫 such
that V = x̂ ∩ N is connected and contains all the poles on x̂, and such that the
Table 11.2 Linear ODEs in terms of Higgs fields
Case      ODE

P_I,II    u' = ζ^{-4}Pu − ζ^{-3}Qu + ζ^{-2}(R + tP)u
P_III     u' = ζ^{-2}Pu + t²Qu − ζ^{-1}Ru
P_IV      u' = ζPu + (Q − tP)u + ζ^{-1}Ru
P_V       u' = t(1 + ζ)^{-2}Pu + ζ^{-1}(1 + ζ)^{-1}Qu + ζ^{-1}Ru
P_VI      u' = −ζ^{-1}Pu − (ζ + 1)^{-1}Ru + (ζ + t)^{-1}(P + Q + R)u

generators of 𝔨 span a 2-dimensional integrable distribution at each point of
N which is transverse to x̂ in V. The leaves are the connected components of
the intersection with N of the orbits of the subgroup generated by 𝔨, and they
include the intersections with the planes in Σ (since all the generators of 𝔥 are
tangent to these planes). We assume that N has been chosen so that the leaves
are simply-connected.
The Lie derivative operators along the generators of 𝔨 determine a flat
connection on the leaves, which coincides with ∇ (restricted to the leaves) in
N − Σ, but is still well defined on the leaves in Σ. We define ρ: V → V' by
projecting along the distribution. Since the planes in Σ intersect N in leaves,
ρ maps poles to poles. Provided that x̂' is close to x̂, V' = ρ(V) contains all
the poles on x̂'. If we lift ρ to a bundle isomorphism ρ': E'|_V → E'|_{V'} by parallel
propagation along the leaves, then it maps ∇|_V to ∇|_{V'} because ∇ is flat. If
we define g to be the gauge transformation between the global trivialization of
E'|_{x̂} and the global trivialization of E'|_{x̂'}, pulled back to x̂ by ρ', then we have
(11.6.2), and hence that the ODEs on the two lines are strongly isomonodromic.
When x̂ and x̂' are not close, we construct ρ and g in stages.
The Painleve systems
Each Painleve transcendent determines an invariant ASDYM field and hence a
four-parameter family of isomonodromic ODEs. Three of the parameters label
trivial deformations (gauge and coordinate transformations), while the fourth is
nontrivial. By making special choices for three of the space-time coordinates,
one can pick out in each case a simple representative ODE labelled by the non-
ignorable coordinate t in Table 7.4. The resulting one-parameter families of
isomonodromic ODEs are shown in Table 11.2, where P, Q and R are the Higgs
fields of the corresponding ASDYM field.
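The pole structure of each representative ODE can be read off from its coefficient, using the rule that a pole of order r at ζ = ∞ corresponds to a leading term of order ζ^{r−2}. The sketch below is an illustration under stated assumptions: it encodes each term of Table 11.2 (as read here) as a product of powers of (ζ − location), ignores the matrix coefficients, and assumes that the Higgs fields are generic so that no leading terms cancel. The dictionary and function names are hypothetical.

```python
# Each term of the coefficient A is a dict {location: exponent}, standing for
# the product of (zeta - location)**exponent over its entries; {} is a constant.
TABLE_11_2_TERMS = {
    "P_I,II": [{"0": -4}, {"0": -3}, {"0": -2}],
    "P_III": [{"0": -2}, {}, {"0": -1}],
    "P_IV": [{"0": 1}, {}, {"0": -1}],
    "P_V": [{"-1": -2}, {"0": -1, "-1": -1}, {"0": -1}],
    "P_VI": [{"0": -1}, {"-1": -1}, {"-t": -1}],
}


def pole_orders(terms):
    """Orders of the poles of A, including the point at infinity ("inf")."""
    orders = {}
    for term in terms:
        for location, exponent in term.items():
            if exponent < 0:
                orders[location] = max(orders.get(location, 0), -exponent)
    # Growth of A at large zeta: the largest total degree among the terms.
    degree = max(sum(term.values()) for term in terms)
    order_at_infinity = degree + 2  # leading term zeta**(r - 2)
    if order_at_infinity > 0:
        orders["inf"] = order_at_infinity
    return orders


if __name__ == "__main__":
    for case, terms in TABLE_11_2_TERMS.items():
        orders = pole_orders(terms)
        print(case, orders, "total multiplicity:", sum(orders.values()))
```

In every case the multiplicities add up to four, matching the four intersections of a line with the singular set: one fourfold pole for P_I,II, two double poles for P_III, a single and a triple pole for P_IV, two single poles and one double pole for P_V, and four simple poles for P_VI.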
Remark. When the gauge group is SL(2, C), the only deformations which pre-
serve the monodromy data are those given by the twistor construction. In this
case, therefore, strong isomonodromy is the same as isomonodromy in the stan-
dard sense. However, this is an accident of the low dimensionality of SL(2, C ),
and is not true for a general gauge group in the presence of irregular singular
points (as follows from the enumeration of the deformation parameters in Jimbo
et al. 1981).
The inverse transform
Suppose we are given an ODE of the form (11.6.1) with four poles, not necessarily
distinct. Then we can embed the Riemann sphere on which the equation is
defined as a line $\hat x \subset \mathbb{CP}_3$ in such a way that the poles are the intersections with
$\hat x$ of the singular set $\Sigma$ of a Painlevé group $H$ (with the correct multiplicities). In
the generic case ($\mathrm{P}_{\mathrm{VI}}$), this means choosing $\hat x$ so that $t(x)$ is equal to the cross-
ratio of the poles. We shall now explain how to construct an invariant bundle $E'$
on some neighbourhood $P \supset \hat x$, from which we can recover the original equation
together with a family of isomonodromic deformations.
At each point of $\hat x$, the holomorphic tangent space to $\mathbb{CP}_3$ is spanned by the
tangent space to the line together with the generators of some two-dimensional
subalgebra $\mathfrak t \subset \mathfrak h$, provided that $x$ is not at one of the fixed singularities. We
cannot use the same $\mathfrak t$ at every point, but in each case it is possible to use Table
11.1 to pick two subalgebras $\mathfrak t$ and $\tilde{\mathfrak t}$ and an open cover $V$, $\tilde V$ of a neighbourhood
of $\hat x$ such that (i) the generators of $\mathfrak t$ are transverse to $\hat x$ in $V$, (ii) the generators
of $\tilde{\mathfrak t}$ are transverse to $\hat x$ in $\tilde V$, and (iii) no pole of the ODE is in $V \cap \tilde V$.
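Now define two functions $\xi$ and $\tilde\xi$ on $V$ and $\tilde V$ by the conditions (a) $\xi = \tilde\xi = \zeta$ on
$\hat x$, and (b) $\xi$ is constant along the generators of $\mathfrak t$ and $\tilde\xi$ is constant along the
generators of $\tilde{\mathfrak t}$. Let $f(\zeta)$
be a fundamental solution to the ODE on $\hat x$ and let $E' \to P$ be the holomorphic
vector bundle with transition matrix
$$F = f(\tilde\xi)\,f(\xi)^{-1}.$$
Although $f$ is many valued, $F$ is single valued because $F = 1$ on $\hat x$.
Also, although $f$ is singular at the poles, $F$ is nonsingular on $V \cap \tilde V$ because the
poles lie outside $V \cap \tilde V$. Moreover $E'|_{\hat x}$ is trivial, so $E'$ is trivial on lines in a
neighbourhood of $\hat x$.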
If $K$ is a generator of the action of $\mathfrak t$, then $K(f(\xi)) = 0$, by construction.
If $X'$ is the holomorphic vector field corresponding to some other element of $\mathfrak h$,
then $X'$ must lie in the space spanned by the $K$s at each point of $\Sigma$, because
the generators of $\mathfrak h$ are linearly dependent on $\Sigma$. Therefore $X'(\xi) = 0$ on $\Sigma$: in
fact $X'(\xi)$ has a zero of the same order as the corresponding pole of the ODE.
Therefore
$$X'(f(\xi))\,f(\xi)^{-1} = X'(\xi)\,\frac{\mathrm{d}f}{\mathrm{d}\xi}\,f^{-1}$$
is holomorphic in $V$ (if $V$ contains points at which $\zeta = \infty$, then we can replace
$\zeta$ and $\xi$ by their inverses). By combining this with the same argument for $\tilde V$, we
conclude that for every generator $X'$ of the action of $\mathfrak h$,
$$X'(F) = \tilde H F - F H,$$
where the matrices $H$ and $\tilde H$ are holomorphic on $V$ and $\tilde V$ respectively. It follows
that $E'$ is invariant.
Finally, we claim that the ODE recovered from $E'$ on $\hat x$ is the one we started
with. This follows from the fact that we can define a (multivalued) invariant
frame field for $E'$ on the complement of $\Sigma$ by taking the basis vectors to be the
columns of $f(\xi)$ and $f(\tilde\xi)$ on $V$ and $\tilde V$ respectively. Since $f(\tilde\xi) = F f(\xi)$, these
match up on the overlaps. The frame field restricts to a fundamental solution
to the corresponding ODE on each line, in the global trivialization of $E'$ on the
line. But $F = 1$ on $\hat x$, so the local trivializations that we used to construct $E'$
coincide with the global trivialization of $E'|_{\hat x}$; therefore, $f(\zeta)$ is a (multivalued)
fundamental solution to the ODE on $\hat x$ determined by $E'$, which must therefore
be the same as the original ODE.
Example 11.6.2 Suppose that $A$ has one double and two simple poles, which
is the case $\mathrm{P}_{\mathrm V}$. Choose $\zeta$ so that the double pole is at $\zeta = -1$ and the simple
poles are at $\zeta = 0$, $\zeta = \infty$, and take $x$ to be the point $w = 0$, $z = \tilde w = \tilde z = 1$.
Then, on ±, A = 1 and µ = 1 + (, and so 0 = -((1 + ()2.
We take $V$ to be a small neighbourhood of $\zeta = -1$ and $\mathfrak t$ to be the span of
$Y'$ and $Z'$; and we take $\tilde V$ to be the complement of $\zeta = -1$ and $\tilde{\mathfrak t}$ to be the span
of $X'$ and $Y'$. Then $\xi = \lambda^{-1}\mu - 1$, which is constant along $Y'$ and $Z'$, and equal
to $\zeta$ on $\hat x$, and $\tilde\xi = \zeta$, which is constant along $X'$ and $Y'$. So in this case, the
patching matrix is
$$F = f(\zeta)\,\bigl(f(\lambda^{-1}\mu - 1)\bigr)^{-1}.$$
When $n = 2$, we obtain from this a solution $y(t)$ to $\mathrm{P}_{\mathrm V}$ by substituting $\lambda = t$,
$\mu = 1 + \zeta$ (see Table 7.4), and by making a Birkhoff factorization. Every solution
arises from this construction by (i) solving an initial ODE to find $f$ and then (ii)
solving the Riemann-Hilbert problem to determine the ASDYM field, and
hence $y(t)$.
We remark that the Painlevé property is an obvious consequence of this con-
struction. The singularities of a solution $y$ of one of the Painlevé equations are
of two types: fixed singularities, at which $\hat x$ passes through the intersection of
two planes in $\Sigma$, or lies in one of the planes; and movable singularities on the
jumping lines. At the latter, by Proposition 9.3.4, $y$ is meromorphic.

11.7 THE SCHLESINGER EQUATION

Both the forward construction, by which we derived an isomonodromic family of
ODEs from an invariant holomorphic vector bundle, and the reverse construction,
by which we embedded a given ODE in an isomonodromic family, extend directly
to ODEs of the form (11.6.1), where now $A$ is rational with any number of
poles, with any multiplicities (Mason and Woodhouse 1993). We simply replace
$\mathbb{CP}_3$ by $\mathbb{CP}_{k+1}$, where $k + 2$ is the total number of poles, counted according to
multiplicity. In the general construction, the Painlevé groups are replaced by
Abelian subgroups of $\mathrm{PGL}(k + 2, \mathbb C)$ of matrices with block decomposition
$$\begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_N \end{pmatrix},$$
where for each $j = 1, \ldots, N$, $B_j$ is a $k_j \times k_j$ matrix of the form
$$B_j = \begin{pmatrix}
a_1 & a_2 & a_3 & \cdots & a_{k_j} \\
0 & a_1 & a_2 & \cdots & a_{k_j - 1} \\
0 & 0 & a_1 & \cdots & a_{k_j - 2} \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_1
\end{pmatrix}.$$

There is one block of size $k_j$ for each pole of order $k_j$. As in the cases that we
have considered, the action of the symmetry group is generated by $k + 2$ holo-
morphic vector fields on $\mathbb{CP}_{k+1}$, which are independent except on a singular set
made up of hyperplanes (one for each pole, again counted according to multi-
plicity). In the forward construction, an invariant holomorphic vector bundle on
a neighbourhood $P$ of a line in $\mathbb{CP}_{k+1}$ gives rise to a strongly isomonodromic
family of ODEs, with one ODE for each line in $P$. The construction goes in
exactly the same way as before: the symmetry determines a connection on the
restriction of the bundle to the complement of the singular set, and the invariant
frames of the connection determine fundamental solutions to the ODEs. In the
reverse construction, an ODE of the form (11.6.1), in which $A$ is a given ra-
tional matrix, determines an invariant holomorphic vector bundle over an open
set in $\mathbb{CP}_{k+1}$, and hence an isomonodromic family of ODEs. The ODEs in this
family associated with distinct lines $\hat x_1$ and $\hat x_2$ are generally distinct (i.e. they
are not related by gauge and Möbius coordinate transformations) whenever the
intersections of $\hat x_1$ and $\hat x_2$ with the singular set are not projectively equivalent.
In the generic case, the symmetry group is the diagonal subgroup and all the
poles are simple. The intersections then each consist of $k + 2$ points, and so
the family of ODEs contains a $(k - 1)$-parameter family of distinct, but strongly
isomonodromic ODEs.
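The claim that the blocks generate an Abelian group can be checked directly in a small case. The following is a minimal sketch, assuming only that each block is upper triangular and constant along its diagonals, as displayed above; the function names and the sample entries are illustrative choices and do not come from the text.

```python
def toeplitz_block(a):
    """Upper-triangular block with entry (i, j) equal to a[j - i] for j >= i, else 0."""
    k = len(a)
    return [[a[j - i] if j >= i else 0 for j in range(k)] for i in range(k)]


def matmul(p, q):
    """Product of two square matrices stored as lists of rows."""
    n = len(p)
    return [[sum(p[i][m] * q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


# Two sample blocks of size k_j = 4 (arbitrary integer entries).
B = toeplitz_block([2, 3, 5, 7])
C = toeplitz_block([1, -4, 6, 9])

product = matmul(B, C)
commute = product == matmul(C, B)
# The product is again of the same form, determined by its first row.
product_is_same_form = product == toeplitz_block(product[0])
```

Each block is a polynomial in a single nilpotent shift matrix, which is why any two of them commute and why the product has the same shape.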
An ODE with $k + 2$ simple poles can be written in the form
$$\frac{\mathrm{d}y}{\mathrm{d}\zeta} = \sum_{\alpha=0}^{k+1} \frac{A_\alpha}{\zeta - a_\alpha}\,y,$$
where the $A_\alpha$s are independent of $\zeta$, and $A_0 + \cdots + A_{k+1} = 0$ (so that there should
not be a further pole at infinity). In the twistor construction, $\zeta$ is a coordinate
on a line in $\mathbb{CP}_{k+1}$ and the points $a_\alpha$ are the intersections with the coordinate
hyperplanes. It was shown by Schlesinger (1912) that when the poles move, the
monodromy representation remains constant if and only if the coefficients $A_\alpha$
satisfy the Schlesinger equation
$$\frac{\partial A_\alpha}{\partial a_\beta} = \frac{[A_\alpha, A_\beta]}{a_\alpha - a_\beta},$$
provided that no two eigenvalues of any $A_\alpha$ differ by an integer. Our construction
maps solutions of the Schlesinger equation to holomorphic vector bundles that
are invariant under the diagonal subgroup of $\mathrm{GL}(k + 2, \mathbb C)$.
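Two elementary features of the Schlesinger equation can be checked by hand or by machine. Because its right-hand side is a commutator with $A_\alpha$, the trace of any power of $A_\alpha$ is constant along the flow, and the right-hand side is unchanged when $\alpha$ and $\beta$ are exchanged. The sketch below assumes the form $\partial A_\alpha/\partial a_\beta = [A_\alpha, A_\beta]/(a_\alpha - a_\beta)$ for $\alpha \neq \beta$; the sample residues and pole positions are arbitrary illustrative values.

```python
from fractions import Fraction


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][m] * q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(p, q):
    pq, qp = matmul(p, q), matmul(q, p)
    n = len(p)
    return [[pq[i][j] - qp[i][j] for j in range(n)] for i in range(n)]


def trace(p):
    return sum(p[i][i] for i in range(len(p)))


def schlesinger_rhs(A, a, alpha, beta):
    """[A_alpha, A_beta] / (a_alpha - a_beta), assumed form for alpha != beta."""
    bracket = commutator(A[alpha], A[beta])
    gap = Fraction(a[alpha] - a[beta])
    return [[entry / gap for entry in row] for row in bracket]


# Sample residues with A_0 + A_1 + A_2 = 0, so there is no pole at infinity.
A = [[[1, 2], [3, -1]], [[0, 1], [4, 0]], [[-1, -3], [-7, 1]]]
a = [0, 1, 3]
pairs = [(al, be) for al in range(3) for be in range(3) if al != be]

# d/da_beta of tr(A_alpha^2) equals 2 tr(A_alpha dA_alpha/da_beta).
trace_rates = [2 * trace(matmul(A[al], schlesinger_rhs(A, a, al, be)))
               for al, be in pairs]
symmetric = all(schlesinger_rhs(A, a, al, be) == schlesinger_rhs(A, a, be, al)
                for al, be in pairs)
```

The vanishing of every entry of `trace_rates` reflects the fact that the poles move while each residue only changes by conjugation.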
We showed in §10.7 that the Penrose-Ward transform maps a holomorphic
vector bundle on a neighbourhood of a line in $\mathbb{CP}_{k+1}$ to a solution to the
GASDYM equation on $\mathbb C^{2k}$ (the hyper-Kähler equation when $k$ is even). There-
fore the Schlesinger equation is a symmetry reduction of the GASDYM equation.
In the inhomogeneous coordinates introduced in §10.6, the generators of the
symmetries in $\mathbb{CP}_{k+1}$ are the vector fields
$$X'_1 = \mu^1\frac{\partial}{\partial\mu^1},\quad \ldots,\quad X'_k = \mu^k\frac{\partial}{\partial\mu^k},\quad X'_{k+1} = \zeta\frac{\partial}{\partial\zeta},$$
and the equation of a general line in $\mathbb{CP}_{k+1}$ is
$$\mu^A - \zeta x^A - \tilde x^A = 0 \qquad (A = 1, \ldots, k).$$
By writing the tangent to the line as a linear combination of the generators
$$T = a_1 X'_1 + \cdots + a_k X'_k + \zeta^{-1} X'_{k+1},$$
where $a_A = x^A/\mu^A$ (without summation), we deduce that the ODE associated
with the line is
du _ Pk+1 rk XAPA
td
d( [, - 0 (xA + iA
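where the $P_A$s are the Higgs fields of the symmetry generators in space-time,
that is of the vector fields
$$X_1 = x^1\partial_1 + \tilde x^1\tilde\partial_1,\quad X_2 = x^2\partial_2 + \tilde x^2\tilde\partial_2,\quad \ldots,\quad X_{k+1} = -x^1\partial_1 - \cdots - x^k\partial_k,$$
where $\partial_A = \partial/\partial x^A$, $\tilde\partial_A = \partial/\partial\tilde x^A$. Therefore the corresponding solution to the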

GASDYM equation is
D=d+(P1+Pk+1)d(logx')+...+(Pk+Pk+1)d(logxk)
- Pk+Id(log(x'...ik)).
Every invariant solution can be written in this gauge, and for every solution,
(11.7.1) is an isomonodromic family of ODEs. To obtain the corresponding
solution to the Schlesinger equation, one has to make a Möbius transformation
of the independent variable in (11.7.1) to move the poles at $\zeta = 0$ and $\zeta = \infty$
into general positions.
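The decomposition of the tangent vector $T$ into symmetry generators can be verified in coordinates. The sketch below assumes the reading $X'_A = \mu^A\,\partial/\partial\mu^A$, $X'_{k+1} = \zeta\,\partial/\partial\zeta$ for the generators and $\mu^A = \zeta x^A + \tilde x^A$ for the line; the function name and the sample point are illustrative.

```python
from fractions import Fraction


def tangent_from_generators(x, x_tilde, zeta):
    """Components of T = sum_A a_A X'_A + zeta^{-1} X'_{k+1} in the
    coordinates (mu^1, ..., mu^k, zeta), evaluated on the line
    mu^A = zeta x^A + x~^A, with a_A = x^A / mu^A."""
    mu = [zeta * xa + xt for xa, xt in zip(x, x_tilde)]
    a = [xa / m for xa, m in zip(x, mu)]
    # X'_A = mu^A d/dmu^A contributes a_A * mu^A to the mu^A component.
    mu_components = [aa * m for aa, m in zip(a, mu)]
    # X'_{k+1} = zeta d/dzeta contributes zeta^{-1} * zeta to the zeta component.
    zeta_component = (1 / zeta) * zeta
    return mu_components, zeta_component


x = [Fraction(2), Fraction(-3), Fraction(5)]
x_tilde = [Fraction(1), Fraction(4), Fraction(7)]
zeta = Fraction(3, 2)

components = tangent_from_generators(x, x_tilde, zeta)
```

The result agrees with the derivative of $(\zeta x^A + \tilde x^A, \zeta)$ with respect to $\zeta$, namely $(x^A, 1)$, so $T$ is indeed tangent to the line wherever no $\mu^A$ vanishes.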
NOTES ON CHAPTER 11
1. If $F$ is the patching matrix of a bundle which is invariant under the flow of $X'$,
then $X'(F)$ satisfies (11.2.1) for some $\theta_{X'}$, $\tilde\theta_{X'}$. However, $\theta_{X'}$ and $\tilde\theta_{X'}$ are determined
uniquely by $F$ only if $\Gamma(\mathrm{adj}(E'), P) = 0$; otherwise $E'$ will be invariant along $X'$ in
more than one way. If $\omega \in \Gamma(\mathrm{adj}(E'), P)$, and if $\mathcal L_{X'}$ is the Lie derivative of one action
of $X'$, then $\mathcal L_{X'} + \omega$ is the Lie derivative of another.
2. This recent result appears in a preprint by R. S. Ward.
3. To construct the solution on space-time from these data, we work locally on space-
time. We choose a small open neighbourhood U in R3 such that the subsets of MT of
timelike lines that intersect U at positive and negative values of lm((), respectively, are
contained in Stein sets V+ and V_ in lP(O (D 0(2)). We can trivialize E over V+ and E
over V_ , and so represent the linear map F by a matrix-valued function of ((, -y). The
standard Birkhoff factorization of F((, z + (x + (2z) gives a solution to the ASDYM
equation with required symmetry. This construction can be related to that of Manakov
and Zakharov (1981), and yields all their solutions.
4. More explicitly, the properties of a and b lead directly to the Gel'fand-Levitan-
Marchenko equation. Since they are both even functions of k, we have for any p E R,
0=J (sin(kp)(a - 2) + icos(kp)b) dk
00

=i ((m+-1)(e-''k+Re'k')+Reik' - (Tm_ - 1)e'k') dk.

At t = 0, the singularities off in the upper half k-plane are at the poles of T, that is,
at the eigenvalues (i = -k , ... , (n = -kn (the kts are on the imaginary axis in the
upper half plane). Now detr = -2ik/T vanishes at the eigenvalues-see eqn (9.8.1).
We put
n
A=!J '(0 i)Nn
t_1
where the constants ct are chosen so that at kt,

(T) k
lktvt
where r(kt)vt = 0. Then at t = 0, A12k is holomorphic throughout the upper half
k-plane, and bounded. The same is true at all x, t. Hence, for p > 0,
00 00
0=J kA12eIk'dk = 2 J (o(m+ - 1) + o + (Tm_ - 1))e"'dk,
-00 00

where a = ae-Zi7/k, since m+ - 1 is also holomorphic in the upper half-plane, and

vanishes at infinity. We conclude that for p > 0,
((m+ - 1)(e-l'k + weik') + weIk') dk = 0
00

where w = R + a. This is the Fourier transform of the Gel'fand-Levitan-Marchenko

equation,

M+(x,t,P)'+'cl(x,t,P)+ 2a j'M+(x,t,p!)Q(x,t,p+p!)dp'=O
where (with y = 0)
00

M+ (X, t, P) = (m+ - 1)e-''k dk

and
ct
(R + i
k
) e-2iry/kC'k dk
100
(note that M+ vanishes for negative p).
The Gel'fand-Levitan-Marchenko equation implicitly determines M +(x, t, p) and
hence u(x, t) from the scattering data R = R+, ct of u(x, 0). It is clear from the way
that t enters the formula for Q through y = (x + (2t (at y = 0) that the scattering
data of the potential u(x, t) are
$$R(t, k) = R(k)\exp(-2\mathrm{i}k^3 t), \qquad c_i(t) = c_i\exp(-2\mathrm{i}k_i^3 t).$$
The transmission coefficients, which do not enter, are constant. In the standard deriva-
tion of the Gel'fand-Levitan-Marchenko equation, one finds the evolution formulas
first, and then introduces the equation as a solution to the inverse-scattering problem
of reconstructing a potential from its data.
5. The ideas here owe much to N. J. Hitchin; see, in particular, Hitchin (1995).
12
Twistor construction of hierarchies

In Chapter 8, we saw that a given solution to the ASDYM equation could be
embedded in an infinite family of new solutions by moving it along the commut-
ing flows of the ASDYM hierarchy. The lowest level flows in the hierarchy are
the translations in space-time, from which the higher flows are generated by the
recursion operator. In this chapter, we shall consider the Penrose-Ward trans-
form of the recursion operator and we shall show that it leads to a particularly
straightforward and elegant representation of the flows: if the original solution
has patching matrix $F(\lambda, \mu, \zeta)$, then the new solutions have patching matrices
$$F(\lambda + \alpha, \mu + \beta, \zeta),$$
where $\alpha$ and $\beta$ are holomorphic functions of $\zeta$ defined on an annular neighbour-
hood of the circle $|\zeta| = 1$. The coefficients in the Laurent expansions of $\alpha$ and $\beta$
are the parameters along the flows. In twistor space, the flows are linear, and the
reason that they commute is transparent. In the first section we shall derive this
representation, and explain how the Bäcklund transformation (§4.6) also takes
a simple form on twistor space.
In the next four sections we derive a twistor form of Drinfeld and Sokolov's
construction of the nKdV hierarchy.$^{1}$ In §12.2 we give a brief review of the origi-
nal construction; we explain that the Drinfeld-Sokolov hierarchy is a reduction of the
GASDYM hierarchy and outline the corresponding reduction of the GASDYM
twistor construction. We show that it has a particularly simple representa-
tion when we make the Segal-Wilson ansatz that the dressing transformation
converges. In §12.3, we present a twistor version of the Drinfeld-Sokolov con-
struction (the original is reviewed in Appendix B). One starts with an operator
representing `one half of a Lax pair' and constructs from it a sequence of com-
muting operators that generate the flows of the hierarchy. We use a version of
the Penrose-Ward transform to map the seed operators to holomorphic bundles
over a subset of $\mathbb{CP}_2$, and then construct the flows on the space of their patching
matrices. In §12.4, we derive two explicit methods for finding solutions to the
hierarchy. The first is based on the factorization derived in §9.3. The second
is a form of the Krichever construction, in which the twistor bundle is assumed
to be invariant along some combination of the higher flows. In this case, the
bundle can be constructed from a family of line bundles over a Riemann surface
and the solution can be written down in terms of theta functions. In §12.5, we
consider the Hamiltonian formulation of the theory. In the final section we show
how the twistor construction can be extended to give the general solution of the
gP equations.

12.1 TRANSFORMATIONS OF THE PATCHING MATRIX

A key property of the Penrose-Ward transform is that the patching data are free:
the patching matrix is required to be holomorphic in $\lambda$, $\mu$, and $\zeta$ in some neigh-
bourhood of $|\zeta| = 1$, and to have non-vanishing determinant, but is otherwise
unconstrained. Any holomorphic perturbation of $F$ is allowed, so any holomor-
phic matrix-valued function $\delta F$ on the domain of $F$ determines a tangent vector
on the solution space of the ASDYM equation.
We explained in Chapter 8 that $\delta\Phi$ could be regarded as a solution to the
linearized ASDYM equation, and that new tangent vectors (that is new solutions
to the linearized equation) could be generated by applying the recursion operator.
We remarked that the recursion operator is not uniquely defined as a structure on
the moduli space of ASD connections, because it involves a choice of integration
constants. We shall now see that the corresponding description in twistor space
is clearer and better defined because $F$ contains more information than $\Phi$, since,
for example, it determines not only $D$, but also the $J$ and $K$ matrices. In fact,
a linearized patching matrix $\delta F$ contains precisely the same information as the
sequence of potentials $\phi_j$ (for positive and negative values of $j$). The recursion
operator is defined without ambiguity on $\delta F$, and is given by a very simple
formula. It is for this reason that the Penrose-Ward transform of the hierarchy
of commuting flows on the solution space of a reduction of the ASDYM equation
takes the following particularly straightforward form.
Proposition 12.1.1 The Penrose-Ward transform of the recursion operator is
the multiplication operator
$$R: \delta F \mapsto \zeta\,\delta F. \tag{12.1.1}$$
Proof We suppose that the factorization of the perturbed patching matrix is
$$F + \delta F = (\tilde f + \delta\tilde f)^{-1}(f + \delta f),$$
and we put
$$\psi = f F^{-1}(\delta F) f^{-1} = \tilde f(\delta F) F^{-1}\tilde f^{-1}.$$
Under the gauge transformation (10.1.4), $\psi$ behaves as a section of adj$(E)$. By
expanding the right-hand side of the factorization formula, and by keeping only
the first order terms in the perturbation, we have
$$\psi = (\delta f) f^{-1} - (\delta\tilde f)\tilde f^{-1} \tag{12.1.2}$$
for $\zeta$ in some neighbourhood of the unit circle. Since $f$ and $\tilde f$ satisfy (10.1.1),
and since $F$ and $\delta F$ are functions of $\lambda$, $\mu$, and $\zeta$ alone,
$$D_w\psi - \zeta D_{\tilde z}\psi = 0, \qquad D_z\psi - \zeta D_{\tilde w}\psi = 0, \tag{12.1.3}$$
where here $D = \mathrm{d} + [\Phi, \,\cdot\,]$ is the connection on adj$(E)$.
Now $\psi$ is holomorphic in $\zeta$ in a neighbourhood of the unit circle, so it has a
Laurent expansion
$$\psi = \sum_{j=-\infty}^{\infty} \phi_j\,\zeta^{-j},$$
where the coefficients are sections of adj$(E)$. If we fix the Birkhoff factorizations
of $F$ and $F + \delta F$ by imposing the gauge conditions $f|_{\zeta=0} = 1$, $\delta f|_{\zeta=0} = 0$, then
$\psi$ and the coefficients $\phi_j$ are uniquely determined by $F$ and $\delta F$. It follows from
(12.1.3) that
$$D_w\phi_j = D_{\tilde z}\phi_{j+1}, \qquad D_z\phi_j = D_{\tilde w}\phi_{j+1},$$
and thus that the successive coefficients are related by the recursion operator.
With our gauge choice,
$$(\delta f) f^{-1} = \zeta\,\delta K + O(\zeta^2), \qquad (\delta\tilde f)\tilde f^{-1} = -J^{-1}\delta J + O(\zeta^{-1}),$$
as $\zeta \to 0$ and $\zeta \to \infty$, respectively. Therefore, by (12.1.2),
$$\phi_0 = J^{-1}\delta J, \qquad \phi_{-1} = \delta K.$$
Hence $\phi_0$ is the solution to the background-coupled wave equation that generates
the perturbation $\delta\Phi$ of the connection, and the other $\phi_j$s are obtained from it by
applying the recursion operator and its inverse.
When $\delta F$ is replaced by $\zeta\,\delta F$, $\psi$ is replaced by $\zeta\psi$ and $\phi_0$ is replaced by $\phi_1 = R\phi_0$.
Therefore $\delta F \mapsto \zeta\,\delta F$ induces the recursion operator on linearized solutions to
the ASDYM equation.
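The last step of the proof is a statement about Laurent coefficients, and it can be made concrete. The sketch below assumes the convention $\psi = \sum_j \phi_j \zeta^{-j}$ and uses rational numbers as stand-ins for the matrix-valued coefficients; the function names and sample values are illustrative only.

```python
from fractions import Fraction


def multiply_by_zeta(coeffs):
    """Given psi = sum_j phi_j zeta^(-j) as a dict {j: phi_j}, return the
    coefficients of zeta * psi in the same convention. The coefficient at
    index j of the result is the old phi_{j+1}."""
    return {j - 1: phi for j, phi in coeffs.items()}


def evaluate(coeffs, zeta):
    """Evaluate sum_j phi_j zeta^(-j) for a finite set of coefficients."""
    return sum(phi * zeta ** (-j) for j, phi in coeffs.items())


# Scalar stand-ins for phi_{-1}, phi_0, phi_1, phi_2.
psi = {-1: Fraction(5), 0: Fraction(7), 1: Fraction(11), 2: Fraction(13)}
shifted = multiply_by_zeta(psi)
zeta = Fraction(2)
```

After the shift, the coefficient at index 0 is the old $\phi_1$, which is how multiplication of $\delta F$ by $\zeta$ replaces $\phi_0$ by $\phi_1$.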
As a geometric object, $\gamma = \zeta^{-1}(\delta F)F^{-1}$ is a representative of a cohomology
class in $H^1(P, A)$, where $A = \mathrm{adj}(E') \otimes \mathcal O(-2)$, relative to the two-set open cover
$V$, $\tilde V$, and $\phi_0$ is its image under the linear Penrose transform (§10.6). With this
interpretation, we can construct $\psi$ directly from perturbations of the patching
matrices $F_{ij}$ of a more general system of local trivializations of $E$.
The symplectic form $\Omega$ on the solutions to Yang's equation is obtained by
the construction in §10.8 by using the $\gamma$s to represent tangent vectors to the
solution space, and the higher symplectic forms are constructed by combining $\Omega$
with powers of the recursion operator $\gamma \mapsto \zeta\gamma$.
Recursion from translations
By translating a given ASDYM field along a constant vector in space-time, we
generate a one-parameter family of new solutions that differ only trivially from
the original. However, by applying positive and negative powers of the recursion
operator to this `seed flow', we can generate genuinely new solutions. We saw in
Chapter 8 that the KdV and NLS hierarchies arise in this way. We shall now
show that the twistor transforms of the new solutions are related to those of the
original in a particularly simple way.
A one-parameter group of translations $\rho_t: \mathbb{CM} \to \mathbb{CM}$ is generated by a
constant vector field
$$X = a\,\partial_w + b\,\partial_z + \tilde a\,\partial_{\tilde w} + \tilde b\,\partial_{\tilde z},$$
where $a, b, \tilde a, \tilde b \in \mathbb C$. In twistor space, the induced flow $\rho'_t: \mathbb{PT} \to \mathbb{PT}$ is generated
by
$$X' = \nu\,\partial_\lambda + \sigma\,\partial_\mu,$$
where $\nu = a\zeta + \tilde b$, $\sigma = b\zeta + \tilde a$, and $\lambda = \zeta w + \tilde z$, $\mu = \zeta z + \tilde w$, $\zeta$ are the standard
inhomogeneous coordinates.
Let $D = \mathrm{d} + \Phi$ be an ASD connection on a bundle $E \to U \subset \mathbb{CM}$, and let
$F(\lambda, \mu, \zeta)$ be a patching matrix. For $t \in \mathbb C$, put $F^{(t)} = F \circ \rho'_t$. Then $F^{(0)} = F$ and
$$F^{(t)}(\lambda, \mu, \zeta) = F(\lambda + t\nu, \mu + t\sigma, \zeta).$$
For each $t$, $F^{(t)}$ generates a new solution $D = \mathrm{d} + \Phi^{(t)}$ to the ASDYM equation,
but one that differs only by translation from the original.
Now let $t_j$, $j \in \mathbb Z$, be a sequence of complex parameters, and put
$$\tau = \sum_{j=-\infty}^{\infty} t_j\,\zeta^j.$$
Provided that the Laurent series converges in a neighbourhood of the unit circle
in the $\zeta$-plane, we can define a family of patching matrices by
$$F(\lambda, \mu, \zeta, t) = F(\lambda + \tau\nu, \mu + \tau\sigma, \zeta).$$
For each $t$, $F$ generates an ASDYM field $D$; and, as the parameters $t_i$ vary, we
obtain a hierarchy of commuting flows on the space of connections; the individual
connections on an orbit are labelled by convergent Laurent series. Since $\partial_{i+1}F = \zeta\,\partial_i F$,
where $\partial_i = \partial/\partial t_i$, the flows are related by the recursion operator. The
seed flow is the translation along $X$, and the flow along $\partial_j$ is generated from this
by $R^j$, for positive or negative values of $j$, so that
$$\partial_j\Phi = D_{\tilde w}\phi_j\,\mathrm{d}\tilde w + D_{\tilde z}\phi_j\,\mathrm{d}\tilde z,$$
where the $\phi_j$s are the coefficients in the Laurent expansion
$$\tilde f\,X'(F)\,F^{-1}\tilde f^{-1} = \sum_{j=-\infty}^{\infty} \phi_j\,\zeta^{-j}.$$
We have therefore recovered the sequence of flows generated by recursion from a
one-parameter group of translations. In the twistor picture, however, there is no
ambiguity in the definition of the recursion operator. Once the patching matrix
of the original solution has been chosen, there is no further freedom: the choice
fixes the integration constants at each application of the recursion operator or
its inverse, and determines the flows uniquely.
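The relation $\partial_{i+1}F = \zeta\,\partial_i F$ between successive flows is easy to test numerically. The sketch below is only an illustration: it uses a scalar function in place of a patching matrix, a finite set of non-zero parameters $t_j$, and central differences for the derivatives. The sample function, the sample point and the constants $a, b, \tilde a, \tilde b$ are arbitrary choices and are not taken from the text.

```python
import cmath


def sample_patching_function(lam, mu, zeta):
    # Hypothetical scalar stand-in for F(lambda, mu, zeta).
    return cmath.exp(lam * mu) + zeta * lam ** 2 + mu


def tau(t, zeta):
    # tau = sum_j t_j zeta^j with finitely many non-zero t_j.
    return sum(tj * zeta ** j for j, tj in t.items())


def flowed(t, lam, mu, zeta, a, b, a_tilde, b_tilde):
    """F(lambda + tau nu, mu + tau sigma, zeta) with nu = a zeta + b~
    and sigma = b zeta + a~."""
    nu = a * zeta + b_tilde
    sigma = b * zeta + a_tilde
    s = tau(t, zeta)
    return sample_patching_function(lam + s * nu, mu + s * sigma, zeta)


def flow_derivative(j, t, point, h=1e-6):
    """Central-difference approximation to the derivative along t_j."""
    up, down = dict(t), dict(t)
    up[j] = up.get(j, 0.0) + h
    down[j] = down.get(j, 0.0) - h
    return (flowed(up, *point) - flowed(down, *point)) / (2 * h)


# (lambda, mu, zeta, a, b, a~, b~), with zeta on the unit circle.
point = (0.4 + 0.1j, -0.2 + 0.3j, cmath.exp(0.7j), 1.0, 0.5, -0.3, 0.2)
t = {-1: 0.1, 0: 0.05, 1: -0.2, 2: 0.03}
zeta = point[2]

mismatch = max(
    abs(flow_derivative(j + 1, t, point) - zeta * flow_derivative(j, t, point))
    for j in (-1, 0, 1)
)
```

All the flows are translations of the same two arguments of $F$, which is why they commute and why each is obtained from the previous one by multiplication by $\zeta$.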
Finally, we note that if we put
$$\tau_0 = \sum_{i=-\infty}^{\infty} x^{0i}\,\zeta^i, \qquad \tau_1 = \sum_{i=-\infty}^{\infty} x^{1i}\,\zeta^i,$$
then the patching matrices
$$F(\lambda + \tau_0, \mu + \tau_1, \zeta)$$
generate a family of solutions of the ASDYM equations labelled by the param-
eters $x^{Ai}$. From the argument above, the $J$ and $K$ potentials constructed from
these patching matrices have the properties stated in Proposition 8.4.2, and so
the proposition follows from the twistor construction. The twistor construction
in fact does more: it allows us to pick out domains in the parameter space on
which the solutions exist, for example, any domain such that the Laurent series
converges in some fixed neighbourhood of the unit circle. We can prove Propo-
sition 8.4.1 in the same way by taking $t_j = 0$ for negative $j$ and by constructing
$J$, $K$ from $F(\lambda, \mu, \zeta, t)$.

The geometric picture

To develop a solution to the ASDYM equation along its higher flows, one must
solve a set of differential equations. In the language of §8.6, the solutions extend
a given solution to H(2,1) in successive steps to a solution of H(2, p) for any p.
We can look at this process from a geometric point of view by using the twistor
construction in §10.7. This makes it easier to understand the freedom in the
definition of the flows and also the obstructions to preserving any symmetries of
the original solution as it evolves.
As an illustration, we shall consider the problem of extending a given solution
of the truncated Bogomolny hierarchy B(n - 1) = H(1, n - 1) on Cn to one of
B(n) = H(1, n) on Cn+1 by solving for the flow in xn (we use the notation of
§10.7, but we take k = 1, and drop the superscript A). The step from H(2, n - 1)
to H(2, n), or indeed from H(k, n - 1) to H(k, n) is made in almost exactly the
same way.
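The embedding of $\mathbb C^n \subset \mathbb C^{n+1}$ as the set $x^n = 0$ maps the twistor space of
$\mathbb C^n$, which is the total space of the line bundle $\mathcal O(n - 1) \to \mathbb{CP}_1$, into that of
$\mathbb C^{n+1}$, which is the total space of $\mathcal O(n)$. If we parametrize $\mathcal O(n)$ by homogeneous
coordinates $(\omega_n, \pi_{A'})$, where we make the identification $(\omega_n, \pi_{A'}) \sim (\lambda^n\omega_n, \lambda\pi_{A'})$
on $\mathcal O(n)$, then the map is
$$\iota: (\omega_{n-1}, \pi_{A'}) \mapsto (\omega_n, \pi_{A'}) = (\omega_{n-1}\pi_{0'}, \pi_{A'}).$$
We put $\gamma_n = \omega_n/(\pi_{0'})^n$, $\tilde\gamma_n = \omega_n/(\pi_{1'})^n$, $\zeta = \pi_{1'}/\pi_{0'}$, and $\tilde\zeta = \pi_{0'}/\pi_{1'}$. Then we have
$$\iota: (\gamma_{n-1}, \zeta) \mapsto (\gamma_n, \zeta) = (\gamma_{n-1}, \zeta),$$
$$(\tilde\gamma_{n-1}, \tilde\zeta) \mapsto (\tilde\gamma_n, \tilde\zeta) = (\tilde\zeta\,\tilde\gamma_{n-1}, \tilde\zeta).$$
For $\zeta \neq \infty$, the fibre of $\mathcal O(n - 1)$ is mapped onto the corresponding fibre of $\mathcal O(n)$,
but the fibre over $\zeta = \infty$ in $\mathcal O(n - 1)$ is mapped to zero.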
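The two coordinate expressions for the embedding can be checked from the homogeneous description. The sketch below assumes the reading of the map as $(\omega_{n-1}, \pi_{A'}) \mapsto (\omega_{n-1}\pi_{0'}, \pi_{A'})$ and of the charts as $\gamma_n = \omega_n/(\pi_{0'})^n$, $\tilde\gamma_n = \omega_n/(\pi_{1'})^n$; the function names and the sample point are illustrative.

```python
from fractions import Fraction


def iota(omega, pi0, pi1):
    """Embedding of O(n-1) in O(n) in homogeneous coordinates."""
    return omega * pi0, pi0, pi1


def chart_zero(omega, pi0, pi1, n):
    """Coordinates (gamma_n, zeta), valid where pi0 is non-zero."""
    return omega / pi0 ** n, pi1 / pi0


def chart_infinity(omega, pi0, pi1, n):
    """Coordinates (gamma~_n, zeta~), valid where pi1 is non-zero."""
    return omega / pi1 ** n, pi0 / pi1


n = 3
p = (Fraction(5), Fraction(2), Fraction(3))  # a sample point of O(n - 1)

before_zero = chart_zero(*p, n - 1)
after_zero = chart_zero(*iota(*p), n)
before_inf = chart_infinity(*p, n - 1)
after_inf = chart_infinity(*iota(*p), n)
```

In the first chart the fibre coordinate is unchanged, while in the second it is multiplied by $\tilde\zeta$, which vanishes at $\zeta = \infty$; this is the singular behaviour of the embedding at infinity.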
The solution to $B(n - 1)$ corresponds to a holomorphic vector bundle $E_{n-1}$
over a subset of $\mathcal O(n - 1)$, which has a patching matrix $F(\gamma_{n-1}, \zeta)$, where $\zeta$ lies
in some annulus in the complex plane. To extend it to a solution to $B(n)$, we
must find a holomorphic bundle $E_n$ over a corresponding subset of $\mathcal O(n)$ such
that $E_{n-1} = \iota^* E_n$. There are many possibilities; in particular, we can construct
$E_n$ by substituting $\gamma_n$ for $\gamma_{n-1}$ in $F$. Because of the singular behaviour of $\iota$
at infinity, however, an equivalent choice of $F$ for the original solution will in
general give a different bundle over $\mathcal O(n)$, and hence a different extension.
If the solution to B(n - 1) is independent of x°, then we can also require this
symmetry of the extension. But now there are obstructions due to the behaviour
at C = oo of the corresponding Lie derivative, which for the original solution has
the form a
en-1
a7'n-1
in a neighbourhood of $\tilde\zeta = 0$. If the solution is to extend with the same invariance,
then for some choice of local trivialization, we must have $\theta_{n-1} = \theta_n \circ \iota$, where
$\theta_n$ is also holomorphic at $\tilde\zeta = 0$. But this implies that
$$\theta_n(\tilde\zeta\,\tilde\gamma_{n-1}, \tilde\zeta) = \theta_{n-1}(\tilde\gamma_{n-1}, \tilde\zeta). \tag{12.1.4}$$
Therefore the condition for an invariant extension to exist is that, in some local
trivialization,
$$\theta_{n-1}(\tilde\zeta^{-1}\tilde\gamma_n, \tilde\zeta)$$
should be holomorphic at $\tilde\zeta = 0$. There are obvious obstructions since eqn
(12.1.4) implies that $\theta_{n-1}(\tilde\gamma_{n-1}, 0)$ is constant. So if a frame-invariant quan-
tity constructed from $\theta_{n-1}(\tilde\gamma_{n-1}, \tilde\zeta)$ is not constant at $\tilde\zeta = 0$, then an invariant
extension does not exist.
In the Drinfeld-Sokolov construction, we go one step further. We require that
not only should the invariant extension exist, but also that the Lie derivatives
for both the original solution and the extended solution should be of the same
normal form, with $\theta$ independent of $\gamma$. The extension is then unique.
Bäcklund transformations
In §4.6, we described a family of discrete symmetries $i_k: J \mapsto J'$, $1 < k < n$,
of solutions to Yang's equation with gauge group $\mathrm{GL}(n, \mathbb C)$. By combining $i_k$
with the transformations $J \mapsto C^{-1}JC$, for constant $C$, we can generate new
solutions from a given seed: transformations of this form generate a group of
`hidden symmetries' of Yang's equation. Like that of the recursion operator, the
twistor transform of this construction is very simple. Moreover, although the
transformations in space-time are not uniquely defined because there is some
freedom to choose integration constants in passing from $J$ to $J'$, the additional
information contained in the patching matrix is precisely what is needed to
remove this freedom, as before.
We shall use the same notation as in §4.6. We suppose that we are given a
patching matrix $F$. This determines $J$ uniquely, and if we write $J$ in the form
(4.6.1), then we can choose the factorization $F = \tilde f^{-1}f$ so that $f = h$ at $\zeta = 0$,
and $\tilde f = \tilde h$ at $\zeta = \infty$. If $g$ is as in (4.6.4), then $g^{-1}f$ and $g^{-1}\tilde f$ are fundamental
solutions to the transformed linear system, except at $\zeta = 0$ and $\zeta = \infty$, where $g$
is singular. However, if we put
0 lk
=k = ((- 1 lk 0 1
where $1_k$ and $1_{\tilde k}$ are the $k \times k$ and $\tilde k \times \tilde k$ identity matrices, then, because of the
special form of $f$ and $\tilde f$ at $\zeta = 0$ and $\zeta = \infty$, we have that
$$f' = g^{-1}f\,\Xi_k, \qquad \tilde f' = g^{-1}\tilde f\,\Xi_k$$
are regular at $\zeta = 0$ and $\zeta = \infty$, respectively. Moreover, from the fact that $f$
satisfies the original linear system, we deduce that
+ O(() -CAB + 0((2)
f - B'+ 0(() 1+0(()
as $\zeta \to 0$, with a similar result for $\tilde f'$ at $\zeta = \infty$. It follows that
$$F' = \tilde f'^{-1}f' = \Xi_k^{-1}F\,\Xi_k$$
is a patching matrix for $J'$. We have proved the following proposition.
Proposition 12.1.2 The discrete symmetry $i_k: J \mapsto J'$ is given by the transfor-
mation of the patching matrix $F \mapsto \Xi_k^{-1}F\,\Xi_k$.
The constant transformations $J \mapsto C^{-1}JC$ are given by $F \mapsto C^{-1}FC$, and
together with $i_k$, they generate the action of the loop group $L\mathrm{GL}(n, \mathbb C)$ on $F$
by conjugation. This result has been used to analyse hidden symmetries of the
Ernst equation (Woodhouse and Mason 1988).
12.2 DS OPERATORS AND THE GASDYM HIERARCHY
We now turn to the Drinfeld-Sokolov (DS) construction and its twistor repre-
sentation. The full details of the construction are given in Appendix B. In this
section, we first give a brief review; we then recall the connection with the GAS-
DYM hierarchy (§8.6) and we discuss the associated twistor theory. In the next
section we shall give a twistor construction of the DS flows and in §12.4 we shall
derive classes of explicit solutions.
Drinfeld-Sokolov operators and dressing transformations
Drinfeld and Sokolov's construction is an algebraic procedure for finding inte-
grable equations and their hierarchies of commuting flows. The idea is to begin
with an operator L, which represents `one half of a Lax pair', and to construct
from it a sequence of operators Mk, each of which is the other half of the pair for
some nonlinear integrable equation, and which together generate the commuting
flows. The seed operator is of the form
$$L = \partial_x + A - \Lambda, \tag{12.2.1}$$
where $A$ is an $n \times n$ trace-free lower-triangular matrix depending on $x$, and
$\Lambda = \zeta\Lambda_1 + \Lambda_0$, where
$$\Lambda_0 = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}, \qquad
\Lambda_1 = \begin{pmatrix}
0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & & & \vdots \\
0 & 0 & \cdots & 0 \\
1 & 0 & \cdots & 0
\end{pmatrix}. \tag{12.2.2}$$
Such an $L$ is called a DS operator. We regard two DS operators as equivalent if
they are related by a gauge transformation
$$L \mapsto g^{-1}Lg = \partial_x + g^{-1}\partial_x g + g^{-1}(A - \Lambda)g$$
for some $g: \mathbb C \to \mathrm{SL}(n, \mathbb C)$. Because of the special form of DS operators, $g$ satisfies
algebraic constraints. In fact, if we denote by $N \subset \mathrm{SL}(n, \mathbb C)$ the subgroup
of lower triangular matrices with ones down the diagonal, then we have the
following.
Lemma 12.2.1 Suppose that $g: \mathbb C \to \mathrm{SL}(n, \mathbb C)$ is a gauge transformation be-
tween two DS operators. Then $\omega g$ takes values in $N$, where $\omega$ is an $n$th root of
unity.
Proof By considering separately the terms of the first and zeroth order in $\zeta$,
we have that $[g, \Lambda_1] = 0$ and that $g_x - [\Lambda_0, g] = gA' - Ag$, where $A$ and $A'$ are
lower triangular. The first equation implies that $g_{11} = g_{nn}$ and that the other
entries in the first row and last column of $g$ vanish. The second then forces $g$ to
be lower triangular, with equal entries on the diagonal. However, $\det g = 1$, so
the lemma follows.
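The first step of the proof, the consequence of $[g, \Lambda_1] = 0$, can be confirmed on small matrices. The sketch below assumes that $\Lambda_1$ has a single non-zero entry, equal to 1, in the bottom left corner; the function names and the sample matrices are illustrative.

```python
def lambda_one(n):
    """The n x n matrix with a single 1 in the bottom left corner."""
    m = [[0] * n for _ in range(n)]
    m[n - 1][0] = 1
    return m


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][m] * q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def commutes_with_lambda_one(g):
    l1 = lambda_one(len(g))
    return matmul(g, l1) == matmul(l1, g)


def first_order_constraints(g):
    """g_11 = g_nn, and the remaining entries of the first row and of the
    last column vanish."""
    n = len(g)
    first_row_ok = all(g[0][j] == 0 for j in range(1, n))
    last_col_ok = all(g[i][n - 1] == 0 for i in range(n - 1))
    return g[0][0] == g[n - 1][n - 1] and first_row_ok and last_col_ok


samples = [
    [[2, 0, 0], [5, 2, 0], [7, 11, 2]],   # satisfies the constraints
    [[2, 1, 0], [5, 2, 0], [7, 11, 2]],   # non-zero entry in the first row
    [[2, 0, 0], [5, 2, 0], [7, 11, 3]],   # g_11 differs from g_nn
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],    # generic matrix
]
agreement = [commutes_with_lambda_one(g) == first_order_constraints(g)
             for g in samples]
```

The middle entries of $g$ are not restricted at this order; it is the zeroth-order equation that forces the remaining upper-triangular entries to vanish.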
Gauge transformations by constant multiples of the identity act trivially on
L, so it follows that two DS operators are equivalent if and only if they are related
by a gauge transformation with values in N. We denote by M the space of DS
operators, with x in some specified domain, modulo gauge transformations.
Given an operator $L$ of the DS form, we construct a dressing transformation
$T(x, \zeta)$ as a formal power series in negative powers of $\Lambda$ with diagonal matrices
as coefficients, such that
$$T^{-1}LT = \partial_x - \Lambda, \tag{12.2.3}$$
by solving a recursion relation for the coefficients in the power series. If we
define, for $k = 1, 2, 3, \ldots$,
$$R_k = (T\Lambda^k T^{-1})_+ \tag{12.2.4}$$
where the subscript `+' denotes the polynomial part of the expression written as
a formal series in powers of $\zeta$, then the flows on $M$ are given by
$$\partial_k L = [R_k, L].$$
There is some freedom in the choice of $T$, but for each $L$, the $R_k$s are uniquely
determined by $L$ up to the addition of $r_k(x)$, where $r_k$ takes values in the Lie
algebra of $N$, and is independent of $\zeta$. It follows that the dependence of $L$ on $t_k$
is independent of the choice of $T$, up to gauge, and hence that the flows on $M$
are well defined. The first flow is simply translation in $x$, but the higher flows
are nontrivial.
The flows commute, and the family of operators $L(x, t)$ is determined by the
seed operator $L$ up to gauge transformations of the form
$$L \mapsto g^{-1}Lg,$$
where $g$ depends on $x$, $t$ and takes values in $N$.
252 Turistor construction of hierarchies
There is a unique choice of gauge in which $L$ is of the form $\partial_x + A - \Lambda$, where
$$A = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 \\
u_0 & u_1 & \cdots & u_{n-2} & 0
\end{pmatrix}. \qquad (12.2.5)$$
The dependence of the $u_i$s on $t$ is given by nonlinear differential equations of the
form $\partial_k u_i = P_{ik}$, where $P_{ik}$ is a polynomial in the $u_i$s and their $x$-derivatives,
which can be found explicitly from the dressing transformation. When $n = 2$,
there is just one function $u$; the flow along $\partial_3$ is governed by the KdV equation,
the flows along $\partial_5, \partial_7, \ldots$ are those of the KdV hierarchy, and the flows for even
$k$ are trivial. For a general value of $n$, the flows are trivial when $k$ is a multiple
of $n$; otherwise they are those of the nKdV hierarchy.
In this gauge, the equation $Ls = 0$, where $s$ is a column vector of length $n$,
is equivalent to the eigenvalue equation
$$u_0\psi + \cdots + u_{n-2}\frac{d^{n-2}\psi}{dx^{n-2}} + \frac{d^n\psi}{dx^n} = \zeta\psi, \qquad (12.2.6)$$
where $\psi = s_1$, and the other entries in $s$ are given by $s_j = \partial_x^{j-1}s_1$. Thus we can
also think of M as the space of nth-order scalar differential operators, of the
form of that on the left-hand side.
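As a worked check of this reduction, the case $n = 2$ can be written out in full. The sketch below assumes that $\Lambda$ has ones above the diagonal and $\zeta$ in the bottom-left corner (the form consistent with $\Lambda^n = \zeta$ in (12.2.7)), and writes $u$ for the single potential $u_0$.

```latex
L = \partial_x
  + \begin{pmatrix} 0 & 0 \\ u & 0 \end{pmatrix}
  - \begin{pmatrix} 0 & 1 \\ \zeta & 0 \end{pmatrix},
\qquad
s = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix},
\\[4pt]
Ls = 0
\iff
\begin{cases}
  s_1' - s_2 = 0, \\
  s_2' + u\,s_1 - \zeta\,s_1 = 0,
\end{cases}
\iff
s_2 = s_1', \quad \psi'' + u\,\psi = \zeta\,\psi \quad (\psi = s_1).
```

The first row fixes $s_2$ as the derivative of $s_1$, and the second row is then the scalar eigenvalue equation; the same elimination, row by row, gives (12.2.6) for general $n$.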
Any DS operator $L$ can be reduced to the special form by a gauge transformation
$L \mapsto g^{-1}Lg$, where $g$ takes values in $N$. If $f$ (an $n \times n$ matrix) is a
fundamental solution to $Lf = 0$ in the original gauge, then $g^{-1}f$ is a solution in
the special gauge. However, for $g \in N$, the entries in the first row of $f$ are unchanged
by such a transformation. It follows that if $f$ is a fundamental solution
to $Lf = 0$ in any gauge in which $L$ takes the DS form (12.2.1), then the entries
in the first row of $f$ make up a basis of solutions to (12.2.6).
In the scalar formulation, the information in the dressing transformation is
encoded in the Baker function $\psi(x, \lambda)$, which is the basic object in the approach
of Segal and Wilson (1985). We put
$$\lambda^n = \zeta, \qquad \kappa = (1, 0, \ldots, 0), \qquad v^t = (1, \lambda, \ldots, \lambda^{n-1}), \qquad (12.2.7)$$
so that $v$ is an eigenvector of $\Lambda$ with eigenvalue $\lambda$, and we define the Baker
function to be
$$\psi(x, \lambda) = \kappa T\exp(x\Lambda)v. \qquad (12.2.8)$$
This is a formal power series solution to the eigenvalue equation (12.2.6), which
is characterized uniquely up to multiplication by a formal power series in $\lambda^{-1}$
with constant coefficients by the condition that it should be of the form
$$\psi(x, \lambda) = e^{\lambda x}(1 + a_1(x)\lambda^{-1} + a_2(x)\lambda^{-2} + \cdots). \qquad (12.2.9)$$
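The two algebraic facts behind (12.2.7), that $v$ is an eigenvector of $\Lambda$ and that $\Lambda^n$ is $\zeta$ times the identity, can be checked numerically. The following is a minimal sketch in plain Python for $n = 3$; the helper names and the explicit form of $\Lambda$ (ones above the diagonal, $\zeta$ in the bottom-left corner) are assumptions made for illustration.

```python
def shift_matrix(n, zeta):
    """Assumed form of Lambda: ones above the diagonal, zeta in the bottom-left corner."""
    lam = [[0j] * n for _ in range(n)]
    for i in range(n - 1):
        lam[i][i + 1] = 1 + 0j
    lam[n - 1][0] = complex(zeta)
    return lam


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matvec(a, v):
    """Matrix times column vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


n = 3
lam = 0.7 + 0.4j                  # spectral parameter lambda (arbitrary test value)
zeta = lam ** n                   # lambda^n = zeta, as in (12.2.7)
Lam = shift_matrix(n, zeta)
v = [lam ** j for j in range(n)]  # v^t = (1, lambda, ..., lambda^(n-1))

# Check Lambda v = lambda v.
Lv = matvec(Lam, v)
eig_err = max(abs(Lv[j] - lam * v[j]) for j in range(n))

# Check Lambda^n = zeta * identity.
power = Lam
for _ in range(n - 1):
    power = matmul(power, Lam)
pow_err = max(
    abs(power[i][j] - (zeta if i == j else 0)) for i in range(n) for j in range(n)
)
```

Because $\Lambda^n$ is a scalar multiple of the identity, $\exp(t_k\Lambda^k)$ is a scalar function of $\zeta$ whenever $k$ is a multiple of $n$, which is one way to see why those flows are trivial.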
To reconstruct the dressing transformation from $\psi$, we first construct a basis $s_{1j}$
of solutions to (12.2.6) by taking the $n$ linearly independent symmetric combinations
pS operators and the GASDYM hierarchy 253
n-1

i=O

where $\omega^n = 1$ and $j = 1, \ldots, n$. We then put $s_{ij} = \partial_x^{i-1}s_{1j}$ and define $T$ by
$T = s e^{-x\Lambda}$. The resulting matrix satisfies equation (12.2.3) and is a formal series
in nonpositive powers of $\Lambda$.

The twistor theory of the DS hierarchy

In §8.6 we showed that the system of equations determining the flows of the DS
hierarchy is a reduction of $H(n - 1, \infty)$ and we showed that the first $nm$ flows
determine special solutions to $H(n - 1, m)$ with symmetry under translation
along the coordinate vector fields $\partial_{A0}$ in C. By combining this observation
with the twistor correspondence in §10.6, we have a construction that generates
a holomorphic vector bundle $E'$ over the twistor space $\oplus^{n-1}O(m)$ from the
image of a given seed operator $L$ under the first $nm$ flows, where $O(m)$ denotes
the total space of the line bundle of Chern class $m$ over $\mathbb{CP}^1$.
The symmetry of the solutions under translation along aAO is reflected in the
existence of Lie derivative operators LA, A = 1, . . . , n - 1, which take the local
form
a,UA +OA, +OA,
in the two coordinate systems introduced in §10.7. It is always possible to choose
the frame in the first coordinate patch so that 0A = 0; the special solutions
that arise from the DS construction are characterized by the property that it is
possible to choose the frame in the second patch so that z
BA = (AA +O((N)
as ( -p oo, for arbitrarily large N. In fact it follows from Proposition (12.3.2)
that if this holds for some sufficiently large N, then it can be made to hold for
any N.

The Segal-Wilson ansatz

A class of solutions to H(n - 1, m) that certainly arise from the DS construction
are those for which BA = (AA and therefore for which the patching matrix is
given by
n-1
(-1µAAA
F(t,A,() = exp g(O
CE
A=1
for some matrix-valued function g((). In this case, the formal series defining the
dressing transformation converges in a neighbourhood ( = 0, and so it is possible
to choose T so that the linear system has a fundamental solution of the form
f = Texp \u X AO(-1AA1
in a neighbourhood of = 0. The corresponding frame for the bundle over
twistor space is the one in which 9A takes the standard form.
Conversely, given a choice of $g(\zeta)$ on a neighbourhood of the unit circle, we
can recover a solution to the nKdV hierarchy, up to level $nm$, by setting
$$\mu_A = x^{A0} + \zeta x^{A1} + \zeta^2 x^{A2} + \cdots + \zeta^m x^{Am}$$
with $x^{Ai} = t_{n(i-1)+A}$, and making a Birkhoff factorization $F = \tilde f^{-1}f$. It is possible
to fix the freedom in the choice of factors so that $T = f\exp(-\sum x^{A0}\Lambda^A/\zeta)$ is
independent of $x^{A0}$, and takes values in $N$ at $\zeta = \infty$. It is then a dressing
transformation for the solution to the nKdV hierarchy.
These solutions are special. In particular, they are necessarily meromorphic
functions of the time variables ti, although not all such meromorphic solutions
arise in this way because there exist meromorphic solutions constructed from
Painleve transcendents for which the dressing transformation does not converge.
Segal and Wilson (1985) study this class of solutions with convergent dressing
transformations. They identify the DS flows with those generated by left
multiplication by $\exp(\sum_k t_k\Lambda^k)$ on the Grassmannian $LGL(n, \mathbb{C})/LGL_+(n, \mathbb{C})$.
From the twistor point of view, the map from the solutions to the Grassmannian
is given by g (we are free to multiply g on the right by functions of ( that are
regular in the unit disc: this gives an equivalent patching matrix, and the same
point of the Grassmannian). The DS flows, which are given by
$$g \mapsto \exp\Big(\sum_k t_k\Lambda^k\Big)\,g$$
when the patching matrix has this special form, are mapped to the flows on the
Grassmannian. Thus we are led naturally to the Segal-Wilson version of Sato
theory (Date et al. 1983).

12.3 THE TWISTOR CONSTRUCTION OF THE DS FLOWS

We have outlined the way in which a given DS operator seeds the flows of the
nKdV hierarchy, and therefore generates a solution to H(n - 1, m) and hence
a holomorphic bundle over $O(m) \oplus \cdots \oplus O(m)$. It is straightforward to reverse
the construction when the patching matrix is assumed to have the Segal-Wilson
form. In this section, we turn to the general case, in which the dressing trans-
formation is not assumed to converge, and look at the DS construction from a
different point of view. We shall also fill in some of the details of the theory we
sketched in the last section. Our first step is to represent the elements of M by
holomorphic vector bundles over subsets of $\mathbb{CP}^2$, and hence by patching matrices
$F(\gamma, \zeta)$. We then construct the flows on the space of patching matrices. If the
dressing transformation of the $L \in$ M converges, then the patching matrix of $L$
can be chosen so that its image under the flows is given by
$$F \mapsto \exp\Big(\sum_j t_j\Lambda^j\Big)F,$$
that is, $\partial_k F = \Lambda^k F$, and we recover the Segal-Wilson ansatz. In the general
case, the dependence on $t$ is less explicit, and we have
$$\partial_k F = (S\Lambda^k S^{-1})_{\rm fin}F,$$
where $S$ is a formal power series in $\Lambda^{-1}$ with diagonal matrices as coefficients
constructed from $F$ and the subscript `fin' denotes truncation at some finite
nonpositive power of $\zeta$. As in the KdV case, the tangents to the flows are
related by a recursion operator, which is given by $R = \Lambda$ in the Segal-Wilson
case, and by $(S\Lambda S^{-1})_{\rm fin}$ in general.
DS operators as connections
In our geometric interpretation of the DS construction, instead of (12.2.1), we
consider
$$L = \partial_x + A - \Lambda - \zeta\partial_y, \qquad (12.3.1)$$
as an operator on vector-valued functions of the two `space-time' variables $x$ and
$y$; of course, when the functions are independent of $y$, $L$ reduces to (12.2.1).
Such operators are special examples from a more general class in which
$$L = \partial_x + P - \zeta(\partial_y + Q) \qquad (12.3.2)$$
where $Q$ and $P$ are trace-free $n \times n$ matrices depending holomorphically on $x$
and $y$.
We shall think of $D = d + P\,dx + Q\,dy$ as a connection on a rank-$n$ vector
bundle $E \to U$, where $U$ is a convex open subset of $\mathbb{C}^2$, and allow general
gauge transformations
$$Q \mapsto g^{-1}Qg + g^{-1}\partial_y g, \qquad P \mapsto g^{-1}Pg + g^{-1}\partial_x g$$
where $g$ takes values in $SL(n, \mathbb{C})$. If a gauge class contains an operator of the
special form (12.3.1), then the special representative is unique up to a gauge
transformation by $g$, where $g$ is independent of $y$ and takes values in $N$. Therefore
M is embedded in the space of operators of the form (12.3.2), modulo $SL(n, \mathbb{C})$
gauge transformations.
The twistor space
In two dimensions, the Penrose-Ward transform is a correspondence between
connections on bundles over a two-dimensional space-time and holomorphic vector
bundles over a two-dimensional twistor space. There are no field equations,
and the connections are not subject to any constraints. In this context, `twistor
space' is the space of lines in $\mathbb{C}^2$, which is $\mathbb{CP}^2$, less a point representing the line
at infinity (if we write a general line in the form $ax + by + c = 0$, then $a, b, c$ are
homogeneous coordinates on twistor space, so this is the standard dual space
construction in plane projective geometry). We shall use two inhomogeneous
coordinate systems, $\gamma, \zeta$ and $\tilde\gamma, \tilde\zeta$, which are defined by writing the equation of a
line in one or other of the forms
$$\zeta x + y = \gamma, \qquad x + \tilde\zeta y = \tilde\gamma.$$
The first excludes lines parallel to the $y$-axis, the second excludes lines parallel
to the $x$-axis; on the overlap, they are related by $\tilde\zeta = \zeta^{-1}$, $\tilde\gamma = \zeta^{-1}\gamma$.
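The relation between the two coordinate systems can be checked on a sample line. The sketch below is illustrative only: the function name `to_tilde` and the sample values are assumptions, and the transition rule is the one obtained by dividing $\zeta x + y = \gamma$ by $\zeta$.

```python
def to_tilde(gamma, zeta):
    """Change of chart on the overlap zeta != 0: returns (gamma_tilde, zeta_tilde)."""
    return gamma / zeta, 1 / zeta


# A sample point (x, y) of C^2 and a sample line through it in the first chart.
x, y = 0.3 - 0.2j, 1.1 + 0.5j
zeta = 2.0 + 1.0j
gamma = zeta * x + y          # the line  zeta*x + y = gamma  passes through (x, y)

gamma_t, zeta_t = to_tilde(gamma, zeta)

# The same point must satisfy the second form  x + zeta_tilde*y = gamma_tilde.
chart_err = abs((x + zeta_t * y) - gamma_t)
```

Lines parallel to the $y$-axis correspond to $\tilde\zeta = 0$, where the first chart breaks down, which is why two charts are needed.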
For $U \subset \mathbb{C}^2$, let $P$ denote the subset of $\mathbb{CP}^2$ of lines that intersect $U$, and
let $V$, $\tilde V$ be a two-set open cover of $P$ with $V$ contained in the domain of the
coordinates $\gamma, \zeta$, and $\tilde V$ contained in the domain of $\tilde\gamma, \tilde\zeta$. In considering the DS
operators, we shall take $U = W \times \mathbb{C}$, where $W$ is some open set in the $x$-plane. In
this case, $P$ includes all $(\gamma, \zeta)$ for finite $\zeta$, but excludes the points $(\tilde\gamma, \tilde\zeta) = (x, 0)$
for $x \notin W$. We call $P$ the twistor space of $U$ or $W$.
For $U = \mathbb{C}^2$, the twistor space is fibred over $\mathbb{CP}^1$ by the projection $(\gamma, \zeta) \mapsto \zeta$
(which maps a line in $\mathbb{C}^2$ to its tangent space), and is isomorphic to the line
bundle $O(1) \to \mathbb{CP}^1$. Each point of $U$ corresponds to a section of $O(1)$, given
by
$$\gamma = \zeta x + y.$$
The Penrose-Ward transform
The Penrose-Ward transform in this context is reminiscent of the Radon transform;
it maps $E$ and $D$ to a holomorphic bundle $E' \to P$, with the fibre over a
point of $P$ defined to be the set of parallel sections of $E$ along the corresponding
line in $U$.
We can trivialize $E'$ over $V$ and $\tilde V$ by finding fundamental solutions $f$ and $\tilde f$
to the equation $Lf = 0$ that are regular in a neighbourhood of $\zeta = 0$ and $\zeta = \infty$,
respectively, and we can characterize $E'$ by a patching matrix $F(\gamma, \zeta) = \tilde f^{-1}f$,
which is regular on $V \cap \tilde V$. In the other direction, we can represent a bundle
$E' \to P$ by a patching matrix $F(\gamma, \zeta)$, and then recover $Q$ and $P$ from $F$ by
making a Birkhoff factorization
$$F(\zeta x + y, \zeta) = \tilde f^{-1}(x, y, \zeta)\,f(x, y, \zeta)$$
and by applying the standard argument: since $\partial_x F - \zeta\partial_y F = 0$,
$$(\partial_x f - \zeta\partial_y f)f^{-1} = (\partial_x\tilde f - \zeta\partial_y\tilde f)\tilde f^{-1}.$$
Both sides must therefore be of the form $\zeta Q - P$, by the same extension of Liouville's
theorem as in four-dimensional theory, and $f$ and $\tilde f$ must be fundamental
solutions to $Lf = 0$, where $L$ is given by (12.3.2). A different choice of $F$ or a
different choice of factorization gives a gauge-equivalent connection. The only
constraint is that the factorization should exist; that is, that the restriction of $E'$
to each of the projective lines $\gamma = x\zeta + y$ in $P$ should be trivial. We assume in
the following that the origin is in $U$ and that $E'$ is trivial on the corresponding
line $\gamma = 0$. Then the triviality condition holds for nearby points, and for nearby
bundles.
The DS bundles
As it stands this is a straightforward specialization of the four-dimensional theory,
but less interesting because $P$ and $Q$ are not constrained. However, the
bundles corresponding to operators of the form (12.3.1) have additional structure:
the DS operators are invariant under translations in $y$ and therefore the
corresponding bundles $E' \to P$ are invariant under the flow along
$$Y' = \partial_\gamma = \tilde\zeta\partial_{\tilde\gamma}.$$
That is, $E'$ has a Lie derivative operator $\mathcal{L}_{Y'}$, given by
$$\partial_\gamma + \theta, \qquad \tilde\zeta\partial_{\tilde\gamma} + \tilde\theta$$
in the respective trivializations, where
$$\theta = f^{-1}f_y, \qquad \tilde\theta = \tilde f^{-1}\tilde f_y.$$
We can take $f$ to be independent of $y$, which implies that $\theta = 0$. However, at
$\tilde\zeta = 0$,
$$\tilde\theta = -h^{-1}\Lambda_1 h,$$
where $h = \tilde f|_{\tilde\zeta = 0}$, so the leading term in the expansion of $\tilde\theta(\tilde\gamma, \tilde\zeta)$ in powers of $\tilde\zeta$
is invariant up to conjugation. If we replace $\tilde f$ by $\tilde f H$, where $H(\tilde\gamma, \tilde\zeta)$ is regular
at $\tilde\zeta = 0$, then
$$\tilde\theta \mapsto H^{-1}\tilde\theta H + \tilde\zeta H^{-1}H_{\tilde\gamma},$$
and the conjugacy class of the leading term is unchanged. In fact we can say
more than this.
Proposition 12.3.1 An operator $L = \partial_x + P - \zeta(\partial_y + Q)$ is gauge-equivalent
to an operator of the DS form
$$\partial_x + A - \Lambda - \zeta\partial_y,$$
if and only if E' is invariant under the flow along Y' and there exists a trivial-
ization of E' in a neighbourhood of C = 0 in which
LY, = Y' - A + Ca
where a is holomorphic in V and lower triangular at oo.

Proof If L is of the DS form, then

&-'(A-A+ax)f =f-'ayf=e.
Hence if we define H(y, c) by evaluating f at y = 0, x = ry, and if we put
a = A(ry), then
B = H-1(-A+a+a7)H,
since 0 depends on x and y only through ry" = x + (y. A change of frame by H
reduces 0 to the required form.
Conversely, suppose that E' and Ly, have the stated properties, and let L
be the operator determined by E'. Because E' is invariant under the flow along
Y', there exists an invariant gauge in which
L=ax+P-((ar+Q)
where P and Q are trace-free and depend on x alone. The special frame for E'
determines a solution Ax, y, t;) to L l = 0 such that
-A+a=Cf-'f =f-'(a=+P-(Q)f.
Write $\tilde f(x, 0, \zeta) = g(x)(1 + \tilde\zeta h(x) + O(\tilde\zeta^2))$ and $a = a_0 + O(\tilde\zeta)$. By making a gauge
transformation by $g(x)$, we can arrange that $g = 1$. Then by putting $y = 0$, we
have
$$P - \zeta Q = -\Lambda + a_0 - [h, Q] + O(\tilde\zeta),$$
from which it follows that $Q = \Lambda_1$ and that $P + \Lambda_0$ is lower triangular (since a
commutator with $\Lambda_1$ is lower triangular). □
Patching matrices
It follows from the proposition that M is embedded in the space of holomorphic
vector bundles over $P$, with patching matrices $F(\gamma, \zeta)$ such that
$$\partial_\gamma F = \zeta^{-1}(\Lambda - a)F \qquad (12.3.3)$$
where $a$ is holomorphic in $\tilde V$ and lower triangular at $\zeta = \infty$ ($\tilde\zeta = 0$). Any
patching matrix such that (12.3.3) holds for some such $a$ determines a bundle
with the required symmetry by putting $\theta = 0$, $\tilde\theta = -F_\gamma F^{-1}$.
We recover the DS operator from $F$ by putting $\gamma = \zeta x$, and by making the
Birkhoff factorization $F = \tilde f^{-1}f$ with $\tilde f = 1$ at $\zeta = \infty$. By the argument in the
proof of the proposition
$$L = \partial_x - f_x f^{-1} = \partial_x + P - \zeta Q$$
is a DS operator, and the entries in the first row of $f$ are the complete solution
to (12.2.6).
The DS construction and the twistor dressing matrix
By taking the Penrose-Ward transform, we can represent an integral curve of
a flow on M by a one-parameter family of patching matrices $F(\gamma, \zeta, t)$. In this
section, we shall define the DS flows by giving the derivatives of $F$ with respect
to the parameters $t_i$.
The idea is to use the symmetry of $E'$ to pick out special representative
patching matrices for each point of M. What we should like to do is to find a
frame for each $E'$ in a neighbourhood of $\tilde\zeta = 0$ in which
$$\tilde\theta = -\tilde\zeta\Lambda. \qquad (12.3.4)$$
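For a given DS operator, we can choose the frames in $V$ and $\tilde V$ such that
$$F_\gamma F^{-1} = \tilde\zeta(\Lambda - a),$$
where $a$ is as in Proposition 12.3.1. Then the transition matrix $S(\tilde\gamma, \tilde\zeta)$ to the
required frame is given by solving
$$\partial_{\tilde\gamma}S + aS - [\Lambda, S] = 0. \qquad (12.3.5)$$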
Proposition 12.3.2 Suppose that $F_\gamma F^{-1} = \tilde\zeta(\Lambda - a)$, where $a$ is holomorphic
in $\tilde V$ and lower triangular at $\tilde\zeta = 0$. Then there exists a formal power series
$$S = \sum_{j=0}^{\infty} s_j(\tilde\gamma)\Lambda^{-j}$$
such that $\partial_{\tilde\gamma}S + aS - [\Lambda, S] = 0$, where each $s_j$ is a diagonal matrix. Moreover, $S$
is uniquely determined by $F$ up to $S \mapsto S\sum_{i \ge 0} c_i\Lambda^{-i}$, where the $c_i$s are constant
scalars. If the limit of $\tilde\zeta^{-m}a$ as $\tilde\zeta \to 0$ exists and is lower triangular, then we
can choose the constants so that $S = 1 + O(\tilde\zeta^m)$.
The proof is by the same argument as the one used to establish the existence
,of the dressing transformation in Appendix B. In fact, S is related to the dressing
transformation by
Texp(-ySA) = fS (12.3.6)
in a gauge in which L takes the DS form.
If we start in a general frame, then we can still find a formal power series
solution to
$$Y'(S) + \tilde\theta S = -S\tilde\zeta\Lambda,$$
with the same uniqueness property, by first transforming to a frame in which $\tilde\theta$
takes the special form in Proposition 12.3.1. We call $S$ the dressing matrix.
If the series converges, then $S$ determines a transformation to a frame in which
(12.3.4) holds. If it does not (the general case), then we can still find a frame
in which $\tilde\zeta^{-m}(\tilde\theta + \tilde\zeta\Lambda)$ vanishes at $\tilde\zeta = 0$ for arbitrary $m$ by truncating the formal
series. By choosing $m$ large enough, we can define the flows along $t_k$ up to any
given value of $k$.

The DS flows
We define a hierarchy of flows on M by
$$\partial_k F = (S\Lambda^k S^{-1})_{\rm fin}F \qquad (12.3.7)$$
($k = 1, 2, 3, \ldots$), where the subscript denotes the truncation of the formal power
series at some finite nonnegative power of $\tilde\zeta$ (the truncated series is a holomorphic
function of $\zeta$ and $\gamma$ on $V \cap \tilde V$). Note that in this equation $S$ is a functional
of $F$ determined by Proposition 12.3.2 and so it also depends on $t_k$. Because
multiplication of $F$ on the left by a function regular at $\tilde\zeta = 0$ does not change
$E'$, it does not matter where the series is truncated. Also, the flows are independent
of the choice of trivializations of $E'$ and of the choice of $S$. When the
dressing matrix converges for some $E' \in$ M, we can choose the frame for $E'$ in
a neighbourhood of $\tilde\zeta = 0$ so that $S = 1$. Then the flows are given by
$$F \mapsto \exp\Big(\sum_j t_j\Lambda^j\Big)F. \qquad (12.3.8)$$

In any case, the flows commute, and preserve the characteristic symmetry property
of the bundles corresponding to DS operators. This follows from the fact
that for any $m > 0$, we can pick a patching matrix for each $E' \in$ M such that
$S = 1 + O(\tilde\zeta^m)$ (by using a truncation of the series for $S$ to transform an initial
choice for the frame on $\tilde V$). With this choice,
$$\partial_j F = (\Lambda^j - a_j)F$$
where $a_j$ is holomorphic on $\tilde V$, and, for given $j$, can be made to vanish to
arbitrarily high order at $\tilde\zeta = 0$ by an appropriate choice of $m$. By taking $m$ large
enough for given $k$, $\ell$, it follows that the flows along $t_k$ and $t_\ell$ commute, modulo
an equivalence transformation of $F$.
Proposition 12.3.3 The flows on M defined by (12.3.7) coincide with those of
the nKdV hierarchy.

Proof Let E' E M. Choose the gauge so that L is of the DS form. Then
(12.3.6) holds, and, modulo terms in positive powers of (,
TAIT-' = f(a,F)F-'f-' = (aif)f-1 -
(a,f)f-1

Therefore we can take (03f)f' = -Rj, where R, is the polynomial part of

TA'T-1, written as a formal Laurent series in powers of (. But f is a solution
to L f = 0. Hence (aj L) f + La, f = 0, which implies that a; L = [Rj, L].
Remark. The solution to (12.3.7) is a family of patching matrices $F(\gamma, \zeta, t)$,
labelled by $t = (t_1, t_2, \ldots)$. It is determined uniquely by the initial DS operator,
up to the standard freedom $F \mapsto \tilde H^{-1}FH$, where $H$ and $\tilde H$ are regular $SL(n, \mathbb{C})$-valued
functions on $V$ and $\tilde V$, depending holomorphically on the parameters.
Since each patching matrix represents a DS operator, we can use this freedom
to ensure that for all $t$,
$$F_\gamma F^{-1} = \zeta^{-1}(\Lambda - a),$$
where $a$ is holomorphic in $\tilde V$ and lower triangular at $\zeta = \infty$. Under this condition
on $F$, the flow along $t_1$ is translation in $x$. We deduce this from
a,F = -BF = ((-'SAS-' +Y'(S))F.
But the formal series $Y'(S)$ contains only positive powers of $\tilde\zeta$, so $(F_{t_1} - \zeta F_\gamma)F^{-1}$
is holomorphic in $\tilde V$, and $F$ is constant, up to equivalence, along $\partial_{t_1} - \zeta\partial_\gamma$.
This implies that the effect of adding a constant $t_1$ to $x$ is the same as that of
adding $\zeta t_1$ to $\gamma$. It follows that we can integrate the first flow by $F(\gamma, \zeta, t_1) =
F(\gamma + \zeta t_1, \zeta)$, where $F$ on the right-hand side is the initial patching matrix.
Construction of the solution
Suppose that we have found the dependence of $F$ on $t$. Then we can recover the
DS operators by the following steps: (i) substitute $\gamma = x\zeta$ in $F(\gamma, \zeta, t)$, (ii) make
a Birkhoff factorization $F = \tilde f^{-1}f$ with $\tilde f = 1$ at $\zeta = \infty$. Then for each $t$, we
have that
$$\partial_x - f_x f^{-1}$$
is a DS operator. Further, the entries in the first row of $f$ give the general solution
to the corresponding eigenvalue equation (12.2.6) and, if $S$ is the dressing matrix
of $F$, then $T = \tilde f S$ is the dressing transformation of $L$.
If, at fixed $t$, we put $\gamma = \zeta x + y$ and make any Birkhoff factorization $F =
\tilde f^{-1}f$, then $f$ and $\tilde f$ are both fundamental solutions to the linear equation $Lf = 0$,
although in general both will depend on $y$. However, if $S$ is the dressing matrix,
then $\tilde f S$ is also a (formal) solution and
$$(\tilde f S)^{-1}\partial_y(\tilde f S) = -\tilde\zeta\Lambda.$$
We can multiply $\tilde f S$ on the right by any matrix-valued function of $\gamma = \zeta x + y$
and $\zeta$, and it will remain a formal solution. In particular, $\tilde f S\exp((x + \zeta^{-1}y)\Lambda)$ is
a formal solution which is independent of $y$.
Now put $y = 0$, $\gamma = \zeta x$, and fix the factorization by the condition $\tilde f = 1$ at
$\zeta = \infty$. Then $\partial_x - f_x f^{-1}$ is a DS operator, and
$$(\partial_x - f_x f^{-1})\,\tilde f S e^{x\Lambda} = 0.$$
So if we put
$$\psi(x, \lambda) = \kappa\tilde f S e^{x\Lambda}v = e^{\lambda x}\kappa\tilde f S v,$$
then $\psi$ is the Baker function as defined in equation (12.2.8).

Patching matrices for the nKdV hierarchy

Equation (12.3.7) determines the flows of the nKdV hierarchy in terms of the
patching matrix of the initial DS operator. However, since S depends on F in
a complicated way, the equation is not easy to solve in its general form. We
have mentioned already one straightforward special case: if the dressing matrix
of the initial operator converges, then we can choose the initial patching matrix
so that S = 1 and we can integrate (12.3.7) by (12.3.8). In this case, the twistor
construction reduces the initial value problem to a Riemann-Hilbert problem.
The patching matrices of the form (12.3.8) are those of the Segal-Wilson ansatz;
we give an example below.
Even when the dressing transformation does not converge, we can still use the
construction to find the general solution to the hierarchy up to any predetermined
level. We do this by deducing from (12.3.7) that the patching matrices can be
made to depend on the parameters in a particularly simple way. We write down
`generating functions' that have this dependence, and construct solutions to the
hierarchy by solving a Riemann-Hilbert problem. The generating function has
a natural geometric interpretation as the patching matrix of a solution to the
truncated GASDYM hierarchy.

The generating function

We say that a matrix-valued holomorphic function or formal power series in
$\tilde\zeta = \zeta^{-1}$ is $O(\Lambda^{-k})$ if it can be expressed as a formal series in negative powers
of $\Lambda$ with diagonal matrix-valued functions of $\tilde\gamma$ as coefficients, and with leading
term a diagonal matrix times $\Lambda^{-k}$. When $k = 0$, this means that the coefficient
of $\tilde\zeta^0$ must be lower triangular.
Suppose that $n \ge 2$ and choose $m \ge 1$. For each DS operator, we can choose
the local trivializations so that
$$S = 1 + O(\Lambda^{-nm+1}).$$
Then for k < nm,
88FF-1 =r;-'(A- a), akF=AkF+rk, (12.3.9)
where a and rk are holomorphic in V, a = O(A-nm+2), and rk = O(A-nm+1+k)
We define $G(\zeta, \mu_A)$ by evaluating $F$ at
$$t_{nm-n+A} = \zeta^{-m}\mu_A, \qquad A = 1, 2, \ldots, n - 1,$$
with $\gamma$ and all the other parameters $t_k$ set equal to zero; then, because $\Lambda^n = \zeta 1_n$,
where $1_n$ is the identity matrix,
$$\frac{\partial G}{\partial\mu_A}G^{-1} = \zeta^{-1}\Lambda^A + \zeta^{-m}R_A \qquad (12.3.10)$$
where $R_A$ is holomorphic in $\tilde V$, with $R_A = O(\Lambda^{-n+1+A})$. We call $G$ a generating
function of the hierarchy (up to level $nm$).
A generating function determines the first nm - 1 flows as follows. Given G,
we define F(y, (, t) by evaluating G at 3
$$\mu_1 = \gamma + \zeta t_1 + \zeta^2 t_{n+1} + \cdots + \zeta^m t_{nm-n+1}$$
$$\mu_2 = \zeta t_2 + \zeta^2 t_{n+2} + \cdots + \zeta^m t_{nm-n+2}$$
$$\vdots$$
$$\mu_{n-1} = \zeta t_{n-1} + \zeta^2 t_{2n-1} + \cdots + \zeta^m t_{nm-1};$$
then $F$ also satisfies (12.3.9), and so $F$ is a solution to (12.3.7). Moreover, at the
origin in the parameter space, $F$ is a patching matrix for the initial DS operator.
It follows that $F$ is a patching matrix for all the DS operators obtained from the
initial one by the first $nm - 1$ flows; moreover, the dressing matrix for $F$ is again
of the form $1 + O(\Lambda^{-nm+1})$.
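The bookkeeping in this substitution, namely which time $t_k$ multiplies which power of $\zeta$ in each $\mu_A$, can be sketched in a few lines. The function names and the dictionary layout below are hypothetical; the index rule $k = n(i-1) + A$ for the coefficient of $\zeta^i$ is the one used in the text.

```python
def mu_time_indices(n, m):
    """For each A = 1..n-1, list the indices k of the times t_k multiplying zeta^1..zeta^m."""
    return {A: [n * (i - 1) + A for i in range(1, m + 1)] for A in range(1, n)}


def mu(A, zeta, times, gamma, n, m):
    """Evaluate mu_A; `times` maps k to t_k (missing entries count as zero).

    Only mu_1 carries the extra constant term gamma.
    """
    total = gamma if A == 1 else 0
    for i in range(1, m + 1):
        total += zeta ** i * times.get(n * (i - 1) + A, 0)
    return total


n, m = 3, 2
table = mu_time_indices(n, m)
used = sorted(k for ks in table.values() for k in ks)

# Example value: mu_1 = gamma + zeta*t_1 + zeta^2*t_4 with n = 3, m = 2.
example = mu(1, 2.0, {1: 1.0, 4: 0.5}, 0.25, n, m)
```

The times whose index is a multiple of $n$ never appear, in line with the statement that those flows are trivial, and the largest index used is $nm - 1$.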
We obtain the solution directly from $G$ by solving the Riemann-Hilbert problem
$G = \tilde f^{-1}f$, with the above substitution for the $\mu_A$s. Any matrix $G(\zeta, \mu_A)$
such that (12.3.10) holds determines a solution to the first $nm - 1$ levels of the
nKdV hierarchy, and every solution arises in this way. We summarize this in the
following proposition.
Proposition 12.3.4 Let $G$ be a holomorphic matrix-valued function of $\zeta$, $\mu_A$,
where $\zeta$ is in some annulus in the complex plane and $A = 1, 2, \ldots, n - 1$, such
that
$$\zeta^m\Big(\frac{\partial G}{\partial\mu_A}G^{-1} - \zeta^{-1}\Lambda^A\Big)$$
extends over the exterior of the annulus, and is $O(\Lambda^{-n+1+A})$. Put
$$\mu_A = \sum_{k=0}^{m-1}\zeta^{k+1}t_{A+nk}$$
and let $G = \tilde f^{-1}f$ be the Birkhoff factorization with $\tilde f = 1$ at $\zeta = \infty$. Then
$$L = \partial_x - f_x f^{-1}$$
is a family of DS operators parametrized by $t_2, t_3, \ldots$, satisfying the nontrivial
equations of the nKdV hierarchy up to order $nm$. Every solution arises in this
way.

The GASDYM hierarchy

In the previous subsection we showed that a solution to the nKdV hierarchy up to
level $nm$ determines a solution to the truncated GASDYM hierarchy $H(n - 1, m)$
and hence a holomorphic vector bundle over the total space of the vector
bundle $\oplus^{n-1}O(m)$ over $\mathbb{CP}^1$. We have a direct connection between these two
constructions in the following.
Proposition 12.3.5 Let $G(\zeta, \mu_A)$ be a generating function for a solution to
the nKdV hierarchy (up to level $nm$). Then $G$ is a patching matrix for the
corresponding solution to $H(n - 1, m)$.
Proof Put
$$\mu_A = x^{A0} + \zeta x^{A1} + \cdots + \zeta^m x^{Am}$$
and make a Birkhoff factorization $G = \tilde f^{-1}f$ for fixed $x^{Ai}$. Then
LAi = aAi - (aA,i-1 + (aAif - (an,i-lf)f-1
(1 < i < m, 1 < A < n - 1) is a solution to H(n - 1, m). On the other hand, if
we put xAO = 0, 5Ai = tn(i-1)+A (i > 0), and fix the factorization by imposing
f=1at(=oo,then
L11 = all - (a11f)f-1 .

is a family of DS operators. We have to show that (8.6.2) holds, where the Mks
are the operators constructed from L11 by the DS construction.
By differentiating with respect to xA*, and by making use of (12.3.10),
f-1(8Aif)f-1f =('-'A A + ('-"RA +f-1(aAtf)-
However, the dressing transformation of L11 is T = 1S, where the dressing
matrix S is of the form 1 + O(A-l"), and so, for i > 1,
(anif)f-1 =T('-'A AT-1 +O(A-1),

and the proposition follows.

Explicit construction of the flows
When the dressing transformation converges, the flows are integrated by (12.3.8),
which determines the functions satisfying the nKdV hierarchy explicitly from the
patching matrix F of the initial DS operator, up to the solution of a Riemann-
Hilbert problem.
We can also integrate the flows if we make a particular ansatz for F. Let us
suppose that
F( Y, () = (12.3.11)
where $g$ takes values in $SL(n, \mathbb{C})$ and is holomorphic in $V \cap \tilde V$ (i.e. in an annulus
in the $\zeta$-plane), and $\phi$ takes values in $sl(n, \mathbb{C})$ and is of the form
$$\phi = \Lambda - a,$$
where $a$ is holomorphic in $\tilde V$, depends only on $\zeta$, and is lower triangular at
$\zeta = \infty$. Then
$$\tilde\theta = -F_\gamma F^{-1}$$

is holomorphic in V, and has the form required for the application of Proposition
12.3.1. Therefore F determines a DS operator.
By differentiating (12.3.5) with respect to $\tilde\gamma$, we have in this case
$$\partial_{\tilde\gamma}(S^{-1}S_{\tilde\gamma}) + [S^{-1}S_{\tilde\gamma}, \Lambda] = 0$$
since $a$ is independent of $\tilde\gamma$, and so, by the uniqueness of formal solutions to the
dressing equation (Appendix B), $S^{-1}S_{\tilde\gamma} = \sum_{i=0}^{\infty} b_i\Lambda^{-i}$ for some constant complex
scalars $b_i$. Hence
(12.3.12)
where the constants $b_i$ are determined by the behaviour of the eigenvalues of $\phi$
as $\zeta \to \infty$.
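The differentiated dressing equation used above follows from (12.3.5) in two lines. In the sketch below a prime denotes the derivative with respect to the twistor coordinate appearing in (12.3.5), and $B = S^{-1}S'$ is shorthand introduced here for illustration; the only inputs are (12.3.5) itself and the assumption that $a$ and $\Lambda$ do not depend on that coordinate.

```latex
% (12.3.5), rearranged:
S' + aS - [\Lambda, S] = 0
\;\Longrightarrow\;
S' + aS - \Lambda S = -S\Lambda .
\\[4pt]
% Differentiating (12.3.5) with a' = 0 shows that S' = SB solves the same equation:
0 = (SB)' + a\,SB - [\Lambda, SB]
  = (S' + aS - \Lambda S)\,B + S B' + S B \Lambda
  = S\,\bigl(B' + [B, \Lambda]\bigr).
```

Since $S$ is invertible as a formal series, $B' + [B, \Lambda] = 0$, which is the displayed equation for $S^{-1}S'$.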
Now consider the family of patching matrices
$$F(\gamma, \zeta, t) = \exp\Big(\sum_j t_j\phi^j\Big)F(\gamma, \zeta). \qquad (12.3.13)$$
Since $F_\gamma F^{-1} = \zeta^{-1}\phi$ for every $t$, these all have the same dressing matrix $S$. Also
$$F_{t_j}F^{-1} = S(\Lambda + b_0 + b_1\Lambda^{-1} + \cdots)^j S^{-1}
= S\Lambda^j S^{-1} + j b_0 S\Lambda^{j-1}S^{-1} + \cdots.$$
Hence the tangent to the flow along $t_j$ is the tangent to the $j$th DS flow, plus a
constant linear combination of the tangents to the lower DS flows, the constants
being polynomials in the $b_i$s. It follows that (12.3.13) integrates the DS flows,
up to a linear transformation of the parameters. This clearly reduces to the
Segal-Wilson ansatz when $\phi = \Lambda$.
12.4 EXPLICIT CONSTRUCTION OF SOLUTIONS FROM TWISTOR DATA
In this section we discuss the construction of explicit solutions from two different
forms of twistor patching data. The first case is the Segal-Wilson ansatz in which
the function g is taken to be rational; the second is the Krichever construction.
The Segal-Wilson ansatz with rational g
Take $g = R$, where $R$ is a rational function of $\zeta$ as in Example (9.3.3). Define
$$C(\zeta, t) = \exp\Big(\sum_k t_k\Lambda^k\Big). \qquad (12.4.1)$$

To find the Baker function explicitly in this case, we first find the Birkhoff
factorization CR = f -' f , where CR is treated as a function of (, with t fixed,
and f =1 at ( = 0. Then

V)(x,C,t) +x,t2,...),
where = rcfv, with k, A, v defined by eqn (12.2.7). We shall factorize CR by
using the method and notation of Example (9.3.3). The only difference is that
now C is the particular entire function (12.4.1); C depends on the parameter
t = (ti , t2, ... ), but R, and the vectors a,, bi E C" determined by the behaviour
of R at its zeros and poles, do not. In the notation of Example (9.3.3),
k
Vi=rcfv=l+ rcxi
S-ai
y; v
1

where k
yj,v = KC(/3j, t)bj + j rcxiMij = 0.
i=1

We shall express the Baker function in terms of the $\tau$ function, which we define in this context by

r(t) = det(Mi.) = det t)bj ,

/3j -ai /
(see §12.6). It should be remarked that $C$ is given by (12.4.1), and that the
singularity data $\alpha_i$, $\beta_i$ (2n points inside the unit circle) and $a_i$, $b_i$ (2n vectors in
$\mathbb{C}^n$) can be chosen freely. Thus an expression for the Baker function in terms of $\tau$ gives a way of
determining explicit solutions to the hierarchy in closed form from unconstrained
data.
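The central calculation is the following. For $\lambda \in \mathbb{C}$ near $\infty$, put
$$t_\lambda = \Big(t_1 - \frac{1}{\lambda},\; t_2 - \frac{1}{2\lambda^2},\; t_3 - \frac{1}{3\lambda^3},\; \ldots\Big).$$
Then
$$C(\zeta, t_\lambda) = Q(\zeta, \lambda)\,C(\zeta, t)$$
where $Q = 1 - \lambda^{-1}\Lambda$. Now,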
(a /3avrc
Q(a, A)-'Q(Q, A) = 1 +

where C = an (note that Q and vrc are n x n matrices). Therefore,

r(ta) = det [aC_'(cit)

1 vr, ) C(Q3,t)bj]
(13j ai ai
k
1
= det [MI(t) + aTEyivrCxrMrj(t)
r=1
t
= det[Mij(t)] det bij + yiv rcxj
ai

where the expressions in square brackets are scalars labelled by i, j, and in taking
the determinants, we treat them as entries in a k x k matrix. We conclude that 4
$$\hat\psi(\lambda, t) = \frac{\tau(t_\lambda)}{\tau(t)}.$$
If we write
$$\hat\psi = 1 + a_1\lambda^{-1} + a_2\lambda^{-2} + \cdots$$
then we find the coefficients $a_i$ from the formal expansion of $\tau$ in inverse powers
of $\lambda$, and hence find the $u_i$s in (12.2.6) as functions of $t$. For example, $a_1 =
-\partial_1\log\tau$, and hence
$$u_{n-2} = n\,\partial_1^2\log\tau.$$
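The shift of the times by $-1/(k\lambda^k)$ used in the central calculation multiplies $C(\zeta, t)$ by $\exp(-\sum_k \Lambda^k/(k\lambda^k))$, which is the exponential of the logarithm series of $1 - \Lambda/\lambda$. The sketch below checks this numerically for $n = 2$ in plain Python. The helper names, the truncation orders and the sample values of $\zeta$ and $\lambda$ are assumptions chosen so that the series converge (the eigenvalues of $\Lambda/\lambda$ are $\pm 0.25$ here).

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]


def mat_scale(c, a):
    return [[c * entry for entry in row] for row in a]


def mat_exp(a, terms=40):
    """Matrix exponential by its Taylor series (adequate for small matrices of small norm)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, a))
        result = mat_add(result, term)
    return result


zeta, lam = 0.25, 2.0
Lam = [[0.0, 1.0], [zeta, 0.0]]   # assumed n = 2 form of Lambda

# Generator of the shift: -sum_{k>=1} Lambda^k / (k lambda^k), truncated at k = 60.
gen = [[0.0, 0.0], [0.0, 0.0]]
power = identity(2)
for k in range(1, 61):
    power = mat_mul(power, Lam)
    gen = mat_add(gen, mat_scale(-1.0 / (k * lam ** k), power))

shift = mat_exp(gen)                                   # exp(sum of shifted-time terms)
Q = mat_add(identity(2), mat_scale(-1.0 / lam, Lam))   # Q = 1 - Lambda / lambda
shift_err = max(abs(shift[i][j] - Q[i][j]) for i in range(2) for j in range(2))
```

Since all powers of $\Lambda$ commute, this factor can be pulled out of the exponential defining $C$, which gives the relation between $C(\zeta, t_\lambda)$ and $C(\zeta, t)$.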
As a simple explicit example, take $n = 2$, $k = 1$, $\alpha_1 = 0$, $\beta_1 = s^2$, where $s$
is real, and $a_1 = b_1 = (1, 0)$. Put $t_1 = x$, $t_3 = t$, and set the other $t_i$s to zero.
Then
$$C(\alpha_1, t) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \qquad
C(\beta_1, t) = \begin{pmatrix}
\cosh(sx + s^3t) & s^{-1}\sinh(sx + s^3t) \\
s\sinh(sx + s^3t) & \cosh(sx + s^3t)
\end{pmatrix},$$
from which we get $\tau(x, t) = s^{-1}\cosh(sx + s^3t)$, and hence the standard single
soliton solution to the KdV equation,
$$u = 2s^2\,\mathrm{sech}^2(sx + s^3t).$$

The multisoliton solutions are given by taking higher values of k.
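The single-soliton formulas can be sanity-checked numerically. The sketch below uses finite differences in plain Python to confirm that $2\partial_x^2\log\tau$ reproduces the stated $u$, and that $u$ satisfies a KdV equation. The normalization $4u_t = u_{xxx} + 6uu_x$ is an assumption: it is the one compatible with the travelling-wave speed in $sx + s^3t$, and is not quoted from the text. The function names, step sizes and sample point are likewise illustrative.

```python
import math


def tau(x, t, s):
    """tau function of the single soliton, as given in the example."""
    return math.cosh(s * x + s ** 3 * t) / s


def u_closed(x, t, s):
    """Closed-form soliton u = 2 s^2 sech^2(s x + s^3 t)."""
    return 2 * s ** 2 / math.cosh(s * x + s ** 3 * t) ** 2


def u_from_tau(x, t, s, h=1e-3):
    """u = 2 d^2/dx^2 log(tau), by a central second difference."""
    f = lambda z: math.log(tau(z, t, s))
    return 2 * (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2


def kdv_residual(x, t, s, h=2e-3):
    """Residual of 4 u_t - u_xxx - 6 u u_x (assumed normalization), by central differences."""
    u = lambda a, b: u_closed(a, b, s)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t) + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return 4 * u_t - u_xxx - 6 * u(x, t) * u_x


s, x, t = 0.8, 0.3, -0.2
tau_err = abs(u_from_tau(x, t, s) - u_closed(x, t, s))
residual = abs(kdv_residual(x, t, s))
```

Both quantities are small up to discretization error; a different normalization of the KdV equation would require rescaling $t$ or $u$ accordingly.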

The Krichever construction

Another possibility is to take $F$ as in eqn (12.3.11), with $g = 1$ and $\zeta^{-1}\phi$ a
polynomial of degree $m$ in $\zeta^{-1}$. In this case, from equation (12.3.13), we have to
solve the Riemann-Hilbert problem
$$\tilde f(\zeta, t)\exp(t_1\phi + t_2\phi^2 + t_3\phi^3 + \cdots) = f(\zeta, t) \qquad (12.4.2)$$
at each fixed value of $t$, with $\tilde f(\infty, t) = 1$. In fact, all we need to do is to extract
the Baker function, which we can do explicitly in terms of theta functions by
using ideas from Dubrovin (1981), and Segal and Wilson (1985), the central one
being to replace the $\zeta$-sphere by a complex spectral curve $\Sigma$ on which the various
expansions in powers of $\phi$ become power series in a coordinate $\lambda$.
We define $\Sigma$ to be the set of $(\zeta, \lambda)$ such that
$$\det(\phi - \lambda 1_n) = 0, \qquad (12.4.3)$$
where $1_n$ is the $n \times n$ identity matrix. Near $\zeta = \infty$, this implies that $\lambda^n \sim \zeta$
because of the asymptotic condition on $\phi$. To deal with the singularities in
(12.4.3), we compactify $\Sigma$ by using $\lambda^{-1}$ as a coordinate near $\zeta = \infty$, and either
$\zeta$ or $\mu = \zeta^{-1}\lambda$ elsewhere, so that $\Sigma$ becomes an $n$-fold branched covering of
$\mathbb{CP}^1$. The branches correspond to the solutions to
$$\det(\Phi - \mu 1_n) = 0,$$
Explicit construction of solutions from twistor data 267

as an equation for p in terms of (, where c = (1-10, which is a polynomial

of degree m in C. The, branch points are the points at which the discriminant
vanishes, that is, where two or more eigenvalues come into coincidence. Now the
discriminant is a polynomial of degree n(n - 1) in the entries in 4D, and hence
a polynomial of degree n(n - 1)m in (, so there are n(n - 1)m branch points
(n - 1 of them are at C = oo). By the standard triangulation argument (see, for
example, Kirwan 1992), E has genus g = I(n - 1)(mn - 2).
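
The genus count can be cross-checked with the Riemann-Hurwitz formula, which is the usual outcome of the triangulation argument. The sketch below assumes the generic situation in which all $n(n-1)m$ branch points are simple, so that each contributes one to the total ramification; the function names are illustrative.

```python
from fractions import Fraction

def genus_riemann_hurwitz(n, m):
    """Genus of an n-sheeted cover of CP^1 with n(n-1)m simple branch points.

    Riemann-Hurwitz: 2g - 2 = n * (2*0 - 2) + (total ramification).
    Assumes every branch point is simple (the generic case).
    """
    branch_points = n * (n - 1) * m
    two_g_minus_2 = -2 * n + branch_points
    return Fraction(two_g_minus_2 + 2, 2)

def genus_formula(n, m):
    """The closed form quoted in the text: g = (n - 1)(mn - 2) / 2."""
    return Fraction((n - 1) * (m * n - 2), 2)

if __name__ == "__main__":
    for n in range(2, 6):
        for m in range(1, 5):
            print(n, m, genus_riemann_hurwitz(n, m), genus_formula(n, m))
```

For $n = 2$ this reproduces the hyperelliptic count: $2m$ branch points and genus $m - 1$.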
Away from the branch points, each eigenspace
\[ L_{\zeta,\mu} = \{\Phi v = \mu v\} \subset \mathbb{C}^n \]
is one-dimensional, and so we can define a holomorphic map by
\[ p\colon (\zeta, \lambda) \in E \mapsto [L_{\zeta,\mu}] \in \mathbb{CP}^{n-1}, \]
that is, by mapping each point of $E$ to the corresponding eigenspace. We can also construct local holomorphic maps $E \to \mathbb{C}^n$ by mapping $(\zeta, \mu)$ to, for example, the vector of cofactors of the first row of $\Phi - \mu 1_n$. These are holomorphic at the branch points, and wherever they are nonzero, they combine with the projection $\mathbb{C}^n \to \mathbb{CP}^{n-1}$ to give $p$. By piecing together these local definitions for a general choice of $\phi$, we extend $p$ to obtain a global holomorphic map $p\colon E \to \mathbb{CP}^{n-1}$.
Let $L = p^*(O(-1))$ denote the pull-back to $E$ of the tautological bundle
\[ O(-1) \to \mathbb{CP}^{n-1}. \]
Then, for every $(\zeta, \mu) \in E$, including the branch points, the elements of $L_{\zeta,\mu}$ are solutions to the eigenvector equation $\phi v = \lambda v$. Any row vector (element of $\mathbb{C}^{n*}$) determines a homogeneous function of degree one on $\mathbb{C}^n$, and hence a global section of $L^* = p^*(O(1))$. Thus we have a linear map $\mathbb{C}^{n*} \to \Gamma(L^*)$. The basic fact that we need to solve the nKdV hierarchy is that this is an isomorphism. It then follows from the Riemann-Roch formula that, in general, $L^*$ has degree $n + g - 1$, and hence that, in general, a section of $L^*$ has $n + g - 1$ zeros. Each section corresponds to a hyperplane in $\mathbb{CP}^{n-1}$; the zeros are its intersections with $p(E)$.⁵
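
The cofactor description of the eigenspace map can be illustrated in the simplest case. The sketch below takes $\Phi$ to be the $3 \times 3$ matrix with ones on the superdiagonal and $\zeta$ in the lower-left corner; this form is an assumption modelled on the $2 \times 2$ matrix of Example 12.5.1, and all function names are illustrative. The first-row cofactors of $\Phi - \mu 1_n$ then give an eigenvector because $M\,\mathrm{adj}(M) = \det(M)\,1$ vanishes on the spectral curve.

```python
def minor(mat, i, j):
    """Matrix obtained by deleting row i and column j."""
    size = len(mat)
    return [[mat[r][c] for c in range(size) if c != j]
            for r in range(size) if r != i]

def det(mat):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(mat) == 1:
        return mat[0][0]
    return sum((-1) ** j * mat[0][j] * det(minor(mat, 0, j))
               for j in range(len(mat)))

def first_row_cofactors(mat):
    """Vector of cofactors of the first row of mat."""
    return [(-1) ** j * det(minor(mat, 0, j)) for j in range(len(mat))]

def shift_matrix(n, zeta):
    """Assumed model matrix: ones on the superdiagonal, zeta in the lower-left corner."""
    lam = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        lam[i][i + 1] = 1
    lam[n - 1][0] = zeta
    return lam

def eigenvector_from_cofactors(phi, mu):
    """First-row cofactors of (phi - mu * identity)."""
    n = len(phi)
    shifted = [[phi[r][c] - (mu if r == c else 0) for c in range(n)]
               for r in range(n)]
    return first_row_cofactors(shifted)

def matvec(mat, v):
    """Matrix acting on a column vector."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in mat]

if __name__ == "__main__":
    phi = shift_matrix(3, 8)                 # zeta = 8, so mu = 2 is an eigenvalue
    v = eigenvector_from_cofactors(phi, 2)
    print(v, matvec(phi, v))
```

With $\zeta = 8$ and $\mu = 2$ the cofactor vector is proportional to $(1, \mu, \mu^2)$, as one expects for this model matrix.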
Now suppose that $f$, $\tilde f$ is a solution to the Riemann-Hilbert problem (12.4.2). Denote by $\kappa$ the row vector $(1, 0, \ldots, 0)$ and let $v \in L_{\zeta,\mu}$. Then we can define a holomorphic function $\psi$ on $E$ by
\[ \begin{aligned}
\psi\,\kappa v &= \kappa\tilde f\exp(t_1\phi + t_2\phi^2 + t_3\phi^3 + \cdots)v \\
&= e^{t_1\lambda + t_2\lambda^2 + t_3\lambda^3 + \cdots}\,\kappa\tilde f v \\
&= \kappa f v. \qquad (12.4.4)
\end{aligned} \]
From the second line, $\psi$ has an essential singularity at $\zeta = \infty$, where its behaviour is that of the exponential, and it has poles at the zeros of $\kappa v$, the locations of which are independent of $t$. Near $\zeta = \infty$, $\phi \sim \Lambda - a$, where $a$ is lower triangular, and therefore a holomorphic section of $L$ at $\zeta = \infty$ has the form
\[ v \sim r\begin{pmatrix} \lambda^{1-n} \\ \lambda^{2-n} \\ \vdots \\ 1 \end{pmatrix} \]
for some $r$. It follows that $\kappa v$ has a zero of order $n - 1$ at $\zeta = \infty$ (as a section of $L^*$), and therefore that $\psi$ has $g$ poles elsewhere. Note also that the zero of $\kappa v$ at infinity cancels that of $\kappa\tilde f v$ because $\tilde f = 1$ at infinity.
We know from the general theory that, with $x = t_1$, $\psi$ is a solution to (12.2.6); and we have just shown that
\[ \psi \sim e^{x\lambda + t_2\lambda^2 + \cdots} \]
near $\zeta = \infty$, where the ratio of the two sides is holomorphic in $\lambda^{-1}$ at $\lambda = \infty$. Therefore $\psi$ is an $x$-independent multiple of the Baker function.
However, the behaviour of $\psi$ at $\zeta = \infty$, together with the fact that it has $g$ poles elsewhere on $E$, is sufficient to determine it uniquely for each $t$, up to multiplication by a constant, the uniqueness being a consequence of the Riemann-Roch formula. We borrow here from Dubrovin, who gives an explicit formula in the context of a different approach to the same problem. In his terminology, $\psi$ is a Baker-Akhiezer function. Segal and Wilson (1985) give the formula as follows.
Let $R = \mathbb{C}^g$ denote the space of holomorphic differentials on $E$ and let $\{a_i, b_i\}$, $i = 1, \ldots, g$, be a standard basis for the cycles on $E$ (that is, the only nonempty intersections are $a_i \cap b_j$ with $i = j$). Let $\omega_j$, $j = 1, 2, \ldots$, be the holomorphic 1-forms on $E - \{\zeta = \infty\}$ that behave like $d\lambda^j$ at $\zeta = \infty$. Let $\theta$ be the classical theta function on $R^* = \mathbb{C}^g$ and let $\alpha\colon E \to R^*$ be the Abel map,
\[ \langle\alpha(\sigma), \beta\rangle = \int_\infty^\sigma \beta, \qquad \sigma \in E,\ \beta \in R, \]
where the lower limit is the point $\zeta = \infty$ ($\alpha$ depends on the path of integration, but the dependence disappears from the formula). Finally let $c \in \mathbb{C}^g$ be the vector such that $\theta(\alpha(\sigma) - c)$ vanishes at the poles of $\psi$ and let $w_j \in \mathbb{C}^g$ be the vector of $b$-periods of $\omega_j$. Then
\[ \psi = \exp\left[\int_\infty^\sigma (t_1\omega_1 + t_2\omega_2 + \cdots)\right]
\frac{\theta(\alpha(\sigma) - c + t_1w_1 + t_2w_2 + \cdots)}{\theta(\alpha(\sigma) - c)}. \]
We remark, finally, that this construction of Krichever's maps the DS flows onto a linear flow on the Jacobian of $E$, since the dependence of the transition functions of $L \to E$ is given by the exponential in (12.4.4).
Geometric interpretation
The solutions given by this construction have a simple geometric characterization in terms of $H(n-1, m)$, namely that they are invariant under translation along $\partial_{1m}$, as well as along $\partial_{10}$. The flows along $\partial_{10}$ and $\partial_{1m}$ are generated on the twistor space $O(m) \oplus \cdots \oplus O(m)$ by the vector fields $\partial/\partial\mu^1$ and $\zeta^m\,\partial/\partial\mu^1$, respectively. For the particular patching matrix given by (12.3.13), the corresponding

in V, and by
aa1
Lie derivative operators are given by
G10 = 1 L im =µ+ -D

a
Llm = a74=1

in
Their difference $\mathcal{L}_{1m} - \zeta^m\mathcal{L}_{10}$ is a global section of $\mathrm{adj}(E') \otimes O(m)$, where $O(m)$ denotes the standard line bundle over $\mathbb{CP}^1$, pulled back to the twistor space. Conversely, given a bundle constructed from the first $nm$ flows of the DS hierarchy for which such a global section exists, with the same form as $\phi/\zeta$ at infinity, we can construct Lie derivative operators and recover $F$ of the form (12.3.11) by choosing a frame in $V$ which is invariant along $\partial/\partial\mu^1$, and one in $\tilde V$ which is invariant along $\zeta^m\,\partial/\partial\mu^1$. If we restrict the corresponding patching matrix to $\mu^2 = \mu^3 = \cdots = 0$, then it has the form
\[ F(\gamma, \zeta) = \exp(\gamma\phi/\zeta), \]
where $\gamma = \mu^1$. It follows that the corresponding solution to the truncated nKdV hierarchy is given by the Krichever construction. The Riemann surface can be found directly from the section of $\mathrm{adj}(E') \otimes O(m)$, and the construction of the line bundle $L$ is an example of Abelianization.

12.5 HAMILTONIAN FORMALISM

In Appendix C, we explain how a presymplectic form on a manifold determines
a symplectic structure on the reduced space-the quotient by the distribution
spanned by the characteristic distribution of the presymplectic form. In this
section, we shall explain how the Hamiltonian aspects of the Drinfeld-Sokolov
theory can be understood as an example of this construction (see also Wilson
1988). The idea is to define a presymplectic structure on a space of patching matrices in such a way that reduction identifies patching matrices that are related by $F \mapsto HF$, where $H$ is regular in $\tilde V$. Each point of the quotient will then generate a unique operator and we can understand the construction of the dressing matrix as the selection of a particular representative in the presymplectic space for each element of the quotient.
We shall show that the DS flows are Hamiltonian with respect to the presymplectic form
\[ \Omega(\delta F, \delta' F) = \frac{1}{2\pi i}\int \mathrm{tr}(\mathbf{a}\,\partial_\gamma\mathbf{a}' - \mathbf{a}'\,\partial_\gamma\mathbf{a})\, d\zeta \wedge d\gamma, \]
where $\mathbf{a} = F^{-1}\delta F$, $\mathbf{a}' = F^{-1}\delta' F$, and the integral is over the product of the unit circle in the $\zeta$-plane and some contour in the $\gamma$-plane. Because the integrand is holomorphic, the integration path can be deformed without altering the integral. To make sense of this, and to show that this is the same symplectic form as the one that we have already considered in the case of the KdV flows, we shall look now at the twistor construction from a different perspective, in which we can take account of boundary conditions in a simple way.
DS operators on the real line
In the Hamiltonian formalism, we are interested in DS operators on the real line
which take a standard form at infinity. We put
L=ax+P-((ay+Q),
where $x$ is real, $A = P - \zeta Q + \Lambda$ is trace free and lower triangular, and we think of $L$ as acting on sections of an $SL(n, \mathbb{C})$ bundle $E \to \mathbb{R}$. When $P$ and $Q$ can be continued into a neighbourhood of the real axis in the $x$-plane, we can represent $L$ by the patching matrix $F(\gamma, \zeta)$ of the corresponding bundle over part of $\mathbb{CP}^2$.
To extend this description to operators that are not necessarily analytic in $x$, we look at the patching matrix in a different way. We define $G\colon W \times S^1 \to SL(n, \mathbb{C})$ by $G(x, \zeta) = F(x\zeta, \zeta)$, where $W$ is a subset of the real line and $S^1$ is the unit circle in the $\zeta$-plane. Then the symmetry condition on $F$ becomes
\[ G_xG^{-1} = \Lambda - a, \qquad (12.5.1) \]
where, for each $x$, $a$ extends holomorphically to the exterior of $S^1$ and is lower triangular at $\zeta = \infty$. A generating function is any smooth function $G\colon \mathbb{R} \times S^1 \to SL(n, \mathbb{C})$ that satisfies this condition.
Given a generating function, we can recover a DS operator on the real line by making the unique Birkhoff factorization $G = \tilde f^{-1}f$ for each fixed $x$ such that $\tilde f(x, \infty) = 1$. We then have
\[ \tilde f(\Lambda - a)\tilde f^{-1} + \tilde f_x\tilde f^{-1} = f_xf^{-1} \]
and hence, by the same argument as on p. 258, that
\[ L = \partial_x - f_xf^{-1} = \partial_x + P - \zeta Q \]
is a DS operator; $f$ is a fundamental solution to the linear equation $Lf = 0$, and $\tilde f$ satisfies
\[ L\tilde f = \tilde f(a - \Lambda). \]
When $a = 0$, $\tilde f$ is a dressing transformation in the sense of Appendix B.
Every DS operator arises in this way since we can, for example, take $G(x, \zeta) = f(x, \zeta)$, where $f$ is the fundamental solution to the linear equation $Lf = 0$ such that $f = 1$ at some point $x_0$ (in this case $a = A = P - \zeta Q + \Lambda$). Given one generating function for $L$, we can construct others by replacing $G$ by $\tilde CGC$, where $C$ depends only on $\zeta$, and extends to a regular function on the interior of the unit circle, and $\tilde C(x, \zeta)$ extends to a regular function on the exterior of the unit circle for each $x$, and takes values in $N$ at $\zeta = \infty$.
Boundary conditions
When $A = 0$, we can take $G = \exp(x\Lambda)$, $a = 0$, and $\tilde f = 1$. The operators that we shall consider all approach this standard form at infinity in the sense that $A \to 0$ as $x \to \pm\infty$. Each has a family of generating functions such that (i) $a \to 0$ as $x \to \pm\infty$, and (ii) $\tilde f \to 1$. For example, $G = f$ where $f$ is any fundamental solution to $Lf = 0$ which extends holomorphically in $\zeta$ inside the unit circle.
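
The standard form can be checked directly for $n = 2$. The sketch below assumes $\Lambda$ is the matrix with rows $(0, 1)$ and $(\zeta, 0)$, as in Example 12.5.1, so that $\Lambda^2 = \zeta$ and $\exp(x\Lambda)$ has a closed form; it then verifies numerically that $G_xG^{-1} = \Lambda$, that is, condition (12.5.1) with $a = 0$. Function names and the step size are illustrative.

```python
import cmath

def G(x, zeta):
    """exp(x * Lambda) for Lambda = [[0, 1], [zeta, 0]], using Lambda^2 = zeta * 1."""
    lam = cmath.sqrt(zeta)
    c = cmath.cosh(lam * x)
    s = cmath.sinh(lam * x) / lam
    return [[c, s], [zeta * s, c]]

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(a):
    """Inverse of a 2 x 2 matrix."""
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def log_derivative(x, zeta, h=1e-4):
    """G_x G^{-1}, with G_x approximated by a central difference."""
    gp, gm = G(x + h, zeta), G(x - h, zeta)
    gx = [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    return matmul2(gx, inverse2(G(x, zeta)))

if __name__ == "__main__":
    zeta = cmath.exp(0.7j)          # a point on the unit circle
    print(log_derivative(0.3, zeta))  # should be close to [[0, 1], [zeta, 0]]
```

The determinant of $G$ is $\cosh^2 - \sinh^2 = 1$, so $G$ takes values in $SL(2, \mathbb{C})$ as required of a generating function.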
We shall denote by $\mathcal{G}$ the space of all generating functions such that $a \to 0$ as $x \to \pm\infty$, and by $M$ the space of DS operators on the real line such that $A \to 0$ as $x \to \pm\infty$. The exact definition of convergence is not critical, but here and below we shall take it to mean 'rapidly decreasing in $x$, uniformly in $\zeta$', for $|\zeta| = 1$.
Within $\mathcal{G}$, we want to pick out subspaces which are large enough to contain orbits of the DS flows but small enough to allow the definition of a symplectic form. There are many ways to do this, but we shall use the following. First we define $\mathcal{H}$ to be the group of smooth maps $H\colon \mathbb{R} \times S^1 \to SL(n, \mathbb{C})$ such that
\[ \lim_{x \to \pm\infty} H \in LSL(n, \mathbb{C}). \]
Its Lie algebra $\mathfrak{h}$ consists of smooth maps $\mathbb{R} \times S^1 \to sl(n, \mathbb{C})$ with smooth limits as $x \to \pm\infty$.
Given $G, G' \in \mathcal{G}$, we say that $G \sim G'$ whenever $G'G^{-1} \in \mathcal{H}$. This is an equivalence relation, and the corresponding equivalence classes are the intersections with $\mathcal{G}$ of the right cosets of $\mathcal{H}$ in the group of all smooth maps $\mathbb{R} \times S^1 \to SL(n, \mathbb{C})$. We shall define the presymplectic form on each of the equivalence classes.
The tangent space
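Let $C \subset \mathcal{G}$ be an equivalence class. A tangent vector to $C$ at $G$ is represented by $\alpha = \delta G\,G^{-1} \in \mathfrak{h}$ with the property
\[ \delta a = -\partial_x\alpha + [\Lambda - a, \alpha] \to 0 \]
as $x \to \pm\infty$. We can also represent a tangent vector by the map $\mathbf{a} = G^{-1}\alpha G$. We then have
\[ \mathbf{a} = G^{-1}\delta G, \qquad \delta a = -G\,\partial_x\mathbf{a}\,G^{-1}, \]
where $\mathbf{a}$ has the asymptotic property $G\,\partial_x\mathbf{a}\,G^{-1} \to 0$ as $x \to \pm\infty$ (this must be used with caution because $G$ is generally singular at $x = \pm\infty$).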
The presymplectic form
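We define a bilinear form on $T_GC$ by
\[ \Omega(\alpha, \alpha') = \int \mathrm{tr}(\alpha\,\delta' a - \alpha'\,\delta a)\,\frac{d\zeta \wedge dx}{2\pi n i}
= -\int \mathrm{tr}(\mathbf{a}\,\partial_x\mathbf{a}' - \mathbf{a}'\,\partial_x\mathbf{a})\,\frac{d\zeta \wedge dx}{2\pi n i}, \]
where $\mathbf{a} = G^{-1}\delta G$, $\mathbf{a}' = G^{-1}\delta' G$, and so on. It is clear from the first integral that $\Omega$ is well defined since the integrand is rapidly decreasing, and from the second that $\Omega$ is closed, since the integrand does not involve $G$. However, $\Omega$ is degenerate, in particular because
\[ \Omega(\alpha, \alpha') = 0 \]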
whenever $\alpha = \delta G\,G^{-1}$ is rapidly decreasing in $x$ and extends holomorphically to the outside of the unit circle on the $\zeta$-sphere. Thus $\Omega$ descends to a closed form on the quotient $C/\mathcal{H}_-$ of $C$ by the left action of $\mathcal{H}_-$, where $\mathcal{H}_- \subset \mathcal{H}$ is the subgroup of smooth maps $H\colon \mathbb{R} \times S^1 \to SL(n, \mathbb{C})$ such that (i) $H$ is holomorphic in $\zeta^{-1}$ outside the unit circle, and (ii) $H \to 1$ as $x \to \pm\infty$.
For any generating function, the factorization $G = \tilde f^{-1}f$ is unique, so a variation $\delta G$ in $G$ determines variations in $f$ and $\tilde f$, and hence a variation in $L$. Since $L$ and $L + \delta L$ are both DS operators, $\delta Q = 0$. Moreover, the left action of $\mathcal{H}_-$ leaves $L$ invariant, so each element of the quotient $C/\mathcal{H}_-$ generates a unique DS operator.

Dressing
To construct the Hamiltonians for the DS flows, we want to choose a representative $G$ for each element of $C/\mathcal{H}_-$ such that $a$ takes a standard form. We do this by adapting the construction of the dressing transformation (see also Lemma B.3). Given $G \in C$, we look for $H \in \mathcal{H}_-$ such that
\[ \partial_xH - [\Lambda, H] + aH = H(h_0 + h_1\Lambda^{-1} + \cdots + h_{nk-1}\Lambda^{-nk+1} + r) \qquad (12.5.2) \]
where $a$ is defined by (12.5.1), the $h_i$s are rapidly decreasing complex (scalar) functions of $x$, and $r = O(\Lambda^{-nk})$ as $\zeta \to \infty$. We can find a suitable $H$ by first constructing the formal series $S = \sum_j S_j\Lambda^{-j}$, where the $S_j$s are diagonal matrices determined recursively by
axSj + (aS), = SI 1 + (Sh)j, So = 1. (12.5.3)
where $h = \sum_{i=0}^\infty h_i\Lambda^{-i}$ and the other notation is as in Appendix B. The difference from the previous definition is that at each stage, we choose the entries in $S_{j+1}$ to be polynomials in the entries in $a$ and their $x$ derivatives so that the trace-free part of eqn (12.5.3) holds, and we then pick $h_{j+1}$ so that the trace of the next equation holds (noting that $\mathrm{tr}(S_{j+2}) = 0$). This is possible since if $S_k$ and $h_k$ are such polynomials for $k < j$, then so are $h_j$ and $S_j$. Defined in this way, the $S_j$s and the $h_j$s are rapidly decreasing (before we took $h_j = 0$, and adjusted $S_j$ at each stage by adding an $x$ integral to make the left-hand side trace free, but in general this gives coefficients $S_j$ that are not rapidly decreasing). We define $H$ by truncating the formal series at some large enough value of $j$, and then by normalizing the determinant. The result is an element of $\mathcal{H}_-$, which is uniquely determined by (12.5.2) up to multiplication on the right by an element of $\mathcal{H}_-$ of the form
\[ C = c_0 + c_1\Lambda^{-1} + \cdots + c_{nk-1}\Lambda^{-nk+1} + O(\Lambda^{-nk}), \]
where the $c_j$s are rapidly decreasing scalar functions of $x$. For any element of $C/\mathcal{H}_-$, we can choose $G$ such that
\[ a = h_0 + h_1\Lambda^{-1} + \cdots + h_{nk-1}\Lambda^{-nk+1} + r \qquad (12.5.4) \]
where the $h_j$s are polynomials in the coefficients of the corresponding DS operator, and their $x$-derivatives, by first taking $G = f$, for which $A = a$, and then replacing $G$ by $HG$. If $a$ is given by (12.5.4), then
\[ L\tilde f = \tilde f(-\Lambda + h_0 + h_1\Lambda^{-1} + \cdots + h_{nk-1}\Lambda^{-nk+1}) + O(\Lambda^{-nk}), \]
and therefore, by the theory in Appendix B, the first $nk - 1$ DS flows are given by
\[ \partial_jG = \Lambda^jG. \]
Note that these are tangent to $C$ since $\partial_jG\,G^{-1} \in \mathfrak{h}$, the Lie algebra of $\mathcal{H}$.
Hamiltonians
We define the Hamiltonians $H_j\colon C \to \mathbb{C}$ by
\[ H_j = \int_{-\infty}^{\infty} h_j\, dx. \]
Although $S$ is not uniquely determined by (12.5.2), these integrals are well defined on the quotient because the effect of replacing $S$ by $SC$ is to add to the $h_j$s the $x$-derivatives of a sequence of rapidly decreasing functions.
Suppose now that $G$ has been chosen so that (12.5.4) holds, and consider the tangent vector $\alpha = \Lambda^j$ for some $j < nk$. Let $\alpha'$ be another tangent at $G$ such that the corresponding variation preserves the form of $a$ in equation (12.5.4) (this can always be arranged by adding an element of the Lie algebra of $\mathcal{H}_-$, which is in the kernel of $\Omega$, to an arbitrary initial choice of $\alpha'$). Then we have that
\[ \delta a = 0, \qquad \delta' a = \delta' h_0 + \delta' h_1\Lambda^{-1} + \cdots + \delta' h_{nk-1}\Lambda^{-nk+1} + O(\zeta^{-k}) \]
and consequently that
\[ \Omega(\alpha, \alpha') = \frac{1}{2\pi n i}\int \mathrm{tr}(\Lambda^j\,\delta' a)\, d\zeta \wedge dx = \delta' H_j. \]
Hence the Hamiltonians $H_j$ generate the flows $\partial_jG = \Lambda^jG$, that is, the DS flows.
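Example 12.5.1 The KdV equation. In the case of the KdV flow, we can take
\[ L = \partial_x + \begin{pmatrix} 0 & 0 \\ u & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ \zeta & 0 \end{pmatrix} \]
(see Appendix B). Let us look for the formal series $S$ of the form
\[ S = \begin{pmatrix} 1 & 0 \\ p & q \end{pmatrix}, \]
where $p$ and $q$ are formal power series in $\zeta^{-1}$. We require
\[ S^{-1}LS = \partial_x - \Lambda + \sum h_i\Lambda^{-i}
= \partial_x + \frac{1}{q}\begin{pmatrix} -pq & -q^2 \\ p^2 + p_x + u - \zeta & q_x + pq \end{pmatrix}. \]
If we put $\lambda^2 = \zeta$, then this gives
\[ q_x + 2pq = 0, \qquad p^2 + p_x + u - \zeta + \zeta q^2 = 0, \qquad \sum h_i\lambda^{-i} = -p - \lambda(q - 1). \]
With $w = -\lambda p - \lambda^2(q - 1)$, the first two equations are equivalent to
\[ u = 2w + \lambda^{-1}w_x - \lambda^{-2}w^2, \]
which is the Miura-Gardner-Kruskal transformation (Miura et al. 1968). It follows that the Hamiltonians determined by the DS construction are the same as those constructed in §8.1 by expanding
\[ \int_{-\infty}^{\infty} w\, dx \]
in powers of $\lambda^{-1}$.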
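
The equivalence claimed in the example can be spot-checked numerically. In the sketch below we pick an arbitrary smooth profile for $q$ (the choice $q = 1 + \epsilon\sin x$ and all function names are hypothetical, made only for the test), solve $q_x + 2pq = 0$ for $p$, obtain $u$ from the second equation, and compare it with the right-hand side of the transformation. All derivatives are computed in closed form, so the two values should agree to rounding error.

```python
import math

def miura_check(x, lam, eps=0.1):
    """Return (u, rhs) where rhs = 2 w + w_x / lam - w^2 / lam^2.

    q = 1 + eps * sin(x) is a hypothetical test profile, not taken from the text.
    """
    a = eps * math.sin(x)          # a = q - 1
    a_x = eps * math.cos(x)
    a_xx = -eps * math.sin(x)
    q = 1.0 + a
    p = -a_x / (2.0 * q)                           # from q_x + 2 p q = 0
    p_x = -(a_xx * q - a_x**2) / (2.0 * q**2)      # derivative of p
    zeta = lam**2
    u = zeta - zeta * q**2 - p**2 - p_x            # from p^2 + p_x + u - zeta + zeta q^2 = 0
    w = -lam * p - zeta * (q - 1.0)
    w_x = -lam * p_x - zeta * a_x
    rhs = 2.0 * w + w_x / lam - w**2 / zeta
    return u, rhs

if __name__ == "__main__":
    for x in (0.0, 0.4, 1.3):
        u, rhs = miura_check(x, lam=3.0)
        print(x, u, rhs)
```

The agreement rests on the identity $u - (2w + \lambda^{-1}w_x - \lambda^{-2}w^2) = \lambda(q_x + 2pq)$, which vanishes by the first equation.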
Remark. The (n + 1)th flow of the nKdV hierarchy determines a solution to
the ASDYM equation with the symmetry H+0 by identifying
L,
with the Lax pair for an ASD connection D = d + P, with x = w + iu and
to+1 = z (see Appendix B). Thus we can map M into the space of solutions to
the ASDYM equation, modulo gauge, by taking L E M as the initial operator
in the construction of the $(n + 1)$th flow. If the factorization of $G$ is $G = \tilde f^{-1}f$ and if we put $\phi$ equal to the value of $-\delta f\,f^{-1}$ at $\zeta = 0$, then $\delta L = -\partial_x\phi - [P, \phi]$, and a formal integration by parts gives
\[ \Omega(\alpha, \alpha') = \int \mathrm{tr}(\phi D_x\phi' - \phi' D_x\phi)\, dx. \]
By making a gauge transformation to the special gauge used in Chapter 8, we can deduce that $\Omega$ formally coincides with $\Omega_1$.⁶
12.6 THE KP EQUATION AND THE KP HIERARCHY
The Kadomtsev-Petviashvili (KP) hierarchy is a key family of integrable equa-
tions in two space and one time variables, which have important applications
across pure and applied mathematics. It is also the most basic member of a col-
lection of systems with similar theoretical properties, which includes the Davey-
Stewartson equations. Segal and Wilson (1985) show that the KP hierarchy
contains all the nKdV hierarchies and explain how to identify a large class of
solutions with points of an infinite-dimensional Grassmannian; they also explain
some of the connections with algebraic geometry and quantum field theory.
It has not been possible to identify the KP equations with a straightforward
reduction of any of the self-duality equations, and there is a good reason for this:
the Lax pairs of the self-duality equations and their reductions can always be
represented by vector fields on some larger space. For example, in the case of
an ASD connection on a vector bundle E - U, the covariant derivatives De and
D/z can be identified with a pair of vector fields on the total space of q' E, where
q is the projection from the correspondence space. But the Lax pair for the
KP equation involves a time-dependent Schrödinger operator or heat operator, which cannot be reduced to a vector field by introducing additional variables.⁷

So, unless the KP equations have two inequivalent linear systems, which seems
an unlikely possibility, they cannot be reductions of any self-duality equation.
On the other hand, our central theme has been that the twistor constructions
provide a unifying framework for the study of integrable equations. It is certainly
one that embraces all the reductions of the ASDYM equations, and we shall
see in the final chapter how it can be extended to include various self-duality
conditions on the curvature of a four-dimensional metric. In this section we shall
look at ways in which the twistor constructions can be adapted to include the
KP hierarchy.
In the first subsection, we review an approach to the KP equations based on Segal and Wilson (1985), but presented in a way that makes clear the connection with our patching-matrix construction of the nKdV hierarchy. We then turn to a different construction, which takes as its starting point the alternative characterization of holomorphic vector bundles in terms of their $\bar\partial$-operators (see §9.5). We sketch the $\bar\partial$-version of the correspondence given by the Penrose-Ward transform between holomorphic vector bundles on $O(n)$ and solutions of the Bogomolny hierarchy. To extend this to the KP equation, we replace the $\bar\partial$-operator by a differential operator that restricts to a Dirac operator on each $\mathbb{CP}^1$ in twistor space. We then go on to review the definition of the $\tau$-function and to relate it to the earlier material, and in the final subsection we explain how the approaches extend to the Davey-Stewartson equations.

The KP hierarchy
In the twistor construction of solutions to integrable equations, we start with a
patching matrix (a function of ( and some other twistor variables), we substitute
linear expressions in the space-time coordinates for the twistor variables, and
then find the solution by making a Birkhoff factorization. In the case of the
ASDYM hierarchy in §12.1, for example, the twistor variables become Laurent
series in (, with the parameters along the flows as coefficients. Different choices
of the patching matrix give different solutions.
For the solutions given by the Segal-Wilson ansatz, the patching matrix is $F = C(\zeta, t)g(\zeta)$ where $C$ is the standard expression
\[ C(\zeta, t) = \exp\Bigl(\sum_j t_j\Lambda^j\Bigr), \]
and $g$ varies from solution to solution and determines a point in a Grassmannian. In the KP construction, the point of view shifts. We now regard $C$ as a fixed 'patching matrix' for each $t = (t_1, t_2, \ldots)$ and think of the matrix $g(\zeta)$ as modifying the meaning of the 'positive frequency' boundary condition on the unit circle, so that, in the new definition, a row vector-valued function $w(\zeta)$ is positive-frequency whenever $wg^{-1}$ is holomorphic in the interior of the unit disc.
To set this in a more general context, consider a Hilbert space $H$ with a fixed decomposition $H = H_+ \oplus H_-$ into two orthogonal subspaces. Suppose that we are given (i) a linear map $\Lambda\colon H \to H$ and (ii) an element $e_0 \in H_+$ such that $e_k = \Lambda^ke_0$, $k \geq 0$, is a basis for $H_+$ and $e_{-k} = \Lambda^{-k}e_0$, $k \geq 1$, is a basis for $H_-$. We can then identify $H$ with $L^2(S^1, \mathbb{C})$, where $S^1$ is the unit circle in the complex $\lambda$-plane, by putting $e_k = \lambda^k$, but in the examples it will sometimes be useful to think of $H$ as an abstract space. If we make the identification, then $H_+$ and $H_-$ become the standard positive- and negative-frequency subspaces, and $C$ becomes an Abelian group of multiplication operators $c\colon H \to H$, where
\[ c(\lambda, t) = \exp\Bigl(\sum_j t_j\lambda^j\Bigr), \]
with the parameters $t_j$ labelling the different elements of the group.
We now choose a subspace $W \subset H$ in the Grassmannian $\mathrm{Gr}$ such that $H = W \oplus H_-$ (see §9.7). The generalized factorization problem is that of finding $\tilde\psi$ and $\psi$ such that $\tilde\psi c = \psi$, where $\tilde\psi - 1 \in H_-$ and $\psi \in W$.⁸ We then have
\[ \psi(t, \lambda) = c(1 + a_1\lambda^{-1} + a_2\lambda^{-2} + \cdots) \]
where the $a_i$s are functions of $t$. We call $\psi$ the Baker function.
By differentiating once with respect to $t_k$ and repeatedly with respect to $t_1$, we have
\[ \partial_k\psi = c(\lambda^k + a_1\lambda^{k-1} + \cdots), \qquad
\partial_1^i\psi = c\bigl(\lambda^i + a_1\lambda^{i-1} + (a_2 + i\,\partial_1a_1)\lambda^{i-2} + \cdots\bigr). \]
By adding combinations of the $t_1$ derivatives to the derivative with respect to $t_k$ ($k \geq 2$), we find a unique sequence of functions $u_i(t)$ such that
\[ c^{-1}\Bigl(\partial_k\psi - \partial_1^k\psi - \sum_{i=0}^{k-2} u_i\,\partial_1^i\psi\Bigr) \in H_-, \]
where $c^{-1}$ acts on $H = L^2(S^1, \mathbb{C})$ by multiplication. But the left-hand side is in $c^{-1}W$, so both sides vanish whenever $c^{-1}W$ is transverse to $H_-$ in $H$, which will certainly be the case for small $t$. If we put
\[ M_k = \partial_k - \partial_1^k - \sum_{i=0}^{k-2} u_i\,\partial_1^i, \]
then the Baker function satisfies the linear equations $M_i\psi = 0$, and the operators $M_i$ all commute. From the conditions $[M_i, M_j] = 0$, we obtain a sequence of nonlinear evolution equations for the coefficients $u_i$ as functions of $t_2, t_3, \ldots$, which is called the KP hierarchy. We can write the first two operators in the form
\[ M_2 = \partial_2 - \partial_1^2 - u, \qquad M_3 = \partial_3 - \partial_1^3 - \tfrac32u\,\partial_1 + v, \]
where $u = -2\partial_1a_1$ and $v$ is another function constructed from the $a_i$s. The condition $[M_2, M_3] = 0$ determines $\partial_1v$ in terms of $u$, and is equivalent to the KP equation,
\[ \partial_x(4u_t - u_{xxx} - 6uu_x) = 3u_{yy}, \]
where $x = t_1$, $y = t_2$, $t = t_3$.
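
As a numerical sanity check on the normalization of the KP equation, the following sketch tests a one-line-soliton profile $u = 2\partial_x^2\log(1 + e^\theta)$ with $\theta = (p - q)x + (p^2 - q^2)y + (p^3 - q^3)t$. This test solution is a standard ansatz brought in for illustration, not one derived in the text; the parameter values, function names and step size are arbitrary choices.

```python
import math

def u(x, y, t, p=1.0, q=0.5):
    """Line soliton: 2 d^2/dx^2 log(1 + exp(theta)) = ((p-q)^2 / 2) sech^2(theta / 2)."""
    theta = (p - q) * x + (p**2 - q**2) * y + (p**3 - q**3) * t
    return 0.5 * (p - q) ** 2 / math.cosh(0.5 * theta) ** 2

def kdv_part(x, y, t, h):
    """4 u_t - u_xxx - 6 u u_x by central differences."""
    u_t = (u(x, y, t + h) - u(x, y, t - h)) / (2 * h)
    u_x = (u(x + h, y, t) - u(x - h, y, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, y, t) - 2 * u(x + h, y, t)
             + 2 * u(x - h, y, t) - u(x - 2 * h, y, t)) / (2 * h**3)
    return 4 * u_t - u_xxx - 6 * u(x, y, t) * u_x

def kp_residual(x, y, t, h=1e-2):
    """d/dx (4 u_t - u_xxx - 6 u u_x) - 3 u_yy by central differences."""
    lhs = (kdv_part(x + h, y, t, h) - kdv_part(x - h, y, t, h)) / (2 * h)
    u_yy = (u(x, y + h, t) - 2 * u(x, y, t) + u(x, y - h, t)) / h**2
    return lhs - 3 * u_yy

if __name__ == "__main__":
    print("KP residual:", kp_residual(0.2, -0.3, 0.1))
```

For this ansatz the equation reduces to the algebraic relation $4ac - a^4 = 3b^2$ with $a = p - q$, $b = p^2 - q^2$, $c = p^3 - q^3$, which holds identically.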
If $\lambda^2W \subset W$, then $\tilde\psi$ is independent of $y$, and $u$ satisfies the KdV equation; when $\lambda^nW \subset W$, we obtain a solution to the nKdV hierarchy. In fact, any
solution to the KdV equation is also a y-independent solution to the KP equation;
but not every such solution arises from this construction-only those for which
the dressing matrix converges, that is, those for which there exists a frame for
the Penrose-Ward bundle in a neighbourhood of infinity in which the Ly' takes
the standard form.
We can also understand the KP equation as a generalization of the KdV equation in another closely related sense by observing that the linear equation $M_2\psi = 0$ is the time-dependent Schrödinger equation (with $it_2$ as 'time') corresponding to the time-independent Schrödinger equation (11.5.5) associated with the KdV equation. This is the basis of the inverse-scattering approach to the KP equation (see, for example, Ablowitz and Clarkson 1991).
Special choices for W
Particular classes of solutions to the KP or KdV equations are obtained by
making special choices for W.
(i) To recover the nKdV solutions given by the Segal-Wilson ansatz, we take $H = L^2(S^1, \mathbb{C}^n)$, where $S^1$ is the unit circle in the $\zeta$-plane, and $H_+$ and $H_-$ to be the usual positive and negative frequency subspaces. We represent the elements of $H$ as row vectors, take $\Lambda$ to be the standard matrix in the DS construction, acting on $H$ by right multiplication, $e_0 = (1, 0, \ldots, 0)$, and $W = H_+g^{-1}$, that is, $W$ is the image of $H_+$ under multiplication on the right by $g^{-1}$. The identification map $\iota\colon H \to L^2(S^1, \mathbb{C})$ is given by
\[ (w_1(\zeta), w_2(\zeta), \ldots, w_n(\zeta)) \mapsto w_1(\lambda^n) + \lambda w_2(\lambda^n) + \cdots + \lambda^{n-1}w_n(\lambda^n), \]
under which $e_0\tilde fC$ is mapped to the Baker function.
(ii) In the Krichever construction, we identify $\lambda^{-1}$ with a coordinate in a neighbourhood of the point $\lambda = \infty$ on the Riemann surface $E$, and we identify $H$ with the square-integrable functions on a small circle around this point, with values in a line-bundle over $E$. In this case $W$ is the space of sections that extend holomorphically to the exterior of the circle in $E$.
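
The identification map in case (i) can be made concrete for polynomial data. The sketch below represents each component $w_i(\zeta)$ by its list of coefficients and checks that right multiplication by $\Lambda$ on row vectors corresponds to multiplication by $\lambda$. It assumes $\Lambda$ has ones on the superdiagonal and $\zeta$ in the lower-left corner (the $n \times n$ analogue of the matrix in Example 12.5.1); the data layout and function names are illustrative.

```python
def iota(w):
    """Map (w_1(zeta), ..., w_n(zeta)) to sum_i lambda^(i-1) w_i(lambda^n).

    Each w_i is a list of coefficients in zeta (index = power).
    The result is a dict {power of lambda: coefficient}.
    """
    n = len(w)
    out = {}
    for i, poly in enumerate(w):              # i = 0 corresponds to w_1
        for k, coeff in enumerate(poly):      # coeff multiplies zeta^k
            if coeff != 0:
                power = i + n * k
                out[power] = out.get(power, 0) + coeff
    return out

def times_Lambda(w):
    """Right-multiply the row vector w by Lambda (assumed form, see lead-in).

    (w Lambda)_1 = zeta * w_n and (w Lambda)_j = w_(j-1) for j >= 2.
    """
    shifted_last = [0] + list(w[-1])          # zeta * w_n(zeta)
    return [shifted_last] + [list(p) for p in w[:-1]]

def times_lambda(series):
    """Multiply a Laurent polynomial {power: coeff} by lambda."""
    return {k + 1: c for k, c in series.items()}

if __name__ == "__main__":
    w = [[1, 2], [0, 3], [5]]                 # n = 3: (1 + 2 zeta, 3 zeta, 5)
    print(iota(times_Lambda(w)))
    print(times_lambda(iota(w)))
```

In particular $e_0\Lambda^k$ is sent to $\lambda^k$, which matches the abstract description $e_k = \Lambda^ke_0$ given earlier.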
The Dirac operator and the KP equation
This construction is elegant, but differs from twistor constructions in that, except in special cases, it is not formulated in terms of finite-dimensional geometry. However, the analogous 'Dolbeault' construction, in which the twistor $\bar\partial$-operator is replaced by a Dirac operator, is a finite-dimensional geometric construction. Before describing it, we first give a brief description of the Dolbeault version of the $O(n)$ correspondence for solutions of the Bogomolny hierarchy following that in Sparling (1991), so that we can emphasize the analogy with the Penrose-Ward transform.⁹
This formulation in terms of Dirac operators has the further advantage that it gives the full class of solutions to the KP equations rather than just those given by the Segal-Wilson ansatz.¹⁰ It is an adaptation of the '$\bar\partial$-construction' of Fokas and Ablowitz (see Ablowitz and Clarkson 1991, and references therein, and also Zakharov and Manakov 1985), which generates all the solutions given by the inverse-scattering method.

The Dolbeault formulation of the Penrose-Ward transform

Let $E'$ be a holomorphic vector bundle on the twistor space $O(n)$. In the 'Dolbeault' description of $E'$, one chooses a global smooth frame of the bundle and represents the holomorphic structure on the bundle by the operator $\bar\partial_{E'}$ defined in §9.5. Since the fibres of $O(n)$ are Stein manifolds, the frame can be chosen so that it is holomorphic along the fibres. Then
\[ \bar\partial_{E'} = \bar\partial + a\,d\bar\lambda, \]
where $\bar\partial$ is the ordinary $\bar\partial$-operator on functions, $\lambda$ is an affine coordinate on $\mathbb{CP}^1$, $\gamma$ is a linear coordinate on the fibres of $O(n)$, and $a(\gamma, \lambda, \bar\lambda)$ is a smooth matrix-valued function of $\gamma$ and $\lambda$, with holomorphic dependence on $\gamma$.
Recall that points of space-time $\mathbb{C}^{n+1}$ correspond to the global holomorphic sections
\[ \gamma = t_0 + t_1\lambda + \cdots + t_n\lambda^n \]
of $O(n) \to \mathbb{CP}^1$. To recover a solution to the truncated Bogomolny hierarchy on $\mathbb{C}^{n+1}$ from $E'$, we must construct global holomorphic frames on each such section. This requires the solution of the $\bar\partial$-equation
(12.6.1)
where $t \in \mathbb{C}^{n+1}$, $\gamma = t_0 + t_1\lambda + \cdots + t_n\lambda^n$, and $f$ is required to be smooth and nondegenerate on the whole of $\mathbb{CP}^1$. Instead of appealing to Grothendieck's theorem, we can use the index theorem and a genericity argument to show that, for generic $t$ and $a$, such a solution exists and is unique up to left multiplication by a constant: the index of $\bar\partial_{E'}$ is the same as that of $\bar\partial$, so it is equal to the rank of the bundle.
The linear system for the corresponding solution of the Bogomolny hierarchy is then obtained by observing that the operators
\[ L_i = f^{-1}(\partial_{i+1} - \lambda\partial_i)f \qquad (12.6.2) \]
commute with $\bar\partial_a$ as a consequence of eqn (12.6.1) and are regular over each $\mathbb{CP}^1$ except for a simple pole at $\lambda = \infty$. They are therefore linear functions of $\lambda$ by the usual Liouville argument. They also commute with each other, and hence determine a solution to the Bogomolny hierarchy.
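
One ingredient in this argument is easy to test: the vector fields $\partial_{i+1} - \lambda\partial_i$ annihilate $\gamma = \sum_j t_j\lambda^j$, and hence any function that depends on $t$ only through $\gamma$ and $\lambda$. The sketch below checks this by finite differences for a sample function; the choice of sample function, the restriction to real $t$, and the helper names are all illustrative assumptions.

```python
import cmath

def gamma(t, lam):
    """gamma = t_0 + t_1 lam + ... + t_n lam^n."""
    return sum(tj * lam**j for j, tj in enumerate(t))

def sample(t, lam):
    """A hypothetical function of t that depends on t only through gamma."""
    g = gamma(t, lam)
    return cmath.exp(g) + g**2

def partial(fn, t, lam, j, h=1e-5):
    """Central-difference derivative of fn with respect to t_j."""
    tp, tm = list(t), list(t)
    tp[j] += h
    tm[j] -= h
    return (fn(tp, lam) - fn(tm, lam)) / (2 * h)

def twistor_field(fn, t, lam, i):
    """(d/dt_{i+1} - lam d/dt_i) applied to fn."""
    return partial(fn, t, lam, i + 1) - lam * partial(fn, t, lam, i)

if __name__ == "__main__":
    t = [0.1, -0.2, 0.3, 0.05]
    lam = 0.4 + 0.3j
    for i in range(len(t) - 1):
        print(i, abs(twistor_field(sample, t, lam, i)))
```

Each printed value should be at the level of the finite-difference error, reflecting $\partial_{i+1}\gamma - \lambda\partial_i\gamma = \lambda^{i+1} - \lambda\cdot\lambda^i = 0$.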

The $\bar\partial$-approach to the KP hierarchy

To obtain solutions to the (truncated) KP hierarchy, we again work on the total space of $O(n)$ with the same correspondence between sections and points of the space-time $\mathbb{C}^{n+1}$, but we now replace eqn (12.6.1) by

0, where 111a = (a 8a) (12.6.3)

and where $a$ is a smooth function on $O(n)$ restricted to a section $\gamma = \sum_j t_j\lambda^j$. We take
\[ a = \exp(\bar\gamma - \gamma)\,a_0(\lambda, \bar\lambda), \]
where $a_0 = O(\exp(-|\lambda|^{n+1}))$ as $\lambda \to \infty$. We can again argue by the index theorem and genericity that the solution space has real dimension 2 for almost all $t$. If we assume that it is two-dimensional at $t = 0$, then it will also be two-dimensional in some neighbourhood of $t = 0$. We can identify the solution space with $\mathbb{C}$ by taking the value of $\psi'$ at $\infty$ and we can fix the solution uniquely by the condition $\psi' = 1$ at $\lambda = \infty$.
Because of the asymptotic condition on a, = O(exp(-Ian"+1) so that
at A = oo, Vi' has a Taylor series
tp'(t, A) - 1 + a1A-1 + a2 A-2 + ,

where the as are independent of A. If we put 0 = /e"ti', we find that

0, where =I 0 I . (12.6.4)

Since Pao is independent of ti, we have that O W is also a solution. As before,

we construct operators
k-2
Mk = Bk - O - E ui81
0

for k = 2, ... , n such that a-yMktJi has a Taylor series containing only negative
powers of A at A= oo. Then 17a0Mk = 0 where Mk acts on' by diag(Mk, Mk),
so that
Pa ((e-rMk)q,) = 0
where IF = y) and the t, are assumed to be real. However, e-,'Mk?P is
smooth on all of C P1 and vanishes at infinity, and so must vanish everywhere.
Therefore, since MkV) = 0 for all k, the Mk commute, and so give a solution to
the truncated KP hierarchy.

Connection with the Grassmannian

Put $H = L^2(S^1, \mathbb{C})$, where $S^1$ is the unit circle in the $\lambda$-plane, and let $H_+$ be the standard positive frequency subspace. Choose $a_0$ to have support in the unit disc, and define $W$ to be the set of boundary values $\psi$ on the circle of functions such that $\Psi$ satisfies the Dirac equation (12.6.4) on the unit disc $|\lambda| \leq 1$. If we put $c = e^\gamma$, then $a = a_0\bar c/c$ and the Baker function is given in a neighbourhood of $\lambda = \infty$ by $\psi(t, \lambda) = c\,\psi'(t, \lambda, \bar\lambda)$, where $\psi'$ is the unique solution to the Dirac equation (12.6.3) on the $\lambda$-sphere such that $\psi' = 1$ at $\lambda = \infty$.
The $\tau$-function
Given the basic ingredients, namely the Hilbert space $H = H_+ \oplus H_-$ and the Abelian group of linear transformations of $H$ formed by the operators $C(t)$, the Segal-Wilson construction gives a family of solutions $u(x, y, t)$ to the KP equation, parametrized by $t_4, t_5, \ldots$, for each choice of $W \in \mathrm{Gr}$. There is a very elegant way to extract $u$ directly from the data, without doing the factorization, by using the fact that $u$ is encoded in the $\tau$-function of $C$ with respect to $W$. For a general linear transformation $p\colon H \to H$ such that $p(H_+) = H_+$ we define $\tau(p)$ to be the determinant of the composite of the maps
H+
I+W p- '(W) l + H+ H+ ,
where I denotes orthogonal projection. Segal and Wilson explain the technical
restrictions on $p$ required for this definition to make sense and for the determinant to exist. They show that $u = 2\partial_1^2\log\tau(C(t))$. They also explain the connection with quantum field theory: $\tau(C)$ can be interpreted as a transition amplitude $\langle 0|\hat C|W\rangle$, where $|0\rangle$ is the vacuum state associated with the polarization $H_+$ in the charged fermionic quantum field theory based on $H$, $|W\rangle$ is the vacuum state associated with $W$, and $\hat C$ is an operator constructed from $C$ (see also Witten 1988). In the Dirac operator formulation, the $\tau$-function can instead be defined in terms of the Quillen determinant of the Dirac operator (see the introduction to Mason and Singer 1994). The following example, based on the example in Segal and Wilson (1985, p. 20), shows how the $\tau$-function is defined in case (i) above.
Example 12.6.1 We return to example (9.3.3), and use the notation introduced
there. In this case, the determinant is finite, so its definition is straightforward.
Let $H = L^2(S^1, \mathbb{C}^n)$, as in (i) above, and $W = H_+R^{-1}$. We shall compute $\tau(C)$,
where $C$ is as in the example, and acts on $H$ by multiplication on the right. Let
$w \in H$; we have $w \in W$ whenever $w$ extends to a meromorphic function on the
disc $|\zeta| < 1$, with poles at the points $\alpha_i$, such that (i) $w(\beta_i)b_i = 0$, and (ii) for
each $i$,
$$w \sim \frac{\gamma_i a_i}{\zeta - \alpha_i}$$
(no summation) as $\zeta \to \alpha_i$, for some $\gamma_i \in \mathbb{C}$. By thinking of $W$ as the graph of
a map $H_+ \to H_-$, we can also characterize it as the subspace of $H$ of functions
of the form $w = s + \sum_i \kappa_i(s)s_i$, where $s \in H_+$,
$$s_i(\zeta) = \frac{a_i}{\zeta - \alpha_i},$$
and the $\kappa_i$s are elements of the dual space $H_+^*$ constructed from the data. From
the condition on $w$ at the points $\beta_i$, we have that
$$\sum_{j=1}^{k} L_{ij}\kappa_j(s) + s(\beta_i)b_i = 0,$$
where
$$L_{ij} = \frac{a_j b_i}{\beta_i - \alpha_j}.$$
Now the orthogonal projection of $s_iC^{-1}$ into $H_+$ is the function
$$\frac{a_i\bigl(C^{-1}(\zeta) - C^{-1}(\alpha_i)\bigr)}{\zeta - \alpha_i}.$$
Therefore in this case the image of $s(\zeta)$ under the composite map $H_+ \to H_+$ is
the element of $H_+$ given by
$$s(\zeta) + \sum_{i=1}^{k} \kappa_i(s)\,\frac{a_i\bigl(1 - C^{-1}(\alpha_i)C(\zeta)\bigr)}{\zeta - \alpha_i}.$$
Now the determinant of a linear map of the form $1 + \sum_{i=1}^{k} \xi_i\otimes\kappa_i$, where $\xi_i \in H_+$,
is equal to the determinant of the $k\times k$ matrix with entries $\delta_{ij} + \kappa_i(\xi_j)$. Therefore
we have $\tau(C) = \det(\delta_{ij} + K_{ij})$, where
$$\sum_{i=1}^{k} L_{pi}K_{ij} = -\frac{a_j\bigl(1 - C^{-1}(\alpha_j)C(\beta_p)\bigr)b_p}{\beta_p - \alpha_j} = -L_{pj} + M_{pj}.$$
Consequently, $\tau(C) = \det(M)/\det(L)$. When we take $C = C(\zeta, t)$ as above, we
recover the formula for $u$ in §12.4, since $\det(L)$ is independent of $t$.
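
The determinant identity used in this example, that the determinant of a map $1 + \sum_i \xi_i\otimes\kappa_i$ reduces to a $k\times k$ determinant with entries $\delta_{ij} + \kappa_i(\xi_j)$, can be checked in finite dimensions. The following is a minimal sketch and not part of the original construction: it assumes $H_+$ is replaced by $\mathbb{Q}^4$, and the sample vectors `xis`, covectors `kappas`, and the helper names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def both_determinants(xis, kappas):
    """Return det(1 + sum_i xi_i (x) kappa_i) and det(delta_ij + kappa_i(xi_j))."""
    n, k = len(xis[0]), len(xis)
    # n x n matrix of the map s -> s + sum_i kappa_i(s) xi_i.
    big = [[int(r == c) + sum(xis[i][r] * kappas[i][c] for i in range(k))
            for c in range(n)] for r in range(n)]
    # k x k matrix with entries delta_ij + kappa_i(xi_j).
    small = [[int(i == j) + sum(kappas[i][r] * xis[j][r] for r in range(n))
              for j in range(k)] for i in range(k)]
    return det(big), det(small)


# Hypothetical sample data: two vectors xi_i and two covectors kappa_i in Q^4.
xis = [[1, 2, 0, -1], [3, 0, 1, 2]]
kappas = [[2, -1, 1, 0], [0, 1, 4, -3]]
big_det, small_det = both_determinants(xis, kappas)
print(big_det == small_det)
```

In the example itself the large space is infinite-dimensional, so only the $k\times k$ side of the identity is directly computable; the sketch illustrates why the two sides agree when both are finite.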

Extensions to the Segal-Wilson construction

We shall now extend the Segal-Wilson construction in two stages to generate
more general solutions to the KP equation, and its hierarchy. First, we put
$H = L^2(S^1, \mathbb{C}^2)$, with $H_+$, $H_-$ the usual positive and negative frequency subspaces,
and, as before, we take $W$ to be some other subspace such that $H = W \oplus H_-$.
This time, however, we look for a modified Birkhoff factorization
$$C(\zeta, t) = \tilde f^{-1}f,$$
where $f$, $\tilde f$ are $2\times2$ matrices with the rows of $f$ in $W$ and the rows of $\tilde f^{-1}$ in $H_-$.
By differentiating with respect to the $t_i$s, the same procedure gives us a sequence
of commuting matrix operators. The first two are of the form
0 1
M2=a1+A- a2 0

M3=93+B+A82-1 2 0
2 a2
where A and B are matrices, with A trace-free and lower triangular. By setting
$[M_2, M_3] = 0$, we again recover a solution to the KP equation, with Baker
function 0 = eo f v, where v` = (1, A). This gives nothing new. However, we can
generate a wider class of solutions by replacing C by the patching matrix of a
DS operator, and then by defining its dependence on t by following through the
same steps as in the DS construction.
The Davey-Stewartson equation
We can generalize in a different direction by replacing $\{C(t)\}$ by some other
Abelian group of linear transformations of $H$. For example, take $H = L^2(S^1, \mathbb{C}^2)$,
where $S^1$ is the unit circle in the $\zeta$-plane, and put

C(t t) = exp t ((ti oti)<

(t; Otix,
Again we take $H_+$ and $H_-$ to be the standard positive and negative frequency
subspaces and $W$ to be some other subspace such that $H = W \oplus H_-$, and look
for a $2\times2$ matrix factorization
$$C = \tilde f^{-1}f,$$
where the rows of $f$ are in $W$ and the rows of $\tilde f^{-1}$ are in $H_-$. By differentiating
with respect to the parameters $t_i$ and $\tilde t_i$, and by eliminating the positive powers
of $\zeta$, as before, we construct a sequence of commuting matrix operators $M_i$ and
$\tilde M_i$. We have that $M_1$ and $M_2$ are of the form
/
M1=a1-(1 01 181+I U

M2a2-(0 Ii)ai+(0 0)+B

where $B$ is some matrix independent of $\zeta$, and $\phi$ and $\tilde\phi$ are unknown functions
of the parameters. This is the linear system for the Davey-Stewartson equation
in the complex form
i0t = (0xx + 1 ) + (0 - X)t
z
i0t = - (0xx + Uy)
i - X)

where x = tl, y = t1, t = -it2i and Xxx - Xyy = 2((fifi)xx.

In this case the appropriate Dirac operator is, as before,

pR 4_ (a a()
but where now $\psi$ is a 2-component column vector and $\alpha$ is a $2\times2$ matrix
$\alpha = C^{-1}\alpha_0C$ with $C = C(t, \tilde t)$ as above and where $\alpha_0$ is independent of $(t, \tilde t)$.

NOTES ON CHAPTER 12
1. See also Szmigielski (1993) and McIntosh (1993).
2. Since the operators $L_{Ai}$ in the linear system of the GASDYM hierarchy commute,
we can find $f$ such that $L_{Ai}f = 0$, with $f$ a regular function of $\zeta$ at $\zeta = 0$. We then
have
'9Aof = ( (tAm - RA+mn)f = (-lf-'TAAT-'f + O((-m),
by using the definition of Rk, and the correspondence xAm = tn(m_ I )+A of §8.6 between
the independent variables of the GASDYM hierarchy and the parameters of the DS
hierarchy. Thus we see directly that BA is conjugate to (AA, up to an error term of

order S-m. However, this is not sufficient to establish the special property, since it
is necessary to impose conditions on the error term. Sufficient conditions are those
implicit in Proposition 12.3.4. Note that the algebraic conditions in Mason and Singer
(1994) are not quite sufficient.
3. Here $t = (t_i)$, where $i$ runs from 1 to $nm - 1$, omitting multiples of $n$.
4. This is a special case of eqn (5.16) in Segal and Wilson (1985).
5. One can prove directly that p(E) intersects a hyperplane in C 1P"_ 1 in n + g - 1 =
1nm(n - 1) points. Consider the intersection with the hyperplane v, = 0. This occurs
where the determinant of 4? - µ and the cofactor of its first entry vanish together (the
condition for 4) to have an eigenvector with zero first entry and with eigenvalue µ).
Thus we must have a common zero for a polynomial of degree n in p and a polynomial
of degree n - 1 in µ. The condition that the resultant of these should vanish is a
polynomial of degree n(2n - 1) in the entries in 1, and therefore of degree nm(2n - 1)
in C. Hence there are nm(2n - 1) intersection points.
6. In the gauge used here, 0 generates a perturbation such that 64i , = -Dwg , 6I = 0,
which is gauge-equivalent to 64?,,, = 0, 641Z = ds', hence 0 coincides with the q5 in
Chapter 8.
7. One can see this by considering the Fourier transform of the Lax pair operators in
the trivial case. The Fourier transform of a solution to the linear system is supported
on a linear subspace when the operators can be reduced to vector fields, whereas for the
heat equation, the Fourier transform of a solution must be supported on a parabola.
8. We have that i/i and ib exist whenever H = c-1W ® H_ since then c-1W intersects
the affine subspace 1®H_ in a unique point 11' obtained by projecting 1 into c-1W by
using the direct sum decomposition.
9. The analogy in the following is not quite exact because, for the nKdV hierarchy,
the variable $\zeta$ in the twistor approach is more usually identified with $\lambda^n$, where $\lambda$ is
the variable of the KP formulation. This is due to the degeneracy of the Higgs fields
in these cases and is associated with the Abelianization given by diagonalizing $A$, as
described above. In the case of the Davey-Stewartson equation the analogy becomes
precise.
10. Just as we discussed in the case of the nKdV hierarchy in §12.2, the solutions arising
from the Grassmannian construction are special. They are meromorphic functions
of the entire time variables, and so this class does not include nonanalytic solutions.
Indeed, some entire meromorphic solutions are excluded also. One route to generalizing
the framework has been to admit Baker functions that are merely formal power series
in $\lambda$ (Date et al. 1983). However, even this device cannot give nonanalytic solutions as
the formal power series at one value of $(t_1, t_2)$ can be grafted onto any formal power
series at another value of $(t_1, t_2)$, but the Baker function and its derivatives with respect
to $t_1$ and $t_2$ determine the element of the Grassmannian, which cannot be unique for
such a solution.
13
ASD metrics

In this chapter, we shall look at various forms of the ASD condition on the
curvature of a metric in four dimensions. The conditions are naturally expressed
in terms of `Lax pairs' of vector fields, either on space-time itself or on some
bundle over it. By interpreting a vector field as a generator of a diffeomorphism
group, we can represent some of the conditions as special cases of the ASDYM
equations, but with infinite-dimensional gauge groups. In §13.1-4 we derive
various forms of the ASD condition and its subcases on the curvature of a metric.
In §13.5 we describe the extension of the twistor correspondence by which ASD
conformal structures are encoded in a curved twistor space P, and the way
in which various supplementary conditions on the metric (vacuum, Einstein,
Kähler) give rise to additional structures on P. In §13.6 we discuss the reductions
by one symmetry to the equations of an Einstein-Weyl space and to the `SU($\infty$)
Toda field equations'. In §13.7 we shall explain how many standard integrable
equations arise as reductions of the ASD conditions on a metric, and how the
curved twistor construction relates to the Penrose-Ward transform. 1

13.1 SELF-DUALITY IN CURVED SPACE-TIME

Let M be a complex four-dimensional manifold and let g be a holomorphic metric
on M. We shall suppose that M is oriented, so that it is possible to distinguish
right and left-handed orthonormal tetrads. We then have a preferred volume
element, given by
$$\nu = \omega^0\wedge\omega^1\wedge\omega^2\wedge\omega^3, \tag{13.1.1}$$
where $\{\omega^a\}$ is the dual basis of any right-handed orthonormal tetrad. Because
right-handed orthonormal tetrads are related by elements of $\mathrm{SO}(4, \mathbb{C})$, $\nu$ is
independent of the choice of tetrad. The metric and volume element determine a
* operator on forms, as in flat space, and hence a decomposition of 2-forms into
their SD and ASD parts. It makes sense, therefore, to impose the ASD condition
on a connection on a bundle over M.
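
The splitting of 2-forms into SD and ASD parts can be made concrete in an orthonormal frame. The following is a minimal sketch under simplifying assumptions: it uses a real Euclidean orthonormal coframe $e^0, \ldots, e^3$ rather than the holomorphic setting of the text, and the function names are hypothetical. It builds the $6\times6$ matrix of the $*$ operator on 2-forms and checks that it squares to the identity with three-dimensional $\pm1$ eigenspaces.

```python
def levi_civita(indices):
    """Sign of a permutation of (0, 1, 2, 3); zero if any index repeats."""
    if len(set(indices)) < len(indices):
        return 0
    inversions = sum(1 for i in range(len(indices))
                     for j in range(i + 1, len(indices))
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1


# Basis of 2-forms e^a ^ e^b with a < b (six elements in four dimensions).
PAIRS = [(a, b) for a in range(4) for b in range(a + 1, 4)]


def hodge_star_matrix():
    """6 x 6 matrix of *: *(e^a ^ e^b) = sum over c < d of eps_{abcd} e^c ^ e^d."""
    return [[levi_civita((a, b, c, d)) for (a, b) in PAIRS] for (c, d) in PAIRS]


def matmul(x, y):
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


star = hodge_star_matrix()
identity = [[int(i == j) for j in range(6)] for i in range(6)]
star_squared = matmul(star, star)
trace = sum(star[i][i] for i in range(6))
# Since * squares to the identity, its eigenvalues are +1 and -1, and the
# multiplicities are fixed by the trace.
sd_dim = (6 + trace) // 2
asd_dim = (6 - trace) // 2
print(star_squared == identity, sd_dim, asd_dim)
```

The equal dimensions of the two eigenspaces are what allow a linear map on 2-forms to be written in $3\times3$ blocks.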
When we consider the special case of the Levi-Civita connection on the tangent
bundle, there is a further refinement. In this case, the curvature 2-form $\mathcal{R}$
takes values in $\mathrm{so}(4, \mathbb{C})$ in a local trivialization, so we can decompose $\mathcal{R}$ into a
sum of SD and ASD 2-forms, as with any connection, but we can also decompose
the value of $\mathcal{R}$ at each point into a sum of infinitesimal left and right rotations.

Conditions on the Riemann tensor

By using the correspondence between elements of $\mathrm{so}(4, \mathbb{C})$ and 2-forms (see §2.5),
we can represent the value of $\mathcal{R}$ at $x \in M$ as a linear map
$$\mathcal{R}_x : \Lambda^2T_x^*M \to \Lambda^2T_x^*M,$$
and therefore as a $6\times6$ matrix. This breaks up into $3\times3$ blocks when the
2-forms on $M$ are decomposed into their SD and ASD parts. Thus we can write
$$\mathcal{R} = \begin{pmatrix}\mathcal{R}_{++} & \mathcal{R}_{+-}\\ \mathcal{R}_{-+} & \mathcal{R}_{--}\end{pmatrix}, \tag{13.1.2}$$
where $\mathcal{R}_{++}$ maps SD 2-forms to SD 2-forms, $\mathcal{R}_{+-}$ maps ASD 2-forms to SD
2-forms, and so on. The trace-free parts of $\mathcal{R}_{++}$ and $\mathcal{R}_{--}$ encode the SD and
ASD parts of the Weyl curvature, while $\operatorname{tr}(\mathcal{R}_{++}) = \operatorname{tr}(\mathcal{R}_{--}) = \tfrac12R$, where $R$ is
the scalar curvature. The off-diagonal blocks encode the trace-free part of the
Ricci tensor, $R_{ab} - \tfrac14Rg_{ab}$. By requiring that various parts of $\mathcal{R}$ should vanish,
we obtain the following ASD conditions.
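
As a quick consistency check on this block decomposition, one can count components. The sketch below is illustrative only and rests on two assumptions that are not spelled out at this point in the text: the diagonal blocks are symmetric and the two off-diagonal blocks are transposes of each other (both follow from the pair symmetry $R_{abcd} = R_{cdab}$). The function names are hypothetical.

```python
def symmetric_dim(n):
    """Dimension of the space of symmetric n x n matrices."""
    return n * (n + 1) // 2


def riemann_components(dim):
    """Number of independent components of a Riemann tensor in `dim` dimensions."""
    return dim * dim * (dim * dim - 1) // 12


# Trace-free symmetric 3 x 3 blocks: SD and ASD parts of the Weyl curvature.
weyl_sd = symmetric_dim(3) - 1
weyl_asd = symmetric_dim(3) - 1
# One full 3 x 3 off-diagonal block: the trace-free part of the Ricci tensor.
ricci_trace_free = 3 * 3
# The common trace of the two diagonal blocks: the scalar curvature.
scalar = 1

total = weyl_sd + weyl_asd + ricci_trace_free + scalar
print(total, riemann_components(4))
```

The count of nine for the off-diagonal block agrees with the number of components of a trace-free symmetric tensor in four dimensions.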
(C) The trace-free part of $\mathcal{R}_{++}$ vanishes. The Weyl tensor is ASD and the
metric is conformally half-flat. This condition is conformally invariant; when
it holds, we say that $g$ determines an ASD conformal structure.
(E) The trace-free part of $\mathcal{R}_{++}$ vanishes and $\mathcal{R}_{-+} = 0$. The Weyl tensor is
ASD and the metric is Einstein; that is, $R_{ab} = \tfrac14Rg_{ab}$, with $R$ necessarily
constant. This is the ASD Einstein condition.
(V) $\mathcal{R}_{++} = 0$, $\mathcal{R}_{-+} = 0$. The Levi-Civita connection is ASD and the metric is
half-flat. The Ricci tensor and the SD part of the Weyl curvature are both
zero, and so the metric is a solution to the ASD vacuum equation.
(S) $\mathcal{R}_{++} = 0$. The Weyl tensor is ASD and the scalar curvature vanishes; that
is, the metric is ASD scalar-flat.
These conditions are all integrable in that they can be `solved' by twistor con-
structions. In fact, the third condition (V) is equivalent to a reduction of the
ASDYM equation, but with an infinite-dimensional gauge group.
Two other conditions on the curvature are also of considerable physical and
geometric importance: the Einstein condition,
$$R_{ab} = \tfrac14Rg_{ab},$$
with $R$ necessarily constant, and the vacuum field equation
$$R_{ab} = 0.$$
The first is equivalent to $\mathcal{R}_{+-} = 0$, the second to this together with the additional
condition $\operatorname{tr}(\mathcal{R}_{++}) = 0$. They are not integrable in any straightforward sense,
except that under certain conditions the vacuum equation with two commuting
Killing vectors is equivalent to a reduction of the ASDYM equation; see §6.6.
Before considering self-duality conditions in more depth, we shall first take
a more detailed look in the next section at the properties of the Levi-Civita
connection.
13.2 THE LEVI-CIVITA CONNECTION
We denote the Levi-Civita connection by $\nabla$. Given a local frame field $e_a$ for the
tangent bundle, $\nabla$ is represented by the matrix-valued 1-form $\Gamma = (\Gamma^a{}_b)$ such
that
$$\nabla e_b = e_a\Gamma^a{}_b.$$
It is determined uniquely by the following two conditions:
(a) it is torsion-free: $\nabla_{e_a}e_b - \nabla_{e_b}e_a = [e_a, e_b]$;
(b) it is compatible with the metric:
$$dg_{ab} = g_{ac}\Gamma^c{}_b + g_{bc}\Gamma^c{}_a,$$
where $g_{ab} = g(e_a, e_b)$.

Condition (a) can be replaced by the first Cartan equation,
$$d\omega^a + \Gamma^a{}_b\wedge\omega^b = 0,$$
where $\omega^a$ is the dual basis of 1-forms, that is, $e_a\lrcorner\,\omega^b = \delta_a^b$. When the frame is
rigid, that is, when the $g_{ab}$s are constant, $\Gamma_{ab} = g_{ac}\Gamma^c{}_b$ is skew-symmetric.
The curvature is the matrix-valued 2-form $\mathcal{R} = (R^a{}_b)$ defined by the second
Cartan equation
$$R^a{}_b = 2(d\Gamma^a{}_b + \Gamma^a{}_c\wedge\Gamma^c{}_b),$$
or in terms of the components of the Riemann tensor $R^a{}_{bcd}$, by
$$R^a{}_b = R^a{}_{bcd}\,\omega^c\wedge\omega^d.$$
The covariant curvature tensor $R_{abcd} = g_{ae}R^e{}_{bcd}$ is skew-symmetric in $ab$ and
in $cd$, and has the additional symmetries $R_{a[bcd]} = 0$, $R_{abcd} = R_{cdab}$ (the last of
these follows from the others).
Decomposition of the connection
Suppose that we choose the frame to be a null tetrad $W$, $Z$, $\tilde W$, $\tilde Z$. Then $(g_{ab})$
has the same form as the flat metric in double-null coordinates,
$$(g_{ab}) = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\-1&0&0&0\\0&1&0&0\end{pmatrix},$$
and the frame is rigid. With this choice, we can write $\Gamma$ as a sum,
$$\Gamma = \gamma + \tilde\gamma,$$
where $\gamma$ takes values in the left rotations and $\tilde\gamma$ takes values in the right rotations.
Both $\gamma$ and $\tilde\gamma$ are 1-forms with values in $\mathrm{sl}(2, \mathbb{C})$, and they determine the
covariant derivatives of the tetrad vectors by

(OW vZ)= t(W Z)+(W Z)7'

We write
where $\kappa$, $\lambda$, $\mu$, $\tilde\kappa$, $\tilde\lambda$, $\tilde\mu$ are holomorphic 1-forms (in the ordinary sense). When we
change the tetrad, $\gamma$ and $\tilde\gamma$ undergo separate $\mathrm{SL}(2, \mathbb{C})$ gauge transformations, so
the two $\mathrm{SL}(2, \mathbb{C})$ connections defined by
$$D = d + \gamma, \qquad \tilde D = d + \tilde\gamma$$
are independent of the choice of null tetrad. We shall see that they are the
connections on the bundles of unprimed and primed spinors. Let $F$ and $\tilde F$
denote their respective curvatures. These are 2-forms with values in the Lie
algebras of left and right rotations of the tangent space. We put
$$L = W - \zeta\tilde Z, \qquad M = Z - \zeta\tilde W,$$
where $\zeta$ is an auxiliary complex variable, and define the 2-forms $A$, $B$, and $C$ by

F 2(dy+. A7) = (-A B) .

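The covariant derivatives of the tetrad vectors and their relationship to the
curvature of $\tilde D$ can then be put in a convenient form by introducing the 1-form
$\theta$ and the 2-form $\Theta$, defined by
$$\theta = \tilde\kappa\zeta^2 + 2\tilde\lambda\zeta + \tilde\mu, \qquad \Theta = A\zeta^2 + 2B\zeta + C,$$
which are forms on $M$ that depend quadratically on the parameter $\zeta$. We then
have
$$\begin{aligned}
\nabla_aL^b &= -\lambda_aL^b - \kappa_aM^b + \theta_a\tilde Z^b + \tfrac12L^b\partial_\zeta\theta_a,\\
\nabla_aM^b &= \mu_aL^b + \lambda_aM^b + \theta_a\tilde W^b + \tfrac12M^b\partial_\zeta\theta_a,
\end{aligned} \tag{13.2.1}$$
for every $\zeta \in \mathbb{C}$, and
$$\Theta = 2(d\theta + \theta\wedge\partial_\zeta\theta). \tag{13.2.2}$$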
Decomposition of the Riemann tensor

Because left and right rotations commute, we can calculate $\mathcal{R}$ by finding the
curvatures $F$ and $\tilde F$ of $D$ and $\tilde D$ separately, and then combining the results. By
the correspondence in §2.5, all three curvatures can be regarded as 2-forms with
values in $\Lambda^2T^*M$. With this interpretation, $\mathcal{R} = F + \tilde F$.
We can equally well represent $\mathcal{R}$ as a section of a tensor bundle over $M$ by
writing
$$\mathcal{R} = R_{abcd}(\omega^a\wedge\omega^b)\otimes(\omega^c\wedge\omega^d) \in \Gamma(\Lambda^2T^*M\otimes\Lambda^2T^*M),$$
or as a map from 2-forms to 2-forms,
$$\mathcal{R}: \omega_{ab} \mapsto R_{ab}{}^{cd}\omega_{cd}, \tag{13.2.3}$$
where, as usual, indices are lowered and raised by $g_{ab}$ and its inverse. The
symmetries of $R_{abcd}$ imply that
$$R_{abcd} = C_{abcd} + g_{a[c}\Phi_{d]b} - g_{b[c}\Phi_{d]a} + \tfrac16R\,g_{a[c}g_{d]b},$$
where
$$\Phi_{ab} = R_{ab} - \tfrac14Rg_{ab}, \qquad R_{ab} = R^c{}_{acb}, \qquad R = R^a{}_a,$$
and $C^a{}_{bad} = 0$. This expresses the curvature in terms of the Weyl tensor $C_{abcd}$,
the Ricci tensor $R_{ab}$, and the scalar curvature $R$. The Weyl tensor further
decomposes into a sum
$$C_{abcd} = C^+_{abcd} + C^-_{abcd},$$
where $C^+_{abcd}$ and $C^-_{abcd}$ are, respectively, SD and ASD on both pairs of indices $ab$
and $cd$. That is, $C^+$ belongs to the tensor product of the space of SD 2-forms
with itself, and $C^-$ belongs to the tensor product of the space of ASD 2-forms
with itself. Thus on decomposing $\omega$ into its self-dual and anti-self-dual parts,
$\omega^+$ and $\omega^-$, we can write (13.2.3) in the form
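$$\begin{aligned}
\mathcal{R}_{++} &: \omega^+_{ab} \mapsto C^+_{ab}{}^{cd}\omega^+_{cd} + \tfrac16R\,\omega^+_{ab},\\
\mathcal{R}_{+-} &: \omega^-_{ab} \mapsto 2\,\omega^-_{[a}{}^{c}\Phi_{b]c},\\
\mathcal{R}_{-+} &: \omega^+_{ab} \mapsto 2\,\omega^+_{[a}{}^{c}\Phi_{b]c},\\
\mathcal{R}_{--} &: \omega^-_{ab} \mapsto C^-_{ab}{}^{cd}\omega^-_{cd} + \tfrac16R\,\omega^-_{ab}.
\end{aligned}$$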

It follows that
ecd = L[aMb](Cabcd + 1a[c9d)b + sR9ac96d), (13.2.4)
from which we deduce the following lemma.
Lemma 13.2.1 The four ASD conditions can be restated as follows.
(i) Condition (C) is equivalent to $\Theta(L, M) = 0$ for every $\zeta$, and also to the
vanishing of $C^+_{abcd}$.
(ii) Condition (E) is equivalent to $L\lrcorner\,\Theta = M\lrcorner\,\Theta = 0$ for every $\zeta$, and also to
$C^+_{abcd} = 0 = \Phi_{ab}$.
(iii) Condition (V) is equivalent to $\Theta = 0$ for every $\zeta$, and also to $C^+_{abcd} = 0 =
R_{ab}$, that is, $\tilde F = 0$.
(iv) Condition (S) is equivalent to the condition that $\Theta$ should be ASD, and is
also equivalent to $C^+_{abcd} = 0 = R$.

Tetrad identities
To express the self-duality conditions (C), (E), and (V) as integrability conditions
on the null tetrad, we first need some notation. We introduce the correspondence
space $\mathcal{F} = M\times\mathbb{CP}^1$ and define two vector fields $\ell$ and $m$ on $\mathcal{F}$ (less the points
$\zeta = \infty$) by
$$\ell = L + (L\lrcorner\,\theta)\partial_\zeta, \qquad m = M + (M\lrcorner\,\theta)\partial_\zeta.$$
We put
'r=d(-0 E=d(Av, 1; 12rAv(L,M,-,'),
where $\nu$ is the volume form on space-time; $\tau$, $\Xi$, and $\xi$ are forms on $\mathcal{F}$ of degrees
1, 5, and 3 respectively. For a vector field $S$ on $M$, we define $\operatorname{div}S$ by
$$\mathcal{L}_S\nu = (\operatorname{div}S)\,\nu,$$
or equivalently by $\operatorname{div}S = \nabla_aS^a$; and for a vector field $T$ on $\mathcal{F}$, we define $\operatorname{div}T$
by
$$\mathcal{L}_T\Xi = (\operatorname{div}T)\,\Xi.$$
On $M$, we treat $\zeta$ as an auxiliary parameter, which is held constant when taking
derivatives, while on $\mathcal{F}$ it is one of the coordinate variables.
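Lemma 13.2.2 Put $x = L\lrcorner\,\partial_\zeta\theta$ and $y = M\lrcorner\,\partial_\zeta\theta$. Then
$$[\ell, m] + (\operatorname{div}\ell)m - (\operatorname{div}m)\ell = 2xm - 2y\ell + \Theta(L, M)\partial_\zeta. \tag{13.2.5}$$
If $\Theta(L, M) = 0$, then
$$\mathcal{L}_\ell\xi = 2x\xi, \qquad \mathcal{L}_m\xi = 2y\xi. \tag{13.2.6}$$
Proof From eqn (13.2.1),
$$\operatorname{div}L = -L\lrcorner\,\lambda - M\lrcorner\,\kappa + \tilde Z\lrcorner\,\theta + \tfrac12L\lrcorner\,\partial_\zeta\theta,$$
$$\operatorname{div}M = L\lrcorner\,\mu + M\lrcorner\,\lambda + \tilde W\lrcorner\,\theta + \tfrac12M\lrcorner\,\partial_\zeta\theta.$$
Therefore, since $[L, M] = \nabla_LM - \nabla_ML$, we have
$$[L, M] - (\operatorname{div}M)L + (\operatorname{div}L)M = (L\lrcorner\,\theta)\tilde W - (M\lrcorner\,\theta)\tilde Z + (\tilde Z\lrcorner\,\theta + x)M - (\tilde W\lrcorner\,\theta + y)L, \tag{13.2.7}$$
by using (13.2.1) again. Equation (13.2.5) follows by using (13.2.2) and
$$\operatorname{div}\ell = \operatorname{div}L + \partial_\zeta(L\lrcorner\,\theta),$$
together with a similar expression for $\operatorname{div}m$. Equations (13.2.6) follow from eqn
(13.2.5) by taking the Lie derivatives along $\ell$ and $m$ of $\Xi(\ell, m, \cdot, \cdot, \cdot)$. □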
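
The step from (13.2.7) to the $L$ and $M$ components of (13.2.5) can be written out as follows. This is a sketch in the notation of the lemma, where $u = L\lrcorner\,\theta$ and $v = M\lrcorner\,\theta$ are shorthand introduced here for the $\partial_\zeta$ components of $\ell$ and $m$; it uses only $\partial_\zeta L = -\tilde Z$ and $\partial_\zeta M = -\tilde W$, which follow from the definitions of $L$ and $M$.

```latex
\begin{align*}
% u and v are the \partial_\zeta components of \ell and m.
u &= L\lrcorner\theta, \qquad v = M\lrcorner\theta, \qquad
  \partial_\zeta L = -\tilde Z, \qquad \partial_\zeta M = -\tilde W,\\
[\ell, m] &= [L, M] - u\tilde W + v\tilde Z + \bigl(\ell(v) - m(u)\bigr)\partial_\zeta,\\
\operatorname{div}\ell &= \operatorname{div}L - \tilde Z\lrcorner\theta + x, \qquad
\operatorname{div}m = \operatorname{div}M - \tilde W\lrcorner\theta + y,\\
[\ell, m] &= (\operatorname{div}M)L - (\operatorname{div}L)M
  + (\tilde Z\lrcorner\theta + x)M - (\tilde W\lrcorner\theta + y)L
  + \bigl(\ell(v) - m(u)\bigr)\partial_\zeta\\
&= (\operatorname{div}m)L - (\operatorname{div}\ell)M + 2xM - 2yL
  + \bigl(\ell(v) - m(u)\bigr)\partial_\zeta.
\end{align*}
```

The $\tilde W$ and $\tilde Z$ terms cancel between the second line and (13.2.7), so the components along $L$ and $M$ already have the form required by (13.2.5); the remaining $\partial_\zeta$ component is the part that the proof evaluates with (13.2.2).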
13.3 SPINORS AND THE CORRESPONDENCE SPACE
We shall express the self-duality equations (C), (E), and (V) as conditions on the
vector fields $\ell$ and $m$ on $M\times\mathbb{CP}^1$. This explicit and straightforward point of view
makes clear some of the connections with ASDYM equations, and is useful as a
starting point for reduction, but it loses sight of the underlying geometry: the
construction lacks obvious invariance under rotations of the null tetrad. In this
section, we shall explain the intrinsic nature of the various objects involved. We
shall assume rather more familiarity than hitherto with the geometry of bundles
and connections, but we shall not make essential use of the results, other than
to make clear the geometric interpretation of some of the twistor constructions.
The $\alpha$-plane bundle
From a geometric point of view, the central object is the analogue of the
correspondence space in the flat-space theory. We define it in this more general
setting as a $\mathbb{CP}^1$ bundle $\mathcal{F} \to M$ by taking the total space to be the set of pairs
$(x, \Pi)$, where $x \in M$ and $\Pi$ is an $\alpha$-plane through the origin in $T_xM$. Given
a null tetrad, $\zeta$ is a fibre coordinate on $\mathcal{F}$, by which we can identify $\mathcal{F}$ locally
with $M\times\mathbb{CP}^1$. We call $\mathcal{F}$ the $\alpha$-plane bundle; it is also known as the projective
prime-spin bundle, for reasons that will be clear below.
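There is a natural line bundle $\mathcal{O}(-2) \to \mathcal{F}$, of which the fibre at $(x, \Pi)$ is
$$\Lambda^2\Pi \subset \Lambda^2T_xM;$$
that is, an element $\pi$ of the fibre is a null bivector at $x$ such that $\pi^{ab}V_a = 0$ for
every $V \in \Pi$. A choice of null tetrad determines both a fibre coordinate $\zeta$ on $\mathcal{F}$
and a local trivialization of $\mathcal{O}(-2)$ by picking out the basis element $\pi = L\wedge M$
in each fibre of $\mathcal{O}(-2)$. To include the points at $\zeta = \infty$, we can use instead the
basis element $\tilde\pi = \tilde L\wedge\tilde M$, where
$$\tilde L = \tilde\zeta W - \tilde Z, \qquad \tilde M = \tilde\zeta Z - \tilde W$$
(as usual, $\tilde\zeta = \zeta^{-1}$). Since $\tilde\pi = \zeta^{-2}\pi$, the restriction of $\mathcal{O}(-2)$ to each fibre of $\mathcal{F}$
is the standard line bundle $\mathcal{O}(-2) \to \mathbb{CP}^1$, which explains the notation.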
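
The two algebraic facts used here, that $L$ and $M$ span a totally null plane for every $\zeta$ and that the two trivializations of the bivector differ by the factor $\zeta^{-2}$, can be checked with exact arithmetic. The sketch below assumes the frame order $(W, Z, \tilde W, \tilde Z)$ with $g(W, \tilde W) = -1$ and $g(Z, \tilde Z) = 1$, as in the matrix $(g_{ab})$ of §13.2; the helper names and sample values of $\zeta$ are hypothetical.

```python
from fractions import Fraction

# Metric components g_ab in the frame order (W, Z, W~, Z~).
G = [
    [0, 0, -1, 0],
    [0, 0, 0, 1],
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
]


def inner(u, v):
    return sum(u[a] * G[a][b] * v[b] for a in range(4) for b in range(4))


def wedge(u, v):
    """Components of the bivector u ^ v on the index pairs a < b."""
    return {(a, b): u[a] * v[b] - u[b] * v[a]
            for a in range(4) for b in range(a + 1, 4)}


def tetrad_pair(zeta):
    """L = W - zeta Z~ and M = Z - zeta W~ as component lists."""
    return [1, 0, 0, -zeta], [0, 1, -zeta, 0]


def tetrad_pair_at_infinity(zeta_tilde):
    """L~ = zeta~ W - Z~ and M~ = zeta~ Z - W~ as component lists."""
    return [zeta_tilde, 0, 0, -1], [0, zeta_tilde, -1, 0]


def check(zeta):
    L, M = tetrad_pair(zeta)
    L_t, M_t = tetrad_pair_at_infinity(1 / zeta)
    is_null_plane = inner(L, L) == inner(M, M) == inner(L, M) == 0
    pi = wedge(L, M)
    pi_tilde = wedge(L_t, M_t)
    rescaled = all(pi_tilde[key] == pi[key] / zeta**2 for key in pi)
    return is_null_plane, rescaled


results = [check(Fraction(n, d)) for n, d in [(1, 2), (-3, 1), (5, 7)]]
print(results)
```

The check is purely algebraic at a single point; it says nothing about integrability, which depends on how the tetrad varies over $M$.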
Parallel transport along a path $\gamma \subset M$ from $x$ to $x'$ determines a linear
map $T_xM \to T_{x'}M$, which preserves the metric and maps $\alpha$-planes through
the origin in $T_xM$ to $\alpha$-planes through the origin in $T_{x'}M$, so the Levi-Civita
operator $\nabla$ gives rise to a connection on $\mathcal{F}$. It follows that the tangent bundle
$T\mathcal{F}$ decomposes into a direct sum
$$T\mathcal{F} = H\oplus V,$$
where $H$ is the rank-4 bundle of horizontal vectors and $V$ is the line bundle of
vertical vectors. Within $H$, we have the two-dimensional twistor distribution $D$,
which is given by the horizontal lifts of the tangent $\alpha$-planes on $M$; that is,
$D_{(x,\Pi)}$ is the horizontal lift of $\Pi$. The restriction of $V$ to a $\mathbb{CP}^1$ fibre is the
tangent bundle $T\mathbb{CP}^1 = \mathcal{O}(2)$, so $V$ is the dual of the bundle $\mathcal{O}(-2)$; and from
the definition we also have $\mathcal{O}(-2) = D\wedge D$. We denote the $k$th tensor power of
$V$ by $\mathcal{O}(2k)$ and its dual by $\mathcal{O}(-2k)$.
In the local trivialization, the horizontal vectors on $\mathcal{F}$ are of the form
$$X + (X\lrcorner\,\theta)\partial_\zeta, \tag{13.3.1}$$
where $X$ is a vector on $M$. The vectors $\ell$ and $m$ are the horizontal lifts to $\mathcal{F}$
of the $\zeta$-dependent vectors $L$ and $M$, and they span the twistor distribution at
each point. We also have a direct geometric characterization of the 1-form $\tau$
as an intrinsic 1-form on $\mathcal{F}$ with values in $\mathcal{O}(2)$. It is determined by the two
conditions: (i) its restriction to each fibre of $\mathcal{F}$ is the natural 1-form on $\mathbb{CP}^1$ with
values in $\mathcal{O}(2)$ (see §9.3) and (ii) $\tau$ vanishes on $H$. The first property implies
that the vertical part of $\tau$ is $d\zeta$ in the local trivialization; the second implies
that $\tau$ annihilates the vectors (13.3.1), and therefore that $\tau = d\zeta - \theta$. Thus $\tau$
is a natural geometric structure on $\mathcal{F}$. So also are $\Xi$, which we can define by
$\Xi = \tau\wedge\nu$, and $\xi$, which is formed from it by contracting with the unit section of
$\mathcal{O}(-2)\otimes\mathcal{O}(2)$ and scaling, where we use $\mathcal{O}(-2) = D\wedge D$. Defined in this way,
$\Xi$ is a 5-form with values in $\mathcal{O}(2)$ and $\xi$ is a 3-form with values in $\mathcal{O}(4)$.

$\alpha$-surfaces
An $\alpha$-surface $\Sigma \subset M$ is a 2-surface of which the tangent plane at each point is
an $\alpha$-plane (later we shall require that $\Sigma$ is also connected and maximal, that
is, not a proper subset of another connected $\alpha$-surface). It has a geometric
property analogous to that of a null geodesic, which is stated in the following
lemma.
Lemma 13.3.1 The tangent plane to an $\alpha$-surface is preserved by parallel
transport along curves in the surface.

Proof Choose two independent vector fields $S$ and $T$ tangent to the surface
such that $[S, T] = 0$. Then $S$ and $T$ are null and orthogonal at points of the
surface, and satisfy
$$S^a\nabla_aT^b = T^a\nabla_aS^b.$$
So if we put $\pi^{ab} = S^{[a}T^{b]}$, then $\pi$ is a tangent null bivector to $\Sigma$. By noting that
$S_bS^b = 0 = S_bT^b$, we have
$$0 = S_bS^a\nabla_a\pi^{bc} = T_bS^a\nabla_a\pi^{bc} = S_bT^a\nabla_a\pi^{bc} = T_bT^a\nabla_a\pi^{bc}$$
at the surface. Therefore the covariant derivatives $S^a\nabla_a\pi^{bc}$ and $T^a\nabla_a\pi^{bc}$ are
both orthogonal to $S$ and $T$, and so are also tangent bivectors to $\Sigma$. It follows that
they are both proportional to $\pi$, and hence that the tangent plane is invariant
under parallel propagation within the surface.
Another way to say this is that the tangent bivectors to $\Sigma$ form a line bundle
$\Lambda^2T\Sigma \subset \Lambda^2TM|_\Sigma$ which is preserved by parallel transport with respect to the
Levi-Civita connection on $\Lambda^2TM$. We also denote this line bundle by $\mathcal{O}(-2)$:
we shall see below that this use of notation is consistent. What we have shown
is that the Levi-Civita connection induces a connection on $\mathcal{O}(-2) \to \Sigma$.
Given a null tetrad, we can trivialize $\mathcal{O}(-2)$ by taking $\pi^{ab} = L^{[a}M^{b]}$ as a
basis element, where in the definition of $L$ and $M$, $\zeta$ is now a function on $\Sigma$,
chosen so that $L$ and $M$ are tangent to $\Sigma$. Then for any vector $V$ tangent to $\Sigma$,
$$\nabla_V\pi^{bc} = V^a(\theta_a - \partial_a\zeta)(\tilde Z^{[b}M^{c]} + L^{[b}\tilde W^{c]}) + \pi^{bc}V^a\partial_\zeta\theta_a,$$
by (13.2.1). Therefore $V\lrcorner\,\theta = V(\zeta)$, and the potential 1-form of the connection
on $\mathcal{O}(-2)$ is $\partial_\zeta\theta$. By mapping $x \in \Sigma$ to the point $(x, \Pi) \in \mathcal{F}$, where $\Pi$ is the
tangent plane at $x$, we lift $\Sigma$ to a surface $\tilde\Sigma \subset \mathcal{F}$; in the local trivialization,
the lift is $x \mapsto (x, \zeta(x))$. Because the tangent plane is invariant under parallel
transport, $\tilde\Sigma$ is horizontal in the sense that its tangent spaces are contained in
$H$. It follows that the tangent space at each point of $\tilde\Sigma$ coincides with the twistor
distribution. One can also see this from
$$L^a\partial_a\zeta = L^a\theta_a, \qquad M^a\partial_a\zeta = M^a\theta_a,$$
which implies that $\ell$ and $m$ are tangent to $\tilde\Sigma$. The restriction of $\mathcal{O}(-2) \to \mathcal{F}$ to
$\tilde\Sigma$ coincides with the pull-back of $\mathcal{O}(-2) \to \Sigma$, so it is consistent to use the same
notation for both line bundles.
The prime-spin bundle
The intrinsic nature of the geometric structures on F emerges more clearly in the
spinor formalism. We shall not look at this in any detail since a full justification
of what is said in the rest of this section would be somewhat out of proportion
to the limited improvement it provides on what we have already done. Our brief
sketch is intended simply to orient those who are already familiar with Penrose
and Rindler (1986), and Ward and Wells (1990), and to indicate the framework
within which much of the twistor construction was derived.
A null bivector $\pi^{ab}$ tangent to an $\alpha$-surface has a spinor equivalent of the
form
$$\pi^{A'}\pi^{B'}\varepsilon^{AB}.$$
Since $\pi^{ab}$ and $\pi^{A'}$ both determine the tangent plane, and are determined by
it uniquely up to multiplication by nonzero complex numbers, we can equally
well represent $\mathcal{F}$ as the quotient of $S'$, less its zero section, by the equivalence
relation $(x, \pi^{A'}) \sim (x, \lambda\pi^{A'})$, $\lambda \in \mathbb{C}$, $\lambda \neq 0$. That is, $\mathcal{F}$ is the bundle of projective
lines constructed from the prime-spin bundle. Hence the alternative designation
`projective prime-spin bundle'. It is useful to think of $\mathcal{F}$ as the quotient of $S'$ by
the one-dimensional distribution spanned by the Euler vector field
$$\Upsilon = \pi^{A'}\frac{\partial}{\partial\pi^{A'}},$$
which is tangent to the fibres of $S'$, and is a natural geometric structure on $S'$.
We define the tautological line bundle $\mathcal{O}(-1) \to \mathcal{F}$ by taking the fibre at
$(x, \Pi)$ to be the one-dimensional subspace of $S'_x$ of spinors tangent to $\Pi$. By
its construction, $\mathcal{O}(-2) = \mathcal{O}(-1)^2$, so $\mathcal{O}(-1)$ is a natural square-root of $\mathcal{O}(-2)$.
Note, however, that the definition of $\mathcal{O}(-1)$, unlike that of $\mathcal{O}(-2)$, involves a
choice of spin structure. We denote the dual bundle by $\mathcal{O}(1)$, and its various
tensor powers by $\mathcal{O}(k)$, which extends the earlier definition to odd integers.
Sections of $\mathcal{O}(k)$ are represented by homogeneous holomorphic functions on $S'$
of degree $k$; that is, by solutions to the Euler equation $\Upsilon(f) = kf$. Similarly,
forms on $\mathcal{F}$ with values in $\mathcal{O}(k)$ are represented by differential forms (in the
ordinary sense) on $S'$ such that
$$\Upsilon\lrcorner\,\alpha = 0, \qquad \mathcal{L}_\Upsilon\alpha = k\alpha.$$
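
The equivalence between homogeneity of degree $k$ and the Euler equation can be illustrated on a single fibre. The sketch below is a simplified illustration, not the general statement: it assumes a polynomial in two fibre coordinates (written `pi0`, `pi1` here) at a fixed point of $M$, with non-negative degree, and the sample polynomial and helper names are hypothetical.

```python
from fractions import Fraction


def euler_operator(poly):
    """Apply pi0 d/dpi0 + pi1 d/dpi1 to a polynomial stored as {(i, j): coeff}."""
    return {(i, j): (i + j) * c for (i, j), c in poly.items()}


def evaluate(poly, pi0, pi1):
    return sum(c * pi0**i * pi1**j for (i, j), c in poly.items())


def scale(poly, factor):
    return {key: factor * c for key, c in poly.items()}


# Hypothetical homogeneous polynomial of degree 3 in the fibre coordinates:
# f = 2 pi0^3 - 5 pi0 pi1^2 + pi1^3.
f = {(3, 0): 2, (1, 2): -5, (0, 3): 1}
k = 3

# Euler equation: the Euler operator multiplies f by its degree.
euler_holds = euler_operator(f) == scale(f, k)
# Homogeneity: f(lam * pi) = lam^k f(pi), checked at one rational sample point.
lam, p0, p1 = Fraction(3, 2), Fraction(1, 3), Fraction(-2, 5)
scaling_holds = evaluate(f, lam * p0, lam * p1) == lam**k * evaluate(f, p0, p1)
print(euler_holds, scaling_holds)
```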
Spinor connections
The decomposition of the connection and the curvature has a straightforward
interpretation in terms of spinors. We identify the rank-2 vector bundles on
which $D$ and $\tilde D$ are defined with the bundles $S$ and $S'$. A choice of null tetrad
determines (up to the usual sign ambiguity) spin frames in the two bundles such
that WAA' = OALA', ZAA' _ LALA' WAA' = tA0A' 2AA' = 0A0A, and in the
trivializations defined by these frames,
D77A = OBB'I1A dxBB = d77A' + ;7 A' ,77B'
DaA = V BB'QAdXBB = daA + yBaB.
Because $\gamma$ and $\tilde\gamma$ take values in $\mathrm{sl}(2, \mathbb{C})$, the connections preserve the symplectic
structures on the two bundles, something that also follows from
$$\nabla_{AA'}\varepsilon_{BC} = 0 = \nabla_{AA'}\varepsilon_{B'C'}$$
(see §9.7).
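In the notation of Penrose and Rindler (1984), the decomposition of the curvature
splits the Riemann tensor into a combination of the two Weyl spinors
$\Psi_{ABCD}$, $\tilde\Psi_{A'B'C'D'}$, the Ricci spinor $\Phi_{ABC'D'}$, which contains the same
information as the trace-free part of the Ricci tensor, and the scalar curvature; the
Ricci spinor is symmetric in both pairs of indices and the two Weyl spinors are
both totally symmetric. With all its indices lowered, the Riemann tensor $R_{abcd}$
is equivalent to
$$\begin{aligned}
&\Psi_{ABCD}\varepsilon_{A'B'}\varepsilon_{C'D'} + \tilde\Psi_{A'B'C'D'}\varepsilon_{AB}\varepsilon_{CD} + \Phi_{ABC'D'}\varepsilon_{A'B'}\varepsilon_{CD}\\
&\quad + \Phi_{CDA'B'}\varepsilon_{AB}\varepsilon_{C'D'} + \tfrac1{12}R\,(\varepsilon_{AC}\varepsilon_{BD}\varepsilon_{A'C'}\varepsilon_{B'D'} - \varepsilon_{AD}\varepsilon_{BC}\varepsilon_{A'D'}\varepsilon_{B'C'}).
\end{aligned}$$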
If we put
7r A' = COA' - 4A' ,

then
7r A'DAA' = tAL - OAM,
so that L and M are given by the two components of the spinor operator
7r A' V AA' The 1-form 9 and the 2-form e are, respectively,
0 =1B'c'7rB7rC,
e = 7rA'7rB' (%4B'C'D' dxc ' A dxCD' + 4?A'B'CD dx C, A dxDC'
+ 12RdXAA, AdxBA ,).
It follows that $\Theta(L, M) = \tilde\Psi_{A'B'C'D'}\pi^{A'}\pi^{B'}\pi^{C'}\pi^{D'}$, and hence that the
conformal structure is ASD if and only if $\tilde\Psi_{A'B'C'D'} = 0$.

Forms on F
If we take ( = 7r,, /7ro, as the fibre coordinate on .T, then
r = 7rA' D7r'4 , 27rA,7rB,7rc, D7rA' A dxB ' A dxBC ,

where D is the invariant horizontal d-operator, defined by

D7rA' = d7rA' + yC g,7rB' dxc.
These formulas make clear that $\tau$ is homogeneous of degree 2 in $\pi^{A'}$, and therefore
takes values in $\mathcal{O}(2)$, and that $\xi$ is homogeneous of degree 4.
Finally, we can define V by picking any two independent spinors aA and ,QA,
and defining e and m to be the horizontal lifts of the vectors L and M equivalent
aA7rA'

to and QA7rB'. We write these lifts as

a
aA7rA DAA' = La a - La-7a B'C'7rC' a7rB,
axa
m = Qa7r'1'DaB' = MaBxa Maya C IT° B,

8
Here the $\nabla$ notation is useful since a little thought shows that we can compute
the Lie brackets of $\ell$ and $m$ by making $\nabla$ act as the spinor covariant derivative,
but treating the $\pi$s as if they were covariantly constant.
13.4 ASD CONFORMAL STRUCTURES
The following is an immediate consequence of (13.2.5).
Proposition 13.4.1 The distribution spanned by $\ell$ and $m$ is integrable if and
only if $C^+_{abcd} = 0$, that is, if and only if condition (C) holds.
We use this to establish a form of condition (C) that makes explicit its invariance
under rescaling of the metric. We define a conformal null tetrad to be a set of
four independent null vector fields $W$, $Z$, $\tilde W$, $\tilde Z$ such that
$$g(W, \tilde W) = -g(Z, \tilde Z).$$

Every conformal null tetrad is a null tetrad for some conformal rescaling of the
metric, so a conformal null tetrad determines the conformal class of the metric.
Proposition 13.4.2 Let $W$, $Z$, $\tilde W$, $\tilde Z$ be independent holomorphic vector fields
on a four-dimensional complex manifold $M$. Then $W$, $Z$, $\tilde W$, $\tilde Z$ determine an
ASD conformal structure if and only if there exist two holomorphic functions $u$
and $v$ on $M\times\mathbb{CP}^1$ such that the distribution on $M\times\mathbb{CP}^1$ spanned by
$$\ell = W - \zeta\tilde Z + u\partial_\zeta, \qquad m = Z - \zeta\tilde W + v\partial_\zeta$$
is integrable.
Proof First suppose that the integrability condition holds. Consider the metric
for which the vector fields are a null tetrad. Integrability, together with (13.2.7),
forces $u = L\lrcorner\,\theta$ and $v = M\lrcorner\,\theta$, and hence $C^+ = 0$. Conversely, if $C^+ = 0$, and
if we take $u = L\lrcorner\,\theta$ and $v = M\lrcorner\,\theta$, then the distribution is integrable. On the
other hand, the distribution spanned by $\ell$ and $m$ is the same as that spanned by
any multiple of $\ell$ and $m$, so integrability is preserved under rescaling of $W$, $Z$,
$\tilde W$, $\tilde Z$, $u$, and $v$ by any nonvanishing function on $M$.
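
Flat space gives the simplest instance of the integrability condition in Proposition 13.4.2. The following is a minimal worked sketch under assumptions introduced here for illustration: coordinates $(w, z, \tilde w, \tilde z)$ on $\mathbb{C}^4$ with the flat metric $ds^2 = 2(dz\,d\tilde z - dw\,d\tilde w)$, the coordinate vector fields as tetrad, and $u = v = 0$.

```latex
% Flat-space check of the integrability condition
% (assumed coordinates w, z, \tilde w, \tilde z; u = v = 0).
\begin{align*}
W &= \partial_w, \quad Z = \partial_z, \quad
  \tilde W = \partial_{\tilde w}, \quad \tilde Z = \partial_{\tilde z},\\
\ell &= \partial_w - \zeta\,\partial_{\tilde z}, \qquad
  m = \partial_z - \zeta\,\partial_{\tilde w},\\
[\ell, m] &= [\partial_w, \partial_z] - \zeta[\partial_w, \partial_{\tilde w}]
  - \zeta[\partial_{\tilde z}, \partial_z]
  + \zeta^2[\partial_{\tilde z}, \partial_{\tilde w}] = 0.
\end{align*}
```

Neither vector field has a $\partial_\zeta$ component, so $\zeta$ behaves as a constant in the bracket and the coordinate fields commute. The distribution is therefore integrable, consistent with the vanishing of the Weyl tensor in flat space.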
The condition in the proposition is straightforward because it can be applied
to a null tetrad without first calculating the connection and curvature
coefficients. It implies a geometric characterization of ASD conformal structures in
terms of the existence of $\alpha$-surfaces. If condition (C) holds, then there is a
2-surface through each point of $\mathcal{F}$ tangent to $\ell$ and $m$. Its projection into $M$ is an
$\alpha$-surface. The converse is also true.
Proposition 13.4.3 Suppose that for every point $x \in M$, and for every null
SD bivector $\pi \in \Lambda^2 T_xM$, there is an $\alpha$-surface through $x$ with tangent bivector
$\pi$. Then condition (C) holds.
Proof Choose a null tetrad, and at each $x \in M$, use $\zeta$ to label the $\alpha$-planes
through the origin in $T_xM$, as in §2.3. Then each $\alpha$-surface in $M$ lifts to a
2-surface in $M \times \mathbb{CP}^1$, by mapping its tangent plane to the corresponding value of
$\zeta$. Because there is an $\alpha$-surface through each point of $M$ for any given value
of $\zeta$ (including $\zeta = \infty$), the lifted surfaces foliate $M \times \mathbb{CP}^1$. We define $u$ and $v$ by
requiring that $\ell$ and $m$ should be tangent to the foliation. The distribution is
then integrable, and so the conformal metric is ASD.
In fact when condition (C) holds, the tetrad can be chosen so that $\ell$ and $m$
commute.²
ASD Einstein metrics
The Einstein condition on a metric in an ASD conformal class is equivalent to a
stronger integrability condition on $\ell$ and $m$.
Proposition 13.4.4 The metric satisfies condition (E) if and only if
\[ \ell \lrcorner (d\tau \wedge \tau) = 0 = m \lrcorner (d\tau \wedge \tau). \]
Proof Note that $\ell \lrcorner \tau = 0$. From the definition of $\tau$,
dr n r = -(d9 + d(A 9(0) A (d( - 9) = -17- r A e
by using (13.2.2). Therefore Q.1 (dr A r) = 0 if and only if Li O = 0. The
same is true with $\ell$ replaced by $m$. The proposition now follows from Lemma
13.2.1.
Again, we can extract from this a more direct condition on the null tetrad.
If the metric is ASD Einstein, then $d\tau \wedge \tau$ is a scalar multiple of $\xi$, since both are
3-forms which are annihilated by $\ell$ and $m$. In fact, it follows from (13.2.4) and
the calculation in the proof that
\[ 12\, d\tau \wedge \tau + R\,\xi = 0. \]
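Now suppose that we are given on $M$ four independent holomorphic vector fields
$W, Z, \tilde W, \tilde Z$ and a family of 1-forms $\theta$, parametrized holomorphically by $\zeta$. Put
$L = W - \zeta\tilde Z$, $M = Z - \zeta\tilde W$ and define the 4-form $\nu$ by (13.1.1), where $\omega^a$ is the
dual basis of 1-forms. Put
\[ \ell = L + u\,\partial_\zeta, \qquad m = M + v\,\partial_\zeta, \qquad \tau = d\zeta - \theta, \]
where $u = L \lrcorner \theta$, $v = M \lrcorner \theta$. Finally put $\Sigma = d\zeta \wedge \nu$ and $\xi = \ldots$
In the following proposition, $\tau$ and $\xi$ are determined from the vector fields
$W, Z, \tilde W, \tilde Z$ and $\theta(\zeta)$ in this way.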
Proposition 13.4.5 Four independent vector fields $W, Z, \tilde W, \tilde Z$ on $M$ form a
null tetrad for an ASD Einstein metric with scalar curvature $R$, where $R$ is a
nonzero constant, if and only if there exists a one-parameter family of 1-forms
$\theta(\zeta)$ on $M$ such that $12\, d\tau \wedge \tau + R\,\xi = 0$ on $M \times \mathbb{C}$, where $\tau = d\zeta - \theta$.
Proof We have already proved the `only if' part. The converse follows from the
previous proposition, provided that we can show that $\theta$ coincides with the
$\zeta$-dependent 1-form constructed from the connection coefficients of the metric
determined by the tetrad.
Suppose that we are given a family of 1-forms as in the statement of the
proposition. By definition, $\ell \lrcorner \tau = m \lrcorner \tau = 0$ and $\ell \lrcorner \xi = m \lrcorner \xi = 0$. Hence,
by contracting $\ell$ and $m$ with $d\tau \wedge \tau$, we see that $\ell \lrcorner d\tau$ and $m \lrcorner d\tau$ are both
proportional to $\tau$, and so by comparing coefficients of $d\zeta$,
\[ \ell \lrcorner d\tau = x\,\tau, \qquad m \lrcorner d\tau = y\,\tau \tag{13.4.1} \]
where $x = L \lrcorner \partial_\zeta\theta$ and $y = M \lrcorner \partial_\zeta\theta$. Therefore
\[ R\,\mathcal L_\ell\xi = R\,\ell \lrcorner d\xi = -12\,\ell \lrcorner (d\tau \wedge d\tau) = -24x\,\tau \wedge d\tau = 2xR\,\xi. \]

It follows from this and a similar calculation with $\ell$ replaced by $m$ that
\[ \mathcal L_\ell\xi = 2x\,\xi, \qquad \mathcal L_m\xi = 2y\,\xi. \tag{13.4.2} \]
Hence
\[ [\ell, m] \lrcorner \xi = 0, \]
and so $[\ell, m]$ is a linear combination of $\ell$ and $m$. Therefore the distribution
spanned by $\ell$ and $m$ is integrable, and, by the argument in the proof of Proposition
13.4.2, $u$ and $v$ coincide with the functions constructed from the connection
coefficients. Also, by comparing (13.2.6) and (13.4.2), $x$ and $y$ coincide with
the functions $x$ and $y$ constructed from the connection. But $\theta$ is uniquely
determined by $L \lrcorner \theta$, $M \lrcorner \theta$, $L \lrcorner \partial_\zeta\theta$ and $M \lrcorner \partial_\zeta\theta$. Therefore $\theta$ coincides with the
1-form constructed from the connection.
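
The passage from (13.4.2) to the integrability of the distribution can be spelled out with the standard identity $\iota_{[X,Y]} = \mathcal L_X \iota_Y - \iota_Y \mathcal L_X$. The following is only a sketch under assumed notation: $\ell$ and $m$ denote the two lifted vector fields, $\xi$ the 3-form that both annihilate, and $x$ the function appearing in (13.4.1); the symbols are a reading of the printed formulas rather than a quotation of them.

```latex
\begin{align*}
[\ell, m] \lrcorner \xi
  &= \mathcal L_\ell ( m \lrcorner \xi ) - m \lrcorner \mathcal L_\ell \xi \\
  &= 0 - 2x\,( m \lrcorner \xi ) \\
  &= 0 .
\end{align*}
```

Since $\xi$ is a 3-form on a five-dimensional space whose kernel is spanned by $\ell$ and $m$, the vanishing of $[\ell, m] \lrcorner \xi$ places $[\ell, m]$ in that span.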
ASD vacuum metrics
Condition (V) is
\[ R^{++} = 0 = R^{-+} \]
(§13.1). However, $R^{-+} = 0$ if and only if $R^{+-} = 0$. So condition (V) holds if
and only if $D$, which is the Levi-Civita connection on the primed spinor bundle,
is flat; in other words, if and only if there exists a null tetrad $W, Z, \tilde W, \tilde Z$ such
that $\gamma = 0$, and therefore in which the Levi-Civita connection is given by left
rotations. The following is a simple test for the vanishing of $\gamma$.
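Lemma 13.4.6 $\gamma = 0$ if and only if $[L, M] = (\operatorname{div} M)L - (\operatorname{div} L)M$ for all $\zeta$.
Proof Note that $\gamma = 0$ if and only if $\theta = 0$. If $\theta = 0$, then
\[ [L, M] - (\operatorname{div} M)L + (\operatorname{div} L)M = 0 \]
by (13.2.7). Conversely suppose that this equality holds. Then
\[ L \lrcorner \theta = 0, \quad M \lrcorner \theta = 0, \quad \tilde Z \lrcorner \theta + L \lrcorner \partial_\zeta\theta = 0, \quad \tilde W \lrcorner \theta + M \lrcorner \partial_\zeta\theta = 0 \]
for every $\zeta$. But $\partial_\zeta(L \lrcorner \theta) = L \lrcorner \partial_\zeta\theta - \tilde Z \lrcorner \theta$. Therefore $L \lrcorner \theta = 0 = \tilde Z \lrcorner \theta$ for all
$\zeta$. Similarly $M \lrcorner \theta = 0 = \tilde W \lrcorner \theta$. It follows that $\theta = 0$.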
A consequence is that condition (V) is a reduction of the ASDYM equation, with
gauge group the volume-preserving transformations of $M$. This is the content
of Proposition 13.4.8, which is a corollary of the following lemma.
Lemma 13.4.7 Suppose that condition (V) holds. Then there exists a null
tetrad and a function $\Omega$ on $M$ such that $[\Omega L, \Omega M] = 0$ and
\[ \mathcal L_L(\Omega^{-1}\nu) = \mathcal L_M(\Omega^{-1}\nu) = 0. \]
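Proof We know that we can choose the null tetrad so that
\[ [L, M] = (\operatorname{div} M)L - (\operatorname{div} L)M. \]
This equation is preserved by left rotations, which do not change $\gamma$. The linear
terms in $\zeta$ give
\[ \nabla_c\bigl(W^{[c}\tilde W^{d]} - Z^{[c}\tilde Z^{d]}\bigr) = 0, \tag{13.4.3} \]

where the components are taken in a general coordinate system. Choose a solution
$\phi$ to the wave equation $\nabla_c\nabla^c\phi = 0$ and put
\[ a = Z(\phi), \quad b = -W(\phi), \quad \tilde a = \tilde Z(\phi), \quad \tilde b = -\tilde W(\phi). \]
By using (13.4.3) and $g^{cd} = 2(Z^{(c}\tilde Z^{d)} - W^{(c}\tilde W^{d)})$, we find that
\[ 0 = \nabla_c(g^{cd}\nabla_d\phi) = -2\operatorname{div}(a\tilde Z + b\tilde W). \]
Also
\[ \operatorname{div}(aW + bZ) = [W, Z](\phi) + (\operatorname{div} W)Z(\phi) - (\operatorname{div} Z)W(\phi) = 0, \]
since $[W, Z] = (\operatorname{div} Z)W - (\operatorname{div} W)Z$. Similarly,
\[ \operatorname{div}(\tilde a Z + \tilde b W) = 0 = \operatorname{div}(\tilde a\tilde W + \tilde b\tilde Z). \]
There is no loss of generality in assuming that $a\tilde a - b\tilde b \neq 0$. Put $\Omega = (a\tilde a - b\tilde b)^{-1}$
and make the left rotation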

(W Z) b a) (W Z)
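Then, for the rotated tetrad, $\Omega \operatorname{div} W = W(\Omega)$, and so on. Hence $\Omega \operatorname{div} L = L(\Omega)$,
$\Omega \operatorname{div} M = M(\Omega)$, and therefore $[\Omega L, \Omega M] = 0$. Moreover,
\[ \mathcal L_L(\Omega^{-1}\nu) = \operatorname{div}(\Omega^{-1}L)\,\nu = 0, \]
and similarly for $M$.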
Proposition 13.4.8 Let $W, Z, \tilde W, \tilde Z$ be independent holomorphic vector fields
on a four-dimensional complex manifold $M$, and let $\Omega$ be a holomorphic function.
Denote by $\nu$ the holomorphic 4-form such that $24\,\nu(W, Z, \tilde W, \tilde Z) = 1$, and put
\[ L = W - \zeta\tilde Z, \qquad M = Z - \zeta\tilde W. \]
If for every $\zeta \in \mathbb{C}$, we have
\[ \mathcal L_L(\Omega^{-1}\nu) = \mathcal L_M(\Omega^{-1}\nu) = 0, \qquad [\Omega L, \Omega M] = 0, \]
then $W, Z, \tilde W, \tilde Z$ is a null tetrad for an ASD vacuum metric. Every ASD vacuum
metric arises in this way.
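Proof We have
\[ 0 = \mathcal L_L(\Omega^{-1}\nu) = \Omega^{-1}\bigl(\operatorname{div} L - L(\log\Omega)\bigr)\nu. \tag{13.4.4} \]
Therefore $\operatorname{div} L = L(\log\Omega)$ and similarly $\operatorname{div} M = M(\log\Omega)$. Hence
\[ [L, M] = -(\operatorname{div} L)M + (\operatorname{div} M)L, \]
and so the metric satisfies condition (V). The final statement is a consequence
of (13.4.4) and Lemma 13.4.7.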
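
The two steps used in this proof can be written out as follows. This is a sketch under assumed notation: $\Omega$ is the holomorphic function of the proposition, $\nu$ the 4-form fixed by the tetrad, and $\operatorname{div}$ is defined by $\mathcal L_L\nu = (\operatorname{div} L)\,\nu$.

```latex
\begin{align*}
\mathcal L_L(\Omega^{-1}\nu)
  &= L(\Omega^{-1})\,\nu + \Omega^{-1}\,\mathcal L_L\nu \\
  &= \Omega^{-1}\bigl(\operatorname{div} L - L(\log\Omega)\bigr)\,\nu , \\[4pt]
[\Omega L, \Omega M]
  &= \Omega^2 [L, M] + \Omega\,L(\Omega)\,M - \Omega\,M(\Omega)\,L .
\end{align*}
```

Setting both expressions to zero gives $\operatorname{div} L = L(\log\Omega)$ and $[L, M] = M(\log\Omega)\,L - L(\log\Omega)\,M$, which together reproduce $[L, M] = (\operatorname{div} M)L - (\operatorname{div} L)M$.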
This is interpreted as a Lax pair form of the ASD vacuum equation (Mason
and Newman 1989). We suppose that we are given a four-dimensional complex
manifold $M$ together with a holomorphic volume element $\sigma$ on $M$. We then
consider a pair of divergence-free vector fields $L_0$, $M_0$, which depend linearly on
the auxiliary parameter $\zeta$. If
\[ [L_0, M_0] = 0, \tag{13.4.5} \]
then we can construct an ASD vacuum metric by writing
\[ L_0 = A - \zeta\tilde B, \qquad M_0 = B - \zeta\tilde A \tag{13.4.6} \]
and by putting $L = \Omega^{-1}L_0$, $M = \Omega^{-1}M_0$, where $\Omega^2 = 24\,\sigma(A, B, \tilde A, \tilde B)$. Every
ASD vacuum metric arises in this way, so (13.4.5) is equivalent to the ASD
vacuum equation. We shall interpret it below as the ASDYM equation with
gauge group the volume-preserving transformations of $(M, \sigma)$, reduced by the
group of translations of $\mathbb{C}M$.
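
To see what (13.4.5) says about the individual vector fields, one can expand the commutator in powers of the auxiliary parameter. The sketch below assumes that (13.4.6) reads $L_0 = A - \zeta\tilde B$, $M_0 = B - \zeta\tilde A$, with the tildes placed by analogy with $L = W - \zeta\tilde Z$, $M = Z - \zeta\tilde W$, and that $A, B, \tilde A, \tilde B$ do not depend on $\zeta$.

```latex
\begin{align*}
[L_0, M_0]
  &= [A - \zeta\tilde B,\; B - \zeta\tilde A] \\
  &= [A, B] \;-\; \zeta\bigl([A, \tilde A] - [B, \tilde B]\bigr) \;+\; \zeta^2\,[\tilde B, \tilde A] .
\end{align*}
```

Under these assumptions the Lax equation is equivalent to the three conditions $[A, B] = 0$, $[A, \tilde A] = [B, \tilde B]$ and $[\tilde A, \tilde B] = 0$, which have the same shape as the ASDYM equations with the gauge potentials replaced by vector fields.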
ASD Kahler metrics
A null tetrad on $M$ determines three SD 2-forms $\omega$, $\alpha$, $\tilde\alpha$, by
a(2+w(-a = (13.4.7)
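The form $\omega$ is nondegenerate, and together the three forms span the space of SD
2-forms at each point. In flat space-time,
\[ \omega = dw \wedge d\tilde w - dz \wedge d\tilde z, \qquad \alpha = dw \wedge dz, \qquad \tilde\alpha = d\tilde w \wedge d\tilde z \]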
in agreement with the definitions in §2.3. We remarked there that, in the case
of a flat metric, w restricts to the Kahler form, multiplied by .1 i, on a Euclidean
real slice. We shall now consider metrics that admit null tetrads such that the
following condition holds.
(K) The two pairs of vectors $W, Z$ and $\tilde W, \tilde Z$ are tangent to two foliations of $M$
by $\alpha$-surfaces, and the 2-form $\omega$ is closed.
This includes all the metrics that satisfy condition (V), since when (V) holds,
the tetrad can be chosen so that $\gamma = 0$, from which it follows that
\[ \nabla_c(L^{[c}M^{d]}) = 0 \]
and hence that $d(\alpha\zeta^2 + \omega\zeta - \tilde\alpha) = 0$ for constant $\zeta$. Condition (K) is a complex
form of the Kähler condition, as the following establishes. A real Kähler metric
on a two-dimensional complex manifold with holomorphic coordinates $w, z$ is of
the form
\[ ds^2 = 2\bigl(S_{w\bar w}\,dw\,d\bar w + S_{w\bar z}\,dw\,d\bar z + S_{z\bar w}\,dz\,d\bar w + S_{z\bar z}\,dz\,d\bar z\bigr), \]
where $S$ is real-valued and smooth. In flat space, $S = z\bar z - w\bar w$. The complex
2-manifold can be embedded as the real slice $\tilde w = \bar w$, $\tilde z = \bar z$ in the complex
4-manifold with coordinates $w, z, \tilde w, \tilde z$, which is the product of the 2-manifold and
its complex conjugate. If $S$ is analytic, then we can continue the metric to the
4-manifold by replacing $\bar w$ and $\bar z$ by $\tilde w$ and $\tilde z$ in the expressions for $S$ and $ds^2$.
Condition (K) is then satisfied by taking $W, Z$ to be tangent to the surfaces of
constant $\tilde w, \tilde z$, and $\tilde W, \tilde Z$ to be tangent to the surfaces of constant $w, z$, and by
taking $\omega$ to be the analytic continuation of $-\partial\bar\partial S$. On the real slice, $\omega$ is the
Kähler form, multiplied by $2i$.

Suppose now that condition (K) holds. Then $\omega$ is a complex symplectic
form, and the two foliations tangent to $W, Z$ and to $\tilde W, \tilde Z$ are isotropic. So if
we introduce coordinates $w, z, \tilde w, \tilde z$ such that $\tilde w, \tilde z$ are constant on the surfaces of
the first family and $w, z$ are constant on those of the second, then $\omega = -\partial\tilde\partial S$ for
some function $S$, where $d = \partial + \tilde\partial$ is the decomposition of the exterior derivative
determined by the two foliations (see §2.3).³ Hence
\[ ds^2 = 2\bigl(S_{w\tilde w}\,dw\,d\tilde w + S_{w\tilde z}\,dw\,d\tilde z + S_{z\tilde w}\,dz\,d\tilde w + S_{z\tilde z}\,dz\,d\tilde z\bigr). \tag{13.4.8} \]
A direct calculation of the curvature tensor of such a metric gives that O is
proportional to
cam. (13.4.9)
where $\Lambda = \log(S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z})$. Therefore
\[ R^{++}(\alpha) = R^{++}(\tilde\alpha) = 0, \qquad R^{++}(\omega) = \lambda\omega, \]
for some scalar $\lambda$. Hence the trace-free part of $R^{++}$ vanishes if and only if its
trace vanishes.⁴ We have proved the following proposition, of which an immediate
corollary is that a (complex) Kähler metric with $R = 0$ satisfies condition
(S).
Proposition 13.4.9 When condition (K) holds, the conformal structure is ASD
if and only if the scalar curvature vanishes.
The Plebanski equations
If condition (V) holds, then $\theta = 0$, so $\Lambda = k(w, z) + \tilde k(\tilde w, \tilde z)$, for some functions
$k$ and $\tilde k$ of two variables. By making a change of coordinates, in which $w$ and $z$
are replaced by functions of $w$ and $z$, and $\tilde w$ and $\tilde z$ are replaced by functions of
$\tilde w$ and $\tilde z$, we can make $k = \tilde k = 0$, and
\[ S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z} = 1, \tag{13.4.10} \]
which is Plebanski's first equation. Conversely, given a solution to (13.4.10), we
can write down the metric and deduce that it satisfies condition (V), either from
(13.4.8), or by noting that
\[ L_0 = \partial_w + \zeta\bigl(S_{w\tilde z}\,\partial_{\tilde w} - S_{w\tilde w}\,\partial_{\tilde z}\bigr), \qquad M_0 = \partial_z + \zeta\bigl(S_{z\tilde z}\,\partial_{\tilde w} - S_{z\tilde w}\,\partial_{\tilde z}\bigr) \]
satisfy (13.4.5) and are divergence-free with respect to $\sigma = dw \wedge dz \wedge d\tilde w \wedge d\tilde z$.
If we put $p = S_w$, $q = -S_z$ then $\omega = dz \wedge dq - dw \wedge dp$. In this coordinate system
\[ L_0 = \partial_w + a\,\partial_q - b\,\partial_p - \zeta\,\partial_q, \qquad M_0 = \partial_z + c\,\partial_q - a\,\partial_p - \zeta\,\partial_p, \tag{13.4.11} \]
for some functions $a, b, c$. Since $L_0$ and $M_0$ commute and are divergence-free
with respect to $\sigma$, we have that $a = T_{pq}$, $b = T_{qq}$, $c = T_{pp}$, where $T$ is a solution
to Plebanski's second equation
\[ T_{pw} - T_{qz} + T_{pq}^2 - T_{pp}T_{qq} = 0 \]

(Plebanski 1975). By using (13.4.11) to define $L_0$ and $M_0$, and by using the
equivalence of (13.4.5) to the ASD vacuum equation, with $\sigma = dw \wedge dz \wedge dp \wedge dq$,
we see that every solution generates a metric satisfying condition (V). Therefore
Plebanski's first and second equations are both equivalent to the ASD vacuum
equation.
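
Two quick checks may help fix the signs in Plebanski's equations. They are sketches that assume the tilde placement $S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z} = 1$ for the first equation, the flat-space potential $S = z\tilde z - w\tilde w$, and the reading $L_0 = \partial_w + a\,\partial_q - b\,\partial_p - \zeta\,\partial_q$, $M_0 = \partial_z + c\,\partial_q - a\,\partial_p - \zeta\,\partial_p$ of (13.4.11) with $a = T_{pq}$, $b = T_{qq}$, $c = T_{pp}$.

```latex
% Flat space solves the first equation:
\begin{align*}
S = z\tilde z - w\tilde w
  \;\Longrightarrow\;
  S_{w\tilde z} = S_{z\tilde w} = 0,\quad S_{w\tilde w} = -1,\quad S_{z\tilde z} = 1,
  \qquad
  S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z} = 0 - (-1)(1) = 1 .
\end{align*}
% The zeta-independent part of [L_0, M_0], coefficient of \partial_q:
\begin{align*}
c_w + a\,c_q - b\,c_p - a_z - c\,a_q + a\,a_p
  &= T_{ppw} + 2\,T_{pq}T_{ppq} - T_{qq}T_{ppp} - T_{pqz} - T_{pp}T_{pqq} \\
  &= \partial_p\bigl(T_{pw} - T_{qz} + T_{pq}^2 - T_{pp}T_{qq}\bigr) .
\end{align*}
```

The coefficient of $\partial_p$ is, in the same way, $-\partial_q$ of the same bracket, so the commutator vanishes when $T$ satisfies the second equation; the terms linear in $\zeta$ cancel identically because $a_p = c_q$ and $a_q = b_p$.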
An ASD vacuum metric satisfies condition (K) in more than one way since
if $\gamma = 0$ for one choice of null tetrad, then it also vanishes in any null tetrad
obtained from it by a constant left rotation. Such metrics are complex hyper-Kähler
metrics. Real hyper-Kähler metrics are real four-dimensional metrics that admit
three Kähler structures which are compatible in the sense that the three complex
structure tensors $I, J, K$ satisfy the quaternion multiplication relations at each
point. Any real slice in a complex ASD vacuum space-time on which the metric
is real and positive definite is hyper-Kähler.

The twistor distribution

We can restate some of the results of this section in terms of the geometry of
the a-plane bundle in the following proposition.
Proposition 13.4.10 The conformal structure on $M$ is ASD if and only if the
twistor distribution on $\mathcal F$ is integrable. The metric on $M$ is an ASD Einstein
metric with (constant) scalar curvature $R$ if and only if $12\,d\tau \wedge \tau + R\,\xi = 0$.

13.5 CURVED TWISTOR SPACES

As in complex Minkowski space, we can construct a twistor space $P$ from an
ASD conformal structure by considering the set of $\alpha$-surfaces that pass through
a suitably convex neighbourhood in space-time. The picture is very similar to
that in flat space. On the one side we have a neighbourhood $U \subset M$, which we
suppose has been chosen so that its intersection with each $\alpha$-surface is connected
and simply connected; on the other side, we have $P$, which is the set of all
$\alpha$-surfaces that meet $U$. Between them, we have the $\alpha$-plane bundle $\mathcal F$, with the
double fibration
\[ U \longleftarrow \mathcal F \longrightarrow P, \]
where the first map is the projection onto $U$, and the second is the quotient by
the twistor distribution.

Each point $x \in U$ determines a 2-sphere $\hat x \subset P$, made up of all the $\alpha$-surfaces
through $x$: these are labelled, as in flat space, by their tangent $\alpha$-planes at $x$,
and hence by a stereographic coordinate $\zeta$. If we make a general choice of $U$,
then $P$ may not be Hausdorff, but we shall avoid this by making what amounts
to a convexity assumption: we shall assume that $P$ has the same topology as
the twistor space of a convex region in $\mathbb{C}M$. Ward and Wells (1990, p. 435)
use the term civilised for such a space-time region. At the topological level, the
correspondence between $U$ and $P$ is then exactly the same as before, but the
complex geometry is different: the twistor space of a curved conformal structure
is not biholomorphically equivalent to an open set in $\mathbb{CP}^3$. In fact, remarkably,
both $U$ and its conformal geometry can be recovered simply from the information
contained in the complex structure of $P$. The points of $U$ are identified with
a special four-parameter family of holomorphic $\mathbb{CP}^1$s in $P$, and the conformal
geometry is determined by their intersection properties: two nearby points are
null separated whenever the corresponding curves intersect.
As the first step in this construction, we shall consider what is special about
the way in which $\hat x$ is embedded in $P$. We define the normal bundle of $\hat x$ to be
the quotient
\[ N = TP|_{\hat x}/T\hat x. \]
This is a rank-2 holomorphic vector bundle over $\hat x = \mathbb{CP}^1$: the fibre at each
point of $\hat x$ is the quotient of the tangent space to $P$ (a three-dimensional complex
vector space) by the tangent space to $\hat x$ (a one-dimensional subspace). By
Grothendieck's theorem, therefore, $N$ is a direct sum of two standard line bundles
$\mathcal O(k) \to \mathbb{CP}^1$. The special property of the embedding is the following.
Lemma 13.5.1 $N = \mathcal O(1) \oplus \mathcal O(1)$.
Proof A point $p \in \hat x$ is determined by an $\alpha$-plane in the tangent space at the
point $x$ in space-time, so there is a biholomorphic correspondence $p \mapsto \Pi$ between
$\hat x$ and the projective line of $\alpha$-planes $\Pi$ through the origin in $T_xM$.
A tangent vector $Y \in T_pP$ can be represented by a small displacement in $x$,
that is, by a vector $X \in T_xM$, together with a small variation in the tangent
$\alpha$-plane $\Pi$; the vector $X$ is determined by $Y$ up to the addition of a tangent vector
to the $\alpha$-surface; and the difference between $Y$ and $Y'$ is tangent to $\hat x$ whenever
the difference between the corresponding space-time vectors $X$ and $X'$ is tangent
to the $\alpha$-surface. We can therefore identify the fibre $N_p$ of the normal bundle
at $p$ with the quotient $T_xM/\Pi$. As this is now a problem in linear geometry,
it is enough to show that $T_xM/\Pi = \mathcal O(1) \oplus \mathcal O(1)$ in flat space-time. In spinor
notation, this is an immediate consequence of the fact that the projection of $X$
into the quotient by $\Pi$ is given by $X^{AA'} \mapsto X^{AA'}\pi_{A'}$, and that the components
of $X^{AA'}\pi_{A'}$ are linear in $\pi_{A'}$.
Under our convexity assumption, $P$ contains a four-parameter family of embedded
complex curves labelled by the points of $U$; each is biholomorphically
equivalent to the projective line and has normal bundle $N = \mathcal O(1) \oplus \mathcal O(1)$. Because
$H^1(\mathbb{CP}^1, \mathcal O(1) \oplus \mathcal O(1)) = 0$, it follows from the main theorem in Kodaira
(1962) that there is a natural identification between $T_xU$ and $\Gamma(\hat x, N)$ for each
$x \in U$, under which a section of $N$ over $\hat x$ is represented by a field of vectors
connecting $\hat x$ to a nearby curve (see Fig. 13.1).

Fig. 13.1. The correspondence between the vector joining $x$ to a nearby point $y$ and
a section of the normal bundle.

We now consider how to recover $U$ and its conformal structure from its twistor
space. Suppose now that we are given $P$, with its four-parameter family of $\mathbb{CP}^1$
curves with normal bundles $\mathcal O(1) \oplus \mathcal O(1)$. Let $U$ denote the parameter space,
and for each $x \in U$, denote by $\hat x$ the curve in $P$ labelled by $x$. By Kodaira's
theorem, the tangent space to $U$ at $x$ is the space of sections of the normal bundle
$\mathcal O(1) \oplus \mathcal O(1)$. We shall use the same letter to denote both a vector in $T_xU$ and
the corresponding section of the normal bundle of $\hat x$. Since the normal bundle
is everywhere the sum of two copies of $\mathcal O(1)$, we can choose independent vector
fields $W, Z, \tilde W, \tilde Z$ on $U$ such that at each $x$, $W$ and $\tilde Z$ take values in one copy of
$\mathcal O(1)$, and $Z$ and $\tilde W$ take values in the other. Then the ratios
\[ \zeta = W/\tilde Z, \qquad \zeta' = Z/\tilde W \]
are both stereographic coordinates on $\hat x$ (see §9.3), and must therefore be related
by a Möbius transformation,
\[ \zeta = \frac{a\zeta' + b}{c\zeta' + d}. \]
By replacing $Z$ and $\tilde W$ by $aZ + b\tilde W$ and $cZ + d\tilde W$, we can arrange that $\zeta = \zeta'$.
We then have that $W, Z, \tilde W, \tilde Z$ is a null tetrad for a conformal structure on $U$.
We claim (i) it is ASD and (ii) it is independent of the choice of tetrad. The first
statement follows from the following argument for the existence of $\alpha$-surfaces.
For $p \in P$, consider the set $\Sigma$ of all $x$ such that $\hat x$ passes through $p$. This is
a totally null 2-surface because if $x \in \Sigma$, and if $W/\tilde Z = Z/\tilde W = \zeta$ at $p$, then
\[ L = W - \zeta\tilde Z, \qquad M = Z - \zeta\tilde W \]
vanish at $p$, as sections of the normal bundle of $\hat x$, and therefore the corresponding
space-time vectors are tangent to $\Sigma$. In this way, each point of $P$ determines an
$\alpha$-surface in $U$, so $P$ is the twistor space of $U$. It also follows from this argument
that two nearby points of $U$ are null separated whenever the corresponding curves
in $P$ intersect; so the null-cone at $x$ is determined by the intersection properties
of the curves in $P$, and therefore the conformal structure does not depend on the
choice of tetrad. Consequently, we have established the following proposition,
which is due originally to Penrose (1976), but is taken in this form from Ward
and Wells (1990).
Proposition 13.5.2 There is a one-to-one correspondence between (i) ASD
conformal structures and (ii) three-dimensional complex manifolds containing
a four-parameter family of holomorphic copies of $\mathbb{CP}^1$, each with normal bundle
$\mathcal O(1) \oplus \mathcal O(1)$.
The proposition is local in space-time. It is not stated with complete precision
since it requires in addition a suitable convexity condition in $M$, and a corresponding
topological condition on $P$. Also from Kodaira's theorem and from the
fact that the sections of $\mathcal O(1) \oplus \mathcal O(1) \to \mathbb{CP}^1$ form a four-dimensional space, it
follows that any one holomorphic $\mathbb{CP}^1$ with normal bundle $\mathcal O(1) \oplus \mathcal O(1)$ is one of
a four-parameter family of such curves, so the condition in (ii) can be weakened.
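
The dimension count used here is elementary and can be written out. In the sketch below, $\pi_{A'}$ denotes homogeneous coordinates on the projective line and $s^{A'}$, $X^{AA'}$ are constant coefficients; these labels are chosen for illustration and follow the spinor notation of the proof of Lemma 13.5.1.

```latex
\begin{align*}
\Gamma\bigl(\mathbb{CP}^1, \mathcal O(1)\bigr)
  &= \{\, s^{A'}\pi_{A'} \;:\; s^{A'} \in \mathbb{C}^2 \,\},
  & \dim &= 2, \\
\Gamma\bigl(\mathbb{CP}^1, \mathcal O(1) \oplus \mathcal O(1)\bigr)
  &= \{\, X^{AA'}\pi_{A'} \;:\; X^{AA'} \in \mathbb{C}^4 \,\},
  & \dim &= 2 + 2 = 4, \\
H^1\bigl(\mathbb{CP}^1, \mathcal O(k)\bigr) &= 0 \quad (k \ge -1),
  & H^1\bigl(\mathbb{CP}^1, \mathcal O(1) \oplus \mathcal O(1)\bigr) &= 0 .
\end{align*}
```

The vanishing of $H^1$ is what allows Kodaira's theorem to be applied, and the four-dimensional space of sections is the tangent space of the resulting parameter space.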

The canonical line bundle over P

When the conformal structure is ASD, the connection on the line bundle $\mathcal O(-2)$
over an $\alpha$-surface has vanishing curvature. We see this by recalling (p. 291)
that the connection form of the connection is $\partial_\zeta\theta$, restricted to $\Sigma$. Therefore a
covariantly constant section of $\mathcal O(-2)$ can be represented by a function $f$ on the
lifted surface in $\mathcal F$ such that
\[ \ell(f) + xf = 0, \qquad m(f) + yf = 0. \]
These equations are integrable as a consequence of (13.2.6). We define a
line bundle on $P$, which we also denote by $\mathcal O(-2)$, by taking the fibre at $p \in P$ to
be the space of covariantly constant sections of the line bundle over the $\alpha$-surface
determined by $p$. The restriction of $\mathcal O(-2)$ to each curve $\hat x$ is the standard line
bundle $\mathcal O(-2) = T^*\mathbb{CP}^1 \to \mathbb{CP}^1$, so the notation is consistent. Again we denote
the $k$th tensor power by $\mathcal O(-2k)$, and the $k$th tensor power of the dual by $\mathcal O(2k)$.
In particular, $\mathcal O(-4) = K$, where $K = \Lambda^3T^*P$ is the canonical line bundle.⁵
If condition (E) holds, then as a consequence of (13.4.1) and (13.4.2), $\tau$ and
$\xi$ are the pull-backs of forms on $P$ with values in $\mathcal O(2)$ and $\mathcal O(4)$, respectively
(see §13.3); we shall also denote the forms on $P$ by $\tau$ and $\xi$. So for an ASD
Einstein metric with constant scalar curvature $R$, there exists a 1-form $\tau$ on $P$
with values in $\mathcal O(2)$ and a 3-form $\xi$ with values in $\mathcal O(4)$, both holomorphic, such
that⁶
(a) the restriction of $\tau$ to each $\hat x$ is the natural 1-form on $\mathbb{CP}^1$ with values in
$T\mathbb{CP}^1$ (see §9.4);
(b) $R\,\xi + 12\,d\tau \wedge \tau = 0$.
The converse is also true: suppose that we are given $\xi$ and $\tau$ on the twistor
space of an ASD conformal structure such that (a) and (b) hold. Then $\xi$ and
$\tau$ determine an Einstein metric in the conformal class with scalar curvature $R$.
To prove this, we go through the same procedure as in the proof of Proposition
13.5.2, except that, in addition, we now have the forms $\tau$ and $\xi$ on $\mathcal F$, defined by
pulling back the given forms by the projection $\mathcal F \to P$. These have values in $V$
and $V^2$, respectively, where $V$ is the line bundle of holomorphic tangents to the
fibres of $\mathcal F \to M$. In the trivialization determined by the coordinate $\zeta$, we have
$\tau = d\zeta - \theta$ for some $\zeta$-dependent 1-form $\theta$ on $M$. We can fix the overall scaling
of the tetrad by the condition f = 20 =_(e, m, , , ) where E = d(A v, with v the
volume element on $M$ determined by the tetrad. When $R \neq 0$, we then have an
Einstein metric with scalar curvature $R$, by Proposition 13.4.5 (see also Ward
1980).
When $R = 0$, we have that $d\tau \wedge \tau = 0$, and hence that the distribution on $P$
spanned by the vectors that annihilate $\tau$ is integrable. We can arrange that $\zeta$ is
constant on the integral surfaces: with this choice, $\ell$ and $m$ have no components
along $\partial_\zeta$ and
\[ \mathcal L_\ell\xi = 0 = \mathcal L_m\xi. \]
It follows that the quantities $u, v, x, y$ constructed from the connection coefficients
on $M$ vanish (see pp. 289 and 294). Hence $\theta = 0$ and the metric satisfies
condition (V).
We can, in fact, formulate the conditions on $P$ in a more concise way. We
know from the existence of $\mathcal O(-2)$ that it is possible to find a line bundle $K^{1/2}$
such that $K^{1/2} \otimes K^{1/2} = K$. If we are given a 1-form $\tau$ on $P$ with values in
$K^{-1/2}$ such that the restriction of $\tau$ to each line is nonzero, then we can use $\tau$
to identify $K^{1/2}|_{\hat x}$ with $T^*\mathbb{CP}^1$ in such a way that (a) holds; and we can use (b)
to define $\xi$. Hence we have the following.
Proposition 13.5.3 A given ASD conformal class of metrics contains a metric
satisfying condition (V) if and only if there is a holomorphic 1-form on the
corresponding twistor space, with values in a square root of $K^{-1}$, such that its
restriction to each line is nonzero.
We can similarly characterize the twistor spaces of the metrics that satisfy
condition (K), the complex form of the Kähler condition (Pontecorvo 1992).
Proposition 13.5.4 A metric satisfying condition (C) is conformal to a metric
that admits a tetrad such that condition (K) holds if and only if there is a section
$s$ of $\mathcal O(2) \to P$ with exactly two distinct zeros on each $\hat x$.

Proof Suppose that such a section $s$ exists. Since a general section of
$\mathcal O(2) \to \mathbb{CP}^1$, that is, a general homogeneous quadratic, has two zeros, the condition on
$s$ is simply that it should not vanish identically on any curve $\hat x$, and that its
zeros on $\hat x$ should not coincide.
The pull-back of $s$ to $\mathcal F$ is a section of $\mathcal O(2) \to \mathcal F$, represented relative to a
general choice of conformal null tetrad by a function $s : \mathcal F \to \mathbb{C}$, quadratic in $\zeta$
with distinct roots, and satisfying
\[ \ell(s) = xs, \qquad m(s) = ys. \]
By making a right-rotation of the tetrad, and by rescaling, we can arrange that
$s = -\zeta$. It then follows that $\theta - \zeta\,\partial_\zeta\theta$ is annihilated by $L$ and $M$, and hence that
Ti = 0, so we have
Da (LFaMbi) 2.\a (WfaZb1 (221ai Vbll
_
(see 13.2.1). It follows that condition (K) holds. For the converse, we take
$s = \omega(L, M)$. □
Note that s actually determines the conformal factor uniquely. To summarize,
we have the following.
Proposition 13.5.5 Let $P$ be the twistor space of an ASD conformal structure.
Then the conformal class contains (i) an Einstein metric if and only if there is
a holomorphic 1-form $\tau$ on $P$ with values in $\mathcal O(2)$ such that for each $x$, $\tau|_{\hat x}$ is
the canonical 1-form on $\mathbb{CP}^1$ with values in $T\mathbb{CP}^1$; or (ii) a metric satisfying
condition (K) if and only if there is a holomorphic section $s$ of $\mathcal O(2)$ with exactly
two distinct zeros on each $\hat x$. In (i), the metric satisfies condition (V) if and
only if, in addition, $d\tau \wedge \tau = 0$.
Reality conditions
To obtain a real Riemannian geometry from a twistor space, one needs a reality
structure, that is, an antiholomorphic involution $P \to P$ that preserves the
four-parameter family of curves, and has no fixed points. Given this, one obtains
from $P$ a real conformal structure on the real 4-manifold $M$ of curves that are
invariant under the involution, and in fact $P$ is fibred over $M$ by the invariant
curves. There is a natural identification between $P$ and the projective primed
spin bundle of $M$, which is the basis of the construction of Atiyah et al. (1978a). If
the additional structure $\tau$ or $s$ is compatible with the involution in the obvious
sense, then the conformal class contains a real vacuum, Einstein, or scalar-flat
Kähler metric, according to the case.
When an ASD conformal metric has a real ultrahyperbolic slice, there is
again an antiholomorphic involution of its twistor space, but this time it has
fixed points, namely the $\alpha$-planes that meet the real slice in real 2-surfaces.

13.6 REDUCTIONS
Many classical integrable equations are symmetry reductions of one or other of
the ASD conditions on a metric. A large number of examples can be generated by
using the connection between the ASD conditions on a metric and the ASDYM
equation that we describe in the next section (see also Ward 1992). We mention
here just three other cases of geometric interest.
Example 13.6.1 Three-dimensional Einstein-Weyl spaces. A Weyl geometry
on a real or complex manifold consists of (i) a conformal structure and (ii) a
compatible projective structure. A projective structure is a family of affine connections
on the tangent bundle (generally with torsion) which have the same
(unparametrized) geodesics, and the compatibility condition is that the null
geodesics of the conformal structure, which can be characterized as null curves
in null hypersurfaces, should also be geodesics of the projective structure. Given
a Weyl geometry and a representative metric $g_{ab}$ from the conformal class, it
is always possible to choose a representative connection $D$ from the projective
structure such that
\[ D_a g_{bc} = \omega_a g_{bc}, \]
for some 1-form $\omega$ (Ehlers et al. 1972). Under conformal rescaling, we have
\[ g_{ab} \mapsto \Omega^2 g_{ab}, \qquad \omega \mapsto \omega + 2\,d\log\Omega. \]
If $\omega$ is closed, then we can choose $\Omega$ to make $\omega$ vanish. In this case, $D$ is the
Levi-Civita connection of the corresponding metric. In general, however, $d\omega \neq 0$
and the Weyl structure does not reduce to a (complex) Riemannian or
pseudo-Riemannian geometry.
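
The transformation rule for the 1-form under conformal rescaling follows from the Leibniz rule with the connection $D$ held fixed. The following sketch assumes the compatibility condition in the form $D_a g_{bc} = \omega_a g_{bc}$ and writes $\Omega$ for the conformal factor and $\phi$ for a local potential of $\omega$ (the name $\phi$ is introduced only for this illustration).

```latex
\begin{align*}
D_a(\Omega^2 g_{bc})
  &= 2\,\Omega\,(\partial_a\Omega)\,g_{bc} + \Omega^2\,\omega_a\,g_{bc} \\
  &= \bigl(\omega_a + 2\,\partial_a\log\Omega\bigr)\,\Omega^2 g_{bc} , \\[4pt]
\omega = d\phi,\quad \Omega = e^{-\phi/2}
  &\;\Longrightarrow\;
  \omega + 2\,d\log\Omega = d\phi - d\phi = 0 .
\end{align*}
```

So a closed $\omega$ can be removed locally by a rescaling, in which case $D$ preserves the rescaled metric.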
A three-dimensional Einstein-Weyl space is a 3-manifold with Weyl structure
which satisfies the additional condition that every geodesic lies in a totally
geodesic null hypersurface. Hitchin (1982a) shows that this is equivalent to the
condition that
\[ R_{(ab)} = \Lambda g_{ab} \]
for some $\Lambda$, where $R_{ab}$ is the Ricci tensor of $D$. In the general case, $D$ is not
a Levi-Civita connection and $R_{ab}$ is not symmetric, but in the special case that
$\omega$ vanishes, and the geometry is (complex) Riemannian, this is the Einstein
condition. In three dimensions the Einstein spaces are the spaces of constant
curvature (see also Pedersen and Tod 1993).
Hitchin also describes how every three-dimensional Einstein-Weyl geometry
can be constructed from a two-dimensional twistor space, that is, from a
two-dimensional complex manifold containing a three-parameter family of copies of
$\mathbb{CP}^1$ with normal bundle $\mathcal O(2)$. Jones and Tod (1985) interpret the construction
as a reduction of the twistor correspondence for four-dimensional ASD conformal
structures. They observe that a four-dimensional ASD conformal metric
with a non-null conformal Killing vector $T$ determines a three-dimensional
Einstein-Weyl geometry. The 3-manifold is the quotient of the four-dimensional
space-time by the flow of $T$, the conformal structure is given by projecting the
conformal metric from the 3-spaces orthogonal to $T$ in the space-time, and the
geodesics of the projective structure are the projections into the 3-manifold of the
null geodesics in space-time. The totally geodesic null hypersurfaces that characterize
the resulting structure as an Einstein-Weyl geometry are the projections
of the $\alpha$-surfaces in space-time.
In the twistor picture, the conformal Killing vector determines a holomorphic
vector field $X$ on the twistor space $P$ of space-time (the flow along this gives the
action of the flow of $T$ on $\alpha$-surfaces), and Hitchin's twistor space is the quotient
$\mathcal T$ of $P$ by the flow of $X$; the three-parameter family of copies of $\mathbb{CP}^1$ is made
up of the projections of the curves in $P$ corresponding to points in space-time.
The Einstein-Weyl geometry is recovered from $\mathcal T$ by taking the points of the
3-manifold to be the curves in the three-parameter family. A non-null geodesic
is made up of the curves in the family that pass through two fixed points of $\mathcal T$,
and a null geodesic is made up of the curves in the family that pass through a
fixed point of $\mathcal T$ with a given tangent direction.
Example 13.6.2 The Monge-Ampère equation. This is the equation
\[ S_{xx}S_{yy} - S_{xy}^2 = 1. \]
A solution also satisfies Plebanski's first equation by putting $x = w + \tilde w$, $y = z - \tilde z$,
and hence gives a metric satisfying condition (V), with two commuting Killing
vectors $X = \partial_w - \partial_{\tilde w}$, $Y = \partial_z + \partial_{\tilde z}$.
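
The reduction can be verified with the chain rule. The sketch below assumes the reading $x = w + \tilde w$, $y = z - \tilde z$ of the substitution (tildes restored to match the Killing vectors $\partial_w - \partial_{\tilde w}$ and $\partial_z + \partial_{\tilde z}$) and takes Plebanski's first equation in the form $S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z} = 1$.

```latex
\begin{align*}
&\partial_w = \partial_{\tilde w} = \partial_x, \qquad
 \partial_z = \partial_y, \qquad \partial_{\tilde z} = -\partial_y, \\
&S_{w\tilde w} = S_{xx}, \qquad S_{z\tilde z} = -S_{yy}, \qquad
 S_{w\tilde z} = -S_{xy}, \qquad S_{z\tilde w} = S_{xy}, \\
&S_{w\tilde z}S_{z\tilde w} - S_{w\tilde w}S_{z\tilde z}
   = -S_{xy}^2 + S_{xx}S_{yy} .
\end{align*}
```

With these identifications the first Plebanski equation becomes $S_{xx}S_{yy} - S_{xy}^2 = 1$, and both Killing vectors annihilate $x$ and $y$, hence $S$.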
Example 13.6.3 The SU(∞)-Toda field equation. A four-dimensional Kähler
metric with a Killing vector that preserves the Kähler form can be written in
the form
\[ ds^2 = v e^u(dx^2 + dy^2) + v\,dz^2 + v^{-1}(dt + \alpha)^2, \]
where $u, v$ are functions of $x, y, z$ and $\alpha$ is a 1-form on the $x, y, z$-space; the
Kähler form is
\[ \omega = v e^u\,dx \wedge dy + dz \wedge (dt + \alpha). \]
Given $u$, we can find $v$ and $\alpha$ so that the metric is scalar flat if and only if $u$
satisfies the equation
\[ u_{xx} + u_{yy} + (e^u)_{zz} = 0 \tag{13.6.1} \]
(Lebrun 1991, Tod 1995a; see also Boyer and Finley 1982). Given $u$, we choose
$v$ and $\alpha$ such that
\[ v_{xx} + v_{yy} + (v e^u)_{zz} = 0, \]
\[ d\alpha = v_x\,dy \wedge dz + v_y\,dz \wedge dx + (v e^u)_z\,dx \wedge dy. \]
For a vacuum solution, we take $v = u_z$; for the Gibbons-Hawking metrics, we
take $u = 0$. Tod (1995a) shows that the metric $z^{-2}ds^2$ is Einstein where $ds^2$ is
the metric given above and $u$ satisfies eqn (13.6.1), but where now $2\Lambda v = zu_z - 2$
where $\Lambda$ is proportional to the scalar curvature. Equation (13.6.1) is interpreted
by Ward (1990b) as the SU(∞)-Toda field equation. Tod (1995b) gives examples,
including the metric of Pedersen and Poon (1990); he describes an ansatz that
reduces the equation to a special case of $P_{\mathrm{III}}$ (see Example 13.7.8 below). See
also Tod (1991, 1995c).
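
Two consistency checks on these formulas are easy to write out. The sketch assumes that the third coefficient in $d\alpha$ is $(v e^u)_z$, which is the reading that makes $d\alpha$ closed exactly when $v$ satisfies its linear equation, and uses $*_3$ for the Hodge dual of the flat metric on the $x, y, z$-space (a symbol introduced here for illustration).

```latex
% Closure of d\alpha reproduces the equation for v:
\begin{align*}
d(d\alpha) = \bigl(v_{xx} + v_{yy} + (v e^u)_{zz}\bigr)\,dx \wedge dy \wedge dz .
\end{align*}
% The case u = 0 (equation (13.6.1) then holds trivially):
\begin{align*}
v_{xx} + v_{yy} + v_{zz} &= 0, \\
d\alpha &= v_x\,dy \wedge dz + v_y\,dz \wedge dx + v_z\,dx \wedge dy = *_3\,dv, \\
ds^2 &= v\,(dx^2 + dy^2 + dz^2) + v^{-1}(dt + \alpha)^2 .
\end{align*}
```

In the case $u = 0$ the data reduce to a harmonic function $v$ on flat 3-space together with a 1-form $\alpha$ whose curvature is dual to $dv$, which is the familiar form of the Gibbons-Hawking ansatz.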

13.7 ASDYM FIELDS AND THE SWITCH MAP

Under certain conditions, we can interchange the `internal variables' of an invariant
ASDYM field with the ignorable coordinates on space-time to construct
a new space-time metric. If the connection and the original conformal structure
are ASD, then so is the new one. We call this the switch map. It gives a way of
generating nontrivial ASD conformal structures from solutions to the ASDYM
equation in flat space-time and allows us to show that many reductions of the
ASDYM equation are also reductions of one or other of the ASD conditions on
a four-dimensional geometry. For notational simplicity, we shall give the details
of the construction only in the case in which the original space-time is an open
subset $U$ in complex Minkowski space, although the generalization is evident.
The bundle $\mathcal E$
First, we need some notation. Suppose that $\mathfrak h$ is a $k$-dimensional Lie algebra
of conformal Killing vectors that acts freely on $U$ by infinitesimal conformal
transformations, where $1 \le k \le 4$. Since the action is free, the elements of $\mathfrak h$
span a $k$-dimensional distribution on $U$. We suppose also that we are given a
faithful action of the gauge group $G$ on some $k$-manifold $N$, or equivalently, that
we have an embedding of its Lie algebra $\mathfrak g$ into the Lie algebra of vector fields
on $N$. We choose independent vector fields $X_i$ on $N$, and we choose a basis for
$\mathfrak h$ and denote the corresponding vector fields on $U$ by $Y_i$, where $i$ runs from 1 to
$k$. In all the examples that we shall consider, there is also an invariant volume
element on $N$, which we denote by $\kappa$. This is not needed when we use the switch
map to construct ASD conformal structures, but is used to pick out particular
scalings of the metric with additional properties.
As usual, we denote by $D$ an $\mathfrak{h}$-invariant ASD connection on a vector bundle
$E \to U$. By replacing the transition maps of $E$ by the corresponding
transformations of $N$, we construct the associated bundle $\mathcal{E} \to U$ with fibre $N$; and by
applying the Lie algebra homomorphism to its components, we represent the gauge
potential by
$$\Phi = A\,dw + B\,dz + \tilde A\,d\tilde w + \tilde B\,d\tilde z, \tag{13.7.1}$$
where $A, B, \tilde A, \tilde B$ are vector fields on $N$, depending on the space-time coordinates
$w, z, \tilde w, \tilde z$ on $U$.
Horizontal and vertical distributions
In a local trivialization, each vector field $Y$ on $U$ determines a vector field
$$Y - Y \lrcorner\, \Phi$$
on $\mathcal{E}$, which we call the horizontal lift of $Y$. This is independent of the choice of
local trivialization, and together all the horizontal lifts span a horizontal
distribution $\mathcal{H}$ on $\mathcal{E}$. Thus
$$T\mathcal{E} = \mathcal{H} \oplus \mathcal{V},$$
where $\mathcal{V}$ is the vertical distribution, that is, the bundle of tangents to the fibres
of the projection $\pi : \mathcal{E} \to U$. At each $e \in \mathcal{E}$, we have a canonical identification
$\mathcal{H}_e = T_xU$, where $x = \pi(e)$. The horizontal distribution is not integrable unless
the connection is flat, but because the connection is ASD, the horizontal lifts of the
tangents to an $\alpha$-surface $\Sigma \subset U$ span the tangent spaces to a $k$-dimensional
family of 2-surfaces in $\mathcal{E}$ which are mapped onto $\Sigma$ by the projection. We call
these the horizontal lifts of $\Sigma$.
The action of $\mathfrak{h}$ on $\mathcal{E}$
In the same way, we can transfer the action of $\mathfrak{h}$ on $E$ to $\mathcal{E}$, by constructing
vector fields on $\mathcal{E}$ from the Lie derivative operators $\mathcal{L}_Y = Y + \theta_Y$. In a local
trivialization, $Y \in \mathfrak{h}$ determines the vector field $Y'$ given by
$$Y' = Y - V_Y,$$
where $V_Y$ is the image of $\theta_Y$ under the Lie algebra homomorphism. The $Y'$s
span an integrable $k$-dimensional distribution $\mathcal{S}$ on $\mathcal{E}$, and their flows preserve
$\mathcal{H}$. We shall assume that $\mathcal{H}$ and $\mathcal{S}$ are transverse on some open set, so that
$T\mathcal{E} = \mathcal{H} \oplus \mathcal{S}$, which amounts to an assumption that the Higgs fields of $D$ are
linearly independent as vector fields on $N$.
The new conformal metric
The switch map generates a conformal metric on the quotient space $\mathcal{M} = \mathcal{E}/\mathcal{S}$
from these data, which is defined as follows. We denote the projection onto the
quotient by $p : \mathcal{E} \to \mathcal{M}$ and we transfer the flat space-time metric $\eta$ on $U$ to a
conformal metric on $\mathcal{M}$ by using the identifications
$$T_xU = \mathcal{H}_e = T_m\mathcal{M},$$
where $x = \pi(e)$ and $m = p(e)$. Because $\eta$ is invariant up to scale under the
action of $\mathfrak{h}$, the result does not depend on the choice of $e$ on the fibre $p^{-1}(m)$.
We have the following.
Proposition 13.7.1 The conformal metric on $\mathcal{M}$ is ASD.
This follows from the existence of $\alpha$-planes in $\mathbb{CM}$. The projections into $\mathcal{M}$ of
the horizontal lifts of the $\alpha$-planes in $U$ are $\alpha$-surfaces in $\mathcal{M}$.
Remark. It is not necessary to start with flat space-time: exactly the same
construction works in a more general context, in which $U$ and $\eta$ are replaced by
a curved space-time with a general ASD conformal structure. Note that we also
obtain an ASDYM field on $\mathcal{M}$ with gauge group $H$, where $H$ has Lie algebra
$\mathfrak{h}$, by interpreting $\mathcal{E} \to \mathcal{M}$ as the associated principal bundle, and $\mathcal{H}$ as the
horizontal distribution. See Maszczyk et al. (1994), Maszczyk (1995).
Vacuum metrics
We can obtain an ASD vacuum metric in the conformal class by making a further
geometric assumption, and by using the volume element on $N$ to determine the
scaling. We assume that there is an $\mathfrak{h}$-invariant null tetrad on $U$ such that
$[L, M] = 0$ and $\operatorname{div} L = 0 = \operatorname{div} M$ for every $\zeta$, which means in particular that $\mathfrak{h}$
must be a Lie algebra of Killing vectors.
First we construct a volume element $\sigma$ on $\mathcal{M}$. From $\kappa$ (the invariant volume
element on $N$) and the pull-back of $\nu$ (the metric volume element on $U$), we
define a $(k + 4)$-form $\kappa \wedge \nu$ on $\mathcal{E}$. We then put
$$p^{*}\sigma = (\kappa \wedge \nu)(Y_1', \ldots, Y_k', \,\cdot\,, \,\cdot\,, \,\cdot\,, \,\cdot\,),$$
where the $Y_i$ make up a basis for $\mathfrak{h}$ and the $Y_i'$ are the corresponding vector fields on $\mathcal{E}$. Since the right-hand side is invariant and
annihilated by the tangents to $\mathcal{S}$, it is necessarily the pull-back of a 4-form on $\mathcal{M}$.
We then use the further assumption to construct divergence-free vector fields on
$\mathcal{M}$. By projecting the horizontal lifts of $L$ and $M$ by $p_*$, we obtain two vector
fields $L_0$, $M_0$ on $\mathcal{M}$, which depend linearly on the spectral parameter $\zeta$. They
commute, and the coefficients of the degree 1 and degree 0 terms in $\zeta$ form a null
tetrad for the metric on $\mathcal{M}$. Moreover they are divergence free with respect to
$\sigma$. By rescaling the metric and applying Proposition 13.4.8, we obtain an ASD
vacuum metric. We shall give explicit formulas below.
The twistor spaces
We have constructed a $(k + 4)$-manifold $\mathcal{E}$ which is fibred over the original
neighbourhood $U$ in complex Minkowski space, and over the new space-time $\mathcal{M}$. We
construct geometric objects on $\mathcal{M}$ by pulling the corresponding objects on $U$
back to $\mathcal{E}$, and then projecting. We have a similar picture at the twistor level by
constructing a complex manifold which is fibred over the original twistor space
and the new one.
Although the four-dimensional distribution $\mathcal{H}$ is not integrable in general,
it does admit a $(k + 3)$-dimensional family of integral 2-surfaces, namely the
horizontal lifts of the $\alpha$-surfaces in $U$. These project under $p$ onto $\alpha$-surfaces in
$\mathcal{M}$. We denote by $P_{\mathcal{E}}$ the space of all such 2-surfaces in $\mathcal{E}$. Both $\mathfrak{g}$ and $\mathfrak{h}$ act on
$P_{\mathcal{E}}$, and the quotients are, respectively, the twistor spaces $P_U$ and $P_{\mathcal{M}}$ of $U$ and
$\mathcal{M}$ (we use the subscripts to distinguish the spaces associated with $U$, $\mathcal{M}$, and
$\mathcal{E}$).
We associate with $\mathcal{E}$ a correspondence space $F_{\mathcal{E}}$, consisting of pairs $(e, \Pi)$,
where $e \in \mathcal{E}$, and $\Pi$ is an $\alpha$-plane element at $\pi(e) \in U$. Both $\mathfrak{g}$ and $\mathfrak{h}$ act
on $F_{\mathcal{E}}$ in a natural way, and the corresponding quotients are, respectively, the
correspondence spaces $F_U$ and $F_{\mathcal{M}}$ of $U$ and $\mathcal{M}$. Over $F_{\mathcal{E}}$, we have the line
bundle $\mathcal{O}(-2)$ of tangent bivectors to $\Pi$: the quotients of this by $\mathfrak{g}$ and $\mathfrak{h}$ are the
line bundles $\mathcal{O}(-2)$ over $F_U$ and $F_{\mathcal{M}}$. The tangent spaces to the $\alpha$-planes in $U$
determine a twistor fibration of $F_{\mathcal{E}}$ by 2-surfaces, which projects onto the twistor
fibrations on $F_U$ and $F_{\mathcal{M}}$, and the quotient of $F_{\mathcal{E}}$ by the twistor fibration is a
bundle over the twistor space of $U$ with fibre $N$: it is the $N$-bundle associated
with the Penrose-Ward transform of the original ASDYM field.
Explicit formulas
We work locally and identify $\mathcal{E}$ with $U \times N$. We put $U = T \times O$, where $O$ is an
orbit of $\mathfrak{h}$ in $U$ and $T = U/\mathfrak{h}$. We suppose that the gauge has been chosen so
that $\Phi$ is invariant, so that $V_Y = 0$ for each $Y \in \mathfrak{h}$. Then we have
$$\mathcal{E} = T \times O \times N,$$
with the distribution $\mathcal{S}$ given by the tangent spaces to the second factor, and
hence that $\mathcal{M} = T \times N$.
We choose a frame field $T_a$ for the tangent bundle of $T$. The horizontal
vectors on $\mathcal{E}$ are combinations of
$$Y_i + \phi_i^{\,j}X_j, \qquad T_a + \omega_a^{\,j}X_j,$$
where $\phi_i^{\,j}X_j$ and $\omega_a^{\,j}X_j$ are the vectors on $N$ corresponding to the gauge Lie
algebra elements $Y_i \lrcorner\, \Phi$ and $T_a \lrcorner\, \Phi$, respectively, with the indices $i, j$ running over
$1, \ldots, k$ and $a, b$ running over $1, \ldots, 4 - k$. We denote by $\eta$ and $g$ the components
of the metrics on $U$ and $\mathcal{M}$ in the respective frame fields $(T_a, Y_i)$ and $(T_a, X_i)$.
We then have that $g$ is determined from $\eta$ by $\eta = \Pi g \Pi^{t}$, where
$$\Pi = \begin{pmatrix} \delta_a^{\,b} & \omega_a^{\,j} \\ 0 & \phi_i^{\,j} \end{pmatrix}.$$
Under the further assumption that leads to a vacuum metric, we can choose
the $X_i$ so that the components of the volume form $\kappa$ are constant. Then if
$W, Z, \tilde W, \tilde Z$ is the null tetrad of coefficients of $L_0$ and $M_0$, we have that
$$\sigma(W, Z, \tilde W, \tilde Z)$$
is a constant multiple of $\det \Pi$, and hence that $\det(\Pi)\,g$ is an ASD vacuum metric.
Example 13.7.2 The Gibbons-Hawking ansatz. Here the original space-time is complex Minkowski
space and $D = d + \Phi$ is a solution of the ASD Maxwell equations. With $k = 1$,
we take the generator of $\mathfrak{h}$ to be the translation vector $\partial_0$ in complex Cartesian
coordinates $x^a$. Then in an invariant gauge, $\Phi = (\phi, \omega)$, where $\phi(\mathbf{x})$ is a solution
of Laplace's equation $\nabla^2\phi = 0$ as a function of $\mathbf{x} = (x^1, x^2, x^3)$, and
$$\operatorname{curl}\omega = \nabla\phi, \tag{13.7.2}$$
which is the abelian form of the Bogomolny equation. We interpret $\omega_1, \omega_2, \omega_3$
and $\phi$ as the components of vector fields on $N = \mathbb{C}$, and put $T_a = \partial_a$, $a = 1, 2, 3$,
$Y = \partial_0$. Then we have
$$\Pi = \begin{pmatrix} 1_3 & \omega \\ 0 & \phi \end{pmatrix}$$
in 1+3 block form and hence that $g = (\Pi^{t}\Pi)^{-1}$, since $\eta$ is the identity matrix.
Therefore the result of the switch map is the conformal metric
$$ds^2 = d\mathbf{x}\cdot d\mathbf{x} + \phi^{-2}(dt - \omega\cdot d\mathbf{x})^2.$$
We note that the further assumptions under which we obtain a vacuum metric
hold, by taking $L = \partial_w - \zeta\partial_{\tilde z}$, $M = \partial_z - \zeta\partial_{\tilde w}$, and $\kappa$ to be a constant 1-form on
$\mathbb{C}$. Therefore, since $\det \Pi = \phi$, we have that
$$ds^2 = \phi\,d\mathbf{x}\cdot d\mathbf{x} + \phi^{-1}(dt - \omega\cdot d\mathbf{x})^2$$
is an ASD vacuum metric, for any solution $\phi$ of Laplace's equation, with $\omega$
defined by (13.7.2). This is the Gibbons-Hawking ansatz (Gibbons and Hawking
1978). If instead we start with a general ASD conformal structure, on which
we are given a non-null conformal Killing vector and an invariant ASD Maxwell
field, then we obtain a new ASD conformal metric, which is related to the original
one by the condition that the Einstein-Weyl geometries obtained by taking the
quotients of the two space-times by the conformal Killing vectors should coincide.
See Example 13.6.1, and Jones and Tod (1985).
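The matrix algebra in this example can be checked symbolically. The following is a minimal sketch, assuming sympy is available and reading the block matrix as having the $3 \times 3$ identity, the column $\omega$, and the scalar $\phi$ as its entries; the symbol names (`phi`, `omega1`, `dx1`, `dt`, and so on) are illustrative choices, not notation from the text. It treats $\phi$ and $\omega$ as formal symbols, so it tests only the passage from $\Pi$ to the line element, not Laplace's equation or (13.7.2).

```python
import sympy as sp

# Formal symbols standing in for the harmonic function and the vector potential.
phi = sp.symbols("phi", nonzero=True)
omega = sp.Matrix(sp.symbols("omega1:4"))          # column (omega_1, omega_2, omega_3)

# Pi in block form: identity and omega on top, zero row and phi below.
top = sp.eye(3).row_join(omega)
bottom = sp.zeros(1, 3).row_join(sp.Matrix([[phi]]))
Pi = top.col_join(bottom)

# With eta the identity, eta = Pi g Pi^t gives g = (Pi^t Pi)^(-1).
g = (Pi.T * Pi).inv().applyfunc(sp.simplify)

# Evaluate g on the coframe (dx1, dx2, dx3, dt), treated as commuting symbols.
dx = sp.Matrix(sp.symbols("dx1:4"))
dt = sp.symbols("dt")
coframe = dx.col_join(sp.Matrix([dt]))
quad = (coframe.T * g * coframe)[0, 0]

# Line elements quoted in the example.
flat_part = (dx.T * dx)[0, 0]
twist = dt - (omega.T * dx)[0, 0]
conformal_target = flat_part + twist**2 / phi**2
vacuum_target = phi * flat_part + twist**2 / phi

conformal_gap = sp.simplify(quad - conformal_target)
vacuum_gap = sp.simplify(Pi.det() * quad - vacuum_target)
```

Both `conformal_gap` and `vacuum_gap` should reduce to zero, which is the statement that rescaling by $\det \Pi = \phi$ turns the conformal metric into the Gibbons-Hawking form.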
Example 13.7.3 The KdV equation. We take $\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$, $k = 2$, and
$$Y_1 = \partial_w - \partial_{\tilde w}, \qquad Y_2 = \partial_{\tilde z}$$
and consider the ASDYM field determined by a solution to the KdV equation
by (6.3.16). We map the elements of $\mathfrak{g}$ to vector fields on $N = \mathbb{C}^2$, by using the
natural linear action of $\mathfrak{sl}(2, \mathbb{C})$ on two-dimensional row vectors. We denote the
linear coordinates on $\mathbb{C}^2$ by $\alpha, \beta$, and take $\kappa$ to be the 2-form $d\alpha \wedge d\beta$. We put
$X_1 = \partial_\alpha$, $X_2 = \partial_\beta$, and write the Minkowski metric in the coordinate system
$t, x, w, \tilde z$, where $t = z$, $x = w + \tilde w$. Then $Y_1 = \partial_w$ and $Y_2 = \partial_{\tilde z}$. Finally, we put
$T_1 = \partial_t$, $T_2 = \partial_x$. We then have
1 0 as +/3c ab-/3a ,0 (0 0 0 1

_ -1
II- 0
0
1

0
0
aq + or -a - ,Qq
0
'
0
0
0
-1 2
0
0
0 0 ,Q 0 1 0 0 0

The construction gives that $g = \det(\Pi)\,\Pi^{-1}\eta\,(\Pi^{-1})^{t}$ is an ASD vacuum metric in
the coordinate system $t, x, \alpha, \beta$. If the coordinates and the solution to the KdV
equation are real, then $g$ has ultrahyperbolic signature. We can also find solutions
to the vacuum equation in the same way from other $SL(2, \mathbb{C})$ reductions, for
example, from solutions to the NLS equation. Another possibility is to take $N$
to be one of the homogeneous spaces $S^2$ or $\mathbb{H}^2$ for $SU(2)$ or $SU(1, 1)$ or their
complexification.
Example 13.7.4 Reductions of the $SL(\infty)$ ASDYM equation. As another
example with two ignorable coordinates, consider the abelian Lie algebra $\mathfrak{h}$
generated by the translation vectors
$$Y_1 = \partial_{\tilde w}, \qquad Y_2 = \partial_{\tilde z}.$$
We take $N$ to be a two-dimensional complex manifold, and $\kappa$ to be a holomorphic
area element (i.e. a holomorphic symplectic form) on $N$, and we denote by $\mathfrak{g}$ the
Lie algebra of holomorphic vector fields on $N$ that preserve $\kappa$. In the real case,
the corresponding Lie algebra has been interpreted as $\mathfrak{su}(\infty)$, so there is a formal
sense in which $\mathfrak{g}$ is $\mathfrak{sl}(\infty)$.$^{7}$
We take $\mathfrak{g}$ to be the Lie algebra of the gauge group, and we suppose that we
are given an $\mathfrak{h}$-invariant ASDYM field on $U$. Then the gauge potential is given
by (13.7.1), where $A, B, \tilde A, \tilde B$ are divergence-free vector fields on $N$, depending
on the space-time coordinates $w$ and $z$, and satisfying the reduced ASDYM
equations
$$B_w - A_z + [A, B] = 0, \qquad \tilde A_w - \tilde B_z + [A, \tilde A] - [B, \tilde B] = 0, \qquad [\tilde A, \tilde B] = 0.$$
We put $\Omega^2 = \kappa(\tilde A, \tilde B)$, and we assume that the two Higgs fields $\tilde A$ and $\tilde B$ are
independent, so that $\Omega$ is nonzero and depends only on $w$ and $z$. We construct
$L$ and $M$ from the coordinate null tetrad in $U$, and note that their horizontal
lifts
$$\partial_w + A - \zeta(\partial_{\tilde z} + \tilde B), \qquad \partial_z + B - \zeta(\partial_{\tilde w} + \tilde A),$$
form a Lax pair for the ASDYM equation. Since $L$ and $M$ commute and have
zero divergence,
$$W = \Omega^{-1}(\partial_w + A), \quad Z = \Omega^{-1}(\partial_z + B), \quad \tilde W = \Omega^{-1}\tilde A, \quad \tilde Z = \Omega^{-1}\tilde B$$
is a null tetrad for an ASD vacuum metric on $T \times N$, where in this case, $T$ is
the $w, z$ space.
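The link between the Lax pair and the reduced equations can be made explicit. As a sketch, write $\tilde A, \tilde B$ for the two Higgs fields and $\zeta$ for the spectral parameter, and drop $\partial_{\tilde w}$ and $\partial_{\tilde z}$ because nothing depends on the ignorable coordinates. Expanding the commutator of the two lifts in powers of $\zeta$ gives

```latex
\bigl[\partial_w + A - \zeta\tilde B,\; \partial_z + B - \zeta\tilde A\bigr]
  = \bigl(B_w - A_z + [A,B]\bigr)
  - \zeta\bigl(\tilde A_w - \tilde B_z + [A,\tilde A] - [B,\tilde B]\bigr)
  - \zeta^{2}\,[\tilde A,\tilde B] ,
```

so the commutator vanishes for every $\zeta$ exactly when the three reduced ASDYM equations hold, one for each power of $\zeta$.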
Every ASD vacuum metric arises in this way, as can be seen from the
derivation of Plebański's second equation. Starting from (13.4.11), we can take $N$ to
be the $p, q$-plane, with symplectic form $dp \wedge dq$, and
$$A = a\,\partial_q - b\,\partial_p, \quad B = c\,\partial_q - a\,\partial_p, \quad \tilde A = \partial_p, \quad \tilde B = \partial_q.$$
Thus the ASD vacuum equation is a reduction by two orthogonal null translations
of the flat-space ASDYM equation with gauge group $G$. This correspondence is
described from various points of view by Mason (1990), Park (1990), and Ward
(1990a).
Other vacuum metrics can be constructed from ASDYM fields in $\mathbb{CM}$ with
translational symmetry along $\partial_{\tilde z}$ and $\partial_{\tilde w}$. We need only to find a way of
representing the gauge group by canonical transformations of a two-dimensional
symplectic manifold. In the case of $SL(2, \mathbb{C})$, for example, we can take the
natural action on $\mathbb{C}^2$ or on $\mathbb{CP}^1 \times \mathbb{CP}^1$, both of which admit invariant holomorphic
symplectic forms. Thus solutions of the $SU(2)$-chiral model or of the sine-Gordon
equation lead to solutions of the ASD vacuum equation.
Example 13.7.5 Translation-invariant ASDYM fields. Proposition 13.4.8 can
be interpreted as an example of the switch map, by using the same idea as in the
previous example. We take $N$ to be four dimensional, with a volume element $\kappa$, and
we take the gauge group to be the group of volume-preserving transformations
of $N$. We take $A, B, \tilde A, \tilde B$ to be as in (13.4.6), and interpret (13.7.1) as the
potential for a translation-invariant ASDYM field on $\mathbb{CM}$: the components of
$\Phi$ are divergence-free vector fields on $N$, and therefore elements of the Lie algebra
of the gauge group.
Example 13.7.6 The Ashtekar-Jacobson-Smolin equations. The same
construction gives ASD vacuum metrics from solutions to the ASDYM equation in $\mathbb{CM}$ with
three translational symmetries, by taking the gauge group to be the group of
volume-preserving transformations of a three-manifold $N$, and the symmetries
to be the translations parallel to a non-null hyperplane. In this case, the
ASD condition is equivalent to
$$\dot X_1 = [X_2, X_3], \qquad \dot X_2 = [X_3, X_1], \qquad \dot X_3 = [X_1, X_2], \tag{13.7.3}$$
where the dot denotes $\partial_t$, and where the $X_i$ are divergence-free vector fields on
$N$. This connection between Nahm's equations (see §7.2) and the ASD vacuum
equation is due to Ashtekar et al. (1988).
Again the construction gives all the solutions to the ASD vacuum equation, as we
can see from the proof of Lemma 13.4.7. The rotated tetrad introduced there
satisfies
$$\Omega W(\phi) = \Omega\tilde W(\phi) = 0, \qquad \Omega Z(\phi) = \Omega\tilde Z(\phi) = 1.$$
Therefore
$$X_1 = \Omega(W + \tilde W), \qquad X_2 = i\Omega(W - \tilde W), \qquad X_3 = \Omega(Z - \tilde Z)$$
are tangent to the surfaces of constant $\phi$. If we take $t = -\tfrac{1}{2}i\phi$ to be one of the
coordinates, and choose the others so that $i\Omega(Z + \tilde Z) = \partial_t$, then the commutation
relation $[\Omega L, \Omega M] = 0$ comes down to eqns (13.7.3).
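A minimal numerical sketch of (13.7.3) in its simplest special case may help fix ideas. It assumes, purely for illustration, that $X_i = f_i(t)\,e_i$ (no sum) for a fixed basis $e_i$ with $[e_1, e_2] = e_3$ and cyclic permutations, so that (13.7.3) reduces to the Euler-top-like system $\dot f_1 = f_2 f_3$, $\dot f_2 = f_3 f_1$, $\dot f_3 = f_1 f_2$. The integrator, the step size, and the initial data are hypothetical choices, not taken from the text. The quantities $f_1^2 - f_2^2$ and $f_2^2 - f_3^2$ are constants of the motion, which gives a simple test of the integration.

```python
def euler_top_rhs(f):
    """Right-hand side of f1' = f2 f3, f2' = f3 f1, f3' = f1 f2."""
    f1, f2, f3 = f
    return (f2 * f3, f3 * f1, f1 * f2)


def rk4_step(f, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = euler_top_rhs(f)
    k2 = euler_top_rhs(shifted(f, k1, dt / 2))
    k3 = euler_top_rhs(shifted(f, k2, dt / 2))
    k4 = euler_top_rhs(shifted(f, k3, dt))
    return tuple(
        fi + dt / 6 * (a + 2 * b + 2 * c + d)
        for fi, a, b, c, d in zip(f, k1, k2, k3, k4)
    )


def integrate(f0, dt=1e-3, steps=500):
    """Integrate from t = 0 to t = dt * steps, returning the final state."""
    f = f0
    for _ in range(steps):
        f = rk4_step(f, dt)
    return f


def invariants(f):
    """The two conserved quantities f1^2 - f2^2 and f2^2 - f3^2."""
    return (f[0] ** 2 - f[1] ** 2, f[1] ** 2 - f[2] ** 2)


f_start = (1.0, 0.5, 0.25)          # hypothetical initial data
f_end = integrate(f_start)
drift = tuple(abs(a - b) for a, b in zip(invariants(f_start), invariants(f_end)))
```

The short integration time is deliberate: solutions of this system can blow up in finite time, so the sketch stays well inside the interval on which the chosen data remain bounded.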

Example 13.7.7 ASD Bianchi IX conformal structures. One class is obtained
from the preceding example by taking $X_i \in \mathfrak{so}(3)$, which gives the Euler top
equations. A more general class is associated with the Painlevé groups; which
one depends on the algebraic type of the primed spinor connection on the
hypersurfaces of constant $t$. For example, take $\mathfrak{h}$ to be the Lie algebra of the Painlevé
group $P_{VI}$, let $G = SL(2, \mathbb{C})$, and let $N$ be $G$ itself. Then $\mathcal{E}$ is the principal
bundle associated with $E$. If we transform to the coordinates $t, p, q, r$ on $U$ defined
in Table 7.4, then the flat metric on $U$ is
$$ds^2 = 2w\tilde w\,\bigl(dp\,dr - dq\,dt + (t - 1)\,dp\,dq - t\,dq\,dr\bigr),$$
and $Y_1 = \partial_p$, $Y_2 = \partial_q$, $Y_3 = \partial_r$ form a basis for $\mathfrak{h}$. We choose the basis in
$\mathfrak{g} = \mathfrak{sl}(2, \mathbb{C})$ to be

we take the $X_i$ to be the corresponding left-invariant vector fields, and we take
$T_a = T = \partial_t$. Then, in the notation of Chapter 7,
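$$\Pi = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \kappa & 0 & 0 \\ 0 & \lambda & \mu & \nu \\ 0 & \rho & \sigma & \tau \end{pmatrix} \tag{13.7.4}$$
where the entries are functions of $t$ determined by a solution to the sixth Painlevé
equation. Hence the conformal metric on $\mathcal{M}$ has components in the basis
$T, X_1, X_2, X_3$
$$\Pi^{-1}\begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & t-1 & 1 \\ -1 & t-1 & 0 & -t \\ 0 & 1 & -t & 0 \end{pmatrix}(\Pi^{-1})^{t}.$$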
By construction, the metric on $\mathcal{M}$ is invariant under $SL(2, \mathbb{C})$, and so determines
an ASD conformal structure of Bianchi type IX. It is, in fact, the general ASD
conformal metric with this type of symmetry. Given a Bianchi IX ASD conformal
metric, that is, a complex ASD conformal metric which is invariant under
a free action of $SL(2, \mathbb{C})$, one recovers the Painlevé transcendent that generates
it by constructing the corresponding isomonodromic family of ODEs from the
geometry of the curved space-time. One does this by eliminating the space-time
derivatives between the two components of $\pi^{A'}\nabla_{AA'}$, and the Lie derivatives
along the Killing vectors, acting as operators on sections of the primed spin bundle
(see Maszczyk et al. 1994). In the special case in which the Painlevé parameters
are given by $\alpha = \tfrac{1}{8}$, $\beta = -\tfrac{1}{8}$, $\gamma = \tfrac{1}{8}$, $\delta = \tfrac{3}{8}$, the metric on $\mathcal{M}$ can be diagonalized
in an invariant basis, and there exists a choice of conformal scaling for which the
metric is Einstein (Hitchin 1995, Tod 1991, 1992a, 1994).

Example 13.7.8 $P_{III}$ and scalar-flat Kähler metrics. The same construction
works for the other Painlevé equations: one needs only to modify the expression
for the conformal metric on $U$ in the coordinates $p, q, r, t$ by making the
appropriate transformation from the standard double-null form of the metric on $U$, as
in Table 7.4.
In all cases, there exists a holomorphic 6-form $\Lambda$ on $F_{\mathcal{E}}$ with values in $\mathcal{O}(4)$
which is invariant under the actions of $\mathfrak{g}$ and $\mathfrak{h}$. It is given by first rescaling to
make the metric on $U$ invariant under $\mathfrak{h}$, and then by taking the exterior product
of the 3-form $\xi$ on $F_U$ (pulled back to $F_{\mathcal{E}}$) with an invariant volume element on
$SL(2, \mathbb{C})$. By contracting $\Lambda$ with the six generating vector fields of the actions of
$\mathfrak{h}$ and $\mathfrak{g}$ on $F_{\mathcal{E}}$, we obtain a section $s^2$ of $\mathcal{O}(4)$, which descends to a section of the
line bundle $\mathcal{O}(4)$ over the twistor space $P_{\mathcal{M}}$. In the case of $P_{III}$, there is a global
square root $s \in \Gamma(P, \mathcal{O}(2))$, which determines a (complexified) scalar-flat Kähler
metric in the conformal class of $g$ by Proposition 13.5.4. It is given explicitly
as follows. We make the coordinate transformation in $U$ given by Table 7.4 and
rescale the flat metric by $\tilde w^{-2}$, so that it becomes the $\mathfrak{h}$-invariant metric
$$ds^2 = 2t^2\,dr^2 - 2\,dp\,dq - 4t\,dt\,dr.$$
We define a fibre coordinate $\zeta$ on $F_{\mathcal{E}}$ by using the flat-space null tetrad
$$W = \partial_w, \quad Z = \partial_z, \quad \tilde W = \partial_{\tilde w}, \quad \tilde Z = \partial_{\tilde z}$$
in $U$, which is a conformal null tetrad for the rescaled metric, and we identify $\mathcal{E}$
locally with $U \times SL(2, \mathbb{C})$. We then have, up to a constant factor,
1;

where $L = W - \zeta\tilde Z$, $M = Z - \zeta\tilde W$. From Table 11.1, the action of $\mathfrak{h}$ on $F_U$ is
generated by the three vector fields
ap - (tat , aq, a,_.
Consequently, we have that $s^2$ is given in the trivialization of $\mathcal{O}(4)$ determined
by the tetrad by
$$s^2 = \frac{\zeta^2(\tilde w + \zeta z)^2}{\tilde w^2}$$
up to an inconsequential constant factor. Therefore $s = \zeta(1 + \zeta z/\tilde w)$. We note
that this quadratic has discriminant equal to 1. To obtain a scalar-flat Kähler
metric with $SL(2, \mathbb{C})$ symmetry, therefore, we transfer the conformal metric to
$\mathcal{M}$, as above. The result is
$$g = \Pi^{-1}\begin{pmatrix} 0 & 0 & 0 & -2t \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ -2t & 0 & 0 & 2t^2 \end{pmatrix}(\Pi^{-1})^{t}$$
in the frame $T, X_i$ on $\mathcal{M}$, where $\Pi$ is given by (13.7.4), but now by using the
functions constructed from a solution to $P_{III}$. Special cases of this metric have
previously been found by Dancer and Strachan (1995), and Pedersen and Poon
(1990), and the general form by Tod (1995b).

NOTES ON CHAPTER 13
1. The section on the twistor constructions is not a prerequisite for most of the material
on reductions. Besse's (1987) book contains important material on the application of
some of the ideas in this chapter in Riemannian geometry.
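2. We use the spinor formalism to prove the following. If the conformal structure on
$\mathcal{M}$ is ASD, then there exists a conformal null tetrad such that $[\ell, m] = 0$. Suppose that
$W_{A'B'C'D'} = 0$, and let $\ell$, $m$ be as on p. 293. Then we have
$$[\ell, m] = \pi^{A'}\pi^{B'}\bigl(\alpha^{A}\nabla_{AA'}\beta^{B} - \beta^{A}\nabla_{AA'}\alpha^{B}\bigr)\nabla_{BB'}
= \pi^{A'}\pi^{B'}\bigl(\nabla_{BA'}k + \beta_{B}\nabla_{AA'}\alpha^{A} - \alpha_{B}\nabla_{AA'}\beta^{A}\bigr)\nabla^{B}{}_{B'},$$
where $\alpha_A\beta_B - \beta_A\alpha_B = k\varepsilon_{AB}$.
Pick two independent solutions $\eta^A$, $\sigma^A$ to the neutrino equation
$$\nabla_{AA'}\eta^{A} = 0,$$
and put $\alpha^A = h^{-1}\eta^A$, $\beta^A = h^{-1}\sigma^A$, where $h = \eta_A\sigma^A$. Then $k = h^{-1}$ and $[\ell, m] = 0$.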
3. The two foliations by $\alpha$-surfaces are polarizations, and $S$ is the generating function of
one polarization with respect to the other. See Woodhouse (1992b), and Chakravarty
et al. (1991).
4. These conditions on Tl. in any case imply that C 6,d is of type D.
5. The identity $\mathcal{O}(-4) = K$ follows from the existence of the 3-form $\xi$ on $F$ with
values in $\mathcal{O}(4)$. A section of $\mathcal{O}(-4) \to P$ is represented by a function $b$ on $F$ such that
$\ell(b) + 2xb = 0 = m(b) + 2yb$. The product $b\xi$ is a scalar-valued holomorphic form on $F$
such that $\mathcal{L}_{\ell}(b\xi) = 0 = \mathcal{L}_{m}(b\xi)$, as a consequence of eqn (13.2.6). It is also annihilated
by vectors tangent to the twistor distribution. It is therefore the pull-back of a 3-form
on $P$; that is, of a section of $K$.
6. There are other ways of expressing the conditions. In Ward and Wells (1990), r takes
values in an abstract line bundle, the restriction of which to each curve is equivalent
to the line bundle $\mathcal{O}(2)$. There is then some freedom (multiplication by a constant)
in the identification of the restriction with $T\mathbb{CP}^1$, which is, in effect, removed by the
requirement that the restriction of r should coincide with the natural 1-form.
7. This interpretation appeared in J. Hoppe's Ph. D. thesis (MIT 1982); see also (Ward
1990b).
Appendix A
Active and passive gauge transformations

Gauge transformations can be viewed in one of two ways. In §2.6, they
appeared as passive transformations: a choice of gauge was equivalent to a choice
of a local frame field $\{e_i\}$ (i.e. a local trivialization of the bundle $E$), and the
transformation
$$\Phi \mapsto g^{-1}\Phi g + g^{-1}dg \tag{A.1}$$
was seen as the change in the local representation of the connection $D = d + \Phi$
when $e_j$ is replaced by $\tilde e_j = e_i g_{ij}$. The alternative point of view is to regard
(A.1) as an active transformation. We keep the local trivialization fixed and
use (A.1) to define a new connection $\tilde D$. To say this in more geometric terms,
suppose that the base space of $E$ is some open subset $U$ of space-time and that
we are given a connection $D$ on $E$ and an automorphism $g : E \to E$ such that
$g(E_m) = E_m$ for every point $m \in U$, where $E_m$ denotes the fibre above $m$. We
can regard $g$ as a map that assigns a linear transformation $g(m) : E_m \to E_m$ to
each $m \in U$. Given $g$ and $D$, we define the new connection $\tilde D$ by
$$\tilde D s = g^{-1}D(gs),$$
where $s$ is any local section. The connection 1-forms of $D$ and $\tilde D$ are then related
by (A.1), where $g_{ij}$ is the matrix of $g$ in some local trivialization. We call $g$ an
active gauge transformation, and whenever two connections are related in this
way, we say that they are gauge equivalent. The curvatures of $D$ and $\tilde D$ are
related by $\tilde F = g^{-1}Fg$, so $\tilde D$ is self-dual whenever $D$ is.
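Both relations follow from a one-line computation. As a sketch, write $\Phi$ for the connection form of $D$, use a tilde for the transformed objects, and work in a fixed local trivialization, so that $D = d + \Phi$ acts on a column of components $s$:

```latex
\tilde D s = g^{-1}(d + \Phi)(g s)
           = g^{-1}\bigl(dg\,s + g\,ds + \Phi g\,s\bigr)
           = ds + \bigl(g^{-1}\Phi g + g^{-1}dg\bigr)s ,
\qquad
\tilde D^{2} s = g^{-1}D\bigl(g\,g^{-1}D(gs)\bigr) = g^{-1}D^{2}(gs) = g^{-1}F g\,s .
```

The first identity reproduces (A.1); the second gives the curvature relation, and conjugation by $g$ commutes with the Hodge dual on 2-forms, which is why self-duality is preserved.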
The active gauge transformations form an infinite-dimensional group that we
denote by $\mathcal{G}$. Mathematicians generally reserve the term `gauge group' for $\mathcal{G}$,
and do not, as is standard in the physics literature, allow `gauge group' as an
alternative for `structure group'.
There are two senses in which a connection $D$ can be `invariant' under the
action of a group $H$ of conformal isometries. On the one hand, for some choice of
lift, we can have $\rho_h^{*}D = D$ for every $h \in H$; on the other hand, we can have the
weaker condition that for every $h \in H$, $\rho_h^{*}D$ and $D$ should be gauge equivalent,
a condition that is independent of the choice of lift. The weaker condition does
not always imply the stronger.
Suppose that $H$ acts on $\mathbb{CM}$ and that $\rho \mapsto \rho_{*}$ is a lift of the action to $E$.
Suppose also that $D$ is invariant in the weaker sense. Choose a local trivialization
of $E$. Then, for each $\rho \in H$, there is a function $g_\rho$ with values in the structure
group such that
$$\rho^{*}\Phi = g_\rho^{-1}\Phi g_\rho + g_\rho^{-1}dg_\rho, \tag{A.2}$$
where $\Phi$ is the connection form. In general, there will be no active gauge
transformations that preserve $D$: that is, the only $g \in \mathcal{G}$ such that $Ds = g^{-1}D(gs)$
for every $s$ will be the identity. In this case, there is only one possible choice of
$g_\rho$ for each $\rho \in H$ and it necessarily holds that $g_{\rho\rho'} = g_{\rho'}g_\rho$ for every $\rho, \rho' \in H$:
we can then absorb $g_\rho$ into the definition of the lift of $H$ and have that $D$ is
invariant in the stronger sense.
But it may happen that there is a nontrivial subgroup of $\mathcal{G}$ that preserves $D$.
In this case the condition $g_{\rho\rho'} = g_{\rho'}g_\rho$ need not hold, and there may not exist a
lift such that $D$ is invariant in the strong sense.
Appendix B
The Drinfeld-Sokolov construction

Drinfeld and Sokolov (1981, 1985) introduced a general algebraic method for
generating families (or `hierarchies') of integrable partial differential equations
in two independent variables. Their construction begins with the choice of a Kac-
Moody algebra and certain other data. Here we shall look at a basic example
that can be understood from a more concrete point of view, where the resulting
equations make up the generalized KdV (nKdV) hierarchy. The ideas are closely
related to those in Gel'fand and Dikii (1976, 1977), and in §4 of Segal and Wilson
(1985); see also Wilson (1979).
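We begin by introducing the differential operator
$$L = \partial_x + A - \Lambda, \tag{B.1}$$
where
$$A = \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 \\ u_0 & u_1 & \cdots & u_{n-2} & 0 \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ \zeta & 0 & 0 & \cdots & 0 \end{pmatrix}.$$
Here $\zeta$ is a complex parameter and the $u_i$ are functions of the independent
variables $x, t_2, t_3, \ldots$. This particular choice of $L$ has its origin in the equivalence
of $Ls = 0$ to the eigenvalue equation
$$\partial_x^{n}\psi + u_{n-2}\partial_x^{n-2}\psi + u_{n-3}\partial_x^{n-3}\psi + \cdots + u_0\psi = \zeta\psi$$
by putting $s_1 = \psi$ and $s_j = \partial_x s_{j-1}$ ($j = 2, \ldots, n$).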
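The equivalence between $Ls = 0$ and the scalar eigenvalue equation can be verified symbolically for a small value of $n$. The sketch below assumes sympy and takes $n = 3$ as an illustrative case; the names `Lam`, `u0`, `u1`, `psi`, and the helper `d` are hypothetical labels chosen here. It builds $A$ with the $u_i$ in the last row and $\Lambda$ with ones above the diagonal and $\zeta$ in the bottom-left corner, then applies $\partial_x + A - \Lambda$ to the column $(\psi, \partial_x\psi, \partial_x^2\psi)$.

```python
import sympy as sp

x, zeta = sp.symbols("x zeta")
n = 3  # illustrative size; the construction in the text is for general n

u = [sp.Function(f"u{i}")(x) for i in range(n - 1)]   # u_0, ..., u_{n-2}
psi = sp.Function("psi")(x)

# A: zero except for the last row (u_0, ..., u_{n-2}, 0).
A = sp.zeros(n, n)
for i in range(n - 1):
    A[n - 1, i] = u[i]

# Lambda: ones on the superdiagonal, zeta in the bottom-left corner.
Lam = sp.zeros(n, n)
for i in range(n - 1):
    Lam[i, i + 1] = 1
Lam[n - 1, 0] = zeta


def d(expr, order):
    """Differentiate expr with respect to x the given number of times."""
    for _ in range(order):
        expr = sp.diff(expr, x)
    return expr


# s_1 = psi, s_j = d/dx s_{j-1}.
s = sp.Matrix([d(psi, j) for j in range(n)])

# Components of L s = (d/dx + A - Lambda) s.
Ls = s.diff(x) + (A - Lam) * s

# Scalar eigenvalue equation, written as (left side) - (right side).
scalar_eq = d(psi, n) + sum(u[i] * d(psi, i) for i in range(n - 1)) - zeta * psi

upper_rows_vanish = all(sp.simplify(Ls[j]) == 0 for j in range(n - 1))
last_row_gap = sp.simplify(Ls[n - 1] - scalar_eq)
lambda_power_gap = Lam**n - zeta * sp.eye(n)
```

The first $n - 1$ rows vanish identically because of the substitution $s_j = \partial_x s_{j-1}$, so the whole system collapses to the last row, which is the scalar equation. The final line also confirms $\Lambda^n = \zeta 1_n$, the identity used in the discussion of formal Laurent series.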
We shall derive integrable evolution equations for the $u_i$ by constructing
operators of the form $M_k = \partial_k - \Pi_k$, where $\partial_k = \partial/\partial t_k$ and $\Pi_k$ is a
matrix-valued polynomial in $\zeta$ depending on $x$ and $t_k$, and by imposing the condition
$$[L, M_k] = 0.$$
Formal Laurent series
The derivation is based on the manipulation of formal Laurent series in $\zeta$ with
matrix coefficients, and makes use of the fact that such series also have formal
representations as expansions in positive and negative powers of $\Lambda$ with diagonal
coefficients. To see how this works, note that $\Lambda^n = \zeta 1_n$, where $1_n$ is the identity
matrix, and that if $D_0, D_1, \ldots, D_{n-1}$ are diagonal and independent of $\zeta$, then
$$D_0 + D_1\Lambda^{-1} + \cdots + D_{n-1}\Lambda^{-n+1} = \ell + \zeta^{-1}u,$$
where $\ell$ is lower triangular and $u$ is strictly upper triangular (i.e. with zeros
on the diagonal). Any such $\ell$ and $u$ can be expressed uniquely in this way. It
follows that any formal series $\sum_{j=-\infty}^{p} B_j\zeta^{j}$, in which $B_p$ is lower triangular and
the other coefficients are arbitrary matrices, can be written uniquely in the form
$\sum_{j=-\infty}^{np} D_j\Lambda^{j}$, where the $D_j$ are diagonal. There is no implication of convergence:
the two series are equal only in the sense that the coefficients of the various
powers of $\zeta$ coincide. The fact underlying the construction is that $\sum_{j=0}^{p} B_j\zeta^{j}$ (the
polynomial part in $\zeta$) differs from $\sum_{j=0}^{np} D_j\Lambda^{j}$ (the polynomial part in $\Lambda$) by a
strictly lower-triangular matrix of degree 0 in $\zeta$.
For $D = \operatorname{diag}(d_1, \ldots, d_n)$, we denote by $D'$ the trace-free diagonal matrix
$\operatorname{diag}(d_2 - d_1, \ldots, d_n - d_{n-1}, d_1 - d_n)$. In this notation,
$$[\Lambda, D] = D'\Lambda.$$
We denote by $N$ the group of lower-triangular matrices with ones on the diagonal,
and by $\mathfrak{n}$ its Lie algebra (the strictly lower-triangular matrices).
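Two of the algebraic facts used here are easy to confirm by machine for a small matrix size. The sketch below assumes sympy, takes $n = 3$ for illustration, and uses hypothetical symbol names (`d1`, `a0_1`, and so on) for the diagonal entries. It checks the commutation rule for the shift matrix against a diagonal matrix, and checks that a sum of diagonal matrices times non-positive powers of the shift matrix splits into a lower-triangular part of degree 0 in $\zeta$ plus $\zeta^{-1}$ times a strictly upper-triangular part.

```python
import sympy as sp

zeta = sp.symbols("zeta")
n = 3  # illustrative size

# Shift matrix: ones on the superdiagonal, zeta in the bottom-left corner.
Lam = sp.zeros(n, n)
for i in range(n - 1):
    Lam[i, i + 1] = 1
Lam[n - 1, 0] = zeta

# Commutation rule [Lambda, D] = D' Lambda for diagonal D.
d = sp.symbols(f"d1:{n + 1}")
D = sp.diag(*d)
D_prime = sp.diag(*[d[(i + 1) % n] - d[i] for i in range(n)])
commutator_gap = (Lam * D - D * Lam - D_prime * Lam).expand()

# D_0 + D_1 Lambda^{-1} + ... + D_{n-1} Lambda^{-(n-1)} with generic diagonal D_j.
Lam_inv = Lam.inv()
diagonals = [sp.diag(*sp.symbols(f"a{j}_1:{n + 1}")) for j in range(n)]
S = sp.zeros(n, n)
for j in range(n):
    S += diagonals[j] * Lam_inv**j
S = S.applyfunc(sp.simplify)

# Entries on or below the diagonal should not involve zeta.
lower_ok = all(not S[i, j].has(zeta) for i in range(n) for j in range(i + 1))
# Entries above the diagonal should be zeta^{-1} times something zeta-free.
upper_ok = all(
    not sp.simplify(zeta * S[i, j]).has(zeta)
    for i in range(n)
    for j in range(i + 1, n)
)
```

Because the $n^2$ entries of the diagonal matrices fill the $n^2$ slots of the lower-triangular and strictly upper-triangular parts one for one, the same computation also illustrates why the decomposition is unique.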
Dressing transformations
We shall use a dressing transformation $L \mapsto T^{-1}LT$, where $T$ is a formal series
in inverse powers of $\Lambda$, to reduce $L$ to the standard form $\partial_x - \Lambda$. We shall
then construct $M_k$ by applying the reverse transformation to $\partial_k - \Lambda^k$ for some
$k > 1$ and discarding the negative powers of $\Lambda$. The condition $[L, M_k] = 0$ will
determine the $t$-dependence of $A$.
For the moment, we shall take $A$ to be more general than in (B.1): we shall
assume simply that it is trace free and lower triangular. Its expansion in powers
of $\Lambda$ is
$$A = \sum_{j=0}^{n-1} A_j\Lambda^{-j},$$
where the coefficients $A_j$ are diagonal and $\operatorname{tr} A_0 = 0$.
Put $T = \sum_{j=0}^{\infty} T_j\Lambda^{-j}$, where $T_0 = 1$ and the other coefficients $T_j$ are as yet
undetermined diagonal matrices depending on $x$ and $t_k$. At each fixed value of
$t_k$, we want
$$T(\partial_x - \Lambda)T^{-1} = L,$$
which is formally equivalent to
$$[\Lambda, T] = \partial_x T + AT. \tag{B.2}$$
Lemma B.1 Equation (B.2) has a formal solution $T = \sum_{j=0}^{\infty} T_j\Lambda^{-j}$, with $T_0 = 1_n$,
which is unique up to multiplication on the right by $h = 1_n + \sum_{j=1}^{\infty} h_j\Lambda^{-j}$,
where the coefficients $h_j$ are complex numbers, independent of $x$.
Proof By equating the powers of $\Lambda$ on each side, we see that (B.2) holds if for
each $j \ge 0$,
$$T_{j+1}' = \partial_x T_j + (AT)_j, \tag{B.3}$$
where $(AT)_j$ is the coefficient of $\Lambda^{-j}$ in the expansion of $AT$. Provided that
the right-hand side is trace free, (B.3) determines $T_{j+1}$ in terms of the $T_i$ with
$i \le j$, uniquely up to the addition of a multiple of the identity. The right-hand
side is certainly trace free when $j = 0$ because $\operatorname{tr}(A_0) = 0$ and $T_0 = 1_n$. We can
therefore solve iteratively for $T_j$, by using the freedom at one stage to ensure
that the right-hand side is trace free at the next. This procedure determines $T$
uniquely, up to multiplication on the right by $h$ where $\partial_x h = 0 = [\Lambda, h]$, with
the condition $T_0 = 1$ determining the leading coefficient in $h$. $\Box$
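The step "equating the powers" can be spelled out. As a sketch, write $\Lambda$ for the shift matrix, use the commutation rule $[\Lambda, D] = D'\Lambda$ for diagonal $D$, and note that $T_0' = 0$ because $T_0$ is the identity. Then, term by term,

```latex
[\Lambda, T] = \sum_{j\ge 0} [\Lambda, T_j]\,\Lambda^{-j}
             = \sum_{j\ge 0} T_j'\,\Lambda^{1-j}
             = \sum_{j\ge 0} T_{j+1}'\,\Lambda^{-j},
\qquad
\partial_x T + AT = \sum_{j\ge 0} \bigl(\partial_x T_j + (AT)_j\bigr)\Lambda^{-j} .
```

Comparing coefficients of $\Lambda^{-j}$ gives the recursion (B.3). Since the map $D \mapsto D'$ has the multiples of the identity as its kernel and the trace-free diagonal matrices as its image, the recursion can be solved for $T_{j+1}$ exactly when the right-hand side is trace free, and the solution is then fixed up to adding a multiple of the identity.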
Evolution equations
Now choose $k > 1$ and define $\Pi_k$ to be the polynomial part in $\Lambda$ of the formal
series $T\Lambda^kT^{-1}$. Then we have the following proposition.
Proposition B.2 For each $k > 1$, put $F_k = \partial_x\Pi_k + [A - \Lambda, \Pi_k]$. Then $F_k$ is
trace-free, lower triangular, and independent of $\zeta$.
Proof Clearly $F_k$ is trace free. By the definition of the dressing transformation,
$$[\partial_x + A - \Lambda, T\Lambda^kT^{-1}] = T[\partial_x - \Lambda, \Lambda^k]T^{-1} = 0.$$
Hence
$$F_k = [\partial_x + A - \Lambda, E_k],$$
where $E_k = \Pi_k - T\Lambda^kT^{-1}$. Therefore $F_k$ is a series in nonpositive powers of $\Lambda$.
But it is also a polynomial in $\zeta$, so the proposition follows.
We can therefore define the evolution of $L$ with respect to the parameter $t_k$
by $\partial_kL + F_k = 0$, which is equivalent to the commutation condition
$$[L, M_k] = 0,$$
where $M_k = \partial_k - \Pi_k$. Note that $\Pi_k$ is independent of the choice of dressing matrix,
and so the dependence of $L$ on $t_k$ is uniquely determined from its initial value.
Moreover, although the construction of $T$ involves integration at each stage of the
iteration in order to satisfy the trace condition at the next stage, the coefficients
of $\Pi_k$ can in fact be expressed as polynomials in the entries in $A$ and their first
$k - 1$ derivatives with respect to $x$, as a consequence of the following.
Lemma B.3 For any trace-free, lower-triangular matrix $A(x)$, there exists a
formal power series $\hat T = 1 + \sum_{j=1}^{\infty} \hat T_j\Lambda^{-j}$ with diagonal coefficients such that
each $\hat T_j$ is a polynomial of degree $j - 1$ in the entries in $A$ and its first $j - 1$
derivatives, and such that
$$\partial_x + A - \Lambda = \hat T\Bigl(\partial_x - \Lambda + \sum_{j=0}^{\infty} h_j\Lambda^{-j}\Bigr)\hat T^{-1},$$
where the $h_j$ are some scalar functions of $x$.
Proof The proof is the same as that of Lemma B.1, except that now we have
$$\operatorname{tr}\bigl(\partial_x\hat T_j + (A\hat T)_j - (\hat Th)_j\bigr) = 0, \qquad \hat T_{j+1}' = \bigl(\partial_x\hat T_j + (A\hat T)_j - (\hat Th)_j\bigr)_{\mathrm{tf}},$$
where `tf' denotes the trace-free part and $h = \sum h_j\Lambda^{-j}$. The first equation
determines $h_j$, given $h_0, \ldots, h_{j-1}, \hat T_1, \ldots, \hat T_j$; the second determines $\hat T_{j+1}$, provided
that we impose some supplementary algebraic condition, for example that the
first entry in $\hat T_{j+1}$ should be 0. $\Box$
If we construct $\hat T$ in this way, then $T = \hat T\sum c_j\Lambda^{-j}$, where the $c_j$ are scalars,
and consequently
$$T\Lambda^kT^{-1} = \hat T\Lambda^k\hat T^{-1}.$$
It follows that the coefficients of $\Pi_k$ are polynomials of the required form.
When $k = n$, the evolution is trivial because $\Lambda^n = \zeta 1_n$. When $k = n + 1$, the
evolution is given by $[L, M] = 0$, where
$$M = M_{n+1} - \zeta L = \partial_t - \zeta\partial_x - \Pi_{n+1} - \zeta T\,\partial_x(T^{-1}) + \zeta T\Lambda T^{-1},$$
where $t = t_{n+1}$. But the right-hand side is a polynomial in $\Lambda$, by construction,
so we must therefore have
$$M = \partial_t - \zeta\partial_x + B - \zeta C, \tag{B.4}$$
where $B$ is upper triangular and $C$ is strictly lower triangular, and both are
independent of $\zeta$. By writing $t = z$ and $x = w + \tilde w$, we obtain from $L$ and $M$ a
solution to the ASDYM equation with $H_{+0}$ symmetry. See §6.3.

Gauge transformations
The evolution equation for $L$ does not preserve the special form of the operator
(B.1). However, if we regard two operators as equivalent when they are related by a
gauge transformation
$$L \mapsto g^{-1} L g,$$
where $g(x) \in N$, then we have (i) each gauge class contains a unique operator of
the form (B.1), and (ii) the evolution preserves equivalence. The first statement
follows from a straightforward calculation, and the second by noting that if $T$
is a dressing transformation for $L$, then $g^{-1}T$ is a dressing transformation for
$g^{-1}Lg$. Taking (i) and (ii) together, the construction gives evolution equations
for the $u_i$s. If $\ell(x, t)$ is any strictly lower-triangular function of $x$ and $t$, then the
equations
$$\partial_k L + [L, \Pi_k] = 0, \qquad \partial_k L + [L, \Pi_k + \ell] = 0$$
determine the same evolution of the equivalence class of $L$. So all that is needed
to determine the evolution of the $u_i$s is to choose $\ell$ so that $L$ retains the special
form: the result is a sequence of evolution equations in which the $t_k$ derivatives
are given by differential polynomials in the $u_i$s and their $x$ derivatives.
It also follows that the evolution of the equivalence class of $L$ can be found
by solving the alternative evolution equation
$$\partial_k L + [L, R_k] = 0,$$
where, at each fixed $t$, $R_k$ is defined to be the polynomial part in $\zeta$ of $T\Lambda^k T^{-1}$
(because the polynomial parts in $\zeta$ and $\Lambda$ differ by a strictly lower-triangular
matrix of degree 0 in $\zeta$).
Special gauges
With an appropriate choice of gauge, $A$ reduces to the special form in (B.1) and
the commutation conditions $[M_k, L] = 0$ give integrable evolution equations for
the functions $u_i$.
Another possibility is to choose the gauge so that $T = 1_n + \zeta^{-1}H$, where $H$
is a power series in $\zeta^{-1}$. We can then write down $M_k$ quite simply by making
expansions in powers of $\zeta^{-1}$. For $k < n$, we have that $\Lambda^k = E_k\zeta + F_k$ is linear
in $\zeta$, and therefore
$$T\Lambda^k T^{-1} = \Lambda^k + [H_0, E_k] + O(\zeta^{-1}), \tag{B.5}$$
where $H_0$ is the coefficient of $\zeta^0$ in the expansion of $H$ in negative powers of $\zeta$.
It follows that, for $1 < k < n$,
$$M_k = \partial_k + B - [H_0, E_k] - \Lambda^k,$$
where $B$ is strictly lower triangular. With this choice of gauge, we also have
$$L = \partial_x - \Lambda - [H_0, E_1], \qquad M_{n+1} = \partial_{n+1} - \zeta\Lambda - \zeta[H_0, E_1] + O(\zeta^0).$$
Therefore,
$$M_{n+1} - \zeta L = \partial_t - \zeta\partial_x + C,$$
where $C$ is of degree 0 in $\zeta$ and $t = t_{n+1}$. There is a similar relationship between
the $M_k$ and $M_{k-n}$ for higher values of $k$.
A third possibility is to make A diagonal, in which case we obtain the modified
nKdV hierarchy. The gauge transformation between the first and third choices
is the Miura transformation (§6.3).
Commuting flows
For each $k > 1$, we have constructed an evolution equation for $A$ and an associated
`Lax pair' $L = \partial_x + A - \Lambda$, $M_k = \partial_k - \Pi_k$, where $\Pi_k$ is a polynomial of degree
$k$ in $\Lambda$ constructed from $A$ and its $x$ derivatives. The evolution is trivial when $k$
is a multiple of $n$ (since $\Lambda^k$ is then a multiple of the identity), but generally not
otherwise. We can think of the $M_k$s as a sequence of vector fields on the space
of operators $L$, modulo gauge, with the Lie brackets given by the commutators
$[M_k, M_\ell]$. Now for each $k$,
$$M_k = T(\partial_k - \Lambda^k + r_k)T^{-1},$$
where $r_k$ is a formal series in negative powers of $\Lambda$. Therefore we have
$$0 = [L, M_k] = T[\partial_x - \Lambda, \partial_k - \Lambda^k + r_k]T^{-1} = T(\partial_x r_k - [\Lambda, r_k])T^{-1}.$$
We deduce that $\partial_x r_k = 0 = [\Lambda, r_k]$ and hence that
$$[M_k, M_\ell] = T[\partial_k - \Lambda^k + r_k, \partial_\ell - \Lambda^\ell + r_\ell]T^{-1} = T(\partial_k r_\ell - \partial_\ell r_k)T^{-1}.$$
However, the left-hand side is a polynomial in $\Lambda$ and the right-hand side contains
only negative powers of $\Lambda$. Both sides must vanish, therefore. We conclude that
the flows commute.
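Example B.4 The Boussinesq equation. Take $n = 3$, $k = 2$, put $t = t_2$, and
suppose that at some $t$
$$L = \partial_x + A - \Lambda = \begin{pmatrix} \partial_x & -1 & 0 \\ 0 & \partial_x & -1 \\ v - \zeta & u & \partial_x \end{pmatrix}. \tag{B.6}$$
Then $A_0 = 0$, $A_1 = \operatorname{diag}(0, 0, u)$, $A_2 = \operatorname{diag}(0, 0, v)$. The first of the sequence of
equations (B.3) implies that $T_1 = 0$ and hence that
$$T\Lambda^2 T^{-1} = (1_3 + T_2\Lambda^{-2} + \cdots)\Lambda^2(1_3 - T_2\Lambda^{-2} + \cdots) = \Lambda^2 + B + O(\Lambda^{-1}),$$
where $B$ is trace free, diagonal, and independent of $\zeta$. Therefore $\Pi_2 = B + \Lambda^2$.
The evolution is determined by $\partial_t L + [L, \Pi_2] = 0$. To preserve the special form
of $L$ in (B.6), we replace $\Pi_2$ by $\Pi_2 + \ell$, where $\ell$ takes values in $\mathfrak{n}$. This gives a
gauge-equivalent evolution equation
$$\partial_t L = -[L, \Pi_2 + \ell] = [L, W - \Lambda^2], \tag{B.7}$$
where $W(x, t)$ is trace free and lower triangular. We put
$$W = \begin{pmatrix} a & 0 & 0 \\ d & b & 0 \\ f & e & c \end{pmatrix},$$
where $a + b + c = 0$. By equating coefficients of $\zeta$ in (B.7), we find that $c - a = u$.
The remaining terms in (B.7) give $a = -\tfrac{2}{3}u$, $b = c = \tfrac{1}{3}u$, together with
$$d = v - \tfrac{2}{3}u_x, \qquad e = v - \tfrac{1}{3}u_x, \qquad f = v_x - \tfrac{2}{3}u_{xx},$$
$$f_x + av + du - cv - v_t = 0, \qquad e_x + bu + f - cu - u_t = 0.$$
On eliminating $a, b, c, d, e, f$, we have
$$u_t = -u_{xx} + 2v_x, \qquad v_t + \tfrac{2}{3}uu_x - v_{xx} + \tfrac{2}{3}u_{xxx} = 0,$$
from which we can eliminate $v$ to obtain, finally, the Boussinesq equation
$$u_{tt} + \tfrac{1}{3}u_{xxxx} + \tfrac{4}{3}u_x^2 + \tfrac{4}{3}uu_{xx} = 0.$$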
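
The final elimination of $v$ can be checked mechanically. The sketch below assumes the two evolution equations are read as $e_1 := u_t + u_{xx} - 2v_x = 0$ and $e_2 := v_t + \tfrac{2}{3}uu_x - v_{xx} + \tfrac{2}{3}u_{xxx} = 0$ (the fractions are a reconstruction), and verifies the identity $\partial_t e_1 + 2\partial_x e_2 - \partial_x^2 e_1 = u_{tt} + \tfrac{1}{3}u_{xxxx} + \tfrac{4}{3}u_x^2 + \tfrac{4}{3}uu_{xx}$ on arbitrary polynomials in $x$ and $t$. The small polynomial helpers (`combo`, `mul`, `d`) are illustrative only and are not from the text.

```python
from fractions import Fraction

# Polynomials in (x, t) stored as {(i, j): coefficient of x**i * t**j}.

def combo(*terms):
    """Linear combination: sum of c * p over (c, p) pairs, zero terms dropped."""
    out = {}
    for c, p in terms:
        for key, val in p.items():
            out[key] = out.get(key, 0) + Fraction(c) * val
    return {key: val for key, val in out.items() if val != 0}

def mul(p, q):
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return out

def d(p, nx=0, nt=0):
    """Apply d/dx nx times and d/dt nt times."""
    for _ in range(nx):
        p = {(i - 1, j): i * c for (i, j), c in p.items() if i > 0}
    for _ in range(nt):
        p = {(i, j - 1): j * c for (i, j), c in p.items() if j > 0}
    return p

def identity_holds(u, v):
    third = Fraction(1, 3)
    # Left-hand sides of the two first-order-in-t equations.
    e1 = combo((1, d(u, nt=1)), (1, d(u, nx=2)), (-2, d(v, nx=1)))
    e2 = combo((1, d(v, nt=1)), (2 * third, mul(u, d(u, nx=1))),
               (-1, d(v, nx=2)), (2 * third, d(u, nx=3)))
    # Left-hand side of the Boussinesq equation.
    bq = combo((1, d(u, nt=2)), (third, d(u, nx=4)),
               (4 * third, mul(d(u, nx=1), d(u, nx=1))),
               (4 * third, mul(u, d(u, nx=2))))
    # Claimed elimination: bq = d_t e1 + 2 d_x e2 - d_x^2 e1, identically.
    return bq == combo((1, d(e1, nt=1)), (2, d(e2, nx=1)), (-1, d(e1, nx=2)))

u = {(3, 1): 1, (2, 2): 2, (5, 0): 1}   # arbitrary test polynomial, not a solution
v = {(4, 0): 1, (1, 3): 1, (2, 1): 3}   # arbitrary test polynomial, not a solution
print(identity_holds(u, v))
```

Because the identity holds for arbitrary $u$ and $v$, any pair satisfying $e_1 = e_2 = 0$ gives a $u$ that satisfies the Boussinesq equation in the form stated.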
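Example B.5 The KdV equation. Take $n = 2$, $k = 3$ and choose the gauge so
that
$$L = \partial_x + \begin{pmatrix} 0 & 0 \\ u & 0 \end{pmatrix} - \begin{pmatrix} 0 & 1 \\ \zeta & 0 \end{pmatrix}.$$
Then $A_0 = 0$ and $A_1 = \operatorname{diag}(0, u)$. By following through the steps in the
construction of $T$, we find that
$$T = 1_2 + \alpha\Lambda^{-1} + \begin{pmatrix} \beta & 0 \\ 0 & \gamma \end{pmatrix}\Lambda^{-2} + \cdots,$$
where $\alpha_x = -\tfrac{1}{2}u$ and $\beta - \gamma = \tfrac{1}{2}u$. From this and eqn (B.4), we find the $\zeta$
and $\zeta^2$ terms in $\Pi_3$ and deduce that, with $t = t_3$, the evolution of $u$ is given by
$[L, M] = 0$, where
$$M = \partial_t + \begin{pmatrix} a & b \\ c & -a \end{pmatrix} - \zeta\partial_x - \tfrac{1}{2}\zeta\begin{pmatrix} 0 & 0 \\ u & 0 \end{pmatrix}.$$
From the terms in $\zeta^0$ in the commutation condition $[L, M] = 0$, we have
$$a_x = c + bu, \qquad b_x = -2a, \qquad c_x - u_t = -2au,$$
while the nonzero terms in $\zeta$ give
$$b + \tfrac{1}{2}u = 0, \qquad \tfrac{1}{2}u_x - 2a = 0.$$
So we have five equations in four unknowns $(a, b, c, u)$. The construction ensures
their consistency, and on eliminating $a, b, c$, one obtains the KdV equation in the
form
$$4u_t = u_{xxx} + 6uu_x.$$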
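
The elimination takes only a few lines. The following is a sketch, assuming the five component equations are read as $a_x = c + bu$, $b_x = -2a$, $c_x - u_t = -2au$, $b + \tfrac{1}{2}u = 0$, and $\tfrac{1}{2}u_x - 2a = 0$ (the fractions are reconstructed so as to agree with the stated form of the KdV equation):

$$\begin{aligned}
b &= -\tfrac{1}{2}u, \qquad a = \tfrac{1}{4}u_x && \text{(from the terms in } \zeta\text{)},\\
b_x &= -\tfrac{1}{2}u_x = -2a && \text{(holds identically, so one equation is redundant)},\\
c &= a_x - bu = \tfrac{1}{4}u_{xx} + \tfrac{1}{2}u^2,\\
u_t &= c_x + 2au = \tfrac{1}{4}u_{xxx} + uu_x + \tfrac{1}{2}uu_x = \tfrac{1}{4}u_{xxx} + \tfrac{3}{2}uu_x.
\end{aligned}$$

Multiplying the last line by 4 gives $4u_t = u_{xxx} + 6uu_x$, and the redundancy in the second line is the consistency that the construction guarantees.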

Uniqueness of the flows

We can think of the Drinfeld-Sokolov construction as defining a sequence of
commuting vector fields on the manifold $\mathcal{M}$ of gauge classes of operators
$$\partial_x + A - \Lambda,$$
with $A$ lower triangular. The sequence is complete in the following sense: if
$t \mapsto L$ is a curve in $\mathcal{M}$ such that $\partial_t L = [\Pi, L]$, where $\Pi$ is a polynomial in $\zeta$ of
degree $m$, then at each $t$, the tangent vector to the curve is a linear combination
of the tangents to the flows up to $k = nm + n - 1$. In other words,
$$\Pi = \sum_k p_k(t)\Pi_k$$
for some $t$-dependent complex coefficients $p_k$. A new operator $M = \partial_t - \Pi$ that
commutes with the hierarchy must be a $t$-dependent combination of the existing
ones.
To prove this, choose a dressing transformation $T$ for each $t$. Then
$$LMT = MLT = -MT\Lambda.$$
But $MT$ is a formal Laurent series in $\Lambda$, with highest power not exceeding
$k = nm + n - 1$. Therefore $T + \Lambda^{-k-1}MT$ is also a dressing transformation, and
so by the uniqueness property of $T$,
$$T^{-1}MT = T^{-1}\partial_t T - T^{-1}\Pi T = \sum_{j=-\infty}^{k} p_j\Lambda^j,$$
where the $p_j$s are scalar functions of $t$. The statement follows by taking the
polynomial part in $\Lambda$.
Characterization of the hierarchy
Sometimes we shall want to think of the flows not as evolution equations for $A(x)$,
but as a sequence of simultaneous equations for $A$ as a function of $t_1 = x, t_2, \ldots$.
It will be useful to have a characterization of this system so that the equations
can be recognized when they arise in other contexts.
To emphasize the shift in point of view, we shall write $L = M_1$, $x = t_1$,
and $t = (t_1, t_2, \ldots)$. Then the operators $M_j = \partial_j - \Pi_j(t, \zeta)$ have the following
properties:
(i) $M_1 = \partial_1 + A - \Lambda$ where $A(t)$ is trace-free and lower triangular;
(ii) $\Pi_j$ is a trace-free polynomial in $\zeta$, for all $j \geq 1$;
(iii) $[M_j, M_\ell] = 0$, for all $j, \ell \geq 1$;
(iv) $\Pi_j = \sum_{i \leq j} S_{ji}\Lambda^i$, with $S_{jj} = 1$, where the $S_{ji}$s are diagonal matrices
independent of $\zeta$;
(v) $\Pi_j - \zeta\Pi_{j-n} = \sum_{i < n} B_{ji}\Lambda^i$, where the $B_{ji}$s are diagonal matrices,
independent of $\zeta$.
These characterize the $M$s uniquely, up to a linear transformation of $t$ that
adds a constant linear combination of $M_1, \ldots, M_{j-1}$ to each $M_j$. To prove
this, we choose a dressing transformation $T$ for $M_1$ (depending on $t$), and put
$D_j = T^{-1}M_jT = \partial_j + Q_j$. Then, by the proof of the uniqueness property above,
(a) $[D_j, D_\ell] = 0$, for all $j, \ell \geq 1$;
(b) $Q_j = \sum_{i=-\infty}^{j} s_{ji}\Lambda^i$, with $s_{jj} = 1$, where the $s_{ji}$s are scalars independent of $\zeta$;
(c) $Q_j - \zeta Q_{j-n} = \sum_{i < n} b_{ji}\Lambda^i$, where the $b_{ji}$s are scalars.
By picking out the coefficients of powers of $\Lambda$ in the commutation condition (a),
we see that
$$\partial_\ell s_{ji} = 0, \qquad j \geq i \geq \ell.$$
On the other hand, (c) implies that
$$s_{j+rn,\,i+rn} = s_{ji}, \qquad j, r > 0, \quad 0 \leq i \leq j.$$
By choosing $r \geq (\ell - i)/n$, we deduce that $\partial_\ell s_{ji} = 0$ for all $i \geq 0$. It follows that
the operators $\partial_j - (Q_j)_-$ commute, and hence that by making a further dressing
transformation by $\sum_{i=0}^{\infty} h_i\Lambda^{-i}$, where the $h_i$s are scalar functions of $t$ with $h_0 = 1$,
we can arrange that $(Q_j)_- = 0$ for all $j$. We then have $Q_j = \sum_{i=0}^{j} s_{ji}\Lambda^i$, where
the $s_{ji}$s are constant. By construction, $s_{jj} = 1$; and since $\operatorname{tr} Q_j = 0$, we have
$s_{j0} = 0$. This completes the proof.
It should be stressed that (i)-(v) characterize the infinite hierarchy of equations:
the corresponding conditions for the truncated sequence $M_1, \ldots, M_k$ are
less simple.
NOTES ON APPENDIX B
1. If $T = \sum_{j=0}^{\infty} T_j(x)\zeta^{-j}$ satisfies $T(\partial_1 - \Lambda)T^{-1} = L$, then $kT_0$ necessarily takes values
in $N$ for some constant scalar $k$. To prove this, put $\Lambda = \Lambda_0 + \zeta\Lambda_1$. Then by picking
out the coefficients of the lowest powers of $\zeta^{-1}$,
$$[T_0, \Lambda_1] = 0, \qquad \partial_x T_0 + AT_0 - [\Lambda_0, T_0] - [\Lambda_1, T_1] = 0.$$
The first equation implies that $(T_0)_{11} = (T_0)_{nn}$, and that the other entries in the first
row and last column all vanish. An inductive argument uses the second equation to
show that $T_0$ is lower triangular with equal entries on the diagonal. By taking the trace
of the second equation, we deduce that these entries are constant.
2. Since $A$ is trace-free, it follows from (B.2) that $\det T$ is independent of $x$ (as a formal
series in powers of $\zeta^{-1}$). By making an appropriate choice of $h$, we can ensure that
$\det T = 1$.
Appendix C
Poisson and symplectic structures

Symplectic structures
A symplectic structure $\Omega$ on a finite-dimensional manifold $M$ is a closed,
nondegenerate 2-form. That is,
(a) $d\Omega = 0$; and
(b) $X \lrcorner\, \Omega \neq 0$ for every nonzero tangent vector $X$.
In coordinates,
$$\Omega_{ab} = \Omega_{[ab]}, \qquad \partial_{[a}\Omega_{bc]} = 0, \qquad \det(\Omega_{ab}) \neq 0.$$
Since the rank of a skew-symmetric matrix is necessarily even, condition (b) can
hold only if the dimension of $M$ is even.
Poisson structures
A Poisson structure is a bracket operation (`Poisson bracket') that assigns a
function $\{f, g\}$ to every pair of functions $f$ and $g$ on $M$. It must have the
following properties. For all $f, g, h$,
(a) $\{f, g\} = -\{g, f\}$ (skew-symmetry);
(b) $\{af + bg, h\} = a\{f, h\} + b\{g, h\}$ for constant $a, b$ (bilinearity);
(c) $\{fg, h\} = f\{g, h\} + \{f, h\}g$ (Leibniz rule);
(d) $\{\{f, g\}, h\} + \{\{h, f\}, g\} + \{\{g, h\}, f\} = 0$ (Jacobi identity).
The first three properties imply the existence of a skew-symmetric contravariant
tensor field $\Pi$, with components $\Pi^{ab} = \{x^a, x^b\}$ in local coordinates $x^a$, such
that
$$\{f, g\} = \Pi^{ab}\,\partial_a f\,\partial_b g.$$
($\Pi$ is called the Poisson tensor and its components are called the structure
functions.)
Both types of structure can be either real or complex. In the real case, $\Omega$ and
the functions in the definition of the Poisson bracket are smooth; in the complex
case, they are locally holomorphic.
Relations between Poisson and symplectic structures
A symplectic structure determines a Poisson structure. Given $\Omega$, we construct
from each function $f$ a Hamiltonian vector field $X_f$, defined by
$$X_f \lrcorner\, \Omega + df = 0. \tag{C.1}$$
Then the Poisson bracket determined by $\Omega$ is given by $\{f, g\} = X_f(g)$.
Properties (a)-(c) follow from the linearity and skew-symmetry of $\Omega$, and from the
Leibniz rule for differentiation along vector fields. The fourth property of $\{\cdot,\cdot\}$
is equivalent to the closure of $\Omega$. In local coordinates, the relationship is
$$\Omega_{ab}\Pi^{bc} = -\tfrac{1}{2}\delta_a^c.$$
Such Poisson structures are special in being everywhere nondegenerate (i.e.
$\det(\Pi) \neq 0$). In the symplectic case, the only functions that have vanishing
Poisson bracket with all other functions are constant. This is not true of a
general Poisson structure.
Hamiltonian vector fields
On a general Poisson manifold, the Hamiltonian vector field $X_h$ of a function $h$
is defined by Hamilton's equation
$$X_h(g) = \{h, g\} \quad \forall g,$$
or in coordinates by $X^b = \Pi^{ab}\partial_a h$. In the symplectic case, this is equivalent to
(C.1). For any $f, g$,
$$[X_f, X_g] = X_{\{f, g\}}.$$
A given flow on $M$ is a Hamiltonian system if its generator is of the form $X_h$ for
some function $h$ (`the Hamiltonian') and some Poisson bracket $\{\cdot,\cdot\}$. The flow
$\rho_t: M \to M$ along a Hamiltonian vector field necessarily preserves the Poisson
structure. That is
$$\{f \circ \rho_t, g \circ \rho_t\} = \{f, g\} \circ \rho_t. \tag{C.2}$$
The derivative of this at $t = 0$ is the Jacobi identity.
In the symplectic case, a vector field $X$ is locally of the form $X = X_h$ for
some function $h$ if and only if
$$\mathcal{L}_X\Omega = 0; \tag{C.3}$$
or, equivalently, if and only if the flow along $X$ preserves the Poisson structure.
This is because
$$\mathcal{L}_X\Omega = d(X \lrcorner\, \Omega) + X \lrcorner\, d\Omega = d(X \lrcorner\, \Omega),$$
so (C.3) is equivalent to the closure of $X \lrcorner\, \Omega$. However, $X$ is globally of the
form $X_h$ for some $h$ only if $X \lrcorner\, \Omega$ is exact. On a general Poisson manifold, the
condition that the flow preserves $\{\cdot,\cdot\}$ is not sufficient to characterize the flow
as Hamiltonian, even locally.
Example C.1 Let $M = \mathbb{R}^{2n}$, with coordinates $p_a$, $q^a$ ($a = 1, 2, \ldots, n$). Then
$$\Omega = dp_a \wedge dq^a$$
is a symplectic structure. Every real symplectic form in $2n$ dimensions can be
brought into this form by a local coordinate transformation (Darboux's theorem).
The corresponding Poisson bracket is the classical expression
$$\{f, g\} = \frac{\partial f}{\partial p_a}\frac{\partial g}{\partial q^a} - \frac{\partial g}{\partial p_a}\frac{\partial f}{\partial q^a}. \tag{C.4}$$
We can also take the right-hand side of (C.4) as the definition of a nonsymplectic
Poisson bracket on $\mathbb{R}^{2n+k}$, with coordinates $p_a$, $q^a$, $x^i$, $i = 1, 2, \ldots, k$ ($k > 0$).
Note that in this case the flows along vector fields $\partial/\partial x^i$ preserve the Poisson
structure, but are not Hamiltonian.
If $M$ is a real manifold of dimension $2n + k$ and if $\{\cdot,\cdot\}$ is a Poisson structure
such that $\Pi$ has constant rank $2n$, then there exist local coordinates $p_a$, $q^a$,
$x^i$, $i = 1, 2, \ldots, k$ in which $\{\cdot,\cdot\}$ is given by (C.4) (Libermann and Marle 1987,
Theorem 11.5, p. 128). The submanifolds of constant $x^i$ are symplectic, with
Poisson brackets defined by restriction. That is, if $f$ and $g$ are functions on one
of these submanifolds, then $\{f, g\}$ is defined by extending $f$ and $g$ to functions
on $M$, and by restricting the Poisson bracket on $M$ to the submanifold. Because
of the degeneracy of $\Pi$, the result is independent of the choice of extensions.
The submanifolds are the leaves of the characteristic distribution of the Poisson
structure, which is the distribution spanned at each point by the Hamiltonian
vector fields of $\{\cdot,\cdot\}$. It is integrable as a consequence of the Jacobi identity. Note
that a nondegenerate Poisson structure is symplectic.
Example C.2 In an important class of examples, $M = \mathfrak{g}^*$, the dual of a Lie
algebra $\mathfrak{g}$. If $f$ and $g$ are functions on $\mathfrak{g}^*$, then $df$ and $dg$ are maps $\mathfrak{g}^* \to \mathfrak{g}$.
There is a natural Poisson bracket on $\mathfrak{g}^*$ defined by
$$\{f, g\}(\lambda) = \lambda([df, dg]), \qquad \lambda \in \mathfrak{g}^*,$$
or in terms of the structure constants, by$^1$
$$\{f, g\}(\lambda) = \lambda_c C^c_{ab}\,\partial^a f\,\partial^b g.$$
In this case, the leaves of the characteristic distribution are the coadjoint orbits
in $\mathfrak{g}^*$, which are symplectic.
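
As a concrete illustration of the structure-constant formula, the sketch below takes $\mathfrak{g} = \mathfrak{so}(3)$, for which $C^c_{ab}$ is the Levi-Civita symbol. The choice of Lie algebra is an assumption made here for illustration; it is not taken from the text. The code evaluates the bracket on polynomial functions of $\lambda = (\lambda_0, \lambda_1, \lambda_2)$ so that the Jacobi identity, and the fact that $|\lambda|^2$ is constant on the coadjoint orbits, can be checked exactly. All helper names are hypothetical.

```python
# Polynomials on so(3)^* in (l0, l1, l2), stored as {(e0, e1, e2): coefficient}.

def padd(p, q, sign=1):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + sign * c
    return {key: c for key, c in out.items() if c != 0}

def pmul(p, q):
    out = {}
    for e, a in p.items():
        for f, b in q.items():
            key = tuple(x + y for x, y in zip(e, f))
            out[key] = out.get(key, 0) + a * b
    return {key: c for key, c in out.items() if c != 0}

def pdiff(p, a):
    """Partial derivative with respect to the a-th coordinate."""
    out = {}
    for e, c in p.items():
        if e[a] > 0:
            key = tuple(x - (i == a) for i, x in enumerate(e))
            out[key] = c * e[a]
    return out

# Structure constants C^c_{ab} of so(3), keyed as (a, b, c): the Levi-Civita symbol.
EPS = {(0, 1, 2): 1, (1, 2, 0): 1, (2, 0, 1): 1,
       (1, 0, 2): -1, (2, 1, 0): -1, (0, 2, 1): -1}

def bracket(f, g):
    """{f, g}(l) = l_c C^c_{ab} d^a f d^b g."""
    out = {}
    for (a, b, c), sign in EPS.items():
        l_c = {tuple(int(i == c) for i in range(3)): 1}
        out = padd(out, pmul(l_c, pmul(pdiff(f, a), pdiff(g, b))), sign)
    return out

def jacobi(f, g, h):
    """Cyclic sum {{f,g},h} + {{h,f},g} + {{g,h},f}; empty dict means zero."""
    total = bracket(bracket(f, g), h)
    total = padd(total, bracket(bracket(h, f), g))
    return padd(total, bracket(bracket(g, h), f))
```

In this example the function $\lambda_0^2 + \lambda_1^2 + \lambda_2^2$ has vanishing bracket with every polynomial, which reflects the degeneracy of the Poisson structure: the coadjoint orbits are the spheres on which it is constant.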
Presymplectic structures and reduction
The distinction between the two types of structure is that a symplectic structure
is required to be nondegenerate, while a Poisson structure is not. Thus Poisson
structures are more general. However, it is possible to relax the nondegeneracy
condition on a symplectic structure in another direction by dropping condition
(b) in the definition. We can define the characteristic distribution $K$ of such a
degenerate 2-form by
$$K_m = \{X \in T_mM \mid X \lrcorner\, \Omega = 0\}.$$
This is integrable, as a consequence of the closure of $\Omega$. If $\Omega$ has constant rank
and if the quotient $M' = M/K$ is a manifold, then $M'$ has a symplectic structure
$\Omega'$ such that $\Omega = \pi^*(\Omega')$, where $\pi: M \to M'$ is the projection. In this case,
we call $(M, \Omega)$ a presymplectic structure and we call $(M', \Omega')$ the symplectic
reduction of $(M, \Omega)$. In the applications to differential equations, in which $M$ is
infinite dimensional, it is often more straightforward to work with a degenerate
presymplectic form rather than to carry out the reduction explicitly. For example,
in gauge theories reduction typically amounts to the removal of gauge
freedom, but it may be simpler to admit all choices of potential, and therefore
to work with an unreduced space of solutions, rather than to impose conditions
that uniquely determine a particular potential within each gauge class. Another
important illustration is provided by taking $M$ to be the space of complex $L^2$
functions on $\mathbb{R}$ and $\Omega$ to be the imaginary part of the Hermitian inner product.
Here reduction identifies functions that are equal almost everywhere.
Under suitable conditions, the leaves of the characteristic distribution of a
Poisson structure are symplectic; in the presymplectic case, it is the quotient by
the characteristic distribution that is symplectic.
In infinite dimensions, the nondegeneracy condition on a symplectic structure
can lead to technical complications (for example the same formal expression on
the solution space of a differential equation can be degenerate or nondegenerate,
depending on boundary conditions); but without it, the flow of a Hamiltonian is
not uniquely determined. The Hamiltonian theory of soliton equations generally
focuses on Poisson structures, for which this technical complication does not
arise. On the other hand, symplectic forms behave well under restriction to
submanifolds. If $M' \subset M$ is a submanifold, then $\Omega|_{M'}$ is a closed 2-form. Under
favourable conditions, it is symplectic or presymplectic. By contrast, there is no
general way to construct a Poisson bracket for functions on M' from a Poisson
structure on M.
Poisson operators
A useful class of Poisson structures can be defined by taking $M = V$, where $V$
is a vector space with an inner product $g$, interpreted as a translation-invariant
metric. We then think of $\{\cdot,\cdot\}$ as being determined by the map $L$ that assigns a
linear transformation $L_u: V \to V$ to each $u \in V$, defined by
$$L^a_b = g_{bc}\Pi^{ca}.$$
We recover $\{\cdot,\cdot\}$ from $L$ by constructing the gradient vector fields $v$ and $v'$ from
a pair of functions $f$ and $f'$ by putting $g(v, \cdot) = df$, $g(v', \cdot) = df'$, and then
by putting $\{f, f'\} = g(Lv, v')$. More simply, we define the Poisson bracket by
$\Pi^{ab} = -L^a_c g^{bc}$, where $g^{ab}g_{bc} = \delta^a_c$, but the coordinate-free formulation will be
useful in infinite dimensions. In order that $\{\cdot,\cdot\}$ should be a Poisson structure,
it is necessary that each $L$ should be skew-adjoint with respect to $g$, so that
$$g(Lv, v') = -g(Lv', v); \tag{C.5}$$
and that $\{\cdot,\cdot\}$ should satisfy the Jacobi identity. To put this latter condition in
a usable form, we define the derivative $\partial_v L$ of $L$ along a vector $v$ at $u \in V$ by
$$\partial_v L = \frac{d}{ds} L_{u+sv}\Big|_{s=0}.$$
Then the Jacobi identity is equivalent to
$$g(v, \partial_{Lv'}L\,v'') + g(v', \partial_{Lv''}L\,v) + g(v'', \partial_{Lv}L\,v') = 0 \tag{C.6}$$
for constant vectors $v, v', v''$. We shall call a map $L: V \to GL(V)$ such that (C.5)
and (C.6) hold a Poisson operator.$^2$

With the Poisson structure defined in this way from a Poisson operator, the
Hamiltonian vector field Xf of a function f is given by X f = Lv, where v is the
gradient vector field of f.
The definition of a Poisson bracket in terms of a Poisson operator involves
additional structure on M. It is necessary to identify M with a vector space V,
and to introduce an inner product g. It may be that there is not a unique natural
choice for (V, g), and it may be possible to obtain the same Poisson structure
from different Poisson operators L by making different choices for V and g. In
fact, the flow along a Hamiltonian vector field on $M$ will preserve $\{\cdot,\cdot\}$ but, in
general, will change $L$, $g$, and the linear structure of $V$. So if $\{\cdot,\cdot\}$ is determined
by a Poisson operator relative to one choice of $(V, g)$, then by pulling back $g$ and
the linear structure by the flows along different Hamiltonian vector fields, one
can represent $\{\cdot,\cdot\}$ by Poisson operators in many other ways.
Compatible Poisson structures
Two Poisson structures $\{\cdot,\cdot\}$ and $\{\cdot,\cdot\}'$ are said to be compatible whenever
$$\{\cdot,\cdot\} + \{\cdot,\cdot\}'$$
satisfies the Jacobi identity. In this case
$$\alpha\{\cdot,\cdot\} + \beta\{\cdot,\cdot\}'$$
is a Poisson structure for every constant $\alpha$ and $\beta$.
Suppose that we are given two compatible Poisson structures, with $\{\cdot,\cdot\}$
nondegenerate. Let $\Omega$ be the symplectic form of which $\{\cdot,\cdot\}$ is the Poisson
structure. Then we can define a linear map $R: T_mM \to T_mM$ in the tangent
space at each point of $M$ by
$$R(X_f) = X'_f, \tag{C.7}$$
where $f$ is any function on $M$, and $X_f$ and $X'_f$ are the Hamiltonian vector
fields generated by $f$ with respect to the two Poisson structures. Since $\{\cdot,\cdot\}$ is
nondegenerate, the vector fields $X_f$, for different choices of $f$, span $T_mM$ at each
$m$. Also as a consequence of nondegeneracy, if $X_f(m) = 0$, then $df(m) = 0$, and
therefore $X'_f(m) = 0$. It follows that $R$ is well defined. It is called the recursion
operator. In coordinates,
$$R^c_a = -2\Omega_{ab}\Pi'^{bc},$$
where $\Pi'$ is the Poisson tensor of $\{\cdot,\cdot\}'$. From this definition, and the
skew-symmetry of $\Omega$ and $\Pi'$, it follows that for any vectors $Y$ and $Z$,
$$\Omega(RY, Z) = \Omega(Y, RZ) = -\Omega(RZ, Y).$$
By using the recursion operator, we can define from $\Omega$ a sequence of 2-forms
$\Omega_i$, $i = 0, 1, 2, \ldots$, by
$$\Omega_i(Y, Z) = \Omega(R^iY, Z).$$
These are all closed. One can see this by forming the series
$$\Omega_t = \sum_{i=0}^{\infty} t^i\Omega_i.$$
In some neighbourhood of each point of $M$, this converges for sufficiently small
$t$, and the sum is given by
$$\Omega_t(Y, Z) = \Omega((1 - tR)^{-1}Y, Z),$$
which is the symplectic form of the nondegenerate Poisson structure
$$\{\cdot,\cdot\} - t\{\cdot,\cdot\}'.$$
Therefore $\Omega_t$ is closed by the compatibility condition. Since this is true for all
small $t$, it follows that each of the $\Omega_i$s is closed.
Bi-Hamiltonian systems
Let $X$ be a vector field on $M$. We say that the differential equation
$$\dot x^a = X^a$$
(where the $x^a$s are coordinates on $M$) is a bi-Hamiltonian system whenever $X$ is
Hamiltonian with respect to two compatible Poisson structures $\{\cdot,\cdot\}$ and $\{\cdot,\cdot\}'$,
at least one of which is nondegenerate.$^3$
Suppose that we are given such an $X$. Then the flow of $X$ preserves the
symplectic form $\Omega$ constructed from $\{\cdot,\cdot\}$ and the recursion operator $R$. Hence
it also preserves each of the closed forms $\Omega_i$. Consequently, for each $i$,
$$\mathcal{L}_X\Omega_i = d(X \lrcorner\, \Omega_i) = d((R^iX) \lrcorner\, \Omega) = 0.$$
Therefore the vector fields $R^iX$ are all locally Hamiltonian with respect to $\{\cdot,\cdot\}$.
It follows that there exists a sequence of functions $h_i$ (locally) such that
$$R^{i-1}X = X_{h_i},$$
where $X_{h_i}$ denotes the Hamiltonian vector field of $h_i$ with respect to $\{\cdot,\cdot\}$. Since
$\Omega = \Omega_0$, the first function of the sequence is the Hamiltonian $h = h_1$ for $X$
relative to the Poisson structure $\{\cdot,\cdot\}$. For $j, k \geq 1$,
$$\{h_j, h_k\} = -2\Omega(X_{h_j}, X_{h_k}) = -2\Omega(R^{j-1}X, R^{k-1}X) = -2\Omega_{j+k-2}(X, X) = 0.$$
Therefore the vector fields $R^iX$ all commute, and the functions $h_i$ are all
conserved by the flow along $X$.
We can add to the sequence the Hamiltonian $h_0$ of $X$ relative to the Poisson
structure $\{\cdot,\cdot\}'$. Since $X(h_0) = 0$, we have $\{h_1, h_0\} = 0$. Also
$$R(X_{h_0}) = X$$
by (C.7). So we also have
$$\{h_j, h_0\} = -2\Omega(R^{j-1}X, X_{h_0}) = -2\Omega(X, R^{j-2}X) = 0$$
for $j \geq 2$. Therefore, the entire sequence $h_i$, $i \geq 0$ is in involution (i.e. has
vanishing Poisson brackets) with respect to $\{\cdot,\cdot\}$. By (C.7), the sequence is also
in involution with respect to $\{\cdot,\cdot\}'$.
The flows of the vector fields $X_{h_i}$ commute with each other and with the given
flow. In certain circumstances, this may be enough to establish integrability (see,
for example, Olver 1986).

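Example C.3 The vector field $X = \partial/\partial q^1$ is bi-Hamiltonian with respect to
the Poisson structures $\{\cdot,\cdot\}$ and $\{\cdot,\cdot\}'$ of the two symplectic forms
$$\Omega = dp_1 \wedge dq^1 + dp_2 \wedge dq^2 + \cdots + dp_n \wedge dq^n,$$
$$\Omega' = dp_1 \wedge dq^2 + dp_2 \wedge dq^3 + \cdots + dp_n \wedge dq^1.$$
In this case, $h_0 = p_n$, $h_1 = p_1$ and
$$R\Big(\frac{\partial}{\partial q^1}\Big) = \frac{\partial}{\partial q^2}, \quad R\Big(\frac{\partial}{\partial q^2}\Big) = \frac{\partial}{\partial q^3}, \quad \ldots, \quad R\Big(\frac{\partial}{\partial q^n}\Big) = \frac{\partial}{\partial q^1}.$$
The $h_i$s are the functions $p_n, p_1, p_2, \ldots$: there are $n$ independent constants of
the motion in involution in this sequence, which is sufficient to establish the
integrability of $X$.
If, on the other hand,
$$\Omega' = dp_1 \wedge dq^2 + dp_2 \wedge dq^1 + dp_3 \wedge dq^3 + \cdots + dp_n \wedge dq^n,$$
then $R^iX$ is either $\partial/\partial q^1$ or $\partial/\partial q^2$ as $i$ is even or odd. There are only two
independent constants in the sequence of Hamiltonians, which is not sufficient
to establish integrability.
The eigenvalues of $R$ are always constants of the motion. In this example,
they are trivial (i.e. they are constant everywhere).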
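
The counting in this example can be reproduced with a small bookkeeping sketch. It assumes that the second symplectic form has the shape $\Omega' = \sum_a dp_a \wedge dq^{\sigma(a)}$ for a permutation $\sigma$, so that $p_a$ generates $\partial/\partial q^a$ under $\Omega$ and $\partial/\partial q^{\sigma(a)}$ under $\Omega'$, and hence $R$ sends $\partial/\partial q^a$ to $\partial/\partial q^{\sigma(a)}$. The function name `momenta_sequence` and the 0-based indexing are illustrative choices, not notation from the text.

```python
def momenta_sequence(sigma, start=0):
    """Indices a (0-based) of the momenta p_a generating X, RX, R^2 X, ...

    sigma[a] = b encodes a term dp_a ^ dq^b in Omega'. The walk follows
    R: d/dq^a -> d/dq^sigma[a] and stops when a momentum repeats.
    """
    seen, a = [], start
    while a not in seen:
        seen.append(a)
        a = sigma[a]
    return seen

n = 5
cyclic = [(a + 1) % n for a in range(n)]   # dp_1^dq^2 + dp_2^dq^3 + ... + dp_n^dq^1
swap = [1, 0] + list(range(2, n))          # dp_1^dq^2 + dp_2^dq^1 + dp_3^dq^3 + ...
print(len(momenta_sequence(cyclic)), len(momenta_sequence(swap)))  # prints: 5 2
```

The cyclic choice visits all $n$ momenta, giving $n$ independent constants in involution, whereas the transposition returns to $p_1$ after two steps.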
Infinite-dimensional structures
At a formal level, we use the same definitions of symplectic and Poisson struc-
tures on infinite-dimensional manifolds. There are, however, different ways to
make the conditions precise. For example, there are different infinite-dimensional
extensions of the nondegeneracy condition for a symplectic structure: if $M$ is
modelled on a Banach space, then $\Omega$ is weakly nondegenerate if the linear map it
defines at each point from $T_mM$ to $T^*_mM$ is injective, and strongly nondegenerate
if this map is an isomorphism (see Chernoff and Marsden 1974). A form of
Darboux's theorem can be proved for strongly nondegenerate symplectic forms
(Moser 1965, Weinstein 1971), but most of the structures that we shall consider
will satisfy neither condition.

Solution spaces
We shall be interested in symplectic or Poisson structures on the space $\mathcal{F}$ of
solutions to a system of differential equations on a finite-dimensional manifold
$M$. A tangent vector $v$ to $\mathcal{F}$ at a given solution $u$ is a solution to the linearization
of the system about $u$ ($u$ might be either a scalar or a vector). For example,
for the KdV equation on $\mathbb{R}^2$,
$$4u_t - u_{xxx} - 6uu_x = 0,$$
a tangent vector at $u$ is a solution to
$$4v_t - v_{xxx} - 6uv_x - 6u_xv = 0.$$
We shall not consider general Poisson structures on $\mathcal{F}$, but only those that
arise from two straightforward local constructions. These cover all the cases
of interest. In the first, we construct a symplectic or presymplectic form on
$\mathcal{F}$ by integrating some bilinear expression in two linearized solutions $v$ and $v'$
over a submanifold of $M$. In the second, we identify $\mathcal{F}$ with a vector space, for
example by taking Cauchy data at some time, and define a Poisson structure by
introducing a Poisson operator.
Example C.4 The expression
$$\Omega(v, v') = \int_{-\infty}^{\infty} (vv'_t - v'v_t)\,dx, \tag{C.8}$$
defines a closed 2-form on the solution space of the real wave equation
$$\Box u = u_{tt} - u_{xx} = 0. \tag{C.9}$$
Here the equation is linear, so $v$ and $v'$ are also solutions to (C.9). The integral
is independent of $t$, and $\Omega$ is nondegenerate provided that an appropriate choice
is made for the solution space.
Equation (C.8) also defines a 2-form on the solutions of $\Box u = p(u)$ for any
polynomial $p$, which is again closed and independent of $t$. In this case $v$ and $v'$
are solutions to the linearized equation
$$\Box v = \frac{dp}{du}\,v.$$
More generally, in any hyperbolic system generated by a Lagrangian density
depending on the fields and their derivatives, the Lagrangian determines a closed
2-form on the solution space. It is given by integration over a Cauchy surface (a
hypersurface on which Cauchy data can be given); see §3.5.
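
The time-independence of (C.8) for the linear wave equation can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: the two solutions are built from Gaussian-type travelling profiles chosen here for illustration, and the integral is approximated by the trapezoidal rule on a large finite interval. None of the names or profiles come from the text.

```python
import math

# Travelling-wave profiles and their derivatives (illustrative, rapidly decaying).
def F(s):  return math.exp(-s * s)
def dF(s): return -2 * s * math.exp(-s * s)
def G(s):  return math.exp(-s * s / 2)
def dG(s): return -s * math.exp(-s * s / 2)
def H(s):  return s * math.exp(-s * s / 2)
def dH(s): return (1 - s * s) * math.exp(-s * s / 2)

def omega(t, half_width=30.0, steps=6000):
    """Trapezoidal approximation to the integral of v w_t - w v_t over x,
    where v and w (playing the role of v') both solve u_tt = u_xx."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        v = F(x - t) + G(x + t)        # right-mover plus left-mover
        v_t = -dF(x - t) + dG(x + t)
        w = H(x - t)                   # right-mover
        w_t = -dH(x - t)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (v * w_t - w * v_t) * h
    return total
```

Evaluating `omega` at two different times gives values that agree to within rounding error, while the common value is nonzero, so the check is not vacuous.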
Example C.5 Maxwell's equations. In the absence of sources, Maxwell's equations
can be expressed in terms of the electric field $E$ and the magnetic potential
$A$ in the form
$$\operatorname{div} E = 0, \qquad E_t = \operatorname{curl}\operatorname{curl} A, \qquad \operatorname{curl}(A_t + E) = 0.$$
The integral
$$\int (A \cdot E' - A' \cdot E)\,dx\,dy\,dz \tag{C.10}$$
is bilinear and skew-symmetric in a pair of solutions $E, A$ and $E', A'$. Under
suitable boundary conditions, it is independent of $t$ and determines a presymplectic
structure on the space of solutions. Reduction identifies gauge-equivalent
magnetic potentials: the integral vanishes for every choice of $E', A'$ such that
$\operatorname{div} E' = 0$ if and only if $E = 0$ and $A = \operatorname{grad} f$ for some $f$. The corresponding
symplectic structure is well defined on the solution space of Maxwell's equations.
We can define in the same way a symplectic structure on the solution space
of the Yang-Mills equations in real Minkowski space. Here we replace (C.10) by
$$\int \operatorname{tr}(\delta A \cdot \delta E' - \delta A' \cdot \delta E)\,dx\,dy\,dz,$$
where $\delta E$, $\delta A$ and $\delta E'$, $\delta A'$ are solutions to the linearization of the Yang-Mills
equations, written in the form (2.6.3), p. 30. This is also time-independent and
closed. It determines a presymplectic structure on the solution manifold, with
reduction identifying gauge-equivalent pairs $(E, A)$. It vanishes identically on
restriction to ASD fields.
Example C.6 KdV equation. The KdV equation
$$4u_t = 6uu_x + u_{xxx}$$
is a bi-Hamiltonian system (Magri 1978). We consider a suitable space $V$ of
square-integrable real functions of $x$, with the $L^2$ inner product
$$g(v, v') = \int_{-\infty}^{\infty} vv'\,dx.$$
On this space,
$$L_u = \partial_x \quad\text{and}\quad M_u = \tfrac{1}{4}\partial_x^3 + u\partial_x + \tfrac{1}{2}u_x$$
are Poisson operators. They are both skew adjoint for any $u$. The first satisfies
the Jacobi identity because it is constant. For the second, we have
$$\partial_v M_u = v\partial_x + \tfrac{1}{2}v_x.$$
Therefore
$$g(v, \partial_{Mv'}M\,v'') = \int_{-\infty}^{\infty} \tfrac{1}{4}v\Big((v'_{xxx} + 4uv'_x + 2u_xv')v''_x + \tfrac{1}{2}(v'_{xxxx} + 6u_xv'_x + 4uv'_{xx} + 2u_{xx}v')v''\Big)\,dx,$$
which, by integration by parts, vanishes on skew-symmetrizing over $v$, $v'$ and
$v''$. Therefore $M$ also satisfies the Jacobi identity.$^4$ The same is true for any
constant linear combination of $L$ and $M$, so the two Poisson structures $\{\cdot,\cdot\}$ and
$\{\cdot,\cdot\}'$ determined by $L$ and $M$, respectively, are compatible.
The KdV equation
$$4u_t - u_{xxx} - 6uu_x = 0,$$
can be written in either of the forms
$$\partial_t u = L_u(w) \quad\text{or}\quad \partial_t u = M_u(w'),$$
where $w$ and $w'$ are the two vector fields on $V$,
$$w = \tfrac{3}{4}u^2 + \tfrac{1}{4}u_{xx} \quad\text{and}\quad w' = u,$$
which are, respectively, the gradients (with respect to $g$) of the two functions
$$h_2 = \tfrac{1}{8}\int_{-\infty}^{\infty} (2u^3 - u_x^2)\,dx \quad\text{and}\quad h_1 = \tfrac{1}{2}\int_{-\infty}^{\infty} u^2\,dx.$$
Therefore the flow is Hamiltonian with respect to both Poisson structures. In
particular, this implies that both Poisson structures are conserved (although
their representation in terms of Poisson operators changes with $t$). Both are
degenerate. However, the integral
$$\int_{-\infty}^{\infty} u\,dx$$
is constant along the flow. If we take $\mathcal{F}$ to be the space of solutions such that it
takes a given value, then $\{\cdot,\cdot\}$ becomes nondegenerate on $\mathcal{F}$. The tangents to $\mathcal{F}$
now satisfy
$$\int_{-\infty}^{\infty} v\,dx = 0. \tag{C.11}$$
On this space, we can invert $\{\cdot,\cdot\}$ to obtain the symplectic form
$$\Omega(v, v') = \int_{-\infty}^{\infty} (\pi_x\pi' - \pi'_x\pi)\,dx, \tag{C.12}$$
where $\pi_x = \tfrac{1}{2}v$, $\pi'_x = \tfrac{1}{2}v'$, with the constants fixed by the condition that $\pi$ and
$\pi'$ should vanish at infinity. The corresponding recursion operator is
$$R = ML^{-1} = (\tfrac{1}{4}\partial_x^3 + u\partial_x + \tfrac{1}{2}u_x)\partial_x^{-1} = \tfrac{1}{4}\partial_x^2 + u + \tfrac{1}{2}u_x\partial_x^{-1}.$$
The constant in the integral $\partial_x^{-1}$ is determined by (C.11). By iterating $R$ on $v = u_x$,
we obtain the generators of a sequence of commuting flows (see Chapter 8).
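
The statement that the KdV flow can be written with either Poisson operator is a pointwise algebraic identity, and so can be checked on polynomials in $x$. The sketch below assumes the operators are $L_u = \partial_x$ and $M_u = \tfrac{1}{4}\partial_x^3 + u\partial_x + \tfrac{1}{2}u_x$, with $w = \tfrac{3}{4}u^2 + \tfrac{1}{4}u_{xx}$ and $w' = u$ (the numerical coefficients are a reconstruction consistent with the recursion operator quoted above). Polynomials are not square-integrable, so this tests only the local identity and not the functional-analytic setting; the helper names are illustrative.

```python
from fractions import Fraction

# Polynomials in x stored as {degree: coefficient}.

def combo(*terms):
    """Linear combination: sum of c * p over (c, p) pairs, zero terms dropped."""
    out = {}
    for c, p in terms:
        for k, val in p.items():
            out[k] = out.get(k, 0) + Fraction(c) * val
    return {k: val for k, val in out.items() if val != 0}

def mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out

def dx(p, times=1):
    for _ in range(times):
        p = {k - 1: k * c for k, c in p.items() if k > 0}
    return p

def first_form(u):
    """L_u(w) = d/dx (3/4 u^2 + 1/4 u_xx)."""
    w = combo((Fraction(3, 4), mul(u, u)), (Fraction(1, 4), dx(u, 2)))
    return combo((1, dx(w)))

def second_form(u, w):
    """M_u(w) = 1/4 w_xxx + u w_x + 1/2 u_x w."""
    return combo((Fraction(1, 4), dx(w, 3)), (1, mul(u, dx(w))),
                 (Fraction(1, 2), mul(dx(u), w)))

def kdv_rhs(u):
    """u_t from 4 u_t = u_xxx + 6 u u_x."""
    return combo((Fraction(1, 4), dx(u, 3)), (Fraction(3, 2), mul(u, dx(u))))
```

For any polynomial `u`, the three expressions `first_form(u)`, `second_form(u, u)`, and `kdv_rhs(u)` coincide. Since $L^{-1}u_x = u$ up to the constant fixed by (C.11), the same computation shows that $R$ applied to $u_x$ reproduces the KdV flow.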
Example C.7 NLS equation. The space of solutions to the NLS equation
+,)2,;
20x.

has a conserved symplectic structure given by

00
(X, X') = 2i f 00
(XX' - X'X) dx,

where X and X' are solutions to the linearized equation. The flow is generated
by the Hamiltonian

h('+G) = 2 f 00

00
(0x 0x +'+G2V2)dx.

Example C.8 ASDYM equation on a Kähler manifold. Let $M$ be a compact
Kähler manifold of four real dimensions and let $E \to M$ be a $U(n)$ bundle.
Let $\omega$ be the Kähler 2-form. A connection on $E$ is ASD if its curvature $F$ is a
(1,1)-form such that
$$F \wedge \omega = 0.$$
Let $\mathcal{F}$ denote the set of all such connections, modulo gauge equivalence. A
tangent vector to $\mathcal{F}$ at an ASD connection $D$ is a 1-form $\Psi$ with values in adj$(E)$
such that $D\Psi$ is ASD. Since two gauge-equivalent potentials determine the same
point of $\mathcal{F}$, $\Psi$ and $\Psi + Df$ represent the same tangent for any section $f$ of adj$(E)$.
The 2-form
$$\Omega(\Psi, \Psi') = \int_M \operatorname{tr}(\Psi \wedge \Psi') \wedge \omega \tag{C.13}$$
is well defined on the tangent space to $\mathcal{F}$ since
$$\Omega(Df, \Psi') = \int \operatorname{tr}(Df \wedge \Psi') \wedge \omega = \int d(\operatorname{tr}(f\Psi')) \wedge \omega = 0$$
whenever $D\Psi' \wedge \omega = 0$. It is, in fact, a symplectic structure on $\mathcal{F}$. This is proved
by first noting that (C.13) determines a symplectic form on $\mathcal{A}^{1,1}$, the space of
connections on $E$ such that the (2, 0) and (0, 2) parts of the curvature vanish.
Then one shows that $\mathcal{F}$ is the Marsden-Weinstein quotient of $\mathcal{A}^{1,1}$ by the action
of the group $G$ of active gauge transformations. That is, $\mathcal{F}$ is the zero-set of
the moment map of $G$, quotiented by gauge equivalence. See Donaldson (1985),
Donaldson and Kronheimer (1990), pp. 251-2.

NOTES ON APPENDIX C
1. Elements of $\mathfrak{g}$ have components $X^a$ with `contravariant' indices and elements of $\mathfrak{g}^*$
have components $\lambda_a$ with `covariant' indices. For a function $f$ on $\mathfrak{g}^*$, the gradient
$\partial^a f = \partial f/\partial\lambda_a$ has an upper index.
2. More properly, L is a section of End(TM), but we are identifying V both with M
and with the tangent space to M at each point. Magri (1978) uses the term `symplectic
operator', while Fuchssteiner and Fokas (1981) use `implectic operator'.
3. Nondegeneracy is not usually imposed as part of the definition of a bi-Hamiltonian
system, but it is necessary to prove useful results (such as Theorem 7.24 in Olver 1986).
If one of the two Poisson structures is nondegenerate, then almost all constant linear
combinations of the two will also be nondegenerate, so very little would be lost by
requiring both to be nondegenerate, but this is not always convenient.
4. The second structure can be understood as the natural Kirillov-Kostant Poisson
structure on the dual of the Lie algebra of a central extension of either Diff($S^1$) or
LSL(2, $\mathbb{R}$) (Segal 1991).
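The genericity claim in note 3 can be illustrated at a single point of a finite-dimensional phase space. The sketch below is only a pointwise, finite-dimensional argument; the symbols $\pi_1$, $\pi_2$, $\mu$, $\lambda$, and $p$ are introduced here for illustration and are not the notation of the main text.

```latex
% Assumption: \pi_1 and \pi_2 are the two Poisson tensors evaluated at one point,
% regarded as skew-symmetric n x n matrices, with \pi_1 invertible.
\[
p(\mu, \lambda) = \det(\mu\,\pi_1 + \lambda\,\pi_2)
\]
% p is a homogeneous polynomial of degree n in (\mu, \lambda), and
\[
p(1, 0) = \det \pi_1 \neq 0 ,
\]
% so p is not identically zero. A nonzero homogeneous polynomial in two
% variables vanishes on at most n lines through the origin, hence
% \mu\,\pi_1 + \lambda\,\pi_2 is nondegenerate for every (\mu, \lambda)
% off those finitely many lines.
```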
Appendix D
Reductions of the ASDYM equation

In this appendix, we summarize the principal reductions of the ASDYM equation
that we derive in the main body of the text, where the references to the original
sources can be found. We have written the reduced equations in the list below
in a uniform notation, which in some cases differs slightly, but in an inessential
way, from that used in the text. The principal change is that we here use t, x, and
y to denote the independent variables in the reductions to one, two, and three
dimensions.
Free reductions
With an appropriate choice of gauge, many of the reductions of the ASDYM
equation by freely-acting subgroups of the conformal group can be expressed
entirely in terms of the Higgs fields, by choosing a transversal to the orbits in
complex space-time on which the connection is flat and its potential vanishes
(we called this a Higgs gauge in Chapter 6). In the following, we list
(a) the symmetry group H as a subgroup of GL(4, C) with parameters a, b ...
(since we include the diagonal elements, which act trivially on space-time,
H has dimension 3 for a reduction to two dimensions, and dimension 4 for
a reduction to one dimension);
(b) the equations in terms of the Higgs fields P, Q, and, in the reductions to one
dimension, R; these are functions of the nonignorable coordinates (denoted
by x, t in the two-dimensional case, and by t in the one-dimensional case);
(c) the integrable systems that arise for particular choices of gauge group, and,
possibly, particular reality conditions, together with a reference to the section
in which they are treated in detail.
In many cases, it is necessary to make a gauge transformation to obtain the
equations explicitly.
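As an illustration of item (a), the first symmetry group listed under the free reductions in Higgs gauge has entries c on the diagonal, a in position (1,4), and b in position (2,3). The sketch below checks that such matrices close under multiplication and commute, which is consistent with the dimension count of 3 for a reduction to two dimensions. The notation $E_{ij}$ for the matrix unit and $h(a,b,c)$ for a group element is introduced here for illustration only.

```latex
% E_{ij} is the 4 x 4 matrix with 1 in position (i,j) and 0 elsewhere, so that
%   E_{ij} E_{kl} = \delta_{jk} E_{il}.
% In particular every product of E_{14} and E_{23} (in either order,
% including squares) vanishes.
\[
h(a,b,c) = c\,\mathbf{1} + a\,E_{14} + b\,E_{23}, \qquad c \neq 0 ,
\]
\[
h(a,b,c)\,h(a',b',c')
  = cc'\,\mathbf{1} + (ca' + ac')\,E_{14} + (cb' + bc')\,E_{23} .
\]
% The right-hand side is symmetric under (a,b,c) <-> (a',b',c'), so the group
% is abelian with three parameters. The scalar matrices h(0,0,c) are the
% diagonal elements that act trivially on space-time.
```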
Free reductions in Higgs gauge

Symmetry group H   Equations                    Integrable systems; section

c 0 0 a                                         The self-duality equations on
0 c b 0            P_t = [Q, P]                 a Riemann surface; harmonic
0 0 c 0            Q_x = [P, Q]                 maps and the chiral equation;
0 0 0 c                                         §6.2

c 0 b a                                         The KdV and NLS
0 c -a 0           Q_t + P_x = 0                equations; the modified KdV
0 0 c 0            Q_x = [P, Q]                 equation; Heisenberg
0 0 0 c                                         ferromagnet equation; §6.3

c 0 a 0
0 c b 0            [P, Q] = 0                   The topological chiral model;
0 0 c 0            P_t - Q_t = 0                the Boussinesq equation; §6.4
0 0 0 c

a 0 b 0                                         The Ernst equation;
0 c 0 b            P_x + tQ_t = 2[P, Q]         stationary axisymmetric
0 0 a 0            P_t - tQ_x = 0               Einstein-Maxwell equations;
0 0 0 c                                         §6.6

d 0 b a            P_t = [Q, R]
0 d -a c           Q_t = [P, Q]                 Nahm's equation; §7.2
0 0 d 0            R_t = 0
0 0 0 d

d c b a            P_t = 0
0 d c b            Q_t = [R, P]                 Painlevé I/II; §7.4
0 0 d c            R_t = [tP + R, Q]
0 0 0 d

d b 0 0            P_t = 0
0 d 0 0            tQ_t = 2[Q, R]               Painlevé III; §7.4
0 0 c a            R_t = 2t[Q, P]
0 0 0 c

c b a 0            P_t = 0
0 c b 0            Q_t = [P, R + tQ]            Painlevé IV; §7.4
0 0 c 0            R_t = [Q, R]
0 0 0 d

b a 0 0            P_t = 0
0 b 0 0            Q_t = [P, R]                 Painlevé V; §7.4
0 0 c 0            tR_t = [R, tP + Q]
0 0 0 d

d 0 0 0            P_t = 0
0 c 0 0            tQ_t = [R, Q]                Painlevé VI; §7.4
0 0 b 0            t(1-t)R_t = [tP + Q, R]
0 0 0 a

Other free reductions

Free reductions that do not appear in the list above include the three-dimensional
cases. Here the equations can be expressed in terms of the components $\Phi_w = -P$,
$Q$ of the potential, which are functions of three variables x, y, t, as in
the following list, in which the symmetry group is shown as a two-dimensional
subgroup of GL(4, C).
Symmetry group H   Equations                    Integrable systems; section

b 0 0 a
0 b -a 0           Q_x + P_y = [P, Q]           The Bogomolny equation; §5.1
0 0 b 0            Q_t + P_x = 0
0 0 0 b

a 0 0 0
0 b 0 0            xQ_y + P_y = 2[P, Q]         Hyperbolic monopoles; §5.2
0 0 a 0            xQ_t + P_x = 0
0 0 0 b

b 0 a 0
0 b 0 0            Q_x = [P, Q]                 Zakharov's system; §5.3
0 0 b 0            Q_t + P_y = 0
0 0 0 b

Discrete symmetries

Continuous symmetries   Discrete symmetry       Integrable systems; section

c 0 0 a                 ω 0    0 0
0 c b 0                 0 ω^-1 0 0              Extended Toda field
0 0 c 0                 0 0    ω 0              equation; §6.2
0 0 0 c                 0 0    0 ω^-1

c 0 0 a                 -1 0 0 0
0 c b 0                 0  1 0 0                Harmonic maps into
0 0 c 0                 0  0 0 1                Riemannian symmetric spaces;
0 0 0 c                 0  0 1 0                §6.2

c 0 0 a                 1 0 0 0
0 c b 0                 0 1 0 0                 n-wave equation; §6.4
0 0 c 0                 0 0 1 0
0 0 0 c                 0 0 0 -1

d 0 b a                 1 0  0  0
0 d -a c                0 -1 0  0               Lagrange and Kovalevskaya
0 0 d 0                 0 0  -1 0               tops; §7.2
0 0 0 d                 0 0  0  1

d 0 b a                 1 0 0 0                 Euler top and
0 d 0 c                 0 1 0 0                 Euler-Arnold-Manakov
0 0 d 0                 0 0 1 0                 equation; §7.3
0 0 0 d                 0 0 0 -1
Constrained reductions
Symmetry group H   Integrable systems; section

a b 0 0
c d 0 0            Liouville's equation; §6.8
0 0 a b
0 0 c d

c 0 0 a
0 d b 0            Toda field equation; §6.2
0 0 c 0
0 0 0 d
References

Ablowitz, M. J. and Clarkson, P. A. (1991). Solitons, nonlinear evolution equations
and inverse scattering. London Mathematical Society Lecture Notes
in Mathematics, 149, Cambridge University Press, Cambridge.
Ablowitz, M. J., Chakravarty, S. and Takhtajan, L. J. (1993). A self-dual Yang-
Mills hierarchy and its reduction to integrable systems in 1 + 1 and 2 + 1
dimensions. Commun. Math. Phys., 158, 289-314.
Adler, M. and Van Moerbeke, P. (1980). Completely integrable systems, Euclidean
Lie algebras and curves, and linearization of Hamiltonian systems,
Jacobi varieties and representation theory. Advances in Math., 38,
267-317 and 318-79.
Arnold, V. I. (1984). Mathematical methods of classical mechanics. Graduate
Texts in Mathematics, 60. Springer, Berlin.
Ashtekar, A., Jacobson, T., and Smolin, L. (1988). A new characterization of
half-flat solutions to Einstein's equation. Commun. Math. Phys., 115, 631-48.
Atiyah, M. F. (1979). Geometry of Yang-Mills fields. Lezioni Fermiane. Accademia
Nazionale dei Lincei and Scuola Normale Superiore, Pisa.
Atiyah, M. F. (1987). Magnetic monopoles in hyperbolic spaces. In Vector
bundles on algebraic varieties. Ed. M.F. Atiyah, Oxford University Press,
Oxford.
Atiyah, M. F. and Hitchin, N. J. (1985). Low energy scattering of non-abelian
monopoles. Phys. Lett., 107A, 21-5.
Atiyah, M. F. and Hitchin, N. J. (1988). The geometry and dynamics of magnetic
monopoles. Princeton University Press, Princeton.
Atiyah, M. F., Hitchin, N. J. and Singer, I. M. (1978a). Self-duality in four-
dimensional Riemannian geometry. Proc. Roy. Soc. Lond., A 362, 425-61.
Atiyah, M. F., Hitchin, N. J., Drinfeld, V. G. and Manin, Yu. I. (1978b). Construction
of instantons. Phys. Lett., A65, 185-7.
Atiyah, M. F. and Ward, R. S. (1977). Instantons and algebraic geometry.
Commun. Math. Phys., 55, 111-24.
Bailey, T. N. and Eastwood, M. G. (1991). Complex paraconformal manifolds -
their differential geometry and twistor theory. Forum Math., 1, 61-103.
Balser, W., Jurkat, W. B., and Lutz, D. A. (1979). Birkhoff invariants and
Stokes' multipliers for meromorphic linear differential equations. J. Math.
Anal. Appl., 71, 48-94.
Bateman, H. (1910). Partial differential equations of mathematical physics.
Dover, New York.
Beilinson, A. (1978). The derived category of coherent sheaves on P^N. Selecta
Math. Soviet., 3, 233-7.
Belavin, A. A., Polyakov, A. M., Schwartz, A. S., and Tyupkin, Yu. S. (1975).
Pseudoparticle solution of the Yang-Mills equations. Phys. Lett., B59, 85-7.
Berry, M. V. and Mount, K. E. (1972). Semiclassical approximations in wave
mechanics. Phys. Rep., 35, 315-97.
Besse, A. L. (1987). Einstein manifolds. Springer, Berlin.
Bobenko, A. I., Reyman, A. G., and Semenov-Tian-Shansky, M. A. (1989). The
Kowalewski top 99 years later: a Lax pair, generalizations and explicit so-
lutions. Commun. Math. Phys., 122, 321-54.
Boyer, C. P. and Finley, J. D., III (1982). Killing vectors in self-dual, Euclidean
Einstein spaces. J. Math. Phys., 23, 1126-30.
Buchdahl, N. P. (1987). Stable 2-bundles on Hirzebruch surfaces. Math. Zeit-
schrift, 194, 143-52.
Burstall, F. E. and Rawnsley, J. H. (1990). Twistor theory for Riemannian
symmetric spaces. Lecture Notes in Mathematics, 1424. Springer, Berlin.
Carey, A. L., Hannabuss, K. C., Mason, L. J. and Singer, M. A. (1993). The
Landau-Lifschitz equation, elliptic curves and the Ward transform. Commun.
Math. Phys., 154, 25-47.
Chakravarty, S. and Ablowitz, M. J. (1992). On reductions of self-dual Yang-Mills
equations. In Painlevé transcendents, their asymptotics and physical applications.
Eds. D. Levi and P. Winternitz. NATO ASI series B278. Plenum,
New York.
Chakravarty, S., Ablowitz, M. J., and Clarkson, P. A. (1992). One dimensional
reductions of self-dual Yang-Mills fields and classical equations. In Recent
advances in general relativity: essays in honor of Ted Newman. Eds. A. I.
Janis and J. R. Porter. Birkhäuser, Boston.
Chakravarty, S., Mason, L. J., and Newman, E. T. (1991). Canonical structures
on anti-self-dual four-manifolds and the diffeomorphism group. J. Math.
Phys., 32, 1458-64.
Chandrasekhar, S. (1986). Cylindrical gravitational waves. Proc. Roy. Soc.
Lond., A408, 209-32.
Chau, L-L., Ge, M-L., and Wu, Y-S. (1982). Kac-Moody algebra in the self-dual
Yang-Mills equation. Phys. Rev., 25D, 1086-94.
Chernoff, P. R. and Marsden, J. E. (1974). Properties of infinite dimensional
Hamiltonian systems. Lecture Notes in Mathematics, 425. Springer, Berlin.
Corrigan, E. (1986). Monopoles and reciprocity. In: Field theory, quantum
gravity and strings. Eds. H. J. de Vega and N. Sanchez. Lecture Notes in
Physics, 246. Springer, Berlin.
Cosgrove, C. M. (1977). New family of exact stationary axisymmetric gravi-
tational fields generalising the Tomimatsu-Sato solutions. J. Phys., A10,
1481-524.
Dancer, A. S. and Strachan, I. A. B. (1995). Cohomogeneity-one Kähler metrics.
In Twistor theory. Ed. S. Huggett. Lecture Notes in Pure and Applied
Mathematics, 169. Marcel Dekker.
Date, E., Jimbo, M., Kashiwara, M., and Miwa, T. (1983). Transformation
groups for soliton equations as dynamical systems on infinite-dimensional
Grassmann manifolds. In Proceedings of RIMS Symposium on non-linear
integrable systems-classical theory and quantum theory, Kyoto Japan, May
1981. Eds. M. Jimbo and T. Miwa. World Scientific, Singapore.
Deift, P. and Trubowitz, E. (1979). Inverse scattering on the line. Commun.
Pure Appl. Math., 32, 121-51.
Dirac, P. A. M. (1931). Quantised singularities in the electromagnetic field.
Proc. Roy. Soc. Lond., A133, 60-72.
Donaldson, S. K. (1984). Nahm's equations and the classification of monopoles.
Commun. Math. Phys., 96, 387-407.
Donaldson, S. K. (1985). Anti-self-dual Yang-Mills connections on complex
algebraic varieties and stable vector bundles. Proc. Lond. Math. Soc., 3,
1-26.
Donaldson, S. K. and Kronheimer, P. B. (1990). The geometry of four-manifolds.
Oxford University Press, Oxford.
Drazin, P. G. (1983). Solitons. London Mathematical Society Lecture Notes in
Mathematics, 85. Cambridge University Press, Cambridge.
Drinfeld, V. G. and Manin, Ju. I. (1978). Locally free sheaves on CP^3 associated
to Yang-Mills fields. Uspekhi Mat. Nauk., 33, 165-6.
Drinfeld, V. G. and Sokolov, V. V. (1981). Equations of Korteweg-de Vries type
and simple Lie algebras. Soviet Math. Dokl., 23, 3, 457-62.
Drinfeld, V. G. and Sokolov, V. V. (1985). Lie algebras and equations of
Korteweg-de Vries type. Jour. Sov. Math., 30, 1975-2036.
Dubrovin, B. A. (1981). Theta functions and non-linear equations. Russian
Math. Surveys, 36, 11-92.
Eastwood, M. G. (1982). The Penrose transform without cohomology. Twistor
newsletter, 14, 28. Reprinted in: Further advances in twistor theory, Vol. 1:
The Penrose transform and applications. Eds. L. J. Mason, L. P. Hughston.
Pitman Research Notes in Mathematics 231, Longman, Harlow, 1990.
Ehlers, J., Pirani, F. A. E., and Schild, A. E. (1972). The geometry of
free-fall and light propagation. In General relativity (papers in honour of J. L.
Synge). Oxford University Press, Oxford.
Faddeev, L. D. and Takhtajan, L. A. (1987). Hamiltonian methods in the theory
of solitons. Springer, Berlin.
Field, M. (1982). Several complex variables and complex manifolds, Vols I and
II. London Mathematical Society Lecture Notes in Mathematics, 65 and
66. Cambridge University Press, Cambridge.
Fletcher, J. and Woodhouse, N. M. J. (1990). Twistor characterization of sta-
tionary axisymmetric solutions of Einstein's equations. In Twistors in math-
ematics and physics. Eds. T. N. Bailey and R. J. Baston. London Mathe-
matical Society Lecture Notes in Mathematics, 156. Cambridge University
Press, Cambridge.
Fokas, A. S., and Ablowitz, M. J. (1982). On a unified approach to the
transformations and elementary solutions of Painlevé equations. J. Math. Phys.,
23, 2033-42.
Forgács, P. and Manton, N. S. (1980). Space-time symmetries in gauge theories.
Commun. Math. Phys., 72, 15-35.
Forgács, P., Horváth, Z., and Palla, L. (1981). Towards complete integrability
of the self-duality equations. Phys. Rev., D23, 1876-9.
Forgács, P., Horváth, Z., and Palla, L. (1983). Solution-generating technique for
self-dual monopoles. Nuclear Physics, B229, 77-104.
Fuchssteiner, B. and Fokas, A. S. (1981). Symplectic structures, their Bäcklund
transformations and hereditary symmetries. Physica, 4D, 47-66.
Gel'fand, I. M. and Dikii, L. A. (1976). Fractional powers of operators and
Hamiltonian systems. Func. Anal. Appl., 10, 259-73.
Gel'fand, I. M. and Dikii, L. A. (1977). Resolvants and Hamiltonian systems.
Func. Anal. Appl., 11, 93-104.
Geroch, R. (1971). A method for generating new solutions of Einstein's
equations. J. Math. Phys., 12, 918-24.
Geroch, R. (1972). A method for generating new solutions of Einstein's equa-
tions, II. J. Math. Phys., 13, 394-404.
Gibbons, G. W. and Hawking, S. W. (1978). Gravitational multi-instantons.
Phys. Lett., 78B, 430-2.
Glazebrook, J. F., Kamber, F. W., Pedersen, H., and Swann, A. (1994). In:
Geometric study of foliations. Ed. T. Mizutani et al. World Scientific,
Singapore.
Gohberg, I. C. and Krein, M. G. (1958). Systems of integral equations on the
half line with kernels depending on the difference of the arguments. Uspekhi
Mat. Nauk, 13, 3-72. (Russian)
Grauert, H. and Remmert, R. (1958). Bilder und Urbilder analytischer Garben.
Ann. Math., 68, 393-443.
Griffiths, P. and Harris, J. (1978). Principles of algebraic geometry. Wiley,
New York.
Hartshorne, R. (1978). Stable vector bundles and instantons. Commun. Math.
Phys., 59, 1-15.
Helgason, S. (1962). Differential geometry and symmetric spaces. Academic
Press, New York.
Hitchin, N. J. (1982a). Complex manifolds and Einstein's equations. In Twistor
geometry and non-linear systems. Eds. H. D. Doebner and T. D. Palev.
Lecture Notes in Mathematics, 970. Springer, Berlin.
Hitchin, N. J. (1982b). Monopoles and geodesics. Commun. Math. Phys., 83,
589-602.
Hitchin, N. J. (1983). On the construction of monopoles. Commun. Math. Phys.,
89, 145-90.
Hitchin, N. J. (1986). Metrics on moduli spaces. Contemporary Mathematics,
58, Part I.
Hitchin, N. J. (1987a). The self-duality equations on a Riemann surface. Proc.
Lond. Math. Soc., 55, 59-126.
Hitchin, N. J. (1987b). Monopoles, minimal surfaces and algebraic curves.
Séminaire de mathématiques supérieures, NATO ASI, Les Presses de l'Université
de Montréal, Montréal.
Hitchin, N. J. (1995). Twistor spaces, Einstein metrics and isomonodromic de-
formations. J. Diff. Geom., 42, 30-112.
Hoenselaers, C. and Dietz, W. (eds.) (1984). Solutions of Einstein's equations:
techniques and results. Lecture Notes in Physics, 205. Springer, Berlin.
Hörmander, L. (1990). The analysis of linear partial differential equations, 2nd
edition. Springer, Berlin.
Ince, E. L. (1956). Ordinary differential equations. Dover, New York.
Its, A. R. and Novokshenov, V. Yu. (1986). The isomonodromic deformation
method in the theory of Painlevé equations. Lecture Notes in Mathematics,
1191. Springer, Berlin.
Ivancovitch, J. S., Mason, L. J. and Newman, E. T. (1990). On the density
of the Ward ansätze in the space of solutions of anti-self-dual Yang-Mills
solutions. Commun. Math. Phys., 130, 139-55.
Ivanova, T. A. and Popov, A. D. (1991). Self-dual Yang-Mills fields and Nahm's
equations. Lett. Math. Phys., 23, 29-34.
Ivanova, T. A. and Popov, A. D. (1992). Soliton equations and self-dual gauge
fields. Phys. Lett., A170, 293-9.
Jimbo, M., Miwa, T. and Ueno, K. (1981). Monodromy preserving deformation
of linear ordinary differential equations with rational coefficients, I. Physica,
2D, 306-52.
Jimbo, M. and Miwa, T. (1981). Monodromy preserving deformation of linear
ordinary differential equations with rational coefficients, II and III. Physica,
2D, 407-48 and 4D, 26-46.
John, F. (1938). The ultrahyperbolic differential equation with four independent
variables. Duke J. Math., 4, 300-22.
Jones, P. and Tod, K. P. (1985). Minitwistor spaces and Einstein-Weyl spaces.
Class. Quant. Grav., 2, 565-77.
Kac, V. G. and Wakimoto, M. (1989). Exceptional hierarchies of soliton equa-
tions. In Proceedings of Symposia in Pure Mathematics, 49, 191, American
Math. Soc., Providence.
Kinnersley, W. (1977). Symmetries of the stationary Einstein-Maxwell field
equations, I. J. Math. Phys., 18, 1529-37.
Kinnersley, W., and Chitre, D. M. (1977-8). Symmetries of the stationary
Einstein-Maxwell field equations, II-IV. J. Math. Phys., 18, 1538-42, 19,
1926-31, 2037-42.
Kirwan, F. (1992). Complex algebraic curves. London Mathematical Society
Student Texts, 23. Cambridge University Press, Cambridge.
Kobayashi, S. and Nomizu, K. (1969). Foundations of differential geometry, Vol.
2. Wiley, New York.
Kodaira, K. (1962). A theorem of completeness of characteristic systems for an-
alytic families of compact submanifolds of complex manifolds. Ann. Math.,
75, 146-62.
Kostant, B. (1970). Quantization and unitary representations. In Lectures in
modern analysis III. Ed. C. T. Taam. Lecture Notes in Mathematics, 170.
Springer, Berlin.
Kostant, B. (1979). The solution to a generalized Toda lattice and representation
theory. Adv. in Math., 34, 195-338.
Kramer, D., Stephani, H., MacCallum, M., and Herlt, E. (1980). Exact solutions
of Einstein's field equations. VEB Deutscher Verlag der Wissenschaften,
Berlin, and Cambridge University Press, Cambridge.
Kronheimer, P. B. (1990a). A hyper-Kählerian structure on coadjoint orbits of
a semisimple complex group. J. Lond. Math. Soc., 2, 42, 193-208.
Kronheimer, P. B. (1990b). Instantons and the geometry of the nilpotent variety.
J. Diff. Geom., 32, 473-490.
Kundt, W., and Trümper, M. (1966). Orthogonal decomposition of
axi-symmetric stationary space-times. Z. Phys., 192, 419-22.
Lakshmanan, M. (1977). Continuum spin system as an exactly solvable dynam-
ical system. Phys. Lett., A61, 53-4.
LeBrun, C. (1991). Explicit self-dual metrics on CP^2 # ... # CP^2. J. Diff. Geom.,
34, 223-53.
Léauté, B. and Marcilhacy, G. (1979). Sur certaines solutions particulières transcendantes
des équations d'Einstein. Ann. Inst. H. Poincaré, 31, 363-75.
Lerner, D. E. (1992). The linear system for self-dual gauge fields on a space-time
with signature zero. J. Geom. Phys., 8, 211-19.
Libermann, P. and Marle, C-M. (1987). Symplectic geometry and analytical
mechanics. Reidel, Dordrecht.
McIntosh, I. (1993). Soliton equations and connections with self-dual Yang-Mills.
In Applications of analytic and geometric methods to nonlinear differential
equations. Ed. P. A. Clarkson, NATO ASI series 413. Kluwer, Dordrecht.
Magri, F. (1978). A simple model of the integrable Hamiltonian equation. J.
Math. Phys., 19, 1156-62.
Magri, F. (1980). A geometrical approach to the nonlinear solvable equations.
In Nonlinear evolution equations and dynamical systems. Eds. M. Boiti, F.
Pempinelli, and G. Soliani, Lecture Notes in Physics, 120. Springer, Berlin.
Manakov, S. V. (1976). Remarks on the integrals of the Euler equations of the
n-dimensional heavy top. Funct. Anal. Appl., 10, 93-4.
Manakov, S. V. and Zakharov, V. E. (1981). Three dimensional model of
relativistic invariant theory, integrable by the inverse scattering transform. Lett.
Math. Phys., 5, 247-53.
Manton, N. (1981). Multi-monopole dynamics. In Quantum field theory. World
Scientific, Singapore.
Mason, L. J. (1990). H-space, a universal integrable system? Twistor Newsletter,
30. Reprinted in Further advances in twistor theory, Vol. II: Integrable
systems, conformal geometry and gravitation, §II.1.7. Eds. L. J. Mason, L.
P. Hughston, and P. Z. Kobak. Pitman Research Notes in Mathematics
232, Longman, Harlow, 1995.
Mason, L. J. (1992a). On the symmetries of the reduced self-dual Yang-Mills
equations. Twistor Newsletter, 35. Reprinted in Further advances in twistor
theory, Vol. II: Integrable systems, conformal geometry and gravitation,
§II.1.10. Eds. L. J. Mason, L. P. Hughston, and P. Z. Kobak. Pitman
Research Notes in Mathematics 232, Longman, Harlow, 1995.
Mason, L. J. (1992b). Global solutions of the self-duality equations in split
signature. Twistor Newsletter, 35. Reprinted in Further advances in twistor
theory, Vol. II: Integrable systems, conformal geometry and gravitation,
§II.1.7. Eds. L. J. Mason, L. P. Hughston, and P. Z. Kobak. Pitman
Research Notes in Mathematics 232, Longman, Harlow, 1995.
Mason, L. J. (1995). Generalized twistor correspondences, d-bar problems, and
the KP equations. In Twistor theory. Ed. S. Huggett. Lecture Notes in
Pure and Applied Mathematics 169, Marcel Dekker.
Mason, L. J., Chakravarty, S. and Newman, E. T. (1988). Bäcklund
transformations for the anti-self-dual Yang-Mills equations. J. Math. Phys., 29, 4,
1005-13.
Mason, L. J. and Newman, E. T. (1989). A connection between the Einstein
and Yang-Mills equations. Commun. Math. Phys., 121, 659-68.
Mason, L. J. and Singer, M. A. (1994). The twistor theory of equations of KdV
type, I. Commun. Math. Phys., 166, 191-218.
Mason, L. J. and Sparling, G. A. J. (1989). Nonlinear Schrödinger and
Korteweg-de Vries are reductions of self-dual Yang-Mills. Phys. Lett., A137, 29-33.
Mason, L. J. and Sparling, G. A. J. (1992). Twistor correspondences for the
soliton hierarchies. J. Geom. Phys., 8, 243-71.
Mason, L. J. and Woodhouse, N. M. J. (1993). Twistor theory and the
Schlesinger equations. In Applications of analytic and geometric methods to
nonlinear differential equations. Ed. P. A. Clarkson, NATO ASI series 413.
Kluwer, Dordrecht.
Mason, L. J. and Woodhouse, N. M. J. (1993). Self-duality and the Painlevé
transcendents. Nonlinearity, 6, 569-81.
Maszczyk, R. (1995). The symmetry transformation-self-dual Yang-Mills fields
and self-dual metrics. Ph. D. Thesis, Warsaw University.
Maszczyk, R., Mason, L. J., and Woodhouse, N. M. J. (1994). Self-dual Bianchi
metrics and the Painlevé transcendents. Class. Quantum Grav., 11, 65-71.
Miura, R. M., Gardner, C. S. and Kruskal, M. D. (1968). The Korteweg-de
Vries equations and generalizations, II. Existence of conservation laws and
constants of motion. J. Math. Phys., 9, 1204-9.
Moser, J. K. (1965). On the volume elements on a manifold. Trans. Amer.
Math. Soc., 120, 286-94.
Nahm, W. (1983). Self-dual monopoles and calorons. In Proc. XII Colloq. on
gauge theoretic methods in physics, Trieste. Eds. G. Denardo et al. Lecture
Notes in Physics, 201. Springer, Berlin.
Newman, E. T. (1978). Source-free Yang-Mills theories. Phys. Rev., D18, 2901-2908.
Newman, E. T. (1986). Gauge theories, the holonomy operator and the Riemann-
Hilbert problem. J. Math. Phys., 27, 2797-802.
Novikov, S. P. (1994). Solitons and geometry. Accademia Nazionale dei Lincei and
Scuola Normale Superiore, Lezioni Fermiane. Cambridge University Press,
Cambridge.
Okonek, C., Schneider, M., Spindler, H. (1980). Vector bundles on complex
projective spaces. Prog. Math., 3. Birkhäuser, Boston.
Olver, P. J. (1986). Applications of Lie groups to differential equations. Gradu-
ate Texts in Mathematics, 107. Springer, Berlin.
Painlevé, P. (1900). Sur les équations différentielles du second ordre et d'ordre
supérieur dont l'intégrale générale est uniforme. Acta Math., 25, 1-85.
Park, Q-H. (1990). Self-dual gravity as the large N limit of the two dimensional
non-linear sigma model. Phys. Lett., 236B, 429-32.
Pedersen, H. and Poon, Y. S. (1988). Hyper-Kähler metrics and a generalization
of the Bogomolny equations. Commun. Math. Phys., 117, 569-80.
Pedersen, H. and Poon, Y. S. (1990). Kähler surfaces with zero scalar curvature.
Class. Quantum Grav., 7, 1707-19.
Pedersen, H. and Tod, K. P. (1993). Three-dimensional Einstein-Weyl geometry.
Adv. Math., 97, 74-109.
Penrose, R. and MacCallum, M. A. H. (1972). Twistor theory: an approach to
the quantization of fields and space-time. Phys. Rep., 6C, 241-315.
Penrose, R. (1976). Nonlinear gravitons and curved twistor theory. Gen. Rel.
Grav., 7, 31-52.
Penrose, R. and Rindler, W. (1984). Spinors and space-time. Vol. 1: Two-spinor
calculus and relativistic fields. Cambridge University Press, Cambridge.
Penrose, R. and Rindler, W. (1986). Spinors and space-time. Vol. 2: Spinor
and twistor methods in space-time geometry. Cambridge University Press,
Cambridge.
Penrose, R. (1992). Twistors as spin 3/2 charges. In Gravitation and modern
cosmology. Eds. A. Zichichi and N. Sanchez. Plenum Press, New York.
Persides, S. and Xanthopoulos, B. C. (1988). Some new stationary axisymmetric
asymptotically flat space-times obtained from Painlevé transcendents. J.
Math. Phys., 29, 674-80.
Plebanski, J. F. (1975). Some solutions of complex Einstein equations. J. Math.
Phys., 16, 2395-2402.
Pohlmeyer, K. (1980). On the Lagrangian theory of anti-self-dual fields in four-
dimensional Euclidean space. Commun. Math. Phys., 72, 37-47.
Pontecorvo, M. (1992). On twistor spaces of anti-self-dual Hermitian surfaces.
Trans. Amer. Math. Soc., 331, 653-61.
Popov, A. D. (1992). Anti-self-dual solutions of the Yang-Mills equations in 4n
dimensions. Mod. Phys. Lett., A7, 2077-85.
Pressley, A. and Segal, G. B. (1986). Loop groups. Oxford University Press,
Oxford.
Sato, M. and Sato, Y. (1983). Soliton equations as dynamical systems on infinite
dimensional Grassmann manifolds. In Nonlinear differential equations in
applied science (Tokyo 1982). Math. Stud. 81. North-Holland, Amsterdam.
Schiff, J. (1992). Integrability of Chern-Simons-Higgs vortex equations and a
dimensional reduction of the self-dual Yang-Mills equations to three
dimensions. In Painlevé transcendents. Eds. D. Levi and P. Winternitz. Plenum
Press, New York.
Schlesinger, L. (1912). Über eine Klasse von Differentialsystemen beliebiger
Ordnung mit festen kritischen Punkten. J. für Math., 141, 96-145.
Segal, G. B. and Wilson, G. (1985). Loop groups and equations of KdV type.
IHES Publ. math., 61, 5-65.
Segal, G. B. (1991). The geometry of the KdV equation. Int. Jour. Mod. Phys.,
A6, 2859-69.
Sparling, G. A. J. (1991). Generalizations of Yang-Mills. In Further advances
in twistor theory. Eds. L. J. Mason and L. P. Hughston. Pitman Research
Notes, 37. Longman, London.
Strachan, I. A. B. (1992). Null reductions of the Yang-Mills self-duality equa-
tions and integrable models in (2+1)-dimensions. In Applications of analytic
and geometric methods to nonlinear differential equations. Ed. P. A. Clark-
son, NATO ASI series C, 413. Kluwer, Dordrecht.
Strachan, I. A. B. (1994). Deformed twistor spaces and the KP equation. Twistor
Newsletter, 39, 10-11.
Symes, W. (1980). Systems of Toda type, inverse spectral problems and
representation theory. Invent. Math., 59, 13-53.
Szmigielski, J. (1993). On the soliton content of the self-dual Yang-Mills equa-
tions. Phys. Lett., A193, 293-300.
Tod, K. P. (1990). A non-Hausdorff mini-twistor space. Twistor Newsletter,
30, 21-3. Reprinted in: Further advances in twistor theory, Vol. II: Inte-
grable systems, conformal geometry and gravitation. Eds. L. J. Mason, L. P.
Hughston, and P. Z. Kobak. Pitman Research Notes in Mathematics 232.
Longman, Harlow, 1995.
Tod, K. P. (1991). A comment on a paper of Pedersen and Poon. Class. Quan-
tum Grav., 8, 1049-51.
Tod, K. P. (1992a). Metrics with self-dual Weyl tensor from Painlevé VI. Twistor
Newsletter, 35, 5-7.
Tod, K. P. (1992b). Some new scalar-flat Kähler and hyper-Kähler metrics.
Twistor Newsletter, 35, 8-10.
Tod, K. P. (1994). Self-dual Einstein metrics from the Painlevé VI equation.
Phys. Lett., A190, 221-4.
Tod, K. P. (1995a). Self-dual Einstein metrics with symmetry. Twistor Newsletter,
39, 19-24.
Tod, K. P. (1995b). Scalar-flat Kähler and hyper-Kähler metrics from Painlevé-III.
Class. Quantum Grav., 12, 1535-47.
Tod, K. P. (1995c). Cohomogeneity-one self-dual metrics. In Twistor theory. Ed.
S. Huggett. Lecture Notes in Pure and Applied Mathematics 169, Marcel
Dekker.
Ueno, K. and Nakamura, Y. (1983). Transformation theory for the anti-self-dual
equations. Publ. RIMS, Kyoto Univ., 19, 519-47.
Uhlenbeck, K. K. (1982). Removable singularities in Yang-Mills fields. Com-
mun. Math. Phys., 83, 11-29.
Van Moerbeke, P. (1985). Algebraic geometrical methods in Hamiltonian me-
chanics. Phil. Trans. Roy. Soc. Lond., A315, 379-90.
Ward, R. S. (1977). On self-dual gauge fields. Phys. Lett., 61A, 81-2.
Ward, R. S. (1980). Self-dual space-times with cosmological constant. Commun.
Math. Phys., 78, 1-17.
Ward, R. S. (1981). Ansätze for self-dual Yang-Mills fields. Commun. Math.
Phys., 80, 563-74.
Ward, R. S. (1983). Stationary axisymmetric space-times: a new approach. Gen.
Rel. Grav., 15, 105-9.
Ward, R. S. (1984a). The Painlevé property for the self-dual gauge-field
equations. Phys. Lett., A102, 279-82.
Ward, R. S. (1984b). Completely solvable gauge field equations in dimension
greater than four. Nucl. Phys., B236, 381-96.
Ward, R. S. (1985). Integrable and solvable systems and relations among them.
Phil. Trans. R. Soc., A315, 451-7.
Ward, R. S. (1986). Multidimensional integrable systems. In Field theory, quan-
tum gravity and strings II. Eds. H. de Vega and N. Sanchez, Lecture Notes
in Physics, 280. Springer, Berlin.
Ward, R. S. (1988a). Integrability of the chiral equations with torsion term.
Nonlinearity, 1, 671-9.
Ward, R. S. (1988b). Soliton solutions in an integrable chiral model in 2+1
dimensions. J. Math. Phys., 29, 386-9.
Ward, R. S. (1989). Twistors in 2 + 1 dimensions. J. Math. Phys., 30, 2246-51.
Ward, R. S. (1990a). Integrable systems in twistor theory. In Twistors in math-
ematics and physics. Eds. T. N. Bailey and R. J. Baston. London Mathe-
matical Society Lecture Notes in Mathematics, 156. Cambridge University
Press, Cambridge.
Ward, R. S. (1990b). The SU(∞) chiral model and self-dual vacuum spaces.
Class. Quantum Grav., 7, L217-22.
Ward, R. S. (1990c). Classical solutions of chiral models, unitons, and holomor-
phic vector bundles. Commun. Math. Phys., 128, 319-32.
Ward, R. S. (1992). Infinite-dimensional gauge groups and special nonlinear
gravitons. J. Geom. Phys., 8, 317-25.
Ward, R. S. and Wells, R. O. (1990). Twistor geometry and field theory.
Cambridge University Press, Cambridge.
Wasow, W. (1976). Asymptotic expansions for ordinary differential equations.
Wiley, New York.
Weinstein, A. (1971). Symplectic manifolds and their Lagrangian submanifolds.
Adv. in Math., 6, 329-46.
Wells, R. O. (1973). Differential analysis on complex manifolds. Prentice-Hall,
Englewood Cliffs.
Wilson, G. (1979). Commuting flows and conservation laws for Lax equations.
Proc. Camb. Phil. Soc., 86, 131-43.
Wilson, G. (1988). On the quasi-hamiltonian formalism of the KdV equation.
Phys. Lett., A132, 445-51.
Witten, E. (1977). Some exact multipseudoparticle solutions of classical Yang-
Mills theory. Phys. Rev. Lett. 38, 121-4.
Witten, E. (1988). Quantum field theory, Grassmannians and algebraic curves.
Commun. Math. Phys., 113, 529-600.
Witten, L. (1979). Static axially symmetric solutions of self-dual SU(2) gauge
fields in Euclidean four-dimensional space. Phys. Rev., D19, 718-20.
Woodhouse, N. M. J. (1983). On self-dual gauge fields arising from twistor
theory. Phys. Lett., A94, 269-70.
Woodhouse, N. M. J. (1985). Real methods in twistor theory. Class. Quantum
Grav., 2, 257-91.
Woodhouse, N. M. J. (1987). Twistor description of the symmetries of Einstein's
equations for stationary axisymmetric space-times. Class. Quantum Grav.,
4, 799-814.
Woodhouse, N. M. J. (1990). Ward's splitting construction for stationary ax-
isymmetric solutions of the Einstein-Maxwell equations. Class. Quantum
Grav., 7, 257-60.
Woodhouse, N. M. J. (1992a). Contour integrals for the ultrahyperbolic wave
equation. Proc. Roy. Soc. Lond., A438, 197-206.
Woodhouse, N. M. J. (1992b). Geometric Quantization, 2nd edition. Oxford
University Press, Oxford.
Woodhouse, N. M. J. and Mason, L. J. (1988). The Geroch group and non-
Hausdorff twistor spaces. Nonlinearity, 1, 73-114.
Yang, C. N. (1977). Condition of self-duality for SU(2) gauge fields on Euclidean
four-dimensional space. Phys. Rev. Lett., 38, 1377-9.
Zakharov, V. E. (1980). The inverse scattering method. In Solitons. Eds. R. K.
Bullough and P. J. Caudrey. Springer, Berlin.
Zakharov, V. E. and Manakov, S. (1985). Construction of multidimensional
nonlinear integrable systems and their solutions. Funct. Anal. Appl., 19,
89-101.
Zakharov, V. E. and Shabat, A. B. (1974). Integration of the nonlinear equations
of mathematical physics by the method of the inverse scattering transform,
I. Funct. Anal. Appl., 8, 226-35.
Zakharov, V. E. and Shabat, A. B. (1979). Integration of the nonlinear equations
of mathematical physics by the method of the inverse scattering transform,
II. Funct. Anal. Appl., 13, 166-73.
A note on notation

We use ζ throughout to denote the spectral parameter, and ζ̃ to denote ζ⁻¹:
this is suggested by the fact that ζ̃ = ζ̄ on the unit circle. Generally, we use a
tilde (˜) to distinguish quantities defined in a neighbourhood of ζ = ∞ from the
corresponding quantities defined in a neighbourhood of ζ = 0. Other frequently
used symbols that are reserved (not quite exclusively) for special use are: L and
M for the elements of a Lax pair; η and ν for the space-time metric and its
volume form; w, z, w̃, z̃ for double-null coordinates on space-time; ∂_w, ∂_z,
∂_w̃, ∂_z̃ for the partial derivatives ∂/∂w, ∂/∂z, ∂/∂w̃, ∂/∂z̃; X, Y, ... for
conformal Killing vectors; P, Q, ... for the corresponding Higgs fields; π_A' for
the spinor with components (1, ζ); U for a neighbourhood in complex space-time;
D = d + Φ for an ASD connection; E → U for the vector bundle on which it is
defined; g for the matrix of a gauge transformation; 𝒫 for the twistor space of
U; and ℱ for the correspondence space. We use λ, µ, and ζ to denote
inhomogeneous coordinates on twistor space, V and Ṽ to denote a 2-set Stein
cover (V contains the points ζ = 0 and Ṽ contains the points ζ = ∞), E' → 𝒫 to
denote the Penrose-Ward transform of an ASD connection, F to denote the patching
matrix of E', and f and f̃ to denote its Birkhoff factors.
Index of notation

A, 30 F-, 14
adj, 26 F+, 14
α, 17
α̃, 17 Γ(E), 24
Γ(U, E), 24
B, 17, 30 9,39
B(rn), 123
R, 308
Cabcd, 288 H(k,p), 130
CM, 14 Hn(P, E), 155
CM#, 20 H++, 68
ℂℙ_n, 137 H+O, 68
HSD, 68
D, 25, 27 HASD, 68
0,18,153
aD, 40 1, 139
a, 18 2k, 55
8D, 40 tA, 162
5,18
0., 356 J, 34
J,39
E, 15
E, 17, 30 K, 36
E, 308 IC, 39
E', 176
ε_abcd, 16 L, 19, 34
ε_AB, 162 LA3, 129
ε_A'B', 162 A, 250
η_ab, 16 λ, 138
η^ab, 16 A, 139
LGL(n, C ), 146
F, 146, 173 LGL+(n, C ), 146
F, 141 LGL_(n,C), 146
ℱ, 140 ℒ_X, 28
fin, 259 e, 141, 288
FkV, 138
f , 146, 173 M, 19, 34
f , 146, 173 M, 15
M, 39 p`, 27
m, 141, 288 p., 27
MT, 212 ℝℙ_n, 137
µ, 138
µ, 139 S, 258
S, 309
Vabcd, 16 Q, 141
S,160
O(k), 290, 292
S', 160
0(k), 151
* 16
O(A-k), 261 '
0A, 162
r, 152, 288
1, 41
9, 287
ω, 17 9, 287
ω^A, 167 Ox, 28, 48
92k, 116 '1<', 139

P1,11, 104 U, 15
PAj, 130 U#, 187
P_III, 105 T, 292
P, 139
P_IV, 105 V, 308
P_V, 106
P_VI, 106 w, 15
Φ, 25
Φ_ab, 288 w, 15
Φ_ABC'D', 293
0k, 115 x, 289
Ox, 49 xAA , 161
11, 18 x = (xaQ), 20
π_A', 167 X x Y, 31
PN, 141 S, 288
41, 39 C, 199, 288
Ψ_ABCD, Ψ̃_A'B'C'D', 293 X', 206
PT, 139 X", 206
PV, 137
y, 289
QAj, 130
z, 15
R, 288 Z^α, 139, 168
R_ab, 288 ζ, 19, 138
7Z, 211, 285 (, 139
R, 114 z, 15
Index

Abelianization, 269 generalized, see GASDYM

action equation
conformal, 20 linearized, 39, 114, 115, 117
free, 28, 49 on a Kähler manifold, 337
left, 31 reduction, 43-44
Lie algebra, 28 solution space, 39
lift, 27, 317 spinor form, 165
orthogonally transitive, 85 ultrahyperbolic
action-angle variable, 1 global solutions, 187-194
ADHM construction, 182 reality conditions, 192
adjoint bundle, 26 topological conditions, 193
affine bundle, 169 ASDYM hierarchy, 118, 120, 121
AHS construction, 181 Ashtekar-Jacobson-Smolin equations, 313
α-plane, 19, 23, 128, 138, 164, 167,
194, 195 Atiyah-Ward ansatz, 175, 193
real, 142
α-plane bundle, 289 Bäcklund transformation, 55, 249
α-surface, 188, 291, 294 Baker function, 252, 261, 266, 268,
alternating symbol, 16 276
alternating tensor, 162 Bessel's equation, 46
anti-self-dual (ASD), 14 β-plane, 19
ASD conformal structure, 284, 285, Bianchi identity, 27
294, 307 Bianchi metrics, 95
Bianchi IX, 314 bi-Hamiltonian structure, 111, 115,
ASD Einstein metric, 285, 295, 303, 333
305 KdV, 126
ASD equation Birkhoff factorization, 146, 149
linear, 45 Bogomolny equation, 59, 213
on a Riemann surface, 67, 69 complex, 60
ASD Kähler metric, 298 Bogomolny hierarchy, 118, 122, 130,
ASD scalar-flat metric, 285 275
ASD vacuum equation level m, 123
Lax pair, 298 Boussinesq equation, 83, 324
ASD vacuum metric, 285, 296 Burgers' equation, 8
ASDYM equation, 4, 32, 296, 307
Euclidean, 180-187 canonical bundle, 152
first integral, 44, 63, 65 Cartan equation
first, 286 homogeneous, 137
second, 286 inhomogeneous, 137
Čech cohomology, 155 correspondence space, 140, 205,
characteristic distribution, 330 288-294
chiral equation, 70 complex, 188
chiral model, 214 real, 188
coboundary, 155 cotangent bundle, 151
cocycle relation, 155 covariant tensor, 19
cohomology group, 155 curvature, 26, 317
Dolbeault, 156
commuting flows, 118 Davey-Stewartson equation, 43, 66,
compactified space-time, 19 274, 282
condition (C), 285 ∂̄-operator, 153-155, 275
condition (E), 285 differential form, 16
condition (K), 298 anti-self-dual (ASD), 17, 163
condition (S), 285 ∂̄-closed, 155
condition (V), 285, 296 ∂̄-exact, 155
conformal group, 20 self-dual (SD), 17, 163
complex, 20 real, 17
generators, 22 type (p, q), 153
real, 21 values in a bundle, 154
conformal Killing vector, 20, 21, 23, Dirac operator, 275
205 Dolbeault isomorphism, 156, 157,
ASD, 23 199
lift, 205 double fibration, 140, 205, 300
non-null, 61 reduced, 211
null, 63 dressing matrix, 258, 259
rotational, 61 dressing transformation, 251, 272,
SD, 23 320
conformal mapping, 144 Drinfeld-Sokolov construction, 81,
conformal transformation, 19 114, 130, 250, 319-326
special, 20 uniqueness of flows, 325
connection, 25 DS flow, 254, 259
gauge equivalent, 317 twistor construction, 254-256,
invariant, 48 264
potential, 25 DS hierarchy, 253
pull-back, 27, 47 DS operator, 254
restricted, 27
spinor, 287 Einstein condition, 285
constraint Einstein-Maxwell equations, 88
dynamic, 54, 96 Einstein's equation, 85
kinematic, 50 solution generation, 87
coordinates Einstein-Weyl space, 306
affine, 137 equation
double-null, 14, 161 Painlevé type, 5
equivalence, of holomorphic vector Hamiltonian vector field, 328

bundles, 150 harmonic map, 67, 70, 72
Ernst equation, 67, 84, 88, 108, 210, heat equation, 45
215 Heisenberg ferromagnet equation,
Ernst potential, 88 78
Euclidean group, 67 Higgs field, 44, 48, 130, 131, 208-
Euler-Arnold-Manakov equation, 209, 220
102 homogeneous function, 151, 163
Euler's top, 2, 95, 97, 99 horizontal distribution, 308
Euler vector field, 206, 292 horizontal lift, 308
exponent of formal monodromy, 235 hypergeometric equation, 46
exterior derivative, 16 hyper-Kähler metric, 300
exterior product, 16
instanton, 36, 66, 182
flag manifold, 138 instanton number, 182
fundamental solution, 172 inverse-scattering method, 220, 243
inverse-scattering transform, 3, 193,
Galilean transformation, nonlinear, 220
75 isomonodromy, 95, 231-241
GASDYM equation, 127, 194 strong, 236
GASDYM hierarchy, 129, 244
truncated, 130, 195 J-matrix, 34, 174-176, 178
gauge, 8, 26 Jost function, 158
Higgs, 74, 339 jumping line, 153
invariant, 28 jumping point, 148
normal, 74 Kac-Moody algebra, 7
gauge group, 13, 26, 317 Kähler metric, 298
gauge potential, 25 scalar-flat, 315
gauge theory, 13 KdV equation, 2, 3, 66, 67, 75,
gauge transformation, 26, 174, 317- 79, 108, 111, 124, 218-220,
318, 322 225, 273, 311, 324, 336
active, 317 linearized, 111, 124
infinitesimal, 40 modified, 81
irregular, 55 KdV flow, 269
passive, 317 KdV hierarchy, 112, 246
Gel'fand-Levitan-Marchenko equa- Klein quadric, 20, 142
tion, 242 real, 144
generating function, 262, 263, 270 K-matrix, 36, 174-176, 178
Gibbons-Hawking ansatz, 311 Kovalevskaya's top, 2, 6, 95, 99
Grassmannian, 7, 157, 279 KP equation, 43, 66, 274, 276
Grothendieck's theorem, 152 KP hierarchy, 274-282
∂̄ approach, 278
Hamiltonian, 42, 113, 273 Krichever construction, 7, 244, 266
Hamiltonian formalism, 269
Hamiltonian system, 1, 329 Lagrange's top, 2, 95, 101
Lagrangian, 29, 36 Nahm's equation, 97, 98
Landau-Lifschitz equation, 9 negative frequency, 145
Laplace's equation, 45 nKdV equation, 67, 83
Lax pair, 3, 34, 172 nKdV hierarchy, 81, 131, 244, 261
left rotation, 22, 161 characterization, 326
Levi-Civita connection, 286 NLS equation, 4, 67, 75, 77, 108,
Lie derivative, 28, 48, 207-208, 220, 126, 218-220, 222-225, 337
229, 249, 257 attractive, 78
standard form, 228 complex, 77
light-cone at infinity, 142 repulsive, 78
line bundle, 24 NLS hierarchy, 246
canonical, 303 nonsolitonic-solutions, 224
tautological, 292 normal bundle, 301
linear system, 3 null cone at infinity
ASDYM, 34, 168, 174 α-plane, 143
reduced, 209-212 null tetrad, 15, 161
reduced, 68 conformal, 294
Liouville's equation, 6, 67, 92 n-wave equation, 67, 83
loop group, 146
lump solution, 214 Painlevé equation, 5, 95, 97, 102,
231-239
first, 107
Manakov-Zakharov model, 214 second, 104
Maxwell's equations, 13, 29, 335 third, 105, 106, 315
metric fourth, 105, 107
conformally half-flat, 285 fifth, 106, 107
Einstein, 284 sixth, 106, 314
Euclidean, 14 Painlevé group, 95, 97, 232
Kähler, 284 Painlevé property, 5, 6, 95, 179-180
metric, 14 Painlevé system, 237
ultrahyperbolic, 14 Painlevé test, 95
vacuum, 284 Painlevé transcendent, 232
metric tensor, 162 paraconformal geometry, 128
minitwistor space, 212 parallel transport, 25
Minkowski space partial connection, 154
compactified, 20, 142-145 partial flatness, 154
Miura-Gardner-Kruskal transfor- patching data, 149, 173, 223
mation, 113, 274 invariant, 218
Miura transformation, 81 patching matrix, 24, 150, 171, 173-
monad, 183 261
Monge-Ampère equation, 307 Penrose transform, 196
monodromy data, 234, 235 Penrose-Ward transform, 5, 171-
monodromy representation, 234 179, 194, 244, 255-275
monopoles, 60 Dolbeault form, 278
hyperbolic, 60 forward, 173
global ultrahyperbolic solu- scalar curvature, 288

tions, 190 scattering coefficient, 159, 223
local ultrahyperbolic solutions, scattering data, 3, 220, 242
189 scattering matrix, 159
Plebański equation, 299 Schlesinger equation, 239-241
first, 299 Schrödinger equation, 158
second, 300 time-independent, 226
Poisson operator, 332 section
Poisson structure, 112, 328 global, 23
compatible, 332 local, 23
Poisson tensor, 328 Segal-Wilson ansatz, 229, 244, 253,
positive frequency, 145 264, 275
presymplectic structure, 330 self-dual (SD), 14
prime-spin bundle, 292 separation ray, 235
projective, 290, 292 singular set, 204, 209
principal bundle, 24 sinh-Gordon equation, 72
projective line, 138 skew-symmetrization, 16
in PT, 139 soliton, 3, 224, 266
projective plane, 138 space of orbits, 49
projective space, 137 spectral curve, 266
dual, 138 spectral parameter, 3, 19
pull-back, 19 invariant, 210
spin frame, 162
spin structure, 164-165
reality structure, 141, 179, 305 conformal, 169
real slice, 15, 141, 165 real, 166
Euclidean, 15, 32, 141 spinor, 160-168, 289-294
Minkowski, 15 conjugation, 165
ultrahyperbolic, 15, 32, 142 tensor correspondence, 161
recursion equations, 118 star operator, 16
recursion operator, 112, 114, 123,
Stein manifold, 155
129, 171, 245, 247, 332 Stokes' sector, 230
ASDYM, 125 structure function, 328
KdV, 112, 124, 125 structure group, 24, 317
recursion relations, 115, 127 summation convention, 16
reflection coefficient, 3, 158 switch map, 307
reflectionless potential, 228 symmetrization, 30
regular function, 172 symmetry
Riccati equation, 5 discrete, 52
Ricci tensor, 288 point, 63
Riemann-Hilbert problem, 145-149, Z2, 53
228 symplectic form, 269
Riemann tensor, 286 strongly nondegenerate, 334
right rotation, 22, 161, 163 weakly nondegenerate, 334
rigid frame, 286 symplectic reduction, 330
symplectic structure, 196, 199, 328 vacuum metric, 309
KdV, 125 vector bundle, 23
associated, 25
tangent bivector, 18 direct sum, 24
tangent bundle, 151 dual, 25
tangent spinor, 167 holomorphic, 24
τ-function, 265, 280 invariant, 207
't Hooft ansatz, 176, 187 over ℂℙ_1, 150
Toda field equation, 67, 71 pull-back, 27
extended, 67, 72 restriction, 27
Toda lattice, 101 tautological, 151
topological chiral model, 82 tensor product, 24
torsion tensor, 169 total space, 23
transition function, 24 vertical distribution, 308
transition matrix, 224
transmission coefficient, 3, 158, 159 Ward's theorem, 177
trivialization, 24 wave equation, 45
twist, 62 background-coupled, , ,
twist scalar, 93 196-200
twistor distribution, 290, 291 ultrahyperbolic, 33
twistor space, 138-145, 205, 255, Weyl geometry, 305
256, 275 Weyl tensor, 288
of U, 139
projective, 139 X-ray transform, 194
real, 188
reduced, 204, 211-218 Yang-Mills equations, 29
non-Hausdorff topology, 216 ASD, 32
two-lump solutions, 214 Yang's equation, 34, 37, 68, 69, 89
two-plane reduction, 62
null, 18 Yang's matrix, 34

Uhlenbeck's theorem, 182 Zakharov-Shabat hierarchy, 82

Zakharov's system, 64
vacuum field equation, 285 complex, 64
