Applied Mathematical Sciences
Volume 109
Editors
J.E. Marsden L. Sirovich F. John (deceased)
Advisors
M. Ghil J.K. Hale T. Kambe
J. Keller K. Kirchgässner
B.J. Matkowsky C.S. Peskin
J.T. Stuart
Springer Science+Business Media, LLC
Applied Mathematical Sciences
1. John: Partial Differential Equations, 4th ed.
2. Sirovich: Techniques of Asymptotic Analysis.
3. Hale: Theory of Functional Differential Equations, 2nd ed.
4. Percus: Combinatorial Methods.
5. von Mises/Friedrichs: Fluid Dynamics.
6. Freiberger/Grenander: A Short Course in Computational Probability and Statistics.
7. Pipkin: Lectures on Viscoelasticity Theory.
8. Giacaglia: Perturbation Methods in Non-linear Systems.
9. Friedrichs: Spectral Theory of Operators in Hilbert Space.
10. Stroud: Numerical Quadrature and Solution of Ordinary Differential Equations.
11. Wolovich: Linear Multivariable Systems.
12. Berkovitz: Optimal Control Theory.
13. Bluman/Cole: Similarity Methods for Differential Equations.
14. Yoshizawa: Stability Theory and the Existence of Periodic Solution and Almost Periodic Solutions.
15. Braun: Differential Equations and Their Applications, 3rd ed.
16. Lefschetz: Applications of Algebraic Topology.
17. Collatz/Wetterling: Optimization Problems.
18. Grenander: Pattern Synthesis: Lectures in Pattern Theory, Vol. I.
19. Marsden/McCracken: Hopf Bifurcation and Its Applications.
20. Driver: Ordinary and Delay Differential Equations.
21. Courant/Friedrichs: Supersonic Flow and Shock Waves.
22. Rouche/Habets/Laloy: Stability Theory by Liapunov's Direct Method.
23. Lamperti: Stochastic Processes: A Survey of the Mathematical Theory.
24. Grenander: Pattern Analysis: Lectures in Pattern Theory, Vol. II.
25. Davies: Integral Transforms and Their Applications, 2nd ed.
26. Kushner/Clark: Stochastic Approximation Methods for Constrained and Unconstrained Systems.
27. de Boor: A Practical Guide to Splines.
28. Keilson: Markov Chain Models-Rarity and Exponentiality.
29. de Veubeke: A Course in Elasticity.
30. Śniatycki: Geometric Quantization and Quantum Mechanics.
31. Reid: Sturmian Theory for Ordinary Differential Equations.
32. Meis/Markowitz: Numerical Solution of Partial Differential Equations.
33. Grenander: Regular Structures: Lectures in Pattern Theory, Vol. III.
34. Kevorkian/Cole: Perturbation Methods in Applied Mathematics.
35. Carr: Applications of Centre Manifold Theory.
36. Bengtsson/Ghil/Källén: Dynamic Meteorology: Data Assimilation Methods.
37. Saperstone: Semidynamical Systems in Infinite Dimensional Spaces.
38. Lichtenberg/Lieberman: Regular and Chaotic Dynamics, 2nd ed.
39. Piccinini/Stampacchia/Vidossich: Ordinary Differential Equations in R^n.
40. Naylor/Sell: Linear Operator Theory in Engineering and Science.
41. Sparrow: The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors.
42. Guckenheimer/Holmes: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields.
43. Ockendon/Taylor: Inviscid Fluid Flows.
44. Pazy: Semigroups of Linear Operators and Applications to Partial Differential Equations.
45. Glashoff/Gustafson: Linear Operations and Approximation: An Introduction to the Theoretical Analysis and Numerical Treatment of Semi-Infinite Programs.
46. Wilcox: Scattering Theory for Diffraction Gratings.
47. Hale et al: An Introduction to Infinite Dimensional Dynamical Systems-Geometric Theory.
48. Murray: Asymptotic Analysis.
49. Ladyzhenskaya: The Boundary-Value Problems of Mathematical Physics.
50. Wilcox: Sound Propagation in Stratified Fluids.
51. Golubitsky/Schaeffer: Bifurcation and Groups in Bifurcation Theory, Vol. I.
52. Chipot: Variational Inequalities and Flow in Porous Media.
53. Majda: Compressible Fluid Flow and System of Conservation Laws in Several Space Variables.
54. Wasow: Linear Turning Point Theory.
55. Yosida: Operational Calculus: A Theory of Hyperfunctions.
56. Chang/Howes: Nonlinear Singular Perturbation Phenomena: Theory and Applications.
57. Reinhardt: Analysis of Approximation Methods for Differential and Integral Equations.
58. Dwoyer/Hussaini/Voigt (eds): Theoretical Approaches to Turbulence.
59. Sanders/Verhulst: Averaging Methods in Nonlinear Dynamical Systems.
60. Ghil/Childress: Topics in Geophysical Dynamics: Atmospheric Dynamics, Dynamo Theory and Climate Dynamics.
(continued following index)
Eberhard Zeidler
Applied Functional Analysis
Main Principles and Their Applications
With 37 Illustrations
Springer
Eberhard Zeidler
Max-Planck-Institut für Mathematik
in den Naturwissenschaften Leipzig
Inselstrasse 22-26
Leipzig, D-04103
Germany
Editors
J.E. Marsden L. Sirovich
Control and Dynamical Systems Division of Applied
107-81 Mathematics
California Institute of Technology Brown University
Pasadena, CA 91125 Providence, RI 02912
USA USA
Mathematics Subject Classification (1991): 34A12, 42A16, 35J05
Library of Congress Cataloging-in-Publication Data
Zeidler, Eberhard
Applied functional analysis : main principles and their
applications / Eberhard Zeidler.
p. cm. - (Applied mathematical sciences ; v. 109)
Includes bibliographical references and index.
ISBN 978-1-4612-6913-7 ISBN 978-1-4612-0821-1 (eBook)
DOI 10.1007/978-1-4612-0821-1
1. Functional analysis. I. Title. II. Series: Applied
mathematical sciences (Springer-Verlag New York Inc.) ; v. 109.
QA1.A647 vol. 109
[QA320]
510 s-dc20 94-41480
[515'.7]
Printed on acid-free paper.
© 1995 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc in 1995
Softcover reprint of the hardcover 1st edition 1995
All rights reserved. This work may not be translated or copied in whole or in part without
the written permission of the publisher Springer Science+Business Media, LLC,
except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.
Production managed by Laura Carlson; manufacturing supervised by Joe Quatela.
Photocomposed copy prepared using LaTeX.
98765432
ISBN 978-1-4612-6913-7 SPIN 10681507
Dedicated in gratitude to my teacher
Professor Herbert Beckert
on the occasion of his 75th birthday
Everything should be made
as simple as possible, but not simpler.
Albert Einstein
David Hilbert Stefan Banach
(1862-1943) (1892-1945)
John von Neumann
(1903-1957)
Preface
A theory is the more impressive,
the simpler are its premises,
the more distinct are the things it connects,
and the broader is its range of applicability.
Albert Einstein
There are two different ways of teaching mathematics, namely,
(i) the systematic way, and
(ii) the application-oriented way.
More precisely, by (i), I mean a systematic presentation of the material
governed by the desire for mathematical perfection and completeness of
the results. In contrast to (i), approach (ii) starts out from the question
"What are the most important applications?" and then tries to answer this
question as quickly as possible. Here, one walks directly on the main road
and does not wander into all the nice and interesting side roads.
The present book is based on the second approach. It is addressed to
undergraduate and beginning graduate students of mathematics, physics,
and engineering who want to learn how functional analysis elegantly solves
mathematical problems that are related to our real world and that have
played an important role in the history of mathematics. The reader should
sense that the theory is being developed, not simply for its own sake, but
for the effective solution of concrete problems.
Our introduction to applied functional analysis is divided into two parts:
Part I: Applications to Mathematical Physics (AMS Vol. 108);
Part II: Main Principles and Their Applications (AMS Vol. 109).
A detailed discussion of the contents can be found in the preface to AMS
Vol. 108.
As prerequisites for the present volume, we only assume that the reader
is familiar with some basic facts about normed spaces as summarized in
Section 1.27 of AMS Vol. 108. The most important propositions are called
theorems. A list of these theorems, along with the most important defini-
tions, can be found at the end of the book.
The presentation of material takes into account that, in general, no book
is read completely from beginning to end. We hope that even a quick skim-
ming of the text will suffice to grasp the essential contents. To this end,
we recommend reading the introductions to the individual chapters, the
"theorems" (without proofs), and the examples (without proofs) as well as
the motivations and comments in the text, which point out the meaning of
the specific results. The proofs are worked out in great detail. Grasping the
individual steps in the proofs as well as their essential ideas is made easier
by the careful organization. It is a truism that only a precise study of the
proofs enables one to penetrate more deeply into a mathematical theory.
The book is based on lectures I have given for students of mathematics
and physics at Leipzig University. The manuscript has been finished during
a stay at the "Sonderforschungsbereich 256" of Bonn University and at
the Max Planck Institute for Mathematics in Bonn. I would like to thank
Professors Stefan Hildebrandt and Friedrich Hirzebruch for the invitations
and the kind hospitality. Finally, my special thanks are due to Springer-
Verlag for the harmonious collaboration.
I hope that the reader of this book enjoys getting a feel for the unity of
mathematics by discovering interrelations between apparently completely
different subjects.
Leipzig Eberhard Zeidler
Spring 1995
Contents
Preface
Contents of AMS Volume 108
1 The Hahn-Banach Theorem and Optimization Problems
1.1 The Hahn-Banach Theorem
1.2 Applications to the Separation of Convex Sets
1.3 The Dual Space C[a, b]*
1.4 Applications to the Moment Problem
1.5 Minimum Norm Problems and Duality Theory
1.6 Applications to Čebyšev Approximation
1.7 Applications to the Optimal Control of Rockets
2 Variational Principles and Weak Convergence
2.1 The nth Variation
2.2 Necessary and Sufficient Conditions for Local Extrema and the Classical Calculus of Variations
2.3 The Lack of Compactness in Infinite-Dimensional Banach Spaces
2.4 Weak Convergence
2.5 The Generalized Weierstrass Existence Theorem
2.6 Applications to the Calculus of Variations
2.7 Applications to Nonlinear Eigenvalue Problems
2.8 Reflexive Banach Spaces
2.9 Applications to Convex Minimum Problems and Variational Inequalities
2.10 Applications to Obstacle Problems in Elasticity
2.11 Saddle Points
2.12 Applications to Duality Theory
2.13 The von Neumann Minimax Theorem on the Existence of Saddle Points
2.14 Applications to Game Theory
2.15 The Ekeland Principle about Quasi-Minimal Points
2.16 Applications to a General Minimum Principle via the Palais-Smale Condition
2.17 Applications to the Mountain Pass Theorem
2.18 The Galerkin Method and Nonlinear Monotone Operators
2.19 Symmetries and Conservation Laws (The Noether Theorem)
2.20 The Basic Ideas of Gauge Field Theory
2.21 Representations of Lie Algebras
2.22 Applications to Elementary Particles
3 Principles of Linear Functional Analysis
3.1 The Baire Theorem
3.2 Application to the Existence of Nondifferentiable Continuous Functions
3.3 The Uniform Boundedness Theorem
3.4 Applications to Cubature Formulas
3.5 The Open Mapping Theorem
3.6 Product Spaces
3.7 The Closed Graph Theorem
3.8 Applications to Factor Spaces
3.9 Applications to Direct Sums and Projections
3.10 Dual Operators
3.11 The Exactness of the Duality Functor
3.12 Applications to the Closed Range Theorem and to Fredholm Alternatives
4 The Implicit Function Theorem
4.1 m-Linear Bounded Operators
4.2 The Differential of Operators and the Fréchet Derivative
4.3 Applications to Analytic Operators
4.4 Integration
4.5 Applications to the Taylor Theorem
4.6 Iterated Derivatives
4.7 The Chain Rule
4.8 The Implicit Function Theorem
4.9 Applications to Differential Equations
4.10 Diffeomorphisms and the Local Inverse Mapping Theorem
4.11 Equivalent Maps and the Linearization Principle
4.12 The Local Normal Form for Nonlinear Double Splitting Maps
4.13 The Surjective Implicit Function Theorem
4.14 Applications to the Lagrange Multiplier Rule
5 Fredholm Operators
5.1 Duality for Linear Compact Operators
5.2 The Riesz-Schauder Theory on Hilbert Spaces
5.3 Applications to Integral Equations
5.4 Linear Fredholm Operators
5.5 The Riesz-Schauder Theory on Banach Spaces
5.6 Applications to the Spectrum of Linear Compact Operators
5.7 The Parametrix
5.8 Applications to the Perturbation of Fredholm Operators
5.9 Applications to the Product Index Theorem
5.10 Fredholm Alternatives via Dual Pairs
5.11 Applications to Integral Equations and Boundary-Value Problems
5.12 Bifurcation Theory
5.13 Applications to Nonlinear Integral Equations
5.14 Applications to Nonlinear Boundary-Value Problems
5.15 Nonlinear Fredholm Operators
5.16 Interpolation Inequalities
5.17 Applications to the Navier-Stokes Equations
References
List of Symbols
List of Theorems
List of Most Important Definitions
Subject Index
Contents of AMS Volume 108
Preface
Prologue
Contents of AMS Volume 109
1 Banach Spaces and Fixed-Point Theorems
1.1 Linear Spaces and Dimension
1.2 Normed Spaces and Convergence
1.3 Banach Spaces and the Cauchy Convergence Criterion
1.4 Open and Closed Sets
1.5 Operators
1.6 The Banach Fixed-Point Theorem and the Iteration
Method
1.7 Applications to Integral Equations
1.8 Applications to Ordinary Differential Equations
1.9 Continuity
1.10 Convexity
1.11 Compactness
1.12 Finite-Dimensional Banach Spaces and Equivalent Norms
1.13 The Minkowski Functional and Homeomorphisms
1.14 The Brouwer Fixed-Point Theorem
1.15 The Schauder Fixed-Point Theorem
1.16 Applications to Integral Equations
1.17 Applications to Ordinary Differential Equations
1.18 The Leray-Schauder Principle and a priori Estimates
1.19 Sub- and Supersolutions, and the Iteration Method in
Ordered Banach Spaces
1.20 Linear Operators
1.21 The Dual Space
1.22 Infinite Series in Normed Spaces
1.23 Banach Algebras and Operator Functions
1.24 Applications to Linear Differential Equations in Banach
Spaces
1.25 Applications to the Spectrum
1.26 Density and Approximation
1.27 Summary of Important Notions
2 Hilbert Spaces, Orthogonality, and the Dirichlet
Principle
2.1 Hilbert Spaces
2.2 Standard Examples
2.3 Bilinear Forms
2.4 The Main Theorem on Quadratic Variational Problems
2.5 The Functional Analytic Justification of the Dirichlet
Principle
2.6 The Convergence of the Ritz Method for Quadratic
Variational Problems
2.7 Applications to Boundary-Value Problems, the Method of
Finite Elements, and Elasticity
2.8 Generalized Functions and Linear Functionals
2.9 Orthogonal Projection
2.10 Linear Functionals and the Riesz Theorem
2.11 The Duality Map
2.12 Duality for Quadratic Variational Problems
2.13 The Linear Orthogonality Principle
2.14 Nonlinear Monotone Operators
2.15 Applications to the Nonlinear Lax-Milgram Theorem and
the Nonlinear Orthogonality Principle
3 Hilbert Spaces and Generalized Fourier Series
3.1 Orthonormal Series
3.2 Applications to Classical Fourier Series
3.3 The Schmidt Orthogonalization Method
3.4 Applications to Polynomials
3.5 Unitary Operators
3.6 The Extension Principle
3.7 Applications to the Fourier Transformation
3.8 The Fourier Transform of Tempered Generalized Functions
4 Eigenvalue Problems for Linear Compact Symmetric
Operators
4.1 Symmetric Operators
4.2 The Hilbert-Schmidt Theory
4.3 The Fredholm Alternative
4.4 Applications to Integral Equations
4.5 Applications to Boundary-Eigenvalue Problems
5 Self-Adjoint Operators, the Friedrichs Extension and
the Partial Differential Equations of Mathematical
Physics
5.1 Extensions and Embeddings
5.2 Self-Adjoint Operators
5.3 The Energetic Space
5.4 The Energetic Extension
5.5 The Friedrichs Extension of Symmetric Operators
5.6 Applications to Boundary-Eigenvalue Problems for the
Laplace Equation
5.7 The Poincaré Inequality and Rellich's Compactness
Theorem
5.8 Functions of Self-Adjoint Operators
5.9 Semigroups, One-Parameter Groups, and Their Physical
Relevance
5.10 Applications to the Heat Equation
5.11 Applications to the Wave Equation
5.12 Applications to the Vibrating String and the Fourier
Method
5.13 Applications to the Schrödinger Equation
5.14 Applications to Quantum Mechanics
5.15 Generalized Eigenfunctions
5.16 Trace Class Operators
5.17 Applications to Quantum Statistics
5.18 C*-Algebras and the Algebraic Approach to Quantum
Statistics
5.19 The Fock Space in Quantum Field Theory and the Pauli
Principle
5.20 A Look at Scattering Theory
5.21 The Language of Physicists in Quantum Physics and the
Justification of the Dirac Calculus
5.22 The Euclidean Strategy in Quantum Physics
5.23 Applications to Feynman's Path Integral
5.24 The Importance of the Propagator in Quantum Physics
5.25 A Look at Solitons and Inverse Scattering Theory
Epilogue
Appendix
References
Hints for Further Reading
List of Symbols
List of Theorems
List of the Most Important Definitions
Subject Index
1
The Hahn-Banach Theorem
and Optimization Problems
The most practical solution is a good theory.
Albert Einstein
True optimization is the revolutionary contribution of modern re-
search to decision processes.
George Bernhard Dantzig
(born 1914)
The Hahn-Banach theorem is the most important theorem about the
structure of linear continuous functionals on normed spaces. In terms of
geometry, the Hahn-Banach theorem guarantees the separation of convex
sets in normed spaces by hyperplanes. Figure 1.1 describes a number of
important consequences of the Hahn-Banach theorem that will be studied
in this chapter and the following one.
In this chapter we want to show that the Hahn-Banach theorem repre-
sents a fundamental existence principle in linear functional analysis that
allows the solution of variational problems without using any compactness.
In the next chapter, we will study variational problems by employing weak
convergence, which is related to a generalized compactness concept.
The Hahn-Banach theorem was proved independently by Hahn in 1926
and by Banach in 1929. The discovery of this theorem was closely related
to the famous classical moment problem.
FIGURE 1.1. Diagram of consequences of the Hahn-Banach theorem, with the nodes: the dual space $C[a, b]^*$; reflexive Banach spaces and duality; weak convergence; minimum norm problems and duality in optimization; variational principle (generalized Weierstrass existence theorem); moment problem; optimal control of rockets; Čebyšev approximation; calculus of variations.
1.1 The Hahn-Banach Theorem
Theorem 1.A (The Hahn-Banach theorem for linear spaces). We assume
that
(i) $L$ is a linear subspace of the real linear space $X$.
(ii) $p\colon X \to \mathbb{R}$ is a sublinear functional, that is, for all $u, v \in X$ and all
$\alpha \ge 0$,
$$p(u + v) \le p(u) + p(v) \quad\text{and}\quad p(\alpha u) = \alpha p(u).$$
(iii) $F\colon L \to \mathbb{R}$ is a linear functional such that
$$F(u) \le p(u) \quad\text{for all } u \in L. \tag{1}$$
Then, $F$ can be extended to a linear functional $f\colon X \to \mathbb{R}$ such that
$$f(u) \le p(u) \quad\text{for all } u \in X. \tag{1*}$$
Proof. Step 1: We first prove the statement in the special case where
$$X = L + \operatorname{span}\{v\} \quad\text{with fixed } v \notin L.$$
To this end, we set
$$f(u + \lambda v) := F(u) + c\lambda \quad\text{for all } u \in L,\ \lambda \in \mathbb{R},$$
where $c$ is a fixed real number that satisfies the following condition:
$$\sup_{u \in L}\bigl(F(u) - p(u - v)\bigr) \le c \le \inf_{w \in L}\bigl(p(w + v) - F(w)\bigr). \tag{2}$$
We have to show that such a number $c$ exists. In fact, for all $u, w \in L$, we
get
$$F(u) + F(w) = F(u + w) \le p(u + w) = p(u - v + w + v) \le p(u - v) + p(w + v),$$
and hence
$$F(u) - p(u - v) \le p(w + v) - F(w) \quad\text{for all } u, w \in L.$$
Thus the supremum in (2) does not exceed the infimum, and a number $c$ satisfying (2) exists.
Obviously, the functional $f\colon X \to \mathbb{R}$ is linear. Thus, it remains to show
that
$$F(u) + c\lambda \le p(u + \lambda v) \quad\text{for all } u \in L,\ \lambda \in \mathbb{R}. \tag{3}$$
In fact, this is true for $\lambda = 0$. Let $\lambda > 0$. By (2),
$$c \le p(\lambda^{-1}u + v) - F(\lambda^{-1}u) = \lambda^{-1}\bigl(p(u + \lambda v) - F(u)\bigr).$$
This is (3). In the case where $\lambda < 0$, it follows from (2) that
$$c \ge F(-\lambda^{-1}u) - p(-\lambda^{-1}u - v) = -\lambda^{-1}\bigl(F(u) - p(u + \lambda v)\bigr),$$
and again we get (3).
Step 2: Induction. Suppose that there exists a sequence $(L_n)$ of linear
subspaces of $X$ such that $L = L_1 \subseteq L_2 \subseteq \cdots$ along with
$$X = \bigcup_{n=1}^{\infty} L_n,$$
where
$$L_{n+1} = L_n + \operatorname{span}\{v_n\} \quad\text{for some fixed } v_n \in X \text{ and } v_n \notin L_n,$$
and for all $n$. Using Step 1, a simple induction argument shows that $F$ can
be extended to $L_n$ for all $n$. This yields the desired extension $f$ of $F$.
Step 3: If the situation from Step 2 is not at hand, then we can use the
Zorn lemma from the appendix of AMS Vol. 108. To this end, let $\mathcal{C}$ denote
the set of all the linear functionals
$$g\colon D(g) \subseteq X \to \mathbb{R}$$
that are an extension of $F$ such that
$$g(u) \le p(u) \quad\text{for all } u \in D(g).$$
We write
$$g \le h$$
iff $h\colon D(h) \to \mathbb{R}$ is an extension of $g\colon D(g) \to \mathbb{R}$.
This way $\mathcal{C}$ becomes an ordered set. Let $\mathcal{T}$ be a totally ordered subset of
$\mathcal{C}$, that is, $g, h \in \mathcal{T}$ implies
$$g \le h \quad\text{or}\quad h \le g.$$
Then, there exists an upper bound $b \in \mathcal{C}$ for $\mathcal{T}$, that is,
$$g \le b \quad\text{for all } g \in \mathcal{T}.$$
To show this, let $D(b)$ be the union of all the sets $D(g)$ with $g \in \mathcal{T}$ and
define
$$b(u) := g(u) \quad\text{on } D(g).$$
Since $\mathcal{T}$ is totally ordered, the linear functional $b\colon D(b) \to \mathbb{R}$ is well defined,
and $b(u) \le p(u)$ for all $u \in D(b)$.
By the Zorn lemma, there is a maximal element $f$ in $\mathcal{C}$. That is, the
linear functional $f\colon D(f) \subseteq X \to \mathbb{R}$ has no proper extension in the sense of
$\mathcal{C}$. This implies $D(f) = X$ and $f(u) \le p(u)$ for all $u \in X$. In fact, suppose
that $D(f) \ne X$. Then, there exists a proper extension of $f$ in the sense of $\mathcal{C}$,
by Step 1. This contradicts the maximality of $f$. Thus, $f\colon X \to \mathbb{R}$ is the
desired extension of $F$. $\Box$
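Condition (2) in Step 1 can be explored numerically. The following minimal sketch makes its own choices, none of which come from the text: $X = \mathbb{R}^2$, $p$ is the $\ell^1$-norm, $L = \operatorname{span}\{(1,0)\}$, $F(t,0) = t$, and $v = (0,1)$; the names `lower`, `upper`, `extend`, and `dominated` are hypothetical helpers. For this choice one finds analytically that (2) admits exactly the values $c \in [-1, 1]$, and the grid computation should reproduce this up to rounding.

```python
# Hypothetical finite-dimensional illustration of Step 1; all names are ours.

def p(u):
    """Sublinear functional on R^2: the l1-norm."""
    return abs(u[0]) + abs(u[1])

def F(t):
    """Linear functional on L = span{(1, 0)}, F(t, 0) = t; note F <= p on L."""
    return t

v = (0.0, 1.0)  # fixed direction outside L

# Sample L by the points (t, 0) with t on a grid in [-10, 10].
grid = [k / 10.0 for k in range(-100, 101)]

# Left and right sides of condition (2), approximated over the grid.
lower = max(F(t) - p((t - v[0], 0.0 - v[1])) for t in grid)
upper = min(p((t + v[0], 0.0 + v[1])) - F(t) for t in grid)

def extend(c):
    """Extension f(u + lambda*v) = F(u) + c*lambda, written in coordinates."""
    return lambda x: F(x[0]) + c * x[1]

def dominated(f, n=21):
    """Check f <= p on an n-by-n grid of points in [-2, 2]^2."""
    pts = [(-2 + 4 * i / (n - 1), -2 + 4 * j / (n - 1))
           for i in range(n) for j in range(n)]
    return all(f(x) <= p(x) + 1e-12 for x in pts)

print(lower, upper)             # approximate admissible interval for c
print(dominated(extend(0.5)))   # c inside the interval
print(dominated(extend(1.5)))   # c outside the interval
```

The grid check is only a sanity test on finitely many points; the proof above is what guarantees (3) on all of $X$.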
Theorem 1.B (The Hahn-Banach theorem for normed spaces¹). We assume that
(i) $L$ is a linear subspace of the normed space $X$ over $\mathbb{K}$.
(ii) $F\colon L \to \mathbb{K}$ is a linear functional such that
$$|F(u)| \le \alpha\|u\| \quad\text{for all } u \in L \text{ and fixed } \alpha \ge 0. \tag{4}$$
Then, $F$ can be extended to a linear continuous functional $f\colon X \to \mathbb{K}$
such that
$$|f(u)| \le \alpha\|u\| \quad\text{for all } u \in X. \tag{4*}$$
¹ Basic notions on normed spaces are summarized in Section 1.27 of AMS Vol. 108.
Proof. Step 1: Let $\mathbb{K} = \mathbb{R}$. Set
$$p(u) := \alpha\|u\| \quad\text{for all } u \in X.$$
By Theorem 1.A, the functional $F$ can be extended to a linear functional
$f\colon X \to \mathbb{R}$ such that
$$f(u) \le \alpha\|u\| \quad\text{for all } u \in X.$$
Since $f(\pm u) = \pm f(u)$, we get (4*). Thus, $f$ is continuous.
Step 2: Let $\mathbb{K} = \mathbb{C}$. Define
$$H(u) := \operatorname{Re} F(u) \quad\text{for all } u \in L.$$
Then,
$$F(u) = \operatorname{Re} F(u) + i \operatorname{Im} F(u) = \operatorname{Re} F(u) - i \operatorname{Re} F(iu) = H(u) - iH(iu) \quad\text{for all } u \in L,$$
and
$$|H(u)| \le \alpha\|u\| \quad\text{for all } u \in L.$$
If we regard $X$ as a real normed space, then it follows from Step 1 that
there exists a linear continuous functional $h\colon X \to \mathbb{R}$ such that $h(u) = H(u)$
for all $u \in L$ and
$$|h(u)| \le \alpha\|u\| \quad\text{for all } u \in X.$$
Define
$$f(u) := h(u) - ih(iu) \quad\text{for all } u \in X.$$
Hence $h(u) = \operatorname{Re} f(u)$. We want to show that $f$ is the desired functional.
Obviously, $f\colon X \to \mathbb{C}$ is an extension of $F$, since $f(u) = H(u) - iH(iu) = F(u)$ for all $u \in L$. Moreover, $f$ is linear. This
follows from
$$f(iu) = h(iu) - ih(-u) = if(u) \quad\text{for all } u \in X,$$
and from the linearity of $h$ with respect to $\mathbb{R}$. Finally, we have to show that
$$|f(u)| \le \alpha\|u\| \quad\text{for all } u \in X.$$
In fact, for each $u \in X$, we get $f(u) = re^{i\beta}$ with $r \ge 0$. Hence
$$|f(u)| = r = \operatorname{Re}\bigl(e^{-i\beta} f(u)\bigr) = \operatorname{Re} f(e^{-i\beta}u) = h(e^{-i\beta}u) \le \alpha\|e^{-i\beta}u\| = \alpha\|u\|. \qquad \Box$$
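The formula $f(u) := h(u) - ih(iu)$ from Step 2 can be checked on a small example. The sketch below is an illustration under our own assumptions: $X = \mathbb{C}^2$, a complex-linear functional $F$ with arbitrarily chosen coefficients `a`, and $h = \operatorname{Re} F$; none of these choices come from the text. It confirms numerically that the formula recovers $F$ from its real part and that $f(iu) = if(u)$.

```python
# Sketch: recover a complex-linear functional from its real part (Step 2).
# The coefficients below are hypothetical; any complex numbers would do.
a = (2 - 1j, 0.5 + 3j)

def F(u):
    """Complex-linear functional F(u) = a_1 u_1 + a_2 u_2 on C^2."""
    return a[0] * u[0] + a[1] * u[1]

def h(u):
    """Real-linear functional h = Re F."""
    return F(u).real

def f(u):
    """The formula from the proof: f(u) = h(u) - i h(iu)."""
    iu = (1j * u[0], 1j * u[1])
    return h(u) - 1j * h(iu)

u = (1 + 2j, -3 + 0.5j)
iu = (1j * u[0], 1j * u[1])

print(abs(f(u) - F(u)))        # f reproduces F (up to rounding)
print(abs(f(iu) - 1j * f(u)))  # complex homogeneity: f(iu) = i f(u)
```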
Standard Example 1. Let $X$ be a normed space over $\mathbb{K}$. Then, for each
given $u_0 \in X$ with $u_0 \ne 0$, there exists a functional $f \in X^*$ such that²
$$f(u_0) = \|u_0\| \quad\text{and}\quad \|f\| = 1.$$
² Recall from Section 1.21 of AMS Vol. 108 that the dual space $X^*$ to $X$
consists of all linear continuous functionals $f\colon X \to \mathbb{K}$.
Proof. Set $L := \operatorname{span}\{u_0\}$ and
$$F(u) := \lambda\|u_0\| \quad\text{for all } u \in L, \text{ where } u = \lambda u_0.$$
Obviously, $|F(u)| = \|u\|$ for all $u \in L$. By Theorem 1.B, there exists a
functional $f \in X^*$ such that $f(u) = F(u)$ on $L$ and
$$|f(u)| \le \|u\| \quad\text{for all } u \in X.$$
Thus $\|f\| \le 1$, and $f(u_0) = \|u_0\|$ yields $\|f\| = 1$. $\Box$
Corollary 2. Let $X$ be a normed space over $\mathbb{K}$. Then, for all $u_0 \in X$,
$$\|u_0\| = \max_{f \in X^*,\ \|f\| \le 1} |f(u_0)|.$$
Proof. Since $|f(u_0)| \le \|f\|\,\|u_0\|$ for all $f \in X^*$, the assertion follows from
Standard Example 1. $\Box$
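In finite dimensions the norming functional of Standard Example 1 can be written down explicitly, which gives a quick check of Corollary 2. The sketch below assumes, purely for illustration, that $X = \mathbb{R}^4$ carries the $\ell^1$-norm; a linear functional is then a coefficient vector, and its dual norm is the largest absolute value of its coefficients. The helper names are hypothetical.

```python
import random

def norm1(u):
    """The l1-norm on R^n."""
    return sum(abs(x) for x in u)

def norming_functional(u0):
    """Coefficients of a functional f with f(u0) = ||u0||_1 and dual norm 1."""
    return [1.0 if x >= 0 else -1.0 for x in u0]

def apply(coeffs, u):
    """Evaluate the functional with the given coefficients at u."""
    return sum(c * x for c, x in zip(coeffs, u))

u0 = [3.0, -1.5, 0.0, 2.0]
f = norming_functional(u0)

# Sample functionals g with dual norm <= 1 (coefficients in [-1, 1]).
random.seed(0)
samples = [[random.uniform(-1, 1) for _ in u0] for _ in range(1000)]
best_sampled = max(abs(apply(g, u0)) for g in samples)

print(apply(f, u0), norm1(u0))  # the maximum in Corollary 2 is attained at f
print(best_sampled)             # no sampled functional exceeds ||u0||
```

In infinite-dimensional spaces no such explicit formula is available in general, which is where the Hahn-Banach theorem is needed.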
Corollary 3. Let $X$ be a normed space over $\mathbb{K}$. Then, it follows from $u \in X$
and
$$f(u) = 0 \quad\text{for all } f \in X^*$$
that $u = 0$.
This is an immediate consequence of Standard Example 1.
1.2 Applications to the Separation of Convex Sets
Definition 1. By a closed hyperplane $H$ in the real normed space $X$, we
understand a set
$$H := \{u \in X\colon f(u) = \alpha\},$$
where $f\colon X \to \mathbb{R}$ is a linear continuous functional and $\alpha$ is a fixed real
number. We also define the half-spaces $H_\le$ and $H_>$ of $H$ through
$$H_\le := \{u \in X\colon f(u) \le \alpha\} \quad\text{and}\quad H_> := \{u \in X\colon f(u) > \alpha\}.$$
The half-space $H_\ge$ is defined analogously.
Let $A$ and $B$ be two subsets of $X$. Then, we say that the closed hyperplane
$H$ strictly separates the sets $A$ and $B$ iff
$$A \subseteq H_\le \quad\text{and}\quad B \subseteq H_>.$$
Furthermore, we say that the closed hyperplane $H$ separates the sets $A$ and
$B$ iff $A \subseteq H_\le$ and $B \subseteq H_\ge$.
Example 2. Let $X := \mathbb{R}^2$. Then, every closed hyperplane $H$ in $X$ is given
through
$$H := \{(\xi, \eta) \in \mathbb{R}^2\colon a\xi + b\eta = \alpha\},$$
where $a$, $b$, and $\alpha$ are fixed real numbers. In Figure 1.2, the sets $A$ and $B$
are strictly separated by $H$.
FIGURE 1.2. Panels (a) and (b).
FIGURE 1.3.
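Proposition 3. Let $L$ be a linear subspace of the normed space $X$ over $\mathbb{K}$.
Then, for each point $u_0 \in X$ with
$$\operatorname{dist}(u_0, L) > 0,$$
there exists a linear continuous functional $f\colon X \to \mathbb{K}$ such that
$$f(u) = 0 \quad\text{for all } u \in L,$$
along with $\|f\| = 1$ and $f(u_0) = \operatorname{dist}(u_0, L)$.
Recall that
$$\operatorname{dist}(u_0, L) := \inf_{v \in L} \|u_0 - v\|. \tag{5}$$
If $X$ is a real normed space, then this means that the closed hyperplane
$$H := \{u \in X\colon f(u) = 0\}$$
strictly separates the linear subspace $L$ and the point $u_0$, where $L \subseteq H$
(see Figure 1.3).
Proof. Set $L_0 := \operatorname{span}\{u_0\} + L$. Then, $u \in L_0$ iff
$$u = \lambda u_0 + v, \quad\text{where } \lambda \in \mathbb{K} \text{ and } v \in L.$$
This representation of $u$ is unique. In fact, it follows from $u = \mu u_0 + w$ with
$\mu \in \mathbb{K}$ and $w \in L$ that $(\mu - \lambda)u_0 = v - w$. Hence $\mu - \lambda = 0$ and $v - w = 0$
because $u_0 \notin L$ and $v - w \in L$. Define
$$F(u) := \lambda \operatorname{dist}(u_0, L) \quad\text{for all } u \in L_0.$$
Obviously, $F\colon L_0 \to \mathbb{K}$ is linear. Furthermore,
$$|F(u)| \le \|u\| \quad\text{for all } u \in L_0.$$
This follows from
$$\|\lambda u_0 + v\| = |\lambda|\,\|u_0 + \lambda^{-1}v\| \ge |\lambda| \operatorname{dist}(u_0, L)$$
for all $v \in L$ and $\lambda \in \mathbb{K}$ with $\lambda \ne 0$. According to Theorem 1.B, $F$ can be
extended to a linear continuous functional $f\colon X \to \mathbb{K}$ such that
$$|f(u)| \le \|u\| \quad\text{for all } u \in X.$$
In particular, $f(u) = F(u) = 0$ for all $u \in L$, and $f(u_0) = F(u_0) = \operatorname{dist}(u_0, L)$.
By (5), for each $\varepsilon > 0$, there is a $v \in L$ such that $\|u_0 - v\| < \operatorname{dist}(u_0, L) + \varepsilon$.
Since $f = F$ on $L_0$, we get $f(u_0 - v) = \operatorname{dist}(u_0, L)$, and hence
$$\frac{f(u_0 - v)}{\|u_0 - v\|} > \frac{\operatorname{dist}(u_0, L)}{\operatorname{dist}(u_0, L) + \varepsilon}.$$
Letting $\varepsilon \to 0$, this implies that $\|f\| \ge 1$, and hence $\|f\| = 1$. $\Box$
FIGURE 1.4.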
Theorem 1.C. Let $M$ be a nonempty closed convex subset of a normed
space $X$ over $\mathbb{K}$, and let $u_0$ be a point of $X$ with $u_0 \notin M$.
Then, there exists a linear continuous functional $f\colon X \to \mathbb{K}$ such that
$$\operatorname{Re} f(u) \le 1 \ \text{ for all } u \in M \quad\text{and}\quad \operatorname{Re} f(u_0) > 1.$$
In terms of geometry, this theorem tells us the following. Let $X$ be a real
normed space. Set
$$H := \{u \in X\colon f(u) = 1\}.$$
Then, the closed hyperplane $H$ separates the set $M$ and the point $u_0$ (see
Figure 1.4).
Proof. Since $u_0 \notin M$ and $M$ is closed, we get
$$d := \operatorname{dist}(u_0, M) > 0.$$
Otherwise, there exists a sequence $(v_n)$ in $M$ such that $\|u_0 - v_n\| \to 0$ as
$n \to \infty$, and hence $u_0 \in M$. This is a contradiction. Define
$$M_d := \Bigl\{u \in X\colon \operatorname{dist}(u, M) < \frac{d}{2}\Bigr\}.$$
Since $M$ is convex, so is $M_d$. In fact, if $u_j \in M_d$, $j = 1, 2$, then there exist
points $v_j \in M$ such that $\|u_j - v_j\| < \frac{d}{2}$ for $j = 1, 2$. For all $t \in [0, 1]$,
$$\|tu_1 + (1 - t)u_2 - (tv_1 + (1 - t)v_2)\| \le t\|u_1 - v_1\| + (1 - t)\|u_2 - v_2\| < \frac{d}{2},$$
and hence $M_d$ is convex. Furthermore, let
$$\mathcal{M} := \text{closure of } M_d.$$
Using sequences, it follows easily that the closure of each convex set is again
convex.
Summarizing, $\mathcal{M}$ is a closed convex set such that $u_0 \notin \mathcal{M}$, and $\mathcal{M}$ also
has an interior point. In addition, $M \subseteq \mathcal{M}$. By Section 1.13 of AMS Vol.
108, where $0$ is required to be an interior point of the set (this holds, for instance, if $0 \in M$), the Minkowski functional $p$ of $\mathcal{M}$ is sublinear and we get
$$\mathcal{M} = \{u \in X\colon p(u) \le 1\}, \tag{6}$$
along with
$$0 \le p(u) \le c\|u\| \quad\text{for all } u \in X \text{ and fixed } c > 0.$$
Step 1: Let $\mathbb{K} = \mathbb{R}$. Define $L := \operatorname{span}\{u_0\}$ and
$$F(\lambda u_0) := \lambda p(u_0) \quad\text{for all } \lambda \in \mathbb{R}.$$
Then,
$$F(u) \le p(u) \quad\text{for all } u \in L.$$
In fact, if $\lambda \ge 0$, then $F(\lambda u_0) = p(\lambda u_0)$, and if $\lambda < 0$, then $F(\lambda u_0) \le 0 \le p(\lambda u_0)$.
According to the Hahn-Banach theorem (Theorem 1.A), $F$ can be extended
to a linear functional $f\colon X \to \mathbb{R}$ such that
$$f(u) \le p(u) \le c\|u\| \quad\text{for all } u \in X.$$
Since $f(\pm u) = \pm f(u)$, we get $|f(u)| \le c\|u\|$ for all $u \in X$, and hence $f$ is
continuous on $X$.
Finally, we obtain that
$$f(u) \le p(u) \le 1 \quad\text{for all } u \in M,$$
and $f(u_0) = F(u_0) = p(u_0) > 1$, by $u_0 \notin \mathcal{M}$ and (6). This proves the
assertion.
Step 2: Let $\mathbb{K} = \mathbb{C}$. If we regard $X$ as a real normed space, then we may
construct the linear continuous functional $f\colon X \to \mathbb{R}$ as in Step 1. Then,
the functional $h\colon X \to \mathbb{C}$ defined by
$$h(u) := f(u) - if(iu) \quad\text{for all } u \in X$$
has the desired properties. This follows by a similar argument as in the
proof of Theorem 1.B. $\Box$
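The construction in the proof can be traced in a concrete planar case. The sketch below rests on assumptions that are ours and not the text's: $X = \mathbb{R}^2$ with the Euclidean norm, $M$ the closed unit disc (so that $0 \in M$), and $u_0 = (2, 0)$. The Minkowski functional of the closure of $M_d$ is computed by bisection in the hypothetical helper `gauge`, and the extension `f` of $F(\lambda u_0) = \lambda p(u_0)$ is chosen by hand rather than obtained from Theorem 1.A.

```python
import math

def dist_to_disc(u):
    """Distance from u to M, the closed unit disc in R^2 (contains 0)."""
    return max(0.0, math.hypot(u[0], u[1]) - 1.0)

u0 = (2.0, 0.0)
d = dist_to_disc(u0)  # d = dist(u0, M) > 0

def in_fat_set(u):
    """Membership in the closure of M_d = {u : dist(u, M) < d/2}."""
    return dist_to_disc(u) <= d / 2

def gauge(u, hi=1e3, tol=1e-12):
    """Minkowski functional p(u) = inf{t > 0 : u/t in the set}, by bisection."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_fat_set((u[0] / mid, u[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

p_u0 = gauge(u0)

def f(u):
    """One admissible linear extension of F(lambda*u0) = lambda*p(u0)."""
    return p_u0 * u[0] / u0[0]

# Points on the boundary of M, where f is largest on M.
circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
          for k in range(360)]

print(p_u0)                             # p(u0) > 1 since u0 lies outside the set
print(max(f(u) for u in circle), f(u0)) # f <= 1 on M, f(u0) > 1
```

Here the fattened set is the closed disc of radius $1 + d/2$, so the gauge is a multiple of the Euclidean norm; for a general convex set only the bisection step would change.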
1.3 The Dual Space C[a, b]*
Proposition 1. Let -00 < a < b < 00. Then, F E C[a, bJ* iff there exists
a function p: [a, b] -> JR of bounded variation such tha&
F(u) = lb u(x)dp(x) for all u E C[a, bj. (7)
In addition, IIFII = V(p), where V(p) denotes the total variation of p.
The integral (7) represents a Stieltjes integral. Such integrals along with
functions of bounded variation are discussed in the appendix of AMS Vol.
lOS. The proof will be based on the Hahn-Banach theorem.
Proof. We set X := C[a, bj and Ilull := sUPa:S;x:S;b lu(x)l·
Step 1: Let F be as given in (7). By the appendix of AMS Vol. lOS,
IF(u)1 :::; V(p)lIuli for all u E X.
Hence F.E C[a,b]'.
Step 2: Let FE C[a, bj'. We want to prove that F allows a representation
of the form (7).
To this end, let Y denote the space of all bounded functions u: [a, bj -> lR.
Then, Y becomes a normed space with respect to Iluli.
Since X is a linear subspace of Y, it follows from the Hahn-Banach
theorem (Theorem loB) that F can be extended to a linear continuous
functional
f:Y->JR with IIFII = IIfll·
Set p(t):= f(vt} for all t E [a,b]' where
Vt ( x) := {Io ~f a :::; x :::; t
If t < x :::; b.
We will prove in Step 3 ahead that p: [a, bj -> JR is of bounded variation and
V(p) :::; IIFII. (S)
3Recall from Chapter 1 of AMS Vol. 108 that the space O[a, bJ consists of all
continuous functions u: [a, bJ -> JR. The norm on O[a, bJ is given by
Jlull := max lu(x)l·
a::;x:5b
The dual space O[a, b]* consists of all linear continuous functionals on O[a, bJ.
Now let $u \in X$ be given. Consider the partition $a = x_0 < x_1 < \dots < x_n = b$
of the interval $[a,b]$. Then, the continuous function $u\colon [a,b] \to \mathbb{R}$ can
be approximated by the step function
\[ u_n(x) := \sum_{j=1}^{n} u(x_j)\bigl(v_{x_j}(x) - v_{x_{j-1}}(x)\bigr). \]
Hence
\[ f(u_n) = \sum_{j=1}^{n} u(x_j)\bigl(p(x_j) - p(x_{j-1})\bigr). \]
If the partition is made arbitrarily fine as $n \to \infty$, we have
\[ f(u_n) \to \int_a^b u(x)\,dp(x) \quad \text{as } n \to \infty, \]
by the definition of the Stieltjes integral in the appendix of AMS Vol. 108.
On the other hand,
\[ u_n \to u \quad \text{in } Y \text{ as } n \to \infty. \]
Since f is continuous on Y, this implies $f(u_n) \to f(u)$ as $n \to \infty$. Hence
\[ F(u) = f(u) = \int_a^b u(x)\,dp(x) \quad \text{for all } u \in X. \]
This is (7).
By Step 1, we get $\|F\| \le V(p)$, and (8) yields $V(p) \le \|F\|$. Thus, $\|F\| = V(p)$.
Step 3: Proof of (8). Using the partition $\{x_j\}$ of $[a,b]$ from Step 2 and
letting $s_j := \operatorname{sgn}(p(x_j) - p(x_{j-1}))$, we have
\[
\Delta := \sum_{j=1}^{n} |p(x_j) - p(x_{j-1})|
 = \sum_{j=1}^{n} s_j\bigl(p(x_j) - p(x_{j-1})\bigr)
 = \sum_{j=1}^{n} s_j\bigl(f(v_{x_j}) - f(v_{x_{j-1}})\bigr)
 = f\Bigl(\sum_{j=1}^{n} s_j\bigl(v_{x_j} - v_{x_{j-1}}\bigr)\Bigr).
\]
Thus,
\[ \Delta \le \|f\| \, \Bigl\| \sum_{j=1}^{n} s_j\bigl(v_{x_j} - v_{x_{j-1}}\bigr) \Bigr\| = \|f\| = \|F\|, \]
and hence $V(p) \le \|F\|$, by the definition of the total variation $V(p)$ of $p$ in
the appendix of AMS Vol. 108. □
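The sums used in Steps 2 and 3 can be tried out numerically. The sketch below assumes the illustrative choices u(x) = x and p(x) = x^2 on [0, 1] (not taken from the text) and uses uniform partitions; it approximates the Stieltjes integral by the sum over u(x_j)(p(x_j) - p(x_{j-1})) and compares it with the bound |F(u)| <= V(p) ||u||.

```python
def stieltjes_sum(u, p, a, b, n):
    """Sum of u(x_j) * (p(x_j) - p(x_{j-1})) over a uniform partition of [a, b]."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(u(xs[j]) * (p(xs[j]) - p(xs[j - 1])) for j in range(1, n + 1))


def variation_sum(p, a, b, n):
    """Sum of |p(x_j) - p(x_{j-1})| over the same partition (a lower bound for V(p))."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(abs(p(xs[j]) - p(xs[j - 1])) for j in range(1, n + 1))


# Illustrative choices (assumptions, not from the text).
u = lambda x: x
p = lambda x: x * x

F_u = stieltjes_sum(u, p, 0.0, 1.0, 2000)  # approximates int_0^1 x d(x^2) = 2/3
V_p = variation_sum(p, 0.0, 1.0, 2000)     # p is increasing, so V(p) = p(1) - p(0) = 1
sup_u = 1.0                                 # ||u|| = max |x| on [0, 1]
```

Since p is monotone here, the variation sum telescopes to p(1) - p(0) for every partition, so V(p) = 1 and the bound reads |F(u)| <= 1.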
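Example 2. Let $w\colon [a,b] \to \mathbb{R}$ be a continuous function, where $-\infty < a < b < \infty$. Set
\[ F(u) := \int_a^b u(x)\,w(x)\,dx \quad \text{for all } u \in C[a,b]. \]
Then, $F \in C[a,b]^*$ and
\[ \|F\| = \int_a^b |w(x)|\,dx. \]
Proof. Define
\[ p(x) := \int_a^x w(t)\,dt \quad \text{for all } x \in [a,b]. \]
Then
\[ F(u) = \int_a^b u(x)\,dp(x) \quad \text{for all } u \in C[a,b], \]
and $\|F\| = V(p)$, by (5) in the appendix of AMS Vol. 108.
Let $a = x_0 < x_1 < \dots < x_n = b$ be a partition of the interval $[a,b]$. Then
\[ \Delta := \sum_{j=1}^{n} |p(x_j) - p(x_{j-1})| = \sum_{j=1}^{n} \Bigl| \int_{x_{j-1}}^{x_j} w(t)\,dt \Bigr| \le \int_a^b |w(t)|\,dt. \]
Hence
\[ V(p) \le \int_a^b |w(t)|\,dt. \]
By the mean value theorem,
\[ \Delta = \sum_{j=1}^{n} |w(t_j)|\,(x_j - x_{j-1}) \]
for suitable points $t_j \in [x_{j-1}, x_j]$. Making the partition arbitrarily fine as $n \to \infty$, we get
\[ \Delta \to \int_a^b |w(t)|\,dt \quad \text{as } n \to \infty, \]
and hence
\[ V(p) = \int_a^b |w(t)|\,dt. \]
□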
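The identity of Example 2 can be illustrated numerically. The sketch below assumes the weight w(t) = cos t on [0, pi], for which the integral of |w| equals 2; this choice and the helper names are illustrative assumptions. It computes a variation sum of p(x) = sin x and evaluates F at a continuous function u_k with |u_k| <= 1 that approximates sgn(w), so that F(u_k) comes close to the norm.

```python
import math

A, B = 0.0, math.pi
w = math.cos   # illustrative weight; int_0^pi |cos t| dt = 2
p = math.sin   # p(x) = int_0^x cos t dt


def variation_sum(n):
    """Sum of |p(x_j) - p(x_{j-1})| over a uniform partition of [A, B]."""
    xs = [A + (B - A) * j / n for j in range(n + 1)]
    return sum(abs(p(xs[j]) - p(xs[j - 1])) for j in range(1, n + 1))


def F(u, n=20000):
    """Midpoint rule for F(u) = int_A^B u(x) w(x) dx."""
    h = (B - A) / n
    return h * sum(u(A + (j + 0.5) * h) * w(A + (j + 0.5) * h) for j in range(n))


def u_k(x, k=50.0):
    """Continuous function with |u_k| <= 1 that approximates sgn(w)."""
    return max(-1.0, min(1.0, k * w(x)))


V_approx = variation_sum(1000)
F_near_extremal = F(u_k)
```

The value F(u_k) stays below the integral of |w| and approaches it as k grows, which is the numerical counterpart of ||F|| = V(p).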
1.4 Applications to the Moment Problem
The Finite Moment Problem. Let $-\infty < a < b < \infty$. We are given the
real numbers $\mu_0, \mu_1, \dots, \mu_N$ for fixed $N \ge 0$. We are looking for a function
$p\colon [a,b] \to \mathbb{R}$ of bounded variation such that
\[ \int_a^b x^k\,dp(x) = \mu_k \quad \text{for all } k = 0, \dots, N. \tag{9} \]
In terms of physics, we are looking for a charge density p that has the
prescribed moments $\mu_k$, $k = 0, \dots, N$. In particular, $\mu_0$ is equal to the total
charge on the interval $[a,b]$.
Proposition 1. The finite moment problem always has a solution.
Proof. Let $X := C[a,b]$, and let $p_k(x) := x^k$, $k = 0, 1, \dots, N$. Set
\[ L := \operatorname{span}\{p_0, p_1, \dots, p_N\}. \]
Then, the $(N+1)$-dimensional linear subspace L of X consists of all the
real polynomials of order $\le N$. Let $u \in L$. Then
\[ u(x) = a_0 + a_1 x + \dots + a_N x^N \quad \text{for all } x \in [a,b]. \]
Define
\[ F(u) := a_0\mu_0 + a_1\mu_1 + \dots + a_N\mu_N \quad \text{for all } u \in L. \]
Obviously, the functional $F\colon L \to \mathbb{R}$ is linear. This functional is also continuous. To prove this, let
\[ u_n \to u \quad \text{in } L \text{ as } n \to \infty. \]
This implies $u_n(x) \to u(x)$ as $n \to \infty$ uniformly on $[a,b]$. By the well-known
Lagrangian interpolation formula, we get
\[ u_n(x) = \sum_{j=0}^{N} u_n(x_j)\,\phi_j(x) \quad \text{on } [a,b] \text{ for all } n, \tag{10} \]
where $a = x_0 < x_1 < \dots < x_N = b$ is a fixed given partition of $[a,b]$, and
$\phi_0, \dots, \phi_N$ are fixed N-th order polynomials with $\phi_j(x_i) = \delta_{ij}$ for all i, j. It
follows from (10) that the coefficients of $u_n$ converge to the corresponding
coefficients of u as $n \to \infty$. Hence
\[ F(u_n) \to F(u) \quad \text{as } n \to \infty. \]
Since $F\colon L \to \mathbb{R}$ is linear and continuous, we get
\[ |F(u)| \le \text{const}\,\|u\| \quad \text{for all } u \in L, \]
where $\|u\|$ denotes the norm on X, i.e., $\|u\| := \max_{a \le x \le b} |u(x)|$.
According to the Hahn-Banach theorem, F can be extended to a linear
continuous functional $f\colon X \to \mathbb{R}$. By Section 1.3, there exists a function
$p\colon [a,b] \to \mathbb{R}$ of bounded variation such that
\[ f(u) = \int_a^b u(x)\,dp(x) \quad \text{for all } u \in C[a,b]. \]
This implies (9). In fact, $f(p_j) = F(p_j) = \mu_j$. □
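The proof above is nonconstructive. For comparison, one particular solution of (9) can be written down directly, by a different technique than the Hahn-Banach argument: take p to be a step function with jumps c_j at N + 1 distinct nodes x_j in [a, b], so that the Stieltjes integral of x^k becomes the sum of c_j x_j^k, and solve the resulting Vandermonde system. The sketch below assumes equally spaced nodes and the sample moments 1, 1/2, 1/3; the function names are hypothetical.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A c = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def atomic_solution(mu, a, b):
    """Return nodes x_j and jumps c_j with sum_j c_j * x_j**k == mu[k] for all k."""
    N = len(mu) - 1
    nodes = [a + (b - a) * Fraction(j, N) for j in range(N + 1)] if N > 0 else [a]
    vandermonde = [[x ** k for x in nodes] for k in range(N + 1)]
    jumps = solve_linear(vandermonde, list(mu))
    return nodes, jumps


# Sample data (assumption): the moments of dx on [0, 1] for k = 0, 1, 2.
mu = [Fraction(1), Fraction(1, 2), Fraction(1, 3)]
nodes, jumps = atomic_solution(mu, Fraction(0), Fraction(1))
moments = [sum(c * x ** k for x, c in zip(nodes, jumps)) for k in range(len(mu))]
```

For these sample moments the jumps are the Simpson weights 1/6, 2/3, 1/6 at the nodes 0, 1/2, 1. The solution of (9) is far from unique: the function p(x) = x has the same three moments.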
The Moment Problem. Let $-\infty < a < b < \infty$. We are given the real
numbers $\mu_0, \mu_1, \dots$. We are looking for a function $p\colon [a,b] \to \mathbb{R}$ of bounded
variation such that
\[ \int_a^b x^k\,dp(x) = \mu_k \quad \text{for all } k = 0, 1, \dots. \tag{11} \]
Proposition 2. The moment problem has a solution iff there is a constant
$c > 0$ such that
\[ \Bigl| \sum_{k=0}^{N} a_k\mu_k \Bigr| \le c \max_{a \le x \le b} \Bigl| \sum_{k=0}^{N} a_k x^k \Bigr| \quad \text{for all real } a_k \text{ and all } N = 0, 1, 2, \dots. \tag{12} \]
Proof. We use the same notation as in the proof of Proposition 1.
If the moment problem has a solution p, then the functional
\[ F(u) := \int_a^b u(x)\,dp(x) \quad \text{for all } u \in X \]
is linear and continuous on X. Hence
\[ |F(u)| \le c\,\|u\| \quad \text{for all } u \in X. \]
Using $F(p_k) = \mu_k$ and $u = a_0 p_0 + \dots + a_N p_N$, we get (12).
Conversely, suppose that (12) holds true. Let $L := \operatorname{span}\{p_0, p_1, \dots\}$. Define
\[ F(p_k) := \mu_k \quad \text{for all } k = 0, 1, \dots. \]
This way, we obtain a linear functional $F\colon L \to \mathbb{R}$. By (12),
\[ |F(u)| \le c\,\|u\| \quad \text{for all } u \in L. \]
Since L is dense in X, the functional F can be extended to a linear continuous
functional $F\colon X \to \mathbb{R}$, by the extension principle from Section 3.6 of
AMS Vol. 108. According to Section 1.3, there exists a function $p\colon [a,b] \to \mathbb{R}$
of bounded variation such that
\[ F(u) = \int_a^b u(x)\,dp(x) \quad \text{for all } u \in C[a,b]. \]
This implies (11). □
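Two quick checks of condition (12) may help to see what it demands. Both are illustrative examples under stated assumptions (the point $\xi$ and the sequence $\mu_k = 2^k$ are chosen here, not taken from the text).

```latex
% Example A (condition (12) holds). Fix a point \xi \in [a,b] and let \mu_k := \xi^k.
% Then, for all real a_k and all N,
\[
  \Bigl| \sum_{k=0}^{N} a_k \mu_k \Bigr|
  = \Bigl| \sum_{k=0}^{N} a_k \xi^k \Bigr|
  \le \max_{a \le x \le b} \Bigl| \sum_{k=0}^{N} a_k x^k \Bigr| ,
\]
% so (12) holds with c = 1. A corresponding solution is the functional
% F = \delta_\xi, that is, F(u) = u(\xi).

% Example B (condition (12) fails). Let [a,b] = [0,1] and \mu_k := 2^k.
% Choosing a_N = 1 and a_k = 0 for k < N, condition (12) would require
\[
  2^N = |\mu_N| \le c \max_{0 \le x \le 1} |x^N| = c
  \quad \text{for all } N = 0, 1, 2, \dots ,
\]
% which is impossible. Hence this moment problem has no solution on [0,1].
```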
1.5 Minimum Norm Problems and Duality Theory
Along with the primal problem
\[ \inf \|u - u_0\| = \alpha, \quad u \in L, \tag{13} \]
let us consider the dual problem
\[ \sup \langle u^*, u_0 \rangle = \beta, \quad u^* \in L^\perp, \quad \|u^*\| \le 1, \tag{13*} \]
where $L^\perp := \{u^* \in X^* : \langle u^*, u \rangle = 0 \text{ for all } u \in L\}$.
Theorem 1.D (Minimum norm problem on the normed space X). Let L
be a linear subspace of the real normed space X. We are given $u_0 \in X$.
Then the following conditions hold:
(i) Extremal values: $\alpha = \beta$.
(ii) Dual problem: The dual problem (13*) has a solution $u^*$.
(iii) Primal problem: Let $u^*$ be a fixed solution of the dual problem (13*).
Then, the point $u \in L$ is a solution of the primal problem (13) iff
\[ \langle u^*, u_0 - u \rangle = \|u - u_0\|. \tag{14} \]
Corollary 1. If $\dim L < \infty$, then the primal problem (13) always has a
solution.
Let $v \in L$ and $v^* \in L^\perp$ with $\|v^*\| \le 1$. Then, from (i) we obtain the
two-sided error estimate for the minimal value $\alpha$:
\[ \|v - u_0\| \ge \alpha \ge \langle v^*, u_0 \rangle. \]
Proof.⁴ Ad (i), (ii). For each $\varepsilon > 0$, there is a point $u \in L$ such that
\[ \|u - u_0\| \le \alpha + \varepsilon. \]
Thus, for all $u^* \in L^\perp$ with $\|u^*\| \le 1$,
\[ \langle u^*, u_0 \rangle = \langle u^*, u_0 - u \rangle \le \|u^*\|\,\|u - u_0\| \le \alpha + \varepsilon. \]
Hence $\beta \le \alpha + \varepsilon$ for all $\varepsilon > 0$, that is, $\beta \le \alpha$.
Let $\alpha > 0$. It follows from Proposition 3 in Section 1.2 that there is a
functional $u^* \in L^\perp$ with $\|u^*\| = 1$ such that
\[ \langle u^*, u_0 \rangle = \alpha. \tag{15} \]
Along with $\beta \le \alpha$, this implies $\beta = \alpha$.
If $\alpha = 0$, then (15) holds with $u^* = 0$, and hence we again have $\alpha = \beta$.
Ad (iii). This follows from $\alpha = \beta$ and $\langle u^*, u \rangle = 0$. □
⁴ The Latin notion "Ad (i)" stands for "proof of (i)".
Proof of Corollary 1. Since $0 \in L$, $\alpha \le \|u_0\|$. Thus, problem (13) is
equivalent to the finite-dimensional minimum problem
\[ \|u - u_0\| = \min!, \quad u \in L_0, \]
where the set $L_0 := \{u \in L : \|u\| \le \|u_0\|\}$ is compact. By the Weierstrass
theorem (Proposition 8 in Section 1.11 of AMS Vol. 108), this problem has
a solution. □
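Theorem 1.D and the two-sided error estimate can be tried out in a small finite-dimensional case. The sketch below makes the following assumptions, none of which come from the text: X is the plane with the maximum norm (so that the dual norm is the l1 norm), L = span{(1, 1)}, and u_0 = (3, 1). Functionals in the annihilator of L then have the form (s, -s). Both extremal values are approximated by a plain grid search.

```python
def max_norm(v):
    """Norm on X = R^2."""
    return max(abs(c) for c in v)


def l1_norm(v):
    """Dual norm on X* for the maximum norm on R^2."""
    return sum(abs(c) for c in v)


def pairing(u_star, u):
    """Value <u*, u> of the functional u* at the point u."""
    return sum(a * b for a, b in zip(u_star, u))


U0 = (3.0, 1.0)
GRID = [k / 1000 for k in range(-5000, 5001)]

# Primal problem (13): minimize ||t(1, 1) - u0|| over t.
alpha = min(max_norm((t - U0[0], t - U0[1])) for t in GRID)

# Dual problem (13*): maximize <u*, u0> over u* = (s, -s) with ||u*|| <= 1.
beta = max(pairing((s, -s), U0) for s in GRID if l1_norm((s, -s)) <= 1.0)

# Two-sided estimate from any feasible pair (v, v*).
v = (1.5, 1.5)
v_star = (0.25, -0.25)
upper = max_norm((v[0] - U0[0], v[1] - U0[1]))
lower = pairing(v_star, U0)
```

Here the minimizer is u = (2, 2) and the dual solution is u* = (1/2, -1/2), so alpha = beta = 1, while the feasible pair (v, v*) brackets this value by 1.5 >= 1 >= 0.5.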
Remark 2. Let $\dim L = \infty$, where L is a closed linear subspace of the real
reflexive Banach space X (e.g., X is a real Hilbert space⁵), and let $u_0 \in X$
be given. Then, the primal problem (13) has a solution.
This will be proved in Section 2.9 (cf. Theorem 2.E). Note that this result
will not be used in the present chapter.
In contrast to (13), we now consider the modified primal problem
\[ \inf \|u^* - u_0\| = \alpha, \quad u^* \in L^\perp, \tag{16} \]
along with the dual problem
\[ \sup \langle u_0, u \rangle = \beta, \quad u \in L, \quad \|u\| \le 1. \tag{16*} \]
Recall that $L^\perp := \{u^* \in X^* : \langle u^*, u \rangle = 0 \text{ for all } u \in L\}$. Thus, the primal
problem (16) refers to the dual space $X^*$, whereas the dual problem (16*)
refers to the original space X.
Theorem 1.E (Minimum norm problem on the dual space $X^*$). Let L be
a linear subspace of the real normed space X. We are given $u_0 \in X^*$. Then
the following conditions hold:
(i) Extremal values: $\alpha = \beta$.
(ii) Primal problem: The primal problem (16) has a solution $u^*$.
5The basic properties of Hilbert spaces can be found in Chapter 2 of AMS
Vol. 108.
(iii) Dual problem: Let $u^*$ be a fixed solution of the primal problem (16).
Then, the point $u \in L$ with $\|u\| \le 1$ is a solution of the dual problem
(16*) iff
\[ \langle u_0 - u^*, u \rangle = \|u_0 - u^*\|. \tag{17} \]
Proof. Ad (i), (ii). For all $u^* \in L^\perp$,
\[ \|u^* - u_0\| = \sup_{\|u\| \le 1} \bigl( \langle u_0, u \rangle - \langle u^*, u \rangle \bigr) \ge \sup_{\|u\| \le 1,\ u \in L} \langle u_0, u \rangle = \beta, \]
since $\langle u^*, u \rangle = 0$ for all $u \in L$. Hence $\alpha \ge \beta$.
Let $u_r\colon L \to \mathbb{R}$ be the restriction of $u_0\colon X \to \mathbb{R}$ to L. Then
\[ \|u_r\| = \sup_{\|u\| \le 1,\ u \in L} \langle u_0, u \rangle = \beta. \]
By the Hahn-Banach theorem (Theorem 1.B), there exists an extension
$U^*\colon X \to \mathbb{R}$ of $u_r$ with $\|U^*\| = \|u_r\|$. This implies
\[ v^* := u_0 - U^* = 0 \quad \text{on } L, \]
that is, $v^* \in L^\perp$. Since $\alpha \ge \beta$ and
\[ \|v^* - u_0\| = \|U^*\| = \|u_r\| = \beta, \]
we get $\alpha = \beta$.
Ad (iii). This follows from $\alpha = \beta$ with $\langle u^*, u \rangle = 0$. □
In Sections 1.6 and 1.7, Theorems 1.D and 1.E will be applied to Cebysev
approximation and the optimal control of rockets, respectively. In this connection, the following lemma will be used critically.
Let $-\infty < a \le c \le b < \infty$. Set
\[ \delta_c(u) := u(c) \quad \text{for all } u \in C[a,b]. \]
Obviously, $\delta_c \in C[a,b]^*$ and $\|\delta_c\| = 1$.
Lemma 3. Let $u^* \in C[a,b]^*$ be such that $\|u^*\| \ne 0$. Suppose that
\[ \langle u^*, u \rangle = \|u^*\|\,\|u\| \quad \text{where } \|u\| := \max_{a \le x \le b} |u(x)|, \]
and $u\colon [a,b] \to \mathbb{R}$ is a continuous function such that $|u(x)|$ achieves its
maximum at precisely N points of $[a,b]$ denoted by $x_1, \dots, x_N$.
Then, there exist real numbers $a_1, \dots, a_N$ such that
\[ u^* = a_1\delta_{x_1} + \dots + a_N\delta_{x_N} \quad \text{and} \quad |a_1| + \dots + |a_N| = \|u^*\|. \]
Proof. By Section 1.3, there exists a function $p\colon [a,b] \to \mathbb{R}$ of bounded
variation such that
\[ \langle u^*, u \rangle = \int_a^b u(x)\,dp(x) \quad \text{for all } u \in C[a,b], \]
and $V(p) = \|u^*\|$, where $V(p)$ denotes the total variation of p on the interval
$[a,b]$ (cf. the appendix of AMS Vol. 108). We may assume that $p(a) = 0$.
To explain the simple idea of the proof, assume that $N = 1$, $\pm u(x_1) = \|u\|$,
and $a < x_1 < b$.
Let $J := [a,b] - \,]x_1 - \varepsilon, x_1 + \varepsilon[$ for fixed $\varepsilon > 0$, and let $V_J(p)$ denote the
total variation of p on J. Then,
\[ V_J(p) + |p(x_1 + \varepsilon) - p(x_1 - \varepsilon)| \le V(p). \tag{18} \]
Case 1: Let $V_J(p) = 0$ for all $\varepsilon > 0$. Then, by (18), p is a step function
of the following form:
\[ p(x) := \begin{cases} 0 & \text{if } a \le x < x_1, \\ \pm V(p) & \text{if } x_1 < x \le b, \end{cases} \]
and, by the definition of the Stieltjes integral (cf. the appendix of AMS
Vol. 108),
\[ \langle u^*, u \rangle = \int_a^b u(x)\,dp(x) = \pm u(x_1)V(p) \quad \text{for all } u \in C[a,b]. \]
Hence $u^* = \pm V(p)\,\delta_{x_1}$.
Case 2: Let $V_J(p) > 0$ for some $\varepsilon > 0$. We want to show that this is
impossible. By the mean value theorem, there is a point $t \in [x_1 - \varepsilon, x_1 + \varepsilon]$
such that
\[ \langle u^*, u \rangle = \int_J u(x)\,dp(x) + \int_{x_1 - \varepsilon}^{x_1 + \varepsilon} u(x)\,dp(x) \le \max_{x \in J} |u(x)|\,V_J(p) + |u(t)|\,|p(x_1 + \varepsilon) - p(x_1 - \varepsilon)|. \]
Since $|u(x)|$ achieves its maximum exactly at the point $x_1$, we get
$\max_{x \in J} |u(x)| < \|u\|$. Thus, it follows from (18) that
\[ \langle u^*, u \rangle < \|u\|\,V(p). \]
Hence $\langle u^*, u \rangle < \|u^*\|\,\|u\|$. This is a contradiction.
For $N > 1$, we use a similar argument. □
1.6 Applications to Cebysev Approximation
For the given continuous function $u_0\colon [a,b] \to \mathbb{R}$ on the compact interval
$[a,b]$, let us consider the following approximation problem:
\[ \max_{a \le x \le b} |u_0(x) - u(x)| = \min!, \quad u \in L, \tag{19} \]
where L denotes the set of all real polynomials of degree $\le N$, for fixed
$N \ge 1$. Problem (19) corresponds to the so-called Cebysev approximation
of the function $u_0$ by polynomials.
Proposition 1. Problem (19) has a solution. If u is a solution of (19),
then
\[ |u_0(x) - u(x)| \]
achieves its maximum at at least $N + 2$ points of $[a,b]$.
Proof. Set $X := C[a,b]$ and $\|v\| := \max_{a \le x \le b} |v(x)|$. Then, the original
problem (19) can be written in the form
\[ \|u_0 - u\| = \min!, \quad u \in L. \tag{20} \]
Since $\dim L < \infty$, this problem has a solution, by Corollary 1 in Section
1.5.
We may assume that $u_0 \notin L$. Otherwise, the statement is trivial. Let
u be a solution of (20). Then, $\|u_0 - u\| > 0$. By the duality theory from
Theorem 1.D, there exists a functional $u^* \in C[a,b]^*$ such that
\[ \langle u^*, u_0 - u \rangle = \|u_0 - u\| \tag{21} \]
along with $\|u^*\| = 1$ and
\[ \langle u^*, p \rangle = 0 \quad \text{for all } p \in L. \tag{22} \]
Suppose that $|u_0(x) - u(x)|$ achieves its maximum on $[a,b]$ at precisely
the points $x_1, \dots, x_M$, where $1 \le M < N + 2$. It follows from (21) and
Lemma 3 in Section 1.5 that there are real numbers $a_1, \dots, a_M$ with
$|a_1| + \dots + |a_M| = 1$ such that
\[ u^* = a_1\delta_{x_1} + \dots + a_M\delta_{x_M}. \]
Assume that $a_M \ne 0$. Choose a real polynomial p of degree N such that
\[ p(x_1) = p(x_2) = \dots = p(x_{M-1}) = 0 \quad \text{and} \quad p(x_M) \ne 0. \]
This is possible, since $M - 1 \le N$. Then, $p \in L$ and $\langle u^*, p \rangle \ne 0$, contradicting (22). □
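A small numerical illustration of Proposition 1: for the sample data u_0(x) = x^2 on [0, 1] and N = 1 (an assumption made here for illustration), the line u(x) = x - 1/8 has the error x^2 - x + 1/8, whose modulus equals 1/8 at the N + 2 = 3 points 0, 1/2, 1 with alternating signs. The sketch below checks this on a grid and compares the error with a few hypothetical competing lines.

```python
def u0(x):
    """Function to be approximated (illustrative choice)."""
    return x * x


def sup_error(a, b, n=2000):
    """Grid approximation of max |u0(x) - (a*x + b)| over [0, 1]."""
    return max(abs(u0(k / n) - (a * k / n + b)) for k in range(n + 1))


best = (1.0, -0.125)  # candidate best line u(x) = x - 1/8
err_best = sup_error(*best)

# Points where the error modulus reaches its maximum.
extremal_points = [
    x for x in (0.0, 0.5, 1.0)
    if abs(abs(u0(x) - (best[0] * x + best[1])) - err_best) < 1e-12
]

# A few competing lines a*x + b (hypothetical sample, not an exhaustive search).
competitor_errors = [
    sup_error(a, b) for a in (0.9, 1.0, 1.1) for b in (-0.2, -0.125, -0.05)
]
```

None of the competitors does better than 1/8. This matches the fact that the second difference of x^2 at the points 0, 1/2, 1 equals 1/2, which forces an error of at least 1/8 for every line.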
1.7 Applications to the Optimal Control of Rockets
We want to study the motion of a vertically ascending rocket that reaches
a given altitude h with minimum fuel expenditure (see Figure 1.5).
The motion $x = x(t)$ of the rocket is governed by the equation
\[ m x''(t) = F(t) - mg, \quad 0 < t < T, \qquad x(0) = x'(0) = 0, \quad x(T) = h, \tag{23} \]
where m = mass of the rocket, mg = force of gravity, and F(t) = rocket
force. We neglect the loss of mass by the burning of fuel. To simplify notation, we choose physical units with $m = g = 1$.
Let us measure the minimal fuel expenditure during the time interval
$[0,T]$ through the integral
\[ \int_0^T |F(t)|\,dt \]
over the rocket force F. First let $T > 0$ be fixed. Then, the minimal fuel
expenditure $\alpha(T)$ during the time interval $[0,T]$ is given by a solution of
the following minimum problem:
\[ \min_F \int_0^T |F(t)|\,dt = \alpha(T), \tag{24} \]
where we vary over all integrable functions $F\colon [0,T] \to \mathbb{R}$. We now choose
the final time T in such a way that $\alpha(T)$ becomes minimal, that is,
\[ \alpha(T) = \min!. \tag{25} \]
Integration of (23) yields $x(t) = \int_0^t (t - \tau)F(\tau)\,d\tau - \frac{t^2}{2}$, and hence
\[ h = \int_0^T (T - \tau)F(\tau)\,d\tau - \frac{T^2}{2}. \tag{26} \]
Summarizing, for a given altitude $h > 0$, we have to determine the optimal
thrust program $F(\cdot)$ and the final time T as a solution of problems (24)
through (26).
This formulation has the following shortcoming. If we consider only classical force functions F, then an impulse at time t of the form
\[ \text{``}F = \delta_t\text{''} \]
is excluded. However, we expect that such thrust programs may be of importance. For this reason, let us consider the following generalized problem
for functionals:
[FIGURE 1.5: sketch with the labels "rocket" and "earth".]
(a) For a given altitude h and fixed final time $T > 0$, we are looking for
a solution F of the following minimum problem:
\[ \min \|F\| = \alpha(T), \quad F \in C[0,T]^*, \tag{27} \]
along with the side condition
\[ h = F(w) - \frac{T^2}{2}, \quad \text{where we set } w(t) := T - t. \tag{28} \]
(b) We determine the final time T in such a way that
\[ \alpha(T) = \min!. \tag{29} \]
Observe that condition (27) generalizes (24). In fact, if the functional
$F \in C[0,T]^*$ has the following special form:
\[ F(u) = \int_0^T u(t)F(t)\,dt \quad \text{for all } u \in C[0,T], \]
where the fixed function $F\colon [0,T] \to \mathbb{R}$ is continuous, then
\[ \|F\| = \int_0^T |F(t)|\,dt, \]
by Example 2 in Section 1.3.
Proposition 1. Problem (a), (b) has the following solution:
\[ F = T\delta_0 \quad \text{and} \quad T = (2h)^{1/2}, \]
with the minimal "fuel expenditure" $\|F\| = T$.
This solution corresponds to an impulse at the initial time $t = 0$. Proposition 1 shows that, in control theory, it is quite natural to use minimum
problems with respect to functionals.
Proof. Step 1: Solution of problem (a). Let $X := C[0,T]$ and $L := \operatorname{span}\{w\}$.
By the Hahn-Banach theorem, there exists a functional $F_0 \in C[0,T]^*$ such
that
\[ F_0(w) = h + \frac{T^2}{2}. \]
Then, condition (28) says that $(F_0 - F)(w) = 0$, i.e., $(F_0 - F) \in L^\perp$.
Consequently, problem (a) is equivalent to the primal problem:
\[ \min \|(F_0 - F) - F_0\| = \alpha(T), \quad (F_0 - F) \in L^\perp. \tag{30} \]
By Theorem 1.E in Section 1.5, the dual problem reads as follows:
\[ \sup F_0(u) = \alpha(T), \quad u \in \operatorname{span}\{w\}, \quad \|u\| \le 1. \tag{30*} \]
Let us solve (30*) and (30). Observe that the dual problem (30*) is one-dimensional. Since $\|w\| = \max_{0 \le t \le T} |w(t)| = T$, (30*) has the solution
$u = T^{-1}w$.
Hence
\[ \alpha(T) = F_0(u) = T^{-1}F_0(w) = T^{-1}h + 2^{-1}T. \]
Explicitly,
\[ u(t) = T^{-1}(T - t) \quad \text{for all } t \in [0,T]. \]
By Theorem 1.E, the primal problem (30) has a solution $F_0 - F \in L^\perp$.
Hence
\[ \|F\| = \alpha(T) \quad \text{and} \quad F_0(w) = F(w). \]
Since $u = T^{-1}w$, the functional $F \in C[0,T]^*$ satisfies the equation $F(u) = F_0(u) = \alpha(T)$, i.e.,
\[ F(u) = \|F\|\,\|u\|, \tag{31} \]
because $\|u\| = 1$. Since the function $|u(\cdot)|$ achieves its maximum on $[0,T]$
precisely at the point $t = 0$, it follows from (31) and Lemma 3 in Section
1.5 that
\[ F = \beta\delta_0 \]
for some real number $\beta$ with $|\beta| = \|F\|$. Since $\|\delta_0\| = 1$, this implies
$F = \pm\|F\|\,\delta_0$. From $F(w) = F_0(w) > 0$ and $\delta_0(w) = w(0) > 0$, we get
$F = \|F\|\,\delta_0$, that is, $F = \alpha(T)\,\delta_0$.
Step 2: Solution of problem (b). It follows from $\alpha'(T) = -T^{-2}h + 2^{-1} = 0$
that the problem $\alpha(T) = \min!$ has the solution $T = (2h)^{1/2}$. Hence
$\alpha(T) = T^{-1}h + 2^{-1}T = (2h)^{1/2} = T$. □
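The one-dimensional minimization in Step 2 is easy to check numerically. The sketch below assumes the sample altitude h = 2 (in the units m = g = 1 of the text) and a hypothetical search grid for T; it compares a grid minimizer of alpha(T) = h/T + T/2 with the closed form T = (2h)^{1/2} and verifies the side condition (28) for the impulse F = alpha(T) delta_0.

```python
import math


def alpha(T, h):
    """Minimal fuel expenditure for fixed final time T, as derived in Step 1."""
    return h / T + T / 2.0


h = 2.0                          # sample altitude (assumption)
T_star = math.sqrt(2.0 * h)      # closed-form minimizer from Step 2

# Grid search over T in (0, 10] as an independent check.
T_grid = min((0.01 * k for k in range(1, 1001)), key=lambda T: alpha(T, h))

# Side condition (28): F(w) = alpha(T) * w(0) = alpha(T) * T should equal h + T^2 / 2.
impulse_on_w = alpha(T_star, h) * T_star
```

For h = 2 this gives T = 2 and alpha(T) = 2, in line with the statement ||F|| = T of Proposition 1.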
Problems
The concepts of "topological space" and "metric space" will be introduced
in Problem 1.12ff. Important interrelations are pictured in Figure 1.6 ahead.
Locally convex spaces will be defined in Problem 3.21 in connection with
the weak topology (weak convergence) on Banach spaces. For example,
important spaces of generalized functions (distributions) are locally convex
spaces, but not normed spaces.
1.1. Convex hull. Let M be a convex subset of the normed space X. Show
that the closure $\overline{M}$ is also convex.
1.2. The completion principle for Banach spaces. Two normed spaces X
and Y over $\mathbb{K}$ are called normisomorphic iff there exists a linear bijective
operator $j\colon X \to Y$ such that j is isometric, i.e.,
\[ \|j(u)\| = \|u\| \quad \text{for all } u \in X. \]
Let D be a normed space over $\mathbb{K}$. The Banach space X over $\mathbb{K}$ is called
a completion of D iff the set D is dense in X and the X-norm coincides
with the D-norm on D.
1.2a. Uniqueness of completion. Show that two completions X and Y of
D are normisomorphic.
1.2b. Existence of a completion. Show that there exists a Banach space
X over $\mathbb{K}$ that is a completion of D.
Hint: We will use the classic idea of Cantor and Meray who introduced
real numbers in 1872 with the aid of equivalence classes of Cauchy sequences. Two Cauchy sequences $(u_n)$ and $(v_n)$ in D are called equivalent
iff
\[ \|u_n - v_n\| \to 0 \quad \text{as } n \to \infty. \]
Let X be the set of the corresponding equivalence classes $[(u_n)]$. For $\alpha \in \mathbb{K}$,
we define
\[ [(u_n)] + [(v_n)] := [(u_n + v_n)], \qquad \alpha[(u_n)] := [(\alpha u_n)], \]
and
\[ \|[(u_n)]\| := \lim_{n \to \infty} \|u_n\|. \]
Prove that these operations make sense and that they are independent of
the choice of the representatives. Cf. Zeidler (1986), Vol. 2A, p. 96.
1.3. The completion principle for Hilbert spaces. Two pre-Hilbert spaces⁶
X and Y are called H-isomorphic (or unitarily equivalent) iff there exists
a unitary operator $j\colon X \to Y$. That is, j is linear, bijective, and
\[ (j(u) \mid j(v)) = (u \mid v) \quad \text{for all } u, v \in X. \]
Let D be a pre-Hilbert space over $\mathbb{K}$. The Hilbert space X over $\mathbb{K}$ is
called a completion of D iff the set D is dense in X and the X-inner
product coincides with the D-inner product on D.
Show that there exists a completion X of D and that each completion
of D is H-isomorphic to X.
Hint: Use Problem 1.2 and the fact that the inner product can be expressed by a sum of norms according to (99) in Chapter 2 of AMS Vol. 108.
In particular, if $u := [(u_n)]$ and $v := [(v_n)]$, then
\[ (u \mid v) := \lim_{n \to \infty} (u_n \mid v_n). \]
This limit exists and is independent of the choice of the representatives
$(u_n)$ and $(v_n)$ of u and v, respectively.
Show that two pre-Hilbert spaces over $\mathbb{K}$ are H-isomorphic iff they are
normisomorphic.
⁶ See Section 2.1 of AMS Vol. 108.
1.4. The energetic space as a completion. Let $B\colon D(B) \subseteq X \to X$ be
a linear, symmetric, and strongly monotone operator on the real Hilbert
space X. As in Section 5.3 of AMS Vol. 108 we introduce the energetic
inner product by setting
\[ (u \mid v)_E := (Bu \mid v) \quad \text{for all } u, v \in D(B). \]
Show that the energetic space $X_E$ from Section 5.3 of AMS Vol. 108
is just the completion of the domain of definition $D(B)$ with respect to
$(\cdot \mid \cdot)_E$.
1.5. Separation of convex sets. Let A and B be nonempty convex sets in
the real normed space X. Show that
(i) A and B can be separated by a closed hyperplane provided
$B \cap \operatorname{int} A = \emptyset$ and $\operatorname{int} A \ne \emptyset$.
(ii) A and B can be strictly separated by a closed hyperplane provided
$A \cap B = \emptyset$ and both A and B are open.
(iii) A and B can be strictly separated by a closed hyperplane provided
$A \cap B = \emptyset$, A is closed, and B is compact.
Hint: Use the Hahn-Banach theorem. Cf. Edwards (1994), Section 2.1.
1.6. Extension of linear positive functionals (the Krein theorem). Suppose
that X is a real ordered normed space in the sense of Section 1.19 of AMS
Vol. 108 with the order cone $X_+$ and that L is a linear subspace of X such
that
\[ L \cap \operatorname{int} X_+ \ne \emptyset. \]
Let $F\colon L \to \mathbb{R}$ be a linear functional such that
\[ F(u) \ge 0 \quad \text{for all } u \in L \text{ with } u \ge 0. \]
Show that F can be extended to a linear continuous functional $f\colon X \to \mathbb{R}$
such that $f(u) \ge 0$ for all $u \in X$ with $u \ge 0$.
Hint: Use the Hahn-Banach theorem along with $p(u) := \inf\{F(v) : v \in L,\ v \ge u\}$. Cf. Edwards (1994), Section 2.5.2.
1.7*. Uniqueness of the Cebysev approximation. Set $X := C[a,b]$, where
$-\infty < a < b < \infty$ and
\[ \|u - v\| := \max_{a \le x \le b} |u(x) - v(x)|. \]
Let L be a finite-dimensional linear subspace of X with $\dim L = N + 1$. By
definition, L satisfies the Haar condition iff each nonzero function $v\colon [a,b] \to \mathbb{R}$
from L has at most N zeros. By Section 1.5, for given $u \in X$, the
approximation problem
\[ \|u - v\| = \min!, \quad v \in L \tag{32} \]
has a solution. In addition, the following can be shown.
(i) If L satisfies the Haar condition, then the solution v of (32) is unique.
(ii) Suppose that L satisfies the Haar condition. Let $u \notin L$, and let $v \in L$
be a given function. Suppose that there is a finite set of points
$a \le t_1 < t_2 < \dots < t_{N+2} \le b$ such that
\[ u(t_j) - v(t_j), \quad j = 1, \dots, N + 2, \]
attains alternately the values $\|u - v\|$ and $-\|u - v\|$ at consecutive
points $t_j$.
Then, v is the unique solution of (32).
Study the proofs of (i) and (ii) in Kreyszig (1989), p. 340 and p. 345,
respectively. It is shown in Zeidler (1986), Vol. 3, p. 181, that (i) and (ii)
are special cases of a general functional analytic theorem.
In particular, (i) and (ii) apply to classic Cebysev approximation where
L is the space of all polynomials of degree $\le N$ with real coefficients. In
this case, the Haar condition is obviously satisfied.
Remark. We shall prove in Section 2.9 that the minimum problem (32)
has a unique solution if X is strictly convex. Unfortunately, the space $C[a,b]$
is not strictly convex. Therefore, we need a more subtle uniqueness proof.
1.8.* A special case of the famous Pontrjagin maximum principle. This
principle plays a fundamental role in the optimal control of time-dependent
processes in technology and economics. Let us consider the following control
problem with fixed end time. For a given time interval $[0,T]$ with $T > 0$, we
are looking for a process $x\colon [0,T] \to \mathbb{R}$ and a piecewise-continuous control
function $u\colon [0,T] \to \mathbb{R}$ such that the following hold:
(a) Control functional:
\[ \int_0^T L(x(t), u(t))\,dt = \min!. \]
(b) Control equation:
\[ x'(t) = v(x(t), u(t)) \quad \text{on } [0,T], \qquad x(0) = \text{fixed}. \]
(c) Control restriction:
\[ u(t) \in U \quad \text{on } [0,T]. \]
Here, U is a prescribed subset of $\mathbb{R}^m$, and we assume that the prescribed
functions $L, v\colon \mathbb{R} \times \mathbb{R}^m \to \mathbb{R}$ are $C^1$. Following Pontrjagin, we introduce
the generalized Hamiltonian
\[ \mathcal{H}(x, u, p) := p\,v(x, u) - L(x, u) \]
along with the generalized canonical equation
\[ p'(t) = -\mathcal{H}_x(x(t), u(t), p(t)) \quad \text{on } [0,T], \qquad p(T) = 0. \tag{33} \]
Suppose that $x = x(t)$, $u = u(t)$ is a solution to the original problem (a)
through (c) and let $p = p(t)$ be the solution to (33). Then, the following
maximum principle holds:
\[ \mathcal{H}(x(t), u(t), p(t)) = \max_{u \in U} \mathcal{H}(x(t), u, p(t)). \tag{34} \]
Study the proof of this theorem in Luenberger (1969), p. 263. The proof
relies on the concept of the adjoint operator and the F-derivative from
Chapter 4. The situation becomes much more complicated if the end time
is free. A proof of the general Pontrjagin maximum principle can be found
in Zeidler (1986), Vol. 3, p. 422. This proof is based on a general functional
analytic theorem.
1.9. The relation of Pontrjagin's maximum principle to classical mechanics.
In the special case where
\[ v(x, u) := u, \qquad U := \mathbb{R}, \]
we get $x'(t) = u(t)$. It follows from (34) that $\mathcal{H}_u(x(t), u(t), p(t)) = 0$, and
hence
\[ p(t) = L_{x'}(x(t), x'(t)). \]
Then, problem 1.8(a) corresponds to the principle of least action in mechanics, where $x = x(t)$ describes the trajectory of a particle. The control
function u coincides with the velocity of the particle, and p is called the
momentum of the particle. In particular, equation (33) coincides with the
equation of motion
\[ \frac{d}{dt} L_{x'}(x(t), x'(t)) = L_x(x(t), x'(t)). \]
This is exactly the Euler-Lagrange equation (5) from Section 2.2, and it
corresponds to the minimum problem 1.8(a).
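The short computation behind Problem 1.9 can be written out as follows. This is a sketch of the intermediate steps in the notation of Problem 1.8; the closing example with the Lagrangian $L(x, x') = \tfrac12 m x'^2 - V(x)$ is an illustrative assumption, not part of the text.

```latex
% With v(x,u) = u, the generalized Hamiltonian is
\[
  \mathcal{H}(x, u, p) = p\,u - L(x, u).
\]
% Since U = \mathbb{R}, the maximum in (34) is attained at a critical point in u:
\[
  \mathcal{H}_u = p - L_u(x, u) = 0
  \quad\Longrightarrow\quad
  p(t) = L_u(x(t), u(t)) = L_{x'}(x(t), x'(t)).
\]
% The canonical equation (33) gives
\[
  p'(t) = -\mathcal{H}_x(x(t), u(t), p(t)) = L_x(x(t), x'(t)),
\]
% and inserting p = L_{x'} yields the equation of motion
\[
  \frac{d}{dt} L_{x'}(x(t), x'(t)) = L_x(x(t), x'(t)).
\]
% Illustrative example: for L(x, x') = \tfrac12 m x'^2 - V(x) one gets
% p = m x' and m x'' = -V'(x).
```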
1.10. Application of Pontrjagin's maximum principle to the farmer's allocation problem [from Luenberger (1969)]. A farmer produces a single crop
such as wheat. After harvesting his crop, he may store it or sell and reinvest
it by buying additional land and equipment to increase his production rate.
The farmer wishes to maximize the total amount stored during the time
interval $[0,T]$. Set
$x(t)$ := rate of production at time t,
$x_r(t)$ := rate of reinvestment at time t,
$x_s(t)$ := rate of storage at time t.
Then, the stored production during the time interval $[0,T]$ is equal to
$\int_0^T x_s(t)\,dt$. Thus, we get the following maximum problem:
\[ \int_0^T x_s(t)\,dt = \max!. \]
Obviously,
\[ x(t) = x_r(t) + x_s(t) \quad \text{for all } t \in [0,T]. \]
Moreover, we assume that
\[ x_r(t) = c\,x'(t) \quad \text{for all } t \in [0,T] \text{ and fixed } c > 0. \]
Roughly speaking, this says that the reinvestment rate increases if the
production accelerates. Define
\[ u(t) := \begin{cases} \dfrac{x_r(t)}{x(t)} & \text{if } x(t) \ne 0, \\ 0 & \text{if } x(t) = 0. \end{cases} \]
Hence $x_r(t) = u(t)x(t)$ and $0 \le u(t) \le 1$. Since $x_s(t) = x(t) - x_r(t)$, we
obtain the following control problem:
\[ \int_0^T (1 - u(t))\,x(t)\,dt = \max!, \tag{35} \]
[FIGURE 1.6: diagram relating pre-Hilbert space, Hilbert space, normed space, Banach space, metric space, complete metric space, linear space, and topological space.]
\[ c\,x'(t) = u(t)\,x(t) \quad \text{on } [0,T], \qquad x(0) = \text{fixed} > 0, \]
\[ 0 \le u(t) \le 1 \quad \text{on } [0,T]. \]
Let $c := 1$ and assume that $T > 1$. Using the Pontrjagin maximum principle
from Problem 1.8, show that this problem has the following quite natural
solution:
\[ u(t) = \begin{cases} 1 & \text{if } 0 \le t \le T - 1, \\ 0 & \text{if } T - 1 < t \le T. \end{cases} \]
This means that the farmer stores nothing until time $T - 1$, at which point
he stores all products. Such so-called bang-bang controls are typical for
many control problems.
Hint: Cf. Luenberger (1969), p. 265.
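The switching time T - 1 can be made plausible without the maximum principle if one restricts attention to controls with a single switch. This restriction is an assumption of the sketch below and not a proof of optimality. For u = 1 on [0, s) and u = 0 on (s, T], with c = 1, the production grows like x(t) = x(0) e^t up to time s and stays constant afterwards, so the stored amount is x(0) e^s (T - s). The values T = 3 and x(0) = 1 are sample data.

```python
import math


def stored(s, T, x0=1.0):
    """Total storage for the control u = 1 on [0, s), u = 0 on (s, T], with c = 1."""
    return x0 * math.exp(s) * (T - s)


T = 3.0  # sample end time with T > 1 (assumption)

# Compare all switching times on a grid of [0, T].
switch_times = [T * k / 3000 for k in range(3001)]
s_best = max(switch_times, key=lambda s: stored(s, T))
best_value = stored(s_best, T)
```

Setting the derivative of e^s (T - s) to zero gives e^s (T - s - 1) = 0, that is, s = T - 1, which agrees with the grid result.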
1.11. Further optimization problems. Applications of separation theorems
to general classes of optimization problems can be found in Zeidler (1986),
Vol. 3, Chapters 47ff. As an elementary introduction to optimization theory,
we recommend the monograph by Luenberger (1969).
1.12. Topological spaces.⁷ The most general class of "spaces" used in analysis is the class of topological spaces. Let us discuss the relation between
normed spaces and topological spaces. Figure 1.6 tells us that each normed
space is a metric space, and so forth.
1.12a. Definition. A set M is called a topological space iff there exists a
system $\mathcal{T}$ of subsets of M that has the following properties:
(T1) $M \in \mathcal{T}$ and $\emptyset \in \mathcal{T}$.
(T2) If $U_1, \dots, U_n \in \mathcal{T}$ for any natural number n, then $\bigcap_{j=1}^{n} U_j \in \mathcal{T}$.
(T3) If $U_\alpha \in \mathcal{T}$ for all $\alpha \in A$, where A is an arbitrary index set, then
$\bigcup_{\alpha \in A} U_\alpha \in \mathcal{T}$.
The system $\mathcal{T}$ is called a topology. A subset U of the topological space
M is called open iff $U \in \mathcal{T}$.
A subset C of M is called closed iff the complement $M - C$ is open.
⁷ The classic introduction to general topology is Kelley (1955).
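For a finite set M the axioms (T1) through (T3) can be checked mechanically. The sketch below is a minimal illustration with hypothetical function names. It relies on the fact that, for a finite system of sets, closure under pairwise intersections and pairwise unions already implies closure under all finite intersections and all unions.

```python
from itertools import combinations


def is_topology(M, T):
    """Check (T1)-(T3) for a finite system T of subsets of a finite set M."""
    M = frozenset(M)
    T = {frozenset(U) for U in T}
    # (T1): the whole set and the empty set belong to T.
    if M not in T or frozenset() not in T:
        return False
    # (T2), (T3): for finite T, pairwise closure suffices (by induction).
    for U, V in combinations(T, 2):
        if U & V not in T or U | V not in T:
            return False
    return True


def closed_sets(M, T):
    """Closed sets are the complements M - U of the open sets U in T."""
    M = frozenset(M)
    return {M - frozenset(U) for U in T}


M = {1, 2, 3}
chain = [set(), {1}, {1, 2}, {1, 2, 3}]   # a topology on M
broken = [set(), {1}, {2}, {1, 2, 3}]     # not a topology: {1} | {2} is missing
```

The system `chain` passes the check, and its closed sets are the empty set, {3}, {2, 3}, and M.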
1.12b. Properties of closed sets. Let M be a topological space. Show that
(i) M and the empty set $\emptyset$ are closed.
(ii) The union of a finite number of closed sets in M is again closed.
(iii) The intersection of an arbitrary number of closed sets in M is again
closed.
1.12c. Normed spaces. Let X be a normed space over $\mathbb{K}$. In Section 1.4
of AMS Vol. 108 we defined open sets in X. Show that these open sets of X
form a topology of X (i.e., each normed space is also a topological space).
1.12d. Subsets of normed spaces. Let M be a subset of a normed space X.
A subset U of M is called relatively open iff there exists an open subset $U_X$
of X such that
\[ U = U_X \cap M. \]
Show that the relatively open sets of M form a topology. This way, M
becomes a topological space.
1.12e. Neighborhoods. A subset $U(u)$ of the topological space M is called
a neighborhood of the point u iff there exists an open set W such that
\[ u \in W \subseteq U(u) \quad \text{(see Figure 1.7(a))}. \]
Show that, in a normed space, each $\varepsilon$-neighborhood $U_\varepsilon(u)$ of the point u
(cf. Section 1.4 of AMS Vol. 108) is also a neighborhood of u in the general
sense of topological spaces.
1.12f. Separation. A topological space M is called separated iff, for each
pair u, v of different points in M, there are neighborhoods $U(u)$ and $U(v)$
such that
\[ U(u) \cap U(v) = \emptyset \quad \text{(see Figure 1.7(b))}. \]
Show that each normed space is separated.
1.12g. Convergence. Let $(u_n)$ be a sequence in a topological space. We
write
\[ u_n \to u \quad \text{as } n \to \infty \]
iff, for each neighborhood U of the point u, there exists a natural number
$n_U$ such that
\[ u_n \in U \quad \text{for all } n \ge n_U. \]
Show that, in a normed space, this definition is equivalent to the definition given in Section 1.2 of AMS Vol. 108.
[FIGURE 1.7: (a) a neighborhood U(u); (b) neighborhoods U(u) and U(v).]
Show that in a separated topological space the limit point of a convergent
sequence is unique.
1.13. Continuity in topological spaces. Let M and Y be topological spaces.
The map
\[ f\colon M \to Y \tag{36} \]
is called continuous at the point $u \in M$ iff, for each neighborhood $U(f(u))$,
there is a neighborhood $U(u)$ such that
\[ f(U(u)) \subseteq U(f(u)). \]
Moreover, the map f from (36) is called continuous iff it is continuous at
each point $u \in M$.
1.13a. Preimages of continuous maps. Show that the following three
statements are mutually equivalent for the map f from (36):
(i) f is continuous.
(ii) The preimage $f^{-1}(W)$ of each open set W is again open.
(iii) The preimage $f^{-1}(C)$ of each closed set C is again closed.
1.13b. Continuity in normed spaces. Let X and Y be normed spaces
over $\mathbb{K}$, and let M be a subset of X. Show that the definition of continuity
from Section 1.9 of AMS Vol. 108 coincides with the general definition in
topological spaces.
1.14. Compactness in topological spaces. Let M be a subset of a topological
space X (e.g., X is a normed space). The set M is called compact iff each
open covering of M contains a finite subcovering. That is, each family $\{U_\alpha\}$
of open sets $U_\alpha$ with
\[ M \subseteq \bigcup_{\alpha} U_\alpha \]
contains a finite subfamily $\{U_{\alpha_1}, \dots, U_{\alpha_n}\}$ such that
\[ M \subseteq \bigcup_{j=1}^{n} U_{\alpha_j}. \]
M is called relatively compact iff the closure $\overline{M}$ is compact.
1.14a. Compactness and continuity. Let X and Y be topological spaces,
and let $f\colon M \subseteq X \to Y$ be a continuous map on the compact set M.
Show that $f(M)$ is also compact.
Hint: Use Problem 1.13a.
1.14b. The finite intersection property. A system S of sets is called centered iff the intersection of finitely many sets in S is never empty.
Show that a topological space M is compact iff every centered system of
closed sets in M has a nonempty intersection.
1.15. Compactness in normed spaces. Let M be a subset of the normed
space X over $\mathbb{K}$. Then, M is called precompact iff either $M = \emptyset$ or $M \ne \emptyset$
and M has a finite $\varepsilon$-net⁸ for each $\varepsilon > 0$. Moreover, M is called complete
iff each Cauchy sequence in M converges to a point in M.
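A finite ε-net can be built greedily for a bounded finite sample of points. The sketch below makes several assumptions: the space is the Euclidean plane, M is a hypothetical grid of sample points in the unit square, and an ε-net is understood as a finite subset of M such that every point of M lies within distance ε of one of its elements (the precise definition is the one in Section 1.11 of AMS Vol. 108, which is not reproduced here).

```python
import math


def greedy_eps_net(points, eps):
    """Pick points of the sample so that every sample point is within eps of a pick."""
    net = []
    for u in points:
        # Add u only if no previously chosen point is within distance eps of it.
        if all(math.dist(u, v) > eps for v in net):
            net.append(u)
    return net


# Sample set (assumption): a 21 x 21 grid in the unit square.
M = [(i / 20, j / 20) for i in range(21) for j in range(21)]
EPS = 0.3

net = greedy_eps_net(M, EPS)
covered = all(min(math.dist(u, v) for v in net) <= EPS for u in M)
```

By construction, every point that was skipped lies within ε of an earlier net point, and the chosen points are pairwise more than ε apart, which keeps the net small.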
1.15a. The compactness theorem. Show that the following three statements are mutually equivalent:
(i) M is compact.
(ii) M is sequentially compact.
(iii) M is precompact and complete.
The proof will be given ahead.
1.15b. The relative compactness theorem. Use Problem 1.15a in order to
show that
(a) M is relatively compact iff M is relatively sequentially compact.
(b) If M is relatively compact, then M is precompact. The converse holds
true if X is complete.
Solution: Ad (a). Let M be relatively compact, that is, the closure
$\overline{M}$ is compact. By Problem 1.15a, $\overline{M}$ is sequentially compact. Thus, each
sequence $(u_n)$ in M has a convergent subsequence (i.e., $u_{n'} \to u$ as $n' \to \infty$
with $u \in \overline{M}$). Hence M is relatively sequentially compact.
Conversely, let M be relatively sequentially compact. If $(u_n)$ is a sequence
in the closure $\overline{M}$, then there exists a sequence $(v_n)$ in M such that
$$\|u_n - v_n\| \to 0 \quad \text{as } n \to \infty.$$
Since M is relatively sequentially compact, there exists a convergent subsequence
$(v_{n'})$, that is,
$$v_{n'} \to v \quad \text{as } n' \to \infty.$$
⁸ The definition of a finite $\varepsilon$-net can be found in Section 1.11 of AMS Vol. 108.
32 1. The Hahn-Banach Theorem and Optimization Problems
Hence $\|u_{n'} - v\| \le \|u_{n'} - v_{n'}\| + \|v_{n'} - v\| \to 0$ as $n' \to \infty$, that is,
$u_{n'} \to v$ as $n' \to \infty$ and $v \in \overline{M}$.
Thus, $\overline{M}$ is sequentially compact. By Problem 1.15a, $\overline{M}$ is compact, and
hence M is relatively compact.
Ad (b). Let M be relatively compact. Then, $\overline{M}$ is compact. By Problem
1.13a, $\overline{M}$ is precompact. It follows easily that this implies the precompactness
of M.
Conversely, let M be precompact and let X be complete. The proof of
Proposition 10 in Section 1.11 of AMS Vol. 108 tells us that M is relatively
sequentially compact. By (a), M is relatively compact.
1.15c. Proof of the compactness theorem from Problem 1.15a.
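(i) $\Rightarrow$ (ii). Let $(u_n)$ be a sequence in M. Set
$$A_n := \{u_n, u_{n+1}, \ldots\}.$$
We first show that there exists a point $u \in M$ such that
$$u \in \bigcap_{n=1}^{\infty} \overline{A_n}.$$ (37)
Otherwise, for each $u \in M$, there is an index m such that $u \notin \overline{A_m}$, i.e.,
$$M \subseteq \bigcup_{n=1}^{\infty} (X - \overline{A_n}).$$
Since $X - \overline{A_n}$ is open and M is compact, there exists a finite number of
indices, say n = 1, ..., k, such that
$$M \subseteq \bigcup_{n=1}^{k} (X - \overline{A_n}) \subseteq \bigcup_{n=1}^{k} (X - A_n).$$
This is a contradiction, since $u_k \in M$ and $u_k \in A_n$ for all n = 1, ..., k.
By (37), each ball around u contains points $u_n$ with arbitrarily large index n.
Hence a subsequence of $(u_n)$ converges to u.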
(ii) $\Rightarrow$ (iii). Let M be sequentially compact. If $(u_n)$ is a Cauchy sequence
in M, then there exists a convergent subsequence, that is, $u_{n'} \to u$ as
$n' \to \infty$ with $u \in M$. By Proposition 7 in Section 1.3 of AMS Vol. 108,
$u_n \to u$ as $n \to \infty$ (i.e., M is complete).
It follows as in the proof of Proposition 10 in Section 1.11 of AMS Vol.
108 that M is precompact.
(iii) $\Rightarrow$ (i). Set $B_r(u) := \{v \in X\colon \|u - v\| \le r\}$. Let $\{U_\alpha\}$ be an open
covering of M. Suppose that there is no finite subfamily of $\{U_\alpha\}$ that
covers M. We want to construct a sequence $(u_n)$ in M such that, for all
n = 1, 2, ...,
$$B_{2^{-(n-1)}}(u_{n-1}) \cap B_{2^{-n}}(u_n) \neq \emptyset \quad \text{if } n \ge 2$$ (38)
and
$$B_{2^{-n}}(u_n) \text{ is not covered by a finite subfamily of } \{U_\alpha\}.$$ (39)
In fact, since M has a finite $2^{-1}$-net, there exist points $v_1, \ldots, v_k \in M$ such
that the family
$$B_{2^{-1}}(v_1), \ldots, B_{2^{-1}}(v_k)$$
covers M. Thus, there exists some point $v_m$ ($1 \le m \le k$) such that
$B_{2^{-1}}(v_m)$ is not covered by a finite subfamily of $\{U_\alpha\}$. Set $u_1 = v_m$. This
is (39) for n = 1.
Since M has a finite $2^{-2}$-net, there exists a ball $B_{2^{-2}}(u_2)$ with $u_2 \in M$
and
$$B_{2^{-1}}(u_1) \cap B_{2^{-2}}(u_2) \neq \emptyset$$
such that $B_{2^{-2}}(u_2)$ is not covered by a finite subfamily of $\{U_\alpha\}$. This is
(38) and (39) for n = 2.
Now use an induction argument in order to prove (38) and (39) for n > 2.
According to (38),
$$\|u_n - u_{n+1}\| \le 2^{-n} + 2^{-(n+1)} \quad \text{for all } n = 1, 2, \ldots.$$
It follows from Corollary 8 in Section 1.3 of AMS Vol. 108 that $(u_n)$ is
Cauchy. Since M is complete, we get
$$u_n \to u \text{ as } n \to \infty \text{ and } u \in M.$$
Choose an index $\beta$ such that $u \in U_\beta$. Since $U_\beta$ is open, there is an r > 0
such that
$$B_{2r}(u) \subseteq U_\beta.$$
If m is sufficiently large, then $\|u - u_m\| \le r$ with $2^{-m} \le r$. By the triangle
inequality,
$$B_{2^{-m}}(u_m) \subseteq B_{2r}(u) \subseteq U_\beta.$$
This contradicts (39).
1.16. The generalized Weierstrass theorem. Let $f\colon M \subseteq X \to \mathbb{R}$ be a continuous
function on the nonempty compact subset M of the topological
space X.
Show that f attains its maximum and minimum on M.
Solution: By Problem 1.14a, the set f(M) is compact in $\mathbb{R}$ and hence
is closed and bounded. Consequently, the real numbers $\inf_{u \in M} f(u)$ and
$\sup_{u \in M} f(u)$ are contained in f(M).
1.17. The Banach space C(M, Y). Let M be a nonempty compact subset
of a topological space, and let Y be a Banach space over $\mathbb{K}$. Let C(M, Y)
denote the set of all continuous functions $f\colon M \to Y$.
Show that C(M, Y) is a Banach space over $\mathbb{K}$ equipped with the norm
$$\|f\| := \max_{u \in M} \|f(u)\|.$$
1.18. Metric spaces. A nonempty set M is called a metric space iff there
exists a function $d\colon M \times M \to [0, \infty[$ such that, for all $u, v, w \in M$, the following
hold:
(i) d(u, v) = 0 iff u = v.
(ii) d(u, v) = d(v, u).
(iii) $d(u, w) \le d(u, v) + d(v, w)$ (triangle inequality).
The number d(u, v) is called the distance between the two points u and v.
By convention, empty sets are also called metric spaces.
1.18a. The translation principle. Show that each subset M of a normed
space X becomes a metric space by setting
$$d(u, v) := \|u - v\| \quad \text{for all } u, v \in M.$$ (40)
Using (40), we can directly translate many notions and propositions from
normed spaces to metric spaces. For example, we say that a sequence $(u_n)$
in the metric space M converges to the point $u \in M$ iff
$$\lim_{n \to \infty} d(u_n, u) = 0.$$
A sequence $(u_n)$ in the metric space M is called Cauchy iff, for each $\varepsilon > 0$,
there is a number $n_0(\varepsilon)$ such that
$$d(u_n, u_m) < \varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon).$$
A metric space M is called complete iff each Cauchy sequence in M is
convergent.
1.18b. Topology. A subset U of the metric space M is called open iff, for
each $u \in U$, there is some $\varepsilon > 0$ such that the set
$$\{v \in M\colon d(u, v) < \varepsilon\}$$
is contained in U.
Show that the collection of all these open sets forms a topology on M
and that this way each metric space becomes a separated topological space.
1.18c. Compactness in metric spaces. Show that all the compactness
statements of Problem 1.15 remain valid if we replace normed spaces with
metric spaces. Convince yourself that the corresponding proofs for normed
spaces can be directly translated to metric spaces.
1.19. Some fundamental theorems. Study the proofs of the following results.
1.19a.* The Stone-Weierstrass approximation theorem. Let M be a
nonempty compact subset of a separated topological space. Let P be a
family of continuous functions $f\colon M \to \mathbb{K}$ such that the following hold:
(i) P is an algebra, i.e., if $f, g \in P$, then $fg \in P$ and $\alpha f + \beta g \in P$ for
all $\alpha, \beta \in \mathbb{K}$.
(ii) P contains the constant functions on M.
(iii) P separates the points of M, that is, if $u, v \in M$ and $u \neq v$, then
there exists a function $p \in P$ such that $p(u) \neq p(v)$.
Then, the set P is dense in the Banach space $C(M, \mathbb{K})$.
Hint: Cf. Yosida (1988), introduction.
1.19b. The Weierstrass approximation theorem in $\mathbb{R}^n$. Let M be a nonempty
compact set in $\mathbb{R}^n$. Use Problem 1.19a in order to show that the set
P of all polynomials $p\colon M \to \mathbb{R}$ in n variables with real coefficients is dense
in the Banach space C(M) of real continuous functions f on M.
Explicitly, this means that for each continuous function $f\colon M \to \mathbb{R}$ and
each $\varepsilon > 0$ there is a real polynomial $p\colon M \to \mathbb{R}$ such that
$$|f(u) - p(u)| < \varepsilon \quad \text{for all } u \in M.$$
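For the special case M = [0, 1], the density statement can be observed numerically. The following minimal sketch does not follow the Stone-Weierstrass route of Problem 1.19a; instead it swaps in the classical Bernstein polynomials, which give explicit approximating polynomials on [0, 1]. The test function $f(x) = |x - 1/2|$, the degrees 10 and 200, and the helper names `bernstein` and `sup_error` are assumptions chosen only for illustration.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f on [0, 1] as a callable."""
    coeffs = [f(k / n) for k in range(n + 1)]
    def p(x):
        return sum(c * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k, c in enumerate(coeffs))
    return p

def sup_error(f, p, samples=401):
    """Approximate max |f - p| on [0, 1] on an equally spaced grid."""
    return max(abs(f(i / (samples - 1)) - p(i / (samples - 1)))
               for i in range(samples))

# Illustrative choice: continuous on [0, 1], not differentiable at 1/2.
f = lambda x: abs(x - 0.5)

err_10 = sup_error(f, bernstein(f, 10))
err_200 = sup_error(f, bernstein(f, 200))
print(err_10, err_200)   # the error shrinks as the degree grows
```

The grid maximum only approximates the norm of C(M), so the printed values indicate the trend rather than prove the theorem.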
1.19c.* The general Arzelà-Ascoli theorem. Let M be a nonempty compact
set of a metric space, and let Y be a Banach space over $\mathbb{K}$. Then, the
family F of continuous functions $f\colon M \to Y$ is a relatively compact subset
of the Banach space C(M, Y) iff the following two conditions are satisfied:
(i) For each $u \in M$, the set $\{f(u)\colon f \in F\}$ is relatively compact in Y.
(ii) F is equicontinuous, that is, for each $u \in M$ and each $\varepsilon > 0$, there is
a number $\delta(u, \varepsilon) > 0$ independent of f such that, for all $f \in F$,
$$d(v, u) < \delta \quad \text{implies} \quad \|f(v) - f(u)\| < \varepsilon.$$
Hint: Cf. Dieudonné (1969), Section 7.5.
1.19d.* The Tietze-Urysohn extension theorem. Let $f\colon M \subseteq X \to \mathbb{R}$ be
a continuous function on the nonempty closed subset M of the metric space
X. Then there exists a continuous extension $F\colon X \to \mathbb{R}$ of f such that
$$\inf_{u \in M} f(u) \le F(v) \le \sup_{u \in M} f(u) \quad \text{for all } v \in X.$$
[FIGURE 1.8. Panel (a): the extension F. Panel (b): the triangle with vertices $P_0$, $P_1$, $P_2$.]
Hint: Cf. Dieudonné (1969), Section 4.5. In the special case where M :=
[0,1] and X := $\mathbb{R}$, the intuitive meaning of this theorem is pictured in
Figure 1.8(a).
1.19e.* The Krein-Milman convexity theorem. Let M be a nonempty
convex compact subset of a real normed space X. Then
$$M = \overline{\operatorname{co}}\, \mathcal{E}(M),$$ (41)
where $\mathcal{E}(M)$ denotes the set of extreme points of M. By definition, u is an
extreme point of M iff
$$u = tv + (1 - t)w \text{ with } v, w \in M \text{ and } 0 < t < 1$$
implies v = w.
Hint: Cf. Yosida (1988), Chapter 12. In the special case where M is a
closed triangle in $\mathbb{R}^2$, precisely the three vertices $P_0$, $P_1$, and $P_2$ are extreme
points of M (see Figure 1.8(b)). Here, statement (41) says that a closed
triangle is equal to the closed convex hull of its vertices.
1.19f. Application to linear optimization. The minimum problem
$$F(u) = \min!, \quad u \in M$$
has a solution u, where u is an extreme point of M, provided $F\colon M \subseteq X \to \mathbb{R}$
is a linear continuous functional on the nonempty convex compact subset
M of the real normed space X.
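In the triangle case of Problem 1.19e, the statement of Problem 1.19f can be checked by sampling. The sketch below is only an illustration under assumed data: the vertices, the linear functional `F`, and the helper `random_point` are hypothetical choices, and random sampling gives numerical evidence, not a proof.

```python
import random

# Hypothetical data: a closed triangle M in R^2 and a linear functional F.
vertices = [(0.0, 0.0), (4.0, 1.0), (1.0, 3.0)]
F = lambda p: 2.0 * p[0] - 3.0 * p[1]

def random_point(rng):
    """Random convex combination of the three vertices (a point of M)."""
    a, b = sorted((rng.random(), rng.random()))
    weights = (a, b - a, 1.0 - b)
    return tuple(sum(w * v[i] for w, v in zip(weights, vertices))
                 for i in range(2))

rng = random.Random(0)
vertex_min = min(F(v) for v in vertices)
sample_min = min(F(random_point(rng)) for _ in range(10_000))
print(vertex_min, sample_min)   # no sampled point beats the best vertex
```

Since F is linear, its value at a convex combination of the vertices is the same convex combination of the vertex values, which is why the minimum over M is already attained at a vertex.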
1.20.* The structure of C*-algebras.
1.20a.* The GNS-theorem (Gelfand-Naimark-Segal representation theorem).
Let $\omega$ be a state⁹ of a C*-algebra $\mathfrak{A}$. Then, there exist a complex
Hilbert space X and a *-homomorphism $\phi\colon \mathfrak{A} \to L(X, X)$ such that
$$\omega(A) = (u \mid \phi(A)u) \quad \text{for all } A \in \mathfrak{A} \text{ and fixed } u \in X \text{ with } \|u\| = 1.$$
In addition, u is cyclic, that is, by definition, the set $\{\phi(A)u\colon A \in \mathfrak{A}\}$ is
dense in X.
⁹ See Section 5.18 of AMS Vol. 108.
Study the proof in Kadison and Ringrose (1983), Vol. 1, p. 278.
1.20b.* The Gelfand-Naimark representation theorem. Each C*-algebra
$\mathfrak{A}$ is *-isomorphic to a C*-subalgebra of L(X, X) for some complex Hilbert
space X.
More precisely, there exists an injective *-homomorphism $\phi\colon \mathfrak{A} \to L(X, X)$
such that each state $\omega$ of $\mathfrak{A}$ has the form
$$\omega(A) = (u \mid \phi(A)u) \quad \text{for all } A \in \mathfrak{A} \text{ and fixed } u \in X \text{ with } \|u\| = 1.$$
Study the proofs in Kadison and Ringrose (1983), Vol. 1, p. 281.
1.20c.* The Gelfand theorem. Each commutative C*-algebra is *-isomorphic
to $C(M, \mathbb{C})$ for some compact topological space M.
Recall that $C(M, \mathbb{C})$ consists of all the continuous functions $f\colon M \to \mathbb{C}$
with the *-operation defined through
$$f^*(x) := \overline{f(x)} \quad \text{for all } x \in M.$$
Study the proof in Berberian (1974), p. 223.
1.21.* Applications of C*-algebras to spectral theory. Study Rudin (1973),
Chapters 12 and 13.
1.22.* Applications of C*-algebras and von Neumann algebras to quantum
statistics. Study Bratteli and Robinson (1979), Vol. 2 and Simon (1993).
1.23. Density and duality. Let X and Y be Banach spaces over $\mathbb{K}$ such
that the embedding
$$X \subseteq Y$$
is continuous, and X is dense in Y. Show that the following hold:
(i) The embedding $Y^* \subseteq X^*$ is continuous.
(ii) If X is reflexive, then $Y^*$ is dense in $X^*$.
Hint: To prove (ii), use the Hahn-Banach theorem. Cf. Zeidler (1986),
Vol. 2A, p. 98.
2
Variational Principles and Weak
Convergence
Johann Bernoulli, professor of mathematics, greets the most sophis-
ticated mathematicians in the world. Experience shows that noble
intellectuals are driven to work for pursuit of knowledge by nothing
more than being confronted with difficult and useful problems.
Six months ago, in the June edition of the Leipzig Acta Eruditorum,
I presented such a problem. The allotted six-month deadline
has now gone by, but no trace of a solution has appeared. Only
the famous Leibniz informed me that he had unraveled the knot of
this brilliant and outstanding problem, and he kindly asked me to
extend the deadline until next Easter. I agreed to this honourable
request .... I will repeat the problem here once more.
Two points, at different distances from the ground and not in a
vertical line, should be connected by such a curve so that a body un-
der the influence of gravitational forces passes in the shortest possible
time from the upper to the lower point.¹
Johann Bernoulli, January 1697
How does one apply the methods of maxima and minima in the
determination of unknown curves?
Leonhard Euler, 1744
¹ The solution of this classic problem will be given in Problem 2.1.
The famous Euler succeeded in tracing back to a general method all
investigations on variational problems. But however sophisticated
and fruitful his method may be, one has to admit that it is not
simple. Here one finds a method which only uses simple principles of
calculus.
Joseph Louis Lagrange, 1762
By generalizing Euler's method, Lagrange got the idea for his re-
markable formulas, where in a single line there is contained the so-
lution of all problems of analytic mechanics.
Carl Gustav Jakob Jacobi (1804-1851)
The Euler "Calculus of Variations" from 1744 is one of the most
beautiful mathematical works that has ever been written.
Constantin Caratheodory (1873-1950)
Mathematics knows, besides the exclusive area of the Greeks, no
luckier constellation than the one under which Leonhard Euler (1707-
1783) was born. It was up to him to give mathematics a completely
changed form and to shape it into the powerful edifice that it is
today.²
Andreas Speiser (1885-1970)
The classical Weierstrass existence theorem from Section 1.11 in AMS
Vol. 108 tells us the following:
(W) The minimum problem
$$F(u) = \min!, \quad u \in M,$$ (1)
has a solution provided the functional $F\colon M \to \mathbb{R}$ is continuous on the
nonempty compact subset M of the Banach space X.
Unfortunately, this result is useless for many variational problems be-
cause of the following crucial fact:
In infinite-dimensional Banach spaces, closed balls are not compact.
This is the decisive difficulty in the calculus of variations. To overcome
this difficulty, we shall introduce the notion of weak convergence. The basic
result reads as follows:
(C) In a reflexive Banach space, each bounded sequence has a weakly
convergent subsequence.
² Seen statistically, Euler must have made a discovery every week. He wrote
nearly 900 research papers and 5,000 letters. His Collected Papers comprise 72
volumes.
In particular, every Hilbert space is a reflexive Banach space. For Hilbert
spaces, the convergence principle (C) is a consequence of the Riesz theorem
from Section 2.10 in AMS Vol. 108.
In the case of reflexive Banach spaces, we need some results about linear
continuous functionals that are consequences of the Hahn-Banach theorem.
The reflexivity of the Banach space X implies
$$X = X^{**},$$
that is, the bidual space $X^{**} = (X^*)^*$ can be identified with the original
space X. Consequently, reflexive Banach spaces are closely related to the
concept of duality. Roughly speaking, in the case of a Hilbert space X, we
get
$$X = X^*;$$
in other words, the dual space $X^*$ can be identified with the original space
X by means of the Riesz theorem. This implies the reflexivity $X = X^{**}$.
In finite-dimensional Banach spaces, the weak convergence coincides with
the usual convergence. The fundamental notion of weak convergence in
Hilbert spaces was introduced by Hilbert in 1906.
The convergence principle (C) implies the following fundamental gener-
alization of the classical Weierstrass theorem (W).
(W*) The minimum problem (1) has a solution provided the functional
$F\colon M \to \mathbb{R}$ is weakly sequentially lower semicontinuous on the closed ball
M of the reflexive Banach space X.
More generally, this remains true if M is a nonempty bounded closed
convex set in the reflexive Banach space X. In particular, we shall show
that the following result is an easy consequence of (W*).
The minimum problem (1) with M = X has a solution provided the
functional $F\colon X \to \mathbb{R}$ is convex and continuous on the reflexive Banach
space X and $F(u) \to +\infty$ as $\|u\| \to \infty$.
It turns out that the theory of infinite-dimensional minimum problems
allows a simple formulation if convexity is involved (i.e., both the set M
and the functional F are convex).
We now want to discuss in which way minimum problems are closely
related to operator equations. Suppose that the original minimum problem
(1) has a solution $u_0 \in \operatorname{int} M$, that is, $u_0$ is an inner point of M. Then
$$F'(u_0) = 0 \quad \text{(Euler equation)},$$ (1*)
where $F'(u_0)$ denotes the Gâteaux derivative, which will be introduced
in Section 2.1. Formally, equation (1*) looks like the equation in classical
analysis where F is a real function. Here, condition (1*) means that the
tangent line is horizontal at the minimal point $u_0$ (cf. Figure 2.1(a)).
[FIGURE 2.1. Panel (a): the minimal point $u_0$. Panel (b): the derivative F'.]
In the case where $F\colon M \subseteq X \to \mathbb{R}$ is a functional, equation (1*) represents
an operator equation for the operator $F'\colon M \subseteq X \to X^*$. This way it is
possible to solve operator equations of the form (1*) by considering the
corresponding minimum problem (1).
If the solution $u_0$ of the minimum problem (1) is not an inner point of
the convex set M, then we get
$$\langle F'(u_0), v - u_0 \rangle \ge 0 \quad \text{for all } v \in M.$$ (1**)
This is called a variational inequality.
Finally, let us explain why our considerations about convex minimum
problems are closely related to the theory of monotone operators. The operator
$A\colon X \to X^*$ on the reflexive Banach space X is called monotone
iff
$$\langle Au - Av, u - v \rangle \ge 0 \quad \text{for all } u, v \in X.$$
If the functional $F\colon X \to \mathbb{R}$ is convex, then its Gâteaux derivative $F'\colon X \to X^*$
is monotone, that is, each solution $u_0$ of the convex minimum problem
(1) is also a solution of the monotone operator equation (1*). In Section
2.18, we will use the Galerkin method in order to solve the more general
operator equation
$$Au_0 = 0,$$ (1***)
where $A\colon X \to X^*$ is a monotone operator that is not necessarily the
Gâteaux derivative F' of a functional F.
The relation between convex functionals and monotone operators gener-
alizes the following well-known fact from classical analysis:
If the real function F is convex, then the derivative F' is monotone (cf.
Figure 2.1).
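The link between convexity and monotonicity can be tested numerically in the finite-dimensional case $X = \mathbb{R}^5$. The sketch below assumes the convex function $F(u) = \sum_i (u_i^4/4 + u_i^2/2)$, which is our own illustrative choice and does not come from the text; `grad_F` plays the role of F' and `pairing` the role of the dual pairing.

```python
import random

def grad_F(u):
    """Gradient of the convex function F(u) = sum(u_i**4 / 4 + u_i**2 / 2)."""
    return [x**3 + x for x in u]

def pairing(f, u):
    """Dual pairing <f, u> on R^n, here the Euclidean inner product."""
    return sum(a * b for a, b in zip(f, u))

rng = random.Random(1)
worst = float("inf")
for _ in range(1000):
    u = [rng.uniform(-2, 2) for _ in range(5)]
    v = [rng.uniform(-2, 2) for _ in range(5)]
    diff_A = [a - b for a, b in zip(grad_F(u), grad_F(v))]
    diff_u = [a - b for a, b in zip(u, v)]
    # Monotonicity requires <F'(u) - F'(v), u - v> >= 0.
    worst = min(worst, pairing(diff_A, diff_u))
print(worst)   # smallest observed value of the pairing
```

Each coordinate contributes $(x - y)^2(x^2 + xy + y^2 + 1) \ge 0$ to the pairing, which explains why no negative value is observed.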
Convexity plays a fundamental role in the mathematical description of
nature. For example, by the first and second laws of thermodynamics, all
the processes in nature are governed by energy E and entropy S. Observe
that the negative entropy −S is a convex functional, and the energy E is
frequently a convex functional.
The following maximum problem
$$F(u) = \max!, \quad u \in M,$$
can always be reduced to a minimum problem by replacing the functional
F with -F.
For the convenience of the reader, we first present an elementary approach
to variational principles, using only very simple facts about Banach
spaces and Hilbert spaces. This can be found in Sections 2.1 through 2.7.
The generalizations to reflexive Banach spaces via the Hahn-Banach theorem
will be considered in Section 2.8.
The applications in this chapter concern the calculus of variations, nonlinear
eigenvalue problems, variational inequalities, duality theory, game
theory, and nonlinear monotone operators.
2.1 The nth Variation
Recall that each map $F\colon M \to \mathbb{K}$ with values in $\mathbb{K} = \mathbb{R}, \mathbb{C}$ is called a
functional. By an open neighborhood $U(u_0)$ of the point $u_0$ in the normed
space X, we understand an open set in X with $u_0 \in U(u_0)$.
Definition 1. Let $F\colon U(u_0) \subseteq X \to \mathbb{R}$ be a functional on the open neighborhood
$U(u_0)$ of the point $u_0$ in the real normed space X. For fixed $h \in X$,
set
$$\phi(t) := F(u_0 + th),$$
where the real parameter t lives in an open neighborhood of the point t = 0.
By the nth variation $\delta^n F(u_0; h)$ of the functional F at the point $u_0$ in
the direction h, we understand
$$\delta^n F(u_0; h) := \phi^{(n)}(0), \quad n = 1, 2, \ldots.$$
In particular, the first variation is given through $\delta F(u_0; h) := \phi'(0)$.
The functional F has a Gâteaux derivative $F'(u_0)$ at the point $u_0$ iff the
first variation $\delta F(u_0; h)$ exists for each $h \in X$ and there exists a linear
continuous functional $F'(u_0)$ on X such that
$$\delta F(u_0; h) = F'(u_0)(h) \quad \text{for all } h \in X.$$
The Gâteaux derivative $F'(u_0)$ is called a Fréchet derivative iff
$$F(u_0 + h) - F(u_0) = F'(u_0)(h) + \|h\|\varepsilon(h)$$
for all $h \in X$ in an open neighborhood of h = 0, where $\varepsilon(h) \to 0$ as $h \to 0$.
Obviously, the following condition holds:
If the Fréchet derivative $F'(u_0)$ exists, then F is continuous at the point
$u_0$.
The functional $F\colon U \subseteq X \to \mathbb{R}$ on the open set U of the normed space X
is called a $C^1$-functional iff the Fréchet derivative $F'(u)$ exists for all $u \in U$
and the operator $F'\colon U \to X^*$ is continuous.
Definition 2. Let the functional $F\colon U(u_0) \subseteq X \to \mathbb{R}$ be as given in Definition 1.
Then, F has a local minimum (resp., local maximum) at the
point $u_0$ iff there is an open neighborhood $V(u_0)$ of the point $u_0$ such that
$V(u_0) \subseteq U(u_0)$ and
$$F(u) \ge F(u_0) \quad \text{for all } u \in V(u_0)$$
(resp., $F(u) \le F(u_0)$ for all $u \in V(u_0)$).
The functional F has the critical point $u_0$ iff
$$\delta F(u_0; h) = 0 \quad \text{for all } h \in X.$$ (2)
If the Gâteaux derivative $F'(u_0)$ exists, then condition (2) is equivalent
to
$$F'(u_0) = 0.$$ (2*)
Example 3. Let the $C^1$-function $F\colon U(u_0) \subseteq \mathbb{R}^N \to \mathbb{R}$ be given on the
open neighborhood $U(u_0)$ of the point $u_0$, where N = 1, 2, .... Then,
$$\delta F(u_0; h) = \sum_{j=1}^{N} h_j \partial_j F(u_0) \quad \text{for all } h \in \mathbb{R}^N,$$
where $h = (h_1, \ldots, h_N)$. Thus, $u_0$ is a critical point of F iff
$$\partial_j F(u_0) = 0 \quad \text{for all } j = 1, \ldots, N.$$
Standard Example 4. Let X be a real Hilbert space. Set
$$F(u) := 2^{-1}(u \mid u) - (v \mid u) \quad \text{for all } u \in X$$
and fixed $v \in X$. Then, for all $u, h \in X$,
$$\delta F(u; h) = (u \mid h) - (v \mid h), \qquad \delta^2 F(u; h) = (h \mid h),$$
and $\delta^n F(u; h) = 0$ if $n \ge 3$.
Proof. Set $\phi(t) := F(u + th)$ for all $t \in \mathbb{R}$ and fixed $u, h \in X$. Then
$$\phi(t) = 2^{-1}(u \mid u) + t(u \mid h) + 2^{-1}t^2(h \mid h) - (v \mid u) - t(v \mid h).$$
By definition, $\delta^k F(u; h) = \phi^{(k)}(0)$. □
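The formulas of Standard Example 4 can be compared with difference quotients of $\phi$ in the Hilbert space $\mathbb{R}^3$. The following sketch is only a numerical check under assumed data: the vectors `u`, `h`, `v` and the step `t` are arbitrary illustrative choices, and central differences stand in for the derivatives $\phi'(0)$ and $\phi''(0)$.

```python
def inner(a, b):
    """Inner product (a | b) on R^n."""
    return sum(x * y for x, y in zip(a, b))

def F(u, v):
    """F(u) = (u | u) / 2 - (v | u) on the Hilbert space R^n."""
    return 0.5 * inner(u, u) - inner(v, u)

def phi(t, u, h, v):
    """phi(t) = F(u + t h), the restriction of F to a line."""
    return F([ui + t * hi for ui, hi in zip(u, h)], v)

# Arbitrary illustrative data.
u, h, v = [1.0, -2.0, 0.5], [0.3, 0.4, -1.0], [2.0, 0.0, 1.0]
t = 1e-3

# Central difference quotients approximating phi'(0) and phi''(0).
first = (phi(t, u, h, v) - phi(-t, u, h, v)) / (2 * t)
second = (phi(t, u, h, v) - 2 * phi(0, u, h, v) + phi(-t, u, h, v)) / t**2

print(first, inner(u, h) - inner(v, h))   # first variation
print(second, inner(h, h))                # second variation
```

Because $\phi$ is a quadratic polynomial in t, both difference quotients agree with the exact derivatives up to rounding error.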
2.2 Necessary and Sufficient Conditions for Local
Extrema and the Classical Calculus of
Variations
Theorem 2.A. Let the functional $F\colon U(u_0) \subseteq X \to \mathbb{R}$ be given on the open
neighborhood $U(u_0)$ of the point $u_0$ in the real normed space X. Then the
following are true:
(i) Necessary condition. If F has a local minimum or a local maximum
at the point $u_0$, then $u_0$ is a critical point of F, that is,
$$\delta F(u_0; h) = 0 \quad \text{for all } h \in X,$$ (3)
provided the first variation $\delta F(u_0; h)$ exists for each $h \in X$.
If the Gâteaux derivative $F'(u_0)$ exists, then condition (3) is equivalent
to
$$F'(u_0) = 0 \quad \text{(Euler equation)}.$$
(ii) Sufficient condition. The functional F has a local minimum at the
point $u_0$ provided the following hold true:
($\alpha$) Condition (3) is satisfied.
($\beta$) The second variation $\delta^2 F(u; h)$ exists for all u in an open neighborhood
of $u_0$ and for all $h \in X$. There is a constant c > 0 such that
$$\delta^2 F(u_0; h) \ge c\|h\|^2 \quad \text{for all } h \in X.$$
($\gamma$) For each given $\varepsilon > 0$, there is an $\eta(\varepsilon) > 0$ such that
$$|\delta^2 F(u; h) - \delta^2 F(u_0; h)| \le \varepsilon\|h\|^2$$
for all $u, h \in X$ with $\|u - u_0\| < \eta(\varepsilon)$.
Proof. Ad (i). Set $\phi(t) := F(u_0 + th)$, where the real parameter t lives in
a neighborhood of t = 0. The real function $\phi$ has a local minimum or local
maximum at t = 0. Hence
$$\phi'(0) = 0.$$
This is condition (3).
Ad (ii). Since $\phi'(0) = 0$, the classical Taylor theorem yields
$$F(u_0 + h) - F(u_0) = \phi(1) - \phi(0) = 2^{-1}\phi''(\vartheta) = 2^{-1}\delta^2 F(u_0 + \vartheta h; h) \quad \text{for all } h \in X,$$
where $0 < \vartheta < 1$. Using $\delta^2 F(u_0 + \vartheta h; h) = \delta^2 F(u_0; h) + [\delta^2 F(u_0 + \vartheta h; h) - \delta^2 F(u_0; h)]$,
we get
$$F(u_0 + h) - F(u_0) \ge \frac{1}{2}\Bigl(c - \frac{c}{2}\Bigr)\|h\|^2 \ge \frac{c}{4}\|h\|^2$$
for all $h \in X$ with $\|h\| < \eta(c/2)$. □
The same argument yields the following, slightly more general, result.
Corollary 1. Let the function $F\colon U(u_0) \subseteq X \to \mathbb{R}$ be given on the open
neighborhood $U(u_0)$ of the point $u_0$ in the normed space X. Let Y be a
linear subspace of X.
Suppose that F has a local minimum at the point $u_0$ with respect to the
plane $u_0 + Y$, that is, there is some r > 0 such that
$$F(u) \ge F(u_0) \quad \text{for all } u \in X \text{ with } u - u_0 \in Y \text{ and } \|u - u_0\| < r.$$
Then,
$$\delta F(u_0; h) = 0 \quad \text{for all } h \in Y,$$
provided the first variation $\delta F(u_0; h)$ exists for all $h \in Y$.
Standard Example 2. Let us study the following classical variational
problem³:
$$F(u) := \int_G L(x, u(x), \partial_1 u(x), \ldots, \partial_N u(x))\,dx = \min!,$$ (4)
$$u = g \text{ on } \partial G,$$
where G is a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$. We are given the
function $g \in C^1(\overline{G})$. Let the Lagrangian $L\colon \overline{G} \times \mathbb{R}^{N+1} \to \mathbb{R}$ be $C^1$.
We set $X := C^1(\overline{G})$, where X is equipped with the maximum norm
$\|u\| := \max_{x \in \overline{G}} |u(x)|$. Furthermore, set
$$Y := \{u \in X\colon u = 0 \text{ on } \partial G\}.$$
Suppose that $u_0 \in X$ is a local $C^1$-minimal point of the original problem
(4). That is, by definition, there is a number r > 0 such that
$$F(u) \ge F(u_0) \quad \text{for all } u \in X \text{ with } u - u_0 \in Y \text{ and } \|u - u_0\| < r.$$
Then, $u_0$ is a solution of the following Euler-Lagrange equation:
$$\sum_{j=1}^{N} \partial_j L_{\partial_j u}(p(x)) = L_u(p(x)) \quad \text{on } G,$$ (5)
where $p(x) := (x, u_0(x), \partial_1 u_0(x), \ldots, \partial_N u_0(x))$, and $L_{\partial_j u}$ and $L_u$ denote
the partial derivatives of L with respect to $\partial_j u$ and u, respectively.
³ Spaces of smooth functions like $C^1(\overline{G})$ were introduced in Section 2.2.3 of
AMS Vol. 108.
The following elegant proof dates back to Lagrange. This proof corresponds
to the proof of Corollary 1.
Proof. We set
$$\phi(t) := F(u_0 + th) \quad \text{for fixed } h \in Y,$$
where the real parameter t lives in a neighborhood of t = 0. Then the real
function $\phi$ has a local minimum at the point t = 0. Hence
$$\phi'(0) = 0.$$
Recall that $\delta F(u_0; h) = \phi'(0)$. Since
$$\phi(t) = \int_G L(x, u_0(x) + th(x), \partial_1 u_0(x) + t\partial_1 h(x), \ldots, \partial_N u_0(x) + t\partial_N h(x))\,dx,$$
we get
$$0 = \phi'(0) = \int_G \Bigl[\sum_{j=1}^{N} L_{\partial_j u}(p(x))\,\partial_j h(x) + L_u(p(x))h(x)\Bigr]dx.$$
In particular, choose $h \in C_0^\infty(G)$. Integration by parts⁴ yields
$$0 = \delta F(u_0; h) = \int_G \Bigl[\sum_{j=1}^{N} -\partial_j L_{\partial_j u}(p(x)) + L_u(p(x))\Bigr]h(x)\,dx \quad \text{for all } h \in C_0^\infty(G).$$
By the variational lemma from Section 2.2.3 in AMS Vol. 108, this implies
(5). □
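A one-dimensional instance makes the Euler-Lagrange equation (5) concrete. In the sketch below we assume G = ]0, 1[, boundary data g = 0, and the Lagrangian $L(x, u, \partial u) = (\partial u)^2/2 - u$; these choices, the midpoint quadrature in `energy`, and the direction $h(x) = \sin(\pi x)$ are our own illustrative assumptions. For this L, equation (5) reads $u'' = -1$, with solution $u_0(x) = x(1 - x)/2$, and the code compares $F(u_0)$ with $F(u_0 \pm 0.1h)$.

```python
from math import sin, cos, pi

def energy(u, du, n=2000):
    """Midpoint-rule value of F(u) = int_0^1 (u'(x)**2 / 2 - u(x)) dx."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += (0.5 * du(x) ** 2 - u(x)) * step
    return total

# Candidate from the Euler-Lagrange equation u'' = -1 with u(0) = u(1) = 0.
u0 = lambda x: 0.5 * x * (1.0 - x)
du0 = lambda x: 0.5 - x

# Admissible direction h in Y: h(0) = h(1) = 0.
h = lambda x: sin(pi * x)
dh = lambda x: pi * cos(pi * x)

def perturbed(t):
    """F(u0 + t h), evaluated with the same quadrature rule."""
    return energy(lambda x: u0(x) + t * h(x), lambda x: du0(x) + t * dh(x))

base = energy(u0, du0)
print(base, perturbed(0.1), perturbed(-0.1))   # both perturbations cost more
```

Here the first variation vanishes in the direction h, and the increase comes entirely from the quadratic term $t^2 \int_0^1 h'(x)^2\,dx / 2$.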
Remark 3 (Critical points). Instead of the minimum problem (4), let us
consider the following more general problem:
$$F(u) := \int_G L(x, u(x), \partial_1 u(x), \ldots, \partial_N u(x))\,dx = \text{stationary!},$$ (4*)
$$u = g \text{ on } \partial G.$$
Here we are looking for a critical point $u_0$ of F. By definition, $u_0$ is a
solution of (4*) iff $u_0 \in C^1(\overline{G})$ and the first variation vanishes, that is,
$$\delta F(u_0; h) = 0 \quad \text{for all } h \in Y.$$
The proof of Standard Example 2 immediately shows the following:
If $u_0$ is a solution of (4*), then $u_0$ is a solution of the Euler-Lagrange
equation (5).
⁴ See Section 2.2.5 in AMS Vol. 108.
Most variational problems in physics are not minimum problems, but they
are of the type that (4*) is (the principle of stationary action).
Remark 4 (Systems of Euler-Lagrange equations). Consider problem (4*),
where $u = (u_1, \ldots, u_M)$. Applying the proof of Standard Example 2 to each
fixed component $u_j$ of u, we obtain the following:
Let $u_0 = (u_{10}, \ldots, u_{M0})$ be a solution of (4*). Then, $u_0$ is a solution of
the Euler-Lagrange system
$$\sum_{j=1}^{N} \partial_j L_{\partial_j u_m}(p(x)) = L_{u_m}(p(x)) \quad \text{on } G, \quad m = 1, \ldots, M,$$ (5*)
where $p(x) := (x, u_0(x), \partial_1 u_0(x), \ldots, \partial_N u_0(x))$.
2.3 The Lack of Compactness in
Infinite-Dimensional Banach Spaces
Theorem 2.B. A Banach space X is finite-dimensional iff the closed unit
ball is compact.
Proof. Let $B := \{u \in X\colon \|u\| \le 1\}$. If dim X is finite, then B is compact,
by Corollary 8 in Section 1.12 of AMS Vol. 108.
Conversely, let dim X = ∞. We have to show that B is not compact. Suppose
first that X is a separable Hilbert space. Then there exists a countable
orthonormal system $(u_n)$ in X. By the Pythagorean theorem,
$$\|u_n - u_m\|^2 = \|u_n\|^2 + \|u_m\|^2 = 2 \quad \text{for all } n \neq m.$$
Thus, the sequence $(u_n)$ in B has no convergent subsequence, and hence B
is not compact.
Suppose now that X is a Banach space with dim X = ∞.
Step 1: Almost orthogonal elements. Let W be a closed linear subspace
of X with $W \neq X$. Then, for each $\varepsilon \in ]0, 1[$, there exists a point $u_\varepsilon \in X$
such that
$$\|u_\varepsilon\| = 1 \quad \text{and} \quad \operatorname{dist}(u_\varepsilon, W) \ge 1 - \varepsilon.$$ (6)
Recall that $\operatorname{dist}(v, W) = \inf_{w \in W} \|v - w\|$.
To prove (6), let $v \in X - W$. Then,
$$\operatorname{dist}(v, W) > 0.$$
Otherwise, there would exist a sequence $(w_n)$ in W such that $\|v - w_n\| \to 0$
as $n \to \infty$, and hence $v \in W$, since W is closed. But this contradicts $v \notin W$.
We choose a point $w_\varepsilon \in W$ with $0 < \|v - w_\varepsilon\| \le (1 - \varepsilon)^{-1}\operatorname{dist}(v, W)$.
Set $u_\varepsilon := \frac{v - w_\varepsilon}{\|v - w_\varepsilon\|}$. Then, $u_\varepsilon$ is the desired element. In fact,
$$\|u_\varepsilon - w\| = \|v - w_\varepsilon\|^{-1}\,\bigl\|v - w_\varepsilon - \|v - w_\varepsilon\| \cdot w\bigr\| \ge \|v - w_\varepsilon\|^{-1}\operatorname{dist}(v, W) \ge 1 - \varepsilon \quad \text{for all } w \in W.$$
Step 2: We want to show that B is not compact. To this end, we choose
a point $w_1 \in X$ with $\|w_1\| = 1$. Let $W := \operatorname{span}\{w_1\}$. By Step 1, there
exists a point $w_2 \in X$ with $\|w_2\| = 1$ and $\|w_2 - w_1\| \ge 2^{-1}$. Continuing
this construction, we get a sequence $(w_n)$ with $\|w_n\| = 1$ for all n and
$$\|w_n - w_m\| \ge 2^{-1} \quad \text{for all } n \neq m.$$
Thus, the sequence $(w_n)$ in B has no convergent subsequence, i.e., B is not
compact. □
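The Hilbert space part of the proof rests on the fact that orthonormal vectors have mutual distance $\sqrt{2}$. A finite-dimensional sketch can only hint at this: below we assume the space $\mathbb{R}^{50}$ with its standard basis (the dimension 50 and the helper names are our own choices) and verify that all pairwise distances coincide.

```python
from math import sqrt, isclose

def unit_vector(n, dim):
    """n-th standard basis vector of R^dim (0-based index)."""
    return [1.0 if i == n else 0.0 for i in range(dim)]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

dim = 50
basis = [unit_vector(n, dim) for n in range(dim)]

# Collect all pairwise distances; rounding removes floating-point noise.
distances = {round(dist(basis[n], basis[m]), 12)
             for n in range(dim) for m in range(n + 1, dim)}
print(distances)   # a single value, approximately sqrt(2)
```

In $\mathbb{R}^{50}$ there are only 50 such vectors, so compactness of the unit ball is not violated. In an infinite-dimensional Hilbert space the same computation applies to infinitely many vectors, which rules out a convergent subsequence.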
2.4 Weak Convergence
Recall that we introduced the following notation in Section 1.21 of AMS
Vol. 108:
$$\langle f, u \rangle := f(u) \quad \text{for all } f \in X^*, u \in X.$$
Definition 1. Let $(u_n)$ be a sequence in the normed space X over $\mathbb{K}$. We
write
$$u_n \rightharpoonup u \quad \text{as } n \to \infty$$ (7)
iff
$$\langle f, u_n \rangle \to \langle f, u \rangle \quad \text{as } n \to \infty \quad \text{for all } f \in X^*.$$ (7*)
We say that the sequence $(u_n)$ converges weakly to u in X as $n \to \infty$.
The weak limit u is uniquely determined. In fact, if $u_n \rightharpoonup u$ and $u_n \rightharpoonup v$
as $n \to \infty$, then f(u − v) = 0 for all $f \in X^*$, and hence u = v, by Corollary
3 in Section 1.1.
The norm convergence $u_n \to u$ as $n \to \infty$ (i.e., $\|u_n - u\| \to 0$ as $n \to \infty$)
is also called the strong convergence.
Standard Example 2. Let X be a Hilbert space over $\mathbb{K}$. Then
(i) $u_n \rightharpoonup u$ in X as $n \to \infty$ iff
$$(v \mid u_n) \to (v \mid u) \quad \text{as } n \to \infty \quad \text{for all } v \in X.$$
(ii) Suppose that dim X = ∞, and let $(u_n)$ be a countable orthonormal
system in X. Then, $(u_n)$ has no convergent subsequence, but
$$u_n \rightharpoonup 0 \quad \text{as } n \to \infty.$$
In particular, observe that $\|u_n\| = 1$ for all n, but the weak limit of $(u_n)$
does not belong to the boundary of the unit ball. We will show in Corollary
4 ahead that the weak limit of $(u_n)$ always belongs to the closed convex
hull of the set $\{u_1, u_2, \ldots\}$.
Proof. Ad (i). This follows from the Riesz theorem in Section 2.10 of AMS
Vol. 108.
Ad (ii). By the proof of Theorem 2.B, the sequence $(u_n)$ has no convergent
subsequence. Moreover, for each $v \in X$, the Bessel inequality from
Section 3.1 of AMS Vol. 108 yields
$$\sum_{n=1}^{\infty} |(v \mid u_n)|^2 \le \|v\|^2,$$
and hence $(v \mid u_n) \to 0$ as $n \to \infty$ for all $v \in X$. □
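Part (ii) can be observed numerically for a concrete orthonormal system. The sketch below assumes the real Hilbert space $L^2(0, 1)$, the system $u_n(x) = \sqrt{2}\sin(n\pi x)$, and the test element v(x) = x; all three are our own illustrative choices, and `inner_L2` is a simple midpoint-rule approximation of the inner product.

```python
from math import sin, sqrt, pi

def inner_L2(f, g, n=20000):
    """Midpoint-rule approximation of the L^2(0, 1) inner product (f | g)."""
    step = 1.0 / n
    return sum(f((i + 0.5) * step) * g((i + 0.5) * step)
               for i in range(n)) * step

def u(n):
    """Orthonormal system u_n(x) = sqrt(2) sin(n pi x) in L^2(0, 1)."""
    return lambda x: sqrt(2.0) * sin(n * pi * x)

v = lambda x: x   # a fixed test element (an arbitrary choice)

norms = [inner_L2(u(n), u(n)) for n in (1, 5, 25)]
coeffs = [inner_L2(v, u(n)) for n in (1, 5, 25)]
print(norms)    # each close to 1, so (u_n) does not converge to 0 in norm
print(coeffs)   # magnitudes shrink roughly like 1/n
```

For this v the exact coefficients are $(v \mid u_n) = \sqrt{2}(-1)^{n+1}/(n\pi)$, which tend to zero although every $u_n$ has norm one.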
Proposition 3. Let X be a normed space over $\mathbb{K}$. Then
(i) $u_n \to u$ in X as $n \to \infty$ implies $u_n \rightharpoonup u$ in X as $n \to \infty$.
(ii) The converse is true if X is finite-dimensional.
Proof. Ad (i). This follows from the continuity of the functional f in (7*).
Ad (ii). If X = {0}, then the statement is trivial. Let dim X = N,
where N = 1, 2, .... Choose a basis $\{e_1, \ldots, e_N\}$ of X. Then, each functional
$f \in X^*$ has the form
$$f(u) = \sum_{k=1}^{N} \alpha_k \beta_k, \quad \text{where } u = \sum_{k=1}^{N} \alpha_k e_k,$$
and $\alpha_k, \beta_k \in \mathbb{K}$ for all k. Letting $\beta_k = 1$ for fixed k and $\beta_m = 0$ for all
$m \neq k$, we obtain that $u_n \rightharpoonup u$ as $n \to \infty$ is equivalent to the convergence
of the corresponding components. In turn, this is equivalent to $u_n \to u$ as
$n \to \infty$. □
Theorem 2.C. Each bounded sequence $(u_n)$ in a Hilbert space X over $\mathbb{K}$
has a weakly convergent subsequence.
Corollary 4. The limit point of each weakly convergent subsequence of $(u_n)$
belongs to the closed convex hull of the set $\{u_1, u_2, \ldots\}$.
In the following proof we will critically use the Riesz theorem from Section
2.10 of AMS Vol. 108.
Proof of Theorem 2.C. For X = {0}, the statement is trivial. Let $X \neq \{0\}$.
Step 1: Suppose first that X is separable. We choose a countable set $\{v_k\}$,
which is dense in X, and we use the following diagonal procedure:
$$(u_{11} \mid v_1), (u_{12} \mid v_1), (u_{13} \mid v_1), \ldots \to a_1,$$
$$(u_{21} \mid v_2), (u_{22} \mid v_2), (u_{23} \mid v_2), \ldots \to a_2,$$
$$\ldots$$
To be precise, since $|(u_n \mid v_1)| \le \|u_n\|\|v_1\|$ for all n, the sequence of the
numbers $(u_n \mid v_1)$ is bounded in $\mathbb{K}$. Thus, there exists a subsequence of
$(u_n)$, denoted by $(u_{1n})$, such that $(u_{1n} \mid v_1) \to a_1$ as $n \to \infty$. Furthermore,
there exists a subsequence $(u_{2n})$ of $(u_{1n})$ such that $(u_{2n} \mid v_2) \to a_2$ as
$n \to \infty$, and so on.
The diagonal sequence $(w_n)$ defined by $w_n := u_{nn}$ has the crucial property
that
$$(w_n \mid v_k) \to a_k \quad \text{as } n \to \infty \quad \text{for all } k.$$
Moreover, there exist numbers a(v) such that
$$(w_n \mid v) \to a(v) \quad \text{as } n \to \infty \quad \text{for each } v \in X.$$ (8)
This follows from
$$|(w_n - w_m \mid v)| = |(w_n - w_m \mid v - v_k) + (w_n - w_m \mid v_k)| \le \|w_n - w_m\|\|v - v_k\| + |(w_n - w_m \mid v_k)| < \varepsilon$$
for suitable $v_k$ and all $n, m \ge n_0(\varepsilon)$. Note that the sequence $(w_n)$ is bounded
and the set $\{v_k\}$ is dense in X.
Obviously, the map $v \mapsto a(v)$ is linear and, by (8),
$$|a(v)| \le \|v\| \sup_n \|w_n\| \quad \text{for all } v \in X.$$
According to the Riesz theorem from Section 2.10 of AMS Vol. 108, there
exists a $w \in X$ such that
$$a(v) = (w \mid v) \quad \text{for all } v \in X.$$
By (8), $(v \mid w_n) \to (v \mid w)$ as $n \to \infty$ for all $v \in X$. Hence
$$w_n \rightharpoonup w \quad \text{as } n \to \infty.$$
Step 2: If the Hilbert space X is not separable, then let Y be the closure
of $\operatorname{span}\{u_1, u_2, \ldots\}$. Y is thus separable. In fact, for each $\alpha \in \mathbb{K}$ and each
$\varepsilon > 0$, there are rational numbers $\beta$ and $\gamma$ such that
$$|\alpha - (\beta + \gamma i)| < \varepsilon.$$
Consequently, the countable set of all the finite linear combinations
$$(\beta_1 + \gamma_1 i)u_1 + \cdots + (\beta_n + \gamma_n i)u_n, \quad n = 1, 2, \ldots,$$
with rational coefficients $\beta_j, \gamma_j$ is dense in Y.
Applying Step 1 to the space Y, there exists a subsequence $(w_n)$ of $(u_n)$
such that
$$(v \mid w_n) \to (v \mid w) \quad \text{as } n \to \infty \quad \text{for all } v \in Y \text{ and fixed } w \in Y.$$ (9)
Let $z \in X$. According to Section 2.9 in AMS Vol. 108, we get the decomposition
$z = v + v^\perp$, where $v \in Y$ and $v^\perp \in Y^\perp$. Since $(v^\perp \mid y) = 0$ for all
$y \in Y$, it follows from (9) that
$$(z \mid w_n) \to (z \mid w) \quad \text{as } n \to \infty \quad \text{for all } z \in X.$$
Hence $w_n \rightharpoonup w$ as $n \to \infty$. □
Proof of Corollary 4. Let $(w_n)$ be a bounded sequence in the Hilbert space $X$ such that $w_n \rightharpoonup w$ as $n \to \infty$, that is,
$(v \mid w_n) \to (v \mid w)$ as $n \to \infty$ for all $v \in X$. (10)
It is sufficient to prove that there are indices $n_1 < n_2 < \cdots$ such that
$k^{-1}(w_{n_1} + \cdots + w_{n_k}) \to w$ as $k \to \infty$.
Replacing $w_n$ by $w_n - w$, we can assume that $w = 0$.
First let $n_1 := 1$. By (10),
$(w_{n_1} \mid w_n) \to 0$ as $n \to \infty$.
Thus, there exists an index $n_2 > n_1$ such that $|(w_{n_1} \mid w_{n_2})| \le 2^{-1}$. Continuing this procedure, we get indices such that
$|(w_{n_1} \mid w_{n_k})| \le (k-1)^{-1}, \ldots, |(w_{n_{k-1}} \mid w_{n_k})| \le (k-1)^{-1},$
for all $k = 3, 4, \ldots$. Since $\|w_n\| \le \text{const} = C$ for all $n$, we obtain
$\|k^{-1}(w_{n_1} + \cdots + w_{n_k})\|^2 \le k^{-2} \{ \sum_{j=1}^{k} \|w_{n_j}\|^2 + 2 \sum_{j=1}^{k-1} |(w_{n_j} \mid w_{n_k})| + 2 \sum_{j=1}^{k-2} |(w_{n_j} \mid w_{n_{k-1}})| + \cdots \}$
$\le k^{-2} (kC^2 + 2(k-1)(k-1)^{-1} + 2(k-2)(k-2)^{-1} + \cdots + 2)$
$\le k^{-1}(C^2 + 2) \to 0$ as $k \to \infty$. □
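The estimate above can be illustrated numerically. The following minimal sketch is a toy model that is not part of the original text: it assumes the truncation of $l_2$ to $\mathbb{R}^k$ and takes $w_n = e_n$, the standard unit vectors, which converge weakly to $0$ but not in norm. The helper name `mean_norm` is hypothetical.

```python
import math

def mean_norm(k):
    """Norm of k^{-1}(e_1 + ... + e_k) for the standard unit vectors e_j in R^k."""
    # The arithmetic mean has k coordinates, each equal to 1/k.
    mean = [1.0 / k] * k
    return math.sqrt(sum(x * x for x in mean))

# Each e_n has norm 1, so (e_n) does not converge to 0 in norm,
# but the arithmetic means do: their norm equals k^{-1/2}.
norms = [mean_norm(k) for k in (1, 4, 100, 10000)]
```

Here $C = 1$ and all mixed inner products vanish, so the squared norm of the mean is $k^{-1}$, in agreement with the bound $k^{-1}(C^2 + 2)$.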
2.5 The Generalized Weierstrass Existence
Theorem
Definition 1. Let $F: M \subseteq X \to \mathbb{R}$ be a functional on the subset $M$ of the real normed space $X$. Then
(i) $F$ is called weakly sequentially continuous iff, for each $u \in M$ and each sequence $(u_n)$ in $M$,
$u_n \rightharpoonup u$ implies $F(u_n) \to F(u)$ as $n \to \infty$.
(ii) $F$ is called weakly sequentially lower semicontinuous$^5$ iff
$F(u) \le \underline{\lim}_{n \to \infty} F(u_n)$ (11)
for each $u \in M$ and each sequence $(u_n)$ in $M$ with $u_n \rightharpoonup u$ as $n \to \infty$.
(iii) $F$ is called coercive iff $F(u)/\|u\| \to +\infty$ as $\|u\| \to \infty$ on $M$.
(iv) $F$ is called weakly coercive iff $F(u) \to +\infty$ as $\|u\| \to \infty$ on $M$.
(v) $F$ is called strictly convex iff the set $M$ is convex and
$F(\alpha u + (1 - \alpha)v) < \alpha F(u) + (1 - \alpha)F(v),$
for all $\alpha \in ]0, 1[$ and all $u, v \in M$ with $u \neq v$.
Recall that $F: M \to \mathbb{R}$ is convex iff "$<$" in (v) is replaced with "$\le$."
In the case where F is a real function, the strict convexity of F means
that the graph of F lies properly under the chord. The functions F pictured
in Figures 2.2(a) and 2.2(b) are convex and strictly convex, respectively.
Intuitively, strict convexity of $F$ ensures the uniqueness of the minimal point $u_0$ (Figure 2.2(b)). This will be proved rigorously in Corollary 2 just ahead.
Moreover, it follows intuitively from Figure 2.2 that each local minimum
of a convex function is also a global minimum. This will be proved in Section
2.9.
Theorem 2.D. Suppose that the functional $F: M \to \mathbb{R}$ has the following three properties:
(i) $M$ is a nonempty closed convex subset of the real Hilbert space $X$.
(ii) $F$ is weakly sequentially lower semicontinuous.
(iii) If the set $M$ is unbounded, then $F$ is weakly coercive.
Then the minimum problem
$F(u) = \min!, \quad u \in M,$ (12)
has a solution.
FIGURE 2.2. (a) not strictly convex; (b) strictly convex, with minimal point $u_0$.
$^5$ The definition and properties of the classical symbols "$\overline{\lim}$" and "$\underline{\lim}$" will be recalled in Problem 2.7b. The intuitive meaning of (11) will be discussed in Problem 2.7b.
Corollary 2. If, in addition, F is strictly convex, then problem (12) has a
unique solution u.
Proof of Theorem 2.D. Step 1: Suppose that $M$ is bounded. Set
$\gamma := \inf_{u \in M} F(u)$.
Hence $-\infty \le \gamma < \infty$. Then there exists a sequence $(u_n)$ in $M$ such that
$F(u_n) \to \gamma$ as $n \to \infty$. (13)
Since $M$ is bounded, the sequence $(u_n)$ is bounded. By Theorem 2.C, there exists a weakly convergent subsequence, again denoted by $(u_n)$, such that $u_n \rightharpoonup u$ as $n \to \infty$. Corollary 4 in Section 2.4 tells us that $u$ lies in the closed convex hull of $\{u_1, u_2, \ldots\}$. Since $M$ is closed and convex, $u \in M$.
Since $F$ is weakly sequentially lower semicontinuous,
$F(u) \le \underline{\lim}_{n \to \infty} F(u_n) = \gamma$.
Since $\gamma \le F(u)$ by the definition of $\gamma$, this implies $F(u) = \gamma$.
Step 2: Suppose that the set $M$ is unbounded. Fix $v \in M$. Since $F(u) \to +\infty$ as $\|u\| \to \infty$, there exists an $r > 0$ such that
$F(u) > F(v)$ for all $u \in M$ with $\|u\| > r$, (14)
and the set $M_r := \{u \in M: \|u\| \le r\}$ is not empty. By (14), each solution $u$ of the modified problem
$F(u) = \min!, \quad u \in M_r,$ (15)
is also a solution of the original problem (12).
Since the set $M_r$ is closed, convex, and bounded, problem (15) has a solution, by Step 1. □
Proof of Corollary 2. Suppose that problem (12) has two different solutions, $u$ and $v$. Then, $\frac{1}{2}(u + v) \in M$, and hence
$F(\frac{1}{2}(u + v)) < \frac{1}{2}F(u) + \frac{1}{2}F(v) = F(u)$.
This contradicts the fact that $F(u)$ is the minimal value of $F$ on $M$. □
Definition 3. Let $F: M \to \mathbb{R}$ be a functional on the subset $M$ of the real normed space $X$. For each $r \in \mathbb{R}$, set
$M_r := \{u \in M: F(u) \le r\}$.
(a) $F$ is called lower semicontinuous on the closed set $M$ iff the set $M_r$ is closed for all $r \in \mathbb{R}$.
(b) $F$ is called quasi-convex on the convex set $M$ iff the set $M_r$ is convex for all $r \in \mathbb{R}$.
The following hold:
$F$ is convex $\Rightarrow$ $F$ is quasi-convex;
$F$ is continuous $\Rightarrow$ $F$ is lower semicontinuous.
In fact, let $F$ be convex on the convex set $M$. If $u, v \in M_r$, then
$F(\alpha u + (1 - \alpha)v) \le \alpha F(u) + (1 - \alpha)F(v) \le \alpha r + (1 - \alpha)r \le r$ for all $\alpha \in [0, 1]$,
and hence $\alpha u + (1 - \alpha)v \in M_r$. Furthermore, if $F$ is continuous on the closed set $M$, then it follows from $u_n \in M_r$ for all $n$ and $u_n \to u$ as $n \to \infty$ that $F(u_n) \le r$, and hence $F(u) \le r$ (i.e., $u \in M_r$).
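The implication "convex $\Rightarrow$ quasi-convex" cannot be reversed. The following minimal sketch checks this on a grid for the test functional $F(u) = |u|^{1/2}$ on $\mathbb{R}$, which is an assumption chosen for illustration and does not appear in the original text; the helper `in_sublevel` is likewise hypothetical.

```python
import math

def F(u):
    """Hypothetical test functional on R: quasi-convex but not convex."""
    return math.sqrt(abs(u))

def in_sublevel(u, r):
    """Membership in the sublevel set M_r = {u : F(u) <= r}."""
    return F(u) <= r

# Convexity fails on the chord between u = 0 and v = 4.
u, v, alpha = 0.0, 4.0, 0.5
convexity_gap = F(alpha * u + (1 - alpha) * v) - (alpha * F(u) + (1 - alpha) * F(v))

# Sublevel sets are intervals: midpoints of members remain members.
r = 1.5
grid = [i / 10.0 for i in range(-40, 41)]
members = [x for x in grid if in_sublevel(x, r)]
midpoints_ok = all(in_sublevel(0.5 * (a + b), r) for a in members for b in members)
```

A positive `convexity_gap` means that the value at the midpoint lies above the chord, so $F$ is not convex, while `midpoints_ok` reflects the convexity of $M_r$ on the sampled grid.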
Proposition 4. Suppose that the functional $F: M \subseteq X \to \mathbb{R}$ has the following three properties:
(i) $M$ is a nonempty, closed, convex subset of the real Hilbert space $X$.
(ii) $F$ is quasi-convex and lower semicontinuous.
(iii) If $M$ is unbounded, then $F$ is weakly coercive.
Then, the minimum problem $F(u) = \min!$, $u \in M$, has a solution. This solution is unique provided that $F$ is strictly convex.
This follows from Theorem 2.D and Corollary 2 along with the following
result.
Lemma 5. Let the functional $F: M \subseteq X \to \mathbb{R}$ be lower semicontinuous and quasi-convex on the nonempty, closed, convex set $M$ of the real normed space $X$.
Then, $F$ is weakly sequentially lower semicontinuous on $M$.
Proof. If the assertion is not true, then there exist a point $u \in M$ and a sequence $(u_n)$ in $M$ such that $u_n \rightharpoonup u$ as $n \to \infty$ and
$F(u) > \underline{\lim}_{n \to \infty} F(u_n)$.
Consequently, there is a real number $r$ so that $r < F(u)$ and, after passing to a subsequence if necessary, $u_n \in M_r$ for all $n$. Since $M_r$ is closed and convex, it follows from $u_n \rightharpoonup u$ as $n \to \infty$ that $u \in M_r$, and hence $F(u) \le r$. This is a contradiction. □
Remark 6 (Generalization to reflexive Banach spaces). All the results of
this section remain valid if X is not a real Hilbert space but rather a real
reflexive Banach space.
The proofs given above remain unchanged. However, instead of the prop-
erties of weak convergence in Hilbert spaces (Theorem 2.C and Corollary
4 in Section 2.4), we have to use the corresponding properties in reflexive
Banach spaces that will be proved in Section 2.8.
2.6 Applications to the Calculus of Variations
Instead of the classical variational problem
$F(u) := \int_G L(x, u(x), \partial_1 u(x), \ldots, \partial_N u(x))\,dx = \min!,$
$u = g$ on $\partial G$ (boundary condition),
let us consider the following generalized problem on a Sobolev space$^6$:
$F(u) = \min!, \quad u \in W_2^1(G),$
$u - g \in \mathring{W}_2^1(G)$ (generalized boundary condition). (16)
$^6$ The Sobolev spaces $W_2^1(G)$ and $\mathring{W}_2^1(G)$ were introduced in Section 2.5 of AMS Vol. 108.
We assume that
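(H1) $G$ is a nonempty, bounded, open set in $\mathbb{R}^N$, $N \ge 1$.
(H2) The function $L: \overline{G} \times \mathbb{R}^{N+1} \to \mathbb{R}$ is continuous.
(H3) (Convexity) For each $x \in \overline{G}$, the function $L(x, \ldots)$ is convex on $\mathbb{R}^{N+1}$ with respect to the variables $u, \partial_1 u, \ldots, \partial_N u$.
(H4) (Growth condition) For all $(x, u, \partial_1 u, \ldots, \partial_N u) \in \overline{G} \times \mathbb{R}^{N+1}$,
$|L(x, u, \partial_1 u, \ldots, \partial_N u)| \le \text{const}\,(|u|^2 + \sum_{j=1}^{N} |\partial_j u|^2)$.
(H5) (Coerciveness condition) For all $(x, u, \partial_1 u, \ldots, \partial_N u) \in \overline{G} \times \mathbb{R}^{N+1}$,
$c \sum_{j=1}^{N} |\partial_j u|^2 - d \le L(x, u, \partial_1 u, \ldots, \partial_N u),$
where $c > 0$ and $d \ge 0$ are constants.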
Proposition 1. For each given function $g \in W_2^1(G)$, the variational problem (16) has a solution.
The proof will be based on the following result.
Lemma 2. Let $u_n \to u$ in $L_2(G)$ as $n \to \infty$, where $G$ is a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Then, there exist a subsequence $(u_{n'})$ and a function $w \in L_2(G)$ such that
$u_{n'}(x) \to u(x)$ as $n' \to \infty$ for almost all $x \in G$,
and $|u_{n'}(x)| \le w(x)$ for all $n'$ and all $x \in G$.
Proof. First let $G$ be an open interval $]a, b[$ in $\mathbb{R}$. Then, the result follows from Step 4 of the proof of Standard Example 4 in Section 2.2.1 of AMS Vol. 108 with $w = |v_1| + s$. For general open sets in $\mathbb{R}^N$, the proof proceeds completely analogously. □
Proof of Proposition 1. We want to use Proposition 4 from Section 2.5.
To simplify notation, we consider the case where N = 1. The general case
proceeds completely analogously.
Set $Y := W_2^1(G)$. Recall that
$\|u\| = (\int_G (|u(x)|^2 + |\partial u(x)|^2)\,dx)^{1/2}$.
Step 1: We show that the functional $F: Y \to \mathbb{R}$ is convex, weakly coercive, and continuous.
It follows from the growth condition (H4) that
$F(u) := \int_G L(x, u(x), \partial u(x))\,dx \le \text{const}\,\|u\|^2$.
Thus, the functional $F: Y \to \mathbb{R}$ is well defined. By (H3), $F$ is convex. The coerciveness condition (H5) yields
$c \int_G |\partial u(x)|^2\,dx - d \int_G dx \le \int_G L(x, u(x), \partial u(x))\,dx$.
By the Poincaré-Friedrichs inequality from Section 2.5.6 of AMS Vol. 108,
there is a constant C > 0 such that
for all u E Y.
Thus, $\|u\| \to \infty$ on $Y$ implies $F(u) \to +\infty$. Hence $F$ is weakly coercive on $Y$.
To prove that $F: Y \to \mathbb{R}$ is continuous, let
$u_n \to u$ in $Y$ as $n \to \infty$.
Hence
$u_n \to u$ and $\partial u_n \to \partial u$ in $L_2(G)$ as $n \to \infty$.
By Lemma 2, there are a subsequence $(u_{n'})$ and functions $v, w \in L_2(G)$ such that
$u_{n'}(x) \to u(x)$ and $\partial u_{n'}(x) \to \partial u(x)$ as $n' \to \infty$,
for almost all $x \in G$, and
$|u_{n'}(x)| \le v(x)$ and $|\partial u_{n'}(x)| \le w(x)$
for all $n'$ and all $x \in G$. The growth condition (H4) tells us that
$|L(x, u_{n'}(x), \partial u_{n'}(x))| \le \text{const}\,(|v(x)|^2 + |w(x)|^2),$
for all $n'$ and all $x \in G$. By the continuity of $L$,
$L(x, u_{n'}(x), \partial u_{n'}(x)) \to L(x, u(x), \partial u(x))$ as $n' \to \infty$,
for almost all $x \in G$. Thus, the dominated convergence theorem (cf. the appendix of AMS Vol. 108) yields
$\int_G L(x, u_{n'}(x), \partial u_{n'}(x))\,dx \to \int_G L(x, u(x), \partial u(x))\,dx$ as $n' \to \infty$.
That is,
$F(u_{n'}) \to F(u)$ as $n' \to \infty$.
The same argument tells us that each convergent subsequence of $(F(u_n))$ has the limit $F(u)$. Hence the total sequence is convergent. That is,
$F(u_n) \to F(u)$ as $n \to \infty$.
Step 2: Let $X := \mathring{W}_2^1(G)$, and let $H(v) := F(g + v)$ for all $v \in X$. By Step 1, the functional $H: X \to \mathbb{R}$ is convex, weakly coercive, and continuous. Proposition 4 in Section 2.5 with $M = X$ tells us that the minimum problem
$H(v) = \min!, \quad v \in X,$ (16*)
has a solution. Problem (16*) is equivalent to the original problem (16). □
2.7 Applications to Nonlinear Eigenvalue
Problems
Let us consider the following eigenvalue problem:
$Au = \lambda u, \quad u \in X, \ \lambda \in \mathbb{R}, \ \|u\| = r,$ (17)
for the nonlinear operator $A: X \to X$. We assume that
(H1) The functional $F: X \to \mathbb{R}$ is weakly sequentially continuous on the real Hilbert space $X$, where $X \neq \{0\}$.
(H2) The operator $A: X \to X$ corresponds to the Fréchet derivative of the functional $F$; in other words, for each given $u \in X$, we have
$F(u + h) - F(u) = (Au \mid h) + \|h\| \varepsilon(h; u)$ for all $h \in X$, (18)
where $\varepsilon(h; u) \to 0$ as $h \to 0$.
(H3) $Au = 0$ implies $u = 0$, and $F(0) = 0$.
(H4) There is a point $v \in X$ with $F(v) > 0$. We choose $r > 0$ in such a way that $\|v\| \le r$.
Proposition 1. The eigenvalue problem (17) has a solution.
We will show that each solution $w$ of the maximum problem
$F(u) = \max!, \quad \|u\| = r,$ (19)
is a solution of the original eigenvalue problem (17).
Proof. Step 1: Let $B := \{u \in X: \|u\| \le r\}$. We first show that the modified maximum problem
$F(u) = \max!, \quad u \in B,$ (19*)
has a solution $w$. However, this follows from Theorem 2.D applied to $-F$, since $B$ is nonempty, closed, convex, and bounded, and $-F$ is weakly sequentially continuous by (H1).
Step 2: We show that $\|w\| = r$. In fact, if $\|w\| < r$, then $w$ is an inner point of $B$, and hence Theorem 2.A yields
$\delta F(w; h) = 0$ for all $h \in X$.
By (18), $\delta F(w; h) = (Aw \mid h)$. Thus, $Aw = 0$, and hence $w = 0$, by (H3). This implies
$F(w) = F(0) = 0$.
Hence $F(u) \le 0$ for all $u \in B$. By (H4), there is some $v \in B$ with $F(v) > 0$. This is a contradiction.
Summarizing, we obtain $w$ as a solution to the maximum problem (19).
Step 3: Finally, we prove that $w$ is a solution to the original eigenvalue problem (17). Let $Y := \text{span}\{w\}$. Set
$\psi(t) := F((\cos t)w + (\sin t)k)$ for all $t \in \mathbb{R}$ and fixed $k \in Y^{\perp}$.
If we let $u := w$ and $h := (\cos t - 1)w + (\sin t)k$, it follows from (18) that
$\psi(t) - \psi(0) = t(Aw \mid k) + t\eta(t)$, where $\eta(t) \to 0$ as $t \to 0$.
Hence
$\psi'(0) = (Aw \mid k)$.
Observe that, for all $k \in Y^{\perp}$ with $\|k\| = r$,
$\|(\cos t)w + (\sin t)k\|^2 = \cos^2 t\,\|w\|^2 + \sin^2 t\,\|k\|^2 = (\cos^2 t + \sin^2 t) r^2 = r^2$.
Since $w$ is a solution of (19), the real function $\psi$ has a local maximum at the point $t = 0$. Hence $\psi'(0) = 0$ for such $k$, and by rescaling $k$,
$(Aw \mid k) = 0$ for all $k \in Y^{\perp}$.
Recall that $Y := \{\lambda w: \lambda \in \mathbb{R}\}$. Using the orthogonal decomposition
$Aw = y + k$ with $y \in Y$ and $k \in Y^{\perp}$,
we get $y = \lambda w$ and $k = 0$ because $0 = (Aw \mid k) = (k \mid k)$.
Therefore, $Aw = \lambda w$, that is, $w$ is a solution of (17). □
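The mechanism of the proof can be observed in the simplest finite-dimensional setting. The following minimal sketch assumes $X = \mathbb{R}^2$ and $F(u) = \frac{1}{2}(u \mid Su)$ with a symmetric invertible matrix $S$, so that $Au = Su$; the matrix `S`, the radius `r`, and the brute-force search over the circle are illustrative assumptions, not part of the original text.

```python
import math

S = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical symmetric, invertible matrix

def A(u):
    """A u = S u, the derivative of F(u) = (1/2)(u | S u)."""
    return [S[0][0] * u[0] + S[0][1] * u[1], S[1][0] * u[0] + S[1][1] * u[1]]

def F(u):
    Au = A(u)
    return 0.5 * (u[0] * Au[0] + u[1] * Au[1])

r = 2.0
n = 3600
# Brute-force search for a maximizer of F on the circle ||u|| = r.
candidates = [[r * math.cos(2 * math.pi * i / n), r * math.sin(2 * math.pi * i / n)]
              for i in range(n)]
w = max(candidates, key=F)

# The quotient (Aw | w) / r^2 gives the eigenvalue belonging to w.
Aw = A(w)
lam = (Aw[0] * w[0] + Aw[1] * w[1]) / (r * r)
residual = math.hypot(Aw[0] - lam * w[0], Aw[1] - lam * w[1])
```

For this particular matrix the maximizer on the circle points in the direction $(1, 1)$, and `lam` is the largest eigenvalue of `S`; the small `residual` reflects $Aw = \lambda w$.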
2.8 Reflexive Banach Spaces
Let $X$ be a normed space over $\mathbb{K}$. Recall from Section 1.21 of AMS Vol. 108 that, by definition, the dual space $X^*$ consists of all linear continuous functionals $f: X \to \mathbb{K}$. We set
$X^{**} := (X^*)^*,$
that is, the bidual space $X^{**}$ consists of all linear continuous functionals $F: X^* \to \mathbb{K}$. Recall also that we have introduced the following notation:
$\langle f, u \rangle := f(u)$ for all $f \in X^*$, $u \in X$.
The following definition is crucial for the general theory of variational problems in terms of functional analysis.
Definition 1. The normed space $X$ over $\mathbb{K}$ is called reflexive iff each $F \in X^{**}$ allows the following representation:
$F(f) = \langle f, u \rangle$ for all $f \in X^*$ and some fixed $u \in X$.
Standard Example 2. Each Hilbert space $X$ over $\mathbb{K}$ is reflexive.
We will show that this is a consequence of the Riesz theorem from Section 2.10 in AMS Vol. 108.
Proof. By Section 2.11 in AMS Vol. 108, it follows from the Riesz theorem that there exists a bijective map $J: X \to X^*$ such that
$\langle Ju, v \rangle = (u \mid v)$ for all $u, v \in X$,
and $J$ is antilinear, meaning that
$J(\alpha u + \beta w) = \bar{\alpha} Ju + \bar{\beta} Jw$ for all $u, w \in X$, $\alpha, \beta \in \mathbb{K}$,
where the bar denotes the conjugate complex number. Moreover, we have $\|Ju\| = \|u\|$ for all $u \in X$. Let $F \in X^{**}$. We set
$G(u) := \overline{F(Ju)}$ for all $u \in X$.
Then, $G: X \to \mathbb{K}$ is linear. For all $u \in X$,
$|G(u)| \le \|F\| \|Ju\| = \|F\| \|u\|$.
Hence $G \in X^*$. By the Riesz theorem in Section 2.10 of AMS Vol. 108, there exists a $v \in X$ such that
$G(u) = (v \mid u)$ for all $u \in X$.
This implies
$F(w) = \overline{G(J^{-1}w)} = (J^{-1}w \mid v) = \langle w, v \rangle$ for all $w \in X^*$. □
Proposition 3. Let $X$ be a normed space over $\mathbb{K}$. Define the map $j: X \to X^{**}$ through
$j(u)(f) := \langle f, u \rangle$ for all $u \in X$, $f \in X^*$.
Then the following are true:
(i) The map $j$ is linear and
$\|j(u)\| = \|u\|$ for all $u \in X$.
(ii) The space $X$ is reflexive iff $j: X \to X^{**}$ is bijective.
Proof. Ad (i). Obviously, $j(\alpha u + \beta v) = \alpha j(u) + \beta j(v)$ for all $u, v \in X$ and $\alpha, \beta \in \mathbb{K}$. Moreover,
$\|j(u)\| = \sup_{f \in X^*, \|f\| \le 1} |\langle f, u \rangle| = \|u\|,$
by Corollary 2 in Section 1.1.
Ad (ii). It follows from (i) that $j: X \to X^{**}$ is injective, since $j(u) = 0$ implies $u = 0$. By Definition 1, $X$ is reflexive iff $j$ is surjective. □
Proposition 4. Every closed linear subspace $Y$ of a reflexive Banach space $X$ over $\mathbb{K}$ is again a reflexive Banach space.
Proof. Obviously, $Y$ is a Banach space. The following simple arguments will be based on restrictions and extensions of functionals. In particular, we will use the Hahn-Banach theorem and the separation of convex sets.
For each given $x^* \in X^*$, let $x^*_Y: Y \to \mathbb{K}$ denote the restriction of $x^*: X \to \mathbb{K}$ to the subspace $Y$. Clearly, $x^*_Y \in Y^*$. In this sense, $Y \subseteq X$ implies
$X^* \subseteq Y^*$. (20)
Hence we obtain $(Y^*)^* \subseteq (X^*)^*$, that is,
$Y^{**} \subseteq X^{**}$. (21)
Let $y^{**} \in Y^{**}$. We have to show that there exists a $y \in Y$ such that
$y^{**}(y^*) = y^*(y)$ for all $y^* \in Y^*$. (22)
In fact, it follows from (20) and (21) that
$y^{**}(x^*_Y) = y^{**}(x^*)$ for all $x^* \in X^*$. (23)
Since $X$ is reflexive, there exists a $y \in X$ such that
$y^{**}(x^*) = x^*(y)$ for all $x^* \in X^*$. (24)
We claim that $y \in Y$. Otherwise, we would have $\text{dist}(y, Y) > 0$, since $Y$ is closed in $X$. By Proposition 3 in Section 1.2 (separation of convex sets), there exists an $x^* \in X^*$ such that $x^*_Y = 0$ and $x^*(y) \neq 0$. By (23) and (24),
$0 = y^{**}(x^*_Y) = x^*(y),$
contradicting $x^*(y) \neq 0$.
Let $y^* \in Y^*$. By the Hahn-Banach extension theorem (Theorem 1.B), there exists an $x^* \in X^*$ such that $x^*_Y = y^*$. Therefore, by (23) and (24),
$y^{**}(y^*) = y^{**}(x^*_Y) = y^{**}(x^*) = x^*(y) = y^*(y)$.
This is (22). □
Proposition 5. Let $X$ be a Banach space over $\mathbb{K}$.
(i) If $X^*$ is separable, $X$ is also.
(ii) Conversely, if $X$ is separable and reflexive, then $X^*$ is separable.
In the following proof we will use Proposition 3 from Section 1.2 on the separation of convex sets. If $X = \{0\}$, then the statements are trivial. Let $X \neq \{0\}$.
Proof. Ad (i). Suppose that the set $\{f_1, f_2, \ldots\}$ is dense in $X^*$. Since
$\|f_n\| = \sup_{\|u\| = 1} |\langle f_n, u \rangle|,$
there is a $u_n \in X$ such that $\|u_n\| = 1$ and
$|\langle f_n, u_n \rangle| \ge 2^{-1} \|f_n\|$.
Let $Y$ be the closure of span$\{u_1, u_2, \ldots\}$. Assume that $Y \neq X$. By Proposition 3 in Section 1.2, there exists a functional $f \in X^*$ such that $f(u) = 0$ on $Y$ and $f \neq 0$. Thus, for all $n$,
$\|f - f_n\| \ge |\langle f - f_n, u_n \rangle| = |\langle f_n, u_n \rangle| \ge 2^{-1} \|f_n\| \ge 2^{-1}(\|f\| - \|f - f_n\|)$.
Since $\{f_1, f_2, \ldots\}$ is dense in $X^*$, this implies
$\|f\| \le 3 \inf_n \|f - f_n\| = 0,$
contradicting $f \neq 0$. Hence $Y = X$. As in Step 2 of the proof of Theorem 2.C, $Y$ is separable, and so is $X$.
Ad (ii). By Proposition 3, there is a bijective map $j: X \to X^{**}$ such that $\|j(u)\| = \|u\|$ for all $u \in X$. Thus, the separability of $X$ implies the separability of $X^{**}$. Since $X^{**} = (X^*)^*$, the space $X^*$ is also separable, by (i). □
The following two crucial results generalize the corresponding properties
of weak convergence for Hilbert spaces, which we proved in Section 2.4.
Proposition 6. Each bounded sequence $(u_n)$ in a reflexive Banach space $X$ over $\mathbb{K}$ has a weakly convergent subsequence.
Corollary 7. The limit point of each weakly convergent subsequence of $(u_n)$ belongs to the closed convex hull of the set $\{u_1, u_2, \ldots\}$.
The proof of Corollary 7 will be based on the separation of convex sets by closed hyperplanes (Theorem 1.C).
Proof of Proposition 6. The proof proceeds similarly to the proof of Theorem 2.C. However, instead of the Riesz theorem on Hilbert spaces we will use the reflexivity of the Banach space $X$.
For $X = \{0\}$, the statement is trivial. Let $X \neq \{0\}$.
Step 1: Suppose first that $X^*$ is separable. Let $\{v_k\}$ be a countable dense set in $X^*$. Then
$|\langle v_k, u_n \rangle| \le \|v_k\| \|u_n\|$ for all $n, k$.
As in the proof of Theorem 2.C, we obtain a subsequence $(w_n)$ of $(u_n)$ such that, for each $v_k$,
$\langle v_k, w_n \rangle \to a_k$ as $n \to \infty$.
Moreover, there exist numbers $a(v)$ such that
$\langle v, w_n \rangle \to a(v)$ as $n \to \infty$ for each $v \in X^*$. (25)
This follows from
$|\langle v, w_n - w_m \rangle| = |\langle v - v_k, w_n - w_m \rangle + \langle v_k, w_n - w_m \rangle| \le \|v - v_k\| \|w_n - w_m\| + |\langle v_k, w_n - w_m \rangle| < \varepsilon,$
for suitable $v_k$ and all $n, m \ge n_0(\varepsilon)$. Note that the sequence $(w_n)$ is bounded and the set $\{v_k\}$ is dense in $X^*$.
Obviously, the map $v \mapsto a(v)$ is linear and, by (25),
$|a(v)| \le \|v\| \sup_n \|w_n\|$ for all $v \in X^*$.
Hence $a \in X^{**}$. Since the Banach space $X$ is reflexive, there exists a $w \in X$ such that
$a(v) = \langle v, w \rangle$ for all $v \in X^*$.
Thus, it follows from (25) that
$w_n \rightharpoonup w$ as $n \to \infty$.
Step 2: If $X^*$ is not separable, then let $Y$ be the closure of span$\{u_1, u_2, \ldots\}$. It follows as in the proof of Theorem 2.C that $Y$ is a separable closed linear subspace of $X$. By Proposition 4, $Y$ is reflexive. Proposition 5 tells us that $Y^*$ is separable.
Applying Step 1 to the space $Y$, we see that a subsequence $(w_n)$ of $(u_n)$ and a $w \in Y$ exist such that
$\langle y^*, w_n \rangle \to \langle y^*, w \rangle$ as $n \to \infty$ for all $y^* \in Y^*$.
Since $X^* \subseteq Y^*$, this implies $w_n \rightharpoonup w$ as $n \to \infty$. □
Proof of Corollary 7. Let $M$ be a closed convex set such that $u_n \in M$ for all $n$ and $u_{n'} \rightharpoonup u$ as $n' \to \infty$. We have to show that $u \in M$.
Suppose that $u \notin M$. Then, by the separation theorem for convex sets from Section 1.2 (Theorem 1.C), there exists a functional $v^* \in X^*$ such that
$\text{Re}\langle v^*, v \rangle \le 1$ for all $v \in M$
and $\text{Re}\langle v^*, u \rangle > 1$. Letting $n' \to \infty$, it follows from $\text{Re}\langle v^*, u_{n'} \rangle \le 1$ for all $n'$ that
$\text{Re}\langle v^*, u \rangle \le 1$.
This is a contradiction. □
The following classic result will be used in the next section.
Lemma 8. Let $\phi: J \subseteq \mathbb{R} \to \mathbb{R}$ be a differentiable function on the interval $J$. Then, $\phi$ is convex iff the derivative $\phi': J \to \mathbb{R}$ is monotone, that is, $t \le s$ implies $\phi'(t) \le \phi'(s)$.
Proof. Suppose that $\phi$ is convex. Let $t < \tau < s$. It follows from
$\tau = \frac{\tau - t}{s - t} s + \frac{s - \tau}{s - t} t$
that
$\phi(\tau) \le \frac{\tau - t}{s - t} \phi(s) + \frac{s - \tau}{s - t} \phi(t),$ (26)
and hence
$\frac{\phi(\tau) - \phi(t)}{\tau - t} \le \frac{\phi(s) - \phi(\tau)}{s - \tau}$. (27)
Letting $\tau \to t$ or $\tau \to s$, we get
$\phi'(t) \le \frac{\phi(s) - \phi(t)}{s - t} \le \phi'(s)$.
Conversely, if $\phi'$ is monotone, then the mean value theorem yields
$\frac{\phi(\tau) - \phi(t)}{\tau - t} = \phi'(\alpha)$ and $\frac{\phi(s) - \phi(\tau)}{s - \tau} = \phi'(\beta),$
where $t < \alpha < \tau$ and $\tau < \beta < s$. This implies (27) and hence we get (26), which says that $\phi$ is convex. □
2.9 Applications to Convex Minimum Problems
and Variational Inequalities
Let us consider the minimum problem
$F(u) = \min!, \quad u \in M,$ (28)
along with the variational inequality
$\delta F(u; v - u) \ge 0$ for all $v \in M$ and fixed $u \in M$, (28*)
which corresponds to
$\langle F'(u), v - u \rangle \ge 0$ for all $v \in M$ and fixed $u \in M$. (28**)
We assume that
(H1) $M$ is a nonempty, closed, convex subset of the real reflexive Banach space $X$.
(H2) The functional $F: M \subseteq X \to \mathbb{R}$ is continuous and convex.
(H3) If $M$ is unbounded, then $F$ is weakly coercive.
For example, assumptions (H2) and (H3) are satisfied if
$F(u) := \|u - u_0\|$ for all $u \in X$ and fixed $u_0 \in X$.
In the following, the postulated existence of the first variation $\delta F(u; h)$ or of the Gateaux derivative $F'(u)$ includes tacitly that the functional $F$ is defined on an open neighborhood of the point $u$.
Theorem 2.E. Assume (H1) through (H3). Then the following are true:
(i) The minimum problem (28) has a solution $u$. If $F$ is strictly convex, then this solution is unique.
(ii) If the first variation $\delta F(v; h)$ exists for all $v \in M$ and all $h \in X$, then the minimum problem (28) is equivalent to the variational inequality (28*).
(iii) If the Gateaux derivative $F'(v)$ exists for all $v \in M$, then the minimum problem (28) is equivalent to the variational inequality (28**).
Proof. Ad (i). This follows from Proposition 4 and Remark 6 in Section 2.5.
Ad (ii). Let $u, v \in M$. Since $M$ is convex, we get $u + t(v - u) \in M$ for all $t \in [0, 1]$. Define
$\phi(t) := F(u + t(v - u))$ for all $t \in [0, 1]$.
If $u$ is a solution of (28), then
$\phi(t) \ge \phi(0)$ for all $t \in [0, 1]$.
Hence
$\delta F(u; v - u) = \phi'(0) = \lim_{t \to +0} t^{-1}(\phi(t) - \phi(0)) \ge 0$.
Conversely, let $u$ be a solution of (28*). Then, $\phi'(0) \ge 0$. Since $\phi$ is convex on $[0, 1]$, the derivative $\phi'$ is monotone on $[0, 1]$. By the mean value theorem, there is a number $0 < \theta < 1$ such that
$\phi(1) - \phi(0) = \phi'(\theta) \ge \phi'(0) \ge 0$.
Hence
$F(v) - F(u) \ge 0$ for all $v \in M$,
that is, $u$ is a solution of (28).
Ad (iii). Observe that $\delta F(u; h) = \langle F'(u), h \rangle$. □
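The equivalence of (28) and (28**) can be checked numerically in a small example. The following minimal sketch assumes $X = \mathbb{R}^3$, the box $M = [0, 1]^3$, and the differentiable functional $F(v) = \|v - u_0\|^2$ (the squared distance, chosen here instead of the norm itself so that the derivative exists everywhere); the point `u0` and the random sampling are illustrative assumptions.

```python
import random

def clip01(x):
    return min(1.0, max(0.0, x))

u0 = [1.7, -0.4, 0.3]            # hypothetical point outside the box M = [0, 1]^3
u = [clip01(x) for x in u0]      # minimizer of F(v) = ||v - u0||^2 over M

def F(v):
    return sum((vi - ai) ** 2 for vi, ai in zip(v, u0))

def dF(v):
    """Gradient of F, playing the role of F'(v)."""
    return [2.0 * (vi - ai) for vi, ai in zip(v, u0)]

random.seed(0)
samples = [[random.random() for _ in range(3)] for _ in range(1000)]
g = dF(u)
# Variational inequality (28**): <F'(u), v - u> >= 0 for all sampled v in M.
vi_holds = all(sum(gi * (vi - ui) for gi, vi, ui in zip(g, v, u)) >= -1e-12
               for v in samples)
# Minimum property (28): F(u) <= F(v) for all sampled v in M.
min_holds = all(F(u) <= F(v) + 1e-12 for v in samples)
```

Both flags are evaluated on random samples only, so they illustrate the statement of Theorem 2.E rather than prove it.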
Proposition 1. Let $F: X \to \mathbb{R}$ be a continuous, convex, weakly coercive functional on the real reflexive Banach space $X$. Suppose that the Gateaux derivative $F'(v)$ exists for each $v \in X$. Then the following are true:
(i) The minimum problem
$F(u) = \min!, \quad u \in X,$ (29)
has a solution $u$. This solution is unique if $F$ is strictly convex.
(ii) The minimum problem (29) is equivalent to the operator equation
$F'(u) = 0$ (Euler equation). (29*)
Proof. This is a special case of Theorem 2.E. Observe that the variational inequality
$\langle F'(u), v - u \rangle \ge 0$ for all $v \in X$
is equivalent to $\langle F'(u), \pm h \rangle \ge 0$ for all $h \in X$. In turn, this is equivalent to $F'(u) = 0$. □
Proposition 2. Let $F: M \subseteq X \to \mathbb{R}$ be a convex functional on the convex subset $M$ of the real normed space $X$.
Then, each local minimum $u_0$ of $F$ on $M$ is also a global minimum of $F$ on $M$.
Proof. There is a number $r > 0$ such that
$F(u_0) \le F(v)$ for all $v \in M$ with $\|v - u_0\| < r$.
Let $u \in M$ be given. Set $v := u_0 + \alpha(u - u_0)$. If $\alpha \in ]0, 1[$ is sufficiently small, then $\|v - u_0\| < r$. By the convexity of $F$,
$F(u_0) \le F(v) \le \alpha F(u) + (1 - \alpha)F(u_0)$.
Subtracting $(1 - \alpha)F(u_0)$ and dividing by $\alpha$, we get $F(u_0) \le F(u)$ for all $u \in M$. □
The following two corollaries characterize convex functionals in terms of
the Gateaux derivative and the second variation, respectively.
Corollary 3 (Convex functionals and monotone operators). Let $F: X \to \mathbb{R}$ be a functional on the real normed space $X$ such that the Gateaux derivative $F'(u)$ exists for all $u \in X$.
Then $F$ is convex on $X$ iff the operator $F': X \to X^*$ is monotone, that is,
$\langle F'(v) - F'(u), v - u \rangle \ge 0$ for all $u, v \in X$. (30)
Proof. Let $u, v \in X$. Set
$\phi(t) := F(u + t(v - u))$ for all $t \in \mathbb{R}$. (31)
Then,
$\phi'(t) = \delta F(u + t(v - u); v - u) = \langle F'(u + t(v - u)), v - u \rangle$ for all $t \in \mathbb{R}$. (32)
If $F$ is convex on $X$, then so is $\phi$ on $\mathbb{R}$. By Lemma 8 in Section 2.8, the derivative $\phi'$ is monotone on $\mathbb{R}$. Hence
$\phi'(1) \ge \phi'(0)$.
This is (30).
Conversely, if $F'$ is monotone on $X$, then it follows from (30) that
$\langle F'(u + s(v - u)) - F'(u + t(v - u)), (s - t)(v - u) \rangle \ge 0,$
for all $u, v \in X$ and $t, s \in \mathbb{R}$. By (32), this implies
$\phi'(t) \le \phi'(s)$ for all $t \le s$.
Thus, $\phi'$ is monotone on $\mathbb{R}$. By Lemma 8 in Section 2.8, $\phi$ is convex on $\mathbb{R}$. Hence $F$ is convex on $X$. □
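Condition (30) is easy to test numerically for a concrete convex functional. The following minimal sketch assumes $X = \mathbb{R}^2$ and the hypothetical functional $F(u) = u_1^4 + u_2^2$, whose gradient plays the role of $F'$; neither the functional nor the helper names come from the original text.

```python
import random

def dF(u):
    """Gradient of the hypothetical convex functional F(u) = u1^4 + u2^2 on R^2."""
    return [4.0 * u[0] ** 3, 2.0 * u[1]]

def monotonicity_gap(u, v):
    """Left-hand side of (30): <F'(v) - F'(u), v - u>."""
    gu, gv = dF(u), dF(v)
    return sum((b - a) * (y - x) for a, b, x, y in zip(gu, gv, u, v))

random.seed(1)
pairs = [([random.uniform(-3, 3), random.uniform(-3, 3)],
          [random.uniform(-3, 3), random.uniform(-3, 3)]) for _ in range(1000)]
# A small tolerance guards against floating-point rounding.
monotone = all(monotonicity_gap(u, v) >= -1e-12 for u, v in pairs)
```

The check runs over random pairs only and therefore illustrates, but does not prove, the monotonicity of $F'$.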
Corollary 4 (Convex functionals and the definiteness of the second variation). Let $F: X \to \mathbb{R}$ be a functional on the real normed space $X$ such that the second variation $\delta^2 F(u; h)$ exists for all $u, h \in X$. Then
(i) $F$ is convex on $X$ iff
$\delta^2 F(u; h) \ge 0$ for all $u, h \in X$. (33)
(ii) If
$\delta^2 F(u; h) > 0$ for all $u, h \in X$ with $h \neq 0$, (33*)
then $F$ is strictly convex on $X$.
Proof. Ad (i). Let us use the function $\phi$ introduced in (31). Observe that
$\phi''(t) = \delta^2 F(u + t(v - u); v - u)$ for all $t \in \mathbb{R}$.
If $F$ is convex on $X$, then $\phi$ is convex on $\mathbb{R}$. By Lemma 8 in Section 2.8, $\phi'$ is monotone on $\mathbb{R}$. Hence $\phi''(t) \ge 0$ for all $t \in \mathbb{R}$. Letting $t = 0$, we get (33).
Conversely, it follows from (33) that $\phi''(t) \ge 0$ for all $t \in \mathbb{R}$. Thus, $\phi'$ is monotone on $\mathbb{R}$, and hence $\phi$ is convex on $\mathbb{R}$. This implies the convexity of $F$ on $X$.
Ad (ii). Let $v \neq u$. It follows from (33*) that $\phi''(t) > 0$ for all $t \in \mathbb{R}$. Hence $\phi'$ is strictly monotone on $\mathbb{R}$. The proof of Lemma 8 in Section 2.8 tells us that $\phi$ is strictly convex on $\mathbb{R}$. Thus, $F$ is strictly convex on $X$. □
Example 5. A real Banach space is called strictly convex iff the norm function
$u \mapsto \|u\|$
is strictly convex. Each real Hilbert space $X$ is strictly convex.
Proof. Set $F(u) := (u \mid u)$ for all $u \in X$. Then,
$\delta^2 F(u; h) = (h \mid h) > 0$ for all $h \in X$, $h \neq 0$,
by Example 4 in Section 2.1. Hence $F$ is strictly convex.
Set $G(u) := \|u\| = \psi(F(u))$, where $\psi(x) := x^{1/2}$ for all $x \ge 0$. Since the real function $\psi: [0, \infty[ \to \mathbb{R}$ is strictly increasing and convex, the functional $G: X \to \mathbb{R}$ is strictly convex. In fact, let $u, v \in X$ be given such that $u \neq v$, and let $t \in ]0, 1[$. Then
$G(tu + (1 - t)v) = \psi(F(tu + (1 - t)v)) < \psi(tF(u) + (1 - t)F(v)) \le t\psi(F(u)) + (1 - t)\psi(F(v)) = tG(u) + (1 - t)G(v)$. □
The following result generalizes the main theorem on quadratic vari-
ational problems from Section 2.4 of AMS Vol. 108. An application to
elasticity will be considered in the next section.
Proposition 6 (Quadratic variational inequalities). Suppose that
(a) $a: X \times X \to \mathbb{R}$ is a symmetric, bounded, strongly monotone, bilinear form, where $X$ is a real Hilbert space.
(b) $b: X \to \mathbb{R}$ is a linear continuous functional.
(c) $M$ is a nonempty, closed, convex subset of $X$.
Then, the variational problem
$2^{-1} a(u, u) - b(u) = \min!, \quad u \in M,$
has a unique solution $u$, which is also the unique solution of the variational inequality
$a(u, v - u) - b(v - u) \ge 0$ for all $v \in M$ and fixed $u \in M$.
Proof. Set
$F(u) := 2^{-1} a(u, u) - b(u)$ for all $u \in X$.
Introducing the real function $\phi(t) := F(u + th)$ for all $t \in \mathbb{R}$ and fixed $u, h \in X$, we get
$\phi(t) = 2^{-1} a(u, u) + t a(u, h) + 2^{-1} t^2 a(h, h) - b(u) - t b(h)$.
By definition, $\delta^n F(u; h) := \phi^{(n)}(0)$. Thus, for all $u, h \in X$, $\delta F(u; h) = a(u, h) - b(h)$, $\delta^2 F(u; h) = a(h, h)$, and $\delta^n F(u; h) = 0$ if $n \ge 3$.
Since $a(\cdot, \cdot)$ is strongly monotone, there exists a constant $c > 0$ such that $a(u, u) \ge c\|u\|^2$ for all $u \in X$, and hence
$\delta^2 F(u; h) \ge c\|h\|^2$ for all $h \in X$.
By Corollary 4, $F: X \to \mathbb{R}$ is strictly convex.
Moreover, it follows from Proposition 2 in Section 2.3 of AMS Vol. 108 that, as $n \to \infty$,
$u_n \to u$ implies $F(u_n) \to F(u)$,
that is, $F: X \to \mathbb{R}$ is continuous.
Finally, since $|b(u)| \le \|b\| \|u\|$, we get
$F(u) \ge 2^{-1} c\|u\|^2 - \|b\| \|u\|$ for all $u \in X$ and fixed $c > 0$.
Hence $F(u) \to +\infty$ as $\|u\| \to \infty$, so that $F: X \to \mathbb{R}$ is weakly coercive.
The assertion now follows from Theorem 2.E. □
2.10 Applications to Obstacle Problems
in Elasticity
Let us consider the following minimum problem:
$F(u) := 2^{-1} \int_G \sum_{j=1}^{N} (\partial_j u)^2\,dx - \int_G f u\,dx = \min!, \quad u \in M,$ (34)
where
$M := \{u \in \mathring{W}_2^1(G): u(x) \ge 0$ for almost all $x \in G\},$
along with the following variational inequality:
$\int_G \sum_{j=1}^{N} \partial_j u (\partial_j v - \partial_j u)\,dx - \int_G f(v - u)\,dx \ge 0$ for all $v \in M$ and fixed $u \in M$. (34*)
Proposition 1. We are given the function $f \in L_2(G)$, where $G$ is a nonempty, bounded, open set in $\mathbb{R}^N$, $N \ge 1$.
Then the minimum problem (34) has a unique solution $u$, which is also the unique solution of the variational inequality (34*).
In the special case where $N = 1$ and $G = ]a, b[$, problem (34) allows the following physical interpretation. As in Section 2.7 of AMS Vol. 108, we use
$u(x)$ = deflection of a string at the point $x$;
$f(x)$ = force density at the point $x$.
Then, problem (34) corresponds to the principle of minimal potential energy. The side condition $u \in M$ postulates that
$u(x) \ge 0$ for almost all $x \in ]a, b[$ (obstacle condition)
and
$u(a) = u(b) = 0$ (boundary condition).
A possible physical situation is pictured in Figure 2.3.
FIGURE 2.3.
Proof. Set $X := \mathring{W}_2^1(G)$ and
$a(u, v) := \int_G \sum_{j=1}^{N} \partial_j u\,\partial_j v\,dx, \quad b(u) := \int_G f u\,dx,$ for all $u, v \in X$.
By the proof of Theorem 2.B (Dirichlet principle) in AMS Vol. 108, $a: X \times X \to \mathbb{R}$ is a symmetric, bounded, strongly monotone, bilinear form, and $b: X \to \mathbb{R}$ is linear and continuous.
Obviously, the set $M$ is convex. Moreover, $M$ is also closed. In fact, if $u_n \in M$ for all $n$ and
$u_n \to u$ in $X$ as $n \to \infty$,
then $u_n \to u$ in $L_2(G)$, and hence there exists a subsequence $(u_{n'})$ such that
$u_{n'}(x) \to u(x)$ as $n' \to \infty$ for almost all $x \in G$,
by Lemma 2 in Section 2.6. Thus, $u_n(x) \ge 0$ for almost all $x \in G$ and all $n$ implies $u(x) \ge 0$ for almost all $x \in G$, and hence $u \in M$.
The assertion follows now from Proposition 6 in Section 2.9. □
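For $N = 1$ and $G = ]0, 1[$, a discrete analogue of problem (34) can be solved by a simple iteration. The following minimal sketch assumes a uniform grid, the standard finite-difference approximation of $\int_G (\partial u)^2\,dx$, and a projected Gauss-Seidel sweep; the grid size `n`, the force density `f`, and the number of sweeps are illustrative assumptions and not part of the original text.

```python
n = 19                      # number of interior grid points (assumption)
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [10.0 if xi < 0.5 else -10.0 for xi in x]   # hypothetical force density

u = [0.0] * n
for sweep in range(3000):
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0        # boundary value u(a) = 0
        right = u[i + 1] if i < n - 1 else 0.0   # boundary value u(b) = 0
        # Unconstrained coordinate minimizer, then projection onto u_i >= 0.
        u[i] = max(0.0, 0.5 * (left + right + h * h * f[i]))

def residual(i):
    """Discrete version of -u'' - f at node i; nonnegative at the solution."""
    left = u[i - 1] if i > 0 else 0.0
    right = u[i + 1] if i < n - 1 else 0.0
    return (2.0 * u[i] - left - right) / (h * h) - f[i]

res = [residual(i) for i in range(n)]
```

In the discrete setting, the variational inequality (34*) is equivalent to the three conditions $u_i \ge 0$, `res[i]` $\ge 0$, and $u_i \cdot$ `res[i]` $= 0$ for all nodes: where the string lies above the obstacle, the equilibrium equation holds, and where it touches the obstacle, the obstacle exerts a nonnegative reaction force.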
2.11 Saddle Points
Definition 1. Let the function $L: A \times B \to \mathbb{R}$ be given, where $A$ and $B$ are arbitrary nonempty sets. The point $(u_0, p_0) \in A \times B$ is called a saddle point of $L$ with respect to $A \times B$ iff
$L(u_0, p) \le L(u_0, p_0) \le L(u, p_0)$ for all $u \in A$, $p \in B$. (35)
This is equivalent to
$\max_{p \in B} L(u_0, p) = L(u_0, p_0) = \min_{u \in A} L(u, p_0)$. (35*)
FIGURE 2.4.
Example 2. Let $L(u, p) := u^2 - p^2$. Then, $(0, 0)$ is a saddle point of $L$ with respect to $\mathbb{R} \times \mathbb{R}$ (cf. Figure 2.4).
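The saddle point property of Example 2 can be verified on a finite grid. The following minimal sketch restricts $u$ and $p$ to sample points in $[-3, 3]$, which is an assumption made only to obtain a finite check.

```python
def L(u, p):
    return u * u - p * p

grid = [i / 10.0 for i in range(-30, 31)]
u0, p0 = 0.0, 0.0
# Saddle point inequality (35): L(u0, p) <= L(u0, p0) <= L(u, p0).
is_saddle = all(L(u0, p) <= L(u0, p0) <= L(u, p0) for u in grid for p in grid)
# Equivalent formulation (35*), restricted to the grid.
max_over_p = max(L(u0, p) for p in grid)
min_over_u = min(L(u, p0) for u in grid)
```

Both extremal values coincide with $L(u_0, p_0) = 0$, in agreement with (35*).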
In the following three sections we want to show that saddle points play
an important role in duality theory and in game theory.
2.12 Applications to Duality Theory
The point of departure for the formulation of dual problems is the following symmetric pair of formulas:
$\inf_{u \in A} (\sup_{p \in B} L(u, p)) = \alpha,$ (36)
$\sup_{p \in B} (\inf_{u \in A} L(u, p)) = \beta$. (36*)
Theorem 2.F. Let $L: A \times B \to \mathbb{R}$ be a function on the product $A \times B$ of the nonempty sets $A$ and $B$. Then, the following two statements are equivalent:
(i) $(u_0, p_0)$ is a saddle point of $L$ with respect to $A \times B$.
(ii) $u_0$ is a solution of the primal problem (36), $p_0$ is a solution of the dual problem (36*), and $\alpha = \beta$.
Corollary 1. If (ii) holds, then $\alpha = \beta = L(u_0, p_0)$.
Suppose that we are given the primal problem
$\inf_{u \in A} F(u) = \alpha$. (37)
In order to find the corresponding dual problem, we are looking for a function $L$ such that
$F(u) = \sup_{p \in B} L(u, p)$.
Thus, problem (37) is identical to (36). Letting
$G(p) := \inf_{u \in A} L(u, p),$
the dual problem corresponding to (36*) reads as follows:
$\sup_{p \in B} G(p) = \beta$. (37*)
Corollary 2. Let $L: A \times B \to \mathbb{R}$, where $A$ and $B$ are nonempty sets.
(i) Then $-\infty \le \beta \le \alpha \le \infty$.
(ii) Let $u \in A$ and $p \in B$ be given. Then we get the following error estimates for the extremal values $\alpha$ and $\beta$:
$G(p) \le \beta \le \alpha \le F(u)$.
(iii) Suppose that we know two points $u_0 \in A$ and $p_0 \in B$ such that
$G(p_0) \ge F(u_0)$. (38)
Then $u_0$ is a solution of the primal problem (37), $p_0$ is a solution of the dual problem (37*), and $\alpha = \beta$.
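Proof of Corollary 2. Ad (i). Obviously,
$$\inf_{u \in A} L(u, p) \le L(v, p) \quad \text{for all } v \in A,$$
and hence
$$\sup_{p \in B} \inf_{u \in A} L(u, p) \le \sup_{p \in B} L(v, p) \quad \text{for all } v \in A.$$
Therefore,
$$\beta = \sup_{p \in B} \inf_{u \in A} L(u, p) \le \inf_{v \in A} \sup_{p \in B} L(v, p) = \alpha.$$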
Ad (ii). This follows from (i) together with the definitions of $\alpha$ and $\beta$.
Ad (iii). By (ii),
$$G(p_0) \le \beta \le \alpha \le F(u_0).$$
Thus, it follows from (38) that $G(p_0) = \beta = \alpha = F(u_0)$. $\Box$
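For finite sets $A$ and $B$, the quantities $F$, $G$, $\alpha$, and $\beta$ can be computed directly from a table of values of $L$. The following sketch uses a hypothetical $3 \times 3$ table (an assumption made for illustration only) and checks the error estimate from Corollary 2(ii) for every pair $(u, p)$.

```python
def primal_dual(table):
    """table[i][j] = L(u_i, p_j) for finite sets A = {u_i}, B = {p_j}.

    Returns (F, G, alpha, beta) with F(u) = max_p L(u, p) and
    G(p) = min_u L(u, p), as in (37) and (37*).
    """
    n_cols = len(table[0])
    F = [max(row) for row in table]
    G = [min(row[j] for row in table) for j in range(n_cols)]
    return F, G, min(F), max(G)


# Hypothetical values of L, chosen only for illustration.
table = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6, 5],
]

F, G, alpha, beta = primal_dual(table)
estimate_ok = all(g <= beta <= alpha <= f for f in F for g in G)
print(F, G, alpha, beta, estimate_ok)
```

In this table, the first row and the third column satisfy criterion (38), since $G = F = 4$ there, so the corresponding pair solves both problems and $\alpha = \beta = 4$.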
Proof of Theorem 2.F. (i) $\Rightarrow$ (ii). Since $(u_0, p_0)$ is a saddle point of $L$,
$$\sup_{p \in B} L(u_0, p) = L(u_0, p_0) = \inf_{u \in A} L(u, p_0). \tag{39}$$
Therefore,
$$\alpha = \inf_{u \in A} \sup_{p \in B} L(u, p) \le L(u_0, p_0) \le \sup_{p \in B} \inf_{u \in A} L(u, p) = \beta.$$
Since $\beta \le \alpha$, by Corollary 2, the equality sign appears everywhere, that is,
$$\alpha = L(u_0, p_0) = \beta,$$
and (39) tells us that
$$F(u_0) = L(u_0, p_0) = G(p_0).$$
This proves statement (ii).
(ii) $\Rightarrow$ (i). From (ii) it follows that $\alpha = F(u_0)$ and $\beta = G(p_0)$. Hence
$$\beta = G(p_0) = \inf_{u \in A} L(u, p_0) \le L(u_0, p_0) \le \sup_{p \in B} L(u_0, p) = F(u_0) = \alpha.$$
Since $\alpha = \beta$, the equality sign appears everywhere. Hence $(u_0, p_0)$ is a
saddle point of $L$ with respect to $A \times B$.
This also proves Corollary 1. $\Box$
Applications of this general duality theory to numerous optimization
problems can be found in Zeidler (1986), Vol. 3, Chapters 49 through 53.
In the following section we present fundamental results on the existence
of saddle points.
2.13 The von Neumann Minimax Theorem
on the Existence of Saddle Points
Our goal is the relation
$$L(u_0, p_0) = \min_{u \in A} \max_{p \in B} L(u, p) = \max_{p \in B} \min_{u \in A} L(u, p). \tag{40}$$
By definition, a functional $f$ is called concave (resp., upper semicontinuous)
iff $-f$ is convex (resp., lower semicontinuous). We assume that
(H1) The functional $L\colon A \times B \to \mathbb{R}$ is given, where $A$ and $B$ are nonempty,
closed, convex sets in the real, strictly convex, reflexive Banach space
$X$ (e.g., $X$ is a real Hilbert space).
(H2) The functional $u \mapsto L(u, p)$ is convex and lower semicontinuous (e.g.,
continuous) on $A$, for each $p \in B$.
(H3) The functional $p \mapsto L(u, p)$ is concave and upper semicontinuous
(e.g., continuous) on $B$, for each $u \in A$.
(H4) The sets $A$ and $B$ are bounded.
Theorem 2.G. The functional $L$ has a saddle point $(u_0, p_0)$ with respect
to $A \times B$ and the relation (40) holds true.
Moreover, relation (40) is valid for each saddle point $(u_0, p_0)$ of $L$.
For the special case where $X = \mathbb{R}^N$, John von Neumann proved this
theorem in 1928. His paper marked the birth of mathematical game theory.
A more general result than Theorem 2.G can be found in Zeidler (1986),
Vol. 1, Section 9.6. There the proof is based on a fixed-point theorem for
multivalued mappings. Other proofs make use of the lemma of Knaster,
Kuratowski, and Mazurkiewicz or of the Hahn-Banach theorem (separation
of convex sets). The following proof uses only elementary arguments based
on weak convergence.
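Relation (40) can be observed numerically for a simple convex-concave function on $A = B = [-1, 1] \subset \mathbb{R}$. In the following sketch, the function $L(u, p) = u^2 + up - p^2$ and the sampling grid are illustrative assumptions that do not appear in the text; the grid only approximates the interval.

```python
def L(u, p):
    # Convex in u and concave in p (an illustrative choice, not from the text).
    return u * u + u * p - p * p


grid = [k / 20 for k in range(-20, 21)]  # samples of A = B = [-1, 1]

# min over u of max over p, and max over p of min over u, on the grid
min_max = min(max(L(u, p) for p in grid) for u in grid)
max_min = max(min(L(u, p) for u in grid) for p in grid)
print(min_max, max_min)
```

Both values agree with $L(0, 0) = 0$, which is consistent with $(0, 0)$ being a saddle point of this particular $L$.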
Proof. By (H2), the functional $u \mapsto L(u, p)$ is convex and lower semicontinuous
on $A$. Hence $u_n \rightharpoonup u$ as $n \to \infty$ on $A$ implies
$$L(u, p) \le \liminf_{n \to \infty} L(u_n, p) \quad \text{for all } p \in B,$$
by Lemma 5 and Remark 6 in Section 2.5.
Furthermore, by (H3), the functional $p \mapsto -L(u, p)$ is convex and lower
semicontinuous on $B$. Hence $p_n \rightharpoonup p$ as $n \to \infty$ on $B$ implies
$$-L(u, p) \le \liminf_{n \to \infty} \bigl(-L(u, p_n)\bigr) \quad \text{for all } u \in A.$$
This is equivalent to
$$L(u, p) \ge \limsup_{n \to \infty} L(u, p_n).$$
Step 1: We set
$$G(p) := \min_{u \in A} L(u, p) \quad \text{for all } p \in B, \tag{41}$$
and
$$F(u) := \max_{p \in B} L(u, p) \quad \text{for all } u \in A. \tag{41*}$$
These definitions make sense. In fact, since the functional $u \mapsto L(u, p)$ is
convex and lower semicontinuous, the minimum problem (41) has a solution,
by Proposition 4 and Remark 6 in Section 2.5.
Replacing $L$ with $-L$, the same argument tells us that the maximum
problem (41*) has a solution.
Step 2: We show that the functional $F\colon A \to \mathbb{R}$ is quasi-convex and lower
semicontinuous. To this end, put
$$A_r := \{u \in A\colon F(u) \le r\} \quad \text{for each } r \in \mathbb{R}.$$
Then, it follows from $v, w \in A_r$ and $z := av + (1 - a)w$ with $a \in [0, 1]$ that
$$L(v, p),\ L(w, p) \le r \quad \text{for all } p \in B.$$
By the convexity of $u \mapsto L(u, p)$, this implies
$$L(z, p) \le aL(v, p) + (1 - a)L(w, p) \le r \quad \text{for all } p \in B,$$
and hence $F(z) \le r$ (i.e., $z \in A_r$).
Moreover, if $u_n \in A_r$ for all $n$ and $u_n \to u$ as $n \to \infty$, then
$$L(u_n, p) \le r \quad \text{for all } n \text{ and all } p \in B.$$
Since $u \mapsto L(u, p)$ is lower semicontinuous, we get $L(u, p) \le r$ for all $p \in B$.
Hence $F(u) \le r$, that is, $u \in A_r$.
Replacing $L$ with $-L$, we see that the function $-G$ is convex and lower
semicontinuous on $B$. By Proposition 4 and Remark 6 in Section 2.5, the
minimum problem
$$F(u_*) = \min_{u \in A} F(u)$$
and the maximum problem
$$G(p_0) = \max_{p \in B} G(p)$$
have solutions $u_*$ and $p_0$, respectively.
Step 3: For the moment, let us assume that
(H) The functional $u \mapsto L(u, p)$ is strictly convex on $A$.
Then, for each $p \in B$, the minimum problem
$$G(p) = \min_{u \in A} L(u, p) \tag{42}$$
has a unique solution called $u := \varphi(p)$. Hence
$$G(p) = \min_{u \in A} L(u, p) = L(\varphi(p), p) \quad \text{for all } p \in B. \tag{43}$$
Set
$$u_0 := \varphi(p_0).$$
By (43),
$$G(p_0) \le L(u, p_0) \quad \text{for all } u \in A. \tag{44}$$
We shall show in Step 4 that the following key inequality is valid:
$$G(p_0) \ge L(u_0, p) \quad \text{for all } p \in B. \tag{44*}$$
It follows from (44) and (44*) that $G(p_0) = L(u_0, p_0)$, and hence
$$L(u_0, p) \le L(u_0, p_0) \le L(u, p_0) \quad \text{for all } u \in A,\ p \in B.$$
Consequently, $(u_0, p_0)$ is a saddle point of $L$ with respect to $A \times B$. By
Theorem 2.F, we get $\alpha = \beta$, i.e.,
$$\sup_{p \in B} \inf_{u \in A} L(u, p) = \inf_{u \in A} \sup_{p \in B} L(u, p) = L(u_0, p_0).$$
According to Steps 1 and 2, we may replace "sup" and "inf" with "max"
and "min." Hence
$$\max_{p \in B} \min_{u \in A} L(u, p) = \min_{u \in A} \max_{p \in B} L(u, p) = L(u_0, p_0).$$
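Step 4: We prove the decisive inequality (44*). Let $p \in B$. Put
$$p_n := (1 - n^{-1})p_0 + n^{-1}p \quad \text{and} \quad u_n := \varphi(p_n) \quad \text{for } n = 1, 2, \ldots.$$
By (43),
$$G(p_0) \ge G(p_n) = L(u_n, p_n) \quad \text{for } n = 1, 2, \ldots.$$
Since $p \mapsto L(u, p)$ is concave,
$$G(p_0) \ge L(u_n, p_n) \ge (1 - n^{-1})L(u_n, p_0) + n^{-1}L(u_n, p).$$
By (43), $G(p_0) \le L(u_n, p_0)$. Hence $G(p_0) \ge (1 - n^{-1})G(p_0) + n^{-1}L(u_n, p)$,
that is,
$$G(p_0) \ge L(u_n, p) \quad \text{for all } n = 1, 2, \ldots \text{ and all } p \in B. \tag{45}$$
Since $u_n \in A$ for all $n$, the sequence $(u_n)$ is bounded. Thus, there exists
a subsequence, again denoted by $(u_n)$, such that
$$u_n \rightharpoonup w \quad \text{as } n \to \infty.$$
By (45),
$$G(p_0) \ge \liminf_{n \to \infty} L(u_n, p) \ge L(w, p) \quad \text{for all } p \in B. \tag{46}$$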
It remains to show that $w = u_0$. By (43) and the definition of $u_n$,
$$L(u_n, p_n) \le L(u, p_n) \quad \text{for all } u \in A,\ n = 1, 2, \ldots.$$
Since the functional $p \mapsto L(u, p)$ is concave, this implies
$$(1 - n^{-1})L(u_n, p_0) + n^{-1}L(u_n, p) \le L(u, p_n) \quad \text{for all } p \in B,\ n = 1, 2, \ldots. \tag{47}$$
By (43), $L(u_n, p) \ge G(p)$. Hence
$$(1 - n^{-1})L(u_n, p_0) + n^{-1}G(p) \le L(u, p_n) \quad \text{for all } p \in B,\ n = 1, 2, \ldots.$$
If we let $n \to \infty$, it follows from (46) and (47) that
$$L(w, p_0) \le \liminf_{n \to \infty} L(u, p_n) \quad \text{for all } u \in A.$$
Since $p_n \to p_0$ as $n \to \infty$, $\limsup_{n \to \infty} L(u, p_n) \le L(u, p_0)$, and hence
$$L(w, p_0) \le L(u, p_0) \quad \text{for all } u \in A.$$
Thus, it follows from (43) that $w = \varphi(p_0)$, and hence $w = u_0$.
Under the additional hypothesis (H), the proof of Theorem 2.G has been
finished.
Step 5: If condition (H) is not satisfied, then we use the regularized functions
$$L_n(u, p) := L(u, p) + n^{-1}\|u\| \quad \text{for } n = 1, 2, \ldots. \tag{47}$$
Since the Banach space $X$ is strictly convex, the function $u \mapsto \|u\|$ is strictly
convex. Consequently, the function $L_n$ satisfies condition (H) together with
all the other assumptions, (H1) through (H4) (cf. Problem 2.7). By the
preceding proof, there exists a saddle point $(u_n, p_n)$ of $L_n$ with respect to
$A \times B$, and hence
$$L(u_n, p) + n^{-1}\|u_n\| \le L(u_n, p_n) + n^{-1}\|u_n\| \le L(u, p_n) + n^{-1}\|u\| \quad \text{for all } u \in A \text{ and } p \in B. \tag{48}$$
Since $(u_n)$ and $(p_n)$ are bounded, there exist subsequences, again denoted
by $(u_n)$ and $(p_n)$, such that
$$u_n \rightharpoonup u_0 \quad \text{and} \quad p_n \rightharpoonup p_0 \quad \text{as } n \to \infty.$$
The sets $A$ and $B$ are closed and convex. Hence $u_0 \in A$ and $p_0 \in B$. Letting
$n \to \infty$ in (48), we obtain
$$L(u_0, p) \le \liminf_{n \to \infty} \bigl(L(u_n, p) + n^{-1}\|u_n\|\bigr) \le \limsup_{n \to \infty} \bigl(L(u, p_n) + n^{-1}\|u\|\bigr) \le L(u, p_0) \quad \text{for all } u \in A,\ p \in B.$$
Hence
$$L(u_0, p) \le L(u_0, p_0) \le L(u, p_0) \quad \text{for all } u \in A,\ p \in B.$$
Thus, $(u_0, p_0)$ is a saddle point of $L$ with respect to $A \times B$. As in Step 3,
this implies (40). $\Box$
We now replace the boundedness of the sets $A$ and $B$ by the following
more general condition:
(H4*) If $A$ is not bounded, then there exists a point $q \in B$ such that
$$L(u, q) \to +\infty \quad \text{as } \|u\| \to \infty,\ u \in A.$$
If $B$ is not bounded, then there exists a point $v \in A$ such that
$$L(v, p) \to -\infty \quad \text{as } \|p\| \to \infty,\ p \in B.$$
Proposition 1. Assume (H1) through (H3) and (H4*). Then, $L$ has a
saddle point with respect to $A \times B$.
Proof. Set
$$A_n := \{u \in A\colon \|u\| \le n\} \quad \text{and} \quad B_n := \{p \in B\colon \|p\| \le n\}, \quad n = 1, 2, \ldots.$$
For sufficiently large $n$, the sets $A_n$ and $B_n$ are not empty. By Theorem
2.G, the functional $L$ has a saddle point $(u_n, p_n)$ with respect to $A_n \times B_n$,
that is,
$$L(u_n, p) \le L(u_n, p_n) \le L(u, p_n) \quad \text{for all } n,\ u \in A_n,\ p \in B_n. \tag{49}$$
Letting $u := v$ and $p := q$, this implies that the sequences $(u_n)$ and $(p_n)$
are bounded, by (H4*).
In fact, (49) yields
$$L(u_n, q) \le L(v, p_n) \quad \text{for all sufficiently large } n. \tag{50}$$
It follows from (50) along with (H4*) that the sequences $(u_n)$ and
$(p_n)$ cannot both be unbounded. If $(u_n)$ is bounded, then the sequence $(L(u_n, q))$ is
bounded below.$^7$ Thus, $(p_n)$ must be bounded, by (50) and (H4*). Similarly,
we obtain that the boundedness of $(p_n)$ implies the boundedness of $(u_n)$.
Thus, both the sequences $(u_n)$ and $(p_n)$ are bounded.
$^7$ By Lemma 5 in Section 2.5, $u \mapsto L(u, p)$ is weakly sequentially lower
semicontinuous. Suppose that there exists a subsequence, again denoted by $(u_n)$, such
that
$$L(u_n, q) \to -\infty \quad \text{as } n \to \infty.$$
Since $(u_n)$ is bounded, there is a subsequence, again denoted by $(u_n)$, such that
$u_n \rightharpoonup u$ as $n \to \infty$. Hence
$$L(u, q) \le \liminf_{n \to \infty} L(u_n, q).$$
This is a contradiction.
Passing over to subsequences, if necessary, we get
$$u_n \rightharpoonup u_0 \quad \text{and} \quad p_n \rightharpoonup p_0 \quad \text{as } n \to \infty.$$
By (49), for all $u \in A$ and $p \in B$,
$$L(u_0, p) \le \liminf_{n \to \infty} L(u_n, p) \le \limsup_{n \to \infty} L(u, p_n) \le L(u, p_0).$$
Consequently, $(u_0, p_0)$ is a saddle point of $L$. $\Box$
2.14 Applications to Game Theory
Game theory is a mathematical search for the optimal balance of competing
interests, such as between two partners. To explain this, let us consider two
players $P$ and $Q$ having the strategy sets $A$ and $B$ available, respectively. If
$P$ chooses the strategy $u \in A$ and $Q$ chooses the strategy $p \in B$, then let
$$L(u, p) := \text{gain by } Q = \text{loss by } P$$
(e.g., in dollars). We allow $L(u, p)$ to be negative, and if this is the case,
player $Q$ has a negative gain, that is, a loss of $|L(u, p)|$ dollars.
Definition 1. The pair $(u_0, p_0)$ in $A \times B$ is called an optimal strategy pair
iff $(u_0, p_0)$ is a saddle point of the gain function $L$ with respect to $A \times B$,
that is,
$$L(u_0, p) \le L(u_0, p_0) \le L(u, p_0) \quad \text{for all } u \in A \text{ and } p \in B.$$
Here, $L(u_0, p_0)$ is called the value of the game.
This definition reflects the fact that each player will play so as to maximize
his or her interests.
(i) If $L(u_0, p_0) = 0$, then the game ends undecided if both players play
optimally. Neither player gains or loses anything. This means that
the game is fair.
(ii) If $L(u_0, p_0) > 0$ or $< 0$, then $Q$ or $P$, respectively, wins by an amount
of $|L(u_0, p_0)|$ dollars if both players play optimally.
Example 2 (Coin game). Players $P$ and $Q$ each simultaneously display a
coin showing heads or tails. They agree that $Q$ will win one dollar if both
show the same. Otherwise, $P$ wins a dollar. In this case the strategy sets
are
$$A = B = \{H, T\},$$
where $H$ and $T$ correspond to heads and tails, respectively. Here, the gain
function $L$ is given by
$$1 = L(H, H) = L(T, T) = -L(H, T) = -L(T, H).$$
Obviously,
$$\min_{u \in A} \max_{p \in B} L(u, p) = 1 \quad \text{and} \quad \max_{p \in B} \min_{u \in A} L(u, p) = -1.$$
Thus, it follows from Theorem 2.F that $L$ has no saddle point; there can be
no optimal strategy pair. This corresponds to our intuition in the matter.
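The two extremal values of the coin game, and the absence of a saddle point, can be confirmed by direct enumeration of the four strategy pairs. The variable names in this short sketch are illustrative.

```python
strategies = ["H", "T"]


def L(u, p):
    # Gain of Q: one dollar if both coins match, minus one dollar otherwise.
    return 1 if u == p else -1


min_max = min(max(L(u, p) for p in strategies) for u in strategies)
max_min = max(min(L(u, p) for u in strategies) for p in strategies)

# Enumerate all pairs satisfying L(u0, p) <= L(u0, p0) <= L(u, p0).
saddle_points = [
    (u0, p0)
    for u0 in strategies
    for p0 in strategies
    if all(L(u0, p) <= L(u0, p0) for p in strategies)
    and all(L(u0, p0) <= L(u, p0) for u in strategies)
]
print(min_max, max_min, saddle_points)
```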
In order to get a nice result, which also applies to the coin game, let
us pass to a probabilistic approach. We assume that $P$ has the strategies
$E_1, \ldots, E_N$ available, while $Q$ has $F_1, \ldots, F_N$. We let
Furthermore, we assume that
player $P$ chooses strategy $E_i$ with probability $p_i$,
while $Q$ chooses $F_j$ with probability $q_j$. Set $p := (p_1, \ldots, p_N)$ and
$q := (q_1, \ldots, q_N)$. Since $p_i$ and $q_j$ are probabilities, we get $p \in A$ and $q \in B$,
where
$$A := \Bigl\{p \in \mathbb{R}^N\colon 0 \le p_i \le 1 \text{ for all } i,\ \sum_{i=1}^N p_i = 1\Bigr\},$$
$$B := \Bigl\{q \in \mathbb{R}^N\colon 0 \le q_j \le 1 \text{ for all } j,\ \sum_{j=1}^N q_j = 1\Bigr\}.$$
Definition 3. Each pair $(p, q)$ in $A \times B$ is called a mixed strategy. The
quantity
$$\mathcal{L}(p, q) := \sum_{i=1}^N \sum_{j=1}^N L(E_i, F_j)\, p_i q_j$$
is called the expected win value for player $Q$.
The pair $(p_0, q_0)$ in $A \times B$ is called an optimal mixed strategy pair iff
$(p_0, q_0)$ is a saddle point of $\mathcal{L}$.
The following main theorem of game theory was proved by John von
Neumann in 1928.
Proposition 4. Under the above hypotheses, there always exists an optimal
mixed strategy pair, and
$$\min_{p \in A} \max_{q \in B} \mathcal{L}(p, q) = \max_{q \in B} \min_{p \in A} \mathcal{L}(p, q). \tag{51}$$
For any optimal mixed strategy pair $(p_0, q_0)$, the common value of the
two expressions in (51) is equal to $\mathcal{L}(p_0, q_0)$.
This is a special case of Theorem 2.G, since $A$ and $B$ are compact convex
sets.
Example 5 (Probabilistic coin game). Parallel to Example 2, we assume
that
player $P$ chooses $H$ (heads) and $T$ (tails) with probability $p_1$ and $p_2$,
respectively.
Analogously, $Q$ chooses $H$ and $T$ with probability $q_1$ and $q_2$, respectively.
If we set $E_1 = F_1 = H$ and $E_2 = F_2 = T$, then the expected win value for
player $Q$ is given by
$$\mathcal{L}(p, q) = \sum_{i=1}^2 \sum_{j=1}^2 L(E_i, F_j)\, p_i q_j = (p_1 - p_2)(q_1 - q_2).$$
We also set
$$A := \{(p_1, p_2) \in \mathbb{R}^2\colon 0 \le p_1 \le 1,\ 0 \le p_2 \le 1, \text{ and } p_1 + p_2 = 1\},$$
and $B := A$. Thus, $A$ is the line segment connecting the points $(0, 1)$ and
$(1, 0)$ in $\mathbb{R}^2$.
An elementary argument shows that the point
$$(p_0, q_0) \quad \text{with } p_0 = q_0 := \bigl(\tfrac{1}{2}, \tfrac{1}{2}\bigr)$$
is the only saddle point of $\mathcal{L}$ with respect to $A \times B$ and $\mathcal{L}(p_0, q_0) = 0$.
As expected, the optimal mixed strategy for each player consists in choosing
heads and tails with equal probability. Therefore, this game is fair.
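The expected win value of Example 5 can be evaluated on a discretization of the segment $A = B$. The sketch below samples the simplex at 21 points (an arbitrary choice), checks the saddle-point inequalities at $p_0 = q_0 = (\frac{1}{2}, \frac{1}{2})$ up to rounding, and shows that a pair of pure strategies fails them. The function names and the tolerance are assumptions made for illustration.

```python
def expected_gain(payoff, p, q):
    """Expected win value of Q: sum over i, j of payoff[i][j] * p[i] * q[j]."""
    return sum(
        payoff[i][j] * p[i] * q[j]
        for i in range(len(p))
        for j in range(len(q))
    )


def is_saddle(payoff, p0, q0, samples, tol=1e-12):
    value = expected_gain(payoff, p0, q0)
    return all(
        expected_gain(payoff, p0, q) <= value + tol for q in samples
    ) and all(
        value <= expected_gain(payoff, p, q0) + tol for p in samples
    )


payoff = [[1, -1], [-1, 1]]  # L(E_i, F_j) for the coin game
samples = [(k / 20, 1 - k / 20) for k in range(21)]  # points of the segment A

half = (0.5, 0.5)
value = expected_gain(payoff, half, half)
mixed_ok = is_saddle(payoff, half, half, samples)
pure_ok = is_saddle(payoff, (1.0, 0.0), (1.0, 0.0), samples)
print(value, mixed_ok, pure_ok)
```

The pure pair $(H, H)$ is rejected because player $P$ can switch to tails and turn the gain of $Q$ from $1$ into $-1$.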
2.15 The Ekeland Principle about
Quasi-Minimal Points
Our point of departure is the following minimum problem:
$$F(u) = \min!, \quad u \in M. \tag{52}$$
We want to prove the existence of quasi-solutions. To this end, we consider
the following regularized problem:
$$F(w) + \varepsilon\lambda^{-1}\|u - w\| = \min!, \quad w \in M, \tag{53}$$
for fixed numbers $\varepsilon > 0$ and $\lambda > 0$.
Definition 1. Each solution $u \in M$ of (53) is called an $\varepsilon$-quasi-solution of
(52), that is,
$$F(u) < F(w) + \varepsilon\lambda^{-1}\|u - w\| \quad \text{for all } w \in M \text{ with } w \ne u. \tag{54}$$
Obviously, each solution $u$ of the original problem (52) is also an
$\varepsilon$-quasi-solution.
We assume that
(H1) $M$ is a nonempty closed subset of the real Banach space $X$.
(H2) The functional $F\colon M \to \mathbb{R}$ is lower semicontinuous and
$\inf_{w \in M} F(w) > -\infty$.
(H3) For given numbers $\varepsilon > 0$ and $\lambda > 0$, we choose a point $v \in M$ such
that
$$F(v) \le \inf_{w \in M} F(w) + \varepsilon.$$
Theorem 2.H. There exists an $\varepsilon$-quasi-solution $u \in M$ of (52) such that
$$\|u - v\| \le \lambda \quad \text{and} \quad F(u) \le F(v).$$
This theorem was proved by Ekeland in 1974.
Corollary 2. Suppose that the functional $F\colon X \to \mathbb{R}$ is lower semicontinuous
on the real Banach space $X$. Moreover, suppose that the Gateaux
derivative $F'(u)$ exists for each $u \in X$ and $\inf_{u \in X} F(u) > -\infty$.
Then, for each $\varepsilon > 0$, there exists a point $u \in X$ such that
$$F(u) \le \inf_{w \in X} F(w) + \varepsilon \quad \text{and} \quad \|F'(u)\| \le \varepsilon.$$
In the next two sections (2.16 and 2.17) we will show that the existence of
quasi-solutions implies the existence of solutions provided the Palais-Smale
condition, which represents some compactness condition, is satisfied.
Proof of Corollary 2. By Theorem 2.H with $\lambda = 1$, there exists a $u \in X$
such that
$$F(u) \le F(w) + \varepsilon\|u - w\| \quad \text{for all } w \in X.$$
We choose $w = u + tv$, where $t \in \mathbb{R}$ with $t > 0$ and $v \in X$. Then
$$t^{-1}\bigl(F(u + tv) - F(u)\bigr) \ge -\varepsilon\|v\|.$$
As $t \to +0$, we obtain $\langle F'(u), v\rangle \ge -\varepsilon\|v\|$. That is, $\langle F'(u), \pm z\rangle \ge -\varepsilon\|z\|$ for
all $z \in X$. Thus $\|F'(u)\| \le \varepsilon$. $\Box$
Proof of Theorem 2.H. It suffices to assume that $\lambda = 1$, since we can pass
from $\|\cdot\|$ to $\lambda^{-1}\|\cdot\|$. We inductively define a sequence $(u_n)$ for $n = 0, 1, \ldots$.
Let $u_0 := v$. If we know $u_n \in M$, then we construct $u_{n+1} \in M$ as follows.
Case 1: $F(w) > F(u_n) - \varepsilon\|u_n - w\|$ for all $w \in M$ with $w \ne u_n$. Then, let
$u_{n+1} := u_n$. Hence
$$F(u_{n+1}) = F(u_n).$$
Case 2: $F(w) \le F(u_n) - \varepsilon\|u_n - w\|$ for some $w \in M$ with $w \ne u_n$. Let $S_n$ be the set
of all these points $w$, and let $\alpha_n := \inf_{w \in S_n} F(w)$. We then choose a point
$u_{n+1} \in S_n$ with
$$F(u_{n+1}) \le \alpha_n + 2^{-1}\bigl(F(u_n) - \alpha_n\bigr).$$
This is possible since $F(u_n) \ge \alpha_n + \varepsilon\|u_n - w\|$ for each $w \in S_n$, and hence
$F(u_n) > \alpha_n$. It follows from $u_{n+1} \in S_n$ that
$$F(u_{n+1}) \le F(u_n) - \varepsilon\|u_n - u_{n+1}\| \le F(u_n).$$
Our construction is so constituted that all the $F(u_n)$ form a nonincreasing
sequence, which by (H2) is bounded below and hence convergent. Let
$F(u_n) \to \beta$ as $n \to \infty$. We will show that the sequence $(u_n)$ is convergent.
In fact, by construction,
$$\varepsilon\|u_n - u_{n+1}\| \le F(u_n) - F(u_{n+1}) \quad \text{for all } n.$$
Using the triangle inequality, addition yields
$$\varepsilon\|u_n - u_m\| \le F(u_n) - F(u_m) \quad \text{for all } m > n. \tag{55}$$
Therefore, $(u_n)$ is a Cauchy sequence and thus a convergent sequence. Let
$u_n \to u$ as $n \to \infty$. Hence $u \in M$ and
$$F(u) \le \lim_{n \to \infty} F(u_n), \tag{56}$$
since $F$ is weakly sequentially lower semicontinuous, by Lemma 5 in Section
2.5. We shall show that the point $u$ has all the desired properties.
Proof of $F(u) \le F(v)$. From (56) and $F(u_n) \le F(u_0)$ for all $n$, it follows
that $F(u) \le F(u_0)$. Note that $u_0 = v$.
Proof of $\|v - u\| \le 1$. For $n = 0$ and $m \to \infty$, it follows from (55) and
(56) that
$$\varepsilon\|v - u\| \le F(v) - F(u) \le F(v) - \inf_{w \in M} F(w) \le \varepsilon.$$
Proof of (54). On the contrary, suppose that (54) is false. Then, there
exists a point $w \in M$ with $w \ne u$ such that
$$\varepsilon\|w - u\| + F(w) \le F(u). \tag{57}$$
As $m \to \infty$, from (55) and (56) we obtain
$$\varepsilon\|u_n - u\| \le F(u_n) - F(u) \quad \text{for all } n.$$
The triangle inequality yields
$$\varepsilon\|u_n - w\| \le F(u_n) - F(w) \quad \text{for all } n.$$
By Case 2, $w \in S_n$ for all $n$. Hence
$$2F(u_{n+1}) - F(u_n) \le \alpha_n \le F(w) \quad \text{for all } n.$$
This implies $\beta \le F(w)$, since $F(u_n) \to \beta$ as $n \to \infty$. From (56) it follows
that $F(u) \le \beta$. Thus, $F(u) \le F(w)$. By (57), $F(w) < F(u)$. This is a
contradiction. $\Box$
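The inductive construction in the proof of Theorem 2.H can be carried out literally when $M$ is a finite subset of $\mathbb{R}$, because then the infimum $\alpha_n$ over $S_n$ is attained and the iteration stops after finitely many steps. The following sketch takes $\lambda = 1$; the test function, the grid, and the starting point $v$ are illustrative assumptions, and choosing a minimizer of $F$ on $S_n$ is just one admissible choice of $u_{n+1}$.

```python
import math


def ekeland_quasi_solution(F, M, v, eps):
    """Follow Case 1 / Case 2 of the proof (lambda = 1) on a finite set M."""
    u = v
    while True:
        # S_n: all w != u_n with F(w) <= F(u_n) - eps * |u_n - w|  (Case 2)
        S = [w for w in M if w != u and F(w) <= F(u) - eps * abs(u - w)]
        if not S:
            return u  # Case 1: u is an eps-quasi-solution
        # A minimizer on S_n satisfies F(u_{n+1}) <= alpha_n + (F(u_n) - alpha_n)/2.
        u = min(S, key=F)


def F(x):
    # Hypothetical nonconvex functional, bounded below on M.
    return x * x + 0.3 * math.sin(15 * x)


M = [k / 100 for k in range(-200, 201)]  # finite subset of [-2, 2]
eps = 0.5
v = 0.25
assert F(v) <= min(F(w) for w in M) + eps  # hypothesis (H3)

u = ekeland_quasi_solution(F, M, v, eps)
print(u, F(u))
```

Since $F$ strictly decreases in Case 2 and $M$ is finite, the loop terminates; the returned point then satisfies (54) on $M$ by the stopping criterion.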
2.16 Applications to a General Minimum
Principle via the Palais-Smale Condition
Together with the minimum problem
$$F(u) = \min!, \quad u \in X, \tag{58}$$
we consider the operator equation
$$F'(u) = 0, \quad u \in X. \tag{59}$$
The following existence theorem is based on an important compactness
property of functionals, which we shall first define.
Definition 1. Suppose that the functional $F\colon X \to \mathbb{R}$ has a Gateaux derivative
$F'(u)$ for each point $u \in X$, where $X$ is a Banach space. Then, $F$
satisfies the Palais-Smale condition (PS) iff the following holds:
If $(u_n)$ is a sequence in $X$ with these two properties:
(i) $(F(u_n))$ is bounded, and
(ii) $\|F'(u_n)\| \to 0$ as $n \to \infty$,
then $(u_n)$ has a convergent subsequence.
Theorem 2.I. Let $F\colon X \to \mathbb{R}$ be a functional on the real Banach space $X$
such that the following hold:
(i) The Gateaux derivative $F'(u)$ exists for each $u \in X$.
(ii) The functional $F$ is lower semicontinuous (e.g., continuous), is bounded
below on $X$, and satisfies the Palais-Smale condition (PS).
Then, the minimum problem (58) has a solution $u$, which also satisfies
the operator equation (59).
Proof. Let $\alpha := \inf_{u \in X} F(u)$. Then $\alpha > -\infty$. According to Corollary 2 in
Section 2.15, there exists a sequence $(u_n)$ in $X$ such that
$$F(u_n) \le \alpha + n^{-1} \quad \text{and} \quad \|F'(u_n)\| \le n^{-1} \quad \text{for all } n = 1, 2, \ldots.$$
The condition (PS) ensures the existence of a subsequence, again denoted
by $(u_n)$, such that $u_n \to u$. By Lemma 5 in Section 2.5, $F$ is weakly
sequentially lower semicontinuous on $X$. Hence
$$F(u) \le \lim_{n \to \infty} F(u_n).$$
Thus, we get $F(u) = \alpha$, so that $u$ is a solution of (58). By Theorem 2.E,
$F'(u) = 0$. $\Box$
2.17 Applications to the Mountain Pass Theorem
We assume that
(H1) The functional $H\colon Y \to \mathbb{R}$ is $C^1$ on the real Banach space $Y$, and $H$
satisfies the Palais-Smale condition (PS).
(H2) There exist positive constants $R$ and $\alpha$ such that
$$H(y) \ge \alpha \quad \text{for all } y \in Y \text{ with } \|y\| = R.$$
(H3) $H(0) < \alpha$.
(H4) There exists a point $y_1 \in Y$ such that
$$\|y_1\| > R \quad \text{and} \quad H(y_1) < \alpha.$$
(H5) We denote by $M$ the set of all continuous functions $p\colon [0, 1] \to Y$ with
$$p(0) = 0 \quad \text{and} \quad p(1) = y_1.$$
Furthermore, we set
$$c := \inf_{p \in M} \sup_{0 \le t \le 1} H(p(t)). \tag{60}$$
Let us discuss the intuitive meaning of this situation. If $Y := \mathbb{R}^2$, then
we can think of $H(y)$ as the height of a mountain landscape at the point
$y$. We shall designate the points $y$ with $\|y\| = R$ as a mountain chain $C$.
Then, by (H3) and (H4), valleys occur at the points $y = 0$ and $y = y_1$.
To each $p(\cdot)$ there corresponds a path that connects the two valleys over
the mountain chain $C$. Intuitively, one now expects that there exists a saddle
point of our landscape at height $c$. The following theorem justifies this
expectation.
Theorem 2.J. If (H1) through (H5) hold, then the functional $H$ possesses
a critical point $y \in Y$, that is,
$$H'(y) = 0.$$
In addition, $H(y) = c$ and $c \ge \alpha$.
This theorem was proved by Ambrosetti and Rabinowitz in 1973. Numerous
interesting applications of this important theorem to periodic solutions
of Hamiltonian systems and to nonlinear partial differential equations
can be found in Rabinowitz (1986), Mawhin and Willem (1987), Ekeland
(1990), and Struwe (1990). In particular, applications to the famous
N-body problem in celestial mechanics are contained in Ambrosetti and
Coti-Zelati (1993) (cf. Problem 2.9).
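A one-dimensional toy case may help to visualize the statement of Theorem 2.J. In the sketch below we take $Y = \mathbb{R}$ and the hypothetical functional $H(y) = y^2/2 - y^4/4$ with $R = 1$, $\alpha = 1/4$, and $y_1 = 2$; none of these choices comes from the text. In one dimension every continuous path from $0$ to $y_1$ covers the interval $[0, y_1]$, so the value $c$ from (60) is already attained along the straight path, which is the only path sampled here.

```python
def H(y):
    # Hypothetical "landscape" on Y = R (not from the text).
    return y**2 / 2 - y**4 / 4


def dH(y):
    # Derivative of H.
    return y - y**3


R, alpha, y1 = 1.0, 0.25, 2.0

# Hypotheses (H2) through (H4); in R the "mountain chain" |y| = R is {-R, R}.
assert H(R) >= alpha and H(-R) >= alpha
assert H(0.0) < alpha
assert abs(y1) > R and H(y1) < alpha

# Sample the straight path p(t) = t * y1 for t in [0, 1].
path = [(k / 2000) * y1 for k in range(2001)]
c = max(H(y) for y in path)      # approximates c from (60) in this 1D case
y_crit = max(path, key=H)        # point on the path where the maximum occurs
print(c, y_crit, dH(y_crit))
```

The maximum along the path occurs at $y = 1$, where $H'(1) = 0$ and $H(1) = 1/4 \ge \alpha$, in agreement with the conclusion of the theorem for this example.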
Proof. We want to use Theorem 2.H. To this end, let $X$ denote the set
of all continuous functions $p\colon [0, 1] \to Y$. Then, $X$ becomes a real Banach
space equipped with the norm
$$\|p\|_X := \max_{0 \le t \le 1} \|p(t)\|_Y$$
(use the same argument as in the proof of Standard Example 6 in Section
1.3 of AMS Vol. 108). Define the functional $F\colon X \to \mathbb{R}$ through
$$F(p) := \max_{0 \le t \le 1} H(p(t)).$$
This definition makes sense, since the functional $H\colon Y \to \mathbb{R}$ is continuous,
by (H1). Thus, the continuous function $t \mapsto H(p(t))$ attains its maximum
on the compact set $[0, 1]$.
Our goal is the investigation of the following minimum problem:
$$F(p) = \min!, \quad p \in M. \tag{61}$$
By (60), $c = \inf_{p \in M} F(p)$.
We also set $d := \max\{H(0), H(y_1)\}$. From (H2) through (H4) we get
$$c \ge \alpha > d. \tag{62}$$
Step 1: Existence of a quasi-solution via Theorem 2.H. Choose $\varepsilon > 0$
such that
$$0 < \varepsilon < c - d. \tag{63}$$
Since $c$ denotes the infimum of $F$ on the set $M$, there exists a $v \in M$ such
that
$$c \le F(v) \le c + \varepsilon.$$
Letting $\lambda := \varepsilon^{1/2}$, it follows from Theorem 2.H in Section 2.15 that there
exists a point $u \in M$ such that
$$F(u) \le F(v) \quad \text{and} \quad \|u - v\|_X \le \varepsilon^{1/2}, \tag{64}$$
as well as
$$F(u) < F(w) + \varepsilon^{1/2}\|u - w\|_X \tag{65}$$
for all $w \in M$ with $w \ne u$.
Step 2: We shall show below that (64) and (65) imply the existence of a
number $s \in [0, 1]$ such that
$$\|H'(u(s))\| \le \varepsilon^{1/2} \tag{66}$$
and
$$c - \varepsilon \le H(u(s)). \tag{67}$$
It follows from $H(u(s)) \le F(u) \le F(v) \le c + \varepsilon$ that
$$c - \varepsilon \le H(u(s)) \le c + \varepsilon. \tag{67*}$$
Step 3: Existence of a solution via the Palais-Smale condition (PS). Set
$\varepsilon := \frac{1}{n}$ and $y_n := u(s)$. Then, it follows from (66) and (67*) that
$$\|H'(y_n)\| \le n^{-1/2} \quad \text{and} \quad c - n^{-1} \le H(y_n) \le c + n^{-1} \tag{68}$$
for sufficiently large $n \ge n_0$. By (PS), there exists a subsequence, again
denoted by $(y_n)$, such that $y_n \to y$ in $Y$ as $n \to \infty$. Thus, it follows from
(68) that
$$\|H'(y)\| = 0 \quad \text{and} \quad H(y) = c.$$
This is the desired result.
Step 4: It remains to prove the assertion from Step 2. Let $u \in M$ be given
as in Step 1. Recall that, by the definition of $M$, the path $u\colon [0, 1] \to Y$ is
continuous along with $u(0) = 0$ and $u(1) = y_1$. Observe that
$$\|H'(u(s))\| = \sup_{\|y\|_Y = 1} \langle H'(u(s)), y\rangle.$$
Furthermore, let us introduce the set
$$S := \{s \in [0, 1]\colon c - \varepsilon \le H(u(s))\}.$$
By (63), $d < c - \varepsilon$. Since $u(0) = 0$ and $H(0) \le d$, we get $0 \notin S$. Moreover,
since the functional $H$ is continuous on $Y$, the set $S$ is closed, and hence
$S$ is compact.
Suppose that the assertion from Step 2 is not true. Then,
$$\|H'(u(s))\| > \varepsilon^{1/2} \quad \text{for all } s \in S.$$
Thus, for each $s \in S$, there exists a point $y_s \in Y$ with $\|y_s\|_Y = 1$ such that
$$\langle H'(u(s)), -y_s\rangle > \varepsilon^{1/2}. \tag{69}$$
We want to construct a special path $p\colon [0, 1] \to Y$ with $p \in M$ such that
$p \ne u$
and
$$F(u) \ge \varepsilon^{1/2}\|p - u\|_X + F(p). \tag{70}$$
This is the desired contradiction to (65).
Step 5: Construction of $p$. Since $H'$ is continuous on $Y$, it follows from
(69) that for each $s \in S$, there exist a number $\beta_s > 0$ and an open interval
$J_s$ in $\mathbb{R}$ such that$^8$
$$\langle H'(u(t) + h), -y_s\rangle > \varepsilon^{1/2} \tag{71}$$
for all $t \in J_s$ and all $h \in Y$ with $\|h\| \le \beta_s$.
$^8$ Observe that
$$\Delta := |\langle H'(y) - H'(z), y_s\rangle| \le \|H'(y) - H'(z)\|\,\|y_s\| \quad \text{for all } y, z \in Y.$$
Consequently, $\Delta$ is sufficiently small provided $\|y - z\|_Y$ is sufficiently small.
Obviously, the family $\{J_s\}_{s \in S}$ of open intervals $J_s$ covers the compact
set $S$. Therefore, a finite subfamily $\{J_{s_1}, \ldots, J_{s_m}\}$ already covers the set
$S$. For brevity of notation, we set
$$J_k := J_{s_k}, \quad k = 1, \ldots, m.$$
Since $0 \notin S$, we may assume that $0 \notin J_k$, and hence
$$[0, 1] - J_k \text{ is closed and not empty for all } k = 1, \ldots, m.$$
Therefore, if $t \in \bigcup_{k=1}^m J_k$, then
$$\sum_{k=1}^m \operatorname{dist}(t, [0, 1] - J_k) > 0.$$
Define
$$\psi(t) := \begin{cases} 1 & \text{if } c \le H(u(t)), \\ 0 & \text{if } H(u(t)) \le c - \varepsilon. \end{cases}$$
Since the function $t \mapsto H(u(t))$ is continuous on $[0, 1]$, the function $\psi$ is
defined on two disjoint closed subsets of $[0, 1]$. Hence, it can be extended
to a continuous function $\psi\colon [0, 1] \to [0, 1]$, by the Tietze-Urysohn extension
theorem. Furthermore, for $j = 1, \ldots, m$, we define the function
$\psi_j\colon [0, 1] \to \mathbb{R}$ through
$$\psi_j(t) := \begin{cases} \dfrac{\operatorname{dist}(t, [0, 1] - J_j)}{\sum_{k=1}^m \operatorname{dist}(t, [0, 1] - J_k)} & \text{if } t \in \bigcup_{k=1}^m J_k, \\ 0 & \text{otherwise.} \end{cases}$$
One checks easily that $\psi_j$ is continuous on $[0, 1]$. Furthermore, we get
$$\sum_{j=1}^m \psi_j(t) \le 1 \quad \text{for all } t \in [0, 1]$$
and $\psi_j(t) = 0$ if $t \notin J_j$.
Finally, we introduce the continuous function $p\colon [0, 1] \to Y$ through
$$p(t) := u(t) + \beta\psi(t) \sum_{j=1}^m \psi_j(t)\, y_{s_j}, \tag{72}$$
where $\beta := \min\{\beta_{s_1}, \ldots, \beta_{s_m}\}$.
Let us investigate the properties of the special path $p$ from (72). We first
prove that $p \in M$. In fact, since $H(0) \le d$, $H(y_1) \le d$, and $d < c - \varepsilon$, we
get $\psi(0) = \psi(1) = 0$. Hence
$$p(0) = u(0) = 0 \quad \text{and} \quad p(1) = u(1) = y_1,$$
that is, $p \in M$.
Next let us prove inequality (70). To this end, set
$$\varphi(r) := H(u(t) + r[p(t) - u(t)]), \quad r \in \mathbb{R}.$$
By the classical mean value theorem, there is some $r_0 \in {]0, 1[}$ such that
$$\varphi(1) - \varphi(0) = \varphi'(r_0).$$
This is identical to
$$H(p(t)) - H(u(t)) = \langle H'(u(t) + r_0[p(t) - u(t)]), p(t) - u(t)\rangle. \tag{73}$$
Note that
$$p(t) - u(t) = \beta\psi(t) \sum_{j=1}^m \psi_j(t)\, y_{s_j} \quad \text{for all } t \in [0, 1].$$
Because $\|y_{s_j}\| = 1$, we have
$$\|p(t) - u(t)\|_Y \le \beta \sum_{j=1}^m \psi_j(t)\|y_{s_j}\| \le \beta \le \beta_{s_k}, \tag{74}$$
for all $k = 1, \ldots, m$ and $t \in [0, 1]$. Since $0 < r_0 < 1$, this implies
$$\|r_0(p(t) - u(t))\|_Y \le \beta_{s_k} \quad \text{for all } k = 1, \ldots, m \text{ and } t \in [0, 1].$$
Let $t \in S$. Then, $t \in J_k$ for some $k$. Thus, it follows from (71) and (73)
that$^9$
$$H(p(t)) - H(u(t)) = \beta\psi(t) \sum_{j=1}^m \psi_j(t) \langle H'(u(t) + r_0[p(t) - u(t)]), y_{s_j}\rangle \le \beta\psi(t) \sum_{j=1}^m \psi_j(t)\bigl(-\varepsilon^{1/2}\bigr).$$
$^9$ Observe that $\psi_j(t) = 0$ for $t \notin J_j$ and $\psi_j(t) > 0$ for $t \in J_j$.
Hence we get the following key inequality:
$$H(p(t)) - H(u(t)) \le -\varepsilon^{1/2}\beta\psi(t) \quad \text{for all } t \in S. \tag{75}$$
If $t \in [0, 1] - S$, then $H(u(t)) < c - \varepsilon$, and hence $\psi(t) = 0$. Thus, $p(t) = u(t)$,
which implies
$$H(p(t)) - H(u(t)) = 0 \quad \text{for all } t \in [0, 1] - S. \tag{76}$$
Hence
$$H(u(t)) \ge H(p(t)) \quad \text{for all } t \in [0, 1]. \tag{77}$$
Choose the number $a \in [0, 1]$ in such a way that
$$H(p(a)) = \max_{0 \le t \le 1} H(p(t)).$$
By the definition of $F$,
$$H(p(a)) = F(p).$$
Since $p \in M$ and $c = \inf_{q \in M} F(q)$, we get $F(p) \ge c$. By (77),
$$H(u(a)) \ge H(p(a)) = F(p) \ge c. \tag{78}$$
Hence $a \in S$ and $\psi(a) = 1$. It follows from (75) with $t := a$ that
$$H(u(a)) \ge \varepsilon^{1/2}\beta + H(p(a)).$$
By the definition of $F$, we get $F(u) \ge H(u(a))$, and hence
$$F(u) \ge \varepsilon^{1/2}\beta + F(p) > F(p), \tag{79}$$
which yields $p \ne u$ in $X$. According to (74),
$$\|p - u\|_X = \max_{0 \le t \le 1} \|p(t) - u(t)\|_Y \le \beta.$$
Thus, inequality (79) tells us that
$$F(u) \ge \varepsilon^{1/2}\|p - u\|_X + F(p).$$
This is the desired inequality (70). $\Box$
2.18 The Galerkin Method and Nonlinear
Monotone Operators
We want to solve the operator equation
$$Au = b, \quad u \in X. \tag{80}$$
To this end, we assume that
(H1) The operator $A\colon X \to X^*$ is monotone on the real, separable, reflexive
Banach space $X$, that is,
$$\langle Au - Av, u - v\rangle \ge 0 \quad \text{for all } u, v \in X.$$
(H2) The operator $A$ is continuous on each finite-dimensional subspace of
the Banach space $X$.
(H3) The operator $A$ is coercive, that is,
$$\lim_{\|u\| \to \infty} \frac{\langle Au, u\rangle}{\|u\|} = +\infty.$$
Theorem 2.K. For each given $b \in X^*$, the original equation (80) has a
solution $u$.
Corollary 1. The solution $u$ of (80) is unique provided the operator $A$ is
strictly monotone, that is,
$$\langle Au - Av, u - v\rangle > 0 \quad \text{for all } u, v \in X \text{ with } u \ne v. \tag{81}$$
In fact, if $Au = Av = b$, then (81) implies $u = v$. This yields Corollary 1.
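In the finite-dimensional case $X = \mathbb{R}^2$, where $X^*$ is identified with $\mathbb{R}^2$ and $\langle \cdot, \cdot \rangle$ with the Euclidean inner product, the hypotheses are easy to test numerically. The sketch below uses a hypothetical operator $A(u) = (u_1^3 + 2u_1 - u_2,\ u_2^3 + 2u_2 - u_1)$, which is strictly monotone and coercive; the operator, the right-hand side $b$, and the damped iteration used as a solver are all assumptions made for illustration and are not the method of the text.

```python
import random


def A(u):
    # Hypothetical strictly monotone, coercive operator on R^2.
    u1, u2 = u
    return (u1**3 + 2 * u1 - u2, u2**3 + 2 * u2 - u1)


def pairing(f, v):
    # Duality pairing <f, v>, here the Euclidean inner product.
    return f[0] * v[0] + f[1] * v[1]


def solve(b, step=0.1, iters=2000):
    """Damped iteration u <- u - step * (A(u) - b); a simple heuristic solver."""
    u = (0.0, 0.0)
    for _ in range(iters):
        Au = A(u)
        u = (u[0] - step * (Au[0] - b[0]), u[1] - step * (Au[1] - b[1]))
    return u


# Sample check of strict monotonicity (81) on random pairs u != v.
random.seed(0)
monotone_ok = True
for _ in range(1000):
    u = (random.uniform(-5, 5), random.uniform(-5, 5))
    v = (random.uniform(-5, 5), random.uniform(-5, 5))
    Au, Av = A(u), A(v)
    diff = (Au[0] - Av[0], Au[1] - Av[1])
    if pairing(diff, (u[0] - v[0], u[1] - v[1])) <= 0:
        monotone_ok = False

b = (1.0, 2.0)
u_star = solve(b)
Au_star = A(u_star)
residual = max(abs(Au_star[0] - b[0]), abs(Au_star[1] - b[1]))
print(monotone_ok, u_star, residual)
```

The random sampling only supports, and does not prove, strict monotonicity; for this operator it follows by hand from $(s^3 - t^3)(s - t) \ge 0$ and the positive definiteness of the linear part.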
Theorem 2.K is called the main theorem on monotone operators. This
famous result was proved independently by Browder and Minty in 1963.
Important applications to partial differential equations and integral equations
can be found in Zeidler (1986), Vol. 2B.
The proof of Theorem 2.K will be based on the following three lemmas.
First let us investigate the real system
$$g_j(x) = 0, \quad x \in \mathbb{R}^N, \quad j = 1, \ldots, N. \tag{82}$$
Here, we set $x := (\xi_1, \ldots, \xi_N)$ and $B := \{x \in \mathbb{R}^N\colon \|x\| \le R\}$ for fixed
$R > 0$, where $\|\cdot\|$ denotes a given norm on $\mathbb{R}^N$, $N \ge 1$.
Lemma 2 (Existence principle). Suppose that
(i) The function $g_j\colon B \to \mathbb{R}$ is continuous for each $j = 1, \ldots, N$.
(ii) For all $x \in \mathbb{R}^N$ with $\|x\| = R$,
$$\sum_{j=1}^N g_j(x)\xi_j \ge 0.$$
Then, equation (82) has a solution $x \in B$.
Proof. We will use the Brouwer fixed-point theorem from Section 1.14 in
AMS Vol. 108. Set $g(x) := (g_1(x), \ldots, g_N(x))$ and suppose that $g(x) \ne 0$
for all $x \in B$. Let
$$f(x) := -\frac{R\,g(x)}{\|g(x)\|} \quad \text{for all } x \in B.$$
The function $f\colon B \to \mathbb{R}^N$ is continuous on the compact convex set $B$. In
addition,
$$\|f(x)\| = R \quad \text{for all } x \in B. \tag{83}$$
Thus, $f(B) \subseteq B$. By the Brouwer fixed-point theorem, the map $f$ has a
fixed point $x$:
$$f(x) = x, \quad x \in B.$$
By (83), $\|x\| = R$. Furthermore,
$$\sum_{j=1}^N g_j(x)\xi_j = -R^{-1}\|g(x)\| \sum_{j=1}^N f_j(x)\xi_j = -R^{-1}\|g(x)\| \sum_{j=1}^N \xi_j^2 < 0.$$
This contradicts assumption (ii). $\Box$
Lemma 3 (The monotonicity trick). Assume (H1) and (H2). Let $b \in X^*$.
Then it follows from
$$u_n \rightharpoonup u \quad \text{in } X \text{ as } n \to \infty,$$
$$Au_n \rightharpoonup b \quad \text{in } X^* \text{ as } n \to \infty,$$
and $\langle Au_n, u_n\rangle \to \langle b, u\rangle$ as $n \to \infty$ that $Au = b$.
Proof. By the monotonicity of $A$,
$$\langle Au_n, u_n\rangle - \langle Av, u_n\rangle - \langle Au_n - Av, v\rangle = \langle Au_n - Av, u_n - v\rangle \ge 0,$$
for all $v \in X$ and all $n$. Letting $n \to \infty$, we get$^{10}$
$$\langle b, u\rangle - \langle Av, u\rangle - \langle b - Av, v\rangle \ge 0,$$
and hence
$$\langle b - Av, u - v\rangle \ge 0 \quad \text{for all } v \in X. \tag{84}$$
Next let $v := u - tw$, where $t > 0$ and $w \in X$. Then, relation (84) implies
$$\langle b - A(u - tw), w\rangle \ge 0 \quad \text{for all } w \in X.$$
Letting $t \to +0$, from (H2) we get
$$\langle b - Au, w\rangle \ge 0 \quad \text{for all } w \in X.$$
Replacing $w$ with $-w$, this implies $\langle b - Au, w\rangle = 0$ for all $w \in X$ (i.e.,
$b - Au = 0$). $\Box$
$^{10}$ By Problem 3.5, $Au_n \rightharpoonup b$ in $X^*$ as $n \to \infty$ implies
$\langle Au_n, v\rangle \to \langle b, v\rangle$ as $n \to \infty$ for all $v \in X$.
Lemma 4 (Local boundedness). Assume (H1). Then, the operator $A\colon X \to X^*$
is locally bounded, that is, for each point $u \in X$, there exist numbers
$r > 0$ and $\delta > 0$ such that
$$\|Av\| \le \delta \quad \text{for all } v \in X \text{ with } \|v - u\| \le r.$$
Proof. Assume that $A$ is not locally bounded. Then, there exist a point
$u \in X$ and a sequence $(u_n)$ with
$$u_n \to u \quad \text{and} \quad \|Au_n\| \to \infty \quad \text{as } n \to \infty.$$
Without loss of generality, we may assume that $u = 0$. We set
$$a_n := (1 + \|Au_n\|\,\|u_n\|)^{-1}.$$
It follows from the monotonicity of the operator $A$ that
$$\langle Au_n - A(\pm v), u_n - (\pm v)\rangle \ge 0,$$
and hence
$$\pm a_n\langle Au_n, v\rangle \le a_n\bigl(\langle Au_n, u_n\rangle - \langle A(\pm v), u_n \mp v\rangle\bigr) \le a_n\bigl(\|Au_n\|\,\|u_n\| + \|A(\pm v)\|\,\|u_n \mp v\|\bigr).$$
Since $a_n\|Au_n\|\,\|u_n\| \le 1$ and $a_n \le 1$ for all $n$, we get
$$\sup_n |a_n\langle Au_n, v\rangle| < \infty \quad \text{for all } v \in X.$$
According to the Banach-Steinhaus theorem from Section 3.3, there exists
a number $N$ such that
$$\sup_n \|a_n Au_n\| \le N.$$
We set $b_n := \|Au_n\|$. Then,
$$b_n \le a_n^{-1}N = (1 + b_n\|u_n\|)N \quad \text{for all } n.$$
Since $\|u_n\| \to 0$ as $n \to \infty$, the sequence $(b_n)$ is bounded.$^{11}$ This is a
contradiction to $\|Au_n\| \to \infty$ as $n \to \infty$. $\Box$
Proof of Theorem 2.K. We will use the Galerkin method, which generalizes
the Ritz method from Section 2.6 of AMS Vol. 108. The basic idea
is to replace the original operator equation $Au = b$ by the upcoming
finite-dimensional approximate equation (85) and to prove the convergence of
this approximation method.
If $X = \{0\}$, then the assertion of Theorem 2.K is trivial. Therefore, let
$X \ne \{0\}$.
Step 1: The Galerkin equations. Since the Banach space $X$ is separable,
there exists a countable set $\{x_1, x_2, \ldots\}$ that is dense in $X$. Set
$$X_n := \operatorname{span}\{x_1, \ldots, x_n\}.$$
Thus, we get $X_1 \subseteq X_2 \subseteq \cdots$ and
$$\bigcup_{n=1}^{\infty} X_n \text{ is dense in } X.$$
By definition, the $n$th Galerkin equation reads as follows:
$$\langle Au_n - b, v\rangle = 0 \quad \text{for fixed } u_n \in X_n \text{ and all } v \in X_n. \tag{85}$$
Step 2: Solution of the Galerkin equation. Let {el, ... , eN} be a basis of
X n , and let
N N
v:= L:ejej, Un := L:enjej,
j=l j=l
as well as
11 In fact, there is some $n_0$ such that $N\|u_n\| \le 2^{-1}$ for all $n \ge n_0$. Hence
$2^{-1} b_n \le N$ for all $n \ge n_0$.
2.18 The Galerkin Method and Nonlinear Monotone Operators 97
Finally, we define the real functions
$g_j(x_n) := \langle Au_n - b, e_j\rangle$, $j = 1, \ldots, N$.
Therefore, the Galerkin equation (85) is equivalent to the following system:
$g_j(x_n) = 0$, $x_n \in \mathbb{R}^N$, $j = 1, \ldots, N$. (86)
By (H2), the function $g_j\colon \mathbb{R}^N \to \mathbb{R}$ is continuous for each $j = 1, \ldots, N$.
Furthermore, there exists a number $R > 0$ such that
$\sum_{j=1}^{N} g_j(x)\xi_j \ge 0$ for all $x \in \mathbb{R}^N$ with $\|x\| = R$.
In fact, we have
$\sum_{j=1}^{N} g_j(x)\xi_j = \langle Av - b, v\rangle$.
By (H3), $\langle Av, v\rangle / \|v\| \to +\infty$ as $\|v\| \to \infty$. Hence
$\langle Av - b, v\rangle = \langle Av, v\rangle - \langle b, v\rangle$
$\ge \langle Av, v\rangle - \|b\|\,\|v\| \ge 0$
for all $v \in X$ with $\|v\| = R$ and fixed sufficiently large $R > 0$.
According to Lemma 2, system (86) has a solution $x_n$ with $\|x_n\| \le R$.
Consequently, the Galerkin equation (85) has a solution $u_n$, where
$\|u_n\| \le R$ for all n. (87)
Step 3: Boundedness of the sequence $(Au_n)$. By Lemma 4, the operator
A is locally bounded, that is, there exist positive numbers r and $\delta$ such
that
$\|v\| \le r$ implies $\|Av\| \le \delta$.
The operator A is monotone. Thus,
$\langle Au_n - Av, u_n - v\rangle \ge 0$.
By the Galerkin equation (85),
$\langle Au_n, u_n\rangle = \langle b, u_n\rangle$ for all n.
Hence
$\langle Au_n, u_n\rangle \le \|b\|\,\|u_n\| \le \|b\|R$ for all n.
By the definition of the norm in X* and by the monotonicity of A,
$\|Au_n\| = \sup_{\|v\|=r} r^{-1}\langle Au_n, v\rangle$
$\le \sup_{\|v\|=r} r^{-1}\bigl(\langle Av, v\rangle + \langle Au_n, u_n\rangle - \langle Av, u_n\rangle\bigr)$
$\le r^{-1}(\delta r + \|b\|R + \delta R)$.
Step 4: Convergence of the Galerkin method. The Banach space X is
reflexive. Thus, the bounded sequence $(u_n)$ has a weakly convergent
subsequence, again denoted by $(u_n)$, that is,
$u_n \rightharpoonup u$ as $n \to \infty$.
From the Galerkin equation (85) it follows that
$\lim_{n\to\infty} \langle Au_n, w\rangle = \langle b, w\rangle$ for all $w \in \bigcup_{k=1}^{\infty} X_k$. (88)
Since the sequence $(Au_n)$ is bounded and $\bigcup_{k=1}^{\infty} X_k$ is dense in X, we get
$\lim_{n\to\infty} \langle Au_n, z\rangle = \langle b, z\rangle$ for all $z \in X$. (89)
In fact, for each $z \in X$ and given $\varepsilon > 0$, there is a $w \in \bigcup_{k=1}^{\infty} X_k$ such that
$\|z - w\| < \varepsilon$. Hence
$|\langle Au_n, z\rangle - \langle Au_n, w\rangle| \le \sup_n \|Au_n\|\,\|z - w\| < \text{const} \cdot \varepsilon$.
Thus, (88) implies (89). Since X is reflexive, relation (89) is equivalent to
$Au_n \rightharpoonup b$ as $n \to \infty$
(cf. Problem 3.5). Furthermore, it follows from the Galerkin equation (85)
that
$\langle Au_n, u_n\rangle = \langle b, u_n\rangle \to \langle b, u\rangle$ as $n \to \infty$.
Now, Lemma 3 tells us that $Au = b$. □
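The Galerkin scheme of the proof can be tried out in a finite-dimensional toy setting. The following sketch rests on assumptions that are not part of the text: X is taken to be R^6 with the Euclidean inner product, the monotone operator is A(u) = Mu + u^3 (M tridiagonal with 2 on the diagonal and -1 beside it, the cube taken componentwise), X_n is spanned by the first n coordinate vectors, and a fixed-step descent iteration stands in for the existence argument via Lemma 2.

```python
def apply_A(u):
    """Illustrative monotone operator A(u) = M u + u^3 with M = tridiag(-1, 2, -1)."""
    d = len(u)
    out = []
    for j in range(d):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < d - 1 else 0.0
        out.append(2.0 * u[j] - left - right + u[j] ** 3)
    return out


def galerkin_solution(n, b, step=0.05, iterations=5000):
    """Approximate u_n in X_n with <A u_n - b, e_j> = 0 for the first n basis vectors."""
    u = [0.0] * len(b)
    for _ in range(iterations):
        Au = apply_A(u)
        for j in range(n):  # coordinates outside X_n stay zero
            u[j] -= step * (Au[j] - b[j])
    return u


def galerkin_residual(n, u, b):
    """Largest violation of the nth Galerkin equation (85)."""
    Au = apply_A(u)
    return max(abs(Au[j] - b[j]) for j in range(n))


def monotonicity_gap(u, v):
    """<Au - Av, u - v>, which is nonnegative for a monotone operator."""
    Au, Av = apply_A(u), apply_A(v)
    return sum((p - q) * (x - y) for p, q, x, y in zip(Au, Av, u, v))


b = [1.0] * 6
solutions = [galerkin_solution(n, b) for n in range(1, 7)]
```

For n = 6 the subspace X_n is the whole space, so the last Galerkin solution solves Au = b itself; in infinite dimensions this final step is replaced by the weak-convergence argument of Step 4.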
2.19 Symmetries and Conservation Laws
(The Noether Theorem)
The following fundamental fact was discovered by Emmy Noether in 1918:
Conservation laws in nature are caused by symmetries of variational
principles.
For example, as we will show here, conservation of energy corresponds to
invariance under time translation.
Let us start with the following fundamental symmetry property of the
one-dimensional Lagrangian $L = L(x, u, u')$:
$L(y, u(y,\varepsilon), u_y(y,\varepsilon))\, y_x(x,\varepsilon) = L(x, u(x), u'(x))$ (90)
for all $x \in\, ]a,b[$ and $\varepsilon \in\, ]-\varepsilon_0, \varepsilon_0[$, where the argument y has to be replaced
with $y(x,\varepsilon)$. Here
$y(x,0) \equiv x$, $u(x,0) \equiv u(x)$.
We assume that
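(A1) The Lagrangian $L\colon \mathbb{R}^3 \to \mathbb{R}$ is $C^2$. Let $-\infty \le a < b \le \infty$ and $\varepsilon_0 > 0$.
(A2) The $C^2$-function $u\colon ]a,b[\, \to \mathbb{R}$ is a solution of the corresponding
Euler-Lagrange equation
$\frac{d}{dx} L_{u'} - L_u = 0$ on $]a,b[$. (91)
(A3) The symmetry relation (90) holds. Here, the function $y = y(x,\varepsilon)$ is
$C^2$ on $]a,b[\, \times\, ]-\varepsilon_0, \varepsilon_0[$. Moreover, the function $u = u(y,\varepsilon)$ is $C^2$ on
some appropriate open set such that $(x,\varepsilon) \mapsto u(y(x,\varepsilon),\varepsilon)$ is $C^2$ on
$]a,b[\, \times\, ]-\varepsilon_0, \varepsilon_0[$. Set
$\delta y(x) := y_\varepsilon(x,0)$ and $\delta u(x) := \frac{d}{d\varepsilon} u(y(x,\varepsilon),\varepsilon)\big|_{\varepsilon=0}$.
Proposition 1. The function u satisfies the conservation law$^{12}$
$\frac{d}{dx}\bigl(L\,\delta y + L_{u'}(\delta u - u'\,\delta y)\bigr) = 0$ on $]a,b[$. (92)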
12 Explicitly, this means that
$\frac{d}{dx}\bigl(L(P)\,\delta y(x) + L_{u'}(P)(\delta u(x) - u'(x)\,\delta y(x))\bigr) = 0$ for all $x \in\, ]a,b[$,
where $P := (x, u(x), u'(x))$.
Standard Example 2 (Conservation of energy). Suppose that the Lagrangian
L does not depend on the real variable x. Letting
$y(x,\varepsilon) := x + \varepsilon$ and $u(y,\varepsilon) = u(x)$,
we get $\delta y(x) = 1$ and $\delta u(x) = 0$. Hence
$\frac{d}{dx}(u' L_{u'} - L) = 0$ on $]a,b[$. (93)
In mechanics this corresponds to conservation of the energy $E := u' L_{u'} - L$
provided we regard x as time. For example, if
$L := \frac{1}{2} m u'^2 - V(u)$,
then the Euler equation (91) is identical to the equation of motion
$m u'' = -V'(u)$,
and (93) describes conservation of energy:
$E := \frac{1}{2} m u'^2 + V(u) = \text{const}$.
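The conservation law (93) can be observed numerically. The sketch below is only an illustration under assumptions that do not come from the text: the potential is chosen as V(u) = 1 - cos u with m = 1, and the equation of motion mu'' = -V'(u) is integrated with the velocity Verlet scheme, so the energy is conserved only up to a small discretization error.

```python
import math


def energy(m, u, du, V):
    """E = (1/2) m u'^2 + V(u), the conserved quantity from (93)."""
    return 0.5 * m * du * du + V(u)


def simulate(m, V, dV, u0, du0, h, steps):
    """Integrate m u'' = -V'(u) with velocity Verlet and record the energy."""
    u, du = u0, du0
    acc = -dV(u) / m
    energies = [energy(m, u, du, V)]
    for _ in range(steps):
        du_half = du + 0.5 * h * acc
        u = u + h * du_half
        acc = -dV(u) / m
        du = du_half + 0.5 * h * acc
        energies.append(energy(m, u, du, V))
    return energies


def V(u):
    return 1.0 - math.cos(u)  # illustrative potential (assumption)


def dV(u):
    return math.sin(u)


energies = simulate(1.0, V, dV, u0=1.0, du0=0.0, h=1e-3, steps=10_000)
drift = max(energies) - min(energies)
print("energy drift over the run:", drift)
```

The drift shrinks when the step size h is reduced, which is consistent with E being constant along exact solutions.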
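Proof of Proposition 1. Differentiating the symmetry relation (90) with
respect to the parameter $\varepsilon$ at $\varepsilon = 0$, we get
$0 = L_x(P)\,y_\varepsilon(x,0) + L_u(P)\bigl(u_x(x,0)\,y_\varepsilon(x,0) + u_\varepsilon(x,0)\bigr)$
$\quad + L_{u'}(P)\bigl(u_{xx}(x,0)\,y_\varepsilon(x,0) + u_{\varepsilon x}(x,0)\bigr) + L(P)\,y_{\varepsilon x}(x,0)$,
where $P := (x, u(x), u'(x))$. Observing that $\delta y(x) = y_\varepsilon(x,0)$, $u_x(x,0) = u'(x)$
and letting $\phi(x) := u_\varepsilon(x,0)$, we see that this is identical to
$0 = L_x\,\delta y + L_u(u'\,\delta y + \phi) + L_{u'}(u''\,\delta y + \phi') + L(\delta y)'$
$\;\; = L'\,\delta y + L_u\phi + L_{u'}\phi' + L(\delta y)'$.
Using the Euler equation (91), $L_u = (L_{u'})'$, we find that
$0 = (L\,\delta y)' + (L_{u'}\phi)'$. (94)
Differentiating $u(y(x,\varepsilon),\varepsilon)$ with respect to $\varepsilon$ at $\varepsilon = 0$, we obtain
$\delta u(x) = u_x(x,0)\,y_\varepsilon(x,0) + u_\varepsilon(x,0) = u'(x)\,\delta y(x) + \phi(x)$,
and hence $\phi = \delta u - u'\,\delta y$. Thus, from (94) we get the assertion (92). □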
Now let us generalize this to multidimensional variational problems. Our
point of departure is the following symmetry relation for the Lagrangian
$L = L(x, u, \partial u)$:
$L(y, u(y,\varepsilon), \partial_y u(y,\varepsilon)) \det \partial_x y(x,\varepsilon) = L(x, u(x), \partial_x u(x))$ (95)
for all $x \in G$ and $\varepsilon \in\, ]-\varepsilon_0, \varepsilon_0[$, where the argument y has to be replaced
with $y(x,\varepsilon)$. Here
$y(x,0) \equiv x$, $u(x,0) \equiv u(x)$,
along with $u = (u_1, \ldots, u_M)$, $x = (\xi_1, \ldots, \xi_N)$, $y = (y_1, \ldots, y_N)$, as well as
$\partial_x u(x) = (\partial_1 u(x), \ldots, \partial_N u(x))$, where $\partial_j := \partial/\partial\xi_j$. We assume that
(H1) G is a nonempty open set in $\mathbb{R}^N$. The Lagrangian $L\colon \mathbb{R}^N \times \mathbb{R}^M \times \mathbb{R}^{NM} \to \mathbb{R}$ is $C^2$.
(H2) The $C^2$-function $u\colon G \to \mathbb{R}^M$ is a solution of the corresponding
Euler-Lagrange equation
$\sum_{j=1}^{N} \partial_j L_{\partial_j u_m} - L_{u_m} = 0$, $m = 1, \ldots, M$. (96)
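(H3) The symmetry relation (95) holds. Here, the function $y = y(x,\varepsilon)$ is
$C^2$ on $G \times\, ]-\varepsilon_0, \varepsilon_0[$. Moreover, the function $u = u(y,\varepsilon)$ is $C^2$ on
some appropriate open set such that $(x,\varepsilon) \mapsto u(y(x,\varepsilon),\varepsilon)$ is $C^2$ on
$G \times\, ]-\varepsilon_0, \varepsilon_0[$. Set
$\delta y(x) := y_\varepsilon(x,0)$ and $\delta u(x) := \frac{d}{d\varepsilon} u(y(x,\varepsilon),\varepsilon)\big|_{\varepsilon=0}$.
Theorem 2.L (The Noether theorem). Assume (H1) through (H3). Then,
the function u satisfies the conservation law
$\sum_{j=1}^{N} \partial_j J_j = 0$ on G
with the current $J = (J_1, \ldots, J_N)$, where
$J_j := L\,\delta y_j + L_{\partial_j u_m}(\delta u_m - \partial_n u_m\,\delta y_n)$.
Here, we sum over $n = 1, \ldots, N$ and $m = 1, \ldots, M$.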
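This is one of the most important theorems in mathematical physics. The
proof proceeds in complete analogy to the proof of Proposition 1. Observe
that
$\frac{d}{d\varepsilon} \det \partial_x y(x,\varepsilon)\big|_{\varepsilon=0} = \sum_{j=1}^{N} \partial_j\,\delta y_j(x)$.
This follows from
$y(x,\varepsilon) = x + \varepsilon\,\delta y(x) + o(\varepsilon)$, $\varepsilon \to 0$,
and hence
$\partial_x y(x,\varepsilon) = I + \varepsilon\,\partial_x(\delta y(x)) + o(\varepsilon)$, $\varepsilon \to 0$,
which implies
$\det \partial_x y(x,\varepsilon) = 1 + \varepsilon \operatorname{tr} \partial_x(\delta y(x)) + o(\varepsilon)$, $\varepsilon \to 0$.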
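The first-order expansion of the determinant used here is easy to check with exact rational arithmetic. In the following sketch, the 3 x 3 matrix B is an arbitrary stand-in for $\partial_x(\delta y(x))$ at a fixed point (an assumption made purely for illustration); the code shows that det(I + εB) - 1 - ε tr B is of order ε².

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def identity_plus(eps, B):
    """Return the matrix I + eps * B."""
    return [[(1 if r == c else 0) + eps * B[r][c] for c in range(3)]
            for r in range(3)]


def trace(B):
    return sum(B[k][k] for k in range(3))


# Arbitrary sample matrix standing in for the Jacobian of delta y (assumption).
B = [[1, 2, 0], [-1, 3, 1], [1, 0, -2]]


def remainder(eps):
    """det(I + eps B) - 1 - eps tr B, which should be of order eps^2."""
    return det3(identity_plus(eps, B)) - 1 - eps * trace(B)


for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    # The ratio stays bounded as eps shrinks, so the remainder is o(eps).
    print(eps, remainder(eps) / eps**2)
```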
Remark 3 (Interpretation of the symmetry relation). It follows from Sec-
tion 2.2 that the Euler-Lagrange equation (96) represents a necessary con-
dition for the following variational problem:
$\int_G L(x, u(x), \partial u(x))\,dx = \text{stationary!}$, (97)
$u = \text{given on } \partial G$.
The decisive symmetry relation (95) means that the variational integral is
invariant under the transformation $y = y(x,\varepsilon)$, that is,
$\int_{G_\varepsilon} L(y, u(y,\varepsilon), \partial_y u(y,\varepsilon))\,dy = \int_G L(x, u(x), \partial u(x))\,dx$ (98)
for all $\varepsilon \in\, ]-\varepsilon_0, \varepsilon_0[$, where G is transformed into $G_\varepsilon$ under $y = y(x,\varepsilon)$.
Remark 4. (The Noether theorem via local invariance of the variational
integral). The following method is used frequently in applications. Let the
variational problem (97) be given. Choose an arbitrary point x E G. Sup-
pose that we have the invariance condition
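$\int_{U_\varepsilon} L(y, u(y,\varepsilon), \partial_y u(y,\varepsilon))\,dy = \int_U L(x, u(x), \partial u(x))\,dx$
for all $\varepsilon \in\, ]-\varepsilon_0, \varepsilon_0[$, and for all sufficiently small neighborhoods U of the
point x, where U is transformed into $U_\varepsilon$ under $y = y(x,\varepsilon)$.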
If we shrink U to the point x, then it follows from the mean value theorem
for integrals that the crucial symmetry condition (95) is satisfied. Conse-
quently, we can use the Noether theorem with respect to the transformation
y = y(x,c).
Applications of the Noether theorem to gauge field theory and string
theory can be found in Section 2.20 and Problem 2.15, respectively.
2.20 The Basic Ideas of Gauge Field Theory
In this section, all the functions are assumed to be sufficiently smooth.
In order to understand the basic ideas of gauge field theories in modern
physics, let us first study a simple model. To this end, we consider the
following variational problem:
$\int_a^b (i\bar\phi\psi' - m\bar\phi\psi)\,dx = \text{stationary!}$, (99)
$\phi(x), \psi(x)$ = given at the boundary points $x = a, b$.
Here, $\bar\phi(x)$ denotes the complex conjugate number to $\phi(x)$.
Proposition 1. If $\phi, \psi$ is a solution of (99), then
$i\psi' - m\psi = 0$ on $]a,b[$, (100a)
$i\bar\phi' + m\bar\phi = 0$ on $]a,b[$, (100b)
and hence both $\phi$ and $\psi$ satisfy equation (100a).
Proof. If we let $L := i\bar\phi\psi' - m\bar\phi\psi$, the Euler-Lagrange equations are given
by
$\frac{d}{dx} L_{\bar\phi'} - L_{\bar\phi} = 0$ and $\frac{d}{dx} L_{\psi'} - L_{\psi} = 0$.
This yields (100). □
Corollary 2. If $\psi$ is a solution of (100a), then $\psi$ satisfies the following
conservation law:
$(\bar\psi\psi)' = 0$ on $]a,b[$. (100*)
This means that the "density" $\bar\psi\psi$ is constant.
Proof. By (100), $i(\bar\psi\psi)' = i\bar\psi'\psi + i\bar\psi\psi' = -m\bar\psi\psi + m\bar\psi\psi = 0$. □
Let $\alpha = \alpha(x)$ be a real function. The transformation
$\psi_+(x) = e^{i\alpha(x)}\psi(x)$, $\phi_+(x) = e^{i\alpha(x)}\phi(x)$
is called a (local) gauge transformation.$^{13}$ If $\alpha$ = const, then we speak of a
global gauge transformation.
Corollary 3. The Lagrangian $L = i\bar\phi\psi' - m\bar\phi\psi$ is invariant under global
gauge transformations.
Proof. Since $\psi_+ = e^{i\alpha}\psi$, $\psi_+' = e^{i\alpha}\psi'$, and $\bar\phi_+ = e^{-i\alpha}\bar\phi$, we get
$i\bar\phi_+\psi_+' - m\bar\phi_+\psi_+ = i\bar\phi\psi' - m\bar\phi\psi$. □
By Section 2.19, symmetries of the Lagrangian imply conservation laws.
We want to show that
The global gauge invariance of the Lagrangian L yields the conservation
law (100*).
To this end, set
$\psi(x,\alpha) := e^{i\alpha}\psi(x)$, $\bar\phi(x,\alpha) := e^{-i\alpha}\bar\phi(x)$.
Then
$\delta\psi(x) := \psi_\alpha(x,0) = i\psi(x)$, $\delta\bar\phi(x) := \bar\phi_\alpha(x,0) = -i\bar\phi(x)$.
Suppose that $\phi, \psi$ is a solution of the Euler equation (100) with $\phi = \psi$.
By the Noether theorem (Theorem 2.L), it follows from the global gauge
invariance
$L(\bar\phi(x,\alpha), \psi(x,\alpha)) = L(\bar\phi(x,0), \psi(x,0))$ for all $\alpha$
that
$(L_{\psi'}\,\delta\psi + L_{\bar\phi'}\,\delta\bar\phi)' = 0$.
Since $L = i\bar\phi\psi' - m\bar\phi\psi$, this yields $(\bar\phi\psi)' = 0$, which is (100*) because $\phi = \psi$. □
13 Gauge transformations are also called phase transformations.
In elementary particle physics, global gauge invariance implies the con-
servation of particle numbers or charges (e.g., the number of baryons or
the electric charge).
Now to the point. The following principle is crucial.
Gauge field theories correspond to Lagrangians that are invariant under
local gauge transformations (i.e., $\alpha$ depends on x).
In order to obtain such a Lagrangian, we have to modify L. The simplest
possible ansatz consists in replacing the classical derivative $d/dx$ by the
so-called covariant derivative
$\nabla := \frac{d}{dx} + i\kappa A$,
where the real "field" $A = A(x)$ depends on x, and the positive number $\kappa$
is called the coupling constant (of the interaction).
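Proposition 4. The modified Lagrangian
$L_A(\phi, \psi) := i\bar\phi\,\nabla\psi - m\bar\phi\psi$
is invariant under gauge transformations provided the field A is transformed
into $A_+$, where
$A_+ := A - \kappa^{-1}\alpha'$.
Proof. Letting
$\nabla_+ := \frac{d}{dx} + i\kappa A_+$,
it follows from $\psi_+(x) = e^{i\alpha(x)}\psi(x)$ that
$\nabla_+\psi_+ = e^{i\alpha}\nabla\psi$,
that is, the covariant derivative $\nabla\psi$ transforms like the function $\psi$. This is
the key property of $\nabla$. In fact,
$\nabla_+\psi_+ = (e^{i\alpha}\psi)' + i\kappa A_+ e^{i\alpha}\psi = e^{i\alpha}(i\alpha'\psi + \psi' + i\kappa A\psi - i\alpha'\psi)$
$\quad = e^{i\alpha}(\psi' + i\kappa A\psi) = e^{i\alpha}\nabla\psi$.
Consequently,
$L_{A_+}(\phi_+, \psi_+) = i\bar\phi_+\nabla_+\psi_+ - m\bar\phi_+\psi_+$
$\quad = i\bar\phi e^{-i\alpha}e^{i\alpha}\nabla\psi - m\bar\phi e^{-i\alpha}e^{i\alpha}\psi = i\bar\phi\,\nabla\psi - m\bar\phi\psi$
$\quad = L_A(\phi, \psi)$. □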
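The transformation rule for the covariant derivative can be checked numerically. The sketch below is an illustration only: the sample functions psi, phi, alpha, and A, the values of the coupling constant and the mass, and the use of central differences for d/dx are all assumptions made for the experiment, and the gauge field is transformed by subtracting alpha'/kappa, as the computation in the proof requires.

```python
import cmath
import math

KAPPA = 1.5  # coupling constant (illustrative value)
MASS = 2.0   # mass parameter m (illustrative value)
H = 1e-5     # step for central differences


def derivative(f, x):
    """Central-difference approximation of f'(x)."""
    return (f(x + H) - f(x - H)) / (2 * H)


# Sample fields, chosen arbitrarily for the check.
def psi(x):
    return cmath.exp(-x * x + 2j * x)


def phi(x):
    return (1 + x) * cmath.exp(1j * x * x)


def alpha(x):
    return math.sin(x)


def A(x):
    return x * x


# Gauge-transformed quantities.
def psi_plus(x):
    return cmath.exp(1j * alpha(x)) * psi(x)


def phi_plus(x):
    return cmath.exp(1j * alpha(x)) * phi(x)


def A_plus(x):
    return A(x) - derivative(alpha, x) / KAPPA


def covariant(f, field, x):
    """Covariant derivative f' + i kappa A f at the point x."""
    return derivative(f, x) + 1j * KAPPA * field(x) * f(x)


def lagrangian(f_phi, f_psi, field, x):
    """i conj(phi) (nabla psi) - m conj(phi) psi at the point x."""
    return (1j * f_phi(x).conjugate() * covariant(f_psi, field, x)
            - MASS * f_phi(x).conjugate() * f_psi(x))
```

Comparing `covariant(psi_plus, A_plus, x)` with `cmath.exp(1j * alpha(x)) * covariant(psi, A, x)` at a few points shows agreement up to the finite-difference error, and the two Lagrangian values agree to the same accuracy.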
Let us now replace the original variational problem (99) with the following
gauge invariant problem:
$\int_a^b (i\bar\phi\,\nabla\psi - m\bar\phi\psi)\,dx = \text{stationary!}$, (101)
$\phi(x), \psi(x)$ = given at the boundary points $x = a, b$.
That is, we replace the classical derivative $d/dx$ with the covariant
derivative $\nabla$.
Proposition 5. If $\phi, \psi$ is a solution of (101), then
$i\nabla\psi - m\psi = 0$ on $]a,b[$, (102a)
$i\nabla\phi - m\phi = 0$ on $]a,b[$. (102b)
Proof. Letting $L := i\bar\phi\psi' + i\bar\phi\, i\kappa A\psi - m\bar\phi\psi$, the Euler-Lagrange equations
are given by
$\frac{d}{dx} L_{\bar\phi'} - L_{\bar\phi} = 0$ and $\frac{d}{dx} L_{\psi'} - L_{\psi} = 0$.
This yields (102a) and $i(\bar\phi' - i\kappa A\bar\phi) + m\bar\phi = 0$, which is equivalent to (102b),
since A is real. □
It follows from the gauge invariance of the Lagrangian L that
The gauge field equations (102) are invariant under gauge
transformations.
Explicitly, this follows from
$i\nabla_+\psi_+ - m\psi_+ = e^{i\alpha}(i\nabla\psi - m\psi)$.
Thus, if $\psi$ is a solution of (102a), then $\psi_+$ is a solution of the transformed
equation
$i\nabla_+\psi_+ - m\psi_+ = 0$.
The same is true for $\phi$.
Remark 6 (Parallel transport). The function A allows a geometrical
interpretation. To explain this, let $C\colon x = x(t)$, $t_0 \le t \le t_1$, be a curve. Set
$\frac{D}{dt} := x'(t)\nabla$.
Then, by definition, the function $\psi = \psi(x)$ is parallel along the curve C iff
$\frac{D\psi}{dt}(x(t)) = 0$.
Explicitly, this means that
$x'(t)\psi'(x(t)) + i\kappa\,x'(t)A(x(t))\psi(x(t)) = 0$ on $[t_0, t_1]$.
In the special case where $A(x) \equiv 0$, we obtain
$\frac{d}{dt}\psi(x(t)) = x'(t)\psi'(x(t)) = 0$,
that is, $\psi$ is constant along the curve C.
The function A is called a connection. This interpretation of the field
A reflects an important relation between the gauge field theory of physi-
cists and modern differential geometry of mathematicians based on parallel
transport. This will be discussed in Remark 5 of Section 2.22.3.
Remark 7 (Gauge field theories in modern physics). In modern elementary
particle physics, gauge field theories are used in order to describe mathe-
matically the fundamental interactions in nature. In terms of our simple
model given earlier, the basic ideas are roughly the following:
(i) The functions $\psi$ and $\phi$ correspond to the basic particles P and the
antiparticles $\bar P$, respectively.
(ii) The postulate of gauge invariance forces the existence of a new field
A, which describes the interaction between the particles from (i).
For example, if $\psi$ and $\phi$ correspond to electrons and positrons,
respectively, then the gauge field A corresponds to the electromagnetic field,
which is related to the photon after quantization. Thus, the existence of
the photon (i.e., light) is a consequence of the postulate of local gauge
invariance.
In the so-called standard model of modern elementary particle physics,
there exist six quarks and six leptons (e.g., the electron and the neutrino)
along with the corresponding antiparticles. According to the principle of
local gauge invariance, twelve particles (gauge fields) describe the
interaction between quarks and leptons, namely, the photon, three vector bosons
($W^+$, $W^-$, $Z^0$), and eight gluons. The existence of the particles $W^\pm$, $Z^0$ was
theoretically predicted by gauge field theory. These particles were detected
experimentally in 1983 at CERN (Geneva, Switzerland).
The standard model corresponds to a gauge field theory with the gauge
group U(l) x SU(2) x SU(3). Both the quark model and SU(N)-gauge
field theories are based on the representations of Lie algebras. The basic
ideas will be explained in the next three sections.
As an introduction to modern elementary particle physics from the physi-
cal point of view, we recommend the textbook by Rolnick (1994). A detailed
study of the standard model can be found in the monograph by Donoghue
et al. (1992). We also recommend the textbook by Sterman (1993) as an
introduction both to quantum field theory and to the standard model from
the physical point of view.
2.21 Representations of Lie Algebras
In the next section we will show that representations of the Lie algebra
su(3) play a fundamental role in elementary particle physics.
Definition 1. Let L be a linear space over $\mathbb{K}$. Then, L is called a Lie
algebra over $\mathbb{K}$ iff there exists a product $[A, B]$ on L that has the following
properties:
(i) To each given ordered pair $(A, B)$ with $A, B \in L$, there is assigned
exactly one element of L, which we call the Lie bracket $[A, B]$.
(ii) For all $A, B, C \in L$ and $\alpha, \beta \in \mathbb{K}$,$^{14}$
$[A, B] = -[B, A]$ (anticommutativity),
$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$ (Jacobi's identity),
$[\alpha A + \beta B, C] = \alpha[A, C] + \beta[B, C]$ (distributive laws),
$[C, \alpha A + \beta B] = \alpha[C, A] + \beta[C, B]$.
In addition, a linear subspace M of the Lie algebra L is called a subalgebra
of L iff
$A, B \in M$ implies $[A, B] \in M$.
This is equivalent to saying that M is a Lie algebra with respect to the Lie
brackets of L.
A Lie algebra over $\mathbb{R}$ (resp., $\mathbb{C}$) is also called a real (resp., complex) Lie
algebra.
Standard Example 2 (Linear operators on Banach spaces). Let X be a
Banach space over $\mathbb{K}$. Set
$[A, B] := AB - BA$ for all $A, B \in L(X, X)$.
Then, the Banach space $L(X, X)$ of all linear continuous operators $A\colon X \to X$
is a Lie algebra over $\mathbb{K}$.
Proof. Note that $[A, B] = -[B, A]$ and
$[A, [B, C]] + [B, [C, A]] + [C, [A, B]]$
$= (ABC - ACB - BCA + CBA) + (BCA - BAC - CAB + ACB)$
$\quad + (CAB - CBA - ABC + BAC) = 0$. □
14 Jacobi's identity replaces the missing associative law for the Lie product
$[A, B]$.
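The bracket axioms for the commutator can be confirmed on concrete matrices. The following sketch uses three arbitrarily chosen 2 x 2 integer matrices (an assumption for illustration) so that all arithmetic is exact.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))


def jacobi_sum(A, B, C):
    """[A,[B,C]] + [B,[C,A]] + [C,[A,B]], which vanishes by Jacobi's identity."""
    return add(add(bracket(A, bracket(B, C)), bracket(B, bracket(C, A))),
               bracket(C, bracket(A, B)))


# Arbitrary sample operators on R^2.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
C = [[2, -3], [7, 1]]

print(jacobi_sum(A, B, C))
```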
Example 3 (The Lie algebra su(N)). Let X be an N-dimensional complex
Hilbert space, where $N = 1, 2, \ldots$. Then the following are true:
(a) The set su(N) of all traceless skew-adjoint linear operators $A\colon X \to X$
forms a real Lie algebra with respect to $[A, B] := AB - BA$.
(b) su(N) is a subalgebra of $L(X, X)$.
(c) $\dim su(N) = N^2 - 1$.
(d) su(N) is a real Hilbert space with respect to the inner product
$(A \mid B) := -\operatorname{tr}(AB)$.
Proof. Ad (a). Let $A, B \in su(N)$ and $\alpha, \beta \in \mathbb{R}$. Then,
$A^* = -A$, $B^* = -B$, and $\operatorname{tr} A = \operatorname{tr} B = 0$.
Hence
$(\alpha A + \beta B)^* = \alpha A^* + \beta B^* = -(\alpha A + \beta B)$,
$\operatorname{tr}(\alpha A + \beta B) = \alpha \operatorname{tr} A + \beta \operatorname{tr} B = 0$,
that is, $\alpha A + \beta B \in su(N)$. Furthermore, it follows from
$[A, B]^* = (AB - BA)^* = B^*A^* - A^*B^* = BA - AB = -[A, B]$
and $\operatorname{tr}[A, B] = \operatorname{tr}(AB) - \operatorname{tr}(BA) = 0$ that $[A, B] \in su(N)$.
Ad (c). First we want to show that there exists a one-to-one
correspondence
$A \leftrightarrow (a_{km})$
between the operators $A \in su(N)$ and the traceless skew-adjoint $(N \times N)$-matrices
$(a_{km})$, namely,
$a_{11} + \cdots + a_{NN} = 0$ and $a_{km} = -\overline{a_{mk}}$ for all $k, m = 1, \ldots, N$.
In fact, let $\{e_1, \ldots, e_N\}$ be an orthonormal basis of X. For $A \in su(N)$, set
$a_{km} := (e_k \mid Ae_m)$, $k, m = 1, \ldots, N$.
Then
$Ae_k = \sum_{m=1}^{N} a_{mk} e_m$, $k = 1, \ldots, N$. (103)
It follows from $\operatorname{tr} A = \sum_{k=1}^{N} (e_k \mid Ae_k) = 0$ that $a_{11} + \cdots + a_{NN} = 0$, and
$A^* = -A$ implies
$a_{km} = (e_k \mid Ae_m) = -(Ae_k \mid e_m) = -\overline{a_{mk}}$.
Thus, the matrix $(a_{km})$ is traceless and skew-adjoint. Conversely, the same
argument shows that each traceless skew-adjoint matrix $(a_{km})$ corresponds
to an operator $A \in su(N)$ given by (103).
Consider next the case $N = 3$. Each traceless skew-adjoint $(3 \times 3)$-matrix
can be written in the following form:
$\begin{pmatrix} i\alpha & a & b \\ -\bar a & i\beta & c \\ -\bar b & -\bar c & i\gamma \end{pmatrix}$,
where the real numbers $\alpha$, $\beta$, and $\gamma$ satisfy the trace relation $\alpha + \beta + \gamma = 0$.
Since a, b, and c are complex numbers, such a matrix depends on $2 + 2(2 + 1) = 8$
real parameters. Thus,
$\dim su(3) = 8$.
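Define the following matrices,
$D_1 := \begin{pmatrix} i & 0 & 0 \\ 0 & -i & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $D_2 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -i \end{pmatrix}$,
$A_{12} := \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $A_{13} := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}$, $A_{23} := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}$, (104)
$B_{12} := \begin{pmatrix} 0 & i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $B_{13} := \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}$, $B_{23} := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & i \\ 0 & i & 0 \end{pmatrix}$.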
Then, each traceless skew-adjoint $(3 \times 3)$-matrix can be written as a real
linear combination of the eight matrices from (104). Thus, they form a
basis of the linear space su(3).
In the general case, where $N = 1, 2, \ldots$, the same argument yields
$\dim su(N) = (N - 1) + 2(1 + 2 + \cdots + (N - 1))$
$\quad = N - 1 + (N - 1)N = N^2 - 1$.
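The dimension count for su(3) can be illustrated by building eight concrete matrices and checking their properties. The basis used in the sketch below is one possible choice made for this illustration (two diagonal matrices plus real antisymmetric and imaginary symmetric off-diagonal matrices); it is picked so that the matrices are mutually orthogonal with respect to $(A \mid B) = -\operatorname{tr}(AB)$ and is not claimed to coincide with the basis (104).

```python
def zeros():
    return [[0j] * 3 for _ in range(3)]


def diag(a, b, c):
    m = zeros()
    m[0][0], m[1][1], m[2][2] = a, b, c
    return m


def off_diagonal(j, k, upper, lower):
    """Matrix with entry `upper` at (j, k) and `lower` at (k, j)."""
    m = zeros()
    m[j][k], m[k][j] = upper, lower
    return m


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(A):
    return A[0][0] + A[1][1] + A[2][2]


def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[k][j].conjugate() for k in range(3)] for j in range(3)]


def is_skew_adjoint(A):
    return adjoint(A) == [[-x for x in row] for row in A]


def inner(A, B):
    """Inner product (A | B) = -tr(AB) on su(3)."""
    return -trace(matmul(A, B))


# One possible basis of su(3) (an illustrative choice).
basis = [diag(1j, -1j, 0), diag(1j, 1j, -2j)]
for j, k in ((0, 1), (0, 2), (1, 2)):
    basis.append(off_diagonal(j, k, 1, -1))    # real antisymmetric part
    basis.append(off_diagonal(j, k, 1j, 1j))   # imaginary symmetric part

gram = [[inner(P, Q) for Q in basis] for P in basis]
```

A diagonal Gram matrix with positive diagonal entries means the eight matrices are linearly independent over the reals, which matches $\dim su(3) = 3^2 - 1 = 8$.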
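Ad (d). Let $A, B \in su(N)$. Then, $(iA)^* = iA$ (i.e., the operator $iA$ is
self-adjoint). Hence
$\operatorname{tr}(iA)^2 = \sum_{j=1}^{N} (e_j \mid (iA)^2 e_j) = \sum_{j=1}^{N} (iAe_j \mid iAe_j) = \sum_{j=1}^{N} \lambda_j^2$,
where $\{e_j\}$ is an orthonormal basis of eigenvectors of $iA$ with $iAe_j = \lambda_j e_j$ for all j. Thus,
$(A \mid A) = -\operatorname{tr} A^2 = \operatorname{tr}(iA)^2$.
Hence $(A \mid A) \ge 0$ for all $A \in su(N)$, and $(A \mid A) = 0$ implies $\lambda_j = 0$ for
all j, that is, $A = 0$.
Furthermore, for all $A, B, C \in su(N)$ and $\alpha, \beta \in \mathbb{R}$, we get
$(A \mid B) = -\operatorname{tr}(AB) = -\operatorname{tr}(BA) = (B \mid A)$,
and
$(\alpha A + \beta B \mid C) = -\operatorname{tr}(\alpha AC + \beta BC) = -\alpha \operatorname{tr}(AC) - \beta \operatorname{tr}(BC)$
$\quad = \alpha(A \mid C) + \beta(B \mid C)$. □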
Definition 4. Let L be a Lie algebra over $\mathbb{K}$. By a representation of L on
the Banach space X over $\mathbb{K}$, we understand a linear map
$\phi\colon L \to L(X, X)$, (105)
which respects the Lie brackets, that is,
$\phi([A, B]) = [\phi(A), \phi(B)]$ for all $A, B \in L$.
The representation $\phi$ is called irreducible iff there is no nontrivial
invariant linear subspace Y of X, meaning that there is no linear subspace Y of
X such that $Y \ne \{0\}$ and $Y \ne X$ as well as $\phi(A)(Y) \subseteq Y$ for all $A \in L$.
Example 5 (The dual representation). Consider the representation $\phi$ from
(105). Define
$\phi_D(A) := -\phi(A)^T$ for all $A \in L$.
Then, $\phi_D\colon L \to L(X^*, X^*)$ is a representation of L on the dual space $X^*$
called the dual representation of $\phi$ on $X^*$.
Set $\mathcal{A} := \phi(A)$. Observe that the dual operator $\mathcal{A}^T\colon X^* \to X^*$ to $\mathcal{A}\colon X \to X$
is defined by
$(\mathcal{A}^T u^*)(u) := u^*(\mathcal{A}u)$ for all $u \in X$, $u^* \in X^*$.
Proof. We will show in Section 3.10 that $\mathcal{A} \in L(X, X)$ implies $\mathcal{A}^T \in L(X^*, X^*)$ and
$(\alpha\mathcal{A} + \beta\mathcal{B})^T = \alpha\mathcal{A}^T + \beta\mathcal{B}^T$ and $(\mathcal{A}\mathcal{B})^T = \mathcal{B}^T\mathcal{A}^T$
for all $\mathcal{A}, \mathcal{B} \in L(X, X)$ and $\alpha, \beta \in \mathbb{K}$. Hence
$[\phi_D(A), \phi_D(B)] = \phi(A)^T\phi(B)^T - \phi(B)^T\phi(A)^T$
$\quad = -(\phi(A)\phi(B) - \phi(B)\phi(A))^T = -[\phi(A), \phi(B)]^T$
$\quad = -\phi([A, B])^T = \phi_D([A, B])$. □
In what follows we will use tensor products, which have been discussed
in detail in Problems 3.8ff of AMS Vol. 108.
Example 6 (Tensor representations). Let $\{e_1, \ldots, e_N\}$ and $\{f_1, \ldots, f_M\}$
be a basis of the linear space X and Y over $\mathbb{K}$, respectively. Suppose that
$\phi\colon L \to L(X, X)$ and $\psi\colon L \to L(Y, Y)$ are representations of the Lie algebra
L over $\mathbb{K}$. For $A \in L$ set
$\phi_T(A)(e_j \otimes f_k) := \phi(A)e_j \otimes f_k + e_j \otimes \psi(A)f_k$
and
$\phi_T(A)\sum_{j,k} a_{jk}\, e_j \otimes f_k := \sum_{j,k} a_{jk}\, \phi_T(A)(e_j \otimes f_k)$ for all $a_{jk} \in \mathbb{K}$.
Then $\phi_T\colon L \to L(X \otimes Y, X \otimes Y)$ is a representation of L on the tensor
product $X \otimes Y$, which is called the tensor product of $\phi$ with $\psi$. We also
write $\phi_T := \phi \otimes \psi$.
Proof. One checks easily that $\phi_T$ is linear and that
$(\phi_T(A)\phi_T(B) - \phi_T(B)\phi_T(A))(e_j \otimes f_k) = (\phi(A)\phi(B) - \phi(B)\phi(A))e_j \otimes f_k$
$\quad + e_j \otimes (\psi(A)\psi(B) - \psi(B)\psi(A))f_k$.
Since $\phi([A, B]) = \phi(A)\phi(B) - \phi(B)\phi(A)$ and $\psi([A, B]) = \psi(A)\psi(B) - \psi(B)\psi(A)$, we get
$[\phi_T(A), \phi_T(B)] = \phi_T([A, B])$ for all $A, B \in L$. □
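Both the dual representation and the tensor product representation can be tested on small matrices. In the sketch below, the following choices are assumptions made for illustration: L is the Lie algebra of all real 2 x 2 matrices, $\phi$ is the identity representation, $\psi$ is its dual $A \mapsto -A^T$, and operators on the tensor product are written as Kronecker products with respect to the basis $e_j \otimes f_k$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def bracket(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))


def kron(A, B):
    """Kronecker product: rows indexed by (i, k), columns by (j, l)."""
    p, q = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(p) for l in range(q)]
            for i in range(p) for k in range(q)]


IDENTITY = [[1, 0], [0, 1]]


def phi(A):
    """Identity representation on X = R^2."""
    return A


def psi(A):
    """Dual representation A -> -A^T."""
    return scale(-1, transpose(A))


def phi_T(A):
    """Tensor product representation: phi(A) x I + I x psi(A)."""
    return add(kron(phi(A), IDENTITY), kron(IDENTITY, psi(A)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
print(bracket(phi_T(A), phi_T(B)) == phi_T(bracket(A, B)))
```

The check succeeds because the two summands of `phi_T(A)` act on different tensor factors and therefore commute with each other.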
Example 7 (The adjoint representation). Let L be a subalgebra of the Lie
algebra $L(X, X)$, where X is a Banach space over $\mathbb{K}$. Set
$\phi(A)C := [A, C]$ for all $A \in L$, $C \in L(X, X)$.
Then $\phi$ is a representation of L on $L(X, X)$, which is called the adjoint
representation of L.
Proof. Recall that $[A, C] = AC - CA$. Hence $\|\phi(A)C\| \le 2\|A\|\,\|C\|$. Thus,
the operator $\phi(A)\colon L(X, X) \to L(X, X)$ is linear and continuous, that is,
$\phi(A) \in L(Y, Y)$, where $Y := L(X, X)$. It follows from
$\phi(\alpha A + \beta B)C = [\alpha A + \beta B, C] = \alpha[A, C] + \beta[B, C] = (\alpha\phi(A) + \beta\phi(B))C$
for all $A, B \in L$ and $\alpha, \beta \in \mathbb{K}$ that $\phi\colon L \to L(Y, Y)$ is linear.
Furthermore, we get
$(\phi(A)\phi(B) - \phi(B)\phi(A))C = [A, [B, C]] - [B, [A, C]] = [[A, B], C]$
$\quad = \phi([A, B])C$,
by the Jacobi identity. □
FIGURE 2.5. (a) Baryons (B = 1). (b) Weight diagram. (Both panels plot the hypercharge Y against $T_3$.)
2.22 Applications to Elementary Particles
Physicists introduce quantum numbers in order to classify elementary
particles and to understand scattering processes in particle accelerators. For
example, an important role is played by the isospin T, the third component
$T_3$ of the isospin, and the hypercharge Y. The electric charge Q of an
elementary particle is given by
$Q = |e|\,(T_3 + \tfrac{1}{2}Y)$ (Gell-Mann-Nishijima formula),
where e = electric charge of the electron. Moreover,
$T_3 = T, T - 1, T - 2, \ldots, -T$,
and $T = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$. Figure 2.5(a) displays eight baryons which behave
similarly. In particular, Figure 2.5(a) tells us that
$T_3 = \tfrac{1}{2}$, $Y = 1$ for the proton p
and
$T_3 = -\tfrac{1}{2}$, $Y = 1$ for the neutron n.
Particles possess the same isospin T if they lie on the same vertical line
in Figure 2.5(a). For example, we get $T = \tfrac{1}{2}$ for both the proton and the
neutron (cf. Table 2.1). In this section we want to show that the diagram
from Figure 2.5(a) corresponds to the weight diagram of a representation of
the Lie algebra su(3). As we will see later, abstract mathematical arguments
lead us to the following fundamental hypothesis:
Both the proton and the neutron consist of three quarks.
This was formulated by Gell-Mann and Zweig in 1964.$^{15}$
15 Murray Gell-Mann was awarded the Nobel Prize in 1969.
TABLE 2.1. Quantum numbers of nucleons
                                      Proton     Neutron
Mixed state of three quarks           u, u, d    d, d, u
Isospin T                             1/2        1/2
Third component of isospin T_3        1/2        -1/2
Hypercharge Y                         1          1
Q/|e| (Q = electric charge)           1          0
Baryon number B                       1          1
Strangeness S                         0          0
2.22.1 Baryons and Quarks
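Let X be a three-dimensional complex Hilbert space with the orthonormal
basis $\{e_1, e_2, e_3\}$. Set
$e_{jkm} := e_j \otimes e_k \otimes e_m$.
Recall that the tensor product $X \otimes X \otimes X$ consists of all linear combinations
$\sum_{j,k,m=1}^{3} \alpha_{jkm} e_{jkm}$,
where $\alpha_{jkm} \in \mathbb{C}$ for all j, k, m. Moreover, $X \otimes X \otimes X$ is a complex Hilbert
space with the orthonormal basis $\{e_{jkm}\colon j, k, m = 1, 2, 3\}$. Let $A\colon X \to X$
be a linear operator, namely, $A \in L(X, X)$. We write
$A \mathrel{\hat=} (a_{jk})$, $j, k = 1, 2, 3$, iff $Ae_k = \sum_{j=1}^{3} a_{jk} e_j$ for all k.
In particular, let us introduce the four linear operators $\mathcal{Y}, \mathcal{T}_3, \mathcal{B}, \mathcal{S}\colon X \to X$
through
$\mathcal{Y} \mathrel{\hat=} \begin{pmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \\ 0 & 0 & -\tfrac{2}{3} \end{pmatrix}$, $\mathcal{T}_3 \mathrel{\hat=} \begin{pmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & -\tfrac{1}{2} & 0 \\ 0 & 0 & 0 \end{pmatrix}$ (106)
and
$\mathcal{B} \mathrel{\hat=} \begin{pmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \\ 0 & 0 & \tfrac{1}{3} \end{pmatrix}$, $\mathcal{S} \mathrel{\hat=} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}$.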
TABLE 2.2. Quantum numbers of quarks
Quantum number                   Quarks                          Antiquarks
                                 u = e_1   d = e_2   s = e_3     ū = e_1*   d̄ = e_2*   s̄ = e_3*
Isospin T                        1/2       1/2       0           1/2        1/2        0
Third component of isospin T_3   1/2       -1/2      0           -1/2       1/2        0
Hypercharge Y                    1/3       1/3       -2/3        -1/3       -1/3       2/3
Baryon number B                  1/3       1/3       1/3         -1/3       -1/3       -1/3
Strangeness S                    0         0         -1          0          0          1
Q/|e|                            2/3       -1/3      -1/3        -2/3       1/3        1/3
(Q = electric charge, e = charge of the electron)
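The charge entries of Tables 2.1 and 2.2 can be cross-checked against the Gell-Mann-Nishijima formula. The short sketch below takes the values of $T_3$ and Y for the three quarks from Table 2.2 and assumes, in line with the additivity of eigenvalues for composite states, that $T_3$ and Y of a three-quark state are the sums over its constituents; the function and dictionary names are illustrative.

```python
from fractions import Fraction as F

# (T_3, Y) for the quarks u, d, s as listed in Table 2.2.
QUARKS = {
    "u": (F(1, 2), F(1, 3)),
    "d": (F(-1, 2), F(1, 3)),
    "s": (F(0), F(-2, 3)),
}


def charge(t3, y):
    """Q/|e| = T_3 + Y/2 (Gell-Mann-Nishijima formula)."""
    return t3 + y / 2


def composite(names):
    """Add T_3 and Y over the quark content, e.g. 'uud' for the proton."""
    t3 = sum(QUARKS[q][0] for q in names)
    y = sum(QUARKS[q][1] for q in names)
    return t3, y


for name, (t3, y) in QUARKS.items():
    print(name, charge(t3, y))

proton = composite("uud")
neutron = composite("ddu")
```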
Note that $i\mathcal{Y}, i\mathcal{T}_3 \in su(3)$, whereas $i\mathcal{B}$ and $i\mathcal{S}$ are skew-adjoint but not traceless. For all $A \in L(X, X)$, define
$\phi(A)e_{jkm} := Ae_j \otimes e_k \otimes e_m + e_j \otimes Ae_k \otimes e_m + e_j \otimes e_k \otimes Ae_m$
and
$\phi(A)\sum \alpha_{jkm} e_{jkm} := \sum \alpha_{jkm}\,\phi(A)e_{jkm}$. (107)
Then, the operator $\phi(A)\colon X \otimes X \otimes X \to X \otimes X \otimes X$ is linear.
Furthermore, set
$s_{jkm} := S(123)e_{jkm}$, $a_{jkm} := A(123)e_{jkm}$, $u_{jkm} := A(13)S(12)e_{jkm}$,
$v_{jkm} := A(12)S(13)e_{jkm}$.
Here, $S(123)$ (resp., $A(123)$) means symmetrization (resp., antisymmetrization)
with respect to all three indices. Similarly, $S(12)$ means symmetrization
with respect to the first and second indices, and so forth. Explicitly,
we get
$s_{jkm} = e_{jkm} + e_{jmk} + e_{kmj} + e_{kjm} + e_{mjk} + e_{mkj}$,
$a_{jkm} = e_{jkm} - e_{jmk} + e_{kmj} - e_{kjm} + e_{mjk} - e_{mkj}$,
as well as
$u_{jkm} = A(13)(e_{jkm} + e_{kjm}) = e_{jkm} - e_{mkj} + e_{kjm} - e_{mjk}$,
$v_{jkm} = A(12)(e_{jkm} + e_{mkj}) = e_{jkm} - e_{kjm} + e_{mkj} - e_{kmj}$.
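The explicit formulas for $s_{jkm}$, $a_{jkm}$, $u_{jkm}$, and $v_{jkm}$ lend themselves to a direct computation. In the sketch below, a tensor is stored as a dictionary from index triples to coefficients (a representation chosen for this illustration). The code evaluates $u_{112}$ and $v_{121}$ and checks that every basis vector can be recovered as $e_{jkm} = \tfrac{1}{6}(s_{jkm} + a_{jkm}) + \tfrac{1}{3}(u_{jkm} + v_{jkm})$, an identity that follows from the four formulas by adding them up.

```python
from fractions import Fraction
from itertools import product


def combine(terms):
    """Sum (coefficient, index triple) pairs into a dict, dropping zero entries."""
    out = {}
    for coeff, idx in terms:
        out[idx] = out.get(idx, 0) + coeff
    return {idx: c for idx, c in out.items() if c != 0}


def s(j, k, m):
    return combine([(1, (j, k, m)), (1, (j, m, k)), (1, (k, m, j)),
                    (1, (k, j, m)), (1, (m, j, k)), (1, (m, k, j))])


def a(j, k, m):
    return combine([(1, (j, k, m)), (-1, (j, m, k)), (1, (k, m, j)),
                    (-1, (k, j, m)), (1, (m, j, k)), (-1, (m, k, j))])


def u(j, k, m):
    return combine([(1, (j, k, m)), (-1, (m, k, j)),
                    (1, (k, j, m)), (-1, (m, j, k))])


def v(j, k, m):
    return combine([(1, (j, k, m)), (-1, (k, j, m)),
                    (1, (m, k, j)), (-1, (k, m, j))])


def decomposition(j, k, m):
    """(s + a)/6 + (u + v)/3, which should reproduce the basis vector e_jkm."""
    terms = []
    for vec, weight in ((s(j, k, m), Fraction(1, 6)), (a(j, k, m), Fraction(1, 6)),
                        (u(j, k, m), Fraction(1, 3)), (v(j, k, m), Fraction(1, 3))):
        terms += [(weight * c, idx) for idx, c in vec.items()]
    return combine(terms)


print(u(1, 1, 2))   # coefficients of u_112
print(v(1, 2, 1))   # coefficients of v_121
```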
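Finally, define the following linear subspaces of the tensor product $X \otimes X \otimes X$:
$L_1 := \operatorname{span}\{s_{jkm}\}$, $L_2 := \operatorname{span}\{a_{jkm}\}$,
$L_3 := \operatorname{span}\{u_{jkm}\}$, $L_4 := \operatorname{span}\{v_{jkm}\}$.
By (106),
$\mathcal{Y}e_j = \lambda_j e_j$, $\mathcal{T}_3 e_j = \mu_j e_j$, $\mathcal{B}e_j = \tfrac{1}{3}e_j$, $\mathcal{S}e_j = \nu_j e_j$, (108)
where $\lambda_1 = \lambda_2 = \tfrac{1}{3}$,
$\lambda_3 = -\tfrac{2}{3}$, $\mu_1 = \tfrac{1}{2}$, $\mu_2 = -\tfrac{1}{2}$, $\mu_3 = 0$, and $\nu_1 = \nu_2 = 0$,
$\nu_3 = -1$. By (107),
$\phi(\mathcal{Y})e_{jkm} = (\lambda_j + \lambda_k + \lambda_m)e_{jkm}$, $\phi(\mathcal{T}_3)e_{jkm} = (\mu_j + \mu_k + \mu_m)e_{jkm}$,
$\phi(\mathcal{B})e_{jkm} = e_{jkm}$, $\phi(\mathcal{S})e_{jkm} = (\nu_j + \nu_k + \nu_m)e_{jkm}$, (109)
for all j, k, m. The same is true if we replace $e_{jkm}$ with $s_{jkm}$, $a_{jkm}$, $u_{jkm}$,
or $v_{jkm}$.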
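Proposition 1. The following statements hold true.$^{16}$
(i) $X \otimes X \otimes X = L_1 \oplus L_2 \oplus L_3 \oplus L_4$.
(ii) Each space $L_j$ is invariant under the representation $\phi$ of the Lie
algebra su(3) on $X \otimes X \otimes X$. Here, $\phi$ is defined through (107).
(iii) $\dim L_1 = 10$, and a basis of $L_1$ is given by
$s_{111}, s_{112}, s_{113}, s_{122}, s_{123}, s_{133}, s_{222}, s_{223}, s_{233}, s_{333}$. (110)
By (109), these basis vectors are common eigenvectors of the operators
$\phi(\mathcal{Y})$ and $\phi(\mathcal{T}_3)$ with the eigenvalues Y and $T_3$, respectively, which are
pictured in Figure 2.6(b). For example,
$\phi(\mathcal{Y})s_{112} = Y s_{112}$, $\phi(\mathcal{T}_3)s_{112} = T_3 s_{112}$,
where $Y = \lambda_1 + \lambda_1 + \lambda_2 = 1$ and $T_3 = \mu_1 + \mu_1 + \mu_2 = \tfrac{1}{2}$. Figure
2.6(b) is called the weight diagram of the representation of su(3) on
the space $L_1$.
16 The direct sum (i) says that, for each $u \in X \otimes X \otimes X$, there exists a unique
decomposition
$u = u_1 + u_2 + u_3 + u_4$,
where $u_j \in L_j$ for all j.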
FIGURE 2.6. (a) Baryon resonances (B = 1). (b) Weight diagram. (Both panels plot the hypercharge Y against $T_3$.)
(iv) $\dim L_2 = 1$, and $a_{123}$ is a basis vector of $L_2$ with
$\phi(\mathcal{Y})a_{123} = \phi(\mathcal{T}_3)a_{123} = 0$
by (109).
(v) $\dim L_3 = 8$, and a basis of $L_3$ is given by
$b_6 := u_{221}$, $b_7 := u_{123}$, $b_8 := u_{132}$.
The corresponding weight diagram is pictured in Figure 2.5(b).
(vi) $\dim L_4 = 8$, and a basis of $L_4$ is given by
$b_6 := v_{212}$, $b_7 := v_{132}$, $b_8 := v_{123}$.
The corresponding weight diagram is identical to the weight diagram
of the representation of su(3) on $L_3$ (cf. Figure 2.5(b)).
Proof. Ad (i). Use the decomposition
$e_{jkm} = \tfrac{1}{6}(s_{jkm} + a_{jkm}) + \tfrac{1}{3}(u_{jkm} + v_{jkm})$.
Hence $X \otimes X \otimes X = L_1 + L_2 + L_3 + L_4$. As we will show, $\dim L_1 = 10$,
$\dim L_2 = 1$, and $\dim L_3 = \dim L_4 = 8$. Since
$\dim X \otimes X \otimes X = 27$ and $\dim L_1 + \dim L_2 + \dim L_3 + \dim L_4 = 27$,
the sum is direct.
Ad (ii). Observe that $\phi(A)$ commutes with symmetrization or
antisymmetrization. For example,
$\phi(A)S(123)e_{jkm} = S(123)\phi(A)e_{jkm}$.
Hence $u \in L_1$ implies $\phi(A)u \in L_1$. The same argument applies to $L_2$, $L_3$,
and $L_4$.
Ad (iii). Since $s_{jkm}$ is symmetric with respect to all indices, we get
$s_{123} = s_{132} = s_{213} = s_{231} = s_{312} = s_{321}$, and so on.
Moreover, observe that $\{e_{jkm}\}$ forms an orthonormal system. Hence the
vectors from (110) form an orthogonal system. That is, they are linearly
independent.
Ad (v). Observe that
$u_{jkm} = u_{kjm}$ and $u_{jkm} + u_{jmk} = -u_{kmj}$.
This implies that each $u_{jkm}$ is a linear combination of $b_1, \ldots, b_8$. Moreover,
an elementary computation shows that $b_1, \ldots, b_8$ are linearly independent.
For example, it follows from
that $\alpha e_{112} + \beta e_{113} + \gamma e_{331} = 0$, and hence $\alpha = \beta = \gamma = 0$.
Ad (iv), (vi). Use similar arguments as previously. □
One can show that the representations of su(3) on $L_1$, $L_2$, $L_3$, and $L_4$
are irreducible (cf. Cornwell (1989), Vol. 2, pp. 636ff).
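The weights of the representation on $L_1$ follow directly from (109): the vector $s_{jkm}$ has hypercharge $\lambda_j + \lambda_k + \lambda_m$ and third isospin component $\mu_j + \mu_k + \mu_m$. The sketch below enumerates the ten index triples with $j \le k \le m$ and tabulates these sums, using the values of $\lambda_j$ and $\mu_j$ given with (108); the variable names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations_with_replacement

# Eigenvalues of the single-quark operators Y and T_3 on e_1, e_2, e_3.
LAMBDA = {1: F(1, 3), 2: F(1, 3), 3: F(-2, 3)}
MU = {1: F(1, 2), 2: F(-1, 2), 3: F(0)}


def weight(indices):
    """(Y, T_3) of a three-quark basis vector with the given indices."""
    return (sum(LAMBDA[i] for i in indices), sum(MU[i] for i in indices))


# One basis vector s_jkm for each multiset {j, k, m}.
decuplet = {idx: weight(idx) for idx in combinations_with_replacement((1, 2, 3), 3)}

for idx, (y, t3) in decuplet.items():
    print("s_%d%d%d" % idx, "Y =", y, "T3 =", t3)

# Dimension count behind the direct sum in Proposition 1(i).
total_dimension = len(decuplet) + 1 + 8 + 8
```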
Remark 2 (Physical interpretation). Set
$e_1$ = state of the u-quark,
$e_2$ = state of the d-quark,
$e_3$ = state of the s-quark
and
$e_{jkm} = e_j \otimes e_k \otimes e_m$
= composite state of the three quark states $e_j$, $e_k$, $e_m$.
Comparing Figure 2.5(a) with Figure 2.5(b), it turns out that
The weight diagram of the representation of the Lie algebra su(3) on $L_3$
is identical to the quantum number diagram of physicists.
Since each vector $b_1, \ldots, b_8$ is a linear combination of $e_{jkm}$, we say that
The elementary particles from Figure 2.5(a) are mixed states consisting
of three quarks.
For example,
proton $p \mathrel{\hat=} b_1 = u_{112} = 2(e_1 \otimes e_1 \otimes e_2 - e_2 \otimes e_1 \otimes e_1)$.
This leads us to the following interpretation:
The proton consists of two u-quarks and one d-quark.
In quantum physics, one uses normalized states. Therefore, we have to
replace $u_{112}$ with
$u_{112}^{(\mathrm{norm})} = \frac{1}{\sqrt{2}}(e_1 \otimes e_1 \otimes e_2 - e_2 \otimes e_1 \otimes e_1)$.
Then, $(u_{112}^{(\mathrm{norm})} \mid u_{112}^{(\mathrm{norm})}) = 1$. Observe that $\{e_j \otimes e_k \otimes e_m\}$ forms an
orthonormal system.$^{17}$
The quantum numbers Y, $T_3$, B, and S of quarks are given by the
eigenvalues of the operators $\mathcal{Y}$, $\mathcal{T}_3$, $\mathcal{B}$, and $\mathcal{S}$, respectively. For example, by (108),
$\mathcal{Y}e_1 = \tfrac{1}{3}e_1$, $\mathcal{T}_3 e_1 = \tfrac{1}{2}e_1$, $\mathcal{B}e_1 = \tfrac{1}{3}e_1$, $\mathcal{S}e_1 = 0$.
Hence the u-quark $e_1$ has the quantum numbers $Y = \tfrac{1}{3}$, $T_3 = \tfrac{1}{2}$, $B = \tfrac{1}{3}$,
and $S = 0$. This corresponds to Table 2.2.
Furthermore, the quantum numbers of composite particles correspond to
the eigenvalues of the representations $\phi(\mathcal{Y})$, $\phi(\mathcal{T}_3)$, and so on. For example,
if we note that $b_1 = u_{112}$, it follows from (109) that
$\phi(\mathcal{Y})b_1 = b_1$, $\phi(\mathcal{T}_3)b_1 = \tfrac{1}{2}b_1$, $\phi(\mathcal{B})b_1 = b_1$, $\phi(\mathcal{S})b_1 = 0$.
Thus, the quantum numbers of the proton are given by $Y = 1$, $T_3 = \tfrac{1}{2}$,
$B = 1$, and $S = 0$. This corresponds to Table 2.1.
Furthermore, the weight diagram of the representation of su(3) on $L_1$
(cf. Figure 2.6(b)) corresponds to the baryon multiplet from Figure 2.6(a).
Using the su(3)-symmetry of this diagram, Gell-Mann and Ne'eman
predicted the existence of the particle $\Omega^-$ in 1961. This particle was discovered
shortly after its prediction.
17 If we use the representation of su(3) on $L_4$, then we get the same weight
diagram as for $L_3$. In this case, the proton p corresponds to the state vector
$b_1 = v_{121} = 2(e_1 \otimes e_2 \otimes e_1 - e_2 \otimes e_1 \otimes e_1)$.
This differs from the state vector $u_{112}$. Thus, the state vectors of elementary
particles are not uniquely determined. We need a fixed convention. However,
note that the possible proton states $u_{112}$ and $v_{121}$ correspond to a mixed state
consisting of two u-quarks and one d-quark; the essential quark contents are the
same for $u_{112}$ and $v_{121}$.
2.22.2 Mesons and Pairs of Quarks and Antiquarks
We want to show that the dual representation of su(3) is related to
antiquarks and that the representation of su(3) on $X \otimes X^*$ corresponds to
mesons (pairs of quarks and antiquarks).
To begin with, define a basis $\{e_1^*, e_2^*, e_3^*\}$ of the dual space $X^*$ by letting
$e_j^*(\alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3) := \alpha_j$, $j = 1, 2, 3$, for all $\alpha_k \in \mathbb{C}$.
The tensor product $X \otimes X^*$ consists of all linear combinations
$\sum_{k,m=1}^{3} \alpha_{km} e_{km}$, where $e_{km} := e_k \otimes e_m^*$ and $\alpha_{km} \in \mathbb{C}$ for all k, m.
Let $A\colon X \to X$ be a linear operator such that
$Ae_k = \sum_{m=1}^{3} a_{mk} e_m$, $k = 1, 2, 3$, with the matrix $A = (a_{km})$.
Set
$\phi_D(A) := -A^T$.
Then, $\phi_D\colon L(X, X) \to L(X^*, X^*)$ is the dual representation of the Lie
algebra $L(X, X)$ on $X^*$. Explicitly, for $j = 1, 2, 3$,
$-A^T e_k^*(e_j) = -e_k^*(Ae_j) = -\sum_{m=1}^{3} a_{mj}\, e_k^*(e_m) = -a_{kj}$
$\quad = -\sum_{m=1}^{3} a_{km}\, e_m^*(e_j)$,
where $A^T$ denotes the transposed matrix (i.e., $A^T_{km} = A_{mk}$). For example,
Y~O
0
1
3
0 JJ 0 T,~
0
0
1
-'2
n
n,
implies
¢D(Y) ~ (-~l
0
-3
0
1 ¢n(T,) ~ (-~l
n 0
1
'2
0
120 2. Variational Principles and Weak Convergence
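Remark 1 (Physical interpretation). We regard $e_1^*$, $e_2^*$, and $e_3^*$ as the state vectors of the antiquarks $\bar u$, $\bar d$, and $\bar s$, respectively. Then, for $j = 1,2,3$,
$$\phi_D(\mathcal{Y})e_j^* = -\lambda_je_j^*, \quad \phi_D(\mathcal{T}_3)e_j^* = -\mu_je_j^*,$$
where $\lambda_1 = \lambda_2 = \tfrac13$, $\lambda_3 = -\tfrac23$, and $\mu_1 = -\mu_2 = \tfrac12$, $\mu_3 = 0$. We say that $e_j^*$ corresponds to a quark state with the quantum numbers $Y = -\lambda_j$ and $T_3 = -\mu_j$ (cf. Table 2.2).
For the linear operator $A: X \to X$ define
$$\phi(A)(e_k \otimes e_m^*) := Ae_k \otimes e_m^* + e_k \otimes (-A^Te_m^*).$$
Finally, set
$$\sigma := e_{11} + e_{22} + e_{33}, \quad t_{km} := 3e_{km} - \delta_{km}\sigma,$$
and
$$M_1 := \mathrm{span}\{\sigma\}, \quad M_2 := \mathrm{span}\{t_{km}: k, m = 1,2,3\}.$$
Recall that $e_{km} := e_k \otimes e_m^*$. Obviously,
$$\phi(\mathcal{Y})e_{km} = (\lambda_k - \lambda_m)e_{km}, \quad \phi(\mathcal{T}_3)e_{km} = (\mu_k - \mu_m)e_{km}.$$
This implies
$$\phi(\mathcal{Y})\sigma = \phi(\mathcal{T}_3)\sigma = 0, \quad \phi(\mathcal{Y})t_{km} = (\lambda_k - \lambda_m)t_{km}, \quad \phi(\mathcal{T}_3)t_{km} = (\mu_k - \mu_m)t_{km}. \qquad (112)$$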
Proposition 2. The following are true:
(i) $X \otimes X^* = M_1 \oplus M_2$.
(ii) $M_1$ and $M_2$ are invariant under the representation of $su(3)$ on $X \otimes X^*$.
(iii) $\dim M_1 = 1$.
(iv) $\dim M_2 = 8$, and a basis of $M_2$ is given by
The proof will be given ahead. It can be shown that the representations of $su(3)$ on $M_1$ and $M_2$ are irreducible (cf. Cornwell (1989), Vol. 2, pp. 636ff).
Remark 3 (Physical interpretation). We regard
$$e_{km} = e_k \otimes e_m^*$$
as the state vector of a composite state consisting of the quark $e_k$ and the antiquark $e_m^*$. Observe that $b_1, \ldots, b_8$ are eigenvectors of $\phi(\mathcal{Y})$ and $\phi(\mathcal{T}_3)$ with the eigenvalues $Y$ and $T_3$, respectively. The values of the quantum numbers $Y$ and $T_3$ are pictured in Figure 2.7(b). This weight diagram of the representation of $su(3)$ on $M_2$ corresponds to the quantum number diagram of physicists for mesons (cf. Figure 2.7(a)). Therefore, physicists assume that the mesons from Figure 2.7(a) correspond to the states $b_1, \ldots, b_8$ according to Figure 2.7(b). For example, the pion $\pi^+$ is described by the state vector
$$e_{12} = e_1 \otimes e_2^*;$$
in other words, $\pi^+$ consists of one u-quark $e_1$ and one d-antiquark $e_2^*$. The basis vector $\sigma$ of $M_1$ corresponds to the meson $\eta'$.
FIGURE 2.7. (a) Mesons ($B = 0$); (b) weight diagram in the $(T_3, Y)$-plane.
Proof of Proposition 2. Ad (i). For all $k, m$,
$$e_{km} = \tfrac13(t_{km} + \delta_{km}\sigma).$$
Furthermore, note that $\dim(X \otimes X^*) = 9$, and $\dim M_1 + \dim M_2 = 1 + 8$, by the proof of (iv) ahead.
Ad (ii). Let $A \in su(3)$. Then
$$\phi(A)\Big(\sum_{k=1}^{3} e_{kk}\Big) = \sum_{k=1}^{3} (Ae_k \otimes e_k^* - e_k \otimes A^Te_k^*) = \sum_{k,m=1}^{3} (a_{mk}e_m \otimes e_k^* - a_{km}e_k \otimes e_m^*) = 0.$$
Hence $\phi(A)\sigma = 0$, i.e., $M_1$ is invariant under $su(3)$.
Furthermore, we get
$$\phi(A)t_{km} = 3\phi(A)e_{km} = 3Ae_k \otimes e_m^* - 3e_k \otimes A^Te_m^* = 3\sum_{j=1}^{3} (a_{jk}e_{jm} - a_{mj}e_{kj})$$
$$= \sum_{j=1}^{3} \big(a_{jk}(3e_{jm} - \sigma\delta_{jm}) - a_{mj}(3e_{kj} - \sigma\delta_{kj})\big) \in M_2.$$
Thus, $M_2$ is also invariant under $su(3)$.
Ad (iv). Observe that
$$t_{11} + t_{22} + t_{33} = 0, \quad t_{33} = b_8, \quad t_{11} - t_{22} = 3b_2. \qquad \Box$$
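The invariance of $M_1$ and $M_2$ and the weights in (112) can be checked numerically. The following is only a sketch under an identification that is our assumption, not part of the text above: $e_k \otimes e_m^*$ is represented by the $(3 \times 3)$ matrix unit $E_{km}$, so that $\phi(A)$ acts as the commutator $E \mapsto AE - EA$, $\sigma$ becomes the identity matrix, and $M_2$ becomes the space of traceless matrices. The helper names (`phi`, `unit`, `t`, `weights`) are hypothetical.

```python
import numpy as np

# Diagonal operators Y and T_3 in the quark basis e_1, e_2, e_3.
Y = np.diag([1 / 3, 1 / 3, -2 / 3])
T3 = np.diag([1 / 2, -1 / 2, 0.0])

def phi(A, E):
    """Action of A on X (x) X*, with e_k (x) e_m^* identified with a matrix unit."""
    return A @ E - E @ A

def unit(k, m):
    """Matrix unit E_km standing for e_k (x) e_m^* (0-based indices)."""
    E = np.zeros((3, 3))
    E[k, m] = 1.0
    return E

sigma = np.eye(3)  # sigma = e_11 + e_22 + e_33

def t(k, m):
    """t_km = 3 e_km - delta_km sigma."""
    return 3 * unit(k, m) - (1.0 if k == m else 0.0) * sigma

def weights():
    """Pairs (Y, T_3) = (lambda_k - lambda_m, mu_k - mu_m), keyed by 1-based (k, m)."""
    lam, mu = np.diag(Y), np.diag(T3)
    return {(k + 1, m + 1): (lam[k] - lam[m], mu[k] - mu[m])
            for k in range(3) for m in range(3)}

# The span of all t_km has dimension 8, in agreement with Proposition 2(iv).
dim_M2 = np.linalg.matrix_rank(
    np.array([t(k, m).ravel() for k in range(3) for m in range(3)]))
```

In this picture the pair $(k, m) = (1, 2)$, one u-quark and one d-antiquark, carries the weights $Y = 0$ and $T_3 = 1$.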
2.22.3 Applications to Gauge Field Theory
In this section, we sum over two equal lower and upper indices from 1 to 4 (the Einstein convention). We also set
$$\mathcal{L}(N) := \begin{cases} su(N) & \text{for } N \ge 2, \\ u(1) & \text{for } N = 1. \end{cases}$$
Here, $A_j \in su(N)$ iff $A_j$ is a complex traceless skew-adjoint $(N \times N)$-matrix, and $A_j \in u(1)$ iff $iA_j$ is a real number.
The basic idea of gauge field theory and its importance for modern elementary physics have been discussed in Section 2.20, by considering a simple model. We are now ready to study the fundamental $su(N)$-gauge field theory. Our point of departure is the following variational problem:$^{18}$
$$\int_G L(\phi, \psi, A)\,dx = \text{stationary!}, \qquad (113)$$
$$\phi, \psi, A_j = \text{given on } \partial G, \quad j = 1, \ldots, 4,$$
18This variational problem refers to an arbitrary inertial system. By definition,
a Cartesian coordinate system is an inertial system only if there exists a system
time t such that each mass point, which is sufficiently distant from other masses
and shielded against fields, remains at rest or moves rectilinearly with constant
velocity.
Einstein's famous principle of relativity from 1905 postulates that all inertial
systems are physically equivalent, meaning that physical processes are the same
in all inertial systems when the initial and boundary conditions are the same. A
detailed mathematical and physical discussion of this principle can be found in
Zeidler (1986), Vol. 4, Chapter 75.
In order to prove that the principle (113) is indeed valid for each inertial system, one has to know the transformation rules for $x$, $\phi$, $\psi$, and $A$ under a change of the inertial system. It turns out that the four-potential $A$ has to be a covariant vector and both $\phi$ and $\psi$ have to be spinors under Lorentz transformations of $x$ (cf. Thaller (1992), and Zeidler (1986), Vol. 5, Chapter 91).
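with the Lagrangian
$$L := (\phi \mid i\gamma^4\gamma^k\nabla_k\psi) - m(\phi \mid \gamma^4\psi) - \tfrac14(F_{jk}, F^{jk}), \qquad (114)$$
where $G$ is a bounded nonempty open set in $\mathbb{R}^4$,
$$\nabla_j := \partial_j + \kappa A_j, \quad A_j \in \mathcal{L}(N), \quad j = 1, \ldots, 4, \quad N \ge 1,$$
and
$$F_{jk} := \partial_jA_k - \partial_kA_j + \kappa[A_j, A_k], \quad j, k = 1, \ldots, 4,$$
as well as
$$[A_j, A_k] := A_jA_k - A_kA_j.$$
Observe that
$$F_{jk} = -F_{kj} \quad \text{for all } j, k = 1, \ldots, 4.$$
Furthermore, we define
$$D_jF^{km} := \partial_jF^{km} + \kappa[A_j, F^{km}].$$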
Here, the symbols have the following meaning:
$m$ = mass of the fundamental particle $P$,
$\psi$, $\phi$ = field of $P$ (below we will set $\phi = \psi$),
$A$ = potential of the interaction between the particles $P$,
$F$ = field strength of the interaction,
$\kappa$ = coupling constant of the interaction,
$J$ = current generated by the particles $P$,
$x_1, x_2, x_3$ = Cartesian coordinates,
$t$ = time,
$c$ = velocity of light, $x := (x_1, x_2, x_3, x_4)$, where $x_4 := ct$,
$\partial_j := \partial/\partial x_j$ = classical derivative,
$\nabla_j$, $D_j$ = covariant derivatives.
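Observe that the components $A_j$ of the potential and the components $F_{jk}$ of the field strength are elements of the Lie algebra $\mathcal{L}(N)$. Furthermore, let us define the metric matrix
$$(g_{jk}) := \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
and the so-called Dirac matrices
$$\gamma^4 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \gamma^1 := \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix},$$
$$\gamma^2 := \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \quad \gamma^3 := \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$
The crucial property of these matrices is given by the relation$^{19}$
$$\gamma^j\gamma^k + \gamma^k\gamma^j = 2g^{jk} \quad \text{for all } j, k = 1, \ldots, 4.$$
Here, $(g^{jk})$ denotes the inverse matrix to $(g_{jk})$, that is, $g^{jk} = g_{jk}$ for all $j, k$. Moreover, $(\gamma^4)^* = \gamma^4$ and $(\gamma^k)^* = -\gamma^k$ for $k = 1,2,3$. We also set
$$F^{jk} := g^{jr}g^{ks}F_{rs}.$$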
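The anticommutation relation and the adjoint properties of the Dirac matrices are easy to confirm numerically. The sketch below assumes the standard Dirac representation built from the Pauli matrices, $\gamma^4 = \mathrm{diag}(I, -I)$ and $\gamma^k$ with off-diagonal blocks $\sigma_k$ and $-\sigma_k$; this choice of representation and the helper names are our assumptions for illustration.

```python
import numpy as np

# Pauli matrices
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Assumed Dirac representation: gamma[k] = [[0, s_k], [-s_k, 0]], gamma[4] = diag(I, -I).
gamma = {k: np.block([[Z2, s], [-s, Z2]]) for k, s in zip((1, 2, 3), (s1, s2, s3))}
gamma[4] = np.block([[I2, Z2], [Z2, -I2]])

# Metric matrix; it coincides with its own inverse.
g = np.diag([-1.0, -1.0, -1.0, 1.0])

def clifford_ok():
    """Check gamma^j gamma^k + gamma^k gamma^j = 2 g^{jk} I for all j, k."""
    for j in range(1, 5):
        for k in range(1, 5):
            anti = gamma[j] @ gamma[k] + gamma[k] @ gamma[j]
            if not np.allclose(anti, 2 * g[j - 1, k - 1] * np.eye(4)):
                return False
    return True

def adjoint_ok():
    """Check (gamma^4)* = gamma^4 and (gamma^k)* = -gamma^k for k = 1, 2, 3."""
    ok4 = np.allclose(gamma[4].conj().T, gamma[4])
    okk = all(np.allclose(gamma[k].conj().T, -gamma[k]) for k in (1, 2, 3))
    return ok4 and okk
```

Any other representation obtained from this one by a unitary similarity transformation satisfies the same relations.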
Finally, we want to define the Hilbert space $X$ under consideration. First let $Y$ be the four-dimensional complex Hilbert space of all the complex matrices
$$\psi_j = (\psi_{j1}, \psi_{j2}, \psi_{j3}, \psi_{j4})^T$$
with the inner product
$$(\phi_j \mid \psi_j)_Y := \sum_{k=1}^{4} \overline{\phi_{jk}}\,\psi_{jk}.$$
Then, by definition, $X$ is the complex Hilbert space of all the $N$-matrices
$$\psi = (\psi_1, \ldots, \psi_N)^T \quad \text{with } \psi_j \in Y \text{ for all } j$$
and the inner product
$$(\phi \mid \psi) := \sum_{j=1}^{N} (\phi_j \mid \psi_j)_Y.$$
Let $A_j \in \mathcal{L}(N)$, and denote the elements of $A_j$ by $a_{km}$. Recall that the complex $(N \times N)$-matrix $A_j = (a_{km})$ is traceless and skew-adjoint. This means that $a_{11} + \cdots + a_{NN} = 0$ and $a_{km} = -\overline{a_{mk}}$ for $k, m = 1, \ldots, N$. Recall that $\gamma^k$ is a $(4 \times 4)$-matrix.
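19 This relation says that the Dirac matrices $\gamma^1$, $\gamma^2$, $\gamma^3$, and $\gamma^4$ generate a Clifford algebra (cf. Zeidler (1986), Vol. 5, Chapter 91).
Let us now extend $\gamma^k$ and $A_j$ to operators $\gamma^k, A_j: X \to X$ on the space $X$ in a natural way. To this end, set
$$\gamma^k\psi := \begin{pmatrix} \gamma^k\psi_1 \\ \vdots \\ \gamma^k\psi_N \end{pmatrix} \quad \text{and} \quad A_j\psi := \begin{pmatrix} a_{11}\psi_1 + \cdots + a_{1N}\psi_N \\ \vdots \\ a_{N1}\psi_1 + \cdots + a_{NN}\psi_N \end{pmatrix}.$$
Finally, set
$$(A, B) := -\mathrm{tr}(AB) \quad \text{for all } A, B \in \mathcal{L}(N),\ N \ge 1.$$
This way, we get an inner product $(\cdot, \cdot)$ on the Lie algebra $\mathcal{L}(N)$, by Example 3 in Section 2.21. Let $\{B_j\}$ be an orthonormal basis of $\mathcal{L}(N)$ with respect to $(\cdot, \cdot)$.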
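Theorem 2.M (Basic equations of gauge field theory). Each sufficiently smooth solution $\phi, \psi, A_1, \ldots, A_4$ of the original variational problem (113) with $\phi = \psi$ satisfies the following Euler-Lagrange equations:
$$D_jF^{jk} = J^k, \quad k = 1, \ldots, 4 \quad \text{(Yang-Mills equation)} \qquad (115a)$$
$$D_jF_{km} + D_kF_{mj} + D_mF_{jk} = 0, \quad j, k, m = 1, \ldots, 4 \quad \text{(Bianchi identity)} \qquad (115b)$$
$$(i\gamma^k\nabla_k - mI)\psi = 0 \quad \text{(Dirac equation)}. \qquad (115c)$$
Here, the current $J$ is given by
$$J^k := -\sum_{r=1}^{N^2-1} \kappa(\psi \mid i\gamma^4\gamma^kB_r\psi)B_r \quad \text{for } N \ge 2,$$
and by the single term $r = 1$ of this sum for $N = 1$.
The proof of Theorem 2.M will be given ahead.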
Example 1 (The Maxwell equations for the electromagnetic field). Let $N = 1$. In the special case where $\phi, \psi \equiv 0$, (115a, b) represent the Maxwell equations in vacuum. Here, the real functions $-iA_1, \ldots, -iA_4$ are the components of the four-potential, and the real functions
$$-iF_{jk}, \quad j, k = 1, \ldots, 4,$$
are the components of the classical electromagnetic field tensor$^{20}$ (cf. Zeidler (1986), Vol. 5, Chapter 83). Explicitly, the entries of this tensor are expressed through $E_1, E_2, E_3$ and $B_1, B_2, B_3$, the Cartesian components of the electric field vector $E$ and magnetic field vector $B$, respectively.
20 Our notation is chosen in such a way that $A_j$ and $F_{jk}$ are elements of the Lie algebra $\mathcal{L}(N)$. This convention simplifies the notation and clarifies the mathematics. Physicists frequently replace $A_j$ with $iA_j$ and $F_{jk}$ with $iF_{jk}$.
Note that, in this special classical case, we get $[A_j, A_k] = 0$, and hence $\nabla_j = D_j = \partial_j$. In the language of vector calculus, the Maxwell equations in vacuum (115a) and (115b) with $\phi, \psi \equiv 0$ are identical to
(115a*)
$$\mathrm{div}\,E = 0, \quad \mathrm{div}\,B = 0. \qquad (115b^*)$$
This shows that
The basic equations (115) of gauge field theory generalize the classical Maxwell equations of electromagnetism.
Example 2 (The Dirac equation). Let $N = 1$ and $A_j \equiv 0$ for all $j$. In this case, system (115) passes over to the Dirac equation (115c) with $\nabla_j = \partial_j$. This equation describes a free relativistic electron.
Example 3 (The Maxwell-Dirac equations). Let $N = 1$ and $\kappa = e$ = electric charge of the electron. Then, system (115) represents the Maxwell-Dirac equations of quantum electrodynamics.
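Let us now discuss the fundamental gauge invariance. By a gauge transformation
$$\psi \mapsto \psi', \quad \phi \mapsto \phi', \quad A_j \mapsto A_j', \quad \gamma^k \mapsto \gamma'^k, \qquad (116)$$
we understand$^{21}$
$$\psi'(x) = U(x)\psi(x), \quad \phi'(x) = U(x)\phi(x), \quad \gamma'^k = U\gamma^kU^{-1},$$
$$A_j'(x) = U(x)A_j(x)U(x)^{-1} + \kappa^{-1}U(x)\partial_jU(x)^{-1}, \qquad (116^*)$$
where
$$U(x) := e^{A(x)} \quad \text{with } A(x) \in \mathcal{L}(N) \text{ for all } x \in \mathbb{R}^4.$$
Corollary 4 (Gauge invariance). If we set $\nabla_j' := \partial_j + \kappa A_j'$, then
$$\nabla_j'\psi'(x) = U(x)\nabla_j\psi(x), \qquad (117)$$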
21Note that the prime does not denote any derivative.
$$F_{jk}'(x) = U(x)F_{jk}(x)U(x)^{-1}, \qquad (118)$$
$$L(\phi'(x), \psi'(x), A'(x), \gamma'^k) = L(\phi(x), \psi(x), A(x), \gamma^k), \qquad (119)$$
for all $j, k = 1, \ldots, 4$ and $x \in \mathbb{R}^4$.
From (119) we get the following fundamental result:
The basic equations (115) from Theorem 2.M are invariant under the gauge transformation (116).
Explicitly, this means the following. If $\phi, \psi, A$ is a solution of (115), then $\phi', \psi', A'$ is a solution of (115) provided a prime is assigned to all symbols.
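Proof of Corollary 4. This follows from (116) by a straightforward computation.
Ad (117). Since
$$\partial_jU(x) = (\partial_jA(x))U(x) \quad \text{and} \quad \partial_jU(x)^{-1} = -(\partial_jA(x))U(x)^{-1},$$
we get
$$\nabla_j'\psi' = (\partial_j + \kappa A_j')U\psi = (\partial_jU)\psi + U\partial_j\psi + \kappa(UA_jU^{-1})U\psi + (U\partial_jU^{-1})U\psi$$
$$= U(\partial_j + \kappa A_j)\psi = U\nabla_j\psi.$$
Observe that $AU = UA$ and $(\partial_jA)U = U\partial_jA$, since $e^A = I + A + \tfrac12A^2 + \cdots$.
Ad (118). Note that $\kappa F_{jk} = [\nabla_j, \nabla_k] \equiv \nabla_j\nabla_k - \nabla_k\nabla_j$. By (117), $\nabla_j'\phi = U\nabla_j(U^{-1}\phi)$. Hence
$$\kappa F_{jk}' = [\nabla_j', \nabla_k'] = [U\nabla_jU^{-1}, U\nabla_kU^{-1}] = U[\nabla_j, \nabla_k]U^{-1} = \kappa UF_{jk}U^{-1}.$$
Ad (119). Observe that the operator $U$ is unitary. Hence
$$(\phi' \mid \gamma'^4\psi') = (U\phi \mid U\gamma^4\psi) = (\phi \mid \gamma^4\psi),$$
$$(\phi' \mid \gamma'^4\gamma'^k\nabla_k'\psi') = (U\phi \mid U\gamma^4\gamma^k\nabla_k\psi) = (\phi \mid \gamma^4\gamma^k\nabla_k\psi),$$
and
$$(F_{jk}', F'^{jk}) = (UF_{jk}U^{-1}, UF^{jk}U^{-1}) = -\mathrm{tr}(UF_{jk}U^{-1}UF^{jk}U^{-1})$$
$$= -\mathrm{tr}(UF_{jk}F^{jk}U^{-1}) = -\mathrm{tr}(U^{-1}UF_{jk}F^{jk}) = (F_{jk}, F^{jk}). \qquad \Box$$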
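The algebraic core of this proof, namely that conjugation by a unitary $U = e^{A}$ preserves both commutators and the inner product $(A, B) = -\mathrm{tr}(AB)$, can be illustrated numerically. The sketch below treats only the special case of a constant (that is, $x$-independent) matrix $U$, so that the derivative term in (116*) vanishes; the random test matrices and all helper names are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_su(n):
    """Random traceless skew-adjoint (n x n) matrix, i.e., an element of su(n)."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    A = M - M.conj().T                      # skew-adjoint part
    A -= (np.trace(A) / n) * np.eye(n)      # remove the (purely imaginary) trace
    return A

def exp_skew(A):
    """exp(A) for skew-adjoint A, computed from the Hermitian matrix H = -iA."""
    w, V = np.linalg.eigh(-1j * A)
    return (V * np.exp(1j * w)) @ V.conj().T

def inner(A, B):
    """Inner product (A, B) := -tr(AB) on the Lie algebra."""
    return -np.trace(A @ B)

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

n = 3
A1, A2 = random_su(n), random_su(n)
U = exp_skew(random_su(n))
Uinv = U.conj().T                           # U is unitary, hence U^{-1} = U*

# Field strength of constant potentials (kappa = 1): F = [A1, A2].
F = comm(A1, A2)
F_prime = comm(U @ A1 @ Uinv, U @ A2 @ Uinv)
```

Here `F_prime` agrees with `U @ F @ Uinv`, and `inner(F_prime, F_prime)` agrees with `inner(F, F)`, which mirrors (118) and the last step of the proof in this special case.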
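Proof of Theorem 2.M. Let $\tau \in \mathbb{R}$. Set
$$S(\tau) := \int_G L(\phi + \tau\delta\phi, \psi + \tau\delta\psi, A + \tau\delta A)\,dx,$$
where $\delta\phi(x), \delta\psi(x) \in Y$, $\delta A_j(x) \in \mathcal{L}(N)$, and the components of $\delta\phi$, $\delta\psi$, and $\delta A_j$ are $C_0^\infty(G)$-functions.
Suppose that $\phi, \psi, A$ is a solution of the original variational problem (113). Then $S'(0) = 0$, that is,
$$0 = S'(0) = \int_G (\delta\phi \mid \gamma^4(i\gamma^k\nabla_k - mI)\psi) + (\phi \mid \gamma^4(i\gamma^k\nabla_k - mI)\delta\psi)\,dx$$
$$+ \int_G (\phi \mid i\gamma^4\gamma^k\kappa\delta A_k\psi) - \tfrac12(\delta F_{jk}, F^{jk})\,dx,$$
where
$$\delta F_{jk} := \partial_j\delta A_k - \partial_k\delta A_j + \kappa[\delta A_j, A_k] + \kappa[A_j, \delta A_k].$$
Step 1: Let $\delta\psi \equiv 0$ and $\delta A_j \equiv 0$ for all $j$. Then
$$\int_G (\delta\phi \mid \gamma^4(i\gamma^k\nabla_k - mI)\psi)\,dx = 0,$$
for all $\delta\phi$ with $C_0^\infty(G)$-components. Hence
$$\gamma^4(i\gamma^k\nabla_k - mI)\psi = 0. \qquad (120)$$
Step 2: Let $\delta\phi \equiv 0$ and $\delta A_j \equiv 0$ for all $j$. Observe that
$$(i\gamma^4\gamma^k)^* = -i\gamma^4\gamma^k \quad \text{for } k = 1,2,3,4. \qquad (121)$$
In fact, $(\gamma^4)^* = \gamma^4$ and
$$(i\gamma^4\gamma^k)^* = -i(\gamma^k)^*\gamma^4 = i\gamma^k\gamma^4 = -i\gamma^4\gamma^k \quad \text{for } k = 1,2,3.$$
Furthermore, it follows from $(A_k)^* = -A_k$ and $A_j\gamma^k = \gamma^kA_j$ for $k, j = 1,2,3,4$ that
$$(i\gamma^4\gamma^kA_k)^* = A_k^*(i\gamma^4\gamma^k)^* = i\gamma^4\gamma^kA_k \quad \text{for } k = 1,2,3,4. \qquad (122)$$
Thus, integration by parts yields
$$0 = S'(0) = \int_G (\phi \mid B\delta\psi)\,dx = \int_G (B\phi \mid \delta\psi)\,dx$$
for all $\delta\psi$ with $C_0^\infty(G)$-components, where $B := \gamma^4(i\gamma^k(\partial_k + \kappa A_k) - mI)$. Hence
$$B\phi \equiv \gamma^4(i\gamma^k\nabla_k - mI)\phi = 0.$$
By (120), the function $\psi$ satisfies the same equation. This shows that our assumption $\phi = \psi$ from Theorem 2.M makes sense. In what follows we will use this assumption.
Step 3: Let $\delta\phi \equiv 0$ and $\delta\psi \equiv 0$. Set $\delta A_k(x) := a_k(x)B_s$ where $a_k \in C_0^\infty(G)$. Then
$$(\psi \mid i\gamma^4\gamma^k\kappa\delta A_k\psi) = (\psi \mid i\gamma^4\gamma^k\kappa B_s\psi)a_k = \sum_r ((\psi \mid i\gamma^4\gamma^k\kappa B_r\psi)B_r, B_s)a_k$$
because $(B_r, B_s) = \delta_{rs}$. Since $(B_r)^* = -B_r$, it follows as in (122) that
$$(i\gamma^4\gamma^kB_r)^* = i\gamma^4\gamma^kB_r.$$
Consequently, $(\psi \mid i\gamma^4\gamma^k\kappa B_r\psi)$ is real, and $B_r \in \mathcal{L}(N)$ hence implies that $J^k \in \mathcal{L}(N)$.
Furthermore, note that $\mathrm{tr}(DE) = \mathrm{tr}(ED)$, and hence
$$([A, B], C) = -\mathrm{tr}(ABC) + \mathrm{tr}(BAC) = -\mathrm{tr}(CAB) + \mathrm{tr}(ACB) = -([A, C], B).$$
Since $F^{jk} = -F^{kj}$, this implies
$$\tfrac12(\delta F_{jk}, F^{jk}) = (\partial_j\delta A_k, F^{jk}) + (\kappa[A_j, \delta A_k], F^{jk}) = (\partial_j\delta A_k, F^{jk}) - (\kappa[A_j, F^{jk}], \delta A_k).$$
Integration by parts yields
$$0 = S'(0) = \int_G (\psi \mid i\gamma^4\gamma^k\kappa\delta A_k\psi) - \tfrac12(\delta F_{jk}, F^{jk})\,dx$$
$$= \int_G (-J^k + \partial_jF^{jk} + \kappa[A_j, F^{jk}], \delta A_k)\,dx$$
$$\equiv \int_G (-J^k + D_jF^{jk}, \delta A_k)\,dx = \int_G (-J^k + D_jF^{jk}, B_s)a_k\,dx$$
for all $a_k \in C_0^\infty(G)$ and all $s$. Hence
$$(-J^k + D_jF^{jk}, B_s) = 0 \quad \text{for all } s.$$
Since $\{B_s\}$ is an orthonormal basis of $\mathcal{L}(N)$ with respect to the inner product $(\cdot, \cdot)$, this implies
$$-J^k + D_jF^{jk} = 0.$$
In this connection, observe that $J^k$, $F^{jk}$, and $D_jF^{jk}$ are elements of the Lie algebra $\mathcal{L}(N)$.
Step 4: A straightforward computation shows that the Bianchi identity (115b) is a consequence of the definition of $F_{jk}$. $\Box$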
Remark 5 (Gauge theory and modern differential geometry). A more detailed study of the mathematical and physical aspects of gauge field theory can be found in Zeidler (1986), Vol. 5, and in the handbook article Zeidler (1995), Chapter 19. There it is shown that, in terms of modern differential geometry, the potential $A$ and the field $F$ of interaction are related to the connection and the curvature of principal fiber bundles. Roughly speaking,
potential $A$ = connection of the principal fiber bundle $\mathcal{F}$,
field $F$ of interaction = curvature of $\mathcal{F}$,
particle field $\psi$ = section of the associated vector bundle $\mathcal{V}$.
Let us briefly discuss this. To simplify notation, let $\kappa = 1$. In what follows we sum over $j$ from 1 to 4.
(i) The Lie group $\mathcal{G}$. Set
$$\mathcal{G}(N) := \begin{cases} SU(N) & \text{for } N \ge 2, \\ U(1) & \text{for } N = 1. \end{cases}$$
Recall that $SU(N)$ denotes the set of all complex unitary $(N \times N)$-matrices $U$ with $\det U = 1$. Moreover, $U(1)$ denotes the set of all complex numbers $u$ with $|u| = 1$.
The Lie algebra $\mathcal{L}(N)$ introduced at the beginning of Section 2.22.3 is precisely the Lie algebra corresponding to $\mathcal{G}(N)$.
(ii) Principal fiber bundle. The product space
$$\mathcal{F} := \mathbb{R}^4 \times \mathcal{G}(N)$$
is a special case of a fiber bundle. The set $\{x\} \times \mathcal{G}(N)$ is called the fiber over the point $x$. Since the typical fiber $\mathcal{G}(N)$ is a Lie group, the fiber bundle $\mathcal{F}$ is called a principal fiber bundle.
(iii) The associated vector bundle $\mathcal{V}$. The values $\psi(x)$ of the field $\psi$ live in the linear space $X$. Set
$$\mathcal{V} := \mathbb{R}^4 \times X.$$
This is a fiber bundle. Since the typical fiber $X$ of $\mathcal{V}$ is a linear space, $\mathcal{V}$ is called a vector bundle associated to the principal fiber bundle $\mathcal{F}$.
The map $x \mapsto (x, \psi(x))$ from the base space $\mathbb{R}^4$ into the vector bundle $\mathcal{V}$ is called a section of $\mathcal{V}$.
(iv) Parallel transport in the principal fiber bundle $\mathcal{F}$. Let $x = x(\sigma)$, $a \le \sigma \le b$, be a $C^1$-curve in $\mathbb{R}^4$. Recall that $A_j \in \mathcal{L}(N)$. The differential equation
$$g'(\sigma) + x_j'(\sigma)A_j(x(\sigma))g(\sigma) = 0, \quad a \le \sigma \le b, \qquad (P)$$
is called the equation of parallel transport in $\mathcal{F}$. We are given the matrix $g(0) \in \mathcal{G}(N)$. We are looking for a function
$$\sigma \mapsto g(\sigma),$$
where $g(\sigma) \in \mathcal{G}(N)$ for all $\sigma \in [a, b]$.
If $g = g(\sigma)$ is a solution of (P), then we say that the curve in $\mathcal{F}$,
$$\sigma \mapsto (x(\sigma), g(\sigma)),$$
corresponds to a parallel transport of the initial point $(x(0), g(0)) \in \mathcal{F}$ along the base curve $x = x(\sigma)$.
The coefficients $A_j$ of equation (P) are called a connection on $\mathcal{F}$.
In order to show that equation (P) makes sense, observe the following: If $g(\sigma)$ lives in the Lie group $\mathcal{G}(N)$ for all $\sigma$, then the derivative $g'(\sigma)$ lives in the corresponding Lie algebra $\mathcal{L}(N)$.
(v) Parallel transport in the associated vector bundle $\mathcal{V}$. Let the particle field $\psi$ be given, where
$$\psi(x) \in X \quad \text{for all } x \in \mathbb{R}^4.$$
Set $\Psi(\sigma) := \psi(x(\sigma))$. By definition, the field $\psi$ is parallel along the base curve $x = x(\sigma)$ iff
$$\Psi'(\sigma) + x_j'(\sigma)A_j(x(\sigma))\Psi(\sigma) = 0, \quad a \le \sigma \le b. \qquad (P^*)$$
If we introduce the covariant directional derivative along the curve $x = x(\sigma)$ through
$$\frac{D\Psi(\sigma)}{d\sigma} := x_j'(\sigma)\nabla_j\psi(x(\sigma)),$$
then equation (P*) can be written elegantly as
$$\frac{D\Psi(\sigma)}{d\sigma} = 0, \quad a \le \sigma \le b.$$
Observe that $\Psi'(\sigma) = x_j'(\sigma)\partial_j\psi(x(\sigma))$ and $\nabla_j = \partial_j + A_j$.
(vi) Curvature of the principal fiber bundle $\mathcal{F}$. In modern differential geometry, curvature is measured by the commutator of covariant derivatives. Therefore, let us describe the curvature of $\mathcal{F}$ through the commutators
$$F_{jk} := [\nabla_j, \nabla_k],$$
which is precisely the field $F$ of interaction.
General fiber bundles are manifolds that possess a local product structure. The intrinsic formulation of parallel transport and curvature is then based on the language of differential forms with values in a Lie algebra (cf. Zeidler (1986), Vol. 5, (1995)).
As an introduction to this subject, we recommend Isham (1989).
FIGURE 2.8. Panels (a) and (b), with the labels $x$, $a$, $-a$, and $-h$.
Problems
2.1. Bernoulli's brachistochrone problem. Two points, at different distances from the ground and not in a vertical line, should be connected by a curve
$$y = y(x)$$
so that a body under the influence of gravitational forces passes in the shortest possible time from the upper to the lower point. Compute this curve. Show that this corresponds to the following variational problem (see Figure 2.8(a)):
$$\int_0^a \frac{\sqrt{1 + y'^2}}{\sqrt{-y}}\,dx = \min!, \qquad y(0) = 0, \quad y(a) = -h. \qquad (123)$$
Solution: The energy $E$ of the body is given by
$$E = \tfrac12mv^2 + mgy = \text{const},$$
where $m$ = mass, $v$ = velocity, and $g$ = gravitational acceleration. At the beginning of the motion, we have $y = v = 0$, and hence $E = 0$. This implies
$$v = \sqrt{-2gy}.$$
Let $s$ denote the arclength of the curve. Then, $v = \frac{ds}{dt}$. Thus,
$$\text{time} = \sum \Delta t = \sum \frac{\Delta s}{v} = \min!.$$
Using $\Delta s = \sqrt{(\Delta x)^2 + (\Delta y)^2}$ and letting $\Delta x \to 0$, we get (123).
The Euler equation to (123) is given by
$$\frac{d}{dx}L_{y'} - L_y = 0,$$
where $L = \sqrt{(1 + y'^2)/(-y)}$. Since $L$ is independent of $x$, we get
$$\frac{d(L - y'L_{y'})}{dx} = L_yy' + L_{y'}y'' - y''L_{y'} - y'\frac{d}{dx}L_{y'} = 0.$$
Hence $L - y'L_{y'} = \text{const}$, that is,
$$\frac{1}{\sqrt{-y(1 + y'^2)}} = \frac{1}{\sqrt{2C}}. \qquad (124)$$
One checks that the cycloid
$$x = C(u - \sin u), \quad y = C(\cos u - 1)$$
is a solution of (124).
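The final check can also be carried out numerically. The following sketch evaluates the quantity $-y(1 + y'^2)$ along the cycloid, with $y' = (dy/du)/(dx/du)$, and compares it with the constant $2C$. The function names and the sample value of $C$ are hypothetical choices for illustration.

```python
import numpy as np

def cycloid(u, C):
    """Parametrized cycloid x = C(u - sin u), y = C(cos u - 1)."""
    x = C * (u - np.sin(u))
    y = C * (np.cos(u) - 1.0)
    return x, y

def first_integral(u, C):
    """Value of -y (1 + y'^2) along the cycloid, where y' = (dy/du) / (dx/du)."""
    _, y = cycloid(u, C)
    dx_du = C * (1.0 - np.cos(u))
    dy_du = -C * np.sin(u)
    y_prime = dy_du / dx_du
    return -y * (1.0 + y_prime ** 2)

C = 1.7
u = np.linspace(0.1, 6.0, 50)       # stay away from the cusps u = 0 and u = 2*pi
values = first_integral(u, C)       # should be constant and equal to 2C
```

The parameter interval avoids the cusps of the cycloid, where $dx/du = 0$ and the slope $y'$ is unbounded.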
2.2. The hanging rope. Compute the shape $y = y(x)$ of a hanging rope (Figure 2.8(b)). Motivate that this corresponds to the following variational problem:
$$\int_{-a}^{a} \rho gy\sqrt{1 + y'^2}\,dx = \min!, \qquad y(a) = y(-a) = 0, \qquad (125)$$
where $\rho$ = density and $g$ = gravitational acceleration.
Solution: A piece of the hanging rope has the potential energy $\rho gy\Delta s$, where $s$ denotes the arclength. Problem (125) corresponds to the principle of minimal potential energy:
$$E_{pot} = \sum \rho gy\Delta s = \min!.$$
The Euler equation to (125) reads as follows:
$$\frac{d}{dx}\frac{yy'}{\sqrt{1 + y'^2}} - \sqrt{1 + y'^2} = 0.$$
Hence $\frac{y}{\sqrt{1 + y'^2}} = \text{const}$. This yields $y = \cosh x - \cosh a$.
2.3. Minimal surfaces. Compute the Euler equation to the variational problem
$$\int_G \sqrt{1 + u_x^2 + u_y^2}\,dxdy = \min!, \qquad u = \text{given on } \partial G.$$
This is the problem of the least area for a prescribed boundary curve (see Figure 2.9).
Solution: We get
$$\frac{\partial}{\partial x}\frac{u_x}{\sqrt{1 + u_x^2 + u_y^2}} + \frac{\partial}{\partial y}\frac{u_y}{\sqrt{1 + u_x^2 + u_y^2}} = 0.$$
Existence theorems can be found in Problem 2.13.
FIGURE 2.9. The surface $z = u(x, y)$ over the region $G$ with boundary $\partial G$.
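2.4. The generalized Dirichlet problem. Compute the Euler equation to the variational problem
$$\int_G (F(u_x^2 + u_y^2) - 2fu)\,dxdy = \min!, \qquad u = \text{given on } \partial G.$$
Solution: We obtain
$$-\frac{\partial}{\partial x}\big(F'(u_x^2 + u_y^2)u_x\big) - \frac{\partial}{\partial y}\big(F'(u_x^2 + u_y^2)u_y\big) = f.$$
This is identical to the conservation law
$$\mathrm{div}\,j = f,$$
where
$$j = -F'(|\mathrm{grad}\,u|^2)\,\mathrm{grad}\,u.$$
For $F(s) = s$ this reduces to the Poisson equation $-\Delta u = f$. Equations of this type appear frequently in mathematical physics (cf. Zeidler (1986), Vol. 2B, Sections 25.9ff).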
2.5. Motion of relativistic particles. Compute the Euler equation to the variational problem
$$\int_{t_0}^{t_1} \Big(-m_0c^2\sqrt{1 - \frac{x'^2}{c^2}} - U(x)\Big)dt = \min!, \qquad x(t_0) = a, \quad x(t_1) = b,$$
where $x = (\xi_1, \xi_2, \xi_3)$ and $x'^2 = \sum_{j=1}^{3} \xi_j'^2$.
Solution: We get
$$\frac{d}{dt}(m\xi_j') = -\frac{\partial U}{\partial\xi_j}, \quad j = 1,2,3,$$
with the so-called relativistic mass
$$m = \frac{m_0}{\sqrt{1 - x'^2/c^2}}.$$
This can be written briefly as
$$(mx')' = -\mathrm{grad}\,U.$$
This is the equation of motion $x = x(t)$ for a relativistic particle in an inertial system under the influence of the force $K = -\mathrm{grad}\,U$. Observe that the relativistic mass $m$ of the particle depends on its velocity. This mass goes to infinity if the particle approaches the velocity $c$ of light.
A detailed study of the physical and mathematical meanings of this problem in special relativity can be found in Zeidler (1986), Vol. 4, Section 75.11.
2.6. The nth variation. Let
$$F(u) := \int_a^b L(x, u(x), u'(x), \ldots, u^{(n)}(x))\,dx,$$
where $-\infty < a < b < \infty$ and $n \ge 1$. Suppose that the Lagrangian $L: [a, b] \times \mathbb{R}^{n+1} \to \mathbb{R}$ is $C^2$. Set $X := C^n[a, b]$ (cf. Problem 1.6c in AMS Vol. 108). Show that
(i) The functional $F: X \to \mathbb{R}$ is continuous.
(ii) If $L$ is convex with respect to the arguments $u, u', \ldots, u^{(n)}$, then $F$ is convex on $X$.
(iii) Compute the first variation $\delta F(u; h)$ and the second variation $\delta^2F(u; h)$ for all $u, h \in X$.
Hint: Letting $\phi(t) := F(u + th)$ for $t \in \mathbb{R}$, we get $\delta^rF(u; h) = \phi^{(r)}(0)$, and hence
$$\delta F(u; h) = \int_a^b \sum_{k=0}^{n} [D_kL(x, u(x), \ldots, u^{(n)}(x))]h^{(k)}(x)\,dx,$$
$$\delta^2F(u; h) = \int_a^b \sum_{m,k=0}^{n} [D_kD_mL(x, u(x), \ldots, u^{(n)}(x))]h^{(m)}(x)h^{(k)}(x)\,dx,$$
where $D_k$ denotes the partial derivative with respect to the argument $u^{(k)}$.
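The formulas of the hint can be tested on a concrete Lagrangian. The sketch below assumes $n = 1$, $[a, b] = [0, 1]$, and the sample Lagrangian $L(x, u, u') = u'^2 + u^3$, for which $\delta F(u; h) = \int (3u^2h + 2u'h')\,dx$ and $\delta^2F(u; h) = \int (6uh^2 + 2h'^2)\,dx$. The functions $u = \sin x$ and $h = e^x$, the grid, and the helper names are our own choices for illustration.

```python
import numpy as np

# Uniform grid on [a, b] = [0, 1].
x = np.linspace(0.0, 1.0, 20001)

def integrate(vals):
    """Composite trapezoidal rule on the grid x."""
    dx = x[1] - x[0]
    return dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

# Sample point u and direction h, with their derivatives.
u, du = np.sin(x), np.cos(x)
h, dh = np.exp(x), np.exp(x)

def F(t):
    """phi(t) = F(u + t h) for the sample Lagrangian L = u'^2 + u^3."""
    return integrate((du + t * dh) ** 2 + (u + t * h) ** 3)

def first_variation():
    """Integral of D_0 L * h + D_1 L * h' = 3 u^2 h + 2 u' h'."""
    return integrate(3 * u ** 2 * h + 2 * du * dh)

def second_variation():
    """Integral of 6 u h^2 + 2 h'^2 (the mixed derivative D_0 D_1 L vanishes here)."""
    return integrate(6 * u * h ** 2 + 2 * dh ** 2)

eps = 1e-3
phi1 = (F(eps) - F(-eps)) / (2 * eps)              # central difference for phi'(0)
phi2 = (F(eps) - 2 * F(0.0) + F(-eps)) / eps ** 2  # central difference for phi''(0)
```

The difference quotients `phi1` and `phi2` approximate $\phi'(0)$ and $\phi''(0)$ and can be compared with `first_variation()` and `second_variation()`.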
2.7a. Lower semicontinuous functionals. Let $F, G: M \subseteq X \to \mathbb{R}$ be two functionals on the subset $M$ of the real normed space $X$. Show that
(i) If $F$ and $G$ are convex on the convex set $M$, then so is $F + G$.
(ii) If $F$ and $G$ are lower semicontinuous on the closed set $M$, then so is $F + G$.
Solution: Ad (i). Use the definition of convex functionals.
Ad (ii). For $r \in \mathbb{R}$, let $M_r(F) := \{u \in M: F(u) \le r\}$. The set
$$N_r(F) := \{u \in M: F(u) > r\} = M - M_r(F)$$
is relatively open (on $M$) iff $M_r(F)$ is closed (cf. Problem 1.12d). Thus, $F$ is lower semicontinuous on $M$ iff $N_r(F)$ is relatively open for all $r \in \mathbb{R}$. Using all the possible decompositions $r = \alpha + \beta$, we get
$$N_r(F + G) = \bigcup_{\alpha+\beta=r} (N_\alpha(F) \cap N_\beta(G)).$$
By assumption, $N_r(F)$ and $N_r(G)$ are relatively open for all $r \in \mathbb{R}$. Since the intersection of two relatively open sets and the union of an arbitrary family of relatively open sets are again relatively open, the set $N_r(F + G)$ is relatively open for all $r \in \mathbb{R}$, and hence $F + G$ is lower semicontinuous on $M$.
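2.7b. The classical symbols $\underline{\lim}$ and $\overline{\lim}$. Recall the following facts from calculus. Let $(a_n)$ be a sequence of real numbers. Then, the point $a \in [-\infty, \infty]$ is called a cluster point of $(a_n)$ iff there exists a subsequence $(a_{n'})$ such that
$$a = \lim_{n' \to \infty} a_{n'}.$$
On $[-\infty, \infty]$ there are always a smallest and a largest cluster point of $(a_n)$, which we denote by
$$\underline{\lim}_{n \to \infty}\, a_n \quad \text{and} \quad \overline{\lim}_{n \to \infty}\, a_n,$$
respectively. Instead of $\underline{\lim}$ and $\overline{\lim}$, one also uses the notations lim inf and lim sup, respectively. For example, if $a_n = (-1)^n$, then
$$\underline{\lim}_{n \to \infty}\, a_n = -1 \quad \text{and} \quad \overline{\lim}_{n \to \infty}\, a_n = 1.$$
Let $(a_n)$ and $(b_n)$ be two sequences of real numbers. Show that
$$\underline{\lim}_{n \to \infty}\, a_n + \underline{\lim}_{n \to \infty}\, b_n \le \underline{\lim}_{n \to \infty}\, (a_n + b_n),$$
provided the left-hand side does not correspond to the meaningless expressions "$\infty - \infty$" or "$-\infty + \infty$."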
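A small experiment illustrates that the inequality for the lower limit of a sum can be strict. The sketch below replaces the lower and upper limits by the minimum and maximum over a finite tail of the sequence, which is only a stand-in for the true limits; for the periodic sample sequences chosen here the two notions coincide. The sequences and helper names are hypothetical choices.

```python
def tail_inf(seq, start):
    """Minimum over the tail seq[start:], a finite stand-in for the lower limit."""
    return min(seq[start:])

def tail_sup(seq, start):
    """Maximum over the tail seq[start:], a finite stand-in for the upper limit."""
    return max(seq[start:])

N = 2000
a = [(-1) ** n for n in range(1, N + 1)]        # a_n = (-1)^n
b = [(-1) ** (n + 1) for n in range(1, N + 1)]  # b_n = (-1)^(n+1)
s = [p + q for p, q in zip(a, b)]               # a_n + b_n = 0 for all n

start = N // 2
lower_a, upper_a = tail_inf(a, start), tail_sup(a, start)
lower_b = tail_inf(b, start)
lower_sum = tail_inf(s, start)
```

In this example both lower limits equal $-1$, whereas the lower limit of the sum is $0$.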
2.8. Sequentially lower semicontinuous functionals. Let $X$ be a real normed space. The functional $F: M \subseteq X \to \mathbb{R}$ is called sequentially lower semicontinuous at the point $u \in M$ iff
$$F(u) \le \underline{\lim}_{n \to \infty}\, F(u_n)$$
for each sequence $(u_n)$ in $M$ with $u_n \to u$ as $n \to \infty$.
Furthermore, $F: M \subseteq X \to \mathbb{R}$ is called sequentially lower semicontinuous iff it is sequentially lower semicontinuous at each point of $M$.
FIGURE 2.10. (a) Lower semicontinuous; (b) not lower semicontinuous.
Show that
(i) If $F, G: M \subseteq X \to \mathbb{R}$ are sequentially lower semicontinuous at the point $u \in M$, then so is $F + G$.
(ii) Let $M$ be a closed set. Then, the functional $F: M \subseteq X \to \mathbb{R}$ is lower semicontinuous on $M$ iff it is sequentially lower semicontinuous.
(iii) Let $M$ be a closed set and let $\dim X < \infty$ (e.g., $X = \mathbb{R}$). Then, the following three statements are mutually equivalent for the functional $F: M \subseteq X \to \mathbb{R}$:
(a) $F$ is lower semicontinuous.
(b) $F$ is sequentially lower semicontinuous.
(c) $F$ is weakly sequentially lower semicontinuous.
(iv) Let $X := \mathbb{R}$ and $M := [0, 2]$. For $\alpha \le \gamma$, define
$$F(u) := \begin{cases} \alpha & \text{if } 0 \le u < 1, \\ \beta & \text{if } u = 1, \\ \gamma & \text{if } 1 < u \le 2. \end{cases}$$
Then, $F: [0, 2] \to \mathbb{R}$ is sequentially lower semicontinuous at the point $u = 1$ iff $\beta \le \alpha$ (cf. Figure 2.10).
Moreover, $F$ is lower semicontinuous on $[0, 2]$ iff $\beta \le \alpha$.
By (iii), $F$ is weakly sequentially lower semicontinuous on $[0, 2]$ iff it is lower semicontinuous on $[0, 2]$.
Finally, $F$ is continuous on $[0, 2]$ iff $\alpha = \beta = \gamma$.
Hints: Ad (i). Use Problem 2.7b.
Ad (ii). Cf. Zeidler (1986), Vol. 3, p. 165.
Ad (iii). Weak and strong convergence coincide if $\dim X < \infty$.
2.9. Applications to the famous N-body problem. Let us describe the motion of the sun and of $N - 1$ planets through
$$x = x_j(t), \quad j = 1, \ldots, N,$$
where $x_j(t)$ denotes the position vector of the $j$th body at time $t$. The motion of these bodies is governed by the Newtonian equations:
$$m_jx_j'' = K_j, \quad j = 1, \ldots, N, \qquad (126)$$
where the force is given through
$$K_j = \sum_{k=1, k \ne j}^{N} \frac{\gamma m_jm_k(x_k - x_j)}{|x_k - x_j|^3}.$$
Here, $m_j$ denotes the mass of the $j$th body, and $\gamma$ denotes the gravitational constant.
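The force law can be explored numerically. The sketch below computes the forces $K_j$ for three sample bodies and compares them with the negative gradient of a total potential energy in which each pair of bodies is counted once; this total potential, the sample masses and positions, and the helper names are our assumptions for illustration.

```python
import numpy as np

def forces(x, m, gamma=1.0):
    """Newtonian forces K_j = sum over k != j of gamma m_j m_k (x_k - x_j) / |x_k - x_j|^3."""
    K = np.zeros_like(x)
    for j in range(len(m)):
        for k in range(len(m)):
            if k != j:
                d = x[k] - x[j]
                K[j] += gamma * m[j] * m[k] * d / np.linalg.norm(d) ** 3
    return K

def potential(x, m, gamma=1.0):
    """Total potential energy, each pair counted once (an assumption of this sketch)."""
    V = 0.0
    for j in range(len(m)):
        for k in range(j + 1, len(m)):
            V -= gamma * m[j] * m[k] / np.linalg.norm(x[k] - x[j])
    return V

def numerical_gradient(x, m, eps=1e-6):
    """Central-difference gradient of the total potential with respect to all coordinates."""
    G = np.zeros_like(x)
    for j in range(x.shape[0]):
        for c in range(x.shape[1]):
            xp, xm = x.copy(), x.copy()
            xp[j, c] += eps
            xm[j, c] -= eps
            G[j, c] = (potential(xp, m) - potential(xm, m)) / (2 * eps)
    return G

# Three sample bodies in R^3 (hypothetical data).
m = np.array([3.0, 1.0, 2.0])
x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.5]])
K = forces(x, m)
```

The forces add up to zero, and each $K_j$ becomes singular when two bodies collide, that is, when $x_j = x_k$ for some $k \ne j$.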
Existence theorem: Let $T > 0$. Then there exist infinitely many $T$-periodic noncollision solutions.
2.9a. Show that equation (126) is the Euler equation to the following classical variational problem:
$$\int_a^b L(x_1, \ldots, x_N, x_1', \ldots, x_N')\,dt = \text{stationary!},$$
$$x_j(a) = \text{given}, \quad x_j(b) = \text{given}, \quad j = 1, \ldots, N,$$
where the Lagrangian is given by
$L$ = kinetic energy - potential energy.
That is,
$$L = \sum_{j=1}^{N} \Big(\tfrac12m_jx_j'^2 - V_j\Big),$$
along with $K_j = -\mathrm{grad}_{x_j}V_j$ and
$$V_j = -\sum_{k=1, k \ne j}^{N} \frac{\gamma m_jm_k}{|x_k - x_j|}.$$
2.9b.* Study the proof of the existence theorem in the monograph by Ambrosetti and Coti-Zelati (1993). The proof is based on modern variational methods (e.g., the Ljusternik-Schnirelman theory, the Morse theory, and the mountain pass theorem). Observe that the force $K_j$ becomes singular for $x_j = x_k$, which corresponds to a collision. This complicates the proof considerably.
As an introduction to the N-body problem we recommend Meyer and Hall (1992).
2.10. Nonlinear elasticity. Let $G$ be a nonempty, bounded, open, connected set in $\mathbb{R}^3$ that has a sufficiently smooth boundary. We want to describe the deformation of an elastic body by the vector equation
$$y = x + u(x). \qquad (127)$$
Here, $x$ denotes the position vector of a point in the undeformed region $G$. Under a deformation, the position vector $x$ is transformed into the new position vector $y$ (cf. Figure 2.11). The basic variational problem in nonlinear elasticity reads as follows:
$$\int_G L(\mathcal{E}(u'), x)\,dx - \int_G Ku\,dx = \min!, \qquad u = u_0 \text{ on } \partial G. \qquad (128)$$
This is the principle of minimal stored energy. We use the following notation:
$u$ = deformation vector,
$L$ = stored energy function,
$K$ = density of the outer forces,
$\mathcal{E}$ = deformation tensor,
$\sigma$ = stress tensor (first Piola-Kirchhoff stress tensor).
The deformation $u_0$ of the boundary is given. We are looking for the deformation $u$ of the body. The terms appearing in (128) possess the following physical meaning:
$\int_G L(\mathcal{E}(u'), x)\,dx$ = elastic energy of the body stored by the deformation,
$-\int_G Ku\,dx$ = -(work done by the outer forces) = stored potential energy.
Let us use a Cartesian coordinate system with the orthonormal basic vectors $e_1, e_2, e_3$. In what follows we sum over two equal indices from 1 to 3. Set
$$x = x_je_j, \quad u = u_je_j, \quad K = K_je_j,$$
and $\partial_j = \partial/\partial x_j$. Then
$$\mathcal{E} = \tfrac12(u'(x) + u'(x)^* + u'(x)^*u'(x)).$$
Explicitly, the components of $\mathcal{E}$ are given by
$$\tfrac12(\partial_iu_j + \partial_ju_i + \partial_iu_k\partial_ju_k), \quad i, j = 1,2,3.$$
FIGURE 2.11. The deformation vector $u(x)$.
Using components, the variational problem (128) reads as follows:
$$\int_G L(\mathcal{E}, x)\,dx - \int_G K_ju_j\,dx = \text{stationary!}, \qquad u_j = u_{j0} \text{ on } \partial G. \qquad (128^*)$$
2.10a. The equilibrium equation. Show that each sufficiently smooth solution to the principle (128) of stationary energy satisfies the following Euler-Lagrange equation:
$$\mathrm{div}\,\sigma = K \text{ on } G, \qquad u = u_0 \text{ on } \partial G. \qquad (129)$$
Explicitly,
$$\partial_i\sigma_{ij} = K_j \text{ on } G, \qquad u_j = u_{j0} \text{ on } \partial G, \quad j = 1,2,3, \qquad (129^*)$$
where
$$\sigma_{ij} := \frac{\partial L}{\partial(\partial_iu_j)}.$$
Solution: Use Remark 4 in Section 2.2.
Remark: Equation (129) is called the equilibrium condition. It describes the balance between the outer forces and the stress forces. Let $H$ be a sufficiently regular subregion of $G$, where $n$ denotes the outer unit normal vector to a boundary point of $H$. Under the deformation (127), the subregion $H$ is transformed onto the deformed subregion $H'$. Then
$\int_H K\,dx$ = outer force acting on the deformed subregion $H'$,
$-\int_H (\mathrm{div}\,\sigma)\,dx$ = stress force acting on $H'$.
Integration by parts yields
$$\int_H (\mathrm{div}\,\sigma)\,dx = \int_{\partial H} (\sigma n)\,dS,$$
where $\sigma n = (\sigma_{ij}n_j)e_i$.
A detailed discussion of both the physical and the mathematical background can be found in Zeidler (1986), Vol. 4, Chapter 61.
2.10b. Convex approximation models. Since it is difficult to solve the highly nonlinear equilibrium equations (129), physicists and engineers consider approximation models. To this end, they replace the nonlinear deformation tensor $\mathcal{E}$ with the linear approximation
$$\gamma_{ij} = \tfrac12(\partial_iu_j + \partial_ju_i).$$
If the stored energy function $L$ is now convex with respect to the first partial derivatives $\partial_iu_j$, then we can apply the existence theorem from Section 2.6.
(i) Generalize Proposition 1 from Section 2.6 to problem (128).
(ii) In addition, study such approximation models along with a duality theory in Zeidler (1986), Vol. 4, Chapter 62.
2.10c.* Polyconvex material. It turns out that convex models, as considered in Problem 2.10b, are never rigorous models. In 1977 John Ball introduced a class of rigorous models in nonlinear elasticity based on the notion of polyconvexity.
Theorem: Problem (128) has a solution
$$u_j \in W_p^1(G), \quad j = 1,2,3, \quad 2 \le p < \infty,$$
where
$$\det(I + u'(x)) > 0 \quad \text{for almost all } x \in G,$$
provided the following assumptions are satisfied:
(H1) Polyconvexity. The stored energy function $L$ is polyconvex, that is, we have
$$L = P(A, \mathrm{adj}\,A, \det A),$$
where
$$A := I + u'(x),$$
and $P$ is a convex continuous function of the three arguments $A$, $\mathrm{adj}\,A$, and $\det A > 0$. Explicitly, $A = (a_{ij})$ is a real $(3 \times 3)$-matrix with
$$a_{ij} := \delta_{ij} + \partial_iu_j.$$
Furthermore, $\mathrm{adj}\,A := (\det A)A^{-1}$. Denote the space of all real $(3 \times 3)$-matrices by $M(3, 3)$. Polyconvexity means that
$$P(tA + (1 - t)B,\ t\,\mathrm{adj}\,A + (1 - t)\,\mathrm{adj}\,B,\ t\det A + (1 - t)\det B)$$
$$\le tP(A, \mathrm{adj}\,A, \det A) + (1 - t)P(B, \mathrm{adj}\,B, \det B)$$
for all $t \in [0, 1]$ and all $A, B \in M(3, 3)$ with $\det A, \det B > 0$.
(H2) Coerciveness. There is a constant $c > 0$ such that
$$P(A, \mathrm{adj}\,A, \det A) \ge c(|A|^p + |\mathrm{adj}\,A|^r + |\det A|^s) + \text{const}$$
for all $A \in M(3, 3)$ with $\det A > 0$. Here, $|A| := \max_{i,j} |a_{ij}|$, and
$$2 \le p < \infty, \quad \frac1p + \frac1q = 1, \quad q \le r < \infty, \quad 1 < s < \infty.$$
(H3) The limit $\det A \to 0$. We have
$$P(A_n, B_n, d_n) \to +\infty \quad \text{as } n \to \infty$$
if $|A_n - A| + |B_n - B| \to 0$ as $n \to \infty$ in $M(3, 3)$ and $d_n \to +0$ as $n \to \infty$ in $\mathbb{R}$.
(H4) Boundary displacement. Let $u_0$ be a given $C^1$-vector field on $G$ such that
$$\det(I + u_0'(x)) > 0 \quad \text{on } G.$$
(H5) Outer forces. Let $K_j \in L_q(G)$ for all $j$.
For $q = 2$, the spaces $L_q(G)$ and $W_q^1(G)$ were introduced in Chapter 2 of AMS Vol. 108. The general definition for $q \ge 1$ can be found in Problems 5.9 and 5.12.
Hint: Study the proof in Zeidler (1986), Vol. 4, Section 62.13. The point is that, surprisingly enough, $\det A$ and $\mathrm{adj}\,A$ possess nice properties with respect to weak convergence in Sobolev spaces.
For example, rubberlike material is polyconvex (cf. Zeidler (1986), Vol. 4, Section 61.8).
2.11. A special Lagrange multiplier rule. Consider the minimum problem
    $f(u) = \min!$, $u \in X$,
    $\Gamma(u) = \alpha$ (side condition)                                  (130)
for fixed real $\alpha \ne 0$. Assume the following:
(H1) $f: X \to \mathbb{R}$ is a functional on the real Banach space $X$.
(H2) The first variation $\delta f(u; h)$ exists for all $u, h \in X$, and the map
$h \mapsto \delta f(u; h)$ is linear on $X$ for each $u \in X$.
(H3) The functional $\Gamma: X \to \mathbb{R}$ is linear and continuous.
Show that if $u_0$ is a solution to problem (130), then there exists a real
number $\lambda$ such that
    $\delta f(u_0; h) + \lambda\,\delta\Gamma(u_0; h) = 0$ for all $h \in X$,
    $\Gamma(u_0) = \alpha$.                                                  (131)
The number $\lambda$ is called a Lagrange multiplier.
Note that the multiplier $\lambda$ is uniquely determined by (131). In fact, since
$\delta\Gamma(u_0; h) = \Gamma(h)$ for all $h \in X$ and $\Gamma(u_0) = \alpha$, we obtain
$\delta f(u_0; u_0) + \lambda\Gamma(u_0) = 0$. Hence
    $\lambda = -\alpha^{-1}\,\delta f(u_0; u_0)$.
In particular, if the Gâteaux derivative $f'(u_0)$ exists, then (131) is
equivalent to the following condition:
    $f'(u_0) + \lambda\Gamma'(u_0) = 0$,
    $\Gamma(u_0) = \alpha$.                                                  (131*)
This is a special case of the general Lagrange multiplier rule to be
considered in Section 4.14.
Solution: Define
    $Ph := h - \alpha^{-1}\Gamma(h)\,u_0$ for all $h \in X$.                 (132)
Then, $\Gamma(Ph) = 0$ for all $h \in X$. Fix $h \in X$. Set
    $\varphi(t) := f(u_0 + tPh)$ for all $t \in \mathbb{R}$.
Since $\Gamma(u_0 + tPh) = \alpha$ for all $t \in \mathbb{R}$, the function $\varphi$ has a
minimum at the point $t = 0$. This implies $\varphi'(0) = 0$. Hence
    $\varphi'(0) = \delta f(u_0; Ph) = 0$ for all $h \in X$.
[FIGURE 2.12: panels (a) and (b), showing the fluid with coordinate axes x and y.]
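By (132) and the linearity of $h \mapsto \delta f(u_0; h)$,
    $\delta f(u_0; h) - \alpha^{-1}\Gamma(h)\,\delta f(u_0; u_0) = 0$ for all $h \in X$.
It follows from (H3) that $\delta\Gamma(u_0; h) = \Gamma(h)$ for all $h \in X$. Thus,
with $\lambda := -\alpha^{-1}\delta f(u_0; u_0)$, we obtain (131).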
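The multiplier rule can be checked numerically in a finite-dimensional toy case. The following Python sketch is only an illustration: the choice $X = \mathbb{R}^2$, the functional `f`, the linear functional `gamma`, the value `ALPHA`, and all helper names are assumptions made here, not part of the problem. It approximates the first variation by central differences and evaluates the left-hand side of (131) with $\lambda = -\alpha^{-1}\delta f(u_0; u_0)$.

```python
# Assumed toy data (not from the text): X = R^2, f(u) = u1^2 + 2*u2^2,
# Gamma(u) = u1 + u2 (linear), alpha = 3.
ALPHA = 3.0

def f(u):
    return u[0] ** 2 + 2.0 * u[1] ** 2

def gamma(u):
    return u[0] + u[1]

def first_variation(func, u, h, eps=1e-6):
    """Central-difference approximation of d/dt func(u + t*h) at t = 0."""
    plus = (u[0] + eps * h[0], u[1] + eps * h[1])
    minus = (u[0] - eps * h[0], u[1] - eps * h[1])
    return (func(plus) - func(minus)) / (2.0 * eps)

# Minimizer of f on the line u1 + u2 = 3, found by hand:
# d/du1 [u1^2 + 2*(3 - u1)^2] = 0 gives u1 = 2, u2 = 1.
U0 = (2.0, 1.0)

def multiplier(u0):
    """lambda = -alpha^{-1} * delta f(u0; u0)."""
    return -first_variation(f, u0, u0) / ALPHA

def residual(u0, h):
    """Left-hand side of (131): delta f(u0; h) + lambda * Gamma(h)."""
    return first_variation(f, u0, h) + multiplier(u0) * gamma(h)
```

For this toy problem the residual vanishes (up to rounding) for every direction `h`, which is what (131) asserts.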
2.12. Capillary surfaces, natural boundary conditions, and experiments
performed in space shuttles. We want to show how to use the preceding
Lagrange multiplier rule in order to obtain important information about an
interesting problem in fluid dynamics.
2.12a. The principle of minimal stored energy for capillary surfaces due
to Gauss. Let us consider a fluid of constant density in a container under
the influence of the gravitational force. We are looking for the shape of the
free surface of the fluid (cf. Figure 2.12).
From the physical point of view, two additional forces appear, namely,
(i) the capillary force at the free surface,
(ii) the adhesion force at the wall.
The capillary force is due to the fact that the fluid molecules at the free
surface are not completely surrounded by fluid molecules.
Let us use a Cartesian (x, y, z)-coordinate system with the corresponding
orthonormal basis vectors i, j, and k. Then the equation of the free surface
is given by
    $z = u(x, y)$ on $G$.
The fundamental variational problem for determining the free surface reads
as follows:
    $\sigma \int_G \sqrt{1 + u_x^2 + u_y^2}\,dx\,dy - \sigma\beta \int_{\partial G} u\,ds + \int_G \tfrac{1}{2}\rho g u^2\,dx\,dy = \min!$,   (133)
    $\int_G u\,dx\,dy = V$ (side condition).
We are given the volume $V$ of the fluid. The set $G$ is assumed to be
nonempty, bounded, open, and connected, where the boundary $\partial G$ is
sufficiently regular.
This variational principle corresponds to the principle of minimal stored
energy. It resembles the principle (128) of minimal energy in elasticity. The
terms appearing in (133) possess the following physical meaning:
    $\sigma \int_G \sqrt{1 + u_x^2 + u_y^2}\,dx\,dy$ = energy stored by the
        deformation of the free surface,
    $-\sigma\beta \int_{\partial G} u\,ds$ = -(work done by the adhesion forces at the wall)
        = stored potential energy,
    $\int_G \tfrac{1}{2}\rho g u^2\,dx\,dy$ = -(work done by the gravitational force)
        = stored potential energy.
Here we start with the simplest possible assumption, which says that the
surface energy is proportional to the surface area. Moreover, we assume
that the work done by the adhesion forces at the wall is also proportional
to the area of the wall wetted by the fluid (s = arclength of the boundary
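curve $\partial G$). In addition, we are given the following positive constants:
    $\sigma$ = surface tension,
    $\rho$ = density of the fluid,
    $\beta$ = relative adhesion coefficient,
    $g$ = gravitational acceleration,
    $\kappa$ = capillary constant ($\kappa = \rho g/\sigma$).
Let
    $T := \dfrac{\operatorname{grad} u}{\sqrt{1 + |\operatorname{grad} u|^2}}$.
Show that a sufficiently smooth solution $u$ of the variational problem (133)
satisfies the following conditions:
    $\operatorname{div} T = \kappa u + \lambda$ on $G$,                      (134)
    $Tn = \beta$ on $\partial G$,
where $n$ denotes the outer unit normal vector to the boundary $\partial G$.
Furthermore,
    $\lambda = \beta\,\dfrac{\operatorname{length}(\partial G)}{\operatorname{meas}(G)} - \kappa\,\dfrac{V}{\operatorname{meas}(G)}$,   (135)
and
    $\cos\gamma = \beta$.                                                    (136)
Here, $\gamma$ denotes the contact angle between the free surface and the wall (cf.
Figure 2.12(a)).
Explicitly, equation (134) reads as follows:
    $\dfrac{\partial}{\partial x}\left(\dfrac{u_x}{\sqrt{1 + u_x^2 + u_y^2}}\right) + \dfrac{\partial}{\partial y}\left(\dfrac{u_y}{\sqrt{1 + u_x^2 + u_y^2}}\right) = \kappa u + \lambda$ on $G$.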
Geometrical discussion. The outer unit normal vector $N$ to the free
surface is given by
    $N = \dfrac{k - u_x i - u_y j}{\sqrt{1 + u_x^2 + u_y^2}}$.
Hence
    $\cos\gamma = -nN = Tn$.
By the boundary condition in (134), $\cos\gamma = \beta$. This is (136). The quantity
    $H := \tfrac{1}{2}\operatorname{div} T$                                  (137)
represents the mean curvature of the free surface.
Physical discussion. From (136) we obtain the important physical fact
that the contact angle $\gamma$ is constant, and $\gamma$ depends only on the material
constant $\beta$, not on the shape of $\partial G$.
In a space ship, the gravitational force is weak, that is, the gravitational
acceleration $g$ is small. If we set $g = 0$, then $\kappa = 0$. In this case, the basic
equation (134) is specialized to the following equation for capillary surfaces
without gravity:
    $\operatorname{div} T = 2H$ on $G$,
    $Tn = \beta$ on $\partial G$,                                            (138)
and
    $\cos\gamma = \beta$, $2H = \beta\,\dfrac{\operatorname{length}(\partial G)}{\operatorname{meas}(G)}$.
Consequently, capillary surfaces without gravity are surfaces of constant
mean curvature that have a constant contact angle.
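As a plausibility check of (138), the following Python sketch takes $G$ to be a disk and tests a lower spherical cap as candidate free surface. The disk radius `A`, the value `BETA`, and the finite-difference helpers are illustrative assumptions introduced here; the sketch only verifies numerically that such a cap has constant `div T`, that this constant equals `beta * length / meas`, and that `Tn = beta` on the boundary circle.

```python
import math

A = 1.0         # radius of the disk G (assumed for illustration)
BETA = 0.5      # relative adhesion coefficient, i.e. cos(gamma)
RS = A / BETA   # radius of the sphere that carries the cap

def u(x, y):
    """Lower spherical cap over the disk; an additive constant is irrelevant."""
    return -math.sqrt(RS ** 2 - x ** 2 - y ** 2)

def T(x, y, eps=1e-5):
    """T = grad u / sqrt(1 + |grad u|^2), with grad u by central differences."""
    ux = (u(x + eps, y) - u(x - eps, y)) / (2 * eps)
    uy = (u(x, y + eps) - u(x, y - eps)) / (2 * eps)
    w = math.sqrt(1.0 + ux ** 2 + uy ** 2)
    return ux / w, uy / w

def div_T(x, y, eps=1e-4):
    """Central-difference approximation of div T."""
    dtx = (T(x + eps, y)[0] - T(x - eps, y)[0]) / (2 * eps)
    dty = (T(x, y + eps)[1] - T(x, y - eps)[1]) / (2 * eps)
    return dtx + dty

def T_dot_n(angle):
    """Tn at the boundary point with polar angle `angle` (n = outer normal)."""
    nx, ny = math.cos(angle), math.sin(angle)
    tx, ty = T(A * nx, A * ny)
    return tx * nx + ty * ny

# Right-hand side of the last formula in (138): beta * length / meas.
TWO_H = BETA * (2 * math.pi * A) / (math.pi * A ** 2)
```

Evaluating `div_T` at a few interior points and `T_dot_n` at a few boundary angles reproduces `TWO_H` and `BETA` up to discretization error.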
Since the gravitational force in a space ship is weak, the capillary force
plays a fundamental role for handling fluids like fuel. Recently, NASA
performed experiments on a space shuttle based on the mathematical theory
of capillary surfaces. Further experiments are planned.
Historical remark. The differential equation (138) along with the
corresponding boundary condition dates back to papers written by Young in
1805 and Laplace in 1806. They used ingenious heuristic arguments that
have become standard in the engineering literature. The rigorous approach
given here is based on a method Gauss proposed in 1830.
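Solution: We will use the Lagrange multiplier rule from Problem 2.11.
Set
    $f(u) := \sigma \int_G \sqrt{1 + u_x^2 + u_y^2}\,dx\,dy - \sigma\beta \int_{\partial G} u\,ds + \int_G \tfrac{1}{2}\rho g u^2\,dx\,dy$
and
    $\Gamma(u) := \int_G u\,dx\,dy$.
Let $X := C^1(\overline{G})$. Choose $h \in X$ and $t \in \mathbb{R}$. Recall that
    $\delta f(u; h) := \dfrac{d f(u + th)}{dt}\Big|_{t=0}$.
Then
    $\delta f(u; h) = \sigma \int_G \dfrac{u_x h_x + u_y h_y}{\sqrt{1 + u_x^2 + u_y^2}}\,dx\,dy - \sigma\beta \int_{\partial G} h\,ds + \int_G \rho g u h\,dx\,dy$,
and
    $\delta\Gamma(u; h) = \int_G h\,dx\,dy$.
By Problem 2.11, there is a real number $\lambda$ such that
    $\delta f(u; h) + \sigma\lambda\,\delta\Gamma(u; h) = 0$ for all $h \in C^1(\overline{G})$.
Integration by parts yields
    $\int_G (\sigma\lambda + \rho g u - \sigma \operatorname{div} T)h\,dx\,dy + \sigma \int_{\partial G} (Tn - \beta)h\,ds = 0$   (139)
for all $h \in C^1(\overline{G})$.
(a) From (139) we obtain
    $\int_G (\sigma\lambda + \rho g u - \sigma \operatorname{div} T)h\,dx\,dy = 0$ for all $h \in C_0^\infty(G)$.
Hence
    $\sigma\lambda + \rho g u - \sigma \operatorname{div} T = 0$ on $G$.     (140)
(b) By (139) and (140),
    $\int_{\partial G} (Tn - \beta)h\,ds = 0$ for all $h \in C^1(\overline{G})$.   (141)
If the boundary $\partial G$ is sufficiently regular, then $C^1(\overline{G})$ is dense in $L_2(\partial G)$
(cf. Zeidler (1986), Vol. 2A, Section 21.3). Therefore, it follows from (141)
that
    $Tn - \beta = 0$ on $\partial G$.
Since this boundary condition follows automatically from the variational
principle, it is called a natural boundary condition.
(c) Integrating equation (140) over $G$ and using integration by parts, we
obtain
    $\sigma\lambda \int_G dx\,dy + \int_G \rho g u\,dx\,dy - \sigma \int_{\partial G} Tn\,ds = 0$.
Hence
    $\sigma\lambda \operatorname{meas}(G) + \rho g V - \sigma\beta \operatorname{length}(\partial G) = 0$.
Dividing by $\sigma \operatorname{meas}(G)$, we get (135).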
2.12b. Regions that have a corner. Suppose that the region $G$ contains
a corner with interior angle $2\alpha$. Show the following:
(i) If a (sufficiently regular) capillary surface without gravity exists, then
the angle $\alpha$ must be sufficiently large, that is,
    $\alpha \ge \dfrac{\pi}{2} - \gamma$,                                    (142)
where $\gamma$ denotes the contact angle.
(ii) For each angle $\alpha$ that satisfies condition (142), there exists a capillary
surface without gravity provided the boundary of $G$ is a polygon.
This theorem was proved by Concus and Finn in 1974.
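Solution: Ad (i). Consider a situation as pictured in Figure 2.12(b).
We are given a capillary surface $z = u(x, y)$ that satisfies equation (138).
Integration over the subregion $\Omega$ yields
    $\int_\Omega \operatorname{div} T\,dx\,dy = 2H \int_\Omega dx\,dy$.
If we use integration by parts, then we obtain
    $\int_{\partial\Omega} Tn\,ds = 2H \operatorname{meas}(\Omega)$.          (143)
On the part $\partial_1\Omega$ of $\partial\Omega$ that lies on the wall $\partial G$, we have
    $Tn = \beta$,
where $\beta = \cos\gamma$. Moreover, $|T| < 1$. Hence
    $|Tn| < 1$
on the remaining part $\partial_2\Omega$ of $\partial\Omega$.
Thus, from (143) we obtain the key inequality
    $2H \operatorname{meas}(\Omega) > (\cos\gamma)\operatorname{length}(\partial_1\Omega) - \operatorname{length}(\partial_2\Omega)$.   (144)
By Figure 2.12(b),
    $\operatorname{length}(\partial_1\Omega) = 2R$, $\operatorname{length}(\partial_2\Omega) = 2R\sin\alpha$, $\operatorname{meas}(\Omega) = \text{const}\cdot R^2$.
Dividing relation (144) by $R$ and letting $R \to 0$, we get
    $\cos\gamma \le \sin\alpha$.
This implies $\sin\left(\frac{\pi}{2} - \gamma\right) \le \sin\alpha$. Hence, $\frac{\pi}{2} - \gamma \le \alpha$.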
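The limiting argument can be made concrete with a few lines of Python. This is a hedged sketch: it assumes that the subregion $\Omega$ is the isosceles triangle cut off from the corner by a chord, with two wall segments of length $R$ (so that its area is $\frac{1}{2}R^2\sin 2\alpha$), and the function names are introduced here for illustration only.

```python
import math

def admits_capillary_surface(alpha, gamma):
    """Necessary corner condition (142): alpha >= pi/2 - gamma."""
    return alpha >= math.pi / 2 - gamma

def rhs_over_R(alpha, gamma):
    """Right-hand side of (144) divided by R; it does not depend on R."""
    return 2.0 * (math.cos(gamma) - math.sin(alpha))

def lhs_over_R(two_H, alpha, R):
    """2H*meas(Omega)/R with meas(Omega) = (R^2/2)*sin(2*alpha) (assumed shape)."""
    return two_H * 0.5 * R * math.sin(2 * alpha)
```

For a fixed value of `two_H`, `lhs_over_R` tends to zero as `R` shrinks, so (144) can only hold for all small `R` if `rhs_over_R` is not positive, which is the inequality $\cos\gamma \le \sin\alpha$.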
Ad (ii). Use spherical caps as free surfaces (cf. Finn (1984), p. 136).
The monograph by Finn (1984) is a modern standard text on capillary
surfaces.
2.13. The Dirichlet principle and existence theorems for minimal surfaces.
Let us discuss some important results concerning minimal surfaces. This
is still a very active area of research. The minimal surface problem (or the
Plateau problem) reads as follows: For a given boundary curve $\gamma$, we are
looking for a minimal surface spanned through $\gamma$.
By definition, a surface is called a minimal surface iff the mean curva-
ture vanishes at each interior point. For example, if the smooth surface is
described by the equation z = u(x, y), then it is a minimal surface iff it is
a solution of the partial differential equation given in Problem 2.3.
Each minimal surface locally represents a surface of least area.
2.13a. Conformal coordinates. The introduction of such coordinates
elegantly reduces the minimal surface problem to the Laplacian. Let us
consider a Cartesian coordinate system in $\mathbb{R}^3$ with the orthonormal vectors i,
j, and k. Set $\mathbf{x} = x\,i + y\,j + z\,k$. We are looking for a minimal surface
    $\mathbf{x} = \mathbf{x}(v, w)$, $(v, w) \in D$,
where $D := \{(v, w): v^2 + w^2 < 1\}$ denotes the unit disk. The parameters
$v, w$ are supposed to be conformal, that is, the partial derivatives $\mathbf{x}_v$ and
$\mathbf{x}_w$ satisfy the two conditions
    $\mathbf{x}_v^2 = \mathbf{x}_w^2$ and $\mathbf{x}_v\mathbf{x}_w = 0$ on $D$.   (145a)
If we use conformal coordinates, then the vanishing of mean curvature
means that
    $\Delta\mathbf{x} = 0$ on $D$.                                           (145b)
Finally, we have to add the boundary condition:
    $\mathbf{x}$ = boundary curve $\gamma$ on $\partial D$.                   (145c)
2.13b. The classical existence theorem for the Plateau problem. We are
given the closed $C^1$-Jordan curve $\gamma$.$^{22}$ Then there exists a minimal surface
bounded by the curve $\gamma$.
More precisely, there exists a continuous map $\mathbf{x}: \overline{D} \to \mathbb{R}^3$, which is
analytic on $D$, such that the conditions in (145) are satisfied. Moreover, the
map $\mathbf{x}: \partial D \to \gamma$ is a homeomorphism.
Hint: A fairly elementary proof of this theorem (essentially due to Courant)
can be found in the lecture notes by Jost (1994). The idea of the proof is
to use the following variational problem:
    $\int_D (\mathbf{x}_v^2 + \mathbf{x}_w^2)\,dv\,dw = \min!$,              (146)
    $\mathbf{x}$ = boundary curve $\gamma$ on $\partial D$.
Observe that the Euler-Lagrange equation to (146) is precisely (145b).
First one constructs a minimizing sequence $(\mathbf{x}_n)$ to (146). The point is then
to show that this sequence converges to a solution to (145). To this end, one
uses the famous Courant-Lebesgue lemma and the Arzelà-Ascoli theorem.
Study this proof in Jost (1994). More general results can be found in
Dierkes, Hildebrandt, Küster, and Wohlrab (1992), Vol. 1.
2.13c.* Regularity up to the boundary.$^{23}$ If the boundary curve $\gamma$ is $C^{m,\alpha}$
for fixed $m = 1, 2, \ldots$, and $0 < \alpha < 1$, then the map $\mathbf{x}: \overline{D} \to \mathbb{R}^3$ from
Problem 2.13b is also $C^{m,\alpha}$.
Hint: Cf. Dierkes, Hildebrandt, Küster, and Wohlrab (1992), Vol. 2,
Chapter 7. The proof is based on sophisticated properties of harmonic
functions and on differential inequalities.
2.13d.* Generic finiteness. For most boundary curves $\gamma$, the number of
minimal surfaces spanned through $\gamma$ is finite.
Hint: This is a typical modern result based on the Sard-Smale theorem
(cf. Section 5.15). A detailed study can be found in Tromba (1977).
Historical remarks. Lagrange formulated the minimal surface equation
in 1762 (cf. Problem 2.3). In the nineteenth century, the physicist Plateau
performed numerous soap film experiments. Douglas and Radó in 1930
independently proved the general existence theorem from Problem 2.13b.
Douglas also proved existence theorems in the case where a finite number of
boundary curves is given. For his research on minimal surfaces, Jesse
Douglas was awarded the first Fields medal at the International Mathematical
Congress held in Oslo in 1936.
22 Recall that a closed Jordan curve is the image $\psi(\partial D)$ of a homeomorphism
$\psi: \partial D \to \mathbb{R}^3$. In addition, we assume that the map $\psi$ is $C^1$. Roughly speaking,
reasonably smooth closed curves are of this type.
23 The class $C^{m,\alpha}$ of Hölder spaces was introduced in Problem l.S of AMS Vol.
108.
The boundary theorem from Problem 2.13c was first proved by Hildebrandt
in 1966 (for $m \ge 4$). Böhme and Tromba obtained the generic
finiteness theorem in 1977.
The Plateau problem in higher dimensions. In order to solve the
minimal surface problem in higher dimensions, according to De Giorgi, one
uses the concept of generalized surface area based on functions of bounded
variation. This leads to extremely weak solutions. The main task is then to
prove the regularity or the partial regularity of those very weak solutions.
This can be found in the monograph by Giusti (1984). The main ideas are
explained in Zeidler (1986), Vol. 2B, pp. 1114ff.
Standard texts on minimal surfaces include the two monographs by
Dierkes, Hildebrandt, Küster, and Wohlrab (1992), Vols. 1, 2, and that
by Nitsche (1992). We also recommend the monograph by Struwe (1988).
The minimal surface problem demonstrates how a simple question arising
from our real world leads to the invention of sophisticated techniques in
mathematics.
Harmonic maps. The solution $\mathbf{x} = \mathbf{x}(v, w)$ of the variational problem
(146) represents a special case of a harmonic map. As an introduction to
the modern theory of harmonic maps, we recommend the monograph by
Jost (1984). Many problems in physics that are governed by the principle
of minimal stored energy are described by harmonic maps (e.g., liquid
crystals$^{24}$). In Problem 2.14 we will consider the Landau-Ginzburg model
in superconductivity and superfluidity.
2.14. Singular variational problems, phase transitions, and the
Landau-Ginzburg model in superconductivity and superfluidity. Let $G$ be a nonempty
bounded open set in $\mathbb{R}^2$. Set $S^1 := \{z \in \mathbb{C}: |z| = 1\}$.
2.14a. Harmonic maps from $G$ to $\mathbb{C}$. Let us consider a map
    $\psi: G \to \mathbb{C}$,
where $\psi(x) = u(x) + v(x)i$, and $x = (\xi, \eta)$. Set $|\nabla\psi|^2 := |\psi_\xi|^2 + |\psi_\eta|^2$.
Hence
    $|\psi|^2 = u^2 + v^2$,
and $|\nabla\psi|^2 = u_\xi^2 + u_\eta^2 + v_\xi^2 + v_\eta^2$.
Show that each sufficiently smooth solution to the variational problem
    $\int_G |\nabla\psi|^2\,dx = \min!$,                                     (147)
    $\psi = g$ on $\partial G$
satisfies the Euler-Lagrange equation
    $-\Delta\psi = 0$ on $G$,                                                (148)
    $\psi = g$ on $\partial G$.
24 As an introduction to the modern mathematical theory of liquid crystals, we
recommend the proceedings edited by Ericksen and Kinderlehrer (1987).
Each solution $\psi: G \to \mathbb{C}$ of the first equation in (148) is called a harmonic
map from $G$ to $\mathbb{C}$.
2.14b. The Landau-Ginzburg model. Let $G$ be the interior of a closed
$C^\infty$-Jordan curve $C$ in $\mathbb{R}^2$. We are given the $C^\infty$-map $g: C \to \mathbb{C}$, where
$|g(x)| = 1$ on $C$. Let $d > 0$ denote the winding number of $g$. That is, if $C$
is traversed once counterclockwise, then the origin is surrounded $d$ times
counterclockwise by the image curve $g(C)$. Let $\varepsilon > 0$. Consider the following
problem of minimal stored energy:
    $\int_G |\nabla\psi_\varepsilon|^2\,dx + \dfrac{1}{2\varepsilon^2} \int_G (1 - |\psi_\varepsilon|^2)^2\,dx = \min!$,   (149)
    $\psi_\varepsilon = g$ on $\partial G$,
    $\operatorname{Re}\psi_\varepsilon, \operatorname{Im}\psi_\varepsilon \in W_2^1(G)$.
Show that each sufficiently smooth solution of (149) satisfies the
Landau-Ginzburg equation:
    $-\Delta\psi_\varepsilon = \dfrac{1}{\varepsilon^2}\,\psi_\varepsilon (1 - |\psi_\varepsilon|^2)$ on $G$,   (150)
    $\psi_\varepsilon = g$ on $\partial G$.
2.14c.** The delicate limiting process $\varepsilon \to 0$. The following conditions
are met:
(i) For each $\varepsilon > 0$, the variational problem (149) has a smooth solution
$\psi_\varepsilon$, which satisfies equation (150). In addition, $|\psi_\varepsilon(x)| \le 1$ on $\overline{G}$.
If $\varepsilon$ is sufficiently small (i.e., $0 < \varepsilon < \varepsilon_0$), then $\psi_\varepsilon$ has precisely $d$
zeros on $G$. These zeros possess the index one.$^{25}$
(ii) There exist precisely $d$ points $p_1, \ldots, p_d$ in $G$ and a sequence $(\varepsilon_n)$, $\varepsilon_n \to 0$
as $n \to \infty$, such that the limit
    $\psi(x) := \lim_{n\to\infty} \psi_{\varepsilon_n}(x)$
exists for all points $x \in \overline{G} - \{p_1, \ldots, p_d\}$, uniformly on each compact
subset of $\overline{G} - \{p_1, \ldots, p_d\}$.
(iii) The function $\psi$ satisfies the equation
    $-\Delta\psi(x) = 0$ and $|\psi(x)| = 1$,
    $\psi = g$ on $\partial G$.
25 By definition, the index of an isolated zero $p_j$ of $\psi$ is the winding number
of $\psi$ with respect to a sufficiently small circle centered at the point $p_j$. This
definition does not depend on the choice of the circle (cf. the mapping degree in
Zeidler (1986), Vol. 1, Chapter 12).
(iv) Let us write the complex number $z = \xi + \eta i$ instead of $x$, and let $z_j$
correspond to the point $p_j$. Then, $\psi$ behaves like
    $a_j\,\dfrac{z - z_j}{|z - z_j|}$
near the singular point $z_j$, where $a_j$ is a complex constant with $|a_j| = 1$.
More precisely,
Thus, the limit function $\psi$ is a harmonic map from $G - \{p_1, \ldots, p_d\}$ onto
the unit circle $S^1$ that has singularities at the points $p_1, \ldots, p_d$.
Hint: The proofs along with further information about the computation
of the singular points can be found in the monograph by Bethuel, Brezis,
and Hélein (1994).
Physical interpretation. For a superconductor, there is a critical (low)
absolute temperature $T_c$ such that, for absolute temperatures $T$,
    $0 < T < T_c$,
superconducting regions appear. These regions correspond to supercurrents
of electrons. The passage from normal conductivity to superconductivity is
called a phase transition.
The complex function $\psi = \psi(x)$ is called an order parameter (or a Higgs
field$^{26}$). By definition,
    $|\psi(x)|$ := density of superconducting electrons at the point $x$.
The physical units are normalized such that $0 \le |\psi(x)| \le 1$. Let $\delta > 0$ be
a sufficiently small number. By definition,
    if $1 - \delta < |\psi(x)| \le 1$, then there is a superconducting state at the point $x$;
    if $|\psi(x)| < \delta$, then there is a normal state at the point $x$.
If we use the representation
    $\psi(x) = \rho(x)e^{iS(x)}$,
where $S(x)$ is a real phase factor, then the vector $\operatorname{grad} S(x)$ is proportional
to the velocity vector of the supercurrent at the point $x$.
26 In the standard model of elementary particle physics, the Higgs field
corresponds to a hypothetical Higgs particle that is responsible for the mass of the
gauge particles $W^\pm$ and $Z$ detected in 1983. These gauge particles are
responsible for the weak interaction in nature (e.g., the radioactive decay). If we do not
introduce the Higgs field, then the gauge particles are massless, contradicting
physical experiments.
The Landau-Ginzburg model. The variational problem (149) due to
the physicists Landau and Ginzburg represents a highly simplified
mathematical model for superconducting or superfluid materials.$^{27}$ The
Landau-Ginzburg term
    $\dfrac{1}{2\varepsilon^2} \int_G (1 - |\psi_\varepsilon|^2)^2\,dx$
can be regarded as a penalty term. The penalty is maximal for normal
states. The minimum problem (149) forces the value $|\psi_\varepsilon(x)|$ to be close to
1 provided the positive parameter $\varepsilon$ is sufficiently small. Thus, the penalty
term forces the appearance of superconducting states.
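The behavior of the penalty term can be illustrated pointwise. The short Python sketch below evaluates the integrand of the Landau-Ginzburg term as a function of $\rho = |\psi_\varepsilon(x)| \in [0, 1]$; the function name `penalty_density` and the sample values are assumptions made here for illustration.

```python
def penalty_density(rho, eps):
    """Integrand (1 - rho^2)^2 / (2*eps^2) of the penalty term, rho = |psi|."""
    return (1.0 - rho ** 2) ** 2 / (2.0 * eps ** 2)

def most_penalized(eps, samples=101):
    """Value of rho on a uniform grid in [0, 1] with the largest penalty."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(grid, key=lambda rho: penalty_density(rho, eps))
```

On $[0, 1]$ the density is largest at $\rho = 0$ (normal state), vanishes at $\rho = 1$ (superconducting state), and grows like $\varepsilon^{-2}$ for fixed $\rho < 1$.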
Renormalization of energy. The limiting function $\psi$ as $\varepsilon \to 0$
corresponds to an (idealized) superconducting state on $G$, where the singular
points $p_1, \ldots, p_d$ are called defects. An infinite amount of energy is
concentrated at the defects. That is, we have
    $\int_{U(p_j)} |\nabla\psi|^2\,dx = \infty$
on each small neighborhood $U(p_j)$ of the defect $p_j$. The appearance of
infinite energies is typical for modern physics (e.g., for quantum field
theory and elementary particle physics). This phenomenon indicates that the
mathematical modeling is wrong. To overcome this serious mathematical
difficulty, physicists invented the technique of renormalization around 1950.
The idea is to pass from the original infinite energy to a renormalized finite
energy by subtracting terms that correspond to the singularities.
From a mathematical viewpoint, it is quite remarkable that one can
define a renormalized energy $E_{\mathrm{ren}}$ such that the location of the defects
is obtained by minimizing $E_{\mathrm{ren}}$ with respect to all possible defects (cf.
Bethuel, Brezis, and Hélein (1994)).
Quantization on a classical level. It is quite interesting that the
number d of defects only depends on a topological invariant of the boundary
values (the winding number of g). Thus, d can be regarded as a topological
quantum number (on a classical level). Such quantization effects are typical
for the behavior of superconductors and superfluids in nature.
Cooper pairs. The Landau-Ginzburg approach to superconductivity
represents a purely phenomenological theory. A deeper physical under-
standing of superconductivity can be gained by using the methods of quan-
tum statistics based on second quantization in quantum field theory. The
theory of Bardeen, Cooper, and Schrieffer from 1957 explains
superconductivity by means of Cooper pairs consisting of two electrons that have
the same energy, but antiparallel spin. Cooper pairs of particles are also
responsible for superfluidity (cf. Landau and Lifšic (1988), Vols. 9 and 10;
Schrieffer (1964); and Bogoljubov (1967)).
27 In superfluidity (e.g., supercooled helium), $|\psi(x)|$ represents the density of
the superfluid component, and $\operatorname{grad} S(x)$ is proportional to the velocity vector
of the superfluid component.
Singular variational problems and phase transitions. The vari-
ational problem (149) represents a singular perturbation of the Dirichlet
problem (147), where the perturbation is given by the Landau-Ginzburg
term. In modern mathematical physics, singular variational problems are
used in order to model all kinds of phase transitions in materials (cf. Zeidler
(1986), Vol. 5).
The variational approach to free boundary problems. Free bound-
ary problems appear in many fields of physics (e.g., in hydrodynamics, elas-
ticity, or plasticity). For example, the free boundary may correspond to any
of the following: the surface of a rotating star, the surface of a water wave,
the boundary of a groundwater zone, the boundary of a plasticity zone,
or, more generally, the boundary of a phase transition zone (e.g., melting
ice). From a mathematical viewpoint, such problems can frequently be for-
mulated as constrained variational problems. Using appropriate Lagrange
multipliers, we arrive at unconstrained variational problems with an addi-
tional (possibly singular) term (cf. Section 4.14). A detailed study can be
found in the monograph by Friedman (1982) (cf. also Zeidler (1986), Vol.
5).
2.15. String theory and the Noether theorem. In what follows we will use
physical units such that
    $c = 1$ (velocity of light), $h = 2\pi$ (Planck's quantum of action).
Let $\mathbb{Z}$ denote the set of integers.
The role of string theory in modern physics will be discussed at the end
of this problem.
The basic idea of string theory. Consider a Cartesian $(x^1, x^2, x^3)$-system
$\Sigma$ that corresponds to an inertial system with time $t$. Set $x^4 := t$.
Then the motion of a point particle is described by the one-parameter
equation
    $x^j = x^j(\tau)$, $0 \le \tau \le \tau_0$, $j = 1, \ldots, 4$.
This corresponds to a curve in the four-dimensional space-time manifold.
This curve is called a world line (cf. Figure 2.13(a)).
In contrast to this, the motion of a closed string is described by the
two-parameter equation
    $x^j = x^j(\sigma, \tau)$, $0 \le \sigma \le 2\pi$, $0 \le \tau \le \tau_0$, $j = 1, \ldots, d$,   (151)
along with the periodicity condition
    $x^j(0, \tau) = x^j(2\pi, \tau)$, $0 \le \tau \le \tau_0$, $j = 1, \ldots, d$.
FIGURE 2.13. (a) Motion of a point particle. (b) Motion of a closed string.
Let $d = 4$. Then this corresponds to the motion of a closed curve in the
Cartesian coordinate system $\Sigma$. Moreover, equation (151) represents a
two-dimensional surface in the space-time manifold. This surface is called a
world sheet (cf. Figure 2.13(b)).
Observe that the appearance of a general $d$-dimensional space-time
manifold in string theory is motivated by the quantum physics of strings.
Notation on the world sheet. Consider the parameter space
    $W := \{(\sigma, \tau): 0 \le \sigma \le 2\pi,\ 0 \le \tau \le \tau_0\}$.
Set $\sigma^1 := \sigma$, $\sigma^2 := \tau$, and $\partial_\alpha := \partial/\partial\sigma^\alpha$. In what follows we will sum
over equal lower and upper Greek indices from 1 to 2. Consider a curve
$\sigma^\alpha = \sigma^\alpha(p)$ on $W$. By definition, the derivative of arclength $s$ with respect
to the curve parameter $p$ satisfies the differential equation
    $\left(\dfrac{ds(p)}{dp}\right)^2 = g_{\alpha\beta}(\sigma(p), \tau(p))\,\dfrac{d\sigma^\alpha(p)}{dp}\,\dfrac{d\sigma^\beta(p)}{dp}$.
Here, we assume that $g_{\alpha\beta} = g_{\beta\alpha}$ on $W$ for all $\alpha, \beta$,
    $g_{22} > 0$ and $g := \det(g_{\alpha\beta}) < 0$ on $W$.
Let $(g^{\alpha\beta})$ denote the inverse matrix to $(g_{\alpha\beta})$. Then
    $g^{\alpha\gamma}g_{\gamma\beta} = \delta^\alpha_\beta$ on $W$ for all $\alpha, \beta$,   (152)
where
    $\delta^\beta_\alpha := 1$ if $\alpha = \beta$, and $\delta^\beta_\alpha := 0$ if $\alpha \ne \beta$.
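Notation on the space-time manifold $\mathbb{R}^d$. Let
    $x = (x^1, \ldots, x^d)$, where $x^j \in \mathbb{R}$ for all $j$.
In what follows we sum over equal lower and upper Roman indices from 1
to $d$. Define
    $xy := \eta_{jk}\,x^j y^k$,
where
    $\eta_{jk} = \eta^{jk} := 1$ if $j = k = d$,
    $\eta_{jk} = \eta^{jk} := -1$ if $j = k = 1, \ldots, d - 1$,
    $\eta_{jk} = \eta^{jk} := 0$ if $j \ne k$.
The pseudo-inner product $xy$ on $\mathbb{R}^d$ corresponds to the Minkowski metric
on $\mathbb{R}^d$.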
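2.15a. The equation of motion for the bosonic string. The basic
variational principle due to Polyakov reads as follows:
    $-\dfrac{T}{2} \int_W \sqrt{-g}\,g^{\alpha\beta}\,\partial_\alpha x\,\partial_\beta x\,d\sigma\,d\tau = \text{stationary!}$,   (153)
    $x(0, \tau) = x(2\pi, \tau)$ for all $\tau \in [0, \tau_0]$.
Here, $T$ is a given positive constant called the string tension. We are looking
for $x = x(\sigma, \tau)$ and $g_{\alpha\beta} = g_{\alpha\beta}(\sigma, \tau)$. This variational principle tells us that
$x: W \to \mathbb{R}^d$ is a harmonic map with respect to the metrics on $W$ and $\mathbb{R}^d$.
Show that each sufficiently smooth solution to the variational problem
(153) satisfies the following equations:
    $\partial_\alpha(\sqrt{-g}\,g^{\alpha\beta}\,\partial_\beta x) = 0$ (equation of motion),   (154)
    $W_{\alpha\beta} = 0$, $\alpha, \beta = 1, 2$ (constraints).              (155)
By definition,
    $W_{\alpha\beta} := \tfrac{1}{2}\,\partial_\alpha x\,\partial_\beta x - \tfrac{1}{4}\,g_{\alpha\beta}\,g^{\gamma\delta}\,\partial_\gamma x\,\partial_\delta x$.
Solution: Introduce the Lagrangian
    $L := -\dfrac{T}{2}\sqrt{-g}\,g^{\alpha\beta}\,\partial_\alpha x^j\,\partial_\beta x^k\,\eta_{jk}$.
Then the Euler-Lagrange equations to (153) read as follows:
    $\partial_\alpha\left(\dfrac{\partial L}{\partial(\partial_\alpha x^m)}\right) = 0$, $m = 1, \ldots, d$ (equation of motion),   (156)
    $\dfrac{\partial L}{\partial g^{\alpha\beta}} = 0$, $\alpha, \beta = 1, 2$ (constraints).   (157)
We want to show that (156) and (157) correspond to (154) and (155),
respectively.
Ad (156). Obviously,
    $\dfrac{\partial L}{\partial(\partial_\alpha x^m)} = -T\sqrt{-g}\,g^{\alpha\beta}\,\partial_\beta x^k\,\eta_{km}$.
Ad (157). Recall that $g = \det(g_{\alpha\beta})$. If $g_{\alpha\beta}$ depends on a parameter $p$,
then a classical formula for the derivative $\dot{g}$ with respect to $p$ tells us that
    $\dot{g} = g\,g^{\alpha\beta}\,\dot{g}_{\alpha\beta}$.
By (152), $g^{\alpha\beta}g_{\alpha\beta} = \delta^\alpha_\alpha = 2$. Hence
    $\dot{g}^{\alpha\beta}g_{\alpha\beta} + g^{\alpha\beta}\dot{g}_{\alpha\beta} = 0$.
This implies
    $\dot{g} = -g\,g_{\alpha\beta}\,\dot{g}^{\alpha\beta}$.
Consequently,
    $\dfrac{\partial g}{\partial g^{\gamma\delta}} = -g\,g_{\alpha\beta}\,\dfrac{\partial g^{\alpha\beta}}{\partial g^{\gamma\delta}} = -g\,g_{\gamma\delta}$.
Therefore, equation (157) implies (155).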
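The determinant formulas used in the step "Ad (157)" are easy to test numerically for a symmetric $2 \times 2$ matrix. In the Python sketch below, the one-parameter family `metric(p)` is a hypothetical example chosen here so that $g_{22} > 0$ and $\det < 0$; derivatives with respect to $p$ are approximated by central differences.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def metric(p):
    """Hypothetical symmetric family g_{alpha beta}(p), g_22 > 0, det < 0."""
    return [[-1.0 - p * p, 0.3 * p], [0.3 * p, 1.0 + p]]

def ddp(matrix_fn, p, eps=1e-6):
    """Entrywise central difference of a matrix-valued function of p."""
    hi, lo = matrix_fn(p + eps), matrix_fn(p - eps)
    return [[(hi[a][b] - lo[a][b]) / (2 * eps) for b in range(2)] for a in range(2)]

def contract(upper, lower):
    """Full contraction sum over alpha, beta of upper[a][b] * lower[a][b]."""
    return sum(upper[a][b] * lower[a][b] for a in range(2) for b in range(2))

def check(p, eps=1e-6):
    g = det2(metric(p))
    g_dot = (det2(metric(p + eps)) - det2(metric(p - eps))) / (2 * eps)
    lower_form = g * contract(inv2(metric(p)), ddp(metric, p))       # g g^{ab} d(g_ab)
    upper_form = -g * contract(metric(p), ddp(lambda q: inv2(metric(q)), p))  # -g g_ab d(g^{ab})
    trace = contract(inv2(metric(p)), metric(p))                     # g^{ab} g_ab
    return g_dot, lower_form, upper_form, trace
```

Calling `check(0.5)` returns three approximately equal numbers followed by the trace value 2, in line with the two expressions for $\dot{g}$ and with $\delta^\alpha_\alpha = 2$.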
2.15b. Conservation laws. Set
Explicitly,
    $(P^\alpha)^j := -T\sqrt{-g}\,g^{\alpha\beta}\,\partial_\beta x^j$,
    $(J^\alpha)^{jk} := x^j (P^\alpha)^k - x^k (P^\alpha)^j$.
Use a simple computation in order to show that the equation of motion in
(154) implies the following two conservation laws:
    $\partial_\alpha P^\alpha = 0$ (energy-momentum conservation),           (158)
    $\partial_\alpha J^\alpha = 0$ (angular-momentum conservation).          (159)
2.15c. Poincare transformations, the Noether theorem, and conservation
laws. Show that the conservation laws (158) and (159) are consequences of
the Noether theorem from Section 2.19.
Hint: Ad (158). Use the translation
(160)
Note that the Lagrangian L is invariant under this transformation.
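Ad (159): Let $j, k = 1, \ldots, d - 1$, where $j \ne k$. Use both the Lorentz
transformation
    $\begin{pmatrix} y^j \\ y^d \end{pmatrix} = \begin{pmatrix} \cosh\psi & \sinh\psi \\ \sinh\psi & \cosh\psi \end{pmatrix} \begin{pmatrix} x^j \\ x^d \end{pmatrix}$   (161)
and the rotation
    $\begin{pmatrix} y^j \\ y^k \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} x^j \\ x^k \end{pmatrix}$.   (162)
Show first that $yy = xx$ for these transformations. Thus, the Lagrangian
$L$ remains invariant. Furthermore, observe that these transformations can
be written as
    $y^m = x^m + \varepsilon^m_r x^r + o(|\varepsilon|)$ as $|\varepsilon| \to 0$,   (163)
where
    $\varepsilon_{mr} = -\varepsilon_{rm}$ for all $m, r$,
and $\varepsilon^m_r = \eta^{ms}\varepsilon_{sr}$. In addition, $|\varepsilon| := \max_{m,r} |\varepsilon^m_r|$. To see this, note that,
for example,
    $\cosh\psi = 1 + o(\psi)$, $\sinh\psi = \psi + o(\psi)$, $\psi \to 0$.
Apply now the Noether theorem to this situation.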
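Both claims of this hint, the invariance $yy = xx$ and the antisymmetry of the lowered infinitesimal generator, can be checked numerically. The Python sketch below is an illustration under assumptions made here: the dimension `D = 4`, zero-based indices with the time component stored last, and the helper names are not from the text.

```python
import math

D = 4  # assumed space-time dimension; the last component is the time x^d

def minkowski(x, y):
    """Pseudo-inner product xy with eta = diag(-1, ..., -1, +1)."""
    return x[-1] * y[-1] - sum(a * b for a, b in zip(x[:-1], y[:-1]))

def boost(x, j, psi):
    """Lorentz transformation (161) mixing x^j with the time component."""
    y = list(x)
    y[j] = math.cosh(psi) * x[j] + math.sinh(psi) * x[-1]
    y[-1] = math.sinh(psi) * x[j] + math.cosh(psi) * x[-1]
    return y

def rotate(x, j, k, phi):
    """Rotation (162) in the (x^j, x^k)-plane."""
    y = list(x)
    y[j] = math.cos(phi) * x[j] + math.sin(phi) * x[k]
    y[k] = -math.sin(phi) * x[j] + math.cos(phi) * x[k]
    return y

def lowered_generator(transform, h=1e-6):
    """eps_{sr} = eta_{ss} * d/dh (transform_h)^s_r at h = 0 (central difference)."""
    eps = [[0.0] * D for _ in range(D)]
    for r in range(D):
        e = [1.0 if i == r else 0.0 for i in range(D)]
        plus, minus = transform(e, h), transform(e, -h)
        for s in range(D):
            sign = 1.0 if s == D - 1 else -1.0
            eps[s][r] = sign * (plus[s] - minus[s]) / (2 * h)
    return eps
```

For example, `lowered_generator(lambda v, h: boost(v, 0, h))` yields a matrix whose only nonzero entries are $\varepsilon_{03} = -1$ and $\varepsilon_{30} = +1$ (in the zero-based indexing used here), which is antisymmetric as required.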
Remark: The transformations (161) and (162) generate the Lorentz
group. If we add the translations from (160) to the Lorentz group, we
obtain the Poincare group, which plays a fundamental role in relativistic
physics.
The same considerations apply to each physical theory that is invariant
under Poincare transformations. Such theories are called relativistically
invariant. In this respect, the Noether theorem combined with Poincare
translations in (160) leads to conservation of the energy-momentum tensor,
whereas the Lorentz group is responsible for conservation of the angular-
momentum tensor.
2.15d. The conformal gauge. Use the following two-dimensional Minkowski
metric on the world sheet:
    $(g_{\alpha\beta}) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.
Physicists call this choice of metric the conformal gauge. Set
    $\partial_\pm := \tfrac{1}{2}(\partial_\tau \pm \partial_\sigma)$.
Show that the basic equations (154) and (155) now read as follows for
all $\sigma, \tau \in \mathbb{R}$:
    $x_{\tau\tau} - x_{\sigma\sigma} = 0$ (equation of motion),               (164)
    $\partial_+ x\,\partial_+ x = 0$, $\partial_- x\,\partial_- x = 0$ (Virasoro constraints),   (165)
    $x(\sigma + 2\pi, \tau) = x(\sigma, \tau)$ (periodicity).                 (166)
Note that equation (164) coincides with the classic equation for a vibrating
string (cf. Section 5.12 in AMS Vol. 108).
Solution: Ad (164). This follows immediately from (154).
Ad (165). Equation (155) yields
    $x_\sigma x_\sigma + x_\tau x_\tau = 0$, $x_\sigma x_\tau = 0$.
This is equivalent to
    $(x_\tau \pm x_\sigma)(x_\tau \pm x_\sigma) = 0$.
2.15e. Explicit solutions via Fourier series. Define the Virasoro charges
    $L_m^\pm := T \int_0^{2\pi} d\sigma\, e^{\pm im\sigma}\,\partial_\pm x\,\partial_\pm x$, $m \in \mathbb{Z}$.   (167)
Show that a solution to the basic equations (164) through (166) is given
by
    $x = x_0 + \dfrac{\tau - \sigma}{\sqrt{4\pi T}}\,\alpha_0^- + \dfrac{i}{\sqrt{4\pi T}} \sum_{n \ne 0} \dfrac{1}{n}\,\alpha_n^-\,e^{-in(\tau - \sigma)}$   (168)
        $+ \dfrac{\tau + \sigma}{\sqrt{4\pi T}}\,\alpha_0^+ + \dfrac{i}{\sqrt{4\pi T}} \sum_{n \ne 0} \dfrac{1}{n}\,\alpha_n^+\,e^{-in(\tau + \sigma)}$,
where the Fourier coefficients $\alpha$ satisfy the equations
    $m \in \mathbb{Z}$,                                                       (169)
along with
    $(\alpha_n^\pm)^* = \alpha_{-n}^\pm$, $n \in \mathbb{Z}$.                 (170)
Moreover, $x_0 \in \mathbb{R}^d$ is given.
Here, $\alpha = (\alpha^1, \ldots, \alpha^d)$, where the components $\alpha^j$ are complex numbers.
The asterisk stands for a passage to the conjugate complex components. In
order to obtain classic solutions, assume that
    $|\alpha_n^\pm| \le \dfrac{\text{const}}{n^4}$, $n = \pm 1, \pm 2, \ldots$.
This condition allows us to differentiate the Fourier series in (168) twice.
Solution: An explicit computation shows that the function $x = x(\sigma, \tau)$
from (168) solves the equation of motion in (164). Relation (170) guarantees
that the components of $x(\sigma, \tau)$ are real numbers.
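The two statements of this solution can be tested on a truncated series with one real component. The Python sketch below is hedged: the mode values, the tension `T`, and the helper names are invented for illustration; it assumes the reality condition in the form $\alpha_{-n} = \overline{\alpha_n}$ with real zero modes, and it takes equal zero modes for both movers so that the periodicity (166) holds.

```python
import cmath
import math

T = 1.0      # string tension (illustrative value)
X0 = 0.3     # one component of x_0
A0 = 0.7     # common real zero mode of both movers (assumption, see lead-in)
MODES_MINUS = {1: 0.5 + 0.2j, 2: -0.1 + 0.4j}   # alpha_n^- for some n > 0
MODES_PLUS = {1: 0.3 - 0.6j, 3: 0.25 + 0.0j}    # alpha_n^+ for some n > 0

def with_conjugates(modes):
    """Add alpha_{-n} = conj(alpha_n), the assumed reality condition."""
    full = dict(modes)
    for n, a in modes.items():
        full[-n] = a.conjugate()
    return full

def mover(theta, a0, modes):
    """(theta*a0 + i * sum_{n != 0} alpha_n/n * exp(-i n theta)) / sqrt(4 pi T)."""
    total = 0j
    for n, a in with_conjugates(modes).items():
        total += a / n * cmath.exp(-1j * n * theta)
    return (theta * a0 + 1j * total) / math.sqrt(4 * math.pi * T)

def x(sigma, tau):
    return X0 + mover(tau - sigma, A0, MODES_MINUS) + mover(tau + sigma, A0, MODES_PLUS)

def wave_residual(sigma, tau, h=1e-3):
    """|x_tautau - x_sigmasigma| approximated by second central differences."""
    centre = x(sigma, tau)
    x_tt = (x(sigma, tau + h) - 2 * centre + x(sigma, tau - h)) / h ** 2
    x_ss = (x(sigma + h, tau) - 2 * centre + x(sigma - h, tau)) / h ** 2
    return abs(x_tt - x_ss)
```

With these choices, `x(sigma, tau)` has a negligible imaginary part, `wave_residual` is at rounding level, and shifting `sigma` by $2\pi$ leaves the value unchanged.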
Substituting the function $x = x(\sigma, \tau)$ from (168) into (167), we obtain
the expression for $L_m^\pm$ given in (169). Now, to the point: It follows from
$L_m^\pm = 0$ for all $m$ that
    $\partial_\pm x\,\partial_\pm x = 0$ for all $\sigma, \tau \in \mathbb{R}$,
since the quantities $L_m^\pm$ are proportional to the Fourier coefficients of the
functions $\partial_\pm x\,\partial_\pm x$. Thus, the Virasoro constraints from (165) are fulfilled.
2.15f. The Virasoro algebra and infinite-dimensional Lie algebras.
Consider the unit circle $S^1 := \{z \in \mathbb{C}: |z| = 1\}$. Let $X$ be the space of all
$C^\infty$-functions $f: S^1 \to \mathbb{C}$ that can be extended to a holomorphic function
on an open neighborhood of $S^1$. Define the operator
    $\mathcal{L}_m: X \to X$
through
    $\mathcal{L}_m f := -z^{m+1}\,\dfrac{df}{dz}$, $m \in \mathbb{Z}$.
Set $[A, B] := AB - BA$. Moreover, let
    $\mathrm{Vir}_0 := \operatorname{span}\{\mathcal{L}_m: m \in \mathbb{Z}\}$.
Finally, choose a symbol $C$, and set
    $\mathrm{Vir} := \operatorname{span}\{C, \mathcal{L}_m: m \in \mathbb{Z}\}$.
(i) Show that, for all $m, n \in \mathbb{Z}$, we have the commutation relations:
    $[\mathcal{L}_n, \mathcal{L}_m] = (n - m)\mathcal{L}_{n+m}$.
This way, the complex linear space $\mathrm{Vir}_0$ becomes an infinite-dimensional
Lie algebra called the special Virasoro algebra.
(ii) Determine the numbers $a(n, m)$ in such a way that, for all $n, m \in \mathbb{Z}$,
    $[\mathcal{L}_n, \mathcal{L}_m] = (n - m)\mathcal{L}_{n+m} + a(n, m)C$,
This way, the complex linear space $\mathrm{Vir}$ becomes an infinite-dimensional
Lie algebra called the Virasoro algebra, which is also called the central
extension of $\mathrm{Vir}_0$.
Hint: Use the Jacobi identity to show that
    $a(n, m) = \text{const}\,(n^3 - n)\,\delta_{n,-m}$.
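The commutation relation in (i) can be verified mechanically on Laurent polynomials, which belong to $X$. The Python sketch below stores a Laurent polynomial as a dictionary `{exponent: coefficient}`; this representation and the helper names are choices made here for illustration.

```python
def apply_L(m, poly):
    """L_m f = -z^(m+1) * df/dz for a Laurent polynomial {exponent: coefficient}."""
    out = {}
    for k, c in poly.items():
        # d/dz of c*z^k is k*c*z^(k-1); multiplying by -z^(m+1) gives -k*c*z^(k+m)
        out[k + m] = out.get(k + m, 0) - k * c
    return {k: c for k, c in out.items() if c != 0}

def combine(p, q, factor):
    """Return p + factor*q with zero coefficients dropped."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + factor * c
    return {k: c for k, c in out.items() if c != 0}

def commutator(n, m, poly):
    """[L_n, L_m] f = L_n(L_m f) - L_m(L_n f)."""
    return combine(apply_L(n, apply_L(m, poly)), apply_L(m, apply_L(n, poly)), -1)

def expected(n, m, poly):
    """(n - m) * L_{n+m} f."""
    return combine({}, apply_L(n + m, poly), n - m)
```

Comparing `commutator(n, m, f)` with `expected(n, m, f)` for a sample `f` and a range of integers `n`, `m` confirms the relation with integer arithmetic, so no rounding is involved.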
2.15g. Supernumbers. Consider an infinite number of symbols
    $\theta_1, \theta_2, \ldots$
Introduce a product $\theta_k\theta_m$ that has the following decisive property:
    $\theta_k\theta_m = -\theta_m\theta_k$ for all $k, m$.
Let the coefficients $a$ be complex numbers. Each (finite) complex linear
combination of finite products
    $a_0 + \sum_{k,m} a_{km}\theta_k\theta_m + \sum_{k,m,l} a_{kml}\theta_k\theta_m\theta_l + \ldots$
is called a supernumber.
Show that the Grassmann algebra from Problem 4.14d is a model for
supernumbers.
The symbols $\theta$ are called Grassmann variables. From a historical point
of view, it is interesting that such quantities were already introduced by
Hermann Grassmann in 1844. From a physical point of view,
supermathematics allows us to formulate theories that describe bosons (integer-spin
particles) and fermions (half-integer-spin particles) in a unified way.
Solution: Let $X$ be a complex linear space with $\dim X = \infty$. Suppose
that $X = \operatorname{span}\{\theta_1, \theta_2, \ldots\}$, where $\theta_1, \ldots, \theta_n$ are linearly independent for
each $n$. Define
and so on. For example, $\theta_k\theta_m$ is a bilinear form on $X^T$, where
for all $u, v \in X^T$.
2.15h. Supermathematics. It is possible to construct a rich mathematics
based on supernumbers. This is called supermathematics. Let us consider
some examples.
(i) Differentiation:
(ii) Integration:$^{28}$
$$\int (\alpha + \beta\theta)\,d\theta := \beta.$$
$^{28}$ This definition is motivated by the classic formula
$$\int_{-\infty}^{\infty} f(x + \mathrm{const})\,dx = \int_{-\infty}^{\infty} f(x)\,dx$$
(translation invariance of the integral).
Compute the following expressions:
Solution:
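(a) Note that $\theta^2 = \theta\theta = -\theta\theta$, and hence $\theta^2 = 0$. Therefore,
$$e^{\theta} = 1 + \theta + \frac{\theta^2}{2} + \cdots = 1 + \theta.$$
(b) $\int e^{\theta}\,d\theta = \int (1 + \theta)\,d\theta = 1$.
(c) $(e^{\theta})' = (1 + \theta)' = 1$.
(d) $\partial_1(\theta_2\theta_1) = \partial_1(-\theta_1\theta_2) = -\theta_2$.
(e) $e^{\theta_2}\sin\theta_1 = (1 + \theta_2)\theta_1$.
(f) $\partial_1(\theta_1 + \theta_2\theta_1) = 1 - \theta_2$.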
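The computations (a) through (f) can be reproduced with a small model of supernumbers. The sketch below is an illustration, not a reference implementation. It assumes that a supernumber is stored as a dictionary mapping a sorted tuple of generator indices to its coefficient, that $\partial_i$ acts from the left, and that power series are summed until the powers vanish, which happens because $\theta_k^2 = 0$. All names (`mul`, `d_left`, `series`, and so on) are hypothetical.

```python
from fractions import Fraction
from math import factorial

ONE = {(): 1}                                    # the supernumber 1


def clean(x):
    return {k: c for k, c in x.items() if c != 0}


def theta(i):
    return {(i,): 1}                             # the generator theta_i


def add(x, y):
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0) + c
    return clean(out)


def scale(a, x):
    return clean({k: a * c for k, c in x.items()})


def mul(x, y):
    out = {}
    for ka, ca in x.items():
        for kb, cb in y.items():
            seq = ka + kb
            if len(set(seq)) < len(seq):
                continue                         # theta_k * theta_k = 0
            # sign = parity of the permutation that sorts the generators
            inv = sum(1 for i in range(len(seq))
                      for j in range(i + 1, len(seq)) if seq[i] > seq[j])
            key = tuple(sorted(seq))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return clean(out)


def d_left(i, x):
    """Left derivative with respect to theta_i."""
    out = {}
    for key, c in x.items():
        if i in key:
            p = key.index(i)                     # move theta_i to the front
            rest = key[:p] + key[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return clean(out)


def series(x, coeff):
    """Sum coeff(j) * x^j; terminates because x has no constant part."""
    assert () not in x
    result, power, j = clean({(): coeff(0)}), ONE, 0
    while power:
        j += 1
        power = mul(power, x)
        result = add(result, scale(coeff(j), power))
    return result


def exp(x):
    return series(x, lambda j: Fraction(1, factorial(j)))


def sin(x):
    return series(x, lambda j: Fraction((-1) ** ((j - 1) // 2), factorial(j))
                  if j % 2 else Fraction(0))
```

For a single variable, `d_left(1, x)` applied to $\alpha + \beta\theta_1$ returns $\beta$, which coincides with the integral defined in (ii). For example, `exp(theta(1))` equals `add(ONE, theta(1))`, in agreement with (a).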
As an introduction to modern supermathematics, we recommend the
monographs by Berezin (1987), Bagger and Wess (1991) (supersymme-
try and supergravity), and Constantinescu and de Groote (1994) (sheaf-
theoretical approach).
Remark (The importance of string theory in modern physics). We will
use only formal arguments.
(i) The quantization of the bosonic string. Consider first the bosonic
string from (168). In order to quantize this string, we have to replace the
Fourier coefficients (a~y by operators in a "Hilbert space" 29 H satisfying
the following commutation rules:
[(a;y, (a~YJ = 0, n,mEZ.
Then the Virasoro charges become operators, that is,
mEZ,
by (169). It turns out that
$$[L_m, L_n] = (m - n)L_{m+n} + \frac{c}{12}(n^3 - n)\,\delta_{m,-n}\,I, \qquad m, n \in \mathbb{Z},$$
where $c$ is a constant, and where $I$ denotes the unit operator. This means
that the Virasoro charges form a Virasoro algebra. This fact plays a fundamental
role in string theory.
$^{29}$ Note that the inner product on $H$ is not positive definite.
The classic constraints L'!it = 0 have to be replaced by the conditions
L!t/J = 0, (Lo - 1)t/J = 0, for all m E Z. (171)
The set of all $\psi \in H$ that satisfy condition (171) forms a linear subspace
$H_{\mathrm{phys}}$ of $H$. By definition, the elements $\psi$ of $H_{\mathrm{phys}}$ are called physical states.
Such a physical state $\psi$ is called a ghost state iff
$$(\psi \mid \psi) < 0.$$
An important result (based on formal arguments) says that the physical
states are ghost-free iff d = 26 (the dimension of the space-time manifold).
The theory makes sense in this case only.
This quantization procedure corresponds to the old canonical quantiza-
tion. In fact, modern string theory is based on quantization via the Feyn-
man path integral. Such a quantization procedure has the decisive advan-
tage that the relativistic invariance of the theory can be seen immediately.
(ii) Superstring theory. If one formulates string theories on the basis of
supernumbers, then one obtains so-called superstring theories. In partic-
ular, the critical dimension for the heterotic string is d = 10. The use of
supernumbers allows us to describe both bosonic and fermionic strings in a
unified manner. It is fascinating that the simplest vibration of a closed su-
perstring corresponds to a massless spin-two particle. Moreover, the gauge
symmetries of string theory show that this spin-two particle has all the
properties of the hypothetical graviton that is responsible for the gravita-
tional force.
For many physicists, superstring theory is therefore the leading candi-
date for the unification of all fundamental forces in the universe (gravita-
tion, weak interaction, electromagnetic interaction, and strong interaction).
However, to date there has been no experimental evidence for the existence
of strings. String theory predicts that the physical effects generated by
superstrings are striking if the particle mass is near $10^{19}$ proton masses.
However, even the largest particle accelerators or observations from cosmic ray
detectors and satellites will, at best, be able to probe only indirect signals
emerging from such extremely high energies.
As an introduction to string theory, we recommend the lecture notes
by Lüst and Theisen (1989), and the monograph by Green, Schwarz, and
Witten (1987). From a mathematical point of view, it is fascinating that
string theory is related to many topics in mathematics (e.g., Riemann
surfaces, Kähler manifolds, Calabi-Yau manifolds, topology, characteristic
classes and vector bundles, knot theory, Morse theory, Floer cohomology,
Korteweg-de Vries hierarchy and solitons, Kac-Moody algebras, quantum
groups, algebraic geometry, and number theory). In this connection, there
is now a fascinating flow of ideas from physics to pure mathematics, and
vice versa. This can be found in the monographs by Kaku (1987), (1991).
See also Waldschmidt et al. (1992).
The fundamental principle of stationary action in physics. The
variational principles in Problems 2.1 through 2.15 correspond to the prin-
ciple of stationary action. This is the most important principle in physics
needed to obtain the basic equations in all fields of physics. Further appli-
cations of this principle to important physical problems can be found in
the monographs by Soper (1975) and by Landau and Lifsic (1988), Vols.
1-10.
3
Principles of Linear Functional
Analysis
I love mathematics not only because it is applicable to technology
but also because it is beautiful.
Rózsa Péter (1905-1977)
It is true that a mathematician, who is not somewhat of a poet, will
never be a perfect mathematician.
Karl Weierstrass (1815-1897)
A mathematician, like a painter or poet, is a maker of patterns. If
his patterns are more permanent than theirs, it is because they are
made with ideas.
Godfrey Harold Hardy (1877-1947)
Linear functional analysis is based on the following two important prin-
ciples:
(i) the Hahn-Banach theorem, and
(ii) the Baire theorem.
The Hahn-Banach theorem on the extension of linear functionals has been
studied in Chapter 1. In this chapter we will investigate some applications
of the Baire category theorem to linear operator equations. The most
important consequences of the Baire theorem are the following (cf. Figure
3.1):
[FIGURE 3.1. Diagram of the logical dependencies among the main results: Cantor's nested interval principle, the Baire theorem, the uniform boundedness theorem (nonlinear operators), the open mapping theorem, the closed graph theorem, the continuous inverse theorem (well-posedness of linear operator equations), the Banach-Steinhaus theorem (linear operators), the implicit function theorem (Chapter 4), boundedness of weakly convergent sequences, a priori estimates and closed range, normal forms of nonlinear mappings (Chapter 4), the Hahn-Banach theorem, the closed range theorem, existence of Lagrange multipliers (Chapter 4), the Fredholm alternative, and linear and nonlinear Fredholm operators (Chapter 5).]
(a) the uniform boundedness theorem;
(b) the open mapping theorem;
(c) the closed graph theorem; and
(d) the closed range theorem.
These fundamental results were proved by Banach in the late 1920s. The
prototype of the Baire theorem was proved by Baire in 1899 before the
creation of functional analysis.
A detailed presentation of the fascinating history of linear functional
analysis can be found in the monograph by Dieudonné (1981).
3.1 The Baire Theorem
Definition 1. Let $M$ be a subset of a normed space $X$ over $\mathbb{K}$. Then
(i) $M$ is called nowhere dense in $X$ iff
$$\operatorname{int} \overline{M} = \emptyset,$$
that is, the closure $\overline{M}$ of $M$ does not contain any interior points.
(ii) $M$ is said to be of the first category in $X$ iff $M$ is the countable union
of nowhere dense subsets $M_n$ of $X$, that is,
$$M = \bigcup_{n=1}^{\infty} M_n.$$
Sets of the first category are also called meager.
(iii) $M$ is said to be of the second category in $X$ iff $M$ is not of the
first category. Such sets are also called fat.
Standard Example 2 (Sets in $\mathbb{R}$).
(i) Each finite set $\{x_1, \ldots, x_n\}$ in $\mathbb{R}$ is nowhere dense in $\mathbb{R}$.
(ii) Each at most countable subset of $\mathbb{R}$ is of the first category in $\mathbb{R}$.
(iii) The set of rational numbers $\mathbb{Q}$ is of the first category in $\mathbb{R}$.
(iv) Each nonempty open subset of $\mathbb{R}$ (e.g., $\mathbb{R}$ itself) is of the second
category in $\mathbb{R}$.
Proof. Ad (i), (ii). This is obvious.
Ad (iii). Note that $\mathbb{Q}$ is countable.
Ad (iv). This is a special case of the Baire theorem (Theorem 3.A). □
Proposition 3 (Cantor's nested interval principle). Let $M_1 \supseteq M_2 \supseteq \cdots$ be
a sequence of nonempty closed subsets $M_n$ of a Banach space $X$ such that
$$\lim_{n\to\infty} \operatorname{diam} M_n = 0. \qquad (1)$$
Then there exists a unique point $u$ with $u \in M_n$ for all $n$.
Proof. Existence. Choose a point $u_n \in M_n$ for each $n$. By (1), the sequence
$(u_n)$ is Cauchy, and hence there is a point $u$ such that $u_n \to u$ as $n \to \infty$.
Since $u_n \in M_k$ for all $n \ge k$ and the set $M_k$ is closed, $u \in M_k$ for each $k$.
Uniqueness. Let $u, v \in M_n$ for all $n$. Then $\|u - v\| \le \operatorname{diam} M_n$ for all $n$, and
hence, by (1), $\|u - v\|$ is arbitrarily small. Hence $u = v$. □
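Theorem 3.A (The Baire theorem). Each nonempty open subset $U$ of a
Banach space $X$ over $\mathbb{K}$ (e.g., $U = X$) is of the second category in $X$.
Proof. If $U$ were not of the second category, then $U$ would be of the first
category. Then there would exist a family $\{M_n\}$ of sets in $X$ such that
$$U = \bigcup_{n=1}^{\infty} M_n \quad \text{and} \quad \operatorname{int} \overline{M_n} = \emptyset \quad \text{for all } n.$$
Let us introduce the closed ball
$$B_r(a) := \{u \in X : \|u - a\| \le r\}$$
of radius $r > 0$. First choose a point $a \in U$. Since the set $U$ is open,
$$B_r(a) \subseteq U \qquad \text{for some } r > 0.$$
Since $\operatorname{int} \overline{M_1} = \emptyset$, there exists a point $a_1 \in \operatorname{int} B_r(a)$ such that$^{1}$ $\operatorname{dist}(a_1, \overline{M_1}) > 0$.
Thus, there is a number $r_1$ with $0 < r_1 < r$ such that
$$B_{r_1}(a_1) \subseteq B_r(a) \quad \text{and} \quad B_{r_1}(a_1) \cap \overline{M_1} = \emptyset.$$
Continuing this argument, we obtain a sequence of balls
$$B_r(a) \supseteq B_{r_1}(a_1) \supseteq B_{r_2}(a_2) \supseteq \cdots, \qquad r_n \to 0 \text{ as } n \to \infty, \qquad (2)$$
such that
$$B_{r_n}(a_n) \cap \overline{M_n} = \emptyset \qquad \text{for all } n = 1, 2, \ldots. \qquad (3)$$
It follows from (2) and the nested interval principle (Proposition 3) that
there exists a point $u$ with $u \in B_{r_n}(a_n)$ for all $n$. By (3), $u \notin M_n$ for all $n$.
This is a contradiction to
$$u \in B_r(a) \subseteq U = \bigcup_{n=1}^{\infty} M_n. \qquad \Box$$
$^{1}$ Otherwise, $\operatorname{dist}(b, \overline{M_1}) = 0$ for all $b \in \operatorname{int} B_r(a)$. Since $\overline{M_1}$ is closed, this implies
$b \in \overline{M_1}$ for all $b \in \operatorname{int} B_r(a)$, and hence $\operatorname{int} \overline{M_1} \ne \emptyset$. This is a contradiction.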
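The construction in the proof of Theorem 3.A can be traced in the simplest setting $X = \mathbb{R}$, where the starting ball is $[0, 1]$ and the nowhere dense sets are the one-point sets $M_n = \{q_n\}$ of an enumeration of rational numbers. The following Python sketch is an illustration under these assumptions. It shrinks closed intervals so that the $n$th interval misses $q_n$, and it works with exact fractions. The function names and the particular enumeration are hypothetical choices.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval(count):
    """First `count` rationals p/q in [0, 1], ordered by denominator."""
    out, q = [], 1
    while len(out) < count:
        for p in range(q + 1):
            if gcd(p, q) == 1 and len(out) < count:
                out.append(Fraction(p, q))
        q += 1
    return out


def shrink_away_from(points):
    """Nested closed intervals; the n-th interval does not contain points[n]."""
    lo, hi = Fraction(0), Fraction(1)
    intervals = []
    for q in points:
        third = (hi - lo) / 3
        if lo <= q <= lo + third:
            lo = hi - third      # q is in the left third: keep the right third
        else:
            hi = lo + third      # otherwise the left third misses q
        intervals.append((lo, hi))
    return intervals
```

Each interval has one third of the length of the previous one, so the diameters tend to zero and Cantor's principle yields a limit point that differs from every $q_n$. The code carries out only finitely many steps of this procedure.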
3.2 Application to the Existence of
Nondifferentiable Continuous Functions
Proposition 1 (Weierstrass). There exists a nondifferentiable continuous
function $f\colon [0,1] \to \mathbb{R}$.
This will be proved by using the following general principle.
Existence Principle 2. Let M be a subset of a Banach space X, and let
M be of the first category in X.
Then, there exists a point u E X such that
$$u \notin M.$$
Moreover, the set X - M is of the second category in X.
Proof. By the Baire theorem (Theorem 3.A), X is of the second category.
Since $X = M \cup (X - M)$ and $M$ is of the first category, the set $X - M$
must be of the second category. Note that the union of two sets of the first
category yields a set of the first category. □
Define
$$M := \{f \in C[0,1] : \text{there exists a point } x^* \in [0,1[ \text{ such that the right-hand derivative } f'_+(x^*) \text{ exists}\}.$$
Proposition 3. The set $M$ is of the first category in $C[0,1]$.
This implies that the set $C[0,1] - M$ is of the second category in the
Banach space $C[0,1]$. Hence Proposition 3 implies Proposition 1. Roughly
speaking, Proposition 3 tells us that "most" continuous functions $f\colon [0,1] \to \mathbb{R}$
are nondifferentiable. In 1806 Ampère tried to prove that "each continuous
function is differentiable." More than fifty years later, Weierstrass
showed that such a statement is wrong.
Proof of Proposition 3. Let $M_n$ denote the set of all functions $f \in C[0,1]$
such that there exists a point $x^* \in [0,1[$ with
$$|f(x^* + h) - f(x^*)| \le nh \qquad \text{for all } h \in [0,1] \text{ with } x^* + h \le 1. \qquad (4)$$
If $f \in M$, then $f'_+(x^*)$ exists and $f$ is continuous on $[0,1]$. Thus, $f \in M_n$
for some $n$, and hence
$$M \subseteq \bigcup_{n=1}^{\infty} M_n.$$
FIGURE 3.2.
We have to show that each set $M_n$ is nowhere dense in $C[0,1]$. Then $M$ is
of the first category in $C[0,1]$.
We first prove that $M_n$ is closed. To this end, let $(f_k)$ be a sequence in
$M_n$ such that $f_k \to f$ in $C[0,1]$ as $k \to \infty$. Then there exist points $x_k$ such
that
$$|f_k(x_k + h) - f_k(x_k)| \le nh \qquad \text{for all } h \in [0,1] \text{ with } x_k + h \le 1, \text{ and } k = 1, 2, \ldots. \qquad (5)$$
Since $x_k \in [0,1]$ for all $k$, there is a subsequence, again denoted by $(x_k)$,
such that $x_k \to x^*$ as $k \to \infty$. Letting $k \to \infty$ in (5), we have$^{2}$
$$|f(x^* + h) - f(x^*)| \le nh \qquad \text{for all } h \in [0,1] \text{ with } x^* + h \le 1.$$
Hence $f \in M_n$, that is, $M_n$ is closed.
We now show that $\operatorname{int} M_n = \emptyset$. Let $f \in M_n$. For each $\varepsilon > 0$, there exists
a piecewise linear, continuous function $g\colon [0,1] \to \mathbb{R}$ such that
$$\|f - g\| = \max_{0 \le x \le 1} |f(x) - g(x)| < \varepsilon$$
and $|g'_+(x)| > n$ for all $x \in [0,1[$ (see Figure 3.2). This implies $g \notin M_n$.
Hence $f$ is not an interior point of $M_n$. □
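The last step of the proof uses a steep sawtooth function $g$ close to $f$. The following Python sketch shows one possible way to build such a $g$ on a grid. It is an illustration under assumptions that the text does not state: $f$ is first replaced by its piecewise linear interpolant on `nodes` equal subintervals (which is assumed to be within $\varepsilon/2$ of $f$), and then a sawtooth of height $\varepsilon/2$ is added whose teeth are narrow enough to force every slope above $n$ in absolute value. The function names and the default `nodes=100` are hypothetical.

```python
def sawtooth_approximation(f, n, eps, nodes=100):
    """Grid points xs and values gs of a piecewise linear g on [0, 1]."""
    h = 1.0 / nodes
    coarse = [f(i * h) for i in range(nodes + 1)]
    max_slope = max(abs(coarse[i + 1] - coarse[i]) / h for i in range(nodes))
    amp = eps / 2.0
    # choose the tooth width h / (2 * teeth) so that amp / width > n + max_slope
    teeth = int((n + max_slope) * h / (2.0 * amp)) + 1
    cells = 2 * teeth
    xs, gs = [], []
    for i in range(nodes):
        for j in range(cells):
            t = j / cells
            xs.append((i + t) * h)
            p = coarse[i] + t * (coarse[i + 1] - coarse[i])   # interpolant of f
            gs.append(p + (amp if j % 2 else 0.0))            # add the sawtooth
    xs.append(1.0)
    gs.append(coarse[nodes])
    return xs, gs


def right_slopes(xs, gs):
    """Slope of g on each grid cell, i.e. the right-hand derivative there."""
    return [(gs[k + 1] - gs[k]) / (xs[k + 1] - xs[k]) for k in range(len(xs) - 1)]
```

For instance, with `f = lambda x: x * x`, `n = 5`, and `eps = 0.1`, every value in `right_slopes(xs, gs)` exceeds 5 in absolute value, while `g` stays within 0.1 of `f` at the grid points. The distance is checked on the grid only, so this supports the estimate $\|f - g\| < \varepsilon$ without proving it.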
3.3 The Uniform Boundedness Theorem
Theorem 3.B (The uniform boundedness theorem). Let $\mathcal{F}$ be a nonempty
set of continuous maps
$$F\colon X \to Y,$$
where $X$ is a Banach space over $\mathbb{K}$ and $Y$ is a normed space over $\mathbb{K}$. Suppose
that
$$\sup_{F \in \mathcal{F}} \|Fu\| < \infty \qquad \text{for all } u \in X.$$
$^{2}$ This limit exists, since $f_k(x) \to f(x)$ as $k \to \infty$ uniformly on $[0,1]$ and the
function $f$ is uniformly continuous on $[0,1]$.
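Then there exists a closed ball $B$ in $X$ of positive radius such that
$$\sup_{u \in B} \Bigl(\sup_{F \in \mathcal{F}} \|Fu\|\Bigr) < \infty.$$
Proof. Set
$$M_k := \bigcap_{F \in \mathcal{F}} \{u \in X : \|Fu\| \le k\}.$$
Obviously,
$$X = \bigcup_{n=1}^{\infty} M_n.$$
Since $F$ is continuous, the set $M_k$ is closed.$^{3}$ By the Baire theorem (Theorem 3.A),
$\operatorname{int} M_k \ne \emptyset$ for some $k$. Hence the set $M_k$ contains a closed ball
$B$ of positive radius. Then, by the definition of $M_k$,
$$\sup_{u \in B} \Bigl(\sup_{F \in \mathcal{F}} \|Fu\|\Bigr) \le k. \qquad \Box$$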
uEB FEr
Corollary 1 (The Banach-Steinhaus theorem). Let $\mathcal{L}$ be a nonempty set
of linear continuous operators
$$L\colon X \to Y,$$
where $X$ is a Banach space over $\mathbb{K}$ and $Y$ is a normed space over $\mathbb{K}$. Suppose
that
$$\sup_{L \in \mathcal{L}} \|Lu\| < \infty \qquad \text{for all } u \in X.$$
Then $\sup_{L \in \mathcal{L}} \|L\| < \infty$.
Proof. By Theorem 3.B, there exists a closed ball $B$ of positive radius in
$X$ such that
$$\sup_{u \in B} \Bigl(\sup_{L \in \mathcal{L}} \|Lu\|\Bigr) < \infty. \qquad (6)$$
Since $L$ is linear, we get $\|L\,r(u - u_0)\| \le r\|Lu\| + r\|Lu_0\|$ for all $r > 0$ and
$u_0 \in X$. Thus, relation (6) remains true if $B$ denotes the closed unit ball.
Hence
$$\sup_{L \in \mathcal{L}} \|L\| = \sup_{L \in \mathcal{L}} \Bigl(\sup_{\|u\| \le 1} \|Lu\|\Bigr) < \infty. \qquad \Box$$
$^{3}$ Obviously, the set $\{u \in X : \|Fu\| \le k\}$ is closed. Furthermore, observe that
the intersection of an arbitrary number of closed sets is again closed (cf. Problem
1.12).
Proposition 2. Let $(L_n)$ be a sequence of linear continuous operators
$$L_n\colon X \to Y,$$
where $X$ is a Banach space over $\mathbb{K}$ and $Y$ is a normed space over $\mathbb{K}$.
Then the following two conditions are equivalent:
(i) There exists a linear continuous operator $L\colon X \to Y$ such that
$$Lu = \lim_{n\to\infty} L_n u \qquad \text{for all } u \in X.$$
(ii) There is a dense subset $D$ of $X$ such that $\lim_{n\to\infty} L_n u$ exists for all
$u \in D$, and $\sup_n \|L_n\| < \infty$.
Proof. (i) $\Rightarrow$ (ii). This follows from the Banach-Steinhaus theorem (Corollary 1).
(ii) $\Rightarrow$ (i). Let $u \in X$. Then, for each $\varepsilon > 0$, there exists a point $v \in D$
such that
$$\|u - v\| < \varepsilon.$$
Since $(L_n v)$ is Cauchy,
$$\|L_n v - L_m v\| < \varepsilon \qquad \text{for all } n, m \ge n_0(\varepsilon).$$
Hence
$$\|L_n u - L_m u\| \le \|L_n u - L_n v\| + \|L_n v - L_m v\| + \|L_m v - L_m u\| \le 2\Bigl(\sup_n \|L_n\|\Bigr)\|u - v\| + \varepsilon \qquad \text{for all } n, m \ge n_0(\varepsilon).$$
Thus, the sequence $(L_n u)$ is Cauchy and is hence convergent. Define
$$Lu := \lim_{n\to\infty} L_n u.$$
Obviously, the operator $L\colon X \to Y$ is linear. Moreover,
$$\|Lu\| = \lim_{n\to\infty} \|L_n u\| \le \Bigl(\sup_n \|L_n\|\Bigr)\|u\| \qquad \text{for all } u \in X$$
(i.e., $L$ is also continuous). □
Standard Example 3 (Weak convergence). Let $(u_n)$ be a sequence in the
normed space $X$ over $\mathbb{K}$. Then the following two conditions are equivalent:
(i) $u_n \rightharpoonup u$ as $n \to \infty$.
(ii) The sequence $(\|u_n\|)$ is bounded, and there is a dense subset $D$ of
$X^*$ such that
$$\langle f, u_n \rangle \to \langle f, u \rangle \qquad \text{as } n \to \infty \text{ for all } f \in D.$$
Proof. Set $L_n f := \langle f, u_n \rangle$ for all $f \in X^*$ and fixed $n$. Since
$$|\langle f, u_n \rangle| \le \|f\|\,\|u_n\| \qquad \text{for all } f \in X^*,$$
the operator $L_n\colon X^* \to \mathbb{K}$ is linear and continuous. By Corollary 2 in Section 1.1,
$$\|L_n\| = \|u_n\| \qquad \text{for all } n.$$
The assertion follows now from Proposition 2. Note that $X^*$ is a Banach
space, by Section 1.21 in AMS Vol. 108. □
3.4 Applications to Cubature Formulas
Let $-\infty < a = x_0^{(n)} < x_1^{(n)} < \cdots < x_n^{(n)} = b < \infty$. By a cubature formula,
we understand a formula of the following form:
$$\int_a^b u(x)\,dx = L_n u + r_n(u), \qquad (7)$$
where
$$L_n u := \sum_{k=0}^{n} c_k^{(n)} u(x_k^{(n)}), \qquad n = 1, 2, \ldots,$$
and $r_n(u)$ denotes the remainder. Our problem is to choose the real numbers
$c_k^{(n)}$ in such a way that we obtain a convergent cubature formula for the
given function $u$:
$$\lim_{n\to\infty} r_n(u) = 0.$$
Proposition 1. The following two conditions are equivalent:
(i) The cubature formula (7) is convergent for all continuous functions
$u\colon [a, b] \to \mathbb{R}$.
(ii) The cubature formula (7) is convergent for all polynomials $u$ and
$$\sup_{n \ge 1} \sum_{k=0}^{n} |c_k^{(n)}| < \infty. \qquad (8)$$
Corollary 2. Suppose that all the numbers $c_k^{(n)}$ are nonnegative and that
the cubature formula is exact for the function $u \equiv 1$; then condition (8) is
satisfied.
In fact, letting $u(x) \equiv 1$ in (7), we get
$$\sum_{k=0}^{n} c_k^{(n)} = \int_a^b dx.$$
Proof of Proposition 1. Let $X := C[a, b]$, and set
$$Lu := \int_a^b u(x)\,dx.$$
Then, the operators $L, L_n\colon C[a, b] \to \mathbb{R}$ are linear and continuous. Moreover,
$$\|L_n\| = \sum_{k=0}^{n} |c_k^{(n)}| \qquad \text{for } n = 1, 2, \ldots. \qquad (9)$$
To prove this, let $u \in C[a, b]$. Fix the number $n$. Then
$$|L_n u| \le \sum_{k=0}^{n} |c_k^{(n)}|\,|u(x_k^{(n)})| \le \Bigl(\sum_{k=0}^{n} |c_k^{(n)}|\Bigr)\|u\|.$$
Furthermore, let us construct a piecewise linear, continuous function $w\colon [a, b] \to \mathbb{R}$
by prescribing the values
$$w(x_k^{(n)}) := \operatorname{sgn} c_k^{(n)}, \qquad k = 0, \ldots, n,$$
at all the node points $x_k^{(n)}$. Then
$$|L_n w| = \Bigl|\sum_{k=0}^{n} c_k^{(n)} \operatorname{sgn} c_k^{(n)}\Bigr| = \sum_{k=0}^{n} |c_k^{(n)}|\,\|w\|,$$
since $\|w\| = 1$. This yields (9).
By the Weierstrass approximation theorem, the set of polynomials is
dense in the Banach space $C[a, b]$. Therefore, the assertion follows from the
Banach-Steinhaus theorem (Proposition 2 in Section 3.3). □
Example 3 (The trapezoid formula). Let $x_k^{(n)} := a + \frac{k(b - a)}{n}$, $k = 0, 1, \ldots, n$.
Then
$$L_n u := \frac{b - a}{n}\Bigl(\frac{u(b) + u(a)}{2} + u(x_1^{(n)}) + \cdots + u(x_{n-1}^{(n)})\Bigr) \qquad (10)$$
is called the trapezoid formula, where $n = 1, 2, \ldots$.
(i) For each $u \in C^2[a, b]$, we get the following error estimates:
$$|r_n(u)| \le \frac{(b - a)^3}{12 n^2} \max_{a \le x \le b} |u''(x)|, \qquad n = 1, 2, \ldots. \qquad (11)$$
(ii) For each $u \in C[a, b]$, the trapezoid formula converges as $n \to \infty$.
Proof. Ad (i). We set $\alpha := x_k^{(n)}$ and $\beta := x_{k+1}^{(n)}$. Let
$$r := \int_\alpha^\beta u(x)\,dx - \frac{\beta - \alpha}{2}\bigl(u(\alpha) + u(\beta)\bigr).$$
For given $y$ with $\alpha < y < \beta$, define the linear function
$$p(x) := u(\alpha) + (x - \alpha)\,\frac{u(\beta) - u(\alpha)}{\beta - \alpha},$$
and set
$$\varphi(x) := u(x) - p(x) - \frac{u(y) - p(y)}{(y - \alpha)(y - \beta)}\,(x - \alpha)(x - \beta). \qquad (12)$$
Then, $\varphi(\alpha) = \varphi(y) = \varphi(\beta) = 0$. By the mean value theorem, this implies
the existence of numbers $\xi$ and $\eta$ with $\alpha < \xi < y < \eta < \beta$ such that
$$\varphi'(\xi) = \varphi'(\eta) = 0.$$
Again by the mean value theorem, there is a number $\zeta$ with $\xi < \zeta < \eta$ such
that
$$\varphi''(\zeta) = 0.$$
According to (12), this implies
$$u''(\zeta) - 2\,\frac{u(y) - p(y)}{(y - \alpha)(y - \beta)} = 0,$$
and hence, since $y$ was arbitrary and $\zeta$ depends on $y$,
$$u(x) - p(x) = \frac{u''(\zeta(x))}{2}\,(x - \alpha)(x - \beta) \qquad \text{for all } x \in [\alpha, \beta].$$
Integration yields
$$\Bigl|\int_\alpha^\beta u(x)\,dx - \int_\alpha^\beta p(x)\,dx\Bigr| \le \frac{1}{2} \max_{\alpha \le x \le \beta} |u''(x)| \int_\alpha^\beta |(x - \alpha)(x - \beta)|\,dx.$$
Hence
$$\Bigl|\int_\alpha^\beta u(x)\,dx - \frac{1}{2}(\beta - \alpha)\bigl(u(\alpha) + u(\beta)\bigr)\Bigr| \le \max_{a \le x \le b} |u''(x)|\,\frac{(\beta - \alpha)^3}{12}.$$
Recall that $\alpha = a + \frac{k(b - a)}{n}$ and $\beta = a + \frac{(k + 1)(b - a)}{n}$; then summation over $k$ yields
(11).
Ad (ii). If $u$ is a polynomial, then $r_n(u) \to 0$ as $n \to \infty$, by (11). Moreover,
the trapezoid formula (10) is exact for $u \equiv 1$, again by (11).
Thus, assertion (ii) is a special case of Proposition 1 and Corollary 2. □
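The trapezoid formula (10), the error estimate (11), and the norm formula (9) can be tried out numerically. The following Python sketch is a minimal illustration, assuming equidistant nodes $x_k^{(n)} = a + k(b - a)/n$ and floating-point arithmetic. The function names are hypothetical.

```python
import math


def trapezoid_rule(a, b, n):
    """Nodes x_k and weights c_k of the trapezoid formula with n subintervals."""
    h = (b - a) / n
    nodes = [a + k * h for k in range(n + 1)]
    weights = [h / 2] + [h] * (n - 1) + [h / 2]
    return nodes, weights


def apply_rule(nodes, weights, u):
    """L_n u = sum of c_k * u(x_k)."""
    return sum(c * u(x) for x, c in zip(nodes, weights))


def error_bound(a, b, n, max_second_derivative):
    """Right-hand side of estimate (11)."""
    return (b - a) ** 3 / (12 * n ** 2) * max_second_derivative
```

For $u = \sin$ on $[0, \pi]$ the exact integral is 2 and $\max |u''| = 1$, so the remainder `abs(2 - apply_rule(nodes, weights, math.sin))` should stay below `error_bound(0, math.pi, n, 1.0)` for each $n$. Since all weights are nonnegative, `sum(weights)` equals $b - a$ up to rounding, which is the value of $\|L_n\|$ in (9) and illustrates condition (8).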
3.5 The Open Mapping Theorem
Theorem 3.C (Banach's open mapping theorem). Let $A\colon X \to Y$ be a
linear continuous operator, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$.
Then, the following two conditions are equivalent:
(i) A is surjective.
(ii) A is open, that is, A maps open sets onto open sets.
Proof. (i) $\Rightarrow$ (ii). Let us introduce the open ball
$$B_R := \{u \in X : \|u\| < R\}.$$
Step 1: Since $A$ is surjective,
$$Y = \bigcup_{n=1}^{\infty} A(B_n). \qquad (13)$$
By the Baire theorem (Theorem 3.A), there is some index $m$ such that the
set $A(B_m)$ is not nowhere dense. Thus, there is a point $w \in Y$ such
that
$$w \in \operatorname{int} \overline{A(B_m)}.$$
Since $A$ is surjective, there exists some point $u \in X$ such that $w = Au$.
Hence
$$0 \in \operatorname{int} \overline{A(B_m - u)}.$$
Finally, choose the number $R > 0$ so large that $B_m - u \subseteq B_R$. Then
$$0 \in \operatorname{int} \overline{A(B_R)}. \qquad (14)$$
Step 2: Let us prove the stronger result that
$$0 \in \operatorname{int} A(B_R). \qquad (15)$$
Condition (14) means that there is some number $r > 0$ such that
$$\|v\| < r \text{ with } v \in Y \text{ implies } v \in \overline{A(B_R)}.$$
In particular, this implies the following:
For each $v \in Y$ with $\|v\| < r$, there is a point $u \in B_R$ such that $\|v - Au\| < \frac{r}{2}$. (16)
To prove (15) it is sufficient to show that, for each $v \in Y$ with $\|v\| < r$,
there is some point $u \in B_{3R}$ such that
$$v = Au. \qquad (17)$$
In fact, this means that $0 \in \operatorname{int} A(B_{3R})$, and hence we get (15), by the
linearity of $A$.
Let $v \in Y$ be given with $\|v\| < r$. Using (16), we construct a sequence
$(u_n)$ in the ball $B_R$ such that $v_0 := v$ and
$$\|2(v_n - Au_n)\| < r, \qquad v_{n+1} = 2(v_n - Au_n), \qquad n = 0, 1, \ldots.$$
Hence
$$2^{-n-1} v_{n+1} = 2^{-n} v_n - A(2^{-n} u_n), \qquad n = 0, 1, \ldots.$$
This implies
$$v - A\Bigl(\sum_{n=0}^{m} 2^{-n} u_n\Bigr) = 2^{-m-1} v_{m+1}, \qquad m = 0, 1, \ldots. \qquad (18)$$
Since $\|u_n\| < R$ for all $n$, the series
$$u := \sum_{n=0}^{\infty} 2^{-n} u_n$$
is convergent, by Section 1.22 in AMS Vol. 108. Hence $\|u\| < 3R$. Letting
$m \to \infty$ in (18), we get (17).
Step 3: Let $U$ be an open subset of $X$, and let $u \in U$. Then there is some
$r > 0$ such that
$$u + rB_R \subseteq U.$$
Using the linearity of the operator $A$ and (15), we obtain
$$Au \in \operatorname{int} A(u + rB_R),$$
and hence $Au \in \operatorname{int} A(U)$. Thus, the set $A(U)$ is open.
(ii) $\Rightarrow$ (i). Since $A$ is open, the set $A(X)$ contains an interior point. This
implies $A(X) = Y$, by the linearity of $A$. □
Proposition 1 (Banach's continuous inverse theorem). Let $A\colon X \to Y$ be
a linear continuous operator, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. If
the inverse operator
$$A^{-1}\colon Y \to X$$
exists, then it is continuous.
Proof. By the open mapping theorem (Theorem 3.C), the operator $A$ is
open. Thus, if the set $W$ is open in $X$, then $A(W)$ is open in $Y$. A general
result about continuous maps on topological spaces tells us that this implies
the continuity of $A^{-1}$ (cf. Problem 1.13a).
A direct proof goes like this. Set $B_\varepsilon := \{u \in X : \|u\| < \varepsilon\}$. Since $A^{-1}$ is
linear, it is sufficient to prove that $A^{-1}$ is continuous at the point $v = 0$.
In fact, for each given $\varepsilon > 0$, the set $A(B_\varepsilon)$ is open, since $A$ is open. Hence
$0 \in \operatorname{int} A(B_\varepsilon)$ because $A(0) = 0$. Thus, there is some number $\delta(\varepsilon) > 0$ such
that $\|Au\| < \delta(\varepsilon)$ implies $u \in B_\varepsilon$, that is,
$$\|Au\| < \delta(\varepsilon) \text{ implies } \|u\| < \varepsilon.$$
Hence $\|v\| < \delta(\varepsilon)$ implies $\|A^{-1}v\| < \varepsilon$. This means that the operator $A^{-1}$
is continuous at $v = 0$. □
The following corollary represents an important reformulation of Propo-
sition 1 in terms of the operator equation
$$Au = v, \qquad u \in X. \qquad (19)$$
Corollary 2 (The well-posedness principle). Let $A\colon X \to Y$ be a linear
continuous operator, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then the
following two conditions are equivalent:
(i) Equation (19) is well posed, that is, by definition, for each given $v \in Y$,
equation (19) has a unique solution $u$, which depends continuously
on $v$.
(ii) For each $v \in Y$, equation (19) has a solution $u$, and $Aw = 0$ implies
$w = 0$.
3.6 Product Spaces
Definition 1. Let $X_1, \ldots, X_n$ be normed spaces over $\mathbb{K}$. The product space
$$X_1 \times \cdots \times X_n$$
consists of all the $n$-tuples
$$(u_1, \ldots, u_n),$$
where $u_k \in X_k$ for $k = 1, \ldots, n$.
For $\alpha \in \mathbb{K}$, we set
$$\alpha(u_1, \ldots, u_n) := (\alpha u_1, \ldots, \alpha u_n),$$
$$(u_1, \ldots, u_n) + (v_1, \ldots, v_n) := (u_1 + v_1, \ldots, u_n + v_n),$$
and
$$\|(u_1, \ldots, u_n)\| := \sum_{k=1}^{n} \|u_k\|. \qquad (20)$$
Then, $X_1 \times \cdots \times X_n$ becomes a normed space over $\mathbb{K}$.
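Proposition 2. If $X_1, \ldots, X_n$ are Banach spaces, then so is the product
space $X_1 \times \cdots \times X_n$.
Proof. Let us consider the case where $n = 2$. The general case proceeds
analogously.
Suppose that the sequence of the points $(u_n, v_n)$ is Cauchy in $X_1 \times X_2$.
Then
$$\|(u_n, v_n) - (u_m, v_m)\| = \|u_n - u_m\| + \|v_n - v_m\| < \varepsilon \qquad \text{for all } n, m \ge n_0(\varepsilon).$$
Thus, $(u_n)$ and $(v_n)$ are Cauchy in $X_1$ and $X_2$, respectively. Hence
$$u_n \to u \text{ in } X_1 \quad \text{and} \quad v_n \to v \text{ in } X_2 \qquad \text{as } n \to \infty.$$
This implies
$$\|(u_n, v_n) - (u, v)\| = \|u_n - u\| + \|v_n - v\| \to 0 \qquad \text{as } n \to \infty,$$
that is, $(u_n, v_n) \to (u, v)$ in $X_1 \times X_2$ as $n \to \infty$. □
By the same argument, it follows from (20) that
$$(u_1^{(k)}, \ldots, u_n^{(k)}) \to (u_1, \ldots, u_n) \text{ in } X_1 \times \cdots \times X_n \qquad \text{as } k \to \infty$$
iff all the components converge, that is,
$$u_m^{(k)} \to u_m \text{ in } X_m \qquad \text{as } k \to \infty$$
for all $m = 1, \ldots, n$.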
3.7 The Closed Graph Theorem
Definition 1. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. By the graph $G(A)$
of the operator
$$A\colon D(A) \subseteq X \to Y,$$
we mean the subset
$$G(A) := \{(u, Au) : u \in D(A)\}$$
of the product space $X \times Y$.
FIGURE 3.3. [Three panels (a), (b), and (c), each showing a graph $G(A)$.]
The operator $A$ is called graph-closed iff $G(A)$ is closed in $X \times Y$. This
means that for each sequence $(u_n)$ in the set $D(A)$ it follows from
$$u_n \to u \text{ in } X \qquad \text{as } n \to \infty \qquad (21)$$
and
$$Au_n \to v \text{ in } Y \qquad \text{as } n \to \infty$$
that $u \in D(A)$ and $v = Au$.
Example 2. Let $X = Y = \mathbb{R}$. Then the following are true:
(i) The function $A\colon [1, 2] \to \mathbb{R}$ pictured in Figure 3.3(a) is continuous and
graph-closed in $\mathbb{R} \times \mathbb{R}$.
(ii) The function $A\colon ]0, 1[ \to \mathbb{R}$ pictured in Figure 3.3(b) is continuous but
is not graph-closed in $\mathbb{R} \times \mathbb{R}$.
(iii) The function $A\colon \mathbb{R} \to \mathbb{R}$ pictured in Figure 3.3(c) is not continuous
but is graph-closed.
It follows from (21) that each continuous operator $A\colon X \to Y$ is also
graph-closed. The converse is not always true, by Example 2(iii). However,
the following theorem tells us that the situation is nice in the linear case.
Theorem 3.D (Banach's closed graph theorem). Let $X$ and $Y$ be Banach
spaces over $\mathbb{K}$.
Then, each graph-closed linear operator $A\colon X \to Y$ is continuous.
Proof. Let us define the following two linear continuous operators,
$$P\colon G(A) \to X \quad \text{and} \quad Q\colon G(A) \to Y,$$
through
$$P(u, Au) := u \quad \text{and} \quad Q(u, Au) := Au,$$
for all $u \in X$. Obviously,
$$P(u, Au) = 0$$
implies $u = 0$ and $Au = 0$. Thus, the operator $P$ is bijective. Since $A$ is
graph-closed, $G(A)$ is a closed linear subspace of the Banach space $X \times Y$.
Hence $G(A)$ is also a Banach space. By the continuous inverse theorem in
Section 3.5, the inverse operator
$$P^{-1}\colon X \to G(A)$$
is continuous. Obviously, the diagram
$$X \xrightarrow{\;A\;} Y, \qquad X \xrightarrow{\;P^{-1}\;} G(A) \xrightarrow{\;Q\;} Y$$
is commutative (i.e., $A = QP^{-1}$). Therefore, $A$ is continuous. □
Standard Example 3. Let $A\colon X \to X$ be a linear self-adjoint operator
on the Hilbert space $X$ over $\mathbb{K}$. Then $A$ is continuous.
Proof. Let $u_n \to u$ and $Au_n \to v$ in $X$ as $n \to \infty$. It follows from
$$(Au_n \mid w) = (u_n \mid Aw) \qquad \text{for all } w \in X$$
that
$$(v \mid w) = (u \mid Aw) = (Au \mid w) \qquad \text{for all } w \in X,$$
that is, $Au = v$. Thus, $A$ is graph-closed, and hence continuous, by Theorem
3.D. □
3.8 Applications to Factor Spaces
The following results on factor spaces and direct sums represent impor-
tant auxiliary tools for the investigation of linear and nonlinear operator
equations in Section 3.12 and in Chapters 4 and 5. The proofs will be
based on the continuous inverse theorem, the closed graph theorem, the
Hahn-Banach theorem, and the Zorn lemma.
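Let $L$ be a linear subspace of the linear space $X$ over $\mathbb{K}$. For all $u, v \in X$,
we define
$$u \equiv v \pmod{L} \quad \text{iff} \quad u - v \in L. \qquad (22)$$
This is an equivalence relation. In fact, for all $u, v, w, z \in X$ and $\alpha \in \mathbb{K}$, we
have the following:
$$u \equiv u \pmod{L};$$
$$u \equiv v \pmod{L} \;\Rightarrow\; v \equiv u \pmod{L}; \qquad (23)$$
$$u \equiv v \pmod{L},\; v \equiv w \pmod{L} \;\Rightarrow\; u \equiv w \pmod{L}.$$
This equivalence relation is compatible with the linear structure of $X$:
$$u \equiv v \pmod{L} \;\Rightarrow\; \alpha u \equiv \alpha v \pmod{L};$$
$$u \equiv w \pmod{L},\; v \equiv z \pmod{L} \;\Rightarrow\; u + v \equiv w + z \pmod{L}. \qquad (24)$$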
Definition 1. The factor space $X/L$ consists of all the equivalence classes
$[u]$ with respect to (22), that is,
$$v \in [u] \quad \text{iff} \quad u \equiv v \pmod{L}.$$
Explicitly, this means that
$$[u] = u + L.$$
The elements $v$ of the class $[u]$ are called the representatives of $[u]$. Obviously,
$$[u] = [v] \;\Leftrightarrow\; u \equiv v \pmod{L}. \qquad (25)$$
If we introduce the linear operations
$$\alpha[u] := [\alpha u], \qquad [u] + [v] := [u + v], \qquad (26)$$
the factor space $X/L$ becomes a linear space. The operations in (26) are
well defined, namely, they are independent of the chosen representatives.
This follows from (24) and (25). For example, if $[u] = [v]$, then $u \equiv v \pmod{L}$,
and hence $\alpha u \equiv \alpha v \pmod{L}$, that is, $[\alpha u] = [\alpha v]$.
In other words, the factor space $X/L$ consists of all the different sets
$$u + L, \qquad \text{where } u \in X,$$
and the linear operations on $X/L$ are given through
$$(u + L) + (v + L) = (u + v) + L, \qquad \alpha(u + L) = \alpha u + L,$$
which corresponds to the usual operations $A + B$ and $\alpha A$ for subsets $A$ and
$B$ of linear spaces as defined in Section 1.1 of AMS Vol. 108.
FIGURE 3.4.
Proposition 2. Let $L$ be a closed linear subspace of the normed space $X$
over $\mathbb{K}$. Then the following are true:
(i) The factor space $X/L$ becomes a normed space over $\mathbb{K}$ with respect to
the norm
$$\|[u]\| := \inf_{v \in [u]} \|v\|. \qquad (27)$$
(ii) If $X$ is a Banach space, then so is $X/L$.
Since $[u] = u + L$, we get
$$\|[u]\| = \operatorname{dist}(0, u + L) = \operatorname{dist}(u, L).$$
Example 3. Let $X = \mathbb{R}^2$ with the Euclidean norm $\|\cdot\|$. In Figure 3.4, the
factor space $X/L$ consists of all the straight lines $[u] = u + L$ parallel to $L$,
and the norm $\|[u]\|$ is equal to the distance from the origin to the straight
line $[u]$.
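Example 3 is easy to reproduce numerically. The following Python sketch is an illustration that assumes $L = \operatorname{span}\{d\}$ for a nonzero direction vector $d \in \mathbb{R}^2$. It computes $\|[u]\| = \operatorname{dist}(u, L)$ by subtracting from $u$ its orthogonal projection onto $L$. The function name is hypothetical.

```python
import math


def quotient_norm(u, d):
    """Norm of the class [u] = u + L in R^2 / L, where L = span{d}, d != 0."""
    t = (u[0] * d[0] + u[1] * d[1]) / (d[0] ** 2 + d[1] ** 2)
    # u - t*d is the representative of [u] with the smallest Euclidean norm
    return math.hypot(u[0] - t * d[0], u[1] - t * d[1])
```

With $d = (1, 1)$, the representatives $(2, 0)$ and $(2, 0) + 5d = (7, 5)$ give the same value $\sqrt{2}$, which reflects the fact that the norm (27) depends only on the class $[u]$. For $u \in L$ the value is 0, in accordance with $[u] = 0$.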
Proof of Proposition 2. Ad (i). We first show that
$$\|[u]\| = 0 \;\Leftrightarrow\; [u] = 0.$$
This is identical to
$$\|[u]\| = 0 \;\Leftrightarrow\; u \in L.$$
In fact, if $u \in L$, then $[u] = L$. Hence $0 \in [u]$ and $\|[u]\| = 0$, by (27).
Conversely, let $\|[u]\| = 0$. Since $L$ is closed, so is the set $[u] = u + L$. By
(27), $0 \in u + L$. Hence $u \in L$.
Let $\alpha \in \mathbb{K}$. Since $\|\alpha v\| = |\alpha|\,\|v\|$, we have
$$\|\alpha[u]\| = \inf_{w \in [u]} \|\alpha w\| = |\alpha| \inf_{w \in [u]} \|w\| = |\alpha|\,\|[u]\|.$$
Finally, it follows from $\|w_1 + w_2\| \le \|w_1\| + \|w_2\|$ that
$$\|[u] + [v]\| = \inf_{w_1 \in [u],\, w_2 \in [v]} \|w_1 + w_2\| \le \inf_{w_1 \in [u]} \|w_1\| + \inf_{w_2 \in [v]} \|w_2\| = \|[u]\| + \|[v]\|.$$
Ad (ii). It follows from (27) that each class $[u]$ contains a point $v$ such
that
$$\|v\| \le 2\|[u]\|. \qquad (28)$$
Now let $([u_n])$ be a Cauchy sequence in $X/L$. Using a simple induction
argument based on (28), we obtain a sequence $(v_n)$ in $X$ such that $v_n \in [u_n]$
and
$$\|v_n - v_{n+1}\| \le 2\|[u_n] - [u_{n+1}]\| \qquad \text{for all } n. \qquad (29)$$
First suppose that
$$\|[u_n] - [u_{n+1}]\| \le 2^{-n-1} \qquad \text{for all } n. \qquad (30)$$
It follows from (29) and the triangle inequality that
$$\|v_{n+m} - v_n\| \le 2^{-n}(1 + 2^{-1} + 2^{-2} + \cdots),$$
that is, $(v_n)$ is Cauchy in $X$. Since $X$ is a Banach space, we have
$$v_n \to v \text{ in } X \qquad \text{as } n \to \infty.$$
By (27),
$$\|[u_n] - [v]\| \le \|v_n - v\| \to 0 \qquad \text{as } n \to \infty.$$
Hence $([u_n])$ is convergent in $X/L$.
In the general case, there exists a subsequence, again denoted by $([u_n])$,
such that (30) holds. Then the assertion follows from Proposition 7 in
Section 1.3 of AMS Vol. 108. □
Definition 4. Let $L$ be a linear subspace of the linear space $X$ over $\mathbb{K}$.
Then, the canonical mapping
$$\pi\colon X \to X/L$$
is defined through
$$\pi(u) := [u] \qquad \text{for all } u \in X,$$
where $[u] = u + L$.
Proposition 5. If $L$ is a closed linear subspace of the normed space $X$
over $\mathbb{K}$, then the canonical mapping $\pi\colon X \to X/L$ is linear, continuous,
and surjective.
Proof. For all $u \in X$, $\|\pi(u)\| = \|[u]\| \le \|u\|$. □
Let $A\colon X \to Y$ be a linear continuous operator, where $X$ and $Y$ are
Banach spaces over $\mathbb{K}$. We define the operator
$$[A]\colon X/N(A) \to R(A) \qquad (31)$$
through
$$[A][u] := Au.$$
This definition is independent of the selected representative. In fact,
let $[u] = [v]$. Then $u - v \in N(A)$, that is, $A(u - v) = 0$, and hence $Au = Av$. □
Proposition 6. Let the range $R(A)$ of the operator $A$ be closed.
(i) The operator $[A]$ from (31) is a linear homeomorphism.
(ii) There exists a number $c > 0$ such that
$$c \cdot \operatorname{dist}(u, N(A)) \le \|Au\| \qquad \text{for all } u \in X. \qquad (32)$$
Proof. Ad (i). The null space $N(A) = \{u \in X : Au = 0\}$ is closed. In fact,
if
$$Au_n = 0 \quad \text{and} \quad u_n \to u \qquad \text{as } n \to \infty,$$
then $Au = 0$. Thus, $X/N(A)$ is a Banach space. Obviously, the operator
$[A]$ is linear. Since
$$\|[A][u]\| = \|Av\| \le \|A\|\,\|v\| \qquad \text{for all } v \in [u],$$
we have $\|[A][u]\| \le \|A\|\,\|[u]\|$, and thus $[A]$ is continuous.
Furthermore, the operator $[A]$ is bijective. In fact, if $[A][u] = 0$, then
$u \in N(A)$, and hence $[u] = 0$.
Since $R(A)$ is a closed linear subspace of the Banach space $Y$, the range
$R(A)$ is also a Banach space. The continuous inverse theorem from Section 3.5
now tells us that the inverse operator $[A]^{-1}\colon R(A) \to X/N(A)$ is
continuous.
Ad (ii). By (i), there is a constant $d > 0$ such that
$$\|[A]^{-1}([A][u])\| \le d\,\|[A][u]\| \qquad \text{for all } [u] \in X/N(A).$$
Hence
$$\|[v]\| \le d\,\|[A][v]\| \qquad \text{for all } [v] \in X/N(A).$$
This is (32) with $c = d^{-1}$. □
3.9 Applications to Direct Sums and Projections
3.9.1 Projections
Definition 1. Let $X$ be a linear space over $\mathbb{K}$, and let $L_1$ and $L_2$ be linear
subspaces of $X$.
(i) We write
$$X = L_1 \oplus L_2 \qquad (33)$$
iff each $u \in X$ allows the following unique representation:
$$u = u_1 + u_2, \qquad \text{where } u_1 \in L_1 \text{ and } u_2 \in L_2. \qquad (34)$$
We say that $X$ is the direct sum of $L_1$ and $L_2$, and that $L_2$ is an
algebraic complement of $L_1$ in $X$.
(ii) The operator $P\colon X \to X$ is called an algebraic projection iff $P$ is linear
and $P^2 = P$.
(iii) If $X$ is a normed space, then the operator $P\colon X \to X$ is called a
continuous projection iff $P$ is a continuous algebraic projection.
Obviously,
$$X = L_1 \oplus L_2$$
Moreover, let $X = L_1 \oplus L_2$. Then
$$u \in L_1 \cap L_2 \qquad \text{implies} \qquad u = 0.$$
This follows from $u = u + 0 = 0 + u$ and from the uniqueness of the
decomposition in (34).
Using the Zorn lemma, we will prove in Proposition 8 ahead that
each linear subspace $L_1$ of the linear space $X$ has an algebraic complement
$L_2$ in $X$.
Proposition 2. Let X be a linear space. Then the following statements
hold true:
(i) Suppose that $X = L_1 \oplus L_2$. If we set
$Pu := u_1$
in (34), then $P: X \to X$ is an algebraic projection onto the linear
subspace $L_1$. Moreover,
$L_1 = P(X)$ and $L_2 = (I - P)(X) = N(P)$. (35)
We call $P$ the projection onto $L_1$ along $L_2$.
(ii) Conversely, if $P: X \to X$ is an algebraic projection, then $X = L_1 \oplus L_2$
with (35).
Proof. Ad (i). Since the decomposition in (34) is unique, and since
$u_1 = u_1 + 0$, where $u_1 \in L_1$ and $0 \in L_2$,
we obtain $Pu_1 = u_1$, and hence $P^2 u = Pu_1 = u_1 = Pu$. That is, $P^2 = P$.
By (34), $u_2 = u - u_1 = (I - P)u$. Hence $L_2 = (I - P)(X)$. Finally, it
follows from (34) that
$Pu = 0 \iff u \in L_2$,
that is, $N(P) = L_2$.
Ad (ii). Let $u \in X$. Setting $u_1 := Pu$ and $u_2 := (I - P)u$, we obtain
$u = u_1 + u_2$, where $u_1 \in L_1$ and $u_2 \in L_2$,
by (35). This decomposition is unique. In fact, let
$u = v_1 + v_2$, where $v_1 \in L_1$ and $v_2 \in L_2$.
By (35), we get $v_1 = Pv$ and $v_2 = (I - P)w$ for some $v, w \in X$. Since
$P^2 = P$, this implies $Pv_1 = v_1$ and $Pv_2 = 0$. Hence
$Pu = Pv_1 + Pv_2 = v_1$.
This yields $u_1 = v_1$ and $u_2 = v_2$. □
Definition 3. (i) The direct sum $X = L_1 \oplus L_2$ is called a topological
direct sum iff the corresponding projection $P: X \to X$ onto $L_1$ along $L_2$ is
continuous. Then we say that $L_2$ is a topological complement of $L_1$ in $X$.
(ii) The linear subspace $L_1$ splits the normed space $X$ iff $L_1$ has a
topological complement in $X$.
Example 4. Let $X = \mathbb{R}^2$. Then
$\mathbb{R}^2 = L_1 \oplus L_2$, (36)
where $L_1$ and $L_2$ denote the two straight lines pictured in Figure 3.5. The
projection $P: X \to X$ onto $L_1$ along $L_2$ corresponds to the usual parallel
projection onto $L_1$ parallel to $L_2$. Since $P$ is continuous, (36) represents a
topological direct sum.
Moreover, both $L_1$ and $L_2$ split $\mathbb{R}^2$.
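The parallel projection of Example 4 can be checked numerically. The following is only a sketch under an assumed choice of lines, since Figure 3.5 is not reproduced here: we take $L_1 = \operatorname{span}\{(1,0)\}$ and $L_2 = \operatorname{span}\{(1,1)\}$, so that $(x, y) = (x - y)(1, 0) + y(1, 1)$. The helper names `matmul` and `apply` are illustrative.

```python
# Oblique (parallel) projection in R^2 for an assumed pair of lines:
# L1 = span{(1, 0)}, L2 = span{(1, 1)}.

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(a, u):
    """Matrix-vector product."""
    return [sum(row[k] * u[k] for k in range(len(u))) for row in a]

P = [[1, -1], [0, 0]]   # projection onto L1 along L2
I = [[1, 0], [0, 1]]
Q = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]  # I - P

u = [5, 2]
u1 = apply(P, u)        # component of u in L1
u2 = apply(Q, u)        # component of u in L2

print(matmul(P, P) == P)    # P is idempotent, as required by P^2 = P
print(u1, u2)               # the two components of the decomposition (34)
print(apply(P, [1, 1]))     # a vector of L2 lies in the null space N(P)
```

Here `Q` plays the role of $I - P$, so that `u1 + u2` recovers `u` as in (34) and (35).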
FIGURE 3.5.
Proposition 5. Let $L$ be a linear subspace of the normed space $X$ over $\mathbb{K}$.
Then
(i) $L$ splits $X$ iff there exists a continuous projection $P: X \to X$ onto $L$.
(ii) If $L$ splits $X$, then $L$ is closed.
Unfortunately, the converse of (ii) is not true. 4
Important classes of splitting closed linear subspaces will be considered
later after some necessary preparations (cf. Standard Example 17).
Proof. Ad (i). This follows from Proposition 2.
Ad (ii). By (i), we have $L = P(X)$, where the projection $P: X \to L$ is
continuous. Let $(u_n)$ be a sequence in $L$ such that $u_n \to u$ as $n \to \infty$.
Letting $n \to \infty$, it follows from
$u_n = Pu_n$ for all $n$
that $u = Pu$, and hence $u \in L$. Thus, $L$ is closed. □
Proposition 6. Let $X = L_1 \oplus L_2$ be a direct sum, where $L_1$ and $L_2$ are
linear subspaces of the Banach space $X$. Then the following two conditions
are equivalent:
(i) $X = L_1 \oplus L_2$ represents a topological direct sum.
(ii) Both $L_1$ and $L_2$ are closed.
Proof. (i) $\Rightarrow$ (ii). Both $L_1$ and $L_2$ split $X$, and hence $L_1$ and $L_2$ are closed.
(ii) $\Rightarrow$ (i). Let $P: X \to X$ be the algebraic projection onto $L_1$ along $L_2$.
We have to show that $P$ is continuous. To this end, let $(u_n)$ be a sequence
in $X$. Then,
$u_n = u_{1n} + u_{2n}$, where $u_{1n} \in L_1$ and $u_{2n} \in L_2$. (37)
4 Counterexamples were given by Murray, On complementary manifolds and
projections in spaces $L_p$ and $l_p$. Transact. Amer. Math. Soc. 41 (1937), 138-152.
Hence $u_{1n} = Pu_n$. Suppose that
$u_n \to u$ and $Pu_n \to v$ as $n \to \infty$.
Letting $n \to \infty$ in (37), we get
$u = v + w$,
where $u_{2n} \to w$ as $n \to \infty$. Since $L_1$ and $L_2$ are closed, we have $v \in L_1$
and $w \in L_2$. Thus, $v = Pu$. Consequently, the operator $P$ is graph-closed.
The closed graph theorem in Section 3.7 tells us that $P$ is continuous. □
3.9.2 Codimension
Definition 7. Let $L$ be a linear subspace of the linear space $X$ over $\mathbb{K}$.
Then, the codimension of $L$ in $X$ is defined as the dimension of the factor
space $X/L$, denoted as
$\operatorname{codim} L := \dim(X/L)$.
Obviously, if $L = X$, then $X/L = \{0\}$, and hence $\operatorname{codim} X = 0$. The
following proposition explains the intuitive meaning of $\operatorname{codim} L$.
Proposition 8. Let $L$ be a linear subspace of the linear space $X$ over $\mathbb{K}$.
Then the following statements hold true:
(i) There exists a linear subspace $M$ of $X$ such that
$X = L \oplus M$. (38)
(ii) If $M$ is any linear subspace of $X$ such that (38) holds, then
$\operatorname{codim} L = \dim M$.
(iii) From (38) we get
$\dim X = \dim L + \dim M$,
and hence
$\dim X = \dim L + \operatorname{codim} L$.
It follows from (iii) that if $X = L \oplus M$ and $\dim X < \infty$, then
$\operatorname{codim} L = \dim X - \dim L$. (39)
Example 9. For an $m$-dimensional linear subspace $L$ of $\mathbb{R}^N$, $N \ge 1$, we
get
$\operatorname{codim} L = N - m$. (40)
For example, if $L$ is a plane through the origin in $\mathbb{R}^3$, then
$\dim L = 2$ and $\operatorname{codim} L = 1$.
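Formula (40) is easy to try out on a computer. The sketch below is an illustration only: the function `rank` and the spanning set are our own choices, not part of the text. It computes $\dim L$ as the rank of a spanning set by exact Gaussian elimination and then applies $\operatorname{codim} L = N - \dim L$.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # look for a pivot in column c among the rows not yet used
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Spanning set of a plane L through the origin in R^3
# (the third vector is the sum of the first two).
spanning = [[1, 0, 1], [0, 1, 1], [1, 1, 2]]
N = 3
dim_L = rank(spanning)
codim_L = N - dim_L     # formula (40)
print(dim_L, codim_L)
```

Exact rational arithmetic is used so that the rank is not affected by rounding.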
Proof of Proposition 8. Ad (i). Let $\mathcal{C}$ be the class of all the linear
operators
$P: D(P) \subseteq X \to L$
such that $L \subseteq D(P)$ and $Pu = u$ on $L$. We write
$P_1 \le P_2$ iff $P_2$ is an extension of $P_1$.
By the Zorn lemma from the appendix in AMS Vol. 108, $\mathcal{C}$ contains a
maximal element $P_0$. Then $D(P_0) = X$. Otherwise, there would exist a
point $u_0 \in X \setminus D(P_0)$. Set $N := D(P_0) + \operatorname{span}\{u_0\}$, and define the operator
$P: N \to L$ through
$P(u + \alpha u_0) := P_0(u)$ for all $u \in D(P_0)$, $\alpha \in \mathbb{K}$.
Then $P$ is a proper extension of $P_0$, contradicting the maximality of $P_0$. In
addition, we get
$P_0^2 = P_0$.
In fact, for each $v \in X$, it follows from $P_0 v \in L$ that $P_0(P_0 v) = P_0 v$, by
the construction of $\mathcal{C}$.
Therefore, the operator $P_0: X \to X$ is an algebraic projection onto $L$.
Letting $M := (I - P_0)(X)$, we obtain
$X = P_0(X) \oplus (I - P_0)(X) = L \oplus M$.
Ad (ii). For each $u \in X$, we have
$u = v + w$, where $v \in L$ and $w \in M$.
Define the map $\phi: M \to X/L$ through
$\phi(w) := [w]$ for all $w \in M$.
Then, $\phi$ is linear and surjective, since $[u] = [w]$ for $u = v + w$.
Moreover, $\phi(w) = 0$ with $w \in M$ implies
$w \in L$. It follows from $w \in L \cap M$ and $X = L \oplus M$ that $w = 0$. Thus, $\phi$ is
a bijection. This yields
$\dim M = \dim X/L$,
and hence $\dim M = \operatorname{codim} L$.
Ad (iii). By (ii), it is sufficient to prove that $X = L \oplus M$ implies
$\dim X = \dim L + \dim M$. (41)
First let $\dim L = \infty$. Then $L \subseteq X$ implies $\dim X = \infty$. Analogously,
$\dim M = \infty$ yields $\dim X = \infty$.
Next suppose that $\dim L < \infty$ and $\dim M < \infty$. Then (41) follows from
the fact that the union of a basis in $L$ and a basis in $M$ represents a basis
in $X$. □
Corollary 10. Let $L$ be a linear subspace of the linear space $X$ over $\mathbb{K}$.
Suppose that $u_1, \ldots, u_m$ are linearly independent elements of $X$ such that
$L \cap \operatorname{span}\{u_1, \ldots, u_m\} = \{0\}$. (42)
Then, $m \le \operatorname{codim} L$.
Proof. It follows from (42) that $[u_1], \ldots, [u_m]$ are linearly independent
elements of $X/L$. Hence $m \le \dim X/L$. □
Corollary 11. Let $L$ be a closed linear subspace of the Banach space $X$
over $\mathbb{K}$ with $\operatorname{codim} L < \infty$, and let $S$ be a linear subspace of $X$ such that
$L \subseteq S \subseteq X$.
Then, $S$ is closed and $\operatorname{codim} S < \infty$.
Proof. Let us consider the canonical mapping
$\pi: X \to X/L$
from Section 3.8. Recall that $\pi(u) := [u] = u + L$ for all $u \in X$. The
restriction of $\pi$ to $S$ is given by
$\pi: S \to S/L$.
The operator $\pi$ is linear and continuous on $X$. Since $\operatorname{codim} L < \infty$, we
get $\dim X/L < \infty$, and hence $\dim S/L < \infty$. Consequently, the
finite-dimensional subspace $S/L$ of $X/L$ is closed, and the preimage
$S = \pi^{-1}(S/L)$
is therefore also closed, by Lemma 12 which appears next.
Since $L \subseteq S$, it follows from
$\alpha_1 u_1 + \cdots + \alpha_m u_m \equiv 0 \pmod{L}$
that
$\alpha_1 u_1 + \cdots + \alpha_m u_m \equiv 0 \pmod{S}$,
where $\alpha_1, \ldots, \alpha_m \in \mathbb{K}$. Therefore, if $u_1, \ldots, u_m$ are linearly independent
mod $S$, then they are also linearly independent mod $L$. Hence
$\dim X/S \le \dim X/L$.
This yields $\operatorname{codim} S \le \operatorname{codim} L$. □
Lemma 12. Let $A: X \to Y$ be a continuous operator, where $X$ and $Y$ are
normed spaces over $\mathbb{K}$. Let $W \subseteq Y$. The following conditions hold:
(i) If $W$ is open, then so is $A^{-1}(W)$.
(ii) If $W$ is closed, then so is $A^{-1}(W)$.
Proof. This is a special case of a more general result about continuous
maps on topological spaces (cf. Problem 1.13a). A direct proof runs as
follows.
Ad (i). Let $W$ be open, and let $u_0 \in A^{-1}(W)$. For each $\varepsilon > 0$, there is a
$\delta(\varepsilon) > 0$ such that
$\|u - u_0\| < \delta(\varepsilon)$ implies $\|Au - Au_0\| < \varepsilon$,
by the continuity of $A$. If we choose the number $\varepsilon$ sufficiently small, then
$\|u - u_0\| < \delta(\varepsilon)$ implies $Au \in W$,
and hence $u_0$ is an interior point of $A^{-1}(W)$. Thus, $A^{-1}(W)$ is open.
Ad (ii). Use (i) and the fact that the complements of closed sets are open.
□
3.9.3 Linear Operator Equations
Let us consider the linear operator equation
$Au = b$, $u \in X$. (43)
Proposition 13. Suppose that the operator $A: X \to Y$ is linear, where $X$
and $Y$ are linear spaces over $\mathbb{K}$. Let $L$ be any fixed algebraic complement
of the null space $N(A)$, namely, $L$ is a linear subspace of $X$ such that
$X = N(A) \oplus L$. (44)
Then the following statements are true:
(i) The restriction
$A: L \to R(A)$ (45)
is linear and bijective. Hence
$\operatorname{codim} N(A) = \dim R(A)$. (45*)
(ii) In addition, suppose that $X$ and $Y$ are Banach spaces, $L$ and $R(A)$ are
closed, and the operator $A: X \to Y$ is continuous. Then the operator
from (45) is a linear homeomorphism.
Recall that $R(A) = A(X)$. The number $\dim R(A)$ is called the rank of
$A$. We denote this as
$\operatorname{rank} A := \dim R(A)$.
Proof. Ad (i). It follows from $Au = 0$ with $u \in L$ that $u \in N(A) \cap L$. Hence $u = 0$,
by (44). Thus, the restriction (45) is injective. It is also surjective, since
each $u \in X$ has the form $u = v + w$ with $v \in N(A)$ and $w \in L$, and hence
$Au = Aw$. Relation (45*) now follows from Proposition 8(ii).
Ad (ii). This follows from the continuous inverse theorem in Section 3.5.
□
Suppose that $\dim X < \infty$ and $\dim Y < \infty$. Let
$B: L \to R(A)$
denote the restriction of the operator $A: X \to R(A)$ to the linear subspace
$L$ of $X$. Then, for each given $b \in R(A)$, the solution set of the original equation
(43) is given through
$B^{-1}b + N(A)$,
where $\dim N(A) = \dim X - \operatorname{rank} A$, by (45*).
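The structure $B^{-1}b + N(A)$ of the solution set can be made concrete for a small matrix. The following sketch rests on assumptions of our own: the matrix `A`, the right-hand side `b`, and the complement $L = \operatorname{span}\{(1,0,0)\}$ of $N(A)$ are chosen for illustration, and `rref` and `null_space` are hypothetical helper names.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form and list of pivot columns (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]          # normalize the pivot row
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def null_space(a):
    """Basis of N(A), one basis vector per free column."""
    m, pivots = rref(a)
    n = len(a[0])
    basis = []
    for fcol in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[fcol] = Fraction(1)
        for row, pcol in zip(m, pivots):
            v[pcol] = -row[fcol]
        basis.append(v)
    return basis

def apply(a, u):
    return [sum(x * y for x, y in zip(row, u)) for row in a]

A = [[1, 2, 3], [2, 4, 6]]      # maps R^3 to R^2, rank 1
b = [6, 12]                     # lies in R(A)
kernel = null_space(A)          # basis of N(A)
rank_A = len(rref(A)[1])
u0 = [6, 0, 0]                  # B^{-1} b for the assumed complement L = span{(1, 0, 0)}

def solution(s, t):
    """Element u0 + s*n1 + t*n2 of the affine set B^{-1} b + N(A)."""
    return [u0[i] + s * kernel[0][i] + t * kernel[1][i] for i in range(3)]

print(len(kernel) == 3 - rank_A)        # dim N(A) = dim X - rank A
print(apply(A, solution(2, -5)) == b)   # every such element solves Au = b
```

A different complement $L$ would give a different particular solution $B^{-1}b$, but the same solution set.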
Proposition 14. Let $f_1, \ldots, f_n, f: X \to \mathbb{K}$ be linear functionals on the
linear space $X$ over $\mathbb{K}$. Suppose that each solution $u \in X$ of the system
$f_j(u) = 0$, $j = 1, \ldots, n$,
is also a solution of the equation
$f(u) = 0$.
Then, there exist numbers $\alpha_1, \ldots, \alpha_n \in \mathbb{K}$ such that
$f = \alpha_1 f_1 + \cdots + \alpha_n f_n$.
Proof. We may assume that $f_1, \ldots, f_n$ are linearly independent. The proof
proceeds by induction.
Step 1: We prove the statement for $n = 1$. Since $f_1 \ne 0$, there exists a
point $u_1 \in X$ such that $f_1(u_1) \ne 0$. Replacing $u_1$ with $\beta u_1$, if necessary,
we get
$f_1(u_1) = 1$.
Set
$v := u - f_1(u) u_1$.
Then $f_1(v) = 0$, and hence $f(v) = 0$, by hypothesis. This implies
$0 = f(u) - f_1(u) f(u_1)$ for all $u \in X$,
that is, $f = \alpha f_1$ for some $\alpha \in \mathbb{K}$.
Step 2: We prove the statement for $n = 2$. Since $f_2$ is linearly independent
of $f_1$, there exists a point $u_2 \in X$ such that
$f_1(u_2) = 0$
and
$f_2(u_2) \ne 0$,
by Proposition 14 with $n = 1$ and $f = f_2$. Analogously, there exists a point
$u_1 \in X$ such that
$f_2(u_1) = 0$ and $f_1(u_1) \ne 0$.
We may assume that
$f_1(u_1) = f_2(u_2) = 1$.
Set
$v := u - f_1(u) u_1 - f_2(u) u_2$.
Then, $f_1(v) = f_2(v) = 0$, and hence $f(v) = 0$ by hypothesis. This implies
$0 = f(u) - f_1(u) f(u_1) - f_2(u) f(u_2)$ for all $u \in X$,
that is, $f = \alpha_1 f_1 + \alpha_2 f_2$, where $\alpha_j := f(u_j)$.
Step 3: If the assertion is true for $n$, then a similar argument as in Step
2 shows that the statement is also true for $n + 1$. □
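Step 2 of the preceding proof is constructive, which a small computation makes visible. In the following sketch the functionals on $\mathbb{R}^3$ and the points $u_1, u_2$ are assumptions chosen for illustration; functionals are represented by coefficient vectors, and `pair` is a hypothetical helper for the value $f(u)$.

```python
def pair(f, u):
    """Value f(u) of the functional with coefficient vector f."""
    return sum(a * b for a, b in zip(f, u))

# Assumed data on X = R^3: the common null space of f1 and f2 is span{(0, 0, 1)},
# and f vanishes there, so the hypothesis of Proposition 14 holds.
f1, f2 = [1, 1, 0], [0, 1, 0]
f = [2, 5, 0]

# Points with f_i(u_j) = delta_ij, as constructed in Step 2.
u1, u2 = [1, 0, 0], [-1, 1, 0]

a1, a2 = pair(f, u1), pair(f, u2)           # alpha_j := f(u_j)
combo = [a1 * x + a2 * y for x, y in zip(f1, f2)]

print(a1, a2)
print(combo == f)                           # f = alpha_1 f_1 + alpha_2 f_2
```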
3.9.4 Biorthogonal Systems and Splitting Subspaces
Definition 15. Let $X$ be a normed space over $\mathbb{K}$. By an $X$-biorthogonal
system $\{u_j, u_j^*\}_{j=1,\ldots,n}$, we understand a system of points $u_1, \ldots, u_n \in X$
and functionals $u_1^*, \ldots, u_n^* \in X^*$ such that
$\langle u_i^*, u_j \rangle = \delta_{ij}$ for all $i, j = 1, \ldots, n$.
Proposition 16. Let $X$ be a normed space over $\mathbb{K}$.
(i) Each system $u_1, \ldots, u_n \in X$ of linearly independent points can be
extended to an $X$-biorthogonal system.
(ii) Each system $u_1^*, \ldots, u_n^* \in X^*$ of linearly independent functionals can
be extended to an $X$-biorthogonal system.
Proof. Ad (i). Let $L = \operatorname{span}\{u_1, \ldots, u_n\}$. Define the linear functional
$u_i^*: L \to \mathbb{K}$ through
$\langle u_i^*, \sum_{j=1}^n \alpha_j u_j \rangle := \alpha_i$, $i = 1, \ldots, n$.
By the Hahn-Banach theorem in Section 1.1, $u_i^*$ can be extended to a linear
continuous functional $u_i^*: X \to \mathbb{K}$.
Ad (ii). Set $f_j := u_j^*$. Then the existence of points $u_1, \ldots, u_n$ with
$\langle u_j^*, u_i \rangle = \delta_{ji}$ follows as in the proof of Proposition 14. □
Standard Example 17. Let $L$ be a linear subspace of the Banach space
$X$ over $\mathbb{K}$. Then $L$ splits $X$ if one of the following three conditions is met:
(i) $L$ is a closed linear subspace of the Hilbert space $X$.
(ii) $\dim L < \infty$.
(iii) $L$ is closed and $\operatorname{codim} L < \infty$.
Proof. Ad (i). By Proposition 12 in Section 5.1 of AMS Vol. 108, there
exists a continuous orthogonal projection $P: X \to X$ onto $L$. Now use
Proposition 5.
Ad (ii). Let $\{u_1, \ldots, u_n\}$ be a basis of $L$. Extend this basis to an
$X$-biorthogonal system $\{u_j, u_j^*\}$. Define
$Pu := \sum_{j=1}^n \langle u_j^*, u \rangle u_j$.
Then $Pu_k = u_k$ for all $k$, and hence $P^2 = P$. Thus, the operator $P: X \to X$
represents a continuous projection onto $L$. Now use Proposition 5.
Ad (iii). There exists a linear subspace $M$ of $X$ such that $X = L \oplus M$,
where $\dim M = \operatorname{codim} L < \infty$. Thus, $L$ and $M$ are closed subspaces of
$X$. By Proposition 6, $X = L \oplus M$ is a topological direct sum, and hence $L$
splits $X$. □
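The projection used in part (ii) of the preceding proof can be written down explicitly in a finite-dimensional model. The sketch below takes $X = \mathbb{R}^3$; the basis vectors and the dual functionals (given by coefficient vectors) are assumptions chosen for illustration, and in finite dimensions no appeal to the Hahn-Banach theorem is needed to extend them.

```python
def pair(f, u):
    """Value <f, u> of the functional with coefficient vector f."""
    return sum(a * b for a, b in zip(f, u))

# Assumed basis of L in X = R^3 and a matching biorthogonal family of functionals.
u_vecs = [[1, 0, 0], [1, 1, 0]]
u_duals = [[1, -1, 0], [0, 1, 0]]

def P(u):
    """P u = sum_j <u_j^*, u> u_j, the projection onto L = span{u_1, u_2}."""
    out = [0, 0, 0]
    for f, b in zip(u_duals, u_vecs):
        c = pair(f, u)
        out = [o + c * x for o, x in zip(out, b)]
    return out

gram = [[pair(f, b) for b in u_vecs] for f in u_duals]
print(gram)                                 # biorthogonality: <u_i^*, u_j> = delta_ij
print(all(P(b) == b for b in u_vecs))       # P u_k = u_k
u = [4, 7, 9]
print(P(P(u)) == P(u))                      # P^2 = P on a sample vector
```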
3.9.5 Pseudo-Orthogonal Complements
Definition 18. Let $L$ be a linear subspace of the normed space $X$ over $\mathbb{K}$.
The set
$L^\perp := \{u^* \in X^* : \langle u^*, u \rangle = 0 \text{ for all } u \in L\}$ (46)
is called the pseudo-orthogonal complement to $L$.
Let $M$ be a linear subspace of $X^*$. Then we set
${}^\perp M := \{u \in X : \langle u^*, u \rangle = 0 \text{ for all } u^* \in M\}$.
These notions generalize orthogonal complements $L^\perp$ in Hilbert spaces. 5
Proposition 19. $L^\perp$ and ${}^\perp M$ are closed linear subspaces of $X^*$ and $X$,
respectively.
Proof. Suppose that $u_n^* \in L^\perp$ for all $n$ and
$u_n^* \to u^*$ in $X^*$ as $n \to \infty$.
If we let $n \to \infty$, it follows from $\langle u_n^*, v \rangle = 0$ for all $n$ and all $v \in L$ that
$\langle u^*, v \rangle = 0$ for all $v \in L$, and hence $u^* \in L^\perp$ (cf. Problem 3.5).
Suppose that $u_n \in {}^\perp M$ for all $n$ and
$u_n \to u$ in $X$ as $n \to \infty$.
If we let $n \to \infty$, it follows from $\langle u^*, u_n \rangle = 0$ for all $n$ and all $u^* \in M$ that
$\langle u^*, u \rangle = 0$ for all $u^* \in M$, and hence $u \in {}^\perp M$. □
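Proposition 20. Let $L$ be a linear subspace of the normed space $X$ over
$\mathbb{K}$. Then
${}^\perp(L^\perp) = \overline{L}$.
Proof. By (46), $(\overline{L})^\perp = L^\perp$. Therefore, it is sufficient to prove that
${}^\perp(M^\perp) = M$,
where $M$ is a closed linear subspace of $X$. By Definition 18,
$\langle u^*, u \rangle = 0$ for all $u \in M$, $u^* \in M^\perp$.
Hence $M \subseteq {}^\perp(M^\perp)$.
Conversely, we want to show that ${}^\perp(M^\perp) \subseteq M$. Let $v \in {}^\perp(M^\perp)$ and
suppose that $v \notin M$. By Proposition 3 in Section 1.2, it follows from the
Hahn-Banach theorem that there exists a functional $u^* \in X^*$ such that
$u^* = 0$ on $M$ and $\langle u^*, v \rangle \ne 0$.
Hence $u^* \in M^\perp$ and $v \notin {}^\perp(M^\perp)$. This is a contradiction. □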
5 In the following, the symbol $L^\perp$ always corresponds to (46) if we do not state
explicitly that $L^\perp$ means an orthogonal complement in a Hilbert space.
The following result will be used in the theory of Fredholm operators,
which will be studied in Chapter 5.
Proposition 21. Let $X$ be a normed space over $\mathbb{K}$. Then
(i) If $L$ is a finite-dimensional linear subspace of $X$, then
$\operatorname{codim} L^\perp = \dim L$ in $X^*$.
(ii) If $M$ is a finite-dimensional linear subspace of $X^*$, then
$\operatorname{codim} {}^\perp M = \dim M$ in $X$.
(iii) If $L$ is a closed linear subspace of $X$ such that $L^\perp$ is finite-dimensional,
then
$\operatorname{codim} L = \dim L^\perp$ in $X$.
Proof. Ad (i). If $L = \{0\}$, then $L^\perp = X^*$, and hence $\operatorname{codim} L^\perp = 0$.
Suppose now that $\dim L = n$, where $n > 0$. Let $\{u_1, \ldots, u_n\}$ be a basis of
$L$. Extend this to an $X$-biorthogonal system $\{u_j, u_j^*\}$. Define the continuous
projection operator $P: X^* \to X^*$ through
$Pu^* := u^* - \sum_{j=1}^n \langle u^*, u_j \rangle u_j^*$ for all $u^* \in X^*$.
Obviously, $Pu^* = u^*$ iff $\langle u^*, u_j \rangle = 0$ for all $j$ (i.e., $u^* \in L^\perp$). Thus, $P(X^*) =
L^\perp$. Hence $\operatorname{codim} L^\perp = \dim (I - P)(X^*) = n$, by Proposition 8 along with
$X^* = P(X^*) \oplus (I - P)(X^*)$.
Ad (ii). Use a similar argument as in the proof of (i).
Ad (iii). By Proposition 20, $L = {}^\perp(L^\perp)$. It follows from (ii) that
$\operatorname{codim} L = \dim L^\perp$. □
3.10 Dual Operators
The theory of linear operator equations in Banach spaces is essentially
based on the concept of duality. To this end, we need dual operators.
The key relation for dual operators is given through
$\langle A^T u^*, u \rangle = \langle u^*, Au \rangle$ for all $u \in X$, $u^* \in Y^*$. (47)
Proposition 1. Let
$A: X \to Y$
be a linear continuous operator, where $X$ and $Y$ are normed spaces over $\mathbb{K}$.
Then there exists precisely one linear operator
$A^T: Y^* \to X^*$
such that relation (47) holds. In addition, $A^T$ is continuous.
The operator $A^T$ is called the transposed or dual operator to $A$. We will
show in Example 3 ahead that, in finite-dimensional Hilbert spaces, the
transposed operator $A^T$ and the adjoint operator $A^*$ correspond to the
transposed matrix and the adjoint matrix, respectively.
Proof. Existence. Let $u^* \in Y^*$ be given. Set
$f(u) := \langle u^*, Au \rangle$ for all $u \in X$.
Then
$|f(u)| \le \|u^*\|\,\|Au\| \le \|u^*\|\,\|A\|\,\|u\|$ for all $u \in X$. (48)
Hence $f: X \to \mathbb{K}$ is a linear continuous functional, namely, $f \in X^*$. Define
$A^T u^* := f$.
Obviously, $\langle A^T u^*, u \rangle = \langle u^*, Au \rangle$ for all $u \in X$. This way, we obtain the
linear operator $A^T: Y^* \to X^*$. By (48),
$\|A^T u^*\| \le \|A\|\,\|u^*\|$ for all $u^* \in Y^*$,
and hence $A^T$ is continuous.
Uniqueness. Let $u^* \in Y^*$ and $v^* \in X^*$ be given. Suppose that
$\langle v^*, u \rangle = \langle u^*, Au \rangle$ for all $u \in X$.
It follows from (47) that $\langle v^* - A^T u^*, u \rangle = 0$ for all $u \in X$, and hence
$v^* = A^T u^*$. □
Proposition 2. Let $A: X \to X$ be a linear continuous operator on the
Hilbert space $X$ over $\mathbb{K}$. Then, the following diagram is commutative:
$$\begin{array}{ccc} X^* & \xrightarrow{\;A^T\;} & X^* \\ {\scriptstyle J}\uparrow & & \downarrow{\scriptstyle J^{-1}} \\ X & \xrightarrow{\;A^*\;} & X \end{array}$$
Here, $J$ denotes the duality map of $X$. Explicitly,
$A^* = J^{-1} A^T J$.
This result shows that there exists a simple relation between the dual
operator $A^T$ and the adjoint operator $A^*$.
Proof. Let $u, v \in X$. By Section 2.11 in AMS Vol. 108, we have
$\langle Ju, v \rangle = (u \mid v)$.
Hence
$(J^{-1} A^T J u \mid v) = \langle A^T J u, v \rangle = \langle Ju, Av \rangle = (u \mid Av)$. □
Example 3 (Matrix equations). Let $X \ne \{0\}$ be a finite-dimensional
Hilbert space over $\mathbb{K}$ with the orthonormal basis $\{e_1, \ldots, e_N\}$
(e.g., $X = \mathbb{K}^N$, $N \ge 1$). Then, for each $u, b \in X$, we have the representations
$u = \sum_{n=1}^N \xi_n e_n$
and
$b = \sum_{n=1}^N \beta_n e_n$,
where $\xi_n, \beta_n \in \mathbb{K}$ for all $n$. Let
$A: X \to X$
be a linear operator. We are given $b \in X$ and $b^* \in X^*$. Then the following
relations between operator equations and matrix equations hold true:
(i) The original equation
$Au = b$, $u \in X$, (49)
corresponds to the matrix equation
$\mathcal{A}\xi = \beta$, (49*)
where we set
$\mathcal{A} := (a_{nm})$, $\xi := (\xi_1, \ldots, \xi_N)^T$, $\beta := (\beta_1, \ldots, \beta_N)^T$,
and the matrix elements $a_{nm}$ of the operator $A$ are given through
$A e_m = \sum_{n=1}^N a_{nm} e_n$, $m = 1, \ldots, N$. (50)
(ii) The adjoint equation
$A^* u = b$, $u \in X$, (51)
and the dual equation
$A^T u^* = b^*$, $u^* \in X^*$ (52)
correspond to the matrix equations
$\mathcal{A}^* \xi = \beta$ (51*)
and
$\mathcal{A}^T \xi^* = \beta^*$, (52*)
respectively. Here $\mathcal{A}^*$ and $\mathcal{A}^T$ denote the adjoint matrix and the transposed
matrix to $\mathcal{A} = (a_{nm})$, respectively. Explicitly,
$\mathcal{A}^* = (a^*_{nm})$ with $a^*_{nm} := \overline{a_{mn}}$
and
$\mathcal{A}^T = (a^T_{nm})$ with $a^T_{nm} := a_{mn}$
for $n, m = 1, \ldots, N$. The bar denotes the conjugate complex number. In
addition,
$u^* = \sum_{n=1}^N \xi_n^* e_n^*$ and $b^* = \sum_{n=1}^N \beta_n^* e_n^*$,
where $\xi_n^*, \beta_n^* \in \mathbb{K}$ for all $n$. The basis $\{e_1^*, \ldots, e_N^*\}$ of the dual space $X^*$
will be defined ahead.
If $X$ is a real space (i.e., $\mathbb{K} = \mathbb{R}$), then $\mathcal{A}^* = \mathcal{A}^T$.
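Proof. Ad (49). It follows from
$\sum_{n=1}^N \beta_n e_n = b = Au = \sum_{m=1}^N \xi_m A e_m = \sum_{n=1}^N \Big( \sum_{m=1}^N a_{nm} \xi_m \Big) e_n$
that $\beta_n = \sum_{m=1}^N a_{nm} \xi_m$. This is (49*).
Ad (51). Noting that $(e_n \mid e_m) = \delta_{nm}$, from (50) we get
$a_{nm} = (e_n \mid A e_m)$.
Thus, the matrix elements of the adjoint operator $A^*$ are given through
$a^*_{nm} = (e_n \mid A^* e_m) = (A e_n \mid e_m) = \overline{a_{mn}}$.
Ad (52). For $n = 1, \ldots, N$, define
$e_n^*(u) := \xi_n$ for all $u = \sum_{m=1}^N \xi_m e_m \in X$.
Then, $e_n^*: X \to \mathbb{K}$ is a linear functional (i.e., $e_n^* \in X^*$). Let $u^* \in X^*$. Then
$u^*(u) = \sum_{n=1}^N \xi_n u^*(e_n) = \sum_{n=1}^N \xi_n^* e_n^*(u)$ for all $u \in X$,
where $\xi_n^* := u^*(e_n)$. Hence
$u^* = \sum_{n=1}^N \xi_n^* e_n^*$, $\xi_1^*, \ldots, \xi_N^* \in \mathbb{K}$. (53)
Conversely, each $u^*$ from (53) is a linear functional on $X$. Thus, $\{e_1^*, \ldots,
e_N^*\}$ forms a basis of $X^*$.
According to (50), the matrix elements $a^T_{nm}$ of the dual operator $A^T$ are
given through
$A^T e_m^* = \sum_{n=1}^N a^T_{nm} e_n^*$, $m = 1, \ldots, N$.
Since $\langle e_m^*, e_n \rangle = \delta_{mn}$, we get
$a^T_{nm} = \langle A^T e_m^*, e_n \rangle = \langle e_m^*, A e_n \rangle = \langle e_m^*, \sum_{k=1}^N a_{kn} e_k \rangle = a_{mn}$. □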
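The two matrix relations of Example 3 can be tested on a concrete complex matrix. The sketch below is an illustration under our own assumptions: the matrix and the vectors are arbitrary, the pairing $\langle u^*, u \rangle$ is taken bilinear, and the inner product $(u \mid v)$ is taken antilinear in the first argument, in agreement with $a_{nm} = (e_n \mid A e_m)$. Only small integer entries are used, so the floating-point comparisons are exact.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def adjoint(a):
    """Conjugate transpose: entry (n, m) is the conjugate of entry (m, n)."""
    return [[x.conjugate() for x in row] for row in transpose(a)]

def apply(a, u):
    return [sum(x * y for x, y in zip(row, u)) for row in a]

def pairing(f, u):
    """Bilinear pairing <f, u> between coordinate vectors."""
    return sum(x * y for x, y in zip(f, u))

def inner(u, v):
    """Inner product (u | v), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

A = [[1 + 2j, 3 + 0j], [0j, 4 - 1j]]    # assumed sample matrix
u = [1 + 1j, 2 - 1j]
v = [1j, 1 + 0j]
f = [2 + 0j, 1j]                        # coordinates of a functional u*

# Dual operator: <A^T u*, u> = <u*, A u>, relation (47).
print(pairing(apply(transpose(A), f), u) == pairing(f, apply(A, u)))
# Adjoint operator: (A* u | v) = (u | A v).
print(inner(apply(adjoint(A), u), v) == inner(u, apply(A, v)))
# Real case: adjoint and transposed matrix coincide.
R = [[1, 2], [3, 4]]
print(adjoint(R) == transpose(R))
```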
The properties of dual operators can be described conveniently by using
the so-called duality functor.
Definition 4. Let
$A: X \to Y$ (54)
be a linear continuous operator, where $X$ and $Y$ are normed spaces over
$\mathbb{K}$. The duality functor $\mathcal{D}$ assigns to (54) the dual operator
$A^T: Y^* \to X^*$. (54*)
Proposition 5. Let $X$, $Y$, and $Z$ be normed spaces over $\mathbb{K}$, and let $A: X \to
Y$ and $B: Y \to Z$ be linear continuous operators.
Then, the duality functor $\mathcal{D}$ is contravariant, that is, $\mathcal{D}$ assigns to the
sequence
$X \xrightarrow{A} Y \xrightarrow{B} Z$
the following sequence:
$X^* \xleftarrow{A^T} Y^* \xleftarrow{B^T} Z^*$.
Proof. We have to show that
$(BA)^T = A^T B^T$.
This follows immediately from
$\langle v, BAu \rangle = \langle B^T v, Au \rangle = \langle A^T B^T v, u \rangle$ for all $u \in X$, $v \in Z^*$. □
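In the matrix setting of Example 3, the contravariance of the duality functor is the familiar rule for transposing a product. A minimal sketch, with arbitrarily chosen integer matrices as an assumption ($A$ maps $\mathbb{R}^2$ to $\mathbb{R}^3$, and $B$ maps $\mathbb{R}^3$ to $\mathbb{R}^2$):

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

A = [[1, 2], [0, 1], [3, -1]]    # 3 x 2 matrix of A: X -> Y
B = [[2, 0, 1], [1, -1, 0]]      # 2 x 3 matrix of B: Y -> Z

lhs = transpose(matmul(B, A))               # (BA)^T
rhs = matmul(transpose(A), transpose(B))    # A^T B^T
print(lhs == rhs)
```

Note that the order of the factors is reversed, which is exactly the reversal of arrows in the dual sequence.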
Corollary 6. If the operator $A: X \to Y$ is linear, continuous, and bijective,
then so is the dual operator $A^T: Y^* \to X^*$. Moreover, we get
$(A^T)^{-1} = (A^{-1})^T$. (55)
Proof. Let $I_X$ denote the identity operator on $X$. It follows from
$A^{-1} A = I_X$
and
$A A^{-1} = I_Y$
that
$A^T (A^{-1})^T = I_{X^*}$ and $(A^{-1})^T A^T = I_{Y^*}$,
since $I_X^T = I_{X^*}$ and $I_Y^T = I_{Y^*}$. □
Recall the following from Section 2.8. Let $X$ be a Banach space over $\mathbb{K}$.
If we set
$j_X(u)(f) := \langle f, u \rangle$ for all $u \in X$, $f \in X^*$,
then the linear continuous operator $j_X: X \to X^{**}$ preserves the norm, that
is, $\|j_X(u)\| = \|u\|$ for all $u \in X$. Set
$A^{TT} := (A^T)^T$.
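Proposition 7. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$.
(i) The following diagram
$$\begin{array}{ccc} X & \xrightarrow{\;A\;} & Y \\ {\scriptstyle j_X}\downarrow & & \downarrow{\scriptstyle j_Y} \\ X^{**} & \xrightarrow{\;A^{TT}\;} & Y^{**} \end{array}$$
is commutative for all operators $A \in L(X, Y)$.
(ii) The duality functor $\mathcal{D}$ is norm-preserving, that is,
$\|A^T\| = \|A\|$ for all $A \in L(X, Y)$.
(iii) The duality functor $\mathcal{D}$ is compact, that is, if $A \in L(X, Y)$ is compact,
then so is $A^T \in L(Y^*, X^*)$.
Proof. Ad (i). For all $u \in X$ and $v \in Y^*$,
$\langle j_Y(Au), v \rangle = \langle v, Au \rangle = \langle A^T v, u \rangle = \langle j_X(u), A^T v \rangle = \langle A^{TT} j_X(u), v \rangle$,
and hence $j_Y A = A^{TT} j_X$.
Ad (ii). Let $A \in L(X, Y)$. By the proof of Proposition 1,
$\|A^T\| \le \|A\|$.
This implies
$\|A^{TT}\| \le \|A^T\|$.
Since $j_X$ and $j_Y$ are norm-preserving, we get
$\|A\| = \sup_{\|u\| \le 1} \|Au\| = \sup_{\|u\| \le 1} \|j_Y(Au)\| = \sup_{\|u\| \le 1} \|A^{TT} j_X(u)\| \le \sup_{\|u\| \le 1} \|A^{TT}\|\,\|j_X\|\,\|u\| \le \|A^{TT}\|$.
Hence $\|A^T\| \le \|A\| \le \|A^{TT}\| \le \|A^T\|$, showing that $\|A^T\| = \|A\|$.
Ad (iii). This will be proved in Section 5.1. □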
3.11 The Exactness of the Duality Functor
Riemann has shown us that proofs are better achieved through ideas
than through long calculations.
David Hilbert
The language of exact sequences plays a fundamental role in modern
mathematics (e.g., in algebraic topology). We want to show that this lan-
guage allows us to give elegant proofs in linear operator theory (cf. Figure
3.6).
Definition 1. Let $X_1, \ldots, X_n$ be linear spaces over $\mathbb{K}$, and let $A_j: X_j \to
X_{j+1}$, $j = 1, \ldots, n - 1$, be linear operators. Then the sequence
$X_1 \xrightarrow{A_1} X_2 \xrightarrow{A_2} X_3 \to \cdots \xrightarrow{A_{n-2}} X_{n-1} \xrightarrow{A_{n-1}} X_n$ (56)
is called exact iff $R(A_j) = N(A_{j+1})$ for all $j = 1, \ldots, n - 2$.
The sequence (56) is called an exact Banach sequence iff it is exact and
all the operators
$A_j: X_j \to X_{j+1}$, $j = 1, \ldots, n - 1$,
are linear and continuous, where $X_1, \ldots, X_n$ are Banach spaces over $\mathbb{K}$ and
the range $R(A_{n-1})$ is closed. 6
In particular, the exactness of
$X \xrightarrow{A} Y \xrightarrow{B} Z$
means that $R(A) = N(B)$.
6 This implies that all the ranges $R(A_j)$, $j = 1, \ldots, n - 1$, are closed. In fact,
we have $R(A_j) = N(A_{j+1})$, and the null space $N(A_{j+1})$ is closed for all $j =
1, \ldots, n - 2$, since $A_{j+1}$ is continuous.
FIGURE 3.6. Diagram relating the Hahn-Banach theorem, exact sequences, the exact
duality functor, the closed range theorem, the product index theorem for
Fredholm operators (Chapter 5), and the linear operator equation (Fredholm's
alternative).
The following example shows that important operator properties can be
translated into the language of exact sequences.
Example 2. Let $A: X \to Y$ be a linear operator, where $X$ and $Y$ are linear
spaces over $\mathbb{K}$. Then
(i) $A$ is injective iff the sequence
$0 \to X \xrightarrow{A} Y$ (57)
is exact.
(ii) $A$ is surjective iff the sequence
$X \xrightarrow{A} Y \to 0$ (58)
is exact.
(iii) $A$ is bijective iff the sequence
$0 \to X \xrightarrow{A} Y \to 0$ (59)
is exact.
Here, $0 \to X$ and $Y \to 0$ denote the trivial maps $0 \mapsto 0$ and $u \mapsto 0$,
respectively.
Proof. Ad (i). The exactness of (57) means that $N(A) = \{0\}$.
Ad (ii). The exactness of (58) means that $R(A) = Y$.
Ad (iii). The exactness of (59) is equivalent to the exactness of (57) and
(58). □
Proposition 3. Let
$0 \to X \xrightarrow{A} Y \xrightarrow{B} Z \to 0$
be an exact sequence, where $X$, $Y$, and $Z$ are finite-dimensional linear
spaces over $\mathbb{K}$. Then
$\dim X - \dim Y + \dim Z = 0$.
Proof. The operator $A$ is injective. Hence $\dim R(A) = \dim X$. Let $W$
denote an algebraic complement of $N(B)$ in $Y$:
$Y = N(B) \oplus W$. (60)
The operator $B$ is surjective. Thus, the restriction $B: W \to Z$ is bijective,
and hence $\dim Z = \dim W$. Since $N(B) = R(A)$, it follows from (60) that
$\dim Y = \dim R(A) + \dim W = \dim X + \dim Z$. □
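The dimension formula of Proposition 3 can be checked on a small exact sequence of matrices. The following sketch is an assumption-laden illustration: the spaces are $X = \mathbb{R}$, $Y = \mathbb{R}^3$, $Z = \mathbb{R}^2$, the matrices are our own choice, and exactness at $Y$ is verified through $BA = 0$ together with a dimension count.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1], [1], [0]]              # A: X -> Y
B = [[1, -1, 0], [0, 0, 1]]      # B: Y -> Z
dim_X, dim_Y, dim_Z = 1, 3, 2

injective = rank(A) == dim_X                  # exactness at X: N(A) = {0}
surjective = rank(B) == dim_Z                 # exactness at Z: R(B) = Z
BA_zero = all(x == 0 for row in matmul(B, A) for x in row)   # R(A) inside N(B)
exact_at_Y = BA_zero and rank(A) == dim_Y - rank(B)          # equal dimensions

print(injective, exact_at_Y, surjective)
print(dim_X - dim_Y + dim_Z)
```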
Proposition 4. The duality functor $\mathcal{D}$ is exact, that is, $\mathcal{D}$ sends exact
Banach sequences to exact sequences.
In Section 3.12 we will use Proposition 4 in order to prove the fundamental
closed range theorem. This theorem tells us that the closedness of the
range $R(A_1)$ in (56) implies $R(A_1^T) = N(A_1)^\perp$; thus the range $R(A_1^T)$ is
also closed. Consequently, we get the following stronger result: the duality
functor $\mathcal{D}$ sends exact Banach sequences to exact Banach sequences.
Proof. Step 1: Let us first consider the short exact Banach sequence
$X \xrightarrow{A} Y \xrightarrow{B} Z$.
That is, $R(A) = N(B)$, and $R(B)$ is a closed linear subspace of $Z$. We have
to show that
$X^* \xleftarrow{A^T} Y^* \xleftarrow{B^T} Z^*$
is an exact sequence, that is, $R(B^T) = N(A^T)$.
Since $R(A) = N(B)$, we get
$BA = 0$,
and hence $A^T B^T = (BA)^T = 0$. This implies $R(B^T) \subseteq N(A^T)$.
Conversely, let us show that $N(A^T) \subseteq R(B^T)$. To this end, choose $u^* \in
N(A^T)$. Hence $u^* \in Y^*$ and
$\langle u^*, Au \rangle = \langle A^T u^*, u \rangle = 0$ for all $u \in X$.
This yields $u^*(v) = \langle u^*, v \rangle = 0$ for all $v \in R(A)$. Define
$[u^*](v + R(A)) := u^*(v)$ for all $v \in Y$.
It follows as in the proof of Proposition 6 in Section 3.8 that the linear
functional
$[u^*]: Y/R(A) \to \mathbb{K}$
is continuous. Letting $[B](v + N(B)) := Bv$ for all $v \in Y$, we get the linear
homeomorphism
$[B]: Y/N(B) \to R(B)$,
by Proposition 6 in Section 3.8. Observe that the range $R(B)$ is closed. The
decisive trick of our proof consists in introducing the linear functional $v^*$
through the commutative diagram
$$\begin{array}{ccc} R(B) & \xrightarrow{\;v^*\;} & \mathbb{K} \\ {\scriptstyle [B]^{-1}}\downarrow & \nearrow{\scriptstyle [u^*]} & \\ Y/N(B) & & \end{array}$$
that is, we set
$v^* := [u^*][B]^{-1}$. (61)
Recall that $R(A) = N(B)$. The functional $v^*$ is continuous on $R(B)$. Hence
$|v^*(w)| \le \text{const}\,\|w\|$ for all $w \in R(B)$,
where $R(B) \subseteq Z$. By the Hahn-Banach theorem (Theorem 1.B in Section
1.1), there exists a linear continuous extension
$v^*: Z \to \mathbb{K}$.
Relation (61) tells us that, for all $v \in Y$,
$v^*(Bv) = [u^*][B]^{-1}(Bv) = [u^*](v + N(B)) = [u^*](v + R(A)) = u^*(v)$.
This yields
$\langle v^*, Bv \rangle = \langle u^*, v \rangle$ for all $v \in Y$,
and hence $u^* = B^T v^*$, which means $u^* \in R(B^T)$. Therefore, $N(A^T) \subseteq
R(B^T)$.
Step 2: The general case can easily be reduced to Step 1. In fact, the
sequence in (56) is an exact Banach sequence iff all the possible short
sequences
$X_j \xrightarrow{A_j} X_{j+1} \xrightarrow{A_{j+1}} X_{j+2}$, $j = 1, \ldots, n - 2$,
are exact Banach sequences. □
In the following two examples, let us apply the language of exact
sequences to embeddings and projections.
Example 5. Let $X$ be a closed linear subspace of the Banach space $Y$ over
$\mathbb{K}$, and let $j: X \to Y$ denote the trivial embedding map defined through
$j(u) := u$ for all $u \in X$. Then, $j$ is injective, that is, the sequence
$0 \to X \xrightarrow{j} Y$
is an exact Banach sequence. By Proposition 4, the dual sequence
$0 \leftarrow X^* \xleftarrow{j^T} Y^*$
is also exact (i.e., the dual operator $j^T$ is surjective).
Moreover, $N(j^T) = X^\perp$.
Proof. For all $u \in X$ and $u^* \in Y^*$,
$\langle j^T(u^*), u \rangle_X = \langle u^*, j(u) \rangle_Y = \langle u^*, u \rangle_Y$.
Therefore, the functional $j^T(u^*)$ represents the restriction of the functional
$u^*: Y \to \mathbb{K}$ to the subspace $X$. Obviously,
$j^T(u^*) = 0$ iff $u^* = 0$ on $X$
(i.e., $u^* \in X^\perp$). Hence $N(j^T) = X^\perp$. □
Example 6. Let $X$ be a closed linear subspace of the Banach space $Y$ over
$\mathbb{K}$, and let
$\pi: Y \to Y/X$
be the canonical mapping from Section 3.8 defined through $\pi(u) := u + X$
for all $u \in Y$. Obviously, $N(\pi) = X$. Since $\pi$ is linear, continuous, and
surjective, the sequence
$X \xrightarrow{j} Y \xrightarrow{\pi} Y/X \to 0$
is exact. By Example 5,
$0 \to X \xrightarrow{j} Y \xrightarrow{\pi} Y/X \to 0$
is an exact Banach sequence. It follows from Proposition 4 that the dual
sequence
$0 \leftarrow X^* \xleftarrow{j^T} Y^* \xleftarrow{\pi^T} (Y/X)^* \leftarrow 0$
is exact. Hence the dual operator $\pi^T$ is injective, and $R(\pi^T) = N(j^T) =
X^\perp$.
3.12 Applications to the Closed Range Theorem
and to Fredholm Alternatives
The following result represents the most important theorem on linear op-
erator equations.
Theorem 3.E (Banach's closed range theorem). Let $A: X \to Y$ be a linear
continuous operator where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then the
following three conditions are equivalent:
(i) Fredholm alternative. $R(A) = {}^\perp N(A^T)$ and $R(A^T) = N(A)^\perp$.
(ii) Closed range. $R(A)$ is closed.
(iii) A priori estimate. There is a constant $c > 0$ such that
$c \cdot \operatorname{dist}(u, N(A)) \le \|Au\|$ for all $u \in X$. (62)
In terms of the operator equation
$Au = b$, $u \in X$, (E)
and the dual equation
$A^T u^* = b^*$, $u^* \in Y^*$, (E*)
Theorem 3.E(i) means the following. 7 Let the range $R(A)$ be closed.
(a) For given $b \in Y$, the original equation (E) has a solution iff
$\langle u^*, b \rangle = 0$ (63)
for all solutions $u^*$ of the homogeneous dual equation (E*).
(b) Conversely, for given $b^* \in X^*$, the dual equation (E*) has a solution
iff
$\langle b^*, u \rangle = 0$ (64)
for all solutions $u$ of the homogeneous original equation (E).
Observe that condition (63) is quite natural. In fact, if $Au = b$ and
$A^T u^* = 0$, then
$\langle u^*, b \rangle = \langle u^*, Au \rangle = \langle A^T u^*, u \rangle = 0$. (65)
7 By definition, the homogeneous original equation and the homogeneous dual
equation correspond to (E) with $b = 0$ and (E*) with $b^* = 0$, respectively.
Thus, (63) represents a simple necessary solvability condition for (E). The
closed range theorem tells us that this condition is also a sufficient
solvability condition provided that the range $R(A)$ is closed.
Furthermore, if $A^T u^* = b^*$ and $Au = 0$, then
$\langle b^*, u \rangle = \langle A^T u^*, u \rangle = \langle u^*, Au \rangle = 0$. (66)
This is the solvability condition (64) for the dual equation (E*).
If $X$ and $Y$ are finite-dimensional spaces, then $R(A)$ is closed
automatically. In this case, statements (a) and (b) correspond to classic results on
finite linear systems.
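For a finite linear system, statement (a) can be observed directly. The sketch below uses an assumed singular $2 \times 2$ matrix: solvability of $Au = b$ is decided by comparing the rank of $A$ with the rank of the augmented matrix, and this is compared with the orthogonality condition (63) against a vector `w` spanning $N(A^T)$. The names `rank`, `solvable`, and `w` are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def pair(f, u):
    return sum(a * b for a, b in zip(f, u))

A = [[1, 2], [2, 4]]                       # singular, so R(A) is a proper subspace
A_T = [list(col) for col in zip(*A)]
w = [-2, 1]                                # assumed generator of N(A^T)

def solvable(b):
    """Au = b has a solution iff appending b does not raise the rank."""
    return rank([row + [bi] for row, bi in zip(A, b)]) == rank(A)

print([pair(row, w) for row in A_T])       # A^T w = 0
print(2 - rank(A_T))                       # dim N(A^T) = 1, so w spans N(A^T)
for b in ([1, 2], [1, 0]):
    print(b, solvable(b), pair(w, b) == 0) # solvability agrees with condition (63)
```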
Proof of Theorem 3.E. (i) $\Rightarrow$ (ii). By Proposition 19 in Section 3.9, the
set ${}^\perp N(A^T)$ is closed.
(ii) $\Rightarrow$ (i). Let $R(A)$ be closed. According to Examples 5 and 6 in Section
3.11,
$0 \to N(A) \xrightarrow{j} X \xrightarrow{A} Y \xrightarrow{\pi} Y/R(A) \to 0$
represents an exact Banach sequence. By Proposition 4 in Section 3.11, the
dual sequence
$0 \leftarrow N(A)^* \xleftarrow{j^T} X^* \xleftarrow{A^T} Y^* \xleftarrow{\pi^T} (Y/R(A))^* \leftarrow 0$
is exact. This implies
$R(A^T) = N(j^T)$
and
$N(A^T) = R(\pi^T)$.
By Examples 5 and 6 in Section 3.11,
$R(\pi^T) = R(A)^\perp$ and $N(j^T) = N(A)^\perp$.
Since $R(A)$ is closed, it follows from Proposition 20 in Section 3.9 that
${}^\perp(R(A)^\perp) = R(A)$.
Hence
$R(A^T) = N(A)^\perp$
and
$R(A) = {}^\perp N(A^T)$.
(ii) $\Rightarrow$ (iii). This is Proposition 6 in Section 3.8.
(iii) $\Rightarrow$ (ii). First let $N(A) = \{0\}$. Then
$c\|u\| \le \|Au\|$ for all $u \in X$.
This implies that $R(A)$ is closed. In fact, if $Au_n \to v$ as $n \to \infty$, then
$(Au_n)$ is Cauchy, and $c\|u_n - u_m\| \le \|Au_n - Au_m\|$ shows that $(u_n)$ is also
Cauchy. Hence $u_n \to u$ as $n \to \infty$, that is, $Au = v$.
If $N(A) \ne \{0\}$, then we use the operator
$[A]: X/N(A) \to Y$
from Proposition 6 in Section 3.8. Recall that $[A][u] := Au$ for all $u \in X$.
Thus, the a priori estimate in (62) is equivalent to
$c\|[u]\| \le \|[A][u]\|$ for all $[u] \in X/N(A)$.
The same argument as the preceding one shows that $R([A])$ is closed. Since
$R(A) = R([A])$, the range $R(A)$ is also closed. □
Corollary 1 (Closed range theorem for Hilbert spaces). Let $A: X \to X$
be a linear continuous operator on the Hilbert space $X$ over $\mathbb{K}$. Then the
following two conditions are equivalent: 8
(i) $R(A) = N(A^*)^\perp$ and $R(A^*) = N(A)^\perp$.
(ii) $R(A)$ is closed.
In terms of the operator equation
$Au = b$, $u \in X$, (E)
and the adjoint equation
$A^* v = c$, $v \in X$, (E_a)
this means the following. Let the range $R(A)$ be closed.
(a) For given $b \in X$, the original equation (E) has a solution iff
$(v \mid b) = 0$
for all solutions $v$ of the homogeneous adjoint equation (E_a).
(b) For given $c \in X$, the adjoint equation (E_a) has a solution iff
$(c \mid u) = 0$
for all solutions $u$ of the homogeneous original equation (E).
8 Here, the symbol $\perp$ denotes the orthogonal complement (cf. Section 2.9 in
AMS Vol. 108).
Proof of Corollary 1. By Proposition 2 in Section 3.10, we have
$A^* = J^{-1} A^T J$
and $\langle Ju, v \rangle = (u \mid v)$ for all $u, v \in X$. The assertion follows now from
Theorem 3.E.
In fact, by Theorem 3.E, the original equation (E) has a solution iff
$\langle u^*, b \rangle = 0$ for all $u^*$ with $A^T u^* = 0$.
If we let $v := J^{-1} u^*$, this is equivalent to
$(v \mid b) = 0$ for all $v$ with $J^{-1} A^T J v = 0$,
that is, $b \in N(A^*)^\perp$. Hence $R(A) = N(A^*)^\perp$.
Moreover, the adjoint equation (E_a) can be written as
$A^T J v = Jc$.
By Theorem 3.E, this equation has a solution iff
$\langle Jc, u \rangle = 0$ for all $u$ with $Au = 0$.
This is equivalent to
$(c \mid u) = 0$ for all $u$ with $Au = 0$
(i.e., $c \in N(A)^\perp$). Hence $R(A^*) = N(A)^\perp$. □
Standard Example 2. Let $A: X \to Y$ be a linear continuous operator,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$.
If $\operatorname{codim} R(A) < \infty$, then the range $R(A)$ is closed.
Proof. Choose a linear subspace $Z$ of $Y$ such that
$Y = R(A) \oplus Z$.
As in Proposition 6 of Section 3.8, define the linear continuous injective
operator $[A]: X/N(A) \to Y$ through
$[A][u] := Au$ for all $[u] \in X/N(A)$.
Finally, set
$B([u], z) := [A][u] + z$ for all $[u] \in X/N(A)$, $z \in Z$.
Then, the operator
$B: (X/N(A)) \times Z \to Y$
is linear, continuous, and bijective. In fact, $[A][u] + z = 0$ implies $[A][u] = 0$
along with $z = 0$ (i.e., $[u] = 0$). Since $\dim Z < \infty$, both $Z$ and $X/N(A)$
are Banach spaces. Thus, $B$ is a linear homeomorphism, by the continuous
inverse theorem in Section 3.5. The set
$W := \{([u], 0) : [u] \in X/N(A)\}$
is closed in the product space $(X/N(A)) \times Z$. Hence the set $B(W)$ is also
closed, by Lemma 12 in Section 3.9 applied to the continuous operator $B^{-1}$.
Finally, observe that
$R(A) = R([A]) = B(W)$.
Thus, the range $R(A)$ is closed. □
Standard Example 3. Let $A\colon X \to Y$ be a linear continuous operator,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then the following two
conditions are equivalent:
(i) A priori estimate. There is a constant $c > 0$ such that
$$c\,\|u\| \le \|Au\| \quad \text{for all } u \in X.$$
(ii) The range $R(A)$ is closed and $Au = 0$ implies $u = 0$.
Proof. Observe that $\operatorname{dist}(u, N(A)) = \|u\|$ if $N(A) = \{0\}$, and use Theorem
3.E. $\Box$
Standard Example 4. Let $A\colon X \to Y$ be a linear continuous operator,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Furthermore, let $Z$ be a Banach
space over $\mathbb{K}$ such that the embedding
$$X \subseteq Z$$
is compact. Then the following two statements are equivalent:
(i) A priori estimate. There is a constant $c > 0$ such that
$$c\,\|u\|_X \le \|Au\|_Y + \|u\|_Z \quad \text{for all } u \in X. \tag{67}$$
(ii) The range $R(A)$ is closed and $\dim N(A) < \infty$.
This result plays an important role in the theory of elliptic-type linear
partial differential equations (cf. Lions and Magenes (1972), Vol. 1, Chapter
2, Section 5.2).
Proof. (i) $\Rightarrow$ (ii). Since $A$ is continuous, the null space $N(A)$ is closed, and
hence $N(A)$ is a Banach space with respect to the norm $\|\cdot\|_X$. Let $B$ be
the closed unit ball in $N(A)$. We want to show that $B$ is compact. Then
$\dim N(A) < \infty$, by Section 2.3.
In fact, let $(u_n)$ be a sequence in $B$. Since the embedding $X \subseteq Z$ is compact,
the set $B$ is relatively compact in $Z$. Thus, there exists a subsequence,
again denoted by $(u_n)$, such that $u_n \to u$ in $Z$ as $n \to \infty$. Since $Au_n = 0$
for all $n$, it follows from (67) that
$$c\,\|u_n - u_m\|_X \le \|u_n - u_m\|_Z \quad \text{for all } n, m.$$
Hence $(u_n)$ is Cauchy in $X$. Thus, $B$ is compact.
We now prove that the range $R(A)$ is closed. Since $\dim N(A) < \infty$, there
exists a continuous projection $P\colon X \to X$ onto $N(A)$. Set $L := (I - P)(X)$.
Then
$$X = N(A) \oplus L.$$
The operator $A\colon L \to R(A)$ is bijective on the closed linear subspace $L$ of
$X$. This implies the existence of a constant $d > 0$ such that
$$\|Au\|_Y \ge d\,\|u\|_X \quad \text{for all } u \in L. \tag{68}$$
Otherwise, there would exist a sequence $(u_n)$ in $L$ such that
$$\|u_n\|_X = 1 \quad \text{for all } n, \tag{69}$$
and $Au_n \to 0$ in $Y$ as $n \to \infty$. Since the embedding $X \subseteq Z$ is compact,
there is a subsequence, again denoted by $(u_n)$, such that $u_n \to v$ in $Z$ as
$n \to \infty$. By (67),
$$c\,\|u_n - u_m\|_X \le \|Au_n - Au_m\|_Y + \|u_n - u_m\|_Z$$
(i.e., $(u_n)$ is Cauchy in $X$), and hence
$$u_n \to u \quad \text{in } X \text{ as } n \to \infty.$$
This implies $u \in L$ and $Au = 0$. Hence $u = 0$ because $X = N(A) \oplus L$. From
(69) we get $\|u\|_X = 1$. This contradicts $u = 0$. Thus (68) holds, and the
range $R(A) = A(L)$ is closed by Standard Example 3, applied to the restriction
$A\colon L \to Y$.
(ii) $\Rightarrow$ (i). From $X = N(A) \oplus L$ we obtain the decomposition
$$u = u_1 + u_2, \quad u_1 \in N(A),\ u_2 \in L,$$
for all $u \in X$. Hence
$$\|u\|_X \le \|u_1\|_X + \|u_2\|_X \quad \text{for all } u \in X. \tag{70}$$
All the norms are equivalent on the finite-dimensional space $N(A)$ (cf.
Section 1.12 in AMS Vol. 108). Hence
$$\|u_1\|_X \le \mathrm{const}\,\|u_1\|_Z \quad \text{for all } u_1 \in N(A).$$
If we use $u_1 = u - u_2$, this implies
$$\|u_1\|_X \le \mathrm{const}\,(\|u\|_Z + \|u_2\|_Z).$$
Since the embedding $X \subseteq Z$ is continuous, we have $\|v\|_Z \le \mathrm{const}\,\|v\|_X$ for
all $v \in X$, and hence
$$\|u_1\|_X \le \mathrm{const}\,(\|u\|_Z + \|u_2\|_X).$$
Thus, it follows from (70) that
$$\|u\|_X \le \mathrm{const}\,(\|u\|_Z + \|u_2\|_X). \tag{71}$$
Finally, by the continuous inverse theorem from Section 3.5, the operator
$A\colon L \to R(A)$ is a linear homeomorphism. This implies (68), namely,
$$\|u_2\|_X \le \mathrm{const}\,\|Au_2\|_Y \quad \text{for all } u_2 \in L. \tag{72}$$
From (71) and (72) we get the desired inequality (67), since $Au_2 = Au$. $\Box$
Problems
3.1. The Baire category. Let $M$ be a set of the first Baire category in the
Banach space $X$ over $\mathbb{K}$, and let $N$ be a nonempty open subset of $X$.
Show that $N - M$ is dense in $N$.
Solution: We have to prove that $N \subseteq \overline{N - M}$. If this is not true, then
the set
$$S := N - \overline{N - M}$$
is nonempty and open. Hence $S$ is of the second Baire category in $X$, by
Theorem 3.A. Since
$$N - \overline{N - M} \subseteq N - (N - M) \subseteq M,$$
the set $S$ is of the first Baire category. This is a contradiction.
3.2. Examples. Determine the Baire category of the following sets $M$ in $X$:
(i) $X := \mathbb{R}$ and $M := \{\xi \in X\colon \sin\xi = 0\}$.
(ii) $X := \mathbb{R}$ and $M := \{\xi \in X\colon 0 \le \sin\xi \le 1\}$.
(iii) $X := \mathbb{R}^2$ and $M := \{(\xi, \eta) \in X\colon \xi^2 + \eta^2 < 1\}$.
(iv) $X := \mathbb{R}^2$ and $M := \{(\xi, \eta) \in X\colon \xi^2 + \eta^2 = 1\}$.
(v) $X := \mathbb{R}^2$ and $M := \{(\xi, \eta) \in X\colon \xi + \eta = \rho,\ \rho = \text{rational number}\}$.
(vi) $X := \mathbb{R}^2$ and $M := \{(\xi, \eta) \in X\colon 0 \le \xi, \eta \le 1\}$.
Solution: Cf. Problem 3.20.
3.3. Topological direct sum. Let
$$X = X_1 \oplus X_2$$
be a direct sum, where $X$ is a Banach space over $\mathbb{K}$. Let
$$u = u_1 + u_2, \quad u_1 \in X_1,\ u_2 \in X_2,$$
be the corresponding decomposition for each $u \in X$. Show that the following
three statements are mutually equivalent:
(i) $X = X_1 \oplus X_2$ is a topological direct sum.
(ii) The map $u \mapsto (u_1, u_2)$ is a linear homeomorphism from $X$ onto the
product space $X_1 \times X_2$.
(iii) The norm $\|u\|_* := \|u_1\| + \|u_2\|$ is equivalent to the original norm $\|u\|$
on $X$.
Hint: Use the continuous inverse mapping theorem.
3.4. A variant of the Banach-Steinhaus theorem. Let $(L_n)$ be a sequence
of linear continuous operators $L_n\colon X \to Y$, where $X$ is a Banach space over
$\mathbb{K}$ and $Y$ is a normed space over $\mathbb{K}$. Suppose that the limit
$$Lu := \lim_{n \to \infty} L_n u$$
exists for all $u \in X$. Show that
$$\|L\| \le \liminf_{n \to \infty} \|L_n\| < \infty.$$
Solution: By the Banach-Steinhaus theorem (Corollary 1 in Section
3.3), $\sup_n \|L_n\| < \infty$. It follows from $\|L_n u\| \le \|L_n\|\,\|u\|$ that
$$\|Lu\| \le \liminf_{n \to \infty} \|L_n\|\,\|u\| \quad \text{for all } u \in X.$$
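3.5. Weak convergence. Let $X$ be a normed space over $\mathbb{K}$. Show that
(a) If $u_n^* \to u^*$ in $X^*$ and $u_n \rightharpoonup u$ in $X$ as $n \to \infty$, then
$$\langle u_n^*, u_n\rangle \to \langle u^*, u\rangle \quad \text{as } n \to \infty.$$
(b) If $u_n \rightharpoonup u$ in $X$ as $n \to \infty$, then
$$\|u\| \le \liminf_{n \to \infty} \|u_n\|.$$
(c) If $X$ is reflexive, then $u_n^* \rightharpoonup u^*$ in $X^*$ as $n \to \infty$ is equivalent to
$$\langle u_n^*, u\rangle \to \langle u^*, u\rangle \quad \text{as } n \to \infty \text{ for all } u \in X.$$
(d) If $X$ is reflexive, then $u_n^* \rightharpoonup u^*$ in $X^*$ and $u_n \to u$ in $X$ as $n \to \infty$
imply
$$\langle u_n^*, u_n\rangle \to \langle u^*, u\rangle \quad \text{as } n \to \infty.$$
(e) If $X$ is a Hilbert space, then $u_n \rightharpoonup u$ in $X$ and $\|u_n\| \to \|u\|$ as $n \to \infty$
imply $u_n \to u$ as $n \to \infty$.
Solution: Recall that $\langle v^*, v\rangle := v^*(v)$ and hence
$$|\langle v^*, v\rangle| \le \|v^*\|\,\|v\| \quad \text{for all } v \in X,\ v^* \in X^*.$$
Ad (a). Since $(u_n)$ is bounded, we get
$$|\langle u_n^*, u_n\rangle - \langle u^*, u\rangle| = |\langle u_n^* - u^*, u_n\rangle + \langle u^*, u_n - u\rangle|$$
$$\le \|u_n^* - u^*\| \sup_n \|u_n\| + |\langle u^*, u_n - u\rangle| \to 0 \quad \text{as } n \to \infty.$$
Ad (b). Use Problem 3.4 and the fact that $X^*$ is a Banach space (cf. the
proof of Example 3 in Section 3.3).
Ad (c). Use the definition of reflexive normed spaces in Section 2.8.
Ad (d). Use (c) and an analogous argument as in the proof of (a).
Ad (e). Since $(u_n \mid u) \to (u \mid u)$ and $(u \mid u_n) \to (u \mid u)$ as $n \to \infty$, we get
$$\|u_n - u\|^2 = (u_n \mid u_n) - (u_n \mid u) - (u \mid u_n) + (u \mid u) \to 0 \quad \text{as } n \to \infty.$$
Cf. Zeidler (1986), Vol. 2B, Proposition 21.23.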
3.6. Weak* convergence. Let $X$ be a Banach space over $\mathbb{K}$, and let $(u_n^*)$ be
a sequence in the dual space $X^*$. We write
$$u_n^* \overset{*}{\rightharpoonup} u^* \quad \text{in } X^* \text{ as } n \to \infty$$
iff $\langle u_n^*, u\rangle \to \langle u^*, u\rangle$ as $n \to \infty$ for all $u \in X$. This is the so-called weak*
convergence. Show that the following are true:
(a) If $u_n^* \overset{*}{\rightharpoonup} u^*$ in $X^*$ as $n \to \infty$, then $(u_n^*)$ is bounded in $X^*$ and
$$\|u^*\| \le \liminf_{n \to \infty} \|u_n^*\|.$$
(b) If $u_n^* \overset{*}{\rightharpoonup} u^*$ in $X^*$ and $u_n \to u$ in $X$ as $n \to \infty$, then
$$\langle u_n^*, u_n\rangle \to \langle u^*, u\rangle \quad \text{as } n \to \infty.$$
(c) If $X$ is reflexive, then $u_n^* \overset{*}{\rightharpoonup} u^*$ in $X^*$ iff $u_n^* \rightharpoonup u^*$ in $X^*$ as $n \to \infty$.
(d) If $X$ is separable, then each bounded sequence $(u_n^*)$ in $X^*$ has a
subsequence $(u_{n'}^*)$ such that $u_{n'}^* \overset{*}{\rightharpoonup} u^*$ in $X^*$ as $n' \to \infty$.
(e) Let $(u_n^*)$ be a sequence in $X^*$. Then $u_n^* \overset{*}{\rightharpoonup} u^*$ in $X^*$ as $n \to \infty$ iff
$(u_n^*)$ is bounded and there exists a dense subset $D$ of $X$ such that
$\langle u_n^*, v\rangle \to \langle u^*, v\rangle$ as $n \to \infty$ for all $v \in D$ and fixed $u^* \in X^*$.
Solution: Ad (a). Use Problem 3.4.
Ad (b), (c). Use similar arguments as in Problem 3.5.
Ad (d). Use a similar argument as in the proof of Proposition 6 in Section
2.8. To this end, let $\{v_k\}$ be a countable dense subset of $X$. By a diagonal
procedure, we obtain a subsequence $(w_n^*)$ of $(u_n^*)$ such that
$$\langle w_n^*, v_k\rangle \to a(v_k) \quad \text{as } n \to \infty \text{ for all } k.$$
Since $\{v_k\}$ is dense in $X$ and $(w_n^*)$ is bounded in $X^*$, it follows that, as
$n \to \infty$, the limit $\langle w_n^*, v\rangle \to a(v)$ exists for all $v \in X$. From
$$|\langle w_n^*, v\rangle| \le \sup_n \|w_n^*\|\,\|v\|,$$
we get $|a(v)| \le \mathrm{const}\,\|v\|$ for all $v \in X$, and hence $a \in X^*$. Thus, $\langle w_n^*, v\rangle \to
\langle a, v\rangle$ for all $v \in X$.
Ad (e). Use the same argument as in the proof of Example 3 in Section
3.3.
Cf. Zeidler (1986), Vol. 2B, Proposition 21.26.
3.7. Subsequences. Show that a sequence $(u_n)$ in a Banach space $X$ over $\mathbb{K}$
has the following convergence properties:
(i) Strong convergence. Let $u$ be a fixed element of $X$. If every subsequence
of $(u_n)$ has, in turn, a subsequence that converges strongly to
$u$ in $X$, then the original sequence $(u_n)$ converges strongly to $u$ (i.e.,
$u_n \to u$ in $X$ as $n \to \infty$).
(ii) Weak convergence. Let $u$ be a fixed element in $X$. If every subsequence
of $(u_n)$ has, in turn, a subsequence that converges weakly to $u$, then
the original sequence converges weakly to $u$ (i.e., $u_n \rightharpoonup u$ in $X$ as
$n \to \infty$).
(iii) Bounded sequences. Let $(u_n)$ be a bounded sequence in the reflexive
Banach space $X$. If all the weakly convergent subsequences of $(u_n)$
have the same limit $u$, then $(u_n)$ converges weakly to $u$ (i.e., $u_n \rightharpoonup u$
in $X$ as $n \to \infty$).
Hint: Cf. Zeidler (1986), Vol. 1, Section 10.5.
3.8. Compact operators and weak convergence. Let $A\colon X \to Y$ be a linear
operator, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Show that
(i) If $A$ is compact, then $A$ is strongly continuous, that is, as $n \to \infty$,
$u_n \rightharpoonup u$ implies $Au_n \to Au$.
(ii) Conversely, if $A$ is strongly continuous and $X$ is reflexive, then $A$ is
compact.
Hint: Cf. Zeidler (1986), Vol. 2A, Proposition 21.29.
The following problems summarize important properties of reflexive Banach
spaces. The statement in Problem 3.17 represents a deep result of
functional analysis (the Eberlein-Smuljan theorem).
3.9. Invariance of reflexivity under normisomorphisms. Let $X$ and $Y$ be
normed spaces over $\mathbb{K}$ with
$$X \cong Y$$
(i.e., $X$ is normisomorphic to $Y$). Show that $X$ is reflexive iff $Y$ is reflexive.
3.10. Reflexivity of closed linear subspaces. Let $L$ be a closed linear subspace
of the reflexive Banach space $X$ over $\mathbb{K}$. Then $L$ is also reflexive.
This has been proved in Section 2.8.
3.11. Reflexivity of the dual space. Show that a Banach space $X$ over $\mathbb{K}$ is
reflexive iff the dual space $X^*$ is reflexive.
Hint: If $X$ is reflexive, then the reflexivity of $X^*$ follows by using a
simple argument based on the surjectivity of the operator $j$ from Section
2.8 (cf. Holmes (1975), p. 126).
Conversely, if $X^*$ is reflexive, then so is $X^{**}$, by the preceding argument.
Using the map $j\colon X \to X^{**}$, we obtain that $j(X)$ is a closed linear subspace
of $X^{**}$ with $X \cong j(X)$. Now use Problems 3.9 and 3.10.
3.12. Reflexivity of product spaces. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$.
Set
$$X \times Y := \{(u, v)\colon u \in X,\ v \in Y\}, \quad \|(u, v)\| := \|u\| + \|v\|,$$
$$X^* \times Y^* := \{(u^*, v^*)\colon u^* \in X^*,\ v^* \in Y^*\}, \quad \|(u^*, v^*)\|_* := \max\{\|u^*\|, \|v^*\|\}.$$
Show that
(i) $X^* \times Y^*$ is a normed space over $\mathbb{K}$ equipped with the norm $\|\cdot\|_*$.
(ii) $(X \times Y)^* \cong X^* \times Y^*$.
(iii) If $X$ and $Y$ are reflexive, then so is the product space $X \times Y$.
More precisely, there exists a normisomorphism $\mathcal{J}\colon X^* \times Y^* \to (X \times Y)^*$
given through
$$\mathcal{J}(u^*, v^*)(u, v) := u^*(u) + v^*(v),$$
for all $(u, v) \in X \times Y$ and $(u^*, v^*) \in X^* \times Y^*$.
3.13. Reflexivity of factor spaces. Let $X$ be a Banach space over $\mathbb{K}$, and
let $L$ be a closed linear subspace of $X$. Recall that
$$L^\perp := \{u^* \in X^*\colon \langle u^*, u\rangle = 0 \text{ for all } u \in L\}.$$
Show that
(i) $L^\perp$ is a closed linear subspace of $X^*$.
(ii) There exists a functional $u^* \in L^\perp$ with $\|u^*\| = 1$ and $\langle u^*, u\rangle = 1$ for
some $u \in X$, provided $L \neq X$.
(iii) $(X/L)^* \cong L^\perp$.
(iv) If $X$ is reflexive, then so is the factor space $X/L$.
More precisely, there exists a normisomorphism $\mathcal{J}\colon L^\perp \to (X/L)^*$ given
through
$$\mathcal{J}(u^*)([u]) := u^*(u) \quad \text{for all } u \in X.$$
Recall that the elements $[u]$ of $X/L$ are the sets $[u] := u + L$.
Hint: Use the Hahn-Banach theorem in (ii). The proof of (iii) is based
on (ii). In order to prove (iv), use (iii) along with Problems 3.9 and 3.11.
3.14. Dual operators. Let $A\colon X \to X$ be a linear continuous operator on
the reflexive Banach space $X$ over $\mathbb{K}$. Show that
$$A^{TT} = A.$$
This relation corresponds to $A^{**} = A$ in Hilbert spaces.
3.15.* Embeddings. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$ such that the
embedding
$$X \subseteq Y$$
is continuous, and $X$ is dense in $Y$. Show that
(i) The embedding $Y^* \subseteq X^*$ is continuous.
(ii) If $X$ is reflexive, then $Y^*$ is dense in $X^*$.
Hint: Use the Hahn-Banach theorem. Cf. Zeidler (1986), Vol. 2A, p. 98.
3.16. Weak topology. Let $X$ be a normed space over $\mathbb{K}$. A subset $W$ of $X$
is called weakly open iff, for each point $u_0 \in W$, there is a number $\varepsilon > 0$
and there are finitely many functionals $f_1, \dots, f_n \in X^*$ such that the set
$$\{u \in X\colon |f_j(u - u_0)| < \varepsilon,\ j = 1, \dots, n\}$$
is contained in $W$. Show that
(i) All the weakly open subsets of $X$ form a separated topology (cf. Problem
1.12).
This is called the weak topology of $X$. By weak closedness, weak compactness,
and so forth, we understand closedness, compactness, and
so on with respect to the weak topology.
(ii) The weak convergence is identical to the convergence with respect to
the weak topology.
(iii) In a finite-dimensional normed space over $\mathbb{K}$, the weak topology is
identical to the usual topology induced by the norm.
Important Remark (The shortcoming of classic sequences in general
topological spaces). We have shown in Problems 1.15 and 1.18 that if $X$
is a metric space (e.g., $X$ is a normed space), then a subset $M$ of $X$ is
compact iff it is sequentially compact.
Unfortunately, this result is not valid in general topological spaces (e.g.,
normed spaces equipped with the weak topology). In order to characterize
compact sets by means of convergence in general topological spaces, one
needs generalized sequences (Moore-Smith sequences).
Important results are summarized in the appendix to Zeidler (1986), Vol.
1, pp. 758ff.
3.17.** Weak compactness and reflexivity. Let $X$ be a Banach space over $\mathbb{K}$.
Then the following three fundamental statements are mutually equivalent:
(i) $X$ is reflexive.
(ii) Each bounded sequence in $X$ has a weakly convergent subsequence.
(iii) The closed unit ball $B$ in $X$ is weakly compact.
Study the proof in Rolewicz (1972), Chapter 5. Also see Holmes (1975),
pp. 126 and 149, and Dunford and Schwartz (1958), Vol. 1.
Recall that
$$B \text{ is compact iff } \dim X < \infty,$$
by Section 2.3. Thus, the closed unit ball $B$ of $X$ carries important information
about the structure of the Banach space $X$.
3.18. Reflexivity of finite-dimensional normed spaces. Show that each
finite-dimensional normed space over $\mathbb{K}$ is reflexive.
Hint: An elementary proof follows from Proposition 4 in Section 1.21 of
AMS Vol. 108.
The statement is also an immediate consequence of Problems 3.16(iii)
and 3.17.
3.19. Locally convex spaces. By a seminorm $p$ on the linear space $X$ over
$\mathbb{K}$, we understand a function $p\colon X \to [0, \infty[$ such that
$$p(u + v) \le p(u) + p(v) \quad \text{and} \quad p(\alpha u) = |\alpha|\,p(u)$$
for all $u, v \in X$ and $\alpha \in \mathbb{K}$. Obviously, each norm is a seminorm. Conversely,
a seminorm is a norm iff $p(u) = 0$ implies $u = 0$.
By definition, a locally convex space consists of a linear space $X$ over $\mathbb{K}$
together with a system of seminorms $\{p_j\}_{j \in J}$ on $X$ such that, for $u \in X$,
$$u = 0 \quad \text{iff} \quad p_j(u) = 0 \text{ for all } j \in J.$$
A subset $W$ of $X$ is called open iff, for each point $u_0 \in W$, there is a
number $\varepsilon > 0$ and finitely many seminorms $p_{j_1}, \dots, p_{j_n}$ such that the set
$$\{u \in X\colon p_{j_k}(u - u_0) < \varepsilon,\ k = 1, \dots, n\}$$
is contained in $W$. Show that
(i) These open sets form a separated topology on $X$ (cf. Problem 1.12f).
(ii) Each normed space $X$ equipped with the weak topology is a locally
convex space with respect to the system of seminorms $\{|f|\}_{f \in X^*}$.
Historical Remark. The theory of locally convex spaces was developed
in the 1950s, motivated by the observation that spaces of generalized functions
are locally convex spaces but not normed spaces (cf. the appendix to
Zeidler (1986), Vol. 2B, pp. 1056ff).
3.20. Solution to Problem 3.2. The sets $M$ in (i), (iv), and (v) are of the
first Baire category in $X$, whereas the remaining sets are of the second
Baire category in $X$.
4
The Implicit Function Theorem
Data aequatione quotcunque fluentes quantitates involvente fluxiones
invenire et vice versa.¹
Isaac Newton to Leibniz, 1676
It is worth noting that the notation facilitates discovery. This, in a
most wonderful way, reduces the mind's labor.
Gottfried Wilhelm Leibniz
In this chapter let us consider some basic facts about the differential
calculus for operators. The main strategy encompasses the following:
(i) Differentiation means linearization.
(ii) Higher derivatives correspond to multilinearization.
1 "It is useful to differentiate functions and to solve differential equations."
More precisely, Newton communicated his discovery to Leibniz in the following
form:
6a cc d ae 13e ff 7i 3l 9n 4o 4q rr 4s 9t 12v x.
This decodes into the Latin sentence above, which must have been incomprehensible
to Leibniz, although Leibniz too discovered differential calculus at about
the same time. It is said that more ingenuity is required to decode this anagram
than to discover differential calculus.
226 4. The Implicit Function Theorem
[Figure 4.1: diagram of the logical connections between the following topics:
the Fréchet derivative (F-derivative) and calculus; the Banach continuous
inverse theorem (Chapter 3); the Taylor theorem; the Banach fixed-point
theorem; the implicit function theorem; bifurcation theory (Chapter 5); the
surjective implicit function theorem; nonlinear boundary-value problems
(Chapter 5); the Lagrange multiplier rule; diffeomorphisms (inverse mapping
theorem); the normal form for double splitting maps; the linearization
principle (equivalent maps); the Sard-Smale theorem for nonlinear Fredholm
operators (Chapter 5); the topological direct sum (Chapter 3).]
FIGURE 4.1.
The fundamental implicit function theorem on the unique local solvability
of parameter-dependent operator equations is a consequence of the Banach
fixed-point theorem combined with calculus. In fact, the implicit function
theorem represents a cornerstone of nonlinear analysis. Important applications
are displayed in Figure 4.1.²
We will introduce a notation that produces formulas that are as simple
as possible.
² Many other applications of the implicit function theorem are studied in Zeidler
(1986), Vol. 1, Chapter 4, and Vol. 4, Chapter 73 (Applications to Banach
Manifolds).
4.1 m-Linear Bounded Operators
Definition 1. Let $X_1, \dots, X_m$ and $Y$ be Banach spaces over $\mathbb{K}$. The mapping
$$M\colon X_1 \times \cdots \times X_m \to Y$$
is called m-linear and bounded iff $M$ is linear in each argument and there
is a constant $C \ge 0$ such that
$$\|M(u_1, \dots, u_m)\| \le C\,\|u_1\| \cdots \|u_m\| \tag{1}$$
for all $u_j \in X_j$, $j = 1, \dots, m$.
The norm of $M$ is defined by
$$\|M\| := \sup_{\|u_j\| \le 1,\ j = 1, \dots, m} \|M(u_1, \dots, u_m)\|$$
so that
$$\|M(u_1, \dots, u_m)\| \le \|M\|\,\|u_1\| \cdots \|u_m\|$$
for all $u_j \in X_j$, $j = 1, \dots, m$.
Some examples will be considered in Section 4.3.
Proposition 2. Each m-linear bounded operator is continuous.
Proof. For example, let $m = 2$. Suppose that
$$(u_n, v_n) \to (u, v) \quad \text{in } X_1 \times X_2 \text{ as } n \to \infty.$$
Then $u_n \to u$ and $v_n \to v$ as $n \to \infty$, and hence $(u_n)$ and $(v_n)$ are bounded.
Thus,
$$\|M(u_n, v_n) - M(u, v)\| = \|M(u_n - u, v_n) + M(u, v_n - v)\|$$
$$\le \|M\|\,\|u_n - u\|\,\|v_n\| + \|M\|\,\|u\|\,\|v_n - v\| \to 0$$
as $n \to \infty$. $\Box$
In the following, we write
$$r(h) = o(\|h\|), \quad h \to 0,$$
iff $\|r(h)\| / \|h\| \to 0$ as $h \to 0$. In order to simplify notation, we set
$$M u_1 u_2 \cdots u_m := M(u_1, \dots, u_m).$$
4.2 The Differential of Operators and
the Frechet Derivative
The key formulas are given by the decomposition
$$f(u + h) - f(u) = df(u)h + o(\|h\|), \quad h \to 0, \tag{2}$$
and
$$f'(u) \equiv df(u),$$
as well as
$$df(u + h)k - df(u)k = d^2 f(u)hk + r \tag{3}$$
with the "small" remainder $r$, namely,
$$\sup_{\|k\| \le 1} \|r(u; h, k)\| = o(\|h\|), \quad h \to 0, \tag{4}$$
and
$$f''(u) \equiv d^2 f(u).$$
Definition 1. Let $f\colon U(u) \subseteq X \to Y$ be a given operator defined on an
open neighborhood of the point $u$, where $X$ and $Y$ are Banach spaces over
$\mathbb{K}$.
(i) The differential $df(u)$ of $f$ at the point $u$ exists iff there is a linear
bounded operator denoted by
$$df(u)\colon X \to Y$$
such that (2) holds for all $h \in X$ in some open neighborhood of $h = 0$
in $X$.
Synonymously, we also write $f'(u)$ instead of $df(u)$ and we call $f'(u)$ the
F-derivative³ of $f$ at the point $u$.
(ii) The second differential $d^2 f(u)$ of $f$ at the point $u$ exists iff there is a
bilinear bounded operator denoted by
$$d^2 f(u)\colon X \times X \to Y$$
such that (3) and (4) hold for all $k \in X$ and all $h$ in some open
neighborhood of $h = 0$ in $X$.
³ F-derivative stands for "Fréchet derivative."
Synonymously, we also write $f''(u)$ instead of $d^2 f(u)$ and we call $f''(u)$
the second F-derivative of $f$ at $u$.
Roughly speaking, $df(u) \equiv f'(u)$ represents a linearization of the operator
$f$ at the point $u$.
Moreover, the linearization of $df(u)k \equiv f'(u)k$ at the point $u$ leads to
the bilinear operator $d^2 f(u)hk \equiv f''(u)hk$.
It follows easily that $df(u)$ is uniquely determined by (2). In fact,
$$df(u)h = \lim_{t \to 0} t^{-1}[f(u + th) - f(u)] \equiv \frac{d}{dt} f(u + th)\Big|_{t=0}, \tag{5}$$
provided $df(u)$ exists. Analogously, the existence of $d^2 f(u)$ implies
$$d^2 f(u)hk = \frac{d}{dt}\, df(u + th)k\Big|_{t=0}. \tag{6}$$
Remark 2. Formulas (5) and (6) are frequently used in the following way:
(a) One formally computes $df(u)$ by means of (5).
(b) One justifies this by verifying the decomposition in (2).
The same method works nicely in the case of $d^2 f(u)$. Here one has to use
(6) along with (3).
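The two-step recipe of Remark 2 can be tried numerically. The sketch below assumes a hypothetical map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (x^2 y, \sin x + y)$, which is not taken from the text; `df` is the candidate obtained formally from (5), and `remainder_ratio` measures the quotient that has to tend to zero according to (2).

```python
import math

def f(u):
    """Hypothetical sample operator f: R^2 -> R^2."""
    x, y = u
    return (x * x * y, math.sin(x) + y)

def df(u, h):
    """Candidate differential df(u)h, computed formally by means of (5)."""
    x, y = u
    a, b = h
    return (2 * x * y * a + x * x * b, math.cos(x) * a + b)

def norm(w):
    """Euclidean norm on R^2."""
    return math.hypot(*w)

def remainder_ratio(u, h):
    """||f(u+h) - f(u) - df(u)h|| / ||h||; by (2) this tends to 0 as h -> 0."""
    shifted = (u[0] + h[0], u[1] + h[1])
    r = tuple(p - q - l for p, q, l in zip(f(shifted), f(u), df(u, h)))
    return norm(r) / norm(h)

u = (1.0, 2.0)
ratios = [remainder_ratio(u, (t, -t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios shrink roughly in proportion to $\|h\|$, which is consistent with a remainder of order $\|h\|^2$ for this smooth example.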
Similarly to (3) and (4), the $(n+1)$th differential $d^{n+1} f(u)$ at the point
$u$ is defined through induction by means of the following formula:
$$d^n f(u + h)h_1 \cdots h_n - d^n f(u)h_1 \cdots h_n = d^{n+1} f(u)h h_1 \cdots h_n + r \tag{7}$$
with the "small" remainder
$$\sup_{\|h_j\| \le 1,\ j = 1, \dots, n} \|r(u; h, h_1, \dots, h_n)\| = o(\|h\|), \quad h \to 0. \tag{8}$$
Here we assume that
$$d^{n+1} f(u)\colon X \times \cdots \times X \to Y$$
is an $(n+1)$-linear bounded map. Synonymously, we set
$$f^{(n+1)}(u) \equiv d^{n+1} f(u).$$
Here $f^{(n+1)}(u)$ is called the $(n+1)$th F-derivative of $f$ at the point $u$.
Parallel to (5) and (6), we obtain that the existence of $d^{n+1} f(u)$ implies
$$d^{n+1} f(u)h h_1 \cdots h_n = \frac{d}{dt}\, d^n f(u + th)h_1 \cdots h_n\Big|_{t=0} \tag{9}$$
for all $h, h_1, \dots, h_n \in X$, where $n = 1, 2, \dots$.
Definition 3. The differential $d^n f$ is said to be continuous at the point $u$
iff for each $\varepsilon > 0$, there is a $\delta(\varepsilon) > 0$ such that
$$\|d^n f(u + h) - d^n f(u)\| \le \varepsilon \quad \text{for all } h \in X \text{ with } \|h\| < \delta(\varepsilon),$$
where the norm is to be understood in the sense of Definition 1 from Section
4.1. Explicitly, this means that
$$\|d^n f(u + h)h_1 \cdots h_n - d^n f(u)h_1 \cdots h_n\| \le \varepsilon\,\|h_1\| \cdots \|h_n\|$$
for all $h_1, \dots, h_n \in X$ and all $h \in X$ with $\|h\| < \delta(\varepsilon)$.
The map $f\colon U \subseteq X \to Y$ on the open subset $U$ of $X$ is called $C^k$ ($k \ge 0$)
iff $d^n f$ is continuous on $U$ for $n = 0, 1, \dots, k$, where $d^0 f := f$.
Moreover, $f$ is called $C^\infty$ iff $f$ is $C^k$ for all $k$.
The relation between $f''$ and the iterated derivative $(f')'$ will be studied
in Section 4.6. As we will explain there, with a view to concrete applications,
it is easier to work with $f''(u)$, in the sense of Definition 1, than with
$(f')'(u)$. Our definition of $f''(u)$ emphasizes the philosophy that higher
derivatives correspond to multilinearization.
Classical Standard Example 4. Suppose that the real-valued function $f$
of $N$ real variables $(\xi_1, \dots, \xi_N)$ has continuous partial derivatives up to
the $k$th order on an open neighborhood $U(u)$ of the point $u$. Then the
differential $d^n f(u)$ exists for all $n = 1, \dots, k$, where
$$d^k f(u)h_1 \cdots h_k = \sum_{j_1, \dots, j_k = 1}^{N} \partial_{j_1} \cdots \partial_{j_k} f(u)\, h_{1 j_1} h_{2 j_2} \cdots h_{k j_k} \tag{10}$$
for all $h_1, \dots, h_k \in \mathbb{R}^N$ with $h_j := (h_{j1}, \dots, h_{jN})$.
In addition, if $f$ has continuous partial derivatives up to the $k$th order
on the open subset $U$ of $\mathbb{R}^N$, then $f$ is $C^k$ on $U$.
Formula (10) is identical to the well-known classic formula for differentials.
Proof. Let $k = 1$ and $N = 2$. We set $u := (\xi, \eta)$ and $h := (\alpha, \beta)$. The
classic mean value theorem tells us that
$$f(\xi + \alpha, \eta + \beta) - f(\xi, \eta)$$
$$= f(\xi + \alpha, \eta + \beta) - f(\xi, \eta + \beta) + f(\xi, \eta + \beta) - f(\xi, \eta)$$
$$= f_\xi(\xi + \vartheta\alpha, \eta + \beta)\alpha + f_\eta(\xi, \eta + \theta\beta)\beta, \quad \text{where } 0 < \vartheta, \theta < 1.$$
Note that $|h| = (|\alpha|^2 + |\beta|^2)^{1/2}$; then the continuity of $f_\xi$ and $f_\eta$ at $(\xi, \eta)$
implies that
$$f(u + h) - f(u) = f_\xi(u)\alpha + f_\eta(u)\beta + o(|h|), \quad h \to 0.$$
Hence
$$df(u)h = f_\xi(u)\alpha + f_\eta(u)\beta.$$
This is (10) for $k = 1$.
The remaining statements are proved similarly. $\Box$
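For $k = 2$ and $N = 2$, formula (10) says that the second differential is the bilinear form of the matrix of second partial derivatives. The sketch below compares this with a difference quotient for the right-hand side of (6). The sample function $f(\xi, \eta) = \xi^2\eta + \eta^3$, its hand-computed derivatives, and the helper names are assumptions made for illustration only.

```python
# Check (10) for k = 2, N = 2 against formula (6).
# Assumption: f(xi, eta) = xi^2 * eta + eta^3 is a hypothetical sample function.

def grad(u):
    """First partial derivatives of f at u."""
    xi, eta = u
    return (2 * xi * eta, xi * xi + 3 * eta * eta)

def hessian(u):
    """Second partial derivatives of f at u."""
    xi, eta = u
    return ((2 * eta, 2 * xi),
            (2 * xi, 6 * eta))

def d1(u, k):
    """First differential df(u)k, that is, (10) with k = 1."""
    return sum(g * kj for g, kj in zip(grad(u), k))

def d2(u, h, k):
    """Second differential from (10): sum over i, j of d_i d_j f(u) h_i k_j."""
    H = hessian(u)
    return sum(H[i][j] * h[i] * k[j] for i in range(2) for j in range(2))

def d2_from_6(u, h, k, t=1e-4):
    """Central difference quotient for d/dt df(u + t h)k at t = 0, cf. (6)."""
    plus = (u[0] + t * h[0], u[1] + t * h[1])
    minus = (u[0] - t * h[0], u[1] - t * h[1])
    return (d1(plus, k) - d1(minus, k)) / (2 * t)

u, h, k = (1.0, 2.0), (0.5, -1.0), (2.0, 3.0)
exact = d2(u, h, k)
approx = d2_from_6(u, h, k)
print(exact, approx)
```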
Example 5 (Differentiation of bilinear operators). Let
$$B\colon X_1 \times X_2 \to Y$$
be a bilinear bounded operator, where $X_1$, $X_2$, and $Y$ are Banach spaces
over $\mathbb{K}$. Set $X := X_1 \times X_2$ and $u := (u_1, u_2)$ for $u \in X$.
Then $B$ is $C^\infty$. For all $u, h, k \in X$,
$$dB(u)h = B(u_1, h_2) + B(h_1, u_2), \tag{11}$$
$$d^2 B(u)kh = B(k_1, h_2) + B(h_1, k_2), \tag{12}$$
and $d^n B(u) = 0$ if $n = 3, 4, \dots$.
Proof. Let us use the strategy from Remark 2.
Ad (11). Formally,
$$dB(u)h = \frac{d}{dt} B(u + th)\Big|_{t=0}.$$
Since
$$B(u + th) = B(u_1 + th_1, u_2 + th_2) = B(u_1, u_2) + t[B(h_1, u_2) + B(u_1, h_2)] + t^2 B(h_1, h_2), \tag{13}$$
we get (11).
To justify this, we have to inspect the remainder. Note that
$$\|u\| = \|u_1\| + \|u_2\| \quad \text{for all } u \in X.$$
By (13) with $t = 1$,
$$B(u + h) = B(u) + dB(u)h + r,$$
where $r = B(h_1, h_2)$. Hence
$$\|r\| \le \|B\|\,\|h_1\|\,\|h_2\| \le \|B\|\,\|h\|^2 \quad \text{for all } h \in X,$$
that is, $r = o(\|h\|)$ as $h \to 0$.
It follows from (11) and the bilinearity of $B$ that $dB(u)\colon X \to Y$ is linear.
Finally, we have to show that the operator $dB(u)\colon X \to Y$ is bounded.
But this follows from
$$\|dB(u)h\| \le \|B\|\,\|u_1\|\,\|h_2\| + \|B\|\,\|h_1\|\,\|u_2\| \le 2\|B\|\,\|u\|\,\|h\| \quad \text{for all } h \in X.$$
Ad (12). Relation (12) follows analogously to (11). Since $d^2 B(u)$ does
not depend on $u$, we also get $d^n B(u) = 0$ if $n \ge 3$.
By (11), the continuity of $u \mapsto dB(u)$ follows from
$$\|dB(u)h - dB(v)h\| \le \|B\|\,\|u_1 - v_1\|\,\|h_2\| + \|B\|\,\|h_1\|\,\|u_2 - v_2\| \le 2\|B\|\,\|u - v\|\,\|h\|,$$
and hence
$$\|dB(u) - dB(v)\| \le 2\|B\|\,\|u - v\| \quad \text{for all } u, v \in X.$$
Similarly, we get the continuity of $d^2 B$. $\Box$
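The decomposition used in the proof, $B(u + h) = B(u) + dB(u)h + B(h_1, h_2)$, is an exact identity and can be verified on a concrete bilinear form. The sketch below assumes $X_1 = X_2 = \mathbb{R}^2$, $Y = \mathbb{R}$, and $B(x, y) = x^T C y$ with a hypothetical coefficient matrix `C`; none of these choices comes from the text.

```python
# Bilinear form B(x, y) = x^T C y on R^2 x R^2 with values in R.
# Assumption: the coefficient matrix C is a hypothetical choice.
C = ((1.0, 2.0),
     (0.0, -1.0))

def B(x, y):
    return sum(C[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def dB(u, h):
    """Formula (11): dB(u)h = B(u1, h2) + B(h1, u2)."""
    (u1, u2), (h1, h2) = u, h
    return B(u1, h2) + B(h1, u2)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

u = ((1.0, 2.0), (3.0, -1.0))        # u = (u1, u2)
h = ((0.5, -0.25), (0.125, 2.0))     # h = (h1, h2)

# Remainder r = B(u + h) - B(u) - dB(u)h, which should equal B(h1, h2).
remainder = B(add(u[0], h[0]), add(u[1], h[1])) - B(*u) - dB(u, h)
expected = B(*h)
print(remainder, expected)
```

Because the remainder is exactly quadratic in $h$, replacing `h` by a multiple `t * h` scales it by $t^2$, which is the reason for $r = o(\|h\|)$.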
Proposition 6 (The sum rule). Let $f, g\colon U(u) \subseteq X \to Y$ be mappings on
an open neighborhood of the point $u$, where $X$ and $Y$ are Banach spaces
over $\mathbb{K}$. Let $n = 1, 2, \dots$. Then
$$(\alpha f + \beta g)^{(n)}(u) = \alpha f^{(n)}(u) + \beta g^{(n)}(u) \quad \text{for all } \alpha, \beta \in \mathbb{K},$$
provided the F-derivatives $f^{(n)}(u)$ and $g^{(n)}(u)$ exist.
Proof. This follows immediately from the definition of the F-derivative. $\Box$
Partial F-derivatives are defined in parallel to the classical situation.
Definition 7. Let the map
$$f\colon U(u, v) \subseteq X \times Y \to Z$$
be given on an open neighborhood of the point $(u, v)$, where $X$, $Y$, and $Z$
are Banach spaces over $\mathbb{K}$.
Let $v$ be fixed and set $g(w) := f(w, v)$. If $g$ has an F-derivative at the
point $u$, then we define the partial F-derivative $f_u(u, v)$ through
$$f_u(u, v) := g'(u).$$
The partial F-derivative $f_v(u, v)$ is defined similarly. Instead of $f_u(u, v)$,
$f_v(u, v)$, one also writes $D_1 f(u, v)$, $D_2 f(u, v)$, respectively.
Proposition 8. Let $f\colon U(u, v) \subseteq X \times Y \to Z$ be given as in Definition 7. If
the F-derivative $f'(u, v)$ exists, then the partial derivatives $f_u(u, v)$, $f_v(u, v)$
also exist and
$$f'(u, v)(h, k) = f_u(u, v)h + f_v(u, v)k \quad \text{for all } h \in X,\ k \in Y.$$
Proof. Note that $f_u(u, v)h = f'(u, v)(h, 0)$ and $f_v(u, v)k = f'(u, v)(0, k)$. $\Box$
Further properties of partial F-derivatives will be proved in Problem
4.11.
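The formula of Proposition 8 can be checked numerically in the scalar case $X = Y = Z = \mathbb{R}$. The sketch below assumes the hypothetical function $f(x, y) = x^2 y + \sin y$ with hand-computed partial derivatives; the directional difference quotient stands in for $f'(u, v)(h, k)$.

```python
import math

def f(x, y):
    """Hypothetical sample function f: R x R -> R."""
    return x * x * y + math.sin(y)

def f_x(x, y):
    """Partial derivative with respect to the first variable."""
    return 2 * x * y

def f_y(x, y):
    """Partial derivative with respect to the second variable."""
    return x * x + math.cos(y)

def total(x, y, h, k):
    """Right-hand side of Proposition 8: f_u(u, v)h + f_v(u, v)k."""
    return f_x(x, y) * h + f_y(x, y) * k

def directional(x, y, h, k, t=1e-6):
    """Central difference quotient approximating f'(u, v)(h, k)."""
    return (f(x + t * h, y + t * k) - f(x - t * h, y - t * k)) / (2 * t)

point, direction = (1.5, 0.3), (2.0, -1.0)
lhs = directional(*point, *direction)
rhs = total(*point, *direction)
print(lhs, rhs)
```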
4.3 Applications to Analytic Operators
Definition 1. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$. Let there be given
a $k$-linear bounded operator
$$M\colon X \times X \times \cdots \times X \to Y,$$
which is symmetric in all variables. A power operator is created from $M$
by setting
$$M u^k := M(u, \dots, u) \tag{14a}$$
and
$$M u^m v^n := M(\underbrace{u, \dots, u}_{m \text{ times}}; \underbrace{v, \dots, v}_{n \text{ times}}), \quad m + n = k, \tag{14b}$$
for any partition of $k$. For $k = 0$, $M u^0$ is a fixed element $w$ in $Y$. Henceforth
$\|M\|\,\|u\|^n$ with $n = 0$ will denote the norm $\|w\|$ of this element.
More precisely, in (14b) we need only that $(u_1, \dots, u_k) \mapsto M(u_1, \dots, u_k)$
be symmetric with respect to both $(u_1, \dots, u_m)$ and $(u_{m+1}, \dots, u_k)$.
Example 2 (Integral operators). Let $X = Y = C[a, b]$, and let $A\colon [a, b] \times
[a, b] \to \mathbb{R}$ be continuous, where $-\infty < a < b < \infty$. Define
$$M(u, v, w)(y) := \int_a^b A(y, x)u(x)v(x)w(x)\,dx \quad \text{for all } y \in [a, b]$$
and all $u, v, w \in X$. Then we obtain a power operator from $X \times X$ to $X$ by
setting
$$M u v^2 := M(u, v, v) \quad \text{for all } u, v \in X.$$
We have⁴
$$\|M(u, v, w)\| \le (b - a) \max_{a \le x, y \le b} |A(x, y)|\,\|u\|\,\|v\|\,\|w\| \quad \text{for all } u, v, w \in X,$$
and hence
$$\|M\| \le (b - a) \max_{a \le x, y \le b} |A(x, y)|.$$
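A discretized version of Example 2 shows how the norm bound arises. The following sketch replaces the integral by a midpoint quadrature sum and all maxima by maxima over the grid, so it illustrates the estimate for the discretized operator only. The interval $[0, 1]$, the kernel `A`, the grid size `n`, and the functions `u`, `v` are hypothetical choices.

```python
import math

# Midpoint-rule discretization of the trilinear integral operator of Example 2.
# Assumptions: [a, b] = [0, 1], the kernel A, and the grid size n are hypothetical.
a, b, n = 0.0, 1.0, 100
nodes = [a + (b - a) * (i + 0.5) / n for i in range(n)]
weight = (b - a) / n      # the n quadrature weights add up to b - a

def A(y, x):
    return math.cos(3.0 * x * y) - x

def M(u, v, w):
    """Quadrature sums for the integral of A(y, x)u(x)v(x)w(x) dx at the nodes y."""
    return [weight * sum(A(y, x) * u(x) * v(x) * w(x) for x in nodes)
            for y in nodes]

def sup_norm(values):
    return max(abs(t) for t in values)

def grid_norm(u):
    """Maximum of |u| over the grid, a stand-in for the norm of C[a, b]."""
    return sup_norm([u(x) for x in nodes])

u = math.sin
v = lambda x: 1.0 + x * x

Muv2 = M(u, v, v)         # power operator M u v^2 := M(u, v, v)
max_A = max(abs(A(y, x)) for y in nodes for x in nodes)
bound = (b - a) * max_A * grid_norm(u) * grid_norm(v) ** 2
print(sup_norm(Muv2), bound)
```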
⁴ Recall that $\|u\| = \max_{a \le x \le b} |u(x)|$.
Definition 3. Let $C^k[a, b]$ be the set of all continuous functions
$$u\colon [a, b] \to \mathbb{R}$$
that have continuous derivatives up to order $k$ on the compact interval
$[a, b]$.
Then $C^k[a, b]$ is a real Banach space with the norm
$$\|u\| := \sum_{j=0}^{k} \max_{a \le x \le b} |u^{(j)}(x)|.$$
This is a special case of Example 6 in Section 4.4.
Example 4 (Differential operators). Let $X := C^1[a, b]$, $Y := C[a, b]$, and
$$M(u, v, w)(y) := \phi(y)u'(y)v'(y)w(y) \quad \text{for all } y \in [a, b],$$
all $u, v, w \in X$, and fixed function $\phi \in Y$. Then, letting
$$M u^2 v := M(u, u, v) \quad \text{for all } u, v \in X, \tag{15}$$
we obtain a power operator from $X \times X$ to $Y$. We have
$$\|M(u, v, w)\|_Y \le \|\phi\|_Y\,\|u\|_X\,\|v\|_X\,\|w\|_X \quad \text{for all } u, v, w \in X,$$
and hence $\|M\| \le \|\phi\|_Y$.
Computation with power operators is the same as with ordinary powers.
For all $u, v \in X$, $\alpha \in \mathbb{K}$, and $n, m, k \in \mathbb{N}$, we have
$$M(u + v)^k = \sum_{j=0}^{k} \binom{k}{j} M u^{k-j} v^j \quad \text{(binomial formula)}, \tag{16}$$
$$\|M u^k - M v^k\| \le k R^{k-1} \|M\|\,\|u - v\| \quad \text{for } \|u\|, \|v\| \le R. \tag{17}$$
The simple idea behind the proof of (17) can be seen by considering the
decomposition
$$M u^2 - M v^2 = M(u, u) - M(v, v) = M(u - v, u) + M(v, u - v),$$
and so
$$\|M u^2 - M v^2\| \le \|M\|\,\|u - v\|\,(\|u\| + \|v\|) \le 2R\,\|M\|\,\|u - v\|.$$
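Formulas (16) and (17) can be tested for $k = 2$ on a symmetric bilinear form. The sketch below assumes $X = \mathbb{R}^2$ with the Euclidean norm, $Y = \mathbb{R}$, and $M(u, v) = u^T S v$ with the hypothetical matrix $S = \mathrm{diag}(2, -1)$, for which $\|M\| = 2$; the random sampling is only a spot check, not a proof.

```python
import math
import random

# Symmetric bilinear form M(u, v) = u^T S v on R^2 (Euclidean norm).
# Assumption: S = diag(2, -1) is a hypothetical choice; for it ||M|| = 2.
S = ((2.0, 0.0), (0.0, -1.0))
NORM_M = 2.0

def M(u, v):
    return sum(S[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

def norm(u):
    return math.hypot(u[0], u[1])

def binomial_gap(u, v):
    """M(u+v)^2 - (Mu^2 + 2 Muv + Mv^2), which is zero by (16) with k = 2."""
    s = (u[0] + v[0], u[1] + v[1])
    return M(s, s) - (M(u, u) + 2 * M(u, v) + M(v, v))

def lipschitz_ok(u, v, R, k=2):
    """Estimate (17) for k = 2: |Mu^2 - Mv^2| <= k R^(k-1) ||M|| ||u - v||."""
    diff = (u[0] - v[0], u[1] - v[1])
    lhs = abs(M(u, u) - M(v, v))
    return lhs <= k * R ** (k - 1) * NORM_M * norm(diff) + 1e-12

def sample(R, rng):
    """Random point of the closed ball of radius R."""
    r, angle = R * rng.random(), 2 * math.pi * rng.random()
    return (r * math.cos(angle), r * math.sin(angle))

rng = random.Random(0)
R = 3.0
all_ok = all(lipschitz_ok(sample(R, rng), sample(R, rng), R) for _ in range(1000))
gap = binomial_gap((1.0, 2.0), (3.0, -1.0))
print(all_ok, gap)
```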
Proposition 5. Power operators have Lipschitz continuous F-derivatives
of arbitrary order. The formulas for the F-derivatives are parallel to the
corresponding classic formulas.
To explain this, let the power operator $A\colon X \to Y$ be given, where
$$Au := M u^k.$$
Then, for each $u \in X$,
$$A'(u)h = k M u^{k-1} h, \tag{18}$$
$$A''(u)h_1 h_2 = k(k-1) M u^{k-2} h_1 h_2$$
for all $h, h_1, h_2 \in X$, and so on.
The proof of (18) follows from (16). Let us explain the simple idea of the
proof with the following special case.
Example 6. Let $Au := M u^2$. Then, for all $u, v, h, h_1, h_2 \in X$,
$$A'(u)h = 2Muh, \tag{19}$$
$$\|A'(u) - A'(v)\| \le 2\|M\|\,\|u - v\|, \tag{20}$$
$$A''(u)h_1 h_2 = 2M h_1 h_2, \tag{21}$$
and $A^{(n)}(u) \equiv 0$ if $n \ge 3$.
Proof. From
$$A(u + h) - Au = M(u + h, u + h) - M(u, u) = 2M(u, h) + M(h, h) = 2Muh + r(h),$$
along with $\|r(h)\| \le \|M\|\,\|h\|^2$, we get (19).
Relation (20) follows from
$$\|(A'(u) - A'(v))h\| = \|2M(u - v)h\| \le 2\|M\|\,\|u - v\|\,\|h\|,$$
that is, $A'\colon X \to L(X, Y)$ is Lipschitz continuous. Furthermore,
$$A'(u + h_1)h_2 - A'(u)h_2 = 2M h_1 h_2.$$
This yields (21). $\Box$
The following definition is basic. Let us consider expressions of the form
$$A(u) := \sum_{k=0}^{\infty} M_k (u - u_0)^k, \quad u \in X, \tag{22}$$
together with the majorant condition
$$\sum_{k=0}^{\infty} \|M_k\|\,\|u - u_0\|^k < \infty. \tag{23}$$
Convergence of (23) ensures the absolute convergence of (22), and hence
the convergence of (22), by Section 1.22 in AMS Vol. 108.
Definition 7. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$, and let $M_k\colon X \to Y$
be power operators for all $k = 0, 1, \dots$.
The operator $A\colon U(u_0) \subseteq X \to Y$ is called analytic at the point $u_0 \in X$
iff there is a number $\rho > 0$ such that series (23) converges for all $u \in X$
with $\|u - u_0\| \le \rho$ and $A(u)$ allows representation (22) for all those points
$u$.
The operator $A$ is called analytic on the open set $V$ iff $A$ is analytic at
each point of $V$.
Theorem 4.A. If the operator $A$ is analytic at the point $u_0$, then $A$ is $C^\infty$
on some open neighborhood $V(u_0)$ of $u_0$.
Moreover, the formulas for the derivatives are obtained by an application
of the corresponding formulas for power operators.
For example, from (22) and (23) we obtain
$$A'(u)h = \sum_{k=1}^{\infty} k M_k (u - u_0)^{k-1} h \quad \text{for all } h \in X \text{ and } u \in V(u_0).$$
Proof. By (23), the series
$$\sum_{k=0}^{\infty} \|M_k\|\, z^k$$
converges for all $z \in \mathbb{C}$ with $|z| \le \rho$, and hence the differentiated series
$$\sum_{k=1}^{\infty} k \|M_k\|\, z^{k-1}$$
converges for all $z \in \mathbb{C}$ with $|z| \le \rho/2$.
Step 1: We want to prove that the F-derivative $A'(u)$ exists for all $u \in X$
with $\|u - u_0\| \le \rho/2$.
Let $\|h\| \le \rho/2$ and $\|u - u_0\| \le \rho/2$, and let $k = 2, 3, \dots$. By (16),
$$M_k (u + h - u_0)^k - M_k (u - u_0)^k = k M_k (u - u_0)^{k-1} h + r_k,$$
where
$$r_k := \sum_{j=2}^{k} \binom{k}{j} M_k (u - u_0)^{k-j} h^j.$$
Since $\sum_{j=0}^{k} \binom{k}{j} = 2^k$,
$$\|r_k\| \le \sum_{j=2}^{k} \binom{k}{j} \|M_k\|\,\|u - u_0\|^{k-j} \|h\|^j \le 4\|h\|^2\, \|M_k\|\, \rho^{k-2}.$$
Hence
$$A(u + h) - A(u) = Lh + r,$$
where
$$Lh := \sum_{k=1}^{\infty} k M_k (u - u_0)^{k-1} h \quad \text{and} \quad r := \sum_{k=2}^{\infty} r_k.$$
This implies
$$\|Lh\| \le \Big(\sum_{k=1}^{\infty} k \|M_k\| \Big(\frac{\rho}{2}\Big)^{k-1}\Big) \|h\|$$
and
$$\|r\| \le \frac{4}{\rho^2} \Big(\sum_{k=2}^{\infty} \|M_k\| \rho^k\Big) \|h\|^2.$$
Therefore, $A'(u) = L$.
Step 3: The existence of higher derivatives can be proved analogously. $\Box$
The explicit computation of higher F-derivatives can frequently be based
on the following special product rule:
$$\frac{d}{dt} B(\phi(t), \psi(t))\Big|_{t=s} = B(\phi'(s), \psi(s)) + B(\phi(s), \psi'(s)). \tag{24}$$
Proposition 8. Let $B\colon X \times Y \to Z$ be a bilinear bounded operator, where $X$, $Y$, and $Z$ are Banach spaces over $\mathbb{K}$. Furthermore, let
$$\phi\colon U(s) \subseteq \mathbb{R} \to X \quad\text{and}\quad \psi\colon U(s) \subseteq \mathbb{R} \to Y$$
be functions defined on an open neighborhood of $s \in \mathbb{R}$ such that the derivatives $\phi'(s)$ and $\psi'(s)$ exist.
Then the derivative of the function $t \mapsto B(\phi(t), \psi(t))$ exists at the point $s$ and formula (24) holds true.
Proof. We have
$$\phi(s + h) = \phi(s) + h\phi'(s) + h\alpha(h) \quad\text{and}\quad \psi(s + h) = \psi(s) + h\psi'(s) + h\beta(h),$$
where $\alpha(h) \to 0$ and $\beta(h) \to 0$ as $h \to 0$. Hence
$$B(\phi(s + h), \psi(s + h)) = B(\phi(s), \psi(s)) + hb + h\gamma(h),$$
where $b$ denotes the right-hand side of (24) and $\gamma(h) \to 0$ as $h \to 0$, by the continuity of $B$. □
Standard Example 9. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$ with $X \ne \{0\}$ and $Y \ne \{0\}$. By Section 1.23 in AMS Vol. 108, there exists a maximal nonempty subset $L_{\mathrm{inv}}(X, Y)$ of $L(X, Y)$ such that $A^{-1} \in L(Y, X)$ for all $A \in L_{\mathrm{inv}}(X, Y)$.
Define the operator $\Phi\colon L_{\mathrm{inv}}(X, Y) \to L(Y, X)$ by
$$\Phi(A) := A^{-1}.$$
Then $\Phi$ is analytic and
$$\Phi'(A)B = -\Phi(A)B\Phi(A) \quad\text{for all } A \in L_{\mathrm{inv}}(X, Y),\ B \in L(X, Y). \tag{25}$$
Proof. Set $H := -A^{-1}B$. The Neumann series from Section 1.23 in AMS Vol. 108 yields
$$\Phi(A + B) = (A(I - H))^{-1} = (I - H)^{-1}A^{-1} = (I + H + H^2 + \cdots)A^{-1}. \tag{26}$$
For $\|H\| < 1$, this series has the geometric series
$$\|A^{-1}\|\,(1 + \|H\| + \|H\|^2 + \cdots)$$
as a majorant series, and $\Phi$ is hence analytic.
By Theorem 4.A, the operator $\Phi$ has F-derivatives of each order. According to (9) and (26),
$$\Phi'(A)B = \frac{d}{dt}\Phi(A + tB)\Big|_{t=0} = HA^{-1}.$$
This is (25). □
Using (9) and the special product rule (24), all the higher derivatives of $\Phi$ are obtained recursively. For example, to compute $\Phi''$, note that
$$(E, F) \mapsto -EBF$$
is a bilinear bounded operator from $L(Y, X) \times L(Y, X)$ to $L(Y, X)$, since $\|EBF\| \le \|B\|\,\|E\|\,\|F\|$. By (9) and (25),
$$\Phi''(A)CB = \frac{d}{dt}\Phi'(A + tC)B\Big|_{t=0} = -\Phi'(A)C\,B\,\Phi(A) - \Phi(A)\,B\,\Phi'(A)C,$$
for all $A \in L_{\mathrm{inv}}(X, Y)$ and $B, C \in L(X, Y)$.
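Formula (25) can be checked numerically in the finite-dimensional special case X = Y = R^2, where the operators are 2x2 matrices. The following Python sketch is an illustration only: the matrices A and B, the step size t, and all helper names are assumptions made for this example and do not come from the text. It compares -A^{-1} B A^{-1} with a central difference quotient of t -> (A + tB)^{-1} at t = 0.

```python
# Finite-difference check of formula (25) for 2x2 real matrices,
# i.e. X = Y = R^2 (an illustrative assumption).

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def add(P, Q, t=1.0):
    # returns P + t * Q
    return [[P[i][j] + t * Q[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [0.5, 3.0]]     # hypothetical invertible operator
B = [[0.3, -0.2], [0.1, 0.4]]    # hypothetical direction

# Right-hand side of (25): -A^{-1} B A^{-1}
Ainv = inverse(A)
exact = [[-x for x in row] for row in matmul(matmul(Ainv, B), Ainv)]

# Central difference quotient of t -> (A + tB)^{-1} at t = 0
t = 1e-5
plus, minus = inverse(add(A, B, t)), inverse(add(A, B, -t))
approx = [[(plus[i][j] - minus[i][j]) / (2 * t) for j in range(2)]
          for i in range(2)]

error = max_abs_diff(exact, approx)
print("largest entrywise deviation:", error)
```

The deviation is expected to be small compared with the entries of the derivative; it reflects only the truncation and rounding error of the difference quotient.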
4.4 Integration
Throughout this section let $-\infty < a < b < \infty$, and let $X$ be a Banach space over $\mathbb{K}$ with the norm $\|\cdot\|$. Furthermore, set
$$\|u\|_* := \sup_{a \le t \le b} \|u(t)\|.$$
FIGURE 4.2.
Parallel to classical analysis, we will define the integral through
$$\int_a^b u(t)\,dt := \lim_{n \to \infty} \int_a^b u_n(t)\,dt, \tag{27}$$
where $(u_n)$ is a sequence of regular step functions with
$$\|u - u_n\|_* \to 0 \quad\text{as } n \to \infty. \tag{28}$$
Definition 1. Let the function $u\colon [a, b] \to X$ be given.
(a) $u$ is called a regular⁵ step function iff there exists a decomposition $a = t_0 < t_1 < \cdots < t_m = b$ of the interval $[a, b]$ such that the function $u = u(t)$ is constant on all the open subintervals $]t_{j-1}, t_j[$, that is,
$$u(t) = b_j \quad\text{for all } t \in\, ]t_{j-1}, t_j[,\ j = 1, \ldots, m,$$
where $b_j \in X$ (see Figure 4.2). The integral of such a regular step function $u$ is defined by
$$\int_a^b u(t)\,dt := \sum_{j=1}^{m} (t_j - t_{j-1})\, b_j.$$
Obviously, by the triangle inequality,
$$\Big\|\int_a^b u(t)\,dt\Big\| \le (b - a)\|u\|_*. \tag{29}$$
(b) $u$ is called integrable iff there exists a sequence $(u_n)$ of regular step functions $u_n\colon [a, b] \to X$ such that (28) holds. Then the integral $\int_a^b u(t)\,dt$ is defined by (27).
The following proposition shows that definition (b) makes sense.
Proposition 2. (i) The limit in (27) exists.
(ii) This limit is independent of the choice of $(u_n)$.
⁵ More general step functions are considered in the appendix to AMS Vol. 108 in order to define the more general Lebesgue integral.
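Proof. Ad (i). If $v$ and $w$ are regular step functions on $[a, b]$, then so is the linear combination $\alpha v + \beta w$ for $\alpha, \beta \in \mathbb{K}$. This follows by using a refinement of the corresponding partitions of $[a, b]$.
Let
$$\|u_n - u\|_* \to 0 \quad\text{and}\quad \|v_n - u\|_* \to 0 \quad\text{as } n \to \infty,$$
where $(u_n)$ and $(v_n)$ are sequences of regular step functions on $[a, b]$. Then, for each $\varepsilon > 0$, there is an $n_0(\varepsilon)$ such that
$$\|u_n - u_m\|_* \le \|(u_n - u) + (u - u_m)\|_* \le \|u_n - u\|_* + \|u - u_m\|_* < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon), \tag{30}$$
that is, $(u_n)$ is Cauchy with respect to $\|\cdot\|_*$. Furthermore,
$$\|u_n - v_n\|_* \le \|u_n - u\|_* + \|u - v_n\|_* \to 0 \quad\text{as } n \to \infty. \tag{31}$$
It follows from (29) and (30) that
$$\Big\|\int_a^b u_n(t)\,dt - \int_a^b u_m(t)\,dt\Big\| = \Big\|\int_a^b (u_n(t) - u_m(t))\,dt\Big\| \le (b - a)\|u_n - u_m\|_* < (b - a)\varepsilon$$
for all $n, m \ge n_0(\varepsilon)$. Thus, the sequence $\big(\int_a^b u_n(t)\,dt\big)$ is Cauchy in the Banach space $X$. Hence the limit in (27) exists.
Ad (ii). By (29) and (31),
$$\Big\|\int_a^b u_n(t)\,dt - \int_a^b v_n(t)\,dt\Big\| \le (b - a)\|u_n - v_n\|_* \to 0 \quad\text{as } n \to \infty. \qquad \Box$$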
Standard Example 3. Each continuous function $u\colon [a, b] \to X$ is integrable.
Proof. Let $n = 1, 2, \ldots$. For $j = 0, 1, \ldots, n$, define $t_j := a + j\Delta t$, where
$$\Delta t := \frac{b - a}{n}.$$
Set
$$u_n(t) := u(t_j) \quad\text{for all } t \in [t_{j-1}, t_j[,\ j = 1, \ldots, n,$$
along with $u_n(b) := u(b)$. Then
$$\|u - u_n\|_* \le \max_{1 \le j \le n}\ \sup_{t_{j-1} \le t \le t_j} \|u(t) - u(t_j)\| \to 0 \quad\text{as } n \to \infty,$$
since $u$ is uniformly continuous on $[a, b]$ (cf. Proposition 9 in Section 1.11 of AMS Vol. 108). □
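The construction in this proof can be tried out numerically. The Python sketch below is an illustration under assumptions that are not part of the text: X = R^2 with the Euclidean norm, u(t) = (cos t, sin t) on [0, pi/2], and the function names are hypothetical. It integrates the regular step functions u_n built from the values u(t_j) and compares the result with the exact integral (1, 1).

```python
import math

def u(t):
    # hypothetical continuous function with values in X = R^2
    return (math.cos(t), math.sin(t))

def step_integral(u, a, b, n):
    # integral of the regular step function u_n with u_n = u(t_j) on [t_{j-1}, t_j[
    dt = (b - a) / n
    total = [0.0, 0.0]
    for j in range(1, n + 1):
        value = u(a + j * dt)          # b_j = u(t_j)
        total[0] += dt * value[0]
        total[1] += dt * value[1]
    return tuple(total)

def norm(x):
    return math.hypot(x[0], x[1])

a, b = 0.0, math.pi / 2
exact = (1.0, 1.0)                     # componentwise antiderivatives (sin t, -cos t)
errors = []
for n in (10, 100, 1000):
    approx = step_integral(u, a, b, n)
    errors.append(norm((approx[0] - exact[0], approx[1] - exact[1])))
    print(n, approx, errors[-1])
```

In this example the error in the norm of X shrinks roughly in proportion to 1/n, in line with the limit (27).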
Obviously, if $a < c < b$ and $u\colon [a, b] \to X$ is integrable over both subintervals $[a, c]$ and $[c, b]$, then $u$ is also integrable over $[a, b]$ and
$$\int_a^b u(t)\,dt = \int_a^c u(t)\,dt + \int_c^b u(t)\,dt. \tag{32}$$
Proposition 4 (Properties of the integral). Let $u, v\colon [a, b] \to X$ be integrable. Then the following properties hold:
(i) Linearity. For all $\alpha, \beta \in \mathbb{K}$, the function $\alpha u + \beta v$ is integrable over $[a, b]$ and
$$\int_a^b (\alpha u(t) + \beta v(t))\,dt = \alpha \int_a^b u(t)\,dt + \beta \int_a^b v(t)\,dt.$$
(ii) Generalized triangle inequality. The function $t \mapsto \|u(t)\|$ is integrable over $[a, b]$ and
$$\Big\|\int_a^b u(t)\,dt\Big\| \le \int_a^b \|u(t)\|\,dt. \tag{33}$$
(iii) Functionals. For each functional $u^* \in X^*$, the function $t \mapsto \langle u^*, u(t)\rangle$ is integrable over $[a, b]$ and
$$\int_a^b \langle u^*, u(t)\rangle\,dt = \Big\langle u^*, \int_a^b u(t)\,dt \Big\rangle. \tag{34}$$
Proof. Ad (i), (iii). Use definition (27) and the continuity of $u^*$.
Ad (ii). By (27) and the triangle inequality,
$$\Big\|\int_a^b u(t)\,dt\Big\| = \lim_{n \to \infty} \Big\|\int_a^b u_n(t)\,dt\Big\| \le \lim_{n \to \infty} \int_a^b \|u_n(t)\|\,dt = \int_a^b \|u(t)\|\,dt.$$
Note that if $t \mapsto u_n(t)$ is a regular step function on $[a, b]$, then so is $t \mapsto \|u_n(t)\|$. □
Theorem 4.B (The fundamental theorem of calculus). Let $u\colon [a, b] \to X$ be continuous. Then the function $v\colon [a, b] \to X$ defined by
$$v(s) := \int_a^s u(t)\,dt, \qquad a \le s \le b,$$
is differentiable on $[a, b]$ with⁶
$$v'(s) = u(s) \quad\text{for all } s \in [a, b].$$
Proof. Suppose that $h > 0$ and $s < b$. By (33),
$$\|h^{-1}(v(s + h) - v(s)) - u(s)\| = \Big\|h^{-1}\int_s^{s+h} (u(t) - u(s))\,dt\Big\| \le h^{-1}\int_s^{s+h} \|u(t) - u(s)\|\,dt \le \sup_{s \le t \le s+h} \|u(t) - u(s)\| \to 0 \quad\text{as } h \to 0.$$
The proof for $h < 0$ proceeds similarly. □
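Corollary 5. If the function $u\colon [a, b] \to X$ is continuously differentiable, then
$$\int_a^s u'(t)\,dt = u(s) - u(a) \quad\text{for all } s \in [a, b].$$
Proof. Set $v(s) := \int_a^s u'(t)\,dt - u(s) + u(a)$. By Theorem 4.B,
$$v'(s) = 0 \ \text{on } [a, b] \quad\text{and}\quad v(a) = 0.$$
Thus, for all $u^* \in X^*$,
$$\frac{d}{ds}\langle u^*, v(s)\rangle = \langle u^*, v'(s)\rangle = 0 \quad\text{on } [a, b]$$
and $\langle u^*, v(a)\rangle = 0$. By classic calculus, this implies
$$\langle u^*, v(s)\rangle = 0 \quad\text{on } [a, b] \text{ for all } u^* \in X^*.$$
Hence $v(s) = 0$ on $[a, b]$. □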
Example 6. Let $-\infty < a < b < \infty$, and let $X$ be a Banach space over $\mathbb{K}$. For $k = 0, 1, \ldots$, let
$$C^k([a, b], X)$$
denote the set of all continuous functions $u\colon [a, b] \to X$ that have continuous derivatives up to order $k$. Then $C^k([a, b], X)$ becomes a Banach space over $\mathbb{K}$ equipped with the norm
$$\|u\|_k := \sum_{j=0}^{k} \max_{a \le t \le b} \|u^{(j)}(t)\|.$$
Instead of $C^0([a, b], X)$, we simply write $C([a, b], X)$.
⁶ The derivatives $v'(a)$ and $v'(b)$ are to be understood as one-sided derivatives.
Proof. Obviously, $\|\cdot\|_k$ is a norm.
Step 1: First let $k = 0$. Then we can use the same proof as in Standard Example 6 from Section 1.3 in AMS Vol. 108.
Step 2: Let $k = 1$. Suppose that $(u_n)$ is a Cauchy sequence with respect to the norm $\|\cdot\|_1$, that is, for each $\varepsilon > 0$, there is an $n_0(\varepsilon)$ such that
$$\|u_n - u_m\|_1 = \max_{a \le t \le b} \|u_n(t) - u_m(t)\| + \max_{a \le t \le b} \|u_n'(t) - u_m'(t)\| < \varepsilon$$
for all $n, m \ge n_0(\varepsilon)$. By Step 1, $C([a, b], X)$ is a Banach space. Thus, there exist functions $u, v \in C([a, b], X)$ such that
$$\|u_n - u\|_0 \to 0 \quad\text{and}\quad \|u_n' - v\|_0 \to 0 \quad\text{as } n \to \infty.$$
Let $s \in [a, b]$. Then
$$\Big\|\int_a^s (u_n'(t) - v(t))\,dt\Big\| \le (b - a)\|u_n' - v\|_0 \to 0 \quad\text{as } n \to \infty.$$
Since $u_n(s) - u_n(a) = \int_a^s u_n'(t)\,dt$ by Corollary 5, letting $n \to \infty$ yields
$$u(s) - u(a) = \int_a^s v(t)\,dt.$$
By Theorem 4.B, $u'(s) = v(s)$. This implies $u \in C^1([a, b], X)$ and the convergence $\|u_n - u\|_1 \to 0$ as $n \to \infty$.
For $k \ge 2$, proceed by induction. □
4.5 Applications to the Taylor Theorem
In classic analysis, the fundamental Taylor formula describes the approximation of functions by polynomials. The generalized Taylor formula on Banach spaces reads as follows:
$$f(u + h) = f(u) + \sum_{k=1}^{n-1} \frac{1}{k!} f^{(k)}(u)h^k + R_n, \tag{35}$$
where we set $f^{(k)}(u)h^k := f^{(k)}(u)h \cdots h$, and the remainder has the form
$$R_n := \int_0^1 \frac{(1 - \tau)^{n-1}}{(n - 1)!} f^{(n)}(u + \tau h)h^n\,d\tau, \tag{36}$$
where $n = 1, 2, \ldots$. For $n = 1$, the sum will be zero in (35), by definition.
Theorem 4.C. Let the map $f\colon U \subseteq X \to Y$ be $C^n$ on the open convex set $U$, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then the Taylor formula (35), (36) holds true for all $u \in U$ and all $h \in X$ with $u + h \in U$.
In particular, since $\int_0^1 (1 - \tau)^{n-1}\,d\tau = \frac{1}{n}$, it follows from (36) that
$$\|R_n\| \le \frac{1}{n!} \sup_{0 \le \tau \le 1} \|f^{(n)}(u + \tau h)h^n\|. \tag{36*}$$
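Proof. For given $v^* \in Y^*$, set
$$\phi(t) := \langle v^*, f(u + th)\rangle, \qquad 0 \le t \le 1.$$
By (9), for $k = 1, \ldots, n$,
$$\phi^{(k)}(t) = \langle v^*, f^{(k)}(u + th)h^k\rangle, \qquad 0 \le t \le 1.$$
The classic Taylor theorem for real functions tells us that
$$\phi(1) - \phi(0) - \sum_{k=1}^{n-1} \frac{1}{k!}\phi^{(k)}(0) - \int_0^1 \frac{(1 - \tau)^{n-1}}{(n - 1)!}\phi^{(n)}(\tau)\,d\tau = 0.$$
Using (34), this means
$$\Big\langle v^*, f(u + h) - f(u) - \sum_{k=1}^{n-1} \frac{1}{k!} f^{(k)}(u)h^k - R_n \Big\rangle = 0 \quad\text{for all } v^* \in Y^*.$$
Hence we obtain (35). □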
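The Taylor formula (35) with the integral remainder (36) and the estimate (36*) can be illustrated in the simplest case X = Y = R. The following Python sketch is an assumption-laden example, not part of the text: it takes f = exp (so that every derivative equals exp), the hypothetical values u = 0.3, h = 0.5, n = 3, and evaluates the remainder integral by a composite Simpson rule.

```python
import math

def simpson(g, lo, hi, m=200):
    # composite Simpson rule with an even number m of subintervals
    step = (hi - lo) / m
    s = g(lo) + g(hi)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(lo + i * step)
    return s * step / 3

def taylor_with_remainder(u, h, n):
    # for f = exp every derivative f^(k) equals exp
    poly = sum(math.exp(u) * h**k / math.factorial(k) for k in range(1, n))
    remainder = simpson(
        lambda tau: (1 - tau)**(n - 1) / math.factorial(n - 1)
                    * math.exp(u + tau * h) * h**n,
        0.0, 1.0)
    return poly, remainder

u, h, n = 0.3, 0.5, 3
poly, remainder = taylor_with_remainder(u, h, n)
lhs = math.exp(u + h)                                  # f(u + h)
rhs = math.exp(u) + poly + remainder                   # right-hand side of (35)
bound = math.exp(u + h) * h**n / math.factorial(n)     # right-hand side of (36*) for h > 0
print(lhs, rhs, remainder, bound)
```

Up to quadrature error, both sides of (35) agree, and the computed remainder stays below the bound from (36*).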
4.6 Iterated Derivatives
Proposition 1. Let $f\colon U(u) \subseteq X \to Y$ be defined on an open neighborhood of the point $u$, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$.
(i) The second F-derivative $f''(u)$ exists iff the iterated derivative $(f')'(u)$ exists. In this case, we get
$$f''(u)hk = (f')'(u)(h)(k) \quad\text{for all } h, k \in X. \tag{37}$$
(ii) $f''$ is continuous at the point $u$ iff $(f')'$ is continuous at $u$.
Proof. Ad (i). Step 1: Suppose that $(f')'(u)$ exists. We want to prove that $f''(u) \equiv d^2 f(u)$ exists along with (37).
The existence of $(f')'(u)$ means the following. The operator
$$f'(v)\colon X \to Y$$
is linear and bounded for all $v \in V$, where $V$ is some open neighborhood of the point $u$. This way, we get the operator
$$f'\colon V \subseteq X \to L(X, Y).$$
Then the operator
$$(f')'(u)\colon X \to L(X, Y)$$
is linear and bounded. Hence
$$(f')'(u)(h) \in L(X, Y) \quad\text{for all } h \in X,$$
which implies
$$(f')'(u)(h)(k) \in Y \quad\text{for all } k \in X.$$
To simplify notation, set $g(v) := f'(v)$. For all $h, k \in X$,
$$\|g'(u)(h)(k)\| \le \|g'(u)h\|\,\|k\| \le \|g'(u)\|\,\|h\|\,\|k\|,$$
meaning that the map $(h, k) \mapsto g'(u)(h)(k)$ is bilinear and bounded from $X \times X$ to $Y$. It remains to prove (37). By definition of $g'(u)$,
$$g(u + h) - g(u) = g'(u)h + R(h),$$
where $R(h) \in L(X, Y)$ and $R(h) = o(\|h\|)$ as $h \to 0$. This implies
$$f'(u + h)k - f'(u)k = g'(u)(h)k + r,$$
where $r(h, k) := R(h)k$. Hence $\|r(h, k)\| \le \|R(h)\|\,\|k\|$, that is,
$$\sup_{\|k\| \le 1} \|r(h, k)\| = o(\|h\|), \qquad h \to 0.$$
This yields (37).
Step 2: Suppose that $f''(u)$ exists. We have to show that $(f')'(u)$ exists along with (37). This follows similarly to Step 1.
Ad (ii). The statement follows immediately from the formula
$$\|(f')'(u + h) - (f')'(u)\| = \|f''(u + h) - f''(u)\|, \tag{38}$$
which follows from (37) along with
$$\|(f')'(u + h) - (f')'(u)\| = \sup_{\|h_1\| \le 1} \|g'(u + h)h_1 - g'(u)h_1\| = \sup_{\|h_1\| \le 1,\ \|h_2\| \le 1} \|g'(u + h)(h_1)(h_2) - g'(u)(h_1)(h_2)\|. \qquad \Box$$
Similarly, we get the following more general result. Let us write
$$Df(u) := f'(u), \quad D^2 f(u) := D(Df)(u), \quad\text{and so forth.}$$
Corollary 2. Let the map $f\colon U \subseteq X \to Y$ be given as in Proposition 1, and let $n = 2, 3, \ldots$.
(i) The $n$th F-derivative $f^{(n)}(u)$ exists iff the $n$th iterated derivative $D^n f(u)$ exists. In this case, we have
$$D^n f(u)(h_1) \cdots (h_n) = f^{(n)}(u)h_1 \cdots h_n \quad\text{for all } h_j \in X,\ j = 1, \ldots, n. \tag{39}$$
(ii) If $f^{(n)}(v)$ and $f^{(n)}(w)$ exist, then
$$\|f^{(n)}(v) - f^{(n)}(w)\| = \|D^n f(v) - D^n f(w)\|.$$
(iii) $f^{(n)}$ is continuous at the point $u$ iff $D^n f$ is continuous at $u$.
Remark 3 (Different approaches to the F-derivative). Let dom $f$ denote the domain of definition of $f$. Note that the iterated derivatives
$$f\colon \operatorname{dom} f \subseteq X \to Y,$$
$$f'\colon \operatorname{dom} f' \subseteq X \to L(X, Y),$$
$$(f')'\colon \operatorname{dom} (f')' \subseteq X \to L(X, L(X, Y)), \quad\text{and so forth,}$$
correspond to image spaces that acquire a more and more complicated structure. Our definition of higher derivatives in Section 4.2 avoids this.
In mathematical literature, higher derivatives are frequently defined as iterated derivatives. Corollary 2 tells us that there exists a natural relation between the two different approaches, which allows us to identify $f^{(n)}$ with $D^n f$. Roughly speaking, one observes the following:
(a) Our definition of $f^{(n)}$ in Section 4.2 is convenient with a view to the computation of derivatives in concrete situations (e.g., see Problems 4.1 through 4.7).
(b) The use of iterated derivatives $D^n f$ simplifies theoretical investigations (e.g., see the chain rule in the next section).
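Let $f = (f_1, \ldots, f_N)$ and $N, n = 1, 2, \ldots$. Then the following two formulas are frequently used in connection with the chain rule:
$$D^n f(u)(h_1) \cdots (h_n) = \big(D^n f_1(u)(h_1) \cdots (h_n), \ldots, D^n f_N(u)(h_1) \cdots (h_n)\big) \tag{40}$$
and
$$f^{(n)}(u)h_1 \cdots h_n = \big(f_1^{(n)}(u)h_1 \cdots h_n, \ldots, f_N^{(n)}(u)h_1 \cdots h_n\big) \tag{41}$$
for all $h_j \in X$, $j = 1, \ldots, n$.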
Proposition 4. Let $X, X_1, \ldots, X_N$ be Banach spaces over $\mathbb{K}$, and let the operator
$$f\colon U \subseteq X \to X_1 \times \cdots \times X_N$$
be given on the open neighborhood $U$ of $u$. Then the following are true:
(i) The iterated F-derivative $D^n f(u)$ exists iff $D^n f_j(u)$ exists for all $j = 1, \ldots, N$. Here formula (40) holds true.
(ii) The $n$th F-derivative $f^{(n)}(u)$ exists iff $f_j^{(n)}(u)$ exists for all $j = 1, \ldots, N$. Here formula (41) holds true.
(iii) $f$ is $C^n$ on $U$ iff $f_1, \ldots, f_N$ are $C^n$ on $U$.
Proof. This follows immediately from the definition of the norm on $X_1 \times \cdots \times X_N$ and from the corresponding definitions of $D^n f(u)$ and $f^{(n)}(u)$. □
4.7 The Chain Rule
The chain rule represents the most important rule of differential calculus. The key formula reads as follows:
$$(g \circ f)'(u) = g'(f(u)) \circ f'(u). \tag{42}$$
Recall that $(g \circ f)(u) := g(f(u))$. Formula (42) tells us that the operations of linearization and composition can be interchanged. Since $g'(f(u))$ and $f'(u)$ are linear operators, formula (42) can also be written as
$$(g \circ f)'(u) = g'(f(u))f'(u). \tag{42*}$$
This shorter notation is convenient in order to avoid clumsy formulas for higher derivatives, as we will explain ahead.
Theorem 4.D (The chain rule). Let $X$, $Y$, and $Z$ be Banach spaces over $\mathbb{K}$, and let the two mappings
$$f\colon U(u) \subseteq X \to Y \quad\text{and}\quad g\colon V(f(u)) \subseteq Y \to Z$$
be given, where $U$ and $V$ are open neighborhoods of the points $u$ and $f(u)$, respectively. Let $m = 1, 2, \ldots$ be fixed.
If the F-derivatives $f^{(m)}(u)$ and $g^{(m)}(f(u))$ exist, then the F-derivative $(g \circ f)^{(m)}(u)$ exists and formula (42) holds true for $m = 1$.
Corollary 1. For fixed $m = 1, 2, \ldots$, suppose that the mappings
$$f\colon U \subseteq X \to Y \quad\text{and}\quad g\colon V \subseteq Y \to Z$$
are $C^m$, where $U$ and $V$ are open sets and $f(U) \subseteq V$.
Then the composite map $g \circ f$ is also $C^m$ on $U$.
Proof of Theorem 4.D. Step 1: Let $m = 1$. Set $v := f(u)$. By hypothesis,
$$f(u + h) - f(u) = f'(u)h + \|h\|\,a(h),$$
$$g(v + k) - g(v) = g'(v)k + \|k\|\,b(k),$$
where $a(h) \to 0$ as $h \to 0$ and $b(k) \to 0$ as $k \to 0$. Choose $k := f(u + h) - f(u)$. Then
$$g(f(u + h)) - g(f(u)) = g'(v)f'(u)h + r(h), \tag{43}$$
where
$$r(h) := g'(v)\|h\|\,a(h) + \|k\|\,b(k) = o(\|h\|), \qquad h \to 0,$$
since $\frac{\|r(h)\|}{\|h\|} \le \|g'(v)\|\,\|a(h)\| + \cdots \to 0$ as $h \to 0$. From (43) we get (42).
Step 2: In order to proceed by induction, we assume that the statement has been proved for $m = n$.
Suppose that $f^{(n+1)}(u)$ and $g^{(n+1)}(v)$ exist. By Section 4.6, this is equivalent to the existence of the iterated derivatives $D^{n+1} f(u)$ and $D^{n+1} g(v)$. By (42),
$$D(g \circ f)(u) = Dg(v)Df(u). \tag{44}$$
Define the bilinear bounded map $B\colon L(Y, Z) \times L(X, Y) \to L(X, Z)$ by
$$B(R, S) := RS \quad\text{for all } R \in L(Y, Z),\ S \in L(X, Y).$$
In this connection, note that $\|RS\| \le \|R\|\,\|S\|$. Then equation (42) can be written in the form
$$D(g \circ f)(u) = B(Dg(v), Df(u)). \tag{45}$$
Example 5 in Section 4.2 tells us that $B$ is $C^\infty$. By assumption, there exist $D^n(Dg)(v)$ and $D^n(Df)(u)$. Applying the chain rule for $m = n$ to (45) along with (40), we obtain the existence of $D^n D(g \circ f)(u)$. By Section 4.6, this implies the existence of $(g \circ f)^{(n+1)}(u)$. □
Proof of Corollary 1. Observe that the continuity of $Dg$ and $Df$ implies the continuity of $D(g \circ f)$, by means of (45) along with the continuity of $B$. Now use the same induction argument as in the proof of Theorem 4.D. □
Let us compute the F-derivative $(g \circ f)^{(n)}$ explicitly. Since Theorem 4.D ensures the existence of the derivatives, we can use the convenient formula (9) along with the special product rule (24). For example, let $n = 2$. Differentiating
$$(g \circ f)'(u + tk)h = g'(f(u + tk))f'(u + tk)h$$
at the point $t = 0$, we obtain
$$(g \circ f)''(u)kh = g''(v)f'(u)k\,f'(u)h + g'(v)f''(u)kh \tag{46}$$
for all $h, k \in X$, where $v := f(u)$. In this connection, note that we may write
$$g'(f(u + tk))f'(u + tk)h = C(g'(f(u + tk)), f'(u + tk)h),$$
where the bilinear bounded operator $C\colon L(Y, Z) \times Y \to Z$ is defined by
$$C(R, y) := Ry \quad\text{for all } R \in L(Y, Z),\ y \in Y.$$
In fact, this bilinear operator is bounded, since $\|Ry\| \le \|R\|\,\|y\|$.
Analogously, we get $(g \circ f)^{(n)}(u)$ for $n = 3, 4, \ldots$.
Remark 2 (Convenient notation). Recall that $f'(u)\colon X \to Y$ and $g'(v)\colon Y \to Z$ are linear bounded operators, whereas $f''(u)\colon X \times X \to Y$ and $g''(v)\colon Y \times Y \to Z$ are bilinear bounded operators. Recall also our convention $Mvw := M(v, w)$ for bilinear operators. Without this convention, formula (46) reads as follows:
$$(g \circ f)''(u)(k, h) = g''(v)(f'(u)k, f'(u)h) + g'(v)[f''(u)(k, h)].$$
In contrast to this, our notation in (46) avoids redundant symbols.
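The second-order chain rule (46) can be tested in finite dimensions. The Python sketch below is only an illustration: the maps f: R^2 -> R^2 and g: R^2 -> R, the point u, the directions k and h, and all function names are assumptions chosen for this example. The derivatives are coded by hand, and the right-hand side of (46) is compared with a mixed central difference of (s, t) -> g(f(u + sk + th)) at s = t = 0.

```python
def f(u):
    x, y = u
    return (x * y, x + y * y)

def df(u, h):                      # f'(u)h
    x, y = u
    return (y * h[0] + x * h[1], h[0] + 2 * y * h[1])

def d2f(u, k, h):                  # f''(u)kh (constant in u for this f)
    return (k[0] * h[1] + k[1] * h[0], 2 * k[1] * h[1])

def g(v):
    p, q = v
    return p * p * q

def dg(v, w):                      # g'(v)w
    p, q = v
    return 2 * p * q * w[0] + p * p * w[1]

def d2g(v, w, z):                  # g''(v)wz
    p, q = v
    return 2 * q * w[0] * z[0] + 2 * p * (w[0] * z[1] + w[1] * z[0])

def chain_rule_second(u, k, h):
    # right-hand side of (46): g''(v) f'(u)k f'(u)h + g'(v) f''(u)kh with v = f(u)
    v = f(u)
    return d2g(v, df(u, k), df(u, h)) + dg(v, d2f(u, k, h))

def finite_difference_second(u, k, h, e=1e-4):
    # mixed central difference of phi(s, t) = g(f(u + s k + t h)) at (0, 0)
    def phi(s, t):
        return g(f((u[0] + s * k[0] + t * h[0], u[1] + s * k[1] + t * h[1])))
    return (phi(e, e) - phi(e, -e) - phi(-e, e) + phi(-e, -e)) / (4 * e * e)

u, k, h = (0.7, -0.4), (1.0, 0.5), (-0.3, 2.0)
print(chain_rule_second(u, k, h), finite_difference_second(u, k, h))
```

The two printed values should agree up to the accuracy of the difference quotient, and exchanging k and h leaves the result unchanged, as expected for a symmetric bilinear second derivative.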
Standard Example 3 (The product rule). Set
$$H(u) := B(f(u), g(u)).$$
We want to justify the formula
$$H'(u)h = B(f'(u)h, g(u)) + B(f(u), g'(u)h) \quad\text{for all } h \in X. \tag{47}$$
Let $X$, $Y$, $Z$, and $W$ be Banach spaces over $\mathbb{K}$. Suppose that $B\colon Y \times Z \to W$ is a bilinear bounded operator, and suppose that the operators
$$f\colon U \subseteq X \to Y \quad\text{and}\quad g\colon U \subseteq X \to Z$$
are defined on the open neighborhood $U$ of $u$ and that the $n$th F-derivatives $f^{(n)}(u)$ and $g^{(n)}(u)$ exist for fixed $n = 1, 2, \ldots$. Then the following are true:
(i) The $n$th F-derivative $H^{(n)}(u)$ exists.
(ii) Formula (47) holds true.
(iii) If $f$ and $g$ are $C^n$ on $U$, then so is $H$.
Proof. Ad (i), (iii). By Example 5 in Section 4.2, $B$ is $C^\infty$. Now use the chain rule along with Proposition 4 in Section 4.6 on the differentiability of the map $u \mapsto (f(u), g(u))$.
Ad (ii). By (9) and the special product rule (24), differentiation of
$$H(u + th) = B(f(u + th), g(u + th))$$
at the point $t = 0$ yields (47). □
The explicit expression for higher derivatives can also be obtained by using (9) and the special product rule (24). For example, differentiation of
$$H'(u + tk)h = B(f'(u + tk)h, g(u + tk)) + B(f(u + tk), g'(u + tk)h)$$
at the point $t = 0$ yields
$$H''(u)kh = B(f''(u)kh, g(u)) + B(f'(u)h, g'(u)k) + B(f'(u)k, g'(u)h) + B(f(u), g''(u)kh).$$
4.8 The Implicit Function Theorem
We want to solve the operator equation
$$F(u, v) = 0 \tag{48}$$
in a neighborhood of the point $(u_0, v_0)$, where we assume that
$$F(u_0, v_0) = 0. \tag{49}$$
In particular, we are interested in a locally unique solution (cf. Figure 4.3). Condition (50) is decisive. Set $U := \{u \in X\colon \|u - u_0\| < \rho\}$.
Theorem 4.E. Let $X$, $Y$, and $Z$ be Banach spaces over $\mathbb{K}$, and let
$$F\colon U(u_0, v_0) \subseteq X \times Y \to Z$$
be a $C^n$-map on an open neighborhood of the point $(u_0, v_0)$ such that (49) holds and $1 \le n \le \infty$. Suppose that the operator
$$F_v(u_0, v_0)\colon Y \to Z \ \text{is bijective.} \tag{50}$$
Then the following statements hold true:
(i) There exist numbers $r > 0$ and $\rho > 0$ such that, for each given $u \in U$, the original equation (48) has a unique solution $v \in Y$ with $\|v - v_0\| \le r$. Denote this solution by $v(u)$.
FIGURE 4.3.
(ii) The function $u \mapsto v(u)$ is $C^n$ on $U$. In particular,
$$v'(u) = -F_v(u, v(u))^{-1}F_u(u, v(u)) \quad\text{for all } u \in U. \tag{51}$$
Since this theorem ensures the existence of $v^{(m)}(u)$ for $m = 1, 2, \ldots, n$, these derivatives can be computed by means of (9) in the following way. For example, let $m = 2$. By (51),
$$F_u(u + tk, v(u + tk))h + F_v(u + tk, v(u + tk))v'(u + tk)h = 0.$$
Differentiating this at the point $t = 0$ by using the chain rule and the product rule, we get
$$F_{uu}(u, v(u))kh + F_{vu}(u, v(u))v'(u)kh + F_{uv}(u, v(u))kv'(u)h + F_{vv}(u, v(u))v'(u)kv'(u)h + F_v(u, v(u))v''(u)kh = 0$$
for all $h, k \in X$. Applying the inverse operator $F_v(u, v(u))^{-1}$ to this, we obtain $v''(u)kh$. The same argument can be used for computing $v^{(m)}(u)$ with $m \ge 3$.
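Formula (51) and the recipe for v''(u) can be illustrated in the scalar case X = Y = Z = R. The Python sketch below is an example under assumptions that are not taken from the text: F(u, v) = u^2 + v^2 - 1 near (u_0, v_0) = (0, 1), a Newton iteration in v as the solver, and hypothetical function names. For this F the mixed derivative vanishes, F_uu = F_vv = 2, and the exact solution is v(u) = sqrt(1 - u^2).

```python
import math

def F(u, v):
    return u * u + v * v - 1.0

def F_u(u, v):
    return 2.0 * u

def F_v(u, v):
    return 2.0 * v

def solve_v(u, v_start=1.0, steps=50):
    # Newton iteration in v for fixed u, started at v_0 = 1
    v = v_start
    for _ in range(steps):
        v -= F(u, v) / F_v(u, v)
    return v

def v_prime(u):
    v = solve_v(u)
    return -F_u(u, v) / F_v(u, v)          # formula (51)

def v_second(u):
    v, dv = solve_v(u), v_prime(u)
    # scalar case of the second-order identity with F_uu = F_vv = 2, F_uv = 0:
    # 2 + 2 v'^2 + F_v v'' = 0
    return -(2.0 + 2.0 * dv * dv) / F_v(u, v)

u = 0.3
eps = 1e-6
fd = (solve_v(u + eps) - solve_v(u - eps)) / (2 * eps)   # difference quotient of v
print(solve_v(u), v_prime(u), fd, v_second(u))
```

In this example the value from (51) matches the difference quotient of the computed solution, and v''(u) agrees with the second derivative of sqrt(1 - u^2).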
Proof. To simplify notation, assume that $u_0 = 0$ and $v_0 = 0$. Assume also that $X \ne \{0\}$, $Y \ne \{0\}$, and $Z \ne \{0\}$. Otherwise, the statements are trivial. Set
$$f(u, v) := F_v(0, 0)v - F(u, v)$$
and
$$A_u v := F_v(0, 0)^{-1} f(u, v). \tag{52}$$
To avoid confusion, note that $u$ is an index in "$A_u$" but not a partial derivative. Then, the original problem (48) is equivalent to the fixed-point problem $F_v(0, 0)v = f(u, v)$, that is,
$$A_u v = v, \qquad v \in Y. \tag{53}$$
Step 1: We apply the Banach fixed-point theorem to (53).
By the continuous inverse mapping theorem from Section 3.5, the inverse operator $F_v(0, 0)^{-1}\colon Z \to Y$ is linear and continuous. Furthermore, it follows from Proposition 7 in Section 1.23 of AMS Vol. 108 that the operator
$$F_v(u, v)^{-1}\colon Z \to Y \tag{54}$$
is linear and continuous for all $(u, v)$ in a sufficiently small neighborhood $W$ of $(0, 0)$ in $X \times Y$ and $\sup_{(u,v) \in W} \|F_v(u, v)^{-1}\| < \infty$.
Let $\|u\|, \|v\|, \|w\| \le r$. Observe
$$f_v(u, v) = F_v(0, 0) - F_v(u, v).$$
Since $f_v$ is continuous at $(0, 0)$ and $f_v(0, 0) = 0$, Taylor's theorem in Section 4.5 implies that⁷
$$\|f(u, v) - f(u, w)\| \le \sup_{0 \le \tau \le 1} \|f_v(u, v + \tau(w - v))\|\,\|v - w\| = o(1)\|v - w\|, \qquad r \to 0.$$
Since $f(0, 0) = 0$ and $f$ is continuous at $(0, 0)$, we get
$$\|f(u, v)\| \le \|f(u, v) - f(u, 0)\| + \|f(u, 0)\| \le o(1)\|v\| + \|f(u, 0)\|, \qquad r \to 0,$$
where $\|f(u, 0)\| \to 0$ as $\|u\| \to 0$. Finally, note that
$$\|A_u v\| \le \|F_v(0, 0)^{-1}\|\,\|f(u, v)\|,$$
$$\|A_u v - A_u w\| \le o(1)\,\|F_v(0, 0)^{-1}\|\,\|v - w\|, \qquad r \to 0.$$
Let $M := \{v \in Y\colon \|v\| \le r\}$ and $U := \{u \in X\colon \|u\| < \rho\}$. Then, for sufficiently small $r > 0$ and $\rho > 0$, we obtain
(a) $\|A_u v\| \le r$ and
(b) $\|A_u v - A_u w\| \le \frac{1}{2}\|v - w\|$
for each given $u \in U$ and all $v, w \in M$.
By the Banach fixed-point theorem from Section 1.6 of AMS Vol. 108, the operator $A_u\colon M \to M$ has a unique fixed point $v(u)$, that is, for each $u \in U$, equation (53) has a unique solution $v(u) \in M$.
This is statement (i) of Theorem 4.E.
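The fixed-point construction of Step 1 can be run directly in a scalar example. The Python sketch below is an illustration only; the function F(u, v) = 2v + v^2 - u, the value u = 0.1, and the names used are assumptions made here. For this F one has F(0, 0) = 0 and F_v(0, 0) = 2, so that A_u v = (u - v^2)/2, and on M = {|v| <= 1/2} the map A_u satisfies (a) and (b) whenever |u| <= 1/2. The exact solution of F(u, v) = 0 near 0 is v = sqrt(1 + u) - 1.

```python
import math

def F(u, v):
    return 2.0 * v + v * v - u

F_V_AT_ORIGIN = 2.0                      # F_v(0, 0) for this F

def A(u, v):
    # A_u v = F_v(0,0)^{-1} (F_v(0,0) v - F(u, v)), cf. (52)
    return (F_V_AT_ORIGIN * v - F(u, v)) / F_V_AT_ORIGIN

def fixed_point(u, iterations=60):
    # successive approximation v_{n+1} = A_u v_n, started at v_0 = 0
    v = 0.0
    for _ in range(iterations):
        v = A(u, v)
    return v

u = 0.1
v = fixed_point(u)
exact = math.sqrt(1.0 + u) - 1.0
print(v, exact, F(u, v))
```

The iterates converge quickly here because the contraction constant near the fixed point is much smaller than 1/2; in general only the bound in (b) is guaranteed.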
Step 2: We choose the numbers $r > 0$ and $\rho > 0$ to be so small that
for all $u \in U$, $v \in M$, (55)
and
$$(u, v) \mapsto F_v(u, v)^{-1} \ \text{is continuous from } U \times M \text{ to } L(Z, Y). \tag{56}$$
This is possible by (54). Furthermore, we also may assume that $F$ is $C^n$ on some open subset $W$ of $X \times Y$ with $U \times M \subseteq W$.
⁷ Recall that $a(r) = o(1)$ as $r \to 0$ means that $a(r) \to 0$ as $r \to 0$.
Step 3: We show that $u \mapsto v(u)$ is continuous on $U$. Let $u, z \in U$. Then
$$\|v(u) - v(z)\| = \|A_u v(u) - A_z v(z)\| \le \|A_u v(u) - A_u v(z)\| + \|A_u v(z) - A_z v(z)\| \le \tfrac{1}{2}\|v(u) - v(z)\| + R(u),$$
where $R(u) \to 0$ as $u \to z$, by (52). Hence $\|v(u) - v(z)\| \le 2R(u) \to 0$ as $u \to z$.
Step 4: We show that the F-derivative $v'(u)$ exists for each $u \in U$. Let $h \in X$ and set
$$k := v(u + h) - v(u)$$
along with $v := v(u)$, where $\|h\|$ is sufficiently small. Since $F$ is $C^1$ on $W$, it follows from
$$0 = F(u + h, v + k) - F(u, v)$$
that
$$0 = F_u(u, v)h + F_v(u, v)k + o(\|h\| + \|k\|), \qquad \|h\| + \|k\| \to 0.$$
Hence
$$k = -F_v(u, v)^{-1}F_u(u, v)h + o(\|h\| + \|k\|), \qquad \|h\| + \|k\| \to 0. \tag{57}$$
Since $k \to 0$ as $h \to 0$, this implies
$$\|k\| \le \text{const}\,\|h\| + 2^{-1}\|k\|$$
if $\|h\|$ is sufficiently small. Therefore, $\|k\| \le \text{const}\,\|h\|$, and hence
$$v(u + h) - v(u) = -F_v(u, v)^{-1}F_u(u, v)h + o(\|h\|), \qquad h \to 0.$$
Thus, the F-derivative $v'(u)$ exists, where
$$v'(u) = -F_v(u, v(u))^{-1}F_u(u, v(u)) \quad\text{for all } u \in U. \tag{58}$$
By (56) and Step 3, $u \mapsto v'(u)$ is continuous on $U$.
Step 5: Suppose that $F$ is $C^2$ on $W$. Then, applying the chain rule to (58), it follows that the second F-derivative $u \mapsto v''(u)$ is continuous on $U$. In this connection, note the following:
(a) Since $u \mapsto v(u)$ is $C^1$ on $U$, the map
$$u \mapsto (u, v(u))$$
is also $C^1$ from $U$ to $X \times Y$, by Proposition 4 in Section 4.6.
(b) The map
$$A \mapsto A^{-1}$$
is $C^\infty$ from $L_{\mathrm{inv}}(Y, Z)$ to $L(Z, Y)$, by Standard Example 9 in Section 4.3. Moreover, observe that $F_v(u, v(u)) \in L_{\mathrm{inv}}(Y, Z)$ for all $u \in U$, by (56).
(c) The bilinear bounded map
$$(R, S) \mapsto -RS$$
is $C^\infty$ from $L(Z, Y) \times L(X, Z)$ to $L(X, Y)$, by Example 5 in Section 4.2. Moreover, observe that
$$F_v(u, v(u))^{-1} \in L(Z, Y) \quad\text{and}\quad F_u(u, v(u)) \in L(X, Z).$$
Step 6: Suppose that $F$ is $C^n$ on $W$, $n \ge 3$. Using the same argument as in Step 5, it follows that $v = v(u)$ is $C^n$ on $U$, by induction. □
Applications of the implicit function theorem to integral equations can
be found in Section 5.13.
4.9 Applications to Differential Equations
Let us study the following initial-value problem:
$$\begin{aligned} x'(t) &= f(t, x(t), p) \quad\text{for all } t \in\, ]t_0 - a, t_0 + a[,\\ x(t_0) &= y. \end{aligned} \tag{59}$$
For given $t_0 \in \mathbb{R}$ and $y \in X$, we are looking for a solution $x(\cdot)$ with $x(t) \in X$ for all $t \in\, ]t_0 - a, t_0 + a[$, where $X$ is a Banach space over $\mathbb{K}$. Here $p$ denotes a parameter living in the Banach space $P$ over $\mathbb{K}$ (e.g., $P = \mathbb{R}$).
In the special case where $X := \mathbb{R}^N$, $N \ge 1$, problem (59) corresponds to a system
$$x_j'(t) = f_j(t, x(t), p), \qquad j = 1, \ldots, N,$$
of $N$ real differential equations.
The following result is fundamental to the theory of ordinary differential equations.
Proposition 1. Suppose that the mapping
$$f\colon U \subseteq \mathbb{R} \times X \times P \to X$$
is $C^n$ for fixed $n$ ($1 \le n < \infty$), where $U$ is an open set containing the given point $(t_0, x_0, p_0)$. Then the following are true:
(i) There exist a number $a > 0$ and an open neighborhood $V(x_0, p_0)$ of $(x_0, p_0)$ in $X \times P$ such that the original initial-value problem (59) has a unique solution $x = x(t; y, p)$ for each $(y, p) \in V(x_0, p_0)$.
(ii) The solution depends smoothly on the initial value $y$ and the parameter $p$, that is, the mapping
$$(t, y, p) \mapsto x(t; y, p)$$
is $C^n$ from $]t_0 - a, t_0 + a[\, \times V(x_0, p_0)$ to $X$.
$$\begin{aligned} x'(t) &= f(t, x(t), p(t)),\\ p'(t) &= 0 \quad\text{on } ]t_0 - a, t_0 + a[,\\ x(t_0) &= y, \quad p(t_0) = p, \end{aligned} \tag{59*}$$
where the values of the unknown function $t \mapsto (x(t), p(t))$ live in the product space $X \times P$. Observe that (59*) does not contain any additional parameters. Thus, it is sufficient to prove Proposition 1 in the case where $f$ is independent of the parameter $p$.
Our proof will be based on the implicit function theorem. The idea is the following:
($\alpha$) Existence. We apply the implicit function theorem to the rescaled problem (60).
($\beta$) Smoothness. We differentiate the original problem (59). This way, we obtain the new problem (70), which tells us that the solution of (59) is $C^n$.
Proof. To simplify the notation only, we suppose that $U = \mathbb{R} \times X$.
Ad (i). Step 1: Rescaling. Set $J := [-1, 1]$. The decisive trick of the proof is the following rescaling:
$$t = t_0 + sa, \qquad z(s; \tau, y) := x(\tau + as; y) - y \quad\text{for all } s \in J.$$
Then the original problem (59) corresponds to the new problem:
$$\begin{aligned} z'(s) - af(\tau + as, z(s) + y) &= 0 \quad\text{for all } s \in J,\\ z(0) &= 0 \end{aligned} \tag{60}$$
with fixed $\tau = t_0$. We write this as an operator equation⁸
$$F(z, a, \tau, y) = 0 \tag{61}$$
with the operator $F\colon Z \times \mathbb{R} \times \mathbb{R} \times X \to W$ and the real Banach spaces
$$Z := \{z \in C^1(J, X)\colon z(0) = 0\}, \qquad W := C(J, X).$$
The norm on $Z$ and $W$ is given by
$$\|z\|_Z := \max_{s \in J} \|z(s)\|_X + \max_{s \in J} \|z'(s)\|_X$$
and
$$\|z\|_W := \max_{s \in J} \|z(s)\|_X.$$
⁸ In the following, let $\tau \in \mathbb{R}$ be a free parameter. This is important in order to prove continuity properties of the solution $x = x(t)$.
Step 2: The implicit function theorem. Set $Q := (0, 0, t_0, x_0)$, namely, $z = 0$, $a = 0$, $\tau = t_0$, $y = x_0$. Obviously, $F(Q) = 0$, and the linearization of (60) at $Q$ yields
$$F_z(Q)z = z'.$$
The crucial point is that, for every $w \in W$, there exists exactly one $z \in Z$ with $z' = w$, namely, $z(s) = \int_0^s w(t)\,dt$. Hence $F_z(Q)\colon Z \to W$ is bijective.
Thus, by the implicit function theorem from Section 4.8, there exist numbers $\rho > 0$ and $r > 0$ such that, for given $a > 0$, $\tau \in \mathbb{R}$, and $y \in X$ with
$$a,\ |\tau - t_0|,\ \|y - x_0\|_X < \rho, \tag{62}$$
the operator equation (61) has a unique solution $z \in Z$ with $\|z\|_Z < r$. In addition, the map
$$(a, \tau, y) \mapsto z \tag{63}$$
is $C^1$ from the open subset (62) in $\mathbb{R} \times \mathbb{R} \times X$ to $Z$.
The remaining part of the proof is routine.
However, for the convenience of the reader, let us discuss this in detail.
Step 3: Uniqueness for the original problem (59). It follows from Step 2 that each solution of (59) is locally unique. In fact, consider a solution $x = x(t)$ of (59), say at the point $t_0$. Since the function $x(\cdot)$ is differentiable, it is continuous, and hence $x'(\cdot)$ is continuous, by (59). Thus, $x(\cdot)$ is $C^1$. Moreover, if we choose the number $a > 0$ sufficiently small, then $\|z\|_Z < r$, and Step 2 yields the local uniqueness. That is, $x(\cdot)$ is unique on a small neighborhood of $t_0$.
Furthermore, observe that local uniqueness implies global uniqueness by using the following standard argument. Let $x = x(t)$ and $x = X(t)$ be two solutions of (59) for some fixed $a > 0$. Let $I := \,]t_0 - c, t_0 + b[$ be the largest open interval such that
$$x(t) = X(t) \quad\text{on } I.$$
For example, suppose that $b < a$. By continuity, $x(t_0 + b) = X(t_0 + b)$. Applying the local uniqueness to the point $t_0 + b$, we obtain that $x(t) = X(t)$ on some small neighborhood of $t = t_0 + b$. This contradicts the maximality of $I$.
Ad (ii). Let $x(\cdot)$ be the solution of (59) from Step 2. Introduce the partial derivatives
$$v(t, y) := x_t(t, y) \quad\text{and}\quad w(t, y) := x_y(t, y) \quad\text{on } U,$$
where $U := \{(t, y) \in \mathbb{R} \times X\colon |t - t_0| < a,\ \|y - x_0\|_X < \rho\}$. Let us show that
$$v \text{ and } w \text{ are } C^{n-1} \text{ on } U \tag{64}$$
provided $a$ and $\rho$ are sufficiently small depending on $n$. By Problem 4.11, this implies that $x = x(t, y)$ is $C^n$ on $U$. Let us prove (64) by induction.
Step A: Let $n = 1$. We want to show the following:
(a) $(t, y) \mapsto x(t, y)$ is continuous on $U$.
(b) $(t, y) \mapsto x_t(t, y)$ is continuous on $U$.
(c) $(t, y) \mapsto x_y(t, y)$ is continuous on $U$.
(d) For each $(t, y) \in U$, there exists $x_{ty}(t, y) = x_{yt}(t, y)$.
Ad (a). The continuity of the mapping from (63) implies that
$$\|z(t_0 + \alpha, y + h) - z(t_0, y)\|_Z < \varepsilon$$
provided $|\alpha|$ and $\|h\|_X$ are sufficiently small. This means that
$$\max_{s \in J} \|z(s; t_0 + \alpha, y + h) - z(s; t_0, y)\|_X < \varepsilon.$$
Thus, $(t, y) \mapsto x(t, y)$ is continuous on $U$, by the definition of the function $z$ in Step 1.
Ad (b). Use the original equation (59) and (a).
Ad (c), (d). Since the mapping from (63) is $C^1$, there exists the continuous derivative $A := z_y$. That is, the operator $A\colon X \to Z$ is linear and
$$\|z(y + h) - z(y) - Ah\|_Z = o(\|h\|), \qquad h \to 0 \text{ in } X. \tag{65}$$
The continuity of $z_y$ means that
$$\|A(t_0 + \alpha, y + k)h - A(t_0, y)h\|_Z \le \varepsilon\|h\|_X \quad\text{for all } h \in X \tag{66}$$
if $|\alpha|$ and $\|k\|_X$ are sufficiently small. Relation (65) implies
$$\max_{s \in J} \|z(s; \tau, y + h) - z(s; \tau, y) - (Ah)(s)\|_X + \max_{s \in J} \|z_s(s; \tau, y + h) - z_s(s; \tau, y) - (Ah)'(s)\|_X = o(\|h\|), \qquad h \to 0 \text{ in } X.$$
By the definition of the derivative, this yields
$$z_y(s; \tau, y)h = (Ah)(s) \tag{67}$$
and $z_{ys}(s; \tau, y)h = (Ah)'(s)$, that is,
$$z_{ys}(s; \tau, y) = z_{sy}(s; \tau, y), \tag{68}$$
for all $h \in X$ and all $s \in J$, $(\tau, y) \in U$. Moreover, for $s \in J$, it follows from (66) that
$$\|z_y(s; t_0 + \alpha, y + k)h - z_y(s; t_0, y)h\|_X \le \varepsilon\|h\|_X \quad\text{for all } h \in X,$$
if $|\alpha|$ and $\|k\|_X$ are sufficiently small. Hence
$$\|z_y(s; t_0 + \alpha, y + k) - z_y(s; t_0, y)\| \le \varepsilon. \tag{69}$$
Using the definition of the function $z$ in Step 1, we get (d) and (c) from (68) and (69), respectively.
Step B: Using (59*), we obtain Proposition 1 for $n = 1$ in the case where $f$ depends on a parameter.
Step C: Let $n \ge 2$. Suppose that Proposition 1 has been proved for $n - 1$. Let $x = x(t, y)$ be the solution of the original problem (59) from Step A. Assume that $(t, y) \mapsto x(t, y)$ is $C^{n-1}$ on $U$.
Differentiating the original equation (59) with respect to $y$ and $t$, and using $x_{ty}(t, y) = x_{yt}(t, y)$, we get⁹ the following new linear initial-value problem:
$$v'(t) = f_t(t, x(t, y)) + f_x(t, x(t, y))v(t), \qquad w'(t) = f_x(t, x(t, y))w(t), \tag{70a}$$
$$v(t_0) = a \quad\text{and}\quad w(t_0) = I, \tag{70b}$$
where $a := x_t(t_0, y) = f(t_0, y)$. For $y = y_0$, we set $a_0 := f(t_0, y_0)$. Since $(t, y) \mapsto x(t, y)$ is $C^{n-1}$ on $U$, the right-hand side of (70a) is also $C^{n-1}$ as a function of $(t, y, v, w)$. Using Proposition 1 for $n - 1$, we obtain that the map
$$(t, a) \mapsto (v(t; a), w(t; a))$$
is $C^{n-1}$ on a small neighborhood of the point $(t_0, a_0)$. Since
$$y \mapsto a \equiv f(t_0, y)$$
is $C^{n-1}$ on a small neighborhood of $y_0$, we obtain the desired result (64). □
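The linearized problem (70) for w = x_y can be illustrated with a scalar equation. The following Python sketch is an example built on assumptions that do not appear in the text: X = R, f(t, x) = x^2, t_0 = 0, initial value y = 0.8, a classical fourth-order Runge-Kutta scheme as the integrator, and hypothetical function names. For this f the exact solution is x(t; y) = y/(1 - yt), so that x_y(t; y) = 1/(1 - yt)^2.

```python
def rhs(state):
    # coupled system: x' = f(x) = x^2 and w' = f_x(x) w = 2 x w, cf. (70a)
    x, w = state
    return (x * x, 2.0 * x * w)

def rk4(y, t_end, steps=1000):
    # classical Runge-Kutta scheme on [0, t_end]
    dt = t_end / steps
    state = (y, 1.0)                      # x(t_0) = y and w(t_0) = I = 1, cf. (70b)
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

y, t_end = 0.8, 0.5
x_end, w_end = rk4(y, t_end)

# difference quotient of the solution with respect to the initial value y
eps = 1e-6
fd = (rk4(y + eps, t_end)[0] - rk4(y - eps, t_end)[0]) / (2 * eps)
exact_w = 1.0 / (1.0 - y * t_end) ** 2
print(w_end, fd, exact_w)
```

In this example the solution w of the linearized equation, the difference quotient of x with respect to y, and the exact derivative agree up to discretization error.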
4.10 Diffeomorphisms and the Local Inverse
Mapping Theorem
Let us apply the implicit function theorem to the study of the local behavior
of nonlinear mappings. The results of this section and the next are of great
importance for nonlinear analysis.
⁹ To simplify notation, we write $v(t)$ and $w(t)$ instead of $v(t, y) = x_t(t, y)$ and
$w(t, y) = x_y(t, y)$, respectively.
Definition 1. Let $U$ and $V$ be nonempty open sets in the Banach spaces
$X$ and $Y$ over $\mathbb{K}$. Let $0 \le r \le \infty$.
The mapping $f: U \to V$ is called a $C^r$-diffeomorphism iff $f$ is bijective
and both $f$ and $f^{-1}$ are $C^r$-maps.
A local $C^r$-diffeomorphism at the point $u_0$ is a $C^r$-diffeomorphism from
some open neighborhood $U(u_0)$ in $X$ onto some open neighborhood¹⁰
$V(f(u_0))$ in $Y$.
Obviously, $C^0$-diffeomorphisms are homeomorphisms.
Theorem 4.F (The local inverse mapping theorem). Let $f: U(u_0) \subseteq X \to Y$
be a $C^r$-map on some open neighborhood of the point $u_0$, where $X$ and
$Y$ are Banach spaces over $\mathbb{K}$ and $1 \le r \le \infty$.
Then $f$ is a local $C^r$-diffeomorphism at $u_0$ iff $f'(u_0): X \to Y$ is bijective.
A global version of the inverse mapping theorem will be considered in
Problem 4.12.
Proof. Step 1: Let $f'(u_0): X \to Y$ be bijective and set $v_0 := f(u_0)$. Furthermore, set
$F(u, v) := f(u) - v$.
Then the equation
$F(u, v) = 0$, $u \in X$, $v \in Y$, (71)
can be solved for $u$ by the implicit function theorem in Section 4.8, because
$F(u_0, v_0) = 0$, and the map $F_u(u_0, v_0) = f'(u_0)$ is bijective from $X$ to $Y$.
Thus, there exist open neighborhoods $U(u_0)$ and $V(v_0)$ such that, for each
$v \in V(v_0)$, equation (71) has a unique solution $u = u(v)$ in $U(u_0)$. The map
$v \mapsto u(v)$
is $C^r$. Obviously, $u(v) = f^{-1}(v)$ for all $v \in V(v_0)$.
If we set $g(v) := f^{-1}(v)$, then
$f(g(v)) = v$ and $g(f(u)) = u$. (72*)
Differentiation by the chain rule gives
$f'(u)g'(v) = I$ and $g'(v)f'(u) = I$, (72)
where $v := f(u)$. Hence $g'(v) = f'(u)^{-1}$, that is,
$(f^{-1})'(f(u)) = f'(u)^{-1}$ for all $u \in U(u_0)$. (73)
¹⁰ We also speak of a local diffeomorphism between the points $u_0$ and $f(u_0)$.
Step 2: Let $f$ be a local $C^r$-diffeomorphism at $u_0$. Then (72*) implies
equation (72) for all $u \in U(u_0)$. Hence $f'(u): X \to Y$ is bijective for all
$u \in U(u_0)$. □
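Formula (73) can be observed numerically in $\mathbb{R}^2$. The sketch below is a hedged illustration: the map `f`, the target point, and all function names are assumptions made for the example. Newton's method is used as a convenient solver for $f(u) = v$; it is not the argument of the proof, which rests on the implicit function theorem. The code computes the local inverse $g = f^{-1}$ near $u_0 = 0$ and compares a difference approximation of $g'(v)$ with $f'(u)^{-1}$.

```python
def f(u):
    # Assumed sample map on R^2 with f(0) = 0 and f'(0) = I.
    u1, u2 = u
    return (u1 + u2 ** 2, u2 + u1 ** 2)


def jacobian(u):
    # The derivative f'(u) as a 2x2 matrix (tuple of rows).
    u1, u2 = u
    return ((1.0, 2 * u2), (2 * u1, 1.0))


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def local_inverse(v, u_start=(0.0, 0.0), tol=1e-14, max_iter=50):
    # Solve f(u) = v near u_start by Newton's method.
    u = u_start
    for _ in range(max_iter):
        fu = f(u)
        r = (fu[0] - v[0], fu[1] - v[1])
        if abs(r[0]) + abs(r[1]) < tol:
            break
        inv = inverse_2x2(jacobian(u))
        u = (
            u[0] - (inv[0][0] * r[0] + inv[0][1] * r[1]),
            u[1] - (inv[1][0] * r[0] + inv[1][1] * r[1]),
        )
    return u


def inverse_derivative_fd(v, eps=1e-6):
    # Central-difference approximation of g'(v), where g is the local inverse.
    cols = []
    for j in range(2):
        vp = list(v)
        vm = list(v)
        vp[j] += eps
        vm[j] -= eps
        gp = local_inverse(tuple(vp))
        gm = local_inverse(tuple(vm))
        cols.append(((gp[0] - gm[0]) / (2 * eps), (gp[1] - gm[1]) / (2 * eps)))
    # cols[j] is the j-th column; return the matrix as a tuple of rows.
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))
```

With `v = (0.1, -0.05)` and `u = local_inverse(v)`, the matrices `inverse_derivative_fd(v)` and `inverse_2x2(jacobian(u))` should coincide up to the error of the difference quotient.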
Corollary 2. Let $f: U \to V$ and $g: V \to W$ be $C^r$-diffeomorphisms, where
$U$, $V$, and $W$ are nonempty open sets in Banach spaces over $\mathbb{K}$ and
$1 \le r \le \infty$.
Then the composite map $g \circ f: U \to W$ is also a $C^r$-diffeomorphism.
Proof. Obviously, $g \circ f$ is bijective. By the chain rule,
$(g \circ f)'(u) = g'(f(u))f'(u)$ for all $u \in U$.
According to the inverse mapping theorem (Theorem 4.F), $f'(u)$ and
$g'(f(u))$ are bijective, and hence so is $(g \circ f)'(u)$. Thus, again using the
inverse mapping theorem, we prove the assertion in Corollary 2. □
4.11 Equivalent Maps and the Linearization
Principle
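Let us now study the case where $f'(u_0)$ is not bijective, that is, where $f$ is
not a local diffeomorphism at $u_0$. Our point of departure is the following
commutative diagram:
$\begin{array}{ccc} U & \xrightarrow{\;f\;} & V \\ {\scriptstyle\phi}\uparrow & & \downarrow{\scriptstyle\psi} \\ W & \xrightarrow{\;g\;} & Z \end{array}$ (74)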
Definition 1. Let $f: U \to V$ and $g: W \to Z$ be two $C^r$-maps where $U$,
$V$, $W$, and $Z$ are nonempty open subsets of Banach spaces over $\mathbb{K}$. Let
$0 \le r \le \infty$.
We say that the map $f$ at the point $u_0$ is $C^r$-equivalent to the map $g$ at
the point $w_0$ iff there exist a local $C^r$-diffeomorphism $\phi$ between $w_0$ and
$u_0$ and a local $C^r$-diffeomorphism $\psi$ between $f(u_0)$ and $g(w_0)$ such that
the diagram (74) is locally commutative. By definition, this means that we
have
$g = \psi \circ f \circ \phi$ on some open neighborhood of $w_0$.
We write
$f \overset{r}{\sim} g$ at $(u_0, w_0)$
iff $f$ at $u_0$ is $C^r$-equivalent to $g$ at $w_0$. To discuss this, let
$v = f(u)$ on some open neighborhood of $u_0$.
If we introduce the new local coordinates $u = \phi(w)$ and $z = \psi(v)$, then we
get
$z = g(w)$ on some open neighborhood of $w_0$.
Roughly speaking:
Equivalent maps locally possess the same structure.
Since local diffeomorphisms are invariant under composition of maps and
inverse formation, we find that the equivalence of maps represents an equivalence relation. That is,
(i) $f \overset{r}{\sim} g$ at $(u_0, w_0)$ implies $g \overset{r}{\sim} f$ at $(w_0, u_0)$.
(ii) $f \overset{r}{\sim} g$ at $(u_0, w_0)$ and $g \overset{r}{\sim} h$ at $(w_0, z_0)$ imply $f \overset{r}{\sim} h$ at $(u_0, z_0)$.
Example 2. Let $f: U(u_0) \subseteq \mathbb{R} \to \mathbb{R}$ be $C^r$ on some open neighborhood of
$u_0$ ($1 \le r \le \infty$). Then $f$ at $u_0$ is $C^r$-equivalent to both the linearizations
$u \mapsto f(u_0) + f'(u_0)(u - u_0)$ at $u_0$ (75)
and
$u \mapsto f'(u_0)u$ at $u = 0$. (75*)
Relation (75*) is a direct consequence of Theorem 4.G, which follows the
next definition. Use the translations $u \mapsto u - u_0$ and $v \mapsto v - f(u_0)$ in order
to obtain (75) from (75*).
If $f'(u_0) \ne 0$, then (75*) is also $C^\infty$-equivalent to $u \mapsto u$ at $u = 0$. In
fact, if we set $w := f'(u_0)^{-1}v$, then $v = f'(u_0)u$ is transformed into the
new equation $w = u$.
Definition 3. Let $f: U(u_0) \subseteq X \to Y$ be a $C^1$-map on an open neighborhood of $u_0$, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then
(i) $f$ is called a submersion at $u_0$ iff $f'(u_0): X \to Y$ is surjective and the
null space $N(f'(u_0))$ splits $X$.
(ii) $f$ is called an immersion at $u_0$ iff $f'(u_0): X \to Y$ is injective and the
range $R(f'(u_0))$ splits $Y$.
(iii) $f$ is called a subimmersion at $u_0$ iff either
$X$ and $Y$ have finite dimensions (76)
and rank $f'(u)$ is constant on some open neighborhood of $u_0$, or condition (76) is not satisfied and $N(f'(u_0))$ splits $X$, $R(f'(u_0))$ splits
$Y$, as well as
$f'(u)(N^c) = f'(u)(X)$
for all $u$ on some open neighborhood of $u_0$, where $N^c$ is given in such
a way that $X = N(f'(u_0)) \oplus N^c$ is a fixed topological sum.
Theorem 4.G (The linearization principle). Let $f: U(u_0) \subseteq X \to Y$ be a
$C^r$-map on an open neighborhood of $u_0$, where $X$ and $Y$ are Banach spaces
over $\mathbb{K}$ and $1 \le r \le \infty$. Suppose that $f$ is a submersion, an immersion, or
a subimmersion at $u_0$.
Then $f$ at $u_0$ is $C^r$-equivalent to the linearization $f'(u_0): X \to Y$ at
$u = 0$.
Corollary 4. In addition, the following hold:
(i) If $f$ is a submersion at $u_0$, then $f$ is locally surjective at $u_0$, that is,
there exists an open neighborhood $U(u_0)$ in $X$ and a number $\rho > 0$
such that the equation
$f(u) = v$, $u \in U(u_0)$, (77)
has a solution for each $v \in Y$ with $\|v - f(u_0)\| < \rho$.
(ii) If $f$ is an immersion at $u_0$, then $f$ is locally injective at $u_0$, that
is, there exists an open neighborhood $U(u_0)$ in $X$ such that, for each
$v \in Y$, equation (77) has at most one solution.
Theorem 4.G is also called the rank theorem since subimmersions on
finite-dimensional spaces have locally constant rank.
Important applications of this theorem to the theory of manifolds can be
found in Zeidler (1986), Vol. 4, Section 73.11.
Proof of Corollary 4. Use Theorem 4.G and the observation that the
linearized equation
$f'(u_0)(u - u_0) = v$, $u \in U(u_0)$, (77*)
has the corresponding properties. In fact, if $f$ is a submersion at $u_0$, then
the solutions of (77*) are given through $u = u_0 + A^{-1}v + w$, where $w \in N(f'(u_0))$ and
$A: N^c \to Y$
denotes the restriction of $f'(u_0)$ to $N^c$. Note that $A$ is a linear homeomorphism, by Proposition 13 in Section 4.9.
If $f$ is an immersion at $u_0$, then $f'(u_0)$ is injective, and hence equation
(77*) has at most one solution. □
Set $N := N(f'(u_0))$ and $R := R(f'(u_0))$. Choose fixed topological direct
sums
$X = N \oplus N^c$ and $Y = R \oplus R^c$ (78)
along with the corresponding continuous projections
$P: X \to N$ and $Q: Y \to R$.
Proof of Theorem 4.G. Step 1: Let $f$ be a submersion at $u_0$. Let the
operator $A$ be given as in the proof of Corollary 4. Define
$\phi(u) := Pu + A^{-1}f(u)$. (79)
Since $f'(u_0)h = f'(u_0)(I - P)h$ and hence $A^{-1}f'(u_0)(I - P)h = A^{-1}A(I - P)h = (I - P)h$, we get
$\phi'(u_0)h = Ph + A^{-1}f'(u_0)h = h$ for all $h \in X$.
By the inverse mapping theorem (Theorem 4.F), $\phi$ is a local $C^r$-diffeomorphism at $u_0$. Multiplying (79) by $f'(u_0)$, we obtain
$f'(u_0)\phi(u) = f(u)$ on some open neighborhood of $u_0$.
This way, we get the commutative diagram:
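$\begin{array}{ccc} U(u_0) & \xrightarrow{\;f\;} & Y \\ {\scriptstyle\phi}\downarrow & \nearrow{\scriptstyle f'(u_0)} & \\ U(0) \subseteq X & & \end{array}$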
Hence $f$ at $u_0$ is $C^r$-equivalent to $f'(u_0)$ at $u = 0$.
Step 2: Let $f$ be an immersion at $u_0$. After a translation we may assume
that $u_0 = 0$ and $f(0) = 0$. Define
$\phi(v) := f(f'(0)^{-1}Qv) + (I - Q)v$. (80)
Then
$\phi'(0)k = Qk + (I - Q)k = k$ for all $k \in Y$.
By the inverse mapping theorem, $\phi$ is a local $C^r$-diffeomorphism at $v = 0$
with $\phi(0) = f(0)$. Since $Qf'(0) = f'(0)$, we get
$\phi(f'(0)v) = f(v)$
for all $v$ on some open neighborhood of $v = 0$. Thus, the following diagram
is commutative:
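$\begin{array}{ccc} U(0) \subseteq X & \xrightarrow{\;f\;} & Y \\ & {\scriptstyle f'(0)}\searrow & \uparrow{\scriptstyle\phi} \\ & & V(0) \subseteq Y \end{array}$
Hence $f$ at $u_0 = 0$ is $C^r$-equivalent to $f'(0)$ at $u = 0$.
Step 3: Let $f$ be a subimmersion at $u_0$. The proof will be given in the
next section, by using a normal form for double splitting maps. □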
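Step 1 of this proof can be made concrete in a small finite-dimensional case. The following Python sketch is an illustration under assumptions of our own choosing: the map `f` on $\mathbb{R}^2$, the splitting $N = \operatorname{span}\{e_1\}$, $N^c = \operatorname{span}\{e_2\}$, and all function names are hypothetical. For this `f` we have $f(0) = 0$ and $f'(0) = (0, 1)$, so $f$ is a submersion at $u_0 = 0$, and the map $\phi$ from (79) satisfies $f'(0)\phi(u) = f(u)$.

```python
def f(u):
    # Assumed sample submersion R^2 -> R at u0 = 0: f(0) = 0, f'(0) = (0, 1).
    u1, u2 = u
    return u2 + u1 ** 2 + u1 * u2


def linearization(h):
    # f'(0)h for the sample map above.
    return h[1]


def P(u):
    # Projection onto the null space N = span{e1} along N^c = span{e2}.
    return (u[0], 0.0)


def A_inverse(v):
    # Inverse of A: N^c -> R, A(0, t) = t, the restriction of f'(0) to N^c.
    return (0.0, v)


def phi(u):
    # The map from (79): phi(u) = Pu + A^{-1} f(u).
    p = P(u)
    q = A_inverse(f(u))
    return (p[0] + q[0], p[1] + q[1])
```

For any `u` near the origin, `linearization(phi(u))` reproduces `f(u)`, which is the relation $f'(u_0)\phi(u) = f(u)$ for this example.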
4.12 The Local Normal Form for Nonlinear
Double Splitting Maps
We will use the notation from (78). Our goals are the first normal form
$f(\phi(n, r)) = f(u_0) + r + g(n, r)$ for all $(n, r) \in U(0, 0) \subseteq N \times R$, (81a)
where
$g(n, r) \in R^c$ on $U(0, 0)$, $g(0, 0) = 0$, $g'(0, 0) = 0$, (81b)
and the second normal form
$f(\psi(u)) = f(u_0) + f'(u_0)(u - u_0) + a(u)$ for all $u \in U(u_0)$, (82a)
where
$a(u) \in R^c$ on $U(u_0)$, $a(u_0) = 0$, $a'(u_0) = 0$. (82b)
This can be regarded as a variant of the Taylor theorem.
Proposition 1. Let $f: U(u_0) \subseteq X \to Y$ be a $C^r$-map on an open neighborhood of $u_0$, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$ and $1 \le r \le \infty$.
Suppose that the null space $N := N(f'(u_0))$ splits $X$ and the range $R := R(f'(u_0))$ splits $Y$. Then the following are true:
(i) There exist an open neighborhood $U(0, 0)$ in $N \times R$ and a local $C^r$-diffeomorphism $\phi: U(0, 0) \to X$ between $(0, 0)$ and $u_0$ such that (81)
holds.
(ii) There exist an open neighborhood $U(u_0)$ in $X$ and a local $C^r$-diffeomorphism $\psi: U(u_0) \to X$ between $u_0$ and $u_0$ such that (82) holds.
Proof. Without loss of generality, let $u_0 = 0$ and $f(u_0) = 0$.
Ad (i). The proof idea is to apply the inverse mapping theorem to the
mapping
$F(u) := (u_1, f_1(u))$
and to let $\phi := F^{-1}$.
The topological direct sums $X = N \oplus N^c$ and $Y = R \oplus R^c$ yield the
decompositions
$u = u_1 + u_2$ and $f(u) = f_1(u) + f_2(u)$,
for $u \in X$ and $f(u) \in Y$, respectively (i.e., $u_1 \in N$, $u_2 \in N^c$, and $f_1(u) \in R$,
$f_2(u) \in R^c$).
Since $f(0) = 0$ and
$f'(0)h = f_1'(0)h + f_2'(0)h$ for all $h \in X$
with $f'(0)h \in R$, we obtain
$f_1(0) = f_2(0) = 0$ and $f_2'(0) = 0$.
The map $F: U(0) \subseteq X \to N \times R$, as defined above, is $C^r$ with $F(0) = (0, 0)$
and
$F'(0)h = (h_1, f_1'(0)h) = (h_1, f'(0)h)$ for all $h \in X$.
Since $f'(0): N^c \to R$ is bijective, it follows from $F'(0)h = 0$ that $h_1 = 0$
and hence $h_2 = 0$ (i.e., $h = 0$). Thus, $F'(0): X \to N \times R$ is bijective,
and the inverse mapping theorem (Theorem 4.F) implies that $F$ is a local
$C^r$-diffeomorphism at $u_0 = 0$.
Letting $\phi := F^{-1}$, we get $\phi(n, r) = u$, where $n = u_1$ and $r = f_1(u)$. Thus
$f(\phi(n, r)) = f_1(u) + f_2(u) = r + f_2(\phi(n, r))$.
This is (81) with $g(n, r) := f_2(\phi(n, r))$.
Finally, we obtain $g(0, 0) = 0$ from $f_2(0) = 0$ and $\phi(0, 0) = 0$. Moreover,
$f_2'(0) = 0$ implies
$g'(0, 0) = f_2'(0)\phi'(0, 0) = 0$.
Ad (ii). Define $\chi(u) := (n, r)$ by
$n := Pu$ and $r := f'(0)u$. (83)
If $(n, r) = 0$, then $u = 0$, since $f'(0): N^c \to R$ is bijective. Thus, the map
$\chi: X \to N \times R$ is a $C^\infty$-diffeomorphism, by the inverse mapping theorem.
Letting $\psi(u) := \phi(n, r)$ along with $(n, r) := \chi(u)$, we get
$f(\psi(u)) = f(\phi(n, r)) = r + g(n, r)$
$= f'(0)u + a(u)$,
where $a(u) := g(n, r)$. This is (82). □
We are now able to finish the proof of Theorem 4.G in the case where $f$
is a subimmersion at $u_0$. Without loss of generality, we again assume that
$u_0 := 0$ and $f(u_0) := 0$. Define
$H(n, r) := f(\phi(n, r))$.
Then
(a) $H_n(n, r) = 0$ for all $(n, r) \in N \times R$ on some open neighborhood
$U(0, 0)$. This will be proved ahead.
(b) $H(n, r)$ is independent of $n$ on $U(0, 0)$. In fact, we may assume that
$U(0, 0)$ is convex. By the Taylor theorem,
$\|H(n_1, r) - H(n_2, r)\| \le \sup_{0 \le \tau \le 1} \|H_n(n_1 + \tau(n_2 - n_1), r)\| = 0$,
for all $(n_1, r), (n_2, r) \in U(0, 0)$.
(c) Set $G(r) := H(n, r)$. Due to (b) and (81),
$G(r) = H(0, r) = r + g(0, r)$.
Since $G'(0)h = h + g'(0, 0)h = h$ for all $h \in R$, the operator $G'(0): R \to Y$
is injective. By Theorem 4.G for immersions, $G$ at $r = 0$ is equivalent to
$G'(0)$ at $r = 0$. Thus, Step 2 of the proof of Theorem 4.G tells us that
there exists a local $C^r$-diffeomorphism $\psi: U(0) \subseteq Y \to R$ between $v = 0$
and $r = 0$ such that $(\psi \circ G)(r) = G'(0)r$, that is,
$(\psi \circ f \circ \phi)(n, r) = r$
on some open neighborhood of $(0, 0)$ in $N \times R$. Consequently, $f$ at $u_0 = 0$
is equivalent to the mapping
$(n, r) \mapsto r$ (84)
from $N \times R$ to $R$ at the point $(0, 0)$.
(d) Obviously, the mapping from (84) at $(0, 0)$ is equivalent to $f'(0)$ at
$u = 0$, by means of (83). In fact,
$f'(0)u = r$ for all $(n, r) \in N \times R$, where $u$ is determined by (83).
(e) By (c) and (d), $f$ at $u_0 = 0$ is equivalent to $f'(0)$ at $u = 0$. This is
the statement of Theorem 4.G for subimmersions.
It remains to prove (a). In the following, let $(n, r) \in U(0, 0)$. For all
$(h, k) \in N \times R$, we have
$H'(n, r) = f'(\phi(n, r))\phi'(n, r)$ (85)
and
$H'(n, r)(h, k) = k + g'(n, r)(h, k)$.
Since $k \in R$ and $g'(n, r)(h, k) \in R^c$, it follows from
$H'(n, r)(h_1, k_1) = H'(n, r)(h_2, k_2)$ (86)
along with $(h_j, k_j) \in N \times R$, $j = 1, 2$, that $k_1 = k_2$. Thus, the map
$H'(n, r): \{0\} \times R \to H'(n, r)(N \times R)$, $(n, r) \in U(0, 0)$, (87)
is injective. We want to show that
The map from (87) is surjective, (88)
provided $U(0, 0)$ is sufficiently small. In fact, since $\phi$ is a local $C^r$-diffeomorphism, the linearization
$\phi'(n, r): N \times R \to X$
is bijective, by the inverse mapping theorem (Theorem 4.F). Thus, by (85),
$H'(n, r)(N \times R) = R(f'(\phi(n, r)))$. (89)
Case 1: Let $\dim X < \infty$ and $\dim Y < \infty$. Since $f$ is a subimmersion at
$u_0 = 0$,
$\dim R(f'(\phi(n, r))) = \text{const} = \dim R(f'(0)) = \dim R$.
By (89), $\dim H'(n, r)(N \times R) = \dim R$. Thus, the injectivity of (87) implies
(88).¹¹
Case 2: Let $\dim X = \infty$ or $\dim Y = \infty$. Since $f$ is a subimmersion at
$u_0 = 0$,
$f'(u)(N^c) = f'(u)(X)$ for all $u \in U(0)$.
Let $v \in H'(n, r)(N \times R)$ be given. Then $v \in f'(u)(X)$, by (89). Here,
$u := \phi(n, r)$. Thus, there is an $n_c \in N^c$ such that $v = f'(u)n_c$. By the proof
of Proposition 1, $\phi = F^{-1}$ and
$F'(u)n_c = (0, f_1'(u)n_c)$.
Thus, we get $\phi'(n, r)(0, f_1'(u)n_c) = n_c$ along with $w := f_1'(u)n_c \in R$. By
(85),
$H'(n, r)(0, w) = f'(u)n_c = v$.
This yields (88).
¹¹ Observe that $\dim(\{0\} \times R) = \dim R < \infty$ in (87).
To finish the proof, let $(h, k) \in N \times R$ be given. Since the map from (87)
is surjective, there exists a $\bar{k} \in R$ such that
$H'(n, r)(0, \bar{k}) = H'(n, r)(h, k)$.
By (86), $\bar{k} = k$. Therefore,
$H_r(n, r)k = H_n(n, r)h + H_r(n, r)k$ for all $h \in N$,
and hence $H_n(n, r) = 0$. This proves part (a).
The proof of Theorem 4.G is now complete. □
4.13 The Surjective Implicit Function Theorem
We want to solve the equation
$F(u, v) = 0$ (90)
under the assumption that¹²
$F_v(u_0, v_0): Y \to Z$ is surjective. (91)
Theorem 4.H. Let $X$, $Y$, and $Z$ be Banach spaces over $\mathbb{K}$, and let
$F: U(u_0, v_0) \subseteq X \times Y \to Z$
be a $C^1$-map on an open neighborhood of the point $(u_0, v_0)$ such that
$F(u_0, v_0) = 0$ and (91) holds. Then the following are true:
(i) Let $r > 0$. There is a number $\rho > 0$ such that, for each given $u \in X$
with $\|u - u_0\| < \rho$, the original equation (90) has a solution $v$, denoted
by $v(u)$, such that
$\|v - v_0\| < r$.
In particular, the limit $u \to u_0$ in $X$ implies $v(u) \to v_0$.
(ii) There is a number $d > 0$ such that $\|v(u)\| \le d\|F_v(u_0, v_0)v(u)\|$.
Proof. Without loss of generality, we may assume that $u_0 = 0$ and $v_0 = 0$.
Step 1: Let $N := \{v \in Y: F_v(0, 0)v = 0\}$. There is a number $d > 0$ such
that, for each $z \in Z$, there is a point $w(z) \in Y$ with
$F_v(0, 0)w(z) = z$ and $\|w(z)\| \le d\|z\|$. (92)
¹² Observe that we do not need the null space $N(F_v(u_0, v_0))$ to split $Y$.
In fact, by the closed range theorem in Section 3.12, there is a number $c > 0$
such that
$\|F_v(0, 0)(v - n)\| \ge c \cdot \operatorname{dist}(v, N)$ for all $v \in Y$, $n \in N$.
Since $F_v(0, 0): Y \to Z$ is surjective, there exists a $v \in Y$ such that $z = F_v(0, 0)(v - n)$ for all $n \in N$. Moreover, there is some $n \in N$ such that
$\|v - n\| \le 2 \operatorname{dist}(v, N)$.
Setting $w(z) := v - n$, we obtain (92) with $d := 2/c$.
Step 2: The original equation (90) is equivalent to
$F_v(0, 0)v = f(u, v)$ with $f(u, v) := F_v(0, 0)v - F(u, v)$. (93)
Let $\|u\| \le \rho$ and $\|v\|, \|w\| \le r$. Then, by Step 1 of the proof of the implicit
function theorem (Theorem 4.E),
$\|f(u, v)\| \le o(1)\|v\| + \|f(u, 0)\|$, $r \to 0$, (94)
and
$\|f(u, w) - f(u, v)\| \le o(1)\|w - v\|$, $r \to 0$. (95)
Step 3: For a given $u \in X$ with $\|u\| \le \rho$, let us consider the following
iterative method:
$F_v(0, 0)v_{m+1} = f(u, v_m)$, $m = 0, 1, 2, \ldots$, (96)
where $v_0 := 0$ and $v_{m+1}$ is chosen according to (92), that is,
$v_{m+1} := w(f(u, v_m))$.
Since $\|v_{m+1}\| \le d\|f(u, v_m)\|$, it follows from (94) and (95) that, for sufficiently small $\rho$ and $r$, we get
$\|v_m\| \le r$ and $\|v_{m+2} - v_{m+1}\| \le 2^{-1}\|v_{m+1} - v_m\|$ for all $m = 0, 1, \ldots$.
It follows, as in the proof of Theorem 1.A in AMS Vol. 108, that $(v_m)$ is a
Cauchy sequence, and hence
$v_m \to v$ as $m \to \infty$
for some $v$. This implies $\|v\| \le r$ and $F_v(0, 0)v = f(u, v)$, by (96). Hence
$F(u, v) = 0$.
Finally, if we let $m \to \infty$, it follows from
$\|v_{m+1}\| \le d\|F_v(0, 0)v_m - F(u, v_m)\|$
that $\|v\| \le d\|F_v(0, 0)v\|$. □
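The iteration (96) is easy to try out when $F_v(0, 0)$ is surjective but not injective. The Python sketch below is an illustration only: the map `F` from $\mathbb{R} \times \mathbb{R}^2$ to $\mathbb{R}$, the minimum-norm right inverse `w`, and the iteration count are assumptions made for the example. Here $F_v(0, 0)v = v_1 + v_2$, so $w(z) = (z/2, z/2)$ satisfies (92) with $d = 1/\sqrt{2}$ in the Euclidean norm.

```python
def F(u, v):
    # Assumed sample map F: R x R^2 -> R with F(0, 0) = 0.
    return v[0] + v[1] + v[0] ** 2 - u


def Fv0(v):
    # The partial derivative F_v(0, 0) applied to v; it is surjective onto R.
    return v[0] + v[1]


def w(z):
    # Minimum-norm solution of F_v(0, 0) w = z, as in (92).
    return (z / 2, z / 2)


def f(u, v):
    # The function from (93): f(u, v) = F_v(0, 0) v - F(u, v).
    return Fv0(v) - F(u, v)


def solve(u, iterations=60):
    # Iteration (96): v_{m+1} = w(f(u, v_m)) with v_0 = 0.
    v = (0.0, 0.0)
    for _ in range(iterations):
        v = w(f(u, v))
    return v
```

For small `u`, the limit `v = solve(u)` satisfies `F(u, v) = 0` up to rounding, and its norm is bounded by `d * abs(Fv0(v))`, in line with part (ii) of Theorem 4.H.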
4.14 Applications to the Lagrange Multiplier Rule
Let us consider the minimum problem
$f(u) = \min!$, (97a)
along with the side condition
$G(u) = 0$. (97b)
Our goal is to justify the necessary solvability condition
$f'(u) + \Lambda G'(u) = 0$, (98)
where $\Lambda$ is called a Lagrange multiplier.
Proposition 1. Let $f: U(u) \subseteq X \to \mathbb{R}$ and $G: U(u) \subseteq X \to Y$ be $C^1$ on an
open neighborhood of $u$, where $X$ and $Y$ are real Banach spaces. Suppose
that $u$ is a solution of (97a), (97b), where
$G'(u): X \to Y$ is surjective.
Then there exists a functional $\Lambda \in Y^*$ such that (98) holds true.
Sufficient solvability conditions can be found in Zeidler (1986), Vol. 3,
Section 43.8.
In the special case where $Y := \mathbb{R}^n$ and $G(u) = (g_1(u), \ldots, g_n(u))$, the
surjectivity of $G'(u): X \to Y$ is equivalent to the fact that, for each
$(w_1, \ldots, w_n) \in \mathbb{R}^n$, the system
$g_j'(u)h = w_j$, $j = 1, \ldots, n$,
has a solution $h \in X$. Then condition (98) is equivalent to
$f'(u) + \sum_{j=1}^{n} \lambda_j g_j'(u) = 0$,
where $\Lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{R}^n$.
The following proof will be based on the surjective implicit function the-
orem and the closed range theorem.
We have shown in Section 5.17.3 of AMS Vol. 108 that in terms of sta-
tistical physics, absolute temperature is nothing other than a Lagrange
multiplier. Moreover, applications of the Lagrange multiplier rule to capil-
lary surfaces can be found in Problem 2.12.
Proof. Step 1: Let $u$ be a solution of (97). Note that $G(u) = 0$. We want to show
that
$G'(u)h = 0$ implies $f'(u)h = 0$.
To this end, let $h \in X$ be given such that $G'(u)h = 0$. Set
$F(\varepsilon, v) := G(u + \varepsilon h + v)$,
where $(\varepsilon, v)$ lives in some neighborhood of $(0, 0)$ in $\mathbb{R} \times X$. Since $F(0, 0) = 0$
and
$F_v(0, 0)v = G'(u)v$ for all $v \in X$,
it follows from the surjective implicit function theorem (Theorem 4.H) applied to $F$ that there are numbers $\rho > 0$ and $r > 0$ such that, for each
$\varepsilon \in \mathbb{R}$ with $|\varepsilon| \le \rho$, there is a $v(\varepsilon) \in X$ with $\|v(\varepsilon)\| \le r$ such that
$G(u + \varepsilon h + v(\varepsilon)) = 0$ (99)
and
$\|v(\varepsilon)\| \le \text{const}\,\|F_v(0, 0)v(\varepsilon)\| \le \text{const}\,\|G'(u)v(\varepsilon)\|$ (100)
along with $\|v(\varepsilon)\| \to 0$ as $\varepsilon \to 0$. By the definition of the F-derivative,
$G(u + k) = G'(u)k + o(\|k\|)$, $k \to 0$.
Thus, according to (99),
$G'(u)v(\varepsilon) + o(1)\|\varepsilon h + v(\varepsilon)\| = 0$, $\varepsilon \to 0$,
since $G'(u)h = 0$. By (100), $\|v(\varepsilon)\| \le o(1)(\|\varepsilon h\| + \|v(\varepsilon)\|)$, that is,
$\|v(\varepsilon)\| = o(\varepsilon)$, $\varepsilon \to 0$.
Since $u$ is a solution to (97), it follows from (99) that
$f(u + \varepsilon h + v(\varepsilon)) \ge f(u)$.
This yields
$f'(u)(\varepsilon h + v(\varepsilon)) + o(1)\|\varepsilon h + v(\varepsilon)\| \ge 0$, $\varepsilon \to 0$.
Dividing by $\varepsilon$ and letting $\varepsilon \to \pm 0$, we get $f'(u)h \ge 0$ and $f'(u)h \le 0$, which
means that
$f'(u)h = 0$.
Step 2: It follows from Step 1 that $G'(u)h = 0$, $h \in X$, implies $f'(u)h = 0$,
that is, we get the key condition
$f'(u) \in N(G'(u))^\perp$.
By the closed range theorem from Section 3.12, $N(G'(u))^\perp = R(G'(u)^T)$,
and hence
$f'(u) \in R(G'(u)^T)$.
Thus, there exists a functional $\Lambda \in Y^*$ such that $f'(u) = -G'(u)^T \Lambda$, namely,
$\langle f'(u), h \rangle = -\langle G'(u)^T \Lambda, h \rangle = -\langle \Lambda, G'(u)h \rangle$ for all $h \in X$.
Hence $(f'(u) + \Lambda G'(u))h = 0$ for all $h \in X$. This is (98). □
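Condition (98) can be verified by hand or by a few lines of code in a planar example. The following Python sketch is an illustration under assumptions of our own: we minimize the sample function $f(u) = u_1 + u_2$ on the circle $G(u) = u_1^2 + u_2^2 - 2 = 0$, where the minimum is attained at $u = (-1, -1)$. The function names and the least-squares formula for the multiplier are hypothetical helpers, not part of the text.

```python
import math


def f(u):
    # Assumed sample objective.
    return u[0] + u[1]


def G(u):
    # Assumed sample side condition: a circle of radius sqrt(2).
    return u[0] ** 2 + u[1] ** 2 - 2.0


def grad_f(u):
    return (1.0, 1.0)


def grad_G(u):
    return (2 * u[0], 2 * u[1])


def multiplier(u):
    # Least-squares choice of the number lam with grad_f + lam * grad_G = 0.
    gf, gG = grad_f(u), grad_G(u)
    return -(gf[0] * gG[0] + gf[1] * gG[1]) / (gG[0] ** 2 + gG[1] ** 2)


def residual(u, lam):
    # Left-hand side of (98) for the sample problem.
    gf, gG = grad_f(u), grad_G(u)
    return (gf[0] + lam * gG[0], gf[1] + lam * gG[1])
```

At the minimizer `u = (-1.0, -1.0)` the derivative $G'(u) = (-2, -2)$ is surjective onto $\mathbb{R}$, and `residual(u, multiplier(u))` vanishes, which is (98) for this example.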
Problems
In the following, let $X$, $X_j$, $Y$, and $Z$ denote Banach spaces over $\mathbb{K}$. Furthermore, let $U(u)$ denote an open neighborhood of the point $u$. Recall
that
$A^{(n)}(u)h_1 \cdots h_n = d^n A(u)h_1 \cdots h_n$.
Let $-\infty < a < b < \infty$.
4.1. Linear operators. Let $A: X \to Y$ be a linear continuous operator. Show
that
$A' = A$ and $A^{(n)} = 0$ for all $n \ge 2$.
4.2. Composition. We are given the operator $A: U(u) \subseteq X \to Y$ and the
two linear continuous operators $L: Y \to Z$ and $M: Z \to X$. Let $n \ge 1$.
Show that if the F-derivative $A^{(n)}(u)$ exists, then the two F-derivatives
$(L \circ A)^{(n)}(u)$ and $(A \circ M)^{(n)}(u)$ exist, where
$(L \circ A)^{(n)}(u) = L \circ A^{(n)}(u)$
and
for all hj EX, j = 1, ... ,n.
4.3. The superposition operator. Set $X := C[a, b]$. Define
$(Au)(x) := f(u(x))$ for all $x \in [a, b]$.
Suppose that $f: \mathbb{R} \to \mathbb{R}$ is $C^n$, $n \ge 1$. Differentiating $A(u + th)(x) = f(u(x) + th(x))$, $t \in \mathbb{R}$, at the point $t = 0$, we formally obtain
$[A'(u)h](x) = f'(u(x))h(x)$ for all $x \in [a, b]$. (101)
Show that the operator $A: X \to X$ is $C^n$ and $A'(u)$ is given by (101) for
all $u, h \in X$. Moreover, show that
$[A^{(n)}(u)h_1 \cdots h_n](x) = f^{(n)}(u(x))h_1(x) \cdots h_n(x)$
for all $u, h_1, \ldots, h_n \in X$ and all $x \in [a, b]$.
Hint: Use the classic Taylor theorem and an induction argument.
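Formula (101) can be explored on a grid before it is proved. The Python sketch below is an illustration, not a solution of the problem: the choice $f = \sin$, the discretization of $[a, b]$ by finitely many points, and the function names are assumptions for the example. It measures the maximum norm of the remainder $A(u + h) - A(u) - A'(u)h$, which by the classic Taylor theorem is at most $\frac{1}{2}\|h\|^2$ for this $f$.

```python
import math


def superposition(f, u):
    # Discrete analogue of (Au)(x) = f(u(x)); u is a list of grid values.
    return [f(ux) for ux in u]


def derivative(f_prime, u, h):
    # Discrete analogue of (101): [A'(u)h](x) = f'(u(x)) h(x).
    return [f_prime(ux) * hx for ux, hx in zip(u, h)]


def remainder_norm(u, h, scale):
    # Maximum norm of A(u + s h) - A(u) - A'(u)(s h) for f = sin and s = scale.
    sh = [scale * hx for hx in h]
    shifted = superposition(math.sin, [a + b for a, b in zip(u, sh)])
    base = superposition(math.sin, u)
    linear = derivative(math.cos, u, sh)
    return max(abs(p - q - l) for p, q, l in zip(shifted, base, linear))
```

If `h` has maximum norm at most 1, then `remainder_norm(u, h, s)` should not exceed `0.5 * s * s`, so the remainder is of order $o(\|h\|)$ as required for the F-derivative.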
4.4. Nonlinear integral operator. Set $X := C[a, b]$. Define
$(Au)(x) := \int_a^b A(x, y)f(u(y))\,dy$ for all $x \in [a, b]$.
Suppose that $A: [a, b] \times [a, b] \to \mathbb{R}$ is continuous and that $f: \mathbb{R} \to \mathbb{R}$ is $C^n$,
$n \ge 1$. Differentiating
$A(u + th)(x) = \int_a^b A(x, y)f(u(y) + th(y))\,dy$, $t \in \mathbb{R}$,
at the point $t = 0$, we formally obtain
$[A'(u)h](x) = \int_a^b A(x, y)f'(u(y))h(y)\,dy$ for all $x \in [a, b]$. (102)
Show that the operator $A: X \to X$ is $C^n$ and $A'(u)$ is given by (102) for
all $u, h \in X$. Moreover, show that
$[A^{(n)}(u)h_1 h_2 \cdots h_n](x) = \int_a^b A(x, y)f^{(n)}(u(y))h_1(y)h_2(y) \cdots h_n(y)\,dy$
for all $u, h_1, \ldots, h_n \in X$ and all $x \in [a, b]$.
4.5. More general superposition operators. Set $X := C[a, b]$. Define
$(Au)(x) := g(x, u_1(x), \ldots, u_m(x))$ for all $x \in [a, b]$,
where $u = (u_1, \ldots, u_m)$, $m \ge 1$. Suppose that $g: [a, b] \times \mathbb{R}^m \to \mathbb{R}$ is $C^n$,
$n \ge 1$.
Show that the operator $A: X \times \cdots \times X \to X$ is $C^n$ and
$[A'(u)h](x) = \sum_{j=1}^{m} g_{u_j}(x, u(x))h_j(x)$ for all $x \in [a, b]$,
and for all $u, h \in X \times \cdots \times X$, where $h = (h_1, \ldots, h_m)$.
4.6. Nonlinear differential operator. Set $X := C^2[a, b]$ and $Y := C[a, b]$.
Define
$(Au)(x) := u''(x) + g(x, u(x), u'(x))$ for all $x \in [a, b]$.
Suppose that $g: [a, b] \times \mathbb{R}^2 \to \mathbb{R}$ is $C^n$, $n \ge 1$. Differentiating $A(u + th)(x)$
at the point $t = 0$, we formally obtain
$[A'(u)h](x) = h''(x) + g_u(x, u(x), u'(x))h(x)$
$+ g_{u'}(x, u(x), u'(x))h'(x)$ for all $x \in [a, b]$. (103)
Show that the operator $A: X \to Y$ is $C^n$ and that $A'(u)$ is given by (103)
for all $u, h \in X$. Compute $A^{(n)}(u)h_1 \cdots h_n$ for $n = 2, 3$.
4.7. Generalizations. Formulate and prove analogous results for the operators related to
$\int_a^b g(x, y, u(y))\,dy$ and $g(x, u'(x), \ldots, u^{(m)}(x))$.
4.8. Nonlinear systems of real equations. Let $X := \mathbb{R}^m$ and $Y := \mathbb{R}^k$, where
$k, m \ge 1$. Define the operator $A: X \to Y$ by $v = Au$ and
$v_j = f_j(u_1, \ldots, u_m)$, $j = 1, \ldots, k$. (104)
Suppose that $f_j: \mathbb{R}^m \to \mathbb{R}$, $j = 1, \ldots, k$, is $C^n$ with $n \ge 1$.
Show that $A: X \to Y$ is $C^n$ and
$A^{(n)}(u)h_1 \cdots h_n = (d^n f_1(u)h_1 \cdots h_n, \ldots, d^n f_k(u)h_1 \cdots h_n)$,
where $d^n f_j$ has been computed in Example 4 of Section 4.2.
In particular, the equation $v = A'(u)w$ corresponds to the linearization
of (104), namely,
$v_j = \sum_{s=1}^{m} \partial_s f_j(u_1, \ldots, u_m)w_s$, $j = 1, \ldots, k$,
where $\partial_s := \partial/\partial u_s$.
Formulate the implicit function theorem, the inverse mapping theorem,
and the rank theorem (Theorem 4.G) in terms of nonlinear systems of real
equations.
Hint: Cf. Zeidler (1986), Vol. 1, Section 4.8 and Problem 4.4b.
4.9. The Gâteaux derivative (G-derivative). Let $A: U(u_0) \subseteq X \to Y$. By
definition, the operator $A$ is G-differentiable at $u_0$ iff there exists a linear
bounded operator $L: X \to Y$ such that
$A(u_0 + th) - A(u_0) = tLh + o(t)$, $t \to 0$,
for all $h \in X$ with $\|h\| \le 1$ and all real numbers $t$ in some neighborhood of
zero. We call the operator $A_G(u_0) := L$ the G-derivative of $A$ at $u_0$.
Show that if $A_G(u)$ exists on some open neighborhood $U$ of $u_0$ and the
map $u \mapsto A_G(u)$ from $U$ to $L(X, Y)$ is continuous at $u_0$, then the F-derivative $A'(u_0)$ exists and $A'(u_0) = A_G(u_0)$.
Hint: Cf. Zeidler (1986), Vol. 1, Section 4.2.
4.10. Symmetry of higher derivatives. Suppose that $A: U \subseteq X \to Y$ is $C^n$,
$n \ge 2$, on the open set $U$. Then, for each $u \in U$, the map
$(h_1, \ldots, h_n) \mapsto A^{(n)}(u)h_1 \cdots h_n$
is symmetric on $X \times \cdots \times X$, that is, $A^{(n)}(u)h_1 \cdots h_n$ remains invariant
under any permutation of $h_1, \ldots, h_n$.
Solution: For example, let $n = 2$. For fixed $h, k \in X$ and fixed functional
$y^* \in Y^*$, introduce the real function
$\phi(t, s) := \langle y^*, A(u + th + sk) \rangle$, $t, s \in \mathbb{R}$,
where $|t|$ and $|s|$ are sufficiently small. Then
$\phi_t(t, s) = \langle y^*, A'(u + th + sk)h \rangle$,
$\phi_{st}(0, 0) = \langle y^*, A''(u)kh \rangle$, $\phi_{ts}(0, 0) = \langle y^*, A''(u)hk \rangle$.
Since $\phi$ is $C^2$ in some neighborhood of $(0, 0)$, we obtain $\phi_{ts}(0, 0) = \phi_{st}(0, 0)$
by a well-known classical result. Hence
$A''(u)hk = A''(u)kh$ for all $h, k \in X$, $u \in U$,
since $y^* \in Y^*$ is arbitrary.
4.11. Partial F-derivatives. Let the operator
$A: U \subseteq X_1 \times \cdots \times X_m \to Y$
be given, where $U$ is open and $m \ge 1$. Let $n \ge 1$.
Show that $A$ is $C^n$ on $U$ iff the partial F-derivative $D_jA$ is $C^{n-1}$ on $U$
for all $j = 1, \ldots, m$. In addition, we have
$A'(u)h = \sum_{j=1}^{m} D_jA(u)h_j$ for all $u \in U$, $h \in X_1 \times \cdots \times X_m$, (105)
where $h = (h_1, \ldots, h_m)$.
Solution: Let $m = 2$. The general case proceeds analogously.
Suppose first that $A$ is $C^n$ on $U$. Then recall that (105) has been proved
in Section 4.2. Since
$D_1A(u)h = A'(u)(h, 0)$ and $D_2A(u)h = A'(u)(0, h)$,
$D_1A$ and $D_2A$ are $C^{n-1}$ on $U$.
Conversely, suppose that $D_1A$ and $D_2A$ are $C^{n-1}$ on $U$. We first prove
(105). Set $u := (v, w)$ and $h := (\alpha, \beta)$, where $v, \alpha \in X_1$ and $w, \beta \in X_2$.
Also set $A_v := D_1A$ and $A_w := D_2A$. By the triangle inequality,
$\|A(v + \alpha, w + \beta) - A(v, w) - A_v(v, w)\alpha - A_w(v, w)\beta\|$
$\le \|A(v + \alpha, w + \beta) - A(v, w + \beta) - A_v(v, w + \beta)\alpha\|$
$+ \|A_v(v, w + \beta)\alpha - A_v(v, w)\alpha\|$
$+ \|A(v, w + \beta) - A(v, w) - A_w(v, w)\beta\|$.
It follows from the Taylor theorem in Section 4.5 that the right side can be
bounded by
$\sup_{0 \le \tau \le 1} \|A_v(v + \tau\alpha, w + \beta) - A_v(v, w + \beta)\| \, \|\alpha\|$
$+ \|A_v(v, w + \beta) - A_v(v, w)\| \, \|\alpha\|$
$+ \sup_{0 \le \tau \le 1} \|A_w(v, w + \tau\beta) - A_w(v, w)\| \, \|\beta\|$.
By the continuity of $A_v$ and $A_w$ at the point $(v, w)$, this last expression can
be bounded by $r(\alpha, \beta)(\|\alpha\| + \|\beta\|)$ with $r(\alpha, \beta) \to 0$ as $(\alpha, \beta) \to 0$. This
proves (105).
Set $P_j(h_1, h_2) := h_j$, $j = 1, 2$. Then (105) reads as
$A'(u) = D_1A(u) \circ P_1 + D_2A(u) \circ P_2$.
The operator $P_j: X_1 \times X_2 \to X_j$ is linear and continuous. Since $D_jA$ is
$C^{n-1}$ on $U$, the operator $A$ is $C^n$ on $U$, by Problem 4.2.
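Formula (105) for $m = 2$ can be illustrated with real variables. The Python sketch below rests on assumptions made only for the example: the sample function `A(v, w) = v**2 * w + sin(w)` on $\mathbb{R} \times \mathbb{R}$, its hand-computed partial derivatives `D1` and `D2`, and the helper names. It evaluates the remainder $A(v + \alpha, w + \beta) - A(v, w) - D_1A(v, w)\alpha - D_2A(v, w)\beta$, which should be of second order in $(\alpha, \beta)$.

```python
import math


def A(v, w):
    # Assumed sample function of two real variables.
    return v * v * w + math.sin(w)


def D1(v, w):
    # Partial derivative with respect to the first variable.
    return 2 * v * w


def D2(v, w):
    # Partial derivative with respect to the second variable.
    return v * v + math.cos(w)


def total_derivative(v, w, alpha, beta):
    # Right-hand side of (105) for m = 2.
    return D1(v, w) * alpha + D2(v, w) * beta


def remainder(v, w, alpha, beta):
    # |A(v + alpha, w + beta) - A(v, w) - A'(v, w)(alpha, beta)|
    return abs(A(v + alpha, w + beta) - A(v, w) - total_derivative(v, w, alpha, beta))
```

Reducing the increment `(alpha, beta)` by a factor of 10 should reduce `remainder` by roughly a factor of 100.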
4.12.* The global inverse mapping theorem. Let $f: X \to Y$ be a $C^1$-map,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Then the following two conditions are equivalent:
(i) $f: X \to Y$ is a $C^1$-diffeomorphism.
(ii) $f'(u): X \to Y$ is bijective for all $u \in X$, and $f$ is proper (i.e., the
compactness of $C$ in $Y$ implies the compactness of $f^{-1}(C)$).
Study the proof in Berger (1977), p. 221.
4.13. A simple proof for a variant of the global inverse mapping theorem via
the continuation method. Let $f: X \to Y$ be a proper $C^r$-map, $1 \le r \le \infty$,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Suppose that
$f'(u): X \to Y$ is bijective for all $u \in X$.
Show that
(a) $f: X \to Y$ is surjective.
(b) If $f: X \to Y$ is injective, then $f$ is a $C^r$-diffeomorphism.
Hint: For $t \in \mathbb{R}$, use the equation
$f(u(t)) = tv$. (106)
Solution: Ad (a): Let $v \in Y$. Without loss of generality, we may assume
that $f(0) = 0$. By the local inverse mapping theorem (Theorem 4.F), there
exists a number $t_0 > 0$ such that equation (106) has a solution $u(t)$ for
each $t \in [0, t_0]$. Let $T$ be the supremum of all the numbers $t_0$. If $T = \infty$,
then equation (106) has a solution for $t = 1$ and we are done.
We show that $T = \infty$. On the contrary, suppose that $T < \infty$. Let $(t_n)$
be a sequence such that $t_n \to T$ as $n \to \infty$ and $t_n < T$ for all $n$. By (106),
$f(u(t_n)) = t_n v$ for all $n$.
Since $f$ is proper, there exists a subsequence, again denoted by $(u(t_n))$,
such that
$u(t_n) \to w$ as $n \to \infty$
for some $w$. Hence $f(w) = Tv$. By the local inverse mapping theorem (Theorem 4.F), $f$ is a local $C^1$-diffeomorphism at the point $w$. Hence equation
(106) has a unique solution for all $t$ in an open neighborhood of $T$ in $\mathbb{R}$.
This contradicts the maximality of $T$.
Ad (b). Since $f: X \to Y$ is bijective, the inverse map $f^{-1}: Y \to X$ exists.
The local inverse mapping theorem (Theorem 4.F) tells us that $f^{-1}$ is $C^r$.
Hence $f: X \to Y$ is a $C^r$-diffeomorphism. □
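The continuation idea behind equation (106) also works as a numerical method. The Python sketch below is a hedged illustration: the scalar map $f(u) = u + u^3$ (which is proper, with $f(0) = 0$ and $f'(u) > 0$ for all $u$), the step count, and the inner Newton correction are assumptions chosen for the example, not part of the solution above. Starting from $u(0) = 0$, the parameter $t$ is increased from 0 to 1 in small steps, and each solution serves as the starting point for the next step.

```python
def f(u):
    # Assumed sample map: proper, with f(0) = 0 and f'(u) >= 1 everywhere.
    return u + u ** 3


def f_prime(u):
    return 1.0 + 3.0 * u ** 2


def continuation(v, steps=20, newton_iters=20):
    # Follow the solution u(t) of f(u(t)) = t v from t = 0 to t = 1.
    u = 0.0  # f(0) = 0 solves (106) for t = 0
    for i in range(1, steps + 1):
        t = i / steps
        # Newton correction started from the solution of the previous step.
        for _ in range(newton_iters):
            u -= (f(u) - t * v) / f_prime(u)
    return u
```

For `v = 10.0` the equation $u + u^3 = 10$ has the solution $u = 2$, which `continuation(10.0)` should reproduce up to rounding.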
4.14. Algebraic operations for multilinear forms. Let $X$ be a linear space
over $\mathbb{K}$. For $n \ge 1$, denote the set of all $n$-linear forms
$A: X \times \cdots \times X \to \mathbb{K}$
by $M^n(X)$. Set $M^0(X) := \mathbb{K}$.
4.14a. The tensor algebra of multilinear forms. Let $m, n \ge 1$, and let
$A \in M^m(X)$ and $B \in M^n(X)$. Define the tensor product $A \otimes B$ through
$(A \otimes B)(u_1, \ldots, u_{m+n}) := A(u_1, \ldots, u_m)B(u_{m+1}, \ldots, u_{m+n})$
for all $u_1, \ldots, u_{m+n} \in X$. If $\alpha, \beta \in \mathbb{K}$ and $A \in M^n(X)$, $n \ge 1$, then define
$\alpha \otimes A = A \otimes \alpha := \alpha A$ and $\alpha \otimes \beta := \alpha\beta$.
Let $m, n, r \ge 0$. Show that for all $A \in M^m(X)$, $B \in M^n(X)$, and $C \in M^r(X)$, the following conditions are met:
(i) $A \otimes B \in M^{m+n}(X)$;
(ii) $A \otimes (B \otimes C) = (A \otimes B) \otimes C$;
(iii) $A \otimes (B + C) = (A \otimes B) + (A \otimes C)$ and $(B + C) \otimes A = (B \otimes A) + (C \otimes A)$.
Naturally enough, suppose that $n = r$ in (iii).
Definition. Let $\otimes M(X)$ denote the set of all finite sums of multilinear
forms over $X$, that is,
$A_0 + A_1 + \cdots + A_k$,
where $A_m \in M^m(X)$ for all $m$. Then, $\otimes M(X)$ is a linear space over $\mathbb{K}$.
Moreover, $\otimes M(X)$ becomes an algebra with respect to the operations "+"
and "$\otimes$".
4.14b. The Grassmann algebra of antisymmetric multilinear forms. Let
$m \ge 1$. The set of all antisymmetric $m$-linear forms $A \in M^m(X)$ is denoted
by $\Lambda^m(X)$. Thus, $A \in \Lambda^m(X)$ means that
$A(u_{\pi(1)}, \ldots, u_{\pi(m)}) = (\operatorname{sgn} \pi)A(u_1, \ldots, u_m)$,
for all $u_1, \ldots, u_m$, where $\pi$ is a permutation of $1, \ldots, m$, and $\operatorname{sgn} \pi$ denotes
the sign of $\pi$.
Let $m, n \ge 1$. For $A \in \Lambda^m(X)$ and $B \in \Lambda^n(X)$, define the exterior
product $A \wedge B$ through
$(A \wedge B)(u_1, \ldots, u_{m+n})$
$:= \sum_{\pi} (\operatorname{sgn} \pi)A(u_{\pi(1)}, \ldots, u_{\pi(m)})B(u_{\pi(m+1)}, \ldots, u_{\pi(m+n)})$
for all $u_1, \ldots, u_{m+n} \in X$. Here, we sum over all permutations $\pi$ of $1, \ldots, m + n$ that have the following additional property:
$\pi(1) < \cdots < \pi(m)$ and $\pi(m + 1) < \cdots < \pi(m + n)$.
For $\alpha, \beta \in \mathbb{K}$ and $A \in \Lambda^n(X)$, $n \ge 1$, define
$\alpha \wedge A = A \wedge \alpha := \alpha A$ and $\alpha \wedge \beta := \alpha\beta$.
Let $m, n, r \ge 0$. Show that for all $A \in \Lambda^m(X)$, $B \in \Lambda^n(X)$, and $C \in \Lambda^r(X)$, the following conditions are met:
(i) $A \wedge B \in \Lambda^{m+n}(X)$;
(ii) $A \wedge (B \wedge C) = (A \wedge B) \wedge C$;
(iii) $A \wedge B = (-1)^{mn} B \wedge A$ (supercommutativity);
(iv) $A \wedge (B + C) = (A \wedge B) + (A \wedge C)$ and $(B + C) \wedge A = (B \wedge A) + (C \wedge A)$.
Naturally enough, suppose that $n = r$ in (iv).
Definition. Let $\wedge\Lambda(X)$ denote the set of all finite sums of antisymmetric
multilinear forms over $X$, that is,
$A_0 + A_1 + \cdots + A_k$,
where $A_m \in \Lambda^m(X)$ for all $m$. Then, $\wedge\Lambda(X)$ is a linear space over $\mathbb{K}$. Moreover, $\wedge\Lambda(X)$ becomes an algebra with respect to the operations "+" and
"$\wedge$".
4.14c. The relation to determinants. Show that if $a, b, c \in X^T$, then
$(a \wedge b)(u, v) = \begin{vmatrix} a(u) & a(v) \\ b(u) & b(v) \end{vmatrix}$
and
$((a \wedge b) \wedge c)(u, v, w) = \begin{vmatrix} a(u) & a(v) & a(w) \\ b(u) & b(v) & b(w) \\ c(u) & c(v) & c(w) \end{vmatrix}$
for all $u, v, w \in X$. Generally, if $a_1, \ldots, a_n \in X^T$, then
$(a_1 \wedge \cdots \wedge a_n)(u_1, \ldots, u_n) = \det(a_i(u_j))_{i,j=1,\ldots,n}$
for all $u_1, \ldots, u_n \in X$.
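The determinant formulas of Problem 4.14c can be tested on $X = \mathbb{R}^3$ before proving them. The Python sketch below is an illustration only: it represents forms as callables, and the names `sign`, `wedge`, `covector`, and `det3` are hypothetical helpers introduced for the example. The exterior product is computed as the sum from Problem 4.14b, that is, over all permutations $\pi$ with $\pi(1) < \cdots < \pi(m)$ and $\pi(m+1) < \cdots < \pi(m+n)$.

```python
from itertools import combinations


def sign(perm):
    # Sign of a permutation given as a tuple of indices (parity of inversions).
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def wedge(A, m, B, n):
    # Exterior product of an m-form A and an n-form B, given as callables.
    def AB(*u):
        total = 0.0
        idx = range(m + n)
        for first in combinations(idx, m):
            # first is increasing; the remaining indices are increasing as well.
            rest = tuple(i for i in idx if i not in first)
            perm = first + rest
            total += (
                sign(perm)
                * A(*(u[i] for i in first))
                * B(*(u[i] for i in rest))
            )
        return total

    return AB


def covector(coeffs):
    # Linear functional u -> sum_i coeffs[i] * u[i].
    return lambda u: sum(c * x for c, x in zip(coeffs, u))


def det3(r):
    # Determinant of a 3x3 matrix given as a list of rows.
    return (
        r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
        - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
        + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0])
    )
```

For three covectors `a`, `b`, `c` and three vectors `u`, `v`, `w`, the value `wedge(wedge(a, 1, b, 1), 2, c, 1)(u, v, w)` should equal the determinant of the matrix with rows `(a(u), a(v), a(w))`, `(b(u), b(v), b(w))`, `(c(u), c(v), c(w))`.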
4.14d. The Grassmann algebra $\wedge(X)$ of a linear space $X$. Let $X$ be a
linear space over $\mathbb{K}$. We are given $u \in X$. Define
$u(u^*) := u^*(u)$ for all $u^* \in X^T$.
This way, we regard $u$ as an element of $\Lambda^1(X^T)$. By definition, the Grassmann algebra $\wedge(X)$ of the linear space $X$ consists of all finite sums of all
possible finite $\wedge$-products, that is,
for all $u_j \in X$. Obviously, $\wedge(X)$ is a linear space over $\mathbb{K}$, and $\wedge(X)$ is an
algebra with respect to the operations "+" and "$\wedge$".
5
Fredholm Operators
It came as a complete surprise, when, in a short note published in
1900, the Swedish mathematician Ivar Fredholm (1866-1927) showed
that the general theory of all integral equations considered prior to
him was, in fact, extremely simple.
Jean Dieudonné (1981)
The purpose of this note is to introduce a nonlinear version of Fred-
holm operators, and to prove that in this context Sard's theorem
holds if zero measure is replaced by first category.
Steve Smale (1965)
Before you generalize, formalize, and axiomatize there must be math-
ematical substance.
Hermann Weyl (1885-1955)
Another characteristic of mathematical thought is that it can have
no success where it cannot generalize.
Charles Sanders Peirce (1839-1914)
Let us first consider the linear operator equation
$Au = b$, $u \in X$. (1)
It is quite natural to look for a class of linear operators that have the
following properties:
(i) Equation (1) has a solution u iff a finite number of solvability condi-
tions is satisfied for b.
(ii) The general solution of (1) depends on a finite number of parameters.
The class of linear Fredholm operators satisfies conditions (i) and (ii).
The index of a linear Fredholm operator A is defined through
ind A := dim N(A) - codim R(A).
Large classes of linear differential and integral operators represent Fredholm
operators in appropriate function spaces. In particular, if the operator A is
Fredholm of index zero, then the following fundamental principle holds for
equation (1):
Uniqueness implies existence.
For example, the Riesz-Schauder theory tells us that the operator
\[ I + C\colon X \to X \tag{2} \]
is Fredholm of index zero on the Banach space $X$ provided the linear operator
$C\colon X \to X$ is compact. This generalizes the classic Fredholm theory
for integral equations (cf. Section 5.3).
Now suppose that the operator A from equation (1) is nonlinear. Then
it is quite natural to look for a class of nonlinear operators that possess the
following property:
For "most" right-hand sides $b$, the equation $Au = b$, $u \in X$, has at most
a finite number of solutions.
This leads us to the class of nonlinear Fredholm operators introduced by
Smale in 1965. For example, each operator
\[ B + C\colon X \to X \]
is a nonlinear Fredholm operator of index zero on the Banach space $X$
provided
(a) the operator $B\colon X \to X$ is linear, continuous, and bijective, and
(b) the operator $C\colon X \to X$ is compact and $C^1$.
[FIGURE 5.1. Diagram of interrelations among: the exact duality functor; the closed range theorem; the implicit function theorem; the Fredholm alternative; the Riesz-Schauder theory (I + linear compact operator); the product index theorem; the parametrix; dual pairs; nonlinear Fredholm operators; stability of Fredholm operators under perturbations; bifurcation theory (nonlinear operators); Fredholm integral equations; differential equations; pseudo-differential operators; the Atiyah-Singer index theorem.]
The index of a linear Fredholm operator A plays a fundamental role, since
it is invariant under appropriate perturbations of A (cf. Section 5.8). One
of the most important achievements of twentieth-century mathematics is
represented by the famous Atiyah-Singer index theorem.¹
Roughly speaking, this index theorem tells us that the index of elliptic-type
differential operators and certain classes of pseudo-differential operators
on compact manifolds depends only on the topology of the manifold.
This way we obtain a deep relation between analysis and topology which
has its roots in the ingenious work of Riemann in the middle of the nineteenth
century. Pseudo-differential operators² generalize both differential
and integral operators.
¹As an introduction to the Atiyah-Singer index theorem we recommend Zeidler
(1995), Gilkey (1984), Booss and Bleecker (1985), and Cycon et al. (1986)
(supersymmetric approach).
Figure 5.1 displays important interrelations. For the convenience of the
reader, we start with the Riesz-Schauder theory on Hilbert spaces, which
is based on a simple variant of the closed range theorem.
5.1 Duality for Linear Compact Operators
Theorem 5.A. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$.
If the linear operator $A\colon X \to Y$ is compact, then so is the dual operator
$A^T\colon Y^* \to X^*$.
Schauder proved this theorem in 1930.
Proof. Let $f, f_j \in Y^*$. Then, for all $u, u_i \in X$, we get the following key
inequality:
\[ |f(Au) - f_j(Au)| \le |f(Au) - f(Au_i)| + |f(Au_i) - f_j(Au_i)| + |f_j(Au_i) - f_j(Au)| \]
\[ \le \|f\|\,\|Au - Au_i\| + |f(Au_i) - f_j(Au_i)| + \|f_j\|\,\|Au - Au_i\|. \tag{3} \]
In the following we will critically use the fact that a subset of a Banach
space is relatively compact iff it has a finite $\varepsilon$-net for each $\varepsilon > 0$ (cf.
Proposition 10 in Section 1.11 of AMS Vol. 108).
Let $B^*$ be a bounded set in $Y^*$. We have to show that the set $A^T(B^*)$
is relatively compact.
To this end, fix $\varepsilon > 0$. Let $B$ denote the closed unit ball in $X$. Since the
operator $A$ is compact, the set $A(B)$ is relatively compact, and hence $A(B)$
has a finite $\varepsilon$-net. That is, there are points $u_1, \ldots, u_N \in B$ such that
\[ \min_{1 \le i \le N} \|Au - Au_i\| \le \varepsilon \quad \text{for all } u \in B. \]
Since the set $B^*$ is bounded, we obtain from
\[ |f(Au_i)| \le \|f\|\,\|Au_i\| \]
that the set
\[ F := \{(f(Au_1), \ldots, f(Au_N)) : f \in B^*\} \]
is bounded in the finite-dimensional Banach space $\mathbb{K}^N$, and hence $F$ is
relatively compact, that is, $F$ has a finite $\varepsilon$-net. Thus, there exist points
$f_1, \ldots, f_M \in B^*$ such that
for all $i$.
²The modern theory of pseudo-differential operators can be found in Hörmander (1983).
It follows now from (3) that, for all $u \in B$ and $f \in B^*$,
\[ |f(Au) - f_j(Au)| \le (\|f\| + \|f_j\|)\,\|Au - Au_i\| + |f(Au_i) - f_j(Au_i)|, \]
and hence
\[ \min_{1 \le j \le M} |f(Au) - f_j(Au)| \le \mathrm{const} \cdot \varepsilon + \varepsilon. \]
Finally, observe that $A^T(f - f_j) \in X^*$, and hence
\[ \|A^T f - A^T f_j\| = \sup_{u \in B} |\langle A^T(f - f_j), u \rangle| = \sup_{u \in B} |f(Au) - f_j(Au)|, \]
by the definition of the dual operator $A^T$. This implies
\[ \min_{1 \le j \le M} \|A^T f - A^T f_j\| \le \mathrm{const} \cdot \varepsilon + \varepsilon \quad \text{for all } f \in B^*. \tag{4} \]
Varying the number $\varepsilon > 0$, relation (4) tells us that, for each $\eta > 0$, the set
$A^T(B^*)$ has a finite $\eta$-net, that is, $A^T(B^*)$ is relatively compact. □
Corollary 1. Let $X$ be a Hilbert space over $\mathbb{K}$. If the linear operator $A\colon X \to X$
is compact, then so is the adjoint operator $A^*\colon X \to X$.
Proof. The duality map $J\colon X \to X^*$ is a homeomorphism with $\|Ju\| = \|u\|$
for all $u \in X$, and
\[ A^* = J^{-1} A^T J. \]
Since $A^T$ is compact, the compactness of $A^*$ follows from Proposition 4
ahead. □
Proposition 2 (Sums). Let the operators
\[ A, B\colon X \to Y \]
be compact, where $X$ and $Y$ are normed spaces over $\mathbb{K}$.
Then the sum $A + B\colon X \to Y$ is also compact.
Proof. Let $(u_n)$ be a bounded sequence in $X$. Since $A$ is compact, there
exists a subsequence $(u_{n'})$ such that $(Au_{n'})$ is convergent. Furthermore,
since $B$ is compact, there exists a subsequence $(u_{n''})$ of $(u_{n'})$ such that
$(Bu_{n''})$ is convergent. Hence $(Au_{n''} + Bu_{n''})$ is convergent. □
Definition 3. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. Then the operator
$A\colon D(A) \subseteq X \to Y$ is called bounded if it maps bounded sets into bounded
sets.
For example, each linear continuous operator $A\colon X \to Y$ is bounded,
since $\|Au\| \le \|A\|\,\|u\|$ for all $u \in X$.
Proposition 4 (Products). The operators
\[ BC \quad \text{and} \quad CE \]
are compact provided the following hold:
(i) $X, Y, V, W$ are normed spaces over $\mathbb{K}$.
(ii) $C\colon X \to Y$ is compact.
(iii) $E\colon V \to X$ is continuous and bounded, and
(iv) $B\colon Y \to W$ is continuous.
Proof. Ad $CE\colon V \to Y$. If $M$ is a bounded set in $V$, then so is $E(M)$, and
hence the set $C(E(M))$ is relatively compact.
Ad $BC\colon X \to W$. If $N$ is a bounded set in $X$, then $C(N)$ is relatively
compact. Since $u_n \to u$ as $n \to \infty$ implies $Bu_n \to Bu$, the set $B(C(N))$ is
also relatively compact. □
Proposition 5 (Finite rank). Let $A\colon X \to Y$ be a bounded continuous
operator with
\[ \dim R(A) < \infty, \]
where $X$ and $Y$ are normed spaces over $\mathbb{K}$. Then $A$ is compact.
Proof. If $M$ is a bounded subset of $X$, then $A(M)$ is a bounded set in the
finite-dimensional normed space $R(A)$. Hence $A(M)$ is relatively compact. □
5.2 The Riesz-Schauder Theory on Hilbert Spaces
We consider the operator equation
\[ Bu + Cu = b, \quad u \in X, \tag{5} \]
along with the dual equation
\[ B^*v + C^*v = b^*, \quad v \in X. \tag{5*} \]
By definition, the homogeneous original equation and the homogeneous
dual equation correspond to (5) and (5*) with $b = 0$ and $b^* = 0$, respectively.
Theorem 5.B. Suppose that
(i) The operator $B\colon X \to X$ is linear, continuous, and bijective on the
Hilbert space $X$ over $\mathbb{K}$ (e.g., $B = I$).
(ii) The operator $C\colon X \to X$ is linear and compact.
Then the following properties are met:
(a) Original problem. For given $b \in X$, equation (5) has a solution $u \in X$
iff $b$ satisfies the solvability condition
\[ (b \mid v) = 0 \]
for all solutions $v$ of the homogeneous dual equation (5*).
(b) Finiteness. The homogeneous dual equation (5*) and the homogeneous
original equation (5) have the same finite number of linearly independent
solutions.
(c) Well-posedness. If $Bu + Cu = 0$ implies $u = 0$, then the original
equation (5) has a unique solution $u$ for each given $b \in X$.
Moreover, the solution $u$ depends continuously on $b$, that is, the inverse
operator
\[ (B + C)^{-1}\colon X \to X \]
is continuous.
(d) Dual equation. For given $b^* \in X$, equation (5*) has a solution $v$ iff
$b^*$ satisfies the solvability condition
\[ (b^* \mid u) = 0 \]
for all solutions $u$ of the homogeneous original equation (5).
In terms of Section 5.4 ahead, this theorem tells us that the operator
$B + C$ is Fredholm of index zero. Statement (c) says that
Uniqueness implies existence.
Let $u_1, \ldots, u_n$ and $v_1, \ldots, v_n$ be a maximal number of linearly independent
solutions of the homogeneous original equation (5) and the homogeneous
dual equation (5*), respectively. Then, for given $b \in X$, the original problem
(5) has a solution $u$ iff
\[ (b \mid v_j) = 0 \quad \text{for } j = 1, \ldots, n, \]
and the general solution of (5) is given by
\[ u = u_0 + \sum_{j=1}^{n} \alpha_j u_j, \]
where $u_0$ is a special solution of (5), and $\alpha_1, \ldots, \alpha_n$ are arbitrary numbers
from $\mathbb{K}$.
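The statements of Theorem 5.B can be checked by hand in the simplest setting. The following is a minimal sketch, assuming $X = \mathbb{Q}^2$ with the Euclidean inner product, $B = I$, and $C = S - I$ for a singular, nonsymmetric matrix $S$ (every linear map on a finite-dimensional space is compact). The helper names `rref`, `null_space`, `is_solvable`, and `satisfies_conditions` are illustrative and do not come from the text.

```python
from fractions import Fraction


def rref(matrix):
    """Return (reduced row echelon form, pivot columns), computed exactly over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
        if r == len(rows):
            break
    return rows, pivots


def null_space(matrix):
    """Basis of the null space: one basis vector per free (non-pivot) column."""
    rows, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def is_solvable(matrix, b):
    """S u = b has a solution iff rank(S) equals rank of the augmented matrix [S | b]."""
    augmented = [list(row) + [bi] for row, bi in zip(matrix, b)]
    return len(rref(matrix)[1]) == len(rref(augmented)[1])


def satisfies_conditions(matrix, b):
    """Solvability conditions (b | v) = 0 for a basis v of the homogeneous dual equation."""
    return all(sum(bi * vi for bi, vi in zip(b, v)) == 0
               for v in null_space(transpose(matrix)))


S = [[1, 1],
     [2, 2]]  # S = B + C with B = I; singular and not symmetric
for b in ([1, 2], [1, 0]):
    print(b, is_solvable(S, b), satisfies_conditions(S, b))
```

For this $S$, the homogeneous equation and its dual each have exactly one linearly independent solution, in line with (b), and the two Boolean columns printed by the loop agree for each $b$, in line with (a).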
Moreover, we will show in Section 5.5 that this theorem remains true for
Banach spaces $X$ provided we replace the adjoint operators
\[ B^*, C^*\colon X \to X \quad \text{and the inner product } (\cdot \mid \cdot) \]
with the dual operators
\[ B^T, C^T\colon X^* \to X^* \quad \text{and the symbol } \langle \cdot, \cdot \rangle, \]
respectively. Recall that $\langle b^*, u \rangle = b^*(u)$ for all $b^* \in X^*$ and $u \in X$.
Theorem 5.B represents a Fredholm alternative. Such an alternative for
integral equations was first proved by Fredholm in 1900. The generalization
to Banach spaces dates back to papers by Riesz in 1918 and Schauder in
1930.
Proof. We set
\[ S := B + C, \quad N := N(S), \quad R := R(S). \]
Obviously, the null space $N$ of the operator $S$ is closed, since $S$ is continuous.
We shall show ahead that the range $R$ of $S$ is closed. Therefore, by
Section 2.9 in AMS Vol. 108, we have the orthogonal direct sums
\[ X = N \oplus N^\perp \quad \text{and} \quad X = R \oplus R^\perp. \]
We also introduce the index of $S$ by
\[ \operatorname{ind} S := \dim N - \operatorname{codim} R. \]
Step 1: We show that $\dim N < \infty$. Let $(u_n)$ be a bounded sequence in $N$
(i.e., $Bu_n + Cu_n = 0$ for all $n$). Since $C$ is compact, there is a subsequence,
again denoted by $(u_n)$, such that $Cu_n \to w$ as $n \to \infty$. Hence $u_n \to v$ as
$n \to \infty$, where $v := -B^{-1}w$. This implies $Bv + Cv = 0$ (i.e., $v \in N$).
Thus, the closed unit ball in the closed subspace $N$ of $X$ is compact. By
Section 2.3, this implies $\dim N < \infty$.
Step 2: We show that $\dim N(S^*) < \infty$. Note that $S^* = B^* + C^*$. Since
$C$ is compact, so is $C^*$ by Corollary 1 in Section 5.1. Furthermore, $(B^*)^{-1} =
(B^{-1})^*$, by the proof of Proposition 13 in Section 5.2 in AMS Vol. 108. The
same argument as in Step 1 yields $\dim N(S^*) < \infty$.
Step 3: We show that the range $R(S)$ is closed. By Theorem 3.E(iii) in
Section 3.12, it is sufficient to prove that
\[ c \cdot \operatorname{dist}(u, N) \le \|Su\| \quad \text{for all } u \in X \text{ and fixed } c > 0. \]
If this is not true, then there exists a sequence $(u_n)$ such that
\[ Su_n \to 0 \quad \text{as } n \to \infty \tag{6} \]
and $\operatorname{dist}(u_n, N) = 1$ for all $n$. As in Step 1, there exists a subsequence, again
denoted by $(u_n)$, such that $u_n \to v$ as $n \to \infty$ and $v \in N$, contradicting
$\operatorname{dist}(u_n, N) = 1$ for all $n$.
Step 4: Special case of the closed range theorem. We show that
\[ R = N(S^*)^\perp, \tag{7} \]
where $\perp$ denotes the orthogonal complement. In fact, it follows from
\[ (Su \mid v) = (u \mid S^*v) \quad \text{for all } u, v \in X \]
that $N(S^*) \subseteq R^\perp$ and $R^\perp \subseteq N(S^*)$. Hence $R^\perp = N(S^*)$, that is,
\[ X = R \oplus N(S^*), \]
since $R$ is closed. This yields (7).
It follows from (7) that
\[ \operatorname{codim} R = \dim N(S^*) < \infty. \]
Step 5: The fundamental stability of the index against small perturbations.
We show that if the operator $T\colon X \to X$ is linear and continuous,
and if $\|T - S\|$ is sufficiently small, then
\[ \operatorname{ind} S = \operatorname{ind} T. \]
The decisive trick consists in constructing the operator
\[ \hat{T}(u, v) := Tu + v \quad \text{for all } u \in N^\perp,\ v \in R^\perp. \]
Then the operator $\hat{T}\colon N^\perp \times R^\perp \to X$ is linear and continuous.
Moreover, if $T = S$, then the operator $\hat{S}$ is bijective. This is the key observation.
In fact, the operator $\hat{S}$ is surjective, by construction.³ Furthermore,
$\hat{S}$ is injective. To see this, assume $\hat{S}(u, v) = 0$. Then,
\[ Su + v = 0, \]
and hence $v = 0$, $Su = 0$. From $Su = 0$ and $u \in N^\perp$ we get $u = 0$.
If $\|\hat{T} - \hat{S}\|$ is sufficiently small, then the operator $\hat{T}$ is also bijective,
by Proposition 7 in Section 1.23 of AMS Vol. 108. Thus, if $\|T - S\|$ is
sufficiently small, then $\hat{T}$ is bijective, that is, we obtain the direct sum⁴
\[ X = T(N^\perp) \oplus R^\perp, \tag{8} \]
by definition of $\hat{T}$. Moreover, we get
\[ N(T) \subseteq N. \tag{9} \]
In fact, $Tu = 0$ along with $u \in N^\perp$ implies $\hat{T}(u, 0) = 0$, and hence $u = 0$.
Therefore, $N(T) \subseteq N$. By (9),
\[ \dim N(T) \le \dim N < \infty. \]
The proof follows now from the two relations (8) and (9) in a simple
manner. By (8),
\[ \operatorname{codim} T(N^\perp) = \dim R^\perp = \operatorname{codim} R. \tag{10} \]
Let us choose a linear subspace $M$ of $N$ such that we get the direct sum
\[ N = N(T) \oplus M. \tag{11} \]
Then, $X = (N(T) \oplus M) \oplus N^\perp = N(T) \oplus (M \oplus N^\perp)$. Hence $T$ is injective
on $M \oplus N^\perp$. This implies
\[ R(T) = T(M) \oplus T(N^\perp) \quad \text{and} \quad \dim T(M) = \dim M, \]
that is,
\[ \operatorname{codim} T(N^\perp) = \operatorname{codim} R(T) + \dim M. \tag{12} \]
From this we immediately obtain the following:
\[ \operatorname{codim} R = \operatorname{codim} T(N^\perp) = \operatorname{codim} R(T) + \dim M \quad \text{(by (10) and (12))}, \]
\[ \dim N = \dim N(T) + \dim M \quad \text{(by (11))}. \]
Hence $\operatorname{ind} S = \dim N - \operatorname{codim} R = \dim N(T) - \operatorname{codim} R(T) = \operatorname{ind} T$.
³Observe that $S(N^\perp) = R(S) = R$.
⁴Note that $Tu_1 + v_1 = Tu_2 + v_2$ implies $u_1 = u_2$ and $v_1 = v_2$.
Step 6: We show that
\[ \operatorname{ind}(B + C) = 0. \]
In fact, the function $t \mapsto B + tC$ is continuous from $[0,1]$ to $L(X, X)$.
By Step 5, $\operatorname{ind}(B + tC) = \mathrm{const}$ for all $t \in [0,1]$. Since the operator $B$ is
bijective, we get $\dim N(B) = 0$ and $\operatorname{codim} R(B) = 0$, and hence
\[ \operatorname{ind} B = \dim N(B) - \operatorname{codim} R(B) = 0. \]
Ad (a). This follows from $R = N(S^*)^\perp$ in Step 4.
Ad (b). By Step 4, $\dim N(S^*) = \operatorname{codim} R$. Hence
\[ \dim N(S^*) = \dim N - \operatorname{ind}(B + C) = \dim N < \infty. \]
Ad (c). If $Bu + Cu = 0$ implies $u = 0$, then $B + C$ is injective. Moreover,
$\dim N = 0$ implies $\dim N(S^*) = 0$, and hence $\operatorname{codim} R = 0$ (i.e., $B + C$ is
surjective). By the continuous inverse theorem from Section 3.5, the inverse
operator $(B + C)^{-1}\colon X \to X$ is continuous.
Ad (d). Observe that $(S^*)^* = S$. Therefore, the dual equation to (5*) is
given by (5). Consequently, statement (d) follows immediately from (a). □
This proof has been chosen in such a way that it can be generalized immediately
to more general situations as appear ahead (the Riesz-Schauder
theory on Banach spaces and the perturbation theory for Fredholm operators).
5.3 Applications to Integral Equations
Parallel to Section 4.4 in AMS Vol. 108, let us consider the integral equation
\[ \int_a^b A(x, y)u(y)\,dy - \lambda u(x) = h(x), \quad a \le x \le b, \tag{13} \]
along with the dual integral equation
\[ \int_a^b A(y, x)v(y)\,dy - \lambda v(x) = 0, \quad a \le x \le b, \tag{13*} \]
where $-\infty < a < b < \infty$. In contrast to Proposition 4 in Section 4.4 in
AMS Vol. 108 we do not assume that the kernel $A$ is symmetric. The real
number $\lambda$ is called an eigenvalue of (13) iff the homogeneous equation (13)
with $h \equiv 0$ has a nontrivial solution $u \not\equiv 0$ on $[a, b]$.
Proposition 1. Assume that the function $A\colon [a, b] \times [a, b] \to \mathbb{R}$ is continuous.
Let the function $h \in L_2(a, b)$ and the real number $\lambda \ne 0$ be given.
Then the following statements hold true:
(i) If $\lambda$ is not an eigenvalue of (13), then (13) has a unique solution
$u \in L_2(a, b)$.
(ii) If $\lambda$ is an eigenvalue of (13), then $\lambda$ is also an eigenvalue of (13*)
with the same multiplicity, and (13) has a solution $u \in L_2(a, b)$ iff
\[ \int_a^b h(x)v(x)\,dx = 0 \]
for all eigensolutions $v$ of (13*).
Corollary 2. The eigensolutions of (13) and (13*) are continuous.
If $h$ is continuous on $[a, b]$, then so is each solution $u$ of (13).
Proof. Let $X = L_2(a, b)$, and define the operator $C$ through
\[ (Cu)(x) := \int_a^b A(x, y)u(y)\,dy \quad \text{for all } x \in [a, b]. \]
By Lemma 3 in Section 4.4 of AMS Vol. 108, the operator $C\colon X \to X$ is
linear and compact.
The adjoint operator $C^*\colon X \to X$ is given through
\[ (C^*v)(x) = \int_a^b A(y, x)v(y)\,dy \quad \text{for all } x \in [a, b]. \]
In fact, it follows from the Tonelli theorem (cf. "Iterated Integration" in
the appendix to AMS Vol. 108) that, for all $u, v \in X$,
\[ (Cu \mid v) = \int_a^b \left( \int_a^b A(x, y)u(y)\,dy \right) v(x)\,dx \]
\[ = \int_a^b \left( \int_a^b A(x, y)v(x)\,dx \right) u(y)\,dy = (u \mid C^*v). \]
Now use Theorem 5.B with $B := -\lambda I$. □
Corollary 2 follows from the continuity of parameter integrals (cf. the
appendix to AMS Vol. 108). In fact, if $u \in L_2(a, b)$, then the function $g$
defined by
\[ g(x) := \int_a^b A(x, y)u(y)\,dy \]
is continuous on $[a, b]$.
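As an added illustration of Proposition 1 (not taken from the text), consider the nonsymmetric kernel $A(x, y) := x$ on $[0, 1] \times [0, 1]$, chosen here only because everything can be computed in closed form. The constants $c$ and $d$ below are auxiliary names introduced for this sketch.

```latex
% Equation (13) with a = 0, b = 1, A(x,y) = x:
\[ x \int_0^1 u(y)\,dy - \lambda u(x) = h(x), \qquad 0 \le x \le 1. \]
% Homogeneous original equation (h = 0): with c := \int_0^1 u(y)\,dy we get
\[ u(x) = \frac{c}{\lambda}\,x, \qquad c = \int_0^1 \frac{c}{\lambda}\,y\,dy = \frac{c}{2\lambda}, \]
% so a nontrivial solution exists iff \lambda = 1/2, with eigensolution u(x) = x.
% Homogeneous dual equation (13*): with d := \int_0^1 y\,v(y)\,dy we get
\[ v(x) = \frac{d}{\lambda}, \qquad d = \int_0^1 \frac{d}{\lambda}\,y\,dy = \frac{d}{2\lambda}, \]
% so again \lambda = 1/2, with eigensolution v \equiv 1 (same multiplicity, namely 1).
% Solvability condition for \lambda = 1/2:
\[ \int_0^1 h(x)\,v(x)\,dx = \int_0^1 h(x)\,dx = 0. \]
% Necessity can be seen directly by integrating (13) over [0,1]:
\[ \tfrac{1}{2}c - \tfrac{1}{2}c = \int_0^1 h(x)\,dx. \]
% For \lambda \notin \{0, 1/2\}, integrating (13) gives the unique solution
\[ c = \frac{1}{\tfrac12 - \lambda}\int_0^1 h(x)\,dx, \qquad u(x) = \frac{c\,x - h(x)}{\lambda}. \]
```

Note that the eigensolutions of (13) and (13*) differ ($u(x) = x$ versus $v \equiv 1$), which is exactly what one should expect when the kernel is not symmetric.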
5.4 Linear Fredholm Operators
Definition 1. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. By a linear Fredholm
operator
\[ A\colon X \to Y \]
we understand a linear continuous operator with
\[ \dim N(A) < \infty \quad \text{and} \quad \operatorname{codim} R(A) < \infty. \]
The index of $A$ is defined to be the integer
\[ \operatorname{ind} A := \dim N(A) - \operatorname{codim} R(A). \]
Example 2 (Finite-dimensional operators). Let $X$ and $Y$ be finite-dimensional
normed spaces over $\mathbb{K}$. Then each linear operator $A\colon X \to Y$ is
Fredholm and
\[ \operatorname{ind} A = \dim X - \dim Y. \]
Proof. By (45*) in Chapter 3,
\[ \operatorname{codim} N(A) = \dim R(A). \]
Hence
\[ \operatorname{ind} A = \dim N(A) - \operatorname{codim} R(A) \]
\[ = (\dim X - \operatorname{codim} N(A)) - (\dim Y - \dim R(A)) \]
\[ = \dim X - \dim Y. \] □
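The computation in this proof can be replayed on concrete matrices. The sketch below is an added illustration under the assumption that $X = \mathbb{Q}^n$, $Y = \mathbb{Q}^m$ and that $A$ is given as an $m \times n$ matrix; the function names `rank` and `index` are hypothetical. Like the proof, it uses the relation $\operatorname{codim} N(A) = \dim R(A)$ to obtain $\dim N(A)$ from the rank.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular matrix via exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def index(matrix):
    """ind A = dim N(A) - codim R(A) for an m x n matrix A acting from Q^n to Q^m."""
    m, n = len(matrix), len(matrix[0])
    rk = rank(matrix)           # dim R(A)
    dim_kernel = n - rk         # dim N(A), using codim N(A) = dim R(A)
    codim_range = m - rk        # codim R(A)
    return dim_kernel - codim_range


# Differentiation on polynomials of degree <= 3 (basis 1, x, x^2, x^3),
# mapping into polynomials of degree <= 2 (basis 1, x, x^2).
D = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]
print(index(D))  # 1
```

The rank drops out of the difference, so the index depends only on the dimensions of the two spaces and not on the particular operator; the zero matrix of the same shape has the same index as `D`.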
Example 3 (Differential operator). Let $X = C^1[a, b]$ and $Y = C[a, b]$,
where $-\infty < a < b < \infty$. Set
\[ (Au)(x) := u'(x) \quad \text{for all } x \in [a, b]. \]
Then the linear operator $A\colon X \to Y$ is Fredholm with $\operatorname{ind} A = 1$.
Proof. For each given $f \in C[a, b]$, the equation
\[ u' = f \quad \text{on } [a, b] \tag{14} \]
has a solution $u \in C^1[a, b]$ given through
\[ u(x) = \int_a^x f(t)\,dt. \]
Hence $R(A) = Y$, showing that $\operatorname{codim} R(A) = 0$.
Furthermore, it follows from (14) with $f \equiv 0$ that $u = \mathrm{const}$, and hence
$\dim N(A) = 1$. □
Example 4 (Integral operator). Let $\lambda \in \mathbb{R}$ with $\lambda \ne 0$. Set
\[ (Au)(x) := \int_a^b A(x, y)u(y)\,dy - \lambda u(x), \]
for all $x \in [a, b]$, where the function $A\colon [a, b] \times [a, b] \to \mathbb{R}$ is continuous,
$-\infty < a < b < \infty$.
Then the operator $A\colon X \to X$ is Fredholm of index zero⁵ provided we set
$X := L_2(a, b)$.
This follows from Section 5.3. □
⁵We shall show in Section 5.11 that this remains true for $X = C[a, b]$.
Proposition 5 (The role of the index). Let $A\colon X \to Y$ be a linear Fredholm
operator, where $X$ and $Y$ are normed spaces over $\mathbb{K}$. Then
(i) $A$ is surjective iff $\operatorname{ind} A = \dim N(A)$.
(ii) $A$ is injective iff $\dim N(A) = 0$.
(iii) $A$ is bijective iff $\operatorname{ind} A = \dim N(A) = 0$.
(iv) If $X$ and $Y$ are Banach spaces, then the equation
\[ Au = b, \quad u \in X, \]
is well posed iff $\operatorname{ind} A = \dim N(A) = 0$.
Proof. Ad (i). $A$ is surjective iff $\operatorname{codim} R(A) = 0$.
Ad (ii), (iii). This is obvious.
Ad (iv). This follows from Corollary 2 in Section 3.5. □
By Proposition 5(iv), Fredholm operators of index zero play a special
role. We now make the following assumption.
(H) Let $A\colon X \to Y$ be a linear Fredholm operator, where $X$ and $Y$ are
Banach spaces over $\mathbb{K}$.
Proposition 6. Assume (H). Then the following properties are met:
(i) The range $R(A)$ is closed.
(ii) $R(A) = {}^\perp N(A^T)$ and $R(A^T) = N(A)^\perp$.
(iii) $\operatorname{codim} R(A^T) = \dim N(A)$.
(iv) $\operatorname{codim} R(A) = \dim N(A^T) = \dim N(A) - \operatorname{ind} A$.
(v) The dual operator $A^T\colon Y^* \to X^*$ is also Fredholm, and
\[ \operatorname{ind} A^T = -\operatorname{ind} A. \]
Proof. Ad (i). Cf. Standard Example 2 in Section 3.11.
Ad (ii). Cf. the closed range theorem (Theorem 3.E).
Ad (iii), (iv). Use (ii) and Proposition 21 in Section 3.9.
Ad (v). Observe that
\[ \operatorname{ind} A^T = \dim N(A^T) - \operatorname{codim} R(A^T) \]
\[ = \operatorname{codim} R(A) - \dim N(A) = -\operatorname{ind} A. \] □
In terms of the operator equation
\[ Au = b, \quad u \in X, \tag{E} \]
and the dual equation
\[ A^T u^* = b^*, \quad u^* \in Y^*, \tag{E*} \]
Proposition 6 tells us the following. Assume (H). Then the following properties
are met:
(i) Original problem. For given $b \in Y$, equation (E) has a solution $u \in X$
iff $b$ satisfies the solvability condition
\[ \langle u^*, b \rangle = 0 \]
for all solutions $u^*$ of the homogeneous dual equation (E*).⁶
(ii) Finiteness. The homogeneous original equation (E) and the homogeneous
dual equation (E*) have only a finite number of linearly independent
solutions, and
\[ \operatorname{ind} A = \dim N(A) - \dim N(A^T). \]
(iii) Dual equation. For given $b^* \in X^*$, equation (E*) has a solution $u^* \in
Y^*$ iff
\[ \langle b^*, u \rangle = 0 \]
for all solutions $u$ of the homogeneous original equation (E).
(iv) Well-posedness. Let $\operatorname{ind} A = 0$ and suppose that $Au = 0$ implies
$u = 0$.
Then, for each given $b \in Y$, the original equation (E) has a unique
solution $u$, which depends continuously on $b$.
Moreover, for each given $b^* \in X^*$, the dual equation (E*) has a unique
solution $u^*$, which depends continuously on $b^*$.
⁶Recall that the homogeneous equations (E) and (E*) correspond to $b = 0$
and $b^* = 0$, respectively.
5.5 The Riesz-Schauder Theory on Banach Spaces
Theorem 5.C. Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$. Then the operator
\[ B + C\colon X \to Y \]
is Fredholm of index zero provided the following hold:
(i) The linear operator $B\colon X \to Y$ is continuous and bijective.
(ii) The linear operator $C\colon X \to Y$ is compact.
Proof. Use the proof of Theorem 5.B with the following obvious modifications:
(a) Replace the adjoint operators $B^*$ and $C^*$ with the dual operators $B^T$
and $C^T$, respectively.
(b) Replace the special closed range theorem with the general closed
range theorem (Theorem 3.E) in order to prove that
\[ \operatorname{codim} R(B + C) < \infty. \]
(c) Replace orthogonal direct sums with direct sums.
(d) Replace orthogonal complements such as $N^\perp$, $R^\perp$, and so forth, with
topological complements. □
5.6 Applications to the Spectrum of Linear Compact Operators
Theorem 5.D. Let $A\colon X \to X$ be a linear compact operator on the complex
Banach space $X \ne \{0\}$. Then the following statements hold true:
(i) All nonzero points $\lambda$ in the spectrum $\sigma(A)$ of $A$ are eigenvalues of $A$
with finite multiplicity.⁷
(ii) The spectrum $\sigma(A)$ is either a finite set or a countable subset of $\mathbb{C}$
with the only limit point $\lambda = 0$, which belongs to $\sigma(A)$.
(iii) The spectrum $\sigma(A)$ is not empty.
⁷The spectrum $\sigma(A)$ of an operator $A$ was defined in Section 1.25 of AMS
Vol. 108.
Proof. Ad (i). Let $\lambda \in \sigma(A)$ with $\lambda \ne 0$. By the definition of $\sigma(A)$, the
operator $A - \lambda I$ is not bijective. Since this operator is Fredholm of index
zero, we get $0 < \dim N(A - \lambda I) < \infty$ (i.e., $\lambda$ is an eigenvalue of $A$ with
finite multiplicity).
Ad (ii). By Section 1.25 in AMS Vol. 108, $\sigma(A)$ is compact.
Suppose that $\lambda_n \in \sigma(A)$ for all $n$ and let $\lambda_n \to \lambda$ as $n \to \infty$. Assume
that $\lambda_n \ne \lambda_m$ if $n \ne m$. We have to show that $\lambda = 0$.
There exists a sequence $(u_n)$ with $Au_n - \lambda_n u_n = 0$ for all $n$ and $u_n \ne u_m$
if $n \ne m$. In addition, $u_n \ne 0$ for all $n$. Define
\[ X_n := \operatorname{span}\{u_1, \ldots, u_n\}. \]
We show that the eigenvectors $u_1, \ldots, u_n$ are linearly independent, by an
induction argument. Suppose that $u_1, \ldots, u_{n-1}$ are linearly independent
and
\[ u_n = \sum_{j=1}^{n-1} a_j u_j \quad \text{for some } a_1, \ldots, a_{n-1} \in \mathbb{C}. \]
It follows from
\[ 0 = Au_n - \lambda_n u_n = \sum_{j=1}^{n-1} a_j(\lambda_j - \lambda_n)u_j \]
along with $\lambda_j \ne \lambda_n$ for all $j = 1, \ldots, n - 1$ that $a_j = 0$, and hence $u_n = 0$.
This contradicts $u_n \ne 0$.
Since $X_{n-1}$ is a proper subspace of $X_n$, there exists a point $v_n \in X_n$
such that $\|v_n\| = 1$ and
\[ \operatorname{dist}(v_n, X_{n-1}) \ge \tfrac{1}{2} \quad \text{for all } n \ge 2, \tag{15} \]
by Step 1 of the proof of Theorem 2.B (almost orthogonal elements). Then
$v_n = a_n u_n + w_{n-1}$ for some $w_{n-1} \in X_{n-1}$ and some $a_n \in \mathbb{C}$. Hence
\[ Av_n - \lambda_n v_n \in X_{n-1}, \]
since the space $X_{n-1}$ is invariant under $A$. For $m < n$, it follows from (15)
that
\[ \left\| A\!\left(\frac{v_n}{\lambda_n}\right) - A\!\left(\frac{v_m}{\lambda_m}\right) \right\| \ge \frac{1}{2}, \tag{16} \]
since $Av_m \in X_{n-1}$. This implies
\[ \frac{1}{|\lambda_n|} = \left\| \frac{v_n}{\lambda_n} \right\| \to \infty \quad \text{as } n \to \infty. \]
Otherwise, there would exist a bounded subsequence $(v_{n'}/\lambda_{n'})$. Since $A$ is
compact, we may assume that $(A(v_{n'}/\lambda_{n'}))$
is convergent, contradicting (16).
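Ad (iii). We will use standard arguments from classic complex function
theory.
Suppose that $\sigma(A)$ is empty. Then the operator $(A - \lambda I)^{-1}\colon X \to X$ is
linear and continuous for each $\lambda \in \mathbb{C}$. Set $\mu := 1/\lambda$. Then
\[ R_\lambda := (A - \lambda I)^{-1} = \mu(\mu A - I)^{-1}. \tag{17} \]
By Example 6 in Section 1.23 of AMS Vol. 108, the series
\[ \mu(\mu A - I)^{-1} = -\mu(I + \mu A + \mu^2 A^2 + \cdots) \]
converges in $L(X, X)$ for all $\mu \in \mathbb{C}$ with $|\mu| < r$ if $r$ is sufficiently small.
Hence the series
\[ R_\lambda = -\lambda^{-1}(I + \lambda^{-1}A + \lambda^{-2}A^2 + \cdots) \tag{18} \]
converges in $L(X, X)$ for all $\lambda \in \mathbb{C}$ with $|\lambda| > r^{-1}$. For each fixed $\lambda_0 \in \mathbb{C}$, we
get
\[ R_\lambda = (A - \lambda_0 I - (\lambda - \lambda_0)I)^{-1} = R_{\lambda_0}(I - (\lambda - \lambda_0)R_{\lambda_0})^{-1} \]
\[ = R_{\lambda_0}(I + (\lambda - \lambda_0)R_{\lambda_0} + (\lambda - \lambda_0)^2 R_{\lambda_0}^2 + \cdots). \tag{19} \]
This series converges in $L(X, X)$ for all $\lambda \in \mathbb{C}$ with $|\lambda - \lambda_0| < \rho$, where $\rho$
is sufficiently small.
After these preparations, we define the function $\phi\colon \mathbb{C} \to \mathbb{C}$ through
\[ \phi(\lambda) := f(R_\lambda u) \quad \text{for all } \lambda \in \mathbb{C}, \]
where $f \in X^*$ and $u \in X$ are fixed such that $f(u) \ne 0$. By (19), the
function $\phi$ allows the convergent series expansion
\[ \phi(\lambda) = f(R_{\lambda_0}u) + (\lambda - \lambda_0)f(R_{\lambda_0}^2 u) + (\lambda - \lambda_0)^2 f(R_{\lambda_0}^3 u) + \cdots \]
for sufficiently small $|\lambda - \lambda_0|$. Note that the point $\lambda_0$ can be chosen arbitrarily.
Thus, $\phi$ is holomorphic on $\mathbb{C}$. Hence
\[ \int_C \phi(\lambda)\,d\lambda = 0 \tag{20} \]
for each circle $C$ around the origin. If $C$ is sufficiently large, then it follows
from (18) that
\[ -\int_C \phi(\lambda)\,d\lambda = \int_C \big(\lambda^{-1}f(u) + \lambda^{-2}f(Au) + \cdots\big)\,d\lambda = \int_C \lambda^{-1}f(u)\,d\lambda = 2\pi i f(u). \]
Since $f(u) \ne 0$, this contradicts (20). □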
5.7 The Parametrix
Proposition 1. Let $A\colon X \to Y$ be a linear continuous operator, where $X$
and $Y$ are Banach spaces over $\mathbb{K}$. Then the following two statements are
equivalent:
(i) The operator $A$ is Fredholm.
(ii) There exist linear continuous operators $P_l, P_r\colon Y \to X$ and linear
compact operators $C_l\colon X \to X$ and $C_r\colon Y \to Y$ such that
\[ P_l A = I + C_l, \tag{21a} \]
\[ A P_r = I + C_r. \tag{21b} \]
The operators $P_l$ and $P_r$ are called a left and right parametrix for $A$,
respectively. The theory of pseudo-differential operators provides a systematic
method for constructing the parametrices corresponding to large
classes of differential and integral operators. This can be found in Hörmander
(1983).
Proof. (i) $\Rightarrow$ (ii). Choose linear subspaces $V$ and $W$ of $X$ and $Y$, respectively,
such that
\[ X = N(A) \oplus V \quad \text{and} \quad Y = R(A) \oplus W. \]
This is possible by Proposition 6(i) in Section 5.4 and by Standard Example
17 from Section 3.9. Let
\[ P\colon X \to N(A) \quad \text{and} \quad Q\colon Y \to W \]
be the corresponding linear continuous projection operators onto $N(A)$ and
$W$, respectively. Define the linear continuous operator
\[ B\colon R(A) \oplus W \to X \]
through
\[ B(u + w) := A_0^{-1}u \quad \text{for all } u \in R(A),\ w \in W, \]
where $A_0\colon V \to R(A)$ denotes the restriction of $A\colon X \to Y$ to $V$. By Proposition
13 in Section 3.9, the operator $A_0$ is a linear homeomorphism. Finally,
observe that
\[ BA = I - P \quad \text{and} \quad AB = I - Q. \]
Since $P(X)$ and $Q(Y)$ are finite-dimensional linear spaces, the operators
$P$ and $Q$ are compact. Now set $P_r = P_l := B$.
(ii) $\Rightarrow$ (i). It follows from Theorem 5.C that the operators $I + C_l$ and
$I + C_r$ are Fredholm of index zero.
By (21a), $N(A) \subseteq N(I + C_l)$. Hence
\[ \dim N(A) \le \dim N(I + C_l) < \infty. \]
Furthermore, it follows from (21b) that
\[ R(I + C_r) \subseteq R(A). \]
Since $\operatorname{codim} R(I + C_r) < \infty$ and $R(I + C_r)$ is closed, we get $\operatorname{codim} R(A) <
\infty$, by Corollary 11 in Section 3.9. □
5.8 Applications to the Perturbation of Fredholm Operators
Let $F(X, Y)$ denote the set of all linear Fredholm operators $A\colon X \to Y$,
where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Recall that $L(X, Y)$ denotes
the Banach space of all linear continuous operators $B\colon X \to Y$ equipped
with the operator norm $\|B\|$.
Proposition 1. Let $S \in F(X, Y)$. Then there exists a number $\varepsilon > 0$ such
that
\[ T \in F(X, Y) \quad \text{and} \quad \operatorname{ind} T = \operatorname{ind} S \]
for all operators $T \in L(X, Y)$ with $\|T - S\| < \varepsilon$.
Proof. Use Step 5 of the proof of Theorem 5.B along with the following
modifications:
(a) Replace orthogonal direct sums with direct sums.
(b) Replace orthogonal complements such as $N^\perp$, $R^\perp$, and so on, with
topological complements. □
Since the index is an integer, Proposition 1 can be formulated in the
following equivalent way:
The set $F(X, Y)$ is open in $L(X, Y)$, and the function $S \mapsto \operatorname{ind} S$ is
continuous on $F(X, Y)$.
Theorem 5.E (Compact perturbations of Fredholm operators). Let $S \in
F(X, Y)$, and let $A \in L(X, Y)$ be a compact operator.
Then the linear operator $S + A$ is Fredholm and
\[ \operatorname{ind}(S + A) = \operatorname{ind} S. \]
Proof. We will use the method of the parametrix. Since $S \in F(X, Y)$, there
exist operators $P_l, P_r \in L(Y, X)$ and compact operators $C_l \in L(X, X)$,
$C_r \in L(Y, Y)$ such that
\[ P_l S = I + C_l \quad \text{and} \quad S P_r = I + C_r, \]
by Section 5.7. Hence
\[ P_l(S + A) = I + C_l + P_l A \quad \text{and} \quad (S + A)P_r = I + C_r + A P_r. \]
Since the operator $A$ is compact, so are $P_l A$ and $A P_r$, by Proposition 4
in Section 5.1. Thus, $P_l$ and $P_r$ are a left and right parametrix for $S + A$,
respectively. Hence $S + A$ is Fredholm.
Now consider the continuous function
\[ t \mapsto S + tA \]
from $[0,1]$ to $L(X, Y)$. Since $tA$ is compact, the operator $S + tA$ is Fredholm
for all $t \in [0,1]$. By Proposition 1, $\operatorname{ind}(S + tA) = \mathrm{const}$ for all $t \in [0,1]$. □
5.9 Applications to the Product Index Theorem
Theorem 5.F. Let
\[ X \xrightarrow{A} Y \xrightarrow{B} Z \]
be a sequence of linear Fredholm operators $A$ and $B$, where $X$, $Y$, and $Z$
are Banach spaces over $\mathbb{K}$.
Then the linear operator
\[ X \xrightarrow{BA} Z \]
is also Fredholm and
\[ \operatorname{ind}(BA) = \operatorname{ind} B + \operatorname{ind} A. \tag{22} \]
Proof. We will use the method of the parametrix. Note that the products
of parametrices for $A$ and $B$ produce parametrices for $BA$. In fact, it follows
from Section 5.7 that, for $j = 1, 2$, there exist linear continuous operators
$P_l^{(j)}$, $P_r^{(j)}$ and linear compact operators $C_l^{(j)}$, $C_r^{(j)}$ such that
\[ P_l^{(1)} A = I + C_l^{(1)}, \quad A P_r^{(1)} = I + C_r^{(1)}, \]
\[ P_l^{(2)} B = I + C_l^{(2)}, \quad B P_r^{(2)} = I + C_r^{(2)}. \]
By Proposition 4 in Section 5.1, we obtain
\[ P_l^{(1)} P_l^{(2)} BA = I + \text{linear compact operator}; \]
\[ BA P_r^{(1)} P_r^{(2)} = I + \text{linear compact operator}. \]
Thus, the product $BA$ possesses right and left parametrices, that is, $BA$ is
Fredholm, by Section 5.7.
Let us compute the index of $BA$. The following sequences of finite-dimensional
linear spaces are exact:
\[ 0 \to N(A) \to N(BA) \xrightarrow{A} R(A) \cap N(B) \to 0 \]
\[ 0 \to R(B)/R(BA) \to Z/R(BA) \to Z/R(B) \to 0 \]
\[ 0 \to (R(A) + N(B))/R(A) \to Y/R(A) \xrightarrow{[B]} R(B)/R(BA) \to 0 \tag{22*} \]
\[ 0 \to N(B) \cap R(A) \to N(B) \xrightarrow{\pi} (N(B) + R(A))/R(A) \to 0. \]
Here, the arrows without $A$, $[B]$, and $\pi$ correspond to trivial inclusion maps,
and $\pi$ denotes the canonical map $\pi\colon Y \to Y/R(A)$. Recall that $\pi(u) :=
u + R(A)$. Furthermore,
\[ [B](u + R(A)) := Bu + BR(A) = Bu + R(BA). \]
Observe that the factor space $W/V$ is the collection of all the different
subsets $w + V$, where $w \in W$.
The exactness of all the preceding sequences follows simply from
\[ N(A) \subseteq N(BA) \quad \text{and} \quad R(BA) \subseteq R(B). \]
For example, $N(A) \subseteq N(BA)$ implies that the inclusion map $N(A) \to
N(BA)$ is injective, that is,
\[ 0 \to N(A) \to N(BA) \]
is exact. Moreover, the map $A\colon N(BA) \to R(A) \cap N(B)$ is surjective, that
is,
\[ N(BA) \xrightarrow{A} R(A) \cap N(B) \to 0 \]
is exact, and hence
\[ 0 \to N(A) \to N(BA) \xrightarrow{A} R(A) \cap N(B) \to 0 \]
is exact, and so forth.
By Proposition 3 in Section 3.11, the exactness of $0 \to \mathcal{A} \to \mathcal{B} \to \mathcal{C} \to 0$
implies
\[ \pm(\dim \mathcal{A} - \dim \mathcal{B} + \dim \mathcal{C}) = 0. \]
Applying this to the exact sequences in (22*), and summing the corresponding
relations for the dimensions (with $+, -, +, -$), we obtain
\[ \dim N(A) - \dim N(BA) + \dim(Z/R(BA)) - \dim(Z/R(B)) - \dim(Y/R(A)) + \dim N(B) = 0. \]
Observe that $\operatorname{codim} R(BA) = \dim(Z/R(BA))$, and so on. Hence
\[ \operatorname{ind} A - \operatorname{ind} BA + \operatorname{ind} B = 0. \] □
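For readers who wish to see the cancellation in the signed sum explicitly, the bookkeeping can be written out as follows. This is an added worked computation; the abbreviations $d_\cap$, $d_Q$, and $d_B$ are introduced here only for brevity and do not appear in the text.

```latex
% Abbreviations (introduced for this computation only):
%   d_\cap := \dim (R(A) \cap N(B)),
%   d_Q    := \dim ((R(A) + N(B))/R(A)),
%   d_B    := \dim (R(B)/R(BA)).
% The four exact sequences in (22*) give, in order:
\begin{align*}
\dim N(A) - \dim N(BA) + d_\cap &= 0, \\
d_B - \dim(Z/R(BA)) + \dim(Z/R(B)) &= 0, \\
d_Q - \dim(Y/R(A)) + d_B &= 0, \\
d_\cap - \dim N(B) + d_Q &= 0.
\end{align*}
% Adding the first and third relations and subtracting the second and fourth,
% the terms d_\cap, d_Q, d_B cancel in pairs, leaving
\begin{align*}
0 &= \dim N(A) - \dim N(BA) + \dim(Z/R(BA)) - \dim(Z/R(B)) - \dim(Y/R(A)) + \dim N(B) \\
  &= \big(\dim N(A) - \operatorname{codim} R(A)\big)
   - \big(\dim N(BA) - \operatorname{codim} R(BA)\big)
   + \big(\dim N(B) - \operatorname{codim} R(B)\big) \\
  &= \operatorname{ind} A - \operatorname{ind} BA + \operatorname{ind} B.
\end{align*}
```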
5.10 Fredholm Alternatives via Dual Pairs
We want to reformulate the Fredholm alternative in terms of dual pairs,
which is convenient with a view to differential and integral equations (cf.
Section 5.11).
Definition 1. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. We call $\{Y, X\}$ a
dual pair iff there exists a bounded bilinear map $\langle \cdot, \cdot \rangle_D\colon Y \times X \to \mathbb{K}$ such
that
(i) $\langle v, u \rangle_D = 0$ for all $u \in X$ implies $v = 0$;
(ii) $\langle v, u \rangle_D = 0$ for all $v \in Y$ implies $u = 0$.
Example 2. Let $X$ be a normed space over $\mathbb{K}$. Then $\{X^*, X\}$ forms a dual
pair with⁸
\[ \langle v, u \rangle_D := \langle v, u \rangle \quad \text{for all } v \in X^*,\ u \in X. \]
Proof. For all $u \in X$ and $v \in X^*$,
\[ |\langle v, u \rangle| \le \|v\|\,\|u\|. \]
If $\langle v, u \rangle = 0$ for all $u \in X$ and fixed $v \in X^*$, then $v = 0$.
Conversely, if
\[ \langle v, u \rangle = 0 \quad \text{for all } v \in X^* \text{ and fixed } u \in X, \]
then $u = 0$, by the Hahn-Banach theorem (cf. Standard Example 1 in
Section 1.1).
Example 3. Let $X := C[a, b]$, where $-\infty < a < b < \infty$. Then $\{X, X\}$
forms a dual pair with respect to
\[ \langle v, u \rangle_D := \int_a^b v(x)u(x)\,dx \quad \text{for all } u, v \in X. \]
Proof. Recall that $\|u\| = \max_{a \le x \le b} |u(x)|$. Hence
\[ |\langle v, u \rangle_D| \le (b - a)\|v\|\,\|u\| \quad \text{for all } u, v \in X. \]
If $\langle v, u \rangle_D = 0$ for all $u \in X$ and fixed $v \in X$, then $v(x) = 0$ on $[a, b]$, by
Variational Lemma 10 in Section 2.2 of AMS Vol. 108. □
⁸Recall that $\langle v, u \rangle = v(u)$.
Note that the dual space $C[a, b]^*$ corresponds to functions of bounded
variation, by Section 1.3. Therefore, the dual pair $\{X, X\}$ from Example 3
possesses a simpler structure than the usual dual pair $\{X^*, X\}$.
Let us now consider the operator equation
\[ Au = b, \quad u \in X, \tag{23} \]
along with the "dual" equation
\[ A^D v = 0, \quad v \in X, \tag{23*} \]
where
\[ \langle Au, v \rangle_D = \langle A^D v, u \rangle_D \quad \text{for all } u, v \in X. \tag{24} \]
Theorem 5.G (The Fredholm alternative). Suppose that
(i) $\{Y, X\}$ is a dual pair, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$.
(ii) The linear continuous operators $A, A^D\colon X \to Y$ are Fredholm with
$\operatorname{ind} A = -\operatorname{ind} A^D$.
(iii) The operator $A^D$ is "dual" to $A$, meaning that relation (24) is satisfied.
Then, for each given $b \in Y$, the original equation (23) has a solution $u$
iff
\[ \langle b, v \rangle_D = 0 \]
for all solutions $v$ of the "dual" equation (23*).
The advantage of this theorem over the usual formulation is that the
operators $A$ and $A^D$ live in the same space, in contrast to $A$ and $A^T$. In
applications, relation (24) corresponds frequently to integration by parts,
as we shall see ahead.
Proof. We want to reduce this new situation to the usual Fredholm alternative.
Step 1: We first show that
$$\dim N(A^D) \le \dim N(A^T). \tag{25}$$
To this end, let $\{v_1, \ldots, v_n\}$ be a basis of $N(A^D)$. Define
$$f_j(u) := \langle u, v_j\rangle_D \quad\text{for all } u \in Y.$$
Then $f_j \in Y^*$, since $|f_j(u)| \le \text{const}\,\|u\|\,\|v_j\|$ for all $u \in Y$. By (24),
$$f_j(Au) = \langle Au, v_j\rangle_D = \langle A^D v_j, u\rangle_D = 0 \quad\text{for all } u \in X.$$
Hence
$$\langle A^T f_j, u\rangle = \langle f_j, Au\rangle = 0 \quad\text{for all } u \in X,$$
that is, $f_1, \ldots, f_n \in N(A^T)$. This implies (25), because
the functionals $f_1, \ldots, f_n \in Y^*$ are linearly independent. In fact, it follows from
$$\alpha_1 f_1 + \cdots + \alpha_n f_n = 0$$
that $\langle u, \alpha_1 v_1 + \cdots + \alpha_n v_n\rangle_D = 0$ for all $u \in Y$. By condition (ii) of Definition 1, this implies
$\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$, and hence $\alpha_1 = \cdots = \alpha_n = 0$.
Step 2: Similarly, we get
$$\dim N(A) \le \dim N((A^D)^T). \tag{26}$$
Step 3: Since the operators $A$ and $A^D$ are Fredholm, we obtain
$$\operatorname{codim} R(A) = \dim N(A^T) \quad\text{and}\quad \operatorname{codim} R(A^D) = \dim N((A^D)^T).$$
Thus, it follows from
$$\operatorname{ind} A = \dim N(A) - \operatorname{codim} R(A) \le \operatorname{codim} R(A^D) - \dim N(A^D) = -\operatorname{ind} A^D$$
along with $\operatorname{ind} A = -\operatorname{ind} A^D$ that we can replace $\le$ with $=$ in relations
(25) and (26). Consequently, $\{f_1, \ldots, f_n\}$ forms a basis of $N(A^T)$.
Step 4: Since the operator $A$ is Fredholm, the original equation $Au = b$,
$u \in X$, has a solution iff
$$\langle f_j, b\rangle = 0 \quad\text{for all } j = 1, \ldots, n.$$
By the definition of $f_j$, this is equivalent to
$$\langle b, v_j\rangle_D = 0 \quad\text{for all } j = 1, \ldots, n. \qquad\Box$$
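As a finite-dimensional illustration (not taken from the text), let $X = Y = \mathbb{R}^2$ with $\langle v, u\rangle_D := v^T u$. Then relation (24) holds with $A^D = A^T$, both operators have index zero, and Theorem 5.G reduces to the familiar statement that $Au = b$ is solvable iff $b$ is orthogonal to $N(A^T)$. The matrix `A`, the vector `v`, and the helper names `rank`, `solvable`, and `dot` below are assumptions chosen for this sketch.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def solvable(A, b):
    """Au = b has a solution iff appending b as a column does not raise the rank."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)

def dot(v, u):
    """The pairing <v, u>_D := v^T u on R^n (an assumption of this sketch)."""
    return sum(Fraction(x) * Fraction(y) for x, y in zip(v, u))

A = [[1, 2], [3, 6]]   # singular matrix, so N(A^T) is nontrivial
v = [3, -1]            # spans N(A^T): A^T v = (3 - 3, 6 - 6) = 0

for b in ([1, 3], [1, 0]):
    print(b, "solvable:", solvable(A, b), " <b, v>_D =", dot(b, v))
```

In both cases the rank test and the orthogonality test agree, as the theorem predicts for this dual pair.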
Dual pairs play an important role in the modern theory of nonlinear
partial differential equations. This can be found in Zeidler (1986), Vol. 2B,
Theorems 27.B and 30.B, as well as Vol. 5, Theorems 83.Uff.
5.11 Applications to Integral Equations and
Boundary-Value Problems
Let us first consider the integral equation
$$u(x) - \int_a^b A(x, y)u(y)\,dy = h(x), \quad a \le x \le b, \tag{27}$$
along with the dual integral equation
$$v(x) - \int_a^b A(y, x)v(y)\,dy = 0, \quad a \le x \le b. \tag{27*}$$
Standard Example 1. Let the function $A\colon [a, b] \times [a, b] \to \mathbb{R}$ be continuous,
where $-\infty < a < b < \infty$.
Then, for given $h \in C[a, b]$, the original problem (27) has a solution
$u \in C[a, b]$ iff
$$\int_a^b h(x)v(x)\,dx = 0$$
for all solutions $v \in C[a, b]$ of the dual equation (27*).
The same result has been obtained in Section 5.3 by means of Hilbert
space methods combined with a regularization argument.
Proof. Set $X := C[a, b]$. We will use the dual pair $\{X, X\}$ with
$$\langle v, u\rangle_D := \int_a^b v(x)u(x)\,dx \quad\text{for all } u, v \in X$$
(cf. Example 3 in Section 5.10). Define
$$(Au)(x) := u(x) - \int_a^b A(x, y)u(y)\,dy$$
and
$$(A^D v)(x) := v(x) - \int_a^b A(y, x)v(y)\,dy$$
for all $x \in [a, b]$. By Standard Example 12 in Section 1.11 of AMS Vol. 108,
the linear operator $A\colon X \to X$ is a compact perturbation of the identity.
Hence Theorem 5.E tells us that $A\colon X \to X$ is Fredholm of index zero. The
same argument shows that $A^D\colon X \to X$ is Fredholm of index zero.
Finally, for all $u, v \in X$,
$$\int_a^b \Big(\int_a^b A(x, y)u(y)\,dy\Big) v(x)\,dx = \int_a^b \Big(\int_a^b A(x, y)v(x)\,dx\Big) u(y)\,dy,$$
and hence we get the duality relation
$$\langle Au, v\rangle_D = \langle A^D v, u\rangle_D \quad\text{for all } u, v \in X.$$
Thus, the assertion follows from Theorem 5.G. □
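The following worked special case is an illustration only; the choice $a = 0$, $b = 1$, $A(x, y) \equiv 1$ is an assumption and does not come from the text. It shows the solvability condition of Standard Example 1 at work in a situation where everything can be computed by hand.

```latex
% Illustrative special case (assumption): a = 0, b = 1, A(x,y) \equiv 1.
\begin{align*}
&\text{(27):}\quad u(x) - \int_0^1 u(y)\,dy = h(x),
 \qquad
 \text{(27*):}\quad v(x) - \int_0^1 v(y)\,dy = 0 .\\
&\text{Solutions of (27*): } v \equiv \text{const}, \text{ so the solvability condition reads }
 \int_0^1 h(x)\,dx = 0 .\\
&\text{Necessity: integrating (27) over } [0,1] \text{ gives }
 \int_0^1 h\,dx = \int_0^1 u\,dx - \int_0^1 u\,dy = 0 .\\
&\text{Sufficiency: if } \int_0^1 h\,dx = 0, \text{ then } u := h + c \text{ solves (27) for every constant } c,\\
&\qquad \text{since } (h + c) - \int_0^1 (h + c)\,dy = h + c - c = h .
\end{align*}
```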
Next we want to study the following boundary-value problem:
$$\alpha u'' + \beta u' + \gamma u = h \ \text{ on } [a, b], \quad u(a) = u(b) = 0, \tag{28}$$
along with the dual problem
$$(\alpha v)'' - (\beta v)' + \gamma v = 0 \ \text{ on } [a, b], \quad v(a) = v(b) = 0. \tag{28*}$$
Standard Example 2. Let
$$\alpha \in C^2[a, b], \quad \beta \in C^1[a, b], \quad \gamma \in C[a, b],$$
where $-\infty < a < b < \infty$, and suppose that $\alpha(x) > 0$ on $[a, b]$.
Then, for given $h \in C[a, b]$, the original problem (28) has a solution
$u \in C^2[a, b]$ iff
$$\int_a^b h(x)v(x)\,dx = 0$$
for all solutions $v \in C^2[a, b]$ of the dual problem (28*).
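Proof. Set
$$X := \{u \in C^2[a, b]\colon u(a) = u(b) = 0\} \quad\text{and}\quad Y := C[a, b],$$
along with the norm
$$\|u\|_X := \max_{a \le x \le b} |u(x)| + \max_{a \le x \le b} |u'(x)| + \max_{a \le x \le b} |u''(x)|,$$
and $\|u\|_Y := \max_{a \le x \le b} |u(x)|$. Then $X$ and $Y$ are real Banach spaces.
We want to use the dual pair $\{Y, X\}$ with
$$\langle v, u\rangle_D := \int_a^b v(x)u(x)\,dx \quad\text{for all } v \in Y,\ u \in X.$$
Define the linear operators $A, A^D\colon X \to Y$ through
$$Au := \alpha u'' + \beta u' + \gamma u \quad\text{for all } u \in X,$$
and
$$A^D v := (\alpha v)'' - (\beta v)' + \gamma v \quad\text{for all } v \in X.$$
For all $u, v \in X$, integration by parts yields the duality relation
$$\langle Au, v\rangle_D = \int_a^b (\alpha u'' + \beta u' + \gamma u)v\,dx
= \int_a^b \big(-(\alpha v)'u' - (\beta v)'u + \gamma uv\big)\,dx
= \int_a^b \big((\alpha v)'' - (\beta v)' + \gamma v\big)u\,dx = \langle A^D v, u\rangle_D.$$
All boundary terms vanish, because $u$ and $v$ vanish at $a$ and $b$.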
Finally, let us prove that the operators $A, A^D\colon X \to Y$ are Fredholm of
index zero. Then, the assertion follows from Theorem 5.G. To this end,
define the linear operators $B, C\colon X \to Y$ through
$$Bu := \alpha u''$$
and
$$Cu := \beta u' + \gamma u \quad\text{for all } u \in X.$$
We will show the following:
(a) $B\colon X \to Y$ is linear, continuous, and bijective.
(b) $C\colon X \to Y$ is linear and compact.
This implies that the operator $A := B + C$ is a compact perturbation of
the Fredholm operator $B$ of index zero. By Theorem 5.E, the operator $A$
is Fredholm of index zero.
Ad (a). Obviously,
$$\|Bu\|_Y \le \max_{a \le x \le b} |\alpha(x)| \max_{a \le x \le b} |u''(x)| \le \text{const}\,\|u\|_X \quad\text{for all } u \in X. \tag{29}$$
Hence $B$ is continuous. For given $h \in Y$ and $p \in \mathbb{R}$, the initial-value problem
$$\alpha u'' = h \ \text{ on } [a, b], \quad u(a) = 0, \quad u'(a) = p,$$
has the unique solution
$$u = u_0 + p(x - a),$$
where $u_0$ corresponds to $p = 0$. Choosing $p$ in an appropriate way, namely $p = -u_0(b)/(b - a)$, we find
that, for each given $h \in Y$, the boundary-value problem
$$\alpha u'' = h \ \text{ on } [a, b], \quad u(a) = u(b) = 0,$$
has a unique solution $u \in X$. Thus, $B\colon X \to Y$ is bijective.
Ad (b). As in (29), we obtain
$$\|Cu\|_Y \le \text{const}\,\|u\|_X \quad\text{for all } u \in X.$$
Thus, $C$ is continuous. Let $M$ be a bounded subset of $X$. Then, for all
$u \in M$ and all $x, y \in [a, b]$, we get
$$|u'(x) - u'(y)| \le \Big(\max_{a \le z \le b} |u''(z)|\Big)\,|x - y| \le \Big(\sup_{u \in M} \|u\|_X\Big)\,|x - y|$$
and
$$|u(x) - u(y)| \le \Big(\sup_{u \in M} \|u\|_X\Big)\,|x - y|.$$
By the Arzelà-Ascoli theorem (Standard Example 7 in Section 1.11 of AMS
Vol. 108), the set $C(M)$ is relatively compact in $Y$, that is, the operator
$C\colon X \to Y$ is compact.
Using the same argument, we see that the operator $A^D\colon X \to Y$ is also
Fredholm of index zero. □
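A worked special case may help to see how the solvability condition is used. The data $[a, b] = [0, 1]$, $\alpha \equiv 1$, $\beta \equiv 0$, $\gamma \equiv \pi^2$ are assumptions chosen for illustration; they are not taken from the text.

```latex
% Illustrative special case (assumption): [a,b] = [0,1], \alpha = 1, \beta = 0, \gamma = \pi^2.
\begin{align*}
&\text{(28):}\quad u'' + \pi^2 u = h \text{ on } [0,1],\quad u(0) = u(1) = 0;
 \qquad
 \text{(28*):}\quad v'' + \pi^2 v = 0,\quad v(0) = v(1) = 0 .\\
&\text{Solutions of (28*): } v(x) = c\,\sin(\pi x), \text{ so (28) is solvable iff }
 \int_0^1 h(x)\sin(\pi x)\,dx = 0 .\\
&h(x) = \sin(2\pi x):\quad \int_0^1 \sin(2\pi x)\sin(\pi x)\,dx = 0, \text{ and indeed }
 u(x) = -\frac{\sin(2\pi x)}{3\pi^2} \text{ solves (28), since}\\
&\qquad u'' + \pi^2 u = \Big(\frac{4\pi^2}{3\pi^2} - \frac{\pi^2}{3\pi^2}\Big)\sin(2\pi x) = \sin(2\pi x) .\\
&h(x) = \sin(\pi x):\quad \int_0^1 \sin^2(\pi x)\,dx = \tfrac12 \ne 0, \text{ so (28) has no solution.}
\end{align*}
```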
5.12 Bifurcation Theory
To explain the basic idea of bifurcation theory, let us consider the following
two real equations:
$$u - u_0 - (p - p_0) = 0 \tag{30}$$
and
$$(u - u_0)\,[(u - u_0) - (p - p_0)] = 0. \tag{31}$$
The solutions are pictured in Figures 5.2(a) and 5.2(b), respectively. Obviously,
in Figure 5.2(a), the solution curve through the point $(u_0, p_0)$ is
unique in a neighborhood of $(u_0, p_0)$, in contrast to Figure 5.2(b). We say
that $(u_0, p_0)$ is a bifurcation point in Figure 5.2(b).
In the natural sciences, bifurcation points correspond to substantial
changes in the behavior of systems. For example, Figure 5.3 displays a
beam buckling under the influence of an outer force $p$. If the force becomes
critical (i.e., $p = p_0$), then the rest state passes over to a buckled state.⁹
One frequently observes the following principle in nature:
Loss of stability leads to bifurcation.
We begin with the operator equation
$$F(u, p) = 0, \quad u \in X,\ p \in \Pi, \tag{32}$$
where $X$ and $\Pi$ are Banach spaces over $\mathbb{K}$. Here $p$ is regarded as a parameter
living in the parameter space $\Pi$ (e.g., $\Pi = \mathbb{K}$).
Definition 1. The point $(u_0, p_0)$ is called a bifurcation point of (32) iff the
following conditions are met:
(i) $F(u_0, p_0) = 0$;
(ii) For $n = 1, 2, \ldots$, there are two sequences $\{(u_n, p_n)\}$ and $\{(v_n, p_n)\}$
of solutions to equation (32) that converge to $(u_0, p_0)$ as $n \to \infty$.
These are distinct sequences, that is, $u_n \ne v_n$ for all $n = 1, 2, \ldots$.
⁹ A detailed mathematical study of this problem can be found in Zeidler (1986),
Vol. 2B, Section 29.13.
FIGURE 5.2. (a) No bifurcation point; (b) bifurcation point.
FIGURE 5.3. (a) $p < p_0$; (b) $p > p_0$.
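The two model equations (30) and (31) can be checked directly against Definition 1. The sketch below takes $u_0 = p_0 = 0$ and the parameter sequence $p_n = 1/n$ as assumptions; the function names are hypothetical.

```python
def solutions_30(p, u0=0.0, p0=0.0):
    """Solutions u of (30): u - u0 - (p - p0) = 0."""
    return [u0 + (p - p0)]

def solutions_31(p, u0=0.0, p0=0.0):
    """Solutions u of (31): (u - u0) * ((u - u0) - (p - p0)) = 0."""
    return sorted({u0, u0 + (p - p0)})

# Two distinct solution sequences of (31) with the same parameters p_n -> p0 = 0.
params = [1.0 / n for n in range(1, 6)]
branch_u = [solutions_31(p)[0] for p in params]   # trivial branch u = u0
branch_v = [solutions_31(p)[1] for p in params]   # second branch u = u0 + (p - p0)

for p, u, v in zip(params, branch_u, branch_v):
    print(f"p = {p:.3f}: u_n = {u:.3f}, v_n = {v:.3f}")
```

For (30) there is exactly one solution for each $p$, so no second sequence exists; for (31) the two branches meet only at $(u_0, p_0)$.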
Proposition 2 (Necessary bifurcation condition). Let
$$F\colon U(u_0, p_0) \subseteq X \times \Pi \to Y$$
be a $C^1$-map on an open neighborhood of the point $(u_0, p_0)$, where $X$, $Y$,
and $\Pi$ are Banach spaces over $\mathbb{K}$.
If $(u_0, p_0)$ is a bifurcation point of (32), then the linearization
$$F_u(u_0, p_0)\colon X \to Y$$
is not bijective.
Proof. This follows immediately from the implicit function theorem in
Section 4.8. □
We now want to formulate an important sufficient bifurcation condition
in the case where $\Pi := \mathbb{K}^n$, that is, $p = (p_1, \ldots, p_n)$. We assume the
following:
(H1) Let $F\colon U(u_0, p_0) \subseteq X \times \mathbb{K}^n \to Y$ be a $C^2$-map on an open neighborhood
of the point $(u_0, p_0)$, where $X$ and $Y$ are Banach spaces over
$\mathbb{K}$, and $n \ge 1$.
(H2) Trivial solution. For all $p \in \mathbb{K}^n$ in an open neighborhood of $p_0 \in \mathbb{K}^n$,
$$F(u_0, p) = 0.$$
(H3) Linearization. The linearized operator $F_u(u_0, p_0)\colon X \to Y$ is Fredholm.
Suppose that there exists a $b \in N(F_u(u_0, p_0))$ with $b \ne 0$, that is,
$$F_u(u_0, p_0)b = 0.$$
Moreover, suppose that $v_1^*, \ldots, v_n^* \in Y^*$ form a basis of $R(F_u(u_0, p_0))^\perp$.
This is equivalent to the fact that $v_1^*, \ldots, v_n^*$ are linearly independent and
that the linearized equation
$$F_u(u_0, p_0)u = v, \quad u \in X, \tag{33}$$
has a solution for given $v \in Y$ iff $\langle v_j^*, v\rangle = 0$ for all $j = 1, \ldots, n$.
(H4) Bifurcation condition. Let
$$\det(a_{jk}) \ne 0,$$
where $a_{jk} := \langle v_j^*, F_{p_k u}(u_0, p_0)b\rangle$, $j, k = 1, \ldots, n$.
Theorem 5.H. Assume (H1) through (H4). Then $(u_0, p_0)$ is a bifurcation
point of the equation
$$F(u, p) = 0. \tag{34}$$
Proof. Without loss of generality, we may assume that $u_0 = 0$ and $p_0 = 0$.
Since $\dim N(F_u(0, 0)) < \infty$, there exists a topological direct sum
$$X = N(F_u(0, 0)) \oplus W.$$
Choose elements $v_1, \ldots, v_n \in Y$ with $\langle v_i^*, v_j\rangle = \delta_{ij}$ for $i, j = 1, \ldots, n$, and
set
$$Qv := \sum_{i=1}^n \langle v_i^*, v\rangle v_i.$$
Then the linearized equation (33) has a solution iff $Qv = 0$, by (H3). Thus,
$$I - Q\colon Y \to R(F_u(0, 0))$$
represents a continuous projection onto the range $R(F_u(0, 0))$. Define
$$H(w, p, s) := \begin{cases} s^{-1}F(sb + sw, p) & \text{if } s \ne 0,\\ F_u(0, p)(b + w) & \text{if } s = 0. \end{cases} \tag{35}$$
We will show the following:
(a) The operator $H\colon U(0, 0, 0) \subseteq W \times \mathbb{K}^n \times \mathbb{K} \to Y$ is $C^1$ on an open
neighborhood of the point $(0, 0, 0)$.
(b) The linearization
$$(w, p) \mapsto H_w(0, 0, 0)w + H_p(0, 0, 0)p$$
is bijective from $W \times \mathbb{K}^n$ onto $Y$.
It then follows from the implicit function theorem in Section 4.8 that,
for each $s \in \mathbb{K}$ in an open neighborhood of $s = 0$, the equation
$$H(w, p, s) = 0 \tag{36a}$$
has a unique $C^1$-solution
$$w = w(s), \quad p = p(s), \quad s = s \tag{36b}$$
in an open neighborhood of the point $(w, p, s) = (0, 0, 0)$ in $W \times \mathbb{K}^n \times \mathbb{K}$.
Since $H(0, 0, 0) = 0$, we get $w(0) = 0$ and $p(0) = 0$. Consequently, for
sufficiently small $|s|$ with $s \ne 0$, the original equation (34) has the nontrivial
solution
$$u = sb + sw(s), \quad p = p(s)$$
(i.e., $u \ne 0$). Since equation (34) also possesses the trivial solution $(u, p) =
(0, p)$, the point $(0, 0)$ is a bifurcation point of (34).
Ad (a). Assume first that $F$ is analytic on a neighborhood of the point
$(0, 0)$ in $X \times \Pi$. Hence
$$F(u, p) = \sum_{j,k=0}^{\infty} M_{jk}u^j p^k \tag{37}$$
with
$$\sum_{j,k=0}^{\infty} \|M_{jk}\|\,\|u\|^j\,\|p\|^k < \infty,$$
for all $(u, p)$ in some open neighborhood of $(0, 0)$ in $X \times \Pi$. Since $F(0, p) \equiv 0$,
$$M_{0k} = 0 \quad\text{for all } k = 0, 1, 2, \ldots.$$
Furthermore, differentiating relation (37) at $(0, p)$, we get
$$F_u(0, p)h = \sum_{k=0}^{\infty} M_{1k}hp^k \quad\text{for all } h \in X.$$
Thus,
$$H(w, p, s) = \sum_{j \ge 1,\, k \ge 0} s^{j-1}M_{jk}(b + w)^j p^k, \tag{38}$$
along with
$$\sum_{j \ge 1,\, k \ge 0} |s|^{j-1}\,\|M_{jk}\|\,\|b + w\|^j\,\|p\|^k < \infty,$$
meaning that $H$ is analytic on some open neighborhood of $(0, 0, 0)$. Hence
$H$ is $C^\infty$ on that neighborhood. Furthermore,
$$H_w(0, 0, 0) = M_{10} = F_u(0, 0), \quad H_p(0, 0, 0)p = M_{11}bp = F_{pu}(0, 0)pb. \tag{39}$$
For the general case, the proof of (a) will be given in Problem 5.1, by
means of the Taylor theorem.
Ad (b). Set $B := F_u(0, 0)$ and $C := F_{pu}(0, 0)$. Then the linearized equation
$H_w(0, 0, 0)w + H_p(0, 0, 0)p = h$ is identical to
$$Bw + Cpb = h. \tag{40}$$
If we note that $QB = 0$, then (40), after applying $Q$ and $(I - Q)$, is
equivalent to
$$QCpb = Qh, \tag{41a}$$
$$Bw = (I - Q)(h - Cpb), \quad (w, p) \in W \times \mathbb{K}^n. \tag{41b}$$
Equation (41a) is equivalent to
$$\sum_{j=1}^n p_j\,QF_{p_j u}(0, 0)b = Qh,$$
that is,
$$\sum_{j=1}^n p_j\,\langle v_i^*, F_{p_j u}(0, 0)b\rangle = \langle v_i^*, Qh\rangle, \quad i = 1, \ldots, n.$$
By (H4), this equation has a unique solution $p \in \mathbb{K}^n$ for each given $h \in Y$.
Thus, equation (41a) can be solved uniquely for $p$; since $w \in W$, we can
then solve the second equation (41b) uniquely for $w$, by (H3). This proves
(b). □
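To see the construction of the proof in the simplest possible setting, consider the scalar model $F(u, p) := pu - u^3$ with $X = Y = \mathbb{R}$ and $n = 1$. This model is an assumption chosen for illustration and does not appear in the text; it is the classical pitchfork.

```latex
% Illustrative scalar model (assumption): F(u,p) = p u - u^3, X = Y = \mathbb{R}, n = 1, (u_0,p_0) = (0,0).
\begin{align*}
&\text{(H2):}\quad F(0,p) = 0 \text{ for all } p .\\
&\text{(H3):}\quad F_u(0,0) = 0,\ \text{so } N(F_u(0,0)) = \mathbb{R},\ R(F_u(0,0)) = \{0\};
 \ \text{choose } b = 1,\ v_1^* = \text{id},\ W = \{0\} .\\
&\text{(H4):}\quad a_{11} = \langle v_1^*, F_{pu}(0,0)b\rangle = 1 \ne 0 .\\
&\text{(35):}\quad H(0,p,s) = s^{-1}F(s,p) = p - s^2 \ (s \ne 0), \qquad H(0,p,0) = F_u(0,p)\,b = p .\\
&\text{(36):}\quad H = 0 \iff p(s) = s^2, \text{ which gives the nontrivial branch } u = s,\ p = s^2
 \text{ through } (0,0) .
\end{align*}
```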
5.13 Applications to Nonlinear Integral Equations
Let us consider the integral equation
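$$u(t) = p\int_a^b \Big(A_1(t, s)u(s) + \sum_{k=2}^{\infty} A_k(t, s)u(s)^k\Big)\,ds, \quad a \le t \le b,\ p \in \mathbb{R},\ u \in X, \tag{42}$$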
with the real Banach space $X := C[a, b]$, $-\infty < a < b < \infty$. The linearized
problem is
$$u(t) = p_0\int_a^b A_1(t, s)u(s)\,ds, \quad a \le t \le b,\ p_0 \in \mathbb{R},\ u \in X, \tag{43}$$
with the corresponding dual equation
$$v(t) = p_0\int_a^b A_1(s, t)v(s)\,ds, \quad a \le t \le b,\ p_0 \in \mathbb{R},\ v \in X. \tag{43*}$$
We set $(u \mid v) := \int_a^b u(t)v(t)\,dt$ for all $u, v \in X$. Assume that every function
$A_k\colon [a, b] \times [a, b] \to \mathbb{R}$ is continuous and that the majorant series
$$\sum_{k} \max_{a \le t, s \le b} |A_k(t, s)|\,\xi^k$$
is convergent for all real numbers $\xi$ in some open neighborhood of zero.
Recall that $p_0$ is called a characteristic number of (43) iff equation (43)
has a nontrivial solution $u$. Moreover, $p_0$ is called a simple characteristic
number iff (43) has precisely one linearly independent solution.
Proposition 1. (i) The regular case. Suppose that $p_0$ is not a characteristic
number of (43). Then there exist numbers $\rho > 0$ and $r > 0$ such that, for
each given $p \in \mathbb{R}$ with $|p - p_0| < \rho$, the original problem (42) has a unique
solution $u \in X$ with¹⁰ $\|u\| < r$.
(ii) The bifurcation case. Let $p_0$ be a simple characteristic number of
(43). Then $(0, p_0) \in X \times \mathbb{R}$ is a bifurcation point of the original problem
(42) provided that
$$(u \mid v) \ne 0, \tag{44}$$
where $u$ and $v$ are nontrivial solutions to (43) and (43*), respectively.
Proof. Let us write the original equation (42) in the form
$$F(u, p) = 0, \quad u \in X,\ p \in \mathbb{R}.$$
Obviously, the operator $F\colon U(0, p_0) \subseteq X \times \mathbb{R} \to X$ is analytic on some open
neighborhood of the point $(0, p_0)$ in $X \times \mathbb{R}$. The linearized equation
$$F_u(0, p_0)h = w, \quad h \in X, \tag{45}$$
corresponds to
$$h(t) - p_0\int_a^b A_1(t, s)h(s)\,ds = w(t), \quad a \le t \le b,\ h \in X.$$
Note that $F_u(0, p_0)\colon X \to X$ is a Fredholm operator of index zero.
Moreover, the partial derivative $(h, p) \mapsto F_{pu}(0, p_0)ph$ corresponds to the
integral operator
$$p\int_a^b A_1(t, s)h(s)\,ds.$$
¹⁰ Recall that $\|u\| = \max_{a \le x \le b} |u(x)|$.
Ad (i). By hypothesis, equation (45) with $w = 0$ has only the trivial
solution $h = 0$. Hence $F_u(0, p_0)\colon X \to X$ is bijective. The assertion now
follows from the implicit function theorem in Section 4.8.
Ad (ii). Let $v \in X$ be a nontrivial solution of (43*). Set
$$\langle v^*, g\rangle := (v \mid g) \quad\text{for all } g \in X.$$
Then, $v^* \in X^*$. By Standard Example 1 in Section 5.11, for given $w \in X$, equation
(45) has a solution $h$ iff
$$\langle v^*, w\rangle = 0.$$
The bifurcation condition (H4) from Section 5.12 reads as follows:
$$\langle v^*, F_{pu}(0, p_0)u\rangle = \int_a^b \Big(\int_a^b v(t)A_1(t, s)u(s)\,ds\Big)\,dt = p_0^{-1}(v \mid u) \ne 0.$$
The assertion now follows from Theorem 5.H. □
Remark 2. Suppose that $A_1(t, s) > 0$ for all $(t, s) \in [a, b] \times [a, b]$. Then,
by the classical theorem of Jentzsch,¹¹ the integral operator $L\colon X \to X$
belonging to the kernel $A_1$, namely,
$$(Lh)(t) := \int_a^b A_1(t, s)h(s)\,ds \quad\text{for all } t \in [a, b],$$
has a positive spectral radius $r$, and $p_0 := r^{-1}$ is a simple characteristic
number of both (43) and (43*), where the corresponding eigenfunctions $u$
and $v$ are positive on $[a, b]$. Thus, condition (44) is satisfied automatically,
and hence $(0, p_0) \in X \times \mathbb{R}$ is a bifurcation point of (42).
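The statement of Remark 2 can be explored numerically. The following sketch approximates the spectral radius $r$ of $L$ by power iteration after discretizing the integral with the midpoint rule. The kernel $A_1(t, s) = e^{t+s}$ on $[0, 1]$, the grid size, and the function name `spectral_radius` are assumptions made for illustration; for this rank-one kernel the exact value is $r = (e^2 - 1)/2$ with eigenfunction $e^t$.

```python
import math

def spectral_radius(kernel, a, b, n=100, iters=20):
    """Power iteration for (Lh)(t) = integral of kernel(t, s) h(s) ds over [a, b].

    The integral is replaced by the midpoint rule with n nodes.
    Returns the approximate spectral radius and the normalized eigenvector.
    """
    step = (b - a) / n
    nodes = [a + (i + 0.5) * step for i in range(n)]
    h = [1.0] * n                      # positive starting function
    r = 0.0
    for _ in range(iters):
        Lh = [step * sum(kernel(t, s) * hs for s, hs in zip(nodes, h)) for t in nodes]
        r = max(abs(x) for x in Lh)    # max-norm of L h, where max |h| = 1
        h = [x / r for x in Lh]
    return r, h

r, eigenfunction = spectral_radius(lambda t, s: math.exp(t + s), 0.0, 1.0)
p0 = 1.0 / r                           # approximate simple characteristic number
print("r  ~", r)
print("p0 ~", p0)
print("eigenfunction positive:", all(x > 0 for x in eigenfunction))
```

The computed eigenvector stays strictly positive, in line with the positivity of the eigenfunctions asserted in the remark.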
5.14 Applications to Nonlinear Boundary-Value
Problems
Let us study the boundary-value problem
$$-u''(t) + q(t)u(t) = p\Big(u(t) + \sum_{k=2}^{\infty} a_k u(t)^k\Big) \ \text{ on } [a, b], \quad u(a) = u(b) = 0,\ p \in \mathbb{R}, \tag{46}$$
where $-\infty < a < b < \infty$, along with the linearized problem
$$-u''(t) + q(t)u(t) = p_0 u(t) \ \text{ on } [a, b], \quad u(a) = u(b) = 0,\ p_0 \in \mathbb{R}. \tag{47}$$
¹¹ This is a special case of the functional analytic Krein-Rutman theorem (cf.
Zeidler (1986), Vol. 1, Example 7.30).
Let $a_k$ be fixed real numbers for which the series $\sum_k a_k\xi^k$ converges in
an open neighborhood of $\xi = 0$ in $\mathbb{R}$. Let $q\colon [a, b] \to \mathbb{R}$ be a continuous
function. Set
$$X := \{u \in C^2[a, b]\colon u(a) = u(b) = 0\}, \quad Y := C[a, b].$$
Proposition 1. The point $(0, p_0) \in X \times \mathbb{R}$ is a bifurcation point of (46) iff
$p_0$ is an eigenvalue of (47).
Proof. Let us write equation (46) in the form
$$F(u, p) = 0, \quad u \in X,\ p \in \mathbb{R}.$$
Then $F\colon U(0, p_0) \subseteq X \times \mathbb{R} \to Y$ is analytic on some open neighborhood of
the point $(0, p_0)$. The linearized equation
$$F_u(0, p_0)h = w, \quad h \in X, \tag{48}$$
corresponds to the boundary-value problem
$$-h''(t) + q(t)h(t) = p_0 h(t) + w(t) \ \text{ on } [a, b], \quad h(a) = h(b) = 0. \tag{49}$$
Let $p_0$ be an eigenvalue of (47) with the eigenfunctions $u$ and $v$. Since
$u(a) = u'(a) = 0$ along with (47) imply $u \equiv 0$, we get $u'(a) \ne 0$ and
$v'(a) \ne 0$. Hence $u'(a) = \lambda v'(a)$ for some $\lambda \in \mathbb{R}$, and the uniqueness of the
solution to the initial-value problem in (47) yields $u \equiv \lambda v$ on $[a, b]$. Thus,
there exists precisely one linearly independent eigenfunction $u$ to $p_0$.
By Standard Example 2 in Section 5.11, for given $w \in Y$, problem (49) has a
solution $h \in X$ iff
$$(u \mid w) := \int_a^b u(t)w(t)\,dt = 0, \tag{50}$$
and the operator $F_u(0, p_0)\colon X \to Y$ is Fredholm of index zero.
The partial derivative $(h, p) \mapsto F_{pu}(0, p_0)ph$ corresponds to the operator
$$(h, p) \mapsto -ph.$$
By (50), the decisive bifurcation condition (H4) of Theorem 5.H reads as
follows:
$$(u \mid F_{pu}(0, p_0)u) = -\int_a^b u(t)^2\,dt \ne 0.$$
The assertion now follows from Proposition 2 (necessity) and Theorem 5.H (sufficiency) in Section
5.12. □
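For orientation, the simplest instance of (47) can be worked out explicitly. The choice $q \equiv 0$ and $[a, b] = [0, \pi]$ is an assumption made for illustration; it does not come from the text.

```latex
% Illustrative special case (assumption): q \equiv 0 on [a,b] = [0,\pi].
\begin{align*}
&\text{(47):}\quad -u'' = p_0 u \text{ on } [0,\pi],\quad u(0) = u(\pi) = 0 .\\
&\text{Eigenvalues and eigenfunctions:}\quad p_0 = m^2,\quad u_m(t) = \sin(mt),\quad m = 1, 2, \ldots .\\
&\text{Bifurcation condition:}\quad (u_m \mid F_{pu}(0,p_0)u_m) = -\int_0^\pi \sin^2(mt)\,dt = -\frac{\pi}{2} \ne 0 .\\
&\text{Hence the bifurcation points of (46) are exactly } (0, m^2),\ m = 1, 2, \ldots .
\end{align*}
```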
5.15 Nonlinear Fredholm Operators
In this section, X and Y denote real Banach spaces.
Definition 1. Let $A\colon U \subseteq X \to Y$ be a $C^1$-map, where $U$ is open. Then $A$
is called a Fredholm map iff the linearization
$$A'(u)\colon X \to Y$$
is Fredholm for all $u \in U$.
Since $u \mapsto A'(u)$ is continuous, the index $\operatorname{ind} A'(u)$ is locally constant,
by Section 5.8. Thus, if $U = X$, then
$$\operatorname{ind} A := \operatorname{ind} A'(u)$$
is independent of $u \in X$. This number is called the index of $A$.
Definition 2. The operator $A\colon U \subseteq X \to Y$ is called proper iff the preimage
$A^{-1}(C)$ of every compact set $C$ in $Y$ is also compact.
Standard Example 3. The operator $A\colon X \to Y$ is proper provided
$$A = B + C,$$
where $B\colon X \to Y$ is a homeomorphism, $C\colon X \to Y$ is compact, and
$$\|Au\| \to \infty \quad\text{as } \|u\| \to \infty. \tag{51}$$
Proof. Let $\mathcal{C}$ be a compact subset of $Y$, and let $Au_n \in \mathcal{C}$ for all $n \in \mathbb{N}$. It
suffices to show that $(u_n)$ contains a convergent subsequence $(u_{n'})$. Then
$u_{n'} \to u$ as $n' \to \infty$, and hence $Au \in \mathcal{C}$, that is, $u \in A^{-1}(\mathcal{C})$. In the
following, we do not distinguish between sequences and subsequences.
Since $\mathcal{C}$ is bounded, it follows from (51) that $(u_n)$ is bounded. Since the operator $C$ is compact,
$$Cu_n \to w \quad\text{as } n \to \infty$$
for some $w$. All $Au_n$ live in the compact set $\mathcal{C}$, and so
$$Au_n \to v \quad\text{as } n \to \infty$$
for some $v$. Thus, $Bu_n \to v - w$ as $n \to \infty$. Since $B$ is a homeomorphism,
$u_n \to B^{-1}(v - w)$ as $n \to \infty$. □
Let us now consider the equation
$$Au = w, \quad u \in X. \tag{52}$$
Theorem 5.I (The Smale principle). Suppose that the $C^1$-operator $A\colon X \to Y$
is Fredholm and proper with $\operatorname{ind} A = 0$.
Then, for each $w_0 \in Y$ and $\varepsilon > 0$, there exists a point $w \in Y$ with
$\|w - w_0\| < \varepsilon$ such that the original equation (52) has at most a finite
number of solutions.
Smale proved this theorem in 1965. Roughly speaking, Theorem 5.I tells
us that in "most cases" equation (52) has at most a finite number of solutions.
One also says that (52) has generically at most a finite number of
solutions. The proof will be given later after some preparations.
Definition 4. Let the map $A\colon X \to Y$ be $C^1$.
(i) The point $u \in X$ is called a regular point of $A$ iff $A'(u)\colon X \to Y$ is
surjective. Otherwise, $u$ is called a singular point of $A$.
(ii) The point $v \in Y$ is called a regular value of $A$ iff the preimage $A^{-1}(v)$
is empty or consists solely of regular points. Otherwise, $v$ is called a
singular value of $A$ (i.e., $A^{-1}(v)$ contains at least one singular point).
Proposition 5 (Sard's theorem). If $A\colon \mathbb{R}^m \to \mathbb{R}^n$ is a $C^k$-mapping with
$k > \max(0, m - n)$ and $m, n \in \mathbb{N}$, then the set of singular values of $A$ has
$n$-dimensional Lebesgue measure zero in $\mathbb{R}^n$.
Consequently, the set of regular values of $A$ is dense in $\mathbb{R}^n$.
Sard proved this famous classic result in 1942. A proof can be found in
Abraham and Robbin (1967).
Example 6. Let $f\colon \mathbb{R} \to \mathbb{R}$ be a $C^1$-function. Then $u$ is a regular point of
$f$ iff
$$f'(u) \ne 0.$$
Moreover, $v$ is a singular value of $f$ iff there exists a point $u$ such that
$f(u) = v$ and $f'(u) = 0$.
The Sard theorem tells us that
Singular values are rare (cf. Figure 5.4).
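Example 6 can be made concrete with a single polynomial. The function $f(u) = u^3 - 3u$ below is an assumption chosen for illustration; its singular points are the two zeros of $f'$, so it has exactly two singular values, and every other real number is a regular value.

```python
def f(u):
    return u**3 - 3*u

def fprime(u):
    return 3*u**2 - 3

# f'(u) = 3(u - 1)(u + 1) vanishes exactly at u = -1 and u = 1.
singular_points = [-1, 1]
singular_values = sorted(f(u) for u in singular_points)

def is_regular_value(v):
    """v is regular iff no point u with f(u) = v has f'(u) = 0."""
    return v not in singular_values

# The regular value v = 0 has the three preimages 0 and +-sqrt(3), all regular points.
roots_of_zero = [-3**0.5, 0.0, 3**0.5]

print("singular values:", singular_values)
print("0 is a regular value:", is_regular_value(0))
print("f' at the preimages of 0:", [fprime(u) for u in roots_of_zero])
```

The regular value $0$ has finitely many preimages, each of which is a regular point.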
Example 7. Suppose that the $C^1$-operator $A\colon X \to Y$ is Fredholm and
proper with $\operatorname{ind} A = 0$. Let $w$ be a regular value of $A$. Then
(i) If $Au = w$, then $A$ is a local $C^1$-diffeomorphism at $u$.
(ii) The equation $Au = w$ has at most a finite number of solutions.
FIGURE 5.4.
Proof. Ad (i). Since the Fredholm operator $A'(u)\colon X \to Y$ is surjective
and $\operatorname{ind} A'(u) = 0$, $A'(u)\colon X \to Y$ is bijective. By the local inverse mapping
theorem from Section 4.10, $A$ is a local $C^1$-diffeomorphism at $u$.
Ad (ii). Since $A$ is proper, the set $A^{-1}(w)$ is compact. Suppose that there
exists an infinite sequence $(u_n)$ of pairwise different points with
$$Au_n = w \quad\text{for all } n.$$
Since $A^{-1}(w)$ is compact, there exists a subsequence, again denoted by
$(u_n)$, such that $u_n \to u$ as $n \to \infty$. Hence $Au = w$. By (i), $u$ is an isolated
solution of $Au = w$. This is a contradiction. □
Let us assume the following.
(H) $A\colon U(u_0) \subseteq X \to Y$ is a $C^k$-Fredholm map with $k > \max(\operatorname{ind} A'(u_0), 0)$,
where $U(u_0)$ is an open neighborhood of $u_0$.
Lemma 8. If (H) holds, then there exists an open neighborhood $V(u_0)$ of
$u_0$ in $X$ such that the regular values of the restriction $A|_{V(u_0)}$ are dense in
$Y$.
Proof. We make essential use of the local normal form (81) from Section
4.12. Let $N := N(A'(u_0))$ and $R := R(A'(u_0))$. We choose topological
direct sums
$$X = N \oplus N^c \quad\text{and}\quad Y = R \oplus R^c. \tag{53}$$
By Proposition 1 in Section 4.12, there exists a $C^k$-diffeomorphism
$$\phi\colon U(0, 0) \subseteq N \times R \to V(u_0)$$
such that the relation
$$h(n, r) = Au_0 + r + g(n, r) \quad\text{on } U(0, 0) \tag{54}$$
holds for $h(n, r) := A(\phi(n, r))$, with $n \in N$, $r \in R$, and $g(n, r) \in R^c$ on
$U(0, 0)$.
The dimensions of $N$ and $R^c$ are finite, because $A'(u_0)$ is Fredholm. The
product of a linear surjective operator with a linear bijective operator is
surjective. Thus, by the local inverse mapping theorem from Section 4.10,
regular values are invariant under diffeomorphisms. Consequently, it suffices
to show that the regular values of $h$ are dense in $Y$.
Let $v \in Y$. We decompose
$$v = Au_0 + v_1 + v_2, \quad\text{where } v_1 \in R \text{ and } v_2 \in R^c. \tag{54*}$$
Let $\psi(n) := g(n, v_1)$. Then $\psi\colon U(0) \subseteq N \to R^c$ is $C^k$, by (54). Letting
$$m := \dim N - \dim R^c,$$
we obtain $m = \operatorname{ind} A'(u_0)$ and $k > \max(m, 0)$ from (H).
(a) According to Sard's theorem (Proposition 5), the regular values of $\psi$
are dense in $R^c$.
(b) We show that if $v_2$ is a regular value of $\psi$, then $v$ is a regular value
of $h$. Indeed, by (54) and (54*), from $h(n, r) = v$ it follows that $r = v_1$ and
$\psi(n) = v_2$. Moreover, by (54), we have
$$h'(n, v_1)(\bar{n}, \bar{r}) = \bar{r} + \psi'(n)\bar{n} + g_r(n, v_1)(0, \bar{r})$$
for all $\bar{n} \in N$, $\bar{r} \in R$. Note that $Y = R \oplus R^c$ as well as $g_r(n, v_1)(0, \bar{r}) \in R^c$
for all $\bar{r} \in R$. Therefore, the surjectivity of $\psi'(n)\colon N \to R^c$ implies the
surjectivity of
$$h'(n, v_1)\colon N \times R \to Y.$$
It follows from (a) and (b) that the regular values of $h$ are dense in $Y$. □
Lemma 9. If (H) holds, then $A$ is locally closed, that is, $A$ maps closed
sets contained in a sufficiently small open neighborhood of the point $u_0$ onto
closed sets.
Proof. Since $h = A \circ \phi$, it suffices to show that $h$ maps closed sets onto
closed sets.¹² Let
$$h(n_k, r_k) \to w \quad\text{as } k \to \infty,$$
and let
$$w = Au_0 + w_1 + w_2, \quad\text{where } w_1 \in R \text{ and } w_2 \in R^c.$$
Assume that $(n_k, r_k)$ lives in the bounded closed set $M$ for all $k$. It follows
from (53) and (54) that $r_k \to w_1$ as $k \to \infty$. Since $\dim N < \infty$, we find
that $n_k \to n$ as $k \to \infty$ for some $n \in N$, after passing to a subsequence, if
necessary. Hence $h(n, w_1) = w$ and $(n, w_1) \in M$. □
¹² Observe that $\phi$ is a homeomorphism and use Lemma 12 from Section 3.9.
Lemma 10. If (H) holds and $u_0$ is a regular point of $A$, then there exists
an open neighborhood of $u_0$ that contains only regular points of $A$.
Proof. This follows from normal form (54) with $R = Y$ and $g \equiv 0$. Note
that
$$h'(n, r)(\bar{n}, \bar{r}) = \bar{r} \quad\text{for all } \bar{r} \in Y$$
and all $(n, r) \in U(0, 0)$, that is, $h'(n, r)$ is surjective for these points. □
Corollary 11. If (H) holds, then there exists an open neighborhood $W(u_0)$
of $u_0$ in $X$ such that the set of singular values of the restriction $A|_{W(u_0)}$ is
closed in $Y$, and the set of regular values of $A|_{W(u_0)}$ is open and dense in
$Y$.
Thus, the set of singular values of $A|_{W(u_0)}$ is nowhere dense in $Y$.
Proof. Let $R$ be the set of regular points of $A$ in a sufficiently small
neighborhood $W(u_0)$ of the point $u_0$. By Lemma 10, the set $R$ is open,
and hence the set $S := W(u_0) - R$ of singular points of $A$ in $W(u_0)$ is
closed. Lemma 9 tells us that $A(S)$ is closed in $Y$. Thus, the set $Y - A(S)$
of regular values of $A|_{W(u_0)}$ is open and dense in $Y$, by Lemma 8. □
Lemma 12. If $A\colon X \to Y$ is continuous and proper, then $A$ transforms
closed sets onto closed sets.
Proof. Let $C$ be a closed set in $X$, and let $Au_n = v_n$, where $u_n \in C$ for
all $n \in \mathbb{N}$ and $v_n \to v$ as $n \to \infty$. The set of all $v_n$ together with $v$ is
compact. Therefore, $(u_n)$ contains a convergent subsequence with $u_{n'} \to u$
as $n' \to \infty$. Since $C$ is closed and $A$ is continuous, we have $u \in C$ and
$Au = v$. This means that $A(C)$ is closed. □
Proposition 13 (Sard-Smale theorem). Let $A\colon X \to Y$ be a proper
$C^k$-Fredholm map with $k > \max(\operatorname{ind} A, 0)$.
Then the set of regular values of $A$ is open and dense in $Y$.
Proof of Theorem 5.I. Proposition 13 and Example 7 immediately imply
Theorem 5.I. □
Proof of Proposition 13. Step 1: By Lemma 10, the set of the regular
points of $A$ is open in $X$. Thus, the set of the singular points of $A$ is closed
in $X$, and hence Lemma 12 tells us that the set of singular values of $A$ is
closed in $Y$. Consequently, the set $\operatorname{reg}(A)$ of the regular values of $A$ is open
in $Y$.
Step 2: Choose a fixed point $v \in Y$. Let $U$ be an open set in $X$ such that
$A^{-1}(v) \subseteq U$. Then there exists an open neighborhood $V(v)$ of the point $v$
with
$$A^{-1}(V(v)) \subseteq U.$$
Otherwise, there exists a convergent sequence $Au_n \to v$ as $n \to \infty$ with
$u_n \notin U$ for all $n$. Since $A$ is proper, we may assume that $u_n \to u$ as
$n \to \infty$ after passing to a subsequence, if necessary. This yields the desired
contradiction $u \notin U$ and $Au = v$.
Step 3: By Corollary 11, there exists an open neighborhood $W(u)$ for
each point $u \in A^{-1}(v)$ such that the singular values of the restriction
$$A|_{W(u)}$$
form a nowhere dense set $S(u)$ in $Y$. Since $A$ is proper, the set $A^{-1}(v)$
is compact, and hence finitely many sets $W(u_1), \ldots, W(u_m)$ already cover
the set $A^{-1}(v)$ (cf. Problem 1.14). Set
$$U := \bigcup_{j=1}^m W(u_j).$$
Then the set
$$S_U := \bigcup_{j=1}^m S(u_j)$$
of singular values of the restriction $A|_U$ is of the first Baire category.
Step 4: According to Step 2 there exists an open neighborhood $V(v)$ of $v$
such that $A^{-1}(V(v)) \subseteq U$. Hence the set of singular values $S$ of $A\colon X \to Y$
in $V(v)$ is equal to $S_U$ (i.e., $S$ is of the first Baire category). Hence the set
$V(v) - S$ of regular values of $A$ in $V(v)$ is dense in $V(v)$ (cf. Problem 3.1).
Since the point $v$ can be chosen arbitrarily, the set of regular values of
the operator $A$ is dense in $Y$. □
5.16 Interpolation Inequalities
Let $0 < \alpha < 1$. Inequalities of the type
$$\|u\|_Y \le \text{const}\,\|u\|_X^{\alpha}\,\|u\|_Z^{1-\alpha} \quad\text{for all } u \in X, \tag{55}$$
are called interpolation inequalities. Such inequalities play a fundamental
role in modern analysis. They allow us to give efficient existence proofs
for nonlinear partial differential equations. For example, the interpolation
inequality (60) implies the compactness of the embedding $\mathring{W}_2^1(G) \subseteq L_4(G)$.
This will be used in the next section in order to give an existence proof for
the famous stationary Navier-Stokes equations.¹³ Interpolation inequalities
for Sobolev spaces follow from integral inequalities based on the Hölder
inequality.
¹³ A detailed study of interpolation inequalities can be found in Zeidler (1986),
Vol. 2A, Section 21.17ff.
Recall the following. Let $X$ and $Z$ be Banach spaces over $\mathbb{K}$ such that
$X \subseteq Z$. Define an operator $E\colon X \to Z$ that assigns to each element $u$ of $X$
the same $u$, but now regarded as an element of $Z$. The operator $E$ is called
an embedding operator. The embedding $X \subseteq Z$ is called continuous iff the
operator $E$ is continuous, that is,
$$\|u\|_Z \le \text{const}\,\|u\|_X \quad\text{for all } u \in X.$$
Furthermore, the embedding $X \subseteq Z$ is called compact iff the operator $E$ is
compact, that is, the embedding $X \subseteq Z$ is continuous and each bounded
sequence in $X$ has a subsequence that converges in $Z$.
The following simple result is crucial.
Proposition 1 (Compact embedding). Let $X$, $Y$, and $Z$ be Banach spaces
over $\mathbb{K}$ such that we have the inclusions
$$X \subseteq Y \quad\text{and}\quad X \subseteq Z$$
along with the interpolation inequality (55) for the norms. Then the following
conditions are met:
(i) If the embedding $X \subseteq Z$ is continuous, then so is $X \subseteq Y$.
(ii) If the embedding $X \subseteq Z$ is compact, then so is $X \subseteq Y$.
Proof. Ad (i). Since the embedding $X \subseteq Z$ is continuous,
$$\|u\|_Z \le \text{const}\,\|u\|_X \quad\text{for all } u \in X.$$
By (55), $\|u\|_Y \le \text{const}\,\|u\|_X^{\alpha}\,\|u\|_X^{1-\alpha} = \text{const}\,\|u\|_X$ for all $u \in X$.
Ad (ii). Let $(u_n)$ be a bounded sequence in $X$. Since the embedding
$X \subseteq Z$ is compact, there is a subsequence $(u_{n'})$ that converges in $Z$. Let
us denote $(u_{n'})$ by $(u_n)$. It follows from (55) that
$$\|u_n - u_m\|_Y \le \text{const}\,\|u_n - u_m\|_X^{\alpha}\,\|u_n - u_m\|_Z^{1-\alpha} \quad\text{for all } n, m. \tag{56}$$
The sequence $(u_n)$ is bounded in $X$ and Cauchy in $Z$. By (56), $(u_n)$ is also Cauchy in $Y$. Thus,
the sequence $(u_n)$ converges in the Banach space $Y$. □
Let $G$ be a nonempty bounded open set in $\mathbb{R}^3$. Define¹⁴
$$(u \mid v)_2 := \int_G uv\,dx, \qquad (u \mid v) := \int_G \partial_j u\,\partial_j v\,dx,$$
$$(u \mid v)_{1,2} := (u \mid v)_2 + (u \mid v),$$
and
$$\|u\| := (u \mid u)^{1/2}.$$
¹⁴ We sum over two equal indices from 1 to 3.
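Furthermore, let
$$\|u\|_2 := (u \mid u)_2^{1/2}, \qquad \|u\|_{1,2} := (u \mid u)_{1,2}^{1/2}, \qquad \|u\|_4 := \Big(\int_G u^4\,dx\Big)^{1/4}.$$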
Lemma 2. The norm $\|\cdot\|$ is equivalent to the original norm $\|\cdot\|_{1,2}$ on the
Sobolev space $\mathring{W}_2^1(G)$.
Proof. By the Poincaré-Friedrichs inequality, there is a constant $c > 0$
such that
$$c\,\|u\|_2^2 \le \|u\|^2 \quad\text{for all } u \in \mathring{W}_2^1(G)$$
(cf. Section 2.5.6 in AMS Vol. 108). Hence $c\,\|u\|_2^2 + c\,\|u\|^2 \le (1 + c)\,\|u\|^2$.
Thus, there is a constant $d > 0$ such that
$$d\,\|u\|_{1,2} \le \|u\| \le \|u\|_{1,2} \quad\text{for all } u \in \mathring{W}_2^1(G). \tag{57}$$
Let $(u_n)$ be a sequence in $\mathring{W}_2^1(G)$. Relation (57) tells us that $(u_n)$ is
Cauchy with respect to the norm $\|\cdot\|_{1,2}$ iff it is Cauchy with respect to
the equivalent norm $\|\cdot\|$. Consequently, $\mathring{W}_2^1(G)$ is a Hilbert space equipped
with the new inner product $(\cdot \mid \cdot)$. In what follows we will always refer to
this new inner product.
Definition 3. Let $L_4(G)$ denote the set of all measurable functions $u\colon G \to \mathbb{R}$
such that $\|u\|_4 < \infty$.
Then, $L_4(G)$ becomes a real Banach space¹⁵ with respect to the norm
$\|\cdot\|_4$. The following result will be critically used in the next section.
Proposition 4. The embedding $\mathring{W}_2^1(G) \subseteq L_4(G)$ is compact.
Proof. Set
$$X := \mathring{W}_2^1(G), \quad Y := L_4(G), \quad Z := L_2(G).$$
We will show in Lemma 7 ahead that in this case the interpolation inequality
(55) holds true.
Rellich's compactness theorem from Section 5.7 in AMS Vol. 108 tells us
that the embedding $X \subseteq Z$ is compact. Thus, it follows from Proposition
1 that the embedding $X \subseteq Y$ is also compact. □
¹⁵ See Problem 5.9 for a more general result. By definition, two functions $u$
and $v$ represent the same element of $L_4(G)$ iff their values differ only on a set of
measure zero.
It remains to prove Lemma 7. To accomplish this, we need some preparations.
Lemma 5. For all $u \in C_0^\infty(\mathbb{R}^2)$,
$$\int_{\mathbb{R}^2} u^4\,dx \le 4\int_{\mathbb{R}^2} u^2\,dx \int_{\mathbb{R}^2} (u_\xi^2 + u_\eta^2)\,dx, \tag{58}$$
where $x = (\xi, \eta)$.
Proof. In what follows we write briefly the integral $\int$ instead of $\int_{-\infty}^{\infty}$. From
we obtain the key inequality:
(59)
Hence,
Therefore, by the Schwarz inequality,
This immediately implies (58). □
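The steps of this proof can be sketched as follows. This is one standard way to carry out the argument, written under the assumption that the key inequality (59) is the one-dimensional estimate in the second line; the exact form of the individual steps may differ.

```latex
% Sketch under the stated assumption on the form of (59); u \in C_0^\infty(\mathbb{R}^2), x = (\xi,\eta).
\begin{align*}
u(\xi,\eta)^2 &= 2\int_{-\infty}^{\xi} u\,u_\xi\,d\xi'
 && \text{(fundamental theorem of calculus, compact support)}\\
\max_{\xi} u(\xi,\eta)^2 &\le 2\int |u\,u_\xi|\,d\xi ,
 \qquad \max_{\eta} u(\xi,\eta)^2 \le 2\int |u\,u_\eta|\,d\eta
 && \text{(key inequality)}\\
\iint u^4\,d\xi\,d\eta
 &\le \int \max_{\xi} u^2\,d\eta \cdot \int \max_{\eta} u^2\,d\xi
 \le 4 \iint |u\,u_\xi|\,d\xi\,d\eta \iint |u\,u_\eta|\,d\xi\,d\eta \\
 &\le 4 \iint u^2\,d\xi\,d\eta\,
 \Big(\iint u_\xi^2\,d\xi\,d\eta\Big)^{1/2}\Big(\iint u_\eta^2\,d\xi\,d\eta\Big)^{1/2}
 && \text{(Schwarz inequality)}\\
 &\le 4 \iint u^2\,d\xi\,d\eta \iint (u_\xi^2 + u_\eta^2)\,d\xi\,d\eta .
\end{align*}
```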
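Lemma 6. For all $u \in C_0^\infty(\mathbb{R}^3)$,
where $x = (\xi, \eta, \zeta)$.
Proof. Set $J := \int_{\mathbb{R}^3} u^4\,dx$. By Lemma 5,
$$J = \int d\zeta \iint u(\xi, \eta, \zeta)^4\,d\xi\,d\eta
\le 4\int d\zeta \iint u^2\,d\xi\,d\eta \iint (u_\xi^2 + u_\eta^2)\,d\xi\,d\eta
\le 4\max_{\zeta} \iint u(\xi, \eta, \zeta)^2\,d\xi\,d\eta \iiint (u_\xi^2 + u_\eta^2)\,d\xi\,d\eta\,d\zeta.$$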
Applying the key inequality (59), we obtain
By the Schwarz inequality,
□
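The last two steps of this proof can be sketched as follows, again under the assumption that (59) is the one-dimensional estimate $\max_\zeta u^2 \le 2\int |u\,u_\zeta|\,d\zeta$. The resulting constant is what this particular chain of estimates gives; it is not claimed to be the form in which the lemma is stated.

```latex
% Sketch under the stated assumption on (59); u \in C_0^\infty(\mathbb{R}^3), J = \int_{\mathbb{R}^3} u^4 dx.
\begin{align*}
\max_{\zeta} \iint u(\xi,\eta,\zeta)^2\,d\xi\,d\eta
 &\le 2\int_{\mathbb{R}^3} |u\,u_\zeta|\,dx
 \le 2\Big(\int_{\mathbb{R}^3} u^2\,dx\Big)^{1/2}\Big(\int_{\mathbb{R}^3} u_\zeta^2\,dx\Big)^{1/2},\\
J &\le 8\Big(\int_{\mathbb{R}^3} u^2\,dx\Big)^{1/2}\Big(\int_{\mathbb{R}^3} u_\zeta^2\,dx\Big)^{1/2}
 \int_{\mathbb{R}^3} (u_\xi^2 + u_\eta^2)\,dx
 \le 8\Big(\int_{\mathbb{R}^3} u^2\,dx\Big)^{1/2}
 \Big(\int_{\mathbb{R}^3} (u_\xi^2 + u_\eta^2 + u_\zeta^2)\,dx\Big)^{3/2}.
\end{align*}
% Taking fourth roots gives \|u\|_4 \le 8^{1/4}\|u\|_2^{1/4}\|u\|^{3/4},
% which in particular yields an estimate of the same form with the constant 8, as in (63).
```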
Lemma 7. For all $u \in \mathring{W}_2^1(G)$, we have the crucial Ladyzhenskaya inequality
$$\|u\|_4 \le 8\,\|u\|_2^{1/4}\,\|u\|^{3/4}. \tag{60}$$
Recall that $\|\cdot\|$ denotes the norm on $\mathring{W}_2^1(G)$ corresponding to the inner
product $(\cdot \mid \cdot)$.
Proof. Let $u \in \mathring{W}_2^1(G)$. Since the set $C_0^\infty(G)$ is dense in $\mathring{W}_2^1(G)$, there
exists a sequence $(u_n)$ in $C_0^\infty(G)$ such that $u_n \to u$ in $\mathring{W}_2^1(G)$ as $n \to \infty$.
Hence,
$$u_n \to u \ \text{ in } L_2(G) \quad\text{as } n \to \infty, \tag{61}$$
by Lemma 2. Furthermore, Lemma 6 tells us that
$$\|u_n\|_4 \le 8\,\|u_n\|_2^{1/4}\,\|u_n\|^{3/4} \quad\text{for all } n, \tag{62}$$
and
$$\|u_n - u_m\|_4 \le 8\,\|u_n - u_m\|_2^{1/4}\,\|u_n - u_m\|^{3/4} \quad\text{for all } n, m. \tag{63}$$
Thus, the sequence $(u_n)$ is Cauchy in the Banach space $L_4(G)$. This implies
the existence of a function $v \in L_4(G)$ such that
$$u_n \to v \ \text{ in } L_4(G) \quad\text{as } n \to \infty. \tag{64}$$
We want to show that
$$u(x) = v(x) \quad\text{for almost all } x \in G. \tag{65}$$
Then, letting $n \to \infty$, we obtain the desired inequality (60) from (62).
To prove (65) we will use the following standard trick. By Problem
5.9a(iv), it follows from (61) that there is a subsequence $(u_{n'})$ of $(u_n)$
such that
$$u_{n'}(x) \to u(x) \quad\text{as } n' \to \infty \text{ for almost all } x \in G.$$
Similarly, it follows from (64) that there is a subsequence $(u_{n''})$ of $(u_{n'})$
such that
$$u_{n''}(x) \to v(x) \quad\text{as } n'' \to \infty \text{ for almost all } x \in G.$$
This implies (65). □
Proposition 8. Let $\mathcal{H}$ and $Z$ be real Hilbert spaces such that the embedding
$\mathcal{H} \subseteq Z$
is continuous, and $\mathcal{H}$ is dense in $Z$. Then the embedding
$Z \subseteq \mathcal{H}^*$ (66)
is continuous. In addition, $Z$ is dense in the dual space $\mathcal{H}^*$.
The precise interpretation of relation (66) will be given in Step 3 of the
following proof.
Proof. Step 1: The injective map $\psi\colon Z \to \mathcal{H}^*$. Let $v \in Z$. Define $v^*$ through
$v^*(u) := (v \mid u)_Z$ for all $u \in \mathcal{H}$. (67)
Then
$|v^*(u)| \le \|v\|_Z \|u\|_Z$ for all $u \in \mathcal{H}$.
Since the embedding $\mathcal{H} \subseteq Z$ is continuous, $\|u\|_Z \le \text{const}\,\|u\|_{\mathcal{H}}$ for all $u \in \mathcal{H}$.
Hence,
$|v^*(u)| \le \text{const}\,\|v\|_Z \|u\|_{\mathcal{H}}$ for all $u \in \mathcal{H}$. (68)
This shows that $v^*\colon \mathcal{H} \to \mathbb{R}$ is a linear continuous functional on $\mathcal{H}$ (i.e.,
$v^* \in \mathcal{H}^*$). Furthermore, it follows from (68) that
$\|v^*\|_{\mathcal{H}^*} \le \text{const}\,\|v\|_Z$ for all $v \in Z$. (69)
Define now the map $\psi\colon Z \to \mathcal{H}^*$ through
$\psi(v) := v^*$.
Then, $\psi$ is linear and continuous, by (69). Moreover, $\psi$ is injective. In fact,
if $\psi(v) = 0$ for fixed $v \in Z$, then $(v \mid u)_Z = 0$ for all $u \in \mathcal{H}$. Since $\mathcal{H}$ is
dense in $Z$, $v = 0$.
Step 2: We show that the set $\psi(Z)$ is dense in $\mathcal{H}^*$. Suppose that this is
not true. Then the closure $\overline{\psi(Z)}$ of $\psi(Z)$ in $\mathcal{H}^*$ is a proper closed linear
subspace of $\mathcal{H}^*$. Choose a point $u^* \in \mathcal{H}^*$ such that $u^* \notin \overline{\psi(Z)}$. By Section
1.2, there exists a functional $f \in (\mathcal{H}^*)^*$ such that $f(u^*) \ne 0$ and
$f(v^*) = 0$ for all $v^* \in \psi(Z)$. (70)
Since the Hilbert space $\mathcal{H}$ is reflexive, there exists a point $w \in \mathcal{H}$ such that
$f(v^*) = v^*(w)$ for all $v^* \in \mathcal{H}^*$.
By (70), $v^*(w) = 0$ for all $v^* \in \psi(Z)$. Thus, relation (67) tells us that
$(v \mid w)_Z = 0$ for all $v \in Z$.
Hence $w = 0$. Therefore, we obtain $f = 0$, contradicting $f(u^*) \ne 0$.
Step 3: Interpretation of relation (66). Since the map $\psi\colon Z \to \mathcal{H}^*$ is
injective, we can identify the point $v$ in $Z$ with the point $\psi(v)$ in $\mathcal{H}^*$. In
this sense, we write $Z \subseteq \mathcal{H}^*$. □
Proposition 9. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$ such that the embedding
$X \subseteq Y$ is continuous. In addition, let $M$ be a set in the dual space
$X^*$. Then the following hold true:
(i) The embedding $Y^* \subseteq X^*$ is continuous.
(ii) If $M$ is open in $X^*$, then the intersection $Y^* \cap M$ is open in $Y^*$.
(iii) If $M$ is dense in $X^*$, then $Y^* \cap M$ is dense in $Y^*$.
Proof. Ad (i). Let $f \in Y^*$. Then the functional $f\colon Y \to \mathbb{K}$ is linear and
continuous. Hence the restriction $f\colon X \to \mathbb{K}$ is also linear and continuous.
Moreover, it follows from
$\sup\{|f(u)| : u \in X,\ \|u\|_X \le 1\} \le \sup\{|f(u)| : u \in Y,\ \|u\|_Y \le 1\}$
that
$\|f\|_{X^*} \le \|f\|_{Y^*}$.
Ad (ii). The embedding operator $E\colon Y^* \to X^*$ is continuous. Thus, the
preimage of open sets is again open. In particular, $E^{-1}(M)$ is open in $Y^*$.
Observe that $E^{-1}(M) = Y^* \cap M$.
Ad (iii). Let $f \in Y^*$. Choose any $\varepsilon > 0$. Since $M$ is dense in $X^*$, there
exists a linear continuous functional $g\colon X \to \mathbb{K}$ such that $g \in M$ and
$\|f - g\|_{X^*} < \varepsilon$.
Since $X$ is a linear subspace of $Y$, there exists an extension $g\colon Y \to \mathbb{K}$ such
that $g \in Y^*$ and
$\|f - g\|_{Y^*} < \varepsilon$,
by the Hahn-Banach theorem (which allows norm-preserving extensions).
Finally, observe that $g \in Y^* \cap M$. □
5.17 Applications to the Navier-Stokes Equations
Undoubtedly, the Navier-Stokes equations are of basic importance
within the context of modern theory of partial differential equations.
Although the range of their applicability to concrete problems has
now been clearly recognized to be limited, as my dear friend and
bright colleague K.R. Rajagopal has showed me by several examples
during the last six years, the mathematical questions that remain
open are of such a fascinating and challenging nature that analysts
and applied mathematicians cannot help being attracted by them
and trying to contribute to their resolution. Thus, it is not a co-
incidence that over the past ten years more than seventy signifi-
cant research papers have appeared concerning the well-posedness of
boundary and initial-boundary value problems. 16
Giovanni Paolo Galdi (1994)
In this section, we want to combine important tools from functional
analysis in order to solve the famous stationary Navier-Stokes equations.
In particular, we will use the Riesz theorem, the closed range theorem,
the Leray-Schauder principle, and the Smale principle for nonlinear proper
Fredholm operators of index zero. Furthermore, we will use the theory of
distributions and the theory of Sobolev spaces introduced in AMS Vol. 108.
We recommend that the reader study this section carefully. This way,
the reader gets an impression of the modern approach to nonlinear partial
differential equations arising in mathematical physics. We would also like to show
that the language of functional analysis is the right language for modern
mathematical physics.
Let $G$ be a nonempty, bounded, open, and connected set in $\mathbb{R}^3$. The
stationary motion of an incompressible viscous fluid in $G$ is governed by
the following Navier-Stokes equations:
$-\eta\Delta v + \rho(v\nabla)v = K - \nabla p$ on $G$ (equation of motion),
$\nabla v = 0$ on $G$ (incompressibility condition),
$v = 0$ on $\partial G$ (boundary condition).
(71)
Here, we use the following notation: 17
v(x) = velocity vector of the fluid at the point x,
p(x) = pressure at the point x,
$\rho$ = constant density of the fluid,
$K(x)$ = outer force density at the point $x$,
$\eta$ = viscosity (a positive constant).
16 The four-volume monograph by Galdi (1994) contains a detailed, up-to-date
study of the Navier-Stokes equations.
17 The total force acting on the fluid equals $\int_G K(x)\,dx$.
The full equation of motion
$\rho v_t - \eta\Delta v + \rho(v\nabla)v = K - \nabla p$
corresponds to Newton's law: mass × acceleration = force. In the stationary
case, all the quantities are independent of time $t$. In particular, the time
derivative $v_t$ of the velocity field vanishes. This yields the first equation in
(71). The boundary condition in (71) reflects the experimental fact that a
viscous fluid sticks to the boundary. To simplify notation, set $\rho = 1$.
Turbulence. Physical experiments show that turbulence occurs if the
outer force densities K are sufficiently large. This physical effect strongly
complicates the mathematics of the Navier-Stokes equations.
A detailed physical motivation of the Navier-Stokes equations can be
found in Zeidler (1986), Vol. 4, Section 70.3.
5.17.1 Reformulation of the Classical Problem
The original problem (71) can be written in the following form:
$-\eta\Delta v + \nabla(v \otimes v) = K - \nabla p$ (equation of motion),
$\nabla v = 0$ (incompressibility condition), (72)
$v = 0$ (boundary condition).
To show this, let us choose a Cartesian coordinate system with the orthonormal
basis $e_1$, $e_2$, and $e_3$. Then, $x = x_j e_j$ along with 18
$v = v_j e_j$ and $K = K_j e_j$.
Set $\partial_j := \partial/\partial x_j$. Observe that
$(v\nabla)v = v_j\,\partial_j v_m\,e_m = \partial_j(v_j v_m)\,e_m - v_m(\partial_j v_j)\,e_m$
$= \partial_j(v_j v_m)\,e_m = \nabla(v \otimes v)$
because $\nabla v = \partial_j v_j = 0$. Thus, for $m = 1, 2, 3$, problem (71) reads as
follows:
$-\eta\,\partial_j\partial_j v_m + \partial_j(v_j v_m) = K_m - \partial_m p$ on $G$ (equation of motion),
$\partial_j v_j = 0$ on $G$ (incompressibility condition),
$v_m = 0$ on $\partial G$ (boundary condition).
(73)
This is identical to the invariant formulation in (72).
18We sum over two equal indices from 1 to 3.
5.17.2 The Classical Basic Idea
To find smooth solutions to the original problem (72), let us write this in
the following form:
$P = -\eta\Delta v + \nabla(v \otimes v) - K$ on $G$, $v \in X$, (74)
$-\nabla p = P$ on $G$, (75)
$\int_G P w\,dx = 0$ for all $w \in X$. (76)
Here, $X$ denotes the set of all smooth velocity fields $v$ on $G$ that satisfy
both the incompressibility condition
$\nabla v = 0$ on $G$
and the boundary condition $v = 0$ on $\partial G$. From the physical point of view,
$P$ is the force density generated by the pressure $p$ in the fluid.
Let us discuss this.
Step 1: Suppose that $(v, p)$ is a smooth solution of the original problem
(72). We want to show that $(v, p)$ satisfies equations (74) through (76).
In fact, $v \in X$. Moreover, it follows immediately from (72) that the
velocity field $v$ and the pressure $p$ satisfy equations (74) and (75). Equation
(75) implies
$\int_G P w\,dx = -\int_G (\nabla p) w\,dx$ for all $w \in X$.
Integration by parts yields
$\int_G P w\,dx = \int_G p(\nabla w)\,dx = 0$ for all $w \in X$,
since $\nabla w = 0$ on $G$. This shows that
Relation (76) represents a necessary solvability condition for the pressure
equation (75).
In summary, $(v, p)$ is a solution to (74) through (76).
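Written in components, the integration by parts in Step 1 can be spelled out as follows. This is a hedged sketch for smooth $p$ and $w$ on a region with sufficiently regular boundary; we sum over $m$, and $n_m$ denotes the components of the outer unit normal on $\partial G$, a symbol introduced only for this sketch. The boundary term drops out because $w = 0$ on $\partial G$, and the volume term vanishes because $\partial_m w_m = 0$ on $G$.

```latex
\begin{align*}
\int_G P_m w_m \,dx
  &= -\int_G (\partial_m p)\, w_m \,dx \\
  &= \int_G p\, \partial_m w_m \,dx - \int_{\partial G} p\, w_m n_m \,dS \\
  &= 0 .
\end{align*}
```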
Step 2: Conversely, assume that there is a solution (v, P) of equation
(74) such that relation (76) is additionally satisfied. We want to show that
there exists a pressure function p that satisfies equation (75).
Let G be a simply connected region with smooth boundary. To determine
the pressure p, we use the basic fact from classical vector calculus that
Relation (76) is also a sufficient solvability condition for the pressure
equation (75).
Thus, equation (75) indeed has a solution p. To summarize, (v,p,P) is a
solution of (74) through (76). This implies that (v,p) is a solution of the
original problem (72).
The preceding discussion shows the crucial fact that it suffices to find
a velocity field $v \in X$ such that the following orthogonality relation holds
true:
$\int_G (-\eta\Delta v + \nabla(v \otimes v) - K)\,w\,dx = 0$ for all $w \in X$. (77)
This is the key to our approach.
Remark 1 (Generalized solutions). Our aim is to prove the existence of
generalized (nonsmooth) solutions. Since equation (77) is related to orthogonality,
we will use a Hilbert space approach. To reduce the order of
highest derivatives that appear in our generalized problem, consider first
a sufficiently smooth situation. Then, integrating relation (77) by parts
yields 19
$\int_G (\eta\,\nabla v \nabla w - v(v\nabla)w - Kw)\,dx = 0$ for all $w \in X$ (78)
and fixed $v \in X$. Observe that $w = 0$ on the boundary $\partial G$.
Using regularity theory, it can be shown that generalized solutions are
also classical smooth solutions provided the boundary and the outer force
densities are smooth (Remark 8).
To simplify notation, let
5.17.3 The Generalized Problem
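Motivated by (78), define
$(v \mid w) := \int_G \nabla v \nabla w\,dx = \int_G \partial_j v_m\,\partial_j w_m\,dx$,
$a(u, v, w) := -\int_G u(v\nabla)w\,dx = -\int_G u_j v_m\,\partial_j w_m\,dx$,
and
$K(w) := \int_G K w\,dx = \int_G K_j w_j\,dx$.
Then, the key equation (78) reads as follows:
$\eta(v \mid w) + a(v, v, w) = K(w)$ for all $w \in X$ (79)
and fixed $v \in X$.
19 Using components, equation (77) reads as follows:
$\int_G (-\eta\,\partial_j\partial_j v_m + \partial_j(v_j v_m) - K_m)\,w_m\,dx = 0$.
Since $w_m = 0$ on $\partial G$, integration by parts yields
$\int_G (\eta\,\partial_j v_m\,\partial_j w_m - v_j v_m\,\partial_j w_m - K_m w_m)\,dx = 0$.
This coincides with (78).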
Let us now introduce the relevant function spaces. Consider first the
product space
By Lemma 2 in Section 5.16, $\mathcal{H}$ is a real Hilbert space with respect to the
inner product $(\cdot \mid \cdot)$. Moreover, define
$X$ := closure of the set $D$ in the Hilbert space $\mathcal{H}$, (80)
where
$D := \{\varphi : \varphi_j \in C_0^\infty(G) \text{ for all } j \text{ and } \nabla\varphi = 0\}$.
Here, $\nabla\varphi = \partial_j \varphi_j$. Note that
$D \subseteq X$,
where $X$ was introduced in Section 5.17.2. By (80), $X$ is a closed subspace
of the Hilbert space $\mathcal{H}$. Thus, $X$ is a Hilbert space equipped with the inner
product $(\cdot \mid \cdot)$. In addition, the set $D$ is dense in $X$. Observe that 20
$(v \mid w) := \sum_{j=1}^{3} (v_j \mid w_j)_{1,2}$.
Furthermore, define
along with
$(K \mid L)_Z := \sum_{j=1}^{3} (K_j \mid L_j)_2 = \sum_{j=1}^{3} \int_G K_j L_j\,dx$.
Then $Z$ is a real Hilbert space with respect to the inner product $(\cdot \mid \cdot)_Z$.
Definition 2. The generalized problem to the original problem (72) reads
as follows. Let $\eta > 0$. We are given the outer force density $K \in \mathcal{H}^*$. We are
looking for a velocity field $v \in X$ such that the following equation holds
true:
$\eta(v \mid \varphi) + a(v, v, \varphi) = K(\varphi)$ for all $\varphi \in D$. (81)
20 We use the notation introduced in Section 5.16.
Remark 3 (Motivation). Let v be a smooth solution to the classical prob-
lem (72). Our discussion in Section 5.17.2 shows that v is also a generalized
solution.
Conversely, we will show in Corollary 6 that each generalized solution is
a solution to the original problem (72), in the sense of distribution theory
and of generalized boundary values.
Remark 4 (The outer forces). (i) Classical outer force densities. Let the
outer force density $K \in Z$ be given. Then the components $K_1$, $K_2$, and $K_3$
of $K$ live in $L_2(G)$. Define the functional
$K(\varphi) := \int_G K_j \varphi_j\,dx$ for all $\varphi \in \mathcal{H}$. (82)
It follows from 21
for all $\varphi \in \mathcal{H}$
that $K \in \mathcal{H}^*$.
Since $C_0^\infty(G)$ is dense in $L_2(G)$, the set $\mathcal{H}$ is dense in $Z$. This implies
the continuous embedding
$Z \subseteq \mathcal{H}^*$.
Moreover, $Z$ is dense in $\mathcal{H}^*$, by Proposition 8 in Section 5.16.
The outer force densities $K \in Z$ are called classical outer force densities.
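The displayed estimate in part (i) is missing above. One standard estimate that yields $K \in \mathcal{H}^*$ is given here as a hedged sketch rather than as the original display. It assumes, as the surrounding text indicates, that $Z$ carries the $L_2$ inner product of the components, and it combines the Schwarz inequality with the continuity of the embedding $\mathcal{H} \subseteq Z$:

```latex
\begin{align*}
|K(\varphi)|
  = \Big| \int_G K_j \varphi_j \,dx \Big|
  \le \|K\|_Z \,\|\varphi\|_Z
  \le \mathrm{const}\, \|K\|_Z \,\|\varphi\|_{\mathcal{H}}
  \qquad \text{for all } \varphi \in \mathcal{H}.
\end{align*}
```

Taking the supremum over all $\varphi$ with $\|\varphi\|_{\mathcal{H}} \le 1$ then gives a bound of the form $\|K\|_{\mathcal{H}^*} \le \text{const}\,\|K\|_Z$.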
(ii) Generalized outer force densities. Denote by $W_2^{-1}(G)$ the dual space
to $\mathring{W}_2^1(G)$. Let $K_j \in W_2^{-1}(G)$ for all $j$. Define
$K(\varphi) := K_j(\varphi_j)$ for all $\varphi \in \mathcal{H}$.
Then, $K \in \mathcal{H}^*$. Moreover, all elements from $\mathcal{H}^*$ are obtained this way.
All the outer force densities $K \in \mathcal{H}^*$ are called generalized outer force
densities. It follows from the continuous embedding $Z \subseteq \mathcal{H}^*$ along with
$\|K\|_{\mathcal{H}^*} \le \text{const}\,\|K\|_Z$
that each classical outer force density is also a generalized outer force density,
and the norm $\|K\|_{\mathcal{H}^*}$ is arbitrarily small provided the classical norm
$\|K\|_Z$ is sufficiently small.
In modern mathematical physics, forces correspond quite often to func-
tionals.
21Cf. Lemma 2 in Section 5.16.
The following proposition justifies the choice of the space X for velocity
fields.
Proposition 5 (Velocity fields). Let $v \in X$. Then, for $m = 1, 2, 3$,
$\partial_j v_j = 0$ on $G$ (incompressibility condition),
$v_m = 0$ on $\partial G$ (boundary condition),
where the incompressibility condition is to be understood in the sense of
distribution theory, and where the boundary condition is satisfied in the
sense of generalized boundary values.
Proof. If $v \in X$, then $v_m \in \mathring{W}_2^1(G)$ for all $m$. By Section 2.5.5 of AMS
Vol. 108, the function $v_m$ vanishes on the boundary $\partial G$ (in the generalized
sense).
To prove the incompressibility condition, let $v \in D$. Since $\partial_m v_m = 0$, we
obtain
$\int_G \varphi\,\partial_m v_m\,dx = 0$ for all $\varphi \in C_0^\infty(G)$.
Integration by parts yields
$\int_G v_m\,\partial_m \varphi\,dx = 0$ for all $\varphi \in C_0^\infty(G)$. (84)
Since $D$ is dense in $X$, and since $X$-convergence implies $L_2(G)$-convergence
of $v_m$, a passage to the limit shows that relation (84) remains true for all
$v \in X$.
Define
$v_m(\varphi) := \int_G v_m \varphi\,dx$ for all $\varphi \in C_0^\infty(G)$. (85)
Since the function $v_m$ lives in $L_2(G)$, the functional from (85) represents a
distribution (cf. Section 2.8.3 in AMS Vol. 108). Furthermore,
$(\partial_m v_m)(\varphi) = -v_m(\partial_m \varphi) = 0$ for all $\varphi \in C_0^\infty(G)$.
Consequently,
$\partial_m v_m = 0$ on $G$,
in the sense of distribution theory. This is precisely the incompressibility
condition. □
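The passage to the limit in (84) can be made explicit. As a hedged sketch, let $(v^{(n)})$ be a sequence in $D$ (a name introduced here) with $v^{(n)} \to v$ in $X$, so that $v_m^{(n)} \to v_m$ in $L_2(G)$ for each $m$. For a fixed test function, the Schwarz inequality then gives (with summation over $m$ on the left-hand side):

```latex
\begin{align*}
\Big| \int_G \big(v_m^{(n)} - v_m\big)\,\partial_m \varphi \,dx \Big|
  \le \sum_{m=1}^{3} \big\| v_m^{(n)} - v_m \big\|_{2}\, \big\| \partial_m \varphi \big\|_{2}
  \;\longrightarrow\; 0
  \qquad \text{as } n \to \infty .
\end{align*}
```

Since the integral in (84) vanishes for every $v^{(n)}$, it also vanishes for the limit $v$.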
5.17.4 The Fundamental Existence Theorem
Theorem 5.J. Let $G$ be a nonempty, bounded, open, connected set in $\mathbb{R}^3$.
Consider a viscous fluid in $G$ with density 22 $\rho > 0$ and viscosity $\eta > 0$.
Then, for given outer force density $K \in \mathcal{H}^*$, the following conditions are
met:
(i) Existence. The generalized problem (81) has a solution $v \in X$.
(ii) Uniqueness. If the outer force density is sufficiently small, then the
velocity field $v \in X$ is unique.
More precisely, we have to assume that the dimensionless quantity
$\frac{\rho}{\eta^2}\,\|K\|_{\mathcal{H}^*}$
is sufficiently small.
(iii) Generic finiteness. There is an open, dense subset $\mathcal{H}_0^*$ of $\mathcal{H}^*$ such
that, for each outer force density $K \in \mathcal{H}_0^*$, the generalized problem
(81) has only a finite number of solutions $v \in X$.
There is an open, dense subset 23 $Z_0$ of $Z$ such that for each (classical)
outer force density $K \in Z_0$, the generalized problem (81) has only a finite
number of solutions $v \in X$.
22 Here we do not use the convention $\rho = 1$ from Section 5.17.1 in order to
clarify the physical statement. Therefore, in the proof of Theorem 5.J, we have
to replace $\eta$ and $K$ with $\rho^{-1}\eta$ and $\rho^{-1}K$, respectively.
Corollary 6 (The pressure p). Suppose that the boundary $\partial G$ of the region
$G$ is sufficiently smooth. 24 Let $v \in X$ be a solution of the generalized
problem (81). Then there exists a pressure function $p \in L_2(G)$ such that
$-\eta\,\partial_j\partial_j v_m + \partial_j(v_j v_m) = K_m - \partial_m p$ on $G$ (equation of motion),
$\partial_j v_j = 0$ on $G$ (incompressibility condition),
$v_m = 0$ on $\partial G$ (boundary condition),
(86)
where both the equation of motion and the incompressibility condition are
to be understood in the sense of distribution theory, and where the boundary
condition is satisfied in the sense of generalized boundary values. 25
Naturally enough, the pressure function $p$ is unique up to a constant. If
we use the normalization condition
$\int_G p\,dx = 0$,
then $p$ is unique (as an element of the space $L_2(G)$).
23 Choose $Z_0 := \mathcal{H}_0^* \cap Z$.
24 For example, suppose that $\partial G$ is a two-dimensional $C^1$-manifold, where $G$ lies
locally on one side of $\partial G$.
25 Cf. Sections 2.5.5 and 2.8.3 of AMS Vol. 108.
Let us first discuss this.
Remark 7 (Turbulence). The lack of uniqueness 26 corresponds to the ex-
perimental fact that turbulence appears for sufficiently large outer forces.
The perfect mathematical description of turbulence is a famous open prob-
lem in mathematical physics.
As an introduction to the modern theory of turbulence, we recommend
the books by Chorin (1975, 1994), Sirovich (1991), and Foias et al. (1993).
Remark 8 (The stability property of reasonable classical forces). Recall
that
along with
Furthermore, recall that the forces $K \in Z$ are called classical forces. Let us
designate the forces $K_0 \in Z_0$ as reasonable classical forces because they are
classical functions and they generate only a finite number of velocity fields
$v \in X$. Since the set $Z_0$ is dense and open in $Z$, we obtain the following:
(i) If $K$ is a classical force, then for each $\varepsilon > 0$, there is a reasonable
classical force $K_*$ with $\|K - K_*\|_Z < \varepsilon$.
(ii) If $K$ is a reasonable classical force, then there is a number $\delta > 0$ such
that each classical force $K_*$ with $\|K - K_*\|_Z < \delta$ is reasonable.
This can be expressed briefly by saying that
Most classical forces are reasonable, and they remain reasonable under
small perturbations.
That is, reasonable classical forces are generic.
Let us summarize our results for classical forces:
(i) For all classical outer force densities $K \in Z$, there exists a velocity
field $v \in X$.
(ii) If the dimensionless quantity
$\frac{\rho^2}{\eta^4} \int_G K^2\,dx$
is sufficiently small, then the velocity field is unique.
26 In fact, there exists a counterexample, where nonuniqueness appears (cf.
Galdi (1994), Vol. 2, p. 11).
(iii) Most classical outer force densities generate only a finite number of
velocity fields. This property remains unchanged under small pertur-
bations of the outer force densities.
(iv) To each velocity field $v \in X$, there corresponds a pressure function $p$
that is unique up to a constant (as an element of $L_2(G)$). 27
Remark 9 (The efficiency of modern calculus). Formally, the equations
from (86) look like the classical equations from (73). It is a decisive ad-
vantage of the modern theory of distributions that, on the one hand, this
theory is powerful enough to force the existence of solutions to important
problems in mathematical physics. On the other hand, modern calculus
resembles classical calculus.
Historical Remark 10. The Navier-Stokes equations were formulated by
Navier in 1822 and studied by Stokes in 1845. Existence and uniqueness
theorems for the stationary Navier-Stokes equations were first proved by
Odqvist in 1930 and then by Leray in 1933. This time gap between 1822
and 1930 characterizes a retardation between physics and mathematics that
happens quite often. For example, the Laplace equation was introduced
by Laplace near 1800, but a deeper mathematical understanding of this
equation via interpolation theory was gained only in the 1960s.
Our proof of statements (i) and (ii) from Theorem 5.J follows the elegant
approach discovered by Ladyzhenskaya in 1959. Foias and Temam proved
a statement of type (iii) in 1977, by using the Sard-Smale theorem.
The proofs of Theorem 5.J and Corollary 6 given in Sections 5.17.6 and
5.17.8, respectively, will be a simple consequence of an abstract theorem
based on the Riesz theorem, the Leray-Schauder principle, and the Smale
principle. In addition, we will use the Holder inequality and the results from
Section 5.16 about interpolation inequalities and compact embeddings. Fur-
thermore, we will use the generalization of a theorem from classical vector
calculus to distribution theory.
A different proof via the Galerkin method for pseudo monotone operators
can be found in Zeidler (1986), Vol. 4, Section 72.4.
5.17.5 A Functional Analytic Theorem
Let us make the following assumptions:
27 In addition, the sophisticated regularity theory for the stationary Navier-Stokes
equations shows that if both the boundary and the outer forces are smooth,
then so are the velocity field and the pressure.
More precisely, if the boundary $\partial G$ is $C^\infty$, and if $K_j \in C^\infty(\overline{G})$ for all $j$, then
$v_j, p \in C^\infty(\overline{G})$ for all $j$, and the equations in (86) are satisfied in the classical
sense (cf. Galdi (1994), Vol. 2, Section 8.5).
(H1) $\mathcal{H}$ and $Z$ are real Hilbert spaces, where the embedding $\mathcal{H} \subseteq Z$ is
continuous, and $\mathcal{H}$ is dense in $Z$.
(H2) $X$ is a closed linear subspace of $\mathcal{H}$, and $D$ is a dense subset of $X$.
Denote the inner product on $X$ by $(\cdot \mid \cdot)$.
(H3) $Y$ is a real Banach space such that the embedding $X \subseteq Y$ is compact.
(H4) $a\colon X \times X \times X \to \mathbb{R}$ is trilinear (i.e., $a(u, v, w)$ is linear with respect
to each argument). In addition, for all $u, v, w \in X$,
$|a(u, v, w)| \le \text{const}\,\|u\|_Y \|v\|_Y \|w\|_X$.
(H5) $a(v, v, v) = 0$ for all $v \in D$.
Condition (H5) is crucial for obtaining a priori estimates.
Proposition 11. Let $\eta > 0$. For a given functional $K \in \mathcal{H}^*$, the following
hold true:
(i) Existence. There exists a solution $v \in X$ to the equation
$\eta(v \mid \varphi) + a(v, v, \varphi) = K(\varphi)$ for all $\varphi \in D$. (87)
(ii) Uniqueness. If the norm of the functional $K$ is sufficiently small (i.e.,
$\|K\|_{\mathcal{H}^*} < \delta$), then the solution $v$ is unique.
(iii) Generic finiteness. There exists an open, dense subset $\mathcal{H}_0^*$ of $\mathcal{H}^*$ such
that, for each $K \in \mathcal{H}_0^*$, equation (87) has only a finite number of
solutions $v \in X$.
The intersection $\mathcal{H}_0^* \cap Z$ is open and dense in $Z$.
Remark 12 (Elementary approach). If one wants to present an elementary
approach to the Navier-Stokes equations in a lecture, it is convenient only
to prove statements (i) and (ii) of Theorem 5.J on existence and uniqueness.
In this connection, it suffices to use statements (i) and (ii) of Proposition 11.
Thus, for the convenience of the lecturer, we divide the proof of Proposition
11 into two parts.
The proof of Proposition 11 (iii) , and hence the proof of Theorem 5.J(iii),
is based on the sophisticated Smale principle.
By Proposition 8 in Section 5.16, we have the continuous embedding
$Z \subseteq \mathcal{H}^*$,
and $Z$ is dense in $\mathcal{H}^*$. Moreover, it follows from the continuous embedding
$X \subseteq \mathcal{H}$ that the embedding
$\mathcal{H}^* \subseteq X^*$
is also continuous.
The following existence and uniqueness proof will work not only for given
$K \in \mathcal{H}^*$, but also for $K \in X^*$. Observe that $\|K\|_{X^*} \le \text{const}\,\|K\|_{\mathcal{H}^*}$.
Proof of Proposition 11(i), (ii). The idea of the proof is to reduce the
original problem (87) to an equivalent operator equation,
$v \in X$, (88)
by using the Riesz theorem. Next we will apply the Leray-Schauder principle
to (88) in order to obtain statements (i) and (ii).
Step 1: The functional $K \in X^*$. By the Riesz theorem, there is a unique
element $K_*$ in the Hilbert space $X$ such that
$K(w) = (K_* \mid w)$ for all $w \in X$
(cf. Section 2.10 of AMS Vol. 108). The duality map $J\colon X \to X^*$ defined
through $J(K_*) := K$ is linear, bijective, and norm preserving (i.e.,
$\|J(K_*)\|_{X^*} = \|K_*\|_X$). Therefore, if $X_0$ is an open, dense subset of $X$, then the
image $X_0^* := J(X_0)$ is an open, dense subset of $X^*$.
Step 2: The operator $B$. Let $u, v \in X$ be fixed. By (H4),
$|a(u, v, w)| \le \text{const}\,\|u\|_Y \|v\|_Y \|w\|_X$ for all $w \in X$. (89)
It follows from the Riesz theorem that there exists a unique element in $X$
denoted by $B(u, v)$ such that
$a(u, v, w) = (B(u, v) \mid w)$ for all $w \in X$.
In addition,
$\|B(u, v)\|_X = \sup |a(u, v, w)|$, (90)
where the supremum is taken over all $w \in X$, with $\|w\|_X \le 1$.
Varying $u$ and $v$, we obtain an operator $B\colon X \times X \to X$ that has the
following properties:
(a) $B$ is bilinear and bounded.
(b) For all $u, v \in X$,
$\|B(u, v)\|_X \le \text{const}\,\|u\|_Y \|v\|_Y$. (91)
In fact, statement (b) follows from (89) and (90). This implies
$\|B(u, v)\|_X \le \text{const}\,\|u\|_X \|v\|_X$
because the embedding $X \subseteq Y$ is continuous. The bilinearity of $B$ follows
from the bilinearity of the map $(u, v) \mapsto a(u, v, w)$ for fixed $w$.
Step 3: The operator $A$. Define
$Av := B(v, v)$ for all $v \in X$.
The operator $A\colon X \to X$ has the following properties:
(a) $A$ is locally Lipschitz continuous. This means that
$\|Au - Av\|_X \le \text{const} \cdot r\,\|u - v\|_X$
for all $u, v \in X$ with $\|u\|_X, \|v\|_X \le r$. Here, $r$ denotes a fixed, but
otherwise arbitrary, positive number.
(b) $A$ is compact.
(c) $(Av \mid v) = 0$ for all $v \in X$.
Let us prove this. Recall that
$\|v\|_Y \le \text{const}\,\|v\|_X$ for all $v \in X$ (92)
because the embedding $X \subseteq Y$ is compact and hence continuous.
Ad (a). Let $u, v \in X$ with $\|u\|_X, \|v\|_X \le r$. Note that
$Au - Av = B(u, u) - B(v, v) = B(u - v, u) + B(v, u - v)$.
By (91) and (92),
$\|Au - Av\|_X \le \text{const}(\|u - v\|_Y \|u\|_Y + \|v\|_Y \|u - v\|_Y)$
$\le \text{const}(\|u - v\|_Y \|u\|_X + \|v\|_X \|u - v\|_Y)$.
Hence we obtain the key inequality
$\|Au - Av\|_X \le \text{const} \cdot r\,\|u - v\|_Y$. (93)
This implies property (a), by (92).
Ad (b). Let $(v_n)$ be a bounded sequence in $X$. Since the embedding
$X \subseteq Y$ is compact, there exists a subsequence $(v_{n'})$ that converges in $Y$.
Thus, $(v_{n'})$ is Cauchy in $Y$. By (93), the sequence $(Av_{n'})$ is Cauchy in $X$,
and hence it is convergent in $X$.
Ad (c). Let $v \in X$. Since $D$ is dense in $X$, there is a sequence $(v_n)$ in $D$
such that $v_n \to v$ in $X$ as $n \to \infty$. By (H5),
$(Av_n \mid v_n) = a(v_n, v_n, v_n) = 0$ for all $n$.
Letting $n \to \infty$, we obtain $(Av \mid v) = 0$, since the operator $A$ is continuous.
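The algebraic identity used in Ad (a) rests only on the bilinearity of $B$. Spelled out, as a routine check added here for convenience:

```latex
\begin{align*}
B(u - v, u) + B(v, u - v)
  &= B(u, u) - B(v, u) + B(v, u) - B(v, v) \\
  &= B(u, u) - B(v, v) \\
  &= Au - Av .
\end{align*}
```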
Step 4: The equivalent operator equation. It follows from
$\eta(v \mid \varphi) + a(v, v, \varphi) = K(\varphi)$ for fixed $v \in X$ and all $\varphi \in D$ (94)
that
$\eta(v \mid \varphi) + (B(v, v) \mid \varphi) = (K_* \mid \varphi)$ for all $\varphi \in D$.
Recall that $Av = B(v, v)$. Then
$(\eta v + Av - K_* \mid \varphi) = 0$ for all $\varphi \in D$.
Since the set $D$ is dense in the Hilbert space $X$, we obtain the operator
equation
$\eta v + Av - K_* = 0$, $v \in X$. (95)
Conversely, the operator equation (95) implies (94). Thus, equation (95)
is equivalent to (94).
Step 5: The crucial a priori estimates. Set $\mu := \eta^{-1}$. Consider the modified
operator equation
$v = -\mu t Av + \mu t K_*$, $v \in X$, (96)
where $t \in [0, 1]$. Note that, for $t = 1$, equation (96) is identical to (95).
Suppose that $v$ is a solution to (96). Then
$(v \mid v) = -\mu t(Av \mid v) + \mu t(K_* \mid v)$.
Since $(Av \mid v) = 0$, $(v \mid v) = \mu t(K_* \mid v)$. Hence $\|v\|_X^2 \le \mu \|K_*\|_X \|v\|_X$. This
implies the a priori estimates
$\|v\|_X \le \mu \|K_*\|_X$ (97)
for any solution $v$ to problem (96).
Step 6: The existence proof via the Leray-Schauder principle. It follows
from (97) that, for given $K_*$, equation (96) has a solution for $t = 1$ (i.e.,
equation (95) has a solution), by the
Leray-Schauder principle (cf. Theorem 1.D in Section 1.18 of AMS Vol.
108).
In this connection, note that the operator $v \mapsto -\mu Av + \mu K_*$ is compact
on $X$ because $A$ is compact.
Step 7: The uniqueness proof via local Lipschitz continuity. Choose $K_* \in X$.
Suppose that $v$ and $w$ are solutions of the operator equation (95). By
the a priori estimates (97),
$\|v\|_X, \|w\|_X \le r$,
where $r := \eta^{-1}\|K_*\|_X$. From (95) we obtain
$\eta(v - w) + Av - Aw = 0$.
By the local Lipschitz continuity of the operator $A\colon X \to X$,
$\eta\|v - w\|_X \le \text{const} \cdot r\,\|v - w\|_X$.
If the quantity $\eta^{-1} r = \eta^{-2}\|K_*\|_X$ is sufficiently small, then $v = w$.
Since $\|K_*\|_X = \|K\|_{X^*}$ and $\|K\|_{X^*} \le \text{const}\,\|K\|_{\mathcal{H}^*}$, the quantity
$\eta^{-2}\|K\|_{\mathcal{H}^*}$
has to be sufficiently small. □
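To make the smallness condition in Step 7 quantitative, write $C$ for the constant in the local Lipschitz estimate (a name introduced for this sketch). The following hedged computation indicates that uniqueness holds as soon as $C\,\eta^{-2}\|K_*\|_X < 1$:

```latex
\begin{align*}
&\eta\,\|v - w\|_X = \|Av - Aw\|_X \le C\,r\,\|v - w\|_X
  \quad\Longrightarrow\quad
  (\eta - C r)\,\|v - w\|_X \le 0, \\
&r = \eta^{-1}\|K_*\|_X, \qquad
  \eta - C r > 0 \iff C\,\eta^{-2}\,\|K_*\|_X < 1
  \quad\Longrightarrow\quad v = w .
\end{align*}
```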
Proof of Proposition 11(iii). We will use the Smale principle.
Step 8: Further properties of the operator $A\colon X \to X$. We will show that
(a) $A$ is $C^\infty$.
(b) For any $\eta > 0$, the operator $\eta I + A\colon X \to X$ is a nonlinear Fredholm
operator of index zero.
(c) The operator $\eta I + A\colon X \to X$ is proper.
Let us prove this.
Ad (a). Recall that $Av := B(v, v)$ on $X$. Since the operator $B\colon X \times X \to X$
is bilinear and bounded, it follows from Example 5 in Section 4.2 that
$A$ is $C^\infty$.
Ad (b). Since the operator $A\colon X \to X$ is compact, the F-derivative
$A'(u)\colon X \to X$ is also compact, for each $u \in X$ (cf. Problem 5.2). Set
$C := \eta I + A$. Then the F-derivative of $C$ at the point $u \in X$ is given by
$C'(u) := \eta I + A'(u)$. Thus, $C'(u)$ represents a compact perturbation of the
Fredholm operator $\eta I\colon X \to X$. Since this operator is bijective, $\text{ind}(\eta I) = 0$.
By Theorem 5.E in Section 5.8, $\eta I + A$ is Fredholm of index zero.
Ad (c). Let $M$ be a compact set in $X$. We have to show that the preimage
$N := (\eta I + A)^{-1}(M)$ is compact.
The set $M$ is bounded in $X$. By the a priori estimate in (97), the set $N$
is bounded in $X$. Let $(v_n)$ be a sequence in $N$. Since the operator $A\colon X \to X$
is compact, the set $A(N)$ is relatively compact. Thus, there exists a
subsequence $(v_{n'})$ such that $Av_{n'} \to w$ as $n' \to \infty$. Set
$b_{n'} := \eta v_{n'} + Av_{n'}$. (98)
Since the sequence $(b_{n'})$ lives in the compact set $M$, there is a convergent
subsequence $b_{n''} \to b$ as $n'' \to \infty$. In addition, $b \in M$. By (98),
$v_{n''} \to v$ as $n'' \to \infty$,
where $\eta v = b - w$. Finally, since the operator $A\colon X \to X$ is continuous, it
follows from (98) that
$b = \eta v + Av$.
This shows that $v \in N$. Thus, the set $N$ is compact.
Step 9: Generic finiteness via the Smale principle. This principle tells us
that there is an open, dense subset $X_0$ of $X$ such that, for given $K_* \in X_0$,
equation (88) has only a finite number of solutions (cf. Theorem 5.I in
Section 5.15).
Define $X_0^* := J(X_0)$. By Step 1, the set $X_0^*$ is open and dense in $X^*$.
Step 10: Since $X_0^*$ is open and dense in $X^*$, and since the embedding
$X \subseteq \mathcal{H}$ is continuous, the intersection set $\mathcal{H}_0^* := X_0^* \cap \mathcal{H}^*$ is open and
dense in $\mathcal{H}^*$, by Proposition 9 in Section 5.16.
Step 11: Since $\mathcal{H}_0^*$ is open and dense in $\mathcal{H}^*$, and since the embedding
$\mathcal{H} \subseteq Z$ is continuous, the intersection $\mathcal{H}_0^* \cap Z^*$ is dense and open in $Z^*$,
again by Proposition 9 in Section 5.16.
Step 12: The intersection $\mathcal{H}_0^* \cap Z$ is open and dense in $Z$. To see that this
follows from Step 11, let $v \in Z$. Then the duality map $\mathcal{J}\colon Z \to Z^*$ assigns
a functional $v^* := \mathcal{J}(v)$ in $Z^*$ to the point $v$ such that
$v^*(u) = (v \mid u)_Z$ for all $u \in Z$.
The restriction of the linear continuous functional $v^*\colon Z \to \mathbb{R}$ to $\mathcal{H}$ is a
linear continuous functional $v^*\colon \mathcal{H} \to \mathbb{R}$, that is,
$v^* \in \mathcal{H}^*$.
Similarly, we write
$v \in \mathcal{H}^*$.
By Propositions 8 and 9 in Section 5.16, this corresponds to
$Z^* \subseteq \mathcal{H}^*$ and $Z \subseteq \mathcal{H}^*$.
This way, we identify $\mathcal{H}_0^* \cap Z$ with $\mathcal{H}_0^* \cap Z^*$.
Finally, since the duality map is a norm isomorphism between $Z$ and $Z^*$,
it sends open and dense sets in $Z$ onto open and dense sets in $Z^*$. □
The proof of Proposition 11 has been finished.
5.17.6 Proof of Theorem 5.J
We have to show only that the assumptions (H1) through (H5) of Proposition
11 are satisfied. However, this can be easily done. To this end, choose
along with the norm
We will use the following two simple key observations:
(a) Let $u, v \in L_4(G)$, and $w \in \mathring{W}_2^1(G)$. Then, $\partial_k w \in L_2(G)$. By the Hölder
inequality for three factors,
(cf. Problem 5.9b).
(b) If $v_j \in C_0^\infty(G)$ and $\partial_j v_j = 0$, then integration by parts yields
$2\int_G v_j v_m\,\partial_j v_m\,dx = \int_G v_j\,\partial_j(v_m^2)\,dx = -\int_G v_m^2\,\partial_j v_j\,dx = 0$.
Ad (H1), (H2), and (H3). This follows from Section 5.16.
Ad (H4). Recall that
By the key observation (a),
for all u,v,w E X.
Ad (H5). Observe that assumption (H5) coincides with (b).
The proof of Theorem 5.J has been finished. □
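The display in key observation (a) is missing above. The three-factor Hölder inequality with exponents 4, 4, and 2 (note that $1/4 + 1/4 + 1/2 = 1$) gives an estimate of the following type. This is a hedged sketch for scalar functions $u, v \in L_4(G)$ and $w \in \mathring{W}_2^1(G)$, and the exact form of the original display may differ:

```latex
\begin{align*}
\Big| \int_G u\, v\, \partial_k w \,dx \Big|
  \le \Big(\int_G u^4\,dx\Big)^{1/4}
      \Big(\int_G v^4\,dx\Big)^{1/4}
      \Big(\int_G (\partial_k w)^2\,dx\Big)^{1/2}
  = \|u\|_4\, \|v\|_4\, \|\partial_k w\|_2 .
\end{align*}
```

Applied to each term $u_j v_m\,\partial_j w_m$, a bound of this kind is what assumption (H4) requires if, as the use of the Ladyzhenskaya inequality suggests, the norm on $Y$ is an $L_4$-type norm.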
5.17.7 A Result from Modern Vector Calculus
In the following two sections, let us use the language of distribution theory.
As before, $G$ denotes a nonempty, bounded, open, and connected set in $\mathbb{R}^3$.
Let $U$ be a distribution, that is, $U \in \mathcal{D}'(G)$. Recall from Section 2.8.3 in
AMS Vol. 108 that the derivative $\partial_j U$ of $U$ is defined through
$(\partial_j U)(\varphi) := -U(\partial_j \varphi)$ for all test functions $\varphi \in C_0^\infty(G)$.
In contrast to classical functions, distributions are mathematical objects
that possess derivatives of arbitrary order. If the function $u\colon G \to \mathbb{R}$ is
locally integrable, then we set
$U(\varphi) := \int_G u\varphi\,dx$ for all $\varphi \in C_0^\infty(G)$.
In this sense, locally integrable functions can be regarded as distributions.
It is convenient to denote both the distribution $U$ and the function by the
same symbol $u$.
We now consider the basic equation
$$-\nabla p = P \quad\text{on } G \tag{100*}$$
from vector calculus. In hydrodynamics,28 $P$ is the force density generated
by the pressure $p$. Using components in a Cartesian coordinate system,
equation (100*) is equivalent to the following equation:
$$-\partial_m p = P_m \quad\text{on } G, \quad m = 1,2,3. \tag{100}$$
28 In mechanics, $p$ denotes the potential to the force $P$. Moreover, in electrodynamics,
$P$ is the electric field vector to the electrostatic potential $p$.
This equation was considered from a classical point of view in Section
5.17.2. Let us now investigate equation (100) from the distribution theory
point of view. First we formulate a simple necessary solvability condition.
Recall that we denote the dual space $\mathring{W}_2^1(G)^*$ by $W_2^{-1}(G)$.
Proposition 13 (Necessary solvability condition). Let $p, P_m \in \mathcal{D}'(G)$ for
$m = 1,2,3$. Suppose that equation (100) is fulfilled. Then the following are
true:
(i) For all $\varphi \in D$,
$$P_m(\varphi_m) = 0. \tag{101}$$
(ii) If $p \in L_2(G)$, then $P_m \in W_2^{-1}(G)$ for all $m$.
More precisely, statement (ii) means that $P_m$ can be uniquely extended
to a functional $P_m \in W_2^{-1}(G)$.
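Proof. Ad (i). Recall that $\varphi \in D$ implies $\partial_m \varphi_m = 0$. Thus, from (100) we
obtain
$$P_m(\varphi_m) = p(\partial_m \varphi_m) = 0.$$
Ad (ii). It follows from (100) that, for all $\psi \in C_0^\infty(G)$,
$$|P_m(\psi)| = |p(\partial_m \psi)| = \left|\int_G p\,\partial_m\psi\,dx\right| \le \left(\int_G p^2\,dx\right)^{1/2}\left(\int_G (\partial_m\psi)^2\,dx\right)^{1/2},$$
by the Schwarz inequality. Hence
$$|P_m(\psi)| \le \mathrm{const}\,\|\psi\|_{1,2}$$
for all $\psi \in C_0^\infty(G)$.
Observe that $C_0^\infty(G)$ is dense in the Hilbert space $H := \mathring{W}_2^1(G)$. Therefore,
$P_m$ can be extended to a linear continuous map $P_m\colon H \to \mathbb{R}$, by the
extension principle (cf. Section 3.6 in AMS Vol. 108). □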
It is remarkable that the simple conditions from Proposition 13 are strong
enough to ensure the existence of a pressure function p.
Proposition 14 (Sufficient solvability condition). Suppose that the boundary
$\partial G$ of the region $G$ is sufficiently regular.29 Suppose that we are given
functionals $P_j \in W_2^{-1}(G)$, $j = 1,2,3$, such that condition (101) is satisfied.
Then equation (100) has a solution $p \in L_2(G)$ that is unique up to a constant.
29 For example, suppose that $\partial G$ is a two-dimensional $C^1$-manifold, where $G$
lies locally on one side of $\partial G$.
The proof will be given in Problem 5.18b and will be based on the closed
range theorem.
5.17.8 Determination of the Pressure
Let us again consider the equation of motion for a viscous fluid
$$-\eta\,\Delta v + \nabla(v \otimes v) = K - \nabla p \quad\text{on } G.$$
Using components in a Cartesian coordinate system, this means that
$$-\eta\,\partial_j\partial_j v_m + \partial_j(v_j v_m) = K_m - \partial_m p \quad\text{on } G, \tag{102}$$
where $m = 1,2,3$. Suppose that $v_j$, $v_j v_m$, $K_m$, and $p$ are distributions.30
Then equation (102) tells us that
for all $\psi \in C_0^\infty(G)$. Replacing $\psi$ with $\varphi_m$ and summing over $m$, we obtain
for all $\varphi_m \in C_0^\infty(G)$. Hence
for all $\varphi_m \in C_0^\infty(G)$. Recall that $\partial_m\varphi_m = 0$ if $\varphi \in D$. Thus,
for all $\varphi \in D$. This is precisely the generalized problem from Definition 2.
We now want to reverse the preceding argument. Observe that the following
proof resembles the classical argument used in Section 5.17.2.
Proof of Corollary 6. We are given the force functional $K \in \mathcal{H}^*$. Let the
velocity field $v \in X$ be a generalized solution to the equation of motion, in
the sense of Theorem 5.J.
Recall that $\mathcal{H} = \mathring{W}_2^1(G) \times \mathring{W}_2^1(G) \times \mathring{W}_2^1(G)$ and that $X \subseteq \mathcal{H}$. Hence
$\mathcal{H}^* \subseteq X^*$. Denote the restriction of $K$ to the $m$th factor $\mathring{W}_2^1(G)$ of $\mathcal{H}$ by
$K_m$. Then
$$K(\varphi) = K_m(\varphi_m) \quad\text{for all } \varphi \in \mathcal{H}. \tag{106}$$
30 Observe that a general product for distributions does not exist (cf.
Oberguggenberger (1992)). For example, this fact is responsible for serious
mathematical difficulties arising in quantum field theory. However, in the present
case we can assume that $v_j$ and $v_j v_m$ are locally integrable functions.
Since $v \in X$, $v_m \in \mathring{W}_2^1(G)$. Hence $v_m \in L_2(G)$. Thus, the product $v_j v_m$
is integrable over $G$, by the Schwarz inequality. Consequently, $v_m v_j$ is a
distribution.
Step 1: Existence of the pressure $p$. Define
(107)
Then, $P_m$ is a distribution, that is, $P_m \in \mathcal{D}'(G)$. Consider now the pressure
equation
$$-\partial_m p = P_m, \quad m = 1,2,3. \tag{108}$$
Since $v$ is a generalized solution to the equation of motion, equation (105)
holds true. Hence
for all $\varphi \in D$.
We want to show that the following conditions are met:
(a) $K_m \in W_2^{-1}(G)$,
(b) $\partial_j\partial_j v_m \in W_2^{-1}(G)$,
(c) $\partial_j(v_j v_m) \in W_2^{-1}(G)$.
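Ad (a). By (106),
$$|K_m(\psi)| \le \mathrm{const}\,\|\psi\|_{1,2} \quad\text{for all } \psi \in \mathring{W}_2^1(G).$$
Ad (b). For all $\psi \in C_0^\infty(G)$,
$$|(\partial_j\partial_j v_m)(\psi)| = |v_m(\partial_j\partial_j\psi)| = \left|\int_G v_m\,\partial_j\partial_j\psi\,dx\right| = \left|\int_G \partial_j v_m\,\partial_j\psi\,dx\right|$$
$$\le \left(\int_G (\partial_j v_m)^2\,dx\right)^{1/2}\left(\int_G (\partial_j\psi)^2\,dx\right)^{1/2} \le \mathrm{const}\,\|\psi\|_{1,2}.$$
Therefore, $\partial_j\partial_j v_m$ can be uniquely extended to a linear continuous functional
on $\mathring{W}_2^1(G)$, by the same argument as in the proof of Proposition 13.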
Ad (c). For all $\psi \in C_0^\infty(G)$,
$$|(\partial_j(v_j v_m))(\psi)| = |(v_j v_m)(\partial_j\psi)| = \left|\int_G v_j v_m\,\partial_j\psi\,dx\right| \le \|v_j\|_4\,\|v_m\|_4\,\|\psi\|_{1,2} \le \mathrm{const}\,\|\psi\|_{1,2},$$
by (99). This proves (c).
It follows from (a) through (c) that $P_m \in W_2^{-1}(G)$. By Proposition 14,
$p \in L_2(G)$.
The remaining statements of Corollary 6 follow from Proposition 5. □
Problems
5.1. A special $C^1$-function. Prove that the function
$$H(s,w,p) := \begin{cases} s^{-1}F(sb + sw, p) & \text{if } s \ne 0,\\ F_u(0,p)(b+w) & \text{if } s = 0 \end{cases} \tag{109}$$
is $C^1$ on an open neighborhood of $(0,0,0)$ provided $F$ is $C^2$ on an open
neighborhood of $(0,0)$, where $F$ satisfies the assumptions of Theorem 5.H
with $u_0 = 0$, $p_0 = 0$.
Solution: Recall that $F(0,p) \equiv 0$. By the Taylor theorem from Section
4.5,
$$F(sb+sw,p) = F(sb+sw,p) - F(0,p) = sF_u(0,p)(b+w) + \frac{s^2}{2}F_{uu}(0,p)(b+w)^2 + o(s^2), \quad s \to 0, \tag{110}$$
and
$$F_u(sb+sw,p)(b+w) = F_u(0,p)(b+w) + sF_{uu}(0,p)(b+w)^2 + o(s), \quad s \to 0, \tag{111}$$
as well as
$$F_p(sb+sw,p) = F_p(0,p) + s\int_0^1 F_{up}(\tau(sb+sw),p)(b+w)\,d\tau = sF_{up}(0,p)(b+w) + o(s), \quad s \to 0. \tag{112}$$
It follows from (110) that $H$ is continuous at $(0,w,p)$ and
$$H_s(0,w,p) = \lim_{s\to 0} s^{-1}\bigl(H(s,w,p) - H(0,w,p)\bigr) = 2^{-1}F_{uu}(0,p)(b+w)^2.$$
By (109),
$$H_s(s,w,p) = -s^{-2}F(sb+sw,p) + s^{-1}F_u(sb+sw,p)(b+w) \quad\text{if } s \ne 0.$$
Thus, it follows from (110) and (111) that, for $s \ne 0$,
$$H_s(s,w,p) = 2^{-1}F_{uu}(0,p)(b+w)^2 + o(1), \quad s \to 0.$$
Hence
$$H_s(s,w,p) \to H_s(0,w,p) \quad\text{as } s \to 0.$$
By (109),
$$H_w(s,w,p) = F_u(sb+sw,p) \quad\text{for small } |s|.$$
Finally,
$$H_p(s,w,p) = \begin{cases} s^{-1}F_p(sb+sw,p) & \text{if } s \ne 0,\\ F_{pu}(0,p)(b+w) & \text{if } s = 0. \end{cases}$$
According to Problem 4.10, $F''(u,p)(h,k) = F''(u,p)(k,h)$. Hence
$F_{up}(u,p) = F_{pu}(u,p)$. Thus, by (112),
$$H_p(s,w,p) \to H_p(0,w,p) \quad\text{as } s \to 0.$$
Summarizing, we find that the partial derivatives $H_s$, $H_w$, and $H_p$ are
continuous on a neighborhood of $(0,0,0)$. By Problem 4.11, $H$ is $C^1$ on a
neighborhood of $(0,0,0)$.
5.2. Compact operators. Let $A\colon X \to X$ be a compact $C^1$-operator on the
Banach space $X$ over $\mathbb{K}$. Show that, for each $u \in X$, the Fréchet derivative
$A'(u)\colon X \to X$ is compact.
Hint: Cf. Zeidler (1986), Vol. 1, Proposition 7.33.
5.3. Nonlinear Fredholm operators. Show that the operator
$$B + C\colon X \to X$$
is a nonlinear Fredholm operator of index zero on the Banach space $X$ over
$\mathbb{K}$ provided the following conditions are met:
(i) The operator $B\colon X \to X$ is linear, continuous, and bijective.
(ii) The operator $C\colon X \to X$ is compact and $C^1$.
Hint: Use the same argument as in the proof of Proposition 11 in Section
5.17.5.
5.4.* The generalized Jordan normal form. Let
$$C\colon X \to X$$
be a linear compact operator on the Banach space $X$ over $\mathbb{K}$. We want to
decompose the space $X$ into invariant subspaces, with respect to $C$, that
are as small as possible. We are given the number $\lambda \in \mathbb{K}$, $\lambda \ne 0$. Set
$$A := \lambda I - C.$$
We already know that if $\lambda$ is not an eigenvalue of $C$, then the operator
$$\lambda I - C\colon X \to X$$
is bijective, and the inverse operator $(\lambda I - C)^{-1}\colon X \to X$ is continuous.
Now suppose that $\lambda$ is an eigenvalue of $C$. Show that
(i) There exists a natural number $n$ such that31
$$N(A) \subset N(A^2) \subset \cdots \subset N(A^n) = N(A^{n+1}) = \cdots$$
and
$$R(A) \supset R(A^2) \supset \cdots \supset R(A^n) = R(A^{n+1}) = \cdots.$$
(ii) The space $X$ decomposes into the direct sum
$$X = N(A^n) \oplus R(A^n),$$
where $\dim N(A^n) < \infty$. The closed linear subspaces $N(A^n)$ and
$R(A^n)$ of $X$ are invariant under the operator $C$.
(iii) The operator
$$A\colon R(A^n) \to R(A^n)$$
is bijective, and the corresponding inverse operator is continuous on
$R(A^n)$.
(iv) There exist linear subspaces $L_1, \ldots, L_r$ of $N(A^n)$ such that we have
the direct sum decomposition
$$N(A^n) = L_1 \oplus \cdots \oplus L_r.$$
Each of the spaces $L_j$ is invariant under $C$. Moreover, for each $L_j$,
there exists a basis $u_1, \ldots, u_m$ such that
$$Cu_s = \lambda u_s + u_{s+1}, \quad s = 1, \ldots, m-1,$$
$$Cu_m = \lambda u_m.$$
This is called the Jordan normal form for the operator $C$. Note that
if $\dim L_j = m = 1$, then $L_j$ is a one-dimensional eigenspace of the
operator $C$.
Hint: Cf. Riesz and Nagy (1955), Sections 80 and 89.
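The chain relations in (iv) can be illustrated in finite dimensions. The sketch below is an assumption-laden toy example, not part of the problem: it takes $X = \mathbb{R}^3$, a single block with $m = 3$ and the hypothetical value $\lambda = 2$, writes $C$ in the basis $u_1, u_2, u_3$, and follows the vector $u_1$ under repeated application of $A = \lambda I - C$.

```python
def jordan_block(lam: float, m: int):
    """Matrix of C in a basis u_1..u_m with C u_s = lam*u_s + u_{s+1}, C u_m = lam*u_m."""
    C = [[0.0] * m for _ in range(m)]
    for s in range(m):
        C[s][s] = lam                 # column s holds the coordinates of C applied to u_{s+1}
        if s + 1 < m:
            C[s + 1][s] = 1.0
    return C

def apply(M, x):
    """Matrix-vector product for lists of lists."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

lam, m = 2.0, 3                       # hypothetical eigenvalue and block size
C = jordan_block(lam, m)
A = [[(lam if i == j else 0.0) - C[i][j] for j in range(m)] for i in range(m)]  # A = lam*I - C

# Follow u_1 -> A u_1 -> A^2 u_1 -> ...; the chain vanishes after exactly m steps.
chain = [[1.0] + [0.0] * (m - 1)]
for _ in range(m):
    chain.append(apply(A, chain[-1]))
```

In this toy case $A^{m-1}u_1 \ne 0$ while $A^m u_1 = 0$, so the null spaces $N(A^k)$ grow strictly until $k = m$ and then stay constant, and the last basis vector $u_m$ is an eigenvector of $C$.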
Next we want to study the following three important inequalities:
Young inequality ⇒ Hölder inequality ⇒ Minkowski inequality.
31 Recall our convention that $K \subset M$ means both $K \subseteq M$ and $K \ne M$.
5.5. The Young inequality. Show that
$$ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad\text{for all } a, b \ge 0, \tag{113}$$
where
$$\frac{1}{p} + \frac{1}{q} = 1, \quad 1 < p, q < \infty. \tag{114}$$
Solution: If $p = q = 2$, then (113) follows from $(a-b)^2 \ge 0$. In the
general case, we consider the function
$$F(a) = \frac{a^p}{p} + \frac{b^q}{q} - ab,$$
for fixed $b > 0$. Note that
$$F(0) > 0, \quad F(b^{q/p}) = 0, \quad\text{and}\quad \lim_{a\to+\infty} F(a) = +\infty.$$
Hence $F$ has a minimum on $[0,\infty[$. Thus, there exists a number $a_0 > 0$
such that
$$F(a) \ge F(a_0) \quad\text{for all } a \in [0,\infty[.$$
From $F'(a_0) = 0$ it follows that $a_0 = b^{q/p}$, and hence $F(a_0) = 0$. This is
(113).
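The argument of the solution can be checked numerically. The following is a minimal sketch; the function names `young_gap` and `minimizer` and the sample grid are illustrative assumptions. It evaluates the gap $F(a) = a^p/p + b^q/q - ab$ on a few points and at the critical point given by $F'(a_0) = a_0^{p-1} - b = 0$.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """Return a**p/p + b**q/q - a*b, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)                 # conjugate exponent: 1/p + 1/q = 1
    return a ** p / p + b ** q / q - a * b

def minimizer(b: float, p: float) -> float:
    """Critical point a0 of a -> young_gap(a, b, p), i.e. a0**(p-1) = b."""
    return b ** (1.0 / (p - 1.0))     # equals b**(q/p), since q/p = 1/(p-1)

# The gap is nonnegative on a sample grid ...
grid = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5]
worst = min(young_gap(a, b, p) for p in (1.5, 2.0, 3.0) for a in grid for b in grid)

# ... and it vanishes at the minimizer (here for the sample values b = 2, p = 3).
gap_at_min = young_gap(minimizer(2.0, 3.0), 2.0, 3.0)
```

Up to rounding, `worst` is nonnegative and `gap_at_min` is zero, which mirrors the two facts $F \ge F(a_0)$ and $F(a_0) = 0$ used in the proof.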
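5.6. The Hölder inequality in $\mathbb{R}^N$. Assume (114). Show that
$$\sum_{j=1}^N |\xi_j\eta_j| \le \Bigl(\sum_{j=1}^N |\xi_j|^p\Bigr)^{1/p}\Bigl(\sum_{j=1}^N |\eta_j|^q\Bigr)^{1/q} \tag{115}$$
for all $\xi_j, \eta_j \in \mathbb{C}$, $j = 1, \ldots, N$.
Solution: Set $A := \bigl(\sum_k |\xi_k|^p\bigr)^{1/p}$ and $B := \bigl(\sum_k |\eta_k|^q\bigr)^{1/q}$, where we may assume
$A, B > 0$. It follows from the Young inequality (113) that
$$\frac{|\xi_j|\,|\eta_j|}{AB} \le \frac{|\xi_j|^p}{pA^p} + \frac{|\eta_j|^q}{qB^q}.$$
Summing over $j = 1, \ldots, N$ and using (114), we get (115).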
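As a sanity check of the Hölder inequality for finite sums, the following sketch samples random complex vectors and also evaluates one case of equality. The helper names, the seed, and the sample sizes are assumptions made only for this illustration.

```python
import random

def p_norm(x, p):
    """(sum |x_j|^p)^(1/p) for a finite sequence of real or complex numbers."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_sides(x, y, p):
    """Return (sum |x_j y_j|, ||x||_p * ||y||_q) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(s * t) for s, t in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)

rng = random.Random(0)                # fixed seed so that the run is reproducible

def random_vector(n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

holder_ok = True
for p in (1.25, 2.0, 4.0):
    for _ in range(200):
        lhs, rhs = holder_sides(random_vector(5), random_vector(5), p)
        holder_ok = holder_ok and lhs <= rhs * (1 + 1e-12)

# Equality case: eta_j = |xi_j|**(p-1), so that |eta_j|**q = |xi_j|**p.
x_eq, p_eq = [1.0, 2.0, 3.0], 3.0
y_eq = [abs(t) ** (p_eq - 1.0) for t in x_eq]
lhs_eq, rhs_eq = holder_sides(x_eq, y_eq, p_eq)
```

The equality case shows that the constant 1 in the inequality cannot be improved.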
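5.7. The Minkowski inequality. Set
$$\|x\|_p := \Bigl(\sum_{j=1}^N |\xi_j|^p\Bigr)^{1/p},$$
where $x = (\xi_1, \ldots, \xi_N)$. Show that
$$\|x + y\|_p \le \|x\|_p + \|y\|_p \quad\text{for all } x, y \in \mathbb{C}^N, \tag{116}$$
where $1 \le p < \infty$.
Solution: Let $p > 1$. By the Hölder inequality (115),
$$\sum_{j=1}^N |\xi_j + \eta_j|^p \le \sum_{j=1}^N |\xi_j + \eta_j|^{p-1}(|\xi_j| + |\eta_j|) \le \Bigl(\sum_{j=1}^N |\xi_j + \eta_j|^{(p-1)q}\Bigr)^{1/q}\bigl(\|x\|_p + \|y\|_p\bigr).$$
Since $(p-1)q = p$ and $\frac{1}{q} = 1 - \frac{1}{p}$, we get (116).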
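The triangle inequality for the $p$-norm and the exponent identity used in the last step can be checked in the same spirit. This is only a sketch; the seed, the vector length, and the list of exponents are arbitrary choices.

```python
import random

def p_norm(x, p):
    """(sum |x_j|^p)^(1/p) for a finite sequence of real or complex numbers."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

rng = random.Random(1)                # fixed seed so that the run is reproducible

def random_vector(n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

minkowski_ok = True
for p in (1.0, 1.5, 2.0, 5.0):
    for _ in range(200):
        x, y = random_vector(4), random_vector(4)
        total = [s + t for s, t in zip(x, y)]
        bound = p_norm(x, p) + p_norm(y, p)
        minkowski_ok = minkowski_ok and p_norm(total, p) <= bound * (1 + 1e-12)

# Exponent identity from the solution: (p - 1) q = p whenever 1/p + 1/q = 1.
p = 3.0
q = p / (p - 1.0)
exponent_identity_gap = abs((p - 1.0) * q - p)
```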
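5.8. The Banach space $\ell_p^{\mathbb{K}}$. Let $\mathbb{K}^\infty$ denote the linear space of all the
sequences $(u_n)_{n\ge 1}$, where $u_n \in \mathbb{K}$ for all $n \in \mathbb{N}$ (cf. Problem 1.5 in AMS Vol.
108). Moreover, let $\ell_p^{\mathbb{K}}$ denote the set of all $(u_n) \in \mathbb{K}^\infty$ such that
$$\|u\|_p := \Bigl(\sum_{n=1}^\infty |u_n|^p\Bigr)^{1/p} < \infty,$$
where $1 \le p < \infty$. The space $\ell_p^{\mathbb{K}}$ has been introduced in Problem 1.5 in
AMS Vol. 108. Using Problem 5.7, show that
(i) $\ell_p^{\mathbb{K}}$ is a separable Banach space over $\mathbb{K}$ if $1 \le p < \infty$.
(ii*) For $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$,
$$(\ell_p^{\mathbb{K}})^* \cong \ell_q^{\mathbb{K}}.$$
That is, the dual space $(\ell_p^{\mathbb{K}})^*$ is normisomorphic32 to $\ell_q^{\mathbb{K}}$. More
precisely, let $y \in \ell_q^{\mathbb{K}}$. Setting
$$f(x) := \sum_{j=1}^\infty \eta_j\xi_j \quad\text{for all } x \in \ell_p^{\mathbb{K}}, \tag{117}$$
we get a linear continuous functional $f$ on $\ell_p^{\mathbb{K}}$ (i.e., $f \in (\ell_p^{\mathbb{K}})^*$) and
$$\|f\| = \|y\|_q. \tag{118}$$
Each functional $f \in (\ell_p^{\mathbb{K}})^*$ can be obtained this way, where $y$ is
uniquely determined by $f$.
(iii) $\ell_p^{\mathbb{K}}$ is reflexive if $1 < p < \infty$.
(iv*) $(\ell_1^{\mathbb{K}})^* \cong \ell_\infty^{\mathbb{K}}$, in the sense of (ii) with $p = 1$ and $q = \infty$.
(v*) $\ell_1^{\mathbb{K}}$ and $\ell_\infty^{\mathbb{K}}$ are not reflexive.
(vi*) $\ell_\infty^{\mathbb{K}}$ is not separable.
Hint: Cf. Köthe (1960), Vol. 1, Section 14.
32 Recall that the Banach space $X$ over $\mathbb{K}$ is normisomorphic to the Banach
space $Y$ over $\mathbb{K}$ iff there exists a linear bijective map $\phi\colon X \to Y$ such that
$\|\phi(u)\| = \|u\|$ for all $u \in X$.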
5.9. The fundamental Banach space $L_p^{\mathbb{K}}(G)$, $1 < p < \infty$. Let $G$ be a
nonempty open subset of $\mathbb{R}^N$, $N \ge 1$. Set
$$\|u\|_p := \left(\int_G |u(x)|^p\,dx\right)^{1/p}, \quad 1 \le p < \infty.$$
Let $L_p^{\mathbb{K}}(G)$ denote the set of all measurable functions $u\colon G \to \mathbb{K}$ such that33
$\|u\|_p < \infty$.
5.9a. Basic properties. Show that
(i) If $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$, then we have the Hölder inequality:
$$\int_G |uv|\,dx \le \|u\|_p\,\|v\|_q \tag{119}$$
for all $u \in L_p^{\mathbb{K}}(G)$ and $v \in L_q^{\mathbb{K}}(G)$.
(ii) For all $u, v \in L_p^{\mathbb{K}}(G)$ with $1 \le p < \infty$, we have the Minkowski
inequality:
$$\|u + v\|_p \le \|u\|_p + \|v\|_p. \tag{120}$$
(iii) Let $1 \le p < \infty$. Then $L_p^{\mathbb{K}}(G)$ becomes a separable Banach space over
$\mathbb{K}$ with the norm $\|\cdot\|_p$ once we identify any two functions that differ
only on a set of measure zero on $G$. Moreover, $C_0^\infty(G)_{\mathbb{K}}$ is dense in
$L_p^{\mathbb{K}}(G)$.
(iv) Assume $1 \le p < \infty$. Let $(u_n)$ be a sequence in $L_p(G)$ such that
$u_n \to u$ in $L_p^{\mathbb{K}}(G)$ as $n \to \infty$. Then there are a subsequence $(u_{n'})$ and
a function $v \in L_p(G)$ such that, for almost all $x \in G$,
$$u_{n'}(x) \to u(x) \quad\text{as } n' \to \infty,$$
and $\sup_{n'} |u_{n'}(x)| \le v(x)$.
33 For brevity we write $L_p(G)$ instead of $L_p^{\mathbb{K}}(G)$ in the case where $\mathbb{K} = \mathbb{R}$.
Solution: Ad (i). We may assume that $\|u\|_p = \|v\|_q = 1$. Otherwise
we replace $u$ and $v$ with $\lambda u$ and $\mu v$, respectively. Integration of the Young
inequality,
$$|uv| \le \frac{|u|^p}{p} + \frac{|v|^q}{q},$$
over $G$ yields $\int_G |uv|\,dx \le 1$. This is (119).
Ad (ii). Use a similar argument as in Problem 5.7.
Ad (iii). Use the same arguments as for the space $L_2(G)$ in Section 2.2
of AMS Vol. 108.
Ad (iv). Use the same argument as in the proof given in Section 2.2.1
of AMS Vol. 108. For the construction of the function $v$, see Kufner et al.
(1977), Section 2.8.
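5.9b. The Hölder inequality for three factors (special case). Show that
$$\left|\int_G uvw\,dx\right| \le \|u\|_4\,\|v\|_4\,\|w\|_2 \tag{121}$$
for all $u, v \in L_4^{\mathbb{K}}(G)$ and $w \in L_2^{\mathbb{K}}(G)$.
This inequality plays an important role in the existence proof for the
stationary Navier-Stokes equations (cf. Section 5.17.6).
Solution: Since $|u|^2, |v|^2 \in L_2^{\mathbb{K}}(G)$, it follows from the Hölder inequality
for two factors from (119) that
$$\int_G |uv|^2\,dx \le \left(\int_G |u|^4\,dx\right)^{1/2}\left(\int_G |v|^4\,dx\right)^{1/2}. \tag{122}$$
Hence $uv, w \in L_2^{\mathbb{K}}(G)$. Thus, again by the Hölder inequality for two factors,
$$\left|\int_G uvw\,dx\right| \le \|uv\|_2\,\|w\|_2.$$
Using (122), we obtain (121).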
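The two-step argument of the solution can be replayed on a grid. The sketch below is an illustration under assumptions: $G = \,]0,1[$, integrals are replaced by midpoint sums, and the three sample functions are hypothetical choices. Since the Hölder inequality holds for any measure, the discrete versions of both steps hold exactly, up to rounding.

```python
import math

def lp_norm(values, p, h):
    """Riemann-sum version of (integral |f|^p dx)^(1/p) on a uniform grid with mesh h."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

n = 1000
h = 1.0 / n
grid = [(k + 0.5) * h for k in range(n)]      # midpoints of G = ]0, 1[

u = [x ** -0.2 for x in grid]                 # unbounded near 0, but in L_4 since 4 * 0.2 < 1
v = [math.sin(7.0 * x) for x in grid]
w = [math.exp(x) - 2.0 for x in grid]

lhs = abs(h * sum(a * b * c for a, b, c in zip(u, v, w)))     # |integral of u v w|
uv_norm = lp_norm([a * b for a, b in zip(u, v)], 2, h)        # ||uv||_2
rhs_two_step = uv_norm * lp_norm(w, 2, h)                     # second step: uv against w
rhs = lp_norm(u, 4, h) * lp_norm(v, 4, h) * lp_norm(w, 2, h)  # right-hand side of (121)
```

The chain `lhs <= rhs_two_step <= rhs` reproduces the order of the two estimates in the solution.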
5.9c. The Hölder inequality for $n$ factors. Let $1 < p_1, \ldots, p_n < \infty$, where
$$\sum_{j=1}^n \frac{1}{p_j} = 1,$$
and $n = 2, 3, \ldots$. Show that
$$\int_G |u_1 u_2 \cdots u_n|\,dx \le \prod_{j=1}^n \|u_j\|_{p_j}$$
for all $u_j \in L_{p_j}^{\mathbb{K}}(G)$, $j = 1, \ldots, n$.
Hint: Use the same argument as in Problem 5.9b.
5.10.* The dual space. Let $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$. Then
$$L_p^{\mathbb{K}}(G)^* \cong L_q^{\mathbb{K}}(G),$$
that is, the dual space $(L_p^{\mathbb{K}}(G))^*$ is normisomorphic to $L_q^{\mathbb{K}}(G)$. More
precisely, if $v \in L_q^{\mathbb{K}}(G)$, then
$$F(u) := \int_G u(x)v(x)\,dx \quad\text{for all } u \in L_p^{\mathbb{K}}(G)$$
defines a linear continuous functional $F$ on $L_p^{\mathbb{K}}(G)$ with
$$\|F\| = \|v\|_q.$$
Each $F \in L_p^{\mathbb{K}}(G)^*$ can be obtained this way, where $v \in L_q^{\mathbb{K}}(G)$ is uniquely
determined by $F$.
Hint: Study the proof in Kufner et al. (1977).
5.11. Reflexivity. If $1 < p < \infty$, then the space $L_p^{\mathbb{K}}(G)$ is reflexive.
Hint: Use Problem 5.10.
5.12. The Sobolev space $W_p^m(G)_{\mathbb{K}}$. Let $G$ be a nonempty open subset of
$\mathbb{R}^N$, let $1 \le p < \infty$, and let $m = 1, 2, \ldots$. Set
$$\|u\|_{m,p} := \Bigl(\sum_{|\alpha| \le m} \int_G |\partial^\alpha u|^p\,dx\Bigr)^{1/p},$$
meaning that we sum over all the partial derivatives of $u$ up to order $m$.
By definition, the space $W_p^m(G)_{\mathbb{K}}$ consists of all the functions
$u \in L_p^{\mathbb{K}}(G)$
with
$$\partial^\alpha u \in L_p^{\mathbb{K}}(G) \quad\text{for all } \alpha\colon 0 < |\alpha| \le m,$$
where the partial derivatives are to be understood in the sense of generalized
functions.
Explicitly, this means the following. We have $u \in W_p^m(G)_{\mathbb{K}}$ iff $u \in L_p^{\mathbb{K}}(G)$,
and for each $\alpha\colon 0 < |\alpha| \le m$ there exists a function denoted by $\partial^\alpha u$ such
that $\partial^\alpha u \in L_p^{\mathbb{K}}(G)$ and
$$\int_G u\,\partial^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_G (\partial^\alpha u)\,\varphi\,dx$$
for all $\varphi \in C_0^\infty(G)$. Show that
(i) $W_p^m(G)_{\mathbb{K}}$ becomes a Banach space over $\mathbb{K}$ with the norm $\|\cdot\|_{m,p}$ once
we identify any two functions that differ only on a set of measure zero
on $G$.
(ii) $W_p^m(G)_{\mathbb{K}}$ is reflexive if $1 < p < \infty$.
Hints: Ad (i). Use the same arguments as for $W_2^1(G)$ in Section 2.2 in
AMS Vol. 108.
Ad (ii). Observe that $W_p^m(G)_{\mathbb{K}}$ is normisomorphic to a closed linear
subspace of the product space
$$L_p^{\mathbb{K}}(G) \times \cdots \times L_p^{\mathbb{K}}(G),$$
by means of the map
$$u \mapsto (\partial^\alpha u)_{|\alpha| \le m}.$$
Furthermore, use Problem 5.11 and the following two facts:
(a) Products of reflexive Banach spaces are again reflexive.
(b) Closed linear subspaces of reflexive Banach spaces are again reflexive.
The Sobolev spaces $W_p^m(G)_{\mathbb{K}}$ represent the basic tool for the modern theory
of linear and nonlinear partial differential equations.
This can be found in Zeidler (1986), Vols. 2ff.
5.13. Approximation of compact operators. Let $G_n, G\colon X \to Y$ be linear
continuous operators, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$. Show
that if
$$\lim_{n\to\infty} \|G_n - G\| = 0$$
and $G_n$ is compact for all $n$, then $G$ is also compact.
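A standard situation covered by this problem is the approximation of an operator by finite-rank truncations. The sketch below uses a hypothetical example that is not taken from the text: the diagonal operator $(x_k) \mapsto (x_k/k)$ on $\ell_2$, cut off at a finite level $K$ so that it can be stored. For a diagonal operator on $\ell_2$, the operator norm is the supremum of the absolute values of the diagonal entries.

```python
def diagonal_operator_norm(diagonal):
    """Operator norm on l_2 of the diagonal operator (x_k) -> (d_k x_k): sup |d_k|."""
    return max((abs(d) for d in diagonal), default=0.0)

K = 1000                                   # hypothetical cut-off level of the sequence space
d = [1.0 / k for k in range(1, K + 1)]     # diagonal of G: (x_k) -> (x_k / k)

def truncation_error(n):
    """Norm of G - G_n, where G_n keeps only the first n diagonal entries (finite rank)."""
    tail = [0.0] * n + d[n:]
    return diagonal_operator_norm(tail)

errors = [truncation_error(n) for n in (1, 10, 100)]
```

Here $\|G - G_n\| = 1/(n+1) \to 0$, and each $G_n$ has finite rank and is therefore compact, so the statement of the problem yields the compactness of $G$.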
5.14. Integral operators. Let $-\infty < a < b < \infty$. Define
$$(Au)(x) := \int_a^b A(x,y)u(y)\,dy \quad\text{for all } x \in\, ]a,b[,$$
where the function $A\colon ]a,b[\, \times\, ]a,b[\, \to \mathbb{R}$ is measurable and
$$\int_a^b\!\!\int_a^b |A(x,y)|^p\,dx\,dy < \infty \tag{123}$$
for fixed $p\colon 1 < p < \infty$. Set
$$X := L_q(a,b),$$
where $p^{-1} + q^{-1} = 1$. By Problem 5.10,
$$X^* = L_p(a,b),$$
in the sense of a normisomorphism. Hence $X^{**} = X$. Show that
(i) The operator $A\colon X \to X^*$ is linear and continuous with
$$\|A\| \le \left(\int_a^b\!\!\int_a^b |A(x,y)|^p\,dx\,dy\right)^{1/p}. \tag{124}$$
(ii) The operator $A\colon X \to X^*$ is compact.
(iii) The dual operator $A^T\colon X \to X^*$ is given by $A^T = B$, where
$$(Bv)(x) := \int_a^b A(y,x)v(y)\,dy \quad\text{for all } x \in\, ]a,b[. \tag{125}$$
(iv) If $p = 2$, then the adjoint operator $A^*\colon X \to X$ is given by $A^* = B$.
Solution: Ad (i). By the Fubini theorem (cf. the appendix to AMS Vol.
108), it follows from (123) that
$$\int_a^b |A(x,y)|^p\,dy < \infty$$
for almost all $x \in\, ]a,b[$. Let $u \in X$. By the Hölder inequality,
$$|(Au)(x)| \le \left(\int_a^b |A(x,y)|^p\,dy\right)^{1/p}\|u\|_q.$$
Hence
$$\|Au\|_p \le \left(\int_a^b\!\!\int_a^b |A(x,y)|^p\,dy\,dx\right)^{1/p}\|u\|_q.$$
Ad (ii). Set $G := ]a,b[\, \times\, ]a,b[$. Since $C_0^\infty(G)$ is dense in $L_p(G)$, for each
$n \in \mathbb{N}$ there is a function $A_n \in C_0^\infty(G)$ such that
It follows as in the proof of Lemma 3 in Section 4.4 of AMS Vol. 108 that
the operator $A_n\colon X \to X^*$ corresponding to the kernel $A_n$ is compact. By
Problem 5.13, the operator $A\colon X \to X^*$ is compact, too.
Ad (iii). Observe that
$$\langle w, u\rangle = \int_a^b w(x)u(x)\,dx \quad\text{for all } w \in X^*,\ u \in X,$$
by Problem 5.10. According to the Tonelli theorem from the appendix to
AMS Vol. 108, for all $u, v \in X$, we get
$$\langle Bv, u\rangle = \int_a^b\left(\int_a^b A(y,x)v(y)\,dy\right)u(x)\,dx = \int_a^b\left(\int_a^b A(y,x)u(x)\,dx\right)v(y)\,dy = \langle v, Au\rangle.$$
Ad (iv). Observe that $(u \mid v) = \langle u, v\rangle$ if $p = 2$, that is, $X = L_2(a,b)$.
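Statements (i) and (iii) have direct discrete analogues, which can be checked with a quadrature rule. The following sketch rests on several assumptions: the interval is $]0,1[$, integrals are replaced by midpoint sums, and the kernel and the functions `u`, `v` are hypothetical examples. After discretization the operator $B$ with the transposed kernel $A(y,x)$ corresponds to the transposed matrix.

```python
import math

n = 200
a, b = 0.0, 1.0
h = (b - a) / n
xs = [a + (k + 0.5) * h for k in range(n)]     # midpoints of ]a, b[

def kernel(x, y):
    """Hypothetical non-symmetric kernel A(x, y), chosen only for illustration."""
    return math.exp(-x) * math.cos(3.0 * y) + x * y * y

K = [[kernel(x, y) for y in xs] for x in xs]

def apply_A(u):
    """(Au)(x) = integral of A(x, y) u(y) dy, as a midpoint sum."""
    return [h * sum(K[i][j] * u[j] for j in range(n)) for i in range(n)]

def apply_B(v):
    """(Bv)(x) = integral of A(y, x) v(y) dy, i.e. the transposed kernel."""
    return [h * sum(K[j][i] * v[j] for j in range(n)) for i in range(n)]

def pairing(w, u):
    """<w, u> = integral of w(x) u(x) dx, as a midpoint sum."""
    return h * sum(s * t for s, t in zip(w, u))

def lp_norm(f, p):
    return (h * sum(abs(t) ** p for t in f)) ** (1.0 / p)

u = [math.sin(5.0 * x) for x in xs]
v = [x * x - 0.3 for x in xs]

# Duality relation from (iii): <Bv, u> = <v, Au>
duality_gap = abs(pairing(apply_B(v), u) - pairing(v, apply_A(u)))

# Norm bound from (i): ||Au||_p <= (double integral of |A|^p)^(1/p) * ||u||_q
p = 3.0
q = p / (p - 1.0)
kernel_norm = (h * h * sum(abs(K[i][j]) ** p for i in range(n) for j in range(n))) ** (1.0 / p)
bound_holds = lp_norm(apply_A(u), p) <= kernel_norm * lp_norm(u, q) * (1 + 1e-12)
```

Both checks hold exactly for the discrete sums (up to rounding), because the interchange of summation and the Hölder inequality are valid for the counting measure scaled by `h`.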
5.15. Applications to integral equations. Show that Proposition 1 in Section
5.3 remains valid if the function $A\colon ]a,b[\, \times\, ]a,b[\, \to \mathbb{R}$ is measurable with
$$\int_a^b\!\!\int_a^b |A(x,y)|^2\,dx\,dy < \infty.$$
5.16. The zoo of function spaces. Let $G$ be a nonempty, open subset of $\mathbb{R}^N$.
In order to solve linear or nonlinear partial differential equations by the
methods of functional analysis, one has to choose the appropriate function
spaces. There exist two important types of function spaces, namely,
(i) the Lebesgue spaces $L_p^{\mathbb{C}}(G)$ and the Sobolev spaces $W_p^m(G)_{\mathbb{C}}$ of complex
functions $f\colon G \to \mathbb{C}$, where $1 \le p \le \infty$ and $m = 1, 2, \ldots$ (cf.
Problems 5.10 and 5.12), and
(ii) the Hölder spaces $C^{m,\alpha}(G)$ of complex functions $f\colon G \to \mathbb{C}$, where
$m = 0, 1, 2, \ldots$ and $0 < \alpha < 1$ (cf. Problem 1.8 in AMS Vol. 108).
5.16a. Show that the embedding $C^{m,\alpha}(G) \subseteq C^{k,\beta}(G)$ is compact provided
that $G$ is bounded and $k < m$ or $k = m$ and $0 < \beta < \alpha$.
Hint: Use the Arzelà-Ascoli theorem from Section 1.11 of AMS Vol. 108.
There exist many other important classes of function spaces (e.g., fractional
Sobolev spaces, Zygmund spaces, Hardy spaces, Morrey-Campanato
spaces, spaces of bounded variation (and of generalized bounded variation),
spaces of bounded mean oscillations, and so on). It turns out that there are
two scales of spaces, namely the Besov spaces $B^s_{pq}$ and the Triebel-Lizorkin
spaces $F^s_{pq}$, which play a fundamental role in organizing the zoo of function
spaces. In this connection, the Fourier transformation plays the decisive
role. Let us briefly discuss this.
5.16b. Dyadic partition of unity for $\mathbb{R}^N$. By definition, such a partition
is a family $\{\varphi_j\}$ of $C^\infty$-functions $\varphi_j\colon \mathbb{R}^N \to \mathbb{C}$ which have the following
properties:
(ii) $\varphi_0(x) = 0$ if $|x| > 2$.
(iii) $\varphi_j(x) = 0$ if $|x| < 2^{j-1}$ or $2^{j+1} < |x|$, where $j = 1, 2, \ldots$.
5.16c. The definition of the basic scales of function spaces via Fourier
transformation. Let $\{\varphi_j\}$ be a dyadic partition of unity for $\mathbb{R}^N$. Suppose
that $1 \le p, q \le \infty$ and $s \in \mathbb{R}$. We define
$$B^s_{pq}(\mathbb{R}^N) := \{f \in \mathcal{S}'\colon N_q(\|2^{sj}f_j\|_p) < \infty\}$$
and, for $p < \infty$,
$$F^s_{pq}(\mathbb{R}^N) := \{f \in \mathcal{S}'\colon \|N_q(2^{sj}f_j)\|_p < \infty\}.$$
The Besov spaces $B^s_{pq}$ are related to Hölder spaces, whereas the Triebel-Lizorkin
spaces $F^s_{pq}$ are related to Sobolev spaces and fractional Sobolev
spaces (cf. Problem 5.16f).
Discussion of notation. Let us explain the notation used in the preceding
definitions. Recall from Section 3.8 in AMS Vol. 108 that $\mathcal{S}'$ denotes the
space of tempered distributions and that the classic Fourier transformation
can be extended to a linear bijective operator $F\colon \mathcal{S}' \to \mathcal{S}'$. By definition
j = 1,2, ....
Let the distribution $f \in \mathcal{S}'$ be given. Then the distribution $\varphi_j Ff$ represents
a localization of the Fourier transform $Ff$ of $f$. The inverse Fourier
transformation applied to $\varphi_j Ff$ yields the function $f_j$. More precisely, we obtain
the decomposition34
$$f = \sum_{j=0}^\infty f_j,$$
where $f_j$ is an entire analytic function for all $j$, by the Paley-Wiener-Schwartz
theorem (cf. Schwartz (1966)).
The norm $\|\cdot\|_p$ is the norm on the Lebesgue space $L_p^{\mathbb{C}}(\mathbb{R}^N)$ (cf. Problem
5.9), whereas $N_q(a_j)$ is the norm on the space $\ell_q^{\mathbb{C}}$. That is,
5.16d. Banach spaces. Show that35
(i) $B^s_{pq}(\mathbb{R}^N)$ is a complex Banach space with respect to the norm
$N_q(\|2^{sj}f_j\|_p)$.
(ii) $F^s_{pq}(\mathbb{R}^N)$ is a complex Banach space with respect to the norm
$\|N_q(2^{sj}f_j)\|_p$.
5.16e. Function spaces on bounded domains. Let $G$ be a nonempty,
bounded, open subset of $\mathbb{R}^N$ with smooth boundary.36 Define
$$B^s_{pq}(G) := \text{restriction of the elements from } B^s_{pq}(\mathbb{R}^N) \text{ to } G$$
and
$$F^s_{pq}(G) := \text{restriction of the elements from } F^s_{pq}(\mathbb{R}^N) \text{ to } G.$$
This means the following. Let $f \in B^s_{pq}(\mathbb{R}^N)$. Then $f$ is a tempered distribution.
The restriction $f^*$ of $f$ to the set $G$ is defined through
$$f^*(\varphi) := f(\varphi) \quad\text{for all } \varphi \in C_0^\infty(G). \tag{126}$$
Show that
(i) $B^s_{pq}(G)$ is a complex Banach space with respect to the norm
$$\|f^*\| := \inf \|f\|.$$
Here $\|f\|$ denotes the norm on $B^s_{pq}(\mathbb{R}^N)$, and the infimum is taken
over all elements $f$ of $B^s_{pq}(\mathbb{R}^N)$ for which relation (126) holds true.
(ii) $F^s_{pq}(G)$ is a complex Banach space with respect to the norm from (i),
replacing $B^s_{pq}(\mathbb{R}^N)$ with $F^s_{pq}(\mathbb{R}^N)$.
34 This is to be understood in the sense of distributions, that is,
$$f(\varphi) = \sum_{j=0}^\infty f_j(\varphi) \quad\text{for all } \varphi \in \mathcal{S},$$
where $f_j(\varphi) = \int_{\mathbb{R}^N} f_j(x)\varphi(x)\,dx$.
35 It turns out that the definition of the spaces $B^s_{pq}$ and $F^s_{pq}$ does not depend
on the choice of the dyadic partition of unity. However, changing this partition
leads to equivalent norms.
36 This means that the boundary $\partial G$ of $G$ is an $(N-1)$-dimensional $C^\infty$-manifold
such that $G$ lies locally on one side of $\partial G$.
5.16f.* Important special cases. Let $G = \mathbb{R}^N$ or let $G$ be given as in
Problem 5.16e. Then
(i) $B^{m+\alpha}_{\infty\infty}(G) = C^{m,\alpha}(G)$ (Hölder spaces) if $m = 0, 1, \ldots$ and $0 < \alpha < 1$.
(ii) $F^0_{p2}(G) = L_p^{\mathbb{C}}(G)$ (Lebesgue spaces) if $1 < p < \infty$.
(iii) $F^m_{p2}(G) = W_p^m(G)_{\mathbb{C}}$ (Sobolev spaces) if $m = 1, 2, \ldots$ and $1 < p < \infty$.
Generally, if $s$ is an arbitrary real number and $1 < p < \infty$, then $F^s_{p2}(G)$
is called a fractional Sobolev space denoted by $W_p^s(G)$.
Hint: Cf. Triebel (1992).
5.16g. Characterization of fractional Sobolev spaces. Let $s \in \mathbb{R}$ and
$1 < p < \infty$. Then $W_p^s(\mathbb{R}^N)_{\mathbb{C}}$ consists of all tempered distributions $f$ such that
where $\psi_s(x) := (1 + |x|^2)^{s/2}$. Show that $W_p^s(\mathbb{R}^N)_{\mathbb{C}}$ is a complex Banach space
with respect to the norm
Hint: Cf. Zeidler (1986), Vol. 2A, Section 21.20 and Triebel (1992).
Historical Remark. Hölder and Ljapunov introduced the class of Hölder
continuous functions at the end of the nineteenth century in order
to describe subtle properties of potentials caused by mass distributions.
Sobolev spaces $W_p^m(G)_{\mathbb{C}}$ for $m = 1, 2, \ldots$ emerged in the 1930s. These two
classes of spaces play a fundamental role in the modern theory of partial
differential equations (e.g., see Zeidler (1986), Vols. 1-5). Since the 1950s,
many attempts were made to generalize these two classes of function spaces.
The spaces $B^s_{pq}$ and $F^s_{pq}$ were introduced in the 1970s. A detailed study of
these spaces and of their relations to other function spaces (along with
valuable historical remarks) can be found in the monograph by Triebel
(1992).
As an elementary introduction to function spaces, we recommend the
textbooks by Kufner, John, and Fučík (1977) and Zeidler (1986), Vol. 2A,
Chapters 21 and 22. A summary of important results about function spaces
and about their relation to interpolation theory can be found in the extensive
appendix to Zeidler (1986), Vol. 2B. Interpolation theory emerged in
the 1960s and represents an important tool in order to organize the zoo
of function spaces and to obtain properties of operators between function
spaces in an intelligent and very effective way.
5.17. The generalized Riesz theorem for functionals. Let $X$ be a real Banach
space, and let $Y$ be a real Hilbert space. Suppose that the linear continuous
operator $A\colon X \to Y$ has closed range. We are given a functional $F \in X^*$
that vanishes on the null space $N(A)$ of the operator $A$. Show that there
exists an element $p$ of $Y$ such that
$$F(v) = (p \mid Av)_Y \quad\text{for all } v \in X.$$
Solution: Consider the dual operator $A^T\colon Y^* \to X^*$. The closed range
theorem from Section 3.12 tells us that
$$R(A^T) = N(A)^\perp.$$
By assumption, $F \in N(A)^\perp$. Thus, there is a functional $f \in Y^*$ such that
$$F = A^T f.$$
By the Riesz theorem, there exists a point $p \in Y$ such that
$$f(w) = (p \mid w) \quad\text{for all } w \in Y.$$
Thus, for all $v \in X$,
$$F(v) = (A^T f)(v) = f(Av) = (p \mid Av). \qquad □$$
This result will be used in Problem 5.18b in order to prove the existence
of a pressure function $p$ in a fluid.
5.18. Modern vector calculus and its physical interpretation. Let $G$ be a
nonempty, bounded, open, connected set in $\mathbb{R}^3$ such that the boundary $\partial G$
is sufficiently smooth.37
37 For example, suppose that the boundary $\partial G$ is a two-dimensional $C^1$-manifold,
where $G$ lies locally on one side of $\partial G$.
5.18a.* The compressibility equation. Recall from Section 5.17.3 that
$$\mathcal{H} = \mathring{W}_2^1(G) \times \mathring{W}_2^1(G) \times \mathring{W}_2^1(G)$$
and
Furthermore,
and $X$ denotes the closure of $D$ in the Hilbert space $\mathcal{H}$. In what follows we
sum over two equal indices from 1 to 3.
Consider now the problem
$$\nabla v = \mu \quad\text{on } G, \qquad v = 0 \quad\text{on } \partial G. \tag{127}$$
Recall that $\nabla v = \operatorname{div} v$. For given $\mu$, we are looking for a velocity field $v$
of a fluid on $G$.
Here, $\mu$ measures the compressibility of the fluid. More precisely, the
quantity $\operatorname{div} v$ measures the relative change in volume of the flow (in
first-order approximation). In particular, $\operatorname{div} v \equiv 0$ is equivalent to the fact that
the flow is volume preserving, that is, the flow is incompressible (cf. Zeidler
(1986), Vol. 4, Section 70.5).
Using the velocity components with respect to a Cartesian coordinate
system, we obtain the following equivalent problem:
$$\partial_j v_j = \mu \quad\text{on } G, \qquad v_j = 0 \quad\text{on } \partial G,\ j = 1,2,3. \tag{128}$$
We are looking for a solution $v \in \mathcal{H}$. The following statements hold:
(i) If $v \in \mathcal{H}$ is a solution of (128), then $\mu \in Z$ and
$$\int_G \mu\,dx = 0. \tag{129}$$
(ii) For given $\mu \in Z$ with (129), the original problem (128) has a solution
$v_* \in \mathcal{H}$. The general solution to (128) is given by
$$v = v_* + w, \quad w \in X. \tag{130}$$
Hint: Relation (129) follows from the Gauss theorem
$$\int_G \mu\,dx = \int_G \operatorname{div} v\,dx = \int_{\partial G} vn\,dS = 0.$$
Study the proof to (ii) in the monograph by Galdi (1994), Vol. 1, Sections
3.3 and 3.4.1.
Remark: Let $\mu = 0$. If $G$ is an unbounded domain, then each velocity
field $v \in X$ is a solution to equation (128). Unfortunately, if $G$ is poorly
shaped, then it may be that other solutions to (128) are not living in the
space $X$. This fact, discovered by Heywood in 1976, complicates the
investigations of the nonstationary Navier-Stokes equation (cf. Galdi (1994),
Vols. 3 and 4).
5.18b. The weak pressure equation. Consider the equation
$$-\nabla p = P \quad\text{on } G, \qquad \int_G p\,dx = 0. \tag{131}$$
Recall that $\nabla p = \operatorname{grad} p$. Here, $p$ can be regarded as the pressure in a fluid.
For given outer force density38 $P$, we are looking for a pressure function $p$
normalized by the second equation in (131). Using a Cartesian coordinate
system, problem (131) is equivalent to the following problem:
$$-\partial_j p = P_j \quad\text{on } G,\ j = 1,2,3, \qquad \int_G p\,dx = 0. \tag{132}$$
We are given the functionals $P_j \in W_2^{-1}(G)$, $j = 1,2,3$, such that
$$P_j(\varphi_j) = 0 \quad\text{for all } \varphi \in D. \tag{133}$$
Show that problem (132) has a unique solution $p \in L_2(G)$.
Hint: Use Problems 5.17 and 5.18a.
38 The pressure $p$ generates the force $\int_H P\,dx$ acting on each open subset $H$ of
$G$.
Solution: Step 1: Existence. Let $Y := \{p \in L_2(G)\colon \int_G p\,dx = 0\}$. The
linear continuous operator
$$\nabla\colon \mathcal{H} \to Y$$
is surjective, and it has the null space $N(\nabla) = X$, by Problem 5.18a. Define
$$P(\varphi) := P_j(\varphi_j) \quad\text{for all } \varphi \in \mathcal{H}.$$
Then the functional $P\colon \mathcal{H} \to \mathbb{R}$ is linear and continuous, and it vanishes
on the null space $N(\nabla)$, by (133). Thus, Problem 5.17 tells us that there
exists a $p \in Y$ such that
$$P(\varphi) = (p \mid \nabla\varphi)_Y = \int_G p\,\nabla\varphi\,dx \quad\text{for all } \varphi \in \mathcal{H}.$$
In particular, this implies
$$P_j(\psi) = \int_G p\,\partial_j\psi\,dx \quad\text{for all } \psi \in C_0^\infty(G).$$
Therefore, $p$ satisfies equation (132), in the sense of distribution theory.
Step 2: Uniqueness. Suppose that we are given a function $p \in L_2(G)$ along with
$$\partial_j p = 0 \quad\text{on } G,\ j = 1,2,3.$$
Then, $p \in W_2^1(G)$. Let us use Friedrichs' mollification from Problem 2.12 in
AMS Vol. 108. Set
$$p_\varepsilon(x) := \int_G \varphi_\varepsilon(x-y)p(y)\,dy \quad\text{on } G \text{ for all } \varepsilon > 0.$$
Let $H$ be a connected open subset of $G$. Then, for all $\varepsilon < \operatorname{distance}(\partial G, H)$,
$$\partial_j p_\varepsilon(x) = \int_G \frac{\partial}{\partial x_j}\varphi_\varepsilon(x-y)\,p(y)\,dy \quad\text{on } H.$$
Integration by parts yields
$$\partial_j p_\varepsilon = 0 \quad\text{on } H \text{ for all } j.$$
Since the function $p_\varepsilon$ is smooth, it is a constant on $H$ for all
$\varepsilon < \operatorname{distance}(\partial G, H)$.
From $p_\varepsilon \to p$ in $L_2(G)$ as $\varepsilon \to 0$, it follows that there is a subsequence
such that
$$p_{\varepsilon_n}(x) \to p(x) \quad\text{as } n \to \infty \text{ for almost all } x \in G,$$
by Problem 5.9a. Thus, $p(x) = \mathrm{const}$ for almost all $x \in G$. The normalization
condition $\int_G p\,dx = 0$ enforces $p(x) = 0$ for almost all $x \in G$.
5.18c.* The strong pressure equation. We want to solve the pressure
equation (132) in the case where $P_1$, $P_2$, and $P_3$ are functions. Let
$P_j \in L_2(G)$, $j = 1,2,3$. Then problem (132) has a solution $p \in W_2^1(G)$ iff
$$\int_G P_j v_j\,dx = 0 \quad\text{for all } v \in D. \tag{134}$$
This tells us that there is a duality between the velocity fields $v \in X$ of a
viscous incompressible fluid and the outer force densities $P \in Z$ generated
by a pressure $p$.
Hint: Study the elegant proof given in the monograph by Galdi (1994),
Vol. 1, p. 103.
5.18d. The famous Helmholtz-Weyl decomposition of vector fields. Let
us reformulate the theorem from the preceding problem in terms of Hilbert
space theory. Consider the Hilbert space
$$Z := L_2(G) \times L_2(G) \times L_2(G)$$
along with the inner product $(P \mid v) = \int_G P_j v_j\,dx$. Define
$$Z_1 := \text{closure of } D \text{ in } Z,$$
$$Z_2 := \{P \in Z\colon P = -\operatorname{grad} p \text{ for some } p \in W_2^1(G)\}.$$
Recall that $D$ consists of all $C_0^\infty(G)$-vector fields $v$ with $\operatorname{div} v = 0$. Obviously,
$Z_1$ and $Z_2$ are closed linear subspaces of the Hilbert space $Z$.
Problem 5.18c is now equivalent to saying that
$$Z_1^\perp = Z_2 \quad\text{in } Z,$$
that is, $Z_2$ represents the orthogonal complement to $Z_1$ in the Hilbert space
$Z$. Therefore, we have the famous orthogonal decomposition
$$Z = Z_1 \oplus Z_2. \tag{135}$$
This means that, for each vector field $w \in Z$, there exists the unique
decomposition
$$w = v + P, \quad v \in Z_1,\ P \in Z_2.$$
In particular, we have
$$\operatorname{div} v = 0 \quad\text{on } G$$
and
$$\operatorname{curl} P = 0 \quad\text{on } G, \tag{136}$$
in the sense of distribution theory. Observe that $\operatorname{curl} P = -\operatorname{curl}\operatorname{grad} p
= -\nabla \times (\nabla p) = 0$ for smooth functions $p$. Then a passage to the limit
yields (136).
The decomposition of vector fields $w$ into the sum of a divergence-free
field $v$ and a curl-free field $P$ was used by Hermann von Helmholtz in 1870.
The orthogonal decomposition (135) dates back to a famous paper written
by Hermann Weyl in 1940.
5.18e.** The very weak pressure equation. Let $G$ be a nonempty open
subset of $\mathbb{R}^n$, $n \ge 1$. Consider the pressure equation
$$-\partial_j p = P_j \quad\text{on } G, \quad j = 1, \ldots, n. \tag{137}$$
We are given the distributions $P_j \in \mathcal{D}'(G)$ for all $j$. Then, a distribution
$p \in \mathcal{D}'(G)$ is a solution of (137) iff
$$P_j(\varphi_j) = 0 \quad\text{for all } \varphi_j \in C_0^\infty(G) \text{ with } \partial_j\varphi_j = 0. \tag{138}$$
Here we sum over $j$ from 1 to $n$.
Hint: Obviously, relation (137) implies (138) (cf. the proof of Proposition
13 in Section 5.17). The converse statement represents the special case of a
profound theorem on differential forms due to de Rham (cf. Temam (1977),
p. 14).
The philosophy of the de Rham theorem is that, for equations in terms
of differential forms, the natural necessary solvability conditions are also
sufficient. This was also the general philosophy of the present chapter.
5.19. Bifurcation and formation of patterns in nature. Bifurcation describes
the change of the qualitative behavior of systems in nature produced by a
loss of stability under external influences. From a mathematical point of
view, the following problems can be solved by using Theorem 5.J.
5.19a.* The Benard problem. Consider a viscous fluid between two
plates, as shown in Figure 5.5, where the temperature $T_0$ of the lower
plate and the temperature $T_1$ of the upper plate satisfy the condition
$$T_0 > T_1.$$
FIGURE 5.5. Fluid layer heated from below; panels (b) and (c).
If the temperature difference $T_0 - T_1$ is sufficiently small, then the fluid is
at rest. If the temperature difference is increased, then at a critical value,
Benard cells appear in the fluid. These cells have a hexagonal structure.
This phenomenon was discovered experimentally by Benard in 1901.
In experiments a pan with silicon oil is heated with hot water from underneath
it. The fluid flow is made visible through small, equally distributed
aluminium pieces. After reaching the critical temperature difference, hexagonal
cells appear in the pan, which are shown from above in Figure 5.5(c).
Benard cells correspond to a bifurcation phenomenon. Physically, they
arise by combining the gravitational force and the heat convection flow.
During the past twenty years, physicists, chemists, and biologists have
shown a great deal of interest in these Benard cells, because one observes
the formation of a complicated structure. This process frequently occurs in
the evolution of life.
From a mathematical point of view, one can apply Theorem 5.J to the
Navier-Stokes equations combined with the equations for heat conduction.
Study the proof in Zeidler (1986), Vol. 4, Section 72.9.
5.19b.* The Taylor problem. As in Figure 5.6(a) we consider a viscous
fluid between two concentric cylinders, whereby the outer cylinder is at
rest and the inner cylinder rotates counterclockwise around the z-axis with
angular velocity $\omega$. Let the cylinder radii be $r$ and $R$, respectively, with
$r < R$. The Reynolds number Re is important. We set
$$\mathrm{Re} = \frac{\rho \omega r^2}{\eta}.$$
Here, $\rho$ and $\eta$ denote the density and the viscosity of the fluid, respectively.
In experiments one observes a critical Reynolds number $\mathrm{Re}_0$ with
the following properties:
FIGURE 5.6. (a) $\mathrm{Re} < \mathrm{Re}_0$; (b) $\mathrm{Re} = \mathrm{Re}_0$.
(i) For $\mathrm{Re} < \mathrm{Re}_0$, that is, for small angular velocities $\omega$, there exists an
axisymmetric flow that does not depend on the z-coordinate. This is
the Couette flow.
(ii) For $\mathrm{Re} = \mathrm{Re}_0$, Taylor vortices occur, which are periodic in z (see
Figure 5.6(b)). These Taylor vortices were discovered experimentally
by Taylor in 1922.
(iii) If the angular velocity $\omega$ gets larger and larger, that is, for increas-
ing Reynolds numbers, one obtains more and more complicated flow
pictures until at a certain $\mathrm{Re}_{\mathrm{crit}}$ turbulence occurs.
From a mathematical point of view, one can apply Theorem 5.J to the
Navier-Stokes equations. Study the proof of (ii) in Zeidler (1986), Vol. 4,
Section 72.7. A detailed discussion of the Couette-Taylor flow can be found
in the monograph by Chossat and Iooss (1994).
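As a small numerical illustration of the Reynolds number defined in Problem 5.19b, the following Python sketch evaluates Re from the density, angular velocity, inner radius, and viscosity, and compares it with a critical value. The function names and every numerical value, including the critical value passed in, are hypothetical placeholders chosen for illustration; the text gives no numbers, and the actual critical value depends on the apparatus.

```python
def reynolds_number(rho, omega, r, eta):
    """Re = rho * omega * r**2 / eta, as defined in Problem 5.19b."""
    if eta <= 0:
        raise ValueError("viscosity must be positive")
    return rho * omega * r**2 / eta


def flow_regime(re, re_0):
    """Classify the flow by comparing Re with an assumed critical value re_0."""
    if re < re_0:
        return "Couette flow"
    return "Taylor vortices or more complicated flow"


if __name__ == "__main__":
    # Illustrative values only (not taken from the text).
    re = reynolds_number(rho=2.0, omega=3.0, r=2.0, eta=4.0)
    print(re, flow_regime(re, re_0=10.0))
```

The classification only mirrors items (i) to (iii) qualitatively; it says nothing about how the critical value is obtained.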
5.19c.* Hopf bifurcation. Consider a finite-dimensional or infinite-dimen-
sional dynamical system that is in an equilibrium state. If an external
influence acts on the system, then it may happen that the equilibrium
state loses its stability and the system starts to oscillate. This important
phenomenon, called Hopf bifurcation, was discovered by Eberhard Hopf in
1942.
From a mathematical point of view, this bifurcation problem can be
solved by using Theorem 5.J.
Study the proof in Zeidler (1986), Vol. 4, Section 79.9.
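The loss of stability described in Problem 5.19c can be observed in a simple planar model. The sketch below assumes the standard supercritical normal form in polar coordinates, r' = r(mu - r^2) with constant angular speed; this model, the parameter name mu, and the step sizes are assumptions for illustration and are not taken from the text. For mu < 0 the radius decays to the equilibrium r = 0, while for mu > 0 it approaches a periodic orbit of radius sqrt(mu).

```python
import math


def simulate_radius(mu, r0=0.1, dt=1e-3, steps=200_000):
    """Integrate r' = r * (mu - r**2) with the explicit Euler method.

    Returns the radius at time steps * dt. The angular variable is
    omitted because it decouples in this (assumed) normal form.
    """
    r = r0
    for _ in range(steps):
        r += dt * r * (mu - r * r)
    return r


if __name__ == "__main__":
    for mu in (-0.5, 0.5):
        print(f"mu = {mu:+.1f}: final radius = {simulate_radius(mu):.6f}")
```

This only illustrates the phenomenon in one model system; it is not a substitute for the proof via Theorem 5.J.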
5.19d.* Water waves and bifurcation. Consider a parallel water flow in
a channel. If the velocity c of the flow becomes critical, then permanent
water waves occur (cf. Figure 5.7).
FIGURE 5.7. (a) $c < c_{\mathrm{crit}}$; (b) $c = c_{\mathrm{crit}}$.
From a mathematical point of view, one can use Theorem 5.J. Study the
proof for permanent gravitational water waves in Zeidler (1986), Vol. 4,
Chapter 71.
The rigorous treatment of permanent water waves represented a famous
open problem in the nineteenth century. A detailed mathematical and phys-
ical discussion of a broad class of permanent waves (including capillary-
gravity waves and tidal waves) along with historical remarks can be found
in the monograph by Zeidler (1972). See also the survey article by Zeidler
(1977).
5.20. The buckling of beams and plates, and bifurcation for problems that
have a variational structure. Under the influence of critical external forces,
buckling of beams and plates occurs as pictured in Figure 5.3. Such prob-
lems can frequently be solved by using Theorem 5.J. However, there is a
general bifurcation theorem for problems that have a variational structure.
Roughly speaking, this theorem says that each eigenvalue of the linearized
problem is also a bifurcation parameter.
(i) Study the main theorem of bifurcation theory for Fredholm operators
of variational type in Zeidler (1986), Vol. 2B, Section 29.18.
(ii) Study applications of this theorem to general variational problems in
Zeidler (1986), Vol. 2B, Section 29.19ff.
(iii) Study the buckling of beams in Zeidler (1986), Vol. 2B, Section 29.13.
(iv) Study the buckling of plates in Zeidler (1986), Vol. 4, Chapter 65 (the
von Karman equations).
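To make the phrase "eigenvalue of the linearized problem" in Problem 5.20 concrete, the sketch below computes the smallest eigenvalue of a model linearized problem, -u'' = lam * u on (0, 1) with u(0) = u(1) = 0. This model is an assumption chosen for illustration (a common idealization of a hinged beam); the text does not specify the beam equations, and the function names are hypothetical. The code uses a shooting method: it integrates the initial-value problem with the classical Runge-Kutta scheme and applies bisection to the boundary value u(1). The exact eigenvalues of this model are (n * pi)^2.

```python
import math


def shoot(lam, n=2000):
    """Return u(1) for u'' = -lam * u, u(0) = 0, u'(0) = 1 (classical RK4)."""
    h = 1.0 / n
    u, v = 0.0, 1.0
    for _ in range(n):
        k1u, k1v = v, -lam * u
        k2u, k2v = v + 0.5 * h * k1v, -lam * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -lam * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -lam * (u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return u


def first_eigenvalue(lo=1.0, hi=20.0, tol=1e-10):
    """Bisection on lam -> u(1); assumes exactly one sign change in [lo, hi]."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(first_eigenvalue(), math.pi ** 2)
```

The computation only locates a candidate bifurcation parameter for the model; that such an eigenvalue is actually a bifurcation point is the content of the theorem cited in (i).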
References
Additional references along with hints for further reading can be found in
AMS Vol. 108.
Abraham, R., Marsden, J., and Ratiu, T. (1983): Manifolds, Tensor Anal-
ysis, and Applications. Addison-Wesley, Reading, MA.
Abraham, R. and Robbin, J. (1967): Transversal Mappings and Flows. Ben-
jamin, New York.
Albers, D., Alexanderson, G., and Reid, C. (1987): International Mathe-
matical Congresses. Springer-Verlag, New York.
Alt, H. (1992): Lineare Funktionalanalysis: eine anwendungsorientierte
Einführung. 2nd edition. Springer-Verlag, Berlin, Heidelberg.
Amann, H. (1990): Ordinary Differential Equations: An Introduction to
Nonlinear Analysis. De Gruyter, Berlin.
Amann, H. (1995): Linear and Quasilinear Parabolic Problems, Vol. 1.
Birkhauser, Basel.
Ambrosetti, A. (1993): A Primer of Nonlinear Analysis. Cambridge Uni-
versity Press, Cambridge, UK.
Ambrosetti, A. and Coti-Zelati, V. (1993): Periodic Solutions of Singular
Lagrangian Systems. Birkhauser, Basel.
Antman, S. (1995): Nonlinear Elasticity. Springer-Verlag, New York.
Appell, J. and Zabrejko, P. (1990): Nonlinear Superposition Operators.
Cambridge University Press, Cambridge, UK.
Aubin, J. (1977): Applied Functional Analysis. Wiley, New York.
Aubin, J. (1993): Optima and Equilibria: An Introduction to Nonlinear
Analysis. Springer-Verlag, Berlin, Heidelberg (translation from French).
Aubin, J. and Ekeland, I. (1983): Applied Nonlinear Functional Analysis.
Wiley, New York.
Bagger, J. and Wess, J. (1991): Supersymmetry and Supergravity. 2nd ex-
panded edition. Princeton University Press, Princeton, NJ.
Baggett, L. (1992): Functional Analysis: A Primer. Marcel Dekker, New
York.
Bakelman, I. (1994): Convex Analysis and Nonlinear Geometric Elliptic
Equations. Springer-Verlag, Berlin, Heidelberg.
Banach, S. (1932): Theorie des operations lineaires. Warszawa. (En-
glish edition: Theory of Linear Operations. North-Holland, Amsterdam,
1987.)
Banks, R. (1994): Growth and Diffusion Phenomena. Springer-Verlag,
Berlin, Heidelberg.
Barton, G. (1989): Elements of Green's Functions and Propagation: Poten-
tials, Diffusion, and Waves. Clarendon Press, Oxford.
Bartsch, T. (1993): Topological Methods for Variational Problems with Sym-
metry. Springer-Verlag, Berlin, Heidelberg.
Beauzamy, B. (1988): Introduction to Operator Theory and Invariant Sub-
spaces. North-Holland, Amsterdam.
Berberian, S. (1974): Lectures in Functional Analysis and Operator Theory.
Springer-Verlag, New York.
Berezin, F. (1987): Introduction to Superanalysis. Reidel, Dordrecht.
Berezin, F. and Shubin, M. (1991): The Schrodinger Equation. Kluwer,
Dordrecht.
Berger, M. (1977): Nonlinearity and Functional Analysis. Academic Press,
New York.
Bethuel, F., Brezis, H., and Helein, F. (1994): Ginzburg-Landau Vortices.
Birkhauser, Basel.
Boccara, N. (1990): Functional Analysis. Academic Press, New York.
Bogoljubov, N. (1967): Lectures on Quantum Statistics, Vols. 1, 2. Gordon
and Breach, New York (translation from Russian).
Booss, B. and Bleecker, D. (1985): Topology and Analysis. Springer-Verlag,
New York.
Bourgignon, J. (1995): Variational Calculus. Springer-Verlag, Berlin, Hei-
delberg.
Bratteli, C. and Robinson, D. (1979): Operator Algebras and Quantum Sta-
tistical Mechanics, Vols. 1,2. Springer-Verlag, New York.
Brezis, H. (1983): Analyse functionelle et applications. Masson, Paris.
Brody, F. and Vamos, T. (eds.) (1994): Neumann Compendium (selected
papers by John von Neumann). World Scientific, Singapore.
Brokate, M. and Sprekels, J. (1995): Hysteresis Phenomena in Phase Tran-
sitions. Springer-Verlag, Berlin, Heidelberg (to appear).
Browder, F. (ed.) (1992): Nonlinear and Global Analysis. Reprints from the
Bulletin of the American Mathematical Society. Providence, RI.
Caroll, R. (1988): Mathematical Physics. North-Holland, Amsterdam.
Chang, K. (1966): Critical Point Theory and Its Applications. Springer-
Verlag, Berlin, Heidelberg (to appear).
Choquet-Bruhat, Y., DeWitt-Morette, and Dillard-Bleick, M. (1988): Anal-
ysis, Manifolds, and Physics. Vols. 1, 2. North-Holland, Amsterdam.
Chorin, J. (1975): Lectures on Turbulence Theory. Publish or Perish,
Boston, MA.
Chorin, A. (1994): Vorticity and Turbulence. Springer-Verlag, New York.
Chossat, P. and Iooss, G. (1994): The Couette-Taylor Flow. Springer-
Verlag, New York.
Chow, S. and Hale, J. (1982): Methods of Bifurcation Theory. Springer-
Verlag, Berlin, Heidelberg.
Ciarlet, P. (1977): Numerical Analysis of the Finite Element Method for
Elliptic Boundary-Value Problems. North-Holland, Amsterdam.
Ciarlet, P. (1983): Lectures on Three-Dimensional Elasticity. Springer-Ver-
lag, New York.
Collet, P. and Eckmann, J. (1990): Instabilities and Fronts in Extended
Systems. Princeton University Press, Princeton, NJ.
Collins, J. (1984): Renormalization. Cambridge University Press, Cam-
bridge, UK.
Colombeau, J. (1992): Multiplication of Distributions. Lecture Notes in
Mathematics, Vol. 1532. Springer-Verlag, Berlin, Heidelberg.
Connes, A. (1994): Noncommutative Geometry. Academic Press, New York.
Constantinescu, F. and de Groote, H. (1994): Geometrische und algebrai-
sche Methoden der Physik: Supermannigfaltigkeiten und Virasoro-Al-
gebren. Teubner-Verlag, Stuttgart.
Conway, J. (1990): A Course in Functional Analysis. Springer-Verlag, New
York.
Cornwell, J. (1989): Group Theory in Physics. Vol. 1: Fundamental Con-
cepts; Vol. 2: Lie Groups and Their Applications; Vol. 3: Supersymme-
tries and Infinite-Dimensional Algebras. Academic Press, New York.
Courant, R. and Hilbert, D. (1937): Die Methoden der Mathematischen
Physik, Vols. 1, 2. (English edition: Methods of Mathematical Physics,
Vols. 1, 2. Wiley, New York, 1989).
Creutz, M. (1983): Quarks, Gluons, and Lattices. Cambridge University
Press, Cambridge, UK.
Cycon, R., Froese, R., Kirsch, W., and Simon, B. (1986): Schrodinger Op-
erators. Springer-Verlag, New York.
Dacarogna, B. (1989): Direct Methods in the Calculus of Variations.
Springer-Verlag, Berlin, Heidelberg.
Dal Maso, G. (1993): An Introduction to Γ-Convergence. Birkhauser, Basel.
Dautray, D. and Lions, J. (1990): Mathematical Analysis and Numerical
Methods for Science and Technology; Vol. 1: Physical Origins and Clas-
sical Methods; Vol. 2: Functional and Variational Methods; Vol. 3: Spec-
tral Theory and Applications; Vol. 4: Integral Equations and Numerical
Methods; Vol. 5: Evolution Problems I; Vol. 6: Evolution Problems II -
the Navier-Stokes Equations, the Transport Equations, and Numerical
Methods. Springer-Verlag, Berlin, Heidelberg (translation from French).
Davies, P. (ed.) (1989): The New Physics. Cambridge University Press,
Cambridge, UK.
Deimling, K. (1985): Nonlinear Functional Analysis. Springer-Verlag, New
York.
Deimling, K. (1992): Multivalued Differential Equations. De Gruyter,
Berlin.
Deuflhard, P. and Hohmann, A. (1993): Numerische Mathematik 1. De
Gruyter, Berlin. (English edition: Numerical Analysis: A First Course
in Scientific Computation. De Gruyter, Berlin, 1994.)
Deuflhard, P. and Bornemann, F. (1994): Numerische Mathematik II: Inte-
gration gewohnlicher Differentialgleichungen. De Gruyter, Berlin (En-
glish edition in preparation).
DeVito, C. (1990): Functional Analysis and Linear Operator Theory. Addi-
son-Wesley, Reading, MA.
Diekman, O., Lunel, S., van Gils, A., and Walther, H. (1995): Delay Equa-
tions: Functional Analysis, Complex Analysis, and Nonlinear Analysis.
Springer-Verlag, Berlin, Heidelberg (to appear).
Dierkes, U., Hildebrandt, S., Kuster, A., and Wohlrab, O. (1992): Minimal
Surfaces, Vols. 1, 2. Springer-Verlag, Berlin, Heidelberg.
Dieudonne, J. (1969): Foundations of Modern Analysis. Academic Press,
New York.
Dieudonne, J. (1981): History of Functional Analysis. North-Holland, Am-
sterdam.
Donoghue, J., Golowich, M., and Holstein, B. (1992): The Dynamics of the
Standard Model. Cambridge University Press, Cambridge, UK.
Dubrovin, B., Fomenko, A., and Novikov, S. (1992): Modern Geometry:
Methods and Applications, Vols. 1-3. Springer-Verlag, New York (trans-
lation from Russian).
Dunford, N. and Schwartz, J. (1988): Linear Operators, Vols. 1-3. Wiley,
New York.
Economou, E. (1988): Green's Functions in Quantum Physics. Springer-
Verlag, New York.
Edwards, R. (1994): Functional Analysis. Dover, New York.
Ekeland, I. and Temam, R. (1974): Analyse convex et problemes variation-
nels. Dunod, Paris. (English edition: Convex Analysis and Variational
Problems. North-Holland, New York, 1976).
Ekeland, I. (1979): Elements d'economie mathematique. Hermann, Paris.
Ekeland, I. (1990): Convexity Methods in Hamiltonian Mechanics. Springer-
Verlag, New York.
Emch, G. (1986): Mathematical and Conceptual Foundations of 20th-Cen-
tury Physics. North-Holland, Amsterdam.
Ericksen, J. and Kinderlehrer, D. (eds.) (1988): Theory and Applications
of Liquid Crystals. Springer-Verlag, New York.
Euler, L. (1911ff): Opera Omnia (Collected Papers). Leipzig-Berlin, later
Basel-Zürich, Vols. 1-72.
Evans, L. (1994): Partial Differential Equations. Berkeley Mathematics
Lecture Notes, Vols. 3A and 3B. University of Berkeley, CA.
Farkas, M. (1994): Periodic Motions. Springer-Verlag, Berlin, Heidelberg.
Fenya, S. and Stolle, H. (1982): Theorie und Praxis der linearen Integral-
gleichungen, Vols. 1-4. Deutscher Verlag der Wissenschaften, Berlin.
Feynman, R., Leighton, R., and Sands, M. (1963): The Feynman Lectures
in Physics. Addison-Wesley, Reading, MA.
Finn, R. (1985): Equilibrium Capillary Surfaces. Springer-Verlag, Berlin,
Heidelberg.
Foias, C., Sell, G., and Temam, R. (1993): Turbulence in Fluid Flows: A
Dynamical Systems Approach. Springer-Verlag, New York.
Friedman, A. (1982): Variational Principles and Free Boundary-Value
Problems. Wiley, New York.
Friedman, A. (1989/94): Mathematics in Industrial Problems, Vols. 1-6.
Springer-Verlag, New York.
Gajewski, H., Gröger, K., and Zacharias, K. (1974): Nichtlineare Operator-
gleichungen. Akademie-Verlag, Berlin.
Galdi, G. (1994): An Introduction to the Mathematical Theory of the
Navier-Stokes Equations, Vols. 1-4. Springer-Verlag, Berlin, Heidelberg
(Vols. 3 and 4 to appear).
Gelfand, I. and Shilov, E. (1964): Generalized Functions, Vols. 1-5. Aca-
demic Press, New York (translation from Russian).
Gell-Mann, M. (1994): The Quark and the Jaguar: Adventures in the Simple
and the Complex. Freeman, San Francisco, CA.
Giaquinta, M. (1993): Introduction to Regularity Theory for Nonlinear El-
liptic Systems. Birkhauser, Basel.
Giaquinta, M. and Hildebrandt, S. (1995): Calculus of Variations, Vols. 1,
2. Springer-Verlag, New York.
Gilbarg, D. and Trudinger, N. (1994): Elliptic Partial Differential Equa-
tions of Second Order. 2nd edition. Springer-Verlag, New York.
Gilkey, P. (1984): Invariance Theory, the Heat Equation, and the Atiyah-
Singer Index Theorem. Publish or Perish, Boston, MA.
Girvin, S. and Prange, R. (1990): The Quantum Hall Effect. 2nd edition.
Springer-Verlag, New York.
Green, M., Schwarz, J., and Witten, E. (1987): Superstrings, Vols. 1, 2.
University Press, Cambridge, UK.
Greiner, W. and Müller, B. (1994): Quantum Mechanics: Symmetries.
Springer-Verlag, Berlin, Heidelberg.
Greiner, W. and Reinhardt, J. (1994): Quantum Electrodynamics. Springer-
Verlag, Berlin, Heidelberg.
Greiner, W. (1993): Gauge Theory of Weak Interactions. Springer-Verlag,
Berlin, Heidelberg.
Greiner, W. and Schäfer, A. (1994): Quantum Chromodynamics. Springer-
Verlag, Berlin, Heidelberg.
Grosche, G., Ziegler, D., Ziegler, V., and Zeidler, E. (eds.) (1995): Teubner-
Taschenbuch der Mathematik II (Handbook of Advanced Mathematics).
Teubner-Verlag, Stuttgart, Leipzig (English edition in preparation).
Grosse, H. (1995): Models in Statistical Physics and Quantum Field Theory.
Springer-Verlag, Berlin, Heidelberg (to appear).
Gruber, P. and Wills, J. (1993): Handbook of Convex Geometry, Vols. 1,2.
North-Holland, Amsterdam.
Guillemin, V. and Pollack, A. (1974): Differential Topology. Prentice-Hall,
Englewood Cliffs, NJ.
Guillemin, V. and Sternberg, S. (1990): Symplectic Techniques in Physics.
Cambridge University Press, Cambridge, UK.
Guisti, E. (1984): Minimal Surfaces and Functions of Bounded Variation.
Birkhauser, Basel.
Gurtin, M. (1993): Thermomechanics of Evolving Phase. Clarendon Press,
Oxford.
Haag, R. (1993): Local Quantum Physics: Fields, Particles, Algebras.
Springer-Verlag, Berlin, Heidelberg.
Hackbusch, W. (1992): Elliptic Differential Equations: Theory and Numer-
ical Treatment. Springer-Verlag, New York (translation from German).
Hackbusch, W. (1994): Iterative Solution of Large Sparse Systems of Equa-
tions. Springer-Verlag, New York (translation from German).
Hale, J. and Koçak, H. (1991): Dynamics of Bifurcations. Springer-Verlag,
Berlin, Heidelberg (cf. also Koçak (1989)).
Hatfield, B. (1992): Quantum Field Theory of Point Particles and Strings.
Addison-Wesley, Redwood City, CA.
Henneaux, M. and Teitelboim, C. (1993): Quantization of Gauge Systems.
Princeton University Press, Princeton, NJ.
Henry, D. (1981): Geometric Theory of Semilinear Parabolic Equations.
Lecture Notes in Mathematics, Vol. 840, Springer-Verlag, New York.
Hermann, C. and Sapoval, B. (1994): Physics of Semiconductors. Springer-
Verlag, New York.
Heuser, H. (1975): Funktionalanalysis. Teubner-Verlag, Stuttgart. (English
edition: Functional Analysis, Wiley, New York, 1986).
Hilbert, D. (1912): Grundzuge einer allgemeinen Theorie der Integralglei-
chungen. Teubner-Verlag, Leipzig.
Hilbert, D. (1932): Gesammelte Werke (Collected Works), Vols. 1-3.
Springer-Verlag, Berlin.
Hildebrandt, S. and Tromba, T. (1985): Mathematics and Optimal Form.
Scientific American Library, Freeman, New York.
Hiriart-Urruty, J. and Lemarchal, C. (1993): Convex Analysis and Mini-
mization Algorithms, Vols. 1, 2. Springer-Verlag, Berlin, Heidelberg.
Hirzebruch, F. and Scharlau, W. (1971): Einfuhrung in die Funktionalana-
lysis. Bibliographisches Institut, Mannheim.
Hofer, H. and Zehnder, E. (1994): Symplectic Invariants and Hamiltonian
Dynamics. Birkhauser, Basel.
Holmes, R. (1975): Geometrical Functional Analysis and Its Applications.
Springer-Verlag, New York.
Honerkamp, J. and Romer, H. (1993): Theoretical Physics: A Classical Ap-
proach. Springer-Verlag, New York.
Hoppenstaedt, F. and Peskin, C. (1994): Mathematics in Medicine and the
Life Sciences. Springer-Verlag, New York.
Hormander, L. (1983): The Analysis of Linear Partial Differential Oper-
ators; Vol. 1: Distribution Theory and Fourier Analysis; Vol. 2: Dif-
ferential Operators with Constant Coefficients; Vol. 3: Pseudo-Differ-
ential Operators; Vol. 4: Fourier Integral Operators. Springer-Verlag,
New York.
Hörmander, L. (1994): Notions of Convexity. Birkhäuser, Basel.
Huang, K. (1992): Quarks, Leptons, and Gauge Fields. 2nd edition. World
Scientific, Singapore.
Isham, C. (1989): Modern Differential Geometry for Physicists. World Sci-
entific, Singapore.
Jost, J. (1984): Harmonic Maps between Surfaces. Springer-Verlag, Berlin,
Heidelberg.
Jost, J. (1991): Two-Dimensional Geometric Variational Problems. Wiley,
New York.
Jost, J. (1994): Differentialgeometrie und Minimalflächen. Springer-Verlag,
Berlin, Heidelberg.
Jost, J. (1994a): Riemannian Geometry and Geometric Analysis. Springer-
Verlag, Berlin, Heidelberg.
Jost, J. (1996): Postmodern Analysis. Springer-Verlag, Berlin, Heidelberg
(to appear).
Kadison, R. and Ringrose, J. (1983): Fundamentals of the Theory of Oper-
ator Algebras, Vols. 1-4. Academic Press, New York.
Kaku, M. (1987): Introduction to Superstring Theory. Springer-Verlag, New
York.
Kaku, M. and Trainer, J. (1987): Beyond Einstein: The Cosmic Quest for
the Theory of the Universe. Bantam Books, New York.
Kaku, M. (1991): Strings, Conformal Fields, and Topology. Springer-
Verlag, New York.
Kaku, M. (1993): Quantum Field Theory. Oxford University Press, Oxford.
Kantorovich, L. and Akilov, G. (1964): Functional Analysis in Normed
Spaces. Pergamon Press, Oxford (translation from Russian).
Kato, T. (1976): Perturbation Theory for Linear Operators. Springer-
Verlag, Berlin.
Kelley, J. (1955): General Topology. Van Nostrand, New York.
Kevasan, S. (1989): Topics in Functional Analysis and Applications. Wiley,
New York.
Kirillov, A. and Gvishiani, A. (1982): Theory and Problems in Functional
Analysis. Springer-Verlag, New York.
Koçak, H. (1989): Differential and Difference Equations Through Computer
Experiments. With Diskettes. Springer-Verlag, New York (cf. also Hale
and Koçak (1991)).
Kleinert, V. (1989): Gauge Fields in Condensed Matter, Vols. 1, 2. World
Scientific, Singapore.
Kolb, E. and Turner, M. (1990): The Early Universe. Addison-Wesley,
Redwood City, CA.
Kolmogorov, A., Fomin, S., and Silverman, R. (1975): Introductory Real
Analysis. Dover, New York (enlarged translation from the Russian).
Köthe, G. (1960): Topologische lineare Räume. Springer-Verlag, Berlin.
Krasnoselskii, M. and Zabreiko, P. (1984): Geometrical Methods in Nonlin-
ear Analysis. Springer-Verlag, New York (translation from Russian).
Kress, R. (1989): Linear Integral Equations. Springer-Verlag, New York.
Kreyszig, E. (1989): Introductory Functional Analysis with Applications.
Wiley, New York.
Kufner, A., John, O., and Fucik, S. (1977): Function Spaces. Academia,
Prague.
Kuksin, S. (1993): Nearly Integrable Infinite-Dimensional Systems. Lecture
Notes in Mathematics, Vol. 1556. Springer-Verlag, Berlin.
Kuperschmidt, B. (1992): The Variational Principles of Dynamics. World
Scientific, Singapore.
Ladyzhenskaya, O. (1969): The Mathematical Theory of Viscous Incom-
pressible Flows. Gordon and Breach, New York.
Landau, L. and Lifšic, E. (1982): Course of Theoretical Physics, Vols. 1-10.
Elsevier, New York.
Lang, S. (1993): Real Analysis. 3rd edition. Springer-Verlag, New York.
Lawson, H. and Michelsohn, M. (1989): Spin Geometry. Princeton Univer-
sity Press, Princeton, NJ.
Lazutkin, V. (1993): KAM-Theory and Semiclassical Approximations to
Eigenfunctions. Springer-Verlag, Berlin, Heidelberg.
Leis, R. (1986): Initial-Boundary Value Problems in Mathematical Physics.
Wiley, New York.
Levitan, B. and Sargsjan, I. (1991): Sturm-Liouville and Dirac Operators.
Kluwer, Boston, MA (translation from Russian).
Lions, J. (1969): Quelques methodes de resolution des problemes aux limites
nonlineaires. Dunod, Paris.
Lions, J. (1971): Optimal Control of Systems Governed by Partial Differ-
ential Equations. Springer-Verlag, Berlin (translation from French).
Lions, J. and Magenes, E. (1972): Inhomogeneous Boundary-Value Prob-
lems, Vols. 1-3. Springer-Verlag, New York.
Luenberger, D. (1969): Optimization by Vector Space Methods. Wiley, New
York.
Lüst, D. and Theissen, S. (1989): Lectures on String Theory. Lecture Notes
in Physics, Vol. 346. Springer-Verlag, Berlin, Heidelberg.
Mandl, F. and Shaw, G. (1989): Quantum Field Theory. Wiley, New York.
Marathe, K. and Martucci, M. (1992): The Mathematical Foundations of
Gauge Theory. North-Holland, Amsterdam.
Marchioro, C. and Pulvirenti, M. (1994): Mathematical Theory of Inviscid
Fluids. Springer-Verlag, New York.
Markowich, P. (1990): Semiconductor Equations. Springer-Verlag, Berlin,
Heidelberg.
Marsden, J. (1974): Applications of Global Analysis in Mathematical
Physics. Publish or Perish, Boston, MA.
Marsden, J. (1992): Lectures in Mechanics. Cambridge University Press,
Cambridge, UK.
Marsden, J. and Tromba, A. (1976): Vector Calculus. Freeman, San Fran-
cisco, CA.
Marsden, J. and Hughes, T. (1983): Mathematical Foundations of Elastic-
ity. Prentice-Hall, Englewood Cliffs, NJ.
Marsden, J. and Ratiu, T. (1994): Introduction to Mechanics and Sym-
metry; A Basic Exposition of Classical Mechanical Systems. Springer-
Verlag, New York.
Mawhin, J. and Willem, M. (1987): Critical Point Theory and Hamiltonian
Systems. Springer-Verlag, New York.
Maurin, K. (1972): Methods of Hilbert Spaces. Polish Scientific Publishers,
Warsaw.
Meirmanov, A. (1992): The Stefan Problem. De Gruyter, Berlin (translation
from Russian).
Meyer, K. and Hall, G. (1992): Introduction to Hamiltonian Dynamical
Systems and the N-Body Problem. Springer-Verlag, New York.
Mielke, A. (1991): Hamiltonian and Lagrangian Flows on Center Manifolds
with Applications to Elliptic Variational Problems. Lecture Notes in
Mathematics, Vol. 1489. Springer-Verlag, Berlin, Heidelberg.
Milnor, J. (1969): Topology from the Differentiable Point of View. Univer-
sity of Virginia Press, Charlottesville, VA.
Monastirsky, M. (1993): Topology of Gauge Fields and Condensed Matter.
Plenum Press, New York.
Müller, I. and Rugeri, T. (1993): Extended Thermodynamics. Springer-
Verlag, Berlin, Heidelberg.
Murray, J. (1989): Mathematical Biology. Springer-Verlag, Berlin, Heidel-
berg.
Nakahara, M. (1990): Geometry, Topology, and Physics. Hilger, Bristol.
Necas, J. (1967): Les methodes directes en theorie des equations elliptiques.
Academia, Prague.
Nishikawa, K. and Wakatani, M. (1993): Plasma Physics; Basic Theory
with Fusion Applications. Springer-Verlag, Berlin, Heidelberg.
Nobel Prizes (1954ff): Nobel Lectures. Edited by the Nobel Foundation,
Stockholm.
Novikov, S., Manakov, S., Pitajevskii, L., and Zakharov, V. (1984): Theory
of Solitons. Plenum Press, New York (translation from Russian).
Oberguggenberger, M. (1992): Multiplication of Distributions and Applica-
tions to Partial Differential Equations. Harlow, Longman, UK.
Peierls, R. (1979): Surprises in Theoretical Physics. Princeton University
Press, Princeton, NJ.
Petryshyn, V. (1993): Approximation-Solvability of Nonlinear Functional
and Differential Equations. Marcel Dekker, New York.
Plakida, N. (1994): High-Temperature Superconductivity: Experiment and
Theory. Springer-Verlag, New York.
Polyakov, A. (1987): Gauge Fields and Strings. Academic Publishers, Har-
wood, NJ.
Pressley, A. and Segal, G. (1988): Loop Groups. Oxford, Clarendon Press.
Rabinowitz, P. (1986): Methods in Critical Point Theory with Applications.
Amer. Math. Soc., Providence, RI.
Racke, R. (1992): Lectures on Evolution Equations. Vieweg, Braunschweig.
Raychaudhuri, A., Banerji, S. and Banerjee, A. (1993): General Relativity,
Astrophysics, and Cosmology. Springer-Verlag, New York.
Reed, M. and Simon, B. (1972): Methods of Modern Mathematical Physics.
Vol. 1: Functional Analysis; Vol. 2: Fourier Analysis, Self-Adjointness;
Vol. 3: Scattering Theory; Vol. 4: Analysis of Operators. Academic Press,
New York.
Renardy, M. and Rogers, R. (1993): Introduction to Partial Differential
Equations. Springer-Verlag, New York.
Riesz, F. and Nagy, B. (1955): Leçons d'analyse fonctionelle (English edi-
tion: Functional Analysis. Frederick Ungar Publishing Company, New
York, 1978).
Rolewicz, S. (1972): Metric Linear Spaces. Polish Scientific Publishers,
Warsaw.
Rolnick, W. (1994): Fundamental Particles and Their Interactions. Addi-
son-Wesley, Reading, MA.
Rowlatt, P. (1966): Group Theory and Elementary Particles. Elsevier, New
York.
Rudin, W. (1966): Real and Complex Analysis. McGraw-Hill, New York.
Rudin, W. (1973): Functional Analysis. McGraw-Hill, New York.
Ruei, K. (1971): Quantum Theory of Particles and Fields, Vols. 1, 2. Uni-
versity Press, Taipei, Taiwan.
Ruei, K. (1972): Classical Theory of Particles and Fields, Vols. 1, 2. Uni-
versity Press, Taipei, Taiwan.
Sakai, A. (1991): Operator Algebras. Cambridge University Press, Cam-
bridge, UK.
Sattinger, D. and Weaver, O. (1993): Lie Groups, Lie Algebras, and Their
Representations. Springer-Verlag, New York.
Schechter, M. (1971): Principles of Functional Analysis. Wiley, New York.
Schmutzer, E. (1989): Grundlagen der theoretischen Physik, Vols. 1, 2.
Deutscher Verlag der Wissenschaften, Berlin.
Schneider, P., Ehlers, J., and Falco, E. (1992): Gravitational Lenses.
Springer-Verlag, New York.
Schrieffer, J. (1964): Theory of Superconductivity. Benjamin, New York.
Schwartz, L. (1966): Théorie des distributions. Hermann, Paris.
Seydel, R. (1994): Practical Bifurcation and Stability Analysis: From Equi-
librium to Chaos. Springer-Verlag, Berlin, Heidelberg.
Shore, S. (1992): An Introduction to Astrophysical Hydrodynamics. Aca-
demic Press, San Diego, CA.
Simon, B. (1993): The Statistical Mechanics of Lattice Gases. Princeton
University Press, Princeton, NJ.
Sirovich, L. (ed.) (1991): New Perspectives in Turbulence. Springer-Verlag,
New York.
Sirovich, L. (ed.) (1994): Trends and Perspectives in Applied Mathematics.
Springer-Verlag, New York.
Smale, S. (1965): An infinite-dimensional version of Sard's theorem. Amer.
J. Math. 87, 861-866.
Smoller, J. (1994): Shock Waves and Reaction-Diffusion Equations. 2nd
enlarged edition. Springer-Verlag, New York.
Soper, D. (1975): Classical Field Theory. Wiley, New York.
Spohn, H. (1991): Large Scale Dynamics of Interacting Particles. Springer-
Verlag, Berlin, Heidelberg.
Stephani, H. (1989): Differential Equations: Their Solutions Using Symme-
tries. Edited by MacCallum. Cambridge University Press, Cambridge,
UK.
Sterman, G. (1993): An Introduction to Quantum Field Theory. Cambridge
University Press, Cambridge, UK.
Straub, D. (1989): Thermofluid Dynamics of Optimized Rocket Propulsions.
Birkhäuser, Basel.
Struwe, M. (1988): Plateau's Problem and the Calculus of Variations.
Princeton University Press, Princeton, NJ.
Struwe, M. (1990): Variational Methods. Springer-Verlag, New York.
Sunder, V. (1987): An Invitation to von Neumann Algebras. Springer-
Verlag, New York.
Ta-Pai Cheng and Ling-Fong Li (1984): Gauge Theory of Elementary Par-
ticle Physics. University Press, Oxford.
Temam, R. (1977): Navier-Stokes Equations: Theory and Numerical Anal-
ysis. North-Holland, Amsterdam.
Temam, R. (1988): Infinite-Dimensional Dynamical Systems in Mechanics
and Physics. Springer-Verlag, New York.
ter Haar Romeny, B. (ed.) (1994): Geometry-Driven Diffusion in Computer
Vision. Kluwer, Dordrecht.
Thaller, B. (1992): The Dirac Equation. Springer-Verlag, Berlin, Heidel-
berg.
Thirring, W. (1991): A Course in Mathematical Physics. Vol. 1: Classical
Dynamical Systems; Vol. 2: Classical Field Theory; Vol. 3: Quantum Me-
chanics of Atoms and Molecules; Vol. 4: Quantum Mechanics of Large
Systems. Springer-Verlag, New York (translation from German).
Triebel, H. (1985): Analysis and Mathematical Physics. Teubner-Verlag,
Leipzig.
Triebel, H. (1992): Theory of Function Spaces II. Birkhauser, Basel.
Tromba, A. (1977): On the Number of Simply Connected Minimal Surfaces
Spanning a Curve. Mem. Am. Math. Soc., Providence, RI.
Vanhorn, W. (1994): The Stokes Equation. Akademie-Verlag, Berlin.
Visintin, A. (1994): Differentiable Models of Hysteresis. Springer-Verlag,
Berlin, Heidelberg.
Wahl, W. von (1985): The Equations of Navier-Stokes and Abstract
Parabolic Equations. Vieweg, Braunschweig.
Waldschmidt, M., Moussa, P., Luck, J., and Itzykson, C. (1992): From
Number Theory to Physics. Springer-Verlag, Berlin, Heidelberg.
Weinberg, S. (1992): Dreams of a Final Theory. Pantheon Books, New
York.
Wendland, W. (1996): Integral Equation Methods for Boundary-Value Prob-
lems. Springer-Verlag, Berlin, Heidelberg (to appear).
Wiedemann, H. (1993): Particle Accelerator Physics, Vols. 1, 2. Springer-
Verlag, Berlin, Heidelberg.
Wightman, A. and Velo, G. (1980): Rigorous Atomic and Molecular
Physics. Plenum Press, New York.
Wloka, J. (1971): Funktionalanalysis und ihre Anwendungen. De Gruyter,
Berlin.
Yosida, K. (1991): Lectures on Differential and Integral Equations. Dover,
New York.
Yosida, K. (1988): Functional Analysis. 5th edition. Springer-Verlag, New
York.
Zabczyk, J. (1992): Optimal Control Theory. Birkhauser, Basel.
Zeidler, E. (1972): Beiträge zur Theorie und Praxis freier Randwertauf-
gaben. Akademie-Verlag, Berlin.
Zeidler, E. (1977): Bifurcation Theory and Permanent Waves. In: P. Rabi-
nowitz (ed.), Applications of Bifurcation Theory, Academic Press, New
York, pp. 203-224.
Zeidler, E. (1986): Nonlinear Functional Analysis and Its Applications. Vol.
1: Fixed-Point Theorems; Vol. 2A: Linear Monotone Operators; Vol. 2B:
Nonlinear Monotone Operators; Vol. 3: Variational Methods and Opti-
mization; Vols. 4, 5: Applications to Mathematical Physics. Springer-
Verlag, New York (second enlarged editions of Vols. 1 and 4, 1992 and
1995, Vol. 5 in preparation.)
Zeidler, E. (1995): Teubner-Taschenbuch der Mathematik II (Handbook
on Advanced Mathematics) (Chapters 3 through 15: Linear and nonlin-
ear functional analysis; dynamical systems; nonlinear partial differential
equations in mathematical physics; analysis on manifolds; Riemannian
geometry and general relativity; Lie groups, Lie algebras, and elemen-
tary particles; algebraic and differential topology; fibre bundles, modern
differential geometry and gauge field theory; characteristic classes and
the Atiyah-Singer index theorem; the Riemann-Roch-Hirzebruch the-
orem). Cf. Grosche, Ziegler, and Zeidler (eds.) (1995) (English edition
in preparation).
Zeidler, E. (1995): Applied Functional Analysis: Applications to Mathemat-
ical Physics. Applied Mathematical Sciences Vol. 108. Springer-Verlag,
New York.
Zinn-Justin, J. (1993): Quantum Field Theory and Critical Phenomena.
Oxford University Press, Oxford.
Zuily, C. (1988): Problems in Distributions and Partial Differential Equa-
tions. North-Holland, Amsterdam.
List of Symbols
Science is a first class piece of furniture for the bel etage - as long
as common sense reigns on the ground floor.
Oliver Wendell Holmes
General Notation
$A \Rightarrow B$   A implies B*
iff   if and only if
$A \Leftrightarrow B$   A iff B (i.e., $A \Rightarrow B$ and $B \Rightarrow A$)
$f(x) := 2x$   f(x) = 2x by definition
$x \in S$   x is an element of the set S
$x \notin S$   x is not an element of the set S
$\{x\colon \ldots\}$   set of all elements x with the property ...
$S \subseteq T$   the set S is contained in the set T
$S \subset T$   $S \subseteq T$ and $S \neq T$ (the set S is properly contained in T)
$S \cup T$   the union of the sets S and T (the set of all elements that live in S or T)
$S \cap T$   the intersection of the sets S and T (the set of all elements that live in S and T)
*One says that
(i) condition A is sufficient for B, and
(ii) condition B is necessary for A.
$S - T$   the difference set (the set of all elements that live in S and not in T)
$\emptyset$   empty set
$2^S$   set of all subsets of S (the power set of S)
$S \times T$   product set $\{(x, y)\colon x \in S \text{ and } y \in T\}$
$\{p\}$   set of the single point p
$\mathbb{N}$   set of the natural numbers 1, 2, ...
$\mathbb{R}, \mathbb{C}, \mathbb{Q}, \mathbb{Z}$   set of the real, complex, rational, integer numbers
$\mathbb{K}$   $\mathbb{R}$ or $\mathbb{C}$
$\mathbb{R}^N$   set of all real N-tuples $x = (x_1, \ldots, x_N)$ (i.e., $x_j \in \mathbb{R}$ for all j)
$\mathbb{C}^N$   set of all complex N-tuples $(x_1, \ldots, x_N)$ (i.e., $x_j \in \mathbb{C}$ for all j)
$\mathbb{K}^N$   $\mathbb{R}^N$ or $\mathbb{C}^N$
Re z, Im z   real part of the complex number z = x + yi, imaginary part of z (i.e., Re z := x, Im z := y)
$\bar{z}$   conjugate complex number, $\bar{z} := x - yi$
$|z|$   absolute value of the complex number z, $|z| := \sqrt{x^2 + y^2}$
$[a, b]$   closed interval (the set $\{x \in \mathbb{R}\colon a \le x \le b\}$)
$]a, b[$   open interval (the set $\{x \in \mathbb{R}\colon a < x < b\}$)
$]a, b]$   half-open interval (the set $\{x \in \mathbb{R}\colon a < x \le b\}$)
$[a, b[$   half-open interval (the set $\{x \in \mathbb{R}\colon a \le x < b\}$)
sgn r   signum of the real number r
$\delta_{jk}$   Kronecker symbol, $\delta_{jk} := 1$ if $j = k$, and $\delta_{jk} := 0$ if $j \neq k$
inf S   infimum of the set S of real numbers (the largest lower bound of S)
sup S   supremum of the set S of real numbers (the smallest upper bound of S)
min S   the minimum of the set S of real numbers (the smallest element of S)
max S   the maximum of the set S of real numbers (the largest element of S)
$\varliminf_{n \to \infty} a_n$   lower limit of the real sequence $(a_n)$ (see page 136)
$\varlimsup_{n \to \infty} a_n$   upper limit of the real sequence $(a_n)$
The Landau Symbols
$f(x) = O(g(x))$, $x \to a$   $|f(x)| \le \text{const}\,|g(x)|$ for all x in a neighborhood of the point a
$f(x) = o(g(x))$, $x \to a$   $\lim_{x \to a} \frac{f(x)}{g(x)} = 0$
Norms and Inner Products
$\|x\|$   norm of x   I6*
$\lim_{n \to \infty} x_n = x$ (or $x_n \to x$ as $n \to \infty$)   the sequence $(x_n)$ converges to the point x   18
$\sum_{n=1}^{\infty} u_n$   infinite series in a Banach space   175
$(x \mid y)$   inner product   I103
N
(x I y) Euclidean inner product, (x I Y) := L xnYn
n=l
U13
C
conjugate complex number to Yj) I107
$|x|$   Euclidean norm, $|x| := (x \mid x)^{1/2} = \left(\sum_{n=1}^{N} |x_n|^2\right)^{1/2}$   I107
$|x|_\infty$   special norm, $|x|_\infty := \sup_n |x_n|$   I10
(u I vh inner product in the Lebesgue spaces L2 (G)
and L~(G), (u I vh := fa u(x)v(x)dx I112
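$\|u\|_2$   norm on the Lebesgue spaces $L_2(G)$ and $L_2^{\mathbb{K}}(G)$, $\|u\|_2 := (u \mid u)_2^{1/2} = \left(\int_G |u(x)|^2\,dx\right)^{1/2}$   I112
$(u \mid v)_{1,2}$   inner product on the Sobolev space $W_2^1(G)$, $(u \mid v)_{1,2} := \int_G \left(uv + \sum_{j=1}^{N} \partial_j u\,\partial_j v\right) dx$   I118
$\|u\|_{1,2}$   norm on the Sobolev space $W_2^1(G)$, $\|u\|_{1,2} := (u \mid u)_{1,2}^{1/2} = \left(\int_G \left(u^2 + \sum_{j=1}^{N} (\partial_j u)^2\right) dx\right)^{1/2}$   I118
$(\cdot \mid \cdot)_E$   energetic inner product   I271
$\|\cdot\|_E$   energetic norm   I271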
Operators
$A\colon S \subseteq X \to Y$   operator from the set S into the set Y, where $S \subseteq X$   116
$D(A)$ (or dom A)   domain of definition of the operator A   I17
$R(A)$ (or im A)   range (or image) of the operator A   I17
$N(A)$ (or ker A)   null space (or kernel) of the operator A, $N(A) := \{x\colon Ax = 0\}$   169
*If we add the symbol I, then the page number refers to AMS Vol. 108.
$I$ (or id)   identical operator, $Ix := x$ for all x   176
$A(S)$   image of the set S, $A(S) := \{Ax\colon x \in S\}$
$A^{-1}(T)$   preimage of the set T, $A^{-1}(T) := \{x\colon Ax \in T\}$   I17
$A^{-1}$   inverse operator to A   I17
$G(A)$   graph of the operator A, $G(A) := \{(x, Ax)\colon x \in D(A)\}$   I412
$\|A\|$   norm of the linear operator A   169
$\|f\|$   norm of the functional f   174
$AB$ (or $A \circ B$)   the product of the operators A and B, $(AB)(u) := A(Bu)$   128
$A \subseteq B$   the operator B is an extension of the operator A   I258
$A^*$   adjoint operator to the linear operator A   I261
$A^T$   dual operator to the linear operator A   199
$\bar{A}$   closure of the linear operator A   I412
$\sigma(A)$   spectrum of the linear operator A   182
$\rho(A)$   resolvent set of the linear operator A   182
$r(A)$   spectral radius of the linear operator A   193
rank A   rank of the linear operator A, rank A := dim R(A)   195
ind A   index of the linear operator A, ind A := dim N(A) - codim R(A)   292
det A   determinant of the matrix A
tr A   trace of the (N x N)-matrix $A = (a_{jk})$, tr A := $a_{11} + \cdots + a_{NN}$
tr A   trace of the linear operator A in a Hilbert space   145
Special Sets
$\bar{S}$   closure of the set S   130
int S   interior of the set S   130
ext S   exterior of the set S   131
$\partial S$   boundary of the set S   131
$U_\varepsilon(p)$   $\varepsilon$-neighborhood of the point p in a normed space, $U_\varepsilon(p) := \{x \in X\colon \|x - p\| < \varepsilon\}$   115
$U(p)$   neighborhood of the point p   115
dim X   dimension of the linear space X   15
$X_{\mathbb{C}}$   complexification of the linear space X   197
$X/L$   factor space   184
codim L   codimension of the linear subspace L, codim L := dim(X/L)   191
$L^\perp$   orthogonal complement to the linear subspace L   I163
$\alpha S$   the product $\alpha S := \{\alpha x\colon x \in S\}$, $\alpha \in \mathbb{R}, \mathbb{C}$   16
$S + T$   the sum $S + T := \{x + y\colon x \in S \text{ and } y \in T\}$   16
$M \oplus L$   orthogonal direct sum ($X = M \oplus L$, where $L = M^\perp$)   I163
$M \oplus L$   direct sum*   188
$X \otimes Y$   tensor product   I222
$X^*$   dual space   174
$X_E$   energetic space   I271
span S   linear hull of the set S   130
$\overline{\operatorname{span}}\,S$   closed linear hull of the set S   130
co S   convex hull of the set S   131
$\overline{\operatorname{co}}\,S$   closed convex hull of the set S
dist(p, S)   distance of the point p from the set S   147
diam S   diameter of the set S   146
meas S   measure of the set S   I429
$\delta(x)$   the Dirac delta function   I156
$\delta$   the delta distribution   I159
Derivatives
$u'(t)$   derivative of an operator function u = u(t) at time t   179
$\partial_j f$   partial derivative $\partial f / \partial x_j$
$\partial^\alpha f$   $\partial_1^{\alpha_1} \partial_2^{\alpha_2} \cdots \partial_N^{\alpha_N} f$, where $\alpha = (\alpha_1, \ldots, \alpha_N)$ (the classical symbols are also used for the derivatives of generalized functions)   I157
$|\alpha|$   the sum $\alpha_1 + \cdots + \alpha_N$   I157
$\partial / \partial n$   derivative in the direction of the exterior normal   I179
$\Delta f$   Laplacian, $\Delta f := \sum_{j=1}^{N} \partial_j^2 f$   I123
$\delta F(x; h)$   variation of the functional F at the point x in the direction of h   43
$\delta^n F(x; h)$   nth variation of the functional F at the point x in the direction of h   43
$A'(x)$ (or $dA(x)$)   Fréchet derivative of the operator A at the point x   228
$d^n A(x)(h_1, \ldots, h_n)$   nth Fréchet differential of the operator A at the point x in the direction of $h_1, \ldots, h_n$   229
*To simplify notation, we will use the same symbol $M \oplus L$ for the direct sum,
the topological direct sum, and the orthogonal direct sum. The context always
indicates which meaning is intended.
Integral
$\int_G u(x)\,dx$   the Lebesgue integral, which includes and generalizes the classical integral   I432
$\int_a^b F(x)\,dg(x)$   Lebesgue-Stieltjes integral   I439
$\int_{-\infty}^{\infty} F(\lambda)\,dE_\lambda$   operator-valued Lebesgue-Stieltjes integral (with respect to the spectral family $\{E_\lambda\}$)   I330
Spaces of Continuous Functions
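$C[a, b]$, $C(\bar{G})$   I13, I114
$L(X, Y)$, $L_{\mathrm{inv}}(X, Y)$   172, 178
Spaces of Hölder Continuous Functions
$C^\alpha[a, b]$, $C^{k,\alpha}[a, b]$, $C^\alpha(\bar{G})$, $C^{k,\alpha}(\bar{G})$ ($C^\alpha(\bar{G}) = C^{0,\alpha}(\bar{G})$)   195ff
Spaces of Smooth Functions
$C^k[a, b]$, $C^k(G)$, $C^k(\bar{G})$, $C^\infty(G)$, $C^k(G)_{\mathbb{C}}$ ($C^0(\bar{G}) := C(\bar{G})$)   195, I114
$C_0^\infty(G)$ (or $\mathcal{D}(G)$), $\mathcal{S}$   I114, I212
Spaces of Integrable Functions (Lebesgue Spaces)
$L_2(a, b)$, $L_2(G)$, $L_2^{\mathbb{K}}(G)$ ($L_2(G) := L_2^{\mathbb{K}}(G)$ if $\mathbb{K} = \mathbb{R}$)   I128, I112
$L_p(G)$   355
Sobolev Spaces
$W_2^1(G)$, $\overset{\circ}{W}{}_2^1(G)$   I128, I129
$W_p^m(G)$   357
Spaces of Sequences
$\mathbb{K}^\infty$, $l_2$, $l_2^{\mathbb{K}}$ ($l_2 := l_2^{\mathbb{K}}$ if $\mathbb{K} = \mathbb{R}$)   194, I175
$l_p$   354
Spaces of Distributions
$\mathcal{D}'(G)$, $\mathcal{S}'$   I158, I217
$B_{pq}^s(\mathbb{R}^N)$, $F_{pq}^s(\mathbb{R}^N)$   361
$B_{pq}^s(G)$, $F_{pq}^s(G)$   362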
List of Theorems
Do not imagine that mathematics is hard and crabbed, and repulsive
to common sense. It is merely the etherealization of common sense.
Lord William Kelvin (1824-1907)
Theorem 1.A (The Hahn-Banach theorem for linear spaces) 2
Theorem 1.B (The Hahn-Banach theorem for normed spaces) 4
Theorem 1.C (Separation of convex sets) 8
Theorem 1.D (The minimum norm problem and duality) 15
Theorem 2.A (Necessary and sufficient conditions for local
minima) 45
Theorem 2.B (Lack of compactness in infinite-dimensional
Banach spaces) 48
Theorem 2.C (The existence of weakly convergent subsequences) 50
Theorem 2.D (The existence theorem for convex minimum
problems) 53
Theorem 2.E (Variational inequalities) 66
Theorem 2.F (Saddle points and duality) 73
Theorem 2.G (Existence of a saddle point) 76
Theorem 2.H (Existence of a quasi-minimal point) 84
Theorem 2.I (The minimum principle via the Palais-Smale
condition) 86
Theorem 2.J (The mountain pass theorem) 88
Theorem 2.K (The main theorem on monotone operators) 93
Theorem 2.L (Symmetry and the Noether theorem on
conservation laws) 101
Theorem 2.M (The basic equations of gauge field theory) 125
Theorem 3.A (The Baire theorem) 170
Theorem 3.B (The uniform boundedness theorem) 172
Theorem 3.C (The open mapping theorem) 178
Theorem 3.D (The closed graph theorem) 183
Theorem 3.E (The closed range theorem) 210
Theorem 4.A (Differentiation of analytic operators) 236
Theorem 4.B (The fundamental theorem of calculus) 242
Theorem 4.C (The Taylor theorem) 244
Theorem 4.D (The chain rule) 248
Theorem 4.E (The implicit function theorem) 251
Theorem 4.F (The inverse mapping theorem) 259
Theorem 4.G (The linearization principle) 262
Theorem 4.H (The surjective implicit function theorem) 269
Theorem 5.A (Duality for linear compact operators) 284
Theorem 5.B (The Riesz-Schauder theory on Hilbert spaces) 286
Theorem 5.C (The Riesz-Schauder theory on Banach spaces) 295
Theorem 5.D (The spectrum of linear compact operators) 296
Theorem 5.E (Compact perturbations of Fredholm operators) 300
Theorem 5.F (The product index theorem) 301
Theorem 5.G (The Fredholm alternative) 304
Theorem 5.H (The main theorem of bifurcation theory) 311
Theorem 5.I (The Smale principle) 318
Theorem 5.J (The stationary Navier-Stokes equations) 337
List of the Most Important
Definitions
The collection of all our experiences consists of what we know and
what we have forgotten.
Marie von Ebner-Eschenbach (1830-1916)
Spaces
linear space I5*
dimension 15
linear subspace 130
Banach space 110
norm 16
separable 183
reflexive 61
Hilbert space I105
inner product I103
orthogonal elements I103
orthogonal projection I163
complete orthonormal system I198
Fock space (bosons or fermions) I361
Lebesgue space I112, 355
Sobolev space I128, 357
energetic space I271
*If we add the symbol I, then the page number refers to AMS Vol. 108.
Triebel-Lizorkin space 361
Hölder space 195
Besov space 361
dual space 174
metric space 34
topological space 28
Convergence
norm convergence (strong convergence) 18
Cauchy sequence 19
weak convergence 49
sequentially continuous 126
sequentially compact 133
relatively sequentially compact 133
Operators
domain of definition I17
range and preimage I17
injective I17
surjective I17
bijective I17
inverse operator I17
compact 139
continuous 126
k-contraction I19
Lipschitz continuous 127
Hölder continuous 195
homeomorphism 128
diffeomorphism 259
submersion 262
subimmersion 262
immersion 262
analytic 236
monotone 93
coercive 93
weakly coercive 53
linear 169
symmetric I262
the Friedrichs extension I277
adjoint I261
dual 199
self-adjoint I262
Hamiltonian I326
orthogonal projection operator I268
skew-adjoint I262
unitary I210
Fourier transformation I214
trace class I345
statistical state I346
statistical operator I348
Hilbert-Schmidt operator I345
semigroup I296
Green function (propagator) I384
one-parameter group I296
dynamics of a quantum system I326
Fredholm alternative 288
linear Fredholm operator 292
index 292
nonlinear Fredholm operator 317
m-linear bounded 227
Functional
nonlinear 43
linear 174
convex 53
strictly convex 53
concave 75
bilinear form 118
bounded I118
symmetric I118
distribution (generalized function) I158
tempered distribution I217
Fourier transformation I218
generalized eigenfunction I342
Dirac delta distribution I159
Green function I158
fundamental solution I181
Palais-Smale condition 86
Embedding
continuous 323
compact 323
Spectrum
eigenvalue and eigenvector 182
generalized eigenvector I342
resolvent set 182
resolvent operator 182
essential spectrum 183
spectral family I331
measurements in quantum systems I341
Set
open 115
neighborhood 115
interior 130
closed 115
closure 130
boundary 131
compact or relatively compact (in normed spaces) 133
(in topological spaces) 30
dense 183
nowhere dense 169
first and second Baire category 169
convex 129
bounded 133
countable 184
Point
critical point 44
local minimum 44
local maximum 44
saddle point 72
bifurcation point 309
fixed point 118
Operator Algebras
Banach algebra 175
von Neumann algebra I357
C*-algebra I355
observable I357
state I356
pure I356
mixed I356
KMS-state (thermodynamic equilibrium) I358
*-automorphism I356
dynamics of a quantum system I357
Derivative
time derivative 179
nth-variation 43
Fréchet differential 228
Fréchet derivative (F-derivative) 228
partial Fréchet derivative (F-derivative) 232
generalized derivative of a function I127
derivative of a distribution I160
Integral
Riemann integral for vector-valued functions 239
Lebesgue integral I432
Lebesgue measure I427
integration by parts I157
Lebesgue-Stieltjes integral I439
Feynman path integral I382
Subject Index
The reader should also consult the index to AMS Volume 108.
adjoint equation 201
adjoint representation 111
algebraic complement 188
algebraic projection 188
analytic operators 233, 236
angular-momentum conservation 159
antilinear 61
antiquarks 115, 119
applications to cubature formulas 175
Arzelà-Ascoli theorem 35
associated vector bundle 130
Atiyah-Singer index theorem 283

Baire theorem 169
Banach-Steinhaus theorem 173
bang-bang controls 28
baryon number 113
baryons 114
Bénard problem 368
Besov spaces 360
Bianchi identity 125
bifurcation 367
bifurcation point 309
bifurcation theory 309
bilinear operators 231
biorthogonal systems 196
bosonic string 158
bosons 163
boundary-value problems 315
brachistochrone 132
Brouwer fixed-point theorem 94
buckling of beams 370

C*-algebras 36
C^r-diffeomorphism 259
calculus of variations 56
canonical mapping 186
Cantor's nested interval principle 169
capillary surfaces 144
chain rule 247
characteristic number 314
Čebyšev approximation 18
Clifford algebra 124
closed 29
closed range theorem 210
cluster point 137
codimension 191
coercive 53
coerciveness condition 57
coin game 81
commutation relations 162
compact embedding 323
compactness 30
complete 31
completion principle 23
compressibility equation 363
concave 75
conformal coordinates 150
conformal gauge 160
connection 130
conservation of energy 99
conservation law 98, 135, 159
constraints 158
continuity 30
continuous inverse theorem 179
continuous projection 188
control equation 26
control functional 26
control restriction 26
convergence 29
convex approximation models 142
Cooper pairs 155
coupling constant 124
covariant derivatives 124
covariant directional derivative 132
critical point 44
curvature 130

defects 155
deformation 139
deformation tensor 140
density and duality 37
density of the fluid 146
density of the outer force 140
diagonal procedure 51
diffeomorphisms 259
differential 228
differential geometry 130
differential operators 234
Dirac equation 126
direct sum 188
Dirichlet problem 134
distance 34
dual equation 201
dual operators 199
dual pair 303
dual representation 120
dual space 5
duality functor 203
duality for linear compact operators 284
duality theory 73
dyadic partition of unity 360

eigensolutions 291
eigenvalue problems 59
Ekeland principle 83
elastic energy 140
elasticity 139
electromagnetic field tensor 126
elementary particle physics 106
elementary particles 112
embeddings 221
embedding operator 323
energetic space 24
energy-momentum conservation 159
equicontinuous 35
equilibrium condition 141
equilibrium equation 141
equivalent maps 261
Euler equation 41
Euler-Lagrange equation 46
Euler-Lagrange system 48
exact Banach sequence 205
exact sequences 205
exterior product 279
extreme point 36

F-derivative 228
factor space 184
farmer's allocation problem 27
fat 169
fermions 163
fiber 131
finite intersection property 31
first category 169
first Piola-Kirchhoff stress tensor 140
formation of patterns 367
four-potential 126
fractional Sobolev space 362
Fréchet derivative 43, 228
Fredholm alternative 288, 304
Fredholm operator 292
free boundary problems 155
free surface 146
function spaces 359
functional 43
fundamental theorem of calculus 242

G-derivative 275
Galerkin equation 96
game theory 81
Gâteaux derivatives 43, 275
gauge field theory 102, 122
gauge invariance 103, 127
gauge transformations 103, 127
Gelfand theorem 37
Gell-Mann-Nishijima formula 112
generalized surface area 151
ghost state 164
global gauge transformation 103
global inverse mapping theorem 277
GNS-theorem 36
graph-closed 182
Grassmann algebra 278
gravitational constant 138
growth condition 57

Haar condition 25
Hahn-Banach theorem 2
half-spaces 6
hanging rope 133
harmonic maps 151, 158
Helmholtz-Weyl decomposition 366
Higgs field 154
Hilbert spaces 16
Hölder inequality 352
Hölder spaces 362
Hopf bifurcation 369
hypercharge 113
hyperplane 6

index 292
immersion 262
implicit function theorem 251
incompressibility condition 330
inertial system 123
infinite-dimensional Lie algebra 162
integration 239
integral equations 291, 313
integral operators 233
interpolation inequalities 322
inverse mapping theorem 251
isospin 112
iterated derivatives 244

Jacobi's identity 107
Jordan curve 150
Jordan normal form 350

kinetic energy 139
Krein theorem 24
Krein-Milman convexity theorem 36

Ladyzhenskaya inequality 326
Lagrange multiplier rule 143, 270
Lagrangian 46
Landau-Ginzburg model 152
Lebesgue spaces 362
Leray-Schauder principle 342
Lie algebra 107
Lie group 130
Lie product 107
linear optimization 36
linearization 225
linearization principle 261
local boundedness 95
local diffeomorphism 259
local inverse mapping theorem 259
local maximum 44
local minimum 44
locally convex spaces 222
Lorentz group 160
Lorentz transformation 159
lower semicontinuous 55
lower semicontinuous functionals 136

m-linear bounded operators 227
mapping degree 153
matrix equation 201
Maxwell equations 126
Maxwell-Dirac equation 127
meager 169
mean curvature 147
mesons 119
metric space 34
minimal surfaces 134, 149
minimax theorem 75
minimum norm problem 15
Minkowski functional 9
Minkowski inequality 352
mixed state of three quarks 113
mixed strategy 82
moment problem 13
monotone operators 93
monotonicity trick 94
mountain pass theorem 87
multilinear forms 278
multilinearization 225, 230

N-body problem 138
nth F-derivative 246
NASA 147
natural boundary condition 148
Navier-Stokes equations 329
neighborhood 29
neutron 112
Newtonian equation 138
Noether theorem 98, 156, 159
noncollision solutions 138
nondifferentiable continuous function 171
nonlinear elasticity 139
nonlinear Fredholm operators 317, 350
normal form 265
normed spaces 4
normisomorphic 353
nowhere dense 169

open 29
open mapping theorem 178
optimal control of rockets 20
optimal strategy pair 81
order parameter 154
orthogonal complement 212

Palais-Smale condition (PS) 86
parallel transport 131
parametrix 298
partial F-derivative 232, 276
phase transformations 103
phase transition 154
Plateau problem 150
Poincaré transformations 159
Poincaré-Friedrichs inequality 58
polyconvex material 142
Pontrjagin maximum principle 25
potential energy 139
power operator 233
precompact 31
pressure 336
pressure equation 365
principal fiber bundle 130
principle of gauge invariance 106
principle of minimal potential energy 71, 134
principle of relativity 123
principle of stationary action 48, 165
probabilistic coin game 83
product index theorem 301
product rule 250
product space 180
projections 188
proper 277, 317
proton 112
pseudo-orthogonal complements 197

quadratic variational inequalities 70
quantization of the bosonic string 164
quantum electrodynamics 127
quantum field theory 107
quantum numbers 112
quarks 114, 119
quasi-convex 55
quasi-minimal points 83
quasi-solutions 84

rank 195
rank theorem 263
reflexive 61
reflexivity 220
regular point 318
regular value 318
regularity up to the boundary 151
regularized problem 84
relative adhesion coefficient 146
relatively open 29
relativistic particles 135
renormalization of energy 155
representation 110
Reynolds numbers 369
Riesz theorem 41
Riesz-Schauder theory 286, 295

saddle point 72
Sard's theorem 318
Sard-Smale theorem 321
second differential 228
secondary category 169
seminorm 222
separated 29
separation of convex sets 6
sequentially compact 31
sequentially lower semicontinuous functionals 137
short exact Banach sequence 207
side condition 145
singular point 318
skew-adjoint 108
Smale principle 318
Sobolev space 56, 356
space shuttle 147
space-time manifold 157
spectrum 296
splits 189
splitting subspaces 196
standard model 106
stationary 47
step function 239
Stieltjes integral 10, 11
Stone-Weierstrass approximation theorem 35
stored energy function 140
stored potential energy 140
strangeness 113
strategy set 81
stress force 141
stress tensor 140
strictly convex 53
string theory 156
strong convergence 49
strong pressure equation 366
subalgebra 107
subimmersion 262
submersion 262
subsequences 219
sum rule 232
supercommutativity 279
superconducting state 154
superconductor 154
supercooled Helium 154
superfluidity 154
supermathematics 163
superposition operator 273
supernumbers 162
superstring theory 165
surface tension 146
surjective implicit function theorem 269
symmetries 98

Taylor problem 368
Taylor theorem 243
tensor algebra 278
tensor representations 111
Tietze-Urysohn extension theorem 36
topology 29
topological complement 189
topological direct sum 189
topological space 28
traceless 109
trapezoid formula 176
Triebel-Lizorkin spaces 360
turbulence 330

uniform boundedness theorem 172

value of the game 81
variation 43, 135
variational inequality 66
variational problem 46
vector calculus 363
velocity fields 335
very weak pressure equation 367
Virasoro algebra 161
Virasoro charges 161
Virasoro constraints 161

weak compactness 222
weak convergence 49, 217
weak topology 221
weak* convergence 218
weakly coercive 53
weakly open 221
weakly sequentially continuous 53
weakly sequentially lower semicontinuous 53
Weierstrass approximation theorem 35
Weierstrass existence theorem 53
Weierstrass theorem 33
weight diagram 117
well-posedness principle 180
winding number 152
world sheet 157

Yang-Mills equation 125
Young inequality 352

Zorn lemma 3
Applied Mathematical Sciences
(continued from page ii)
61. Sattinger/Weaver: Lie Groups and Algebras with Applications to Physics, Geometry, and Mechanics.
62. LaSalle: The Stability and Control of Discrete Processes.
63. Grasman: Asymptotic Methods of Relaxation Oscillations and Applications.
64. Hsu: Cell-to-Cell Mapping: A Method of Global Analysis for Nonlinear Systems.
65. Rand/Armbruster: Perturbation Methods, Bifurcation Theory and Computer Algebra.
66. Hlaváček/Haslinger/Nečas/Lovíšek: Solution of Variational Inequalities in Mechanics.
67. Cercignani: The Boltzmann Equation and Its Applications.
68. Temam: Infinite-Dimensional Dynamical Systems in Mechanics and Physics, 2nd ed.
69. Golubitsky/Stewart/Schaeffer: Singularities and Groups in Bifurcation Theory, Vol. II.
70. Constantin/Foias/Nicolaenko/Temam: Integral Manifolds and Inertial Manifolds for Dissipative Partial Differential Equations.
71. Catlin: Estimation, Control, and the Discrete Kalman Filter.
72. Lochak/Meunier: Multiphase Averaging for Classical Systems.
73. Wiggins: Global Bifurcations and Chaos.
74. Mawhin/Willem: Critical Point Theory and Hamiltonian Systems.
75. Abraham/Marsden/Ratiu: Manifolds, Tensor Analysis, and Applications, 2nd ed.
76. Lagerstrom: Matched Asymptotic Expansions: Ideas and Techniques.
77. Aldous: Probability Approximations via the Poisson Clumping Heuristic.
78. Dacorogna: Direct Methods in the Calculus of Variations.
79. Hernández-Lerma: Adaptive Markov Processes.
80. Lawden: Elliptic Functions and Applications.
81. Bluman/Kumei: Symmetries and Differential Equations.
82. Kress: Linear Integral Equations.
83. Bebernes/Eberly: Mathematical Problems from Combustion Theory.
84. Joseph: Fluid Dynamics of Viscoelastic Fluids.
85. Yang: Wave Packets and Their Bifurcations in Geophysical Fluid Dynamics.
86. Dendrinos/Sonis: Chaos and Socio-Spatial Dynamics.
87. Weder: Spectral and Scattering Theory for Wave Propagation in Perturbed Stratified Media.
88. Bogaevski/Povzner: Algebraic Methods in Nonlinear Perturbation Theory.
89. O'Malley: Singular Perturbation Methods for Ordinary Differential Equations.
90. Meyer/Hall: Introduction to Hamiltonian Dynamical Systems and the N-body Problem.
91. Straughan: The Energy Method, Stability, and Nonlinear Convection.
92. Naber: The Geometry of Minkowski Spacetime.
93. Colton/Kress: Inverse Acoustic and Electromagnetic Scattering Theory, 2nd ed.
94. Hoppensteadt: Analysis and Simulation of Chaotic Systems.
95. Hackbusch: Iterative Solution of Large Sparse Systems of Equations.
96. Marchioro/Pulvirenti: Mathematical Theory of Incompressible Nonviscous Fluids.
97. Lasota/Mackey: Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics, 2nd ed.
98. de Boor/Höllig/Riemenschneider: Box Splines.
99. Hale/Lunel: Introduction to Functional Differential Equations.
100. Sirovich (ed): Trends and Perspectives in Applied Mathematics.
101. Nusse/Yorke: Dynamics: Numerical Explorations, 2nd ed.
102. Chossat/Iooss: The Couette-Taylor Problem.
103. Chorin: Vorticity and Turbulence.
104. Farkas: Periodic Motions.
105. Wiggins: Normally Hyperbolic Invariant Manifolds in Dynamical Systems.
106. Cercignani/Illner/Pulvirenti: The Mathematical Theory of Dilute Gases.
107. Antman: Nonlinear Problems of Elasticity.
108. Zeidler: Applied Functional Analysis: Applications to Mathematical Physics.
109. Zeidler: Applied Functional Analysis: Main Principles and Their Applications.
110. Diekmann/van Gils/Verduyn Lunel/Walther: Delay Equations: Functional-, Complex-, and Nonlinear Analysis.
111. Visintin: Differential Models of Hysteresis.
112. Kuznetsov: Elements of Applied Bifurcation Theory, 2nd ed.
113. Hislop/Sigal: Introduction to Spectral Theory: With Applications to Schrödinger Operators.
114. Kevorkian/Cole: Multiple Scale and Singular Perturbation Methods.
115. Taylor: Partial Differential Equations I, Basic Theory.
116. Taylor: Partial Differential Equations II, Qualitative Studies of Linear Equations.
117. Taylor: Partial Differential Equations III, Nonlinear Equations.
118. Godlewski/Raviart: Numerical Approximation of Hyperbolic Systems of Conservation Laws.
119. Wu: Theory and Applications of Partial Functional Differential Equations.
120. Kirsch: An Introduction to the Mathematical Theory of Inverse Problems.
121. Brokate/Sprekels: Hysteresis and Phase Transitions.
122. Gliklikh: Global Analysis in Mathematical Physics: Geometric and Stochastic Methods.
123. Le/Schmitt: Global Bifurcation in Variational Inequalities: Applications to Obstacle and Unilateral Problems.
124. Polak: Optimization: Algorithms and Consistent Approximations.
125. Arnold/Khesin: Topological Methods in Hydrodynamics.
126. Hoppensteadt/Izhikevich: Weakly Connected Neural Networks.
127. Isakov: Inverse Problems for Partial Differential Equations.
128. Li/Wiggins: Invariant Manifolds and Fibrations for Perturbed Nonlinear Schrödinger Equations.
129. Müller: Analysis of Spherical Symmetries in Euclidean Spaces.
130. Feintuch: Robust Control Theory in Hilbert Space.
131. Ericksen: Introduction to the Thermodynamics of Solids, Revised ed.
132. Ihlenburg: Finite Element Analysis of Acoustic Scattering.
133. Vorovich: Nonlinear Theory of Shallow Shells.