Adel N. Boules - Fundamentals of Mathematical Analysis-OUP Oxford (2021)
Fundamentals of
Mathematical Analysis
ADEL N. BOULES
OUP UNCORRECTED PROOF – FINAL, 14/1/2021, SPi
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Adel N. Boules 2021
The moral rights of the author have been asserted
First Edition published in 2021
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2020952673
ISBN 978–0–19–886878–1 (hbk.)
ISBN 978–0–19–886879–8 (pbk.)
DOI: 10.1093/oso/9780198868781.001.0001
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Dedication
To all my children
Preface
This is a beginning graduate book on real and functional analysis, with a significant
component on topology. The prerequisites include a solid understanding of
undergraduate real analysis and linear algebra, and a good degree of mathematical
maturity. Rudimentary knowledge of metric spaces, although not required, is a
huge asset. With the singular exception of Liouville’s theorem (stated without
proof), and a passing reference to Laurent series, knowledge of complex analysis
is neither assumed nor needed.
It is possible for students with high mathematical aptitude to study this book
independently. However, the book is designed as a textbook for well-prepared
students of mathematics, to be taught under the able guidance of an instructor.
I like to think of this book as an accessible classical introduction to the subject.
The goal is to provide a springboard from which students can dive into greater
depths in the sea of mathematics.
The book is neither encyclopedic nor a shallow introduction. The aim is to achieve
excellent breadth and depth. The topics are organized logically but not rigidly, in
order to maximize utility and the potential readership. The careful sequencing
of the sections is designed to allow instructors to select topics that suit their
course goals, student backgrounds, and time limitations. Although the proofs are
detailed, I hope the reader will find the writing style clear and concise. The section
exercises constitute an important complement to the results in the main body of
the section. Indeed, some of the exercises provide alternative approaches to some
topics, and generalizations of some of the results in the main text are considered in
the exercises. The book synopsis included after the preface furnishes more details
on the structure of the book and brief chapter descriptions.
Sir Isaac Newton once said that if he had seen further than others, it was by
standing on the shoulders of giants. I am no giant, but this book is the shoulder I
have to offer. Perhaps a few students will climb and will be able to see farther than
I have.
Acknowledgments
I thank the anonymous reviewers of this book for their insightful criticism.
Their suggestions greatly contributed to the richness of the book and the
cohesion of its topics. I also thank the commissioning editor Dan Taber and the
assistant commissioning editor Katherine Ward for their prompt and professional
assistance during the review and production stages, and my son Youssef for his
assistance with the typesetting and formatting of the graphics.
Finally but foremost, my deep gratitude goes to my wife and life companion. Her
kind, patient, and trusting nature touched many lives and transformed mine.
Jacksonville, Florida
July 2020
The book in its entirety contains enough material for a two-semester course. The
core of the book can be used for an easy-paced two-semester course. If a definition
of the core contents of the book is desirable, I define the core to consist of the
following sections, in addition to the very basic ideas in sections 1.1, 1.2, and 3.1:
Instructors can choose material from this part as their students’ background
warrants. The most basic results in the first three chapters are stated without proof.
Chapter 1. This chapter furnishes a brief refresher of basic concepts. The natural,
rational, and real number systems are taken for granted, although we develop the
completeness of the real line and the Bolzano-Weierstrass theorem at length,
as well as the complex number field, including its completeness. Embryonic
manifestations of completeness and compactness can be seen in this chapter.
Examples include the nested interval theorem and the uniform continuity of
continuous functions on compact intervals, and our proof of the Heine-Borel
theorem in chapter 4 is squarely based on the Bolzano-Weierstrass property of
bounded sets.
Chapter 2. This chapter fills in any potential gaps that may exist in the
student’s knowledge of set theory. Sections 2.1 and 2.2 are essential for a proper
understanding of the rest of the book. In particular, a thorough understanding
of countability and Zorn’s lemma is indispensable. Some of section 2.3 may be
included, but only an intuitive understanding of cardinal numbers is sufficient.
Studying section 2.3 up to theorem 2.3.4, together with theorem 2.3.13, is sufficient
Chapter 6. This chapter introduces Banach spaces. Sections 6.1–6.4 form the
core of the chapter. It would be accurate to characterize sections 6.1–6.4 as quite
classical. Section 6.5 is needed for sections 7.3 and 7.4. Section 6.6 can be omitted
if a brief introduction is the goal. In this case, section 7.5 must also be omitted.
Section 6.7 is terminal and may be omitted without consequence. I have enriched
the chapter by including such topics as Gelfand’s theorem, Schauder bases, and
complemented subspaces. Chapters 6 and 7 include a good number of applications
of the four fundamental theorems of functional analysis.
Chapter 7. This chapter introduces Hilbert spaces and the elements of operator
theory. Sections 7.3 and 7.4 contain a good set of results on self-adjoint and
compact operators. The section exercises contain problems that suggest alternative
approaches and hence allow the instructor to shorten these two sections while
preserving good depth. For example, the Fredholm theory can be bypassed if
the instructor wishes to limit the discussion to compact, self-adjoint operators on
Hilbert spaces. Sections 7.3 and 7.4 are written in such a way as to facilitate extending
the results to compact operators on Banach spaces (section 7.5). For example, we
used Riesz’s lemma instead of the projection theorem in order to keep the proofs
adaptable for extension to Banach spaces. Sections 7.3–7.5 contain more results
than are typically found in an introductory course.
Chapter 8. Section 8.1 furnishes a brief but rigorous introduction to the Riemann
integral of continuous functions on compact boxes in $\mathbb{R}^n$. Although it has intrinsic
value, the section is included for the express purpose of developing section 8.4.
Section 8.4 develops the Lebesgue measure on $\mathbb{R}^n$, and the approach is to extend
the positive linear functional provided by the Riemann integral on the space
of continuous, compactly supported functions on $\mathbb{R}^n$. This very nearly amounts
to developing the Radon measure theory on locally compact Hausdorff spaces.
However, I chose to limit the discussion to Lebesgue measure on $\mathbb{R}^n$ because I did
not wish to base the presentation heavily on chapter 5. I did, nonetheless, include
an excursion into Radon measures as an optional topic. The rest of the chapter is
largely independent of sections 8.1 and 8.4 and constitutes a decent introduction
to general measure and integration theories. The section on complex measures has
intrinsic value but is also included in order to facilitate the study of the duals of $\mathfrak{L}^p$
spaces. In particular, I limited the discussion of signed measures to real measures,
that is, signed measures that are not allowed to assume infinite values. This turned
out to be sufficient for our purposes. The selection of topics and the approach in
sections 8.6 and 8.8 are quite classical and cover the basics of $\mathfrak{L}^p$ spaces and product
measures. Section 8.7 contains an excellent collection of approximation theorems,
including approximations by $\mathcal{C}^\infty$ functions. The title of the last section accurately
captures its contents: a mere glimpse of the subject. However, the section finally
settles questions started in sections 3.7 and 4.10 and concludes with the unraveling
of the mystery about the completeness of orthogonal polynomials.
Appendices
Appendix A. This appendix contains the proof of the equivalence of the axiom
of choice, Zorn’s lemma, and the well-ordering principle. I created this appendix
in order to avoid distraction if instructors decide not to include the proof in their
course.
Contents
1. Preliminaries
1.1 Sets, Functions, and Relations
1.2 The Real and Complex Number Fields
2. Set Theory
2.1 Finite, Countable, and Uncountable Sets
2.2 Zorn’s Lemma and the Axiom of Choice
2.3 Cardinal Numbers
3. Vector Spaces
3.1 Definitions and Basic Properties
3.2 Independent Sets and Bases
3.3 The Dimension of a Vector Space
3.4 Linear Mappings, Quotient Spaces, and Direct Sums
3.5 Matrix Representation and Diagonalization
3.6 Normed Linear Spaces
3.7 Inner Product Spaces
4. The Metric Topology
4.1 Definitions and Basic Properties
4.2 Interior, Closure, and Boundary
4.3 Continuity and Equivalent Metrics
4.4 Product Spaces
4.5 Separable Spaces
4.6 Completeness
4.7 Compactness
4.8 Function Spaces
4.9 The Stone-Weierstrass Theorem
4.10 Fourier Series and Orthogonal Polynomials
5. Essentials of General Topology
5.1 Definitions and Basic Properties
5.2 Bases and Subbases
5.3 Continuity
5.4 The Product Topology: The Finite Case
5.5 Connected Spaces
5.6 Separation by Open Sets
5.7 Second Countable Spaces
5.8 Compact Spaces
5.9 Locally Compact Spaces
Bibliography
Glossary of Symbols
Index
1
Preliminaries
Dirichlet was appointed to fill Gauss’s chair at Göttingen, soon became Dedekind’s
friend and mentor, and had a strong influence in shaping his mathematical
interests. While at Göttingen, Dedekind studied the work of Galois and was the
first to lecture on Galois theory.
Dedekind was later appointed to the Polytechnic of Zürich and began teaching
there in 1858. By the 1860s, the Collegium Carolinum in Brunswick had been
upgraded to the Brunswick Polytechnic, and Dedekind was appointed to it in
1862. With this appointment he returned to his hometown and remained there
for the rest of his life.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0001
Dedekind retired in 1894. His life was long, healthy, and contented. He never
married and instead lived with one of his sisters, who also remained unmarried,
for most of his adult life. “He did not feel pressed to have a more marked effect in
the outside world: such confirmation of himself was unnecessary.”1
“Dedekind’s legacy ... consisted not only of important theorems, examples, and
concepts, but a whole style of mathematics that has been an inspiration to each
succeeding generation.”2
The reader is expected to be familiar with basic set theoretic concepts such
as containment, unions, and intersections and should be comfortable with set
notation. Most of the essential definitions will be stated in this section. A number
of basic facts will be stated as theorems, without proof.
X − A = {x ∈ X ∶ x ∉ A}.
We use the same notation for relative differences (the complement of B in A):
A − B = {x ∈ A ∶ x ∉ B} = A ∩ (X − B).
Theorem 1.1.2 (De Morgan’s laws). Let A1 , A2 ,…, An be subsets of a set X. Then
Definition. If x and y are objects (e.g., numbers, functions, sets), the ordered pair
(x, y) is defined by (x, y) = {x, {x, y}}. The reader can verify that the definition
guarantees that (x, y) = (a, b) if and only if x = a, and y = b.
X × Y = {(x, y) ∶ x ∈ X, y ∈ Y}.
ℜ( f ) = { f (x) ∶ x ∈ X}.
If $A \subseteq X$, the image of A under f is the set $f(A) = \{f(a) : a \in A\}$. The inverse
image of a set $B \subseteq Y$ is the set $f^{-1}(B) = \{x \in X : f(x) \in B\}$.
Definition. The identity function $I_X$ on a set X is the function $I_X(x) = x$ for all
$x \in X$.
Indexed Sets
Let I be a set (the indexing set) and let $\mathfrak{A}$ be a collection of sets. An indexing of
$\mathfrak{A}$ by I is a bijection $A : I \to \mathfrak{A}$. The image of an element $\alpha \in I$ is denoted by $A_\alpha$
instead of $A(\alpha)$. Thus $\mathfrak{A} = \{A_\alpha : \alpha \in I\}$. Indexing is, of course, not limited to sets;
one can index, for example, a set of numbers, or functions. If there is no danger
of ambiguity, we sometimes omit reference to the indexing set I and write $\{A_\alpha\}_\alpha$.
Indexing is clearly a generalization of sequencing, as illustrated by the examples
below.
Example 3. We can index the set of linear homogeneous functions in one real
variable as $\{f_\alpha : \alpha \in \mathbb{R}\}$, where, for $x \in \mathbb{R}$, $f_\alpha(x) = \alpha x$.
(a) There exists a sequence of sets $(B_n)$ such that $B_1 \subseteq B_2 \subseteq \dots$ and $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$. We simply define $B_n = \bigcup_{i=1}^{n} A_i$.
(b) There exists a disjoint sequence of sets $(C_n)$ such that $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} C_n$. The sequence we seek is $C_1 = A_1$ and, for $n \ge 2$, $C_n = A_n - \bigcup_{i=1}^{n-1} A_i$.
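Constructions (a) and (b) above can be tried out on a finite list of sets. The following is a minimal Python sketch, not part of the original text; the function names and the sample sets are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

def increasing_unions(sets):
    """B_n = A_1 ∪ ... ∪ A_n for a finite list of sets (construction (a))."""
    return list(accumulate(sets, lambda acc, s: acc | s))

def disjointify(sets):
    """C_1 = A_1 and C_n = A_n - (A_1 ∪ ... ∪ A_{n-1}) for n >= 2 (construction (b))."""
    seen = set()          # union of the sets processed so far
    result = []
    for s in sets:
        result.append(set(s) - seen)
        seen |= s
    return result

# Hypothetical sample data, chosen only for illustration.
A = [{1, 2}, {2, 3}, {1, 4}]
B = increasing_unions(A)   # nested sequence with the same union
C = disjointify(A)         # pairwise disjoint sequence with the same union
```

For the sample list, both `B` and `C` have the same union as `A`; the sets in `B` are nested, and the sets in `C` are pairwise disjoint.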
(a) $A \cup (\bigcap_\alpha B_\alpha) = \bigcap_\alpha (A \cup B_\alpha)$,
(b) $A \cap (\bigcup_\alpha B_\alpha) = \bigcup_\alpha (A \cap B_\alpha)$.
Theorem 1.1.5 (De Morgan’s laws). Let $\{A_\alpha\}_\alpha$ be an indexed family of subsets of a
set X. Then
(a) $X - \bigcap_\alpha A_\alpha = \bigcup_\alpha (X - A_\alpha)$,
(b) $X - \bigcup_\alpha A_\alpha = \bigcap_\alpha (X - A_\alpha)$.
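De Morgan's laws can be checked on a small finite family. The sketch below is illustrative only: the ambient set, the family, and the helper names are assumptions, and a finite check is of course no substitute for the proof.

```python
from functools import reduce

# Hypothetical ambient set and family of subsets, chosen only for illustration.
X = set(range(10))
family = [{0, 1, 2, 3}, {2, 3, 4}, {3, 5, 7}]

def union(sets):
    """Union of a finite family of sets."""
    return set().union(*sets)

def intersection(sets, universe):
    """Intersection of a finite family of subsets of `universe`."""
    return reduce(lambda acc, s: acc & s, sets, set(universe))

# (a) X - ∩ A_α = ∪ (X - A_α)
lhs_a = X - intersection(family, X)
rhs_a = union(X - A for A in family)

# (b) X - ∪ A_α = ∩ (X - A_α)
lhs_b = X - union(family)
rhs_b = intersection([X - A for A in family], X)
```

In this sketch `lhs_a == rhs_a` and `lhs_b == rhs_b` both hold, as the theorem predicts.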
Theorem 1.1.6. Let $f : X \to Y$, let $\{A_\alpha\}_\alpha$ be a collection of subsets of X, and let $\{B_\beta\}_\beta$
be a collection of subsets of Y. Then
$x : I \to \bigcup_{\alpha \in I} X_\alpha$
We will denote the function x in the above definition by $(x_\alpha)_{\alpha \in I}$, or simply $(x_\alpha)$.
The above definition generalizes the definition of the Cartesian product of a
finite number of sets. Indeed, for sets $X_1, X_2, \dots, X_n$, the Cartesian product $\prod_{i=1}^{n} X_i$
is the set of all sequences $(x_1, x_2, \dots, x_n)$ such that $x_i \in X_i$ for all $1 \le i \le n$. A
sequence is nothing but a function $x : \{1, 2, \dots, n\} \to \bigcup_{i=1}^{n} X_i$ such that $x_i = x(i) \in X_i$
for all $1 \le i \le n$.
Example 7. Let $\mathbb{R}^{\mathbb{N}}$ be the set of all infinite sequences in $\mathbb{R}$. This is also the product
$\prod_{i=1}^{\infty} X_i$, where each $X_i = \mathbb{R}$.
Example 8. Let A be a set, and let $2^A$ denote the set of all functions from A to
the set {0, 1}. Indeed, $2^A$ is a product because if we define $X_\alpha = \{0, 1\}$ for all
$\alpha \in A$, then $2^A = \prod_{\alpha \in A} X_\alpha$. As a special case, the set $2^{\mathbb{N}}$ is the set of all binary
sequences.
$\chi_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \notin S. \end{cases}$
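The function $\chi_S$ above is an element of $2^A$ in the sense of Example 8, and the subset S can be recovered from it. The following Python sketch illustrates this on a finite set; the names `chi` and `subset_from` and the sample set are hypothetical.

```python
def chi(S):
    """Return the function x -> 1 if x is in S, else 0."""
    return lambda x: 1 if x in S else 0

def subset_from(f, A):
    """Recover the subset of A on which a {0, 1}-valued function f equals 1."""
    return {x for x in A if f(x) == 1}

# Hypothetical finite set and subset, chosen only for illustration.
A = {"a", "b", "c"}
S = {"a", "c"}
chi_S = chi(S)
values = [chi_S(x) for x in ("a", "b", "c")]
```

Passing from S to `chi_S` and back with `subset_from` returns S, which is the idea behind the correspondence between $2^A$ and the subsets of A.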
Definition. Let $\{X_\alpha\}_{\alpha \in I}$ be a collection of sets, and let $X = \prod_{\alpha \in I} X_\alpha$. For each
$\alpha \in I$, define the projection $\pi_\alpha : X \to X_\alpha$ by $\pi_\alpha(x) = x_\alpha$. Here $x = (x_\alpha)_{\alpha \in I}$ is
an element of X.
Example 10. Consider the set D of all functions $f : [0, 1] \to \mathbb{R}$. This set can be
thought of as $\prod_{\alpha \in [0,1]} X_\alpha$, where each $X_\alpha = \mathbb{R}$. If $f \in D$ and $a \in [0, 1]$, then
$\pi_a(f) = f(a)$. Fix an element $a \in [0, 1]$ and an interval $U \subseteq \mathbb{R}$. It makes sense
to ask what $\pi_a^{-1}(U)$ is. This is simply the set of all functions $f \in D$ such that
$\pi_a(f) \in U$ or simply $f(a) \in U$. Thus $\pi_a^{-1}(U)$ is the set of all the functions on
the closed unit interval whose graphs cross the line segment $\{a\} \times U$.
Thus the union of the equivalence classes is A, and distinct equivalence classes are
disjoint. The common terminology is that the equivalence classes partition A.
Exercises
12. Show that the composition of two injective (respectively, surjective, bijective)
functions is injective (respectively, surjective, bijective).
13. Let $f : A \to B$ and $g : B \to C$ be bijections. Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
14. (a) Show that if $f : A \to B$ is injective, then there exists a function $g : B \to A$
such that $g \circ f = I_A$.
(b) Show that if $f : A \to B$ is surjective, then there exists a function $g : B \to A$
such that $f \circ g = I_B$.
15. Show that the function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ given by $f(m, n) = 2^{m-1}(2n - 1)$ is a
one-to-one correspondence.
16. Verify the one-to-one correspondence between $2^A$ and $\mathcal{P}(A)$.
17. Show that if A has n elements and B has m elements, then $A^B$ has $n^m$
elements. Conclude that $\mathcal{P}(A)$ has $2^n$ elements.
18. Let S and T be subsets of a set A. Show that
(a) $\chi_{S \cap T} = \chi_S \cdot \chi_T$; and
(b) $\chi_{S \cup T} = \chi_S + \chi_T - \chi_{S \cap T}$.
19. Prove theorem 1.1.7.
20. Fix an integer n > 1, and define a relation R on ℤ as follows: xRy if x − y
is a multiple of n. Show that R is an equivalence relation, and describe the
equivalence classes.
21. Define a relation R on ℝ as follows: xRy if and only if x − y ∈ ℚ. Show that
R is an equivalence relation.
22. Define a relation R on ℤ by xRy if and only if $x^2 + y^2$ is even. Show that R
is an equivalence relation.
23. Define a relation R on ℝ by xRy if and only if xy ≥ 0. Is R an equivalence
relation?
the section is not totally self-contained, there is value in its inclusion because
it illustrates a number of important proof techniques and provides a succinct
summary of the properties of real and complex number fields.
(a) a + b = b + a.
(b) a + (b + c) = (a + b) + c.
(c) There is an element 0 ∈ F such that a + 0 = a.
(d) For every a ∈ F, there is an element −a ∈ F such that a + (−a) = 0.
(e) a × b = b × a.
(f) a × (b × c) = (a × b) × c.
(g) There is an element 1 ∈ F such that a × 1 = a.
(h) For every a ≠ 0, there is an element $a^{-1}$ such that $a \times a^{-1} = 1$.
(i) a × (b + c) = a × b + a × c.
We often omit the symbol for multiplication and write ab or a.b for a × b. The
element 0 is called the additive identity, and 1 is called the multiplicative identity
of the field. A field must clearly contain at least two elements.
With the usual operations of addition and multiplication of numbers, the rational
numbers, ℚ, and the real numbers, ℝ, are fields. We will see later in this section
that complex numbers also form a field.
Real Numbers
It is clear that if M is an upper bound of A, then every real number greater than M
is also an upper bound of A. This leads to the following definition.
Definition. The least upper bound of a set A ⊆ ℝ is the number M such that
The least upper bound of A is also called the supremum of A and is denoted by
supA. If A is not bounded above, we set supA = ∞.
Definition. The greatest lower bound of a set A ⊆ ℝ is the number m such that
The greatest lower bound of A is also called the infimum of A and is given the
notation inf A. If A is not bounded below, we set infA = −∞.
Example 2. Let $A_1 = (-\infty, 1)$, $A_2 = \{\frac{1}{n} : n \in \mathbb{N}\}$, and $A_3 = \{\frac{n}{n+1} : n \in \mathbb{N}\}$. Then
$\sup A_1 = \sup A_2 = \sup A_3 = 1$, $\inf A_1 = -\infty$, $\inf A_2 = 0$, and $\inf A_3 = 1/2$.
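A finite truncation of the sets $A_2$ and $A_3$ in Example 2 shows which bounds are attained and which are only approached. This is a rough numerical illustration, not a proof; the truncation level `N` is an arbitrary assumption.

```python
from fractions import Fraction

N = 1000  # hypothetical truncation level, chosen only for illustration

# First N elements of A_2 = {1/n} and A_3 = {n/(n+1)}, as exact rationals.
A2 = [Fraction(1, n) for n in range(1, N + 1)]
A3 = [Fraction(n, n + 1) for n in range(1, N + 1)]

max_A2, min_A2 = max(A2), min(A2)   # the supremum 1 is attained; the infimum 0 is not
max_A3, min_A3 = max(A3), min(A3)   # the infimum 1/2 is attained; the supremum 1 is not
```

The truncated minimum of $A_2$ and the truncated maximum of $A_3$ stay strictly away from 0 and 1, respectively, however large `N` is taken.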
Definition. A sequence $(a_n)$ of real numbers is said to diverge to $\infty$ if, for every
$M \in \mathbb{R}$, there is a natural number N such that $a_n > M$ for all $n > N$. In this case,
we also say that $(a_n)$ has limit $\infty$, and we write $\lim_n a_n = \infty$. The sequence $(a_n)$
is said to diverge to $-\infty$ if, for every $m \in \mathbb{R}$, there is a natural number N such
that $a_n < m$ for all $n > N$. In this case, we also say that $(a_n)$ has limit $-\infty$, and we
write $\lim_n a_n = -\infty$.
Example 4. Let $a_n = n + \frac{1}{n}$, $b_n = 1 + (-1)^n$, $c_n = e^{-n}$.
The sequence $(a_n)$ diverges to $\infty$, while $(b_n)$ does not converge, nor does it
diverge to $\pm\infty$. Finally, $\lim_n c_n = 0$.
Definition. A sequence $(a_n)$ is bounded if its range $\{a_1, a_2, \dots\}$ is a bounded set.
Thus there is a positive number M such that, for all $n \in \mathbb{N}$, $|a_n| \le M$.
Definition. A sequence $(a_n)$ is said to be a Cauchy sequence if, for every $\epsilon > 0$, there
is a natural number N such that, for all $m, n > N$, $|a_n - a_m| < \epsilon$.
Proof. Let $\epsilon = 1$. There is a positive integer N such that, for $m, n \ge N$, $|a_n - a_m| < 1$.
In particular, taking $m = N$, $|a_n - a_N| < 1$ for all $n \ge N$. Thus, by the triangle
inequality, for every $n \ge N$, $|a_n| = |(a_n - a_N) + a_N| \le |a_n - a_N| + |a_N| \le 1 + |a_N|$.
Let $M = \max\{|a_1|, \dots, |a_{N-1}|, 1 + |a_N|\}$. Clearly, $|a_n| \le M$ for all n.
Definition. Let $(a_n)$ be a sequence, and let $(n_1, n_2, \dots)$ be a strictly increasing
sequence of natural numbers. We say that $(a_{n_k})_{k=1}^{\infty}$ is a subsequence of $(a_n)$.
Proof. Let $\epsilon > 0$. Since $\lim_n a_n = a$, there is a positive integer N such that, for $n > N$,
$|a_n - a| < \epsilon$. Since $(n_k)$ is an increasing sequence of natural numbers, $n_k \ge k$ for
every $k \in \mathbb{N}$. Thus, for $k > N$, $n_k > N$ and $|a_{n_k} - a| < \epsilon$.
Proof. Define a term $a_n$ of the sequence to be a peak if, for every $i \ge n$, $a_n \ge a_i$. There
are two cases:
Case 1. The sequence $(a_n)$ has finitely many peaks. Suppose $k_0$ is the largest positive
integer for which $a_{k_0}$ is a peak, and let $n_1 = k_0 + 1$. Since $a_{n_1}$ is not a peak, there is
an integer $n_2 > n_1$ such that $a_{n_2} > a_{n_1}$. Continuing inductively, one can construct
a strictly increasing sequence of positive integers $n_1 < n_2 < n_3 < \dots$ such that
$a_{n_1} < a_{n_2} < a_{n_3} < \dots$. The sequence $(a_{n_k})$ is an increasing subsequence of $(a_n)$.
Case 2. The sequence $(a_n)$ contains infinitely many peaks, $a_{n_1} \ge a_{n_2} \ge a_{n_3} \ge \dots$,
where $(n_k)$ is an increasing sequence of positive integers. The sequence $(a_{n_k})$ is a
non-increasing subsequence of $(a_n)$.
Proof. Let $(a_n)$ be a bounded sequence. By the previous theorem, $(a_n)$ contains a
monotonic subsequence, $(a_{n_k})$, which is convergent by theorem 1.2.3.
The following two examples are fixtures in undergraduate real analysis books. The
proof technique is quite common and is valid for general compact sets. See chapter
4. First we remind the reader of the definition of continuity.
The number 𝛿 in the above definition depends on 𝜖 only and not on any particular
x ∈ X. For example, the function f ∶ (0, 1) → ℝ defined by f (x) = 1/x is continuous
but not uniformly continuous.
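The failure of uniform continuity for $f(x) = 1/x$ on (0, 1) can be made concrete: for any $\delta > 0$, the points $x = 1/n$ and $y = 1/(n+1)$ are within $\delta$ of each other for large n, yet $|f(x) - f(y)| = 1$. The Python sketch below searches for such a pair; the function name `witness` is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def f(x):
    """The function f(x) = 1/x on (0, 1)."""
    return 1 / x

def witness(delta):
    """Return x, y in (0, 1) with |x - y| < delta but |f(x) - f(y)| = 1."""
    n = 2
    # |1/n - 1/(n+1)| = 1/(n(n+1)); increase n until this drops below delta.
    while Fraction(1, n * (n + 1)) >= delta:
        n += 1
    return Fraction(1, n), Fraction(1, n + 1)

x, y = witness(Fraction(1, 1000))
gap = abs(f(x) - f(y))   # stays equal to 1 no matter how small delta is
```

So no single $\delta$ works for $\epsilon = 1$, which is exactly what the definition of uniform continuity would require.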
Example 9. The rational field $\mathbb{Q}$ does not satisfy the Cauchy criterion. For
example, the sequence $\sum_{i=0}^{n} \frac{1}{i!}$ is a Cauchy sequence in $\mathbb{Q}$, but its limit, e, is
not in $\mathbb{Q}$.
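The partial sums in Example 9 can be computed exactly as rational numbers. The following Python sketch is illustrative only (the function name `partial_sum` is hypothetical); it shows that each partial sum is rational, that the gaps between partial sums shrink, and that the values approach e.

```python
from fractions import Fraction
from math import factorial, e

def partial_sum(n):
    """s_n = sum_{i=0}^{n} 1/i!, computed exactly as a rational number."""
    return sum(Fraction(1, factorial(i)) for i in range(n + 1))

s10, s15 = partial_sum(10), partial_sum(15)
gap = s15 - s10                  # difference of two terms of the sequence
error = abs(float(s15) - e)      # distance from the (irrational) limit
```

Here `gap` is bounded by $1/(10 \cdot 10!)$, the standard tail estimate for this series, which is the kind of bound that makes the sequence Cauchy.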
Proof. Let $I_1 = [a, b]$ be a closed bounded interval that contains A. Bisect $I_1$ into
two congruent closed subintervals. One of the two subintervals contains infinitely
many points of A. Denote that interval by $I_2$. Continuing this process produces
a sequence of subintervals $I_1 \supseteq I_2 \supseteq \dots$ such that $A \cap I_n$ is infinite for all $n \in \mathbb{N}$,
and the length of $I_n$ is $l(I_n) = \frac{b-a}{2^{n-1}}$. For each $n \in \mathbb{N}$, pick a point $a_n \in A \cap I_n$. If
$m > n$, then $I_n \supseteq I_m$ and $a_m, a_n \in I_n$; hence $|a_n - a_m| \le \frac{b-a}{2^{n-1}}$. Since $\lim_n \frac{b-a}{2^{n-1}} = 0$, $(a_n)$
is a Cauchy sequence. Let $a = \lim_n a_n$. Since $a_i \in I_n$ for all $i \ge n$, $a \in I_n$ for all
n (see exercise 9 at the end of this section). Now let $\delta > 0$. Since $\lim_n l(I_n) = 0$
and $a \in \bigcap_{n=1}^{\infty} I_n$, $I_n \subseteq (a - \delta, a + \delta)$ for sufficiently large n. Thus $(a - \delta, a + \delta)$
contains infinitely many points of A because $I_n$ does. In particular, $(a - \delta, a + \delta) \cap A$
contains a point other than a.
Example 10. The completeness of ℝ is equivalent to the Cauchy criterion. The fact
that the completeness of ℝ implies the Cauchy criterion has been established
in theorem 1.2.9. Observe that the proof of theorem 1.2.9 depends heavily
(through the intervening theorems) on theorem 1.2.3, where the completeness
of ℝ was crucial. We now prove that the Cauchy criterion implies the completeness
of ℝ.
Consequently, $a_{n+1} - a_n \le \frac{b_0 - a_0}{2^n}$ and $b_n - b_{n+1} \le \frac{b_0 - a_0}{2^n}$.
We now show that $(b_n)$ is a Cauchy sequence. Let $\epsilon > 0$, and choose an integer N
such that $(b_0 - a_0)/2^{N-1} < \epsilon$. For $m > n > N$, we have
By the Cauchy criterion, the sequence $(b_n)$ has a limit, say, b. An argument
identical to the one above shows that $(a_n)$ is convergent, and since $b_n - a_n \le (b_0 - a_0)/2^n$,
$\lim_n a_n = b$.
Finally, we prove that $b = \sup A$. If $a > b$ for an element $a \in A$, then $a > b_n$ for some
n, which contradicts the fact that $b_n$ is an upper bound of A. Thus b is an upper
bound of A. For any number $c < b$, let $\epsilon = b - c$. Since $\lim_n a_n = b$, there exists an
integer n such that $a_n \in (b - \epsilon, b + \epsilon)$. In particular, $a_n > c$; hence c is not an upper
bound of A.
3 Observe that if $a_{n+1} = b_n$, the process terminates and $b_n$ is the least upper bound (in fact,
the maximum) of A. Otherwise, the process continues ad infinitum, and each $b_n$ is a strict upper
bound of A.
Definition. The extended real line is $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. We need this extension
of $\mathbb{R}$ because the limits of some sequences are infinite and because it is sometimes
convenient to allow functions to take infinite values. We retain the usual
ordering on $\mathbb{R}$, and, for $x \in \mathbb{R}$, we define $-\infty < x < \infty$. The following rules of
arithmetic in $\overline{\mathbb{R}}$ are convenient and widely accepted:
Remark. Not every limit point of a sequence is a limit point of its range. For
example, the sequence $a_n = (-1)^n$ has two limit points, $\pm 1$, while its range, the
set $\{-1, 1\}$, has no limit points.
Theorem 1.2.11. An extended real number a is a limit point of $(a_n)$ if and only if
there exists a subsequence $(a_{n_k})$ of $(a_n)$ such that $\lim_k a_{n_k} = a$.
Theorem 1.2.12. $\alpha = \limsup_n a_n$ if and only if, for every $\epsilon > 0$,
(a) there is a positive integer N such that $a_n < \alpha + \epsilon$ for all $n > N$, and
(b) $a_n > \alpha - \epsilon$ for infinitely many $n \in \mathbb{N}$.
Proof. Suppose $\alpha = \limsup_n a_n$. Since $\alpha = \inf_n \alpha_n$, there is a positive integer N such
that $\alpha_N < \alpha + \epsilon$. Now, because $a_n \le \alpha_n$ and $(\alpha_n)$ is non-increasing, $a_n \le \alpha_n \le \alpha_N < \alpha + \epsilon$
for all $n \ge N$. This proves (a).
To prove (b), note that $\alpha - \epsilon < \alpha \le \alpha_1 = \sup\{a_1, a_2, \dots\}$. Thus there is a positive
integer $n_1$ such that $a_{n_1} > \alpha - \epsilon$. Now $\alpha - \epsilon < \alpha \le \alpha_{n_1+1} = \sup\{a_{n_1+1}, a_{n_1+2}, \dots\}$.
Thus there is a positive integer $n_2 > n_1$ such that $a_{n_2} > \alpha - \epsilon$. This process produces
a subsequence $(a_{n_k})$ of $(a_n)$ such that $a_{n_k} > \alpha - \epsilon$.
To prove the converse, suppose $\alpha \in \mathbb{R}$ satisfies conditions (a) and (b), and let
$\epsilon > 0$. By condition (b), for every $n \in \mathbb{N}$, there exists an integer $k \ge n$ such that
$a_k > \alpha - \epsilon$. Thus $\alpha_n = \sup_{k \ge n} a_k \ge \alpha - \epsilon$. Taking the limit as $n \to \infty$ produces
$\limsup_n a_n \ge \alpha - \epsilon$. Since $\epsilon$ is arbitrary, $\limsup_n a_n \ge \alpha$.
By condition (a), there exists an integer N such that $a_k < \alpha + \epsilon$ for every $k > N$.
Thus, for every $n > N$, $\alpha_n = \sup_{k \ge n} a_k \le \alpha + \epsilon$. Taking the limit as $n \to \infty$, we
obtain $\limsup_n a_n \le \alpha + \epsilon$. Because $\epsilon$ is arbitrary, $\limsup_n a_n \le \alpha$.
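The tail suprema $\alpha_n = \sup_{k \ge n} a_k$ used in the proof can be computed for a concrete sequence. The Python sketch below uses the hypothetical example $a_n = (-1)^n (1 + 1/n)$, which is not from the text, and replaces each infinite tail by a finite one up to an assumed `horizon`; for this particular sequence the truncated maxima coincide with the true tail suprema because each supremum is attained at the first even index in the tail.

```python
from fractions import Fraction

def a(n):
    """Hypothetical example sequence a_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + Fraction(1, n))

def tail_sups(seq, count, horizon):
    """Approximate alpha_n = sup_{k >= n} seq(k) for n = 1..count, using k <= horizon."""
    values = [seq(k) for k in range(1, horizon + 1)]
    return [max(values[n - 1:]) for n in range(1, count + 1)]

alphas = tail_sups(a, 10, 1000)
```

The values in `alphas` are non-increasing and stay above 1, consistent with $\limsup_n a_n = \inf_n \alpha_n = 1$ for this example.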
Theorem 1.2.13. The upper limit of a sequence $(a_n)$ is the largest limit point of $(a_n)$.
Proof. Let $\alpha = \limsup_n a_n$, and let $\epsilon > 0$. By the previous theorem (and its proof), there is a positive integer N
such that, for all $n > N$, $a_n < \alpha + \epsilon$, and a subsequence $(a_{n_k})$ such that $a_{n_k} > \alpha - \epsilon$.
Since $\lim_k n_k = \infty$, there is a positive integer K such that $n_k > N$ for all $k > K$.
Therefore $\alpha - \epsilon < a_{n_k} < \alpha + \epsilon$ for all $k > K$. By theorem 1.2.11, $\alpha$ is a limit point
of $(a_n)$. If t is a limit point of $(a_n)$, then, for infinitely many positive integers n,
$t - \epsilon < a_n$. By theorem 1.2.12, there is an integer N such that, for all $n > N$, $a_n < \alpha + \epsilon$.
Choosing n large enough for the last two inequalities to be simultaneously
satisfied, we have $t - \epsilon < a_n < \alpha + \epsilon$. Therefore $t < \alpha + 2\epsilon$. Since $\epsilon$ is arbitrary,
$t \le \alpha$.
Proof. Let $\alpha = \limsup_n a_n$, $\beta = \liminf_n a_n$, and suppose that $\alpha = \beta$. Let $\epsilon > 0$. By theorem
1.2.12 and problem 17 at the end of this section, there is a positive integer N such
that, for $n > N$, $a_n < \alpha + \epsilon$ and $\alpha - \epsilon = \beta - \epsilon < a_n$. Thus, for $n > N$, $\alpha - \epsilon < a_n < \alpha + \epsilon$;
hence $\lim_n a_n = \alpha$. Conversely, if $\lim_n a_n = a$, then it is easy to verify that
the conditions of theorem 1.2.12 and those of problem 17 are met with $\alpha = a$ and
$\beta = a$, respectively. Hence $\alpha = \beta$.
Complex Numbers
The definition so far makes ℂ nothing more than the Euclidean plane ℝ2 . This
is why the set of complex numbers is also called the complex plane. What sets ℂ
apart from ℝ2 is the following pair of binary operations.
Definition. For complex numbers z = (x, y) and w = (a, b), we define the sum
z + w = (x + a, y + b) and the product zw = (ax − by, ay + bx). The real field ℝ
is embedded into the complex plane in a natural way: we identify a real number
x with the complex number (x, 0). Under the operations of complex addition
and multiplication, the subset ℝ̃ = {(x, 0) ∈ ℂ ∶ x ∈ ℝ} is closed in the sense
that if z and w are in ℝ̃, then z + w and zw are in ℝ̃. Indeed z + w = (x + a, 0)
and zw = (ax, 0). From now on, we make no distinction between ℝ and ℝ̃ and
simply write x for (x, 0). With this understanding, we see that if x ∈ ℝ, then
xw = (x, 0)(a, b) = (xa, xb). It is also straightforward to verify that the elements
0 = (0, 0) and 1 = (1, 0) satisfy z + 0 = z and z.1 = z for all z ∈ ℂ. Thus 0 and 1
are the identity elements for complex addition and multiplication, respectively.
Definition. The complex number i = (0, 1) is called the imaginary number. Now
$i^2 = (0, 1) \cdot (0, 1) = (-1, 0) = -1$. We therefore think of i as the square root of −1.
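The definitions of the sum and product of ordered pairs translate directly into code. The following Python sketch is only an illustration; the function names `add` and `mul` are hypothetical, and the sample values are arbitrary.

```python
def add(z, w):
    """Sum of complex numbers represented as pairs: (x, y) + (a, b) = (x + a, y + b)."""
    (x, y), (a, b) = z, w
    return (x + a, y + b)

def mul(z, w):
    """Product of pairs: (x, y)(a, b) = (ax - by, ay + bx)."""
    (x, y), (a, b) = z, w
    return (a * x - b * y, a * y + b * x)

i = (0, 1)                       # the imaginary number
i_squared = mul(i, i)            # the pair (-1, 0), identified with -1
scaled = mul((2, 0), (3, 5))     # a real number x acts by (x, 0)(a, b) = (xa, xb)
product = mul((1, 2), (3, -4))   # agrees with Python's built-in complex product
```

The results can be compared against Python's built-in `complex` type, which follows the same multiplication rule.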
Armed with the imaginary number i, we now have a convenient and notationally
simple way to represent complex numbers. An arbitrary complex number z
can be written as z = (x, y) = (x, 0) + (0, y) = x + y(0, 1) = x + iy. With this way
of representing complex numbers, we can restate the definitions of complex
addition and multiplication as follows. For complex numbers z = x + iy and
w = a + ib, z + w = (x + a) + i(y + b) and zw = (ax − by) + i(ay + bx). Note that
the complex operations obey the same rules as the addition and multiplication of
linear polynomials, taking into account that $i^2 = -1$. Indeed, if we multiply out the
product of the binomials x + iy and a + ib according to the usual rules of algebra,
we obtain $(x + iy)(a + ib) = ax + iay + ibx + i^2 by = (ax - by) + (ay + bx)i$, which
is consistent with the original definition of complex multiplication. Now that
we have a convenient way of manipulating complex numbers, we can prove the
following theorem.
Proof. Most of the defining properties of a field are easy to verify. As a sample of the
calculations, we verify the following two properties:
Definition. For a complex number z = x + iy, x is called the real part of z, and y
is the imaginary part of z. We use the notation x = Re(z), and y = Im(z).
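(a) $\overline{z + w} = \overline{z} + \overline{w}$ and $\overline{zw} = \overline{z}\,\overline{w}$;
(b) $z + \overline{z} = 2\mathrm{Re}(z)$ and $z - \overline{z} = 2i\,\mathrm{Im}(z)$;
(c) $|\overline{z}| = |z|$ and $z\overline{z} = |z|^2$;
(d) $|\mathrm{Re}(z)| \le |z|$, $|\mathrm{Im}(z)| \le |z|$, and $|z| \le |\mathrm{Re}(z)| + |\mathrm{Im}(z)|$;
(e) $z^{-1} = \frac{\overline{z}}{|z|^2}$; and
(f) the triangle inequality $|z + w| \le |z| + |w|$.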
Proof. The proofs are mostly computational and are left to the reader to check. We
prove the triangle inequality below.
Note that $\bar{z}w$ is the conjugate of $z\bar{w}$; hence $z\bar{w} + \bar{z}w = 2\,\mathrm{Re}(z\bar{w}) \le 2|z\bar{w}| =
2|z||\bar{w}| = 2|z||w|$. Using this, we have $|z + w|^2 = (z + w)\overline{(z + w)} = z\bar{z} + z\bar{w} +
\bar{z}w + w\bar{w} \le |z|^2 + 2|z||w| + |w|^2 = (|z| + |w|)^2$. The result follows by taking the
square roots of the extreme sides of the above string of inequalities.
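The conjugation and modulus identities, together with the triangle inequality, can be spot-checked numerically. The sketch below uses Python's built-in complex type; the helper name `conjugate_properties_hold`, the sample values, and the tolerance are illustrative assumptions, and a numerical check of this kind is of course not a proof.

```python
def conjugate_properties_hold(z, w, tol=1e-9):
    """Return True if the conjugate and modulus identities hold for z, w up to tol."""
    checks = [
        abs((z + w).conjugate() - (z.conjugate() + w.conjugate())) <= tol,
        abs((z * w).conjugate() - z.conjugate() * w.conjugate()) <= tol,
        abs(z + z.conjugate() - 2 * z.real) <= tol,
        abs(z - z.conjugate() - 2j * z.imag) <= tol,
        abs(z * z.conjugate() - abs(z) ** 2) <= tol,
        abs(z.real) <= abs(z) + tol,
        abs(z.imag) <= abs(z) + tol,
        abs(z) <= abs(z.real) + abs(z.imag) + tol,
        abs(z + w) <= abs(z) + abs(w) + tol,  # triangle inequality
    ]
    return all(checks)


samples = [complex(3, 4), complex(-1, 2), complex(0, -5), complex(2.5, 0)]
print(all(conjugate_properties_hold(z, w) for z in samples for w in samples))
```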
Now that we have a measure of the length of a complex number, we have a measure
of the distance between two points in the complex plane. For complex numbers
z1 and z2 , the quantity |z1 − z2 | is exactly the Euclidean distance between z1 and
z2 . Now we can generalize many of the properties of subsets of the real line to
the complex plane. For example, a bounded subset of ℂ is a set A of complex
numbers such that sup{|z| ∶ z ∈ A} < ∞. For a complex number a, and a positive
real number 𝛿, the set {z ∈ ℂ ∶ |z − a| < 𝛿} is an open disk of radius 𝛿 and centered
at a. A point z ∈ ℂ is a limit point of a set A of complex numbers if every open
disk centered at z contains points of A other than z. We urge the reader to examine
the rest of the concepts we studied for real numbers and generalize them to the
complex field, whenever possible. One important distinction between ℝ and ℂ is
that there is no natural (or useful) way to order the complex field. We conclude the
section with the following theorem.
Proof. Let $z_n = x_n + iy_n$, where $(x_n)$ and $(y_n)$ are real sequences. If $(z_n)$ is Cauchy,
then, given 𝜖 > 0, there exists a positive integer N such that, for n, m > N,
$|z_n - z_m| < \epsilon$. It follows that the real sequences $(x_n)$ and $(y_n)$ are Cauchy
sequences, since $|x_n - x_m| \le |z_n - z_m|$ and $|y_n - y_m| \le |z_n - z_m|$. By the completeness
of ℝ, $(x_n)$ and $(y_n)$ converge to real numbers x and y, respectively. Clearly,
$(z_n)$ converges to z = x + iy because $|z_n - z| \le |x_n - x| + |y_n - y|$. We leave the
proof of the converse to the reader.
Exercises
1. Prove that the finite union of bounded subsets of ℝ is bounded, and give an
example to show that the conclusion is false for an infinite union of bounded
sets.
2. Prove that if A ⊆ ℝ is bounded below, then A has a greatest lower bound.
Hint: Define −A = {−x ∶ x ∈ A}. Show that inf A = −sup{−A}.
3. Prove that if $\lim_n a_n = a$ and $\lim_n b_n = b$, then $\lim_n (a_n \pm b_n) = a \pm b$ and that
$\lim_n (a_n b_n) = ab$.
4. Let $\lim_n b_n = b \neq 0$. Prove that there is a natural number N such that, for
all n > N, $|b_n| \ge |b|/2$. Hence prove that if, in addition, $\lim_n a_n = a$, then
$\lim_n \frac{a_n}{b_n} = \frac{a}{b}$.
5. Prove theorem 1.2.1.
24. (a) Show that the series $\sum_{n=0}^{\infty} \frac{z^n}{n!}$ is absolutely convergent for all z ∈ ℂ.
(b) Define $e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$. Show that, for all z, w ∈ ℂ, $e^z e^w = e^{z+w}$. Conclude
that, for n ∈ ℕ and z ∈ ℂ, $(e^z)^n = e^{nz}$. Hint: Recall that absolutely convergent
series can be multiplied term by term. The reader will recognize $e^z$ as
the complex exponential function.
25. (a) Show that, for 𝜃 ∈ ℝ, $e^{i\theta} = \cos\theta + i\sin\theta$. Hint: Recall that the terms of
an absolutely convergent series can be rearranged without affecting the sum
of the series.
(b) Show that if z is a nonzero complex number, then there is a unique
positive number r and a unique real number 𝜃 ∈ [0, 2𝜋) such that $z = re^{i\theta}$.
Hint: Write z = |z|w, where $w = z/|z|$. Note that |w| = 1.
26. Show that, for 𝜃 ∈ ℝ, $(\cos\theta + i\sin\theta)^n = \cos(n\theta) + i\sin(n\theta)$.
27. Let z be a nonzero complex number, and write $z = re^{i\theta}$. Show that, for n ≥ 2,
each of the numbers $\xi_k = r^{1/n} e^{i(\theta + 2\pi k)/n}$, 0 ≤ k ≤ n − 1, satisfies $\xi_k^n = z$. The
numbers $\xi_k$ are the nth roots of z.
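The formula for the nth roots in problem 27 can be explored numerically before attempting a proof. The sketch below uses the standard `cmath` module; the function name `nth_roots` is an illustrative choice, and floating-point arithmetic gives only approximate roots.

```python
import cmath
import math


def nth_roots(z, n):
    """Return the n numbers r**(1/n) * exp(i(theta + 2*pi*k)/n), k = 0, ..., n - 1."""
    r, theta = cmath.polar(z)      # z = r * exp(i*theta), theta in [-pi, pi]
    theta %= 2 * math.pi           # move the argument into [0, 2*pi)
    return [
        r ** (1.0 / n) * cmath.exp(1j * (theta + 2 * math.pi * k) / n)
        for k in range(n)
    ]


# Each cube root of -8 should satisfy xi**3 = -8 up to rounding error.
for xi in nth_roots(complex(-8, 0), 3):
    print(xi, abs(xi ** 3 - (-8)) < 1e-9)
```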
2
Set Theory
A false conclusion once arrived at and widely accepted is not easily dislodged
and the less it is understood the more tenaciously it is held.
Georg Cantor
In 1873 Cantor proved the countability of the set of rational numbers. He then
proved that the real numbers were uncountable and published the result in 1874.
It is in that paper that the idea of a one-to-one correspondence appeared for the
first time. He next pondered the question of whether the unit interval could be
put in a one-to-one correspondence with the unit square. He initially dismissed
the possibility and wrote that “the answer seems so clearly to be ‘no’ that proof
appears almost unnecessary.” When he did prove the result, he wrote to Dedekind
in 1877, “I see it, but I don’t believe it!” In a paper published in 1878, he made the
concept of one-to-one correspondence precise and discussed sets of equal power,
that is, sets which have equal cardinality.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0002
Between 1879 and 1884, Cantor published a series of six papers designed to
provide a basic introduction to set theory, and this is when he realized that his
work was not finding the acceptance that he had hoped. In fact, Cantor’s ideas
earned him the strong antagonism of Kronecker, among other mathematicians
and philosophers. Dedekind was sympathetic to Cantor’s work and in 1888 wrote
his article Was sind und was sollen die Zahlen [What are numbers and what should
they be], partially in defense of Cantor’s work.
Cantor’s last major papers on set theory appeared in 1895 and 1897, where he
had hoped, without success, to include a proof of the continuum hypothesis. He
did, however, succeed in formulating his theory of well-ordered sets and ordinal
numbers. It was also during those years that Cantor discovered the first paradoxes
of set theory.
Hilbert described Cantor’s work as “the finest product of mathematical genius and
one of the supreme achievements of purely intellectual human activity.”
Cantor’s personal life was not entirely a happy one. For more than thirty years,
Cantor was troubled with bouts of depression, and, in 1899, he suffered the death
of his youngest son. He spent the last year of his life confined to a sanatorium,
where he died of a heart attack.
Definition. Two sets A and B are equivalent if there is a bijection from A to B.1
We use the notation A ≈ B to indicate the equivalence of A and B.
Example 3. The closed interval [0, 1] is equivalent to the open interval (0, 1).
Define a function f ∶ [0, 1] → (0, 1) as follows:
$f(x) = \begin{cases} 1/2 & \text{if } x = 0, \\ 1/(n + 2) & \text{if } x = 1/n,\ n \in \mathbb{N}, \\ x & \text{otherwise.} \end{cases}$
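To see how the function of example 3 shifts the points 0, 1, 1/2, 1/3, … while fixing everything else, one can evaluate it on rational inputs. The sketch below uses exact rational arithmetic from the standard `fractions` module; it only handles rational x, which is an assumption of the illustration (irrational points are fixed by f in any case).

```python
from fractions import Fraction


def f(x):
    """The map [0, 1] -> (0, 1) of example 3, evaluated on a rational x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(1, 2)
    if x.numerator == 1:                 # x = 1/n for some positive integer n
        return Fraction(1, x.denominator + 2)
    return x                             # every other point is fixed


print(f(0), f(1), f(Fraction(1, 2)), f(Fraction(2, 3)))  # 1/2 1/3 1/4 2/3
```

The endpoints 0 and 1 are sent to 1/2 and 1/3, and each 1/n is pushed two places down the sequence 1, 1/2, 1/3, …, which is why no value is hit twice.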
(a) A ≈ A.
(b) If A ≈ B, then B ≈ A.
(c) If A ≈ B and B ≈ C, then A ≈ C.
Theorem 2.1.2.
(a) A proper subset B of ℕn is finite, and Card(B) = m for some m < n.
(b) A proper subset B of a finite set A is finite, and Card(B) < Card(A).
(c) If m, n ∈ ℕ and m < n, then there is no injection from ℕn to ℕm .
(d) A finite set is not equivalent to any of its proper subsets.
(e) ℕ is infinite.
$f(x) = \begin{cases} g(x) & \text{if } x \in \mathbb{N}_m, \\ n + 1 & \text{if } x = m + 1. \end{cases}$
(d) Suppose B is a proper subset of a finite set A, and let n = Card(A), and m =
Card(B). By part (b), m < n. Let g ∶ B → ℕm and h ∶ ℕn → A be bijections. If
there is a bijection f ∶ A → B, then g ∘ f ∘ h would be an injection from ℕn to ℕm .
This contradicts part (c).
(e) If, for some positive integer m, there exists a bijection f ∶ ℕ → ℕm , then, for
any integer n > m, the restriction of f to ℕn would be an injection from ℕn into
ℕm . This contradicts part (c) and completes the proof.
Proof. Suppose A is infinite. First we show that, for n ∈ ℕ, A contains a set of exactly
n elements. The proof is inductive, and here is the inductive step: Having found a
subset {a1 , … , an } of A containing exactly n elements, we pick an element an+1 ∈
A − {a1 , … , an }. Such an an+1 exists because otherwise A would be equal to
{a1 , … , an }, which is finite. The set {a1 , … , an+1 } has exactly n + 1 elements.
For n = 0, 1, 2, …, let $B_n$ be a subset of A of exactly $2^n$ elements. Define $C_0 = B_0$,
and, for n ∈ ℕ, let $C_n = B_n - \bigcup_{i=0}^{n-1} B_i$. Now $\mathrm{Card}(\bigcup_{i=0}^{n-1} B_i) \le \sum_{i=0}^{n-1} 2^i = 2^n - 1$.
Hence $\mathrm{Card}(C_n) \ge 2^n - (2^n - 1) = 1$. Thus the sets $C_n$ are disjoint and nonempty.
We choose an element cn from each Cn , and we obtain a sequence of distinct
elements of A. The converse is true because ℕ is infinite.
Theorem 2.1.5. A set A is infinite if and only if it is equivalent to one of its proper
subsets.
Example 5. The closed interval [0, 1] is infinite because it is equivalent to its subset
(0, 1). (See example 3.)
Proof. We enumerate ℕ × ℕ recursively as follows: f (1) = (1, 1), and once f (n) has
been defined, say, f (n) = (a, b), define
$f(n + 1) = \begin{cases} (a - 1, b + 1) & \text{if } a > 1, \\ (b + 1, 1) & \text{if } a = 1. \end{cases}$
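The recursive enumeration of ℕ × ℕ can be traced step by step. The following sketch is a direct transcription of the recursion as a Python generator; the name `enumerate_pairs` is an illustrative choice.

```python
from itertools import islice


def enumerate_pairs():
    """Yield f(1), f(2), ... where f(1) = (1, 1) and f(n + 1) follows the recursion."""
    a, b = 1, 1
    while True:
        yield (a, b)
        if a > 1:
            a, b = a - 1, b + 1    # move one step along the current diagonal
        else:
            a, b = b + 1, 1        # start the next diagonal


print(list(islice(enumerate_pairs(), 6)))
# [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3)]
```

The enumeration sweeps the diagonals a + b = 2, 3, 4, … in turn, so every pair is reached exactly once.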
We claim that $B = \{a_{n_k} : k \in \mathbb{N}\}$. If not, then there exists an element b ∈ B such
that $b \neq a_{n_k}$ for all k ∈ ℕ. Now b ∈ A, so $b = a_n$ for some positive integer n. By
assumption, $n \neq n_k$ for all k ∈ ℕ. Because $(n_k)$ is a strictly increasing sequence of
positive integers, there are two possibilities: either $n < n_1$ or there is a unique k ∈ ℕ
such that $n_k < n < n_{k+1}$. The former possibility would contradict the definition of
$n_1$, and the latter possibility contradicts the definition of $n_{k+1}$. This shows that
$B = \{a_{n_k} : k \in \mathbb{N}\}$.
Proof. Let $\{A_n\}$ be a countable collection of countable sets, and let $A = \bigcup_{n=1}^{\infty} A_n$.
Write $A_n = \{a_{n1}, a_{n2}, \ldots\}$. Define $f : \mathbb{N} \times \mathbb{N} \to \bigcup_{n=1}^{\infty} A_n$ by $f(m, n) = a_{mn}$. Clearly, f
is onto. By theorem 2.1.6, there exists a bijection g ∶ ℕ → ℕ × ℕ. The composition
f ∘ g maps ℕ onto A. By theorem 2.1.9, A is countable.
Proof. Use theorem 2.1.10 and the facts that $\mathbb{Z} = \mathbb{N} \cup \{0\} \cup -\mathbb{N}$ and
$\mathbb{Q} = \bigcup_{n=1}^{\infty} \{\frac{m}{n} : m \in \mathbb{Z}\}$.
n
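Proof. Recall that $2^{\mathbb{N}}$ is the set of all functions from ℕ to {0, 1} (binary sequences).
Suppose, for a contradiction, that $2^{\mathbb{N}}$ is countable. Then $2^{\mathbb{N}} = \{x_1, x_2, \ldots\}$, where
each $x_i$ is a binary sequence, say, $x_i = (x_{i1}, x_{i2}, \ldots)$, and each $x_{ij}$ is 0 or 1. The binary
sequence $y = (y_1, y_2, \ldots)$, where
$y_i = \begin{cases} 0 & \text{if } x_{ii} = 1, \\ 1 & \text{if } x_{ii} = 0, \end{cases}$
differs from each $x_i$ in the ith term; hence y is not in the list $\{x_1, x_2, \ldots\}$, which is a contradiction.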
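The diagonal construction can be illustrated on a finite truncation: given n binary sequences, each cut off after n terms, flipping the diagonal entries produces a sequence that differs from the ith one in position i. The sketch below is only a finite analogue of the argument, and the name `diagonal_flip` is an illustrative choice.

```python
def diagonal_flip(rows):
    """Given n binary rows of length at least n, return y with y[i] = 1 - rows[i][i]."""
    return [1 - rows[i][i] for i in range(len(rows))]


rows = [
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
y = diagonal_flip(rows)
print(y)  # [1, 0, 1, 0]
# y disagrees with row i in position i, so y is none of the listed rows.
print(all(y[i] != rows[i][i] for i in range(len(rows))))
```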
Proof. Pick two distinct elements $a_0$ and $a_1$ in A. By the previous theorem, $\{a_0, a_1\}^{\mathbb{N}}$
is uncountable. Consequently, $A^{\mathbb{N}}$ is uncountable because it contains $\{a_0, a_1\}^{\mathbb{N}}$. See
problem 4 at the end of this section.
Proof. Let T be the set of binary sequences which contain only a finite number
of nonzero terms. Each of the sets $T_n$ of binary sequences of length n is
finite, and T is equivalent to $\bigcup_{n=1}^{\infty} T_n$. Therefore T is countable. It follows that
Remark. It is easy to see that the function f in the above proof is onto and hence
A ≈ (0, 1]. See problem 11 at the end of this section.
Exercises
The axiom of choice is one of the most useful tools in set theory. Although
it is easy to state and widely accepted, the axiom of choice has also generated
much controversy among mathematicians. In this section, we study the axiom
of choice and its most famous and widely applicable equivalent: Zorn’s lemma,
which is an indispensable tool in this book. The section and the section exercises
contain typical but illuminating illustrations of how Zorn’s lemma is applied. In
this section, we also study partially ordered, linearly ordered, and well-ordered sets
and establish results such as the Schröder-Bernstein theorem, which will help us
study cardinal numbers in the next section. Although ordinal numbers have been
avoided in this book, the section exercises are largely focused on well-ordered sets.
(a) x ≤ x,
(b) if x ≤ y and y ≤ z, then x ≤ z, and
(c) if x ≤ y and y ≤ x, then x = y.
Example 1. Let A = 𝒫(ℕ). Order A by set inclusion. Thus if S and T are subsets
of ℕ, then S ≤ T means that S ⊆ T. Set inclusion is a partial ordering of A. It is
not total because if S and T are subsets of ℕ, it need not be the case that T ⊆ S
or S ⊆ T. The set {ℕn ∶ n ∈ ℕ} is a chain in A.
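A finite analogue of example 1 makes the distinction between a partial and a total ordering concrete. The sketch below orders the subsets of {1, 2, 3} by inclusion and assumes that ℕn denotes {1, …, n}; the helper names `power_set` and `is_chain` are illustrative.

```python
from itertools import combinations


def power_set(universe):
    """Return all subsets of a finite set as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_chain(sets):
    """True if every two members are comparable under set inclusion."""
    return all(s <= t or t <= s for s in sets for t in sets)


P = power_set({1, 2, 3})
initial_segments = [frozenset(range(1, n + 1)) for n in range(1, 4)]  # {1}, {1,2}, {1,2,3}

print(is_chain(P))                 # False: {1} and {2} are not comparable
print(is_chain(initial_segments))  # True
```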
Example 3. ℕ is well ordered with the usual ordering of the real numbers. We
often use the well ordering of ℕ without explicit mention; see, for example, the
proof of theorem 2.1.7.
[Figure 2.2]
Example 4. Let A = ℕ ∪ {𝜔}, where 𝜔 is any object not in ℕ. The ordering on the
subset ℕ of A is the natural ordering of the integers. We define n < 𝜔 for all
n ∈ ℕ. Thus we simply define 𝜔 to be the largest element of A. The set (A, ≤) is
well ordered.
We now state the well ordering principle, which is really an axiom. It simply states
that any set can be well ordered.
The well ordering principle: given a nonempty set A, there exists a well ordering
on A.
The axiom of choice: if {X𝛼 }𝛼∈I is a nonempty collection of nonempty sets, then
∏𝛼 X𝛼 is nonempty.
The axiom of choice is perhaps the most believable axiom of set theory. However,
it is neither a simple fact nor obvious. In fact, the axiom of choice is equivalent to
the well ordering principle and to the following axiom, which is less intuitive than
the well ordering principle or the axiom of choice:
Zorn’s lemma: if A is a partially ordered set such that every chain in A has an
upper bound, then A contains a maximal element.
Theorem 2.2.1. The axiom of choice, Zorn’s lemma, and the well ordering principle
are all equivalent.
Theorem 2.2.2. Let A and B be nonempty sets. Then there is an injection from A to
B or an injection from B to A.
Proof. Let 𝔅 be the collection of all injective functions f such that Dom( f ) ⊆ A
and ℜ( f ) ⊆ B. Let { f𝛼 ∶ 𝛼 ∈ I} be an indexing of 𝔅 and, for 𝛼 ∈ I, write A𝛼 =
Dom( f𝛼 ), and B𝛼 = ℜ( f𝛼 ). Partially order 𝔅 as follows: f𝛼 ≤ f𝛽 if f𝛽 extends f𝛼 .
More explicitly, f𝛼 ≤ f𝛽 means that A𝛼 ⊆ A𝛽 , B𝛼 ⊆ B𝛽 , and the restriction of f𝛽 to
A𝛼 is f𝛼 . Clearly, ≤ is a partial ordering of 𝔅. Now let ℭ be a chain in 𝔅, and index
ℭ by a subset J of I; ℭ = { f𝛼 ∶ 𝛼 ∈ J }. We show that ℭ has an upper bound: let
A𝛼 = Dom( f𝛼 ), B𝛼 = ℜ( f𝛼 ), S = ∪𝛼∈J A𝛼 , and T = ∪𝛼∈J B𝛼 . Define f ∶ S → T as
follows: if x ∈ S, choose a set A𝛼 that contains x, and let f (x) = f𝛼 (x). The function
f is well defined because ℭ is a chain. Specifically, if x ∈ A𝛼 ∩ A𝛽 (𝛼, 𝛽 ∈ J), then
f𝛼 ≤ f𝛽 or f𝛽 ≤ f𝛼 ; say, the former. Since f𝛽 extends f𝛼 , f𝛼 (x) = f𝛽 (x). We leave
it to the reader to verify that f is an injection. Clearly, f is an upper bound of ℭ.
By Zorn’s lemma, 𝔅 has a maximal element, say, f1 . Write A1 = Dom( f1 ) and
B1 = ℜ( f1 ). If A1 = A, then f1 is an injection from A into B. If B1 = B, then $f_1^{-1}$ is
an injection from B to A, and the proof is complete. We now show that A1 ≠ A
and B1 ≠ B cannot occur simultaneously. If that were the case, pick elements
a ∈ A − A1 , b ∈ B − B1 , and extend f1 to a function f ∶ A1 ∪ {a} → B1 ∪ {b} by
defining f (a) = b and f |A1 = f1 . Clearly, f is a strict extension of f1 , and this
contradicts the maximality of f1 .
$h(x) = \begin{cases} g(x) & \text{if } x \in C, \\ x & \text{if } x \in D. \end{cases}$
Example 6. Any two open disks in the plane are equivalent. Let $0 < r_1 < r_2$,
and consider the disks $D_1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < r_1^2\}$ and $D_2 = \{(x, y) \in \mathbb{R}^2 :
x^2 + y^2 < r_2^2\}$. Choose a number a such that $0 < a < r_1$. The function $f : D_2 \to D_1$
defined by $f(x, y) = \frac{a}{r_2}(x, y)$ is an injection, as the reader can easily verify. By
the previous lemma, $D_1 \approx D_2$. In a similar manner, one can prove that any two
open squares are equivalent.
Example 7. An open square is equivalent to an open disk. Let $S = \{(x, y) : |x| <
2, |y| < 2\}$, let $D_3 = \{(x, y) : x^2 + y^2 < 9\}$, and let $D_1 = \{(x, y) : x^2 + y^2 < 1\}$.
Observe that D1 ⊆ S ⊆ D3 . Set inclusion is clearly an injection from S into
D3 . By example 6, there is an injection from D3 into D1 , and hence from D3 into
S. By the Schröder-Bernstein theorem, S ≈ D3 .
Exercises
1. Let A be a partially ordered set and let S ⊆ A. State the definition of each of
the following terms: the least element of S, a minimal element of S, a lower
bound of S, and the greatest lower bound of S.
6. Let A be linearly ordered and let a, b ∈ A. Show that S(a) = S(b) if and only
if a = b.
7. Prove that if every segment of a linearly ordered set A is well ordered, then
A is well ordered.
8. Let A be a well-ordered set, and let B be a proper subset of A with the
property that the conditions b ∈ B and c < b imply that c ∈ B. Prove that
B is a segment of A.
9. Let A be a well-ordered set, and let B be a proper subset of A such that, for
every b ∈ B and for every a ∈ A − B, b < a. Prove that B is a segment of A.
10. The principle of transfinite induction. Suppose that A is a well-ordered
set and that ∅ ≠ B ⊆ A is such that whenever S(x) ⊆ B, x ∈ B. Prove that
B = A.
11. Suppose that A is a well-ordered set and that B ⊆ A. Prove that either
∪x∈B S(x) = A, or ∪x∈B S(x) is an initial segment of A.
12. Prove that there exists an uncountable, well-ordered set Ω such that every
initial segment of Ω is countable.
13. Let Ω be as in the previous problem. Prove that every countable subset of
Ω has an upper bound.
14. Give a direct proof of the fact that Zorn’s lemma implies the axiom of
choice. Hint: Let {X𝛼 }𝛼∈I be a nonempty collection of nonempty sets, and let
𝔅 = {(J, g) ∶ J ⊆ I, g ∈ ∏𝛼∈J X𝛼 }, that is, g is a choice function on {X𝛼 }𝛼∈J .
The set 𝔅 ≠ ∅ because finite subsets J of I generate such functions. Partially
order 𝔅 as follows: (J1 , g1 ) ≤ (J2 , g2 ) if J1 ⊆ J2 and g2 extends g1 .
15. Let A be a linearly ordered set. Is the union of a collection of well-ordered
subsets of A necessarily a well-ordered set?
16. The Hausdorff maximal principle. Every partially ordered set contains a
maximal chain, that is, a chain that is not properly contained in any other
chain.
Prove that the Hausdorff maximal principle is equivalent to Zorn’s
lemma.
Hint: To prove that the Hausdorff maximal principle implies Zorn’s lemma,
let (X, ≤) be a partially ordered set that satisfies the conditions of Zorn’s
lemma. Let C be a maximal chain, and let x be an upper bound of C. To
prove the converse, let ℭ be the collection of all chains in X, and order ℭ by
set inclusion. Verify that the conditions of Zorn’s lemma are met and hence
ℭ contains a maximal member, that is, a maximal chain in X.
17. Prove that any open disk is equivalent to any closed disk.
18. Prove that any open square is equivalent to any closed square.
In section 2.1 we took a small step toward showing that infinite sets are not created
equal. In this section, we show that there are infinitely many types of infinities,
in the sense that there is a whole cascade (loosely speaking) of infinite sets of
unequal sizes, or cardinalities. This is the first result in the section. Our approach
to infinite cardinals is intuitive rather than axiomatic. We proceed to show that the
set of integers is the smallest infinite set, then we prove that a set of infinite sets is
well ordered by size, or cardinality. Only an intuitive understanding of cardinal
numbers is essential for subsequent material that makes reference to cardinality.
Thus the discussion of cardinal arithmetic and sums of infinitely many cardinals
can be omitted on the first reading if the goal is to take the fastest route to chapter 4.
Definition. Let A and B be nonempty sets. We say that A and B have the same
cardinality if A ≈ B. We also say that A and B define the same cardinal number,
and we write Card(A) = Card(B).
Proof. Recall that the notation 𝒫(A) stands for the power set of A. Define a func-
tion f ∶ A → 𝒫(A) by f (x) = {x}. Clearly, f is one-to-one; therefore, Card(A) ≤
Card(𝒫(A)). If Card(A) = Card(𝒫(A)), then there exists a bijection g ∶ A →
𝒫(A). Define S = {x ∈ A ∶ x ∉ g(x)}. Since g is onto, let a be such that g(a) = S. If
a ∈ S, then, by the definition of S, a ∉ S. If a ∉ S, then again, by the definition of
S, a ∈ S. This contradiction completes the proof.
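For a small finite set, the argument above can be checked exhaustively: for every function g ∶ A → 𝒫(A), the set S = {x ∈ A ∶ x ∉ g(x)} is missing from the range of g. The sketch below runs through all 512 such functions on a three-element set; the helper names are illustrative, and the check is a sanity test of the construction, not a substitute for the proof.

```python
from itertools import combinations, product


def power_set(A):
    """Return all subsets of a finite collection as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def cantor_set(A, g):
    """S = {x in A : x not in g(x)} for a function g: A -> P(A) given as a dict."""
    return frozenset(x for x in A if x not in g[x])


def no_surjection_exists(A):
    """Check that, for every g: A -> P(A), the set S is not in the range of g."""
    A = list(A)
    subsets = power_set(A)
    for images in product(subsets, repeat=len(A)):
        g = dict(zip(A, images))
        if cantor_set(A, g) in images:
            return False
    return True


print(no_surjection_exists({0, 1, 2}))  # True
```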
The reader may have observed that while our definition of what it means for
two sets to have the same cardinality is unambiguous, we have not really defined
what a cardinal number is. We can give a slightly more tangible definition of a
set of cardinal numbers as follows: Let 𝔖 be a set of sets. By theorem 2.1.1, set
equivalence is an equivalence relation on 𝔖. We can define the cardinal numbers
in 𝔖 to be equivalence classes of set equivalence in 𝔖. This does not define all
cardinal numbers because if A is a set that is not equivalent to any set in 𝔖,2 then
Card(A) ≠ Card(S) for all S ∈ 𝔖.
One might be tempted to generalize the idea of the last paragraph by considering
the set of all sets, instead of a fixed set of sets. However, within the limitations
of naïve set theory, this is paradoxical for the following reason: If we were
allowed to use terms such as the set of all sets, 𝔖, let U = ∪{S ∶ S ∈ 𝔖}. Since
U contains every S ∈ 𝔖, Card(S) ≤ Card(U). Since 𝔖 contains all sets, Card(U)
would be the largest cardinal number. This is a paradox because, by theorem 2.3.1,
Card(𝒫(U)) > Card(U). Such paradoxes can be avoided in an axiomatic treatment
of set theory. Such a treatment is hardly essential for our purposes because we will
never refer to cardinal numbers as an absolute concept. We will be content to
think of cardinal numbers as a comparative measure of the size of sets in the sense
of the opening definition of this section.
n = Card(ℕn )
ℵ0 = Card(ℕ)
𝔠 = Card(ℝ)
The natural numbers are the finite cardinals, and all other cardinals are infinite.
Proof. Let a, b ∈ 𝔖 and let A and B be sets such that a = Card(A), and b = Card(B).
By theorem 2.2.2, there is an injection from A to B or one from B to A. Thus
a ≤ b or b ≤ a. To check antisymmetry, suppose that a ≤ b ≤ a. Then there is an
injection from A to B and one from B to A. By the Schröder-Bernstein theorem,
A ≈ B and a = b.
2 Such a set A exists. One can take A to be the power set of ∪{S ∶ S ∈ 𝔖}.
The following theorem establishes the fact that any set of cardinal numbers is well
ordered.
Theorem 2.3.4. If 𝔖 = {𝜉𝛼 }𝛼∈I is a set of cardinal numbers, then there is an element
𝛼0 ∈ I such that 𝜉𝛼0 ≤ 𝜉𝛼 for all 𝛼 ∈ I.
Proof. Let {X𝛼 }𝛼∈I be sets such that 𝜉𝛼 = Card(X𝛼 ). If 𝔖 contains any integers, the
smallest of these integers is the least cardinal in 𝔖. Otherwise, all the sets X𝛼
are infinite. We prove that there is 𝛼0 ∈ I such that, for every 𝛼 ∈ I, there is an
injection f𝛼 ∶ X𝛼0 → X𝛼 .
Let X = ∏𝛼∈I X𝛼 , and let 𝔅 be the collection of subsets B of X with the property
that if x = (x𝛼 ) and y = (y𝛼 ) are distinct elements of B, then x𝛼 ≠ y𝛼 for all 𝛼 ∈ I.
Order 𝔅 by set inclusion. It is clear that if ℭ is a chain in 𝔅, then ℭ has an upper
bound, namely, ∪{C ∶ C ∈ ℭ}. By Zorn’s lemma, 𝔅 has a maximal member, B.
We claim that, for some 𝛼0 ∈ I, 𝜋𝛼0 (B) = X𝛼0 . If this is not the case, then, for
each 𝛼 ∈ I, choose an element a𝛼 ∈ X𝛼 − 𝜋𝛼 (B) and let a = (a𝛼 ). The set B ∪ {a}
is clearly in 𝔅, which contradicts the maximality of B and shows that, for some
𝛼0 ∈ I, 𝜋𝛼0 (B) = X𝛼0 .
Now, for each a ∈ X𝛼0 , there is a unique element x ∈ B such that 𝜋𝛼0 (x) = a.
Such x exists because 𝜋𝛼0 is onto, and it is unique by the definition of 𝔅 and the
fact that B ∈ 𝔅. Now define f𝛼 ∶ X𝛼0 → X𝛼 as follows: f𝛼 (a) = 𝜋𝛼 (x), where x is
the element of B constructed above.3 By construction, f𝛼 is an injection.
Cardinal Arithmetic
The above operations are well defined in the sense that they are independent of
the particular sets A and B chosen to represent a and b. For example, if A ≈ C and
B ≈ D, then A × B ≈ C × D and $A^B \approx C^D$. See the exercises on section 2.1.
$g(x) = \begin{cases} f(x) & \text{if } x \in A, \\ x & \text{if } x \in C. \end{cases}$
1. a + b = b + a and ab = ba.
2. a + (b + c) = (a + b) + c and a(bc) = (ab)c.
3. a(b + c) = ab + ac.
4. $a^b a^c = a^{b+c}$.
5. $a^c b^c = (ab)^c$.
6. $(a^b)^c = a^{bc}$.
7. If a ≤ b, then a + c ≤ b + c.
8. If a ≤ b, and c ≥ 1, then ac ≤ bc.
Proof. Most of the rules of cardinal arithmetic are obvious. We prove property 6,
as an example. Let A, B, and C be such that a = Card(A), b = Card(B) and
c = Card(C). We need to show that $(A^B)^C$ is equivalent to $A^{B \times C}$. Let $f \in (A^B)^C$.
Then for c ∈ C, f (c) is a function from B to A. We write $f_c$ instead of f (c). For such
an f, define a function 𝜙( f ) = g ∶ B × C → A by $g(b, c) = f_c(b)$. The assignment
𝜙 ∶ f ↦ g maps $(A^B)^C$ to $A^{B \times C}$.
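For finite sets, the assignment 𝜙 can be written out and tested directly. In the sketch below, functions are stored as Python dictionaries, and the sets A, B, C are small illustrative choices; the check counts the distinct images of 𝜙 and compares the count with the number of elements on each side.

```python
from itertools import product


def all_functions(domain, codomain):
    """Return every function domain -> codomain, each stored as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values)) for values in product(codomain, repeat=len(domain))]


def phi(f, B, C):
    """Send f in (A^B)^C, stored as {c: {b: a}}, to g in A^(B x C) with g(b, c) = f_c(b)."""
    return {(b, c): f[c][b] for b in B for c in C}


A, B, C = [0, 1], ["p", "q"], ["u", "v", "w"]
inner = all_functions(B, A)        # the 4 elements of A^B
outer = all_functions(C, inner)    # the 64 elements of (A^B)^C
images = {frozenset(phi(f, B, C).items()) for f in outer}

# phi is injective on a 64-element set and lands in A^(B x C), which has 2**6 = 64 elements.
print(len(inner), len(outer), len(images))  # 4 64 64
```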
Proof. Let A be such that Card(A) = a, and let 𝔅 = {(A𝛼 , f𝛼 )}𝛼∈I be the collection
of all bijections f𝛼 ∶ A𝛼 → A𝛼 × A𝛼 , where A𝛼 ⊆ A. To see that 𝔅 ≠ ∅, pick a
countable subset G of A. By theorem 2.1.6, G ≈ G × G; hence 𝔅 ≠ ∅.
Order 𝔅 as follows: for 𝛼 and 𝛽 ∈ I, (A𝛼 , f𝛼 ) ≤ (A𝛽 , f𝛽 ) if A𝛼 ⊆ A𝛽 , and f𝛽
extends f𝛼 . If ℭ = {(A𝛼 , f𝛼 )}𝛼∈J is a chain in 𝔅, let C = ∪𝛼∈J A𝛼 and define a
function f ∶ C → C × C by f (x) = f𝛼 (x), where 𝛼 ∈ J is such that x ∈ A𝛼 . The
function f is a well-defined bijection from C → C × C, as the reader can verify.
Clearly, (C, f ) extends every member in ℭ and hence is an upper bound of
ℭ. By Zorn’s lemma, 𝔅 contains a maximal member, (C, g). We claim that
Card(C) = a. Suppose for a contradiction that Card(C) = b < a. First observe
that b ≤ b + b ≤ b.b = Card(C × C) = Card(C) = b, and hence b + b = b.b = b.
Now let d = Card(A − C). If d ≤ b, then a = b + d ≤ b + b = b, which contradicts
the supposition that b < a. Therefore b < d, and A − C contains a subset E such
that Card(E) = b.
Now (C ∪ E) × (C ∪ E) = (C × C) ∪ K, where K = (C × E) ∪ (E × C) ∪ (E × E).
Since K is the disjoint union of three sets each of cardinality b, Card(K) = b +
b + b = (b + b) + b = b + b = b. Therefore there is a bijection h ∶ E → K. Now
define a function f ∶ C ∪ E → (C × C) ∪ K = (C ∪ E) × (C ∪ E) by
$f(x) = \begin{cases} g(x) & \text{if } x \in C, \\ h(x) & \text{if } x \in E. \end{cases}$
Clearly, the pair (C ∪ E, f ) ∈ 𝔅 is a strict extension of (C, g), which contradicts the
maximality of (C, g). This shows that the supposition b < a is false; hence a = b.
This concludes the proof because a.a = Card(C × C) = Card(C) = a.
Proof. a ≤ ab ≤ a.a = a.
Proof. a ≤ a + b ≤ a + a = a.
Definition. Let {a𝛼 }𝛼∈I be a set of cardinal numbers, and let {A𝛼 } be a collection
of disjoint sets such that Card(A𝛼 ) = a𝛼 . Define ∑𝛼∈I a𝛼 = Card(∪𝛼∈I A𝛼 ).
Theorem 2.3.10. Let {a𝛼 }𝛼∈I be a collection of equal cardinal numbers, say, a𝛼 = a
and let b = Card(I). Then ∑𝛼∈I a𝛼 = ab.
Proof. Let A be such that Card(A) = a, and let {A𝛼 } be a collection of disjoint
sets such that Card(A𝛼 ) = a. Then there are bijections f𝛼 ∶ A → A𝛼 . Define a
function f ∶ A × I → ∪𝛼∈I A𝛼 by f (x, 𝛼) = f𝛼 (x). Verifying that f is a bijection is
straightforward. Therefore ∑𝛼∈I a𝛼 = Card(∪𝛼∈I A𝛼 ) = Card(A × I) = ab.
Proof. Let {A𝛼 } be a collection of disjoint sets such that Card(A𝛼 ) = a𝛼 , and let
{B𝛼 } be a collection of disjoint sets such that Card(B𝛼 ) = b𝛼 . By assumption,
there exist injections f𝛼 ∶ A𝛼 → B𝛼 . Define a function f ∶ ∪𝛼∈I A𝛼 → ∪𝛼∈I B𝛼 by
f (x) = f𝛼 (x) if x ∈ A𝛼 . The function f is well defined because {A𝛼 } is a disjoint
family. Clearly, f is an injection from ∪𝛼∈I A𝛼 into ∪𝛼∈I B𝛼 .
Theorem 2.3.12. Let I be an infinite set, and let b = Card(I). Then the family 𝔉 of
finite sequences in I has cardinality b.
Proof. By definition, $2^{\aleph_0} = \mathrm{Card}(2^{\mathbb{N}})$. Let T be the set of all binary sequences that
contain only a finite number of nonzero terms. By problem 6 on section 2.1,
$\mathrm{Card}(2^{\mathbb{N}} - T) = \mathrm{Card}(2^{\mathbb{N}}) = 2^{\aleph_0}$. By the proof of theorem 2.1.15 (see also problem 11
on section 2.1), $2^{\mathbb{N}} - T \approx (0, 1] \approx \mathbb{R}$. Thus $\mathfrak{c} = \mathrm{Card}(\mathbb{R}) = \mathrm{Card}((0, 1]) =
\mathrm{Card}(2^{\mathbb{N}} - T) = 2^{\aleph_0}$.
Example 6. $\mathfrak{c}^{\aleph_0} = \mathfrak{c}$.
Using theorems 2.3.6 and 2.3.13, $\mathfrak{c}^{\aleph_0} = (2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0 \aleph_0} = 2^{\aleph_0} = \mathfrak{c}$.
Take $a_1 = \mathfrak{c}$. By the previous example, $a_1^{\aleph_0} = a_1$. For n ≥ 1, define $a_{n+1} = 2^{a_n}$.
The generalized continuum hypothesis states that, for any infinite cardinal a,
there is no cardinal number b such that $a < b < 2^a$, that is, $2^a$ is the immediate
successor of a.
Exercises
2. Prove that if a ≥ 2 is a cardinal number, then a + a ≤ a.a. This result was used
in the proof of theorems 2.3.6 and 2.3.8.
3. Let a and b be infinite cardinal numbers. Prove that if a + a = a + b, then
a ≥ b.
4. Let a, b, and c be infinite cardinal numbers. Prove that if a + b < a + c, then
b < c.
5. What is $\aleph_0^{\aleph_0}$?
6. Prove that $\sum_{n=1}^{\infty} n = \aleph_0$.
7. Let A and B be infinite sets and let f ∶ A → B be a surjection.
(a) Prove that Card(A) ≥ Card(B).
(b) Prove that if $f^{-1}(b)$ is at most countable for each b ∈ B, then A ≈ B.
8. Let {A𝛼 }𝛼∈I be a family of nonempty sets. Prove that Card(∪𝛼∈I A𝛼 ) ≤
∑𝛼∈I Card(A𝛼 ).
3
Vector Spaces
Peano was born in a farmhouse about 5 km from Cuneo, where he received his
early education. One of Peano’s uncles was a priest and a lawyer in Turin, and
he realized the child’s talent. He took him to Turin in 1870 for his secondary
schooling. Peano entered the University of Turin in 1876, graduated in 1880 as a
doctor of mathematics, and was appointed to the university the same year. He
received his qualification to be a university professor in 1884.
In 1886 Peano proved the existence of the solution of the differential equation
dy/dx = f (x, y) under the mere assumption that f is continuous in the neigh-
borhood of the initial point (x0 , y0 ). In 1888 he published the book Geometrical
Calculus, which begins with a chapter on mathematical logic. A significant feature
of the book is that, in it, Peano sets out with great clarity the ideas of Grassmann,
who made the first attempt to define a vector space, albeit in a rather obscure way.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0003
This book contains the first definition of a vector space given with a remarkably
modern notation and style. This was, without a doubt, a big development in the
history of mathematics. In 1889 Peano published an axiomatic approach to the
definition of the natural numbers that was based on the notion of the successor
function. In 1890 he made the stunning discovery that there are continuous
surjective mappings from [0,1] onto the unit square, which came to be known as
space-filling curves.
Peano’s career was strangely divided into two periods. The period up to 1900
is one where he showed great originality and a remarkable feel for topics that
would be important in the development of mathematics. His achievements were
outstanding, and he had a modern style quite ahead of his own time. However,
this feel for what was important seemed to leave him, and, after 1900, he worked
with great enthusiasm on two projects of great difficulty, which were enormous
undertakings but proved quite unimportant in the development of mathematics.
From around 1892, Peano embarked on a new and extremely ambitious project,
namely, the Formulario Mathematico. As he explained:1
Even before the Formulario Mathematico project was completed, Peano took up the
project of finding an international, artificial language, “Latino sine flexione,” which
was based on Latin but stripped of all grammar. He compiled the vocabulary by
taking words from English, French, German, and Latin. In fact, the final edition of
the Formulario Mathematico was written in Latino sine flexione, which is another
reason the work was so little used.
This section is a summary of the most basic concepts of vector space theory. The
main reason for including this section is to establish terminology and provide a
collection of important examples. The reader should pay particular attention to
the examples, because the sequence and function spaces we introduce here are of
fundamental importance for the rest of the book. The theorems are stated without
proof.
(a) u + v = v + u.
(b) u + (v + w) = (u + v) + w.
(c) There is an element 0 ∈ U (the zero vector) such that u + 0 = u.
(d) For every u ∈ U, there is an element −u ∈ U such that u + (−u) = 0.
(e) a(u + v) = au + av.
(f) (a + b)u = au + bu.
(g) (ab)u = a(bu).
(h) 1.u = u.
The field 𝕂 is called the base field, the elements of 𝕂 are referred to as scalars, and
the elements in U are called vectors. The only two fields we will use in this book are
the real field, ℝ, and the complex field, ℂ. Either of these two fields will be denoted
by 𝕂. Most of the results we will obtain apply equally whether the underlying field
is ℝ or ℂ. When a given result applies to only one field but not the other, we will
explicitly state the base field.
Example 2. Let I be a nonempty set, and let $\mathbb{K}^I$ be the space of all functions from
I to 𝕂. For functions $x = (x_\alpha)_{\alpha\in I}$, $y = (y_\alpha)_{\alpha\in I}$, and for a ∈ 𝕂, define
Example 3. The space 𝕂(I) is the space of all functions x ∶ I → 𝕂 such that x𝛼 = 0
for all but a finite number of elements 𝛼 ∈ I. Addition and scalar multiplication
are defined as in example 2.
Example 6. Let $\mathbb{K}^{m\times n}$ be the space of all m × n matrices. Addition and scalar
multiplication are defined entrywise, in the usual manner.
Example 7. For real numbers a < b, define X = ℬ[a, b] as the space of all bounded
(real or complex) functions on the interval [a, b]. For f, g ∈ X, x ∈ [a, b], and
c ∈ 𝕂, define vector addition and scalar multiplication in X, respectively, by
Example 9. The space $\mathcal{C}^\infty(\mathbb{R})$ consists of all real-valued functions on ℝ that have
derivatives of all orders. Vector addition and scalar multiplication are defined
as in example 7.
Theorem 3.1.1.
(a) The zero vector is unique.
(b) For u ∈ U and a ∈ 𝕂, 0.u = O and a.O = O (0 is the scalar zero and O is the
zero vector).
(c) (−a)u = a(−u) = −(au).
(d) $\left(\sum_{i=1}^{n} a_i\right)u = \sum_{i=1}^{n} a_i u$.
(e) $a\left(\sum_{i=1}^{n} u_i\right) = \sum_{i=1}^{n} a u_i$.
(b) The canonical vectors in 𝕂(ℕ) are the sequences en (n ∈ ℕ), where the nth
term of en is one, and all the other terms are zero.
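(c) The canonical vectors in 𝕂(I) are the functions $e_\alpha : I \to \mathbb{K}$, defined by
$e_\alpha(\beta) = \delta_{\alpha,\beta}$. Here $\delta_{\alpha,\beta}$ is the Kronecker delta:
\[ \delta_{\alpha,\beta} = \begin{cases} 1 & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \neq \beta. \end{cases} \]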
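The description of 𝕂(I) and its canonical vectors can be illustrated with a small sketch. It assumes that a finitely supported function on I is stored as a dictionary holding only its nonzero values, with exact rational scalars; the names `canonical`, `add`, `scale`, and `evaluate` are hypothetical helpers introduced for this illustration only.

```python
from fractions import Fraction

def canonical(alpha):
    """Canonical vector e_alpha: value 1 at alpha and 0 at every other index."""
    return {alpha: Fraction(1)}

def add(x, y):
    """Pointwise sum of two finitely supported functions; zero entries are dropped."""
    out = dict(x)
    for key, value in y.items():
        total = out.get(key, 0) + value
        if total == 0:
            out.pop(key, None)
        else:
            out[key] = total
    return out

def scale(a, x):
    """Scalar multiple a*x; multiplying by zero gives the zero vector (empty support)."""
    if a == 0:
        return {}
    return {key: a * value for key, value in x.items()}

def evaluate(x, beta):
    """Value of x at the index beta (0 outside the finite support)."""
    return x.get(beta, 0)
```

With this representation, `evaluate(canonical(a), b)` plays the role of the Kronecker delta, and every element is a finite linear combination of canonical vectors.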
Exercises
This section is focused on the concepts of linear independence and bases. Our
approach to studying bases is unified in the sense that we do not treat finite-dimensional
and infinite-dimensional spaces separately. We use Zorn’s lemma to
prove the existence of a basis. A number of important equivalent characterizations
of a basis are also discussed, both in the body of the section and in the section
exercises.
Terminology. A vector of the form $\sum_{i=1}^{n} a_i u_i$, where at least one $a_i \neq 0$, is called
a nontrivial linear combination of $u_1, u_2, \dots, u_n$. The above definition can be
restated as follows: $\{u_1, u_2, \dots, u_n\}$ is dependent if some nontrivial linear combination
of $u_1, u_2, \dots, u_n$ is zero.
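\[ \sum_{i=1}^{n} \alpha_i^{j} c_i = 0, \quad j = 0, \dots, n-1. \]
\[ J = \begin{pmatrix} 1 & 1 & \dots & 1 \\ \alpha_1 & \alpha_2 & \dots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_1^{n-1} & \alpha_2^{n-1} & \dots & \alpha_n^{n-1} \end{pmatrix}. \]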
The corollary below says that an independent subset of a vector space can be
augmented to a basis.
Proof. The proof parallels that of theorem 3.2.2, except that 𝔅 is defined to be the
family of all independent subsets of U that contain S1 ; 𝔅 ≠ ∅ because S1 ∈ 𝔅.
Theorem 3.2.4. Let S be a subset of a vector space U. The following are equivalent:
(a) S is a basis.
(b) S is independent and spans U (meaning that Span(S) = U).
(c) Every nonzero element of U can be written uniquely as a finite linear combination
of vectors in S. Specifically, if u ≠ 0, then there exists a unique subset
$\{u_1, \dots, u_n\}$ of S and a unique set of nonzero scalars $\{a_1, \dots, a_n\}$ such that
$u = \sum_{i=1}^{n} a_i u_i$.
Proof. (a) implies (b). Since a basis is independent, we only need to show that
Span(S) = U. Let u ∈ U and, without loss of generality, assume that u ∉ S. Then
S1 = S ∪ {u} is dependent, so a finite subset S2 of S1 is dependent. S2 must contain
u because the other elements of S2 are independent. Write S2 = {u, u1 , ..., un }. Then
there are scalars $a, a_1, \dots, a_n$, not all zero, such that $au + a_1u_1 + \dots + a_nu_n = 0$;
$a \neq 0$ because otherwise, $\{u_1, \dots, u_n\}$ would be dependent. Hence
$u = -\frac{1}{a}(a_1u_1 + \dots + a_nu_n)$. Thus S spans U.
(b) implies (c). We only need to show the uniqueness of the representation of a
nonzero element u ∈ U as a finite linear combination of S. Suppose there are
finite subsets E and F of S such that u can be written as a linear combination
of the elements of both E and F. We will show that E ≠ F leads to a contradiction.
We adopt the notation $E \cap F = \{u_1, \dots, u_r\}$, $E - F = \{u_{r+1}, \dots, u_s\}$, and
$F - E = \{u_{s+1}, \dots, u_n\}$. The assumption is that there are nonzero scalars $a_1, \dots, b_1, \dots$, such
that
\[ u = \sum_{i=1}^{r} a_i u_i + \sum_{i=r+1}^{s} a_i u_i = \sum_{i=1}^{r} b_i u_i + \sum_{i=s+1}^{n} b_i u_i. \]
Rearranging the above equation, we have
\[ \sum_{i=1}^{r} (a_i - b_i)u_i + \sum_{i=r+1}^{s} a_i u_i - \sum_{i=s+1}^{n} b_i u_i = 0. \]
This would contradict the independence of E ∪ F unless E − F = ∅ = F − E.
Now $\sum_{i=1}^{r} (a_i - b_i)u_i = 0$, and the independence of E forces $a_i = b_i$
for all 1 ≤ i ≤ r.
(c) implies (a). First observe that the zero vector is not in S because otherwise
the uniqueness of representation of any finite linear combination of S would be
violated by adding 1.0 to it. To show the independence of S, suppose a nontrivial linear
combination of some finite subset of S is equal to zero, say, $\sum_{i=1}^{n} a_i u_i = 0$.
By the previous observation, at least two of the coefficients are nonzero, say,
$a_1 \neq 0 \neq a_2$. In this case, $a_1u_1 = -\sum_{i=2}^{n} a_i u_i$. This contradicts the uniqueness
of the representation of $a_1u_1$ and proves the independence of S. To show that
S is maximal, let u ∈ U − S. Then u is a finite linear combination of elements
u1 , . . . , un of S. This implies that {u, u1 , . . . , un } is dependent, and hence {u} ∪ S is
dependent. This establishes the maximality of S.
Example 10. The set S = {1, x, x², ...} is independent and spans ℙ and is therefore
a basis for ℙ. Naturally, we call S the canonical basis for ℙ.
Example 11. For the same reason, {en ∶ n ∈ ℕ} is a basis for 𝕂(ℕ).
Exercises
In this section, we discuss the definition of dimension and prove the invariance
of the cardinality of the basis. Some results on cardinal arithmetic are needed in
the infinite-dimensional case. We also prove the existence of a vector space of any
given dimension.
Lemma 3.3.1. Consider the following system of linear equations with coefficients
in 𝕂:
We now prove the invariance of the number of vectors in a basis for a finite-
dimensional space.
Theorem 3.3.4. Let {u𝛼 }𝛼∈I and {v𝛽 }𝛽∈J be bases for an infinite-dimensional space
U. Then Card(I) = Card(J).
Proof. For each 𝛽 ∈ J, there is a finite subset I𝛽 ⊆ I such that v𝛽 is a linear combina-
tion of the finite set {u𝛼 ∶ 𝛼 ∈ I𝛽 }. Therefore
Since no proper subset of $\{u_\alpha\}_{\alpha\in I}$ spans U (theorem 3.2.5), $I = \bigcup_{\beta\in J} I_\beta$. Using
theorems 2.3.11 and 2.3.10 (also see problem 8 in section 2.3),
Notation. We use the notation dim𝕂 (U) to denote the dimension of a vector space
U over the field 𝕂. If the base field is understood, we simply write dim(U).
We now show the existence of a vector space of any given dimension. The essential
uniqueness of such a space will be discussed in section 3.4.
Exercises
1. In this problem, the base field is ℝ. Let $V_1$ be the set of real symmetric n × n
matrices, and let $V_2$ be the set of skew-symmetric matrices. Show that $V_1$
and $V_2$ are subspaces of $\mathbb{R}^{n\times n}$, and find their dimensions. An n × n matrix
A is skew-symmetric if, for all 1 ≤ i, j ≤ n, $a_{ij} = -a_{ji}$.
2. Let V be a subspace of U. Show that dim(V) ≤ dim(U).
3. Let U be an n-dimensional vector space, and let S be a subset of U of exactly
n elements. Prove that the following are equivalent:
(a) S is a basis for U.
(b) S is independent.
(c) S spans U.
4. Let V be a subspace of U. Show that if V contains a basis for U, then V = U.
5. Show that a vector space U is infinite dimensional if and only if it contains
an infinite independent subset.
6. Let U be an infinite-dimensional vector space. Show that there is a sequence
V1 ⊃ V2 ⊃ ... (proper containments) of subspaces of U such that dim(Vn ) =
dim(U) for all n.
7. Let $\{x_0, \dots, x_n\}$ be a set of distinct real numbers. For 0 ≤ i ≤ n, define the
following set of polynomials in $\mathbb{P}_n$:
\[ L_0(x) = \frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)}, \]
\[ L_1(x) = \frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)}, \dots, \text{ and} \]
\[ L_n(x) = \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}. \]
Show that the set $\{L_i\}_{i=0}^{n}$ is a basis for $\mathbb{P}_n$. Hint: For a polynomial $f \in \mathbb{P}_n$,
show that $f = \sum_{i=0}^{n} f(x_i)L_i(x)$. Observe that $L_i(x_j) = \delta_{ij}$.
8. Let I = [a, b] be a closed, bounded interval, and suppose a = t1 < t2 < ... <
tn = b is a fixed set of points (also called nodes) in I. Define V to be the
set of continuous functions on [a, b] whose restrictions to the subintervals
[ti , ti+1 ] are linear. Prove that V is a vector space, and find a basis for it.
A function in the space V is known as a continuous, piecewise linear
function with nodes {t1 , . . . , tn }.
9. The space of continuous, piecewise linear functions. Let U be the collec-
tion of all continuous, piecewise linear functions on [a, b]. Prove that U is
an infinite-dimensional vector space.
10. Show that ℝ is an infinite-dimensional vector space over ℚ.
11. Let M be a field, let L be a subfield of M, and let K be a subfield of L.
We can consider L as a vector space over K, and M as a vector space
over either L or K. Prove that if dimL (M) and dimK (L) are finite, then
dimK (M) = dimL (M).dimK (L).
(a) T(0) = 0;
(b) T(−u) = −T(u);
(c) if $a_1, \dots, a_n \in \mathbb{K}$ and $u_1, \dots, u_n \in U$, then $T\left(\sum_{i=1}^{n} a_i u_i\right) = \sum_{i=1}^{n} a_i T(u_i)$;
(d) the image under T of a subspace of U is a subspace of V; and
(e) the inverse image under T of a subspace of V is a subspace of U.
Example 3. Let $T : \mathbb{P} \to \mathbb{P}$ be defined by $T(f) = \int_0^x f(t)\,dt$. It is easy to verify
directly that 𝒩(T) = {0} and that T is one-to-one.
Theorem 3.4.3. Let U be a vector space of dimension n < ∞, and let T be a linear
transformation from U to a vector space V.
Then dim(Ker(T )) + dim(ℜ(T )) = n. In other words,
rank(T ) + nullity(T ) = n.
Theorem 3.4.4. Let S = {u𝛼 }𝛼∈I be a basis for a vector space U, and let {v𝛼 }𝛼∈I be
an arbitrary subset of a vector space V. Then there exists a unique linear mapping
T ∶ U → V such that, for every 𝛼 ∈ I, T(u𝛼 ) = v𝛼 .
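\[ S(x) = S\Big(\sum_{\alpha\in F} a_\alpha u_\alpha\Big) = \sum_{\alpha\in F} a_\alpha S(u_\alpha) = \sum_{\alpha\in F} a_\alpha v_\alpha = \sum_{\alpha\in F} a_\alpha T(u_\alpha) = T\Big(\sum_{\alpha\in F} a_\alpha u_\alpha\Big) = T(x). \]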
The above theorem says that a linear mapping is completely (and uniquely)
determined by its values on a basis. Stated differently, an arbitrary function on
a basis for U can be uniquely extended to a linear function on U.
Example 5. Let $S = \{1, x, x^2, \dots\}$ be the canonical basis for ℙ, and define T ∶ S → ℙ
by T(1) = 0, T(x) = 0, and, for n ≥ 2, $T(x^n) = n(n-1)x^{n-2}$. It is clear that the
unique linear mapping on ℙ that extends T is T(f) = f″ (the second derivative
of f).
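The extension principle of theorem 3.4.4 can be sketched computationally for this example. The sketch assumes that a polynomial is stored as its list of coefficients relative to the canonical basis (index n holds the coefficient of x^n); `basis_image`, `extend_linearly`, and `second_derivative` are hypothetical names introduced only for this illustration.

```python
def basis_image(n):
    """Coefficients of T(x^n) as prescribed on the canonical basis:
    T(1) = T(x) = 0 and T(x^n) = n(n-1) x^(n-2) for n >= 2."""
    if n < 2:
        return []
    return [0] * (n - 2) + [n * (n - 1)]

def strip_zeros(coeffs):
    """Drop trailing zero coefficients so equal polynomials have equal lists."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def extend_linearly(image_of_basis, coeffs):
    """Apply the linear extension to f = sum_n coeffs[n] x^n,
    using only the values prescribed on the basis vectors."""
    result = []
    for n, a in enumerate(coeffs):
        image = image_of_basis(n)
        if len(image) > len(result):
            result.extend([0] * (len(image) - len(result)))
        for k, c in enumerate(image):
            result[k] += a * c
    return strip_zeros(result)

def second_derivative(coeffs):
    """Direct formula for f'' on coefficient lists, used here for comparison."""
    return strip_zeros([k * (k - 1) * c for k, c in enumerate(coeffs)][2:])
```

For f = 5 − x + 3x² + 2x⁴, stored as `[5, -1, 3, 0, 2]`, both functions return the coefficients of 6 + 24x², which is consistent with the claim that the extension is f ↦ f″.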
\[ y_i = \begin{cases} a_i & \text{if } 0 \le i \le n, \\ 0 & \text{if } i > n. \end{cases} \]
Theorem 3.4.6. Let U be a vector space of infinite dimension ℵ, and let I be a set
such that Card(I) = ℵ. Then U is isomorphic to 𝕂(I).
Proof. Let {u𝛼 }𝛼∈I be a basis for U, and let {e𝛼 } be the canonical basis for 𝕂(I). If
∑𝛼∈F a𝛼 u𝛼 is the unique representation of an element x ∈ U as a finite linear
combination of the basis elements, define T ∶ U → 𝕂(I) by T(x) = ∑𝛼∈F a𝛼 e𝛼 .
The proof that T is an isomorphism is much like the proof of the previous
theorem.
Quotient Spaces
Definition. The quotient space U/V (read U modulo V) consists of the cosets of
V, endowed with a vector space structure by the operations
(x + V) + (y + V) = (x + y) + V
and
a(x + V) = (ax) + V.
The above operations are well defined in the sense that they do not depend on the
particular element x chosen to represent the coset x + V. For example, if x′ + V =
x + V and y′ + V = y + V, then x′ − x ∈ V, y′ − y ∈ V, and (x′ + y′ ) − (x + y) ∈ V;
hence (x′ + y′) + V = (x + y) + V. For brevity of notation, the coset x + V will be
denoted by $\bar{x}$.
Theorem 3.4.7. Let T ∶ U → W be linear, and let V = Ker(T). Then U/V is isomorphic
to ℜ(T) via the isomorphism $\tilde{T}(\bar{x}) = T(x)$.
Proof. We leave it to the reader to verify that $\tilde{T}$ is well defined. Clearly, $\tilde{T}$ is onto. We
verify the linearity of $\tilde{T}$:
\[ \tilde{T}(a\bar{x} + b\bar{y}) = \tilde{T}(\overline{ax + by}) = T(ax + by) = aT(x) + bT(y) = a\tilde{T}(\bar{x}) + b\tilde{T}(\bar{y}). \]
Direct Sums
Example 8. Let $U = \mathbb{R}^3$, and let $U_1$ and $U_2$ be distinct lines containing the origin.
Then the subspace $U_1 + U_2$ is the plane that contains $U_1$ and $U_2$.
Example 10. Let c be the space of all convergent sequences, and let c0 be the
space of all sequences that converge to 0. We show that c = c0 ⊕ Span({e}),
where e = (1, 1, 1, ...). Let x = (x1 , x2 , ...) be a convergent sequence, and let
Definition. A linear mapping from a vector space U to the base field 𝕂 is called a
linear functional on U.
Example 12. Define $\lambda : \mathbb{P} \to \mathbb{R}$ by $\lambda(f) = \int_0^1 f(x)\,dx$. (The base field is ℝ and the
polynomials have real coefficients.)
Example 13. Define $\lambda : \mathbb{K}^{n\times n} \to \mathbb{K}$ by $\lambda(A) = \sum_{i=1}^{n} a_{ii}$. Here $A = (a_{ij})$ is an n × n
matrix. The quantity $\sum_{i=1}^{n} a_{ii}$ is called the trace of A, often written tr(A).
Proof. (a) implies (b). Let x ∈ U be such that 𝜆(x) ≠ 0. By replacing x with x/𝜆(x),
we may assume that 𝜆(x) = 1. For y ∈ U, let w = y − 𝜆(y)x. Then 𝜆(w) = 𝜆(y) −
𝜆(𝜆(y)x) = 𝜆(y) − 𝜆(y)𝜆(x) = 0. This shows that w ∈ Ker(𝜆) = M; hence y =
w + 𝜆(y)x ∈ M + Span({x}), and U = M + Span({x}). Next we show that M ∩
Span({x}) = {0}. This will complete the proof. If y ∈ M ∩ Span({x}), then y = ax
for some a ∈ 𝕂, and 𝜆(y) = 0. But 𝜆(y) = a𝜆(x) = a. Thus a = 0, and y = 0.
Conversely, suppose that U = M ⊕ Span({x}) for some nonzero x ∈ U. Let S1
be a basis for M, and let S = S1 ∪ {x}. Then S is a basis for U. Define 𝜆 ∶ S → 𝕂
by 𝜆(x) = 1, and 𝜆(u) = 0 for all u ∈ S1 . Finally, extend 𝜆 to a linear functional,
which we also denote by 𝜆, on U according to theorem 3.4.4. The reader can easily
verify that Ker(𝜆) = M.
Example 14. Refer to example 12. Let M = Ker(𝜆). The following facts are easy to
verify: A basis for M is $\{x^n - \frac{1}{n+1} : n \in \mathbb{N}\}$, and the one-dimensional subspace
N of constant polynomials is a complement of M. Every polynomial f can be
written as f = g + c, where $c = \int_0^1 f(t)\,dt$, and g = f − c.
Another important vector space is the space of all linear transformations from one
vector space U to another space V.
Notation. Let U and V be vector spaces. The set of all linear transformations
from U to V is denoted by Hom(U, V). A linear mapping is also called a
homomorphism, hence the notation Hom(U, V).
It is easy to see that Hom(U, V) is a vector space with the following operations:
for T1 , T2 ∈ Hom(U, V) and a ∈ 𝕂, (T1 + T2 )(u) = T1 (u) + T2 (u), and (aT1 )(u) =
aT1 (u).
Theorem 3.4.14. Suppose U and V are vector spaces over a field 𝕂. Then Hom(U, V)
is a vector space, and Hom(U, U) is an algebra over 𝕂.
Exercises
A careful reading of example 2 in section 3.4 reveals that the set of linear mappings
from 𝕂n to 𝕂m is in one-to-one correspondence with the set of m × n matrices.
This section generalizes this result. Suppose U and V are finite-dimensional vector
spaces and that $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_m\}$ are bases for U and V, respectively.
Theorem 3.4.4 states that a linear mapping T ∶ U → V is uniquely determined by
the vectors T(u1 ), . . . , T(un ). Since each of the vectors T(uj ) can be uniquely written
as a linear combination of {v1 , . . . , vm } with coefficients in 𝕂, the set of coefficients
determines T uniquely. This observation is the basis for the opening definition of
this section. The information in this section is standard, and we assume familiarity
with its contents.
The matrix representing a linear mapping is totally dependent on the base pair
(B, C) and is even sensitive to the permutation of the elements in each basis. Thus
the bases B and C are assumed to be ordered.
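Example 2. Let $T : \mathbb{P}_n \to \mathbb{P}_n$ be the linear transformation $T(f) = \frac{df}{dx}$.
If $B = \{1, x, \dots, x^n\}$, then the matrix of T relative to (B, B) is
\[ \begin{pmatrix} 0 & 1 & & & \\ & 0 & 2 & & \\ & & \ddots & \ddots & \\ & & & 0 & n \\ & & & & 0 \end{pmatrix}. \]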
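The matrix in this example can be generated and applied to coordinate vectors with a short sketch. It assumes that a polynomial in ℙn is identified with its coordinate list relative to B = {1, x, ..., x^n}; the helper names `derivative_matrix`, `mat_vec`, and `mat_mul` are hypothetical and serve only this illustration.

```python
def derivative_matrix(n):
    """(n+1) x (n+1) matrix of T(f) = df/dx relative to (B, B).
    Column j holds the coordinates of T(x^j) = j x^(j-1)."""
    size = n + 1
    matrix = [[0] * size for _ in range(size)]
    for j in range(1, size):
        matrix[j - 1][j] = j
    return matrix

def mat_vec(matrix, vector):
    """Coordinates of the image: the matrix applied to a coordinate column."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def mat_mul(left, right):
    """Ordinary matrix product left * right."""
    inner = len(right)
    cols = len(right[0])
    return [[sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
            for row in left]
```

For n = 3 and f = 1 + 2x + 3x² + 4x³, `mat_vec(derivative_matrix(3), [1, 2, 3, 4])` gives the coordinates of f′ = 2 + 6x + 12x². Multiplying the matrix by itself with `mat_mul` gives a matrix that sends the same coordinates to those of f″, in line with the rule that the matrix of a composition is the product of the matrices.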
Now we study how the matrix of the composition of two linear transformations
relates to the matrices of the composed transformations.
\[ (S \circ T)(u_j) = S(T(u_j)) = S\Big(\sum_{i=1}^{m} a_{ij} v_i\Big) = \sum_{i=1}^{m} \sum_{k=1}^{p} a_{ij} a'_{ki} w_k = \sum_{k=1}^{p} \Big(\sum_{i=1}^{m} a'_{ki} a_{ij}\Big) w_k = \sum_{k=1}^{p} e_{kj} w_k, \quad \text{where } e_{kj} = \sum_{i=1}^{m} a'_{ki} a_{ij}. \]
Thus the matrix of S∘T relative to (B, D) is $E = (e_{kj})$. By the definition of matrix
multiplication, $e_{kj}$ is the (k, j) entry of the product A′A.
The above theorem is the crucial piece of information needed to prove the
following theorem, which is a special case of theorem 3.5.1 when V = U and C = B.
Example 3. Notice that if P is the matrix from B to B′, then $P^{-1}$ is the matrix from
B′ to B. Indeed, let Q be the matrix from B′ to B, and consider the mapping
$T = I_U$ with the base pair (B, B′) (its matrix is P), and the mapping $S = I_U$ with
the base pair (B′, B) (its matrix is Q). Consider the composition
S∘T. On the one hand, its matrix relative to (B, B) is QP by theorem 3.5.2. On the
other hand, the matrix of $I_U$ relative to (B, B) is the identity matrix $I_n$. Therefore
$QP = I_n$, and $Q = P^{-1}$.
Proof. Fix a basis B for U. For another basis B′ for U, let P be the matrix from
B to B′ . The correspondence Ψ ∶ B′ ↦ P is the correspondence promised by the
theorem. We leave the rest of the formalities to the reader. The examples preceding
this theorem are relevant for verifying the details.
Proof. Consider diagram 1. Each corner contains a pair: a space and a basis. The
top arrow prompts the reader to consider the mapping T ∶ U → V and mind the
bases indicated in the top corners of the diagram. Thus the matrix of the mapping
T represented by the top arrow is relative to the base pair (B, C) and is therefore A.
Likewise, the matrices representing the rest of the mappings indicated on the diagram
are $Q^{-1}$ for $I_V$, $P^{-1}$ for $I_U$, and A′ for the mapping depicted by the bottom arrow.
Now $I_V \circ T = T \circ I_U$. Applying theorem 3.5.2 to each side of the above equation, we
get $Q^{-1}A = A'P^{-1}$, or $A' = Q^{-1}AP$.
\[ \begin{array}{ccc} (U, B) & \xrightarrow{\;T\;} & (V, C) \\ \downarrow {\scriptstyle I_U} & & \downarrow {\scriptstyle I_V} \\ (U, B') & \xrightarrow{\;T\;} & (V, C') \end{array} \]
Diagram 1
Corollary 3.5.6. Let U be an n-dimensional vector space, let T ∈ Hom(U, U), and
let B and B′ be bases for U. If A is the matrix of T relative to B, and A′ is the matrix
of T relative to B′, then $A' = P^{-1}AP$, where P is the matrix from B′ to B.
Proof. This is the special case of the above theorem when V = U, C = B, and
C′ = B′ .
Diagonalization
The following theorem gives a necessary and sufficient condition for a square
matrix (linear operator) to be diagonalizable.
Proof. Suppose A is diagonalizable. Thus there exists an invertible matrix P such that
$P^{-1}AP = D$, a diagonal matrix. Let $\lambda_1, \dots, \lambda_n$ be the diagonal entries of D, and let
$P = [u_1, \dots, u_n]$ be a partitioning of P by its columns. The equation $P^{-1}AP = D$
Example 6. Let T be an operator on $\mathcal{C}^\infty(\mathbb{R})$, defined by $T(f) = \frac{d^2f}{dx^2} + f$. It is easy
to verify that, for every 𝜔 ∈ ℝ, the function $f_\omega(x) = \sin(\omega x)$ is an eigenfunction
of T corresponding to the eigenvalue $\lambda_\omega = 1 - \omega^2$.
Exercises
\[ \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix}, \]
The abstraction of the function d to an arbitrary vector space yields the definition
of a normed linear space. Instead of using the notation d(x), we use the universally
accepted notation ‖x‖ for the length of a vector x, or its distance from the zero
vector.
Normed linear spaces are the most common examples of metric spaces. What
sets norms apart, still using the function d on ℝ² as our prototype, is the fact that
the distance function between two points in the plane is translation invariant
in the sense that if $D : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ is the function
$D(x, y) = \{(x_1 - y_1)^2 + (x_2 - y_2)^2\}^{1/2}$, then D(x, y) = D(x − a, y − a) for all x, y, a ∈ ℝ². Equivalently,
D(x, y) = D(x − y, 0) = d(x − y). See the definition of a translation later on in
this section. This property makes no sense for a general metric space because the
underlying set of a metric space is not required to be a vector space.
The function ‖.‖ is called a norm on X, and condition (c) in the above definition
is known as the triangle inequality.
Definition. The distance between two points x and y in a normed linear space X
is the scalar ‖x − y‖.
The reader can easily verify that the defining conditions of a norm are satisfied
in each of the examples below.
We verify the triangle inequality here. If f and g are bounded functions on [a, b]
and x ∈ [a, b], then |(f + g)(x)| ≤ |f (x)| + |g(x)| ≤ ‖f‖∞ + ‖g‖∞ .
Thus ‖f + g‖∞ = supx∈[a,b] |(f + g)(x)| ≤ ‖f‖∞ + ‖g‖∞ .
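\[ f_n(x) = \begin{cases} 2n^3 x & \text{if } 0 \le x \le \frac{1}{2n^2}, \\ -2n^3\left(x - \frac{1}{n^2}\right) & \text{if } \frac{1}{2n^2} \le x \le \frac{1}{n^2}, \\ 0 & \text{if } \frac{1}{n^2} \le x \le 1. \end{cases} \]
It is clear that
\[ \|f_n\|_\infty = f_n(1/2n^2) = n, \quad \|f_n\|_1 = \frac{1}{2n}. \]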
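The two norm values can be checked exactly with a short sketch. It assumes that ‖·‖1 denotes the integral of |f| over [0, 1], and it relies on the fact that each f_n is piecewise linear and nonnegative, so the supremum is attained at a breakpoint and the trapezoid rule on the breakpoints is exact. The names `f` and `norms` are hypothetical.

```python
from fractions import Fraction

def f(n, x):
    """The tent function f_n on [0, 1], evaluated in exact rational arithmetic."""
    x = Fraction(x)
    peak = Fraction(1, 2 * n * n)          # location of the maximum, 1/(2n^2)
    if 0 <= x <= peak:
        return 2 * n**3 * x
    if peak <= x <= 2 * peak:
        return -2 * n**3 * (x - Fraction(1, n * n))
    return Fraction(0)

def norms(n):
    """Return (sup norm, 1-norm) of f_n computed from its breakpoints."""
    nodes = [Fraction(0), Fraction(1, 2 * n * n), Fraction(1, n * n), Fraction(1)]
    values = [f(n, t) for t in nodes]
    sup_norm = max(values)
    # Trapezoid rule on each linear piece; exact because f_n is linear there.
    one_norm = sum((b - a) * (fa + fb) / 2
                   for a, b, fa, fb in zip(nodes, nodes[1:], values, values[1:]))
    return sup_norm, one_norm
```

For each n tried, `norms(n)` agrees with the values n and 1/(2n) stated above.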
Thus
‖A + B‖∞ ≤ ‖A‖∞ + ‖B‖∞ .
The matrix norm in the above example is compatible with the ∞-norm on ℝn in
the sense that, for x ∈ ℝn , ‖Ax‖∞ ≤ ‖A‖∞ ‖x‖∞ , as the reader can easily verify.
lp Spaces
Definition. For every real number 1 ≤ p < ∞, define lp to be the set of all
sequences $x = (x_1, x_2, \dots)$ in 𝕂 such that $\sum_{n=1}^{\infty} |x_n|^p < \infty$. For x ∈ lp,
\[ \|x\|_p = \Big(\sum_{n=1}^{\infty} |x_n|^p\Big)^{1/p}. \]
Showing that lp (for 1 < p < ∞) is a normed linear space is less straightforward
and requires the development of two useful inequalities which are important in
their own right.
Definition. Let 1 < p < ∞. The conjugate Hölder exponent of p is the number
q > 1 such that $\frac{1}{p} + \frac{1}{q} = 1$. By definition, p = 1 and q = ∞ are conjugate Hölder
exponents.
Lemma 3.6.1. If p > 1, and x, y ∈ ℂ, then $|xy| \le \frac{|x|^p}{p} + \frac{|y|^q}{q}$. Here p and q are
conjugate Hölder exponents.
Proof. Consider the function $f(t) = t^{1/p} - \frac{t}{p} - \frac{1}{q}$, t ≥ 1; $f'(t) = \frac{1}{p}t^{1/p-1} - \frac{1}{p} \le 0$. Thus
f is decreasing for all t ≥ 1, and since f(1) = 0, it follows that f(t) ≤ 0 for all t ≥ 1.
Thus $t^{1/p} \le \frac{t}{p} + \frac{1}{q}$ for t ≥ 1. Now let a, b > 0 and, say, $\frac{a}{b} \ge 1$. By replacing t with
a/b, we obtain $\left(\frac{a}{b}\right)^{1/p} \le \frac{1}{p}\left(\frac{a}{b}\right) + \frac{1}{q}$. Therefore, $\frac{b\,a^{1/p}}{b^{1/p}} \le \frac{a}{p} + \frac{b}{q}$, or $a^{1/p}b^{1/q} \le \frac{a}{p} + \frac{b}{q}$.
Letting $a = |x|^p$, $b = |y|^q$, we obtain the inequality we seek.
\[ \sum_{n=1}^{\infty} |x_n y_n| \le \sup_n |y_n| \sum_{n=1}^{\infty} |x_n| = \|y\|_\infty \|x\|_1. \]
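\[ \frac{1}{\|x\|_p \|y\|_q} \sum_{i=1}^{n} |x_i y_i| \le \frac{1}{p\|x\|_p^p} \sum_{i=1}^{n} |x_i|^p + \frac{1}{q\|y\|_q^q} \sum_{i=1}^{n} |y_i|^q \le \frac{1}{p\|x\|_p^p} \sum_{i=1}^{\infty} |x_i|^p + \frac{1}{q\|y\|_q^q} \sum_{i=1}^{\infty} |y_i|^q = \frac{1}{p} + \frac{1}{q} = 1. \]
The summary of the above calculations is that $\frac{1}{\|x\|_p \|y\|_q} \sum_{i=1}^{n} |x_i y_i| \le 1$. Taking the
limit as n → ∞ we obtain Hölder’s inequality.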
Proof. We already proved the theorem for p = 1 and p = ∞, so assume that 1 < p <
∞, and let q be the conjugate Hölder exponent of p. Then:
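\[ \sum_{i=1}^{n} |x_i + y_i|^p \le \sum_{i=1}^{n} |x_i + y_i|^{p-1}(|x_i| + |y_i|). \]
Applying Hölder’s inequality to the right side of the above inequality yields
\[ \sum_{i=1}^{n} |x_i + y_i|^{p-1}|x_i| + \sum_{i=1}^{n} |x_i + y_i|^{p-1}|y_i| \le \Big[\Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p} + \Big(\sum_{i=1}^{n} |y_i|^p\Big)^{1/p}\Big]\Big(\sum_{i=1}^{n} |x_i + y_i|^{(p-1)q}\Big)^{1/q} \le (\|x\|_p + \|y\|_p)\Big(\sum_{i=1}^{n} |x_i + y_i|^p\Big)^{1/q}, \]
where the last step uses (p − 1)q = p. Combining the two estimates,
\[ \sum_{i=1}^{n} |x_i + y_i|^p \le (\|x\|_p + \|y\|_p)\Big(\sum_{i=1}^{n} |x_i + y_i|^p\Big)^{1/q}. \]
Thus
\[ \Big(\sum_{i=1}^{n} |x_i + y_i|^p\Big)^{1-1/q} \le \|x\|_p + \|y\|_p. \]
Taking the limit as n → ∞, and recalling that 1 − 1/q = 1/p, we have
\[ \Big(\sum_{i=1}^{\infty} |x_i + y_i|^p\Big)^{1/p} \le \|x\|_p + \|y\|_p \quad \text{or} \quad \|x + y\|_p \le \|x\|_p + \|y\|_p. \]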
We have verified all the crucial details needed to prove the result below.
Observe that Hölder’s inequality and the triangle inequality apply to finite
sequences:
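\[ \sum_{i=1}^{n} |x_i y_i| \le \Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p}\Big(\sum_{i=1}^{n} |y_i|^q\Big)^{1/q}, \quad \text{and} \]
\[ \Big(\sum_{i=1}^{n} |x_i + y_i|^p\Big)^{1/p} \le \Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p} + \Big(\sum_{i=1}^{n} |y_i|^p\Big)^{1/p}. \]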
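The finite forms of the two inequalities lend themselves to a quick numerical sanity check. The sketch below assumes real entries, an exponent p with 1 < p < ∞, and floating-point arithmetic, so it only illustrates the statements and proves nothing; `p_norm`, `holder_gap`, and `minkowski_gap` are hypothetical names.

```python
def p_norm(x, p):
    """The p-norm of a finite sequence x, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_gap(x, y, p):
    """Right side minus left side of Hölder's inequality, with q conjugate to p > 1.
    The value should be nonnegative up to rounding error."""
    q = p / (p - 1.0)
    left = sum(abs(a * b) for a, b in zip(x, y))
    return p_norm(x, p) * p_norm(y, q) - left

def minkowski_gap(x, y, p):
    """Right side minus left side of the triangle inequality in the p-norm."""
    total = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(total, p)
```

Both gaps should come out nonnegative for any pair of finite real sequences of equal length; for p = 2 and y = x the Hölder gap vanishes, which reflects the equality case of the Cauchy-Schwarz inequality.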
Definition. Let X be a normed linear space. The open ball of radius r centered at
x ∈ X is the set
B(x, r) = {y ∈ X ∶ ‖y − x‖ < r}.
Example 9. In $(\mathbb{R}^2, \|\cdot\|_\infty)$, the open ball of radius r centered at the point $(x_0, y_0)$
is the open square $\{(x, y) : |x - x_0| < r, |y - y_0| < r\}$. In $(\mathbb{R}^2, \|\cdot\|_1)$, the open ball
of radius r centered at $(x_0, y_0)$ is the open square with vertices $(x_0 \pm r, y_0)$ and
$(x_0, y_0 \pm r)$.
Remark. The vector u in the above definition does not have to be a unit vector, but
when it is, t is the exact distance between 𝜉 + tu and 𝜉. An important special case
is the equation of the line joining two points 𝜉 and 𝜂 in X. In this case, the line
is the set of all points x such that
x = 𝜉 + t(𝜂 − 𝜉) = (1 − t)𝜉 + t𝜂, −∞ < t < ∞.
The set {(1 − t)𝜉 + t𝜂 ∶ 0 ≤ t ≤ 1} is called the line segment joining 𝜉 and 𝜂.
The set E + x can be visualized as rigidly moving E in the direction of the vector x.
The graph of the parabola y = x² + 1 is the translation of the graph of the parabola
y = x² by the vector (0, 1). Figure 3.1 depicts the translation of the raindrop-like
set E by the vector x = (0, −1).
Translating a set preserves most of its characteristics. Convexity is a good
example.
Example 11.
(a) An open ball in ℝn is a convex set.
(b) Let A be an m × n real matrix, and let b ∈ ℝm . The two sets
{x ∈ ℝn ∶ Ax = b} and {x ∈ ℝn ∶ Ax > b} are convex subsets of ℝn .3
(c) The union of the first and third quadrants in the plane is not convex.
(d) The raindrop region in figure 3.1 is not convex.
(e) The intersection of an arbitrary collection of convex sets is convex.
[Figure 3.1: the set E and its translate E + x, where x = (0, −1).]
3 The notation Ax < b means that $\sum_{j=1}^{n} a_{ij} x_j < b_i$ for all 1 ≤ i ≤ m.
We limit the discussion below to ℝn , although some of the statements are valid for
an arbitrary vector space.
Proof. It is enough to show that if C is convex and $x = \sum_{i=1}^{k} \lambda_i x_i$ is a convex combination
of points $x_1, \dots, x_k \in C$, then x ∈ C. The converse is trivial. We use induction
on k. The statement is true for k = 2 by the very definition of convexity. Without
loss of generality, assume that $\lambda_1 < 1$, and write $x = \lambda_1 x_1 + (1 - \lambda_1) \sum_{i=2}^{k} \frac{\lambda_i x_i}{1 - \lambda_1}$.
Now $\sum_{i=2}^{k} \frac{\lambda_i}{1 - \lambda_1} = 1$ and $y = \sum_{i=2}^{k} \frac{\lambda_i x_i}{1 - \lambda_1} \in C$ by the inductive hypothesis. By the
convexity of C, $x = \lambda_1 x_1 + (1 - \lambda_1)y \in C$.
It is clear that conv(A) is the intersection of all the convex subsets of ℝn that contain
A and that conv(A) ≠ ∅, since A ⊆ ℝn , and ℝn is convex.
Theorem 3.6.6. For a nonempty set A ⊆ ℝn , conv(A) is the set of all convex
combinations of points of A.
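Proof. By the previous theorem, it is enough to show that the set of all convex
combinations of points in A is a convex set. Suppose that $x = \sum_{i=1}^{k} \lambda_i x_i$ and
$y = \sum_{j=1}^{l} \mu_j y_j$ are, respectively, convex combinations of points $x_1, \dots, x_k$ and
$y_1, \dots, y_l$ in A. If 𝛼 ∈ [0, 1], then
\[ (1 - \alpha)x + \alpha y = \sum_{i=1}^{k} (1 - \alpha)\lambda_i x_i + \sum_{j=1}^{l} \alpha\mu_j y_j \]
is again a convex combination of points of A, because the coefficients are nonnegative and
\[ \sum_{i=1}^{k} (1 - \alpha)\lambda_i + \sum_{j=1}^{l} \alpha\mu_j = 1. \]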
A natural question now is whether there is an upper bound on the length of the
convex combinations of vectors in A needed to generate all of conv(A). The next
theorem provides the answer.
Proof. Let x ∈ C. Then $x = \sum_{i=1}^{k} \lambda_i x_i$ for some $x_1, \dots, x_k \in A$ and some $\lambda_1, \dots, \lambda_k \in (0, 1]$
with $\sum_{i=1}^{k} \lambda_i = 1$. If k > n + 1, then the vectors $x_2 - x_1, \dots, x_k - x_1$ are
linearly dependent, so there are constants $\mu_2, \dots, \mu_k$, not all zero, such that
$\sum_{j=2}^{k} \mu_j(x_j - x_1) = 0$. If we set $\mu_1 = -\sum_{j=2}^{k} \mu_j$, then $\sum_{j=1}^{k} \mu_j x_j = 0$, and $\sum_{j=1}^{k} \mu_j = 0$.
Observe that at least one of the numbers $\mu_1, \dots, \mu_k$ is positive. Now, for all
𝛼 ∈ ℝ, $x = \sum_{j=1}^{k} (\lambda_j - \alpha\mu_j)x_j$. Let i be such that $\frac{\lambda_i}{\mu_i} = \min_{1 \le j \le k}\{\frac{\lambda_j}{\mu_j} : \mu_j > 0\}$, and
choose $\alpha = \frac{\lambda_i}{\mu_i}$. Observe that 𝛼 > 0 and for 1 ≤ j ≤ k, $\lambda_j - \alpha\mu_j \ge 0$. Now
$x = \sum_{j=1}^{k} (\lambda_j - \alpha\mu_j)x_j$, $\lambda_j - \alpha\mu_j \ge 0$, and $\sum_{j=1}^{k} (\lambda_j - \alpha\mu_j) = 1$. Since $\lambda_i - \alpha\mu_i = 0$, x
is a convex combination of, at most, k − 1 points of A. We continue this process
until x is a convex combination of, at most, n + 1 vectors in A.
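The reduction step in this proof is constructive, and it can be sketched in code. The sketch below assumes points of ℝⁿ with rational coordinates so that all arithmetic is exact. It finds the dependence relation by Gaussian elimination, which is one possible choice that the proof does not prescribe; `null_vector` and `caratheodory` are hypothetical names.

```python
from fractions import Fraction

def null_vector(vectors):
    """Return coefficients c, not all zero, with sum_j c[j] * vectors[j] = 0,
    or None if the vectors are linearly independent (Gauss-Jordan elimination)."""
    m, n = len(vectors), len(vectors[0])
    rows = [[Fraction(vectors[j][r]) for j in range(m)] for r in range(n)]
    pivots, row = [], 0
    for col in range(m):
        if row == n:
            break
        pr = next((r for r in range(row, n) if rows[r][col] != 0), None)
        if pr is None:
            continue
        rows[row], rows[pr] = rows[pr], rows[row]
        pivot = rows[row][col]
        rows[row] = [v / pivot for v in rows[row]]
        for r in range(n):
            if r != row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[row])]
        pivots.append((row, col))
        row += 1
    pivot_cols = {c for _, c in pivots}
    free = next((c for c in range(m) if c not in pivot_cols), None)
    if free is None:
        return None
    coeffs = [Fraction(0)] * m
    coeffs[free] = Fraction(1)           # set one free variable to 1
    for r, pc in pivots:
        coeffs[pc] = -rows[r][free]      # solve for the pivot variables
    return coeffs

def caratheodory(points, weights):
    """Rewrite the convex combination sum_i weights[i] * points[i] in R^n
    so that it uses at most n + 1 of the given points."""
    pairs = [(tuple(Fraction(t) for t in p), Fraction(w))
             for p, w in zip(points, weights) if w > 0]
    pts = [p for p, _ in pairs]
    lam = [w for _, w in pairs]
    n = len(pts[0])
    while len(pts) > n + 1:
        diffs = [[a - b for a, b in zip(p, pts[0])] for p in pts[1:]]
        mu_tail = null_vector(diffs)     # exists because len(diffs) > n
        mu = [-sum(mu_tail)] + mu_tail   # now sum(mu) = 0 and sum(mu_j x_j) = 0
        alpha = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - alpha * m for l, m in zip(lam, mu)]
        keep = [i for i, l in enumerate(lam) if l > 0]
        pts = [pts[i] for i in keep]
        lam = [lam[i] for i in keep]
    return pts, lam
```

Each pass through the loop removes at least one point, since the weight attaining the minimum ratio becomes exactly zero, so the loop stops after finitely many steps with at most n + 1 points.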
Example 12. It is possible that k < n + 1. The closed unit disk D in ℝ2 is the convex
hull of the unit circle 𝒮1 , and every interior point in D is a convex combination
of two vectors in 𝒮1 . However, k = n + 1 is the best possible bound. For example,
if x0 , x1 , and x2 are three noncollinear points in the plane, then an interior point
in the triangle defined by the three points is not a convex combination of any
two of the three points.
A convex set may not have any extreme points. A simple example is the set
{(x, y) ∈ ℝ2 ∶ 0 ≤ x ≤ 1, −∞ < y < ∞}.
While it is intuitively obvious that a polytope is the convex hull of its vertices,
this is not an entirely trivial fact. We prove this fact, together with the even more
fundamental fact that polytopes do have vertices.
Proof. Since $x_k$ is not a vertex of Q, there exist convex combinations $y = \sum_{i=1}^{k} \beta_i x_i$
and $z = \sum_{i=1}^{k} \gamma_i x_i$ (y ≠ z), and a number 𝜆 ∈ (0, 1) such that $x_k = \lambda y + (1 - \lambda)z$.
Now $x_k = \sum_{i=1}^{k} \alpha_i x_i$, where $\alpha_i = \lambda\beta_i + (1 - \lambda)\gamma_i$. If $\alpha_k = 0$, the proof is complete.
It is easy to check that $\alpha_k = 1$ is possible only if $\beta_k = \gamma_k = 1$. But this would force
$y = z = x_k$, which is a contradiction. Thus $0 < \alpha_k < 1$ and $x_k = \sum_{i=1}^{k-1} \frac{\alpha_i}{1 - \alpha_k} x_i$, as
desired. We leave it to the reader to check that $Q = \operatorname{conv}(x_1, \dots, x_{k-1})$.
Proof. We prove the result by induction on k. The result is true for k = 2 by example
13. Now consider the polytope Q = conv(x1 , . . . , xk ). If all the points x1 , . . . , xk are
vertices of Q, there is nothing to prove. Otherwise, one point, say, xk is not a vertex
of Q. By the previous lemma, Q = conv(x1 , . . . , xk−1 ). By the inductive hypothesis,
Q is the convex hull of its vertices.
The fact that a polytope is the convex hull of its extreme points is the weakest
version of the well-known Krein-Milman theorem.
The standard 2-simplex is a triangle with vertices (0, 0), (1, 0), and (0, 1). The
standard 3-simplex is a pyramid with vertices (0, 0, 0), (1, 0, 0), (0, 1, 0), and
(0, 0, 1).
Every point x in the standard n-simplex can be written uniquely as $x = \sum_{i=1}^{n} \lambda_i e_i$,
where $\lambda_i \in [0, 1]$, and $\sum_{i=1}^{n} \lambda_i \le 1$. Set $\lambda_0 = 1 - \sum_{i=1}^{n} \lambda_i$. The numbers $\lambda_0, \dots, \lambda_n$
are called the barycentric coordinates of x.
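The passage from the Cartesian coordinates of a point of the standard n-simplex to its barycentric coordinates is a one-line computation. The sketch below assumes exact rational coordinates, and `barycentric` is a hypothetical name.

```python
from fractions import Fraction

def barycentric(x):
    """Barycentric coordinates [lambda_0, lambda_1, ..., lambda_n] of a point x
    of the standard n-simplex, where lambda_0 = 1 - (lambda_1 + ... + lambda_n)."""
    lam = [Fraction(t) for t in x]
    if any(t < 0 for t in lam) or sum(lam) > 1:
        raise ValueError("x is not in the standard simplex")
    return [1 - sum(lam)] + lam
```

The returned numbers are nonnegative and sum to 1, so they express x as a convex combination of the vertices 0, e1, ..., en of the simplex.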
Exercises
The concept of an inner product stems out of the need to have an instrument that
determines the orthogonality of vectors in a normed linear space. Let us consider
the Euclidean norm on ℝⁿ. The orthogonality of two vectors $x = (x_1, \dots, x_n)$ and
$y = (y_1, \dots, y_n)$ in ℝⁿ is equivalent to the condition that $\|x + y\|^2 = \|x\|^2 + \|y\|^2$
(the Pythagorean theorem). Now
\[ \|x + y\|^2 = \sum_{i=1}^{n} (x_i + y_i)^2 = \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 + 2\sum_{i=1}^{n} x_i y_i = \|x\|^2 + \|y\|^2 + 2\sum_{i=1}^{n} x_i y_i. \]
Thus the orthogonality of x and y is equivalent to the condition $\sum_{i=1}^{n} x_i y_i = 0$.
Example 2. The space l2 is an inner product space with the inner product
$\langle x, y\rangle = \sum_{n=1}^{\infty} x_n \overline{y_n}$.
Example 3. The space 𝒞[a, b] is an inner product space with the inner product
$\langle f, g\rangle = \int_a^b f(x)\overline{g(x)}\,dx$.
For an element x in an inner product space H, we write ‖x‖ = √⟨x, x⟩. We will see
shortly that ‖.‖ is indeed a norm on H.
Substituting $\alpha = -\langle x, y\rangle/\|y\|^2$, we obtain $0 \le \|x\|^2 - \frac{|\langle x, y\rangle|^2}{\|y\|^2}$, from which the
Cauchy-Schwarz inequality follows.
It is easy to verify that if y = 𝛼x, then |⟨x, y⟩| = ‖x‖‖y‖. Conversely, suppose
that |⟨x, y⟩| = ‖x‖‖y‖. Now
‖x + y‖ ≤ ‖x‖ + ‖y‖.
Taking the square roots of the extreme sides of the above string yields the triangle
inequality.
It follows from the above corollary that the function $\|x\| = \langle x, x\rangle^{1/2}$ defines a norm
on H. Therefore every inner product space is a normed linear space.
Definition. Two vectors x and y in an inner product space are said to be orthog-
onal if ⟨x, y⟩ = 0. Symbolically, we write x ⟂ y to indicate the orthogonality of
x and y.
Proof.
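\[ \|x + y\|^2 = \langle x + y, x + y\rangle = \langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle = \langle x, x\rangle + \langle y, y\rangle = \|x\|^2 + \|y\|^2. \]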
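\[ \langle u_n, u_m\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{int}e^{-imt}\,dt = \frac{1}{2\pi i(n - m)}\, e^{i(n-m)t}\Big|_{-\pi}^{\pi} = 0, \]
while
\[ \langle u_n, u_n\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{int}e^{-int}\,dt = 1. \]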
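The two integrals can be checked numerically. The sketch below assumes that u_n(t) = e^{int} and approximates the normalized integral over [−π, π] by a uniform Riemann sum; for integer frequencies whose difference is smaller in absolute value than the number of sample points, this sum reproduces the integral up to rounding error. The name `inner` and the default of 64 sample points are choices made only for this illustration.

```python
import cmath
import math

def inner(n, m, samples=64):
    """Approximate <u_n, u_m> = (1/2π) ∫ e^{int} conj(e^{imt}) dt over [-π, π]
    by averaging the integrand over `samples` equally spaced points."""
    total = 0j
    for k in range(samples):
        t = -math.pi + 2 * math.pi * k / samples
        total += cmath.exp(1j * n * t) * cmath.exp(1j * m * t).conjugate()
    return total / samples
```

Here `inner(3, 3)` should be close to 1 and `inner(3, 5)` close to 0, matching the computation above; the factor 1/2π is absorbed into the average over the sample points.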
Observe the convenience of including the factor 1/2𝜋 in the definition of the
inner product.
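The two integrals above can be checked numerically. The following is a minimal sketch, assuming the inner product ⟨f, g⟩ = (1/(2𝜋)) ∫ f(t) times the conjugate of g(t) over [−𝜋, 𝜋] and the functions u_n(t) = e^{int}; the helper names inner and u, and the choice of the midpoint rule, are ours and are not part of the text.

```python
import cmath
import math

def inner(f, g, steps=2000):
    """Midpoint-rule approximation of (1/(2*pi)) * integral of f(t) * conj(g(t)) over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        t = -math.pi + (k + 0.5) * h
        total += f(t) * g(t).conjugate()
    return total * h / (2 * math.pi)

def u(n):
    """The function u_n(t) = e^{int} (hypothetical helper name)."""
    return lambda t: cmath.exp(1j * n * t)

# Gram matrix of u_{-2}, ..., u_2: approximately the 5 x 5 identity matrix.
indices = range(-2, 3)
gram = [[inner(u(n), u(m)) for m in indices] for n in indices]
```

With the factor 1/(2𝜋) included, the diagonal entries of gram come out (numerically) equal to 1, which is the convenience the remark above refers to.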
Proof. Let {u1 , . . . , un } be a finite subset of S, and suppose that, for scalars a1 , . . . , an ,
∑_{i=1}^{n} a_i u_i = 0. For a fixed 1 ≤ j ≤ n,
⟨∑_{i=1}^{n} a_i u_i, u_j⟩ = ∑_{i=1}^{n} a_i ⟨u_i, u_j⟩ = a_j.
But
⟨∑_{i=1}^{n} a_i u_i, u_j⟩ = ⟨0, u_j⟩ = 0.
We continue to exploit the geometry of vectors in ℝ2 to get direction for the next
step. An important concept in geometry (and in Hilbert space theory) is that of
projecting a vector onto another. Let x ∈ ℝ2 and let u be a unit vector in the plane.
The length of the projection of x onto u is given by ‖x‖ cos 𝜃 = ⟨x, u⟩; hence the
vector projection of x onto the line containing u is the vector y = ⟨x, u⟩u. This is
the closest vector in the line containing u to the vector x. Since the projection
of a vector x ∈ ℝ3 onto the span M of two orthonormal vectors u1 and u2 is the
sum of the individual projections of x onto u1 and u2 , the projection of x on M is
⟨x, u1 ⟩u1 + ⟨x, u2 ⟩u2 . The constructions involved in the next two theorems are now
well motivated.
x = ∑_{i=1}^{n} \hat{x}_i u_i, and ‖x‖^2 = ∑_{i=1}^{n} |\hat{x}_i|^2.
‖x‖^2 = ∑_{i=1}^{n} ‖\hat{x}_i u_i‖^2 = ∑_{i=1}^{n} |\hat{x}_i|^2.
M⟂ = {z ∈ H ∶ z ⟂ x ∀x ∈ M}.
x = y + z, where y ∈ M and z ∈ M⟂ .
⟨z, u_j⟩ = ⟨x − ∑_{i=1}^{n} ⟨x, u_i⟩u_i, u_j⟩
= ⟨x, u_j⟩ − ∑_{i=1}^{n} ⟨x, u_i⟩⟨u_i, u_j⟩ = ⟨x, u_j⟩ − ⟨x, u_j⟩ = 0.
Sometimes the basis vectors u1 , . . . , un in the above theorem are merely orthogonal
and not orthonormal. In this case, we find the orthogonal projection of x on M by
using the formula
y = ∑_{i=1}^{n} ⟨x, u_i⟩ u_i/‖u_i‖^2,
which is the previously stated formula for y when each u_i is replaced with the
normalized vector u_i/‖u_i‖.
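As an illustration of the projection formula for a merely orthogonal basis, here is a small sketch for real vectors. The function names dot and project and the sample vectors are our own choices, made only for this example.

```python
def dot(x, y):
    """Standard inner product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def project(x, basis):
    """Orthogonal projection of x onto span(basis).

    Assumes the vectors in `basis` are nonzero and mutually orthogonal
    (not necessarily unit vectors), as in the formula above.
    """
    y = [0.0] * len(x)
    for u in basis:
        c = dot(x, u) / dot(u, u)          # coefficient <x, u_i> / ||u_i||^2
        y = [yi + c * ui for yi, ui in zip(y, u)]
    return y

u1 = [1.0, 1.0, 0.0]
u2 = [1.0, -1.0, 0.0]                      # orthogonal to u1, not normalized
x = [3.0, 4.0, 5.0]
y = project(x, [u1, u2])                   # component of x in M = span{u1, u2}
z = [xi - yi for xi, yi in zip(x, y)]      # component of x in the orthogonal complement
```

Here y is the part of x lying in the plane spanned by u1 and u2, and z = x − y is orthogonal to both basis vectors.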
Proof. We use induction on dim(H). Let {v1 , . . . , vn } be a basis for H. Use the
inductive hypothesis to find an orthogonal basis {u1 , . . . , un−1 } for the inner
product space Span({v1 , . . . , vn−1 }), and define
u_n = v_n − ∑_{j=1}^{n−1} ⟨v_n, u_j⟩ u_j/‖u_j‖^2.
The above theorem and its proof deliver more than the mere existence of an
orthonormal basis for an arbitrary finite-dimensional inner product space.
The proof is inductive and constructive; hence it can be applied to an infinite
independent sequence of vectors, recursively, as follows.
Vn = Span({v1 , . . . , vn }) = Span({u1 , . . . , un }) = Un .
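The recursive construction can be sketched in a few lines. This is an illustrative version for real vectors with exact rational arithmetic; it produces an orthogonal (not normalized) sequence using the formula u_n = v_n − ∑ ⟨v_n, u_j⟩ u_j/‖u_j‖^2. The function name gram_schmidt and the sample vectors are assumptions made for the example.

```python
from fractions import Fraction

def dot(x, y):
    """Standard inner product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return orthogonal vectors u_1, ..., u_n with Span(u_1..u_k) = Span(v_1..v_k) for each k.

    Assumes the input vectors are linearly independent.
    """
    us = []
    for v in vectors:
        u = list(v)
        for w in us:
            c = dot(v, w) / dot(w, w)      # <v_n, u_j> / ||u_j||^2
            u = [ui - c * wi for ui, wi in zip(u, w)]
        us.append(u)
    return us

vs = [[Fraction(1), Fraction(1), Fraction(0)],
      [Fraction(1), Fraction(0), Fraction(1)],
      [Fraction(0), Fraction(1), Fraction(1)]]
us = gram_schmidt(vs)
```

Because each u_k is computed only from v_k and the previously found u_1, . . . , u_{k−1}, the same loop can be continued indefinitely along an infinite independent sequence.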
Example 7. Consider the space 𝒞[−1, 1] with the inner product ⟨f, g⟩ =
∫_{−1}^{1} f(x)g(x) dx. Applying the Gram-Schmidt process to the infinite independent
sequence of monomials 1, x, x^2, . . . , we obtain a sequence of orthogonal
polynomials P0 , P1 , . . . , that spans the space of polynomials such that
The following observation is sometimes crucial for avoiding the often cumbersome
calculations needed to compute the orthogonal sequence u1 , u2 , . . . .
∫_{−1}^{1} x^j D^n (x^2 − 1)^n dx = −∫_{−1}^{1} j x^{j−1} D^{n−1} (x^2 − 1)^n dx + x^j D^{n−1} (x^2 − 1)^n |_{−1}^{1}.
The second term is zero because if k < n, then x^2 − 1 is a factor of D^k (x^2 − 1)^n.
The same reason coupled with integration by parts j − 1 times proves the desired
result.
𝜆(x) = ∑_{i=1}^{n} \hat{x}_i 𝜆(u_i) = ∑_{i=1}^{n} \hat{x}_i a_i.
On the other hand,
⟨x, y⟩ = ⟨∑_{i=1}^{n} \hat{x}_i u_i, ∑_{j=1}^{n} a_j u_j⟩ = ∑_{i,j=1}^{n} \hat{x}_i a_j ⟨u_i, u_j⟩ = ∑_{i=1}^{n} \hat{x}_i a_i = 𝜆(x).
For a (complex) matrix A, we use the symbol A^∗ to denote the conjugate transpose
of A. Thus (A^∗)_{ij} = \overline{a_{ji}}. The following theorem sums up the properties of conjugate
transposition. We only verify part (c).
Theorem 3.7.8. Let A and B be matrices of compatible sizes for matrix multiplica-
tion. Then
(a) A∗∗ = A,
(b) (AB)∗ = B∗ A∗ , and
(c) if A is an n × n matrix, then, for all x, y ∈ ℂn , ⟨Ax, y⟩ = ⟨x, A∗ y⟩.
P^∗P = \begin{pmatrix} u_1^∗ \\ \vdots \\ u_n^∗ \end{pmatrix} (u_1, . . . , u_n).
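P_𝜃 = \begin{pmatrix} cos 𝜃 & −sin 𝜃 \\ sin 𝜃 & cos 𝜃 \end{pmatrix}
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} x cos 𝜃 − y sin 𝜃 \\ x sin 𝜃 + y cos 𝜃 \end{pmatrix} = P_𝜃 \begin{pmatrix} x \\ y \end{pmatrix}.
\begin{pmatrix} −1 & 0 \\ 0 & 1 \end{pmatrix}.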
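A = PDP^∗ = (u_1, . . . , u_n) \begin{pmatrix} 𝜆_1 & & \\ & ⋱ & \\ & & 𝜆_n \end{pmatrix} \begin{pmatrix} u_1^∗ \\ \vdots \\ u_n^∗ \end{pmatrix} = (u_1, . . . , u_n) \begin{pmatrix} 𝜆_1 u_1^∗ \\ \vdots \\ 𝜆_n u_n^∗ \end{pmatrix}.
Thus A = ∑_{i=1}^{n} 𝜆_i u_i u_i^∗. Similarly, A^∗ = ∑_{i=1}^{n} \overline{𝜆_i} u_i u_i^∗.
Now AA^∗ = ∑_{i,j=1}^{n} 𝜆_i \overline{𝜆_j} u_i (u_i^∗ u_j) u_j^∗ = ∑_{i=1}^{n} |𝜆_i|^2 u_i u_i^∗ = A^∗A.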
Theorem 3.7.13 establishes the fact that normality is also a sufficient condition for
the unitary diagonalization of a matrix.
Proof. It is easy to verify that A − 𝜆I is normal and that its conjugate transpose is
A^∗ − \overline{𝜆}I. By the previous lemma, ‖(A − 𝜆I)u‖ = ‖(A^∗ − \overline{𝜆}I)u‖. Thus (A − 𝜆I)u =
0 if and only if (A^∗ − \overline{𝜆}I)u = 0.
P∗ AP = D.
Proof. The proof is inductive. The base case (n=2) is left as an exercise. Let 𝜆1 be
an eigenvalue of A, and let v1 be a unit eigenvector corresponding to 𝜆1 . Let
M = Span({v1 }), and let {v2 , . . . , vn } be an orthonormal basis for M⟂ . By construc-
tion, the matrix Q = (v1 , . . . , vn ) is unitary.
We claim that Q∗ AQ has the form
Q^∗AQ = \begin{pmatrix} 𝜆_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & A′ & \\ 0 & & & \end{pmatrix}.
Now
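(Q^∗AQ)(Q^∗AQ)^∗ = \begin{pmatrix} |𝜆_1|^2 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & A′(A′)^∗ & \\ 0 & & & \end{pmatrix},
while
(Q^∗AQ)^∗(Q^∗AQ) = \begin{pmatrix} |𝜆_1|^2 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & (A′)^∗A′ & \\ 0 & & & \end{pmatrix}.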
⟨Tx, y⟩ = ⟨x, T∗ y⟩
for all x, y ∈ H.
We will develop the analog of the spectral decomposition of a normal matrix for
normal operators.
q_{jk} = ⟨T^∗(v_j), v_k⟩ − ∑_{i=1}^{n} a_{ji} ⟨v_i, v_k⟩ = ⟨v_j, Tv_k⟩ − a_{jk}
= ⟨v_j, ∑_{i=1}^{n} a_{ik} v_i⟩ − a_{jk} = ∑_{i=1}^{n} a_{ik} ⟨v_j, v_i⟩ − a_{jk} = a_{jk} − a_{jk} = 0.
Proof. Fix an orthonormal basis B = {v1 , . . . , vn } for H, and let A be the matrix of
T relative to B. By the previous lemma, A is normal and, by theorem 3.7.13, A is
The above theorem leads to the spectral theorem for normal operators on finite-
dimensional inner product spaces.
T = ∑_{i=1}^{n} 𝜆_i P_i,
Exercises
1. For functions f, g ∈ 𝒞^∞[0, 1], define ⟨f, g⟩ = f(0)g(0) + ∫_0^1 f′(x)g′(x) dx.
Prove that ⟨., .⟩ is an inner product on 𝒞^∞[0, 1].
2. Prove that the following generalization of the previous exercise also defines
an inner product on 𝒞^∞[0, 1]: for a fixed positive integer n,
⟨f, g⟩ = ∑_{i=0}^{n−1} f^{(i)}(0) g^{(i)}(0) + ∫_0^1 f^{(n)}(x) g^{(n)}(x) dx.
3. Prove the following properties of inner products, which are often used
without explicit mention:
(a) If x, y are vectors in an inner product space H such that
⟨x, w⟩ = ⟨y, w⟩ for every w ∈ H, then x = y.
⟨∑_{i=1}^{n} 𝛼_i u_i, ∑_{j=1}^{m} 𝛽_j v_j⟩ = ∑_{i=1}^{n} ∑_{j=1}^{m} 𝛼_i 𝛽_j ⟨u_i, v_j⟩.
‖∑_{i=1}^{n} 𝛼_i u_i‖^2 = ∑_{i=1}^{n} |𝛼_i|^2 ‖u_i‖^2.
A = \begin{pmatrix} −1 & 0 & 0 \\ 0 & 0 & 2 \\ 0 & 2 & 3 \end{pmatrix}.
OUP UNCORRECTED PROOF – FINAL, 15/1/2021, SPi
4
The Metric Topology
Felix Hausdorff was born into a wealthy Jewish family and when he was still
a young boy, the family moved to Leipzig. He studied at Leipzig University,
graduating in 1891 with a doctorate in the applications of mathematics to
astronomy. He published four papers on astronomy and optics over the next
few years. Hausdorff remained in Leipzig, where he lectured until 1910. He then
moved to Bonn, then to Greifswald in 1913, returning to Bonn in 1921, where he
continued his work until 1935.
Hausdorff was the first to coin the definitions of metric and topological spaces. In
1914, building on work by Maurice Fréchet and others, he published his famous
text Grundzüge der Mengenlehre. The book was the beginning point for studying
metric and topological spaces, which are now core topics in modern mathematics.
Among Hausdorff ’s numerous achievements, we count his introduction of the
notion of the Hausdorff dimension, his study of the Gaussian law of errors, limit
theorems and the problem of moments, and the strong law of large numbers. He
introduced the concept of a partially ordered set and, from 1901 to 1909, he proved
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0004
Basic calculus concepts such as limits and continuity are heavily based on the
concept of proximity. A metric is the most common tool for measuring proximity.
The definition of a metric is a direct abstraction of the properties of the distance
function in the plane. The most important characteristics of the Euclidean distance
are:
The function d is called the distance function, or the metric. Property (c) is
known as the triangle inequality.
Example 1. Let X = ℝ, and let d(x, y) = |x − y|. The triangle inequality is indeed
the inequality known by the same name in elementary mathematics. In general,
the metric on ℝ^n given by d(x, y) = (∑_{i=1}^{n} |x_i − y_i|^2)^{1/2} = ‖x − y‖_2 is called the
Euclidean (or the usual) metric on ℝ^n. In this case, the triangle inequality
follows from Minkowski’s inequality with p = 2.
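A quick numerical spot check of the triangle inequality for the Euclidean metric may be helpful. The sketch below samples random triples of points in ℝ^3; the function names, the seed, and the small tolerance for floating-point rounding are our own choices, and a random test of this kind is of course an illustration and not a proof.

```python
import math
import random

def euclid(x, y):
    """Euclidean distance d(x, y) = (sum |x_i - y_i|^2)^(1/2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def random_point(n):
    """A random point in the cube [-1, 1]^n (hypothetical helper)."""
    return [random.uniform(-1, 1) for _ in range(n)]

random.seed(0)
violations = 0
for _ in range(1000):
    x, y, z = random_point(3), random_point(3), random_point(3)
    # d(x, y) <= d(x, z) + d(z, y), up to rounding error
    if euclid(x, y) > euclid(x, z) + euclid(z, y) + 1e-12:
        violations += 1
```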
Example 2. Normed linear spaces provide a rich source of metric spaces. If (X, ‖.‖)
is a normed linear space, and the distance function is defined by d(x, y) = ‖x −
y‖, then d(x, y) = ‖x − y‖ = ‖(x − z) + (z − y)‖ ≤ ‖x − z‖ + ‖z − y‖ = d(x, z) +
d(z, y), and this proves the triangle inequality. The other properties are trivial to
verify. Special cases include all lp spaces, ℝn with any of the lp metrics, and the
space ℬ[0, 1]. See section 3.6.
Definition. Let (X, d) be a metric space. The open ball of radius r centered at
x ∈ X is the set
The special case of this definition stated in section 3.6 when X is a normed linear
space is consistent with the above definition.
Example 4. In ℝ with the usual metric, B(x, r) is the open interval of radius r
centered at x.
Example 5. In (ℝ2 , ‖.‖2 ), the open ball of radius r centered at (x0 , y0 ) is the open
disk of radius r centered at (x0 , y0 ).
Example 6. In the space ℬ[0, 1] of bounded real functions on [0, 1], the ball B(f, r)
is the set of all bounded functions whose graphs on [0, 1] are between the graphs
of the functions y = f(x) − r and y = f(x) + r.
Example 10. In ℝ with the usual metric, the interval (0, ∞) is open since (0, ∞) =
∪_{i=0}^{∞} (i, i + 2).
Theorem 4.1.1. A subset U of a metric space X is open if and only if, for every x ∈ U,
there exists a positive number 𝛿 such that B(x, 𝛿) ⊆ U.
Proof. Suppose U is open, and let x ∈ U. Since U is the union of open balls, there
exists a ball B(y, r) in X such that x ∈ B(y, r) ⊆ U. Since d(x, y) < r, the number
𝛿 = r − d(x, y) is positive. We show that B(x, 𝛿) ⊆ B(y, r). Let z ∈ B(x, 𝛿). Then
d(z, y) ≤ d(z, x) + d(x, y) < d(x, y) + 𝛿 = r. Conversely, if, for every x ∈ U, there
is a positive number 𝛿x such that B(x, 𝛿x ) ⊆ U, then U = ∪x∈U B(x, 𝛿x ).
(b) Let U and V be open subsets of X, and let x ∈ U ∩ V. By theorem 4.1.1, there
exist positive numbers 𝛿1 and 𝛿2 such that B(x, 𝛿1 ) ⊆ U, and B(x, 𝛿2 ) ⊆ V. Let
𝛿 = min{𝛿1 , 𝛿2 }. Clearly, B(x, 𝛿) ⊆ U ∩ V. Again, by theorem 4.1.1, U ∩ V is open.
By definition, the empty set is also an open subset of any metric space. This is
largely a useful convention. For example, without it the statement of theorem 4.1.2 (b)
would have to read: “The intersection of two open sets is open or empty.” Since we
declare the empty set to be open, the statement as it stands is correct.
Example 12. In ℝ with the usual metric, [a, b], [a, ∞), and ∪n∈ℤ [2n, 2n + 1] are
all closed sets, as the complement of each set is open.
The theorem below follows from theorem 4.1.2 and De Morgan’s laws.
Theorem 4.1.4. Let x and y be distinct elements of a metric space X. Then there exist
open sets U and V containing x and y, respectively, such that U ∩ V = ∅.
Proof. Let 𝛿 = d(x, y), and set U = B(x, 𝛿/2) and V = B(y, 𝛿/2). We show that U ∩
V = ∅. If z ∈ U ∩ V, then d(x, y) ≤ d(x, z) + d(z, y) < 𝛿/2 + 𝛿/2 = 𝛿, which is a
contradiction.
The property established by the above theorem, namely, that distinct points in a
metric space are contained in disjoint open sets, is called the Hausdorff property.
This is an important separation property of metric spaces. Common terminology
used to describe the Hausdorff property is that distinct points in a metric space
can be separated by disjoint open sets.
Now you see that distance functions are not created equal. An open neighborhood
of a point in the discrete metric is either very small (a single point) or very large
(the whole space), while the collection of neighborhoods of a point x in a normed
linear space includes all the open balls centered at x and is therefore quite rich.
There is another important distinction between a general metric and the metric
generated by a norm. In the latter case, the collection of open neighborhoods
of a point is exactly the translation of the collection of open neighborhoods
of any other point. Thus the neighborhoods of a point are identical (up to a
translation) to the neighborhoods of any other point. The open neighborhoods
in a general metric space can be quite heterogeneous in the sense that knowledge
of the neighborhoods of one point tells us nothing about the open neighborhoods
of other points.
Definition. Let (xn ) be a sequence in a metric space X, and let x ∈ X. We say that
(xn ) converges to x if limn d(xn , x) = 0. In this case, we write limn xn = x. We
also say that x is the limit of (xn ). Observe that if X is a normed linear space,
limn xn = x is equivalent to the condition that limn ‖xn − x‖ = 0.
Convergence in the spaces (ℬ[0, 1], ‖.‖∞ ) and (𝒞[0, 1], ‖.‖∞ ) is equivalent to
uniform convergence. Explicitly stated, a sequence (fn ) of bounded (respectively,
continuous) functions converges in the uniform norm to a bounded (respectively,
continuous) function f if, for every 𝜖 > 0, there exists a positive integer N, dependent
only on 𝜖, such that, for all x ∈ [0, 1] and all n > N, |fn (x) − f(x)| < 𝜖. Clearly, the
pointwise convergence of (fn ) to f is necessary for its uniform convergence to f. The
following two examples illustrate that the converse is not true.
f_n(x) = \begin{cases} nx & if 0 ≤ x ≤ 1/n, \\ 1 & if 1/n ≤ x ≤ 1. \end{cases}
The pointwise limit of the sequence (f_n) is clearly the bounded function
f(x) = \begin{cases} 0 & if x = 0, \\ 1 & if 0 < x ≤ 1. \end{cases}
However, the sequence (fn ) does not converge to f in ℬ[0, 1] because, for every
n ∈ ℕ, ‖fn − f‖∞ ≥ |fn (1/(2n)) − f(1/(2n))| = |1/2 − 1| = 1/2.
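The failure of uniform convergence in this example can be made concrete. The sketch below evaluates f_n and its pointwise limit f exactly, using rational arithmetic, at the witness point x = 1/(2n) used in the estimate above; the function names are ours and serve only the illustration.

```python
from fractions import Fraction

def f_n(n, x):
    """f_n(x) = n*x on [0, 1/n] and 1 on [1/n, 1]."""
    return n * x if x <= Fraction(1, n) else Fraction(1)

def f(x):
    """Pointwise limit: 0 at x = 0 and 1 on (0, 1]."""
    return Fraction(0) if x == 0 else Fraction(1)

def gap_at_witness(n):
    """|f_n(x) - f(x)| at the point x = 1/(2n); a lower bound for the sup-norm distance."""
    x = Fraction(1, 2 * n)
    return abs(f_n(n, x) - f(x))

gaps = [gap_at_witness(n) for n in range(1, 11)]            # stays at 1/2 for every n
tail = [f_n(n, Fraction(1, 10)) for n in range(10, 15)]     # at the fixed point x = 1/10
```

At a fixed x > 0 the values f_n(x) eventually equal 1, while the gap at the moving point 1/(2n) never drops below 1/2.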
Example 14. Let f_n(x) = 1/(n^3 (x − 1/n)^2 + 1). Clearly, 0 ≤ f_n(x) ≤ 1, and lim_n f_n(x) = 0 for
all x ∈ [0, 1]. However, f_n does not converge to the zero function in the uniform
norm since ‖f_n‖_∞ = f_n(1/n) = 1.
Theorem 4.1.6. Let (xn ) be a sequence in a metric space X, and let x ∈ X. If every
subsequence of (xn ) contains a subsequence that converges to x, then (xn ) converges
to x.
Exercises
1. Show that the intersection of an arbitrary collection of open sets need not
be open.
The notions of interior, closure, and boundary are quite familiar, and their mean-
ing is rather obvious for simple sets. For example, the interior of the closed
disk D = {x ∈ ℝ2 ∶ ‖x‖2 ≤ 1} is the open disk U = {x ∈ ℝ2 ∶ ‖x‖2 < 1}, and the
boundary of D is the unit circle. The fact that a concept is intuitively obvious
is no substitute for a definition. It is often the case that the definition of a
familiar concept deepens our realization that familiarity and simplicity are not
synonymous. You will see in this section that the interior of ℚ is empty, that its
boundary is the entire real line, and that important subsets of ℝ, such as the Cantor
set, can come in infinitely many fragments. Intuitively speaking, one expects the
definitions to capture the ideas that an interior point of a set A must be completely
surrounded by points of A and that a boundary point of A falls on the edge of A.
Thus any ball centered at a boundary point of A falls partially inside A and partially
outside it. This section formulates precise generalizations of those concepts. We
will also see that disjoint closed sets can be separated in much the same way that
the Hausdorff property separates points.
Example 1. The interior of a nonempty subset may well be empty. The simplest
example is the subset ℚ of the metric space ℝ; int(ℚ) = ∅ because ℚ contains
no open intervals and hence no open subsets of ℝ.
Theorem 4.2.1. The interior of a subset A is the largest open subset of X contained
in A. A subset A is open if and only if int(A) = A. Finally, if A ⊆ B, then int(A) ⊆
int(B).
The above theorem captures what it means for an interior point of A to be totally
surrounded by points of A. In fact, the statement of theorem 4.2.2 can be taken as
the definition of an interior point x of A. One can then define the interior of A to
be the set of all interior points of A.
Proof. Suppose x ∉ \overline{A}. Then x ∈ X − \overline{A}, which is open. Thus there exists 𝛿 > 0 such
that B(x, 𝛿) ⊆ X − \overline{A}. In particular, B(x, 𝛿) ∩ A = ∅. Conversely, if for some 𝛿 >
0, B(x, 𝛿) ∩ A = ∅, then A ⊆ X − B(x, 𝛿), which is closed, so \overline{A} ⊆ \overline{X − B(x, 𝛿)} =
X − B(x, 𝛿). In particular, x ∉ \overline{A}.
Example 2. In ℝ with the usual metric, \overline{ℚ} = ℝ. This is because every open interval
in ℝ contains rational points.
square S = [0, 1] × [0, 1], A ⊆ S. Any point in S that does not belong to A or the
line segment {0} × [0, 1] must lie strictly between two consecutive teeth, and a
small-enough disk centered at the point is strictly contained between the two
teeth. Finally any disk centered at a point on {0} × [0, 1] intersects all the teeth
from some point n on.
Proof. Suppose limn xn = x, where each xn ∈ A, and let 𝛿 > 0. There exists a natural
number N such that, for all n ≥ N, d(xn , x) < 𝛿. In particular, xN ∈ B(x, 𝛿) ∩ A;
thus x ∈ \overline{A}, by theorem 4.2.4. Conversely, suppose x ∈ \overline{A}. By theorem 4.2.4,
B(x, 1/n) ∩ A ≠ ∅ for all n ∈ ℕ. Choose a point xn ∈ B(x, 1/n) ∩ A. Clearly,
limn xn = x.
Example 4. In ℝ2 with the usual metric, points on the unit circle {(x, y) ∶ x2 + y2 =
1} are limit points of the open unit disk {(x, y) ∶ x2 + y2 < 1}.
Proof. Let x be a limit point of A. There exists a point x_1 ∈ A such that 0 < d(x_1, x) <
1. Let 𝛿_2 = min{d(x_1, x), 1/2}. There exists a point x_2 ∈ A such that 0 < d(x_2, x) <
𝛿_2. Note that x_2 ≠ x_1 by construction. The rest of the construction is inductive.
Having found points x_1, . . . , x_n such that 0 < d(x_n, x) < . . . < d(x_1, x) and
d(x_i, x) < 1/i for each i, let 𝛿_{n+1} = min{1/(n + 1), d(x_n, x)}, then choose a point x_{n+1} ∈ A such that
0 < d(x_{n+1}, x) < 𝛿_{n+1}. Clearly, lim_n x_n = x. The converse is straightforward.
Theorem 4.2.7. \overline{A} = A ∪ A′. Thus A is closed if and only if it contains all its limit
points.
Proof. It is enough to show that \overline{A} ⊆ int(A) ∪ 𝜕A. The reverse containment is obvious.
Let x ∈ \overline{A} − 𝜕A. Since every open ball centered at x intersects A, and since
x ∉ 𝜕A, there exists an open ball B(x, 𝛿) that does not intersect X − A. This means
that B(x, 𝛿) ⊆ A; hence x ∈ int(A).
Example 7. Let A = {n ∈ ℕ ∶ n ≥ 2} and B = {n + 1/n ∶ n ≥ 2}. Then dist(A, B) = 0.
To see this, observe that a_n = n ∈ A, that b_n = n + 1/n ∈ B, and that |a_n − b_n| =
1/n → 0 as n → ∞.
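The computation in this example can be illustrated by truncating both sets at a cutoff N, which is our own device for the sake of a finite calculation, and computing the smallest distance between the truncated sets exactly.

```python
from fractions import Fraction

def truncated_distance(N):
    """Smallest |a - b| with a in {2, ..., N} and b in {n + 1/n : 2 <= n <= N}.

    The cutoff N is an assumption made for illustration; the sets A and B
    of the example are infinite.
    """
    A = [Fraction(n) for n in range(2, N + 1)]
    B = [n + Fraction(1, n) for n in range(2, N + 1)]
    return min(abs(a - b) for a in A for b in B)

distances = [truncated_distance(N) for N in (2, 5, 10, 50)]
```

The minimum over the truncated sets is attained at a_N and b_N and equals 1/N, so it decreases to 0 as the cutoff grows, even though A and B have no point in common.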
Proof. Suppose x ∈ \overline{A}. By theorem 4.2.5, there exists a sequence (xn ) in A such that
limn xn = x. Thus dist(x, A) = inf {d(x, a) ∶ a ∈ A} ≤ d(xn , x) for all n ∈ ℕ. Since
limn d(xn , x) = 0, dist(x, A) = 0. Conversely, if dist(x, A) = 0, then there exists a
sequence of points xn ∈ A such that limn d(xn , x) = 0. Thus limn xn = x, and x ∈ \overline{A}
by theorem 4.2.5.
Separation is a central idea in topology and analysis, and its importance cannot be
exaggerated. The Hausdorff property is the simplest form of separation. We will
see below that closed sets can be separated in much the same way points can be.
Theorem 4.2.12. Let F be a closed subset of X, and let x ∈ X − F. Then there exist
open subsets U and V such that x ∈ U, F ⊆ V, and U ∩ V = ∅.
Proof. Since x ∈ X − F, which is open, there exists 𝛿 > 0 such that B(x, 𝛿) ⊆ X − F.
For every y ∈ F, d(x, y) ≥ 𝛿; hence B(x, 𝛿/2) ∩ B(y, 𝛿/2) = ∅. The open sets U =
B(x, 𝛿/2) and V = ∪{B(y, 𝛿/2) ∶ y ∈ F} satisfy the conclusion of the theorem.
Theorem 4.2.13. Let E and F be disjoint closed subsets of a metric space X. Then E
and F can be separated by disjoint open subsets of X.
Proof. We need to find disjoint open subsets U and V such that E ⊆ U and F ⊆ V.
For x ∈ E, dist(x, F) > 0, by theorem 4.2.10. Let 𝛿x = dist(x, F). By the proof
of theorem 4.2.12, for every y ∈ F, B(x, 𝛿x /2) ∩ B(y, 𝛿x /2) = ∅. For y ∈ F, let
𝛿y = dist(y, E) > 0. By the proof of theorem 4.2.12, for every x ∈ E, B(x, 𝛿y /2) ∩
B(y, 𝛿y /2) = ∅. Let U = ∪x∈E B(x, 𝛿x /2), and V = ∪y∈F B(y, 𝛿y /2). Clearly, U and
V are open, E ⊆ U, and F ⊆ V. It remains to show that U and V are disjoint.
If z ∈ U ∩ V, then z ∈ B(x, 𝛿x /2) ∩ B(y, 𝛿y /2) for some x ∈ E and y ∈ F. Now,
d(x, y) ≤ d(x, z) + d(z, y) < 𝛿x /2 + 𝛿y /2 ≤ max{𝛿x , 𝛿y }. But d(x, y) ≥ dist(x, F) =
𝛿x and d(x, y) ≥ dist(y, E) = 𝛿y . We have arrived at a contradiction that proves
the theorem.
Subspaces
Let (X, d) be a metric space, and let A be a subset of X. The defining conditions of
the metric are clearly satisfied by the elements of A. Since the distance function is
the only defining characteristic of a metric space, the pair (A, d) is a metric space
in its own right. We say that (A, d) is a subspace of (X, d), and the metric d on A is
called the restricted (induced, or subspace) metric.
If A is a subspace of a metric space X, we use the notation B_A(x, 𝛿) to denote the ball
in A of radius 𝛿 centered at a point x ∈ A. Thus B_A(x, 𝛿) = {a ∈ A ∶ d(x, a) < 𝛿}.
We use the notation BA to denote the closure of a subset B of A in the restricted
metric.
Proof. We prove part (b) and leave the rest of the statements to the reader. If B is an
open subset of A, then B is the union of open balls in A. Thus B = ∪_{𝛼∈I} B_A(x_𝛼, 𝛿_𝛼).
By part (a), B_A(x_𝛼, 𝛿_𝛼) = B(x_𝛼, 𝛿_𝛼) ∩ A; thus, B = ∪_{𝛼∈I} [B(x_𝛼, 𝛿_𝛼) ∩ A] = [∪_{𝛼∈I}
B(x_𝛼, 𝛿_𝛼)] ∩ A = U ∩ A, where U = ∪_{𝛼∈I} B(x_𝛼, 𝛿_𝛼), which is open in X. We leave
the proof of the converse as an exercise.
Consider the closed unit interval I = [0, 1]. Trisect I and remove the open middle
third (1/3, 2/3). This leaves two closed intervals: I_{1,1} = [0, 1/3] and I_{1,2} = [2/3, 1].
Let C_1 = I_{1,1} ∪ I_{1,2}. Repeat the construction for each of the intervals I_{1,1} and
I_{1,2}, thus removing the middle open third of each of the two intervals. This
leaves four closed intervals: I_{2,1} = [0, 1/9], I_{2,2} = [2/9, 1/3], I_{2,3} = [2/3, 7/9], and
I_{2,4} = [8/9, 1]. Define C_2 = ∪_{j=1}^{4} I_{2,j}. Repeating this construction yields, for every
n ∈ ℕ, a sequence of closed intervals I_{n,1}, . . . , I_{n,2^n}, each of length 3^{−n}. Define
C_n = ∪_{j=1}^{2^n} I_{n,j}.
The Cantor set is defined to be C = ∩_{n=1}^{∞} C_n.
It is clear that C is an infinite set because it contains the endpoints of each of the
intervals I_{n,j} for all n ∈ ℕ and all 1 ≤ j ≤ 2^n. What is less obvious is whether C
contains any additional points. The surprising answer is that C is uncountable.
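The trisection procedure is easy to carry out mechanically. The following sketch lists the intervals I_{n,1}, . . . , I_{n,2^n} with exact rational endpoints; the function name cantor_intervals is our own.

```python
from fractions import Fraction

def cantor_intervals(n):
    """Return the 2**n closed intervals making up C_n as (left, right) pairs, left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))      # left closed third
            refined.append((b - third, b))      # right closed third
        intervals = refined
    return intervals

stage2 = cantor_intervals(2)
stage5 = cantor_intervals(5)
```

For n = 2 this reproduces the four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], and [8/9, 1] listed above.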
Without loss of generality, assume that the hyperplane contains the origin.
Thus there is a nonzero vector a such that M = {x ∈ ℝ^n ∶ a^T x = 0}. For x ∈ M,
and 𝜖 > 0, the open ball B = B(x, 𝜖) is not contained in M because the point
x + 𝜖a/(2‖a‖) ∈ B − M.
Lemma 4.2.15. The Cantor set is closed, perfect, and nowhere dense.
We show that C contains no open intervals. This proves that int(C) = ∅. Let J be an
open interval of length 𝜖 > 0, and choose an integer n such that 3^{−n} < 𝜖. Since
the length of each of the intervals I_{n,j} is 3^{−n}, none of the intervals I_{n,j} contains J.
Finally, let x ∈ C and consider the interval (x − 𝜖, x + 𝜖). Again choose an integer n
such that 3^{−n} < 𝜖. Since x ∈ C_n, x ∈ I_{n,j} for some 1 ≤ j ≤ 2^n. Because the length of
I_{n,j} is 3^{−n} < 𝜖, I_{n,j} ⊆ (x − 𝜖, x + 𝜖). Thus (x − 𝜖, x + 𝜖) intersects C at a point other
than x (at least one of the endpoints of I_{n,j} is not equal to x). This shows that every
point in C is a limit point of C; hence C is perfect.
In the rest of this subsection and in the section exercises, we need to consider the
ternary (base 3) expansions of points in [0, 1]. Every point x ∈ [0, 1] has a ternary
representation x = ∑_{i=1}^{∞} a_i/3^i, where each a_i ∈ {0, 1, 2}. The sum may well be finite,
x = ∑_{i=1}^{n} a_i/3^i, and the ternary representation of a number may not be unique, but
this point is of no immediate consequence. However, see the section exercises.
Lemma 4.2.16. If x = ∑_{i=1}^{n} a_i/3^i, a_i ∈ {0, 2}, then x is the left endpoint of an interval
I_{n,j} for some 1 ≤ j ≤ 2^n.
Proof. We prove the result by induction on n. When n = 1, x = a_1/3. If a_1 = 0, x = 0,
which is the left endpoint of I_{1,1}. If a_1 = 2, x = 2/3, which is the left endpoint of I_{1,2}.
Now suppose the statement is true for a certain integer n. Consider a number
y = ∑_{i=1}^{n+1} a_i/3^i, where a_i ∈ {0, 2}. If a_{n+1} = 0, there is nothing to prove, so suppose
a_{n+1} = 2. By the inductive hypothesis, the number x = ∑_{i=1}^{n} a_i/3^i is the left endpoint
of an interval I_{n,j} for some 1 ≤ j ≤ 2^n. Since y = x + 2/3^{n+1}, y is the left endpoint of
the right closed subinterval that results from the trisection of I_{n,j}. Thus y is the left
endpoint of an interval I_{n+1,k} for some 1 ≤ k ≤ 2^{n+1}.
Proposition 4.2.17. If y = ∑_{i=1}^{∞} a_i/3^i, where a_i ∈ {0, 2}, then y ∈ C.
Proof. It is enough to prove that y ∈ C_n for every n ∈ ℕ. Let x = ∑_{i=1}^{n} a_i/3^i. By the
previous lemma, x is the left endpoint of some interval I_{n,j}. Since the length of I_{n,j}
is 3^{−n} and 0 ≤ y − x ≤ ∑_{i=n+1}^{∞} 2/3^i = 3^{−n}, y ∈ I_{n,j} ⊆ C_n.
We now need the binary (base 2) representations of numbers in the interval [0, 1].
In this system, every x ∈ [0, 1] can be written as a series ∑_{i=1}^{∞} a_i/2^i, where a_i ∈ {0, 1}.
Again, such a representation may be finite, x = ∑_{i=1}^{n} a_i/2^i, and, in this case, x does
not have a unique representation. For example, the number 1/2 can also be written
as 1/4 + 1/8 + 1/16 + . . . . In general, the number x = ∑_{i=1}^{n} a_i/2^i, where a_n = 1, can
also be written as x = ∑_{i=1}^{n−1} a_i/2^i + ∑_{i=n+1}^{∞} 1/2^i. In order to avoid ambiguity, we use
the latter representation of x and not the finite sum representation.
Proof. We define a function f ∶ [0, 1] → C as follows: f(0) = 0, and, for x ∈ (0, 1],
write x = ∑_{i=1}^{∞} a_i/2^i and define f(x) = ∑_{i=1}^{∞} 2a_i/3^i. By the previous proposition, f(x) ∈
C. We leave it to the reader to verify that f is one-to-one. Now lemma 2.2.3 implies
that C is equivalent to [0, 1]; hence Card(C) = Card([0, 1]) = 𝔠.
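The digit map behind f can be explored on finite digit strings. The sketch below is only a finite approximation: it applies the rule a_i ↦ 2a_i to binary strings of a fixed length n (whereas f itself acts on infinite expansions) and compares the resulting partial sums with the left endpoints of the intervals I_{n,j}. All function names are ours.

```python
from fractions import Fraction
from itertools import product

def cantor_left_endpoints(n):
    """Left endpoints of the 2**n intervals I_{n,j}, obtained by repeated trisection."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece
                     for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return {a for a, _ in intervals}

def f_partial(bits):
    """Partial sum of 2*a_i / 3^i over the given binary digits a_1, ..., a_n."""
    return sum(Fraction(2 * a, 3 ** i) for i, a in enumerate(bits, start=1))

n = 4
images = {f_partial(bits) for bits in product((0, 1), repeat=n)}
```

Distinct digit strings of length n give distinct partial sums, and the set of partial sums coincides with the set of left endpoints of the intervals I_{n,j}, in agreement with lemma 4.2.16.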
Exercises
10. Let (X, d) be a metric space, and let A and B be subsets of X. Prove that
(a) int(A ∩ B) = int(A) ∩ int(B), and
(b) int(A ∪ B) ⊇ int(A) ∪ int(B), giving an example to show that the con-
tainment may be proper.
11. Let A and B be as in the previous exercise. Prove that
(a) \overline{A ∪ B} = \overline{A} ∪ \overline{B}, and
(b) \overline{A ∩ B} ⊆ \overline{A} ∩ \overline{B}, giving an example to show that the containment may be
proper.
12. Show that if a sequence (xn ) in a metric space X converges to x, then {xn ∶
n ∈ ℕ} ∪ {x} is closed in X.
13. Cantor-like sets. Let 0 < 𝜖 ≤ 1. From the unit interval [0, 1], remove the
open subinterval of length 𝜖/3 centered at 1/2, leaving the two closed
intervals I_{1,1} and I_{1,2}. Then repeat the geometric construction of the Cantor
set, except require that the removed open interval from I_{n,j} be centered at
the midpoint of I_{n,j} and have length 𝜖/3^{n+1}. The resulting set, C_𝜖, is known
as a Cantor-like set. Prove that C_𝜖 is closed, perfect, and nowhere dense.
14. Complete the proof of theorem 4.2.18. Hint: Modify the proof of theorem
2.1.15.
15. Prove the converse of lemma 4.2.16.
We now take a more careful look at the ternary representation of numbers
in [0, 1]. Specifically, we address the issue of the nonuniqueness of the
representation of a finite sum x = ∑_{i=1}^{n} a_i/3^i, where a_n ≠ 0. If a_n = 2, we use
the finite sum to represent x. If a_n = 1, we use the series x = ∑_{i=1}^{n−1} a_i/3^i +
∑_{i=n+1}^{∞} 2/3^i and not the finite sum to represent x.
16. Prove the converse of proposition 4.2.17. Thus the Cantor set consists
of exactly the points in [0, 1] that have a ternary representation of the
form x = ∑_{i=1}^{∞} a_i/3^i, a_i ∈ {0, 2}. Hint: Prove that if x = ∑_{i=1}^{∞} a_i/3^i, and any of the
integers a_i = 1, then x ∉ C.
17. Prove that a number x ∈ C is a right endpoint of an interval In,j if and only
if the ternary representation of x contains a finite number of zeros.
18. Prove that the interior of the standard n-simplex Tn consists of all the points
in Tn with positive barycentric coordinates. Hence describe the boundary
of Tn .
Continuity, from the intuitive point of view, is about the gradual rather than the
abrupt change of function values. In its simplest form, the graph of a continuous,
Theorem 4.3.2. Let f ∶ (X, d) → (Y, 𝜌). Then f is continuous at x0 if and only if, for
every sequence (xn ) in X with limn xn = x0 , limn f(xn ) = f(x0 ).
Theorem 4.3.2 provides an extremely useful criterion for proving that a given
function is continuous. It is called the sequential characterization of continuity.
See examples 1 and 2 below.
Theorem 4.3.3. For a function f from a metric space (X, d) to a metric space (Y, 𝜌),
the following are equivalent.
(a) f is continuous on X.
(b) The inverse image of an open subset of Y is an open subset of X.
(c) The inverse image of a closed subset of Y is a closed subset of X.
Proof. (a) implies (b). Let V be an open subset of Y, and let x_0 ∈ f^{−1}(V). Since V is
open, there exists 𝜖 > 0 such that B(f(x_0), 𝜖) ⊆ V. Since f is continuous at x_0, there
exists 𝛿 > 0 such that f(B(x_0, 𝛿)) ⊆ B(f(x_0), 𝜖). Thus f^{−1}(V) ⊇ f^{−1}(B(f(x_0), 𝜖)) ⊇
B(x_0, 𝛿). This proves that f^{−1}(V) is open in X.
(b) implies (c). Let F be a closed subset of Y. Then Y − F is open in Y. By
assumption, f^{−1}(Y − F) is open in X. But f^{−1}(Y − F) = X − f^{−1}(F); hence f^{−1}(F)
is closed in X.
(c) implies (a). Let x_0 ∈ X, let 𝜖 > 0, and let V = B(f(x_0), 𝜖); Y − V is closed in Y, so, by
assumption, f^{−1}(Y − V) = X − f^{−1}(V) is closed in X, and hence f^{−1}(V) is open.
Because x_0 ∈ f^{−1}(V), there exists 𝛿 > 0 such that B(x_0, 𝛿) ⊆ f^{−1}(V). By theorem 4.3.1,
f is continuous at x_0.
Example 2 (the continuity of inner products). Let (xn ) and (yn ) be convergent
sequences in an inner product space with limits x and y, respectively. Then
limn ⟨xn , yn ⟩ = ⟨x, y⟩. First recall that convergent sequences are bounded. Thus
there is a constant M such that ‖yn ‖ ≤ M. Now
Definition. Let d1 and d2 be metrics on the same underlying set X. We say that d1
is weaker (or coarser) than d2 if every d1 -open subset of X is d2 -open. In this
case, we also say that d2 is stronger or finer than d1 .
Example 3. Let (X, d1 ) be any metric space, and let d2 be the discrete metric on X.
Clearly, d1 is weaker than d2 . We will give more interesting examples later.
Theorem 4.3.4. A metric d1 is weaker than another metric d2 if and only if the
identity function IX ∶ (X, d2 ) → (X, d1 ) is continuous.
Proof. If I_X ∶ (X, d_2) → (X, d_1) is continuous and V is d_1-open, then I_X^{−1}(V) is open
in d_2. But I_X^{−1}(V) = V, so d_1 is weaker than d_2. The converse is proved by reversing
the above reasoning.
Now we discuss concrete criteria that guarantee that a metric d_1 is weaker than d_2.
Since every d_1-open set is the union of d_1-open balls, it suffices to show that every
d_1-open ball B_{d_1}(x, 𝛿) is d_2-open. Since every y ∈ B_{d_1}(x, 𝛿) is the center of a ball
B_{d_1}(y, 𝛿′) ⊆ B_{d_1}(x, 𝛿), it is further sufficient to show that every open ball B_{d_1}(y, 𝛿′)
contains a d_2-open ball B_{d_2}(y, 𝜖) for some 𝜖 > 0. We apply the above strategy to
prove the following theorem.
Theorem 4.3.5. If there exists a real number 𝛼 > 0 such that d1 (x, y) ≤ 𝛼d2 (x, y) for
all x, y ∈ X, then d1 is weaker than d2 .
Proof. Consider a d_1-open ball B_{d_1}(x, 𝛿), and let 𝜖 = 𝛿/𝛼. It follows that B_{d_2}(x, 𝜖) ⊆
B_{d_1}(x, 𝛿), because if y ∈ B_{d_2}(x, 𝜖), then d_1(x, y) ≤ 𝛼d_2(x, y) < 𝛼𝜖 = 𝛿.
Example 5. Consider the space X = 𝒞[0, 1] under the uniform metric and the
1-metric. The identity function IX ∶ (X, ‖.‖∞ ) → (X, ‖.‖1 ) is continuous since,
for f ∈ 𝒞[0, 1], ‖f‖1 ≤ ‖f‖∞ . By theorem 4.3.4, the 1-metric is weaker than the
uniform metric. However, the identity function IX ∶ (X, ‖.‖1 ) → (X, ‖.‖∞ ) is not
continuous. To see this, consider the sequence (see section 3.6)
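\[
f_n(x) =
\begin{cases}
2n^3 x & \text{if } 0 \le x \le \frac{1}{2n^2}, \\
-2n^3 \left(x - \frac{1}{n^2}\right) & \text{if } \frac{1}{2n^2} \le x \le \frac{1}{n^2}, \\
0 & \text{if } \frac{1}{n^2} \le x \le 1.
\end{cases}
\]
Here $\|f_n\|_1 = \frac{1}{2n}$; hence $f_n \to 0$ in the 1-norm, while $f_n$ does not converge in the
uniform norm since $\|f_n\|_\infty = n$.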
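The two norms of the spike functions in Example 5 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the helper names `f` and `norms` are hypothetical, and the computation relies on the fact that a nonnegative piecewise linear function attains its maximum at a breakpoint and is integrated exactly by the trapezoid rule on its breakpoints.

```python
from fractions import Fraction

def f(n, x):
    """Triangular spike f_n from Example 5, evaluated exactly at a rational x in [0, 1]."""
    n, x = Fraction(n), Fraction(x)
    peak_at = 1 / (2 * n**2)       # location of the peak
    support_end = 1 / n**2         # f_n vanishes to the right of this point
    if x <= peak_at:
        return 2 * n**3 * x
    if x <= support_end:
        return -2 * n**3 * (x - support_end)
    return Fraction(0)

def norms(n):
    """Return (uniform norm, 1-norm) of f_n on [0, 1], computed exactly."""
    n = Fraction(n)
    pts = [Fraction(0), 1 / (2 * n**2), 1 / n**2, Fraction(1)]
    vals = [f(n, p) for p in pts]
    sup_norm = max(vals)
    # Trapezoid rule on the breakpoints is exact for a piecewise linear function.
    one_norm = sum((pts[i + 1] - pts[i]) * (vals[i] + vals[i + 1]) / 2
                   for i in range(len(pts) - 1))
    return sup_norm, one_norm

for n in (1, 2, 5, 10):
    print(n, norms(n))
```

For each n the first entry equals n and the second equals 1/(2n), so the 1-norms shrink while the uniform norms grow without bound.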
Definition. Two metrics d1 and d2 on X are equivalent if they generate the same
collection of open sets. Thus d1 and d2 are equivalent if d1 is weaker than d2 and
d2 is weaker than d1 .
Theorem 4.3.6. Two metrics d1 and d2 on a set X are equivalent if and only if the
identity function IX ∶ (X, d1 ) → (X, d2 ) is bicontinuous.
Theorem 4.3.7. If there exist positive constants 𝛼 and 𝛽 such that 𝛽d2 (x, y) ≤
d1 (x, y) ≤ 𝛼d2 (x, y) for every x, y ∈ X, then d1 and d2 are equivalent.
Theorem 4.3.8. A necessary and sufficient condition for two metrics d1 and d2 on
X to be equivalent is that a sequence (xn ) converges to x in d1 if and only if it
converges to x in d2 .
Proof. Suppose d1 and d2 are equivalent, and let limn xn = x in d1 . By theorem 4.3.6,
IX ∶ (X, d1 ) → (X, d2 ) is continuous; hence, by the sequential characterization of
continuity, IX (xn ) = xn converges to x in d2 . We leave the rest of the proof to the
reader.
Example 6. Let X = ℝn . The metrics induced by the 1-norm, the 2-norm, and
the ∞-norm are all equivalent. To see this, we use theorem 4.3.7. The reader
should work out the details. A partial list of the inequalities needed includes
‖x‖1 ≤ n‖x‖∞ and ‖x‖1 ≤ √n‖x‖2 .
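The inequalities needed for Example 6 can be spot-checked numerically before proving them. The sketch below is an illustration only: it tests the chain ‖x‖∞ ≤ ‖x‖2 ≤ ‖x‖1 ≤ √n‖x‖2 together with ‖x‖1 ≤ n‖x‖∞ on random vectors. Only the last two inequalities are listed in the text; the others are standard facts supplied here as an assumption about what the "partial list" omits. The function names and the tolerance are hypothetical choices.

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    return max(abs(t) for t in x)

def check_chain(x, tol=1e-9):
    """Check the norm inequalities on R^n for one vector, up to rounding error."""
    n = len(x)
    a, b, c = norm_inf(x), norm2(x), norm1(x)
    return (a <= b + tol and b <= c + tol
            and c <= math.sqrt(n) * b + tol and c <= n * a + tol)

random.seed(0)
samples = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
all_hold = all(check_chain(x) for x in samples)
print(all_hold)
```

A random test of this kind proves nothing, but a failure would signal a misremembered constant before one attempts to apply theorem 4.3.7.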
Example 7. Let (X, d) be a metric space. Then the metric $\bar{d}(x, y) = \min\{1, d(x, y)\}$
is equivalent to d. It is a simple exercise to show that $\bar{d}$ is a metric. The fact that
the two metrics are equivalent follows from $B_d(x, \epsilon) \subseteq B_{\bar{d}}(x, \epsilon)$ and
$B_{\bar{d}}(x, \delta) \subseteq B_d(x, \epsilon)$, where $\delta = \min\{\epsilon, 1\}$.
Remarks.
1. Important properties of metric spaces are often determined by the collection
of open sets and not by the specific metric that generates the open sets. For
Example 8. For an arbitrary metric space (X, d), the metric $\bar{d}(x, y) = \frac{d(x,y)}{1+d(x,y)}$ is
equivalent to d.
We show that $\bar{d}$ satisfies the triangle inequality and leave the rest of the details
for the reader to verify. The function $f : [0, \infty) \to [0, 1)$ defined by $f(t) = \frac{t}{1+t}$ is
increasing. Thus if $0 \le a \le b$, then $\frac{a}{1+a} \le \frac{b}{1+b}$. Replacing a with d(x, z), and b with
d(x, y) + d(y, z), yields
y = 0, we have ‖Px‖ = ‖x‖. First we claim that, for x, y ∈ ℝn , ⟨Px, Py⟩ = ⟨x, y⟩.
This will conclude the proof because we then have
\[
(P^T P - I_n)x = 0.
\]
Since x is arbitrary, $P^T P - I_n = 0$.
We now prove the claim. The assumption that ‖Px − Py‖2 = ‖x − y‖2 yields
⟨Px − Py, Px − Py⟩ = ⟨x − y, x − y⟩. Expanding the bilinear forms on the two
sides of the last identity yields ⟨Px, Py⟩ = ⟨x, y⟩, as claimed.
Isometric spaces are virtually identical except for the nature of the elements of the
spaces X and Y and the definition of the metrics d and 𝜌. An isometry preserves
all the metric properties of the space, including boundedness, which, as we saw, is
not preserved under the equivalence of metrics. Another metric property that is
preserved under isometries but not under metric equivalence is completeness. See
section 4.6.
Homeomorphisms
Definition. Two metric spaces (X, d) and (Y, 𝜌) are homeomorphic if there exists
a bicontinuous bijection 𝜑 from X to Y. The function 𝜑 is called a homeomor-
phism from X to Y.
Example 10. The open interval (−1, 1) is homeomorphic to ℝ (both sets have the
usual metric). The function $f(t) = \frac{t}{1 - t^2}$ maps (−1, 1) bicontinuously onto ℝ.
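[Figure: the stereographic projection, showing the point N = (0, 1), a point ξ = P⁻¹(x) on the circle, its image x = P(ξ), and the origin (0, 0).]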
Explicit formulas exist for P and P−1 . It is easier to derive the formula for P−1
than to compute that for P. For a fixed x ∈ ℝ, the parametric equations of the
line containing the points N and (x, 0) are 𝜉1 = xt, 𝜉2 = 1 − t, and −∞ < t < ∞.
Finding the intersection point 𝜉 of the line and the circle yields the formula
for P−1 ∶
\[
P^{-1}(x) = \xi = (\xi_1, \xi_2) = \left(\frac{x}{1 + x^2}, \frac{x^2}{1 + x^2}\right).
\]
Inverting the above formulas, one obtains the following formula for the stereo-
graphic projection:
\[
x = P(\xi) = \frac{\xi_1}{1 - \xi_2}.
\]
We define the chordal metric 𝜒(x, y) on ℝ as follows: for two points x, y ∈ ℝ,
𝜒(x, y) is the length of the chord of the circle that joins the points P−1 (x) and
P−1 (y), hence the name chordal metric. Note that 𝜒 is the metric on ℝ that
makes the stereographic projection an isometry. Given the above formula for
P−1 , a direct calculation of the Euclidean distance between P−1 (x) and P−1 (y)
yields
\[
\chi(x, y) = \frac{|x - y|}{\sqrt{1 + x^2}\,\sqrt{1 + y^2}}.
\]
Because the stereographic projection is a homeomorphism, the chordal metric
𝜒 is equivalent to the usual metric on ℝ. We will see in section 4.6 that the
chordal metric is not complete. This illustrates the fact that completeness is not
preserved under homeomorphisms. The reader will recall that boundedness
is not preserved under metric equivalence. Saying that two metrics d1 and
d2 on a space X are equivalent is exactly the same as saying that the identity
mapping IX ∶ (X, d1 ) → (X, d2 ) is a homeomorphism. The properties of a space
that are preserved under homeomorphisms are called topological properties
of the space. Compactness is the prime example of a topological property;
see theorem 4.7.4. The fact that some metric properties, such as boundedness
and completeness, are not preserved under homeomorphisms is rather
inconvenient, but it does not diminish the usefulness of such properties.
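The formulas for P, P⁻¹, and the chordal metric given in this example can be checked against one another numerically. The following is a small sketch under the assumption that the circle in question is the one through N = (0, 1) and the origin on which P⁻¹(x) lies, that is, ξ1² + ξ2² − ξ2 = 0; the function names are hypothetical.

```python
import math

def p_inverse(x):
    """Point of the circle corresponding to the real number x (formula from the text)."""
    return (x / (1 + x * x), x * x / (1 + x * x))

def p(xi):
    """Stereographic projection of a point xi of the circle other than N = (0, 1)."""
    xi1, xi2 = xi
    return xi1 / (1 - xi2)

def chordal(x, y):
    """Closed-form chordal metric on the real line."""
    return abs(x - y) / (math.sqrt(1 + x * x) * math.sqrt(1 + y * y))

def chord_length(x, y):
    """Euclidean length of the chord joining P^{-1}(x) and P^{-1}(y)."""
    (a1, a2), (b1, b2) = p_inverse(x), p_inverse(y)
    return math.hypot(a1 - b1, a2 - b2)

for x, y in [(0.0, 1.0), (-2.0, 3.5), (0.25, 40.0)]:
    print(x, y, chordal(x, y), chord_length(x, y))
```

The two computed distances agree up to rounding, which is the numerical counterpart of the statement that 𝜒 makes the stereographic projection an isometry.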
Exercises
1. Let 𝕂 denote the real or complex field with the usual metric. Prove that if f
and g are continuous functions from a metric space (X, d) to 𝕂, then so are
the functions f ± g and fg. If, in addition, g(x) ≠ 0 for all x ∈ X, then f/g is
continuous.
17. Let X be a normed linear space. Prove that any two open balls in X are
homeomorphic. The same is true of closed balls.
18. The parametric equations of the line containing the points (0, 0, . . . , 1) and
(x1 , x2 , . . . , xn , 0) in ℝn+1 are
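\[
\Big\{\xi = (\xi_1, \xi_2, \dots, \xi_{n+1}) : \sum_{i=1}^{n+1} \xi_i^2 - \xi_{n+1} = 0\Big\},
\]
\[
\xi_i = \frac{x_i}{1 + \|x\|_2^2}, \quad 1 \le i \le n, \qquad \xi_{n+1} = \frac{\|x\|_2^2}{1 + \|x\|_2^2}.
\]
Hence, by inverting the above formulas, derive the formula for the stereographic projection,
\[
P(\xi_1, \dots, \xi_{n+1}) = (x_1, \dots, x_n), \quad \text{where } x_i = \frac{\xi_i}{1 - \xi_{n+1}}.
\]
\[
\chi(x, y) = \frac{\|x - y\|_2}{\sqrt{1 + \|x\|_2^2}\,\sqrt{1 + \|y\|_2^2}}.
\]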
The Euclidean plane ℝ2 , as the product of two copies of ℝ, is the simplest example
of a product space. We saw in section 4.3 that the Euclidean metric in the plane,
although the most natural, is equivalent to several other metrics, including the
∞-metric, which, according to the definition below, is the product metric on
ℝ2 . It is only natural to expect that the product of two open intervals should
be an open subset of ℝ2 , and the definition we adopt for the product metric
smoothly guarantees that. When we identify the complex field with ℝ2 , the
convergence of a complex sequence zn = xn + iyn is equivalent to the convergence
of its real and imaginary parts in ℝ, and one expects that product metrics in
general should extend this property. Not only does the product metric preserve the
componentwise convergence in the factor spaces, it is characterized by it. You will
see that the product metric is the weakest metric that guarantees componentwise
convergence in the factor spaces. Additionally, we will show that the product
metric admits the continuity of the projections on the factor spaces and, once
again, is characterized by it. We therefore think of the product metric as the most
economical metric that generalizes the properties of Euclidean space in relation
to its factor spaces.
Let $\{(X_i, d_i)\}_{i=1}^n$ be a finite set of metric spaces, and let
$X = \prod_{i=1}^n X_i = \{(x_1, \dots, x_n) : x_i \in X_i\}$ be the Cartesian product of the underlying sets $X_i$.
For x ∈ X and 𝛿 > 0, we denote the D-ball in X of radius 𝛿 centered at x by BD (x, 𝛿).
Theorem 4.4.2. If $U_i$ is open in $X_i$ for each $1 \le i \le n$, then the set $U = \prod_{i=1}^n U_i$ is
open in (X, D).
Proof. Let $x \in \prod_{i=1}^n U_i$. Then $x_i \in U_i$, and hence there exists $\delta_i > 0$ such that
$B_{d_i}(x_i, \delta_i) \subseteq U_i$. Let $\delta = \min_{1 \le i \le n} \delta_i$. Clearly,
$\prod_{i=1}^n B_{d_i}(x_i, \delta) \subseteq \prod_{i=1}^n B_{d_i}(x_i, \delta_i) \subseteq \prod_{i=1}^n U_i$.
By theorem 4.4.1, $\prod_{i=1}^n B_{d_i}(x_i, \delta) = B_D(x, \delta)$. Thus $x \in B_D(x, \delta) \subseteq U$,
which proves that U is open.
Proof. Because $d_i(x_i^{(k)}, x_i) \le D(x^{(k)}, x)$, $\lim_k D(x^{(k)}, x) = 0$ implies that
$\lim_k d_i(x_i^{(k)}, x_i) = 0$. Conversely, if $\lim_k x_i^{(k)} = x_i$ in $d_i$ for each $1 \le i \le n$, then
$\lim_k \max_{1 \le i \le n} d_i(x_i^{(k)}, x_i) = 0$, and hence $\lim_k x^{(k)} = x$.
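The equivalence in the proof above can be watched on a concrete sequence. The sketch below assumes the factor spaces are copies of ℝ with the usual metric, so that the product metric is the maximum of the coordinate distances, as in the proof; the sequence `term` and the function names are illustrative choices only.

```python
def product_metric(x, y):
    """D(x, y) = max_i |x_i - y_i| on R^n, each factor carrying the usual metric."""
    return max(abs(a - b) for a, b in zip(x, y))

def term(k):
    """k-th term of a sample sequence in R^2 converging componentwise to (0, 1)."""
    return (1.0 / k, 1.0 + 1.0 / k**2)

limit = (0.0, 1.0)
ks = (1, 10, 100, 1000)
distances = [product_metric(term(k), limit) for k in ks]
coordinate_gaps = [[abs(a - b) for a, b in zip(term(k), limit)] for k in ks]
print(distances)
```

Each coordinate gap is bounded by the corresponding value of D, and D itself tends to 0 because every coordinate gap does.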
Theorem 4.4.4 says that the convergence of a sequence in the product metric D
is equivalent to the convergence of each of the component sequences (compo-
nentwise convergence). In fact, componentwise convergence characterizes all the
metrics on X that are equivalent to the product metric D, as the following theorem
shows.
Proof. We use theorem 4.3.8. The metrics D and D∗ are equivalent if and only if
convergence of a sequence in one metric occurs if and only if it occurs in the other
metric. Clearly, this is the case for D and D∗ , since convergence in either metric is
equivalent to componentwise convergence.
Example 2. To illustrate the importance of the above theorem, note that each of
the following metrics is equivalent to the product metric D on X:
\[
D_1(x, y) = \sum_{i=1}^n d_i(x_i, y_i),
\]
n 1/2
Proof. Let (y(k) ) be a sequence in Y, and suppose limk y(k) = y. By theorem 4.4.4,
limk f(y(k) ) = f(y) if and only if limk fi (y(k) ) = fi (y) for all 1 ≤ i ≤ n.
Example 3. We used the previous theorem to prove the continuity of the stereographic
projections. See example 12 in section 4.3, and problem 18 in the same
section.
Exercises
weakest metric relative to which all the projections $\pi_i$ are continuous. More
explicitly stated, show that if $D^*$ is a metric on X and each $\pi_i : (X, D^*) \to
(X_i, d_i)$ is continuous, then the product metric is weaker than $D^*$.
Although the rigorous definition of the real line was a giant leap in the devel-
opment of mathematics, it would not be nearly as useful an invention had it not
been for the fact that it contains the rational numbers as a dense subset. Indeed,
all practical computations, including machine calculations, are done exclusively
using rational numbers. The simplicity of rational numbers is enhanced by their
countability. Thus ℚ is numerous enough, simple enough, but not too enormous to
be a useful approximation of ℝ. It is a reasonable quest to study metric spaces
that contain a countable dense subset (of simpler elements). Such spaces are,
by definition, separable. You will see that many (but not all) metric spaces are
separable. The classical example is the space 𝒞[0, 1]. It is well known that (see
section 4.8) the set of polynomials with rational coefficients, which is countable,
is dense in 𝒞[0, 1]. What can be a nicer approximation of a continuous function
than a rational polynomial! Separability of a metric space turns out to be equivalent
to the existence of a countable collection of open sets that generate all open sets,
which is an added benefit and an important characterization of separability.
We use the uniform continuity of f (see example 8 in section 1.2). Let 𝛿 > 0
be such that |f(x) − f(y)| < 𝜖 whenever |x − y| < 𝛿. Choose a natural number n
such that 1/n < 𝛿, and, for 0 ≤ j ≤ n, let xj = j/n. Define the function g to be
the continuous, piecewise linear function such that g(xj ) = f(xj ) for 0 ≤ j ≤ n.
By construction, ‖f − g‖∞ < 𝜖. Observe that this example says that the space of
continuous, piecewise linear functions is dense in 𝒞[0, 1].
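The construction above can be carried out for a specific function. The sketch below is an illustration under stated assumptions: it uses f(x) = x² on [0, 1], for which |f(x) − f(y)| ≤ 2|x − y|, so 𝛿 = 𝜖/2 works; the uniform distance is only estimated on a fine grid rather than computed exactly, and all names are hypothetical.

```python
import math

def piecewise_linear(f, n):
    """Return the continuous piecewise linear g with g(j/n) = f(j/n) for 0 <= j <= n."""
    nodes = [j / n for j in range(n + 1)]
    values = [f(t) for t in nodes]

    def g(x):
        j = min(int(x * n), n - 1)      # index of a subinterval [x_j, x_{j+1}] containing x
        t = (x - nodes[j]) * n          # relative position of x inside that subinterval
        return (1 - t) * values[j] + t * values[j + 1]

    return g

def f(x):
    return x * x                        # sample function; Lipschitz constant 2 on [0, 1]

eps = 0.01
delta = eps / 2                         # |x - y| < delta forces |f(x) - f(y)| < eps
n = math.ceil(1 / delta) + 1            # guarantees 1/n < delta
g = piecewise_linear(f, n)

# Estimate of the uniform distance between f and g on a grid of 10001 points.
sup_error = max(abs(f(k / 10000) - g(k / 10000)) for k in range(10001))
print(n, sup_error)
```

For this particular f the observed error is far smaller than 𝜖, since the argument in the text only uses uniform continuity and therefore gives a rather generous bound.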
The above example shows that ℝn is second countable because the collection 𝔅
is countable.
(a) X is separable.
(b) X is second countable.
(c) X is Lindelöf.
Proof. (a) implies (b). Let A = {a1 , a2 , . . . } be a countable dense subset of X. We claim
that the countable collection 𝔅 = {B(an , r) ∶ n ∈ ℕ, r ∈ ℚ} is an open base for X.
To prove that every open subset of X is the union of members of 𝔅, it is sufficient to
show that if x ∈ X and 𝛿 > 0, there exist an element an ∈ A and a rational number
r such that x ∈ B(an , r) ⊆ B(x, 𝛿). Pick an element an ∈ A such that d(x, an ) <
𝛿/4, and choose a rational number r such that 𝛿/4 < r < 𝛿/2. Then x ∈ B(an , r),
and if y ∈ B(an , r), then d(x, y) ≤ d(x, an ) + d(an , y) < 𝛿/4 + r < 𝛿, so B(an , r) ⊆
B(x, 𝛿).
(b) implies (c). Let 𝔅 = {Bn ∶ n ∈ ℕ} be a countable open base for X. Suppose,
for some collection 𝒰 = {U𝛼 ∶ 𝛼 ∈ I} of open subsets of X, X = ∪𝛼∈I U𝛼 . For each
natural number n, pick an element Vn in 𝒰 that contains Bn . If no element of
𝒰 contains Bn , define Vn = ∅. We claim that {Vn }n∈ℕ covers X. If x ∈ X, then
x ∈ U𝛼 for some 𝛼 ∈ I. There exists Bn such that x ∈ Bn ⊆ U𝛼 ; thus, Vn ≠ ∅ and
x ∈ Vn .
(c) implies (a). For a fixed $n \in \mathbb{N}$, $X = \bigcup_{x \in X} B(x, 1/n)$. By assumption, there exists a
set $\{x_{n,1}, x_{n,2}, \dots\}$ such that $X = \bigcup_{j=1}^{\infty} B(x_{n,j}, 1/n)$. We claim that $\{x_{n,j} : n, j \in \mathbb{N}\}$ is
dense in X. Let $x \in X$ and let $\delta > 0$. Choose $n \in \mathbb{N}$ such that $1/n < \delta$. Because
$x \in \bigcup_{j=1}^{\infty} B(x_{n,j}, 1/n)$, $x \in B(x_{n,j}, 1/n)$ for some $j \in \mathbb{N}$. Now $d(x_{n,j}, x) < 1/n < \delta$,
and the proof is complete.
The following example shows that a separable metric space is, in a way, not too
large.
Exercise
4.6 Completeness
Proof. Let lim xn = x, and let 𝜖 > 0. There exists a natural number N such that, for
n > N, d(xn , x) < 𝜖/2. Now, for m, n > N, d(xn , xm ) ≤ d(xn , x) + d(x, xm ) < 𝜖.
Theorem 4.6.3. If a Cauchy sequence (xn ) contains a subsequence xnk that converges
to x, then (xn ) converges to x.
Proof. Let 𝜖 > 0. There exists a natural number N such that, for m, n ≥ N,
d(xn , xm ) < 𝜖/2. Since limk xnk = x, there exists an integer K such that, for k ≥ K,
d(xnk , x) < 𝜖/2. Without loss of generality, we may assume that K > N and
thus nK ≥ K > N. Taking m = nK and using the triangle inequality, for n > N,
d(xn , x) ≤ d(xn , xnK ) + d(xnK , x) < 𝜖.
Example 1. Consider the chordal metric 𝜒 on ℝ. We will show that although the
sequence xn = n is a Cauchy sequence in (ℝ, 𝜒), it does not converge.
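\[
\chi(n, m) = \frac{|n - m|}{\sqrt{1 + n^2}\,\sqrt{1 + m^2}}
= \frac{\left|\frac{1}{n} - \frac{1}{m}\right|}{\sqrt{1 + \frac{1}{n^2}}\,\sqrt{1 + \frac{1}{m^2}}}
\le \left|\frac{1}{n} - \frac{1}{m}\right| \to 0 \quad \text{as } m, n \to \infty.
\]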
To prove that the sequence does not converge to any x ∈ ℝ, we observe that
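\[
\lim_n \chi(n, x) = \lim_n \frac{|n - x|}{\sqrt{1 + n^2}\,\sqrt{1 + x^2}}
= \lim_n \frac{\left|1 - \frac{x}{n}\right|}{\sqrt{1 + \frac{1}{n^2}}\,\sqrt{1 + x^2}}
= \frac{1}{\sqrt{1 + x^2}} \ne 0.
\]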
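Both computations in this example are easy to observe numerically. The following sketch, which is an illustration rather than a proof, evaluates the chordal metric at a few large integers and compares 𝜒(n, x) for large n with the limiting value 1/√(1 + x²); the sample points are arbitrary choices.

```python
import math

def chordal(x, y):
    """Chordal metric on the real line."""
    return abs(x - y) / (math.sqrt(1 + x * x) * math.sqrt(1 + y * y))

# The bound chi(n, m) <= |1/n - 1/m| at a few sample pairs.
pairs = [(10, 20), (100, 1000), (10**4, 10**6)]
cauchy_ok = all(chordal(n, m) <= abs(1 / n - 1 / m) for n, m in pairs)

# chi(n, x) approaches 1/sqrt(1 + x^2), which is not zero.
x = 2.0
gap = abs(chordal(10**8, x) - 1 / math.sqrt(1 + x * x))
print(cauchy_ok, gap)
```

The distances between far-out integers shrink, while the distance from n to any fixed real x stays close to a positive number, so the sequence is Cauchy without being convergent.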
Theorem 4.6.4.
(a) A closed subspace A of a complete metric space is complete.
(b) A complete subspace A of a metric space is closed.
Proof. (a) Let (xn ) be a Cauchy sequence in A. Since X is complete, there exists x ∈ X
such that limn xn = x. Since A is closed, theorem 4.2.5 guarantees that x ∈ A.
(b) Let $x \in \overline{A}$. By theorem 4.2.5, there exists a sequence (xn ) in A such that
limn xn = x. Now (xn ) is Cauchy (theorem 4.6.1), and A is complete, so (xn )
converges to a point y in A. By the uniqueness of limits, x = y; hence x ∈ A, and A is closed.
Proof. We leave the proof of the case p = ∞ to the reader (also see theorem
4.8.1). Fix $1 \le p < \infty$, let $(x_n)$ be a Cauchy sequence in $l^p$, and write
$x_n = (x_{n,1}, x_{n,2}, \dots, x_{n,k}, \dots)$. Given $\epsilon > 0$, there exists $N \in \mathbb{N}$ such that, for $n, m > N$,
$\|x_n - x_m\|_p^p = \sum_{k=1}^{\infty} |x_{n,k} - x_{m,k}|^p < \epsilon^p$. In particular, if k is a fixed positive
integer, then, for every $n, m > N$, $|x_{n,k} - x_{m,k}| < \epsilon$. Thus $(x_{n,k})_{n=1}^{\infty}$ is a Cauchy
sequence in 𝕂. By the completeness of 𝕂, $x_k = \lim_n x_{n,k}$ exists for every $k \in \mathbb{N}$.
Set $x = (x_k)_{k=1}^{\infty}$. We will show that $x \in l^p$ and that $\lim_n \|x_n - x\|_p = 0$. For an
arbitrary positive integer K,
\[
\sum_{k=1}^{K} |x_k|^p = \lim_n \sum_{k=1}^{K} |x_{n,k}|^p \le \limsup_n \sum_{k=1}^{\infty} |x_{n,k}|^p = \limsup_n \|x_n\|_p^p.
\]
Because $(x_n)$ is Cauchy, $\|x_n\|_p$ is bounded by theorem 4.6.2; hence $\limsup_n \|x_n\|_p^p < \infty$.
This shows $\sum_{k=1}^{\infty} |x_k|^p < \infty$, and hence $x \in l^p$.
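Similarly, for a fixed n and an arbitrary positive integer K,
\[
\sum_{k=1}^{K} |x_{n,k} - x_k|^p = \lim_m \sum_{k=1}^{K} |x_{n,k} - x_{m,k}|^p \le \limsup_{m \to \infty} \|x_n - x_m\|_p^p.
\]
Taking the limit as K → ∞ of the extreme left side of the above string of inequalities, we have
\[
\|x_n - x\|_p^p = \sum_{k=1}^{\infty} |x_{n,k} - x_k|^p \le \limsup_{m \to \infty} \|x_n - x_m\|_p^p.
\]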
Observe that the above inequalities hold for an arbitrary positive integer n. Now
let 𝜖 > 0. There exists N ∈ ℕ such that, for m, n > N, ‖xn − xm ‖p < 𝜖. Thus, for
n > N, lim supm→∞ ‖xn − xm ‖p ≤ 𝜖. We have shown that, for n > N, ‖xn − x‖p ≤
𝜖. This completes the proof.
Proof. For every n ∈ ℕ, choose a point xn ∈ Fn . Let 𝜖 > 0. There exists a natural
number N such that, for n > N, diam(Fn ) < 𝜖. Now if m ≥ n > N, then Fn ⊇ Fm
and xn , xm ∈ Fn , and hence d(xn , xm ) < 𝜖. This makes (xn ) a Cauchy sequence and
hence convergent to, say, x. Each of the sets Fn contains all but a finite number
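of terms of $(x_k)$. Since each $F_n$ is closed, $x \in F_n$ for all n, and $x \in \bigcap_{n=1}^{\infty} F_n$. Now
$\operatorname{diam}\left(\bigcap_{n=1}^{\infty} F_n\right) \le \operatorname{diam}(F_n) \to 0$. Hence $\bigcap_{n=1}^{\infty} F_n = \{x\}$.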
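Proof. Let $\{A_n\}$ be a countable family of nowhere dense subsets of X. Without loss
of generality, assume that each $A_n$ is closed. Since $X - A_1$ is open and nonempty,
there exists a ball $B_1 = B(x_1, \delta_1)$ such that $B_1 \cap A_1 = \emptyset$. By reducing the radius
$\delta_1$, if necessary, we may assume that $\delta_1 < 1$ and that $\overline{B}_1 \cap A_1 = \emptyset$. Since $B_1 - A_2$
is open and nonempty, we can find a ball $B_2 = B(x_2, \delta_2)$ contained in $B_1$ such that $B_2 \cap A_2 = \emptyset$.
As before, we may assume that $\delta_2 < 1/2$, $\overline{B}_2 \subseteq B_1$, and $\overline{B}_2 \cap A_2 = \emptyset$. We can continue this
process and construct a sequence of balls $\{B_n\}$ such that $\overline{B}_n \cap A_n = \emptyset$,
$\overline{B}_1 \supseteq \overline{B}_2 \supseteq \dots$, and $\operatorname{diam}(\overline{B}_n) \le 2/n$. By the Cantor intersection theorem,
$\bigcap_{n=1}^{\infty} \overline{B}_n = \{x\}$. Since $\overline{B}_n \cap A_n = \emptyset$, $x \notin A_n$ for all $n \in \mathbb{N}$, and
$\bigcup_{n=1}^{\infty} A_n \ne X$.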
Theorem 4.6.8. Let $\{A_n\}$ be a countable family of closed nowhere dense subsets of
a complete metric space X, and let $U_0$ be a nonempty open subset of X. Then
$U_0 - \bigcup_{n=1}^{\infty} A_n \ne \emptyset$.
Theorem 4.6.10. The product of a finite number $\{(X_i, d_i)\}_{i=1}^n$ of complete metric
spaces is complete.
Proof. Let (fn ) be a Cauchy sequence in 𝒞[a, b]. For 𝜖 > 0, there is a positive integer
N such that ‖fn − fm ‖∞ < 𝜖 for every m, n ≥ N. Thus, for every x ∈ [a, b] and
every m, n ≥ N, |fn (x) − fm (x)| < 𝜖; hence (fn (x)) is a Cauchy sequence for every
x ∈ [a, b]. By the completeness of 𝕂, f(x) = limn fn (x) exists for every x.
We claim that limn ‖fn − f‖∞ = 0. Let 𝜖 and N be as in the previous paragraph.
Then |fn (x) − fm (x)| < 𝜖 for every x ∈ [a, b] and every n, m ≥ N. Taking the limit
as m → ∞, we obtain |fn (x) − f(x)| < 𝜖 for every x ∈ [a, b] and every n ≥ N. This
means that ‖fn − f‖∞ < 𝜖, as claimed.
Finally, we need to show that f is continuous. Suppose that xk ∈ [a, b] and that
limk xk = x. Let 𝜖 > 0. By the previous paragraph, there is an integer N such that
‖fN − f‖∞ < 𝜖. By the continuity of fN at x, there exists an integer K such that, for
k > K, |fN (xk ) − fN (x)| < 𝜖. Now, for k > K,
|f(x) − f(xk )| ≤ |f(x) − fN (x)| + |fN (x) − fN (xk )| + |fN (xk ) − f(xk )| < 3𝜖.
Example 4 (the Weierstrass M-test). Let $(f_n)$ be a sequence in 𝒞[a, b], and suppose
that there exists a real sequence $(M_n)$ such that, for every $n \in \mathbb{N}$, $\|f_n\|_\infty \le M_n$
and $\sum_{n=1}^{\infty} M_n < \infty$. Then the series of functions $\sum_{n=1}^{\infty} f_n(x)$ converges in 𝒞[a, b].
We prove that the sequence of partial sums $S_n(x) = \sum_{i=1}^{n} f_i(x)$ is a Cauchy
sequence in 𝒞[a, b]. Let $\epsilon > 0$. By the convergence of the positive series
$\sum_{n=1}^{\infty} M_n$, there is an integer N such that, for $m > n > N$, $\sum_{i=n+1}^{m} M_i < \epsilon$.⁴
Thus, for $m > n > N$, and for every $x \in [a, b]$,
$|S_m(x) - S_n(x)| \le \sum_{i=n+1}^{m} |f_i(x)| \le \sum_{i=n+1}^{m} M_i < \epsilon$, or $\|S_m - S_n\|_\infty < \epsilon$.
This shows that $(S_n)$ is a Cauchy sequence and
hence, by the completeness of 𝒞[a, b], is convergent to a function f ∈ 𝒞[a, b].
In fact, the series $\sum_{n=1}^{\infty} f_n(x)$ converges absolutely as well as uniformly to f on
[a, b].
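The key estimate in the M-test, namely that the uniform distance between two partial sums is bounded by a tail of the series of the Mn, can be observed on a concrete series. The sketch below uses the illustrative choice fi(x) = sin(ix)/i² on [0, 2π] with Mi = 1/i²; this series is not taken from the text, the supremum is only estimated on a grid, and the names are hypothetical.

```python
import math

def partial_sum(x, n):
    """S_n(x) = sum_{i=1}^n sin(i x) / i^2 for the illustrative choice of f_i."""
    return sum(math.sin(i * x) / i**2 for i in range(1, n + 1))

def tail_bound(n, m):
    """sum_{i=n+1}^m M_i with M_i = 1 / i^2."""
    return sum(1 / i**2 for i in range(n + 1, m + 1))

n, m = 50, 200
grid = [2 * math.pi * k / 400 for k in range(401)]

# Grid estimate of the uniform norm of S_m - S_n on [0, 2*pi].
sup_diff = max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in grid)
print(sup_diff, tail_bound(n, m))
```

The printed estimate stays below the tail sum, in line with the inequality |Sm(x) − Sn(x)| ≤ Mn+1 + ... + Mm used in the proof.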
\[
\lim_n \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.
\]
In particular, if the series $\sum_{n=1}^{\infty} g_n(x)$ converges in 𝒞[a, b], then
\[
\int_a^b \sum_{n=1}^{\infty} g_n(x)\,dx = \sum_{n=1}^{\infty} \int_a^b g_n(x)\,dx.
\]
Let $\epsilon > 0$. There exists an integer N such that, for $n > N$ and all $x \in [a, b]$,
$|f_n(x) - f(x)| < \epsilon$. Now if $n > N$, then
\[
\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \epsilon(b - a).
\]
⁴ The sequence of partial sums $\sum_{i=1}^{n} M_i$ is Cauchy because the series $\sum_{n=1}^{\infty} M_n$ is convergent.
⁵ Also called Banach’s fixed point theorem.
Now z = limn xn = limn T(xn−1 ) = T(limn xn−1 ) = T(z). To show that z is unique,
suppose w is a fixed point of T. Then d(z, w) = d(T(z), T(w)) ≤ kd(z, w). This
would be a contradiction unless d(z, w) = 0, that is, z = w.
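The iteration xn = T(xn−1) that appears in the proof is also a practical way to approximate the fixed point. The sketch below is an illustration with an example chosen here, not taken from the text: T = cos maps [0, 1] into itself and satisfies |cos x − cos y| ≤ sin(1)|x − y| there, so it is a contraction with k = sin 1 < 1. The function name, tolerance, and iteration cap are hypothetical.

```python
import math

def iterate_to_fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_n = T(x_{n-1}) until two successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        nxt = T(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_iter iterations")

# T = cos is a contraction of [0, 1] into itself with constant sin(1) < 1.
z = iterate_to_fixed_point(math.cos, 0.5)
print(z, math.cos(z))
```

Starting from any other point of [0, 1] leads to the same limit, which reflects the uniqueness part of the theorem.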
\[
\frac{dy}{dx} = f(x, y(x)), \quad y(a) = y_0.
\]
It is relatively easy to check that d is a complete metric on X. Note that, for all
$x \in [a, b]$, and all $y, z \in X$, $|y(x) - z(x)| \le e^{K(x-a)} d(y, z)$. The initial value problem
in question is equivalent to the integral equation
\[
y(x) = y_0 + \int_a^x f(s, y(s))\,ds.
\]
Define a function $T : X \to X$ by $y \mapsto T_y$, where $T_y(x) = y_0 + \int_a^x f(s, y(s))\,ds$. The
proof will be complete if we show that T has a unique fixed point y ∈ X. For two
functions y, z ∈ X,
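\[
\begin{aligned}
e^{-K(x-a)} |T_y(x) - T_z(x)| &= e^{-K(x-a)} \left|\int_a^x \big(f(s, y(s)) - f(s, z(s))\big)\,ds\right| \\
&\le e^{-K(x-a)} \int_a^x |f(s, y(s)) - f(s, z(s))|\,ds \le L e^{-K(x-a)} \int_a^x |y(s) - z(s)|\,ds \\
&\le L e^{-K(x-a)} d(y, z) \int_a^x e^{K(s-a)}\,ds = \frac{L}{K} e^{-K(x-a)} d(y, z)\left[e^{K(x-a)} - 1\right] \\
&\le \frac{L}{K} e^{-K(x-a)} d(y, z) e^{K(x-a)} = \frac{L}{K} d(y, z) = k\,d(y, z), \quad \text{where } k = \frac{L}{K} < 1.
\end{aligned}
\]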
The above inequalities show that d(Ty , Tz ) ≤ kd(y, z); hence T is a contraction. We
now invoke the contraction mapping theorem to conclude that the initial value
problem has a unique solution in 𝒞[a, b].
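The map T used in the proof can be iterated explicitly when the integrals are easy to compute. The following sketch is an illustration for the sample problem y′ = y, y(0) = 1 (so f(x, y) = y, a = 0, y0 = 1, and f is Lipschitz in y with L = 1); this problem is an assumption chosen for the example, and polynomials are represented by hypothetical coefficient lists.

```python
from fractions import Fraction
from math import factorial

def picard_step(coeffs):
    """Apply T_y(x) = 1 + integral_0^x y(s) ds to a polynomial y.

    coeffs[k] is the coefficient of x^k. Integration raises each power by one
    and divides by the new exponent; the constant term becomes y0 = 1.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1
    return integrated

y = [Fraction(1)]            # start from the constant function y = 1
for _ in range(5):
    y = picard_step(y)
print(y)
```

After five steps the iterate is the Taylor polynomial of degree five of the exponential function, whose limit is the unique solution of this sample problem.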
Proof. Let x be the unique fixed point of $T^n$. Thus $T^n(x) = x$. Now $T(x) = T^{n+1}(x) =
T^n(T(x))$. Thus T(x) is a fixed point of $T^n$. But the fixed point of $T^n$ is unique, so
T(x) = x. We leave it to the reader to show that x is the only fixed point of T.
Let X = 𝒞[a, b], equipped with the uniform metric. Define a function T ∶ X → X
by
\[
u \mapsto T_u, \quad \text{where } T_u(x) = f(x) + \int_a^x K(x, y, u(y))\,dy.
\]
and, by induction,
\[
|T^n_u(x) - T^n_v(x)| \le \frac{L^n}{n!} \|u - v\|_\infty (x - a)^n \le \frac{L^n}{n!} \|u - v\|_\infty (b - a)^n.
\]
For sufficiently large n, $k = \frac{L^n (b-a)^n}{n!} < 1$, and, for such an n, $T^n$ is a contraction.
By theorem 4.6.14, T has a unique fixed point, and the Volterra equation has a
unique solution.
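How large n must be depends only on the product L(b − a). The sketch below finds the first n for which the constant Lⁿ(b − a)ⁿ/n! drops below 1, using exact arithmetic; the function name and the sample values L = 3, b − a = 2 are hypothetical.

```python
from fractions import Fraction

def first_contracting_power(L, length):
    """Return (n, k): the smallest n >= 1 with k = (L * length)^n / n! < 1."""
    k = Fraction(1)
    n = 0
    while True:
        n += 1
        k = k * Fraction(L) * Fraction(length) / n   # builds (L * length)^n / n! step by step
        if k < 1:
            return n, k

n, k = first_contracting_power(3, 2)
print(n, float(k))
```

The loop always terminates because the factorial eventually dominates any fixed power, which is exactly why some iterate of T is a contraction even when T itself is not.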
The geometric meaning of the condition in the above definition is that the slope
of the line joining the point (x0 , f(x0 )) and an arbitrary point (x, f(x)) on the graph
of f cannot exceed n in absolute value. There is no shortage of functions in 𝔉n . For
example, the functions f𝛼 (x) = 𝛼x are in 𝔉n if |𝛼| ≤ n.
The assumptions of the previous example are much stronger than they need to
be. The differentiability of f at a single point in (0, 1) is enough to guarantee that
f ∈ 𝔉n for some n, as the following example illustrates.
Proof. Let (fk ) be a sequence in 𝔉n , and suppose fk → f in the uniform norm. For each
k ∈ ℕ, let xk be such that |fk (x) − fk (xk )| ≤ n|x − xk |. By the Bolzano-Weierstrass
theorem (theorem 1.2.8), (xk ) contains a convergent subsequence xkp . For simplic-
ity of notation, write xp for xkp and fp for fkp , and let x0 = limp→∞ xp . For x ∈ [0, 1],
|f(x) − f(x0 )| = limp |fp (x) − fp (xp )| ≤ n limp |x − xp | = n|x − x0 |.
the right and left derivatives of f by D+ f(x) and D− f(x), respectively. We use
the notation |Df| to denote the minimum (absolute) value of the (one-sided)
derivatives of f. Simply put, |Df| is the minimum absolute value of the slope of
any straight line segment of the graph of f. Figure 4.2 shows the graph of the type
of functions of interest to us. In that graph, the slope of any straight line segment
of the graph is ±4, and |D𝜓| = 4.
Choose an integer m such that $2 \cdot 4^m > 4^k (b - a)$, and divide the interval [a, b]
into $4^m$ subintervals of equal length. For $0 \le j \le 4^m$, let $x_j = a + \frac{j(b-a)}{4^m}$. Define $\psi$
to be the continuous, piecewise linear function such that, for $0 \le j \le 4^m$, $\psi(x_j) = 0$,
and, for $0 \le j \le 4^m - 1$, $\psi\left(\frac{x_j + x_{j+1}}{2}\right) = 2^{-k}$. The magnitude of the slope of any
straight line segment of the graph of $\psi$ is equal to
\[
\frac{2^{-k}}{\frac{1}{2} \cdot \frac{b-a}{4^m}} > 2^k.
\]
Example 9. Let [a, b] be an arbitrary interval, and let h be the linear function
h(x) = mx + c. For every 𝜖 > 0 and for every n ≥ 1, there exists a continuous,
piecewise linear function 𝜑 on [a, b] such that 𝜑(a) = h(a), 𝜑(b) = h(b),
‖h − 𝜑‖∞ < 𝜖, and |D𝜑| > n.
Choose an integer k such that $2^{-k} < \epsilon$ and $2^k - |m| > n$. Using example 8,
we find a continuous, piecewise linear function $\psi$ such that $\psi(a) = 0 = \psi(b)$,
$\|\psi\|_\infty = 2^{-k}$, $|D\psi| > 2^k$. Define $\varphi = h + \psi$. Clearly, $\|h - \varphi\|_\infty = \|\psi\|_\infty = 2^{-k} < \epsilon$
and, for $x \in [a, b]$, $|D^{\pm}\varphi(x)| = |D^{\pm}\psi(x) + m| \ge |D^{\pm}\psi(x)| - |m| = |D\psi| - |m| > 2^k - |m| > n$.
Proof. Let f ∈ 𝒞[0, 1] and let 𝜖 > 0. We will show that there is a continuous, piecewise
linear function g such that ‖f − g‖∞ < 𝜖 and g ∉ 𝔉n . Since continuous, piecewise
linear functions are dense in 𝒞[0, 1] (see example 1 in section 4.5), let h be a
continuous, piecewise linear function such that h(xj ) = f(xj ) for 0 ≤ j ≤ M and
‖f − h‖∞ < 𝜖/2. For 0 ≤ j ≤ M − 1, let hj be the restriction of h to [xj , xj+1 ]. By
example 9, for each j we construct a piecewise linear function 𝜑j such that ‖hj −
𝜑j ‖∞ < 𝜖/2 and |D𝜑j | > n. We define the required function g by pasting together
the functions 𝜑j . The function g is continuous because 𝜑j (xj ) = f(xj ) = 𝜑j+1 (xj ).
Now ‖f − g‖∞ ≤ ‖f − h‖∞ + ‖h − g‖∞ < 𝜖.
The following result follows from remark 1, lemma 4.6.16, lemma 4.6.17, and
theorem 4.6.9.
Exercises
4. Define a metric on ℕ as follows: $d(m, n) = \left|\frac{1}{n} - \frac{1}{m}\right|$. Prove that d is an
incomplete metric.
5. For x, y ∈ ℝ, define $d(x, y) = |\tan^{-1} x - \tan^{-1} y|$. Prove that d is an incomplete
metric on ℝ. Use the identity $\tan^{-1} x - \tan^{-1} y = \tan^{-1}\left(\frac{x-y}{1+xy}\right)$.
6. Let A be a dense subset of a metric space X such that every Cauchy sequence
in A is convergent to a point in X. Prove that X is complete.
7. Prove that if (xn ) and (yn ) are Cauchy sequences in a metric space, then
d(xn , yn ) converges.
8. Prove the converse of the Cantor intersection theorem. Hint: Let $(x_n)$ be a
Cauchy sequence. For each $n \in \mathbb{N}$, let $A_n = \{x_n, x_{n+1}, \dots\}$, and let $F_n = \overline{A_n}$.
Show that $\lim_n \operatorname{diam}(F_n) = 0$.
9. Prove that a subset A of a metric space is nowhere dense if and only if every
nonempty open set U contains a nonempty open subset V such that V ∩
A = ∅.
10. Show that a closed subset F of a metric space X is nowhere dense, if and
only if X − F is dense.
11. Show that the boundary of a closed subset F of a metric space X is nowhere
dense and give an example to show that the assumption that F is closed
cannot be omitted.
12. Let X be a complete metric space, and let $\{F_n\}$ be a countable collection of
closed, nowhere dense subsets of X. Is $\bigcup_{n=1}^{\infty} F_n$ necessarily nowhere dense?
13. Prove that a contraction on a metric space is continuous. Notice that this
fact was used in the proof of theorem 4.6.12.
14. Prove that the metric d in the proof of theorem 4.6.13 is complete.
15. Prove that the function Ty in the proof of theorem 4.6.13 is continuous.
16. Let g ∶ [a, b] → ℝ and K ∶ [a, b] × [a, b] → ℝ be continuous functions.
Show that when |𝛼| is small enough, the integral equation
\[
y(x) = \alpha \int_a^b K(x, t) y(t)\,dt + g(x)
\]
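\[
U = \begin{pmatrix}
0 & a_{1,2} & \dots & a_{1,n} \\
\vdots & 0 & \ddots & \vdots \\
\vdots & & \ddots & a_{n-1,n} \\
0 & \dots & \dots & 0
\end{pmatrix}, \quad
L = \begin{pmatrix}
0 & \dots & \dots & 0 \\
a_{2,1} & 0 & & \vdots \\
\vdots & \ddots & \ddots & \vdots \\
a_{n,1} & \dots & a_{n,n-1} & 0
\end{pmatrix}, \quad
D = \begin{pmatrix}
a_{11} & & \\
& \ddots & \\
& & a_{nn}
\end{pmatrix}.
\]
Define $J = -D^{-1}(L + U)$. Show that the function $T : \mathbb{R}^n \to \mathbb{R}^n$ defined
by $Tx = Jx + D^{-1}b$ is a contraction. Conclude that the iteration $x_k = Tx_{k-1}$,
$k \ge 1$, converges to the solution of the system Ax = b. Hint: Examine
the matrix norm $\|J\|_\infty$ defined in section 3.6.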
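The iteration in this exercise can be tried on a small system before attempting the proof. The sketch below is an illustration under stated assumptions: the matrix A and vector b are made-up sample data, and ‖J‖∞ is taken to be the maximum absolute row sum, which is an assumption about the norm defined in section 3.6. It does not replace the argument the exercise asks for.

```python
def jacobi(A, b, iterations=200):
    """Iterate x_k = J x_{k-1} + D^{-1} b, written componentwise, starting from x_0 = 0."""
    n = len(A)
    x = [0.0] * n
    for _ in range(iterations):
        # Every component of the new vector is computed from the previous vector.
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

def j_norm_inf(A):
    """Maximum absolute row sum of J = -D^{-1}(L + U)."""
    n = len(A)
    return max(sum(abs(A[i][j]) for j in range(n) if j != i) / abs(A[i][i])
               for i in range(n))

A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]          # chosen so that the exact solution is (1, 1, 1)

x = jacobi(A, b)
print(j_norm_inf(A), x)
```

For this sample matrix the row-sum norm of J is 0.6, and the iterates settle on the solution (1, 1, 1) to within rounding error.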
4.7 Compactness
Proof. Let (X, d) be a compact space, and let $(Y, \rho)$ be a metric space. We show that
if $f : X \to Y$ is a continuous surjection, then $(Y, \rho)$ is compact. Let $\{V_\alpha\}$ be an
open cover of Y. Since f is continuous, $f^{-1}(V_\alpha)$ is open in X for each $\alpha$, and hence
$\{f^{-1}(V_\alpha)\}$ is an open cover of X. The compactness of X yields a finite subcover
$\{f^{-1}(V_{\alpha_i})\}_{i=1}^n$ of X. Clearly, $\{V_{\alpha_i}\}_{i=1}^n$ covers Y.
Theorem 4.7.5. A metric space X is sequentially compact if and only if it has the
Bolzano-Weierstrass property.
Definition. A metric space X is totally bounded if, for every $\epsilon > 0$, there exists
a finite subset $\{x_1, \dots, x_n\}$ of X such that $X = \bigcup_{i=1}^n B(x_i, \epsilon)$. The set $\{x_1, \dots, x_n\}$ is
called an $\epsilon$-dense subset of X.
Suppose X is complete and totally bounded. We claim that X has the Bolzano-
Weierstrass property. The proof will be complete by theorem 4.7.5. Let A be an
infinite subset of X. The total boundedness of X allows us to cover X by a finite
collection of closed balls of radius 1. One of the balls, B1 , contains infinitely many
points of A. Define F1 = B1 , and A1 = A ∩ B1 . Now cover X by a finite collection of
closed balls of radius 1/2. One of those balls, B2 , contains infinitely many points of
A1 and hence infinitely many points of A. Define F2 = B2 ∩ F1 , and A2 = A1 ∩ B2 .
Continue by induction to construct a sequence of closed subsets F1 ⊇ F2 ⊇ F3 ⊇ . . .
such that diam(Fn ) ≤ 2/n and each Fn contains infinitely many points of A. By the
Cantor intersection theorem, let $\{x\} = \bigcap_{n=1}^{\infty} F_n$. Since limn diam(Fn ) = 0, any ball
centered at x contains Fn for sufficiently large n. Since Fn contains infinitely many
points of A, x is a limit point of A.
Theorem 4.7.7. In a sequentially compact metric space X, every open cover of X has
a Lebesgue number.
Proof. Suppose that there is an open cover 𝒰 = {U𝛼 } of X that does not have a
Lebesgue number. We show that X is not sequentially compact. By assumption,
for every n ∈ ℕ, there exists a subset An of X such that diam(An ) < 1/n, and
An is not contained in any member of 𝒰. For each n ∈ ℕ, pick a point xn ∈ An .
We claim that (xn ) has no convergent subsequence. Suppose, contrary to our
claim, that some subsequence (xnk ) of (xn ) converges to x. Since ∪𝛼 U𝛼 = X,
there exists a member U𝛼 of 𝒰 that contains x, and since U𝛼 is open, there
is a number 𝛿 > 0 such that B(x, 𝛿) ⊆ U𝛼 . Now choose a positive integer K
such that d(xnK , x) < 𝛿/2, and 1/nK < 𝛿/2. If y ∈ AnK , then d(x, y) ≤ d(x, xnK ) +
d(xnK , y) < 𝛿/2 + diam(AnK ) < 𝛿/2 + 𝛿/2 = 𝛿. This implies that AnK ⊆ B(x, 𝛿) ⊆
U𝛼 , which is a contradiction.
(a) X is compact.
(b) X is sequentially compact.
(c) X has the Bolzano-Weierstrass property.
(d) X is complete and totally bounded.
Proof. In light of theorems 4.7.5, 4.7.6, and 4.7.8, we only need to show that (a)
implies (b). Let $(x_n)$ be a sequence in X. Define $A_n = \{x_n, x_{n+1}, \dots\}$, and let $F_n = \overline{A_n}$.
Clearly, $\{F_n\}$ is a descending sequence of closed nonempty sets. If $\bigcap_{n \in \mathbb{N}} F_n = \emptyset$,
then $\bigcup_{n \in \mathbb{N}} (X - F_n) = X$. Thus $(X - F_n)$ is an ascending sequence of open subsets
that covers X. Therefore $X = X - F_n$, for some positive integer n, and hence $F_n = \emptyset$.
This contradiction shows that $\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$. Let $x \in \bigcap_{n \in \mathbb{N}} F_n$. Observe that x
is a closure point of each of the sets $A_n$. Since $x \in \overline{A_1}$, there exists an integer
$n_1 \ge 1$ such that $d(x_{n_1}, x) < 1$. Now $x \in \overline{A_{n_1+1}}$; thus there is an integer $n_2 \ge n_1 + 1$
such that $d(x_{n_2}, x) < 1/2$. Having found a sequence of positive integers $n_1 < n_2 <
\dots < n_k$ such that, for $1 \le i \le k$, $d(x_{n_i}, x) < 1/i$, choose an integer $n_{k+1} \ge n_k + 1$
such that $d(x_{n_{k+1}}, x) < \frac{1}{k+1}$. This is possible because $x \in \overline{A_{n_k+1}}$. By construction,
$\lim_k x_{n_k} = x$.
Proof. It is enough to show that the product of two compact metric spaces X and
Y is compact. Let (xn , yn ) be a sequence in X × Y. Since X is compact, there is a
subsequence (xnk ) of (xn ) that converges to x ∈ X. Since Y is compact, there exists
a subsequence (ynkp ) of (ynk ) that converges to a point y ∈ Y. Now (xnkp , ynkp ) is a
subsequence of (xn , yn ) that converges to (x, y) as p → ∞.
\[
F(\lambda_0, \dots, \lambda_n, x_0, \dots, x_n) = \sum_{i=0}^{n} \lambda_i x_i, \quad \text{where } (\lambda_0, \dots, \lambda_n) \in T_n, \text{ and } x_0, \dots, x_n \in K.
\]
Proof. A compact subset of any metric space is closed and bounded by theorem
4.7.3. Conversely, suppose K ⊆ ℝn is closed and bounded; K is contained in some
rectangle I1 × . . . × In , where each Ii is a closed bounded interval in ℝ. By theorem
1.2.10, each Ii has the Bolzano-Weierstrass property and hence is compact by
theorem 4.7.9. By Tychonoff ’s theorem, I1 × . . . × In is compact and, by theorem
4.7.2, K is compact.
V = (x1 − 1, x1 + 1) × . . . × (xn − 1, xn + 1)
⁶ We will see in section 6.1 that all norms on ℝn are equivalent. Thus if a set K is closed and bounded
in one norm on ℝn , then it is closed and bounded in any norm on ℝn .
and its closure
\[ \overline{V} = [x_1 - 1, x_1 + 1] \times \dots \times [x_n - 1, x_n + 1] \]
is compact.
Since every open subset of ℚ is the union of sets of the form (a, b) ∩ ℚ,
where a, b ∈ ℝ, it is enough to show that a set of the form I = [a, b] ∩ ℚ is
not sequentially compact. Choose an irrational number r ∈ (a, b), then choose
a sequence xn ∈ I such that limn xn = r. Clearly, no subsequence of (xn ) is
convergent in I.
Example 6. The metric space l∞ is not locally compact. It is enough to show that
the closed unit ball B = {x ∈ l∞ ∶ ‖x‖∞ ≤ 1} is not compact (see problem 8 in
the section exercises). The ball B contains the canonical vectors en of 𝕂(ℕ), and
d(en , em ) = 1 if n ≠ m. Therefore the sequence (en ) does not contain a convergent
subsequence.
Theorem 4.7.14. The product of finitely many locally compact spaces is locally
compact.
Observe that if 𝜃 is the angle between a − z and y − z, then
\[ \cos\theta = \frac{\langle a-z,\,y-z\rangle}{\|a-z\|_2\,\|y-z\|_2}. \]
The condition ⟨a − z, y − z⟩ ≤ 0 is equivalent to saying that 𝜃 is at least 90°, hence
the name obtuse angle criterion. Figure 4.3 illustrates the geometry. The wedge-
shaped region depicts the convex set C, and the rest of the diagram is self-
explanatory.
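The criterion can be checked numerically. The sketch below is illustrative only. It assumes that z denotes the point of C closest to a, and it takes C to be the unit square [0, 1] × [0, 1], where the closest point is obtained by clipping coordinates (a shortcut specific to boxes). The helper names are hypothetical.

```python
import random

def project_onto_box(a, lo=0.0, hi=1.0):
    """Closest point of the box [lo, hi]^n to a (coordinatewise clipping)."""
    return [min(max(t, lo), hi) for t in a]

def inner(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(s * t for s, t in zip(u, v))

def obtuse_angle_holds(a, z, samples, tol=1e-12):
    """Check <a - z, y - z> <= 0 for every sampled point y of C."""
    az = [s - t for s, t in zip(a, z)]
    return all(inner(az, [s - t for s, t in zip(y, z)]) <= tol for y in samples)

random.seed(0)
samples = [[random.random(), random.random()] for _ in range(1000)]  # points of C
a = [1.5, -0.25]                     # a point outside C
z = project_onto_box(a)              # the closest point of C to a
criterion_at_z = obtuse_angle_holds(a, z, samples)
criterion_at_other = obtuse_angle_holds(a, [0.5, 0.5], samples)
```

In this sketch the inequality holds at the closest point z and fails at the interior point (0.5, 0.5), which is not the closest point of C to a.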
We know (see theorem 4.2.12) that a closed subset in a metric space can be
separated from a point outside it by disjoint open sets. In ℝn , a closed convex
subset can be separated from a point outside it in a much stronger and
more specific way. They can be separated by a hyperplane, as the next example
illustrates.
Example 10. Let C be a closed convex subset of ℝn , and let a ∈ ℝn − C. Then there
exists a hyperplane nT x = b such that nT y < b for every y ∈ C, and nT a > b. Thus
C is contained in one of the open half-spaces determined by the hyperplane,
and a is contained in the other open half-space.
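A numerical illustration of example 10 follows. It is a sketch under stated assumptions, not the construction used in the text: it takes z to be the point of C closest to a, sets n = a − z and b = nᵀ(a + z)/2, and uses the closed unit disk as C. All names are hypothetical.

```python
import random

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(s * t for s, t in zip(u, v))

def separating_hyperplane(a, z):
    """Return (n, b) with n = a - z and b = n.(a + z)/2 (assumed construction)."""
    n = [s - t for s, t in zip(a, z)]
    b = sum(ni * (s + t) / 2 for ni, s, t in zip(n, a, z))
    return n, b

# C = closed unit disk; for a outside C the closest point of C is a / |a|.
a = [3.0, 4.0]
norm_a = dot(a, a) ** 0.5
z = [t / norm_a for t in a]
n, b = separating_hyperplane(a, z)

# Sample points of C by rejection from the square [-1, 1]^2.
random.seed(1)
points_in_C = []
while len(points_in_C) < 500:
    y = [random.uniform(-1, 1), random.uniform(-1, 1)]
    if dot(y, y) <= 1.0:
        points_in_C.append(y)

a_side = dot(n, a) > b                               # a lies in one open half-space
c_side = all(dot(n, y) < b for y in points_in_C)     # C lies in the other
```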
Proof. We only need to show that C contains the intersection of the closed half-spaces
containing C. The reverse containment is obvious. If a ∉ C, then, by the previous
example, there is a hyperplane nT x = b such that nT y < b for all y ∈ C and
nT a > b. Thus C is contained in the closed half-space H = {x ∈ ℝn ∶ nT x ≤ b},
but a ∉ H.
⁷ We can translate C by −m. Specifically, we look at the set C′ = {x − m ∶ x ∈ C} and the point
m′ = 0. This translation preserves all the properties of C but has the advantage that the hyperplane we
seek has a homogeneous equation. This simplifies the algebra.
Example 11. Every tangent line to the unit circle is a supporting line of the closed
unit disk. The line y = x + 1 is a supporting line of the closed unit square
S = [0, 1] × [0, 1]. Slight rotations of the line about the point (0, 1) are also
supporting lines of S. Thus there are infinitely many supporting lines of S at
the point (0, 1). The line x = 1 is also a supporting line of the square.
Exercises
B = {x ∈ X ∶ ‖x‖ ≤ 1}
Prove that
A𝜖 = {x ∈ X ∶ dist(x, A) < 𝜖}.
Also show that if E is a compact subset of X and F is closed in X and disjoint
from E, then E𝜖 ∩ F𝜖 = ∅ for some 𝜖 > 0.
17. Show that if E and F are disjoint compact subsets of ℝn , then there are points
x ∈ E and y ∈ F such that d(x, y) = dist(E, F).
18. Let E and F be disjoint compact convex subsets of ℝn . Show that there exists
a hyperplane uT x = b such that uT x > b for every x ∈ E, and uT x < b for
every x ∈ F.
19. Prove that a closed convex subset C of ℝn is the intersection of the closed
supporting half-spaces that contain C.
20. Find a countable set of closed supporting half-planes whose intersection is
the closed unit disk.
21. Prove that the standard n-simplex in ℝn is compact.
Definition. Let X be a nonempty set, and define ℬ(X) to be the set of all
bounded, real or complex, functions on X. Define vector addition and scalar
multiplication in ℬ(X) by (f + g)(x) = f(x) + g(x), (af)(x) = af(x). Here f and
g are bounded functions, x ∈ X, and a ∈ 𝕂. The supremum norm (also the
uniform or ∞-norm) of a function f ∈ ℬ(X) is defined by ‖f‖∞ = supx∈X |f(x)|.
It is a straightforward exercise to verify that ℬ(X) is a vector space and that the
function ‖.‖∞ is a norm. Observe that it is not assumed that X is necessarily a
metric space. It is sometimes necessary to specify the scalar field. In this case, we
use the notations ℬ(X, ℝ), and ℬ(X, ℂ) to indicate whether we wish to consider
real or complex valued functions.
Definition. Let X be a metric space, and define 𝒞(X) to be the set of continuous
real or complex functions on X. The operations on 𝒞(X) are defined pointwise
as in the above definition. This clearly makes 𝒞(X) into a vector space; see problem 1
in section 4.3. However, since a continuous function is not necessarily
bounded, the supremum norm is not necessarily defined on 𝒞(X).
Proof. We only prove the completeness of ℬ(X). Let (fn ) be a Cauchy sequence in
ℬ(X), and let 𝜖 > 0. There exists a natural number N such that, for m, n > N,
‖fn − fm ‖∞ = supx∈X |fn (x) − fm (x)| < 𝜖. In particular, (fn (x)) is a Cauchy
sequence in 𝕂 for each x ∈ X. Therefore limn fn (x) exists. Define f(x) = limn fn (x).
We claim that f ∈ ℬ(X). Since (fn ) is Cauchy, there exists N ∈ ℕ such that, for
n > N, ‖fn − fN ‖∞ < 1. Consequently, for all x ∈ X, and all n > N, |fn (x)| ≤
|fn (x) − fN (x)| + |fN (x)| ≤ ‖fn − fN ‖∞ + ‖fN ‖∞ < 1 + ‖fN ‖∞ . Taking the limit of
the quantity on the extreme left of the above string of inequalities, we obtain
|f(x)| ≤ 1 + ‖fN ‖∞ . Thus f is a bounded function.
Finally, we show that limn fn = f in ℬ(X). Let 𝜖 > 0. There exists N ∈ ℕ such that,
for n, m > N and for all x ∈ X, |fn (x) − fm (x)| < 𝜖. Taking the limit as m → ∞, we
obtain |fn (x) − f(x)| ≤ 𝜖 for all x ∈ X, and all n > N. This means that ‖fn − f‖∞ ≤
𝜖 for all n > N, and the proof is now complete.
Theorem 4.8.2. If X is a metric space, then the space ℬ𝒞(X) of continuous bounded
functions on X is a complete normed linear space.
Proof. Since ℬ𝒞(X) is a subspace of ℬ(X), it suffices, by theorems 4.8.1 and 4.6.4, to
show that ℬ𝒞(X) is closed in ℬ(X). Let f ∈ ℬ(X) be a closure point of ℬ𝒞(X). We
need to show that f is continuous. For 𝜖 > 0, there exists a function g ∈ ℬ𝒞(X) such
that ‖f − g‖∞ < 𝜖/3. Fix x0 ∈ X, and let 𝛿 > 0 be such that d(x, x0 ) < 𝛿 implies
that |g(x) − g(x0 )| < 𝜖/3. Now if d(x, x0 ) < 𝛿, then
|f(x) − f(x0 )| ≤ |f(x) − g(x)| + |g(x) − g(x0 )| + |g(x0 ) − f(x0 )| < 𝜖/3 + 𝜖/3 + 𝜖/3 = 𝜖.
Proof. Let 𝜖 > 0. For every x ∈ X, there exists 𝛿x > 0 such that whenever d(x, 𝜉) <
𝛿x , |f(x) − f(𝜉)| < 𝜖/2. Now X = ∪x∈X B(x, 𝛿x ). Let 3𝛿 be a Lebesgue number for
the open cover {B(x, 𝛿x ) ∶ x ∈ X}. For each 𝜉, 𝜂 ∈ X with d(𝜉, 𝜂) < 𝛿, B(𝜉, 𝛿)
contains 𝜂 and has diameter < 3𝛿. By the definition of a Lebesgue number,
there exists x ∈ X such that B(𝜉, 𝛿) ⊆ B(x, 𝛿x ). In particular, d(𝜉, x) < 𝛿x , and
d(𝜂, x) < 𝛿x . Consequently, |f(𝜉) − f(𝜂)| ≤ |f(𝜉) − f(x)| + |f(x) − f(𝜂)| < 𝜖.
The next result uses function spaces to provide an elegant and succinct proof of
the existence of the completion of an arbitrary (incomplete) metric space.
Theorem 4.8.4. Let X be a metric space. Then there exists a complete metric space
$\overline{X}$ and an isometry $\varphi : X \to \overline{X}$ such that 𝜑(X) is dense in $\overline{X}$.
Proof. We know that ℬ(X, ℝ) is a complete metric space. We will find an isometry
𝜑 ∶ X → ℬ(X, ℝ). The theorem follows by taking $\overline{X}$ to be the closure of 𝜑(X) in ℬ(X, ℝ). To this end, fix an
element a ∈ X. For 𝜉 ∈ X, define a function 𝜑𝜉 ∶ X → ℝ by 𝜑𝜉 (x) = d(x, 𝜉) − d(x, a).
By the triangle inequality, |𝜑𝜉 (x)| = |d(x, 𝜉) − d(x, a)| ≤ d(a, 𝜉) for all x ∈ X.
Therefore 𝜑𝜉 is bounded. We now show that the map 𝜉 ↦ 𝜑𝜉 from X to ℬ(X, ℝ)
is an isometry. Specifically, we need to show that, for 𝜉, 𝜂 ∈ X, ‖𝜑𝜉 − 𝜑𝜂 ‖∞ =
d(𝜉, 𝜂).
Now ‖𝜑𝜉 − 𝜑𝜂 ‖∞ = supx∈X |𝜑𝜉 (x) − 𝜑𝜂 (x)| = supx∈X |d(x, 𝜉) − d(x, 𝜂)| ≤ d(𝜉, 𝜂).
Therefore ‖𝜑𝜉 − 𝜑𝜂 ‖∞ ≤ d(𝜉, 𝜂).
Since |𝜑𝜉 (𝜉) − 𝜑𝜂 (𝜉)| = d(𝜉, 𝜂), ‖𝜑𝜉 − 𝜑𝜂 ‖∞ = d(𝜉, 𝜂), as desired.
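The isometry property can be observed on a small example. The sketch below is illustrative: the five sample points, the Euclidean metric, and the helper names are assumptions chosen for the demonstration, and each function 𝜑𝜉 is stored as the tuple of its values on the finite space X.

```python
import itertools

# A finite metric space X: five points of the plane with the Euclidean metric.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0), (-1.0, -1.0)]

def d(p, q):
    """Euclidean distance between two points of X."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

a = points[0]  # the fixed base point

def phi(xi):
    """The function x -> d(x, xi) - d(x, a), as a tuple of its values on X."""
    return tuple(d(x, xi) - d(x, a) for x in points)

def sup_norm_distance(f, g):
    """Uniform distance between two bounded functions on X."""
    return max(abs(s - t) for s, t in zip(f, g))

# Largest discrepancy between the uniform distance of phi(xi), phi(eta) and d(xi, eta).
max_defect = max(
    abs(sup_norm_distance(phi(xi), phi(eta)) - d(xi, eta))
    for xi, eta in itertools.combinations(points, 2)
)

# Boundedness: |phi_xi(x)| <= d(a, xi) for all x.
bounded = all(abs(v) <= d(a, xi) + 1e-12 for xi in points for v in phi(xi))
```

Up to rounding, the discrepancy is zero, in agreement with the identity ‖𝜑𝜉 − 𝜑𝜂 ‖∞ = d(𝜉, 𝜂).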
Theorem 4.8.5. Let (Y, 𝜌) be a complete metric space, and let 𝜑 ∶ X → Y be
an isometry from a metric space X into Y such that 𝜑(X) is dense in Y. Then 𝜑
can be uniquely extended to an isometry $\overline{\varphi} : \overline{X} \to Y$, where $\overline{X}$ is the completion of X.
To show that $\overline{\varphi}$ is onto, let y ∈ Y. Since 𝜑(X) is dense in Y, there exists a sequence
(xn ) in X such that lim 𝜑(xn ) = y. Again, because 𝜑 is an isometry, (xn ) is Cauchy
in X. Since $\overline{X}$ is complete, (xn ) converges to a point x ∈ $\overline{X}$. By the very definition
of $\overline{\varphi}$, $\overline{\varphi}(x) = y$.
We now prove two theorems of great utility: Ascoli’s theorem, which gives necessary
and sufficient conditions for the compactness of a subset of continuous functions
on a compact space in the uniform metric, and the Weierstrass polynomial
approximation theorem. Later in the book, we will encounter several applications
of the two theorems.
Proof. The proof mimics that of theorem 4.8.3 and is left as an exercise.
To prove the claim, let f, g ∈ 𝔉𝜑 . Since |f(xi ) − 𝜑(xi )| < 𝜖/4 and |g(xi ) − 𝜑(xi )| <
𝜖/4 for every xi ∈ A, |f(xi ) − g(xi )| < 𝜖/2 for every xi ∈ A. Now let x ∈ X. Then
x ∈ B(xi , 𝛿xi ) for some xi ∈ A, and
Remark. Observe that we did not use the full force of the assumption that 𝔉 is
bounded, just that it is pointwise bounded. Problem 8 at the end of this section
is relevant here.
‖g − P‖∞ < 𝜖.
Proof. Observe that the theorem says that the space of polynomials is dense
in 𝒞[0, 1]. Without loss of generality, we may replace g with the function
f(x) = g(x) − [g(0) + x(g(1) − g(0))]. This is because g(0) + x(g(1) − g(0)) is a
polynomial. Replacing g with f has the advantage that f(0) = f(1) = 0. Extend f to
ℝ by defining f(x) = 0 when x ∉ [0, 1].
Define
\[ L_n(x) = c_n \int_{-1}^{1} f(x+t)(1-t^2)^n\,dt, \qquad\text{where}\qquad c_n^{-1} = \int_{-1}^{1} (1-t^2)^n\,dt. \]
Since f(x) = 0 for x ∉ [0, 1],
\[ L_n(x) = c_n \int_{-x}^{1-x} f(x+t)(1-t^2)^n\,dt = c_n \int_{0}^{1} f(\xi)\,[1-(\xi-x)^2]^n\,d\xi \qquad (\xi = x+t). \]
The last expression makes it clear that Ln (x) is a polynomial of degree ≤ 2n.
Since $\int_{-1}^{1} c_n(1-t^2)^n\,dt = 1$ and, for |t| < 1, $c_n(1-t^2)^n \ge 0$,
\[
\begin{aligned}
|L_n(x) - f(x)| &= \Big| c_n \int_{-1}^{1} [f(x+t) - f(x)](1-t^2)^n\,dt \Big| \\
&\le c_n \int_{-1}^{1} |f(x+t) - f(x)|(1-t^2)^n\,dt \\
&= c_n \int_{-\delta}^{\delta} |f(x+t) - f(x)|(1-t^2)^n\,dt + c_n \int_{|t|>\delta} |f(x+t) - f(x)|(1-t^2)^n\,dt \\
&< \epsilon\, c_n \int_{-\delta}^{\delta} (1-t^2)^n\,dt + 2\|f\|_\infty\, c_n \int_{|t|>\delta} (1-t^2)^n\,dt \\
&< \epsilon + 4\|f\|_\infty\, c_n \int_{\delta}^{1} (1-t^2)^n\,dt < \epsilon + 4\epsilon\|f\|_\infty .
\end{aligned}
\]
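The convergence of the polynomials Ln can be observed numerically. The following is a rough sketch, not part of the proof: the test function f(x) = x(1 − x), the midpoint-rule quadrature, and the helper names are all assumptions made for illustration.

```python
def landau_approximation(f, n, x, steps=2000):
    """Midpoint-rule value of L_n(x) = c_n * int_{-1}^{1} f(x+t) (1-t^2)^n dt."""
    h = 2.0 / steps
    num = 0.0
    den = 0.0  # approximates 1/c_n = int_{-1}^{1} (1-t^2)^n dt (the factor h cancels)
    for k in range(steps):
        t = -1.0 + (k + 0.5) * h
        w = (1.0 - t * t) ** n
        num += f(x + t) * w
        den += w
    return num / den

def f(x):
    """A continuous function with f(0) = f(1) = 0, extended by 0 outside [0, 1]."""
    return x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

grid = [i / 20 for i in range(21)]  # sample points in [0, 1]

def sup_error(n):
    """Largest error |L_n(x) - f(x)| over the sample grid."""
    return max(abs(landau_approximation(f, n, x) - f(x)) for x in grid)

errors = {n: sup_error(n) for n in (4, 25, 100)}
```

On this grid the observed error decreases as n grows, which is consistent with the uniform convergence established above.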
The last example and the Weierstrass polynomial approximation theorem have
far-reaching generalizations. Their proofs require the full power of the Stone-
Weierstrass theorem. The proof of all three theorems can be found in section 4.9.
Example 3. Let f ∈ 𝒞[0, 1] be such that $\int_0^1 x^n f(x)\,dx = 0$ for every nonnegative
integer n. Then f = 0.
The discussion so far has been focused on scalar-valued functions, and all the
function spaces we have studied are normed linear spaces. We now expand the
discussion and consider functions that take values in a general metric space. The
next two examples are extensions of theorems 4.8.1 and 4.8.2.
Let I = [0, 1] be the closed unit interval, and let I2 = [0, 1] × [0, 1] be the closed
unit square; I is given the usual metric on ℝ, and we give I2 the product metric
𝜌((r, s), (u, v)) = max{|r − u|, |s − v|}. In theorem 4.8.9, we will make use of
the space 𝒞(I, I2 ) defined in example 5 with the complete metric D defined in
example 4.
Let J = [a, b] be an arbitrary closed interval, and let S be an arbitrary closed square.
We will refer to a function in 𝒞(J, S) as a path. We are particularly interested in the
four types of triangular paths g shown in figure 4.4. The triangles differ only in
orientation. Specifically, the intervals [a, (a + b)/2] and [(a + b)/2, b] are mapped
linearly onto the straight line segments of the triangle such that g(a) and g(b) are
adjacent corners of S and g((a + b)/2) is the center of S. See the formula defining
the path f0 in the proof of theorem 4.8.9.
Before we embark on the task of finding the space-filling curve, we describe
a special type of operation we need in the proof of the next theorem. Observe
that the paths g intersect only two of the four sub-squares that result from
bisecting the sides of S. We define the modified paths g′ as follows. Divide [a, b]
into four congruent subintervals Jj = [a + j(b − a)/4, a + (j + 1)(b − a)/4], 0 ≤ j ≤
3, and map the subinterval Jj linearly onto the four triangular paths that make up
the path g′ , as shown in figure 4.5. Observe that the paths g′ intersect all the sub-
squares of S.
We are now ready to find the space-filling curve. The statement of the theorem
below justifies the term space filling.
[Figure: panels (a), (b), (c), (d).]
Proof. We apply the operation discussed above the theorem to construct a sequence
(fn ) that converges to the desired function f.
Define the path f0 ∶ I → I2 by
\[ f_0(t) = \begin{cases} (t, t) & \text{if } 0 \le t \le 1/2, \\ (t, 1 - t) & \text{if } 1/2 \le t \le 1. \end{cases} \]
[Figures: panels (a), (b), (c), (d); three plots with axis ticks 0 and 1.]
A crucial feature of the sequence (fn ) is that if a triangular piece T of the path
fn is contained in a square S of side length $2^{-n}$, then the four triangular pieces of
fn+1 obtained by modifying T are contained in the same square S. Thus, for every
t ∈ I, $\rho(f_{n+1}(t), f_n(t)) < 2^{-n}$. Consequently, $D(f_{n+1}, f_n) < 2^{-n}$. This is the crux of
the proof.
Thus the sequence (fn ) is Cauchy, and the completeness of (𝒞(I, I2 ), D) guarantees
that fn converges to a function f ∈ 𝒞(I, I2 ).
Since I is compact, the range of f is compact and hence closed in I2 . The proof will
be complete if we show that the range of f is dense in I2 . Let x ∈ I2 , and let 𝜖 > 0.
Choose an integer n such that $2^{-n} < \epsilon/2$ and D(fn , f) < 𝜖/2. The point x belongs
to one of the $4^n$ squares that contain the triangular pieces of fn . Let S be such a
square. If t ∈ [0, 1] is such that fn (t) is on the triangular piece contained in S, then
$\rho(f_n(t), x) < 2^{-n}$ and
Exercise
Lemma 4.9.1. Let 𝒜 be a subalgebra of 𝒞(X, ℝ) satisfying SA1 and SA2. Then, for
f, g ∈ 𝒜, the functions max{f, g} and min{f, g} are in $\overline{\mathcal{A}}$ (the closure of 𝒜).
Proof. Since $\max\{f, g\} = \tfrac{1}{2}(f + g) + \tfrac{1}{2}|f - g|$ and $\min\{f, g\} = \tfrac{1}{2}(f + g) - \tfrac{1}{2}|f - g|$, and
since $\overline{\mathcal{A}}$ is a subspace of 𝒞(X, ℝ), it is sufficient to prove that $|f| \in \overline{\mathcal{A}}$ whenever
f ∈ 𝒜. Let M = ‖f‖∞ , and let 𝜖 > 0. By the Weierstrass approximation theorem
applied to the function g(t) = |t|, there exists a polynomial $p(t) = \sum_{j=0}^{n} a_j t^j$ (aj ∈
ℝ) such that, for all t ∈ [−M, M], | |t| − p(t)| < 𝜖. Consider the function
$p \circ f = \sum_{j=0}^{n} a_j f^j$. Since 𝒜 is an algebra, $p \circ f \in \mathcal{A}$, and since | |f(x)| − p(f(x))| < 𝜖 for all
x ∈ X, $\| |f| - p \circ f \|_\infty \le \epsilon$. Since 𝜖 is arbitrary, $|f| \in \overline{\mathcal{A}}$.
Lemma 4.9.2. Let 𝒜 be a subalgebra of 𝒞(X, ℝ) satisfying SA1 and SA2, and let f ∈
𝒞(X, ℝ). For every y, z ∈ X, there exists a function gyz ∈ 𝒜 such that gyz (y) = f(y)
and gyz (z) = f(z).
Proof. If y = z, define gyz (x) = f(y) (a constant function). Otherwise, by SA2, there
exists a function h ∈ 𝒜 such that h(y) ≠ h(z). The following function is in 𝒜 and
satisfies the requirements:
\[ g_{yz}(x) = f(z) + (f(y) - f(z))\,\frac{h(x) - h(z)}{h(y) - h(z)}. \]
of f − gy,z , Uy,z is open and, clearly, y, z ∈ Uy,z . In particular, for every x ∈ Uyz ,
f(x) < gy,z (x) + 𝜖 and f(x) > gy,z (x) − 𝜖. The collection {Uy,z ∶ y ∈ X} covers X.
Thus there exists a finite subset $\{y_1, \dots, y_{n_z}\}$ of X such that $\bigcup_{i=1}^{n_z} U_{y_i,z} = X$.
Define $g_z = \max\{g_{y_i,z} : 1 \le i \le n_z\}$, and let $V_z = \bigcap_{i=1}^{n_z} U_{y_i,z}$. The function $g_z$ is
in $\overline{\mathcal{A}}$ by lemma 4.9.1. Observe that
and
g = min{gzj ∶ 1 ≤ j ≤ m}.
A little reflection reveals that g(x) − 𝜖 < f(x) < g(x) + 𝜖 for all x ∈ X.
Proof. Clearly, 𝒜 is an algebra, and it contains all constant functions. To show that 𝒜
separates points in X, let x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ) be distinct points in
X. The polynomial $p(t_1, \dots, t_n) = \sum_{i=1}^{n} (t_i - x_i)^2$ satisfies p(x) = 0, and p(y) > 0.
By the Stone-Weierstrass theorem, 𝒜 is dense in 𝒞(X, ℝ).
Proof. Since compact metric spaces are separable, let {𝜉n ∶ n ∈ ℕ} be a countable
dense subset of X. For n ∈ ℕ, define fn (x) = d(x, 𝜉n ), and define f0 (x) = 1. Let
ℳ be the set of all finite products of the functions f0 , f1 , . . . , and let 𝒜 be the set
of all linear combinations with real coefficients of elements in ℳ. Clearly, 𝒜 is
a subalgebra of 𝒞(X, ℝ).¹¹ We show that 𝒜 separates points in X. If x and y are
distinct, let 𝛿 = d(x, y)/4. There exists a natural number n such that d(x, 𝜉n ) < 𝛿.
The function fn separates x and y since fn (x) < 𝛿, and fn (y) > 3𝛿.
By theorem 4.9.3, 𝒜 is dense in 𝒞(X, ℝ). We now show that the countable set
$\mathcal{A}_1 = \{\sum_{i=1}^{n} q_i g_i : n \in \mathbb{N},\ q_i \in \mathbb{Q},\ g_i \in \mathcal{M}\}$ is dense in 𝒞(X, ℝ). By the first part of
the proof, it is enough to show that if $f = \sum_{i=1}^{n} a_i g_i \in \mathcal{A}$ and 𝜖 > 0, then there exists
an element h ∈ 𝒜1 such that ‖f − h‖∞ < 𝜖. Let M = max{‖gi ‖∞ ∶ 1 ≤ i ≤ n}
and choose rational numbers qi such that |ai − qi | < 𝜖/(nM). Set $h = \sum_{i=1}^{n} q_i g_i$.
Clearly, ‖f − h‖∞ < 𝜖.
To show that 𝒞(X, ℂ) is separable, let f = f1 + if2 ∈ 𝒞(X, ℂ), and choose functions
h1 and h2 in 𝒜1 such that ‖f1 − h1 ‖∞ < 𝜖/2, and ‖f2 − h2 ‖∞ < 𝜖/2. The function
h = h1 + ih2 is in 𝒜1 + i𝒜1 and satisfies ‖f − h‖∞ < 𝜖. Since 𝒜1 + i𝒜1 is countable,
the proof is complete.
Theorem 4.9.3 does not extend to 𝒞(X, ℂ), as we show in example 1 below. First
we need a definition.
Definition. Let 𝒞(𝒮1 , ℂ)¹² be the space of all continuous complex functions on
[−𝜋, 𝜋] such that f(−𝜋) = f(𝜋). It is clear that 𝒞(𝒮1 , ℂ) is a closed subspace of
𝒞[−𝜋, 𝜋] when both spaces are given the uniform norm.
Another way to view the space 𝒞(𝒮1 , ℂ) is as follows. The restriction of any
continuous, 2𝜋-periodic function g ∶ ℝ → ℂ to the interval [−𝜋, 𝜋] is in the space
𝒞(𝒮1 , ℂ). Conversely, any function f ∈ 𝒞(𝒮1 , ℂ) can be extended by periodicity
to a continuous, 2𝜋-periodic function. Thus the space 𝒞(𝒮1 , ℂ) is also the space
of continuous, 2𝜋-periodic functions. Every point 𝜃 ∈ [−𝜋, 𝜋) corresponds to a
unique point $e^{i\theta}$ on the unit circle 𝒮1 in the complex plane, and, for every function
f ∈ 𝒞(𝒮1 , ℂ), there corresponds a function $\tilde{f} : \mathcal{S}^1 \to \mathbb{C}$, where $\tilde{f}(e^{i\theta}) = f(\theta)$ (here
𝜃 ∈ [−𝜋, 𝜋)). The correspondence $f \leftrightarrow \tilde{f}$ is unambiguous because of the condition
f(−𝜋) = f(𝜋). Therefore the space of continuous, 2𝜋-periodic functions can also be thought of
as the space of continuous functions on the unit circle 𝒮1 . We adopt any of the
three equivalent characterizations of 𝒞(𝒮1 , ℂ), as convenience dictates.
12 The reason for the notation will be justified in the next paragraph.
$\int_0^{2\pi} f p\,dt = \sum_{j=0}^{n} a_j \int_0^{2\pi} e^{i(j+1)t}\,dt = 0$. Because $|f|^2 = f\bar{f} = 1$,
\[ 2\pi = \Big| \int_0^{2\pi} f\bar{f}\,dt \Big| = \Big| \int_0^{2\pi} f(\bar{f} - p)\,dt \Big| \le \int_0^{2\pi} |\bar{f} - p|\,dt \le 2\pi \|\bar{f} - p\|_\infty . \]
Thus $\|\bar{f} - p\|_\infty \ge 1$, as claimed.
Proof. Let ℛ = {Re(f) ∶ f ∈ 𝒜}, and let ℐ = {Im(f) ∶ f ∈ 𝒜}. If f = f1 + if2 ∈ 𝒜, then
if = −f2 + if1 ∈ 𝒜. Thus f2 ∈ ℛ and f1 ∈ ℐ. It follows that ℐ = ℛ. First we show
that ℛ satisfies SA1 and SA2. It is clear that ℛ contains all constant functions. If
x and y are distinct points of X, then there exists f ∈ 𝒜 such that f(x) ≠ f(y). Thus
f1 (x) ≠ f1 (y) or f2 (x) ≠ f2 (y). Because f1 and f2 are in ℛ, ℛ separates points in X.
Theorem 4.9.3 implies that ℛ is dense in 𝒞(X, ℝ). Because 𝒜 is closed under complex
conjugation, $f_1 = (f + \bar{f})/2 \in \mathcal{A}$; thus ℛ ⊆ 𝒜, and hence ℛ + iℛ ⊆ 𝒜. We show
that ℛ + iℛ is dense in 𝒞(X, ℂ). By the density of ℛ in 𝒞(X, ℝ), there are functions
h1 , h2 ∈ ℛ such that ‖f1 − h1 ‖∞ < 𝜖/2 and ‖f2 − h2 ‖∞ < 𝜖/2. The function h =
h1 + ih2 is in ℛ + iℛ and ‖f − h‖∞ < 𝜖.
Example 2. For n ∈ ℤ, let $u_n(t) = e^{int}$, and consider the set 𝒯 = Span({un ∶ n ∈
ℤ}). 𝒯 is clearly a subalgebra of 𝒞(𝒮1 , ℂ) that satisfies the assumptions of
theorem 4.9.6. Therefore 𝒯 is dense in 𝒞(𝒮1 , ℂ).
The last example is really a well-known theorem. We will expand this discussion
in a more focused manner in the next section.
In section 3.7 we studied the geometry of inner product spaces more than their
metric properties. We now have a bigger toolbox with which we can tackle
inner product spaces. Before we pose the central questions of this section, let us
summarize the highlights of section 3.7, upon which this section rests heavily. Let
{u1 , u2 , . . . } be an infinite orthonormal sequence of vectors in an inner product
space H. The orthogonal projection of an element x ∈ H on the finite-dimensional
space Mn = Span({u1 , . . . , un }) is, by definition, the vector $S_n x = \sum_{i=1}^{n} \langle x, u_i \rangle u_i$. We
know from theorem 3.7.6 that the vector Sn x is the closest vector in Mn to x,
and we also say that Sn x is the best approximation of x in Mn . Now that we have
Fourier series
In section 3.7, we defined the inner product
\[ \langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx \]
on the space 𝒞[−𝜋, 𝜋]. The sequence
\[ \{u_n(t) = e^{int} : n \in \mathbb{Z}\} \]
is an orthonormal sequence with respect to the above inner product. The norm
of a function f induced by the inner product will be denoted by ‖f‖2 in order
to distinguish it from the uniform norm on 𝒞[−𝜋, 𝜋], which will also play a
prominent role in this section. Thus the uniform norm of a function f ∈ 𝒞[a, b]
will be denoted by the usual notation ‖f‖∞ , while
\[ \|f\|_2 = \Big( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx \Big)^{1/2}. \]
For a function f ∈ 𝒞[−𝜋, 𝜋], we define the Fourier series of f to be the formal
series
\[ \sum_{n=-\infty}^{\infty} \hat{f}(n)e^{inx}, \qquad\text{where}\qquad \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)e^{-int}\,dt. \]
The numbers $\hat{f}(n)$, n ∈ ℤ, are called the Fourier coefficients of f. It is clear that
the partial sum of the Fourier series,
\[ S_n f(x) = \sum_{j=-n}^{n} \hat{f}(j)e^{ijx} \]
Using the terminology we just established, example 2 in section 4.9 can be stated
as follows.
Lemma 4.10.2. For a function f ∈ 𝒞[−𝜋, 𝜋] (not necessarily periodic), and for every
𝜖 > 0, there exists a 2𝜋-periodic function g such that ‖f − g‖2 < 𝜖.
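Proof. Let M = ‖f‖∞ and define $\delta = \frac{\epsilon^2}{8M^2}$. Define g ∈ 𝒞(𝒮1 ) as follows:
\[ g(x) = \begin{cases} \dfrac{f(-\pi+\delta)}{\delta}(x + \pi) & \text{if } -\pi \le x \le -\pi + \delta, \\ f(x) & \text{if } -\pi + \delta \le x \le \pi - \delta, \\ \dfrac{f(\pi-\delta)}{-\delta}(x - \pi) & \text{if } \pi - \delta \le x \le \pi. \end{cases} \]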
Figure 4.7 below shows how f is modified on the subintervals [−𝜋, −𝜋 + 𝛿] and
[𝜋 − 𝛿, 𝜋] to produce g. We replace the graph of f on the subinterval [−𝜋, −𝜋 + 𝛿]
with the straight line that interpolates the points (−𝜋 + 𝛿, f(−𝜋 + 𝛿)) and (−𝜋, 0),
and similarly on the subinterval [𝜋 − 𝛿, 𝜋]. The dotted lines in figure 4.7 indicate
the modification of f to produce g. By construction, g is continuous and periodic.
Also, for x ∈ [−𝜋, 𝜋], |f(x) − g(x)| < 2M. Now
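\[
\begin{aligned}
\|f - g\|_2^2 &= \frac{1}{2\pi} \int_{-\pi}^{-\pi+\delta} |f(x) - g(x)|^2\,dx + \frac{1}{2\pi} \int_{\pi-\delta}^{\pi} |f(x) - g(x)|^2\,dx \\
&\le \frac{4M^2}{2\pi} \int_{-\pi}^{-\pi+\delta} dx + \frac{4M^2}{2\pi} \int_{\pi-\delta}^{\pi} dx = \frac{8M^2\delta}{2\pi} = \frac{\epsilon^2}{2\pi} < \epsilon^2 .
\end{aligned}
\]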
[Figure 4.7: the modification of f on [−𝜋, −𝜋 + 𝛿] and [𝜋 − 𝛿, 𝜋] that produces g.]
Proof. Let f ∈ 𝒞[−𝜋, 𝜋], and let 𝜖 > 0. By lemma 4.10.2, there is a function g ∈
𝒞(𝒮1 ) such that ‖f − g‖2 < 𝜖. By theorem 4.10.1, there exists a trigonometric
polynomial p such that ‖g − p‖∞ < 𝜖. Now
We are now able to settle a question posed in the preamble to this section.
Theorem 4.10.4. For every function f ∈ 𝒞[−𝜋, 𝜋], the sequence of partial sums Sn f
converges in the mean square to f.
Proof. We need to show that limn ‖f − Sn f‖2 = 0. Let 𝜖 > 0. By corollary 4.10.3,
there exists a trigonometric polynomial $p = \sum_{j=-N}^{N} c_j u_j$ such that ‖f − p‖2 <
𝜖. For every n ≥ N, p ∈ Mn = Span({uj ∶ −n ≤ j ≤ n}). Because Sn f is the
best approximation of f in Mn , it follows that, for every n ≥ N, ‖f − Sn f‖2 ≤
‖f − p‖2 < 𝜖.
We take a short detour to discuss the sum of a two-sided sequence. The concept
framed in the following, more general, definition is sometimes useful. See the
excursion in section 7.2.
∑{a𝛼 ∶ 𝛼 ∈ I} = sup{∑𝛼∈F a𝛼 }.
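Example 2. Let J be a countable set and suppose that ∑{a𝛼 ∶ 𝛼 ∈ J} < ∞, where
a𝛼 ≥ 0. If 𝛼1 , 𝛼2 , . . . is any enumeration of J, then $\sum\{a_\alpha : \alpha \in J\} = \sum_{i=1}^{\infty} a_{\alpha_i}$.
For an integer N, $\sum_{i=1}^{N} a_{\alpha_i} \le \sum\{a_\alpha : \alpha \in J\}$. Thus the partial sums of the
series $\sum_{i=1}^{\infty} a_{\alpha_i}$ are bounded by ∑{a𝛼 ∶ 𝛼 ∈ J} and hence $\sum_{n=1}^{\infty} a_{\alpha_n} \le \sum\{a_\alpha : \alpha \in J\}$.
Conversely, if F is an arbitrary finite subset of J, then there is an integer N such
that F ⊆ {𝛼1 , . . . , 𝛼N }. Therefore $\sum\{a_\alpha : \alpha \in F\} \le \sum_{n=1}^{N} a_{\alpha_n} \le \sum_{n=1}^{\infty} a_{\alpha_n}$. Thus
$\sum\{a_\alpha : \alpha \in J\} \le \sum_{n=1}^{\infty} a_{\alpha_n}$.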
A special case of the above examples is when (an )n∈ℤ is a two-sided sequence
of nonnegative numbers.¹³ The series $\sum_{n=-\infty}^{\infty} a_n$ can be defined (for example)
as $\lim_{n\to\infty} \sum_{i=-n}^{n} a_i$, which corresponds to the following enumeration of ℤ:
0, −1, 1, −2, 2, −3, 3, . . . .
We can now define, for 1 ≤ p < ∞, the space lp (ℤ) to be the set of all two-sided
sequences x = (xn )n∈ℤ such that $\sum_{n=-\infty}^{\infty} |x_n|^p < \infty$. It is easy to check that lp (ℤ) is
a complete normed linear space with the norm $\big( \sum_{n=-\infty}^{\infty} |x_n|^p \big)^{1/p}$.
Similarly, we define l∞ (ℤ) to be the space of all bounded scalar functions
x ∶ ℤ → 𝕂, which is a complete normed linear space with the norm ‖x‖∞ =
sup{|x(n)| ∶ n ∈ ℤ}.
We also define the space c0 (ℤ) as the subspace of l∞ (ℤ) of all two-sided sequences
x ∈ l∞ (ℤ) such that lim|n|→∞ xn = 0.
\[ \|f\|_2^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2 . \]
Proof. By the continuity of norms (see section 4.3), $\lim_n \|S_n f\|_2^2 = \|f\|_2^2$. By the
Pythagorean theorem, $\|S_n f\|_2^2 = \big\| \sum_{j=-n}^{n} \hat{f}(j)u_j \big\|_2^2 = \sum_{j=-n}^{n} |\hat{f}(j)|^2$. We obtain the
required result by taking the limit of both sides as n → ∞.
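\[
\frac{1}{\pi} \int_0^{\pi} x^2 \cos(nx)\,dx = \frac{-2}{\pi n} \int_0^{\pi} x \sin(nx)\,dx = \frac{2x}{\pi n^2} \cos(nx) \Big|_0^{\pi} = \frac{2\cos n\pi}{n^2} = \frac{2(-1)^n}{n^2}.
\]
Thus
\[ \hat{f}(n) = \hat{f}(-n) = \frac{2(-1)^n}{n^2}. \]
We also have
\[ \hat{f}(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^2\,dx = \frac{\pi^2}{3}. \]
Theorem 4.10.5 now yields
\[
\frac{\pi^4}{5} = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^4\,dx = |\hat{f}(0)|^2 + \sum_{|n|=1}^{\infty} |\hat{f}(n)|^2 = \frac{\pi^4}{9} + 2 \sum_{n=1}^{\infty} |\hat{f}(n)|^2 = \frac{\pi^4}{9} + \sum_{n=1}^{\infty} \frac{8}{n^4}.
\]
Hence
\[ \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}. \]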
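The computation above can be sanity-checked numerically. The sketch below is illustrative only: it approximates the Fourier coefficients of f(x) = x² with a midpoint rule (the quadrature, the step count, and the helper names are assumptions) and compares a partial sum of ∑ 1/n⁴ with 𝜋⁴/90.

```python
import cmath
import math

def fourier_coefficient(f, n, steps=20000):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} f(t) e^{-int} dt."""
    h = 2.0 * math.pi / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        t = -math.pi + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2.0 * math.pi)

def square(x):
    """The function f(x) = x^2 of the example."""
    return x * x

# Numerical coefficients for n = 0, ..., 5 and the closed form 2(-1)^n / n^2.
coefficients = {n: fourier_coefficient(square, n) for n in range(0, 6)}
closed_form = {n: 2.0 * (-1) ** n / n ** 2 for n in range(1, 6)}

# Partial sum of the series whose value the example identifies as pi^4 / 90.
partial_sum = sum(1.0 / n ** 4 for n in range(1, 2001))
```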
The next result says that a function in 𝒞[−𝜋, 𝜋] is determined by its Fourier
coefficients.
Corollary 4.10.6 (the uniqueness theorem). If f, g ∈ 𝒞[−𝜋, 𝜋] and $\hat{f}(n) = \hat{g}(n)$ for
every n ∈ ℤ, then f = g.
Thus the sequence of functions Fn (x) is Cauchy in 𝒞(𝒮1 ) and hence converges
uniformly on [−𝜋, 𝜋] to some function F ∈ 𝒞(𝒮1 ). Thus limn ‖Fn − F‖∞ = 0.
But ‖Fn − F‖2 ≤ ‖Fn − F‖∞ , and hence Fn converges to F in ‖.‖2 . Since Fn also
converges to f in ‖.‖2 (theorem 4.10.4), F = f by the uniqueness of limits.
\[ x^2 = \frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{(-1)^n \cos(nx)}{n^2}, \qquad -\pi \le x \le \pi. \]
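The uniform convergence of this expansion can be observed directly. The following sketch (helper names and grid are assumptions) evaluates partial sums on a grid in [−𝜋, 𝜋] and records the largest deviation from x². At x = 𝜋 every term is positive, so the expansion there reduces to ∑ 1/n² = 𝜋²/6.

```python
import math

def partial_sum(x, terms):
    """pi^2/3 + 4 * sum_{n=1}^{terms} (-1)^n cos(nx) / n^2."""
    s = sum((-1) ** n * math.cos(n * x) / n ** 2 for n in range(1, terms + 1))
    return math.pi ** 2 / 3 + 4 * s

# 21 equally spaced sample points from -pi to pi.
grid = [-math.pi + k * math.pi / 10 for k in range(21)]

# Largest deviation from x^2 on the grid for increasing numbers of terms.
sup_errors = {
    N: max(abs(partial_sum(x, N) - x * x) for x in grid)
    for N in (10, 100, 1000)
}
```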
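\[
\hat{F}(n) = \frac{-1}{2\pi i n} F(t)e^{-int} \Big|_{-\pi}^{\pi} + \frac{1}{2\pi i n} \int_{-\pi}^{\pi} f(t)e^{-int}\,dt = \frac{1}{in} \hat{f}(n).
\]
Using the inequality $|ab| \le \frac{1}{2}[|a|^2 + |b|^2]$, we have
\[ |\hat{F}(n)| \le \frac{1}{2} \Big[ \frac{1}{n^2} + |\hat{f}(n)|^2 \Big] \]
and
\[ \sum_{|n|=1}^{\infty} |\hat{F}(n)| \le \frac{1}{2} \Big[ 2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \sum_{|n|=1}^{\infty} |\hat{f}(n)|^2 \Big] < \infty . \]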
Consequently, $\int_a^b p(x)\omega(x)\,dx < \infty$ for every polynomial p. Neither the function 𝜔
nor the interval (a, b) is assumed to be bounded.
When either 𝜔 or (a, b) is unbounded, we interpret the integrals involved
as improper Riemann integrals according to the standard definitions. Observe
that if (a, b) is a bounded interval, then 𝜔 can be unbounded if and only if
limx↓a 𝜔(x) = ∞ or limx↑b 𝜔(x) = ∞. See the weight function for the Tchebychev
polynomials later on in this section.
\[ \int_a^b |f(x)|^2 \omega(x)\,dx < \infty . \]
\[ \int_a^b |f(x)g(x)|\,\omega(x)\,dx \le \frac{1}{2} \int_a^b [|f(x)|^2 + |g(x)|^2]\,\omega(x)\,dx < \infty \]
and
\[
\int_a^b |f + g|^2 \omega(x)\,dx \le \int_a^b (|f| + |g|)^2 \omega(x)\,dx = \int_a^b \big( |f|^2 \omega(x) + 2|fg|\omega(x) + |g|^2 \omega(x) \big)\,dx < \infty .
\]
\[ P_n(x) = \frac{1}{2^n n!} D^n (x^2 - 1)^n \]
\[ \alpha_n = \frac{1}{2^n n!} (2n)(2n - 1) \cdots (n + 1) = \frac{(2n)!}{2^n (n!)^2}. \]
where
If j < n − 1, cj ‖Pj ‖22 = −𝛽n ⟨xPn , Pj ⟩ = −𝛽n ⟨Pn , xPj ⟩ = 0, since xPj has degree
less than n. Thus Pn+1 − 𝛽n xPn = cn Pn + cn−1 Pn−1 . Now
\[ c_n \|P_n\|_2^2 = -\beta_n \langle xP_n, P_n \rangle = -\beta_n \int_{-1}^{1} x P_n^2(x)\,dx = 0, \]
Evaluating the last identity at x = 1, we obtain $c_{n-1} = 1 - \beta_n = \frac{-n}{n+1}$, and we
have the recurrence relation:
\[ P_{n+1} = \frac{2n+1}{n+1}\, x P_n - \frac{n}{n+1}\, P_{n-1}. \]
\[
\begin{aligned}
P_0(x) &= 1, \\
P_1(x) &= x, \\
P_2(x) &= \tfrac{1}{2}(3x^2 - 1), \\
P_3(x) &= \tfrac{1}{2}(5x^3 - 3x), \\
P_4(x) &= \tfrac{1}{8}(35x^4 - 30x^2 + 3).
\end{aligned}
\]
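The list can be regenerated from the three-term recurrence $P_{n+1} = \frac{2n+1}{n+1} x P_n - \frac{n}{n+1} P_{n-1}$ with P0 = 1 and P1 = x. The sketch below is illustrative: the function name and the representation of a polynomial as a list of exact rational coefficients (lowest degree first) are assumptions.

```python
from fractions import Fraction

def legendre(n_max):
    """Coefficient lists (lowest degree first) of P_0, ..., P_{n_max}, built from
    P_{n+1} = ((2n+1)/(n+1)) x P_n - (n/(n+1)) P_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]                   # multiply P_n by x
        prev = polys[n - 1] + [Fraction(0), Fraction(0)]  # pad P_{n-1} to the same length
        polys.append([Fraction(2 * n + 1, n + 1) * s - Fraction(n, n + 1) * t
                      for s, t in zip(x_pn, prev)])
    return polys[: n_max + 1]

P = legendre(4)
# The sum of the coefficients of P_n is the value P_n(1).
values_at_one = [sum(p) for p in P]
```

Exact rational arithmetic avoids rounding, so the coefficients can be compared directly with the list above.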
\[ \int_{-1}^{1} [P_n(x)]^2\,dx = \frac{2}{2n+1}. \]
Let $a_n = \int_{-1}^{1} [P_n(x)]^2\,dx$. Taking the inner product of Pn with both sides of the
identity $P_n = \frac{2n-1}{n} x P_{n-1} - \frac{n-1}{n} P_{n-2}$, we obtain $a_n = \frac{2n-1}{n} \int_{-1}^{1} (xP_n) P_{n-1}\,dx$.
Using the recurrence relation again, xPn = [(n + 1)Pn+1 + nPn−1 ]/(2n + 1),
and hence
\[ a_n = \frac{2n-1}{n} \cdot \frac{1}{2n+1} \int_{-1}^{1} P_{n-1} [(n+1)P_{n+1} + nP_{n-1}]\,dx = \frac{2n-1}{2n+1}\, a_{n-1}. \]
Now $a_0 = \int_{-1}^{1} dx = 2$. By induction, one obtains
\[ a_n = \frac{2}{2n+1}. \]
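The orthogonality relations and the value 2/(2n + 1) can be confirmed exactly for small n. The sketch below is illustrative, with hypothetical helper names: it rebuilds the Legendre polynomials from the recurrence and integrates products of polynomials over [−1, 1] term by term in rational arithmetic.

```python
from fractions import Fraction

def legendre(n_max):
    """Coefficient lists (lowest degree first) of P_0, ..., P_{n_max} via the recurrence."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]
        prev = polys[n - 1] + [Fraction(0), Fraction(0)]
        polys.append([Fraction(2 * n + 1, n + 1) * s - Fraction(n, n + 1) * t
                      for s, t in zip(x_pn, prev)])
    return polys[: n_max + 1]

def integrate_product(p, q):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, s in enumerate(p):
        for j, t in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers of x integrate to 0 on [-1, 1]
                total += s * t * Fraction(2, i + j + 1)
    return total

P = legendre(5)
# Gram matrix of inner products <P_m, P_n> for 0 <= m, n <= 5.
gram = [[integrate_product(p, q) for q in P] for p in P]
```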
It follows that the polynomials below are orthonormal in (𝒞[−1, 1], ‖.‖2 ):
\[ \tilde{P}_n = \sqrt{\frac{2n+1}{2}}\; P_n . \]
Theorem 4.10.8 (mean square convergence). For every f ∈ 𝒞[−1, 1], the sequence
$S_n f = \sum_{i=0}^{n} \langle f, \tilde{P}_i \rangle \tilde{P}_i$ converges to f in the sense that limn ‖Sn f − f‖2 = 0.
Proof. Let 𝜖 > 0. By the Weierstrass approximation theorem, there exists a polynomial
q such that ‖f − q‖∞ < 𝜖/√2. Now ‖f − q‖2 ≤ √2‖f − q‖∞ < 𝜖. Let N be the
Observe the resemblance between the proof of the last theorem and that of
theorem 4.10.4. See also the examples in section 6.1.
Lemma 4.10.9. For n ≥ 0, there exists a polynomial Tn of exact degree n such that,
for all x ∈ ℝ, cos(nx) = Tn (cos x).
Proof. For n = 0, 1, the polynomials T0 (x) = 1 and T1 (x) = x trivially satisfy the
requirements. The rest of the construction is inductive. Suppose that there are
polynomials T0 , . . . , Tn that satisfy the statement we wish to prove. For n ≥ 1, we
have cos(n + 1)x + cos(n − 1)x = 2cos(nx)cos x. Therefore
cos(n + 1)x = 2cos(nx)cos x − cos(n − 1)x = 2cos x Tn (cos x) − Tn−1 (cos x).
The last identity dictates the definition of Tn+1 and concludes the proof:
Tn+1 (x) = 2xTn (x) − Tn−1 (x).
Definition. The polynomials Tn in the previous lemma are called the Tchebychev
polynomials. A list of the next three Tchebychev polynomials appears below:
\[
\begin{aligned}
T_2(x) &= 2x^2 - 1, \\
T_3(x) &= 4x^3 - 3x, \\
T_4(x) &= 8x^4 - 8x^2 + 1.
\end{aligned}
\]
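The identity in the proof of lemma 4.10.9 gives the recurrence $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$ with T0 = 1 and T1 = x. The sketch below (function names and the coefficient-list representation are assumptions) generates the polynomials from that recurrence and checks cos(n𝜃) = Tn (cos 𝜃) on a grid of angles.

```python
import math

def chebyshev(n_max):
    """Integer coefficient lists (lowest degree first) of T_0, ..., T_{n_max},
    built from T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)."""
    polys = [[1], [0, 1]]
    for n in range(1, n_max):
        two_x_tn = [0] + [2 * c for c in polys[n]]   # 2x * T_n
        prev = polys[n - 1] + [0, 0]                 # pad T_{n-1} to the same length
        polys.append([s - t for s, t in zip(two_x_tn, prev)])
    return polys[: n_max + 1]

def evaluate(p, x):
    """Horner evaluation of a coefficient list at x."""
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return value

T = chebyshev(6)
thetas = [k * math.pi / 50 for k in range(101)]
# Largest violation of cos(n*theta) = T_n(cos(theta)) over the grid, n = 0, ..., 6.
max_defect = max(
    abs(evaluate(T[n], math.cos(theta)) - math.cos(n * theta))
    for n in range(7)
    for theta in thetas
)
```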
Theorem 4.10.10. The Tchebychev polynomials are orthogonal with respect to the
weight function 𝜔. Additionally,
\[ \|T_0\|_2^2 = \pi \]
and, for n ≥ 1,
\[ \|T_n\|_2^2 = \frac{\pi}{2}. \]
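\[
\langle T_m, T_n \rangle = \int_{-1}^{1} \frac{T_n(x)T_m(x)}{\sqrt{1 - x^2}}\,dx = \int_0^{\pi} \cos(n\theta) \cos(m\theta)\,d\theta = \frac{1}{2} \int_0^{\pi} [\cos(m+n)\theta + \cos(m-n)\theta]\,d\theta = 0.
\]
Finally,
\[
\|T_n\|_2^2 = \int_{-1}^{1} \frac{[T_n(x)]^2}{\sqrt{1 - x^2}}\,dx = \int_0^{\pi} \cos^2(n\theta)\,d\theta = \frac{1}{2} \int_0^{\pi} [1 + \cos(2n\theta)]\,d\theta = \frac{\pi}{2}.
\]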
The basic properties of the Tchebychev polynomials appear below. The first three
follow from the three-term recurrence relation and induction:
6. The extreme values of Tn in [−1, 1] are attained at the points $y_k = \cos\frac{\pi k}{n}$,
0 ≤ k ≤ n. Additionally, $T_n(y_k) = (-1)^k$. Again a direct verification is the
simplest or, as before, we write x = cos 𝜃, then Tn (x) = cos(n𝜃), and
\[ \frac{dT_n(x)}{dx} = \frac{d\cos(n\theta)}{d\theta}\, \frac{d\theta}{dx} = \frac{n \sin(n\theta)}{\sqrt{1 - x^2}}. \]
The interested reader can work out the calculus and
arrive at the points yk .
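For n ≥ 1, let $\tilde{T}_n = \frac{1}{2^{n-1}} T_n$. From the above properties of Tn , $\tilde{T}_n$ is a monic
polynomial,¹⁴ and
\[
\tilde{T}_n(x_k) = 0 \ \text{ for } 1 \le k \le n, \qquad \tilde{T}_n(y_k) = \frac{(-1)^k}{2^{n-1}} \ \text{ for } 0 \le k \le n, \qquad\text{and}\qquad \|\tilde{T}_n\|_\infty = \frac{1}{2^{n-1}}.
\]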
The following theorem establishes the curious fact that, among all monic polynomials
of degree n, T̃ n has the least uniform norm on [−1, 1]. This result is
important for understanding the error when a sufficiently differentiable function
is interpolated by a polynomial.
Proof. Suppose, for a contradiction, that $\|p\|_\infty < \frac{1}{2^{n-1}}$. Consider the integers
0 ≤ k ≤ n.
If k is odd, then $p(y_k) > \frac{-1}{2^{n-1}} = \tilde{T}_n(y_k)$. If k is even, then $p(y_k) < \frac{1}{2^{n-1}} = \tilde{T}_n(y_k)$.
(a, b) = (−∞, ∞)
and
\[ \omega(x) = e^{-x^2}. \]
We will show that the polynomials defined below are orthogonal with respect
to 𝜔:
\[ H_n(x) = (-1)^n e^{x^2} D^n [e^{-x^2}]. \]
Since H0 = 1, and H1 (x) = 2x, $\langle H_0, H_1 \rangle = \int_{-\infty}^{\infty} 2x e^{-x^2}\,dx = 0$. We now use induction
on n. If 0 ≤ j < n, then integration by parts yields
\[
\begin{aligned}
\langle x^j, H_n \rangle &= \int_{-\infty}^{\infty} x^j (-1)^n e^{x^2} D^n[e^{-x^2}]\, e^{-x^2}\,dx = (-1)^n \int_{-\infty}^{\infty} x^j D^n[e^{-x^2}]\,dx \\
&= (-1)^n x^j D^{n-1} e^{-x^2} \Big|_{-\infty}^{\infty} - (-1)^n \int_{-\infty}^{\infty} j x^{j-1} D^{n-1} e^{-x^2}\,dx.
\end{aligned}
\]
The first term of the last expression is 0 because $x^j D^{n-1} e^{-x^2}$ is the product of a
polynomial and $e^{-x^2}$, and the second term is 0 by the inductive hypothesis.
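The orthogonality can be verified exactly for small indices. The sketch below is illustrative and rests on two facts not stated in this passage: differentiating the defining formula gives $H_{n+1} = 2xH_n - H_n'$, and the Gaussian moments are $\int_{-\infty}^{\infty} x^{2j} e^{-x^2}\,dx = \sqrt{\pi}\,(2j-1)!!/2^j$ (odd moments vanish). Inner products are computed divided by √𝜋, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def hermite(n_max):
    """Integer coefficient lists (lowest degree first) of H_0, ..., H_{n_max},
    built from H_{n+1} = 2x H_n - H_n'."""
    polys = [[1]]
    for _ in range(n_max):
        h = polys[-1]
        two_x_h = [0] + [2 * c for c in h]                   # 2x * H_n
        dh = [k * c for k, c in enumerate(h)][1:] + [0, 0]   # H_n', padded with zeros
        polys.append([s - t for s, t in zip(two_x_h, dh)])
    return polys

def gaussian_moment(k):
    """Integral of x^k e^{-x^2} over the real line, divided by sqrt(pi)."""
    if k % 2 == 1:
        return Fraction(0)
    m = Fraction(1)
    for odd in range(1, k, 2):       # builds (k-1)!! / 2^(k/2)
        m *= Fraction(odd, 2)
    return m

def inner(p, q):
    """<p, q> with weight e^{-x^2}, divided by sqrt(pi)."""
    return sum(s * t * gaussian_moment(i + j)
               for i, s in enumerate(p) for j, t in enumerate(q))

H = hermite(5)
gram = [[inner(p, q) for q in H] for p in H]
```

In this normalization the diagonal entries come out as 2ⁿ n!, the standard value of ‖Hn ‖² divided by √𝜋.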
We leave some of the properties of the Hermite polynomials as exercises for the
interested reader.
Exercises
1. Let $f, g \in \mathcal{C}[-\pi, \pi]$. Prove that $\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx = \sum_{n=-\infty}^{\infty} \hat{f}(n)\overline{\hat{g}(n)}$. Hint:
Use theorem 4.10.4 and the continuity of inner products. See section 4.3.
2. Show that $|x| = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1}^{\infty} \frac{\cos((2n-1)x)}{(2n-1)^2}$, $-\pi \le x \le \pi$. Conclude that
$\sum_{n=1}^{\infty} \frac{1}{(2n-1)^2} = \frac{\pi^2}{8}$.
3. Show that $\frac{\pi^2 x - x^3}{12} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}\sin(nx)}{n^3}$.
4. Use the previous problem to show that $\sum_{n=1}^{\infty} \frac{1}{n^6} = \frac{\pi^6}{945}$.
5. This exercise furnishes the three-term recurrence relation for general
orthogonal polynomials with respect to a weight function $\omega$ on an interval
$(a, b)$, and the inner product $\langle f, g\rangle = \int_a^b f(x)g(x)\omega(x)\,dx$. Let $\phi_0, \phi_1, \dots$ be
the orthogonal monic polynomials with respect to the weight function
$\omega$, where $\phi_0 = 1$, and $\phi_n$ has degree n.¹⁵ Prove the three-term recurrence
relation below:
\[ \phi_{n+1}(x) = (x - a_{n+1})\phi_n(x) - b_{n+1}^2\,\phi_{n-1}(x), \]
where
\[ a_{n+1} = \frac{\langle x\phi_n, \phi_n\rangle}{\|\phi_n\|^2} \]
and
\[ b_{n+1} = \frac{\|\phi_n\|}{\|\phi_{n-1}\|}. \]
Here $n \ge 0$, and, for notational convenience, define $\phi_{-1} = 0$, $b_1 = 0$.
¹⁵ Observe that these are precisely the orthogonal polynomials generated by applying the Gram-
Schmidt process to the monomials $1, x, x^2, \dots$.
6. In the notation of the previous exercise, prove that the roots of $\phi_n$ are the
eigenvalues of the tri-diagonal matrix
\[ J_n = \begin{pmatrix} a_1 & b_2 & & \\ b_2 & a_2 & \ddots & \\ & \ddots & \ddots & b_n \\ & & b_n & a_n \end{pmatrix}. \]
7. Prove that all the roots of $\phi_n$ are real and simple and lie in the interval $(a, b)$.
Outline: Since $\langle \phi_0, \phi_n\rangle = 0$, $\int_a^b \phi_n\,\omega\,dx = 0$. Thus $\phi_n$ changes sign in $(a, b)$,
and hence it has at least one root of odd multiplicity. Let $x_1, \dots, x_r$ be the
roots of $\phi_n$ of odd multiplicity in $(a, b)$, and let $q = (x - x_1)\cdots(x - x_r)$. If
$r < n$, examine $\langle q, \phi_n\rangle$.
8. Prove that the Legendre polynomial $P_n$ satisfies the differential equation
$(x^2 - 1)P_n'' + 2xP_n' - n(n + 1)P_n = 0$.
9. Prove that the sum of the coefficients of any Legendre polynomial is 1. The
same is true for the Tchebychev polynomials.
10. Prove that $\mathcal{C}[-1, 1]$ is contained in the space of continuous square integrable
functions on $(-1, 1)$ with respect to $\omega(x) = \frac{1}{\sqrt{1-x^2}}$. The integrals
involved are improper Riemann integrals.
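11. Define the normalized Tchebychev polynomials $\hat{T}_0 = \frac{1}{\sqrt{\pi}}$, $\hat{T}_n = \sqrt{\frac{2}{\pi}}\,T_n$.
For a function $f \in \mathcal{C}[-1, 1]$ let $S_n f = \sum_{j=0}^{n} \langle f, \hat{T}_j\rangle \hat{T}_j$. Prove that
$\lim_n \|S_n f - f\|_2 = 0$.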
12. Prove that $H_{n+1} = 2xH_n - 2nH_{n-1}$. Conclude that $H_n$ is even if and only if
n is even.
13. Prove that $H_n' = 2xH_n - H_{n+1}$. Conclude that $H_n' = 2nH_{n-1}$.
14. Compute $H_2$ and $H_3$.
15. Show that $\|H_n\|_2^2 = \int_{-\infty}^{\infty} [H_n(x)]^2 e^{-x^2}\,dx = n!\,2^n\sqrt{\pi}$.
16. Prove that the Hermite polynomial $H_n$ satisfies the differential equation
$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0$.
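The three-term recurrence of exercise 5 can be explored with exact rational arithmetic. The sketch below is an illustration only: it assumes the recurrence has the standard form $\phi_{n+1} = (x - a_{n+1})\phi_n - b_{n+1}^2\phi_{n-1}$, takes the particular weight $\omega = 1$ on $(-1, 1)$ (the Legendre case), and uses hypothetical helper names. Polynomials are coefficient lists, lowest degree first.

```python
from fractions import Fraction

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> on (-1, 1) with weight 1; the integral of x^k is 2/(k+1) for even k."""
    return sum((c * Fraction(2, k + 1)
                for k, c in enumerate(multiply(p, q)) if k % 2 == 0), Fraction(0))

def combine(p, c, q):
    """Return p - c*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a - c * b for a, b in zip(p, q)]

def monic_orthogonal(n_max):
    """Build phi_0, ..., phi_{n_max} from the assumed three-term recurrence."""
    phi_prev, phi = [Fraction(0)], [Fraction(1)]   # phi_{-1} = 0, phi_0 = 1
    b_sq = Fraction(0)                             # b_1 = 0
    polys = [phi]
    for _ in range(n_max):
        x_phi = [Fraction(0)] + phi                # multiply phi_n by x
        a = inner(x_phi, phi) / inner(phi, phi)
        nxt = combine(combine(x_phi, a, phi), b_sq, phi_prev)
        b_sq = inner(nxt, nxt) / inner(phi, phi)   # squared norm ratio for the next step
        phi_prev, phi = phi, nxt
        polys.append(phi)
    return polys

polys = monic_orthogonal(4)
print(polys[2])   # coefficients of the monic polynomial of degree 2
```

Checking that `inner(polys[i], polys[j])` vanishes for distinct indices gives a quick sanity test of the recurrence in this special case; it does not replace the proof requested in the exercise.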
5
Essentials of General Topology
Considering that he only had three years to devote to topology, he made his mark
in his chosen field with brilliance and passion. He transformed the subject into
a rich domain of modern mathematics. How much more might there have been,
had he not died so young?1
Crilly and Johnson wrote of Pavel Urysohn
In 1915 Urysohn entered the University of Moscow to study physics. However, his
interest in physics soon took second place, for, after attending lectures by Luzin
and Egoroff, he began to concentrate on mathematics. Urysohn graduated in 1919
and continued working toward his doctorate. In June 1921, he became an assistant
professor at the University of Moscow.
Urysohn soon turned to topology. Egoroff gave him two problems in 1921.
These were difficult problems that had been around for some time. Egoroff was
not to be disappointed. Near the end of August, even before working out the details,
Urysohn had the correct ideas for solving the problems. During the following year,
Urysohn worked through the details, building a whole new area of dimension
theory in topology. It was an exciting time for topologists in Moscow, for Urysohn
lectured on the topology of continua, and often his latest results were presented
in the course shortly after he had proved them. He published a series of short
notes on this topic during 1922. The complete theory was presented in an article
1 T. Crilly and D. Johnson, “The emergence of topological dimension theory,” in I. M. James (ed.),
History of Topology (New York: Elsevier, 1999), 1–24.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0005
that Lebesgue accepted for publication in the Comptes rendus of the Academy of
Sciences in Paris. This gave Urysohn an international platform for his ideas, which
immediately attracted the interest of mathematicians such as Hilbert, Hausdorff,
and Brouwer. In addition to advancing dimension theory, Urysohn is credited for
an important metrization theorem. He is particularly remembered for “Urysohn’s
lemma,” which establishes the existence of a continuous function taking the values
0 and 1 on disjoint closed subsets of a normal space.
In the summer of 1924, Urysohn set off with Alexandroff on a European trip
through Germany, Holland, and France. The two mathematicians visited Hilbert.
After they left, Hilbert wrote to Urysohn, informing him that his paper with
Alexandroff had been accepted for publication in Mathematische Annalen, and
expressing the hope that Urysohn would visit again the following summer. They
then met Hausdorff, who was impressed with Urysohn’s results. He also wrote a
letter to Urysohn, which was dated August 11, 1924. The letter discusses Urysohn’s
metrization theorem and his construction of a universal separable metric space
(one into which any separable metric space can be injected), which was one of
Urysohn’s last results. Like Hilbert, Hausdorff expressed the hope that Urysohn
would visit again the following summer. Van Dalen writes about their final
mathematical visit, which was to Brouwer:2 “This time [Urysohn and Alexandroff]
visited Brouwer, who was most favourably impressed by the two Russians. He
was particularly taken with Urysohn, for whom he developed something like the
attachment to a lost son.”
While the metric topology is often sufficient for most introductory courses in
analysis, a good understanding of the elements of general topology is essential for
any advanced study of analysis. An attempt to define topology in a paragraph is
quite difficult and not likely to be successful, but we offer the following narrative
for the satisfaction of the reader who insists on an overview of the subject. We
saw in chapter 4 that the collection of open sets generated by a metric has many
intrinsic properties independent of the defining metric. In this section, we study
the arrangement of the collection of open sets, or the topology, in a metric-free
context. Every metric space is a topological space; hence all results for topological
spaces (which are meaningful in the metric setting) are also valid for metric spaces,
but not conversely. We often fall back on the metric case to gain insight into both
subjects. We will encounter in this section many of the definitions that appeared
in chapter 4, such as closure, interior, and boundary. We include those definitions
again in this chapter for ease of reference. However, the proofs that duplicate
those in chapter 4 are omitted. The amount of duplication is small and does
not rise to the level of redundancy. We encourage the reader to compare results in
this section to their counterparts in the previous chapter. The exercise is insightful.
Thus 𝒯 is closed under the formation of arbitrary unions and finite intersections.
The members of 𝒯 are called the open subsets of X, and the pair (X, 𝒯) is called a
topological space.
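On a finite set the axioms can be verified mechanically. The following is a minimal sketch, assuming the usual axioms (the empty set and X belong to 𝒯, and 𝒯 is closed under unions and finite intersections); on a finite set it suffices to test pairs. The function name `is_topology` is hypothetical.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the topology axioms for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    if frozenset() not in T or X not in T:
        return False
    if any(not U <= X for U in T):
        return False
    # For a finite family, closure under pairwise unions and intersections
    # implies closure under all unions and finite intersections.
    return all((U | V) in T and (U & V) in T for U, V in combinations(T, 2))

X = {"a", "b", "c"}
print(is_topology(X, [set(), X]))                          # trivial topology
print(is_topology(X, [set(), {"a"}, {"a", "b"}, X]))       # a nested chain of sets
print(is_topology(X, [set(), {"a", "b"}, {"b", "c"}, X]))  # fails: {b} is missing
```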
Example 2. Let X be a nonempty set, and let 𝒯 = {∅, X}. This topology is called
the trivial or indiscrete topology on X.
Example 4. Let X = (0, ∞), and let 𝒯 consist of ∅ and all intervals of the form
(a, ∞), for all a ≥ 0. It is easy to verify that 𝒯 is a topology.
Example 5. The most common topologies are the metric topologies. Thus every
metric space is a topological space in accordance with the following definition: a
subset U of a metric space (X, d) is open if it is the union of open balls. Theorem
4.1.2 says precisely that the collection of open sets thus defined is a topology.
The reader can look up sections 3.6, 3.7, 4.1, and 4.8 for a variety of examples of
metric spaces and hence topological spaces.
Example 6. The most important topological space is ℝⁿ, where the topology is the
metric topology generated by the Euclidean metric (or any equivalent metric).
We will call this the usual topology on ℝⁿ.
The following properties of interiors and closures are straightforward. See the
corresponding results in chapter 4.
Observe that the definition does not require a neighborhood of a point or a set
to be open. If A is open, we specifically refer to it as an open neighborhood of
x (respectively, E). For example, an open set is a neighborhood of each of its points.
The following theorem provides a useful criterion for characterizing the closure of
a set. Compare its statement and proof to those of theorem 4.2.4.
$X - \operatorname{int}(B) = X - [X - \overline{(X - B)}] = \overline{X - B} = \overline{A}$.
The proofs of the following statements strongly resemble their metric counterparts.
See, for example, theorems 4.2.8 and 4.2.9.
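(a) $\operatorname{int}(A) \cap \partial A = \emptyset$,
(b) $\overline{A} = \operatorname{int}(A) \cup \partial A$,
(c) $\overline{A} = A \cup \partial A$, and
(d) A is closed if and only if $\partial A \subseteq A$.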
Subspace Topology
Theorem 5.1.6. Let $A \subseteq Y$, and let $\overline{A}^{Y}$ denote the closure of A in $(Y, \mathcal{T}_Y)$. Then
$\overline{A}^{Y} = \overline{A} \cap Y$.
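The relation between closures in a subspace and closures in the ambient space can be tested on a small finite example. This sketch assumes the closure of A is the intersection of all closed sets containing A and that the subspace topology on Y consists of the sets U ∩ Y with U open in X; the helper names are hypothetical.

```python
from itertools import combinations

def closure(X, T, A):
    """Smallest closed subset of X containing A, for a topology T on X."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for U in T:
        F = X - frozenset(U)          # a closed set
        if A <= F:
            result &= F
    return result

def subspace_topology(T, Y):
    """Traces on Y of the open sets of the ambient space."""
    Y = frozenset(Y)
    return {frozenset(U) & Y for U in T}

X = {1, 2, 3, 4}
T = [set(), {1}, {1, 2}, {1, 2, 3}, X]
Y = {1, 3, 4}
T_Y = subspace_topology(T, Y)

# Compare the closure computed inside Y with the ambient closure cut down to Y.
for r in range(len(Y) + 1):
    for A in combinations(sorted(Y), r):
        assert closure(Y, T_Y, A) == closure(X, T, A) & frozenset(Y)
print(sorted(closure(Y, T_Y, {3})))
```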
Exercises
11. Show that the results of problems 9 and 10 on section 4.6 are valid for a
general topological space.
Some topologies are quite difficult to define directly, and it is frequently the case
that we want to define a topology on a set X that includes a certain collection
𝔖 of subsets of X. The existence of such a topology is obvious because 𝒫(X) is
such a topology. However, 𝒫(X) is useless because it is too large. This immediately
suggests the question of finding the smallest topology 𝒯 on X that contains 𝔖.
Fortunately, such a unique smallest topology 𝒯 exists.
The reader may wonder what situations would compel us to “want” the mem-
bers of 𝔖 to be open. The prime such situation is when we need a certain class
of functions from X to another topological space Y to be continuous, which is
the overarching idea behind the definition of product and weak topologies. See
sections 5.4 and 6.7.
The set 𝔖 in the above discussion is called a subbase for 𝒯, and a closely connected
concept is that of a base for the topology 𝒯, which is our first definition. Bases and
subbases have a wide range of applications. In addition to providing the means
to define useful topologies, bases and subbases give us easy ways to prove the
continuity of functions and to characterize closures. See theorems 5.2.2 and 5.3.1.
See problem 2 at the end of this section for an equivalent, more explicit
formulation of the definition of an open base.
Example 1. The collection 𝔅 = {(r, s) ∶ r, s ∈ ℚ, r < s} is an open base for the usual
topology on ℝ. This is because every open subset of ℝ is the union of open
bounded intervals, and any such interval is the union of members of 𝔅: (a, b) =
∪{(r, s) ∶ r ∈ ℚ, s ∈ ℚ, a < r < s < b}. See section 4.5 for a more general version
of this example.
The collection of open balls in a metric space is an open base for the metric
topology. This follows immediately from the definition of open sets in a metric
space.
Example 2. Let X = {a, b, c}, and let ℭ = {∅, X, {a, b}, {b, c}}. The collection ℭ is
not the base for any topology on X because if it were, that topology would be ℭ
because the union of two members of ℭ is in ℭ. However, ℭ is not a topology,
because {a, b} ∩ {b, c} ∉ ℭ.
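The failure in this example can also be detected by the usual base criterion. The sketch below assumes the standard characterization: a family 𝔅 covering X is a base for some topology exactly when, for all U, V in 𝔅 and every x ∈ U ∩ V, some W in 𝔅 satisfies x ∈ W ⊆ U ∩ V. The function name is hypothetical.

```python
def is_base_for_some_topology(X, B):
    """Test the covering and refinement conditions for a finite family B."""
    X = frozenset(X)
    B = [frozenset(U) for U in B]
    covers = frozenset().union(*B) == X

    def refines(x, U, V):
        # Is there a member W of B with x in W and W inside U ∩ V?
        return any(x in W and W <= (U & V) for W in B)

    return covers and all(refines(x, U, V) for U in B for V in B for x in U & V)

X = {"a", "b", "c"}
C = [set(), X, {"a", "b"}, {"b", "c"}]
print(is_base_for_some_topology(X, C))            # no member fits inside {b}
print(is_base_for_some_topology(X, C + [{"b"}]))  # adding {b} repairs the family
```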
Proof. If 𝔅 is an open base for some topology 𝒯, and x, U, and V are as in the
statement of the theorem, then U ∩ V is a nonempty open set. By the definition
of an open base, there is a member W of 𝔅 such that x ∈ W ⊆ U ∩ V.
The following theorem serves as an early indicator of the importance and typical
uses of open bases.
Theorem 5.2.2. Let X be a topological space, and let 𝔅 be an open base for the
topology on X. If A is a subset of X, and x ∈ X, then $x \in \overline{A}$ if and only if every
basis element containing x intersects A.
Proof. Use theorem 5.1.3 and problem 2 at the end of this section.
Theorem 5.2.3. Let 𝔖 be a collection of subsets of a nonempty set X such that ∪{S ∶
S ∈ 𝔖} = X. Then there exists a unique smallest topology on X that contains 𝔖 as
a subbase.
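For a finite set the topology generated by a subbase can be built by brute force. The sketch below assumes the standard construction (finite intersections of subbase members form a base, and unions of base members form the topology); it is exponential in the size of the family and is meant only for tiny examples. The function name is hypothetical.

```python
from itertools import combinations

def topology_from_subbase(X, S):
    """Generate the smallest topology on the finite set X containing S."""
    X = frozenset(X)
    S = [frozenset(s) for s in S]
    # Step 1: finite intersections of subbase members (the empty intersection is X).
    base = {X}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            base.add(frozenset.intersection(*combo))
    # Step 2: unions of base members (the empty union is the empty set).
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            topology.add(frozenset().union(*combo))
    return topology

T = topology_from_subbase({"a", "b", "c"}, [{"a", "b"}, {"b", "c"}])
for U in sorted(T, key=lambda U: (len(U), sorted(U))):
    print(sorted(U))
```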
Exercises
1. Show that the collection of open boxes $\{\prod_{i=1}^{n} (a_i, b_i) : a_i, b_i \in \mathbb{Q}\}$ is an open
base for the usual topology on $\mathbb{R}^n$.
2. Prove that a collection 𝔅 of open subsets of a topological space X is an
open base if and only if, for every open set U and every x ∈ U, there exists a
member B of 𝔅 such that x ∈ B ⊆ U.
3. (a) Prove that the usual topology on ℝ is weaker than the lower limit
topology.
(b) Prove that each of the following intervals is both open and closed in
the lower limit topology: [a, b), (−∞, a), and [a, ∞). Conclude that the
usual topology is strictly weaker than the lower limit topology.
4. Let 𝔅1 and 𝔅2 be bases for the topologies 𝒯1 and 𝒯2 on the same set X. Show
that if, for every B ∈ 𝔅1 and every x ∈ B, there exists an element B′ ∈ 𝔅2
such that x ∈ B′ ⊆ B, then 𝒯1 ⊆ 𝒯2 .
5. Let 𝔅 be an open base for a topology 𝒯 on a set X. Prove that 𝒯 is the
intersection of all the topologies on X that contain 𝔅.
6. What topology on ℝ is generated by the open subbase {(−∞, a) ∶ a ∈ ℝ}?
7. Let {𝒯𝛼 }𝛼 be a collection of topologies on a set X. Prove that there is a unique
smallest topology 𝒯 that contains ∪𝛼 𝒯𝛼 .
5.3 Continuity
Theorem 5.3.1. Using the notation of the above definition, the following are
equivalent:
(a) f is continuous.
(b) The inverse image of a closed subset of Y is a closed subset of X.
(c) If 𝔅 is an open base for 𝒯Y, then f⁻¹(B) is open in X for every B ∈ 𝔅.
(d) If 𝔖 is an open subbase for 𝒯Y, then, for every S ∈ 𝔖, f⁻¹(S) is open in X.
Proof. Parts (a) and (b) are equivalent because of the identity f⁻¹(F) = X − f⁻¹(Y − F)
and the fact that a subset F of Y is closed if and only if Y − F is open.
Clearly, (a) implies (c), and (c) implies (d). Now (d) implies (c) by virtue of the
identity f⁻¹(S₁ ∩ . . . ∩ Sₙ) = f⁻¹(S₁) ∩ . . . ∩ f⁻¹(Sₙ), and (c) implies (a) because of
the identity f⁻¹(∪𝛼 B𝛼) = ∪𝛼 f⁻¹(B𝛼).
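The practical content of the equivalence is that continuity need only be tested on a subbase. The following sketch illustrates this on finite spaces, with functions stored as dictionaries; the spaces, the maps f and g, and the helper names are all hypothetical choices made for the illustration.

```python
def preimage(f, V):
    """Inverse image of V under a function given as a dict."""
    return frozenset(x for x in f if f[x] in V)

def preimages_open(f, T_X, family):
    """True if the inverse image of every member of `family` is open in X."""
    return all(preimage(f, V) in T_X for V in family)

# X = {1, 2, 3} with a nested topology.
T_X = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})}
# Y = {a, b, c}; S is a subbase and T_Y is the topology it generates.
S = [frozenset({"a", "b"}), frozenset({"b", "c"})]
T_Y = {frozenset(), frozenset({"b"}), frozenset({"a", "b"}),
       frozenset({"b", "c"}), frozenset({"a", "b", "c"})}

f = {1: "b", 2: "a", 3: "a"}
g = {1: "a", 2: "b", 3: "c"}
for name, h in (("f", f), ("g", g)):
    # The test on the subbase agrees with the test on the whole topology.
    print(name, preimages_open(h, T_X, S), preimages_open(h, T_X, T_Y))
```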
The following example follows directly from example 1 and the previous theorem.
Proof. Since ℬ𝒞(X) is a subspace of ℬ(X), it suffices to show that ℬ𝒞(X) is closed
in ℬ(X). Let f ∈ ℬ(X) be a closure point of ℬ𝒞(X). We need to show that f is
continuous at each point x0 ∈ X. For 𝜖 > 0, there exists a function g ∈ ℬ𝒞(X)
such that ‖ f − g‖∞ < 𝜖/3. By the continuity of g at x0 , there exists an open
neighborhood U of x0 such that, for every x ∈ U, |g (x) − g (x0 )| < 𝜖/3. Now if
x ∈ U, then | f (x) − f (x0 )| ≤ | f (x) − g (x)| + |g (x) − g (x0 )| + |g (x0 ) − f (x0 )| < 𝜖.
Homeomorphisms
Definition. We say that two topological spaces (X, 𝒯X ) and (Y, 𝒯Y ) are homeo-
morphic if there exists a bijection 𝜑 ∶ X → Y such that both 𝜑 and 𝜑−1 are
continuous. We call such a function 𝜑 bicontinuous, or a homeomorphism.
Intuitively speaking, two topological spaces are homeomorphic if they have iden-
tical arrangements of open sets.
Example 3. Any two open bounded intervals are homeomorphic. The linear
function that maps (0, 1) onto (a, b) is clearly bicontinuous.
3 Lower semicontinuous functions played a significant role in the early development of measure
theory. Upper and lower semicontinuous functions facilitate a succinct proof of Urysohn’s lemma
(theorem 5.11.2).
Proof. (a) If f is both upper and lower semicontinuous, then, for all real numbers
a and b, $f^{-1}(a, \infty)$ and $f^{-1}(-\infty, b)$ are open. Since intervals of the type
$(a, \infty)$ and $(-\infty, b)$ form an open subbase for the usual topology on ℝ, f is
continuous. The converse is trivial.
(b) If A ⊆ X is open, then, for a ∈ ℝ, $\chi_A^{-1}(a, \infty)$ is open because
\[ \chi_A^{-1}(a, \infty) = \begin{cases} \emptyset & \text{if } a \ge 1, \\ A & \text{if } 0 \le a < 1, \\ X & \text{if } a < 0. \end{cases} \]
Exercises
if each f |Ai is continuous, then f is continuous. This result is also true when
each of the sets Ai is open.
5. Let f and g be continuous real-valued functions on a topological space X.
Prove that
(a) f ± g, fg and | f | are continuous,
(b) the set {x ∈ X ∶ f (x) ≤ g (x)} is closed, and
(c) the functions h = min{ f, g} and k = max{ f, g} are continuous.
6. Prove that the following subspaces of the Euclidean plane are homeomor-
phic:
(a) the punctured plane $\{(x, y) : x^2 + y^2 > 0\}$
(b) the open annulus $\{(x, y) : 1 < x^2 + y^2 < 4\}$
7. Prove that a discrete topological space (X, 𝒯) is homeomorphic to a sub-
space of ℝ if and only if X is countable.
8. Let a, b, c, and d be real numbers such that
\[ \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} > 0. \]
Show that the function $f(z) = \frac{az+b}{cz+d}$ is a homeomorphism of the open upper
half of the complex plane.
9. Let $X = \mathbb{R}^n - \{0\}$. Prove that the function $f(x) = \frac{x}{\|x\|_2^2}$ is continuous on X.
10. (a) Let f be a real function on a topological space X. Prove that f is lower
semicontinuous if and only if −f is upper semicontinuous.
(b) Prove that a subset A of X is open if 𝜒A is lower semicontinuous.
(c) Prove that a subset B of X is closed if 𝜒B is upper semicontinuous.
11. Definition. A sequence (xn ) in a topological space X is said to converge to
x ∈ X if every neighborhood of x contains all but finitely many terms of
(xn ). See problem 9 on section 4.1.
Let f be a function from a topological space X to a topological space Y.
Show that if f is continuous at x0 ∈ X, then it is sequentially continuous at
x0 (see theorem 4.3.2). Also give an example to show that the converse is
false.
In section 4.4, we defined the product of finitely many metric spaces. In this
section, we develop a construction that generalizes the concept to the case of
topological spaces. Thus we define a topology on the Cartesian product of a finite
number of topological spaces. Needless to say, the product topology should agree
with and extend the definition of the product metric.
Let $(X_1, \mathcal{T}_1), \dots, (X_n, \mathcal{T}_n)$ be topological spaces, and consider the Cartesian product
$X = \prod_{i=1}^{n} X_i$ of the underlying sets. Consider the following collection of subsets
of X:
\[ \mathfrak{S} = \bigcup_{i=1}^{n} \{X_1 \times \dots \times U_i \times \dots \times X_n : U_i \in \mathcal{T}_i\}. \]
Since ∪{S ∶ S ∈ 𝔖} = X, theorem 5.2.3 applies; hence the following definition is
meaningful:
𝔅 = {U1 × . . . × Un ∶ Ui ∈ 𝒯i }.
The set 𝔖 is referred to as the defining subbase for the product topology and
the set 𝔅 is called the defining base for the product topology.
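The relation between the defining subbase and the defining base can be seen concretely for two small finite spaces. In the sketch below the two spaces and the helper `box` are hypothetical; it checks that every member of the defining base is the intersection of two members of the defining subbase.

```python
from itertools import product

def box(U, V):
    """The product set U x V as a frozenset of pairs."""
    return frozenset(product(U, V))

X1, X2 = frozenset({0, 1}), frozenset({"a", "b"})
T1 = [frozenset(), frozenset({0}), X1]        # a topology on X1
T2 = [frozenset(), frozenset({"a"}), X2]      # a topology on X2

# Defining subbase: one factor is restricted to an open set, the other is whole.
subbase = {box(U, X2) for U in T1} | {box(X1, V) for V in T2}
# Defining base: products of open sets.
base = {box(U, V) for U in T1 for V in T2}

# Each basic box is the intersection of two subbasic "slabs".
assert all(box(U, V) == box(U, X2) & box(X1, V) for U in T1 for V in T2)
# Every subbase member is itself a basic box.
assert subbase <= base
print(len(subbase), len(base))
```

The slab `box(U, X2)` is exactly the inverse image of U under the first projection, which is why the projections are continuous for the product topology.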
Theorem 5.4.1. The product topology is the weakest topology relative to which all
the projections 𝜋i ∶ X → Xi are continuous.
Comparing the above result with problems 2 and 5 on section 4.4 should convince
the reader that the product topology defined in this section is indeed the correct
generalization of the product metric.
Theorem 5.4.2. If $F_i$ is closed in $X_i$, then $\prod_{i=1}^{n} F_i$ is closed in $\prod_{i=1}^{n} X_i$.
\[ \prod_{i=1}^{n} \mathfrak{B}_i = \{V_1 \times \dots \times V_n : V_i \in \mathfrak{B}_i\} \]
is an open base for the product topology on $\prod_{i=1}^{n} X_i$.
Proof. Let W be open in $\prod_{i=1}^{n} X_i$, and let $x = (x_1, \dots, x_n) \in W$; W is the union of
sets of the type $U_1 \times \dots \times U_n$, where $U_i \in \mathcal{T}_i$. Therefore, for a set of that type,
$x \in U_1 \times \dots \times U_n \subseteq W$. For each $x_i$, choose a member $V_i \in \mathfrak{B}_i$ such that $x_i \in V_i \subseteq U_i$.
Clearly, $x \in V_1 \times \dots \times V_n \subseteq U_1 \times \dots \times U_n \subseteq W$.
Exercises
1. Let X and Y be topological spaces, and let x be a fixed element of X. Prove that
Y is homeomorphic to {x} × Y. The latter set is given the restricted topology
induced by the product topology on X × Y.
2. Prove that X1 × . . . × Xn is homeomorphic to X1 × (X2 × . . . × Xn ).
3. Let X and Y be topological spaces, and let A ⊆ X and B ⊆ Y. Prove that
$\overline{A \times B} = \overline{A} \times \overline{B}$.
7. Prove that if $A_i$ is dense in $X_i$ for $1 \le i \le n$, then $\prod_{i=1}^{n} A_i$ is dense in the
product topology.
8. Let X be an infinite set, and let 𝒯 be the co-finite topology on X. Prove that
the product topology on X × X is not the co-finite topology.
Proof. Here X is given the relative topology induced by the usual topology on ℝ.
If X is not an interval, then there exist two real numbers x and y in X and a
real number z ∈ ℝ − X such that x < z < y. The two sets P = X ∩ (−∞, z) and
Q = X ∩ (z, ∞) form a disconnection of X.
The following result follows directly from the last two theorems.
Example 3. (a) Let f ∶ [a, b] → ℝ be a continuous function and, say, f (a) < f (b).
If k is between f (a) and f (b), then there exists a point x ∈ (a, b) such that
f (x) = k.
(b) Let f ∶ [a, b] → [a, b] be continuous; then f has a fixed point in [a, b].
Since the range, R, of f is connected, it is an interval. In particular, R
contains the interval [f (a), f (b)]. This proves (a). We now prove (b).
If f (a) = a or f (b) = b, there is nothing to prove, so assume that f (a) >
a and f (b) < b. Define a function h on [a, b] by h(x) = x − f (x). Then
h(a) < 0 < h(b). By (a), there is a point x ∈ [a, b] such that h(x) = 0, that is,
f (x) = x.
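The argument in part (b) is constructive enough to suggest a numerical method: bisection applied to h(x) = x − f(x). The following is a minimal sketch under the assumption that f is continuous and maps [a, b] into itself; the function name and the tolerance are hypothetical choices.

```python
import math

def fixed_point(f, a, b, tol=1e-12):
    """Locate a fixed point of f on [a, b] by bisecting h(x) = x - f(x)."""
    h = lambda x: x - f(x)
    lo, hi = a, b
    if h(lo) == 0:
        return lo
    if h(hi) == 0:
        return hi
    # Since f maps [a, b] into itself, h(a) < 0 < h(b) at this point.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) <= 0:
            lo = mid      # keep h(lo) <= 0
        else:
            hi = mid      # keep h(hi) > 0
    return (lo + hi) / 2

# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x = fixed_point(math.cos, 0.0, 1.0)
print(x, math.cos(x))
```

Bisection only uses the sign change guaranteed by part (a), so it finds some fixed point but says nothing about uniqueness.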
Proof. Let (x0 , y0 ) and (x1 , y1 ) be arbitrary but fixed elements in X × Y. Suppose 𝜑 ∶
X × Y → {0, 1} is continuous. The function i ∶ Y → {x0 } × Y given by i(y) = (x0 , y)
is continuous; hence 𝜑 ∘ i is continuous and hence constant because Y is connected.
Thus 𝜑(x0 , y0 ) = 𝜑(x0 , y1 ). Likewise, the function x ↦ 𝜑(x, y1 ) is constant, so
𝜑(x0 , y1 ) = 𝜑(x1 , y1 ). Thus 𝜑(x1 , y1 ) = 𝜑(x0 , y0 ), and 𝜑 is constant. This proves
that X × Y is connected.
Proof. Use induction, the previous theorem and the fact that ℝⁿ is homeomorphic to
ℝ × ℝⁿ⁻¹.
Definition. Let X be a topological space, and let x, y ∈ X. We say that x and y
are connected if there is a connected subset of X that contains both x and
y. Define a relation ≡ on X by x ≡ y if x and y are connected. It is clear that ≡ is
an equivalence relation.
Theorem 5.5.9. The equivalence classes of the relation ≡ in the above definition are
connected sets.
Proof. Let C be one of the equivalence classes and fix an element a ∈ C. For every
x ∈ C, there exists a connected subset $A_x$ of X containing a and x. All the
elements of $A_x$ are related; hence $A_x \subseteq C$. Since $C = \bigcup_{x \in C} A_x$, and $a \in \bigcap_{x \in C} A_x$,
C is connected by theorem 5.5.7.
Definition. The equivalence classes of the relation ≡ are called the connected
components of X.
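For a finite topological space the components can be computed directly from the definition of the relation ≡. The sketch below is a brute-force illustration that enumerates all nonempty subsets, so it is only suitable for very small spaces; the example topology and the helper names are hypothetical.

```python
from itertools import combinations

def is_connected(T, A):
    """A is connected if its subspace topology has no proper nonempty clopen set."""
    A = frozenset(A)
    T_A = {frozenset(U) & A for U in T}
    return not any(P and P != A and (A - P) in T_A for P in T_A)

def components(X, T):
    """Equivalence classes of: x ~ y if some connected subset contains both."""
    X = frozenset(X)
    subsets = [frozenset(c) for r in range(1, len(X) + 1)
               for c in combinations(sorted(X), r)]
    connected = [A for A in subsets if is_connected(T, A)]
    return {frozenset(y for y in X
                      if any(x in A and y in A for A in connected))
            for x in X}

X = {1, 2, 3, 4}
T = [set(), {1}, {1, 2}, {3, 4}, {1, 3, 4}, X]
for C in sorted(components(X, T), key=sorted):
    print(sorted(C))
```

In this example the sets {1, 2} and {3, 4} are both open and closed, and each carries a connected subspace topology, so they are the two components.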
Proof. The last assertion of the theorem is the only one we still need to prove. Let P
be a proper nonempty subset of X that is both open and closed, and let Q = X − P.
Then ∅ ≠ Q ≠ X, and Q is also open and closed. We show that if C is a connected
component of X, and C ∩ P ≠ ∅, then C ⊆ P. The sets C ∩ P and C ∩ Q are both
open and closed in C. Since C is connected, and C ∩ P ≠ ∅, C ∩ Q = ∅, because
otherwise the pair (C ∩ P, C ∩ Q) would form a disconnection of C. This proves
that C ⊆ P.
We conclude this section with a brief excursion into path connected spaces.
is a path from x to y, we say that x and y are path connected. A topological space
X is path connected if every pair of points in X are path connected.
Exercises
7. Prove that ℚ (with the usual topology) is totally disconnected. This result
shows that the connected components of a topological space need not be
open.
8. Prove that a topological space X is totally disconnected if, for every pair of
distinct points x and y, there is a disconnection (P, Q) of X such that x ∈ P
and y ∈ Q.
9. Prove that if a Hausdorff space X has an open base whose members are also
closed, then X is totally disconnected. The definition of a Hausdorff space
appears in the next section.
10. Prove that the Sorgenfrey line is totally disconnected.
11. Prove that the product of two totally disconnected spaces is totally disconnected.
12. Prove that the continuous image of a path connected space is path con-
nected.
13. Prove that, for n ≥ 2, the set $\{x \in \mathbb{R}^n : \|x\|_2 > 1\}$ is path connected.
Metric spaces enjoy strong separation properties, which we often take for granted.
For example, two distinct points in a metric space have disjoint open neigh-
borhoods. In chapter 4, we called this property the Hausdorff property. There
is no reason to expect that the same property should hold true for an arbitrary
topological space, so this property must be axiomatized. Similarly, theorem 4.2.13
shows that disjoint closed subsets of a metric space possess disjoint open neigh-
borhoods. In the general topological setting, this property is known as normality.
One important problem in topology is that of the metrizability of a topological
space. Explicitly stated, under what set of conditions is a given topology induced
by a metric? The fact that every metric space is normal imposes an immediate
necessary condition on a topology to be metrizable: such a topology must be
normal. Of course, normality is not a sufficient condition for a space to be
It is safe to say that all important topological spaces are Hausdorff. Weaker
separation axioms, such as T1 , are used mostly to generate exercises and
counterexamples.
Theorem 4.1.4 states that a metric space is Hausdorff, which supports the
statement in the above paragraph since metric spaces are the most important
(but not the only important) examples of topological spaces.
Proof. We show that the set W = X − {x} is open. For every y ∈ W, there exist open
neighborhoods Uy and Vy of x and y, respectively, such that Uy ∩ Vy = ∅. This
clearly implies that Vy ⊆ W for all y ∈ W. Consequently, W = ∪{Vy ∶ y ∈ W},
which is open.
Theorem 4.1.5 says that the limit of a convergent sequence in a metric space is
unique. This is precisely because metric spaces are Hausdorff spaces.
Definition. A Hausdorff space X is said to be regular if, for every x ∈ X and every
closed subset F that does not contain x, there exist open sets U and V such that
x ∈ U, F ⊆ V, and U ∩ V = ∅.
Theorem 5.6.2. A Hausdorff space is regular if and only if for every x ∈ X and every
open neighborhood U of x, there exists an open neighborhood V of x such that
$\overline{V} \subseteq U$.
Proof. Suppose X is regular, and let x and U be as in the statement of the theorem. By
regularity, applied to x and the closed set X − U, there exist open neighborhoods
V of x and W of X − U such that V ∩ W = ∅. Because V ⊆ X − W and the latter
set is closed, $\overline{V} \subseteq X - W$. In particular, $\overline{V} \subseteq U$.
Conversely, let F be a closed subset of X that does not contain x. By assumption,
there exists an open neighborhood U of x such that $\overline{U} \subseteq X - F$. Set $V = X - \overline{U}$.
The sets U and V are disjoint open neighborhoods of x and F, respectively,
as desired.
Definition. A Hausdorff space X is said to be normal if, for every pair of disjoint
closed subsets E and F of X, there exist open sets U and V such that E ⊆ U,
F ⊆ V, and U ∩ V = ∅.
The proof of theorem 5.6.3 mimics that of theorem 5.6.2 and is therefore omitted.
Theorem 5.6.3. A Hausdorff space X is normal if and only if for every closed set E
and every open neighborhood U of E, there exists an open neighborhood V of E
such that $\overline{V} \subseteq U$.
Products and subspaces of normal and regular spaces have dissimilar properties.
For example, the product of regular spaces is regular, but the same result does not
hold for the product of normal spaces. Likewise, an arbitrary subspace of a normal
space need not be normal. See the exercises on section 5.7. However, the following
special case is easy to prove.
Exercises
11. Let X be a regular space. Prove that every pair of distinct points in X have
neighborhoods whose closures are disjoint.
12. Let X be a normal space. Prove that every pair of disjoint closed subsets of
X have neighborhoods whose closures are disjoint.
In this section, we study second countable, separable, and Lindelöf spaces. Theo-
rem 4.5.1 states that all three conditions are equivalent for metric spaces. This is
not true for general topological spaces, and several counterexamples are provided
in this section and the section exercises to show the nonequivalence of the three
conditions. However, second countability implies the other two conditions. Sec-
ond countability has other pleasant consequences, especially when it is combined
with normality or local compactness. The definitions in this section are identical
to those in the metric case and are included below for ease of reference.
Example 2. In problem 10, we ask the reader to show that the Sorgenfrey line
$\mathbb{R}_l$ is Lindelöf. We show here that $\mathbb{R}_l^2$ is not Lindelöf. Thus the product of two
Lindelöf spaces is not necessarily Lindelöf. Let L be as in example 1. The line L is
closed in $\mathbb{R}_l^2$. Consider the open cover 𝒰 of $\mathbb{R}_l^2$ that consists of $\mathbb{R}_l^2 - L$ and the
collection $\{[x, x + 1) \times [-x, -x + 1) : x \in \mathbb{R}\}$. Each point $(x, -x)$ of L belongs to
exactly one member of 𝒰, so no countable subset of 𝒰 can cover $\mathbb{R}_l^2$.
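The reason no countable subfamily suffices is that each point (t, −t) of the antidiagonal lies in exactly one of the boxes [x, x + 1) × [−x, −x + 1), namely the one with x = t. The sketch below checks this membership condition on a finite sample of indices; it assumes L is the line y = −x, and the function names are hypothetical.

```python
def in_box(point, x):
    """Membership of `point` in the box [x, x+1) x [-x, -x+1)."""
    s, t = point
    return x <= s < x + 1 and -x <= t < -x + 1

def boxes_containing(t, candidates):
    """Indices x among `candidates` whose box contains the point (t, -t)."""
    return [x for x in candidates if in_box((t, -t), x)]

# Sample indices with spacing 1/8, which are exactly representable as floats.
candidates = [k / 8 for k in range(-16, 17)]
for t in (-1.0, 0.25, 0.5, 1.5):
    # x <= t forces the first coordinate, and -x <= -t forces x >= t.
    print(t, boxes_containing(t, candidates))
```

Since uncountably many points of L each require their own box, any subcover must keep uncountably many members of the cover.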
Proof. Let {Bn } be a countable open base for the topology on X. For each n ∈ ℕ,
choose a point an ∈ Bn , and let A = {an ∶ n ∈ ℕ}. If U ≠ ∅ is open in X, then U
It was observed in section 5.6 that normality is a necessary condition for the
metrizability of a topological space. For second countable spaces, the normality
requirement can be relaxed, as the following theorem shows. As it turns out,
regular second countable spaces are metrizable. See the Urysohn metrization
theorem in section 5.11.
Proof. Let E and F be disjoint closed subsets of X, and let 𝔅 be a countable open base
for the topology on X. For every x ∈ E, x belongs to the open set X − F. By theorem
5.6.2, there exists an open neighborhood W of x such that $\overline{W} \subseteq X - F$. Choose a
basis element $B_x$ such that $x \in B_x \subseteq W$. Clearly, $E \subseteq \bigcup_{x \in E} B_x$. Since 𝔅 is countable,
the collection $\{B_x\}_{x \in E}$ can be enumerated as $\{U_n\}$. Observe that $\overline{U_n} \subseteq X - F$. A
similar argument produces a countable open cover $\{V_n\}$ of F such that $V_n \in \mathfrak{B}$,
and $\overline{V_n} \subseteq X - E$.
Define $U'_n = U_n - \bigcup_{i=1}^{n} \overline{V_i}$, and $V'_n = V_n - \bigcup_{i=1}^{n} \overline{U_i}$. Notice that if n ≤ m, then
$U_n \cap V'_m = \emptyset$. By symmetry, if m ≤ n, then $V'_m \cap U'_n = \emptyset$. It follows that, for
Exercises
1. Prove that the product of two second countable spaces is second countable
and that the product of two separable spaces is separable.
3. Prove that every second countable space is first countable and that every
metric space is first countable.
4. Show that a subspace of a second (respectively, first) countable space is second
(respectively, first) countable. Also show that the product of two first
countable spaces is first countable.
5. Show that a subspace of a separable space need not be separable. Hint: See
problem 3 on section 5.1. For a more elaborate example, see problem 12
below.
6. Let X be an uncountable set, and let 𝒯 be the co-finite topology on X.
Show that every infinite subset of X is dense, and hence X is separable.
Show, however, that X is not second countable. Hint: If $\{B_n\}$ is a countable
collection of nonempty open subsets of X, then $\bigcap_{n=1}^{\infty} B_n$ is uncountable. Pick a point
$x \in \bigcap_{n=1}^{\infty} B_n$, and consider the open set $U = X - \{x\}$.
7. Show that a closed subspace of a Lindelöf space is Lindelöf.
8. Let X be a topological space, and let 𝔅 be an open base for X. Prove that
X is Lindelöf if and only if every open cover of X by members of 𝔅 has a
countable subcover.
9. Show that the Sorgenfrey line is first countable, separable, but not second
countable. Hint: To show that $\mathbb{R}_l$ is not second countable, let 𝔅 be an open
base for $\mathbb{R}_l$. For every x ∈ ℝ, there is a member $B_x \in \mathfrak{B}$ such that
$x \in B_x \subseteq [x, x + 1)$.
10. Prove that the Sorgenfrey line ℝl is Lindelöf. Together with the previous
problem, this problem shows that not every Lindelöf space is second
countable. Hint: Use problem 8. Let {[a𝛼 , b𝛼 ) ∶ 𝛼 ∈ I} be an open cover of
ℝl by basic open subsets of ℝl . Define C = ∪𝛼∈I (a𝛼 , b𝛼 ). View C as a subset
of ℝ with the usual topology; C is Lindelöf because ℝ is a separable metric space. Thus
there exists a countable subset {𝛼n ∶ n ∈ ℕ} of I such that $C = \bigcup_{n=1}^{\infty} (a_{\alpha_n}, b_{\alpha_n})$.
Argue that ℝ − C is countable.
The converse of the above example is false, but counterexamples are rather difficult.
The next result follows immediately from theorem 5.3.3 and the fact that for a
compact space X, 𝒞(X) = ℬ𝒞(X).
Theorem 5.8.6. Let X be a compact Hausdorff space, and let 𝒞(X) be the space
of continuous functions on X. Then (𝒞(X), ‖.‖∞ ) is a complete normed linear
space.
Theorem 5.8.7. Let X be a compact space, and let Y be a Hausdorff space. Then a
continuous bijection 𝜑 ∶ X → Y is a homeomorphism from X to Y.
Proof. We prove that 𝜑−1 is continuous by showing that 𝜑 is a closed mapping. Let
F be a closed subset of X. By theorem 5.8.2, F is compact. By theorem 5.8.4, 𝜑(F)
is compact in Y. Now theorem 5.8.3 implies that 𝜑(F) is closed, as desired.
The theorem says that when we limit our attention to compact Hausdorff spaces,
a bijection 𝜑 ∶ X → Y is a homeomorphism if and only if it is simply continuous.
In this situation, we can show that X and Y are homeomorphic by merely showing
the continuity of 𝜑 or 𝜑−1 or by showing that 𝜑 (or 𝜑−1 ) is an open (or a closed)
mapping.
Proof. For every y ∈ F, there exist disjoint open sets Uy and Vy such that x ∈ Uy
and y ∈ Vy . Now F ⊆ ∪y∈F Vy . Since F is compact, F ⊆ ∪ni=1 Vyi for a finite sub-
set {y1 , … , yn } of F. The sets U = ∩ni=1 Uyi and V = ∪ni=1 Vyi have the desired
properties.
Theorem 5.8.10. A compact Hausdorff space is normal. Thus if E and F are disjoint
closed subsets of X, then there exist disjoint open subsets U and V such that E ⊆ U
and F ⊆ V.
Proof. First observe that E and F are compact by theorem 5.8.2. Let x ∈ E. By the
previous theorem, there are disjoint open sets Ux and Vx such that x ∈ Ux and
F ⊆ Vx . Since E ⊆ ∪x∈E Ux , and E is compact, E ⊆ ∪ni=1 Uxi for some finite subset
{x1 , … , xn } of E. Set U = ∪ni=1 Uxi and V = ∩ni=1 Vxi . The sets U and V have the
stated properties.
Lemma 5.8.11 (the tube lemma). Let X be a topological space, and let Y be a
compact space. If an open subset W in X × Y contains a line, {x} × Y, then there
exists a neighborhood U of x such that U × Y ⊆ W. Here x is a fixed element of X.
Proof. For every y ∈ Y, there are open sets Uy ⊆ X and Vy ⊆ Y such that (x, y) ∈
Uy × Vy ⊆ W. Thus {x} × Y ⊆ ∪y∈Y (Uy × Vy ) ⊆ W. The compactness of {x} × Y
yields a finite subset {y1 , … , yn } such that $\{x\} \times Y \subseteq \bigcup_{i=1}^{n} (U_{y_i} \times V_{y_i}) \subseteq W$. Define
$U = \bigcap_{i=1}^{n} U_{y_i}$. We claim that U × Y ⊆ W. If u ∈ U, and y ∈ Y, then y ∈ Vyi for
some 1 ≤ i ≤ n. But u belongs to Uyi for every 1 ≤ i ≤ n. Therefore (u, y) ∈ Uyi ×
Vyi ⊆ W.
The above lemma says that if an open subset of X × Y contains a line, then it
must contain a strip (or a tube, hence the name) that contains the line. Intuitively,
an open subset of X × Y cannot get arbitrarily thin around a line. The following
example illustrates the concept.
Example 5. The open subset $W = \{(x, y) \in \mathbb{R}^2 : x \in \mathbb{R},\ |y| < \frac{1}{1+x^2}\}$ contains the
x-axis, but there is no positive number 𝛿 such that ℝ × (−𝛿, 𝛿) is contained
in W.
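The failure of a uniform strip in example 5 can be checked numerically. The following is a minimal sketch, not part of the text: the helper names `in_W` and `witness_outside_W` are hypothetical. For a given 𝛿 > 0, it produces a point of ℝ × (−𝛿, 𝛿) that lies outside W, which is why no strip around the x-axis fits inside W (here the factor ℝ playing the role of Y in the tube lemma is not compact).

```python
import math


def in_W(x: float, y: float) -> bool:
    """Membership test for W = {(x, y) : |y| < 1 / (1 + x^2)}."""
    return abs(y) < 1.0 / (1.0 + x * x)


def witness_outside_W(delta: float) -> tuple:
    """Return a point (x, y) with |y| < delta that is NOT in W.

    We take y = delta / 2 and then move x far enough from the origin
    that 1 / (1 + x^2) drops to y or below.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    y = delta / 2.0
    # Any x with x^2 >= 1/y - 1 works; adding 1 gives a safe margin.
    x = math.sqrt(max(1.0 / y - 1.0, 0.0)) + 1.0
    return (x, y)


if __name__ == "__main__":
    for delta in (1.0, 0.1, 1e-6):
        x, y = witness_outside_W(delta)
        print(delta, (x, y), in_W(x, y))
```

The smaller 𝛿 is, the farther out along the x-axis the witness has to be, which matches the picture of W getting arbitrarily thin.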
Proof. Use induction, the previous theorem, and the fact that X1 × . . . × Xn is
homeomorphic to X1 × (X2 × . . . × Xn ).
Example 6. A topological space X is countably compact if and only if, for every
descending sequence F1 ⊇ F2 ⊇ . . . of nonempty closed sets, $\bigcap_{n=1}^{\infty} F_n \neq \emptyset$.
Exercises
another journey into locally compact spaces in section 5.11, where we establish
Urysohn’s theorem for locally compact Hausdorff spaces and introduce the space
of continuous, compactly supported functions on such spaces.
This section is the transitional section to the remaining three sections in this
chapter. It may be bypassed on the first reading of the book because locally compact
metric spaces (section 4.7) are sufficient for most of the rest of the book. Locally
compact Hausdorff spaces are needed only in sections 8.4 and 8.7, where frequent
reference is made to the results in this section and sections 5.10 and 5.11, and where
certain theorems are extended from ℝn to locally compact Hausdorff spaces.
We established in section 4.7 that ℝn is locally compact and that l∞ is not. See
theorem 6.1.5 for a far-reaching result. Also in section 4.7, we showed that ℚ is
not locally compact.
Theorem 5.9.1. Let X be a Hausdorff space. Then X is locally compact if and only
if, for every x ∈ X and every open neighborhood U of x, there exists an open
neighborhood V of x such that $\overline{V}$ is compact and $\overline{V} \subseteq U$.
Proof. Suppose X is locally compact, and let x and U be as in the statement of the
theorem. Let K be a compact subset of X that contains x in its interior, and
let F = K − U. As F is a closed subset of the compact subset K, it is compact.
Invoking theorem 5.8.9 yields disjoint open sets W1 and W2 such that x ∈ W1 and
F ⊆ W2 . Define V = W1 ∩ int(K). Since K is compact and $\overline{V} \subseteq K$, $\overline{V}$ is compact.
Finally, since V ⊆ X − W2 , and the latter set is closed, $\overline{V} \subseteq X - W_2 \subseteq X - F$. Thus
$\overline{V} \subseteq K \cap (X - F) = K - F \subseteq U$. The proof of the converse is trivial.
Theorem 5.9.2. Let X be a locally compact Hausdorff space, and let U be an open
neighborhood of a compact subset K of X. Then there exists an open neighborhood
V of K such that $\overline{V}$ is compact and $\overline{V} \subseteq U$.
Theorem 5.9.4. Let X be a second countable locally compact Hausdorff space. Then
X is a countable union of compact subsets of X.
Proof. Let 𝔅 be a countable open base for X. For every x ∈ X, there is an open set
Vx such that x ∈ Vx and $\overline{V_x}$ is compact. Let Bx ∈ 𝔅 be such that x ∈ Bx ⊆ Vx .
Clearly, $\overline{B_x} \subseteq \overline{V_x}$; thus $\overline{B_x}$ is compact. Now $X = \bigcup_{x \in X} \overline{B_x}$. Since 𝔅 is countable, only
countably many of the sets Bx can be distinct, showing that X is a countable union
of compact subsets of X.
For example, ℝn is 𝜎-compact. More generally, the above theorem states that a
second countable locally compact Hausdorff space is 𝜎-compact.
We will use the following result in the next section to prove a simple character-
ization of locally compact Hausdorff spaces. The proof is left as an exercise.
Exercises
5.10 Compactification
In this section, we show that a locally compact Hausdorff space (X, 𝒯) can be
embedded in a compact Hausdorff space (X∞ , 𝒯∞ ) in the manner described in
theorem 5.10.1. In that theorem, the definition of the topology 𝒯∞ requires some
explanation.
(a) The open subsets of 𝒮2 that do not contain N: These are in one-to-one cor-
respondence (through the stereographic projection) with the open subsets
of the usual topology of ℝ2 .
(b) The open subsets U of 𝒮2 that contain the point N: The complement
K = 𝒮2 − U of such an open set is closed in 𝒮2 . Since 𝒮2 is compact, K is
compact. Thus the open sets U of this type are exactly the complements
of compact subsets of the punctured sphere, which are in one-to-one
correspondence with the compact subsets of ℝ2 .
The above discussion suggests that a compact topology containing the usual
topology on ℝ² can be constructed by adding a single point, which
we call ∞ (this point corresponds to the point N on the sphere), to ℝ²
and defining the topology on ℝ² ∪ {∞} to consist of the above two types of sets. This
is exactly how the topology 𝒯∞ in theorem 5.10.1 is defined.
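The correspondence between the punctured sphere and the plane can be made concrete. The sketch below assumes the standard stereographic projection from the north pole N = (0, 0, 1) of the unit sphere onto the plane z = 0; the text does not state the formula, so the formulas and the function names `project` and `unproject` are assumptions made for illustration.

```python
def project(p):
    """Stereographic projection of a point p = (x, y, z) on the unit
    sphere, other than the north pole N = (0, 0, 1), to the plane."""
    x, y, z = p
    if z >= 1.0:
        raise ValueError("the north pole has no image in the plane")
    return (x / (1.0 - z), y / (1.0 - z))


def unproject(q):
    """Inverse map: send a point q = (u, v) of the plane to the sphere."""
    u, v = q
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))


if __name__ == "__main__":
    # Points far from the origin land close to the north pole, which is
    # why the complement of a compact subset of the plane corresponds to
    # a punctured neighborhood of N.
    for radius in (1.0, 10.0, 1000.0):
        print(radius, unproject((radius, 0.0)))
```

Under this map, the point ∞ added to ℝ² plays the role of N: a sequence in the plane leaves every compact set exactly when its image on the sphere approaches N.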
Theorem 5.10.1. Let (X, 𝒯) be a locally compact Hausdorff space that is not
compact. Then there exists a compact Hausdorff space (X∞ , 𝒯∞ ) containing (X, 𝒯)
such that
Proof. Take an object, which we give the symbol ∞, that does not belong to X, and
let X∞ = X ∪ {∞}.
We define 𝒯∞ to be the collection of subsets of X∞ of one of the following two
types:
Example 2. The one-point compactification of the open interval (0, 1) is the circle
𝒮1 . To see this, recall that the unit interval (0, 1) is homeomorphic to the line ℝ. Since
the one-point compactification of ℝ is 𝒮1 , the compactification of (0, 1) is 𝒮1 .
Each of the open half lines is homeomorphic to an open half circle, as shown in
figure 5.1(a). There are several ways to see this. The stereographic projection
is the easiest to visualize. The next step is to pull the two open half circles
horizontally apart a distance equal to the diameter of each half circle, as shown
in figure 5.1(b). Now each half circle is homeomorphic to a punctured circle.
For example, the function $f(e^{i\theta}) = e^{2i\theta}$ maps the half circle $\{e^{i\theta} : -\pi/2 < \theta < \pi/2\}$
onto the punctured circle $\{e^{i\theta} : -\pi < \theta < \pi\}$. Hence X is homeomorphic to the
union of the two tangent punctured circles shown in figure 5.1(c). If we define
the point at infinity to be the missing point of tangency, we obtain the figure
eight shown in figure 5.1(d).
Figure 5.1 (panels (a), (b), (c), and (d))
Exercises
5.11 Metrization
We now turn to the question of which topologies are induced by a metric. Theorem
5.11.3 is the main result in this section. Although it is not the best known result,
it does establish sufficient conditions for metrization. The proof techniques we
develop along the path to theorem 5.11.3 are elegant and important in their own
right. We first state the following definition.
Lemma 5.11.1. Suppose X is a normal space, and let E and F be disjoint closed
subsets of X. Let C be the set of rational points in the interval [0, 1]. Then there
exists a countable collection of open subsets {Up ∶ p ∈ C} such that
Cn , say, pi < pn+1 < pj . Again by theorem 5.6.3, there exists an open set $U_{p_{n+1}}$ such
that $\overline{U}_{p_i} \subseteq U_{p_{n+1}} \subseteq \overline{U}_{p_{n+1}} \subseteq U_{p_j}$. By construction, the sets $U_{p_0}, \ldots, U_{p_{n+1}}$ satisfy
condition (*) for p, q ∈ Cn+1 . Since, for every pair of points p and q in C, there
is a finite set Cn that contains p and q, the proof is complete.
The inclusions E ⊆ Up , and Up ⊆ X − F for all p ∈ C are obvious since E ⊆ U0
and U1 ⊆ X − F.
Remark. Any dense subset C of [0, 1] containing 0 and 1 can be used in
the construction of the collection {Up ∶ p ∈ C}. A commonly used such set is
the set of dyadic rational numbers D = {0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, …},⁵
which is slightly more advantageous in the visualization of the construction of
the sets {Up }.
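The order in which the dyadic rationals are listed above can be generated mechanically, and the inductive step of the lemma (finding the immediate neighbors pi < pn+1 < pj among the points already handled) is easy to trace on that list. The sketch below is illustrative only; the names `dyadic_rationals` and `neighbors` are hypothetical.

```python
from fractions import Fraction
from itertools import islice


def dyadic_rationals():
    """Yield 0, 1, then k / 2^n for n = 1, 2, ... and odd k < 2^n."""
    yield Fraction(0)
    yield Fraction(1)
    n = 1
    while True:
        for k in range(1, 2 ** n, 2):
            yield Fraction(k, 2 ** n)
        n += 1


def neighbors(new_point, earlier_points):
    """Immediate predecessor and successor of new_point among the
    points already enumerated (assumes 0 and 1 are among them)."""
    lower = max(p for p in earlier_points if p < new_point)
    upper = min(p for p in earlier_points if p > new_point)
    return lower, upper


if __name__ == "__main__":
    points = list(islice(dyadic_rationals(), 9))
    for index in range(2, len(points)):
        print(points[index], neighbors(points[index], points[:index]))
```

With the dyadic enumeration, each new point is the midpoint of its two neighbors, which is what makes the nested sets easy to visualize.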
The following theorem is crucial for the proof of theorem 5.11.3. It is greatly
important in its own right.
Theorem 5.11.2 (Urysohn’s lemma). Suppose that X is a normal space and that
E and F are disjoint closed subsets of X. Then there exists a continuous function
f ∶ X → [0, 1] such that f (E) = 1, and f (F) = 0.
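\[
f_p(x) = \begin{cases} p & \text{if } x \in U_{1-p}, \\ 0 & \text{if } x \notin U_{1-p}, \end{cases}
\qquad \text{and} \qquad
g_q(x) = \begin{cases} 1 & \text{if } x \in U_{1-q}, \\ q & \text{if } x \notin U_{1-q}. \end{cases}
\]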
⁵ $D = \{0, 1\} \cup \bigcup_{n=1}^{\infty} \{ k/2^n : k = 1, 3, 5, \ldots, 2^n - 1 \}$.
Now p = fp (x) > gq (x) = q; hence 1 − p < 1 − q, and U1−p ⊆ U1−q . This
contradicts x ∈ U1−p and x ∉ U1−q and establishes the claim that fp ≤ gq .
Suppose, for a contradiction, that f (x) < g (x) for some x ∈ X. Because C is dense
in [0, 1], there are points p, q ∈ C such that f (x) < p < q < g (x). Now f (x) < p
implies that x ∉ U1−p , and g (x) > q implies x ∈ U1−q . This is a contradiction
because
1 − q < 1 − p; hence U1−q ⊆ U1−p . The contradiction concludes the proof.
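In a metric space, a function with the properties stated in Urysohn's lemma can be written down directly, without the construction used in the proof above. The sketch below illustrates this shortcut on the real line; it is not the method of the proof, and the sets E = [0, 1], F = [3, 4] and the names `dist_to_interval` and `urysohn` are assumptions chosen for the example.

```python
def dist_to_interval(x: float, a: float, b: float) -> float:
    """Distance from the point x to the closed interval [a, b]."""
    if x < a:
        return a - x
    if x > b:
        return x - b
    return 0.0


def urysohn(x: float, E=(0.0, 1.0), F=(3.0, 4.0)) -> float:
    """f(x) = d(x, F) / (d(x, E) + d(x, F)) for disjoint closed
    intervals E and F; f equals 1 on E and 0 on F."""
    d_E = dist_to_interval(x, *E)
    d_F = dist_to_interval(x, *F)
    # The denominator is positive because E and F are disjoint and closed.
    return d_F / (d_E + d_F)


if __name__ == "__main__":
    for x in (-1.0, 0.5, 1.5, 2.0, 2.5, 3.5, 10.0):
        print(x, urysohn(x))
```

The value of the lemma is that it delivers such a function in every normal space, where no distance function is available.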
\[
d(x, y) = \Big\{ \sum_{i=1}^{\infty} \frac{|f_i(x) - f_i(y)|^2}{i^2} \Big\}^{1/2}.
\]
It is clear that the series in the above definition converges since | fi (x) − fi (y)| ≤ 1. In
fact, the sequences $\varphi_x = \big( f_1(x), \frac{f_2(x)}{2}, \ldots, \frac{f_i(x)}{i}, \ldots \big)$ and $\varphi_y = \big( f_1(y), \frac{f_2(y)}{2}, \ldots, \frac{f_i(y)}{i}, \ldots \big)$
are in l2 and d(x, y) is nothing but the l2 distance between 𝜑x and 𝜑y . It becomes
clear that d is a metric once we show that the function x ↦ 𝜑x is an injection. Let
x and y be distinct elements of X, and let U be an open neighborhood of x that
excludes y. Choose a basis member Bm such that $x \in B_m \subseteq \overline{B_m} \subseteq U$, then choose a
basis member Bn such that $x \in B_n \subseteq \overline{B_n} \subseteq B_m$. The pair (Bn , Bm ) ∈ P, and hence
(Bn , Bm ) = (Bni , Bmi ) for some i ∈ ℕ. It follows that fi (x) = 0 and fi (y) = 1.
We now show that the metric d induces the topology 𝒯.
We prove that every d-open subset U of X is 𝒯-open. Let x ∈ U, and let r > 0
be such that B(x, r) ⊆ U. Here B(x, r) is the d-ball of radius r centered at x.
We will show that there exists a 𝒯-open set V such that x ∈ V ⊆ B(x, r). First
choose an integer N such that $\sum_{i=N+1}^{\infty} \frac{1}{i^2} < \frac{r^2}{2}$. Since each fi is continuous,
there exists an open neighborhood Vi of x such that, for every y ∈ Vi ,
$|f_i(x) - f_i(y)|^2 < \frac{r^2}{2N}$. We claim that the set $V = \bigcap_{i=1}^{N} V_i$ is the set we seek. If y ∈ V, then
\[
[d(x, y)]^2 = \sum_{i=1}^{\infty} \frac{|f_i(x) - f_i(y)|^2}{i^2}
= \sum_{i=1}^{N} \frac{|f_i(x) - f_i(y)|^2}{i^2} + \sum_{i=N+1}^{\infty} \frac{|f_i(x) - f_i(y)|^2}{i^2}
< \frac{r^2}{2N} \sum_{i=1}^{N} \frac{1}{i^2} + \sum_{i=N+1}^{\infty} \frac{1}{i^2}
< \frac{r^2}{2} + \frac{r^2}{2} = r^2.
\]
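The choice of N in the argument above can be made explicit. The sketch below relies on the standard integral estimate that the tail of the series ∑ 1/i² beyond N is less than 1/N, so any N ≥ 2/r² works; this estimate and the names `choose_N` and `tail` are not from the text and are used here only for illustration.

```python
import math


def choose_N(r: float) -> int:
    """Return an integer N with 1/N <= r^2 / 2, so that the tail of
    sum 1/i^2 beyond N (which is less than 1/N) is below r^2 / 2."""
    if r <= 0:
        raise ValueError("r must be positive")
    return max(1, math.ceil(2.0 / (r * r)))


def tail(N: int) -> float:
    """Tail sum_{i > N} 1/i^2, computed from the known total pi^2 / 6."""
    return math.pi ** 2 / 6.0 - sum(1.0 / (i * i) for i in range(1, N + 1))


if __name__ == "__main__":
    for r in (1.0, 0.5, 0.1):
        N = choose_N(r)
        print(r, N, tail(N), r * r / 2.0)
```

The bound 1/N is not sharp, so the N returned here is usually larger than necessary; the proof only needs some N that works.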
To show that every 𝒯-open set is d-open, it is sufficient to show that every basic
open set Bm is d-open. Let x ∈ Bm . We need to show that there exists r > 0 such that
B(x, r) ⊆ Bm . By theorem 5.6.2, there exists a basis element Bn such that $x \in B_n \subseteq \overline{B_n} \subseteq B_m$.
Now (Bn , Bm ) ∈ P, say, $(B_n, B_m) = (B_{n_i}, B_{m_i})$. Let $r = \frac{1}{2i}$. If y ∈ B(x, r),
then $\sum_{j=1}^{\infty} \frac{|f_j(x) - f_j(y)|^2}{j^2} < \frac{1}{4i^2}$. In particular, $\frac{|f_i(x) - f_i(y)|^2}{i^2} < \frac{1}{4i^2}$. Thus $|f_i(x) - f_i(y)| < \frac{1}{2}$.
Because fi (x) = 0, $|f_i(y)| < \frac{1}{2}$. Since $f_i(X - B_{m_i}) = 1$, $y \in B_{m_i} = B_m$.
The conditions of theorem 5.11.3 are not necessary for a space to be metrizable.
For example, the space l∞ is metrizable but not second countable. However, if we
limit ourselves to compact Hausdorff spaces, the conditions of theorem 5.11.3 are
necessary as well as sufficient, as the next theorem shows.
We now venture back into locally compact Hausdorff spaces. The following theo-
rem is the closest analog of theorem 5.11.2 for locally compact Hausdorff spaces,
which need not be normal. It is sometimes referred to as Urysohn’s theorem for
locally compact spaces.
Theorem 5.11.5. Let X be a locally compact Hausdorff space, and let K and F be
disjoint subsets of X such that K is compact and F is closed. Then there exists a
continuous function f ∶ X → [0, 1] such that f (K) = 1, and f (F) = 0.
Proof. Applying theorem 5.9.2 to the compact set K and the open set X − F, there
exists an open subset V with compact closure such that $K \subseteq V \subseteq \overline{V} \subseteq X - F$.
Applying theorem 5.9.2 again to the compact set K and the open set V, we can
find an open set U with compact closure such that $K \subseteq U \subseteq \overline{U} \subseteq V$. Now the
subspace $\overline{V}$ with the restricted topology is a compact Hausdorff space and is
therefore normal by theorem 5.8.10. Applying theorem 5.11.2 to the closed subsets
K and $\overline{V} - U$ of $\overline{V}$, there is a continuous function $f : \overline{V} \to [0, 1]$ such that f (K) = 1,
and $f(\overline{V} - U) = 0$. Extend f to a continuous function f ∶ X → [0, 1] by defining
f (x) = 0, for all x ∈ X − U. Problem 4 on section 5.3 is relevant here to show the
continuity of the extended f.
Proof. Apply theorem 5.9.2 to find an open set U with compact closure such that
$K \subseteq U \subseteq \overline{U} \subseteq V$. Now apply theorem 5.11.5 to the sets K and F = X − U to find
a function f ∶ X → [0, 1] such that f (K) = 1 and f (X − U) = 0. Observe that
supp( f) ⊆ $\overline{U}$, which is compact.
Theorem 5.11.7. A function f ∈ 𝒞0 (X) is bounded and the space 𝒞0 (X) is a complete
normed linear space under the supremum norm.
Proof. We leave it to the reader to show that 𝒞0 (X) ⊆ ℬ𝒞(X). We prove that 𝒞0 (X) is
closed in ℬ𝒞(X). Let f ∈ ℬ𝒞(X) be a closure point of 𝒞0 (X), and let 𝜖 > 0. There
exists a function g ∈ 𝒞0 (X) such that ‖ f − g‖∞ < 𝜖/2. Let K be a compact subset
of X such that |g (x)| < 𝜖/2 whenever x ∉ K. Now if x ∉ K, then | f (x)| ≤ | f (x) −
g (x)| + |g (x)| < 𝜖/2 + 𝜖/2.
Proof. Let g ∈ 𝒞0 (X), let 𝜖 > 0, and let K be a compact subset of X such that
|g (x)| < 𝜖 for x ∈ X − K. By theorem 5.9.2, there is an open subset V with compact
closure such that K ⊆ V. By corollary 5.11.6, there exists a function f ∈ 𝒞c (X) such
that f (K) = 1, 0 ≤ f (x) ≤ 1, and supp( f) ⊆ V. The function fg is in 𝒞c (X) and
‖g − fg‖∞ < 𝜖.
Exercises
This section generalizes section 5.4. First we review some terminology and
notation.
Let {X𝛼 }𝛼∈I be an arbitrary collection of nonempty sets. The Cartesian product
X = ∏𝛼∈I X𝛼 is the set of all functions x ∶ I → ∪𝛼∈I X𝛼 such that, for every 𝛼 ∈ I,
x(𝛼) ∈ X𝛼 . We write x𝛼 instead of x(𝛼), and we denote an element of X by
x = (x𝛼 )𝛼∈I , or simply x = (x𝛼 ). For a fixed 𝛼 ∈ I, the projection of X onto the
factor set X𝛼 is the function 𝜋𝛼 (x) = x𝛼 .
\[
\Big\{ \prod_{\alpha \in I} U_\alpha : \alpha \in I,\ U_\alpha \in \mathcal{T}_\alpha \Big\}.
\]
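\[
\bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i}) = U_{\alpha_1} \times \cdots \times U_{\alpha_n} \times \prod_{\alpha \neq \alpha_i} X_\alpha,
\]
To reiterate, the above set is the set of all x ∈ X such that $\pi_{\alpha_i}(x) \in U_{\alpha_i}$ for all
1 ≤ i ≤ n.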
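Because an element of the product is a function on the index set, membership in a basic open set only ever inspects finitely many coordinates. The sketch below models this for real-valued factors: a point is a Python callable on the index set, and a basic open set is a finite dictionary of open intervals. The names `projection` and `in_basic_open_set`, and the choice of I = ℝ with every factor equal to ℝ, are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Tuple

# A point of the product is a function alpha -> x_alpha.
Point = Callable[[float], float]
# A basic open set restricts finitely many coordinates to open intervals.
Constraints = Dict[float, Tuple[float, float]]


def projection(x: Point, alpha: float) -> float:
    """The projection pi_alpha(x) = x_alpha."""
    return x(alpha)


def in_basic_open_set(x: Point, constraints: Constraints) -> bool:
    """Check pi_alpha(x) in (a, b) for each of the finitely many
    constrained indices; all other coordinates are unrestricted."""
    return all(a < projection(x, alpha) < b
               for alpha, (a, b) in constraints.items())


if __name__ == "__main__":
    x = math.sin  # the point whose alpha-th coordinate is sin(alpha)
    print(in_basic_open_set(x, {0.0: (-0.5, 0.5), 1.0: (0.5, 1.0)}))
    print(in_basic_open_set(x, {0.0: (0.1, 0.5)}))
```

An empty dictionary of constraints describes the whole product, which corresponds to the empty intersection of subbasic sets.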
The following theorem is a restatement of the definition of the product
topology. See the proof of theorem 5.4.1.
Theorem 5.12.1. The product topology is the weakest topology relative to which all
the projections 𝜋𝛼 ∶ X → X𝛼 are continuous.
⁶ The set 𝜋𝛼−1 (U𝛼 ) is the set of all elements x in X such that x𝛼 ∈ U𝛼 and the other coordinates, x𝛽 ,
of x are unrestricted elements of X𝛽 . This is exactly the set U𝛼 × ∏𝛽≠𝛼 X𝛽 .
\[
j_z(\alpha) = \begin{cases} z_\alpha & \text{if } \alpha \in F, \\ y_\alpha & \text{if } \alpha \notin F. \end{cases}
\]
\[
y_\alpha = \begin{cases} a_\alpha & \text{if } \alpha = \alpha_i, \\ x_\alpha & \text{if } \alpha \neq \alpha_i. \end{cases}
\]
Theorem 5.12.3. For each 𝛼 ∈ I, let 𝔅𝛼 be an open base for X𝛼 . Then the family of
subsets of X of the form ∩𝛼∈F 𝜋𝛼−1 (B𝛼 ), where F ranges over finite subsets of I and
B𝛼 ∈ 𝔅𝛼 , is an open base for the product topology.
The proof is left as an exercise.
Lemma 5.12.4. Let {X𝛼 }𝛼 be a collection of topological spaces, and let X be the
product space. If 𝔉 is a collection of closed subsets of X possessing the finite
intersection property, then there exists a family 𝔉∗ of subsets of X, not necessarily
closed, which is maximal subject to the following conditions:
Proof. Consider the family 𝔇 of subsets of X containing 𝔉 and having the finite
intersection property. Order 𝔇 by set inclusion, and let ℭ be a chain in 𝔇. We
will verify that ∪{𝒞 ∶ 𝒞 ∈ ℭ} is an upper bound on ℭ. Let F1 , … , Fn be members
of ∪{𝒞 ∶ 𝒞 ∈ ℭ}. Then there are members 𝒞1 , … , 𝒞n of ℭ such that Fi ∈ 𝒞i . Since
ℭ is a chain, one of the families 𝒞1 , … , 𝒞n , say, 𝒞1 , contains all the others. Now
all the sets F1 , … , Fn are in 𝒞1 ; hence ∩ni=1 Fi ≠ ∅, and ∪{𝒞 ∶ 𝒞 ∈ ℭ} has the
finite intersection property. Clearly, ∪{𝒞 ∶ 𝒞 ∈ ℭ} contains 𝔉. By Zorn’s lemma, 𝔇 contains
a maximal member 𝔉∗ . If there are sets F1 and F2 in 𝔉∗ such that F1 ∩ F2 ∉
𝔉∗ , then 𝔉∗ ∪ {F1 ∩ F2 } would have properties (a) and (b), which contradicts
the maximality of 𝔉∗ . This proves that the intersection of two (hence any finite
number of) sets in 𝔉∗ is in 𝔉∗ .
Proof. Let 𝔉 be a collection of closed subsets of X that has the finite intersection
property. We prove that ∩{F ∶ F ∈ 𝔉} ≠ ∅. By theorem 5.8.8, this shows that X is compact. Let
𝔉∗ be a collection of subsets of X having the properties described in lemma 5.12.4.
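We will show that $\bigcap\{\overline{F} : F \in \mathfrak{F}^*\} \neq \emptyset$. This will establish the theorem because
the members of 𝔉 are closed and $\bigcap\{\overline{F} : F \in \mathfrak{F}^*\} \subseteq \bigcap\{\overline{F} : F \in \mathfrak{F}\} = \bigcap\{F : F \in \mathfrak{F}\}$.
\[
\{\overline{\pi_\alpha(F)} : F \in \mathfrak{F}^*\}.
\]
This is a family of closed subsets of X𝛼 , and it has the finite intersection property
because if F1 , … , Fn are in 𝔉∗ , then $\bigcap_{i=1}^{n} \overline{\pi_\alpha(F_i)} \supseteq \bigcap_{i=1}^{n} \pi_\alpha(F_i) \supseteq \pi_\alpha\big(\bigcap_{i=1}^{n} F_i\big) \neq \emptyset$.
Since each X𝛼 is compact, there is an element $x_\alpha \in \bigcap\{\overline{\pi_\alpha(F)} : F \in \mathfrak{F}^*\}$ (theorem
5.8.8). Let x = (x𝛼 ). We will show that $x \in \bigcap\{\overline{F} : F \in \mathfrak{F}^*\}$. Let $U = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$
be an arbitrary basic open neighborhood of x. We claim that U intersects every
F ∈ 𝔉∗ . This will show that $x \in \overline{F}$ for every F ∈ 𝔉∗ , and the proof will be complete. Since $x_{\alpha_i} \in U_{\alpha_i}$,
and $x_{\alpha_i} \in \overline{\pi_{\alpha_i}(F)}$ for every F ∈ 𝔉∗ , $U_{\alpha_i} \cap \pi_{\alpha_i}(F) \neq \emptyset$ for every F ∈ 𝔉∗ . Thus
$\pi_{\alpha_i}^{-1}(U_{\alpha_i}) \cap F \neq \emptyset$ for every F ∈ 𝔉∗ . By the maximality of 𝔉∗ , it must be the case
that $\pi_{\alpha_i}^{-1}(U_{\alpha_i}) \in \mathfrak{F}^*$. Since 𝔉∗ is closed under the formation of finite intersections,
$U = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i}) \in \mathfrak{F}^*$. In particular, U ∩ F ≠ ∅ for every F ∈ 𝔉∗ .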
\[
\mathfrak{S} = \Big\{ \prod_{\alpha \in I} U_\alpha : U_\alpha \in \mathcal{T}_\alpha \Big\}.
\]
We conclude this section by showing that not every compact Hausdorff space is
metrizable.
Example 6. Let I be an uncountable set and, for each 𝛼 ∈ I, let X𝛼 = [0, 1]. The
space X = [0, 1]I = ∏𝛼∈I X𝛼 is compact by Tychonoff ’s theorem, and Hausdorff
by example 1. We show that X is not metrizable.
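Let 0 denote the zero function from I to [0, 1], and let A consist of all elements
x = (x𝛼 ) ∈ X such that x𝛼 = 0 for finitely many 𝛼 ∈ I, and x𝛼 = 1 otherwise. We
show that $0 \in \overline{A}$. Suppose $B = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$ is a basic open neighborhood of 0.
The element x = (x𝛼 ) defined below is in A ∩ B, and hence A ∩ B ≠ ∅:
\[
x_\alpha = \begin{cases} 0 & \text{if } \alpha = \alpha_i \text{ for some } 1 \le i \le n, \\ 1 & \text{if } \alpha \neq \alpha_i \text{ for all } 1 \le i \le n. \end{cases}
\]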
We show that, for any sequence $x^{(n)} = (x^{(n)}_\alpha)$ in A, $\lim_n x^{(n)} \neq 0$. The proof will
be complete by theorem 4.2.5. Let In be the subset of elements 𝛼 ∈ I for which
$x^{(n)}_\alpha = 0$. The set $J = \bigcup_{n=1}^{\infty} I_n$ is countable since each of the sets In is finite. Because
I is uncountable, I − J ≠ ∅. Pick an element 𝛽 ∈ I − J. By construction, $x^{(n)}_\beta = 1$
for all n ∈ ℕ. Now consider the open set $V = \pi_\beta^{-1}([0, 1/2))$; V is a neighborhood
of 0 that contains no terms of the sequence $(x^{(n)})$.
Exercises
\[
D(x, y) = \sup_{n \in \mathbb{N}} \frac{d_n(x_n, y_n)}{n}
\]
induces the product topology on $\prod_{n=1}^{\infty} X_n$. Here x = (xn ), and y = (yn ), are
elements of $\prod_{n=1}^{\infty} X_n$.
9. In the notation of the previous exercise, prove that the metric
\[
d(x, y) = \sum_{n=1}^{\infty} 2^{-n} d_n(x_n, y_n)
\]
also induces the product topology on $\prod_{n=1}^{\infty} X_n$.
10. In the notation of problem 8, prove that if dn is a complete metric for every
n ∈ ℕ, then D is a complete metric.
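The two product metrics in problems 8 and 9 can be compared on truncated sequences. The sketch below is only an illustration under stated assumptions: the statement of problem 8 is incomplete in this section, so the code assumes that each dn is bounded by 1 (here dn(a, b) = min(|a − b|, 1) on ℝ) and that the two sequences agree beyond the coordinates supplied. The names `d_n`, `D`, and `d_sum` are hypothetical.

```python
def d_n(a: float, b: float) -> float:
    """A metric on R bounded by 1 (an assumption for this sketch)."""
    return min(abs(a - b), 1.0)


def D(x, y) -> float:
    """sup_n d_n(x_n, y_n) / n over the supplied (nonempty) coordinates."""
    return max(d_n(a, b) / n for n, (a, b) in enumerate(zip(x, y), start=1))


def d_sum(x, y) -> float:
    """sum_n 2^(-n) d_n(x_n, y_n) over the supplied coordinates."""
    return sum(2.0 ** (-n) * d_n(a, b)
               for n, (a, b) in enumerate(zip(x, y), start=1))


if __name__ == "__main__":
    x = (0.0, 0.0, 0.0)
    y = (0.5, 3.0, 0.3)
    print(D(x, y), d_sum(x, y))
```

In both formulas the weight on the n-th coordinate tends to 0, so two points that differ only in a late coordinate are close. This is the feature that makes the metric balls compatible with basic open sets of the product topology, which restrict only finitely many coordinates.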
6 Banach Spaces
Mathematics is the most beautiful and most powerful creation of the human
spirit.
Stefan Banach
A life-changing event occurred in the spring of 1916 when Banach met Steinhaus,
who was living in Kraków, waiting to take up a post at the Jan Kazimierz University
in Lvov. Steinhaus and Banach wrote a joint paper, which was published in The
Bulletin of the Kraków Academy after the war ended in 1918. From that time,
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0006
In 1922 the Jan Kazimierz University in Lvov awarded Banach his qualification to
become a university professor, and in 1924 Banach was promoted to full professor.
The years between the wars were extremely busy for Banach. As well as continuing
to produce a stream of important papers, he wrote arithmetic, geometry, and
algebra texts for high schools. In 1929, together with Steinhaus, he started a
new journal, Studia Mathematica, and Banach and Steinhaus became the first
editors. Another important publishing venture, begun in 1931, was a new series
of mathematical monographs. These were set up under the editorship of Banach
and Steinhaus, from Lvov, and Knaster, Kuratowski, Mazurkiewicz, and Sierpiński
from Warsaw. The first volume in the series, Théorie des opérations linéaires, was
written by Banach and appeared in 1932. It was a French version of a volume
he originally published in Polish in 1931 and quickly became a classic. Another
important influence on Banach was the fact that Kuratowski was appointed to
the Lvov Technical University in 1927 and worked there until 1934. Banach
collaborated with Kuratowski, and they wrote some joint papers during this
period. Banach proved a number of fundamental results on normed linear spaces,
including the Hahn-Banach theorem, the Banach-Steinhaus theorem, the Banach-
Alaoglu theorem, Banach’s open mapping theorem, and the Banach fixed point
theorem. In addition, he contributed to measure theory, integration, topological
vector spaces, and set theory.
In 1939, just before the start of World War II, Banach was elected as President
of the Polish Mathematical Society. At the beginning of the war, Soviet troops
occupied Lvov. Banach had been on good terms with the Soviet mathematicians
before the war started, visiting Moscow several times, and he was treated well by
the new Soviet administration. He was allowed to continue to hold his chair at
the university, and he became Dean of the Faculty of Science at the university,
now renamed the Ivan Franko University. Life at this stage was little changed for
Banach, who continued his research, his textbook writing, lecturing, and holding
sessions in cafés. Sobolev and Alexandroff visited Banach in Lvov in 1940, and
Banach attended conferences in the Soviet Union. He was in Kiev when Germany
invaded the Soviet Union, and he returned immediately to his family in Lvov.
The Nazi occupation of Lvov in June 1941 meant that Banach lived under very
difficult conditions. He was arrested under suspicion of trafficking in German
currency but was released after a few weeks. As soon as the Soviet troops retook
Lvov, Banach renewed his contacts. He met Sobolev outside Moscow, but by this
time he was seriously ill. Sobolev, giving an address at a memorial conference for
Banach, said of this meeting:²
Despite heavy traces of the war years under German occupation, and despite the
grave illness that was undercutting his strength, Banach’s eyes were still lively.
He remained the same sociable, cheerful, and extraordinarily well-meaning and
charming Stefan Banach whom I had seen in Lvov before the war. That is how he
remains in my memory: with a great sense of humor, an energetic human being, a
beautiful soul, and a great talent.
Banach had planned to go to Kraków after the war to take up the chair of
mathematics at the Jagiellonian University, but he died in Lvov in 1945 of lung
cancer.
This section draws some sharp distinctions between finite and infinite-
dimensional spaces. Although some of the results in this section have intrinsic
importance and will be used later in the book, they are collected here to convince
the reader that infinite-dimensional spaces are truly vast compared to finite-
dimensional ones and that a very different set of tools is needed for studying
them. Among other results, we will see that local compactness characterizes finite-
dimensional normed linear spaces, and that an infinite-dimensional Banach space
cannot have a countable linear basis.
Lemma 6.1.1. Let X be an n-dimensional vector space. Then there exists a norm
‖.‖∗ on X such that (X, ‖.‖∗ ) is isometric to (𝕂n , ‖.‖∞ ). In particular, (X, ‖.‖∗ ) is
complete and locally compact.
Proof. Fix a basis {x1 , … , xn } of X, and define $\|x\|_* = \max_{1 \le i \le n} |a_i|$, where
$x = \sum_{i=1}^{n} a_i x_i$ is the unique representation of x as a linear combination of the basis
elements. The mapping T ∶ x ↦ (a1 , … , an ) is clearly a linear isometry from
(X, ‖.‖∗ ) onto (𝕂n , ‖.‖∞ ).
Theorem 6.1.2. Let (X, ‖.‖) be an n-dimensional normed linear space, and let ‖.‖∗
be the norm on X defined in lemma 6.1.1. Then there exist positive constants 𝛼
and 𝛽 such that, for all x ∈ X, 𝛽‖x‖∗ ≤ ‖x‖ ≤ 𝛼‖x‖∗ .
Proof. We continue to use the notation of the proof of the previous lemma.
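Let $\alpha = n \max_{1 \le i \le n} \|x_i\|$. Then
\[
\|x\| = \Big\| \sum_{i=1}^{n} a_i x_i \Big\| \le \sum_{i=1}^{n} |a_i| \|x_i\| \le \max_{1 \le i \le n} \|x_i\| \sum_{i=1}^{n} |a_i|
\le n \max_{1 \le i \le n} \|x_i\| \max_{1 \le i \le n} |a_i| = \alpha \|x\|_*.
\]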
To prove the other inequality, define a function 𝜆 ∶ (X, ‖.‖∗ ) → ℝ by 𝜆(x) = ‖x‖.
Now 𝜆 is continuous because if limn ‖xn − x‖∗ = 0, then |𝜆(xn ) − 𝜆(x)| = |‖xn ‖ −
‖x‖| ≤ ‖xn − x‖ ≤ 𝛼‖xn − x‖∗ . Hence 𝜆(xn ) → 𝜆(x). By lemma 6.1.1, (X, ‖.‖∗ ) is
locally compact; hence the closed unit sphere S = {x ∈ X ∶ ‖x‖∗ = 1} in (X, ‖.‖∗ )
is compact (see problem 9 on section 4.7). Thus the restriction of 𝜆 to S assumes a
minimum value 𝛽 = 𝜆(x0 ) at some point x0 ∈ S. The constant 𝛽 must be positive
since, otherwise, 𝜆(x0 ) = ‖x0 ‖ = 0, and hence x0 = 0, which is not possible. We
have shown that, for every x ∈ S, ‖x‖ ≥ 𝛽. Now, for a nonzero vector x ∈ X, $\frac{x}{\|x\|_*} \in S$;
hence $\Big\| \frac{x}{\|x\|_*} \Big\| \ge \beta$, and ‖x‖ ≥ 𝛽‖x‖∗ .
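The two constants in theorem 6.1.2 can be examined in a small case. The sketch below takes X = ℝ² with the Euclidean norm and the basis x1 = (1, 0), x2 = (1, 1); this basis, the sampling scheme, and all function names are assumptions made for illustration. The constant 𝛼 is computed from the formula in the proof, while 𝛽 is only estimated by sampling the unit sphere of ‖.‖∗, so the value found is an upper estimate of the true minimum.

```python
import math

# Hypothetical basis of R^2 chosen for illustration (not from the text).
BASIS = [(1.0, 0.0), (1.0, 1.0)]


def combine(a):
    """Return a[0]*x1 + a[1]*x2 for the basis above."""
    return (a[0] * BASIS[0][0] + a[1] * BASIS[1][0],
            a[0] * BASIS[0][1] + a[1] * BASIS[1][1])


def coords(x):
    """Coordinates (a1, a2) of x relative to BASIS: x = a1*(1,0) + a2*(1,1)."""
    return (x[0] - x[1], x[1])


def norm_star(x):
    """The norm of lemma 6.1.1: the largest coordinate in absolute value."""
    return max(abs(a) for a in coords(x))


def norm(x):
    """The Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])


# alpha = n * max ||x_i||, as in the proof.
ALPHA = len(BASIS) * max(norm(b) for b in BASIS)


def estimate_beta(steps=10001):
    """Approximate the minimum of ||x|| over the sphere ||x||_* = 1 by
    sampling the four edges of the coordinate square max(|a1|, |a2|) = 1."""
    best = math.inf
    for k in range(steps):
        t = -1.0 + 2.0 * k / (steps - 1)
        for a in ((1.0, t), (-1.0, t), (t, 1.0), (t, -1.0)):
            best = min(best, norm(combine(a)))
    return best


if __name__ == "__main__":
    beta = estimate_beta()
    print("alpha =", ALPHA, "beta ~", beta)
    for x in [(3.0, -2.0), (0.1, 0.1), (-5.0, 7.0)]:
        print(x, beta * norm_star(x), norm(x), ALPHA * norm_star(x))
```

The sampling step mirrors the compactness argument in the proof: the minimum over the sphere exists and is positive, and the grid search merely approximates it.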
Corollary 6.1.3. All norms on a finite-dimensional normed linear space are equiv-
alent.
The polynomial p∗n is the best approximation of f in ℙn . It can be shown that p∗n is
unique. Observe that p∗n can have degree less than n.
Example 3. For a function f ∈ 𝒞[0, 1], the sequence of best approximations p∗n
converges uniformly to f.
Let 𝜖 > 0. By the Weierstrass polynomial approximation theorem, there exists
a polynomial q such that ‖ f − q‖∞ < 𝜖. Let N be the degree of q. Then, for
every n > N, q ∈ ℙn . Since p∗n is the best approximation of f in ℙn , ‖ f − p∗n ‖∞ ≤
‖ f − q‖∞ < 𝜖. This shows that limn ‖ f − p∗n ‖∞ = 0.
The following theorem establishes the fact that local compactness is exclusively a
property of finite-dimensional spaces.
Theorem 6.1.5. A normed linear space X is locally compact if and only if it is finite
dimensional.
Proof. Finite-dimensional spaces are locally compact by lemma 6.1.1 and corol-
lary 6.1.3. Now suppose X is a locally compact normed linear space. Thus the
closed unit ball B = {x ∈ X ∶ ‖x‖ ≤ 1} is compact. Since B ⊆ ∪x∈B B(x, 1/2), B ⊆
∪ni=1 B(xi , 1/2) for a finite subset {x1 , … , xn } of B. We will show that {x1 , … , xn }
spans X. Let F = Span{x1 , … , xn }, and suppose, for a contradiction, that there
These last two inequalities imply that, for p > P, ‖yp − x‖2 < 𝜖.
Finally, we show that Span(ℋ) is dense in l2 . Let x = (xn ) ∈ l2 , and let 𝜖 > 0.
Choose an integer N such that $\sum_{n=N+1}^{\infty} |x_n|^2 < \epsilon^2$. Define y1 = (1, 0, 0, ...),
y2 = (0, 1/2, 0, 0, ...), ... , yN = (0, 0, … , 0, 1/N, 0, ...), and set $h = \sum_{n=1}^{N} a_n y_n$,
where an = nxn . Clearly, h ∈ Span(ℋ) and ‖x − h‖2 < 𝜖.
Two very useful tools for studying finite-dimensional spaces are local compact-
ness and the existence of a finite linear basis. We already saw in theorem 6.1.5
that infinite-dimensional normed linear spaces are never locally compact. By
definition, an infinite-dimensional space cannot have a finite Hamel basis. The
following theorem should thoroughly convince the reader that a Hamel basis is
of no practical use as a tool for studying infinite-dimensional Banach spaces.
However, see the concept of a Schauder basis in the exercises following this section
and the next section.
The following result will be used frequently later in the book. The motivation for
the theorem is provided below.
Let M be a proper subspace of ℝn . It is an elementary fact of linear algebra (see
problem 7 on section 3.7) that there is a unit vector x orthogonal to M. In this case,
dist(x, M) = 1.
Generalizing this result to Banach spaces is more challenging because we lack
the concept of orthogonality, which is available only for inner product spaces.
The result below provides the next best alternative to the desirable property that
dist(x, M) = 1; we can pick a unit vector x whose distance from M is arbitrarily
close to 1.
Proof. Let v ∈ X − M, and let 𝛿 = dist(v, M). Since 𝜃 < 1, there exists y0 ∈ M such
that 𝛿 ≤ ‖v − y0 ‖ < 𝛿/𝜃. Define $x = \frac{v - y_0}{\|v - y_0\|}$. For y ∈ M, y0 + ‖v − y0 ‖y ∈ M and
‖v − (y0 + ‖v − y0 ‖y)‖ ≥ 𝛿. Now
\[
\|x - y\| = \Big\| \frac{v - y_0}{\|v - y_0\|} - y \Big\| = \frac{1}{\|v - y_0\|} \big\| v - y_0 - \|v - y_0\| y \big\|
= \frac{1}{\|v - y_0\|} \big\| v - (y_0 + \|v - y_0\| y) \big\| \ge \frac{\delta}{\|v - y_0\|} > \frac{\delta}{\delta/\theta} = \theta.
\]
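The construction in the proof can be traced on a small example. The sketch below assumes X = ℝ² with the Euclidean norm, M the x-axis, v = (3, 2), and 𝜃 = 0.95; these choices and the names `riesz_vector` and `dist_to_M` are illustrative only. In this Euclidean setting one could of course pick y0 = (3, 0) and obtain distance exactly 1; the point y0 = (2.5, 0) is deliberately not optimal, to mimic the general case in which only 𝛿 ≤ ‖v − y0‖ < 𝛿/𝜃 can be arranged.

```python
import math


def riesz_vector(v, y0):
    """Return x = (v - y0) / ||v - y0|| for the Euclidean norm on R^2."""
    dx, dy = v[0] - y0[0], v[1] - y0[1]
    length = math.hypot(dx, dy)
    return (dx / length, dy / length)


def dist_to_M(x):
    """Distance from x to M, where M is the x-axis in R^2."""
    return abs(x[1])


if __name__ == "__main__":
    theta = 0.95
    v = (3.0, 2.0)
    delta = dist_to_M(v)          # delta = dist(v, M) = 2
    y0 = (2.5, 0.0)               # a point of M with ||v - y0|| < delta / theta
    assert delta <= math.hypot(v[0] - y0[0], v[1] - y0[1]) < delta / theta
    x = riesz_vector(v, y0)
    print(x, math.hypot(*x), dist_to_M(x))
```

The resulting x is a unit vector whose distance from M exceeds 𝜃, as the final chain of inequalities in the proof predicts.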
Exercises
1. (a) Prove that a sequence (xn ) in a normed linear space X is Cauchy if and
only if limn (xpn − xqn ) = 0 for every pair (pn ) and (qn ) of increasing
sequences of positive integers.
(b) Show that if limn xn = x, then $\lim_n \frac{1}{n} \sum_{i=1}^{n} x_i = x$.
2. Let w be a fixed positive function in 𝒞[0, 1]. For f ∈ 𝒞[0, 1], define
‖ f ‖ = ‖ fw‖∞ . Prove that ‖.‖ is a norm, and determine if it is equivalent
to the uniform norm on 𝒞[0, 1].
3. Let X be a normed linear space. Prove that X is a Banach space if and only
if the closed unit ball in X is complete.
4. Let X be a normed linear space. Prove that X is separable if and only if the
closed unit sphere in X is separable.
5. Let (xn ) be a sequence in a Banach space X such that, for every 𝜖 > 0, there
exists a convergent sequence (yn ) in X such that ‖xn − yn ‖ < 𝜖 for all n ∈ ℕ.
Prove that (xn ) is convergent.
6. The Heine-Borel theorem. Let V be a finite-dimensional subspace of a
normed linear space X. Show that a subset K of V is compact if and only
if it is closed and bounded.
7. Let X be an infinite-dimensional normed linear space. Show that X contains
a compact countable subset that is not contained in any finite-dimensional
subspace of X. Hint: Let {x1 , x2 , ...} be an infinite independent subset of X,
and let $\xi_n = \dfrac{x_n}{n\|x_n\|}$. Consider the set {𝜉n } ∪ {0}.
8. Prove that, for 1 ≤ p < ∞, the linear dimension of $l^p$ is 𝔠. Hint: For each
0 < 𝜆 < 1, let $x_\lambda = (\lambda, \lambda^2, \lambda^3, \dots)$. Show that the set {x𝜆 } is independent, then
use example 7 on section 4.5.
10. Prove that if every absolutely convergent series in a normed linear space
X is convergent, then X is a Banach space. The proof outline is as follows.
Let (xn ) be a Cauchy sequence in X. It is enough to show that (xn ) contains
a convergent subsequence. Choose a subsequence $(x_{n_k})$ of (xn ) such that
$\|x_{n_{k+1}} - x_{n_k}\| < 2^{-k}$. Define $y_k = x_{n_{k+1}} - x_{n_k}$. Show that $\sum_{k=1}^{\infty} \|y_k\| < \infty$. By
assumption, $\sum_{k=1}^{\infty} y_k$ converges. But $\sum_{k=1}^{\infty} y_k = -x_{n_1} + \lim_k x_{n_k}$.
The boundedness of a linear transformation on a normed linear space and its conti-
nuity are used synonymously. Every linear transformation on a finite-dimensional
space is continuous. The picture is far more complicated for linear transformations
on infinite-dimensional spaces. In this chapter and the next, we study continuous
linear transformations exclusively because nonlinear transformations and discon-
tinuous linear transformations fall outside the realm of beginning linear functional
analysis.
In this section, we study the various equivalent characterizations of bounded-
ness, the space of bounded linear transformations on a normed linear space, and
the dual space in particular. The section concludes with a typical representation
theorem, which gives a concrete description of the dual of a normed linear space.
Throughout this section, X and Y are normed linear spaces.
‖T(x)‖ ≤ M‖x‖.
(c) implies (d). Suppose T is not bounded. Then, for every n ∈ ℕ, there exists
xn ∈ X such that ‖T(xn )‖ > n‖xn ‖. Let $\xi_n = \dfrac{x_n}{n\|x_n\|}$. Then limn 𝜉n = 0 in X, but
$\|T(\xi_n)\| = \dfrac{\|T(x_n)\|}{n\|x_n\|} > 1$. Thus, limn T(𝜉n ) ≠ 0 in Y, and T is not continuous at 0.
(d) implies (a). Suppose that there is a constant M > 0 such that, for every x ∈ X,
‖T(x)‖ ≤ M‖x‖, and limn xn = x in X. Then
\[
\|T\| = \sup_{x \neq 0} \frac{\|T(x)\|}{\|x\|}.
\]
Notice that since T is bounded, there is a constant M > 0 such that ‖T(x)‖ ≤ M‖x‖
for all x ∈ X; therefore $\dfrac{\|T(x)\|}{\|x\|} \le M$ for x ≠ 0, and hence ‖T‖ is finite. It also follows directly
from the definition that ‖T(x)‖ ≤ ‖T‖‖x‖.
Example 1. Let c = (cn ) be a bounded sequence and, for a sequence x = (x1 , x2 , ...) ∈
$l^2$, define T(x) = (c1 x1 , c2 x2 , ...). We claim that T is a bounded linear mapping
on $l^2$. Indeed, $\|T(x)\|_2^2 = \sum_{n=1}^{\infty} |c_n x_n|^2 \le \|c\|_\infty^2 \sum_{n=1}^{\infty} |x_n|^2 = \|c\|_\infty^2 \|x\|_2^2$. This
estimate shows that T(x) ∈ $l^2$ and that ‖T‖ ≤ ‖c‖∞ . The linearity of T is
obvious.
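The estimate in example 1 can be checked numerically on finitely supported sequences. The following is a minimal sketch, not part of the original text: the particular sequences c and x, the truncation length N, and all function names are assumptions chosen only for illustration.

```python
import math

def diagonal_operator(c, x):
    """Apply T(x) = (c_1 x_1, c_2 x_2, ...) to a finitely supported sequence."""
    return [ci * xi for ci, xi in zip(c, x)]

def norm2(x):
    """The l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def norm_inf(c):
    """The supremum norm of a finitely supported sequence."""
    return max(abs(t) for t in c)

# Hypothetical data: the first N terms of c_n = (-1)^n (1 - 1/n) and x_n = 1/n.
N = 1000
c = [(-1) ** n * (1 - 1 / n) for n in range(1, N + 1)]
x = [1 / n for n in range(1, N + 1)]

lhs = norm2(diagonal_operator(c, x))   # ||T(x)||_2
rhs = norm_inf(c) * norm2(x)           # ||c||_inf ||x||_2
print(lhs <= rhs)

# On the canonical vector e_N the ratio ||T(e_N)|| / ||e_N|| equals |c_N|.
e_N = [0.0] * (N - 1) + [1.0]
ratio = norm2(diagonal_operator(c, e_N))
```

Evaluating T on canonical vectors, as in the last two lines, is one way to see why the bound ‖T‖ ≤ ‖c‖∞ cannot be improved in general.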
\[
T(x) = \begin{cases} n y & \text{if } x = x_n, \\ 0 & \text{if } x \in S_2. \end{cases}
\]
Proof. Let $M = \sup_{\|x\| \le 1} \|T(x)\|$. For every x ∈ X, x ≠ 0, $\left\| T\!\left( \dfrac{x}{\|x\|} \right) \right\| \le M$, hence
$\dfrac{\|T(x)\|}{\|x\|} \le M$. Thus ‖T‖ ≤ M. To prove that M ≤ ‖T‖, fix a vector x ∈ X such
that 0 < ‖x‖ ≤ 1. By definition of ‖T‖, $\|T\| \ge \dfrac{\|Tx\|}{\|x\|} \ge \|Tx\|$. Since x is arbitrary,
it follows that $M = \sup_{\|x\| \le 1} \|Tx\| \le \|T\|$, as desired.
The proof that $\|T\| = \sup_{\|x\| = 1} \|T(x)\|$ is similar.
Theorem 6.2.3. Let X and Y be normed linear spaces, and let ℒ(X, Y) be the set
of all bounded linear mappings from X to Y. Then ℒ(X, Y) is a normed linear
space with the operations (T1 + T2 )(x) = T1 (x) + T2 (x), (aT)(x) = aT(x), and
the norm $\|T\| = \sup_{x \neq 0} \dfrac{\|T(x)\|}{\|x\|}$. Furthermore, if Y is a Banach space, then so is
ℒ(X, Y).
that ℒ(X, Y) is closed under addition and scalar multiplication. Verifying the rest
of the axioms for a vector space is routine.
The above inequality can also be used to verify the defining properties of
a norm. For example, taking a = b = 1 gives ‖T1 + T2 ‖ ≤ ‖T1 ‖ + ‖T2 ‖. The
identity ‖aT‖ = |a|‖T‖ is obvious.
It remains to show that ℒ(X, Y) is complete if Y is complete. Suppose (Tn )
is a Cauchy sequence in ℒ(X, Y), and let 𝜖 > 0. By assumption, there exists
a positive integer N such that, for m, n > N, ‖Tn − Tm ‖ < 𝜖. For all x ∈ X,
‖Tn (x) − Tm (x)‖ = ‖(Tn − Tm )(x)‖ ≤ ‖Tn − Tm ‖‖x‖ < 𝜖‖x‖. Thus (Tn (x)) is a
Cauchy sequence in Y, and hence limn Tn (x) exists for every x ∈ X. Define T(x) =
limn Tn (x). We show that T ∈ ℒ(X, Y). The linearity of T is straightforward;
if x, y ∈ X, and a and b are scalars, then T(ax + by) = limn Tn (ax + by) =
limn aTn (x) + bTn (y) = aT(x) + bT(y). To show that T is bounded, let 𝜖 =
1. There is a positive integer N such that, for m, n ≥ N, ‖Tn − Tm ‖ ≤ 1. In
particular, for all n ≥ N, ‖Tn (x) − TN (x)‖ ≤ ‖x‖. Hence, for all x ∈ X, and
all n ≥ N, ‖Tn (x)‖ ≤ ‖Tn (x) − TN (x)‖ + ‖TN (x)‖ ≤ ‖x‖ + ‖TN ‖‖x‖ = (1 +
‖TN ‖)‖x‖. Taking the limit as n → ∞, we obtain ‖T(x)‖ ≤ (1 + ‖TN ‖)‖x‖. Thus
‖T‖ ≤ (1 + ‖TN ‖). Finally, we show that limn ‖Tn − T‖ = 0. Let 𝜖 > 0, and let N
be such that ‖Tn − Tm ‖ < 𝜖 for all m, n > N. For all x ∈ X, ‖Tn (x) − Tm (x)‖ ≤
𝜖‖x‖. Taking the limit as m → ∞, we have ‖Tn (x) − T(x)‖ ≤ 𝜖‖x‖ for all x ∈ X,
and all n > N. Thus ‖Tn − T‖ ≤ 𝜖 for all n > N; hence limn Tn = T.
An important special case of theorem 6.2.3 is the space ℒ(X, 𝕂) of all bounded
linear functionals from a normed linear space X to the base field. This space is
known as the dual space of X , and is denoted by X∗ . Since 𝕂 is complete, X∗ is a
Banach space, even when X is not complete.
Another important special case of theorem 6.2.3 is the space ℒ(X) = ℒ(X, X)
of bounded linear transformations on a Banach space X. Elements of ℒ(X) are
also called bounded operators on X. The norm $\|T\| = \sup_{x \neq 0} \dfrac{\|T(x)\|}{\|x\|}$ is called, not
surprisingly, the operator norm on ℒ(X).
Example 5. Let ‖.‖ and ‖.‖′ be norms on a vector space X. Then ‖.‖ and ‖.‖′ are
equivalent if and only if there exist positive constants k1 and k2 such that, for
every x ∈ X, k1 ‖x‖ ≤ ‖x‖′ ≤ k2 ‖x‖. Note the contrast between this result and
exercise 12 on section 4.3.
If k1 and k2 exist, the two norms are equivalent by theorem 4.3.9. Conversely,
the equivalence of the two norms implies the bi-continuity of the identity
mapping I ∶ (X, ‖.‖) → (X, ‖.‖′ ). The continuity of I implies the existence of a
positive constant k2 (namely, the norm of I relative to the given norms) such that
‖x‖′ = ‖I(x)‖′ ≤ k2 ‖x‖. The existence of k1 is established in a similar way.
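As a concrete instance of example 5, the 1-norm and the supremum norm on 𝕂^n satisfy ‖x‖∞ ≤ ‖x‖1 ≤ n‖x‖∞, so k1 = 1 and k2 = n work. The sketch below samples random vectors to illustrate these two inequalities; the dimension, the sample size, the seed, and the function names are assumptions made only for this illustration, and sampling is of course not a proof.

```python
import random

def norm_1(x):
    """The 1-norm: the sum of the absolute values of the coordinates."""
    return sum(abs(t) for t in x)

def norm_inf(x):
    """The supremum norm: the largest absolute value of a coordinate."""
    return max(abs(t) for t in x)

random.seed(0)                       # fixed seed so the run is reproducible
n = 5                                # hypothetical dimension
samples = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(1000)]

tol = 1e-12                          # guards against floating-point rounding
ok = all(
    norm_inf(x) <= norm_1(x) + tol and norm_1(x) <= n * norm_inf(x) + tol
    for x in samples
)
print(ok)
```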
Example 8. Let X be a Banach space, and let T ∈ ℒ(X) be bounded away from
zero. Then T is one-to-one, ℜ(T) is closed in X, and $T^{-1} \colon \Re(T) \to X$ is
bounded.
if yn = 0, and $x_n = |y_n|^q / y_n$ if yn ≠ 0. Note that $\|x\|_p = \|y\|_q^{q/p} = \|y\|_q^{q-1}$; hence
x ∈ $l^p$. Now $\lambda_y(x) = \sum_{n=1}^{\infty} x_n y_n = \sum_{n=1}^{\infty} |y_n|^q = \|y\|_q^q = \|y\|_q^{q-1} \|y\|_q = \|x\|_p \|y\|_q$.
Thus ‖𝜆y ‖ = ‖y‖q .
The mapping $\Lambda \colon l^q \to (l^p)^*$ given by y ↦ 𝜆y is clearly linear, and the fact that
‖𝜆y ‖ = ‖y‖q makes Λ an isometry. It remains to show that Λ maps $l^q$ onto $(l^p)^*$.
Let 𝜆 be a bounded linear functional on $l^p$. We need to show that 𝜆 = 𝜆y for
some y ∈ $l^q$. Let en be the canonical vectors in 𝕂(ℕ). For x = (x1 , x2 , ...) ∈ $l^p$, the
sequence $\xi_n = \sum_{i=1}^{n} x_i e_i = (x_1, x_2, \dots, x_n, 0, 0, 0, \dots)$ converges to x in $l^p$. Let yn =
𝜆(en ), and let y = (yn ). First we show that y ∈ $l^q$. Let $\eta_n = (y_1, y_2, \dots, y_n, 0, 0, 0, \dots)$,
and define $\lambda_n(x) = \sum_{i=1}^{n} x_i y_i$. By the part of the theorem we already established,
$\lambda_n \in (l^p)^*$, and $\|\lambda_n\| = \|\eta_n\|_q = \left( \sum_{i=1}^{n} |y_i|^q \right)^{1/q}$. Now $|\lambda_n(x)| = |\lambda(\xi_n)| \le \|\lambda\| \|\xi_n\|_p \le \|\lambda\| \|x\|_p$;
hence ‖𝜆n ‖ ≤ ‖𝜆‖. Therefore $\left( \sum_{i=1}^{n} |y_i|^q \right)^{1/q}$ is bounded by
‖𝜆‖; hence $\sum_{n=1}^{\infty} |y_n|^q < \infty$, that is, y ∈ $l^q$. Finally, we show that 𝜆 = 𝜆y :
\[
\begin{aligned}
\lambda(x) &= \lambda(\lim_n \xi_n) = \lim_n \lambda(\xi_n) = \lim_n \lambda\Bigl( \sum_{i=1}^{n} x_i e_i \Bigr) \\
&= \lim_n \sum_{i=1}^{n} x_i \lambda(e_i) = \lim_n \sum_{i=1}^{n} x_i y_i = \sum_{n=1}^{\infty} x_n y_n = \lambda_y(x).
\end{aligned}
\]
We sometimes summarize the above result by saying that the dual of $l^p$ is $l^q$ instead
of saying that the dual of $l^p$ is isometrically isomorphic to $l^q$. This slight abuse of
language is common.
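The norm computation ‖𝜆y ‖ = ‖y‖q rests on the extremal vector with coordinates |yn|^q / yn used in the proof above. The following minimal sketch checks the resulting equality 𝜆y (x) = ‖x‖p ‖y‖q for one finitely supported real sequence; the exponent p = 3, the vector y, and the helper name p_norm are assumptions made only for this illustration.

```python
def p_norm(x, p):
    """The l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

p = 3.0
q = p / (p - 1)                   # conjugate exponent, so that 1/p + 1/q = 1
y = [2.0, -1.0, 0.0, 0.5]         # hypothetical finitely supported element of l^q

# Extremal vector from the proof: x_n = |y_n|^q / y_n if y_n != 0, and 0 otherwise.
x = [abs(t) ** q / t if t != 0 else 0.0 for t in y]

pairing = sum(a * b for a, b in zip(x, y))    # lambda_y(x) = sum of x_n y_n
product = p_norm(x, p) * p_norm(y, q)         # ||x||_p ||y||_q
print(abs(pairing - product) < 1e-9)
```

Hölder's inequality gives |𝜆y (x)| ≤ ‖x‖p ‖y‖q for every x, so the equality observed here is exactly what shows that the bound ‖y‖q is attained.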
Exercises
exists a sequence (xn ) such that xn → 0 but ‖T(xn )‖ is bounded away from
0. Consider the sequence 𝜉n = xn /√‖xn ‖.
8. Suppose X is an n-dimensional normed linear space. Prove that
dim(X∗ ) = n.
9. Let T ∶ X → Y be a bounded linear injection. Prove that the following
conditions are equivalent:
(a) T is an isometry from X onto Y.
(b) T(SX ) = SY .
(c) T(BX ) = BY .
Here BX and BY are the closed unit balls in X and Y, respectively, and SX
and SY are the unit spheres in X and Y, respectively.
10. We know that if 1 ≤ p < q ≤ ∞, then $l^p \subset l^q$; see problem 3 on section 3.6.
Let $i \colon l^p \to l^q$ be the inclusion map. Find ‖i‖.
11. Define a linear operator T ∈ ℒ(c0 ) as follows: for x = (xn ), $T(x) = \left( \dfrac{x_n}{n} \right)$. Find
‖T‖ and show that ℜ(T) is dense in c0 .
12. In connection with example 1, show that ‖T‖ = ‖c‖∞ .
13. Let X be the space of polynomials equipped with the norm $\|f\| = \sup_{0 \le x \le 1} |f(x)|$.
Prove that differentiation is an unbounded operator on X.
14. Define a function ‖.‖′ on the space of null sequences c0 by
$\|x\|' = \sum_{n=1}^{\infty} 2^{-n} |x_n|$. Here x = (xn ). Prove that the given function is a norm and
that it is not equivalent to the infinity norm on c0 . Hint: The sequence
(1, 1, … , 1, 0, 0, 0, ...) is Cauchy in ‖.‖′ .
15. Let ‖.‖1 and ‖.‖2 be equivalent norms on a Banach space X. Prove that the
closed unit balls B1 = {x ∈ X ∶ ‖x‖1 ≤ 1} and B2 = {x ∈ X ∶ ‖x‖2 ≤ 1} are
homeomorphic. Hint: Consider the function 𝜑 ∶ B1 → B2 defined by
\[
\varphi(x) = \begin{cases} \dfrac{\|x\|_1}{\|x\|_2}\, x & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}
\]
(a) 𝜆 is unbounded.
(b) There is a sequence (xn ) in X such that ‖xn ‖ = 1 and limn |𝜆(xn )| = ∞.
(c) There is a sequence (xn ) in X such that limn xn = 0 and 𝜆(xn ) = 1.
20. Let T ∶ 𝒞[0, 1] → 𝒞[0, 1] be the linear operator $(Tf)(x) = \int_0^x f(t)\,dt$. Show
that T is bounded, and find its norm.
21. Let 𝜆 ∶ 𝒞[0, 1] → 𝕂 be the linear functional $\lambda(f) = \int_0^1 f(t)\,dt$. Show that 𝜆
is bounded, and find its norm.
22. Define a linear operator Pn on the space of convergent sequences c by
Pn (x) = (x1 , x2 , … , xn , xn , xn , ...).
(a) Prove that ‖Pn ‖ = 1 and that limn Pn (x) = x for all x ∈ c.
(b) Prove that u1 = (1, 1, 1...), u2 = (0, 1, 1, 1, ...), u3 = (0, 0, 1, 1, 1, ...), ... , is
a Schauder basis for c.
23. Let {tn } be a countable dense subset of [0, 1], where t1 = 0, t2 = 1. For n ∈ ℕ,
define an operator Pn on 𝒞[0, 1] as follows: Pn f is the continuous, piecewise
linear function with nodes t1 , … , tn such that (Pn f)(ti ) = f (ti ) for 1 ≤ i ≤ n.
Show that ‖Pn ‖ = 1 for all n ∈ ℕ and that, for every f ∈ 𝒞[0, 1], limn ‖Pn f −
f ‖∞ = 0.
24. This is a continuation of the previous exercise. Define u1 (x) = 1, and, for
n ≥ 2, define un to be the continuous, piecewise linear function such that
un (tn ) = 1, and un (ti ) = 0 for 1 ≤ i ≤ n − 1. Prove that $\{u_i\}_{i=1}^{n}$ is a basis for
the range of Pn , and hence conclude that $\{u_n\}_{n=1}^{\infty}$ is a Schauder basis for
𝒞[0, 1].
Definition. Let {un } be a Schauder basis for a Banach space X. Thus every
x ∈ X has a unique representation $x = \sum_{n=1}^{\infty} a_n(x) u_n$. Define the canonical
projections Pn ∶ X → Span{u1 , … , un } by $P_n(x) = \sum_{i=1}^{n} a_i(x) u_i$. Notice that
the last three problems include examples of canonical projections. We
assume, without proof, the fact that the set {Pn } is uniformly bounded, that
is, supn ‖Pn ‖ < ∞.
25. Let {un } be a Schauder basis for a Banach space X, and consider the series
representation $x = \sum_{n=1}^{\infty} a_n u_n$ of an element x ∈ X. Each of the coefficients
an is clearly a linear functional on X. Prove that an ∈ X∗ . Hint: $a_n(x) u_n = P_n(x) - P_{n-1}(x)$.
A family of bounded linear functions {T𝛼 }𝛼∈I from X to Y such that, for each x ∈ X,
sup𝛼∈I {‖T𝛼 (x)‖} < ∞ is said to be pointwise bounded. If sup𝛼∈I ‖T𝛼 ‖ < ∞, we say
that the family {T𝛼 } is uniformly bounded.
Example 1. Let X and Y be normed linear spaces, and suppose that dim(X) <
∞. If a family of linear transformations {T𝛼 }𝛼∈I from X to Y is pointwise
bounded, then sup𝛼∈I ‖T𝛼 ‖ < ∞. To see this, fix a basis {x1 , … , xn } for X,
and use the 1-norm on X. Thus if $x = \sum_{i=1}^{n} a_i x_i$, then $\|x\| = \sum_{i=1}^{n} |a_i|$. Define
Mi = sup𝛼∈I ‖T𝛼 (xi )‖, and let M = max1≤i≤n Mi . For any 𝛼 ∈ I, we have
\[
\|T_\alpha(x)\| = \Bigl\| T_\alpha\Bigl( \sum_{i=1}^{n} a_i x_i \Bigr) \Bigr\| \le \sum_{i=1}^{n} |a_i| \|T_\alpha(x_i)\| \le \sum_{i=1}^{n} M_i |a_i| \le M \|x\|.
\]
Lemma 6.3.1. Let X be a Banach space, and let Y be a normed linear space.
Suppose {T𝛼 }𝛼∈I is a family of bounded linear functions X → Y such that, for
each x ∈ X, sup𝛼∈I {‖T𝛼 (x)‖} < ∞. Then there exists a ball B(x0 , 𝛿) such that
sup{‖T𝛼 (x)‖ ∶ x ∈ B(x0 , 𝛿), 𝛼 ∈ I} < ∞.
Proof. For each n ∈ ℕ, let Fn = ∩𝛼∈I {x ∈ X ∶ ‖T𝛼 (x)‖ ≤ n}. Note that each Fn is
closed and that $X = \bigcup_{n=1}^{\infty} F_n$. Since X is complete, Baire’s theorem forces at least
one set FN to have a nonempty interior. Thus there exists a ball B = B(x0 , 𝛿) ⊆
int (FN ) ⊆ FN . Now, for every x ∈ B and every 𝛼 ∈ I, ‖T𝛼 (x)‖ ≤ N.
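\[
\begin{aligned}
\|T_\alpha(x)\| &= \frac{2}{\delta} \Bigl\| T_\alpha\Bigl( \frac{\delta x}{2} \Bigr) \Bigr\| = \frac{2}{\delta} \Bigl\| T_\alpha\Bigl( x_0 + \frac{\delta x}{2} \Bigr) - T_\alpha(x_0) \Bigr\| \\
&\le \frac{2}{\delta} \Bigl\{ \Bigl\| T_\alpha\Bigl( x_0 + \frac{\delta x}{2} \Bigr) \Bigr\| + \|T_\alpha(x_0)\| \Bigr\} \le \frac{2}{\delta} (N + \beta).
\end{aligned}
\]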
The above theorem fails when X is not complete. See problem 1 at the end of this
section.
In the lemma below, we use the notation B𝛿 to denote an open ball in X of radius
𝛿 centered at 0. We use the same notation, in addition to the prime character,
to indicate an open ball in Y. Thus B′r denotes an open ball in Y of radius r and
centered at 0.
Lemma 6.3.3. Suppose that X and Y are Banach spaces and that T is a bounded
linear mapping from X to Y. If, for some r > 0, $B'_r \subseteq \overline{T(B_1)}$, then $B'_r \subseteq T(B_3)$.
Equivalently, $B'_{r/3} \subseteq T(B_1)$.
Proof. First observe that $B'_r \subseteq \overline{T(B_1)}$ implies that $B'_{r/2^i} \subseteq \overline{T(B_{1/2^i})}$, for every i ∈ ℕ.
Pick y ∈ B′r . There exists x1 ∈ B1 such that ‖y − T(x1 )‖ < r/2. Now
$y - T(x_1) \in B'_{r/2} \subseteq \overline{T(B_{1/2})}$, so there exists $x_2 \in B_{1/2}$ such that ‖y − T(x1 ) − T(x2 )‖ < r/4.
Continuing in this manner, we can construct a sequence (xn ) in X such that
$x_n \in B_{1/2^{n-1}}$ (i.e., $\|x_n\| < 1/2^{n-1}$), and $\|y - T(x_1) - T(x_2) - \dots - T(x_n)\| < r/2^n$.
Because $\|x_n\| < 1/2^{n-1}$, the sequence Sn = x1 + ... + xn is a Cauchy sequence in
X; hence x = limn Sn exists. Now T(x) = T(limn Sn ) = limn T(Sn ) = y, and
\[
\|x\| = \lim_n \|S_n\| = \lim_{n \to \infty} \|x_1 + \dots + x_n\| \le \lim_n \sum_{i=1}^{n} \|x_i\| \le \sum_{i=1}^{\infty} \frac{1}{2^{i-1}} = 2 < 3.
\]
We have shown that every y ∈ B′r is the image of an element x ∈ B3 . This proves the
result.
Theorem 6.3.4 (the open mapping theorem). Suppose that X and Y are Banach
spaces and that T ∶ X → Y is a bounded linear mapping from X onto Y. Then there
exists a number 𝛿 > 0 such that B′𝛿 ⊆ T(B1 ).
Proof. Since T is onto, $Y = \bigcup_{n=1}^{\infty} T(B_n)$. Baire’s theorem implies that $\overline{T(B_N)}$
has a nonempty interior for some positive integer N. Thus there exists an element
y0 ∈ Y and a positive number r such that $B(y_0, r) \subseteq \overline{T(B_N)}$. We claim that
$B'_r \subseteq \overline{T(B_{2N})}$. Let y ∈ Y be such that ‖y‖ < r, and let 𝜖 > 0. Both y0 and y0 + y
are in B(y0 , r), so there are vectors 𝜉 and 𝜂 in BN such that ‖y + y0 − T(𝜉)‖ <
𝜖/2, and ‖y0 − T(𝜂)‖ < 𝜖/2. Let x = 𝜉 − 𝜂. Then ‖x‖ ≤ ‖𝜉‖ + ‖𝜂‖ < 2N,
and ‖y − T(x)‖ = ‖y − T(𝜉 − 𝜂) − y0 + y0 ‖ ≤ ‖y + y0 − T(𝜉)‖ + ‖T(𝜂) − y0 ‖ <
𝜖/2 + 𝜖/2 = 𝜖. This proves that $B'_r \subseteq \overline{T(B_{2N})}$, which establishes our claim and
implies that $B'_{r/(2N)} \subseteq \overline{T(B_1)}$. By lemma 6.3.3, $B'_\delta \subseteq T(B_1)$, where $\delta = \dfrac{r}{6N}$.
Corollary 6.3.5. Under the assumptions of the open mapping theorem, given r > 0,
there exists 𝛿 > 0 such that B′𝛿 ⊆ T(Br ).
The following theorem justifies the name of the open mapping theorem.
Theorem 6.3.6. Under the assumptions of the open mapping theorem, T is an open
mapping.
The continuity of a function does not imply its openness. For example, the function
f(x) = sin x is continuous but not open, since the image of the open interval (0, 𝜋) is (0, 1],
which is not an open subset of ℝ.
Example 2. Under the assumptions of the open mapping theorem, there exists a
constant M > 0 such that, for every y ∈ Y, there is an element $x \in T^{-1}(y)$ such
that ‖x‖ ≤ M‖y‖. By the open mapping theorem, there exists a positive number
𝛿 such that B′𝛿 ⊆ T(B1 ). For a nonzero vector y ∈ Y, $\dfrac{\delta y}{2\|y\|} \in B'_\delta$; hence there is a
vector x1 ∈ X such that ‖x1 ‖ ≤ 1 and $T(x_1) = \dfrac{\delta y}{2\|y\|}$. Define $x = \dfrac{2 x_1 \|y\|}{\delta}$. One can
see that T(x) = y, and $\|x\| \le \dfrac{2\|y\|}{\delta}$. The constant we seek is $M = \dfrac{2}{\delta}$.
The following results represent a small sample of applications of the open mapping
theorem.
Theorem 6.3.8. Let X be a Banach space under each of the norms ‖.‖ and ‖.‖′ . If
there exists a constant 𝛼 > 0 such that ‖x‖ ≤ 𝛼‖x‖′ for every x ∈ X, then there
exists a constant 𝛽 > 0 such that ‖x‖′ ≤ 𝛽‖x‖ for every x ∈ X.
Proof. Consider the identity mapping IX ∶ (X, ‖.‖′ ) → (X, ‖.‖). The assumption
‖x‖ ≤ 𝛼‖x‖′ is equivalent to the boundedness of IX . By theorem 6.3.7, the inverse
of IX is also continuous. Thus IX ∶ (X, ‖.‖) → (X, ‖.‖′ ) is bounded; that is, there
exists a positive constant 𝛽 such that, for all x ∈ X, ‖x‖′ ≤ 𝛽‖x‖.
Definition. Let (X, d) and (Y, 𝜌) be metric spaces, and let T ∶ X → Y. The graph
of T is the subset G = {(x, T(x)) ∶ x ∈ X} of X × Y. We say that the graph of T is
closed if G is closed in the product metric on X × Y.
Recall that a sequence (xn , yn ) ∈ X × Y converges to (x, y) if and only if xn → x
and yn → y. Thus the graph of T is closed if whenever xn → x, and T(xn ) → y,
then (x, y) ∈ G, or simply y = T(x). It is a simple exercise to verify that if T is
continuous, then the graph of T is closed in X × Y. For Banach spaces and linear
mappings, the converse is true.
Theorem 6.3.9 (the closed graph theorem). Let X and Y be Banach spaces, and let
T be a linear mapping from X to Y. If the graph of T is closed, then T is bounded.
Proof. Define a norm on X as follows: ‖x‖′ = ‖x‖ + ‖T(x)‖. We first show that ‖.‖′
is complete, and hence (X, ‖.‖′ ) is a Banach space. If (xn ) is a Cauchy sequence
in ‖.‖′ , then, for 𝜖 > 0, there is a natural number N such that, for m, n > N,
‖xn − xm ‖′ < 𝜖. In particular, both (xn ) and (T(xn )) are Cauchy sequences in X
and Y, respectively. The completeness of X and Y guarantees that both sequences
converge, say, x = limn xn , and y = limn T(xn ). The assumption that the
graph of T is closed implies that y = T(x). Now ‖xn − x‖′ = ‖xn − x‖ + ‖T(xn ) −
T(x)‖ = ‖xn − x‖ + ‖T(xn ) − y‖ → 0 as n → ∞. This demonstrates the complete-
ness of ‖.‖′ . Now ‖x‖ ≤ ‖x‖ + ‖T(x)‖ = ‖x‖′ . By theorem 6.3.8, the two norms ‖.‖
and ‖.‖′ are equivalent; thus the boundedness of T in one norm is equivalent to its
boundedness in the other. But the boundedness of T in the ‖.‖′ norm is immediate
from the inequality ‖T(x)‖ ≤ ‖T(x)‖ + ‖x‖ = ‖x‖′ .
The following examples show that both the linearity of T and the completeness
of the spaces are needed for the closed graph theorem to hold.
Example 4. Let X = 𝒞[0, 1] be equipped with the 1-norm, and let Y = 𝒞[0, 1] be
equipped with the uniform norm. The identity function I ∶ X → Y is discon-
tinuous by example 7 on section 3.6. However, the graph of I is closed. Sup-
pose limn ‖fn − f‖1 = 0 and limn ‖I(fn ) − g‖∞ = limn ‖fn − g‖∞ = 0. Since con-
vergence in the uniform norm implies convergence in the 1-norm, limn ‖fn −
g‖1 = 0. Now the uniqueness of limits forces f = g.
Exercises
1. Let 𝜆n ∶ 𝕂(ℕ) → 𝕂 be the functional defined by 𝜆n (x) = nxn . Prove that the
set {𝜆n } is pointwise bounded but not uniformly bounded. Here x = (xn ), and
𝕂(ℕ) is given the supremum norm.
2. The Banach-Steinhaus theorem. Let X and Y be Banach spaces, and let
(Tn ) be a sequence of bounded linear mappings from X to Y such that,
for every x ∈ X, T(x) = limn Tn (x) exists. Prove that T is bounded and that
‖T‖ ≤ lim infn ‖Tn ‖. Is it necessarily true that limn Tn = T in ℒ(X, Y)?
3. Let (xn ) be a sequence such that $\sum_{n=1}^{\infty} x_n y_n$ converges for all sequences (yn ) ∈ $l^q$.
Prove that (xn ) ∈ $l^p$. Here p and q are conjugate Hölder exponents with p > 1.
4. Let (yn ) be a sequence such that $\sum_{n=1}^{\infty} x_n y_n$ converges for all sequences (xn ) that
converge to 0. Prove that $\sum_{n=1}^{\infty} |y_n| < \infty$.
5. Let X be a Banach space, and suppose that the sequence 𝜆n ∈ X∗ is pointwise
bounded. Prove that 𝜆n is equicontinuous.
6. Let M and N be closed subspaces of a Banach space (X, ‖.‖) such that
X = M ⊕ N. Thus every x ∈ X can be written uniquely as x = y + z, where
y ∈ M, z ∈ N. Define a norm on X by ‖x‖′ = ‖y‖ + ‖z‖. Prove that ‖.‖′ is
equivalent to ‖.‖.
The Hahn-Banach theorem has many guises, and one of them is an extension
theorem. The following example shows that, from the purely algebraic perspective,
extending a linear functional on a subspace M of a vector space X is a trivial task.
Compare the following example to theorem 6.4.4.
\[
\Lambda(x) = \begin{cases} \lambda(x) & \text{if } x \in S_1, \\ 0 & \text{if } x \in S_2. \end{cases}
\]
\[
\lambda(x) = \begin{cases} 1 & \text{if } x = x_0, \\ 0 & \text{if } x \in S_1 \cup S_2. \end{cases}
\]
Extend 𝜆 by linearity to a linear functional 𝜆 on X. Clearly, 𝜆(M) = 0.
Proof. Write 𝜆(x) = u(x) + iv(x). On the one hand, 𝜆(ix) = i𝜆(x) = iu(x) − v(x).
On the other hand, 𝜆(ix) = u(ix) + iv(ix). Equating the right-hand sides
of the above identities yields v(x) = −u(ix), and hence (a). Since, for any
complex number z, |Re(z)| ≤ |z|, |u(x)| ≤ |𝜆(x)|; hence ‖u‖ ≤ ‖𝜆‖. If 𝜆(x) ≠ 0,
let $\alpha = \dfrac{|\lambda(x)|}{\lambda(x)}$. Then |𝜆(x)| = 𝛼𝜆(x) = 𝜆(𝛼x) = u(𝛼x) ≤ ‖u‖‖𝛼x‖ = |𝛼|‖u‖‖x‖ =
‖u‖‖x‖. Thus ‖𝜆‖ ≤ ‖u‖, and this establishes (b).
Conversely, if u is a bounded real functional on X and 𝜆(x) = u(x) − iu(ix),
then the additivity of 𝜆 is straightforward. Now 𝜆(ix) = u(ix) − iu(−x) =
u(ix) + iu(x) = i[u(x) − iu(ix)] = i𝜆(x). Hence 𝜆((a + ib)x) = 𝜆(ax) + 𝜆(ibx) =
a𝜆(x) + i𝜆(bx) = a𝜆(x) + ib𝜆(x) = (a + ib)𝜆(x). Thus 𝜆 is complex linear. The
boundedness of 𝜆 follows from the proof of part (b).
Lemma 6.4.2. Let M be a subspace of a real normed linear space X, and let
x0 ∈ X − M. If u is a bounded real functional on M, then u has an extension U to
a bounded real functional on N = M ⊕ Span{x0 } such that ‖U‖ = ‖u‖.
Proof. Without loss of generality, assume that ‖u‖ = 1. Every element of N can
be written uniquely as x + 𝛼x0 , where x ∈ M and 𝛼 ∈ ℝ. Define U ∶ N → ℝ
by U(x + 𝛼x0 ) = u(x) + 𝛼b, where b is a constant to be determined later in
the proof. The linearity of U is obvious, and since U extends u, ‖u‖ ≤ ‖U‖. It
remains to show that ‖U‖ ≤ 1. It suffices to show that a constant b exists such that
|u(x) − b| ≤ ‖x − x0 ‖ for every x ∈ M, because then, for 𝛼 ≠ 0,
$|U(x + \alpha x_0)| = |u(x) + \alpha b| = |-\alpha| \left| \dfrac{u(x)}{-\alpha} - b \right| \le |-\alpha| \left\| \dfrac{x}{-\alpha} - x_0 \right\| = \|x + \alpha x_0\|$
(the case 𝛼 = 0 is immediate since ‖u‖ = 1); hence ‖U‖ ≤ 1. We now show
that a constant b exists such that |u(x) − b| ≤ ‖x − x0 ‖ for every x ∈ M. For
x, y ∈ M, u(x) − u(y) = u(x − y) ≤ ‖u‖‖x − y‖ = ‖x − y‖ ≤ ‖x − x0 ‖ + ‖y − x0 ‖.
Therefore u(x) − ‖x − x0 ‖ ≤ u(y) + ‖y − x0 ‖, and b1 = supx∈M {u(x) − ‖x −
Lemma 6.4.3. Let M be a subspace of a real normed linear space X, and let u be a
bounded real functional on M. Then u has a bounded real extension, U, on X such
that ‖U‖ = ‖u‖.
Proof. Consider X as a real normed linear space simply by limiting the scalar field to
ℝ, and let u be the real part of 𝜆. By lemma 6.4.1, u is a bounded real functional
on M, and ‖u‖ = ‖𝜆‖. By lemma 6.4.3, u has an extension, U, to X such that
‖U‖ = ‖u‖. Define Λ ∶ X → ℂ by Λ(x) = U(x) − iU(ix). By lemma 6.4.1, Λ is a
bounded linear functional on X, and ‖Λ‖ = ‖U‖ = ‖u‖ = ‖𝜆‖.
We now look at some applications of the Hahn-Banach theorem. The results below
are important in their own right.
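𝜆(x0 ) = 1. Now, for any x ∈ M, 𝛼 ≠ 0, $\left\| \dfrac{x}{-\alpha} - x_0 \right\| \ge \delta$. Thus
$\delta |\lambda(x + \alpha x_0)| = \delta |\alpha| \le |\alpha| \left\| \dfrac{x}{-\alpha} - x_0 \right\| = \|x + \alpha x_0\|$. Thus 𝜆 is bounded on N (‖𝜆‖ ≤ 1/𝛿).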
Extend 𝜆 to a bounded linear functional Λ on X. The functional Λ has the
desired properties. Conversely, if $x_0 \in \overline{M}$ and 𝜆 ∈ X∗ is such that 𝜆(M) = 0,
then there exists a sequence of vectors (xn ) in M such that limn xn = x0 . Now
𝜆(x0 ) = 𝜆(limn xn ) = limn 𝜆(xn ) = 0.
Example 3. Let A be a dense subset of [−𝜋, 𝜋]. For a fixed t ∈ A, the sequence
$\xi_t = \left( \dfrac{e^{int}}{n} \right)_{n=1}^{\infty}$ is in $l^2$. We claim that the subspace M = Span{𝜉t ∶ t ∈ A} is
dense in $l^2$. We use the above corollary and show that, for a bounded
linear functional 𝜆 on $l^2$, 𝜆(M) = 0 is possible only if 𝜆 = 0. By theorem
6.2.4, there exists a sequence (yn ) ∈ $l^2$ such that, for every sequence x =
(xn ) ∈ $l^2$, $\lambda(x) = \sum_{n=1}^{\infty} x_n y_n$. For every t ∈ [−𝜋, 𝜋],
\[
\sum_{n=1}^{\infty} \left| \frac{y_n}{n} e^{int} \right| = \sum_{n=1}^{\infty} \frac{|y_n|}{n} \le \|y\|_2 \Bigl\{ \sum_{n=1}^{\infty} \frac{1}{n^2} \Bigr\}^{1/2} < \infty,
\]
and the series $\sum_{n=1}^{\infty} \dfrac{y_n}{n} e^{int}$ converges absolutely and
uniformly on [−𝜋, 𝜋] to a continuous function F(t). By assumption, F vanishes
on a dense subset of [−𝜋, 𝜋], so F is identically equal to the zero function.
Theorem 4.10.5 implies that $\dfrac{y_n}{n} = 0$ for all n ∈ ℕ. Thus yn = 0, and 𝜆 = 0.
Corollary 6.4.7. Let X be a normed linear space, and let x0 ∈ X, x0 ≠ 0. Then there
exists a bounded linear functional 𝜆 on X such that 𝜆(x0 ) = ‖x0 ‖, and ‖𝜆‖ = 1.
In particular, if y ∈ X and 𝜆(y) = 0 for all 𝜆 ∈ X∗ , then y = 0.
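by $x \mapsto \hat{x}$, are obvious. We now show that $\|\hat{x}\| = \|x\|$. Since $|\hat{x}(\lambda)| = |\lambda(x)| \le \|\lambda\| \|x\|$,
we have $\dfrac{|\hat{x}(\lambda)|}{\|\lambda\|} \le \|x\|$ for 𝜆 ≠ 0. Hence $\|\hat{x}\| \le \|x\|$. To prove the reverse inequality,
we may assume that x ≠ 0. By corollary
6.4.7, there exists 𝜆 ∈ X∗ such that ‖𝜆‖ = 1, and 𝜆(x) = ‖x‖. Now $|\hat{x}(\lambda)| = |\lambda(x)| = \|x\|$.
Therefore $\|\hat{x}\| \ge \|x\|$ and $\|\hat{x}\| = \|x\|$. We have proved the following result.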
Theorem 6.4.8. Let X be a normed linear space, and let 𝜑 ∶ X → X∗∗ be the function
$\varphi(x) = \hat{x}$. Then 𝜑 is a linear isometry.
The function 𝜑 in the above theorem is known as the natural embedding of X into
X∗∗ . We use the notation X̂ to denote the range of 𝜑. Thus X̂ = {x̂ ∶ x ∈ X}.
The above theorem provides the neatest construction of the completion of a
normed linear space.
Theorem 6.4.9. Let X be a normed linear space. Then X can be linearly and
isometrically embedded as a dense subspace of a Banach space. Thus every normed
linear space has a completion.
Proof. We know that X∗∗ is a Banach space. Let X̂ be the image of X under the
natural embedding 𝜑 in theorem 6.4.8. The desired completion of X is the closure
of X̂ in X∗∗ .
Example 4. The $l^p$ spaces are reflexive for 1 < p < ∞. This follows directly from
theorem 6.2.4.
The result below is important in its own right, but it also helps us decide whether
certain spaces are reflexive.
Let {𝜆n } be a countable dense subset of X∗ . Since ‖𝜆n ‖ = sup‖x‖=1 |𝜆n (x)|, there
exist unit vectors xn ∈ X such that |𝜆n (xn )| ≥ ‖𝜆n ‖/2. Let M = Span{x1 , x2 , ...}.
We employ theorem 6.4.6. Suppose that 𝜆 ∈ X∗ is such that 𝜆(M) = 0. Let
𝜖 > 0, and pick a positive integer n such that ‖𝜆n − 𝜆‖ < 𝜖. By the definition of
xn , and the fact that 𝜆(xn ) = 0, we have ‖𝜆n ‖/2 ≤ |𝜆n (xn )| = |𝜆n (xn ) − 𝜆(xn )| =
|(𝜆n − 𝜆)(xn )| ≤ ‖𝜆n − 𝜆‖ < 𝜖. Therefore ‖𝜆‖ ≤ ‖𝜆 − 𝜆n ‖ + ‖𝜆n ‖ < 𝜖 + 2𝜖 = 3𝜖.
This means that 𝜆 = 0, and, by corollary 6.4.6, M is dense in X. Now the
countable set $\bigl\{ \sum_{i=1}^{n} a_i x_i : n \in \mathbb{N},\ a_i \in \mathbb{Q} + i\mathbb{Q} \bigr\}$ is dense in X.
The very definition suggests that not every closed subspace of a Banach space has
a closed complement. However, the following examples identify two important
special cases where closed complements are guaranteed.
Exercises
The spectrum of a square matrix A is simply its set of eigenvalues, and the
eigenvalues of A are easy to characterize. They are exactly the complex numbers 𝜆
for which the matrix A − 𝜆I is not invertible. We recall the simple fact that A − 𝜆I
is not invertible if and only if the linear operator T it generates on $\mathbb{K}^n$ is not
one-to-one, and this is the case if and only if T is not onto.
The definition of the spectrum of an operator T on an infinite-dimensional
space is exactly the same as it is for a matrix. The stark distinction here is that
not every point in the spectrum of an operator on an infinite-dimensional space
is an eigenvalue. This is because such an operator may be one-to-one but not onto
or conversely. See example 1. Thus the spectrum consists of two main parts: the
complex numbers 𝜆 for which T − 𝜆I is not one-to-one (the eigenvalues) and those
for which T − 𝜆I is one-to-one but not onto. The spectrum of an operator T often
carries valuable information about T, and, in some cases, the eigenvalues of an
operator and the corresponding eigenvectors completely define the operator.
We know that the set ℒ(X) of bounded linear operators on a Banach space X is a
Banach space. In fact, ℒ(X) is a Banach algebra with the composition of operators
as the multiplication operation. The composition of two operators S and T is
usually denoted by ST rather than S ∘ T. Property (a) is obvious, and property (b)
follows from the inequalities ‖(ST)(x)‖ = ‖S(T(x))‖ ≤ ‖S‖‖T(x)‖ ≤ ‖S‖‖T‖‖x‖.
For the convenience of the reader, we list below the properties that make ℒ(X) a
Banach algebra: for operators T, S, U ∈ ℒ(X) and all a, b ∈ 𝕂,
Example 1. The right shift operator R and the left shift operator L on l2 are,
respectively,
It is clear that R is one-to-one but not onto, while L is onto but not
one-to-one.
Definition. The spectrum, 𝜎(T), of an operator T ∈ ℒ(X) is the set of all complex
numbers 𝜆 such that T − 𝜆I is not invertible. It follows that there are two types
of points in the spectrum:
(a) Complex numbers 𝜆 such that Ker(T − 𝜆I) ≠ {0}: Such a number 𝜆 is called
an eigenvalue of T. Specifically, 𝜆 is an eigenvalue of T if there exists a nonzero
vector x such that Tx = 𝜆x. In this case, we say that x is an eigenvector of T
corresponding (or belonging) to the eigenvalue 𝜆. The set of eigenvalues of T is
known as the point spectrum of T. The set Ker(T − 𝜆I) is called the eigenspace
of T corresponding to the eigenvalue 𝜆.
(b) Complex numbers 𝜆 such that T − 𝜆I is one-to-one but not onto, that
is, ℜ(T − 𝜆I) ≠ X. We will not dwell on this part of the spectrum, since the
eigenvalues are the only important part of the spectrum for our purposes.
The complement of the spectrum of T in the complex plane is called the resolvent
set of T and is denoted 𝜌(T). Thus 𝜆 ∈ 𝜌(T) if and only if $(T - \lambda I)^{-1}$ exists. If
𝜆 ∈ 𝜌(T), we use the notation T𝜆 to denote $(T - \lambda I)^{-1}$.
Example 3. Every complex number 𝜆 in the open unit disk is an eigenvalue of the
left shift operator on l2 .
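One standard way to see the claim in example 3 is to test the geometric sequence (1, 𝜆, 𝜆², ...), which is square summable when |𝜆| < 1. The sketch below checks the eigenvalue relation on a finite truncation; it assumes the usual definition L(x1, x2, x3, ...) = (x2, x3, ...) of the left shift, and the value of 𝜆, the truncation length N, and the function name are assumptions made only for this illustration.

```python
def left_shift(x):
    """L(x_1, x_2, x_3, ...) = (x_2, x_3, ...), applied to a finite truncation."""
    return x[1:]

lam = 0.3 + 0.4j                    # hypothetical point of the open unit disk, |lam| = 0.5
N = 60
x = [lam ** k for k in range(N)]    # truncation of the sequence (1, lam, lam^2, ...)

Lx = left_shift(x)
# On the first N - 1 coordinates, L x agrees with lam * x up to rounding.
residual = max(abs(Lx[k] - lam * x[k]) for k in range(N - 1))
# The squared l^2 norm of the full sequence is the geometric sum 1 / (1 - |lam|^2).
norm_sq = sum(abs(t) ** 2 for t in x)
print(residual < 1e-12, abs(norm_sq - 1 / (1 - abs(lam) ** 2)) < 1e-9)
```

The same computation fails to produce an element of $l^2$ when |𝜆| ≥ 1, because the geometric sum then diverges.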
Lemma 6.5.1. If T ∈ ℒ(X), and ‖T‖ < 1, then $(I - T)^{-1}$ exists, $(I - T)^{-1} = \sum_{n=0}^{\infty} T^n$,
and $\|(I - T)^{-1}\| \le \dfrac{1}{1 - \|T\|}$.
Proof. First observe that $\sum_{n=0}^{\infty} \|T^n\| \le \sum_{n=0}^{\infty} \|T\|^n = \dfrac{1}{1 - \|T\|}$. Thus the series $\sum_{n=0}^{\infty} T^n$
converges to an operator S ∈ ℒ(X). Now $(I - T) \sum_{j=0}^{n} T^j = I - T^{n+1}$. Taking the
limit as n → ∞, (I − T)S = I. Similarly, S(I − T) = I; hence $(I - T)^{-1} = S$.
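In finite dimensions the series in lemma 6.5.1 can be summed numerically. The following minimal sketch does this for a 2 × 2 matrix, using the maximum absolute row sum as the operator norm; the matrix T, the number of terms, and the helper names are assumptions made only for this illustration.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def norm_inf(A):
    """Operator norm induced by the max norm on K^n: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

I = [[1.0, 0.0], [0.0, 1.0]]
T = [[0.2, 0.3], [0.1, 0.1]]         # hypothetical operator with ||T|| = 0.5 < 1

S, power = I, I                      # partial sum of the series and current power T^n
for _ in range(200):
    power = matmul(power, T)
    S = matadd(S, power)

I_minus_T = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]
check = matmul(I_minus_T, S)         # should be close to the identity matrix
bound = 1 / (1 - norm_inf(T))        # the estimate 1 / (1 - ||T||) from the lemma
print(check, norm_inf(S) <= bound)
```

Since ‖T^n‖ ≤ ‖T‖^n = 2^{-n} here, the tail of the series after 200 terms is far below machine precision, so the partial sum S serves as a stand-in for $(I - T)^{-1}$.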
Theorem 6.5.2. Let T ∈ ℒ(X). If 𝜆 ∈ ℂ and |𝜆| > ‖T‖, then 𝜆 ∈ 𝜌(T).
Proof. Since ‖T‖ < |𝜆|, ‖T/𝜆‖ < 1. By lemma 6.5.1, $(I - T/\lambda)^{-1}$ exists. Thus T − 𝜆I
is invertible since $(T - \lambda I)^{-1} = -\lambda^{-1} (I - T/\lambda)^{-1}$. Notice that, in this case,
\[
T_\lambda = (T - \lambda I)^{-1} = \frac{-1}{\lambda} \sum_{n=0}^{\infty} \frac{T^n}{\lambda^n}.
\]
Proof. We show that 𝜌(T) is an open subset of ℂ. Let 𝜆0 ∈ 𝜌(T), and let 𝜆 ∈ ℂ. Recall
the notation $T_\lambda = (T - \lambda I)^{-1}$. Now
\[
T - \lambda I = (T - \lambda_0 I) - (\lambda - \lambda_0) I = (T - \lambda_0 I)\bigl[ I - (\lambda - \lambda_0) T_{\lambda_0} \bigr].
\]
Thus r(T) is the radius of the smallest closed disk in the complex plane that
contains 𝜎(T). By theorem 6.5.3, r(T) ≤ ‖T‖. It is possible that r(T) < ‖T‖. See
problem 5 on section 7.4.
Example 4. Let $L \colon l^2 \to l^2$ be the left shift operator. It is clear that $\|L(x)\|_2 \le \|x\|_2$,
and since $\|L(e_2)\|_2 = \|e_1\|_2 = 1 = \|e_2\|_2$, ‖L‖ = 1. Therefore the spectrum of
L is contained in the closed unit disk, D. It follows directly from this and
example 3 that 𝜎(L) = D. Thus, ‖L‖ = r(L) = 1.
The last conclusion of the previous example is true for the right shift operator. We
derive it without directly computing 𝜎(R).
Before we show that the spectrum of a bounded linear operator on a Banach space
is not empty, we need to establish the following identity: for 𝜆 and 𝜇 ∈ 𝜌(T),
T𝜆 − T𝜇 = (𝜆 − 𝜇)T𝜆 T𝜇 , (1)
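A short verification of identity (1) may be helpful here. The following worked derivation is a sketch that uses only the definition $T_\lambda = (T - \lambda I)^{-1}$; it is not taken from the original text.

```latex
% Insert the identity (T - \mu I) - (T - \lambda I) = (\lambda - \mu) I between T_\lambda and T_\mu.
\begin{aligned}
T_\lambda - T_\mu
  &= T_\lambda (T - \mu I) T_\mu - T_\lambda (T - \lambda I) T_\mu \\
  &= T_\lambda \bigl[ (T - \mu I) - (T - \lambda I) \bigr] T_\mu \\
  &= T_\lambda \bigl[ (\lambda - \mu) I \bigr] T_\mu
   = (\lambda - \mu)\, T_\lambda T_\mu .
\end{aligned}
```

The first line uses $(T - \mu I) T_\mu = I$ and $T_\lambda (T - \lambda I) = I$. Interchanging the roles of 𝜆 and 𝜇 in the identity also shows that T𝜆 and T𝜇 commute whenever 𝜆 ≠ 𝜇.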
We need the following result from complex analysis, which we state without proof.
Proof. Suppose, contrary to the above statement, that 𝜎(T) = ∅. Thus 𝜌(T) = ℂ.
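For an arbitrary but fixed functional g ∈ (ℒ(X))∗ , define a function F ∶ ℂ → ℂ
by F(𝜆) = g(T𝜆 ). By identity (1), $\frac{F(\lambda) - F(\mu)}{\lambda - \mu} = g(T_\lambda T_\mu)$.
As 𝜇 → 𝜆, $g(T_\lambda T_\mu) \to g(T_\lambda^2)$. Therefore
$F'(\lambda) = \lim_{\mu \to \lambda} \frac{F(\lambda) - F(\mu)}{\lambda - \mu} = g(T_\lambda^2)$,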
and F is differentiable at every point of the complex plane. If |𝜆| ≥ 1 + ‖T‖, then,
by lemma 6.5.1,
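$\|T_\lambda\| = \frac{1}{|\lambda|} \left\| \left( \frac{T}{\lambda} - I \right)^{-1} \right\| \le \frac{1}{|\lambda|} \cdot \frac{1}{1 - \|T\|/|\lambda|} = \frac{1}{|\lambda| - \|T\|} \le 1.$ (2)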
Thus ‖T𝜆 ‖ is bounded by 1 outside the closed disk, D, of radius 1 + ‖T‖. Therefore,
outside the disk D, |F(𝜆)| = |g(T𝜆 )| ≤ ‖g‖. Because F is continuous on D, it is
bounded on D; hence F is a bounded differentiable function on the entire complex
plane. By lemma 6.5.5, F(𝜆) is constant. If 𝜖 > 0, there exists a positive constant R
such that ‖T𝜆 ‖ < 𝜖 for |𝜆| ≥ R (see inequality (2) above). Consequently, for such
𝜆, |F(𝜆)| ≤ ‖g‖𝜖. Since 𝜖 is arbitrary, and F is constant, F(𝜆) = 0 for all 𝜆 ∈ ℂ.
Now since g is an arbitrary element of (ℒ(X))∗ , T𝜆 = 0 (see corollary 6.4.7). This
is impossible because T𝜆 is invertible.
Proof. By problem 9 at the end of this section, $r(T^n) = [r(T)]^n$. Therefore
$r(T) = [r(T^n)]^{1/n} \le \|T^n\|^{1/n}$, and $r(T) \le \liminf_n \|T^n\|^{1/n}$. The proof will be complete if
we show that $\limsup_n \|T^n\|^{1/n} \le r(T)$.
Let 𝜆 ∈ ℂ be such that |𝜆| > ‖T‖. By theorem 6.5.2,
$T_\lambda = (T - \lambda I)^{-1} = -\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{T^n}{\lambda^n}$.
If g ∈ (ℒ(X))∗ , then $g(T_\lambda) = -\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$. By the proof of theorem
6.5.6, the function F(𝜆) = g(T𝜆 ) is differentiable for all 𝜆 ∈ 𝜌(T); thus the function
F(𝜆) extends the series $-\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$ to the set {𝜆 ∈ ℂ ∶ |𝜆| > r(T)}. Therefore
the series expansion $-\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$ is valid for all complex numbers 𝜆 such that
|𝜆| > r(T).3 Now, for an arbitrary real number a > r(T), the series
$-\frac{1}{a} \sum_{n=0}^{\infty} \frac{g(T^n)}{a^n}$ is convergent; hence the sequence $g(T^n/a^n)$ is bounded. Since g ∈ (ℒ(X))∗ is
arbitrary, the uniform boundedness principle implies that $T^n/a^n$ is bounded in ℒ(X).
Let K > 0 be such that $\|T^n/a^n\| \le K$. Then
$\|T^n\|^{1/n} \le K^{1/n} a$, and $\limsup_n \|T^n\|^{1/n} \le a$. Since a is an arbitrary number greater
than r(T), $\limsup_n \|T^n\|^{1/n} \le r(T)$.
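The formula $r(T) = \lim_n \|T^n\|^{1/n}$ can be watched at work on a small matrix. The sketch below is an illustration only: the upper triangular matrix T, the helper names, and the use of the max-row-sum operator norm are assumptions made for the example. Both eigenvalues of T equal 0.5, so r(T) = 0.5, while ‖T‖ = 10.5.

```python
def matmul(A, B):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm_inf(A):
    # Operator norm induced by the max norm: the largest absolute row sum.
    return max(sum(abs(v) for v in row) for row in A)

# Upper triangular, so the spectrum is {0.5} and r(T) = 0.5 (illustrative choice).
T = [[0.5, 10.0], [0.0, 0.5]]

def root_of_power_norm(T, n):
    # Returns ||T^n||^(1/n) for the max-row-sum operator norm.
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        P = matmul(P, T)
    return op_norm_inf(P) ** (1.0 / n)

values = {n: root_of_power_norm(T, n) for n in (1, 10, 100, 400)}
```

The entries of `values` decrease from ‖T‖ = 10.5 toward 0.5 as n grows, which also illustrates that r(T) < ‖T‖ is possible.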
Exercises
Theorem 6.6.1. Let X be a Banach space, let x ∈ X, 𝜆 ∈ X∗ , and let T ∈ ℒ(X). Then
Proof. (a) and (b) are previously established facts in new notation. To prove (c),
Example 1. In this example, we use theorem 6.2.4 and identify (l1 )∗ with l∞ . For
elements x = (xn ) ∈ l1 and 𝜆 = (𝜆n ) ∈ l∞ , define T(x) = (x2 , x3 , ...) and S(𝜆) =
(0, 𝜆1 , 𝜆2 , ...). Clearly, T ∈ ℒ(l1 ) and S ∈ ℒ(l∞ ). We claim that S = T∗ . We need
to verify that 𝜆 ∘ T = S(𝜆), which is straightforward since, for x ∈ l1 ,
$\lambda(T(x)) = \sum_{n=1}^{\infty} \lambda_n x_{n+1} = (S(\lambda))(x)$.
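The identity 𝜆(T(x)) = (S(𝜆))(x) can be tested on finitely supported sequences. The following is a small sketch under that simplifying assumption; the lists `x` and `lam` and the function names are hypothetical stand-ins for elements of l1 and l∞.

```python
def T(x):
    # Left shift on a finitely supported sequence: (x1, x2, x3, ...) -> (x2, x3, ...).
    return x[1:]

def S(lam):
    # Right shift: (l1, l2, ...) -> (0, l1, l2, ...).
    return [0.0] + lam

def pairing(lam, x):
    # lam(x) = sum of lam_n * x_n. zip stops at the shorter list, which is
    # harmless here because the missing entries of x are zero.
    return sum(l * v for l, v in zip(lam, x))

x = [1.0, -2.0, 0.5, 4.0]       # stands in for an element of l1
lam = [3.0, 1.0, -1.0, 2.0]     # stands in for an element of l-infinity

lhs = pairing(lam, T(x))        # lam(T(x))
rhs = pairing(S(lam), x)        # (S(lam))(x)
```

Both `lhs` and `rhs` evaluate to the same number, as the claim S = T∗ requires.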
The next example utilizes several of the ideas of sections 6.3 and 6.4.
Theorem 6.6.3. Let T ∈ ℒ(X). Then, for every x ∈ X, $\widehat{Tx} = T^{**}(\hat{x})$.
Loosely interpreted, the above theorem says that T is the restriction of T∗∗ to X.
Example 5. Let M be the set of all sequences in l1 where every even term is 0. We
claim that M⟂ is the set S of all the sequences in l∞ where every odd term is 0. It
is clear that if 𝜆 = (𝜆n ) ∈ S, then, for every x = (xn ) ∈ M, $\sum_{n=1}^{\infty} x_n \lambda_n = 0$; hence
S ⊆ M⟂ . Conversely, if 𝜆 = (𝜆n ) ∈ M⟂ , then $\lambda(e_{2n-1}) = 0$ for every positive
integer n, because $e_{2n-1} \in M$. This means that $\lambda_{2n-1} = 0$, and 𝜆 ∈ S. Observe that M⟂ is a closed
subspace of l∞ , consistent with the theorem below.
Theorem 6.6.5. Let T ∈ ℒ(X), let 𝒩(T) and ℜ(T) be the kernel and range of T,
respectively, and let 𝒩(T∗ ), and ℜ(T∗ ) be the kernel and range of T∗ . Then
Proof. (a) x ∈ 𝒩(T) if and only if Tx = 0, if and only if ⟨Tx, 𝜆⟩ = 0 for all 𝜆 ∈ X∗ , if
and only if ⟨x, T∗ 𝜆⟩ = 0 for all 𝜆 ∈ X∗ , if and only if x ∈ ℜ(T∗ )⊥ .
Quotient Spaces
Remarks. 1. If $\|\tilde{x}\| < \delta$, then there exists $\xi \in \tilde{x}$ such that ‖𝜉‖ < 𝛿. This is because
if $\|\tilde{x}\| < \delta$, then there exists y ∈ M such that ‖x − y‖ < 𝛿. Set 𝜉 = x − y.
2. For x ∈ X, $\|\tilde{x}\| \le \|x\|$. This is because 0 ∈ M; hence $\|x\| = \|x - 0\| \ge \|\tilde{x}\|$.
3. It follows directly from remark 2 that if (xn ) converges to x in X, then $(\tilde{x}_n)$
converges to $\tilde{x}$ in X/M. In particular, if the series $\sum_{n=1}^{\infty} \xi_n$ converges in X,
then $\sum_{n=1}^{\infty} \tilde{\xi}_n$ converges in X/M.
Exercises
1. Show that if T, S ∈ ℒ(X) and a and b are scalars, then (aT + bS)∗ = aT∗ +
bS∗ . Conclude that if X is reflexive, then the correspondence Ψ ∶ T ↦ T∗ is
an isometric isomorphism from ℒ(X) to ℒ(X∗ ).
2. If T and S are as in the above exercise, show that (ST)∗ = T∗ S∗ .
3. Let X be a Banach space. Prove that X⊥ = {0}, {0}⊥ = X∗ . State and prove the
corresponding statements for X∗ .
4. Let M be a subspace of a Banach space X. Prove that (M⊥ )⊥ = $\overline{M}$. Hint: Use
theorem 6.4.5 to show that if x ∉ $\overline{M}$, then x ∉ (M⊥ )⊥ .
5. Let X be a Banach space, and let T ∈ ℒ(X). Prove that $\overline{\mathfrak{R}(T)}$ = 𝒩(T∗ )⊥ .
Conclude that ℜ(T) is dense if and only if T∗ is one-to-one.
6. Let X be a Banach space, and let T ∈ ℒ(X). Show that if xn →w x, then
Txn →w Tx.
7. Let S and T be commuting bounded linear operators on X. Prove that the
eigenspaces of T are S-invariant.
8. Let T ∈ ℒ(X), and suppose M is a T-invariant subspace of X. Prove that M⊥
is invariant under T∗ .
9. Verify the details of the proof that the norm defined on X/M is indeed a
norm.
10. Show that if M is a closed subspace of a Banach space X, then the quotient
map 𝜋 ∶ X → X/M is continuous. Also prove that if N is a finite-dimensional
subspace of X, then 𝜋(N) is a finite-dimensional subspace of X/M.
11. In the quotient space l∞ /c0 , prove that $\|\tilde{x}\| = \limsup_n |x_n|$. Hint: For 𝜖 > 0,
there are only finitely many n ∈ ℕ such that $|x_n| > \limsup_n |x_n| + \epsilon$.
12. Let X be a Banach space, and let T ∈ ℒ(X). Define $\tilde{T} : X/\mathrm{Ker}(T) \to \mathfrak{R}(T)$
by $\tilde{T}(\tilde{x}) = T(x)$. Prove that $\tilde{T}$ is a bounded isomorphism. Hint: To show
the continuity of $\tilde{T}$, suppose $\tilde{x}_n \to 0$, and choose $x_n \in \tilde{x}_n$ such that
$\|x_n\| < \|\tilde{x}_n\| + 1/n$.
13. Let R be the right shift operator on l2 , let M1 be the range of R, and let M2 be
the range of R2 . Determine the quotient spaces l2 /M1 and l2 /M2 . Conclude
that if M1 and M2 are isomorphic closed subspaces of a Banach space X,
then it is not necessarily true that X/M1 and X/M2 are isomorphic.
14. Prove that if M is a closed subspace of a separable Banach space X, then
X/M is separable.
15. Let M be a closed subspace of a Banach space X. Prove that if X∗ is separable,
then so is M∗ .
16. Let M be a closed subspace of a Banach space X. Prove that if M and X/M
are separable, then X is separable. Hint: Let {xn } ⊆ X be such that {x̃n } is
dense in X/M, and let {ym } ⊆ M be dense in M.
17. Let X be a normed linear space, and let M be a closed subspace of X. Prove
that if M and X/M are Banach spaces, then so is X.
18. Let M be a closed subspace of a Banach space X, and let N be a finite-
dimensional subspace of X. Show that M + N is closed. Hint: Consider
𝜋 −1 (𝜋(N)), where 𝜋 ∶ X → X/M is the quotient map.
19. Let M be a complemented subspace of a Banach space X. Show that M⊥ is
complemented in X∗ . Hint: Let P be the projection of X onto M, and let
N = Ker(P). Show that M⊥ = Ker(P∗ ) and that N⊥ = ℜ(P∗ ).
20. Define a linear operator T ∈ ℒ(c0 ) as follows: for x = (xn ), $T(x) = (x_n/n)$.
Describe T∗ . Recall the result of problem 16 on section 6.2.
The weak topologies are defined in much the same way the product topology
is defined. They are designed to guarantee the continuity of a certain class of
functions. We urge the reader to look up theorem 5.4.1, the definition of the
product topology in section 5.12, and theorem 5.12.1. This section is terminal and
may be omitted without loss of continuity.
Definition. Let X be a normed linear space, and let X∗ be its dual. The weak*
topology on X∗ is the smallest topology on X∗ relative to which the functionals
x̂ are continuous. Here x̂ is the image of x ∈ X under the natural embedding of
X into X∗∗ . We use the abbreviation w∗ -topology for the weak* topology on X∗ .
Notice that the definitions of the w- and w∗ -topologies are asymmetric. Only
functionals on X∗ of the form x̂ are admitted in the definition of the w∗ -topology on
X∗ . Thus if X is not reflexive, then the functionals in X∗∗ − X̂ are not guaranteed
to be continuous in the w∗ -topology, and indeed they are not. See theorem 6.7.6.
It follows directly from the definitions that an open base for the w-topology is
the collection of sets of the form $\bigcap_{i=1}^{n} \{x \in X : |\lambda_i(x) - \lambda_i(x_0)| < r\}$, where r > 0,
x0 ∈ X, and {𝜆1 , … , 𝜆n } is a finite subset of X∗ . Similarly, an open base for the w∗ -
topology is the collection of all sets of the type $\bigcap_{i=1}^{n} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\}$,
where r > 0, 𝜆0 ∈ X∗ , and {x1 , … , xn } is a finite subset of X.
Theorem 6.7.1.
(a) A sequence (xn ) converges to x in the w-topology on a normed linear space X
if and only if limn 𝜆(xn ) = 𝜆(x) for every 𝜆 ∈ X∗ .
(b) A sequence (𝜆n ) converges to 𝜆 in the w∗ -topology if and only if limn 𝜆n (x) =
𝜆(x) for every x ∈ X.
Proof. We prove part (b). Let 𝜆n and 𝜆0 be such that limn 𝜆n (x) = 𝜆0 (x) for every
x ∈ X. We show that 𝜆n converges to 𝜆0 in the w∗ -topology. If U is a w∗ -
open neighborhood of 𝜆0 , then there exist r > 0 and a finite subset {x1 , ..., xm }
of X such that $\bigcap_{i=1}^{m} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\} \subseteq U$. Since for all 1 ≤ i ≤ m,
limn 𝜆n (xi ) = 𝜆0 (xi ), there is a natural number N such that |𝜆n (xi ) − 𝜆0 (xi )| < r
for all n > N and all 1 ≤ i ≤ m. This means that
$\lambda_n \in \bigcap_{i=1}^{m} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\} \subseteq U$, for every n > N. The proof of the converse is a partial reversal
of the above argument.
Proof. We prove part (b). We show that if U = {𝜆 ∈ X∗ ∶ ‖𝜆 − 𝜆0 ‖ < r}, then U con-
tains a w∗ -neighborhood V of 𝜆0 . Let {e1 , … , en } be a basis for X, and define a norm
on X∗ by ‖𝜆‖′ = max1≤i≤n |𝜆(ei )|. Since all norms on X∗ are equivalent, there
exists 𝛿 > 0 such that ‖𝜆‖′ < 𝛿 implies that ‖𝜆‖ < r. The w∗ -open neighborhood
V = ∩ni=1 {𝜆 ∈ X∗ ∶ |𝜆(ei ) − 𝜆0 (ei )| < 𝛿} of 𝜆0 is contained in U.
Before we can prove that the w-topology is different from the norm topology on
an infinite-dimensional normed linear space, we need a fact from general vector
space theory.
Lemma 6.7.4. Let X be a vector space, and let 𝜆 and 𝜆1 , … , 𝜆n be linear functionals
on X. If ∩ni=1 Ker(𝜆i ) ⊆ Ker(𝜆), then 𝜆 is a linear combination of 𝜆1 , … , 𝜆n .
Proof. Define 𝜋 ∶ X → 𝕂n by 𝜋(x) = (𝜆1 (x), … , 𝜆n (x)). The condition
$\bigcap_{i=1}^{n} \mathrm{Ker}(\lambda_i) \subseteq \mathrm{Ker}(\lambda)$ implies that Ker(𝜋) ⊆ Ker(𝜆). The previous lemma produces a functional
𝜓 ∶ 𝕂n → 𝕂 such that 𝜓 ∘ 𝜋 = 𝜆. Because 𝜓 is linear, there exist scalars a1 , … , an
such that for (v1 , … , vn ) ∈ 𝕂n , $\psi(v_1, \dots, v_n) = \sum_{i=1}^{n} a_i v_i$. Now, for x ∈ X,
$\lambda(x) = (\psi \circ \pi)(x) = \psi(\lambda_1(x), \dots, \lambda_n(x)) = \sum_{i=1}^{n} a_i \lambda_i(x)$.
Proof. Without loss of generality, we assume that 0 ∈ U. Then there is r > 0 and
a finite subset {𝜆1 , … , 𝜆n } of X∗ such that ∩ni=1 {x ∈ X ∶ |𝜆i (x)| < r} ⊆ U. The set
N = ∩ni=1 Ker(𝜆i ) is clearly contained in U. If N = {0}, then, for every 𝜆 ∈ X∗ ,
N ⊆ Ker(𝜆). By lemma 6.7.4, every 𝜆 ∈ X∗ would be a linear combination of
𝜆1 , … , 𝜆n , contradicting the assumption that X, hence X∗ , is infinite dimensional.
Thus N ≠ {0}, and, for any nonzero x ∈ N, the line {cx ∶ c ∈ ℝ} ⊆ N; hence U is
unbounded.
The above theorem implies that the weak and norm topologies on an infinite-
dimensional space X are distinct since no open bounded subset of X can be
weakly open.
Weak topologies are generally intricate, and good caution must be exercised when
formulating arguments involving them. In metric topologies, when one speaks
of an open neighborhood of a point x, one instinctively thinks of an open ball
centered at x. A w-open neighborhood of a point looks nothing like an open ball
since bounded subsets of X are never w-open.
We now prove that the w∗ -topology is very tight in the sense that it admits
the continuity of no linear functionals other than the functionals x̂ used in the
definition of the w∗ -topology.
Proof. Let D be the open unit disk in the complex plane. By the w∗ -continuity of
F, F−1 (D) is w∗ -open; hence it contains a w∗ -neighborhood of 0 of the form
U = ∩ni=1 {𝜆 ∈ X∗ ∶ |𝜆(xi )| < r} for some r > 0, and some finite subset {x1 , … , xn }
of X. In particular, F(U) is a bounded subset of the complex plane. We show that
$\bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i) \subseteq \mathrm{Ker}(F)$. If $\lambda \in \bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i)$, then, clearly, 𝜆 ∈ U and c𝜆 ∈ U, for
all c ∈ ℝ; hence |c||F(𝜆)| = |F(c𝜆)| is bounded for all c ∈ ℝ. This forces F(𝜆) = 0.
Therefore $\bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i) \subseteq \mathrm{Ker}(F)$. By lemma 6.7.4, F is a linear combination of
$\hat{x}_1, \dots, \hat{x}_n$; hence $F \in \hat{X}$.
$d(\lambda, \mu) = \sum_{n=1}^{\infty} 2^{-n} |\lambda(x_n) - \mu(x_n)|.$
Proof. By theorems 6.7.7 and 6.7.9, (B∗ , w∗ ) is compact and metrizable; hence, it is
separable. Since $X^* = \bigcup_{n=1}^{\infty} nB^*$, X∗ is separable in the w∗ -topology.
The converse of theorem 6.7.9 is also true. Recall (see theorem 4.9.10) that if K is
a compact metric space, then 𝒞(K) is separable.
Theorem 6.7.11. Let X be a Banach space. Then (B∗ , w∗ ) is metrizable if and only
if X is separable.
Exercises
7
Hilbert Spaces
Upon graduation from the Wilhelm Gymnasium, where he spent his final year of
schooling, Hilbert enrolled at the University of Königsberg in the autumn of 1880.
He received his Ph.D. from Königsberg in 1885, remained there as a member of
staff from 1886 to 1895, and was promoted to the rank of professor in 1893. In 1895
Hilbert was appointed to the chair of mathematics at the University of Göttingen,
where he spent the rest of his career. Among Hilbert’s numerous students were
Hermann Weyl, Felix Bernstein, Otto Blumenthal, Richard Courant, Alfred Haar,
and Hugo Steinhaus.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0007
In the analysis of mathematical talent one has to differentiate between the ability to
create new concepts that generate new types of thought structures and the gift for
sensing deeper connections and underlying unity. In Hilbert’s case, his greatness
lies in an immensely powerful insight that penetrates into the depths of a question.
All of his works contain examples from far-flung fields in which only he was able
to discern an interrelatedness and connection with the problem at hand. From
these, the synthesis, his work of art, was ultimately created. Insofar as the creation
of new ideas is concerned, I would place Minkowski higher, and of the classical
great ones, Gauss, Galois, and Riemann. But when it comes to penetrating insight,
only a few of the very greatest were the equal of Hilbert.
Hilbert retired in 1930, and the city of Königsberg made him an honorary citizen.
He gave an address which ended with famous words that now appear on his
epitaph:
Wir müssen wissen, wir werden wissen: We must know, we shall know.1
1 Perhaps as a rebuttal of du Bois-Reymond’s statement “we do not know and will not know,”
reflecting the idea that scientific knowledge is unknown and unknowable.
Example 2. The space (𝕂(ℕ), ‖.‖2 ) is not a Hilbert space. We use the fact that a
subspace of l2 is complete if and only if it is closed. Now 𝕂(ℕ) is not closed
in l2 because it contains the sequence x1 = (1, 0, 0, ...), x2 = (1, 1/2, 0, 0, ...), … ,
xn = (1, 1/2, 1/3, ..., 1/n, 0, 0, ...). The limit of the sequence (xn ) is the harmonic
sequence x = (1, 1/2, … , 1/n, ...) because $\|x_n - x\|_2^2 = \sum_{j=n+1}^{\infty} 1/j^2 \to 0$ as
n → ∞. Clearly, x ∉ 𝕂(ℕ).
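The decay of the tail sums in the example above can be tabulated. The sketch below is only an illustration; the function name is hypothetical, and it relies on the classical value $\sum_{j=1}^{\infty} 1/j^2 = \pi^2/6$, which is not quoted in the example itself.

```python
import math

def tail_norm_sq(n):
    # ||x_n - x||_2^2 = sum over j > n of 1/j^2, computed as pi^2/6
    # minus the partial sum up to n.
    partial = math.fsum(1.0 / (j * j) for j in range(1, n + 1))
    return math.pi ** 2 / 6.0 - partial

tails = {n: tail_norm_sq(n) for n in (1, 10, 100, 1000)}
```

Each value `tails[n]` lies strictly between 1/(n + 1) and 1/n, by comparison with the integral of 1/t², so the squared distances tend to 0 at the rate 1/n.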
For ease of reference, we state, without proof, a few results from section 3.7. We
urge the reader to look up the proofs and the basic definitions in section 3.7.
Not all norms are induced by an inner product. However, we have the following
result, which we limit to real normed linear spaces for simplicity.
Example 3. Suppose that (X, ‖.‖) is a real normed linear space and that the norm
satisfies the parallelogram identity. Then the function
$\langle x, y \rangle = \frac{1}{4} \left[ \|x + y\|^2 - \|x - y\|^2 \right]$
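⟨mx, y⟩ = m⟨x, y⟩ holds for all m ∈ ℤ. Using this, for all n ∈ ℕ,
$\langle x, y \rangle = \langle n \cdot \frac{1}{n} x, y \rangle = n \langle \frac{1}{n} x, y \rangle$. Equivalently, $\langle \frac{1}{n} x, y \rangle = \frac{1}{n} \langle x, y \rangle$.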
We have shown that, for all q ∈ ℚ, ⟨qx, y⟩ = q⟨x, y⟩. It is easy to see that if
limn xn = x, then limn ⟨xn , y⟩ = ⟨x, y⟩. Now the homogeneity property, ⟨𝛼x, y⟩ =
𝛼⟨x, y⟩ for 𝛼 ∈ ℝ, is immediate because ℚ is dense in ℝ.
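As a sanity check of the polarization formula, one can compare it with the dot product on ℝ3 and observe that the parallelogram identity fails for the max norm on ℝ2. This is a minimal sketch; the vectors and function names are illustrative choices, not taken from the text.

```python
import math

def norm2(v):
    # Euclidean norm, which is induced by the dot product.
    return math.sqrt(sum(c * c for c in v))

def norm_sup(v):
    # Max norm, which is not induced by any inner product.
    return max(abs(c) for c in v)

def polarization(norm, x, y):
    # (1/4) * (||x + y||^2 - ||x - y||^2)
    plus = [p + q for p, q in zip(x, y)]
    minus = [p - q for p, q in zip(x, y)]
    return 0.25 * (norm(plus) ** 2 - norm(minus) ** 2)

def parallelogram_gap(norm, x, y):
    # ||x + y||^2 + ||x - y||^2 - 2||x||^2 - 2||y||^2; zero when the identity holds.
    plus = [p + q for p, q in zip(x, y)]
    minus = [p - q for p, q in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x = [1.0, 2.0, -3.0]
y = [0.5, -1.0, 4.0]
dot_xy = sum(p * q for p, q in zip(x, y))
recovered = polarization(norm2, x, y)      # agrees with dot_xy up to rounding

gap_euclid = parallelogram_gap(norm2, x, y)
gap_sup = parallelogram_gap(norm_sup, [1.0, 0.0], [0.0, 1.0])
```

The value `recovered` matches `dot_xy`, while `gap_sup` is nonzero, so the max norm cannot come from an inner product.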
(a) A ⊆ A⊥⊥ ;
(b) if A ⊆ B, then A⊥ ⊇ B⊥ ;
(c) A⊥ is a closed subspace of H; and
(d) A⊥ = M⊥ , where M = Span(A).
Example 8 in section 4.7 is a very special case of the theorem below. Observe that
the completeness of H is crucial here.
Theorem 7.1.5. Let C be a closed convex subset of a Hilbert space H, and let
x ∈ H. Then there exists a unique element y ∈ C such that ‖x − y‖ = dist(x, C) =
infz∈C ‖x − z‖.
Now $\|(y_n - x) + (y_m - x)\|^2 = 4 \left\| x - \frac{y_n + y_m}{2} \right\|^2 \ge 4\delta^2$. The last inequality is true
because $\frac{y_n + y_m}{2} \in C$ due to the convexity of C. Thus
This shows that (yn ) is a Cauchy sequence, and hence y = limn yn exists. Since C is
closed, y ∈ C. Now 𝛿 = limn ‖x − yn ‖ = ‖x − y‖, and y is one of the closest points
in C to x. To show that y is unique, suppose z ∈ C is such that ‖x − z‖ = 𝛿. By the
parallelogram law, and as in the calculation above,
Therefore 2Re(𝛼⟨w, z⟩) ≤ |𝛼|2 . Since the above is true for an arbitrary 𝛼, choose
𝛼 = ⟨z, w⟩. We now have 2|⟨w, z⟩|2 ≤ |⟨w, z⟩|2 ; hence ⟨w, z⟩ = 0. The proof is now
complete because M ∩ M⊥ = {0}.
Example 5. Let M be the set of all sequences in l2 whose even terms are zero, and
let N be the set of all sequences in l2 whose odd terms are zero. It is easy to see
that M and N are closed subspaces of l2 and that N = M⟂ . Trivially, every vector
x = (xn ) ∈ l2 can be written as x = (x1 , 0, x3 , 0, ...) + (0, x2 , 0, x4 , ...) ∈ M ⊕ N.
The following theorem gives a complete and simple characterization of the dual
of a Hilbert space. The Riesz representation theorem basically says that a Hilbert
space is isometrically isomorphic to itself in a very natural way.
Recall that a hyperplane in ℝn is nothing other than the translation of the null-
space of a linear functional on ℝn , that all linear functionals on ℝn are continuous,
and that all maximal subspaces are closed. In infinite dimensions, the null-space
of a linear functional 𝜆 is closed if and only if 𝜆 is continuous. The following result
is the exact analog of example 10 in section 4.7.
Example 6. Let C be a closed convex subset of a real Hilbert space H, and let
a ∈ H − C. Then there exists a bounded functional 𝜆 on H and a constant b
such that 𝜆(y) < b for every y ∈ C, and 𝜆(a) > b.
The obtuse angle criterion extends to the current situation, and the proof is
identical to that in example 9 in section 4.7. Thus if z is the closest element
in C to a, then, for every y ∈ C, ⟨a − z, y − z⟩ ≤ 0. Let m = (a + z)/2, and define
n = a − z, 𝜆(x) = ⟨x, n⟩, and b = 𝜆(m). As in example 10 in section 4.7, we may
assume that m = 0; hence b = 0. It is easy to verify that 𝜆(y) < 0 for all y ∈ C
and that 𝜆(a) > 0.
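The construction in the example above can be tried on a concrete set. The following sketch assumes C is the closed unit disk in ℝ2 and a = (3, 4); both are illustrative choices, as are the helper names. For the disk, the closest point to a is the radial projection a/‖a‖. The check over a grid of sample points is a finite test, not a proof.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

a = (3.0, 4.0)                                  # a point outside the closed unit disk C
norm_a = math.sqrt(dot(a, a))
z = tuple(c / norm_a for c in a)                # closest point of C to a
n = tuple(p - q for p, q in zip(a, z))          # normal vector n = a - z
m = tuple((p + q) / 2 for p, q in zip(a, z))    # midpoint m = (a + z)/2

def lam(x):
    # The bounded functional lam(x) = <x, n>.
    return dot(x, n)

b = lam(m)

# Sample points of C on a grid and test the separation lam(y) < b < lam(a).
grid = [i / 10 for i in range(-10, 11)]
samples = [(s, t) for s in grid for t in grid if s * s + t * t <= 1.0]
separated = all(lam(y) < b for y in samples) and lam(a) > b
```

Here b is approximately 12 and 𝜆(a) is approximately 20, while 𝜆 never exceeds ‖n‖ = 4 on the disk, so the hyperplane 𝜆(x) = b separates a from C.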
Example 7. If (xn ) and (yn ) are Cauchy sequences in an inner product space, then
limn ⟨xn , yn ⟩ exists.
We prove that the sequence ⟨xn , yn ⟩ is Cauchy in ℂ; hence the limit in question
exists. Recall that Cauchy sequences are bounded. Now
Theorem 7.1.10. Let (X, ⟨., .⟩) be an incomplete inner product space. Then there
exists a Hilbert space H that contains X as a dense subspace such that the inner
product on X is the restriction of the inner product on H. If X is separable, so is H.
Proof. Let ‖.‖ be the norm on X induced by the inner product, and let H be the
completion of X with respect to the norm on X (theorem 6.4.9). Refer to the
extended norm by ‖.‖′ . For x, y ∈ H, choose sequences (xn ) and (yn ) in X such
that xn → x and yn → y, and extend the definition of the inner product to H by
⟨x, y⟩′ = limn ⟨xn , yn ⟩. We leave it to the reader to verify that the inner product
we just defined is well defined and that it is indeed an inner product. Clearly, the
inner product on H extends that on X. Finally, we prove that the extended
inner product induces the extended norm on H. For a sequence (xn ) converging to
x ∈ H, $\langle x, x \rangle' = \lim_n \langle x_n, x_n \rangle = \lim_n \|x_n\|^2 = \lim_n (\|x_n\|')^2 = (\|x\|')^2$.
Exercises
$\langle f, g \rangle = \int_0^1 f(x) g(x) \, dx$
is not complete.
2. Prove the parallelogram law and the polarization identity.
3. Let x and y be nonzero vectors in an inner product space. Prove that there
exists a unique number 0 ≤ 𝜃 ≤ 𝜋 such that $\cos\theta = \frac{\mathrm{Re}\langle x, y \rangle}{\|x\| \|y\|}$. Conclude that
$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\|x\| \|y\| \cos\theta$.
4. Prove the Apollonius identity: For vectors x, y, and z in an inner product
space, $\|z - x\|^2 + \|z - y\|^2 = \frac{1}{2} \|x - y\|^2 + 2 \left\| z - \frac{x + y}{2} \right\|^2$.
5. Let A be a subset of a Hilbert space H, and let M = Span(A). Prove that
A⊥ = M⊥ .
6. Let M be a closed subspace of a Hilbert space. Prove that M = M⊥⊥ . Give
an example to show that the result fails if M is not closed. More generally,
show that $M^{\perp\perp} = \overline{M}$ for every subspace M.
7. Show that if A is a subset of a Hilbert space H, then A⊥⊥ is the smallest
closed subspace of H containing A.
8. Let (xn ) and (yn ) be sequences in an inner product space. Prove that
(a) if limn xn = 0, and (yn ) is bounded, then limn ⟨xn , yn ⟩ = 0; and
(b) if y⊥xn for each n ∈ ℕ, and $\sum_{n=1}^{\infty} x_n$ is convergent, then $y \perp \sum_{n=1}^{\infty} x_n$.
9. Prove that if an element x in a Hilbert space is orthogonal to every vector
in a dense subset of H, then x = 0.
10. Let (xn ) be a sequence of mutually orthogonal vectors in a Hilbert space H.
Prove that $\sum_{n=1}^{\infty} x_n$ converges in H if and only if $\sum_{n=1}^{\infty} \|x_n\|^2 < \infty$.
11. Use Hilbert space methods to provide an easy proof of the Hahn-Banach
theorem for Hilbert spaces.
12. Let $M = \{x = (x_1, \dots, x_n) \in \mathbb{R}^n : \sum_{i=1}^{n} x_i = 1\}$. Show that M is closed and
convex, and find the element in M closest to the origin.
13. Let C be a closed convex subset of a Hilbert space H, let x ∈ H − C, and
let y be the closest element of C to x. Prove that, for every z ∈ C, Re⟨x − y,
z − y⟩ ≤ 0.
14. Let 𝛿n be a positive sequence, and let C = {x ∈ l2 ∶ |xn | ≤ 𝛿n }. Show that C
is compact if and only if $\sum_{n=1}^{\infty} \delta_n^2 < \infty$.
In the introduction to section 7.1, we made the case for the existence of a maximal
orthonormal sequence {u1 , u2 , ...} in a Hilbert space H. As you will see in this
section, some Hilbert spaces do not admit countable maximal orthonormal
subsets. Perhaps we must first tackle the problem of the existence of a maximal
orthogonal subset of H, then examine the problem of which Hilbert spaces possess
a countable such subset. In this section, we provide solutions to both problems
and reveal the basic structure of a Hilbert space, hence paving the way to answer
the problems posed in section 4.10.
In the theorem below, we prove a little more than the existence of an orthonormal
basis for an arbitrary Hilbert space.
Theorem 7.2.3. A Hilbert space H is separable if and only if every orthonormal basis
of H is countable.
Proof. If H is separable, then H contains a countable dense subset {x1 , x2 , ...} and,
clearly, H = ∪n∈ℕ B(xn , 1/2). If S = {u𝛼 }𝛼∈I is an orthonormal basis for H, then,
for distinct 𝛼, 𝛽 ∈ I, ‖u𝛼 − u𝛽 ‖ = √2. Since the diameter of each of the balls B(xn , 1/2)
is at most 1, no such ball can contain more than one member of S. Therefore S is at most
countable.
Conversely, if H possesses a countable orthonormal basis S = {un ∶ n ∈ ℕ},
let A be the collection of all finite linear combinations of elements in S with
coefficients in ℚ + iℚ. We claim that A is dense in H. This will conclude the
proof because A is countable. To prove the claim, let M be the closure of A. To
show that M is a subspace of H, let x, y ∈ M, and let a, b ∈ 𝕂. Then there exist
sequences (xn ) and (yn ) in A, and sequences an , bn ∈ ℚ + iℚ such that limn xn =
x, limn yn = y, limn an = a, and limn bn = b. The sequence (an xn + bn yn ) is in
A, and limn an xn + bn yn = ax + by. Therefore ax + by ∈ M. We now show that
M = H. If not, then H = M ⊕ M⊥ , and M⊥ ≠ {0}. Pick a unit vector z ∈ M⊥ . Then
S ∪ {z} is an orthonormal subset of H that properly contains S. This contradicts the
maximality of S and completes the proof.
Example 2. It is possible for a separable inner product space (hence for a separable
Hilbert space) to contain uncountably many pairs of orthogonal vectors.
Consider the space 𝒞[−𝜋, 𝜋] with the inner product $\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) g(x) \, dx$;
We focus mostly but not exclusively on separable Hilbert spaces. The existence
of inseparable Hilbert spaces of arbitrary Hilbert dimension will be presented in
the excursion at the end of this section. Many of the results we develop in this
chapter are valid for inseparable Hilbert spaces. Examples include the projection
theorem, the Riesz representation theorem, and the next three theorems. Also, in
the definition below, the set I need not be countable; hence H is not assumed to be
separable.
Proof. We only need to show that the vector $z = x - y = x - \sum_{i=1}^{n} \hat{x}_i u_i$ is in M⊥ . The
rest of the assertions follow from the projection theorem and theorem 7.2.4. Now,
for a fixed 1 ≤ j ≤ n,
$\langle z, u_j \rangle = \langle x, u_j \rangle - \sum_{i=1}^{n} \hat{x}_i \langle u_i, u_j \rangle = \langle x, u_j \rangle - \hat{x}_j = 0.$
Proof. By theorem 7.2.5, $\sum_{i=1}^{n} |\hat{x}_i|^2 \le \|x\|^2$ for each n ∈ ℕ. Taking the limit as
n → ∞ yields Bessel’s inequality.
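The two proofs above can be illustrated in ℝ3 with an orthonormal set that is not a basis. The sketch below is an assumption-laden example: the vectors u1, u2, the vector x, and the helper names are hypothetical choices made for the illustration.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

s = 1.0 / math.sqrt(2.0)
u = [(1.0, 0.0, 0.0), (0.0, s, s)]     # an orthonormal set in R^3 (not a basis)
x = (2.0, 3.0, -1.0)

coeffs = [dot(x, ui) for ui in u]       # the coefficients <x, u_i>
# y is the sum of coeffs[i] * u[i], the candidate projection of x onto Span(u).
y = tuple(sum(c * ui[k] for c, ui in zip(coeffs, u)) for k in range(3))
z = tuple(xk - yk for xk, yk in zip(x, y))    # the residual z = x - y

bessel_lhs = sum(c * c for c in coeffs)       # sum of |<x, u_i>|^2
norm_x_sq = dot(x, x)                         # ||x||^2
```

The residual `z` is orthogonal to both vectors in `u`, and `bessel_lhs` (about 6) does not exceed `norm_x_sq` (14); the difference equals ‖z‖², which is strictly positive because the set is not maximal.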
2 The set of trigonometric polynomials with rational coefficients is dense in 𝒞[−𝜋, 𝜋]. See corollary
4.10.3.
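Proof. (a) implies (b). Let $y_n = \sum_{k=1}^{n} \hat{x}_k u_k$.
For n < m, $\|y_m - y_n\|^2 = \| \sum_{k=n+1}^{m} \hat{x}_k u_k \|^2 = \sum_{k=n+1}^{m} |\hat{x}_k|^2 \to 0$ as m, n → ∞,
because $\sum_{n=1}^{\infty} |\hat{x}_n|^2 < \infty$ (Bessel’s inequality). This shows that (yn ) is a Cauchy
sequence in H; hence it converges to, say y. Thus $y = \sum_{n=1}^{\infty} \hat{x}_n u_n$. We need to show
that y = x. For a fixed k ∈ ℕ, $\langle y, u_k \rangle = \lim_{n \to \infty} \langle y_n, u_k \rangle = \hat{x}_k = \langle x, u_k \rangle$. Thus y − x is
orthogonal to each uk . If y − x ≠ 0, then $S \cup \{ \frac{y - x}{\|y - x\|} \}$ would be an orthonormal set
that properly contains S. The maximality of S forces y = x.
That (b) implies (c) is obvious, since $x = \lim_{n \to \infty} \sum_{k=1}^{n} \hat{x}_k u_k$, and each $\sum_{k=1}^{n} \hat{x}_k u_k$
is in Span(S).
(c) implies (d). Suppose, for some x ∈ H, $\|x\|^2 > \sum_{n=1}^{\infty} |\hat{x}_n|^2$, and let
$\delta^2 = \|x\|^2 - \sum_{n=1}^{\infty} |\hat{x}_n|^2$. We show that the ball B(x, 𝛿) contains no finite linear
combination of S. This will show that Span(S) is not dense in H. If $\sum_{k=1}^{n} a_k u_k \in$
Span(S), then, by theorem 7.2.5,
$\| x - \sum_{k=1}^{n} a_k u_k \|^2 \ge \| x - \sum_{k=1}^{n} \hat{x}_k u_k \|^2 = \|x\|^2 - \| \sum_{k=1}^{n} \hat{x}_k u_k \|^2 = \|x\|^2 - \sum_{k=1}^{n} |\hat{x}_k|^2 \ge \delta^2$.
(d) implies (e). The identity $\|x\|^2 = \sum_{n=1}^{\infty} |\hat{x}_n|^2$ can be written as $\|x\|^2 = \|\hat{x}\|_2^2$,
where $\hat{x} = (\hat{x}_n) \in l^2$, and $\|\hat{x}\|_2$ is the l2 -norm of $\hat{x}$. Now, assuming (d) is true, then,
for every 𝛼 ∈ 𝕂, $\langle x + \alpha y, x + \alpha y \rangle = \langle \hat{x} + \alpha \hat{y}, \hat{x} + \alpha \hat{y} \rangle$. Equivalently,
$\alpha \langle y, x \rangle + \overline{\alpha} \langle x, y \rangle = \alpha \langle \hat{y}, \hat{x} \rangle + \overline{\alpha} \langle \hat{x}, \hat{y} \rangle$.
Setting 𝛼 = 1/2, we obtain $\mathrm{Re} \langle x, y \rangle = \mathrm{Re} \langle \hat{x}, \hat{y} \rangle$.
Setting 𝛼 = 1/2i yields $\mathrm{Im} \langle x, y \rangle = \mathrm{Im} \langle \hat{x}, \hat{y} \rangle$. This proves that $\langle x, y \rangle = \langle \hat{x}, \hat{y} \rangle$, which
is equivalent to $\langle x, y \rangle = \sum_{n=1}^{\infty} \hat{x}_n \overline{\hat{y}_n}$.
(e) implies (a). Suppose there exists a unit vector u such that S ∪ {u} is orthonormal.
Then $\hat{u}_k = \langle u, u_k \rangle = 0$ for all k ∈ ℕ, and $1 = \langle u, u \rangle = \sum_{k=1}^{\infty} \hat{u}_k \overline{\hat{u}_k} = 0$. This
contradiction shows that (a) is true.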
Example 3. Every element in l2 can be written as a series $x = \sum_{n=1}^{\infty} x_n e_n$.
Consider the vectors $y_n = x - \sum_{i=1}^{n} x_i e_i = (0, 0, \dots, 0, x_{n+1}, x_{n+2}, \dots)$. Since
$\lim_n \|y_n\|^2 = \lim_n \sum_{i=n+1}^{\infty} |x_i|^2 = 0$, $x = \sum_{n=1}^{\infty} x_n e_n$.
Definition. Two Hilbert spaces H1 and H2 are isomorphic (as Hilbert spaces) if
there exists an isomorphism T ∶ H1 → H2 such that, for all x, y ∈ H1 ,
Proof. We only prove the second statement. The proof of the first statement is simpler.
Let {un } be an orthonormal basis for H. For x ∈ H, let $T(x) = (\hat{x}_n)_{n=1}^{\infty} = \hat{x}$;
T ∶ H → l2 is linear since $\widehat{(ax + by)} = a\hat{x} + b\hat{y}$. The fact that ⟨x, y⟩ = ⟨Tx, Ty⟩ is
Parseval’s identity in theorem 7.2.7. To verify that T is one-to-one, suppose that
$\hat{x} = \hat{y}$. Then $\widehat{(x - y)} = 0$, and $\sum_{n=1}^{\infty} |\hat{x}_n - \hat{y}_n|^2 = 0$. Therefore $\hat{x}_n = \hat{y}_n$. Hence, by
theorem 7.2.7, $x = \sum_{n=1}^{\infty} \hat{x}_n u_n = \sum_{n=1}^{\infty} \hat{y}_n u_n = y$; T is onto because if (an ) ∈ l2 ,
then the series $\sum_{n=1}^{\infty} a_n u_n$ converges to a vector x ∈ H such that $\hat{x} = (a_n)$. See
problem 3 at the end of this section.
The closest point property and the projection theorem (theorems 7.1.5 and 7.1.7,
respectively) are at the heart of the constructions of this chapter. An examination
of the proof of theorem 7.1.5 reveals that the parallelogram law delivers both
the existence and the uniqueness of the closest point to a closed convex set. The
parallelogram law is a direct result of the fact that the norm on a Hilbert space is
induced by an inner product, which is what sets Hilbert spaces apart from general
Banach spaces, where the closest point property fails as does the conclusion of the
projection theorem. The following simple example illustrates one of the discussion
points.
Example 5. Consider the space ℝ2 with the norm ‖x‖∞ = max{|x1 |, |x2 |}. The set
M = {x = (x1 , x2 ) ∈ ℝ2 ∶ |x1 | ≤ 1, |x2 | ≤ 1} is closed and convex. Every point on
the line segment {(1, y) ∶ |y| ≤ 1} has distance 1 from the point x = (2, 0), and
dist(x, M) = 1. There are examples where the very existence of a closest point is
not guaranteed. See problems 6–8 at the end of this section for a slight expansion
of this discussion.
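The failure of uniqueness in the example above is easy to confirm by direct computation. In the sketch below, the sample points on the edge {(1, y) ∶ |y| ≤ 1} and the function names are illustrative choices; the Euclidean distances are included only for contrast.

```python
import math

def dist_sup(p, q):
    # Distance induced by the max norm on R^2.
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def dist_euclid(p, q):
    # Distance induced by the Euclidean norm on R^2.
    return math.hypot(p[0] - q[0], p[1] - q[1])

x = (2.0, 0.0)
edge = [(1.0, t / 4) for t in range(-4, 5)]    # sample points (1, y) with |y| <= 1

sup_distances = [dist_sup(x, p) for p in edge]
euclid_distances = [dist_euclid(x, p) for p in edge]
```

Every entry of `sup_distances` equals 1, so all sampled edge points are closest points in the max norm, whereas `euclid_distances` attains its minimum only at (1, 0), as the closest point theorem for Hilbert spaces predicts.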
It was mentioned in section 6.4 that not every closed subspace of a Banach space is
complemented. Theorem 7.1.7 guarantees that every closed subspace of a Hilbert
space is complemented. Projections in Banach spaces play a similar role to orthog-
onal projections in proving that certain closed subspaces are complemented. See
problem 10 on section 6.4 for necessary and sufficient conditions for a closed
subspace of a Banach space to be complemented. Also examine example 6 in
section 6.4.
Inseparable Hilbert spaces do exist. They are mostly a curiosity and do not have
much practical use. We include the discussion below for the satisfaction of the
inquisitive reader.
The motivation for the definition below and the construction in theorem 7.2.9
is provided by the following example.
Definition. Let I be an infinite set, and let ℵ = Card(I). Define l2 (ℵ) to be the
set of all functions x ∶ I → ℂ such that x𝛼 = 0 for all but countably many 𝛼 ∈ I
and $\|x\| = (\sum_{\alpha \in I} |x_\alpha|^2)^{1/2} < \infty$. To eliminate any danger of ambiguity, let Ix =
{𝛼1 , 𝛼2 , ...} be the subset of I for which x𝛼 ≠ 0. The notation $\sum_{\alpha \in I} |x_\alpha|^2$ means
$\sum_{i=1}^{\infty} |x_{\alpha_i}|^2$. We will continue to employ this notation for the remainder of this
discussion.
Theorem 7.2.9. The set l2 (ℵ) is a Hilbert space with the operations defined within
the proof.
Proof. Let x = (x𝛼 ), and y = (y𝛼 ) ∈ l2 (ℵ). We show that x + y ∈ l2 (ℵ) and that
‖x + y‖ ≤ ‖x‖ + ‖y‖. Let Ix = {𝛼 ∈ I ∶ x𝛼 ≠ 0}, and Iy = {𝛼 ∈ I ∶ y𝛼 ≠ 0}, and
let J = Ix ∪ Iy . Since J is countable, we can write J = {𝛼1 , 𝛼2 , ...}. Note that $\hat{x} = (x_{\alpha_i})$
and $\hat{y} = (y_{\alpha_i})$ are in l2 ; hence $\|\hat{x} + \hat{y}\|_2 \le \|\hat{x}\|_2 + \|\hat{y}\|_2$. But every 𝛼 for which
x𝛼 + y𝛼 ≠ 0 is in J; hence
$\|x + y\| = (\sum_{i=1}^{\infty} |x_{\alpha_i} + y_{\alpha_i}|^2)^{1/2} = \|\hat{x} + \hat{y}\|_2 \le \|\hat{x}\|_2 + \|\hat{y}\|_2 = \|x\| + \|y\|$.
The fact that ‖ax‖ = |a|‖x‖ for all x ∈ l2 (ℵ) and all scalars a
requires an even simpler argument. The rest of the properties of a normed linear
space are easily verifiable. Thus l2 (ℵ) is a normed linear space.
Define an inner product on l2 (ℵ) as follows: $\langle x, y \rangle = \langle \hat{x}, \hat{y} \rangle = \sum_{i=1}^{\infty} x_{\alpha_i} \overline{y_{\alpha_i}}$. This
inner product induces the norm on l2 (ℵ) we defined earlier. We now show
the completeness of l2 (ℵ). Suppose $(x^{(n)})$ is a Cauchy sequence in l2 (ℵ), let
$I_n = \{\alpha \in I : x^{(n)}_\alpha \ne 0\}$, and let $J = \bigcup_{n \in \mathbb{N}} I_n$. Then J is a countable subset of I,
and we can write J = {𝛼1 , 𝛼2 , ...}. Since $\|\hat{x}^{(m)} - \hat{x}^{(n)}\|_2 = \|x^{(m)} - x^{(n)}\|$, $(\hat{x}^{(n)})$ is a
Cauchy sequence in l2 and is therefore convergent to an element $\hat{x} = (x_1, x_2, \dots) \in l^2$.
Define x ∈ l2 (ℵ) by
$x_\alpha = \begin{cases} x_i & \text{if } \alpha = \alpha_i, \\ 0 & \text{otherwise.} \end{cases}$
The reader can now anticipate the theorem that must be stated next: the set {e𝛼 }𝛼∈I
is an orthonormal basis for l2 (ℵ), where e𝛼 (𝛽) = 𝛿𝛼,𝛽 . Thus, for any cardinal
number ℵ, we have constructed a Hilbert space whose orthonormal basis has
cardinality ℵ. Such a space is also unique up to Hilbert space isomorphism in
the sense that it depends only on ℵ and not on the particular set I in the above
construction. We leave it to the interested reader to reflect on the details.
The cardinality of an orthonormal basis of a Hilbert space H is known as the
Hilbert dimension of H.
Exercises
1. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space H, and let
$\{v_n\}$ be an orthonormal set in H such that $\sum_{n=1}^{\infty} \|u_n - v_n\|^2 < 1$. Prove that
$\{v_n\}$ is an orthonormal basis for H.
2. Let $S = \{v_1, v_2, \dots\}$ be an orthonormal subset of a separable Hilbert space H
(not necessarily an orthonormal basis), and let $M = \overline{\operatorname{Span}(S)}$. Prove that if
P is the projection of H onto M, then $Px = \sum_{i=1}^{\infty} \langle x, v_i \rangle v_i$.
3. Let {un } be an orthonormal basis for a separable Hilbert space H, and let
(an ) ∈ l2 . Prove that there is an element x ∈ H such that x̂n = an .
4. Use theorems 6.2.4 and 7.2.8 to provide an alternative proof of the Riesz
representation theorem for separable Hilbert spaces.
5. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space H. Define a
function $\|\cdot\|' : H \to \mathbb{R}$ as follows: $\|x\|' = \sum_{n=1}^{\infty} 2^{-n} |\hat{x}_n|$. Show that $\|\cdot\|'$ is a
norm on H and that it is not equivalent to the original norm on H.
6. Let $X = \mathcal{C}[0, 1]$ endowed with the uniform norm, and let M be the subset
of X consisting of all functions f such that $f(0) = 0$, $f(1) = 1$, $f \ge 0$, and
$\int_0^1 f(x)\,dx = 1$. Prove that M is closed and convex and that dist(0, M) = 1.
Also show that, for every $f \in M$, $\|f\|_\infty > 1$, and hence M contains no element
of smallest norm.
9. Prove that if {un } is an orthonormal basis for a separable Hilbert space, then
(un ) converges weakly to 0.
10. Show that a norm convergent sequence is weakly convergent but not
conversely.
11. Show that the weak limit of a sequence in a Hilbert space, if it exists, is
unique.
12. Show that if limn ⟨xn , y⟩ = ⟨x, y⟩ for every y in a dense subset of H, then
xn →w x.
13. Let {un } be an orthonormal basis for a separable Hilbert space H. Prove that
xn →w x if and only if limn ⟨xn , uj ⟩ = ⟨x, uj ⟩ for every j ∈ ℕ.
14. Show that if xn →w x, and limn ‖xn ‖ = ‖x‖, then xn is norm convergent to x.
15. Prove that if xn →w x, then {‖xn ‖} is bounded and ‖x‖ ≤ lim infn ‖xn ‖. Hint:
The linear functionals 𝜆n (𝜉) = ⟨𝜉, xn ⟩ are pointwise bounded.
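16. Let $x_n \to^w x$. Prove that there exists a subsequence $(x_{n_k})$ of $(x_n)$ such that
$\frac{1}{N} \sum_{k=1}^{N} x_{n_k}$ is strongly convergent to x. Hint: Without loss of generality,
assume that x = 0. Inductively define a subsequence $(x_{n_k})$ of $(x_n)$ such that
$|\langle x_{n_i}, x_{n_k} \rangle| < 2^{-k}$ for $i = 1, \dots, k - 1$. Now
$$\left\| \frac{1}{N} \sum_{k=1}^{N} x_{n_k} \right\|^2 = \frac{1}{N^2} \sum_{k=1}^{N} \|x_{n_k}\|^2 + \frac{1}{N^2} \sum_{k=2}^{N} \sum_{i=1}^{k-1} 2\,\mathrm{Re}\langle x_{n_i}, x_{n_k} \rangle.$$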
The above equation is the defining property of the adjoint operator T∗ of T. The
reader can easily see that the definition is consistent with the definition of the
adjoint operator on a Banach space that was introduced in section 6.6.
This shows that T∗ (y1 + y2 ) = T∗ (y1 ) + T∗ (y2 ). We now show that T∗ ∈ ℒ(H).
‖T∗ y‖2 = ⟨T∗ y, T∗ y⟩ = ⟨T(T∗ y), y⟩ ≤ ‖T(T∗ y)‖‖y‖ ≤ ‖T‖‖T∗ y‖‖y‖. Thus ‖T∗ y‖ ≤
‖T‖‖y‖ for every y ∈ H. Hence T∗ ∈ ℒ(H), and ‖T∗ ‖ ≤ ‖T‖.
Proof. The computations needed to prove parts (a)–(d) are simple. As an example, we
establish part (c): $\langle T_1 T_2 x, y \rangle = \langle T_2 x, T_1^* y \rangle = \langle x, T_2^* T_1^* y \rangle$, which, by definition, means
that $(T_1 T_2)^* = T_2^* T_1^*$. We already saw that $\|T^*\| \le \|T\|$. Applying the same fact to
$T^*$ and using part (d), we have $\|T\| = \|T^{**}\| \le \|T^*\|$, thus proving (e). To prove
(f), note that $\|T^* T\| \le \|T^*\| \|T\| = \|T\|^2$. Also,
$$\|Tx\|^2 = \langle Tx, Tx \rangle = \langle x, T^* T x \rangle \le \|x\| \|T^* T x\| \le \|x\| \|T^* T\| \|x\| = \|T^* T\| \|x\|^2,$$
which implies that $\|T\|^2 \le \|T^* T\|$, and the proof of (f) is complete.
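$$\langle Px, y \rangle = \left\langle \sum_{n=1}^{N} \hat{x}_n u_n, y \right\rangle = \sum_{n=1}^{N} \hat{x}_n \overline{\hat{y}_n},$$
while
$$\langle x, Py \rangle = \sum_{n=1}^{N} \langle x, \hat{y}_n u_n \rangle = \sum_{n=1}^{N} \overline{\hat{y}_n} \langle x, u_n \rangle = \sum_{n=1}^{N} \hat{x}_n \overline{\hat{y}_n}.$$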
Projection operators are the simplest self-adjoint operators, and, in a way, are
building blocks we can use to generate more examples of self-adjoint operators.
Proof. We leave the proof of (a)–(c) to the reader. To prove (d), let (Tn ) be a sequence
of self-adjoint operators that converges in ℒ(H) to T. We show that T∗ = T. Then
Lemma 7.3.5. (a) Let H be a complex Hilbert space, and let T ∈ ℒ(H). If ⟨Tx, x⟩ =
0 for all x ∈ H, then T = 0.
(b) Let H be a real Hilbert space, and let T ∈ ℒ(H) be self-adjoint. If ⟨Tx, x⟩ = 0
for all x ∈ H, then T = 0.
Remark. Part (b) of the above lemma is false if T is not self-adjoint. For example,
if $T : \mathbb{R}^2 \to \mathbb{R}^2$ is the 90° rotation of the plane, then $\langle Tx, x \rangle = 0$ for all $x \in \mathbb{R}^2$.
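The counterexample in the remark can be checked directly. The following is a small sketch, with hypothetical helper names `rotate90` and `dot`, that evaluates the quadratic form of the 90° rotation on a few sample vectors.

```python
def rotate90(x):
    """Rotate (a, b) counterclockwise by 90 degrees: (a, b) -> (-b, a)."""
    a, b = x
    return (-b, a)

def dot(x, y):
    """Standard inner product on the real plane."""
    return x[0] * y[0] + x[1] * y[1]

samples = [(1.0, 0.0), (0.0, 1.0), (3.0, -2.0), (-1.5, 4.0)]
quadratic_form = [dot(rotate90(x), x) for x in samples]

print(quadratic_form)        # the quadratic form vanishes on every sample
print(rotate90((1.0, 0.0)))  # yet the rotation is not the zero operator
```

The cancellation is exact because the quadratic form is -ba + ab; the rotation is not self-adjoint (its adjoint is the rotation by -90°), which is why part (b) of the lemma does not apply.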
Theorem 7.3.6. Let H be a complex Hilbert space, and let T ∈ ℒ(H). Then T is self-
adjoint if and only if ⟨Tx, x⟩ is real for all x ∈ H.
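Proof. If T is self-adjoint, then $\langle Tx, x \rangle = \langle x, T^* x \rangle = \langle x, Tx \rangle = \overline{\langle Tx, x \rangle}$. Thus $\langle Tx, x \rangle$ is
real. Conversely, if $\langle Tx, x \rangle$ is real for all $x \in H$, then $\langle Tx, x \rangle = \overline{\langle Tx, x \rangle} = \overline{\langle x, T^* x \rangle} =
\langle T^* x, x \rangle$. Thus $\langle (T^* - T)x, x \rangle = 0$ for all x; hence $T^* - T = 0$, by the previous
lemma.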
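As a finite-dimensional illustration of theorem 7.3.6, the sketch below evaluates the quadratic form of a Hermitian 2 × 2 matrix and of a non-Hermitian one. The matrices, the test vectors, and the helper names `apply` and `inner` are assumptions chosen only for this example.

```python
def apply(m, x):
    """Multiply a 2 x 2 matrix (list of rows) by a vector of length 2."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def inner(x, y):
    """Inner product on C^2, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

hermitian = [[2, 1 - 1j], [1 + 1j, -3]]   # equal to its conjugate transpose
not_hermitian = [[0, 1], [0, 0]]          # a nilpotent, non-self-adjoint matrix

x = [1 + 2j, 3 - 1j]
x2 = [1, 1j]

print(inner(apply(hermitian, x), x))        # imaginary part is zero
print(inner(apply(not_hermitian, x2), x2))  # imaginary part is nonzero
```

The second matrix shows why the hypothesis must hold for all x: a single vector with a non-real value of ⟨Tx, x⟩ rules out self-adjointness.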
Thus M ≤ ‖T‖.
It follows from the definition of M that |⟨Tx, x⟩| ≤ M‖x‖2 for all x ∈ H. The
following identities are easy to verify:
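Thus
$$|\mathrm{Re}\langle Tx, y \rangle| \le \frac{1}{4} |\langle T(x + y), x + y \rangle| + \frac{1}{4} |\langle T(x - y), x - y \rangle| \le \frac{M}{4} \{\|x + y\|^2 + \|x - y\|^2\} = \frac{M}{2} \{\|x\|^2 + \|y\|^2\}.$$
$$\text{For all } x, y \in H, \quad |\mathrm{Re}\langle Tx, y \rangle| \le \frac{M}{2} \{\|x\|^2 + \|y\|^2\}. \tag{3}$$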
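If $\|x\| = 1$ and $Tx \neq 0$, let $y = \frac{Tx}{\|Tx\|}$. Then
$$\mathrm{Re}\left\langle Tx, \frac{Tx}{\|Tx\|} \right\rangle = \frac{1}{\|Tx\|} \langle Tx, Tx \rangle = \|Tx\|.$$
Applying (3) gives
$$\|Tx\| \le \frac{M}{2} \left\{ \|x\|^2 + \left\| \frac{Tx}{\|Tx\|} \right\|^2 \right\} = M.$$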
Proof. Since |𝜆| ≤ ‖T‖ for all 𝜆 ∈ 𝜎(T), it is sufficient to find an element 𝜆 ∈ 𝜎(T)
such that |𝜆| = ‖T‖. By the previous theorem, there exists a sequence of unit
vectors (xn ) such that limn |⟨Txn , xn ⟩| = ‖T‖. Thus there exists a subsequence (yn )
of (xn ) such that limn ⟨Tyn , yn ⟩ = ‖T‖, or limn ⟨Tyn , yn ⟩ = −‖T‖. Therefore there
exists a real number 𝜆 such that |𝜆| = ‖T‖ and limn ⟨Tyn , yn ⟩ = 𝜆. Now
Proof. Suppose P is the projection of H onto a closed subspace M. The fact that $P^2 = P$
has been established in theorem 7.1.8. We show that P is self-adjoint. First observe
that, for all x, y ∈ H, Px ∈ M and Py − y ∈ M⊥ ; hence ⟨Px, Py − y⟩ = 0.
Now for x, y ∈ H,
⟨Px, y⟩ = ⟨Px, y − Py⟩ + ⟨Px, Py⟩ = ⟨Px, Py⟩ = ⟨Px − x, Py⟩ + ⟨x, Py⟩ = ⟨x, Py⟩.
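Let $\epsilon > 0$. Since $\lambda_n \to 0$, there exists a positive integer N such that, for all
$n > N$, $|\lambda_n| < \epsilon$. Now, for $m > n > N$ and an arbitrary $x \in H$,
$$\left\| \sum_{k=n}^{m} \lambda_k P_k x \right\|^2 = \sum_{k=n}^{m} |\lambda_k|^2 \|P_k x\|^2 \le \epsilon^2 \sum_{k=n}^{m} \|P_k x\|^2 = \epsilon^2 \left\| \sum_{k=n}^{m} P_k x \right\|^2 \le \epsilon^2 \left\| \sum_{k=n}^{m} P_k \right\|^2 \|x\|^2 = \epsilon^2 \|x\|^2.$$
This shows that the sequence $S_n$ of partial sums of the series defining T is
Cauchy; hence the series converges, and $T \in \mathcal{L}(H)$. This proves part (a).
Now $\langle Tx, y \rangle = \langle \sum_{n=1}^{\infty} \lambda_n P_n x, y \rangle = \sum_{n=1}^{\infty} \lambda_n \langle P_n x, y \rangle = \sum_{n=1}^{\infty} \lambda_n \langle x, P_n y \rangle = \langle x, \sum_{n=1}^{\infty} \overline{\lambda_n} P_n y \rangle$;
hence we obtain the stated formula for $T^*$.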
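Thus $(\mu - \lambda_n) \langle u, Tx \rangle = 0$. Since $\mu \neq \lambda_n$, $\langle u, Tx \rangle = 0$. We have shown that $u \in M_n^\perp$
for every $n \in \mathbb{N}$. Therefore $u \perp S = \operatorname{Span}\{\cup_{n \in \mathbb{N}} M_n\}$; hence $u \perp \overline{S}$. Clearly, $\mathfrak{R}(T) \subseteq \overline{S}$;
hence $u \in \mathfrak{R}(T)^\perp$.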
Example 4. Consider the operator $T : l^2 \to l^2$ defined by $Tx = \sum_{n=1}^{\infty} \frac{\hat{x}_n}{n} e_n$. Here
$P_n$ is the projection of $l^2$ on the one-dimensional subspace spanned by $e_n$.
By the above theorem, T is self-adjoint, and the set $\{\lambda_n = \frac{1}{n} : n \in \mathbb{N}\}$ is the
entire set of nonzero eigenvalues. Since the spectrum of T is closed, $\lambda = 0$ is
in $\sigma(T)$. However, since T is injective, $\lambda = 0$ is not an eigenvalue of T. We
now show that the set $S = \{\frac{1}{n} : n \in \mathbb{N}\} \cup \{0\}$ is the entire spectrum of T. If
$\lambda \in \mathbb{C} - S$, then $\delta = \operatorname{dist}(\lambda, S) > 0$. Now $(T - \lambda I)x = \sum_{n=1}^{\infty} (\frac{1}{n} - \lambda) \hat{x}_n e_n$; hence
$$\|(T - \lambda I)(x)\|^2 = \sum_{n=1}^{\infty} \left| \frac{1}{n} - \lambda \right|^2 |\hat{x}_n|^2 \ge \delta^2 \sum_{n=1}^{\infty} |\hat{x}_n|^2 = \delta^2 \|x\|^2.$$
Thus $T - \lambda I$ is
bounded away from zero. In the same manner, the adjoint of $T - \lambda I$, namely,
$T - \overline{\lambda} I$, is bounded away from zero. Hence $T - \lambda I$ is invertible by problem 11 at the
end of this section.
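The lower bound in example 4 can be observed numerically. The sketch below assumes a vector with finitely many nonzero coordinates and approximates dist(λ, S) using finitely many of the points 1/n together with 0; the function names and the sample values of λ and x are hypothetical.

```python
import math

def apply_shifted(x, lam):
    """(T - lam I)x for the diagonal operator T e_n = (1/n) e_n; x is a finite list."""
    return [(1.0 / n - lam) * c for n, c in enumerate(x, start=1)]

def norm(x):
    """The l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def dist_to_spectrum(lam, terms=10_000):
    """Distance from lam to {1/n} together with 0, using finitely many points 1/n."""
    return min([abs(lam)] + [abs(lam - 1.0 / n) for n in range(1, terms + 1)])

lam = 0.3 + 0.4j                      # a point outside S
delta = dist_to_spectrum(lam)
x = [1.0, -2.0, 0.5j, 3.0, 0.0, 1.0 - 1.0j]

lhs = norm(apply_shifted(x, lam))
print(delta, lhs, delta * norm(x))    # lhs is at least delta * norm(x)
```

For this choice of λ the omitted points 1/n with n > 10000 lie close to 0, whose distance to λ is already included, so the truncation does not affect δ.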
(a) $\mathfrak{R}(T)^\perp = \mathcal{N}(T^*)$, and
(b) $\mathcal{N}(T^*)^\perp = \overline{\mathfrak{R}(T)}$.
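Proof. $y \in \mathfrak{R}(T)^\perp$ if and only if $\langle y, Tx \rangle = 0$ for all $x \in H$ if and only if $\langle T^* y, x \rangle = 0$
for all $x \in H$ if and only if $T^* y = 0$ if and only if $y \in \mathcal{N}(T^*)$. Part (b) follows from
(a) and $\mathcal{N}(T^*)^\perp = \mathfrak{R}(T)^{\perp\perp} = \overline{\mathfrak{R}(T)}$.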
The following example shows that the entire spectrum, not just the eigenvalues,
of a self-adjoint operator is contained in ℝ. Problem 16 at the end of this section
provides a sharper result.
The next example illustrates that theorem 7.3.11 is not the only way to construct
self-adjoint operators and that we have wide control over the design of the
spectrum.
We briefly discuss two classes of bounded operators. The section exercises extend
the discussion. The definition of a normal operator is the same as that in the finite-
dimensional case discussed in section 3.7.
Observe that every self-adjoint operator is normal and that, for an arbitrary
operator T, TT∗ and T∗ T are self-adjoint.
Now equate the square roots of the extreme quantities of the last string.
Example 9. A bounded operator T is normal if and only if, for every x ∈ H, ‖Tx‖ =
‖T∗ x‖. Consequently, 𝒩(T) = 𝒩(T∗ ).
If T is normal, then ‖Tx‖2 − ‖T∗ x‖2 = ⟨Tx, Tx⟩ − ⟨T∗ x, T∗ x⟩ = ⟨x, T∗ Tx⟩ −
⟨x, TT∗ x⟩ = ⟨x, (T∗ T − TT∗ )x⟩ = ⟨x, 0⟩ = 0.
If ‖Tx‖ = ‖T∗ x‖, then, by the above calculation, ⟨x, (T∗ T − TT∗ )x⟩ = 0 for all
x ∈ H. Since T∗ T − TT∗ is self-adjoint, it is 0, by lemma 7.3.5.
Example 10. Let $\mu$ be an arbitrary complex number, and define $T(x) = \mu x$.
It is clear that $T^* x = \overline{\mu} x$. By the previous example, T is normal because
$\|Tx\| = |\mu| \|x\| = |\overline{\mu}| \|x\| = \|\overline{\mu} x\| = \|T^* x\|$.
Observe that U is unitary if and only if U−1 = U∗ and that every unitary operator
is normal.
Example 12. For 𝜃 ∈ [0, 2𝜋), the operator U(x) = ei𝜃 x is unitary, by the previous
example.
The results of the last two examples are consistent with the fact that unitary matrices
resemble rotations of the plane. Also see the last three problems in the section
exercises.
Exercises
12. Let $\{u_n\}$ be an orthonormal basis for H, and let $\{\lambda_n\} \subseteq \mathbb{C}$ be such
that $\lim_n \lambda_n = \lambda$ and $\lambda_n \neq \lambda$ for all n. Prove that the function
$Tx = \sum_{n=1}^{\infty} \lambda_n \langle x, u_n \rangle u_n$ is a bounded linear operator on H. Prove also that each
$\lambda_n$ is an eigenvalue of T, but $\lambda$ is not.
13. Let T ∈ ℒ(H). Show that if a subspace M is invariant under T, then M⊥ is
invariant under T∗ .
14. Let R and L be the right and left shift operators on l2 , respectively.
(a) Show that R∗ = L.
(b) Describe the eigenvalues of each operator.
(c) Prove that 𝜎(R) = 𝜎(L) = the closed unit disk in the complex plane.
15. Let T be a self-adjoint operator on H, and let 𝜆 be a complex number. Prove
that 𝜆 ∈ 𝜎(T) if and only if inf‖x‖=1 ‖(T − 𝜆I)(x)‖ = 0. Hint: If there exists
a constant 𝛿 > 0 such that ‖(T − 𝜆I)(x)‖ ≥ 𝛿‖x‖ for every x ∈ H, then, by
example 8 in section 6.2, ℜ(T − 𝜆I) is closed. Show that it is also dense
in H. To prove the converse, examine the proof of theorem 7.3.8. Observe
that this result is false if T is not self-adjoint. The right shift on l2 satisfies
‖Rx‖ = ‖x‖ but 0 ∈ 𝜎(R).
16. Let T be a self-adjoint operator on a separable Hilbert space H, let
m = inf‖x‖=1 ⟨Tx, x⟩, and let M = sup‖x‖=1 ⟨Tx, x⟩. Prove that 𝜎(T) ⊆ [m, M]
and that both m and M are in 𝜎(T). Hint: Since 𝜎(T + 𝜇I) = 𝜎(T) + 𝜇, we
may assume (by considering T + 𝜇I for a sufficiently large positive constant
𝜇) that 0 ≤ m ≤ M. By theorem 7.3.7, ‖T‖ = M. Thus 𝜎(T) ⊆ [−M, M]. Let
𝛿 > 0, and let 𝜆 = m − 𝛿. For every unit vector u, ‖Tu − 𝜆u‖ ≥ ⟨Tu − 𝜆u, u⟩.
Show that ⟨Tu − 𝜆u, u⟩ ≥ 𝛿. By the previous problem, m − 𝛿 ∉ 𝜎(T). This
proves that 𝜎(T) ⊆ [m, M]. To show that M ∈ 𝜎(T), use theorem 7.3.7 to
find a sequence of unit vectors (un ) such that limn ⟨Tun , un ⟩ = M. Show that
limn Tun − Mun = 0, and again use problem 15. To show that m ∈ 𝜎(T),
assume (by considering T − 𝜇I for a sufficiently large positive constant 𝜇)
that m ≤ M ≤ 0. Apply the result you just obtained to the operator S = −T
to conclude that −m ∈ 𝜎(S).
17. Let T be a self-adjoint operator on H, let M be a closed, T-invariant
subspace of H, and let N = M⊥ . If T1 and T2 are the restrictions of T to
M and N, respectively, prove that ℜ(T) = ℜ(T1 ) ⊕ ℜ(T2 ) and that 𝜎(T) =
𝜎(T1 ) ∪ 𝜎(T2 ). Hint: Use problem 15 to show that if 𝜆 ∉ 𝜎(T1 ) ∪ 𝜎(T2 ), then
𝜆 ∉ 𝜎(T).
18. Prove that if P is the projection on a closed subspace M, then I − P is the
projection of H on M⊥ .
19. Let P be a projection. Prove that 0 and 1 are the only eigenvalues of P. What
is 𝜎(P)?
20. Let P be a projection. Show that, for all x ∈ H, ⟨Px, x⟩ = ‖Px‖2 .
In this section, we study compact operators in some depth. The culmination of the
section is the spectral theorem for compact self-adjoint operators.
Theorem 7.4.1. An operator T ∈ ℒ(H) is compact if and only if, for every bounded
sequence (xn ) in H, (T(xn )) contains a convergent subsequence.
Proof. Suppose T is compact, and let $(x_n)$ be a bounded sequence in H, say, $\|x_n\| \le r$.
By assumption, $\overline{T(B(0, r))}$ is compact and contains every $T(x_n)$. By the sequential
compactness of $\overline{T(B(0, r))}$, $(T(x_n))$ contains a convergent subsequence.
Conversely, if T is not compact, then there exists a bounded subset A such that
$\overline{T(A)}$ is not compact. In particular, $T(A)$ is not totally bounded. Thus there exists
a positive number $\epsilon$ and a sequence $(x_n)$ in A such that $\|T(x_n) - T(x_m)\| \ge \epsilon$ for all
distinct $m, n \in \mathbb{N}$. See the proof of theorem 4.7.6. We have constructed a bounded sequence
$(x_n)$ for which $(T(x_n))$ contains no convergent subsequence.
Proof. We leave it to the reader to verify that 𝒦 is a vector space. To prove that 𝒦
is closed, let T ∈ ℒ(H) be in the closure of 𝒦. Let (xn ) be a bounded sequence
in H, and suppose that ‖xn ‖ ≤ r. If 𝜖 > 0, there exists a compact operator K
such that ‖T − K‖ < 𝜖. Since K is compact, a subsequence (yn ) of (xn ) exists
such that K(yn ) is convergent. In particular, K(yn ) is a Cauchy sequence, so
there exists a positive integer N such that, for m, n > N, ‖Kyn − Kym ‖ < 𝜖. Now,
for m, n > N, ‖Tyn − Tym ‖ ≤ ‖Tyn − Kyn ‖ + ‖Kyn − Kym ‖ + ‖Kym − Tym ‖ ≤
‖T − K‖‖yn ‖ + 𝜖 + ‖K − T‖‖ym ‖ < r𝜖 + 𝜖 + r𝜖. Thus Tyn is Cauchy; hence it is
convergent.
Theorem 7.4.3. (a) If T is compact and S ∈ ℒ(H), then ST and TS are compact.
(b) If T is compact, and H is infinite dimensional, then 0 ∈ 𝜎(T).
Proof. The proof of part (a) is a straightforward application of theorem 7.4.1 and the
fact that a bounded operator maps bounded sequences into bounded sequences
and convergent sequences into convergent sequences. To prove (b), suppose
0 ∉ 𝜎(T). Then T is invertible, so there exists a bounded operator S such that
ST = I. By part (a), ST would be compact, so I would be compact, which is false
by example 2.
Theorem 7.4.4.
(a) A bounded, finite-rank operator T is compact.
(b) If T is compact and ℜ(T) is closed, then T is of finite rank.
Proof. (a) Suppose dim(ℜ(T)) < ∞. The continuity of T implies that T(A) is a
bounded subset of ℜ(T) for every bounded subset A of H. But bounded subsets
of a finite-dimensional space are relatively compact by the Heine-Borel theorem.
This proves that T is compact.
(b) If $\mathfrak{R}(T)$ is closed, then it is a Banach space, and T maps H onto $\mathfrak{R}(T)$. The open
mapping theorem implies that T is an open mapping. Coupled with the compactness
of T, this implies that the image $T(B)$ of the unit ball, B, in H is relatively compact
and contains a ball $B' = \{x \in \mathfrak{R}(T) : \|x\| < \delta\}$. In particular, the closed ball
$\overline{B'}$ in $\mathfrak{R}(T)$ is compact. This cannot happen unless $\mathfrak{R}(T)$ is finite dimensional.
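Example 6. Let $(a_{ij})$ be an infinite matrix such that $\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 < \infty$, and
define an operator T on $l^2$ as follows: for $x = (x_n) \in l^2$, the ith entry of $T(x)$ is $\sum_{j=1}^{\infty} a_{ij} x_j$. We
claim that T is compact. Observe that the assumptions imply that
$$\left| \sum_{j=1}^{\infty} a_{ij} x_j \right|^2 \le \sum_{j=1}^{\infty} |a_{ij}|^2 \sum_{j=1}^{\infty} |x_j|^2 = \|x\|^2 \sum_{j=1}^{\infty} |a_{ij}|^2.$$
Also, $\lim_{n \to \infty} \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 = 0$.
For $n \in \mathbb{N}$ let $P_n$ be the projection of $l^2$ onto the finite-dimensional subspace
$\operatorname{Span}(\{e_1, \dots, e_n\})$. Thus, for $x = (x_n) \in l^2$, $P_n(x) = (x_1, \dots, x_n, 0, 0, 0, \dots)$. Since
$P_n$ is compact, $P_n T$ is compact by theorem 7.4.3. If we show that $\lim_n \|T - P_n T\| = 0$,
the proof will be complete by theorem 7.4.2. Now
$$\|(T - P_n T)x\|^2 = \sum_{i=n+1}^{\infty} \left| \sum_{j=1}^{\infty} a_{ij} x_j \right|^2 \le \sum_{i=n+1}^{\infty} \|x\|^2 \sum_{j=1}^{\infty} |a_{ij}|^2.$$
This shows that $\|T - P_n T\|^2 \le \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2$.
Since $\lim_n \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 = 0$, we are done.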
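The tail estimate in example 6 can be observed on a finite truncation. The sketch below assumes the hypothetical matrix $a_{ij} = 1/(ij)$, cut off at 40 rows and columns and treated as the whole matrix; neither the matrix nor the helper names come from the text.

```python
def make_matrix(size):
    """Hypothetical square-summable matrix a_ij = 1 / (i * j), truncated to size x size."""
    return [[1.0 / (i * j) for j in range(1, size + 1)] for i in range(1, size + 1)]

def apply(a, x):
    """Matrix-vector product: entry i is the sum over j of a_ij * x_j."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def tail(a, n):
    """Sum of |a_ij|^2 over the rows with index greater than n."""
    return sum(aij ** 2 for row in a[n:] for aij in row)

size = 40
a = make_matrix(size)
x = [(-1.0) ** k / (k + 1) for k in range(size)]
norm_x_sq = sum(c ** 2 for c in x)
tx = apply(a, x)

for n in (1, 5, 20):
    # (T - P_n T)x keeps only the coordinates of Tx with index greater than n.
    residual_sq = sum(c ** 2 for c in tx[n:])
    print(n, residual_sq, tail(a, n) * norm_x_sq)
```

Each printed residual should sit below the corresponding bound, and the bounds shrink as n grows, which mirrors the convergence of $P_n T$ to T in the operator norm.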
Not every compact operator is of finite rank. The following theorem provides the
next best result.
Theorem 7.4.5. Every compact operator T on a separable Hilbert space H is the limit
of a sequence of finite-rank operators.
Proof. Let B be the closed unit ball in H. Since T(B) is relatively compact, for every
n ∈ ℕ, there exists a finite subset Fn of H such that T(B) ⊆ ∪y∈Fn B(y, 1/n). Let
Mn = Span{Fn }, and let Pn be the projection of H onto Mn . Finally, let Tn = Pn T.
Note that ℜ(Tn ) has finite dimension because it is contained in Mn . Thus each
$T_n$ is a finite-rank operator. We now show that, for $x \in B$, $\|T_n x - Tx\| < 2/n$.
This will prove that limn Tn = T. Fix n ∈ ℕ and write Fn = {y1 , … , yN }. If x ∈ B,
$Tx \in T(B) \subseteq \bigcup_{i=1}^{N} B(y_i, 1/n)$. Thus, for some $1 \le i \le N$, $\|Tx - y_i\| < 1/n$. Now
$\|T_n x - y_i\| = \|P_n(Tx) - P_n(y_i)\| = \|P_n(Tx - y_i)\| \le \|P_n\| \|Tx - y_i\| = \|Tx - y_i\| < 1/n$.
Finally, $\|T_n x - Tx\| \le \|T_n x - y_i\| + \|y_i - Tx\| < 1/n + 1/n = 2/n$.
‖T∗ yni − T∗ ynj ‖ = supx∈B |⟨x, T∗ (yni − ynj )⟩| = supx∈B |⟨Tx, yni − ynj ⟩|
= supx∈B |⟨Tx, yni ⟩ − ⟨Tx, ynj ⟩|
= supx∈B |𝜆ni (Tx) − 𝜆nj (Tx)|.
The uniform convergence of 𝜆nk on T(B) guarantees that the last quantity can be
made less than 𝜖 for sufficiently large integers i and j. Thus T∗ (ynk ) is Cauchy and
hence convergent.
Theorem 7.4.7 (the Riesz-Schauder theorem). Let T be compact, and let r > 0.
Then the set of eigenvalues 𝜆 of T such that |𝜆| > r is finite.
Proof. Suppose there exist infinitely many eigenvalues 𝜆n of T such that |𝜆n | > r. For
each eigenvalue 𝜆n , choose an eigenvector xn , and let Mn = Span{x1 , … , xn }. Note
that Mn is properly contained in Mn+1 and that T(Mn ) ⊆ Mn . By Riesz’s lemma,
for every n ≥ 2, there exists a unit vector yn ∈ Mn such that dist(yn , Mn−1 ) ≥ 1/2.
It is easy to verify that $(T - \lambda_m I) y_m \in M_{m-1}$. Now if $n < m$, then
$Ty_n - (T - \lambda_m I) y_m \in M_{m-1}$, so $\frac{1}{\lambda_m} [Ty_n - (T - \lambda_m I) y_m] \in M_{m-1}$, and
$$\|Ty_n - Ty_m\| = |\lambda_m| \left\| \frac{1}{\lambda_m} [Ty_n - (T - \lambda_m I) y_m] - y_m \right\| \ge |\lambda_m| \operatorname{dist}(y_m, M_{m-1}) \ge r/2.$$
Thus $(Ty_n)$
contains no convergent subsequence, contradicting the compactness of T.
(a) the set of eigenvalues of T is at most countable and can be arranged as follows:
$|\lambda_1| \ge |\lambda_2| \ge \dots$; and the set $\{\lambda_n\}$ of eigenvalues of T has no nonzero limit
points;
(b) if T has infinitely many eigenvalues, then $\lim_n \lambda_n = 0$; and
(c) if $0 \neq \lambda \in \mathbb{C}$, and $L = T - \lambda I$, then $\dim(\operatorname{Ker}(L)) < \infty$.
Proof. Let $\Lambda$ be the set of nonzero eigenvalues of T, and assume $\Lambda$ is infinite. Let r be
the spectral radius of T, and let $r_n = r/n$ ($n \in \mathbb{N}$). If $U_n$ is the complement in the
complex plane of the closed disk of radius $r_n$ centered at 0, then $\mathbb{C} - \{0\} = \bigcup_{n=1}^{\infty} U_n$,
and $\Lambda = \bigcup_{n=1}^{\infty} (\Lambda \cap U_n)$. Since each of the sets $\Lambda_n = \Lambda \cap U_n$ is finite by theorem
7.4.7, $\Lambda$ is countable. If $\Lambda$ has a nonzero limit point, z, then $z \in U_n$ for some
positive integer n. Because $U_n$ is open, it contains a disk centered at z, and such
a disk would contain infinitely many points of $\Lambda$. This would contradict the
finiteness of $\Lambda_n$, so no such point z exists. Next let $n \in \mathbb{N}$ be such that $\Lambda \cap U_n \neq \emptyset$.
Since $\Lambda \cap U_n$ is finite, the eigenvalues $\lambda$ such that $|\lambda| > r_n$ can be enumerated such
that $|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_{N_1}|$. Since $\Lambda$ is infinite, there exists an integer $m > n$ such
that $\Lambda \cap U_m$ properly contains $\Lambda \cap U_n$. Arrange the eigenvalues in $(U_m - U_n) \cap \Lambda$
in such a way that $|\lambda_{N_1+1}| \ge |\lambda_{N_1+2}| \ge \dots \ge |\lambda_{N_2}|$. Continuing in this manner, we
can enumerate all the eigenvalues in the desired fashion.
(b) Any disk centered at 0 contains all but finitely many of the points 𝜆n . This
proves part (b).
(c) Write $N_L$ for $\operatorname{Ker}(L)$. Note that $T(N_L) \subseteq N_L$; hence the restriction of T to $N_L$ is
compact. Since $Tx = \lambda x$ for all $x \in N_L$, $I = \frac{1}{\lambda} T$ on $N_L$. Thus the identity operator
on $N_L$ is compact, so $N_L$ is finite dimensional.
Now that we have established enough of the basic properties of compact operators,
we give examples of how compact operators can be constructed. We hope this will
help motivate some of the results we discuss later in the section. The following
builds on the constructions of theorem 7.3.11.
compact operators but also illustrate that we have wide control over tailoring
the spectrum, as the examples below illustrate. Also see problem 4 at the end
of this section for an example of the ultimate tailoring of the spectrum of a
bounded operator.
We will adopt the following standing assumptions for the remainder of this section:
T is a compact operator on a separable Hilbert space H, and $\lambda$ is a nonzero
complex number. We also use the following notation: $L = T - \lambda I$, $L^* = T^* - \overline{\lambda} I$;
$N_L = \operatorname{Ker}(L)$; $R_L = \mathfrak{R}(L)$; $N_{L^*} = \operatorname{Ker}(L^*)$; $R_{L^*} = \mathfrak{R}(L^*)$.
In the calculations in the rest of this section, we repeatedly use the fact that T
commutes with the powers of L. This is because the powers of T commute.
Lemma 7.4.10. Let $N_{L^n}$ denote $\operatorname{Ker}(L^n)$. Then $N_{L^n}$ is finite dimensional, and $N_{L^n} \subseteq
N_{L^{n+1}}$. Moreover, there exists an integer n such that, for every $k \ge n$, $N_{L^k} = N_{L^n}$.
$$L^n = (T - \lambda I)^n = \sum_{i=0}^{n} \binom{n}{i} T^{n-i} (-\lambda I)^i = \left( T^n - n\lambda T^{n-1} + \dots + n(-\lambda)^{n-1} T \right) - [(-1)^{n+1} \lambda^n] I.$$
Lemma 7.4.11. Let $R_{L^n} = \mathfrak{R}((T - \lambda I)^n)$. Then each $R_{L^n}$ is closed, $R_{L^n} \supseteq R_{L^{n+1}}$, and
there exists a positive integer n such that $R_{L^k} = R_{L^n}$ for all $k \ge n$.
Conversely, suppose $N_L = \{0\}$. Note that $N_{L^n} = \{0\}$, that is, $L^n$ is one-to-one. If
$R_L \neq H$, then there is an element $x \notin R_L$. In this case, for all $y \in H$, $L^n x - L^{n+1} y =
L^n (x - Ly) \neq 0$, because $x \neq Ly$ and $L^n$ is injective. Hence $L^n x \neq L^{n+1} y$ for all
$y \in H$. This means that $R_{L^n}$ strictly contains $R_{L^{n+1}}$ for all n, thus contradicting
lemma 7.4.11.
Proof. Suppose $\lambda \neq 0$ and that $\lambda$ is not an eigenvalue, that is, $N_L = \{0\}$. By proposition
7.4.12, $R_L = H$. Hence $L = T - \lambda I$ is one-to-one and onto and hence is
invertible by theorem 6.3.7.
Remark. It follows from theorem 7.4.8 and the previous theorem that the spectrum
of a compact operator on an infinite-dimensional separable Hilbert space
is $\{0, \lambda_1, \lambda_2, \dots\}$.
(a) T − 𝜆I is invertible.
(b) 𝜆 is an eigenvalue of T.
Proof. Suppose, for a contradiction, that dim(NL ) = m < n = dim(NL∗ ), and let
{u1 , … , um } and {v1 , … , vn } be orthonormal bases for NL and NL∗ respectively.
Define a finite rank operator on H by
$$F(x) = \sum_{i=1}^{m} \langle x, u_i \rangle v_i.$$
The discussion so far shows that compact operators, like self-adjoint operators,
share some properties with operators on finite-dimensional spaces. When we limit
our attention to compact, self-adjoint operators, we obtain results that directly
extend those of the finite-dimensional case.
Proof. By the proof of theorem 7.3.8, there exists a real number $\lambda$ and a sequence of
unit vectors $(y_n)$ such that $|\lambda| = \|T\|$ and $\lim_n (Ty_n - \lambda y_n) = 0$. Since T is compact,
$(y_n)$ contains a subsequence $(u_n)$ such that $(Tu_n)$ is convergent. It follows that
$(u_n)$ is convergent (it is the difference between the two convergent sequences
$\frac{1}{\lambda} Tu_n$ and $\frac{1}{\lambda} [Tu_n - \lambda u_n]$). Let $u = \lim_n u_n$. Now $\lim_n (Tu_n - \lambda u_n) = 0$; hence
$Tu - \lambda u = 0$. Since $u \neq 0$, $\lambda$ is an eigenvalue of T.
In light of theorems 7.3.8 and 7.4.13, r(T) = ‖T‖ = |𝜆1 | (the largest eigenvalue of
T). Thus the previous lemma is, in fact, redundant. However, we decided to include
it here in order to make this subsection self-contained and independent of the
Fredholm theory.
Proof. Let 𝜆1 , 𝜆2 , ... be the nonzero eigenvalues of T, and, for each n, let Bn be an
orthonormal basis for the (finite-dimensional) eigenspace, Vn , that corresponds to
𝜆n . The reader should keep in mind that the set of eigenvalues may be finite. Since
the eigenspaces are mutually orthogonal, the set B = ∪n Bn is an orthonormal
set. Let M be the closure of the span of B, and let $N = M^\perp$. Since each $V_n$ is T-invariant,
so is M. It follows that N is also T-invariant (see problem 13 in section
7.3). If $N = \{0\}$, then $M = H$ and B is the desired orthonormal basis for H.
$$Tx = \sum_{n=1}^{\infty} \lambda_n \langle x, u_n \rangle u_n.$$
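Proof. Write $x = \sum_{n=1}^{\infty} \langle x, u_n \rangle u_n$. Then
$$Tx = T\left( \sum_{n=1}^{\infty} \langle x, u_n \rangle u_n \right) = \sum_{n=1}^{\infty} \langle x, u_n \rangle Tu_n = \sum_{n=1}^{\infty} \lambda_n \langle x, u_n \rangle u_n.$$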
The spectral theorem is the exact analog of the finite-dimensional case for a
Hermitian matrix. If we define $P_n$ to be the projection on the one-dimensional
subspace spanned by $u_n$, then $P_n$ is a rank-1 operator, and $T = \sum_{n=1}^{\infty} \lambda_n P_n$. Notice
that the series $\sum_{n=1}^{\infty} \lambda_n P_n$ converges in the operator norm by theorem 7.3.11.
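In the finite-dimensional case the decomposition $T = \sum_n \lambda_n P_n$ can be checked by hand. The sketch below assumes the symmetric matrix with rows (2, 1) and (1, 2), whose eigenvalues are 3 and 1 with orthonormal eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$; the matrix and the helper names are illustrative choices, not taken from the text.

```python
import math

T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
# Pairs (lambda_n, u_n); the vectors u_n form an orthonormal basis of the plane.
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]

def dot(x, y):
    """Real inner product."""
    return sum(a * b for a, b in zip(x, y))

def apply_matrix(m, x):
    """Ordinary matrix-vector product."""
    return tuple(dot(row, x) for row in m)

def apply_spectral(pairs, x):
    """Evaluate the sum over n of lambda_n <x, u_n> u_n."""
    result = [0.0, 0.0]
    for lam, u in pairs:
        coeff = lam * dot(x, u)
        result = [r + coeff * ui for r, ui in zip(result, u)]
    return tuple(result)

x = (0.7, -1.9)
print(apply_matrix(T, x))
print(apply_spectral(eigenpairs, x))  # agrees with the line above up to rounding
```

Each term `coeff * u` is $\lambda_n P_n x$, where $P_n$ is the rank-1 projection onto the span of $u_n$.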
Example 10. Let T be a compact self-adjoint operator, let {𝜆n } be the nonzero
eigenvalues of T, and let un be the corresponding eigenvectors. For a fixed
g ∈ H, consider the equation Tf − 𝜆f = g. We work out two cases:
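$$\lambda f + g = \sum_{n=1}^{\infty} \langle \lambda f + g, u_n \rangle u_n = \sum_{n=1}^{\infty} [\lambda \hat{f}_n + \hat{g}_n] u_n. \tag{6}$$
By theorem 7.4.18,
$$Tf = \sum_{n=1}^{\infty} \lambda_n \hat{f}_n u_n. \tag{7}$$
Equating the Fourier coefficients of the two series in (6) and (7), we obtain
$\lambda_n \hat{f}_n = \lambda \hat{f}_n + \hat{g}_n$, which gives $\hat{f}_n = \frac{\hat{g}_n}{\lambda_n - \lambda}$, and the unique solution of the
equation is
$$f = \frac{-g}{\lambda} + \frac{1}{\lambda} Tf = \frac{-g}{\lambda} + \sum_{n=1}^{\infty} \frac{\lambda_n \langle g, u_n \rangle}{\lambda (\lambda_n - \lambda)} u_n.$$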
See problem 8 at the end of the section for the continuation of this example.
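The solution formula of example 10 can be tested in two dimensions. The sketch below assumes a hypothetical operator with eigenvalues 3 and 1 and orthonormal eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$, a value $\lambda = 0.5$ that is neither zero nor an eigenvalue, and a sample right-hand side g; all names are illustrative.

```python
import math

s = 1.0 / math.sqrt(2.0)
# Pairs (lambda_n, u_n) for a hypothetical self-adjoint operator on the plane.
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]

def dot(x, y):
    """Real inner product."""
    return sum(a * b for a, b in zip(x, y))

def apply_T(x):
    """T x as the sum over n of lambda_n <x, u_n> u_n."""
    result = [0.0, 0.0]
    for lam_n, u in eigenpairs:
        coeff = lam_n * dot(x, u)
        result = [r + coeff * ui for r, ui in zip(result, u)]
    return result

def solve(g, lam):
    """f = -g/lam + sum over n of lambda_n <g, u_n> / (lam (lambda_n - lam)) u_n."""
    f = [-gi / lam for gi in g]
    for lam_n, u in eigenpairs:
        coeff = lam_n * dot(g, u) / (lam * (lam_n - lam))
        f = [fi + coeff * ui for fi, ui in zip(f, u)]
    return f

g = (1.0, -4.0)
lam = 0.5
f = solve(g, lam)
residual = [tf - lam * fi - gi for tf, fi, gi in zip(apply_T(f), f, g)]
print(f, residual)  # the residual of T f - lam f = g is zero up to rounding
```

The formula divides by $\lambda_n - \lambda$, so it is only usable when $\lambda$ is not an eigenvalue; that situation is the other case mentioned in the example.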
We are now ready to prove the spectral theorem for compact normal operators.
It is easy to verify that T∗ (x1 , x2 , ...) = (−ix1 , ix2 , (1 − i)x3 , (1 + i)x4 , 0, 0, ...) and
that T is a normal operator. The self-adjoint operator U = T∗ T is given by
The theory of compact operators has deep roots in the study of integral equations,
and this section would not be complete without a brief mention of integral
equations.
Consider the Fredholm integral equation
Tu − 𝜆u = f,
where T is the integral operator generated by the function $K(x, \xi)$,
$$Tu(x) = \int_a^b K(x, \xi) u(\xi)\,d\xi.$$
The complex function K(x, 𝜉) on the square [a, b] × [a, b] is called the kernel of the
operator, and we limit ourselves to Hilbert-Schmidt kernels since these, as it turns
out, define compact integral operators on 𝔏2 = 𝔏2 [a, b].
Theorem 7.4.20. If K(x, 𝜉) is continuous on the closed square [a, b] × [a, b] and
u ∈ 𝔏2 , then Tu is continuous on [a, b].
Proof. Let 𝜖 > 0. By the uniform continuity of K on [a, b] × [a, b], there exists a
number 𝛿 > 0 such that if |x1 − x2 | < 𝛿, then |K(x1 , 𝜉) − K(x2 , 𝜉)| < 𝜖. Now using
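the Cauchy-Schwarz inequality,
$$\begin{aligned}
|Tu(x_1) - Tu(x_2)| &= \left| \int_a^b (K(x_1, \xi) - K(x_2, \xi)) u(\xi)\,d\xi \right| \\
&\le \int_a^b |(K(x_1, \xi) - K(x_2, \xi)) u(\xi)|\,d\xi \\
&\le \left( \int_a^b |K(x_1, \xi) - K(x_2, \xi)|^2\,d\xi \right)^{1/2} \left( \int_a^b |u(\xi)|^2\,d\xi \right)^{1/2} \\
&\le \left( \int_a^b \epsilon^2\,d\xi \right)^{1/2} \|u\|_2 = \epsilon (b - a)^{1/2} \|u\|_2.
\end{aligned}$$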
Proof. This is obvious from the proof of the previous theorem since if ‖u‖2 ≤ C
for all u ∈ 𝔉, then |Tu(x1 ) − Tu(x2 )| ≤ C(b − a)1/2 𝜖 for all x1 , x2 ∈ [a, b] with
|x1 − x2 | < 𝛿.
Theorem 7.4.22. If K(x, 𝜉) is continuous on [a, b] × [a, b], then the integral operator
it generates is a compact operator on 𝔏2 .
Proof. This result follows from the previous corollary and Ascoli’s theorem. If {un } is
a bounded sequence in 𝔏2 , then T(un ) is equicontinuous and bounded in 𝒞[a, b];
hence it contains a subsequence Tunk that converges uniformly in 𝒞[a, b]. Since,
for any function u ∈ 𝒞[a, b], ‖u‖2 ≤ (b − a)1/2 ‖u‖∞ , the subsequence Tunk is
convergent in 𝔏2 .
Proof. We utilize the fact that 𝒞([a, b] × [a, b]) is dense in 𝔏2 ([a, b] × [a, b]).
Let Kn (x, 𝜉) be a sequence of continuous functions on [a, b] × [a, b] such that
limn ‖Kn − K‖2 = 0. It suffices to show that if Tn is the compact integral operator
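$$\begin{aligned}
\|T_n u - Tu\|_2^2 &= \int_a^b \left| \int_a^b (K_n(x, \xi) - K(x, \xi)) u(\xi)\,d\xi \right|^2 dx \\
&\le \int_a^b \int_a^b |K_n(x, \xi) - K(x, \xi)|^2\,d\xi\,dx \int_a^b |u(\xi)|^2\,d\xi = \|K_n - K\|_2^2 \|u\|_2^2.
\end{aligned}$$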
(a) $0 \neq \lambda \neq \pi/2$. The unique solution of the equation is $f(x) = \frac{-g(x)}{\lambda} + \frac{c \cos x}{\lambda (\frac{\pi}{2} - \lambda)}$,
where $c = \int_0^\pi g(\xi) \cos \xi\,d\xi$.
(b) The equation $Tf - \frac{\pi}{2} f = g$ has a solution if and only if $\langle g, \cos x \rangle = 0$, and, in
this case, $f = \frac{-2g(x)}{\pi} + k \cos x$, where k is an arbitrary constant.
Exercises
$$f = -\frac{g}{\lambda_n} + \sum_{i=1}^{m} a_i v_i + \sum \left\{ \frac{\lambda_k \langle g, u_k \rangle}{\lambda_n (\lambda_k - \lambda_n)} u_k : k \neq n \right\}.$$
The reader may have observed that the definition of a compact operator makes
perfectly good sense for an operator on a Banach space. We state the definition
again. A linear operator on a Banach space X is compact if it maps bounded
subsets of X into relatively compact subsets of X. All the results in theorems 7.4.1
through 7.4.15 are valid for compact operators on Banach spaces, and the proofs
we presented for them are valid without alteration for
compact operators on Banach spaces, with the exception of theorems 7.4.5, 7.4.6,
and 7.4.15. The proofs of theorems 7.4.1 through 7.4.15 (with the exceptions noted
above) were deliberately made more general than is needed for Hilbert spaces. For
example, we used Riesz’s theorem at several places when a simpler alternative was
available. As an illustration, in the proof of lemma 7.4.10, we could simply choose a
unit vector un ∈ NLn+1 such that un ⟂ NLn . Another place where the proof could be
simplified is theorem 7.4.9, where we could have used the orthogonal complement
N⊥L instead of a complement X1 of NL . We now furnish the proofs of theorems 7.4.5,
7.4.6, and 7.4.15 for compact operators on Banach spaces.
Theorem 7.5.2. If a Banach space X has a Schauder basis, then every compact
operator T on X is the limit of a sequence of finite-rank operators.
Proof. Let $\{u_n\}$ be a Schauder basis for X, and let $P_n$ be the canonical projection of
X onto $\operatorname{Span}\{u_1, \dots, u_n\}$ (see the definition before problem 25 in section 6.2). We
prove that the sequence $T_n = P_n T$ of finite-rank operators converges in $\mathcal{L}(X)$ to T.
For every $x \in X$, $\lim_n (P_n - I)(x) = 0$. By the previous lemma, the sequence $(P_n - I)$
converges uniformly to 0 on compact subsets of X. In particular, Pn − I converges
uniformly to 0 on T(B). Now ‖Tn − T‖ = supx∈B ‖Tn (x) − T(x)‖ = supx∈B ‖(Pn −
I)(Tx)‖ → 0.
‖T∗ 𝜆ni − T∗ 𝜆nj ‖ = supx∈B |⟨x, T∗ (𝜆ni − 𝜆nj )⟩| = supx∈B |⟨Tx, 𝜆ni − 𝜆nj ⟩|
= supx∈B |⟨Tx, 𝜆ni ⟩ − ⟨Tx, 𝜆nj ⟩|
= supx∈B |𝜆ni (Tx) − 𝜆nj (Tx)|.
The uniform convergence of 𝜆nk on T(B) guarantees that the last quantity can be
made less than 𝜖 for sufficiently large integers i and j. Thus T∗ 𝜆nk is Cauchy and
hence convergent.
Recall that we have already established (theorem 7.4.8) that NL and NL∗ are finite
dimensional and that RL and RL∗ are closed (theorem 7.4.9).
Proof. Let $y_1, \dots, y_n$ be such that $\tilde{y}_i = y_i + R_L$ form a basis for $X/R_L$, let $x_1, \dots, x_m$
be a basis for NL , and let X1 be a closed complement of NL . We will show
that m = n. Suppose that m < n. Define a finite-rank operator F ∈ ℒ(X) by
F|X1 = 0, Fxi = yi for 1 ≤ i ≤ m. The operator K = T + F is compact, and we
claim that K − 𝜆I is one-to-one. If (K − 𝜆I)(x) = 0, then (T − 𝜆I)(x) = −Fx ∈
RL ∩ Span{y1 , … , yn } = {0}. Thus (T − 𝜆I)x = 0 = Fx, and hence x ∈ NL . The
restriction of F to NL is clearly one-to-one; hence Fx = 0 implies that x = 0,
and we have proved the claim. By the Fredholm alternative, K − 𝜆I is onto,
which contradicts the fact that ym+1 is not in the range of K − 𝜆I. (Note that
ℜ(K − 𝜆I) ⊆ RL ⊕ Span{y1 , … , ym }).
Exercises
8
Integration Theory
The only teaching that a professor can give, in my opinion, is that of thinking in
front of his students.
Henri Lebesgue
Lebesgue entered the École Normale Supérieure in Paris in 1894 and was awarded
his teaching diploma in mathematics in 1897. He studied Baire’s papers on
discontinuous functions and realized that much more could be achieved in this area.
Building on the work of others, including that of Émile Borel and Camille Jordan,
Lebesgue formulated measure theory, which he published in 1901. He generalized
the definition of the Riemann integral by extending the concept of the area (or
measure), and his definition allowed the integrability of a much wider class of
functions, including many discontinuous functions. This generalization of the
Riemann integral revolutionized integral calculus. Up to the end of the nineteenth
century, mathematical analysis was limited to continuous functions, based largely
on the Riemann method of integration.
Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0008
OUP UNCORRECTED PROOF – FINAL, 15/1/2021, SPi
Hawkins writes,¹
In Lebesgue’s work ... the generalized definition of the integral was simply
the starting point of his contributions to integration theory. What made the
new definition important was that Lebesgue was able to recognize in it an
analytic tool capable of dealing with—and to a large extent overcoming—
the numerous theoretical difficulties that had arisen in connection with Riemann’s
theory of integration. In fact, the problems posed by these difficulties
motivated all of Lebesgue’s major results.
Lebesgue did not concentrate throughout his career on the field which he started.
He also made major contributions in other areas of mathematics, including
topology, potential theory, the Dirichlet problem, the calculus of variations, set
theory, the theory of surface area, and dimension theory.
In this section, we treat the definition and the fundamental properties of the
Riemann integral of a bounded function on a compact box. The main reason for
the inclusion of this section is that our definition of Lebesgue measure is, loosely
stated, based on the notion that the Riemann integral of a continuous function f
on a compact box measures the volume of the region below the graph of f. The
presentation in this section is standard and reflects almost exactly the standard
approach to the Riemann integral on a compact interval found in undergraduate
real analysis textbooks.
¹ T. Hawkins, “Lebesgue, Henri Léon” in C. C. Gillispie, F. L. Holmes, and N. Koertge (eds.), Complete
Dictionary of Scientific Biography (Detroit: Charles Scribner’s Sons, 2008), 110–12.
Δ = {𝜎 = J1 × J2 × ... × Jn ∶ Ji ∈ 𝒫i } .
Both numbers are finite because f is assumed to be bounded. We define the upper
and lower Riemann sums, respectively, of f corresponding to the partition Δ on
Q by
$$S_\Delta(f) = \sum_{i=1}^{K} f^{\sigma_i}\,\mathrm{vol}(\sigma_i), \quad\text{and}\quad s_\Delta(f) = \sum_{i=1}^{K} f_{\sigma_i}\,\mathrm{vol}(\sigma_i).$$
bounded below, and hence the number 𝛽 = inf∆ S∆ (f ) is finite, where inf is taken
over all partitions Δ of Q.
Similarly, 𝛼 = sup∆ s∆ (f ) is a finite number. The numbers 𝛼 and 𝛽 are called,
respectively, the lower and upper Riemann integrals of f over Q.
$$s_{\Delta_1}(f) \le S_{\Delta_2}(f).$$
Now
$$(f^2)^{\sigma_i} - (f^2)_{\sigma_i} = (f^{\sigma_i} + f_{\sigma_i})(f^{\sigma_i} - f_{\sigma_i}) \le 2M(f^{\sigma_i} - f_{\sigma_i}),$$
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ -1 & \text{if } x \notin \mathbb{Q} \end{cases}$$
$$S_k(f) = \sum_{i=1}^{2^{nk}} f^{\sigma_i}\,\mathrm{vol}(\sigma_i), \quad\text{and}\quad s_k(f) = \sum_{i=1}^{2^{nk}} f_{\sigma_i}\,\mathrm{vol}(\sigma_i).$$
S1 (f ) ≥ S2 (f ) ≥ … , and s1 (f ) ≤ s2 (f ) ≤ … .
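The monotonicity of the two sequences can be checked numerically. The following is a minimal sketch under simplifying assumptions that are ours, not part of the text: n = 1, Q = [0, 1], and the increasing function f(x) = x², so that the maximum and minimum of f on each subinterval occur at its endpoints. The name `dyadic_sums` is hypothetical.

```python
from fractions import Fraction

def f(x):
    # Increasing on [0, 1], so max/min on a subinterval sit at its endpoints.
    return x * x

def dyadic_sums(k):
    """Return (S_k, s_k) for f on Q = [0, 1] split into 2**k equal subintervals."""
    m = 2 ** k
    width = Fraction(1, m)
    upper = sum(f(Fraction(i + 1, m)) * width for i in range(m))  # max at right endpoint
    lower = sum(f(Fraction(i, m)) * width for i in range(m))      # min at left endpoint
    return upper, lower

sums = [dyadic_sums(k) for k in range(1, 8)]
upper_sums = [S for S, _ in sums]
lower_sums = [s for _, s in sums]
```

Exact rational arithmetic is used so that the inequalities between consecutive sums are not blurred by rounding.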
Proof. In the notation of the previous paragraph, we prove that $\alpha_0 = \beta_0$. This will
establish all the assertions of the theorem. Let 𝜖 > 0, and let k be a positive
integer such that $|S_k(f) - \beta_0| < \epsilon/3$ and $|s_k(f) - \alpha_0| < \epsilon/3$. Since f is uniformly
continuous on Q, there exists 𝛿 > 0 such that $|f(x) - f(y)| < \frac{\epsilon}{3\,\mathrm{vol}(Q)}$ whenever
‖x − y‖ < 𝛿. We may assume, without loss of generality, that the integer k is such
that the diameter of each sub-box in $\Delta_k$ is less than 𝛿. Since f attains its maximum
and minimum values on $\sigma_i$ at points of $\sigma_i$, $|f^{\sigma_i} - f_{\sigma_i}| < \frac{\epsilon}{3\,\mathrm{vol}(Q)}$ for each $1 \le i \le 2^{nk}$. Now
$$|S_k(f) - s_k(f)| \le \sum_{i=1}^{2^{nk}} |f^{\sigma_i} - f_{\sigma_i}|\,\mathrm{vol}(\sigma_i) \le \frac{\epsilon}{3\,\mathrm{vol}(Q)} \sum_{i=1}^{2^{nk}} \mathrm{vol}(\sigma_i) = \epsilon/3.$$
Finally,
$$|\alpha_0 - \beta_0| \le |\alpha_0 - s_k(f)| + |s_k(f) - S_k(f)| + |S_k(f) - \beta_0| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon.$$
Since 𝜖 is arbitrary, $\alpha_0 = \beta_0$.
Proof. $S_k(f + g) = \sum_{i=1}^{2^{nk}} (f + g)^{\sigma_i}\,\mathrm{vol}(\sigma_i)$. Now $(f + g)^{\sigma_i} = \max_{x \in \sigma_i}(f(x) + g(x)) \le \max_{x \in \sigma_i} f(x) + \max_{x \in \sigma_i} g(x) = f^{\sigma_i} + g^{\sigma_i}$. Therefore, $S_k(f + g) \le S_k(f) + S_k(g)$.
Taking the limit of both sides as k → ∞, $\int_Q (f + g)\,dx \le \int_Q f\,dx + \int_Q g\,dx$. Similarly,
$s_k(f + g) \ge s_k(f) + s_k(g)$; hence $\int_Q (f + g)\,dx \ge \int_Q f\,dx + \int_Q g\,dx$.
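Proof. Since $(-f)^{\sigma_i} = \max_{x \in \sigma_i}(-f(x)) = -\min_{x \in \sigma_i} f(x) = -f_{\sigma_i}$,
$$\begin{aligned}
\int_Q (-f)\,dx &= \lim_k S_k(-f) = \lim_k \sum_{i=1}^{2^{nk}} (-f)^{\sigma_i}\,\mathrm{vol}(\sigma_i) \\
&= -\lim_k \sum_{i=1}^{2^{nk}} f_{\sigma_i}\,\mathrm{vol}(\sigma_i) = -\lim_k s_k(f) = -\int_Q f\,dx.
\end{aligned}$$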
It is now easy to verify the linearity of the integral: if f and g are continuous on Q,
and a, b ∈ ℝ, then ∫Q (af + bg)dx = a ∫Q fdx + b ∫Q gdx.
$$\int_Q f\,dx = \int_Q f_1\,dx + i \int_Q f_2\,dx.$$
Theorems 8.1.6(a) and 8.1.7 are often summarized by the terminology that the
Riemann integral is a positive linear functional on the space 𝒞(Q) of continuous
Exercises
For the remainder of this chapter, we use the notation E′ for the complement X − E
of a subset E of a set X.
Theorem 8.2.1.
(a) If 𝔐 is an algebra, then ∅, X ∈ 𝔐.
(b) If 𝔐 is an algebra and E1 , E2 ∈ 𝔐, then E1 ∩ E2 ∈ 𝔐, and E1 − E2 ∈ 𝔐. It
follows by induction that an algebra is closed under the formation of finite
unions and intersections.
(c) If 𝔐 is a 𝜎-algebra, and En ∈ 𝔐, then $\bigcap_{n=1}^{\infty} E_n \in \mathfrak{M}$.
Proof.
(a) Let E ∈ 𝔐. Then E′ ∈ 𝔐; hence X = E ∪ E′ ∈ 𝔐, and ∅ = X′ ∈ 𝔐.
(b) Using De Morgan’s laws, if E1 , E2 ∈ 𝔐, then E1 ∩ E2 = (E′1 ∪ E′2 )′ ∈ 𝔐. Also
E1 − E2 = E1 ∩ E′2 ∈ 𝔐.
(c) This follows from De Morgan’s law, since $\bigcap_{n=1}^{\infty} E_n = \left(\bigcup_{n=1}^{\infty} E_n'\right)'$.
Definition. The smallest 𝜎-algebra that contains a collection of sets ℭ is called the
𝜎-algebra generated by ℭ.
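On a finite set, the generated 𝜎-algebra can be computed by brute force, since closure under complements and finite unions is then all that is required. The sketch below is illustrative only; the function name `generated_sigma_algebra` and the sample set X = {1, 2, 3, 4} are our own choices.

```python
from itertools import combinations

def generated_sigma_algebra(X, collection):
    """Smallest family containing `collection` closed under complement and union.

    X is a finite set, so closure under finite unions already gives a sigma-algebra.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for E in current:                      # close under complements
            if X - E not in family:
                family.add(X - E)
                changed = True
        for E, F in combinations(current, 2):  # close under pairwise unions
            if E | F not in family:
                family.add(E | F)
                changed = True
    return family

X = {1, 2, 3, 4}
M = generated_sigma_algebra(X, [{1}, {1, 2}])
```

Here the generated family consists of all unions of the three sets {1}, {2}, and {3, 4}, so it separates 1 from 2 but cannot separate 3 from 4.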
It follows from theorem 8.2.1 that ℬ(X) contains all open sets, closed sets, $F_\sigma$ sets,
and $G_\delta$ sets.
(a) 𝜇 ≢ ∞, in the sense that 𝜇(E) < ∞ for at least one E ∈ 𝔐; and
(b) if {En } is a countable collection of mutually disjoint members of 𝔐, then
$$\mu\Big(\bigcup_{n=1}^{\infty} E_n\Big) = \sum_{n=1}^{\infty} \mu(E_n).$$
The pair (X, 𝔐) is called a measurable space, the members of 𝔐 are called
measurable sets, and (X, 𝔐, 𝜇) is called a measure space. If 𝔐 and 𝜇 are
understood, we loosely say that X is a measure space.
Example 4 (the counting measure). Let X be a nonempty set, and let 𝔐 = 𝒫(X).
Define 𝜇 ∶ 𝔐 → ℝ as follows: 𝜇(E) = Card(E) if E is finite, and 𝜇(E) = ∞
otherwise. Then 𝜇 is a measure on 𝒫(X).
Example 5 (the Dirac measure). Let X be a nonempty set, and let 𝔐 = 𝒫(X).
Fix an element x0 ∈ X, and define 𝜇 ∶ 𝔐 → ℝ as follows: 𝜇(E) = 1 if x0 ∈ E,
and 𝜇(E) = 0 otherwise. Then 𝜇 is a measure on 𝒫(X).
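Both examples are easy to experiment with on finite sets. The sketch below models only finite subsets (so the value ∞ never arises) and checks additivity on one hypothetical family of mutually disjoint sets; the function names are ours.

```python
def counting_measure(E):
    """mu(E) = Card(E); only finite Python sets are modelled here."""
    return len(E)

def dirac_measure(x0):
    """Return the set function E -> 1 if x0 is in E, and 0 otherwise."""
    return lambda E: 1 if x0 in E else 0

pieces = [{1, 2}, {3}, {5, 8, 13}]   # mutually disjoint sets
union = set().union(*pieces)
delta = dirac_measure(3)

# Additivity: the measure of the union equals the sum of the measures.
counting_additive = counting_measure(union) == sum(counting_measure(E) for E in pieces)
dirac_additive = delta(union) == sum(delta(E) for E in pieces)
```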
𝜇(E) ≤ 𝜇(F ).
$$\mu\Big(\bigcup_{n=1}^{\infty} E_n\Big) = \lim_n \mu(E_n).$$
$$\mu\Big(\bigcap_{n=1}^{\infty} E_n\Big) = \lim_n \mu(E_n).$$
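(c) Let $B_1 = E_1$, and, for n ≥ 2, let $B_n = E_n - E_{n-1}$. The sequence $\{B_n\}$ is pairwise
disjoint, and $\bigcup_{i=1}^{n} B_i = E_n$. Now
$$\mu\Big(\bigcup_{n=1}^{\infty} E_n\Big) = \mu\Big(\bigcup_{n=1}^{\infty} B_n\Big) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_n \sum_{i=1}^{n} \mu(B_i) = \lim_n \mu\Big(\bigcup_{i=1}^{n} B_i\Big) = \lim_n \mu(E_n).$$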
Example 7. The condition $\mu(E_1) < \infty$ in part (d) of the previous theorem cannot
be omitted. For example, if 𝜇 is the counting measure on ℕ, and $E_n = [n, \infty) \cap \mathbb{N}$,
then $\lim_n \mu(E_n) = \infty$, while $\mu\big(\bigcap_{n=1}^{\infty} E_n\big) = \mu(\emptyset) = 0$.
Outer Measures
$$m^*\Big(\bigcup_{n=1}^{\infty} E_n\Big) \le \sum_{n=1}^{\infty} m^*(E_n).$$
Thus an outer measure is a nonnegative set function on 𝒫(X) that is monotone and
countably subadditive. Outer measures have little intrinsic importance. However,
an outer measure can be restricted to a positive measure on a certain 𝜎-algebra of
sets in X, as we detail below.
m∗ (A) = m∗ (A ∩ E) + m∗ (A ∩ E′ )
The Carathéodory condition also implies without too much difficulty that 𝔐 is an
algebra (see lemma 8.2.4). In fact, it turns out that 𝔐 is a 𝜎-algebra and that the
restriction of m∗ to 𝔐 is a positive measure. We prove this in three steps.
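The Carathéodory condition can be tested exhaustively when X is finite. The sketch below uses a hypothetical outer measure on a three-point set (0 on the empty set, 2 on X, and 1 on every other subset), chosen by us for illustration, and lists the sets E that satisfy the condition for every test set A.

```python
from itertools import chain, combinations

X = frozenset({0, 1, 2})

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = list(S)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def outer(A):
    # Hypothetical outer measure: 0 on the empty set, 2 on X, 1 otherwise.
    if not A:
        return 0
    return 2 if A == X else 1

def is_measurable(E):
    # Caratheodory condition, tested against every subset A of X.
    # A - E is the same set as A intersected with the complement of E.
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

measurable = {E for E in subsets(X) if is_measurable(E)}
```

For this particular outer measure, only ∅ and X pass the test, which illustrates that the 𝜎-algebra of measurable sets can be much smaller than 𝒫(X).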
To complete the proof of part (a), we use induction coupled with the fact we just
established (n = 2) and the fact that 𝔐 is an algebra.
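To prove (b),
$$\begin{aligned}
\sum_{i=1}^{n} m^*(A \cap E_i) &= m^*\Big(A \cap \bigcup_{i=1}^{n} E_i\Big) \le m^*\Big(A \cap \bigcup_{i=1}^{\infty} E_i\Big) \\
&= m^*\Big(\bigcup_{i=1}^{\infty} (A \cap E_i)\Big) \le \sum_{i=1}^{\infty} m^*(A \cap E_i).
\end{aligned}$$
Taking the limit as n → ∞, we obtain (b). Part (c) follows from (b) by taking
$A = \bigcup_{i=1}^{\infty} E_i$.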
Proof. The fact that $m^*$ is countably additive on 𝔐 is part (c) of the previous
theorem. We need to show that 𝔐 is closed under the formation of countable
unions. Let $E_n \in \mathfrak{M}$, and write $E = \bigcup_{n=1}^{\infty} E_n$. Define $B_1 = E_1$, and, for n ≥ 2,
$B_n = E_n - \bigcup_{i=1}^{n-1} E_i$. Since 𝔐 is an algebra, each $B_n \in \mathfrak{M}$. Notice that the sets $B_n$ are
mutually disjoint, and $\bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} E_n$. Therefore, without loss of generality,
we may assume the sets $E_n$ are mutually disjoint. We need to show that, for
A ⊆ X, $m^*(A) \ge m^*(A \cap E) + m^*(A \cap E')$. Using the facts that $\bigcup_{i=1}^{n} E_i \in \mathfrak{M}$,
$A \cap \big(\bigcup_{i=1}^{n} E_i\big)' \supseteq A \cap E'$, and lemma 8.2.5, we obtain
$$\begin{aligned}
m^*(A) &= m^*\Big(A \cap \Big(\bigcup_{i=1}^{n} E_i\Big)\Big) + m^*\Big(A \cap \Big(\bigcup_{i=1}^{n} E_i\Big)'\Big) \\
&\ge m^*\Big(A \cap \Big(\bigcup_{i=1}^{n} E_i\Big)\Big) + m^*(A \cap E') = \sum_{i=1}^{n} m^*(A \cap E_i) + m^*(A \cap E').
\end{aligned}$$
Taking the limit as n → ∞ in the above string, then applying part (b) of the
previous theorem, we obtain
$$m^*(A) \ge \sum_{i=1}^{\infty} m^*(A \cap E_i) + m^*(A \cap E') = m^*(A \cap E) + m^*(A \cap E').$$
Theorem 8.2.7. Let m∗ be an outer measure on a set X, and let 𝔐 be the 𝜎-algebra
of measurable subsets of X. Then the restriction of m∗ to 𝔐 is a complete measure.
Measurable Functions
For the remainder of this section, (X, 𝔐) is a measurable space. We allow real-
valued functions on X to take infinite values. This is essential because, for example,
the limit of a sequence of functions fn (x) may well diverge to ±∞, or it may not
even exist for some x ∈ X. It will turn out that this is largely a technicality because,
in practice, the exceptional set of points where a reasonable measurable function
f takes infinite values has measure 0 (see, e.g., example 1 in section 8.3). In this
section, we have to contend with the nuisance that functions can assume infinite
values.
(a) f is measurable.
(b) f−1 ([a, ∞]) ∈ 𝔐.
(c) f−1 ([−∞, a)) ∈ 𝔐.
(d) f−1 ([−∞, a]) ∈ 𝔐.
(a) f−1 (−∞) and f−1 (∞) are measurable subsets of X, and
(b) f−1 (V) is measurable for every open subset V of ℝ.
$$h^{-1}(V) = \begin{cases} f^{-1}(U) & \text{if } 0 \notin V, \\ f^{-1}(U) \cup f^{-1}(\infty) \cup f^{-1}(-\infty) & \text{if } 0 \in V. \end{cases}$$
This lemma can be applied to infer the measurability of a wide class of functions.
The following is a sample.
Proof. This follows from lemma 8.2.12 applied with 𝜑(t) = max{t, 0}, 𝜑(t) =
min{t, 0}, $\varphi(t) = |t|^p$, and $\varphi(t) = t^m$, respectively. Here it is assumed, in
accordance with the lemma, that when | f (x)| = ∞, $(\varphi \circ f)(x)$ is defined to be 0.
h−1 ((a, ∞]) = {x ∈ A ∶ h(x) > a} ∪ {x ∈ X − A ∶ h(x) > a} = {x ∈ A ∶ f (x) > a},
which is measurable.
If a < 0,
h−1 ((a, ∞]) = {x ∈ A ∶ f (x) > a} ∪ (X − A),
which is also measurable.
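$$h(x) = \begin{cases} f(x) + g(x) & \text{if } x \in A, \\ 0 & \text{if } x \notin A, \end{cases}$$
$$k(x) = \begin{cases} f(x)g(x) & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$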
(a) supn fn ,
(b) infn fn ,
(c) lim supn fn , and
(d) lim infn fn .
and
$$\{x \in X : \inf_n f_n(x) < a\} = \bigcup_{n=1}^{\infty} \{x \in X : f_n(x) < a\},$$
respectively. Now parts (c) and (d) follow from parts (a) and (b) because
The last assertion follows from parts (c) and (d) and from lemma 8.2.10, because
the set in question is
The motivation for the Hopf extension included below is not entirely precise, but
we hope it will help the reader gain some insight into the construction of important
measures such as the Lebesgue measure on ℝ². The plane contains a collection of
subsets for which a natural measure exists, namely, the collection of rectangles.³
The measure (area) of a rectangle ought to be the product of its dimensions. The
collection ℭ of finite disjoint unions of rectangles in the plane is known to be an
algebra in ℝ², and the measure of a member of ℭ is defined in the obvious way: it
is the (finite) sum of the measures of the rectangles in the union. The immediate
question is whether the natural measure we just described can be extended to the
𝜎-algebra 𝔐 generated by ℭ.
The Hopf extension abstracts the above motivation and provides an affirmative
answer (theorem 8.2.19). Theorem 8.2.20 gives a sufficient condition for the
uniqueness of such an extension.
$$\nu^*(E) = \inf\Big\{ \sum_{n=1}^{\infty} \mu(C_n) : C_n \in \mathfrak{C},\ E \subseteq \bigcup_{n=1}^{\infty} C_n \Big\}.$$
(c) Let E ∈ ℭ, A ⊆ X, and, without loss of generality, assume that $\nu^*(A) < \infty$.
For every 𝜖 > 0, there exists a sequence $\{C_n\}$ in ℭ such that $A \subseteq \bigcup_{n=1}^{\infty} C_n$ and
$\sum_{n=1}^{\infty} \mu(C_n) \le \nu^*(A) + \epsilon$. By the additivity of 𝜇 on ℭ,
$$\nu^*(A) + \epsilon \ge \sum_{n=1}^{\infty} \mu(C_n) = \sum_{n=1}^{\infty} \mu(C_n \cap E) + \sum_{n=1}^{\infty} \mu(C_n \cap E') \ge \nu^*(A \cap E) + \nu^*(A \cap E').$$
Since 𝜖 is arbitrary, the result follows.
Theorem 8.2.19 (the Hopf extension theorem). Under the standing assumptions,
the set function 𝜇 has an extension to a positive measure on the 𝜎-algebra 𝔐
generated by ℭ.
The next corollary establishes a sufficient condition for the uniqueness of the Hopf
extension.
Proof. The following two facts are consequences of the 𝜎-finiteness assumption.
(a) The sequence $(X_n)$ may be assumed to be mutually disjoint, because we can
replace it with the sequence $Y_1 = X_1$, and, for n ≥ 2, $Y_n = X_n - \bigcup_{i=1}^{n-1} X_i$. Clearly,
$Y_n \in \mathfrak{C}$, $\mu(Y_n) \le \mu(X_n) < \infty$, and $\bigcup_{n=1}^{\infty} Y_n = X$.
We now prove the result. Suppose there is another measure that extends 𝜇 from ℭ
to 𝔐. We continue to use the symbol 𝜇 to denote this extension. Thus we assume
that 𝜇(C) = 𝜈(C) for every C ∈ ℭ and prove that 𝜇(E) = 𝜈(E) for every E ∈ 𝔐.
Observe the following facts:
(e) If E ∈ 𝔐 and 𝜈(E) < ∞, then 𝜇(E) = 𝜈(E). Let 𝜖 > 0. There exists a
sequence $(C_n)$ in ℭ such that $E \subseteq C = \bigcup_{n=1}^{\infty} C_n$ and $\sum_{n=1}^{\infty} \mu(C_n) < \nu(E) + \epsilon$.
Now $\nu(C) \le \sum_{n=1}^{\infty} \nu(C_n) = \sum_{n=1}^{\infty} \mu(C_n) < \nu(E) + \epsilon$. In particular, 𝜈(C − E) < 𝜖.
Using fact (c), we have 𝜈(E) ≤ 𝜈(C) = 𝜇(C) = 𝜇(E) + 𝜇(C − E) ≤ 𝜇(E) +
𝜈(C − E) < 𝜇(E) + 𝜖. Since 𝜖 is arbitrary, 𝜈(E) ≤ 𝜇(E). Now 𝜇(E) = 𝜈(E) by
fact (d).
Exercises
Show that 𝜇 is well defined and that 𝜇 is a complete measure. Hint: Show
that the set 𝔐1 = {E ∪ Z ∶ E ∈ 𝔐, Z ∈ ℨ} is a 𝜎-algebra.
4. This exercise provides a useful alternative characterization of the completion
of a measure space. In the notation of the previous exercise, prove that,
for a subset E of X, E ∈ 𝔐 if and only if there exist two sets A and B in 𝔐
such that A ⊆ E ⊆ B and 𝜇(B − A) = 0.
5. Prove that each of the following collections of sets generates ℬ(ℝ):
(a) {(a, ∞) ∶ a ∈ ℝ}
(b) {(−∞, b) ∶ b ∈ ℝ}
6. Prove that the collection of open boxes $\{\prod_{i=1}^{n} (a_i, b_i) : a_i, b_i \in \mathbb{Q}\}$ generates
$\mathcal{B}(\mathbb{R}^n)$.
7. Suppose 𝔐 is a 𝜎-algebra generated by a collection ℭ of subsets of a
nonempty set X. Prove that 𝔐 is the union of the 𝜎-algebras generated
by 𝔉 where 𝔉 ranges over all the countable subsets of ℭ. Hint: The latter
union is a 𝜎-algebra.
8. Prove that if E and F are measurable sets such that 𝜇(EΔF) = 0, then 𝜇(E) =
𝜇(F ) = 𝜇(E ∪ F) = 𝜇(E ∩ F).
9. Let $E_n$ be a sequence of measurable sets such that $\sum_{n=1}^{\infty} \mu(E_n) < \infty$. Prove
that the set $\bigcap_{n=1}^{\infty} \bigcup_{k \ge n} E_k$ has measure 0. Conclude that, except for a set of
measure 0, every x ∈ X belongs to finitely many of the sets $E_n$.
10. Let $E_1, \dots, E_n$ be measurable sets and, for 1 ≤ j ≤ n, let $F_j$ be the set
of points in X that belong to exactly j of the sets $E_1, \dots, E_n$. Prove that
$\mu\big(\bigcup_{i=1}^{n} E_i\big) = \sum_{j=1}^{n} \mu(F_j)$, and $\sum_{i=1}^{n} \mu(E_i) = \sum_{j=1}^{n} j\,\mu(F_j)$. Hint:
$F_j = \{x \in X : \sum_{i=1}^{n} \chi_{E_i}(x) = j\}$.
11. Prove theorem 8.2.9.
12. Show that if f is measurable and a ∈ ℝ, then f−1 (a) is measurable.
13. Let (X, 𝔐) be a measurable space such that 𝔐 ≠ 𝒫(X). Prove that there is
a function f such that | f | is measurable but f is not.
14. Suppose that (X, 𝔐) is a measurable space and that Y is a nonempty set.
Show that if f ∶ X → Y, then the collection 𝔑 = {E ⊆ Y ∶ f−1 (E) ∈ 𝔐} is a
𝜎-algebra.
15. Let (X, 𝔐) be a measurable space, and let f ∶ X → ℝ be a measurable
function. Show that f−1 (B) is measurable for every Borel subset B of ℝ.
Hint: The collection Ω = {E ⊆ ℝ ∶ f−1 (E) ∈ 𝔐} contains all open subsets
of ℝ.
16. Let X be a topological space, and let f ∶ X → ℝ be a continuous function.
Show that f−1 (B) is a Borel subset of X for every Borel subset B of ℝ.
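17. Show that if $E \in \mathcal{B}(\mathbb{R}^r)$ and $F \in \mathcal{B}(\mathbb{R}^s)$, then $E \times F \in \mathcal{B}(\mathbb{R}^{r+s})$. Hint: For
an open subset E of $\mathbb{R}^r$, consider the collection $\Omega = \{F \subseteq \mathbb{R}^s : E \times F \in \mathcal{B}(\mathbb{R}^{r+s})\}$.
Show that $\mathcal{B}(\mathbb{R}^s) \subseteq \Omega$. Then, for a Borel subset F of $\mathbb{R}^s$, consider
the collection $\{E \subseteq \mathbb{R}^r : E \times F \in \mathcal{B}(\mathbb{R}^{r+s})\}$.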
18. Let C be the Cantor set, and define a function f ∶ [0, 1] → C as follows:
f(0) = 0, and, for x ∈ (0, 1], write $x = \sum_{i=1}^{\infty} \frac{a_i}{2^i}$ and set $f(x) = \sum_{i=1}^{\infty} \frac{2a_i}{3^i}$.⁴ Show
that f is Borel measurable. Hint: For a fixed i ∈ ℕ, define $f_i(x) = a_i$. It is
enough to show that $f_i$ is measurable. To this end, show that
$f_i = \sum \{\chi_{E_{i,k}} : k = 1, 3, 5, \dots, 2^i - 1\}$, where $E_{i,k} = \big(\frac{k}{2^i}, \frac{k+1}{2^i}\big]$.
[Figure: the levels $y_k$, $y_{k+1}$ and the sets $E_k$.]
the approximate area below the graph is $\sum_{k=1}^{n} y_k\,\mu(E_k)$, which, by definition, is
the integral of a simple function. Needless to say, as the partition of the range
of f gets finer, we expect the integrals of the simple functions to converge to the
integral of f. This is the overarching idea in Lebesgue integration. As it turns out,
we can integrate far more functions under the Lebesgue definition than under
the Riemann definition. For example, the integral of any positive measurable
function is defined, although it may not be finite. Additionally, the definition of the
integral extends seamlessly to abstract measure spaces. The results of this section
capture the above ideas. First we define the integral of a positive measurable function
f, then we show that f is the limit of simple functions, sn , and then we show
that ∫X fd𝜇 = limn ∫X sn d𝜇. Extending the definition of the integral to complex
functions follows without difficulty. The section concludes with three important
convergence theorems.
Remarks. (a) It is clear that a simple function is measurable if and only if each
Ei is a measurable set. Also, a simple function need not have bounded support.
For example, s = 𝜒(−∞,0) + 𝜒(1,2) is not supported on a bounded set.
$$\int_X s\,d\mu = \sum_{i=1}^{m} a_i\,\mu(E_i).$$
The above formula is robust in the sense that it is valid even when s is not
in standard form. This follows from remark (b) above. If $a_1, \dots, a_m$ are not
all distinct, write s in standard form using remark (b): $s = \sum_{i=1}^{n} b_i \chi_{B_i}$. Then
$$\sum_{i=1}^{n} b_i\,\mu(B_i) = \sum_{i=1}^{n} b_i \sum_{j \in T_i} \mu(E_j) = \sum_{j=1}^{m} a_j\,\mu(E_j).$$
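The independence of the formula from the representation of s can be checked on a small example. The sketch below assumes a hypothetical finite measure space in which each point carries a weight, and compares the sum taken over a representation with repeated values against the sum taken after merging the sets that carry equal values.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical finite measure space: mu({x}) = weight[x].
weight = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2),
          3: Fraction(1), 4: Fraction(1, 6)}

def mu(E):
    return sum(weight[x] for x in E)

# s = sum of a_i * chi_{E_i} with disjoint sets E_i; the value 5 is repeated.
representation = [(5, {0}), (7, {1, 2}), (5, {3, 4})]
integral_raw = sum(a * mu(E) for a, E in representation)

# Standard form: merge the sets that carry the same value.
merged = defaultdict(set)
for a, E in representation:
    merged[a] |= E
integral_standard = sum(b * mu(B) for b, B in merged.items())
```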
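$$\begin{aligned}
\int_X (s + t)\,d\mu &= \sum_{i=1}^{m} \sum_{j=1}^{n} (a_i + b_j)\,\mu(B_{ij}) \\
&= \sum_{i=1}^{m} a_i \sum_{j=1}^{n} \mu(B_{ij}) + \sum_{j=1}^{n} b_j \sum_{i=1}^{m} \mu(B_{ij}) \\
&= \sum_{i=1}^{m} a_i\,\mu(E_i) + \sum_{j=1}^{n} b_j\,\mu(F_j) = \int_X s\,d\mu + \int_X t\,d\mu.
\end{aligned}$$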
Remark. This proof includes a proof of the fact that the sum of two simple
functions is a simple function.
Observe that this definition is reminiscent of the fact that the Riemann integral
of a function is the supremum of the lower Riemann sums of the function
and that a lower Riemann sum of a function is the integral of a step function
dominated by f.
The fact that, for positive functions f and g, ∫X (f + g)d𝜇 = ∫X fd𝜇 + ∫X gd𝜇 requires
the development of some machinery. First we show that a positive measurable
function f is the limit of a sequence, sn , of simple functions, then we show that
limn ∫X sn d𝜇 = ∫X fd𝜇. The details appear below.
f = f+ − f− , and | f | = f+ + f− .
Theorem 8.3.2. (a) Let f ∶ X → [0, ∞] be a measurable function. Then there exists
an increasing sequence of simple functions s1 , s2 , ... such that limn sn (x) = f (x)
for every x ∈ X.
(b) Let f ∶ X → ℂ be a measurable function. Then there exists a sequence of
simple functions u1 , u2 , ... such that limn un (x) = f (x) for every x ∈ X and
|u1 | ≤ |u2 | ≤ ... ≤ | f |.
Proof. For each n ∈ ℕ, define $E_{n,k} = \{x \in X : \frac{k-1}{2^n} \le f(x) < \frac{k}{2^n}\}$, $k = 1, 2, \dots, n2^n$,
and $F_n = \{x \in X : f(x) \ge n\}$. Let
$$s_n = \sum_{k=1}^{n2^n} \frac{k-1}{2^n}\,\chi_{E_{n,k}} + n\,\chi_{F_n}.$$
The fact that $s_n(x) \le f(x)$ is clear. Now every x ∈ X belongs to exactly one of
the sets $E_{n,k}$ or to $F_n$. We show that $s_n$ is an increasing sequence of functions. If
$\frac{2(k-1)}{2^{n+1}} \le f(x) < \frac{2k-1}{2^{n+1}}$, then $s_n(x) = \frac{2(k-1)}{2^{n+1}} = s_{n+1}(x)$. If $\frac{2k-1}{2^{n+1}} \le f(x) < \frac{2k}{2^{n+1}}$, then
$s_n(x) = \frac{2(k-1)}{2^{n+1}} < \frac{2k-1}{2^{n+1}} = s_{n+1}(x)$. If f (x) ≥ n, then $n = s_n(x) \le s_{n+1}(x)$. Now we show
that $\lim_n s_n(x) = f(x)$. If f (x) < ∞, then $0 \le f(x) - s_n(x) \le 1/2^n$ for every n > f (x). If f (x) = ∞, $s_n(x) = n$
for all n ∈ ℕ. In either case, $\lim_n s_n(x) = f(x)$.
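The construction in the proof depends only on the value y = f(x), so it can be sketched as a function of n and y. The following is an illustration under our own conventions: `s(n, y)` is a hypothetical name, and `math.inf` stands for the value ∞.

```python
import math

def s(n, y):
    """Value of s_n at a point where f takes the value y (y may be math.inf)."""
    if y >= n:
        return float(n)                       # the point lies in F_n
    # The point lies in E_{n,k} with k - 1 = floor(2^n * y).
    return math.floor(y * 2 ** n) / 2 ** n

values = [0.0, 0.3, 1.0, 2.71828, 17.5, math.inf]
table = {y: [s(n, y) for n in range(1, 6)] for y in values}
```

Scaling by a power of 2 is exact in binary floating point, so the inequalities of the proof hold exactly for these sample values and not merely up to rounding.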
$$\int_X f\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu,$$
provided that at least one of the integrals on the right-hand side of the definition
is finite. We say f is integrable if both ∫X f+ d𝜇 and ∫X f− d𝜇 are finite, which
is equivalent to the condition that ∫X | f |d𝜇 < ∞. This is because | f | = f+ + f− ,
f+ ≤ | f |, and f− ≤ | f |.
Theorem 8.3.6. If f and g are real and integrable, then ∫X (f + g)d𝜇 = ∫X fd𝜇 +
∫X gd𝜇. Also, ∫X afd𝜇 = a ∫X fd𝜇 for every real number a.
Notice that a complex function is integrable if and only if ∫X | f |d𝜇 < ∞. This is
because | f1 | ≤ | f |, | f2 | ≤ | f |, and | f | ≤ | f1 | + | f2 |.
Proof. ∫X |af + bg|d𝜇 ≤ ∫X (|a|| f | + |b||g|)d𝜇 = |a| ∫X | f |d𝜇 + |b| ∫X |g|d𝜇 < ∞. Thus
af + bg is integrable. The verification that ∫X (f + g)d𝜇 = ∫X fd𝜇 + ∫X gd𝜇 is a
routine calculation, as is the fact that ∫X cfd𝜇 = c ∫X fd𝜇 when c is a real constant.
It now suffices to show that ∫X ifd𝜇 = i ∫X fd𝜇. Indeed, ∫X ifd𝜇 = ∫X i(f1 + if2 )d𝜇 =
∫X (−f2 + if1 )d𝜇 = ∫X −f2 d𝜇 + i ∫X f1 d𝜇 = i ∫X fd𝜇.
It is easy to see that the set of complex integrable functions is a vector space. We
denote it by 𝔏1 (𝜇). In fact, if a norm is defined on 𝔏1 (𝜇) by ‖ f‖1 = ∫X | f |d𝜇, then
𝔏1 (𝜇) is a normed linear space, as the reader can easily verify.
Definition. Let (X, 𝔐, 𝜇) be a measure space, and let P(x) be a property that may
or may not be satisfied by a point x ∈ X. For example, for a given extended real-
valued function f, P(x) may be the property that f (x) is finite. Another example
is the property that f (x) = g(x) for two measurable functions f and g. We say
that property P holds for almost every x in a measurable set E, or that P holds
almost everywhere in E, if 𝜇({x ∈ E ∶ P(x) is false}) = 0. In this situation, we
often write “P holds for a.e. x ∈ E.” The examples below are good illustrations of
the concept.
Note that if E1 ⊆ E2 , then ∫E1 fd𝜇 ≤ ∫E2 fd𝜇. Also, if 0 ≤ f ≤ g, then ∫E fd𝜇 ≤
∫E gd𝜇.
When $s = \sum_{j=1}^{m} a_j \chi_{E_j}$ is a simple function, then $s\chi_E = \sum_{j=1}^{m} a_j \chi_{E_j \cap E}$ is also a simple
function and
$$\int_E s\,d\mu = \sum_{j=1}^{m} a_j\,\mu(E_j \cap E).$$
This equation can very well be used to define ∫E sd𝜇. One can then take the
alternative approach of defining
The two methods of defining ∫E fd𝜇 are clearly equivalent, and the interested
reader is encouraged to work out the details of reconciling the two definitions.
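Let $E_n = \{x \in E : f(x) > \frac{1}{n}\}$. Then $\frac{1}{n}\mu(E_n) \le \int_{E_n} f\,d\mu \le \int_E f\,d\mu = 0$. Thus
$\mu(E_n) = 0$. The result now follows from the fact that $\{x \in E : f(x) > 0\} = \bigcup_{n=1}^{\infty} E_n$,
and $\mu\big(\bigcup_{n=1}^{\infty} E_n\big) \le \sum_{n=1}^{\infty} \mu(E_n) = 0$.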
Convergence Theorems
Proof. Let gn = infk≥n fk . Then 0 ≤ g1 ≤ g2 ≤ … , and let f (x) = limn gn (x). Note that
f (x) = lim infn fn (x). If s is a simple function such that 0 ≤ s ≤ f, then, by lemma
8.3.3, ∫X sd𝜇 ≤ limn ∫X gn d𝜇. Hence ∫X fd𝜇 = sup{∫X sd𝜇 ∶ s ≤ f} ≤ limn ∫X gn d𝜇.
Since gn ≤ fn , ∫X gn d𝜇 ≤ ∫X fn d𝜇, and limn ∫X gn d𝜇 ≤ lim infn ∫X fn d𝜇.
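The inequality in Fatou's theorem can be strict. A minimal sketch, under assumptions of our own choosing: X = {0, 1} with the counting measure, and $f_n$ the indicator of the single point n mod 2. Because the sequence is periodic, each lim inf below is computed as a minimum over one period.

```python
# Counting measure on X = {0, 1}; f_n is the indicator of the point n mod 2.
X = [0, 1]

def f(n, x):
    return 1 if x == n % 2 else 0

def integral(g):
    # Integral with respect to counting measure on the finite set X.
    return sum(g(x) for x in X)

period = range(2)   # the sequence (f_n) repeats with period 2

# For a periodic sequence, lim inf equals the minimum over one period.
liminf_f = lambda x: min(f(n, x) for n in period)
lhs = integral(liminf_f)                                       # integral of lim inf f_n
rhs = min(integral(lambda x, n=n: f(n, x)) for n in period)    # lim inf of the integrals
```

In this example the pointwise lim inf is identically 0, while every integral equals 1.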
Example 5. Let (fn ) be a convergent sequence in 𝔏1 (𝜇), and let f be its 𝔏1 -limit.
Then (fn ) contains a subsequence that converges to f for almost every x ∈ X.
Choose a subsequence $(f_{n_i})$ of $(f_n)$ such that, for i ∈ ℕ, $\|f_{n_i} - f\|_1 < 2^{-i}$. Let
$g_k = \sum_{i=1}^{k} |f_{n_i} - f|$. The functions $g_k$ are in 𝔏1 and, by construction, $0 \le g_1 \le g_2 \le \dots$,
and $\|g_k\|_1 \le 1$. Let $g(x) = \lim_k g_k(x)$. By Fatou’s theorem,
$\int_X g\,d\mu \le \liminf_k \|g_k\|_1 \le 1$. This shows that g ∈ 𝔏1 .⁵ Since $g(x) = \sum_{i=1}^{\infty} |f_{n_i}(x) - f(x)|$,
it follows that the series $\sum_{i=1}^{\infty} |f_{n_i}(x) - f(x)|$ is convergent for a.e. x ∈ X (by
example 1). In particular, $\lim_{i \to \infty} |f_{n_i}(x) - f(x)| = 0$ for a.e. x ∈ X.
⁵ One can also use the monotone convergence theorem to show that g ∈ 𝔏1 .
Proof. Notice that | fn (x)| ≤ |g(x)| implies that | f (x)| ≤ |g(x)|. Hence fn ∈ 𝔏1 (𝜇),
and f ∈ 𝔏1 (𝜇). Since | fn − f| ≤ 2g, we can apply Fatou’s theorem to the
sequence 2g − | fn − f| to obtain ∫X 2gd𝜇 ≤ lim infn ∫X 2g − | fn − f|d𝜇 = ∫X 2gd𝜇 −
lim supn ∫X | fn − f|d𝜇. Hence lim supn ∫X | fn − f|d𝜇 ≤ 0, so lim supn ∫X | fn −
f|d𝜇 = 0. Since ∫X | fn − f|d𝜇 is a nonnegative sequence, limn ∫X | fn − f|d𝜇 = 0,
as desired.
Example 7. Let f ∈ 𝔏1 (𝜇). Then, for every 𝜖 > 0, there exists 𝛿 > 0 such that
whenever 𝜇(E) < 𝛿, ∫E |f|d𝜇 < 𝜖.
Suppose there exists a number 𝜖 > 0 such that, for every n ∈ ℕ, there is a
measurable set $E_n$ such that $\mu(E_n) < 2^{-n}$, and $\int_{E_n} |f|\,d\mu \ge \epsilon$. Let $F_k = \bigcup_{n \ge k} E_n$,
and let $F = \bigcap_{k=1}^{\infty} F_k$. On the one hand, $\mu(F_k) \le \sum_{n=k}^{\infty} 2^{-n} = 2^{-k+1}$; hence
$\mu(F) = \lim_k \mu(F_k) = 0$, and $\int_F |f|\,d\mu = 0$. On the other hand, by the dominated convergence
theorem, $\int_F |f|\,d\mu = \lim_k \int_{F_k} |f|\,d\mu \ge \liminf_k \int_{E_k} |f|\,d\mu \ge \epsilon$. This contradiction
establishes the result.
Exercises
1. Let f be a measurable function, and let g be a function such that f (x) = g(x)
for a.e. x ∈ X. Prove that g is measurable.
2. Define a relation on the collection of measurable functions as follows: f ≡ g
if f (x) = g(x) for a.e. x ∈ X. Prove that ≡ is an equivalence relation.
3. Let f be an integrable function, and let g be such that f (x) = g(x) a.e. Prove
that g is integrable and that ∫X fd𝜇 = ∫X gd𝜇.
4. Let f ∈ 𝔏1 (𝜇), and let E = {x ∈ X ∶ | f (x)| > c}, where c > 0. Prove
the inequality (Tchebychev) $\mu(E) \le \frac{1}{c} \int_E |f|\,d\mu$. More generally, if f is
measurable and $|f|^p \in \mathfrak{L}^1(\mu)$, then $\mu(E) \le \frac{1}{c^p} \int_E |f|^p\,d\mu$. Here 1 ≤ p < ∞.
5. Let f ∈ 𝔏1 (𝜇). Show that the set E = {x ∈ X ∶ f (x) ≠ 0} is a countable union
of sets of finite measure.
6. Let f be a positive measurable function. Show that if E and F are measurable
sets such that 𝜇(EΔF) = 0, then ∫E fd𝜇 = ∫F fd𝜇.
7. Let $f_n$ be a sequence of measurable functions such that $\sum_{n=1}^{\infty} \int_X |f_n|\,d\mu < \infty$.
Show that the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges a.e. in X.
8. Show that if 𝜇 is a finite measure and (fn ) is a sequence of bounded measurable
functions such that fn converges uniformly to f, then limn ∫X | fn − f|d𝜇 = 0.
9. Let f ∈ 𝔏1 (𝜇). Prove that for every 𝜖 > 0, there exists a set E of finite measure
such that ∫E | f |d𝜇 > ‖ f‖1 − 𝜖.
10. Let (fn ) be a decreasing sequence of nonnegative measurable functions, and
let f = limn fn . Show that if f1 is integrable, then ∫X fd𝜇 = limn ∫X fn d𝜇.
This section is the centerpiece of the chapter. The motivation for the definition of
the Lebesgue measure, as well as an extensive development of its properties, appear
later in the section. We must furnish some needed background. The four leading
results in this section are valid for locally compact Hausdorff spaces, and this is
made abundantly clear in the excursion on Radon measures. We chose to limit the
bulk of the section to the Lebesgue measure because we do not wish to base this
section too heavily on chapter 5.
Preliminaries
Proof. The functions g(x) = dist(x, F) and h(x) = dist(x, E) are continuous and are
never simultaneously zero since E and F are closed and disjoint. Furthermore,
g(x) > 0 for every x ∈ E, and h(x) > 0 for every x ∈ F.
The function $f(x) = \frac{g(x)}{g(x) + h(x)}$ has the stated properties.
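The formula is easy to evaluate in a concrete case. The sketch below takes n = 1 and the hypothetical closed, disjoint sets E = [0, 1] and F = [2, 3], chosen by us for illustration; it checks numerically that the resulting function equals 1 on E, equals 0 on F, and takes values in [0, 1].

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval [a, b]."""
    return max(a - x, 0.0, x - b)

# Hypothetical closed, disjoint sets in R: E = [0, 1] and F = [2, 3].
def g(x):
    # g(x) = dist(x, F)
    return dist_to_interval(x, 2.0, 3.0)

def h(x):
    # h(x) = dist(x, E)
    return dist_to_interval(x, 0.0, 1.0)

def f(x):
    # The denominator is positive because E and F are closed and disjoint.
    return g(x) / (g(x) + h(x))

samples = [-2.0, 0.0, 0.5, 1.0, 1.25, 1.5, 2.0, 2.5, 3.0, 4.0]
sample_values = [f(x) for x in samples]
```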
Lemma 8.4.2. Let K be a compact subset of an open subset V of ℝⁿ. Then there exists
an open set U such that $\overline{U}$ is compact and $K \subseteq U \subseteq \overline{U} \subseteq V$.
Proof. For every x ∈ K, there exists a ball $B(x, \delta_x)$ such that $\overline{B}(x, \delta_x) \subseteq V$. Since
K is compact, and $K \subseteq \bigcup_{x \in K} B(x, \delta_x)$, there exist finitely many points
$x_1, \dots, x_m \in K$ such that $K \subseteq \bigcup_{i=1}^{m} B(x_i, \delta_{x_i})$. The set $U = \bigcup_{i=1}^{m} B(x_i, \delta_{x_i})$ satisfies the
requirements.
Proof. By lemma 8.4.2, there exists an open set U such that $\overline{U}$ is compact and
$K \subseteq U \subseteq \overline{U} \subseteq V$. Applying lemma 8.4.1 with E = K and F = ℝⁿ − U, we find the
function we seek.
Proof. First we show that there exists an open cover $\{U_1, \dots, U_m\}$ of K such that
each $\overline{U}_i$ is compact and $\overline{U}_i \subseteq V_i$. The proof is by induction on m. When m = 2,
let $K_1 = K - V_2$. Then $K_1$ is compact and contained in $V_1$. By lemma 8.4.2,
there exists an open set $U_1$ with compact closure such that $K_1 \subseteq U_1 \subseteq \overline{U}_1 \subseteq V_1$.
Clearly, $\{U_1, V_2\}$ is an open cover of K. Now let $K_2 = K - U_1$, and repeat
the above argument to find an open set $U_2$ with compact closure such that
$K_2 \subseteq U_2 \subseteq \overline{U}_2 \subseteq V_2$. Clearly, $\{U_1, U_2\}$ is an open cover of K. This proves the base
case when m = 2. We outline the inductive step. Let $\{V_1, \dots, V_m\}$ be an open cover
of K, and write $W = V_2 \cup \dots \cup V_m$. Then $\{V_1, W\}$ is an open cover of K. By what
we already established, there are open sets $U_1$ and $W_1$ with compact closures such
that $K \subseteq U_1 \cup W_1$, $\overline{U}_1 \subseteq V_1$, and $\overline{W}_1 \subseteq W = V_2 \cup \dots \cup V_m$. Now apply the
inductive hypothesis to the compact set $\overline{W}_1$ and its open cover $\{V_2, \dots, V_m\}$.
By lemma 8.4.3, there exist functions $g_i \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $\overline{U}_i \prec g_i \prec V_i$. Define
Dicing ℝⁿ
This partitions each interval [m, m + 1) (m ∈ ℤ) into $2^k$ congruent half-open intervals,
each of length $\frac{1}{2^k}$.
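The above partition of ℝ can be employed to partition ℝⁿ into a collection of
half-open cubes:
$$\mathcal{S}_k = \Big\{ \sigma = \Big[\frac{\nu_1}{2^k}, \frac{\nu_1 + 1}{2^k}\Big) \times \dots \times \Big[\frac{\nu_n}{2^k}, \frac{\nu_n + 1}{2^k}\Big) : (\nu_1, \dots, \nu_n) \in \mathbb{Z}^n \Big\}.$$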
Note that, for $\sigma \in \mathcal{S}_k$, $\mathrm{diam}(\sigma) = \sqrt{n}\,2^{-k}$ and that if 𝜎 and 𝜎′ are distinct cubes in
$\mathcal{S}_k$, then 𝜎 ∩ 𝜎′ = ∅.
Observe that the half-open unit cube [0, 1) × ... × [0, 1) is the union of $2^{nk}$ cubes
in $\mathcal{S}_k$, that $\mathcal{S}_{k+1}$ is a refinement of $\mathcal{S}_k$, and that each cube in $\mathcal{S}_k$ is the union of $2^n$
cubes in $\mathcal{S}_{k+1}$.
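The counting claims can be verified by enumeration for small n and k. In the sketch below, each half-open cube is identified with its lower-left corner; the function names `cubes_in_unit_cube` and `children` are hypothetical, and the values n = 3, k = 2 are chosen only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

def cubes_in_unit_cube(n, k):
    """Lower-left corners (as Fractions) of the cubes of S_k inside [0, 1)^n."""
    return [tuple(Fraction(v, 2 ** k) for v in nu)
            for nu in product(range(2 ** k), repeat=n)]

def children(corner, k):
    """Corners of the cubes of S_{k+1} contained in the cube of S_k at `corner`."""
    half = Fraction(1, 2 ** (k + 1))
    return [tuple(c + e * half for c, e in zip(corner, eps))
            for eps in product((0, 1), repeat=len(corner))]

n, k = 3, 2
level_k = cubes_in_unit_cube(n, k)
level_k1 = set(cubes_in_unit_cube(n, k + 1))
# Subdivide every cube of level k and collect the resulting corners.
refined = [child for corner in level_k for child in children(corner, k)]
```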
[Figure 8.2: panels (a) and (b).]
This construction should be geometrically obvious. The set 𝒮k (V) is the largest set
of cubes in 𝒮k that fits inside V. It is also clear that 𝒮k+1 (V) is a refinement of 𝒮k (V)
that also contains all the additional cubes in 𝒮k+1 that fit in V. Figure 8.2 illustrates
the construction: figure 8.2(a) depicts all the squares of side length 1/8 that fit in the
unit disk U, and figure 8.2(b) shows all the squares of side length 1/16 that fit in the
disk. The unions of these squares are G3 (U) and G4 (U), respectively.
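The squares in such a picture can be counted directly. The sketch below makes an assumption that is ours and may differ in boundary cases from the definition in the text: a square is counted when its closure lies in the open unit disk, which, by convexity of the disk, happens exactly when all four of its corners have norm less than 1. The function name `count_inner_squares` is hypothetical.

```python
import math

def count_inner_squares(k):
    """Count squares of side 2^-k from the dyadic grid that lie in the unit disk.

    Assumption for this sketch: a square counts when its closure lies in the
    open disk, i.e. when all four corners (i, j) / 2^k have norm < 1.
    """
    scale = 2 ** k

    def inside(i, j):
        # Integer test for |(i, j) / 2^k| < 1, free of rounding error.
        return i * i + j * j < scale * scale

    return sum(
        1
        for i in range(-scale, scale)
        for j in range(-scale, scale)
        if inside(i, j) and inside(i + 1, j) and inside(i, j + 1) and inside(i + 1, j + 1)
    )

# Total area of the counted squares at levels k = 1, ..., 7.
areas = [count_inner_squares(k) / 4 ** k for k in range(1, 8)]
```

The areas increase with k and stay below π, the area of the disk, in line with the geometric picture of the squares filling the disk from inside.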
Proof. Let $B_1 = G_1$, and, for k ≥ 1, let $B_{k+1} = G_{k+1} - G_k$. The family $\{B_k\}$ is mutually
disjoint, and $\bigcup_{k=1}^{\infty} B_k = V$. Each $B_k$ is the union of cubes in $\mathcal{S}_k$. The collection of
all such cubes is countable, and their union (over k ∈ ℕ) is V. Renumbering those
cubes as $\sigma_1, \sigma_2, \dots$, we obtain $V = \bigcup_{k=1}^{\infty} \sigma_k$. Finally, consider two distinct cubes, $\sigma_i$
and $\sigma_j$. If $\sigma_i \subseteq B_r$, where $\sigma_j \subseteq B_s$ and r ≠ s, then $\sigma_i \cap \sigma_j = \emptyset$ because $B_r \cap B_s = \emptyset$
if r ≠ s. If $\sigma_i$ and $\sigma_j$ are subsets of $B_r$, for some integer r, then $\sigma_i \cap \sigma_j = \emptyset$ because
the cubes in $\mathcal{S}_r$ are disjoint.
This subsection is included for the sole purpose of building the reader’s intuition.
It is not meant to be a rigorous development of any particular set of ideas.
It must be emphasized at the outset that the Lebesgue measure is not an artificial
construct but rather a very natural kind of measure, as the reader will see below.
The broad goals are intuitively clear; we wish to find a large enough 𝜎-algebra
ℒn in ℝⁿ and a positive measure 𝜆 on ℒn that extends and is consistent with our
common geometric perceptions about length, area, and volume. It is therefore
entirely reasonable to expect (indeed, require) that every closed box Q must
be in ℒn and that the Lebesgue measure of such a box must be the product of its
dimensions, consistent with our definition of the volume of a closed box in section
8.1. Surprisingly, those two simple requirements allow us to achieve most of our
broad goals. Because every open subset of ℝⁿ is a countable union of closed boxes,
every open subset of ℝⁿ must be in ℒn ; hence ℒn contains all Borel subsets of ℝⁿ.
The requirement that 𝜆(Q) = vol(Q) uniquely extends the Lebesgue measure to
all open sets, as we explain below.
In theorem 8.4.5, if we define $K_i = \bigcup_{j=1}^{i} \sigma_j$, then $\lambda(K_i) = \sum_{j=1}^{i} \mathrm{vol}(\sigma_j)$, and it follows
directly from theorem 8.2.3 that
Another way to view equation (1) is as follows: Since, for any compact subset K of
V, 𝜆(K) ≤ 𝜆(V) and since there is a sequence Ki of compact subsets of V such that
$\lambda(V) = \lim_i \lambda(K_i)$, it must be true that, for an open set V ⊆ ℝⁿ,
We will use a variant of equation (2) as the definition of the Lebesgue outer
measure of an open subset V of ℝn . However, this raises a serious question: why
would we abandon equation (1), which defines 𝜆(V) in terms of the measure
of a sequence of simple compact subsets of V, in favor of equation (2), which
involves the measure of general compact sets? In other words, how do we define
the measure of an arbitrary compact subset K of ℝn ? The answer is, we do not!
We use the Riemann integral as an instrument for the approximation of 𝜆(K) for a
compact subset K of V, and this is why Urysohn’s lemma is crucially important for
our development of the Lebesgue measure. Figures 8.4 and 8.5 illustrate the idea.
In figure 8.4, the outer disk depicts the open set V, and the inner disk depicts a
compact subset K of V. If f is a continuous function such that K ≺ f ≺ V, then the
Riemann integral ∫ℝn f (x)dx can be regarded as an approximation of both 𝜆(K)
and 𝜆(V). Figure 8.5 further illustrates the point. In that figure, the measure of K is
the volume of the cylinder above K, which differs from ∫ℝn f (x)dx by the volume
of the thin shell between the cylinder and the wall of the graph of f. Since we can
construct a compact subset K that fills as much of V as we wish (the compact sets Ki
in equation (1)), ∫ℝn f (x)dx can be used to simultaneously approximate 𝜆(K) and
𝜆(V) with arbitrary precision. We hope that the preceding discussion motivates
the definition below of the outer measure of an open subset of ℝn .
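A small numerical sketch of this sandwich in one dimension may help. It assumes V = (−1, 1) and K = [−1 + 𝛿, 1 − 𝛿], and uses a piecewise-linear function as a stand-in for the function supplied by Urysohn's lemma; the function names, the choice of 𝛿, and the midpoint rule are illustrative assumptions, not part of the text.

```python
def urysohn_trapezoid(x, delta):
    """Continuous f with f = 1 on K = [-1+delta, 1-delta] and supp(f) = [-1+delta/2, 1-delta/2]."""
    distance_to_K = max(0.0, abs(x) - (1.0 - delta))
    return max(0.0, 1.0 - distance_to_K / (delta / 2.0))

def riemann_midpoint(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] with n equal sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

for delta in (0.5, 0.1, 0.01):
    integral = riemann_midpoint(lambda x: urysohn_trapezoid(x, delta), -1.0, 1.0, 20000)
    length_K, length_V = 2.0 - 2.0 * delta, 2.0
    # The integral sits between the length of K and the length of V.
    print(delta, length_K, integral, length_V)
```

As 𝛿 shrinks, the three printed quantities are squeezed together, which is the intuition behind using the integrals of such functions to define the outer measure of V.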
⁶ Equation (1) can be written more explicitly as $\lambda(V) = \lim_k \frac{\mathrm{Card}(\mathcal{S}_k(V))}{2^{nk}}$. This is a perfectly viable
approach, and some recent books have adopted this as the definition of the measure of an open subset
of ℝn . Observe that this definition accepts as an axiom the fact that the measure of the half-open cube is
the product of its dimensions; hence the quantity $\frac{1}{2^{nk}}$. Another implied assumption is that all the cubes
in 𝒮k (V) have the same measure. This is the seed of the translation invariance of the Lebesgue measure.
See the proof of theorem 8.4.14.
In figure 8.4 we depict V as a bounded open set. However, the discussion points
are valid even for open sets of infinite measure. Specifically, this means that if V
is an open set of infinite measure, then V contains compact subsets of arbitrarily
large measure.
Lebesgue Measure
As explained in the above motivation, the Riemann integral will play a pivotal role
in our development of the Lebesgue measure. For a function f ∈ 𝒞c (ℝn ), let Q be
a closed box that contains supp(f ), and define
\[ \int_{\mathbb{R}^n} f(x)\,dx = \int_{Q} f(x)\,dx. \]
The Riemann integral is clearly a positive linear functional on 𝒞c (ℝn ). For the
remainder of this section, we will use the following notation: for a function
f ∈ 𝒞c (ℝn ),
Proof. The monotonicity of m∗ is obvious. First we show that, for open sets V1 , … , Vm ,
$m^*(\cup_{i=1}^{m} V_i) \le \sum_{i=1}^{m} m^*(V_i)$. Let $f \prec \cup_{i=1}^{m} V_i$. By lemma 8.4.4, there exist functions
$h_i \prec V_i$, 1 ≤ i ≤ m, such that $h_1(x) + \dots + h_m(x) = 1$ for every x ∈ supp(f ).
Since $h_i f \prec V_i$, we have $I(h_i f) \le m^*(V_i)$. Now $I(f) = I(\sum_{i=1}^{m} h_i f) = \sum_{i=1}^{m} I(h_i f) \le \sum_{i=1}^{m} m^*(V_i)$. This shows that
$m^*(\cup_{i=1}^{m} V_i) \le \sum_{i=1}^{m} m^*(V_i)$.
We now show that m∗ is countably subadditive. Let (Ei ) be a sequence of subsets
of ℝn . We must prove that $m^*(\cup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} m^*(E_i)$. If $\sum_{i=1}^{\infty} m^*(E_i) = \infty$, there
is nothing to prove, so assume that $\sum_{i=1}^{\infty} m^*(E_i) < \infty$. Let 𝜖 > 0, and choose
open sets Vi such that Ei ⊆ Vi and $m^*(V_i) < m^*(E_i) + \epsilon/2^i$. Let $V = \cup_{i=1}^{\infty} V_i$,
and let f ≺ V, that is, $K = \mathrm{supp}(f) \subseteq \cup_{i=1}^{\infty} V_i$. The compactness of K produces a
finite subcover V1 , … , Vm of K. Now
\[ I(f) \le m^*(V_1 \cup \dots \cup V_m) \le \sum_{i=1}^{m} m^*(V_i) \le \sum_{i=1}^{\infty} m^*(V_i) \le \sum_{i=1}^{\infty} \left[ m^*(E_i) + \epsilon/2^i \right] = \sum_{i=1}^{\infty} m^*(E_i) + \epsilon. \]
Since the last inequality is true for an arbitrary f ≺ V, $m^*(V) \le \sum_{i=1}^{\infty} m^*(E_i) + \epsilon$. Since $\cup_{i=1}^{\infty} E_i \subseteq V$ and
m∗ is monotone, $m^*(\cup_{i=1}^{\infty} E_i) \le m^*(V) \le \sum_{i=1}^{\infty} m^*(E_i) + \epsilon$. Because 𝜖 is arbitrary,
$m^*(\cup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} m^*(E_i)$.
Example 3. The following facts follow directly from example 2. The outer measure
of a countable subset of ℝ is zero. The outer measure of the closed interval [a, b]
is b − a.
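One way to see the first claim is sketched below. It assumes that example 2 (not reproduced here) gives m∗((a, b)) = b − a for open intervals, and combines that with the monotonicity and countable subadditivity of m∗. For a countable set {x1 , x2 , …} ⊆ ℝ and 𝜖 > 0:

```latex
\[
m^*(\{x_1, x_2, \dots\})
\le \sum_{i=1}^{\infty} m^*\!\left( \left( x_i - \frac{\epsilon}{2^{i+1}},\; x_i + \frac{\epsilon}{2^{i+1}} \right) \right)
= \sum_{i=1}^{\infty} \frac{\epsilon}{2^{i}}
= \epsilon .
\]
```

Since 𝜖 is arbitrary, the outer measure of the countable set is zero.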
m∗ (A) = m∗ (A ∩ E) + m∗ (A ∩ E′ ).
The immediate task now is to show that every open subset of ℝn is Lebesgue
measurable (theorem 8.4.9). We first need to establish the finite additivity of m∗
for compact and open sets.
Theorem 8.4.7.
(a) If K is compact, then
Proof. Let K ≺ f. If 0 < 𝛼 < 1, then the set $V_\alpha = \{x \in \mathbb{R}^n : f(x) > \alpha\}$ is open and
contains K. Now if $g \prec V_\alpha$, then $\alpha g \le f$, and $m^*(K) \le m^*(V_\alpha) = \sup\{I(g) : g \prec V_\alpha\} \le \frac{1}{\alpha} I(f)$.
Letting 𝛼 → 1, we obtain $m^*(K) \le I(f)$. Let 𝜖 > 0. There exists
an open set V containing K such that m∗ (V) < m∗ (K) + 𝜖. Choose a function
f ∈ 𝒞rc (ℝn ) such that K ≺ f ≺ V. Then I(f ) ≤ m∗ (V) < m∗ (K) + 𝜖. This establishes
part (a).
To prove part (b), let 𝜖 > 0. By part (a), there exists a function g ∈ 𝒞rc (ℝn )
such that K1 ∪ K2 ≺ g, and I(g) < m∗ (K1 ∪ K2 ) + 𝜖. By lemma 8.4.2, there exists
an open subset W with compact closure such that $K_1 \subseteq W \subseteq \overline{W} \subseteq \mathbb{R}^n - K_2$.
By theorem 8.4.3, there exists a function f ∈ 𝒞rc (ℝn ) such that K1 ≺ f ≺ W. In
particular, f(K1 ) = 1, and f(K2 ) = 0. Note that K1 ≺ fg and that K2 ≺ (1 − f )g.
Now m∗ (K1 ) + m∗ (K2 ) ≤ I(fg) + I(g − fg) = I(g) < m∗ (K1 ∪ K2 ) + 𝜖. Since 𝜖 is
arbitrary, m∗ (K1 ) + m∗ (K2 ) ≤ m∗ (K1 ∪ K2 ). Now the subadditivity of m∗ delivers
the result.
Proof. Let 𝛼 < m∗ (V). By the definition of m∗ (V), there exists a function f ∈ 𝒞rc (ℝn )
such that f ≺ V and I(f ) > 𝛼. Let K = supp(f ). If K ⊆ W for some open set W, then
f ≺ W, so I(f ) ≤ m∗ (W). This shows that
Thus 𝛼 < m∗ (K) ≤ m∗ (V). This proves part (a). Observe that this proof is valid
even when m∗ (V) = ∞.
To prove (c), we may, without loss of generality, assume that V1 and V2 have
finite outer measure. Let 𝜖 > 0. By part (a), there exist compact sets K1 ⊆ V1 and K2 ⊆ V2
such that m∗ (Vi ) < m∗ (Ki ) + 𝜖/2, i = 1, 2. The set K = K1 ∪ K2 is compact, and
K1 ∪ K2 ⊆ V1 ∪ V2 . Now m∗ (V1 ) + m∗ (V2 ) ≤ m∗ (K1 ) + m∗ (K2 ) + 𝜖 = m∗ (K1 ∪
K2 ) + 𝜖 ≤ m∗ (V1 ∪ V2 ) + 𝜖. Since 𝜖 is arbitrary, and m∗ is subadditive, m∗ (V1 ) +
m∗ (V2 ) = m∗ (V1 ∪ V2 ).
(a) For 𝜖 > 0, there exists a closed subset F and an open subset V such that
F ⊆ E ⊆ V and 𝜆(V − F) < 𝜖.
(b) 𝜆(E) = sup{𝜆(K) ∶ K compact, K ⊆ E}.
(c) There exists an F𝜎 set A and a G𝛿 set B such that A ⊆ E ⊆ B and 𝜆(B − A) = 0.
Proof. ℝn is the countable union of the nest of compact balls Ki = B(0, i), i ∈
ℕ. For each i ∈ ℕ, 𝜆(Ki ∩ E) < ∞. Thus there exist open sets Vi ⊇ Ki ∩ E
such that $\lambda(V_i - (K_i \cap E)) < \epsilon/2^{i+1}$. Let $V = \cup_{i=1}^{\infty} V_i$. Then E ⊆ V,
$V - E \subseteq \cup_{i=1}^{\infty} (V_i - (K_i \cap E))$, and 𝜆(V − E) < 𝜖/2. Applying this result to E′ , we can
find an open set W containing E′ such that 𝜆(W − E′ ) < 𝜖/2. Let F = W′ .
Then F is closed, F ⊆ E, and E − F = W − E′ , so 𝜆(V − F) = 𝜆(V − E) + 𝜆(E − F) < 𝜖.
This proves part (a).
If F is closed, $F = \cup_{i=1}^{\infty} (K_i \cap F)$. Each Ki ∩ F is compact and limi 𝜆(Ki ∩ F) = 𝜆(F ).
Thus (b) holds for closed subsets of ℝn . If E ∈ ℒn and 𝜖 > 0, by part (a) we
can choose a closed set F ⊆ E such that 𝜆(E − F) < 𝜖/2. If 𝜆(F ) = ∞, then
sup{𝜆(K) ∶ K compact, K ⊆ E} ≥ sup{𝜆(K) ∶ K compact, K ⊆ F} = ∞. If 𝜆(F ) < ∞,
there exists a compact subset K of F such that 𝜆(F − K) < 𝜖/2. Now 𝜆(E) =
𝜆(K) + 𝜆(E − F) + 𝜆(F − K) < 𝜆(K) + 𝜖.
To prove part (c), find open sets Vi and closed sets Fi such that Fi ⊆ E ⊆ Vi
and 𝜆(Vi − Fi ) < 1/i. Set $A = \cup_{i=1}^{\infty} F_i$, and $B = \cap_{i=1}^{\infty} V_i$. Then 𝜆(B − A) < 1/i for
every i ∈ ℕ; hence 𝜆(B − A) = 0. Observe that these results are valid even when
𝜆(E) = ∞.
Corollary 8.4.11. ℒn is the smallest 𝜎-algebra that contains ℬn and all sets of
Lebesgue (outer) measure 0.
Proof. We have already seen that ℬn ⊆ ℒn and that all subsets of Lebesgue outer
measure 0 are Lebesgue measurable. We show that if 𝔐 is a 𝜎-algebra containing
ℬn and all subsets of Lebesgue measure 0, then ℒn ⊆ 𝔐. If E ∈ ℒn , then, by
theorem 8.4.10, there exists an F𝜎 set A such that A ⊆ E and 𝜆(E − A) = 0. Thus
E = A ∪ (E − A), where A ∈ ℬn and 𝜆(E − A) = 0; hence E ∈ 𝔐.
$Q = \{x \in \mathbb{R}^n : a_i \le x_i \le b_i\}$
is, by definition, $\mathrm{vol}(Q) = \prod_{i=1}^{n} (b_i - a_i)$.
\[ \lambda(Q) = \mathrm{vol}(Q) = \prod_{i=1}^{n} (b_i - a_i). \]
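Proof. Let f ≺ Q. Then $I(f) = \int_Q f(x)\,dx \le \int_Q 1\,dx = \mathrm{vol}(Q)$. Thus 𝜆(Q) ≤ vol(Q).
For small-enough positive constants 𝜖, define $Q_\epsilon = [a_1 + \epsilon, b_1 - \epsilon] \times \dots \times [a_n + \epsilon, b_n - \epsilon]$.
There exists a function f ∈ 𝒞rc (ℝn ) such that $Q_\epsilon \prec f \prec Q$. Therefore
\[ \lambda(Q) \ge I(f) = \int_Q f(x)\,dx \ge \int_{Q_\epsilon} f(x)\,dx = \int_{Q_\epsilon} 1\,dx = \mathrm{vol}(Q_\epsilon) = \prod_{i=1}^{n} (b_i - a_i - 2\epsilon). \]
Since 𝜖 is arbitrary, $\lambda(Q) \ge \prod_{i=1}^{n} (b_i - a_i) = \mathrm{vol}(Q)$.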
Example 6. Consider an open box Q = (a1 , b1 ) × ... × (an , bn ), and let $\overline{Q}$ be the
closed box [a1 , b1 ] × ... × [an , bn ]. For every k ∈ ℕ, let Qk be the open box
$(a_1 - \frac{1}{k}, b_1 + \frac{1}{k}) \times \dots \times (a_n - \frac{1}{k}, b_n + \frac{1}{k})$. Since {Qk } is a descending sequence and
$\overline{Q} = \cap_{k=1}^{\infty} Q_k$,
\[ \lambda(\overline{Q}) = \lim_k \lambda(Q_k) = \lim_k \prod_{i=1}^{n} \left( b_i - a_i + \frac{2}{k} \right) = \prod_{i=1}^{n} (b_i - a_i) = \mathrm{vol}(Q) = \lambda(Q). \]
Therefore $\lambda(\overline{Q}) = \lambda(Q)$, and the boundary of any box has
Lebesgue measure zero. Therefore the Lebesgue measure of any box (open,
closed, or half-open) is the product of its dimensions. We will continue to refer
to the Lebesgue measure of a box as its volume.
Let Br be the open ball of radius r centered at the origin, and let B be
the open unit ball. By lemma 8.4.5, $B = \cup_{i=1}^{\infty} \sigma_i$, where {𝜎i } is a sequence of
disjoint half-open cubes. Because $B_r = rB = \cup_{i=1}^{\infty} r\sigma_i$,
\[ \lambda(B_r) = \sum_{i=1}^{\infty} \lambda(r\sigma_i) = \sum_{i=1}^{\infty} r^n \lambda(\sigma_i) = r^n \lambda(B) = c_n r^n. \]
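The scaling argument can be checked by counting cubes in the plane (n = 2). The sketch below is only an illustration under stated assumptions: it counts grid squares whose closure lies in the open disk, which gives an inner approximation of the area rather than the exact decomposition of lemma 8.4.5, and the function names and grid levels are hypothetical.

```python
from fractions import Fraction
import math

def inner_cube_count(R):
    """Count unit grid squares [i, i+1) x [j, j+1) whose closure lies in the open disk of radius R."""
    count = 0
    for i in range(-R, R):
        far_x = max(i * i, (i + 1) * (i + 1))      # squared x-distance of the farthest corner
        for j in range(-R, R):
            far_y = max(j * j, (j + 1) * (j + 1))  # squared y-distance of the farthest corner
            if far_x + far_y < R * R:
                count += 1
    return count

def inner_area(radius, k):
    """Total area of squares of side 2**-k inside the disk of the given integer radius."""
    R = radius * 2**k            # disk radius measured in grid units
    return Fraction(inner_cube_count(R), 4**k)

area_unit = inner_area(1, 7)     # squares of side 1/128 inside the unit disk
area_double = inner_area(2, 6)   # the same squares scaled by r = 2
print(float(area_unit), float(area_double), math.pi)
```

Scaling every square by r = 2 multiplies its area by 2^2, so the second estimate is exactly four times the first, and the first approaches 𝜆(B) = c2 = π as the grid is refined.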
We are now ready to prove that the Riemann integral of a function of compact
support is the same as its integral with respect to Lebesgue measure.
Proof. It is sufficient to prove the result for a positive function f. Let Q be a half-open
cube containing supp(f ) in its interior. Consider the partition of Q into
$2^{nk}$ congruent, half-open sub-cubes $\{\sigma_1, \dots, \sigma_{2^{nk}}\}$, and let $s_k(f) = \sum_{i=1}^{2^{nk}} f_{\sigma_i} \mathrm{vol}(\sigma_i)$
be the lower Riemann sums of f. Here $f_{\sigma_i} = \inf\{f(x) : x \in \sigma_i\}$. By theorem
8.1.2, $\lim_{k \to \infty} s_k(f) = \int_{\mathbb{R}^n} f(x)\,dx$. On the other hand, the simple functions
$f_k(x) = \sum_{i=1}^{2^{nk}} f_{\sigma_i} \chi_{\sigma_i}(x)$ satisfy 0 ≤ f1 ≤ f2 ≤ … , and $\lim_k f_k(x) = f(x)$ for every x ∈ ℝn .
By the monotone convergence theorem, $\lim_k \int_{\mathbb{R}^n} f_k\,d\lambda = \int_{\mathbb{R}^n} f\,d\lambda$. But
$\int_{\mathbb{R}^n} f_k\,d\lambda = \sum_{i=1}^{2^{nk}} f_{\sigma_i} \lambda(\sigma_i) = \sum_{i=1}^{2^{nk}} f_{\sigma_i} \mathrm{vol}(\sigma_i) = s_k(f)$. Therefore, $\lim_k s_k(f) = \int_{\mathbb{R}^n} f\,d\lambda$.
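The monotone behavior of the lower sums in this proof can be observed directly in one dimension. The sketch below is illustrative only: the tent function, the cube Q = [−2, 2), and the function names are assumptions made for the example, and exact rational arithmetic is used so that the sums can be compared without rounding.

```python
from fractions import Fraction

def tent(x):
    """A continuous function with compact support [-1, 1] and integral 1."""
    return max(Fraction(0), 1 - abs(x))

def lower_sum(k):
    """Lower Riemann sum of tent over Q = [-2, 2) split into 2**k congruent half-open intervals.

    For k >= 2 the points -1, 0 and 1 are partition points, so tent is monotone
    on each sub-interval and its infimum there is the smaller endpoint value.
    """
    n = 2**k
    h = Fraction(4, n)
    total = Fraction(0)
    for i in range(n):
        left = -2 + i * h
        total += min(tent(left), tent(left + h)) * h
    return total

sums = [lower_sum(k) for k in range(2, 10)]
print([float(s) for s in sums])
```

The printed values increase toward 1, the Riemann integral of the tent function, mirroring the increasing sequence of simple functions to which the monotone convergence theorem is applied.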
The previous theorem is commonly cast in the following language: the Lebesgue
integral extends the Riemann integral from 𝒞c (ℝn ) to 𝔏1 (ℝn , ℒn , 𝜆). We also
say that the Lebesgue measure 𝜆 represents the positive linear functional I(f ) =
∫ℝn f (x)dx.
Proof. It is easy to see that vol(Q + x) = vol(Q) for every box Q. Now let V be an
open subset of ℝn . By lemma 8.4.5, we can write V as a disjoint union of half-open
cubes, $V = \cup_{i=1}^{\infty} \sigma_i$. Thus $V + x = \cup_{i=1}^{\infty} (\sigma_i + x)$, and
\[ \lambda(V + x) = \lambda(\cup_{i=1}^{\infty} (\sigma_i + x)) = \sum_{i=1}^{\infty} \lambda(\sigma_i + x) = \sum_{i=1}^{\infty} \lambda(\sigma_i) = \lambda(V). \]
Thus the result holds for open subsets
of ℝn . The general result for an arbitrary measurable set E follows from the special
case we just established and the fact that 𝜆(E) = inf{𝜆(V) ∶ E ⊆ V, V open}. See
the definition of the Lebesgue outer measure.
\[ \int_{\mathbb{R}^n} f(x)\,dx = \int_{\mathbb{R}^n} f\,d\lambda \]
(a) outer regular if, for every E ∈ 𝔐, 𝜇(E) = inf{𝜇(V) ∶ E ⊆ V, V open}, and
(b) inner regular if, for every E ∈ 𝔐, 𝜇(E) = sup{𝜇(K) ∶ K ⊆ E, K compact}.
Lebesgue measure is outer regular by the very definition of the Lebesgue outer
measure, m∗ . Theorem 8.4.10(b) states that Lebesgue measure is inner regular.
We conclude this section with two uniqueness results that characterize Lebesgue
measure.
Theorem 8.4.16. Let 𝜇 be a regular measure on ℒn such that 𝜇(K) < ∞ for
every compact subset K of ℝn , and ∫ℝn fd𝜇 = ∫ℝn f (x)dx for every f ∈ 𝒞c (ℝn ).
Then 𝜇 = 𝜆.
Proof. It is sufficient to prove that 𝜇(K) = 𝜆(K) for every compact set K. The result
then follows from the regularity of 𝜆 and 𝜇.
Let 𝜖 > 0. There exists an open subset V such that K ⊆ V and 𝜇(V) < 𝜇(K) + 𝜖.
Let f ∈ 𝒞rc (ℝn ) be such that K ≺ f ≺ V. Then 𝜆(K) = ∫ℝn 𝜒K d𝜆 ≤ ∫ℝn fd𝜆 =
∫ℝn f (x)dx = ∫ℝn fd𝜇 ≤ ∫ℝn 𝜒V d𝜇 = 𝜇(V) < 𝜇(K) + 𝜖. Since 𝜖 is arbitrary,
𝜆(K) ≤ 𝜇(K). Switching the roles of 𝜆 and 𝜇, we obtain 𝜇(K) ≤ 𝜆(K).
Proof. Let Q be the half-open unit cube $[0, 1)^n$. For a fixed k ∈ ℕ, partition Q
into $2^{nk}$ congruent half-open sub-cubes $\sigma_1, \dots, \sigma_{2^{nk}}$ in 𝒮k . Each cube in 𝒮k is a
translation of any other cube in 𝒮k ; therefore, by assumption, 𝜇(𝜎i ) = 𝜇(𝜎1 ) for
$i = 1, 2, \dots, 2^{nk}$. Since $\sigma_1, \dots, \sigma_{2^{nk}}$ are disjoint, $c = \mu(Q) = 2^{nk} \mu(\sigma_1)$. Also
$c = c \cdot 1 = c\lambda(Q) = c \sum_{i=1}^{2^{nk}} \lambda(\sigma_i) = c\,2^{nk} \lambda(\sigma_1)$. It follows that 𝜇(𝜎1 ) = c𝜆(𝜎1 ); hence 𝜇(𝜎) =
c𝜆(𝜎) for every cube 𝜎 in 𝒮k . Since k is arbitrary, 𝜇(𝜎) = c𝜆(𝜎) for any cube
$\sigma \in \cup_{k=1}^{\infty} \mathcal{S}_k$. By lemma 8.4.5, an arbitrary open set V is a countable union of
disjoint cubes in $\cup_{k=1}^{\infty} \mathcal{S}_k$. The countable additivity of 𝜆 and 𝜇 produces 𝜇(V) =
c𝜆(V) for every open subset V.
Now let E ∈ ℒn . For an open set V ⊇ E, 𝜇(E) ≤ 𝜇(V) = c𝜆(V). Hence
𝜇(E) ≤ c inf{𝜆(V) ∶ V ⊇ E, V open} = c𝜆(E). To show that 𝜇(E) ≥ c𝜆(E), we first
assume that E is bounded. Choose a large enough open box Ω that contains E.
Then 𝜇(Ω) − 𝜇(E) = 𝜇(Ω − E) ≤ c𝜆(Ω − E) = c𝜆(Ω) − c𝜆(E) = 𝜇(Ω) − c𝜆(E).
Thus 𝜇(E) ≥ c𝜆(E).
If E is unbounded, let Bi be the open ball of radius i and centered at the origin.
Then $E = \cup_{i=1}^{\infty} (E \cap B_i)$, and 𝜇(E) = limi 𝜇(E ∩ Bi ) = c limi 𝜆(E ∩ Bi ) = c𝜆(E).
A close examination of the constructions and the results of this section so far
reveals that most of the theory we developed can be extended to locally compact
Hausdorff spaces. Specifically, if ℝn is replaced with a locally compact Hausdorff
space X and the Riemann integral is replaced with a positive linear functional I on
𝒞c (X), then we can construct a measure 𝜇 that represents I and has most (but not
all) of the regularity properties we derived for Lebesgue measure.⁷
The following results can be established by replicating the proofs of the corre-
sponding results for the Lebesgue integral. Theorem 5.9.2 must be used instead of
theorem 8.4.2, and lemma 5.11.6 instead of theorem 8.4.3. The proof we included
for lemma 8.4.4 is valid for any locally compact Hausdorff space; hence we state the
lemma below for the sake of completeness. We urge the reader to scrutinize our
claim that the proofs of the theorems below for Radon measures are identical to
those provided for the Lebesgue measure. The exercise is illuminating.
Throughout this subsection, X is a locally compact Hausdorff space, and I is a
positive linear functional on 𝒞c (X). Explicitly, for f, g ∈ 𝒞c (X), and 𝛼, 𝛽 ∈ 𝕂, I(𝛼f +
𝛽g) = 𝛼I(f ) + 𝛽I(g), and if f ≥ 0, then I(f ) ≥ 0. Observe that such a functional is
monotone in the sense that if f ≤ g, then I(f ) ≤ I(g).
We continue to use the notation K ≺ f ≺ V to indicate that f ∈ 𝒞rc (X), 0 ≤ f ≤ 1,
f(K) = 1, and supp(f ) ⊆ V.
The Radon outer measure induced by the positive linear functional I is the set
function
m∗ ∶ 𝒫(X) → [0, ∞],
defined as follows: for an open set V ⊆ X,
m∗ (A) = m∗ (A ∩ E) + m∗ (A ∩ E′ ).
By theorem 8.2.7, the set 𝔐 of measurable sets is a 𝜎-algebra, and the restric-
tion of m∗ to 𝔐 is a complete positive measure: the Radon measure on 𝔐
induced by I . We will reserve the notation 𝜇(E) exclusively to denote the 𝜇-
measure of a set E ∈ 𝔐. We continue to write m∗ (E) for the outer measure of a
set E whose Radon measurability has not been established.
Theorem 8.4.20.
(a) If K is compact, then
We now arrive at the main distinction between the Lebesgue measure and general
Radon measures. Part (a) of theorem 8.4.21 states that 𝜇 is inner regular on
open subsets of X. Inner regularity does not extend to all Radon measurable sets,
however. But we do have the following result.
Proof. Let 𝜖 > 0, and choose an open set U ⊇ E such that 𝜇(U) < 𝜇(E) + 𝜖.
Since 𝜇(U − E) = 𝜇(U) − 𝜇(E) < 𝜖, there exists an open set V ⊇ U − E such
that 𝜇(V) < 𝜖. By theorem 8.4.21, U contains a compact subset H such that
𝜇(U) < 𝜇(H) + 𝜖. The set K = H − V is clearly compact, and K ⊆ U − V ⊆ E.
Now
\[ \sum_{i=1}^{n} (y_i - \epsilon)\mu(E_i) \le \int_X f\,d\mu. \qquad (4) \]
For 1 ≤ i ≤ n, choose open subsets Vi ⊇ Ei such that 𝜇(Vi ) < 𝜇(Ei ) + 𝜖/n and
f (x) < yi + 𝜖 for all x ∈ Vi , and let {hi } be a partition of unity of K subordinate
to {Vi }. Since hi ≺ Vi ,
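Since $f = \sum_{i=1}^{n} h_i f$ and since $h_i f \le (y_i + \epsilon) h_i$, we have (using inequalities (4)
and (5)),
\[ I(f) = \sum_{i=1}^{n} I(h_i f) \le \sum_{i=1}^{n} (y_i + \epsilon) I(h_i) \le \sum_{i=1}^{n} (y_i + \epsilon)(\mu(E_i) + \epsilon/n) \]
\[ = \sum_{i=1}^{n} (y_i - \epsilon)\mu(E_i) + 2\epsilon\mu(K) + \frac{\epsilon}{n} \sum_{i=1}^{n} (y_i + \epsilon) \]
\[ \le \int_X f\,d\mu + 2\epsilon\mu(K) + \frac{\epsilon}{n} \sum_{i=1}^{n} (1 + \epsilon) = \int_X f\,d\mu + 2\epsilon\mu(K) + \epsilon(1 + \epsilon). \]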
To prove the uniqueness part of this theorem, mimic the proof of theorem 8.4.16.
Observe that the proof of theorem 8.4.16 is based only on the outer regularity of
the measures in question and their inner regularity for open sets.
The following theorem provides a sufficient condition for the inner regularity of
Radon measures (for all E ∈ 𝔐). The proof is identical to that of theorem 8.4.10.
(a) For 𝜖 > 0, there exists a closed subset F and an open subset V such that F ⊆
E ⊆ V and 𝜇(V − F) < 𝜖.
(b) 𝜇(E) = sup{𝜇(K) ∶ K compact, K ⊆ E}.
(c) There exists an F𝜎 set A and a G𝛿 set B such that A ⊆ E ⊆ B and
𝜇(B − A) = 0.
Exercises
(b) Prove that E is Lebesgue measurable if and only if, for every 𝜖 > 0, E
contains a closed subset F such that m∗ (E − F) < 𝜖.
The importance of this problem and the next is that they provide more
intuitive characterizations of Lebesgue measurability than the Carathéodory
condition does. Intuitively, a subset E of ℝn is Lebesgue measurable if it can
be approximated from the outside by an open set or from the inside by a
closed set.
9. Let E be a subset of ℝn .
(a) Prove that E is Lebesgue measurable if and only if there exists a G𝛿 set
G containing E such that m∗ (G − E) = 0.
(b) Prove that E is Lebesgue measurable if and only if E contains an F𝜎 set
F such that m∗ (E − F) = 0.
10. In this exercise, we use 𝜆k to denote the Lebesgue measure on ℝk . Let r and
s be positive integers, and let n = r + s. Prove that if U ⊆ ℝr and V ⊆ ℝs are
open sets, then 𝜆n (U × V) = 𝜆r (U)𝜆s (V). Hint: 𝒮k (U × V) = 𝒮k (U) × 𝒮k (V).
11. Let rn be an enumeration of ℚ, and let $G = \cup_{n=1}^{\infty} \left( r_n - \frac{1}{n^2},\; r_n + \frac{1}{n^2} \right)$. Prove
that 𝜆(G Δ F) > 0 for every closed subset F of ℝ. Hint: Show that if
𝜆(G − F) = 0, then F = ℝ.
12. Let f be a continuous function in 𝔏1 (ℝn ). Show that if lim‖x‖→∞ f (x) exists,
then lim‖x‖→∞ f (x) = 0. Also give an example to show that a continuous
positive integrable function need not be bounded.
13. Let f ∈ 𝔏1 (ℝn ), and let a ∈ ℝn be fixed. Define (𝜏a f)(x) = f(x − a). Show
that ∫ℝn fd𝜆 = ∫ℝn (𝜏a f)d𝜆. This is a familiar linear change of variables
formula. Using more conventional notation, ∫ℝn f (x)d𝜆(x) = ∫ℝn f(x −
a)d𝜆(x).
14. For a subset E of ℝn , let −E = {−x ∶ x ∈ E}. Prove that E is measurable if
and only if −E is measurable and, in this case, 𝜆(−E) = 𝜆(E).
15. For r > 0 and E ⊆ ℝn , define rE = {rx ∶ x ∈ E}. Prove that E is measurable
if and only if rE is measurable and that $\lambda(rE) = r^n \lambda(E)$.
16. Let f ∈ 𝔏1 (ℝn ). For r > 0, define fr (y) = f(ry). Show that $\int_{\mathbb{R}^n} f\,d\lambda = r^n \int_{\mathbb{R}^n} f_r\,d\lambda$.
Using more familiar notation, if x = ry, then $d\lambda(x) = r^n\,d\lambda(y)$.
17. Let X be an infinite-dimensional normed linear space. Prove that there
does not exist a translation-invariant measure on ℬ(X) that assigns finite
measure to bounded sets in ℬ(X). Hint: Use Riesz’s theorem to find a
sequence {un } of unit vectors such that ‖ui − uj ‖ ≥ 1/2.
Complex measures do not really measure anything in the strict geometric sense of
the word, but they do share the defining property of a positive measure, namely,
countable additivity. Although they are rather abstract, real and complex measures
have applications in differentiation and probability theories, among many other
applications. We study the notion of differentiating one measure with respect to
another measure, and our main result is the Radon-Nikodym theorem, which we
apply in section 8.6 to study duals of 𝔏p spaces. Although the section results are
limited to the basics, example 2 and the section exercises significantly expand
the scope of the section, where we introduce such topics as the total variation
of real and complex measures, uniform integrability, and measurable dissections.
The properties of the Radon-Nikodym derivatives are also explored in the section
exercises.
Proof. The proofs parallel those of theorem 8.2.3 and are therefore omitted.
Warning. Monotonicity does not hold for real measures. It is possible for a set of
positive measure to contain a subset of negative measure, and conversely. Mono-
tonicity does hold, however, for positive and negative sets: if F is a measurable
subset of a positive set E, then 𝜈(F ) ≤ 𝜈(E).
⁹ This definition is not standard. Most books allow a real measure to take extended real values, ∞
or −∞, but not both. The standard term used in this case is signed measure.
Proposition 8.5.2. A measurable subset of a positive set is positive, and the count-
able union of positive sets is positive. The corresponding statements are true for
negative sets.
Proof. The first assertion follows from the definition. To prove the second assertion,
let (En ) be a sequence of positive measurable subsets of X. Define A1 = E1 ,
and, for n ≥ 2, let $A_n = E_n - \cup_{i=1}^{n-1} E_i$. The sequence An is disjoint, and each An
is a positive set. Now let $E \subseteq \cup_{n=1}^{\infty} E_n = \cup_{n=1}^{\infty} A_n$. Since 𝜈(E ∩ An ) ≥ 0,
$\nu(E) = \sum_{n=1}^{\infty} \nu(E \cap A_n) \ge 0$.
Lemma 8.5.3. Every set of positive measure contains a positive set of positive
measure.
Proof. Suppose, for a contradiction, that S is a set of positive measure that contains
no positive sets of positive measure. We first establish the following:
If A ⊆ S and 𝜈(A) > 0, then there is a subset B of A such that 𝜈(B) > 𝜈(A).
(*)
Since A is not a positive set, A contains a subset C such that 𝜈(C) < 0. Set B =
A − C. Then 𝜈(B) = 𝜈(A) − 𝜈(C) > 𝜈(A). This proves (*).
Set A1 = S, and let n1 be the least natural number for which $\nu(A_1) > \frac{1}{n_1}$. By
(*), A1 contains a set B such that 𝜈(B) > 𝜈(A1 ). Let n2 be the least natural
number for which A1 contains a set B such that $\nu(B) > \nu(A_1) + \frac{1}{n_2}$, and let A2
be such a set. Continue inductively to construct a sequence of natural numbers
n1 , n2 , … , and measurable sets A1 ⊇ A2 ⊇ ... such that nj is the least positive
integer for which $A_{j-1}$ contains a set B with $\nu(B) > \nu(A_{j-1}) + \frac{1}{n_j}$, and Aj is such
a set. Now $\nu(A_3) > \nu(A_2) + \frac{1}{n_3} > \nu(A_1) + \frac{1}{n_2} + \frac{1}{n_3} > \frac{1}{n_1} + \frac{1}{n_2} + \frac{1}{n_3}$. Inductively,
$\nu(A_j) > \sum_{i=1}^{j} \frac{1}{n_i}$. Define $A = \cap_{j=1}^{\infty} A_j$. Then $\infty > \nu(A) = \lim_j \nu(A_j) \ge \sum_{j=1}^{\infty} \frac{1}{n_j}$. In
particular, $\sum_{j=1}^{\infty} \frac{1}{n_j}$ is convergent, and $\lim_j n_j = \infty$. Again by (*), A contains a
subset B such that 𝜈(B) > 𝜈(A), and there is a natural number n such that
$\nu(B) > \nu(A) + \frac{1}{n}$. But there is an integer j such that nj > n. Thus
$\nu(B) > \nu(A) + \frac{1}{n} > \nu(A_{j-1}) + \frac{1}{n}$. This contradicts the definition of nj because $B \subseteq A_{j-1}$.
Proof. We use the notation of the proof of the previous theorem. Let k = 𝜈(N). For a
measurable set E, 𝜈(E) = 𝜈(E ∩ P) + 𝜈(E ∩ N). Since 0 ≤ 𝜈(E ∩ P) ≤ K, and k ≤
𝜈(E ∩ N) ≤ 0, k ≤ 𝜈(E) ≤ K.
Proof. Let (P, N) be a Hahn decomposition of 𝜈, and define 𝜈 + (E) = 𝜈(E ∩ P), and
𝜈− (E) = −𝜈(E ∩ N). The pair 𝜈+ and 𝜈 − has the desired properties since 𝜈 + (N) =
0 = 𝜈 − (P). If 𝜇+ and 𝜇− is another pair satisfying the stated properties with
𝜇+ (M) = 0 = 𝜇− (Q), where Q ∩ M = ∅, Q ∪ M = X, then the pair (Q, M) is a
Hahn decomposition of 𝜈 and hence PΔQ is 𝜈-null. Therefore, for E ∈ 𝔐,
𝜇+ (E) = 𝜇+ (E ∩ Q) + 𝜇+ (E ∩ M) = 𝜇+ (E ∩ Q)
= 𝜇+ (E ∩ Q) − 𝜇− (E ∩ Q) = 𝜈(E ∩ Q) = 𝜈(E ∩ P) = 𝜈 + (E).
Thus 𝜇+ = 𝜈 + ; hence 𝜇− = 𝜈 − .
Definitions. The finite positive measures 𝜈 + and 𝜈− are called the positive and
negative variations of 𝜈, respectively. The finite positive measure |𝜈| = 𝜈 + + 𝜈 −
is called the total variation of 𝜈. Notice that, for every E ∈ 𝔐, |𝜈(E)| ≤ |𝜈|(E).
Define ‖𝜈‖ = |𝜈|(X).
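A finite example makes these definitions tangible. The sketch below assumes a real measure given by point masses on a four-point set; the weights and function names are hypothetical and serve only to illustrate the Hahn decomposition, the variations, and the failure of monotonicity mentioned in the warning above.

```python
from itertools import chain, combinations

# Hypothetical point masses on X = {0, 1, 2, 3}; nu(E) is the sum of the weights in E.
weights = {0: 3, 1: -1, 2: 2, 3: -4}

def nu(E):
    return sum(weights[x] for x in E)

P = {x for x, w in weights.items() if w >= 0}   # a positive set
N = set(weights) - P                            # a negative set

def nu_plus(E):
    """Positive variation: nu restricted to the positive set P."""
    return nu(set(E) & P)

def nu_minus(E):
    """Negative variation: minus nu restricted to the negative set N."""
    return -nu(set(E) & N)

def total_variation(E):
    return nu_plus(E) + nu_minus(E)

def all_subsets(X):
    items = list(X)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

X = set(weights)
print(nu(X), nu_plus(X), nu_minus(X), total_variation(X))
print(nu({0}), nu({0, 1}))   # the larger set has the smaller measure
```

Here 𝜈(X) = 0 while ‖𝜈‖ = |𝜈|(X) = 10, so the total variation records how much mass is present regardless of sign.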
Now let 𝜈 be a complex measure, and let 𝜈r and 𝜈i be the real and imaginary parts
of 𝜈, that is for E ∈ 𝔐, 𝜈(E) = 𝜈r (E) + i𝜈i (E). Clearly, 𝜈r and 𝜈i are real measures;
hence ‖𝜈r ‖ < ∞, ‖𝜈i ‖ < ∞, and |𝜈(E)| ≤ |𝜈r (E)| + |𝜈i (E)| ≤ ‖𝜈r ‖ + ‖𝜈i ‖ < ∞.
Therefore, complex measures, like real measures, are bounded. Notice that the
set of complex measures contains the set of real measures and, in particular, the
set of finite positive measures. The set of complex measures on a 𝜎-algebra 𝔐 is a
vector space under the obvious operations: for complex measures 𝜈 and 𝜇 and for
a complex scalar 𝛼, (𝜈 + 𝜇)(E) = 𝜈(E) + 𝜇(E), and (𝛼𝜈)(E) = 𝛼𝜈(E) (E ∈ 𝔐).
The following theorem generalizes example 1 and provides a rich source of real
and complex measures.
Theorem 8.5.7. Let (X, 𝔐) be a measurable space, and let 𝜇 be a positive (not
necessarily finite) measure on 𝔐. If h ∈ 𝔏1 (𝜇), then the following set function
defines a complex measure on 𝔐:
\[ \nu(E) = \int_E h\,d\mu. \]
\[ \nu(E) = \int_X h\chi_E\,d\mu = \int_X \sum_{n=1}^{\infty} h\chi_{E_n}\,d\mu = \lim_n \int_X \sum_{i=1}^{n} h\chi_{E_i}\,d\mu = \lim_n \sum_{i=1}^{n} \nu(E_i) = \sum_{n=1}^{\infty} \nu(E_n). \]
Definition. Let 𝜇 and 𝜈 be as in theorem 8.5.7. The function h is called the Radon-Nikodym
derivative of 𝜈 with respect to 𝜇, and we symbolically write $h = \frac{d\nu}{d\mu}$,
or d𝜈 = hd𝜇, to indicate that $\nu(E) = \int_E h\,d\mu$. The following theorem justifies the
definition and the notation.
\[ \int_X f\,d\nu = \int_X fh\,d\mu. \]
Theorem 8.5.8 justifies the definition of the Radon-Nikodym derivative and the
notation $h = \frac{d\nu}{d\mu}$. Indeed, this theorem can be stated using the notation
$\int_X f\,d\nu = \int_X f \frac{d\nu}{d\mu}\,d\mu$. Observe that the last formula is reminiscent of the change of variables
formula. Problem 8 at the end of this section is what one might call the chain rule
for Radon-Nikodym derivatives.
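On a finite set the identity reduces to a rearrangement of finite sums, which the following sketch checks with exact arithmetic. The point masses, the density h, and the test function f are hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0, 1, 2}, mu given by point masses.
mu_mass = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(1, 4)}
h = {0: Fraction(3), 1: Fraction(-1), 2: Fraction(8)}      # the density d(nu)/d(mu)

def nu(E):
    """nu(E) = integral over E of h d(mu), a finite sum here."""
    return sum(h[x] * mu_mass[x] for x in E)

def integral_d_nu(f):
    """Integral of f with respect to nu, computed from the point masses nu({x})."""
    return sum(f(x) * nu({x}) for x in mu_mass)

def integral_fh_d_mu(f):
    """Integral of f*h with respect to mu."""
    return sum(f(x) * h[x] * mu_mass[x] for x in mu_mass)

f = lambda x: x * x + 1
print(integral_d_nu(f), integral_fh_d_mu(f))
```

Both integrals agree, as the formula d𝜈 = hd𝜇 suggests; in this discrete setting h(x) is simply the ratio 𝜈({x})/𝜇({x}).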
Notice that if 𝔐, 𝜇, h, and 𝜈 are as in theorem 8.5.7, then 𝜈 << 𝜇. The Radon-
Nikodym theorem, in effect, is the converse of theorem 8.5.7.
Proof. Since 0 < 𝜈(X) < ∞ and 0 < 𝜇(X) < ∞, there exists a positive number 𝜖
such that 𝜈(X) − 𝜖𝜇(X) > 0. Let (P, N) be the Hahn decomposition of the real
measure 𝜈 − 𝜖𝜇. Then P is positive for 𝜈 − 𝜖𝜇. If 𝜇(P) = 0, then 𝜈(P) = 0; hence
𝜈(X) − 𝜖𝜇(X) = 𝜈(N) − 𝜖𝜇(N) ≤ 0, since N is negative for 𝜈 − 𝜖𝜇. This contra-
dicts 𝜈(X) − 𝜖𝜇(X) > 0 and proves that 𝜇(P) > 0.
Proof. The uniqueness of f follows from example 4 in section 8.3.¹⁰ Let 𝔉 be the
following set of measurable functions:
Proof. If 𝜈 is real and 𝜈 << 𝜇, then 𝜈+ << 𝜇 and 𝜈 − << 𝜇. By theorem 8.5.10, we
find positive 𝜇-integrable functions f+ and f− such that, for every E ∈ 𝔐, 𝜈 + (E) =
∫E f+ d𝜇, and 𝜈− (E) = ∫E f− d𝜇. Thus 𝜈(E) = ∫E fd𝜇, where f = f+ − f− ∈ 𝔏1 (𝜇).
If 𝜈 is a complex measure, apply the result we just established to the real and
imaginary parts of 𝜈, since each part is absolutely continuous with respect to 𝜇.
\[ \int_E |f_1|\,d\mu_1 = \int_E |f_1| \frac{d\mu_1}{d\xi}\,d\xi = \int_E |f_2| \frac{d\mu_2}{d\xi}\,d\xi = \int_E |f_2|\,d\mu_2 . \]
Exercises
1. Prove that if P and Q are positive sets for a real measure 𝜈 such that P Δ Q
is 𝜈-null, then 𝜈(E ∩ P) = 𝜈(E ∩ Q) = 𝜈(E ∩ P ∩ Q) for every measurable
set E.
2. Show that, for a real measure 𝜈, $\nu^+ = \frac{1}{2}(|\nu| + \nu)$ and $\nu^- = \frac{1}{2}(|\nu| - \nu)$.
3. Let 𝜈 be a real measure. Prove that if 𝜉 and 𝜂 are finite positive measures
such that 𝜈 = 𝜉 − 𝜂, then 𝜉 ≥ 𝜈 + and 𝜂 ≥ 𝜈 − .
4. Define the following function on the space of real measures on 𝔐: ‖𝜈‖ =
|𝜈|(X). Prove that ‖.‖ is a norm.
5. Show that if 𝜈 is a real measure on 𝔐, then 𝜈 << 𝜇 if and only if 𝜈 + << 𝜇
and 𝜈 − << 𝜇, if and only if |𝜈| << 𝜇.
6. Let f ∈ 𝔏1 (𝜇) be a real-valued function, and define 𝜈(E) = ∫E fd𝜇, (E ∈ 𝔐).
Prove that
(a) the pair (P, N) is a Hahn decomposition of 𝜈, where P = {x ∈ X ∶ f (x) ≥
0}, and N = {x ∈ X ∶ f (x) < 0};
(b) 𝜈 + (E) = ∫E f+ d𝜇, and 𝜈− (E) = − ∫E f− d𝜇; and
(c) |𝜈|(E) = ∫E |f|d𝜇; using our notation for Radon-Nikodym derivatives,
this result can be written as $\frac{d|\nu|}{d\mu} = \left| \frac{d\nu}{d\mu} \right|$.
13. Prove that theorem 8.5.11 is valid when 𝜇 is a 𝜎-finite positive measure and
𝜈 is a complex measure such that 𝜈 << 𝜇.
Here is a proof outline. It is sufficient to prove the result when 𝜈 is a finite
positive measure. For n ∈ ℕ, define two finite positive measures on 𝔐 as
follows: 𝜇n (E) = 𝜇(E ∩ Xn ) and 𝜈n (E) = 𝜈(E ∩ Xn ). Show that 𝜈n << 𝜇n . By
theorem 8.5.10, there exist positive functions hn ∈ 𝔏1 (𝜇n ) such that d𝜈n =
hn d𝜇n . Without loss of generality, hn vanishes outside Xn . Set $h = \sum_{n=1}^{\infty} h_n$.
Argue that h ∈ 𝔏1 (𝜇) and that d𝜈 = hd𝜇.
8.6 𝔏p Spaces
In addition to the function spaces ℬ(X), 𝒞(X), and ℬ𝒞(X), the 𝔏p spaces are
prototypical examples of Banach spaces and play a prominent role in modern
analysis. By far, the most important of the 𝔏p spaces is the Hilbert space
𝔏2 (X, 𝔐, 𝜇), where (X, 𝔐, 𝜇) is a positive measure space, such as a Lebesgue
measure restricted to a subset X of ℝn . The section results parallel those for the
sequence spaces lp . We prove the completeness of 𝔏p and derive the representation
theorem that, for 1 < p < ∞, 𝔏q is the dual of 𝔏p . In fact, the sequence spaces lp
are special cases of the 𝔏p spaces. See problem 1 at the end of this section. The
next section is a continuation of this one.
(a) $|xy| \le \frac{|x|^p}{p} + \frac{|y|^q}{q}$, 1 < p, q < ∞.
(b) $|x + y|^p \le 2^p (|x|^p + |y|^p)$, 1 ≤ p < ∞.
Example 2. We show that $\int_1^{\infty} \frac{e^{-x}}{x}\,dx \le \frac{1}{e\sqrt{2}}$.
Let f (x) = 1/x, and $g(x) = e^{-x}$. Then f and g are in 𝔏2 ((1, ∞)), and $\|f\|_2 = 1$,
$\|g\|_2 = \frac{1}{e\sqrt{2}}$. The desired inequality now follows from the Cauchy-Schwarz
inequality.
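The bound can be sanity-checked numerically. The sketch below uses a plain midpoint rule and truncates the improper integral at x = 40; the helper name, the cutoff, and the number of sub-intervals are arbitrary choices made for illustration.

```python
import math

def midpoint_integral(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] with n equal sub-intervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Truncate the improper integral at x = 40; the discarded tail is below exp(-40).
integral = midpoint_integral(lambda x: math.exp(-x) / x, 1.0, 40.0, 100000)
bound = 1.0 / (math.e * math.sqrt(2.0))
print(integral, bound)
```

The computed integral is visibly smaller than the bound, which is consistent with the inequality; the estimate is not sharp because f and g are not proportional.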
Example 3. If 𝜇(X) < ∞, then, for p ∈ (1, ∞), 𝔏p (𝜇) ⊆ 𝔏1 (𝜇), and, for f ∈ 𝔏p (𝜇),
‖ f‖1 ≤ ‖ f‖p (𝜇(X))1/q . In particular, if 𝜇(X) = 1, then ‖ f‖1 ≤ ‖ f‖p .
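A short derivation of this estimate, assuming Hölder's inequality in the form used later in this section and writing q for the exponent conjugate to p, applies the inequality to the pair |f| and the constant function 1:

```latex
\[
\|f\|_1 = \int_X |f| \cdot 1 \, d\mu
\le \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \Big( \int_X 1^q \, d\mu \Big)^{1/q}
= \|f\|_p \, (\mu(X))^{1/q},
\qquad \frac{1}{p} + \frac{1}{q} = 1 .
\]
```

The finiteness of 𝜇(X) is what makes the constant function 1 a member of 𝔏q (𝜇).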
Proof. We leave the cases p = 1 and p = ∞ to the reader. Assume 1 < p < ∞. By
lemma 8.6.1, $\int_X |f + g|^p\,d\mu \le \int_X 2^p (|f|^p + |g|^p)\,d\mu = 2^p (\|f\|_p^p + \|g\|_p^p) < \infty$. This
shows that f + g ∈ 𝔏p . To prove Minkowski’s inequality when 1 < p < ∞, notice
that if h ∈ 𝔏p , then $|h|^{p-1} \in \mathfrak{L}^q$ because (p − 1)q = p. Now
\[ \|f + g\|_p^p = \int_X |f + g|^p\,d\mu \]
Case 1. 1 ≤ p < ∞. We use the result of problem 10 in section 6.1. Let (fk ) be a
sequence in 𝔏p such that $K = \sum_{k=1}^{\infty} \|f_k\|_p < \infty$. We show that the series $\sum_{k=1}^{\infty} f_k$
converges in 𝔏p . Define $g_n = \sum_{k=1}^{n} |f_k|$, and let $g = \sum_{k=1}^{\infty} |f_k|$. Then gn ∈ 𝔏p
and $\|g_n\|_p \le \sum_{k=1}^{n} \|f_k\|_p \le K$. By the monotone convergence theorem,
$\int_X g^p\,d\mu = \lim_n \int_X g_n^p\,d\mu \le K^p$. Thus g ∈ 𝔏p . In particular, the series $\sum_{k=1}^{\infty} f_k(x)$ converges for
a.e. x ∈ X. Define $f(x) = \sum_{k=1}^{\infty} f_k(x)$. Since | f | ≤ g, f ∈ 𝔏p . Finally, we show that
the sequence $\sum_{k=1}^{n} f_k$ converges to f in 𝔏p . Now $|f - \sum_{k=1}^{n} f_k|^p \le (2g)^p$, and the
dominated convergence theorem implies that $\lim_n \|f - \sum_{k=1}^{n} f_k\|_p = 0$.
\[ (\mathrm{sgn}(g))(x) = \begin{cases} \dfrac{g(x)}{|g(x)|} & \text{if } g(x) \ne 0, \\ 1 & \text{otherwise.} \end{cases} \]
Theorem 8.6.5. Let 1 < p ≤ ∞, and let g ∈ 𝔏q (𝜇). Then the functional
Φg (f ) = ∫X fgd𝜇
is a bounded linear functional on 𝔏p (𝜇), and ‖Φg ‖ = ‖g‖q . The same is true for
p = 1 under the additional assumption that 𝜇 is 𝜎-finite.
Proof. By Hölder’s inequality, |Φg (f )| ≤ ∫X | fg|d𝜇 ≤ ‖ f‖p ‖g‖q . Since the linearity of
Φg is obvious, this inequality shows that Φg is bounded and that ‖Φg ‖ ≤ ‖g‖q .
It remains to show that ‖Φg ‖ = ‖g‖q .
If 1 < p < ∞, let $f = |g|^{q-1}\,\mathrm{sgn}(g)$. Then f ∈ 𝔏p (𝜇), and $\|f\|_p^p = \|g\|_q^q$. Now
$\Phi_g(f) = \int_X fg\,d\mu = \int_X |g|^q\,d\mu = \|g\|_q^q = \|g\|_q^{q-1} \|g\|_q = \|f\|_p \|g\|_q$. This concludes
the proof of the case 1 < p < ∞.
Now suppose that p = 1 and that 𝜇 is 𝜎-finite. Let 0 < 𝜖 < ‖g‖∞ , and let E = {x ∈
X ∶ |g(x)| > ‖g‖∞ − 𝜖}. By definition of ‖g‖∞ , 𝜇(E) > 0. Since 𝜇 is 𝜎-finite,
$X = \cup_{n=1}^{\infty} X_n$, where each Xn has finite measure. Since $E = \cup_{n=1}^{\infty} (E \cap X_n)$ and since
$0 < \mu(E) \le \sum_{n=1}^{\infty} \mu(E \cap X_n)$, 𝜇(E ∩ Xn ) > 0 for some integer n. Let A = E ∩ Xn , and
let f = sgn(g)𝜒A /𝜇(A). Then f ∈ 𝔏1 (𝜇), ‖ f‖1 = 1, and
$|\Phi_g(f)| = \frac{1}{\mu(A)} \int_A |g|\,d\mu \ge \|g\|_\infty - \epsilon$. Since 𝜖 is arbitrary, ‖Φg ‖ ≥ ‖g‖∞ .
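The equality ‖Φg ‖ = ‖g‖q in the case 1 < p < ∞ rests on the specific choice of f made in the proof. A minimal numerical sketch, assuming counting measure on four points and a real-valued g with hypothetical values, confirms that this choice turns Hölder's inequality into an equality.

```python
import math

p = 3.0
q = p / (p - 1.0)                     # conjugate exponent, 1/p + 1/q = 1
g = [1.0, -2.0, 0.5, 0.0]             # hypothetical real-valued g on four atoms of mass 1

def sgn(t):
    """sgn as defined in the text: t/|t| if t != 0, and 1 otherwise."""
    return t / abs(t) if t != 0 else 1.0

def norm(values, r):
    """The r-norm with respect to counting measure."""
    return sum(abs(v) ** r for v in values) ** (1.0 / r)

f = [abs(t) ** (q - 1.0) * sgn(t) for t in g]
phi_g_of_f = sum(fi * gi for fi, gi in zip(f, g))
print(phi_g_of_f, norm(f, p) * norm(g, q))
```

The two printed numbers coincide up to rounding, so the bound |Φg (f )| ≤ ‖ f‖p ‖g‖q is attained by this f.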
Theorem 8.6.5 establishes the fact that Φ ∶ g ↦ Φg is an isometry from 𝔏q (𝜇) into
(𝔏p (𝜇))∗ . Theorem 8.6.7 establishes sufficient conditions for Φ to be an isometric
isomorphism. First we need a technical result.
Lemma 8.6.6. Suppose 𝜇(X) < ∞. If g ∈ 𝔏1 (𝜇) and there exists a constant M such
that | ∫X sgd𝜇| ≤ M‖s‖p for every simple function s, then g ∈ 𝔏q .
Proof. Because 𝜇(X) < ∞, all measurable simple functions are in 𝔏p (𝜇), and
𝔏∞ (𝜇) ⊆ 𝔏p (𝜇) for all p ≥ 1. We work out two separate cases.
Case 1. $1 < p < \infty$. First we show that $|\int_X fg \, d\mu| \le M \|f\|_p$ for every function $f \in \mathfrak{L}^\infty$. To see that, let $(s_n)$ be a sequence of simple functions that converges to $f$ in $\mathfrak{L}^\infty$ (see theorem 8.3.2). In this case, $s_n$ converges to $f$ in $\mathfrak{L}^p$ for every $p \ge 1$. Now $|\int_X (fg - s_n g) \, d\mu| \le \|s_n - f\|_\infty \|g\|_1 \to 0$ as $n \to \infty$. Thus $|\int_X fg \, d\mu| = \lim_n |\int_X s_n g \, d\mu| \le \lim_n M \|s_n\|_p = M \|f\|_p$.
We show that $g \in \mathfrak{L}^q$. Let $E_n = \{x \in X : |g(x)| \le n\}$, and let $f = |g|^{q-1} \operatorname{sgn}(g) \chi_{E_n}$. Then $f \in \mathfrak{L}^\infty$, $fg = |g|^q \chi_{E_n}$, and $|f|^p = |g|^q \chi_{E_n}$. Hence $\int_{E_n} |g|^q \, d\mu = \int_X fg \, d\mu \le M (\int_X |f|^p \, d\mu)^{1/p} = M (\int_{E_n} |g|^q \, d\mu)^{1/p}$. Thus $(\int_{E_n} |g|^q \, d\mu)^{1/q} \le M$. Taking the limit of the left side of the last inequality, the monotone convergence theorem yields $\|g\|_q \le M < \infty$.
Theorem 8.6.7. If 𝜇 is 𝜎-finite, then the function Φ in theorem 8.6.5 is onto for
1 ≤ p < ∞.
Proof. Let 𝜑 ∈ (𝔏p (𝜇))∗ . We need to prove the existence of a function g ∈ 𝔏q such
that 𝜑 = Φg . Equivalently, for all f ∈ 𝔏p (𝜇),
\[
\varphi(f) = \int_X fg \, d\mu. \tag{6}
\]
We first prove the result in the special case when 𝜇(X) < ∞. For a measurable
set E, define 𝜈(E) = 𝜑(𝜒E ). Since 𝜑 is linear and since, for disjoint measurable
sets $E_1$ and $E_2$, $\chi_{E_1 \cup E_2} = \chi_{E_1} + \chi_{E_2}$, $\nu(E_1 \cup E_2) = \nu(E_1) + \nu(E_2)$. Thus $\nu$ is finitely additive. We show that $\nu$ is countably additive, and this will establish the fact that $\nu$ is a complex measure. Let $(E_n)$ be a disjoint sequence in $\mathfrak{M}$, and let $E = \cup_{n=1}^{\infty} E_n$. Let $A_k = \cup_{i=1}^{k} E_i$. Then $\lim_k \mu(A_k) = \mu(E)$; hence $\|\chi_E - \chi_{A_k}\|_p^p = \int_X |\chi_E - \chi_{A_k}|^p \, d\mu = \mu(E - A_k) \to 0$ as $k \to \infty$. Thus $\chi_{A_k}$ converges to $\chi_E$ in $\mathfrak{L}^p(\mu)$. By the continuity of $\varphi$, $\lim_k \varphi(\chi_{A_k}) = \varphi(\chi_E)$, that is, $\sum_{n=1}^{\infty} \nu(E_n) = \nu(E)$.
If 𝜇(E) = 0, then 𝜒E = 0 𝜇-a.e.; thus 𝜈(E) = 0.
The summary of the proof so far is that 𝜈 is a complex measure and 𝜈 << 𝜇. By
the Radon-Nikodym theorem, there exists a function g ∈ 𝔏1 (𝜇) such that 𝜑(𝜒E ) =
∫E gd𝜇 for every measurable set E. The linearity of the functionals on the two
sides of the last identity implies that 𝜑(s) = ∫X sgd𝜇 for every simple measurable
function s. Now, for a simple function s, | ∫X sgd𝜇| = |𝜑(s)| ≤ ‖𝜑‖‖s‖p . By the
previous lemma, g ∈ 𝔏q . The functional on the left-hand side of identity (6) is
continuous on 𝔏p by assumption, and the functional on the right side of (6) is
continuous by theorem 8.6.5. Since the two functionals agree on a dense subset
of 𝔏p , namely, the set of simple functions,11 identity (6) holds for all f ∈ 𝔏p . This
completes the proof of the theorem when 𝜇(X) < ∞.
Now suppose that $\mu(X) = \infty$ and that $X$ is the disjoint union of a countable sequence $(E_n)$ of sets of finite measure. Define $h(x) = \sum_{n=1}^{\infty} \frac{\chi_{E_n}(x)}{2^n \mu(E_n)}$. By the monotone convergence theorem, $\int_X h \, d\mu = \sum_{n=1}^{\infty} \int_{E_n} \frac{\chi_{E_n} \, d\mu}{2^n \mu(E_n)} = \sum_{n=1}^{\infty} 1/2^n = 1$. Thus $h \in \mathfrak{L}^1(\mu)$. Let $\nu$ be the finite positive measure such that $d\nu = h \, d\mu$. For $1 \le p < \infty$, the correspondence $F \mapsto h^{1/p} F$ defines a linear isometry from $\mathfrak{L}^p(\nu)$ onto $\mathfrak{L}^p(\mu)$ (theorem 8.5.8 is relevant here and for the rest of the proof). In particular, $\psi(F) = \varphi(h^{1/p} F)$ defines a bounded linear functional on $\mathfrak{L}^p(\nu)$. By the first part of the proof, there exists a function $G \in \mathfrak{L}^q(\nu)$ such that $\psi(F) = \int_X FG \, d\nu$, for every $F \in \mathfrak{L}^p(\nu)$.
Exercises
8.7 Approximation
Lemma 8.7.1 (the Tietze extension theorem). Let K be a compact subset of ℝn and
let f ∶ K → [0, 1] be continuous. Then f can be extended to a continuous function
g ∈ 𝒞c (ℝn ) such that 0 ≤ g ≤ 1. If K is contained in an open set U, then g can be
constructed in such a way that supp(g) ⊆ U.
Remark. The Tietze extension theorem is valid for locally compact Hausdorff
spaces. See problem 1 at the end of this section.
Proof. First we show that, for every pair of positive real numbers 𝜖 and 𝛿, there exists
a measurable set A and an integer N ≥ 1 such that 𝜇(A) < 𝛿 and sup{| fk (x) −
f (x)| ∶ x ∈ X − A} < 𝜖 for every k ≥ N. Define Ck = {x ∈ X ∶ | fk (x) − f (x)| < 𝜖},
and let $D_n = \cap_{k=n}^{\infty} C_k = \{x \in X : |f_k(x) - f(x)| < \epsilon \text{ for every } k \ge n\}$. Clearly, $D_1 \subseteq D_2 \subseteq \dots$. The set $X - \cup_{n=1}^{\infty} D_n$ is contained in the set $\{x \in X : \lim_n f_n(x) \ne f(x)\}$, which, by assumption, has measure 0. It follows that $\lim_n \mu(D_n) = \mu(X)$.
Therefore there exists a positive integer N such that 𝜇(X − DN ) < 𝛿. Set A =
X − DN . This proves our assertion because if x ∉ A, then x ∈ Ck for every k ≥ N,
and | fk (x) − f (x)| < 𝜖 for every k ≥ N.
For a fixed $\delta > 0$, and each $k \in \mathbb{N}$, let $\delta_k = \delta/2^k$. Applying the above construction to the pair $\epsilon_k = 1/k$, and $\delta_k$, we find a measurable set $A_k$ such that $\mu(A_k) < \delta/2^k$ and a positive integer $n_k$ such that $\sup\{|f_n(x) - f(x)| : x \in X - A_k\} < 1/k$ for $n > n_k$. Define $E = \cup_{k=1}^{\infty} A_k$. Then $\mu(E) \le \sum_{k=1}^{\infty} \mu(A_k) \le \sum_{k=1}^{\infty} \delta/2^k = \delta$.
Now let 𝜖 > 0, and choose a positive integer k such that 1/k < 𝜖.
Now, for m > nk = N,
Before we proceed to the next theorem, we call the reader’s attention to the fact that $\mathfrak{L}^\infty$ contains all simple functions, while a simple function $s = \sum_{i=1}^{m} a_i \chi_{E_i}$ belongs to $\mathfrak{L}^p$ if and only if the support of $s$ has finite measure, that is, if $\mu(E_i) < \infty$ for all $1 \le i \le m$.
Proof. Let $U$ be an open set containing $E$ such that $\lambda(U - E) < \epsilon/2$. For each $1 \le i \le m$, let $K_i$ be a compact subset of $E_i$ such that $\lambda(E_i - K_i) < \frac{\epsilon}{2m}$, and set $H = \cup_{i=1}^{m} K_i$. Notice that $\lambda(E - H) < \epsilon/2$. For $1 \le i \le m$, define $V_i = U - \cup_{j \ne i} K_j$. By theorem 8.4.3, there exist functions $g_i \in \mathcal{C}_c(\mathbb{R}^n)$ such that $K_i \prec g_i \prec V_i$. Now define $g = \sum_{i=1}^{m} a_i g_i$. Clearly, $g \in \mathcal{C}_c(\mathbb{R}^n)$, $g|_H = s|_H$, and $g$ vanishes outside $U$. The set $\{x \in \mathbb{R}^n : s(x) \ne g(x)\}$ is contained in the union of $U - E$ and $E - H$, and the Lebesgue measure of each of the two sets is less than $\epsilon/2$. If $\|g\|_\infty > \|s\|_\infty$, we modify $g$ as follows to satisfy the last requirement of the theorem. Let $S = \{z \in \mathbb{C} : |z| \le \|s\|_\infty\}$, and $T = \{z \in \mathbb{C} : |z| \le \|g\|_\infty\}$. Define $\varphi : T \to S$ by
\[
\varphi(z) = \begin{cases} z & \text{if } z \in S, \\ \dfrac{z \|s\|_\infty}{|z|} & \text{if } z \in T - S. \end{cases}
\]
Lemma 8.7.4 is a very special case of the following well-known theorem, which
says, loosely speaking, that a measurable function on a set of finite Lebesgue
measure is not too far from being continuous.
Proof. Let U be an open set such that E ⊆ U and 𝜆(U − E) < 𝜖/3. Let si be a sequence
of simple measurable functions such that |s1 | ≤ |s2 | ≤ ... ≤ | f | and limi si (x) =
f (x). Since f is supported in E, each si is supported in E. By Egoroff ’s theorem,
there exists a subset A of E such that 𝜆(A) < 𝜖/3, and the sequence si converges
uniformly to $f$ on $E - A$. By the proof of lemma 8.7.4, there exist compact sets $H_i \subseteq E - A$ such that $\lambda((E - A) - H_i) < \frac{\epsilon/3}{2^i}$ and functions $g_i \in \mathcal{C}_c(\mathbb{R}^n)$ such that
and gi |Hi = si |Hi , and each gi vanishes outside U. Now let K = ∩∞ i=1 Hi . Clearly,
K is compact, and 𝜆((E − A) − K) < 𝜖/3. The sequence of continuous functions
gi converges uniformly to f on K. Thus f|K is continuous. By the Tietze extension
theorem, there exists a function g ∈ 𝒞c (ℝn ) that extends f|K and g(x) = 0 for every
x ∉ U. The set {x ∈ ℝn ∶ f (x) ≠ g(x)} is contained in the union of U − E, A, and
(E − A) − K, and each of the three sets has Lebesgue measure less than 𝜖/3.
If ‖ f‖∞ < ∞ and ‖g‖∞ > ‖ f‖∞ , we modify g as in the proof of lemma 8.7.4 to
satisfy the requirement ‖g‖∞ ≤ ‖ f‖∞ .
Theorem 8.7.6. For 1 ≤ p < ∞, 𝒞c (ℝn ) is dense in 𝔏p (ℝn ).
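Proof. Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, and let $\epsilon > 0$. By lemma 8.7.3, we may assume that $f = s$, a simple function with $\lambda(\operatorname{supp}(s)) < \infty$. Lemma 8.7.4 produces a set $A$ of measure less than $\epsilon$ and a function $g \in \mathcal{C}_c(\mathbb{R}^n)$ such that $s(x) = g(x)$ for $x \notin A$, and $\|g\|_\infty \le \|s\|_\infty$. Thus $|g - s| \le |g| + |s| \le 2\|s\|_\infty$. Hence $\|g - s\|_p^p = \int_A |g - s|^p \, d\lambda \le 2^p \|s\|_\infty^p \lambda(A) < 2^p \|s\|_\infty^p \epsilon$.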
Remark. Lemma 8.7.4 and theorems 8.7.5 and 8.7.6 are valid for Radon measures on locally compact Hausdorff spaces without any alterations to the proofs included for the last three results. For example, in the proof of lemma 8.7.4, we only used the inner regularity of measurable sets of finite measure.
12 Observe that 𝜑 simply fixes S and retracts the annulus between the disks T and S radially onto the boundary of S.
This result follows immediately from the last two examples and theorem
4.10.1.
Proof. Let ℭ be the collection of half-open boxes of the form 𝜎 = [a1 , b1 ) × ... ×
[an , bn ), where ai , bi ∈ ℚ. Define 𝔇 to be the collection of linear combinations of
characteristic functions of members of ℭ with rational coefficients. Thus a member of 𝔇 is a simple function of the form $s = \sum_{i=1}^{m} c_i \chi_{\sigma_i}$, where $m \in \mathbb{N}$, the coefficients $c_i$ are rational numbers, and $\sigma_i \in$ ℭ. It is clear that 𝔇 is countable. We prove that
it is dense in 𝔏p (ℝn ). In light of theorem 8.7.6, it suffices to show that if f ∈ 𝒞rc (ℝn )
and 𝜖 > 0, then there is a function s ∈ 𝔇 such that ‖ f − s‖p < c𝜖 for some constant
c, which is independent of 𝜖.
By the uniform continuity of f, there exists a number 𝛿 > 0 such that | f (x) −
f(y)| < 𝜖 whenever ‖x − y‖ < 𝛿. Let Q be a box in ℭ that contains supp(f ) in
its interior. Partition Q into disjoint sub-boxes 𝜎1 , … , 𝜎m , where each 𝜎i ∈ ℭ,
and $\operatorname{diam}(\sigma_i) < \delta$. For each $1 \le i \le m$, choose a rational number $c_i$ such that $\min_{x \in \sigma_i} f(x) \le c_i \le \max_{x \in \sigma_i} f(x)$. Finally, define $s = \sum_{i=1}^{m} c_i \chi_{\sigma_i}$. By construction, $\|f - s\|_\infty < \epsilon$.
Now $\|f - s\|_p^p = \int_Q |f - s|^p \, d\lambda \le \|f - s\|_\infty^p \operatorname{vol}(Q) < \epsilon^p \operatorname{vol}(Q)$.
Approximation by 𝒞∞ Functions
\[
|(f * g)(x)| \le \int_{\mathbb{R}^n} |f(x - y) g(y)| \, dy \le \|f\|_\infty \int_{\mathbb{R}^n} |g(y)| \, dy \le \|f\|_\infty \|g\|_\infty \lambda(K) < \infty.
\]
Lemma 8.7.9. Let f ∈ 𝔏p (ℝn ), where 1 ≤ p < ∞. Then lima→0 ‖𝜏a f − f‖p = 0.
Proof. Recall that (𝜏a f)(x) = f(x − a). First we prove the result for a function
g ∈ 𝒞c (ℝn ). Without loss of generality, assume that ‖a‖ < 1. Thus the functions
𝜏a g have a common support, say, K. Let 𝜖 > 0. By the uniform continuity of g,
there exists a number $\delta > 0$ such that whenever $\|a\| < \delta$, then $|(\tau_a g)(x) - g(x)| = |g(x - a) - g(x)| < \epsilon$. Now $\|\tau_a g - g\|_p^p = \int_K |g(x - a) - g(x)|^p \, dx \le \epsilon^p \lambda(K)$.
Now let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 8.7.6, there is a function $g \in \mathcal{C}_c(\mathbb{R}^n)$ such that $\|f - g\|_p < \epsilon/3$. By the first part of the proof, there is $\delta > 0$ such that for $\|a\| < \delta$, $\|\tau_a g - g\|_p < \epsilon/3$. Now if $\|a\| < \delta$, then
\[
f(x) = \begin{cases} \exp\{-1/x\} & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}
\]
It is easily seen that, for $x > 0$, and $k \in \mathbb{N}$, $f^{(k)}(x) = p(1/x) \exp\{-1/x\}$, where $p$ is a polynomial of degree $2k$. Therefore $\lim_{x \downarrow 0} f^{(k)}(x) = 0$. Hence $f^{(k)}(0) = 0$, and $f$ is infinitely differentiable at $x = 0$. Since the differentiability of $f$ at $x \ne 0$ is obvious, $f \in \mathcal{C}^\infty(\mathbb{R})$.
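Example 5 (the bump function). For a fixed $h > 0$, consider the function
\[
\varphi(x) = \begin{cases} \exp\left\{ \dfrac{-h^2}{h^2 - |x|^2} \right\} & \text{if } |x| < h, \\ 0 & \text{if } |x| \ge h. \end{cases}
\]
As $|x| \uparrow h$, $h^2 - |x|^2 \downarrow 0$, so, by example 1, $\varphi \in \mathcal{C}_c^\infty(\mathbb{R})$.
Example 6 (the bump kernel). We can use the function $\varphi$ in example 5 to construct a continuously parameterized family of functions as follows. For a fixed $h > 0$,
\[
\delta_h(x) = \begin{cases} A_n h^{-n} \exp\left\{ \dfrac{-h^2}{h^2 - \|x\|^2} \right\} & \text{if } \|x\| < h, \\ 0 & \text{if } \|x\| \ge h, \end{cases}
\]
where $A_n^{-1} = \int_{\|y\| < 1} \exp\left\{ \frac{-1}{1 - \|y\|^2} \right\} dy$. By the above examples, $\delta_h \in \mathcal{C}_c^\infty(\mathbb{R}^n)$.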
The first assertion is obvious. For the second assertion, use the change of variable $x = hy$. By problem 14 on section 8.4, $dx = h^n \, dy$, and
\[
\int_{\mathbb{R}^n} \delta_h(x) \, dx = \int_{\|x\| < h} A_n h^{-n} \exp\left\{ \frac{-h^2}{h^2 - \|x\|^2} \right\} dx = A_n h^{-n} \int_{\|y\| < 1} \exp\left\{ \frac{-1}{1 - \|y\|^2} \right\} h^n \, dy = 1.
\]
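The normalization of the bump kernel can be illustrated numerically in dimension n = 1. The sketch below is only an illustration under stated assumptions: the midpoint rule, the step count, and the names `midpoint`, `bump_profile`, `A1`, and `delta_h` are choices made here and are not part of the text. It approximates the constant A_1 and then integrates δ_h for several values of h.

```python
import math


def midpoint(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))


def bump_profile(y):
    """exp(-1 / (1 - y^2)) for |y| < 1 and 0 otherwise (the case h = 1, n = 1)."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0


# Approximate normalizing constant A_1 (its reciprocal is the integral of the profile).
A1 = 1.0 / midpoint(bump_profile, -1.0, 1.0)


def delta_h(x, h):
    """One-dimensional bump kernel A_1 h^{-1} exp(-h^2 / (h^2 - x^2)) on |x| < h."""
    if abs(x) >= h:
        return 0.0
    return A1 / h * math.exp(-h * h / (h * h - x * x))


# The total mass should not depend on h, as the change of variable x = h*y predicts.
totals = {h: midpoint(lambda x: delta_h(x, h), -h, h) for h in (2.0, 0.5, 0.01)}
print(totals)
```

The masses come out equal to 1 up to rounding for every h, while the support [-h, h] shrinks; this is the behavior that makes the family δ_h useful for approximation.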
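Lemma 8.7.10. If $f \in \mathcal{C}_c^\infty(\mathbb{R}^n)$, and $g \in \mathcal{C}_c(\mathbb{R}^n)$, then $f * g \in \mathcal{C}_c^\infty(\mathbb{R}^n)$, and, for every multi-index $\alpha$, $D^\alpha(f * g) = D^\alpha f * g$.
Proof. The proof is by induction on $|\alpha|$. It is sufficient to prove the result when $|\alpha| = 1$. Thus we need to show that $\frac{\partial}{\partial x_i}(f * g) = \frac{\partial f}{\partial x_i} * g$. For simplicity of notation, fix $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$, and consider $f$ as a function of the single variable $x_i$, which we rename $x$. Thus we need to prove that $\frac{d}{dx}(f * g) = f' * g$.
We will show that
\[
\lim_{t \to 0} \left[ \frac{(f * g)(x + t) - (f * g)(x)}{t} - (f' * g)(x) \right] = 0.
\]
Let $\epsilon > 0$. By the uniform continuity of $f'$, there is $\delta > 0$ such that $|f'(\xi) - f'(\eta)| < \epsilon$ whenever $|\xi - \eta| < \delta$. Now
\[
\left| \frac{(f * g)(x + t) - (f * g)(x)}{t} - (f' * g)(x) \right| = \left| \int_{\mathbb{R}} \left\{ \frac{f(x + t - y) - f(x - y)}{t} - f'(x - y) \right\} g(y) \, dy \right| = \left| \int_{\mathbb{R}} \{ f'(x + \theta t - y) - f'(x - y) \} g(y) \, dy \right|,
\]
where $0 < \theta < 1$. Now if $|t| < \delta$, then $|f'(x + \theta t - y) - f'(x - y)| < \epsilon$ and $|\int_{\mathbb{R}} \{ f'(x + \theta t - y) - f'(x - y) \} g(y) \, dy| \le \epsilon \int_{\mathbb{R}} |g(y)| \, dy \le \epsilon \|g\|_\infty \lambda(K)$, where $K = \operatorname{supp}(g)$.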
Proof. We will make use of the fact that $\int_{\mathbb{R}^n} \delta_h(y) \, dy = 1$ and example 4 on section 8.6:
\[
|(f * \delta_h)(x) - f(x)| = \left| \int_{\mathbb{R}^n} (f(x - y) - f(x)) \delta_h(y) \, dy \right|
\]
Integrating the pth power of the extreme sides of the above string, we have
\[
\|f * \delta_h - f\|_p^p \le \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p \delta_h(y) \, dy \, dx
\]
By lemma 8.7.9, there exists a number $h_0 > 0$ such that, for $\|y\| < h_0$, $\|\tau_y f - f\|_p < \epsilon$. Hence, for $h < h_0$, $\int_{\|y\| < h} \|\tau_y f - f\|_p^p \delta_h(y) \, dy \le \epsilon^p \int_{\|y\| \le h} \delta_h(y) \, dy = \epsilon^p$.
Proof. (a) Let f ∈ 𝔏p (ℝn ), and let 𝜖 > 0. By theorem 8.7.6, there is a function
g ∈ 𝒞c (ℝn ) such that ‖ f − g‖p < 𝜖/2. By the previous proposition, we can choose
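$h > 0$ small enough so that $\|g - g * \delta_h\|_p < \epsilon/2$. The function $g * \delta_h$ is in $\mathcal{C}_c^\infty(\mathbb{R}^n)$, and $\|f - g * \delta_h\|_p < \epsilon$.
(b) Let $f \in \mathcal{C}_0(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 5.11.8, there exists a function $g \in \mathcal{C}_c(\mathbb{R}^n)$ such that $\|f - g\|_\infty < \epsilon$. By the uniform continuity of $g$, there is a number $\delta > 0$ such that $|g(x) - g(y)| < \epsilon$ whenever $\|x - y\| < \delta$. Choose a positive number $h < \delta$. The proof will be complete if we show that $\|g * \delta_h - g\|_\infty < \epsilon$. Since $\int_{\mathbb{R}^n} \delta_h(x - y) \, dy = 1$,
\[
|g * \delta_h(x) - g(x)| = \left| \int_{\mathbb{R}^n} \{ g(y) - g(x) \} \delta_h(x - y) \, dy \right| < \epsilon \int_{\|x - y\| < h} \delta_h(x - y) \, dy = \epsilon.
\]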
Exercises
Throughout this section, (X, 𝔐, 𝜇) and (Y, 𝔑, 𝜈) denote a pair of measure spaces.
The objective of this section is to find a reasonable definition of the product
measure on X × Y. Fubini’s theorem is one of the section’s main results. We also
settle questions about the products of Lebesgue measures in this section.
The basic definitions are motivated by the ideas found in standard calculus
textbooks. Let us look at the simplest case, which is the product of two copies
of the real line with Lebesgue measure, 𝜆. The problem of computing the area of
a plane region contains all the motivations for the ideas behind the definitions in
this section. Figure 8.6 depicts a (bounded) plane region E in ℝ2 . To compute the
area of E, we take a vertical cross section Sx in E, and the area of E is obtained
by integrating the length (the Lebesgue measure) of the cross section. The same
can be achieved by taking a horizontal cross section Sy in E. Thus the area (two-
dimensional measure) of E, denoted 𝜌(E), is given by
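[Figure 8.6: a plane region E with a point (x, y), the vertical cross section Sx, and the horizontal cross section Sy.]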
We also wish the two-dimensional measure 𝜌 to preserve the property that the
area of a rectangle is the product of its dimensions. More generally, if A and B are
measurable subsets of ℝ, then it should be the case that
𝜌(A × B) = 𝜆(A)𝜆(B).
Now see theorem 8.8.9, where the definition of the product measure appears.
Before we can achieve any of the above goals, we need to define a reasonable 𝜎-
algebra in X × Y where our expectations can materialize. Geometry dictates that
the product of two intervals (or, more generally, measurable subsets) A and B in
ℝ ought to be measurable in the product space. This immediately suggests that we
look at the smallest 𝜎-algebra that contains all rectangles, and this provides the
motivation of the definitions below of the product of measurable spaces.
Definition. The product of the measurable spaces (X, 𝔐) and (Y, 𝔑) is the
measurable space (X × Y, 𝔐 ⊗ 𝔑), where 𝔐 ⊗ 𝔑 is the 𝜎-algebra generated
by the collection of measurable rectangles.
The following lemma will be used without explicit reference. Its proof is simple.
\[
(A \times B)_x = \begin{cases} B & \text{if } x \in A, \\ \emptyset & \text{if } x \notin A. \end{cases}
\]
Proof. Let $a \in \mathbb{R}$, and let $E = f^{-1}(a, \infty) = \{(x, y) \in X \times Y : f(x, y) > a\}$. Now $f_x^{-1}(a, \infty)$ is exactly the set $E_x$, which is measurable by the previous proposition. Thus $f_x$ is $\mathfrak{N}$-measurable.
Proof. It is clear that the intersection of two measurable rectangles is either empty or
a measurable rectangle. Also (A × B)′ = (A′ × Y) ∪ (A × B′ ), so the complement
of a measurable rectangle is an elementary set.
Let $E = \cup_{i=1}^{n} R_i$ and $F = \cup_{j=1}^{m} S_j$ be elementary sets, where each of $\{R_i\}$ and $\{S_j\}$ is a set of disjoint measurable rectangles. Now $E \cap F = \cup \{R_i \cap S_j : 1 \le i \le n, 1 \le j \le m\}$. This shows that $E \cap F \in \mathfrak{E}$, and that $\mathfrak{E}$ is closed under the formation of finite intersections.
Now consider the complement of an elementary set $E = \cup_{i=1}^{n} R_i$. Since $E' = \cap_{i=1}^{n} R_i'$, $E' \in \mathfrak{E}$. Thus $\mathfrak{E}$ is closed under complementation. It is also clear that $\mathfrak{E}$ is closed under the formation of finite disjoint unions.
Now if $E_1, E_2 \in \mathfrak{E}$, then, by the above, $E_1' \cap E_2 \in \mathfrak{E}$; hence $E_1 \cup E_2 = E_1 \cup (E_1' \cap E_2) \in \mathfrak{E}$.
Proof. Let 𝔐 be the 𝜎-algebra generated by 𝔈, and let 𝔐1 be the smallest monotone
class in T containing 𝔈. Since 𝔐 is a monotone class containing 𝔈, 𝔐1 ⊆ 𝔐.
Thus we need to establish the reverse inclusion. It is clearly sufficient to show that
𝔐1 is a 𝜎-algebra.
We first show that 𝔐1 is an algebra. Let 𝔐′1 = {E ⊆ T ∶ E′ ∈ 𝔐1 }. It is clear
that 𝔐′1 is a monotone class in T and that 𝔈 ⊆ 𝔐′1 . Thus 𝔐1 ⊆ 𝔐′1 ; hence 𝔐1
is closed under complementation.
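For a member $F \in \mathfrak{M}_1$, define $\Omega(F) = \{E \in \mathfrak{M}_1 : E \cup F \in \mathfrak{M}_1\}$. It is easy to verify that $\Omega(F)$ is a monotone class in $T$. Now if $G \in \mathfrak{E}$, then $\Omega(G)$ contains $\mathfrak{E}$, so $\Omega(G) = \mathfrak{M}_1$. Hence, for any $H \in \mathfrak{M}_1$, $H \in \Omega(G)$. By the very definition of $\Omega(H)$, $G \in \Omega(H)$, so $\mathfrak{E} \subseteq \Omega(H)$ for each $H \in \mathfrak{M}_1$. Because $\Omega(H)$ is a monotone class, $\mathfrak{M}_1 = \Omega(H)$, so $\mathfrak{M}_1$ is an algebra.
Now if $(E_n)$ is a sequence of members of $\mathfrak{M}_1$, let $B_n = \cup_{i=1}^{n} E_i$. Because $\mathfrak{M}_1$ is an algebra, each $B_n \in \mathfrak{M}_1$. Since $\mathfrak{M}_1$ is a monotone class, it follows that $\cup_{n=1}^{\infty} E_n = \cup_{n=1}^{\infty} B_n$ is in $\mathfrak{M}_1$. This shows that $\mathfrak{M}_1$ is a $\sigma$-algebra, and the proof is complete.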
Corollary 8.8.7. The 𝜎-algebra 𝔐 ⊗ 𝔑 is the smallest monotone class that contains
the algebra 𝔈 of elementary sets.
Product Measures
Theorem 8.8.8. Suppose (X, 𝔐, 𝜇) and (Y, 𝔑, 𝜈) are 𝜎-finite measure spaces.
For a subset E ∈ 𝔐 ⊗ 𝔑, and for x ∈ X, y ∈ Y, define
Then
(i) 𝜑 is 𝔐-measurable,
(ii) 𝜓 is 𝔑-measurable, and
(iii) ∫X 𝜑d𝜇 = ∫Y 𝜓d𝜈.
Proof. Let Ω be the collection of members of 𝔐 ⊗ 𝔑 for which all three conclusions
of the theorem hold. We will show that Ω = 𝔐 ⊗ 𝔑.
Now we use the 𝜎-finiteness assumption to write X as the disjoint union of subsets
Xn of finite 𝜇-measure, and Y as the disjoint union of subsets Ym of finite 𝜈
measure. For a member E of 𝔐 ⊗ 𝔑, define Em,n = E ∩ (Xn × Ym ), and let Ω1 be
the collection of all members E of 𝔐 ⊗ 𝔑 such that, for all m, n ∈ ℕ, Em,n ∈ Ω.
Facts (b) and (c) imply that Ω1 is a monotone class, and fact (a) implies that Ω1
contains all elementary sets. Thus Ω1 = 𝔐 ⊗ 𝔑 by corollary 8.8.7.
Thus Em,n ∈ Ω for every E ∈ 𝔐 ⊗ 𝔑 and for all m, n ∈ ℕ. Since E = ∪m,n Em,n
and the sets Em,n are disjoint, fact (d) implies that E ∈ Ω.
Theorem 8.8.9. Under the assumptions of theorem 8.8.8, the set function defined by
\[
(\mu \otimes \nu)(E) = \int_X \sum_{n=1}^{\infty} \nu((E_n)_x) \, d\mu = \sum_{n=1}^{\infty} \int_X \nu((E_n)_x) \, d\mu = \sum_{n=1}^{\infty} (\mu \otimes \nu)(E_n).
\]
Remark. Both the existence and uniqueness of the product measure of 𝜎-finite
spaces can be based on the Hopf extension theorem. For a measurable rectangle
$A \times B$ in $\mathfrak{M} \times \mathfrak{N}$, we define $\rho(A \times B) = \mu(A)\nu(B)$, and, for an elementary set $C = \cup_{i=1}^{n} A_i \times B_i$, we define $\rho(C) = \sum_{i=1}^{n} \mu(A_i)\nu(B_i)$. Then one can check that
𝜌 is countably additive on the algebra ℭ of elementary sets (it is not difficult).
Now all the conditions of theorems 8.2.19 and 8.2.20 are met, and the (unique)
Hopf extension of 𝜌 is the product measure 𝜇 ⊗ 𝜈. The approach we took to
define the product measure has the slight advantage that it is better motivated
by calculus concepts, as explained in the opening remarks of this section. In
addition, Fubini’s theorem follows without difficulty from the above results.
Fubini’s Theorem
Proof. It is clearly sufficient to prove the result when f is a real function. Let f+
and f− be the positive and negative parts of f, and write 𝜑1 (x) = ∫Y (f+ )x d𝜈, and
𝜑2 (x) = ∫Y (f− )x d𝜈. Since f+ ≤ | f |, f+ ∈ 𝔏1 (𝜇 ⊗ 𝜈), theorem 8.8.10 applies and
∫X 𝜑1 d𝜇 = ∫X×Y f+ d(𝜇 ⊗ 𝜈) < ∞. Thus 𝜑1 ∈ 𝔏1 (𝜇) and example 1 in section 8.3
now implies that 𝜑1 (x) is finite for a.e. x ∈ X, that is, (f+ )x is integrable for a.e.
x ∈ X. Similar results apply to f− ; 𝜑2 ∈ 𝔏1 (𝜇), and 𝜑2 is finite for a.e. x ∈ X.
The function 𝜑 = 𝜑1 − 𝜑2 is defined for a.e. x ∈ X, and the identity ∫X 𝜑d𝜇 =
∫X×Y fd(𝜇 ⊗ 𝜈) follows from the fact that fx = (f+ )x − (f− )x and the linearity of
the integral. The remaining assertion of the theorem and the other identity in (7)
are obtained by replicating the above proof for the function f y .
In the discussion below and until the end of the section, k is a positive integer, and
𝜆k denotes Lebesgue measure on the 𝜎-algebra ℒk of Lebesgue measurable subsets
of ℝk . We also use the notation ℬk to denote the 𝜎-algebra of Borel subsets of ℝk .
In the following, we use the result of problem 10 on section 8.4 without explicit
mention.
Proof. First assume that B is bounded, and choose an open set V of finite measure
such that B ⊆ V ⊆ ℝs . Let 𝜖 > 0. Choose an open set U such that Z ⊆ U ⊆ ℝr
and 𝜆r (U) < 𝜖. Since we have not yet established the Lebesgue measurability of
$Z \times B$, we estimate its outer measure: $m_n^*(Z \times B) \le m_n^*(U \times V) = \lambda_n(U \times V) = \lambda_r(U)\lambda_s(V) < \epsilon \lambda_s(V)$. Since $\epsilon$ is arbitrary, $m_n^*(Z \times B) = 0$; hence $Z \times B$ is measurable of measure 0.
If B is unbounded, consider the intersection Bi of B with the open ball in ℝs
of radius i and centered at the origin. By what we just proved, for each i ∈ ℕ,
Z × Bi ∈ ℒn has measure 0. Since Z × B = ∪∞ i=1 (Z × Bi ), the proof is complete.
(a) ℬn ⊆ ℒr ⊗ ℒs ⊆ ℒn .
(b) If A ∈ ℒr and B ∈ ℒs , then 𝜆n (A × B) = 𝜆r (A)𝜆s (B).
Proof. (a) Every open cube in ℝn is the product of two open cubes, one in ℝr and
one in ℝs . Thus ℒr ⊗ ℒs contains all open cubes in ℝn . Since every open subset of
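(b) First assume that $A$ and $B$ are bounded $G_\delta$ sets. Thus there exist descending sequences of bounded open sets $\{U_i\}$ in $\mathbb{R}^r$ and $\{V_j\}$ in $\mathbb{R}^s$ such that $A = \cap_{i=1}^{\infty} U_i$ and $B = \cap_{j=1}^{\infty} V_j$. Now
\[
\lambda_n(A \times B) = \lambda_n\left( \cap_{i=1}^{\infty} \cap_{j=1}^{\infty} (U_i \times V_j) \right) = \lim_i \lambda_n\left( \cap_{j=1}^{\infty} (U_i \times V_j) \right) = \lim_i \lim_j \lambda_n(U_i \times V_j) = \lim_i \lim_j \lambda_r(U_i)\lambda_s(V_j) = \lambda_r(A)\lambda_s(B).
\]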
Now, for arbitrary (unbounded) G𝛿 sets A and B, the result follows from the 𝜎-
finiteness of Lebesgue measure. We invite the reader to work out the details.
Before we proceed to the next theorem, we will show that 𝜆1 ⊗ 𝜆1 is not a complete
measure. Let E be a subset of [0, 1] that is not Lebesgue measurable. The set A =
E × {0} ⊆ ℝ2 is contained in B = [0, 1] × {0}, which is in ℒ1 ⊗ ℒ1 . Clearly, (𝜆1 ⊗
𝜆1 )(B) = 0. However, A is not in ℒ1 ⊗ ℒ1 by proposition 8.8.2. As a by-product of
this example, it follows that ℒ2 is strictly larger than ℒ1 ⊗ ℒ1 .
Theorem 8.8.14. Let r and s be positive integers, and let n = r + s. Then (ℝn , ℒn , 𝜆n )
is the completion of (ℝn , ℒr ⊗ ℒs , 𝜆r ⊗ 𝜆s ).
The proof will be complete if we show that for a member E of ℒn , there are members
A and B of ℒr ⊗ ℒs such that A ⊆ E ⊆ B, and (𝜆r ⊗ 𝜆s )(B − A) = 0; see problem
4 on section 8.2. By theorem 8.4.10, there exists an F𝜎 set A ⊆ ℝn and a G𝛿 set
B ⊆ ℝn such that A ⊆ E ⊆ B, and 𝜆n (B − A) = 0. Since A, B ∈ ℬn ⊆ ℒr ⊗ ℒs , the
above paragraph implies that (𝜆r ⊗ 𝜆s )(B − A) = 𝜆n (B − A) = 0, as desired.
It is clear that the above definitions and constructions for the product of two
measurable spaces can be extended to the product of any finite number of
measurable spaces {(Xi , 𝔐i ), 1 ≤ i ≤ n}. A measurable rectangle is a set of the form
A1 × ... × An , Ai ∈ 𝔐i , and an elementary set is a disjoint union of a finite number
of measurable rectangles. It is easy to see that the collection, ℭ, of elementary
sets is an algebra. By definition, 𝔐1 ⊗ ... ⊗ 𝔐n is the 𝜎-algebra generated by
the collection of measurable rectangles. Obviously, the algebra ℭ also generates
𝔐1 ⊗ ... ⊗ 𝔐n .
𝔐1 ⊗ (𝔐2 ⊗ 𝔐3 ) = 𝔐1 ⊗ 𝔐2 ⊗ 𝔐3 .
This immediately suggests an inductive definition of the product of more than two
measure spaces.
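Theorem 8.8.9 and the inductive nature of the construction imply that $\mu_1 \otimes \dots \otimes \mu_n$ is a $\sigma$-finite measure on $\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n$ and that, for a measurable rectangle $A_1 \times \dots \times A_n$, we have $(\mu_1 \otimes \dots \otimes \mu_n)(A_1 \times \dots \times A_n) = \prod_{i=1}^{n} \mu_i(A_i)$.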
Fubini’s theorem (theorem 8.8.11) extends to the product of any finite number of
measures in a straightforward manner. Using the notation we established earlier
in this excursion, if f ∈ 𝔏1 (𝜇1 ⊗ ... ⊗ 𝜇n ), then
Exercises
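\[
f(x, y) = \begin{cases} 1 & \text{if } x \ge 0,\ x \le y \le x + 1, \\ -1 & \text{if } x \ge 0,\ x + 1 \le y \le x + 2, \\ 0 & \text{otherwise.} \end{cases}
\]
Prove that $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \, dx \, dy \ne \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \, dy \, dx$. This does not contradict theorem 8.8.11 because, clearly, $|f|$ is not integrable.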
7. Let X = [0, 1], 𝔐 be ℒ1 -restricted to [0, 1], and dx (or dy) denote the
Lebesgue measure on [0, 1]. Choose a sequence 𝛼1 < 𝛼2 < ... in (0, 1) and,
This section has a number of axes. We extend the discussion of Fourier series of
2𝜋-periodic functions we started in section 4.10. We also study Fourier series of
functions in 𝔏p (−𝜋, 𝜋). Then we take a brief tour through the Fourier transform.
Finally we take a last look at the orthogonal polynomials we encountered in
section 4.10.
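Observe that $D_n(x) = 1 + \sum_{j=1}^{n} (e^{ijx} + e^{-ijx}) = 1 + 2 \sum_{j=1}^{n} \cos(jx)$. Multiplying the two sides of the last identity by $\sin(x/2)$, we obtain
\[
\sin\left(\frac{x}{2}\right) D_n(x) = \sin\left(\frac{x}{2}\right) + 2 \sum_{j=1}^{n} \sin\left(\frac{x}{2}\right) \cos(jx) = \sin\left(\frac{x}{2}\right) + \sum_{j=1}^{n} \left[ \sin\left(j + \frac{1}{2}\right)x - \sin\left(j - \frac{1}{2}\right)x \right] = \sin\left(n + \frac{1}{2}\right)x,
\]
\[
\|D_n\|_1 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)| \, dx > \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k}.
\]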
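\[
\begin{aligned}
\|D_n\|_1 &= \frac{1}{\pi} \int_0^{\pi} \frac{|\sin(n + 1/2)x|}{\sin\frac{x}{2}} \, dx \ge \frac{2}{\pi} \int_0^{\pi} \frac{|\sin(n + 1/2)x|}{x} \, dx \\
&= \frac{2}{\pi} \int_0^{(n + 1/2)\pi} \frac{|\sin x|}{x} \, dx > \frac{2}{\pi} \int_0^{n\pi} \frac{|\sin x|}{x} \, dx = \frac{2}{\pi} \sum_{k=1}^{n} \int_{(k-1)\pi}^{k\pi} \frac{|\sin x|}{x} \, dx \\
&> \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin x| \, dx = \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k}.
\end{aligned}
\]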
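The lower bound just derived can be compared with a direct numerical estimate of the normalized 1-norm of the Dirichlet kernel. The sketch below is an illustration only; the midpoint rule, the step count, and the function names are assumptions made here rather than anything prescribed by the text.

```python
import math


def dirichlet(n, x):
    """D_n(x) = 1 + 2 * sum_{j=1}^{n} cos(j x)."""
    return 1.0 + 2.0 * sum(math.cos(j * x) for j in range(1, n + 1))


def l1_norm_dirichlet(n, steps=4000):
    """Midpoint-rule estimate of (1 / (2 pi)) * integral over [-pi, pi] of |D_n|."""
    width = 2.0 * math.pi / steps
    total = sum(abs(dirichlet(n, -math.pi + (i + 0.5) * width)) for i in range(steps))
    return total * width / (2.0 * math.pi)


def harmonic_lower_bound(n):
    """The bound (4 / pi^2) * sum_{k=1}^{n} 1/k from the estimate above."""
    return 4.0 / math.pi ** 2 * sum(1.0 / k for k in range(1, n + 1))


norms = {n: l1_norm_dirichlet(n) for n in (1, 4, 16)}
bounds = {n: harmonic_lower_bound(n) for n in norms}
for n in norms:
    print(n, norms[n], bounds[n])
```

Both columns grow with n, and the estimated norm stays above the harmonic bound, which is consistent with the norms being unbounded.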
We are now ready to prove that the Fourier series of a continuous, 2𝜋-periodic
function f need not converge pointwise to f.
Theorem 8.9.1. There exists a function f ∈ 𝒞(𝒮1 ) such that Sn f(0) does not converge
to f(0).
Proof. We prove that, for some continuous, 2𝜋-periodic function f, the sequence
Sn f(0) is unbounded. For each n ∈ ℕ, define a functional 𝜆n on 𝒞(𝒮1 ) as follows:
𝜆n (f ) = Sn f(0). Then
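\[
|\lambda_n(f)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| |D_n(t)| \, dt \le \frac{\|f\|_\infty}{2\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt = \|D_n\|_1 \|f\|_\infty.
\]
It follows that $\|\lambda_n\| \le \|D_n\|_1$. We show that $\|\lambda_n\| = \|D_n\|_1$. Let $\epsilon > 0$. Consider the function
\[
f(x) = \begin{cases} 1 & \text{if } D_n(x) \ge 0, \\ -1 & \text{if } D_n(x) < 0. \end{cases}
\]
Observe that $f(x) D_n(x) = |D_n(x)|$ for all $x \in [-\pi, \pi]$. By example 3 in section 8.7, there exists a function $g \in \mathcal{C}(\mathcal{S}^1)$ such that $\|f - g\|_1 < \epsilon$. Now
\[
\begin{aligned}
\frac{1}{2\pi} \left| \int_{-\pi}^{\pi} D_n(x) g(x) \, dx - \int_{-\pi}^{\pi} |D_n(x)| \, dx \right| &= \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} D_n(x) g(x) - D_n(x) f(x) \, dx \right| \\
&\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)| |f(x) - g(x)| \, dx \le \|D_n\|_\infty \|f - g\|_1 < \epsilon \|D_n\|_\infty.
\end{aligned}
\]
It follows that $|\lambda_n(g)| = \frac{1}{2\pi} |\int_{-\pi}^{\pi} D_n(x) g(x) \, dx| > \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)| \, dx - \epsilon \|D_n\|_\infty$.
Since $\epsilon$ is arbitrary, $\|\lambda_n\| = \|D_n\|_1$.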
We now prove another classical theorem about the convergence of the means
of the sequence of partial sums of the Fourier series of continuous, 2𝜋-periodic
functions.
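\[
K_n(x) = \frac{1}{n + 1} (D_0 + \dots + D_n).
\]
The function $K_n$ is known as the Fejér kernel. We derive a formula for $K_n$ below. From the formula for the Dirichlet kernel, we have
\[
(n + 1) \sin\left(\frac{x}{2}\right) K_n(x) = \sum_{j=0}^{n} \sin\left(j + \frac{1}{2}\right)x.
\]
Thus
\[
(n + 1) \sin^2\left(\frac{x}{2}\right) K_n(x) = \sum_{j=0}^{n} \sin\left(\frac{x}{2}\right) \sin\left(j + \frac{1}{2}\right)x = \frac{1}{2} \sum_{j=0}^{n} (\cos jx - \cos(j + 1)x) = \frac{1}{2} (1 - \cos(n + 1)x) = \sin^2\left(\frac{(n + 1)x}{2}\right).
\]
Hence
\[
K_n(x) = \frac{\sin^2\left(\frac{(n + 1)x}{2}\right)}{(n + 1) \sin^2\left(\frac{x}{2}\right)}.
\]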
Clearly, $K_n$ is an even, positive, $2\pi$-periodic function, and since $\int_{-\pi}^{\pi} D_j(x) \, dx = 2\pi$ for all $j \in \mathbb{N}$, $\int_{-\pi}^{\pi} K_n(x) \, dx = 2\pi$ for all $n \in \mathbb{N}$.
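As a sanity check on the closed form, one can compare it numerically with the defining average of Dirichlet kernels. This is a small sketch with hypothetical function names; the sample points are arbitrary and avoid multiples of 2π, where the closed form has a removable singularity.

```python
import math


def dirichlet(n, x):
    """D_n(x) = 1 + 2 * sum_{j=1}^{n} cos(j x); in particular D_0 = 1."""
    return 1.0 + 2.0 * sum(math.cos(j * x) for j in range(1, n + 1))


def fejer_from_average(n, x):
    """K_n(x) computed from the definition (D_0 + ... + D_n) / (n + 1)."""
    return sum(dirichlet(j, x) for j in range(n + 1)) / (n + 1)


def fejer_closed_form(n, x):
    """K_n(x) = sin^2((n + 1) x / 2) / ((n + 1) sin^2(x / 2)), for x not a multiple of 2 pi."""
    return math.sin((n + 1) * x / 2.0) ** 2 / ((n + 1) * math.sin(x / 2.0) ** 2)


samples = [0.3, 1.0, -2.5, 3.1]
differences = [abs(fejer_from_average(5, x) - fejer_closed_form(5, x)) for x in samples]
print(differences)
```

The differences are at the level of rounding error, and the closed form makes the nonnegativity of the kernel visible at a glance.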
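Proof. Since $f * K_n = K_n * f$, it is more convenient here to write $\sigma_n f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) K_n(t) \, dt$. Let $\epsilon > 0$. By the uniform continuity of $f$, there exists a number $\delta > 0$ such that if $|t| < \delta$, then $|f(x - t) - f(x)| < \epsilon$ for all $x \in [-\pi, \pi]$. Choose a natural number $N$ such that, for $n > N$, $\max\{K_n(t) : \delta \le |t| \le \pi\} < \epsilon$. Recall that $\int_{-\pi}^{\pi} K_n(t) \, dt = 2\pi$. Now, for $n > N$,
\[
\begin{aligned}
|\sigma_n f(x) - f(x)| &= \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} (f(x - t) - f(x)) K_n(t) \, dt \right| \\
&\le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x - t) - f(x)| K_n(t) \, dt \\
&= \frac{1}{2\pi} \int_{|t| < \delta} |f(x - t) - f(x)| K_n(t) \, dt + \frac{1}{2\pi} \int_{\delta \le |t| \le \pi} |f(x - t) - f(x)| K_n(t) \, dt \\
&\le \frac{\epsilon}{2\pi} \int_{|t| < \delta} K_n(t) \, dt + \frac{2\|f\|_\infty}{2\pi} \int_{\delta \le |t| \le \pi} \epsilon \, dt \le \epsilon + 2\epsilon \|f\|_\infty.
\end{aligned}
\]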
Observe that Fejér’s theorem furnishes another proof that trigonometric polynomials are uniformly dense in 𝒞(𝒮1 ).
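The uniform convergence of the arithmetic means can be observed on a concrete example. The sketch below takes f(x) = |x| on [-π, π], whose Fourier coefficients (with the 1/(2π) normalization) are π/2 for k = 0 and ((-1)^k - 1)/(π k²) otherwise, and uses the standard identity that the mean of S_0 f, ..., S_n f is the sum of the coefficients weighted by 1 - |j|/(n + 1). The choice of f, the grid, and the function names are assumptions made for illustration.

```python
import math


def abs_coefficient(k):
    """Fourier coefficient of f(x) = |x| on [-pi, pi], normalized by 1 / (2 pi), for k >= 0."""
    if k == 0:
        return math.pi / 2.0
    return ((-1) ** k - 1) / (math.pi * k * k)


def fejer_mean(n, x):
    """sigma_n f(x) as a weighted cosine sum (f is even, so the coefficients are symmetric)."""
    total = abs_coefficient(0)
    for j in range(1, n + 1):
        weight = 1.0 - j / (n + 1.0)
        total += 2.0 * weight * abs_coefficient(j) * math.cos(j * x)
    return total


def sup_error(n, points=401):
    """Maximum of |sigma_n f - f| over a uniform grid on [-pi, pi]."""
    grid = [-math.pi + 2.0 * math.pi * i / (points - 1) for i in range(points)]
    return max(abs(fejer_mean(n, x) - abs(x)) for x in grid)


errors = {n: sup_error(n) for n in (1, 8, 64)}
print(errors)
```

The grid maximum shrinks as n grows, which is what Fejér’s theorem predicts for a continuous 2π-periodic function; the slow rate near x = 0 and x = ±π reflects the corners of |x| there.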
\[
\|f\|_p = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^p \, dx \right)^{1/p}, \quad 1 \le p < \infty.
\]
\[
|\hat{f}(n)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| \, dt = \|f\|_1 < \infty.
\]
It is convenient to refer to the set of Fourier coefficients $(\hat{f}(n))_{n \in \mathbb{Z}}$ of a function $f \in \mathfrak{L}^p(-\pi, \pi)$ by the notation $\mathfrak{F}(f)$. We think of $\mathfrak{F}$ as a linear transformation from $\mathfrak{L}^p(-\pi, \pi)$ to some suitable range space. For example, when $p = 2$, the range space of $\mathfrak{F}$ is $l^2(\mathbb{Z})$. We will show in example 2 below that, for all $p \ge 1$, and all $f \in \mathfrak{L}^p(-\pi, \pi)$, $\mathfrak{F}(f) \in c_0(\mathbb{Z})$. The norm on $c_0(\mathbb{Z})$ is the $\infty$-norm. Thus $\|\mathfrak{F}(f)\|_\infty = \sup\{|\hat{f}(n)| : n \in \mathbb{Z}\}$.
We are now ready to extend the discussion of Fourier series we started in section
4.10 from 𝒞(𝒮1 ) to 𝔏2 (−𝜋, 𝜋).
Theorem 8.9.4. The set {un (t) = eint ∶ n ∈ ℤ} is an orthonormal basis for
𝔏2 (−𝜋, 𝜋).
Proof. By theorem 8.9.3, Span({un ∶ n ∈ ℤ}) is dense in 𝔏2 (−𝜋, 𝜋). The assertion
of this theorem follows directly from theorem 7.2.7.
All the results we obtained in section 4.10 for Fourier series of continuous
functions extend to 𝔏2 (−𝜋, 𝜋). The following theorem lists some of the properties.
They follow directly from general Hilbert space theory.
Theorem 8.9.5. The following are true for a function f ∈ 𝔏2 (−𝜋, 𝜋):
The simplicity, elegance, and completeness of theorem 8.9.5 do not extend to functions in $\mathfrak{L}^1(-\pi, \pi)$. For example, the sequence of partial sums $S_n f(x) = \sum_{j=-n}^{n} \hat{f}(j) u_j(x)$ need not converge to $f$ in the 1-norm (see the section exercises), and $\mathfrak{F}$ does not map $\mathfrak{L}^1(-\pi, \pi)$ onto its range space, which we now describe.
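\[
|\hat{f}(n)| = |\hat{f}(n) - \hat{p}(n)| = \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} (f(t) - p(t)) e^{-int} \, dt \right| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t) - p(t)| \, dt = \|f - p\|_1 < \epsilon.
\]
\[
\begin{aligned}
\|\sigma_n f\|_1 &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) K_n(x - t) \, dt \right| dx \le \frac{1}{4\pi^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} |f(t)| K_n(x - t) \, dt \, dx \\
&= \frac{1}{4\pi^2} \int_{-\pi}^{\pi} |f(t)| \int_{-\pi}^{\pi} K_n(x - t) \, dx \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| \, dt = \|f\|_1.
\end{aligned}
\]
Theorem 8.9.6 (the uniqueness theorem). If $f \in \mathfrak{L}^1(-\pi, \pi)$ and $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f = 0$ a.e. Consequently, the mapping $\mathfrak{F} : \mathfrak{L}^1(-\pi, \pi) \to c_0(\mathbb{Z})$ is injective.
Proof. First observe that $\mathfrak{F}$ is bounded by virtue of the inequality $|\hat{f}(n)| \le \|f\|_1$. If $\mathfrak{F}$ is surjective, then, by the open mapping theorem, $\mathfrak{F}$ would be invertible; hence, for every $f \in \mathfrak{L}^1(-\pi, \pi)$, $\|f\|_1 \le M \|\mathfrak{F}(f)\|_\infty$, where $M = \|\mathfrak{F}^{-1}\|$. Now, for the sequence $(D_n)$ of Dirichlet kernels, $\|\mathfrak{F}(D_n)\|_\infty = 1$, while $\|D_n\|_1 \to \infty$ as $n \to \infty$. This contradiction delivers the result.
\[
\hat{f}(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(t) e^{-ixt} \, dt.
\]
The normalization constant $\frac{1}{\sqrt{2\pi}}$ is included only for the symmetry of the formulas and is not essential.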
One can think of the Fourier transform as the continuous equivalent of Fourier
series. Instead of using the discrete set of frequencies {eint }n∈ℤ , the Fourier
transform uses a continuum of frequencies {eixt }t∈ℝ .
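It is clear that $\hat{f} \in \mathfrak{L}^\infty(\mathbb{R})$ and that $\|\hat{f}\|_\infty \le \|f\|_1$. The following theorem narrows down the range space of the Fourier transform.
\[
\lim_n \hat{f}(x_n) = \lim_n \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(t) e^{-ix_n t} \, dt = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(t) e^{-ix_0 t} \, dt = \hat{f}(x_0).
\]
Thus
\[
2\hat{f}(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(\xi) e^{-ix\xi} \, d\xi - \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} (\tau_a f)(\xi) e^{-ix\xi} \, d\xi = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} (f - \tau_a f)(\xi) e^{-ix\xi} \, d\xi.
\]
It follows that
\[
2|\hat{f}(x)| \le \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} |(f(\xi) - (\tau_a f)(\xi)) e^{-i\xi x}| \, d\xi = \frac{1}{\sqrt{2\pi}} \|f - \tau_a f\|_1.
\]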
The next goal is to prove the inversion formula. Guided by the inversion formula
for a function $f \in \mathfrak{L}^1(-\pi, \pi)$ when $\mathfrak{F}(f) \in l^1(\mathbb{Z})$ (see problem 1 at the end of this
section), one can reasonably conjecture that if $f$ and $\hat{f}$ are both in $\mathfrak{L}^1(\mathbb{R})$, then
\[
f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(t)e^{ixt}\,dt \quad \text{for almost every } x \in \mathbb{R}.
\]
The proof bears some resemblance to that of Fejér's theorem in that we will find
a family of functions $\{G_\sigma\}$ such that $f * G_\sigma$ converges to $f$ in $\mathfrak{L}^1(\mathbb{R})$ as $\sigma \downarrow 0$. Before
we construct the family $G_\sigma$, it may be useful to find an even function that is equal
to its own Fourier transform. One such function exists. The proof of the following
proposition is left as an exercise.
Proposition 8.9.9. For the function $G_1(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, $\hat{G}_1 = G_1$.
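Before attempting the exercise, the claim $\hat{G}_1 = G_1$ can be sanity-checked numerically. The sketch below approximates the transform $\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} f(t)e^{-ixt}\,dt$ with the trapezoid rule on a truncated interval; the truncation width, the step size, and the function names are assumptions made for illustration, not part of the text.

```python
import cmath
import math

def gaussian(x):
    """G_1(x) = exp(-x^2 / 2) / sqrt(2 pi)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def fourier_transform(f, x, half_width=12.0, step=0.005):
    """Trapezoid-rule estimate of (1/sqrt(2 pi)) * integral of f(t) e^{-ixt} dt
    over [-half_width, half_width]; suitable only for rapidly decaying f."""
    count = int(round(2.0 * half_width / step))
    total = 0.0 + 0.0j
    for k in range(count + 1):
        t = -half_width + k * step
        weight = 0.5 if k in (0, count) else 1.0  # trapezoid end-point weights
        total += weight * f(t) * cmath.exp(-1j * x * t)
    return total * step / math.sqrt(2.0 * math.pi)
```

For any real `x`, `fourier_transform(gaussian, x)` and `gaussian(x)` should agree to within quadrature and rounding error. This is a numerical check only and does not replace the proof.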
\[
G_1(x) = G_1(-x) = \hat{G}_1(-x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} G_1(t)e^{ixt}\,dt = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{G}_1(t)e^{ixt}\,dt.
\]
\[
G_\sigma(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ \frac{-x^2}{2\sigma^2} \right\}.
\]
Observe that $G_\sigma(x) = \frac{1}{\sigma} G_1(x/\sigma)$.
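Let $\epsilon > 0$, and choose $\delta > 0$ such that $|g(y) - g(0)| < \epsilon$ for all $y$ such that $|y| < \delta$.
Also choose $\sigma_0 > 0$ such that $\int_{|y|>\delta} G_\sigma(y)\,dy < \epsilon$ whenever $0 < \sigma < \sigma_0$. Since $\int_{\mathbb{R}} G_\sigma(y)\,dy = 1$,
\[
\left| \int_{\mathbb{R}} g(y)G_\sigma(y)\,dy - g(0) \right| = \left| \int_{\mathbb{R}} g(y)G_\sigma(y)\,dy - \int_{\mathbb{R}} g(0)G_\sigma(y)\,dy \right|
\le \int_{\mathbb{R}} |g(y) - g(0)|\,G_\sigma(y)\,dy.
\]
Splitting the last integral over $|y| < \delta$ and $|y| > \delta$ bounds it by $\epsilon + 2\|g\|_\infty \epsilon$ when $g$ is bounded.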
\[
\|f * G_\sigma - f\|_p^p \le \int_{\mathbb{R}} \|\tau_y f - f\|_p^p\, G_\sigma(y)\,dy.
\]
The function $g(y) = \|\tau_y f - f\|_p^p$ is continuous by problem 4 on section 8.7 and is
bounded because $|g(y)| \le (\|\tau_y f\|_p + \|f\|_p)^p = 2^p \|f\|_p^p$.
Applying the previous example, we have $\lim_{\sigma\to 0} \int_{\mathbb{R}} \|\tau_y f - f\|_p^p\, G_\sigma(y)\,dy = g(0) = 0$.
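The mechanism behind this limit, that $G_\sigma$ concentrates its unit mass near the origin as $\sigma \to 0$, can be illustrated numerically. The sketch below uses the exact tail formula $\int_{|y|>\delta} G_\sigma = \operatorname{erfc}\big(\delta/(\sigma\sqrt{2})\big)$ and a trapezoid rule truncated to $|y| \le 10\sigma$; the test function, the truncation, and all names are illustrative assumptions.

```python
import math

def gaussian_sigma(sigma, y):
    """G_sigma(y) = exp(-y^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    return math.exp(-y * y / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def tail_mass(sigma, delta):
    """Exact value of the integral of G_sigma over |y| > delta."""
    return math.erfc(delta / (sigma * math.sqrt(2.0)))

def smoothed_value(g, sigma, samples=20000):
    """Trapezoid-rule estimate of the integral of g(y) G_sigma(y) dy,
    truncated to |y| <= 10 sigma, where the Gaussian weight is negligible."""
    half_width = 10.0 * sigma
    h = 2.0 * half_width / samples
    total = 0.0
    for k in range(samples + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, samples) else 1.0  # trapezoid end-point weights
        total += weight * g(y) * gaussian_sigma(sigma, y)
    return total * h
```

For $g(y) = \cos y$ the integral has the closed form $e^{-\sigma^2/2}$, which tends to $g(0) = 1$ as $\sigma \to 0$, so `smoothed_value(math.cos, sigma)` can be compared against it.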
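Example 8. We will later need the identity $G_\sigma(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} G_1(\sigma t)e^{ixt}\,dt$:
\[
\begin{aligned}
G_\sigma(x) &= \frac{1}{\sigma} G_1(x/\sigma) = \frac{1}{\sigma} \hat{G}_1(-x/\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{\mathbb{R}} G_1(y) \exp\left(iy\frac{x}{\sigma}\right) dy \\
&= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} G_1(\sigma t)e^{ixt}\,dt.
\end{aligned}
\]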
Theorem 8.9.11 (the inversion theorem). If $f$ and $\hat{f}$ are in $\mathfrak{L}^1(\mathbb{R})$, then for almost
every $x \in \mathbb{R}$,
\[
f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(t)e^{ixt}\,dt.
\]
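\[
\begin{aligned}
f * G_\sigma(x) &= \int_{\mathbb{R}} f(x - t)G_\sigma(t)\,dt = \int_{\mathbb{R}} f(x - t) \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} G_1(\sigma y)e^{iyt}\,dy\,dt \\
&= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \int_{\mathbb{R}} f(x - t)e^{iyt}\,dt\, G_1(\sigma y)\,dy \\
&= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \int_{\mathbb{R}} f(u)e^{iy(x-u)}\,du\, G_1(\sigma y)\,dy \\
&= \int_{\mathbb{R}} \hat{f}(y)e^{ixy} G_1(\sigma y)\,dy.
\end{aligned}
\]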
\[
f * G_\sigma(x) = \int_{\mathbb{R}} \hat{f}(y)e^{ixy} G_1(\sigma y)\,dy.
\]
On the other hand, by proposition 8.9.10, the left side of identity (8) converges to $f$
in $\mathfrak{L}^1(\mathbb{R})$. By example 5 in section 8.3, the sequence $f * G_{\sigma_n}$ contains a subsequence
that converges a.e. to $f$.
Putting the last two facts together, we arrive at the inversion formula.
Observe that the function $g(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(t)e^{ixt}\,dt$ is in $\mathcal{C}_0(\mathbb{R})$ by an argument
identical to that in the proof of theorem 8.9.8. Thus the assumptions of the above
theorem imply that $f$ is equal a.e. to a $\mathcal{C}_0(\mathbb{R})$ function.
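As a worked illustration of the inversion theorem (an example chosen here, not taken from the text), consider $f(t) = e^{-|t|}$. Both $f$ and $\hat{f}$ are integrable, and $f$ is already continuous, so the formula holds for every $x$:

```latex
\[
\hat{f}(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-|t|} e^{-ixt}\,dt
= \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-t}\left(e^{-ixt} + e^{ixt}\right) dt
= \frac{1}{\sqrt{2\pi}} \left( \frac{1}{1 + ix} + \frac{1}{1 - ix} \right)
= \sqrt{\frac{2}{\pi}}\, \frac{1}{1 + x^2},
\]
\[
e^{-|x|} = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}(t) e^{ixt}\,dt
= \frac{1}{\pi} \int_{\mathbb{R}} \frac{e^{ixt}}{1 + t^2}\,dt .
\]
```

Setting $x = 0$ recovers the familiar value $\int_{\mathbb{R}} \frac{dt}{1+t^2} = \pi$.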
We are now ready to settle a question that could not be answered completely in
chapter 4. What is the smallest Hilbert space that contains the space H? The answer
is now within our reach.
The situation is far less obvious in the case of Hermite polynomials. In this
case, $d\mu = e^{-x^2}\,d\lambda$, and it is true that the normalized Hermite polynomials
$\tilde{H}_n = \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}} H_n$ (see problem 15 on section 4.10) form an orthonormal basis for
$\mathfrak{L}^2(\mu)$. Equivalently, we prove the following.
Theorem 8.9.14. If f ∈ 𝔏2 (𝜇) and ∫ℝ f (x)H̃ n (x)d𝜇 = 0 for all n ∈ ℕ, then f (x) = 0
for a.e. x ∈ ℝ.
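We leave it to the reader to verify that, for a fixed $x \in \mathbb{R}$, the function $h(t) = e^{|xt|} \in \mathfrak{L}^2(\mu)$.
It follows that the product $f(t)h(t) \in \mathfrak{L}^1(\mu)$; hence $f(t)e^{|xt|}e^{-t^2} \in \mathfrak{L}^1(\lambda)$.
Now
\[
\begin{aligned}
\sqrt{2\pi}\,\hat{g}(x) &= \int_{\mathbb{R}} f(t)e^{-t^2}e^{-ixt}\,dt = \int_{\mathbb{R}} f(t)e^{-t^2} \sum_{n=0}^{\infty} \frac{(-ixt)^n}{n!}\,dt \\
&= \sum_{n=0}^{\infty} \frac{(-ix)^n}{n!} \int_{\mathbb{R}} f(t)t^n e^{-t^2}\,dt = 0.
\end{aligned}
\]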
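The orthonormality of the normalized Hermite polynomials with respect to $d\mu = e^{-x^2}\,d\lambda$ can be checked numerically for small indices. The sketch below assumes that $H_n$ denotes the physicists' Hermite polynomials, generated by $H_0 = 1$, $H_1 = 2x$, $H_{n+1} = 2xH_n - 2nH_{n-1}$, which is the convention consistent with the normalizing factor $\sqrt{n!\,2^n\sqrt{\pi}}$; the quadrature parameters and function names are illustrative.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    previous, current = 0.0, 1.0
    for k in range(n):
        previous, current = current, 2.0 * x * current - 2.0 * k * previous
    return current

def normalized_hermite(n, x):
    """H_n(x) / sqrt(n! 2^n sqrt(pi))."""
    scale = math.sqrt(math.factorial(n) * 2.0 ** n * math.sqrt(math.pi))
    return hermite(n, x) / scale

def inner_product(m, n, half_width=10.0, step=0.005):
    """Trapezoid-rule estimate of the integral of H~_m(x) H~_n(x) exp(-x^2) dx."""
    count = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(count + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, count) else 1.0  # trapezoid end-point weights
        total += weight * normalized_hermite(m, x) * normalized_hermite(n, x) * math.exp(-x * x)
    return total * step
```

The values `inner_product(m, n)` should be close to 1 when `m == n` and close to 0 otherwise. Completeness, which is the content of theorem 8.9.14, is not something such a finite check can establish.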
Exercises
1. Prove that if $f \in \mathfrak{L}^1(-\pi, \pi)$ and $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$, then $f(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n)e^{inx}$
APPENDIX A
Before we embark on the task of proving theorem 2.2.1, we need to develop some
background work.
Notation. If S is a subset of a well-ordered set A, we use the notation min{S} to denote the
least element of S.
Definition. Let A be a well-ordered set, and let x ∈ A. The initial segment of A determined
by x is the set
S(A, x) = { y ∈ A ∶ y < x}.
Lemma A.1. In the notation of the above paragraph, let ℭ = {(A𝛼 , ≤𝛼 )}𝛼∈I be a chain in 𝔚,
and let A = ∪𝛼 A𝛼 . Then A is well ordered.
Proof. Recall that to say that ℭ is a chain means that, for 𝛼, 𝛽 ∈ I, either (A𝛼 , ≤𝛼 ) ⊆ (A𝛽 , ≤𝛽 )
or (A𝛽 , ≤𝛽 ) ⊆ (A𝛼 , ≤𝛼 ). Here is an explicit definition of the ordering ≤ on A: for a, b ∈ A,
let 𝛼, 𝛽 ∈ I be such that a ∈ A𝛼 , b ∈ A𝛽 . Since ℭ is a chain, say, (A𝛼 , ≤𝛼 ) ⊆ (A𝛽 , ≤𝛽 ). Then
a, b ∈ A𝛽 . Define a ≤ b if a ≤𝛽 b. The fact that ≤ is well defined follows from the fact that
ℭ is a chain. It is a simple exercise to show that ≤ linearly orders A. We now show that
≤ is a well ordering on A. Let S be a nonempty subset of A. Then S ∩ A𝛼 ≠ ∅ for some 𝛼.
Let a be the least element of S ∩ A𝛼 . We claim that a is the least element of S. Let b ∈ S be
such that b ≤ a, and assume that b ∈ A𝛽 . If (A𝛽 , ≤𝛽 ) ⊆ (A𝛼 , ≤𝛼 ), then a, b ∈ S ∩ A𝛼 and
b = a, since a is least in S ∩ A𝛼 . If (A𝛼 , ≤𝛼 ) ⊆ (A𝛽 , ≤𝛽 ), then A𝛼 is a segment of A𝛽 , and
b ≤ a; hence, b ∈ A𝛼 , and, as before, b = a.
Note that (A, ≤) in the above lemma is an upper bound of the chain ℭ. Thus if A𝛼 ≠ A,
then (A, ≤) is a continuation of (A𝛼 , ≤𝛼 ). The reader is encouraged to work out the details.
The crucial step to verify is the following: if a ∈ A𝛼 , y ∈ A, and y < a, then y ∈ A𝛼 . See
lemma A.4.
Proof. Let X be a nonempty set. We show that X can be well ordered. Let 𝔚 be the collection
of well-ordered subsets of X, and partially order 𝔚 by continuation. By lemma A.1, a chain
in 𝔚 has an upper bound. By Zorn’s lemma, 𝔚 has a maximal member (A, ≤). We claim
that A = X. If A ≠ X, pick an element z in X − A, and define an ordering ≤0 on Z = A ∪ {z}
as follows: retain the ordering ≤ on A, and define a <0 z for all a ∈ A. Now (Z, ≤0 ) is a strict
continuation of (A, ≤), which contradicts the maximality of (A, ≤).
Theorem A.3. The well ordering principle implies the axiom of choice.
Proof. Let {X𝛼 } be a nonempty collection of nonempty sets. By assumption, each X𝛼 can be well
ordered. Let x𝛼 be the least element of X𝛼 , and let x = (x𝛼 ). Clearly, x is a choice function
and ∏𝛼 X𝛼 ≠ ∅.
We need a final set of details before we prove the last leg of theorem 2.2.1. The definition
below makes sense for linearly ordered sets, but we limit the discussion to well-ordered sets
because this is where our interest lies now.
We adopt the following assumptions and terminology for the remainder of this appendix.
Let (X, ≤) be a partially ordered set such that every chain in X has an upper bound but X
has no maximal element.
Given a proper chain A in X, A has an upper bound, u. Since u is not maximal in X, there
is x ∈ X such that x > u. Clearly, x ∉ A because x is a strict upper bound of A. Let 𝔄 be
the collection of all chains in X. Invoking the axiom of choice, we can choose a strict upper
bound of each chain in X. Thus we have a function f ∶ 𝔄 → X that assigns to each chain A
a strict upper bound, f (A).
Proof. Let x = min{A − B}, and define C = S(A, x). It is easy to verify that C ⊆ B. We claim
that C = B. Suppose, for a contradiction, that B − C ≠ ∅, and let y = min{B − C}. We need
three steps before we finalize the proof:
1. S(B, y) is a proper subset of C: Suppose u ∈ B, u < y. If u ∉ C, then u ∈ B − C, and
u < y. This contradicts the definition of y. If S(B, y) = C = S(A, x), then y = f (S(B, y)) =
f (S(A, x)) = x, which is a contradiction because y ∈ B and x ∉ B. This proves our
assertion that S(B, y) is a proper subset of C.
2. S(B, y) is a section of C: If u ∈ C, v ∈ S(B, y), and u < v, we show that u ∈ S(B, y). Since
u < v < y, u < y. If u ∉ S(B, y), then u ∉ B. Thus u ∈ A − B; hence u ≥ x. But u ∈ C =
S(A, x); hence u < x. This contradiction proves that u ∈ S(B, y); hence S(B, y) is a section
of C.
3. S(B, y) is a segment of C; thus S(B, y) = S(C, z), where z ∈ C: This follows directly from
steps 1 and 2 and lemma A.4.
Now we conclude the proof. By step 3, y = f (S(B, y)) = f (S(C, z)) = z. This is a contradiction
because z ∈ C, but y ∉ C by the definition of y. This contradiction proves that B = C.
Let U be the union of all the conforming subsets of X. The following is a direct result of
lemma A.5.
Observation. If A is a conforming subset of X, a ∈ A, y ∈ U and y < a, then y ∈ A.
Theorems A.2 and A.3 together with theorem A.7 below constitute the proof of
theorem 2.2.1.
Proof. Let (X, ≤) be a partially ordered set such that each chain in X has an upper bound. If X
is a chain, then it would have a maximal (in fact, a largest) element, and there is nothing
more to prove. Therefore, we assume that X is not a chain. We show that X has a maximal
element. Suppose, for a contradiction, that X contains no maximal element.
We have shown in lemma A.6 that the set U is the largest conforming subset of X. Since U
is well ordered and X is not, U ≠ X. Let 𝜔 = f (U). The set U ∪ {𝜔} is clearly a conforming
subset of X that strictly contains U. This contradiction establishes the theorem.
APPENDIX B
Matrix Factorizations
The main purpose of this appendix is to prove a useful matrix factorization result: theorem
B.4. Theorem B.3 is a useful by-product of this appendix.
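\[
S(s, i) = \begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & & s & & & \\
& & & & 1 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix}
\]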
(b) an elementary permutation matrix (the off diagonal entries are (i, j) and ( j, i)):
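\[
P(i, j) = \begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 0 & & 1 & & \\
& & \vdots & \ddots & \vdots & & \\
& & 1 & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix}
\]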
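\[
M(\mu, i, j) = \begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & \vdots & \ddots & & & \\
& & \mu & & 1 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix}
\]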
Observe that a multiplier matrix can be written as $M(\mu, i, j) = I + \mu e_j e_i^T$. Using this, it is easy
to verify that $(I + \mu e_j e_i^T)^{-1} = I - \mu e_j e_i^T$. Here $I$ is the identity matrix of the appropriate size.
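The inverse formula holds because $(e_j e_i^T)^2 = e_j (e_i^T e_j) e_i^T = 0$ when $i \ne j$, so $(I + \mu e_j e_i^T)(I - \mu e_j e_i^T) = I$. A small exact-arithmetic check, with helper names that are illustrative and 1-based indices as in the text:

```python
from fractions import Fraction

def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def multiplier(n, mu, i, j):
    """M(mu, i, j) = I + mu * e_j e_i^T, with 1-based indices and i != j."""
    m = identity(n)
    m[j - 1][i - 1] += Fraction(mu)  # the single off-diagonal entry, at (j, i)
    return m

def matmul(a, b):
    """Plain triple-loop matrix product of nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]
```

For instance, `matmul(multiplier(4, 3, 2, 4), multiplier(4, -3, 2, 4))` should equal `identity(4)`.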
Proof. Verifying the theorem when E or F is a scaling matrix or a permutation matrix is trivial.
If $F$ is obtained from $I_n$ by adding $\mu$ times column $j$ to column $i$, then $AF = A(I_n + \mu e_j e_i^T) =
A + \mu(Ae_j)e_i^T$. Now $Ae_j$ is the $j$th column of $A$, and $(Ae_j)e_i^T$ is a matrix whose only nonzero
column is the $j$th column of $A$ placed in the $i$th column. Hence the result.
Proving part (a) for the case of left multiplication by a multiplier matrix is similar and is
left to the reader.
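Both cases can be checked on a small example. In the sketch below, right multiplication by $I + \mu e_j e_i^T$ is compared with adding $\mu$ times column $j$ to column $i$, and left multiplication with adding $\mu$ times row $i$ to row $j$ (since $(I + \mu e_j e_i^T)A = A + \mu e_j (e_i^T A)$). The helper names are illustrative, and indices are 1-based as in the text.

```python
def multiplier(n, mu, i, j):
    """M(mu, i, j) = I + mu * e_j e_i^T, with 1-based indices and i != j."""
    m = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    m[j - 1][i - 1] += mu
    return m

def matmul(a, b):
    """Plain triple-loop matrix product of nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def add_column_multiple(a, mu, i, j):
    """Return a copy of a with mu times column j added to column i."""
    result = [row[:] for row in a]
    for row in result:
        row[i - 1] += mu * row[j - 1]
    return result

def add_row_multiple(a, mu, i, j):
    """Return a copy of a with mu times row i added to row j."""
    result = [row[:] for row in a]
    result[j - 1] = [x + mu * y for x, y in zip(a[j - 1], a[i - 1])]
    return result
```

With `A = [[1, 2, 3], [4, 5, 6]]`, the product `matmul(A, multiplier(3, 10, 1, 3))` agrees with `add_column_multiple(A, 10, 1, 3)`, and `matmul(multiplier(2, 10, 1, 2), A)` agrees with `add_row_multiple(A, 10, 1, 2)`.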
\[
\begin{pmatrix}
d_1 & & & \\
& \ddots & & \\
& & d_q & \\
& & & \\
& & &
\end{pmatrix}, \quad d_i \ne 0,\ 1 \le i \le q.
\]
Proof. In light of lemma B.1, it is enough to prove that A can be reduced to a diagonal matrix
through a sequence of elementary row and column operations. We proceed by induction on
the number of rows, m. The result is true for a 1 × n matrix and for any n ∈ ℕ. Consider
the 1 × n matrix A = (a1 , … , an ). If a1 = 0, we interchange the first entry with a later,
nonzero entry. Once that is achieved, we subtract aj /a1 times entry 1 from entry j, 2 ≤ j ≤ n.
We obtain a matrix of the form (a1 , 0, 0, … , 0). This proves the base case of our inductive
proof. Now we show the inductive step. Suppose the conclusion of the theorem holds for
k × n matrices if k < m and n ∈ ℕ. Let A be an m × n matrix. If a1,1 = 0, we can move a
nonzero entry from a later row and/or column to the top left entry, so assume that a1,1 ≠ 0.
Subtracting ai,1 /a1,1 times the top row from row i, 2 ≤ i ≤ m, then subtracting a1,j /a1,1 times
the first column from column j, 2 ≤ j ≤ n, we obtain a matrix of the form
\[
\begin{pmatrix}
a_{11} & 0 & \cdots & \cdots & 0 \\
0 & & & & \\
\vdots & & A' & & \\
0 & & & &
\end{pmatrix}. \tag{$*$}
\]
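The reduction described in the proof can be sketched as an iterative procedure: bring a nonzero entry to the pivot position by interchanges, clear the pivot column with row operations and the pivot row with column operations, then repeat on the trailing block $A'$. The code below is one possible rendering under stated assumptions (exact rational arithmetic, the first nonzero entry of the trailing block taken as pivot), not the author's algorithm; the function name is illustrative.

```python
from fractions import Fraction

def reduce_to_diagonal(matrix):
    """Reduce a matrix to diagonal form by elementary row and column operations."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    for p in range(min(rows, cols)):
        # Look for a nonzero entry in the trailing block a[p:, p:].
        pivot = next(((r, c) for r in range(p, rows) for c in range(p, cols)
                      if a[r][c] != 0), None)
        if pivot is None:
            break  # the trailing block is zero, so the matrix is already diagonal
        r, c = pivot
        a[p], a[r] = a[r], a[p]            # row interchange
        for row in a:                      # column interchange
            row[p], row[c] = row[c], row[p]
        for i in range(p + 1, rows):       # clear column p below the pivot
            factor = a[i][p] / a[p][p]
            a[i] = [x - factor * y for x, y in zip(a[i], a[p])]
        for j in range(p + 1, cols):       # clear row p to the right of the pivot
            factor = a[p][j] / a[p][p]
            for row in a:
                row[j] -= factor * row[p]
    return a
```

The number of nonzero diagonal entries in the result is the $q$ of the theorem, and they occupy the leading diagonal positions.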
Proof. Use the previous theorem and take $Q = (E_r \cdots E_1)^{-1}$ and $P = F_1 \cdots F_s$.
Proof. Using theorem B.2, if $A$ is invertible, so is $D$ (recall that elementary matrices are
invertible). In this case, still in the notation of theorem B.2, $q = n$ and $D = S_1 S_2 \cdots S_n$, where
$S_i$ is the scaling matrix
\[
S_i = \begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & d_i & & \\
& & & \ddots & \\
& & & & 1
\end{pmatrix}.
\]
Thus $A = E_1^{-1} \cdots E_r^{-1} S_1 \cdots S_n F_s^{-1} \cdots F_1^{-1}$, as desired.
Glossary of Symbols
Index