
Real Analysis Course Notes 2017

This course covered real analysis and was taught by Yum-Tong Siu. There were weekly problem sets, a midterm, and final exam. Topics included measure theory, integration, Fourier series, and partial differential equations. The course aimed to understand how to represent functions as eigenfunctions of differential operators in order to solve differential equations.


Math 212a - Real Analysis

Taught by Yum-Tong Siu


Notes by Dongryul Kim
Fall 2017

This course was taught by Yum-Tong Siu. The class met on Tuesdays and
Thursdays at 2:30–4pm, and the textbooks used were Real analysis: measure
theory, integration, and Hilbert spaces by Stein and Shakarchi, and Partial
differential equations by Evans. There were weekly problem sets, a timed
take-home midterm, and a take-home final. There were 6 students enrolled, and
detailed lecture notes and problem sets may be found on the course website.

Last Update: January 15, 2018

Contents
1 August 31, 2017
1.1 Convergence of the Fourier series

2 September 5, 2017
2.1 Measure of a set
2.2 More properties of the measure

3 September 7, 2017
3.1 Integration
3.2 Egorov’s theorem
3.3 Convergence theorems

4 September 12, 2017
4.1 Differentiability of nondecreasing functions
4.2 Fundamental theorem of calculus

5 September 14, 2017
5.1 Fundamental theorem of calculus II
5.2 High-dimensional analogue

6 September 19, 2017
6.1 The averaging problem
6.2 Convergence of Fourier series

7 September 21, 2017
7.1 Dirichlet test for Fourier series
7.2 Approximation to identity
7.3 Fourier transform

8 September 26, 2017
8.1 Fejér–Lebesgue theorem
8.2 Fourier inversion formula
8.3 Fubini’s theorem

9 September 28, 2017
9.1 Mean value theorem
9.2 Hölder and Minkowski inequalities
9.3 Hilbert space

10 October 3, 2017
10.1 Riesz representation theorem
10.2 Adjoint operator

11 October 5, 2017
11.1 Approximating by convolutions

12 October 10, 2017
12.1 Compact operators
12.2 Spectral theorem

13 October 12, 2017
13.1 Sturm–Liouville equation
13.2 Fredholm’s alternative
13.3 Banach spaces

14 October 17, 2017
14.1 Useful facts about Banach spaces
14.2 Hahn–Banach theorem

15 October 19, 2017
15.1 Calculus on manifolds
15.2 Poincaré lemma
15.3 Rellich’s lemma

16 October 24, 2017
16.1 L2 space of differential forms
16.2 Gårding’s inequality

17 October 26, 2017
17.1 Solving the equation
17.2 Regularity of harmonic forms

18 October 31, 2017
18.1 Smearing out by polynomials
18.2 Differential equations with constant coefficients

19 November 2, 2017
19.1 Distributions
19.2 Tempered distributions

20 November 7, 2017
20.1 Malgrange–Ehrenpreis on unbounded region

21 November 9, 2017
21.1 Division problem for distributions
21.2 Changing the domain of integration

22 November 14, 2017
22.1 Elliptic regularity

23 November 16, 2017
23.1 Division problem of tempered distributions by polynomials
23.2 Sobolev embedding

24 November 21, 2017
24.1 Nirenberg’s trick of integrating in a rectangular direction
24.2 Moser’s iteration trick

25 November 28, 2017
25.1 Radon–Nikodym derivative
25.2 Ergodic measure theory

26 November 30, 2017
26.1 Hausdorff measure
26.2 Radon transform

Math 212a Notes 4

1 August 31, 2017


We are going to use two books: Stein–Shakarchi, Real Analysis and Lawrence
Evans, Partial Differential Equations.
The motivation for studying real analysis is solving differential equations.
Fourier wrote a book in 1822 about solving the heat equation. His idea was to
convert the differential equation into algebraic equations. Solving differential
equations amounts to finding eigenfunctions for differential operators. For
instance, $e^{rx}$ is an eigenfunction for $d/dx$ with eigenvalue $r$. So if we let
\[ f(x) = \sum_j c_j e^{r_j x}, \]
then $df/dx = \sum_j c_j r_j e^{r_j x}$ and so this becomes an algebraic equation
in the coefficients. A special case is when $f(x)$ is on $\mathbb{R}$ with period
$2\pi$. Then the eigenfunctions are $e^{inx}$ with $n \in \mathbb{Z}$.
But one needs to argue that there is a unique representation
$f(x) = \sum_{n \in \mathbb{Z}} c_n e^{inx}$. In the case of a finite-dimensional
vector space with an inner product, this is easy. You can easily choose an
orthonormal basis of $\mathbb{C}^n$, and it is easy to check that
\[ \vec{v} = \sum_{j=1}^{n} c_j \vec{e}_j, \quad \text{where } c_j = \langle \vec{v}, \vec{e}_j \rangle. \]

People tried to do a similar thing. It is easy to check that
$e^{inx}/\sqrt{2\pi}$ are orthonormal with respect to the inner product
\[ (f, g) = \int_{x=0}^{2\pi} f(x)\overline{g(x)}\,dx. \]
So it would be reasonable that
\[ \gamma_n = (f, e^{inx}/\sqrt{2\pi}) = \int_{x=0}^{2\pi} f(x) \frac{\overline{e^{inx}}}{\sqrt{2\pi}}\,dx. \]
Now the question is whether
\[ f(x) = \frac{1}{2\pi} \sum_{n \in \mathbb{Z}} (f, e^{inx}) e^{inx} \]

is true. Lebesgue answered this question in 1902, using Lebesgue theory.
Actually this did not entirely solve the problem of differential equations, even
with constant coefficients; that remained unsolved until 1955, when Malgrange
and Ehrenpreis solved the constant-coefficient case.

1.1 Convergence of the Fourier series


We need to start with replacing the function $f$ by $(c_n)_{n \in \mathbb{Z}}$.
Consider the $N$th partial sum
\[ s_N = \sum_{n=-N}^{N} c_n e^{inx}. \]
We want to know if $\lim_{N \to \infty} s_N = f$. It can be computed that
\begin{align*}
s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}
  &= \sum_{k=-n}^{n} \frac{1}{2\pi} \int_{y=0}^{2\pi} f(y) e^{-iky}\,dy\, e^{ikx} \\
  &= \frac{1}{2\pi} \int_{y=0}^{2\pi} f(y) \Bigl( \sum_{k=-n}^{n} e^{ik(x-y)} \Bigr) dy.
\end{align*}
The sum inside the integral is called the Dirichlet–Dini kernel. We can sum
it as
\[ D_n(x) = \sum_{k=-n}^{n} e^{ikx} = \frac{e^{-inx} - e^{i(n+1)x}}{1 - e^{ix}} = \frac{\sin(n + \frac{1}{2})x}{\sin \frac{1}{2}x}. \]
Then we can write
\[ s_n(x) = \frac{1}{2\pi} \int_{0}^{2\pi} f(y) D_n(x - y)\,dy, \]
as the convolution.
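As a quick sanity check of the closed form for the kernel, here is a minimal Python sketch that compares the direct sum with the sine quotient at a few sample points. The function names and the sample points are illustrative choices, not part of the notes.

```python
import cmath
import math


def dirichlet_sum(n, x):
    """D_n(x) computed directly as the sum of e^{ikx} for k = -n..n."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1))


def dirichlet_closed(n, x):
    """D_n(x) = sin((n + 1/2) x) / sin(x / 2), valid when sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)


def max_gap(n, xs):
    """Largest absolute difference between the two formulas over xs."""
    return max(abs(dirichlet_sum(n, x) - dirichlet_closed(n, x)) for x in xs)


# Illustrative sample points in (0, 2*pi), away from the zeros of sin(x/2).
sample_points = [0.1, 0.7, 1.3, 2.9, 4.4, 6.0]
gap = max_gap(5, sample_points)
```

At $x = 0$ the quotient is not defined, but the direct sum gives $D_n(0) = 2n + 1$, which is the limit of the quotient.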
To argue about the limit of $s_n(x)$ as $n \to \infty$, we need to say something
about the commutation of limits. Exchanging limits is not valid for most
functions, for instance characteristic functions of bad sets. Lebesgue had the
big idea of partitioning the range instead of the domain. The base of the
rectangles may be strange, though.
If we change the period from $[0, 2\pi]$ to $[-L, L]$, we can rescale the length
and use $e^{inx\pi/L}$ instead. In the limiting case $L \to \infty$, we will get
another theory. Then
\[ \hat{f}(\xi) = \int_{x=-\infty}^{\infty} f(x) e^{-2\pi i \xi x}\,dx, \qquad f(x) = \int_{\xi=-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i \xi x}\,d\xi. \]
These are called the Fourier transform and the inversion formula.
This still doesn’t solve the differential equations with constant coefficients.
Let me tell you why. Suppose you have a differential equation like
\[ \Bigl( \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2} \Bigr) f = g. \]
If you take the Fourier transform, we get
\[ -4\pi^2 (\xi_1^2 + \cdots + \xi_n^2) \hat{f} = \hat{g} \]
and then divide by the polynomial. But this is a problem because there might
be zeros in the polynomial. This was a big problem, and it took a long time to
handle these.
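The multiplier $-4\pi^2 \xi^2$ can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: it uses the Gaussian $e^{-\pi x^2}$ as a test function, a plain trapezoid rule on $[-8, 8]$, and helper names that are made up for this example. Since both integrands are even and real, only the cosine part of $e^{-2\pi i \xi x}$ contributes.

```python
import math


def fourier_transform_even(func, xi, half_width=8.0, steps=16000):
    """Trapezoid approximation of the Fourier transform of an even real func.

    Integrates func(x) * cos(2*pi*xi*x) over [-half_width, half_width];
    the sine part vanishes by symmetry.
    """
    h = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * func(x) * math.cos(2 * math.pi * xi * x)
    return total * h


def gaussian(x):
    return math.exp(-math.pi * x * x)


def gaussian_second_derivative(x):
    # d^2/dx^2 of exp(-pi x^2) = (4 pi^2 x^2 - 2 pi) exp(-pi x^2)
    return (4 * math.pi ** 2 * x * x - 2 * math.pi) * math.exp(-math.pi * x * x)


xi = 0.5
lhs = fourier_transform_even(gaussian_second_derivative, xi)
rhs = -4 * math.pi ** 2 * xi ** 2 * fourier_transform_even(gaussian, xi)
```

Here `lhs` is the transform of $f''$ and `rhs` is the multiplier times the transform of $f$. Recovering $\hat{f}$ from $\hat{g}$ would require dividing by $-4\pi^2 \xi^2$, which vanishes at $\xi = 0$; this is the division problem described above.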

2 September 5, 2017
So we start out with the details. We want to solve differential equations by
eigenfunctions, but there are two problems here. The first is the convergence of
series like $f = \sum_{n=-\infty}^{\infty} c_n e^{inx}$. The second is the effect
of differentiation. But here, it is easier to do integration, because it will be
easier to exchange sums. So the tool we need here is the fundamental theorem of
calculus. For Riemann integration, we need “uniformity”, but this is too strong
and so we will use Lebesgue theory. We are going to study the size of the set
where a certain property holds or fails.

2.1 Measure of a set


We start with the reals $\mathbb{R}$. For a set $E \subseteq \mathbb{R}$, we are
going to study its size.
Theorem 2.1 (Structure theorem). Let $O$ be an open subset of $\mathbb{R}$. Then
$O$ is the disjoint union of a countable number of open intervals, possibly of
infinite length.

Proof. The key is that a point disconnects an open set in $\mathbb{R}$. Then any
connected component of $O$ is an open interval. This is because, if $x \in O$, then
\[ A_x = \{ y \in \mathbb{R} : y > x \text{ and } [x, y] \subseteq O \} \]
is of the form $(x, b)$. If you do the same thing on the other side, the connected
component of $x$ in $O$ is $(a, b)$. Countability follows from the fact that the
rational numbers are dense and countable.
This means that we can write $O = \bigcup_{k \in J} (a_k, b_k)$ as a disjoint
union. This allows us to define its measure as
\[ m(O) = \sum_{k \in J} (b_k - a_k). \]
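For a finite union of open intervals, the decomposition into disjoint components and the resulting measure can be computed directly. The following Python sketch is an illustration only; `components` and `measure` are hypothetical helper names, and the input is assumed to be a finite list of pairs $(a, b)$.

```python
def components(intervals):
    """Merge a finite list of open intervals (a, b) into disjoint open components."""
    merged = []
    for a, b in sorted(intervals):
        if a >= b:
            continue  # (a, b) is empty
        # Strict inequality: (0, 1) and (1, 2) do not overlap, since the
        # point 1 belongs to neither, so they stay separate components.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def measure(intervals):
    """m(O) as the sum of the lengths of the disjoint components of O."""
    return sum(b - a for a, b in components(intervals))


example = [(0, 2), (1, 3), (5, 6), (3, 4)]
example_components = components(example)
example_measure = measure(example)
```

In this example the overlapping intervals $(0, 2)$ and $(1, 3)$ merge into $(0, 3)$, while $(3, 4)$ remains a separate component because the point $3$ is not in the union.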

Now given an arbitrary set $E \subseteq \mathbb{R}$, Lebesgue tried to approximate
$E$ from the outside by open sets, and from the inside by closed sets. But we are
going to do something different.

Definition 2.2. Given a set $E \subseteq \mathbb{R}$, we define its outer measure as
\[ m^*(E) = \inf_{O \supseteq E \text{ open}} m(O). \]

Lebesgue’s original definition also used an inner measure, something like
$m_i(E) = \sup_{F \subseteq E} m(F)$, and called $E$ measurable if the outer and
inner measures coincide. But we are not doing this. Instead, we define
Definition 2.3. Given $E \subseteq \mathbb{R}$, $E$ is measurable if for each
$\epsilon > 0$, there exists an open set $O_\epsilon \supseteq E$ such that
$m^*(O_\epsilon - E) < \epsilon$.
Why are these equivalent? Intuitively, we are trying to approximate
$O_\epsilon - E$ from the outside, i.e., taking a neighborhood $G_\epsilon$ of
$O_\epsilon - E$. Here we can replace $G_\epsilon$ with $G_\epsilon \cap O_\epsilon$.
Then this is like approximating $E$ from the inside by
$F_\epsilon = O_\epsilon - G_\epsilon$, which is closed in $E$.
There are three important properties of the collection of all measurable sets.
This is the abstract formulation.
(0) The empty set $\emptyset$ and the universe $\mathbb{R}$ are both measurable.
(1) If $A$ and $B$ are measurable, then $A - B$ is measurable.
(2) Countable unions of measurable sets are measurable.
(3) Countable intersections of measurable sets are measurable.
If $X$ is a set and a collection of subsets of $X$ has these properties, then it
is called a $\sigma$-algebra.
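Proposition 2.4. Countable unions of measurable sets are measurable.
Proof. Let $E_n$ be the collection and $E = \bigcup_n E_n$. For each $n$, we
approximate $E_n \subseteq O_{\epsilon,n}$ so that
$m^*(O_{\epsilon,n} - E_n) < 2^{-n}\epsilon$. Now
$E \subseteq O_\epsilon = \bigcup_n O_{\epsilon,n}$ and the excess would be
\[ O_\epsilon - E \subseteq \bigcup_n (O_{\epsilon,n} - E_n), \]
and by countable sub-additivity of the outer measure, the right hand side has
outer measure at most $\epsilon$.
Proving (1) is more complicated, because outer approximation and inner
approximation get mixed up.
Proposition 2.5 (Additivity of outer measure). If $E_1, E_2 \subseteq \mathbb{R}$
have positive distance between them, i.e., $\inf_{x_j \in E_j} |x_1 - x_2| > 0$,
then $m^*(E_1 \cup E_2) = m^*(E_1) + m^*(E_2)$.
Proof. Let $\delta$ be the distance. Let
$G_j = \{ y \in \mathbb{R} : \mathrm{dist}(y, E_j) < \delta/2 \}$. This is an
open set by the triangle inequality. Also $G_1$ and $G_2$ are disjoint. Given an
open set $O_\epsilon \supseteq E_1 \cup E_2$ with
$m(O_\epsilon) < m^*(E_1 \cup E_2) + \epsilon$, we can assume that
$O_\epsilon \subseteq G_1 \cup G_2$ by replacing with the intersection. Let
$O_{\epsilon,j} = G_j \cap O_\epsilon$ so that $O_\epsilon$ is the disjoint union
of $O_{\epsilon,1}$ and $O_{\epsilon,2}$. Then
\[ m^*(E_1) + m^*(E_2) \le m(O_{\epsilon,1}) + m(O_{\epsilon,2}) = m(O_\epsilon) < m^*(E_1 \cup E_2) + \epsilon. \]
Since $\epsilon$ is arbitrary, we get $m^*(E_1) + m^*(E_2) \le m^*(E_1 \cup E_2)$;
the inequality on the other side is sub-additivity.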
Proposition 2.6. Every closed subset $F \subseteq \mathbb{R}$ is measurable.
Proof. By countable union, we may assume that $F$ is bounded;
$F = \bigcup_n F \cap [-n, n]$. Assume that $F \subseteq (a, b)$. Then we can
again write
\[ (a, b) - F = \bigcup_{j \in J} (a_j, b_j). \]
Now we can approximate $(a_j, b_j)$ from the inside by closed sets, and then take
the complement. Here, we actually need to select a finite number of big intervals.
Proposition 2.7. If $E$ is measurable, then $E^c = \mathbb{R} - E$ is measurable.

Proof. By definition, there are open sets $O_n \supseteq E$ such that
$m^*(O_n - E) < n^{-1}$. If we let $S = \bigcup_n O_n^c$, then
$E^c - S \subseteq O_n - E$ for all $n$, and so $m^*(E^c - S) = 0$.
This implies that $E^c - S$ is measurable. On the other hand, $S$ is a countable
union of closed sets and so measurable. Hence $E^c$ is also measurable.

This and (2) imply (3), and then (1) follows. Thus measurable sets form
a $\sigma$-algebra.

2.2 More properties of the measure


Proposition 2.8 (Additivity of measure). If $E_j$ are measurable and disjoint
and $E = \bigcup_j E_j$, then
\[ m(E) = \sum_j m(E_j). \]

Proof. First reduce to the case when each $E_j$ is bounded, by breaking them up
into $E_j \cap [n, n + 1)$. For the other side of the inequality, we approximate
$E_j$ from the inside by closed sets $F_j \subseteq E_j$ so that
$m(E_j - F_j) < 2^{-j}\epsilon$. Here the point is that any two disjoint compact
sets have positive distance. So we can use the previous additivity property to
get additivity for $F_j$.
Proposition 2.9. The measure of a limit of an increasing sequence is the limit
of the measure.

Proof. You just use the fact that
$E_n = (E_n - E_{n-1}) \cup (E_{n-1} - E_{n-2}) \cup \cdots$.
Proposition 2.10. The measure of a limit of a decreasing sequence is the limit
of the measure, if the measure of at least one is finite.
This $m(E_1) < \infty$ assumption is crucial. Here is a counterexample. If
$E_n = (n, \infty)$, then $E = \bigcap_n E_n = \emptyset$ and $m(E_n) = \infty$.

3 September 7, 2017
Let me make a remark about Borel measurable sets. Abstract measure theory is
just taking some sets and taking countable unions, intersections, complements.
Borel measurable sets are the sets that can be obtained by these operations
from open sets. There are more Lebesgue measurable sets than Borel sets; the
difference comes from sets of measure zero.
Let me give you two counter-examples. We are going to construct a non-measurable
set $E \subseteq [0, 1]$ such that a countable disjoint union of translates of $E$
contains $[0, 1]$ but is contained in $[-1, 2]$. Consider the equivalence relation
$x \sim y$ if and only if $x - y \in \mathbb{Q}$ on $[0, 1]$, and pick one
representative for each class. Then this set $E$ has the property. If $E$ were
measurable, countable additivity and translation invariance would force the
measure of the union to be $0$ or $\infty$, but it lies between $1$ and $3$.
The second example is the Cantor function, which does not satisfy the
fundamental theorem of calculus.

3.1 Integration
Let’s look at the definition of Riemann integration. Riemann’s idea was to cut
the interval $[a, b]$ into smaller intervals, and take the limit as the mesh goes
to zero. Here, we require that $f$ is continuous, or almost continuous. We then
approximate the function with the step function
\[ \sum_j a_j \chi_{I_j}. \]
The definition depends on the fact that this step function approaches $f$
uniformly. Then by definition,
\[ \sum_j a_j m(I_j) = \int \sum_j a_j \chi_{I_j} \to \int f. \]
In Lebesgue’s case, we assume that $0 \le f \le M$ and cut the target into pieces
$0 = y_0 < y_1 < \cdots < y_p = M$. Let
\[ E_j = \{ x : y_{j-1} \le f(x) < y_j \}. \]
Then you conclude that
\[ \sum_j y_j \chi_{E_j} \to f \]
pointwise as the mesh goes to 0. Such a function is called a simple function.
Then we define
\[ \sum_j y_j m(E_j) = \int \sum_j y_j \chi_{E_j} \to \int f. \]
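To see the partition of the range in action, here is a small Python sketch for the specific choice $f(x) = x^2$ on $[0, 1]$ (an example picked for illustration, not taken from the notes). In that case $E_j = [\sqrt{y_{j-1}}, \sqrt{y_j})$, so $m(E_j) = \sqrt{y_j} - \sqrt{y_{j-1}}$ is available in closed form, and the sums should approach $\int_0^1 x^2\,dx = 1/3$.

```python
import math


def lebesgue_sum_of_square(p):
    """Sum of y_j * m(E_j) for f(x) = x^2 on [0, 1], range cut into p equal pieces."""
    total = 0.0
    for j in range(1, p + 1):
        y_prev, y_j = (j - 1) / p, j / p
        # E_j = {x : y_prev <= x^2 < y_j} = [sqrt(y_prev), sqrt(y_j))
        measure_of_Ej = math.sqrt(y_j) - math.sqrt(y_prev)
        total += y_j * measure_of_Ej
    return total


approximations = {p: lebesgue_sum_of_square(p) for p in (10, 100, 1000)}
```

Because each $y_j$ is the upper end of its piece, these sums overestimate the integral, and the error is bounded by the mesh $1/p$ times $m([0, 1]) = 1$.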
j

3.2 Egorov’s theorem


Theorem 3.1 (Egorov, almost uniform convergence). Let $E$ be measurable and
$m(E) < \infty$. Let $f_k$ be a sequence of measurable functions supported on $E$,
and $f_k \to f$ almost everywhere on $E$. Then for every $\epsilon > 0$ there
exists a closed subset $A_\epsilon \subseteq \mathbb{R}$ with
$A_\epsilon \subseteq E$ such that $m(E - A_\epsilon) < \epsilon$ and $f_k \to f$
uniformly on $A_\epsilon$.
Definition 3.2. $f$ is measurable if $\{ f < c \}$ is measurable as a set for all
$c \in \mathbb{R}$.
Proof. Uniform convergence $f_k \to f$ means that for every $n$ there exists an
$N$ such that $|f_k - f| < n^{-1}$ for all $k \ge N$. So we look at where this fails.
For $n, N$, define
\[ E_{n,N} = \{ x \in E : |f_k(x) - f(x)| < n^{-1} \text{ for all } k \ge N \}. \]
In words, this is the set on which $f_k$ is uniformly $n^{-1}$-close to $f$ for
$k \ge N$. By definition $E_{n,N} \subseteq E_{n,N+1}$, and $E_{n,N} \nearrow E$
as $N \to \infty$ up to a set of measure zero, because $f_k \to f$ pointwise
almost everywhere. So given $n$, there exists an $N_n$ such that
\[ m(E - E_{n,N_n}) < 2^{-n}. \]
So $f_k$ is $n^{-1}$-close to $f$ for $k \ge N_n$ outside a set of measure $2^{-n}$.
Given $\epsilon > 0$, there exists an $\ell$ such that
\[ \hat{A}_\epsilon = \bigcap_{n \ge \ell} E_{n,N_n} \]
has $m(E - \hat{A}_\epsilon) < \epsilon/2$. This is the “good set”, but we also
have the condition that this has to be closed. Approximate $\hat{A}_\epsilon$ by
a “good” closed subset $A_\epsilon \subseteq \hat{A}_\epsilon$ such that
$m(\hat{A}_\epsilon - A_\epsilon) < \epsilon/2$.
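A standard example (chosen here for illustration, not taken from the notes) is $f_k(x) = x^k$ on $E = [0, 1]$, which converges to $0$ on $[0, 1)$ but not uniformly there. On the closed set $A_\epsilon = [0, 1 - \epsilon]$ the convergence is uniform, since the supremum $(1 - \epsilon)^k$ goes to $0$. The Python sketch below just evaluates these two quantities; the function names are hypothetical.

```python
def value(k, x):
    """f_k(x) = x^k."""
    return x ** k


def sup_on_A(k, eps):
    """sup of f_k over A_eps = [0, 1 - eps], attained at the right endpoint."""
    return value(k, 1.0 - eps)


def witness_near_one(k):
    """f_k evaluated at x = 1 - 1/k, a point of [0, 1) where f_k stays near 1/e."""
    return value(k, 1.0 - 1.0 / k)


uniform_error = sup_on_A(200, 0.05)      # small: uniform convergence on A_eps
stubborn_value = witness_near_one(200)   # not small: no uniform convergence on [0, 1)
```

The set thrown away, $(1 - \epsilon, 1]$, has measure $\epsilon$, matching the statement of the theorem.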

Corollary 3.3. If $\varphi_k$ is a sequence of simple functions, with common
finite-measure support, and $\varphi_k \to f$ pointwise, then $\varphi_k \to f$
almost uniformly.
Corollary 3.4. If $\varphi_k$ is further uniformly bounded, then
$\int \varphi_k$ is Cauchy.
So we can define Lebesgue integration.

3.3 Convergence theorems


There are three convergence theorems and Fatou’s lemma, all of which follow
from Egorov’s theorem.

Lemma 3.5 (Fatou). Let $f_k \ge 0$ be a sequence of functions on $\mathbb{R}$. Then
\[ \liminf_{k \to \infty} \int f_k \ge \int \liminf_{k \to \infty} f_k. \]
All the other theorems deal with the question of whether
$\lim_{k \to \infty} \int f_k = \int f$ if $f_k \to f$.
Before I go on, I want to make some comments on the definition of Lebesgue
integration. So far, we have defined the integral of $0 \le f \le M$ on $E$ with
$m(E) < \infty$. Now we relax the conditions, one by one. Suppose $f \ge 0$ but
that it could be unbounded. What we can do is to truncate $f$ at $n$ so that it
becomes $\min(f, n)$. The integral of this is well-defined and we can then take
the limit as $n \to \infty$.
Next, we would like to remove the condition $m(E) < \infty$. Here we truncate
the domain and replace $E$ with $E \cap [-n, n]$. But we still need the $f \ge 0$
condition, because otherwise we could get $\infty - \infty$.
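As a concrete instance of truncating the range, take $f(x) = x^{-1/2}$ on $(0, 1]$ (an example chosen for illustration). Then $\min(f, n)$ equals $n$ on $(0, 1/n^2)$ and equals $f$ on $[1/n^2, 1]$, so its integral can be written down piece by piece. The Python sketch below does this with exact fractions, assuming $n \ge 1$ is an integer; `truncated_integral` is a hypothetical name.

```python
from fractions import Fraction


def truncated_integral(n):
    """Integral over (0, 1] of min(x^(-1/2), n) for an integer n >= 1."""
    cut = Fraction(1, n * n)              # x^(-1/2) > n exactly when x < 1/n^2
    flat_part = n * cut                   # min(f, n) = n on (0, 1/n^2)
    # Integral of x^(-1/2) over [1/n^2, 1] is 2*sqrt(1) - 2*sqrt(1/n^2) = 2 - 2/n.
    curved_part = 2 - 2 * Fraction(1, n)
    return flat_part + curved_part


values = [truncated_integral(n) for n in (1, 10, 1000)]
```

The result is $2 - 1/n$, which increases to $2$, the value assigned to $\int_0^1 x^{-1/2}\,dx$.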
Definition 3.6. We say that $f$ is integrable if $\int f_+ = \int \max\{0, f\}$
is finite and $\int f_- = \int \max\{0, -f\}$ is also finite. In this case, we
have $f = f_+ - f_-$ and we define
\[ \int f = \int f_+ - \int f_-. \]
The dominated convergence theorem concerns the case $|f_k| \le g$ where $g$ is
integrable. Here we need absolute continuity, i.e., $\int_A g \to 0$ as
$m(A) \to 0$.
Proof of Fatou’s lemma. The proof is actually quite simple. Let
$f = \liminf_k f_k$ and $\varphi_n = \inf_{k \ge n} f_k$ so that
$\varphi_n \nearrow f$. Here we can use monotone convergence to get
$\int \varphi_n \to \int f$. On the other hand, $\varphi_n \le f_n$ implies
$\int \varphi_n \le \int f_n$. Taking lim inf of both sides gives the result
immediately.
If things are not finite, you can truncate the functions and then take the
limit.

I’d also like to compare Riemann integration and Lebesgue integration. It
turns out that f is Riemann integrable if and only if f is continuous outside a
set of measure zero.

4 September 12, 2017


To define Lebesgue integration, we first looked at how to measure sets. Then we
approximated the function with simple functions. We needed Egorov’s theorem
to give sense to it. Then for unbounded functions with non-compact support,
we had to truncate it.
We were comparing Riemann integration with Lebesgue integration. We
would want to characterize functions which are Riemann integrable.

Theorem 4.1. If $f$ is bounded on $[a, b]$ (with $a < b$), then $f$ is Riemann
integrable if and only if $f$ is continuous outside a set of measure zero.
Proof. ($\Rightarrow$) Let $\psi_P$ and $\varphi_P$ be the step functions on the
partition $P$ which give the upper sum and the lower sum. If $f$ is Riemann
integrable, then there is a sequence of partitions such that
\[ \varphi_{P_\nu} \le \varphi_{P_{\nu+1}} \le \cdots \le f \le \cdots \le \psi_{P_{\nu+1}} \le \psi_{P_\nu}. \]
Here, if we let $\varphi = \lim_{\nu \to \infty} \varphi_{P_\nu}$ and
$\psi = \lim_{\nu \to \infty} \psi_{P_\nu}$, then
\[ \int \varphi = \lim_{\nu \to \infty} \int \varphi_{P_\nu} = \int f = \int \psi, \]
in the Lebesgue sense. Since $\psi \ge \varphi$, we have $\psi = \varphi$ outside
$E$, a measure zero set.
We claim that $f$ is continuous outside $E$ plus the countable number of
partition points. This is because at any point outside $E$ that is not a
partition point, the function $f$ is approximated from above and below.
($\Leftarrow$) The function is Riemann integrable if it is continuous. Also, $f$
is bounded, so if the non-continuous points have measure zero, then the upper
sum and lower sum will have a small difference.

4.1 Differentiability of nondecreasing functions


There are two forms of the fundamental theorem of calculus: you first integrate
and then differentiate, or you first differentiate and then integrate.
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x), \qquad \int_{x=a}^{b} F'(x)\,dx = F(b) - F(a) \]
For the first part, the left hand side is only defined by $f$ up to “almost
everywhere”. So this is the best you can expect. But then we need to ask when
$dF/dx$ makes sense, almost everywhere.
Dini had the idea of defining 4 derivates. When you define the difference
quotient
\[ \lim_{h \to 0} \frac{f(h + x_0) - f(x_0)}{h}, \]
you actually have limit from the left and right, and lim sup and lim inf.
\[ \limsup_{h \to 0+} = D^+ f(x_0), \quad \liminf_{h \to 0+} = D_+ f(x_0), \quad \limsup_{h \to 0-} = D^- f(x_0), \quad \liminf_{h \to 0-} = D_- f(x_0) \]

Under what condition are they the same? Vitali decided to look at the cumulative
effect of the errors. Assume that $f$ is nondecreasing. The “bad” sets we
want to look at are the ones with
\[ \liminf \frac{f(x_0 + h) - f(x_0)}{h} < \limsup \frac{f(x_0 + h) - f(x_0)}{h}. \]
Because this is hard to control, fix rationals $\alpha < \beta$ and only look at
the $x_0$ such that the left hand side is smaller than $\alpha$ and the right hand
side greater than $\beta$.
Lemma 4.2. If $E_k \nearrow E$ in $\mathbb{R}$ then
$\lim_{k \to \infty} m^*(E_k) = m^*(E)$.
Proof. Approximate $E_k$ by the open sets from the outside.
Lemma 4.3 (Vitali covering). Let $E \subseteq \mathbb{R}$ with $m^*(E) < \infty$.
For every $x \in E$, suppose there exists a nonempty set $A_x$ of positive
numbers. Then for every $\epsilon > 0$, there exist $x_1, \ldots, x_k \in E$ and
$r_{x_j} \in A_{x_j}$ such that $(x_j, x_j + r_{x_j})$ are all disjoint and
\[ m^*\Bigl( E \cap \bigcup_{j=1}^{k} (x_j, x_j + r_{x_j}) \Bigr) \ge m^*(E) - \epsilon. \]
The main difficulty is that we are only allowed to use finitely many intervals.
Proof. Let
\[ E_n = \{ x \in E : \sup A_x > 1/n,\ x \in [-n, n] \}. \]
Then $E_n \nearrow E$ and so for sufficiently large $n$, we can replace $E$ by
$E_n$ with at most $\epsilon$ error. Now you use intervals of length at least
$1/n$ to cover $E_n$ almost entirely.
Let $f$ be nondecreasing on $[a, b]$ with $a < b$ in $\mathbb{R}$. We would like
to show that $D^+ f(x) = D_+ f(x)$ almost everywhere. You need two other similar
statements to get differentiability entirely.
For $0 \le \alpha < \beta$ rational numbers, let
\[ E_\alpha = \{ x \in [a, b) : D_+ f(x) < \alpha \}. \]
Now for each $x \in E_\alpha$ there exists an $r_x > 0$ such that
$f(x + r_x) - f(x) < \alpha r_x$. Cover this set $E_\alpha$ by
$(x_j, x_j + r_{x_j})$ up to error $\epsilon$. Then the total length of the
$f$-image of the union is going to be at most
\[ \alpha \sum_j r_{x_j}. \]
Now replace $[a, b]$ by $[x_j, x_j + r_{x_j}]$ for each $j$ to apply Vitali again.
Define
\[ F_{j,\beta} = \{ x \in [x_j, x_j + r_{x_j}) : D^+ f(x) > \beta \}. \]
We can then cover the intervals $[x_j, x_j + r_{x_j}]$ with
$[x_{j,l}, x_{j,l} + s_{x_{j,l}}]$ up to measure $\epsilon$. It then follows that
the total length of the $f$-image of $\{ [x_{j,l}, x_{j,l} + s_{x_{j,l}}] \}$ is
at least $\beta$ times the total length. These two inequalities give a
contradiction as $\epsilon \to 0$. This finishes the proof.

4.2 Fundamental theorem of calculus


We now want to show that
\[ \frac{d}{dx} \int_a^x f(t)\,dt \]
exists almost everywhere if $f$ is integrable. In the Riemann case, you can do
this because it is literally defined using rectangles and so taking the
difference gives the rectangle.
For the Lebesgue case, let $f$ be integrable on $[a, b]$, and let
\[ F(x) = \int_a^x f(t)\,dt. \]
We may assume that $f \ge 0$ since we can take the positive and negative part,
and write $f$ as the difference of the two. Then $F$ is nondecreasing, and this
also means that $dF/dx$ is almost everywhere defined.
The difference quotient is, if $\epsilon_n > 0$,
\[ \int_a^b \frac{F(x + \epsilon_n) - F(x)}{\epsilon_n} = \frac{1}{\epsilon_n} \Bigl( \int_a^b F(x + \epsilon_n) - \int_a^b F(x) \Bigr) = \frac{1}{\epsilon_n} \Bigl( \int_b^{b + \epsilon_n} F - \int_a^{a + \epsilon_n} F \Bigr). \]
We now use the convergence theorem for Lebesgue integrals. Because $F$ is
continuous, the right hand side goes to $F(b) - F(a)$ as $\epsilon_n \to 0$. This
shows, by Fatou’s lemma,
\[ \int_a^b F' \le \liminf_{n \to \infty} \int_a^b \frac{F(x + \epsilon_n) - F(x)}{\epsilon_n} = F(b) - F(a). \]
Actually, this is true for any increasing continuous $F$. (Just forget about $f$.)
We would now like to show that $F(b) - F(a) \le \int_a^b F'$ and that $F' = f$.
Assume now that $f$ is bounded. The claim is that $\int_a^b F' = F(b) - F(a)$.
Instead of Fatou, we use bounded convergence. Note that
\[ \frac{F(x + \epsilon_n) - F(x)}{\epsilon_n} = \frac{1}{\epsilon_n} \int_x^{x + \epsilon_n} f \]
is bounded. So we really get
\[ F(b) - F(a) = \int_a^b F' \]
if $f$ is bounded. In general, you truncate the range to get
$f_n = \min\{f, n\}$. In this case, let $F_n(x) = \int_a^x f_n$ so that
$\int_a^b F_n' = F_n(b) - F_n(a)$.

5 September 14, 2017


As we have seen before, if $f$ is integrable, then
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x). \]
Here the key is to look at the cumulative errors and monotone functions. Define
$F(x) = \int_a^x f(t)\,dt$, and then $F'(x)$ exists almost everywhere, and the
question is whether $F' = f$ almost everywhere. The way to show this is to show
that $\int_a^b (F' - f) = 0$ for every $a < b$.
If $f$ is bounded, then we can use bounded convergence theorem to get
\[ \lim_{\epsilon \to 0} \int_a^b \frac{F(x + \epsilon) - F(x)}{\epsilon} = \int_a^b f. \]
If $f$ is not bounded, we can use the truncation $f_n = \min(f, n)$ and then for
$F_n(x) = \int_a^x f_n$, we have $F_n'(x) = f_n(x)$ almost everywhere. Fatou’s
lemma tells us that for any $F$ increasing and continuous,
\[ \int_a^b F' \le F(b) - F(a). \]
In this case, $F' \ge F_n'$ pointwise almost everywhere, and so
\[ \int_a^b F' \ge \int_a^b F_n' = F_n(b) - F_n(a) \to F(b) - F(a) \]
by monotone convergence.
Theorem 5.1 (Fundamental theorem of calculus I). If $f$ is integrable, then
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x) \]
for almost every x.
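The words “almost every” matter even for very simple integrands. As an illustration (the step function below is an example chosen here, not one from the notes), take $f = 1$ on $[0, 1)$ and $f = 3$ on $[1, 2]$. Then $F$ is piecewise linear, $F' = f$ away from $x = 1$, and at $x = 1$ the one-sided difference quotients disagree, so $F'(1)$ does not exist. The sketch uses exact fractions.

```python
from fractions import Fraction


def f(t):
    """A step function on [0, 2] with a jump at t = 1."""
    return 1 if t < 1 else 3


def F(x):
    """F(x) = integral of f over [0, x], valid for 0 <= x <= 2."""
    return x if x <= 1 else 1 + 3 * (x - 1)


def quotient(x, h):
    """Difference quotient (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h


h = Fraction(1, 1000)
left_of_jump = quotient(Fraction(1, 2), h)    # equals f(1/2)
right_of_jump = quotient(Fraction(3, 2), h)   # equals f(3/2)
at_jump_from_right = quotient(1, h)
at_jump_from_left = quotient(1, -h)
```

The exceptional set here is the single point $\{1\}$, which has measure zero.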

5.1 Fundamental theorem of calculus II


For the other version of the fundamental theorem of calculus, we need the
condition that $F$ is absolutely continuous. It took a long time for people to
nail down this condition.
The measure-theoretic condition is that $\int_E f \to 0$ if $m(E) \to 0$. But this
doesn’t make sense directly in terms of $F$, and we use the approximate version,
which replaces $E$ by a finite number of disjoint open intervals.
Definition 5.2. A function $F : [a, b] \to \mathbb{R}$ is called absolutely
continuous if for every $\epsilon > 0$ there exists a $\delta_\epsilon > 0$ such that
\[ \sum_{j=1}^{\ell} |F(b_j) - F(a_j)| < \epsilon \]
for disjoint open intervals $\bigcup_{j=1}^{\ell} (a_j, b_j) \subseteq [a, b]$
with $\sum_{j=1}^{\ell} (b_j - a_j) < \delta_\epsilon$.

This is stronger than uniform continuity, because that is only for one interval.
The technique for proving this is again comparing the functions. Let’s assume
for now that we know that $F'$ exists and is integrable. Let
\[ G(x) = \int_a^x F'. \]
Then $G' = F'$ almost everywhere, because we have proved this already, and we
want to check that $G(x) = F(x) - F(a)$. So if $(G - F)' = 0$ almost everywhere,
then is $G - F$ constant? This is true only when $G - F$ is absolutely continuous.
(Look at the Cantor function, which has almost everywhere zero derivative.)
So how do you prove that $F$ absolutely continuous implies $F'$ exists almost
everywhere? If we manage to split $F$ into a difference of increasing functions
$F = F_+ - F_-$, then each of them is differentiable almost everywhere and so $F$
is also differentiable almost everywhere.
Consider a partition $P$ of $[a, b]$ into $a = x_0 < \cdots < x_n = b$. Define
\[ \mathrm{Var}(P, f) = \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})|, \qquad \mathrm{Var}_a^b f = \sup_P \mathrm{Var}(P, f). \]
If $f$ is monotone, then clearly $\mathrm{Var}_a^b f = |f(b) - f(a)|$. On the
other hand, we always clearly have
$\mathrm{Var}_a^b (f + g) \le \mathrm{Var}_a^b f + \mathrm{Var}_a^b g$.
Theorem 5.3. A function $f$ on $[a, b]$ is the difference of two monotone
functions if and only if $\mathrm{Var}_a^b f < \infty$.
Proof. We have already proven the forward direction. For the other direction,
define
\begin{align*}
\mathrm{Var}_+(P, f) &= \sum_{j=1}^{n} \max(f(x_j) - f(x_{j-1}), 0), \\
\mathrm{Var}_-(P, f) &= \sum_{j=1}^{n} \min(f(x_j) - f(x_{j-1}), 0).
\end{align*}
Then $\mathrm{Var}(P, f) = \mathrm{Var}_+(P, f) - \mathrm{Var}_-(P, f)$ and
$f(b) - f(a) = \mathrm{Var}_+(P, f) + \mathrm{Var}_-(P, f)$. It follows that
\[ 2\,\mathrm{Var}_+(P, f) = \mathrm{Var}(P, f) + (f(b) - f(a)) \]
and likewise for $\mathrm{Var}_-$. Then we can take sup on both sides. After
doing this, we can define
\[ f_+(x) = \sup_P \mathrm{Var}_+(P_{[a,x]}, f) \]
and likewise for $f_-$ so that $f = f_+ - f_-$ (up to the constant $f(a)$). These
are clearly increasing functions, almost by definition.
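The identities relating $\mathrm{Var}$, $\mathrm{Var}_+$ and $\mathrm{Var}_-$ on a fixed partition are easy to check numerically. The Python sketch below does this for $f = \sin$ on $[0, 2\pi]$ with a uniform partition; the choice of function, the partition size, and the helper name `variations` are all assumptions made for the example.

```python
import math


def variations(f, partition):
    """Return (Var, Var_plus, Var_minus) of f over the given partition points."""
    jumps = [f(x1) - f(x0) for x0, x1 in zip(partition, partition[1:])]
    var = sum(abs(d) for d in jumps)
    var_plus = sum(max(d, 0.0) for d in jumps)    # total of the upward moves
    var_minus = sum(min(d, 0.0) for d in jumps)   # total of the downward moves, <= 0
    return var, var_plus, var_minus


n = 1000
a, b = 0.0, 2 * math.pi
partition = [a + (b - a) * j / n for j in range(n + 1)]
var, var_plus, var_minus = variations(math.sin, partition)
```

With this partition the turning points $\pi/2$ and $3\pi/2$ are grid points, so the computed variation is essentially the total variation $4$, split as $\mathrm{Var}_+ = 2$ and $\mathrm{Var}_- = -2$.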
We now have to show that absolute continuity implies finite total variation,
but this is obvious. So $F$ is almost everywhere differentiable. We now want
to show that $F'$ is integrable. We can do this for monotone functions, and
we actually know that the derivative of a monotone continuous function is
integrable, by Fatou’s lemma. Thus we need to check that $f_+$ and $f_-$ are both
continuous, if $f$ is absolutely continuous.
Let’s check left-continuity. Given any $\epsilon$, is there a $\delta$ such that
$|\mathrm{Var}_{+,a}^{y}(f) - \mathrm{Var}_{+,a}^{x}(f)| < \epsilon$ if
$x - \delta < y < x$? For this you can just use the assumption of absolute
continuity. For right-continuity, you reflect the whole picture by using
additivity of $\mathrm{Var}_+$.
The second statement is that if $F$ is absolutely continuous and $F' = 0$ almost
everywhere, then $F$ is constant. This is because you can pick a very small set
such that $F$ is locally constant outside. Then all the variation is inside this
little set. This shows that every two points have difference below $\epsilon$.
That is, $F$ is actually constant.
Theorem 5.4 (Fundamental theorem of calculus II). If $F$ is absolutely
continuous, then $F'$ is integrable and
\[ \int_a^b F' = F(b) - F(a). \]

5.2 High-dimensional analogue


For some hundred years, people tried to make rigorous what Fourier did. To do
this, we need the commutation between integration and differentiation, or more
generally the commutation between integration and limits. This is the whole
point of convergence theorems. Under the fundamental theorem of calculus, the
statement
\[ \frac{d}{dy} \int_a^b f(x, y)\,dx = \int_a^b \frac{\partial f}{\partial y}(x, y)\,dx \]
could be made into a commutation of integration.
If we are able to change order of integration, we can write
\begin{align*}
\int_{\eta=c}^{y} \int_{x=a}^{b} \frac{\partial}{\partial \eta} f(x, \eta)\,dx\,d\eta
  &= \int_{x=a}^{b} \int_{\eta=c}^{y} \frac{\partial}{\partial \eta} f(x, \eta)\,d\eta\,dx \\
  &= \int_{x=a}^{b} f(x, y)\,dx - \int_{x=a}^{b} f(x, c)\,dx
\end{align*}
by integrating the fundamental theorem of calculus. Then by the fundamental
theorem again, we get the result.
Fubini was the one who managed to prove this, and his idea was to define
the “double integral” that is just defined on functions on $\mathbb{R}^2$. To do
this, we need measure theory and Lebesgue theory on $\mathbb{R}^d$. The main
difference is that we no longer have the structure theorem for open subsets in
$\mathbb{R}$. But why do we need the structure theorem anyways? We used it to
define the measure of an open set, but if we can do this some other way, we’re fine.
The building blocks are the closed cubes in Rd . An open set O is always
an almost disjoint union of a countable number of cubes in Rd . Here, almost
Math 212a Notes 18

disjoint means that their interiors are disjoint. Is the measure defined in this
way unique? Actually you don’t have to worry about this, because you’re going
to define the exterior measure as
X
m∗ (E) = inf m(Qj ).
j

Now you can repeat everything again and get a measure theory on Rd .
The next step is the fundamental theorem of calculus, and here you run into
trouble, because you have many variables. One analogue of this can be thought
of as
\[ \lim_{x \in B,\ m(B) \to 0} \frac{1}{m(B)} \int_B f = f(x) \]
almost everywhere. (Compare with the one-dimensional case.) People call this the
“averaging problem”, and it was solved by Lebesgue.
How are we going to approach this problem? In the one-dimensional case,
we used a truncation of $f$ and used the approximation $f_n \nearrow f$. Here, we
are going to approximate $f$ by a continuous function $g$ of compact support,
in the $L^1$ norm. And then you hope that a $3\epsilon$ argument will work out.
The comparison between $f(x)$ and $g(x)$ is handled by Tchebychev’s inequality, and
the comparison between $\frac{1}{m(B)} \int_B f$ and $\frac{1}{m(B)} \int_B g$ is the
inequality for the Hardy–Littlewood maximal function.

6 September 19, 2017


We proved the fundamental theorem of calculus, and we verified to some extent
the interchange of integration and differentiation by Fubini’s theorem. Now we want
to talk about the fundamental theorem of calculus in higher dimensions. In the
1-dimensional case, the fundamental theorem is saying that
\[ \lim_{x \to a} \frac{1}{x - a} \int_a^x f(t)\,dt = f(a). \]
We can replace the interval $[a, x]$ by a set $B$ and write
\[ \lim_{x \in B,\ m(B) \to 0} \frac{1}{m(B)} \int_B f = f(x). \]
This is something that can be generalized.


If you have a function $f$, you can associate to it a measure
\[ E \mapsto \mu(E) = \int_E f. \]
Then what we are actually looking at is the Radon–Nikodym derivative
\[ \lim_{B \to p} \frac{\mu(B)}{m(B)} = g(p), \]
where the sets $B$ shrink to the point $p$. The second part of the fundamental
theorem of calculus can then be stated as $\mu(E) = \int_E g$. This is what we are
going to do.
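Example 6.1. Before starting, let me mention the Cantor function. This is the
function $f : [0, 1] \to [0, 1]$ defined by
\[ x = 0.a_1 a_2 a_3 \ldots_{(3)} \mapsto f(x) = 0.b_1 b_2 b_3 \ldots_{(2)}
   \quad\text{where } b_j = \Bigl\lfloor \frac{a_j}{2} \Bigr\rfloor, \]
for $x$ in the Cantor set (written with ternary digits $a_j \in \{0, 2\}$), and extended
to be constant on each removed middle third. This satisfies $f' = 0$ almost
everywhere, even though $f$ is continuous with $f(0) = 0$ and $f(1) = 1$.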
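As a small illustration of Example 6.1, the following sketch evaluates the Cantor
function from ternary digits. The name `cantor` and the digit cutoff are choices
made here, not part of the notes. The early return at a digit 1 encodes the
assumption that f is constant on each removed middle third.

```python
from fractions import Fraction

def cantor(x, digits=60):
    """Approximate the Cantor function at x in [0, 1] from its ternary digits."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(digits):
        x *= 3
        d = int(x)          # next ternary digit a_j
        x -= d
        if d == 1:          # x lies in a removed middle third: f is constant there
            return value + weight
        value += weight * (d // 2)   # b_j = floor(a_j / 2)
        weight /= 2
    return value

# f climbs from 0 to 1 although it is flat on every removed interval.
samples = [cantor(Fraction(k, 9)) for k in range(10)]
```

Exact rational arithmetic is used so that points such as 1/3 and 2/3 land exactly
on the endpoints of the removed intervals.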

6.1 The averaging problem


We want to show that, if $f$ on $\mathbb{R}^d$ is integrable, then
\[ \lim_{x \in B,\ m(B) \to 0} \frac{1}{m(B)} \int_B f = f(x) \]
for almost every $x$. The proof is done by looking at the stronger version
\[ \limsup_{x \in B,\ m(B) \to 0} \frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy = 0 \]
for almost every $x$. (This is stronger by the triangle inequality.)



The proof goes by approximating $f$ by a function $g$ for which the statement is
correct. For example, if $g$ is continuous on $\mathbb{R}^d$ then the assertion is
certainly true. The condition that $f$ is integrable means that $f$ can be
approximated by a linear combination of characteristic functions of cubes, and we
can then smooth out the corners to get a continuous function. So we get a sequence
of continuous functions $g$ such that
\[ \|f - g\|_{L^1} = \int_{\mathbb{R}^d} |f - g| \to 0. \]
Then we check that the statement carries over. We have
\[
\frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy
\le \frac{1}{m(B)} \int_B |f(y) - g(y)|\,dy
  + \frac{1}{m(B)} \int_B |g(y) - g(x)|\,dy + |g(x) - f(x)|.
\]
Applying lim sup on both sides, the middle term vanishes because $g$ is continuous,
and we get
\[
\limsup_{m(B) \to 0} \frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy
\le \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y) - g(y)|\,dy + |g(x) - f(x)|.
\]

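For $\alpha > 0$, let
\begin{align*}
E_\alpha &= \Bigl\{ x : \limsup_{m(B) \to 0} \frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy > \alpha \Bigr\}, \\
\tilde{E}_\alpha &= \Bigl\{ x : \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y) - g(y)|\,dy > \alpha \Bigr\}, \\
\hat{E}_\alpha &= \{ x : |g(x) - f(x)| > \alpha \}.
\end{align*}
Then $E_\alpha$ is contained in $\tilde{E}_{\alpha/2} \cup \hat{E}_{\alpha/2}$.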


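Now we claim that
(1) $m_*(\tilde{E}_\alpha) \le \frac{3^d}{\alpha} \|f - g\|_{L^1}$,
(2) $m(\hat{E}_\alpha) \le \frac{1}{\alpha} \|f - g\|_{L^1}$.
This will finish the proof, because we have $\|f - g\|_{L^1} \to 0$, so each $E_\alpha$
has measure zero, and then we can take the countable union over $\alpha = 1/n$.
The claim (2) is Tchebychev’s inequality. In general, if $F \ge 0$ is integrable,
then
\[ m(\{F > \alpha\}) \le \frac{1}{\alpha} \int_{\mathbb{R}^d} F \]
because $F \ge \alpha \chi_{\{F > \alpha\}}$. (2) is simply this applied to $F = |f - g|$.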
The claim (1) is more difficult. Define the Hardy–Littlewood maximal
function of an integrable function $F$ as
\[ F^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |F|. \]
We are then claiming the weak-type inequality
\[ m(\{F^* > \alpha\}) \le \frac{3^d}{\alpha} \|F\|_{L^1}. \]
This is done by Vitali’s covering technique.
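Proposition 6.2. Given a finite collection $\mathcal{B}$ of open balls, there exists a
subcollection $\mathcal{B}' \subseteq \mathcal{B}$ of disjoint open balls such that
\[ m\Bigl( \bigcup_{B \in \mathcal{B}} B \Bigr) \le 3^d \sum_{B \in \mathcal{B}'} m(B). \]
Proof. Choose $B_0 \in \mathcal{B}$ with maximal radius. Consider the ball $3B_0$ with the
same center and three times the radius. Among the balls in $\mathcal{B}$ that are not
completely covered by $3B_0$, take a ball $B_1 \in \mathcal{B}$ of maximal radius. Note that
$B_0$ and $B_1$ cannot intersect by maximality. Among the balls in $\mathcal{B}$ that are not
completely covered by $3B_0 \cup 3B_1$, take a ball $B_2 \in \mathcal{B}$ of maximal radius.
Repeat this until we run out of balls, and let $\mathcal{B}' = \{B_0, B_1, \ldots\}$. Every
ball in $\mathcal{B}$ is then covered by $\bigcup_i 3B_i$, whose measure is at most
$3^d \sum_i m(B_i)$.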
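The selection can be tried out on intervals in $\mathbb{R}$ (so $d = 1$). The sketch
below uses the common variant of the greedy rule, which picks the largest remaining
ball that is disjoint from those already chosen, rather than the "not covered by
$3B_0 \cup 3B_1 \cup \cdots$" rule of the proof. The function names and the sample
intervals are made up for illustration.

```python
def vitali_select(balls):
    """balls: list of (center, radius) open intervals. Greedy disjoint subcollection."""
    chosen = []
    for c, r in sorted(balls, key=lambda ball: ball[1], reverse=True):
        # keep the ball only if it misses every ball chosen so far
        if all(abs(c - c2) >= r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen

def union_length(balls):
    """Lebesgue measure of a finite union of open intervals."""
    total, right = 0.0, float("-inf")
    for a, b in sorted((c - r, c + r) for c, r in balls):
        if b > right:
            total += b - max(a, right)
            right = b
    return total

balls = [(0.0, 1.0), (0.5, 0.4), (1.8, 0.9), (3.0, 0.5), (3.2, 0.2), (5.0, 0.3)]
chosen = vitali_select(balls)
covered = union_length(balls)
bound = 3 * sum(2 * r for _, r in chosen)   # 3^d * (total length of chosen balls), d = 1
```

Each discarded ball meets a chosen ball of at least its own radius, so it sits
inside the tripled ball, which is why `covered` cannot exceed `bound`.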
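Theorem 6.3 (Weak-type inequality). If $F$ is integrable, then
\[ m(\{F^* > \alpha\}) \le \frac{3^d}{\alpha} \|F\|_{L^1}. \]
Proof. We first note that $F^*$ is measurable because $\{F^* > \alpha\}$ is in fact open:
wiggling $x$ around inside a ball $B$ doesn’t change the average over $B$.
Consider any compact subset $K \subseteq E_\alpha = \{F^* > \alpha\}$. Each point of $K$ lies
in a ball $B$ with $\frac{1}{m(B)} \int_B |F| > \alpha$, so there is a finite covering
$\mathcal{B}$ of $K$ by such balls, and then
\[
m(K) \le m\Bigl( \bigcup_{B \in \mathcal{B}} B \Bigr)
\le 3^d \sum_{B \in \mathcal{B}'} m(B)
\le \frac{3^d}{\alpha} \sum_{B \in \mathcal{B}'} \int_B |F|
\le \frac{3^d}{\alpha} \|F\|_1.
\]
Taking the supremum over compact $K$ finishes the proof.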


So we have shown that
\[ \lim_{m(B) \to 0} \frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy = 0 \tag{$*$} \]
for almost every $x$. This is actually stronger than the first fundamental
theorem of calculus
\[ \lim_{m(B) \to 0} \frac{1}{m(B)} \int_B f(y)\,dy = f(x). \]
Definition 6.4. The set on which ($*$) holds is called the Lebesgue set of $f$.
This has an interesting consequence. If $f = g$ almost everywhere, we regard
them as more or less the same. The first fundamental theorem of calculus allows
us to pick a representative of the equivalence class. Of course, the limit may
not exist at some points, but we can then let the representative take the value 0
there.

6.2 Convergence of Fourier series


This has something to do with the Lebesgue set. Start out with an integrable
function $f$ on $\mathbb{R}$ with period $2\pi$. We define
\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx, \]
and consider the sum
\begin{align*}
s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) \frac{\sin(n + \frac12)y}{\sin \frac12 y}\,dy \\
&= \frac{1}{2\pi} \int_{0}^{\pi} (f(x - y) + f(x + y)) \frac{\sin(n + \frac12)y}{\sin \frac12 y}\,dy.
\end{align*}
(We are doing this last step so that there is a better chance of convergence.) So
\[ s_n(x) - L = \frac{1}{2\pi} \int_{0}^{\pi} (f(x - y) + f(x + y) - 2L) \frac{\sin(n + \frac12)y}{\sin \frac12 y}\,dy \]
and people want to see if this goes to 0 as $n \to \infty$.


Even if $f$ is continuous, it is not true that $s_n \to f$. The Riemann–Lebesgue
lemma tells us that if $F$ is integrable and $a < b$ are finite, then
\[ \int_a^b F(x) \sin \lambda x\,dx \to 0 \]
as $\lambda \to \infty$. The way you do this is to approximate $F$ by a smooth function $G$
and estimate the difference. So people tried to group
\[ \frac{f(x - y) + f(x + y) - 2L}{\sin \frac12 y} \]
together. If $f'(x)$ exists, then we can indeed let $L = f(x)$ and get an integrable
function.
But this doesn’t really work very nicely in general. Cesàro decided to look
at the Lebesgue set.
Proposition 6.5. If $x$ is in the Lebesgue set, then
\[ \sigma_n = \frac{s_0 + s_1 + \cdots + s_{n-1}}{n} \to f(x) \]
as $n \to \infty$.

7 September 21, 2017


Let $f$ be an integrable function on $\mathbb{R}$ with period $2\pi$, and define
\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx, \qquad s_n = \sum_{k=-n}^{n} c_k e^{ikx}. \]
The first thing we want to establish is Dini’s test, which is that the Fourier
series for $f$ converges at $x$ to $L$ if
\[ \frac{f(x + y) + f(x - y) - 2L}{y} \]
is integrable as a function of $y$ near $y = 0$.
There is also Cesàro convergence. If $x$ is in the Lebesgue set, then we can show
that
\[ \sigma_n = \frac{s_0 + s_1 + \cdots + s_{n-1}}{n} \to f(x). \]
Why is this interesting? For a series with terms $a_k$ and partial sums
$s_k = a_0 + \cdots + a_k$, we can interpret
\[ \sigma_n = \frac{1}{n} \sum_{k=0}^{n-1} s_k = \sum_{k=0}^{n-1} \frac{n - k}{n} a_k \]
as a sum with some weight. We can also look at other weights, for instance,
Abel’s
\[ A_r = \sum_{n=-\infty}^{\infty} r^{|n|} a_n. \]
Ordinary convergence implies Cesàro convergence, and this implies Abel
convergence. Can you go back? Results of this kind are called Tauberian theorems,
and this is an interesting field.
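The three notions of convergence can be compared on a toy series. The sketch below
uses Grandi's series $1 - 1 + 1 - \cdots$, an example chosen here and not taken from
the notes: its partial sums oscillate, while the Cesàro and Abel means both settle
near 1/2. The helper names are hypothetical.

```python
def partial_sums(terms):
    """s_k = a_0 + ... + a_k for each k."""
    total, sums = 0.0, []
    for a in terms:
        total += a
        sums.append(total)
    return sums

def cesaro_mean(terms):
    """sigma_n = (s_0 + ... + s_{n-1}) / n with n = len(terms)."""
    s = partial_sums(terms)
    return sum(s) / len(s)

def abel_mean(terms, r):
    """Truncated Abel sum: sum of r^k a_k over the given terms."""
    return sum(a * r ** k for k, a in enumerate(terms))

grandi = [(-1) ** k for k in range(20001)]
last_two = partial_sums(grandi)[-2:]     # partial sums keep jumping between 0 and 1
sigma = cesaro_mean(grandi)              # close to 1/2
abel = abel_mean(grandi, 0.999)          # close to 1 / (1 + r)
```

This shows the forward implications only. Going back from Cesàro or Abel
convergence to ordinary convergence needs extra hypotheses.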

7.1 Dirichlet test for Fourier series


Firstly, we have the Dirichlet kernel
\[ D_n(x) = \sum_{k=-n}^{n} e^{ikx} = \frac{\sin(n + \frac12)x}{\sin \frac12 x}, \]
and then
\[
s_n - L = \frac{1}{2\pi} \int_{y=-\pi}^{\pi} (f(x - y) - L) D_n(y)\,dy
= \frac{1}{2\pi} \int_{y=0}^{\pi} (f(x - y) + f(x + y) - 2L) D_n(y)\,dy.
\]
Does this go to 0 as $n \to \infty$?
The first thing done was to look at the function
\[ \frac{f(x - y) + f(x + y) - 2f(x)}{y}. \]
If this is integrable, then $s_n(x) - f(x)$ actually converges to 0.

Lemma 7.1 (Riemann–Lebesgue lemma). If $f$ is integrable on $[a, b]$, then
\[ \int_a^b f(x) \sin \lambda x\,dx \to 0 \]
as $\lambda \to \infty$.
Proof. You approximate $f$ by a smooth function $g$ in $L^1[a, b]$. We first check
\[
\int_a^b g(x) \sin \lambda x\,dx
= -\frac{\cos \lambda x}{\lambda} g(x) \Big|_a^b + \int_a^b g'(x) \frac{\cos \lambda x}{\lambda}\,dx,
\]
which goes to 0 as $\lambda \to \infty$. Now we have the estimate
\[
\int_a^b f(x) \sin \lambda x\,dx
= \int_a^b (f(x) - g(x)) \sin \lambda x\,dx + \int_a^b g(x) \sin \lambda x\,dx,
\]
and we can use a $3\epsilon$ argument to get the estimate.

Corollary 7.2 (Dirichlet test). If $(f(x - y) + f(x + y) - 2f(x))/y$ is integrable
as a function of $y$, then $s_n(x) \to f(x)$.

7.2 Approximation to identity


Cesàro only defined the notion of convergence. The person who actually proved
the convergence is Fejér. We have
\[
F_n(x) = \frac{1}{2\pi} \cdot \frac{1}{n} \sum_{k=0}^{n-1} D_k(x)
= \frac{1}{2\pi n} \frac{1}{\sin \frac12 x} \sum_{k=0}^{n-1} \sin(k + \tfrac12)x
= \frac{1}{2\pi n} \frac{\sin^2 \frac12 nx}{\sin^2 \frac12 x}.
\]
This is the Fejér kernel. Then
\[
\sigma_n(x) - f(x) = \frac{1}{2\pi n} \int_{y=0}^{\pi}
\frac{\sin^2 \frac12 ny}{\sin^2 \frac12 y} (f(x + y) + f(x - y) - 2f(x))\,dy.
\]
If $f$ is continuous at $x$, for instance, you can show that this goes to 0 as
$n \to \infty$, because we have the factor $1/n$.
What is really the difference between $D_n$ and $F_n$? The intuitive picture is
that $F_n$ is positive and approximates the Dirac delta as $n \to \infty$. On the other
hand, $D_n$ has some oscillation that does not disappear in magnitude as $n \to \infty$.
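The contrast between the two kernels can be seen numerically. The sketch below
(function names, grid size, and the sample values of $n$ are choices made here)
integrates over $[-\pi, \pi]$ with the midpoint rule: the Fejér kernel, in the
normalization with the factor $1/2\pi$, has total mass 1, while the $L^1$ norm of
$D_n / 2\pi$ keeps growing with $n$.

```python
import math

def dirichlet(n, x):
    """Dirichlet kernel D_n(x) = sin((n + 1/2) x) / sin(x / 2), for x != 0."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

def fejer(n, x):
    """Fejer kernel sin^2(n x / 2) / (2 pi n sin^2(x / 2)), for x != 0."""
    return math.sin(n * x / 2) ** 2 / (2 * math.pi * n * math.sin(x / 2) ** 2)

def integrate(g, points=4000):
    """Midpoint rule on [-pi, pi]; with an even number of points x = 0 is never sampled."""
    h = 2 * math.pi / points
    return h * sum(g(-math.pi + (j + 0.5) * h) for j in range(points))

def dirichlet_l1(n):
    """(1 / 2 pi) * integral of |D_n| over one period."""
    return integrate(lambda x: abs(dirichlet(n, x))) / (2 * math.pi)

def fejer_mass(n):
    """Integral of the (nonnegative) Fejer kernel over one period."""
    return integrate(lambda x: fejer(n, x))

growth = [dirichlet_l1(n) for n in (1, 10, 100)]   # increasing, roughly like log n
mass = fejer_mass(50)                              # stays equal to 1
```

The growth of `dirichlet_l1` is the reason $D_n$ fails the uniform integrability
condition of a good kernel, while $F_n$ satisfies it trivially because it is
nonnegative.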
Let us make this more precise. A good kernel is a family of integrable
functions $K_\delta$ for $\delta > 0$ satisfying
(1) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 1$ (unit integral),
(2) $\int_{\mathbb{R}^d} |K_\delta(x)|\,dx \le A$ independent of $\delta$ (uniform integrability),
(3) for every $\eta > 0$, $\int_{|x| > \eta} |K_\delta(x)|\,dx \to 0$ as $\delta \to 0$ (small $L^1$ norm
away from the origin).
This gives convergence almost everywhere along a subsequence, but we don’t know
if things converge on Lebesgue sets.
So we strengthen the condition, and say that $K_\delta$ is an approximation to
identity if it satisfies (1) and the following two strengthenings:
(2s) $|K_\delta(x)| \le \dfrac{A}{\delta^d}$ for all $\delta > 0$ (parameter pole order estimate),
(3s) $|K_\delta(x)| \le \dfrac{A\delta}{|x|^{d+1}}$ for all $\delta > 0$ (coordinate pole order estimate).
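Proposition 7.3. If $K_\delta$ is a good kernel, and $f$ is an integrable function, then
there is a sequence $\delta_\nu \to 0$ such that $f * K_{\delta_\nu} \to f$ almost everywhere.
Proof. Write $f_y(x) = f(x - y)$. Note that
\[ (f * K_\delta)(x) - f(x) = \int_{\mathbb{R}^d} (f_y(x) - f(x)) K_\delta(y)\,dy. \]
So if we take the $L^1$-norm,
\begin{align*}
\|f * K_\delta - f\|_{L^1}
&\le \int_{\mathbb{R}^d} \|f_y - f\|_{L^1} |K_\delta(y)|\,dy \\
&= \int_{|y| \le \eta} \|f_y - f\|_{L^1} |K_\delta(y)|\,dy + \int_{|y| > \eta} \|f_y - f\|_{L^1} |K_\delta(y)|\,dy \\
&\le A \sup_{|y| \le \eta} \|f_y - f\|_{L^1} + 2\|f\|_{L^1} \int_{|y| > \eta} |K_\delta(y)|\,dy.
\end{align*}
Then we can use approximation of $f$ by smooth functions to show
$\lim_{\eta \to 0} \sup_{|y| \le \eta} \|f_y - f\|_{L^1} = 0$.
So, choosing $\eta$ small first and then letting $\delta \to 0$, we get
$\lim_{\delta \to 0} \|f * K_\delta - f\|_{L^1} = 0$. You have shown in your homework that
then there is a subsequence $\delta_\nu$ such that $f * K_{\delta_\nu} \to f$ almost everywhere.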
Let us now look at the approximation to identity case. Consider an
approximation to identity $K_\delta(x)$. If $x$ is in the Lebesgue set, we have
\[ A_x(r) = \frac{1}{r^d} \int_{|y| < r} |f(x - y) - f(x)|\,dy \to 0 \]
as $r \to 0$. Also, $A_x(r)$ is continuous on $r > 0$ and uniformly bounded, since for
$r > 1$ it is at most $\|f\|_{L^1} + v_d |f(x)|$, where $v_d$ is the volume of the unit ball.
Proposition 7.4. If $K_\delta$ is an approximation to identity, $f$ is integrable and
$x$ is a Lebesgue point, then $(f * K_\delta)(x) \to f(x)$.

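Proof. We have
\begin{align*}
|(f * K_\delta)(x) - f(x)|
&\le \int_{\mathbb{R}^d} |f(x - y) - f(x)| |K_\delta(y)|\,dy \\
&= \int_{|y| \le \delta} |f(x - y) - f(x)| |K_\delta(y)|\,dy
 + \sum_{k=0}^{\infty} \int_{2^k \delta < |y| \le 2^{k+1} \delta} |f(x - y) - f(x)| |K_\delta(y)|\,dy \\
&\le c A(\delta) + \sum_{k=0}^{\infty} \frac{c\delta}{(2^k \delta)^{d+1}} (2^{k+1} \delta)^d A(2^{k+1} \delta) \\
&= c A(\delta) + c' \sum_{k=0}^{\infty} 2^{-k} A(2^{k+1} \delta),
\end{align*}
where $A = A_x$ and we used (2s) on the first piece and (3s) on the others.
From the properties we had for $A(r)$, it follows that this goes to 0 as $\delta \to 0$:
the tail of the sum is small because $A$ is bounded, and each of the remaining
finitely many terms goes to 0.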

7.3 Fourier transform


If $f$ is integrable on $\mathbb{R}^d$, we can define the Fourier transform
\[ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx. \]
The inverse Fourier transform formula is
\[ f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi) e^{2\pi i x \cdot \xi}\,d\xi. \]
You can ask when the inverse Fourier transform is actually the inverse of the
Fourier transform. It turns out this is easier to deal with than the Fourier
series, because $f$ and $\hat{f}$ both are defined on $\mathbb{R}^d$.
The idea is that if $f$ and $\hat{f}$ are in $L^1$, then the Fourier transform is formally
symmetric. That is, if $f, g \in L^1$, then $\int f \hat{g} = \int \hat{f} g$. The verification
is Fubini’s theorem and rescaling the Gauss distribution to use it as a good kernel.

8 September 26, 2017


The difference between the Dirichlet kernel and the Fejér kernel is that the
Dirichlet kernel has oscillation with constant amplitude, while the Fejér kernel
has oscillation going to zero. So for convergence with the Dirichlet kernel, you
need some nice properties for $f$. The Fejér kernel is nice, so the Cesàro means of
the Fourier series converge to $f$ on Lebesgue sets. So $f$ can be approximated by
some very nice functions. Moreover, $s_n = f * D_n$ so $f$ can be “smoothed out” by
convolving.

8.1 Fejér–Lebesgue theorem


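Theorem 8.1 (Fejér–Lebesgue theorem). If $f \in L^1([0, 2\pi])$, and $x$ is in its
Lebesgue set, then $\sigma_n(x) \to f(x)$ as $n \to \infty$.
Proof. We have
\[ \sigma_n(x) - f(x) = \int_{y=0}^{\pi} F_n(y) (f(x + y) + f(x - y) - 2f(x))\,dy, \]
where $F_n$ is the Fejér kernel. Here, we are going to take care of $0 < y < \eta$ with
the Lebesgue point condition and handle $\eta < y < \pi$ with the oscillation. Let us
write $\varphi(y) = \frac12 (f(x + y) + f(x - y) - 2f(x))$ and
\[ \Phi(t) = \int_{y=0}^{t} |\varphi(y)|\,dy. \]
Then $\Phi$ is bounded on $[0, \pi]$ (by $\|f\|_{L^1} + \pi |f(x)|$) and the Lebesgue point
property gives $\Phi(t)/t \to 0$ as $t \to 0$.
Given $\epsilon > 0$, choose $0 < \eta < \pi$ such that $0 \le \Phi(t) < \epsilon t$ for all $t \le \eta$.
Since $\sin \frac12 y \ge y/\pi$ on $[0, \pi]$, the kernel $F_n(y)$ is at most a constant times
$\frac{1}{2\pi n} \sin^2(\frac12 ny) / y^2$, so it is enough to show that
\[ \frac{1}{2\pi n} \int_{y=0}^{\pi} \frac{\sin^2 \frac12 ny}{y^2} |\varphi(y)|\,dy \to 0 \]
as $n \to \infty$. On $0 < y < 1/n$ we have
\[
\frac{1}{2\pi n} \int_0^{1/n} \frac{\sin^2 \frac12 ny}{y^2} |\varphi(y)|\,dy
\le \frac{1}{2\pi n} n^2 \int_0^{1/n} |\varphi(y)|\,dy
= \frac{n}{2\pi} \Phi\Bigl( \frac{1}{n} \Bigr) \to 0,
\]
because $\sin^2 \theta \le \theta^2$.
For the other interval, we note that integration by parts gives
\[
\int_{y=a}^{b} \frac{\sin^2 \frac12 ny}{y^2} |\varphi(y)|\,dy
\le \int_a^b \frac{1}{y^2} |\varphi(y)|\,dy
= \frac{\Phi(b)}{b^2} - \frac{\Phi(a)}{a^2} + 2 \int_{y=a}^{b} \frac{\Phi(y)}{y^3}\,dy.
\]
So for $1/n \le \eta$ we have
\begin{align*}
\frac{1}{2\pi n} \int_{1/n}^{\eta} \frac{\sin^2 \frac12 ny}{y^2} |\varphi(y)|\,dy
&\le \frac{1}{2\pi n} \frac{\Phi(\eta)}{\eta^2} - \frac{n}{2\pi} \Phi\Bigl( \frac{1}{n} \Bigr)
 + \frac{1}{\pi n} \int_{1/n}^{\eta} \frac{\Phi(y)}{y^3}\,dy \\
&\le \frac{\epsilon}{2\pi} + \frac{\epsilon}{\pi n} \int_{y=1/n}^{\eta} \frac{dy}{y^2}
\le \frac{3\epsilon}{2\pi}.
\end{align*}
Finally, we have that
\[
\frac{1}{2\pi n} \int_{y=\eta}^{\pi} \frac{\sin^2 \frac12 ny}{y^2} |\varphi(y)|\,dy
\le \frac{1}{2\pi n \eta^2} \Phi(\pi)
\]
goes to zero, because the amplitude of the oscillation goes to zero.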

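Consider the function given by
\[ \mathrm{sf}(x) = \frac{1}{2i} \sum_{n \ne 0} \frac{e^{inx}}{n}. \]
This is the Fourier series of the sawtooth function, which looks something like
\[
\mathrm{sf}(x) = \begin{cases}
-\frac{x + \pi}{2} & -\pi < x < 0, \\
-\frac{x - \pi}{2} & 0 < x < \pi.
\end{cases}
\]
Let $P_N(\theta)$ be the principal sum
\[ P_N(\theta) = \frac{1}{2i} \sum_{0 < |n| \le N} \frac{e^{in\theta}}{n} = \sum_{n=1}^{N} \frac{\sin n\theta}{n}. \]
This is actually uniformly bounded for all $N \in \mathbb{N}$ and all $\theta \in \mathbb{R}$. On the
other hand, the non-principal sum
\[ Q_N(\theta) = \sum_{n=-N}^{-1} \frac{e^{in\theta}}{n} \]
cannot converge because $|Q_N(0)| = \sum_{n=1}^{N} \frac{1}{n} \ge \log N$. To show that
$P_N(\theta)$ is uniformly bounded, you use Abel summation.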
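Both claims can be checked numerically. In the sketch below (the sampling grid and
the values of $N$ are arbitrary choices made here), the sampled maximum of
$|P_N(\theta)|$ stays below 2 for every $N$ tried, while the harmonic sum
$|Q_N(0)|$ exceeds $\log N$.

```python
import math

def principal_sum(N, theta):
    """P_N(theta) = sum_{n=1}^{N} sin(n theta) / n."""
    return sum(math.sin(n * theta) / n for n in range(1, N + 1))

def harmonic(N):
    """|Q_N(0)| = sum_{n=1}^{N} 1 / n."""
    return sum(1.0 / n for n in range(1, N + 1))

grid = [2 * math.pi * j / 1000 for j in range(1000)]
sup_P = {N: max(abs(principal_sum(N, t)) for t in grid) for N in (10, 100, 400)}

# At a fixed point away from 0 the partial sums approach the sawtooth value (pi - theta) / 2.
value_at_one = principal_sum(400, 1.0)
```

A sampled maximum only bounds the true supremum from below, so this is a sanity
check of the statement rather than a proof of it.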
Choose $\alpha_k > 0$ and $N_k \in \mathbb{N}$ so that
(1) $\sum_{k=1}^{\infty} \alpha_k < \infty$,
(2) $N_{k+1} > 3N_k$,
(3) $\alpha_k \log N_k \to \infty$ as $k \to \infty$,
e.g., $\alpha_k = k^{-2}$, $N_k = 2^{k^3}$, and then construct the function
\[ f(\theta) = \sum_{k=1}^{\infty} \alpha_k e^{i 2N_k \theta} P_{N_k}(\theta). \]
By (1) and the uniform bound on $P_N$, the series converges uniformly, so $f$ is
continuous. The condition (2) ensures that the coefficients do not interfere. Then
if we cut the sum at $2N_k$, then we get one large non-principal sum and some small
principal sums that are uniformly bounded. This means that the Fourier series
cannot converge.

8.2 Fourier inversion formula


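For $f \in L^1(\mathbb{R})$, consider the function
\[ \hat{f}(\xi) = \int_{x \in \mathbb{R}} f(x) e^{-2\pi i \xi x}\,dx. \]
We further assume that $\hat{f} \in L^1(\mathbb{R})$, and we want to prove that
\[ f(x) = \int_{\xi \in \mathbb{R}} \hat{f}(\xi) e^{2\pi i \xi x}\,d\xi. \]
Here we are going to approximate the identity by Gauss distributions. These
are given by
\[ K_\delta(y) = \frac{1}{\delta^{d/2}} e^{-\pi |y|^2 / \delta}. \]
The mean is 0 and the variance is $\delta / 2\pi$. It is a good kernel with parameter
$t = \sqrt{\delta}$.
The key argument is that if $f, g, \hat{f}, \hat{g} \in L^1(\mathbb{R}^d)$, then
\[ \int_{y \in \mathbb{R}^d} f(y) \hat{g}(y)\,dy = \int_{\xi \in \mathbb{R}^d} \hat{f}(\xi) g(\xi)\,d\xi, \]
i.e., $f \mapsto \hat{f}$ is formally symmetric with respect to the pairing. This is just
Fubini, because we can replace $\hat{g}$ and $\hat{f}$ with integrals.
We first note that $e^{-\pi x^2}$ is its own Fourier transform. This can be verified
directly. Now take
\[ g(\xi) = e^{-\pi \delta |\xi|^2} e^{2\pi i x \cdot \xi}, \qquad
   \hat{g}(y) = \frac{1}{\delta^{d/2}} e^{-\pi |x - y|^2 / \delta}. \]
Then the left hand side of the pairing is $(f * K_\delta)(x)$, which goes to $f(x)$ as
$\delta \to 0$ for $x$ a Lebesgue point, and on the right hand side we immediately get by
dominated convergence
\[ f(x) = \lim_{\delta \to 0} \int_{\xi \in \mathbb{R}^d} \hat{f}(\xi) g(\xi)\,d\xi
        = \int_{\xi \in \mathbb{R}^d} \hat{f}(\xi) e^{2\pi i x \cdot \xi}\,d\xi \]
for $x$ a Lebesgue point.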
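The claim that $e^{-\pi x^2}$ is its own Fourier transform is easy to test
numerically in one dimension. The sketch below approximates the transform by a
midpoint rule on a truncated interval. The truncation width, the number of points,
and the function names are assumptions made for this illustration.

```python
import math

def gaussian(x):
    """The function e^{-pi x^2}."""
    return math.exp(-math.pi * x * x)

def fourier_transform(f, xi, half_width=8.0, points=16000):
    """Midpoint-rule approximation of int f(x) cos(2 pi x xi) dx over [-half_width, half_width].
    For an even real f the sine part vanishes, so this is the full transform."""
    h = 2 * half_width / points
    total = 0.0
    for j in range(points):
        x = -half_width + (j + 0.5) * h
        total += f(x) * math.cos(2 * math.pi * x * xi)
    return h * total

# Compare the numerical transform with the Gaussian itself at a few frequencies.
frequencies = (0.0, 0.5, 1.0, 1.5)
errors = [abs(fourier_transform(gaussian, xi) - gaussian(xi)) for xi in frequencies]
```

Because the Gaussian decays so quickly, both the truncation and the quadrature
error are far below the tolerance one would normally care about here.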

8.3 Fubini’s theorem


We haven’t proved this, so I should talk about it. Let $f(x, y)$ be a function on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ and $f^y(x) = f(x, y)$.
Theorem 8.2 (Fubini’s theorem). Suppose $f$ is integrable on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$.
Then $f^y$ is integrable on $\mathbb{R}^{d_1}$ for almost all $y \in \mathbb{R}^{d_2}$, the function
$y \mapsto \int_{\mathbb{R}^{d_1}} f^y(x)\,dx$ is integrable on $\mathbb{R}^{d_2}$, and
\[ \int_{\mathbb{R}^{d_2}} \Bigl( \int_{\mathbb{R}^{d_1}} f\,dx \Bigr) dy = \int_{\mathbb{R}^{d_1 + d_2}} f. \]
Proof. Check this for a linear combination of characteristic functions of cubes.
Then use the monotone convergence theorem to check that this is correct for simple
functions.

9 September 28, 2017


Since we have everything, we can look at
\[ \frac{d}{dy} \int_{x=a}^{b} f(x, y)\,dx = \int_{x=a}^{b} \frac{\partial f}{\partial y}(x, y)\,dx. \]
This holds when $\partial f / \partial y$ is in $L^1([a, b] \times [c, d])$. As I have said, we can
use Fubini and the fundamental theorem of calculus.

9.1 Mean value theorem


There are two ways of looking at this. The first is to look through derivatives.
The statement is that
\[ F(b) - F(a) = F'(\xi)(b - a) \]
for some $\xi$. The integration formulation is
\[ \int_a^b f(x)\,dx = f(\xi)(b - a) = f(\xi) \int_a^b dx. \]
The interpretation is that the left hand side is an average of $f$ against the
constant weight 1, and $f$ can be replaced by its value at some point $\xi$.
You can also look at the Stieltjes version
\[ \int_a^b f(x) \varphi(x)\,dx = f(\xi) \int_a^b \varphi(x)\,dx, \]
where the weight $\varphi \ge 0$ is no longer constant.
But what if you want to have the weight concentrated at the end? The
second mean value theorem is: if $\varphi$ is monotone and $f$ is an integrable function on
$(a, b)$, there exists a $\xi$ such that
\[ \int_a^b f(x) \varphi(x)\,dx = \varphi(a+) \int_a^{\xi} f(x)\,dx + \varphi(b-) \int_{\xi}^{b} f(x)\,dx. \]

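Proof. Suppose $\varphi$ is increasing. We may assume that $\varphi(a+) = 0$. Now let
$m = \min_\xi \int_\xi^b f(x)\,dx$ and $M = \max_\xi \int_\xi^b f(x)\,dx$. Then by Abel summation,
we have the inequality
\[ \varphi(b-)\, m \le \int_a^b f(x) \varphi(x)\,dx \le \varphi(b-)\, M. \]
Since $\xi \mapsto \int_\xi^b f(x)\,dx$ is continuous, there exists a $\xi$ such that
$\varphi(b-) \int_\xi^b f(x)\,dx = \int_a^b f(x) \varphi(x)\,dx$.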
When we were talking about the fundamental theorem of calculus in 1 dimension,
the platform for the functions was integrable and absolutely continuous functions.
In higher dimensions, the platform for the first part is integrable functions. Then
we can talk about Lebesgue points and so forth. What about the second part? If
you think about $\int_a^b F' = F(b) - F(a)$, these are two different ways of giving an
identical “measure” to an interval. So the higher dimensional analogue will also
be equating two measures.
This is how abstract measure theory works. People look at abstract measure
spaces and also define signed measures. For two measures $\mu$ and $\nu$, we say that
$\mu$ is absolutely continuous with respect to $\nu$ if $\nu(E) = 0$ implies $\mu(E) = 0$.
Then there is a Radon–Nikodym derivative $d\mu / d\nu$, which is a measurable function,
such that
\[ \mu(E) = \int_E \frac{d\mu}{d\nu}\,d\nu. \]

9.2 Hölder and Minkowski inequalities


We now want to solve linear differential equations with compatibility conditions.
The theme is a generalized Cramer’s rule applied to infinite dimensional spaces.
We want to solve $Ax = b$ subject to $Sb = 0$. In the finite dimensional case, we
get that the minimal solution is
\[ x_{\min} = A^* (AA^* + S^* S)^{-1} b. \]
To get the inverse, we need that the map is invertible, i.e., has nonzero
eigenvalues. We have
\[ \langle (AA^* + S^* S) y, y \rangle = \|A^* y\|^2 + \|Sy\|^2, \]
and so it is automatically invertible as long as $A^* y$ and $Sy$ do not vanish
simultaneously for some $y \ne 0$. But for this to be true for function spaces,
we need an estimate. And to formulate this, we need to set up the notion of
Hilbert spaces.
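In finite dimensions the formula can be checked by hand. The sketch below uses a
hypothetical 2×2 example (the matrices `A`, `S` and the vector `b` are made up for
illustration, with $SA = 0$ and $Sb = 0$) and a hand-written 2×2 inverse, so it is
only meant for this tiny case.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse_2x2(M):
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

A = [[1.0, 1.0], [1.0, 1.0]]    # rank one, image spanned by (1, 1)
S = [[1.0, -1.0]]               # kills (1, 1), so S A = 0
b = [[2.0], [2.0]]              # compatible right hand side: S b = 0

At, St = transpose(A), transpose(S)
G = add(matmul(A, At), matmul(St, S))            # A A* + S* S = [[3, 1], [1, 3]]
x_min = matmul(At, matmul(inverse_2x2(G), b))    # A* (A A* + S* S)^{-1} b
```

Here the solutions of $Ax = b$ are the points with $x_1 + x_2 = 2$, and the formula
picks out $(1, 1)$, the one of smallest norm.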
In an inner product space, we have the parallelogram law
\[ \|u - v\|^2 + \|u + v\|^2 = 2\|u\|^2 + 2\|v\|^2. \]

Definition 9.1. Let $E$ be a measurable set. For $p \ge 1$, the space $L^p(E)$ is the
space of measurable functions $f$ such that $\int_E |f|^p < \infty$. The $L^p$-norm is defined
as
\[ \|f\|_{L^p} = \|f\|_p = \Bigl( \int_E |f|^p \Bigr)^{1/p}. \]
The first thing you need to worry about is the triangle inequality. This is
Minkowski’s inequality, and it comes from concavity. The most basic inequality
is $\sqrt{ab} \le (a + b)/2$, which can be shown easily, and its general case is
$a^\alpha b^\beta \le \alpha a + \beta b$ for $\alpha + \beta = 1$ and $a, b, \alpha, \beta > 0$. The condition
$\alpha + \beta = 1$ ensures homogeneity and so we may set $b = 1$ and show
$a^\alpha \le \alpha(a - 1) + 1$. The left hand side is a concave function in $a$, and the
right hand side is its tangent line at $a = 1$.
From this, we can deduce other inequalities. Suppose we have positive numbers
$a_1, \ldots, a_n$ and $b_1, \ldots, b_n$, and by normalization assume $\sum a_j = \sum b_j = 1$.
Using $a_j^\alpha b_j^\beta \le \alpha a_j + \beta b_j$, we get
\[ \sum_{j=1}^{n} a_j^\alpha b_j^\beta \le 1 = \Bigl( \sum_{j=1}^{n} a_j \Bigr)^{\alpha} \Bigl( \sum_{j=1}^{n} b_j \Bigr)^{\beta}. \]
If we rescale $a_j$ and $b_j$, and let $p = 1/\alpha$, $q = 1/\beta$, then we get the standard
Hölder inequality
\[ \sum_{j=1}^{n} a_j b_j \le \Bigl( \sum_{j=1}^{n} a_j^p \Bigr)^{1/p} \Bigl( \sum_{j=1}^{n} b_j^q \Bigr)^{1/q}. \]
We can do this with functions, and then we will get
\[ \|fg\|_{L^1} \le \|f\|_{L^p} \|g\|_{L^q}. \]
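Both inequalities are easy to spot-check on random data. The sketch below is only
a numerical sanity check, not a proof. The helper names, the sampling ranges, and
the fixed seed are choices made here.

```python
import random

def young_gap(a, b, alpha):
    """alpha*a + beta*b - a^alpha * b^beta with beta = 1 - alpha; should be >= 0."""
    beta = 1.0 - alpha
    return alpha * a + beta * b - a ** alpha * b ** beta

def holder_gap(a, b, p):
    """(sum a_j^p)^(1/p) * (sum b_j^q)^(1/q) - sum a_j b_j with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = sum(x ** p for x in a) ** (1 / p) * sum(y ** q for y in b) ** (1 / q)
    return rhs - lhs

random.seed(0)
young_gaps = [young_gap(random.uniform(0.01, 10), random.uniform(0.01, 10),
                        random.uniform(0.01, 0.99)) for _ in range(1000)]
holder_gaps = []
for _ in range(1000):
    a = [random.uniform(0, 5) for _ in range(6)]
    b = [random.uniform(0, 5) for _ in range(6)]
    holder_gaps.append(holder_gap(a, b, random.uniform(1.1, 5)))
```

The gap in the first inequality closes exactly when $a = b$, which matches the
tangent line picture at $a = 1$ after rescaling.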

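Minkowski’s inequality is Hölder’s inequality with careful handling of the
exponents. We have
\begin{align*}
\int_E |f + g|^p = \int_E |f + g| |f + g|^{p-1}
&\le \int_E |f| |f + g|^{p-1} + \int_E |g| |f + g|^{p-1} \\
&\le \Bigl( \int_E |f|^p \Bigr)^{1/p} \Bigl( \int_E |f + g|^p \Bigr)^{1 - 1/p}
 + \Bigl( \int_E |g|^p \Bigr)^{1/p} \Bigl( \int_E |f + g|^p \Bigr)^{1 - 1/p}.
\end{align*}
Then we can divide by $(\int_E |f + g|^p)^{1 - 1/p}$ to get the Minkowski inequality
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]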

9.3 Hilbert space


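The prototype for our differential equation is
\[
\begin{cases}
\dfrac{\partial u}{\partial x}(x, y) = P(x, y) \\[1ex]
\dfrac{\partial u}{\partial y}(x, y) = Q(x, y)
\end{cases}
\quad \text{subject to} \quad
\frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x} = 0.
\]
This can be handled using Poincaré’s lemma, by integrating. But we want to
do things more generally.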
Definition 9.2. A Hilbert space (over $\mathbb{C}$) is a vector space with a Hermitian
inner product that is
(i) complete, i.e., it is complete with respect to the metric defined by the norm,
(ii) separable, i.e., there exists a countable dense subset.
Example 9.3. For any measurable set $E$, the space $L^2(E)$ is a Hilbert space.
We know that it is separable, because every function can be approximated by
simple functions and we can make them have rational coefficients.
To show completeness, consider a Cauchy sequence $f_n$ and then choose a
subsequence (still called $f_n$) such that $\|f_n - f_{n+1}\| \le 2^{-n}$. Then
$\sum_n \|f_n - f_{n+1}\| < \infty$. Look at the telescoping series
\[ f_n = f_1 + (f_2 - f_1) + \cdots + (f_{n-1} - f_{n-2}) + (f_n - f_{n-1}). \]
The sequence of functions
\[ g_n = |f_1| + |f_2 - f_1| + \cdots + |f_n - f_{n-1}| \]
converges to some $g_\infty$, which is in $L^2$ by Minkowski and monotone convergence.
This shows that $f_n$ converges almost everywhere, and the limit is in $L^2$ because
it is dominated by $g_\infty$. Dominated convergence then gives convergence in $L^2$.

We are trying to solve differential equations $Tf = g$ where $Sg = 0$. But in
general $T$ is going to be a differential operator, and $T : L^2 \to L^2$ is not going
to be defined everywhere. But it is densely defined, and we will be able to apply
Cramer’s rule here.
We would want to make sense of the adjoint operator. This is supposed to
be something that satisfies
\[ (Tf, g) = (f, T^* g). \]
Given $g$, we want to define $T^* g$ such that the identity is true for all $f$. That
is, we want to represent the map $f \mapsto (Tf, g)$ by the inner product with some
element. This step is called the Riesz representation theorem.

10 October 3, 2017
We will be looking at linear differential equations. We can ask how to get
$A^* (AA^* + S^* S)^{-1} b$ in function spaces. So we look at Hilbert spaces, which are
complete and separable. One difficulty, which held people up until Friedrichs
(1944), was that some operators are not defined on the whole space.
Operators like $d/dx$ are not defined on all of $L^2(\mathbb{R})$. Rather, $d/dx$ is defined
on differentiable functions, which form a dense subspace. The operator $T^*$ means
that
\[ (Tx, y) = (x, T^* y). \]
That means that we fix a $y$ and are trying to find a $T^* y$.

10.1 Riesz representation theorem


Proposition 10.1. Let $X$ be a Hilbert space and $Y$ be a closed subspace (so
it’s also a Hilbert space). For $v \in X \setminus Y$, there exists a unique $w \in Y$ such
that $v - w \perp Y$.
Proof. Get $w \in Y$ to minimize $\|v - w\|$. Rigorously, let $\mu = \inf_{y \in Y} \|v - y\|$ and
take a sequence $y_n \in Y$ with $\|v - y_n\| \to \mu$. Then
\[ \|y_n - y_m\|^2 + \|2v - y_n - y_m\|^2 = 2\|v - y_n\|^2 + 2\|v - y_m\|^2 \]
by the parallelogram law. Then $\|2v - y_n - y_m\|^2 \ge 4\mu^2$, because
$(y_n + y_m)/2 \in Y$, and the right hand side goes to $4\mu^2$. This shows that
$\|y_n - y_m\| \to 0$ and that this is a Cauchy sequence. Then $y_n \to w$, and $w \in Y$
because $Y$ is closed.
Now we need to show that it is perpendicular. Let $0 \ne y \in Y$. We have
\[ \mu^2 \le \|v - w + \lambda y\|^2 = \|v - w\|^2 + 2\lambda (v - w, y) + \lambda^2 \|y\|^2. \]
If we take $\lambda = -(v - w, y)/\|y\|^2$, the right hand side becomes
$\mu^2 - (v - w, y)^2 / \|y\|^2$, so we get $(v - w, y) = 0$.
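In finite dimensions the minimizer $w$ can be computed directly. The sketch below
projects a vector of $\mathbb{R}^3$ onto a subspace using Gram–Schmidt and checks
that the residual $v - w$ is perpendicular to the subspace. The vectors and helper
names are made up for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(v, spanning):
    """Orthogonal projection of v onto span(spanning), via Gram-Schmidt."""
    basis = []
    for y in spanning:
        for e in basis:
            c = dot(y, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        norm = math.sqrt(dot(y, y))
        if norm > 1e-12:                      # skip linearly dependent vectors
            basis.append([yi / norm for yi in y])
    w = [0.0] * len(v)
    for e in basis:
        c = dot(v, e)
        w = [wi + c * ei for wi, ei in zip(w, e)]
    return w

Y = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]       # spans the plane z = 0
v = [1.0, 2.0, 3.0]
w = project(v, Y)                            # closest point of the plane to v
residual = [vi - wi for vi, wi in zip(v, w)]
```

In the infinite dimensional setting there is no finite basis to orthogonalize,
which is why the proof above goes through the minimizing sequence instead.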


Theorem 10.2 (Riesz representation). Let $X$ be a Hilbert space and let
$f : X \to \mathbb{R}$ be an $\mathbb{R}$-linear continuous functional. (Then there exists a $c_f$ such
that $|f(x)| \le c_f \|x\|$. The minimal such $c_f$ is written as $\|f\|$.) Then there exists
a unique $v_f \in X$ such that $f = (-, v_f)$.
Proof. Assume $f \ne 0$. Let $Y = \ker f$, so that $Y \subsetneq X$ is a closed subspace. Let
$v \in X$ such that $f(v) \ne 0$, and take $w = \operatorname{proj}_Y v$. Replace $v$ by $v - w$, and we
may assume that $v \perp Y$ and also that $\|v\| = 1$ by normalization.
We seek $v_f = \lambda v$ for some $\lambda \in \mathbb{R}$. We do this by testing at $v$. We need
$(v, v_f) = f(v)$, so we use $\lambda = f(v)$. Now we claim that
\[ v_f = f(v) v \]
does the job. Because every $x$ decomposes into a multiple of $v$ and an element
of $Y$, we only need to show $(x, f(v)v) = f(x)$ for $x \in Y$ and $x = v$. We have
already checked it for $x = v$. For $x \in Y$, $(x, f(v)v) = 0 = f(x)$.
Now it can be proved that $\|v_f\| = \|f\|$. First,
\[ |f(x)| \le \|x\| \|v_f\| \]
by Cauchy–Schwarz and so $\|f\| \le \|v_f\|$. On the other hand, testing at $v_f$ gives
\[ \|f\| \|v_f\| \ge f(v_f) = \|v_f\|^2. \]

10.2 Adjoint operator


Let $T : X \to X$ be an $\mathbb{R}$-linear continuous map. We define the adjoint $T^*$ by
\[ (Tx, y) = (x, T^* y). \]
If we fix $y$, then
\[ |(Tx, y)| \le \|Tx\| \|y\| \le \|T\| \|y\| \|x\|, \]
and so by the Riesz representation theorem there exists this element $T^* y$ that
represents the map $x \mapsto (Tx, y)$. Moreover, we would have the inequality
\[ \|T^* y\| = \|x \mapsto (Tx, y)\| \le \|T\| \|y\|. \]
This shows that $T^*$ is well-defined and that $\|T^*\| \le \|T\|$. By symmetry and
since $(T^*)^* = T$, we get $\|T^*\| = \|T\|$.
Now let us look at the real situation. We have
\[ H_1 \xrightarrow{\ T\ } H_2 \xrightarrow{\ S\ } H_3, \]
where $H_1, H_2, H_3$ are Hilbert spaces but $T$ and $S$ are only densely defined. We
can assume that the graphs of $T$ and $S$ are closed.
What do I mean by this? We want to make this operator defined on a domain
as large as possible, so we look at the closure of the graph
\[ \{(x, Tx) : x \text{ in domain of } T\} \subseteq H_1 \times H_2. \]
But there might be an ambiguity: there could be $x_\nu \to x^* = 0$ with
$y_\nu = Tx_\nu \to y^* \ne 0$. It turns out that this does not happen for differential
operators, and this is included in the definition.
Consider the space $C_0^\infty$ of compactly supported smooth functions. Here we
can integrate by parts. So if $T = d/dx$ and $f_\nu \to 0$ and $f_\nu' \to g$ in $L^2$, then
\[ (f_\nu', h) = -(f_\nu, h') \to 0 \]
for all $h \in C_0^\infty$ and so $g = 0$.


If $T$ is densely defined with closed graph, then we can define its adjoint $T^*$.
Here, $g \in \operatorname{dom}(T^*)$ if and only if there exists an $h$ such that
\[ (f, h)_{H_1} = (Tf, g)_{H_2} \]
for all $f \in \operatorname{dom}(T)$. In this case, we say $h = T^* g$.

Proposition 10.3. Assume that $ST = 0$, i.e., $\operatorname{dom}(S) \supseteq \operatorname{im}(T)$ and
$S(Tf) = 0$ for all $f \in \operatorname{dom}(T)$. Also assume we have the a priori estimate
\[ \|T^* g\|^2 + \|Sg\|^2 \ge c \|g\|^2 \]
for all $g \in \operatorname{dom}(T^*) \cap \operatorname{dom}(S)$, for some $c > 0$. The conclusion is that if
$g \in \operatorname{dom} S$ is such that $Sg = 0$, then there exists an $f \in \operatorname{dom} T$ such that
$Tf = g$ and $f \perp \ker T$.
You could try to define $(TT^* + S^* S)^{-1}$, but it becomes very complicated
because the domain is bad. Instead, we do this directly.

Proof. Take any $g \in H_2$. Write $g = g_1 + g_2$ with $g_1 \in \ker S$ and $g_2 \in (\ker S)^\perp$.
($\ker S$ is closed because the graph is closed.) Then we have
\[ g_2 \in (\ker S)^\perp \subseteq (\operatorname{im} T)^\perp \subseteq \ker T^*. \]
The idea is that we want to solve $Tf = g$ weakly, so that $(Tf, u) = (g, u)$
for all test functions $u$. We’ll finish next time.

11 October 5, 2017
We need to prove the following proposition.
Proposition 11.1. Let $T : H_1 \to H_2$ and $S : H_2 \to H_3$ be densely defined
closed operators between Hilbert spaces with $ST = 0$ and
\[ \|T^* g\|_{H_1}^2 + \|Sg\|_{H_3}^2 \ge c \|g\|_{H_2}^2 \]
for all $g \in \operatorname{dom}(T^*) \cap \operatorname{dom}(S)$. Then the equation $Tu = f$ with $f \in \ker S$ can
be solved uniquely with $u \perp \ker T$ and moreover there is the estimate
\[ \|u\|_{H_1} \le \frac{1}{\sqrt{c}} \|f\|_{H_2}. \]

Proof. We want to do this in the weak sense, so that
$(u, T^* g) = (Tu, g) = (f, g)$. To get $u$, we would want to apply the Riesz
representation theorem to $T^* g \mapsto (g, f)$. This is not defined on all of $H_1$. To
handle this, we are going to prove that this is bounded. Then we can take the
closure. We have
\[ |(g, f)| \le \|g\| \|f\| \le \frac{\|f\|}{\sqrt{c}} \sqrt{\|T^* g\|^2 + \|Sg\|^2}, \]
but only when $g \in \operatorname{dom} S$. We also have the additional term $\|Sg\|^2$.
The idea is to decompose $g = g_1 + g_2$ with $g_1 \in \ker S$ and $g_2 \in (\ker S)^\perp$. Here,
$\ker S$ is closed because $S$ is a closed operator. Because $ST = 0$, $\operatorname{im} T \subseteq \ker S$.
Then $h \mapsto (g_2, Th) = 0$ because $g_2 \in (\ker S)^\perp$ and $Th \in \ker S$. This shows that
$g_2 \in \operatorname{dom} T^*$ with $T^* g_2 = 0$. Therefore if $g \in \operatorname{dom} T^*$ then $g_1, g_2 \in \operatorname{dom} T^*$.
Now for any $g \in \operatorname{dom} T^*$, using $(g_2, f) = 0$ and $Sg_1 = 0$, we get
\[ |(g, f)| = |(g_1, f)| \le \frac{\|f\|}{\sqrt{c}} \|T^* g_1\| = \frac{\|f\|}{\sqrt{c}} \|T^* g\|. \]
So $T^* g \mapsto (g, f)$ is bounded by $\|f\| / \sqrt{c}$. This is a map $\operatorname{im} T^* \to \mathbb{R}$ that is
bounded, and so we can extend it to the closure of $\operatorname{im} T^*$ and then to $H_1$ by
orthogonal projection. Now by the Riesz representation theorem, there exists a
$u \in H_1$ with
\[ \|u\| \le \frac{\|f\|}{\sqrt{c}} \]
such that $(T^* g, u) = (g, f)$ for all $g \in \operatorname{dom} T^*$.
This implies that $f = (T^*)^* u$ with $u \in \operatorname{dom} (T^*)^*$. But it can be shown that
$(T^*)^* = T$ and so $f = Tu$.

So let’s prove that $(T^*)^* = T$.


Theorem 11.2. Let $T : H_1 \to H_2$ be densely defined with closed graph. Then
$T^*$ is densely defined with closed graph and $(T^*)^* = T$.
Proof. The idea is to go to the product space $H_1 \times H_2$. The orthogonal
complement of the graph is
\[ (\operatorname{Graph} T)^\perp = \{(-T^* f, f) : f \in \operatorname{dom} T^*\}, \]
just by definition of $T^*$. Then by closedness of the graph, we get
\[ H_1 \times H_2 = (\operatorname{Graph} T) \oplus (\operatorname{Graph} T)^\perp. \]
If $\operatorname{dom} T^*$ is not dense, there exists a nonzero $h \in H_2$ that is perpendicular
to $\operatorname{dom} T^*$. Then $(h, f)_{H_2} = 0$ for all $f \in \operatorname{dom} T^*$ and so
\[ ((0, h), (-T^* f, f)) = (0, -T^* f) + (h, f) = 0 \]
for all $f \in \operatorname{dom} T^*$. Then $(0, h) \in \operatorname{Graph}(T)$, which is not possible unless $h = 0$
because $T0 = 0$. This shows that $\operatorname{dom} T^*$ is dense.
Finally, $T = (T^*)^*$ because, after the same swap of factors and change of sign,
$\operatorname{Graph}((T^*)^*)$ is going to be the orthogonal complement of $\operatorname{Graph}(T^*)$, which
is $\operatorname{Graph}(T)$.

11.1 Approximating by convolutions


So we have the basic theorem we can apply to real situations. The problem is
that we really need this estimate $TT^* + S^* S \ge c > 0$. The technique here
will be sum of squares.
Consider a differential operator
\[ L = \sum_{j=1}^{n} a_j^{(k)}(x_1, \ldots, x_n) \frac{\partial}{\partial x_j} + b_j^{(k)}(x_1, \ldots, x_n). \]
We can clearly apply it to smooth compactly supported $u \in C_0^\infty$. Then if
$(u, Lu) \to (v, w)$ in $L^2$ along a sequence of such $u$, we have to define $w = Lv$.
Friedrichs’ idea was to do things in $u \in C_0^\infty$ and then pass to general $(v, w)$. To
approximate $(v, w)$ by such $(u, Lu)$, you take a convolution and smooth out.
For instance, we consider a $\chi(x) \ge 0$ in $C_0^\infty$ with $\int_{\mathbb{R}} \chi = 1$. Then we scale
\[ \chi_\epsilon(x) = \frac{1}{\epsilon} \chi\Bigl( \frac{x}{\epsilon} \Bigr), \]
so that the support has order $\epsilon$.
Now assume $u \in \operatorname{dom} L$ and let $u_\epsilon = u * \chi_\epsilon$. Now we are good if we could
prove that
\[ (u_\epsilon, Lu_\epsilon) \to (u, Lu) \]
in the $L^2$ norm of the product space.
Theorem 11.3. Let $u, Lu \in L^2$ for some differential operator $L$. Then for any
such $\chi \in C_0^\infty$, $L(u * \chi_\epsilon) \to Lu$ in $L^2$ as $\epsilon \to 0$.

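Proof. The main idea is the $3\epsilon$ argument applied to the sequence. We have
\[ u_\epsilon = u * \chi_\epsilon \to u, \qquad (Lu) * \chi_\epsilon \to Lu \]
in $L^2$, by the good kernel argument. So what we are trying to show is
\[ L(u * \chi_\epsilon) - (Lu) * \chi_\epsilon \to 0 \]
in $L^2$.
For simplicity, consider the operator
\[ L = a(x) \frac{d}{dx} + b(x). \]
Suppose we show that
\[ \|\chi_\epsilon * Lu - L(\chi_\epsilon * u)\| \le C \|u\| \tag{$\dagger$} \]
independent of $\epsilon$. Given any $\delta > 0$, choose $v \in C_0^\infty$ with $\|u - v\| < \delta$. We have
\[
\|(\chi_\epsilon * Lu) - L(\chi_\epsilon * u)\|
\le \|\chi_\epsilon * (L(u - v)) - L(\chi_\epsilon * (u - v))\| + \|\chi_\epsilon * Lv - L(\chi_\epsilon * v)\|.
\]
But the first term is bounded by $C \|u - v\|$ and the second part goes to 0 as
$\epsilon \to 0$. So if we prove this statement, we get
\[ \chi_\epsilon * Lu - L(\chi_\epsilon * u) \to 0 \]
as $\epsilon \to 0$.
Now let us show the inequality ($\dagger$). The terms coming from $b$ are clearly bounded
by a constant times $\|u\|$, and for the rest we have
\begin{align*}
\chi_\epsilon * \Bigl( a \frac{\partial u}{\partial x} \Bigr) - a \frac{\partial}{\partial x} (\chi_\epsilon * u)
&= \chi_\epsilon * \frac{\partial}{\partial x}(au) - \chi_\epsilon * \Bigl( \frac{\partial a}{\partial x} u \Bigr) - a \frac{\partial}{\partial x} (\chi_\epsilon * u) \\
&= \Bigl( \frac{\partial}{\partial x} \chi_\epsilon \Bigr) * (au) - a \Bigl( \Bigl( \frac{\partial}{\partial x} \chi_\epsilon \Bigr) * u \Bigr) - \chi_\epsilon * \Bigl( \frac{\partial a}{\partial x} u \Bigr).
\end{align*}
The last term is bounded by a constant times $\|u\|$, and the first two terms are
\[
\int \frac{\partial \chi_\epsilon}{\partial y}(y)\, a(x - y) u(x - y)\,dy - a(x) \int \frac{\partial \chi_\epsilon}{\partial y}(y)\, u(x - y)\,dy
= \int \frac{\partial \chi_\epsilon}{\partial y}(y)\, (a(x - y) - a(x))\, u(x - y)\,dy.
\]
Now as $\epsilon \to 0$, the first derivative grows with order $\epsilon^{-1}$ and $a(x - y) - a(x)$
decreases with order $\epsilon$ on the support of $\chi_\epsilon$, so the two effects cancel.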
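The first ingredient, $u * \chi_\epsilon \to u$ in $L^2$, can be watched on a grid. The sketch
below mollifies the indicator function of $[0, 1]$ with a standard bump function and
measures the discrete $L^2$ error for a few values of $\epsilon$. The grid spacing, the
particular bump, and all function names are assumptions made for this
illustration, and the integrals are replaced by Riemann sums.

```python
import math

def bump(t):
    """Smooth bump supported in (-1, 1), not yet normalized."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1 else 0.0

def mollify(u, xs, eps, h):
    """Riemann-sum approximation of (u * chi_eps)(x) at the grid points xs."""
    m = int(eps / h) + 1
    offsets = [j * h for j in range(-m, m + 1)]
    weights = [bump(y / eps) for y in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]    # discrete analogue of int chi_eps = 1
    return [sum(w * u(x - y) for w, y in zip(weights, offsets)) for x in xs]

def l2_error(u, eps, h=0.005):
    """Discrete L^2 distance between u and its mollification on [-1, 2]."""
    xs = [-1.0 + i * h for i in range(int(3.0 / h) + 1)]
    smooth = mollify(u, xs, eps, h)
    return math.sqrt(h * sum((s - u(x)) ** 2 for s, x in zip(smooth, xs)))

def step(x):
    """Indicator function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

errors = [l2_error(step, eps) for eps in (0.2, 0.1, 0.05)]
```

The error shrinks as $\epsilon$ decreases even though the indicator function is
discontinuous, which is the kind of $L^2$ convergence the good kernel argument
provides.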

12 October 10, 2017


We have seen how to solve the equation $Tu = f$, when there is an estimate
\[ \|T^* g\|^2 + \|Sg\|^2 \ge c \|g\|^2 \]
and $Sf = 0$. Now if we actually solve the problem, we want to have something
like regularity. If $Tu = f$ is smooth and $u$ is orthogonal to $\ker T$, is $u$ smooth?
To do this, we are going to look at Sobolev spaces and Gårding’s inequality.
To actually get the solution, we need to know how to invert operators. Here,
we invert $TT^* + S^* S$, which is self-adjoint. Self-adjoint operators can be
diagonalized, and this is how we invert, as long as there is no zero eigenvalue. The
diagonalization statement is called the spectral theorem.

12.1 Compact operators


Let $H$ be a Hilbert space, and let $T : H \to H$ be a bounded self-adjoint operator. We want to diagonalize this, i.e., find orthonormal $\varphi_n \in H$ such that $T\varphi_n = \lambda_n\varphi_n$. Here, orthonormality means $(\varphi_m, \varphi_n) = \delta_{mn}$, and we also want every $f \in H$ to be approximated by the partial sums $\sum_{n=1}^N (f, \varphi_n)\varphi_n$.
Here, we need the additional assumption that the map is almost finite-dimensional. That is, there exist finite rank operators $T_n$ such that
\[ \lim_{n \to \infty} \|T - T_n\| = 0. \]

Definition 12.1. An operator $T : H \to H$ is called compact if for every sequence $f_n$ bounded in $H$, there exists a subsequence $f_{n_k}$ such that $Tf_{n_k}$ is Cauchy.
In other words, this means that the image of the unit ball has compact closure. The condition that $T$ can be approximated by finite rank operators is equivalent to $T$ being compact.
Theorem 12.2 (Spectral theorem). If $T : H \to H$ is a compact self-adjoint operator, then there exists an orthonormal basis $\varphi_n$ with $T\varphi_n = \lambda_n\varphi_n$, and for each $c > 0$, there exist only finitely many $n$ such that $|\lambda_n| \ge c$.
How can this be used? We can, for instance, look at Sturm–Liouville equations
\[ (Lf)(x) = \frac{1}{r(x)}\Big(\frac{d}{dx}\Big(p(x)\frac{d}{dx}f(x)\Big) - q(x)f(x)\Big). \]
This is actually self-adjoint, because we formally have (forgetting about $r$ for a moment)
\[ (Lf, g) = ((pf')' - qf, g) = -(pf', g') - (qf, g), \]
which is symmetric in $f$ and $g$. Then $r(x)$ just gives a weight on the inner product. This only holds formally, because we have some terms on the boundary. To make this actually work, we need to work with the space of all $L^2$ functions on $[a, b]$ vanishing at $a$ and $b$.

Let $H$ be a Hilbert space, and let $L(H)$ be all bounded linear operators from $H$ to $H$. This is an algebra, i.e., there is bilinear multiplication. Let $L_c(H)$ be the space of all compact operators from $H$ to itself.
Theorem 12.3. $L_c(H)$ is a two-sided, closed, adjoint-invariant ideal of $L(H)$. The finite-rank elements of $L_c(H)$ are dense in $L_c(H)$.
Proof. We need to show that if $T \in L(H)$ and $S \in L_c(H)$, then $TS \in L_c(H)$ and $ST \in L_c(H)$. This is clear, just by mapping the sequence.
To show closedness, we need to check that if $\|T - T_n\| \to 0$ and $T_n \in L_c(H)$ then $T \in L_c(H)$. This can be done by taking a diagonal subsequence from a sequence of nested subsequences. The usual $3\varepsilon$ argument works.
Now let us show that elements in $L_c(H)$ can be approximated by finite rank operators. Let $\{e_n\}$ be an orthonormal basis. Then define
\[ T_n f = \sum_{k \le n} (Tf, e_k)e_k, \qquad Q_n f = \sum_{k > n} (f, e_k)e_k \]
so that $T = T_n + Q_n T$. We need to show that $\|Q_n T\| \to 0$. Suppose not. Then, after passing to a subsequence, there exists a sequence $f_n$ with $\|f_n\| = 1$ and $\|Q_n Tf_n\| \ge c > 0$. By compactness, let $Tf_{n_k} \to g$. Then
\[ Q_{n_k} g = Q_{n_k} Tf_{n_k} - Q_{n_k}(Tf_{n_k} - g). \]
The left side tends to 0, while the right side has norm at least $c - \|Tf_{n_k} - g\|$, so we get a contradiction as we send $k \to \infty$.
Finally, we need to show that $T \in L_c(H)$ implies $T^* \in L_c(H)$. But if we write $I = P_n + Q_n$ with $\|P_n T - T\| \to 0$, then $\|T^* P_n^* - T^*\| \to 0$. Here $T^* P_n^*$ has finite rank.

12.2 Spectral theorem


The idea for proving the spectral theorem is to find one eigenvector $v$, and then restrict everything to $v^\perp$ to get the next eigenvector.
Theorem 12.4 (Spectral theorem). If $T : H \to H$ is a compact self-adjoint operator, then there exists an orthonormal basis $\varphi_n$ with $T\varphi_n = \lambda_n\varphi_n$, and for each $c > 0$, there exist only finitely many $n$ such that $|\lambda_n| \ge c$.
Lemma 12.5. For a bounded self-adjoint $T$, $\|T\| = \sup_{\|x\| \le 1} |(Tx, x)|$.
Proof. First we see that $\|T\| = \sup_{\|x\|, \|y\| \le 1} |(Tx, y)|$. One direction is Cauchy–Schwarz and the other direction is by specifying $y = Tx/\|Tx\|$. Now to change $(Tx, y)$ to $(Tx, x)$, we use polarization
\[ (Tx, y) = \frac{1}{4}\sum_{k=0}^{3} i^k \big(T(x + i^k y), x + i^k y\big). \]
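As a sanity check of the polarization identity used above, here is a minimal numerical sketch. It assumes the inner product is linear in the first slot and conjugate-linear in the second; the matrix `T` and the vectors `x`, `y` are arbitrary sample data, not taken from the lecture.

```python
# Check (Tx, y) = (1/4) * sum_{k=0}^{3} i^k (T(x + i^k y), x + i^k y)
# on C^2, with (u, v) = sum_j u_j * conj(v_j).

def inner(u, v):
    """Inner product, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def polarization(T, x, y):
    """Right-hand side of the polarization identity."""
    total = 0j
    for k in range(4):
        ik = 1j ** k
        z = [a + ik * b for a, b in zip(x, y)]
        total += ik * inner(apply(T, z), z)
    return total / 4

# Arbitrary sample data (this T happens to be self-adjoint).
T = [[2 + 0j, 1 - 1j], [1 + 1j, -3 + 0j]]
x = [1 + 2j, -1j]
y = [0.5 - 1j, 2 + 0j]

lhs = inner(apply(T, x), y)
rhs = polarization(T, x, y)
print(abs(lhs - rhs))  # should be at the level of rounding error
```

The identity itself holds for any linear $T$; self-adjointness is what makes each $(Tz, z)$ real, which is what the lemma needs.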

Proof of the spectral theorem. Let $\mu = \|T\|$, and assume $\mu \ne 0$. By the lemma, there exist $x_n$ with $\|x_n\| = 1$ such that $(Tx_n, x_n) \to \pm\mu$; replacing $T$ by $-T$ if necessary, assume the limit is $+\mu$. By compactness, after passing to a subsequence, $Tx_n$ converges, and let
\[ \lim_{n \to \infty} Tx_n = y. \]
The claim is that $y$ is an eigenvector. We have
\[ 0 \le (Tx_n - \mu x_n, Tx_n - \mu x_n) \le \mu^2 - 2\mu(Tx_n, x_n) + \mu^2 \to 0 \]
as $n \to \infty$. This shows that
\[ x_n = \frac{1}{\mu}\big(Tx_n - (Tx_n - \mu x_n)\big) \to \frac{1}{\mu}y. \]
Then because $T$ is bounded, $Ty = \mu \lim_{n \to \infty} Tx_n = \mu y$, and $y \ne 0$ because $\|y\| = \mu$.
As we take away eigenvectors, the eigenvalues have to go down to zero, because otherwise the eigenvectors themselves form a sequence in the unit ball whose image doesn't contain a Cauchy subsequence.

In general, we have an equation like $Lf = g$, where $L$ is not compact and not even bounded. But in many cases $(1 + L^*L)^{-1}$ is bounded, and we can hope that it is compact. Let's look at the equation, and for instance,
\[ Lf = \frac{d^2}{dx^2}f = g \]
with $f(a) = f(b) = 0$. This operator $L$ is not compact, but Green figured out that $L^{-1}$ is compact. He constructed the kernel
\[ K(x, y) = \begin{cases} \dfrac{(x - a)(b - y)}{a - b} & a \le x \le y \le b, \\[2mm] \dfrac{(b - x)(y - a)}{a - b} & a \le y \le x \le b. \end{cases} \]
Then the solution is given by
\[ f(x) = \int_{y=a}^{b} K(x, y)g(y)\,dy. \]
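To see the kernel at work, the following sketch evaluates $f(x) = \int_a^b K(x, y)g(y)\,dy$ by a midpoint rule and compares it with a solution known in closed form. The choices $[a, b] = [0, 1]$, $g \equiv 1$, and the quadrature size are assumptions made only for illustration; for this $g$ the exact solution of $f'' = 1$, $f(0) = f(1) = 0$ is $f(x) = x(x - 1)/2$.

```python
def green_kernel(x, y, a, b):
    """Green's kernel for f'' = g on [a, b] with f(a) = f(b) = 0."""
    if x <= y:
        return (x - a) * (b - y) / (a - b)
    return (b - x) * (y - a) / (a - b)

def solve(g, x, a, b, n=2000):
    """Midpoint-rule approximation of f(x) = int_a^b K(x, y) g(y) dy."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        y = a + (j + 0.5) * h
        total += green_kernel(x, y, a, b) * g(y)
    return total * h

def g(y):
    """Sample right-hand side g = 1."""
    return 1.0

def exact(x):
    """Closed-form solution of f'' = 1 with f(0) = f(1) = 0."""
    return x * (x - 1) / 2

a, b = 0.0, 1.0
for x in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(x, solve(g, x, a, b), exact(x))
```

The kernel is symmetric in $x$ and $y$ and vanishes at the endpoints, which is why the computed $f$ satisfies the boundary conditions automatically.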

13 October 12, 2017


Last time we proved the spectral theorem for self-adjoint compact operators. For instance, let us look at integral operators. Let $f \in L^2(\mathbb{R}^d)$ and $K(x, y) \in L^2(\mathbb{R}^d \times \mathbb{R}^d)$. Then we can look at the integral operator, or Hilbert–Schmidt operator
\[ (Tf)(x) = \int_{y \in \mathbb{R}^d} K(x, y)f(y)\,dy. \]
Then we know that
\[ \|T\| \le \|K\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}. \]

13.1 Sturm–Liouville equation


These can help us in solving differential equations. For instance, take $f(x)$ defined on $[a, b]$, and look at the toy model
\[ (Lf)(x) = f''(x). \]
The motivation is that if you integrate this to remove the derivative, you get a single integral. By integration by parts, you can replace multiple integration by single integration, like
\[ f(x) = f(a) + \int_a^x f'(t)\,dt = f(a) + (x - a)f'(a) + \int_a^x (x - t)f''(t)\,dt \]
\[ = f(a) + (x - a)f'(a) + \cdots + \frac{(x - a)^{n-1}}{(n - 1)!}f^{(n-1)}(a) + \int_a^x \frac{(x - t)^{n-1}}{(n - 1)!}f^{(n)}(t)\,dt. \]
So
\[ \frac{d^n}{dx^n}f(x) = \frac{d^n}{dx^n}\int_a^x \frac{(x - t)^{n-1}}{(n - 1)!}f^{(n)}(t)\,dt \]
and so we can set $K(x, y) = \frac{(x - y)^{n-1}}{(n - 1)!}\chi_{[a, x]}(y)$.
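In this case, the operator $f''$ is self-adjoint, so we want $K(x, y)$ to be symmetric in $x$ and $y$. We also want it to vanish at the boundary, so that the boundary terms vanish. After experimenting, people got
\[ K(x, y) = \begin{cases} \dfrac{(x - a)(b - y)}{a - b} & a \le x \le y \le b, \\[2mm] \dfrac{(b - x)(y - a)}{a - b} & a \le y \le x \le b. \end{cases} \]
You can differentiate to actually verify this. This $K(x, y)$ is called the Green's kernel.
The Sturm–Liouville equation is solved similarly. Consider the operator
\[ Lf = (pf')' - qf. \]
Suppose we can find $\varphi_-$ and $\varphi_+$ such that
\[ L(\varphi_-) = 0, \quad \varphi_-(a) = 0, \quad \varphi_-'(a) \ne 0, \]
\[ L(\varphi_+) = 0, \quad \varphi_+(b) = 0, \quad \varphi_+'(b) \ne 0. \]
Then we can write down Green's kernel as
\[ K(x, y) = \begin{cases} \dfrac{\varphi_-(x)\varphi_+(y)}{w} & a \le x \le y \le b, \\[2mm] \dfrac{\varphi_-(y)\varphi_+(x)}{w} & a \le y \le x \le b. \end{cases} \]
Here, the function
\[ w(x) = p(x)\begin{vmatrix} \varphi_-(x) & \varphi_+(x) \\ \varphi_-'(x) & \varphi_+'(x) \end{vmatrix} \]
is actually constant in $x$. In the toy model, $p = 1$, $\varphi_-(x) = x - a$ and $\varphi_+(x) = b - x$ give $w = a - b$.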

13.2 Fredholm’s alternative


In the toy model, we use $\vec{x}_{\min} = A^*(AA^* + S^*S)^{-1}\vec{b}$. Here, $AA^* + S^*S$ is not going to be nice for general Hilbert spaces, so we were not allowed to use this directly. This is why we needed all the things like $\|T^*g\|^2 + \|Sg\|^2 \ge c\|g\|^2$. But maybe just inverting the operator works for the equations we care about, i.e., differential equations. The Fredholm alternative is the statement that for every $\lambda \ne 0$, either $\lambda$ is an eigenvalue or $(T - \lambda I)^{-1}$ is bounded.
If the range is finite-dimensional, this is fine. If the operator $T$ is compact and self-adjoint, this is still fine, because we have the spectral theorem. We can pick an orthonormal basis $\vec{e}_k$ of eigenvectors and write
\[ (T - \lambda)^{-1}f = \sum_k \frac{1}{\lambda_k - \lambda}(f, \vec{e}_k)\vec{e}_k. \]
This is bounded by the fact that there are only finitely many eigenvalues outside a neighborhood of 0, so $|\lambda_k - \lambda|$ is bounded below.
Actually the Fredholm alternative holds for all compact operators $T$. The techniques we will use are the lower bound technique and the iterated approximation to get surjectivity.
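Theorem 13.1. If $T$ is compact and $\lambda \ne 0$ is not an eigenvalue, then $(T - \lambda)^{-1}$ is bounded.
This actually works for Banach spaces too.

Proof. Let's first prove that $\|(T - \lambda)x\| \ge c\|x\|$ for some $c > 0$. If this is not true, there is a sequence $x_n$ with $\|x_n\| = 1$ such that $(T - \lambda)x_n \to 0$. By compactness, after passing to a subsequence, assume that $Tx_n \to y$. Then
\[ \lambda x_n = Tx_n - (T - \lambda)x_n \to y. \]
So $x_n \to y/\lambda$ and $Tx_n \to Ty/\lambda$, which gives $Ty = \lambda y$. Since $\|y\| = |\lambda| \ne 0$, $y$ is an eigenvector with eigenvalue $\lambda$, a contradiction.
Now we need to show that $T - \lambda$ is surjective, so that we can invert it. Let $\|T_n - T\| \to 0$ with $T_n$ of finite-dimensional range. Now the claim is that there exists a $c > 0$ such that
\[ \|(T_n - \lambda)x\| \ge c\|x\| \]
for all sufficiently large $n$. This is by the same technique. If $(T_n - \lambda)x_n \to 0$ with $\|x_n\| = 1$, then by the diagonal argument we can assume $T_n x_k \to y_n$ as $k \to \infty$ for all $n$. Also assume $Tx_k \to y$. Then
\[ \|y_n - y\| \le \|y_n - T_n x_k\| + \|T_n x_k - Tx_k\| + \|Tx_k - y\|. \]
(Alternatively, the triangle inequality gives $\|(T_n - \lambda)x\| \ge \|(T - \lambda)x\| - \|T_n - T\|\|x\|$ directly.)
Now $(T_n - \lambda)^{-1}$ is defined and is bounded, because $T_n$ has finite-dimensional range. Take an arbitrary $y$ and let $(T_n - \lambda)x_n = y$ with $\|x_n\| \le C\|y\|$. Then $(T - \lambda)x_n = y + (T - T_n)x_n$. So
\[ \|(T - \lambda)x_n - y\| \le \|T - T_n\|\,C\|y\|. \]
Now if $x_n \to x$ then $(T - \lambda)x = y$. In general this shows that the range of $T - \lambda$ is dense, and the range is closed by the lower bound, so $T - \lambda$ is surjective.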

13.3 Banach spaces


When you solve a differential equation, you only get a weak solution. But now, you would want to know if the solution is differentiable, or even continuous. We can solve an equation like $d^2f/dx^2 = g$, with $g$ smooth, but can we know that $f$ is smooth? That is, is $f$ the same, up to a measure zero set, as a smooth function? Results of this kind are known as the Sobolev embedding theorem.
Here, we will look at $L^p$ spaces as well as $L^2$ spaces. This means that we will have to look at Banach spaces, which are normed vector spaces that are complete in the metric given by the norm.
Generally, looking at complements to subspaces is not possible in Banach spaces. This is a big problem. So we only have the technique of iterated approximation to hit some element.
Theorem 13.2 (Baire category theorem). Let $X$ be a complete metric space. If $O_n$ are open and dense in $X$, then
\[ E = \bigcap_{n \in \mathbb{N}} O_n \]
is dense in $X$.
Proof. Let $U$ be a nonempty open set. Since $O_1$ is dense, $U \cap O_1$ is nonempty, and so there is an open ball $B_1$ of radius at most 1 whose closure lies in $U \cap O_1$. Then by the same argument there is an open ball $B_2$ of radius at most $1/2$ whose closure lies in $B_1 \cap O_2$, and so forth. The centers form a Cauchy sequence, so the intersection of the closures of the $B_n$ is nonempty by completeness, and so $U \cap E \ne \emptyset$.
Theorem 13.3 (Open mapping theorem). Let X, Y be Banach spaces and T :
X → Y be continuous, i.e., bounded. If it is surjective, then the map is open.

14 October 17, 2017


We were dealing with Banach spaces because we want to look at $L^p_k$ in the regularity problem.

14.1 Useful facts about Banach spaces


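Theorem 14.1 (Baire category theorem). In a complete metric space, a countable intersection of open dense sets is dense. Equivalently, a countable union of closed nowhere dense sets has empty interior.
Theorem 14.2 (Banach–Steinhaus). Let $X$ be a Banach space and let $\mathcal{T}$ be a collection of bounded linear operators on $X$. Suppose that for each $x \in X$, $\sup_{T \in \mathcal{T}} \|Tx\| < \infty$. Then $\sup_{T \in \mathcal{T}} \|T\| < \infty$.
Proof. For each $n \in \mathbb{N}$, the set
\[ F_n = \{x \in X : \sup_{T \in \mathcal{T}} \|Tx\| \le n\} \]
is closed, and we also have $\bigcup_{n \in \mathbb{N}} F_n = X$. So by the Baire category theorem some $F_n$ contains an open ball. By linearity, a uniform bound on a ball gives a uniform bound on the unit ball.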
Theorem 14.3 (Open mapping theorem). Let $T : X \to Y$ be a bounded and surjective linear map of Banach spaces. Then $T$ is open.
Proof. For $B_X$ the open unit ball, we need to show that $T(B_X)$ contains some open ball in $Y$ around 0. But we have $\bigcup_n T(nB_X) = Y$, and so $\bigcup_n \overline{T(nB_X)} = Y$. The Baire category theorem implies that some $\overline{T(nB_X)}$ contains an open ball in $Y$. We may assume that $\overline{T(nB_X)} \supseteq B_Y$.
The whole point is to get rid of the closure. The condition we have obtained says that for each $y \in Y$ and $\varepsilon > 0$, there exists an $x \in X$ such that
\[ \|x\|_X \le n\|y\|_Y \quad \text{and} \quad \|y - Tx\|_Y < \varepsilon. \]
Then we replace $y$ by $y - Tx$ and repeat, halving $\varepsilon$ each time. This way you can write $y = Tx_1 + Tx_2 + Tx_3 + \cdots$, where the series $x_1 + x_2 + \cdots$ converges in $X$.

Theorem 14.4 (Closed graph theorem). Let $T : X \to Y$ be a linear map of Banach spaces such that the graph is closed. Then $T$ is bounded.
Proof. The graph $\Gamma_T \subseteq X \times Y$ is a closed subspace, hence a Banach space. Now the projection $\Gamma_T \to X$ is bounded and bijective, so it is open, and its inverse $X \to \Gamma_T$ is bounded. Composing with the projection $\Gamma_T \to Y$, which is bounded, gives $T$.

Theorem 14.5. $(L^p)^* = L^q$ for $1 < p < \infty$, when $\frac{1}{p} + \frac{1}{q} = 1$.
The main tool we use is the characterization of the indefinite integral of $L^p$ functions. This was in the assignment.

Proof. We have $L^q \subseteq (L^p)^*$ by Hölder's inequality. The difficult part is $(L^p)^* \subseteq L^q$. Suppose $\Phi : L^p \to \mathbb{R}$ is a bounded linear functional. To determine the function $g$ that represents $\Phi$, it suffices to know the value of $\Phi$ at $f = \chi_{[\xi, \eta]}$. This goes through the whole process of obtaining measurable sets from open sets, and then using simple functions to approximate measurable functions. Then we can define
\[ g = \frac{d}{dx}\Phi(\chi_{[a, x]}). \]
It suffices to show that $F(x) = \Phi(\chi_{[a, x]})$ is the indefinite integral of an $L^q$ function.
In the homework, you have shown that if $F$ satisfies
\[ \sum_{j=1}^{k} \frac{|F(\eta_j) - F(\xi_j)|^q}{(\eta_j - \xi_j)^{q-1}} \le M \]
for any disjoint $(\xi_1, \eta_1), \ldots, (\xi_k, \eta_k) \subseteq [a, b]$, then $F$ is the indefinite integral of an $L^q$ function. Let
\[ f = \sum_{j=1}^{k} \frac{|F(\eta_j) - F(\xi_j)|^{q-1}\operatorname{sgn}(F(\eta_j) - F(\xi_j))}{(\eta_j - \xi_j)^{q-1}}\chi_{[\xi_j, \eta_j]} \]
so that $\Phi(f)$ is the sum we want to estimate. Now if you compute $\|f\|_{L^p}$, using $(q - 1)p = q$, it is
\[ \|f\|_{L^p} = \Big(\sum_{j=1}^{k} \frac{|F(\eta_j) - F(\xi_j)|^q}{|\eta_j - \xi_j|^{q-1}}\Big)^{1/p} = (\Phi(f))^{1/p}. \]
But we have $\Phi(f) \le \|\Phi\|\|f\|_{L^p} = \|\Phi\|(\Phi(f))^{1/p}$, so $\Phi(f) \le \|\Phi\|^q$ and we get the right bound with $M = \|\Phi\|^q$.
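The exponent bookkeeping in the last step can be checked numerically. The sketch below builds the step function $f$ from sample increments $F(\eta_j) - F(\xi_j)$ and interval lengths $\eta_j - \xi_j$ (both made up for illustration), uses $\Phi(\chi_{[\xi, \eta]}) = F(\eta) - F(\xi)$ to evaluate $\Phi(f)$, and compares it with $\|f\|_{L^p}^p$.

```python
def step_function_check(increments, lengths, q):
    """Return (Phi(f), ||f||_p^p) for the step function f built in the proof.

    increments[j] plays the role of F(eta_j) - F(xi_j),
    lengths[j] plays the role of eta_j - xi_j.
    """
    p = q / (q - 1)  # conjugate exponent, so that (q - 1) * p = q
    # Value of f on the j-th interval.
    coeffs = [
        abs(d) ** (q - 1) * (1.0 if d >= 0 else -1.0) / l ** (q - 1)
        for d, l in zip(increments, lengths)
    ]
    # Phi is linear and Phi(chi_[xi, eta]) = F(eta) - F(xi).
    phi_f = sum(c * d for c, d in zip(coeffs, increments))
    # The integral of |f|^p over disjoint intervals.
    norm_p_to_p = sum(abs(c) ** p * l for c, l in zip(coeffs, lengths))
    return phi_f, norm_p_to_p

# Hypothetical sample data.
phi_f, norm_pp = step_function_check([0.3, -1.2, 0.05], [0.5, 0.25, 1.0], q=3.0)
print(phi_f, norm_pp)  # the two numbers should agree up to rounding
```

Both numbers equal $\sum_j |F(\eta_j) - F(\xi_j)|^q / (\eta_j - \xi_j)^{q-1}$, which is exactly the quantity the homework criterion asks us to bound.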

14.2 Hahn–Banach theorem


When you do the Fredholm alternative, you prove that k(λI − T )xk ≥ ckxk,
and you needed to extend the map by using an orthogonal projection. But
the thing is that you cannot use orthogonal projection in a Banach space. The
Hahn–Banach theorem deals with this situation.

Theorem 14.6 (Hahn–Banach theorem). Let $X$ be a linear space with a semi-norm, over $\mathbb{R}$. (That is, $\|\cdot\| : X \to \mathbb{R}^+ \cup \{0\}$ satisfies $\|\lambda x\| = |\lambda|\|x\|$ and $\|x + y\| \le \|x\| + \|y\|$.) Given $Y$ a subspace of $X$, assume that $f : Y \to \mathbb{R}$ is linear and bounded. Then $f$ extends to a linear functional on $X$ with the same bound.

Proof. We extend one dimension at a time, from $Y$ to $Y + \mathbb{R}z$. To do this, you only need to know how to assign $f(z)$. Assume $|f(y)| \le \|y\|$ for all $y \in Y$ by rescaling. We want to make
\[ -\|y + z\| \le f(y) + f(z) \le \|y + z\| \]
for all $y$. That is, we want
\[ f(z) \le \|y + z\| - f(y), \qquad f(z) \ge -f(y') - \|y' + z\| \]
for all $y$ and $y'$. This can be done, because
\[ f(y - y') \le \|y - y'\| \le \|y + z\| + \|y' + z\| \]
by the triangle inequality. Zorn's lemma then extends $f$ to all of $X$.


We also have a $\mathbb{C}$-linear version. Let $f$ be a $\mathbb{C}$-linear function. We can always write $f = u + iv$, where $u$ and $v$ are $\mathbb{R}$-linear. Then $f$ being $\mathbb{C}$-linear is equivalent to $v(x) = -u(ix)$. That is, given any $u$, we can define $f(x) = u(x) - iu(ix)$.

Proof of $\mathbb{C}$-linear version. Suppose $|f(x)| \le \|x\|$. Then clearly $|u(x)| \le \|x\|$. We extend $u(x)$ to $U(x)$, and then define $F(x) = U(x) - iU(ix)$. The norm estimate follows automatically. Choosing $\alpha$ so that $e^{i\alpha}F(x)$ is real and nonnegative, we have
\[ |F(x)| = F(e^{i\alpha}x) = U(e^{i\alpha}x) \le \|e^{i\alpha}x\| = \|x\|. \]
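The passage from a real-linear $u$ to the complex-linear $f(x) = u(x) - iu(ix)$ can be tested on a small example. In this sketch $u$ is the real part of a fixed complex-linear form on $\mathbb{C}^2$; the coefficients `c`, the vector `x`, and the scalar `lam` are hypothetical sample values.

```python
# Hypothetical coefficients defining a sample real-linear functional on C^2.
c = [2 - 1j, 0.5 + 3j]

def u(x):
    """A real-linear functional: the real part of sum_j c_j x_j."""
    return sum(cj * xj for cj, xj in zip(c, x)).real

def f(x):
    """The complexification f(x) = u(x) - i u(ix)."""
    return u(x) - 1j * u([1j * xj for xj in x])

x = [1 + 1j, -2 + 0.5j]
lam = 0.3 - 2j

scaled = [lam * xj for xj in x]
print(abs(f(scaled) - lam * f(x)))  # complex linearity, up to rounding
print(abs(f(x).real - u(x)))        # the real part of f recovers u
```

In this example $f$ is simply the original form $x \mapsto \sum_j c_j x_j$, which illustrates that a $\mathbb{C}$-linear functional is determined by its real part.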

Let $X$ be a compact smooth differentiable manifold. Hodge theory is about generalizing the simple prototype $\omega = P\,dx + Q\,dy$, $d\omega = 0$ and solving $\omega = d\eta$. This is not generally solvable, and our goal is to look at these problems with $k$-forms.

15 October 19, 2017


Theorem 15.1 (Hahn–Banach). Let $X$ be a vector space with a semi-norm, and let $Y$ be a subspace with a bounded functional $f : Y \to \mathbb{C}$. Then there exists an extension to $X$ with the same bound.
Note that a semi-norm is the same thing as a convex set $A$ such that $\bigcup_n nA = X$ and $tA = A$ for $|t| = 1$. Then $f$ bounded by 1 means that $A$ contains the ball $B$ of radius 1. What we are doing is separating this set from another point by a hyperplane. Using the difference of convex sets $A - B$, you can show that if two convex sets are disjoint, then they can be separated by a hyperplane.

15.1 Calculus on manifolds


To discuss regularity, we need the Sobolev spaces. There are Gårding's inequality and Rellich's lemma. The motivating example is Hodge theory. This started out with
\[ \frac{\partial u}{\partial x} = P, \quad \frac{\partial u}{\partial y} = Q, \qquad \text{subject to} \quad \frac{\partial P}{\partial y} - \frac{\partial Q}{\partial x} = 0. \]
This is sort of the fundamental theorem of calculus over $\mathbb{C}$, i.e., Cauchy's theory. But we can go in another direction, which is to interpret it as $d(P\,dx + Q\,dy) = 0$. There is the problem with boundaries, so Hodge looked at compact manifolds without boundary.
We have discussed solving $Tu = f$ subject to $Sf = 0$. But for compact manifolds there is a local question and a global question. The local one is called the Poincaré lemma. We can let $T = d$ on $(p - 1)$-forms and $S = d$ on $p$-forms. Locally this satisfies the estimate, but globally the estimate holds only up to a finite-dimensional space. These finite-dimensional exceptions are called harmonic forms and have important consequences in geometry and topology.
Let us look at the $p$-forms. If $M$ is a smooth ($C^\infty$) manifold, there is the notion of a tangent vector at a point $p$. This relates to each smooth function its directional derivative. The tangent vector maps the germs of smooth functions at $p$ to $\mathbb{R}$. This map is $\mathbb{R}$-linear and satisfies the Leibniz formula
\[ v(fg) = v(f)g(p) + f(p)v(g). \]
So the space of tangent vectors is a first-order approximation to $M$ at $p$.
Differential forms are then defined. A 0-form is just a function. A 1-form at a point $p$ is an element of $(T_{M,p})^*$, and smooth 1-forms can be defined. Each 0-form $f$ can be made into a 1-form as
\[ (df)_p = \sum_i \Big(\frac{\partial f}{\partial x_i}\Big)_p \Big(\Big(\frac{\partial}{\partial x_i}\Big)_p\Big)^* = \sum_i \Big(\frac{\partial f}{\partial x_i}\Big)_p (dx_i)_p. \]
What are the 2-forms then? If you just take the derivative, the second order terms will enter. To keep them out, you take the skew-symmetrization. Define
\[ d\Big(\sum_j \varphi_j\,dx_j\Big) = \sum_{j,k} \frac{\partial \varphi_j}{\partial x_k}\,dx_k \wedge dx_j. \]

15.2 Poincaré lemma


So let us look at the local compatibility condition. Take a smooth $p$-form
\[ \omega = \frac{1}{p!}\sum_{j_1, \ldots, j_p} \omega_{j_1, \ldots, j_p}\,dx_{j_1} \wedge \cdots \wedge dx_{j_p}, \]
where $\omega_{j_1, \ldots, j_p}$ is alternating. Assume that $d\omega = 0$ locally. Then we would like to say that $\omega = d\gamma$ for some local smooth $(p - 1)$-form $\gamma$.
Lemma 15.2 (Poincaré lemma). Let $\Omega$ be a star-like domain. If $\omega$ is a $p$-form on $\Omega$ such that $d\omega = 0$, then there is a $(p - 1)$-form $\gamma$ on $\Omega$ such that $d\gamma = \omega$.
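There are two ways of doing this: one is using polar coordinates and the other is using Cartesian coordinates, one at a time.

Proof. Let us take a radial coordinate $t$, on a star-like domain $\Omega$. There is a map
\[ \Phi : \Omega \times [0, 1] \to \Omega; \quad (x, t) \mapsto tx. \]
Write
\[ \Phi^*\omega = \alpha + dt \wedge \beta, \]
where $dt$ does not occur in $\alpha$ or $\beta$. This is closed, so
\[ 0 = d\Phi^*\omega = dt \wedge \frac{\partial \alpha}{\partial t} + d_x\alpha - dt \wedge d_x\beta. \]
The coefficient of $dt$ is 0, so $\frac{\partial \alpha}{\partial t} = d_x\beta$. Then
\[ \omega - 0 = \alpha|_{t=1} - \alpha|_{t=0} = \int_0^1 d_x\beta\,dt = d_x\int_0^1 \beta\,dt, \]
so we can take $\gamma = \int_0^1 \beta\,dt$.
If you want to do this in Cartesian coordinates, you use descending induction on the number of $dx_1, \ldots, dx_n$ occurring in $\omega$.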
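For a 1-form $\omega = P\,dx + Q\,dy$ on the plane, the radial construction gives $\beta = P(tx, ty)\,x + Q(tx, ty)\,y$ and $\gamma(x, y) = \int_0^1 \beta\,dt$. The sketch below checks numerically that $d\gamma = \omega$ for one closed form; the choice $\omega = d(x^2 y) = 2xy\,dx + x^2\,dy$, the midpoint quadrature, and the finite-difference step are assumptions made for illustration.

```python
def P(x, y):
    """dx-coefficient of the sample closed 1-form omega = d(x^2 y)."""
    return 2 * x * y

def Q(x, y):
    """dy-coefficient of the sample closed 1-form omega = d(x^2 y)."""
    return x * x

def gamma(x, y, n=1000):
    """gamma(x, y) = int_0^1 [P(tx, ty) x + Q(tx, ty) y] dt, midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        total += (P(t * x, t * y) * x + Q(t * x, t * y) * y) * h
    return total

def gradient(func, x, y, eps=1e-5):
    """Central finite-difference gradient of func at (x, y)."""
    gx = (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    gy = (func(x, y + eps) - func(x, y - eps)) / (2 * eps)
    return gx, gy

x0, y0 = 0.7, -1.3
gx, gy = gradient(gamma, x0, y0)
print(gx, P(x0, y0))  # d(gamma)/dx against P
print(gy, Q(x0, y0))  # d(gamma)/dy against Q
```

Here $\gamma$ recovers the potential $x^2 y$ itself, up to quadrature error, which is the expected primitive that vanishes at the origin.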

15.3 Rellich’s lemma


Now let us look at the global situation. Let $X$ be a compact smooth manifold. Let $f$ be a smooth $p$-form on $X$ with $df = 0$. Let us try to solve $du = f$ for some smooth $(p - 1)$-form $u$. We want to look at the same kind of estimate
\[ \|T^*g\|^2 + \|Sg\|^2 \ge\ ? \]
This is the same thing as looking at the operator $TT^* + S^*S$. You ask if this allows diagonalization. Here it is easier to use $1 + TT^* + S^*S$.
One way is to use Ascoli–Arzelà. Let us look at the $L^2$ space and the $L^2_1$ space. What is the $L^2$ space of global $p$-forms? Note that locally $p$-forms look like linear combinations of $dx_{j_1} \wedge \cdots \wedge dx_{j_p}$ with some coefficients. So we can take the $L^2$ norm of this after defining the metric and volume form on the manifold. I am going to do this.
Now the $L^2_1$ space consists of $L^2$ functions whose first derivative exists in the weak sense and is in $L^2$. This means that there exists a $g \in L^2$ such that
\[ -\int f\varphi' = \int g\varphi \]
for all $\varphi \in C_c^\infty$.

Lemma 15.3. Let $H_s$ be the $L^2_s$ space of $p$-forms, and let $H_r$ be the $L^2_r$ space of $p$-forms, with $r > s$. Then $H_r \subset H_s$. If a map
\[ T : H_s \to H_r \]
satisfies $\|Tf\|_r \le c\|f\|_s$, then $T : H_s \to H_s$ is compact.

Proof. We only need to look at the coefficients. Let $\omega_n$ be a bounded sequence in the $L^2_s$-norm. We want to extract a subsequence of $T\omega_n$ that converges in the $L^2_s$-norm. Taking a partition of unity, we may assume that we are working on Euclidean space, and the function has compact support.
We now assume that the chart is $(-1, 1)^n$, and extend periodically with period $2\pi$. You look at the Fourier series with coefficients $c_\nu$, and you can show that $\sum |\nu c_\nu|^2$ is bounded.

16 October 24, 2017


We were developing Hodge theory for compact smooth manifolds. Locally there
is no obstruction to every closed p-form being exact. This is Poincaré’s lemma.

16.1 L2 space of differential forms


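To make everything rigorous, we first need to define the $L^2$ space of differential forms. We do this by introducing a Riemannian metric $g_{ij}$. This is a symmetric 2-tensor
\[ g = \sum_{i,j=1}^{n} g_{ij}\,dx_i \otimes dx_j \]
where $(g_{ij})$ is symmetric and positive-definite. This gives a pointwise inner product, and to make this global, we need a volume form
\[ \sqrt{\det g_{ij}}\;dx_1 \wedge dx_2 \wedge \cdots \wedge dx_n. \]
Now for a $p$-form
\[ \varphi = \frac{1}{p!}\sum \varphi_{j_1 \cdots j_p}\,dx_{j_1} \wedge \cdots \wedge dx_{j_p}, \]
we can define its $L^2$ norm
\[ \|\varphi\|_{L^2}^2 = \sum \int_X \varphi_{j_1 \cdots j_p}\varphi_{k_1 \cdots k_p}\,g^{j_1 k_1} \cdots g^{j_p k_p}\sqrt{\det g}\;dx_1 \wedge \cdots \wedge dx_n, \]
where $(g^{jk})$ is the inverse matrix of $(g_{jk})$.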
X

In the old days, people tried to solve this locally using Poincaré, but then the local solutions don't agree at the boundary. This difference is then going to have zero differential, so it can be written as $d$ of something. This motivated Leray to define sheaves.
Let us call $\mathcal{A}^p$ the space of all smooth $p$-forms on a smooth compact manifold $X$. Also, let $\mathcal{A}^p_{L^2}$ be the space of all $p$-forms with locally $L^2$ coefficients. We have a sequence
\[ \cdots \to \mathcal{A}^{p-2} \xrightarrow{d} \mathcal{A}^{p-1} \xrightarrow{d} \mathcal{A}^p \xrightarrow{d} \mathcal{A}^{p+1} \xrightarrow{d} \mathcal{A}^{p+2} \to \cdots. \]
We are then trying to understand the obstruction to equality in $\ker(\mathcal{A}^p \to \mathcal{A}^{p+1}) \supseteq \operatorname{im}(\mathcal{A}^{p-1} \to \mathcal{A}^p)$. But our method of solving this involves limits, so we are forced to look instead at
\[ \mathcal{A}^{p-1}_{L^2} \xrightarrow{d_{p-1}} \mathcal{A}^p_{L^2} \xrightarrow{d_p} \mathcal{A}^{p+1}_{L^2}. \]
It is clear that $d_p \circ d_{p-1} = 0$. So the only thing we need is the a priori estimate. We ask if $d_{p-1}d_{p-1}^* + d_p^* d_p \ge c$ is true. Then we have all our machinery that allows us to solve the equation. But this is not all we want, because we are interested in $\mathcal{A}^p$. So we have another regularity problem to solve.
It turns out that $d_{p-1}d_{p-1}^* + d_p^* d_p \ge c$ up to a finite dimension. That is, there is some finite-dimensional subspace that behaves badly, and if we exclude the bad guys by taking the orthogonal complement, we get the right inequality.
We will show that $(I + d_{p-1}d_{p-1}^* + d_p^* d_p)^{-1}$ is compact. This will prove that we have the right estimate up to finite dimension, by the spectral theorem. There are two ingredients, and the first one is Gårding's inequality.
Theorem 16.1 (Gårding's inequality). If $\varphi \in \mathcal{A}^p_{L^2}$ is in $\operatorname{Dom} d_p \cap \operatorname{Dom} d_{p-1}^*$ then
\[ \|d_{p-1}^*\varphi\|^2_{\mathcal{A}^{p-1}_{L^2}} + \|d_p\varphi\|^2_{\mathcal{A}^{p+1}_{L^2}} + \|\varphi\|^2_{\mathcal{A}^p_{L^2}} \ge c\|\nabla\varphi\|^2_{L^2}. \]
Note that this also implies that $\varphi$ is in $L^2_1$. The second ingredient is Rellich's lemma.
Lemma 16.2 (Rellich's lemma). The inclusion $L^2_r \hookrightarrow L^2_s$ is compact for $r > s$.
Proof. Let $\varphi_\nu$ be a sequence of $L^2$ $p$-forms on $X$ where the coefficients are in $L^2_1$ and are bounded uniformly. We want to select a subsequence that converges in $L^2$. We can localize by using a partition of unity. Then assume $\operatorname{supp} \varphi_\nu \subseteq (-\pi, \pi)^n$.
Look at the Fourier series and write
\[ \varphi_\nu = \sum a_{m_1, \ldots, m_n, \nu}\,e^{i m_1 x_1 + \cdots + i m_n x_n}. \]
Note that $g$ is bounded both above and below, and hence we may just assume $g$ is the standard metric, because we only need convergence. Then up to constant,
\[ \|\varphi_\nu\|^2 = \sum_{m_1, \ldots, m_n} |a_{m_1, \ldots, m_n, \nu}|^2 \le 1. \]
If we take the derivative, the uniform $L^2_1$ bound will give, up to constant,
\[ \Big\|\frac{\partial}{\partial x_j}\varphi_\nu\Big\|^2 = \sum |m_j|^2 |a_{m_1, \ldots, m_n, \nu}|^2 \le 1. \]
This shows that the tail part $\sum_{|m_j| \ge N} |a_{m_1, \ldots, m_n, \nu}|^2$ is bounded by $1/N^2$, independent of $\nu$. Then we can truncate this tail part, get a finite-dimensional space, and find a subsequence. Now let $N \to \infty$ and use the diagonal.
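The tail estimate is the heart of the argument: a bound on $\sum m^2|a_m|^2$ forces the high-frequency energy to be small uniformly. A one-dimensional sketch, with a made-up coefficient sequence normalized so that $\sum m^2 |a_m|^2 = 1$:

```python
def tail_energy(coeffs, N):
    """Sum of |a_m|^2 over frequencies with |m| >= N."""
    return sum(abs(a) ** 2 for m, a in coeffs.items() if abs(m) >= N)

# Hypothetical Fourier coefficients a_m ~ |m|^(-1.6) for 0 < |m| <= M.
M = 500
raw = {m: 1.0 / abs(m) ** 1.6 for m in range(-M, M + 1) if m != 0}

# Normalize so that the derivative has unit L^2 norm: sum m^2 |a_m|^2 = 1.
h1_norm_sq = sum(m * m * a * a for m, a in raw.items())
coeffs = {m: a / h1_norm_sq ** 0.5 for m, a in raw.items()}

for N in (1, 5, 20, 100):
    print(N, tail_energy(coeffs, N), 1.0 / N ** 2)
```

The inequality follows from $|a_m|^2 \le m^2|a_m|^2/N^2$ for $|m| \ge N$, so it holds for every sequence with the same normalization, which is the uniformity in $\nu$ that the proof uses.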

16.2 Gårding’s inequality


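The statement is that
\[ \|d_{p-1}^*\varphi\|^2 + \|d_p\varphi\|^2 + \|\varphi\|^2 \ge c\|\nabla\varphi\|^2. \]
Let us look at the case $p = 1$ first. If
\[ \varphi = \sum_{j=1}^{n} \varphi_j\,dx_j \]
then
\[ d\varphi = \sum_{j,k} \partial_k\varphi_j\,dx_k \wedge dx_j = \sum_{j<k} (\partial_j\varphi_k - \partial_k\varphi_j)\,dx_j \wedge dx_k. \]
But what is $d^*\varphi$? Let us work in $\mathbb{R}^n$ for simplicity. If $d^*\varphi = \psi$, then
\[ \int \psi u = \sum_j \int \varphi_j\,\partial_j u = -\int \Big(\sum_j \partial_j\varphi_j\Big)u \]
if everything has compact support. So $d^*\varphi = -\sum_j \partial_j\varphi_j$. Therefore
\[ \|d\varphi\|^2 = \sum_{j<k} \int (\partial_j\varphi_k - \partial_k\varphi_j)^2, \qquad \|d^*\varphi\|^2 = \int \Big(\sum_j \partial_j\varphi_j\Big)^2. \]
Also
\[ \|\nabla\varphi\|^2 = \sum_{j,k} \int (\partial_j\varphi_k)^2. \]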

Here, we are adding sums of squares of linear combinations of partial derivatives of coefficients. Counting squares already suggests that you can't bound $\|\nabla\varphi\|^2$ by $\|d\varphi\|^2$ and $\|d^*\varphi\|^2$ pointwise. This is because there are $n^2$ squares in $\|\nabla\varphi\|^2$ but only $n(n-1)/2$ squares in $\|d\varphi\|^2$ and 1 square in $\|d^*\varphi\|^2$.
But this is not linear algebra and we have the squares. The idea is $|\vec{u} \cdot \vec{v}|^2 + |\vec{u} \times \vec{v}|^2 = |\vec{u}|^2|\vec{v}|^2$. Then
\[ |\operatorname{tr}(\vec{u} \otimes \vec{v})|^2 + \|\vec{u} \wedge \vec{v}\|^2 = \|\vec{u} \otimes \vec{v}\|^2 \]
and if you let $u = (\partial_1, \ldots, \partial_n)$ and $v = (\varphi_1, \ldots, \varphi_n)$, then you formally get the inequality.
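The algebraic identity behind this (Lagrange's identity, valid in any dimension $n$) is easy to confirm on numbers. The vectors below are arbitrary integer samples, so the check is exact.

```python
def lagrange_gap(u, v):
    """Return |tr(u (x) v)|^2 + |u ^ v|^2 - |u (x) v|^2, which should vanish.

    tr(u (x) v) is the dot product, the wedge has components
    u_j v_k - u_k v_j for j < k, and the tensor has components u_j v_k.
    """
    n = len(u)
    dot_sq = sum(a * b for a, b in zip(u, v)) ** 2
    wedge_sq = sum(
        (u[j] * v[k] - u[k] * v[j]) ** 2
        for j in range(n)
        for k in range(j + 1, n)
    )
    tensor_sq = sum((u[j] * v[k]) ** 2 for j in range(n) for k in range(n))
    return dot_sq + wedge_sq - tensor_sq

# Arbitrary integer sample vectors in R^4.
u = [1, -2, 3, 4]
v = [2, 0, -1, 5]
print(lagrange_gap(u, v))
```

With $u_j$ replaced by $\partial_j$ and $v_k$ by $\varphi_k$, the three terms correspond to $\|d^*\varphi\|^2$, $\|d\varphi\|^2$ and $\|\nabla\varphi\|^2$, except that derivatives do not commute with multiplication pointwise; that is what the integration by parts repairs.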
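So formally, pointwise,
\[ \sum_{j<k} (\partial_j\varphi_k - \partial_k\varphi_j)^2 = \frac{1}{2}\sum_{j,k} (\partial_j\varphi_k - \partial_k\varphi_j)^2 = \sum_{j,k} (\partial_j\varphi_k)^2 - \sum_{j,k} (\partial_k\varphi_j)(\partial_j\varphi_k). \]
But when we integrate, we can move $\partial$ around, so $\int (\partial_k\varphi_j)(\partial_j\varphi_k) = \int (\partial_j\varphi_j)(\partial_k\varphi_k)$. Summing over $j, k$ gives $\|d\varphi\|^2 = \|\nabla\varphi\|^2 - \|d^*\varphi\|^2$ for compactly supported $\varphi$.
This is the case for Euclidean space. But we have a Riemannian metric. Here,
\[ \|d\varphi\|^2 = \int_X (\partial_j\varphi_k - \partial_k\varphi_j)(\partial_l\varphi_m - \partial_m\varphi_l)\,g^{jl}g^{km}\sqrt{\det g}. \]
Here, you can mostly ignore the $g^{jl}g^{km}$ part because we are working up to constant. But when you do integration by parts, you end up taking the derivative of $g$. This is why you need an additional $\|\varphi\|^2$ in the general case.
What about for $p$-forms? Let me explain a bit about covariant differentiation in this case. Because the difference quotient depends on the coordinate, you need a connection on the manifold. Levi-Civita gave a way to get a connection from a Riemannian metric.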

17 October 26, 2017


We were looking at $p$-forms and $\mathcal{A}^p_{L^2}(X) \to \mathcal{A}^{p+1}_{L^2}(X)$. Our actual goal is $\mathcal{A}^p_{C^\infty}(X) \to \mathcal{A}^{p+1}_{C^\infty}(X)$. The two ingredients are
Lemma 17.1 (Rellich’s lemma). For r > s, the inclusion L2r (X) → L2s (X) is
compact.
Proof. You localize, and then use Fourier series.
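Theorem 17.2 (Gårding's inequality). If $\varphi \in \mathcal{A}^p_{C^\infty}(X)$ then
\[ \|d_{p-1}^*\varphi\|^2 + \|d_p\varphi\|^2 + \|\varphi\|^2 \ge c\|\nabla\varphi\|^2. \]
Here $\nabla$ is covariant differentiation, which can be defined by the Riemannian metric.
Theorem 17.3 (Gårding's inequality, $L^2$-version). If $\varphi \in \mathcal{A}^p_{L^2}(X)$ lies in $\operatorname{Dom} d_p \cap \operatorname{Dom} d_{p-1}^*$ then
\[ \|d_{p-1}^*\varphi\|^2 + \|d_p\varphi\|^2 + \|\varphi\|^2 \ge c\|\nabla\varphi\|^2. \]
In particular, $\partial_k\varphi_{j_1, \ldots, j_p}$ is $L^2$.
This is a bit complicated, and we need to use Friedrichs' lemma, proven in 1944. This says that if $L$ is a differential operator, and $\chi_\varepsilon$ is a kernel with $\operatorname{supp} \chi_\varepsilon$ of size of order $\varepsilon$, then $u * \chi_\varepsilon \to u$ and $L(u * \chi_\varepsilon) \to Lu$ in $L^2$.
Proof. First localize, so that we have a compactly supported $\varphi$. Then Friedrichs' lemma tells us that
\[ \varphi * \chi_\varepsilon \to \varphi, \quad d_p(\varphi * \chi_\varepsilon) \to d_p\varphi, \quad d_{p-1}^*(\varphi * \chi_\varepsilon) \to d_{p-1}^*\varphi. \]
Then we use the $C^\infty$ version of Gårding's inequality on $\varphi * \chi_\varepsilon$. To take care of $\nabla\varphi$, use Fatou.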
If you want to solve the problem with boundary, then things get easily
complicated. You have done the D2 case in the homework.

17.1 Solving the equation


Now we want to solve the differential equation. The spectral theorem will be applied to $(I + d_p^* d_p + d_{p-1}d_{p-1}^*)^{-1}$. Now you may object that $d_p^* d_p$, $d_{p-1}d_{p-1}^*$, or their sum might not be defined.
This is Friedrichs' contribution. Let $T : H \to H$ be a closed densely defined operator. He showed that given any $u \in H$, there exists a $v$ such that $(1 + T^*T)v = u$. This means both that $Tv$ is defined and that $T^*(Tv)$ is defined.
In our case, we are looking at the map
\[ \mathcal{A}^p_{L^2}(X) \to \mathcal{A}^{p-1}_{L^2}(X) \oplus \mathcal{A}^{p+1}_{L^2}(X); \quad \varphi \mapsto d_{p-1}^*\varphi \oplus d_p\varphi. \]
In this case, we have $T : H_1 \to H_2$, and we make it into
\[ S = \begin{pmatrix} 0 & 0 \\ T & 0 \end{pmatrix} : H_1 \oplus H_2 \to H_1 \oplus H_2. \]
If we apply the theorem to $S$, we just invert $1 + S^*S$, which on $H_1$ is just $1 + T^*T$. So we don't have to worry about these double domain containment conditions.
Given any bounded sequence $u_\nu$, we have $v_\nu$ such that
\[ (1 + d_p^* d_p + d_{p-1}d_{p-1}^*)v_\nu = u_\nu. \]
Here, $v_\nu \in \operatorname{Dom} d_p \cap \operatorname{Dom} d_{p-1}^*$. This sequence $v_\nu$, by Gårding's inequality, is bounded in $L^2_1$. So by Rellich there exists a subsequence converging in $L^2$. This shows that $(1 + d_p^* d_p + d_{p-1}d_{p-1}^*)^{-1}$ is compact.
Applying the spectral theorem gives eigenvalues $\lambda_n$ converging to 0 as $n \to \infty$, with finite-dimensional eigenspaces $E_n$. (The possible exception $\lambda = 0$ does not occur here, because the operator is an inverse and hence injective.) This is for $(1 + d_p^* d_p + d_{p-1}d_{p-1}^*)^{-1}$. Now if we look at the map $d_p^* d_p + d_{p-1}d_{p-1}^*$, the same eigenspaces $E_n$ are eigenspaces with eigenvalue
\[ \mu_n = \frac{1}{\lambda_n} - 1. \]
Because $0 < \lambda_n \le 1$ with $\lambda_n \to 0$, we get $0 \le \mu_n$ with $\mu_n \to \infty$. The only problem is when $\lambda_n = 1$ and $\mu_n = 0$. This is a problem because we want to solve the differential equation and so we want to invert it.
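So let $H$ be the eigenspace for
\[ d_p^* d_p + d_{p-1}d_{p-1}^* = \Delta \]
with eigenvalue 0. We can decompose $\mathcal{A}^p_{L^2}(X) = H \oplus H^\perp$. Then we can invert
\[ (\Delta|_{H^\perp})^{-1}\vec{e}_j = \frac{1}{\mu(\vec{e}_j)}\vec{e}_j, \]
which has bound $1/\mu_0$ where $\mu_0$ is the minimal nonzero $\mu_n$. We can extend $(\Delta|_{H^\perp})^{-1}$ by 0 on $H$. This operator $G$ is called the Green's operator.
Now we are looking at
\[ \mathcal{A}^{p-1}_{L^2} \xrightarrow{d_{p-1}} \mathcal{A}^p_{L^2} \xrightarrow{d_p} \mathcal{A}^{p+1}_{L^2}. \]
Given any $\varphi \in \ker d_p$, we want to obtain the minimal solution of $d_{p-1}\psi = \varphi$, which would be
\[ \psi = d_{p-1}^*(d_{p-1}d_{p-1}^* + d_p^* d_p)^{-1}\varphi. \]
But of course $\Delta$ is not invertible because of $H$. So the best we can do is to solve it in the case $\varphi \in H^\perp$, and then we can just write
\[ \psi = d_{p-1}^* G\varphi. \]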
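In an eigenbasis everything above is diagonal, so the bookkeeping can be illustrated with plain lists of eigenvalues. The spectrum of $\Delta$ below is hypothetical; the sketch passes to the compact operator $(1 + \Delta)^{-1}$, recovers $\mu_n = 1/\lambda_n - 1$, and builds the Green's operator as the inverse on $H^\perp$ extended by 0 on $H$.

```python
def green_operator_eigs(lams, tol=1e-12):
    """From eigenvalues lam of (1 + Delta)^(-1), return pairs (mu, g):
    mu is the eigenvalue of Delta and g the eigenvalue of the Green's operator."""
    pairs = []
    for lam in lams:
        mu = 1.0 / lam - 1.0
        g = 0.0 if abs(mu) < tol else 1.0 / mu  # extend by 0 on the kernel H
        pairs.append((mu, g))
    return pairs

# Hypothetical spectrum of Delta: a two-dimensional harmonic space, then positive eigenvalues.
laplacian_eigs = [0.0, 0.0, 1.5, 4.0, 9.0]
lams = [1.0 / (1.0 + mu) for mu in laplacian_eigs]

pairs = green_operator_eigs(lams)
# Delta * G acts as 0 on H and as the identity on the orthogonal complement.
products = [mu * g for mu, g in pairs]
print(products)
```

The operator norm of $G$ in this toy model is the largest entry $1/\mu_0$, attained at the smallest nonzero eigenvalue, in line with the bound stated above.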

17.2 Regularity of harmonic forms


But I haven't done anything about regularity although I have advertised it a lot. We want to show that if $\varphi$ is smooth then the solution $\psi$ is also smooth. Here we are going to use the fact that
\[ d_{p-1}\psi = \varphi, \qquad d_{p-2}^*\psi = 0. \]
Proposition 17.4. If $f = d_p\varphi$ and $g = d_{p-1}^*\varphi$ are both $C^\infty$, then $\varphi$ is $C^\infty$.

The technique is to use the commutators of differential operators with difference operators.

Proof. First you localize. Here this is a problem because
\[ d(\rho_\nu\varphi) = \rho_\nu\,d\varphi + E, \]
where $E$ depends on $\varphi$ linearly. So after localizing, the right-hand sides are no longer smooth but are just in $L^2$. That is, replacing $\varphi$ by the localized form, we have
\[ d_p\varphi = f + E, \qquad d_{p-1}^*\varphi = g + E' \]
for some $E, E' \in L^2$. By Gårding's inequality, we immediately get $\varphi \in L^2_1$.
Then we look at an arbitrary partial derivative $D$. Here
\[ D d_p\varphi = d_p D\varphi + [D, d_p]\varphi. \]
This $[D, d_p]$ is a differential operator of order at most 1 by the product formula. Because $D d_p\varphi$ and $[D, d_p]\varphi$ are $L^2$, we get $d_p D\varphi \in L^2$. Then likewise we get $d_{p-1}^* D\varphi \in L^2$. Again by Gårding, we get $D\varphi \in L^2_1$. Then we can bootstrap this iteratively to get $\varphi \in C^\infty$.
Here, we don't know if $\nabla D\varphi$ is in $L^2$. So we use a difference quotient instead of a partial derivative, and apply Gårding's inequality. Then we can take the limit using Fatou's lemma. This problem arises because we can't really do integration by parts when everything is a weak derivative.

18 October 31, 2017


I think we finished Hodge theory. Now I will go and do the other technique of solving differential equations. This was developed by Malgrange and Ehrenpreis. We are going to use the Fourier transform to solve linear partial differential equations with constant coefficients. The point is to overcome the difficulty of division by 0.
The point is that the Fourier transform is complex-analytic. We have
\[ \hat{f}(\xi) = \int_{x \in \mathbb{R}} f(x)e^{-2\pi i\xi x}\,dx. \]
This is a weighted average of the exponential functions, which are complex-analytic in $\xi$. So after taking the weighted average, it is still going to be complex-analytic. Then we have the mean-value property. Even though a function might have a zero at a point, it might not have a zero on a circle around it.
We want to solve differential equations in a weak sense. So we want to use test functions. We can use the compactly supported smooth functions $C_c^\infty$. But using the Schwartz space is good enough. These are the functions $f$ all of whose derivatives $D^\alpha f$ decay faster than any polynomial order.
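When people tried to solve Laplace's equation
\[ \sum_{j=1}^{n} \frac{\partial^2 u}{\partial x_j^2} = f, \]
they took the Fourier transform. Here, integrating by parts,
\[ \widehat{f'}(\xi) = \int_{\mathbb{R}} f'(x)e^{-2\pi i\xi x}\,dx = (2\pi i\xi)\int_{\mathbb{R}} f(x)e^{-2\pi i\xi x}\,dx. \]
So we have
\[ \widehat{\Big(\frac{\partial}{\partial x}\Big)^{\alpha} f}(\xi) = (2\pi i\xi)^\alpha\hat{f}(\xi). \]
So we would have
\[ \hat{u} = \frac{\hat{f}}{-4\pi^2|\xi|^2}. \]
Here you have a problem because you're dividing by zero.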
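The rule $\widehat{f'}(\xi) = 2\pi i\xi\,\hat{f}(\xi)$ can be checked numerically on a Gaussian, for which $\hat{f}$ is known in closed form. In this sketch the test function $f(x) = e^{-\pi x^2}$, the truncation to $[-8, 8]$, and the Riemann-sum quadrature are choices made for illustration only.

```python
import cmath
import math

def fourier(func, xi, half_width=8.0, n=4000):
    """Riemann-sum approximation of int func(x) exp(-2 pi i xi x) dx over [-half_width, half_width]."""
    h = 2 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + j * h
        total += func(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

def f(x):
    """Gaussian test function; its Fourier transform is exp(-pi xi^2)."""
    return math.exp(-math.pi * x * x)

def f_prime(x):
    """Derivative of the Gaussian test function."""
    return -2 * math.pi * x * math.exp(-math.pi * x * x)

xi = 0.75
lhs = fourier(f_prime, xi)                # transform of the derivative
rhs = 2j * math.pi * xi * fourier(f, xi)  # multiplier times the transform
print(abs(lhs - rhs))
print(abs(fourier(f, xi) - math.exp(-math.pi * xi * xi)))
```

Applying the rule twice in each variable is what produces the multiplier $-4\pi^2|\xi|^2$ for the Laplacian, and its zero at $\xi = 0$ is the division problem the rest of the lecture works around.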

18.1 Smearing out by polynomials


This is the key lemma. We want to write $F(z) = G(z)/P(z)$.
Lemma 18.1. Let us write $P(z) = z^m + \sum_{k=0}^{m-1} b_k z^k$. If the function $F(z)$ is holomorphic on $|z| \le 1$, then
\[ |F(0)|^2 \le \frac{1}{2\pi}\int_{\theta=0}^{2\pi} |P(e^{i\theta})F(e^{i\theta})|^2\,d\theta. \]

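Proof. The idea is to reflect the zeros outside the unit circle. Let us write $P(z) = P_1(z)P_2(z)$ where the roots $\alpha$ of $P_1(z)$ satisfy $|\alpha| \ge 1$ and the roots $\beta$ of $P_2(z)$ satisfy $|\beta| < 1$. For $|\beta| < 1$, there is a Möbius transformation
\[ z \mapsto \frac{z - \beta}{1 - \bar{\beta}z} \]
which maps the circle to itself. The polynomial $P_2$ is the problem, so we reflect to define
\[ \tilde{P}_2(z) = \prod_{|\beta| < 1,\ P(\beta) = 0} (1 - \bar{\beta}z). \]
So if we let $\tilde{P}(z) = P_1(z)\tilde{P}_2(z)$ then all roots of $\tilde{P}$ are outside the open unit disk and
\[ |\tilde{P}(0)| = \prod_{|\alpha| \ge 1,\ P(\alpha) = 0} |\alpha| \ge 1, \qquad |P(e^{i\theta})F(e^{i\theta})| = |\tilde{P}(e^{i\theta})F(e^{i\theta})|. \]
But the mean-value property with Cauchy–Schwarz implies, for $H$ holomorphic on $|z| \le 1$,
\[ |H(0)|^2 \le \frac{1}{2\pi}\int_{\theta=0}^{2\pi} |H(e^{i\theta})|^2\,d\theta. \]
Applying it to $\tilde{P}F$ gives
\[ |\tilde{P}(0)F(0)|^2 \le \frac{1}{2\pi}\int_{\theta=0}^{2\pi} |\tilde{P}(e^{i\theta})F(e^{i\theta})|^2\,d\theta. \]
Since $|\tilde{P}(0)| \ge 1$ and $|\tilde{P}F| = |PF|$ on the circle, we are done.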
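Both steps of the proof can be observed numerically: reflecting the roots inside the disk does not change $|P|$ on the unit circle, and the circle average of $|PF|^2$ dominates $|F(0)|^2$. The roots of $P$ and the choice $F(z) = e^z$ below are hypothetical sample data.

```python
import cmath
import math

# Hypothetical roots of a monic polynomial P: two inside the unit disk, one outside.
ROOTS = [0.5 + 0j, 2.0 + 0j, 0.25j]

def P(z):
    """Monic polynomial with the roots above."""
    out = 1 + 0j
    for r in ROOTS:
        out *= (z - r)
    return out

def P_reflected(z):
    """Replace each factor (z - beta) with |beta| < 1 by (1 - conj(beta) z)."""
    out = 1 + 0j
    for r in ROOTS:
        out *= (1 - r.conjugate() * z) if abs(r) < 1 else (z - r)
    return out

def F(z):
    """A sample entire function."""
    return cmath.exp(z)

def circle_mean_sq(h, n=2000):
    """Average of |h|^2 over n equally spaced points of the unit circle."""
    return sum(abs(h(cmath.exp(2j * math.pi * k / n))) ** 2 for k in range(n)) / n

lhs = abs(F(0)) ** 2
rhs = circle_mean_sq(lambda z: P(z) * F(z))
max_gap = max(
    abs(abs(P(w)) - abs(P_reflected(w)))
    for w in (cmath.exp(2j * math.pi * k / 200) for k in range(200))
)
print(lhs, rhs, max_gap, abs(P_reflected(0)))
```

In this example $|\tilde{P}(0)| = 2$, so the average is in fact at least $4|F(0)|^2$; the lemma only keeps the weaker bound with constant 1 because that is all that holds for every monic $P$.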

18.2 Differential equations with constant coefficients


Suppose we want to solve $Lu = f$, so that taking the Fourier transform gives
$\widehat{Lu}(\xi) = Q(\xi)\hat u(\xi)$. Instead of dividing by $Q$, we want to use the Riesz
representation theorem together with an estimate.
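Let us first look at the formal adjoint of
\[ L = \sum_\alpha a_\alpha \Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha}, \]
which is
\[ L^* = \sum_\alpha (-1)^{|\alpha|}\bar a_\alpha \Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha}. \]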
We assume for now that Ω ⊆ Rn is a bounded domain where we are solving the
equation.
So it suffices to find a $c > 0$ such that
\[ \|\psi\|_{L^2(\Omega)} \le c\|L^*\psi\|_{L^2(\Omega)}. \]
If this is the case, we will be able to use Riesz representation to show that for
all $f \in L^2(\Omega)$ there exists a $u \in L^2(\Omega)$ such that $Lu = f$ and
$\|u\|_{L^2(\Omega)} \le c\|f\|_{L^2(\Omega)}$.
How do we go from $L^2$ to holomorphic? Assume that $\Omega = (-M, M)$ with
$M > 0$. Assume $g \in L^2(\mathbb{R})$ with $\operatorname{supp} g \subseteq \Omega$. Then I claim that
\[ \hat g(\xi + i\eta) = \int_{-M}^{M} g(x)e^{-2\pi i(\xi+i\eta)x}\,dx \]
is holomorphic on $\mathbb{C}$. To show this, we need to justify passing the differential
operator past $g$, under the integral sign and onto the exponential. This can be justified
by using the fundamental theorem of calculus and Fubini’s theorem.
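Now we want to check that $\|\psi\|_{L^2(\Omega)} \le c\|L^*\psi\|_{L^2(\Omega)}$. We do this by using
the Fourier transform, using the fact that $\hat{\ }$ preserves the $L^2$
norm. It is also done in the context of the Schwartz space $\mathcal{S}$. Let $\Omega \subseteq \mathbb{R}^{n+1}$
with coordinates $(x, y_1, \ldots, y_n)$. Assume that $L$ takes the form of
\[ L = B\frac{\partial^m}{\partial x^m} + \sum_{v=0}^{m-1} \frac{\partial^v}{\partial x^v}L_v, \qquad L_v = \sum_{\lambda_1,\ldots,\lambda_n} a_{\lambda_1,\ldots,\lambda_n}\frac{\partial^{\lambda_1+\cdots+\lambda_n}}{\partial y_1^{\lambda_1}\cdots\partial y_n^{\lambda_n}}. \]
We now check that
\[ \|\psi\|_{L^2(\Omega)} \le c\|L^*\psi\|_{L^2(\Omega)} \]
for all $\psi \in \mathcal{S}$. We can extend by zero and instead show
\[ \|\hat\psi\|_{L^2(\mathbb{R}^{n+1})} \le c\|Q(\xi)\hat\psi\|_{L^2(\mathbb{R}^{n+1})}. \]
But note that this is exactly in the situation of the key lemma. We need
\[ \int_{\xi,\sigma_1,\ldots,\sigma_n} |Q(\xi+i\eta,\sigma_1,\ldots,\sigma_n)\hat\psi(\xi+i\eta,\sigma_1,\ldots,\sigma_n)|^2 \]
at $\eta = 0$ to dominate some constant times $\int |\hat\psi(\xi,\sigma_1,\ldots,\sigma_n)|^2$. Let $z = \xi + re^{i\theta}$.
Then the key lemma (with $r = 1$) gives
\[ |\hat\psi(\xi,\sigma_1,\ldots,\sigma_n)|^2 \le \frac{C}{2\pi}\int_{\theta=0}^{2\pi} |(Q\hat\psi)(\xi+\cos\theta+i\sin\theta,\sigma_1,\ldots,\sigma_n)|^2\,d\theta. \]
So we have
\[ |\hat\psi(\xi,\sigma_1,\ldots,\sigma_n)|^2 \le \frac{C}{2\pi}\int_{\theta=0}^{2\pi} |\widehat{L^*\psi}(\xi+\cos\theta+i\sin\theta,\sigma_1,\ldots,\sigma_n)|^2\,d\theta. \]
But the value of $\widehat{L^*\psi}$ at the shifted point $(\xi+\cos\theta+i\sin\theta,\sigma_1,\ldots,\sigma_n)$ is the
Fourier transform, evaluated at $(\xi,\sigma_1,\ldots,\sigma_n)$, of the function
\[ (L^*\psi)(x,y)\,e^{-2\pi i(\cos\theta+i\sin\theta)x}. \]
Because $\Omega \subseteq [-M, M] \times \mathbb{R}^n$, the factor $e^{-2\pi i(\cos\theta+i\sin\theta)x}$ is bounded: its modulus is
$e^{2\pi x\sin\theta} \le e^{2\pi M}$. Integrating over $(\xi,\sigma)$ and using that the Fourier transform preserves the $L^2$ norm shows
that we have our inequality.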

19 November 2, 2017
Last time we looked at the method of Malgrange–Ehrenpreis. Here, we proved
the estimate
\[ \|u\|_{L^2(\Omega)} \le c\|L^*u\|_{L^2(\Omega)} \]
for all $u \in C_0^\infty(\Omega)$. Then we were able to conclude that $L$ is surjective and so
$Lu = f$ is always solvable.
Note that this is not a special case of the a priori estimate
\[ c^2(\|T^*g\|^2 + \|Sg\|^2) \ge \|g\|^2. \]
The a priori estimate needs to hold for all $g \in \operatorname{Dom} S \cap \operatorname{Dom} T^*$. But we have
only shown this for $g$ a compactly supported smooth function.
When we say that $Lu = f$ in the weak sense, we are simply saying that
$(L^*g, u) = (g, f)$ for all $g \in C_0^\infty(\Omega)$. This is not saying that $u$ is in the domain
of $L$ and $Lu = f$. To get this, we need some Friedrichs-type argument, and we have a
problem for higher order derivatives.

19.1 Distributions
Definition 19.1. A distribution or a generalized function is a continuous
linear functional on $C_0^\infty(\Omega) = \mathcal{D}(\Omega)$. The space of distributions is called $\mathcal{D}'(\Omega)$.
But what does continuous mean? This should mean that $\varphi_n \to 0$ implies
$T(\varphi_n) \to 0$.
Definition 19.2. We introduce a notion of convergence such that $\varphi_n \to 0$ means that there
exists a compact $K$ such that $\operatorname{supp}\varphi_n \subseteq K$ for all $n$, and $D^\alpha\varphi_n \to 0$ uniformly
for all $\alpha$.
So when $u \in L^2$, the function $Lu$ makes sense as a distribution. We define
it as
\[ (Lu, \varphi) = (u, L^*\varphi). \]
This is going to be a continuous functional.
This also comes up when you want $\Omega$ unbounded in Malgrange–Ehrenpreis.
If $\Omega \subseteq [-M, M] \times \mathbb{R}^n$ all is good because the Fourier transform is bounded, but
once $\Omega$ is too big we get into trouble. Then we are forced to use distributions.
Using this setting, we can use language like
\[ \Delta\Bigl(\frac{c_n}{|x|^{n-2}}\Bigr) = \delta(x). \]
Here, we want to have Fourier analysis in the setting of distributions. We
would have, for all $\varphi \in \mathcal{D}(\mathbb{R})$,
\[ \int_\xi \hat f(\xi)\varphi(\xi)\,d\xi = \int_\xi\int_x f(x)e^{-2\pi i x\xi}\varphi(\xi)\,dx\,d\xi. \]
So we would like to say $(\hat f)(\varphi) = f(\hat\varphi)$. But here we have trouble because $\hat\varphi$
doesn’t have compact support. I will talk about this more, but you can justify
this.
Let us write $E = c_n/|x|^{n-2}$ for the fundamental solution. Then
\[ \hat E(\xi) = -\frac{1}{4\pi^2|\xi|^2}. \]
We want to say something like this, but this can’t be justified even if we know
the answer.

19.2 Tempered distributions


We had this problem about ϕ̂ not being compactly supported even if ϕ is com-
pactly supported. This is a problem when defining T̂ , so we are going to change
our definition of a distribution.
Definition 19.3. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is the set of all functions $f \in C^\infty(\mathbb{R}^n)$
such that
\[ \sup_{x\in\mathbb{R}^n} |x|^a|D^\alpha f(x)| < \infty \]
for all $a, \alpha$. A tempered distribution is a distribution that extends to a
continuous functional on $\mathcal{S}(\mathbb{R}^n)$.
So far we’ve been looking at solving differential equations. If we directly take
the Fourier transform, then this is bad because we get division by zero. The
other thing we’ve been doing is to use Riesz representation. Here, we estimate
the L2 norm using the fact that the Fourier transform fixes the L2 norm.
Because we can’t directly apply the Fourier transform, we are going to use
some of its nice properties.
(0) $Df$ corresponds to $\hat f$ times a polynomial.
(1) $\widehat{f * g} = \hat f\cdot\hat g$.
(2) $\int \hat f g = \int f\hat g$.
(3) $\sum_n f(n) = \sum_n \hat f(n)$.

Recall that Plancherel follows from (1). If we let $g(x) = \overline{f(-x)}$ and evaluate
$f * g$ at 0, then
\[ \int |f|^2\,dx = (f * g)(0) = \int \widehat{(f * g)}(\xi)\,d\xi = \int |\hat f|^2\,d\xi. \]
The inversion formula follows from $e^{-\pi x^2}$ being its own Fourier transform and
the identity (2).
Theorem 19.4. Let $P(D)$ be a (constant coefficient) polynomial in the partial
differential operators on $\mathbb{R}^n$. There exists a fundamental solution $E$ for $P(D)$,
which means that $E$ is a (tempered) distribution on $\mathbb{R}^n$ with
\[ P(D)E = \delta_0, \]
where $\delta_0$ is the Dirac delta at 0.

We can basically repeat the same argument. But we are going to refine the
trick of Malgrange–Ehrenpreis. Recall that we were only using this trick when
$\Omega \subseteq [-M, M] \times \mathbb{R}^n$. Also, there was this one direction $x$ such that the highest term
is just $(\partial/\partial x)^m$. This is not true anymore, so we are going to average over all
the special coordinates. That is, we let $z_j = e^{i\theta_j}$ and average over the torus
$T^n$.
Let us write $P = P_0 + P_1 + \cdots + P_N$ where $P_\nu$ is homogeneous of degree $\nu$.
We are going to use the mean-value property of a holomorphic function. For $f$
a holomorphic function on $\mathbb{C}^n$, consider
\[ F(\lambda) = f(z + r\lambda w) \]
for fixed center $z \in \mathbb{C}^n$ and $w \in T^n$. The trick of Malgrange–Ehrenpreis then
gives
\[ r^N|P_N(w)||f(z)| \le \frac{1}{2\pi}\int_{\theta=0}^{2\pi} |(fP)(z + re^{i\theta}w)|\,d\theta \]
because $Q(\lambda) = P(z + r\lambda w)$ has leading coefficient $r^N P_N(w)$ as a polynomial in $\lambda$. If we average over $w$, we get
\[ |f(z)| \le \frac{A}{r^N}\int_{w\in T^n} |(fP)(z + rw)|\,\frac{d\theta_1}{2\pi}\cdots\frac{d\theta_n}{2\pi}, \]
where
\[ \frac{1}{A} = \int_{w\in T^n} |P_N(w)|\,\frac{d\theta_1}{2\pi}\cdots\frac{d\theta_n}{2\pi}. \]

20 November 7, 2017
We have looked at Malgrange–Ehrenpreis, and here we need the monic assumption
and also a bounded domain $\Omega \subseteq [-M, M] \times \mathbb{R}^n$. To get rid of the boundedness
assumption in one coordinate, we were forced to look at distributions.
Distributions form the dual of the space of test functions. The test functions supported on
a fixed compact set form a Fréchet space, and so the space $\mathcal{D}(\mathbb{R}^n)$ of all test functions
is a direct limit of Fréchet spaces. Then the dual is going to be the distributions.
Duals of Fréchet spaces are DF spaces, but we are looking at the dual of a direct
limit of Fréchet spaces.
Other than Malgrange–Ehrenpreis, there is a way of solving a differential
equation with probability. When I was at Yale, I once heard this ingenious idea
from Kakutani and was very impressed. Suppose you want to solve Dirichlet’s
problem with some boundary condition. To find the value of the solution at a point
in the middle, consider a random walk starting at this point. It is going
to hit the boundary at some finite time, and you look at the expected value of the
boundary function at the hitting point. This enjoys the harmonic property because
the walk has to first go somewhere, and then the expected value is the average of the expected values
of starting at the nearby points.
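The random-walk idea can be sketched on a grid. Everything specific below is an assumption made for the illustration and not part of the lecture: the domain is the square $\{0,\ldots,10\}^2$, the walk is the simple nearest-neighbour walk, and the boundary data is $g(x, y) = x$, chosen because the exact (discrete harmonic) solution is then $u(x, y) = x$.

```python
import random

def boundary_value(x, y):
    """Boundary data g(x, y) = x; it is discretely harmonic, so the exact answer is known."""
    return float(x)

def walk_estimate(start, size, n_walks, rng):
    """Average of the boundary data at the exit point of n_walks simple random walks."""
    total = 0.0
    for _ in range(n_walks):
        x, y = start
        while 0 < x < size and 0 < y < size:      # still strictly inside the square
            dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
            x, y = x + dx, y + dy
        total += boundary_value(x, y)
    return total / n_walks

rng = random.Random(0)            # fixed seed so the run is reproducible
estimate = walk_estimate((3, 5), 10, 4000, rng)
print(estimate)                   # should be close to the exact value 3
```

The estimate fluctuates around the exact value with an error of order $1/\sqrt{\text{number of walks}}$, which is the usual price of a Monte Carlo method.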

20.1 Malgrange–Ehrenpreis on unbounded region


We remove the monic assumption. Let $P(z_1, \ldots, z_n)$ be a polynomial
of degree $m$, with top-degree homogeneous part $P_m$. Then the role of the leading coefficient is played by
\[ C = \Bigl(\frac{1}{(2\pi)^n}\int_{\theta_j=0}^{2\pi} |P_m(e^{i\theta_1},\ldots,e^{i\theta_n})|\,d\theta_1\cdots d\theta_n\Bigr)^{-1} \]
and then we have the inequality
\[ |F(0)| \le \frac{C}{(2\pi)^n}\int_{\theta_j=0}^{2\pi} |(FP)(e^{i\theta_1},\ldots,e^{i\theta_n})|\,d\theta_1\cdots d\theta_n. \]
After you do this, we want to repeat the same technique. Instead of solving
$Lu = f$, we solve $LE = \delta_0$ and then we would get $u = E * f$. This $E$ is called
a fundamental solution, and we want it to be a tempered distribution. Note that for a
distribution $T$, we are going to define
\[ \Bigl(\frac{\partial}{\partial x_j}T\Bigr)\varphi = T\Bigl(-\frac{\partial\varphi}{\partial x_j}\Bigr), \]
because we want integration by parts. So if
\[ L = \sum_{|\alpha|\le m} a_\alpha\partial_x^\alpha, \]
then we set
\[ L_1 = \sum_{|\alpha|\le m} (-1)^{|\alpha|}a_\alpha\partial_x^\alpha. \]
Using this we can write $(LE)\varphi = E(L_1\varphi)$, so we want this $E$ to send $L_1\varphi$ to
$\varphi(0)$.
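The simplest instance of $(LE)\varphi = E(L_1\varphi) = \varphi(0)$ is $L = d/dx$ on $\mathbb{R}$, where $L_1 = -d/dx$ and $E$ is the Heaviside function acting by $E(\psi) = \int_0^\infty \psi$. The sketch below checks this numerically; the test function $\varphi(x) = (1 + x)e^{-x^2}$ and the truncation of the integral to $[0, 10]$ are assumptions made for the example.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def phi(x):
    # a Schwartz test function (hypothetical choice)
    return (1 + x) * math.exp(-x * x)

def dphi(x):
    # derivative of phi, computed by hand
    return (1 - 2 * x * (1 + x)) * math.exp(-x * x)

def E(test):
    # the distribution given by the Heaviside function, truncated at x = 10
    return integrate(test, 0.0, 10.0)

def L1(test_derivative):
    # L1 = -d/dx, applied to a test function whose derivative is supplied
    return lambda x: -test_derivative(x)

value = E(L1(dphi))      # (LE)(phi) = E(L1 phi)
print(value, phi(0))
```

Both printed numbers agree up to the quadrature error, which is the statement that the derivative of the Heaviside function is $\delta_0$.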
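Now it suffices to show that $L_1\varphi \mapsto \varphi(0)$ defined on $L_1\mathcal{S}(\mathbb{R}^n)$ is bounded.
Then we have Hahn–Banach and so extend to $\mathcal{S}(\mathbb{R}^n)$ with the same bound.
Note that on $\mathcal{S}(\mathbb{R}^n)$, the Fourier transform is an isomorphism of topological
vector spaces. Let $Q$ be the symbol (the characteristic polynomial) of $L_1$. Then
we want a bound for
\[ Q(\xi)\hat\varphi(\xi) \mapsto \varphi(0). \]
That is, we want to show that
\[ |\varphi(0)| \le C_\ell\|Q(\xi)\hat\varphi(\xi)\|_{\ell,\mathcal{S}}. \]
We apply Malgrange–Ehrenpreis to $\eta \mapsto Q(\eta+\xi)\hat\varphi(\eta+\xi)$ for fixed $\xi$. Then
we will get
\[ |\hat\varphi(\xi)| \le \frac{C}{(2\pi)^n}\int_{\theta_j=0}^{2\pi} |(Q\hat\varphi)(\xi_1+e^{i\theta_1},\ldots,\xi_n+e^{i\theta_n})|\,d\theta_1\cdots d\theta_n. \]
Then
\begin{align*}
|\varphi(0)| &\le \int_{\xi\in\mathbb{R}^n} |\hat\varphi(\xi)| \\
&\le \frac{C}{(2\pi)^n}\int_{\theta_j=0}^{2\pi}\int_{\xi\in\mathbb{R}^n} |(Q\hat\varphi)(\xi_1+e^{i\theta_1},\ldots,\xi_n+e^{i\theta_n})|\,d\xi\,d\theta_1\cdots d\theta_n \\
&= C\int_{\xi\in\mathbb{R}^n} |(Q\hat\varphi)(\xi)| \\
&= C\int_{\xi\in\mathbb{R}^n} \frac{1}{1+|\xi|^{n+1}}(1+|\xi|^{n+1})|Q\hat\varphi(\xi)| \le C'\|Q\hat\varphi\|_{\ell,\mathcal{S}},
\end{align*}
because we have rapid decay. Here you need to be a bit careful because we are
shifting in the complex direction, but this can be done.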
There is another way of handling this problem. If we have $LE = \delta_0$, then
we have $E(L_1\varphi) = \varphi(0)$. Then
\[ \varphi(0) = \int \hat\varphi(\xi) = \int \frac{Q\hat\varphi}{Q}. \]
If $Q$ has no zero, we are fine, but if $Q$ has a zero, we can’t just do this. So
what we can do is to shift the domain. Instead of $\xi_n$, we are going to use
$\xi_n + i\tilde\eta(\xi_1,\ldots,\xi_{n-1})$, where $\tilde\eta$ depends on $\xi_1,\ldots,\xi_{n-1}$. Then define
\[ E(\varphi) = \int_{\xi_1,\ldots,\xi_{n-1}}\int_{\xi_n\in\mathbb{R}} \Bigl(\frac{\hat\varphi}{Q}\Bigr)(\xi_1,\ldots,\xi_{n-1},\xi_n+i\tilde\eta(\xi_1,\ldots,\xi_{n-1})). \]
If we can choose $\tilde\eta$ so that $|Q| \ge c > 0$ on the shifted domain, we would have
\begin{align*}
E(L_1\varphi) &= \int_{\xi_1,\ldots,\xi_n} \hat\varphi(\xi_1,\ldots,\xi_{n-1},\xi_n+i\tilde\eta(\xi_1,\ldots,\xi_{n-1})) \\
&= \int_{\xi_1,\ldots,\xi_n} \hat\varphi(\xi_1,\ldots,\xi_n) = \varphi(0)
\end{align*}
because the Fourier transform is holomorphic. We still need to talk about
how to choose $\tilde\eta$, but this is the most elegant and explicit way to solve the
problem.
After this, we want to know whether the fundamental solution is smooth
outside the origin. This is mostly done in the case of elliptic operators, e.g.,
the Laplacian.

21 November 9, 2017
People were looking at simple linear PDEs with constant coefficients. When
$L$ is such a differential operator, we want to solve $LE = \delta_0$. Here $\mathcal{D}(\mathbb{R}^n)$
is the direct limit of the Fréchet spaces $\mathcal{D}_K(\mathbb{R}^n)$. This is not nice, so we look at
the Schwartz space $\mathcal{S}(\mathbb{R}^n)$, which is a Fréchet space. Another nice thing is that
the Fourier transform maps $\mathcal{S}(\mathbb{R}^n)$ to itself, and $\mathcal{S}(\mathbb{R}^n)'$ is the space of tempered distributions.
The fundamental solution was obtained first by Malgrange–Ehrenpreis by
reflection with the unit circle, and next by extension of a linear functional. We
wanted to take the Fourier transform and say $Q\hat E = 1$, but then we had this
problem of dividing by zero. What we want is
\[ E : L_1\varphi \mapsto \varphi(0) \]
and then extend it to $E : \mathcal{D}(\mathbb{R}^n) \to \mathbb{C}$. So we check that $E$ on $L_1(\mathcal{D}(\mathbb{R}^n))$ is
already continuous. That is, we want to check that for fixed $K$,
\[ |\varphi(0)| \le C_K\|L_1\varphi\|_{\ell_K,K}. \]
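The Malgrange–Ehrenpreis trick is
\[ |\hat\varphi(\xi)| \le \frac{C}{(2\pi)^n}\int_{0\le\theta_j\le 2\pi} |(Q\hat\varphi)(\xi_1+e^{i\theta_1},\ldots,\xi_n+e^{i\theta_n})|\,d\theta_1\cdots d\theta_n. \]
Then
\begin{align*}
|\varphi(0)| = \Bigl|\int_{\xi\in\mathbb{R}^n}\hat\varphi(\xi)\Bigr| &\le \int_{\xi\in\mathbb{R}^n} |\hat\varphi(\xi)| \\
&\le \frac{C}{(2\pi)^n}\int_{\xi\in\mathbb{R}^n}\int_{0\le\theta_j\le 2\pi} |(Q\hat\varphi)(\xi_1+e^{i\theta_1},\ldots,\xi_n+e^{i\theta_n})|\,d\theta_1\cdots d\theta_n\,d\xi.
\end{align*}
Here
\[ (Q\hat\varphi)(\xi_j+e^{i\theta_j}) = \int_{x\in K} (L_1\varphi)(x)\,e^{-2\pi i\sum_\nu x_\nu(\xi_\nu+\cos\theta_\nu+i\sin\theta_\nu)}\,dx \]
and so
\[ |(Q\hat\varphi)(\xi_j+e^{i\theta_j})| \le \int_{x\in K} |(L_1\varphi)(x)|\,e^{2\pi\sum_\nu|x_\nu|}\,dx \le C''_K\sup_{x\in K}|(L_1\varphi)(x)|. \]
But this is not good enough because we have to integrate over $\xi \in \mathbb{R}^n$. So
we need another bound. Consider $L_2$, the linear PDE with constant coefficients
with symbol $(1+\xi^2)^\ell$. Then the same bound gives us
\[ \Bigl|\Bigl(1+\sum_{\nu=1}^n(\xi_\nu+e^{i\theta_\nu})^2\Bigr)^\ell(Q\hat\varphi)(\xi_j+e^{i\theta_j})\Bigr| \le C''_K\sup_{x\in K}|(L_2L_1\varphi)(x)|. \]
Then some technical inequality gives
\[ |(Q\hat\varphi)(\xi_j+e^{i\theta_j})| \le C''_K\frac{8^\ell}{(1+\xi^2)^\ell}\sup_{x\in K}|(L_2L_1\varphi)(x)|. \]
Now we can integrate over $\xi$ and this gives
\[ |\varphi(0)| \le C'''_K\sup_{x\in K}|(L_2L_1\varphi)(x)|. \]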

21.1 Division problem for distributions


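The thing about tempered distributions is that the Fourier transform $\hat{\ }$ maps $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$. So it
also maps $\mathcal{S}(\mathbb{R}^n)'$ to $\mathcal{S}(\mathbb{R}^n)'$.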
If we want to solve LE = δ0 , the question is whether you can divide a
tempered distribution by a polynomial. In particular, is multiplication by a
polynomial Q surjective in S(Rn )0 ? If you go to the dual, we are asking if
multiplication by Q is injective with closed image in S(Rn ).
Our first goal is to show that multiplication by Q on S(Rn ) is injective, and
the second goal is to show that this implies that multiplication by Q in S(Rn )0
is surjective.
Proposition 21.1. For a nonzero polynomial Q, multiplication by Q on S(Rn )
is injective and has closed image.
Injectivity is no trouble, because Q vanishes on a very small set. The problem
is to check that the image is closed. This amounts to bounding the norm of ψ/Q
by the norm of ψ.
Let’s look at the simple case of a single variable. How can you bound
\[ (1+x^2)^\alpha\,\partial_x^\beta\Bigl(\frac{\psi}{(x-a_1)\cdots(x-a_n)}\Bigr) \]
in terms of $(1+x^2)^\alpha\partial_x^\beta\psi(x)$? This is going to be L’Hôpital’s rule. But what
about in higher dimensions? It took people something like 4 years to deal with
this. We are not going to do this, but there is a way of doing this, due to
Whitney. Basically you assign to each point some jet with some proximity
condition.

21.2 Changing the domain of integration


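In Malgrange–Ehrenpreis, we replace $\mathbb{R} \times \mathbb{R}^{n-1}$ by $S^1 \times \mathbb{R} \times \mathbb{R}^{n-1}$. But here we
replace it by $(\mathbb{R} + i\tilde\eta) \times \mathbb{R}^{n-1}$ in $\mathbb{C} \times \mathbb{R}^{n-1}$. If we move it well, we get an explicit
formula for $E$. We want
\[ LE(\varphi) = E(L_1\varphi) = \varphi(0). \]
If $|Q|$ is bounded from below, we would get
\[ E(\varphi) = \int_{\mathbb{R}^n} \frac{\hat\varphi}{Q}(\xi) \]
and this would be an answer. This is already good for tempered distributions.
This is because we would have
\[ E(L_1\varphi) = \int_{\mathbb{R}^n} \frac{\widehat{L_1\varphi}}{Q} = \int_{\mathbb{R}^n} \hat\varphi = \varphi(0). \]
For instance, if $L = (1-\Delta)^N$ then $Q = (1+4\pi^2|\xi|^2)^N$ and so we can just do
this.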

But what if $Q$ has a zero? We are going to choose $\tilde\eta_1$ such that for fixed
$\xi_2,\ldots,\xi_n$, we have $\{\eta_1 = \tilde\eta_1\}$ disjoint from the zeros of $Q(\xi_1+i\eta_1,\xi_2,\ldots,\xi_n)$.
We can do this so that $|Q|$ is even bounded from below.
If $\varphi \in \mathcal{D}_K(\mathbb{R}^n)$ then $\hat\varphi$ is holomorphic on $\mathbb{C}^n$. Then we can write
\[ E(\varphi) = \int \Bigl(\frac{\hat\varphi}{Q}\Bigr)(\xi_1+i\tilde\eta_1(\xi_2,\ldots,\xi_n),\xi_2,\ldots,\xi_n)\,d\xi_1\cdots d\xi_n. \]
We need to check that this is really a fundamental solution. We have
\[ E(L_1\varphi) = \int_{\mathbb{R}^n} \hat\varphi(\xi_1+i\tilde\eta_1(\xi_2,\ldots,\xi_n),\xi_2,\ldots,\xi_n), \]
and then we want to justify that this is equal to $\int_{\mathbb{R}^n}\hat\varphi(\xi_1,\ldots,\xi_n) = \varphi(0)$.
To justify this, we introduce $L_2$ with symbol $(1+\xi_1^2)^\ell$. Then
\begin{align*}
(1+(\xi_1+i\eta_1)^2)^\ell\,\hat\varphi(\xi_1+i\eta_1,\xi_2,\ldots,\xi_n) &= \widehat{(L_2\varphi)}(\xi_1+i\eta_1,\xi_2,\ldots,\xi_n) \\
&= \int_{\mathbb{R}^n} (L_2\varphi)(x_1,\ldots,x_n)\,e^{-2\pi i((\xi_1+i\eta_1)x_1+\xi_2x_2+\cdots+\xi_nx_n)}\,dx.
\end{align*}
Then todo

22 November 14, 2017


Last time we were solving $LE = \delta_0$ to get the fundamental solution $E$, in the
context of distributions. One reason we’re not happy with $E \in \mathcal{D}'(\mathbb{R}^n)$ is that
we can’t take the Fourier transform. To take the Fourier transform, we want
our distribution to be in $\mathcal{S}(\mathbb{R}^n)'$. So there is another question, of whether $E$
can be chosen to be a tempered distribution. This is a division problem which
was solved by Hörmander.
We’re not going to talk about this in great detail, but the main idea is to
generalize L’Hôpital’s rule. We want to show that
\[ \|f\|_{m',\mathcal{S}(\mathbb{R}^n)} \le C\|Pf\|_{m,\mathcal{S}(\mathbb{R}^n)}. \]
You can just try to expand the partial derivatives $D^\alpha(Pf)$ and do something,
which is actually what people did initially, but it quickly becomes complicated
as $n$ becomes large.

22.1 Elliptic regularity


There is the analytic Laplacian
\[ \Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2} \]
on $\mathbb{R}^n$. In the context of Hodge theory, there is the geometric Laplacian
\[ (dd^* + d^*d)\omega = -\Delta\omega. \]
In $\mathbb{R}^n$, $\Delta\omega$ is just applying the analytic Laplacian to each coefficient.
For elliptic operators, we have interior regularity.
Definition 22.1. Let $L$ be a single linear partial differential operator with
constant coefficients and order $m$. Then $L$ is said to be elliptic if
\[ |P(\xi)| \ge c|\xi|^m \]
for some $c > 0$ and all sufficiently large $|\xi|$, where $P$ is the symbol of $L$.
This just means that if $P_m$ is the principal (degree-$m$ homogeneous) part
of $P$ then $|P_m(\xi)| \ge c|\xi|^m$.
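The ellipticity condition is easy to probe numerically. The sketch below compares the Laplacian on $\mathbb{R}^2$ with the operator $\partial/\partial x_1 - \partial^2/\partial x_2^2$, which is a hypothetical non-elliptic example not taken from the notes; symbols are computed with the convention $\partial_j \leftrightarrow 2\pi i\xi_j$ used above.

```python
import math

def laplace_symbol(xi):
    """Symbol of the Laplacian: sum of (2 pi i xi_j)^2 = -4 pi^2 |xi|^2."""
    return -4 * math.pi ** 2 * sum(t * t for t in xi)

def heat_symbol(xi):
    """Symbol of d/dx1 - d^2/dx2^2, an order-2 operator that is not elliptic."""
    return 2j * math.pi * xi[0] + 4 * math.pi ** 2 * xi[1] ** 2

def ratio(symbol, xi, m):
    """|P(xi)| / |xi|^m, which ellipticity bounds from below for large |xi|."""
    return abs(symbol(xi)) / math.hypot(*xi) ** m

# The Laplacian: the ratio is the constant 4 pi^2 in every direction.
laplace_ratios = [ratio(laplace_symbol, (R * math.cos(a), R * math.sin(a)), 2)
                  for R in (10.0, 100.0) for a in (0.0, 0.5, 1.0, 1.5)]
# The second operator: along the xi_1 axis the ratio is 2 pi / R, which tends to 0.
heat_ratios = [ratio(heat_symbol, (R, 0.0), 2) for R in (10.0, 100.0, 1000.0)]
print(laplace_ratios, heat_ratios)
```

So no constant $c > 0$ works for the second operator, while $c = 4\pi^2$ works for the Laplacian.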
Definition 22.2. $L$ is hypo-elliptic if $Lu$ being $C^\infty$ on an open set $\Omega$ implies
that $u$ is $C^\infty$ on $\Omega$.
We want to prove that an elliptic $L$ is hypo-elliptic. Let $Lu = f$ be $C^\infty$. We introduce
an (almost) fundamental solution by ignoring the $C^\infty$ functions. This is also
called a parametrix.
Definition 22.3. Let $L$ be a constant-coefficient differential operator on $\mathbb{R}^n$.
Let $Q$ be a distribution on $\mathbb{R}^n$. The distribution $Q$ is said to be a parametrix
for $L$ if $LQ = \delta_0 + r$ with $r \in C^\infty(\mathbb{R}^n)$.

Theorem 22.4. If $L$ is elliptic then there exists a parametrix $Q$ which is regular
in the sense that $Q$ is $C^\infty$ on $\mathbb{R}^n - \{0\}$.
Proof. The proof uses properties of the symbol $P$. The original method of
Fourier analysis was to take the inverse Fourier transform of
\[ \frac{1}{P(\xi)} \]
to get $E$, if it is possible. There are two difficulties: one is when $P(\xi) = 0$,
and the other is at $\xi = \infty$, because it might not decay as fast as we want. To
solve the second problem, we multiply by a large power of $|x|$ on the other side of the Fourier transform, which
corresponds to applying a power of $\Delta$ in $\xi$. To solve the first
problem, we use a $C^\infty$ cut-off function $\gamma(\xi)$ on $\mathbb{R}^n$ such that $0 \le \gamma(\xi) \le 1$,
$\gamma(\xi) \equiv 0$ near $\xi = 0$, where $P(\xi)$ could be small, and $\gamma(\xi) \equiv 1$ for large $|\xi|$. We are going to take the
inverse Fourier transform of $\gamma/P$.
So define $Q$ as the inverse Fourier transform of $\gamma/P$. Let $\beta$ be an arbitrary
multi-index and consider
\[ \Delta^N\Bigl((2\pi i\xi)^\beta\frac{\gamma(\xi)}{P(\xi)}\Bigr). \]
Its decay at $\infty$ is like $|\xi|^{-(m+2N-|\beta|)}$, and if $2N + m - |\beta| > n$, then it is $L^2$. So its
inverse Fourier transform, which up to sign is
\[ (-4\pi^2|x|^2)^N\partial_x^\beta Q, \]
is $L^2$ on $\mathbb{R}^n$. This shows that $Q$ is $C^k$ on $\mathbb{R}^n - \{0\}$ for all $k$. Because
$\beta$ was arbitrary, we see that $Q$ is actually $C^\infty$ on $\mathbb{R}^n - \{0\}$.
Now we have
\[ \widehat{LQ} = P\cdot\frac{\gamma}{P} = \gamma = 1 + (\gamma - 1). \]
The function $\gamma - 1$ is $C^\infty$ on $\mathbb{R}^n$, with compact support, so its inverse Fourier
transform is $C^\infty$. Therefore $LQ - \delta_0$ is the inverse Fourier transform of $\gamma - 1$,
which is smooth.

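Generally, when $LE = \delta_0$ we can solve $Lu = f$ by taking the convolution
$u = E * f$. But we need something like $E$ being of compact support in order for
this to make sense.
The idea is to take a cut-off function $\chi_\epsilon$ such that $\chi_\epsilon \equiv 1$ on $B_{\epsilon/2}(0)$ and
$\operatorname{supp}\chi_\epsilon \subseteq B_\epsilon(0)$. We replace $Q$ by $\chi_\epsilon Q$. Then
\[ L(\chi_\epsilon Q) = \chi_\epsilon(LQ) + (L(\chi_\epsilon Q) - \chi_\epsilon LQ) = \delta_0 + \chi_\epsilon r + (L(\chi_\epsilon Q) - \chi_\epsilon LQ) \]
and $\chi_\epsilon r$ and $L(\chi_\epsilon Q) - \chi_\epsilon LQ$ are smooth.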


Theorem 22.5 (Elliptic implies hypo-elliptic). Let L be an elliptic operator on
Rn . Let U be a distribution on an open subset Ω of Rn and assume that LU is
C ∞ on Ω. Then U is C ∞ on Ω.
Proof. Consider an open set $\Omega' \subseteq \Omega$ such that $\Omega' + B_\epsilon(0) \subseteq \Omega$. Then there
exists a parametrix $Q$ for $L$ with support in $B_\epsilon(0)$.
Let $f = LU$. Then
\[ Q * f = (LQ) * U = (\delta_0 + r) * U = U + r * U. \]
Here $Q * f$ and $r * U$ are smooth, so $U$ is smooth on $\Omega'$, and we can let $\Omega'$ go to
$\Omega$.

23 November 16, 2017


For a distribution it is allowed to divide by a polynomial. This is all about the
fundamental solution. Then people tried to do this for tempered distributions.
To do this it suffices to show
\[ \|g\|_{\ell',\mathcal{S}} \le \tilde c\|Pg\|_{\ell,\mathcal{S}}. \]
Let me show you how to solve this in one dimension, and then other dimensions will be similar.

23.1 Division problem of tempered distributions by polynomials
Let me first tell you L’Hôpital’s rule from Taylor expansion. We have
\[ f(x) = f(a) + f'(a)(x-a) + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \int_{t=a}^x \frac{(x-t)^n}{n!}f^{(n+1)}(t)\,dt \]
and then the remainder is
\[ R_n(x) = \frac{(x-a)^{n+1}}{(n+1)!}f^{(n+1)}(a + O(x-a)). \]
If there is a zero at $a$ with order $\ge n$, then
\[ \frac{f(x)}{(x-a)^n} = \frac{f^{(n)}(a)}{n!} + \frac{x-a}{(n+1)!}f^{(n+1)}(a + O(x-a)). \]
Now consider the polynomial
\[ P(x) = \prod_j (x-a_j)^{k_j}. \]
Suppose $f, g \in \mathcal{S}(\mathbb{R})$ satisfy $f = Pg$. We want to bound the sup norm of
$g(x) = f(x)/P(x)$ where $\operatorname{ord}_{a_j} f \ge k_j$. We need to worry about the cases
(i) $x$ is near some $a_j$,
(ii) $x$ is bounded away from all zeros of $P$.

For the second case, we are dividing by a function bounded away from zero so we don’t have to
worry. For the first case, we are going to have
\[ \frac{f(x)}{P(x)} = \frac{f(x)/(x-a_j)^{k_j}}{P(x)/(x-a_j)^{k_j}}. \]
Then, where $c > 0$ is a lower bound for $|P(x)/(x-a_j)^{k_j}|$ near $a_j$,
\[ \Bigl|\frac{f(x)}{P(x)}\Bigr| \le \frac{1}{c}\sup_{|y-a|\le\eta}\frac{1}{h!}|f^{(h)}(y)| + \cdots \le C_\eta\sup_{\ell\le h+1,\ |y-x|\le\eta}|f^{(\ell)}(y)|. \]
For the points outside $\bigcup_j B_\eta(a_j)$, we have $|g| \le \tilde C|f|$ so
\[ \sup_{x\in\mathbb{R}}(1+|x|)^\ell|g(x)| \le C^*\sup_{y\in\mathbb{R},\ \nu\le\max_j h_j+1}(1+|y|^\ell)|f^{(\nu)}(y)|. \]
For derivatives, we have
\[ D^mf = \sum_{p=0}^m\binom{m}{p}(D^pP)(D^{m-p}g) = P(D^mg) + \sum_{p=1}^m\binom{m}{p}(D^pP)(D^{m-p}g). \]
Then
\[ P(D^mg) = D^mf - \sum_{p=1}^m\binom{m}{p}(D^pP)(D^{m-p}g) \]
and use induction to get the bound.
You can read Hörmander, Arkiv. für Math. 53 (1958), 555–568 or M. Atiyah,
Comm. Pure Appl. Math 23 (1970), 145–150. Here is what Atiyah did. In the
case when the polynomial is locally nice, like
\[ P = \prod_{\nu=1}^{p_j}(x_\nu - a_\nu(j))^{\ell_{\nu,j}}, \]
then we are good. So what you do is to change coordinates and blow up. For
instance, if you have $P(x, y) = x^3 - y^2$ then we can use $y = xu$ and $x = v$ to get
$u^2 = v$. Then you still get a manifold, and you can do analysis locally. After
you do the resolution of singularities and get a manifold, and also a “tempered
distribution” on the manifold, you push forward.

23.2 Sobolev embedding


When we have a distribution, we test against compactly supported functions. Then the next
best thing you can get is a tempered distribution, unless you have something
special like elliptic regularity. But we want to say how bad a distribution is.
Let $T$ be a compactly supported distribution. We define the order as the
minimal $\ell$ such that we can bound
\[ |T(g)| \le C\|g\|_{\ell,\mathcal{D}_K} = C\sup_{\nu\le\ell}|D^\nu g| \]
for a $K$ that contains a neighborhood of the support of $T$. For instance, $\partial_x^\alpha\delta_0$
has order $|\alpha|$ and functions have order 0.
To test the function, we need more spaces, like $L^2_k = W^{2,k}$. We want an
embedding $W_0^{p,k} \subseteq C_0^m$. Here, we can do stuff like
\[ \|Df\|_{L^2}^2 = \int |x\hat f(x)|^2 = \int x^2\hat f(x)\cdot\overline{\hat f(x)} \le \|D^2f\|_{L^2}\|f\|_{L^2}, \]
up to factors of $2\pi$. This suggests that we can use Hölder to bound things. The following is a
theorem of Nirenberg, Ann. d. Scuola Normale Superiore d. Pisa 13 (1959),
115–162.
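
Theorem 23.1 (Nirenberg). Let $u \in L^q(\mathbb{R}^n)$ and $D^mu \in L^r(\mathbb{R}^n)$ for $1 \le q, r \le \infty$.
Then
\[ \|D^ju\|_{L^p} \le C\|D^mu\|_{L^r}^a\|u\|_{L^q}^{1-a} \]
for $j/m \le a \le 1$, where $p$ is determined by $\frac{1}{p} = \frac{j}{n} + a\bigl(\frac{1}{r} - \frac{m}{n}\bigr) + \frac{1-a}{q}$. Actually there are two exceptions: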


• if j = 0 and rm < n, q = ∞, then we need the additional assumption
u ∈ Lq̃ for some q̃ > 0.
• if 1 < r < ∞ and m − j − n/r ∈ N ∪ {0} then the inequality holds for a
satisfying only j/m ≤ a < 1.
The main techniques will be:
• Applying the fundamental theorem of calculus applied to the radial direc-
tion and then averaging over the angular directions. This is the simplest
one, but cannot handle the critical “equal” condition.
• Fundamental theorem of calculus applied to each rectangular direction.
This is Nirenberg’s trick.
• Moser’s trick of iterating.
• Integration by parts to control more direction, which is similar to Gårding’s
inequality techniques.
Let me tell you the first simple trick, which shows
\[ W_0^{1,p}(\Omega) \subseteq L^\infty(\Omega). \]

Assume that Ω is bounded for simplicity. Take a point x ∈ Ω, and then integrate
along the radial direction to get something about x. Then average over the paths
around the sphere and use Hölder’s inequality.

24 November 21, 2017


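We had these four techniques in proving the Sobolev inequalities. The first
technique of applying the fundamental theorem of calculus in the radial direction works
in the following way. To avoid the boundary term, we are going to assume that
the function has compact support. We want to show that
\[ W_0^{p,1} \subseteq L^\infty_{\mathrm{loc}}(\mathbb{R}^n) \]
if $p > n$. Let $u \in C_c^\infty$. We want to show that
\[ \|u\|_{L^\infty} \le C\|u\|_{L^p_1}. \]
Let $\Omega \subset \mathbb{R}^n$ be a bounded set and assume $\operatorname{supp} u \subset \Omega$. Because $u$
vanishes outside a large ball, say of radius $T$ around $x$, we have
\[ u(x) = -\frac{1}{\operatorname{Vol}(S^{n-1})}\int_{\hat e\in S^{n-1}}\int_{t=0}^T \frac{\partial}{\partial t}u(x+t\hat e)\,dt\,d\hat e. \]
Then
\begin{align*}
|u(x)| &\le \frac{1}{\operatorname{Vol}(S^{n-1})}\int_{B^n(x,T)} |\operatorname{grad} u|\,\frac{1}{r_x^{n-1}} \\
&\le \frac{1}{\operatorname{Vol}(S^{n-1})}\Bigl(\int_{B^n(x,T)} |\operatorname{grad} u|^p\Bigr)^{1/p}\Bigl(\int_{B^n(x,T)} \Bigl(\frac{1}{r_x^{n-1}}\Bigr)^{p/(p-1)}\Bigr)^{1-\frac{1}{p}} \le C\|u\|_{L^p_1},
\end{align*}
where $r_x$ is the distance to $x$. Because $p > n$, we have $(n-1)p/(p-1) < n$ and the second integral is finite.
We can generalize this technique. For $m \ge 1$, we have the Taylor expansion
\[ f(x) = \sum_{h=0}^m \frac{1}{h!}f^{(h)}(a)(x-a)^h + \int_{t=a}^x \frac{1}{m!}f^{(m+1)}(t)(x-t)^m\,dt. \]
The result coming from this expansion is
\[ \|u\|_{L^\infty} \le C\|u\|_{W_0^{k,p}}. \]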

24.1 Nirenberg’s trick of integrating in a rectangular direction
This is quite impressive. You want to fix the axis direction and then integrate
in these directions. But to be fair, you need to do it from the other end too.
Consider the $j$th coordinate. We have
\[ |2u(x)| \le \int_{-\infty}^\infty |\partial_ju(x_1,\ldots,x_{j-1},t,x_{j+1},\ldots,x_n)|\,dt. \]
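In one variable this inequality says that $2|u(x)|$ is at most the total variation $\int|u'|$, with one copy of $|u(x)|$ coming from integrating from each end. A quick numerical sketch, assuming the rapidly decaying stand-in $u(t) = e^{-t^2}$ in place of a compactly supported function and truncating the integral to $[-10, 10]$:

```python
import math

def u(t):
    # rapidly decaying stand-in for a compactly supported function (an assumption)
    return math.exp(-t * t)

def du(t):
    # derivative of u, computed by hand
    return -2 * t * math.exp(-t * t)

def integrate(g, a, b, n=20000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

total_variation = integrate(lambda t: abs(du(t)), -10.0, 10.0)   # integral of |u'|
points = [-2.0, -0.5, 0.0, 0.3, 1.7]
for x in points:
    print(x, 2 * u(x), total_variation)
```

For this $u$ the bound is attained at the peak $x = 0$, where $2u(0) = 2 = \int|u'|$, which is why the factor 2 is kept in the notes.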

We take the product for all the $j$’s and get
\[ |2u(x)|^n \le \prod_{j=1}^n\int_{-\infty}^\infty |\partial_ju(x_1,\ldots,x_{j-1},t,x_{j+1},\ldots,x_n)|\,dt. \]
But this is only for one coordinate. So we take the $1/(n-1)$ power and integrate with respect to $x_1$. The
first factor is independent of $x_1$, so we can take it out and have $n-1$ factors.
Then by Hölder,
\[ \int |f_1\cdots f_{n-1}| \le \Bigl(\int |f_1|^{n-1}\Bigr)^{\frac{1}{n-1}}\cdots\Bigl(\int |f_{n-1}|^{n-1}\Bigr)^{\frac{1}{n-1}}. \]
Using this, we get
\[ \int_{x_1=-\infty}^\infty |2u(x)|^{\frac{n}{n-1}}\,dx_1 \le \Bigl(\int_{t=-\infty}^\infty |\partial_1u(t,x_2,\ldots,x_n)|\,dt\Bigr)^{\frac{1}{n-1}}\prod_{j=2}^n\Bigl(\int_{x_1=-\infty}^\infty\int_{t=-\infty}^\infty |\partial_ju(x_1,\ldots,x_{j-1},t,x_{j+1},\ldots,x_n)|\,dt\,dx_1\Bigr)^{\frac{1}{n-1}}. \]
But we have to integrate for $x_2$ too. Again, there is precisely one factor that does
not depend on $x_2$, so we can do the same thing. After all this, we get
\[ \int_{\mathbb{R}^n} |2u(x)|^{\frac{n}{n-1}} \le \prod_{j=1}^n\Bigl(\int_{\mathbb{R}^n} |\partial_ju|\Bigr)^{\frac{1}{n-1}}. \]
Then
\[ \Bigl(\int_{\mathbb{R}^n} |2u(x)|^{\frac{n}{n-1}}\Bigr)^{\frac{n-1}{n}} \le \frac{1}{n}\sum_{j=1}^n\int_{\mathbb{R}^n} |\partial_ju| \]
by the arithmetic-geometric mean inequality. This inequality is precise, even
with constants. Therefore
\[ \|u\|_{L^{n/(n-1)}} \le C\|u\|_{L^1_1}. \]

This is the $p = 1$ case. In general, we can generalize to $1 \le p < n$. The
inequality here is going to be
\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \le C\|u\|_{W_0^{1,p}(\mathbb{R}^n)}, \]
where the Sobolev conjugate exponent is
\[ p^* = \frac{np}{n-p} \quad\text{for } 1 \le p < n. \]
The usual way of getting this inequality is to replace $u$ by $|u|^\gamma$. Then the case
$p = 1$ implies
\[ \||u|^\gamma\|_{L^{\frac{n}{n-1}}(\mathbb{R}^n)} \le C\|D(|u|^\gamma)\|_{L^1(\mathbb{R}^n)}, \]
and then you can use Hölder on the right hand side and choose an appropriate
$\gamma$.
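The choice of $\gamma$ can be worked out explicitly. The following derivation is a sketch filling in the step left to the reader; it uses only $D(|u|^\gamma) = \gamma|u|^{\gamma-1}D|u|$ and Hölder with exponents $p$ and $p' = p/(p-1)$.

```latex
% Holder on the right-hand side of the p = 1 inequality:
\[
\Bigl(\int |u|^{\gamma\frac{n}{n-1}}\Bigr)^{\frac{n-1}{n}}
  \le C\gamma \int |u|^{\gamma-1}|Du|
  \le C\gamma \Bigl(\int |u|^{(\gamma-1)\frac{p}{p-1}}\Bigr)^{\frac{p-1}{p}} \|Du\|_{L^p}.
\]
% Choose gamma so that both powers of |u| agree:
\[
\gamma\frac{n}{n-1} = (\gamma-1)\frac{p}{p-1}
  \iff \gamma = \frac{p(n-1)}{n-p},
  \qquad\text{and then}\qquad
  \gamma\frac{n}{n-1} = \frac{np}{n-p} = p^*.
\]
% Divide by the common integral; the exponents combine as
\[
\frac{n-1}{n} - \frac{p-1}{p} = \frac{n-p}{np} = \frac{1}{p^*},
  \qquad\text{so}\qquad
  \|u\|_{L^{p^*}} \le C\gamma\,\|Du\|_{L^p}.
\]
```

Note that $\gamma = p(n-1)/(n-p) \ge 1$ exactly when $p \ge 1$, and $\gamma$ blows up as $p \to n$, which is consistent with the restriction $1 \le p < n$.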

24.2 Moser’s iteration trick


This is a way to get the sup norm estimate. I’ll illustrate it by recovering the
sup estimate from the spherical coordinate method. You can do this for the
rectangular integration, but I am trying to just tell you how this works. We
had the inequality
\[ \||u|^\gamma\|_{L^{\frac{n}{n-1}}} \le C\gamma\||u|^{\gamma-1}\|_{L^{\frac{p}{p-1}}}\|Du\|_{L^p} \]
for $p, \gamma > 1$, by Hölder. We are going to iterate this to get the estimate on
the sup norm.
First, note that we can scale the function so that $C\|Du\|_{L^p(\mathbb{R}^n)} = 1$. Then
\[ \||u|^\gamma\|_{L^{\frac{n}{n-1}}} \le \gamma\||u|^{\gamma-1}\|_{L^{\frac{p}{p-1}}}, \]
which means that
\[ \|u\|_{L^{\gamma n'}} \le \gamma^{\frac{1}{\gamma}}\|u\|^{\frac{\gamma-1}{\gamma}}_{L^{(\gamma-1)\frac{p}{p-1}}} = \gamma^{\frac{1}{\gamma}}\|u\|^{\frac{\gamma-1}{\gamma}}_{L^{(\gamma-1)p'}}, \]
where $n' = n/(n-1)$ and $p' = p/(p-1)$.
Let us assume that $p > n$, so that we have $n' > p'$. Then we increase the
norm exponent by a factor of $n'/p'$ every time we apply the inequality. Here, we
can be a bit generous and replace the $L^{(\gamma-1)p'}$ norm by the $L^{\gamma p'}$ norm, using
Hölder. There will be a constant coming from the volume of $\Omega$, but we can
again scale $\Omega$ so that its volume is 1. Then
\[ \|u\|_{L^{\gamma n'}} \le \gamma^{1/\gamma}\|u\|^{1-\frac{1}{\gamma}}_{L^{\gamma p'}}. \]
Write $\delta = n'/p'$. If you iterate this with $\gamma = \delta^\mu$, you get
\[ \|u\|_{L^{\delta^\nu n'}} \le \prod_{\mu=1}^\nu (\delta^\mu)^{\frac{1}{\delta^\mu}}. \]
This converges and therefore
\[ \|u\|_{L^\infty(\mathbb{R}^n)} \le C\|u\|_{W_0^{1,p}(\mathbb{R}^n)}. \]
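The convergence of the constant in Moser’s iteration can be seen numerically. The sketch below assumes the constant is the product $\prod_{\mu\ge 1}(\delta^\mu)^{1/\delta^\mu}$ with $\delta = n'/p' > 1$; its logarithm is $\log\delta\cdot\sum_\mu \mu\delta^{-\mu}$, and the identity $\sum_\mu \mu x^\mu = x/(1-x)^2$ gives the closed form $\delta^{\delta/(\delta-1)^2}$. The values $n = 3$, $p = 6$ are hypothetical.

```python
import math

def moser_constant(delta, nu):
    """Partial product of (delta**mu) ** (1 / delta**mu) for mu = 1..nu."""
    log_total = sum(mu * math.log(delta) / delta ** mu for mu in range(1, nu + 1))
    return math.exp(log_total)

def moser_limit(delta):
    """Closed form of the infinite product, using sum(mu * x**mu) = x / (1 - x)**2."""
    return delta ** (delta / (delta - 1) ** 2)

n, p = 3, 6                      # hypothetical dimension and exponent with p > n
n_conj = n / (n - 1)             # n' = n/(n-1)
p_conj = p / (p - 1)             # p' = p/(p-1)
delta = n_conj / p_conj          # delta = n'/p' > 1 exactly when p > n
partial = [moser_constant(delta, nu) for nu in (1, 5, 20, 200)]
print(partial, moser_limit(delta))
```

The partial products increase to the closed-form limit. As $p \to n$ we get $\delta \to 1$ and the limit blows up, reflecting the failure of the sup estimate in the critical case.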

25 November 28, 2017


We want to look at measures in general. We want something like the Lebesgue
decomposition, into an absolutely continuous part and a jump part.
We are also going to look at ergodicity. Suppose $T : X \to X$ is a measure-preserving
map. We want to say something about the convergence of
\[ \frac{1}{m}\sum_{k=0}^{m-1} T^k(f) \to \text{const}. \]
Here, $T$ is assumed to be ergodic in the sense that $T(S) = S$ up to measure zero
implies $S = \emptyset$ or $S = X$ up to measure zero. Birkhoff proved this, and we are going to do this in
three parts. First we will prove convergence without assuming ergodicity. This
was motivated by statistical mechanics.

25.1 Radon–Nikodym derivative


This is the second half of the fundamental theorem of calculus in Lebesgue
theory. We have introduced the maximal function
\[ f^*(x) = \sup_{x\in B}\frac{1}{m(B)}\int_B |f|. \]
If $f \in L^1$, then it is not true that $f^* \in L^1$. But it is true that $f \in L^2$ implies
$f^* \in L^2$. This was assigned in the homework. Then we can show that
\[ \frac{d}{dx}\int_a^x f(t)\,dt = f(x) \]
almost everywhere.
Now we want to do the other part. This is differentiating and then integrat-
ing.
Definition 25.1. A measure space $(X, \mu)$ contains the data of a $\sigma$-algebra,
a collection $\mathcal{E}$ of subsets of $X$ containing $\emptyset, X$ that is closed under countable unions and
complements, and a measure $\mu : \mathcal{E} \to [-\infty, \infty]$ with countable additivity.
Example 25.2. Let $X = \mathbb{R}^k$ with the standard metric. Let $\mathcal{E}$ be the $\sigma$-algebra
generated by the open sets, the Borel $\sigma$-algebra. Then we can define $\mu$ as the
Lebesgue measure.
We can do the same thing, and we get monotone convergence, dominated
convergence, and Fatou. I’m not going to do this because we’ve done it and
everything works the same way.
There still is the question of how to define a measure on a $\sigma$-algebra. The
way to do it is to define an exterior measure first, as an inf over countable covers
of the set, and then single out the measurable sets. The usual way to do this is Carathéodory’s
criterion. This says that $A$ is measurable if and only if $m^*(A \cap E) + m^*(E - A) = m^*(E)$
for all $E$.

Theorem 25.3. Assume that $\mu, \nu$ are two (signed) $\sigma$-finite measures (meaning that $|\mu|$ and $|\nu|$ are $\sigma$-finite).
Then we can decompose $\nu = \nu_a + \nu_s$ where $\nu_a$ is absolutely continuous
with respect to $\mu$ (i.e., $\mu(E) = 0$ implies $\nu_a(E) = 0$) and $\nu_s$ is (mutually) singular
with respect to $\mu$ (i.e., $X = A \cup B$ for disjoint $A, B$ with $\operatorname{supp}\nu_s \subseteq B$ and
$\operatorname{supp}\mu \subseteq A$). Moreover,
\[ \nu_a(E) = \int_E f\,d\mu \]
for some $f$ measurable on $X$.
Proof. Von Neumann’s proof is the most elegant. This uses Riesz representation.
Let us consider the special case $\mu, \nu \ge 0$ and $\mu(X), \nu(X) < \infty$. Let $\rho = \mu + \nu$
and consider the Hilbert space $L^2(X, \rho)$. Consider the linear functional
\[ \Phi(f) = \int f\,d\nu. \]
Then this is continuous because $X$ has finite measure, and so there exists a
$g \in L^2(X, \rho)$ such that
\[ \Phi(f) = \int_X f\,d\nu = \int_X fg\,d\rho. \]
His main idea is that all the information between $\mu$ and $\nu$ is encoded in $g$. First
note that $0 \le g \le 1$ almost everywhere, because otherwise testing with $f = \chi_E$ for $E = \{g < 0\}$ or $E = \{g > 1\}$ gives a contradiction. Now
consider the sets
\[ B = \{g = 1\}, \qquad A = \{g < 1\}. \]
Then $\mu(B) = 0$ by testing with $f = \chi_B$.
Now what happens in $E \subseteq A$? Rewriting the identity as $\int f(1-g)\,d\nu = \int fg\,d\mu$ and setting $f = \chi_E(1 + g + \cdots + g^{n-1})$, we get
\[ \int_E (1 - g^n)\,d\nu = \int_E g(1 + g + \cdots + g^{n-1})\,d\mu. \]
As $n \to \infty$, we get, by monotone convergence,
\[ \nu(E) = \int_E \frac{g}{1-g}\,d\mu. \]
If it is not finite but $\sigma$-finite, you take the union, and if it is signed, you can just
separate into positive and negative parts.
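On a finite set von Neumann’s argument becomes a direct computation, which makes a convenient sanity check. In the sketch below the four-point space and the weights of $\mu$ and $\nu$ are hypothetical; on a finite set the Riesz representative is simply $g = \nu/\rho$ pointwise.

```python
from fractions import Fraction as Fr

# A hypothetical four-point space with point weights for mu and nu.
mu = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 4), "d": Fr(0)}
nu = {"a": Fr(1, 3), "b": Fr(0), "c": Fr(1, 2), "d": Fr(1, 6)}

rho = {x: mu[x] + nu[x] for x in mu}             # rho = mu + nu
g = {x: nu[x] / rho[x] for x in mu}              # represents f -> integral of f d(nu) in L^2(rho)
B = {x for x in mu if g[x] == 1}                 # carries the singular part
A = {x for x in mu if g[x] < 1}                  # carries the absolutely continuous part
density = {x: g[x] / (1 - g[x]) for x in A}      # d(nu_a)/d(mu) = g / (1 - g)

def nu_a(E):
    """Absolutely continuous part: integral of the density against mu over E."""
    return sum((density[x] * mu[x] for x in E if x in A), Fr(0))

def nu_s(E):
    """Singular part: nu restricted to B."""
    return sum((nu[x] for x in E if x in B), Fr(0))

print(B, density)
```

Here $B = \{d\}$ is the only point where $\mu$ vanishes but $\nu$ does not, and $\nu_a + \nu_s$ recovers $\nu$ exactly on every point.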

25.2 Ergodic measure theory


Let $(X, \mu)$ be a nonnegative measure with $\mu(X) = 1$. Let $T : X \to X$ be a
measurable map and assume that it is an isometry, i.e., measure-preserving.
(Usually it is not assumed to be bijective.) Our goal is to look at the stable
situation. We want to look at the limit $\lim_{n\to\infty}T^n$, but this is too complicated.
So we use the Cesàro limit. In that case, we are looking at
$\frac{1}{m}(1 + T + T^2 + \cdots + T^{m-1})$.

Of course, you can’t add measure spaces. So you take a function $f$ defined
on $X$, and let $\tau : X \to X$ be an isometry. Then we define
\[ (Tf)(x) = f(\tau(x)). \]
So the goal now is to look at the average
\[ A_m(f) = \frac{1}{m}\sum_{h=0}^{m-1} T^h(f). \]
For functions, there are different kinds of convergences. There is strong


convergence, convergence in norms. There is convergence in measure, and there
is almost everywhere convergence. People were interested in almost everywhere
convergence, because they wanted to get back to the space. We’ll discuss only
the case (X, µ) with µ ≥ 0 and µ(X) = 1.
Theorem 25.4 (Mean ergodic theorem). Let τ : X → X be measure-preserving
and let (T f )(x) = f (τ (x)). Then for f ∈ L2 (X),
m−1
1 X h
Am (f ) = T f → P (f ),
m
h=0

as m → ∞, where P (f ) is the L2 -projection to S = {f : T f = f } = ker(I − T ).


Proof. The key is to decompose $H = S \oplus S^\perp$, where $H = L^2(X)$. Clearly T
maps S to S. Now we need to know the behavior of T on $S^\perp$. Note that
$\ker(I - T) = S$ is the orthogonal complement of $\operatorname{im}(I - T^*)$.
But note that T is an isometry, so $T^* T = I$. (We have $(T^* T f, g) = (Tf, Tg) = (f, g)$.)
We want to show that $Tf = f$ if and only if $T^* f = f$. We know that $Tf = f$
implies $T^* f = T^* T f = f$. For the other direction, we use the equality
condition in Cauchy–Schwarz. If $T^* f = f$, then

\[ \|f\|^2 = (f, f) = (f, T^* f) = (Tf, f) \le \|Tf\| \|f\| = \|f\|^2. \]

So equality holds, hence $Tf = cf$, and then c = 1 and so $Tf = f$.


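What this shows is that $S = \ker(I - T^*) = \ker(I - T)$. So the orthogonal
complement of S is the closure of $\operatorname{im}(I - T)$.
So for every ε > 0, every f ∈ H can be decomposed as

\[ f = F_0 + (I - T) F_1 + F_2 \]

where $F_0 \in S$ and $\|F_2\| < \epsilon$. Then $A_m F_0 = F_0$, the term $(I - T) F_1$
telescopes to $A_m (I - T) F_1 = \frac{1}{m}(F_1 - T^m F_1)$, which has norm at most
$2\|F_1\|/m$ and contributes nothing in the limit, and $\|A_m F_2\| \le \|F_2\| < \epsilon$.
So $A_m f$ converges to $F_0 = P(f)$ in $L^2$.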
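As a sanity check on the mean ergodic theorem, here is a small sketch on a finite set with the uniform probability measure, where any permutation τ is measure-preserving. In that toy setting the invariant functions are the ones constant on each cycle of τ, so the projection P averages f over cycles. The function names, the size n = 12, and the random data are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def cesaro_average(f, tau, m):
    """A_m f = (1/m) * sum_{h<m} T^h f, where (T g)(x) = g(tau(x))."""
    total = np.zeros(len(f))
    g = np.asarray(f, dtype=float).copy()
    for _ in range(m):
        total += g       # g currently holds T^h f
        g = g[tau]       # apply T once more: g[tau][x] == g[tau[x]]
    return total / m

def invariant_projection(f, tau):
    """Orthogonal projection onto {f : f o tau = f}: average f over each cycle of tau."""
    n = len(tau)
    out = np.empty(n)
    seen = np.zeros(n, dtype=bool)
    for start in range(n):
        if seen[start]:
            continue
        cycle, x = [], start
        while not seen[x]:           # walk the cycle through `start`
            seen[x] = True
            cycle.append(x)
            x = tau[x]
        out[cycle] = f[cycle].mean()
    return out

rng = np.random.default_rng(0)
n = 12
tau = rng.permutation(n)             # a bijection, hence measure-preserving here
f = rng.normal(size=n)
P_f = invariant_projection(f, tau)

def l2_error(m):
    """L^2 distance (uniform probability measure) between A_m f and P f."""
    return np.sqrt(np.mean((cesaro_average(f, tau, m) - P_f) ** 2))

for m in (1, 10, 100, 1000):
    print(m, l2_error(m))
```

The printed errors should decay roughly like 1/m, which matches the telescoping estimate in the proof.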
Theorem 25.5 (Maximal ergodic theorem). Let $f \in L^1(X, \mu)$ and define a
maximal function
\[ f^*(x) = \sup_{m \ge 1} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|. \]

Then there exists a universal constant A > 0 such that for every α > 0,
\[ \mu(\{x : f^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1}. \]
Proof. Let us first prove the special case X = Z, µ the counting measure, and
τ : n ↦ n + 1. We do this by replacing Z by R and replacing points with half-open
intervals of length 1. Then this just follows from the Hardy–Littlewood
bound.
The general case can be reduced to this. Consider X × Z with the product
measure, with the map (x, n) ↦ (τx, n + 1). Applying the special case along each
orbit n ↦ f(τ^n(x)) and then integrating over x, using that τ is
measure-preserving, gives the bound; this has to be done carefully.
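The weak-type bound can be tested numerically in the simplest setting: a finite set with the uniform probability measure and τ a permutation. The sketch below computes f* directly and reports α µ{f* > α}/‖f‖_{L^1}. As far as I know, in this setting the ratio never exceeds 1 (this is what the Hopf form of the maximal inequality gives), but treat the constant 1 as an assumption of the sketch rather than something proved in these notes. All names are illustrative.

```python
import numpy as np

def maximal_function(f, tau):
    """f*(x) = sup_m (1/m) sum_{k<m} |f(tau^k x)|, for tau a permutation of range(n).

    For a permutation every orbit is periodic with period at most n, and an
    average over m terms is a convex combination of averages over at most
    one period, so taking m = 1, ..., n already attains the supremum.
    """
    n = len(f)
    g = np.abs(np.asarray(f, dtype=float))
    partial = np.zeros(n)
    best = np.zeros(n)
    for m in range(1, n + 1):
        partial += g                      # adds |f|(tau^(m-1) x)
        g = g[tau]
        best = np.maximum(best, partial / m)
    return best

def weak_type_ratio(f, tau, alpha):
    """alpha * mu{f* > alpha} / ||f||_1 with mu the uniform probability measure."""
    fstar = maximal_function(f, tau)
    return alpha * np.mean(fstar > alpha) / np.mean(np.abs(f))

rng = np.random.default_rng(2)
n = 200
tau = rng.permutation(n)
f = rng.standard_cauchy(size=n)           # heavy tails make the bound more interesting

for alpha in (0.5, 1.0, 2.0, 5.0, 20.0):
    print(alpha, weak_type_ratio(f, tau, alpha))
```

For a map τ that is not bijective the truncation to m ≤ n is no longer justified, so the helper would need a different stopping rule.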
Theorem 25.6 (Pointwise ergodic theorem). Suppose $f \in L^1(X)$ where µ(X) =
1. Let τ : X → X be measure preserving. Then for almost all x ∈ X,
\[ A_m(f)(x) = \frac{1}{m} \sum_{h=0}^{m-1} f(\tau^h(x)) \to P'(f)(x) \]
as m → ∞, and by Fatou,
\[ \int_X |P' f| \, d\mu \le \int_X |f| \, d\mu. \]
Proof. First note that $L^2$ is dense in $L^1$ by truncating the range. Here, we have
the decomposition $H = S \oplus S^\perp$ and so $f = F_0 + (I - T) F_1 + F_2$, with $F_2$
small in $L^1$ norm. We don’t need anything for $F_0$, and $(I - T) F_1$ also telescopes
well. To deal with $F_2$, we use the maximal bound. Then the set where the averages
do not converge has measure zero.

26 November 30, 2017


There is the Radon transform, which is used in tomography. This is similar to
the Fourier transform: we are trying to recover f from the integrals of f on lines.
Here, we can actually recover $\hat{f}$ from this information, because the Fourier
transform is itself an integral of these.

26.1 Hausdorff measure


These are used for fractals, or self-similar objects. What is the main difference
between Hausdorff measures and Euclidean/Lebesgue measures? For Lebesgue
measures, we started out with building blocks, intervals or cubes. So these are
volume-based measures. But then you need to know the dimension to start out
with.
The Hausdorff measure is diameter-based. This is much more powerful, because
it is defined for any metric space, regardless of the dimension. Then you
can use it to test and define the dimension.
Definition 26.1. Let X be a metric space, and let E be a subset. Given δ > 0,
choose a d for testing the dimension. We define
\[ H_d^\delta(E) = \inf \Big\{ \sum_j (\operatorname{diam} F_j)^d : E \subseteq \bigcup_j F_j \text{ and } \operatorname{diam} F_j < \delta \Big\}. \]

Then the exterior measure is defined as

\[ m_d^*(E) = \lim_{\delta \to 0} H_d^\delta(E). \]

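Note that if $m_\alpha^*(E) < \infty$ for some α ≥ 0, then $m_\beta^*(E) = 0$ for β > α. This
is because there is an extra factor of $\delta^{\beta - \alpha}$: if diam $F_j$ < δ, then
$(\operatorname{diam} F_j)^\beta \le \delta^{\beta - \alpha} (\operatorname{diam} F_j)^\alpha$. Likewise, if $m_\alpha^*(E) > 0$ for some
α > 0 then $m_\beta^*(E) = \infty$ for β < α.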
Definition 26.2. The Hausdorff dimension is defined as sup{α : m∗α (E) =
∞} = inf{α : m∗α (E) = 0}.
If F1 and F2 are positively separated sets, then m∗α (F1 )+m∗α (F2 ) = m∗α (F1 ∪
F2 ). This is because we can set δ sufficiently small and they don’t interfere.
Example 26.3. Consider the Cantor set C ⊆ [0, 1]. Its Hausdorff dimension is
α = log 2/ log 3. We can cover C by $2^k$ intervals of the form $[3^{-k} l, 3^{-k}(l + 1)]$,
each of diameter $3^{-k}$, and $2^k (3^{-k})^\alpha = 1$, so $m_\alpha^*(C) \le 1$. The other
direction uses the Cantor–Lebesgue function. This function is α-Hölder continuous
and maps C onto [0, 1]. So we get $m_\alpha^*(C) > 0$.
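The covering computation in this example is easy to tabulate. The sketch below evaluates the sum of (diam F_j)^d over the 2^k level-k intervals for a few test exponents d; the function name and the chosen exponents are illustrative. Note that these sums are only upper bounds for $H_d^\delta(C)$ coming from one particular cover, so the blow-up for d < α is suggestive rather than a proof that the measure is infinite.

```python
import math

ALPHA = math.log(2) / math.log(3)    # claimed Hausdorff dimension of the Cantor set

def cantor_cover_sum(k, d):
    """Sum of (diam F_j)^d over the 2^k intervals of length 3^-k covering C."""
    return (2 ** k) * (3.0 ** (-k)) ** d

for d in (0.5, ALPHA, 0.7):
    sums = [cantor_cover_sum(k, d) for k in (1, 5, 10, 20)]
    print(f"d = {d:.4f}:", sums)
```

At d = α every level gives exactly 1 up to rounding, for d > α the sums tend to 0, and for d < α they grow without bound.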

Lemma 26.4. Let f : E → F, where E is compact and F = f(E). If f is
γ-Hölder continuous with $\operatorname{dist}(f(x), f(y)) \le M \operatorname{dist}(x, y)^\gamma$, then

\[ m_\beta^*(f(E)) \le M^\beta m_\alpha^*(E) \]

where β = α/γ.

Example 26.5. There is also the Sierpinski triangle. This set is a triangle,
minus the middle triangle of side length 1/2, and from each of the three remaining
triangles remove the half-size middle triangle, and so on. This has dimension
log 3/ log 2.
Example 26.6. There is also the Koch curve. There is also the Peano curve,
or the space-filling curve.

There are also self-similar constructions, with rescaling by a factor of κ.


Theorem 26.7. Let $S_1, \dots, S_m$ be m separated similarities with ratio κ, where
0 < κ < 1. That is, there exists a bounded open set O such that the $S_j(O)$ are
disjoint subsets of O. Then there exists a unique nonempty compact set F such that
$F = S_1(F) \cup \cdots \cup S_m(F)$. Moreover, its Hausdorff dimension is
log m/ log(1/κ).
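The dimension formula log m/ log(1/κ) can be evaluated for the examples mentioned above. The helper below is a hypothetical convenience function, and the (m, κ) pairs are the standard ones for these sets: the Cantor set uses 2 maps of ratio 1/3, the Sierpinski triangle 3 maps of ratio 1/2, and the Koch curve 4 maps of ratio 1/3.

```python
import math

def similarity_dimension(m, kappa):
    """Return log m / log(1/kappa) for m similarities with common ratio kappa."""
    if m < 1 or not (0.0 < kappa < 1.0):
        raise ValueError("need m >= 1 and 0 < kappa < 1")
    return math.log(m) / math.log(1.0 / kappa)

examples = {
    "Cantor set": (2, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
    "Koch curve": (4, 1 / 3),
}
for name, (m, kappa) in examples.items():
    print(name, similarity_dimension(m, kappa))
```

The formula only applies when the separation hypothesis of the theorem holds; the function does not check that.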

26.2 Radon transform


The emphasis here is on the smoothness, or the Hölder continuity of the result.
Let me tell you the answer first.

Definition 26.8. The Radon transform is, for a function f on $\mathbb{R}^d$, t ∈ R,
and γ a unit direction in $\mathbb{R}^d$,
\[ R(f)(t, \gamma) = \int_{P_{t,\gamma}} f. \]

Here $P_{t,\gamma} = \{x : x \cdot \gamma = t\}$.

The result concerns “well-posedness”. In the real situation, everything is
measured approximately. So it would be bad if a small error in measurement
gives a large change in the inverse function. Consider

\[ R^*(f)(\gamma) = \sup_{t \in \mathbb{R}} |R(f)(t, \gamma)| \]

and then we are going to use $\int_{S^{d-1}} R^*(f)(\gamma) \, d\sigma(\gamma)$ as the measure of the error.

Theorem 26.9. We have
\[ \int_{S^{d-1}} R^*(f)(\gamma) \, d\sigma(\gamma) \le c (\|f\|_{L^1} + \|f\|_{L^2}) \]
for d ≥ 3 and some c > 0 only depending on d.

Theorem 26.10. If f is continuous on $\mathbb{R}^d$ with compact support, with d ≥ 3,
then
\[ \int_{S^{d-1}} \sup_{t_1 \ne t_2} \frac{|R(f)(t_1, \gamma) - R(f)(t_2, \gamma)|}{|t_1 - t_2|^\alpha} \, d\sigma(\gamma) \le c (\|f\|_{L^1} + \|f\|_{L^2}) \]
for 0 < α < 1/2.



If we want to invert the Radon transform, we perform the Fourier transform
in the remaining variable. So if we do the Fourier transform only in t, we get

\[ \hat{R}(f)(\tau, \gamma) = \hat{f}(\tau \gamma). \]

This is why we need the condition d ≥ 3. The main tool we were using was the
invariance of $L^2$ under the Fourier transform. Here, we are using the Euclidean
version and the spherical version. So if f is continuous and compactly supported,
then
\[ \int_{S^{d-1}} \int_{-\infty}^{\infty} |\hat{R}(f)(\lambda, \gamma)|^2 |\lambda|^{d-1} \, d\lambda \, d\sigma(\gamma) = 2 \int_{\mathbb{R}^d} |f(x)|^2 \, dx. \]
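The slice identity $\hat{R}(f)(\tau, \gamma) = \hat{f}(\tau\gamma)$ has an exact discrete analogue that is easy to check with the FFT. The sketch below works on a small periodic grid in dimension 2 (the identity itself does not need d ≥ 3) and only uses the two axis directions, where summing along grid lines plays the role of integrating over the hyperplanes $P_{t,\gamma}$. The array `f` and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=(8, 6))      # hypothetical samples f[n1, n2] on a periodic grid

# Discrete "Radon transform" in direction gamma = e_1: for each t = n1,
# sum f over the line {n1 = t}, i.e. integrate out the orthogonal coordinate.
radon_e1 = f.sum(axis=1)
radon_e2 = f.sum(axis=0)         # same thing for gamma = e_2

# Fourier transform in the remaining variable t only ...
lhs_e1 = np.fft.fft(radon_e1)
lhs_e2 = np.fft.fft(radon_e2)

# ... equals the full 2-D transform restricted to the line through 0 in direction gamma.
f_hat = np.fft.fft2(f)
rhs_e1 = f_hat[:, 0]
rhs_e2 = f_hat[0, :]

print(np.max(np.abs(lhs_e1 - rhs_e1)), np.max(np.abs(lhs_e2 - rhs_e2)))
```

Both printed differences should be at the level of rounding error. Oblique directions would need interpolation on the grid, which is why the sketch stays with the coordinate axes.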

Lemma 26.11. Let F be a function on R, and let d ≥ 3. If $\sup |\hat{F}| \le A$
and $\int_{-\infty}^{\infty} |\hat{F}(\lambda)|^2 |\lambda|^{d-1} \, d\lambda \le B^2$, then $\sup_{t \in \mathbb{R}} |F(t)| \le c(A + B)$. Moreover, if
0 < α < 1/2 then
\[ |F(t_1) - F(t_2)| \le C_\alpha |t_1 - t_2|^\alpha (A + B). \]
Index

absolutely continuous, 15, 31
adjoint, 35
approximation to identity, 25

Baire category theorem, 45
Banach space, 45
Banach–Steinhaus theorem, 46

Cantor function, 19
closed graph theorem, 46
compact operator, 40

Dirichlet–Dini kernel, 5, 23
distribution, 61
    order, 74

elliptic operator, 70

Fatou’s lemma, 10
Fejér kernel, 24
Fourier transform, 26
Fredholm alternative, 44
Fubini’s theorem, 29
fundamental theorem of calculus, 15, 17, 21

Gårding’s inequality, 53, 55
good kernel, 24
Green’s kernel, 43
Green’s operator, 56

Hölder inequality, 32
Hahn–Banach theorem, 47
Hardy–Littlewood maximal function, 20
Hausdorff dimension, 83
Hilbert space, 32
Hilbert–Schmidt operator, 43
hypo-elliptic operator, 70

integrable, 11

Lebesgue integration, 10
Lebesgue set, 21
Lp space, 31

maximal ergodic theorem, 81
mean ergodic theorem, 81
measurable, 6
measurable function, 10
measure space, 79
Minkowski inequality, 32

open mapping theorem, 46
outer measure, 6

parametrix, 70
Poincaré lemma, 50
pointwise ergodic theorem, 82

Radon transform, 84
Radon–Nikodym derivative, 31, 80
Rellich’s lemma, 51, 53
Riemann–Lebesgue lemma, 24
Riesz representation theorem, 34

Schwartz space, 62
σ-algebra, 7
spectral theorem, 41
Sturm–Liouville equation, 40

tangent vector, 49
Tchebychev’s inequality, 20
tempered distribution, 62

Vitali covering, 21
