Linear Algebra for Math Students
Oliver Knill
MATH 22A
Lecture
1.1. A finite rectangular array A of real numbers is called a matrix. If there are n rows and m columns in A, it is called an n × m matrix. We address the entry in the i’th row and j’th column with $A_{ij}$. An n × 1 matrix is a column vector, a 1 × n matrix is a row vector. A 1 × 1 matrix is called a scalar. Given an n × p matrix A and a p × m matrix B, the n × m matrix AB is defined as $(AB)_{ij} = \sum_{k=1}^{p} A_{ik} B_{kj}$. It is called the matrix product. The transpose of an n × m matrix A is the m × n matrix $(A^T)_{ij} = A_{ji}$. The transpose of a column vector is a row vector.
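The definitions of the matrix product and the transpose can be tried out directly. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows; the function names `matmul` and `transpose` are illustrative choices and not part of the notes. The test matrices are the ones from example 1.10.

```python
def matmul(A, B):
    """Matrix product of an n x p matrix A and a p x m matrix B (lists of rows)."""
    n, p, m = len(A), len(B), len(B[0])
    if any(len(row) != p for row in A):
        raise ValueError("inner dimensions do not match")
    # (AB)_ij is the sum over k of A_ik * B_kj
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]


def transpose(A):
    """Transpose: the entry (i, j) of the result is A[j][i]."""
    return [list(column) for column in zip(*A)]


A = [[1, 2], [1, 2]]
B = [[1, -1], [-1, 1]]
print(matmul(A, B))             # [[-1, 1], [-1, 1]]
print(matmul(B, A))             # [[0, 0], [0, 0]]
print(transpose([[1, 2, 3]]))   # [[1], [2], [3]]
```

The two products differ, which illustrates that matrix multiplication is not commutative.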
1.2. Denote by M (n, m) the set of n×m matrices. It contains the zero matrix O with
Oij = 0. In the case m = 1, it is the zero vector. The addition A+B of two matrices
in M (n, m) is defined as (A+B)ij = Aij +Bij . The scalar multiplication λA is defined
as (λA)ij = λAij if λ is a real number. These operations make M (n, m) a vector
space = linear space: the addition is associative, commutative with a unique
additive inverse −A satisfying A − A = 0. The multiplications are distributive:
A(B + C) = AB + AC and λ(A + B) = λA + λB and λ(µA) = (λµ)A.
1.3. The space M (n, 1) is also called Rn . It is the n-dimensional Euclidean space.
The vector space R2 is the plane and R3 is the physical space. These spaces are
dear to us as we draw on paper and live in space. The dot product between two
column vectors v, w ∈ R^n is the matrix product $v \cdot w = v^T w$. Because the dot product is a scalar, the product is also called the scalar product. In the matrix product of two matrices A, B, the entry at position (i, j) is the dot product of the i’th row in A with the j’th column in B. More generally, the dot product between two arbitrary n × m matrices can be defined by $A \cdot B = \mathrm{tr}(A^T B)$, where the trace of a matrix is the sum of its diagonal entries. This means $\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}$. We just take the product over all matrix entries and add them up. The dot product is distributive, $(u + v) \cdot w = u \cdot w + v \cdot w$, and commutative, $v \cdot w = w \cdot v$. We can use it to define the length $|v| = \sqrt{v \cdot v}$ of a vector or the length |A| of a matrix, where we took the positive square root. The sum of the squares is zero exactly if all components are zero. The only vector satisfying |v| = 0 is therefore v = 0.
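As a small illustration of the dot product tr(A^T B) and the resulting length, here is a sketch in Python. It assumes matrices are lists of rows (a column vector is a list of one-element rows); the names `dot` and `length` are hypothetical helpers. The matrices are those of example 1.10.

```python
import math


def dot(A, B):
    """Sum of the entrywise products of A and B, which equals tr(A^T B)."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def length(A):
    """Length |A|, the positive square root of A . A."""
    return math.sqrt(dot(A, A))


A = [[1, 2], [1, 2]]
B = [[1, -1], [-1, 1]]
S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]   # A + B

print(dot(A, B))                          # 0
print(dot(A, A), dot(B, B), dot(S, S))    # 10 4 14
print(length(B))                          # 2.0
```

Since A · B = 0, the squared lengths add up: 10 + 4 = 14.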
1.4. An important key result is the Cauchy-Schwarz inequality.
Theorem: |v · w| ≤ |v||w|
Linear Algebra and Vector Analysis
Proof. If w = 0, there is nothing to prove as both sides are zero. If w ≠ 0, then we can divide both sides of the inequality by |w| and so achieve that |w| = 1. Define a = v · w. Now, $0 \le (v - aw) \cdot (v - aw) = |v|^2 - 2a\, v \cdot w + a^2 |w|^2 = |v|^2 - 2a^2 + a^2 = |v|^2 - a^2$, meaning $a^2 \le |v|^2$ or $|v \cdot w| \le |v| = |v||w|$.
1.5. It follows from the Cauchy-Schwarz inequality that for any two non-zero vectors
v, w, the number (v · w)/(|v||w|) is in the closed interval [−1, 1]. There exists therefore
a unique angle α ∈ [0, π] such that cos(α) = (v · w)/(|v||w|). If this angle between v
and w is equal to α = π/2, the two vectors are orthogonal. If α = 0 or π the two
vectors are called parallel. There exists then a real number λ such that v = λw. The
zero vector is considered both orthogonal as well as parallel to any other vector.
1.6. Two vectors v, w define a (possibly degenerate) triangle {0, v, w} in Euclidean
space Rn . The above formula defines an angle α at the point 0 (which could be the
zero angle). The side lengths a = |v|, b = |w|, c = |v − w| of the triangle satisfy the
following cos formula. It is also called the Al Kashi identity.
Theorem: $c^2 = a^2 + b^2 - 2ab\cos(\alpha)$
Proof. We use the definitions as well as the distributive property (FOIL out):
$c^2 = |v - w|^2 = (v - w) \cdot (v - w) = v \cdot v + w \cdot w - 2 v \cdot w = a^2 + b^2 - 2ab\cos(\alpha)$.
1.7. The case α = π/2 is particularly important. It is the Pythagorean theorem:
Theorem: $a^2 + b^2 = c^2$
Examples
1.8. The dot product of $v = [1, 3, 1]^T$ and $w = [1, -2, -1]^T$ is $v \cdot w = [1, 3, 1]\,[1, -2, -1]^T = 1 - 6 - 1 = -6$. We have $|v| = \sqrt{11}$, $|w| = \sqrt{6}$ and angle $\alpha = \arccos(-6/\sqrt{66})$.
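The angle formula cos(α) = (v · w)/(|v||w|) can be sketched in Python as follows. The helper names `dot` and `angle` are assumptions made for this illustration; the clamp to [−1, 1] is only a guard against rounding and is not part of the mathematical definition.

```python
import math


def dot(v, w):
    """Dot product of two vectors given as lists of equal length."""
    return sum(a * b for a, b in zip(v, w))


def angle(v, w):
    """Angle in [0, pi] between two non-zero vectors."""
    c = dot(v, w) / math.sqrt(dot(v, v) * dot(w, w))
    # By Cauchy-Schwarz, c is in [-1, 1]; clamp to absorb rounding errors.
    return math.acos(max(-1.0, min(1.0, c)))


v, w = [1, 3, 1], [1, -2, -1]
print(dot(v, w), dot(v, v), dot(w, w))   # -6 11 6
print(angle(v, w))                       # the angle arccos(-6/sqrt(66)) in radians
```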
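1.9. The dot product of $A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 2 \\ 4 & -1 \end{bmatrix}$ is $\mathrm{tr}(A^T B) = 6 + 2 + 8 + (-1) = 15$. The length of A is $\sqrt{9 + 1 + 4 + 1} = \sqrt{15}$, the length of B is $\sqrt{25} = 5$. The angle between A and B is $\alpha = \arccos(15/(5\sqrt{15})) = \arccos(\sqrt{15}/5) \approx 0.685$.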
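1.10. $A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ are perpendicular because $\mathrm{tr}(A^T B) = 0$. The angle between them is π/2. The length of A is $a = \sqrt{10}$. The length of B is $b = \sqrt{4} = 2$. The length of $A + B = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}$ is $c = \sqrt{14}$. We confirm $a^2 + b^2 = c^2$. Note that $AB \neq BA$. Multiplication is not commutative.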
1.11. Find the angles in a triangle with side lengths a = 4, b = 5 and c = 6. Answer: Al Kashi gives $2 \cdot 4 \cdot 5 \cos(\gamma) = 4^2 + 5^2 - 6^2 = 5$ so that γ = arccos(5/40). Similarly $2 \cdot 4 \cdot 6 \cos(\beta) = 4^2 + 6^2 - 5^2 = 27$ so that β = arccos(27/48) and $2 \cdot 5 \cdot 6 \cos(\alpha) = 5^2 + 6^2 - 4^2 = 45$ so that α = arccos(45/60).
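The computation with the Al Kashi identity can be checked numerically. This is a short sketch in Python; the function name `triangle_angles` is an assumption for the illustration, and it presumes that a, b, c are the side lengths of a genuine triangle.

```python
import math


def triangle_angles(a, b, c):
    """Angles (alpha, beta, gamma) opposite to the sides a, b, c."""
    # Each line solves an identity like c^2 = a^2 + b^2 - 2ab cos(gamma) for the angle.
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((a * a + c * c - b * b) / (2 * a * c))
    gamma = math.acos((a * a + b * b - c * c) / (2 * a * b))
    return alpha, beta, gamma


alpha, beta, gamma = triangle_angles(4, 5, 6)
print(alpha, beta, gamma)
print(alpha + beta + gamma)   # the three angles add up to pi, up to rounding
```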
Illustrations
Homework
This homework is due on Thursday.
Problem 1.1: Given $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$.
a) Find $A^T$, then build $B = A + A^T$ and $C = A - A^T$. The first matrix is
Problem 1.2: Use the definitions to find the angle between the vector
v = [1, 1, 0, −3, 0, 1]T and w = [1, 1, 9, −3, −5, −3]T . What? Is this not
a bit esoteric? These vectors are in R6 . It actually is very applied: the
value cos(α) is the correlation between the two data points v and w. If
the cosine is positive, the data have positive correlation. If the cosine is
negative, they have negative correlation.
Problem 1.5: a) Find two vectors in R2 for which all coordinate entries
are 1 or −1 and which are both perpendicular to each other.
b) Design four vectors in R4 for which all coordinate entries are 1 or −1
which are all perpendicular to each other.
Optional and need not be turned in: Can you invent a strategy which
allows you, for example, to find 16 vectors in R^16 which are all perpendicular
to each other and still have entries in {−1, 1}?
MATH 22A
Lecture
2.1. If an n × m matrix A is multiplied with a vector x ∈ Rm , we get a new vector Ax
in Rn . The process x → Ax defines a linear map from Rm to Rn . Given b ∈ Rn , one
can ask to find x satisfying the system of linear equations Ax = b. Historically,
this gateway to linear algebra was walked through much before matrices were even
known: there are Babylonian and Chinese roots reaching back thousands of years. 1
2.2. The best way to solve the system is to row reduce the augmented matrix B = [A|b]. This is an n × (m + 1) matrix as there are m + 1 columns now. The Gauss-Jordan elimination algorithm produces from a matrix B a row reduced matrix rref(B). The algorithm allows three operations: subtract a row from another row, scale a row and swap two rows. If we look at the system of equations, all these operations preserve the solution space. We aim to produce leading ones, which are matrix entries 1 that are the first non-zero entry in a row. The goal is to get to a matrix which is in row reduced echelon form. This means: A) every row which is not zero has a leading one, B) every column with a leading one has no other non-zero entries besides the leading one. The third condition is C) every row above a row with a leading one has a leading one to the left.
2.3. We will practice the process in class and homework. Here is a theorem
Theorem: Every matrix A has a unique row reduced echelon form.
Proof. 2 We use the method of induction with respect to the number m of columns in the matrix. The induction base is the case m = 1, where only one column exists. By condition B) there can either be zero or 1 entry different from zero. If there is none, we have the zero column. If it is non-zero, it has to be at the top by condition C). We are in row reduced echelon form. Now, let us assume that all n × m matrices have a unique row reduced echelon form. Take an n × (m + 1) matrix [A|b]. It remains in row reduced echelon form if the last column b is deleted (see lemma). Removing the last column and then row reducing is the same as row reducing and then deleting the last column. So, the columns of A are uniquely determined after row reduction. Now note that for a row of [A|b] without leading one at the end, all entries are zero so that also the last entries agree. Assume we have two row reductions [A′|b′] and [A′|c′], where A′ is the row reduction of A. A leading 1 in the last column of [A′|b′] happens if and only if the corresponding row in A was zero. So, also [A′|c′] has that leading 1 at the end. Assume now there is no leading one in the last column and $b'_k \neq c'_k$. We have so x, a solution to the equation $A'_{kq} x_q + A'_{k,q+1} x_{q+1} + \dots + A'_{k,m} x_m = b'_k$. Since solutions to equations stay solutions when row reducing, also $A'_{kq} x_q + A'_{k,q+1} x_{q+1} + \dots + A'_{k,m} x_m = c'_k$. Therefore $b'_k = c'_k$.
1 For more, look at the exhibit on the website: google “catch 22 Harvard” to get there
2 The proof is well known: see e.g. Thomas Yuster, Mathematics Magazine, 1984
2.4. A separate lemma allows us to break up a proof:
Lemma: If [A|b] is in row reduced echelon form, then A is in row reduced echelon form.
Proof. We have to check the three conditions which define row reduced echelon form.
2.5. It is not true that if A is in row reduced echelon form, then any sub-matrix is in
row reduced echelon form. Can you find an example?
Examples
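2.6. To row reduce, we use the three steps and document them on the right. To save space, we sometimes report only after having done two steps. We mark each leading 1 with a box. Note that we did not immediately go to the leading 1 by scaling the first row. It is a good idea to avoid fractions as much as possible.
$$\begin{bmatrix} 3 & 4 & 5 & 6 & 7 \\ 20 & 30 & 40 & 50 & 60 \\ 1 & 2 & 3 & 4 & 5 \end{bmatrix} \begin{array}{l} \to \text{row } 3 \\ \\ \to \text{row } 1 \end{array}$$
$$\begin{bmatrix} \boxed{1} & 2 & 3 & 4 & 5 \\ 20 & 30 & 40 & 50 & 60 \\ 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{array}{l} \\ \ast 1/10 \\ \\ \end{array}$$
$$\begin{bmatrix} \boxed{1} & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{array}{l} \\ -R_1 \\ -R_2 \end{array}$$
$$\begin{bmatrix} \boxed{1} & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} \begin{array}{l} \\ -R_1 \\ -R_2 \end{array}$$
$$\begin{bmatrix} \boxed{1} & 2 & 3 & 4 & 5 \\ 0 & -1 & -2 & -3 & -4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{array}{l} +2R_2 \\ \ast(-1) \\ \\ \end{array}$$
$$\begin{bmatrix} \boxed{1} & 0 & -1 & -2 & -3 \\ 0 & \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$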
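The row reduction can also be automated. Below is a minimal sketch of Gauss-Jordan elimination in Python, using the three operations (swap, scale, subtract a multiple of a row). It is one possible implementation and not necessarily the order of steps used by hand: it works with exact fractions from the standard library instead of avoiding them, and the function name `rref` simply mirrors the notation of the notes.

```python
from fractions import Fraction


def rref(M):
    """Row reduced echelon form of M (a list of rows), computed with exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]     # swap two rows
        lead = A[pivot_row][col]
        A[pivot_row] = [x / lead for x in A[pivot_row]]     # scale to get a leading one
        for r in range(rows):                               # clear the rest of the column
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A


M = [[3, 4, 5, 6, 7], [20, 30, 40, 50, 60], [1, 2, 3, 4, 5]]
for row in rref(M):
    print([int(x) for x in row])
# [1, 0, -1, -2, -3]
# [0, 1, 2, 3, 4]
# [0, 0, 0, 0, 0]
```

Because the row reduced echelon form is unique, the result agrees with the hand computation even though the intermediate steps differ.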
2.7. Finish the following Sudoku problem, which is a game where one has to fix matrices. The rules are that in each of the four 2 × 2 sub-squares, in each of the four rows and each of the four columns, the entries 1 to 4 have to appear and so add up to 10:
$$\begin{bmatrix} 2 & 1 & x & 3 \\ 3 & y & z & 1 \\ 4 & 3 & a & 2 \\ b & c & d & e \end{bmatrix}.$$
We have the equations 2 + 1 + x + 3 = 10, 3 + y + z + 1 = 10, 4 + 3 + a + 2 = 10, b + c + d + e = 10 for the rows, 2 + 3 + 4 + b = 10, 1 + y + 3 + c = 10, x + z + a + d = 10, 3 + 1 + 2 + e = 10 for the columns and 2 + 1 + 3 + y = 10, x + 3 + z + 1 = 10, 4 + 3 + b + c = 10, a + 2 + d + e = 10 for the four squares. We could solve the system by writing down the corresponding augmented matrix and then do row reduction. The solution is
$$\begin{bmatrix} 2 & 1 & 4 & 3 \\ 3 & 4 & 2 & 1 \\ 4 & 3 & 1 & 2 \\ 1 & 2 & 3 & 4 \end{bmatrix}.$$
Illustrations
The system of equations
x + u = 3
y + v = 5
z + w = 9
x + y + z = 8
u + v + w = 9
is a tomography problem. These problems appear in magnetic resonance imaging.
A precursor was X-ray Computed Tomography (CT), for which Allan MacLeod Cormack got the Nobel Prize in 1979
(Cormack had a sabbatical at Harvard in 1956-1957, where the idea hatched). Cormack lived until 1998 in Winchester,
MA. He originally had been a physicist. His work had tremendous impact on medicine.
x u
y v
z w
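We build the augmented matrix [A|b] and row reduce. First remove the sum of the first three rows from the 4th, then change the sign of the 4th row:
$$\left[\begin{array}{cccccc|c} 1 & 0 & 0 & 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 1 & 1 & 1 & 0 & 0 & 0 & 8 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \end{array}\right] \Rightarrow \left[\begin{array}{cccccc|c} 1 & 0 & 0 & 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \end{array}\right] \Rightarrow \left[\begin{array}{cccccc|c} 1 & 0 & 0 & 0 & -1 & -1 & -6 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$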
Now we can read off the solutions. We see that v and w can be chosen freely. They are
free variables. We write v = r and w = s. Then just solve for the variables:
x = −6 + r + s
y = 5 − r
z = 9 − s
u = 9 − r − s
v = r
w = s
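A quick way to gain confidence in a parametrized solution is to plug it back into the system. The sketch below does this in Python for a grid of integer values of the free variables r and s; the helper names `solution` and `satisfies_system` are assumptions made for this illustration, and such a check on finitely many values is of course not a proof.

```python
def solution(r, s):
    """The two-parameter family of solutions, with free variables v = r and w = s."""
    return {"x": -6 + r + s, "y": 5 - r, "z": 9 - s, "u": 9 - r - s, "v": r, "w": s}


def satisfies_system(p):
    """Check the five equations of the tomography problem."""
    return (p["x"] + p["u"] == 3
            and p["y"] + p["v"] == 5
            and p["z"] + p["w"] == 9
            and p["x"] + p["y"] + p["z"] == 8
            and p["u"] + p["v"] + p["w"] == 9)


print(all(satisfies_system(solution(r, s))
          for r in range(-3, 4) for s in range(-3, 4)))   # True
```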
Homework
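Problem 2.2: Row reduce the matrix $A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 0 \\ 1 & 2 & 0 & 0 \end{bmatrix}$.
Problem 2.5: Given $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Compare $\mathrm{rref}(A^T)$ with $(\mathrm{rref}(A))^T$. Is it true that the transpose of a row reduced matrix is a row reduced matrix?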
MATH 22A
Seminar
3.1. Theorems are mathematical statements which can be verified using proofs. Theorems are the backbone of mathematics. A proof assures that the theorem is true and remains valid also in the future. Let’s look at an example of a theorem. It has already been known and proven by Euclid of Alexandria. It deals with integers and primes, positive integers larger than 1 which are only divisible by 1 and themselves. The theorem tells that every positive integer is either 1 or prime or the product of two or more primes. To formulate the theorem more elegantly, we extend the notion of product and say that a prime is the product of k = 1 primes and that the number 1 is a product of k = 0 primes. Also we would say the number 20 = 2 ∗ 2 ∗ 5 is the product of k = 3 primes, even though the prime 2 appears twice. This is similar to the water molecule H2O = H ∗ H ∗ O containing k = 3 atoms, as hydrogen H appears twice and oxygen O once. Now, like every molecule decomposes into atoms, every number decomposes into primes:
Theorem: Every positive integer is a product of primes.
This is a remarkable statement because there are infinitely many integers. We can therefore not go through an infinite list and check things for each. It could a priori happen that for some very large number, like the Fermat number $F_{1000} = 2^{2^{1000}} + 1$, which can not even be written down in our universe, 1 the statement would fail.
3.2. In order that such a statement can be verified or refuted, one needs first of all to
make sure that the objects are described by clear definitions. In the above sentence,
this means that we need to know what the “integers” are, what a “product” is and
what “prime numbers” are. This is already tricky in general. Most confusions which
have happened historically in science (and still today!) are based on sloppy definitions. 2
1 There are less than $2^{300}$ elementary particles available in our universe (as far as we know).
2 Amuse yourself and try to find definitions of “entropy”, “multiverse”, “intelligence” or “life”
3.3. Once the definitions of the ingredients of the statement are clear, it is helpful to
clarify its meaning. We get intuition by looking at examples. We see for example
that 100 = 2 ∗ 2 ∗ 5 ∗ 5 is indeed a product of prime numbers. We see that 7 is a prime
number. Examples are great, but it is important at this stage also to realize:
Principle: Checking a statement by showing a few examples is not a
proof.
We will come back to this later in the course.
Problem C: The following statements are examples of theorems we have
seen in the first two lectures:
$3^2 + 4^2 = 5^2$
Proof: the statement S(n) is true for n = 1. Assume S(n) is true. Now S(n + 1) tells $1 + 2 + \dots + n + (n + 1) = ((n + 1)^2 + (n + 1))/2$. Using the induction assumption this means $(n^2 + n)/2 + (n + 1) = ((n + 1)^2 + (n + 1))/2$, which is true. We know therefore that the statement is true for all n.
3.6. Let’s look at the theorem on primes above. In order to make this a statement which we can extend from n to n + 1, we modify the statement to
S(n): every integer in {2, . . . , n} is a product of primes.
3.7. S(2) is true as {2} only contains one number which is prime. Now assume S(n), meaning that the statement is true for n, and prove that S(n + 1) is true. There are two cases: if n + 1 is prime, then S(n + 1) is true. If n + 1 is not prime, then n + 1 = ab, where a and b are numbers larger than 1 but smaller than or equal to n. By induction assumption, both a and b decompose into primes: $a = p_1 p_2 \cdots p_k$ and $b = q_1 q_2 \cdots q_l$, where $p_j$ and $q_j$ are primes. Therefore, $n + 1 = p_1 p_2 \cdots p_k q_1 q_2 \cdots q_l$.
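The induction argument translates directly into a recursive procedure: a number either has no proper divisor (and is prime) or splits as a product ab of two smaller numbers which are decomposed in turn. Here is a minimal sketch in Python; the function name `factor` is an assumption for this illustration, and the trial division used to find a divisor is just one simple choice.

```python
import math


def factor(n):
    """Return a list of primes whose product is n, for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return []                    # 1 is the product of k = 0 primes
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:
            # n = a * b with 1 < a, b < n: decompose both factors, as in the proof.
            return factor(a) + factor(n // a)
    return [n]                       # no proper divisor: n is prime (k = 1)


print(factor(20))    # [2, 2, 5]
print(factor(100))   # [2, 2, 5, 5]
print(factor(7))     # [7]
```

Note that the procedure only produces one decomposition; like the proof, it says nothing about uniqueness.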
3.8. It is important to understand the statement and not overreach it. We have not
proven that every integer has a unique decomposition into prime factors. This was
not known by Euclid (he might not even have thought about it). It was only proven
2000 years later by Gauss. A common mistake which happens in mathematical proofs
is that one cites a theorem which is known but overreaches its scope or forgets one of
the assumptions.
Principle: Do not extend the scope of an already established fact without justification.
3.9. If you think such mistakes happen to rookies only, this is not the case. Leonhard Euler, probably the greatest mathematician of all times, once attempted a proof of Fermat’s last theorem by working with extended number systems like $\mathbb{Z}[\sqrt{-3}]$, which are all the numbers of the form $a + b\sqrt{-3}$, where a, b are integers. You see, one can add and multiply such numbers like integers and remain in the class. Euclid’s proof also shows that there is a prime factorization. But there can be different prime factorizations. An example is $4 = 2 \ast 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$. A similar mistake was made by Gabriel Lamé, who announced in 1847 a proof of Fermat’s last theorem telling that for n ≥ 3, no solutions to $x^n + y^n = z^n$ exist unless xyz = 0. Lamé’s genius
3Already used by Plato and a second order axiom in the Peano axiom system.
3.10. Of course, we have to try to avoid mistakes in the final product at all costs. Euler
certainly earned the right to make some mistakes by creating a lot of mathematics,
which will remain true for all eternity. But mistakes can be much more basic. Here is
a beautiful example due to Polya: 5
Theorem: All horses have the same color.
Proof: The induction assumption is clear as for n=1, all horses have the same color.
Now assume the statement is true for all groups of n horses. Now take n+1 horses and
take the first away. These are n horses so that all have the same color. Now put the
first back and take the last one away. Again we have n horses, so that all have the
same color. Therefore all have the same color.
Problem D: What is wrong in the proof of Polya’s horse theorem?
3.11. Proof: No cat has no tail. A cat with a tail has a tail more than no cat. No cat
has eight tails. Therefore, cats have nine tails.
4 See Mario Livio: Brilliant Blunders, 2013
5 George Polya: Induction and Analogy in Mathematics, 1954 (thanks to Jun Hou Fung for the suggestion)
3.12. For the following definition of “Prime numbers” we follow 6:
A prime is a number with no divisors.
Boxes of chocolates always contain a prime number
so that, whatever the number of people present
somebody has to have that one left over.
3.13. Why do we start to do induction at n = 1 and not from the other end? The following song explains why. Just as a bit of background to appreciate the song: Aleph-Null = ℵ0 is the cardinality of the natural numbers N. ℵ1 is the next larger cardinality. The cardinality of the real numbers R is $2^{\aleph_0}$ (as the Cantor diagonal argument shows that the real numbers can not be counted), which is the cardinality of all subsets of natural numbers. Cantor had shown that there are different infinities. A beautiful mind like Cantor of course asked whether there is an infinity in between these two infinities.
The statement $2^{\aleph_0} = \aleph_1$ is the continuum hypothesis, abbreviated CH. Work of Kurt Gödel and Paul Cohen, completed in the sixties, shows that one can not prove the statement nor its negation from ZFC set theory (an axiom system of our standard mathematics from which one can derive the Peano axioms including the principle of induction). Cantor had for a long time tried to prove CH, in vain. We know now that his efforts to prove this were doomed from the beginning. This possibility always exists. There is the possibility (although very unlikely) that we can not prove that every even number larger than 2 is a sum of two primes, even if it is true! 7 The continuum hypothesis problem had been the first of Hilbert’s problems of 1900.
Homework
Exercises A)-D) are done in the seminar. This homework is due on Tuesday:
MATH 22A
Lecture
4.1. The three dimensional space R^3 is special. It is not only the only Euclidean space in which the Kepler problem is stable 1, it also features a cross product v × w which is in the same space. Such a product can be defined in R^n but it produces a vector in $\mathbb{R}^{n(n-1)/2}$. It happens for n = 3 that the result is again in R^3. The problem of “multiplying triplets” was pondered by William Hamilton in the first half of the 19th century and is related to the fascinating story of quaternions. The discovery of quaternions was simultaneously the birthplace of the dot and cross product.
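4.2. The cross product of two vectors $v = [v_1, v_2, v_3]^T$ and $w = [w_1, w_2, w_3]^T$ is
$$\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \times \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} v_2 w_3 - v_3 w_2 \\ v_3 w_1 - v_1 w_3 \\ v_1 w_2 - v_2 w_1 \end{bmatrix}.$$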
Take the dot product with v or w to see that v × w is perpendicular to both v and w. It is also apparent that v × w = −w × v. The product is handy for constructions in R^3. The vectors v, w, v × w are oriented like the first three fingers on the right hand: if v is the thumb, w is the pointing finger, then v × w is the middle finger. Let v · w = |v||w| cos(α):
Theorem: |v × w| = |v||w| sin(α)
Proof. We will verify in class by brute force Lagrange’s identity $|v \times w|^2 = |v|^2 |w|^2 - (v \cdot w)^2$, which is also called the Cauchy-Binet formula. Now use $v \cdot w = |v||w|\cos(\alpha)$ to get the result with $\cos^2(\alpha) + \sin^2(\alpha) = 1$.
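The properties of the cross product used here can be checked on concrete vectors. The following Python sketch implements the defining formula and tests perpendicularity, anti-commutativity and Lagrange's identity for one pair of integer vectors; the helper names `dot` and `cross` are assumptions for the illustration, and checking one example is not a substitute for the brute force verification.

```python
def dot(v, w):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    """Cross product of two vectors in R^3."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]


v, w = [2, 4, 1], [1, -1, 2]
n = cross(v, w)
print(n)                        # [9, -3, -6]
print(dot(n, v), dot(n, w))     # 0 0, so n is perpendicular to v and w
print(cross(w, v))              # [-9, 3, 6], the negative of v x w
# Lagrange's identity |v x w|^2 = |v|^2 |w|^2 - (v . w)^2
print(dot(n, n) == dot(v, v) * dot(w, w) - dot(v, w) ** 2)   # True
```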
4.3. Given a triangle with side lengths a, b, c and angles α, β, γ, where α is opposite to a etc., we have the following sin-formula:
Corollary: $\frac{a}{\sin(\alpha)} = \frac{b}{\sin(\beta)} = \frac{c}{\sin(\gamma)}$.
Proof. We can use the theorem and express twice the area of the triangle as ab sin(γ) or bc sin(α) or ac sin(β). By equating these three quantities and dividing out the common factor abc, we get the sin-formula.
1 By a theorem of Joseph Bertrand of 1873 and work of Sundman-von Zeipel
4.4. This is useful in applications as to define the area of the parallelogram as |v × w|.
That this is justified can be seen in two dimensions and:
Corollary: |v × w| is the parallelogram area spanned by v and w.
Proof. Use the formula |v × w| = |v||w| sin(α) and note that |w| sin(α) is the height of
the parallelogram spanned by v and w. The base length is |v|.
4.5. The scalar u · (v × w) is called the triple scalar product of u, v, w. Its sign defines an orientation of the three vectors. It is also the determinant of the matrix
$$\begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix}.$$
The absolute value of u · (v × w) defines the volume of the parallelepiped spanned by u, v and w. Without the absolute value, we also speak of signed volume.
4.6. Side remark: In higher dimensions, the cross product is called exterior product. One uses ∧ rather than ×, which is used in three dimensions. If I = (i, j) is a choice of two elements in {1, 2, . . . , n} and v, w are two vectors in R^n, then $(v \wedge w)_I = v_i w_j - v_j w_i$. The formula |v ∧ w| = |v||w| sin(α) still holds and the proof is the same. We only need again to verify the Cauchy-Binet formula $|v|^2 |w|^2 - (v \cdot w)^2 = |v \wedge w|^2$. But this is better done using matrices. If A is the matrix which contains v, w as columns, then $\det(A^T A) = \sum_P \det(A_P)^2$, where the sum on the right is over all 2 × 2 submatrices $A_P$ of A. The expression $\det(A_P)$ is called a minor. The Cauchy-Binet formula is super cool 2. By the way, suppose we have k vectors and build A ∈ M(n, k), a matrix which has these vectors as columns. Then $\det(A^T A)$ is the square of the volume of the parallelepiped spanned by these vectors. And Cauchy-Binet writes this as a sum of squares of k-dimensional volumes of projections, which is in some sense a generalization of Pythagoras.
Examples
4.7. What is the area of the triangle A = (1, 1, 1), B = (3, 5, 2) and C = (2, 0, 3)? We
find the cross product between the vector [2, 4, 1]T going from A to B and the vector
[1, −1, 2]T going from A to C. The cross product is
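$$\begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \\ -3 \\ -6 \end{bmatrix}.$$
Its length is $3\sqrt{14}$. The area of the triangle is half of it: $3\sqrt{14}/2$.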
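4.8. Find the volume of the parallelepiped for which one of the vertices is (0, 0, 0) and the other neighbors are A = (1, 1, 1), B = (3, 4, 2) and C = (2, 0, 3). We find the signed volume
$$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \cdot \left( \begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix} \times \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 12 \\ -5 \\ -8 \end{bmatrix} = -1$$
and take the absolute value. A negative number indicates that OA, OB, OC is left handed.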
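The triple scalar product and its description as a determinant can be compared numerically. This Python sketch uses the three column vectors [1, 1, 1], [3, 4, 2] and [2, 0, 3] that appear in the computation above; the helper names `dot`, `cross`, `triple` and `det3` are assumptions made for this illustration.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]


def triple(u, v, w):
    """Triple scalar product u . (v x w), the signed volume."""
    return dot(u, cross(v, w))


def det3(M):
    """Determinant of a 3 x 3 matrix (list of rows) by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


u, v, w = [1, 1, 1], [3, 4, 2], [2, 0, 3]
M = [[u[k], v[k], w[k]] for k in range(3)]    # matrix with columns u, v, w
print(cross(v, w))                 # [12, -5, -8]
print(triple(u, v, w), det3(M))    # -1 -1
print(abs(triple(u, v, w)))        # 1, the volume
```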
2 O. Knill, Cauchy Binet for pseudo-determinants, Lin. Alg. and its Applications 459 (2014) 522-547
Illustrations
Figure 1. The newly released Swiss 200 franc bill shows the
right hand rule: thumb = v, pointing finger = w, then v × w is the
middle finger. Source: Swiss National Bank, issued August 22, 2018.
Homework
MATH 22A
Unit 5: Surfaces
Lecture
5.1. If A is a matrix, the solution space of a system of equations Ax = b is called a linear manifold. It is the set of solutions of Ax = 0 translated so that it passes through one of the points. The equation 3x + 2y = 6 for example describes a line in R^2 passing through (2, 0) and (0, 3). The solutions to Ax = 0 form a linear space, meaning that we can add or scale solutions and still have solutions. We can rephrase this by saying that a linear space is a linear manifold which contains 0. For example, for x + 2y + 3z = 6 we get a plane which is parallel to the plane x + 2y + 3z = 0. The former is a linear manifold (also called affine space), the latter is a linear space. It is the solution space to Ax = 0 with $A = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$ and $x = [x, y, z]^T$. Both planes are perpendicular to $n = [1, 2, 3]^T$. To find an equation for the plane through 3 points P, Q, R, define $n = PQ \times PR = [a, b, c]^T$, then write down ax + by + cz = d, where d is obtained by plugging in a point. The cross product comes in handy.
5.2. The following important example deals with A = [a1 , . . . , am ] in M (1, m).
Proof. Given two points y, z in the plane. Then we have Ay = d and Az = d. Then
x = y−z is a vector inside the plane. Now AT ·x = Ax = A(y−z) = Ay−Az = d−d = 0.
This means that x is perpendicular to the vector AT .
In three dimensions, this means that the plane ax + by + cz = d has a normal vector
AT = n = [a, b, c]T . Keep this in mind, especially because R3 is our home.
5.3. This duality result will later be identified as a fundamental theorem of
linear algebra. It will be important in data fitting for example. The kernel of a
matrix A is the linear space of all solutions of Ax = 0. The kernel consists of all roots of
A. The image of a matrix A is the linear space of all vectors {Ax}. We abbreviate
ker(A) for the kernel and im(A) for the image. We will come back to this later.
5.10. Given a polynomial p of n variables, one can look at the surface {p(x) = 0}. It
is called a variety.
Examples
5.11. Q: Find the plane Σ containing the line x = y = z and the point P = (3, 4, 5).
A: Σ contains Q = (0, 0, 0) and R = (1, 1, 1) and so the vectors v = [1, 1, 1]T and
w = [3, 4, 5]T . The cross product between v and w is [1, −2, 1]T . It is perpendicular to
Σ. So, the equation is x − 2y + z = d, where d can be obtained by plugging in a point
(3, 4, 5). This gives d = 0 so that x − 2y + z = 0.
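The recipe for the plane through three points (take the cross product of two difference vectors as normal vector, then plug in a point) can be sketched in Python as follows. The function name `plane_through` and the choice to return the coefficients as a tuple (a, b, c, d) are assumptions made for this illustration.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]


def plane_through(P, Q, R):
    """Return (a, b, c, d) such that ax + by + cz = d contains the points P, Q, R."""
    PQ = [q - p for p, q in zip(P, Q)]
    PR = [r - p for p, r in zip(P, R)]
    n = cross(PQ, PR)                # normal vector [a, b, c]
    if n == [0, 0, 0]:
        raise ValueError("the three points lie on a common line")
    return (*n, dot(n, P))           # d is obtained by plugging in the point P


print(plane_through((0, 0, 0), (1, 1, 1), (3, 4, 5)))   # (1, -2, 1, 0)
```

The output corresponds to the plane x − 2y + z = 0 found in the example.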
5.12. Can we identify the surface $x^2 + 2x + y^2 - 4y - z^2 + 6z = 0$? Completion of the square gives $x^2 + 2x + 1 + y^2 - 4y + 4 - z^2 + 6z - 9 = 1 + 4 - 9 = -4$. Now $(x + 1)^2 + (y - 2)^2 - (z - 3)^2 = -4$. This is a two-sheeted hyperboloid centered at (−1, 2, 3).
5.13. Intersecting the cone $x^2 + y^2 = z^2$ with the plane y = 1 gives a hyperbola $z^2 - x^2 = 1$. Intersection with z = 1 gives a circle $x^2 + y^2 = 1$. Intersecting with z = x + 1 gives $y^2 = 2x + 1$, a parabola. Because intersecting a cone with a plane can give a hyperbola, an ellipse or a parabola as cuts, one calls the latter conic sections.
5.14. The case of singular quadratic manifolds is even richer: $x^2 - y^2 = 1$ is a cylindrical hyperboloid, $x^2 - y^2 = 0$ is a union of two planes x − y = 0 and x + y = 0. The surface $x^2 = 1$ is a union of two parallel planes, the surface $x^2 = 0$ is a plane.
Homework
Problem 5.2: What kind of curves can you get when you intersect a hyperbolic paraboloid $x^2 - y^2 = z$ with a plane?
Problem 5.3: Find explicit planes which when intersected with the hyperboloid $x^2 + 2y^2 - z^2 = 1$ produce an ellipse, or a hyperbola or a parabola.
Problem 5.4: Find the equation of a plane which is tangent to the three
unit spheres centered at (3, 4, 5), (1, 1, 1), (2, 3, 4).
MATH 22A
Seminar
6.1. Geometric intuition and pictures allow to prove results visually. An example:
6.2. By drawing a rectangle of side length a and b, we can see that the area a ∗ b is
the same as the area b ∗ a. For the cross product or matrices, this is wrong.
6.3. Pictures help to get intuition about a mathematical result. The Pythagorean
theorem was first proven geometrically. The visual proof we look at here could well
have been the first which was found.
Problem B: Use Figure (3) for a proof of the Pythagorean theorem. You
can either describe in words, or label some parts of the picture. Remember
that we want to show c2 = a2 + b2 .
Figure 4. A visual proof of $\sqrt{ab} \le (a + b)/2$.
6.4. The geometric-algebraic inequality assures that the geometric mean is smaller than or equal to the algebraic mean. In order to appreciate that proof, we first have to verify an identity relating the lengths a, b cut by the altitude line and the height h.
Problem C: First check why the triangle in Figure 4 has a right angle. Then use Pythagoras three times to prove $ab = h^2$. Finally check the geometric-algebraic inequality.
Problem D: Use Figure (5) from the “9 Chapters” to prove the theorem.
Figure 5. The 3-4-5 triangle. Can you use the picture to prove that a=1?
6.6. Find the formula for the volume of a tetrahedron given by 4 points A, B, C, D.
Problem E: Use Figure (6) to prove that the volume is a sixth of the
volume of the corresponding parallelepiped.
Homework
Exercises A-D are done in the seminar. This homework is due on Tuesday:
Problem 6.1 The 3D Pythagoras theorem states that the square of
the area of ABC is the sum of the squares of the areas of the triangles
OAB, OBC and OCA (which are each half of a rectangle). Use Figure (7)
with A = (a, 0, 0), B = (0, b, 0), C = (0, 0, c) to verify this theorem. Use
the cross product to get the areas.
Problem 6.4 Find a formula for the distance between the line through
the points A, B and the line through the points C, D. The final formula should
not use any trig functions.
MATH 22A
Unit 7: Curves
Lecture
7.1. Given n continuous functions $x_j(t)$ of one variable t, we can look at the vector-valued function $r(t) = [x_1(t), \dots, x_n(t)]^T$. We call it a parametrized curve. An example is r(t) = [3 + 2t, 4 + 6t], which is a line through the point (3, 4) and containing the vector [2, 6]. 1 If t is in the parameter interval a ≤ t ≤ b, then the image of r is {r(t) | a ≤ t ≤ b}, which defines a curve in R^n. The curve starts at the point r(a) and ends at the point r(b). Another important example is the circle r(t) = [cos(t), sin(t)], where t is in the interval [0, 2π]. Its image is a circle in the plane R^2. The parametrization r(t) contains more information than the curve itself: the parabolic curve $r(t) = [t, t^2]$ defined on t ∈ [−1, 1] for example is the same as the curve $r(t) = [t^3, t^6]$ for t ∈ [−1, 1], but in the second parametrization, the curve is traveled with different speed. Curves in R^3 can be admired in our physical space like r(t) = [x(t), y(t), z(t)] = [t cos(t), t sin(t), t], which is a spiral. You can see that this particular curve is contained in the cone $x^2 + y^2 = z^2$.
7.2. If the functions $t \to x_j(t)$ are differentiable, we can form the derivative $r'(t) = [x_1'(t), \dots, x_n'(t)]$. While this technically is again a curve, we think of r'(t) as a vector attached to the point r(t) and say that r'(t) is tangent to r(t). The length |r'(t)| of the velocity is called the speed of r. If also higher derivatives of the functions $x_j(t)$ exist, we can form the second derivative r''(t) called the acceleration or third derivative $r'''(t) = r^{(3)}(t)$ called the jerk. Then come snap $r^{(4)}(t)$, crackle $r^{(5)}(t)$ and pop $r^{(6)}(t)$ and the Harvard $r^{(7)}(t)$, introduced in the fall of 2016 in a multi-variable exam.
7.3. Given the first derivative function r'(t) as well as the initial point r(0), we can get back the function r(t) thanks to the fundamental theorem of calculus. Because of Newton’s law, which tells that a mass point of mass m subject to a force field F depending on position and velocity satisfies the Newtonian differential equation m r''(t) = F(r(t), r'(t)), the following result is important:
Theorem: r(t) is uniquely determined from r''(t) and r(0) and r'(0).
Proof. In each coordinate we get $x_k'(t) = \int_0^t x_k''(s)\, ds + x_k'(0)$ and $x_k(t) = \int_0^t x_k'(s)\, ds + x_k(0)$. We have just applied the fundamental theorem of calculus twice.
¹ To reduce clutter, we write row vectors like [2, 6] rather than column vectors.
Linear Algebra and Vector Analysis
A special case is when $r''(t)$ is constant, as in the free fall situation. The
coordinate functions are then quadratic. Assume $r''(t) = [0, 0, -10]$, $r'(0) = [0, 0, 0]$
and $r(0) = [0, 0, 20]$. Then $r(t) = [0, 0, 20 - 5t^2]$. If you jump from 20 meters into a
pool, you need $t = 2$ seconds to hit the water.
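The "get position from acceleration" recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `position` and `integrate_twice` are hypothetical, and the stepping scheme (sampling the acceleration in the middle of each time step) is one simple choice among many.

```python
def position(t, r0, v0, acc):
    """Closed form r(t) = r(0) + r'(0) t + acc t^2 / 2 for constant acceleration."""
    return [p + v * t + 0.5 * a * t * t for p, v, a in zip(r0, v0, acc)]


def integrate_twice(t_end, r0, v0, acc_fn, steps=10000):
    """Step r'' = acc_fn(t) forward in time, sampling the acceleration mid-step."""
    dt = t_end / steps
    r, v = list(r0), list(v0)
    for k in range(steps):
        a = acc_fn((k + 0.5) * dt)
        # update the position first (it uses the old velocity), then the velocity
        r = [ri + vi * dt + 0.5 * ai * dt * dt for ri, vi, ai in zip(r, v, a)]
        v = [vi + ai * dt for vi, ai in zip(v, a)]
    return r


# Free fall from 20 meters with acceleration [0, 0, -10]
closed_form = position(2.0, [0, 0, 20], [0, 0, 0], [0, 0, -10])
stepped = integrate_twice(2.0, [0, 0, 20], [0, 0, 0], lambda t: [0, 0, -10])
print(closed_form, stepped)  # both heights should be (close to) 0 at t = 2
```

For a time dependent acceleration, only the function passed as `acc_fn` changes.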
7.4. Given a curve $r(t)$ for which the velocity $r'(t)$ is never zero, we can form the
unit tangent vector $T(t) = r'(t)/|r'(t)|$. If $T'(t)$ is never zero, we can then form
$N(t) = T'(t)/|T'(t)|$, the normal vector. The vector $B = T \times N$ is called the
binormal vector. The scalar $|T'(t)|/|r'(t)|$ is called the curvature of the curve.
normalization we have $\lim_{t \to 0, t>0} N(t) = [0, 1]$ and $\lim_{t \to 0, t<0} N(t) = [0, -1]$. At the
inflection point of the graph of the cube function, the concavity has changed from
concave down to concave up. This has changed the direction of the normal vector $N$.
7.7. Side remark. We have looked at parametrized vectors only. If the entries $A_{ij}(t)$
of a matrix depend on time $t$, we have a matrix valued curve $A(t)$. This appears in
differential equations, in quantum mechanics (operators moving in time) or - most
importantly - in moving pictures! A movie is just a matrix valued curve.
7.8. Side remark. A planar curve $r(t) = [x(t), y(t)]^T$ defined on $t \in [0, 2\pi]$
is called a simple closed curve if $r(0) = r(2\pi)$ and there are no values
$0 \le s \neq t < 2\pi$ for which $r(t) = r(s)$. For a smooth curve, meaning that the first two
derivatives exist, we can look at the polar angle $\alpha(t)$ of the vector $r'(t)$. Define the
signed curvature of the curve as $\kappa(t) = \alpha'(t)/|r'(t)|$. We have $|\kappa(t)| = K(t)$. The
Hopf Umlaufsatz tells $\int_0^{2\pi} \kappa(t)\,|r'(t)|\,dt = 2\pi$ if the curve is traveled counterclockwise.
In the case of the unit circle $r(t) = [\cos(t), \sin(t)]$, for example, $|r'(t)| = 1$ and
$\kappa(t) = 1$.
7.9. Side remark. We can verify that any curve $r(t)$ parametrized on $[a, b]$ such that
$r'(t) \neq 0$ for all $t \in [a, b]$ can be reparametrized as $R(t)$ on an interval $[0, L]$ such that
$|R'(t)| = 1$ for all $t$. Proof: we look for a monotone function $s(t)$ such that the derivative
of $R(t) = r(s(t))$ has length 1. This means we want $|r'(s(t))|\,s'(t) = 1$. In other words, look
for a function $s(t)$ such that $s'(t) = 1/|r'(s(t))| = F(s(t))$ and $s(0) = a$. This is what we call a
differential equation. There is a general existence theorem for differential equations
(proven later) which assures that there exists a unique solution $s(t)$. End of proof.
The result is very intuitive. You can drive from $r(a)$ to $r(b)$ along the curve traced by
$r(t)$ by just keeping the speed 1. This gives you your new parametrization. Your new
time interval will be $[0, L]$, where $L$ is the arc length (the length of your trip). We will
come to arc length computation in the next lesson.
7.10. Side remark. Continuous curves can be complicated: if you look at a pollen
particle under a microscope, it moves erratically on a curve which is nowhere differentiable,
as it is constantly bombarded with air molecules which bounce it around. This is
Brownian motion. There are also Peano curves or Hilbert curves $[0, 1] \to [0, 1]^2$
or space filling Hilbert curves $r(t) : [0, 1] \to Q = [0, 1]^3$ which cover every point of the
cube $Q$. These curves define a continuous surjection from $[0, 1]$ onto $[0, 1]^3$. (The map
is not injective, so it has no inverse. Still, the construction shows that there are as
many points in $[0, 1]$ as in $[0, 1]^3$.)
Figure 1. The first four stages in the construction of a space filling curve.
Examples
7.11. Assuming the Newton equations $m r''(t) = F(t)$, find the path $r(t)$ of a body
of mass $m = 1/2$ subject to a force $F(t) = [\sin(t), \cos(t), -10]$ with $r(0) = [3, 4, 5]$
and $r'(0) = [1, 2, 7]$. Solution: we have $r''(t) = [2\sin(t), 2\cos(t), -20]$. Integration
gives $r'(t) = [-2\cos(t), 2\sin(t), -20t] + [c_1, c_2, c_3]$. Comparing with $r'(0)$ gives
$[c_1, c_2, c_3] = [3, 2, 7]$ and so $r'(t) = [3 - 2\cos(t), 2 + 2\sin(t), 7 - 20t]$. A second
integration gives $r(t) = [3t - 2\sin(t), 2t - 2\cos(t), 7t - 10t^2] + [C_1, C_2, C_3]$ with other
constants $C = [C_1, C_2, C_3]$. Comparing $r(0) = [0, -2, 0] + [C_1, C_2, C_3] = [3, 4, 5]$ gives
$C = [3, 6, 5]$ and $r(t) = [3 + 3t - 2\sin(t), 6 + 2t - 2\cos(t), 5 + 7t - 10t^2]$.
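A solution like the one in 7.11 can be double checked numerically. The following Python sketch approximates the derivatives with central differences; the helper names and the step sizes `h` are assumptions chosen for illustration, not part of the lecture.

```python
import math


def r(t):
    """Candidate path from example 7.11."""
    return [3 + 3 * t - 2 * math.sin(t), 6 + 2 * t - 2 * math.cos(t), 5 + 7 * t - 10 * t * t]


def force(t):
    return [math.sin(t), math.cos(t), -10.0]


def first_derivative(f, t, h=1e-6):
    """Central difference approximation of f'(t), coordinate by coordinate."""
    return [(a - c) / (2 * h) for a, c in zip(f(t + h), f(t - h))]


def second_derivative(f, t, h=1e-4):
    """Central difference approximation of f''(t), coordinate by coordinate."""
    return [(a - 2 * b + c) / (h * h) for a, b, c in zip(f(t + h), f(t), f(t - h))]


m = 0.5
# largest violation of m r''(t) = F(t) at a few sample times
residual = max(
    abs(m * a - F)
    for t in [0.0, 0.7, 2.0]
    for a, F in zip(second_derivative(r, t), force(t))
)
print(r(0.0), first_derivative(r, 0.0), residual)
```

The printed initial position and velocity should be close to [3, 4, 5] and [1, 2, 7], and the residual should be small.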
7.12. Let $r(t) = [L\cos(t), L\sin(t), 0]$. Then $r'(t) = [-L\sin(t), L\cos(t), 0]$ and
$r''(t) = [-L\cos(t), -L\sin(t), 0]$ and $r'(t) \times r''(t) = [0, 0, L^2]$ and $|r'(t)| = L$ so that
$|r'(t) \times r''(t)|/|r'(t)|^3 = 1/L$. A circle of radius $L$ has curvature $1/L$!
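The curvature formula used for the circle can be turned into a small Python sketch. The function names are hypothetical; the velocity and acceleration of the circle are entered by hand rather than computed.

```python
import math


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def curvature(velocity, acceleration):
    """|r' x r''| / |r'|^3 for a space curve with nonzero velocity."""
    return norm(cross(velocity, acceleration)) / norm(velocity) ** 3


def circle_curvature(L, t):
    """Curvature of r(t) = [L cos(t), L sin(t), 0] at time t."""
    velocity = [-L * math.sin(t), L * math.cos(t), 0.0]
    acceleration = [-L * math.cos(t), -L * math.sin(t), 0.0]
    return curvature(velocity, acceleration)


print(circle_curvature(4.0, 1.3))  # expected to be close to 1/4
```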
7.13. A closed simple curve $C$ in $\mathbb{R}^3$ is a knot. For any positive integers $n, m$ we can
look at the torus knot $r(t) = [(3 + \cos(mt))\cos(nt), (3 + \cos(mt))\sin(nt), \sin(mt)]$.
The total curvature of a knot is defined as $\int_0^{2\pi} K(t)\,|r'(t)|\,dt$, the integral of the
curvature with respect to arc length. See Figure 2.²
Figure 2. Torus knots T(2, 3), T(7, 3), T(12, 13) and T(30, 43). Their
total curvatures are 38.6, 245.6, 487.2, 2167.3.
² A general theorem of Fáry and Milnor assures that a knot of total curvature $\le 4\pi$ is trivial.
Homework
Problem 7.4: Verify that the torus knot $r(t) = [x(t), y(t), z(t)] =
[(2 + \cos(mt))\cos(nt), (2 + \cos(mt))\sin(nt), \sin(mt)]$ lives on the torus
$(3 + x^2 + y^2 + z^2)^2 - 16(x^2 + y^2) = 0$.
Problem 7.5: In the lecture on surfaces, we have sliced some bagels. Let
us assume that the doughnut is given by $(x^2 + y^2 + z^2 + 16)^2 - 100(x^2 + y^2) = 0$.
Verify that if we intersect this torus with the plane $3x = 4z$, then we
get the Villarceau circles $r(t) = [4\cos(t), 3 + 5\sin(t), 3\cos(t)]$ as well as
the circle $r(t) = [4\cos(t), -3 + 5\sin(t), 3\cos(t)]$.
MATH 22A
Lecture
8.1. We assume in this lecture that curves are continuously differentiable, meaning
that the velocity is continuous. We would write $r \in C^1([a, b], \mathbb{R}^d)$. Given a parametrized
curve $r(t)$ defined over an interval $I = [a, b]$, its arc length is defined as
$$ L = \int_a^b |r'(t)|\,dt . $$
For $f(t) = |r'(t)|$ the integral is defined as the lim sup (we don't know yet whether
lim exists),
$$ \int_a^b f(t)\,dt = \limsup_{n \to \infty} \frac{S_n}{n} = \limsup_{n \to \infty} \frac{1}{n} \sum_{a \le k/n < b} f\left(\frac{k}{n}\right) . $$
Proof. (i) To see parameter independence, assume a time change $\phi(t)$ with a monotone
smooth function $\phi : [a, b] \to [\phi(a), \phi(b)]$. If $r(t)$ on $[\phi(a), \phi(b)]$ and $R(t) = r(\phi(t))$ on
$[a, b]$ are the two parametrizations and $f(t) = |r'(t)|$ and $F(t) = |R'(t)| = |r'(\phi(t))|\,\phi'(t)$,
then by substitution, the arc length of $r(t)$ is
$\int_{\phi(a)}^{\phi(b)} f(t)\,dt = \int_a^b f(\phi(t))\,\phi'(t)\,dt$, which is
$\int_a^b F(t)\,dt$, the arc length of $R(t)$.
(ii) From (i) we can assume $[a, b] = [0, 1]$. By uniform continuity, there are $M_n \to 0$
such that if $|y - x| \le 1/n$, then $|f(y) - f(x)| \le M_n$. The intermediate value
theorem gives for every $I_k = [x_k, x_{k+1}] = [k/n, (k+1)/n] \subset [0, 1]$ a $y_k \in I_k$ such that
$\int_{x_k}^{x_{k+1}} f(x)\,dx = f(y_k)/n$. Now, $\int_0^1 f(x)\,dx = (1/n) \sum_k f(y_k)$ and
$|S_n/n - \int_0^1 f(x)\,dx| = (1/n) |\sum_k [f(x_k) - f(y_k)]| \le (1/n) \sum_k |f(x_k) - f(y_k)|
\le (1/n) \sum_k M_n = M_n \to 0$.
Examples
8.2. The arc length of the circle $r(t) = [R\cos(t), R\sin(t)]$ with $t \in [0, 2\pi]$ is
$\int_0^{2\pi} |r'(t)|\,dt = \int_0^{2\pi} R\,dt = 2\pi R$.
8.3. The arc length of the parabola $r(t) = [t, t^2/2]$ with $t \in [-1, 1]$ is $\int_{-1}^{1} \sqrt{1 + t^2}\,dt$.
We will do this integral in class. The result is $\sqrt{2} + \mathrm{arcsinh}(1)$.
8.4. The arc length of the curve $r(t) = [\log(t), \sqrt{2}\,t, t^2/2]$ for $t \in [1, 2]$ is
$\int_1^2 \sqrt{1/t^2 + t^2 + 2}\,dt = \int_1^2 (t + 1/t)\,dt = \log(2) + 3/2$.
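The three examples can be compared with the sum $S_n/n$ from the definition of the integral. The following Python sketch is a hedged illustration: the function name `arc_length` and the choice $n = 2000$ are assumptions, and the speed $|r'(t)|$ of each curve is entered by hand.

```python
import math


def arc_length(speed, a, b, n=2000):
    """S_n / n: the sum of speed(k/n) over all integers k with a <= k/n < b, divided by n."""
    k_first = math.ceil(a * n)
    k_stop = math.ceil(b * n)  # first integer k with k/n >= b
    return sum(speed(k / n) for k in range(k_first, k_stop)) / n


R = 3.0
circle = arc_length(lambda t: R, 0.0, 2 * math.pi)
parabola = arc_length(lambda t: math.sqrt(1 + t * t), -1.0, 1.0)
curve = arc_length(lambda t: t + 1 / t, 1.0, 2.0)

print(circle, 2 * math.pi * R)
print(parabola, math.sqrt(2) + math.asinh(1))
print(curve, math.log(2) + 1.5)
```

Each printed pair should agree to a few digits; increasing `n` tightens the agreement.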
Illustrations
8.6. In order to define the Lebesgue integral, one first introduces a so called
σ-algebra $\mathcal{A}$.¹ It is the smallest set of subsets of $\mathbb{R}$ which is closed under the operation
of taking countable unions and intersections and complements and which contains
the class of intervals. The Lebesgue measure on intervals $|[a, b]| = b - a$ can then be
extended to $\mathcal{A}$ where it inherits all the properties we want, like
$|A \cup B| = |A| + |B| - |A \cap B|$. For indicator functions $f(x) = 1_A(x)$, which are 1 if
$x \in A$ and 0 else, the Lebesgue integral is defined as $\int 1_A(x)\,dx = |A|$.
8.7. First write the function $f$ as $f^+ - f^-$, where $f^+$ and $f^-$ are both non-negative.
This is a simplification because we need to define the integral only for non-negative
functions. A simple step function is a finite sum $\sum_i a_i 1_{A_i}$, with $A_i \in \mathcal{A}$. For
such functions, define $\int_I f\,dx = \sum_i a_i |A_i|$. The Lebesgue integral is now defined as
$\sup_{g \le f} \int_I g\,dx$, where the supremum is taken over all simple step functions $g$ smaller
or equal than $f$. If the supremum is finite, the function is called Lebesgue integrable.
8.8. The Lebesgue integral is also a Monte Carlo integral $\lim_{n \to \infty} \frac{1}{n} \sum_{a \le x_k < b} f(x_k)$,
where $x_k$ are random choices in $[a, b]$. This is justified by the law of large numbers.
The transition Riemann → Lebesgue replaces a regular lattice $k/n$ with a random one.
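The Monte Carlo point of view is easy to try out. The sketch below is an illustration under assumptions: it draws $n$ uniform random points in $[a, b]$ and multiplies the average by the length $b - a$ (a factor which is 1 on the unit interval); the function name, the sample size and the fixed seed are choices made here for reproducibility.

```python
import random


def monte_carlo_integral(f, a, b, n=200000, seed=0):
    """Average f over n uniform random points in [a, b], scaled by the length b - a."""
    rng = random.Random(seed)  # fixed seed so that repeated runs agree
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


estimate = monte_carlo_integral(lambda x: x * x, 0.0, 1.0)
print(estimate)  # should be near 1/3, up to random fluctuations of order 1/sqrt(n)
```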
8.9. The Lebesgue integral can also integrate non-continuous functions: let $g(x)$ be
0 on rational numbers and 1 on irrational numbers. Then $\int_I g\,dx = |I|$ because all
except a countable number of $x$ are irrational. The Riemann integral as defined above,
which samples $g$ only at the rational points $k/n$, would give 0.
8.10. The proof that a continuous function is Lebesgue integrable is even simpler
than for the Riemann integral: first again use that $f$ is uniformly continuous on $[a, b]$,
so that there exists $M_n \to 0$ such that whenever $|x - y| \le 1/n$, also $|f(x) - f(y)| \le M_n$.
Take the intervals $I_k = [k/n, (k+1)/n] \cap [a, b]$ and step functions $g = \sum_k c_k 1_{I_k}$ and
$h = \sum_k d_k 1_{I_k}$, where $c_k$ is the minimum of $f$ on $I_k$ and $d_k$ the maximum. Now
$\int_a^b |g - h|\,dx \le \sum_k |c_k - d_k|\,|I_k| \le M_n \sum_k |I_k| = M_n (b - a)$. Now $f$ is sandwiched
between step functions $g, h$ which for $n \to \infty$ have the same integral.
8.11. We don't prove Rademacher here. One needs to show that $f'$ is Lebesgue
integrable and that $g(x) = f(a) + \int_a^x f'(t)\,dt$ agrees with $f(x)$. In modern language
Rademacher tells Lipschitz = Sobolev $W^{1,\infty}([a, b]) = \{f' \in L^\infty([a, b])\}$. More
general is absolute continuity = $W^{1,1}([a, b]) = \{f' \in L^1([a, b])\}$.
¹ For details see e.g. O. Knill, Probability theory and stochastic processes, 2011
Homework
Problem 8.1: Find the arc length of the catenary $r(t) = [t, \cosh(t)]$,
where $\cosh(t) = (e^t + e^{-t})/2$ is the hyperbolic cosine and $t \in [-1, 1]$.
Hint. You can use the identity $\cosh^2(t) - \sinh^2(t) = 1$, where
$\sinh(t) = (e^t - e^{-t})/2$ is the hyperbolic sine. We have $\cosh' = \sinh$, $\sinh' = \cosh$.
Galileo was the first to investigate the catenary. It is the curve which a freely hanging heavy rope describes if the end points
have the same height. Galileo mistook the curve for a parabola. It was Johannes Bernoulli in 1691 who obtained
its true form after some competition involving Huygens, Leibniz and two Bernoullis. The name “catenarian” (= chain
curve) was first used by Huygens in a letter to Leibniz in 1690.
Problem 8.4: Compute numerically the arc length of the knot $r(t) =
[\sin(4t), \sin(3t), \cos(5t), \cos(7t)]$ from $t = 0$ to $t = 2\pi$. By drawing the first
three coordinates only and using color as the fourth coordinate, we can see that
there are no non-trivial knots in $\mathbb{R}^4$. You can not tie your shoes in $\mathbb{R}^4$!
Problem 8.5: What is the relation between $|\int_0^1 r'(t)\,dt|$ and
$\int_0^1 |r'(t)|\,dt$? Give an interpretation of both sides.
MATH 22A
Unit 9: Intuition
Seminar
9.1. It is important in mathematics to gain intuition about objects, definitions and
theorems and proofs. The fact that this is not easy can be illustrated by showing that
intuition can mislead us. We can state “false theorems” which we would believe to be
true but which are false. We start with the notion of “continuity” for which an intuitive
definition tells: we can “draw the graph of a continuous function without having to lift
the pen”. Of course, we can not work with this definition to prove theorems.
9.2. Starting with Cauchy and pushed heavily by Weierstrass, continuity is defined
precisely using the infamous $\epsilon$-$\delta$ definition: $f$ is continuous at $x$ if for every $\epsilon > 0$
there exists $\delta > 0$ such that if $|x - y| \le \delta$, then $|f(x) - f(y)| \le \epsilon$. Using more fancy
mathematical quantifier notation $\forall$ (for all) and $\exists$ (exists) and $\Rightarrow$ (implies) and $\in$ (is
element of) you can impress your friends (and annoy readers and graders) by writing
$$ \forall \epsilon > 0\ \exists \delta > 0\ \forall y \in [a, b],\ |x - y| \le \delta \Rightarrow |f(x) - f(y)| \le \epsilon . $$
The fact that this definition is not intuitive at all and that most students just learn this
“epsilontic” by intimidation is illustrated by the following variation by Ed Nelson.¹
We make it our first exercise:
Problem A: What does the following statement mean?
9.3. In the first lecture we have seen how a polygonal approximation of a curve allows
us to compute the arc length of a curve. Here is a first “anti-theorem”. Your task is to
figure out what is wrong.
¹ E. Nelson, Internal set theory: A new approach to nonstandard analysis, 1977
9.5. This leads to the following anti-theorem.² A continuous planar curve is a function
$t \to r(t) = [x(t), y(t)]$, where both functions $x(t)$, $y(t)$ are continuous.
False Theorem: The circumference of the unit circle is 8.
9.6. We could also think that the arc length of a continuous curve is finite.
False Theorem: The arc length of a continuous curve is finite.
Problem C: Find a formula for the length of the k’th Koch curve
approximation if initially, the triangle has side length 1.
9.8. A counter example is the devil comb $r(t) = [t, \sin(1/t)]$ for $t \in [0, 1]$. It does not
have a jump discontinuity and it is bounded. The function is not defined at $t = 0$ but
we can define $r(0) = [0, 0]$ to make it defined everywhere on $[0, 1]$.
9.10. A counter example was given by Weierstrass. It is called the Weierstrass
function. G.H. Hardy proved in 1916 that the function
$$ f(x) = \sum_{n=1}^{\infty} a^{-n} \cos(a^n x) $$
does not have any point of differentiability if $a > 1$.
Problem E: Show that $f(x) = \sum_{n=1}^{\infty} 2^{-n} \cos(2^n x) \in [-1, 1]$.
Homework
Exercises A-E are done in the seminar. This homework is due on Tuesday:
Problem 9.1 Prove that there was a time in your life when the length
of your largest tooth in millimeters was your height in meters.
Problem 9.2 Use the intermediate value theorem to prove Rolle's theorem, a special
case of the mean value theorem: if $f$ is continuously differentiable and $f(0) = f(1) = 0$,
then there exists a point $x$ in $(0, 1)$ with $f'(x) = 0$.
Problem 9.3 Look up, formulate and understand the proof of the “Wobbly
table theorem”. This theorem appears to have been found in 2008 by
David Richeson. You find an exposition in some of the Harvard Math 1a
handouts.
Problem 9.5 What does your intuition say? We will come back to this
later when we look at surface area. Assume a nice smooth surface $S$ like
a paraboloid is triangulated with triangles of size $\epsilon$ and let $S_n$ be the
polygonal approximation. Does the surface area $|S_n|$ of the polyhedron
converge to the surface area $|S|$ of the surface $S$, that is $|S_n| \to |S|$?
MATH 22A
Lecture
10.1. It was René Descartes who in 1637 introduced coordinates and brought algebra
close to geometry.¹ The Cartesian coordinates $(x, y)$ in $\mathbb{R}^2$ can be replaced by
other coordinate systems like polar coordinates $(r, \theta)$, where $r = \sqrt{x^2 + y^2} \ge 0$ is
the radial distance to the origin $(0, 0)$ and $\theta \in [0, 2\pi)$ is the polar angle made with the
positive x-axis. Since $\theta$ is in the interval $[0, 2\pi)$, it is best described in the complex
notation $\theta = \arg(x + iy)$. The conversion from the $(r, \theta)$ coordinates to the
$(x, y)$-coordinates is
$$ x = r\cos(\theta), \quad y = r\sin(\theta) . $$
The radius is $\sqrt{x^2 + y^2}$, where if non-zero, we always take the positive root. The angle
formula $\arctan(y/x)$ only holds if $x$ and $y$ are both positive. The angle $\theta$ is not uniquely
defined at the origin $(0, 0)$; most software just assumes $\arg(0) = 0$.
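The conversion between Cartesian and polar coordinates can be sketched in Python. The function names are hypothetical. The standard library function `math.atan2(y, x)` takes care of all four quadrants, which the plain formula $\arctan(y/x)$ does not, and it returns 0 at the origin; the result is reduced modulo $2\pi$ here to land in $[0, 2\pi)$.

```python
import math


def to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta in [0, 2*pi)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # atan2 returns values in (-pi, pi]
    return r, theta


def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)


print(to_polar(-1.0, 1.0))  # radius sqrt(2), angle 3*pi/4
print(to_polar(0.0, -1.0))  # angle 3*pi/2, where arctan(y/x) is not even defined
print(to_polar(0.0, 0.0))
```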
10.3. The proof is to write the series definition on both sides. First recall the definition
$e^x = 1 + x + x^2/2! + x^3/3! + \dots$. If we plug in $x = i\theta$ we get
$e^{i\theta} = 1 + i\theta - \theta^2/2! - i\theta^3/3! + \theta^4/4! + \dots$. But this is
$(1 - \theta^2/2! + \theta^4/4! - \dots) + i(\theta - \theta^3/3! + \theta^5/5! - \dots)$, which is
$\cos(\theta) + i\sin(\theta)$. QED. If you prefer not to see the functions exp, sin, cos being defined
as series, you can see them as Taylor series
$f(x) = f(0) + f'(0)x + f''(0)x^2/2! + \dots = \sum_{k=0}^{\infty} (f^{(k)}(0)/k!)\,x^k$.
By differentiating the functions at 0, we see then the connection.
Theorem: $e^{i\pi} + 1 = 0$
This formula is often voted the “nicest formula in math”.² It combines “analysis”
in the form of $e$, “geometry” in the form of $\pi$, “algebra” in the form of $i$, the additive unit
0 and the multiplicative unit 1. The Euler formula also allows us to define the logarithm
of any nonzero complex number as $\log(z) = \log(|z|) + i\arg(z) = \log(r) + i\theta$. We see now that
going from $(x, y)$ to $(\log(r), \theta)$ is a very natural transformation from $\mathbb{C} \setminus 0$ to $\mathbb{C}$. The
exponential function $\exp : z \to e^z$ is a map from $\mathbb{C} \to \mathbb{C} \setminus 0$. It transforms the additive
structure on $\mathbb{C}$ to the multiplicative structure because $\exp(z + w) = \exp(z)\exp(w)$.
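The series argument for the Euler formula and its special case can be checked numerically. In this Python sketch, the function name `exp_series` and the cut-off of 40 terms are assumptions made for illustration; complex numbers are built into the language.

```python
import math


def exp_series(z, terms=40):
    """Partial sum 1 + z + z^2/2! + ... of the exponential series."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # turns z^k/k! into z^(k+1)/(k+1)!
    return total


theta = 0.7
euler_gap = abs(exp_series(1j * theta) - complex(math.cos(theta), math.sin(theta)))
special_case = abs(exp_series(1j * math.pi) + 1)
print(euler_gap, special_case)  # both are expected to be tiny

# exp turns addition into multiplication
z, w = 0.3 + 0.4j, -1.0 + 2.0j
product_gap = abs(exp_series(z + w) - exp_series(z) * exp_series(w))
print(product_gap)
```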
10.5. In three dimensions, we can look at cylindrical coordinates $(r, \theta, z)$. It is just
the polar coordinates in the first two coordinates. A cylinder of radius 2 for example
is given as $r = 2$. The torus $(3 + x^2 + y^2 + z^2)^2 - 16(x^2 + y^2) = 0$ can be written as
$3 + r^2 + z^2 = 4r$ or more intuitively as $(r - 2)^2 + z^2 = 1$, a circle in the $r$-$z$ plane.
10.6. The spherical coordinates are $(\rho, \theta, \phi)$, where $\rho = \sqrt{x^2 + y^2 + z^2}$. The angle $\theta$
is the polar angle as in cylindrical coordinates and $\phi$ is the angle between the vector
$[x, y, z]$ and the z-axis. We have $\cos(\phi) = [x, y, z] \cdot [0, 0, 1]/|[x, y, z]| = z/\rho$ and
$\sin(\phi) = |[x, y, z] \times [0, 0, 1]|/|[x, y, z]| = r/\rho$ so that $z = \rho\cos(\phi)$ and $r = \rho\sin(\phi)$ and therefore
$$ x = \rho\sin(\phi)\cos(\theta), \quad y = \rho\sin(\phi)\sin(\theta), \quad z = \rho\cos(\phi) . $$
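The spherical coordinate formulas translate directly into code. The following Python sketch uses hypothetical function names and passes the arguments in the order $(\rho, \phi, \theta)$; setting $\phi = 0$ at the origin is a convention chosen here, since the angles are not defined there.

```python
import math


def spherical_to_cartesian(rho, phi, theta):
    """phi is the angle to the z-axis, theta the polar angle in the xy-plane."""
    return (
        rho * math.sin(phi) * math.cos(theta),
        rho * math.sin(phi) * math.sin(theta),
        rho * math.cos(phi),
    )


def cartesian_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho) if rho > 0 else 0.0  # convention at the origin
    theta = math.atan2(y, x) % (2 * math.pi)
    return rho, phi, theta


point = (1.0, 1.0, -math.sqrt(2))
rho, phi, theta = cartesian_to_spherical(*point)
print(rho, phi, theta)  # close to 2, 3*pi/4, pi/4
print(spherical_to_cartesian(rho, phi, theta))  # recovers the point up to rounding
```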
10.9. If $f(z) = z^2 + c$ with $c = a + ib$, $z = x + iy$ is written as $f(x, y) = (x^2 - y^2 + a, 2xy + b)$,
then $df$ is a $2 \times 2$ rotation dilation matrix which corresponds to the complex
number $f'(z) = 2z$. The algebra $\mathbb{C}$ is the same as the algebra of rotation-dilation
matrices.
² D. Wells, Which is the most beautiful?, Mathematical Intelligencer, 1988
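10.10. A coordinate change $x \to f(x)$ in space is a map $f : \mathbb{R}^3 \to \mathbb{R}^3$. We compute
$$ f\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{bmatrix}, \quad
df\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix}
\partial_{x_1} f_1(x) & \partial_{x_2} f_1(x) & \partial_{x_3} f_1(x) \\
\partial_{x_1} f_2(x) & \partial_{x_2} f_2(x) & \partial_{x_3} f_2(x) \\
\partial_{x_1} f_3(x) & \partial_{x_2} f_3(x) & \partial_{x_3} f_3(x) \end{bmatrix} . $$
We wrote $x = (x_1, x_2, x_3)$. The determinant $\det(df)(x)$ is a volume distortion factor.
10.11. For spherical coordinates, we have
$$ f\begin{bmatrix} \rho \\ \phi \\ \theta \end{bmatrix} = \begin{bmatrix} \rho\sin(\phi)\cos(\theta) \\ \rho\sin(\phi)\sin(\theta) \\ \rho\cos(\phi) \end{bmatrix}, \quad
df\begin{bmatrix} \rho \\ \phi \\ \theta \end{bmatrix} = \begin{bmatrix}
\sin(\phi)\cos(\theta) & \rho\cos(\phi)\cos(\theta) & -\rho\sin(\phi)\sin(\theta) \\
\sin(\phi)\sin(\theta) & \rho\cos(\phi)\sin(\theta) & \rho\sin(\phi)\cos(\theta) \\
\cos(\phi) & -\rho\sin(\phi) & 0 \end{bmatrix} . $$
The distortion factor is $\det(df(\rho, \phi, \theta)) = \rho^2\sin(\phi)$.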
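The distortion factor of the spherical coordinate change can be checked without any symbolic computation. The Python sketch below approximates the Jacobian matrix by central differences; the helper names `jacobian` and `det3` and the step size `h` are assumptions for illustration.

```python
import math


def f(p):
    """Spherical coordinate change (rho, phi, theta) -> (x, y, z)."""
    rho, phi, theta = p
    return [
        rho * math.sin(phi) * math.cos(theta),
        rho * math.sin(phi) * math.sin(theta),
        rho * math.cos(phi),
    ]


def jacobian(func, p, h=1e-6):
    """Matrix of partial derivatives; entry [i][j] approximates d func_i / d p_j."""
    columns = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(func(plus), func(minus))])
    return [[columns[j][i] for j in range(len(p))] for i in range(len(columns[0]))]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


rho, phi, theta = 2.0, 0.9, 0.4
numeric = det3(jacobian(f, [rho, phi, theta]))
print(numeric, rho ** 2 * math.sin(phi))  # the two numbers should nearly agree
```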
Examples
10.12. The point $(x, y) = (-1, 1)$ corresponds to the complex number $z = -1 + i$.
It has the polar coordinates $(r, \theta) = (\sqrt{2}, 3\pi/4)$. As we have $z = re^{i\theta}$, we check
$z^2 = (-1 + i)(-1 + i) = -2i$ which agrees with $(re^{i\theta})^2 = r^2 e^{2i\theta} = 2e^{6\pi i/4}$.
10.13. a) $(x, y, z) = (1, 1, -\sqrt{2})$ corresponds to spherical coordinates $(\rho, \phi, \theta) = (2, 3\pi/4, \pi/4)$.
b) The point given in spherical coordinates as $(\rho, \phi, \theta) = (3, 0, \pi/2)$ is the point $(0, 0, 3)$,
because $\phi = 0$ puts the point on the positive z-axis.
10.14. a) The set of points with $r = 1$ in $\mathbb{R}^2$ form a circle.
b) The set of points with $\rho = 1$ in $\mathbb{R}^3$ form a sphere.
c) The set of points with spherical coordinates $\phi = 0$ are points on the positive z-axis.
d) The set of points with spherical coordinates $\theta = 0$ form a half plane in the xz-plane.
e) The set of points with $\rho = \cos(\phi)$ form a sphere. Indeed, by multiplying both sides
with $\rho$, we get $\rho^2 = \rho\cos(\phi)$ which means $x^2 + y^2 + z^2 = z$, which is after a completion
of the square equal to $x^2 + y^2 + (z - 1/2)^2 = 1/4$.
10.15. For A ∈ M (n, n), f (x) = Ax + b has df = A and distortion factor det(A).
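10.16. Find the Jacobian matrix and distortion factor of the map
$f(x_1, x_2) = (x_1^3 + x_2, x_2^2 - \sin(x_1))$. Answer: Write both the transformation and the Jacobian:
$$ f\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1^3 + x_2 \\ x_2^2 - \sin(x_1) \end{bmatrix}, \quad
df\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3x_1^2 & 1 \\ -\cos(x_1) & 2x_2 \end{bmatrix} . $$
The distortion factor is $\det(df(x)) = 6x_1^2 x_2 + \cos(x_1)$.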
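A hand computed Jacobian matrix like this one can be compared against difference quotients. In the following Python sketch, the function names and the sample point $(1, 2)$ are assumptions chosen for illustration.

```python
import math


def f(x1, x2):
    return (x1 ** 3 + x2, x2 ** 2 - math.sin(x1))


def df(x1, x2):
    """Jacobian matrix of f, entered by hand."""
    return [[3 * x1 ** 2, 1.0], [-math.cos(x1), 2 * x2]]


def distortion(x1, x2):
    (a, b), (c, d) = df(x1, x2)
    return a * d - b * c


def df_numeric(x1, x2, h=1e-6):
    """Central difference approximation of the Jacobian matrix."""
    d1 = [(p - m) / (2 * h) for p, m in zip(f(x1 + h, x2), f(x1 - h, x2))]
    d2 = [(p - m) / (2 * h) for p, m in zip(f(x1, x2 + h), f(x1, x2 - h))]
    return [[d1[0], d2[0]], [d1[1], d2[1]]]


print(df(1.0, 2.0), df_numeric(1.0, 2.0))
print(distortion(1.0, 2.0), 6 * 1.0 ** 2 * 2.0 + math.cos(1.0))
```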
Illustrations
10.17. Let $T : \mathbb{C} \to \mathbb{C}$ be defined as $z \to z^2 + c$ where $z = x + iy$. The set of all $c = a + ib$
for which the iterates $T^n(0)$ stay bounded is the Mandelbrot set $M$. For $c = -1$ we
get $T(0) = -1$, $T^2(0) = T(-1) = 0$ so that $T^n(0)$ is either $0$ or $-1$. The point $c = -1$
is in $M$. The point $c = 1$ gives $T(0) = 1$, $T^2(0) = 1^2 + 1 = 2$, $T^3(0) = 2^2 + 1 = 5$.
Induction shows that $T^n(0)$ keeps growing and is unbounded. The point $c = 1$ is not in $M$.
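The two orbits above can be reproduced with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and `stays_bounded` only tests finitely many iterates against the escape radius 2 (a standard criterion, not stated in the text), so a `True` answer is merely evidence and not a proof of membership in $M$.

```python
def orbit(c, steps):
    """The iterates T(0), T^2(0), ..., T^steps(0) of T(z) = z^2 + c."""
    z, points = 0j, []
    for _ in range(steps):
        z = z * z + c
        points.append(z)
    return points


def stays_bounded(c, steps=100, radius=2.0):
    """False as soon as an iterate leaves the disk of the given radius."""
    z = 0j
    for _ in range(steps):
        z = z * z + c
        if abs(z) > radius:
            return False
    return True


print(orbit(-1, 4))  # alternates between -1 and 0
print(orbit(1, 3))   # 1, 2, 5
print(stays_bounded(-1), stays_bounded(1))
```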
10.18. Let $T$ be the transformation in $\mathbb{R}^3$ which is given by
$T(x) = x^2 + c$, where $x^2$ has spherical coordinates $(\rho^2, 2\phi, 2\theta)$ if $x$ has spherical
coordinates $(\rho, \phi, \theta)$. It turns out that $T(x) = x^8 + c$ gives a nice analogue of the
Mandelbrot set, the Mandelbulb.
Homework
Problem 10.1: a) Find the polar coordinates of $(x, y) = (1, \sqrt{3})$.
b) Which point has the polar coordinates $(r, \theta) = (3, 4)$?
c) Find the spherical coordinates of the point $(x, y, z) = (1, 1, 1)$.
d) Which point has the spherical coordinates $(\rho, \theta, \phi) = (3, \pi/2, \pi/3)$?
MATH 22A
Lecture
11.1. A map $r : \mathbb{R}^m \to \mathbb{R}^n$ is called a parametrization. We have seen maps $r$
from $\mathbb{R}$ to $\mathbb{R}^n$, which were curves. Then we have seen maps $f : \mathbb{R}^n \to \mathbb{R}^n$ which
were coordinate changes. In each case we defined the Jacobian matrix $df(x)$. In
the case of the curve $r : \mathbb{R} \to \mathbb{R}^n$, it was the velocity $dr(t) = r'(t)$. In the case of
coordinate changes, the Jacobian matrix $df(x)$ was used to get the volume distortion
factor $|\det(df(x))| = \sqrt{\det(df^T df)}$. Today, we look at the case $m < n$, in particular
at $m = 2$, $n = 3$. As in the case of curves, we use the letter $r$ to describe the map.
The image of a map $r : R \subset \mathbb{R}^m \to \mathbb{R}^n$ is then an $m$-dimensional surface in $\mathbb{R}^n$. The
distortion factor $||dr||$ defined as $||dr||^2 = \det(dr^T dr)$ will be used later to compute
surface area.¹
11.2. We mostly discuss here the case $m = 2$ and $n = 3$, as we ourselves are made of
two-dimensional surfaces, like cells, membranes, skin or tissue. A map
$r : R \subset \mathbb{R}^2 \to \mathbb{R}^3$, written as
$$ r\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x(u, v) \\ y(u, v) \\ z(u, v) \end{bmatrix} , $$
defines a two-dimensional surface. In order to
save space, we also just write $r(u, v) = [x(u, v), y(u, v), z(u, v)]$. In computer graphics,
the map $r$ is called a uv-map. The uv-plane is where you draw a texture. The map $r$ places
it onto the surface. In geography, the map $r$ is called (surprise!) a map. Several maps
define an atlas. The curves $u \to r(u, v)$ and $v \to r(u, v)$ are called grid curves.
¹ Distinguish $||A||^2 = \det(A^T A)$ and $|A|^2 = \mathrm{tr}(A^T A)$ in $M(n, m)$. They only agree for $m = 1$.
11.3. The parametrization $r(\phi, \theta) = [\sin(\phi)\cos(\theta), \sin(\phi)\sin(\theta), \cos(\phi)]$ produces the
sphere $x^2 + y^2 + z^2 = 1$. The full sphere has $0 \le \phi \le \pi$, $0 \le \theta < 2\pi$. By modifying
the coordinates, we get an ellipsoid $r(\phi, \theta) = [a\sin(\phi)\cos(\theta), b\sin(\phi)\sin(\theta), c\cos(\phi)]$
satisfying $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$. By allowing $a, b, c$ to be functions of $\phi, \theta$ we get
“bumpy spheres” like $r(\phi, \theta) = (3 + \cos(3\phi)\sin(4\theta))[\sin(\phi)\cos(\theta), \sin(\phi)\sin(\theta), \cos(\phi)]$.
11.4. Planes are described by linear maps $r(x) = Ax + b$ with $A \in M(3, 2)$ and
$b \in M(3, 1)$. The Jacobian matrix is $dr = A$. Let $r_u, r_v$ be the two column vectors of $A$.
Actually, $r_u$ is a short cut for $\partial_u r(u, v)$, which is the velocity vector of the grid curve
$u \to r(u, v)$.
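11.5. An example is the parametrization
$$ r(u, v) = [u + v - 1, u - v + 3, 3u - 5v + 7] . $$
In this case
$$ b = \begin{bmatrix} -1 \\ 3 \\ 7 \end{bmatrix}, \quad r_u = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}, \quad
r_v = \begin{bmatrix} 1 \\ -1 \\ -5 \end{bmatrix} \quad \text{and} \quad
A = dr = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 3 & -5 \end{bmatrix} . $$
We see
$$ A^T A = \begin{bmatrix} 11 & -15 \\ -15 & 27 \end{bmatrix} , $$
which has determinant 72. We also have
$$ |r_u \times r_v|^2 = \left| \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix} \times \begin{bmatrix} 1 \\ -1 \\ -5 \end{bmatrix} \right|^2
= \left| \begin{bmatrix} -2 \\ 8 \\ -2 \end{bmatrix} \right|^2 = 72 . $$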
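The two computations of this example can be repeated in Python with plain lists. The helper names are hypothetical; matrices are stored as lists of rows.

```python
def gram(A):
    """Return A^T A for a matrix A given as a list of rows."""
    columns = list(zip(*A))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in columns] for ci in columns]


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


A = [[1, 1], [1, -1], [3, -5]]
g = gram(A)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]

r_u = [row[0] for row in A]
r_v = [row[1] for row in A]
normal = cross(r_u, r_v)
normal_length_squared = sum(x * x for x in normal)

print(g, det_g)                       # [[11, -15], [-15, 27]] 72
print(normal, normal_length_squared)  # [-2, 8, -2] 72
```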
11.6. The previous computation suggests a relation between the normal vector and
the fundamental form $g = dr^T dr$. In three dimensions, the distortion factor of a
parametrization $r : \mathbb{R}^2 \to \mathbb{R}^3$ can indeed always be rewritten using the cross product:
Examples
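11.7. For the unit sphere $r(\phi, \theta) = [\sin(\phi)\cos(\theta), \sin(\phi)\sin(\theta), \cos(\phi)]$ and $A = dr$:
$$ g = A^T A = \begin{bmatrix} \cos(\phi)\cos(\theta) & \cos(\phi)\sin(\theta) & -\sin(\phi) \\
-\sin(\phi)\sin(\theta) & \sin(\phi)\cos(\theta) & 0 \end{bmatrix}
\begin{bmatrix} \cos(\phi)\cos(\theta) & -\sin(\phi)\sin(\theta) \\ \cos(\phi)\sin(\theta) & \sin(\phi)\cos(\theta) \\
-\sin(\phi) & 0 \end{bmatrix} . $$
This is $g = \begin{bmatrix} 1 & 0 \\ 0 & \sin^2(\phi) \end{bmatrix}$ and $\sqrt{\det(g)} = \sin(\phi)$ is the distortion factor.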
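The first fundamental form of a parametrized surface can also be approximated numerically. The sketch below uses central differences for the partial derivatives $r_u$ and $r_v$; the function names, the step size `h` and the sample point are assumptions for illustration.

```python
import math


def sphere(phi, theta):
    return [math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi)]


def partials(r, u, v, h=1e-6):
    """Central difference approximations of r_u and r_v at (u, v)."""
    r_u = [(a - b) / (2 * h) for a, b in zip(r(u + h, v), r(u - h, v))]
    r_v = [(a - b) / (2 * h) for a, b in zip(r(u, v + h), r(u, v - h))]
    return r_u, r_v


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fundamental_form(r, u, v):
    """The 2 x 2 matrix g = dr^T dr built from dot products of r_u and r_v."""
    r_u, r_v = partials(r, u, v)
    return [[dot(r_u, r_u), dot(r_u, r_v)], [dot(r_v, r_u), dot(r_v, r_v)]]


def distortion(r, u, v):
    g = fundamental_form(r, u, v)
    return math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])


phi, theta = 0.8, 2.1
print(fundamental_form(sphere, phi, theta))         # close to [[1, 0], [0, sin(phi)^2]]
print(distortion(sphere, phi, theta), math.sin(phi))
```

The same functions work for any other parametrization $r(u, v)$ passed in place of `sphere`.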
11.8. An important class of surfaces are graphs $z = f(x, y)$. Their most natural
parametrization is $r(x, y) = [x, y, f(x, y)]$, where the map $r$ just lifts up the bottom part
to the elevated version. Examples are the elliptic paraboloid $r(x, y) = [x, y, x^2 + y^2]$
and the hyperbolic paraboloid $r(x, y) = [x, y, x^2 - y^2]$. We could of course have written
also $r(u, v) = [u, v, u^2 - v^2]$.
11.9. A surface of revolution is parametrized like $r(\theta, z) = [g(z)\cos(\theta), g(z)\sin(\theta), z]$.
Note that we can use any variables. In this case, $u = \theta$, $v = z$ are used. An example
is the cone $r(\theta, z) = [z\cos(\theta), z\sin(\theta), z]$ or the one-sheeted hyperboloid
$r(\theta, z) = [\sqrt{z^2 + 1}\cos(\theta), \sqrt{z^2 + 1}\sin(\theta), z]$.
11.10. The torus is in cylindrical coordinates given as $(r - 3)^2 + z^2 = 1$. We can
parametrize this using the polar angle $\theta$ and the polar angle $\phi$ centered at the center of the
circle as $r(\theta, \phi) = [(3 + \cos(\phi))\cos(\theta), (3 + \cos(\phi))\sin(\theta), \sin(\phi)]$. Both angles $\theta$ and $\phi$
go from 0 to $2\pi$. We see now also the relation with the toral coordinates.
11.11. The helicoid is the surface you see as a staircase or screw. The parametrization
is r(θ, p) = [p cos(θ), p sin(θ), θ]. How can we understand this? The key is to look at
grid curves. If p = 1, we get a curve r(θ) = [cos(θ), sin(θ), θ] which we had identified
as a helix. On the other hand, if you fix θ, then you get lines.
11.12. Side remark. The first fundamental form $g = dr^T dr$ is also called a
metric tensor. In Riemannian geometry one looks at a manifold $M$ equipped
with a metric $g$. The simplest case is when $g$ comes from a parametrization, as we did
here. In physics, we know that it is mass which deforms space-time. The quantity
$||g||^2 = \det(g)$ is a multiplicative analogue of $|g|^2 = \mathrm{tr}(g)$. For an invertible positive
definite square matrix $A$, we will later see the identity $\log\det(A) = \mathrm{tr}\log(A)$ which
illustrates how both determinant and trace are pivotal numerical quantities derived
from a matrix. Trace is additive because of $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$ and determinant
is multiplicative, $\det(AB) = \det(A)\det(B)$, as we will see later.
11.13. To summarize, we have seen so far that there are two fundamentally different
ways to describe a manifold. The first is to write it as a level surface f = c which is a
kernel of a map g(x) = f − c. A second is to write it as the image of some map r.
Illustration
Homework
Problem 11.1: Parametrize the upper part of the two sheeted hyperboloid
$x^2 + y^2 - z^2 = -1$, $z > 0$ in two different ways:
a) as a surface of revolution b) as a graph $z = f(x, y)$.
Problem 11.5: The matrix $g = dr^T dr$ is also called the first fundamental
form. If $r : \mathbb{R}^4 \to \mathbb{R}^4$ is a parametrization of space time then
$g$ is the space time metric tensor. The matrix entries of $g$ appear in
general relativity. Now for some reasons, physics folks use Greek symbols
to access matrix entries. They write $g_{\mu\nu}$ for the entry at row $\mu$ and
column $\nu$. This appears for example in the Einstein field equations
$$ R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu} . $$
Find the general solution of this equation. Just kidding. We just want
you to look up the equations and tell from each of the variables, what it
is called and whether it is a matrix, a scalar function or a constant.
MATH 22A
Seminar
12.1. As we are heading for our first midterm, let us organize the knowledge
accumulated so far. We can do that in various ways. One technique is a mind map. It
allows us to organize a vast amount of content in one picture and see connections which
might otherwise be missed. In Figure (1) we started to build such a mind map. There
are lots of branches still missing, even main ones. One could start also with one entry
like “matrix”, put it in the center, then build connections to other objects, definitions
or results.
Figure 1. [Mind map around the central node “Hourly”, with nodes Area, Trace, Parametrizations,
Matrices, Jacobian, Matrix Product, Surfaces, Integral, Spheres, Calculus, Quadrics, Proofs,
Derivative and Induction.]
12.2. What does this have to do with creativity? It turns out that in order to be
creative, one has to have a fertile base of knowledge. You can not assemble new
building blocks before possessing and understanding some already. In order to prove
the point that knowledge is important, one can also look at computer science and
especially the field of artificial intelligence (AI). One of the great pioneers in AI, Marvin
Minsky, once wrote: “the best way to solve a problem is to know how to solve it”. The
modern paradigms in machine learning confirm that in order to train an AI entity,
one has to feed in a lot of knowledge to work with. New models then come through
data fitting, gradient descent methods or more sophisticated algorithms.¹
Problem A: Make a mind map of the most important facts which have
appeared in the course so far. Do it on paper, a blackboard, whiteboard
or using software. Figure (1) makes a start. Refine it as much as possible.
12.3. To illustrate how difficult it can be to get a new solution, try the following
problem. Of course, if you know the answer or have seen it already, it can be easy. If
you have never seen it, it can be very hard. It is important that you try to find the
solution for at least half an hour, even if you should not be successful.
Problem B: Given 6 sticks of the same length 1, arrange them so that
you get 4 equilateral triangles of side length 1.
Homework
Exercises A-D are done in the seminar. This homework is due on Thursday. In all
the following questions, creativity is key. Your object has to be original. It is ok to
modify a known object. And of course, use technology so that one can admire your
creation.
Problem 12.1 Be creative and generate your own parametrized curve.
Remark: According to the Apocrypha of Krantz (page 79), part a) and b) were once
given as an algebraic geometry exam given here at Harvard. It is rumored that this
was then used also at the Harvard philosophy department, where (and this is creative
too), part c) was added. As far as we know, giving the homework assignment of writing
an exam assignment is a first! Heureka! We were creative.
Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018
LINEAR ALGEBRA AND VECTOR ANALYSIS
MATH 22A
Theorems
Cauchy-Schwarz
Pythagoras
Al Khashi
Uniqueness of Row reduction
The cross product formula
Image of transpose is perpendicular to kernel of matrix.
Cauchy Binet formula
For differentiable curves, arc length exists.
The equivalence of curvature formulas
Euler formula and special case
Distortion formula in space
Algorithms
Find angle between vectors
Find area of parallelogram
Find volume of parallelepiped
Row reduce a matrix
Get position from acceleration
Find vector perpendicular to a plane
Find length of a curve or matrix
Find curvature at some point
Compute with complex numbers
Switch between coordinate systems
Compute the distortion factor
Get distances between objects
Objects
Matrices
Vectors
Curves
Linear manifolds
Quadratic manifolds
Kernel of map
Parametrized surfaces
Differentiation
Velocity
Acceleration
The Frenet TNB frame
Jacobian matrix
Curvature
Integration
Integrate to get arc length.
Integrate to get position from velocity etc.
Integration technique: substitution
Integration technique: partial fractions
Integration technique: simplification
Coordinate systems
Cartesian coordinates
Polar coordinates
Cylindrical coordinates
Spherical coordinates
General coordinate change
Parametrized Surfaces
Spheres
Surfaces of revolution
Graphs
Planes
People
Mandelbrot
Hamilton
Descartes
Cauchy
Binet
Schwarz
Euler
Heine
Cantor
Bolzano
Archimedes
Newton
LINEAR ALGEBRA AND VECTOR ANALYSIS
MATH 22A
Name:
Problems
MATH 22A
Lecture
14.1. A partial differential equation is a rule which combines the rates of change of
different variables. Our lives are affected by partial differential equations: the Maxwell
equations describe electric and magnetic fields E and B. Their motion leads to the
propagation of light. The Einstein field equations relate the metric tensor g with the
mass tensor T. The Schrödinger equation tells how quantum particles move. Laws
like the Navier-Stokes equations govern the motion of fluids and gases and especially
the currents in the ocean or the winds in the atmosphere. Partial differential equations
appear also in unexpected places like in finance, where for example, the Black-Scholes
equation relates the prices of options to time and stock prices.
14.2. If $f(x, y)$ is a function of two variables, we can differentiate $f$ with respect to
either $x$ or $y$. We just write $f_x(x, y)$ for $\partial_x f(x, y)$. For example, for $f(x, y) = x^3 y + y^2$,
we have $f_x(x, y) = 3x^2 y$ and $f_y(x, y) = x^3 + 2y$. If we first differentiate with respect to
$x$ and then with respect to $y$, we write $f_{xy}(x, y)$. If we differentiate twice with respect
to $y$, we write $f_{yy}(x, y)$. An equation for an unknown function $f$ in which partial
derivatives with respect to at least two different variables appear is called a partial
differential equation (PDE). If only the derivative with respect to one variable appears,
one speaks of an ordinary differential equation (ODE). An example of a PDE is
$f_x^2 + f_y^2 = f_{xx} + f_{yy}$, an example of an ODE is $f'' = f^2 - f'$. It is important to realize
that it is a function we are looking for, not a number. The ordinary differential equation
$f' = 3f$ for example is solved by the functions $f(t) = Ce^{3t}$. If we prescribe an initial
value like $f(0) = 7$, then there is a unique solution $f(t) = 7e^{3t}$. The KdV partial
differential equation $f_t + 6 f f_x + f_{xxx} = 0$ is solved by (you guessed it) $2\,\mathrm{sech}^2(x - 4t)$.
This is one of many solutions. In that case they are called solitons, nonlinear waves.
Korteweg-de Vries (KdV) is an icon in a mathematical field called integrable systems,
which leads to insight into ongoing research, for example about rogue waves in the ocean.
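The soliton claim can be checked numerically. The following is a minimal sketch, not part of the notes: it approximates $f_t + 6 f f_x + f_{xxx}$ with central differences (the step size `h` and the sample points are arbitrary choices) and should give a residual near zero for $2\,\mathrm{sech}^2(x - 4t)$.

```python
import math

def soliton(t, x):
    """Candidate solution 2 sech^2(x - 4t) of f_t + 6 f f_x + f_xxx = 0."""
    return 2.0 / math.cosh(x - 4.0 * t) ** 2

def kdv_residual(f, t, x, h=1e-3):
    """Central-difference estimate of f_t + 6 f f_x + f_xxx at (t, x)."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    f_xxx = (f(t, x + 2 * h) - 2 * f(t, x + h)
             + 2 * f(t, x - h) - f(t, x - 2 * h)) / (2 * h ** 3)
    return f_t + 6 * f(t, x) * f_x + f_xxx

# The residual is small for the soliton at a few sample points.
for t, x in [(0.0, -0.5), (0.1, 0.3), (0.25, 1.0)]:
    print(t, x, kdv_residual(soliton, t, x))
```

A wave with the same profile but the wrong speed, such as $2\,\mathrm{sech}^2(x - t)$, produces a residual of order one, which shows that the speed 4 is tied to the amplitude 2.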
14.3. We say $f \in C^1(\mathbb{R}^2)$ if both $f_x$ and $f_y$ are continuous functions of two variables
and $f \in C^2(\mathbb{R}^2)$ if all of $f_{xx}$, $f_{yy}$, $f_{xy}$ and $f_{yx}$ are continuous functions. The next theorem is
called the Clairaut theorem. It deals with the partial differential equation $f_{xy} = f_{yx}$.
The proof demonstrates the technique of proof by contradiction. We will look at this technique
a bit more in the proof seminar.
Illustration
14.6. In many cases, one of the variables is time, for which we use the letter $t$, and
we keep $x$ as the space variable. The differential equation $f_t(t, x) = f_x(t, x)$ is called
the transport equation. What are the solutions if $f(0, x) = g(x)$? Here is a cool
derivation: if $Df = f'$ is the derivative, 1 we can build operators like
$(D + D^2 + 4D^4) f = f' + f'' + 4f''''$. The transport equation is now $f_t = Df$. Now, as you know from calculus,
the only solution of $f' = af$, $f(0) = b$ is $be^{at}$. If we boldly replace the number $a$
with the operator $D$, we get $f' = Df$ and its solution
$e^{Dt} g(x) = (1 + Dt + D^2 t^2/2! + \cdots) g(x) = g(x) + g'(x) t + g''(x) t^2/2! + \cdots$.
By the Taylor formula, this is equal to $g(x + t)$. You should actually remember Taylor
as $g(x + t) = e^{Dt} g(x)$. We have derived for $g(x) = f(0, x)$ in $C^1(\mathbb{R}^2)$:
Theorem: $f_t = f_x$ is solved by $f(t, x) = g(x + t)$.
Proof. We can ignore the derivation and verify this very quickly: the function satisfies
$f(0, x) = g(x)$ and $f_t(t, x) = f_x(t, x)$. QED.
1 We usually write $df$ for the derivative but $D$ tells it is an operator. $D$ also stands for Dirac.
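Both the operator derivation and the quick verification can be tried on a concrete profile. In the sketch below the choice $g = \sin$ is an assumption made only because its derivatives cycle through sin, cos, -sin, -cos; the partial sums of $e^{Dt} g(x)$ should approach $g(x + t)$, and a finite-difference check should give $f_t - f_x \approx 0$.

```python
import math

def shift_by_series(x, t, terms=30):
    """Partial sum of e^{Dt} g(x) = sum_k g^(k)(x) t^k / k! for g = sin."""
    derivs = [math.sin(x), math.cos(x), -math.sin(x), -math.cos(x)]  # g, g', g'', g'''
    return sum(derivs[k % 4] * t ** k / math.factorial(k) for k in range(terms))

def f(t, x):
    """Claimed solution f(t, x) = g(x + t) of f_t = f_x, here with g = sin."""
    return math.sin(x + t)

def transport_residual(t, x, d=1e-5):
    """Central-difference estimate of f_t - f_x at (t, x)."""
    f_t = (f(t + d, x) - f(t - d, x)) / (2 * d)
    f_x = (f(t, x + d) - f(t, x - d)) / (2 * d)
    return f_t - f_x

print(shift_by_series(0.3, 0.7), math.sin(0.3 + 0.7))  # the two values agree
print(transport_residual(0.4, 1.1))                     # close to 0
```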
14.7. Another example of a partial differential equation is the wave equation
$f_{tt} = f_{xx}$. We can write this as $(\partial_t + D)(\partial_t - D) f = 0$. One way to solve this is by looking at
$(\partial_t - D) f = 0$. This is the transport equation $f_t = f_x$, with solutions of the form $\phi(x + t)$. We can also have
$(\partial_t + D) f = 0$, which means $f_t = -f_x$, leading to solutions of the form $\psi(x - t)$. We see that every combination
$a\phi(x + t) + b\psi(x - t)$ with constants $a, b$ is a solution. Fixing the functions and constants so that
$f(0, x) = g(x)$ and $f_t(0, x) = h'(x)$ gives the following d'Alembert solution. It
requires $g, h \in C^2(\mathbb{R})$.
Theorem: $f_{tt} = f_{xx}$ is solved by $f(t, x) = \frac{g(x+t) + g(x-t)}{2} + \frac{h(x+t) - h(x-t)}{2}$.
14.8. Proof. Just verify directly that this indeed is a solution and that $f(0, x) = g(x)$
and $f_t(0, x) = h'(x)$, so that $h$ is an antiderivative of the initial velocity. Intuitively, if we throw a stone into a narrow water way, then the
waves move to both sides.
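The direct verification can also be done numerically. This is a sketch with assumed sample data $g(x) = e^{-x^2}$ and $h(x) = \sin(x)$, neither of which is from the notes. Note that differentiating the formula in $t$ at $t = 0$ gives $h'(x)$, so the code compares the initial velocity with $\cos(x)$.

```python
import math

def g(x):
    return math.exp(-x * x)      # assumed initial position profile

def h(x):
    return math.sin(x)           # assumed antiderivative of the initial velocity

def f(t, x):
    """d'Alembert formula (g(x+t) + g(x-t))/2 + (h(x+t) - h(x-t))/2."""
    return (g(x + t) + g(x - t)) / 2 + (h(x + t) - h(x - t)) / 2

def second_diff(func, s, d=1e-3):
    """Central second difference approximating func''(s)."""
    return (func(s + d) - 2 * func(s) + func(s - d)) / d ** 2

def wave_residual(t, x):
    """Estimate of f_tt - f_xx at (t, x)."""
    return second_diff(lambda s: f(s, x), t) - second_diff(lambda s: f(t, s), x)

def initial_velocity(x, d=1e-5):
    """Estimate of f_t(0, x)."""
    return (f(d, x) - f(-d, x)) / (2 * d)

print(wave_residual(0.7, 0.2))                  # close to 0
print(initial_velocity(0.4), math.cos(0.4))     # both close to h'(0.4)
```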
14.9. The partial differential equation $f_t = f_{xx}$ is called the heat equation. Its
solution involves the normal distribution
$N(m, s)(x) = e^{-(x-m)^2/(2s^2)} / \sqrt{2\pi s^2}$
in probability theory. The number $m$ is the average and $s$ is the standard deviation.
14.10. If the initial heat $g(x) = f(0, x)$ at time $t = 0$ is continuous and zero outside a
bounded interval $[a, b]$, then
Theorem: $f_t = f_{xx}$ is solved by $f(t, x) = \int_a^b g(m) N(m, \sqrt{2t})(x) \, dm$.
Proof. For every fixed $m$, the function $N(m, \sqrt{2t})(x)$ solves the heat equation.
f = PDF[NormalDistribution[m, Sqrt[2 t]], x]; Simplify[D[f, t] == D[f, {x, 2}]]
Every Riemann sum approximation $(1/n) \sum_{k=1}^n g(m_k)$ of the integral of $g$ defines a function
$f_n(t, x) = (1/n) \sum_{k=1}^n g(m_k) N(m_k, \sqrt{2t})(x)$ which solves the heat equation. So does
$f(t, x) = \lim_{n \to \infty} f_n(t, x)$. To check $f(0, x) = g(x)$, we need $\int_{-\infty}^{\infty} N(m, s)(x) \, dx = 1$
and $\int_{-\infty}^{\infty} h(x) N(m, s)(x) \, dx \to h(m)$ for any continuous $h$ and $s \to 0$, which is proven later.
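The two facts used in the proof can be checked numerically. The sketch below uses finite differences and a midpoint rule (step sizes and integration window are arbitrary choices) to test that $N(m, \sqrt{2t})(x) = e^{-(x-m)^2/(4t)}/\sqrt{4\pi t}$ satisfies $f_t = f_{xx}$ and that its integral over $x$ is 1.

```python
import math

def heat_kernel(t, x, m=0.0):
    """N(m, sqrt(2t))(x) = exp(-(x - m)^2 / (4t)) / sqrt(4 pi t)."""
    return math.exp(-(x - m) ** 2 / (4 * t)) / math.sqrt(4 * math.pi * t)

def heat_residual(t, x, m=0.0, d=1e-4):
    """Finite-difference estimate of f_t - f_xx at (t, x)."""
    f_t = (heat_kernel(t + d, x, m) - heat_kernel(t - d, x, m)) / (2 * d)
    f_xx = (heat_kernel(t, x + d, m) - 2 * heat_kernel(t, x, m)
            + heat_kernel(t, x - d, m)) / d ** 2
    return f_t - f_xx

def total_heat(t, m=0.0, width=40.0, n=4000):
    """Midpoint rule for the integral of the kernel over [m - width/2, m + width/2]."""
    dx = width / n
    return sum(heat_kernel(t, m - width / 2 + (k + 0.5) * dx, m) for k in range(n)) * dx

print(heat_residual(0.5, 0.2, m=0.3))  # close to 0
print(total_heat(0.5))                 # close to 1
```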
14.11. For functions of three variables $f(x, y, z)$ one can look at the partial differential
equation $\Delta f(x, y, z) = f_{xx} + f_{yy} + f_{zz} = 0$. It is called the Laplace equation and $\Delta$ is
called the Laplace operator. The operator appears also in one of the most important
partial differential equations, the Schrödinger equation
$i\hbar f_t = Hf = -\frac{\hbar^2}{2m} \Delta f + V(x) f$,
where $\hbar = h/(2\pi)$ is a scaled Planck constant, $V(x)$ is the potential depending
on the position $x$, and $m$ is the mass. For $i\hbar f_t = Pf$ with $P = -i\hbar D$, the
solution $f(x - t)$ is forward translation. The operator $P$ is the momentum operator
in quantum mechanics. The Taylor formula tells that $P$ generates translation.
Homework
Problem 14.2: We have seen that $f(t, x) = N(m, \sqrt{2t})(x) = e^{-(x-m)^2/(4t)} / \sqrt{4\pi t}$
solves the heat equation $f_t = f_{xx}$. Verify more generally that
$e^{-(x-m)^2/(at)} / \sqrt{a\pi t}$
solves the heat equation
$f_t = (a/4) f_{xx}$.
MATH 22A
Seminar
15.1. We have already seen one proof technique, the “method of induction.” Other
proofs were done either by direct computations or by combining already known
theorems or inequalities. Today, we look at two new and fundamentally different
proof techniques. The first is the method “by contradiction.” The second method
is the “method of deformation.” Both methods are illustrated by a theorem.
15.2. The first theorem is one of the earliest results in mathematics. It is the Hippasus
theorem from 500 BC. It was a result which shocked the Pythagoreans so much
that Hippasus got killed for its discovery. That is at least what the rumors tell.
Proof. Assume the statement is false and the diagonal has rational length $p/q$. Then
by the Pythagorean theorem $2 = p^2/q^2$ or $2q^2 = p^2$. By the fundamental theorem of arithmetic,
the left hand side has an odd number of factors 2, the right hand side an even
number. This is a contradiction. The assumption must have been wrong.
15.3.
Problem A: Prove that the cube root of 2 is irrational.
Figure 1. $\sqrt{2}$ is irrational. Start by assuming the side length and
diagonal of the large yellow square are integers. Conclude that for the
strictly smaller orange square, the side length and diagonal are integers.
15.4. Note that the proof relied on the fundamental theorem of arithmetic which
assured that every integer has a unique prime factorization.
Problem B: Figure (1) is a geometric proof by contradiction which does
not need the fundamental theorem of arithmetic. Complete the proof.
15.5. Proofs by contradiction can be dangerous. A flawed proof can “assume the contrary,
mess around with arguments, make a mistake somewhere and get a contradiction.
QED”. Better than a proof by contradiction is a constructive proof.
15.6. Here is a non-constructive proof which is amazing:
Theorem: There exist two irrational $x, y$ such that $x^y$ is rational.
Proof: there are two possibilities. Either $z = \sqrt{2}^{\sqrt{2}}$ is rational or not. In the first
case, we have found an example where $x = y = \sqrt{2}$. In the second case, take $x = z$
and take $y = \sqrt{2}$. Now $x^y = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^2 = 2$ is rational and we have an example.
15.7. The second proof technique we see today is a deformation argument. To
illustrate it, take a closed $C^2$ curve in $\mathbb{R}^2$ without self intersections. We have defined
its curvature $\kappa(t)$ already. For curves in $\mathbb{R}^2$, define the signed curvature $K(t)$.
If the curve is parametrized so that $|r'(t)| = 1$ and $T(t) = [\cos(\alpha(t)), \sin(\alpha(t))]$, then
$K(t) = \alpha'(t)$. Note that $\kappa(t) = |T'(t)| = |[-\sin(\alpha(t)), \cos(\alpha(t))] \alpha'(t)| = |K(t)|$. Now
if we have a curve $r : [a, b] \to \mathbb{R}^2$, we can define the total curvature as $\int_a^b K(t) \, dt$.
By the fundamental theorem of calculus, this total curvature is the change of the
angle $\alpha(b) - \alpha(a)$. Now, if the curve is closed, the initial and final angles have to differ
by a multiple of $2\pi$. The Hopf Umlaufsatz tells that
Theorem: The total curvature of a simple closed curve is $2\pi$ or $-2\pi$.
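The theorem can be tested on an ellipse $r(t) = [a\cos(\sigma t), b\sin(\sigma t)]$ with $\sigma = \pm 1$ fixing the orientation. This parametrization is not unit speed, so the sketch uses the standard fact (not stated in the notes) that the total curvature equals $\int (x'y'' - y'x'')/|r'|^2 \, dt$ for any regular parametrization. The ellipse and the midpoint rule are assumptions chosen for illustration.

```python
import math

def total_curvature_ellipse(a, b, sigma=1, n=2000):
    """Midpoint rule for the integral of (x'y'' - y'x'') / |r'|^2 over 0 <= t <= 2 pi."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        c, s = math.cos(sigma * t), math.sin(sigma * t)
        xp, yp = -a * sigma * s, b * sigma * c   # r'(t)
        xpp, ypp = -a * c, -b * s                # r''(t), using sigma**2 == 1
        total += (xp * ypp - yp * xpp) / (xp * xp + yp * yp) * dt
    return total

print(total_curvature_ellipse(3.0, 1.0), 2 * math.pi)            # counterclockwise
print(total_curvature_ellipse(3.0, 1.0, sigma=-1), -2 * math.pi)  # clockwise
```

Reversing the orientation flips the sign, which is why the theorem allows both $2\pi$ and $-2\pi$.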
Figure 2. Four simple closed curves for which it is not obvious that
the total curvature is 2π.
15.8.
Problem C: a) Why is the total curvature not always $2\pi$?
b) Formulate what happens in Figure (3).
Homework
Exercises A-C are done in the seminar. This homework is due on Tuesday.
Problem 15.1 Prove by contradiction that $\sqrt{12}$ is irrational.
MATH 22A
Lecture
16.1. Given a differentiable function $r : \mathbb{R}^m \to \mathbb{R}^p$, its derivative at $x$ is the Jacobian
matrix $dr(x) \in M(p, m)$. If $f : \mathbb{R}^p \to \mathbb{R}^n$ is another function with $df(y) \in M(n, p)$,
we can combine them and form $f \circ r(x) = f(r(x)) : \mathbb{R}^m \to \mathbb{R}^n$. The matrices $df(y) \in M(n, p)$
and $dr(x) \in M(p, m)$ combine to the matrix product $df \, dr$ at a point. This
matrix is in $M(n, m)$. The multi-variable chain rule is:
Theorem: $d(f \circ r)(x) = df(r(x)) \, dr(x)$.
16.2. For $m = n = p = 1$, the single variable calculus case, we have $df(x) = f'(x)$
and $(f \circ r)'(x) = f'(r(x)) r'(x)$. In general, $df$ is now a matrix rather than a number.
By checking a single matrix entry, we reduce to the case $n = m = 1$. In that case,
$f : \mathbb{R}^p \to \mathbb{R}$ is a scalar function. While $df$ is a row vector, we define the column
vector $\nabla f = df^T = [f_{x_1}, f_{x_2}, \ldots, f_{x_p}]^T$. If $r : \mathbb{R} \to \mathbb{R}^p$ is a curve, we write
$r'(t) = [x_1'(t), \cdots, x_p'(t)]^T$ instead of $dr(t)$. The symbol $\nabla$ is addressed also as “nabla”. 1 The
special case $n = m = 1$ is:
Theorem: $\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$.
which is (1D chain rule) in the limit $h \to 0$ the sum $f_{x_1}(x) x_1'(t) + \cdots + f_{x_p}(x) x_p'(t)$.
16.4. Proof of the general case: Let $h = f \circ r$. The entry $ij$ of the Jacobian matrix
$dh(x)$ is $dh_{ij}(x) = \partial_{x_j} h_i(x) = \partial_{x_j} f_i(r(x))$. The case of the entry $ij$ reduces with $t = x_j$
and $h_i = f$ to the case when $r(t)$ is a curve and $f(x)$ is a scalar function. This is the
case we have proven already.
1 Etymology tells that the symbol is inspired by an Egyptian or Phoenician harp.
Example
16.5. Assume a ladybug walks on a circle $r(t) = [\cos(t), \sin(t)]^T$ and $f(x, y) = x^2 - y^2$ is the
temperature at the position $(x, y)$. Then $f(r(t))$ is the temperature the ladybug experiences
at time $t$ and $\frac{d}{dt} f(r(t))$ is the rate of change of that temperature.
We can write $f(r(t)) = \cos^2(t) - \sin^2(t) = \cos(2t)$.
Now, $\frac{d}{dt} f(r(t)) = -2\sin(2t)$. The
gradient of $f$ and the velocity are $\nabla f(x, y) = [2x, -2y]^T$, $r'(t) = [-\sin(t), \cos(t)]^T$. Now
$\nabla f(r(t)) \cdot r'(t) = [2\cos(t), -2\sin(t)]^T \cdot [-\sin(t), \cos(t)]^T = -4\cos(t)\sin(t) = -2\sin(2t)$.
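The ladybug computation can be replayed numerically. This is a small sketch comparing the chain-rule side $\nabla f(r(t)) \cdot r'(t)$ with a finite-difference derivative of $t \mapsto f(r(t))$ and with the closed form $-2\sin(2t)$; the step size `d` is an arbitrary choice.

```python
import math

def r(t):
    return (math.cos(t), math.sin(t))       # position on the circle

def r_prime(t):
    return (-math.sin(t), math.cos(t))      # velocity

def f(x, y):
    return x * x - y * y                    # temperature

def grad_f(x, y):
    return (2 * x, -2 * y)

def chain_rule_side(t):
    """grad f(r(t)) . r'(t)"""
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy

def direct_derivative(t, d=1e-6):
    """Central difference of t -> f(r(t))."""
    return (f(*r(t + d)) - f(*r(t - d))) / (2 * d)

t = 0.8
print(chain_rule_side(t), direct_derivative(t), -2 * math.sin(2 * t))
```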
Illustrations
16.6. The case $n = m = 1$ is extremely important. The chain rule $\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$
tells that the rate of change of the potential energy $f(r(t))$ at the
position $r(t)$ is the dot product of the force $F = \nabla f(r(t))$ at the point and the velocity
with which we move. The right hand side is power = force times velocity. We will
use this later in the fundamental theorem of line integrals.
16.7. If $f, g : \mathbb{R}^m \to \mathbb{R}^m$, then $f \circ g$ is again a map from $\mathbb{R}^m$ to $\mathbb{R}^m$. We can also iterate
a map like $x \to f(x) \to f(f(x)) \to f(f(f(x))) \ldots$. The derivative $df^n(x)$ is by the
chain rule the product $df(f^{n-1}(x)) \cdots df(f(x)) df(x)$ of Jacobian matrices. The number
$\lambda(x) = \limsup_{n \to \infty} (1/n) \log(|df^n(x)|)$ is called the Lyapunov exponent of the map $f$
at the point $x$. It measures the amount of chaos, the “sensitive dependence on initial
conditions” of $f$. These numbers are hard to estimate mathematically. Already for
simple examples like the Chirikov map $f([x, y]) = [2x - y + c\sin(x), x]$, one can
measure positive entropy $S(c)$. A conjecture of Sinai tells that the entropy
of the map is positive for large $c$. 2 Measurements show that this entropy
$S(c) = \int_0^{2\pi} \int_0^{2\pi} \lambda(x, y) \, dx dy / (4\pi^2)$ satisfies $S(c) \geq \log(c/2)$. The conjecture is still open.
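A rough numerical estimate of the Lyapunov exponent for the Chirikov map can be sketched as follows. Two assumptions are made here: the growth of $|df^n(x)|$ is estimated by the growth of a single tangent vector pushed forward by the Jacobians $df = [[2 + c\cos(x), -1], [1, 0]]$, and the finite average over `n` steps stands in for the lim sup. The starting point and `n` are arbitrary choices.

```python
import math

def lyapunov_estimate(c, x=1.0, y=2.0, n=20000):
    """Average exponential growth rate of a tangent vector along the orbit of (x, y)."""
    two_pi = 2 * math.pi
    u, v = 1.0, 0.0                      # tangent vector
    log_growth = 0.0
    for _ in range(n):
        # apply df(x, y) = [[2 + c cos(x), -1], [1, 0]] to (u, v)
        u, v = (2 + c * math.cos(x)) * u - v, u
        norm = math.hypot(u, v)
        log_growth += math.log(norm)     # accumulate log of the stretching factor
        u, v = u / norm, v / norm        # renormalize to avoid overflow
        # iterate the map on the torus
        x, y = (2 * x - y + c * math.sin(x)) % two_pi, x
    return log_growth / n

print(lyapunov_estimate(0.0))    # close to 0: no chaos for c = 0
print(lyapunov_estimate(10.0))   # compare with log(10/2)
print(math.log(10.0 / 2))
```

Such a measurement at one initial point only hints at the integral $S(c)$; averaging over many starting points would be needed to approximate it.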
16.8. If $H(x, y)$ is a function called the Hamiltonian and $x'(t) = H_y(x, y)$,
$y'(t) = -H_x(x, y)$, then $\frac{d}{dt} H(x(t), y(t)) = 0$. This can be interpreted as energy conservation.
We see that a Hamiltonian differential equation always preserves the energy. For
the pendulum, $H(x, y) = y^2/2 - \cos(x)$, we have $x' = y$, $y' = -\sin(x)$ or $x'' = -\sin(x)$.
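Energy conservation for the pendulum can be observed numerically. The sketch below integrates $x' = y$, $y' = -\sin(x)$ with the classical Runge-Kutta method, which is an assumed choice of integrator (the notes do not prescribe one), and reports how much $H$ changes along the computed orbit. The drift is due to discretization error only.

```python
import math

def H(x, y):
    return y * y / 2 - math.cos(x)       # pendulum energy

def field(x, y):
    """Hamiltonian vector field (H_y, -H_x) = (y, -sin x)."""
    return y, -math.sin(x)

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for (x', y') = field(x, y)."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + dt / 2 * k1x, y + dt / 2 * k1y)
    k3x, k3y = field(x + dt / 2 * k2x, y + dt / 2 * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

def energy_drift(x=1.0, y=0.0, dt=0.01, steps=1000):
    """|H(end) - H(start)| along the numerically computed orbit."""
    e0 = H(x, y)
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
    return abs(H(x, y) - e0)

print(energy_drift())   # tiny compared with the energy scale of order 1
```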
2 To generate orbits, see http://www.math.harvard.edu/~knill/technology/chirikov/.
Figure 2. The map $f([x, y]) = [x^2 - x/2 - y, x]$ is a Hénon map. We
see some orbits. The map $f([x, y]) = [2x - y + 4\sin(x), x]$ on the right
appeared in the first hourly. The torus $\mathbb{T}^2 = \mathbb{R}^2/(2\pi\mathbb{Z})^2$ is filled with a
blue “stochastic sea” containing red “stable islands”.
16.9. The chain rule is useful to get derivatives of inverse functions. For example,
$1 = \frac{d}{dx} x = \frac{d}{dx} \sin(\arcsin(x)) = \cos(\arcsin(x)) \arcsin'(x)$
which then gives $\arcsin'(x) = 1/\sqrt{1 - \sin^2(\arcsin(x))} = 1/\sqrt{1 - x^2}$.
16.10. Assume $f(x, y) = x^3 y + x^5 y^4 - 2 - \sin(x - y) = 0$ is a curve. We can not solve
for $y$. Still, we can assume $f(x, y(x)) = 0$. Differentiation using the chain rule gives
$f_x(x, y(x)) + f_y(x, y(x)) y'(x) = 0$. Therefore
$y'(x) = -\frac{f_x(x, y(x))}{f_y(x, y(x))}$.
In the above example, the point $(x, y) = (1, 1)$ is on the curve. Now $f_x(1, 1) = 3 + 5 - 1 = 7$
and $f_y(1, 1) = 1 + 4 + 1 = 6$. So, $y'(1) = -7/6$. This is called implicit
differentiation. We could compute with it the derivative of a function which was not
known.
16.11. The implicit function theorem assures that a differentiable implicit function
$y = g(x)$ exists near a root $(a, b)$ of a differentiable function $f(x, y)$, provided $f_y(a, b) \neq 0$.
P.S. We can get the root of a function $h$ by applying Newton steps $T(y) = y - h(y)/h'(y)$.
Taylor (seen in the next class) shows the error is squared in every step. The Newton
step $T(y) = y - dh(y)^{-1} h(y)$ works also in arbitrary dimensions. One can prove the
implicit function theorem by establishing that $T$ is a contraction near the root
and then using the Banach fixed point theorem to get a fixed point of $T$, which
is a root of $h$ because $\mathrm{Id} - T = dh^{-1} h$.
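Implicit differentiation and the Newton step can be combined in a small experiment for the curve $f(x, y) = x^3 y + x^5 y^4 - 2 - \sin(x - y) = 0$. In this sketch, `solve_y` applies Newton steps to $h(y) = f(x, y)$ for fixed $x$ (the starting guess and the number of steps are arbitrary choices), and the slope of the resulting implicit function is compared with $-f_x/f_y$.

```python
import math

def f(x, y):
    return x ** 3 * y + x ** 5 * y ** 4 - 2 - math.sin(x - y)

def f_x(x, y):
    return 3 * x ** 2 * y + 5 * x ** 4 * y ** 4 - math.cos(x - y)

def f_y(x, y):
    return x ** 3 + 4 * x ** 5 * y ** 3 + math.cos(x - y)

def solve_y(x, y_guess=1.0, steps=20):
    """Newton steps T(y) = y - h(y)/h'(y) for h(y) = f(x, y) with x held fixed."""
    y = y_guess
    for _ in range(steps):
        y = y - f(x, y) / f_y(x, y)
    return y

def slope_numeric(x, d=1e-5):
    """Central difference of the implicitly defined function y(x)."""
    return (solve_y(x + d) - solve_y(x - d)) / (2 * d)

def slope_implicit(x):
    """Implicit differentiation: y'(x) = -f_x / f_y on the curve."""
    y = solve_y(x)
    return -f_x(x, y) / f_y(x, y)

print(solve_y(1.0))                                   # the point (1, 1) is on the curve
print(slope_implicit(1.0), slope_numeric(1.0), -7 / 6)
```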
[Figure: the graphs of h(x) and of x-T(x).]
Homework
Problem 16.4: Consider the Hénon map f ([x, y]T ) = [x2 − x4 − y, x]T .
Compute either d(f ◦ f )([1, 1]T ) or df (f ([1, 1]T ))df ([1, 1]T ). The chain rule
tells it is the same matrix.
Figure 4. Some orbits of the Henon map f ([x, y]) = [x2 − x4 − y, x].
MATH 22A
Lecture
17.1. Given a function $f : \mathbb{R}^m \to \mathbb{R}^n$, its derivative $df(x)$ is the Jacobian matrix. For
every $x \in \mathbb{R}^m$, we can use the matrix $df(x)$ and a vector $v \in \mathbb{R}^m$ to get
$D_v f(x) = df(x) v \in \mathbb{R}^n$. For fixed $v$, this defines a map $x \in \mathbb{R}^m \to df(x) v \in \mathbb{R}^n$, like the original
$f$. Because $D_v$ is a map on $X = \{$ all functions from $\mathbb{R}^m \to \mathbb{R}^n \}$, one calls it an
operator. The Taylor formula $f(x + t) = e^{Dt} f(x)$ holds in arbitrary dimensions:
Theorem: $f(x + tv) = e^{D_v t} f(x) = f(x) + \frac{D_v t f(x)}{1!} + \frac{D_v^2 t^2 f(x)}{2!} + \ldots$
17.2. Proof. It is the single variable Taylor formula on the line $x + tv$. The directional derivative
$D_v f$ is there the usual derivative, as $\lim_{t \to 0} [f(x + tv) - f(x)]/t = D_v f(x)$. Technically,
we need the sum to converge as well, as it does for functions built from polynomials, sin, cos, exp.
17.3. The Taylor formula can be written down using successive derivatives $df, d^2 f, d^3 f$
also, which are then called tensors. In the scalar case $n = 1$, the first derivative $df(x)$
leads to the gradient $\nabla f(x)$, the second derivative $d^2 f(x)$ to the Hessian matrix
$H(x)$ which is a bilinear form acting on pairs of vectors. The third derivative $d^3 f(x)$
then acts on triples of vectors etc. One can still write as in one dimension
Theorem: $f(x) = f(x_0) + f'(x_0)(x - x_0) + f''(x_0) \frac{(x - x_0)^2}{2!} + \cdots$
if we write $f^{(k)} = d^k f$. For a polynomial, this just means that we first write down the
constant, then all linear terms, then all quadratic terms, then all cubic terms etc.
17.4. Assume $f : \mathbb{R}^m \to \mathbb{R}$ and stop the Taylor series after the first step. We get
$L(x_0 + v) = f(x_0) + \nabla f(x_0) \cdot v$.
It is customary to write this with $x = x_0 + v$, $v = x - x_0$ as
$L(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0)$
17.5. If we stop the Taylor series after two steps, we get the function
$Q(x + v) = f(x) + df(x) \cdot v + v \cdot d^2 f(x) \cdot v / 2$. The matrix $H(x) = d^2 f(x)$ is called the Hessian
matrix at the point $x$. It is also here customary to eliminate $v$ by writing $x = x_0 + v$.
$Q(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0) + (x - x_0) \cdot H(x_0)(x - x_0)/2$
is called the quadratic approximation of $f$. The kernel of $Q - f(x_0)$ is the quadratic
manifold $Q(x) - f(x_0) = x \cdot Bx + Ax = 0$, where $A = df$ and $B = d^2 f / 2$. It
approximates the surface $\{x \mid f(x) - f(x_0) = 0\}$ even better than the linear one. If
$|x - x_0|$ is of the order $\epsilon$, then $|f(x) - L(x)|$ is of the order $\epsilon^2$ and $|f(x) - Q(x)|$ is of
the order $\epsilon^3$. This follows from the exact Taylor with remainder formula. 2
[Figure 1: the level sets L=C, f=C and Q=C.]
17.6. To get the tangent plane to a surface $f(x) = C$ one can just look at the linear
manifold $L(x) = C$. However, there is a better method:
The tangent plane to a surface $f(x, y, z) = C$ at $(x_0, y_0, z_0)$ is $ax + by + cz = d$,
where $[a, b, c]^T = \nabla f(x_0, y_0, z_0)$ and $d = ax_0 + by_0 + cz_0$.
Proof. Let $r(t)$ be a curve on the surface $S = \{f = C\}$ with $r(0) = (x_0, y_0, z_0)$. The chain rule assures
$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$. But because $f(r(t)) = C$ is constant, this is zero, assuring that $r'(t)$ is
perpendicular to the gradient. As this works for any curve, we are done.
Examples
17.8. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given as $f(x, y) = x^3 y^2 + x + y^3$. What is the quadratic
approximation at $(x_0, y_0) = (1, 1)$? We have $df(1, 1) = [4, 5]$ and
$\nabla f(1, 1) = \begin{bmatrix} f_x \\ f_y \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad H(1, 1) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ 6 & 8 \end{bmatrix}$.
The linearization is $L(x, y) = 4(x - 1) + 5(y - 1) + 3$. The quadratic approximation
is $Q(x, y) = 3 + 4(x - 1) + 5(y - 1) + 6(x - 1)^2/2 + 12(x - 1)(y - 1)/2 + 8(y - 1)^2/2$.
This is the situation displayed to the left in Figure (1). For $v = [7, 2]^T$, the directional
derivative is $D_v f(1, 1) = \nabla f(1, 1) \cdot v = [4, 5]^T \cdot [7, 2]^T = 38$. The Taylor expansion given
at the beginning is a finite series because $f$ was a polynomial:
$f([1, 1] + t[7, 2]) = f(1 + 7t, 1 + 2t) = 3 + 38t + 247t^2 + 1023t^3 + 1960t^4 + 1372t^5$.
2 If $f \in C^{n+1}$, then $f(x + t) = \sum_{k=0}^n f^{(k)}(x) t^k/k! + \int_0^t (t - s)^n f^{(n+1)}(x + s) \, ds / n!$ (prove this by induction!)
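The finite Taylor series along the line $[1, 1] + t[7, 2]$ can be recomputed by expanding the polynomial $f(1 + 7t, 1 + 2t)$ with $f(x, y) = x^3 y^2 + x + y^3$. The sketch below represents polynomials in $t$ as coefficient lists; the helper names are hypothetical and only serve this check.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_pow(p, k):
    """k-th power of a polynomial."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

x = [1, 7]   # x(t) = 1 + 7t
y = [1, 2]   # y(t) = 1 + 2t

# f(x, y) = x^3 y^2 + x + y^3 evaluated on the line
coeffs = poly_add(poly_add(poly_mul(poly_pow(x, 3), poly_pow(y, 2)), x), poly_pow(y, 3))
print(coeffs)  # [3, 38, 247, 1023, 1960, 1372]
```

The coefficient of $t$ is $\nabla f(1, 1) \cdot v = 38$ and the coefficient of $t^2$ is $v \cdot H(1, 1) v / 2 = (6 \cdot 49 + 2 \cdot 6 \cdot 14 + 8 \cdot 4)/2 = 247$, in agreement with the gradient and Hessian above.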
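17.9. For $f(x, y, z) = -x^4 + x^2 + y^2 + z^2$, the gradient and Hessian are
$\nabla f(1, 1, 1) = \begin{bmatrix} f_x \\ f_y \\ f_z \end{bmatrix} = \begin{bmatrix} -2 \\ 2 \\ 2 \end{bmatrix}, \quad H(1, 1, 1) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} -10 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.
The linearization is $L(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1)$. The quadratic
approximation
$Q(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1) + (-10(x - 1)^2 + 2(y - 1)^2 + 2(z - 1)^2)/2$
is the situation displayed to the right in Figure (1).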
17.10. What is the tangent plane to the surface $f(x, y, z) = 1/10$ for
$f(x, y, z) = 10z^2 - x^2 - y^2 + 100x^4 - 200x^6 + 100x^8 - 200x^2 y^2 + 200x^4 y^2 + 100y^4$
at the point $(x, y, z) = (0, 0, 1/10)$? The gradient is $\nabla f(0, 0, 1/10) = [0, 0, 2]^T$. The
tangent plane equation is $2z = d$, where the constant $d$ is obtained by plugging in the
point. We end up with $2z = 2/10$. The linearization is $L(x, y, z) = 1/10 + 2(z - 1/10)$.
17.11. P.S. The following remark should maybe be skipped as many objects have not been properly introduced. The
exterior derivative $d$ for example will appear in the form of grad, curl, div later on and $d^2 = 0$ in the form $\mathrm{curl}(\mathrm{grad}(f)) = 0$.
The quite deep remark illustrates how important the topic of Taylor series is if it is taken seriously.
The derivative $d$ acts on anti-symmetric tensors (= forms), where $d^2 = 0$. A vector field $X$ then defines a Lie derivative
$L_X = d\iota_X + \iota_X d = (d + \iota_X)^2 = D_X^2$ with interior product $\iota_X$. For scalar functions and the constant field $X(x) = v$, one
gets the directional derivative $D_v = \iota_X d$. The projection $\iota_X$ in a specific direction can be replaced with the transpose
$d^*$ of $d$. Rather than transport along $X$, the signal now radiates everywhere. The operator $d + \iota_X$ becomes then the
Dirac operator $D = d + d^*$ and its square is the Laplacian $L = (d + d^*)^2 = dd^* + d^* d$. The wave equation $f_{tt} = -Lf$
can be written as $(\partial_t^2 + D^2) f = (\partial_t - iD)(\partial_t + iD) f = 0$ which has the solution $ae^{iDt} + be^{-iDt}$. Using the Euler formula
$e^{iDt} = \cos(Dt) + i\sin(Dt)$ one gets the explicit solutions $f(t) = f(0)\cos(Dt) + iD^{-1} f_t(0) \sin(Dt)$ of the wave equation.
It gets more exciting: by packing the initial position and velocity into a complex wave $\psi(0, x) = f(0, x) + iD^{-1} f_t(0, x)$,
we have $\psi(t, x) = e^{iDt} \psi(0, x)$. The wave equation is solved by a Taylor formula, which solves a Schrödinger
equation for $D$, and the classical Taylor formula is the Schrödinger equation for $D_X$. This works in any
framework featuring a derivative $d$, like finite graphs, where Taylor resembles a Feynman path integral, a sort of
Taylor expansion used by physicists to compute complicated particle processes.
The Taylor formula shows that the directional derivative $D_v$ generates translation by $-v$. In physics, the operator
$P = -i\hbar D_v$ is called the momentum operator associated to the vector $v$. The Schrödinger equation $i\hbar f_t = Pf$
has then the solution $f(x - tv)$ which means that the solution at time $t$ is the initial condition translated by $tv$. This
generalizes to the Lie derivative $L_X$ given by Cartan's magic formula as $L_X = D_X^2$ acting on forms defined by a
vector field $X$. For the analog $L = D^2$, the motion is not channeled in a determined direction $X$ (this is a photon) but
spreads (this is a wave) in all directions, leading to the wave equation. We have just seen both the “photon picture” $L_X$
as well as the “wave picture” $L$ of light. And whether it is particle or wave, it is all just Taylor.
Homework
Problem 17.1: Evaluate without technology the cube root of 1002 using
quadratic approximation. Especially look how close you are to the real
value.
Problem 17.3: Given $g(x, y) = (6y^2 - 5)^2 (x^2 + y^2 - 1)^2$, define the surface
$S$ by $f(x, y, z) = g(x, y) + g(y, z) + g(z, x) = 3$. The following equation
could be derived with the chain rule. You can take this for granted:
$\nabla f(1, -1, 1) = \begin{bmatrix} g_x(1, -1) + g_y(1, 1) \\ g_x(-1, 1) + g_y(1, -1) \\ g_x(1, 1) + g_y(-1, 1) \end{bmatrix}$.
Using this, find the tangent plane to $S$ at $(1, -1, 1)$.
Figure 2. [Plot over -4 ≤ x ≤ 4, -4 ≤ y ≤ 4 with marked points P, Q, R, S, T, U, V, W, X, Y.]
MATH 22A
Seminar
18.1. In this seminar, we see how calculus can help to compute things effectively and
also hope to get insight into topics which are of more number theoretical nature. To
find the cube root of 10 for example, we have
$10^{1/3} \sim 8^{1/3} + \frac{2}{3 \cdot 8^{2/3}} = 2 + \frac{2}{12} = 2.1666\ldots$.
The actual value is 2.15443.
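The estimate is the linearization $L(x) = x_0^{1/3} + (x - x_0)/(3 x_0^{2/3})$ of the cube root at a nearby perfect cube $x_0$. A minimal sketch follows; the function name `cube_root_linear` is a hypothetical helper.

```python
def cube_root_linear(x, x0, root0):
    """L(x) = x0^(1/3) + (x - x0) / (3 x0^(2/3)), where root0 = x0^(1/3) is known exactly."""
    return root0 + (x - x0) / (3 * root0 ** 2)

approx = cube_root_linear(10, 8, 2)
exact = 10 ** (1 / 3)
print(approx, exact, approx - exact)
```

Because the cube root is concave for positive arguments, the linear approximation overestimates the true value.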
18.2.
Problem A: Find the cube root of 999999 using linear approximation.
18.4. These were all finite sums but seeing the pattern allows us to take a limit and
compute the infinite series:
Problem C: For which $a$ is $1 + a + a^2 + a^3 + \ldots = \frac{1}{1 - a}$ valid?
18.5. Recall the definition of Taylor series and answer the following trick question:
Problem D: What is the Taylor series of $f(x) = \frac{1}{1 - x}$ at $x_0 = 0$?
18.6. How can you get from the last exercise the following identity?
Problem E: $-\log(1 - x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \ldots$.
18.7. We can also differentiate to verify the formula $\sum_{n=1}^{\infty} n x^{n-1} = 1/(1 - x)^2$ and so
$\sum_{n=1}^{\infty} n x^n = \frac{x}{(1 - x)^2}$.
This function is called $\mathrm{Li}_{-1}(x)$.
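The identity for $\mathrm{Li}_{-1}(x)$ is easy to test with partial sums. This is a sketch in which the number of terms and the sample values of $x$ are arbitrary choices inside the interval of convergence.

```python
def weighted_geometric_sum(x, terms=200):
    """Partial sum of sum_{n >= 1} n x^n."""
    return sum(n * x ** n for n in range(1, terms + 1))

def closed_form(x):
    """x / (1 - x)^2"""
    return x / (1 - x) ** 2

for x in (0.5, -0.3, 0.1):
    print(x, weighted_geometric_sum(x), closed_form(x))
```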
18.8. How come that great number theorists like Leonhard Euler or Godfrey Hardy
were also masters in calculus? The reason is that many results of number theoretic
nature have intimate relations with calculus. Let's look at the following problem:
Problem G: What is the value of the Leibniz series
$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \ldots$.
18.9. Hint: compute first the Taylor series of $f(x) = \arctan(x)$ using the Taylor series
of $1/(1 + x^2)$ (the latter is a geometric series), then evaluate $f$ at $x = 1$.
18.10.
$\mathrm{Li}_s(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^s} = x + \frac{x^2}{2^s} + \frac{x^3}{3^s} + \cdots$
is called the polylogarithm function. For $s = 0$ it is Problem D, for $s = 1$ it is
Problem E, for $s = -1$, $x = 1/2$ it is Problem F. While in calculus we might be more
interested in the function as a function of $x$, number theorists are more interested
in the function as a function of $s$, where $s$ is complex. In the case $x = 1$, one gets the
Riemann zeta function $\zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^s}$.
Theorem: $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} (1 - \frac{1}{p^s})^{-1}$.
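A numerical comparison makes the identity plausible before proving it. The sketch below takes $s = 2$ as an assumed test case, uses the known value $\zeta(2) = \pi^2/6$, and truncates both the sum and the product; the cutoffs are arbitrary choices.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p in range(n + 1) if sieve[p]]

def zeta_partial_sum(s, n_max):
    """sum_{n = 1}^{n_max} 1 / n^s"""
    return sum(1 / n ** s for n in range(1, n_max + 1))

def euler_partial_product(s, p_max):
    """Product of (1 - p^(-s))^(-1) over primes p <= p_max."""
    prod = 1.0
    for p in primes_up_to(p_max):
        prod *= 1 / (1 - p ** (-s))
    return prod

print(zeta_partial_sum(2, 100000))
print(euler_partial_product(2, 10000))
print(math.pi ** 2 / 6)
```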
18.12.
Problem I: Verify the Euler golden key identity.
18.13. First verify (maybe look at Problem C) that for a single prime $p$
$\frac{1}{1 - \frac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \ldots$
which is the sum over all $\frac{1}{n^s}$, where $n$ has only prime factors $p$. Then look at the
product of these for two primes $p, q$ and see that this is the sum over all $\frac{1}{n^s}$ where $n$
has only prime factors $p$ and $q$.
18.14. The Goldbach conjecture tells that every even number larger than 2 is the
sum of two primes. What is the relation with calculus? Define $g(x) = (f(x))^2$ with
$f(x) = \sum_p \frac{x^p}{p!} = \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \ldots$.
Problem J: Goldbach is equivalent to $g^{(n)}(0) > 0$ for all even $n > 2$.
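The connection can be explored by computing the Taylor coefficients of $g = f^2$ at 0 exactly. In this sketch the coefficient of $x^n$ in $g$ is the sum of $1/(p! \, q!)$ over ordered pairs of primes with $p + q = n$, so $g^{(n)}(0)$ is $n!$ times that sum. The range of $n$ checked is an arbitrary choice and proves nothing about the conjecture itself.

```python
from fractions import Fraction
from math import factorial

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def g_derivative_at_zero(n):
    """g^(n)(0) for g = f^2 with f(x) = sum over primes p of x^p / p!."""
    # coefficient of x^n in g: sum of 1/(p! q!) over ordered prime pairs with p + q = n
    coeff = sum(Fraction(1, factorial(p) * factorial(n - p))
                for p in range(2, n - 1) if is_prime(p) and is_prime(n - p))
    return coeff * factorial(n)

for n in (4, 6, 8, 10):
    print(n, g_derivative_at_zero(n))

# every even n from 4 to 200 has a positive n-th derivative at 0
print(all(g_derivative_at_zero(n) > 0 for n in range(4, 201, 2)))
```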
Homework
Exercises A-F are done in the seminar. This homework is due on Tuesday.
Problem 18.1 The function $f$ defined by $f(x) = e^{-1/x}$ for $x > 0$ and
$0$ for $x \leq 0$ is smooth and all its derivatives at 0 are zero. Check
$f'(0) = f''(0) = f'''(0) = 0$.
MATH 22A
Lecture
19.1. All functions are assumed here to be in $C^2$. It all starts with an observation
going back to Pierre de Fermat:
19.7. Let us look at the case where $f(x, y)$ is a function of two variables such that
$f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$. The Hessian matrix is
$H(x_0, y_0) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$.
In this two dimensional case, we can classify the critical points if the determinant
$D = \det(H) = f_{xx} f_{yy} - f_{xy}^2$ of $H$ is non-zero. The number $D$ is also called the
discriminant at a critical point.
19.8. We say $(x_0, y_0)$ is a Morse point if $(x_0, y_0)$ is a critical point and the determinant
is non-zero. A $C^2$ function is a Morse function if every critical point is
Morse. Examples of Morse functions are $f(x, y) = x^2 + y^2$, $f(x, y) = -x^2 - y^2$ and
$f(x, y) = x^2 - y^2$. The last case is called a hyperbolic saddle. In general, a critical
point is a hyperbolic saddle if $D \neq 0$ and if it is neither a maximum nor a minimum.
Here is the second derivative test in dimension 2:
19.12. Proof. We use induction with respect to $m$. (i) Induction foundation. For
$m = 1$, the result tells that for a Morse critical point, the function looks like $y = x^2$
or $y = -x^2$. First show that if $f(0) = f'(0) = 0$, $f''(0) \neq 0$, then $f(x) = x^2 h(x)$ or
$f(x) = -x^2 h(x)$ for some positive $C^2$ function $h$. Proof. By a linear coordinate change
we assume $x_0 = 0$ and $f(0) = 0$. There exists then $g(x)$ such that $f(x) = x g(x)$:
it is $g(x) = f(x)/x$ for $x \neq 0$ and is in the limit $x \to 0$ the value of
$\lim_{x \to 0} (f(x) - f(0))/x = f'(0)$. By the product rule, $f'(x) = g(x) + x g'(x)$ with $g(0) = 0$. Because
$f'(0) = g(0) = 0$, we can define $h(x) = \pm f(x)/x^2$ for $x \neq 0$ and take the limit $x \to 0$:
by applying l'Hôpital twice, the limit of $f(x)/x^2$ is $f''(0)/2$. The coordinate change is now given by
a function $y = \phi(x)$ satisfying $y\sqrt{h(y)} = x$, that is $g(x, y) = y\sqrt{h(y)} - x = 0$. Implicit differentiation gives
$g_y(0, 0) = \sqrt{h(0)} \neq 0$ so that by the implicit function theorem $y(x)$ exists.
(ii) Induction step $m \to m + 1$: we first note that Taylor for $C^2$ with remainder term
implies that $f(x_1, \ldots, x_m) = \sum_{i,j} x_i x_j h_{ij}(x_1, \ldots, x_m)$ with some continuous functions
$h_{ij}$. Furthermore, the function values $h_{ij}(0) = f_{x_i x_j}(0) = H_{ij}(0)$ are the coordinates
of the Hessian. Apply first a rotation so that $h_{11} \neq 0$. Now look at $x_1$ and keep the
other coordinates constant. As in (i), find a coordinate change $\phi$ such that
$f(\phi(x)) = \pm x_1^2 + g(x_2, \ldots, x_m)$, where $g$ inherits the properties of $f$ 1, but is of one dimension less.
By induction assumption, there is a second coordinate change such that
$g(\psi(x)) = -x_2^2 - \cdots - x_l^2 + x_{l+1}^2 + \cdots + x_m^2$. Combining $\phi$ and $\psi$ produces the Morse normal form.
Examples
19.13. Q: Classify the critical points of $f(x, y) = x^3 - 3x - y^3 + 3y$. A: As
$\nabla f(x, y) = [3x^2 - 3, -3y^2 + 3]^T$, the critical points are $(1, 1)$, $(-1, 1)$, $(1, -1)$ and $(-1, -1)$. We
compute $H(x, y) = \begin{bmatrix} 6x & 0 \\ 0 & -6y \end{bmatrix}$. For $(1, 1)$ and $(-1, -1)$ we have $D = -36$ and so
saddle points. For $(-1, 1)$, we have $D = 36$, $f_{xx} = -6$, a local max. For $(1, -1)$, where
$D = 36$, $f_{xx} = 6$, we have a local min.
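The classification can be automated with the discriminant $D = f_{xx} f_{yy} - f_{xy}^2$. The sketch below assumes the function $f(x, y) = x^3 - 3x - y^3 + 3y$, which is the one whose gradient is $[3x^2 - 3, -3y^2 + 3]^T$ as used in the answer, and applies the standard second derivative test: $D < 0$ gives a saddle, $D > 0$ with $f_{xx} > 0$ a local minimum, $D > 0$ with $f_{xx} < 0$ a local maximum.

```python
def hessian(x, y):
    """Second derivatives (f_xx, f_xy, f_yy) of f(x, y) = x^3 - 3x - y^3 + 3y."""
    return 6 * x, 0, -6 * y

def classify(x, y):
    """Second derivative test at a critical point (x, y)."""
    fxx, fxy, fyy = hessian(x, y)
    D = fxx * fyy - fxy ** 2
    if D < 0:
        return "saddle"
    if D > 0:
        return "local min" if fxx > 0 else "local max"
    return "degenerate"

for point in [(1, 1), (-1, 1), (1, -1), (-1, -1)]:
    print(point, classify(*point))
```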
1 This will be clearer after having seen more linear algebra.
Homework
Problem 19.4: Find all the critical points of the function
$f(x, y, z) = (x - 1)^2 - y^2 + xz^2$. In each of the cases, find the Hessian matrix. We
have not talked about eigenvalues yet, but they are numbers $\lambda$ such that
$Hv = \lambda v$ for some non-zero vector $v$. One can find them by looking for
the roots of the characteristic polynomial $\chi_H(\lambda) = \det(H - \lambda)$. You can
calculate them on a computer. Find in each case the eigenvalues.
MATH 22A
Lecture
20.1. If we want to maximize a function $f : \mathbb{R}^m \to \mathbb{R}$ on the constraint
$S = \{x \in \mathbb{R}^m \mid g(x) = c\}$, then both the gradients of $f$ and $g$ matter. We call two vectors $v, w$
parallel if $v = \lambda w$ or $w = \lambda v$ for some real $\lambda$. The zero vector is parallel to everything.
Here is a variant of Fermat:
Theorem: If $x_0$ is a maximum of $f$ under the constraint $g = c$, then
$\nabla f(x_0)$ and $\nabla g(x_0)$ are parallel.
20.2. Proof: use contradiction: assume $\nabla f(x_0)$ and $\nabla g(x_0)$ are not parallel and $x_0$
is a local maximum. Let $T$ be the tangent plane to $S = \{g = c\}$ at $x_0$. Because
$\nabla f(x_0)$ is not perpendicular to $T$ we can project it onto $T$ to get a non-zero vector $v$
in $T$ which is not perpendicular to $\nabla f$. Actually the angle between $\nabla f$ and $v$ is acute
so that $\cos(\alpha) > 0$. Take a curve $r(t)$ in $S$ with $r(0) = x_0$ and $r'(0) = v$. We have
$\frac{d}{dt} f(r(0)) = \nabla f(r(0)) \cdot r'(0) = |\nabla f(x_0)| |v| \cos(\alpha) > 0$. By linear approximation, we
know that $f(r(t)) > f(r(0))$ for small enough $t > 0$. This is a contradiction to the fact
that $f$ was maximal at $x_0 = r(0)$ on $S$.
20.3. This immediately implies: (distinguish $\nabla g \neq 0$ and $\nabla g = 0$)
20.4. For functions f (x, y), g(x, y) of two variables, this means we have to solve a
system with three equations and three unknowns:
20.5. To find a maximum, solve the Lagrange equations and add a list of critical points
of g on the constraint. Then pick a point where f is maximal among all points. We
don’t bother with a second derivative test. But here is a possible statement:
If $\frac{d^2}{dt^2} D_{tv} D_{tv} f(x_0)|_{t=0} < 0$
for all $v$ perpendicular to $\nabla g(x_0)$, then $x_0$ is a local maximum.
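The Lagrange recipe can be tried on a small example that is not from the notes: maximize the assumed function $f(x, y) = xy$ on the circle $g(x, y) = x^2 + y^2 = 2$. The sketch finds the maximum by brute force along a parametrization of the constraint and then checks that $\nabla f$ and $\nabla g$ are parallel there, using the 2D cross product as a parallelism test.

```python
import math

def f(x, y):
    return x * y

def grad_f(x, y):
    return (y, x)

def grad_g(x, y):
    return (2 * x, 2 * y)          # g(x, y) = x^2 + y^2, constraint g = 2

def best_point_on_circle(n=8000):
    """Brute-force search for the maximum of f along the parametrized constraint."""
    radius = math.sqrt(2)
    best = max(range(n), key=lambda k: f(radius * math.cos(2 * math.pi * k / n),
                                         radius * math.sin(2 * math.pi * k / n)))
    t = 2 * math.pi * best / n
    return radius * math.cos(t), radius * math.sin(t)

def cross(v, w):
    """v and w are parallel exactly when this 2D cross product vanishes."""
    return v[0] * w[1] - v[1] * w[0]

x, y = best_point_on_circle()
print((x, y), f(x, y))                      # close to (1, 1) or (-1, -1), value 1
print(cross(grad_f(x, y), grad_g(x, y)))    # close to 0: the gradients are parallel
```

At $(1, 1)$ the gradients are $[1, 1]^T$ and $[2, 2]^T$, so $\nabla f = \lambda \nabla g$ holds with $\lambda = 1/2$.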
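20.6. Of course, the cases of maxima and minima are analogous. If $f$ has a maximum
on $g = c$, then $-f$ has a minimum on $g = c$. We can have an extremum of $f$ under a
constraint $S = \{g = c\}$ given by a smooth function $g$ without the Lagrange equations being satisfied. An
example is $f(x, y) = x$ and $g(x, y) = x^3 - y^2$ shown in Figure (1): on $\{g = 0\}$ we have $x \geq 0$,
so $f$ is minimal at $(0, 0)$, but $\nabla g(0, 0) = 0$ and the equation $1 = \lambda 3x^2$ has no solution there.
[Figure 1: the curve $x^3 - y^2 = 0$, the function $f(x, y) = x$, and the Lagrange equations $1 = \lambda 3x^2$, $0 = -\lambda 2y$, $x^3 - y^2 = 0$.]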
20.7. The method of Lagrange can maximize functions $f$ under several constraints.
Let us show this in the case of a function $f(x, y, z)$ of three variables and two constraints
$g(x, y, z) = c$ and $h(x, y, z) = d$. The analogue of the Fermat principle is that at a
maximum of $f$, the gradient of $f$ is in the plane spanned by $\nabla g$ and $\nabla h$. This leads
to the Lagrange equations for 5 unknowns $x, y, z, \lambda, \mu$.
1 This example is from Rufus Bowen, Lecture Notes in Math, 470, 1978.
Homework
Problem 20.1: Find the cylindrical basket which is open on the top and
has the largest volume for fixed area $\pi$. If $x$ is the radius and $y$
is the height, we have to maximize $f(x, y) = \pi x^2 y$ under the constraint
$g(x, y) = 2\pi x y + \pi x^2 = \pi$. Use the method of Lagrange multipliers.
Problem 20.5: A solid bullet made of a half sphere and a cylinder has
the volume V = 2πr^3/3 + πr^2 h and surface area A = 2πr^2 + 2πrh + πr^2.
Doctor Manhattan designs a bullet with fixed volume and minimal
area. With g = 3V /π = 1 and f = A/π he therefore minimizes
f (h, r) = 3r^2 + 2rh under the constraint g(h, r) = 2r^3 + 3r^2 h = 1. Use the
Lagrange method to find a local minimum of f under the constraint g = 1.
MATH 22A
Seminar
21.1. With an “island” we mean a region in the plane R^2 which is bounded by a simple
closed curve C which is continuous everywhere and differentiable everywhere except
at a finite set of points. So, simple polygons are allowed. Which island has the
maximal area if the length of the boundary is fixed? This is called the isoperimetric
problem. If we look at the problem restricted to polygons with a fixed number n of
vertices, then we have a nice finite dimensional Lagrange problem.
21.2. Let us look at a triangular island T (x, y) with vertices (−1, 0), (1, 0), (x, y).
21.3. Here is a side problem from good old Euclidean geometry. If you do not
know it, look up “string method pins”.
21.4. Solving the problem to find the n-gon with maximal area is a messy Lagrange
problem. It can be done by a computer but there is a more elegant way:
Problem C: Use the computation in problem A to show that for a
maximal polygon containing vertices ..., P, Q, R, ... in a row, the distance
between P and Q is the same as the distance between Q and R.
21.5. You are on your treasure island G and have two locations A, B in G. The
problem to find the shortest connection between A and B can be quite complex in
general. An example is when G is bounded by a Gosper curve. For the following let
us assume that the boundary of G is a convex curve: this means that for any two
points A, B in G, the line segment connecting A and B is contained in G. A triangle A, B, C
for which all three points A, B, C are on the boundary is called a “shore triangle”.
Problem E: Verify that for a shore triangle, the billiard law of reflection
at the boundary holds.
21.6. Hint: to see that the incoming angle is the same as the outgoing angle, take a
minimal triangle A, B, C, where B is on the island shore, then replace the curve with
the tangent line L at B. Now reflect C at L to get a point C′. Verify that the shortest
billiard path ABC has the same length as the straight line connecting A with C′.
21.7. The next time you are cast away on an island, count the number m of mountain
peaks, the number s of sinks and the number p of mountain passes. Make some
experiments. You notice the following rule which is known as a special case of the
Poincaré-Hopf theorem:
Theorem: maxima + minima − saddles = 1.
21.8. If you want to challenge yourself, see whether you can prove the island theorem
by deformation. (This is probably too hard. Just enjoy the struggle!)
21.9. Assume now that our island is an atoll, a ring shaped reef.
Problem G: By looking at examples, what is the island number
maxima + minima − saddles on an atoll?
Figure 2. First an island with 2 mountain peaks and with 1 mountain
pass. Then an island with 3 mountain peaks and 2 mountain passes. We
see maxima + minima − saddles = 1.
Figure 3. The Atafu atoll. Picture by NASA Johnson Space Center, 2009.
21.10. Let us look at the one-dimensional case, where things are easier to prove. Assume
the island is the interval [a, b]. Let f be a smooth function on [a, b] which has the
property that f is zero for x ≥ b and for x ≤ a. We look at critical points of f in the
interior (a, b) which are Morse (meaning f ′′(x) ≠ 0 at critical points), so that we
have only local maxima and minima as critical points. Let m be the number of maxima
and s the number of minima (sinks). In order to prevent the island from being flooded, we
also assume that the function f is positive for x > a close to a and for x < b close to b.
Theorem: maxima − minima = 1.
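The one-dimensional island theorem can be explored numerically. The following is a minimal sketch, not part of the original notes: the profile `height` is a made-up example on [0, 1] that vanishes at both shores and is positive inside, and `count_extrema` simply counts strict local maxima and minima among sampled values (it assumes no two neighboring samples are exactly equal).

```python
import math

def count_extrema(f, a, b, n=2000):
    """Count strict local maxima and minima of f sampled on a uniform grid of [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    ys = [f(x) for x in xs]
    maxima = minima = 0
    for k in range(1, n):
        if ys[k] > ys[k - 1] and ys[k] > ys[k + 1]:
            maxima += 1
        elif ys[k] < ys[k - 1] and ys[k] < ys[k + 1]:
            minima += 1
    return maxima, minima

def height(x):
    # Hypothetical island profile: zero at x = 0 and x = 1, positive in between.
    return x * (1 - x) * (2 + math.sin(12 * x))

maxima, minima = count_extrema(height, 0.0, 1.0)
print(maxima, minima, maxima - minima)
```

Changing the profile changes the individual counts, but as long as the island rises from the left shore and falls to the right shore, the difference maxima − minima should stay 1.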
Homework
MATH 22A
Lecture
22.1. Given a bounded region R in R^2 and a continuous function f (x, y) : R → R,
define the Riemann integral $I = \iint_R f(x, y)\, dA$ as the n → ∞ limit of
$$ I_n = \frac{1}{n^2} \sum_{(i/n,\, j/n) \in R} f\Big(\frac{i}{n}, \frac{j}{n}\Big) . $$
The bounded region R is defined as a closed subset of R^2 bounded by finitely many
differentiable curves, R = {g1 ≤ c1 , . . . , gk ≤ ck }. As already in one dimension, the definition
is designed to be independent of an orientation chosen on R. We are integrating like
summing up a spreadsheet: just add up all entries. To justify that the limit exists,
we again can use the Heine-Cantor theorem which tells that f is continuous on R if
and only if it is uniformly continuous. This means there are numbers Mn → 0 such
that if |(x1 , y1 ) − (x2 , y2 )| ≤ 1/n, then |f (x1 , y1 ) − f (x2 , y2 )| ≤ Mn .
Theorem: For continuous f on a bounded region R, $\iint_R f\, dxdy$ exists.
for large enough n, any Qij ∩ R is a basic region. Now we can define the integral in
the first case as $\int_a^b [\int_{c(x)}^{d(x)} f(x, y)\, dy]\, dx$ and in the second case as $\int_c^d [\int_{a(y)}^{b(y)} f(x, y)\, dx]\, dy$.
Is this the same? This is answered with Fubini, which we have already used. Let R be
a rectangle R = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d}. Here is the Fubini theorem:
Theorem: $\iint_R f(x, y)\, dA = \int_a^b [\int_c^d f(x, y)\, dy]\, dx = \int_c^d [\int_a^b f(x, y)\, dx]\, dy$.
22.4. Proof: first make a coordinate change to get R = [0, 1]×[0, 1], then cover R with
n^2 cubes Qij of side length 1/n. We have for every y a uniformly continuous function
x → f (x, y) and for every x a uniformly continuous function y → f (x, y) and the
constants Mn work for all: there is Mn → 0 so that if |x1 −x2 | < 1/n and |y1 −y2 | < 1/n,
then |f (x1 , y1 ) − f (x2 , y2 )| ≤ Mn . Now use the notation $A \sim_c B$ if |A − B| ≤ c and get
$$ \iint_R f(x, y)\, dA \;\sim_{M_n}\; \frac{1}{n} \sum_{i=0}^{n-1} \frac{1}{n} \sum_{j=0}^{n-1} f(i/n, j/n) \;\sim_{2M_n}\; \frac{1}{n} \sum_{i=0}^{n-1} \int_0^1 f(i/n, y)\, dy \;\sim_{3M_n}\; \int_0^1 \Big[ \int_0^1 f(x, y)\, dy \Big] dx . $$
Similarly, we can show $\iint_R f(x, y)\, dA \sim_{3M_n} \int_0^1 [\int_0^1 f(x, y)\, dx]\, dy$.
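22.5. Without continuity, Fubini is false: the standard example is illustrated in Figure (2):
$$ \frac{\pi}{4} = \int_0^1 \Big[ \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\, dy \Big] dx \;\neq\; \int_0^1 \Big[ \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\, dx \Big] dy = -\frac{\pi}{4} . $$
Proof. $\int (x^2 - y^2)/(x^2 + y^2)^2\, dx = -x/(x^2 + y^2)$ and $\int (x^2 - y^2)/(x^2 + y^2)^2\, dy = y/(x^2 + y^2)$, so
that $\int_0^1 (x^2 - y^2)/(x^2 + y^2)^2\, dx = -1/(1 + y^2)$ and $\int_0^1 (x^2 - y^2)/(x^2 + y^2)^2\, dy = 1/(1 + x^2)$.
Integrating $-1/(1 + y^2)$ over 0 ≤ y ≤ 1 gives −π/4, while integrating $1/(1 + x^2)$ over 0 ≤ x ≤ 1 gives π/4.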
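The two values can be checked numerically. The sketch below is an illustration only: it takes the inner integrals in closed form from the antiderivatives in the proof (so it does not integrate across the singularity at the origin) and applies a simple midpoint rule to the outer integral. The helper names are made up for this example.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def inner_dy(x):
    # Integral over y in [0, 1], from the antiderivative y / (x^2 + y^2).
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    # Integral over x in [0, 1], from the antiderivative -x / (x^2 + y^2).
    return -1.0 / (1.0 + y * y)

dy_first = midpoint(inner_dy, 0.0, 1.0)  # integrate in y first, then in x
dx_first = midpoint(inner_dx, 0.0, 1.0)  # integrate in x first, then in y
print(dy_first, dx_first, math.pi / 4)
```

The two iterated integrals differ by a sign, so the order of integration matters for this discontinuous integrand.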
22.6. Integrals in higher dimensions are defined in the same way. We will cover the
three dimensional case in particular later. Let's just add the definition for now. Given
an m-dimensional region R in R^m and a continuous f : R^m → R, using the multi-index
notation x = (x1 , . . . , xm ), dx = dx1 dx2 · · · dxm and i/n = (i1 /n, i2 /n, . . . , im /n) define
$$ \int_R f(x)\, dx = \lim_{n \to \infty} \frac{1}{n^m} \sum_{i/n \in R} f\Big(\frac{i}{n}\Big) . $$
A region is now a set R = {x ∈ R^m | g1 (x) ≤ c1 , . . . , gk (x) ≤ ck } where the gk are smooth
functions. It is called bounded if there exists ρ > 0 such that R ⊂ {|x| ≤ ρ}.
Figure 2. Integrating over a region via a Riemann integral. A double
integral is a signed volume. Parts where f < 0 count as negative volume. Fubini
can fail, even if the two conditional integrals exist.
Examples
22.7. If f (x, y) = 1, then $\iint_R f(x, y)\, dxdy$ is the area of R. For example,
$\iint_{x^2 + y^2 \le 9} 8\, dxdy = 8 \iint_{x^2 + y^2 \le 9} 1\, dxdy = 8\, \mathrm{Area}(R) = 72\pi$.
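The Riemann sum I_n from 22.1 can be evaluated directly for this example. This is a minimal sketch, not an efficient integrator: `riemann_sum`, `in_region` and `bound` are names chosen here, and `bound` is assumed to be a number such that the region lies in the square [−bound, bound]^2.

```python
import math

def riemann_sum(f, in_region, n, bound):
    """I_n = (1/n^2) * sum of f(i/n, j/n) over grid points (i/n, j/n) in the region."""
    m = int(math.ceil(bound * n))
    total = 0.0
    for i in range(-m, m + 1):
        x = i / n
        for j in range(-m, m + 1):
            y = j / n
            if in_region(x, y):
                total += f(x, y)
    return total / n**2

# f = 8 on the disc x^2 + y^2 <= 9.
approx = riemann_sum(lambda x, y: 8.0,
                     lambda x, y: x * x + y * y <= 9.0,
                     n=100, bound=3)
print(approx, 72 * math.pi)
```

The sum is literally "adding up a spreadsheet": every grid point in the region contributes its function value, and the factor 1/n^2 is the area of one grid cell.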
22.8. We know from single variable calculus that $\int_a^b f(x)\, dx$ is the signed area under
the curve of f . For f (x) ≥ 0, where it is the area, we can write this as $\int_a^b \int_0^{f(x)} 1\, dydx$.
Note that as we have defined the integrals, the equivalence would be wrong if f (x)
is negative somewhere. It is the double integral which is the correct notion of area.
Example: The area of the region bounded by the curve $y = 1/(1 + x^2)$, the line
y = 0 and the lines x = −1 and x = 1 is $\int_{-1}^{1} \int_0^{1/(1+x^2)} dydx = \arctan(x)|_{-1}^{1} = \pi/2$.
Figure 3.
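22.9. The integral $\iint_R f(x, y)\, dxdy$ can be interpreted as the signed volume under
the graph of f above the region R. Find the volume of the region bounded by
$z = 4 - 2x^4 - 2y^4$ and $z = 4 - 2x^2 - 2y^2$ and −1 ≤ x ≤ 1 and −1 ≤ y ≤ 1. Solution:
the integrand is even in x and in y, so the volume is four times the integral over the unit square:
$$ 4 \int_0^1 \int_0^1 (4 - 2x^4 - 2y^4) - (4 - 2x^2 - 2y^2)\, dxdy = 4 \cdot 2 \cdot \frac{4}{15} = \frac{32}{15} . $$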
22.10. Problem. Find the area of a disc of radius a. Solution:
$$ \int_{-a}^{a} \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} 1\, dydx = \int_{-a}^{a} 2 \sqrt{a^2 - x^2}\, dx . $$
Substituting x = a sin(u) and using a double angle formula, this gives
$a^2 \int_{-\pi/2}^{\pi/2} 2\, \frac{1 + \cos(2u)}{2}\, du = a^2 \pi$. We will next
time compute this much more effectively.
22.11. Problem. Let R be the triangle {1 ≥ x ≥ 0, 0 ≤ y ≤ x}. Evaluate
$\iint_R e^{-x^2}\, dxdy$. Solution. We can not evaluate the integral by integrating in x first because $e^{-x^2}$
has no anti-derivative given in terms of elementary functions. But we can write the
integral as
$$ \int_0^1 \Big[ \int_0^x e^{-x^2}\, dy \Big] dx = \int_0^1 x e^{-x^2}\, dx = -\frac{e^{-x^2}}{2} \Big|_0^1 = \frac{1 - e^{-1}}{2} . $$
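As a sanity check of the value (1 − e^{−1})/2, the iterated integral can be approximated with a nested midpoint rule. This is an illustrative sketch with made-up helper names; the inner integration runs over 0 ≤ y ≤ x as in the solution.

```python
import math

def midpoint(g, a, b, n=400):
    """Midpoint rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def inner(x):
    # Inner integral over y from 0 to x; the integrand exp(-x^2) does not depend on y.
    return midpoint(lambda y: math.exp(-x * x), 0.0, x)

value = midpoint(inner, 0.0, 1.0)
exact = (1 - math.exp(-1)) / 2
print(value, exact)
```

Integrating in y first turns the awkward integrand e^{−x^2} into x e^{−x^2}, which is the whole point of choosing this order.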
Homework
Problem 22.1: Calculate the iterated integral $\int_0^1 \int_x^{2-x} (x^2 - y)\, dydx$ in
two ways, once as a “left to right” and once as a “bottom to top” integral.
Problem 22.3: Compute the area of the region bounded by the ellipse
$x^2/4^2 + y^2/9^2 = 1$ using trig substitution. (It is the “hardest problem in
geometry”, according to the comedy-drama “Rushmore”, a movie from
1998).
MATH 22A
Lecture
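23.1. If $\Phi : R \to S$, $\begin{bmatrix} u \\ v \end{bmatrix} \mapsto \begin{bmatrix} x(u, v) \\ y(u, v) \end{bmatrix}$ is a coordinate change, then the distortion
factor was defined as |dΦ| = |det(dΦ)|, where
$$ d\Phi(u, v) = \begin{bmatrix} \partial_u x(u, v) & \partial_v x(u, v) \\ \partial_u y(u, v) & \partial_v y(u, v) \end{bmatrix} . $$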
The change of variable theorem is the same in all dimensions. In the following proof,
we assume that Φ is C^2. Because of Heine-Cantor, we know there exists Mn → 0 with
$|\frac{d^2}{dt^2} \Phi(u_0 + tv, v_0 + tw)| \le M_n$ for $\sqrt{v^2 + w^2} \le 1/n$ and all $(u_0, v_0) \in R$.
Theorem: $\iint_R f(\Phi(u, v))\, |d\Phi(u, v)|\, dudv = \iint_S f(x, y)\, dxdy$.
23.2. Proof. Cover S with cubes Qij as in the last lecture. Then
$$ \iint_S f(x, y)\, dxdy = \sum_{Q_{ij}} \iint_{Q_{ij} \cap S} f(x, y)\, dxdy \sim \sum_{i,j} f\Big(\frac{i}{n}, \frac{j}{n}\Big) \frac{1}{n^2} . $$
The transformed squares Φ(Qij ) are close to the parallelograms dΦ(Qij ) which have
area |dΦ(i/n, j/n)|/n^2. Now make a quadratic Taylor expansion Φ(x, y) = Φ(x0 , y0 ) +
dΦ(x0 , y0 )(x − x0 , y − y0 ) + d^2Φ(x0 , y0 )(x − x0 , y − y0 )^2/2 at (x0 , y0 ) = (i/n, j/n), where
|d^2Φ(x0 , y0 )(x − x0 , y − y0 )^2| ≤ Mn . Let F = max_{(x,y)∈R} |f (x, y)|. Applying Taylor
with remainder in every direction, we see
$$ \Big| \int_{\Phi(Q_{ij}) \cap S} f(x, y)\, dxdy - f\big(\Phi(\tfrac{i}{n}, \tfrac{j}{n})\big)\, \big|d\Phi(\tfrac{i}{n}, \tfrac{j}{n})\big|\, \frac{1}{n^2} \Big| \le \frac{M_n F}{n^2} . $$
As the number of squares hitting R is bounded by An^2 + 4Ln where A is the area of R
and L is the length of the boundary of R, the sum of the non-linear errors is therefore
bounded by (An^2 + 4Ln)Mn F/n^2 which goes to zero for n → ∞. QED.
23.4. Let Φ : [0, 1] × [0, 1] → [0, 1] × [0, 1] be given as Φ(x, y) = (y, x). Now det(dΦ) =
−1 and |dΦ| = 1. While we usually could ignore talking about orientation, it is evident
here that for the integrals considered so far, we do not care about the orientation of the
space. If the change of coordinates switches the orientation, the resulting integral does
not change.
23.5. The chain rule assures that combining two coordinate changes Φ, Ψ gives a
new coordinate change with d(Ψ ◦ Φ)(x) = dΨ(Φ(x))dΦ(x). For example if Ψ(x, y) =
[ax, by]^T and Φ(r, θ) = [r cos(θ), r sin(θ)]^T changes into polar coordinates, then Ψ(Φ(r, θ)) =
[ar cos(θ), br sin(θ)]^T. Now the image of R = [0, 1] × [0, 2π] is the ellipse
$S = \{x^2/a^2 + y^2/b^2 \le 1\}$ and the area of the ellipse is $A = \iint_R abr\, drd\theta$ because det(dΦ) = r and
det(dΨ) = ab. The result is $\int_0^1 \int_0^{2\pi} abr\, d\theta dr = \pi ab$.
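The ellipse computation can be reproduced numerically without knowing the distortion factor in advance. In this sketch (an illustration, with function names invented here) the Jacobian determinant of the composite map is estimated by central finite differences and then integrated over R = [0, 1] × [0, 2π] with a midpoint rule.

```python
import math

def composite(r, theta, a, b):
    """The map Psi(Phi(r, theta)) = (a r cos(theta), b r sin(theta))."""
    return (a * r * math.cos(theta), b * r * math.sin(theta))

def distortion(r, theta, a, b, eps=1e-6):
    """|det d(Psi o Phi)| estimated with central finite differences."""
    xr1, yr1 = composite(r + eps, theta, a, b)
    xr0, yr0 = composite(r - eps, theta, a, b)
    xt1, yt1 = composite(r, theta + eps, a, b)
    xt0, yt0 = composite(r, theta - eps, a, b)
    x_r, y_r = (xr1 - xr0) / (2 * eps), (yr1 - yr0) / (2 * eps)
    x_t, y_t = (xt1 - xt0) / (2 * eps), (yt1 - yt0) / (2 * eps)
    return abs(x_r * y_t - x_t * y_r)

def ellipse_area(a, b, n=200):
    """Midpoint rule for the integral of the distortion factor over [0, 1] x [0, 2 pi]."""
    hr, ht = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            total += distortion(r, theta, a, b)
    return total * hr * ht

area = ellipse_area(2.0, 3.0)
print(area, math.pi * 2.0 * 3.0)
```

The estimated distortion agrees with abr up to finite difference error, which is why the sum reproduces πab closely.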
23.6. Preview: We will next week look at more general cases like r : R ⊂ R^2 → R^3 of
a parametrized surface, where the distortion factor is $|dr| = \sqrt{\det(dr^T dr)} = |r_u \times r_v|$
and the surface area is $\iint_R |r_u \times r_v|\, dudv = \iint_S 1\, dA$.
23.7. The theorem generalizes substitution $\int_c^d f(\Phi(x)) |\Phi'(x)|\, dx = \int_a^b f(x)\, dx$ if
Φ(c) = a and Φ(d) = b. We usually insist that Φ is monotonically increasing and
write u = Φ(x), du = Φ′(x)dx to get computations like in
$\int_0^{\sqrt{\pi/2}} \sin(x^2)\, 2x\, dx = \int_0^{\pi/2} \sin(u)\, du$, where Φ(x) = x^2. As a hack, one can extend the formula to the
case when Φ can decrease, in which case the [a, b] interval becomes the negative [b, a]
interval with a < b. Example: Let Φ(x) = 2 − 2x which has Φ′ = −2, then
$\int_{1/2}^{1} (2 - 2x)^2 |(-2)|\, dx = \int_0^1 x^2\, dx$. In single variable calculus, one can also work with the
negative sign case and compute $\int_1^{1/2} (2 - 2x)^2 (-2)\, dx$ which works if $\int_1^{1/2} = -\int_{1/2}^{1}$ but
this is not compatible with the defined Riemann integral: we use “spread-sheet”
summation and do not distinguish whether we add up the function values from left
to right or from right to left. 2
2 In single variable calculus one can easily go from orientation independent ‘Bosonic’ integrals to ‘Fermionic’
integrals. In higher dimensions, one can then apply the derivative to anti-symmetric tensors. The
switch “Bosonic” → “Fermionic” requires however to orient objects like curves or surfaces.
23.8. We can again look at the Fubini counter example
$\iint_{x^2 + y^2 \le 1} (x^2 - y^2)/(x^2 + y^2)^2\, dxdy = \int_0^1 \int_0^{2\pi} \cos(2\theta)/r\, d\theta dr = 0$. We can not change the order of integration
as we can not integrate $\int_0^1 1/r\, dr$. The trouble also continues in the new coordinate
system and it is even more dramatic.
23.9. If Φ : x → Ax and Ψ : x → Bx are two linear coordinate changes then Ψ ◦
Φ = BA is the matrix product and the chain rule tells |d(Ψ ◦ Φ)| = |det(BA)| which
agrees with the product |dΨ||dΦ| = |det(B)||det(A)|. We can do the verification
of the Cauchy-Binet formula det(AB) = det(A)det(B) directly. If
$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $B = \begin{bmatrix} p & q \\ r & s \end{bmatrix}$, then
$AB = \begin{bmatrix} ap + br & aq + bs \\ cp + dr & cq + ds \end{bmatrix}$ and you can check the determinant
formula.
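For readers who want to see the check written out, here is the expansion for the 2 × 2 case, using the entries a, b, c, d of A and p, q, r, s of B from the text:

```latex
\begin{aligned}
\det(AB) &= (ap + br)(cq + ds) - (aq + bs)(cp + dr) \\
         &= apcq + apds + brcq + brds - aqcp - aqdr - bscp - bsdr \\
         &= ad\,(ps - qr) - bc\,(ps - qr) \\
         &= (ad - bc)(ps - qr) = \det(A)\det(B) .
\end{aligned}
```

The terms apcq and brds cancel against aqcp and bsdr, and the four remaining terms factor.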
23.10. Here is a famous open problem about coordinate changes. It is called the
Jacobian conjecture. It deals with polynomial coordinate changes, where x(u, v)
and y(u, v) are polynomials in u, v.
Homework
Problem 23.1: Given a disk R = {x^2 + y^2 ≤ 1}, we can make this into
a probability space and define the expectation of a function f as
$E[f] = \iint_R f\, dxdy/\pi$. The expectations of the random variables f (x, y) = x^n are
examples of moments. Find E[x], E[x^2], E[x^3] and E[x^4].
Problem 23.3: The fidget spinner is so “2017” now. What is hot now
is the math 22 spinner with 23 bearings! What is the moment of inertia
$\iint_G x^2 + y^2\, dxdy$ of the math 22 fidget spinner region G given in
polar coordinates as 1/2 ≤ r ≤ 2 + cos(22θ). To keep our bearings, we do
not count the bearings.
MATH 22A
Seminar
24.1. In this seminar we look around a bit in the literature and collect problem solving
strategies. We have seen already a few methods:
24.2. We will introduce a few more principles and tips and take the opportunity to
introduce some of the literature. We only look at 4 books.
24.3. The mother of all problem solving books is Polya’s “How to solve it” which was
published in 1945. If you read and absorb this book, you immediately get measurably
stronger in math. Still after more than 70 years, it is the best. Here are the now
famous Polya principles:
Polya principles
1. Understand the problem: unknowns, data, draw figure.
2. Devise a plan: similar or related problem?
3. Carry out the plan: check each step.
4. Examine the solution: can other problems be solved as such?
24.4. This sounds a bit like “open the door, step through the door, close the door”
advice on “how to exit the house”. But it is amazing to see the power in a method.
Why is it powerful? Because if one sees a harder problem for the first time, one is totally
lost. (Proof: if not, then the problem was easy ....) Where do we start? This is where
it is good to already have a guide telling you: well, just first start to understand the
problem.
24.6. And here is another problem from Polya, slightly reformulated. Work out also
this problem using the Polya principles:
24.7. The second best book in our collection is “Solving mathematical problems” by
Terence Tao. Why? Like Polya, Tao has also proven important new theorems (many
as a single author) and so got some street cred. Here are some problems from his book:
24.8. Tao calls the following identity “his favourite algebraic identity”. We have done
the case of the sum of the first n squares in a practice exam.
24.9. Tao does not give a formal list of strategies, but explains in an example on page
4 the following principles. We paraphrase here these “deformation principles”:
Tao’s deformation principles
a. Consider special, extreme or degenerate cases.
b. Solve a simplified version of the problem
c. Formulate a conjecture
d. Derive intermediate steps which would get it.
e. Reformulate, especially try contraposition.
f. Examine solutions of similar problems
g. Generalize the problem
24.10. The book of Perkins skillfully analyses the mechanisms of breakthrough ideas.
It distills the following mechanism for breakthrough ideas. It captures it pretty well,
since problems which are solved quickly rarely cover new ground.
Perkins
1. Long search. 99 percent perspiration. Work for years or decades.
2. Little apparent progress. Many failures.
3. A precipitating event. Maybe external circumstances.
4. A cognitive snap. Usually in a flash. Eureka!
5. Transformation. Flesh it out. Consequences.
24.11. The following exercise is from Perkins’ book. Try to solve it yourself and also
keep track of how you pursue the task of solving the problem.
24.12. If this was too easy (experiments show that some people can answer it very
quickly, while for others it takes longer), try this one, also from Perkins:
Problem G: You are driving a jeep through the Sahara desert. You
encounter someone lying face down in the sand, dead. There are no tracks
anywhere around. There has been no wind for days to destroy tracks. You
look into the pack on the person’s back. What do you find?
24.13. The book of Posamentier and Krulik is more intended for the teacher and less
for the research mathematician. It goes through the following principles
Posamentier-Krulik
1. Reason logically 2. Recognize patterns
3. Work backwards 4. Adopt different view
5. Consider extreme cases 6. Solve simpler problems
7. Organize data 8. Make a picture
9. Account all possibilities 10. Experiment, guess and test
24.14. Here is a strategy which often occurs: “make it more general”. In the book
“Posamentier-Krulik: Problem-Solving-Strategies in mathematics” for example is the
problem:
Problem H: We have a 5 × 5 seating arrangement of students. The
teacher wants every student to change place and move to a seat to the
left, right, front or back. Is it possible? Solve this problem by looking first
at smaller classrooms like 2 × 2 or 3 × 3 or 2 × 3. In which cases is it
possible?
24.1 A nursery rhyme is the riddle “As I was going to St. Ives, I met
a man with seven wives, Each wife had seven sacks, Each sack had seven
cats, Each cat had seven kits: Kits, cats, sacks, and wives, How many
were there going to St. Ives?” Pretend not to know the answer, solve the
riddle and follow the Polya principles. The rhyme was inspired by one of
the oldest problem texts in math, the Rhind Papyrus. But it was a more
serious question which translates: “how many kits came from St Ives”?
24.3 (Tao). Find all triangles for which the side lengths form an arithmetic
progression a, a + d, a + 2d.
24.4 Here are a few children’s riddles. We hope you don’t know all of
them (if you know the answer there is little benefit). Keep a log of how
you search for an answer: a) I’m tall when I’m young and I’m short when
I’m old. What am I? b) What gets wetter and wetter the more it dries?
c) What can run but can’t walk? d) What is full of holes and still holds
water?
MATH 22A
Lecture
25.1. A basic solid R in R^n is a bounded region enclosed by finitely many surfaces
gi (x1 , · · · , xn ) = ci . A solid is a finite union of such basic solids. We focus here mostly
on n = 3. A 3D integral $I = \iiint_R f(x, y, z)\, dxdydz$ is defined in the same way as a
limit of a Riemann sum In which for a given integer n is defined as
$$ I_n = \frac{1}{n^3} \sum_{(i/n,\, j/n,\, k/n) \in R} f\Big(\frac{i}{n}, \frac{j}{n}, \frac{k}{n}\Big) . $$
The convergence is proven in the same way. The boundary contribution can be neglected
in the limit n → ∞. If Φ : R → E is a parametrization of the solid, then
Theorem: $\iiint_R f(\Phi(u, v, w))\, |d\Phi(u, v, w)|\, dudvdw = \iiint_E f(x, y, z)\, dxdydz$
1An exam problem at ETH in a single variable calculus exam when Oliver was an undergrad.
2Archimedes Revenge, first appeared in Math S21a exam, Harvard Summer School, 2017
Figure 3. Illustrating two harder problems: the pen problem and the
“Archimedes revenge problem” asking to prove that $E : x^2 + y^2 - z^2 \le 1$,
$y^2 + z^2 - x^2 \le 1$, $z^2 + x^2 - y^2 \le 1$ has Vol(E) = log(256).
Homework
Problem 25.1: Find the moment of inertia $\iiint_E x^2 + y^2\, dV$, where
$E = \{x^2 + y^2 \le z^2, |z|^2 \le 1\}$ is the double cone.
MATH 22A
Lecture
26.1. A map r : R ⊂ R^2 → R^3 has an image r(R) = S which is a parametrized
surface. What is its surface area? We have seen that the distortion factor is now
$|dr| = \sqrt{\det(g)} = |r_u \times r_v|$, where $g = dr^T dr$ was the first fundamental form of the
surface. Of course, it is more convenient to use $|r_u \times r_v|$, which is the same as |dr|.
Theorem: The surface area $\iint_S dS$ of S is $\iint_R |r_u \times r_v|\, dudv$.
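The identity between the two forms of the distortion factor can be tested on a concrete point. The sketch below is illustrative only: it uses a hypothetical parametrization r(u, v) = [u, v, u^2 + v^2]^T (a paraboloid, chosen for this example), for which r_u = [1, 0, 2u]^T and r_v = [0, 1, 2v]^T, and compares the square root of det(dr^T dr) with |r_u × r_v|.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gram_distortion(ru, rv):
    """sqrt(det(dr^T dr)), where dr has the columns ru and rv."""
    E, F, G = dot(ru, ru), dot(ru, rv), dot(rv, rv)
    return math.sqrt(E * G - F * F)

def cross_distortion(ru, rv):
    """|ru x rv|."""
    c = cross(ru, rv)
    return math.sqrt(dot(c, c))

u, v = 0.3, -0.7
ru = (1.0, 0.0, 2 * u)   # partial derivative of r with respect to u
rv = (0.0, 1.0, 2 * v)   # partial derivative of r with respect to v
print(gram_distortion(ru, rv), cross_distortion(ru, rv))
```

Both functions return the same number, here the square root of 1 + 4u^2 + 4v^2; the cross product version is usually less work by hand.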
26.3. Here is the most general change of integration formula for maps r : R^m → R^n,
with distortion factor $|dr| = \sqrt{\det(dr^T dr)}$. The formula holds for m > n too, det is
then a pseudo determinant. If S = r(R) is the image of a solid R under a C^2 map r
and f : R^n → R is a function, then the mother of all substitution formulas is
Theorem: $\iint_R f(r(u))\, |dr(u)|\, du = \iint_S f(u)\, du$.
26.4. The proof is the same as seen in the two-dimensional change of variable situation.
Just because n is used for the target space R^n, we use the basic size 1/N .
We chop up the region into parts R ∩ Q with cubes Q of size 1/N and estimate
the difference between Vol(dr(Q)) and Vol(r(Q)) by C M_N/N^2, leading to an overall difference
bounded by F C M_N/N^2, where F is the maximal value of f on R and M_N is the
Heine-Cantor modulus of continuity of f . Adding everything up gives an
error F C Vol(R) M_N + 2n Vol(δR)F/N → 0, where δR is the boundary of R. There is
one new thing: we have to see why $\sqrt{\det(A^T A)}$ is the volume of the parallelepiped
spanned by the column vectors of the Jacobian matrix A = dr. We will talk about
determinants in detail later but if A is in row reduced echelon form then A^T A is the
identity matrix and the determinant is 1, agreeing with the volume. Now notice that if
a column of A is scaled by λ producing a new matrix B, then det(B^T A) = λ det(A^T A)
and det(B^T B) = λ^2 det(A^T A). If two columns of A are swapped leading to a new
matrix B, then det(B^T A) = −det(A^T A) and det(B^T B) = det(A^T A). If a column of A
is added to another column, then this does not change det(B^T B). The only row reduction
step which affects the |dr| is the scaling. But that is completely in sync with what happens
with the volume. QED.
1 Unfortunately, scalar integrals are often placed close to the integration of differential forms (like
volume forms). The latter are of different nature and use an integration theory in which spaces come
with orientation. So far, replacing r(u, v) with r(v, u) gives the same result (like area or mass).
26.5. The last theorem covers everything we have seen and we ever need to know when
integrating scalar functions over manifolds. In the special case n = m it leads to:
Theorem: $\iint_R |dr(u)|\, du = \mathrm{Vol}(S)$.
Examples
26.7. In all the examples of surface area computations, we take a parametrization
r(u, v) : R → S, then use that the distortion factor is $\sqrt{\det(dr^T dr)} = |r_u \times r_v|$.
Figure 1. The distortion factors $|dr| = \sqrt{|g|} = \sqrt{\det(g)} = \sqrt{\det(dr^T dr)}$
appear in general. For m = 2, n = 3 we get surface area $\iint_R |r_u \times r_v|\, dudv$.
and
Theorem: $|B_n| = \frac{2\pi}{n} |B_{n-2}|$, $|S_n| = \frac{2\pi}{n-1} |S_{n-2}|$.
The 5-ball has maximal volume 5.26379... among all unit balls. The 6-sphere has
maximal surface area 33.0734... among all unit spheres. The volume of the 30-ball is
only 0.00002.... The surface area of the 30-sphere for example is only 0.0003. Compare
with an n-dimensional unit cube of volume 1 and a boundary surface area 2n. High dimensional
spheres and balls are tiny!
[Plot over the dimension, with axis ticks from 5 to 30.]
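The two recursions are easy to iterate. The sketch below assumes the starting values |B_0| = 1, |B_1| = 2 and |S_0| = 2, |S_1| = 2π, and assumes that S_n denotes the n-dimensional unit sphere (the boundary of the (n+1)-ball), which is the convention under which the 6-sphere has area 33.0734. The function names are made up for this illustration.

```python
import math

def unit_ball_volumes(nmax):
    """Volumes |B_n| for n = 0..nmax from |B_n| = (2 pi / n) |B_{n-2}|."""
    B = [1.0, 2.0]  # assumed start: a point and the interval [-1, 1]
    for n in range(2, nmax + 1):
        B.append(2 * math.pi / n * B[n - 2])
    return B

def unit_sphere_areas(nmax):
    """Areas |S_n| for n = 0..nmax from |S_n| = (2 pi / (n - 1)) |S_{n-2}|."""
    S = [2.0, 2 * math.pi]  # assumed start: two points and the unit circle
    for n in range(2, nmax + 1):
        S.append(2 * math.pi / (n - 1) * S[n - 2])
    return S

B = unit_ball_volumes(30)
S = unit_sphere_areas(30)
best_ball = max(range(1, 31), key=lambda n: B[n])
best_sphere = max(range(1, 31), key=lambda n: S[n])
print(best_ball, B[best_ball], best_sphere, S[best_sphere], B[30], S[30])
```

The output reproduces the numbers quoted above: the maximum volume at n = 5, the maximum surface area at n = 6, and the rapid decay by dimension 30.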
26.15. If S is a cylinder x^2 + y^2 = 1, 0 < z < 1, triangulated with each triangle
smaller than 1/n → 0, does the area converge to the surface area A(S)? No! A counter
example is the Schwarz lantern from 1880. The cylinder is cut into m slices and
n points are marked on the rim of each slice to get triangles like A = (1, 0, 0), B =
(cos(4π/n), sin(4π/n), 0), C = (cos(2π/n), sin(2π/n), 1/m) of area
$\sin(2\pi/n)\, (1/m)\, \sqrt{2 + 3m^2 - 4m^2 \cos(2\pi/n) + m^2 \cos(4\pi/n)} / \sqrt{2}$. The nm triangles have
area $\sim 2\pi \sqrt{2 + 8 m^2 \pi^4/n^4} / \sqrt{2}$. For m = n^3, the triangulated area diverges.
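The divergence can be observed numerically. This sketch takes the triangle A, B, C exactly as given in the text, computes its area with a cross product, and multiplies by nm as the text does; the function names are chosen here for illustration.

```python
import math

def triangle_area(A, B, C):
    """Area of the triangle ABC in space, as half the length of (B - A) x (C - A)."""
    u = [B[i] - A[i] for i in range(3)]
    w = [C[i] - A[i] for i in range(3)]
    cx = u[1] * w[2] - u[2] * w[1]
    cy = u[2] * w[0] - u[0] * w[2]
    cz = u[0] * w[1] - u[1] * w[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def lantern_area(n, m):
    """Total area of nm copies of the triangle A, B, C from the text."""
    A = (1.0, 0.0, 0.0)
    B = (math.cos(4 * math.pi / n), math.sin(4 * math.pi / n), 0.0)
    C = (math.cos(2 * math.pi / n), math.sin(2 * math.pi / n), 1.0 / m)
    return n * m * triangle_area(A, B, C)

print(lantern_area(1000, 1000), 2 * math.pi)        # m = n: close to the cylinder area
print(lantern_area(10, 10**3), lantern_area(20, 20**3))  # m = n^3: grows with n
```

With m = n the triangulated area approaches the true area 2π of the cylinder, while with m = n^3 the thin triangles tilt more and more away from the surface and their total area grows without bound.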
Homework
Problem 26.2: Find the area of the surface given by the helicoid
r(u, v) = [u cos(v), u sin(v), v]T with 0 ≤ u ≤ 1, 0 ≤ v ≤ π.
MATH 22A
Seminar
27.1. In this seminar we have the honor to have Archimedes as a special guest. We
talk to him using a technology called “quantum forward tunneling” which allows us to
interact with part of the past without running into a causality paradox. The actual
Archimedes did not know about the interview. It is his “quantum spirit” which does
it for us. How does it work? Quantum space-time produces sometimes tiny wormhole
constellations in which a wave function can be trapped. By harvesting many of those
trapped waves, we can rebuild and interact with an object or person from a previous
time. The so established “time tunnel” is sustainable only for a short time as the
trapped waves will fade within half an hour. It is enough time however for a short
interview. We take the opportunity and ask him about his theorems.
27.2. Math 22a: What a pleasure to have you here. Welcome! Archimedes: I’m
glad to find myself in this lovely place. It must be a dream. I don’t recognize the
town but it feels like an ‘Alexandria in the future’. Math 22a: Yes, it is also a hot
spot for science, but there are many now. We are eager to learn a bit about your proof
expertise.
27.3. Math 22a: What result of yours do you consider the most important one?
Archimedes: Definitely the formula for the volume of the sphere! Math 22a: Why?
Archimedes: It was much harder to get this than the circumference of the circle or the
surface area of the sphere. It was also harder to test the result experimentally. Math
22a: How did you measure? Archimedes: We built wooden models of cylinders, cones
and spheres of the same base radius and height and measured their volume ratios.
Problem A: Explain how Archimedes can measure the volumes of the wooden
models. If you don’t know, take a bath. Given a cylinder C, a cone
O and a sphere S of base length 1, what ratios |C|/|S|, |O|/|S| do the
measurements show?
27.4. Math 22a: Was the comparison of the sphere with the complement of a cone
in the cylinder historically the first proof? Archimedes: The relation had been
conjectured before. It had been suspected that the ratio between the volume of a
sphere and the volume of a cylinder is the fraction 2/3 but nobody had been able to
prove this relation before I could see the slicing trick.
Problem B: Explain why slicing the unit sphere at height z gives the
same area as a ring of radius 1 in which a hole of size z has been
drilled.
27.5. Math 22a: Do you remember the precise moment when the discovery struck?
Archimedes: I don’t recall directly but it must have been one of these “hot tub ideas”.
27.6. Math 22a: This discovery must have occurred after you got the circle circumference
computed. How difficult was the latter? Archimedes: This also needed some
time. It emerged pretty early that the circumference is somehow proportional to the
radius. The measurement of the constant was then a bit trickier, even though it remained
open what fraction it is. 22/7 was close. I got first the diameter/area ratio. I did that
using the following picture.
Problem B: How does the picture below prove that the area A, radius
R and diameter D of a circle satisfy 2A = RD? How can you make this
precise, as in reality the circular sector does not have the same area as the
triangle? (Hint: you can use modern tools like L’Hôpital’s rule if you like).
27.7. Math 22a: We also wonder about your computation of the volume of the
“hoof” which is the solid bounded by the cylinder x^2 + y^2 = 1 and z = x and z = 0.
Archimedes: I don’t recognize the symbols you just spelled out but I know what
object you are talking about. It was exciting to see a solid bounded partly by round
parts to have a rational volume, which is two thirds of the height. One can see that the
result is 2/3 in various ways.
27.8. Math 22a: Also very impressive is your computation of the surface area of the
sphere by relating it with the surface area of a cylinder. What was the intuition there?
Archimedes: Actually, a drawing which is accurate enough shows this pretty well.
As both situations have circular symmetry, we only need to understand what happens
with the lengths on a sphere when it is projected on the cylinder. There are similar
triangles. Take a stick of some length and place it onto the sphere pointing to the
north pole. If it gets closer to the pole and its height-length is one half of the actual
length, then the radius of that position is also half, etc. As the area of a small sphere
strip is height times radius times about 22/7, this is also the area of a cylinder. In
the sphere case, the factor one-half is applied to the radius. In the cylinder case it is
applied to the height.
change of πr^2 is 2πr and indeed this is my formula for the circumference of a circle.
This is “Phaidros”.
27.10. Math 22a: Thank you very much for the interview. It will inspire us for the
second midterm exam. Maybe you can visit and take the exam on Tuesday or review
on Sunday? Archimedes: “It will be my pleasure.”
Homework
27.1 Find a solid which has the property that if you project it on the
xy-plane it is a half circle, if you project it on the yz plane it is a triangle
and if you project it onto the xz-plane, it is a rectangle.
27.2 There are regions in the plane which have the property that their
thickness is constant 1 but which are not circles. Find some.
MATH 22A
Partial Derivatives
$f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$ partial derivative
L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ) linear approximation
Q(x, y) = L(x, y) + fxx (x − x0 )^2/2 + fyy (y − y0 )^2/2 + fxy (x − x0 )(y − y0 ) quadratic approximation
L(x, y) estimates f (x, y) near (x0 , y0 ). The result is f (x0 , y0 ) + a(x − x0 ) + b(y − y0 )
tangent line: ax + by = d with a = fx (x0 , y0 ), b = fy (x0 , y0 ), d = ax0 + by0
tangent plane: ax + by + cz = d with a = fx , b = fy , c = fz , d = ax0 + by0 + cz0
estimate f (x, y, z) by L(x, y, z) near (x0 , y0 , z0 )
fxy = fyx Clairaut’s theorem, if fxy and fyx are continuous.
ru (u, v), rv (u, v) tangent to surface parameterized by r(u, v)
Gradient
∇f (x, y) = [fx , fy ]T , ∇f (x, y, z) = [fx , fy , fz ]T , gradient
Dv f = ∇f · v directional derivative
$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$ chain rule
∇f (x0 , y0 ) is orthogonal to the level curve f (x, y) = c containing (x0 , y0 )
∇f (x0 , y0 , z0 ) is orthogonal to the level surface f (x, y, z) = c containing (x0 , y0 , z0 )
$\frac{d}{dt} f(x + tv) = D_v f$ by chain rule
(x − x0 )fx (x0 , y0 , z0 ) + (y − y0 )fy (x0 , y0 , z0 ) + (z − z0 )fz (x0 , y0 , z0 ) = 0 tangent plane
f (x, y) increases in the ∇f /|∇f | direction. Functions dance upwards.
f (x, y, z) = c defines z = g(x, y), and gx (x, y) = −fx (x, y, z)/fz (x, y, z) implicit diff
Extrema
∇f (x, y) = [0, 0]T , critical point or stationary point
$D = f_{xx} f_{yy} - f_{xy}^2$ discriminant, useful in second derivative test
f (x0 , y0 ) ≥ f (x, y) in a neighborhood of (x0 , y0 ) local maximum
f (x0 , y0 ) ≤ f (x, y) in a neighborhood of (x0 , y0 ) local minimum
Double Integrals
$\iint_R f(x, y)\, dy\,dx$ double integral
$\int_a^b \int_{c(x)}^{d(x)} f(x, y)\, dy\,dx$ bottom-to-top region
$\int_c^d \int_{a(y)}^{b(y)} f(x, y)\, dx\,dy$ left-to-right region
$\iint_R f(r, \theta)\, r\, dr\,d\theta$ polar coordinates
$\iint_R |r_u \times r_v|\, du\,dv$ surface area
$\int_a^b \int_c^d f(x, y)\, dy\,dx = \int_c^d \int_a^b f(x, y)\, dx\,dy$ Fubini
$\iint_R 1\, dx\,dy$ area of region R
$\iint_R f(x, y)\, dx\,dy$ signed volume of solid bound by graph of f and xy-plane
Triple Integrals
$\iiint_R f(x, y, z)\, dz\,dy\,dx$ triple integral
$\int_a^b \int_c^d \int_u^v f(x, y, z)\, dz\,dy\,dx$ integral over rectangular box
$\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z)\, dz\,dy\,dx$ type I region
$\iiint_R f(r, \theta, z)\, r\, dz\,dr\,d\theta$ integral in cylindrical coordinates
$\iiint_R f(\rho, \theta, \phi)\, \rho^2 \sin(\phi)\, d\rho\,d\phi\,d\theta$ integral in spherical coordinates
$\int_a^b \int_c^d \int_u^v f(x, y, z)\, dz\,dy\,dx = \int_u^v \int_c^d \int_a^b f(x, y, z)\, dx\,dy\,dz$ Fubini
$V = \iiint_E 1\, dz\,dy\,dx$ volume of solid E
$M = \iiint_E f(x, y, z)\, dz\,dy\,dx$ mass of solid E with density f.
General advice
Draw the region when integrating in higher dimensions.
Consider other coordinate systems if the integral does not work.
Consider changing the order of integration if the integral does not work.
For tangent planes, compute the gradient [a, b, c]T first then fix the constant.
When looking at relief problems, mind the gradient.
Theorems
$f_{xy} = f_{yx}$, Taylor, $\iint f\, dx\,dy = \iint f\, dy\,dx$, Morse theorem, chain rule, gradient theorem,
change of variables
People
Clairaut, Fubini, Lagrange, Fermat, Riemann, Archimedes, Hamilton, Euler, Taylor,
Morse, Hopf
Problems
b) The series $f(x) = \sum_{k=0}^{\infty} x^k/k! = 1 + x + x^2/2! + x^3/3! + \cdots$ represents a
function. Which one?
d) What is the name of the function $f(s) = \sum_{n=1}^{\infty} n^{-s}$?
e) On a circular island there are exactly 3 maxima and one minimum for the
height f. Assuming f is a Morse function, how many saddle points are there?
f) Which mathematician first found the value for the volume of the ball
$x^2 + y^2 + z^2 \le 1$?
We see the level curves of a Morse function f. The circle through A, B, C will
sometimes serve as a constraint $g(x, y) = x^2 + y^2 = 1$. In all questions, we only pick
points from A, B, C, D, E, F, G, H, I, J, K, L, M.
[Figure: level curves of f with the marked points A through M.]
Problem 28.4 (10 points):
b) (2 points) Does the function f (x, y) have a global minimum or global max-
imum?
Using the Lagrange optimization method, find the parameters (x, y) for which
the area of an arch
$f(x, y) = 2x^2 + 4xy + 3y^2$
is minimal, while the perimeter
g(x, y) = 8x + 9y = 33
is fixed.
Linear Algebra and Vector Analysis
MATH 22A
Lecture
29.1. A vector field F assigns to every point $x \in \mathbb{R}^n$ a vector $F(x) = [F_1(x), \dots, F_n(x)]^T$
such that every $F_k(x)$ is a continuous function. We think of F as a force field. Let
$t \to r(t) \in \mathbb{R}^n$ be a curve parametrized on [a, b]. The integral
$$\int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t)\, dt$$
is called the line integral of F along C. We think of $F(r(t)) \cdot r'(t)$ as power and
$\int_C F \cdot dr$ as the work. Even though F and r are column vectors, we write in this lecture
$[F_1(x), \dots, F_n(x)]$ and $r' = [x_1', \dots, x_n']$ to avoid clutter. Mathematically, $F: \mathbb{R}^n \to \mathbb{R}^n$
can also be seen as a coordinate change; we think about it differently, however, and
draw a vector F(x) at every point x.
29.2. If $F(x, y) = [y, x^3]$ and $r(t) = [\cos(t), \sin(t)]$ is a circle with $0 \le t \le 2\pi$, then
$F(r(t)) = [\sin(t), \cos^3(t)]$ and $r'(t) = [-\sin(t), \cos(t)]$ so that $F(r(t)) \cdot r'(t) = -\sin^2(t) + \cos^4(t)$.
The work is $\int_C F \cdot dr = \int_0^{2\pi} -\sin^2(t) + \cos^4(t)\, dt = -\pi/4$. Figure 1 shows the
situation. We go more against the field than with the field.
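The value $-\pi/4$ can be checked numerically. The following is a minimal sketch, assuming a midpoint rule with chord increments as the approximation scheme; the helper name `line_integral` and the step count are illustrative choices, not part of the lecture.

```python
import math

def line_integral(F, r, a, b, n=20000):
    """Approximate the integral of F(r(t)) . r'(t) over [a, b].

    Midpoint rule in t; the increment r'(t) dt is replaced by the chord
    r(t + h/2) - r(t - h/2), which agrees with it up to O(h^3).
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        p0, p1 = r(t - h / 2), r(t + h / 2)
        dr = [q1 - q0 for q0, q1 in zip(p0, p1)]
        total += sum(f * d for f, d in zip(F(*r(t)), dr))
    return total

# The field and the curve of paragraph 29.2.
F = lambda x, y: [y, x**3]
circle = lambda t: [math.cos(t), math.sin(t)]

work = line_integral(F, circle, 0.0, 2 * math.pi)
print(work, -math.pi / 4)  # the two numbers should agree to many digits
```

The negative sign of the result reflects that the curve runs more against the field than with it.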
29.3. A vector field F is called a gradient field if $F(x) = \nabla f(x)$ for some differentiable
function f. We think of f as the potential. The first major theorem in vector
calculus is the fundamental theorem of line integrals for gradient fields in $\mathbb{R}^n$:
Theorem: $\int_a^b \nabla f(r(t)) \cdot r'(t)\, dt = f(r(b)) - f(r(a))$.
29.4. Proof: by the chain rule, $\nabla f(r(t)) \cdot r'(t) = \frac{d}{dt} f(r(t))$. The fundamental
theorem of calculus now gives $\int_a^b \frac{d}{dt} f(r(t))\, dt = f(r(b)) - f(r(a))$. QED.
29.5. As a corollary we immediately get path independence for gradient fields:
If $F = \nabla f$ and $C_1$, $C_2$ are two curves from A to B, then $\int_{C_1} F \cdot dr = \int_{C_2} F \cdot dr$.
29.6. Is every vector field F a gradient field? Let's look at the case n = 2, where
F = [P, Q]. Now, if this is equal to $[f_x, f_y] = [P, Q]$, then $P_y = f_{xy} = f_{yx} = Q_x$. We
see that $Q_x - P_y = 0$. More generally, we have the following Clairaut criterion:
29.10. Proof: By the fundamental theorem of line integrals, we can replace $C_{xy}$ by a
path [t, 0] going from (0, 0) to (x, 0) and then with [x, t] to (x, y). The line integral is
$f(x, y) = \int_0^x [P, Q] \cdot [1, 0]\, dt + \int_0^y [P, Q] \cdot [0, 1]\, dt = \int_0^x P(t, 0)\, dt + \int_0^y Q(x, t)\, dt$. We see
that $f_y = Q(x, y)$. If we use the path going from (0, 0) to (0, y) and then to (x, y) instead, the line
integral is $f(x, y) = \int_0^y [P, Q] \cdot [0, 1]\, dt + \int_0^x [P, Q] \cdot [1, 0]\, dt = \int_0^y Q(0, t)\, dt + \int_0^x P(t, y)\, dt$.
Now, $f_x = P(x, y)$. QED.
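The construction in the proof can be tried out on a concrete field. The sketch below assumes the field $[P, Q] = [2xy^2 + 3x^2, 2x^2 y + 3y^2]$ from example 29.11 and builds the potential by integrating along the two-leg path (0, 0) → (x, 0) → (x, y). The Simpson helper `integrate`, the sample point and the finite-difference step are illustrative assumptions.

```python
def P(x, y):
    return 2 * x * y**2 + 3 * x**2

def Q(x, y):
    return 2 * x**2 * y + 3 * y**2

def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

def potential(x, y):
    # First leg [t, 0] from (0, 0) to (x, 0), second leg [x, t] from (x, 0) to (x, y).
    return integrate(lambda t: P(t, 0.0), 0.0, x) + integrate(lambda t: Q(x, t), 0.0, y)

x0, y0 = 1.3, -0.7                      # arbitrary sample point
f_num = potential(x0, y0)
f_exact = x0**3 + x0**2 * y0**2 + y0**3  # closed form of the potential

# Central differences of the constructed potential recover P and Q.
h = 1e-5
fx = (potential(x0 + h, y0) - potential(x0 - h, y0)) / (2 * h)
fy = (potential(x0, y0 + h) - potential(x0, y0 - h)) / (2 * h)
print(f_num, f_exact, fx - P(x0, y0), fy - Q(x0, y0))
```

Because $Q_x = P_y = 4xy$ for this field, both partial derivatives of the constructed function match the field, as the proof predicts.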
Examples
29.11. Find $\int_C [2xy^2 + 3x^2, 2x^2 y + 3y^2] \cdot dr$ for a curve $r(t) = [t\cos(t), t\sin(t)]$ with
$t \in [0, 2\pi]$. Answer: we found already $F = \nabla f$ with $f = x^3 + x^2 y^2 + y^3$. The curve
starts at A = (0, 0) and ends at B = (2π, 0). The solution is $f(B) - f(A) = 8\pi^3$.
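As a sanity check, the line integral along the spiral can also be computed directly and compared with the potential difference. This is a sketch under the assumption that a plain midpoint rule with 20000 steps is accurate enough; the curve $r(t) = [t\cos(t), t\sin(t)]$ starts at r(0) = (0, 0) and ends at r(2π) = (2π, 0).

```python
import math

def P(x, y):
    return 2 * x * y**2 + 3 * x**2

def Q(x, y):
    return 2 * x**2 * y + 3 * y**2

def f(x, y):
    return x**3 + x**2 * y**2 + y**3

def power(t):
    """F(r(t)) . r'(t) for the spiral r(t) = [t cos(t), t sin(t)]."""
    x, y = t * math.cos(t), t * math.sin(t)
    dx = math.cos(t) - t * math.sin(t)
    dy = math.sin(t) + t * math.cos(t)
    return P(x, y) * dx + Q(x, y) * dy

n = 20000
h = 2 * math.pi / n
work = sum(power((k + 0.5) * h) for k in range(n)) * h   # midpoint rule
exact = f(2 * math.pi, 0.0) - f(0.0, 0.0)                # f(B) - f(A)
print(work, exact, 8 * math.pi**3)
```

The direct computation needs thousands of function evaluations, while the fundamental theorem of line integrals needs two.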
29.12. If F = E is an electric field, then the line integral $\int_a^b E(r(t)) \cdot r'(t)\, dt$ is
an electric potential. In celestial mechanics, if F is the gravitational field, then
$\int_a^b F(r(t)) \cdot r'(t)\, dt$ is a gravitational potential difference. If f(x, y, z) is a temperature
and r(t) the path of a fly in the room, then f(r(t)) is the temperature which
the fly experiences at the point r(t) at time t. The change of temperature for the fly is
$\frac{d}{dt} f(r(t))$. The line integral of the temperature gradient $\nabla f$ along the path of the fly
coincides with the temperature difference.
29.13. A device which implements a non-gradient force field is called a perpetual
motion machine. It realizes a force field for which the energy gain is positive along
some closed loop. The first law of thermodynamics forbids the existence of such a
machine. It is informative to contemplate the ideas which people have come up with and to
see why they don’t work. We will look at examples in the seminar.
29.14. Let $F(x, y) = [P, Q] = [\frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2}]$. Its potential $f(x, y) = \arctan(y/x)$ has
the property that $f_x = (-y/x^2)/(1 + y^2/x^2) = P$, $f_y = (1/x)/(1 + y^2/x^2) = Q$. In the
seminar you ponder the riddle that the line integral along the unit circle is not zero:
$$\int_0^{2\pi} [\frac{-\sin(t)}{\cos^2(t) + \sin^2(t)}, \frac{\cos(t)}{\cos^2(t) + \sin^2(t)}] \cdot [-\sin(t), \cos(t)]\, dt = \int_0^{2\pi} 1\, dt = 2\pi.$$
The vector field F is called the vortex.
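Both halves of the riddle can be observed numerically: away from the origin the quantity $Q_x - P_y$ vanishes, yet the circulation around the unit circle is 2π. The sketch below assumes central differences for the partial derivatives; the sample point and step sizes are arbitrary illustrative choices.

```python
import math

def F(x, y):
    """The vortex field [-y, x] / (x^2 + y^2), undefined at the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def curl(x, y, h=1e-5):
    """Q_x - P_y by central differences, valid away from (0, 0)."""
    Qx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    Py = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    return Qx - Py

# Circulation along the unit circle r(t) = [cos t, sin t] by the midpoint rule.
n = 1000
h = 2 * math.pi / n
circulation = 0.0
for k in range(n):
    t = (k + 0.5) * h
    Fx, Fy = F(math.cos(t), math.sin(t))
    circulation += (Fx * (-math.sin(t)) + Fy * math.cos(t)) * h

print(curl(0.8, -0.3), circulation, 2 * math.pi)
```

The curl check only probes points where the field is defined; the singularity at the origin is what prevents arctan(y/x) from being a potential on a full loop around it.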
Figure 3. The vortex vector field has a singularity at (0, 0). All the
curl is concentrated at (0, 0).
Homework
Problem 29.1: Let C be the space curve $r(t) = [\cos(t), \sin(t), \sin(t)]$
for $t \in [0, \pi/2]$ and let $F(x, y, z) = [y, x, 15]$. Calculate the line integral
$\int_C F \cdot dr$.
Problem 29.2: What is the work done by moving in the force field
$F(x, y) = [2x^3 + 1, 4\pi \sin(\pi y^4) y^3]$ along the quartic $y = x^4$ from (−1, 1) to
(1, 1)?
Problem 29.3: Let F be the vector field F (x, y) = [−y, x]/2. Compute
the line integral of F along the curve r(t) = [a cos(t), b sin(t)] with width
2a and height 2b. The result should depend on a and b.
Problem 29.5: Find a closed curve C : r(t) for which the vector field
$F(x, y) = [P(x, y), Q(x, y)] = [xy, x^2]$
satisfies $\int_C F(r(t)) \cdot r'(t)\, dt \neq 0$.
MATH 22A
Seminar
30.1. Wouldn’t it be nice to have a machine which would produce energy from nothing?
Humans have dreamed about this for centuries. There is no mathematical proof that
such a machine cannot exist. It is an experimental fact that all isolated physical
processes we know preserve energy. [1] In experiments, we see that all basic forces of
nature are gradient fields. So, how come we can harvest energy from the wind force
for example? Wind energy is driven by external sources, in particular the solar energy
which heats up different parts of the earth surface. The sun energy comes from nuclear
processes, mainly the fusion process.
30.2. It is a nice sport to come up with machines which seem to work or then to
analyze a given machine which has been constructed and to find why it fails.
30.3. Our first machine is a circular pipe which is half filled with water. On the side
without water, the gravitational force pulls a wooden ball down. On the water side,
the buoyancy force pulls the ball up. Valves are in place so that the water stays in
place.
Problem A: Analyze the pipe machine. You can assume that operating
the valves uses arbitrarily little energy and when opening one of the valves,
the water stays in place.
[1] Except for very short times, where virtual particles can appear and disappear within a short time frame.
30.4. Another class of machines uses magnets. Magnets are arranged in a circular
way to produce a circular non-conservative force field in which a magnet is pushed
forward.
30.5. And then there are mechanical machines. Here is an example with weights.
Problem C: Analyze the hammer machine using line integrals using the
gravitational potential f (x, y, z) = z.
Figure 4. The capillary effect lifts the water level.
30.6. You all know that a sponge, a paper or a plant put into water lifts up the water
using the capillary effect. In narrow spaces, this force can beat gravity.
30.7. Why are there no “perpetual motion machines”? There is no fundamental prin-
ciple which forbids it. We could certainly produce a computer simulation of a world,
where energy conservation fails. But it is like with “time machines”. If such a machine
existed in our physical world, there would be serious dangers lurking for a physicist
who studies it. Benjamin Peirce refers in his book “A system of analytic mechanics”
of 1855 to the “Anthropic Principle”: “Such a series of motions would receive the
technical name of a ’perpetual motion’ by which is to be understood, that of a system
which would constantly return to the same position, with an increase of power, unless
a portion of the power were drawn off in some way and appropriated, if it were desired,
to some species of work. A constitution of the fixed forces, such as that here supposed
and in which a perpetual motion would be possible, may not, perhaps, be incompatible
with the unbounded power of the Creator; but, if it had been introduced into nature, it
would have proved destructive to human belief, in the spiritual origin of force, and the
necessity of a First Cause superior to matter, and would have subjected the grand plans
of Divine benevolence to the will and caprice of man”.
30.8. Non-conservative fields can also be generated by optical illusion as M.C. Es-
cher did. The illusion suggests the existence of a force field which is not conservative.
Can you figure out how Escher’s pictures ”work”? This is part of the homework. Here
is a last possible task for the seminar:
Homework
30.3 Design a vector field F(x, y) = [P(x, y), Q(x, y)] which has the
property that for any closed curve C : r(t) in $\{x^2 + y^2 > 1\}$ winding
once around the hole $\{x^2 + y^2 \le 1\}$, the line integral $\int_C F(r(t)) \cdot r'(t)\, dt$
is a multiple of 6π. An example of a curve winding once around is r(t) =
[2 cos(t), 2 sin(t)] with 0 ≤ t ≤ 2π.
30.4 A heat engine is a system that converts heat energy into mechanical
energy. We have seen such a machine in class. How does it work?
MATH 22A
Lecture
31.1. For a $C^1$ vector field F = [P, Q] in a region $G \subset \mathbb{R}^2$, the curl is defined as
$\mathrm{curl}(F) = Q_x - P_y$. Assume the boundary C of G oriented so that the region G is
to the left (meaning that if r(t) = [x(t), y(t)] is a parametrization, then the turned
velocity $[-y'(t), x'(t)]$ cuts through G close to r(t)). Green’s theorem assures that if
C is made of a finite collection of smooth curves, then
Theorem: $\iint_G \mathrm{curl}(F)\, dx\,dy = \int_C F(r(t)) \cdot dr(t)$.
31.2. Proof. It is enough to prove the theorem for F = [0, Q] or F = [P, 0] separately
and for regions G which are both “bottom to top” $G = B = \{a \le x \le b, c(x) \le y \le d(x)\}$
and “left to right” $G = L = \{c \le y \le d, a(y) \le x \le b(y)\}$. For F = [P, 0],
use a bottom to top integral, where the two vertical integrals along r(t) = [b, t] and
r(t) = [a, t] are zero. The integrals along r(t) = [t, c(t)] and r(t) = [t, d(t)] give
$$\int_a^b P(t, c(t))\, dt - \int_a^b P(t, d(t))\, dt = \int_a^b \int_{c(t)}^{d(t)} -P_y(t, s)\, ds\,dt = \iint_G -P_y\, ds\,dt.$$
For F = [0, Q], use a left to right integral, where the bottom and top integrals are zero
and where
$$\int_c^d Q(b(t), t)\, dt - \int_c^d Q(a(t), t)\, dt = \int_c^d \int_{a(s)}^{b(s)} Q_x(t, s)\, dt\,ds = \iint_G Q_x\, dt\,ds.$$
For a general field, write F = [0, Q] + [P, 0], use the first computation for [P, 0] and the second
computation for [0, Q]. For a general region, cut G along a small grid so that each part is of both
types. When adding the line integrals, only the boundary survives. QED.
Figure 1. To prove Green cut the region into regions which are “bot-
tom to top” and “left to right”. Interior cuts cancel.
31.3. To see that we can cut G into regions of both types, turn the coordinate system
first a tiny bit so that no horizontal nor vertical line segments appear at the boundary.
This is possible because we assume the boundary to consist of finitely many smooth
pieces. Now also use a slightly turned grid to chop up the region into smaller parts.
Now we have a situation where each piece has the form G = {(x, y) |c(x) ≤ y ≤
d(x)} = {(x, y) | a(y) ≤ x ≤ b(y)}, where a, b, c, d are piecewise smooth functions.
31.4. Green assures:
Theorem: If F is irrotational in R2 , then F is a gradient field.
Applications
31.6. Green’s theorem allows us to compute areas. If curl(F) = 1 and C is a curve enclosing
a region G, then $\mathrm{Area}(G) = \int_C F(r(t)) \cdot r'(t)\, dt$. For example, with F = [−y, x]/2
and r(t) = [a cos(t), b sin(t)], we get $\int_C F \cdot dr = \int_0^{2\pi} [-b\sin(t), a\cos(t)] \cdot [-a\sin(t), b\cos(t)]/2\, dt$
$= \int_0^{2\pi} ab/2\, dt = \pi ab$, which is the area of the ellipse $x^2/a^2 + y^2/b^2 = 1$.
31.7. What is the area of the region enclosed by $r(t) = [\cos(t), \sin(t) + \cos(22t)/22]$?
Take F(x, y) = [0, x]. The line integral is $\int_0^{2\pi} [0, \cos(t)] \cdot [-\sin(t), \cos(t) - \sin(22t)]\, dt = \pi$.
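Both area computations can be reproduced numerically with the field F = [0, x], whose curl is 1. The sketch below approximates $\int_C x\, dy$ by a trapezoid sum over chords, which is the shoelace formula for the inscribed polygon. The function name `enclosed_area` and the semi-axes a = 3, b = 2 are illustrative assumptions.

```python
import math

def enclosed_area(r, n=100000):
    """Signed area enclosed by the closed curve r(t), 0 <= t <= 2*pi.

    Approximates the line integral of F = [0, x] (the integral of x dy)
    by a trapezoid sum over n chords of the curve.
    """
    h = 2 * math.pi / n
    area = 0.0
    x0, y0 = r(0.0)
    for k in range(1, n + 1):
        x1, y1 = r(k * h)
        area += 0.5 * (x0 + x1) * (y1 - y0)
        x0, y0 = x1, y1
    return area

ellipse = lambda t: (3 * math.cos(t), 2 * math.sin(t))                 # a = 3, b = 2
wobbly = lambda t: (math.cos(t), math.sin(t) + math.cos(22 * t) / 22)   # curve of 31.7

area_ellipse = enclosed_area(ellipse)
area_wobbly = enclosed_area(wobbly)
print(area_ellipse, 6 * math.pi, area_wobbly, math.pi)
```

The wobbles of the second curve add and remove equal amounts of area, so the result stays π.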
31.8. The planimeter is an analogue computer which computes the area of regions.
It works because of Green’s theorem. The vector F(x, y) is a unit vector perpendicular
to the second leg (a, b) → (x, y) if (0, 0) → (a, b) is the first leg. Given (x, y) we find
(a, b) by intersecting two circles. The magic is that the curl of F is constant 1. The
following computer assisted computation proves this:
s = Solve[{(x - a)^2 + (y - b)^2 == 1, a^2 + b^2 == 1}, {a, b}];
{A, B} = First[{a, b} /. s]; F = {-(y - B), x - A}; Simplify[Curl[F, {x, y}]]
Homework
Problem 31.1: Calculate the line integral $\int_C F \cdot dr$ with
$F = [-22y + 3x^2 \sin(y) + 2222 \sin(x^6),\; x^3 \cos(y) + 2342 y^{22} \sin(y)]^T$ along a triangle C which
traverses the vertices (0, 0), (7, 0) and (7, 11) back to (0, 0) in this order.
Problem 32.3: Find $\int_C [\sin(\sqrt{1 + x^3}), 7x] \cdot dr$, where C is the boundary
of the region K(n). You see in the picture K(0), K(1), K(2), K(3), K(4).
The first K(0) is an equilateral triangle of side length 1. The second K(1) is
K(0) with 3 equilateral triangles of side length 1/3 added. K(2) is K(1) with
$3 \cdot 4^1$ equilateral triangles of side length 1/9 added. K(3) is K(2) with $3 \cdot 4^2$
triangles of side length 1/27 added and K(4) is K(3) with $3 \cdot 4^3$ triangles of side length 1/81
added. What is the line integral in the Koch Snowflake limit K = K(∞)?
The curve K is a fractal of dimension log(4)/log(3) = 1.26....
Problem 32.5: Let C be the boundary curve of the white Yang part
of the Yin-Yang symbol in the disc of radius 6. You can see in the image
that the curve C has three parts, and that the orientation of each part is
given. Find the line integral of the vector field
$F(x, y) = [-y + \sin(e^x), x]^T$
along C. There are three separate line integrals.
MATH 22A
Lecture
32.1. Given a $C^1$ surface S = r(G) in $\mathbb{R}^3$ and a differentiable vector field F = [P, Q, R],
we can form the flux integral
$$\iint_S F \cdot dS = \iint_G F(r(u, v)) \cdot (r_u \times r_v)\, du\,dv.$$
For F = [P, Q, R], the curl is defined as $\nabla \times F = [R_y - Q_z, P_z - R_x, Q_x - P_y]$. The
Stokes theorem tells that if C = r(I) is the boundary of S = r(G) and I is oriented
so that G is to the left, then
Theorem: $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$.
This is straightforward and done in class. Now define the field F̃ (u, v) = [P̃ , Q̃] =
[F (r(u, v)) · ru (u, v), F (r(u, v)) · rv (u, v)] in the uv-plane. The 2-dimensional curl of F̃
is Q̃u − P̃v = Fu · rv − Fv · ru as we can see by using Clairaut ruv = rvu . The Stokes
theorem is now a direct consequence of Green’s theorem proven last time. QED. 1
Examples
32.3. Problem: Compute the flux of $F(x, y, z) = [0, 0, 8z^2]^T$ through the upper half
unit sphere S oriented outwards. Solution: we parametrize the surface as
$r(u, v) = [\cos(u)\sin(v), \sin(u)\sin(v), \cos(v)]^T$. Because $r_u \times r_v = -\sin(v)\, r$, this parametrization
has the wrong orientation! We continue nevertheless and just change the sign at the
end. We have $F(r(u, v)) = [0, 0, 8\cos^2(v)]^T$ so that the integral to compute is
$$\int_0^{2\pi} \int_0^{\pi/2} -[0, 0, 8\cos^2(v)]^T \cdot [\cos(u)\sin^2(v), \sin(u)\sin^2(v), \cos(v)\sin(v)]^T\, dv\,du.$$
The flux integral is $\int_0^{2\pi} \int_0^{\pi/2} -8\cos^3(v)\sin(v)\, dv\,du$ which is $2\pi \cdot 8\cos^4(v)/4 \,\big|_0^{\pi/2} = -4\pi$.
The flux with the outward orientation is +4π. We could not use the Stokes theorem
here because we don’t deal with the flux of the curl but the flux of F itself.
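The flux computation can be reproduced numerically from the parametrization alone. This is a sketch assuming a midpoint rule on a 200 by 200 grid in the (u, v) parameter rectangle; the helper names are illustrative. The normal $r_u \times r_v$ is computed from the partial derivatives, so the inward orientation and the resulting sign show up without being put in by hand.

```python
import math

def r(u, v):
    return (math.cos(u) * math.sin(v), math.sin(u) * math.sin(v), math.cos(v))

def r_u(u, v):
    return (-math.sin(u) * math.sin(v), math.cos(u) * math.sin(v), 0.0)

def r_v(u, v):
    return (math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), -math.sin(v))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def F(x, y, z):
    return (0.0, 0.0, 8 * z * z)

nu, nv = 200, 200
du, dv = 2 * math.pi / nu, (math.pi / 2) / nv
flux = 0.0
for i in range(nu):
    for j in range(nv):
        u, v = (i + 0.5) * du, (j + 0.5) * dv
        normal = cross(r_u(u, v), r_v(u, v))   # equals -sin(v) * r(u, v)
        flux += sum(f * c for f, c in zip(F(*r(u, v)), normal)) * du * dv

print(flux, -4 * math.pi)  # flux for this (inward) parametrization
```

Flipping the sign gives the outward flux 4π.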
32.4. Problem: What is the value of $\int_C F \cdot dr$ if $F = [\sin(\sin(x)) + z^2, e^y + x^3 + y^2, \sin(y^2) + z^2]$
and C is the unit polygon (0, 0, 0) → (1, 0, 0) → (1, 1, 0) → (0, 1, 0) →
(0, 0, 0)? Solution: use Stokes theorem. The curl of F is $[2y\cos(y^2), 2z, 3x^2]$. The
surface S : r(u, v) = [u, v, 0] with 0 ≤ u ≤ 1 and 0 ≤ v ≤ 1 has C as boundary. Stokes
allows to compute $\iint_S \mathrm{curl}(F) \cdot dS$ instead. Since $r_u \times r_v = [0, 0, 1]$, the flux integral
is $\int_0^1 \int_0^1 3u^2\, dv\,du = 1$. The computation of the line integral would have been more
painful.
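The line integral that Stokes lets us avoid is painful by hand but easy for a machine, which gives an independent check of the value 1. The following is a sketch assuming a midpoint rule along each of the four edges; the number of steps per edge is an arbitrary choice.

```python
import math

def F(x, y, z):
    return (math.sin(math.sin(x)) + z**2,
            math.exp(y) + x**3 + y**2,
            math.sin(y**2) + z**2)

# Corners of the unit polygon, traversed in the stated order.
corners = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0)]

n = 2000  # midpoint steps per edge
total = 0.0
for a, b in zip(corners, corners[1:]):
    step = [(q - p) / n for p, q in zip(a, b)]          # r'(t) dt on this edge
    for k in range(n):
        mid = [p + (k + 0.5) * s for p, s in zip(a, step)]
        total += sum(f * s for f, s in zip(F(*mid), step))

print(total)  # Stokes predicts the value 1
```

The contributions of sin(sin(x)) on the two horizontal edges cancel, as do those of $e^y + y^2$ on the two vertical edges, leaving only the $x^3$ term on the edge x = 1.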
32.5. Problem: Compute the flux of the curl of $F(x, y, z) = [0, 1, 8z^2]^T$ through the
upper half sphere S oriented outwards. Solution: Great, it is here, where we can use
Stokes theorem $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$, where C is the boundary curve which
can be parametrized by $r(t) = [\cos(t), \sin(t), 0]^T$ with 0 ≤ t ≤ 2π. Before diving into
the computation of the line integral, it is good to check whether the vector field is
a gradient field. Indeed, we see that curl(F) = [0, 0, 0]. This means that $F = \nabla f$
for some potential f, implying by the fundamental theorem of line integrals that
$\int_C F \cdot dr = 0$. But wait a minute, if the curl of F is zero, couldn’t we just have seen
directly that the flux of the curl through the surface is zero? Yes, we could have seen
that before: for a gradient field, the flux of the curl of F through a surface is always
zero, for the simple reason that the curl of such a field is zero.
32.6. Problem. What is the flux of the curl of $F(x, y, z) = [\sin(xyz), z e^{\cos(x+y)}, z x^5 + z^{22}]$
through the lower ellipsoid S given by $x^2/4 + y^2/9 + z^2/16 = 1$, z < 0? Solution:
by Stokes theorem, it is the line integral $\int_C F \cdot dr$ along the boundary r(t) =
[2 cos(t), 3 sin(t), 0]. But in the xy-plane z = 0, the field F is zero. The result is zero.
32.7. Problem: What is the flux of the curl of F through an ellipsoid $x^2/4 + y^2/9 + z^2/16 = 1$?
Solution: We can cut the ellipsoid into two parts to get two surfaces with
boundary. The upper part $S_+ = \{(x, y, z) \in S, z > 0\}$ has the boundary $C_+$ : r(t) =
[2 cos(t), 3 sin(t), 0] which matches the orientation of the surface. Stokes theorem tells
that $\iint_{S_+} \mathrm{curl}(F) \cdot dS = \int_{C_+} F \cdot dr$. The lower part $S_- = \{(x, y, z) \in S, z < 0\}$ has
the boundary $C_-$ : r(t) = [2 cos(t), −3 sin(t), 0] which matches the orientation of the
lower part. Stokes theorem tells that $\iint_{S_-} \mathrm{curl}(F) \cdot dS = \int_{C_-} F \cdot dr$. Together we have
$\int_{C_-} F \cdot dr + \int_{C_+} F \cdot dr = 0$ as the line integrals have just different signs. The result is
zero.
Remarks
32.8. The left hand side of the important formula (it “imports” the curl) is defined
only in three dimensions. But the right hand side also makes sense in $\mathbb{R}^n$. It is
$\mathrm{tr}((dF)^* dr)$, where * rotates the 2-frame by 90 degrees. The Stokes theorem for 2-surfaces
works for $\mathbb{R}^n$ if n ≥ 2. For n = 2, we have with x(u, v) = u, y(u, v) = v
the identity $\mathrm{tr}((dF)^* dr) = Q_x - P_y$ which is Green’s theorem. Stokes has the general
structure $\int_G \delta F = \int_{\delta G} F$, where δF is a derivative of F and δG is the boundary of G.
32.9. Why are we interested in $\mathbb{R}^n$ and not only in $\mathbb{R}^3$? One example is that 2-dimensional
surfaces appear as “paths” which a moving string in 11 dimensions traces.
More important maybe is that statisticians work by definition in high dimensional
spaces. When dealing with n data points, one works in Rn . Why would you care
about theorems like Stokes in statistics? As a matter of fact, integral theorems in
general allow to simplify computations. As we have seen in Green’s theorem, when
computing the sum over all the curls, there are cancellations happening in the inside.
Integral theorems “see these cancellations” and allow to bypass and ignore stuff
which does not matter.
32.10. The fundamental theorem of line integrals $\int_a^b \mathrm{tr}(df(r(t))\, dr(t))\, dt = f(r(b)) - f(r(a))$
holds also in $\mathbb{R}^n$. The flux integral
$$\iint_G \mathrm{tr}(F^*(r(u, v))\, dr(u, v))\, du\,dv$$
is the analogue of a line integral in two dimensions. Written like this, we don’t need
the cross product. And not yet the language of differential forms.
32.11. Stokes deals with “fields” and “space”. What happens if the field is space
itself, that is if $F^* = dr$? It is of interest. For m = 1, and $F = dr^T$, then $\int_a^b |dr|^2\, dt$ is
the action integral in physics. A general Maupertuis principle assures that it is
equivalent to the arc length $\int_a^b |dr|\, dt$ in the sense that minimizing arc length between
two points is equivalent to minimizing the action integral (which is more like the energy
one uses to get from the first point to the second). Now, in two dimensions we have
$\iint_G \mathrm{tr}(dr^T dr)\, du\,dv$. We can compare this with $\iint_G \det(dr^T dr)\, du\,dv$ which is called
the Nambu-Goto action, which resembles the surface area $\iint_G \sqrt{\det(dr^T dr)}\, du\,dv$
also called the Polyakov action. Nature likes to minimize. Free particles move on
shortest paths, minimize the arc length. Maupertuis tells that minimizing the length
$\int_A^B |r'(t)|\, dt$ of a path is equivalent to minimizing $\int_A^B r'(t) \cdot r'(t)\, dt$ which essentially is
the integrated kinetic energy or gasoline use to go from A to B. For the purpose of
minimizing stuff this also works for two dimensional actions. Minimizing the surface
area $\iint_G |r_u \times r_v|\, du\,dv$ among all surfaces connecting two one dimensional curves is
equivalent to minimizing $\iint_G |r_u \times r_v|^2\, du\,dv$. Also in higher dimensions, Nambu-Goto
and Polyakov are equivalent.
Homework
Problem 32.1: Use Stokes to find $\int_C F \cdot dr$, where $F(x, y, z) = [12x^2 y, 4x^3, 12xy + e^{(e^z)}]$
and C is the curve of intersection of the hyperbolic
paraboloid $z = y^2 - x^2$ and the cylinder $x^2 + y^2 = 1$, oriented
counterclockwise as viewed from above.
Problem 32.2: Evaluate the flux integral $\iint_S \mathrm{curl}(F) \cdot dS$, where
y2 x2 +z 2 +z 2 +z
F (x, y, z) = [xe z 3 + 2xyze , x + z 2 ex , yex + zex ]T
and where S is the part of the ellipsoid $x^2 + y^2/4 + (z + 1)^2 = 2$, z > 0
oriented so that the normal vector points upwards.
Problem 32.3: Find the line integral $\int_C F \cdot dr$, where C is the circle of
radius 3 in the xz-plane oriented counter clockwise when looking from the
point (0, 1, 0) onto the plane and where F is the vector field
$F(x, y, z) = [4x^2 z + x^5, \cos(e^y), -4xz^2 + \sin(\sin(z))]^T$.
Use a convenient surface S which has C as a boundary.
Problem 32.4: Find the flux integral $\iint_S \mathrm{curl}(F) \cdot dS$, where F(x, y, z) =
$[2\cos(\pi y) e^{2x} + z^2,\; x^2 \cos(z\pi/2) - \pi \sin(\pi y) e^{2x},\; 2xz]^T$
and S is the surface parametrized by
$r(s, t) = [(1 - s^{1/3})\cos(t) - 4s^2,\; (1 - s^{1/3})\sin(t),\; 5s]^T$
with 0 ≤ t ≤ 2π, 0 ≤ s ≤ 1 and oriented so that the normal vectors point
to the outside of the thorn.
MATH 22A
Seminar
33.1. In this seminar, we replace the space Rn with a finite graph G = (V, E), where
V is a set of vertices called nodes and E is a set of edges called connections. A scalar
field is a function f which assigns to every vertex x a function value f (x). We assume
the vertices to be ordered leading to an order of the edges: draw an arrow a → b if
a < b. This a priori order has no effect on any of the theorems. A vector field assigns
to every edge e a number F(e). A curve is a list of nodes $x_1, x_2, \dots, x_n$ such that $x_1$
is connected to $x_2$, $x_2$ is connected to $x_3$, etc. The gradient $\nabla f$ of a scalar function
f is the vector field F(a, b) = f(b) − f(a). The line integral $\int_C F \cdot dr$ is defined as
$\sum_{e \in C} F(e)\, de$. We just add up the function values of F along the curve C, positive
de = 1 if we go with the arrow, negative de = −1 if we go against the arrow.
[Figure: example graph with numerical values.]
Theorem: If $F = \nabla f$ is a gradient field and C is a curve from a to b,
then $\int_C \nabla f \cdot dr = f(b) - f(a)$.
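The discrete definitions translate directly into a few lines of code. The sketch below uses a small hypothetical graph (four nodes, four edges, made-up values of f) that is not taken from the figures; the dictionary layout and the name `line_integral` are illustrative choices.

```python
# Hypothetical scalar field on the nodes 1..4 of a small graph.
f = {1: 3, 2: -1, 3: 4, 4: 0}

# Edges (a, b) with a < b, so each arrow points from a to b.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]

# Gradient: the vector field F(a, b) = f(b) - f(a) on the edges.
grad = {(a, b): f[b] - f[a] for a, b in edges}

def line_integral(F, path):
    """Sum F(e) de along the path: de = +1 with the arrow, de = -1 against it."""
    total = 0
    for x, y in zip(path, path[1:]):
        if (x, y) in F:
            total += F[(x, y)]
        elif (y, x) in F:
            total -= F[(y, x)]
        else:
            raise ValueError(f"{x} and {y} are not connected")
    return total

path = [1, 3, 2, 1, 2, 3, 4]             # a wandering curve from node 1 to node 4
print(line_integral(grad, path), f[4] - f[1])
print(line_integral(grad, [1, 2, 3, 1]))  # closed loop around the triangle
```

The sum telescopes, which is the whole proof of the discrete theorem: every intermediate node value is added once and subtracted once.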
33.3. Let’s look at some terminology. Given a vertex x in a graph G, the unit sphere
S(x) of x is the sub-graph generated by the set of vertices directly attached to x. The
unit sphere of the vertex labeled 11 in Figure 2 for example is the circular graph
generated by the vertices {2, 4, 9, 8, 7, 9}. It is a “circle”. The unit sphere of the vertex
with label 4 in that figure is the graph generated by the vertices {11, 7, 1}. It is a
linear graph, a half circle.
33.5. The curl of a vector field F is a function on the triangles T of G. To get the value
of the triangle (a, b, c) we form the line integral of F along the curve C : a → b → c → a.
Each triangle is assumed to be oriented (if drawn in the plane, then counter clockwise).
33.6. Given a function F on the triangles of a region G which is oriented, the flux
integral $\iint_G F(x)\, dA$ is defined as $\sum_{t \in T} F(t)$, where T is the set of triangles in G.
[Figures: example graphs with numerical labels.]
Homework
33.1 Check that the curl of a gradient field is zero: curl(grad(f )) = 0 for
every triangle.
33.2 Figure 4 shows a tree, a graph without closed loops. Find a potential
function f . You can assume that the value at the top node is 0. You see
then that the function value right below is 1. Get all the function values
of the potential.
[Figure 4: the tree with its numerical labels.]
33.3 Find a vector field on a circular graph with 5 vertices which is not
a gradient field.
33.5 Construct your own 2-dimensional discrete region and define a vector
field on it, then check the Green theorem by computing the sum of the
curls and the line integral along the boundary.
MATH 22A
Topology
34.1. A region E in $\mathbb{R}^n$ is called simply connected if it is connected and for every
closed loop C in E there is a continuous deformation $C_s$ of C within E such that
$C_0 = C$ and $C_1(t) = P$ is a point. For example, C(t) = [cos(t), sin(t), 0] can be
deformed in $E = \mathbb{R}^3$ to a point with $C_s(t) = [(1 - s)\cos(t), (1 - s)\sin(t), 0]$ as
$C_1(t) = P = [0, 0, 0]$ for all t. Each Euclidean space $\mathbb{R}^n$ is simply connected. The region
$G = \{x^2 + y^2 > 0\} \subset \mathbb{R}^3$ is not simply connected as the circle C : r(t) = [cos(t), sin(t), 0]
winding around the z-axis cannot be pulled together to a point within G. The region
$G = \{x^2 + y^2 + z^2 > 0\} \subset \mathbb{R}^3$ is simply connected, but $G = \{x^2 + y^2 > 0\}$ in $\mathbb{R}^2$ is not.
Remember that F was called irrotational if curl(F ) = 0 everywhere.
Theorem: If F is irrotational on a simply connected E then F = ∇f in E.
Electromagnetism
34.5. The Maxwell-Faraday equation in electromagnetism relates the electric
field E and the magnetic field B with the partial differential equation
$\mathrm{curl}(E) = -\frac{d}{dt} B$. Given a surface S, the flux integral $\iint_S B \cdot dS$ is called the magnetic flux
34.6. Changing the magnetic flux can happen in various ways. We can generate a
changing magnetic field by using alternating current. This is how transformers
work. Another way to change the flux is to rotate a wire in a fixed magnetic field.
This is the principle of the dynamo:
34.7. The vector field $A(x, y, z) = \frac{[-y, x, 0]}{(x^2 + y^2 + z^2)^{3/2}}$ is called the vector potential of a
magnetic field B = curl(A). The picture shows some flow lines of this magnetic dipole
field B. Problem: Find the flux of B through the lower half sphere $x^2 + y^2 + z^2 = 1$,
z ≤ 0 oriented downwards. Solution: Since we have an integral of the curl of the
vector field A, we use Stokes theorem and integrate A(r(t)) along the boundary
curve r(t) = [cos(t), −sin(t), 0]. First of all, we have A(r(t)) = [sin(t), cos(t), 0]. The
velocity is r'(t) = [−sin(t), −cos(t), 0]. The integral is $\int_0^{2\pi} -1\, dt = -2\pi$.
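The boundary integral is short enough to check numerically. This sketch assumes a midpoint rule with 1000 steps; differentiating r(t) = [cos(t), −sin(t), 0] gives the velocity [−sin(t), −cos(t), 0], so the integrand A(r(t)) · r'(t) is constantly −1.

```python
import math

def A(x, y, z):
    """Vector potential [-y, x, 0] / (x^2 + y^2 + z^2)^(3/2) of the dipole field."""
    s = (x * x + y * y + z * z) ** 1.5
    return (-y / s, x / s, 0.0)

def r(t):
    return (math.cos(t), -math.sin(t), 0.0)

def r_prime(t):
    return (-math.sin(t), -math.cos(t), 0.0)

n = 1000
h = 2 * math.pi / n
flux = 0.0
for k in range(n):
    t = (k + 0.5) * h
    flux += sum(a * v for a, v in zip(A(*r(t)), r_prime(t))) * h

print(flux, -2 * math.pi)
```

By Stokes theorem this line integral equals the flux of B = curl(A) through the lower half sphere with the downward orientation.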
34.8. Here are all the four magical Maxwell equations for the electric field E and
magnetic field B related to the charge density σ and the electric current j. The
constant c is the speed of light. (By using suitable coordinates, one can assume c = 1.)
div(E) = 4πσ, div(B) = 0, c · curl(E) = −Bt , c · curl(B) = Et + 4πj .
Fluid dynamics
34.9. If F is the fluid velocity field and C is a closed curve, then $\int_C F \cdot dr$ is called
the circulation of F along C. The curl of F is called the vorticity of F. A vortex
line is a flow line of curl(F). Given a curve C, we can let any point in C flow along
the vorticity field. This produces a vortex tube S. The flux of the vorticity through
a surface S is the vortex strength of F through S. Stokes theorem implies the
Helmholtz theorem.
Theorem: If $C_s$ flows along F, then $\int_{C_s} F \cdot dr$ stays constant.
34.10. Proof: Let C be a closed curve and $C_s(t)$ be the curve after letting it flow using
a deformation parameter s. The deformation produces a tube surface $S = \bigcup_{s=0}^{t} C_s$
which has the boundary C and $C_t$. Since the curl of F is always tangent to the
surface S, the flux of the curl of F through S is zero. Stokes theorem implies that
$\int_C F \cdot dr - \int_{C_t} F \cdot dr = 0$. The negative sign is because the orientation of $C_t$ is different
from the orientation of C if the surface has to be to the left.
Complex analysis
34.11. An application of Green’s theorem is obtained when integrating in the complex
plane $\mathbb{C}$. Given a function f(z) = u(z) + i v(z) from $\mathbb{C} \to \mathbb{C}$ and a closed path C
parametrized by r(t) = x(t) + i y(t) in $\mathbb{C}$, define the complex integral
$\int_a^b (u(x(t) + i y(t)) + i v(x(t) + i y(t)))(x'(t) + i y'(t))\, dt$. This is
$\int_a^b u(r(t)) x'(t) - v(r(t)) y'(t)\, dt + i \int_a^b v(r(t)) x'(t) + u(r(t)) y'(t)\, dt$.
These are two line integrals. The real part is the line integral of F = [u, −v],
the imaginary part is the line integral of F = [v, u]. Assume C bounds a region G, then Green’s
theorem tells that the first integral is $\iint_G -v_x - u_y\, dx\,dy$ and the second integral is
$\iint_G u_x - v_y\, dx\,dy$. It turns out now that for nice functions f like polynomials, the
Cauchy-Riemann differential equations $u_x = v_y$, $v_x = -u_y$ hold so that these line
integrals are zero. We have therefore
Theorem: If f is a polynomial and C a closed loop, then $\int_C f(z)\, dz = 0$.
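The theorem is easy to test with Python's built-in complex numbers. The sketch below assumes a hypothetical polynomial and a hypothetical closed loop, both chosen only for illustration, and evaluates $\int_a^b f(r(t))\, r'(t)\, dt$ with the midpoint rule and the exact derivative of the loop.

```python
import cmath
import math

def poly(z):
    """A sample polynomial (hypothetical coefficients)."""
    return 3 * z**2 - (2 + 1j) * z + 5

def loop(t):
    """A closed, non-circular loop in the complex plane, 0 <= t <= 2*pi."""
    return 2 * cmath.exp(1j * t) + 0.5 * cmath.exp(-3j * t)

def loop_prime(t):
    """Derivative of the loop with respect to t."""
    return 2j * cmath.exp(1j * t) - 1.5j * cmath.exp(-3j * t)

n = 4000
h = 2 * math.pi / n
integral = 0j
for k in range(n):
    t = (k + 0.5) * h
    integral += poly(loop(t)) * loop_prime(t) * h

print(abs(integral))  # should vanish up to rounding errors
```

Real and imaginary parts of the sum are exactly the two real line integrals of [u, −v] and [v, u] described above.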
Problem 34.2:
a) Define $\mathrm{div}([P, Q, R]) = P_x + Q_y + R_z$. Check that div(curl(F)) = 0.
b) Is div(grad(f )) = 0 for all functions?
c) Is curl(curl(F )) = [0, 0, 0] for all fields?
d) Which of the regions in Figure 4 are simply connected?
e) Which of the capital letters A − Z are not simply connected?
Problem 34.5: a) Can you find a vector field F with $\mathrm{curl}(F) = [0, x^2, 0]$?
b) Can you find a vector field F with $\mathrm{curl}(F) = [0, 0, x^2]$?
c) Can you find a vector field F = [P, Q, R] such that $\mathrm{div}(F) = x^2$?
d) Can you find a gradient field $F = \nabla f$ such that $\mathrm{div}(F) = x^2$?
e) Given a function g(x, y, z), find F such that div(F ) = g.
MATH 22A
Lecture
35.1. The divergence of a vector field F = [P, Q, R] in R^3 is defined as div(F ) =
∇·F = P_x + Q_y + R_z. Let G be a solid in R^3 bounded by a surface S made of finitely many
smooth surfaces, oriented so the normal vector to S points outwards. The divergence
theorem or Gauss theorem is
Theorem: $\iiint_G \mathrm{div}(F) \, dV = \iint_S F \cdot dS$.
35.3. The theorem gives meaning to the term divergence. The total divergence over a
small region is equal to the flux of the field through the boundary. If this is positive,
then more field leaves than enters and field is “generated” inside. The divergence
measures the expansion of the field. The field F (x, y, z) = [x, 0, 0] for example expands,
while F (x, y, z) = [−x, 0, 0] compresses. F (x, y, z) = [y, z, x] is “incompressible”.
integral for F is the flux integral for F ⊥ . The two dimensional divergence theorem is
Green’s theorem “turned”.
Examples
35.6. Problem: Compute the flux of F = [x, y, z] through the sphere of radius ρ
bounding a ball G, oriented outwards. Solution: As div(F ) = 3 we have
$\iiint_G \mathrm{div}(F) \, dV = 3 \mathrm{Vol}(G) = 3 \cdot 4\pi\rho^3/3 = 4\pi\rho^3$. The flux through
the boundary is $\iint_S F \cdot dS$. As in spherical coordinates,
F (r(φ, θ)) · (r_φ × r_θ) = ρ^3 sin(φ), the flux is
$\int_0^{2\pi} \int_0^{\pi} \rho^3 \sin(\phi) \, d\phi d\theta = 4\pi\rho^3$ also.
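Both sides of this example can be compared numerically. The sketch below assumes the standard spherical parametrization r(φ, θ) = [ρ sin(φ) cos(θ), ρ sin(φ) sin(θ), ρ cos(φ)] and a midpoint rule; the helper names `cross`, `dot` and `flux_through_sphere` are illustrative and not from the lecture.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def flux_through_sphere(rho, n=200):
    """Midpoint-rule flux of F = [x, y, z] through the sphere of radius rho."""
    dphi, dtheta = math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            r = [rho * math.sin(phi) * math.cos(theta),
                 rho * math.sin(phi) * math.sin(theta),
                 rho * math.cos(phi)]
            r_phi = [rho * math.cos(phi) * math.cos(theta),
                     rho * math.cos(phi) * math.sin(theta),
                     -rho * math.sin(phi)]
            r_theta = [-rho * math.sin(phi) * math.sin(theta),
                       rho * math.sin(phi) * math.cos(theta),
                       0.0]
            F = r  # the field F = [x, y, z] evaluated on the surface
            total += dot(F, cross(r_phi, r_theta)) * dphi * dtheta
    return total

rho = 2.0
flux = flux_through_sphere(rho)
print(flux, 4 * math.pi * rho**3)  # the two values agree up to discretization error
```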
35.7. Problem: What is the flux of the vector field F (x, y, z) = [6x + y^3, 3z^2 +
8y, 22z + sin(x)] through the boundary of the solid G = [0, 3] × [0, 3] × [0, 3] \
([0, 3] × [1, 2] × [1, 2] ∪ [1, 2] × [0, 3] × [1, 2] ∪ [1, 2] × [1, 2] × [0, 3]), which is
a cube with three perpendicular cubic holes, the first stage of the Menger sponge
construction? Solution: As div(F ) = 6 + 8 + 22 = 36, the result is 36 times the volume
of the solid which is 36(27 − 7) = 720.
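The volume count 27 − 7 = 20 can be reproduced by enumerating the 27 unit cells of the cube. This is a small sketch under the assumption that the three holes are the tunnels through the centers of opposite faces, so that a cell is removed exactly when at least two of its three indices are the middle index 1; the name `in_solid` is illustrative.

```python
def in_solid(i, j, k):
    """True if the unit cell [i,i+1] x [j,j+1] x [k,k+1] belongs to the solid."""
    # A cell lies in one of the three tunnels exactly when at least two of
    # its indices are equal to the middle index 1.
    middle = (i == 1) + (j == 1) + (k == 1)
    return middle < 2

cells = sum(in_solid(i, j, k)
            for i in range(3) for j in range(3) for k in range(3))
divergence = 6 + 8 + 22          # div(F) is constant
flux = divergence * cells        # divergence theorem: flux = div(F) * volume
print(cells, flux)               # 20 720
```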
Figure 2. The gravity inside the moon is such that an elevator crossing
the moon oscillates like a harmonic oscillator. The flux of F = [0, 0, z]
through a surface is the volume inside.
35.8. Problem. What does the gravitational field look like inside the moon at
distance ρ from the origin? Solution. A direct computation of summing up all the field
values $F(x) = \iiint_G (x - y)/|x - y|^3 \, dy$ is difficult as we can not compute in spherical
coordinates. Fortunately we have the divergence theorem. The field F (x) has constant
length F (ρ) = |F (x)| for x on a sphere S(ρ) of radius ρ and points inwards. So
$\iint_{S(\rho)} F \cdot dS = -4\pi\rho^2 F(\rho)$. Gauss was able to write down the gravitational
field as a partial differential equation div(F (x)) = 4πσ(x), where σ(x) is the mass
density of the solid. We see then with the divergence theorem that
$\iiint_{B(\rho)} 4\pi\sigma(x) \, dx$ is equal to $-4\pi\rho^2 F(\rho)$. Assuming σ to be constant, we
have $4\pi(4\pi\rho^3/3)\sigma = -4\pi\rho^2 F(\rho)$ which gives, up to the sign convention,
F (ρ) = (4πσ/3)ρ. The field grows linearly inside the body. If ρ is bigger than the
radius of the moon, then $\iiint_{B(\rho)} 4\pi\sigma(x) \, dx$ is 4πM, where
$M = \iiint_G \sigma(x) \, dx$ is the mass of the moon. We see that in that case
F (ρ) = M/ρ^2, which is the Newton law.
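The two regimes can be put into one function. The following sketch assumes a homogeneous ball, units in which the gravitational constant is 1, and the flux balance 4π · (enclosed mass) = 4πρ^2 F(ρ) for the field strength; the function name and parameters are illustrative, not from the lecture.

```python
import math

def field_strength(rho, radius, sigma):
    """Strength of the gravitational field at distance rho from the center
    of a homogeneous ball of the given radius and constant density sigma."""
    if rho <= radius:
        # inside: enclosed mass is sigma * 4 pi rho^3 / 3, field grows linearly
        return 4 * math.pi * sigma * rho / 3
    # outside: the full mass M acts as if concentrated at the center
    mass = sigma * 4 * math.pi * radius**3 / 3
    return mass / rho**2

inside = field_strength(1.0, 2.0, 0.5)
surface = field_strength(2.0, 2.0, 0.5)
outside = field_strength(4.0, 2.0, 0.5)
print(inside, surface, outside)
```

The field is continuous at the surface, doubles when the distance doubles inside the body, and drops by a factor 4 when the distance doubles outside.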
35.9. Problem: Compute using the divergence theorem the flux of the vector field
F (x, y, z) = [2342434y, 2xy, 4yz + 21341324xy]^T through the unit cube [0, 1] × [0, 1] ×
[0, 1] which is opened on the top. Solution: the divergence of F is 2x+4y. Integrating
this over the unit cube gives 1 + 2 = 3. The flux through all 6 faces is 3. On the face
z = 1, the third component of F is 4y + 21341324xy, so the flux through this face is
$\int_0^1 \int_0^1 4y + 21341324xy \, dxdy = 2 + 5335331 = 5335333$. We have to subtract
this and get 3 − 5335333 = −5335330.
35.10. Similarly as Green’s theorem allowed area computation using line integrals, the
volume of a region can be computed as a flux integral: take a vector field F with
constant divergence 1 like F (x, y, z) = [0, 0, z]. We have
$\iint_S [0, 0, z] \cdot dS = \mathrm{Vol}(G)$.
35.11. Example: For an ellipsoid x^2/a^2 + y^2/b^2 + z^2/c^2 = 1, where the parametrization is
r(φ, θ) = [a sin(φ) cos(θ), b sin(φ) sin(θ), c cos(φ)], the third component of r_φ × r_θ is
ab sin(φ) cos(φ), so that F · (r_φ × r_θ) = c cos(φ) · ab sin(φ) cos(φ) = abc sin(φ) cos^2(φ).
Integrating over 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π leads to 2πabc · 2/3 = 4πabc/3.
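A numerical version of this flux computation, as a minimal sketch: it integrates the product of the third component c cos(φ) of F = [0, 0, z] with the third component ab sin(φ) cos(φ) of r_φ × r_θ by a midpoint rule. The function name and the sample half-axes are illustrative assumptions.

```python
import math

def ellipsoid_volume_by_flux(a, b, c, n=400):
    """Volume of the ellipsoid as the flux of F = [0, 0, z] through its surface."""
    dphi = math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        z_value = c * math.cos(phi)                      # third component of F on the surface
        normal_z = a * b * math.sin(phi) * math.cos(phi)  # third component of r_phi x r_theta
        total += z_value * normal_z * dphi
    # the integrand does not depend on theta, so the theta integral gives 2 pi
    return 2 * math.pi * total

volume = ellipsoid_volume_by_flux(1.0, 2.0, 3.0)
print(volume, 4 * math.pi * 1.0 * 2.0 * 3.0 / 3)  # both close to 8 pi
```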
35.12. A computer can determine the volume of a solid enclosed by a triangulated
surface by computing the flux of the vector field F = [0, 0, z] through the surface.
The vector field has divergence 1 so that by the divergence theorem, the flux gives
the volume. A computer stores a geometric object using triangles. Assume ABC is one of
these triangles. If n = AB × AC points outside the region, then the flux through the
triangle is F · n/2, with F evaluated at the center of mass of the triangle. A
computer can now add up all these values and get the volume.
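This procedure can be sketched in a few lines. The code assumes that every triangle (A, B, C) is listed so that AB × AC points outwards and evaluates F = [0, 0, z] at the center of mass of the triangle, which is exact because F is linear. The function names and the test solid (a tetrahedron with three edges along the coordinate axes) are illustrative.

```python
def triangle_flux(A, B, C):
    """Flux of F = [0, 0, z] through the triangle ABC with normal n = AB x AC."""
    ab = [B[i] - A[i] for i in range(3)]
    ac = [C[i] - A[i] for i in range(3)]
    n_z = ab[0] * ac[1] - ab[1] * ac[0]      # third component of AB x AC
    z_center = (A[2] + B[2] + C[2]) / 3      # z at the center of mass
    return z_center * n_z / 2                # F . n / 2, only the z-part contributes

def enclosed_volume(triangles):
    """Sum the fluxes over all outward oriented triangles of a closed surface."""
    return sum(triangle_flux(A, B, C) for (A, B, C) in triangles)

# Tetrahedron with vertices O, P1, P2, P3; each face is oriented outwards.
O, P1, P2, P3 = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tetrahedron = [(P1, P2, P3), (O, P2, P1), (O, P1, P3), (O, P3, P2)]
volume = enclosed_volume(tetrahedron)
print(volume)  # 1/6 up to rounding
```

Only the face P1 P2 P3 contributes here: the bottom face has z = 0 and the two vertical faces have a normal without z-component.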
Homework
Problem 35.3: Find the flux of the vector field F (x, y, z) = [xy, yz, zx]^T
through the solid cylinder x^2 + y^2 ≤ 1, 0 ≤ z ≤ 2.
MATH 22A
Lecture
36.1. We know already E = Rn = M (n, 1), the space of column vectors and its dual
E ∗ = M (1, n), the space of row vectors. To get more general objects, it is important
to think about vectors as maps. A row vector is a linear map F : E → R defined by
F (u) = F u and a column vector defines a linear map F : E ∗ → R by F (u) = uF .
A map F (x_1, . . . , x_n) of several variables is called multi-linear, if it is linear in each
coordinate. The set T^p_q(E) of all multi-linear maps F : (E∗)^p × E^q → R is the space
of tensors of type (p, q). We have T^1_0(E) = E and T^0_1(E) = E∗ and T^1_1(E) is M (n, n),
the space of n × n matrices: given a matrix A, a column vector v ∈ E and a row vector
w ∈ E∗, we have the bi-linear map F (v, w) = wAv. It is linear in v and in w.
36.2. Let Λ^q(E) be the subspace of T^0_q(E) which consists of tensors F of type (0, q) such
that F (x_1, . . . , x_q) is anti-symmetric in x_1, . . . , x_q ∈ E: this means
F (x_σ(1), . . . , x_σ(q)) = (−1)^σ F (x_1, . . . , x_q) for all permutations σ of {1, . . . , q},
where (−1)^σ is the sign of the permutation σ. If the Binomial coefficient
B(n, q) = n!/(q!(n − q)!) counts the number of subsets with q elements i_1 < · · · < i_q of
{1, . . . , n} and E has dimension n, then Λ^q(E) has dimension B(n, q). A map
F : E → T^p_q(E) is called a (p, q)-tensor field. The set T^1_0(E) is the space of vector
fields. If g : R^m → R^n is a smooth map, then F = d^k g is a tensor field of type (0, k).
A k-form is a (0, k)-tensor field F with F (x) ∈ Λ^k(E). A 2-form in R^3 for example
attaches to x ∈ R^3 a bi-linear, anti-symmetric map F (x)(u, v) = −F (x)(v, u). One writes
P dydz + Qdxdz + Rdxdy where dydz(u, v) = u_2 v_3 − u_3 v_2, dxdz(u, v) = u_1 v_3 − u_3 v_1,
dxdy(u, v) = u_1 v_2 − u_2 v_1.
36.3. The exterior derivative d : Λ^p → Λ^{p+1} is defined for f ∈ Λ^0 as
df = f_{x_1} dx_1 + · · · + f_{x_n} dx_n and
$d(f \, dx_{i_1} \cdots dx_{i_p}) = \sum_i f_{x_i} \, dx_i \, dx_{i_1} \cdots dx_{i_p}$. For F = P dx + Qdy for
example, it is (P_x dx + P_y dy)dx + (Q_x dx + Q_y dy)dy = (Q_x − P_y)dxdy which is the
curl of F. If r : G ⊂ R^m → R^n is a parametrization, then S = r(G) is a m-surface
and δS = r(δG) is its boundary in R^n. If F ∈ Λ^p(R^n) is a p-form on R^n, then
r∗F (x)(u_1, . . . , u_p) = F (r(x))(dr(x)(u_1), dr(x)(u_2), . . . , dr(x)(u_p)) is a p-form in R^m
called the pull-back of r. Given a p-form F and a p-surface S = r(G), define the
integral $\int_S F = \int_G r^*F$. The general Stokes theorem is
Theorem: $\int_S dF = \int_{\delta S} F$ for a (m − 1)-form F and m surface S in E.
36.4. Proof. As in the proof of the divergence theorem, we can assume that the region
G is simultaneously of the form gj (x1 , . . . , x̂j , . . . xm ) ≤ xj ≤ hj (x1 , . . . , x̂j , . . . xm ),
where 1 ≤ j ≤ n and that F = [0, . . . , 0, Fj , 0, . . . , 0]. The coordinate independent
definition of dF reduces the result to the divergence theorem in G. QED
Examples
36.5. For n = 1, there are only 0-forms and 1-forms. Both are scalar functions. We
write f for a 0-form and F = f dx for a 1-form. The symbol dx abbreviates the linear
map dx(u) = u. The 1-form assigns to every point the linear map f (x)dx(u) = f (x)u.
The exterior derivative d : Λ^0 → Λ^1 is given by df (x)u = f'(x)u. Stokes theorem is the
fundamental theorem of calculus $\int_a^b f'(x) \, dx = f(b) - f(a)$.
36.6. For n = 2, there are 0-forms, 1-forms and 2-forms. It is custom to write
F = P dx+Qdy rather than F = [P, Q] which is thought of as a linear map F (x, y)(u) =
P (x, y)u1 + Q(x, y)u2 . A 2-form is also written as F = f dxdy or F = f dx ∧ dy.
Here dxdy means the bi-linear map dxdy(u, v) = (u_1 v_2 − u_2 v_1). The 2-form defines
such a bi-linear map at every point (x, y). The exterior derivative d : Λ^0 → Λ^1
is df (x, y)(u_1, u_2) = f_x(x, y)u_1 + f_y(x, y)u_2 which encodes the Jacobian df = [f_x, f_y],
a row vector. The exterior derivative of a 1-form F = P dx + Qdy is dF (x, y)(u, v) =
(−1)^1 P_y(x, y) det([u, v]) + (−1)^2 Q_x(x, y) det([u, v]) which is (Q_x − P_y)dxdy. Using
coordinates is convenient as dF = P_y dydx + Q_x dxdy = (Q_x − P_y)dxdy using now that
dydx = −dxdy.
36.7. For n = 3, we write F = P dx + Qdy + Rdz for a 1-form, and F = P dydz +
Qdzdx + Rdxdy for a 2-form. Here dydz = dy ∧ dz are symbols representing bi-linear
maps like dydz(u, v) = u2 v3 − u3 v2 . As a 2-form has 3 components, it can
be visualized as vector field. A 3-form f dxdydz defines a scalar function f . The
symbol dxdydz = dx ∧ dy ∧ dz represents the map dxdydz(u, v, w) = det([uvw]).
The exterior derivative of a 1-form gives the curl because d(P dx + Qdy + Rdz) =
Py dydx+Pz dzdx+Qx dxdy +Qz dzdy +Rx dxdz +Ry dydz which is (Ry −Qz )dydz +(Pz −
Rx )dzdx + (Qx − Py )dxdy. The exterior derivative of a 2-form P dydz + Qdzdx + Rdxdy
is Px dxdydz + Qy dydzdx + Rz dzdxdy = (Px + Qy + Rz )dxdydz. To integrate a 2-form
F = x^2 yz dxdy + yz dydz + xz dxdz over a surface r(u, v) = [x, y, z] = [uv, u − v, u + v]
with G = {u^2 + v^2 ≤ 1} we end up with integrating F (r(u, v)) · (r_u × r_v). In order to
integrate dF for a 1-form F = P dx + Qdy + Rdz we can also pull back F and get
$\iint_G F_v(r(u, v)) r_u - F_u(r(u, v)) r_v \, dudv$.
36.8. For n = 4, where we have 0-forms f , 1-forms F = P dx + Qdy + Rdz + Sdw and
2-forms F = F12 dxdy + F13 dxdz + F14 dxdw + F23 dydz + F24 dydw + F34 dzdw which are
objects with 6 components. Then 3-forms F = P dydzdw + Qdxdzdw + Rdxdydw +
Sdxdydz and finally 4-forms f dxdydzdw.
Remarks
36.9. Historically, differential forms emerged in 1922 with Élie Cartan. Most textbooks
introduce the Grassmanian algebra early and use the language of “chains” for example
which is the language used in algebraic topology. It was Jean Dieudonné in 1972 who
freed the general Stokes theorem from chains and used first the coordinate free pull
back idea. This allowed us in this lecture to formulate the general Stokes theorem from
scratch on a single page with all definitions.
B) One can understand differential forms better using arithmetic, the Grassmanian
algebra. This is done with the help of the tensor product, which induces an exte-
rior product F ∧ G on Λp × Λq → Λp+q . This product generalizes the cross product
Λ1 × Λ1 → Λ2 which works for n = 3 as there, the space of 1-forms Λ1 and 2-forms Λ2
can be identified. The exterior algebra structure helps to understand k-forms. We can
for example see a 2-form as an exterior product F ∧ G of two 1-forms. We can think of
a 2-form for example as attaching two vectors at a point and identify two such frames
if their orientation and parallelogram areas match.
C) A third way comes through physics. We are familiar with manifestations of
electromagnetism: we see light, we use magnets to attach papers to the fridge or have
magnetic forces keep the laptop lid closed. Electric fields are felt when combing the
hair, as we see sparks generated by the high electric field obtained by stripping away
the electrons from the head. We use magnetic fields to store information on hard
drives and electric fields to store information on a SSD harddrive. Non-visible electro-
magnetic fields are used when communicating using cell phones or connecting through
blue-tooth or wireless network connections. The electro-magnetic field E, B is actu-
ally a 2-form in 4-dimensions. The B(4, 2) = 6 components are (E1 , E2 , E3 , B1 , B2 , B3 ).
Applications
36.11. An electromagnetic field is determined by a 1-form A in 4-dimensional space
time. The electromagnetic field is F = dA. The Maxwell equations are dF = 0 (the
relation d ◦ d = 0 is seen in the homework). The second part of the Maxwell equations
are d∗ F = j, where d∗ : Λp → Λp−1 is the adjoint and j is a 1-form encoding both
the electric charge and the electric current. We can always gauge with a gradient
A + df so that d∗(A + df ) = 0. The Maxwell equations then reduce to the Poisson equation
LA = (dd∗ + d∗d)A = j, where L is the Laplacian on 1-forms. In vacuum, without
electric charges or currents, we have the wave equation LA = 0. And there was light.
Homework
Problem 36.2: Given the 1-form F = [xyz, xy, wx, wxy] = xyzdx +
xydy + wxdz + wxydw, find the curl dF. Now find $\iint_S dF$ over the 2-dimensional
surface S : x^2 + y^2 ≤ 1, z = 1, w = 1 which has as a boundary
the curve C : r(t) = [cos(t), sin(t), 1, 1]^T, 0 ≤ t ≤ 2π. You certainly can use
the Stokes theorem. If you like to compute both sides of the theorem you can see how the theorem
works. The 2-manifold S is parametrized by r(t, s) = [s, t, 1, 1]^T. The (r_s ∧ r_t)_ij has 6 components,
where only one component (r_s ∧ r_t)_12 is nonzero. This will match with the dF_12 = P dxdy part of the
6-component 2-form dF building the curl. We will have to integrate then over G = {s^2 + t^2 ≤ 1}.
Problem 36.3: Given the 2-form F = z^4 x dxdz + xyzw^2 dydw and the
3-sphere S : x^2 + y^2 + z^2 + w^2 = 1 oriented outwards. What is the integral
$\iiint_S dF$? To compute this 3D integral, you can use the general integral
theorem.
Problem 36.5:
a) Take f (x, y, z, w). Check that F = df satisfies dF = 0.
b) Take F = F1 dx + F2 dy + F3 dz + F4 dw. Compute the curl G = dF and
check that dG = 0.
c) Take the 2-form F = F12 dxdy + F13 dxdz + F14 dxdw + F23 dydz +
F24 dydw + F34 dzdw. Write down the 3-form G = dF and check dG = 0.
d) Take the 3-form F = F1 dydzdw + F2 dxdzdw + F3 dxdydw + F4 dxdydz
and compute the 4-form G = dF . Check that dG = 0.
MATH 22A
Seminar
37.1. A 0-form f on a graph G = (V, E) is a function on the vertices V . It is what
we call a scalar function. A 1-form is a function on the oriented edges E meaning
F (a, b) = −F (b, a). Informally, as in the continuum, we think of a 1-form as a vector
field. The gradient F (a, b) = df (a, b) = f (b) − f (a) of a 0-form f is a 1 form F .
The curl of a vector field F is a 2-form. It is a function on triangles (a, b, c) given by
dF (a, b, c) = F (a, b) + F (b, c) + F (c, a) which can be seen as the line integral along the
boundary of the triangle. When describing p-forms for p > 0, orientation matters. To
fix it, just enumerate the vertices V and then choose the orientation of an edge (a, b)
with a < b or the orientation of a triangle (a, b, c) if a < b < c. The discrete Stokes
theorem $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$ told us that the sum of the curls of F on
triangles of a surface S is equal to the line integral of F along the boundary C of S.
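The discrete gradient, curl and Stokes theorem fit into a few lines of code. The sketch below stores a 0-form as a dictionary on vertices and a 1-form as a dictionary on edges (a, b) with a < b, extended by F(b, a) = −F(a, b). All names and numerical values are hypothetical and serve only as illustration.

```python
def grad(f):
    """Gradient of a 0-form f (dict vertex -> value): the 1-form f(b) - f(a)."""
    return lambda a, b: f[b] - f[a]

def one_form(values):
    """1-form from values on edges (a, b) with a < b, extended anti-symmetrically."""
    def F(a, b):
        return values[(a, b)] if (a, b) in values else -values[(b, a)]
    return F

def curl(F):
    """Curl of a 1-form F: line integral along the boundary of the triangle."""
    return lambda a, b, c: F(a, b) + F(b, c) + F(c, a)

# The curl of a gradient is zero.
f = {1: 3.0, 2: -1.0, 3: 7.0}
curl_of_grad = curl(grad(f))(1, 2, 3)

# Discrete Stokes on a surface made of the triangles (1,2,3) and (1,3,4):
# the interior edge (1,3) is traversed twice in opposite directions.
F = one_form({(1, 2): 2, (2, 3): -1, (1, 3): 4, (3, 4): 5, (1, 4): 3})
curl_sum = curl(F)(1, 2, 3) + curl(F)(1, 3, 4)
boundary_sum = F(1, 2) + F(2, 3) + F(3, 4) + F(4, 1)
print(curl_of_grad, curl_sum, boundary_sum)  # 0.0 3 3
```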
37.2. A tetrahedral graph is a collection of 4 nodes which all are connected to each
other. A 3-form on a graph G is a function on tetrahedral sub-graphs x of G. An
example is the divergence dF (x) of a 2-form F which is defined as the sum of the
F (y) values of the triangles y ⊂ x enclosing the tetrahedron x. As in the continuum,
the orientation plays a role. Here is the discrete divergence theorem for a solid G
which is built from tetrahedra x and whose boundary surface S consists of triangles:
Problem A: Check that $\sum_{x \in G} \mathrm{div}(F)(x) = \sum_{y \in S} F(y)$.
Hint: prove by induction with respect to the number of tetrahedra. First check that
if G is a single tetrahedron, this is the definition of the divergence. Then see what
happens if a new tetrahedron is added.
37.3. We also have seen that the divergence of the curl of a vector field F is zero: We
had curl(F ) = [Ry − Qz , Pz − Rx , Qx − Py ] and taking the x derivative of Ry − Qz is
Ryx − Qzx , the y derivative of Pz − Rx is Pzy − Rxy and the z-derivative of Qx − Py is
Qxz − Pyz . Adding them all up gives 0. In the discrete it is even simpler. Start with
a 1-form F on the edges of a graph. Then form the curls, which are functions on the
triangles, then add up all these curls. You check:
37.4. The general Stokes theorem is not much different. A p-simplex in a graph
is a collection of p + 1 nodes which are all connected to each other. A p-form is a
function on the set of p-simplices x in G. The function value is fixed if the simplex is
given in an oriented way but is also defined if the simplex is oriented differently: we
just have F (x_0, . . . , x_p) = (−1)^σ F (σ(x_0), . . . , σ(x_p)) if σ is a permutation. For
example F (x_0, x_1, x_2) = F (x_1, x_2, x_0) = F (x_2, x_0, x_1) = −F (x_1, x_0, x_2) =
−F (x_0, x_2, x_1) = −F (x_2, x_1, x_0).
37.5. The exterior derivative of a p-form F is the (p + 1)-form
$dF(x_0, \dots, x_{p+1}) = \sum_{j=0}^{p+1} (-1)^j F(x_0, \dots, \hat{x}_j, \dots, x_{p+1})$.
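The formula for the exterior derivative translates directly into code. In the following sketch a p-form is given by its values on simplices with increasing vertex labels and extended to other orderings by the sign of the permutation. The helper names `form` and `d` and the edge values are hypothetical. The example confirms that for p = 1 the formula reproduces the curl on a triangle and that d applied twice gives zero on a tetrahedron.

```python
def form(values):
    """p-form from values on sorted simplices, extended anti-symmetrically."""
    def F(*x):
        # the sign of the permutation sorting x is (-1)^(number of inversions)
        inversions = sum(1 for i in range(len(x))
                         for j in range(i + 1, len(x)) if x[i] > x[j])
        sign = -1 if inversions % 2 else 1
        return sign * values[tuple(sorted(x))]
    return F

def d(F):
    """Exterior derivative: alternating sum over the faces obtained by
    deleting the j-th vertex."""
    def dF(*x):
        return sum((-1) ** j * F(*(x[:j] + x[j + 1:])) for j in range(len(x)))
    return dF

# A 1-form on the six edges of the tetrahedral graph with vertices 1, 2, 3, 4.
F = form({(1, 2): 2.0, (1, 3): -1.0, (1, 4): 4.0,
          (2, 3): 0.5, (2, 4): 3.0, (3, 4): -2.0})
dF = d(F)       # the curl, a 2-form on triangles
ddF = d(dF)     # a 3-form on the tetrahedron

print(dF(1, 2, 3), F(1, 2) + F(2, 3) + F(3, 1))  # 3.5 3.5
print(ddF(1, 2, 3, 4))                            # 0.0
```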
37.6. The general Stokes theorem tells that for a m-dimensional graph G with boundary
S and a (m − 1)-form F we have
Theorem: $\sum_{x \in G} dF(x) = \sum_{y \in S} F(y)$
Gravity
37.7. The Newton equations $\frac{d^2}{dt^2} x_k = -\sum_j G m_j / |x_k - x_j|^2$ with gravitational
constant G describe the motion of finitely many mass points with positions x_k(t) ∈ R^3
and mass m_k. These classical laws govern the motion of planets in our solar system,
stars in a galaxy or galaxies in a galaxy cluster. While relativity modifies this
Newtonian picture slightly and produces corrections which for example manifest in the
perihelion advance of Mercury, the Newtonian theory is amazingly accurate. Gauss
derived the gravitational inverse square force F from div(F ) = 4πσ, where σ is the
mass density. While divergence usually maps a 2-form to a 3-form, it is the adjoint
d∗ of the gradient d. In R3 it is equivalent. Now, L = div ◦ grad = d∗ d : Λ0 → Λ0 is
called the Kirchhoff Laplacian. The Gauss law of gravity therefore is the Poisson
equation LV = 4πσ , where V is the gravitational potential, a 0−form. Since d∗ = 0
on 0-forms, we can also write L = dd∗ + d∗ d. Classical gravity gets from a mass density
σ the gravitational potential V and so the gravitational field as a gradient F = dV :
(d∗ d + dd∗ )V = 4πσ defines the gravitational 1-form F = dV .
Electromagnetism
37.8. The Maxwell equations div(E) = 4πσ, div(B) = 0, curl(E) = −Bt , curl(B) =
Et +4πi become more elegant when written in four-dimensional space-time R4 . There
are then two equations only. The first is dF = 0 which is evident from F = dA and
d2 = 0. The second is d∗ F = 4πj, where j is the 4-current encoding both the
charge density σ as well as the electric current i. Now dF = 0 implies in a simply
connected region that F = dA, where A is an electro-magnetic potential. If d∗ A = 0
(which can always be achieved by adding a gradient to A) we get the Poisson equation
LA = (dd∗ + d∗ d)A = 4πj. This completely encodes the Maxwell equations; we can
look at it also in a discrete network. Classical electromagnetism in a world with charge
and current density j is the field F = dA, where A is obtained from
(d∗ d + dd∗ )A = 4πj defines the electromagnetic 2-form F = dA.
37.10. The rest will be up to you: it remains to include the Fermionic constituents of
matter (quarks (building mesons and baryons) as well as leptons) and bosons (photons,
gluons, vector bosons and the Higgs) as well as a few other details. Don’t worry, a
former student has solved a similar homework assignment in less than 7 days ...
Homework
MATH 22A
Lecture
38.1. Integral theorems deal with geometries G and fields F . Integration pairs
them up and gives the Stokes theorem
$\int_G dF = \int_{\delta G} F$
It involves the boundary δG of G and the exterior derivative dF of F . One can
classify the theorems by looking at the dimension n of space and the dimension m of
the object we are integrating over. In dimension n, there are n theorems:
n = 1:   1 --d/dx--> 1                              (FTC)
n = 2:   1 --grad--> 2 --curl--> 1                  (FTL, Green)
n = 3:   1 --grad--> 3 --curl--> 3 --div--> 1       (FTL, Stokes, Gauss)
38.2. The Fundamental theorem of line integrals is a theorem about the gradient
∇f . It tells that if C is a curve going from A to B and f is a function (that is a 0-form),
then
Theorem: $\int_C \nabla f \cdot dr = f(B) - f(A)$
38.4. Stokes theorem tells that if S is a surface with boundary C oriented to have
S to the left and F is a vector field, then
Theorem: $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$
In the general frame work, the field F = P dx + Qdy + Rdz is a 1-form and the 2-form
dF = (Px dx + Py dy + Pz dz)dx + (Qx dx + Qy dy + Qz dz)dy + (Rx dx + Ry dy + Rz dz)dz =
(Qx − Py )dxdy + (Ry − Qz )dydz + (Pz − Rx )dzdx is written as a column vector field
curl(F ) = [Ry − Qz , Pz − Rx , Qx − Py ]T . To understand the flux integral, we need to
see what a bilinear form like dxdy does on the pair of vectors ru , rv . In the case dxdy
we have dxdy(ru , rv ) = xu yv − yu xv which is the third component of the cross product
ru × rv with ru = [xu , yu , zu ]T . Integrating dF over S is the same as integrating the
dot product of curl(F ) · ru × rv . Stokes theorem implies that the flux of the curl of F
only depends on the boundary of S. In particular, the flux of the curl through a closed
surface is zero because the boundary is empty.
38.5. Gauss theorem: if the surface S bounds a solid E in space, is oriented out-
wards, and F is a vector field, then
Theorem: $\iiint_E \mathrm{div}(F) \, dV = \iint_S F \cdot dS$
Gauss theorem deals with a 2-form F = P dydz + Qdzdx + Rdxdy, but because a 2-
form has three components, we can write it as a vector field F = [P, Q, R]T . We have
computed dF = (Px dx + Py dy + Pz dz)dydz + (Qx dx + Qy dy + Qz dz)dzdx + (Rx dx +
Ry dy + Rz dz)dxdy, where only the terms Px dxdydz + Qy dydzdx + Rz dzdxdy = (Px +
Qy + Rz )dxdydz survive which we associate again with the scalar function div(F ) =
Px + Qy + Rz . The integral of a 3-form over a 3-solid is the usual triple integral. For a
divergence free vector field F , the flux through a closed surface is zero. Divergence-free
fields are also called incompressible or source free.
Remarks
38.6. We see why the 3 dimensional case looks confusing at first. We have three
theorems which look very different. This type of confusion is common in science: we
put things in the same bucket which actually are different: it is only in 3 dimensions
that 1-forms and 2-forms can be identified. Actually, more is mixed up: not only
are 1-forms and 2-forms identified, they are also written as vector fields which are
T01 tensor fields. From the tensor calculus point of view, we identify the three spaces
T01 (E) = E, T10 (E) = Λ1 (E) = E ∗ and Λ2 (E) ⊂ T20 . While we can still always identify
vector fields with 1-forms, this identification in a general non-flat space will depend
on the metric. In R4 , the 2-forms have dimension 6 and can no more be written as a
vector. One still does. The electro-magnetic F is a 2-form in R4 which we write as a
pair of two time-dependent vector fields, the electric field E and the magnetic field B.
38.7. Geometries and fields are remarkably similar. On geometries, the boundary
operation δ satisfies δ◦δ = 0. On fields the derivative operation d satisfies d◦d = 0.
“Geometries” as well as “fields” come with an orientation: ru × rv = −rv × ru ,
dxdy = −dydx. The operations d and δ look different because calculus deals with
smooth things like curves or surfaces leading to generalized functions. In quantum
calculus they are thickened up and d, δ defined without limit. Fields and geometries
then become indistinguishable elements in a Hilbert space. The exterior derivative d
has as an adjoint δ = d∗ which is the boundary operator. It is a kind of quantum field
theory as d generates while d∗ destroys a “particle”. d2 = δ 2 = 0 is a “Pauli exclusion”.
38.8. We can spin this further: a m-manifold S is the image of a parametrization
r : G ⊂ Rm → Rn . The Jacobian dr is a dual m-form, the exterior product of the m
vectors dru1 up to drum (think of m column vectors attached to r(u) ∈ S). If we take
a map s : S ⊂ Rn → Rm and look at F = ds, we can think of it as a m-form F (think
of m row vectors attached to each point x in Rn ). The map s defines m × n Jacobian
ds(x), while the Jacobian dr(u) is the n × m matrix. Cauchy-Binet shows that the
flux of F = ds through r(G) = S is the integral
$\int_G F = \int_G \det(ds(r(u)) dr(u)) \, du = \int_S \det(ds(x) dr(s(x)))$. If s(r(u)) = u, then
this is a geometric functional. So: geometries G can come from maps from a space A to a
space B, while fields F can come from maps from B to A. The action integral $\int_G F$
generalizes the Polyakov action $\int_G \det(dr^T dr) = \int_G |dr|^2$, a case where F and G
are dual meaning s(r(u)) = u.
Prototype examples
Problem: Find the line integral of the vector field F (x, y) = [x^4 + sin(x) +
y + 5xy, 4x + y^3] along the cardioid r(t) = (1 + sin(t))[cos(t), sin(t)], where
t runs from t = 0 to t = 2π.
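A numerical cross-check of this problem, as a sketch: Green's theorem turns the line integral into the double integral of Q_x − P_y = 4 − (1 + 5x) = 3 − 5x over the region enclosed by the cardioid. That region has area 3π/2 and is symmetric with respect to x → −x, so the expected value is 9π/2. The code below evaluates the line integral directly with a midpoint rule; the function names are illustrative.

```python
import math

def P(x, y):
    return x**4 + math.sin(x) + y + 5 * x * y

def Q(x, y):
    return 4 * x + y**3

def cardioid_line_integral(n=20000):
    """Midpoint-rule line integral of F = [P, Q] along r(t) = (1 + sin t)[cos t, sin t]."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        rad, drad = 1 + math.sin(t), math.cos(t)        # radius and its derivative
        x, y = rad * math.cos(t), rad * math.sin(t)
        dx = drad * math.cos(t) - rad * math.sin(t)     # x'(t)
        dy = drad * math.sin(t) + rad * math.cos(t)     # y'(t)
        total += (P(x, y) * dx + Q(x, y) * dy) * h
    return total

value = cardioid_line_integral()
print(value, 9 * math.pi / 2)  # the two numbers agree up to discretization error
```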
MATH 22A
Geometries
40.1. The four dimensional Euclidean space R4 = M (4, 1) is the space of column
vectors with four real components X = [x, y, z, w]T . If we think of such a vector as
a point, we also write X = (x, y, z, w). The dot product = inner product allows
as usual to define length |X| = √(X · X), the distance |X − Y | and the angles
cos(α) = (X · Y )/(|X||Y |) between vectors. The Cartesian coordinate system has
now four axes which are perpendicular to each other. Historically, as R4 is also the
space of quaternions, it is custom to label the coordinate directions as 1 = [1, 0, 0, 0], i =
[0, 1, 0, 0], j = [0, 0, 1, 0], k = [0, 0, 0, 1]. A vector [3, 4, 5, 1] for example is then written
also as 3 + 4i + 5j + k. We will however keep the vector-form. We will come back in
the last section of this document about why quaternions are natural.
40.2. The kernel of the 1 × 4 matrix A = [a, b, c, d] defines the linear hyperplane
ax+by +cz +dw = 0. It is a 3-dimensional linear space. An example is the coordinate
hyperplane x = 0, which consists of all points {(0, y, z, w) , y, z, w ∈ R}. More
generally, the solution space ax + by + cz + dw = e is an affine hyperplane. The
kernel of a 2 × 4 matrix is in general, as an intersection of two hyperplanes, a 2-
dimensional plane, which we just call a plane. The kernel of a 3 × 4 matrix A is in
general a line. Geometrically, it is the intersection of three hyperplanes.
40.3. A symmetric 4×4 matrix B, a row vector A ∈ M (1, 4) and a constant e define the
hyper quadric X · BX + AX = e. For a diagonal matrix B = Diag(a, b, c, d), this gives
the quadric ax^2 + by^2 + cz^2 + dw^2 = e. Examples are the 3-sphere x^2 + y^2 + z^2 + w^2 = 1,
the hyper paraboloid x^2 + y^2 + z^2 = w, the 3-cylinder x^2 + y^2 + z^2 = 1 which is the
product of a 2-sphere and a line. Or the cylinder-plane x^2 + y^2 = 1 which can be seen
as the product of the 1-sphere with a 2-plane. There are three types of hyperboloids like
x^2 + y^2 + z^2 − w^2 = 1, x^2 + y^2 − z^2 − w^2 = 1 or x^2 − y^2 − z^2 − w^2 = 1. One could call them
1-hyper-hyperboloids, 2-hyper-hyperboloids and 3-hyper-hyperboloids, using
the Morse index as a label. There is still the 1-hyperbolic-paraboloid x^2 + y^2 − z^2 = w
but there are more degenerate surfaces like x^2 − y^2 = w. The two-dimensional torus
T^2 can be realized here as a quadratic surface. It is the intersection of x^2 + y^2 =
1, z^2 + w^2 = 1. This is the flat torus. We can not realize the two-dimensional torus
in a flat way in our three dimensional space R^3. In hyper-space, we can. There is also
a three dimensional torus T3 . To get a parametrization, start with the 2-torus
parametrization r(φ, θ) = [(3 + cos(φ)) cos(θ), (3 + cos(φ)) sin(θ), sin(φ)] then expand
the circle to get a hyper-torus r(φ, θ, ψ) = [(3 + cos(φ)) cos(θ), (3 + cos(φ)) sin(θ),
(3 + sin(φ)) cos(ψ), (3 + sin(φ)) sin(ψ)]^T. You see that for every fixed ψ we have a
2-torus. We can compute 2|dr| = 18 + 6 cos(φ) + 6 sin(φ) + sin(2φ) which is always
positive and so verifies that the map from T3 to R4 is locally injective. We can also
easily check that if ψ or θ is fixed we get a translated scaled version of the 2-torus. If
φ is fixed, we get the flat 2-torus mentioned above.
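The volume element of the hyper-torus can be checked numerically. The sketch below assumes |dr| = sqrt(det(dr^T dr)), where dr is the 4 × 3 Jacobian of r(φ, θ, ψ); since the three partial derivatives are mutually perpendicular, this gives |dr| = (3 + cos(φ))(3 + sin(φ)), and twice this value is 18 + 6 cos(φ) + 6 sin(φ) + sin(2φ). The function name and the sample point are illustrative.

```python
import math

def volume_element(phi, theta, psi):
    """sqrt(det(dr^T dr)) for the hyper-torus parametrization r(phi, theta, psi)."""
    a, b = 3 + math.cos(phi), 3 + math.sin(phi)
    # the three columns of the 4 x 3 Jacobian dr
    r_phi = [-math.sin(phi) * math.cos(theta), -math.sin(phi) * math.sin(theta),
             math.cos(phi) * math.cos(psi), math.cos(phi) * math.sin(psi)]
    r_theta = [-a * math.sin(theta), a * math.cos(theta), 0.0, 0.0]
    r_psi = [0.0, 0.0, -b * math.sin(psi), b * math.cos(psi)]
    cols = [r_phi, r_theta, r_psi]
    # Gram matrix dr^T dr and its 3 x 3 determinant
    g = [[sum(x * y for x, y in zip(u, v)) for v in cols] for u in cols]
    det = (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
           - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
           + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))
    return math.sqrt(det)

phi, theta, psi = 0.7, 1.9, 4.2   # arbitrary sample point
lhs = 2 * volume_element(phi, theta, psi)
rhs = 18 + 6 * math.cos(phi) + 6 * math.sin(phi) + math.sin(2 * phi)
print(lhs, rhs)  # the two values agree up to rounding
```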
40.4. In single variable calculus, one looks at graphs {(x, y) | y = f (x)} of functions of
one variable. In multi-variable, one adds graphs {(x, y, z) | z = f (x, y)} of functions of
two variables. The graph of a function w = f (x, y, z) is now a 3-dimensional space.
Paraboloids like w = x^2 + y^2 + z^2 or w = x^2 + y^2 − z^2 are graphs. Another example is
the three dimensional bell hyper-surface w = f (x, y, z) = π^{−3/2} e^{−x^2−y^2−z^2}, where
the constant has been chosen so that the hyper-volume 0 ≤ w ≤ f (x, y, z) is equal
to 1. For obvious reasons, we usually do not draw the graph of a function of three
variables as we would have to draw in 4 dimensions. Now, in hyperspace, we can do
that.
40.5. Spaces can be parametrized in the same way as we parametrized curves or sur-
faces in three dimensions. A curve is defined by four real functions x(t), y(t), z(t), w(t)
of one variables and written as r(t) = [x(t), y(t), z(t), w(t)]T . A surface is parametrized
by r(u, v) = [x(u, v), y(u, v), z(u, v), w(u, v)]. A hypersurface is now defined by
r(u, v, t) = [x(u, v, t), y(u, v, t), z(u, v, t), w(u, v, t)].
40.6. A coordinate change is defined by a map from R4 to R4 given by four differ-
entiable functions: r(u, v, s, t) = [x(u, v, s, t), y(u, v, s, t), z(u, v, s, t), w(u, v, s, t)]. We
have seen already the parametrization r(φ, θ_1, θ_2) = [cos(φ) cos(θ_1), cos(φ) sin(θ_1),
sin(φ) cos(θ_2), sin(φ) sin(θ_2)] of the unit 3-sphere = hyper-sphere x^2 + y^2 + z^2 + w^2 =
1. Because x^2 + y^2 + z^2 = 1 is a cylinder, there is also a natural cylindrical coordinate
system in four dimensions. It is given by r(ρ, φ, θ, w) = [ρ sin(φ) cos(θ), ρ sin(φ) sin(θ),
ρ cos(φ), w]. If we write down the Jacobian matrix and compute the determinant we
get ρ^2 sin(φ) as in spherical coordinates.
Fields
40.7. A scalar function f (x, y, z, w) is also called a 0-form. A vector field is denoted
by F = [P, Q, R, S]T and a 1-form F = [P, Q, R, S] is written as F = P dx + Qdy +
Rdz + Sdw. A 2-form F has 6 components: F = Adxdy + Bdxdz + Cdxdw +
P dydz + Qdydw + Rdzdw. A 3-form again has four components P dydzdw + Qdxdzdw +
Rdxdydw +Sdxdydz and a 4-form is again completely determined by a scalar function
f because F = f dxdydzdw.
40.8. The exterior derivatives are computed by using the anti-commutation rule
like dxdy = −dydx and df = fx dx + fy dy + fz dz + fw dw and extending this to terms
like d(P dydz) = dP dydz = (Px dx + Py dy + Pz dz + Pw dw)dydz = Px dxdydz + Pw dwdydz.
For a 1 form F = P dx + Qdy + Rdz + Sdw we have
dF = Px dxdx + Py dydx + Pz dzdx + Pw dwdx +Qx dxdy + Qy dydy + Qz dzdy + Qw dwdy
+Rx dxdz + Ry dydz + Rz dzdz + Rw dwdz +Sx dxdw + Sy dydw + Sz dzdw + Sw dwdw
which simplifies to an expression with 6 terms. We have ddF = 0 because every term
like Pyz dzdydx is paired with a term like Pzy dydzdx which cancel. For a 2-form
F = Adxdy + Bdxdz + Cdwdx + P dydz + Qdydw + Rdzdw, we have dF = (Az dz +
Aw dw)dxdy+(By dy+Bw dw)dxdz+(Cy dy+Cz dz)dwdx+(Px dx+Pw dw)dydz+(Qx dx+
Qz dz)dydw + (Rx dx + Ry dy)dzdw which simplifies to (Qz + Pw + Ry )dydzdw + (Bw +
Cz + Rx )dxdzdw + (Aw + Qx + Cy )dxdydw + (Az + By + Px )dxdydz. For a 3-form
F = P dydzdw + Qdzdwdx + Rdwdxdy + Sdxdydz we have dF = (Px + Qy + Rz +
Sw )dxdydzdw.
40.10. Here are some properties which we have seen already. The gradient ∇f = df T is
perpendicular to the level surface f (x, y, z, w) = c. The curl of the gradient is zero. The
hypercurl of the curl is zero. The divergence of the hypercurl is zero. The divergence
of the gradient is the Laplacian (using the identifications, the divergence map can be
identified with the adjoint −d∗). The chain rule is d/dt f (r(t)) = ∇f (r(t)) · r'(t).
40.11. The line integral of a vector field F along a curve C is
$\int_C F(r(t)) \cdot r'(t) \, dt$.
The flux integral of a vector field F along a 2-dimensional surface is a flux integral.
The hyper flux integral of a hyper-field F along a surface . The hyper volume
integral of a function f on a solid G is $\iiiint_G f(x, y, z, w) \, dxdydzdw$.
Theorems
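40.12. The fundamental theorem of line integrals tells that for a function f and a
curve C parametrized by r(t) with a ≤ t ≤ b
Theorem: ∫_C ∇f · dr = f(r(b)) − f(r(a))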
40.13. The Stokes theorem tells that for a surface S with boundary curve C and a 1-form F
Theorem: ∫∫_S curl(F) · dS = ∫_C F · dr
40.14. The Hyper Stokes theorem assures that for a hypersurface G with boundary
surface S and a 2-form F, the flux of the hypercurl of F through G (a 3D-integral) is
the flux of F through the boundary surface S (a 2D-integral)
Theorem: ∫∫∫_G hypercurl(F) · dG = ∫∫_S F · dS
40.15. The divergence theorem assures that for a 3-form (identified as a vector field
F) and a solid G with boundary hyper-surface S, we have
Theorem: ∫∫∫∫_G div(F) dV = ∫∫∫_S F · dS.
Quaternions
40.16. Hyperspace R^4 is special: it is the only Euclidean space for which the unit
sphere is a non-Abelian Lie group. A Lie group G is a manifold r(R^m) ⊂ R^n [1]
on which one has a group operation x ∗ y with the property that for every y,
the maps x → x ∗ y and x → y ∗ x are smooth maps on G. To have a group (G, ∗),
we need associativity (x ∗ y) ∗ z = x ∗ (y ∗ z), a 1-element satisfying
1 ∗ x = x ∗ 1 = x, and for every element x an inverse x^−1 satisfying x ∗ x^−1 = 1. The
circle {x^2 + y^2 = 1} = {z ∈ C : |z| = 1} is an example of a group. A group is
Abelian if x ∗ y = y ∗ x for all x, y ∈ G. The complex plane C = R^2 is characterized
as the only Euclidean space R^n in which the unit sphere T^1 = {|x| = 1} is an Abelian
Lie group. Why Lie groups? They are the dough from which elementary particles are
baked! Electromagnetism is built from T^1, for example.
40.17. One can write a vector in R^4 also as v = a + ib + jc + kd where i, j, k are symbols.
Hamilton noticed that when defining i^2 = j^2 = k^2 = ijk = −1, the 4-dimensional space
becomes an algebra. An algebra is a linear space which also features a multiplication.
Now one has already M(2, 2), the space of 2 × 2 matrices, which is a 4-dimensional
algebra, but the algebra which Hamilton found is a division algebra: every non-zero
element can be inverted. This is not the case for M(2, 2). The matrix in which all
entries are 1, for example, is non-zero but not invertible.
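To make the division algebra property concrete, here is a small sketch that is not part of the original notes. It stores a quaternion a + ib + jc + kd as the tuple (a, b, c, d); the function names `qmul` and `qinv` are hypothetical. The product formula is what one gets by expanding with Hamilton's relations, and the inverse is the conjugate divided by the squared length.

```python
# Quaternions as 4-tuples (a, b, c, d) standing for a + ib + jc + kd.

def qmul(v, w):
    """Hamilton product, obtained by expanding with i^2 = j^2 = k^2 = ijk = -1."""
    a, b, c, d = v
    p, q, r, s = w
    return (a*p - b*q - c*r - d*s,
            a*q + b*p + c*s - d*r,
            a*r - b*s + c*p + d*q,
            a*s + b*r - c*q + d*p)

def qinv(v):
    """Inverse of a non-zero quaternion: conjugate divided by squared length."""
    a, b, c, d = v
    n = a*a + b*b + c*c + d*d
    if n == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return (a/n, -b/n, -c/n, -d/n)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

# Hamilton's relations hold for this product.
print(qmul(i, i) == qmul(j, j) == qmul(k, k) == qmul(qmul(i, j), k) == minus_one)
print(qmul(i, j), qmul(j, i))      # ij = k but ji = -k: not commutative

v = (1.0, 2.0, 3.0, 4.0)
print(qmul(v, qinv(v)))            # approximately (1, 0, 0, 0)

# In M(2,2), the all-ones matrix is non-zero but its determinant vanishes.
print(1*1 - 1*1)
```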
40.18. The algebra which Hamilton defined through the relations i^2 = j^2 = k^2 = ijk =
−1 is called the quaternion algebra H. If v̄ = a − ib − jc − kd denotes the conjugate of
v = a + ib + jc + kd, then |v|^2 = v · v = v v̄, where the right hand side is a quaternion
multiplication. One can readily check that |vw| = |v||w|. The reason is that quaternions
v can be realized as complex 2 × 2 matrices: if

A(v) = [  a + ib   c + id ]
       [ −c + id   a − ib ]

then |v|^2 = det(A(v)) and A(v)A(w) = A(vw).
Your favorite AI helps to check this last identity quickly.
(* Load the Quaternions package, which provides Quaternion[...] and the product ** *)
Needs["Quaternions`"];
(* A[v]: complex 2x2 matrix representing the quaternion v = {x, y, z, w} *)
A[{x_, y_, z_, w_}] := {{x + I*y, z + I*w}, {-z + I*w, x - I*y}};
Q = Quaternion[a, b, c, d] ** Quaternion[p, q, r, s];
Simplify[A[{a, b, c, d}] . A[{p, q, r, s}] == A[Table[Q[[k]], {k, 4}]]]
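For readers without a computer algebra system, the same check can be sketched numerically in plain Python. This is an illustration under assumptions, not the author's code: the helpers `qmul`, `A`, `matmul` and `det` are hypothetical names, and the test uses random integer quaternions so that the complex arithmetic is exact. It also confirms that det(A(v)) equals a^2 + b^2 + c^2 + d^2.

```python
import random

def qmul(v, w):
    """Hamilton product of quaternions stored as tuples (a, b, c, d)."""
    a, b, c, d = v
    p, q, r, s = w
    return (a*p - b*q - c*r - d*s,
            a*q + b*p + c*s - d*r,
            a*r - b*s + c*p + d*q,
            a*s + b*r - c*q + d*p)

def A(v):
    """Complex 2x2 matrix attached to the quaternion v = a + ib + jc + kd."""
    a, b, c, d = v
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def matmul(X, Y):
    return [[sum(X[r][m] * Y[m][s] for m in range(2)) for s in range(2)]
            for r in range(2)]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

random.seed(0)
for _ in range(100):
    v = tuple(random.randint(-9, 9) for _ in range(4))
    w = tuple(random.randint(-9, 9) for _ in range(4))
    assert matmul(A(v), A(w)) == A(qmul(v, w))      # A is multiplicative
    assert det(A(v)) == sum(x * x for x in v)       # determinant is the squared length
print("both identities hold on 100 random integer samples")
```

Since the determinant is multiplicative, the two identities together give |vw|^2 = |v|^2 |w|^2.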
[1] Manifolds can be described abstractly, but a theorem of John Nash assures that every
manifold can be embedded in some R^n. So, looking at images of maps r is no loss of generality!
Discrete Calculus
G = (V, E) graph with vertex set V and edge set E.
0-form: function on V . Discrete scalar function
1-form: function on E. Discrete vector field
2-form: function on triangles T .
d(f) = grad(f) is a function on edges a → b defined by f(b) − f(a).
H = dF = curl(F) is a function on triangles obtained by summing F along the boundary of the triangle.
d∗ H is a function on edges. Add up the attached triangle values.
d∗ F is a function on vertices. Add up the attached edge values.
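The discrete gradient and curl can be tried out on a toy example. The sketch below is not from the notes: the graph, the values of f and F, and the function names are made up for illustration, and the orientation convention (an edge traversed backwards contributes with a minus sign) is an assumption.

```python
# A hypothetical small graph: oriented edges a -> b and one oriented triangle.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
triangles = [(0, 1, 2)]            # boundary runs 0 -> 1 -> 2 -> 0

def grad(f):
    """d of a 0-form: the value f(b) - f(a) on each edge a -> b."""
    return {(a, b): f[b] - f[a] for (a, b) in edges}

def edge_value(F, a, b):
    """Value of a 1-form on a -> b; traversing an edge backwards flips the sign."""
    return F[(a, b)] if (a, b) in F else -F[(b, a)]

def curl(F):
    """d of a 1-form: sum F along the boundary of each triangle."""
    return {(a, b, c): edge_value(F, a, b) + edge_value(F, b, c) + edge_value(F, c, a)
            for (a, b, c) in triangles}

f = {0: 1, 1: 4, 2: 2, 3: 7}                          # a 0-form
F = {(0, 1): 1, (1, 2): 2, (0, 2): 5, (2, 3): 0}      # a 1-form

print(grad(f))
print(curl(F))
print(curl(grad(f)))     # the curl of a gradient vanishes on every triangle
```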
New People
Cartan, Maxwell, Stokes, Green, Gauss, Newton, Kirchhoff, Menger, Koch,
Escher, Peirce
Partial Derivatives
L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ) linear approximation
Q(x, y) = L(x, y) + fxx (x − x0)^2/2 + fyy (y − y0)^2/2 + fxy (x − x0)(y − y0) quadratic approximation, second derivatives taken at (x0, y0)
use L(x, y) to estimate f(x, y) near (x0, y0). The result is f(x0, y0) + a(x − x0) + b(y − y0) with a = fx(x0, y0), b = fy(x0, y0)
tangent plane: ax + by + cz = d with a = fx, b = fy, c = fz, d = ax0 + by0 + cz0
estimate f(x, y) by L(x, y) or Q(x, y) near (x0, y0)
fxy = fyx Clairaut’s theorem for functions which are in C^2.
ru (u, v), rv (u, v) tangent to surface parameterized by r(u, v)
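A short numerical sketch shows how the linear and quadratic approximations compare. The function f(x, y) = x^3 y^2 and the base point (1, 1) are chosen here only for illustration; the partial derivatives are entered by hand.

```python
def f(x, y):
    return x**3 * y**2                      # sample function, an assumption

x0, y0 = 1.0, 1.0
fx, fy = 3 * x0**2 * y0**2, 2 * x0**3 * y0                    # first partials at (x0, y0)
fxx, fyy, fxy = 6 * x0 * y0**2, 2 * x0**3, 6 * x0**2 * y0     # second partials at (x0, y0)

def L(x, y):
    """Linear approximation at (x0, y0)."""
    return f(x0, y0) + fx * (x - x0) + fy * (y - y0)

def Q(x, y):
    """Quadratic approximation: L plus the second order terms."""
    dx, dy = x - x0, y - y0
    return L(x, y) + fxx * dx**2 / 2 + fyy * dy**2 / 2 + fxy * dx * dy

x, y = 1.1, 0.9
print(f(x, y), L(x, y), Q(x, y))
```

At this point the exact value is 1.331 · 0.81 = 1.07811, the linear estimate is 1.1 and the quadratic estimate is 1.08, so the second order terms reduce the error by roughly a factor of ten.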
Parametrization
r : G ⊂ R^m → R^n, dr Jacobian
g = dr^T dr first fundamental form, |dr| = √det(g) distortion factor.
curl(F )(r(u, v)) · (ru × rv ) = Fu · rv − Fv · ru important formula
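As an illustration of the distortion factor, the sketch below uses the unit sphere r(u, v) = [cos(u) sin(v), sin(u) sin(v), cos(v)], a parametrization chosen here as an example. It builds g from the dot products of the columns ru, rv of dr and compares √det(g) with |ru × rv|; for the sphere both equal sin(v).

```python
from math import sin, cos, sqrt

def r_u(u, v):
    return (-sin(u) * sin(v), cos(u) * sin(v), 0.0)

def r_v(u, v):
    return (cos(u) * cos(v), sin(u) * cos(v), -sin(v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def distortion(u, v):
    """sqrt(det(g)) where g = dr^T dr is the 2x2 matrix of dot products of r_u, r_v."""
    ru, rv = r_u(u, v), r_v(u, v)
    g = [[dot(ru, ru), dot(ru, rv)],
         [dot(rv, ru), dot(rv, rv)]]
    return sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])

u, v = 0.3, 1.1
n = cross(r_u(u, v), r_v(u, v))
print(distortion(u, v), sqrt(dot(n, n)), sin(v))    # three equal numbers
```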
Gradient
∇f (x, y) = [fx , fy ]T , ∇f (x, y, z) = [fx , fy , fz ]T , gradient
Dv f = ∇f · v directional derivative
d/dt f(r(t)) = ∇f(r(t)) · r′(t) chain rule
∇f (x0 , y0 ) is orthogonal to the level curve f (x, y) = c containing (x0 , y0 )
∇f (x0 , y0 , z0 ) is orthogonal to the level surface f (x, y, z) = c containing (x0 , y0 , z0 )
d/dt f(x + tv) = Dv f by chain rule
(x − x0 )fx (x0 , y0 ) + (y − y0 )fy (x0 , y0 ) = 0 tangent line
(x − x0 )fx (x0 , y0 , z0 ) + (y − y0 )fy (x0 , y0 , z0 ) + (z − z0 )fz (x0 , y0 , z0 ) = 0 tangent plane
Dv f (x0 , y0 ) is maximal in the v = ∇f (x0 , y0 )/|∇f (x0 , y0 )| direction
f (x, y) increases in the ∇f /|∇f | direction at points which are not critical points
if Dv f (x) = 0 for all v, then ∇f (x) = 0
f(x, y, z) = c defines z = g(x, y), and gx(x, y) = −fx(x, y, z)/fz(x, y, z) implicit differentiation
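The statement that the directional derivative is maximal in the gradient direction can be checked by brute force. This is a minimal sketch for the sample function f(x, y) = x^2 + 3y at the point (1, 2), both of which are assumptions made for illustration.

```python
from math import cos, sin, pi, hypot

def grad_f(x, y):
    """Gradient of the sample function f(x, y) = x^2 + 3y."""
    return (2 * x, 3.0)

def directional_derivative(x, y, v):
    """D_v f = grad f . v for a unit vector v."""
    gx, gy = grad_f(x, y)
    return gx * v[0] + gy * v[1]

x0, y0 = 1.0, 2.0
gx, gy = grad_f(x0, y0)
norm = hypot(gx, gy)

# Try 3600 unit vectors around the circle and keep the largest D_v f.
best = max(directional_derivative(x0, y0, (cos(t), sin(t)))
           for t in (2 * pi * k / 3600 for k in range(3600)))
steepest = directional_derivative(x0, y0, (gx / norm, gy / norm))
print(best, steepest, norm)
```

The sampled maximum stays just below the value in the direction ∇f/|∇f|, which equals |∇f|.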
Extrema
∇f (x, y) = [0, 0]T , critical point
D = det(d^2 f) = fxx fyy − fxy^2 discriminant.
Morse: critical point and D ≠ 0, in 2D looks like x^2 + y^2, x^2 − y^2, −x^2 − y^2
f (x0 , y0 ) ≥ f (x, y) in a neighborhood of (x0 , y0 ) local maximum
f (x0 , y0 ) ≤ f (x, y) in a neighborhood of (x0 , y0 ) local minimum
∇f (x, y) = λ∇g(x, y), g(x, y) = c, λ Lagrange equations
∇f (x, y, z) = λ∇g(x, y, z), g(x, y, z) = c, λ Lagrange equations
second derivative test: ∇f = (0, 0), D > 0, fxx < 0 local max; ∇f = (0, 0), D > 0, fxx > 0 local min; ∇f = (0, 0), D < 0 saddle point
f (x0 , y0 ) ≥ f (x, y) everywhere, global maximum
f (x0 , y0 ) ≤ f (x, y) everywhere, global minimum
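The second derivative test translates directly into a few lines of code. The sketch below assumes the point is already known to be a critical point and takes the three second partial derivatives there as input; the function name `classify` is hypothetical.

```python
def classify(fxx, fyy, fxy):
    """Second derivative test at a critical point, given the second partials there."""
    D = fxx * fyy - fxy**2
    if D > 0 and fxx < 0:
        return "local max"
    if D > 0 and fxx > 0:
        return "local min"
    if D < 0:
        return "saddle point"
    return "inconclusive"      # D = 0: the test gives no information

# The three Morse model cases at the origin, and a degenerate example.
print(classify(2, 2, 0))       # x^2 + y^2
print(classify(2, -2, 0))      # x^2 - y^2
print(classify(-2, -2, 0))     # -x^2 - y^2
print(classify(0, 0, 0))       # x^4 + y^4 has D = 0 at the origin
```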
Double Integrals
∫∫_R f(x, y) dydx double integral
∫_a^b ∫_c^d f(x, y) dydx integral over rectangle
∫_a^b ∫_{c(x)}^{d(x)} f(x, y) dydx bottom-top region
∫_c^d ∫_{a(y)}^{b(y)} f(x, y) dxdy left-right region
∫∫_R f(r, θ) r drdθ polar coordinates
∫∫_R |ru × rv| dudv surface area
∫_a^b ∫_c^d f(x, y) dydx = ∫_c^d ∫_a^b f(x, y) dxdy Fubini
∫∫_R 1 dxdy area of region R
∫∫_R f(x, y) dxdy signed volume of solid bound by graph of f and xy-plane
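Fubini and the polar coordinate factor r can be illustrated with a simple midpoint Riemann sum. This is a rough numerical sketch, not a recommended integration method; the integrand x y^2 and the helper name `double_integral` are chosen here for illustration.

```python
from math import pi

def double_integral(f, a, b, c, d, n=200):
    """Midpoint Riemann sum of f(s, t) for s in [a, b] (outer) and t in [c, d] (inner)."""
    hs, ht = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * hs
        for j in range(n):
            t = c + (j + 0.5) * ht
            total += f(s, t)
    return total * hs * ht

# Fubini on the rectangle 0 <= x <= 1, 0 <= y <= 2 for f(x, y) = x y^2.
dydx = double_integral(lambda x, y: x * y * y, 0, 1, 0, 2)
dxdy = double_integral(lambda y, x: x * y * y, 0, 2, 0, 1)
print(dydx, dxdy)              # both close to 4/3

# Area of the unit disk in polar coordinates: integrate 1 * r dr dtheta.
area = double_integral(lambda r, theta: r, 0, 1, 0, 2 * pi)
print(area, pi)
```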
Triple Integrals
∫∫∫_R f(x, y, z) dzdydx triple integral
∫_a^b ∫_c^d ∫_u^v f(x, y, z) dzdydx integral over rectangular box
∫_a^b ∫_{g1(x)}^{g2(x)} ∫_{h1(x,y)}^{h2(x,y)} f(x, y, z) dzdydx type I region
∫∫∫_R f(r, θ, z) r dzdrdθ integral in cylindrical coordinates
∫∫∫_R f(ρ, θ, φ) ρ^2 sin(φ) dρdφdθ integral in spherical coordinates
∫_a^b ∫_c^d ∫_u^v f(x, y, z) dzdydx = ∫_u^v ∫_c^d ∫_a^b f(x, y, z) dxdydz Fubini
V = ∫∫∫_E 1 dzdydx volume of solid E
M = ∫∫∫_E σ(x, y, z) dxdydz mass of solid E with density σ
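The spherical volume element ρ^2 sin(φ) can be tested on the unit ball, whose volume is 4π/3. The sketch below uses a coarse midpoint sum; the density σ = ρ in the mass example is an assumption chosen so that the exact mass is π.

```python
from math import sin, pi

def spherical_integral(f, radius, n=40):
    """Midpoint sum of f(rho, theta, phi) rho^2 sin(phi) over a ball of the given radius."""
    h_rho, h_theta, h_phi = radius / n, 2 * pi / n, pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h_rho
        for k in range(n):
            phi = (k + 0.5) * h_phi
            weight = rho**2 * sin(phi)          # the distortion factor
            for j in range(n):
                theta = (j + 0.5) * h_theta
                total += f(rho, theta, phi) * weight
    return total * h_rho * h_theta * h_phi

volume = spherical_integral(lambda rho, theta, phi: 1.0, 1.0)
mass = spherical_integral(lambda rho, theta, phi: rho, 1.0)   # density sigma = rho
print(volume, 4 * pi / 3)
print(mass, pi)
```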
Line Integrals
F (x, y) = [P (x, y), Q(x, y)]T vector field in the plane
F (x, y, z) = [P (x, y, z), Q(x, y, z), R(x, y, z)]T vector field in space
∫_C F · dr = ∫_a^b F(r(t)) · r′(t) dt line integral
F (x, y) = ∇f (x, y) gradient field = potential field = conservative field
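A line integral of a gradient field only depends on the end points. The sketch below checks this numerically for the sample potential f(x, y) = x^2 y along the parabola r(t) = [t, t^2]; the field, the curve and the helper names are assumptions made for illustration.

```python
def F(x, y):
    return (2 * x * y, x * x)          # gradient of the sample potential f(x, y) = x^2 y

def r(t):
    return (t, t * t)                  # sample curve, 0 <= t <= 1

def r_prime(t):
    return (1.0, 2 * t)

def line_integral(F, r, r_prime, a, b, n=1000):
    """Midpoint sum for the integral of F(r(t)) . r'(t) over a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        Fx, Fy = F(*r(t))
        dx, dy = r_prime(t)
        total += (Fx * dx + Fy * dy) * h
    return total

value = line_integral(F, r, r_prime, 0.0, 1.0)
print(value)       # close to f(1, 1) - f(0, 0) = 1
```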
Green’s Theorem
F(x, y) = [P, Q]^T, curl in two dimensions: curl(F) = Qx − Py
Green’s theorem: C boundary of R, then ∫_C F · dr = ∫∫_R curl(F) dxdy
Area computation: Take F with curl(F ) = Qx −Py = 1 like F = [−y, 0]T or F = [0, x]T
Green’s theorem is useful to compute difficult line integrals or difficult 2D integrals
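The area computation can be sketched for an ellipse. With F = [0, x]^T, which has curl 1, Green's theorem turns the area into the line integral of x dy along the boundary. The semi-axes a = 3, b = 2 and the helper name `area_by_green` are chosen here for illustration.

```python
from math import cos, pi

def area_by_green(x, dy, n=1000):
    """Area enclosed by a counterclockwise closed curve r(t), 0 <= t <= 2 pi,
    computed as the line integral of F = [0, x]: the integral of x(t) y'(t) dt."""
    h = 2 * pi / n
    return sum(x(k * h) * dy(k * h) for k in range(n)) * h

a, b = 3.0, 2.0
# Ellipse r(t) = [a cos(t), b sin(t)], so x(t) = a cos(t) and y'(t) = b cos(t).
area = area_by_green(lambda t: a * cos(t), lambda t: b * cos(t))
print(area, pi * a * b)
```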
Flux integrals
F (x, y, z) vector field, S = r(R) parametrized surface
ru × rv dudv = dS 2-form on surface
∫∫_S F · dS = ∫∫_R F(r(u, v)) · (ru × rv) dudv flux integral
Stokes Theorem
F(x, y, z) = [P, Q, R]^T, curl([P, Q, R]^T) = [Ry − Qz, Pz − Rx, Qx − Py]^T = ∇ × F
Stokes’s theorem: C boundary of surface S, then ∫_C F · dr = ∫∫_S curl(F) · dS
Stokes theorem allows to compute difficult flux integrals or difficult line integrals
div(grad(f )) = ∆f Laplacian
incompressible = divergence free field: div(F ) = 0 everywhere. Implies F = curl(H)
irrotational = curl(F ) = 0 everywhere. Implies F = grad(f )
Divergence Theorem
div([P, Q, R]^T) = Px + Qy + Rz = ∇ · F
divergence theorem: solid E, boundary S then ∫∫_S F · dS = ∫∫∫_E div(F) dV
the divergence theorem allows to compute difficult flux integrals or difficult 3D integrals
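The divergence theorem can be verified numerically on the unit cube. The field F = [x^2, y^2, z^2]^T is a sample chosen here for illustration; both sides are computed with midpoint sums and should agree with the exact value 3.

```python
def F(x, y, z):
    return (x * x, y * y, z * z)          # sample field, an assumption

def div_F(x, y, z):
    return 2 * x + 2 * y + 2 * z          # P_x + Q_y + R_z for the field above

n = 20
h = 1.0 / n
mid = [(i + 0.5) * h for i in range(n)]

# Triple integral of div(F) over the cube [0, 1]^3.
volume_integral = sum(div_F(x, y, z) for x in mid for y in mid for z in mid) * h**3

# Flux through the six faces, with outward normals +e and -e on opposite faces.
flux = 0.0
for s in mid:
    for t in mid:
        flux += (F(1, s, t)[0] - F(0, s, t)[0]) * h * h   # faces x = 1 and x = 0
        flux += (F(s, 1, t)[1] - F(s, 0, t)[1]) * h * h   # faces y = 1 and y = 0
        flux += (F(s, t, 1)[2] - F(s, t, 0)[2]) * h * h   # faces z = 1 and z = 0
print(volume_integral, flux)
```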
Some topology
simply connected region D: can deform any closed curve within D to a point
interior of a region D: points in D for which a small neighborhood is still in D
boundary of curve C: the end points of the curve
boundary of surface S: points on the surface not in the interior of the parameter domain
boundary of solid G: points in G which are not in the interior of G
closed surface: a surface without boundary like a sphere
closed curve: a curve with no boundary like a knot
Differential forms
A k-form is a field, which attaches at every point a multi-linear anti-symmetric map of
k variables.
F = 5x^3 dydz + 7 sin(y)x dxdz + 3 cos(xy) dxdy is an example of a 2-form. In calculus
this is identified with a vector field F = [5x^3, 7 sin(y)x, 3 cos(xy)].
The exterior derivative of a term like F = P dxdy is dF = (Px dx + Py dy + Pz dz)dxdy =
Pz dzdxdy = Pz dxdydz.
The general Stokes theorem tells ∫_G dF = ∫_δG F, where δG is the boundary of G.
Problems
Figure 1. A graph with a 1-form F. Enter here the result for a) and b).
• You only need this booklet and something to write. Please stow away any other
material and any electronic devices. Remember the honor code.
• Please write neatly and give details. Except for problems 2 and 3 we want to see
details, even if the answer should be obvious to you.
• Try to answer the question on the same page. There is additional space on the
back of each page. If you must, use additional scratch paper at the end.
• If you finish a problem somewhere else, please indicate on the problem page where
we can find it.
• You have 180 minutes for this 3-hourly.
Problems
c) Write down the general formula for the arc length of a curve
r(t) = [x(t), y(t), z(t)]T
with a ≤ t ≤ b.
d) Which of the following vector fields are gradient fields? (It could be
none, one, two, three or all.)
F = [x, 0]T
F = [0, x]T
F = [x, y]T
F = [y, x]T
h) What is the cosine of the angle between the matrices A, B ∈ M(2, 2),
where A is the identity matrix and B is the matrix which has 1 everywhere?
You should get a concrete number.
42.1. Of course, the hope is that no other literature is needed. These lecture notes are
quite dense. The idea is that you “write your own book” and fill in any gaps or
work out some parts in more detail. We live in a time where many great resources are
online. The total sum of the books below are in the ten thousands. Some are there for
historical reasons, like Gibbs, Cartan or Gleason. For proof literature, see Unit 24.
Acknowledgements:
43.1. First of all thanks to the Math 22a class of the fall of 2018 for questions,
comments, feedback and proofreading. So, thanks to Troy Appel, Ethan Arellano,
Jordan Barkin, Vlad Batagui, David Bruno, Silvia Casacuberta, Ian Chan, Candice
Chen, Michael Chen, Jackson Delgado, Isabel Diersen, Michaela Donato, Shelby Elder,
William Elder, Phillip Evans, Eric Hansen, Emily He, Ben HoffnerBrodsky, Jerry
Huang, Spencer Hurt, Lauren Jiang, Drew Kelner, Nour Khachemoune, Madeleine
KlebanoffOBrien, Mary Kolesar, Jordan Lawanson, Tim Li, Jake Lim, Isaac Longobardi,
Robert Malate, Emily Murdock, Samantha OSullivan, Christopher Ong, Moni
Radev, Sanjana Ramrajvel, Isaac Robinson, Amia Ross, Julian Schmitt, Daniel Shin,
Ross Simmons, Daniel Slaw, Sophia Sun, Varun Tekur, Grace Tian, Eddie Tu, Connor
Wagaman, Carissa Wu, Rebecca Xi, Iris Xu, Mark Xu, Jenny Yao, Nicole Zhang, Jennifer
Zhu and Richard Zhu. There were not only hundreds of small things which the class
found, but substantial directives also came from the class while teaching the course.
43.2. Thanks to Robin Gottlieb and Cliff Taubes for initiating the course and getting
it off the ground. And to the TFs David, Aditya and Elliot for helping to run the
course smoothly.
43.3. Thanks to Jameel Al-Aidroos, Wes Cain, Janet Chen and Dusty Grundmeyer for
valuable discussions and feedback for early versions during the spring and the summer
of 2018 while planning the course.
43.4. The cover picture uses a Povray code by Jaime Vives Piqueres from 2005
illustrating height-fields.
Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018