Linear Algebra for Math Students

The document is a lecture on linear algebra and vector calculus that covers topics like: - Defining matrices, vectors, and matrix operations like addition, scalar multiplication, and matrix multiplication. - Introducing vector spaces and properties like associativity and distributivity. - Defining the dot product of vectors and matrices and properties like Cauchy-Schwarz inequality. - Explaining how the dot product can define the angle between vectors and be used to express the Pythagorean theorem. - Providing examples to illustrate key concepts like orthogonal and parallel vectors.

Linear Algebra And Vector Calculus I

Oliver Knill

Title image: Povray code by Jaime Vives Piqueres

This document is in the public domain.

Oliver Knill, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 1: Pythagorean theorem

Lecture
1.1. A finite rectangular array $A$ of real numbers is called a matrix. If there are $n$
rows and $m$ columns in $A$, it is called an $n \times m$ matrix. We address the entry in the $i$'th
row and $j$'th column with $A_{ij}$. An $n \times 1$ matrix is a column vector, a $1 \times n$ matrix is
a row vector. A $1 \times 1$ matrix is called a scalar. Given an $n \times p$ matrix $A$ and a $p \times m$
matrix $B$, the $n \times m$ matrix $AB$ is defined as $(AB)_{ij} = \sum_{k=1}^{p} A_{ik} B_{kj}$. It is called the
matrix product. The transpose of an $n \times m$ matrix $A$ is the $m \times n$ matrix $A^T$ with $(A^T)_{ij} = A_{ji}$.
The transpose of a column vector is a row vector.
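The two definitions can be tried out on small matrices. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `matmul` and `transpose` and the sample matrices are illustrative choices, not part of the lecture.

```python
def matmul(A, B):
    """Matrix product of an n x p matrix A and a p x m matrix B (lists of rows)."""
    n, p, m = len(A), len(B), len(B[0])
    assert all(len(row) == p for row in A), "inner dimensions must agree"
    # (AB)_ij is the sum over k of A_ik * B_kj
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

def transpose(A):
    """Transpose: entry (i, j) of the result is A[j][i]."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix
B = [[1, 0],
     [0, 1],
     [1, 1]]             # a 3 x 2 matrix
AB = matmul(A, B)        # a 2 x 2 matrix
```

Here `AB` has as many rows as `A` and as many columns as `B`, in line with the definition.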
1.2. Denote by M (n, m) the set of n×m matrices. It contains the zero matrix O with
Oij = 0. In the case m = 1, it is the zero vector. The addition A+B of two matrices
in M (n, m) is defined as (A+B)ij = Aij +Bij . The scalar multiplication λA is defined
as (λA)ij = λAij if λ is a real number. These operations make M (n, m) a vector
space = linear space: the addition is associative, commutative with a unique
additive inverse −A satisfying A − A = 0. The multiplications are distributive:
A(B + C) = AB + AC and λ(A + B) = λA + λB and λ(µA) = (λµ)A.
1.3. The space $M(n, 1)$ is also called $\mathbb{R}^n$. It is the n-dimensional Euclidean space.
The vector space $\mathbb{R}^2$ is the plane and $\mathbb{R}^3$ is the physical space. These spaces are
dear to us as we draw on paper and live in space. The dot product between two
column vectors $v, w \in \mathbb{R}^n$ is the matrix product $v \cdot w = v^T w$. Because the dot product
is a scalar, the product is also called the scalar product. In the matrix product of
two matrices A, B, the entry at position (i, j) is the dot product of the i'th row in A
with the j'th column in B. More generally, the dot product between two arbitrary
$n \times m$ matrices can be defined by $A \cdot B = \mathrm{tr}(A^T B)$, where the trace of a matrix is
the sum of its diagonal entries. This means $\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}$. We just take
the products of corresponding matrix entries and add them up. The dot product is distributive
$(u + v) \cdot w = u \cdot w + v \cdot w$ and commutative $v \cdot w = w \cdot v$. We can use it to define
the length $|v| = \sqrt{v \cdot v}$ of a vector or the length $|A| = \sqrt{A \cdot A}$ of a matrix, where we took the
positive square root. The sum of the squares is zero exactly if all components are zero.
The only vector satisfying $|v| = 0$ is therefore $v = 0$.
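As a small illustration of the dot product and the length, here is a sketch in plain Python. It assumes matrices are lists of rows, so a column vector is a list of one-element rows; the names `dot` and `length` and the sample data are illustrative.

```python
import math

def dot(A, B):
    """tr(A^T B): multiply corresponding entries of two same-shape matrices and add."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def length(A):
    """|A| = sqrt(A . A), the positive square root."""
    return math.sqrt(dot(A, A))

# Column vectors are n x 1 matrices, stored as lists of one-element rows.
v = [[1], [3], [1]]
w = [[1], [-2], [-1]]

# Two sample 2 x 2 matrices and their sum.
A = [[1, 2], [1, 2]]
B = [[1, -1], [-1, 1]]
S = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]   # A + B
```

For these sample matrices `dot(A, B)` is zero, so `dot(S, S)` equals `dot(A, A) + dot(B, B)`.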
1.4. An important key result is the Cauchy-Schwarz inequality.

Theorem: |v · w| ≤ |v||w|

Proof. If $w = 0$, there is nothing to prove as both sides are zero. If $w \neq 0$, then we can
divide both sides of the inequality by $|w|$ and so achieve that $|w| = 1$. Define $a = v \cdot w$.
Now, $0 \leq (v - aw) \cdot (v - aw) = |v|^2 - 2a \, v \cdot w + a^2 |w|^2 = |v|^2 - 2a^2 + a^2 = |v|^2 - a^2$,
meaning $a^2 \leq |v|^2$ or $|v \cdot w| \leq |v| = |v||w|$.
1.5. It follows from the Cauchy-Schwarz inequality that for any two non-zero vectors
v, w, the number (v · w)/(|v||w|) is in the closed interval [−1, 1]. There exists therefore
a unique angle α ∈ [0, π] such that cos(α) = (v · w)/(|v||w|). If this angle between v
and w is equal to α = π/2, the two vectors are orthogonal. If α = 0 or π the two
vectors are called parallel. There exists then a real number λ such that v = λw. The
zero vector is considered both orthogonal as well as parallel to any other vector.
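The definition of the angle can be turned into a few lines of code. This is a sketch under the assumption that vectors are plain Python lists and both are non-zero; the function names are illustrative. The clamp only guards against floating point rounding, since by Cauchy-Schwarz the exact quotient already lies in [−1, 1].

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def angle(v, w):
    """Angle in [0, pi] between two non-zero vectors."""
    c = dot(v, w) / math.sqrt(dot(v, v) * dot(w, w))
    c = max(-1.0, min(1.0, c))   # guard against rounding slightly outside [-1, 1]
    return math.acos(c)

def is_orthogonal(v, w):
    """Orthogonality can be tested exactly for integer vectors: v . w = 0."""
    return dot(v, w) == 0
```

Testing `dot(v, w) == 0` directly avoids comparing a floating point angle with π/2.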
1.6. Two vectors v, w define a (possibly degenerate) triangle {0, v, w} in Euclidean
space $\mathbb{R}^n$. The above formula defines an angle α at the point 0 (which could be the
zero angle). The side lengths a = |v|, b = |w|, c = |v − w| of the triangle satisfy the
following cos formula. It is also called the Al Kashi identity.

Corollary: $c^2 = a^2 + b^2 - 2ab \cos(\alpha)$

Proof. We use the definitions as well as the distributive property (FOIL out):
$c^2 = |v - w|^2 = (v - w) \cdot (v - w) = v \cdot v + w \cdot w - 2 v \cdot w = a^2 + b^2 - 2ab \cos(\alpha)$.
1.7. The case α = π/2 is particularly important. It is the Pythagorean theorem:

Theorem: In a right angle triangle we have $c^2 = a^2 + b^2$.

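Examples
1.8. The dot product of $v = [1, 3, 1]^T$ and $w = [1, -2, -1]^T$ is
$$ \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix} = [1, 3, 1] \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix} = 1 - 6 - 1 = -6 . $$
We have $|v| = \sqrt{11}$, $|w| = \sqrt{6}$ and angle $\alpha = \arccos(-6/\sqrt{66})$.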

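1.9. The dot product of $A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 2 \\ 4 & -1 \end{bmatrix}$ is
$\mathrm{tr}(A^T B) = 6 + 2 + 8 + (-1) = 15$. The length of A is $\sqrt{9 + 1 + 4 + 1} = \sqrt{15}$,
the length of B is $\sqrt{4 + 4 + 16 + 1} = 5$. The angle between A and B
is $\alpha = \arccos(15/(5\sqrt{15})) = \arccos(\sqrt{15}/5)$.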

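1.10. $A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ are perpendicular because $\mathrm{tr}(A^T B) = 0$.
The angle between them is π/2. The length of A is $a = \sqrt{10}$. The length of B is
$b = \sqrt{4} = 2$. The length of $A + B = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}$ is $c = \sqrt{14}$. We confirm $a^2 + b^2 = c^2$.
Note that $AB \neq BA$. Multiplication is not commutative.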
1.11. Find the angles in a triangle with side lengths a = 4, b = 5 and c = 6. Answer: Al Kashi gives
$2 \cdot 4 \cdot 5 \cos(\gamma) = 4^2 + 5^2 - 6^2 = 5$ so that $\gamma = \arccos(5/40)$. Similarly $2 \cdot 4 \cdot 6 \cos(\beta) = 4^2 + 6^2 - 5^2 = 27$
so that $\beta = \arccos(27/48)$ and $2 \cdot 5 \cdot 6 \cos(\alpha) = 5^2 + 6^2 - 4^2 = 45$ so that $\alpha = \arccos(45/60)$.
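The computation in this example can be repeated for any triangle. Below is a small sketch, assuming the three side lengths satisfy the triangle inequality; the function name `triangle_angles` is an illustrative choice.

```python
import math

def triangle_angles(a, b, c):
    """Angles opposite to the sides a, b, c, from c^2 = a^2 + b^2 - 2ab cos(gamma)."""
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((a * a + c * c - b * b) / (2 * a * c))
    gamma = math.acos((a * a + b * b - c * c) / (2 * a * b))
    return alpha, beta, gamma

alpha, beta, gamma = triangle_angles(4, 5, 6)
```

A useful sanity check is that the three angles add up to π.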
Illustrations

Figure 1. A cuboid of integer side lengths a, b and c such that $a^2 + b^2$, $a^2 + c^2$, $b^2 + c^2$
are squares is an Euler brick. Its side diagonals are
then integers. The smallest one (a, b, c) = (44, 117, 240) was found in 1719.
If also $a^2 + b^2 + c^2$ is a square, meaning that the space diagonal is an
integer too, we have a perfect Euler brick. Nobody has found one. It
is a famous open problem due to Euler, whether there exists one. 1
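The Euler brick condition is easy to test by machine. The sketch below checks the triple (44, 117, 240) with exact integer arithmetic; the function names are illustrative, and `math.isqrt` assumes Python 3.8 or later.

```python
import math

def is_square(n):
    """True if the non-negative integer n is a perfect square."""
    r = math.isqrt(n)
    return r * r == n

def is_euler_brick(a, b, c):
    """All three face diagonals have integer length."""
    return all(is_square(x * x + y * y) for x, y in ((a, b), (a, c), (b, c)))

def is_perfect(a, b, c):
    """An Euler brick whose space diagonal is an integer as well."""
    return is_euler_brick(a, b, c) and is_square(a * a + b * b + c * c)
```

For (44, 117, 240) the face diagonals are 125, 244 and 267, while the space diagonal is not an integer.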

Figure 2. This Povray scene was generated by a method which involves
a lot of vector calculus and linear algebra: this open source ray
tracer bounces light around in the virtual scene and computes the reflections.
A camera then captures the photons, similarly as a real camera
does. Textures are implemented by images, here a postcard of Harvard
Square from 1930. It is an image file encoding three 1688 × 1104 matrices
R, G, B, the red, green and blue values at each pixel. The scene is an
“homage” to the novel “Of Time and the River” by Thomas Wolfe who
was a Harvard undergraduate here from 1920-1922 (notice the 22!)

1 Knill, 2009: http://www.math.harvard.edu/~knill/various/eulercuboid/lecture.pdf


Homework
This homework is due on Thursday.

Problem 1.1: Given $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$.
a) Find $A^T$, then build $B = A + A^T$ and $C = A - A^T$. The first matrix is
called symmetric, the second is called anti-symmetric.
b) Compute $AA^T$ and $A^TA$. Then evaluate $\mathrm{tr}(A^TA)$ and $\mathrm{tr}(AA^T)$.
c) Why are the two numbers computed in b) the same? Is it true in
general for two n × m matrices that $\mathrm{tr}(A^TB) = \mathrm{tr}(B^TA)$? (There is a
short verification using the sum notation).

Problem 1.2: Use the definitions to find the angle between the vectors
$v = [1, 1, 0, -3, 0, 1]^T$ and $w = [1, 1, 9, -3, -5, -3]^T$. What? Is this not
a bit esoteric? These vectors are in $\mathbb{R}^6$. It actually is very applied: the
value cos(α) is the correlation between the two data points v and w. If
the cosine is positive, the data have positive correlation. If the cosine is
negative, they have negative correlation.

Problem 1.3: a) Verify the triangle inequality |v − w| ≤ |v| + |w| in general
by FOILing out (v − w) · (v − w), then generate an example of two vectors
in the plane $\mathbb{R}^2$, where this happens. Draw the situation.
b) Verify that if v and w have the same length, then (v − w) and (v + w)
are perpendicular. Describe the result in one sentence so that a junior
high school student would understand it.

Problem 1.4: Write the vector $F = [2, 3, 4]^T$ as a sum of a vector
parallel to $v = [1, 1, 1]^T$ and a vector perpendicular to v. If we interpret F
as a force acting on a kite of mass 1 and v as the velocity then F · v has
an interpretation as power, the rate of change of the energy of the kite.
The vector parallel to v would by Newton be the acceleration of the kite.

Problem 1.5: a) Find two vectors in $\mathbb{R}^2$ for which all coordinate entries
are 1 or −1 and which are perpendicular to each other.
b) Design four vectors in $\mathbb{R}^4$ for which all coordinate entries are 1 or −1
and which are all perpendicular to each other.
Optional and need not be turned in: Can you invent a strategy which
allows you for example to find 16 vectors in $\mathbb{R}^{16}$ which are all perpendicular
to each other and still have entries in {−1, 1}?

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

Unit 2: Gauss-Jordan elimination

Lecture
2.1. If an n × m matrix A is multiplied with a vector $x \in \mathbb{R}^m$, we get a new vector Ax
in $\mathbb{R}^n$. The process x → Ax defines a linear map from $\mathbb{R}^m$ to $\mathbb{R}^n$. Given $b \in \mathbb{R}^n$, one
can ask to find x satisfying the system of linear equations Ax = b. Historically,
this gateway to linear algebra was walked through long before matrices were even
known: there are Babylonian and Chinese roots reaching back thousands of years. 1
2.2. The best way to solve the system is to row reduce the augmented matrix
B = [A|b]. This is an n × (m + 1) matrix as there are m + 1 columns now. The
Gauss-Jordan elimination algorithm produces from a matrix B a row reduced matrix
rref(B). The algorithm allows three operations: subtract a row from another
row, scale a row and swap two rows. If we look at the system of equations, all
these operations preserve the solution space. We aim to produce leading ones,
which are matrix entries 1 which are the first non-zero entry in a row. The goal
is to get to a matrix which is in row reduced echelon form. This means: A) every
row which is not zero has a leading one, B) every column with a leading 1 has no other
non-zero entries besides the leading one. The third condition is C) every row above a
row with a leading one has a leading one to the left.
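The elimination process can be sketched in a few lines of Python. This is one possible implementation, not necessarily the order of steps one would choose by hand: it works column by column, always scales to a leading 1 right away, and uses exact fractions from the standard library so that no rounding occurs. The function name `rref` follows the notation of the text; everything else is an illustrative choice.

```python
from fractions import Fraction

def rref(M):
    """Row reduced echelon form via swap, scale and subtract, using exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0                                        # row where the next leading 1 goes
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue                             # no leading 1 in this column
        A[r], A[pivot] = A[pivot], A[r]          # swap two rows
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]          # scale the row to get a leading 1
        for i in range(rows):                    # clear the rest of the column
            if i != r and A[i][c] != 0:
                factor = A[i][c]
                A[i] = [x - factor * y for x, y in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A

R = rref([[3, 4, 5, 6, 7],
          [20, 30, 40, 50, 60],
          [1, 2, 3, 4, 5]])
# R has the rows [1, 0, -1, -2, -3], [0, 1, 2, 3, 4], [0, 0, 0, 0, 0]
```

Subtracting a multiple of a row is a combination of the scaling and subtraction steps allowed by the algorithm.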
2.3. We will practice the process in class and homework. Here is a theorem:
Theorem: Every matrix A has a unique row reduced echelon form.

Proof. 2 We use the method of induction with respect to the number m of columns
in the matrix. The induction base is the case m = 1 where only one column
exists. By condition B) there can either be zero or one entry different from zero. If there
is none, we have the zero column. If it is non-zero, it has to be at the top by condition
C). We are in row reduced echelon form. Now, let us assume that all n × m matrices
have a unique row reduced echelon form. Take an n × (m + 1) matrix [A|b]. It remains
in row reduced echelon form, if the last column b is deleted (see lemma). Removing
the last column and then row reducing is the same as row reducing and then deleting the last
column. So, the columns of A are uniquely determined after row reduction. Now note
that for a row of [A|b] without leading one at the end, all entries are zero so that also
the last entries agree. Assume we have two row reductions $[A'|b']$ and $[A'|c']$ where $A'$
is the row reduction of A. A leading 1 in the last column of $[A'|b']$ happens if and
only if the corresponding row in A was zero. So, also $[A'|c']$ has that leading 1 at the
end. Assume now there is no leading one in the last column and $b'_k \neq c'_k$. We then have
a solution x to the equation $A'_{kq} x_q + A'_{k,q+1} x_{q+1} + \cdots + A'_{k,m} x_m = b'_k$. Since solutions to
equations stay solutions when row reducing, also $A'_{kq} x_q + A'_{k,q+1} x_{q+1} + \cdots + A'_{k,m} x_m = c'_k$.
Therefore $b'_k = c'_k$.

1 For more, look at the exhibit on the website: google “catch 22 Harvard” to get there
2 The proof is well known: see e.g. Thomas Yuster, Mathematics Magazine, 1984
2.4. A separate lemma allows us to break up the proof:

Lemma: If [A|b] is row reduced, then A is row reduced.

Proof. We have to check the three conditions which define row reduced echelon form.

2.5. It is not true that if A is in row reduced echelon form, then any sub-matrix is in
row reduced echelon form. Can you find an example?

Examples
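2.6. To row reduce, we use the three steps and document them on the right. To save space,
we sometimes report only after having done two steps. We mark the leading 1's in the final matrix. Note
that we did not immediately go to the leading 1 by scaling the first row. It is a good idea
to avoid fractions as much as possible.
$$ \begin{bmatrix} 3 & 4 & 5 & 6 & 7 \\ 20 & 30 & 40 & 50 & 60 \\ 1 & 2 & 3 & 4 & 5 \end{bmatrix} \begin{matrix} \to \text{row 3} \\ {} \\ \to \text{row 1} \end{matrix} \qquad \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 20 & 30 & 40 & 50 & 60 \\ 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{matrix} {} \\ \ast 1/10 \\ {} \end{matrix} $$
$$ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 6 & 7 \end{bmatrix} \begin{matrix} {} \\ -R_1 \\ -R_2 \end{matrix} \qquad \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} \begin{matrix} {} \\ -R_1 \\ -R_2 \end{matrix} $$
$$ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 0 & -1 & -2 & -3 & -4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{matrix} +2R_2 \\ \ast(-1) \\ {} \end{matrix} \qquad \begin{bmatrix} \boxed{1} & 0 & -1 & -2 & -3 \\ 0 & \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$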
2.7. Finish the following Sudoku problem which is a game where one has to fix
matrices. The rules are that in each of the four 2 × 2 sub-squares, in each of the
four rows and each of the four columns, the entries 1 to 4 have to appear and so
add up to 10:
$$ \begin{bmatrix} 2 & 1 & x & 3 \\ 3 & y & z & 1 \\ 4 & 3 & a & 2 \\ b & c & d & e \end{bmatrix} . $$
We have the equations 2 + 1 + x + 3 = 10, 3 + y + z + 1 = 10, 4 + 3 + a + 2 = 10,
b + c + d + e = 10 for the rows, 2 + 3 + 4 + b = 10, 1 + y + 3 + c = 10,
x + z + a + d = 10, 3 + 1 + 2 + e = 10 for the columns and
2 + 1 + 3 + y = 10, x + 3 + z + 1 = 10, 4 + 3 + b + c = 10, a + 2 + d + e = 10 for the
four squares. We could solve the system by writing down the corresponding augmented
matrix and then do row reduction. The solution is
$$ \begin{bmatrix} 2 & 1 & 4 & 3 \\ 3 & 4 & 2 & 1 \\ 4 & 3 & 1 & 2 \\ 1 & 2 & 3 & 4 \end{bmatrix} . $$
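Whether a filled grid obeys the rules can be checked mechanically. The sketch below collects the four rows, four columns and four 2 × 2 sub-squares of a 4 × 4 grid and tests each group; the function names are illustrative.

```python
def groups(S):
    """All rows, columns and 2 x 2 sub-squares of a 4 x 4 grid."""
    rows = [list(r) for r in S]
    cols = [list(c) for c in zip(*S)]
    squares = [[S[i + di][j + dj] for di in (0, 1) for dj in (0, 1)]
               for i in (0, 2) for j in (0, 2)]
    return rows + cols + squares

def is_solved(S):
    """Each of the 12 groups must contain the entries 1 to 4 exactly once."""
    return all(sorted(g) == [1, 2, 3, 4] for g in groups(S))

S = [[2, 1, 4, 3],
     [3, 4, 2, 1],
     [4, 3, 1, 2],
     [1, 2, 3, 4]]
```

Note that the sum condition alone is weaker than the rule: a group like 2, 2, 3, 3 also adds up to 10.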
Illustrations
The system of equations

x + u = 3

y + v = 5

z + w = 9

x + y + z = 8

u + v + w = 9
is a tomography problem. These problems appear in magnetic resonance imaging.
A precursor was X-ray Computed Tomography (CT) for which Allan MacLeod Cormack got the Nobel Prize in 1979
(Cormack had a sabbatical at Harvard in 1956-1957, where the idea hatched). Cormack lived until 1998 in Winchester
MA. He originally had been a physicist. His work had tremendous impact on medicine.

[Figure: the six unknown densities arranged in a 3 × 2 grid with rows (x, u), (y, v), (z, w).]

Figure 1. An MRI scanner can measure averages of tissue densities
along lines. MRI (Magnetic Resonance Imaging) is a radiology imaging
technique that avoids radiation exposure to the patient. Solving a system
of equations allows to compute the actual densities and so to do the
magic of “seeing inside the body”.

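We build the augmented matrix [A|b] and row reduce. First remove the sum of the
first three rows from the 4th row, then change the sign of the 4th row. Then subtract
the 4th row from the first and from the last row:
$$ \left[\begin{array}{cccccc|c} 1 & 0 & 0 & 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 1 & 1 & 1 & 0 & 0 & 0 & 8 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \end{array}\right] \Rightarrow \left[\begin{array}{cccccc|c} 1 & 0 & 0 & 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \end{array}\right] \Rightarrow \left[\begin{array}{cccccc|c} 1 & 0 & 0 & 0 & -1 & -1 & -6 \\ 0 & 1 & 0 & 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 1 & 1 & 1 & 9 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right] $$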
Now we can read off the solutions. We see that v and w can be chosen freely. They are
free variables. We write v = r and w = s. Then just solve for the variables:

x = −6 + r + s
y = 5 − r
z = 9 − s
u = 9 − r − s
v = r
w = s
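One can confirm that every choice of the free variables gives a solution by plugging the family back into the five equations. A minimal sketch, with illustrative function names:

```python
def residuals(x, y, z, u, v, w):
    """Left side minus right side for each of the five tomography equations."""
    return [x + u - 3, y + v - 5, z + w - 9, x + y + z - 8, u + v + w - 9]

def solution(r, s):
    """The two-parameter family read off from the row reduced matrix."""
    return (-6 + r + s, 5 - r, 9 - s, 9 - r - s, r, s)

# Check the family on a grid of integer parameter values.
all_zero = all(residuals(*solution(r, s)) == [0] * 5
               for r in range(-5, 6) for s in range(-5, 6))
```

Checking finitely many parameter values is of course only a sanity check; the algebra shows that the residuals vanish identically in r and s.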

Homework

Problem 2.1: For a polyhedron with v vertices, e edges and f triangular
faces Euler proved his famous formula v − e + f = 2. Another
relation 3f = 2e called a Dehn-Sommerville relation holds because each
face meets 3 edges and each edge meets 2 faces. Assume the
number f of triangles is 288. Write down a system of equations for the
unknowns v, e, f in matrix form Ax = b, then solve it to find v and e.

 
Problem 2.2: Row reduce the matrix $A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 0 \\ 1 & 2 & 0 & 0 \end{bmatrix}$.

Problem 2.3: a) In the “Nine Chapters on Arithmetic”, the following
system of equations appeared: 3x + 2y + z = 39, 2x + 3y + z = 34, x + 2y + 3z = 26.
Solve it using row reduction by writing down an augmented matrix
and row reducing it.

Problem 2.4: a) Which of the following matrices are in row reduced
echelon form?
1 1 0 1 0 0 1 0 0 1 0 0
A= ,B= ,C= ,D= .
0 0 1 0 0 1 0 0 0 0 0 1
b) Two n × m matrices in reduced row-echelon form are called of the
same type if they contain the same number of leading 1's in the same
positions. For example, $\begin{bmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ are of the same type.
How many types of 2 × 2 matrices in reduced row-echelon form are there?

 
Problem 2.5: Given $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Compare $\mathrm{rref}(A^T)$ with
$(\mathrm{rref}(A))^T$. Is it true that the transpose of a row reduced matrix is a
row reduced matrix?

Unit 3: Definitions, Theorems and Proofs

Seminar
3.1. Theorems are mathematical statements which can be verified using proofs. Theorems
are the backbone of mathematics. A proof assures that the theorem is true and
remains valid also in the future. Let's look at an example of a theorem. It has already
been known and proven by Euclid of Alexandria. It deals with integers and primes,
positive integers larger than 1 which are only divisible by 1 or themselves. The theorem tells
that every positive integer is either 1 or prime or the product of two or more primes.
To formulate the theorem more elegantly, we extend the notion of product and say
that a prime is the product of k = 1 primes and that the number 1 is a product of k = 0
primes. Also we would say the number 20 = 2 ∗ 2 ∗ 5 is the product of k = 3 primes, even
though the prime 2 appears twice. This is similar to the water molecule H2O = H ∗ H ∗ O
containing k = 3 atoms, as hydrogen H appears twice and oxygen O once. Now, like
every molecule decomposes into atoms, every number decomposes into primes:

Theorem: “Every integer n ≥ 1 is a product of k ≥ 0 primes”.

This is a remarkable statement because there are infinitely many integers. We can
therefore not go through an infinite list and check things for each. It could a priori happen
that for some very large number, like the Fermat number $F_{1000} = 2^{(2^{1000})} + 1$, which
can not even be written down in our universe, 1 the statement would fail.

3.2. In order that such a statement can be verified or refuted, one needs first of all to
make sure that the objects are described by clear definitions. In the above sentence,
this means that we need to know what the “integers” are, what a “product” is and
what “prime numbers” are. This is already tricky in general. Most confusions which
have happened historically in science (and still today!) are based on sloppy definitions. 2

Problem A: What is problematic with the definition: “a vector is an
object with magnitude and direction”? (Quote Britannica)

1 There are fewer than $2^{300}$ elementary particles available in our universe (as far as we know).
2 Amuse yourself and try to find definitions of “entropy”, “multiverse”, “intelligence” or “life”

Problem B: Why is 1 not considered a prime number?

3.3. Once the definitions of the ingredients of the statement are clear, it is helpful to
clarify its meaning. We get intuition by looking at examples. We see for example
that 100 = 2 ∗ 2 ∗ 5 ∗ 5 is indeed a product of prime numbers. We see that 7 is a prime
number. Examples are great but it is important at this stage also to realize:
Principle: Checking a statement by showing a few examples is not a
proof.
We will come back to this later in the course.
Problem C: The following statements are examples of theorems we have
seen in the first two lectures:

Statement                                                    Belongs to theorem

$3^2 + 4^2 = 5^2$

$63 = [3, 4] \cdot [5, 12] \leq 5 \cdot 13 = 65$

[0, 1, 0, 0, 1] can not be row reduced to [0, 0, 1, 1, 0].

3.4. One of the important proof techniques is the principle of mathematical in-
duction. 3 It is mostly applied to integers but it can also be used for matrices as
we have seen in the second lecture. The principle applies for statements S(n) which
depend on a number n.

Principle: S(1) and S(n) ⇒ S(n + 1) implies S(n) for all n ≥ 1.

3.5. Here is an example

Theorem: S(n): $1 + 2 + 3 + \cdots + n = (n^2 + n)/2$.

Proof: the statement S(n) is true for n = 1. Assume S(n) is true. Now S(n + 1) tells
$1 + 2 + \cdots + n + (n + 1) = ((n + 1)^2 + (n + 1))/2$. Using the induction assumption this
means $(n^2 + n)/2 + (n + 1) = ((n + 1)^2 + (n + 1))/2$ which is true, since both sides
are equal to $(n^2 + 3n + 2)/2$. We know therefore
that the statement is true for all n.
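A computer can test the formula for many values of n. In the spirit of the principle stated earlier, such a check is not a proof; it only illustrates what the induction proof guarantees for all n. The function name and the range of tested values below are arbitrary choices.

```python
def S(n):
    """The statement S(n): 1 + 2 + ... + n equals (n^2 + n)/2."""
    return sum(range(1, n + 1)) == (n * n + n) // 2

def step_holds(n):
    """The algebra used in the induction step, checked without fractions."""
    return (n * n + n) + 2 * (n + 1) == (n + 1) ** 2 + (n + 1)

checked = all(S(n) and step_holds(n) for n in range(1, 1001))
```

The integer division is exact because $n^2 + n = n(n + 1)$ is always even.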
3.6. Let's look at the theorem on primes above. In order to make this a statement
which we can extend from n to n + 1, we modify the statement to

Theorem: S(n): Every k ∈ {2, 3, 4, . . . , n} has a prime factorization.

3.7. S(2) is true as {2} only contains one number which is prime. Now assume S(n),
meaning that the statement is true for n, and prove that S(n + 1) is true. There are two
cases: if n + 1 is prime, then S(n + 1) is true. If n + 1 is not prime, then n + 1 = ab
where a and b are numbers larger than 1 but smaller than or equal to n. By the induction
assumption, both a and b decay into primes: $a = p_1 p_2 \cdots p_k$ and $b = q_1 q_2 \cdots q_l$ where
$p_j$ and $q_j$ are primes. Therefore, $n + 1 = p_1 p_2 \cdots p_k q_1 q_2 \cdots q_l$.
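The induction argument translates directly into a recursive procedure: a number is either prime or splits as a product of two smaller numbers, which are then factored in the same way. The following sketch mirrors that argument; the function name `factor` is illustrative, and trial division is used only because it is simple, not because it is efficient.

```python
import math

def factor(n):
    """Return a list of primes whose product is n (the empty list for n = 1)."""
    if n == 1:
        return []                                # a product of k = 0 primes
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:                           # n = a * b with 1 < a, b < n
            return factor(a) + factor(n // a)
    return [n]                                   # no divisor found: n is prime
```

The recursion terminates because both factors are strictly smaller than n, which is exactly where the induction assumption is used.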
3.8. It is important to understand the statement and not overreach it. We have not
proven that every integer has a unique decomposition into prime factors. This was
not known by Euclid (he might not even have thought about it). It was only proven
2000 years later by Gauss. A common mistake which happens in mathematical proofs
is that one cites a theorem which is known but over reaches its scope or forgets one of
the assumptions.
Principle: Do not extend the scope of an already established fact with-
out justification.

3 Already used by Plato and a second order axiom in the Peano axiom system.

3.9. If you think such mistakes happen to rookies only, this is not the case. Leonhard
Euler, probably the greatest mathematician of all times, once attempted a proof of
Fermat's last theorem by working with extended number systems like $\mathbb{Z}[\sqrt{-3}]$ which
are all the numbers of the form $a + \sqrt{-3}\, b$, where a, b are integers. You see, one
can add and multiply such numbers like integers and remain in the class. Euclid's
proof also shows that there is a prime factorization. But there can be different prime
factorizations. An example is $4 = 2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$. A similar mistake
was done by Gabriel Lamé who announced in 1847 a proof of Fermat's last theorem
telling that for n ≥ 3, no solutions to $x^n + y^n = z^n$ exist unless xyz = 0. Lamé's genius
idea was to decompose $x^n + y^n$ into linear factors using numbers satisfying $\xi^n = 1$, so
called roots of unity. Also here, Euclid shows that a prime factorization exists, but it
is also here not unique. The mistake was actually quite important. It led to a “theory
of ideals” by Ernst Kummer which allowed to prove Fermat's last theorem in certain
cases.
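The two factorizations of 4 can be verified with exact integer arithmetic. In the sketch below a number a + b√−3 is stored as the integer pair (a, b); this representation and the function name `mul` are assumptions made for the illustration.

```python
def mul(p, q):
    """Product of a + b*sqrt(-3) and c + d*sqrt(-3), stored as integer pairs."""
    a, b = p
    c, d = q
    # (a + b s)(c + d s) = ac + bd s^2 + (ad + bc) s  with  s^2 = -3
    return (a * c - 3 * b * d, a * d + b * c)

four_first = mul((2, 0), (2, 0))      # 2 * 2
four_second = mul((1, 1), (1, -1))    # (1 + sqrt(-3)) (1 - sqrt(-3))
```

Both products give the pair (4, 0), so the number 4 is reached in two different ways.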
Principle: Mistakes can open new doors and find ideas. A creative
search process can lead to mistakes at first. 4

3.10. Of course, we have to try to avoid mistakes in the final product at all costs. Euler
certainly earned the right to make some mistakes by creating a lot of mathematics,
which will remain true for all eternity. But mistakes can be much more basic. Here is
a beautiful example due to Polya: 5

Theorem: S(n): In a collection of n horses all have the same color.

Proof: The induction assumption is clear as for n=1, all horses have the same color.
Now assume the statement is true for all groups of n horses. Now take n+1 horses and
take the first away. These are n horses so that all have the same color. Now put the
first back and take the last one away. Again we have n horses, so that all have the
same color. Therefore all have the same color.
Problem D: What is wrong in the proof of Polya’s horse theorem?

Here are some more amusements:

Theorem: Cats have nine tails.

3.11. Proof: No cat has no tail. A cat with a tail has a tail more than no cat. No cat
has eight tails. Therefore, cats have nine tails.
4 See Mario Livio: Brilliant Blunders, 2013
5 George Polya: Induction and Analogy in Mathematics, 1954 (Thanks to Jun Hou Fung for the suggestion)
3.12. For the following definition of “Prime numbers” we follow 6:
A prime is a number with no divisors.
Boxes of chocolates always contain a prime number
so that, whatever the number of people present
somebody has to have that one left over.
3.13. Why do we start to do induction at n = 1 and not from the other end? The
following song explains why. Just as a bit of background to appreciate the song:
Aleph-Null = $\aleph_0$ is the cardinality of the natural numbers N. $\aleph_1$ is the next larger
cardinality. The cardinality of the real numbers R is $2^{\aleph_0}$ (as the Cantor diagonal
argument shows that the real numbers can not be counted) which is the cardinality of
all subsets of natural numbers. Cantor had shown that there are different infinities.
A beautiful mind like Cantor of course asked whether there is an infinity in between
these two infinities.

The statement $2^{\aleph_0} = \aleph_1$ is the continuum hypothesis abbreviated CH. Work of Kurt
Gödel (1940) and Paul Cohen (in the sixties) shows that one can prove neither the statement nor its
negation from ZFC set theory (an axiom system of our standard mathematics from
which one can derive the Peano axioms including the principle of induction). Cantor
had for a long time tried to prove CH, in vain. We know now that his efforts to prove
this were doomed from the beginning. This possibility always exists. There is the
possibility (though very unlikely) that we can not prove that every even number larger
than 2 is a sum of two primes, even if it were true! 7 The continuum
hypothesis problem had been the first of Hilbert's problems of 1900.

Aleph-null bottles of beer on the wall,

Aleph-null bottles of beer,
You take one down, and pass it around,
Aleph-null bottles of beer on the wall.
3.14. And here is another Ainsley quote:

At the end of a proof you write Q.E.D,

which stands not for
Quod Erat Demonstrandum
as the books would have you believe, but
for Quite Easily Done.

6 R. Ainsley: “Bluff your way in maths”, 1990

7 See Apostolos Doxiadis: Uncle Petros and the Goldbach conjecture, Novel of 1992

Homework
Exercises A)-D) are done in the seminar. This homework is due on Tuesday:

Problem 3.1 Write down a proof by induction showing that 1 + 3 + 5 +

7 + · · · + (2n − 1) = n2 for every integer n ≥ 1.

Problem 3.2 Given an n × n matrix A, its trace is defined as the sum
of the diagonal elements $\sum_k A_{kk}$. We can define in M(n, m) the inner
product $\mathrm{tr}(A^T B)$. First check that this inner product makes sense and
that $A^T B$ is indeed a square matrix. Repeat each step of the proof of the
Cauchy-Schwarz inequality and see that it still works.

Problem 3.3 Let us define a vector $v \star w = (v \cdot w) v / |v|^2$. It is called the
vector projection of w onto v.
a) Is the operation $\star$ commutative?
b) Is the operation $\star$ associative?
c) Verify that v is perpendicular to $w - (v \star w)$.

Problem 3.4 Try to design yourself an elementary geometric proof of
the Pythagorean theorem which does not use any algebra. First try this
without looking it up. Then look up one of the many proofs available and pick
the one you like most and write or draw it out.

Problem 3.5 Given an n × m matrix A, assume that rref(A) has r leading
1's and that rref([A|b]) has s leading 1's. What condition on r and s and
n and m implies that the system of equations Ax = b has no solution?
Experiment first with small examples.

Unit 4: Cross product

Lecture
4.1. The three dimensional space $\mathbb{R}^3$ is special. It is not only the only Euclidean space
in which the Kepler problem is stable 1, it also features a cross product v × w which
is in the same space. Such a product can be defined in $\mathbb{R}^n$ but it produces a vector in
$\mathbb{R}^{n(n-1)/2}$. It happens for n = 3 that the result is again in $\mathbb{R}^3$. The problem of
“multiplying triplets” was pondered by William Hamilton in the first half of the
19th century and is related to the fascinating story of quaternions. The discovery of
quaternions was simultaneously the birth place of the dot and cross product.
4.2. The cross product of two vectors $v = [v_1, v_2, v_3]^T$ and $w = [w_1, w_2, w_3]^T$ is
$$ \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \times \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} v_2 w_3 - v_3 w_2 \\ v_3 w_1 - v_1 w_3 \\ v_1 w_2 - v_2 w_1 \end{bmatrix} . $$
Take the dot product with v or w to see that v × w is perpendicular to both v and w.
Obvious is also v × w = −w × v. The product is handy for constructions in $\mathbb{R}^3$. The
vectors v, w, v × w are oriented like the first three fingers on the right hand: if v is the
thumb, w is the pointing finger, then v × w is the middle finger. Let $v \cdot w = |v||w| \cos(\alpha)$:

Theorem: |v × w| = |v||w| sin(α) and v · (v × w) = w · (v × w) = 0.

Proof. We will verify in class by brute force Lagrange's identity $|v \times w|^2 =
|v|^2 |w|^2 - (v \cdot w)^2$ which is also called Cauchy-Binet formula. Now use $v \cdot w =
|v||w| \cos(\alpha)$ to get the result with $\cos^2(\alpha) + \sin^2(\alpha) = 1$.
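The statements of the theorem can be checked numerically for concrete vectors. The sketch below implements the cross product componentwise and tests orthogonality, anti-commutativity and Lagrange's identity on two sample integer vectors; the function names and the sample vectors are illustrative choices.

```python
def cross(v, w):
    """Cross product of two vectors in R^3, given as lists of three numbers."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v, w = [2, 4, 1], [1, -1, 2]
n = cross(v, w)

# Lagrange's identity: |v x w|^2 = |v|^2 |w|^2 - (v . w)^2
lhs = dot(n, n)
rhs = dot(v, v) * dot(w, w) - dot(v, w) ** 2
```

With integer entries all comparisons are exact, so no tolerance is needed.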

4.3. Given a triangle with side lengths a, b, c and angles α, β, γ, where α is opposite
to a etc. We have the following sin-formula

Corollary: $\frac{a}{\sin(\alpha)} = \frac{b}{\sin(\beta)} = \frac{c}{\sin(\gamma)}$.

Proof. We can use the theorem and express twice the area of the triangle as ab sin(γ) or
bc sin(α) or ac sin(β). By equating these three quantities and dividing by abc,
we get the sin-formula.
1 By a theorem of Joseph Bertrand of 1873 and work of Sundman-von Zeipel

4.4. The theorem is useful in applications, for example to define the area of the parallelogram as |v × w|.
That this is justified can be seen in two dimensions:
Corollary: |v × w| is the parallelogram area spanned by v and w.

Proof. Use the formula |v × w| = |v||w| sin(α) and note that |w| sin(α) is the height of
the parallelogram spanned by v and w. The base length is |v|.
4.5. The scalar u · (v × w) is called the triple scalar product of u, v, w. Its sign
defines an orientation of the three vectors. It is also the determinant of the matrix
$$ \begin{bmatrix} u_1 & v_1 & w_1 \\ u_2 & v_2 & w_2 \\ u_3 & v_3 & w_3 \end{bmatrix} . $$
The absolute value of u · (v × w) defines the volume of the parallelepiped spanned
by u, v and w. Without the absolute value, we also speak of signed volume.
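That the triple scalar product agrees with the determinant can be checked on an example. The sketch below computes both for three sample vectors; the determinant is expanded along the first row, and all names as well as the sample vectors are illustrative.

```python
def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

u, v, w = [1, 1, 1], [3, 4, 2], [2, 0, 3]
M = [list(row) for row in zip(u, v, w)]    # the matrix with columns u, v, w
signed_volume = dot(u, cross(v, w))
volume = abs(signed_volume)
```

Swapping two of the vectors flips the sign of the signed volume but leaves the volume unchanged.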
4.6. Side remark: In higher dimensions, the cross product is called exterior product.
One uses ∧ rather than × which is used in three dimensions. If I = (i, j) is a
choice of two elements in {1, 2, . . . , n} and v, w are two vectors in $\mathbb{R}^n$, then $(v \wedge w)_I =
v_i w_j - v_j w_i$. The formula $|v \wedge w| = |v||w| \sin(\alpha)$ still holds and the proof is the same. We
only need again to verify the Cauchy-Binet formula $|v|^2 |w|^2 - (v \cdot w)^2 = |v \wedge w|^2$. But
this is better done using matrices. If A is the matrix which contains v, w as columns,
then $\det(A^T A) = \sum_P \det(A_P)^2$, where the sum on the right is over all 2 × 2 submatrices
$A_P$ of A. The expression $\det(A_P)$ is called a minor. The Cauchy-Binet formula is super
cool 2. By the way, if we have k vectors, we can build A ∈ M(n, k), a matrix which has
these vectors as columns. Now, $\det(A^T A)$ is the square of the volume of the parallelepiped spanned
by these vectors. And Cauchy-Binet writes this as a sum of squares of k-dimensional
volumes of projections which is in some sense a generalization of Pythagoras.
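For two vectors the Cauchy-Binet formula can be tested in any dimension. The sketch below compares det(A^T A), written out for an n × 2 matrix A with columns v and w, with the sum of the squared 2 × 2 minors; the function names and the sample vectors in R^4 are illustrative.

```python
from itertools import combinations

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def gram_det(v, w):
    """det(A^T A) for the n x 2 matrix A with columns v and w."""
    return dot(v, v) * dot(w, w) - dot(v, w) ** 2

def sum_of_squared_minors(v, w):
    """Sum over all index pairs i < j of the squared minors v_i w_j - v_j w_i."""
    n = len(v)
    return sum((v[i] * w[j] - v[j] * w[i]) ** 2
               for i, j in combinations(range(n), 2))

v, w = [1, 2, 3, 4], [0, 1, -1, 2]
```

For n = 3 the three minors are, up to order and sign, the components of the cross product, so the identity reduces to Lagrange's identity.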
Examples
4.7. What is the area of the triangle A = (1, 1, 1), B = (3, 5, 2) and C = (2, 0, 3)? We
find the cross product between the vector [2, 4, 1]^T going from A to B and the vector [1, −1, 2]^T going from A to C. The cross product is
\begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \\ -3 \\ -6 \end{bmatrix} .
Its length is 3√14. The area of the triangle is half of it: 3√14/2.
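4.8. Find the volume of the parallelepiped for which one of the vertices is O = (0, 0, 0) and the neighboring vertices are A and C from before and B = (3, 4, 2). We find the signed volume
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \cdot \left( \begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix} \times \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 12 \\ -5 \\ -8 \end{bmatrix} = -1
and take the absolute value. A negative number indicates that OA, OB, OC is left handed.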
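The two computations above can be reproduced in a few lines. This is a small sketch in plain Python, with helper names (`cross`, `dot`, `triangle_area`, `signed_volume`) chosen here for illustration; the input vectors are the ones appearing in the displayed formulas.

```python
import math

def cross(v, w):
    """Cross product of two vectors in R^3."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def triangle_area(A, B, C):
    """Half the length of AB x AC."""
    AB = [b - a for a, b in zip(A, B)]
    AC = [c - a for a, c in zip(A, C)]
    n = cross(AB, AC)
    return math.sqrt(dot(n, n)) / 2

def signed_volume(u, v, w):
    """Triple scalar product u . (v x w)."""
    return dot(u, cross(v, w))

print(cross([2, 4, 1], [1, -1, 2]))
print(triangle_area((1, 1, 1), (3, 5, 2), (2, 0, 3)))
print(signed_volume([1, 1, 1], [3, 4, 2], [2, 0, 3]))
```

The sign of `signed_volume` encodes the orientation; taking `abs` gives the volume.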
2O. Knill, Cauchy Binet for pseudo-determinants, Lin. Alg. and its Applications 459 (2014) 522-547
Illustrations

Figure 1. The newly released Swiss 200 franc bill shows the right hand rule: thumb = v, pointing finger = w, then v × w is the middle finger. Source: Swiss National Bank, issued August 22, 2018.

Figure 2. The Lorentz force F is a vector F = qv × B determined by the velocity v of a charged particle with charge q moving in a magnetic field B.

Figure 3. Given a particle of mass m at position r moving with the velocity r′, then L = m r × r′ is the angular momentum.

Homework

Problem 4.1: Find a vector w perpendicular to the vectors u = [1, 1, 1]T

and v = [3, 4, 5]T . Then use this result to find a vector x perpendicular to
both v and w.

Problem 4.2: A 3D scanner is used to build a 3D model of a face.

It detects a triangle which has its vertices at P = (0, 1, 1), Q = (1, 1, 0)
and R = (1, 2, 3). Find the area of that triangle as well as a vector
perpendicular to the triangle. (*)

Problem 4.3: a) Find the volume of the parallelepiped which has

the vertices O = (0, 0, 0), P = (2, 3, 1), Q = (4, 3, 1), R = (6, 6, 2).
A = (1, 1, 1), B = (3, 4, 2), C = (5, 4, 2), D = (7, 7, 3).

Problem 4.4: Investigate which of the following formulas are always
true for all vectors u, v, w, x, y. If it is true, either explain, cite a source
(e.g. on the web), or give a verification by hand or with computer algebra. If it is
not true, find a counterexample.
a) u × (v × w) = (u × v) × w
b) u · (v × w) = v · (w × u)
c) u × (v + w) = u × v + u × w
d) u × (v × w) = (u · w)v − (u · v)w.
e) (u × v) · (x × y) = (u · x)(v · y) − (u · y)(v · x).

Problem 4.5: Given two vectors p = [a, b, c]^T and q = [u, v, w]^T, build the matrices
P = \begin{bmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{bmatrix}, \quad Q = \begin{bmatrix} 0 & u & v \\ -u & 0 & w \\ -v & -w & 0 \end{bmatrix} .
Compare p × q and QP − PQ. Describe what you see.
(*) The STL format which is used for 3D printing, has an extremely simple form. It consists of entries like
facet normal 0.15 -0.97 -0.20
outer loop
vertex -1.6996 -0.5597 -2.8360
vertex -1.8259 -0.5793 -2.8374
vertex -1.7232 -0.5399 -2.9509
endloop
endfacet
The first line gives the normal vector, then there is a loop with three vertices giving the triangle. There is obviously
some redundancy as one could get the normal vector from the points using the cross product. But there is a purpose:
first, the redundant information makes working with the data structure faster. Second, one can also look at situations where
the normal vector is not perpendicular to the surface; one can change the way the surface is “shaded”, like how light is
reflected at the surface. Third, redundancy is always good to catch errors. Our genetic information in the DNA is
stored in a highly redundant way. This allows error correction.

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 5: Surfaces

Lecture
5.1. If A is a matrix, the solution space of a system of equations Ax = b is called a linear manifold. It is the set of solutions of Ax = 0 translated so that it passes through one of the points. The equation 3x + 2y = 6 for example describes a line in R^2 passing through (2, 0) and (0, 3). The solutions to Ax = 0 form a linear space, meaning that we can add or scale solutions and still have solutions. We can rephrase this by saying that a linear space is a linear manifold which contains 0. For example, for x + 2y + 3z = 6 we get a plane which is parallel to the plane x + 2y + 3z = 0. The former is a linear manifold (also called affine space), the latter is a linear space. It is the solution space to Ax = 0 with A = [1 2 3] and x = [x, y, z]^T. Both planes are perpendicular to n = [1, 2, 3]^T. To find an equation for the plane through 3 points P, Q, R, define n = PQ × PR = [a, b, c]^T, then write down ax + by + cz = d, where d is obtained by plugging in a point. The cross product comes in handy.
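The recipe for the plane through three points can be sketched as follows. The helper name `plane_through` and the sample points are assumptions made for illustration.

```python
def cross(v, w):
    """Cross product of two vectors in R^3."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def plane_through(P, Q, R):
    """Return (n, d) such that the plane is n[0]*x + n[1]*y + n[2]*z = d."""
    PQ = [q - p for p, q in zip(P, Q)]
    PR = [r - p for p, r in zip(P, R)]
    n = cross(PQ, PR)                       # normal vector [a, b, c]
    d = sum(a * x for a, x in zip(n, P))    # plug in the point P
    return n, d

# Sample points (hypothetical choice): the origin, (1, 1, 1) and (3, 4, 5).
n, d = plane_through((0, 0, 0), (1, 1, 1), (3, 4, 5))
print(n, d)
```

If the three points are collinear, the cross product is the zero vector and no unique plane exists; the sketch does not guard against that case.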

5.2. The following important example deals with A = [a1 , . . . , am ] in M (1, m).

Theorem: The vector n = AT is perpendicular to the plane Ax = d.

Proof. Given two points y, z in the plane. Then we have Ay = d and Az = d. Then
x = y−z is a vector inside the plane. Now AT ·x = Ax = A(y−z) = Ay−Az = d−d = 0.
This means that x is perpendicular to the vector AT .
In three dimensions, this means that the plane ax + by + cz = d has a normal vector
AT = n = [a, b, c]T . Keep this in mind, especially because R3 is our home.

5.3. This duality result will later be identified as a fundamental theorem of
linear algebra. It will be important in data fitting for example. The kernel of a
matrix A is the linear space of all solutions of Ax = 0. The kernel consists of all roots of
A. The image of a matrix A is the linear space of all vectors {Ax}. We abbreviate
ker(A) for the kernel and im(A) for the image. We will come back to this later.

Theorem: The image of AT is perpendicular to the kernel of A.

Proof. If x is in the kernel of A, then Ax = 0. This means that x is perpendicular to each row vector of A. But this means that x is perpendicular to each column vector of A^T. So, x is perpendicular to the image of A^T. This line of argument can be reversed to see that if x is perpendicular to the image of A^T, then it is in the kernel of A.
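A concrete instance of this theorem can be checked by hand or by machine. The sketch below assumes a sample 2 × 3 matrix (not from the lecture); its kernel is spanned by the cross product of the two rows, and every vector A^T y turns out to be perpendicular to it.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def matvec(A, x):
    """Matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]                 # hypothetical example matrix
x = cross(A[0], A[1])           # perpendicular to both rows, so A x = 0
print(matvec(A, x))             # a vector of zeros: x is in ker(A)

for y in ([1, 0], [0, 1], [2, -7]):
    image_vector = matvec(transpose(A), y)   # an element of im(A^T)
    print(dot(image_vector, x))              # always 0
```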
5.4. Given a function f : R^n → R, the solution set {f(x_1, · · · , x_n) = d} is a hypersurface. We often say “surface” even though “surface” is reserved for n = 3. The simplest non-linear surfaces are quadratic manifolds
x · Bx + Ax = d
defined by a symmetric matrix B = B^T, a row vector A and a scalar d. We assume that B is not the zero matrix, or else we are in the case of a linear manifold. For notation, we write
Diag(a, b, c) = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}
and 1 = Diag(1, 1, 1).
5.5. Ellipsoids. For B = 1 and A = 0 and d = 1 we get the sphere |x|^2 = 1. In R^2, a sphere is a circle x^2 + y^2 = 1. In three dimensions we have the familiar sphere x^2 + y^2 + z^2 = 1. A more general ellipsoid with B = Diag(1/a^2, 1/b^2, 1/c^2) is x^2/a^2 + y^2/b^2 + z^2/c^2 = 1. By intersecting with x = 0 or y = 0 or z = 0, we see traces, which are all ellipses.

Figure 1. The sphere x^2 + y^2 + z^2 = 1 and an example of an ellipsoid x^2/a^2 + y^2/b^2 + z^2/c^2 = 1.

5.6. Hyperboloids. For B = Diag(1, 1, −1) and d = 1, we get a one-sheeted hyperboloid x^2 + y^2 − z^2 = 1. For B = Diag(1, 1, −1) and d = −1, we get a two-sheeted hyperboloid x^2 + y^2 − z^2 = −1. A more general hyperboloid is of the form x^2/a^2 + y^2/b^2 − z^2/c^2 = d with d ≠ 0. The intersection with z = 0 gives in the one-sheeted case a circle, in the two-sheeted case nothing. The x = 0 trace and the y = 0 trace are both hyperbolas.
5.7. Paraboloids. For B = Diag(1, 1, 0) and A = [0, 0, −1] and d = 0 we get the paraboloid x^2 + y^2 = z, for B = Diag(1, −1, 0) and A = [0, 0, −1] and d = 0 we get the hyperbolic paraboloid x^2 − y^2 = z. We can recognize paraboloids by intersecting with x = 0 or y = 0 to see parabolas. Intersecting the elliptic paraboloid x^2 + y^2 = z with z = 1 gives an ellipse. Intersecting the hyperbolic paraboloid x^2 − y^2 = z with z = 1 gives a hyperbola.
5.8. Special surfaces. If B = Diag(1, 1, −1) and d = 0, we get a cone x^2 + y^2 − z^2 = 0. For B = Diag(1, 1, 0) and d = 1 we get the cylinder x^2 + y^2 = 1.

Figure 2. The one-sheeted hyperboloid x^2 + y^2 − z^2 = 1 and the two-sheeted hyperboloid x^2 + y^2 − z^2 = −1.

Figure 3. An elliptic paraboloid z = x^2 + y^2 and the hyperbolic paraboloid z = x^2 − y^2.

Figure 4. The cone x^2 + y^2 = z^2 and the cylinder x^2 + y^2 = 1.

5.9. Side remark: The 1-sphere S^1 = {x^2 + y^2 = 1} ⊂ R^2 and the 3-sphere S^3 = {x^2 + y^2 + z^2 + w^2 = 1} ⊂ R^4 carry a multiplication: S^1 is in the complex numbers C = {x + iy} and S^3 is in the quaternions H = {x + iy + jz + kw}. The 1-sphere is the gauge group for electromagnetism, the 3-sphere (also called SU(2)) is responsible for the weak force. No other Euclidean sphere carries a multiplication for which x → x ∗ y is smooth. Michael Atiyah once pointed out that this algebraic particularity might not be a coincidence and might be responsible for the structure of the standard model of elementary particles (one of the most accurate theories ever built by humanity). The strong force appears as one can let a set of 3 × 3 matrices SU(3) act on H. Atiyah suggested that gravity could be related to the octonions O. There, S^7 = {|x| = 1} ⊂ R^8 still carries a multiplication, but it is no longer associative. The list of normed division algebras is R, C, H and O (see footnote 1).

1See the talk of 2010 of Atiyah (https : //www.youtube.com/watch?v = zCCxOE44M M ).


5.10. Given a polynomial p of n variables, one can look at the surface {p(x) = 0}. It
is called a variety.

Figure 5. More examples of varieties, solution sets to polynomial equations. To the left we see the cubic surface x^3 − 3xy^2 − z = 0 called the monkey saddle. To the right we see the torus (3 + x^2 + y^2 + z^2)^2 − 16(x^2 + y^2) = 0, which is an example of a quartic manifold.

Figure 6. The variety x^4 − x^2 + y^2 + z^2 = d for d = −0.02, d = 0 and d = 0.02.

Examples
5.11. Q: Find the plane Σ containing the line x = y = z and the point P = (3, 4, 5).
A: Σ contains Q = (0, 0, 0) and R = (1, 1, 1) and so the vectors v = [1, 1, 1]T and
w = [3, 4, 5]T . The cross product between v and w is [1, −2, 1]T . It is perpendicular to
Σ. So, the equation is x − 2y + z = d, where d can be obtained by plugging in a point
(3, 4, 5). This gives d = 0 so that x − 2y + z = 0.
5.12. Can we identify the surface x^2 + 2x + y^2 − 4y − z^2 + 6z = 0? Completion of
the square gives x^2 + 2x + 1 + y^2 − 4y + 4 − z^2 + 6z − 9 = 1 + 4 − 9 = −4.
Now (x + 1)^2 + (y − 2)^2 − (z − 3)^2 = −4. This is a two-sheeted hyperboloid centered
at (−1, 2, 3).
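A completion of the square is easy to get wrong by a sign, so a quick machine check can help. The sketch below, with function names chosen here for illustration, compares the original quadratic expression with the shifted form on a grid of integer points.

```python
def quadric(x, y, z):
    """Left-hand side of the original equation."""
    return x**2 + 2*x + y**2 - 4*y - z**2 + 6*z

def shifted(x, y, z):
    """Left-hand side after completing the squares."""
    return (x + 1)**2 + (y - 2)**2 - (z - 3)**2

# The two expressions differ by the constant 1 + 4 - 9 = -4 everywhere,
# so quadric = 0 is the same surface as shifted = -4.
grid = range(-3, 4)
print(all(quadric(x, y, z) == shifted(x, y, z) + 4
          for x in grid for y in grid for z in grid))

# A sample point on the surface: center shifted by 2 in the z direction.
print(quadric(-1, 2, 5), shifted(-1, 2, 5))
```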
5.13. Intersecting the cone x^2 + y^2 = z^2 with the plane y = 1 gives a hyperbola z^2 − x^2 = 1. Intersection with z = 1 gives a circle x^2 + y^2 = 1. Intersecting with z = x + 1 gives y^2 = 2x + 1, a parabola. Because cutting a cone with a plane can give a hyperbola, an ellipse or a parabola, one calls these curves conic sections.
5.14. The case of singular quadratic manifolds is even richer: x^2 − y^2 = 1 is a cylindrical hyperboloid, x^2 − y^2 = 0 is a union of two planes x − y = 0 and x + y = 0. The surface x^2 = 1 is a union of two parallel planes, the surface x^2 = 0 is a plane.
Homework

Problem 5.1: a) What kind of curve is x^2 + 2x + y^2 + 1 = 0?

b) What surface is x^2 + y^2 − 4y + z^2 + 8z = 100?

Problem 5.2: What kind of curves can you get when you intersect a
hyperbolic paraboloid x^2 − y^2 = z with a plane?

Problem 5.3: Find explicit planes which, when intersected with the
hyperboloid x^2 + 2y^2 − z^2 = 1, produce an ellipse, a hyperbola or a
parabola.

Problem 5.4: Find the equation of a plane which is tangent to the three
unit spheres centered at (3, 4, 5), (1, 1, 1), (2, 3, 4).

Problem 5.5: Build a concrete function f (x, y, z) of three variables such

that some level surface f (x, y, z) = c is a pretzel, a surface with three
holes. Hint: the surface g ∗ h = 0 is the union of the surfaces g = 0 and
h = 0. Now, g ∗ h = c can produce surfaces in which things are glued
nicely. If you should look up a surface on the web or literature, you have
to give the reference. You can use the computer to experiment, or then
describe your strategy in words.

Figure 7. In the pretzel baked to the right we have used a polynomial

f (x, y, z) of degree 12. A problem in algebraic geometry would be to find
the “smallest degree polynomial” which works and then find the most
elegant polynomial.


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 6: Visual proofs

Seminar
6.1. Geometric intuition and pictures allow us to prove results visually. An example:

Figure 1. This is a proof without words.

Problem A: What formula does Figure (1) prove?

6.2. By drawing a rectangle of side length a and b, we can see that the area a ∗ b is
the same as the area b ∗ a. For the cross product or matrices, this is wrong.

Figure 2. A Cuisenaire proof that 4 ∗ 5 = 5 ∗ 4. Four yellow sticks of length 5 have the same area as 5 purple sticks of length 4.

1Cover of the book ”Proofs without words”


6.3. Pictures help to get intuition about a mathematical result. The Pythagorean
theorem was first proven geometrically. The visual proof we look at here could well
have been the first which was found.

Figure 3. A visual proof of the Pythagorean theorem. It is probably

one of the first proofs.

Problem B: Use Figure (3) for a proof of the Pythagorean theorem. You
can either describe in words, or label some parts of the picture. Remember
that we want to show c^2 = a^2 + b^2.

Figure 4. A visual proof of √(ab) ≤ (a + b)/2 (see footnote 2).

6.4. The geometric-arithmetic mean inequality assures that the geometric mean √(ab) is smaller than or equal to the arithmetic mean (a + b)/2. In order to appreciate that proof, we first have to verify an identity relating the lengths a, b cut by the altitude line and the height h.
Problem C: First check why the triangle in Figure 4 has a right angle.
Then use Pythagoras three times to prove ab = h^2. Finally check the
geometric-arithmetic mean inequality.

2C. Gallant, Mathematics Magazine, 50(2), 1977, page 98

6.5.
Theorem: The radius of the inscribed circle in a 3 : 4 : 5 triangle is 1

Problem D: Use Figure (5) from the “9 Chapters” to prove the theorem.

Figure 5. The 3-4-5 triangle. Can you use the picture to prove that a=1?

6.6. Find the formula for the volume of a tetrahedron given by 4 points A, B, C, D.

Figure 6. The tetrahedron volume is 1/6 of a parallelepiped volume. Not only the Egyptians knew it; this figure can also be found in the “nine chapters”. We build a statue which can be 3D printed (see footnote 3).

Problem E: Use Figure (6) to prove that the volume is a sixth of the
volume of the corresponding parallelepiped.
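Taking the factor 1/6 from Figure 6 for granted, the volume of a tetrahedron with vertices A, B, C, D can be computed from the triple scalar product of the three edge vectors leaving A. The following sketch uses an illustrative function name `tetrahedron_volume`; it does not replace the visual proof asked for in Problem E.

```python
def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def tetrahedron_volume(A, B, C, D):
    """|AB . (AC x AD)| / 6, one sixth of the parallelepiped volume."""
    AB = [b - a for a, b in zip(A, B)]
    AC = [c - a for a, c in zip(A, C)]
    AD = [d - a for a, d in zip(A, D)]
    return abs(dot(AB, cross(AC, AD))) / 6

# The corner tetrahedron of the unit cube has volume 1/6.
print(tetrahedron_volume((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))
```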

3”Illustrating Mathematics using 3D printers”, by O. Knill and E. Slavkovsky.


Homework
Exercises A-D are done in the seminar. This homework is due on Tuesday:
Problem 6.1 The 3D Pythagoras theorem states that the square of
the area of ABC is the sum of the squares of the areas of the triangles
OAB, OBC and OCA (which are each half of a rectangle). Use Figure (7)
with A = (a, 0, 0), B = (0, b, 0), C = (0, 0, c) to verify this theorem. Use
the cross product to get the areas.

Figure 7. The 3D Pythagoras theorem.

Problem 6.2 Find a formula for the distance of a point P to a line

through two points A, B. The final formula should not use any trig func-
tions.

Problem 6.3 Find a formula for the distance of a point P to a plane

through three points A, B, C. The final formula should not use any trig
functions.

Problem 6.4 Find a formula for the distance between the line through
two points A, B and the line through two points C, D. The final formula should
not use any trig functions.

Problem 6.5 Look up the rules for quaternion multiplication
(u_0, u_1, u_2, u_3) ∗ (v_0, v_1, v_2, v_3) and verify that (0, v_1, v_2, v_3) ∗ (0, w_1, w_2, w_3) =
(−v · w, v × w). Historically, this is an important identity as the dot and
cross product have been introduced together in the form of quaternions.


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 7: Curves

Lecture
7.1. Given n continuous functions x_j(t) of one variable t, we can look at the vector-valued function r(t) = [x_1(t), . . . , x_n(t)]^T. We call it a parametrized curve. An example is r(t) = [3 + 2t, 4 + 6t], which is a line through the point (3, 4) and containing the vector [2, 6] (see footnote 1). If t is in the parameter interval a ≤ t ≤ b, then the image of r is {r(t) | a ≤ t ≤ b}, which defines a curve in R^n. The curve starts at the point r(a) and ends at the point r(b). Another important example is the circle r(t) = [cos(t), sin(t)], where t is in the interval [0, 2π]. Its image is a circle in the plane R^2. The parametrization r(t) contains more information than the curve itself: the parabolic curve r(t) = [t, t^2] defined on t ∈ [−1, 1] for example is the same as the curve r(t) = [t^3, t^6] for t ∈ [−1, 1], but in the second parametrization, the curve is traveled with different speed. Curves in R^3 can be admired in our physical space, like r(t) = [x(t), y(t), z(t)] = [t cos(t), t sin(t), t], which is a spiral. You can see that this particular curve is contained in the cone x^2 + y^2 = z^2.
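Two of the claims in this paragraph lend themselves to a quick numerical check: that the spiral stays on the cone, and that the two parametrizations of the parabola trace the same set of points. The sketch below uses illustrative function names and a tolerance chosen by assumption.

```python
import math

def spiral(t):
    return (t * math.cos(t), t * math.sin(t), t)

def on_cone(p, tol=1e-9):
    """True if p = (x, y, z) satisfies x^2 + y^2 = z^2 up to rounding."""
    x, y, z = p
    return abs(x * x + y * y - z * z) < tol

def parabola_slow(t):
    return (t, t**2)

def parabola_fast(t):
    return (t**3, t**6)

samples = [k / 10 for k in range(-10, 11)]    # t in [-1, 1]
print(all(on_cone(spiral(t)) for t in samples))

# parabola_fast(t) is the point parabola_slow(s) with s = t^3 in [-1, 1]:
# same image, different speed.
print(all(math.isclose(parabola_fast(t)[1], parabola_slow(t**3)[1], abs_tol=1e-12)
          for t in samples))
```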
7.2. If the functions t → x_j(t) are differentiable, we can form the derivative r′(t) = [x_1′(t), . . . , x_n′(t)], called the velocity. While this technically is again a curve, we think of r′(t) as a vector attached to the point r(t) and say that r′(t) is tangent to the curve at r(t). The length |r′(t)| of the velocity is called the speed of r. If also higher derivatives of the functions x_j(t) exist, we can form the second derivative r″(t) called the acceleration or the third derivative r‴(t) = r^(3)(t) called the jerk. Then come snap r^(4)(t), crackle r^(5)(t) and pop r^(6)(t) and the Harvard r^(7)(t) introduced in the fall of 2016 in a multi-variable exam.
7.3. Given the first derivative function r′(t) as well as the initial point r(0), we can get back the function r(t) thanks to the fundamental theorem of calculus. Because of Newton’s law which tells that a mass point of mass m subject to a force field F depending on position and velocity satisfies the Newtonian differential equation m r″(t) = F(r(t), r′(t)), the following result is important:

Theorem: r(t) is uniquely determined from r″(t) and r(0) and r′(0).

Proof. In each coordinate we get x_k′(t) = ∫_0^t x_k″(s) ds + x_k′(0) and x_k(t) = ∫_0^t x_k′(s) ds + x_k(0). We have just applied twice the fundamental theorem of calculus.
1To reduce clutter, we write row vectors [2, 6] rather than column vectors

A special case is when r″(t) is constant, as in the free fall situation. The
coordinate functions are then quadratic. Assume r″(t) = [0, 0, −10], r′(0) = [0, 0, 0]
and r(0) = [0, 0, 20], then r(t) = [0, 0, 20 − 5t^2]. If you jump from 20 meters into a
pool, you need t = 2 seconds to hit the water.
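For constant acceleration, integrating twice gives r(t) = r(0) + r′(0) t + r″ t^2/2 in each coordinate. A minimal sketch of the free fall numbers, with an illustrative function name `position`:

```python
import math

def position(t, r0, v0, a):
    """r(t) for constant acceleration a, initial position r0 and velocity v0."""
    return [p + v * t + 0.5 * acc * t * t for p, v, acc in zip(r0, v0, a)]

r0 = [0, 0, 20]     # start 20 meters above the water
v0 = [0, 0, 0]      # jump from rest
a = [0, 0, -10]     # gravitational acceleration used in the text

print(position(1.0, r0, v0, a))     # height 15 after one second
print(position(2.0, r0, v0, a))     # height 0: the water is reached

# The hitting time solves 20 - 5 t^2 = 0.
print(math.sqrt(2 * 20 / 10))
```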
7.4. Given a curve r(t) for which the velocity r′(t) is never zero, we can form the
unit tangent vector T(t) = r′(t)/|r′(t)|. If T′(t) is never zero, we can then form
N(t) = T′(t)/|T′(t)|, the normal vector. The vector B = T × N is called the
binormal vector. The scalar K(t) = |T′(t)|/|r′(t)| is called the curvature of the curve.

Theorem: In R^3, we have K = |T′|/|r′| = |r′ × r″|/|r′|^3.

Proof. We will do this computation in class.
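The cross product formula for the curvature is convenient for computations, since it only needs the velocity and the acceleration at one point. The sketch below (function names are illustrative) evaluates it for a circle of radius 3 and for the curve [t^2, t^3, 0] at t = 1.

```python
import math

def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def curvature(velocity, acceleration):
    """K = |r' x r''| / |r'|^3 for a curve in R^3 with nonzero velocity."""
    return norm(cross(velocity, acceleration)) / norm(velocity) ** 3

# Circle of radius L = 3, evaluated at an arbitrary time t.
L, t = 3.0, 0.7
rp = [-L * math.sin(t), L * math.cos(t), 0.0]
rpp = [-L * math.cos(t), -L * math.sin(t), 0.0]
print(curvature(rp, rpp))       # close to 1/3

# The curve [t^2, t^3, 0] at t = 1: velocity [2, 3, 0], acceleration [2, 6, 0].
print(curvature([2, 3, 0], [2, 6, 0]))      # equals 6 / 13^(3/2)
```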

7.5. Even if r(t) is perfectly smooth, the curvature can become infinite. Let's look
at the example r(t) = [t^2, t^3, 0]. Then r′(t) = [2t, 3t^2, 0] and r″(t) = [2, 6t, 0] and
r′(t) × r″(t) = [0, 0, 6t^2]. Since |r′(t)| = |t|(4 + 9t^2)^(1/2), the curvature is (6/|t|)(4 + 9t^2)^(−3/2), which has a singularity at
t = 0.
7.6. Even when r(t) is perfectly smooth and r′(t) is never zero, the normal vector can depend in a discontinuous way on t. Example: r(t) = [t, t^3/3]. Now r′(t) = [1, t^2] and T(t) = [1, t^2]/√(1 + t^4). We see that T′(t) takes different signs in the second coordinate. After normalization we have lim_{t→0,t>0} N(t) = [0, 1] and lim_{t→0,t<0} N(t) = [0, −1]. At the
inflection point of the graph of the cube function, the concavity has changed from
concave down to concave up. This has changed the direction of the normal vector N.
7.7. Side remark. We have looked at parametrized vectors only. If the entries A_ij(t)
of a matrix depend on time, we have a matrix valued curve A(t). This appears in
differential equations, in quantum mechanics (operators moving in time) or - most
importantly - in moving pictures! A movie is just a matrix valued curve.
7.8. Side remark. A planar curve r(t) = [x(t), y(t)]^T defined on t ∈ [0, 2π] is called a simple closed curve if r(0) = r(2π) and there are no values 0 ≤ s ≠ t < 2π for which r(t) = r(s). For a smooth curve, meaning that the first two derivatives exist, we can look at the polar angle α(t) of the vector r′(t). Define the signed curvature of the curve as κ(t) = α′(t)/|r′(t)|. We have |κ(t)| = K(t). The Hopf Umlaufsatz tells ∫_0^{2π} κ(t) |r′(t)| dt = 2π for a curve traversed counterclockwise. In the case of the unit circle r(t) = [cos(t), sin(t)] for example, κ(t) = 1 and |r′(t)| = 1.
7.9. Side remark. We can verify that any curve r(t) parametrized on [a, b] such that r′(t) ≠ 0 for all t ∈ [a, b] can be reparametrized as R(t) on an interval [0, L] such that |R′(t)| = 1 for all t. Proof: we look for a monotone function s(t) such that the derivative of R(t) = r(s(t)) has length 1. This means we want |r′(s(t))| s′(t) = 1. In other words, look for a function s(t) such that s′(t) = 1/|r′(s(t))| = F(s(t)) and s(0) = a. This is what we call a differential equation. There is a general existence theorem for differential equations (proven later) which assures that there exists a unique solution s(t). End of proof. The result is very intuitive. You can drive from r(a) to r(b) along the curve traced by r(t) by just keeping the speed 1. This gives you your new parametrization. Your new time interval will be [0, L], where L is the arc length (the length of your trip). We will come to arc length computation in the next lesson.
7.10. Side remark. Continuous curves can be complicated: If you look at a pollen particle in a microscope, it moves erratically on a curve which is nowhere differentiable, as it is constantly bombarded with air molecules which bounce it around. This is Brownian motion. There are also Peano curves or Hilbert curves [0, 1] → [0, 1]^2 or space filling Hilbert curves r(t) : [0, 1] → Q = [0, 1]^3 which cover every point of the cube Q. These curves define a continuous surjection from [0, 1] onto [0, 1]^3. (They are not injective: a continuous bijection between these compact sets would have a continuous inverse, which is impossible. Still, the construction shows that there are at least as many points in [0, 1] as in [0, 1]^3.)

Figure 1. The four first stages in the construction of a space filling curve.

Examples
7.11. Assuming the Newton equations m r″(t) = F(t), find the path r(t) of a body of mass m = 1/2 subject to a force F(t) = [sin(t), cos(t), −10] with r(0) = [3, 4, 5] and r′(0) = [1, 2, 7]. Solution: we have r″(t) = [2 sin(t), 2 cos(t), −20]. Integration gives r′(t) = [−2 cos(t), 2 sin(t), −20t] + [c_1, c_2, c_3]. Fixing the constants gives r′(t) = [3 − 2 cos(t), 2 + 2 sin(t), 7 − 20t]. A second integration gives r(t) = [3t − 2 sin(t), 2t − 2 cos(t), 7t − 10t^2] + [c_1, c_2, c_3] with other constants C = [c_1, c_2, c_3]. Comparing r(0) = [0, −2, 0] + [c_1, c_2, c_3] = [3, 4, 5] gives r(t) = [3 + 3t − 2 sin(t), 6 + 2t − 2 cos(t), 5 + 7t − 10t^2].
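A solution like this one can be double-checked numerically by plugging it back into the Newton equation. The sketch below approximates derivatives with central differences; the step sizes and function names are assumptions made for illustration, so the residual is only expected to be small, not exactly zero.

```python
import math

M = 0.5     # mass from the example

def r(t):
    """The path found in the example."""
    return [3 + 3*t - 2*math.sin(t),
            6 + 2*t - 2*math.cos(t),
            5 + 7*t - 10*t*t]

def force(t):
    return [math.sin(t), math.cos(t), -10.0]

def first_derivative(f, t, h=1e-5):
    """Central difference approximation of f'(t), coordinate by coordinate."""
    return [(p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h))]

def second_derivative(f, t, h=1e-3):
    """Central difference approximation of f''(t), coordinate by coordinate."""
    return [(p - 2 * c + m) / (h * h)
            for p, c, m in zip(f(t + h), f(t), f(t - h))]

def newton_residual(t):
    """Largest coordinate of |m r''(t) - F(t)|; should be close to zero."""
    acc = second_derivative(r, t)
    return max(abs(M * a - F) for a, F in zip(acc, force(t)))

print(r(0.0))                           # initial position
print(first_derivative(r, 0.0))         # approximately the initial velocity
print(max(newton_residual(t) for t in (0.0, 0.5, 1.0, 2.0, 3.0)))
```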
7.12. Let r(t) = [L cos(t), L sin(t), 0]. Then r′(t) = [−L sin(t), L cos(t), 0] and r″(t) = [−L cos(t), −L sin(t), 0] and r′(t) × r″(t) = [0, 0, L^2] and |r′(t)| = L so that |r′(t) × r″(t)|/|r′(t)|^3 = 1/L. A circle of radius L has curvature 1/L!
7.13. A closed simple curve C in R^3 is a knot. For any positive integers n, m we can look at the torus knot r(t) = [(3 + cos(mt)) cos(nt), (3 + cos(mt)) sin(nt), sin(mt)]. The total curvature of a knot is defined as ∫_0^{2π} K(t) dt. See Figure 2 (and footnote 2).

Figure 2. Torus knots T (2, 3), T (7, 3), T (12, 13) and T (30, 43). Their
total curvatures are 38.6, 245.6, 487.2, 2167.3.

2 A general theorem of Fáry and Milnor assures that a knot of total curvature ≤ 4π is trivial.

Homework

Problem 7.1: A stone of mass m = 0.1 in the Pandora Halleluya

mountains is exposed to the force F (t) = [log(e + t), et/100 , sin(t)]. It is
initially at r(0) = [0, 0, 100] and has zero initial velocity r′(0) = [0, 0, 0].
Where is it at t = 10? In this course, we always write log(t) = ln(t) (see footnote 3).

Problem 7.2: We want to produce a logo for a new company and

experiment. Draw the curve r(t) = [cos(t), sin(t)] + [cos(5t), sin(7t)]/4 +
[cos(13t), sin(9t)]/4 and find the velocity, acceleration, and curvature at
t = 0.

Problem 7.3: Parametrize the curve r(t) obtained by intersecting the cylinder x^2/9 + y^2/4 = 1 with the plane z = x + 5y.

Problem 7.4: Verify that the torus knot r(t) = [x(t), y(t), z(t)] = [(2 + cos(mt)) cos(nt), (2 + cos(mt)) sin(nt), sin(mt)] lives on the torus (3 + x^2 + y^2 + z^2)^2 − 16(x^2 + y^2) = 0.

Problem 7.5: In the lecture on surfaces, we have sliced some bagels. Let us assume that the doughnut is given by (x^2 + y^2 + z^2 + 16)^2 − 100(x^2 + y^2) = 0. Verify that if we intersect this torus with the plane 3x = 4z, then we get the Villarceau circles r(t) = [4 cos(t), 3 + 5 sin(t), 3 cos(t)] as well as the circle r(t) = [4 cos(t), −3 + 5 sin(t), 3 cos(t)].

Figure 3. Villarceau circles.


3The notation ln appears only in calculus books. Mathematicians use log.

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 8: Arc length

Lecture
8.1. We assume in this lecture that curves are continuously differentiable, meaning that the velocity is continuous. We would write r ∈ C^1([a, b], R^d). Given a parametrized curve r(t) defined over an interval I = [a, b], its arc length is defined as
L = ∫_a^b |r′(t)| dt .
For f(t) = |r′(t)| the integral is defined as the lim sup (we don’t know yet whether the limit exists),
∫_a^b f(t) dt = lim sup_{n→∞} S_n/n = lim sup_{n→∞} (1/n) Σ_{a ≤ k/n < b} f(k/n) .

This Archimedes integral is a special Riemann integral. It satisfies min(f) ≤ (b − a)^(−1) ∫_a^b f(t) dt ≤ max(f). The intermediate value theorem implies that there is y ∈ [a, b] such that f(y) = (b − a)^(−1) ∫_a^b f(t) dt. The minimum and maximum exist by Bolzano’s extreme value theorem. Related to Bolzano is the Heine-Cantor theorem assuring that a continuous function f on a closed finite interval [a, b] is uniformly continuous: there exists a function M(t) satisfying lim_{t→0} M(t) = 0 with |f(x) − f(y)| ≤ M(|x − y|) for all x, y ∈ [a, b]. Stronger is Lipschitz continuity, which is M(t) = M · t for some constant M. The next proof shows in general that continuous functions are Riemann integrable; the limsup is actually a limit:
Theorem: Arc length exists and is independent of the parameterization.

Proof. (i) To see parameter independence, assume a time change φ(t) with a monotone smooth function φ : [a, b] → [φ(a), φ(b)]. If r(t) on [φ(a), φ(b)] and R(t) = r(φ(t)) on [a, b] are the two parametrizations and f(t) = |r′(t)| and F(t) = |R′(t)| = |r′(φ(t))| φ′(t), then by substitution, the arc length of r(t) is ∫_{φ(a)}^{φ(b)} f(t) dt = ∫_a^b f(φ(t)) φ′(t) dt, which is ∫_a^b F(t) dt, the arc length of R(t).
(ii) From (i) we can assume [a, b] = [0, 1]. By uniform continuity, there are M_n → 0 such that if |y − x| ≤ 1/n, then |f(y) − f(x)| ≤ M_n. The intermediate value theorem gives for every I_k = [x_k, x_{k+1}] = [k/n, (k + 1)/n] ⊂ [0, 1] a y_k ∈ I_k such that ∫_{x_k}^{x_{k+1}} f(x) dx = f(y_k)/n. Now, ∫_0^1 f(x) dx = (1/n) Σ_k f(y_k) and
|S_n/n − ∫_0^1 f(x) dx| = (1/n) |Σ_k [f(x_k) − f(y_k)]| ≤ (1/n) Σ_k |f(x_k) − f(y_k)| ≤ (1/n) Σ_k M_n = M_n → 0.

Examples
8.2. The arc length of the circle r(t) = [R cos(t), R sin(t)] with t ∈ [0, 2π] is ∫_0^{2π} |r′(t)| dt = ∫_0^{2π} R dt = 2πR.
8.3. The arc length of the parabola r(t) = [t, t^2/2] with t ∈ [−1, 1] is ∫_{−1}^{1} √(1 + t^2) dt. We will do this integral in class. The result is √2 + arcsinh(1).
8.4. The arc length of the curve r(t) = [log(t), √2 t, t^2/2] for t ∈ [1, 2] is ∫_1^2 √(1/t^2 + t^2 + 2) dt = ∫_1^2 (t + 1/t) dt = log(2) + 3/2.
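The three arc lengths can be approximated by Riemann sums of the speed. The sketch below uses a left Riemann sum with n equal steps on [a, b], a mild variant of the sum S_n/n from the lecture; the function name `arc_length` and the choice n = 100000 are assumptions made for illustration.

```python
import math

def arc_length(speed, a, b, n=100000):
    """Left Riemann sum of the speed |r'(t)| over [a, b] with n equal steps."""
    h = (b - a) / n
    return h * sum(speed(a + k * h) for k in range(n))

# Circle of radius R = 2: the speed is constant.
print(arc_length(lambda t: 2.0, 0.0, 2 * math.pi), 2 * math.pi * 2)

# Parabola [t, t^2/2] on [-1, 1]: speed sqrt(1 + t^2).
print(arc_length(lambda t: math.sqrt(1 + t * t), -1.0, 1.0),
      math.sqrt(2) + math.asinh(1))

# Curve [log(t), sqrt(2) t, t^2/2] on [1, 2]: speed sqrt(1/t^2 + 2 + t^2).
print(arc_length(lambda t: math.sqrt(1 / t**2 + 2 + t**2), 1.0, 2.0),
      math.log(2) + 1.5)
```

Each line prints the numerical approximation next to the closed form, so the agreement can be inspected directly.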
Illustrations

Figure 1. A polygon approximation of a curve produces a Riemann

sum approximation of the length integral.

Figure 2. A Riemann sum approximation of a continuous function

produces in the limit the “area under the curve”.

Figure 3. Brownian motion produces continuous paths which are not

differentiable. The arc length integral does not exist.
Background (cool, but can be ignored if you like)
8.5. A function f is called Lipschitz continuous on [a, b] if there exists a constant
M such that |f(t) − f(s)| ≤ M|t − s| for all t, s ∈ [a, b]. It turns out that for Lipschitz
functions the derivative f′ exists “almost everywhere”. To make sense of this, a more
mature integration theory is needed. The Riemann integral is totally inadequate and
does not fit the bill. We need the Lebesgue integral. The fundamental theorem of
calculus for Lipschitz functions is known as the Rademacher theorem:

Theorem: If f is Lipschitz, then ∫_a^b f′(t) dt = f(b) − f(a).

8.6. In order to define the Lebesgue integral, one first introduces a so-called σ-algebra A (see footnote 1). It is the smallest set of subsets of R which is closed under the operation of taking countable unions and intersections and complements and which contains the class of intervals. The Lebesgue measure on intervals |[a, b]| = b − a can then be extended to A, where it inherits all the properties we want, like |A ∪ B| = |A| + |B| − |A ∩ B|. For indicator functions, which are functions f(x) = 1_A(x) which are 1 if x ∈ A and 0 else, the Lebesgue integral is defined as ∫ 1_A(x) dx = |A|.
8.7. First write the function f as f^+ − f^−, where f^+ and f^− are both non-negative. This is a simplification because we need to define the integral only for non-negative functions. A simple step function is a finite sum Σ_i a_i 1_{A_i}, with A_i ∈ A. For such functions, define ∫_I f dx = Σ_i a_i |A_i|. The Lebesgue integral is now defined as sup_{g≤f} ∫_I g dx, where the supremum is taken over all simple step functions g smaller than or equal to f. If the supremum is finite, the function is called Lebesgue integrable.
8.8. The Lebesgue integral is also a Monte Carlo integral lim_{n→∞} (1/n) Σ_{a ≤ x_k < b} f(x_k), where x_k are random choices in [a, b]. This is justified by the law of large numbers.
The transition Riemann → Lebesgue replaces a regular lattice k/n with a random one.
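The Monte Carlo point of view is easy to try out. The sketch below draws n uniform random points in [a, b] and averages f over them; multiplying the average by the interval length (b − a) is the normalization assumed here, and the function name, seed and sample size are illustrative choices.

```python
import random

def monte_carlo_integral(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform random samples."""
    rng = random.Random(seed)       # fixed seed so that runs are reproducible
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# The integral of x^2 over [0, 1] is 1/3.
estimate = monte_carlo_integral(lambda x: x * x, 0.0, 1.0, 200000)
print(estimate)
```

By the law of large numbers the error shrinks like 1/√n, which is slow compared to Riemann sums for smooth functions in one dimension; the method shows its strength in higher dimensions.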
8.9. The Lebesgue integral can also integrate non-continuous functions: let g(x) be 0 on rational numbers and 1 on irrational numbers. Then ∫_I g dx = |I| because all except a countable number of x are irrational. The Archimedes sum from above, which samples g only at rational points k/n, would give 0.
8.10. The proof that a continuous function is Lebesgue integrable is even simpler than for the Riemann integral: first again use that f is uniformly continuous on [a, b]: there exists M_n → 0 such that whenever |x − y| ≤ 1/n, also |f(x) − f(y)| ≤ M_n. Take the intervals I_k = [k/n, (k + 1)/n] ∩ [a, b] and step functions g = Σ_k c_k 1_{I_k} and h = Σ_k d_k 1_{I_k}, where c_k is the minimum of f on I_k and d_k the maximum. Now ∫_a^b |g − h| dx ≤ Σ_k |c_k − d_k| |I_k| ≤ M_n Σ_k |I_k| = M_n (b − a). Now f is sandwiched between step functions g, h which for n → ∞ have the same integral.
8.11. We don’t prove Rademacher here. One needs to show that f′ is Lebesgue integrable and that g(x) = f(a) + ∫_a^x f′(t) dt agrees with f(x). In modern language Rademacher tells Lipschitz = Sobolev W^{1,∞}([a, b]) = {f′ ∈ L^∞([a, b])}. More general is absolute continuity = W^{1,1}([a, b]) = {f′ ∈ L^1([a, b])}.

1For details see i.e. O.Knill, Probability theory and stochastic processes, 2011

Homework

Problem 8.1: Find the arc length of the catenary r(t) = [t, cosh(t)],
where cosh(t) = (e^t + e^(−t))/2 is the hyperbolic cosine and t ∈ [−1, 1].
Hint. You can use the identity cosh^2(t) − sinh^2(t) = 1, where sinh(t) =
(e^t − e^(−t))/2 is the hyperbolic sine. We have cosh′ = sinh, sinh′ = cosh.
Galileo was the first to investigate the catenary. It is the curve which a freely hanging heavy rope describes if the end points
have the same height. Galileo mistook the curve for a parabola. It was Johannes Bernoulli in 1691, who obtained
its true form after some competition involving Huygens, Leibniz and two Bernoullis. The name “catenarian” (=chain
curve) was first used by Huygens in a letter to Leibniz in 1690.

Problem 8.2: Find the arc length of the cycloid

r(t) = [t − sin(t), 1 + cos(t)]
from 0 to 2π. The upside down cycloid is the solution to the famous
Brachistochrone problem, the curve along which a ball descends
fastest. Hint. You might want to use the double angle formula
$2 - 2\cos(t) = 4 \sin^2(t/2)$.

Problem 8.3: Find the length of the curve

$r(t) = [12t, 8t^{3/2}, 3t^2]$,
where t ∈ [0, 3].

Problem 8.4: Compute numerically the arc length of the knot r(t) =
[sin(4t), sin(3t), cos(5t), cos(7t)] from t = 0 to t = 2π. By drawing the first
coordinates only and using color as the fourth coordinate, we can see that
there are no non-trivial knots in R4 . You can not tie your shoes in R4 !
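A numerical arc length can be obtained from the polygonal approximation used to define arc length. The following is a minimal sketch under stated assumptions: the helper name `arc_length` and the number of subdivisions are choices made here, and the demonstration curve is the unit circle rather than the knot of the problem. The same function accepts curves with any number of coordinates.

```python
import math

def arc_length(r, a, b, n=100_000):
    """Length of the polygon through r(a), r(a + h), ..., r(b) with h = (b - a)/n."""
    h = (b - a) / n
    total = 0.0
    prev = r(a)
    for k in range(1, n + 1):
        cur = r(a + k * h)
        total += math.dist(prev, cur)    # Euclidean distance, works in any dimension
        prev = cur
    return total

def circle(t):
    return (math.cos(t), math.sin(t))

print(arc_length(circle, 0.0, 2 * math.pi))   # close to 2 pi
```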

Problem 8.5: What is the relation between $|\int_0^1 r'(t) \, dt|$ and
$\int_0^1 |r'(t)| \, dt$? Give an interpretation of both sides.

Figure 4. The catenary and the cycloid.

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 9: Intuition

Seminar
9.1. It is important in mathematics to gain intuition about objects, definitions and
theorems and proofs. The fact that this is not easy can be illustrated by showing that
intuition can mislead us. We can state “false theorems” which we would believe to be
true but which are false. We start with the notion of “continuity” for which an intuitive
definition tells: we can “draw the graph of a continuous function without having to lift
the pen”. Of course, we can not work with this definition to prove theorems.

9.2. Starting with Cauchy and pushed heavily by Weierstrass, continuity is defined
precisely using the infamous $\epsilon$-$\delta$ definition: f is continuous at x if for every $\epsilon > 0$
there exists $\delta > 0$ such that if $|x - y| \le \delta$, then $|f(x) - f(y)| \le \epsilon$. Using more fancy
mathematical quantifier notation ∀ (for all), ∃ (exists), ⇒ (implies) and ∈ (is
element of), you can impress your friends (and annoy readers and graders) by writing
$\forall \epsilon > 0 \; \exists \delta > 0 \; \forall y \in [a, b], \; |x - y| \le \delta \Rightarrow |f(x) - f(y)| \le \epsilon$.
The fact that this definition is not intuitive at all and that most students just learn this
“epsilontic” by intimidation is illustrated by the following variation by Ed Nelson. 1
We make it our first exercise:
Problem A: What does the following statement mean?

$\forall \delta > 0 \; \exists \epsilon > 0 \; \forall y \in [a, b], \; |x - y| \le \delta \Rightarrow |f(x) - f(y)| \le \epsilon$.

9.3. In the first lecture we have seen how a polygonal approximation of a curve allows
us to compute the arc length of a curve. Here is a first “anti-theorem”. Your task is to
figure out what is wrong.

1E. Nelson, Internal set theory: A new approach to nonstandard analysis, 1977

9.4. We compute the circumference of a circle by a polygonal approximation. The
following statement uses the intuition that if a polygon is close to a curve, then its
length is close to the length of the curve:

Figure 1. The circumference of a circle is 8.

9.5. This leads to the following anti-theorem: 2 A continuous planar curve is a function
t → r(t) = [x(t), y(t)], where both functions x(t), y(t) are continuous functions.
False Theorem: The circumference of the unit circle is 8.

Problem B: What is wrong with the argumentation?

9.6. We could also think that the arc length of a continuous curve is finite.
False Theorem: The arc length of a continuous curve is finite.

Figure 2. The first 4 approximations of the Koch snowflake.

Problem C: Find a formula for the length of the k’th Koch curve
approximation if initially, the triangle has side length 1.

2Again thanks to Jun Hou Fung for suggestion

9.7. If a curve t → r(t) = [x(t), y(t)] has the property that x(t) and y(t) stay bounded
and have no jump discontinuities, we would think that the curve is continuous.
False Theorem: A bounded curve without jumps is continuous.

9.8. A counter example is the devil comb $r(t) = [t, \sin(1/t)]$ for $t \in [0, 1]$. It does not
have a jump discontinuity and it is bounded. The function is not defined at t = 0 but
we can define r(0) = [0, 0] to make it defined everywhere on [0, 1].

Problem D: Why is this function r(t) not continuous at t = 0?

9.9. Finally, we could think:

False Theorem: A continuous function is differentiable at some point.

9.10. A counter example was given by Weierstrass. It is called the Weierstrass function.
G.H. Hardy proved in 1916 that the function
$f(x) = \sum_{n=1}^{\infty} a^{-n} \cos(a^n x)$
does not have any point of differentiability if $a > 1$.
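The behavior of this function can be explored numerically with partial sums. The following is a minimal sketch, not a proof: the name `weierstrass`, the cutoff of 40 terms and the sample point are assumptions of this illustration. It prints difference quotients for shrinking step sizes, which can be compared with what one would expect from a differentiable function.

```python
import math

def weierstrass(x, a=2.0, terms=40):
    """Partial sum of sum_{n >= 1} a^(-n) cos(a^n x); the cutoff is an assumption."""
    return sum(a ** (-n) * math.cos(a ** n * x) for n in range(1, terms + 1))

x0 = 1.0
for k in range(4, 20, 3):
    h = 2.0 ** (-k)
    quotient = (weierstrass(x0 + h) - weierstrass(x0)) / h
    print(f"h = 2^-{k:2d}   difference quotient = {quotient: .4f}")
```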

Figure 3. The Weierstrass function for a = 2, displayed on [0, π].

Problem E: Show that $f(x) = \sum_{n=1}^{\infty} 2^{-n} \cos(2^n x) \in [-1, 1]$.

Homework
Exercises A-E are done in the seminar. This homework is due on Tuesday:
Problem 9.1 Prove that there was a time in your life when the length
of your largest tooth in millimeters was your height in meters.

Problem 9.2 Use the intermediate value theorem to prove the mean
value theorem: if f is continuously differentiable and $f(0) = f(1) = 0$,
then there exists a point x in (0, 1) with $f'(x) = 0$.

Problem 9.3 Look up, formulate and understand the proof of the “Wob-
bly table theorem”. This theorem appears to have been found in 2008 by
David Richeson. You find an exposition in some of Harvard Math 1a
handouts.

Problem 9.4 We can draw curves in 4 dimensions by assigning to

each point a Hue value, which is a color parametrized by [0, 1]. Given
a curve (x(t), y(t), z(t)) and a color value c(t) define the curve r(t) =
[x(t), y(t), z(t), c(t)] in four dimensional space. Use this idea to argue why
there are no non-trivial knots in four dimensions.

Figure 4. The “Hue curve” in color space.

Problem 9.5 What does your intuition say? We will come back to this
later when we look at surface area. Given a nice smooth surface S like
a paraboloid which is triangulated with triangles of size . If Sn is the
polygonal approximation, do the surface area $|S_n|$ of the polyhedron
and the surface area of the surface S satisfy $|S_n| \to |S|$?


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 10: Coordinates

Lecture
10.1. It was René Descartes who in 1637 introduced coordinates and brought algebra
close to geometry. 1 The Cartesian coordinates $(x, y)$ in $\mathbb{R}^2$ can be replaced by
other coordinate systems like polar coordinates $(r, \theta)$, where $r = \sqrt{x^2 + y^2} \ge 0$ is
the radial distance to the origin $(0, 0)$ and $\theta \in [0, 2\pi)$ is the polar angle made with the
positive x-axis. Since θ is in the interval [0, 2π), it is best described in the complex
notation $\theta = \arg(x + iy)$. The conversion from the (r, θ) coordinates to the (x, y)-coordinates
is

$x = r \cos(\theta)$
$y = r \sin(\theta)$

The radius is $\sqrt{x^2 + y^2}$, where if non-zero, we always take the positive root. The angle
formula $\arctan(y/x)$ only holds if x and y are both positive. The angle θ is not uniquely
defined at the origin (0, 0); most software just assumes arg(0) = 0.
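The conversion in both directions can be sketched in a few lines. This is an illustration under stated assumptions, with hypothetical helper names `to_polar` and `to_cartesian`; it uses `math.atan2`, which returns an angle in (-π, π], and shifts it into [0, 2π) to match the convention of the text.

```python
import cmath
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta in [0, 2 pi)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)   # atan2 handles all four quadrants
    return r, theta

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

print(to_polar(-1.0, 1.0))          # second quadrant: theta is 3 pi / 4
print(math.atan(1.0 / -1.0))        # naive arctan(y/x) gives -pi/4 for the same point
print(cmath.phase(0j))              # Python's convention for arg(0)
```

The second printed value shows why arctan(y/x) alone is not enough outside the first quadrant.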

10.2. We can write a vector in $\mathbb{R}^2$ also in the form of a complex number $z =
x + iy \in \mathbb{C}$ with some symbol i. This is not only notational convenience. Complex
numbers can be added and multiplied like other numbers and while $\mathbb{R}^2 = \mathbb{C}$, the latter
has a multiplicative structure. In order to fix that structure, one only needs to
specify that $i^2 = -1$. This gives $(a + ib)(c + id) = ac - bd + i(ad + bc)$. An important
observation of Euler is a link between the exponential and trigonometric functions:

Theorem: $e^{i\theta} = \cos(\theta) + i \sin(\theta)$.

10.3. The proof is to write the series definition on both sides. First recall the definition
$e^x = 1 + x + x^2/2! + x^3/3! + \dots$. If we plug in $x = i\theta$ we get $e^{i\theta} = 1 + i\theta - \theta^2/2! -
i\theta^3/3! + \theta^4/4! + \dots$. But this is $(1 - \theta^2/2! + \theta^4/4! - \dots) + i(\theta - \theta^3/3! + \theta^5/5! - \dots)$ which is
$\cos(\theta) + i \sin(\theta)$. QED. If you prefer not to see the functions exp, sin, cos being defined
as series, you can see them as Taylor series $f(x) = f(0) + f'(0)x + f''(0)x^2/2! + \dots =
\sum_{k=0}^{\infty} (f^{(k)}(0)/k!) x^k$. By differentiating the functions at 0, we then see the connection.
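The series argument can be checked numerically by summing the exponential series at $x = i\theta$ and comparing with $\cos(\theta) + i\sin(\theta)$. The sketch below is only an illustration: the name `exp_series`, the cutoff of 30 terms and the sample angle are assumptions.

```python
import cmath
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z + z^2/2! + ... with the given number of terms."""
    total = 0j
    term = 1 + 0j                 # current term z^k / k!, starting with k = 0
    for k in range(terms):
        total += term
        term *= z / (k + 1)       # turn z^k / k! into z^(k+1) / (k+1)!
    return total

theta = 0.7
print(exp_series(1j * theta))
print(complex(math.cos(theta), math.sin(theta)))
print(cmath.exp(1j * math.pi) + 1)    # zero up to rounding error
```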

1Descartes: La Géometrie, 1637 (1 year after the foundation of Harvard college)


10.4. This implies for θ = π the magical formula

Theorem: $e^{i\pi} + 1 = 0$

This formula is often voted the “nicest formula in math”. 2 It combines “analysis”
in the form e, “geometry” in the form of π, “algebra” in the form of i, the additive unit
0 and the multiplicative unit 1. The Euler formula also allows to define the logarithm
of any complex number as log(z) = log(|z|) + iarg(z) = log(r) + iθ. We see now that
going from (x, y) to (log(r), θ) is a very natural transformation from C \ 0 to C. The
exponential function exp : z → ez is a map from C → C \ 0. It transforms the additive
structure on C to the multiplicative structure because exp(z + w) = exp(z) exp(w).

10.5. In three dimensions, we can look at cylindrical coordinates $(r, \theta, z)$. It is just
the polar coordinates in the first two coordinates. A cylinder of radius 2 for example
is given as $r = 2$. The torus $(3 + x^2 + y^2 + z^2)^2 - 16(x^2 + y^2) = 0$ can be written as
$3 + r^2 + z^2 = 4r$ or more intuitively as $(r - 2)^2 + z^2 = 1$, a circle in the r-z plane.
10.6. The spherical coordinates are $(\rho, \theta, \phi)$, where $\rho = \sqrt{x^2 + y^2 + z^2}$. The angle θ
is the polar angle as in cylindrical coordinates and φ is the angle between the vector
$[x, y, z]$ and the z-axis. We have $\cos(\phi) = [x, y, z] \cdot [0, 0, 1] / |[x, y, z]| = z/\rho$ and $\sin(\phi) =
|[x, y, z] \times [0, 0, 1]| / |[x, y, z]| = r/\rho$ so that $z = \rho \cos(\phi)$ and $r = \rho \sin(\phi)$ and therefore

$x = \rho \sin(\phi) \cos(\theta)$
$y = \rho \sin(\phi) \sin(\theta)$
$z = \rho \cos(\phi)$

where $0 \le \theta < 2\pi$, $0 \le \phi \le \pi$ and $\rho \ge 0$.
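These formulas translate directly into conversion routines. The following is a minimal sketch with hypothetical function names; the argument order (rho, phi, theta) and the convention phi = 0 at the origin are choices made for this illustration.

```python
import math

def spherical_to_cartesian(rho, phi, theta):
    """phi is the angle to the z-axis, theta the polar angle in the xy-plane."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def cartesian_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return 0.0, 0.0, 0.0                       # angles are not unique at the origin
    phi = math.acos(max(-1.0, min(1.0, z / rho)))  # clamp against rounding error
    theta = math.atan2(y, x) % (2 * math.pi)       # shift into [0, 2 pi)
    return rho, phi, theta

print(cartesian_to_spherical(1.0, 1.0, -math.sqrt(2)))
print(spherical_to_cartesian(3.0, math.pi / 2, math.pi / 2))
```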

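10.7. A coordinate change $x \to f(x)$ in the plane can be seen as a map $f : \mathbb{R}^2 \to \mathbb{R}^2$.
A point $(x_1, x_2)$ is mapped into $(f_1, f_2)$. We write $\partial_{x_k}$ for the partial derivative with
respect to the variable $x_k$. For example $\partial_{x_1}(x_1^2 x_2 + 3 x_1 x_2^3) = 2 x_1 x_2 + 3 x_2^3$.

$$ f\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} f_1(x_1, x_2) \\ f_2(x_1, x_2) \end{bmatrix}, \quad df\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \partial_{x_1} f_1(x) & \partial_{x_2} f_1(x) \\ \partial_{x_1} f_2(x) & \partial_{x_2} f_2(x) \end{bmatrix}, $$
where df is a matrix called the Jacobian matrix. The determinant is called the
distortion factor at $x = (x_1, x_2)$.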

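10.8. For polar coordinates, we get

$$ f\begin{bmatrix} r \\ \theta \end{bmatrix} = \begin{bmatrix} r \cos(\theta) \\ r \sin(\theta) \end{bmatrix}, \quad df\begin{bmatrix} r \\ \theta \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -r \sin(\theta) \\ \sin(\theta) & r \cos(\theta) \end{bmatrix}. $$
Its distortion factor is r. We will use this when integrating in polar coordinates.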
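The claim that the polar distortion factor is r can be checked numerically by approximating the partial derivatives with central differences. This is a sketch under stated assumptions: the helper names `jacobian`, `det2` and `polar` and the step size are hypothetical choices, and a finite difference only approximates the true Jacobian.

```python
import math

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of the Jacobian matrix df(x) as a list of rows."""
    n = len(x)
    columns = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        columns.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    # columns[j][i] approximates d f_i / d x_j; transpose into rows
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def polar(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

print(det2(jacobian(polar, [2.0, 0.7])))   # close to r = 2
```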

10.9. If $f(z) = z^2 + c$ with $c = a + ib$, $z = x + iy$ is written as $f(x, y) = (x^2 - y^2 + a, 2xy + b)$,
then df is a 2 × 2 rotation dilation matrix which corresponds to the complex
number $f'(z) = 2z$. The algebra $\mathbb{C}$ is the same as the algebra of rotation-dilation
matrices.
2D. Wells, Which is the most beautiful?, Mathematical Intelligencer, 1988
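10.10. A coordinate change $x \to f(x)$ in space is a map $f : \mathbb{R}^3 \to \mathbb{R}^3$. We compute
$$ f\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{bmatrix}, \quad df\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \partial_{x_1} f_1(x) & \partial_{x_2} f_1(x) & \partial_{x_3} f_1(x) \\ \partial_{x_1} f_2(x) & \partial_{x_2} f_2(x) & \partial_{x_3} f_2(x) \\ \partial_{x_1} f_3(x) & \partial_{x_2} f_3(x) & \partial_{x_3} f_3(x) \end{bmatrix}. $$
We wrote $x = (x_1, x_2, x_3)$. Its determinant $\det(df(x))$ is a volume distortion factor.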
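10.11. For spherical coordinates, we have
$$ f\begin{bmatrix} \rho \\ \phi \\ \theta \end{bmatrix} = \begin{bmatrix} \rho \sin(\phi) \cos(\theta) \\ \rho \sin(\phi) \sin(\theta) \\ \rho \cos(\phi) \end{bmatrix}, \quad df\begin{bmatrix} \rho \\ \phi \\ \theta \end{bmatrix} = \begin{bmatrix} \sin(\phi) \cos(\theta) & \rho \cos(\phi) \cos(\theta) & -\rho \sin(\phi) \sin(\theta) \\ \sin(\phi) \sin(\theta) & \rho \cos(\phi) \sin(\theta) & \rho \sin(\phi) \cos(\theta) \\ \cos(\phi) & -\rho \sin(\phi) & 0 \end{bmatrix}. $$
The distortion factor is $\det(df(\rho, \phi, \theta)) = \rho^2 \sin(\phi)$.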
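The value of this determinant can be confirmed numerically at sample points. The sketch below is an illustration only: the function names are hypothetical, and the matrix entries are the partial derivatives of x, y, z with respect to ρ, φ, θ (in that column order), written out by hand.

```python
import math

def spherical_jacobian(rho, phi, theta):
    """Rows: x, y, z. Columns: derivatives with respect to rho, phi, theta."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [
        [sp * ct, rho * cp * ct, -rho * sp * st],
        [sp * st, rho * cp * st,  rho * sp * ct],
        [cp,      -rho * sp,      0.0],
    ]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

rho, phi, theta = 2.0, 1.1, 0.4
print(det3(spherical_jacobian(rho, phi, theta)), rho ** 2 * math.sin(phi))
```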
Examples
10.12. The point $(x, y) = (-1, 1)$ corresponds to the complex number $z = -1 + i$.
It has the polar coordinates $(r, \theta) = (\sqrt{2}, 3\pi/4)$. As we have $z = re^{i\theta}$, we check
$z^2 = (-1 + i)(-1 + i) = -2i$ which agrees with $(re^{i\theta})^2 = r^2 e^{2i\theta} = 2e^{6\pi i/4}$.
10.13. a) $(x, y, z) = (1, 1, -\sqrt{2})$ corresponds to spherical coordinates $(\rho, \phi, \theta) = (2, 3\pi/4, \pi/4)$.
b) The point given in spherical coordinates as $(\rho, \phi, \theta) = (3, 0, \pi/2)$ is the point $(0, 0, 3)$, since $\phi = 0$ places it on the positive z-axis.
10.14. a) The set of points with $r = 1$ in $\mathbb{R}^2$ form a circle.
b) The set of points with $\rho = 1$ in $\mathbb{R}^3$ form a sphere.
c) The set of points with spherical coordinates $\phi = 0$ are points on the positive z-axis.
d) The set of points with spherical coordinates $\theta = 0$ form a half plane in the xz-plane.
e) The set of points with $\rho = \cos(\phi)$ form a sphere. Indeed, by multiplying both sides
with ρ, we get $\rho^2 = \rho \cos(\phi)$ which means $x^2 + y^2 + z^2 = z$, which is after a completion
of the square equal to $x^2 + y^2 + (z - 1/2)^2 = 1/4$.
10.15. For A ∈ M (n, n), f (x) = Ax + b has df = A and distortion factor det(A).
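10.16. Find the Jacobian matrix and distortion factor of the map $f(x_1, x_2) = (x_1^3 +
x_2, x_2^2 - \sin(x_1))$. Answer: Write both the transformation and the Jacobian:

$$ f\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1^3 + x_2 \\ x_2^2 - \sin(x_1) \end{bmatrix}, \quad df\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3x_1^2 & 1 \\ -\cos(x_1) & 2x_2 \end{bmatrix}. $$
The distortion factor is $\det(df(x)) = 6x_1^2 x_2 + \cos(x_1)$.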
Illustrations
10.17. Let $T : \mathbb{C} \to \mathbb{C}$ be defined as $z \to z^2 + c$ where $z = x + iy$. The set of all $c = a + ib$
for which the iterates $T^n(0)$ stay bounded is the Mandelbrot set M. For $c = -1$ we
get $T(0) = -1$, $T^2(0) = T(-1) = 0$ so that $T^n(0)$ is either 0 or −1. The point $c = -1$
is in M. The point $c = 1$ gives $T(0) = 1$, $T^2(0) = 1^2 + 1 = 2$, $T^3(0) = 2^2 + 1 = 5$.
Induction shows that $T^n(0)$ is unbounded. The point $c = 1$ is not in M.
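The two orbits computed above can be reproduced in a few lines. This is a minimal sketch with hypothetical names: `stays_bounded` only tests finitely many iterates, and both the iteration cap and the cutoff radius 2 are assumptions of the illustration, so a True result is evidence of membership in M rather than a proof.

```python
def orbit(c, n):
    """Return [T(0), T^2(0), ..., T^n(0)] for T(z) = z^2 + c."""
    z, values = 0, []
    for _ in range(n):
        z = z * z + c
        values.append(z)
    return values

def stays_bounded(c, max_iter=200, radius=2.0):
    """Heuristic test: report False as soon as an iterate leaves the disc |z| <= radius."""
    z = 0
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return False
    return True

print(orbit(-1, 4))                           # [-1, 0, -1, 0]
print(orbit(1, 3))                            # [1, 2, 5]
print(stays_bounded(-1), stays_bounded(1))    # True False
```

Complex parameters work the same way, for example `orbit(0.25j, 5)`.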
10.18. Let T be the transformation in $\mathbb{R}^3$ which is in spherical coordinates given by
$T(x) = x^2 + c$, where $x^2$ has spherical coordinates $(\rho^2, 2\phi, 2\theta)$ if x has $(\rho, \phi, \theta)$. It turns
out that $T(x) = x^8 + c$ gives a nice analogue of the Mandelbrot set, the Mandelbulb.

Figure 1. The Mandelbrot set $M = \{ c \in \mathbb{C} \mid T(z) = z^2 + c$
has bounded $T^n(0) \}$. There is a similar construction in space $\mathbb{R}^3$
which uses spherical coordinates. This leads to the Mandelbulb set
$B = \{ c \in \mathbb{R}^3 \mid T(x) = x^8 + c$ has bounded $T^n(0) \}$, where $x^8$ has
spherical coordinates $(\rho^8, 8\phi, 8\theta)$ if x has spherical coordinates $(\rho, \phi, \theta)$.

Homework
Problem 10.1: a) Find the polar coordinates of $(x, y) = (1, \sqrt{3})$.
b) Which point has the polar coordinates $(r, \theta) = (3, 4)$?
c) Find the spherical coordinates of the point $(x, y, z) = (1, 1, 1)$.
d) Which point has the spherical coordinates $(\rho, \theta, \phi) = (3, \pi/2, \pi/3)$?

Problem 10.2: a) Compute $T_c^n(0)$ for $c = (1 + i)$ for n = 1, 2, 3, 4. Is
$1 + i$ in the Mandelbrot set?
b) What is the “eye for an eye” number $i^i$? (You can use $z^w = e^{w \log(z)}$).

Problem 10.3: a) Which surface is described as $r = z$?
b) Describe the hyperbola $x^2 - y^2 = 1$ in polar coordinates.
c) Which surface is described as $\rho \sin(\phi) = \rho$?
d) Describe the hyperboloid $x^2 + y^2 - z^2 = 1$ in spherical coordinates.

Problem 10.4: a) Compute the Jacobian matrix and distortion factor

of the coordinate change T (x, y) = (2x + sin(x) − y, x) (Chirikov map).
b) Compute for fixed a the Jacobian matrix and distortion fac-
tor of the toral coordinates f (q, θ, φ) = ((a + q cos(φ)) cos(θ), (a +
q cos(φ)) sin(θ), q sin(φ)). The θ and φ coordinates are angles, the q coor-
dinate is the distance to the center circle of the torus.

Problem 10.5: a) Prove by induction that the Mandelbrot set M is

contained in the set |c| ≤ 2.
b) Prove by induction that the Mandelbulb set B is contained in the set
|c| ≤ 2.


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 11: Parametrization

Lecture
11.1. A map $r : \mathbb{R}^m \to \mathbb{R}^n$ is called a parametrization. We have seen maps r
from $\mathbb{R}$ to $\mathbb{R}^n$, which were curves. Then we have seen maps $f : \mathbb{R}^n \to \mathbb{R}^n$ which
were coordinate changes. In each case we defined the Jacobian matrix $df(x)$. In
the case of the curve $r : \mathbb{R} \to \mathbb{R}^n$, it was the velocity $dr(t) = r'(t)$. In the case of
coordinate changes, the Jacobian matrix $df(x)$ was used to get the volume distortion
factor $|\det(df(x))| = \sqrt{\det(df^T df)}$. Today, we look at the case $m < n$, in particular
at m = 2, n = 3. As in the case of curves, we use the letter r to describe the map.
The image of a map $r : R \subset \mathbb{R}^m \to \mathbb{R}^n$ is then an m-dimensional surface in $\mathbb{R}^n$. The
distortion factor $||dr||$ defined as $||dr||^2 = \det(dr^T dr)$ will be used later to compute
surface area. 1

Figure 1. An ellipsoid, half an ellipsoid, a bulb, a heart and a cat.

11.2. We mostly discuss here the case m = 2 and n = 3, as we ourselves are made of
two-dimensional surfaces, like cells, membranes, skin or tissue. A map $r : R \subset \mathbb{R}^2 \to
\mathbb{R}^3$, written as $r\left(\begin{bmatrix} u \\ v \end{bmatrix}\right) = \begin{bmatrix} x(u, v) \\ y(u, v) \\ z(u, v) \end{bmatrix}$, defines a two-dimensional surface. In order to
save space, we also just write $r(u, v) = [x(u, v), y(u, v), z(u, v)]$. In computer graphics,
the map r is called a uv-map. The uv-plane is where you draw a texture. The map r places
it onto the surface. In geography, the map r is called (surprise!) a map. Several maps
define an atlas. The curves $u \to r(u, v)$ and $v \to r(u, v)$ are called grid curves.

1 Distinguish $||A||^2 = \det(A^T A)$ and $|A|^2 = \mathrm{tr}(A^T A)$ in $M(n, m)$. They only agree for m = 1.

11.3. The parametrization $r(\phi, \theta) = [\sin(\phi) \cos(\theta), \sin(\phi) \sin(\theta), \cos(\phi)]$ produces the
sphere $x^2 + y^2 + z^2 = 1$. The full sphere has $0 \le \phi \le \pi$, $0 \le \theta < 2\pi$. By modifying
the coordinates, we get an ellipsoid $r(\phi, \theta) = [a \sin(\phi) \cos(\theta), b \sin(\phi) \sin(\theta), c \cos(\phi)]$
satisfying $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$. By allowing a, b, c to be functions of φ, θ we get
“bumpy spheres” like $r(\phi, \theta) = (3 + \cos(3\phi) \sin(4\theta)) [\sin(\phi) \cos(\theta), \sin(\phi) \sin(\theta), \cos(\phi)]$.
11.4. Planes are described by linear maps $r(x) = Ax + b$ with $A \in M(3, 2)$ and
$b \in M(3, 1)$. The Jacobian matrix is $dr = A$. Let $r_u, r_v$ be the two column vectors of A.
Actually, $r_u$ is a shortcut for $\partial_u r(u, v)$, which is the velocity vector of the grid curve
$u \to r(u, v)$.
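11.5. An example is the parametrization $r(u, v) = [u + v - 1, u - v + 3, 3u - 5v + 7]$.
In this case $b = \begin{bmatrix} -1 \\ 3 \\ 7 \end{bmatrix}$, $r_u = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}$, $r_v = \begin{bmatrix} 1 \\ -1 \\ -5 \end{bmatrix}$ and $A = dr = \begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 3 & -5 \end{bmatrix}$. We see
$A^T A = \begin{bmatrix} 11 & -15 \\ -15 & 27 \end{bmatrix}$ which has determinant 72. We also have
$$ |r_u \times r_v|^2 = \left| \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix} \times \begin{bmatrix} 1 \\ -1 \\ -5 \end{bmatrix} \right|^2 = \left| \begin{bmatrix} -2 \\ 8 \\ -2 \end{bmatrix} \right|^2 = 72. $$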
11.6. The previous computation suggests a relation between the normal vector and
the fundamental form $g = dr^T dr$. In three dimensions, the distortion factor of a
parametrization $r : \mathbb{R}^2 \to \mathbb{R}^3$ can indeed always be rewritten using the cross product:

Theorem: $\det(dr^T dr) = |r_u \times r_v|^2$.

Proof. As $dr^T dr = \begin{bmatrix} r_u \cdot r_u & r_u \cdot r_v \\ r_v \cdot r_u & r_v \cdot r_v \end{bmatrix}$, the identity is the Cauchy-Binet identity
$|r_u \times r_v|^2 = |r_u|^2 |r_v|^2 - |r_u \cdot r_v|^2$ which boils down to $\sin^2(\theta) = 1 - \cos^2(\theta)$, where θ is
the angle between $r_u$ and $r_v$. This is the angle between the grid curves you see on the
pictures.
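The identity can be checked on the plane example $r(u, v) = [u + v - 1, u - v + 3, 3u - 5v + 7]$ with a few lines of integer arithmetic. This is an illustrative sketch; the helper names `dot`, `cross` and `gram_det` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gram_det(ru, rv):
    """det(dr^T dr) for the 3 x 2 matrix dr with columns ru and rv."""
    return dot(ru, ru) * dot(rv, rv) - dot(ru, rv) ** 2

ru, rv = (1, 1, 3), (1, -1, -5)        # the columns of A = dr
normal = cross(ru, rv)
print(normal)                           # (-2, 8, -2)
print(dot(normal, normal), gram_det(ru, rv))   # 72 72
```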

Figure 2. A plane, graph, surface of revolution and helicoid.

Examples
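11.7. For the unit sphere $r(\phi, \theta) = [\sin(\phi) \cos(\theta), \sin(\phi) \sin(\theta), \cos(\phi)]$ and $A = dr$:
$$ g = A^T A = \begin{bmatrix} \cos(\phi) \cos(\theta) & \cos(\phi) \sin(\theta) & -\sin(\phi) \\ -\sin(\phi) \sin(\theta) & \sin(\phi) \cos(\theta) & 0 \end{bmatrix} \begin{bmatrix} \cos(\phi) \cos(\theta) & -\sin(\phi) \sin(\theta) \\ \cos(\phi) \sin(\theta) & \sin(\phi) \cos(\theta) \\ -\sin(\phi) & 0 \end{bmatrix}. $$
This is $g = \begin{bmatrix} 1 & 0 \\ 0 & \sin^2(\phi) \end{bmatrix}$ and $\sqrt{\det(g)} = \sin(\phi)$ is the distortion factor.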
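The first fundamental form of the sphere can also be approximated numerically, which gives a quick sanity check of the matrix product. The following sketch is an illustration under stated assumptions: the names `sphere` and `first_fundamental_form` and the finite-difference step are hypothetical choices, and the result is only accurate up to discretization error.

```python
import math

def sphere(phi, theta):
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def first_fundamental_form(r, u, v, h=1e-6):
    """Approximate g = dr^T dr using central differences for r_u and r_v."""
    ru = [(p - m) / (2 * h) for p, m in zip(r(u + h, v), r(u - h, v))]
    rv = [(p - m) / (2 * h) for p, m in zip(r(u, v + h), r(u, v - h))]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return [[dot(ru, ru), dot(ru, rv)],
            [dot(rv, ru), dot(rv, rv)]]

phi, theta = 1.0, 0.5
g = first_fundamental_form(sphere, phi, theta)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
print(g)
print(math.sqrt(det_g), math.sin(phi))   # both close to sin(phi)
```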
11.8. An important class of surfaces are graphs $z = f(x, y)$. Its most natural
parametrization is $r(x, y) = [x, y, f(x, y)]$, where the map r just lifts up the bottom part
to the elevated version. Examples are the elliptic paraboloid $r(x, y) = [x, y, x^2 + y^2]$
and the hyperbolic paraboloid $r(x, y) = [x, y, x^2 - y^2]$. We could of course have written
also $r(u, v) = [u, v, u^2 - v^2]$.
11.9. A surface of revolution is parametrized like $r(\theta, z) = [g(z) \cos(\theta), g(z) \sin(\theta), z]$.
Note that we can use any variables. In this case, $u = \theta$, $v = z$ are used. An example
is the cone $r(\theta, z) = [z \cos(\theta), z \sin(\theta), z]$ or the one-sheeted hyperboloid
$r(\theta, z) = [\sqrt{z^2 + 1} \cos(\theta), \sqrt{z^2 + 1} \sin(\theta), z]$.
11.10. The torus is in cylindrical coordinates given as $(r - 3)^2 + z^2 = 1$. We can
parametrize this using the polar angle θ and the polar angle φ centered at the center of the
circle as $r(\theta, \phi) = [(3 + \cos(\phi)) \cos(\theta), (3 + \cos(\phi)) \sin(\theta), \sin(\phi)]$. Both angles θ and φ
go from 0 to 2π. We see now also the relation with the toral coordinates.
11.11. The helicoid is the surface you see as a staircase or screw. The parametrization
is r(θ, p) = [p cos(θ), p sin(θ), θ]. How can we understand this? The key is to look at
grid curves. If p = 1, we get a curve r(θ) = [cos(θ), sin(θ), θ] which we had identified
as a helix. On the other hand, if you fix θ, then you get lines.
11.12. Side remark. The first fundamental form $g = dr^T dr$ is also called a
metric tensor. In Riemannian geometry one looks at a manifold M equipped
with a metric g. The simplest case is when g comes from a parametrization, as we did
here. In physics, we know that it is mass which deforms space-time. The quantity
$||g||^2 = \det(g)$ is a multiplicative analogue of $|g|^2 = \mathrm{tr}(g)$. For an invertible positive
definite square matrix A, we will later see the identity $\log \det(A) = \mathrm{tr} \log(A)$ which
illustrates how both determinant and trace are pivotal numerical quantities derived
from a matrix. Trace is additive because of $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$ and determinant
is multiplicative, $\det(AB) = \det(A) \det(B)$, as we will see later.
11.13. To summarize, we have seen so far that there are two fundamentally different
ways to describe a manifold. The first is to write it as a level surface f = c which is a
kernel of a map g(x) = f − c. A second is to write it as the image of some map r.
Illustration

Figure 3. “Veritas on Earth and the Moon” theme (rendered in Povray).


Figure 4. A fruit and math-candy. © math-candy.com (rendered in Mathematica)

Homework

Problem 11.1: Parametrize the upper part of the two sheeted hyperboloid
$x^2 + y^2 - z^2 = -1$, $z > 0$ in two different ways:
a) as a surface of revolution b) as a graph $z = f(x, y)$.

Problem 11.2: a) Parametrize the plane $x + 2y + 3z - 6 = 0$ using a
map $r : \mathbb{R}^2 \to \mathbb{R}^3$. b) Now find the matrix $A = dr$ and compute $g = A^T A$
as well as the distortion factor $\sqrt{\det(A^T A)}$. c) Also compute $r_u$, $r_v$ and
$r_u \times r_v$ and then compute $|r_u \times r_v|$. You should get the same number.

Problem 11.3: Given a parametrization $r(\theta, \phi) = [(7 +
2 \cos(\phi)) \cos(\theta), (7 + 2 \cos(\phi)) \sin(\theta), 2 \sin(\phi)]$ of the 2-torus, find the implicit
equation $g(x, y, z) = 0$ which describes this torus.

Problem 11.4: Parametrize the hyperbolic paraboloid $z = x^2 - y^2$.
What is the first fundamental form $g = dr^T dr$, which is
$g = \begin{bmatrix} r_x \cdot r_x & r_x \cdot r_y \\ r_y \cdot r_x & r_y \cdot r_y \end{bmatrix}$? What is the distortion factor $\sqrt{\det(g)}$?

Problem 11.5: The matrix $g = dr^T dr$ is also called the first fundamental
form. If $r : \mathbb{R}^4 \to \mathbb{R}^4$ is a parametrization of space time then
g is the space time metric tensor. The matrix entries of g appear in
general relativity. Now for some reasons, physics folks use Greek symbols
to access matrix entries. They write $g_{\mu\nu}$ for the entry at row µ and
column ν. This appears for example in the Einstein field equations
$$ R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu}. $$
Find the general solution of this equation. Just kidding. We just want
you to look up the equations and tell for each of the variables what it
is called and whether it is a matrix, a scalar function or a constant.


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 12: Creativity

Seminar
12.1. As we are heading for our first midterm, let us organize the knowledge accu-
mulated so far. We can do that in various ways. One technique is a mind map. It
allows on one picture to organize a vast amount of content and see connections which
might otherwise be missed. In Figure (1) we started to build such a mind map. There
are lots of branches still missing, even main ones. One could also start with one entry
like “matrix”, put it in the center and then build connections to other objects, definitions
or results.

Figure 1. Mind map (nodes: Quantities, Area, Vectors, Dot Product, Cross product, Velocity, Curves, Matrices, Trace, Matrix Product, Jacobian, Parametrizations, Surfaces, Spheres, Quadrics, Calculus, Integral, Derivative, Proofs, Induction, Hourly).

12.2. What does this have to do with creativity? It turns out that in order to be
creative, one has to have a fertile base of knowledge. You can not assemble new
building blocks before possessing and understanding some already. In order to prove
the point that knowledge is important, one can also look at computer science and
especially the field of artificial intelligence (AI). One of the great pioneers in AI, Marvin
Minsky, once wrote: “the best way to solve a problem is to know how to solve it”. The
modern paradigms in machine learning confirm that in order to train an AI entity,
one has to feed in a lot of knowledge to work with. New models then come through
data fitting, gradient descent methods or more sophisticated algorithms. 1
Problem A: Make a mind map of the most important facts which have
appeared in the course so far. Do it on paper, a blackboard, whiteboard
or using software. Figure (1) makes a start. Refine it as much as possible.

12.3. To illustrate how difficult it can be to get a new solution, try the following
problem. Of course, if you know the answer or have seen it already, it can be easy. If
you have never seen it, it can be very hard. It is important that you try to find the
solution for at least half an hour even if you should not be successful.
Problem B: Given 6 sticks of the same length 1, arrange them so that
you get 4 equilateral triangles of side length 1.

12.4. Finding proofs of theorems needs creativity. Creativity is neither “God-given”
nor inherited; it can be trained like everything else. To back this claim up, we refer
to a scientist who has demonstrated creativity by discovering new things which
nobody else had thought about before. It is the Swiss scientist Fritz Zwicky who
taught at Caltech and wrote a book “Everybody a genius”. Why does Zwicky have
“street cred”? Well, he was not only extraordinarily creative, he also developed and
communicated creativity techniques that work and have been used since both in
industry and academia.

Figure 2. Fritz Zwicky at the International Astronomical Union meeting
in Brighton, England, in 1970. Image credit: AIP Emilio Segre Visual
Archives, John Irwin Slide Collection. Book: Fritz Zwicky, “Jeder ein
Genie” (everybody genius), Lang and Lang, 1971.
1See the Ahlfors lecture talk of 9/11/2018 by Sanjeev Arora, now on Youtube
12.5. First to the credentials: Fritz Zwicky proposed the existence of dark matter,
supernovas (together with Walter Baade), neutron stars, galactic cosmic rays,
gravitational lensing by galaxies, and galaxy clusters. He was also a pioneer in
rocket technology. He proposed and realized the first shot of a human produced
object to go into outer space. Each of these achievements alone would merit to be in
the list of greatest astronomers of all time. Still, Zwicky is not that well known. Why?
Maybe it has to do with the fact that Zwicky used to call his colleagues “spherical
bastards”. Why spherical? “Because they are bastards from whatever side you looked
at them!” No wonder he was not that much admired ...
12.6. One of the techniques is the morphological box. It is very simple. Produce
a matrix in which one has one type of objects, ideas or activities on one side and
another type of objects, ideas or activities on the other. Now, just go through the matrix and look
for connections. Here is such a matrix:
          Earth    Moon    Sun
shoot
dig
travel
12.7. Now look what Zwicky proposed: shoot onto the moon (he actually did that
with used V2 rockets which had an actual gun on top; at the end of the burn the gun
was fired and the bullet would travel to space), travel by large scale digging
through the earth (this is now realized by a company formed by Elon Musk), and travel
with the sun (the proposal was to travel to a nearby star by moving the entire solar
system).
12.8. The matrix entry ”dig sun” might come in when realizing Zwicky’s space travel
idea. We might have to target part of the sun differently to trigger asymmetric burn
and so a travel. By the way there is an entire field of engineering, “macro-engineering”.
In 1997, I suggested in an essay (on the occasion of the 100th birthday of Zwicky) to
implement Zwicky’s idea by deliberate triggering of asymmetric fusion and fission in
the Sun. This is mentioned in a macro-engineering book. 2
12.9. Here is a beautiful problem assigned in the course Math 101 this semester, taught
by Sebastien Vasey. Borrowing a problem from another course does not make much of
a case for creativity, but the problem is too beautiful to be missed. It is an example
of an induction proof which needs some creativity. Try to solve it.
Problem C: You have bathroom tiles which have three squares arranged
in an L shape. Prove that you can cover a square shaped bathroom floor
of length and width $2^n$ with such tiles such that one square is left empty.

2V. Badescu, R.B. Cathcart, R.D. Schuiling, Macro-Engineering, Springer, 2006


Problem D: Martin Gardner wrote many books with puzzles. One of

them is “The mathematical magic show” (1977). On the book cover of
the German edition (1988), there is a famous puzzle: you have a cherry
in a glass built by 4 matches. Move two of the four matches to get the
cherry out of the glass. The glass should have the same shape as before.
You are not allowed to move the cherry. Solve the cherry puzzle.

Figure 3. The German edition of “mathematical magic show”.

Homework
Exercises A-D are done in the seminar. This homework is due on Thursday. In all
the following questions, creativity is key. Your object has to be original. It is ok to
modify a known object. And of course, use technology so that one can admire your
creation.
Problem 12.1 Be creative and generate your own parametrized curve.

Problem 12.2 Be creative and generate a parametric surface.

Problem 12.3 Be creative and generate a level surface f (x, y, z) = c.

Problem 12.4 Be creative and generate your own coordinate system in $\mathbb{R}^2$.

Problem 12.5 a) Write a first hourly! b) take it! c) grade it!

Remark: According to the Apocrypha of Krantz (page 79), part a) and b) were once
given as an algebraic geometry exam given here at Harvard. It is rumored that this
was then used also at the Harvard philosophy department, where (and this is creative
too), part c) was added. As far as we know, giving the homework assignment of writing
an exam assignment is a first! Heureka! We were creative.
LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 13: Keywords for First Hourly

Theorems
Cauchy-Schwarz
Pythagoras
Al Kashi
Uniqueness of Row reduction
The cross product formula
Image of transpose is perpendicular to kernel of matrix.
Cauchy Binet formula
For differentiable curves, arc length exists.
The equivalence of curvature formulas
Euler formula and special case
Distortion formula in space

Algorithms
Find angle between vectors
Find area of parallelogram
Find volume of parallelepiped
Row reduce a matrix
Get position from acceleration
Find vector perpendicular to a plane
Find length of a curve or matrix
Find curvature at some point
Compute with complex numbers
Switch between coordinate systems
Compute the distortion factor
Get distances between objects

Objects
Matrices
Vectors
Curves
Linear manifolds
Quadratic manifolds
Kernel of map

Parametrized surfaces

Differentiation
Velocity
Acceleration
The Frenet TNB frame
Jacobian matrix
Curvature

Integration
Integrate to get arc length.
Integrate to get position from velocity etc.
Integration technique: substitution
Integration technique: partial fractions
Integration technique: simplification

Coordinate systems
Cartesian coordinates
Polar coordinates
Cylindrical coordinates
Spherical coordinates
General coordinate change

Parametrized Surfaces
Spheres
Surfaces of revolution
Graphs
Planes

People
Mandelbrot
Hamilton
Descartes
Cauchy
Binet
Schwarz
Euler
Heine
Cantor
Bolzano
Archimedes
Newton


4
Name:
5

7
LINEAR ALGEBRA AND VECTOR ANALYSIS
8
MATH 22A Total :
9

Unit 13: First Hourly Practice

Problems

Problem 13P.1 (10 points):

The Fibonacci numbers are defined recursively as follows: start with
$F_0 = 0, F_1 = 1$, then define $F_{n+1} = F_n + F_{n-1}$, so that $F_2 = 1, F_3 = 2, F_4 = 3, F_5 = 5$ etc. Prove that
$F_0 + F_1 + \cdots + F_n = F_{n+2} - 1$
for every positive integer n.

Problem 13P.2 (10 points):

Let
$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$.
a) (4 points) Compute AB and rref(AB).
b) (4 points) Now row reduce both A and B and form rref(A)rref(B).
c) (2 points) Is the statement rref(AB) = rref(A)rref(B) true for all A, B?

Problem 13P.3 (10 points):

a) (2 points) Parametrize the line through (1, 1, 1) and (4, 3, 1) in $\mathbb{R}^3$.
b) (2 points) Parametrize the ellipse $x^2/16 + y^2/25 = 1$ in $\mathbb{R}^2$.
c) (2 points) Parametrize the graph $y = x^5 + x$ in $\mathbb{R}^2$.
d) (2 points) Parametrize the circle $x^2 + (y - 2)^2 = 1, z = 4$ in $\mathbb{R}^3$.
e) (2 points) Parametrize the line x = y = z in $\mathbb{R}^3$.

Problem 13P.4 (10 points):

Find the arc length of the curve
$r(t) = [t\cos(t^2), t\sin(t^2), t^2]$
for 0 ≤ t ≤ 2.

Problem 13P.5 (10 points):

a) (2 points) What is the Heine-Cantor theorem?
b) (2 points) Formulate the triangle inequality.
c) (2 points) What is the Al Kashi identity?
d) (2 points) Give the name of a nowhere differentiable function.
e) (2 points) Is it true that a continuous curve r(t) has a finite arc length?

Problem 13P.6 (10 points):

a) (2 points) Find (3 + i)(4 + 2i).
b) (2 points) What is $e^{i3\pi/4}$?
c) (2 points) Convert from cylindrical (r, θ, z) = (2, π/2, 1) to Cartesian.
d) (2 points) What are the spherical coordinates of $(1, \sqrt{3}, 2)$?
e) (2 points) What surface is in spherical coordinates given as ρ sin(φ) = 1?

Problem 13P.7 (10 points):

a) (5 points) You are given $r'''(t) = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$ and r(0) = (7, 8, 9) and r'(0) =
(1, 0, 0) and r''(0) = (0, 1, 0). Find r(1).
b) (5 points) What is the curvature of $r(t) = [t, t + t^2, t + t^2 + t^3]$ at t = 0?

Problem 13P.8 (10 points):

a) (5 points) Find a parametrization r(u, v) of the cylinder $x^2 + z^2 = 9$.
b) (5 points) Find r(u, v) for the paraboloid $y^2 + 3z^2 = x$.

Problem 13P.9(10 points):

1 1
Let A =  2 1 . a) (2 points) The image of A is a plane. By using the
1 1
cross product, write it as ax + by + cz = d.
T
b) (2 points) What is the first fundamental form g = A√ A?
T 2 2 2
c) (2 points) From a) you have [a, b, c] = v × w.
p Find a + b + c .
d) (2 points) Find the distortion factor ||A|| = det(AT A) of A.
e) (2 points) What theorem was involved to see ||A|| = |v × w|?

Problem 13P.10 (10 points):

a) (5 points) What is the Jacobian matrix df of the map
$f(x, y, z) = [x^2 + y^2 + z^2, x + y, -x^2]^T$?
b) (5 points) Find the distortion factor det(df).

Unit 13: Hourly 1 (II actual hourly)

Problems

Problem 13.1 (10 points):

Prove that
$1 + 2 + 4 + 8 + \cdots + 2^n = 2^{n+1} - 1$
for every positive integer n.

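Problem 13.2 (10 points):

a) (5 points) Row reduce the matrix $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \end{bmatrix}$.
b) (5 points) Compute the matrix product $\begin{bmatrix} 3 & 4 & 5 \end{bmatrix} A \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$.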

Problem 13.3 (10 points):

a) (2 points) Parametrize the curve x = sin(y) in $\mathbb{R}^2$.
b) (2 points) Parametrize the curve $r = \sin^2(5\theta)$ in $\mathbb{R}^2$.
c) (2 points) Parametrize the curve $y = x^5 + x, z = 4$ in $\mathbb{R}^3$.
d) (2 points) Parametrize the line 2x + y = 4 in $\mathbb{R}^2$.
e) (2 points) Parametrize the ellipse $(x - 1)^2 + \frac{y^2}{4} = 1$ in $\mathbb{R}^2$.

Problem 13.4 (10 points):

Find the arc length of the curve
$r(t) = \begin{bmatrix} e^t \\ e^{-t} \\ \sqrt{2}\, t \end{bmatrix}$
for 0 ≤ t ≤ 1.
Problem 13.5 (10 points):
a) (2 points) Formulate the Cauchy-Schwarz inequality.
b) (2 points) What formula gives the area of the parallelogram spanned
by two vectors v and w?
c) (2 points) What formula gives the volume of a parallelepiped spanned
by three vectors u, v, w?
d) (2 points) Who invented the quaternions?
e) (2 points) Assume rref(A) = rref(B). Does this mean A = B?

Problem 13.6 (10 points):

a) (2 points) Write the complex number $z = e^{-i\pi/2}$ in the form z = a + ib.
b) (2 points) Which point (x, y, z) has the cylindrical coordinates
(r, θ, z) = (1, π/2, 0)?
c) (2 points) What are the spherical coordinates (ρ, φ, θ) of the point
$(x, y, z) = (\sqrt{2}, \sqrt{2}, -2)$?
d) (2 points) What surface is $\rho \sin^2(\phi) = \cos(\phi)$? Give the name and
write it in Cartesian coordinates.
e) (2 points) What surface is given in cylindrical coordinates by the
equation r sin(θ) = 2?

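Problem 13.7 (10 points):

a) (5 points) You are given $r''(t) = \begin{bmatrix} 0 \\ 3 \\ t \end{bmatrix}$ and $r(0) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and $r'(0) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$. Find r(1).
b) (5 points) What is the curvature of $r(t) = \begin{bmatrix} \cos(t) \\ \sin(t) \\ t \end{bmatrix}$ at t = 0?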

Problem 13.8 (10 points):

a) (2 points) Find a parametrization of the cone $x^2 + y^2 = z^2$.
b) (2 points) Find a parametrization of $x^2/4 + y^2/9 + z^2/16 = 1$.
c) (2 points) Find a parametrization of the surface $x^2 - y^2 = z$.
d) (2 points) Find a parametrization of the plane z = 2.
e) (2 points) Find a parametrization of the cylinder $x^2 + z^2 = 1$.
Problem 13.9 (10 points):
a) (5 points) Find the dot product $A \cdot B = \mathrm{tr}(A^T B)$ between the two
matrices
$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 1 & 0 \end{bmatrix}$.
b) (5 points) Find the cosine of the angle between these two matrices.

Problem 13.10 (10 points):

a) (5 points) What is the Jacobian matrix df of the coordinate change
$f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 2x - y + \sin(x) \\ x \end{bmatrix}$?
b) (5 points) What is the distortion factor det(df) of the map f, which by
the way is called the Chirikov map?

Unit 13: Hourly 1 (III practice)

Problems

Problem 13R.1 (10 points):

Prove that $1^2 + 2^2 + 3^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$ for every positive
integer n.

Problem 13R.2 (10 points):

 
a) (6 points) Row reduce $A = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 2 & 0 & 2 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}$.
b) (4 points) Find $A^2$.

Problem 13R.3 (10 points):

a) (2 points) Parametrize the curve $y(x^2 + 1) = 1$.
b) (2 points) Parametrize the spiral $r = \theta^2$.
c) (2 points) Parametrize the graph $x = e^y = \exp(y)$.
d) (2 points) Parametrize the y-axis x = 0.
e) (2 points) Parametrize the ellipse $(x - 1)^2/9 + (y - 4)^2/4 = 1$.

Problem 13R.4 (10 points):

Find the arc length of the curve
$r(t) = [\sqrt{2}\log(t), \log(t)^2/2, \log(\log(t))]^T$
for $e \le t \le e^2$.
Problem 13R.5 (10 points):
a) (2 points) If v and w are two non-zero vectors, what is $|v \times w|/|v \cdot w|$?
b) (2 points) Finish this: if f is uniformly continuous on [a, b], if

|x − y| ≤ . . . , then $|f(x) - f(y)| \le M_n$.

c) (2 points) Assume $x \cdot Bx + Ax - b = 0$ is a quadratic manifold.
What object is B? What object is A? Be specific. In case of vectors,
distinguish between row and column vectors for example.
d) (2 points) Who found the formula $e^{i\theta} = \cos(\theta) + i\sin(\theta)$?
e) (2 points) Is it true that if f is a function on [a, b] with f(a) < 0 and
f(b) > 0, then f'(x) = 0 at some point?

Problem 13R.6 (10 points):

a) (2 points) What is $(3 + i)^3$?
b) (2 points) What is $(-1)^i$?
c) (2 points) What is $i^4$?
d) (2 points) What surface is in cylindrical coordinates given as $z - r^2 = 1$?
e) (2 points) What set of points satisfies φ = π?

Problem 13R.7 (10 points):

a) (5 points) You are given $r''(t) = \begin{bmatrix} \sin(t) \\ \cos(t) \\ t \end{bmatrix}$ and r(0) = (0, 0, 0) and
r'(0) = (0, 0, 0). Find r(2π).
b) (5 points) What is the curvature of r(t) = [2 sin(t), 3 cos(t), 0] at t = 0?

Problem 13R.8 (10 points):

a) (5 points) Find the parametrization of the surface $x^2 - y^2 + z^2 = 1$.
b) (5 points) Find the parametrization of the surface $x^2/4 + y^2/9 = 1$.
Problem 13R.9 (10 points):
a) (5 points) Find the dot product between the two matrices
$A = \begin{bmatrix} 3 & 4 \\ 1 & 1 \end{bmatrix}$
and
$B = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}$.
b) (5 points) Find the angle between these two matrices.

Problem 13R.10 (10 points):

a) (5 points) What is the Jacobian matrix of the coordinate change
$f(x, y, z) = \begin{bmatrix} x^2 \\ y^2 x + y x^2 \\ z^3 \end{bmatrix}$?
b) (5 points) What is the distortion factor of f?

Unit 14: Partial differential equations

Lecture
14.1. A partial differential equation is a rule which combines the rates of changes of
different variables. Our lives are affected by partial differential equations: the Maxwell
equations describe electric and magnetic fields E and B. Their motion leads to the
propagation of light. The Einstein field equations relate the metric tensor g with the
mass tensor T . The Schrödinger equation tells how quantum particles move. Laws
like the Navier-Stokes equations govern the motion of fluids and gases and especially
the currents in the ocean or the winds in the atmosphere. Partial differential equations
also appear in unexpected places like finance, where, for example, the Black-Scholes
equation relates the prices of options to time and stock prices.
14.2. If f(x, y) is a function of two variables, we can differentiate f with respect to
either x or y. We just write $f_x(x, y)$ for $\partial_x f(x, y)$. For example, for $f(x, y) = x^3 y + y^2$,
we have $f_x(x, y) = 3x^2 y$ and $f_y(x, y) = x^3 + 2y$. If we first differentiate with respect to
x and then with respect to y, we write $f_{xy}(x, y)$. If we differentiate twice with respect
to y, we write $f_{yy}(x, y)$. An equation for an unknown function f in which partial
derivatives with respect to at least two different variables appear is called a partial
differential equation (PDE). If only derivatives with respect to one variable appear,
one speaks of an ordinary differential equation (ODE). An example of a PDE is
$f_x^2 + f_y^2 = f_{xx} + f_{yy}$, an example of an ODE is $f'' = f^2 - f'$. It is important to realize
that it is a function we are looking for, not a number. The ordinary differential equation
$f' = 3f$, for example, is solved by the functions $f(t) = Ce^{3t}$. If we prescribe an initial
value like f(0) = 7, then there is a unique solution $f(t) = 7e^{3t}$. The KdV partial
differential equation $f_t + 6 f f_x + f_{xxx} = 0$ is solved by (you guessed it) $f(t, x) = 2\,\mathrm{sech}^2(x - 4t)$.
This is one of many solutions. Such solutions are called solitons, nonlinear waves.
The Korteweg-de Vries (KdV) equation is an icon in a mathematical field called integrable systems,
which gives insight into ongoing research, for example about rogue waves in the ocean.
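As a numerical sanity check of the partial derivatives and of the soliton claim above, here is a minimal Python sketch. The helper names (`partial`, `kdv_residual`), the sample points and the step sizes are illustrative choices, not part of the lecture. Finite differences only approximate derivatives, so the KdV residual comes out small rather than exactly zero.

```python
import math

def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func
    with respect to the coordinate at position `index`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

# Example from the text: f(x, y) = x^3 y + y^2
def f(x, y):
    return x**3 * y + y**2

x0, y0 = 1.5, -0.7                    # arbitrary sample point
fx_numeric = partial(f, (x0, y0), 0)
fy_numeric = partial(f, (x0, y0), 1)
fx_exact = 3 * x0**2 * y0             # f_x = 3 x^2 y
fy_exact = x0**3 + 2 * y0             # f_y = x^3 + 2 y

# KdV soliton from the text: u(t, x) = 2 sech^2(x - 4t)
def soliton(t, x):
    return 2.0 / math.cosh(x - 4.0 * t) ** 2

def kdv_residual(u, t, x, h=1e-3):
    """Finite-difference value of u_t + 6 u u_x + u_xxx at (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xxx = (u(t, x + 2 * h) - 2 * u(t, x + h)
             + 2 * u(t, x - h) - u(t, x - 2 * h)) / (2 * h**3)
    return u_t + 6 * u(t, x) * u_x + u_xxx

residual = kdv_residual(soliton, 0.1, 0.7)
print(fx_numeric, fx_exact)
print(fy_numeric, fy_exact)
print(residual)                       # close to 0, up to discretization error
```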
14.3. We say $f \in C^1(\mathbb{R}^2)$ if both $f_x$ and $f_y$ are continuous functions of two variables
and $f \in C^2(\mathbb{R}^2)$ if all $f_{xx}, f_{yy}, f_{xy}$ and $f_{yx}$ are continuous functions. The next theorem is
called the Clairaut theorem. It deals with the partial differential equation $f_{xy} = f_{yx}$.
The proof illustrates the technique of proof by contradiction. We will look at this technique
a bit more in the proof seminar.

Theorem: Every $f \in C^2$ solves the Clairaut equation $f_{xy} = f_{yx}$.

14.4. Proof. We use Fubini’s theorem which will appear later in the double integral
lecture: integrate $\int_{x_0}^{x_0+h} \left( \int_{y_0}^{y_0+h} f_{xy}(x, y)\, dy \right) dx$ by applying the fundamental theorem
of calculus twice:
$\int_{x_0}^{x_0+h} f_x(x, y_0 + h) - f_x(x, y_0)\, dx = f(x_0 + h, y_0 + h) - f(x_0, y_0 + h) - f(x_0 + h, y_0) + f(x_0, y_0)$.
An analogous computation gives
$\int_{y_0}^{y_0+h} \left( \int_{x_0}^{x_0+h} f_{yx}(x, y)\, dx \right) dy = f(x_0 + h, y_0 + h) - f(x_0, y_0 + h) - f(x_0 + h, y_0) + f(x_0, y_0)$.
Fubini applied to $g(x, y) = f_{yx}(x, y)$ assures
$\int_{y_0}^{y_0+h} \left( \int_{x_0}^{x_0+h} f_{yx}(x, y)\, dx \right) dy = \int_{x_0}^{x_0+h} \left( \int_{y_0}^{y_0+h} f_{yx}(x, y)\, dy \right) dx$
so that $\iint_A f_{xy} - f_{yx}\, dy\, dx = 0$ for $A = [x_0, x_0 + h] \times [y_0, y_0 + h]$. Assume there is some $(x_0, y_0)$ where
$F(x_0, y_0) = f_{xy}(x_0, y_0) - f_{yx}(x_0, y_0) = c > 0$. Then, by continuity, for small h the function F is bigger than c/2 everywhere
on A, so that $\iint_A F(x, y)\, dx\, dy \ge \mathrm{area}(A)\, c/2 = h^2 c/2 > 0$,
contradicting that the integral is zero. The case c < 0 is analogous.
14.5. The statement is false for functions which are only $C^1$. The standard
counterexample is $f(x, y) = 4xy(y^2 - x^2)/(x^2 + y^2)$ (with f(0, 0) = 0), which has for y ≠ 0 the property that
$f_x(0, y) = 4y$ and for x ≠ 0 the property that $f_y(x, 0) = -4x$. Consequently $f_{xy}(0, 0) = 4$ while $f_{yx}(0, 0) = -4$. You can see the
comparison of $f(x, y) = 2xy = r^2 \sin(2\theta)$ and $f(x, y) = 4xy(y^2 - x^2)/(x^2 + y^2) = -r^2 \sin(4\theta)$.
The latter function is not in $C^2$. The values $f_{xy}$ and $f_{yx}$, changes of slopes
of tangent lines, turn differently.
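The failure of Clairaut at the origin can be observed numerically. The following Python sketch is an illustration under stated assumptions: the function is extended by f(0, 0) = 0, and the nested central differences with inner step `h` much smaller than the outer step `k` are an ad hoc choice, not a method from the lecture.

```python
def f(x, y):
    """Counterexample from the text, extended by f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return 4 * x * y * (y**2 - x**2) / (x**2 + y**2)

def f_x(x, y, h=1e-6):
    # central difference in the x direction
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-6):
    # central difference in the y direction
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-2  # outer step, chosen much larger than the inner step h

# f_xy: differentiate first in x, then in y, at the origin
fxy_at_origin = (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)
# f_yx: differentiate first in y, then in x, at the origin
fyx_at_origin = (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)

print(fxy_at_origin, fyx_at_origin)  # approximately 4 and -4
```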

Figure 1. Clairaut holds for $f(x, y) = 2xy$, which is $r^2 \sin(2\theta)$ in polar coordinates.
It does not hold for the function $f(x, y) = 4xy(y^2 - x^2)/(x^2 + y^2)$,
which is $-2r^2 \sin(2\theta)\cos(2\theta) = -r^2 \sin(4\theta)$ in polar coordinates.

Illustration
14.6. In many cases, one of the variables is time, for which we use the letter t, and we
keep x as the space variable. The differential equation $f_t(t, x) = f_x(t, x)$ is called
the transport equation. What are the solutions if f(0, x) = g(x)? Here is a cool
derivation: if $Df = f'$ is the derivative with respect to x, ¹ we can build operators like $(D + D^2 + 4D^4)f = f' + f'' + 4f''''$.
The transport equation is now $f_t = Df$. As you know from calculus,
the only solution of $f' = af$, $f(0) = b$ is $be^{at}$. If we boldly replace the number a
with the operator D, we get $f_t = Df$ and its solution
$e^{Dt} g(x) = (1 + Dt + D^2 t^2/2! + \cdots) g(x) = g(x) + g'(x) t + g''(x) t^2/2! + \cdots$.
By the Taylor formula, this is equal to g(x + t). You should actually remember Taylor
as $g(x + t) = e^{Dt} g(x)$. We have derived for g(x) = f(0, x) in $C^1(\mathbb{R})$:
¹ We usually write df for derivative but D tells it is an operator. D also stands for Dirac.
Theorem: $f_t = f_x$ is solved by f(t, x) = g(x + t).
Proof. We can ignore the derivation and verify this very quickly: the function satisfies
f(0, x) = g(x), and by the chain rule $f_t(t, x) = g'(x + t) = f_x(t, x)$. QED.
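The operator identity $g(x + t) = e^{Dt} g(x)$ can be tried out on a concrete function. The sketch below takes g = sin, whose derivatives cycle with period four, and sums the first terms of the series. The function names and the choice of g, x, t and the number of terms are assumptions made for illustration.

```python
import math

def sin_derivative(k, x):
    """k-th derivative of sin at x (the derivatives cycle with period 4)."""
    cycle = (math.sin(x), math.cos(x), -math.sin(x), -math.cos(x))
    return cycle[k % 4]

def exp_Dt(x, t, terms=30):
    """Partial sum of (1 + Dt + D^2 t^2/2! + ...) g(x) for g = sin."""
    return sum(sin_derivative(k, x) * t**k / math.factorial(k)
               for k in range(terms))

x, t = 0.3, 0.5
shifted = exp_Dt(x, t)       # series value of e^{Dt} g(x)
exact = math.sin(x + t)      # g(x + t), the transported profile
print(shifted, exact)
```

For functions whose Taylor series does not converge, the formal manipulation with $e^{Dt}$ is only a heuristic; the direct verification in the proof does not need it.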
14.7. Another example of a partial differential equation is the wave equation $f_{tt} = f_{xx}$.
We can write this as $(\partial_t + D)(\partial_t - D) f = 0$. One way to solve this is by looking at
$(\partial_t - D) f = 0$. This is the transport equation $f_t = f_x$, solved by functions of the form u(x + t). We can also have
$(\partial_t + D) f = 0$, which means $f_t = -f_x$, leading to functions of the form v(x − t). We see that every combination
a u(x + t) + b v(x − t) with constants a, b is a solution. Fixing the constants and functions so that
f(0, x) = g(x) and $f_t(0, x) = h'(x)$ gives the following d’Alembert solution. It
requires $g, h \in C^2(\mathbb{R})$.
Theorem: $f_{tt} = f_{xx}$ is solved by $f(t, x) = \frac{g(x+t) + g(x-t)}{2} + \frac{h(x+t) - h(x-t)}{2}$.

14.8. Proof. Just verify directly that this indeed is a solution and that f(0, x) = g(x)
and $f_t(0, x) = (h'(x) + h'(x))/2 = h'(x)$. Intuitively, if we throw a stone into a narrow waterway, then the
waves move to both sides.
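The "verify directly" step can also be done numerically. This minimal sketch picks g = sin and h = cos as sample data (an arbitrary assumption) and compares second differences in t and in x of the d'Alembert formula; it also checks the initial position f(0, x) = g(x).

```python
import math

def g(x):
    return math.sin(x)    # sample initial position (illustrative choice)

def h(x):
    return math.cos(x)    # sample second function in the formula

def f(t, x):
    """d'Alembert formula from the theorem."""
    return (g(x + t) + g(x - t)) / 2 + (h(x + t) - h(x - t)) / 2

def second_difference(func, s, step=1e-4):
    """Central second difference approximating func''(s)."""
    return (func(s + step) - 2 * func(s) + func(s - step)) / step**2

t0, x0 = 0.4, 1.1
f_tt = second_difference(lambda t: f(t, x0), t0)
f_xx = second_difference(lambda x: f(t0, x), x0)
print(f_tt, f_xx)             # the two values agree up to discretization error
print(f(0.0, x0), g(x0))      # initial position is reproduced
```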
14.9. The partial differential equation $f_t = f_{xx}$ is called the heat equation. Its
solution involves the normal distribution
$N(m, s)(x) = e^{-(x-m)^2/(2s^2)} / \sqrt{2\pi s^2}$
in probability theory. The number m is the average and s is the standard deviation.
14.10. If the initial heat g(x) = f(0, x) at time t = 0 is continuous and zero outside a
bounded interval [a, b], then
Theorem: $f_t = f_{xx}$ is solved by $f(t, x) = \int_a^b g(m) N(m, \sqrt{2t})(x)\, dm$.
Proof. For every fixed m, the function $N(m, \sqrt{2t})(x)$ solves the heat equation.

f=PDF[NormalDistribution[m,Sqrt[2t]],x]; Simplify[D[f,t]==D[f,{x,2}]]

Every Riemann sum approximation $\frac{b-a}{n} \sum_{k=1}^n g(m_k)$ of the integral of g defines a function
$f_n(t, x) = \frac{b-a}{n} \sum_{k=1}^n g(m_k) N(m_k, \sqrt{2t})(x)$ which solves the heat equation. So does
$f(t, x) = \lim_{n \to \infty} f_n(t, x)$. To check f(0, x) = g(x), we need $\int_{-\infty}^{\infty} N(m, s)(x)\, dx = 1$
and $\int_{-\infty}^{\infty} h(x) N(m, s)(x)\, dx \to h(m)$ for any continuous h as s → 0, which is proven later.
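The Riemann sum construction in the proof can be run directly. The sketch below assumes a tent-shaped initial heat g on [0, 2] and a midpoint rule with n = 2000 points; both are illustrative choices. It checks by finite differences that the sum satisfies the heat equation and that for small t it is close to g.

```python
import math

def normal_density(m, s, x):
    """N(m, s)(x) as defined in 14.9."""
    return math.exp(-(x - m)**2 / (2 * s**2)) / math.sqrt(2 * math.pi * s**2)

def g(m):
    """Tent-shaped initial heat: continuous and zero outside [0, 2]."""
    return max(0.0, 1.0 - abs(m - 1.0))

def heat_solution(t, x, a=0.0, b=2.0, n=2000):
    """Midpoint Riemann sum for the integral of g(m) N(m, sqrt(2t))(x) dm."""
    dm = (b - a) / n
    s = math.sqrt(2 * t)
    total = 0.0
    for k in range(n):
        m = a + (k + 0.5) * dm
        total += g(m) * normal_density(m, s, x)
    return total * dm

def heat_residual(t, x, d=1e-3):
    """Finite-difference value of f_t - f_xx at (t, x)."""
    f_t = (heat_solution(t + d, x) - heat_solution(t - d, x)) / (2 * d)
    f_xx = (heat_solution(t, x + d) - 2 * heat_solution(t, x)
            + heat_solution(t, x - d)) / d**2
    return f_t - f_xx

residual = heat_residual(0.5, 0.8)
near_initial = heat_solution(1e-4, 0.8)   # should be close to g(0.8)
print(residual, near_initial, g(0.8))
```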
14.11. For functions of three variables f(x, y, z) one can look at the partial differential
equation $\Delta f(x, y, z) = f_{xx} + f_{yy} + f_{zz} = 0$. It is called the Laplace equation and Δ is
called the Laplace operator. The operator appears also in one of the most important
partial differential equations, the Schrödinger equation
$i\hbar f_t = Hf = -\frac{\hbar^2}{2m} \Delta f + V(x) f$,
where $\hbar = h/(2\pi)$ is a scaled Planck constant and V(x) is the potential depending
on the position x and m is the mass. For $i\hbar f_t = Pf$ with $P = -i\hbar D$, the
solution f(x − t) is forward translation. The operator P is the momentum operator
in quantum mechanics. The Taylor formula tells that P generates translation.

Homework

Problem 14.1: Verify that for any constant b, the function
$f(x, t) = e^{-bt} \cos(x + t)$
satisfies the driven transport equation
$f_t(x, t) = f_x(x, t) - b f(x, t)$.
This PDE is sometimes called the advection equation with damping b.

Problem 14.2: We have seen that $f(t, x) = N(m, \sqrt{2t})(x) = e^{-(x-m)^2/(4t)} / \sqrt{4\pi t}$
solves the heat equation $f_t = f_{xx}$. Verify more generally that
$e^{-(x-m)^2/(at)} / \sqrt{a\pi t}$
solves the heat equation
$f_t = (a/4) f_{xx}$.

Problem 14.3: The eiconal equation $f_x^2 + f_y^2 = 1$ can be rewritten as
$||df|| = 1$, where $df = \nabla f = [f_x, f_y]^T$ is the gradient of f. (The gradient
is the transpose of the Jacobian matrix for the map $f: \mathbb{R}^2 \to \mathbb{R}$.) It is
an important equation in optics. Let f(x, y) be the distance to the circle
$x^2 + y^2 = 1$. Show that it satisfies the eiconal equation.

Problem 14.4: The differential equation

$f_t = f - x f_x - x^2 f_{xx}$
is a version of the Black-Scholes equation. Here f (x, t) is the price of
a call option and x is the stock price and t is time. Find a function
f (x, t) solving it which depends both on x and t. Hint: look first for
solutions f (x, t) = g(t) or f (x, t) = h(x) and then for functions of the
form f (x, t) = g(t)h(x).

Problem 14.5: The partial differential equation
$f_t + f f_x = f_{xx}$
is called Burgers equation and describes waves at the beach. In higher
dimensions, it leads to the Navier-Stokes equation which is used to describe
the weather. Verify that the function
$f(t, x) = \frac{\left(\frac{1}{t}\right)^{3/2} x\, e^{-\frac{x^2}{4t}}}{\sqrt{\frac{1}{t}}\, e^{-\frac{x^2}{4t}} + 1}$
is a solution of the Burgers equation. You better use technology.

Unit 15: Contradiction and Deformation

Seminar
15.1. We have already seen one proof technique, the “method of induction.” Other
proofs were done either by direct computations or by combining already known
theorems or inequalities. Today, we look at two new and fundamentally different
proof techniques. The first is the method “by contradiction.” The second method
is the “method of deformation.” Both methods are illustrated by a theorem.

15.2. The first theorem is one of the earliest results in mathematics. It is the Hippasus
theorem from 500 BC. It was a result which shocked the Pythagoreans so much
that Hippasus was killed for his discovery. That is at least what the rumors tell.

Theorem: The diagonal of a unit square has irrational length.

Proof. Assume the statement is false and the diagonal has rational length p/q. Then
by the Pythagorean theorem $2 = p^2/q^2$ or $2q^2 = p^2$. By the fundamental theorem of arithmetic,
the left hand side has an odd number of factors 2, the right hand side an even
number. This is a contradiction. The assumption must have been wrong.

15.3.
Problem A: Prove that the cube root of 2 is irrational.

Figure 1. $\sqrt{2}$ is irrational. Start by assuming the side length and
diagonal of the large yellow square are integers. Conclude that for the
strictly smaller orange square, the side length and diagonal are integers.

15.4. Note that the proof relied on the fundamental theorem of arithmetic which
assured that every integer has a unique prime factorization.
Problem B: Figure (1) is a geometric proof by contradiction which does
not need the fundamental theorem of arithmetic. Complete the proof. ¹

15.5. Proofs by contradiction can be dangerous. A flawed proof can “assume the contrary,
mess around with arguments, make a mistake somewhere and get a contradiction.
QED”. Better than a proof by contradiction is a constructive proof.
15.6. Here is a non-constructive proof which is amazing:
Theorem: There exist two irrational x, y such that $x^y$ is rational.
Proof: there are two possibilities. Either $z = \sqrt{2}^{\sqrt{2}}$ is rational or not. In the first
case, we have found an example where $x = y = \sqrt{2}$. In the second case, take x = z
and take $y = \sqrt{2}$. Now $x^y = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^2 = 2$ is rational and we have an example.
15.7. The second proof technique we see today is a deformation argument. To
illustrate it, take a closed $C^2$ curve in $\mathbb{R}^2$ without self intersections. We have defined
its curvature κ(t) already. For curves in $\mathbb{R}^2$, define the signed curvature K(t):
if the curve is parametrized so that |r'(t)| = 1 and T(t) = [cos(α(t)), sin(α(t))], then
K(t) = α'(t). Note that κ(t) = |T'(t)| = |[− sin(α(t)), cos(α(t))] α'(t)| = |K(t)|. Now
if we have a curve $r: [a, b] \to \mathbb{R}^2$, we can define the total curvature as $\int_a^b K(t)\, dt$.
By the fundamental theorem of calculus, this total curvature is the change of the
angle α(b) − α(a). Now, if the curve is closed, the initial and final angles have to differ
by a multiple of 2π. The Hopf Umlaufsatz tells that
Theorem: The total curvature of a simple closed curve is 2π or −2π.
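The theorem can be tested numerically on an ellipse. The sketch below is an illustration under assumptions: it uses the standard formula $K = (x'y'' - y'x'')/|r'|^3$ for the signed curvature of a curve that is not parametrized by arc length (this formula is not stated in the text above), so that the total curvature becomes the integral of $(x'y'' - y'x'')/|r'|^2$ over the parameter. The semi-axes and the midpoint rule are arbitrary choices.

```python
import math

def total_curvature_ellipse(a, b, n=2000):
    """Total signed curvature of r(t) = [a cos t, b sin t], 0 <= t <= 2 pi.

    Integrates K |r'| dt = (x'y'' - y'x'') / |r'|^2 dt with the midpoint rule.
    """
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        xp, yp = -a * math.sin(t), b * math.cos(t)       # r'(t)
        xpp, ypp = -a * math.cos(t), -b * math.sin(t)    # r''(t)
        total += (xp * ypp - yp * xpp) / (xp**2 + yp**2) * dt
    return total

turning = total_curvature_ellipse(3.0, 1.0)
print(turning, 2 * math.pi)
```

Traversing the same ellipse clockwise flips the sign of the integrand and gives −2π, which is the second case in the theorem.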

Figure 2. Four simple closed curves for which it is not obvious that
the total curvature is 2π.

15.8.
Problem C: a) Why is the total curvature not always 2π?
b) Formulate what happens in Figure (3).

¹ For more explanation, see https://www.youtube.com/watch?v=Ih16BIoR9eM

Figure 3. Hopf’s deformation proof: each picture shows the line
through r(s), r(t) and to the right the parameter (s, t). In the left col-
umn, where s = t, we deal with the tangent turning. We have to show
it turns by 2π. The next columns deform the situation where the path
through the parameter square is changed. In the very right column, we
twice turn the segment by π, in total 2π.

Homework
Exercises A-C are done in the seminar. This homework is due on Tuesday.

Problem 15.1 Prove by contradiction that $\sqrt{12}$ is irrational.

Problem 15.2 Prove by contradiction that $\log_{10}(2)$ is irrational. Here $\log_{10}$ is
the logarithm with respect to the base 10.

Problem 15.3 Verify the Hopf Umlaufsatz for a circle of radius 5, where
$r(t) = \begin{bmatrix} 5\cos(t) \\ 5\sin(t) \end{bmatrix}$.

Problem 15.4 The Umlaufsatz generalizes to polygons. We can not
assign a tangent to the vertices of the polygon but we can look at the
deformation proof in the case when the parameters s, t are not equal.
Look at the Hopf Umlaufsatz say in the case of a triangle with angles
α, β, γ. What does the deformation argument tell then?

Figure 4. Can you adapt the Hopf Umlaufsatz for triangles?

Problem 15.5 There is a variant of proof by contradiction which is proof

by infinite descent. It was used in proving a special case of Fermat’s
Last theorem. This special result tells that the equation $r^2 + s^4 = t^4$
has no solution with positive r, s, t. Look up and write down the proof of
this theorem.

Figure 5. Pierre de Fermat: cropped from a photo by Didier Descouens
showing the Monument to Pierre de Fermat by Alexandre Falguière in
Beaumont-de-Lomagne, Tarn-et-Garonne, France.

Unit 16: Chain rule

Lecture
16.1. Given a differentiable function $r: \mathbb{R}^m \to \mathbb{R}^p$, its derivative at x is the Jacobian
matrix $dr(x) \in M(p, m)$. If $f: \mathbb{R}^p \to \mathbb{R}^n$ is another function with $df(y) \in M(n, p)$,
we can combine them and form $f \circ r(x) = f(r(x)): \mathbb{R}^m \to \mathbb{R}^n$. The matrices $df(y) \in M(n, p)$
and $dr(x) \in M(p, m)$ combine to the matrix product df dr at a point. This
matrix is in M(n, m). The multi-variable chain rule is:

Theorem: $d(f \circ r)(x) = df(r(x))\, dr(x)$

16.2. For m = n = p = 1, the single variable calculus case, we have $df(x) = f'(x)$
and $(f \circ r)'(x) = f'(r(x))\, r'(x)$. In general, df is now a matrix rather than a number.
By checking a single matrix entry, we reduce to the case n = m = 1. In that case,
$f: \mathbb{R}^p \to \mathbb{R}$ is a scalar function. While df is a row vector, we define the column
vector $\nabla f = df^T = [f_{x_1}, f_{x_2}, \dots, f_{x_p}]^T$. If $r: \mathbb{R} \to \mathbb{R}^p$ is a curve, we write
$r'(t) = [x_1'(t), \dots, x_p'(t)]^T$ instead of dr(t). The symbol ∇ is addressed also as “nabla”. ¹ The
special case n = m = 1 is:

Theorem: $\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$.

16.3. Proof. $\frac{d}{dt} f(x_1(t), x_2(t), \dots, x_p(t))$ is the limit h → 0 of

$[f(x_1(t+h), x_2(t+h), \dots, x_p(t+h)) - f(x_1(t), x_2(t), \dots, x_p(t))]/h$
$= [f(x_1(t+h), x_2(t+h), \dots, x_p(t+h)) - f(x_1(t), x_2(t+h), \dots, x_p(t+h))]/h$
$+ [f(x_1(t), x_2(t+h), \dots, x_p(t+h)) - f(x_1(t), x_2(t), \dots, x_p(t+h))]/h + \cdots$
$+ [f(x_1(t), x_2(t), \dots, x_p(t+h)) - f(x_1(t), x_2(t), \dots, x_p(t))]/h$

which is (1D chain rule) in the limit h → 0 the sum $f_{x_1}(x) x_1'(t) + \cdots + f_{x_p}(x) x_p'(t)$.
16.4. Proof of the general case: Let $h = f \circ r$. The entry ij of the Jacobian matrix
dh(x) is $dh_{ij}(x) = \partial_{x_j} h_i(x) = \partial_{x_j} f_i(r(x))$. The case of the entry ij reduces, with $t = x_j$
and $f_i$ in the role of f, to the case when r(t) is a curve and f(x) is a scalar function. This is the
case we have proven already.
¹ Etymology tells that the symbol is inspired by an Egyptian or Phoenician harp.

Example

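16.5. Assume a ladybug walks on a circle $r(t) = \begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix}$ and $f(x, y) = x^2 - y^2$ is the
temperature at the position (x, y). Then f(r(t)) is the temperature the bug experiences at time t
and $\frac{d}{dt} f(r(t))$ is the rate of change of the temperature.
We can write $f(r(t)) = \cos^2(t) - \sin^2(t) = \cos(2t)$. Now, $\frac{d}{dt} f(r(t)) = -2\sin(2t)$. The
gradient of f and the velocity are $\nabla f(x, y) = \begin{bmatrix} 2x \\ -2y \end{bmatrix}$, $r'(t) = \begin{bmatrix} -\sin(t) \\ \cos(t) \end{bmatrix}$. Now
$\nabla f(r(t)) \cdot r'(t) = \begin{bmatrix} 2\cos(t) \\ -2\sin(t) \end{bmatrix} \cdot \begin{bmatrix} -\sin(t) \\ \cos(t) \end{bmatrix} = -4\cos(t)\sin(t) = -2\sin(2t)$.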
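The ladybug computation can be replayed in a few lines of Python. This is a minimal sketch: the function names and the sample time `t0` are illustrative. It compares the chain rule value with the closed form −2 sin(2t) and with a finite-difference derivative of t → f(r(t)).

```python
import math

def r(t):
    return (math.cos(t), math.sin(t))        # position on the circle

def r_prime(t):
    return (-math.sin(t), math.cos(t))       # velocity

def f(x, y):
    return x**2 - y**2                       # temperature

def grad_f(x, y):
    return (2 * x, -2 * y)                   # gradient of the temperature

def chain_rule_rate(t):
    """Dot product of the gradient at r(t) with the velocity r'(t)."""
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy

def numeric_rate(t, h=1e-6):
    """Central difference of the composite function t -> f(r(t))."""
    return (f(*r(t + h)) - f(*r(t - h))) / (2 * h)

t0 = 0.7
print(chain_rule_rate(t0), -2 * math.sin(2 * t0), numeric_rate(t0))
```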

Figure 1. If f (x, y) is a height, the rate of change d/dtf (r(t)) is the

gain of height the bug climbs in unit time. It depends on how fast the
bug walks and in which direction relative to the gradient ∇f it walks.

Illustrations
16.6. The case n = m = 1 is extremely important. The chain rule d/dtf (r(t)) =
∇f (r(t)) · r0 (t) tells that the rate of change of the potential energy f (r(t)) at the
position r(t) is the dot product of the force F = ∇f (r(t)) at the point and the velocity
with which we move. The right hand side is power = force times velocity. We will
use this later in the fundamental theorem of line integrals.
16.7. If $f, g: \mathbb{R}^m \to \mathbb{R}^m$, then $f \circ g$ is again a map from $\mathbb{R}^m$ to $\mathbb{R}^m$. We can also iterate
a map like x → f(x) → f(f(x)) → f(f(f(x))) . . . . The derivative $df^n(x)$ is by the
chain rule the product $df(f^{n-1}(x)) \cdots df(f(x))\, df(x)$ of Jacobian matrices. The number
$\lambda(x) = \limsup_{n \to \infty} (1/n) \log(|df^n(x)|)$ is called the Lyapunov exponent of the map f
at the point x. It measures the amount of chaos, the “sensitive dependence on initial
conditions” of f. These numbers are hard to estimate mathematically. Already for
simple examples like the Chirikov map f([x, y]) = [2x − y + c sin(x), x], one can
measure positive entropy S(c). A conjecture of Sinai tells that the entropy
of the map is positive for large c. ² Measurements show that this entropy
$S(c) = \int_0^{2\pi} \int_0^{2\pi} \lambda(x, y)\, dx\, dy / (4\pi^2)$ satisfies S(c) ≥ log(c/2). The conjecture is still open.
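To make the product of Jacobians concrete, here is a hedged Python sketch that estimates a Lyapunov exponent along one orbit. It is a simplification and not the lecture's method: instead of the matrix norm $|df^n(x)|$ it tracks the growth of a single tangent vector, renormalized in every step, which is a common numerical stand-in. The orbit is reduced modulo 2π, and the starting point, the value c = 4 and the linear test map are arbitrary assumptions.

```python
import math

def lyapunov_estimate(step, jacobian, point, n):
    """Estimate (1/n) log |df^n(x) v| for the tangent vector v = (1, 0).

    The vector is renormalized after every step and the logarithms of the
    stretching factors are accumulated, which avoids overflow.
    """
    x, y = point
    vx, vy = 1.0, 0.0
    log_growth = 0.0
    for _ in range(n):
        (a, b), (c, d) = jacobian(x, y)
        vx, vy = a * vx + b * vy, c * vx + d * vy   # apply df at the current point
        norm = math.hypot(vx, vy)
        log_growth += math.log(norm)
        vx, vy = vx / norm, vy / norm
        x, y = step(x, y)                            # move along the orbit
    return log_growth / n

def chirikov_step(c):
    def step(x, y):
        return ((2 * x - y + c * math.sin(x)) % (2 * math.pi), x)
    return step

def chirikov_jacobian(c):
    def jacobian(x, y):
        return ((2 + c * math.cos(x), -1.0), (1.0, 0.0))
    return jacobian

# Sanity check on a linear map, where the exponent is the log of the
# largest eigenvalue of the matrix [[2, 1], [1, 1]].
def cat_step(x, y):
    return ((2 * x + y) % 1.0, (x + y) % 1.0)

def cat_jacobian(x, y):
    return ((2.0, 1.0), (1.0, 1.0))

cat_estimate = lyapunov_estimate(cat_step, cat_jacobian, (0.1, 0.2), 1000)
cat_exact = math.log((3 + math.sqrt(5)) / 2)

chirikov_estimate = lyapunov_estimate(
    chirikov_step(4.0), chirikov_jacobian(4.0), (0.3, 0.2), 10000)
print(cat_estimate, cat_exact)
print(chirikov_estimate)
```

The value printed for the Chirikov map depends on the starting point and on the orbit length; averaging over many starting points would be needed to approximate the entropy integral S(c).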
16.8. If H(x, y) is a function called the Hamiltonian and $x'(t) = H_y(x, y)$, $y'(t) = -H_x(x, y)$,
then $\frac{d}{dt} H(x(t), y(t)) = H_x x' + H_y y' = 0$. This can be interpreted as energy conservation.
We see that a Hamiltonian differential equation always preserves the energy. For
the pendulum, $H(x, y) = y^2/2 - \cos(x)$, we have $x' = y$, $y' = -\sin(x)$ or $x'' = -\sin(x)$.
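Energy conservation for the pendulum can be observed numerically. The sketch below integrates x' = y, y' = −sin(x) with the classical Runge-Kutta method of order four, which is an assumption of this example and not something the lecture prescribes. The initial condition, the step size and the number of steps are arbitrary. A numerical integrator conserves H only approximately, so the test is that the drift is tiny.

```python
import math

def H(x, y):
    """Pendulum Hamiltonian from the text."""
    return y**2 / 2 - math.cos(x)

def velocity(x, y):
    # x' = H_y = y,  y' = -H_x = -sin(x)
    return y, -math.sin(x)

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for the Hamiltonian system."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + dt / 2 * k1x, y + dt / 2 * k1y)
    k3x, k3y = velocity(x + dt / 2 * k2x, y + dt / 2 * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

x, y = 1.0, 0.0                 # released at angle 1 with zero velocity
energy_start = H(x, y)
for _ in range(10000):          # integrate up to time 10
    x, y = rk4_step(x, y, 0.001)
energy_end = H(x, y)
print(energy_start, energy_end)
```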
² To generate orbits, see http://www.math.harvard.edu/~knill/technology/chirikov/.
Figure 2. The map $f([x, y]) = [x^2 - x/2 - y, x]$ is a Henon map. We
see some orbits. The map f([x, y]) = [2x − y + 4 sin(x), x] on the right
appeared in the first hourly. The torus $\mathbb{T}^2 = \mathbb{R}^2/(2\pi\mathbb{Z})^2$ is filled with a
blue “stochastic sea” containing red “stable islands”.

16.9. The chain rule is useful to get derivatives of inverse functions. For example,
$1 = \frac{d}{dx} x = \frac{d}{dx} \sin(\arcsin(x)) = \cos(\arcsin(x)) \arcsin'(x)$
which then gives $\arcsin'(x) = 1/\sqrt{1 - \sin^2(\arcsin(x))} = 1/\sqrt{1 - x^2}$.
16.10. Assume $f(x, y) = x^3 y + x^5 y^4 - 2 - \sin(x - y) = 0$ is a curve. We can not solve
for y. Still, we can assume f(x, y(x)) = 0. Differentiation using the chain rule gives
$f_x(x, y(x)) + f_y(x, y(x))\, y'(x) = 0$. Therefore
$y'(x) = -\frac{f_x(x, y(x))}{f_y(x, y(x))}$.
In the above example, the point (x, y) = (1, 1) is on the curve. Now $f_x(1, 1) = 3 + 5 - 1 = 7$
and $f_y(1, 1) = 1 + 4 + 1 = 6$. So, y'(1) = −7/6. This is called implicit
differentiation. With it, we could compute the derivative of a function which was not
known explicitly.
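The slope −7/6 can be cross-checked without the formula. The sketch below solves f(x, y) = 0 for y at two nearby values of x with a few Newton steps in y and takes a difference quotient. The helper `solve_y`, the starting guess and the step `d` are illustrative assumptions.

```python
import math

def f(x, y):
    return x**3 * y + x**5 * y**4 - 2 - math.sin(x - y)

def f_x(x, y):
    return 3 * x**2 * y + 5 * x**4 * y**4 - math.cos(x - y)

def f_y(x, y):
    return x**3 + 4 * x**5 * y**3 + math.cos(x - y)

def solve_y(x, y_guess=1.0, steps=20):
    """Newton iteration in y for the equation f(x, y) = 0 at fixed x."""
    y = y_guess
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y

# Implicit differentiation formula at the point (1, 1)
slope_formula = -f_x(1.0, 1.0) / f_y(1.0, 1.0)

# Difference quotient of the implicitly defined function y(x)
d = 1e-4
slope_numeric = (solve_y(1.0 + d) - solve_y(1.0 - d)) / (2 * d)
print(slope_formula, slope_numeric)
```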
16.11. The implicit function theorem assures that a differentiable implicit function
g(x) exists near a root (a, b) of a differentiable function f(x, y).

Theorem: If f(a, b) = 0 and $f_y(a, b) \neq 0$, there exist c > 0 and a function
$g \in C^1([a - c, a + c])$ with g(a) = b and f(x, g(x)) = 0.

Proof. We can assume $f_y(a, b) > 0$ (otherwise replace f with −f). Let c be so small that
for fixed x ∈ [a − c, a + c], the function y ∈ [b − c, b + c] → h(y) = f(x, y) has the property
h(b − c) < 0 and h(b + c) > 0 and h'(y) ≠ 0 in [b − c, b + c].
The intermediate value theorem for h now assures a unique root z = g(x) of h near
b. The chain rule formula above then assures that for a − c < x < a + c, the differential
quotient [g(x + k) − g(x)]/k written down for g has the limit $-f_x(x, g(x))/f_y(x, g(x))$ as k → 0.

P.S. We can get the root of h by applying Newton steps $T(y) = y - h(y)/h'(y)$.
Taylor (seen in the next class) shows the error is squared in every step. The Newton
step $T(y) = y - dh(y)^{-1} h(y)$ works also in arbitrary dimensions. One can prove the
implicit function theorem by just establishing that $T = \mathrm{Id} - dh^{-1} h$ is a contraction
near the root and then using the Banach fixed point theorem to get a fixed point of T, which
is a root of h.
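The squaring of the error can be watched in a small experiment. This sketch applies the Newton step to the sample function $h(y) = y^3 - 2$, chosen here only for illustration, and records the distance to the known root $2^{1/3}$ after each step.

```python
def h(y):
    return y**3 - 2          # sample function with root 2^(1/3)

def h_prime(y):
    return 3 * y**2

def newton_step(y):
    """One Newton step T(y) = y - h(y)/h'(y)."""
    return y - h(y) / h_prime(y)

root = 2 ** (1 / 3)
y = 1.5                       # starting guess (arbitrary)
errors = []
for _ in range(6):
    y = newton_step(y)
    errors.append(abs(y - root))

for step_number, error in enumerate(errors, start=1):
    print(step_number, error)  # each error is roughly the square of the previous one
```

Once the error reaches the size of floating point rounding, it stops decreasing, so only the first few entries show the squaring cleanly.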

Figure 3. The Newton step. (Labels in the figure: h(x) and x − T(x).)

Homework

Problem 16.1: Let $r(t) = [3t + \cos(t), t + 4\sin(t)]^T$ be a curve and
$f([x, y]^T) = [x^3 + y, x + 2y + y^3]^T$ be a coordinate change. a) Compute
v = r'(0) at t = 0, then df(x, y) and A = df(r(0)) and df(r(0)) r'(0) = Av.
b) Compute R(t) = f(r(t)) first, then find w = R'(0). It should agree
with a).

Problem 16.2: a) Define the function $f(x, y) = x \cdot y$ from $\mathbb{R}^2$ to $\mathbb{R}$.
If both x and y are functions of t we get the curve $r(t) = [x(t), y(t)]^T$.
What does the chain rule tell for t → f(r(t)) from $\mathbb{R}$ to $\mathbb{R}$?
b) Do the same for the function f(x, y) = x/y. What rule do you get now?

Problem 16.3: The surface $f(x, y, z) = x^2 + y^2/4 + z^2/9 = 4 + 1/4 + 1/9$
is an ellipsoid. Compute $z_x(x, y)$ at the point (x, y, z) = (2, 1, 1).

Problem 16.4: Consider the Hénon map f ([x, y]T ) = [x2 − x4 − y, x]T .
Compute either d(f ◦ f )([1, 1]T ) or df (f ([1, 1]T ))df ([1, 1]T ). The chain rule
tells it is the same matrix.

Problem 16.5: Apply the Newton step 3 times starting with x = 2 to
solve the equation $x^2 - 2 = 0$.

Figure 4. Some orbits of the Henon map f ([x, y]) = [x2 − x4 − y, x].

Unit 17: Taylor approximation

Lecture
17.1. Given a function $f: \mathbb{R}^m \to \mathbb{R}^n$, its derivative df(x) is the Jacobian matrix. For
every $x \in \mathbb{R}^m$, we can use the matrix df(x) and a vector $v \in \mathbb{R}^m$ to get
$D_v f(x) = df(x) v \in \mathbb{R}^n$. For fixed v, this defines a map $x \in \mathbb{R}^m \to df(x) v \in \mathbb{R}^n$, like the original
f. Because $D_v$ is a map on X = { all functions from $\mathbb{R}^m \to \mathbb{R}^n$ }, one calls it an
operator. The Taylor formula $f(x + t) = e^{Dt} f(x)$ holds in arbitrary dimensions:
Theorem: $f(x + tv) = e^{D_v t} f(x) = f(x) + \frac{D_v t f(x)}{1!} + \frac{D_v^2 t^2 f(x)}{2!} + \dots$

17.2. Proof. It is the single variable Taylor formula on the line x + tv. The directional derivative
$D_v f$ is there the usual derivative, as $\lim_{t \to 0} [f(x + tv) - f(x)]/t = D_v f(x)$. Technically,
we need the sum to converge as well, as it does for functions built from polynomials, sin, cos, exp.
17.3. The Taylor formula can be written down using successive derivatives $df, d^2 f, d^3 f$
also, which are then called tensors. In the scalar case n = 1, the first derivative df(x)
leads to the gradient ∇f(x), the second derivative $d^2 f(x)$ to the Hessian matrix
H(x) which is a bilinear form acting on pairs of vectors. The third derivative $d^3 f(x)$
then acts on triples of vectors etc. One can still write as in one dimension
Theorem: $f(x) = f(x_0) + f'(x_0)(x - x_0) + f''(x_0) \frac{(x - x_0)^2}{2!} + \cdots$

if we write $f^{(k)} = d^k f$. For a polynomial, this just means that we first write down the
constant, then all linear terms, then all quadratic terms, then all cubic terms etc.
17.4. Assume $f: \mathbb{R}^m \to \mathbb{R}$ and stop the Taylor series after the first step. We get
$L(x_0 + v) = f(x_0) + \nabla f(x_0) \cdot v$.
It is customary to write this with $x = x_0 + v$, $v = x - x_0$ as
$L(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0)$.

This function is called the linearization of $f$. The kernel of $L - f(x_0)$ is a linear
manifold approximating the surface $\{x \mid f(x) - f(x_0) = 0\}$. If $f: \mathbb{R}^m \to \mathbb{R}^n$, then
the same can be applied to every component $f_i$ of $f$, with $1 \le i \le n$. One can not
stress enough the importance of this linearization. 1

1 Again: the linearization idea is of utmost importance because it brings in linear algebra.

17.5. If we stop the Taylor series after two steps, we get the function $Q(x + v) =
f(x) + df(x) \cdot v + v \cdot d^2 f(x) \cdot v/2$. The matrix $H(x) = d^2 f(x)$ is called the Hessian
matrix at the point $x$. It is also customary here to eliminate $v$ by writing $x = x_0 + v$. The function
$Q(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0) + (x - x_0) \cdot H(x_0)(x - x_0)/2$

is called the quadratic approximation of $f$. The kernel of $Q - f(x_0)$ is the quadratic
manifold $Q(x) - f(x_0) = x \cdot Bx + Ax = 0$, where $A = df$ and $B = d^2 f/2$ and where $x$ stands for $x - x_0$. It
approximates the surface $\{x \mid f(x) - f(x_0) = 0\}$ even better than the linear one. If
$|x - x_0|$ is of the order $\epsilon$, then $|f(x) - L(x)|$ is of the order $\epsilon^2$ and $|f(x) - Q(x)|$ is of
the order $\epsilon^3$. This follows from the exact Taylor with remainder formula. 2
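The error orders can be observed numerically. The following is a minimal sketch, assuming the polynomial $f(x, y) = x^3 y^2 + x + y^3$ at the point $(1, 1)$ with gradient $[4, 5]$ and Hessian entries $6, 6, 8$. The direction $(0.6, 0.8)$ and the names `errors`, `ratio_linear`, `ratio_quadratic` are illustrative choices. Halving the distance to the base point should divide the linear error by about 4 and the quadratic error by about 8.

```python
# Sketch: error of linear and quadratic approximation at (1, 1).

def f(x, y):
    return x**3 * y**2 + x + y**3

X0, Y0 = 1.0, 1.0
F0 = 3.0                      # f(1, 1)
GRAD = (4.0, 5.0)             # (f_x, f_y) at (1, 1)
HESS = ((6.0, 6.0), (6.0, 8.0))

def L(x, y):
    """Linearization of f at (1, 1)."""
    dx, dy = x - X0, y - Y0
    return F0 + GRAD[0] * dx + GRAD[1] * dy

def Q(x, y):
    """Quadratic approximation of f at (1, 1)."""
    dx, dy = x - X0, y - Y0
    quad = HESS[0][0] * dx * dx + 2 * HESS[0][1] * dx * dy + HESS[1][1] * dy * dy
    return L(x, y) + quad / 2

def errors(eps, direction=(0.6, 0.8)):
    """Return (|f - L|, |f - Q|) at distance eps from (1, 1)."""
    x = X0 + eps * direction[0]
    y = Y0 + eps * direction[1]
    return abs(f(x, y) - L(x, y)), abs(f(x, y) - Q(x, y))

lin1, quad1 = errors(1e-2)
lin2, quad2 = errors(5e-3)
ratio_linear = lin1 / lin2        # expected to be close to 2**2
ratio_quadratic = quad1 / quad2   # expected to be close to 2**3
print(ratio_linear, ratio_quadratic)
```

The ratios are only approximately 4 and 8 because higher order terms still contribute at finite distance.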

Figure 1. The manifolds $f(x, y) = C$, $L(x, y) = C$ and $Q(x, y) = C$
for $C = f(x_0, y_0)$ pass through the point $(x_0, y_0)$. To the right, we see
the situation for $f(x, y, z) = C$. We see the best linear approximation
and quadratic approximation. The gradient is perpendicular.

17.6. To get the tangent plane to a surface $f(x) = C$ one can just look at the linear
manifold $L(x) = C$. However, there is a better method:
The tangent plane to a surface $f(x, y, z) = C$ at $(x_0, y_0, z_0)$ is $ax + by + cz = d$,
where $[a, b, c]^T = \nabla f(x_0, y_0, z_0)$ and $d = ax_0 + by_0 + cz_0$.

17.7. This follows from the fundamental theorem of gradients:

Theorem: The gradient $\nabla f(x_0)$ of $f: \mathbb{R}^m \to \mathbb{R}$ is perpendicular to the
surface $S = \{f(x) = f(x_0) = C\}$ at $x_0$.

Proof. Let $r(t)$ be a curve on $S$ with $r(0) = x_0$. The chain rule assures $\frac{d}{dt} f(r(t)) =
\nabla f(r(t)) \cdot r'(t)$. But because $f(r(t)) = C$ is constant, this is zero, so that $r'(t)$ is
perpendicular to the gradient. As this works for any curve, we are done.
Examples
17.8. Let $f: \mathbb{R}^2 \to \mathbb{R}$ be given as $f(x, y) = x^3 y^2 + x + y^3$. What is the quadratic
approximation at $(x_0, y_0) = (1, 1)$? We have $df(1, 1) = [4, 5]$ and

$\nabla f(1, 1) = \begin{bmatrix} f_x \\ f_y \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad H(1, 1) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ 6 & 8 \end{bmatrix}$.

The linearization is $L(x, y) = 4(x - 1) + 5(y - 1) + 3$. The quadratic approximation
is $Q(x, y) = 3 + 4(x - 1) + 5(y - 1) + 6(x - 1)^2/2 + 12(x - 1)(y - 1)/2 + 8(y - 1)^2/2$.
This is the situation displayed to the left in Figure (1). For $v = [7, 2]^T$, the directional
derivative is $D_v f(1, 1) = \nabla f(1, 1) \cdot v = [4, 5]^T \cdot [7, 2]^T = 38$. The Taylor expansion given
at the beginning is a finite series because $f$ is a polynomial: $f([1, 1] + t[7, 2]) =
f(1 + 7t, 1 + 2t) = 3 + 38t + 247t^2 + 1023t^3 + 1960t^4 + 1372t^5$.

2 If $f \in C^{n+1}$, then $f(x + t) = \sum_{k=0}^{n} f^{(k)}(x) t^k/k! + \int_0^t (t - s)^n f^{(n+1)}(x + s)\, ds/n!$ (prove this by induction!)
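The finite Taylor expansion along the line $(1 + 7t, 1 + 2t)$ can be reproduced by plain polynomial arithmetic. The sketch below represents a polynomial in $t$ as a list of coefficients, lowest degree first. The helper names `poly_add`, `poly_mul` and `poly_pow` are hypothetical and not part of the notes.

```python
# Sketch: expand f(1 + 7t, 1 + 2t) for f(x, y) = x^3 y^2 + x + y^3.

def poly_add(a, b):
    """Add two coefficient lists of possibly different length."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

def poly_mul(a, b):
    """Multiply two coefficient lists."""
    result = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            result[i + j] += u * v
    return result

def poly_pow(a, n):
    """Raise a coefficient list to a non-negative integer power."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, a)
    return result

x = [1, 7]   # x(t) = 1 + 7t
y = [1, 2]   # y(t) = 1 + 2t

coeffs = poly_add(poly_add(poly_mul(poly_pow(x, 3), poly_pow(y, 2)), x), poly_pow(y, 3))
print(coeffs)  # [3, 38, 247, 1023, 1960, 1372]
```

The coefficient of $t$ is the directional derivative 38 and the coefficient of $t^2$ is $v \cdot H(1,1)v/2 = 494/2 = 247$.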
17.9. For $f(x, y, z) = -x^4 + x^2 + y^2 + z^2$, the gradient and Hessian at $(1, 1, 1)$ are

$\nabla f(1, 1, 1) = \begin{bmatrix} f_x \\ f_y \\ f_z \end{bmatrix} = \begin{bmatrix} -2 \\ 2 \\ 2 \end{bmatrix}, \quad H(1, 1, 1) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} -10 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.

The linearization is $L(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1)$. The quadratic
approximation
$Q(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1) + (-10(x - 1)^2 + 2(y - 1)^2 + 2(z - 1)^2)/2$
is the situation displayed to the right in Figure (1).
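Gradient and Hessian entries like these can be cross-checked with finite differences. This is a rough numerical sketch, not a replacement for the symbolic computation. The step sizes `h` and the function names `gradient` and `hessian` are assumptions made for the illustration.

```python
# Sketch: central differences for f(x, y, z) = -x^4 + x^2 + y^2 + z^2 at (1, 1, 1).

def f(x, y, z):
    return -x**4 + x**2 + y**2 + z**2

def gradient(func, point, h=1e-5):
    """Approximate the gradient by central differences."""
    grad = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        grad.append((func(*plus) - func(*minus)) / (2 * h))
    return grad

def hessian(func, point, h=1e-4):
    """Approximate the Hessian by second order central differences."""
    n = len(point)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            pp, pm, mp, mm = (list(point) for _ in range(4))
            pp[i] += h; pp[j] += h
            pm[i] += h; pm[j] -= h
            mp[i] -= h; mp[j] += h
            mm[i] -= h; mm[j] -= h
            hess[i][j] = (func(*pp) - func(*pm) - func(*mp) + func(*mm)) / (4 * h * h)
    return hess

grad = gradient(f, (1.0, 1.0, 1.0))
hess = hessian(f, (1.0, 1.0, 1.0))
print([round(g, 4) for g in grad])
print([[round(entry, 3) for entry in row] for row in hess])
```

Up to rounding, the output should agree with the gradient $[-2, 2, 2]^T$ and the diagonal Hessian with entries $-10, 2, 2$.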
17.10. What is the tangent plane to the surface $f(x, y, z) = 1/10$ for
$f(x, y, z) = 10z^2 - x^2 - y^2 + 100x^4 - 200x^6 + 100x^8 - 200x^2 y^2 + 200x^4 y^2 + 100y^4$
at the point $(x, y, z) = (0, 0, 1/10)$? The gradient is $\nabla f(0, 0, 1/10) = [0, 0, 2]^T$. The
tangent plane equation is $2z = d$, where the constant $d$ is obtained by plugging in the
point. We end up with $2z = 2/10$. The linearization is $L(x, y, z) = 1/10 + 2(z - 1/10)$.
17.11. P.S. The following remark should maybe be skipped as many objects have not been properly introduced. The
exterior derivative d for example will appear in the form of grad,curl,div later on and d2 = 0 in the form curl(grad(f )) = 0.
The quite deep remark illustrates how important the topic of Taylor series is if it is taken seriously.
The derivative $d$ acts on anti-symmetric tensors (= forms), where $d^2 = 0$. A vector field $X$ then defines a Lie derivative
$L_X = d\iota_X + \iota_X d = (d + \iota_X)^2 = D_X^2$ with interior product $\iota_X$. For scalar functions and the constant field $X(x) = v$, one
gets the directional derivative $D_v = \iota_X d$. The projection $\iota_X$ in a specific direction can be replaced with the transpose
$d^*$ of $d$. Rather than transport along $X$, the signal now radiates everywhere. The operator $d + \iota_X$ becomes then the
Dirac operator $D = d + d^*$ and its square is the Laplacian $L = (d + d^*)^2 = dd^* + d^*d$. The wave equation $f_{tt} = -Lf$
can be written as $(\partial_t^2 + D^2)f = (\partial_t - iD)(\partial_t + iD)f = 0$ which has the solution $ae^{iDt} + be^{-iDt}$. Using the Euler formula
$e^{iDt} = \cos(Dt) + i\sin(Dt)$ one gets the explicit solutions $f(t) = f(0)\cos(Dt) + iD^{-1}f_t(0)\sin(Dt)$ of the wave equation.
It gets more exciting: by packing the initial position and velocity into a complex wave $\psi(0, x) = f(0, x) + iD^{-1}f_t(0, x)$,
we have $\psi(t, x) = e^{iDt}\psi(0, x)$. The wave equation is solved by a Taylor formula, which solves a Schrödinger
equation for $D$ and the classical Taylor formula is the Schrödinger equation for $D_X$. This works in any
framework featuring a derivative $d$, like finite graphs, where Taylor resembles a Feynman path integral, a sort of
Taylor expansion used by physicists to compute complicated particle processes.
The Taylor formula shows that the directional derivative $D_v$ generates translation by $-v$. In physics, the operator
$P = -i\hbar D_v$ is called the momentum operator associated to the vector $v$. The Schrödinger equation $i\hbar f_t = Pf$
has then the solution $f(x - tv)$ which means that the solution at time $t$ is the initial condition translated by $tv$. This
generalizes to the Lie derivative $L_X$ given by Cartan's magic formula as $L_X = D_X^2$ acting on forms defined by a
vector field $X$. For the analog $L = D^2$, the motion is not channeled in a determined direction $X$ (this is a photon) but
spreads (this is a wave) in all directions leading to the wave equation. We have just seen both the “photon picture” $L_X$
as well as the “wave picture” $L$ of light. And whether it is particle or wave, it is all just Taylor.

Homework

Problem 17.1: Evaluate without technology the cube root of 1002 using
quadratic approximation. Especially look how close you are to the real
value.

Problem 17.2: Compute without a computer the square root of 102

using quadratic approximation. Also here, look how close you get to the
actual value.

Problem 17.3: Given $g(x, y) = (6y^2 - 5)^2 (x^2 + y^2 - 1)^2$, define the surface
$S$ by $f(x, y, z) = g(x, y) + g(y, z) + g(z, x) = 3$. The following equation
could be derived with the chain rule. You can take this for granted:
$\nabla f(1, -1, 1) = \begin{bmatrix} g_x(1, -1) + g_y(1, 1) \\ g_x(-1, 1) + g_y(1, -1) \\ g_x(1, 1) + g_y(-1, 1) \end{bmatrix}$.
Using this, find the tangent plane to $S$ at $(1, -1, 1)$.

Problem 17.4: a) Find the tangent plane to the surface $f(x, y, z) =
\sqrt{xyz} = 60$ at $(x, y, z) = (100, 36, 1)$. b) Estimate $\sqrt{100.1 \cdot 36.1 \cdot 0.999}$
using linear approximation (compute $L(x, y, z)$ rather than $f(x, y, z)$).

Problem 17.5: a) At which of the points P, Q, R, S, T, . . . , Y does ∇f (x)

have maximal length?
b) At which of the points is fx > 0 and fy = 0?

Figure 2. [Contour map of $f$ with axes $x$ and $y$ each running from $-4$ to $4$ and marked points P, Q, R, S, T, U, V, W, X, Y.]


Unit 18: Number Magic

Seminar
18.1. In this seminar, we see how calculus can help to compute things effectively and
also hope to get insight into topics which are of a more number theoretical nature. To
find the cube root of 10 for example, we have
$10^{1/3} \sim 8^{1/3} + \frac{2}{3 \cdot 8^{2/3}} = 2 + \frac{2}{12} = 2.1666\ldots$.
The actual value is 2.15443.
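The estimate is the linearization of $x^{1/3}$ at a nearby perfect cube $c^3$, namely $c + (x - c^3)/(3c^2)$. A small sketch of this recipe follows. The function name `cube_root_linear` is hypothetical.

```python
# Sketch: linear approximation of a cube root near a perfect cube c**3.

def cube_root_linear(x, c):
    """Linearize t -> t**(1/3) at t = c**3 and evaluate at x."""
    return c + (x - c**3) / (3 * c**2)

estimate = cube_root_linear(10, 2)   # 2 + 2/12
actual = 10 ** (1 / 3)
relative_error = (estimate - actual) / actual
print(estimate, actual, relative_error)
```

The estimate overshoots because the cube root function is concave, so its graph lies below its tangent line.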

18.2.
Problem A: Find the cube root of 999999 using linear approximation.

Figure 1. The error of the linear approximation when computing
square roots and cube roots is in the 5 percent range.

18.3. In the exam problem 1, you have proven $1 + 2 + 2^2 + \cdots + 2^n = 2^{n+1} - 1$. This
is a special case of the geometric series formula
$1 + a + a^2 + \cdots + a^n = \frac{1 - a^{n+1}}{1 - a}$.
Of course, we could also prove this formula by induction. Better do it directly:
Problem B: Verify the geometric series formula by multiplying with $1 - a$.

18.4. These were all finite sums but seeing the pattern allows us to take a limit and
compute the infinite series:
Problem C: For which $a$ is $1 + a + a^2 + a^3 + \cdots = \frac{1}{1 - a}$ valid?

18.5. Recall the definition of Taylor series and answer the following trick question:
Problem D: What is the Taylor series of $f(x) = \frac{1}{1 - x}$ at $x_0 = 0$?

18.6. How can you get from the last exercise the following identity?
Problem E: $-\log(1 - x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots$.

18.7. We can also differentiate to verify the formula $\sum_{n=1}^{\infty} n x^{n-1} = 1/(1 - x)^2$ and so
$\sum_{n=1}^{\infty} n x^n = \frac{x}{(1 - x)^2}$.
This function is called $\mathrm{Li}_{-1}(x)$.
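The closed form can be sanity-checked with a partial sum. The sketch below uses the sample value $x = 1/3$, which is an arbitrary choice for illustration, and exact rational arithmetic. The cutoff of 80 terms is also an assumption, chosen so that the neglected tail is far below floating point precision.

```python
# Sketch: compare a partial sum of n * x**n with x / (1 - x)**2 at x = 1/3.
from fractions import Fraction

def partial_sum(x, terms):
    """Exact value of sum_{n=1}^{terms} n * x**n for a rational x."""
    return sum(n * x**n for n in range(1, terms + 1))

x = Fraction(1, 3)
closed_form = x / (1 - x) ** 2          # equals 3/4
approximation = partial_sum(x, 80)
print(float(approximation), float(closed_form))
```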

Problem F: What is the numerical value of
$\frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \frac{5}{32} + \cdots$ ?

18.8. How come that great number theorists like Leonhard Euler or Godfrey Hardy
were also masters in calculus? The reason is that many results of number theoretic
nature have intimate relations with calculus. Let us look at the following problem:
Problem G: What is the value of the Leibniz series
$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$ ?
18.9. Hint: compute first the Taylor series of $f(x) = \arctan(x)$ using the Taylor series
of $1/(1 + x^2)$ (the latter is a geometric series), then evaluate $f$ at $x = 1$.
18.10. The function
$\mathrm{Li}_s(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^s} = x + \frac{x^2}{2^s} + \frac{x^3}{3^s} + \cdots$
is called the polylogarithm function. For $s = 0$ it is Problem D, for $s = 1$ it is
Problem E, for $s = -1$, $x = 1/2$ it is Problem F. While in calculus, we might be more
interested in the function as a function of $x$, number theorists are more interested
in the function as a function of $s$, where $s$ is complex. In the case $x = 1$, one gets the
Riemann zeta function $\zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^s}$.

Problem H: What does the Riemann hypothesis say?

18.11. The Euler golden key relates $\zeta$ with primes:

Theorem: $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^s}\right)^{-1}$.

18.12.
Problem I: Verify the Euler golden key identity.

18.13. First verify (maybe look at Problem C) that for a single prime $p$
$\frac{1}{1 - \frac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots$
which is the sum over all $\frac{1}{n^s}$, where $n$ has only prime factors $p$. Then look at the
product of these for two primes $p, q$ and see that this is the sum over all $\frac{1}{n^s}$ where $n$
has only prime factors $p$ and $q$.
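Before proving the identity, one can compare both sides numerically. The sketch below truncates the sum and the product, so the two numbers agree only approximately. The exponent $s = 3$, the cutoffs and the function names are arbitrary choices made for this illustration.

```python
# Sketch: truncated zeta sum versus truncated Euler product for s = 3.

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(s, n_terms):
    """Sum of 1/n**s for n = 1, ..., n_terms."""
    return sum(1.0 / n**s for n in range(1, n_terms + 1))

def euler_partial_product(s, prime_limit):
    """Product of (1 - p**(-s))**(-1) over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product /= 1.0 - p ** (-s)
    return product

S = 3
sum_value = zeta_partial_sum(S, 10000)
product_value = euler_partial_product(S, 1000)
print(sum_value, product_value)
```

Both truncations approach the same limit from below, and raising either cutoff shrinks the gap.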
18.14. The Goldbach conjecture tells that every even number larger than 2 is the
sum of two primes. What is the relation with calculus? Define $g(x) = (f(x))^2$ with
$f(x) = \sum_{p} \frac{x^p}{p!} = \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \cdots$,
where the sum is over all primes $p$.

Problem J: Goldbach is equivalent to $g^{(n)}(0) > 0$ for all even $n > 2$.
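The coefficient of $x^n$ in $g = f^2$ is the sum of $1/(p!\,q!)$ over all ordered pairs of primes with $p + q = n$, and the $n$-th derivative of $g$ at 0 is $n!$ times this coefficient. The sketch below computes these coefficients exactly for small $n$. It only checks finitely many cases and proves nothing about the conjecture. The names `is_prime` and `goldbach_coefficient` are hypothetical.

```python
# Sketch: Taylor coefficients of g = f^2 with f = sum over primes p of x^p / p!.
from fractions import Fraction
from math import factorial

def is_prime(n):
    """Trial division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_coefficient(n):
    """Coefficient of x^n in g: sum of 1/(p! q!) over primes p, q with p + q = n."""
    total = Fraction(0)
    for p in range(2, n - 1):
        q = n - p
        if is_prime(p) and is_prime(q):
            total += Fraction(1, factorial(p) * factorial(q))
    return total

even_ok = all(goldbach_coefficient(n) > 0 for n in range(4, 201, 2))
print(even_ok, goldbach_coefficient(4), goldbach_coefficient(11))
```

For odd $n$ the coefficient can vanish, as it does for $n = 11$, since $11 - 2 = 9$ is not prime.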

Homework
Exercises A-F are done in the seminar. This homework is due on Tuesday

Problem 18.1 The function $f$ defined by $f(x) = e^{-1/x}$ for $x > 0$ and
$0$ for $x \le 0$ is smooth and all its derivatives at 0 are zero. Check
$f'(0) = f''(0) = f'''(0) = 0$.

Figure 2. The function $f(x) = e^{-1/x}$ allows to define a smooth bump
function $b(x)$ which is zero outside a ball of radius $r$.

Problem 18.2 Why is $b(x) = f(r^2 - |x|^2)$ a bump function?

Problem 18.3 The series
$\zeta(2) = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots$
has a long history. Research it a bit. Especially: What is the value? Who
found this problem first? What is the name of the problem?

Problem 18.4 By looking it up, give an explanation why

ζ(−1) = 1 + 2 + 3 + 4 + · · ·
can be assigned a finite value. You can also look up its value −1/12 with
Mathematica Zeta[−1]. How is such a finite value possible?

Problem 18.5 You can practice computing square roots of numbers
between 1 and 100 by linear approximation in your head. For example,
if somebody asks you to compute $\sqrt{20}$ you would immediately tell $4 +
4/(2 \cdot 4) = 4.5$. The actual result is 4.472... You could also get
$5 - 5/(2 \cdot 5) = 4.5$. Find another non-square integer in 1 to 100 for which
these two estimates agree. (There are a couple of them).


Unit 19: Extrema

Lecture
19.1. All functions are assumed here to be in $C^2$. It all starts with an observation
going back to Pierre de Fermat:

Theorem: If $x_0$ is a maximum of $f: \mathbb{R}^m \to \mathbb{R}$, then $\nabla f(x_0) = 0$.

Proof. We prove by contradiction. Assume $\nabla f(x_0) \neq 0$, define the vector $v = \nabla f(x_0)$
and look at $g(t) = f(x_0 + tv)$, which is a function of one variable. By the chain rule, it
satisfies $g'(0) = \nabla f(x_0 + 0v) \cdot v = |\nabla f(x_0)|^2 > 0$. This means that $f(x_0 + tv) > f(x_0)$ for small
$t > 0$. The point $x_0$ can not have been maximal. This is a contradiction. QED.
19.2. A point $x$ with $\nabla f(x) = 0$ is called a critical point of $f$. By the Taylor
formula, we have at a critical point $x_0$ the quadratic approximation $Q(x) = f(x_0) +
(x - x_0)^T H(x_0)(x - x_0)/2$, where $H(x_0)$ is the Hessian matrix
$H(x_0) = \begin{bmatrix} f_{x_1 x_1} & f_{x_1 x_2} & \cdots & f_{x_1 x_m} \\ f_{x_2 x_1} & f_{x_2 x_2} & \cdots & f_{x_2 x_m} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_m x_1} & f_{x_m x_2} & \cdots & f_{x_m x_m} \end{bmatrix}$.
19.3. As in one dimension, having a critical point does not assure that a point is a local
maximum or minimum. The second derivative test in single variable calculus assures
that if $f'(x_0) = 0, f''(x_0) > 0$, we have a local minimum and if $f'(x_0) = 0, f''(x_0) < 0$,
we have a local maximum. If $f''(x_0) = 0$, we can not say anything without looking at
higher derivatives.
19.4. A matrix $A$ is called positive definite if $v \cdot Av > 0$ for all vectors $v \neq 0$. It is
called negative definite if $v \cdot Av < 0$ for all vectors $v \neq 0$. A diagonal matrix with
positive diagonal entries is positive definite. In the following statements, we assume $x_0$
is a critical point.
19.5. We say x0 is a local maximum of f if there exists r > 0 such that f (x) ≤ f (x0 )
for all |x − x0 | < r. We say, it is a local minimum of f if f (x) ≥ f (x0 ) for all
|x − x0 | < r. How can we check whether a point is a local maximum or minimum?

Theorem: Assume ∇f (x0 ) = 0. If H(x0 ) is positive definite, then x0 is a

local minimum. If H(x0 ) is negative definite, then x0 is a local maximum.

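19.6. Proof: as $\nabla f(x_0) = 0$, the quadratic approximation at $x_0$ is $Q(x) = f(x_0) +
H(x_0)v \cdot v/2$ with $v = x - x_0$. If the Hessian $H(x_0)$ is positive definite, this is larger than $f(x_0)$
for small non-zero $v$, so that $x_0$ is a local minimum. The analogous
statement for the maximum can be deduced by replacing $f$ with $-f$.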

19.7. Let us look at the case, where $f(x, y)$ is a function of two variables such that
$f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$. The Hessian matrix is
$H(x_0, y_0) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$.
In this two dimensional case, we can classify the critical points if the determinant
$D = \det(H) = f_{xx} f_{yy} - f_{xy}^2$ of $H$ is non-zero. The number $D$ is also called the
discriminant at a critical point.

Figure 1. $f = x^2 + y^2$ gives a minimum, $f = -x^2 - y^2$ a maximum
and $f = x^2 - y^2$ a saddle. The case $f = x^2 y - y x^2$ is not Morse.

19.8. We say $(x_0, y_0)$ is a Morse point, if $(x_0, y_0)$ is a critical point and the determinant
is non-zero. A $C^2$ function is a Morse function if every critical point is
Morse. Examples of Morse functions are $f(x, y) = x^2 + y^2$, $f(x, y) = -x^2 - y^2$ and
$f(x, y) = x^2 - y^2$. The last case is called a hyperbolic saddle. In general, a critical
point is a hyperbolic saddle if $D \neq 0$ and if it is neither a maximum nor a minimum.
Here is the second derivative test in dimension 2:

Theorem: Assume $f \in C^2$ has a critical point $(x_0, y_0)$ with $D \neq 0$.

If $D > 0$ and $f_{xx} > 0$ then $(x_0, y_0)$ is a local minimum.
If $D > 0$ and $f_{xx} < 0$ then $(x_0, y_0)$ is a local maximum.
If $D < 0$ then $(x_0, y_0)$ is a hyperbolic saddle.

19.9. Proof. After translation $(x, y) \to (x - x_0, y - y_0)$ and replacing $f$ with $f -
f(x_0, y_0)$, we have $(x_0, y_0) = (0, 0)$ and $f(0, 0) = 0$. At the critical point, the quadratic
approximation is now
$Q(x, y) = ax^2 + 2bxy + cy^2$.
This can be rewritten as $a(x + \frac{b}{a} y)^2 + (c - \frac{b^2}{a}) y^2 = a(A^2 + DB^2)$ with $A = x + \frac{b}{a} y$, $B = y/a$
and $D = ac - b^2$, which agrees with the discriminant up to a positive factor. If $a > 0$ (that is $f_{xx} > 0$) and $D > 0$ then $c - b^2/a > 0$ and the
function has positive values for all $(x, y) \neq (0, 0)$. The point $(0, 0)$ is then a minimum.
If $a < 0$ (that is $f_{xx} < 0$) and $D > 0$, then $c - b^2/a < 0$ and the function has negative values for
all $(x, y) \neq (0, 0)$ and the point $(0, 0)$ is a local maximum. If $D < 0$, then $f$ takes both
negative and positive values near $(0, 0)$. QED
19.10. One can ask, why $f_{xx}$ and not $f_{yy}$ is chosen. It does not matter, because if
$D > 0$, then both $f_{xx}$ and $f_{yy}$ need to be non-zero and have the same sign. Instead
of $f_{xx}$, one could also have picked the more natural trace $\mathrm{tr}(H)$. It is invariant under
coordinate changes similarly as the determinant $D$. The discriminant $D$ happens also
to be the Gauss curvature of the surface at the point.
19.11. In higher dimensions, the situation is described by the Morse lemma. It tells
that near a critical point there is a coordinate change $\phi$ such that $g(x) = f(\phi(x))$ is
a quadratic function $g(x) = B(x - x_0) \cdot (x - x_0)$ where $B$ is diagonal with entries $+1$
or $-1$. A critical point can then be given a Morse index, the number of entries $-1$
in $B$. The Morse lemma is actually a theorem (theorems are more important than
lemmata = helper theorems):

Theorem: Near a Morse critical point $x_0$ of a $C^2$ function $f$, there is a
coordinate change $\phi: \mathbb{R}^m \to \mathbb{R}^m$ such that $g(x) = f(\phi(x)) - f(x_0)$ is
$g(x) = -x_1^2 - \cdots - x_k^2 + x_{k+1}^2 + \cdots + x_m^2$.

19.12. Proof. We use induction with respect to $m$. (i) Induction foundation. For
$m = 1$, the result tells that for a Morse critical point, the function looks like $y = x^2$
or $y = -x^2$. First show that if $f(0) = f'(0) = 0$, $f''(0) \neq 0$, then $f(x) = x^2 h(x)$ or
$f(x) = -x^2 h(x)$ for some positive $C^2$ function $h$. Proof. By a linear coordinate change
we assume $x_0 = 0$ and $f(0) = 0$. There exists then $g(x)$ such that $f(x) = xg(x)$:
it is $g(x) = f(x)/x$ for $x \neq 0$ and is in the limit $x \to 0$ the value of $\lim_{x \to 0} (f(x) -
f(0))/x = f'(0)$. By the product rule, $f'(x) = g(x) + xg'(x)$ with $g(0) = 0$. Because
$f'(0) = g(0) = 0$, we can define $f(x)/x^2$ for $x \neq 0$ and take the limit $x \to 0$, because
by applying l'Hôpital twice, the limit is $f''(0)/2$. The coordinate change is now given by
a function $y = \phi(x)$ satisfying $y\sqrt{h(y)} = x$, that is $g(x, y) = y\sqrt{h(y)} - x = 0$. Implicit differentiation gives
$g_y(0, 0) = \sqrt{h(0)} \neq 0$ so that by the implicit function theorem $y(x)$ exists.
(ii) Induction step $m \to m + 1$: we first note that Taylor for $C^2$ with remainder term
implies that $f(x_1, \dots, x_n) = \sum_{i,j} x_i x_j h_{ij}(x_1, \dots, x_n)$ with some continuous functions
$h_{ij}$. Furthermore, the function values $h_{ij}(0) = f_{x_i x_j}(0) = H_{ij}(0)$ are the entries
of the Hessian. Apply first a rotation so that $h_{11} \neq 0$. Now look at $x_1$ and keep the
other coordinates constant. As in (i), find a coordinate change $\phi$ such that $f(\phi(x)) =
\pm x_1^2 + g(x_2, \dots, x_m)$, where $g$ inherits the properties of $f$ 1, but is of one dimension less.
By induction assumption, there is a second coordinate change $\psi$ such that $g(\psi(x)) =
-x_2^2 - \cdots - x_l^2 + x_{l+1}^2 + \cdots + x_m^2$. Combining $\phi$ and $\psi$ produces the Morse normal form.
Examples
19.13. Q: Classify the critical points of $f(x, y) = x^3 - 3x - y^3 + 3y$. A: As $\nabla f(x, y) =
[3x^2 - 3, -3y^2 + 3]^T$, the critical points are $(1, 1)$, $(-1, 1)$, $(1, -1)$ and $(-1, -1)$. We
compute $H(x, y) = \begin{bmatrix} 6x & 0 \\ 0 & -6y \end{bmatrix}$. For $(1, 1)$ and $(-1, -1)$ we have $D = -36$ and so
saddle points. For $(-1, 1)$, we have $D = 36$, $f_{xx} = -6$, a local max. For $(1, -1)$ where
$D = 36$, $f_{xx} = 6$ we have a local min.
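The second derivative test is mechanical enough to be written as a short function. The sketch below assumes $f(x, y) = x^3 - 3x - y^3 + 3y$, the function whose gradient is $[3x^2 - 3, -3y^2 + 3]^T$, so that $f_{xx} = 6x$, $f_{xy} = 0$ and $f_{yy} = -6y$. The names `classify` and `hessian_entries` are hypothetical.

```python
# Sketch: second derivative test in two variables.

def classify(fxx, fxy, fyy):
    """Classify a critical point from the Hessian entries at that point."""
    D = fxx * fyy - fxy**2
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle"
    return "inconclusive"   # D = 0: the point is not Morse

def hessian_entries(x, y):
    """Second partial derivatives of the assumed f(x, y) = x^3 - 3x - y^3 + 3y."""
    return 6 * x, 0, -6 * y

critical_points = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
results = {p: classify(*hessian_entries(*p)) for p in critical_points}
for point, kind in results.items():
    print(point, kind)
```

The classification only depends on the signs of $D$ and $f_{xx}$, so rescaling $f$ by a positive constant does not change the result.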

1 This will be clearer after having seen more linear algebra.

Homework

Problem 19.1: Classify the critical points of the area 51 function
$f(x, y) = x^{51} - 51x - y^{51} + 51y$
using the second derivative test. This function is classified.

Problem 19.2: The function $f(x, y) = 2x^3 + 2y^3 - 3x^2 y^2$ is called the
“happy function”. Find and classify its extrema.
This function is not Morse as for one of the critical points, the discriminant
$D$ is zero. We want you nevertheless to decide whether this point is a “local
maximum”, a “local minimum” or “neither of them”.

Problem 19.3: Where on the parametrized surface $r(u, v) = [u^2, v^3, uv]$
is the temperature $T(x, y, z) = 12x + y - 12z$ minimal? Classify all the
critical points of the function $f(u, v) = T(r(u, v))$. [ If you have found the
function $f(u, v)$, you can replace $u, v$ again with $x, y$ if you like to work
with a function $f(x, y)$. ]

Problem 19.4: Find all the critical points of the function $f(x, y, z) =
(x - 1)^2 - y^2 + xz^2$. In each of the cases, find the Hessian matrix $H$. We
have not talked about eigenvalues yet, but they are numbers $\lambda$ such that
$Hv = \lambda v$ for some non-zero vector $v$. One can find them by looking for
the roots of the characteristic polynomial $\chi_H(\lambda) = \det(H - \lambda)$. You can
calculate them on a computer. Find in each case the eigenvalues.

Problem 19.5: a) Find a function f (x, y) with 3 maxima and 3 saddle

points and one minimum.
b) You see below a contour map of a function of two variables. How many
critical points are there? Is the function a Morse function?


Unit 20: Constraints

Lecture
20.1. If we want to maximize a function $f: \mathbb{R}^m \to \mathbb{R}$ on the constraint $S = \{x \in
\mathbb{R}^m \mid g(x) = c\}$, then both the gradients of $f$ and $g$ matter. We call two vectors $v, w$
parallel if $v = \lambda w$ or $w = \lambda v$ for some real $\lambda$. The zero vector is parallel to everything.
Here is a variant of Fermat:
Theorem: If $x_0$ is a maximum of $f$ under the constraint $g = c$, then
$\nabla f(x_0)$ and $\nabla g(x_0)$ are parallel.

20.2. Proof: use contradiction: assume $\nabla f(x_0)$ and $\nabla g(x_0)$ are not parallel and $x_0$
is a local maximum. Let $T$ be the tangent plane to $S = \{g = c\}$ at $x_0$. Because
$\nabla f(x_0)$ is not perpendicular to $T$ we can project it onto $T$ to get a non-zero vector $v$
in $T$ which is not perpendicular to $\nabla f$. Actually the angle $\alpha$ between $\nabla f$ and $v$ is acute
so that $\cos(\alpha) > 0$. Take a curve $r(t)$ in $S$ with $r(0) = x_0$ and $r'(0) = v$. We have
$\frac{d}{dt} f(r(t))|_{t=0} = \nabla f(r(0)) \cdot r'(0) = |\nabla f(x_0)||v| \cos(\alpha) > 0$. By linear approximation, we
know that $f(r(t)) > f(r(0))$ for small enough $t > 0$. This is a contradiction to the fact
that $f$ was maximal at $x_0 = r(0)$ on $S$.
20.3. This immediately implies: (distinguish $\nabla g \neq 0$ and $\nabla g = 0$)

Theorem: For a maximum of $f$ on $S = \{g = c\}$ either the Lagrange
equations $\nabla f(x_0) = \lambda \nabla g(x_0)$, $g = c$ hold, or else $\nabla g(x_0) = 0$, $g = c$.

20.4. For functions f (x, y), g(x, y) of two variables, this means we have to solve a
system with three equations and three unknowns:

fx (x0 , y0 ) = λgx (x0 , y0 )

fy (x0 , y0 ) = λgy (x0 , y0 )
g(x, y) = c

20.5. To find a maximum, solve the Lagrange equations and add a list of critical points
of $g$ on the constraint. Then pick a point where $f$ is maximal among all points. We
don't bother with a second derivative test. But here is a possible statement: if
$\frac{d^2}{dt^2} D_{tv} D_{tv} f(x_0)|_{t=0} < 0$
for all $v$ perpendicular to $\nabla g(x_0)$, then $x_0$ is a local maximum.

20.6. Of course, the cases of maxima and minima are analogous. If $f$ has a maximum
on $g = c$, then $-f$ has a minimum on $g = c$. We can have an extremum of $f$ under a
constraint $S = \{g = c\}$ with smooth $g$ without the Lagrange equations being satisfied. An
example is $f(x, y) = x$ and $g(x, y) = x^3 - y^2 = 0$ shown in Figure (1).

Figure 1. An example of a function, where the Lagrange equations
do not give the minimum, here $(0, 0)$. It is a case, where $\nabla g = 0$. The figure is labeled with
$f(x, y) = x$, the constraint $x^3 - y^2 = 0$ and the Lagrange equations $1 = \lambda 3x^2$, $0 = -\lambda 2y$, $x^3 - y^2 = 0$.

20.7. The method of Lagrange can maximize functions $f$ under several constraints.
Let us show this in the case of a function $f(x, y, z)$ of three variables and two constraints
$g(x, y, z) = c$ and $h(x, y, z) = d$. The analogue of the Fermat principle is that at a
maximum of $f$, the gradient of $f$ is in the plane spanned by $\nabla g$ and $\nabla h$. This leads
to the Lagrange equations for 5 unknowns $x, y, z, \lambda, \mu$.

fx (x0 , y0 , z0 ) = λgx (x0 , y0 , z0 ) + µhx (x0 , y0 , z0 )

fy (x0 , y0 , z0 ) = λgy (x0 , y0 , z0 ) + µhy (x0 , y0 , z0 )
fz (x0 , y0 , z0 ) = λgz (x0 , y0 , z0 ) + µhz (x0 , y0 , z0 )
g(x, y, z) = c
h(x, y, z) = d

20.8. For example, if $f(x, y, z) = x^2 + y^2 + z^2$ and $g(x, y, z) = x^2 + y^2 = 1$, $h(x, y, z) =
x + y + z = 4$, then we find points on the ellipse $g = 1, h = 4$ with minimal or maximal
distance to 0.

Figure 2. Extremizing a function f under two constraints. In this

case the intersection g = c, h = d is an ellipse.
Examples
20.9. Problem: Minimize $f(x, y) = x^2 + 2y^2$ under the constraint $g(x, y) = x + y^2 = 1$.
Solution: The Lagrange equations are $2x = \lambda$, $4y = 2\lambda y$. If $y = 0$ then $x = 1$. If
$y \neq 0$ we can divide the second equation by $y$ and get $2x = \lambda$, $4 = 2\lambda$, again showing
$x = 1$. The point $x = 1, y = 0$ is the only solution.
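The answer can be cross-checked by brute force. On the constraint we have $x = 1 - y^2$, so that $f = (1 - y^2)^2 + 2y^2 = 1 + y^4$ along the curve. The sketch below samples the constraint on a grid. The range $-2 \le y \le 2$ and the step $1/1000$ are arbitrary choices for the illustration.

```python
# Sketch: sample the constraint x + y^2 = 1 and locate the minimum of f.

def f(x, y):
    return x**2 + 2 * y**2

def point_on_constraint(y):
    """Solve x + y^2 = 1 for x."""
    return (1 - y**2, y)

samples = [point_on_constraint(k / 1000) for k in range(-2000, 2001)]
best = min(samples, key=lambda p: f(*p))
print(best, f(*best))
```

The sampled minimum sits at the point $(1, 0)$ found with the Lagrange equations, with value $f = 1$.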
20.10. Problem: Which cylindrical soda can of height $h$ and radius $r$ has minimal
surface $A$ for fixed volume $V$? Solution: We have $V(r, h) = h\pi r^2 = 1$ and $A(r, h) =
2\pi rh + 2\pi r^2$. With $x = h\pi$, $y = r$, you need to optimize $f(x, y) = 2xy + 2\pi y^2$ under
the constraint $g(x, y) = xy^2 = 1$. We will do that in class.
20.11. Problem: If $0 \le p_k \le 1$ is the probability that a die shows $k$, then we have
$g(p) = p_1 + p_2 + \cdots + p_6 = 1$. This vector $p$ is called a probability distribution. The
Shannon entropy of $p$ is defined as
$S(p) = -\sum_{i=1}^{6} p_i \log(p_i) = -p_1 \log(p_1) - p_2 \log(p_2) - \cdots - p_6 \log(p_6)$.

Find the distribution $p$ which maximizes entropy $S$. Solution: $\nabla S = (-1 -
\log(p_1), \dots, -1 - \log(p_6))$, $\nabla g = (1, \dots, 1)$. The Lagrange equations are $-1 - \log(p_i) =
\lambda$, $p_1 + \cdots + p_6 = 1$, from which we get $p_i = e^{-(\lambda + 1)}$. The last equation $1 = \sum_i \exp(-(\lambda +
1)) = 6 \exp(-(\lambda + 1))$ fixes $\lambda = -\log(1/6) - 1$ so that $p_1 = p_2 = \cdots = p_6 = 1/6$. It is
the fair die that has maximal entropy. Maximal entropy means least information
content.
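A quick experiment supports the result: the entropy of the uniform distribution is $\log 6$, and randomly generated distributions on six outcomes should not exceed it. The sketch below is only a plausibility check with a fixed random seed, not a proof. The helper names are hypothetical.

```python
# Sketch: compare the entropy of the fair die with random distributions.
import math
import random

def shannon_entropy(p):
    """Return -sum p_i log p_i, using the convention 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def random_distribution(n=6):
    """Normalize n random weights to a probability vector."""
    weights = [random.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

random.seed(0)
uniform = [1 / 6] * 6
uniform_entropy = shannon_entropy(uniform)
max_random_entropy = max(shannon_entropy(random_distribution()) for _ in range(1000))
print(uniform_entropy, max_random_entropy)
```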
20.12. Assume that the probability that a physical or chemical system is in a state $k$
is $p_k$ and that the energy of the state $k$ is $E_k$. Nature minimizes the free energy
$F(p_1, \dots, p_n) = \sum_i [p_i \log(p_i) + E_i p_i]$
if the energies $E_i$ are fixed. The probability distribution $p_i$ satisfying $\sum_i p_i = 1$ minimizing
the free energy is called a Gibbs distribution. Find this distribution in
general if $E_i$ are given. Solution: minimizing $F$ means maximizing $f = -F$, for which
$\nabla f = (-1 - \log(p_1) - E_1, \dots, -1 - \log(p_n) - E_n)$,
$\nabla g = (1, \dots, 1)$. The Lagrange equations are $\log(p_i) = -1 - \lambda - E_i$, or $p_i = \exp(-E_i)C$,
where $C = \exp(-1 - \lambda)$. The constraint $p_1 + \cdots + p_n = 1$ gives $C(\sum_i \exp(-E_i)) = 1$
so that $C = 1/(\sum_i e^{-E_i})$. The Gibbs solution is $p_k = \exp(-E_k)/\sum_i \exp(-E_i)$. 1
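The Gibbs solution is easy to evaluate. The sketch below computes $p_k = \exp(-E_k)/\sum_i \exp(-E_i)$ for a made-up list of energies and checks the Lagrange condition that $\log(p_i) + E_i$ is the same constant for every state. The energies and the name `gibbs_distribution` are hypothetical.

```python
# Sketch: Gibbs distribution for given energies.
import math

def gibbs_distribution(energies):
    """Return p_k = exp(-E_k) / sum_i exp(-E_i)."""
    weights = [math.exp(-e) for e in energies]
    normalization = sum(weights)
    return [w / normalization for w in weights]

energies = [0.0, 1.0, 2.0, 5.0]          # hypothetical energy levels
p = gibbs_distribution(energies)

# Lagrange condition: log(p_i) + E_i does not depend on i.
lagrange_values = [math.log(pi) + ei for pi, ei in zip(p, energies)]
print(p)
print(lagrange_values)
```

States with lower energy receive higher probability, and equal energies reproduce the uniform distribution of the fair die.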

20.13. If $f$ is a quadratic function on $\mathbb{R}^m$, that is $f(x) = Bx \cdot x/2$
with symmetric $B \in M(m, m)$, and if the constraint $g(x) = Ax = c$ is linear with $A \in M(1, m)$, then
$\nabla f(x) = Bx$ and $\nabla g(x) = A^T$. Let us call $b = A^T \in M(m, 1) \sim \mathbb{R}^m$. The Lagrange
equations are then $Bx = \lambda b$, $Ax = c$. We see in general that for quadratic $f$ and linear
$g$, we end up with a linear system of equations.
20.14. Related to the previous remark is the following observation. It is often possible
to reduce the Lagrange problem to a problem without constraint. This is a point of
view often taken by economists. Let us look at it in dimension 2, where we extremize
f (x, y) under the constraint g(x, y) = 0. Define F (x, y, λ) = f (x, y) − λg(x, y). The
Lagrange equations for f, g are now equivalent to ∇F (x, y, λ) = 0 in three dimensions.

1 This example is from Rufus Bowen, Lecture Notes in Math, 470, 1978

Homework

Problem 20.1: Find the cylindrical basket which is open on the top
and has the largest volume for fixed area $\pi$. If $x$ is the radius and $y$
is the height, we have to maximize $f(x, y) = \pi x^2 y$ under the constraint
$g(x, y) = 2\pi xy + \pi x^2 = \pi$. Use the method of Lagrange multipliers.

Problem 20.2: Given a $n \times n$ symmetric matrix $B$, we look at the
function $f(x) = x \cdot Bx$ and look at extrema of $f$ under the constraint
that $g(x) = x \cdot x = 1$. This leads to an equation
$Bx = \lambda x$.
A solution $x$ is called an eigenvector. The Lagrange constant $\lambda$ is an
eigenvalue. Find the solutions to $Bx = \lambda x$, $|x| = 1$ if $B$ is a $2 \times 2$ matrix,
where $f(x, y) = ax^2 + (b + c)xy + dy^2$ and $g(x, y) = x^2 + y^2$. Then solve
the problem where $a = 3, b = 2, c = 4, d = 1$. (Never mind here that $B$ is
not symmetric).

Problem 20.3: Which pyramid of height $h$ over a square $[-a, a] \times [-a, a]$
with surface area $4a\sqrt{h^2 + a^2} + 4a^2 = 4$ has maximal volume $V(h, a) =
4ha^2/3$? By using new variables $(x, y)$ and multiplying $V$ with a constant,
we get to the equivalent problem to maximize $f(x, y) = yx^2$ over the
constraint $g(x, y) = x\sqrt{y^2 + x^2} + x^2 = 1$. Use the latter variables.

Problem 20.4: Motivated by the Disney movie “Tangled”, we want

to build a hot air balloon with a cuboid mesh of dimension x, y, z which
together with the top and bottom fortifications uses wires of total length
g(x, y, z) = 6x + 6y + 4z = 32. Find the balloon with maximal volume
f (x, y, z) = xyz.

Problem 20.5: A solid bullet made of a half sphere and a cylinder has
the volume $V = 2\pi r^3/3 + \pi r^2 h$ and surface area $A = 2\pi r^2 + 2\pi rh + \pi r^2$.
Doctor Manhattan designs a bullet with fixed volume and minimal
area. With $g = 3V/\pi = 1$ and $f = A/\pi$ he therefore minimizes
$f(h, r) = 3r^2 + 2rh$ under the constraint $g(h, r) = 2r^3 + 3r^2 h = 1$. Use the
Lagrange method to find a local minimum of $f$ under the constraint $g = 1$.


Unit 21: Island mathematics

Seminar
21.1. With an “island” we mean a region in the plane $\mathbb{R}^2$ which is bounded by a simple
closed curve $C$ which is continuous everywhere and differentiable everywhere except
at a finite set of points. So, simple polygons are allowed. Which island has the
maximal area if the length of the boundary is fixed? This is called the isoperimetric
problem. If we look at the problem restricted to polygons with a fixed number $n$ of
vertices, then we have a nice finite dimensional Lagrange problem.
21.2. Let us look at a triangular island T (x, y) with vertices (−1, 0), (1, 0), (x, y).

Problem A: Assume the circumference g(x, y) of the triangle is 3. What

is the maximal area f (x, y) = y/2 we can get? Set up the Lagrange
equations and solve them.

21.3. Here is a side problem from good old Euclidean geometry. If you do not
know the answer, look up “string method pins”.

Problem B: What points $(x, y)$ in the plane satisfy $g(x, y) = 3$?

21.4. Solving the problem to find the n-gon with maximal area is a messy Lagrange
problem. It can be done by a computer but there is a more elegant way:
Problem C: Use the computation in problem A to show that for a
maximal polygon containing vertices ..., P, Q, R, ... in a row, the distance
between P and Q is the same as the distance between Q and R.

Problem D: Conclude that a polygon with n vertices and maximal area

must be a regular polygon.

21.5. You are on your treasure island G and have two locations A, B in G. The
problem of finding the shortest connection between A and B can be quite complex in
general. An example is when G is bounded by a Gosper curve. For the following let
us assume that the boundary of G is a convex curve: this means that for any two
points A, B in G, the line segment between A and B is contained in G. A triangle A, B, C
for which all three points A, B, C are on the boundary is called a “shore triangle”.
Problem E: Verify that for a shore triangle, the billiard law of reflection
at the boundary holds.

21.6. Hint: to see that the incoming angle is the same as the outgoing angle, take a
minimal triangle A, B, C, where B is on the island shore, then replace the curve with
the tangent line L at B. Now reflect C at L to get a point C′. Verify that the shortest
billiard path ABC has the same length as the straight line connecting A with C′.

Figure 1. What polygon with fixed circumference has maximal area?

21.7. The next time you are cast away on an island, count the number m of mountain
peaks, the number s of sinks and the number p of mountain passes. Make some
experiments. You notice the following rule which is known as a special case of the
Poincaré-Hopf theorem:
Theorem: maxima + minima − saddles = 1.

Problem F: Find an example where this equality holds, in which we

have maxima = 3, minima = 1 and saddles = 3.

21.8. If you want to challenge yourself, see whether you can prove the island theorem
by deformation. (This is probably too hard. Just enjoy the struggle!)
21.9. Assume now that our island is an atoll, a ring shaped reef.
Problem G: By looking at examples, what is the island number
maxima + minima − saddles on an atoll?
Figure 2. First an island with 2 mountain peaks and with 1 mountain
pass. Then an island with 3 mountain peaks and 2 mountain passes. We
see maxima + minima − saddles = 1.

Figure 3. The Atafu atoll. Picture by NASA Johnson Space Center, 2009.

Figure 4. If we place a surface S : g = c in space and look at the

restriction of a function f (x, y, z) on S, we solve a Lagrange problem. In
a Morse situation, the numbers maxima + minima − saddles add up to
a number which only depends on the number of holes.

21.10. Let us look at the one-dimensional case, where things are easier to prove. Assume
the island is the interval [a, b]. Let f be a smooth function on [a, b] which has the
property that f is zero for x ≥ b and for x ≤ a. We look at critical points of f in the
interior (a, b) which are Morse (meaning $f''(x) \ne 0$ at critical points), so that we
have only local maxima and minima as critical points. Let m be the number of maxima
and s the number of minima (sinks). In order to prevent the island from being flooded, we
also assume that the function f is positive for x > a close to a and for x < b close to b.
Theorem: maxima − minima = 1.
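The counting rule can be tried out numerically before proving it. The following Python sketch samples a one-dimensional island on a grid and counts the sign changes of consecutive differences. The profile f(x) = x(1 − x)(1.2 + sin(9x)) on [0, 1] and the helper name `count_extrema` are illustrative assumptions and not part of the lecture; an experiment like this is of course no proof.

```python
import math

def count_extrema(f, a, b, n=2000):
    """Count interior local maxima and minima of f sampled on n+1 grid points of [a, b]."""
    values = [f(a + (b - a) * k / n) for k in range(n + 1)]
    # Signs of consecutive differences; exact ties are skipped.
    signs = [1 if v2 > v1 else -1
             for v1, v2 in zip(values, values[1:]) if v2 != v1]
    # A change from rising to falling is a maximum, from falling to rising a minimum.
    maxima = sum(1 for s, t in zip(signs, signs[1:]) if s > 0 and t < 0)
    minima = sum(1 for s, t in zip(signs, signs[1:]) if s < 0 and t > 0)
    return maxima, minima

def island(x):
    # Hypothetical island profile: zero at x = 0 and x = 1, positive in between.
    return x * (1 - x) * (1.2 + math.sin(9 * x))

maxima, minima = count_extrema(island, 0.0, 1.0)
print(maxima, minima, maxima - minima)
```

Changing the profile (as long as it stays positive inside and vanishes at both ends) changes the individual counts but not their difference.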

Problem H: Verify that there is an odd number of critical points for a

Morse function f which has as a support a finite interval [a, b].

Figure 5. One-dimensional islands.

Problem I: Use a deformation argument to show that if there are 2k + 1
critical points, we can reduce them to 2k − 1 by merging a pair of
neighboring maxima and minima.

Homework

21.1 A spherical triangle A, B, C on the unit sphere has angles α, β, γ in

[0, π]. What is the largest area that such a triangle can have? You can
use the fact that α + β + γ − π is the area. The result might look a bit
strange for a triangle.

21.2 Find an example of a non-Morse function f (x, y, z) with a maximum.

Similarly find an example with a minimum and an example of a non-Morse
function where the critical point is neither a maximum, nor a minimum.

21.3 Look at maxima, minima and saddle points for a function
f(x, y) defined on a doughnut. By looking at examples, find the island
number maxima + minima − saddles there.

21.4 Look at maxima, minima and saddle points for a function
f(x, y) defined on a sphere. By looking at examples, find the island
number maxima + minima − saddles there.

21.5 Assume f : R → R is a single variable Morse function which is

2π periodic. What is the relation between the number m of maxima on
[0, 2π) and the number of minima on [0, 2π)? Prove this.

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 22: Double integrals

Lecture
22.1. Given a bounded region R in $\mathbb{R}^2$ and a continuous function $f(x,y) : R \to \mathbb{R}$,
define the Riemann integral $I = \iint_R f(x,y) \, dA$ as the $n \to \infty$ limit of
$$ I_n = \frac{1}{n^2} \sum_{(i/n, j/n) \in R} f\left(\frac{i}{n}, \frac{j}{n}\right) . $$

The bounded region R is defined as a closed subset of $\mathbb{R}^2$ bounded by finitely many differentiable
curves $R = \{ g_1 \le c_1, \dots, g_k \le c_k \}$. As already in one dimension, the definition
is designed to be independent of an orientation chosen on R. We are integrating like
summing up a spread sheet. Just add up all entries. To justify that the limit exists,
we again can use the Heine-Cantor theorem which tells that f is continuous on R if
and only if it is uniformly continuous. This means there are numbers Mn → 0 such
that if |(x1 , y1 ) − (x2 , y2 )| ≤ 1/n, then |f (x1 , y1 ) − f (x2 , y2 )| ≤ Mn .
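The spreadsheet picture translates directly into a few lines of code. The following Python sketch evaluates $I_n$ for a region given by a membership test; the function names `riemann_sum` and `disc`, the bounding radius `rho` and the choice of a disc of radius 3 with the constant function 8 are illustrative assumptions.

```python
import math

def riemann_sum(f, in_region, rho, n):
    """I_n = (1/n^2) * sum of f(i/n, j/n) over all grid points (i/n, j/n) in R.

    rho bounds the region: R is assumed to lie inside the square [-rho, rho]^2.
    """
    m = int(math.ceil(rho * n))
    total = 0.0
    for i in range(-m, m + 1):
        x = i / n
        for j in range(-m, m + 1):
            y = j / n
            if in_region(x, y):
                total += f(x, y)
    return total / n**2

def disc(x, y):
    # Closed disc of radius 3 centered at the origin.
    return x * x + y * y <= 9.0

approx = riemann_sum(lambda x, y: 8.0, disc, rho=3.0, n=100)
print(approx, 72 * math.pi)
```

The two printed numbers should be close; increasing n shrinks the boundary error, which is of the order of the boundary length divided by n.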
Theorem: For continuous f on a bounded region R, $\iint_R f \, dxdy$ exists.

22.2. Proof. In each cube $Q_{ij} = \{ i/n \le x \le (i+1)/n, \; j/n \le y \le (j+1)/n \} \cap R$
define $a_{ij} = \min_{(x,y) \in Q_{ij}} f(x,y)$ and $b_{ij} = \max_{(x,y) \in Q_{ij}} f(x,y)$. Because the boundary
was assumed to be given by a collection of curves which have finite total arc length L,
the number of cubes $Q_{ij}$ which intersect the boundary C is bounded by 4Ln (a curve
of length 1/n can touch at most 4 squares of side length 1/n). Define also $F = \max_{(x,y) \in R} |f(x,y)|$. We
have with $K_n = 4LF/n$:
$$ A_n - K_n \le I_n \le B_n + K_n , $$
where $A_n = \sum_{i,j} a_{ij}/n^2$ and $B_n = \sum_{i,j} b_{ij}/n^2$ and $K_n$ takes care of cubes $Q_{ij}$ which
intersect the boundary of R and so only contribute partially. Let I be the limsup
of $I_n$. We have $B_n - A_n \le M_n n^2/n^2 = M_n \to 0$ and $K_n \to 0$ as well so that
$|I_n - I| \le M_n + K_n \to 0$.
22.3. We rarely evaluate integrals using Riemann sums. Fortunately it is possible to
reduce a double integral to single integrals. One can do that for basic regions, which
come in two types: “bottom to top” regions R = {(x, y) | a ≤ x ≤
b, c(x) ≤ y ≤ d(x)} or “left to right” regions R = {(x, y) | a(y) ≤ x ≤ b(y), c ≤ y ≤
d}. By cutting a general region into smaller pieces, for example by intersecting with sufficiently
small cubes $Q_{ij}$ defined above, we can write any region as a union of such basic regions:

for large enough n, any $Q_{ij} \cap R$ is a basic region. Now we can define the integral in
the first case as $\int_a^b [ \int_{c(x)}^{d(x)} f(x,y) \, dy ] \, dx$ and in the second case as $\int_c^d [ \int_{a(y)}^{b(y)} f(x,y) \, dx ] \, dy$.
Is this the same? This is answered with Fubini, which we have already used. Let R be
a rectangle R = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d}. Here is the Fubini theorem:

Figure 1. “Bottom to top” and “left to right” regions.

Theorem: $\iint_R f(x,y) \, dA = \int_a^b [ \int_c^d f(x,y) \, dy ] \, dx = \int_c^d [ \int_a^b f(x,y) \, dx ] \, dy$.

22.4. Proof: first make a coordinate change to get R = [0, 1]×[0, 1], then cover R with
n2 cubes Qij of side length 1/n. We have for every y a uniformly continuous function
x → f (x, y) and for every x a uniformly continuous function y → f (x, y) and the
constants Mn work for all: there is Mn → 0 so that if |x1 −x2 | < 1/n and |y1 −y2 | < 1/n,
then $|f(x_1, y_1) - f(x_2, y_2)| \le M_n$. Now use the notation $A \sim_c B$ if $|A - B| \le c$ and
get
$$ \iint_R f(x,y) \, dA \sim_{M_n} \frac{1}{n} \sum_{i=0}^{n-1} \frac{1}{n} \sum_{j=0}^{n-1} f(i/n, j/n) \sim_{2M_n} \frac{1}{n} \sum_{i=0}^{n-1} \int_0^1 f(i/n, y) \, dy \sim_{3M_n} \int_0^1 \Big[ \int_0^1 f(x,y) \, dy \Big] dx . $$
Similarly, we can show $\iint_R f(x,y) \, dA \sim_{3M_n} \int_0^1 [ \int_0^1 f(x,y) \, dx ] \, dy$.

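22.5. Without continuity, Fubini is false: the standard example is illustrated in Figure (2):
$$ -\frac{\pi}{4} = \int_0^1 \Big[ \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2} \, dx \Big] dy \ne \int_0^1 \Big[ \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2} \, dy \Big] dx = \frac{\pi}{4} . $$
Proof. $\int (x^2-y^2)/(x^2+y^2)^2 \, dx = -x/(x^2+y^2)$ and $\int (x^2-y^2)/(x^2+y^2)^2 \, dy = y/(x^2+y^2)$, so
that $\int_0^1 (x^2-y^2)/(x^2+y^2)^2 \, dx = -1/(1+y^2)$ and $\int_0^1 (x^2-y^2)/(x^2+y^2)^2 \, dy = 1/(1+x^2)$.
Integrating $-1/(1+y^2)$ over 0 ≤ y ≤ 1 gives −π/4 and integrating $1/(1+x^2)$ over 0 ≤ x ≤ 1 gives π/4.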
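The two orders of integration can be compared numerically. The following Python sketch takes the inner integrals in the closed form $-1/(1+y^2)$ and $1/(1+x^2)$ stated in the proof and does the outer integral with a midpoint rule; the helper name `midpoint` and the number of subintervals are assumptions made for illustration.

```python
import math

def midpoint(g, a, b, n=10000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Inner integrals in closed form, taken from the antiderivatives in the proof:
inner_dx = lambda y: -1.0 / (1.0 + y * y)   # integral over x in [0, 1] for fixed y
inner_dy = lambda x: 1.0 / (1.0 + x * x)    # integral over y in [0, 1] for fixed x

dx_first = midpoint(inner_dx, 0.0, 1.0)   # integrate in x first, then in y
dy_first = midpoint(inner_dy, 0.0, 1.0)   # integrate in y first, then in x
print(dx_first, dy_first, math.pi / 4)
```

Integrating in x first gives a value near −π/4, integrating in y first a value near π/4. The singularity of the integrand at the origin is what makes the order matter.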
22.6. Integrals in higher dimensions are defined in the same way. We will cover the
three dimensional case in particular later. Let us just add the definition for now. Given
an m-dimensional region R in $\mathbb{R}^m$ and a continuous $f : \mathbb{R}^m \to \mathbb{R}$, using the multi-index
notation $x = (x_1, \dots, x_m)$, $dx = dx_1 dx_2 \cdots dx_m$ and $i/n = (i_1/n, i_2/n, \dots, i_m/n)$, define
$$ \int_R f(x) \, dx = \lim_{n \to \infty} \frac{1}{n^m} \sum_{i/n \in R} f\left(\frac{i}{n}\right) . $$
A region is now a set $R = \{ x \in \mathbb{R}^m \mid g_1(x) \le c_1, \dots, g_k(x) \le c_k \}$ where the $g_j$ are smooth
functions. It is called bounded if there exists $\rho > 0$ such that $R \subset \{ |x| \le \rho \}$.
Figure 2. Integrating over a region via a Riemann integral. A double
integral is a signed volume. Parts where f < 0 contribute negative volume. Fubini
can fail, even if the two conditional integrals exist.

Examples
22.7. If f(x, y) = 1, then $\iint_R f(x,y) \, dxdy$ is the area of R. For example,
$\iint_{x^2+y^2 \le 9} 8 \, dxdy = 8 \iint_{x^2+y^2 \le 9} 1 \, dxdy = 8 \, \mathrm{Area}(R) = 72\pi$.
22.8. We know from single variable calculus that $\int_a^b f(x) \, dx$ is the signed area under
the curve of f. For f(x) ≥ 0, where it is the area, we can write this as $\int_a^b \int_0^{f(x)} 1 \, dydx$.
Note that as we have defined the integrals, the equivalence would be wrong if f(x)
is negative somewhere. It is the double integral which is the correct notion of area.
Example: The area of the region bounded by the curve $y = 1/(1 + x^2)$, the line
y = 0 and the lines x = −1 and x = 1 is $\int_{-1}^{1} \int_0^{1/(1+x^2)} dydx = \arctan(x)|_{-1}^{1} = \pi/2$.

Figure 3.
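22.9. The integral $\iint_R f(x,y) \, dxdy$ can be interpreted as the signed volume under
the graph of f above the region R. Find the volume of the region bounded by
$z = 4 - 2x^4 - 2y^4$ and $z = 4 - 2x^2 - 2y^2$ and −1 ≤ x ≤ 1 and −1 ≤ y ≤ 1. Solution:
over the quarter square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 we get
$\int_0^1 \int_0^1 (4 - 2x^4 - 2y^4) - (4 - 2x^2 - 2y^2) \, dxdy = (4/15) \cdot 2 = 8/15$.
By symmetry, the volume over the full square is 4 · 8/15 = 32/15.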
22.10. Problem. Find the area of a disc of radius a. Solution:
$$ \int_{-a}^{a} \int_{-\sqrt{a^2-x^2}}^{\sqrt{a^2-x^2}} 1 \, dydx = \int_{-a}^{a} 2\sqrt{a^2 - x^2} \, dx . $$
Use the trig substitution x = a sin(u), dx = a cos(u) du, to get
$$ \int_{-\pi/2}^{\pi/2} 2 \sqrt{a^2 - a^2 \sin^2(u)} \, a \cos(u) \, du = \int_{-\pi/2}^{\pi/2} 2 a^2 \cos^2(u) \, du . $$
Using a double angle formula, this gives $a^2 \int_{-\pi/2}^{\pi/2} 2 \, \frac{1 + \cos(2u)}{2} \, du = a^2 \pi$. Next
time we will compute this much more efficiently.
22.11. Problem. Let R be the triangle {1 ≥ x ≥ 0, 0 ≤ y ≤ x}. Evaluate
$\iint_R e^{-x^2} \, dxdy$. Solution. We can not evaluate the integral directly because $e^{-x^2}$
has no anti-derivative given in terms of elementary functions. But we can write the
integral as
$$ \int_0^1 \Big[ \int_0^x e^{-x^2} \, dy \Big] dx = \int_0^1 x e^{-x^2} \, dx = -\frac{e^{-x^2}}{2} \Big|_0^1 = \frac{1 - e^{-1}}{2} . $$

Homework
Problem 22.1: Calculate the iterated integral $\int_0^1 \int_x^{2-x} (x^2 - y) \, dydx$ in
two ways, once as a “left to right” and once as a “bottom to top” integral.

Problem 22.2: Find the integral
$$ \int_0^1 \int_{\sqrt{y}}^{y^2} \frac{3x^7}{\sqrt{x} - x^2} \, dx \, dy . $$

Problem 22.3: Compute the area of the region bounded by the ellipse
$x^2/4^2 + y^2/9^2 = 1$ using trig substitution. (It is the “hardest problem in
geometry”, according to the comedy-drama “Rushmore”, a movie from
1998).

Problem 22.4: Find the integral
$$ \int_0^{\pi^2} \int_{\sqrt{y}}^{\pi} \frac{\sin(x)}{x^2} \, dxdy . $$

Problem 22.5: Find the volume of the hoof solid $x^2 + y^2 \le 1$, $0 \le z \le x$.

The hoof solid was considered by Archimedes already.

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 23: Substitution

Lecture

23.1. If $\Phi : R \to S$, $[u, v]^T \to [x(u,v), y(u,v)]^T$ is a coordinate change, then the distortion
factor was defined as $|d\Phi| = |\det(d\Phi)|$, where
$$ d\Phi(u,v) = \begin{bmatrix} \partial_u x(u,v) & \partial_v x(u,v) \\ \partial_u y(u,v) & \partial_v y(u,v) \end{bmatrix} . $$
The change of variable theorem is the same in all dimensions. In the following proof,
we assume that Φ is $C^2$. Because of Heine-Cantor, we know there exists $M_n \to 0$ with
$|\frac{d^2}{dt^2} \Phi(u_0 + tv, v_0 + tw)| \le M_n$ for $\sqrt{v^2 + w^2} \le 1/n$ and all $(u_0, v_0) \in R$. [1]
Theorem: $\iint_R f(\Phi(u,v)) \, |d\Phi(u,v)| \, dudv = \iint_S f(x,y) \, dxdy$.

23.2. Proof. Cover S with cubes $Q_{ij}$ as in the last lecture. Then
$$ \iint_S f(x,y) \, dxdy = \sum_{Q_{ij}} \iint_{Q_{ij} \cap S} f(x,y) \, dxdy \sim \sum_{i,j} f\left(\frac{i}{n}, \frac{j}{n}\right) \frac{1}{n^2} . $$
The transformed squares $\Phi(Q_{ij})$ are close to the parallelograms $d\Phi(Q_{ij})$ which have
area $|d\Phi(i/n, j/n)|/n^2$. Now make a quadratic Taylor expansion
$\Phi(x,y) = \Phi(x_0,y_0) + d\Phi(x_0,y_0)(x - x_0, y - y_0) + d^2\Phi(x_0,y_0)(x - x_0, y - y_0)^2/2$
at $(x_0,y_0) = (i/n, j/n)$, where
$|d^2\Phi(x_0,y_0)(x - x_0, y - y_0)^2| \le M_n$. Let $F = \max_{(x,y) \in R} |f(x,y)|$. Applying Taylor
with remainder in every direction, we see
$$ \Big| \int_{\Phi(Q_{ij}) \cap S} f(x,y) \, dxdy - f\Big(\Phi\Big(\frac{i}{n}, \frac{j}{n}\Big)\Big) \Big| d\Phi\Big(\frac{i}{n}, \frac{j}{n}\Big) \Big| \frac{1}{n^2} \Big| \le \frac{M_n F}{n^2} . $$
As the number of squares hitting R is bounded by $An^2 + 4Ln$ where A is the area of R
and L is the length of the boundary of R, the sum of the non-linear errors is therefore
bounded by $(An^2 + 4Ln) M_n F/n^2$ which goes to zero for n → ∞. QED.

23.3. Here is an example: If $\Phi : R = [0,1] \times [0, 2\pi] \to S = \{ x^2 + y^2 \le 1 \}$ is given
by $\Phi(r, \theta) = [r\cos(\theta), r\sin(\theta)]^T$, then $|d\Phi(r,\theta)| = r$. If $f(x,y) = x^2 + y^2 = r^2$, then
$\iint_R r^2 \, r \, dr d\theta = \iint_S x^2 + y^2 \, dxdy$. The first integral is 2π/4.
[1] For the $C^1$ case, see J. Schwartz, Mathematical Monthly 61, 1954, or P.D. Lax, Monthly 108, 2001.

23.4. Let $\Phi : [0,1] \times [0,1] \to [0,1] \times [0,1]$ be given as Φ(x, y) = (y, x). Now det(dΦ) =
−1 and |dΦ| = 1. While we usually could ignore talking about orientation, it is evident
here that the integrals considered so far do not depend on the orientation of the
space. If the change of coordinates switches the orientation, the resulting integral does
not change.

23.5. The chain rule assures that combining two coordinate changes Φ, Ψ gives a
new coordinate change with $d(\Psi \circ \Phi)(x) = d\Psi(\Phi(x)) \, d\Phi(x)$. For example if
$\Psi(x,y) = [ax, by]^T$ and $\Phi(r,\theta) = [r\cos(\theta), r\sin(\theta)]^T$ changes into polar coordinates, then
$\Psi(\Phi(r,\theta)) = [ar\cos(\theta), br\sin(\theta)]^T$. Now the image of $R = [0,1] \times [0,2\pi]$ is the ellipse
$S = \{ x^2/a^2 + y^2/b^2 \le 1 \}$ and the area of the ellipse is $A = \iint_R abr \, drd\theta$ because det(dΦ) = r and
det(dΨ) = ab. The result is $\int_0^1 \int_0^{2\pi} abr \, d\theta dr = \pi ab$.

Figure 1. Coordinate change.

23.6. Preview: We will next week look at more general cases like $r : R \subset \mathbb{R}^2 \to \mathbb{R}^3$ of
a parametrized surface, where the distortion factor is $|dr| = \sqrt{\det(dr^T dr)} = |r_u \times r_v|$
and the surface area is $\iint_R |r_u \times r_v| \, dudv = \iint_S 1 \, dA$.
23.7. The theorem generalizes substitution $\int_c^d f(\Phi(x)) |\Phi'(x)| \, dx = \int_a^b f(x) \, dx$ if
Φ(c) = a and Φ(d) = b. We usually insist that Φ is monotonically increasing and
write u = Φ(x), du = Φ′(x)dx to get computations like in
$\int_0^{\sqrt{\pi/2}} \sin(x^2) \, 2x \, dx = \int_0^{\pi/2} \sin(u) \, du$, where $\Phi(x) = x^2$. As a hack, one can extend the formula to the
case when Φ can decrease in which case the [a, b] interval becomes the negative [b, a]
interval with a < b. Example: Let Φ(x) = 2 − 2x which has Φ′ = −2, then
$\int_{1/2}^{1} (2 - 2x)^2 |(-2)| \, dx = \int_0^1 x^2 \, dx$. In single variable calculus, one can also work with the
negative sign case and compute $\int_1^{1/2} (2 - 2x)^2 (-2) \, dx$ which works if $\int_1^{1/2} = -\int_{1/2}^{1}$ but
this is not compatible with the defined Riemann integral: we use “spread-sheet”
summation and do not distinguish whether we add up the function values from left
to right or from right to left. [2]
[2] In single variable one can easily go from orientation independent ‘Bosonic’ integrals to ‘Fermionic’
integrals. In higher dimensions, one can then apply the derivative to anti-symmetric tensors. The
switch “Bosonic” → “Fermionic” requires however to orient objects like curves or surfaces.
23.8. We can again look at the Fubini counter example
$\iint_{x^2+y^2 \le 1} (x^2 - y^2)/(x^2 + y^2)^2 \, dxdy = \int_0^1 \int_0^{2\pi} \cos(2\theta)/r \, d\theta dr = 0$. We can not change the order of integration
as we can not integrate $\int_0^1 1/r \, dr$. The trouble also continues in the new coordinate
system and it is even more dramatic.
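23.9. If $\Phi : x \to Ax$ and $\Psi : x \to Bx$ are two linear coordinate changes then $\Psi \circ \Phi$
is given by the matrix product BA and the chain rule tells $|d(\Psi \circ \Phi)| = |\det(BA)|$ which
agrees with the product $|d\Psi| |d\Phi| = |\det(B)| |\det(A)|$. We can do the verification
of the Cauchy-Binet formula det(AB) = det(A)det(B) directly. If
$$ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} p & q \\ r & s \end{bmatrix}, \quad \text{then} \quad AB = \begin{bmatrix} ap + br & aq + bs \\ cp + dr & cq + ds \end{bmatrix} $$
and you can check the determinant formula.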
23.10. Here is a famous open problem about coordinate changes. It is called the
Jacobian conjecture. It deals with polynomial coordinate changes, where x(u, v)
and y(u, v) are polynomials in u, v.

Conjecture: If Φ is polynomial and |dΦ| is constant different from zero,

then Φ has a polynomial inverse.
One knows that if the conjecture is false, then there exists a counter example with
integer polynomials and Jacobian determinant 1. The conjecture is open since at
least 1939. Examples of coordinate transformations with determinant 1 and integer
polynomials are Hénon maps from lecture 16. If $\Phi([u, v]^T) = [x, y]^T = [u^2 - u^4 - v, u]^T$,
then $\Phi^{-1}([x, y]^T) = [y, y^2 - y^4 - x]^T$.
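The claimed inverse can be checked by machine on integer points, where the arithmetic is exact. The following Python sketch composes the map with its inverse on a small grid and evaluates the Jacobian determinant from the hand-computed matrix; the function names `phi`, `phi_inverse` and `jacobian_det` and the grid size are illustrative assumptions.

```python
def phi(u, v):
    """Henon type map [u, v] -> [u^2 - u^4 - v, u] from the text."""
    return (u**2 - u**4 - v, u)

def phi_inverse(x, y):
    """Polynomial inverse [x, y] -> [y, y^2 - y^4 - x] from the text."""
    return (y, y**2 - y**4 - x)

def jacobian_det(u, v):
    # dPhi = [[2u - 4u^3, -1], [1, 0]], so det = (2u - 4u^3)*0 - (-1)*1 = 1.
    return (2 * u - 4 * u**3) * 0 - (-1) * 1

points = [(u, v) for u in range(-5, 6) for v in range(-5, 6)]
print(all(phi_inverse(*phi(u, v)) == (u, v) for u, v in points))  # True
print(all(phi(*phi_inverse(x, y)) == (x, y) for x, y in points))  # True
```

Both compositions give the identity because the quartic terms cancel exactly, which is the mechanism behind the polynomial inverse.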
Examples
23.11. Problem: What is the area of the image S = Φ(R) if $\Phi([u, v]^T) = [u^2 - v^2 + 1, 2uv + 2]^T$
and R = {1 ≤ u ≤ 3, 0 ≤ v ≤ 1}? (This is $\Phi(z) = z^2 + c$ with c = 1 + 2i
in the complex.) We have
$$ d\Phi(u,v) = \begin{bmatrix} 2u & -2v \\ 2v & 2u \end{bmatrix} $$
and $|d\Phi(u,v)| = 4u^2 + 4v^2$. We see
from the change of variables formula that the area is $\int_0^1 \int_1^3 4u^2 + 4v^2 \, dudv = 112/3$.
23.12. Problem: What is the moment of inertia $\iint_R x^2 + y^2 \, dxdy$, where R is
the polar region given in polar coordinates as r ≤ 2 + sin(3θ)? Solution: using the
polar coordinate change of variables Φ with |dΦ| = r, we get
$\int_0^{2\pi} \int_0^{2 + \sin(3\theta)} r^2 \, r \, drd\theta = \int_0^{2\pi} (2 + \sin(3\theta))^4/4 \, d\theta$.
We explain in class how to get the value $227\pi/4$ of $\int_0^{2\pi} (2 + \sin(3\theta))^4 \, d\theta$ quickly, so that the moment of inertia is $227\pi/16$.
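A numerical check of the θ-integral is a useful sanity test. Expanding the fourth power and using that the integrals of sin, sin², sin³, sin⁴ of 3θ over a full period are 0, π, 0, 3π/4 gives 32π + 24π + 3π/4 = 227π/4 for the integral of $(2 + \sin(3\theta))^4$, hence 227π/16 once the factor 1/4 is included. The Python sketch below confirms this with a midpoint rule; the function name and the number of subintervals are assumptions.

```python
import math

def moment_of_inertia(n=2000):
    """Midpoint rule for the integral of (2 + sin(3t))^4 / 4 over [0, 2*pi]."""
    h = 2 * math.pi / n
    return h * sum((2 + math.sin(3 * (k + 0.5) * h))**4 / 4 for k in range(n))

print(moment_of_inertia(), 227 * math.pi / 16)
```

For a trigonometric polynomial integrated over a full period, the midpoint rule is accurate up to rounding, so the two printed numbers agree to many digits.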
23.13. Problem: Here is a famous problem. It is so popular, that it even made
it to Hollywood: compute $\iint_{\mathbb{R}^2} e^{-x^2 - y^2} \, dxdy$. Solution: this problem looks difficult
at first as we can not integrate with respect to x or y. The function $e^{-x^2}$ has no
elementary anti-derivative. This improper integral is doable in polar coordinates
as it is $\int_0^{2\pi} \int_0^{\infty} e^{-r^2} r \, dr d\theta = \pi$. It is the inner part $\int_0^{\infty} e^{-r^2} r \, dr$ which is an improper
integral. One deals with this by approximation. For every finite L we have
$\int_0^L e^{-r^2} r \, dr = -e^{-r^2}/2 \, |_0^L = 1/2 - e^{-L^2}/2$. This converges nicely to 1/2 for L → ∞. It
follows (and that is the punch line) that $\int_{-\infty}^{\infty} e^{-x^2} \, dx = \sqrt{\pi}$.
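The punch line can be confirmed numerically. The following Python sketch approximates the one-dimensional Gaussian integral on a large finite interval and compares its square with the polar computation; the cut-off L = 8, the step count and the function names are assumptions chosen for illustration.

```python
import math

def gauss_integral(L=8.0, n=4000):
    """Midpoint rule for the integral of exp(-x^2) over [-L, L]."""
    h = 2 * L / n
    return h * sum(math.exp(-(-L + (k + 0.5) * h)**2) for k in range(n))

def radial_part(L):
    """Closed form of the integral of exp(-r^2) r over [0, L]: 1/2 - exp(-L^2)/2."""
    return 0.5 - math.exp(-L * L) / 2

print(gauss_integral()**2, 2 * math.pi * radial_part(8.0), math.pi)
```

All three printed numbers should agree closely: the square of the line integral, the polar integral with cut-off L, and π.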

Homework

Problem 23.1: Given a disk $R = \{ x^2 + y^2 \le 1 \}$, we can make this into
a probability space and define the expectation of a function f as
$E[f] = \iint_R f \, dxdy / \pi$. The expectations of the random variables $f(x,y) = x^n$ are
examples of moments. Find $E[x]$, $E[x^2]$, $E[x^3]$ and $E[x^4]$.

Problem 23.2: What is the volume of the solid bounded by $z = f(x,y) = x^2 + y^2$
and $z = g(x,y) = 8 - x^2 - y^2$? You can write this as a double
integral $\iint_R g(x,y) - f(x,y) \, dxdy$ over a suitable region.

Problem 23.3: The fidget spinner is so “2017” now. What is hot now
is the math 22 spinner with 23 bearings! What is the moment of inertia
$\iint_G x^2 + y^2 \, dxdy$ of the math 22 fidget spinner region G given in
polar coordinates as 1/2 ≤ r ≤ 2 + cos(22θ)? To keep our bearings, we do
not count the bearings.

Problem 23.4: Biologist Piet Gielis once patented polar regions in

order to use them to describe biological shapes like cells, leaves, starfish
or butterflies. Don’t worry about violating patent laws when finding the
area of the following butterfly r(t) ≤ |8 − sin(t) + 2 sin(3t) + 2 sin(5t) −
sin(7t) + 3 cos(2t) − 2 cos(4t)|. (It can produce butterflies in your stomach
but there are some tricks to do that fast. Relax with the Math 22 fidget
spinner for example!)

Figure 2. The math 22 spinner and the butterfly.

Problem 23.5: a) Prove the Jacobian conjecture for linear maps

Φ(x) = Ax, where A is a 2 × 2 matrix.
b) Find a linear coordinate change Φ(x, y) for which the Jacobian deter-
minant is 1. It should be non-trivial in the sense, that we don’t just want
a diagonal matrix dΦ.
c) Find a counter example of the Jacobian conjecture for cubic polyno-
mials (just kidding). Find an example for the Jacobian conjecture where
both polynomials are not linear!

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 24: How to solve: Literature Samples

Seminar
24.1. In this seminar we look a bit around in the literature and collect problem solving
strategies. We have seen already a few methods:

Already seen principles

1. Induction (Theorem on unique row reduced echelon form)
2. Contradiction (Clairaut theorem)
3. Deformation (Hopf Umlaufsatz)
4. Invariant (Sum of Morse indices on island)

24.2. We will introduce a few more principles and tips and take the opportunity to
introduce a bit the literature. We only look at 4 books.

Figure 1. 4 Superstar books: Polya: How to Solve it. Tao: Solv-

ing Mathematical Problems, Perkins: The Eureka effect, Posamentier-
Krulik: Problem solving strategies.

24.3. The mother of all problem solving books is Polya’s ”How to solve it” which was
published in 1945. If you read and absorb this book, you immediately get measurably
stronger in math. Still after more than 70 years, it is the best. Here are the now
famous Polya principles:

Polya principles
1. Understand the problem: unknowns, data, draw figure.
2. Devise a plan: similar or related problem?
3. Carry out the plan: check each step.
4. Examine the solution: can other problems be solved as such?

24.4. This sounds a bit like the ”open the door, step through the door, close the door”
advice on ”how to exit the house”. But it is amazing to see the power in a method.
Why is it powerful? Because if one sees a harder problem the first time, one is totally
lost. (Proof: if not, then the problem was easy ....) Where do we start? This is where
it is good to already have a guide telling you: well, just first start to understand the
problem.

24.5. Here is an example of a problem in geometry which is mentioned in Polya’s book.

The problem is featured even on the cover of some later editions of the book.

Problem A: Inscribe a square Q in a triangle T so that two vertices of
Q are on the base of T and the other sides of T each contain a vertex of Q.

24.6. And here is another problem from Polya, slightly reformulated. Work out
this problem also using the Polya principles:

Problem B: Water is flowing with a constant rate of one cubic meter

per second into a conical vessel x2 + y 2 = z 2 , z ≥ 0. At which rate is the
water level rising if the water depth is z meters?

24.7. The second best book in our collection is ”Solving mathematical problems” by
Terence Tao. Why? Like Polya, Tao has also proven important new theorems (many
as a single author) and so got some street cred. Here are some problems from his book:

Problem C: An integer n has the same last digit as $n^5$.

Problem D: If k is a positive odd number, then $1^k + 2^k + \dots + n^k$ is
divisible by $1 + 2 + \dots + n$.

24.8. Tao calls the following identity ”his favourite algebraic identity”. We have done
the case of the sum of the first n squares in a practice exam.

Problem E: $1^3 + 2^3 + \dots + n^3 = (1 + 2 + 3 + \dots + n)^2$.
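In the spirit of "experiment, guess and test", statements like Problems C and E can be tried out on a computer before attempting a proof. The following Python sketch checks both for a range of n in exact integer arithmetic; the function names and the tested ranges are illustrative assumptions, and a finite check is of course not a proof.

```python
def same_last_digit(n):
    """Problem C: n and n^5 end in the same decimal digit."""
    return (n**5 - n) % 10 == 0

def cubes_identity(n):
    """Problem E: 1^3 + ... + n^3 equals (1 + ... + n)^2."""
    return sum(j**3 for j in range(1, n + 1)) == sum(range(1, n + 1))**2

print(all(same_last_digit(n) for n in range(0, 1001)))  # True
print(all(cubes_identity(n) for n in range(1, 200)))    # True
```

Seeing the pattern hold suggests what to prove: for Problem C that $n^5 - n$ is divisible by both 2 and 5, for Problem E an induction on n.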

24.9. Tao does not give a formal list of strategies, but explains in an example on page
4 the following principles. We paraphrase here these ”deformation principles”:
Tao’s deformation principles
a. Consider special, extreme or degenerate cases.
b. Solve a simplified version of the problem
c. Formulate a conjecture
d. Derive intermediate steps which would get it.
e. Reformulate, especially try contraposition.
f. Examine solutions of similar problems
g. Generalize the problem

24.10. The book of Perkins skillfully analyses the mechanisms of breakthrough ideas.
It distills the following mechanism for breakthrough ideas. It captures it pretty well,
since problems which are solved quickly rarely cover new ground.

Perkins
1. Long search. 99 percent perspiration. Work for years or decades.
2. Little apparent progress. Many failures.
3. A precipitating event. Maybe external circumstances.
4. A cognitive snap. Usually in a flash. Eureka!
5. Transformation. Flesh it out. Consequences.

24.11. The following exercise is from Perkins’ book. Try to solve it yourself and also
keep track of how you pursue the task of solving the problem.

Problem F: Someone brings an old coin to a museum director and offers

it for sale. The coin is stamped 540 B.C.E. Instead of considering the
purchase, the museum director calls the police. Why?

24.12. If this was too easy (experiments show that some people can answer it very
quickly; for others it takes longer), try this one, also from Perkins:

Problem G: You are driving a jeep through the Sahara desert. You
encounter someone lying face down in the sand, dead. There are no tracks
anywhere around. There has been no wind for days to destroy tracks. You
look into the pack on the person’s back. What do you find?

24.13. The book of Posamentier and Krulik is more intended for the teacher and less
for the research mathematician. It goes through the following principles

Posamentier-Krulik
1. Reason logically 2. Recognize patterns
3. Work backwards 4. Adopt different view
5. Consider extreme cases 6. Solve simpler problems
7. Organize data 8. Make a picture
9. Account all possibilities 10. Experiment, guess and test

24.14. Here is a strategy which often occurs: ”make it more general”. In the book
”Posamentier-Krulik: Problem-Solving-Strategies in mathematics” for example is the
problem:
Problem H: We have a 5 × 5 seating arrangement of students. The
teacher wants every student to change place and move to a seat to the
left, right, front or back. Is it possible? Solve this problem by looking first
at smaller classrooms like 2 × 2 or 3 × 3 or 2 × 3. In which cases is it
possible?

24.15. Once you have an idea, prove the statement.

Homework

24.1 A nursery rhyme is the riddle “As I was going to St. Ives, I met
a man with seven wives, Each wife had seven sacks, Each sack had seven
cats, Each cat had seven kits: Kits, cats, sacks, and wives, How many
were there going to St. Ives?” Pretend not to know the answer, solve the
riddle and follow the Polya principle. The rhyme was inspired by one of
the oldest problems texts in math, the Rhind Papyrus. But it was a more
serious question which translates: ”how many kits came from St Ives”?

24.2 (Tao) The perpendicular bisectors in a triangle meet in a point.

24.3 (Tao). Find all triangles for which the side lengths form an arithmetic
progression a, a + d, a + 2d.

24.4 Here are a few children riddles. We hope you don’t know all of
them (if you know the answer there is little benefit). Keep a log of how
you search for an answer: a) I’m tall when I’m young and I’m short when
I’m old. What am I? b) What gets wetter and wetter the more it dries?
c) What can run but can’t walk? d) What is full of holes and still holds
water?

24.5 In the 15 puzzle (invented in 1874 by Noyes Palmer Chapman)
the numbers 1 to 15 are arranged in a 4 × 4 grid. There is one hole
0 left. The task is to reorder a scrambled puzzle so that all numbers are
in order and 0 is at the very bottom right. The player can switch 0 with a
neighboring piece. Sam Loyd suggested to start with stones 14 and 15
switched and offered 1000 dollars for a solution. Prove that one can not
win the prize.

Oliver Knill, [email protected], Math 22a, Harvard College, Fall 2018

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 25: Solids

Lecture
25.1. A basic solid R in $\mathbb{R}^n$ is a bounded region enclosed by finitely many surfaces
$g_i(x_1, \cdots, x_n) = c_i$. A solid is a finite union of such basic solids. We focus here mostly
on n = 3. A 3D integral $I = \iiint_R f(x,y,z) \, dxdydz$ is defined in the same way as a
limit of a Riemann sum $I_n$ which for a given integer n is defined as
$$ I_n = \frac{1}{n^3} \sum_{(i/n, j/n, k/n) \in R} f\left(\frac{i}{n}, \frac{j}{n}, \frac{k}{n}\right) . $$

The convergence is proven in the same way. The boundary contribution can be neglected
in the limit n → ∞. If Φ : R → E is a parametrization of the solid, then
Theorem: $\iiint_R f(\Phi(u,v,w)) \, |d\Phi(u,v,w)| \, dudvdw = \iiint_E f(x,y,z) \, dxdydz$

Figure 1. Solids in R3 are sets which are unions of solids bound by

smooth surfaces. The second solid appears in homework 25.3, the last in
25.2
25.2. If f(x, y, z) is constant 1, then $\iiint_E f(x,y,z) \, dxdydz$ is the volume of the
solid E. For a cone $x^2 + y^2 \le z^2$, 0 ≤ z ≤ 1, we can write
$\iiint_E 1 \, dzdxdy = \iint_R 1 - \sqrt{x^2 + y^2} \, dxdy$, where R is the unit disc. Its volume is π − 2π/3 = π/3.
For the unit sphere $x^2 + y^2 + z^2 \le 1$ for example, we can write
$\iiint_E 1 \, dzdxdy = \iint_R 2\sqrt{1 - x^2 - y^2} \, dxdy$, where R is the unit disc $x^2 + y^2 \le 1$. In polar coordinates,
we get $\int_0^{2\pi} \int_0^1 2\sqrt{1 - r^2} \, r \, drd\theta = 4\pi/3$. We can also use spherical coordinates
$\Phi([\rho, \phi, \theta]) = [\rho \sin(\phi) \cos(\theta), \rho \sin(\phi) \sin(\theta), \rho \cos(\phi)]$, where $|d\Phi| = \rho^2 \sin(\phi)$. The volume
is $\int_0^{2\pi} \int_0^{\pi} \int_0^1 \rho^2 \sin(\phi) \, d\rho d\phi d\theta = 4\pi/3$.

25.3. There are two basic strategies to compute the integral: the first is to slice the
region up along a line like the z-axis then form $\int_a^b [ \iint_{R(z)} f(x,y,z) \, dxdy ] \, dz$. To get
the volume of a cone for example, integrate $\int_0^1 [ \iint_{R(z)} 1 \, dxdy ] \, dz$. The inner double
integral is the area of the slice which is $\pi z^2$. The last integral gives π/3. A second
reduction is to see the solid sandwiched between two graphs of a function on a region
R, then form $\iint_R [ \int_{g(x,y)}^{h(x,y)} f(x,y,z) \, dz ] \, dxdy$. In the cone case, we have for R the disc
of radius 1. The lower function is $g(x,y) = \sqrt{x^2 + y^2}$, the upper function is 1. We get
$\iint_R [ 1 - \sqrt{x^2 + y^2} ] \, dxdy$, a double integral which best can be computed using polar
coordinates: $\int_0^{2\pi} \int_0^1 (1 - r) r \, drd\theta = 2\pi(1/2 - 1/3) = \pi/3$. Burgers and fries!
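Both reductions for the cone can be compared numerically. In the following Python sketch, the slicing method is a single integral of the slice area $\pi z^2$ and the sandwich method is the polar integral of (1 − r) r with the factor 2π from the θ-integration; the helper `midpoint` and the variable names are illustrative assumptions.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Slicing along z: the slice at height z is a disc of area pi z^2.
by_slices = midpoint(lambda z: math.pi * z * z, 0.0, 1.0)

# Sandwich between z = r and z = 1 over the unit disc, in polar coordinates;
# the theta-integral contributes the factor 2 pi.
by_sandwich = 2 * math.pi * midpoint(lambda r: (1 - r) * r, 0.0, 1.0)

print(by_slices, by_sandwich, math.pi / 3)
```

The two methods organize the same sum differently, so the printed values should all be close to π/3.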

Figure 2. The “burger and fries methods” to compute triple integral.

The first reduces to a single integral, the second to a double integral.

25.4. We have seen in the theorem the coordinate change formula if Φ : R → E is
given. For spherical coordinates $\Phi([\rho, \phi, \theta]) = [\rho \sin(\phi) \cos(\theta), \rho \sin(\phi) \sin(\theta), \rho \cos(\phi)]$,
we have $|d\Phi| = \rho^2 \sin(\phi)$. For cylindrical coordinates, the situation is the same as
for polar coordinates. The map $\Phi([r, \theta, z]) = [r \cos(\theta), r \sin(\theta), z]$ produces $|d\Phi| = r$.
25.5. Let us find the integral $\iiint_E 1 \, dxdydz$, where $E = \{ x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1 \}$
is a solid ellipsoid. The most comfortable way is to introduce another coordinate
change $\Psi([x, y, z]) = [ax, by, cz]$ which maps the solid sphere S to the solid ellipsoid
E. Then take the spherical coordinate map Φ : R → S, where R = {(ρ, φ, θ) | 0 ≤ ρ ≤
1, 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π}. Now Ψ ◦ Φ : R → E is a coordinate change which maps
R to the ellipsoid. By the chain rule, the distortion factor is $|d\Psi| |d\Phi| = abc \rho^2 \sin(\phi)$.
The integral is $abc \, (1/3)(2\pi) \int_0^{\pi} \sin(\phi) \, d\phi = (4\pi/3)(abc)$.
25.6. In order to compute the volume of a solid torus, we can introduce a special
coordinate system $\Phi([r, \psi, \theta]) = [(b + ar\cos(\psi)) \cos(\theta), (b + ar\cos(\psi)) \sin(\theta), a \sin(\psi)]$.
The solid torus E is then the image of the cuboid {(r, ψ, θ) | 0 ≤ r ≤ 1, 0 ≤ ψ ≤
2π, 0 ≤ θ ≤ 2π}. The determinant is $|d\Phi| = a^2 \cos^2(\psi)(b + ar\cos(\psi))$. Integration over
the cuboid gives the volume $(2\pi b)(\pi a^2)$.
Examples
25.7. To find $\iiint_E f \, dV$ for E = {0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1} and
$f(x,y,z) = 24x^2y^3z$, set up the integral $\int_0^1 \int_0^1 \int_0^1 24x^2y^3z \, dz \, dy \, dx$. Start with the core
$\int_0^1 24x^2y^3z \, dz = 12x^2y^3$, then integrate the middle layer, $\int_0^1 12x^2y^3 \, dy = 3x^2$ and
finally handle the outer layer: $\int_0^1 3x^2 \, dx = 1$.
25.8. To find the moment of inertia $I = \iiint_E (x^2 + y^2) \, dV$ of a sphere $E = \{x^2 + y^2 + z^2 \le L^2\}$,
we use spherical coordinates. We know that $x^2 + y^2 = \rho^2\sin^2(\phi)$ and
the distortion factor is $\rho^2\sin(\phi)$. We have therefore
$$ I = \int_0^{2\pi}\int_0^{\pi}\int_0^{L} \rho^2\sin^2(\phi)\,\rho^2\sin(\phi) \, d\rho\, d\phi\, d\theta = 8\pi L^5/15 . $$
We will see some details in class. If we rotate the sphere around the z-axis with angular
velocity $\omega$, then $I\omega^2/2$ is the kinetic energy of that sphere. Example: the moment
of inertia of the earth is $8 \cdot 10^{37}\ \mathrm{kg\,m^2}$. With an angular velocity of $\omega = 2\pi/\mathrm{day} = 2\pi/(86400\,\mathrm{s})$,
this rotational kinetic energy is $I\omega^2/2 = 8 \cdot 10^{37}\ \mathrm{kg\,m^2} \cdot 2\pi^2/(7464960000\,\mathrm{s^2}) \sim 2 \cdot 10^{29}\,\mathrm{J} \sim 5 \cdot 10^{25}\,\mathrm{kcal}$.
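The arithmetic for the rotational energy of the earth can be checked in a few lines. The moment of inertia is the value quoted in the text; the conversion factor of 4184 J per kcal is an assumption added here.

```python
import math

I_EARTH = 8e37             # kg m^2, the value quoted in the text
SECONDS_PER_DAY = 86400.0
JOULES_PER_KCAL = 4184.0   # assumed conversion factor (thermochemical kilocalorie)

omega = 2 * math.pi / SECONDS_PER_DAY     # angular velocity in rad/s
energy_joule = 0.5 * I_EARTH * omega**2   # rotational kinetic energy I*omega^2/2
energy_kcal = energy_joule / JOULES_PER_KCAL
print(f"{energy_joule:.2e} J, {energy_kcal:.2e} kcal")
```

Keeping the factor $(2\pi)^2/2$ from $\omega^2/2$ gives an energy of roughly $2 \cdot 10^{29}$ J.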
25.9. Problem: Find the volume of the intersection $E$ of $x^2 + y^2 \le 1$, $x^2 + z^2 \le 1$ and
$y^2 + z^2 \le 1$. Solution: look at 1/16'th of the body, given in cylindrical coordinates by
$0 \le \theta \le \pi/4$, $r \le 1$, $z > 0$. The roof is $z = \sqrt{1 - x^2}$ because above the "one eighth
disc" $R$ only the cylinder $x^2 + z^2 = 1$ matters. The polar integration problem
$$ 16\int_0^{\pi/4}\int_0^1 \sqrt{1 - r^2\cos^2(\theta)}\; r \, dr\, d\theta $$
has an inner r-integral of $(16/3)(1 - \sin^3(\theta))/\cos^2(\theta)$. Integrating this over $\theta$ can be
done by integrating $f(x) = (1 - \sin^3(x))\sec^2(x)$ by parts (using $\tan'(x) = \sec^2(x)$),
leading to the anti-derivative $-\cos(x) - \sec(x) + \tan(x)$ of $f$. The result is $16 - 8\sqrt{2}$.
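A numerical evaluation of the polar integral gives an independent check of the value $16 - 8\sqrt{2} \approx 4.686$. This is a plain midpoint rule; the function name and grid size are assumptions for illustration.

```python
import math

def tricylinder_volume(n=400):
    """Midpoint rule for 16 * integral of sqrt(1 - r^2 cos^2(theta)) * r
    over 0 <= theta <= pi/4, 0 <= r <= 1."""
    d_theta, d_r = (math.pi / 4) / n, 1.0 / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        cos2 = math.cos(theta) ** 2
        for j in range(n):
            r = (j + 0.5) * d_r
            total += math.sqrt(1.0 - r * r * cos2) * r
    return 16 * total * d_theta * d_r

approx = tricylinder_volume()
exact = 16 - 8 * math.sqrt(2)
print(approx, exact)
```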
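25.10. Problem: A pencil $E$, a hexagonal cylinder of radius 1 above the xy-plane,
is cut by a sharpener below the paraboloid $z = 10 - x^2 - y^2$. What is its volume? Solution:
we consider one sixth of the pen, where the base is the polar region $0 \le \theta \le 2\pi/6$ and
$r(\theta) \le \sqrt{3}/(\sqrt{3}\cos(\theta) + \sin(\theta))$. The pen's back is $z = 0$ and the sharpened part is
$z = 10 - r^2$. The integral over this sixth is
$$ \int_0^{\pi/3}\int_0^{\sqrt{3}/(\sqrt{3}\cos(\theta)+\sin(\theta))}\int_0^{10-r^2} 1 \; r \, dz\, dr\, d\theta . $$
The integral can be computed and is $115\sqrt{3}/48$, so that the whole pencil has volume
$6 \cdot 115\sqrt{3}/48 = 115\sqrt{3}/8$.¹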
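The pencil volume can be cross-checked numerically. In the sketch below the two inner integrals are done by hand, $\int_0^R (10 - r^2)\, r\, dr = 5R^2 - R^4/4$, and only the $\theta$-integral over one sector is approximated; the function name and step count are assumptions. The result agrees with $115\sqrt{3}/8 \approx 24.9$ for the whole pencil.

```python
import math

def pencil_volume(n=2000):
    """Six times the integral over one hexagon sector 0 <= theta <= pi/3."""
    d_theta = (math.pi / 3) / n
    sector = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        R = math.sqrt(3) / (math.sqrt(3) * math.cos(theta) + math.sin(theta))
        # Inner integrals done by hand: int_0^R (10 - r^2) r dr = 5 R^2 - R^4 / 4
        sector += (5 * R**2 - R**4 / 4) * d_theta
    return 6 * sector

approx = pencil_volume()
closed_form = 115 * math.sqrt(3) / 8
print(approx, closed_form)
```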

¹ An exam problem at ETH in a single variable calculus exam when Oliver was an undergrad.
² Archimedes Revenge, first appeared in Math S21a exam, Harvard Summer School, 2017.

Figure 3. Illustrating two harder problems: the pen problem and the
"Archimedes revenge problem"², asking to prove that $E : x^2 + y^2 - z^2 \le 1$,
$y^2 + z^2 - x^2 \le 1$, $z^2 + x^2 - y^2 \le 1$ has $\mathrm{Vol}(E) = \log(256)$.

Homework
Problem 25.1: Find the moment of inertia $\iiint_E (x^2 + y^2) \, dV$, where
$E = \{x^2 + y^2 \le z^2, |z|^2 \le 1\}$ is the double cone.

Problem 25.2: a) In Figure 1, you see the solid $E = \{x^2 + z^2 \le 1, y^2 + z^2 \le 1\}$. Find its volume.
b) You see also the union of two cylinders $\{x^2 + z^2 < 1, |y|^2 < 9\}$ and
$\{y^2 + z^2 < 1, |x|^2 < 9\}$. Use a) to find the volume.

Problem 25.3: In Figure 1, we see the solid $E = \{x^2 \le 1, y^2 \le 1, z^2 \le 1, x^2 + y^2 \ge 1, x^2 + z^2 \ge 1, y^2 + z^2 \ge 1\}$. Find its volume.

Problem 25.4: Evaluate the triple integral
$$ \iiint_E xy \, dV , $$
where $E$ is bounded by the parabolic cylinders $y = 3x^2$ and $x = 3y^2$ and
the planes $z = 0$ and $z = x + y$.

Problem 25.5: We have seen the problem in the movie "Gifted" to
compute the improper integral of $e^{-x^2}$. Here is another approach: verify
$$ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2+z^2)} \, dx\, dy\, dz = (\sqrt{\pi})^3 . $$
Use this as in the "Gifted" computation to find $\int_{-\infty}^{\infty} e^{-x^2} \, dx$. You can do
that without knowing that the latter is $\sqrt{\pi}$.

Oliver Knill, Math 22a, Harvard College, Fall 2018


Unit 26: Surface area

26.2. More generally, if $f : S \to \mathbb{R}$ is a function which describes something like a
density, then $\iint_R f(r(u, v)) \, |r_u \times r_v| \, du\,dv$ is an integral which is abbreviated as $\iint_S f \, dS$
and called a scalar surface integral. For example, if $f$ is a density on the surface, then
$\iint_S f \, dS$ is the mass. Again, we have to stress that in this integral, the orientation
of the surface is irrelevant. The distortion factor $|dr|$ is always non-negative. It is
better to think of $\iint_S f \, dS$ as a weighted surface area generalizing the area $\iint_S dS$.¹

26.3. Here is the most general change of integration formula for maps $r : \mathbb{R}^m \to \mathbb{R}^n$,
with distortion factor $|dr| = \sqrt{\det(dr^T dr)}$. The formula holds for $m > n$ too, det is
then a pseudo determinant. If $S = r(R)$ is the image of a solid $R$ under a $C^2$ map $r$
and $f : \mathbb{R}^n \to \mathbb{R}$ is a function, then the mother of all substitution formulas is

Theorem: $\iint_R f(r(u)) \, |dr(u)| \, du = \iint_S f(u) \, du$.

26.4. The proof is the same as seen in the two-dimensional change of variable situation.
Just because $n$ is used for the target space $\mathbb{R}^n$, we use the basic size $1/N$.
We chop up the region into parts $R \cap Q$ with cubes $Q$ of size $1/N$ and estimate
the difference between $\mathrm{Vol}(dr(Q))$ and $\mathrm{Vol}(r(Q))$ by $C M_N/N^2$, leading to an overall difference
bounded by $F C M_N/N^2$, where $F$ is the maximal value of $f$ on $R$ and $M_N$ is the
Heine-Cantor modulus of continuity of $f$. Adding everything up gives an
error $F C \, \mathrm{Vol}(R) M_N + 2n \, \mathrm{Vol}(\delta R) F/N \to 0$, where $\delta R$ is the boundary of $R$. There is
one new thing: we have to see why $\sqrt{\det(A^T A)}$ is the volume of the parallelepiped
spanned by the column vectors of the Jacobian matrix $A = dr$. We will talk about
determinants in detail later, but if $A$ is in row reduced echelon form then $A^T A$ is the
identity matrix and the determinant is 1, agreeing with the volume. Now notice that if
a column of $A$ is scaled by $\lambda$ producing a new matrix $B$, then $\det(B^T A) = \lambda \det(A^T A)$
and $\det(B^T B) = \lambda^2 \det(A^T A)$. If two columns of $A$ are swapped leading to a new
matrix $B$, then $\det(B^T A) = -\det(A^T A)$ and $\det(B^T B) = \det(A^T A)$. If a column of $A$
is added to another column, then this does not change $\det(B^T B)$. The only row reduction
step which affects $|dr|$ is the scaling. But that is completely in sync with what happens
with the volume. QED.

¹ Unfortunately, scalar integrals are often placed close to the integration of differential forms (like
volume forms). The latter are of a different nature and use an integration theory in which spaces come
with orientation. So far, replacing $r(u, v)$ with $r(v, u)$ gives the same result (like area or mass).

26.5. The last theorem covers everything we have seen and we ever need to know when
integrating scalar functions over manifolds. In the special case $f = 1$ it leads to:

Theorem: $\iint_R |dr(u)| \, du = \mathrm{Vol}(S)$.

26.6. Here are the important small dimensional examples:

Examples
26.7. In all the examples of surface area computations, we take a parametrization
$r(u, v) : R \to S$, then use that the distortion factor is $\sqrt{\det(dr^T dr)} = |r_u \times r_v|$.
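The identity $\sqrt{\det(dr^T dr)} = |r_u \times r_v|$ can be tested on a concrete surface. The sketch below uses the tangent vectors of a sphere of radius $L$ at one sample point; the helper names and the sample values are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def distortion_gram(ru, rv):
    """sqrt(det(dr^T dr)) for the 3x2 Jacobian dr with columns ru and rv."""
    g11, g12, g22 = dot(ru, ru), dot(ru, rv), dot(rv, rv)
    return math.sqrt(g11 * g22 - g12 * g12)

def distortion_cross(ru, rv):
    """|ru x rv|, the length of the cross product."""
    n = cross(ru, rv)
    return math.sqrt(dot(n, n))

# Tangent vectors of a sphere of radius L at a sample point (theta, phi).
L, theta, phi = 2.0, 0.7, 1.1
r_theta = [-L * math.sin(phi) * math.sin(theta), L * math.sin(phi) * math.cos(theta), 0.0]
r_phi = [L * math.cos(phi) * math.cos(theta), L * math.cos(phi) * math.sin(theta), -L * math.sin(phi)]

gram = distortion_gram(r_theta, r_phi)
area_factor = distortion_cross(r_theta, r_phi)
print(gram, area_factor, L**2 * math.sin(phi))
```

All three printed numbers should coincide, in line with the sphere's distortion factor $L^2\sin(\phi)$.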

Figure 1. The distortion factors $|dr| = \sqrt{|g|} = \sqrt{\det(g)} = \sqrt{\det(dr^T dr)}$
appear in general. For $m = 2$, $n = 3$ we get surface area $\iint_R |r_u \times r_v| \, du\,dv$.

26.8. Problem: find the surface area of a sphere $x^2 + y^2 + z^2 = L^2$. Solution: Parametrize
the surface $r([\theta, \phi]) = [L\sin(\phi)\cos(\theta), L\sin(\phi)\sin(\theta), L\cos(\phi)]$. The distortion
factor is $L^2\sin(\phi)$. The surface area is $4\pi L^2$.
26.9. Problem: find the surface area of a surface of revolution given in cylindrical
coordinates as $r = g(z)$, $a \le z \le b$. Solution: Parametrize the surface $r([\theta, z]) = [g(z)\cos(\theta), g(z)\sin(\theta), z]$.
The distortion factor is $g(z)\sqrt{1 + g'(z)^2}$.
26.10. As an example, we can look at the surface of revolution $x^2 + y^2 = 1/z^2$, $|z|^2 > 1$.
The volume of the solid enclosed by each of the two horns $z > 1$ and $z < -1$ is $\pi$. The surface area is infinite.
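Both claims follow from short computations. The sketch below looks only at the horn $z \ge 1$, with $g(z) = 1/z$ and the distortion factor $g(z)\sqrt{1 + g'(z)^2}$ of a surface of revolution; the restriction to one horn is an assumption made to keep the formulas short.

```latex
\begin{aligned}
\mathrm{Vol} &= \int_1^{\infty} \pi\, g(z)^2 \, dz
             = \pi \int_1^{\infty} \frac{dz}{z^2} = \pi , \\
\mathrm{Area} &= \int_0^{2\pi}\!\!\int_1^{\infty} \frac{1}{z}\sqrt{1 + \frac{1}{z^4}} \; dz\, d\theta
              \;\ge\; 2\pi \int_1^{\infty} \frac{dz}{z} = \infty .
\end{aligned}
```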
26.11. Problem: find the surface area of the graph of a function $z = f(x, y)$, $(x, y) \in R$.
Solution: Parametrize the surface as $r([x, y]) = [x, y, f(x, y)]$. The distortion
factor is $|r_x \times r_y| = \sqrt{1 + f_x^2 + f_y^2}$.
26.12. Problem: what is the surface area of the intersection of $x^2 + z^2 \le 1$, $6x + 3y + 9z = 12$?
Solution: The surface is a plane but also a graph over $R = \{x^2 + z^2 \le 1\}$
in the xz-plane. The simplest parametrization is $r([x, z]) = [x, (12 - 6x - 9z)/3, z] = [x, 4 - 2x - 3z, z]$.
It gives $|r_x \times r_z| = |[-2, -1, -3]| = \sqrt{14}$. The surface area is
$\iint_R \sqrt{14} \, dx\,dz = \sqrt{14}\,\mathrm{Area}(R) = \sqrt{14}\,\pi$.
26.13. The following hyperspherical coordinates parametrize the 3-dimensional
sphere $x^2 + y^2 + z^2 + w^2 = 1$ in $\mathbb{R}^4$:
$$ r([\phi, \psi, \theta]) = [\cos(\phi), \sin(\phi)\cos(\psi), \sin(\phi)\sin(\psi)\cos(\theta), \sin(\phi)\sin(\psi)\sin(\theta)] , $$
with $\theta \in [0, 2\pi]$, $\phi \in [0, \pi]$, $\psi \in [0, \pi]$. The distortion factor is
$\sqrt{\det(dr^T dr)} = \sqrt{\sin^4(\phi)\sin^2(\psi)} = \sin^2(\phi)\sin(\psi)$, so that the surface area of the hypersphere is
$2\pi\int_0^{\pi}\int_0^{\pi} \sin^2(\phi)\sin(\psi) \, d\phi\, d\psi = 2\pi^2$.
26.14. In dimension $n$, what is the volume $|B_n|$ of the n-dimensional unit ball $B_n$ in
$\mathbb{R}^n$ and the volume $|S_n|$ of the n-dimensional unit sphere $S_n$ in $\mathbb{R}^{n+1}$? It starts with
$|B_0| = 1$, as $B_0$ is a point, and $|S_0| = 2$, as $S_0$ consists of two points. The n-ball of radius
$\rho$ has the volume $|B_n|\rho^n$ and the n-sphere of radius $\rho$ has the volume $|S_n|\rho^n$. Because
$|B_{n+1}| = \int_0^1 |S_n|\rho^n \, d\rho$, we have $|B_{n+1}| = |S_n|/(n + 1)$. Because $S_n$ can be written as a
union of products of $(n-2)$-spheres with $S_1$, we get
$|S_n| = 2\pi\int_0^{\pi/2} |S_{n-2}| \sin^{n-2}(\phi)\cos(\phi) \, d\phi = 2\pi|S_{n-2}|/(n-1) = 2\pi|B_{n-1}|$.
We know now all: just start with $|B_0| = 1$, $|S_0| = 2$, $|B_1| = 2$, $|S_1| = 2\pi$ and

Theorem: $|B_n| = \frac{2\pi}{n}|B_{n-2}|$, $|S_n| = \frac{2\pi}{n-1}|S_{n-2}|$.

The 5-ball has maximal volume 5.26379... among all unit balls. The 6-sphere has
maximal surface area 33.0734... among all unit spheres. The volume of the 30-ball is
only 0.00002.... The surface area of the 30-sphere for example is only 0.0003. Compare
with an n-unit cube of volume 1 and a boundary surface area $2n$. High dimensional
spheres and balls are tiny!

[Figure: plot of area/volume against dimension, for dimensions 5 to 30.]
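The recursion is easy to run. The sketch below tabulates $|B_n|$ and $|S_n|$ up to $n = 30$ from the stated starting values and locates the maxima; the function name and the list layout are assumptions for illustration.

```python
import math

def ball_and_sphere_volumes(n_max):
    """Return lists B, S with B[n] = |B_n| and S[n] = |S_n| for 0 <= n <= n_max."""
    B = [1.0, 2.0]            # |B_0| = 1, |B_1| = 2
    S = [2.0, 2 * math.pi]    # |S_0| = 2, |S_1| = 2*pi
    for n in range(2, n_max + 1):
        B.append(2 * math.pi / n * B[n - 2])
        S.append(2 * math.pi / (n - 1) * S[n - 2])
    return B, S

B, S = ball_and_sphere_volumes(30)
best_ball = max(range(31), key=lambda n: B[n])
best_sphere = max(range(31), key=lambda n: S[n])
print(best_ball, B[best_ball])
print(best_sphere, S[best_sphere])
print(B[30], S[30])
```

As a cross-check, the values can be compared with the closed form $|B_n| = \pi^{n/2}/\Gamma(n/2 + 1)$, which is a standard formula not derived in the text.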
26.15. If $S$ is a cylinder $x^2 + y^2 = 1$, $0 < z < 1$, triangulated with each triangle
smaller than $1/n \to 0$, does the area converge to the surface area $A(S)$? No! A counter
example is the Schwarz lantern from 1880. The cylinder is cut into $m$ slices and
$n$ points are marked on the rim of each slice to get triangles like $A = (1, 0, 0)$,
$B = (\cos(4\pi/n), \sin(4\pi/n), 0)$, $C = (\cos(2\pi/n), \sin(2\pi/n), 1/m)$ of area
$\sin(2\pi/n)(1/m)\sqrt{2 + 3m^2 - 4m^2\cos(2\pi/n) + m^2\cos(4\pi/n)}/\sqrt{2}$. The $nm$ triangles have
total area $\sim 2\pi\sqrt{2 + 8m^2\pi^4/n^4}/\sqrt{2}$. For $m = n^3$, the triangulated area diverges.
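The divergence can be seen numerically. The sketch below sums $nm$ copies of the triangle area, written in the equivalent form $\sin(2\pi/n)(1/m)\sqrt{1 + m^2(1 - \cos(2\pi/n))^2}$ (obtained with $\cos(4\pi/n) = 2\cos^2(2\pi/n) - 1$), for $m = n$, $m = n^2$ and $m = n^3$. The function name is an assumption.

```python
import math

def lantern_area(n, m):
    """Total area of n*m triangles, each of area
    sin(2 pi/n)/m * sqrt(1 + m^2 (1 - cos(2 pi/n))^2)."""
    c = math.cos(2 * math.pi / n)
    triangle = math.sin(2 * math.pi / n) / m * math.sqrt(1 + m**2 * (1 - c) ** 2)
    return n * m * triangle

for n in (10, 100, 1000):
    print(n, lantern_area(n, n), lantern_area(n, n**2), lantern_area(n, n**3))
```

For $m = n$ the values approach the cylinder area $2\pi$, for $m = n^2$ they approach a larger constant, and for $m = n^3$ they grow without bound.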

Homework

Problem 26.1: Find the surface area of the Einstein-Rosen bridge
$r(u, v) = [3v^3, v^9\cos(u), v^9\sin(u)]^T$, where $0 \le u \le 2\pi$ and $-1 \le v \le 1$.
Tunnels connecting different parts of space-time appear frequently in
science fiction.

Problem 26.2: Find the area of the surface given by the helicoid
$r(u, v) = [u\cos(v), u\sin(v), v]^T$ with $0 \le u \le 1$, $0 \le v \le \pi$.

Problem 26.3: A decorative paper lantern is made of 8 surfaces. Each
is parametrized by
$$ r(t, z) = [10z\cos(t), 10z\sin(t), z] $$
with $0 \le t \le 2\pi$ and $0 \le z \le 1$ and then translated or rotated. Find the
total surface area of the lantern.

Problem 26.4: Find the surface area of the torus
$[5\cos\theta + \cos\alpha\cos\theta, 5\sin\theta + \cos\alpha\sin\theta, \sin\alpha]$, where $\theta$ and $\alpha$ are both in $[0, 2\pi]$.

Problem 26.5: The Hopf parametrization $r : R \subset \mathbb{R}^3 \to S \subset \mathbb{R}^4$ is
$$ r([\phi, \theta_1, \theta_2]) = [\cos(\phi)\cos(\theta_1), \cos(\phi)\sin(\theta_1), \sin(\phi)\cos(\theta_2), \sin(\phi)\sin(\theta_2)] . $$
Check that $|dr| = \sqrt{\det(dr^T dr)} = \cos(\phi)\sin(\phi) = \sin(2\phi)/2$, then verify
again that the surface area of the three sphere is $2\pi^2$. If we fix $\phi$, we see
tori. Their union with $\phi \in [0, \pi/2]$ is the Hopf fibration.

Figure 2. A "wormhole", the Hopf fibration of the 3-sphere $x^2 + y^2 + z^2 + w^2 = 1$
and a Schwarz lantern.


Unit 27: A visit by Archimedes

Seminar
27.1. In this seminar we have the honor to have Archimedes as a special guest. We
talk to him using a technology called “quantum forward tunneling” which allows us to
interact with part of the past without running into a causality paradox. The actual
Archimedes did not know about the interview. It is his “quantum spirit” which does
it for us. How does it work? Quantum space-time produces sometimes tiny wormhole
constellations in which a wave function can be trapped. By harvesting many of those
trapped waves, we can rebuild and interact with an object or person from a previous
time. The so established “time tunnel” is sustainable only for a short time as the
trapped waves will fade within half an hour. It is enough time however for a short
interview. We take the opportunity and ask him about his theorems.
27.2. Math 22a: What a pleasure to have you here. Welcome! Archimedes: I’m
glad to find myself in this lovely place. It must be a dream. I don’t recognize the
town but it feels like an ‘Alexandria of the future’. Math 22a: Yes, it is also a hot
spot for science, but there are many now. We are eager to learn a bit about your proof
expertise.
27.3. Math 22a: What result of yours do you consider the most important one?
Archimedes: Definitely the formula for the volume of the sphere! Math 22a: Why?
Archimedes: It was much harder to get this than the circumference of the circle or the
surface area of the sphere. It was also harder to test the result experimentally. Math
22a: How did you measure? Archimedes: We built wooden models of cylinders, cones
and spheres of the same base radius and height and measured their volume ratios.
Problem A: Explain how Archimedes can measure the volumes of the wooden
models. If you don’t know, take a bath. Given are a cylinder C, a cone
O and a sphere S of base length 1. What ratios |C|/|S|, |O|/|S| do the
measurements show?

27.4. Math 22a: Was the comparison of the sphere with the complement of a cone
in the cylinder historically the first proof? Archimedes: The relation had been
conjectured before. It had been suspected that the ratio between the volume of a
sphere and the volume of a cylinder is the fraction 2/3 but nobody had been able to
prove this relation before I could see the slicing trick.

Problem B: Explain why slicing the unit sphere at height z gives the
same area as a ring of radius 1 in which a hole of size z has been
drilled.

27.5. Math 22a: Do you remember the precise moment when the discovery struck?
Archimedes: I don’t recall directly but it must have been one of these “hot tub ideas”.

27.6. Math 22a: This discovery must have occurred after you got the circle circumference
computed. How difficult was the latter? Archimedes: This also needed some
time. It emerged pretty early that the circumference is somehow proportional to the
radius. The measurement of the constant was then a bit trickier, and it remained
open what fraction it is. 22/7 was close. I first got the circumference/area ratio. I did that
using the following picture.

Problem B: How does the picture below prove that the area A, radius
R and circumference C of a circle satisfy 2A = RC? How can you make this
precise, as in reality the circular sector does not have the same area as the
triangle? (Hint: you can use modern tools like L’Hôpital’s rule if you like.)

Figure 1. The circle proof.

27.7. Math 22a: We also wonder about your computation of the volume of the
“hoof”, which is the solid bounded by the cylinder $x^2 + y^2 = 1$ and the planes $z = x$ and $z = 0$.
Archimedes: I don’t recognize the symbols you just spelled out but I know what
object you are talking about. It was exciting to see a solid bounded partly by round
parts to have a rational volume, which is two thirds of the height. One can see that the
result is 2/3 in various ways.

Problem C: a) Take a hoof of height 1 and cut it into triangular pieces
which are obtained if y is constant. Show that the area of the triangle is
$(1 - y^2)/2$ and conclude from this that the volume is 2/3. b) Cut the same hoof
into rectangular pieces which are obtained if x is constant. Show that the
area of the rectangle is $2x\sqrt{1 - x^2}$ and conclude that the volume is 2/3.

27.8. Math 22a: Also very impressive is your computation of the surface area of the
sphere by relating it with the surface area of a cylinder. What was the intuition there?
Archimedes: Actually, a drawing which is accurate enough shows this pretty well.
As both situations have circular symmetry, we only need to understand what happens
with the lengths on a sphere when it is projected on the cylinder. There are similar
triangles. Take a stick of some length and place it onto the sphere pointing to the
north pole. If it is close enough to the pole that its height-length is one half of the actual
length, then the radius at that position is also one half, and so on. As the area of a small sphere
strip is height times radius times about 22/7, this is also the area of a cylinder strip. In
the sphere case, the factor one-half is applied to the radius. In the cylinder case it is
applied to the height.

Problem C: Explain this in more modern terms. We have a unit sphere

and a cylinder of radius 1. Look at what the surface area of a strip $[z, z + dz]$
is in both cases. You can use the spherical angle φ which you know from
spherical coordinates.

27.9. Math 22a: A last question: What is a function in mathematics? Archimedes:
I don’t know this expression: for me, mathematics deals with geometric objects and
numbers which characterize those objects like length, area or volume. Math 22a: We
interpret your formula for the volume of a sphere as $V(r) = 4\pi r^3/3$, which is a rule
assigning to the radius r a number. We also have rules which tell how to compute rates
of change. For the function $V(r)$ for example, its rate of change is $4\pi r^2$, the surface
area of the sphere. The reason is that if we decrease the radius by a small unit, then
this essentially means taking away a layer of area $4\pi r^2$. Archimedes: This is cool.
Let me see: does this also work for the area and circumference of a disc? Math 22a:
Certainly. Go ahead. Archimedes: Well, with this new language, we would say that
the area of a disc of radius r is $f(r) = \pi r^2$. I assume that for any integer n the rate of change of
$r^n$ is $n r^{n-1}$. Math 22a: Yes, that is correct. Archimedes: In that case the rate of
change of $\pi r^2$ is $2\pi r$ and indeed this is my formula for the circumference of a circle.
This is “Phaidros”.
27.10. Math 22a: Thank you very much for the interview. It will inspire us for the
second midterm exam. Maybe you can visit and take the exam on Tuesday or review
on Sunday? Archimedes: “It will be my pleasure.”
Homework

27.1 Find a solid which has the property that if you project it on the
xy-plane it is a half circle, if you project it on the yz plane it is a triangle
and if you project it onto the xz-plane, it is a rectangle.

27.2 There are regions in the plane which have the property that their
thickness is constant 1 but which are not circles. Find some.

27.3 There is a beautiful theorem of Pappus which gives the formula for the volume of
a solid obtained by taking all points in distance d from a given curve r(t),
provided the thickened curve does not intersect itself: the formula is $L\pi d^2 + 4\pi d^3/3$,
where L is the length of the curve. Archimedes designed a spiral
pump in which a spiral r(t) = [10 cos(t), 10 sin(t), t] plays an important
role. Assume 0 ≤ t ≤ 100π, what is the volume of the solid consisting of
all points in distance 1 to the curve?

Figure 2. The “Archimedes Screw” and another sphere proof.

27.4 Let A be the solid obtained by intersecting three perpendicular solid

cylinders and let B be the solid obtained by intersecting two. What are the
formulas for the volumes and surface areas?

27.5 Archimedes had another picture for the volume of a sphere. It is

seen in the picture above to the right. Please explain.


Unit 28: Keywords for Second Hourly

Partial Derivatives
$f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$   partial derivative
$L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$   linear approximation
$Q(x, y) = L(x, y) + f_{xx}(x - x_0)^2/2 + f_{yy}(y - y_0)^2/2 + f_{xy}(x - x_0)(y - y_0)$   quadratic approximation
$L(x, y)$ estimates $f(x, y)$ near $(x_0, y_0)$. The result is $f(x_0, y_0) + a(x - x_0) + b(y - y_0)$
tangent line: $ax + by = d$ with $a = f_x(x_0, y_0)$, $b = f_y(x_0, y_0)$, $d = ax_0 + by_0$
tangent plane: $ax + by + cz = d$ with $a = f_x$, $b = f_y$, $c = f_z$, $d = ax_0 + by_0 + cz_0$
estimate $f(x, y, z)$ by $L(x, y, z)$ near $(x_0, y_0, z_0)$
$f_{xy} = f_{yx}$   Clairaut's theorem, if $f_{xy}$ and $f_{yx}$ are continuous.
$r_u(u, v)$, $r_v(u, v)$   tangent to surface parameterized by $r(u, v)$

Partial Differential Equations
$f_t = f_{xx}$   heat equation
$f_{tt} - f_{xx} = 0$   wave equation
$f_x - f_t = 0$   transport equation
$f_{xx} + f_{yy} = 0$   Laplace equation
$f_t + f f_x = f_{xx}$   Burgers equation

Gradient
$\nabla f(x, y) = [f_x, f_y]^T$, $\nabla f(x, y, z) = [f_x, f_y, f_z]^T$   gradient
$D_v f = \nabla f \cdot v$   directional derivative
$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$   chain rule
$\nabla f(x_0, y_0)$ is orthogonal to the level curve $f(x, y) = c$ containing $(x_0, y_0)$
$\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x, y, z) = c$ containing $(x_0, y_0, z_0)$
$\frac{d}{dt} f(x + tv) = D_v f$   by chain rule
$(x - x_0) f_x(x_0, y_0, z_0) + (y - y_0) f_y(x_0, y_0, z_0) + (z - z_0) f_z(x_0, y_0, z_0) = 0$   tangent plane
$f(x, y)$ increases in the $\nabla f/|\nabla f|$ direction. Functions dance upwards.
$f(x, y, z) = c$ defines $z = g(x, y)$, and $g_x(x, y) = -f_x(x, y, z)/f_z(x, y, z)$   implicit differentiation

Extrema
$\nabla f(x, y) = [0, 0]^T$   critical point or stationary point
$D = f_{xx} f_{yy} - f_{xy}^2$   discriminant, useful in second derivative test
$f(x_0, y_0) \ge f(x, y)$ in a neighborhood of $(x_0, y_0)$   local maximum
$f(x_0, y_0) \le f(x, y)$ in a neighborhood of $(x_0, y_0)$   local minimum
$\nabla f(x, y) = \lambda \nabla g(x, y)$, $g(x, y) = c$, or $\nabla g = 0$   Lagrange equations
second derivative test: $\nabla f = (0, 0)$, $D > 0$, $f_{xx} < 0$ local max; $\nabla f = (0, 0)$, $D > 0$, $f_{xx} > 0$ local min; $\nabla f = (0, 0)$, $D < 0$ saddle point
$f(x_0, y_0) \ge f(x, y)$ everywhere   global maximum
$f(x_0, y_0) \le f(x, y)$ everywhere   global minimum

Double Integrals
$\iint_R f(x, y) \, dy\,dx$   double integral
$\int_a^b \int_{c(x)}^{d(x)} f(x, y) \, dy\,dx$   bottom-to-top region
$\int_c^d \int_{a(y)}^{b(y)} f(x, y) \, dx\,dy$   left-to-right region
$\iint_R f(r, \theta) \, r \, dr\,d\theta$   polar coordinates
$\iint_R |r_u \times r_v| \, du\,dv$   surface area
$\int_a^b \int_c^d f(x, y) \, dy\,dx = \int_c^d \int_a^b f(x, y) \, dx\,dy$   Fubini
$\iint_R 1 \, dx\,dy$   area of region $R$
$\iint_R f(x, y) \, dx\,dy$   signed volume of solid bounded by graph of $f$ and xy-plane

Triple Integrals
$\iiint_E f(x, y, z) \, dz\,dy\,dx$   triple integral
$\int_a^b \int_c^d \int_u^v f(x, y, z) \, dz\,dy\,dx$   integral over rectangular box
$\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z) \, dz\,dy\,dx$   type I region
$\iiint_R f(r, \theta, z) \, r \, dz\,dr\,d\theta$   integral in cylindrical coordinates
$\iiint_R f(\rho, \theta, \phi) \, \rho^2 \sin(\phi) \, d\rho\,d\phi\,d\theta$   integral in spherical coordinates
$\int_a^b \int_c^d \int_u^v f(x, y, z) \, dz\,dy\,dx = \int_u^v \int_c^d \int_a^b f(x, y, z) \, dx\,dy\,dz$   Fubini
$V = \iiint_E 1 \, dz\,dy\,dx$   volume of solid $E$
$M = \iiint_E f(x, y, z) \, dz\,dy\,dx$   mass of solid $E$ with density $f$

General advice
Draw the region when integrating in higher dimensions.
Consider other coordinate systems if the integral does not work.
Consider changing the order of integration if the integral does not work.
For tangent planes, compute the gradient $[a, b, c]^T$ first, then fix the constant.
When looking at relief problems, mind the gradient.

Theorems
$f_{xy} = f_{yx}$, Taylor, $\iint f \, dx\,dy = \iint f \, dy\,dx$, Morse theorem, chain rule, gradient theorem,
change of variables

People
Clairaut, Fubini, Lagrange, Fermat, Riemann, Archimedes, Hamilton, Euler, Taylor,
Morse, Hopf


Name:

Total:

Unit 28: Second Hourly Practice

Problems

Problem 28P.1 (10 points):

a) (4 points) You know the positive integer $n^5$ is odd. Prove that n is odd.
b) (3 points) Prove or disprove: if a and b are irrational, then ab is irrational.
c) (3 points) Prove or disprove: if a and b are irrational, then a + b is irrational.

Problem 28P.2 (10 points, each sub problem is one point):

a) What is the name of the differential equation $f_t = f_{xx}$?
b) What assumptions need to hold so that $f_{xy} = f_{yx}$ is true?
c) The gradient $\nabla f(x_0)$ has a relation to $f(x) = c$ with $c = f(x_0)$. Which one?
d) The linear approximation of f at $x_0$ is $L(x) = f(x_0) + .....$ Complete the formula.
e) Assume f has a maximum on g = c, then either $\nabla f = \lambda \nabla g$, g = c holds or ...
f) Which mathematician proved the formula for switching the order of integration?
g) True or false: the gradient vector $\nabla f(x)$ is the same as $df(x)$.
h) The equation $u_t + u u_x = u_{xx}$ is an example of a differential equation. We have
seen two major types (each a three capital letter acronym). Which type is it?
i) What is the formula for the arc length of a curve C?
j) What is the integration factor $|d\Phi|$ when going into polar coordinates?

Problem 28P.3 (10 points, 2 points for each sub-problem):

We see the level curves of a Morse function f. Only pick points A-J.
a) Which point is critical with discriminant $D = \det(d^2 f) < 0$?
b) At which point is $f_x > 0$, $f_y = 0$?
c) At which point is $f_x > 0$, $f_y > 0$?
d) Which $(x_0, y_0)$ are critical points of f when imposing the constraint
$g(x, y) = y = y_0$?
e) Which $(x_0, y_0)$ are critical points of f when imposing the constraint
$g(x, y) = x = x_0$?

[Figure: level curves of the Morse function f with the marked points A to J and the x-axis.]

Problem 28P.4 (10 points):

a) (5 points) Find the tangent plane to the surface $xyz + x^5 y + z = 11$ at (1, 2, 3).
b) (5 points) Near (x, y) = (1, 2), we can write z = g(x, y). Find $g_x(1, 2)$, $g_y(1, 2)$.

Problem 28P.5 (10 points):

a) Find the quadratic approximation of $f(x, y, z) = 1 + x + y^2 + z^3 + \sin(xyz)$ at
(0, 0, 0).
b) Estimate f(0.01, 0.03, 0.05) using linear approximation.

Problem 28P.6 (10 points):

a) (8 points) Classify the critical points of the function $f(x, y) = x^{12} + 12x^2 + y^{12} + 12y^2$
using the second derivative test.
b) (2 points) Does f have a global minimum? Does f have a global maximum?

Problem 28P.7 (10 points):

On the top of an MIT building there is a radar dome in the form of a spherical cap.
Insiders call it the “Death star” radar dome. We know that with the height h
and base radius r, we have volume and surface area given by $V = \pi r h^2 - \pi h^3/3$,
$A = 2\pi r h = \pi$. This leads to the problem to extremize
$$ f(x, y) = xy^2 - \frac{y^3}{3} $$
under the constraint
$$ g(x, y) = 2xy = 1 . $$
Find the minimum of f on this constraint using the Lagrange method!

Problem 28P.8 (10 points):

Find
$$ \iint_R \frac{5}{x^2 + y^2} \, dx\,dy , $$
where R is the region $1 \le x^2 + y^2 \le 25$, $y^2 > x^2$.

Problem 28P.9 (10 points):

Integrate f(x, y, z) = z over the solid E bounded by
z = 0, x = 0, y = 0
and
x + y + z = 1.

Problem 28P.10 (10 points):

What is the surface area of the surface
$$ r(x, y) = \begin{bmatrix} 2y \\ x \\ \frac{y^3}{3} + x \end{bmatrix} $$
with $0 \le y \le 2$ and $0 \le x \le y^3$?


Name:

Total:

Unit 28: Second Hourly

Welcome to the second hourly. Please don’t get started yet. We start all together at
9:00 AM. You can already fill out your name in the box above.
• You only need this booklet and something to write. Please stow away any other
material and electronic devices. Remember the honor code.
• Please write neatly and give details. Except for problems 28.2 and 28.3, we
want to see details, even if the answer should be obvious to you.
• Try to answer the question on the same page. There is also space on the back
of each page.
• If you finish a problem somewhere else, please indicate on the problem page so
that we find it.
• You have 75 minutes for this hourly.
Archimedes sends his good luck wishes. He unfortunately can not join us as he is “busy
proving a new theorem”. He just sent us his selfie. Oh well, these celebrities!

Problems

Problem 28.1 (10 points):

a) (4 points) Prove that if $x^3$ is irrational, then x is irrational.

b) (3 points) Prove or disprove: the product of two odd integers is odd.

c) (3 points) Prove or disprove: the sum of two odd integers is odd.

Problem 28.2 (10 points) Each question is one point:

a) What is the name of the partial differential equation $f_{tt} = f_{xx}$?

b) The series $f(x) = \sum_{k=0}^{\infty} x^k/k! = 1 + x + x^2/2! + x^3/3! + \cdots$ represents a
function. Which one?

c) The implicit differentiation formula for $f(x, y(x)) = 1$ is $y'(x) = . . . . . . .$

d) What is the name of the function $f(s) = \sum_{n=1}^{\infty} n^{-s}$?

e) On a circular island there are exactly 3 maxima and one minimum for the
height f. Assuming f is a Morse function, how many saddle points are there?

f) Which mathematician first found the value for the volume of the ball
$x^2 + y^2 + z^2 \le 1$?

g) True or False: the directional derivative of f in the direction $\nabla f(x)/|\nabla f(x)|$ is
negative at a point where $\nabla f$ is not zero.

h) The equation $f(x + t) = e^{Dt} f = f(x) + f'(x)t + f''(x)t^2/2 + \cdots$ solves a
partial differential equation. Which one?

i) What is the formula for the surface area of a surface S parametrized by
r(u, v) over a domain R?

j) What is the integration factor (= distortion factor) when going to spherical
coordinates $(\rho, \phi, \theta)$?

Problem 28.3 (10 points) Each question is two points:

We see the level curves of a Morse function f. The circle through ABC will
sometimes serve as a constraint $g(x, y) = x^2 + y^2 = 1$. In all questions, we only pick
points from A,B,C,D,E,F,G,H,I,J,K,L,M.

a) Which points are local minima of f under the constraint g(x, y) = 1.

b) Which points are local maxima of f under the constraint g(x, y) = 1.

c) At which points do we have $f_x(x, y) \cdot f_y(x, y) \ne 0$?

d) At which points are |∇f (x, y)| maximal?

e) At which points are |∇f (x, y)| minimal?

[Figure: level curves of the Morse function f with the marked points A to M, the circle through A, B, C and the x-axis.]

Problem 28.4 (10 points):

a) (5 points) Find the tangent plane to the surface
$$ f(x, y, z) = x^2 y - x^3 + y^2 + z^4 xy = -13 $$
at the point (2, −1, 1).

b) (5 points) Estimate f(2.001, −0.99, 1.1) by linear approximation.

Problem 28.5 (10 points):

a) (5 points) Find the quadratic approximation Q(x, y) of
$$ f(x, y) = 5 + x + y + x^2 + 3y^2 + \sin(xy) + e^x $$
at (x, y) = (0, 0).

b) (5 points) Estimate the value of f(0.001, 0.02) using quadratic approximation.

Problem 28.6 (10 points):

a) (8 points) Classify the critical points of the function
$$ f(x, y) = x^2 - y^3 + 2x + 3y $$
using the second derivative test.

b) (2 points) Does the function f(x, y) have a global minimum or global maximum?

Problem 28.7 (10 points):

Using the Lagrange optimization method, find the parameters (x, y) for which
the area of an arch
$$ f(x, y) = 2x^2 + 4xy + 3y^2 $$
is minimal, while the perimeter
$$ g(x, y) = 8x + 9y = 33 $$
is fixed.

Problem 28.8 (10 points):

a) (5 points) Find the moment of inertia
$$ I = \iint_G (x^2 + y^2) \, dy\,dx $$
of the quarter disc $G = \{x^2 + y^2 \le 1, x \ge 0, y \le 0\}$.

b) (5 points) Evaluate the double integral
$$ \int_1^e \int_{\log(x)}^{1} \frac{y}{e^y - 1} \, dy\,dx , $$
where log is the natural log as usual.

Problem 28.9 (10 points):

Find the integral
$$ \iiint_E f(x, y, z) \, dz\,dy\,dx $$
of the function
$$ f(x, y, z) = x + (x^2 + y^2 + z^2)^4 $$
over the solid
$$ E = \{(x, y, z) \mid 1 \le x^2 + y^2 + z^2 \le 4, z \ge 0\} . $$

Problem 28.10 (10 points):

Find the surface area of
$$ r(x, y) = \begin{bmatrix} 2x \\ y \\ \frac{x^3}{3} + y \end{bmatrix} $$
with $0 \le x \le 2$ and $0 \le y \le x^3$.


Unit 29: Line integrals

Lecture
29.1. A vector field F assigns to every point $x \in \mathbb{R}^n$ a vector $F(x) = [F_1(x), \dots, F_n(x)]^T$
such that every $F_k(x)$ is a continuous function. We think of F as a force field. Let
$t \to r(t) \in \mathbb{R}^n$ be a curve parametrized on [a, b]. The integral
$$ \int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t) \, dt $$
is called the line integral of F along C. We think of $F(r(t)) \cdot r'(t)$ as power and
$\int_C F \cdot dr$ as the work. Even though F and r are column vectors, we write in this lecture
$[F_1(x), \dots, F_n(x)]$ and $r' = [x_1', \dots, x_n']$ to avoid clutter. Mathematically, $F : \mathbb{R}^n \to \mathbb{R}^n$
can also be seen as a coordinate change. We think about it differently however and
draw a vector F(x) at every point x.

Figure 1. A line integral in the plane and a line integral in space.

29.2. If $F(x, y) = [y, x^3]$, and $r(t) = [\cos(t), \sin(t)]$ is a circle with $0 \le t \le 2\pi$, then
$F(r(t)) = [\sin(t), \cos^3(t)]$ and $r'(t) = [-\sin(t), \cos(t)]$ so that $F(r(t)) \cdot r'(t) = -\sin^2(t) + \cos^4(t)$.
The work is $\int_C F \cdot dr = \int_0^{2\pi} (-\sin^2(t) + \cos^4(t)) \, dt = -\pi/4$. Figure 1 shows the
situation. We go more against the field than with the field.
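The value $-\pi/4$ can be confirmed by evaluating the defining integral $\int_0^{2\pi} F(r(t)) \cdot r'(t) \, dt$ numerically. This is a minimal sketch with a midpoint rule; the function names and the step count are assumptions for illustration.

```python
import math

def F(x, y):
    """The vector field F(x, y) = [y, x^3] from the example."""
    return (y, x**3)

def line_integral(n=10000):
    """Midpoint rule for the integral of F(r(t)) . r'(t) over 0 <= t <= 2 pi."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)       # r(t) on the unit circle
        dx, dy = -math.sin(t), math.cos(t)    # r'(t)
        Fx, Fy = F(x, y)
        total += (Fx * dx + Fy * dy) * dt
    return total

work = line_integral()
print(work, -math.pi / 4)
```

The negative sign matches the remark that the path goes more against the field than with it.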
29.3. A vector field F is called a gradient field if $F(x) = \nabla f(x)$ for some differentiable
function f. We think of f as the potential. The first major theorem in vector
calculus is the fundamental theorem of line integrals for gradient fields in $\mathbb{R}^n$:

Theorem: $\int_a^b \nabla f(r(t)) \cdot r'(t) \, dt = f(r(b)) - f(r(a))$.

29.4. Proof: by the chain rule, ∇f (r(t)) · r0 (t) = dtd f (r(t)). The fundamental
Rb
theorem of calculus now gives a dtd f (r(t)) dt = f (r(b)) − f (r(a)). QED.
29.5. As a corollary we immediately get path independence for gradient fields $F = \nabla f$:

If $C_1, C_2$ are two curves from $A$ to $B$ then $\int_{C_1} F \cdot dr = \int_{C_2} F \cdot dr$,

as well as the closed loop property:

If $C$ is a closed curve and $F = \nabla f$, then $\int_C F \cdot dr = 0$.

29.6. Is every vector field $F$ a gradient field? Let’s look at the case $n = 2$, where $F = [P, Q]$. Now, if this is equal to $[f_x, f_y] = [P, Q]$, then $P_y = f_{xy} = f_{yx} = Q_x$. We see that $Q_x - P_y = 0$. More generally, we have the following Clairaut criterion:

Theorem: If $F = \nabla f$, then $\mathrm{curl}(F)_{ij} = \partial_{x_j} F_i - \partial_{x_i} F_j = 0$.

Proof: this is a consequence of the Clairaut theorem.

29.7. The field F = [0, x] for example satisfies Qx − Py = 1. It can not be a gradient
field. Now, if Qx − Py = 0 everywhere in the plane, how do we find the potential f ?

Integrate fx = P with respect to x and add a constant C(y).

Differentiate f with respect to y and compare fy with Q. Solve for C(y).

Figure 2. The vector field $F = \nabla f$ for $f(x, y) = y^2 + 4yx^2 + 4x^2$. We see the flow lines, curves with $r'(t) = F(r(t))$. Going with the flow increases $f$ because $F(r(t)) \cdot r'(t) = |\nabla f(r(t))|^2$ is equal to $\frac{d}{dt} f(r(t))$.
29.8. Example: find the potential of $F(x, y) = [P, Q] = [2xy^2 + 3x^2, 2x^2 y + 3y^2]$. We have $f(x, y) = \int_0^x 2xy^2 + 3x^2 \, dx + C(y) = x^3 + x^2 y^2 + C(y)$. Now $f_y(x, y) = 2x^2 y + C'(y) = 2x^2 y + 3y^2$ so that $C'(y) = 3y^2$ or $C(y) = y^3$ and $f = x^3 + x^2 y^2 + y^3$.

29.9. Here is a direct formula for the potential. Let $C_{xy}$ be the straight line path which goes from $(0, 0)$ to $(x, y)$.

Theorem: If $F$ is a gradient field then $f(x, y) = \int_{C_{xy}} F \cdot dr$.

29.10. Proof: By the fundamental theorem of line integrals, we can replace $C_{xy}$ by a path $[t, 0]$ going from $(0, 0)$ to $(x, 0)$ and then with $[x, t]$ to $(x, y)$. The line integral is $f(x, y) = \int_0^x [P, Q] \cdot [1, 0] \, dt + \int_0^y [P, Q] \cdot [0, 1] \, dt = \int_0^x P(t, 0) \, dt + \int_0^y Q(x, t) \, dt$. We see that $f_y = Q(x, y)$. If we use the path going from $(0, 0)$ to $(0, y)$ and then to $(x, y)$ instead, the line integral is $f(x, y) = \int_0^y [P, Q] \cdot [0, 1] \, dt + \int_0^x [P, Q] \cdot [1, 0] \, dt = \int_0^y Q(0, t) \, dt + \int_0^x P(t, y) \, dt$. Now, $f_x = P(x, y)$. QED.
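The two-leg formula from the proof can be tried on the field of 29.8. This is a rough sketch, assuming a midpoint rule for the one-dimensional integrals; the function names `integrate` and `potential` and the test point are our own choices for illustration.

```python
def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b] (b < a is allowed)."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

# The gradient field of 29.8.
P = lambda x, y: 2 * x * y**2 + 3 * x**2
Q = lambda x, y: 2 * x**2 * y + 3 * y**2

def potential(x, y):
    """f(x, y) = int_0^x P(t, 0) dt + int_0^y Q(x, t) dt, the first path used in the proof."""
    return integrate(lambda t: P(t, 0.0), 0.0, x) + integrate(lambda t: Q(x, t), 0.0, y)

x, y = 1.5, -0.7                       # arbitrary test point
approx = potential(x, y)
exact = x**3 + x**2 * y**2 + y**3      # the potential found in 29.8
print(approx, exact)
```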
Examples
29.11. Find $\int_C [2xy^2 + 3x^2, 2x^2 y + 3y^2] \cdot dr$ for a curve $r(t) = [t \cos(t), t \sin(t)]$ with $t \in [0, 2\pi]$. Answer: we found already $F = \nabla f$ with $f = x^3 + x^2 y^2 + y^3$. The curve starts at $A = r(0) = (0, 0)$ and ends at $B = r(2\pi) = (2\pi, 0)$. The solution is $f(B) - f(A) = 8\pi^3$.
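As a sanity check, one can integrate along the spiral directly and compare with the potential difference between the end points $r(0)$ and $r(2\pi)$. A minimal sketch, assuming a midpoint rule with an illustrative number of steps:

```python
import math

def F(x, y):
    return (2 * x * y**2 + 3 * x**2, 2 * x**2 * y + 3 * y**2)

def f(x, y):
    return x**3 + x**2 * y**2 + y**3

def r(t):
    return (t * math.cos(t), t * math.sin(t))

def r_prime(t):
    return (math.cos(t) - t * math.sin(t), math.sin(t) + t * math.cos(t))

def line_integral(a, b, n=100000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        Fx, Fy = F(*r(t))
        dx, dy = r_prime(t)
        total += (Fx * dx + Fy * dy) * h
    return total

numeric = line_integral(0.0, 2 * math.pi)
by_theorem = f(*r(2 * math.pi)) - f(*r(0.0))   # f(B) - f(A)
print(numeric, by_theorem, 8 * math.pi**3)
```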
29.12. If $F = E$ is an electric field, then the line integral $\int_a^b E(r(t)) \cdot r'(t) \, dt$ is an electric potential difference. In celestial mechanics, if $F$ is the gravitational field, then $\int_a^b F(r(t)) \cdot r'(t) \, dt$ is a gravitational potential difference. If $f(x, y, z)$ is a temperature and $r(t)$ the path of a fly in the room, then $f(r(t))$ is the temperature which the fly experiences at the point $r(t)$ at time $t$. The change of temperature for the fly is $\frac{d}{dt} f(r(t))$. The line integral of the temperature gradient $\nabla f$ along the path of the fly coincides with the temperature difference.
29.13. A device which implements a non-gradient force field is called a perpetual motion machine. It realizes a force field for which the energy gain is positive along some closed loop. The first law of thermodynamics forbids the existence of such a machine. It is informative to contemplate the ideas which people have come up with and to see why they don’t work. We will look at examples in the seminar.
29.14. Let $F(x, y) = [P, Q] = [\frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2}]$. Its potential $f(x, y) = \arctan(y/x)$ has the property that $f_x = (-y/x^2)/(1 + y^2/x^2) = P$, $f_y = (1/x)/(1 + y^2/x^2) = Q$. In the seminar you ponder the riddle that the line integral along the unit circle is not zero:
$$\int_0^{2\pi} \left[ \frac{-\sin(t)}{\cos^2(t) + \sin^2(t)}, \frac{\cos(t)}{\cos^2(t) + \sin^2(t)} \right] \cdot [-\sin(t), \cos(t)] \, dt = \int_0^{2\pi} 1 \, dt = 2\pi .$$
The vector field $F$ is called the vortex.
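The riddle can be explored numerically: a circle around the singularity gives $2\pi$, while a circle that stays away from the origin gives zero. The sketch below assumes a midpoint rule; the second circle (center $(3, 0)$, radius 1) is our own choice for illustration.

```python
import math

def vortex(x, y):
    d = x * x + y * y
    return (-y / d, x / d)

def circle_integral(cx, cy, radius, n=20000):
    """Line integral of the vortex field along a counterclockwise circle."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        Fx, Fy = vortex(x, y)
        # r'(t) = [-radius sin(t), radius cos(t)]
        total += (Fx * (-radius * math.sin(t)) + Fy * (radius * math.cos(t))) * h
    return total

around_origin = circle_integral(0.0, 0.0, 1.0)      # encloses the singularity
away_from_origin = circle_integral(3.0, 0.0, 1.0)   # does not enclose it
print(around_origin, away_from_origin)
```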

Figure 3. The vortex vector field has a singularity at (0, 0). All the
curl is concentrated at (0, 0).

Homework

Problem 29.1: Let $C$ be the space curve $r(t) = [\cos(t), \sin(t), \sin(t)]$ for $t \in [0, \pi/2]$ and let $F(x, y, z) = [y, x, 15]$. Calculate the line integral $\int_C F \cdot dr$.

Problem 29.2: What is the work done by moving in the force field $F(x, y) = [2x^3 + 1, 4\pi \sin(\pi y^4) y^3]$ along the quartic $y = x^4$ from $(-1, 1)$ to $(1, 1)$?

Problem 29.3: Let F be the vector field F (x, y) = [−y, x]/2. Compute
the line integral of F along the curve r(t) = [a cos(t), b sin(t)] with width
2a and height 2b. The result should depend on a and b.

Problem 29.4: Archimedes swims around a curve $x^{22} + y^{22} = 1$ in a hot tub, in which the water has the velocity $F(x, y) = [3x^3 + 5y, 10y^4 + 5x]$. Calculate the line integral $\int_C F \cdot dr$ when moving from $(1, 0)$ to $(-1, 0)$ along the curve.

Problem 29.5: Find a closed curve $C : r(t)$ for which the vector field $F(x, y) = [P(x, y), Q(x, y)] = [xy, x^2]$ satisfies $\int_C F(r(t)) \cdot r'(t) \, dt \ne 0$.


Unit 30: Perpetual motion machines

Seminar
30.1. Wouldn’t it be nice to have a machine which would produce energy from nothing? Humans have dreamed about this for centuries. There is no mathematical proof that such a machine cannot exist. It is an experimental fact that all isolated physical processes we know preserve energy. [1] In experiments, we see that all basic forces of nature are gradient fields. So, how come we can harvest energy from the wind force for example? Wind energy is driven by external sources, in particular the solar energy which heats up different parts of the earth’s surface. The sun’s energy comes from nuclear processes, mainly the fusion process.
30.2. It is a nice sport to come up with machines which seem to work, or to analyze a given machine which has been constructed and to find why it fails.
30.3. Our first machine is a circular pipe which is half filled with water. On the side
without water, the gravitational force pulls a wooden ball down. On the water side,
the buoyancy force pulls the ball up. Valves are in place so that the water stays in
place.
Problem A: Analyze the pipe machine. You can assume that operating the valves uses arbitrarily little energy and that, when opening one of the valves, the water stays in place.

Figure 1. A half filled pipe produces a non-conservative force field.

[1] Except for very short times, where virtual particles can appear and disappear within a short time frame.

30.4. Another class of machines uses magnets. Magnets are arranged in a circular way to produce a circular non-conservative force field in which a magnet is pushed forward.

Figure 2. Magnets arranged so that a magnet always gets pushed forward (like poles of magnets repel, opposite poles attract).

Problem B: Analyze the magnet machine. Experiment with real magnets.

30.5. And then there are mechanical machines. Here is an example with weights.

Figure 3. Mechanical perpetual motion machine: the torque on the left hand side is larger as the hammers are further out. The wheel moves counter clockwise.

Problem C: Analyze the hammer machine with line integrals, using the gravitational potential $f(x, y, z) = z$.
Figure 4. The capillary effect lifts the water level.

30.6. You all know that a sponge, a paper or a plant put into water lifts up the water
using the capillary effect. In narrow spaces, this force can beat gravity.

Problem D: Why does the “capillary lifting machine” not work?

30.7. Why are there no “perpetual motion machines”? There is no fundamental principle which forbids it. We could certainly produce a computer simulation of a world where energy conservation fails. But it is like with “time machines”. If such a machine existed in our physical world, there would be serious dangers lurking for a physicist who studies it. Benjamin Peirce refers in his book “A system of analytic mechanics” of 1855 to the “Anthropic Principle”: “Such a series of motions would receive the technical name of a ’perpetual motion’ by which is to be understood, that of a system which would constantly return to the same position, with an increase of power, unless a portion of the power were drawn off in some way and appropriated, if it were desired, to some species of work. A constitution of the fixed forces, such as that here supposed and in which a perpetual motion would be possible, may not, perhaps, be incompatible with the unbounded power of the Creator; but, if it had been introduced into nature, it would have proved destructive to human belief, in the spiritual origin of force, and the necessity of a First Cause superior to matter, and would have subjected the grand plans of Divine benevolence to the will and caprice of man”.

Problem E: Can you reformulate this “anthropic principle” in more modern terms? What could be the fate of a universe in which energy conservation does not hold in macroscopic physics?

30.8. Non-conservative fields can also be generated by optical illusion as M.C. Escher did. The illusion suggests the existence of a force field which is not conservative. Can you figure out how Escher’s pictures “work”? This is part of the homework. Here is a last possible task for the seminar:

Problem F: Find some perpetual motion machine on the web (e.g. on YouTube). If you find something interesting, share it with others. Why does it not work?

Problem G: Finally, if you have an adventurous spirit, come up with a machine which actually works, get some seed money from investors or by crowdsourcing and then start production.

Figure 5. Escher Stairs.

Homework

30.1 The force field $F(x, y) = [-y/(x^2 + y^2), x/(x^2 + y^2)]$ produces a non-conservative force. A body put near the vortex will spin around it. Check that the field is a gradient field $F = \nabla f$ with $f = \arctan(y/x)$. Draw the level curves of $f$, take a path $r(t) = [\cos(t), \sin(t)]$ and verify that $\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t) = 1$ for all $t$ so that $\int_0^{2\pi} \nabla f(r(t)) \cdot r'(t) \, dt = 2\pi$. Why does this not contradict the fundamental theorem?

30.2 If $H(x, y)$ is a function of two variables, then a curve $r(t) = [x(t), y(t)]$ satisfying $x'(t) = H_y(x, y)$ and $y'(t) = -H_x(x, y)$ is called a solution of the Hamiltonian system. The function $H(x, y)$ is called the energy of the system. Verify that $\frac{d}{dt} H(x(t), y(t)) = 0$, meaning that energy is conserved.

30.3 Design a vector field $F(x, y) = [P(x, y), Q(x, y)]$ which has the property that for any closed curve $C : r(t)$ in $\{x^2 + y^2 > 1\}$ winding once around the hole $\{x^2 + y^2 \le 1\}$, the line integral $\int_C F(r(t)) \cdot r'(t) \, dt$ is a multiple of $6\pi$. An example of a curve winding once around is $r(t) = [2\cos(t), 2\sin(t)]$ with $0 \le t \le 2\pi$.

30.4 A heat engine is a system that converts heat energy into mechanical energy. We have seen such a machine in class. How does it work?

30.5 Explain the Escher waterfall illusion.


Unit 31: Green’s theorem

Lecture
31.1. For a $C^1$ vector field $F = [P, Q]$ in a region $G \subset \mathbb{R}^2$, the curl is defined as $\mathrm{curl}(F) = Q_x - P_y$. Assume the boundary $C$ of $G$ is oriented so that the region $G$ is to the left (meaning that if $r(t) = [x(t), y(t)]$ is a parametrization, then the turned velocity $[-y'(t), x'(t)]$ cuts through $G$ close to $r(t)$). Green’s theorem assures that if $C$ is made of a finite collection of smooth curves, then

Theorem: $\iint_G \mathrm{curl}(F) \, dx \, dy = \int_C F(r(t)) \cdot dr(t)$.

31.2. Proof. It is enough to prove the theorem for $F = [0, Q]$ or $F = [P, 0]$ separately and for regions $G$ which are both “bottom to top” $G = B = \{a \le x \le b,\ c(x) \le y \le d(x)\}$ and “left to right” $G = L = \{c \le y \le d,\ a(y) \le x \le b(y)\}$. For $F = [P, 0]$, use a bottom to top integral, where the two vertical integrals along $r(t) = [b, t]$ and $r(t) = [a, t]$ are zero. The integrals along $r(t) = [t, c(t)]$ and $r(t) = [t, d(t)]$ give
$$\int_a^b P(t, c(t)) \, dt - \int_a^b P(t, d(t)) \, dt = \int_a^b \int_{c(t)}^{d(t)} -P_y(t, s) \, ds \, dt = \iint_G -P_y \, ds \, dt .$$

For $F = [0, Q]$, use a left to right integral, where the bottom and top integrals are zero and where
$$\int_c^d Q(b(t), t) \, dt - \int_c^d Q(a(t), t) \, dt = \int_c^d \int_{a(s)}^{b(s)} Q_x(t, s) \, dt \, ds = \iint_G Q_x \, dt \, ds .$$

For a general field, write $F = [0, Q] + [P, 0]$, use the first computation for $[P, 0]$ and the second computation for $[0, Q]$. For a general region, cut $G$ along a small grid so that each part is of both types. When adding the line integrals, only the boundary survives. QED.

Figure 1. To prove Green, cut the region into regions which are “bottom to top” and “left to right”. Interior cuts cancel.

31.3. To see that we can cut $G$ into regions of both types, first turn the coordinate system a tiny bit so that no horizontal or vertical line segments appear at the boundary. This is possible because we assume the boundary to consist of finitely many smooth pieces. Now also use a slightly turned grid to chop up the region into smaller parts. Now we have a situation where each piece has the form $G = \{(x, y) \mid c(x) \le y \le d(x)\} = \{(x, y) \mid a(y) \le x \le b(y)\}$, where $a, b, c, d$ are piecewise smooth functions.
31.4. Green assures:
Theorem: If F is irrotational in R2 , then F is a gradient field.

31.5. There are four properties which are equivalent if $F$ is differentiable in $\mathbb{R}^2$: A) $F$ is a gradient field, B) $F$ has the closed loop property, C) $F$ has the path independence property, and D) $F$ is irrotational. We have seen in the proof seminar that the vortex vector field $F = [-y, x]/(x^2 + y^2)$ is a counter example to a more general theorem if the field is not differentiable at some point.

Applications
31.6. Green’s theorem allows us to compute areas. If $\mathrm{curl}(F) = 1$ and $C$ is a curve enclosing a region $G$, then $\mathrm{Area}(G) = \int_C F(r(t)) \cdot r'(t) \, dt$. For example, with $F = [-y, x]/2$ and $r(t) = [a \cos(t), b \sin(t)]$, we get $\int_C F \cdot dr = \int_0^{2\pi} [-b \sin(t), a \cos(t)] \cdot [-a \sin(t), b \cos(t)]/2 \, dt = \int_0^{2\pi} ab/2 \, dt = \pi ab$, which is the area of the ellipse $x^2/a^2 + y^2/b^2 = 1$.
31.7. What is the area of the region enclosed by $r(t) = [\cos(t), \sin(t) + \cos(22t)/22]$? Take $F(x, y) = [0, x]$. The line integral is $\int_0^{2\pi} [0, \cos(t)] \cdot [-\sin(t), \cos(t) - \sin(22t)] \, dt = \pi$.
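Both area computations can be reproduced numerically with the field $F = [0, x]$, which has curl 1, so that the area is the integral of $x(t) y'(t)$. This is a small sketch, assuming a midpoint rule; the semi-axes $a = 3$, $b = 2$ and the helper name `area_by_green` are chosen only for illustration.

```python
import math

def area_by_green(r, r_prime, n=20000):
    """Approximate the line integral of [0, x] along r(t), 0 <= t <= 2*pi."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, _ = r(t)
        _, dy = r_prime(t)
        total += x * dy * h      # [0, x] . [x'(t), y'(t)] = x(t) y'(t)
    return total

a, b = 3.0, 2.0   # hypothetical semi-axes of the ellipse
ellipse_area = area_by_green(
    lambda t: (a * math.cos(t), b * math.sin(t)),
    lambda t: (-a * math.sin(t), b * math.cos(t)))

wiggly_area = area_by_green(
    lambda t: (math.cos(t), math.sin(t) + math.cos(22 * t) / 22),
    lambda t: (-math.sin(t), math.cos(t) - math.sin(22 * t)))

print(ellipse_area, math.pi * a * b)
print(wiggly_area, math.pi)
```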
31.8. The planimeter is an analogue computer which computes the area of regions. It works because of Green’s theorem. The vector $F(x, y)$ is a unit vector perpendicular to the second leg $(a, b) \to (x, y)$ if $(0, 0) \to (a, b)$ is the first leg. Given $(x, y)$ we find $(a, b)$ by intersecting two circles. The magic is that the curl of $F$ is constant 1. The following computer assisted computation proves this:

s = Solve[{(x - a)^2 + (y - b)^2 == 1, a^2 + b^2 == 1}, {a, b}];
{A, B} = First[{a, b} /. s]; F = {-(y - B), x - A}; Simplify[Curl[F, {x, y}]]

Figure 2. The planimeter is an analog computer which allows us to compute the area of a region enclosed by a curve.
Examples
31.9. Problem: Compute the line integral of $F(x, y) = [x^2 - 4y^3/3, 8xy^2 + y^5]$ along the boundary of the rectangle $[0, 1] \times [0, 2]$ oriented counter clockwise. Solution: Since $\mathrm{curl}(F) = Q_x - P_y = 8y^2 + 4y^2 = 12y^2$ we have $\int_C F \cdot dr = \int_0^1 \int_0^2 12y^2 \, dy \, dx = 32$.
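Green’s theorem can be checked here by computing both sides independently. The sketch below assumes midpoint rules for the four boundary segments and for the double integral; the helper names are ours.

```python
def F(x, y):
    return (x**2 - 4 * y**3 / 3, 8 * x * y**2 + y**5)

def segment_integral(p, q, n=2000):
    """Midpoint-rule line integral of F along the straight segment from p to q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        Fx, Fy = F(p[0] + s * dx, p[1] + s * dy)
        total += (Fx * dx + Fy * dy) / n
    return total

# Corners of [0, 1] x [0, 2] in counter clockwise order.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
boundary = sum(segment_integral(corners[i], corners[(i + 1) % 4]) for i in range(4))

def double_integral(g, x0, x1, y0, y1, n=400):
    """Midpoint-rule approximation of the integral of g over a rectangle."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    return sum(g(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

curl_integral = double_integral(lambda x, y: 12 * y**2, 0.0, 1.0, 0.0, 2.0)
print(boundary, curl_integral)   # both should be close to 32
```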
31.10. Problem: Find the line integral of the vector field
$$F(x, y) = [x + y, 3x + 3y^2]^T$$
along the boundary $C$ of the quadratic Koch island. The counter clockwise oriented $C$ encloses the island $G$ which has 289 unit squares. Solution: $\mathrm{curl}(F) = 2$, so that $\iint_G 2 \, dA = 2 \, \mathrm{Area}(G) = 578$.

Figure 3. Koch islands constructed by a Lindenmayer system, a recursive grammar. It starts with F + F + F + F and recursion F → F − F + F + F F F − F − F + F. [F = “moving forward by 1”, + = “turn by 90 degrees”, − = “turn by (−90) degrees”.]

Homework
Problem 31.1: Calculate the line integral $\int_C F \cdot dr$ with $F = [-22y + 3x^2 \sin(y) + 2222 \sin(x^6), x^3 \cos(y) + 2342 y^{22} \sin(y)]^T$ along a triangle $C$ which traverses the vertices $(0, 0)$, $(7, 0)$ and $(7, 11)$ back to $(0, 0)$ in this order.

Problem 31.2: A classical problem asks to compute the area of the region bounded by the hypocycloid
$$r(t) = [4 \cos^3(t), 4 \sin^3(t)], \quad 0 \le t \le 2\pi .$$
We cannot do that directly so easily. Guess which theorem to use, then use it!

Problem 31.3: Find $\int_C [\sin(\sqrt{1 + x^3}), 7x] \cdot dr$, where $C$ is the boundary of the region $K(n)$. You see in the picture $K(0), K(1), K(2), K(3), K(4)$. The first $K(0)$ is an equilateral triangle of side length 1. The second $K(1)$ is $K(0)$ with 3 equilateral triangles of side length 1/3 added. $K(2)$ is $K(1)$ with $3 \cdot 4^1$ equilateral triangles of side length 1/9 added. $K(3)$ is $K(2)$ with $3 \cdot 4^2$ triangles of side length 1/27 added and $K(4)$ is $K(3)$ with $3 \cdot 4^3$ triangles of side length 1/81 added. What is the line integral in the Koch Snowflake limit $K = K(\infty)$? The curve $K$ is a fractal of dimension $\log(4)/\log(3) = 1.26\dots$.

Figure 4. The first 4 approximations of the Koch curve.

Problem 31.4: Given the scalar function $f(x, y) = x^5 + xy^4$, compute the line integral of
$$F(x, y) = [5y - 3y^2, -6xy + y^4] + \nabla f$$
along the boundary of the Monster region given in the picture. There are four boundary curves, oriented as shown in the picture: a large ellipse of area 16, two circles of area 1 and 2 as well as a small ellipse (the mouth) of area 3. “Mike” from Monsters, Inc. warns you about orientations!

Problem 31.5: Let $C$ be the boundary curve of the white Yang part of the Yin-Yang symbol in the disc of radius 6. You can see in the image that the curve $C$ has three parts, and that the orientation of each part is given. Find the line integral of the vector field
$$F(x, y) = [-y + \sin(e^x), x]^T$$
along $C$. There are three separate line integrals.

Figure 5. Hypocycloid, Monster and Yin-Yang


Unit 32: Stokes theorem

Lecture
32.1. Given a $C^1$ surface $S = r(G)$ in $\mathbb{R}^3$ and a differentiable vector field $F = [P, Q, R]$, we can form the flux integral
$$\iint_S F \cdot dS = \iint_G F(r(u, v)) \cdot (r_u \times r_v) \, du \, dv .$$
For $F = [P, Q, R]$, the curl is defined as $\nabla \times F = [R_y - Q_z, P_z - R_x, Q_x - P_y]$. The Stokes theorem tells that if $C = r(I)$ is the boundary of $S = r(G)$ and $I$ is oriented so that $G$ is to the left, then

Theorem: $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$.

32.2. Proof. The key is the “important formula”

$$\mathrm{curl}(F)(r(u, v)) \cdot (r_u \times r_v) = F_u \cdot r_v - F_v \cdot r_u .$$

This is straightforward and done in class. Now define the field $\tilde{F}(u, v) = [\tilde{P}, \tilde{Q}] = [F(r(u, v)) \cdot r_u(u, v), F(r(u, v)) \cdot r_v(u, v)]$ in the $uv$-plane. The 2-dimensional curl of $\tilde{F}$ is $\tilde{Q}_u - \tilde{P}_v = F_u \cdot r_v - F_v \cdot r_u$ as we can see by using Clairaut $r_{uv} = r_{vu}$. The Stokes theorem is now a direct consequence of Green’s theorem proven last time. QED. [1]

Figure 1. The paddle wheel measures curl. The boundary $C$ has $S$ “to the left”. The pant surface illustrates a “cobordism”.

[1] Mathematicians say: “we pulled back the field from $\mathbb{R}^3$ to $\mathbb{R}^2$ along the parametrization”.

Examples
32.3. Problem: Compute the flux of $F(x, y, z) = [0, 0, 8z^2]^T$ through the upper half unit sphere $S$ oriented outwards. Solution: we parametrize the surface as $r(u, v) = [\cos(u) \sin(v), \sin(u) \sin(v), \cos(v)]^T$. Because $r_u \times r_v = -\sin(v) \, r$, this parametrization has the wrong orientation! We continue nevertheless and just change the sign at the end. We have $F(r(u, v)) = [0, 0, 8 \cos^2(v)]^T$ so that the flux integral is
$$\int_0^{2\pi} \int_0^{\pi/2} -[0, 0, 8 \cos^2(v)]^T \cdot [\cos(u) \sin^2(v), \sin(u) \sin^2(v), \cos(v) \sin(v)]^T \, dv \, du .$$
The flux integral is $\int_0^{2\pi} \int_0^{\pi/2} -8 \cos^3(v) \sin(v) \, dv \, du$ which is $2\pi \cdot 8 \cos^4(v)/4 \, |_0^{\pi/2} = -4\pi$. The flux with the outward orientation is $+4\pi$. We could not use the Stokes theorem here because we don’t deal with the flux of the curl but the flux of $F$ itself.
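A numerical evaluation of this flux integral illustrates the sign issue. The following is a sketch only, assuming a two-dimensional midpoint rule on a modest grid; the cross product is computed from the partial derivatives of the parametrization given above.

```python
import math

def F(x, y, z):
    return (0.0, 0.0, 8 * z**2)

def r(u, v):
    return (math.cos(u) * math.sin(v), math.sin(u) * math.sin(v), math.cos(v))

def ru_cross_rv(u, v):
    """Cross product of the partial derivatives r_u and r_v."""
    ru = (-math.sin(u) * math.sin(v), math.cos(u) * math.sin(v), 0.0)
    rv = (math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), -math.sin(v))
    return (ru[1] * rv[2] - ru[2] * rv[1],
            ru[2] * rv[0] - ru[0] * rv[2],
            ru[0] * rv[1] - ru[1] * rv[0])

def flux(n=200):
    """Midpoint rule for the integral of F(r(u,v)) . (r_u x r_v) over [0,2pi] x [0,pi/2]."""
    hu, hv = 2 * math.pi / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = (j + 0.5) * hv
            field = F(*r(u, v))
            normal = ru_cross_rv(u, v)
            total += sum(fc * nc for fc, nc in zip(field, normal)) * hu * hv
    return total

inward = flux()        # r_u x r_v points inwards for this parametrization
outward = -inward      # change the sign to get the outward flux
print(inward, outward, 4 * math.pi)
```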
32.4. Problem: What is the value of $\int_C F \cdot dr$ if $F = [\sin(\sin(x)) + z^2, e^y + x^3 + y^2, \sin(y^2) + z^2]$ and $C$ is the unit polygon $(0, 0, 0) \to (1, 0, 0) \to (1, 1, 0) \to (0, 1, 0) \to (0, 0, 0)$? Solution: use Stokes theorem. The curl of $F$ is $[2y \cos(y^2), 2z, 3x^2]$. The surface $S : r(u, v) = [u, v, 0]$ with $0 \le u \le 1$ and $0 \le v \le 1$ has $C$ as boundary. Stokes allows us to compute $\iint_S \mathrm{curl}(F) \cdot dS$ instead. Since $r_u \times r_v = [0, 0, 1]$, the flux integral is $\int_0^1 \int_0^1 3u^2 \, dv \, du = 1$. The computation of the line integral would have been more painful.
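While painful by hand, the line integral along the four edges is easy to approximate on a computer and should reproduce the value 1 obtained from Stokes. A minimal sketch, assuming a midpoint rule on each straight segment; the helper name `segment_integral` is ours.

```python
import math

def F(x, y, z):
    return (math.sin(math.sin(x)) + z**2,
            math.exp(y) + x**3 + y**2,
            math.sin(y**2) + z**2)

def segment_integral(p, q, n=2000):
    """Midpoint-rule line integral of F along the straight segment from p to q."""
    d = [qi - pi for pi, qi in zip(p, q)]
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        point = [pi + s * di for pi, di in zip(p, d)]
        total += sum(fc * dc for fc, dc in zip(F(*point), d)) / n
    return total

polygon = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0)]
work = sum(segment_integral(polygon[i], polygon[i + 1]) for i in range(4))
print(work)   # should be close to 1
```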
32.5. Problem: Compute the flux of the curl of $F(x, y, z) = [0, 1, 8z^2]^T$ through the upper half sphere $S$ oriented outwards. Solution: Great, it is here where we can use Stokes theorem $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$, where $C$ is the boundary curve which can be parametrized by $r(t) = [\cos(t), \sin(t), 0]^T$ with $0 \le t \le 2\pi$. Before diving into the computation of the line integral, it is good to check whether the vector field is a gradient field. Indeed, we see that $\mathrm{curl}(F) = [0, 0, 0]$. This means that $F = \nabla f$ for some potential $f$, implying by the fundamental theorem of line integrals that $\int_C F \cdot dr = 0$. But wait a minute, if the curl of $F$ is zero, couldn’t we just have seen directly that the flux of the curl through the surface is zero? Yes, we could have seen that before: for a gradient field, the flux of the curl of $F$ through a surface is always zero, for the simple reason that the curl of such a field is zero.
32.6. Problem. What is the flux of the curl of $F(x, y, z) = [\sin(xyz), z e^{\cos(x+y)}, z x^5 + z^{22}]$ through the lower ellipsoid $S$ given by $x^2/4 + y^2/9 + z^2/16 = 1, z < 0$? Solution: by Stokes theorem, it is the line integral $\int_C F \cdot dr$ along the boundary $r(t) = [2\cos(t), 3\sin(t), 0]$. But in the $xy$-plane $z = 0$, the field $F$ is zero. The result is zero.
32.7. Problem: What is the flux of the curl of $F$ through an ellipsoid $x^2/4 + y^2/9 + z^2/16 = 1$? Solution: We can cut the ellipsoid into two parts to get two surfaces with boundary. The upper part $S_+ = \{(x, y, z) \in S, z > 0\}$ has the boundary $C_+ : r(t) = [2\cos(t), 3\sin(t), 0]$ which matches the orientation of the surface. Stokes theorem tells that $\iint_{S_+} \mathrm{curl}(F) \cdot dS = \int_{C_+} F \cdot dr$. The lower part $S_- = \{(x, y, z) \in S, z < 0\}$ has the boundary $C_- : r(t) = [2\cos(t), -3\sin(t), 0]$ which matches the orientation of the lower part. Stokes theorem tells that $\iint_{S_-} \mathrm{curl}(F) \cdot dS = \int_{C_-} F \cdot dr$. Together we have $\int_{C_-} F \cdot dr + \int_{C_+} F \cdot dr = 0$ as the line integrals just have different signs. The result is zero.
Remarks
32.8. The left hand side of the important formula (it “imports” the curl) [2] is defined only in three dimensions. But the right hand side also makes sense in $\mathbb{R}^n$. It is $\mathrm{tr}((dF)^* dr)$, where $*$ rotates the 2-frame by 90 degrees. The Stokes theorem for 2-surfaces works for $\mathbb{R}^n$ if $n \ge 2$. For $n = 2$, we have with $x(u, v) = u, y(u, v) = v$ the identity $\mathrm{tr}((dF)^* dr) = Q_x - P_y$ which is Green’s theorem. Stokes has the general structure $\int_G \delta F = \int_{\delta G} F$, where $\delta F$ is a derivative of $F$ and $\delta G$ is the boundary of $G$.

Theorem: Stokes holds for fields F and 2-dimensional S in Rn for n ≥ 2.

32.9. Why are we interested in $\mathbb{R}^n$ and not only in $\mathbb{R}^3$? One example is that 2-dimensional surfaces appear as “paths” which a moving string in 11 dimensions traces. More important maybe is that statisticians work by definition in high dimensional spaces. When dealing with $n$ data points, one works in $\mathbb{R}^n$. Why would you care about theorems like Stokes in statistics? As a matter of fact, integral theorems in general allow us to simplify computations. As we have seen in Green’s theorem, when computing the sum over all the curls, there are cancellations happening in the inside. Integral theorems “see these cancellations” and allow us to bypass and ignore stuff which does not matter.
32.10. The fundamental theorem of line integrals $\int_a^b \mathrm{tr}(df(r(t)) \, dr(t)) \, dt = f(r(b)) - f(r(a))$ holds also in $\mathbb{R}^n$. The flux integral
$$\iint_G \mathrm{tr}(F^*(r(u, v)) \, dr(u, v)) \, du \, dv$$
is the analogue of a line integral in two dimensions. Written like this, we don’t need the cross product. And not yet the language of differential forms.
32.11. Stokes deals with “fields” and “space”. What happens if the field is space itself, that is if $F^* = dr$? It is of interest. For $m = 1$, and $F = dr^T$, then $\int_a^b |dr|^2 \, dt$ is the action integral in physics. A general Maupertuis principle assures that it is equivalent to the arc length $\int_a^b |dr| \, dt$ in the sense that minimizing arc length between two points is equivalent to minimizing the action integral (which is more like the energy one uses to get from the first point to the second). Now, in two dimensions we have $\iint_G \mathrm{tr}(dr^T dr) \, du \, dv$. We can compare this with $\iint_G \det(dr^T dr) \, du \, dv$ which is called the Nambu-Goto action, which resembles the surface area $\iint_G \sqrt{\det(dr^T dr)} \, du \, dv$ also called the Polyakov action. Nature likes to minimize. Free particles move on shortest paths, they minimize the arc length. Maupertuis tells that minimizing the length $\int_A^B |r'(t)| \, dt$ of a path is equivalent to minimizing $\int_A^B r'(t) \cdot r'(t) \, dt$ which essentially is the integrated kinetic energy or gasoline use to go from $A$ to $B$. For the purpose of minimizing stuff this also works for two dimensional actions. Minimizing the surface area $\iint_G |r_u \times r_v| \, du \, dv$ among all surfaces connecting two one dimensional curves is equivalent to minimizing $\iint_G |r_u \times r_v|^2 \, du \, dv$. Also in higher dimensions, Nambu-Goto and Polyakov are equivalent.

[2] I learned the “important formula” from Andrew Cotton-Clay in 2009: http://www.math.harvard.edu/archive/21a_fall_09/exhibits/stokesgreen

Homework
Problem 32.1: Use Stokes to find $\int_C F \cdot dr$, where $F(x, y, z) = [12x^2 y, 4x^3, 12xy + e^{(e^z)}]$ and $C$ is the curve of intersection of the hyperbolic paraboloid $z = y^2 - x^2$ and the cylinder $x^2 + y^2 = 1$, oriented counterclockwise as viewed from above.

Problem 32.2: Evaluate the flux integral $\iint_S \mathrm{curl}(F) \cdot dS$, where
$$F(x, y, z) = [x e^{y^2} z^3 + 2xyz e^{x^2 + z}, x + z^2 e^{x^2 + z}, y e^{x^2 + z} + z e^x]^T$$
and where $S$ is the part of the ellipsoid $x^2 + y^2/4 + (z + 1)^2 = 2, z > 0$ oriented so that the normal vector points upwards.

Problem 32.3: Find the line integral $\int_C F \cdot dr$, where $C$ is the circle of radius 3 in the $xz$-plane oriented counter clockwise when looking from the point $(0, 1, 0)$ onto the plane and where $F$ is the vector field
$$F(x, y, z) = [4x^2 z + x^5, \cos(e^y), -4xz^2 + \sin(\sin(z))]^T .$$
Use a convenient surface $S$ which has $C$ as a boundary.

Problem 32.4: Find the flux integral $\iint_S \mathrm{curl}(F) \cdot dS$, where
$$F(x, y, z) = [2\cos(\pi y) e^{2x} + z^2, x^2 \cos(z\pi/2) - \pi \sin(\pi y) e^{2x}, 2xz]^T$$
and $S$ is the surface parametrized by
$$r(s, t) = [(1 - s^{1/3}) \cos(t) - 4s^2, (1 - s^{1/3}) \sin(t), 5s]^T$$
with $0 \le t \le 2\pi$, $0 \le s \le 1$ and oriented so that the normal vectors point to the outside of the thorn.

Problem 32.5: Assume $S$ is the surface $x^{22} + y^8 + z^6 = 100$ and $F = [e^{e^{22z}}, 22x^2 yz, x - y - \sin(zx)]$. Explain why $\iint_S \mathrm{curl}(F) \cdot dS = 0$.


Unit 33: Discrete Vector Calculus

Seminar
33.1. In this seminar, we replace the space $\mathbb{R}^n$ with a finite graph $G = (V, E)$, where $V$ is a set of vertices called nodes and $E$ is a set of edges called connections. A scalar field is a function $f$ which assigns to every vertex $x$ a function value $f(x)$. We assume the vertices to be ordered, leading to an order of the edges: draw an arrow $a \to b$ if $a < b$. This a priori order has no effect on any of the theorems. A vector field assigns to every edge $e$ a number $F(e)$. A curve is a list of nodes $x_1, x_2, \dots, x_n$ such that $x_1$ is connected to $x_2$, $x_2$ is connected to $x_3$ etc. The gradient $\nabla f$ of a scalar function $f$ is the vector field $F(a, b) = f(b) - f(a)$. The line integral $\int_C F \cdot dr$ is defined as $\sum_{e \in C} F(e) \, de$. We just add up the function values of $F$ along the curve $C$, with positive sign $de = 1$ if we go with the arrow and negative sign $de = -1$ if we go against the arrow.
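These definitions translate almost literally into code. The sketch below uses a small made-up graph (it is not the graph of Figure 1); the names `f_values`, `grad` and `line_integral` are ours.

```python
# Hypothetical example graph: vertices 1..4, edges oriented a -> b with a < b.
f_values = {1: 3, 2: -1, 3: 4, 4: 2}              # a scalar field on the vertices
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]

# Discrete gradient: F(a, b) = f(b) - f(a) on every oriented edge.
grad = {(a, b): f_values[b] - f_values[a] for (a, b) in edges}

def line_integral(F, curve):
    """Add F along the curve: +F(e) when going with the arrow, -F(e) when going against it."""
    total = 0
    for a, b in zip(curve, curve[1:]):
        if (a, b) in F:
            total += F[(a, b)]
        elif (b, a) in F:
            total -= F[(b, a)]
        else:
            raise ValueError("vertices %s and %s are not connected" % (a, b))
    return total

closed_loop = line_integral(grad, [1, 2, 4, 3, 1])   # a closed curve
open_path = line_integral(grad, [1, 3, 4])           # a curve from 1 to 4
print(closed_loop, open_path, f_values[4] - f_values[1])
```

For a gradient field the closed loop sum vanishes and the open path sum only depends on the values of the scalar field at the end points.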

Problem A: Check the closed loop property of the gradient field $\nabla f$ shown in the graph of Figure 1.

Figure 1. We see a graph with 4 vertices and 5 edges. The scalar function $f$ is given by the values on the round vertices. It defines a gradient vector field $F = \nabla f$ which is a function on edges.

33.2. The discrete fundamental theorem of line integrals is:

Theorem: If $F = \nabla f$ is a gradient field and $C$ is a curve from $a$ to $b$, then $\int_C \nabla f \cdot dr = f(b) - f(a)$.

Problem B: Prove the discrete fundamental theorem of line integrals by induction on the length of the curve $C$.

33.3. Let’s look at some terminology. Given a vertex x in a graph G, the unit sphere
S(x) of x is the sub-graph generated by the set of vertices directly attached to x. The
unit sphere of the vertex labeled 11 in Figure 2 for example is the circular graph
generated by the vertices {2, 4, 9, 8, 7, 9}. It is a “circle”. The unit sphere of the vertex
with label 4 in that figure is the graph generated by the vertices {11, 7, 1}. It is a
linear graph, a half circle.

33.4. A graph is called a discrete two-dimensional region, if every unit sphere $S(x)$ is a circular graph with 4 or more vertices or a linear graph with 2 or more vertices. The set of vertices for which the unit sphere is circular forms the interior of the region. The other vertices form the boundary of the region. A two dimensional region without boundary is called closed. In Figure 2 for example, there are 4 interior points and 9 boundary points. In Figure 6, we see a closed region.

33.5. The curl of a vector field F is a function on the triangles T of G. To get the value
of the triangle (a, b, c) we form the line integral of F along the curve C : a → b → c → a.
Each triangle is assumed to be oriented (if drawn in the plane, then counter clockwise).

33.6. Given a function $F$ on the triangles of a region $G$ which is oriented, the flux integral $\iint_G F(x) \, dA$ is defined as $\sum_{t \in T} F(t)$, where $T$ is the set of triangles in $G$.

Figure 2. A gradient field on a two-dimensional region with boundary. Check that the curl is zero everywhere.

33.7. Here is the discrete Green theorem:

Theorem: If $F$ is a vector field on a 2-dimensional discrete region $G$, and the boundary $C$ is oriented in a compatible way with the region, then $\iint_G \mathrm{curl}(F) \, dA = \int_C F \cdot dr$.
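The discrete Green theorem can be tried out on the smallest interesting region, two triangles glued along an edge. This is a sketch with made-up edge values; the region, the field values and the helper names are assumptions for illustration only.

```python
# Hypothetical region: triangles (1, 2, 3) and (1, 3, 4) glued along the edge 1-3.
# A vector field is a number on every edge a -> b with a < b.
F = {(1, 2): 5, (2, 3): -2, (1, 3): 7, (3, 4): 4, (1, 4): 1}
triangles = [(1, 2, 3), (1, 3, 4)]    # both oriented the same way
boundary = [1, 2, 3, 4, 1]            # boundary curve, oriented compatibly

def line_integral(F, curve):
    """Add F along the curve, with a minus sign when going against the arrow."""
    total = 0
    for a, b in zip(curve, curve[1:]):
        total += F[(a, b)] if (a, b) in F else -F[(b, a)]
    return total

def curl(F, triangle):
    """Curl on a triangle (a, b, c): line integral along a -> b -> c -> a."""
    a, b, c = triangle
    return line_integral(F, [a, b, c, a])

total_curl = sum(curl(F, t) for t in triangles)
boundary_integral = line_integral(F, boundary)
print(total_curl, boundary_integral)
```

The interior edge 1-3 is traversed once in each direction, so its contribution cancels in the sum of the curls; this is the mechanism behind the induction proof.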

33.8. Figure 3 shows a region equipped with a vector field F .

Problem C: Write in the curl of the vector field in Figure 3.


Figure 3. A vector field on a two-dimensional discrete region.

Problem D: Prove the discrete Green theorem by induction on the number of triangles.

Homework

33.1 Check that the curl of a gradient field is zero: curl(grad(f )) = 0 for
every triangle.

33.2 Figure 4 shows a tree, a graph without closed loops. Find a potential
function f . You can assume that the value at the top node is 0. You see
then that the function value right below is 1. Get all the function values
of the potential.


Figure 4. On a tree, every vector field is a gradient field.

33.3 Find a vector field on a circular graph with 5 vertices which is not
a gradient field.

Figure 5. Fill in a vector field which is not a gradient field

33.4 Figure 6 shows a vector field on the octahedron, a two dimensional discrete sphere. Determine all the curls and check that the sum of all curls is zero.

Figure 6. On a closed discrete 2-dimensional region like an octahedron, the sum of the curls of a vector field is zero.

33.5 Construct your own 2-dimensional discrete region and define a vector
field on it, then check the Green theorem by computing the sum of the
curls and the line integral along the boundary.


Unit 34: Stokes Applications

Topology
34.1. A region $E$ in $\mathbb{R}^n$ is called simply connected if it is connected and for every closed loop $C$ in $E$ there is a continuous deformation $C_s$ of $C$ within $E$ such that $C_0 = C$ and $C_1(t) = P$ is a point. For example, $C(t) = [\cos(t), \sin(t), 0]$ can be deformed in $E = \mathbb{R}^3$ to a point with $C_s(t) = [(1 - s) \cos(t), (1 - s) \sin(t), 0]$ as $C_1(t) = P = [0, 0, 0]$ for all $t$. Each Euclidean space $\mathbb{R}^n$ is simply connected. The region $G = \{x^2 + y^2 > 0\} \subset \mathbb{R}^3$ is not simply connected as the circle $C : r(t) = [\cos(t), \sin(t), 0]$ winding around the $z$-axis cannot be pulled together to a point within $G$. The region $G = \{x^2 + y^2 + z^2 > 0\} \subset \mathbb{R}^3$ is simply connected, but $G = \{x^2 + y^2 > 0\}$ in $\mathbb{R}^2$ is not. Remember that $F$ was called irrotational if $\mathrm{curl}(F) = 0$ everywhere.
Theorem: If $F$ is irrotational on a simply connected $E$ then $F = \nabla f$ in $E$.

34.2. Proof: since $E$ is simply connected and $\mathrm{curl}(F) = 0$, every closed loop $C$ can be
filled in by a surface $S = \bigcup_{0 \le s \le 1} C_s$ which has the boundary $C$. Stokes theorem gives
$\int_C F \cdot dr = \iint_S \mathrm{curl}(F) \cdot dS = 0$. The closed loop property implies path independence.
A potential $f$ can be obtained by fixing a base point $p$ in $E$ and choosing for any other
point $x$ a path $C_{px}$ going from $p$ to $x$. The potential function $f$ is then defined as
$f(x) = \int_{C_{px}} F \cdot dr$. QED

34.3. The field $F(x, y, z) = [-y/(x^2 + y^2), x/(x^2 + y^2), 0]$ is defined everywhere except
on the $z$-axis. The domain $E$ where $F$ is defined is not simply connected. There is no
global function $f$ which is a potential for $F$.
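
One way to see that no global potential can exist is to compute the circulation of $F$ along the unit circle: for a gradient field it would have to vanish. The following Python sketch is not part of the original notes, and the function names are illustrative. It approximates that line integral with a midpoint rule and compares it with $2\pi$.

```python
import math

def F(x, y, z):
    """The field of 34.3; it is undefined on the z-axis."""
    r2 = x * x + y * y
    return (-y / r2, x / r2, 0.0)

def circulation_unit_circle(n=1000):
    """Midpoint-rule approximation of the line integral of F along
    r(t) = [cos t, sin t, 0], 0 <= t <= 2 pi."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        position = (math.cos(t), math.sin(t), 0.0)
        velocity = (-math.sin(t), math.cos(t), 0.0)  # r'(t)
        field = F(*position)
        total += sum(f * v for f, v in zip(field, velocity)) * h
    return total

circ = circulation_unit_circle()
print(circ, 2 * math.pi)  # the two numbers agree up to rounding
```

Since the circulation along this closed loop is not zero, the closed loop property fails and $F$ cannot be a gradient field on all of $E$.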
34.4. The notion of simple connectedness is important in topology. The first solved
Millennium problem, the Poincaré conjecture, is now a theorem. It states that a
compact 3-dimensional manifold which is simply connected is topologically equivalent to the
3-sphere $\{x^2 + y^2 + z^2 + w^2 = 1\} \subset \mathbb{R}^4$. In two dimensions, the result was known
for a long time already, because the structure of 2-dimensional connected manifolds is
known.

Electromagnetism
34.5. The Maxwell-Faraday equation in electromagnetism relates the electric
field $E$ and the magnetic field $B$ with the partial differential equation
$\mathrm{curl}(E) = -\frac{d}{dt} B$. Given a surface $S$, the flux integral $\iint_S B \cdot dS$ is called the
magnetic flux of $B$ through the surface. If we integrate the Maxwell-Faraday equation, we see that
$\iint_S \mathrm{curl}(E) \cdot dS$ is equal to minus the rate of change of the magnetic flux,
$-\frac{d}{dt} \iint_S B \cdot dS$. Stokes theorem now assures that
$\iint_S \mathrm{curl}(E) \cdot dS = \int_C E \cdot dr$ is the line integral of the
electric field along the boundary. But this is electric potential or voltage. We see:
We can generate an electric potential by changing the magnetic flux.

34.6. Changing the magnetic flux can happen in various ways. We can generate a
changing magnetic field by using alternating current. This is how transformers
work. Another way to change the flux is to rotate a wire in a fixed magnetic field.
This is the principle of the dynamo:

Figure 1. The dynamo, implemented using the ray tracer Povray.

Electric current is generated by moving a wire in a fixed magnetic field.

34.7. The vector field $A(x, y, z) = \frac{[-y, x, 0]}{(x^2 + y^2 + z^2)^{3/2}}$ is called the vector potential of a
magnetic field $B = \mathrm{curl}(A)$. The picture shows some flow lines of this magnetic dipole
field $B$. Problem: Find the flux of $B$ through the lower half sphere $x^2 + y^2 + z^2 = 1$,
$z \le 0$ oriented downwards. Solution: Since we have an integral of the curl of the
vector field $A$, we use Stokes theorem and integrate $A(r(t))$ along the boundary
curve $r(t) = [\cos(t), -\sin(t), 0]$. First of all, we have $A(r(t)) = [\sin(t), \cos(t), 0]$. The
velocity is $r'(t) = [-\sin(t), -\cos(t), 0]$, so that $A(r(t)) \cdot r'(t) = -1$. The integral is
$\int_0^{2\pi} -1 \, dt = -2\pi$.
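
The result can be cross-checked numerically by computing both sides of Stokes theorem. The sketch below is an illustration added here, not the method of the notes: it approximates $\mathrm{curl}(A)$ by central differences (the step size and grid sizes are arbitrary choices) and integrates over the lower half sphere in spherical coordinates, where the outward unit normal on the unit sphere is the position vector itself and therefore points downwards for $z < 0$.

```python
import math

def A(x, y, z):
    """Vector potential of 34.7."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-y / r3, x / r3, 0.0)

def curl(field, p, h=1e-5):
    """Central-difference approximation of curl(field) at the point p."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (field(*plus)[i] - field(*minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def line_integral(n=2000):
    """Integral of A along r(t) = [cos t, -sin t, 0], 0 <= t <= 2 pi."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        a = A(math.cos(t), -math.sin(t), 0.0)
        v = (-math.sin(t), -math.cos(t), 0.0)  # r'(t)
        total += sum(ai * vi for ai, vi in zip(a, v)) * h
    return total

def flux_lower_hemisphere(n=100):
    """Flux of curl(A) through the lower unit half sphere, normal pointing away from the origin."""
    dphi, dtheta = (math.pi / 2) / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = math.pi / 2 + (i + 0.5) * dphi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            p = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
            B = curl(A, p)
            # on the unit sphere: unit normal = p, dS = sin(phi) dphi dtheta
            total += sum(b * c for b, c in zip(B, p)) * math.sin(phi) * dphi * dtheta
    return total

line = line_integral()
flux = flux_lower_hemisphere()
print(line, flux, -2 * math.pi)
```

Both numbers should be close to $-2\pi$; the flux value carries a small discretization error from the midpoint rule.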

Figure 2. The flux of the magnetic field $B$ through a surface can be
computed with Stokes by computing a line integral of the vector potential $A$.

34.8. Here are all the four magical Maxwell equations for the electric field E and
magnetic field B related to the charge density σ and the electric current j. The
constant c is the speed of light. (By using suitable coordinates, one can assume c = 1.)
$\mathrm{div}(E) = 4\pi\sigma$, $\mathrm{div}(B) = 0$, $c \cdot \mathrm{curl}(E) = -B_t$, $c \cdot \mathrm{curl}(B) = E_t + 4\pi j$.
Fluid dynamics
34.9. If $F$ is the fluid velocity field and $C$ is a closed curve, then $\int_C F \cdot dr$ is called
the circulation of $F$ along $C$. The curl of $F$ is called the vorticity of $F$. A vortex
line is a flow line of $\mathrm{curl}(F)$. Given a curve $C$, we can let any point in $C$ flow along
the vorticity field. This produces a vortex tube $S$. The flux of the vorticity through
a surface $S$ is the vortex strength of $F$ through $S$. Stokes theorem implies the
Helmholtz theorem.
Theorem: If $C_s$ flows along the vorticity field $\mathrm{curl}(F)$, then $\int_{C_s} F \cdot dr$ stays constant.

34.10. Proof: Let $C$ be a closed curve and $C_s(t)$ be the curve after letting it flow using
a deformation parameter $s$. The deformation produces a tube surface $S = \bigcup_{s=0}^{t} C_s$
which has the boundary $C$ and $C_t$. Since the curl of $F$ is always tangent to the
surface $S$, the flux of the curl of $F$ through $S$ is zero. Stokes theorem implies that
$\int_C F \cdot dr - \int_{C_t} F \cdot dr = 0$. The negative sign is because the orientation of $C_t$ is different
from the orientation of $C$ if the surface has to be to the left.

Figure 3. Helmholtz theorem assures that the circulation along a flux
tube is constant. This is a direct application of Stokes theorem: because
the curl of $F$ is tangent to the tube, there is no flux through the tube.

Complex analysis
34.11. An application of Green’s theorem is obtained when integrating in the complex
plane $\mathbb{C}$. Given a function $f(z) = u(z) + i v(z)$ from $\mathbb{C} \to \mathbb{C}$ and a closed path $C$
parametrized by $r(t) = x(t) + i y(t)$ in $\mathbb{C}$, define the complex integral
$\int_a^b (u(x(t) + iy(t)) + i v(x(t) + iy(t)))(x'(t) + i y'(t)) \, dt$. This is
$\int_a^b u(r(t)) x'(t) - v(r(t)) y'(t) \, dt + i \int_a^b v(r(t)) x'(t) + u(r(t)) y'(t) \, dt$.
These are two line integrals. The real part is the line integral of $F = [u, -v]$, the
imaginary part is the line integral of $F = [v, u]$. Assume $C$ bounds a region $G$, then Green’s
theorem tells that the first integral is $\iint_G -v_x - u_y \, dxdy$ and the second integral is
$\iint_G u_x - v_y \, dxdy$. It turns out now that for nice functions $f$ like polynomials, the
Cauchy-Riemann differential equations $u_x = v_y$, $v_x = -u_y$ hold so that these line
integrals are zero. We have therefore
Theorem: If $f$ is a polynomial and $C$ a closed loop, then $\int_C f(z) \, dz = 0$.
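
The theorem can be tested numerically with Python's built-in complex numbers. In this sketch, which is an added illustration, the loop is the unit circle $r(t) = e^{it}$ and the polynomial is an arbitrary choice. For contrast it also integrates $1/z$, which is not a polynomial and is not defined at the origin inside the loop.

```python
import cmath
import math

def contour_integral(f, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz over the unit circle."""
    h = 2 * math.pi / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h
        z = cmath.exp(1j * t)   # r(t)
        dz = 1j * z             # r'(t)
        total += f(z) * dz * h
    return total

def poly(z):
    return z**3 + 2 * z + 5

poly_value = contour_integral(poly)
inverse_value = contour_integral(lambda z: 1 / z)
print(abs(poly_value))   # close to 0
print(inverse_value)     # close to 2*pi*i
```

The nonzero value for $1/z$ does not contradict the theorem: the region bounded by the circle contains a point where the function is not defined, so Green's theorem does not apply there.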

Homework: thanksgiving quickies

Problem 34.1: We can measure how many magnetic monopoles
there are in the interior of a closed surface $S$ by computing $\iint_S B \cdot dS$.
We see that $B = \mathrm{curl}(A)$ for a magnetic potential $A$, which is a vector
field. What is $\iint_S B \cdot dS$? (We will see in the next lecture why this tells
about the amount of magnetic monopoles inside $S$.)

Problem 34.2:
a) Define $\mathrm{div}([P, Q, R]) = P_x + Q_y + R_z$. Check that $\mathrm{div}(\mathrm{curl}(F)) = 0$.
b) Is $\mathrm{div}(\mathrm{grad}(f)) = 0$ for all functions?
c) Is $\mathrm{curl}(\mathrm{curl}(F)) = [0, 0, 0]$ for all fields?
d) Which of the regions in Figure 4 are simply connected?
e) Which of the capital letters A-Z are not simply connected?

Figure 4. Complement $B \setminus T$ of the solid torus $T$ in a ball $B$, the solid
$\{1 < x^2 + y^2 + z^2 < 4\}$ or the complement of two small balls in a larger
ball.

Problem 34.3: Let $S$ be the torus $r(u, v) = [(3 + \cos(u)) \cos(v), (3 + \cos(u)) \sin(v), \sin(u)]$
and $F$ the vector field $F(x, y, z) = [-y, x, 0]$. What
is the flux of $F$ through $S$? (No computation and no Stokes theorem is
needed).

Problem 34.4: Assume $F$ is a vector field which is everywhere perpendicular
to a surface $S$, pointing in the normal direction of $S$, and $|F(x, y, z)| = 1$.
What is $\iint_S F \cdot dS$?

Problem 34.5: a) Can you find a vector field $F$ with $\mathrm{curl}(F) = [0, x^2, 0]$?
b) Can you find a vector field $F$ with $\mathrm{curl}(F) = [0, 0, x^2]$?
c) Can you find a vector field $F = [P, Q, R]$ such that $\mathrm{div}(F) = x^2$?
d) Can you find a gradient field $F = \nabla(f)$ such that $\mathrm{div}(F) = x^2$?
e) Given a function $g(x, y, z)$, find $F$ such that $\mathrm{div}(F) = g$.


Unit 35: Gauss theorem

Lecture
35.1. The divergence of a vector field $F = [P, Q, R]$ in $\mathbb{R}^3$ is defined as
$\mathrm{div}(F) = \nabla \cdot F = P_x + Q_y + R_z$. Let $G$ be a solid in $\mathbb{R}^3$ bounded by a surface $S$ made of
finitely many smooth surfaces, oriented so the normal vector to $S$ points outwards. The divergence
theorem or Gauss theorem is
Theorem: $\iiint_G \mathrm{div}(F) \, dV = \iint_S F \cdot dS$.

Figure 1. The boundary of a solid is oriented outwards. The divergence
measures the expansion of a box flowing in the field. The flux of
$\mathrm{curl}(F)$ through a closed surface is 0. No field is created inside.

35.2. Proof. If $G$ is a solid of the form $G = \{(x, y, z) \mid (x, y) \in U, g(x, y) \le z \le h(x, y)\}$
and $F = [0, 0, R]$, then $\iiint_G \mathrm{div}(F) \, dV = \iint_U \int_{g(x,y)}^{h(x,y)} R_z \, dz\,dy\,dx$ which is
$\iint_U R(x, y, h(x, y)) - R(x, y, g(x, y)) \, dy\,dx$. The flux of $F = [0, 0, R]$ through the top surface
$r(u, v) = [u, v, h(u, v)]$ is
$$\iint_U [0, 0, R(u, v, h(u, v))] \cdot [-h_u, -h_v, 1] \, dv\,du = \iint_U R(x, y, h(x, y)) \, dx\,dy .$$
Similarly, the flux through the bottom surface is $-\iint_U R(x, y, g(x, y)) \, dx\,dy$. The flux through
the vertical side walls is zero because $F = [0, 0, R]$ is tangent to them. In general,
write $F = [P, Q, R] = [P, 0, 0] + [0, Q, 0] + [0, 0, R]$ to get the claim for solids which are
simultaneously bounded by graphs of functions in $x$ and $y$, or $y$ and $z$, or $x$ and $z$. A
general solid can be cut into such solids.

35.3. The theorem gives meaning to the term divergence. The total divergence over a
small region is equal to the flux of the field through the boundary. If this is positive,
then more field leaves than enters and field is “generated” inside. The divergence
measures the expansion of the field. The field F (x, y, z) = [x, 0, 0] for example expands,
while F (x, y, z) = [−x, 0, 0] compresses. F (x, y, z) = [y, z, x] is “incompressible”.

35.4. The divergence theorem holds in any dimension $m$. If $F = [F_1, \dots, F_m]$ is the
vector field, then $\partial_{x_1} F_1 + \cdots + \partial_{x_m} F_m$ is defined as the divergence of $F$. If $G$ is an
$m$-dimensional region with boundary $S = s(G)$, then the flux of $F$ through $S$ is defined
as $\int_G F(s(u)) \cdot n(s(u)) |ds(u)|$, where $n(s(u))$ is a unit normal vector. This can be
explained a bit better using the language of differential forms which is introduced next
time.

35.5. The divergence of $F = [P, Q]$ is defined as $P_x + Q_y$. If $F^\perp = [Q, -P]$ is the
turned vector field, then $\mathrm{div}(F^\perp) = Q_x - P_y$ is the curl of $F$. Green’s theorem tells
that $\iint_G \mathrm{curl}(F) \, dxdy$, which is $\iint_G \mathrm{div}(F^\perp) \, dxdy$, is the line integral $\int_C F \cdot dr$. The line
integral for $F$ is the flux integral for $F^\perp$. The two dimensional divergence theorem is
Green’s theorem “turned”.

Examples
35.6. Problem: Compute the flux of $F = [x, y, z]$ through the sphere of radius $\rho$
bounding a ball $G$, oriented outwards. Solution: As $\mathrm{div}(F) = 3$ we have
$\iiint_G \mathrm{div}(F) \, dV = 3 \mathrm{Vol}(G) = 3 \cdot 4\pi\rho^3/3 = 4\pi\rho^3$. The flux through the boundary is
$\iint_S F \cdot dS$. As in spherical coordinates, $F(r(\phi, \theta)) \cdot (r_\phi \times r_\theta) = \rho^3 \sin(\phi)$, the flux is
$\int_0^{2\pi} \int_0^{\pi} \rho^3 \sin(\phi) \, d\phi\,d\theta = 4\pi\rho^3$ also.

35.7. Problem: What is the flux of the vector field $F(x, y, z) = [6x + y^3, 3z^2 + 8y, 22z + \sin(x)]$
through the boundary of the solid $G = [0, 3] \times [0, 3] \times [0, 3] \setminus ([0, 3] \times [1, 2] \times [1, 2] \cup
[1, 2] \times [0, 3] \times [1, 2] \cup [1, 2] \times [1, 2] \times [0, 3])$, which is a cube with three perpendicular
cubic holes and which is the first stage of the Menger sponge construction? Solution:
As $\mathrm{div}(F) = 22 + 8 + 6 = 36$, the result is 36 times the volume of the solid which is
$36(27 - 7) = 720$.
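
The volume count $27 - 7 = 20$ can be confirmed by enumerating the 27 unit cubes of the $3 \times 3 \times 3$ grid. In the sketch below, the helper name `in_hole` is made up for this illustration; a unit cube lies in one of the three perpendicular tunnels exactly if at least two of its three grid indices are the middle index 1.

```python
def in_hole(i, j, k):
    """True if the unit cube [i,i+1] x [j,j+1] x [k,k+1] belongs to one of the three tunnels."""
    return sum(1 for c in (i, j, k) if c == 1) >= 2

# Unit cubes of the 3 x 3 x 3 grid which remain after drilling the three tunnels.
cubes = [(i, j, k) for i in range(3) for j in range(3) for k in range(3)
         if not in_hole(i, j, k)]

volume = len(cubes)        # each remaining cube has volume 1
divergence = 6 + 8 + 22    # div(F) = P_x + Q_y + R_z is constant
flux = divergence * volume
print(volume, flux)        # 20 720
```

The seven removed cubes are the six face centers and the central cube, which the three tunnels share.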

Figure 2. The gravity inside the moon is such that an elevator crossing
the moon oscillates like a harmonic oscillator. The flux of F = [0, 0, z]
through a surface is the volume inside.
35.8. Problem. What does the gravitational field look like inside the moon at
distance $\rho$ from the center? Solution. A direct computation of summing up all the field
values $F(x) = \iiint_G (x - y)/|x - y|^3 \, dy$ is difficult as we can not compute in spherical
coordinates. Fortunately we have the divergence theorem. The field $F(x)$ has constant
length $F(\rho) = |F(x)|$ for $x$ on a sphere $S(\rho)$ of radius $\rho$ and points inwards, so
that the flux $\iint_{S(\rho)} F \cdot dS$ has absolute value $4\pi\rho^2 F(\rho)$. Gauss was able to write down the
gravitational field as a partial differential equation $\mathrm{div}(F(x)) = 4\pi\sigma(x)$, where $\sigma(x)$ is the
mass density of the solid. (The sign depends on whether $F$ is taken to be the attracting field;
only magnitudes are compared here.) We see then with the divergence theorem that
$\iiint_{B(\rho)} 4\pi\sigma(x) \, dx$ is equal to $4\pi\rho^2 F(\rho)$. Assuming $\sigma$ to be constant, we have
$4\pi(4\pi\rho^3/3)\sigma = 4\pi\rho^2 F(\rho)$, which gives $F(\rho) = (4\pi\sigma/3)\rho$. The field grows linearly
inside the body. If $\rho$ is bigger than the radius of the moon, then $\iiint_{B(\rho)} 4\pi\sigma(x) \, dx$ is $4\pi M$,
where $M = \iiint_G \sigma(x) \, dx$ is the mass of the moon. We see that in that case $F(\rho) = M/\rho^2$,
which is the Newton law.
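
The two regimes can be collected in one small function. This is a sketch under the assumptions of the text (constant density, units in which the gravitational constant is 1); the function and parameter names are illustrative. Inside the body it uses the constant $4\pi\sigma/3$ obtained from comparing $4\pi(4\pi\rho^3/3)\sigma$ with $4\pi\rho^2 F(\rho)$, and outside it uses $M/\rho^2$.

```python
import math

def field_strength(rho, radius, sigma):
    """Magnitude of the gravitational field at distance rho from the center of a
    ball with the given radius and constant mass density sigma (units with G = 1)."""
    if rho <= radius:
        return 4 * math.pi * sigma * rho / 3       # grows linearly inside
    mass = sigma * 4 * math.pi * radius**3 / 3     # total mass M of the ball
    return mass / rho**2                           # Newton's inverse square law outside

radius, sigma = 2.0, 0.5   # made-up sample values
inside = field_strength(1.0, radius, sigma)
surface = field_strength(radius, radius, sigma)
just_outside = field_strength(radius * (1 + 1e-9), radius, sigma)
print(inside, surface, just_outside)
```

The two formulas agree at the surface $\rho = $ radius, so the field strength is continuous there.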
35.9. Problem: Compute using the divergence theorem the flux of the vector field
$F(x, y, z) = [2342434y, 2xy, 4yz + 21341324xy]^T$ through the unit cube $[0, 1] \times [0, 1] \times [0, 1]$
which is opened on the top. Solution: the divergence of $F$ is $2x + 4y$. Integrating
this over the unit cube gives $1 + 2 = 3$. The flux through all 6 faces is 3. On the face $z = 1$,
the third component of $F$ is $4y + 21341324xy$, so that the flux through this face is
$\int_0^1 \int_0^1 4y + 21341324xy \, dxdy = 2 + 5335331 = 5335333$. We have to subtract this and get
$3 - 5335333 = -5335330$.
35.10. Similarly as Green’s theorem allowed area computation using line integrals, the
volume of a region can be computed as a flux integral: take a vector field $F$ with
constant divergence 1 like $F(x, y, z) = [0, 0, z]$. We have $\iint_S [0, 0, z] \cdot dS = \mathrm{Vol}(G)$.

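35.11. Example: For an ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$, where the parametrization is
$r(\phi, \theta) = [a \sin(\phi) \cos(\theta), b \sin(\phi) \sin(\theta), c \cos(\phi)]$, the third component of
$r_\phi \times r_\theta$ is $ab \sin(\phi) \cos(\phi)$, so that $[0, 0, c \cos(\phi)] \cdot (r_\phi \times r_\theta) = abc \sin(\phi) \cos^2(\phi)$.
Integrating over $0 \le \phi \le \pi$ and $0 \le \theta \le 2\pi$ leads to $2\pi abc \cdot 2/3 = 4\pi abc/3$.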
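
The last step uses $\int_0^\pi \sin(\phi)\cos^2(\phi) \, d\phi = 2/3$. A short numerical check, added here as an illustration with made-up semi-axes, integrates $abc \sin(\phi)\cos^2(\phi)$ with a midpoint rule and compares with $4\pi abc/3$.

```python
import math

def ellipsoid_volume(a, b, c, n=2000):
    """Flux of [0, 0, z] through the ellipsoid: integrate a*b*c*sin(phi)*cos(phi)^2
    over 0 <= phi <= pi (midpoint rule) and 0 <= theta <= 2 pi (exactly)."""
    h = math.pi / n
    phi_integral = sum(
        math.sin((k + 0.5) * h) * math.cos((k + 0.5) * h) ** 2 for k in range(n)
    ) * h
    return 2 * math.pi * a * b * c * phi_integral

numeric = ellipsoid_volume(1.0, 2.0, 3.0)
exact = 4 * math.pi * 1.0 * 2.0 * 3.0 / 3
print(numeric, exact)
```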
35.12. A computer can determine the volume of a solid enclosed by a triangulated
surface by computing the flux of the vector field $F = [0, 0, z]$ through the surface.
The vector field has divergence 1 so that by the divergence theorem, the flux gives
the volume. A computer stores a geometric object using triangles. Assume $ABC$ is
such a triangle. If $n = AB \times AC$ points outside the region, then the flux through the triangle
is $F \cdot n/2$, where $F$ is evaluated at the center of mass of the triangle. A
computer can now add up all these values and get the volume.
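
The procedure can be sketched in a few lines of Python. The mesh below (a tetrahedron with hand-oriented faces) and the helper names are illustrative choices made for this example. Because $F = [0, 0, z]$ is linear, evaluating it at the centroid of a flat triangle gives the exact flux through that triangle.

```python
def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def volume(triangles):
    """Add up the flux of F = [0, 0, z] through triangles (A, B, C) whose normal
    AB x AC points out of the solid."""
    total = 0.0
    for A, B, C in triangles:
        n = cross(sub(B, A), sub(C, A))          # |n| is twice the triangle area
        z_centroid = (A[2] + B[2] + C[2]) / 3    # F at the centroid is [0, 0, z_centroid]
        total += z_centroid * n[2] / 2           # F . n / 2
    return total

# Tetrahedron with vertices O, X, Y, Z; every face is listed so that AB x AC points outwards.
O, X, Y, Z = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tetrahedron = [(O, Y, X), (O, X, Z), (O, Z, Y), (X, Y, Z)]
vol = volume(tetrahedron)
print(vol)  # the tetrahedron has volume 1/6
```

Only the slanted face contributes here: the bottom face lies in $z = 0$, and the two vertical faces have normals with vanishing third component.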

Figure 3. A cow, a Klein bottle and a car from the Mathematica
example files. All of them produce closed surfaces. The Klein bottle does not
have an interior, however.

Homework

Problem 35.1: Use the divergence theorem to calculate the flux of
$F(x, y, z) = [x^3, y^3, z^3]^T$ through the sphere $S : x^2 + y^2 + z^2 = 1$, where
the sphere is oriented so that the normal vector points outwards.

Problem 35.2: Assume the vector field
$F(x, y, z) = [5x^3 + 12xy^2, y^3 + e^y \sin(z), 5z^3 + e^y \cos(z)]^T$
is the magnetic field of the sun whose surface is a sphere of radius 3
oriented with the outward orientation. Compute the magnetic flux $\iint_S F \cdot dS$.

Problem 35.3: Find the flux of the vector field $F(x, y, z) = [xy, yz, zx]^T$
through the solid cylinder $x^2 + y^2 \le 1$, $0 \le z \le 2$.

Problem 35.4: Find the flux of $F(x, y, z) = [x + y + z, x + z, z + y]^T$
through the Menger sponge $M_n$ defined in the unit cube and take the limit
$n \to \infty$.

Figure 4. Approximations to the Menger sponge.

Problem 35.5: Compute the flux of the vector field $F(x, y, z, w) =
[x + 2y^2, 3x + 4y^5, 6z + 8z^9, 7w + 9x^{10}]^T$ through the 3-sphere
$x^2 + y^2 + z^2 + w^2 = 1$ in $\mathbb{R}^4$, oriented outwards.


Unit 36: General Stokes

Lecture
36.1. We know already $E = \mathbb{R}^n = M(n, 1)$, the space of column vectors and its dual
$E^* = M(1, n)$, the space of row vectors. To get more general objects, it is important
to think about vectors as maps. A row vector is a linear map $F : E \to \mathbb{R}$ defined by
$F(u) = Fu$ and a column vector defines a linear map $F : E^* \to \mathbb{R}$ by $F(u) = uF$.
A map $F(x_1, \dots, x_n)$ of several variables is called multi-linear, if it is linear in each
coordinate. The set $T^p_q(E)$ of all multi-linear maps $F : (E^*)^p \times E^q \to \mathbb{R}$ is the space
of tensors of type $(p, q)$. We have $T^1_0(E) = E$ and $T^0_1(E) = E^*$ and $T^1_1(E)$ is $M(n, n)$,
the space of $n \times n$ matrices: given a matrix $A$, a column vector $v \in E$ and a row vector
$w \in E^*$, we have the bi-linear map $F(v, w) = wAv$. It is linear in $v$ and in $w$.

36.2. Let $\Lambda^q(E)$ be the subspace of $T^0_q(E)$ which consists of tensors $F$ of type $(0, q)$ such
that $F(x_1, \dots, x_q)$ is anti-symmetric in $x_1, \dots, x_q \in E$: this means
$F(x_{\sigma(1)}, \dots, x_{\sigma(q)}) = (-1)^\sigma F(x_1, \dots, x_q)$ for all permutations $\sigma$ of
$\{1, \dots, q\}$, where $(-1)^\sigma$ is the sign of the permutation $\sigma$. The Binomial coefficient
$B(n, q) = n!/(q!(n - q)!)$ counts the number of subsets with $q$ elements $i_1 < \cdots < i_q$ of
$\{1, \dots, n\}$. If $E$ has dimension $n$, then $\Lambda^q(E)$ has dimension $B(n, q)$. A map
$F : E \to T^p_q(E)$ is called a $(p, q)$-tensor field. A field with values in $T^1_0(E) = E$ is a vector
field. If $g : \mathbb{R}^m \to \mathbb{R}^n$ is a smooth map, then $F = d^k g$ is a tensor field of type $(0, k)$.
A $k$-form is a $(0, k)$-tensor field $F$ with $F(x) \in \Lambda^k(E)$. A 2-form in $\mathbb{R}^3$ for example
attaches to $x \in \mathbb{R}^3$ a bi-linear, anti-symmetric map $F(x)(u, v) = -F(x)(v, u)$. One writes
$P dydz + Q dxdz + R dxdy$ where $dydz(u, v) = u_2 v_3 - u_3 v_2$, $dxdz(u, v) = u_1 v_3 - u_3 v_1$,
$dxdy(u, v) = u_1 v_2 - u_2 v_1$.
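
To make the definitions concrete, here is a small Python sketch, added as an illustration with made-up helper names and sample vectors. It lists the dimensions $B(3, q)$ of the spaces of $q$-forms for $n = 3$ and implements the three basic 2-forms as anti-symmetric bi-linear maps.

```python
from itertools import combinations
from math import comb

n = 3
dims = [comb(n, q) for q in range(n + 1)]        # dimension of Lambda^q for q = 0, ..., n
index_sets = list(combinations(range(n), 2))     # one basis 2-form per subset i < j

def basis_2form(i, j):
    """The 2-form dx_i dx_j as the bi-linear map (u, v) -> u_i v_j - u_j v_i (0-based indices)."""
    return lambda u, v: u[i] * v[j] - u[j] * v[i]

dydz, dxdz, dxdy = basis_2form(1, 2), basis_2form(0, 2), basis_2form(0, 1)

u, v = (1, 2, 3), (4, 5, 6)   # arbitrary sample vectors
print(dims)                                # [1, 3, 3, 1]
print(dydz(u, v), dxdz(u, v), dxdy(u, v))  # -3 -6 -3
print(dydz(v, u))                          # 3, the sign flips when u and v are swapped
```

The three values are, up to the sign of the middle one, the components of the cross product $u \times v$, which is why a 2-form in $\mathbb{R}^3$ can be visualized as a vector field.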
36.3. The exterior derivative $d : \Lambda^p \to \Lambda^{p+1}$ is defined for $f \in \Lambda^0$ as
$df = f_{x_1} dx_1 + \cdots + f_{x_n} dx_n$ and $d(f dx_{i_1} \cdots dx_{i_p}) = \sum_i f_{x_i} dx_i dx_{i_1} \cdots dx_{i_p}$.
For $F = P dx + Q dy$ for example, it is $(P_x dx + P_y dy) dx + (Q_x dx + Q_y dy) dy = (Q_x - P_y) dxdy$
which is the curl of $F$. If $r : G \subset \mathbb{R}^m \to \mathbb{R}^n$ is a parametrization, then $S = r(G)$ is a
$m$-surface and $\delta S = r(\delta G)$ is its boundary in $\mathbb{R}^n$. If $F \in \Lambda^p(\mathbb{R}^n)$ is a $p$-form on $\mathbb{R}^n$, then
$r^* F(x)(u_1, \dots, u_p) = F(r(x))(dr(x)(u_1), dr(x)(u_2), \dots, dr(x)(u_p))$ is a $p$-form in $\mathbb{R}^m$
called the pull-back of $F$ by $r$. Given a $p$-form $F$ and a $p$-surface $S = r(G)$, define the
integral $\int_S F = \int_G r^* F$. The general Stokes theorem is

Theorem: $\int_S dF = \int_{\delta S} F$ for a $(m-1)$-form $F$ and $m$-surface $S$ in $E$.

36.4. Proof. As in the proof of the divergence theorem, we can assume that the region
$G$ is simultaneously of the form $g_j(x_1, \dots, \hat{x}_j, \dots, x_m) \le x_j \le h_j(x_1, \dots, \hat{x}_j, \dots, x_m)$,
where $1 \le j \le m$, and that $F = [0, \dots, 0, F_j, 0, \dots, 0]$. The coordinate independent
definition of $dF$ reduces the result to the divergence theorem in $G$. QED

Examples
36.5. For $n = 1$, there are only 0-forms and 1-forms. Both are scalar functions. We
write $f$ for a 0-form and $F = f dx$ for a 1-form. The symbol $dx$ abbreviates the linear
map $dx(u) = u$. The 1-form assigns to every point the linear map $f(x) dx(u) = f(x) u$.
The exterior derivative $d : \Lambda^0 \to \Lambda^1$ is given by $df(x) u = f'(x) u$. Stokes theorem is the
fundamental theorem of calculus $\int_a^b f'(x) \, dx = f(b) - f(a)$.
36.6. For $n = 2$, there are 0-forms, 1-forms and 2-forms. It is customary to write
$F = P dx + Q dy$ rather than $F = [P, Q]$, which is thought of as a linear map
$F(x, y)(u) = P(x, y) u_1 + Q(x, y) u_2$. A 2-form is also written as $F = f dxdy$ or $F = f dx \wedge dy$.
Here $dxdy$ means the bi-linear map $dxdy(u, v) = u_1 v_2 - u_2 v_1$. The 2-form defines
such a bi-linear map at every point $(x, y)$. The exterior derivative $d : \Lambda^0 \to \Lambda^1$
is $df(x, y)(u_1, u_2) = f_x(x, y) u_1 + f_y(x, y) u_2$ which encodes the Jacobian $df = [f_x, f_y]$,
a row vector. The exterior derivative of a 1-form $F = P dx + Q dy$ is $dF(x, y)(u, v) =
(-1)^1 P_y(x, y) \det([u, v]) + (-1)^2 Q_x(x, y) \det([u, v])$ which is $(Q_x - P_y) dxdy$. Using
coordinates is convenient as $dF = P_y dydx + Q_x dxdy = (Q_x - P_y) dxdy$, using now that
$dydx = -dxdy$.
36.7. For $n = 3$, we write $F = P dx + Q dy + R dz$ for a 1-form, and $F = P dydz +
Q dzdx + R dxdy$ for a 2-form. Here $dydz = dy \wedge dz$ are symbols representing bi-linear
maps like $dydz(u, v) = u_2 v_3 - u_3 v_2$. As a 2-form has 3 components, it can
be visualized as a vector field. A 3-form $f dxdydz$ defines a scalar function $f$. The
symbol $dxdydz = dx \wedge dy \wedge dz$ represents the map $dxdydz(u, v, w) = \det([u v w])$.
The exterior derivative of a 1-form gives the curl because $d(P dx + Q dy + R dz) =
P_y dydx + P_z dzdx + Q_x dxdy + Q_z dzdy + R_x dxdz + R_y dydz$ which is
$(R_y - Q_z) dydz + (P_z - R_x) dzdx + (Q_x - P_y) dxdy$. The exterior derivative of a 2-form
$P dydz + Q dzdx + R dxdy$ is $P_x dxdydz + Q_y dydzdx + R_z dzdxdy = (P_x + Q_y + R_z) dxdydz$.
To integrate a 2-form $F = x^2 yz \, dxdy + yz \, dydz + xz \, dxdz$ over a surface
$r(u, v) = [x, y, z] = [uv, u - v, u + v]$ with $G = \{u^2 + v^2 \le 1\}$ we end up with integrating
$F(r(u, v)) \cdot (r_u \times r_v)$. In order to integrate $dF$ for a 1-form $F = P dx + Q dy + R dz$ we can
also pull back $F$ and get $\iint_G F_u(r(u, v)) \cdot r_v - F_v(r(u, v)) \cdot r_u \, dudv$, where $F_u$ and $F_v$
denote the partial derivatives of $F(r(u, v))$ with respect to $u$ and $v$.

36.8. For $n = 4$, we have 0-forms $f$, 1-forms $F = P dx + Q dy + R dz + S dw$ and
2-forms $F = F_{12} dxdy + F_{13} dxdz + F_{14} dxdw + F_{23} dydz + F_{24} dydw + F_{34} dzdw$ which are
objects with 6 components, then 3-forms $F = P dydzdw + Q dxdzdw + R dxdydw +
S dxdydz$ and finally 4-forms $f dxdydzdw$.

Remarks
36.9. Historically, differential forms emerged in 1922 with Élie Cartan. Most textbooks
introduce the Grassmann algebra early and use the language of “chains”, for example,
which is the language used in algebraic topology. It was Jean Dieudonné in 1972 who
freed the general Stokes theorem from chains and first used the coordinate free
pull-back idea. This allowed us in this lecture to formulate the general Stokes theorem from
scratch on a single page with all definitions.

36.10. What is a differential form? We have seen a mathematically precise
definition: a differential form is a kind of field: a multi-linear anti-symmetric function
attached to each point of space. But what is the intuition and what are ways to
“visualize” and “see” and “understand” such an object? Here are four paths. Maybe one
of them helps:

A) Using Stokes one can see a form as a functional $F$, which assigns to a $m$-dimensional
oriented surface $S$ a number $\int_S F \cdot dS$ such that
$\int_{-S} F \cdot dS = \int_S (-F) \cdot dS = -\int_S F \cdot dS$. [1]
This way of thinking about forms matches what we do in the discrete. If we have
a $k$-form on a graph, then this is a function on $k$-dimensional oriented complete
subgraphs. Given a graph $S$ we have $\int_S F \cdot dS = \sum_{x \in S} F(x)$, where the sum is over all
$k$-dimensional simplices in $S$.

B) One can understand differential forms better using arithmetic, the Grassmann
algebra. This is done with the help of the tensor product, which induces an exterior
product $F \wedge G$ on $\Lambda^p \times \Lambda^q \to \Lambda^{p+q}$. This product generalizes the cross product
$\Lambda^1 \times \Lambda^1 \to \Lambda^2$ which works for $n = 3$ as there, the space of 1-forms $\Lambda^1$ and 2-forms $\Lambda^2$
can be identified. The exterior algebra structure helps to understand $k$-forms. We can
for example see a 2-form as an exterior product $F \wedge G$ of two 1-forms. We can think of
a 2-form for example as attaching two vectors at a point and identify two such frames
if their orientation and parallelogram areas match.

C) A third way comes through physics. We are familiar with manifestations of
electromagnetism: we see light, we use magnets to attach papers to the fridge or have
magnetic forces keep the laptop lid closed. Electric fields are felt when combing the
hair, as we see sparks generated by the high electric field obtained by stripping away
the electrons from the head. We use magnetic fields to store information on hard
drives and electric fields to store information on an SSD. Non-visible electromagnetic
fields are used when communicating using cell phones or connecting through
Bluetooth or wireless network connections. The electromagnetic field $E, B$ is actually
a 2-form in 4 dimensions. The $B(4, 2) = 6$ components are $(E_1, E_2, E_3, B_1, B_2, B_3)$.

D) A fourth way comes through discretization. When formulating Stokes on a
discrete network, everything is much easier: a $k$-form is just a function on oriented
$k$-dimensional complete subgraphs of a network. Start with a graph $G = (V, E)$
and orient the complete subgraphs arbitrarily. Given a $k$-form $F$, a function on
$k$-simplices, the exterior derivative at a $(k + 1)$-dimensional simplex $x$ is defined as
$dF(x) = \sum_{y \subset x} \sigma(y, x) F(y)$, where the sum is over all $k$-dimensional sub-simplices of
$x$ and $\sigma(y, x) = 1$ if the orientation of $y$ matches the orientation of $x$ or $-1$ else. We
have for example seen that for a 1-form $F$, a function on edges, the exterior derivative
at a triangle $x$ is the sum over the $F$ values of the edges, where we add up the value
negatively if the arrow of the edge does not match the orientation of the triangle.

[1] David Bachman’s text on differential forms: “it is a thing which can be integrated”.

Applications
36.11. An electromagnetic field is determined by a 1-form $A$ in 4-dimensional space
time. The electromagnetic field is $F = dA$. The Maxwell equations are $dF = 0$ (the
relation $d \circ d = 0$ is seen in the homework). The second part of the Maxwell equations
is $d^* F = j$, where $d^* : \Lambda^p \to \Lambda^{p-1}$ is the adjoint and $j$ is a 1-form encoding both
the electric charge and the electric current. We can always gauge with a gradient
$A + df$ so that $d^*(A + df) = 0$. The Maxwell equations then reduce to the Poisson equation
$LA = (dd^* + d^* d) A = j$, where $L$ is the Laplacian on 1-forms. In vacuum, without
electric charges or currents, we have the wave equation $LA = 0$. And there was light.
Homework

Problem 36.1: Given the 1-form $F(x, y, z, w) = [x^3, y^5, z^5, w^2] = x^3 dx +
y^5 dy + z^5 dz + w^2 dw$ and the curve $C : r(t) = [\cos(t), \sin(t), \cos(t), \sin(t)]$
with $0 \le t \le \pi$. Find the line integral $\int_C F(r(t)) \cdot dr$.

Problem 36.2: Given the 1-form $F = [xyz, xy, wx, wxy] = xyz \, dx +
xy \, dy + wx \, dz + wxy \, dw$, find the curl $dF$. Now find $\iint_S dF$ over the
2-dimensional surface $S : x^2 + y^2 \le 1, z = 1, w = 1$ which has as a boundary
the curve $C : r(t) = [\cos(t), \sin(t), 1, 1]^T$, $0 \le t \le 2\pi$. You certainly can use
the Stokes theorem. If you like to compute both sides of the theorem you can see how the theorem
works. The 2-manifold $S$ is parametrized by $r(t, s) = [s, t, 1, 1]^T$. The $(r_s \wedge r_t)_{ij}$ has 6 components,
where only one component $(r_s \wedge r_t)_{12}$ is nonzero. This will match with the $dF_{12} = P dxdy$ part of the
6-component 2-form $dF$ building the curl. We will have to integrate then over $G = \{s^2 + t^2 \le 1\}$.

Problem 36.3: Given the 2-form $F = z^4 x \, dxdz + xyzw^2 \, dydw$ and the
3-sphere $x^2 + y^2 + z^2 + w^2 = 1$ oriented outwards. What is the integral
$\iiint_S dF$? To compute this 3D integral, you can use the general integral
theorem.

Problem 36.4: Given the 3-form $F = xyz \, dxdydz + y^2 z \, dydzdw$, find
the divergence $dF$. Now find the flux of $F$ through the unit sphere
$x^2 + y^2 + z^2 + w^2 = 1$ oriented outwards.

Problem 36.5:
a) Take f (x, y, z, w). Check that F = df satisfies dF = 0.
b) Take F = F1 dx + F2 dy + F3 dz + F4 dw. Compute the curl G = dF and
check that dG = 0.
c) Take the 2-form F = F12 dxdy + F13 dxdz + F14 dxdw + F23 dydz +
F24 dydw + F34 dzdw. Write down the 3-form G = dF and check dG = 0.
d) Take the 3-form F = F1 dydzdw + F2 dxdzdw + F3 dxdydw + F4 dxdydz
and compute the 4-form G = dF . Check that dG = 0.


Unit 37: A Discrete World

Seminar
37.1. A 0-form $f$ on a graph $G = (V, E)$ is a function on the vertices $V$. It is what
we call a scalar function. A 1-form is a function on the oriented edges $E$, meaning
$F(a, b) = -F(b, a)$. Informally, as in the continuum, we think of a 1-form as a vector
field. The gradient $F(a, b) = df(a, b) = f(b) - f(a)$ of a 0-form $f$ is a 1-form $F$.
The curl of a vector field $F$ is a 2-form. It is a function on triangles $(a, b, c)$ given by
$dF(a, b, c) = F(a, b) + F(b, c) + F(c, a)$ which can be seen as the line integral along the
boundary of the triangle. When describing $p$-forms for $p > 0$, orientation matters. To
fix it, just enumerate the vertices $V$ and then choose the orientation of an edge $(a, b)$
with $a < b$ or the orientation of a triangle $(a, b, c)$ with $a < b < c$. The discrete Stokes
theorem $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$ told us that the sum of the curls of $F$ on
triangles of a surface $S$ is equal to the line integral of $F$ along the boundary $C$ of $S$.
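
These definitions translate directly into code. The following Python sketch is an added illustration; the small graph, its values and the helper names are made up. It stores a 1-form as a dictionary on edges $(a, b)$ with $a < b$, checks that the curl of a gradient vanishes, and verifies the discrete Stokes theorem on a region made of two triangles. The two triangles are listed with compatible orientations, so that the shared edge is traversed in opposite directions.

```python
def gradient(f, edges):
    """The 1-form df(a, b) = f(b) - f(a) of a 0-form f given as a dict on vertices."""
    return {(a, b): f[b] - f[a] for (a, b) in edges}

def value(F, a, b):
    """Evaluate a 1-form on an edge in either direction, using F(a, b) = -F(b, a)."""
    return F[(a, b)] if (a, b) in F else -F[(b, a)]

def curl(F, triangle):
    """dF(a, b, c) = F(a, b) + F(b, c) + F(c, a)."""
    a, b, c = triangle
    return value(F, a, b) + value(F, b, c) + value(F, c, a)

# Two triangles glued along the edge (2, 3).
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
triangles = [(1, 2, 3), (2, 4, 3)]

f = {1: 5, 2: -1, 3: 7, 4: 2}
curls_of_gradient = [curl(gradient(f, edges), t) for t in triangles]

G = {(1, 2): 3, (1, 3): 1, (2, 3): 4, (2, 4): 2, (3, 4): 5}
sum_of_curls = sum(curl(G, t) for t in triangles)
boundary_integral = value(G, 1, 2) + value(G, 2, 4) + value(G, 4, 3) + value(G, 3, 1)
print(curls_of_gradient, sum_of_curls, boundary_integral)  # [0, 0] -1 -1
```

The contributions of the inner edge $(2, 3)$ cancel in the sum of the curls, which is the mechanism behind the induction proof of the discrete Green theorem.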

Figure 1. Examples of three dimensional graphs with 1, 2, 4 and 12
tetrahedra. The divergence $dF(x)$ of a 2-form $F$ is the sum over all
values $F(y)$, where $y \subset x$ runs over the triangular faces of $x$. The sum
of all divergences is the flux of $F$ through the boundary because in the
inside the fluxes cancel. This is the divergence theorem for solids.

37.2. A tetrahedral graph is a collection of 4 nodes which all are connected to each
other. A 3-form on a graph $G$ is a function on tetrahedral sub-graphs $x$ of $G$. An
example is the divergence $dF(x)$ of a 2-form $F$ which is defined as the sum of the
$F(y)$ values of the triangles $y \subset x$ enclosing the tetrahedron $x$. As in the continuum,
the orientation plays a role. Here is the discrete divergence theorem for a solid $G$
which is built from tetrahedra $x$ and where the boundary surface $S$ consists of triangles:
Problem A: Check that $\sum_{x \in G} \mathrm{div}(F)(x) = \sum_{y \in S} F(y)$.

Hint: prove by induction with respect to the number of tetrahedra. First check that
if G is a single tetrahedron, this is the definition of the divergence. Then see what
happens if a new tetrahedron is added.
37.3. We also have seen that the divergence of the curl of a vector field $F$ is zero: We
had $\mathrm{curl}(F) = [R_y - Q_z, P_z - R_x, Q_x - P_y]$ and taking the $x$-derivative of $R_y - Q_z$ is
$R_{yx} - Q_{zx}$, the $y$-derivative of $P_z - R_x$ is $P_{zy} - R_{xy}$ and the $z$-derivative of $Q_x - P_y$ is
$Q_{xz} - P_{yz}$. Adding them all up gives 0. In the discrete it is even simpler. Start with
a 1-form $F$ on the edges of a graph. Then form the curls, which are functions on the
triangles, then add up all these curls. You check:

Problem B: Check: $\mathrm{div}(\mathrm{curl}(F))(x) = 0$ for every $F$ and tetrahedron $x$.

37.4. The general Stokes theorem is not much different. A $p$-simplex in a graph
is a collection of $p + 1$ nodes which are all connected to each other. A $p$-form is a
function on the set of $p$-simplices $x$ in $G$. The function value is fixed if the simplex is
given in an oriented way but defined also if the simplices are oriented differently, we
just have $F(x_0, \dots, x_p) = (-1)^\sigma F(\sigma(x_0), \dots, \sigma(x_p))$ if $\sigma$ is a permutation. For example
$F(x_0, x_1, x_2) = F(x_1, x_2, x_0) = F(x_2, x_0, x_1) = -F(x_1, x_0, x_2) = -F(x_0, x_2, x_1) =
-F(x_2, x_1, x_0)$.
37.5. The exterior derivative of a $p$-form $F$ is the $(p + 1)$-form
$$dF(x_0, \dots, x_{p+1}) = \sum_{j=0}^{p+1} (-1)^j F(x_0, \dots, \hat{x}_j, \dots, x_{p+1}) .$$
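
The formula is short enough to be implemented directly. The Python sketch below is an added illustration, not a proof: forms are stored as dictionaries on sorted vertex tuples, the helper names are made up, and the edge values are arbitrary. It computes $dF$ and $ddF$ on the complete graph with 4 vertices, a single tetrahedron.

```python
from itertools import combinations

def sign(seq):
    """Sign of the permutation which sorts seq (parity of the number of inversions)."""
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return -1 if inversions % 2 else 1

def evaluate(F, simplex):
    """Value of a p-form on an ordered simplex; F stores values on sorted vertex tuples."""
    return sign(simplex) * F[tuple(sorted(simplex))]

def d(F, simplices):
    """dF(x_0..x_{p+1}) = sum_j (-1)^j F(x_0..omit x_j..x_{p+1}) on the given simplices."""
    return {
        x: sum((-1) ** j * evaluate(F, x[:j] + x[j + 1:]) for j in range(len(x)))
        for x in simplices
    }

vertices = range(4)
edges = list(combinations(vertices, 2))
triangles = list(combinations(vertices, 3))
tetrahedra = list(combinations(vertices, 4))

F = {e: (3 * e[0] + 7 * e[1]) % 5 for e in edges}  # arbitrary made-up edge values
dF = d(F, triangles)       # curl on each triangle
ddF = d(dF, tetrahedra)    # divergence of the curl on the tetrahedron
print(dF, ddF)
```

For a triangle $(a, b, c)$ with $a < b < c$ the formula gives $F(b, c) - F(a, c) + F(a, b)$, which agrees with the curl $F(a, b) + F(b, c) + F(c, a)$.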

Problem C: Check in general that ddF = 0.

37.6. The general Stokes theorem tells that for a $m$-dimensional graph $G$ with
boundary $S$ and a $(m - 1)$-form $F$ we have
Theorem: $\sum_{x \in G} dF(x) = \sum_{y \in S} F(y)$

Gravity
37.7. The Newton equations $\frac{d^2}{dt^2} x_k = -\sum_{j \ne k} G m_j (x_k - x_j)/|x_k - x_j|^3$ with gravitational
constant $G$ describe the motion of finitely many mass points with positions $x_k(t) \in \mathbb{R}^3$
and mass $m_k$. These classical laws govern the motion of planets in our solar system,
stars in a galaxy or galaxies in a galaxy cluster. While relativity modifies this
Newtonian picture slightly and produces corrections which for example manifest in the
perihelion advance of Mercury, the Newtonian theory is amazingly accurate. Gauss
derived the gravitational inverse square force $F$ from $\mathrm{div}(F) = 4\pi\sigma$, where $\sigma$ is the
mass density. While divergence usually maps a 2-form to a 3-form, it is the adjoint
$d^*$ of the gradient $d$. In $\mathbb{R}^3$ it is equivalent. Now, $L = \mathrm{div} \circ \mathrm{grad} = d^* d : \Lambda^0 \to \Lambda^0$ is
called the Kirchhoff Laplacian. The Gauss law of gravity therefore is the Poisson
equation $LV = 4\pi\sigma$, where $V$ is the gravitational potential, a 0-form. Since $d^* = 0$
on 0-forms, we can also write $L = dd^* + d^* d$. Classical gravity gets from a mass density
$\sigma$ the gravitational potential $V$ and so the gravitational field as a gradient $F = dV$:
$(d^* d + dd^*) V = 4\pi\sigma$ defines the gravitational 1-form $F = dV$.

Electromagnetism
37.8. The Maxwell equations $\mathrm{div}(E) = 4\pi\sigma$, $\mathrm{div}(B) = 0$, $\mathrm{curl}(E) = -B_t$,
$\mathrm{curl}(B) = E_t + 4\pi i$ become more elegant when written in four-dimensional space-time $\mathbb{R}^4$. There
are then two equations only. The first is $dF = 0$ which is evident from $F = dA$ and
$d^2 = 0$. The second is $d^* F = 4\pi j$, where $j$ is the 4-current encoding both the
charge density $\sigma$ as well as the electric current $i$. Now $dF = 0$ implies in a simply
connected region that $F = dA$, where $A$ is an electromagnetic potential. If $d^* A = 0$
(which can always be achieved by adding a gradient to $A$) we get the Poisson equation
$LA = (dd^* + d^* d) A = 4\pi j$. This completely encodes the Maxwell equations; we can
look at it also in a discrete network. Classical electromagnetism in a world with charge
and current density $j$ is the field $F = dA$, where $A$ is obtained from the Poisson equation:
$(d^* d + dd^*) A = 4\pi j$ defines the electromagnetic 2-form $F = dA$.

Quantum mechanics and beyond

37.9. In this last homework we deal with a small universe G. We call it Gaia, the
primordial deity of earth. In Greek mythology, Gaia was the daughter of Aether
the god of air and Hemera the goddess of light. We only create the gravitational
field, the electromagnetic field on G and some quanta, so there will be matter and
light in this world. But the mathematics is exactly as in the universe we live in: the
classical gravitational field is described with the language of Gauss which we have seen
to imply the Newton law of gravity. The electromagnetic field is formulated according
to Maxwell, but directly in space-time. We also look a bit at quantum mechanics as
the eigenvalues and eigenvectors of the Laplacian L play a role when looking at the
Wheeler-DeWitt equation, a time-independent Schrödinger equation in space-time:
(d∗ d + dd∗ )F = λF defines a wave function F on p-forms.

37.10. The rest will be up to you: it remains to include the Fermionic constituents of
matter (quarks (building mesons and baryons) as well as leptons) and bosons (photons,
gluons, vector bosons and the Higgs) as well as a few other details. Don’t worry, a
former student has solved a similar homework assignment in less than 7 days ...

Figure 2. The Greek goddess Gaia, seen in a Roman relief sculpture
from the “Ara Pacis Augustae” in Rome. (Image by Dr. Sarah E. Bond.)

Homework

Problem 37.1: Given the 1-form F in Figure 3a, find the 0-form
$f(x) = d^*F(x) = \sum_{e,\, e \to x} F(e)$. Check $\sum_{x \in V} d^*F(x) = 0$. (Conservation law)

Problem 37.2: a) Given the 0-form f in Figure 3b, find F = df, then
compute d∗F = d∗df = Lf.
b) Given the 2-form H in Figure 3c, find a 1-form F such that dF = H.

Problem 37.3: Given the 0-form f in Figure 4a, check that this f
satisfies L0 f = λf for some constant λ. This is called an eigenvalue of L.
We write Lp for dd∗ + d∗d restricted to p-forms.

Problem 37.4: In Figure 4c you see a 2-form H. Check that this H
satisfies L2 H = λH for some constant λ. What is the eigenvalue?

Problem 37.5: Find f = d∗F and H = dF for the 1-form F in Figure
4b. Then check d∗H + df = (d∗d + dd∗)F = λF for some constant λ.

Figure 3. a) a 1-form, b) a 0-form, and c) a 2-form

Figure 4. Eigenvectors. a) for L0, b) for L1 and c) for L2.


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 38: Geometries and Fields

Lecture
38.1. Integral theorems deal with geometries G and fields F . Integration pairs
them up and gives the Stokes theorem
$\int_G dF = \int_{\delta G} F$
It involves the boundary δG of G and the exterior derivative dF of F . One can
classify the theorems by looking at the dimension n of space and the dimension m of
the object we are integrating over. In dimension n, there are n theorems:

dimension 1:   1 --d/dx--> 1                            FTC
dimension 2:   1 --grad--> 2 --curl--> 1                FTL, Green
dimension 3:   1 --grad--> 3 --curl--> 3 --div--> 1     FTL, Stokes, Gauss

38.2. The Fundamental theorem of line integrals is a theorem about the gradient
∇f . It tells that if C is a curve going from A to B and f is a function (that is a 0-form),
then
Theorem: $\int_C \nabla f \cdot dr = f(B) - f(A)$

In calculus we write the 1-form as a column vector field ∇f. It actually is a 1-form
F = df, a field which attaches a row vector to every point. If the 1-form is evaluated at
r′(t) one gets df(r(t))(r′(t)), which is a matrix product. We then integrate the pull
back of the 1-form on the interval [a, b]. It is the switch from row vectors to column
vectors which leads to the dot product ∇f(r(t)) · r′(t). For closed curves, the line
integral is zero. It follows also that integration is path independent.
38.3. Green’s theorem tells that if G ⊂ R2 is a region bound by a curve C having
G to the left, then
Theorem: $\iint_G \mathrm{curl}(F)\, dxdy = \int_C F \cdot dr$

Figure 1. Fundamental theorem of line integrals and Green’s theorem.

In the language of forms, F = P dx + Qdy is a 1-form and dF = (Px dx + Py dy)dx +
(Qx dx + Qy dy)dy = (Qx − Py)dxdy is a 2-form. We write this 2-form dF as Qx − Py
and treat it as a scalar function even though this is not the same as a 0-form, which is a
scalar function. If curl(F) = 0 everywhere in R2 then F is a gradient field.

38.4. Stokes theorem tells that if S is a surface with boundary C oriented to have
S to the left and F is a vector field, then
Theorem: $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$

Figure 2. Stokes theorem and the Gauss theorem.

In the general framework, the field F = P dx + Qdy + Rdz is a 1-form and the 2-form
dF = (Px dx + Py dy + Pz dz)dx + (Qx dx + Qy dy + Qz dz)dy + (Rx dx + Ry dy + Rz dz)dz =
(Qx − Py )dxdy + (Ry − Qz )dydz + (Pz − Rx )dzdx is written as a column vector field
curl(F ) = [Ry − Qz , Pz − Rx , Qx − Py ]T . To understand the flux integral, we need to
see what a bilinear form like dxdy does on the pair of vectors ru , rv . In the case dxdy
we have dxdy(ru , rv ) = xu yv − yu xv which is the third component of the cross product
ru × rv with ru = [xu , yu , zu ]T . Integrating dF over S is the same as integrating the
dot product of curl(F ) · ru × rv . Stokes theorem implies that the flux of the curl of F
only depends on the boundary of S. In particular, the flux of the curl through a closed
surface is zero because the boundary is empty.
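
The statement that the basic 2-forms evaluated on a pair of vectors give the components of the cross product can be checked directly. This is a small sketch with hypothetical integer vectors standing in for ru and rv; the helper names `cross` and `two_form` are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def two_form(i, j, a, b):
    """Evaluate the basic 2-form dx_i dx_j on the pair of vectors (a, b)."""
    return a[i]*b[j] - a[j]*b[i]

X, Y, Z = 0, 1, 2
ru = [1, 2, 3]   # hypothetical tangent vectors, chosen only for illustration
rv = [4, 5, 6]

# dydz, dzdx, dxdy evaluated on (ru, rv), in this order
components = [two_form(Y, Z, ru, rv),
              two_form(Z, X, ru, rv),
              two_form(X, Y, ru, rv)]
```

The list `components` agrees with `cross(ru, rv)`, and swapping the two arguments of `two_form` flips the sign, which reflects dxdy = −dydx.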
38.5. Gauss theorem: if the surface S bounds a solid E in space, is oriented
outwards, and F is a vector field, then
Theorem: $\iiint_E \mathrm{div}(F)\, dV = \iint_S F \cdot dS$

Gauss theorem deals with a 2-form F = P dydz + Qdzdx + Rdxdy, but because a
2-form has three components, we can write it as a vector field F = [P, Q, R]^T. We have
computed dF = (Px dx + Py dy + Pz dz)dydz + (Qx dx + Qy dy + Qz dz)dzdx + (Rx dx +
Ry dy + Rz dz)dxdy, where only the terms Px dxdydz + Qy dydzdx + Rz dzdxdy = (Px +
Qy + Rz)dxdydz survive which we associate again with the scalar function div(F) =
Px + Qy + Rz. The integral of a 3-form over a 3-solid is the usual triple integral. For a
divergence free vector field F, the flux through a closed surface is zero. Divergence-free
fields are also called incompressible or source free.

Remarks
38.6. We see why the 3 dimensional case looks confusing at first. We have three
theorems which look very different. This type of confusion is common in science: we
put things in the same bucket which actually are different: it is only in 3 dimensions
that 1-forms and 2-forms can be identified. Actually, more is mixed up: not only
are 1-forms and 2-forms identified, they are also written as vector fields which are
T01 tensor fields. From the tensor calculus point of view, we identify the three spaces
T01 (E) = E, T10 (E) = Λ1 (E) = E ∗ and Λ2 (E) ⊂ T20 . While we can still always identify
vector fields with 1-forms, this identification in a general non-flat space will depend
on the metric. In R4 , the 2-forms have dimension 6 and can no more be written as a
vector. One still does. The electro-magnetic F is a 2-form in R4 which we write as a
pair of two time-dependent vector fields, the electric field E and the magnetic field B.
38.7. Geometries and fields are remarkably similar. On geometries, the boundary
operation δ satisfies δ◦δ = 0. On fields the derivative operation d satisfies d◦d = 0.
“Geometries” as well as “fields” come with an orientation: ru × rv = −rv × ru ,
dxdy = −dydx. The operations d and δ look different because calculus deals with
smooth things like curves or surfaces leading to generalized functions. In quantum
calculus they are thickened up and d, δ defined without limit. Fields and geometries
then become indistinguishable elements in a Hilbert space. The exterior derivative d
has as an adjoint δ = d∗ which is the boundary operator. It is a kind of quantum field
theory as d generates while d∗ destroys a “particle”. d2 = δ 2 = 0 is a “Pauli exclusion”.
38.8. We can spin this further: a m-manifold S is the image of a parametrization
r : G ⊂ Rm → Rn . The Jacobian dr is a dual m-form, the exterior product of the m
vectors dru1 up to drum (think of m column vectors attached to r(u) ∈ S). If we take
a map s : S ⊂ Rn → Rm and look at F = ds, we can think of it as a m-form F (think
of m row vectors attached to each point x in Rn). The map s defines the m × n Jacobian
ds(x), while the Jacobian dr(u) is an n × m matrix. Cauchy-Binet shows that the
flux of F = ds through r(G) = S is the integral $\int_G F = \int_G \det(ds(r(u))\, dr(u))\, du
= \int_S \det(ds(x)\, dr(s(x)))$. If s(r(u)) = u, then this is a geometric functional. So:
geometries G can come from maps from a space A to a space B, while fields F can
come from maps from B to A. The action integral $\int_G F$ generalizes the Polyakov
action $\int_G \det(dr^T dr) = \int_G |dr|^2$, a case where F and G are dual, meaning s(r(u)) = u.

Prototype examples

Problem: Compute the line integral of $F(x, y, z) = [5x^4 + zy, 6y^5 + xz, 7z^6 + xy]$
along the path $r(t) = [\sin(5t), \sin(2t), t^2/\pi^2]$ from t = 0 to
t = 2π.

Solution: The field is a gradient field df with $f = x^5 + y^6 + z^7 + xyz$.
We have A = r(0) = (0, 0, 0) and B = r(2π) = (0, 0, 4) and f(A) = 0
and $f(B) = 4^7$. The fundamental theorem of line integrals gives
$\int_C \nabla f \cdot dr = f(B) - f(A) = 4^7 = 16384$.
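
As a sanity check, the line integral can also be approximated directly and compared with the potential difference f(B) − f(A) = 4^7 = 16384. This is only a numerical sketch: the composite Simpson rule and the number of subintervals are choices made here, not part of the lecture.

```python
import math

def F(x, y, z):
    return (5*x**4 + z*y, 6*y**5 + x*z, 7*z**6 + x*y)

def f(x, y, z):
    """Potential with gradient F."""
    return x**5 + y**6 + z**7 + x*y*z

def r(t):
    return (math.sin(5*t), math.sin(2*t), t**2 / math.pi**2)

def r_prime(t):
    return (5*math.cos(5*t), 2*math.cos(2*t), 2*t / math.pi**2)

def integrand(t):
    """F(r(t)) . r'(t)"""
    return sum(Fi*di for Fi, di in zip(F(*r(t)), r_prime(t)))

def simpson(g, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i*h)
    return total * h / 3

line_integral = simpson(integrand, 0.0, 2*math.pi)
potential_difference = f(*r(2*math.pi)) - f(*r(0.0))
```

Both numbers agree with 16384 up to the discretization and rounding error.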

Problem: Find the line integral of the vector field $F(x, y) = [x^4 + \sin(x) +
y + 5xy, 4x + y^3]$ along the cardioid r(t) = (1 + sin(t))[cos(t), sin(t)], where
t runs from t = 0 to t = 2π.

Solution: We use Green’s theorem. Since curl(F) = 3 − 5x, the line
integral is the double integral $\iint_G (3 - 5x)\, dxdy$. We integrate in polar
coordinates and get $\int_0^{2\pi} \int_0^{1+\sin(t)} (3 - 5r\cos(t))\, r\, drdt$, which is 9π/2. One
can shortcut by noticing that by symmetry $\iint_G (-5x)\, dxdy = 0$, so that
the integral is 3 times the area $\int_0^{2\pi} (1+\sin(t))^2/2\, dt = 3\pi/2$ of the cardioid.
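
The value 9π/2 can be cross-checked by approximating the line integral along the cardioid itself. This is a sketch under the assumption that a composite Simpson rule with 2000 subintervals is accurate enough for this smooth periodic integrand.

```python
import math

def F(x, y):
    return (x**4 + math.sin(x) + y + 5*x*y, 4*x + y**3)

def r(t):
    s = 1 + math.sin(t)
    return (s*math.cos(t), s*math.sin(t))

def r_prime(t):
    s, ds = 1 + math.sin(t), math.cos(t)
    return (ds*math.cos(t) - s*math.sin(t), ds*math.sin(t) + s*math.cos(t))

def integrand(t):
    """F(r(t)) . r'(t)"""
    P, Q = F(*r(t))
    dx, dy = r_prime(t)
    return P*dx + Q*dy

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i*h)
    return total * h / 3

line_integral = simpson(integrand, 0.0, 2*math.pi)
green_value = 9*math.pi/2   # 3 times the area 3*pi/2 of the cardioid
```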

Problem: Compute the line integral of $F(x, y, z) = [x^3 + xy, y, z]$
along the polygonal path C connecting the points
(0, 0, 0), (2, 0, 0), (2, 1, 0), (0, 1, 0).

Solution: The path C bounds a surface S : r(u, v) = [u, v, 0] parameterized
on G = {(x, y) | x ∈ [0, 2], y ∈ [0, 1]}. By Stokes theorem, the
line integral is equal to the flux of curl(F)(x, y, z) = [0, 0, −x] through S.
The normal vector of S is ru × rv = [1, 0, 0] × [0, 1, 0] = [0, 0, 1] so that
$\iint_S \mathrm{curl}(F) \cdot dS = \int_0^2 \int_0^1 [0, 0, -u] \cdot [0, 0, 1]\, dvdu = \int_0^2 \int_0^1 -u\, dvdu = -2$.
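
The result −2 can be confirmed by integrating F along the four sides of the rectangle directly. In this sketch the path is assumed to be closed by returning from (0, 0, 0, ) style corner (0, 1, 0) to (0, 0, 0); Simpson's rule is used on each side and is exact here up to rounding because the integrands are polynomials of degree at most 3.

```python
def F(x, y, z):
    return (x**3 + x*y, y, z)

def segment_integral(A, B, n=100):
    """Simpson approximation of the line integral of F along the segment from A to B."""
    d = [b - a for a, b in zip(A, B)]        # constant velocity r'(t) = B - A on [0, 1]
    def g(t):
        p = [a + t*di for a, di in zip(A, d)]
        return sum(Fi*di for Fi, di in zip(F(*p), d))
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i*h)
    return total * h / 3

corners = [(0, 0, 0), (2, 0, 0), (2, 1, 0), (0, 1, 0)]
closed = corners + [corners[0]]              # return to the starting point
line_integral = sum(segment_integral(A, B) for A, B in zip(closed, closed[1:]))
```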

Problem: Compute the flux of the vector field $F(x, y, z) = [-x, y, z^2]$
through the boundary S of the rectangular box G = [0, 3] × [−1, 2] × [1, 2].

Solution: By the Gauss theorem, the flux is equal to the triple integral
of div(F) = 2z over the box: $\int_0^3 \int_{-1}^2 \int_1^2 2z\, dzdydx = (3 - 0)(2 - (-1))(4 - 1) = 27$.
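
The flux can also be added up face by face, which gives an independent check of the value 27. The following sketch uses a midpoint rule on each of the six faces; the grid size n is an arbitrary choice, and the rule is exact here because F · n is constant on every face.

```python
def F(x, y, z):
    return (-x, y, z**2)

box = [(0.0, 3.0), (-1.0, 2.0), (1.0, 2.0)]   # x-range, y-range, z-range

def face_flux(axis, side, n=20):
    """Midpoint-rule flux of F through the face where coordinate `axis` equals box[axis][side]."""
    sign = 1.0 if side == 1 else -1.0         # outward normal is +e_axis or -e_axis
    u_axis, v_axis = [k for k in range(3) if k != axis]
    (u0, u1), (v0, v1) = box[u_axis], box[v_axis]
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = [0.0, 0.0, 0.0]
            p[axis] = box[axis][side]
            p[u_axis] = u0 + (i + 0.5) * du
            p[v_axis] = v0 + (j + 0.5) * dv
            total += sign * F(*p)[axis] * du * dv
    return total

flux = sum(face_flux(axis, side) for axis in range(3) for side in (0, 1))
```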


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 40: Calculus in Hyperspace

Geometries
40.1. The four dimensional Euclidean space R4 = M (4, 1) is the space of column
vectors with four real components X = [x, y, z, w]T . If we think of such a vector as
a point, we also write X = (x, y, z, w). The dot product = inner product allows
as usual to define length $|X| = \sqrt{X \cdot X}$, the distance |X − Y| and the angles
cos(α) = (X · Y)/(|X||Y|) between vectors. The Cartesian coordinate system has
now four axes which are perpendicular to each other. Historically, as R4 is also the
space of quaternions, it is customary to label the coordinate directions as 1 = [1, 0, 0, 0], i =
[0, 1, 0, 0], j = [0, 0, 1, 0], k = [0, 0, 0, 1]. A vector [3, 4, 5, 1] for example is then written
also as 3 + 4i + 5j + k. We will however keep the vector form. We will come back to
why quaternions are natural in the last section of this document.
40.2. The kernel of the 1 × 4 matrix A = [a, b, c, d] defines the linear hyperplane
ax+by +cz +dw = 0. It is a 3-dimensional linear space. An example is the coordinate
hyperplane x = 0, which consists of all points {(0, y, z, w) , y, z, w ∈ R}. More
generally, the solution space ax + by + cz + dw = e is an affine hyperplane. The
kernel of a 2 × 4 matrix is in general, as an intersection of two hyperplanes, a 2-
dimensional plane, which we just call a plane. The kernel of a 3 × 4 matrix A is in
general a line. Geometrically, it is the intersection of three hyperplanes.
40.3. A symmetric 4×4 matrix B, a row vector A ∈ M(1, 4) and a constant e define the
hyper quadric X · BX + AX = e. For a diagonal matrix B = Diag(a, b, c, d), this gives
the quadric $ax^2 + by^2 + cz^2 + dw^2 = e$. Examples are the 3-sphere $x^2 + y^2 + z^2 + w^2 = 1$,
the hyper paraboloid $x^2 + y^2 + z^2 = w$, the 3-cylinder $x^2 + y^2 + z^2 = 1$ which is the
product of a 2-sphere and a line, or the cylinder-plane $x^2 + y^2 = 1$ which can be seen
as the product of the 1-sphere with a 2-plane. There are three types of hyperboloids like
$x^2 + y^2 + z^2 - w^2 = 1$, $x^2 + y^2 - z^2 - w^2 = 1$ or $x^2 - y^2 - z^2 - w^2 = 1$. One could call them
1-hyper-hyperboloids, 2-hyper-hyperboloids and 3-hyper-hyperboloids, using
the Morse index as a label. There is still the 1-hyperbolic-paraboloid $x^2 + y^2 - z^2 = w$
but there are more degenerate surfaces like $x^2 - y^2 = w$. The two-dimensional torus
T2 can be realized here as a quadratic surface. It is the intersection of $x^2 + y^2 = 1$,
$z^2 + w^2 = 1$. This is the flat torus. We can not realize the two-dimensional torus
in a flat way in our three dimensional space R3. In hyperspace, we can. There is also
a three dimensional torus T3. To get a parametrization, start with the 2-torus
parametrization r(φ, θ) = [(3 + cos(φ)) cos(θ), (3 + cos(φ)) sin(θ), sin(φ)] then expand
the circle to get a hyper-torus r(φ, θ, ψ) = [(3 + cos(φ)) cos(θ), (3 + cos(φ)) sin(θ),
(3 + sin(φ)) cos(ψ), (3 + sin(φ)) sin(ψ)]^T. You see that for every fixed ψ we have a
2-torus. We can compute 2|dr| = 18 + 6 cos(φ) + 6 sin(φ) + sin(2φ) which is always
positive and so verifies that the map from T3 to R4 is locally injective. We can also
easily check that if ψ or θ is fixed we get a translated scaled version of the 2-torus. If
φ is fixed, we get the flat 2-torus mentioned above.
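
The distortion factor of the hyper-torus can be checked numerically. Since the three tangent vectors are mutually orthogonal, one expects |dr| = (3 + cos(φ))(3 + sin(φ)) = (18 + 6 cos(φ) + 6 sin(φ) + sin(2φ))/2. The sketch below approximates the Jacobian by central differences (the step size h is an assumption of this example) and computes |dr| as the square root of det(dr^T dr).

```python
import math

def r(phi, theta, psi):
    return [(3 + math.cos(phi)) * math.cos(theta),
            (3 + math.cos(phi)) * math.sin(theta),
            (3 + math.sin(phi)) * math.cos(psi),
            (3 + math.sin(phi)) * math.sin(psi)]

def tangent_vectors(u, h=1e-6):
    """Central-difference approximations of r_phi, r_theta, r_psi at u."""
    vectors = []
    for k in range(3):
        up, um = list(u), list(u)
        up[k] += h
        um[k] -= h
        vectors.append([(a - b) / (2*h) for a, b in zip(r(*up), r(*um))])
    return vectors

def det3(m):
    return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def distortion(u):
    """|dr| = sqrt(det(dr^T dr)), computed from the Gram matrix of the tangent vectors."""
    t = tangent_vectors(u)
    gram = [[sum(a*b for a, b in zip(ti, tj)) for tj in t] for ti in t]
    return math.sqrt(det3(gram))

def closed_form(phi):
    return (18 + 6*math.cos(phi) + 6*math.sin(phi) + math.sin(2*phi)) / 2

u = (0.7, 1.3, 2.1)   # an arbitrary test point (phi, theta, psi)
```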
40.4. In single variable calculus, one looks at graphs {(x, y) | y = f (x)} of functions of
one variable. In multi-variable, one adds graphs {(x, y, z) | z = f (x, y)} of functions of
two variables. The graph of a function w = f (x, y, z) is now a 3-dimensional space.
Paraboloids like $w = x^2 + y^2 + z^2$ or $w = x^2 + y^2 - z^2$ are graphs. Another example is
the three dimensional bell hyper-surface $w = f(x, y, z) = \pi^{-3/2} e^{-x^2 - y^2 - z^2}$, where
the constant has been chosen so that the hyper-volume 0 ≤ w ≤ f (x, y, z) is equal
to 1. For obvious reasons, we usually do not draw the graph of a function of three
variables as we would have to draw in 4 dimensions. Now, in hyperspace, we can do
that.
40.5. Spaces can be parametrized in the same way as we parametrized curves or sur-
faces in three dimensions. A curve is defined by four real functions x(t), y(t), z(t), w(t)
of one variable and written as r(t) = [x(t), y(t), z(t), w(t)]^T. A surface is parametrized
by r(u, v) = [x(u, v), y(u, v), z(u, v), w(u, v)]. A hypersurface is now defined by
r(u, v, t) = [x(u, v, t), y(u, v, t), z(u, v, t), w(u, v, t)].
40.6. A coordinate change is defined by a map from R4 to R4 given by four
differentiable functions: r(u, v, s, t) = [x(u, v, s, t), y(u, v, s, t), z(u, v, s, t), w(u, v, s, t)]. We
have seen already the parametrization r(φ, θ1, θ2) = [cos(φ) cos(θ1), cos(φ) sin(θ1),
sin(φ) cos(θ2), sin(φ) sin(θ2)] of the unit 3-sphere = hyper-sphere $x^2 + y^2 + z^2 + w^2 = 1$.
Because $x^2 + y^2 + z^2 = 1$ is a cylinder, there is also a natural cylindrical coordinate
system in four dimensions. It is given by r(ρ, φ, θ, w) = [ρ sin(φ) cos(θ), ρ sin(φ) sin(θ),
ρ cos(φ), w]. If we write down the Jacobian matrix and compute the determinant we
get $\rho^2 \sin(\varphi)$ as in spherical coordinates.
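
The Jacobian determinant of the four-dimensional cylindrical coordinates can be verified numerically. This is a sketch that approximates the 4 × 4 Jacobian by central differences (step size h assumed here) and evaluates the determinant by Laplace expansion; the test point is arbitrary.

```python
import math

def r(rho, phi, theta, w):
    return [rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi),
            w]

def jacobian(u, h=1e-6):
    """4x4 matrix whose column k is the central-difference derivative of r in variable k."""
    cols = []
    for k in range(4):
        up, um = list(u), list(u)
        up[k] += h
        um[k] -= h
        cols.append([(a - b) / (2*h) for a, b in zip(r(*up), r(*um))])
    return [[cols[k][i] for k in range(4)] for i in range(4)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1)**j * m[0][j] * det([row[:j] + row[j+1:] for row in m[1:]])
               for j in range(len(m)))

u = (2.0, 0.9, 0.4, 5.0)              # arbitrary point (rho, phi, theta, w)
numeric = det(jacobian(u))
exact = u[0]**2 * math.sin(u[1])      # rho^2 sin(phi)
```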

Fields
40.7. A scalar function f (x, y, z, w) is also called a 0-form. A vector field is denoted
by F = [P, Q, R, S]T and a 1-form F = [P, Q, R, S] is written as F = P dx + Qdy +
Rdz + Sdw. A 2-form F has 6 components: F = Adxdy + Bdxdz + Cdxdw +
P dydz + Qdydw + Rdzdw. A 3-form again has four components P dydzdw + Qdxdzdw +
Rdxdydw +Sdxdydz and a 4-form is again completely determined by a scalar function
f because F = f dxdydzdw.
40.8. The exterior derivatives are computed by using the anti-commutation rule
like dxdy = −dydx and df = fx dx + fy dy + fz dz + fw dw and extending this to terms
like d(P dydz) = dP dydz = (Px dx + Py dy + Pz dz + Pw dw)dydz = Px dxdydz + Pw dwdydz.
For a 1 form F = P dx + Qdy + Rdz + Sdw we have
dF = Px dxdx + Py dydx + Pz dzdx + Pw dwdx +Qx dxdy + Qy dydy + Qz dzdy + Qw dwdy
+Rx dxdz + Ry dydz + Rz dzdz + Rw dwdz +Sx dxdw + Sy dydw + Sz dzdw + Sw dwdw
which simplifies to an expression with 6 terms. We have ddF = 0 because every term
like Pyz dzdydx is paired with a term like Pzy dydzdx which cancel. For a 2-form
F = Adxdy + Bdxdz + Cdwdx + P dydz + Qdydw + Rdzdw, we have dF = (Az dz +
Aw dw)dxdy+(By dy+Bw dw)dxdz+(Cy dy+Cz dz)dwdx+(Px dx+Pw dw)dydz+(Qx dx+
Qz dz)dydw + (Rx dx + Ry dy)dzdw which simplifies to (Qz + Pw + Ry )dydzdw + (Bw +
Cz + Rx )dxdzdw + (Aw + Qx + Cy )dxdydw + (Az + By + Px )dxdydz. For a 3-form
F = P dydzdw + Qdzdwdx + Rdwdxdy + Sdxdydz we have dF = (Px + Qy + Rz +
Sw )dxdydzdw.

40.9. The gradient of a function f(x, y, z, w) is defined as ∇f(x, y, z, w) = df^T =
[fx, fy, fz, fw]^T. The curl of a vector field F(x, y, z, w) = [F1, F2, F3, F4]^T is the hyper-field
dF = [F12, F13, F14, F23, F24, F34]^T, where we have just chosen a lexicographic order
and where Fij = ∂xj Fi − ∂xi Fj. The hypercurl of a hyper vector field F(x, y, z, w) =
[F12, F13, F14, F23, F24, F34] is a 3-form but can again be associated with a vector field
dF = [F234, F134, F124, F123]^T. The divergence of a vector field F = [P, Q, R, S] is a
4-form (Px + Qy + Rz + Sw)dxdydzdw but can again be associated with a scalar field.

40.10. Here are some properties which we have seen already. The gradient ∇f = df T is
perpendicular to the level surface f (x, y, z, w) = c. The curl of the gradient is zero. The
hypercurl of the curl is zero. The divergence of the hypercurl is zero. The divergence
of the gradient is the Laplacian (using the identifications, the divergence map can be
identified with the adjoint −d∗). The chain rule is d/dt f(r(t)) = ∇f(r(t)) · r′(t).

40.11. The line integral of a vector field F along a curve C is $\int_C F(r(t)) \cdot r'(t)\, dt$.
The flux integral of a vector field F along a 2-dimensional surface is a flux integral.
The hyper flux integral of a hyper-field F along a surface . The hyper volume
integral of a function f on a solid G is $\iiiint_G f(x, y, z, w)\, dxdydzdw$.

Theorems
40.12. The fundamental theorem of line integrals is
Theorem: $\int_a^b \nabla f(r(t)) \cdot r'(t)\, dt = f(r(b)) - f(r(a))$.

40.13. The Stokes theorem tells that for a surface S with boundary C and a 1-form F
Theorem: $\iint_S \mathrm{curl}(F) \cdot dS = \int_C F \cdot dr$

40.14. The Hyper Stokes theorem assures that for a hypersurface G with boundary
surface S and a 2-form F, the flux of the hypercurl of F through G (a 3D-integral) is
the flux of F through the boundary surface S (a 2D-integral)
Theorem: $\iiint_G \mathrm{hypercurl}(F) \cdot dG = \iint_S F \cdot dS$

40.15. The divergence theorem assures that for a 3-form (identified as a vector field
F ) and a solid G with boundary hyper-surface S, we have
Theorem: $\iiiint_G \mathrm{div}(F)\, dV = \iiint_S F \cdot dS$.

Quaternions
40.16. Hyperspace R4 is special: it is the only Euclidean space for which the unit
sphere is a non-Abelian Lie group. A Lie group G is a manifold r(Rm ) ⊂ Rn 1
on which one has a group operation x ∗ y which has the property that for every y,
the maps x → x ∗ y and x → y ∗ x are smooth maps on G. To have a group (G, ∗)
we must have the property that (x ∗ y) ∗ z = x ∗ (y ∗ z) and that there is a 1-element
1∗x = x∗1 = x such that every element x has an inverse x−1 satisfying x∗x−1 = 1. The
circle {x2 + y 2 = 1} = {z ∈ C||z| = 1} is an example of a group. This multiplication is
Abelian if x ∗ y = y ∗ x for all x, y ∈ G. The complex plane C = R2 is characterized
as the only Euclidean space Rn in which the unit sphere T1 = {|x| = 1} is an Abelian
Lie group. Why Lie groups? They are the dough, elementary particles are baked
from! Electromagnetism is built from T1 for example.
40.17. One can write a vector in R4 also as v = a+ib+jc+kd where i, j, k are symbols.
Hamilton noticed that when defining i2 = j 2 = k 2 = ijk = −1, the 4-dimensional space
becomes an algebra. An algebra is a linear space which also features a multiplication.
Now one has already M (2, 2), the space of 2 × 2 matrices, which is a 4-dimensional
algebra, but the algebra which Hamilton found is a division algebra: every non-zero
element can be inverted. This is not the case for M (2, 2). The matrix in which all
elements are 1 for example is non zero but it is also not invertible.
40.18. The algebra which Hamilton defined through the relations $i^2 = j^2 = k^2 = ijk = -1$
is called the quaternion algebra H. If v = a + ib + jc + kd and $\bar{v} = a - ib - jc - kd$
is its conjugate, then $|v|^2 = v \cdot v = v\bar{v}$,
where the right hand side is a quaternion multiplication. One can readily check that
|vw| = |v||w|. The reason is that quaternions v can be realized as complex 2 × 2
matrices: if $A(v) = \begin{bmatrix} a + ib & c + id \\ -c + id & a - ib \end{bmatrix}$, then $|v|^2 = \det(A(v))$ and A(v)A(w) = A(vw).
Your favorite AI helps to check this last identity quickly.

Needs["Quaternions`"];
A[{x_, y_, z_, w_}] := {{x + I*y, z + I*w}, {-z + I*w, x - I*y}};
Q = Quaternion[a, b, c, d] ** Quaternion[p, q, r, s];
Simplify[A[{a, b, c, d}] . A[{p, q, r, s}] == A[Table[Q[[k]], {k, 4}]]]
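
The same identity can be tested without a computer algebra system. The sketch below is a plain Python re-implementation, not the Quaternions package used above: `qmul` is the Hamilton product written out from the relations i^2 = j^2 = k^2 = ijk = −1, and `A` is the complex 2 × 2 matrix of 40.18. The sample quaternions are arbitrary integers so that all arithmetic is exact.

```python
def qmul(v, w):
    """Hamilton product of quaternions given as tuples (a, b, c, d) = a + ib + jc + kd."""
    a1, b1, c1, d1 = v
    a2, b2, c2, d2 = w
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def A(v):
    """Complex 2x2 matrix representing the quaternion v."""
    a, b, c, d = v
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def matmul(M, N):
    """Product of two 2x2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def norm_sq(v):
    return sum(x*x for x in v)

v = (1, 2, 3, 4)
w = (5, 6, 7, 8)
vw = qmul(v, w)
```

For these inputs `matmul(A(v), A(w))` equals `A(vw)`, the determinant of `A(v)` is the squared length of v, and the squared lengths multiply.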

40.19. An algebra with the property |v ∗ w| = |v||w| is a normed division algebra.

By theorems of Hurwitz and Frobenius, there are only four: the reals R, the complex
C, the quaternions H and the octonions O. For an associative division algebra, the
unit sphere is a Lie group. Because the unit sphere of R has only two points, the
1-circle {|z| = 1} ⊂ C and the unit 3-sphere {|z| = 1} ⊂ H are the only spheres that
are Lie groups. There is a unique non-commutative one, the 3-sphere and a unique
commutative one, the 1-sphere.
Theorem: H is the only non-Abelian associative normed division algebra.


1 Manifolds can be described abstractly, but a theorem of John Nash assures that every manifold
can be embedded in some Rn. So, looking at images of maps r is no loss of generality!
LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 41: Keywords for Final (see also Units 13+28)

Discrete Calculus
G = (V, E) graph with vertex set V and edge set E.
0-form: function on V . Discrete scalar function
1-form: function on E. Discrete vector field
2-form: function on triangles T .
d(f) = grad(f) is a function on edges a → b defined by f(b) − f(a).
H = dF = curl(F ) is a function on triangles obtained by summing F along the triangle.
d∗ H is a function on edges. Add up the attached triangle values.
d∗ F is a function on vertices. Add up the attached edge values.

New People
Cartan, Maxwell, Stokes, Green, Gauss, Newton, Kirchhoff, Menger, Koch,
Escher, Peirce

Partial Derivatives
L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ) linear approximation
Q(x, y) = L(x, y) + fxx (x − x0)^2/2 + fyy (y − y0)^2/2 + fxy (x − x0)(y − y0) quadratic approximation
use L(x, y) to estimate f(x, y) near (x0, y0). The result is f(x0, y0) + a(x − x0) + b(y − y0)
tangent plane: ax + by + cz = d with a = fx , b = fy , c = fz , d = ax0 + by0 + cz0
estimate f (x, y) by L(x, y) or Q(x, y) near (x0 , y0 )
fxy = fyx Clairaut’s theorem for functions which are in C 2 .
ru (u, v), rv (u, v) tangent to surface parameterized by r(u, v)

Parametrization
r : G ⊂ Rm → Rn , dr Jacobian
$g = dr^T dr$ first fundamental form, $|dr| = \sqrt{\det(g)}$ distortion factor.
curl(F )(r(u, v)) · (ru × rv ) = Fu · rv − Fv · ru important formula

Partial Differential Equations

fxy = fyx Clairaut
ft = fxx heat equation
ftt − fxx = 0 wave equation
fx − ft = 0 transport equation
fxx + fyy = 0 Laplace equation

ft + f fx = fxx Burgers equation

d∗F = j, dF = 0, Maxwell equations
div(F ) = 4πσ, Gravity equation

Gradient
∇f (x, y) = [fx , fy ]T , ∇f (x, y, z) = [fx , fy , fz ]T , gradient
Dv f = ∇f · v directional derivative
$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$ chain rule
∇f (x0 , y0 ) is orthogonal to the level curve f (x, y) = c containing (x0 , y0 )
∇f (x0 , y0 , z0 ) is orthogonal to the level surface f (x, y, z) = c containing (x0 , y0 , z0 )
$\frac{d}{dt} f(x + tv) = D_v f$ by chain rule
(x − x0 )fx (x0 , y0 ) + (y − y0 )fy (x0 , y0 ) = 0 tangent line
(x − x0 )fx (x0 , y0 , z0 ) + (y − y0 )fy (x0 , y0 , z0 ) + (z − z0 )fz (x0 , y0 , z0 ) = 0 tangent plane
Dv f (x0 , y0 ) is maximal in the v = ∇f (x0 , y0 )/|∇f (x0 , y0 )| direction
f (x, y) increases in the ∇f /|∇f | direction at points which are not critical points
if Dv f (x) = 0 for all v, then ∇f (x) = 0
f(x, y, z) = c defines z = g(x, y), and gx(x, y) = −fx(x, y, z)/fz(x, y, z) implicit diff

Extrema
∇f (x, y) = [0, 0]T , critical point
$D = \det(d^2 f) = f_{xx} f_{yy} - f_{xy}^2$ discriminant.
Morse: critical point and D ≠ 0, in 2D looks like $x^2 + y^2$, $x^2 - y^2$, $-x^2 - y^2$
f (x0 , y0 ) ≥ f (x, y) in a neighborhood of (x0 , y0 ) local maximum
f (x0 , y0 ) ≤ f (x, y) in a neighborhood of (x0 , y0 ) local minimum
∇f (x, y) = λ∇g(x, y), g(x, y) = c, λ Lagrange equations
∇f (x, y, z) = λ∇g(x, y, z), g(x, y, z) = c, λ Lagrange equations
second derivative test: ∇f = (0, 0), D > 0, fxx < 0 local max, ∇f = (0, 0), D >
0, fxx > 0 local min, ∇f = (0, 0), D < 0 saddle point
f (x0 , y0 ) ≥ f (x, y) everywhere, global maximum
f (x0 , y0 ) ≤ f (x, y) everywhere, global minimum
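
The second derivative test is a short decision rule and can be written down as a function. This is a sketch with a hypothetical helper name `classify`; it expects the second partial derivatives evaluated at a point where ∇f = (0, 0) and reports "inconclusive" when D = 0, where the test gives no information.

```python
def classify(fxx, fyy, fxy):
    """Second derivative test at a critical point of f(x, y)."""
    D = fxx * fyy - fxy**2          # discriminant
    if D > 0 and fxx < 0:
        return "local max"
    if D > 0 and fxx > 0:
        return "local min"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# The three Morse model cases at (0, 0):
# x^2 + y^2, -x^2 - y^2 and x^2 - y^2
results = [classify(2, 2, 0), classify(-2, -2, 0), classify(2, -2, 0)]
```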

Double Integrals
$\iint_R f(x, y)\, dydx$ double integral
$\int_a^b \int_c^d f(x, y)\, dydx$ integral over rectangle
$\int_a^b \int_{c(x)}^{d(x)} f(x, y)\, dydx$ bottom-top region
$\int_c^d \int_{a(y)}^{b(y)} f(x, y)\, dxdy$ left-right region
$\iint_R f(r, \theta)\, r\, drd\theta$ polar coordinates
$\iint_R |r_u \times r_v|\, dudv$ surface area
$\int_a^b \int_c^d f(x, y)\, dydx = \int_c^d \int_a^b f(x, y)\, dxdy$ Fubini
$\iint_R 1\, dxdy$ area of region R
$\iint_R f(x, y)\, dxdy$ signed volume of solid bound by graph of f and xy-plane

Triple Integrals
$\iiint_R f(x, y, z)\, dzdydx$ triple integral
$\int_a^b \int_c^d \int_u^v f(x, y, z)\, dzdydx$ integral over rectangular box
$\int_a^b \int_{g_1(x)}^{g_2(x)} \int_{h_1(x,y)}^{h_2(x,y)} f(x, y, z)\, dzdydx$ type I region
$\iiint_R f(r, \theta, z)\, r\, dzdrd\theta$ integral in cylindrical coordinates
$\iiint_R f(\rho, \theta, \varphi)\, \rho^2 \sin(\varphi)\, d\rho d\varphi d\theta$ integral in spherical coordinates
$\int_a^b \int_c^d \int_u^v f(x, y, z)\, dzdydx = \int_u^v \int_c^d \int_a^b f(x, y, z)\, dxdydz$ Fubini
$V = \iiint_E 1\, dzdydx$ volume of solid E
$M = \iiint_E \sigma(x, y, z)\, dxdydz$ mass of solid E with density σ

Line Integrals
F (x, y) = [P (x, y), Q(x, y)]T vector field in the plane
F (x, y, z) = [P (x, y, z), Q(x, y, z), R(x, y, z)]T vector field in space
$\int_C F \cdot dr = \int_a^b F(r(t)) \cdot r'(t)\, dt$ line integral
F (x, y) = ∇f (x, y) gradient field = potential field = conservative field

Fundamental theorem of line integrals

FTL: F(x, y) = ∇f(x, y), $\int_a^b F(r(t)) \cdot r'(t)\, dt = f(r(b)) - f(r(a))$
closed loop property $\int_C F \cdot dr = 0$, for all closed curves C
always equivalent: closed loop property, path independence and gradient field
mixed derivative test curl(F) ≠ 0 assures F is not a gradient field
in simply connected regions: curl(F ) = 0 implies that field F is conservative
Conservative field: can not be used for perpetual motion.

Green’s Theorem
F(x, y) = [P, Q]^T, curl in two dimensions: curl(F) = Qx − Py
Green’s theorem: C boundary of R, then $\int_C F \cdot dr = \iint_R \mathrm{curl}(F)\, dxdy$
Area computation: Take F with curl(F) = Qx − Py = 1 like F = [−y, 0]^T or F = [0, x]^T
Green’s theorem is useful to compute difficult line integrals or difficult 2D integrals

Flux integrals
F (x, y, z) vector field, S = r(R) parametrized surface
ru × rv dudv = dS 2-form on surface
$\iint_S F \cdot dS = \iint_R F(r(u, v)) \cdot (r_u \times r_v)\, dudv$ flux integral

Stokes Theorem
F(x, y, z) = [P, Q, R]^T, curl([P, Q, R]^T) = [Ry − Qz, Pz − Rx, Qx − Py]^T = ∇ × F
Stokes’s theorem: C boundary of surface S, then $\int_C F \cdot dr = \iint_S \mathrm{curl}(F) \cdot dS$
Stokes theorem allows to compute difficult flux integrals or difficult line integrals

Grad Curl Div

∇ = [∂x , ∂y , ∂z ]T , F = ∇f , curl(F ) = ∇ × F , div(F ) = ∇ · F
div(curl(F )) = 0 and curl(grad(f )) = 0

div(grad(f )) = ∆f Laplacian
incompressible = divergence free field: div(F ) = 0 everywhere. Implies F = curl(H)
irrotational = curl(F ) = 0 everywhere. Implies F = grad(f )

Divergence Theorem
div([P, Q, R]^T) = Px + Qy + Rz = ∇ · F
divergence theorem: solid E, boundary S then $\iint_S F \cdot dS = \iiint_E \mathrm{div}(F)\, dV$
the divergence theorem allows to compute difficult flux integrals or difficult 3D integrals

Some topology
simply connected region D: can deform any closed curve within D to a point
interior of a region D: points in D for which small neighborhood is still in D
boundary of curve C: the end points of the curve
boundary of surface S: points on the surface not in the interior of the parameter domain
boundary of solid G: points in G which are not in the interior of G
closed surface: a surface without boundary like a sphere
closed curve: a curve with no boundary like a knot

Some surface parameterizations

sphere of radius ρ: r(u, v) = [ρ cos(u) sin(v), ρ sin(u) sin(v), ρ cos(v)]T
graph of function f(x, y): r(u, v) = [u, v, f(u, v)]^T
example: Paraboloid: r(u, v) = [u, v, u2 + v 2 ]T .
plane containing P and vectors u, v: r(s, t) = P + su + tv
surface of revolution: distance g(z) from the z-axis: r(u, v) = [g(v) cos(u), g(v) sin(u), v]^T
example: Cylinder: r(u, v) = [cos(u), sin(u), v]^T
example: Cone: r(u, v) = [v cos(u), v sin(u), v]^T
example: Paraboloid: $r(u, v) = [\sqrt{v} \cos(u), \sqrt{v} \sin(u), v]^T$

Integration for integral theorems

Double and triple integral: $\iint_G f(x, y)\, dA$, $\iiint_G f(x, y, z)\, dV$.
Line integral: $\int_a^b F(r(t)) \cdot r'(t)\, dt$
Flux integral: $\iint_R F(r(u, v)) \cdot (r_u \times r_v)\, dudv$

Differential forms
A k-form is a field, which attaches at every point a multi-linear anti-symmetric map of
k variables.
F = 5x3 dydz + 7 sin(y)xdxdz + 3 cos(xy)dxdy is an example of a 2-form. In calculus
this is identified with a vector field F = [5x3 , 7 sin(y)x, 3 cos(xy)].
The exterior derivative of a term like F = P dxdy is dF = (Px dx + Py dy + Pz dz)dxdy =
Pz dzdxdy = Pz dxdydz.
The general Stokes theorem tells $\int_G dF = \int_{\delta G} F$, where δG is the boundary of G.


Name:

LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Total:

Unit 41: Final Exam Practice

Problems

Problem 41P.1) (10 points):

On the graph G in Figure 1 we are given a 1-form F on a graph
G = (V, E).
a) (3 points) Write the values of the curl dF . As a 2-form it is a function
on the set T of triangles.
b) (3 points) Compute the “discrete divergence” d∗ F , which is a 0-form,
a function on the vertices.
c) (4 points) Find the value of the Laplacian d∗ dF + dd∗ F and enter the
values near the edges in Figure 2.

Figure 1. A graph with a 1-form F. Enter here the result for a) and b).

Figure 2. Enter here the result for c).


Problem 41P.2) (10 points) Each question is one point:

a) Who formulated the law of gravity in the form the partial differential
equation div(F ) = 4πσ?
b) The expression 5x dxdzdx + 77 dydzdy + 3 dxdy + 6 dydx simplifies to ....
c) What value is $\iint_S [x, y, z] \cdot dS$ if S is the unit sphere oriented outwards?
d) What is the distance between the point (0, 0, 3) and the xy-plane?
e) Is it true that if |r′(t)| = 1 everywhere, then r″(t) is perpendicular to
the velocity r′(t)?
f) What is the distortion factor |dr| for the change of coordinates r(u, v) =
[−2v, 3u]?
g) If r(u, v) parametrizes a surface in R3 , is it true that ru × (ru × rv )
tangent to the surface?
h) Yes or no: if (0, 0, 0) is a maximum of f (x, y, z) then fxx (0, 0, 0) < 0.
i) Write down the quadratic approximation of 1 + x + y + sin(x2 − y 2 )?
j) If S : f (x, y, z) = x2 + y 2 + z 2 = 1 is oriented outwards, then the flux
of ∇f through S is either negative, zero or positive. Which of the three
cases is it?

Problem 41P.3) (10 points) Each problem is 1 point:

a) Which of the triangles in Figure 3 is integrated over in
$\int_0^1 \int_y^1 f(x, y)\,dx\,dy$?
b) We have seen a counterexample for Clairaut’s theorem. This function
f(x, y) was in $C^k$ but not in $C^{k+1}$. The integer k indicated how many
times we could differentiate f continuously. What was the k?
c) To what group of partial differential equations belongs $\mathrm{div}(E) = 4\pi j + E_t$?
d) Write down the Cauchy-Schwarz inequality.
e) Let G be the first stage of the Menger sponge (with 20 cubes from 27
cubes present). Is it simply connected?
f) Take the exterior derivative of the differential form $F = \sin(xz)\,dxdy$.
g) Parametrize the surface $x = z^2 - y^3$.
h) Parametrize the curve obtained by intersecting the ellipsoid
$x^2/4 + y^2 + z^2/9 = 1$ with the plane y = 0.
i) What surface is given in spherical coordinates as sin(φ) cos(θ) = cos(φ)?
j) Write down the general formula for the area of a triangle with vertices
(0, 0, 0), (a, b, c), (u, v, w).

Problem 41P.4) (10 points):

a) (6 points) Find the equation of the plane which contains the line r(t) =
[1+t, 2+t, 3−t] and which is perpendicular to the plane Σ : x+2y −z = 4.
b) (4 points) What is the angle between the normal vectors of Σ and the
plane you just found?

Figure 3. Four triangles, labeled A, B, C, D.

Problem 41P.5) (10 points):

a) (8 points) Find the critical points of the function $f(x, y) = \cos(x) + y^5 - 5y$
and classify them using the second derivative test. You can assume
that 0 ≤ x < 2π.
b) (2 points) Does the function f have a global maximum or a global
minimum?

Problem 41P.6) (10 points):

a) (5 points) Use the Lagrange method to find the maximum of $f(x, y) = y^2 - x$
under the constraint $g(x, y) = x + x^3 - y^2 = 2$.
b) (5 points) The Lagrange equations fail to find the maximum of $f(x, y) = y^2 - x$
under the constraint $g(x, y) = x^3 - y^2 = 0$. The Lagrange
theorem still allows you to find the maximum. How?

Problem 41P.7) (10 points):

a) (6 points) Find the tangent plane at the point P = (4, 2, 1, 1) of the
surface $x^2 - 2y^2 + z^3 + w^2 = 2$.
b) (4 points) Parametrize the line r(t) which passes through P and
is perpendicular to the hyper surface at that point. Then find (r(1) +
r(−1))/2.

Problem 41P.8) (10 points):

a) Estimate f (0.012, 0.023) for f (x, y) = log(1 + x + 3xy) using linear
approximation.
b) Estimate f (0.012, 0.023) for f (x, y) = log(1 + x + 3xy) using quadratic
approximation.
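The two estimates can be checked with a short sketch. It assumes that log is the natural logarithm and that the approximation point is (0, 0); the derivative values in the comments were computed by hand for this illustration and are not taken from the notes.

```python
import math

def f(x, y):
    return math.log(1 + x + 3 * x * y)

# Hand-computed derivatives of f at (0, 0):
# f = 0, f_x = 1, f_y = 0, f_xx = -1, f_xy = 3, f_yy = 0.
f0, fx, fy = 0.0, 1.0, 0.0
fxx, fxy, fyy = -1.0, 3.0, 0.0

def linear(x, y):
    """L(x, y) = f(0,0) + f_x x + f_y y."""
    return f0 + fx * x + fy * y

def quadratic(x, y):
    """Q(x, y) = L(x, y) + (f_xx x^2 + 2 f_xy x y + f_yy y^2) / 2."""
    return linear(x, y) + (fxx * x * x + 2 * fxy * x * y + fyy * y * y) / 2

x, y = 0.012, 0.023
L = linear(x, y)        # 0.012
Q = quadratic(x, y)     # 0.012 - 0.000072 + 0.000828 = 0.012756
exact = f(x, y)         # value from the math library, for comparison
```

Comparing `L` and `Q` with `exact` shows how much the second-order terms improve the estimate.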

Problem 41P.9) (10 points):

a) Let's look at the curve which satisfies the acceleration
$r''(t) = [-2\cos(t), -2\sin(t), -2\cos(t), -2\sin(t)]$, has the initial position [2, 0, 2, 0]
and initial velocity [0, 2, 0, 2]. Find r(t).
b) What is the curvature $|T'(t)|/|r'(t)|$ of r(t) at t = 0?

Problem 41P.10) (10 points):

a) Integrate the function $f(x, y) = x + x^2 - y^2$ over the region
$1 < x^2 + y^2 < 4$, $xy > 0$.
b) Find the surface area of
r(t, s) = [cos(t) sin(s), sin(t) sin(s), cos(s)]
0 ≤ t ≤ 2π, 0 ≤ s ≤ t/2.

Problem 41P.11) (10 points):

Let E be the solid
$x^2 + y^2 \ge z^2$, $x^2 + y^2 + z^2 \le 9$, $y \ge |x|$.
a) (7 points) Integrate
$\iiint_E x^2 + y^2 + z^2 \,dx\,dy\,dz$.
b) (3 points) Let F be a vector field
$F = [x^3, y^3, z^3]$
Find the flux of F through the boundary surface of E, oriented outwards.

Figure 4. The solid in Problem 10.

Problem 41P.12) (10 points):

What is the line integral of the force field
$F(x, y, z, w) = [1, 5y^4 + z, 6z^5 + y, 7w^6]^T + [y - w, 0, 0, 0]^T$
along the path $r(t) = [t^3, \sin(6t), \cos(8t), \sin(6t)]$
from t = 0 to t = 2π? Hint: we have written the field on purpose as the
sum of two vector fields.

Problem 41P.13) (10 points):

Find the area of the region $|x|^{2/5} + |y|^{2/5} \le 1$. Use an integral theorem.

Problem 41P.14) (10 points):

What is the flux of the vector field $F(x, y, z, w) = [x + \cos(y), y + z^2, 2z, 3w]$
through the boundary of the solid E : 1 ≤ x ≤ 3, 3 ≤ y ≤ 5, 0 ≤ z ≤
1, 4 ≤ w ≤ 8 oriented outwards?

Problem 41P.15) (10 points):

Find the flux of the curl of the vector field
$F(x, y, z) = [-z, z + \sin(xyz), x - 3]^T$
through the twisted surface seen in Figure 5, which is oriented inwards and
parametrized by
r(t, s) = [(3 + 2 cos(t)) cos(s), (3 + 2 cos(t)) sin(s), s + 2 sin(t)] ,
where 0 ≤ s ≤ 7π/2 and 0 ≤ t ≤ 2π.

Figure 5. The boundary of the surface is made of two circles r(t, 0)
and r(t, 7π/2). The picture gives the direction of the velocity vectors of
these curves (which in each case might or might not be compatible with
the orientation of the surface).


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Name:

Score boxes for problems 1 to 15. Total:

Welcome to the final exam. Please don’t get started yet. We start all together at 9:00
AM after getting reminded about some formalities. You can fill out the attendance
slip already. Also, you can already enter your name into the larger box above.

• You only need this booklet and something to write. Please stow away any other
material and any electronic devices. Remember the honor code.
• Please write neatly and give details. Except for problems 2 and 3 we want to see
details, even if the answer should be obvious to you.
• Try to answer the question on the same page. There is additional space on the
back of each page. If you must, use additional scratch paper at the end.
• If you finish a problem somewhere else, please indicate on the problem page where
we can find it.
• You have 180 minutes for this 3-hourly.

Unit 41: Final Exam

Problems

Problem 41E.1) (10 points):

The graph G = (V, E) in Figure 1 represents a discrete surface in which
all triangles are oriented counterclockwise. The values of a 1-form =
vector field F are given.

a) (2 points) Find the line integral of F along the boundary curve
oriented counterclockwise.
b) (2 points) Compute the curl H = dF and write its values into the
triangles.
c) (2 points) What is the sum of all curl values? Why does it agree with
the result in a)?
d) (2 points) Find also $g = d^* F$ and enter it near the vertices.
e) (1 point) True or False: $\sum_{x \in V} g(x) = 0$.
f) (1 point) True or False: we called $L = dd^*$ the Laplacian of G.

Figure 1. A discrete 2-dimensional region on which a 1-form F models
a vector field. You compute the curl dF and divergence $d^* F$ of F.

Problem 41E.2) (10 points) Each question is one point:

a) Name the 3-dimensional analogue of the Mandelbrot set.

b) If A is a 5 × 4 matrix, then $A^T$ is an m × n matrix. What are m
and n?

c) Write down the general formula for the arc length of a curve
$r(t) = [x(t), y(t), z(t)]^T$
with a ≤ t ≤ b.

d) Write down one possible formula for the curvature of a curve
$r(t) = [x(t), y(t), z(t)]^T$.

e) We have seen a parametrization of the 3-sphere invoking three angles
$\phi, \theta_1, \theta_2$. Either write down the parametrization or recall the name of the
mathematician after whom this parametrization is named.

f) The general change of variable formula for Φ : R → G is
$\iiint_R f(u, v, w)$ ........ $du\,dv\,dw = \iiint_G f(x, y, z)\,dx\,dy\,dz$.
Fill in the blank part of the formula.

g) What is the numerical value of log(−i)?

h) We have used the Fubini theorem to prove that $C^2$ functions
f(x, y) satisfy a partial differential equation. Please write down this
important partial differential equation as well as its name. (It was used
much later in the course.)

i) What is the integration factor |dr| for the parametrization
$r(u, v) = [a \cos(u) \sin(v), b \sin(u) \sin(v), c \cos(v)]^T$?

j) In the first lecture, we have defined $\sqrt{\mathrm{tr}(A^T A)}$ as the length of a matrix.
What is the length of the 3 × 3 matrix which contains 1 everywhere?
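The length $\sqrt{\mathrm{tr}(A^T A)}$ is the square root of the sum of all squared entries, since $\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}$. A small sketch, with the hypothetical helper names `trace_inner` and `length`; the second part assumes that the angle between two matrices is defined through the same inner product, as for vectors.

```python
import math

def trace_inner(A, B):
    """tr(A^T B): sum of entrywise products of two matrices of the same shape."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def length(A):
    """Length of a matrix: sqrt(tr(A^T A))."""
    return math.sqrt(trace_inner(A, A))

# 3 x 3 matrix with 1 everywhere: nine entries equal to 1.
ones3 = [[1, 1, 1] for _ in range(3)]
len_ones3 = length(ones3)

# Cosine of the angle between the 2 x 2 identity and the 2 x 2 all-ones matrix.
I2 = [[1, 0], [0, 1]]
J2 = [[1, 1], [1, 1]]
cos_angle = trace_inner(I2, J2) / (length(I2) * length(J2))
```

Here `len_ones3` is the square root of 9 and `cos_angle` is 2 divided by √2 · 2.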

Problem 41E.3) (10 points) Each problem is 1 point:

a) Assume that for a Morse function f(x, y) the discriminant D at
a critical point $(x_0, y_0)$ is positive and that $f_{yy}(x_0, y_0) < 0$. What can you
say about $f_{xx}(x_0, y_0)$?

b) We have proven the identity $|dr| = |r_u \times r_v|$, where r was a
map from $R^m$ to $R^n$. For which m and n was this identity defined?

c) Which of the following is the correct integration factor when using
spherical coordinates in 4 dimensions?
$|d\Phi| = r$
$|d\Phi| = (3 + \cos(\phi))$
$|d\Phi| = \rho^2 \sin(\phi)$
$|d\Phi| = \rho^3 \sin(2\phi)/2$

d) Which of the following vector fields are gradient fields? (It could be
none, one, two, three or all.)
$F = [x, 0]^T$
$F = [0, x]^T$
$F = [x, y]^T$
$F = [y, x]^T$

e) Which of the following four surfaces is a one-sheeted hyperboloid? (It
could be none, one, two, three or all.)
$x^2 + y^2 = z^2 - 1$
$x^2 - y^2 = 1 - z^2$
$x^2 + y^2 = 1 - z^2$
$x^2 - y^2 = z^2 + 1$

f) Parametrize the surface $x^2 + y^2 - z^2 = 1$ as
$r(\theta, z) = [.............., ................., ........]^T$.

g) Who was the creative person who discovered dark matter and proposed
the mechanism of gravitational lensing?

h) What is the cosine of the angle between the matrices A, B ∈ M(2, 2),
where A is the identity matrix and B is the matrix which has 1 everywhere?
You should get a concrete number.

i) We have seen the identity $|v|^2 + |w|^2 = |v - w|^2$, where v, w are
vectors in $R^n$. What conditions do v and w have to satisfy so that the
identity holds?

j) Compute the exterior derivative dF of the differential form
$F = e^x \sin(y)\,dxdy + \cos(xyz)\,dydz$.

Problem 41E.4) (10 points):

a) (4 points) Find the plane Σ which contains the three points
A = (3, 2, 1), B = (3, 3, 2), C = (4, 3, 1) .

b) (3 points) What is the area of the triangle ABC?

c) (3 points) Find the distance of the origin O = (0, 0, 0) to the
plane Σ.
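All three parts of the plane problem reduce to one cross product. The following sketch uses the points A, B, C from the problem; the helper names `sub`, `dot`, `cross` are introduced here for illustration only.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

A, B, C = (3, 2, 1), (3, 3, 2), (4, 3, 1)

n = cross(sub(B, A), sub(C, A))      # normal vector of the plane
d = dot(n, A)                        # plane equation: n . [x, y, z] = d
area = math.sqrt(dot(n, n)) / 2      # half the length of the cross product
distance = abs(d) / math.sqrt(dot(n, n))   # distance of the origin to the plane
```

With these points, `n` is (−1, 1, −1) and `d` is −2, so the plane can be written as x − y + z = 2.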

Problem 41E.5) (10 points):

a) (8 points) Find all the critical points of the function
$f(x, y) = x^5 - 5x + y^3 - 3y$
and classify these points using the second derivative test.

b) (2 points) Is any of these points a global maximum or global
minimum of f?

Problem 41E.6) (10 points):

a) (8 points) Use the Lagrange method to find all the maxima and all
the minima of
$f(x, y) = x^2 + y^2$
under the constraint
$g(x, y) = x^4 + y^4 = 16$.

b) (2 points) In our formulation of Lagrange theorem, we also mentioned
the case, where $\nabla g(x, y) = [0, 0]^T$. Why does this case not lead to a
critical point here?

Problem 41E.7) (10 points):

a) (5 points) The hyper surface
$S = \{f(x, y, z, w) = x^2 + y^2 + z^2 - w = 5\}$
defines a three-dimensional manifold in $R^4$. It is poetically called a
hyper-paraboloid. Find the tangent plane to S at the point (1, 2, 1, 1).

b) (5 points) What is the linear approximation L(x, y, z, w) of f(x, y, z, w)
at this point (1, 2, 1, 1)?

Problem 41E.8) (10 points):

Estimate the value f(0.1, −0.02) for
$f(x, y) = 3 + x^2 + y + \cos(x + y) + \sin(xy)$
using quadratic approximation.
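A quadratic approximation can also be built numerically, which gives a way to check a hand computation. The sketch below assumes the expansion point (0, 0) and replaces the exact partial derivatives by central finite differences with a step size `h` chosen for this illustration; it is a check, not the method asked for in the exam.

```python
import math

def f(x, y):
    return 3 + x**2 + y + math.cos(x + y) + math.sin(x * y)

def quadratic_approx(f, x0, y0, h=1e-3):
    """Return Q(x, y) built from central finite differences at (x0, y0)."""
    f0 = f(x0, y0)
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    fxx = (f(x0 + h, y0) - 2 * f0 + f(x0 - h, y0)) / h**2
    fyy = (f(x0, y0 + h) - 2 * f0 + f(x0, y0 - h)) / h**2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)

    def Q(x, y):
        dx, dy = x - x0, y - y0
        return (f0 + fx * dx + fy * dy
                + (fxx * dx**2 + 2 * fxy * dx * dy + fyy * dy**2) / 2)
    return Q

Q = quadratic_approx(f, 0.0, 0.0)
estimate = Q(0.1, -0.02)
exact = f(0.1, -0.02)
```

By hand one finds f(0, 0) = 4, $f_y = 1$, $f_{xx} = 1$, $f_{yy} = -1$ and the remaining first and mixed derivatives equal to 0 at the origin, so Q(x, y) = 4 + y + x²/2 − y²/2, which the numerical `estimate` reproduces up to the finite-difference error.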

Problem 41E.9) (10 points):

a) (8 points) We vacation in the 5-star hotel called MOTEL 22 in
5-dimensional space and play ping-pong there. The ball is accelerated
by gravity $r''(t) = [x''(t), y''(t), z''(t), v''(t), w''(t)]^T = [0, 0, 0, 0, -10]^T$. We
hit the ball at $r(0) = [4, 3, 2, 1, 2]^T$ and give it an initial velocity
$r'(0) = [5, 6, 0, 0, 3]^T$. Find the trajectory r(t).

b) (2 points) At which positive time t > 0 does the ping-pong ball
hit the hyper ping-pong table w = 0? (The points in this space are
labeled [x, y, z, v, w].)
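For constant acceleration, integrating twice gives r(t) = r(0) + r'(0) t + r''(0) t²/2 in every coordinate. A minimal sketch with the data of this problem (the variable names are chosen for the illustration):

```python
import math

r0 = [4, 3, 2, 1, 2]      # initial position r(0)
v0 = [5, 6, 0, 0, 3]      # initial velocity r'(0)
a = [0, 0, 0, 0, -10]     # constant acceleration r''(t)

def r(t):
    """r(t) = r0 + v0 t + a t^2 / 2, coordinate by coordinate."""
    return [p + v * t + 0.5 * acc * t * t for p, v, acc in zip(r0, v0, a)]

# The last coordinate is w(t) = 2 + 3t - 5t^2; solve w(t) = 0 for t > 0
# with the quadratic formula.
qa, qb, qc = 0.5 * a[4], v0[4], r0[4]
disc = qb * qb - 4 * qa * qc
roots = [(-qb + s * math.sqrt(disc)) / (2 * qa) for s in (1, -1)]
t_hit = max(roots)
```

The two roots are −0.4 and 1, so the positive hitting time is `t_hit = 1`, and `r(t_hit)` has last coordinate 0.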

Problem 41E.10) (10 points):

a) (5 points) Integrate the function $f(x, y) = (x^2 + y^2)^{22}$ over the region
$G = \{1 < x^2 + y^2 < 4, y > 0\}$.

b) (5 points) Find the area of the region enclosed by the curve
$r(t) = [\cos(t), \sin(t) + \cos(2t)]^T$,
with 0 ≤ t ≤ 2π.
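For part b), one possible route is Green's theorem with the field F = [0, x], whose curl is 1, so that the area equals the line integral of x dy along the boundary. The sketch below evaluates that integral numerically; the helper name `area_by_green` is hypothetical, and the derivative y'(t) = cos(t) − 2 sin(2t) was computed by hand.

```python
import math

def area_by_green(x, dy, a, b, n=20000):
    """Midpoint-rule approximation of the integral of x(t) y'(t) dt over [a, b]."""
    h = (b - a) / n
    return sum(x(a + (k + 0.5) * h) * dy(a + (k + 0.5) * h) for k in range(n)) * h

def x(t):
    return math.cos(t)

def dy(t):
    # derivative of y(t) = sin(t) + cos(2t)
    return math.cos(t) - 2 * math.sin(2 * t)

area = area_by_green(x, dy, 0.0, 2 * math.pi)
```

Analytically, the integral of cos²(t) over a full period is π and the integral of cos(t) sin(2t) vanishes, so the sketch should return a value close to π. This assumes the curve is simple and traversed counterclockwise.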

Problem 41E.11) (10 points):

a) (7 points) Integrate
$f(x, y, z) = x^2 + y^2 + z^2$
over the solid
$G = \{x^2 + y^2 + z^2 \le 4, z^2 < 1\}$.

b) (3 points) What is the volume of the same solid G?

Problem 41E.12) (10 points):

a) (8 points) Compute the line integral of the vector field
$F = [yzw + x^6, xzw + y^9, xyw - z^3, xyz + w^4]^T$
along the path
$r(t) = [t + \sin(t), \cos(2t), \sin(4t), \cos(7t)]^T$
from t = 0 to t = 2π.

b) (2 points) What is $\int_0^{2\pi} r'(t)\,dt$?

Problem 41E.13) (10 points):

a) (8 points) Find the line integral of the vector field
$F(x, y) = [3x - y, 7y + \sin(y^4)]^T$
along the polygon ABCDE with A = (0, 0), B = (2, 0), C = (2, 4), D =
(2, 6), E = (0, 4). The path is closed. It starts at A, then reaches
B, C, D, E until returning to A again.

b) (2 points) What is the line integral if the curve is traced in the
opposite direction?
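For the closed polygon, Green's theorem turns the line integral into the double integral of the curl $Q_x - P_y$ over the enclosed region. For F = [3x − y, 7y + sin(y⁴)] this curl is the constant 1, so the line integral equals the signed area of the polygon. A sketch using the shoelace formula (the helper name `shoelace_area` is an assumption of this example):

```python
def shoelace_area(points):
    """Signed area of a polygon; positive when the vertices run counterclockwise."""
    s = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        s += x0 * y1 - x1 * y0
    return s / 2

polygon = [(0, 0), (2, 0), (2, 4), (2, 6), (0, 4)]   # A, B, C, D, E
curl = 1   # Q_x - P_y = 0 - (-1) for F = [3x - y, 7y + sin(y^4)]

line_integral = curl * shoelace_area(polygon)
reversed_integral = curl * shoelace_area(polygon[::-1])
```

Reversing the direction of the path flips the sign of the signed area and therefore of the line integral.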

Problem 41E.14) (10 points):

a) (8 points) What is the flux of the vector field
$F(x, y, z) = [y + x^3, z + y^3, x + z^3]^T$
through the sphere $S = \{x^2 + y^2 + z^2 = 9\}$ oriented outwards?

b) (2 points) What is the flux of the same vector field F through
the same sphere S but where S is oriented inwards?
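By the divergence theorem, the outward flux equals the integral of div(F) = 3x² + 3y² + 3z² = 3ρ² over the ball of radius 3. Since the integrand is radial, the sketch below sums over spherical shells of area 4πρ²; the function name and the step count are choices made for this illustration.

```python
import math

def flux_through_sphere(radius, n=30000):
    """Integrate div F = 3 rho^2 over the ball, using shells of area 4 pi rho^2."""
    h = radius / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * h                      # midpoint of the k-th shell
        total += 3 * rho**2 * 4 * math.pi * rho**2 * h
    return total

outward = flux_through_sphere(3.0)
inward = -outward                                # reversing the orientation flips the sign
closed_form = 12 * math.pi * 3**5 / 5            # integral of 12 pi rho^4 from 0 to 3
```

The numerical value `outward` can be compared with `closed_form`, which is 2916π/5.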

Problem 41E.15) (10 points):

a) (7 points) What is the flux of the curl of the vector field
$F(x, y, z) = [-y, x + z(x^2 + y^5), z]^T$
through the surface
$S = \{x^2 + y^2 + z^2 + z(x^4 + y^4 + 2\sin(x - y^2 z)) = 1, z > 0\}$
oriented upwards?

b) (3 points) The surface in a) was not closed, it did not include
the bottom part
$D = \{z = 0, x^2 + y^2 \le 1\}$.
Assume now that we close the bottom and orient the bottom disc D
downwards. What is the flux of the curl of the same vector field F through
this closed surface obtained by taking the union of S and D?


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Name:

Score boxes for problems 1 to 15. Total:

Unit 41: Final Exam Review Lecture

Problems

Problem 41R.1) (10 points):

The graph G in Figure 1 represents a surface in which all triangles are
oriented counterclockwise. We are given a 1-form = vector field F on G.
a) (3 points) Find the line integral of F along the boundary curve oriented
counter clockwise.
b) (3 points) Enter the curl dF values in the triangles.
c) (2 points) What is the sum of all curl values?
d) (2 points) Why are the results in a) and c) the same?

Figure 1. A graph analogue to a 2-dimensional region with a 1-form
F which models a vector field.

Problem 41R.2) (10 points) Each question is one point:

a) Assume F is a 1-form on a graph G = (V, E). Define $f = d^* F$. Can
you say something about $\sum_{x \in V} f(x)$?
b) The volume $V(S_n)$ of the n-dimensional sphere $S_n$ has the property
that $V(S_n) \to$ ...
c) Who wrote the book “How to solve?” and who invented differential
forms?
d) What is $(1 + i)^i$?
e) Green was not only doing mathematics, he had another profession.
Which one? Stokes theorem appeared first in an exam problem. Who was
one of the pupils?
f) If B and C are row reductions of the same matrix A, what can you
say about the length |B − C|?
g) We cited a Harvard professor who invoked the anthropic principle to
exclude perpetual motion. Who was this? Also and unrelated: who first
found the formula for the volume of the sphere?
h) In the relief below in Figure 2, we see the level curves of some function
f. Is the function f a Morse function?
i) True or false? There is a non-zero function f(x) which can be differentiated
infinitely many times everywhere and which has the property that all
derivatives at 0 are 0.
j) What is the name of the lantern which approximates a cylinder but for
which the surface area explodes?

Figure 2. Contour map of some function f(x, y).

Problem 41R.3) (10 points) Each problem is 1 point:

a) What is the value of the Hessian determinant $D = \det(d^2 f(x))$ at a
critical point x of a Morse function f?
b) What is the curl of a vector field F which is conservative?
c) Take the unit sphere and drill a hole of radius 1/10 from the surface to
the center. Is the solid simply connected?
d) One of the three following expressions is not defined. F are vector fields
in $R^3$. Which one?
A) curl(curl(F)),
B) grad(div(F)),
C) div(div(F)).
e) Which of the following three vector fields cannot be the curl of another
vector field?
A) F = [x, y, z],
B) F = [y, z, x],
C) F = [z, x, y].
f) Which of the following vector fields is a gradient field?
A) F = [x, y, z],
B) F = [y, z, x],
C) F = [z, x, y]?
g) What is the exterior derivative dF if F = x dy + z dz + x dx?
h) Is it true that curl(F) is always perpendicular to F?
i) Is there a differentiable function which violates the Fubini theorem? Is
there a differentiable function which violates the Clairaut theorem?
j) What is the name of the partial differential equation $\mathrm{curl}(E) = -B_t$?

Problem 41R.4) (10 points):

a) Find the equation ax + by + cz = d of the plane which contains both
the line r(t) = [2 − t, t + 1, 3t] as well as the point P = (3, 5, 1).
b) What is the distance from P to the line?

Problem 41R.5) (10 points):

a) Find the critical points of the function $f(x, y) = xy + x^2 + 2x$ and
classify them using the second derivative test.
b) Does f have a global maximum or minimum?

Problem 41R.6) (10 points):

Use the Lagrange method to find the maximum of xyz under the constraint
x + y + z − yz = 1.

Problem 41R.7) (10 points):

The surface $f(x, y, z, w) = x^2 + y^2 + z^2 - w^2 = 1$ is called a hyper-hyperboloid.
a) Find the tangent plane at the point (1, 0, 1, 1).
b) The tangent plane has the form ax + by + cz + dw = e. Parametrize
this plane.

Problem 41R.8) (10 points):

Estimate the cube root of $1001 \cdot 999^2$ using a quadratic approximation of
$f(x, y) = (xy^2)^{1/3}$ at a suitable point.

Problem 41R.9) (10 points):

a) We live in 22-dimensional space and observe a planet moving
along a path r(t) experiencing the acceleration $r''(t) =$
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]. The planet is initially
at the point r(0) = [10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] and
has zero initial velocity. Where is it at time t = 1?
b) What is the curvature $|T'(0)|/|r'(0)|$ at t = 0?

Problem 41R.10) (10 points):

a) Integrate the function f(x, y) = y over the region given in polar
coordinates as 0 ≤ r ≤ θ.
b) Integrate the function f(x, y, z) = 3 + x + y over the solid given in
spherical coordinates as 0 ≤ ρ ≤ φ.

Problem 41R.11) (10 points):

a) What is the volume of the solid $G : x^4 + y^4 - 1 < z < 1 + 2x^2 + 2y^4$,
$|x|^2 < 1$, $|y|^2 < 1$?
b) What is the average height $\iiint_G z\,dV / \iiint_G 1\,dV$?

Problem 41R.12) (10 points):

Compute the line integral of the field $F = [y^2 + x, y^5, z^3]$ along the path
r(t) = [t + sin(t), 0, sin(4t)] from t = 0 to t = π.

Problem 41R.13) (10 points):

Find the line integral of the vector field $F(x, y) = [x^{10} + y, y + \sin(\sin(y))]$
along the triangle C = ABC with A = (0, 0), B = (2, 0), C = (0, 1) in the
order A → B → C → A.

Problem 41R.14) (10 points):

What is the flux of the vector field $F(x, y, z, w) = [x^3, y^3, z^3, w^3]$ through
the sphere $x^2 + y^2 + z^2 + w^2 = 1$ oriented outwards? Remember
that we can parametrize the four dimensional ball E as $r(\rho, \phi, \theta_1, \theta_2) =
[\rho \cos(\phi) \cos(\theta_1), \rho \cos(\phi) \sin(\theta_1), \rho \sin(\phi) \cos(\theta_2), \rho \sin(\phi) \sin(\theta_2)]$ with
$0 \le \rho \le 1$, $0 \le \phi \le \pi/2$, $0 \le \theta_1 \le 2\pi$, $0 \le \theta_2 \le 2\pi$.

Problem 41R.15) (10 points):

What is the flux of the curl of the vector field $F(x, y, z, w) = [xyx, x + y^4 wx, -y + x, xw]$
through the disk surface r(u, v) = [0, u, v, 0], $u^2 + v^2 \le 1$?


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Unit 42: Some literature

42.1. Of course, the hope is that no other literature is needed. These lecture notes are
quite dense. The idea is that you “write your own book” and fill in eventual gaps or
work out some parts in more detail. We live in a time where many great resources are
online. The total sum of the books below are in the ten thousands. Some are there for
historical reasons, like Gibbs, Cartan or Gleason. For proof literature, see Unit 24.

Figure 1. Edwards, Blatter, Kaplan, Marsden-Tromba, Marsden-Ratiu

Figure 2. A bit more historical: Cartan, Gibbs-Wilson, Widder, Gleason,
Loomis-Sternberg.
To the end, we added some physics books. Why physics? Well, we are made of matter
and live for some time in space and essentially all calculus was developed in order to
understand concepts like space, time and matter. There is no doubt that also in the
future, both calculus and physics will remain tightly linked and continue to influence
each other.

Figure 3. Spivak, Weintraub, Arnold, Thirring and Morita.

Figure 4. Popular choices: Adams-Thompson-Hass, Bachman, Stewart,
Do Carmo and Hubbard-Hubbard.

Figure 5. Puzzling times for physics. Greene (Stringy), Smolin, Woit,
Hossenfelder, Rovelli-Vidotto (all not so Stringy).


LINEAR ALGEBRA AND VECTOR ANALYSIS

MATH 22A

Acknowledgements:

43.1. First of all thanks to the Math 22a class of the fall of 2018 for questions,
comments, feedback and proofreading. So, thanks to Troy Appel, Ethan Arellano,
Jordan Barkin, Vlad Batagui, David Bruno, Silvia Casacuberta, Ian Chan, Candice
Chen, Michael Chen, Jackson Delgado, Isabel Diersen, Michaela Donato, Shelby Elder,
William Elder, Phillip Evans, Eric Hansen, Emily He, Ben HoffnerBrodsky, Jerry
Huang, Spencer Hurt, Lauren Jiang, Drew Kelner, Nour Khachemoune, Madeleine
KlebanoffOBrien, Mary Kolesar, Jordan Lawanson, Tim Li, Jake Lim, Isaac Longobardi,
Robert Malate, Emily Murdock, Samantha OSullivan, Christopher Ong, Moni
Radev, Sanjana Ramrajvel, Isaac Robinson, Amia Ross, Julian Schmitt, Daniel Shin,
Ross Simmons, Daniel Slaw, Sophia Sun, Varun Tekur, Grace Tian, Eddie Tu, Connor
Wagaman, Carissa Wu, Rebecca Xi, Iris Xu, Mark Xu, Jenny Yao, Nicole Zhang, Jennifer
Zhu and Richard Zhu. The class not only found hundreds of small things;
substantial directives also came from the class while the course was being taught.
43.2. Thanks to Robin Gottlieb and Cliff Taubes for initiating the course and getting
it off the ground. And to the TFs David, Aditya and Elliot for helping to run the
course smoothly.
43.3. Thanks to Jameel Al-Aidroos, Wes Cain, Janet Chen and Dusty Grundmeyer for
valuable discussions and feedback for early versions during the spring and the summer
of 2018 while planning the course.
43.4. The cover picture uses a Povray code by Jaime Vives Piqueres from 2005 illus-
trating height-fields.
