Math 55a Notes
Vikram Sundar
December 4, 2017
Contents
1 8/30/17 5
1.1 Introduction to Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Why Linear Algebra? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Mechanics for the Course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Blackboard Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 9/1/17 8
2.1 Motivation: Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Groups, Rings, and Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 9/6/17 12
3.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2 Constructing Vector Spaces: Direct Sums and Subspaces . . . . . . . . . . . 13
4 9/8/17 15
4.1 Characteristic of a Field, Homomorphisms, and Ideals . . . . . . . . . . . . . 15
4.2 Span and Linear Combinations . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 9/11/17 18
5.1 Span and Finite-Dimensional . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
5.2 Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
6 9/13/17 21
6.1 Spanning Sets and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
6.2 Linear Independence and Bases . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.3 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
7 9/15/17 24
7.1 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
7.2 The Space of Linear Transformations . . . . . . . . . . . . . . . . . . . . . . 26
8 9/18/17 27
8.1 Kernel and Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
8.2 Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
8.3 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
9 9/20/17 31
9.1 Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
9.2 Quotient Spaces and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
9.3 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
10 9/22/17 34
10.1 Duality of Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
10.2 Duality of Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . 34
10.3 Exact Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
10.4 Digression: Category Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
11 9/25/17 37
11.1 Finishing Up Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
11.2 Duality and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
11.3 Linear Transformations from One Vector Space to Itself . . . . . . . . . . . . 39
12 9/29/17 41
12.1 Motivation for Eigenstuff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
12.2 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
12.3 Eigenspaces and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . 42
12.4 Quiz Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
13 10/2/17 44
13.1 Review: Invariant Subspaces and Eigenvectors . . . . . . . . . . . . . . . . . 44
13.2 Linear Independence of Eigenvectors . . . . . . . . . . . . . . . . . . . . . . 44
13.3 Existence of Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
14 10/4/17 47
14.1 Remainder and Factor Theorems . . . . . . . . . . . . . . . . . . . . . . . . 47
14.2 Algebraically Closed Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
14.3 Field Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
15 10/6/17 50
15.1 Finishing Up Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
15.2 Upper-Triangular Matrices and Flags . . . . . . . . . . . . . . . . . . . . . . 50
15.3 Preview: Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
16 10/11/17 52
16.1 Definition of the Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . . 52
16.2 Properties of the Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . 53
16.3 Universal Property of the Tensor Product . . . . . . . . . . . . . . . . . . . 54
17 10/13/17 55
17.1 Review: Tensor Products of Linear Transformations . . . . . . . . . . . . . . 55
17.2 Bilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
17.3 The Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
17.4 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
18 10/16/17 57
18.1 Inner Products over Real Vector Spaces . . . . . . . . . . . . . . . . . . . . . 57
18.2 Inner Products over Complex Vector Spaces . . . . . . . . . . . . . . . . . . 58
18.3 Equivalence of Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
19 10/18/17 60
19.1 Inner Products over the Complex Numbers . . . . . . . . . . . . . . . . . . . 60
19.2 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
19.3 Orthogonal Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
20 10/20/17 62
20.1 Orthogonal Complements, Continued . . . . . . . . . . . . . . . . . . . . . . 62
20.2 Orthogonal and Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . 62
21 10/23/17 64
21.1 Adjoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
21.2 Self-Adjoint Transformations and Orthonormal Eigenbases . . . . . . . . . . 64
22 10/25/17 66
22.1 Moore Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
22.2 Introducing Linear Algebra to Graph Theory . . . . . . . . . . . . . . . . . . 66
22.3 Characterizing Moore Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 67
23 10/27/17 69
23.1 Matrix Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
23.2 Spectral Theorem with Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 70
24 10/30/17 72
24.1 Introduction to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . 72
25 11/1/17 74
25.1 Graded Associative Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
25.2 The Wedge Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
25.3 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
25.4 The Exterior Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
26 11/6/17 77
26.1 Digression: The Fifteen Puzzle . . . . . . . . . . . . . . . . . . . . . . . . . . 77
26.2 Digression: Group Homomorphisms . . . . . . . . . . . . . . . . . . . . . . . 78
26.3 Back to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
27 11/8/17 80
27.1 Computing Determinants via Column Reduction . . . . . . . . . . . . . . . . 80
27.2 Exterior Powers of Duals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
27.3 Adj A and Λ^{n-1}(V) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
28 11/10/17 83
28.1 Motivation for Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . 83
28.2 Positive-Definite Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
28.3 Characteristic Polynomial and the Cayley-Hamilton Theorem . . . . . . . . 84
29 11/13/17 85
29.1 The Characteristic Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . 85
29.2 Nilspace and Generalized Eigenspaces . . . . . . . . . . . . . . . . . . . . . . 86
30 11/15/17 87
30.1 Proof of Cayley-Hamilton . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
30.2 Computation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
31 11/17/17 89
31.1 Representation Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
31.2 Subrepresentations and Quotient Representations . . . . . . . . . . . . . . . 89
31.3 Mapping Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
32 11/20/17 91
32.1 Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
32.2 Orthonormality of Characters . . . . . . . . . . . . . . . . . . . . . . . . . . 92
33 11/27/17 93
33.1 Orthogonality of Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
33.2 Schur’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
33.3 Proving Orthonormality of Characters . . . . . . . . . . . . . . . . . . . . . 94
34 11/29/17 96
34.1 Group Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
34.2 Permutation Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
34.3 Group Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
35 12/1/17 99
35.1 Projection Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
35.2 Other Orthogonality Relations . . . . . . . . . . . . . . . . . . . . . . . . . . 99
35.3 Group Ring Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
36 12/4/17 101
36.1 Problem Set 3 Problem 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
36.2 Finite Group Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
1 8/30/17
Course website: math.harvard.edu/~elkies/M55a.17.
Instructor: Professor Noam Elkies ([email protected]). Office hours: Lowell dining hall at the Inn at Harvard, Tuesday 7:30 - 9:30 or by email appointment. Course Assistants: Vikram Sundar (me, [email protected]) and Rohil Prasad (prasad01@college.harvard.edu).
Problem Set 1 is due on September 8.
f'(x_0) such that
    f(x_0 + h) = f(x_0) + h f'(x_0) + o(|h|)
as h → 0. This o(|h|) is a notion in metric topology. If we try to generalize this to n-tuples of real numbers (or vectors), we have x_0 ∈ R^n, U ⊂ R^n a neighborhood of x_0, and f : U → R^k. Now our function is differentiable if
    f(x_0 + h) = f(x_0) + f'(x_0)h + o(|h|),
where f'(x_0) is now a linear transformation R^n → R^k and h is some n-tuple of real numbers.
This is already a notion of linear algebra that plays a critical role in differential calculus.
In integral calculus, there is a famous change of variables formula
    ∫_{y(x_1)}^{y(x_2)} f(y) dy = ∫_{x_1}^{x_2} f(y(x)) y'(x) dx.
1.4 Blackboard Notation
We typically write a blackboard-bold R on the board for the real numbers; a regular boldface R also works. Note that C denotes the complex numbers, Q the rational numbers (for quotients), and Z the integers (from the German Zahlen). Axler often uses F to refer to either R or C (any field).
2 9/1/17
Course announcements: Monday, 8 - 10 PM is Math Night in Leverett dining hall starting
this Monday. Professor Elkies will hold OH on Tuesday, 7:30 - 9 PM in Lowell dining hall.
Math Table in Mather small dining hall, 5:30 PM dinner and 6 PM talk.
We will be going in parallel to the textbook, not reproducing what is in the textbook
(which you should read on your own).
2.2 Groups, Rings, and Fields
Let’s start by defining the field axioms.
Definition 2.2.1. F is a field if we have two operations: addition (` : FˆF Ñ F which takes
pa, bq ÞÑ `pa, bq “ a ` b) and multiplication (¨ : F ˆ F Ñ F which takes pa, bq ÞÑ ¨pa, bq “ a ¨ b)
with the following axioms:
7. There exists a map ´1 : F˚ Ñ F˚ that maps a ÞÑ a´1 such that a ¨ a´1 “ a´1 ¨ a “ 1.
(Multiplicative inverse)
Observe that pF, `q and pF˚ , ¨q are abelian groups. Dropping the commutativity assump-
tion gives us a group. We’ll talk more about groups (abelian and not) later, but permutation
groups are a simple example of nonabelian groups. For your convenience, I’ve written out
the definitions of a group and ring below, though we did not strictly cover this in class.
Definition 2.2.2. G is a group given an operation ¨ : GˆG Ñ G with the following axioms:
Definition 2.2.3. G is an abelian group if it is a group and is commutative, i.e. for all
a, b P G we have a ¨ b “ b ¨ a.
Definition 2.2.4. R is a ring if it has two operations: addition ` : R ˆ R Ñ R and
multiplication ¨ : R ˆ R Ñ R with the following axioms:
The difference between a ring and a field is that multiplicative inverses need not exist.
a + x = b
for a, b, x ∈ G, a group. We want to say that the answer is x = b − a, but we have not defined subtraction yet! Note the distinction between − : G → G (the unary minus) and − : G × G → G (subtraction). To solve this, note that
    a + x = b
    (−a) + (a + x) = (−a) + b
    ((−a) + a) + x = (−a) + b
    0 + x = (−a) + b
    x = (−a) + b.
Running the same manipulation multiplicatively gives
    ax = b ⟺ x = a^{-1}b.
We used associativity but not commutativity, so this holds in any group. For a field with a ≠ 0, we have also shown that
    ax = b ⟺ x = a^{-1}b = b/a.
In a non-commutative group, we have
    xa = b ⟺ x = ba^{-1}.
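As a quick illustration of the left/right distinction (my own sketch, not from lecture; it assumes numpy is available), invertible matrices form a non-commutative group under multiplication, and the two solutions really do differ:

    import numpy as np

    # Two elements of GL_2(R): invertible 2x2 matrices under multiplication.
    a = np.array([[1., 2.], [3., 4.]])
    b = np.array([[0., 1.], [1., 0.]])

    x_left = np.linalg.solve(a, b)       # solves a x = b, i.e. x = a^{-1} b
    x_right = b @ np.linalg.inv(a)       # solves x a = b, i.e. x = b a^{-1}

    print(np.allclose(a @ x_left, b))    # True
    print(np.allclose(x_right @ a, b))   # True
    print(np.allclose(x_left, x_right))  # False: the group is not commutative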
Lemma 2.2.6 (Uniqueness of Identity and Inverse). In any group G, the identity 1 and
inverse a´1 are unique.
3 9/6/17
Problem Set 1 due at 11 AM sharp (I will start grading right after class, so don’t be late!)
If you want to enroll, please petition by 2:30 PM today to ensure that you can enroll by
tomorrow.
Definition 3.1.1. V is a vector space over a field F (written V/F) if V is a set of vectors with two operations (addition, i.e. + : V × V → V, (v_1, v_2) ↦ v_1 + v_2, and scalar multiplication, i.e. · : F × V → V, (a, v) ↦ a · v = av) such that
Example 3.1.2. The canonical example of a vector space is F^n = {(a_1, a_2, . . . , a_n) | a_i ∈ F}, with addition and scalar multiplication termwise. This is the vector space of n-tuples and is the most common vector space in lower-level classes. For n = 1, we just have the original field F.
Notice that the vector space axioms are a proper subset of the field axioms, i.e. there’s
no multiplicative inverse. Thus you can define an extension for rings that is similar to a
vector space; these are called modules over a ring R. We need multiplicative inverses over
fields for a number of our basic axioms (and module theory is much more complicated than
linear algebra for this reason) so we’ll stick to fields for now.
We have some basic lemmas:
Lemma 3.1.3. 0v = 0⃗ for all v ∈ V.
Proof.
    0v = (0 + 0)v = 0v + 0v,
and we have already shown that the only element of an additive group equal to the sum of itself with itself is the identity (add the additive inverse to both sides). Thus 0v = 0⃗.
Lemma 3.1.4. (−1)v = −v, the additive inverse of v.
Proof.
    (−1)v + 1v = ((−1) + 1)v = 0v = 0⃗,
so (−1)v is the additive inverse of v.
Lemma 3.1.5. a0⃗ = 0⃗ for all a ∈ F.
Proof.
    a0⃗ + a0⃗ = a(0⃗ + 0⃗) = a0⃗,
so adding the additive inverse of a0⃗ to both sides gives a0⃗ = 0⃗.
Definition 3.2.4. A subset U ⊂ V of a vector space is called a subspace when U is closed under the vector space operations (i.e. 0⃗ ∈ U; v_1, v_2 ∈ U ⟹ v_1 + v_2 ∈ U; and v ∈ U, a ∈ F ⟹ av ∈ U).
4 9/8/17
Problem Set 1 is due now. Problem Set 2 has been posted.
You can drop down from 55 to 25 until the fifth Monday. Please do problem sets for only
one of the two classes if you’re still deciding.
Note that h(m) = h(n) is equivalent to h(n − m) = 0. Thus the set of integers that map to 0 tells us everything about whether or not this map is an injection.
Definition 4.1.3. For a map f : A → B, the set {a ∈ A | f(a) = 0} is known as the kernel of f, denoted ker f.
In our case, the kernel is an ideal I (a term we will define later). We can also state that h(m) = h(n) iff m ≡ n (mod I). We can see some of the properties that our ideal has to define it.
1. x, y ∈ I implies that x + y ∈ I.
2. 0 ∈ I.
3. x ∈ I, y ∈ R implies that xy ∈ I.
The proof of this was outlined in class for the case that our ring homomorphism is
h : Z Ñ F to motivate the ideal axioms. I leave the general case to you as an exercise.
The key fact about the integers is that we understand their ideals completely. Any ideal
of the integers is of the form I Ă Z “ t0, ˘g, ˘2g, ˘3g, . . .u (or the zero ideal). The proof
of this is also left as an exercise; it relies crucially on the Euclidean algorithm. In fact, the
ideal I “ tng | n P Zu works for any g in any ring (another exercise!). The special fact about
the integers in particular is that any ideal can be written in this form.
We can now define the characteristic with this setup.
You can define the characteristic of any ring in exactly the same way and the same proofs
will work (our statements were general to all rings, not just fields). There are fields with
positive characteristic, e.g. Z{2Z (the integers mod 2) with characteristic 2. But not every
integer can be the characteristic of a field! For example, Z{nZ has characteristic n but is
not always a field; if n is nonprime then there are elements without multiplicative inverses.
If p is a prime, we can check that Z/pZ is a field. There are a number of ways of doing this (e.g. by Bezout). Fields with finitely many elements like this are called finite fields.
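As a concrete sanity check (my own illustration, not from lecture; it assumes plain Python), the snippet below verifies that every nonzero element of Z/7Z has a multiplicative inverse, using Fermat's little theorem (a^{p-2} is the inverse of a mod p), and that Z/6Z fails:

    p = 7
    for a in range(1, p):
        inv = pow(a, p - 2, p)          # Fermat: a^(p-1) = 1 mod p, so a^(p-2) is a^{-1}
        assert (a * inv) % p == 1
    print("every nonzero element of Z/7Z is invertible")

    # In Z/6Z the element 2 has no inverse: 2*b mod 6 is always even.
    print(any((2 * b) % 6 == 1 for b in range(6)))  # False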
We have the following theorem:
Theorem 4.1.7. A number is the size of a finite field iff it is a prime power, and there is
a unique such finite field.
Definition 4.2.1. The span of a subset S of a vector space V is the intersection of all
subspaces U Ă V such that S Ă U .
This is a subspace, and is in fact the smallest subspace that contains S. It is a very
general definition that works in the context of modules over commutative rings as well.
We can also consider the span as linear combinations of finite subsets of S.
It’s easy to verify that the set of all linear combinations is a subspace of V (just check
addition and scalar multiplication). It is similarly easy to see that this is the smallest
subspace that contains the elements of S.
We will use the notation
    Σ_{s∈S} a_s s
to denote a linear combination over the entire set S, whether or not S is finite. In order
to deal with infinite sets, we will require that as “ 0 for all but finitely many s P S (which
makes this a finite sum). This condition is vacuous if S is finite.
What if S is the union of some number of subspaces U_1 ∪ U_2 ∪ . . . ∪ U_n? We can then define our span as
    { Σ_i u_i | u_i ∈ U_i, u_i = 0 for all but finitely many i }.
If U “ span S and S is finite, then we say that U is finitely generated. In the case of a
vector space, we say that U is finite dimensional. This refers to the fundamental invariant
of the space called the dimension that we’ll discuss later. Note that there is no notion of the
dimension in the case of Z-module or a module over a ring.
5 9/11/17
Announcements: My section is Monday (starting today), 1 - 2 PM. Room today is SC 112,
next week and beyond is SC 222.
OH tonight, 8 - 10 PM, at Math Night in Leverett dining hall.
Tomorrow: Math Table from 5:30 - 6:30 in Mather dining hall. Talk will be given by
Rosalie Belanger-Rioux on chaos. Elkies OH from 7:30 - 9:30 in Lowell dining hall.
(a_1, . . . , a_n) = a_1 e_1 + a_2 e_2 + . . . + a_n e_n.
5.2 Linear Independence
The words finite-dimensional suggest the notion of a dimension dim V, which can be 0, 1, 2, . . . , ∞.
We will show what exactly this dimension is over the next day or so, but remember that this
is an important consequence of being over a field as opposed to over a ring. Modules do not
have a notion of dimension!
We’ll start by discussing linear independence. A set S Ă V can be:
• A spanning set.
• A minimal spanning set/basis/maximal linearly independent set.
• A linearly independent set.
Here, we have:
Definition 5.2.1. A linearly independent set is a set S such that
    Σ_{v_i∈S} a_i v_i = 0
implies that a_i = 0 for all i.
Theorem 5.2.3 (Linear Independence Lemma, from Axler 2.21). Suppose that v_1, . . . , v_m are linearly dependent and v_1 ≠ 0. Then there exists a j, 1 ≤ j ≤ m, such that v_j ∈ Span(v_i | i < j) and Span(v_1, v_2, . . . , v_{j−1}, v̂_j, v_{j+1}, . . . , v_m) = Span(v_1, . . . , v_m).
The hat on the v_j denotes that we are considering the (m − 1)-tuple resulting from removing v_j from the original m-tuple.
Proof. We have a linear relation
    Σ_i a_i v_i = 0.
Because v_1 ≠ 0⃗, there exists some j > 1 such that a_j ≠ 0. Take the maximal such j; then
    Σ_{i=1}^{j} a_i v_i = 0,
or equivalently
    a_j v_j = Σ_{i<j} (−a_i) v_i,
so v_j = Σ_{i<j} a_j^{-1}(−a_i) v_i ∈ Span(v_i | i < j). In class this was done in more detail using the field axioms directly; I've skipped steps here, but you should show as much work as you can on your problem sets. You can also write −a_i/a_j instead of a_j^{-1}(−a_i). This implies the first statement.
The second part is easy; a linear combination of a linear combination is a linear combination of what you started with, so the ⊃ part follows. The ⊂ part is easy.
Theorem 5.2.4. If u_1, . . . , u_m are linearly independent in V and w_1, . . . , w_n span V, then m ≤ n.
Proof. Start with the spanning set u_1, w_1, . . . , w_n, which is linearly dependent by hypothesis (u_1 lies in the span of the w_i). Apply Theorem 5.2.3 to remove one of the w_j. Add u_2 and repeat (you can never remove any of the u_i since they are linearly independent). Once we do this m times, we have a spanning set of n elements that contains all of the u_i, implying m ≤ n.
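As a computational illustration of Theorem 5.2.3 (my own sketch, not from lecture; it assumes numpy), one can locate a removable vector v_j by checking when the rank stops growing as vectors are added:

    import numpy as np

    vectors = [np.array([1., 0., 0.]),
               np.array([0., 1., 0.]),
               np.array([2., 3., 0.]),   # lies in the span of the first two
               np.array([0., 0., 1.])]

    rank = 0
    for j in range(1, len(vectors) + 1):
        new_rank = np.linalg.matrix_rank(np.column_stack(vectors[:j]))
        if new_rank == rank:             # v_j is in Span(v_1, ..., v_{j-1})
            print("v_%d can be removed without changing the span" % j)
        rank = new_rank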
6 9/13/17
Rohil's section: Thursday, 4 - 5 PM, in Science Center 411. We do not coordinate sections: attend as many or as few as you feel is appropriate.
Problem Set 2 due on Friday.
Note that this fails in the context of modules! Rings that have this property for all of
their modules are called Noetherian.
Proof. Let’s try to construct a finite generating set for U . If U “ t0u there is nothing to do.
If not, let u1 P U ´ t0u. If we’re not done, let u2 P U ´ Spanpu1 q, and so on inductively until
um . The key point is that all of the ui are linearly independent. V “ Spanpv1 , . . . , vn q, so
we must finish when m ď n by our theorem. Thus U is finite dimensional.
Proposition 6.1.3 (Axler 2.30). (v_1, . . . , v_n) is a basis iff every v ∈ V is uniquely represented as a sum Σ_{i=1}^{n} a_i v_i for some a_i ∈ F (or, for an index set S, as Σ_{s∈S} a_s v_s with a_s almost all 0).
The proof is omitted; it is largely identical to the one we provided for linear independence
earlier.
Now consider the map F^n → V that takes (a_1, . . . , a_n) ↦ Σ_i a_i v_i for some basis v_i. This map is clearly a linear homomorphism and by the proposition is also an isomorphism. So every finite-dimensional vector space is just F^n!
But we will continue studying vector spaces abstractly as opposed to by picking a basis.
This is because it is often cleaner and more useful to understand finite-dimensional vector
spaces abstractly and we will often want to change bases to get a better understanding of
a particular map. Thus you should not get used to picking a basis for every problem! (also
that will make your CAs very very sad.)
Theorem 6.1.4 (Axler 2.31). Any finite spanning set contains a basis.
Proof. Use Theorem 5.2.3 (getting rid of 0s) until you eliminate all linear dependences; then
you get a linearly independent spanning set which is a basis.
Corollary 6.1.5 (Axler 2.32). Every finite-dimensional vector space has a basis.
Theorem 6.1.6 (Axler 2.31+). Any spanning set of a finite-dimensional vector space con-
tains a basis.
Proof. We know that there must be a finite spanning set inside our infinite spanning set, so
we can just take that and then apply Theorem 6.1.4.
In the infinite-dimensional case, these theorems hold if you assume the Axiom of Choice/Zorn’s
lemma, but we won’t worry about that here. When you have infinite-dimensional vector
spaces with some additional topological structure (i.e. some notion of distance/convergence/etc.)
6.3 Dimension
Now the dimension actually makes sense.
Theorem 6.3.1 (Axler 2.35). If V is finite-dimensional, all bases have the same length.
Definition 6.3.2. The dimension dim V is that length.
Our theorem is thus equivalent to saying that dimension is well-defined, since it estab-
lishes that our definition of dimension is not dependent on the choice of basis that we use in
the definition.
Proof. Given two bases v1 , . . . , vn and w1 , . . . , wm , we’ll show that m “ n. By Theorem 5.2.4,
we have m ď n since the vi are linearly independent and the wi span. Similarly, m ě n
reversing the roles. Thus m “ n as desired.
Example 6.3.3. Elementary dimension computations:
1. dim Fn “ n.
3. Let Fn0 be the subspace of Fn consisting of all vectors whose coordinates sum to 0.
Then dim Fn0 “ n ´ 1.
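A quick numerical check of item 3 (my own sketch; it assumes numpy and scipy are available): F^n_0 is the kernel of the "sum of coordinates" functional, so its dimension can be read off from a numerical basis of that kernel.

    import numpy as np
    from scipy.linalg import null_space

    n = 5
    sum_functional = np.ones((1, n))        # the map (a_1, ..., a_n) -> a_1 + ... + a_n
    basis_of_F_n_0 = null_space(sum_functional)
    print(basis_of_F_n_0.shape[1])          # 4 = n - 1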
7 9/15/17
PS 2 due now, PS 3 goes out today.
A couple of notes about the remainder of Axler, chapter 2:
1. So far F may be skew, since we have not used commutativity of multiplication yet.
Thus dimension of vector spaces over a skew field is well-defined. This will stop being
the case very soon.
The proof of this can be found in the textbook and you should have looked this up for PS
2. However, on your most recent problem set you (hopefully!) found a counterexample
to the statement with three vector spaces.
So far, we have defined vector spaces and given properties and constructions of them, i.e.
the nouns and adjectives of our theory. Today, we start discussing linear transformations
(Axler chapter 3). These are the verbs of our theory, i.e. how we map from one vector space
to another.
2. T p0G q “ 0H .
3. T p´gq “ ´T pgq.
When proving that something is a group homomorphism, you only need to prove that
T pg1 ¨ g2 q “ T pg1 q ¨ T pg2 q since the auxiliary properties are actually consequences of this one
(just consider T p0 ¨ 0q “ T p0q ¨ T p0q “ T p0q, for example).
Definition 7.1.2. A map T : V Ñ W is said to be an F-linear transformation if:
Example 7.1.3. Examples of linear transformations (exercise: check that all of these are
linear transformations):
9. Let V be the vector space of differentiable functions R → R and let W be the vector space of all functions R → R. Then the map taking the derivative, V → W, is linear. This can also be restricted to V = W = P, the space of polynomials. The same is true of integration. Sidenote: You can compute that Dx − xD = id, where D is differentiation and x denotes multiplication by x. This is (essentially) the uncertainty principle in quantum mechanics.
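A quick symbolic check of the sidenote (my own sketch using sympy, not part of the lecture): for any polynomial f, differentiating x·f and subtracting x·f′ returns f.

    import sympy as sp

    x = sp.symbols('x')
    f = 3*x**2 + 5*x + 7                              # any polynomial works here
    commutator = sp.diff(x*f, x) - x*sp.diff(f, x)    # (Dx - xD) applied to f
    print(sp.expand(commutator) == sp.expand(f))      # True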
7.2 The Space of Linear Transformations
So far, we’ve seen that you can add linear transformations, can multiply them by a scalar,
and that there’s a zero transformation. This implies that linear transformations form a
vector space!
2. Homp0, V q “ 0.
3. HompV, 0q “ 0.
5. HompFn , V q » V ‘n by applying the above repeatedly. This implies that dim HompV, W q “
dim V ¨ dim W for finite-dimensional vector spaces V, W .
8 9/18/17
My section today: 1 - 2 pm, Science Center 222. Math Night/my OH: Leverett dining hall,
8 - 10 PM.
Math Table tomorrow: Cameron Krulewski will be giving a talk on vector bundles and
K-theory.
Professor Elkies’s OH are Wednesday this week in Lowell dining hall, 7:30 - 9 PM.
Correction to PS 3 #1 coming soon: see the webpage.
There will be a quiz next week (either Monday or Wednesday). Email Professor Elkies
if you have a strong preference. It is in-class, 1 hour.
Definition 8.1.1. The kernel ker T of a linear map T : V → W is the set of v ∈ V such that Tv = 0.
Definition 8.1.2. The image of a linear map T : V → W is the set of T(v) for all v ∈ V.
Definition 8.2.2. The rank of T is dim T(V), while the nullity of T is dim ker T.
Nullity is used less often in abstract math but is still used in computational math text-
books.
Based on Theorem 8.2.1, we can show that dim V , dim W , and dim ker T or equivalently
dim T pV q define T up to isomorphism, i.e. given some map T 1 : V 1 Ñ W 1 with the same
such numbers, we can find isomorphisms V » V 1 and W » W 1 such that T » T 1 . This will
give us a complete classification of all linear transformations between different vector spaces.
We start with an important proposition.
Proposition 8.2.3 (Axler 3.16). T : V → W is injective iff ker T = 0.
Definition 8.2.4. A map T : V → W (between any sets) is injective if Tv = Tv′ implies v = v′.
Proof of Proposition 8.2.3. If T is injective, then any v ∈ ker T satisfies Tv = 0 = T0, so v = 0. Conversely, Tv = Tv′ implies T(v − v′) = 0, and if ker T = 0 this forces v − v′ = 0, as desired.
Please do not use Axler's notation of one-to-one for injective, since some people use it to mean bijective.
You can also show that the conditions of Proposition 8.2.3 hold iff T : V → T(V) is an isomorphism (you can check that if T is linear and bijective then T^{-1} is also linear).
Proof of Theorem 8.2.1. Let U = ker T. Axler uses bases here (for the record, when I took 55 this was on a midterm and we lost points for citing bases), but we will not. We know that U has a complement U′, i.e. V = U ⊕ U′ and dim V = dim U + dim U′. We have T(V) = T(U) + T(U′) and T(U) = 0, so T(V) = T(U′). Further, T|_{U′} is an isomorphism onto T(U′) by Proposition 8.2.3 and the fact that U ∩ U′ = {0}. Thus
    dim V = dim U + dim U′ = dim U + dim T(U′) = dim ker T + dim T(V),
as desired.
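Here is a small numerical instance of rank-nullity (my own sketch, assuming numpy and scipy): for a 2 × 3 matrix, the rank plus the dimension of the kernel equals the dimension of the domain.

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])             # a map R^3 -> R^2
    rank = np.linalg.matrix_rank(A)          # dim T(V)
    nullity = null_space(A).shape[1]         # dim ker T
    print(rank, nullity, rank + nullity == A.shape[1])   # 2 1 True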
Corollary 8.2.5 (Axler 3.23, 3.24). Given that V, W are finite-dimensional vector spaces
with T : V Ñ W a linear map, then T injective implies that dim V ď dim W with equality
if T is an isomorphism. T surjective implies that dim V ě dim W with equality if T is an
isomorphism.
Definition 8.2.6. A map T : V Ñ W (over any set) is surjective if given any w P W ,
there exists some v P V such that T v “ w.
We also have the structure theorem, i.e. that dim V , dim W , and dim ker T completely
describe the structure of the transformation. This is because if W is finite-dimensional, you
can take the complement W = T(V) ⊕ W′ (this complement is called the cokernel) and our linear transformation is just
    T(u, u′) = T(u + u′) = T(u′) = (T(u′), 0).
Now, you can be very explicit. Consider a basis for V that is u_1, u_2, . . . , u_m, u′_1, u′_2, . . . , u′_n and for W, w_1 = T(u′_1), . . . , w_n = T(u′_n), w′_1, . . . , w′_p. Any other linear map with the same dimensions of the vector spaces and kernel will look the same with respect to some basis.
Here, we are implicitly using:
Lemma 8.2.7 (Axler 3.?). 1. Hom(V ⊕ V′, W) = Hom(V, W) ⊕ Hom(V′, W). This is also true for arbitrary direct sums.
2. Let S be a basis for a not necessarily finite-dimensional vector space V over a field F. Then for all W over F and any choice of vectors w_s ∈ W, there exists a unique linear transformation T : V → W such that T(v_s) = w_s for all v_s in the basis.
The first part is easy to see directly. The second part is true since
    v = Σ_{s∈S} a_s v_s
with a_s = 0 for all but finitely many s. Then we can just write
    Tv = Σ_{s∈S} a_s w_s,
which must be our linear transformation. We just need to check that this is a linear transformation.
Finally, we can prove:
Theorem 8.2.8 (Axler 3.61). If both V and W are finite-dimensional, then HompV, W q is
also and
dim HompV, W q “ dim V ¨ dim W.
This is easy enough to prove via Lemma 8.2.7; choose a basis for V that identifies V » Fn
and now apply Lemma 8.2.7 (1) as many times as necessary.
Why is the theorem number so high? Because Axler decides to veer from the golden light
of abstract linear algebra and introduce discussion of. . .
8.3 Matrices
Choose a basis for both V and W, i.e. v_1, . . . , v_n and w_1, . . . , w_m. Then we have
    T(v_i) = Σ_{j=1}^{m} a_{i,j} w_j.
In particular, the kth column is the image of the kth basis vector v_k. We get the matrix by regarding elements of W ≅ F^m as column vectors and writing them in order.
Note that we must think of vectors as column vectors rather than row vectors, which corresponds to our convention of composing functions/multiplying matrices on the left. It's easy to see that composition of maps corresponds to multiplication of matrices (we'll show this next time). You are not required to know that m × n matrices are different than n × m matrices.
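As a concrete sketch of how the matrix is built column by column, and of the composition claim (my own illustration, not from lecture; the maps T and S below are made up for the example and numpy is assumed):

    import numpy as np

    def T(v):                        # T : R^3 -> R^2, T(x, y, z) = (x + 2y, 3z)
        return np.array([v[0] + 2*v[1], 3*v[2]])

    def S(w):                        # S : R^2 -> R^2, S(a, b) = (a + b, a - b)
        return np.array([w[0] + w[1], w[0] - w[1]])

    MT = np.column_stack([T(e) for e in np.eye(3)])     # k-th column = image of k-th basis vector
    MS = np.column_stack([S(e) for e in np.eye(2)])
    M_ST = np.column_stack([S(T(e)) for e in np.eye(3)])

    v = np.array([1., 1., 1.])
    print(np.allclose(MT @ v, T(v)))     # True: the matrix reproduces T
    print(np.allclose(MS @ MT, M_ST))    # True: composition corresponds to matrix multiplication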
Note that I will try to write a1,1 or ai,j instead of a11 or aij . The literature often uses the
latter since it is clear from context exactly what I mean.
Consider a map T : V → V. You must use the same basis of V for both sides, so these maps look different from the ones we have characterized above. The set Hom(V, V) is a ring of n × n matrices, since you can compose these maps with themselves to get new maps V → V. We have an identity I which corresponds to the identity matrix, with 1s on the diagonal and 0s everywhere else:
    I = diag(1, 1, . . . , 1).
9 9/20/17
Professor Elkies will hold office hours from 7:30 - 9:00 PM in Lowell dining hall today. Rohil
will hold make-up office hours from 1 - 2:30 PM in Science Center 304 and section from 4 -
5 PM as usual.
There will be a quiz next Wednesday, September 27 in class. This quiz is purely diagnostic
and will count for an insignificant portion of your grade.
Axler now has a discussion of quotients, an essential topic in algebra.
Definition 9.1.1. A coset of V mod U is a subset of V of the form rv0 s for some element
v0 P V where rv0 s “ tv0 ` u | u P U u.
The best way to think about this is congruent classes mod n. We can consider these as
equivalence classes rvn s where v „ v 1 means v 1 ” v pmod U q or v 1 ´ v P U . We want this
to be an equivalence relation.
The last property (transitivity) is the only nontrivial property, which we can check by
noting that
v3 ´ v1 “ pv3 ´ v2 q ` pv2 ´ v1 q
and both of the latter terms are P U , so v3 ´ v1 P U as desired. The other two are left as
exercises for you.
Note that symmetry and transitivity do not imply reflexivity; the obvious proof v_1 ∼ v_2 ∼ v_1 ⟹ v_1 ∼ v_1 assumes that there exists some v_2 with v_1 ∼ v_2, but that is not given with only symmetry and transitivity! Thus we keep the reflexivity axiom.
Now we have an idea as to what the quotient vector space consists of. We now need addition and scalar multiplication to work.
Definition 9.1.3. The quotient space V/U consists of cosets of V mod U with addition and scalar multiplication defined on representatives, i.e. [v_1] + [v_2] = [v_1 + v_2] and c[v_1] = [cv_1]. We have 0_{V/U} = [0] and −[v_0] = [−v_0].
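For a concrete picture (a quick example of my own, not from lecture): take V = R^2 and U = {(t, 0) | t ∈ R}, the x-axis. The coset [(a, b)] = {(a + t, b) | t ∈ R} is the horizontal line at height b, so two vectors are congruent mod U exactly when their second coordinates agree, and [(a, b)] + [(a′, b′)] = [(a + a′, b + b′)] depends only on b + b′. The map [(a, b)] ↦ b is then an isomorphism V/U ≅ R.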
We need to check that this definition is well-defined, i.e. for whatever choice of v1 , v2 in
the cosets rv1 s, rv2 s, I get the same value for rv1 ` v2 s. This is equivalent to v1 ” v11 , v2 ”
v21 ùñ v1 ` v2 ” v11 ` v21 , which I leave as an exercise. Similarly, one can show that
v1 ” v11 ùñ ´v1 ” ´v11 and v1 ” v11 ùñ cv1 ” cv11 .
As a note, checking these facts is nontrivial if you are not working over vector spaces!
You cannot quotient out by an arbitrary subgroup or subring. Thus I strongly recommend
you check all of the claims I made above to ensure our definition is well-defined.
Thus Hom(V/U, W) ⊂ Hom(V, W), and the image is exactly the set of T′ such that U ⊂ ker T′, i.e. T′(U) = 0. Why? A map T : V/U → W corresponds to the map T′ : V → W sending v to T([v]); conversely, to descend a map T′ : V → W to a map T on V/U, you pick a representative v in each coset [v] and set T([v]) = T′(v). But this is only well-defined if v′ ≡ v implies T′(v′) − T′(v) = T′(v′ − v) = 0, implying that T′(U) = 0. Conversely, if T′(U) = 0, then we can define T([v]) = T′(v) and this is well-defined, giving us the desired correspondence.
Another way of saying this is that the subspace of Hom(V, W) of maps vanishing on U is isomorphic to Hom(V/U, W).
If you've seen some real analysis, you might know that ∫_a^b : Funct([0, 1], R) → R sends null functions, i.e. functions that are nonzero at only finitely many points, to 0, so ∫_a^b is a map from the quotient of all functions by the null functions, or the L^1 space.
Definition 9.2.1. The codimension of U in V is dimpV {U q.
As above, if V is finite-dimensional we can apply rank-nullity to get dimpV {U q “ dim V ´
dim U . Thus all rank-nullity is saying is that codimpker T q “ dimpT pV qq which can be proven
just by descending to the isomorphism V { ker T » T pV q.
We also have the following theorem, which was not explicitly stated in class but referred
to:
Theorem 9.2.2 (First Isomorphism Theorem). Given T : V Ñ W , we have V { ker T »
T pV q.
The only thing we haven't checked to prove Theorem 9.2.2 is that the map V/ker T → T(V) is an injection (surjectivity is obvious). You can verify that the kernel of this map is 0, implying that this map is an isomorphism as desired. This recovers the rank-nullity theorem and generalizes it to other contexts, like quotient groups, quotient rings, etc.
9.3 Duality
Recall that we defined the dual:
Definition 9.3.1. The dual vector space is V* := Hom(V, F), the space of functionals on V.
where the columns are v_1, . . . , v_n and the rows are w_1, . . . , w_m. Thus column vectors must correspond to the original basis, i.e. V = F^n; and if we let W = F, then the row vectors correspond to Hom(V, F) = V*, i.e. elements of the dual space.
Finally, you can verify that composition of maps is equivalent to multiplication of matri-
ces.
10 9/22/17
Quiz: next Wednesday, September 27.
Problem Set 3 due now, Problem Set 4 due Monday, Oct 2 (note the changed deadline).
Note that Axler uses V′, T′, U^0 instead of what we call V*, T*, U^⊥. Note that there are two forms of the same Greek letter: \phi refers to φ and \varphi refers to ϕ.
Thus V ˚ » V since they have the same dimension. But this isomorphism is not canonical.
If you pick a basis v1 , . . . , vn P V , then there exists a dual basis vi˚ P V ˚ where vi˚ pvj q “ δij ,
i.e. vi˚ pvi q “ 1 and vi˚ pvj q “ 0 for i ‰ j. But this map depends on choosing a basis and is
not basis-independent! There cannot be a natural identification V » V ˚ since it would take
any vi ÞÑ vi˚ always, which is impossible as multiplying vi by a constant divides vi˚ by the
same constant.
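Concretely (my own numerical sketch, assuming numpy), if the columns of a matrix B are a basis of F^n, the dual basis functionals are the rows of B^{-1}, since B^{-1}B = I says exactly that the i-th row pairs to δ_ij with the j-th column; rescaling a basis vector divides the corresponding functional, just as claimed above.

    import numpy as np

    B = np.array([[1., 1.],
                  [0., 2.]])                # columns are the basis v_1, v_2 of R^2
    dual = np.linalg.inv(B)                 # row i represents the functional v_i^*
    print(np.allclose(dual @ B, np.eye(2))) # v_i^*(v_j) = delta_ij

    # rescaling v_1 by 5 divides v_1^* by 5, so v_i -> v_i^* is not basis-independent
    B2 = B.copy()
    B2[:, 0] *= 5
    print(np.allclose(np.linalg.inv(B2)[0], dual[0] / 5))   # True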
However, the second dual pV ˚ q˚ » V canonically! Our disproof above doesn’t work
since we’d multiply the dual-dual basis by the same constant. We can demonstrate that
this isomorphism is canonical on Problem Set 4 by defining it in a basis-free manner (note:
this means that using bases on your problem set will cause you to lose points). Hint:
pV ˚ q˚ “ HompHompV, Fq, Fq.
and rows, i.e. transposing the matrix (assuming you use the dual basis). There are much cleaner ways to prove that this is an isomorphism, however.
What happens if we compose duals of linear transformations? Consider V → W → X with maps T and then S; a functional ϕ on X pulls back to S*ϕ = ϕ ∘ S on W, and then to T*(S*ϕ) = ϕ ∘ S ∘ T = (ST)*ϕ on V. Thus we have (ST)* = T*S*. Note the switched order, corresponding to the arrows reversing.
Suppose we have T : V → W, with kernel ker T and image T(V). Then ker T = (Im T*)^0 and Im T = (ker T*)^0. Proofs of this are in the textbook.
Lemma 10.3.2. V Ñ W Ñ X is exact iff X ˚ Ñ W ˚ Ñ V ˚ is also exact.
Proof. It’s easy to show that the composition of the two maps is always zero. The hard part
is showing that the image of the first is the kernel of the second, but this is equivalent to
the statements of image and kernel of the dual we made earlier.
In general, we have an exact sequence
    0 → U → V → W → X → 0,
where the map V → W is T.
2. The trivial category with one object and one morphism and the trivial functor that
sends everything to that category.
3. Forgetful functors, e.g. VecF Ñ Set that forgets the structure of the vector spaces (i.e.
a vector space homomorphism is clearly a map between sets!) or VecC Ñ VecR that
forgets the complex structure of the vector space.
4. A contravariant functor, that does everything described above but reverses all the
arrows. Duality is an example of a contravariant functor. In fact, this is an exact
contravariant functor, since it preserves exactness (which is well-defined for categories).
11 9/25/17
My section: 1 - 2 PM today, SC 222. CA OH and Math Night: 8 - 10 PM, Leverett dining
hall.
Math Table: graduate school panel, tomorrow 5:30 - 6:30 PM. Professor Elkies’s OH:
7:30 - 9 PM, Lowell dining hall.
Diagnostic quiz in class on Wednesday: 11:07 - 12:00. Being on time would be helpful.
It covers material on problem sets 1 to 3 only.
Putnam exam: on December 2. Signups will be on the third floor under the undergraduate
section. I don’t think signups are up yet, but the math department is often disorganized so
it’s best to check.
We’re finishing Chapter 3 of Axler. Chapter 4 of Axler is about polynomials, but we’ve
already assumed some of Chapter 4 in earlier problem sets, so we’re not going to cover it.
It’s regarded as something that you probably already know (polynomials don’t change from
over R to over F much).
Consider a subspace U ⊂ V. Any map from U → F can be extended to a map V → F. This is actually not an obvious statement, and it is not provable for infinite-dimensional vector spaces without the Axiom of Choice! Consider Q ⊕ Q√2 ⊂ R and the map 1 ↦ 1, √2 ↦ 0; you can see that you can't actually construct an extension of this map to R without Hamel bases.
In the finite-dimensional case one possible way to prove this is to use complements. Write
V “ U ‘ W and V ˚ “ U ˚ ‘ W ˚ and V ˚ Ñ U ˚ where pu˚ , w˚ q Ñ u˚ . In other words, if you
give me a map from U Ñ F, I get a map from V Ñ F by saying that everything in W goes
to 0 and then writing any vector v P V as v “ u ` w, sending u to whatever and w to 0. Of
course, this only works in the finite-dimensional case.
Now extend our short exact sequence to 0 → U → V → V/U → 0 and the dual 0 ← U* ← V* ← (V/U)* ← 0. We want to show that (V/U)* = ker(V* → U*), and we have identified (V/U)* ≅ W*, which gives us the desired result (along with a dimension-count argument).
The more general fact is that T : V → W gives rise to a dual map T* : W* → V*, sending a functional ϕ on W to T*ϕ = ϕ ∘ T on V.
Lemma 11.1.2 (Axler 3.107A, Axler 3.109B). ker T* = (T(V))^0 and Im T* = (ker T)^0.
Proof. We want to show that ker T* = (T(V))^0. This is almost immediate from the definitions: ker T* consists of the ϕ such that T*ϕ = ϕ ∘ T = 0, i.e. the ϕ that annihilate everything in T(V), as desired.
The hard part is the other direction, i.e. Im T* = (ker T)^0. If v ∈ ker T, then (T*φ)v = φ(T(v)) = 0, thus Im T* ⊂ (ker T)^0. If I have an element v* ∈ V* that annihilates everything in ker T, we want to show that it comes from a functional in W*. The easy way to do this is to compare the dimensions of our two subspaces and show that they are equal, which is just a computation.
You do need to check that this is true (which is left as an exercise since we don’t want
to do ugly matrix stuff in class). But we have shown that dim Im T “ dim Im T ˚ , which in
terms of matrices looks quite different. dim Im T is the dimension of the span of the columns
of the matrix, i.e. the column rank of A. dim Im T ˚ is the dimension of the span of the
rows of the matrix, i.e. the row rank of A. And thus we have:
Corollary 11.2.1 (Axler 3.115 - 3.119, 3.111 - 3.112). The row rank and column rank of a
matrix are the same.
This common number is called the rank. This is a nontrivial fact in terms of matrices,
but is relatively obvious in terms of duals. This is a common punchline in linear algebra
proofs.
If you have a 1 × n matrix, then the rank is clearly 1 (aside from the 0 matrix). If you have a 2 × 2 matrix, the rank is 0 if you have the 0 matrix. Letting the matrix be
    ( a  b )
    ( c  d ),
we see that it has rank 1 if it is nonzero and ad = bc, i.e. the rows are multiples of each other. The rest are rank 2, i.e. invertible matrices or bijections. Things get a lot more complicated in the general case.
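A quick numerical check of the corollary and of the 2 × 2 description (my own sketch, not from lecture; it assumes numpy):

    import numpy as np

    A = np.random.rand(4, 6)
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))   # True: row rank = column rank

    B = np.array([[1., 2.],
                  [2., 4.]])          # nonzero with ad = bc, so rank 1
    print(np.linalg.matrix_rank(B))   # 1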
Recall that linear transformations on Fn Ñ Fn that are bijective/invertible form a group.
This group is the group of general linear transformations and called either GLpnq or
GLn pFq. In general, it’s not commutative. If you want, you can also talk about GLpV q for
a general vector space (all invertible linear transformations T : V Ñ V ).
You can also characterize Fibonacci-esque recursions in this way, that turn out to be relevant
for population dynamics and the like.
12 9/29/17
We are starting Axler Ch. 5, about eigenstuff. We will need to say something about the
complex numbers being algebraically closed (Ch. 4) but will talk about that later.
If you want to talk to Professor Elkies, please meet him Sunday, 2 - 4 PM on the 4th
floor common room. Rohil and I are also available to talk.
Add/drop deadline on Monday. If you are switching between 23/25/55, there should be
no fee; contact Professor Cliff Taubes if you are charged.
PS 4 due Monday.
It would be really nice if there were a complement V = U ⊕ W such that both U and W were invariant. Then we could just consider the smaller-dimensional vector spaces U and W instead of thinking about V. If you want to think about this in terms of matrices, we essentially get
    M(T) = ( M(T|_U)   *
             0         M(T|_W) )
when picking a basis for U and W. You can then iterate this over and over again, getting essentially a block diagonal matrix (if you can get the * to be 0).
The complement in general is not invariant, since you cannot define the complement
purely canonically. But we always have the quotient space! The key observation is that we
always have an induced map T : V/U → V/U. Axler calls this map T/U, but we will not use this notation. You can define this map via T([v]) = [Tv], or via the commutative diagram (I couldn't resist) with T : V → V on top, the projection π : V → V/U on both sides, and the induced map V/U → V/U on the bottom.
In either case, you need to check that this map is well-defined, i.e. [v] = [v′] implies [Tv] = [Tv′]. But [v] = [v′] implies that v − v′ ∈ U, so T(v − v′) ∈ U, or [Tv] = [Tv′], as desired.
Thus we now have the matrix
    M(T) = ( M(T|_U)   *
             0         M(T|_{V/U}) ),
which is known as a block upper triangular matrix. If we repeat this dissecting process to get 1-dimensional subspaces, then we get an upper triangular matrix. Iterating this transformation gets another upper triangular matrix. Ideally, we'd even want to get a diagonal matrix, which is very easy to deal with.
Thus we can try lots more things to get invariant subspaces. Note that kerpT ´ λIq “
tv P V | T v “ λvu is also invariant under T . We thus have:
Definition 12.3.3. The λ-eigenspace of T is the space Vλ “ kerpT ´ λIq. The vectors in
this eigenspace are called eigenvectors.
We want to generate a basis of eigenvectors so our matrix becomes diagonal (or at least upper triangular). Does there even always exist one eigenvalue? This is not the case over all fields F. Consider rotation by 90° in R^2. This clearly has no eigenvalues or eigenvectors. However, there will always be eigenvalues when working with F = C or any other algebraically closed field.
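Numerically (my own sketch, assuming numpy), the 90° rotation matrix has only the complex eigenvalues ±i, so over R there is no eigenvector, while over C there is:

    import numpy as np

    R = np.array([[0., -1.],
                  [1.,  0.]])          # rotation by 90 degrees
    eigenvalues, eigenvectors = np.linalg.eig(R)
    print(eigenvalues)                 # [0.+1.j  0.-1.j]: no real eigenvalues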
Traditionally, at this point you use the determinant to prove that eigenvalues exist. We
will not be doing that; instead we’ll naturally construct a polynomial that will demonstrate
the existence of eigenvalues.
13 10/2/17
PS 4 due now. PS 5 will be posted soon.
My section/Math Night as usual.
Today is the last day to add or drop a class. Please contact one of the CAs if you are
still uncertain.
Tomorrow: math table, Davis Lavowski on Poincare-Birkhoff. Elkies’s office hours as
usual.
Putnam sign-ups are due by October 18.
Recall the commutative diagram formed by two copies of the exact sequence 0 → U → V → V/U → 0, with vertical maps T|_U on U, T on V, and the induced map T/U on V/U.
Last time, we checked that there exists a map that makes the above diagram commute
and this map is well-defined.
The idea is that we try to find invariant subspaces until we cut everything down to one
dimension.
Why do we assume finite-dimensionality? Here’s a counterexample. Suppose F “ R, C
and V “ Cpr0, 1s, Fq “ tf : r0, 1s Ñ F | f continuousu. Let T : V Ñ V with f ÞÑ xf , where
pxf qptq “ tf ptq. The claim here is that there is no one-dimensional invariant subspace, i.e.
no eigenvector. Thus there is no nonzero function f ‰ 0 and real or complex number λ P F
such that xf “ λf . Thus invariant subspaces in an infinite-dimensional space will look more
complicated than those in a finite-dimensional case.
You might wonder whether you can define a codimension 1 invariant subspace? Or any
invariant subspace? I leave that to you.
Theorem 13.2.1 (Axler 5.10). Eigenspaces for different eigenvalues are linearly independent. That is, for pairwise distinct λ_i, if v_i ∈ V_{λ_i} and Σ_i v_i = 0 (with finitely many i), then each v_i = 0.
This implies that if you have found n different eigenvalues in a space of dimension n, you
have found all possible eigenvalues.
Proof. We know that Tv_i = λ_i v_i with each λ_i distinct. Take a relation in which the number of nonzero v_i is minimal. If not all vectors are zero, we'll produce a linear relationship with fewer nonzero vectors (but not all zero); this will produce a contradiction. If there is only one nonzero v_i, we're obviously done.
Now induct on the number of eigenvectors. Suppose
    v_1 + v_2 + . . . + v_{N+1} = 0.
Applying T, we have
    Tv_1 + . . . + Tv_{N+1} = λ_1 v_1 + . . . + λ_{N+1} v_{N+1} = 0.
We can now get rid of one of the v's by subtracting λ_{N+1} times the first linear combination. We get
    (λ_1 − λ_{N+1})v_1 + . . . + (λ_N − λ_{N+1})v_N = 0.
This has a smaller number of nonzero vectors, as desired.
    a_0 v + a_1 Tv + a_2 T^2 v + . . . + a_n T^n v = 0,
which over an algebraically closed field we can factor as
    a_n (T − t_1 I)(T − t_2 I) . . . (T − t_m I) v = 0.
Thus we have a sequence of linear transformations that in a finite number of steps sends v to 0. One of these linear transformations must have nontrivial kernel, implying the existence of an eigenvalue.
We now want to formalize this proof to get the theorem:
Theorem 13.3.1 (Axler 5.21). Every linear transformation on a nonzero finite-dimensional vector space over an algebraically closed field has an eigenvalue.
Proof. Define the map F[x] → L(V) = L(V, V) = Hom(V, V) = End(V). This is the endomorphism ring and is a non-commutative ring in general. Our map is defined by x ↦ T; this implies 1 ↦ I, x ↦ T, x^2 ↦ T^2, x^3 ↦ T^3, . . .. This homomorphism is both a ring homomorphism and an F-algebra homomorphism.
Definition 13.3.2. An F-algebra is a ring that contains the field F, making it a vector
space over F.
Let this map be h_T. Then h_T(a) = aI for constants a ∈ F, and we can extend this to a map on all of F[x] by linearity. You can check that this map preserves multiplication and addition.
F[x] is infinite-dimensional, while End(V) is finite-dimensional. This means that h_T has a nontrivial kernel, so there is a nonzero polynomial of degree ≤ n^2 that kills every vector, i.e. annihilates the entire vector space!
The kernel ker h_T is an ideal of the ring F[x].
Definition 13.3.3. An ideal I ⊂ R is a subring of R such that for all r ∈ R, we have rI ⊂ I. This is stronger than just requiring closure of multiplication within I.
It is a nonzero ideal, as we've established, and cannot be the whole ring. All ideals in the polynomial ring are generated by one element (exercise: use the Euclidean algorithm/Bezout's lemma), so the polynomials form a principal ideal domain. Call this generator P, so ker h_T = P·F[x]. This P can be characterized as the monic polynomial of minimal degree such that h_T(P) = 0. We call this the minimal polynomial of T.
We can now factor the minimal polynomial of T and proceed as above to get the desired result.
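The argument can even be carried out numerically (my own sketch, assuming numpy and scipy): take any nonzero v, find a dependence among v, Tv, T^2 v, . . ., and factor the resulting polynomial; at least one of its roots must be an eigenvalue of T (here both are).

    import numpy as np
    from scipy.linalg import null_space

    T = np.array([[2., 1.],
                  [0., 3.]])                        # eigenvalues 2 and 3
    v = np.array([0., 1.])
    K = np.column_stack([v, T @ v, T @ T @ v])      # 3 vectors in R^2 must be dependent
    a = null_space(K)[:, 0]                         # a_0 v + a_1 Tv + a_2 T^2 v = 0
    print(np.roots(a[::-1]))                        # roots of a_2 x^2 + a_1 x + a_0: [3. 2.]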
14 10/4/17
Rohil’s section as usual tomorrow. Today - Friday: CMSA workshop continues.
Next Monday is a university holiday so we will not have class. However, section and
Math Night/OH will be held as usual.
Putnam signups are due October 18.
To proceed further with eigenvalues and eigenspaces, it’s most natural to work in the
context of algebraically closed fields. There are several questions here: what is an alge-
braically closed field? where does that term come from? why do such fields exist (e.g. the
complex numbers C)? This is really a topological/analytic fact and doesn’t really belong in
55a; we’ll prove it very easily in 55b. The textbook gives a very fast proof using complex
analysis which we will outline here.
then there exist unique A, B ∈ F[t] such that Q = AP + B and deg B < d. This is just polynomial division. Now we can do the computation
    P(t) = A(t)(t − t_1) + B,
where deg B < 1, i.e. B is constant. Substituting t_1 for t, we have P(t_1) = B. You can think about substitution as a map ev_{t_1} : F[t] → F which is just evaluation at t_1. You can check that this is a ring homomorphism, but this is really a good way of obfuscating something rather trivial.
Similarly, we know that P(t_1) = 0 is equivalent to P(t) = (t − t_1)A(t). Thus the kernel of our evaluation map is just the ideal generated by (t − t_1).
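A small symbolic illustration of the division and the remainder theorem (my own sketch using sympy, not from lecture): dividing by (t − t_1) leaves the constant remainder P(t_1).

    import sympy as sp

    t = sp.symbols('t')
    P = t**3 - 2*t + 5
    t1 = 2
    A, B = sp.div(P, t - t1, t)        # P = A*(t - t1) + B with deg B < 1
    print(B, P.subs(t, t1))            # both are 9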
Proposition 14.2.2. The following are equivalent:
1. Every nonzero P P Frts factors completely.
2. Every P P Frts of degree ě 1 has a root in F.
Proof. The direction 1 ùñ 2 is essentially trivial. The direction 2 ùñ 1 is also easy; just
write P ptq “ pt ´ t1 qQptq where deg Q ă deg P and induct on deg P .
Definition 14.2.3. A field that satisfies the above conditions is algebraically closed.
We know that the real numbers are not algebraically closed. For example, pptq “ t2 ` 1
has no real solutions. Any equation of odd degree has at least one real solution by the
intermediate value theorem, which we have not proven and is also a topological fact about
the real numbers. With enough algebra, one can reduce the fundamental theorem of algebra
to this statement, but it is still topological.
If you make your real numbers larger by choosing some solution of this and calling it i, i.e. i^2 + 1 = 0, then the numbers a + ib for a, b ∈ R can be used to do arithmetic, like addition, subtraction, multiplication, and even division via
    1/(a + ib) = (a − ib)/(a^2 + b^2).
However, we find that surprisingly, √i etc. are all in our new field! This suggests that the complex numbers are in fact algebraically closed.
Theorem 14.2.4 (Fundamental Theorem of Algebra). The complex numbers are alge-
braically closed.
Sketch of Proof. The idea is to consider the number of times various paths go around the origin, known as the winding number. You can use a polynomial with no root to show that a path with winding number 0 changes to winding number n, which is impossible, a contradiction. We'll make this more rigorous and clearer next semester.
Consider a field extension F′ ⊃ F of degree d, i.e. d = dim_F F′.
If d = 1, then F′ = F. The claim is that algebraically closed fields have only one possible finite-degree extension.
Theorem 14.3.2. F is algebraically closed iff there is no field F1 Ą F of finite degree aside
from F itself.
This proof tells us something about how the ideas we’ve developed so far tell us things
about the structure of fields that are not automatically about vector spaces.
Proof. We first prove that if F is not algebraically closed then there exists such a field F1 .
If F is not algebraically closed, then let P be a polynomial that does not factor completely.
We know deg P ě 2 and assume P irreducible (otherwise just pick an irreducible factor that
does not factor completely).
Construct a ring F1 “ Frxs{P pxqFrxs. This quotient ring (the ring of polynomials mod
P ) guarantees that x is a root of P . P pxqFrxs is an ideal so this quotient ring makes sense.
We do need to check that this is actually a field and that it has the right dimension. The
dimension of this is the codimension of P pxqFrxs which is deg P ě 2. So we just need to
check that it is actually a field.
Suppose you have a ≠ 0, a ∈ F′. We need to show that there exists an inverse. We have a = A(x) for some polynomial A and want to show that there exists b such that ab ≡ 1 (mod P). Multiplication by a, i.e. b ↦ ab, is an F-linear map from F′ → F′. F′ is a finite-dimensional vector space, so we'll just show that this map is surjective. By rank-nullity, it's enough to show that this map is injective. So suppose ab ≡ 0 (mod P); since P is irreducible and divides ab, we must have P | a or P | b. We know P ∤ a, so b ≡ 0 (mod P). Thus the kernel is trivial and we are done.
Now we need to show the reverse direction, i.e. if there exists such an F′ then F is not algebraically closed. Let x ∈ F′, x ∉ F. Consider 1, x, x^2, . . . , x^d where d = [F′ : F]. Given d + 1 vectors in a d-dimensional vector space, there must be a linear relation between them. We know that 1 ≠ 0 and x is linearly independent from 1, so there exist a_0, a_1, . . . , a_e with a_e ≠ 0 and 1 < e ≤ d such that
    Σ_{i=0}^{e} a_i x^i = 0.
I claim that the polynomial Σ_{i=0}^{e} a_i t^i does not split completely over F. If it did, it would be a_e ∏_{i=1}^{e} (t − t_i) with each t_i ∈ F; plugging in x and using that F′ has no zero divisors, this implies that x = t_i for some i, a contradiction.
This gives us a good way to construct algebraic closures; just keep adding field extensions
for all remaining polynomials that do not split completely and keep stacking extensions!
If your field is countable, this works no matter what; if you are allowing the Axiom of
Choice/Zorn’s lemma, this works always (exercise). If you’re in the complex numbers, you
can just pick C.
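Here is the construction F[x]/(P·F[x]) made completely explicit in the smallest case (my own sketch, not from lecture, in plain Python): F = Z/2Z and P(x) = x^2 + x + 1 give the field F_4, and a brute-force check shows every nonzero element is invertible.

    # Elements of F_2[x]/(x^2 + x + 1): pairs (a, b) standing for a + b*x, with a, b in {0, 1}.
    def mul(p, q):
        a, b = p
        c, d = q
        # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = x + 1 mod P
        return ((a*c + b*d) % 2, (a*d + b*c + b*d) % 2)

    elements = [(a, b) for a in (0, 1) for b in (0, 1)]
    one = (1, 0)
    for p in elements:
        if p != (0, 0):
            assert any(mul(p, q) == one for q in elements)   # every nonzero element has an inverse
    print("F_2[x]/(x^2 + x + 1) is a field with", len(elements), "elements")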
15 10/6/17
Monday: no class, my section/Math Night as usual.
Tuesday: Math table (Carlos Arbos-Ribeira, Cobordism Groups) and OH as usual.
Wednesday: PS 5 due, start tensor algebra.
which implies that dim Vλi “ 1. We may now pick a basis for each Vλi to get a basis for V
on which MpT q is diagonal with diagonal entries being the eigenvalues λi . We loosen the
distinctness condition on the eigenvalues to get:
Definition 15.1.2. T is diagonalizable if there exists a basis on which T has a diagonal
matrix, or equivalently V has a basis of eigenvectors of T .
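Numerically (my own sketch, assuming numpy), a symmetric matrix with distinct eigenvalues illustrates the definition: the eigenvectors form a basis, and in that basis the matrix becomes diagonal.

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])
    eigenvalues, P = np.linalg.eig(A)             # columns of P are eigenvectors
    D = np.linalg.inv(P) @ A @ P                  # change of basis to the eigenbasis
    print(np.allclose(D, np.diag(eigenvalues)))   # True: A is diagonalizable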
Recall also the minimal polynomial P pT q “ 0. Factorizations of this polynomial necessar-
ily give you all possible eigenvalues. Alternatively, you could just start with v, T v, T 2 v, . . . , T n v
to get a polynomial P pT qv “ 0. We therefore also get the existence of eigenvalues in an
algebraically closed field by factoring the minimal polynomial.
We’ll now spend a bit of time talking about linear transformations that are less nice than
diagonalizable transformations, i.e. upper-triangular transformations.
From a basis, we can get a flag by considering V_i = Span(v_1, . . . , v_i). Observe that upper-triangular matrices respect the flag coming from the basis of that matrix, since Tv_i ∈ Span(v_1, . . . , v_i) by definition. In particular, if the ith diagonal entry is λ_i, then multiplication by λ_i is the induced map on V_i/V_{i−1}.
We can now determine a condition for when an upper-triangular matrix is invertible: this happens if and only if all of its diagonal entries are nonzero.
Proof. In the forward direction, if some λi “ 0, then the induced map on Vi {Vi´1 is 0, so
T pVi q Ă Vi´1 , contradiction.
In the backward direction, suppose that v ∈ ker T, i.e. Tv = 0 with v ≠ 0. Then v ∈ V_i, v ∉ V_{i−1} for some i. Since λ_i ≠ 0, we must have Tv ∈ V_i, Tv ∉ V_{i−1}, since modulo V_{i−1} we find Tv ≡ λ_i v ≢ 0. But Tv = 0, which is clearly a contradiction.
Proposition 15.2.4 (Axler 5.32). If M is upper-triangular, its eigenvalues are precisely its
diagonal entries.
16 10/11/17
Today: tensor products. PS 6 posted online later today and available in class; PS 5 due
now.
Science Center A 4:15 - 5:15 today and tomorrow: Tim Gowers is giving the Ahlfors
lecture series on additive combinatorics.
We’ll use tensor products to prove later today that the trace (concretely in terms of
matrices, the sum of diagonal elements) is coordinate-independent and thus a canonical
map. You may have seen a proof of this based on bases, but we’ll show why the trace is
actually a very natural map. Later, we’ll do something similar for the determinant.
Using this rule, we see that any u b v can be written as a linear combination of the
ui b vj . But how do we know that the ui b vj are linearly independent? This is much harder
to prove. And what about vector spaces for which we don’t know a basis? We’ll want to
form the tensor product of R bQ R, for example.
Definition 16.1.1. We define the tensor product of two vector spaces U, V over F to be
U b V “ Z{Z0 . Z is generated by tu b v | u P U, v P V u formally (we have no rules on how to
add each of these symbols u ⊗ v; they're just symbols). Z₀ is generated by all of the relations:
• (u + u′) ⊗ v − u ⊗ v − u′ ⊗ v for u, u′ ∈ U, v ∈ V,
• u ⊗ (v + v′) − u ⊗ v − u ⊗ v′ for u ∈ U, v, v′ ∈ V,
• (au) ⊗ v − a(u ⊗ v) and u ⊗ (av) − a(u ⊗ v) for a ∈ F, u ∈ U, v ∈ V.
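For instance (an added illustration), the relations force
    (u + u′) ⊗ (v + v′) ≡ u ⊗ v + u ⊗ v′ + u′ ⊗ v + u′ ⊗ v′  (mod Z₀),
so the quotient by Z₀ is exactly what makes ⊗ bilinear.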
16.2 Properties of the Tensor Product
First, we’ll show that the tensor product has the basis we expect.
Lemma 16.2.1. If ui and vj are bases for U and V , then ui b vj is a basis for U b V .
We are not going to prove this by showing that our set is linearly independent and spans
the vector space. Exercise: try doing this; see how hard it is.
Proof. Let W be a vector space with basis {w_ij}, one basis vector for each pair (i, j); we'll show that W ≅ U ⊗ V. Construct α : W →
U b V, β : U b V Ñ W and prove that α ˝ β “ id, β ˝ α “ id. This demonstrates the
isomorphism by providing a map with a two-sided inverse.
Define αpwij q “ ui b vj (technically I should take this mod Z0 , but I’ll drop that since
people consider this implicit).
The other direction is trickier. We define our transformation on Z and check that it is 0
on Z0 , thus inducing a transformation on Z{Z0 . Define βr : Z Ñ W with
    β̃(u ⊗ v) = β̃((∑_i a_i u_i) ⊗ (∑_j b_j v_j)) = ∑_i ∑_j a_i b_j w_ij.
In all cases, we need to check that our maps are well-defined under the relations, i.e. the
elements of Z0 , and that a two-sided inverse exists. But there is an easier way to prove many
of these statements. . .
16.3 Universal Property of the Tensor Product
Suppose we have a bilinear map B : U × V → X (here U × V is just the Cartesian product, not a vector-space construction). The universal property of the tensor product says that such bilinear maps B correspond exactly to linear maps T : U ⊗ V → X, via T(u ⊗ v) = B(u, v).
You need to check that the map T makes sense (under the relations) and is actually a
linear map and that there is a bijection between B and T . We’ll leave this as an exercise.
You can define the tensor product through the universal property, since universal prop-
erties are unique up to isomorphism (you do need to construct it, though). You can then
just use the universal property to prove all the properties we listed above by proving that
the maps we showed are actually bilinear in both directions.
You can also define the tensor product of linear transformations.
17 10/13/17
There’s a typo on the official handout: on page 2, the first displayed equation should read
    β̃(u ⊗ v) = ∑_i ∑_j a_i b_j w_ij,
Recall also that Hom(V, W) ≅ V* ⊗ W: the space of linear transformations is itself a tensor product built out of the dual! This is again only an isomorphism in the finite-dimensional case.
subspaces are invariant under S as desired. You can also define Sym2 V “ Z{Z1 where
Z1 “ SpanpZ0 , u b v ´ v b uq, where Z, Z0 are as in the tensor product definition.
Given this background, we know that symmetric bilinear forms are elements of Hom(Sym² V, F).
We’ll be using this very soon to understand inner product spaces.
to hold. This is equivalent to |x ` x1 | ď |x| ` |x1 | (often also called the triangle inequality).
We have to prove this, and it is a consequence of bilinearity and positive-definiteness. We
will do that next time.
18 10/16/17
Today we are going to continue our discussion of inner product spaces.
Before now, it was not too crucial which fields we were working over. However, to define
an inner product, it is necessary that we consider vector spaces over R. This is because the
real numbers have the special property that x2 ě 0, with equality if and only if x “ 0.
Otherwise, the positive-definiteness axiom is difficult to ensure. For example, if we are
working over a finite field, then x2 ` y 2 ` z 2 may be equal to zero as we found out on the
last problem set.
An inner product space also gives rise to notions of norm and distance. These are given by
    |x| = √⟨x, x⟩
and
    d(x, y) = d(x − y, 0) = |x − y|.
The former is a definition, but the latter is just something we cooked up. To show this
actually works, we need to check that the axioms of a distance function are satisfied.
In particular, the triangle inequality boils down to showing:
Lemma 18.1.3. For any vectors x, y P V , |x| ` |y| ě |x ` y|.
If we square both sides, since they are positive the inequality is equivalent to showing |x||y| ≥ ⟨x, y⟩, i.e. the Cauchy–Schwarz inequality.
There are two different types of proofs.
Proof 1. This is the proof given in the book. It is more geometric.
Let us take two vectors x, y. If x “ 0, there is nothing to check.
On the other hand, if x ≠ 0, let's project y onto the subspace perpendicular to x to get ỹ, so that ⟨x, ỹ⟩ = 0. Then y and ỹ differ by a multiple of x, say y = ỹ + Cx for some constant C, so ⟨x, y⟩ = C|x|² and |y|² = |ỹ|² + C²|x|² ≥ C²|x|². Hence |x||y| ≥ |C||x|² ≥ C|x|² = ⟨x, y⟩, so the inequality holds.
Proof 2. This proof is more algebraic, but if you translate it to pictures the underlying
geometric idea is the same as Axler.
For every a, b P R we have that xax ` by, ax ` byy ě 0. The left hand side expands to
a2 |x|2 ` 2abxx, yy ` b2 xy, yy.
Then it is true by the standard theory of quadratic equations that constants A, B, C satisfy Aa² + 2Bab + Cb² ≥ 0 for all a, b if and only if A ≥ 0 and AC − B² ≥ 0.
In our situation, we set A “ xx, xy, B “ xx, yy, C “ xy, yy. Then we get xx, xy ě 0,
xx, xyxy, yy ě pxx, yyq2 . This last statement is equivalent to the Cauchy-Schwarz inequality.
Example 18.1.5. There is another natural inner product for continuous functions on the
interval rx0 , x1 s. Although we don’t have a notion of integration yet, we can take it in the
usual high-school calculus sense and set
    ⟨f, g⟩ = ∫_{x₀}^{x₁} f(x)g(x) dx.
We need to show the axioms. For positive-definiteness, ⟨f, f⟩ = ∫_{x₀}^{x₁} f(x)² dx. The integrand is non-negative, so the integral is clearly non-negative. Then, it remains to argue that
it is zero if and only if f “ 0, which we can figure out using notions of continuity.
In the case of polynomials, we can also define an inner product on the real line by
    ⟨f, g⟩ = ∫₀^∞ f(x)g(x)e^{−x} dx.
Since e^{−x} decreases faster than any polynomial grows, this integral is always well-defined.
Something that was forgotten earlier was to discuss when equality holds. For the Cauchy-
Schwarz inequality, we have the equality xax ` by, ax ` byy “ 0 if and only if ax ` by “ 0.
We have a name for this: x and y are linearly dependent!
• “Sesquilinear” or “conjugate-linear”: xav1 ` bv2 , wy “ axv1 , wy ` bxv2 , wy, xv, aw1 `
bw2 y “ axv, w1 y ` bxv, w2 y for all v1 , v2 , w1 , w2 P V , a, b P C.
Note that the positive-definiteness is well-defined, since xx, xy is equal to its own conjugate
by conjugate-symmetry and is therefore real.
In our proof of the triangle inequality, we now need to show |x||y| ě Rexx, yy instead.
However, the Cauchy-Schwarz inequality |x||y| ě |xx, yy| still holds and |xx, yy| ě Rexx, yy.
19 10/18/17
PS 6 due now, PS 7 out today. Last day to sign up for the Putnam exam.
No class on Friday, November 3.
Final exam due end of Sunday, December 10.
The trivial example here is σ “ id and there could be nontrivial maps, as in the case of
the complex numbers. Such maps are called involutions. In the field F = Q(i) you can consider the map a + bi ↦ a − bi, and this similarly works for F = Q(√d) for d not a square, where a + b√d ↦ a − b√d. We can now interpret non-degeneracy for inner products over the complex numbers as: if ⟨v, w⟩ = 0 for all w, then v = 0.
Recall our standard example of an inner product on the reals,
    ⟨a, b⟩ = ∑_i a_i b_i.
19.2 Orthogonality
Definition 19.2.1. If u, v P V with an inner product (either a conjugate-symmetric sesquilin-
ear pairing or a symmetric bilinear pairing), we say that u is orthogonal to v or u K v iff
xu, vy “ 0.
Lemma 19.2.2 (Pythagorean Theorem). If u ⊥ v, then ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨v, v⟩.
As we just established, you can just check this on the spanning sets.
Note that in the sum U + U′ of orthogonal subspaces, we have ⟨u + u′, u + u′⟩ = ⟨u, u⟩ + ⟨u′, u′⟩ for u ∈ U, u′ ∈ U′ by bilinearity. Thus given an inner product, two orthogonal subspaces must meet only at 0, so we actually have a direct sum U ⊕ U′. Such a direct sum is called an orthogonal direct sum, sometimes denoted U ⊥ U′.
Theorem 19.3.1 (Axler 6.29+, Gram–Schmidt). Let V be finite-dimensional with a conjugate-symmetric sesquilinear (or symmetric bilinear) pairing, and let U ⊂ V. If our pairing is non-degenerate on U (e.g. it's an inner product), then there exists a unique W ⊂ V such that V = U ⊕ W and U ⊥ W.
20 10/20/17
This weekend: Head of the Charles. Don’t leave the Yard without your HUID.
Nov 3: no class.
Nov 10: class despite Veteran’s Day.
and then uniqueness gives us the desired. We can now just let W “ π2 pV q, π2 “ id ´π1 .
There is a nice geometric description of this in terms of inner products. Given an inner product and an orthogonal decomposition V = U₁ ⊕ U₂, writing v = u₁ + u₂ implies that u₁ is the closest vector in U₁ to v, since
    |u₁ − v| ≤ |u₁′ − v|
for all u₁′ ∈ U₁. This u₁ is often called the orthogonal projection of v. Note that u₁′ − v = (u₁ − v) + (u₁′ − u₁), and the two summands are orthogonal, so by the Pythagorean theorem |u₁′ − v|² = |u₁ − v|² + |u₁′ − u₁|².
We can now write
    x = ∑_j x_j u_j,   y = ∑_j y_j u_j,   ⟨x, y⟩ = ∑_j ∑_{j′} x_j y_{j′} ⟨u_j, u_{j′}⟩.
How do we know each c_j = ⟨u_j, u_j⟩ is nonzero? Note that the polarization identity says that
    4⟨u, v⟩ = ⟨u + v, u + v⟩ − ⟨u − v, u − v⟩.
If char F ≠ 2 and ⟨u, u⟩ = 0 for every u, then ⟨u, v⟩ = 0 for all u, v. Thus nondegeneracy guarantees that at each step we can choose a vector with ⟨u, u⟩ ≠ 0, so the c_j are nonzero (and positive in the case of an inner product).
This basis is called an orthogonal basis. We can now scale to get c_j = 1 by dividing by √⟨u_j, u_j⟩ to get an orthonormal basis.
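For an honest inner product, the orthogonalize-and-normalize process can be carried out mechanically. Here is a rough sketch (mine, not from the notes), using the standard dot product on R^n:

    def gram_schmidt(vectors):
        """Turn a list of linearly independent vectors into an orthonormal list
        (standard dot product) by subtracting projections and then normalizing."""
        dot = lambda x, y: sum(a * b for a, b in zip(x, y))
        basis = []
        for v in vectors:
            w = list(v)
            for u in basis:
                c = dot(w, u)              # coefficient of the projection onto u
                w = [wi - c * ui for wi, ui in zip(w, u)]
            norm = dot(w, w) ** 0.5
            basis.append([wi / norm for wi in w])
        return basis

    print(gram_schmidt([[1.0, 1.0], [1.0, 0.0]]))
    # [[0.707..., 0.707...], [0.707..., -0.707...]]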
What if we have a bilinear, symmetric, nondegenerate pairing that is not positive definite?
We can then make the cj either ˘1 by scaling. It turns out that:
Theorem 20.2.1 (Sylvester’s Law of Inertia). The number of positive and negative cs is an
invariant of the pairing, called the signature.
This is much less obvious over the rationals! Note that 5(x² + y²) = (x + 2y)² + (2x − y)², so the form 5(x² + y²) is equivalent to x² + y² over Q, but 7(x² + y²) is not, since 7 is not a sum of two rational squares.
Proof. If you could find two different signatures, then you have written U “ U1 `U2 where U1
is positive definite and U2 is negative definite. Given two different such decompositions where
the dimensions don’t match, we must have dim U1 ą dim U11 , dim U21 ą dim U2 . Consider
U1 XU21 which must be nontrivial since dim U1 `dim U21 ą dim U . Our form is simultaneously
positive definite and negative definite on this space, contradiction.
21 10/23/17
Today: adjoints and orthogonal eigenbases. Same stuff as usual.
21.1 Adjoints
Duality tells us that given a map T : U Ñ V , there is a dual map T ˚ : V ˚ Ñ U ˚ . This
is defined by precomposition T ˚ pv ˚ q “ pu ÞÑ v ˚ pT uqq. If you have an inner product, this
induces an isomorphism U Ñ U ˚ , V Ñ V ˚ , which is u ÞÑ xu, ´y , v ÞÑ xv, ´y (assuming U
and V are finite-dimensional). This induces a map:
Definition 21.1.1. The adjoint T ˚ (via abuse of notation to torment poor students) is the
induced map T ˚ : V Ñ U by the above isomorphisms.
If you unwind the definition, you find that the adjoint is characterized by
xT u, vy “ xu, T ˚ vy .
(Exercise: verify that the above is true). We now see that pT ˚ q˚ “ T (adjoints, not duals
here) and pST q˚ “ T ˚ S ˚ .
If you choose bases to represent T and the dual T ˚ by matrices, T ˚ is the transpose of T .
This is also true for the adjoint, provided that you use a canonical “self-dual” basis. What
does it mean for a basis to be self-dual? Well the dual basis pu1 , . . . , um q sends one vector
to 1 and the rest to 0. Thus we want the inner product of these vectors to be the Kronecker
delta, i.e. we need an orthonormal basis.
Over the complex numbers, the adjoint T ˚ is actually the conjugate transpose, since we
have to take a complex conjugate. This is also known as the Hermitian transpose T H .
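Spelling this out in coordinates over R (an added check, with the standard inner product ⟨x, y⟩ = xᵀy and T represented by a matrix A in an orthonormal basis):
    ⟨Au, v⟩ = (Au)ᵀv = uᵀAᵀv = ⟨u, Aᵀv⟩,
so T* is represented by Aᵀ; over C, the same computation with ⟨x, y⟩ = \overline{x}^T y produces A^H instead.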
Theorem 21.2.1 (Spectral Theorem). Suppose V is a finite-dimensional inner product space over R or C and T : V → V is self-adjoint, i.e.
    ⟨u, Tv⟩ = ⟨Tu, v⟩.
Then there exists an orthonormal eigenbasis and all eigenvalues are real.
22 10/25/17
Talk by Noga Alon about random Cayley graphs (Center for Mathematical Sciences and
Applications) today at 3.
It’s an amazing theorem that the only possible degrees that a graph of girth 5 and
diameter 2 can have are 2, 3 (constructed above), 7, and 57. The more amazing part is that
this comes from linear algebra.
A label-free formulation is to consider T_G : R^V → R^V, where R^V has basis {e_v} for v ∈ V. Then
    T_G(e_v) = ∑_{v′ adjacent to v} e_{v′}.
    A² + A = (d − 1)I,
which implies that λ_{1,2} = (−1 ± √(4d − 3))/2. Since 4d − 3 ≠ 0, these are real and distinct. So
    R^n_0 = V_{λ₁} + V_{λ₂}.
What are the dimensions of these eigenspaces? Let d₁ = dim V_{λ₁}, d₂ = dim V_{λ₂}. To compute this, note that Tr T_G = 0. Pick a basis corresponding to the decomposition ⟨1⟩ ⊕ V_{λ₁} ⊕ V_{λ₂}. Then the trace is
    Tr T_G = d + d₁λ₁ + d₂λ₂ = 0.
We also have
    d₁ + d₂ = d²,
since dim V_{λ₁} + dim V_{λ₂} = dim 1^⊥. Thus we can solve for d₁, d₂. In the pentagon, we get d₁ = d₂ = 2; in the Petersen graph, we get d₁ = 5, d₂ = 4.
What about other d? You can find that
    d₁ − d₂ = (d² − 2d)/√(4d − 3),
but this needs to be an integer! Thus either d = 2, or 4d − 3 must be a perfect square, i.e. 4d − 3 = (2m + 1)², which gives d = m² + m + 1. Thus d = 3 corresponds to m = 1 and d = 7 corresponds to m = 2, but m = 4 does not give integer dimensions. m = 7 is the next working candidate, with d = 57. We can show this by dividing d₁ − d₂ out by polynomial division to get P(m) + 15/(2m + 1), so 2m + 1 | 15, which implies the desired.
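To make this concrete for the Petersen graph (an added check): d = 3 gives λ₁ = 1, λ₂ = −2, and the two equations read
    d₁ + d₂ = 9,   3 + d₁ − 2d₂ = 0,
whose solution is indeed d₁ = 5, d₂ = 4.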
23 10/27/17
Today: matrices (eww), Schur and normal operators, quadratic forms.
Consider a linear transformation T : V → W. In Chapter 5, we looked at linear transformations T : V → V; there you need to use the same automorphism on both sides, or in matrix notation we must pair each change of basis with its inverse. Thus we want g_V T g_V^{−1}.
Consider the completely general setting f : X → Y. Both X and Y have various automorphisms. f is equivalent to f′ under automorphisms if the square with f on top, f′ on the bottom, and g_X, g_Y on the two sides commutes, or in equations f′ = g_Y f g_X^{−1}. If we let X = Y, we then get f′ = g_X f g_X^{−1}, which gives us the equation we had before in the case of vector spaces. This construction is known as the conjugate of f by g_X (or by g_X^{−1}). We sometimes denote f^g = g^{−1}fg, which makes some sense since (f^g)^h = f^{gh}.
If you have an inner product space, an automorphism of it is more constrained, since it has to preserve the inner product. In particular, we require
xAv, Avy “ xv, vy
for all v if A P AutpV q (exercise: this implies xAv, Awy “ xv, wy). We know that
xAv, Awy “ xAT Av, wy “ xv, wy
which implies AT A “ id over the reals and AH A “ id over the complex numbers.
Definition 23.1.2. The set of matrices A such that AT A “ id or equivalently xAv, Awy “
xv, wy is called the set of orthogonal matrices OpV q or On .
Notice that in this case, the columns of A forming an orthonormal basis is equivalent to the rows of A forming an orthonormal basis. This is a useful punchline for mathematical proofs.
Definition 23.1.3. The set of matrices A such that AH A “ id or equivalently xAv, Awy “
xv, wy is called the set of unitary matrices UpV q or Un .
as desired. Now apply Schur's theorem and induct down to show that the off-diagonal terms are zero.
How do you prove the spectral theorem without invoking the fundamental theorem of
algebra? Consider xx, T xy where T “ T ˚ . In bases,
    ⟨x, Tx⟩ = ∑_i ∑_j x_i T_{ij} x_j.
Equality holds iff you are in the V_{λmax} eigenspace (and likewise for the lower bound). This gives us an idea of how to maximize or minimize the quantity ⟨x, Tx⟩/⟨x, x⟩, and once we do compactness and continuity next semester, we'll see that this can be completed into an independent proof of the spectral theorem, without the fundamental theorem of algebra.
24 10/30/17
Section/Math Night as usual.
Tomorrow: Math Table talk on the Monster groups by Brian Warner (55 alum from last
year).
No class on Friday.
For n = 1, we require a₁₁ ≠ 0. For n = 2, we require a₁₁a₂₂ ≠ a₂₁a₁₂. You can find that the solutions for both x₁ and x₂ have denominator a₁₁a₂₂ − a₂₁a₁₂. For n = 3, we get a more complicated condition:
    a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₃₂a₂₁ − a₁₁a₂₃a₃₂ − a₁₂a₂₁a₃₃ − a₁₃a₂₂a₃₁ ≠ 0.
Cramer’s rule tells you the relationship between the numerator and denominator of the
solutions.
We could write further cases out explicitly, but you could guess that the general pattern
is something like the following. Thus we can define:
Here, ε : S_n → {1, −1} is a nontrivial group homomorphism (there's only one possibility) and S_n is the symmetric group, or the group of permutations of n letters. We can now try to develop some properties of ε and det, which pins down the determinant uniquely and (hopefully) makes it reasonably easy to verify that det A ≠ 0 iff the v_i form a basis.
Highlights:
1. ε(στ) = ε(σ)ε(τ), ε(id) = 1, and ε(σ^{−1}) = ε(σ)^{−1} = ε(σ). The kernel of this homomorphism is A_n, the alternating group.
2. If σ is a simple transposition, i.e. σ(i) = i + 1, σ(i + 1) = i and σ sends everything else to itself, then ε(σ) = −1. S_n is generated by simple transpositions, so this determines ε (and A_n is a proper subgroup for n > 1). We do need to prove that this is a consistent definition of ε; we'll do this later.
We're now going to prove the results above, starting with the construction of ε. Our ideal definition of the determinant will be based on the exterior power. We'll show that
    dim ⋀^k(V) = (dim V choose k),
implying that ⋀^n(V) for n = dim V is one-dimensional. We can then construct ⋀^n(T), the wedge product of a linear transformation, which is a map from a one-dimensional space to itself and thus a scalar known as det T. We'll then show that that definition is consistent with the classical formula but also totally basis-independent.
Let's start by defining ε. For σ ∈ S_n, let
    ε(σ) = (−1)^N,
where N is the number of inversions, i.e. pairs (j, k) such that 1 ≤ j < k ≤ n but σ(j) > σ(k). We can do some messy combinatorics now to convince ourselves that this is equivalent to the definition we provided above on the generating set.
Proof. Count inversions. Composing with the simple transposition swapping i and i + 1 changes no inversions other than the one involving i and i + 1, which it toggles, so the inversion count changes by exactly one. Hence the negative sign.
This is similar to the bubble sort algorithm from computer science.
How do we show that this is a group homomorphism? Just write each permutation as a
product of transpositions and work from there. This reduces to addition of inversions mod
2, which is simple enough.
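As a quick computational sanity check of the inversion-count definition and its multiplicativity (a sketch I am adding, not from the lecture; permutations are written in one-line notation on {0, ..., n−1}):

    from itertools import combinations

    def sign(perm):
        # epsilon(sigma) = (-1)^N, where N counts pairs j < k with sigma(j) > sigma(k)
        inversions = sum(1 for j, k in combinations(range(len(perm)), 2) if perm[j] > perm[k])
        return (-1) ** inversions

    def compose(sigma, tau):
        # (sigma tau)(i) = sigma(tau(i))
        return tuple(sigma[tau[i]] for i in range(len(tau)))

    sigma = (1, 0, 2)   # a simple transposition, so sign -1
    tau = (1, 2, 0)     # a 3-cycle, so sign +1
    assert sign(sigma) == -1 and sign(tau) == 1
    assert sign(compose(sigma, tau)) == sign(sigma) * sign(tau)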
25 11/1/17
No class on Friday, but PS 8 problems 7 - 8 are due Friday at noon.
Definition 25.1.1. An F-algebra is a ring A with unity that contains a copy of F, i.e. F ⊂ A,
with the same ring operations. In particular, it’s a vector space over F. If this multiplication
operation is associative, we call it an associative F-algebra.
Let V be any F-vector space and let dim V = n. Then the tensor algebra is defined as
    ⊗^• V = ⊕_{i=0}^{∞} V^{⊗i}.
Here dim V^{⊗i} = n^i. This is clearly an F-algebra, with the multiplication operation being the tensor product.
The tensor algebra above is a graded F-algebra. Other examples include F (Ai “ 0 for
i ą 0), Frxs (Ai “ Fxi ), and Frx1 , . . . , xm s (Ai are the homogeneous polynomials of degree
i).
Given this, we can define a couple more spaces. The symmetric algebra is
    Sym^• V = ⊕_{i=0}^{∞} Sym^i V = ⊗^• V / Span(t ⊗ (v ⊗ w − w ⊗ v) ⊗ t′),
where t, t′ ∈ ⊗^• V and v, w ∈ V. This quotient ensures that all the cosets will be symmetric tensors. Note that we can simply perform this quotient on each graded part to get
    Sym^i V = ⊗^i V / Span(t ⊗ (v ⊗ w − w ⊗ v) ⊗ t′).
v ^ w “ ´w ^ v,
and v ^ v “ 0 (which should make sense from the condition you’re familiar with about
determinants). These two rules are almost equivalent to each other, aside from char F “ 2
in which case v ^ w “ ´w ^ v does not imply v ^ v “ 0. Thus we have a choice; which
condition do we want? We’ll use the stronger condition v ^ v “ 0 which is almost always
equivalent to v ^ w “ ´w ^ v.
The image of v ⊗ w in ⋀²V is v ∧ w. Observe that if V has basis e₁, ..., e_n, then V^{⊗2} has basis e_i ⊗ e_j for 1 ≤ i, j ≤ n and ⋀²V has basis e_i ∧ e_j for 1 ≤ i < j ≤ n, since all the others disappear (e_j ∧ e_i = −e_i ∧ e_j and e_i ∧ e_i = 0). How do we know that these are all linearly independent? Suppose that v = ∑_i a_i e_i; then
    v ⊗ v = ∑_i ∑_j a_i a_j e_i ⊗ e_j = ∑_i a_i² e_i ⊗ e_i + ∑_{i<j} a_i a_j (e_i ⊗ e_j + e_j ⊗ e_i).
Proposition 25.2.3 (Universal Property of the Wedge Product). The set of alternating bilinear maps f : V × V → X is isomorphic to the set of linear maps ⋀²V → X. Specifically, the canonical map T_univ : V × V → ⋀²V, (v, w) ↦ v ∧ w, is alternating and bilinear, and every alternating bilinear f factors uniquely as f = f̃ ∘ T_univ for a linear map f̃ : ⋀²V → X.
25.3 Determinants
We can now define the wedge product of linear transformations.
Now consider T : V → V and for now take dim V = 2. Then Te₁ = ae₁ + ce₂ and Te₂ = be₁ + de₂, so
    (⋀²T)(e₁ ∧ e₂) = (ae₁ + ce₂) ∧ (be₁ + de₂) = (ad − bc)(e₁ ∧ e₂).
This scalar is the canonical definition of the determinant of the matrix. Our condition above
then corresponds to det ST “ det S det T , so the determinant is multiplicative. We’ll now
need to generalize this further.
Note that ⋀⁰V = F, ⋀¹V = V, and ⋀²V is our definition from above. We find that v ∧ w = −w ∧ v as before, and elements of ⋀^iV and ⋀^jV anticommute if i and j are both odd and commute otherwise. The universal property above also extends to alternating multilinear maps, not just alternating bilinear maps.
As before, we find that the e_{i₁} ∧ ⋯ ∧ e_{i_j} for subsets {i₁, ..., i_j} ⊂ {1, ..., n} (taken with i₁ < ⋯ < i_j) form a basis for ⋀^j V. This is clearly a spanning set; to show that it is linearly independent, let A be the algebra with this basis. Now there is an obvious map ⊗^•V → A sending simple tensors to the appropriate wedge products; this map is alternating, so it descends to a map ⋀^•V → A that is at least surjective, but it must also be injective since we already know that the wedges span ⋀^•V. Thus we get our desired isomorphism and we have a basis.
In particular, dim ⋀^k V = (n choose k) and dim ⋀^n V = 1. Thus maps on ⋀^n V are scalar multiplication. We can define the n-fold wedge product of linear transformations just as above, so we get:
We will now leave it as an exercise to show that the formula from last time corresponds
to this definition of the determinant. But note that this is clearly basis-independent and
multiplicative and as an easy exercise, you can show that det T “ 0 iff T is not invertible
and that det id “ 1.
26 11/6/17
Tomorrow’s math table is by Rohil; section, math night, etc. as usual.
We’ll start with a digression about the 15 puzzle and then talk about group homomor-
phisms; we’ll then go back to signs and determinants.
The goal is to unjumble the puzzle into the standard final state, with tiles 1–15 in order and the empty square last. However, not all configurations are reachable from that state (equivalently, not all are solvable as initial configurations)! There are two possibilities: either you can solve it, or you will end up with a
configuration where the 15 and 14 are swapped. Mathematically, we want to show that all
configurations are reachable from these two possibilities and these two possibilities are not
reachable from each other.
The reason you can't always solve it is essentially the existence of the sign homomorphism. Showing that you can always get to one of these two forms amounts to deriving an algorithm, which is easier than for the Rubik's cube but still annoying. But you can see that the configurations you can attain constitute a subgroup
of S15 (assuming the empty square is in the same place). It contains the identity and the
inverse (just retrace your steps) and is closed under composition (just do both permutations
in sequence). The claim is that this subgroup is contained in A15 “ ker , where is the sign
map.
To prove this, go inside S16 (add the empty square). But we no longer have a subgroup
since we no longer have composition. However, if we wind up with the empty square back
where we started, we end up back in S₁₅ and in our original subgroup. There must be an even number of moves of the empty square (it has to return to its starting place in an even number of steps), so our permutation is a product of an even number of transpositions, meaning that we actually land in A₁₆ ∩ {σ | σ(16) = 16} ⊂ S₁₅. This is just A₁₅, as desired.
Why is the number of steps even? You can either note that every step the empty square takes in one direction must be matched by a step back in the opposite direction (so the total is even), or consider the checkerboard coloring, under which after an odd number of steps the empty square is always on the opposite color.
26.2 Digression: Group Homomorphisms
Consider a map between groups. Not surprisingly, we have:
Definition 26.2.1. A map φ : G Ñ H is a group homomorphism if φp1G q “ 1H , φpg ´1 q “
φpgq´1 , and φpgg 1 q “ φpgqφpg 1 q for all g, g 1 P G.
Definition 26.2.2. The kernel of a group homomorphism φ : G Ñ H is ker φ “ tg P
G | φpgq “ 1H u. It is clearly a subgroup (exercise).
We can now write a short exact sequence
    1 → ker φ ↪ G --φ--> H → 1
to reinterpret this information. But there is one major difference with vector spaces. If we
took any subspace U Ă V , we can always form a quotient space
0 Ñ U Ñ V Ñ V {U Ñ 0
to complete this exact sequence. This is not true for any subgroup of G! We can’t always
form a quotient group to complete
1 Ñ K Ñ G Ñ G{K Ñ 1.
Suppose we consider the set of cosets rgs “ gK Ă G. We want to define a multiplication
operation on this set of cosets. For vector spaces, we wrote pv1 ` U q ` pv2 ` U q “ v1 ` v2 ` U
which is well-defined any time you have an abelian group. But if you have a nonabelian
group, then it’s not necessarily true that
pgKqpg 1 Kq “ pgg 1 Kq
since we can't commute K with g′. Further, there are two different types of cosets (gK and Kg); gK is a left coset and Kg is a right coset.
Sn is not commutative, but we were able to have Sn {An , since one coset is all the even
permutations and the other is all the odd permutations. So clearly sometimes we can take
quotients, but there are plenty of cases where we can't. For example, S₂ ⊂ S₃ cannot form a quotient group. This motivates the following condition.
Definition 26.2.3. A normal subgroup is a subgroup K Ă G for which any of the
following equivalent conditions hold:
1. For all g, gK “ Kg.
2. For all g, gKg ´1 “ K (the conjugate of K by g, which is automatically a subgroup).
3. G{K exists as a quotient group.
Theorem 26.2.4. The conditions above in the definition of a normal subgroup are equivalent.
Proof. The equivalence of the first two should be obvious. For the last, just note that
pgKqpg 1 Kq “ gg 1 KK “ gg 1 K as desired.
Note that the kernel of any group homomorphism K = ker φ is automatically normal, as φ(gKg^{−1}) = φ(g)φ(K)φ(g^{−1}) = φ(g)φ(g^{−1}) = 1, so gKg^{−1} ⊂ K as desired.
26.3 Back to Determinants
We have constructed the determinant map det : GLpV q “ GLn pFq Ñ Fˆ (the multiplicative
group associated with a field F).
Definition 26.3.1. The special linear group SLpV q “ SLn pFq “ ker det.
27 11/8/17
27.1 Computing Determinants via Column Reduction
We’ve defined det T in terms of the exterior algebra, and would now like an easy way to
compute it in terms of matrices (and to connect it to more familiar definitions of the de-
terminant from more standard and lamer linear algebra courses). Our standard definition
immediately implies
detpdiagpa11 , a22 , . . . , ann qq “ a11 a22 . . . ann
(diag means a diagonal matrix with those entries). This is still true if you have an upper-triangular matrix; all the contributions from the entries above the diagonal cancel out since they produce wedge terms with a repeated factor e_i ∧ e_i.
We'll use these facts to compute the determinant in general, along with the fact that the determinant is multilinear and alternating in the columns. This is equivalent to the identity det(AB) = det A det B for certain special matrices B. We start with some easy properties.
2. We have
    det(v₁, ..., v_j + w, ..., v_n) = det(v₁, ..., v_j, ..., v_n)
for w ∈ Span(v₁, ..., v̂_j, ..., v_n) (the hat means v_j is omitted), where the proof is the same as above. This is equivalent to det(AB) = det A det B by letting A be anything and B be the identity matrix except in the jth column, which may have arbitrary coefficients a₁, ..., a_n in its off-diagonal entries (with 1 on the diagonal).
You can check that det B = 1. These matrices are closed under multiplication (as are the diagonal matrices), so you really only need to check the identity for a generating set, i.e. matrices with 1s on the diagonal and a single off-diagonal entry. This is known as column reduction.
We can use this to compute determinants. Use column reduction to make your matrix
upper or lower triangular by subtracting columns appropriately; this allows you to
compute the determinant easily using the property that we described above.
3. You can switch the ith and jth column by using the matrix B with bij “ bji “ 1, bii “
bjj “ 0 and otherwise the identity. This matrix has det B “ ´1, as we’d expect by the
exterior product. This allows you to switch columns up to sign.
This gives us an algorithm to compute determinants that works in roughly OpN 3 q, since
you have N elements per column and N 2 pairs of columns to compare. This is a signif-
icant improvement over the OpN !q that we had earlier from the explicit formula for the
determinant!
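In code, the column-reduction procedure looks roughly like the following sketch (my own illustration in plain Python, not something from the course; it assumes exact-enough arithmetic and returns 0 when no pivot column can be found):

    def det(matrix):
        """Determinant by column reduction (elimination on columns), roughly O(n^3)."""
        a = [row[:] for row in matrix]   # work on a copy
        n = len(a)
        sign = 1.0
        for i in range(n):
            # find a column j >= i with a nonzero entry in row i (pivot)
            pivot = next((j for j in range(i, n) if a[i][j] != 0), None)
            if pivot is None:
                return 0.0
            if pivot != i:
                # swapping columns i and pivot flips the sign of the determinant
                for row in a:
                    row[i], row[pivot] = row[pivot], row[i]
                sign = -sign
            # subtract multiples of column i from later columns to clear row i
            for j in range(i + 1, n):
                factor = a[i][j] / a[i][i]
                for row in a:
                    row[j] -= factor * row[i]
        result = sign
        for i in range(n):
            result *= a[i][i]     # the matrix is now lower triangular
        return result

    assert abs(det([[1.0, 2.0], [3.0, 4.0]]) - (-2.0)) < 1e-12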
Of course, if you have an eigenbasis or an upper-triangular basis already, you should use that. But finding such a basis is roughly as hard as this procedure, so that shortcut is not available in general.
Note that the fact that the determinant of A is unchanged under a change of basis is just multiplicativity: det(gAg^{−1}) = (det g)(det A)(det g)^{−1} = det A.
The matrices above are called elementary matrices, along with diagp1, . . . , a, . . . , 1q
which is sometimes called elementary. These allow us to reduce our matrix into something
that is easier to compute the determinant of since each elementary matrix has an obvious
determinant. Every invertible matrix can be written as the product of elementary matrices.
In terms of group theory, we say that the elementary matrices generate GLn pFq.
We have
    AB₁ ⋯ B_k = D,
where the B_i are elementary matrices and D is diagonal. We can thus write
    A = D B_k^{−1} ⋯ B₁^{−1},
implying that
    A^{−1} = B₁ ⋯ B_k D^{−1}.
Thus we can compute the inverse!
Similarly, taking transposes gives A^T = (B₁^{−1})^T ⋯ (B_k^{−1})^T D^T.
We claim that detpAT q “ detpAq. You can check this by checking it on the generators, which
is easy enough.
Alternatively, consider T* : V* → V*. Then we know that det T* = ⋀ⁿ(T*), which we want to connect with ⋀ⁿT. Note that ⋀ⁿ(V*) is not quite ⋀ⁿV! But we have a map ⋀ⁿ(V*) × ⋀ⁿV → F by evaluation, i.e. (v₁* ∧ ⋯ ∧ vₙ*, v₁ ∧ ⋯ ∧ vₙ) ↦ det(v_i*(v_j)) (which on matched dual bases is just v₁*(v₁)v₂*(v₂) ⋯ vₙ*(vₙ)), which gives us a map ⋀ⁿ(V*) → (⋀ⁿV)*. You can finish off this proof with a commutative diagram for brownie points from me.
27.3 Adj A and ⋀^{n−1}(V)
Recall the matrix Adj A, which relates closely to the inverse. We'll now connect it to ⋀^{n−1}(V). Recall that there's a canonical map V* × V → F and also an important map
⋀^{n−1}V × V → ⋀ⁿV ≅ F via wedging, i.e. (v₁ ∧ ⋯ ∧ v_{n−1}, v) ↦ v₁ ∧ ⋯ ∧ v_{n−1} ∧ v. This is a perfect pairing, which you can check by extending v to n linearly independent vectors and wedging the other n − 1 together.
We can now identify ⋀^{n−1}V ≅ V* ⊗ ⋀ⁿV by comparing these two maps. Applying T everywhere and using ⋀ⁿT = det T, you find that ⋀^{n−1}T = T* · det T for the appropriate interpretation of T*. Thus the adjugate ⋀^{n−1}T is equal to the determinant times the (suitably dualized) transpose, which proves the standard formula for the inverse of a matrix.
Next time, we’ll talk about positive definite matrices and signatures in terms of determi-
nants.
28 11/10/17
PS 10 due next Monday, November 20 (not next Friday)!
Now pick an eigenbasis. Then we see that v = ∑_i a_i e_i implies that
    ⟨Tv, v⟩ = ∑_i |a_i|² λ_i.
Thus our operator is positive semidefinite iff the λi ě 0 and positive definite iff λi ą 0. But
how do you determine if your transformation T is positive-definite?
Observe that
    det(xI − T) = ∏_i (x − λ_i)
(as a special case of the characteristic polynomial applied to the self-adjoint operator). This
determinant doesn’t depend on eigenbasis, so we can apply our formula to the matrix in
any basis and find roots of our polynomial (or approximate them numerically) and check
whether they are positive.
T being positive definite implies that xT u, uy ą 0 for all u P U given any subspace
U Ă V . This isn’t very exciting, but you can take any subspace, restrict the quadratic form
to that subspace, get the matrix and determinant, and this also has to be positive. This is
a much stronger condition since you can check on any subspace. It is also sufficient (check
on one-dimensional subspaces). Thus given a flag
0 “ V0 Ă V1 Ă . . . Ă Vn “ V
where each Vk has dimension k, you can check positive-definiteness on each Vk . You can also
compute the signature in this way by counting sign changes.
Consider the matrix
    [ 1    1/2  1/3 ]
    [ 1/2  1/3  1/4 ]
    [ 1/3  1/4  1/5 ].
You can verify that the sub-determinants here are positive, so this is a positive-definite
matrix (as are its obvious generalizations).
Proof. We want to show that if xT v, vy |Vk is already known to be positive-definite and det on
Vk`1 is positive, then xT v, vy |Vk`1 is also positive-definite. (This is essentially the necessary
induction step).
The signature of ⟨Tv, v⟩|_{V_{k+1}} is either (k + 1, 0) or (k, 1), since the form is already positive-definite on the k-dimensional subspace V_k. But the latter would imply a negative determinant, which is impossible. So we have the desired.
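This criterion is easy to test mechanically. Here is a small sketch (added by me, not from the notes) that computes the leading principal minors of the 3 × 3 matrix above with exact rational arithmetic:

    from fractions import Fraction

    def leading_minors(matrix):
        """Determinants of the k-by-k upper-left blocks, k = 1..n, via cofactor expansion."""
        def det(m):
            if len(m) == 1:
                return m[0][0]
            total = 0
            for j in range(len(m)):
                minor = [row[:j] + row[j+1:] for row in m[1:]]
                total += (-1) ** j * m[0][j] * det(minor)
            return total
        n = len(matrix)
        return [det([row[:k] for row in matrix[:k]]) for k in range(1, n + 1)]

    hilbert = [[Fraction(1, i + j + 1) for j in range(3)] for i in range(3)]
    print(leading_minors(hilbert))
    # [Fraction(1, 1), Fraction(1, 12), Fraction(1, 2160)] -- all positive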
29 11/13/17
Math table tomorrow: Sander Kupers (BP fellow) on knots and the spirit world.
This week: algebraic methods in combinatorics talks at CMSA.
Our last topics of linear algebra are Ch. 8, where we come back to one of the fundamental
questions of linear algebra T : V Ñ V , how do you classify such operators? We already said
that vector spaces are classified by their dimension and a linear map T : V Ñ W is classified
by the dimensions dim V, dim W and the rank. T : V Ñ V is more complicated; we already
have some information (the eigenvalues). If you can write V = ⊕_λ V_λ, i.e. T is diagonalizable, then we're done (we just need the eigenvalues). But this doesn't always work; F might not be algebraically closed, and even if it is, ⊕_λ V_λ might be strictly smaller, as for
    [ 0 1 ]
    [ 0 0 ].
The answer to this question if F is algebraically closed is Jordan normal form (which you should be familiar with from my section). If F is not algebraically closed, e.g. F = R, this becomes a pain.
(Recall the Cayley–Hamilton theorem: the characteristic polynomial q_T(x) = det(xI − T) satisfies q_T(T) = 0.) This is relatively obvious for diagonalizable transformations, since we have q_T(x) = ∏_λ (x − λ)^{dim V_λ}, which is clearly satisfied by T. We'll show that it is also true for non-diagonalizable transformations.
Why is this useful? One simple application is for n = 2; we have
    T² − (Tr T)T + (det T)I = 0,
so you can solve to get
    T^{−1} = ((Tr T)I − T)/det T
given det T ≠ 0. This corresponds to our usual formula for the inverse of a 2 × 2 matrix.
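Concretely (an added check): for T = [ a b ; c d ] we have Tr T = a + d, so
    (Tr T)I − T = [ d  −b ; −c  a ],
and dividing by det T = ad − bc recovers the familiar 2 × 2 inverse.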
We’ll give a proof over the complex numbers, but this is just a polynomial identity with
coefficients in Z. So this is true in the reals, or in any other field we like, once we reduce it
to this polynomial identity. This general principle (turning things into polynomial identities
and writing them over different fields) works in other examples, like proving that if A is
skew-symmetric and n is odd then det A “ 0. Proof: work over C using AT “ ´A and then
deduce it for all fields, even those with characteristic 2.
29.2 Nilspace and Generalized Eigenspaces
Let the algebraic multiplicity be the multiplicity of λ as a root of qT , or the number of
λs on the diagonal of the upper triangular form. Let the geometric multiplicity of λ be
dim Vλ . This indicates that the number of λs on the diagonal in the upper-triangular form
is an invariant.
We can show that the geometric multiplicity is > 0 iff the algebraic multiplicity is > 0. Moreover, the geometric multiplicity is always ≤ the algebraic multiplicity, so the same is true of the totals (which we already knew, since the geometric multiplicities add up to ∑_λ dim V_λ and the algebraic multiplicities add up to dim V). Equality of the totals occurs iff the algebraic and geometric multiplicities agree for every λ, i.e. iff the transformation is diagonalizable.
What happens if not? We need to replace Vλ by something similar. We’ll do this for λ “ 0
and then see how this works for general λ. Consider T : V Ñ V and define Vk “ ker T k . We
have
0 Ă V0 Ă V1 Ă . . .
as an increasing sequence of subspaces that is contained in V . Further, T : Vi`1 Ñ Vi for all
i. Further, any time Vk “ Vk`1 , we have Vk`1 “ Vk`2 and so on. Thus we say:
The union ∪_k V_k = ∪_k ker T^k is called the nilspace of T. Note: in section, I called this the eventual kernel. It is also sometimes denoted ker(T^∞). The sequence must stabilize within dim(nilspace) steps, so you can stop taking the union at the dim V step, or just take V_{dim V} = ker(T^{dim V}).
For F algebraically closed, we then get the decomposition
    V = ⊕_λ nilspace(λI − T).
Using this, you can prove Cayley-Hamilton and find a canonical form for any operator: you’ll
have all copies of each eigenvalue together and 1s or 0s on the superdiagonal and nothing
beyond that.
30 11/15/17
Announcement: the math department is holding a new seminar called the Open Neighbor-
hood Seminar with accessible math talks. The series is every other Thursday, talks from 4 -
5, snacks from 5 - 6. See http://math.harvard.edu/ons.
    0 ⊂ V₀ ⊂ V₁ ⊂ ⋯ ⊂ V_k ⊂ ⋯.
We can see that this sequence stabilizes at Vn . We may also define the opposing sequence
V “ Im T 0 Ą Im T 1 Ą . . . Ą Im T m Ą . . . .
Proof (that V = ker Tⁿ ⊕ Im Tⁿ for n = dim V). The dimensions match up by rank–nullity, so we need to check that the intersection is zero. The intersection is all vectors v such that Tⁿv = 0 and v = Tⁿw for some w. But T is an isomorphism on Im Tⁿ, so this implies v = 0.
Now work inside ker Tⁿ. T is nilpotent here, so we can find a basis in which the matrix of T is strictly upper-triangular, not just upper-triangular. Thus we have (T − λI)^{dim nilspace(T−λI)} = 0 on the corresponding nilspace. So we can decompose
Now you can induct, applying the same argument to each eigenvalue, until you get all the
generalized λ-eigenspaces.
Now we just need to put everything together. Suppose you take the direct sum of two linear transformations T₁ : V₁ → V₁, T₂ : V₂ → V₂. Then we have
    q_{T₁⊕T₂}(x) = q_{T₁}(x) q_{T₂}(x).
This can be easily verified using the exterior product or using the block-diagonal formula for the determinant. Now we just use the nilspace decomposition to get Cayley–Hamilton.
30.2 Computation Issues
We know that
    [ a b ; c d ]^{−1} = (1/(ad − bc)) [ d −b ; −c a ].
How do you compute this? ad, bc can be computed to 2N digits of accuracy if you are
given N digits of accuracy in a, b, c, d, but we are then subtracting and dividing. Dividing is
problematic since dividing by something close to 0 gets something huge, which can change
a lot with very small errors in the denominator. There is a field, numerical analysis, that studies the stability of doing operations like this on real numbers represented with finite precision.
For eigenspaces and eigenvalue decompositions this gets more annoying. If λ₁ and λ₂ are
very close to each other, it’s very difficult to tell whether they are exactly the same. So it’s
very difficult to distinguish between close eigenvalues and the same eigenvalue, which messes
greatly with attempts to compute the eigendecomposition.
Finally, there is the question of sparse linear algebra. This is about storing and computing
matrices for whom the vast majority of entries are 0s (e.g. Google’s PageRank algorithm).
You can compute the top eigenvector by iterating the matrix several times on a random vector, which gets you (approximately) the eigenvector with the largest eigenvalue, and then the next few eigenvectors by repeating on the orthogonal complements. You also have a notion of how an eigenvalue changes as certain entries change.
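The "iterate on a random vector" idea is simple enough to sketch directly; this is my own rough illustration (not code from the course), using the pentagon's adjacency operator from earlier as a test matrix:

    import random

    def power_iteration(matvec, n, steps=200):
        # Approximates the dominant eigenvalue/eigenvector of a symmetric operator.
        # Only matrix-vector products are needed, so the matrix can be stored sparsely.
        v = [random.random() for _ in range(n)]
        for _ in range(steps):
            w = matvec(v)
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        # Rayleigh quotient <v, Av> estimates the dominant eigenvalue
        return sum(x * y for x, y in zip(v, matvec(v))), v

    # Example: adjacency operator of the pentagon (5-cycle); its top eigenvalue is d = 2.
    neighbors = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
    matvec = lambda v: [sum(v[j] for j in neighbors[i]) for i in range(5)]
    eigenvalue, _ = power_iteration(matvec, 5)
    print(round(eigenvalue, 3))   # approximately 2.0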
31 11/17/17
Thanks to Rohil for taking these notes when I was out of class.
Today we are starting representation theory. Our references are Chapter 9 of Artin’s
Algebra and Chapter 1 of Fulton and Harris’ book.
• V “ t0u is a representation.
Definition 31.2.2. A subrepresentation of pV, ρq is a subspace U Ď V such that U is
G-invariant, i.e. ρpgqu P U for all g P G, u P U . The action of G on U is then the restriction
of the action of G on V .
Here’s an example from (9.6) in Artin. If G acts on a set S by permutations, then G acts
on the vector space F S linearly. This is sometimes called the permutation representation.
The most natural example is G “ Sn , S “ t1, 2, . . . , nu. Then G acts on F n by permuting
the standard basis e1 , . . . , en . Another example is where G is equal to the set S, in which case
F S is known as the regular representation of G. We will show later that the decomposition
of the regular representation into irreducible representations gives us all of the irreducible
representations of G.
We can also generalize quotient spaces.
32 11/20/17
Today: PS 7π/2. No class for the rest of the week, nor section, nor math table.
32.1 Characters
Characters are the main tool that we will use to understand the representations of finite
groups. As a reminder, a representation is a homomorphism ρ : G Ñ GLpV q given V {C finite-
dimensional and G a finite group. Given such a representation, we associate the character
χρ : G Ñ C such that χρ pgq “ Trpρpgqq. This is not necessarily a homomorphism. This
character carries a lot of information about your representation and can answer questions
about irreducibility, homomorphisms, duals, etc.
Here are some basic properties of the character:
2. dim V “ Tr id “ χρ pidG q. (We’ll occasionally use ρ to refer to the vector space itself
as well, so sometimes you might see dim ρ).
5. χρ1 ‘ρ2 pgq “ χρ1 pgq ` χρ2 pgq. This is immediate from the formula for the trace of a
direct sum.
32.2 Orthonormality of Characters
We have an inner product on C^G = {f : G → C} defined by
    ⟨f₁, f₂⟩ = (1/|G|) ∑_{g∈G} f₁(g) \overline{f₂(g)}.
The characters are class functions so we can consider their inner products.
by the usual formula. This is true because by Maschke’s theorem (shown last time), V can
be written as the sum of irreducibles, implying the above holds with the nχ real (and actually
nonnegative integers). Thus it doesn’t matter whether or not we take a complex conjugate.
Further, we can determine if V and W are isomorphic by splitting into direct sums
of irreducibles and then determining if they have the same number of each irreducible.
Alternatively, we could just determine if their characters are the same.
Finally, consider the regular representation V. We see that ⟨χ, χ_V⟩ = χ(1) = dim V_χ, so
    V = ⊕_χ (V_χ)^{⊕ dim V_χ}.
33 11/27/17
Makeup class next Monday. Saturday: Putnam exam (10 - 1 and 3 - 6, Science Center C,
be there by 9:30 to register). PS 11 out today, due next Wednesday.
Recall that a G-homomorphism is a linear map T : V → W such that ρ_W(g) ∘ T = T ∘ ρ_V(g) for every g ∈ G; equivalently, the square with T on the top and bottom and ρ(g) on the two sides commutes.
We already know by Maschke that V = ⊕_j V_j for some irreducibles V_j, and similarly W = ⊕_k W_k for irreducibles W_k. You can then check that Hom_G(V, W) = ⊕_{j,k} Hom_G(V_j, W_k). Further, ⟨χ_V, χ_W⟩ is bilinear in V and W (with respect to direct sums), since
    χ_{⊕_j V_j} = ∑_j χ_{V_j}
and all inner products are bilinear. Thus we can just prove our result for irreducibles. This leads us to the following.
Proof (of Schur's lemma: for non-isomorphic irreducibles V, V′ there are no nonzero G-homomorphisms V → V′, and over C every G-endomorphism of an irreducible V is a scalar). The first part is relatively easy. Given T : V → V′, we know that ker T ⊂ V is a
subspace, and since T is a G-homomorphism, ker T is a subrepresentation. V is irreducible,
so ker T “ 0, V and similarly Im T “ 0, V 1 . If ker T “ V , then T “ 0. If Im T “ 0, then
T “ 0. If T ‰ 0, then ker T “ 0, Im T “ V 1 , implying that T is an isomorphism. But V ‰ V 1 ,
so we can have no isomorphisms T , so there are no nonzero G-transformations T : V Ñ V 1 .
For the second part, note that T P EndG pV q has an eigenvalue λ. kerpT ´ λIq ‰ 0 so we
must have kerpT ´ λIq “ V implying that T “ λI.
If the field is not algebraically closed, we still know that EndG pV q is an algebra and it’s
still true that there are no eigenvalues in F if T is not a multiple of the identity. But this
algebra might be strictly larger.
This implies that two decompositions of V into irreducibles have the same multiplicities.
Note that dim HomG pV, V0 q for V0 irreducible is the number of appearances of V0 in the
decomposition of V into irreducibles, by what we’ve shown above. So without knowing
anything about the character, we can compute how many times V0 has to appear in any
decomposition.
We claim that this is the number of times that the trivial representation occurs in V in any
decomposition, i.e.
xχV , 1y “ dim V G
where V G is the space of G-invariants, or vectors v P V such that for all g P G we have
gv “ v. This formula holds any time char F does not divide |G|.
Proof. Let P ∈ End(V) be defined by
    P = (1/|G|) ∑_{g∈G} ρ(g).
and similarly ρ(h)P = P. Thus P sends an element v to the average of the elements in its orbit, where the orbit is just the set of ρ(g)v for g ∈ G. Note that P² = P since
    P² = (1/|G|²) ∑_{g,h} ρ(gh) = (1/|G|) ∑_g ρ(g) = P,
as desired.
Thus P has eigenvalues 1 and 0 and we can decompose V “ V1 ‘ V0 since there are no
generalized eigenvectors. Further, V1 “ V G so
    dim V^G = Tr P = (1/|G|) ∑_{g∈G} Tr ρ(g) = (1/|G|) ∑_{g∈G} χ_V(g)
as desired.
This proves the orthonormality formula for the case that one representation is the trivial representation. We'll now want to generalize this by describing the action on the dual representation. Observe that
    Hom_G(V, W) = Hom(V, W)^G
if you are careful with how G acts on Hom(V, W). Given T : V → W, define the action of g by
requiring the square with T on top, g · T on the bottom, and the action of g on the two sides to commute, which implies that
    g · T = g ∘ T ∘ g^{−1}.
If we have g ¨ T “ T , then T is a G-homomorphism so T P HomG pV, W q. Thus HompV, W q
is a representation.
Another way of thinking of this is by noting that HompV, W q » V ˚ b W . We’d like to
define an action of G on V ˚ by taking the dual of ρpgq, but they don’t compose in the right
order. Thus we define the action of g as the dual of the action of g ´1 , since inverses multiply
in the reverse order. Thus the dual representation or contragredient representation is ρ*(g) = (ρ(g^{−1}))*. With that, Hom(V, W) as a G-representation is the same as V* ⊗ W with the usual tensor product and dual representations.
Now,
34 11/29/17
34.1 Group Actions
We already know that maps ρ : G Ñ GLpV q are called representations and ϕ : G Ñ
tpermutations of Su are called permutation representations. These are also known as group
actions, where G acts on S. Since this map must be a homomorphism, we require that ϕ(g₁g₂) = ϕ(g₁)ϕ(g₂) for all g₁, g₂ ∈ G, so
    ϕ(g₁g₂)(s) = ϕ(g₁)(ϕ(g₂)(s))
for all s ∈ S. It follows that ϕ(id)(s) = s and ϕ(g^{−1}) = (ϕ(g))^{−1}. Note: we often write g · s instead of (ϕ(g))(s).
We can now define an equivalence relation: s ∼ s′ iff there exists g ∈ G such that ϕ(g)s = s′. The orbit of s is its equivalence class, i.e. the set of all elements of the form ϕ(g)s for g ∈ G. Because this is an equivalence relation, S is the disjoint union of its orbits. We say that G acts transitively if S is a single orbit.
The stabilizer of s is the subgroup consisting of all g ∈ G such that ϕ(g)s = s. It is a subgroup since it contains the identity and is closed under multiplication and inverses. Stabilizers do not need to be normal subgroups. If G is finite, then g ↦ ϕ(g)s maps G onto the orbit of s. Every element of the orbit arises the same number of times, since ϕ(g)(s) = ϕ(g′)(s) implies ϕ(g′^{−1}g)(s) = s, so g′^{−1}g ∈ Stab(s) and g′ ∈ g Stab(s). These cosets all have the same size, so each element arises the same number of times; the map is |Stab(s)|-to-1. This gives us:
Lemma 34.1.1 (Orbit-Stabilizer). |G| “ | Stabpsq| | orbit |.
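For example (an added illustration): the rotation group of a cube acts transitively on its 8 vertices, and the stabilizer of a vertex consists of the 3 rotations about the long diagonal through it, so |G| = 3 · 8 = 24.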
Further, we have the following (similar to Cayley’s theorem that any group is a subgroup
of the permutation group):
Theorem 34.1.2. Any subgroup is the stabilizer of a transitive group action.
Proof. Given some subgroup H, g ÞÑ gH is a transitive action on the cosets gH with
stabilizer H.
So we get (something that I’m still unconvinced is big enough to be a theorem):
Theorem 34.1.3 (Lagrange). Any subgroup has order dividing the order of the group: |H| divides |G|.
implying that
ρpgqf “ f ˝ pϕpgq´1 q.
Note the important inverse sign.
Example 34.2.1. Consider G “ S3 . This is the smallest nonabelian group. Characters are
class functions, and conjugacy classes in Sn correspond to cycle structures (exercise). The
cycle structure is just the number/size of cycles in?the cycle decomposition of Sn . These
correspond to partitions of n, which grow like eC n (per Ramanujan’s formula). For 3,
fortunately, we have 3 partitions 3 “ 1 ` 1 ` 1 “ 2 ` 1 “ 3. The first is the identity (1), the
second is simple transpositions (3), and the last is 3-cycles (2).
We want to determine the character table. See Table 34.2.1. We always have a trivial
character χ₁. The next character is the sign, which we discussed in connection with the determinant; it is the character of a genuine representation since S₃ → {±1} → GL(C) by sending 1 ↦ 1, −1 ↦ −1. We know that the rows of the character table have
to be orthonormal and there can be at most 3 of them. We are looking for some other
representation, and we can guess what it has to be by orthonormality. Thus we get χ2 .
But we know where this representation comes from. We clearly have a map φ : G Ñ
tpermutations of t1, 2, 3uu. The character is the number of fixed points p3, 1, 0q and it has a
copy of χ1 in it, so subtracting that off you get χ2 .
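For reference, here is the standard character table of S₃ (the content of the table referred to above; columns are indexed by conjugacy classes, with class sizes in parentheses):

                     id (1)   transpositions (3)   3-cycles (2)
    χ₁ (trivial)       1              1                 1
    χ_sign             1             −1                 1
    χ₂ (2-dimensional) 2              0                −1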
Artin goes through a number of other examples including the icosahedral group (group
of symmetries of the icosahedron). We’ll go back to some of the fundamental results of
representation theory.
Thus you can interpret a lot of what we're doing in terms of modules over C[G]. For example, the regular representation is just C[G] acting on itself by multiplication. Any subrepresentation inside C[G] is a left ideal (an irreducible one if the representation is irreducible).
We can compute the center Z(C[G]) of this group algebra (the center is the set of elements that commute with everything in the algebra). We must have
    e_g (∑_h a_h e_h) = (∑_h a_h e_h) e_g
for all g. This implies that a_{gh} = a_{hg} for all g, h. Two elements can be written as gh and hg (for suitable g, h) iff they are conjugate, thus g ∼ g′ implies that a_g = a_{g′}. Thus g ↦ a_g defines a class function, so Z(C[G]) is the set of class functions. Thus characters form a basis of the center of the group algebra.
Theorem 34.3.1. The characters comprise an orthonormal basis for the set of class func-
tions.
Proof. Suppose that ϕ : G Ñ C is a class function that is orthogonal to all characters. We’ll
show that ϕ “ 0. This implies the desired, since we already have orthonormality and thus
linear independence of our characters, and if our characters did not span we could take the
orthogonal complement and find such a ϕ.
Let (V, ρ) be any representation. Consider the map T : V → V given by
    T = (1/|G|) ∑_g ϕ(g) ρ(g).
ϕ is a class function, so it commutes with everything: (1/|G|) ∑_g ϕ(g) e_g ∈ Z(C[G]), so Tρ(h) = ρ(h)T for all h ∈ G. Thus T is a G-endomorphism. Suppose V is irreducible, so T = cI where c = Tr T / dim V. However,
    Tr T = (1/|G|) ∑_{g∈G} ϕ(g) χ(g) = 0
35 12/1/17
Final exam: December 8 to 11. Handed out around noon and due in the evening (exact time
TBD).
Today: representation theory finale.
which by the above argument must be a projection W → ⊕_{V_i ≅ V} V_i. This thus extracts out all of the parts of W that are isomorphic to V; this argument works even in the infinite-dimensional case. This is called the ρ-isotypic subspace.
Note that, for example, End_G(V ⊕ V) = Hom_G(V ⊕ V, V ⊕ V) = M_{2×2}(C). If you know about algebraic integers, you may note that (|G|/dim V) · id must have eigenvalues that are algebraic integers by this argument, so we must have dim V | |G|.
Let the character table be T and let D be the diagonal matrix of conjugacy class sizes; we have
    (1/|G|) T D T^H = id.
Now conjugate both sides by T to get
    D T^H T = |G| · id,
and therefore
    T^H T = |G| D^{−1}.
This gives you inner products of columns with other columns and it appears we have another
orthogonality relationship, where the norm of each column is the group order divided by the
size of the conjugacy class. This is the size of the stabilizer under conjugation (since the orbit is the conjugacy class), i.e. the set of all h such that h^{−1}gh = g, or gh = hg: all h that commute with g. This is the centralizer of g.
Thus we have:
Theorem 35.2.1. For all g, h ∈ G,
    ∑_χ χ(g) \overline{χ(h)} = 0 if g is not conjugate to h, and
    ∑_χ χ(g) \overline{χ(h)} = |centralizer of g| if g is conjugate to h.
36 12/4/17
Today: review session + last Math Night.
Final exam handed out at noon in the 4th floor common room on Friday; due next Monday.
v “ px ´ x1 q . . . px ´ xd q “ Apx ´ x1 q . . . px ´ xd q
in FF since they disagree at most at the xi where they are both 0. This equation is vx P pxq “
Apxq for x P F which is another system of linear equations for some P P Pe´1 . The problem
required you to show that you could solve this linear equation as soon as d was the number
of errors (and not before).
The procedure is as follows. Try it for 0 errors; if it works you’re done. If not, try it with
1 error where vx px ´ x1 q “ Apxq for Apxq has degree q ´ 2e ` 1. This allows us to recover
the original polynomial. If that doesn’t work, try vx px2 ´ ax ` bq “ Apxq and solve those
linear equations. e ă q so in at most e attempts to solve simultaneous linear equations, you
will find the error-detecting polynomial and can evaluate your polynomial.
Proof (that the order of any element g divides |G|). Consider the subgroup generated by g, i.e. H = ⟨g⟩ = {1, g, g², ...}. Now apply Lagrange's theorem.
Proof (that any group of prime order p is cyclic). There exists g ≠ 1, so the order of g is not 1. Thus the order of g is p = |G|, implying G is cyclic.
Theorem 36.2.3 (Cauchy's Theorem). If p | |G|, then there exists g ∈ G such that g has order p.
Theorem 36.2.4 (Sylow's Theorems). If p^k | |G|, then there exists H ⊂ G such that |H| = p^k. If |G| = p^e m with m ≢ 0 (mod p), then such an H with |H| = p^e is called a p-Sylow subgroup. Any two such subgroups are conjugate in G, and any p-power subgroup is contained in one of them. The number of p-Sylow subgroups is congruent to 1 (mod p).
The latter is ±1 (mod p) and the former is 1 (mod p). Sylow's theorem promises that G has a subgroup of order q^{(n²−n)/2}. These are the unipotent matrices (upper-triangular matrices with 1s on the diagonal), which form the unipotent subgroup.
Proof Outline of Sylow. If |G| = p^e with e > 0, G can be partitioned into conjugacy classes. Each conjugacy class is an orbit of the conjugation action, and the possible orders of these orbits are 1, p, p², ..., p^{e−1}. In particular, 1 has to occur at least p times, by the same argument as before. A conjugacy class of size 1 consists of a group element g such that hgh^{−1} = g for all h, i.e. an element of the center. Thus Z(G) ≠ 1 and has size a multiple of p. Thus there is a nontrivial normal subgroup of G. Now consider G/Z(G). Apply the same argument and keep quotienting out. This gets you subgroups of every p-power order.
To show the existence of Sylow subgroups, consider the subsets of G of size p^e. There are \binom{p^e m}{p^e} of these, which is not a multiple of p (Lucas's theorem). If G acts on these subsets, there is an orbit whose size is not divisible by p. The stabilizer of any element in this orbit has size p^e and is thus our desired p-Sylow subgroup.
Note that p-groups are not very nice. Abelian p-groups are pretty nice; they're just ⊕(Z/p^i Z), and we can classify them by partitions. But non-abelian p-groups are not nice. Suppose we have
    1 → (Z/pZ)^a → G → (Z/pZ)^b → 1.
Let V = (Z/pZ)^a, W = (Z/pZ)^b. You get a map ∧²W → V by pulling back g ↦ \tilde g, h ↦ \tilde h and forming the commutator g, h ↦ [\tilde g, \tilde h] = \tilde g \tilde h \tilde g^{−1} \tilde h^{−1}. You can check that this is actually a linear map ∧²W → V. The vector space of such maps has dimension a · \binom{b}{2}. This is maximized for a = n/3, b = 2n/3, where it is roughly 2n³/27 if n = a + b. Thus the number of groups of order p^n is around p^{2n³/27 − O(n²)}, which is way, way more than p^n.
Index
general linear group, 69
generalized λ-eigenspace, 86
geometric multiplicity, 86
graded, 74
graphs, 66
group, 9
group actions, 96
group algebra, 97
group homomorphism, 24, 78
group of general linear transformations, 39
nullity, 27
nullspace, 27
orbit, 95, 96
order, 91, 101
orthogonal, 60
orthogonal basis, 61, 63
orthogonal direct sum, 61
orthogonal matrices, 70
orthonormal basis, 61, 63
tensor algebra, 74
tensor product, 52
tensor product of linear transformations, 54
trace, 56
transitively, 96
transpose, 38
trivial representation, 89
unipotent matrices, 102
unipotent subgroup, 102
unit, 79
unit disk, 59
unitary matrices, 70
unitary representation, 90
universal property, 26
upper triangular matrix, 42
upper-triangular matrix, 50
vector space, 12
vectors, 12
vertices, 66
wedge product, 75
wedge product of linear transformations, 75
zero, 47
zero vector space, 13