
Math 55a Class Notes

Vikram Sundar
December 4, 2017

Contents

1 8/30/17 5
1.1 Introduction to Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Why Linear Algebra? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Mechanics for the Course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Blackboard Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 9/1/17 8
2.1 Motivation: Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Groups, Rings, and Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3 9/6/17 12
3.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2 Constructing Vector Spaces: Direct Sums and Subspaces . . . . . . . . . . . 13

4 9/8/17 15
4.1 Characteristic of a Field, Homomorphisms, and Ideals . . . . . . . . . . . . . 15
4.2 Span and Linear Combinations . . . . . . . . . . . . . . . . . . . . . . . . . 16

5 9/11/17 18
5.1 Span and Finite-Dimensional . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
5.2 Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

6 9/13/17 21
6.1 Spanning Sets and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
6.2 Linear Independence and Bases . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.3 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

7 9/15/17 24
7.1 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
7.2 The Space of Linear Transformations . . . . . . . . . . . . . . . . . . . . . . 26

8 9/18/17 27
8.1 Kernel and Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
8.2 Rank-Nullity Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
8.3 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

9 9/20/17 31
9.1 Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
9.2 Quotient Spaces and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
9.3 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

10 9/22/17 34
10.1 Duality of Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
10.2 Duality of Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . 34
10.3 Exact Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
10.4 Digression: Category Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

11 9/25/17 37
11.1 Finishing Up Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
11.2 Duality and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
11.3 Linear Transformations from One Vector Space to Itself . . . . . . . . . . . . 39

12 9/29/17 41
12.1 Motivation for Eigenstuff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
12.2 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
12.3 Eigenspaces and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . 42
12.4 Quiz Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

13 10/2/17 44
13.1 Review: Invariant Subspaces and Eigenvectors . . . . . . . . . . . . . . . . . 44
13.2 Linear Independence of Eigenvectors . . . . . . . . . . . . . . . . . . . . . . 44
13.3 Existence of Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

14 10/4/17 47
14.1 Remainder and Factor Theorems . . . . . . . . . . . . . . . . . . . . . . . . 47
14.2 Algebraically Closed Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
14.3 Field Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

15 10/6/17 50
15.1 Finishing Up Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
15.2 Upper-Triangular Matrices and Flags . . . . . . . . . . . . . . . . . . . . . . 50
15.3 Preview: Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

16 10/11/17 52
16.1 Definition of the Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . . 52
16.2 Properties of the Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . 53
16.3 Universal Property of the Tensor Product . . . . . . . . . . . . . . . . . . . 54

17 10/13/17 55
17.1 Review: Tensor Products of Linear Transformations . . . . . . . . . . . . . . 55
17.2 Bilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
17.3 The Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
17.4 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

18 10/16/17 57
18.1 Inner Products over Real Vector Spaces . . . . . . . . . . . . . . . . . . . . . 57
18.2 Inner Products over Complex Vector Spaces . . . . . . . . . . . . . . . . . . 58
18.3 Equivalence of Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

19 10/18/17 60
19.1 Inner Products over the Complex Numbers . . . . . . . . . . . . . . . . . . . 60
19.2 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
19.3 Orthogonal Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

20 10/20/17 62
20.1 Orthogonal Complements, Continued . . . . . . . . . . . . . . . . . . . . . . 62
20.2 Orthogonal and Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . 62

21 10/23/17 64
21.1 Adjoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
21.2 Self-Adjoint Transformations and Orthonormal Eigenbases . . . . . . . . . . 64

22 10/25/17 66
22.1 Moore Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
22.2 Introducing Linear Algebra to Graph Theory . . . . . . . . . . . . . . . . . . 66
22.3 Characterizing Moore Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 67

23 10/27/17 69
23.1 Matrix Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
23.2 Spectral Theorem with Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 70

24 10/30/17 72
24.1 Introduction to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . 72

25 11/1/17 74
25.1 Graded Associative Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
25.2 The Wedge Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
25.3 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
25.4 The Exterior Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

26 11/6/17 77
26.1 Digression: The Fifteen Puzzle . . . . . . . . . . . . . . . . . . . . . . . . . . 77
26.2 Digression: Group Homomorphisms . . . . . . . . . . . . . . . . . . . . . . . 78
26.3 Back to Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

27 11/8/17 80
27.1 Computing Determinants via Column Reduction . . . . . . . . . . . . . . . . 80
27.2 Exterior Powers of Duals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
27.3 Adj A and $\bigwedge^{n-1}(V)$ . . . . . . . . . . . . . . . . . . . . . . . . . 81

28 11/10/17 83
28.1 Motivation for Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . 83
28.2 Positive-Definite Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
28.3 Characteristic Polynomial and the Cayley-Hamilton Theorem . . . . . . . . 84

29 11/13/17 85
29.1 The Characteristic Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . 85
29.2 Nilspace and Generalized Eigenspaces . . . . . . . . . . . . . . . . . . . . . . 86

30 11/15/17 87
30.1 Proof of Cayley-Hamilton . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
30.2 Computation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

31 11/17/17 89
31.1 Representation Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
31.2 Subrepresentations and Quotient Representations . . . . . . . . . . . . . . . 89
31.3 Mapping Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

32 11/20/17 91
32.1 Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
32.2 Orthonormality of Characters . . . . . . . . . . . . . . . . . . . . . . . . . . 92

33 11/27/17 93
33.1 Orthogonality of Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
33.2 Schur’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
33.3 Proving Orthonormality of Characters . . . . . . . . . . . . . . . . . . . . . 94

34 11/29/17 96
34.1 Group Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
34.2 Permutation Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
34.3 Group Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

35 12/1/17 99
35.1 Projection Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
35.2 Other Orthogonality Relations . . . . . . . . . . . . . . . . . . . . . . . . . . 99
35.3 Group Ring Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

36 12/4/17 101
36.1 Problem Set 3 Problem 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
36.2 Finite Group Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

1 8/30/17
Course website: math.harvard.edu/~elkies/M55a.17.
Instructor: Professor Noam Elkies ([email protected]). Office hours: Lowell dining hall at the Inn at Harvard, Tuesday 7:30 - 9:30 or by email appointment. Course Assistants: Vikram Sundar (me, [email protected]) and Rohil Prasad (prasad01@college.harvard.edu).
Problem Set 1 is due on September 8.

1.1 Introduction to Math


The material in Math 55ab is largely the same as in Math 25ab; we just do things more
deeply than in Math 25ab. Thus Math 55ab is not just an introduction to abstract
algebra and real/complex analysis but rather an introduction to the way math is done at
the undergraduate and graduate level.
We’ll begin with an overview of math itself. Many of you started by learning forms of
arithmetic and then built up to geometry, algebra, and single-variable calculus (officially,
calculus is not a prerequisite for this course; we develop everything from first principles).
In the undergraduate curriculum, we have higher algebra (starting with linear algebra),
geometry (differential geometry, algebraic geometry, topology), and a lot of ways to go
beyond that (including something called arithmetic). Math 55ab sits roughly in the area
between calculus and higher algebra, but we will hint at other parts of math as well.
Other people divide math up into algebra, analysis, and geometry. Algebra includes
linear and abstract algebra, analysis starts from calculus, and geometry includes topology
(which we will do some of). These are not separate; you might teach them separately, but
they interact. The algebra part will be in 55a and the analysis/geometry part will be in
55b, which will assume some of the algebra we do in 55a. Thus it’s not a good idea to take
25a/55b.
There’s another notion that math is a synergy between problem solving and system
building. In the course of solving problems, you can find the same obstacles which leads to
system building. In math classes, system building is largely what we do in class and problem
solving is what you do on the problem set, but there are links between the two.

1.2 Why Linear Algebra?


Another method is dividing math between linear algebra and everything else (which we re-
duce to linear algebra). Linear algebra is a remarkably useful tool and questions that start
out in other fields can end up in linear algebra by associating some canonical linear struc-
ture. Linear algebra also has applications far beyond math (economics, computer science,
chemistry, etc.). Note: if you’re interested in applications of linear algebra and group theory
to chemistry, come talk to me.
Let’s look at how linear algebra is related to differential and integral calculus. Given some
x0 P R, our function f must be defined in a neighborhood U Ă R Ñ R. The neighborhood
here is already a topological notion. Differentiability implies that there exists some number

5
f 1 px0 q such that
f px0 ` hq “ f px0 q ` hf 1 px0 q ` op|h|q
as h Ñ 0. This op|h|q is a notion in metric topology. If we try to generalize this to n-tuples
of real numbers (or vectors), we have x0 P Rn , U Ă Rn a neighborhood of x0 , and f ; U Ñ Rk .
Now our function is differentiable if

f px0 ` hq “ f px0 q ` f 1 px0 q ¨ h ` op|h|q

where f 1 px0 q is now a linear transformation Rn Ñ Rk and h is some n-tuple of real numbers.
This is already a notion of linear algebra that plays a critical role in differential calculus.
In integral calculus, there is a famous change of variables formula
$$\int_{y(x_1)}^{y(x_2)} f(y)\,dy = \int_{x_1}^{x_2} f(y(x))\,y'(x)\,dx.$$

What happens to this in multiple dimensions? Consider y : Rn Ñ Rn . There is again a
change of variables formula, but now the derivative y 1 pxq is a linear transformation, so it cannot simply
multiply the integrand. We still have f : Rn Ñ R, so we need a number: we substitute det y 1 pxq in place of y 1 pxq. Thus
the determinant is necessary for change of variables in multivariable calculus.
Higher derivatives also require algebra. The second derivative becomes a quadratic func-
tion, i.e. a quadratic form, which can be naturally described by tensor algebra. An extremal
point can be characterized by the second derivative. Proving this will require Rolle’s theorem
and compactness.
This is an overview of linear algebra that will be applied in analysis during 55b.

1.3 Mechanics for the Course


There is no textbook that closely parallels the class. Linear Algebra Done Right
by Axler is fairly close to the material we will cover, in that it avoids choosing coordinates and
using determinants. The third edition uses quotient spaces, which we will need. This will be our
main textbook, but we will supplement it with notes on the website. Towards the end of the
class, we will do representation theory, which will require Artin's textbook Algebra. You can
get a PDF for free on the Springer website if you would like.
Please learn how to TeX. It will make me happy.
Most of your grade depends on the problem sets. Mathematics is not competitive and
not similar to math competitions that many of you may be familiar with. You are allowed
and encouraged to collaborate on problem sets with other students in the class. You should
not ask for solutions from your CAs or other students or look up solutions online. You
must write up everything you come up with yourself without the help of your peers and
acknowledge your collaborators.
The final exam is take-home and on your own (without sources). There will be at least
one brief quiz to help you decide whether you’re taking 55 or 25 before the add-drop deadline.
This will be graded and count for ε of your grade (something small).

1.4 Blackboard Notation
On the blackboard we typically use blackboard bold R to stand in for a boldface R for the real numbers; in print you can also use a regular
boldface R. Note that C denotes the complex numbers, Q the rational numbers (for
quotients), and Z the integers (from the German Zahlen). Axler often uses F to refer to either R
or C (any field).

2 9/1/17
Course announcements: Monday, 8 - 10 PM is Math Night in Leverett dining hall starting
this Monday. Professor Elkies will hold OH on Tuesday, 7:30 - 9 PM in Lowell dining hall.
Math Table in Mather small dining hall, 5:30 PM dinner and 6 PM talk.
We will be going in parallel to the textbook, not reproducing what is in the textbook
(which you should read on your own).

2.1 Motivation: Linearity


Today, we will cover linearity (one of the keys to linear algebra). We start with basic
mathematical notions that are so familiar that it is hard to believe how much nontrivial
math you can build on them! Linearity is almost intuitive because it’s what you are most
used to in terms of the basic rules of algebra and arithmetic, like
apu ` vq “ au ` av, pabqv “ apbvq.
You encounter the same rules in the context of vectors in higher-dimensional spaces and
in the context of functions, where we define addition and subtraction as
pf ` gqpxq “ f pxq ` gpxq, pf ´ gqpxq “ f pxq ´ gpxq.
Note that the similarity here is not surprising, as a vector pv1 , v2 , v3 q P R3 can be interpreted
as a function t1, 2, 3u Ñ R. We have the same rule for scalar multiplication
pa ¨ f qpxq “ a ¨ f pxq.
The same rules apply for differentiation and integration, e.g.
$$(f+g)' = f' + g', \quad (af)' = af', \quad \int (f+g) = \int f + \int g, \quad \int (af) = a \int f.$$

We even apply these rules in the wrong place, like


sinpx ` yq “ sin x ` sin y, px ` yq5 “ x5 ` y 5 !
We want to unify all of these patterns into one idea in a grand unification. We will
abstract out the basic axioms. An axiom today refers to something that we will assume in
some context, without considering whether or not it’s true in the real world. For example,
the parallel postulate might not be true in the real world, but we will assume it is true on
the Euclidean plane and test whether we are on a Euclidean plane by verifying the parallel
postulate.
Here, our operations on our vector space are how to add vectors, subtract them, and
multiply by a constant. When you add and subtract vectors, you must get another vector,
but the constants come from a field which must satisfy a different set of axioms. Combining
the axioms for a vector space and a field will tell us everything we need to know about a
vector space. While most of the time, we will only care about R and C, as long as we don’t
use anything beyond the field axioms, our proofs will work for any other field as well, like
Q or Z{pZ (p prime) or Qp (p-adic numbers) or Cpzq (rational functions in 1 variable), etc.
This is one reason why linear algebra is so powerful.

2.2 Groups, Rings, and Fields
Let’s start by defining the field axioms.

Definition 2.2.1. F is a field if we have two operations: addition (` : FˆF Ñ F which takes
pa, bq ÞÑ `pa, bq “ a ` b) and multiplication (¨ : F ˆ F Ñ F which takes pa, bq ÞÑ ¨pa, bq “ a ¨ b)
with the following axioms:

1. For all a, b, c P F, we have pa ` bq ` c “ a ` pb ` cq. (Associativity of addition)

2. There exists a distinguished element 0 P F such that a ` 0 “ 0 ` a “ a for all a P F.


(Additive identity)

3. There exists a map ´ : F Ñ F that maps a ÞÑ ´a such that a ` p´aq “ p´aq ` a “ 0.


(Additive inverse)

4. For all a, b P F, we have a ` b “ b ` a. (Commutativity of addition)

5. For all a, b, c P F, we have pa ¨ bq ¨ c “ a ¨ pb ¨ cq. (Associativity of multiplication)

6. There exists a distinguished element 1 P F such that a ¨ 1 “ 1 ¨ a “ a for all a P F˚ “


Fzt0u. (Multiplicative identity)

7. There exists a map ´1 : F˚ Ñ F˚ that maps a ÞÑ a´1 such that a ¨ a´1 “ a´1 ¨ a “ 1.
(Multiplicative inverse)

8. For all a, b P F, we have a ¨ b “ b ¨ a. (Commutativity of multiplication)

9. 0 ‰ 1. (Avoid one-element fields)

10. For all a, b, c P F we have a ¨ pb ` cq “ a ¨ b ` a ¨ c, pa ` bq ¨ c “ a ¨ c ` b ¨ c. (Distributive


property)
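As a quick sanity check (an example I am adding, not from lecture), the two-element set $\mathbf{F}_2 = \{0, 1\}$ with addition and multiplication mod 2 (so $1 + 1 = 0$) is a field: axioms 1-4 hold because addition mod 2 is associative and commutative, 0 is the additive identity, and each element is its own additive inverse; axioms 5-8 hold because the only nonzero element is 1, which is its own multiplicative inverse; axiom 9 holds since $0 \ne 1$; and axiom 10 (distributivity) can be checked directly on the eight triples $(a, b, c)$.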

Observe that pF, `q and pF˚ , ¨q are abelian groups. Dropping the commutativity assump-
tion gives us a group. We’ll talk more about groups (abelian and not) later, but permutation
groups are a simple example of nonabelian groups. For your convenience, I’ve written out
the definitions of a group and ring below, though we did not strictly cover this in class.

Definition 2.2.2. G is a group given an operation ¨ : GˆG Ñ G with the following axioms:

1. For all a, b, c P G, we have pa ¨ bq ¨ c “ a ¨ pb ¨ cq. (Associativity)

2. There exists a distinguished element 1 P G such that a ¨ 1 “ 1 ¨ a “ a for all a P G.


(Identity)
3. There exists a map ´1 : G Ñ G that maps a ÞÑ a´1 such that a ¨ a´1 “ a´1 ¨ a “ 1.
(Inverse)

Definition 2.2.3. G is an abelian group if it is a group and is commutative, i.e. for all
a, b P G we have a ¨ b “ b ¨ a.

Definition 2.2.4. R is a ring if it has two operations: addition ` : R ˆ R Ñ R and
multiplication ¨ : R ˆ R Ñ R with the following axioms:

1. There exists an element 0 and a map ´ : R Ñ R such that pR, `, 0, ´q is an abelian


group.

2. For all a, b, c P R, we have pa ¨ bq ¨ c “ a ¨ pb ¨ cq. (Associativity of multiplication)

3. There exists a distinguished element 1 P R such that a ¨ 1 “ 1 ¨ a “ a for all a P R˚ “


Rzt0u. (Multiplicative identity)

4. For all a, b P R, we have a ¨ b “ b ¨ a. (Commutativity of multiplication)

5. 0 ‰ 1. (Avoid one-element rings)

6. For all a, b, c P R we have a ¨ pb ` cq “ a ¨ b ` a ¨ c, pa ` bq ¨ c “ a ¨ c ` b ¨ c. (Distributive


property)

The difference between a ring and a field is that multiplicative inverses need not exist.

A simple application of these axioms is to solve an equation

a`x“b

for a, b, x P G, a group. We want to say that the answer is x “ b ´ a, but we have not
defined subtraction yet! Note the distinction between ´ : G Ñ G (the unary minus) and
´ : G ˆ G Ñ G (subtraction). To solve this, note that

a`x“b
p´aq ` pa ` xq “ p´aq ` b
pp´aq ` aq ` x “ p´aq ` b
0 ` x “ p´aq ` b
x “ p´aq ` b

which is a unique solution. Thus we have the following lemma.

Lemma 2.2.5 (Cancellation Lemma.). For a, x, b P G a group, we have

ax “ b ô x “ a´1 b.

We used associativity but not commutativity, so this holds in any group. For a field with
a ‰ 0, we have also shown that
ax “ b ô x “ a´1 b “ b{a.
In a non-commutative group, we have

xa “ b ô x “ ba´1 .

So in an arbitrary group, writing b{a does not make sense!
We will generally do linear algebra over commutative fields, so this is not an issue. The
only really relevant non-commutative field is the Hamilton quaternions.
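A concrete illustration (my example, not from class): in the symmetric group $S_3$, with composition applying the right factor first, take $a = (1\,2)$ and $b = (1\,3)$. Then $a^{-1} = a$, and
$$a^{-1} b = (1\,2)(1\,3) = (1\,3\,2), \qquad b a^{-1} = (1\,3)(1\,2) = (1\,2\,3),$$
so the equations $ax = b$ and $xa = b$ have different solutions, and an unadorned fraction $b/a$ would be ambiguous.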
Another useful lemma is as follows:

Lemma 2.2.6 (Uniqueness of Identity and Inverse). In any group G, the identity 1 and
inverse a´1 are unique.

This can be proven from Lemma 2.2.5 (Exercise!).

3 9/6/17
Problem Set 1 due at 11 AM sharp (I will start grading right after class, so don’t be late!)
If you want to enroll, please petition by 2:30 PM today to ensure that you can enroll by
tomorrow.

3.1 Vector Spaces


Given a field F , we can now define a vector space V over F as follows.

Definition 3.1.1. V is a vector space over a field F (or V {F ) if V is a set of vectors with
two operations (addition, i.e. ` : V ˆ V Ñ V, pv1 , v2 q ÞÑ v1 ` v2 , and scalar multiplication,
i.e. ¨ : F ˆ V Ñ V, pa, vq ÞÑ a ¨ v “ av) such that

1. pV, `q is an abelian group, i.e. there is a zero vector ~0 P V , an additive inverse


´ : V Ñ V, ´ : v ÞÑ ´v, and we have all the axioms we discussed last class.

2. For all v P V we have 1 ¨ v “ v (Multiplicative identity from the field).

3. For all a, b P F, v P V we have pabqv “ apbvq. (Associativity of scalar multiplication).

4. For all a, b P F, v P V , we have pa ` bqv “ av ` bv (Distributivity of multiplication over


scalar addition).

5. For all a P F, u, v P V , we have apu ` vq “ au ` av (Distributivity of multiplication


over vector addition).

Example 3.1.2. The canonical example of a vector space is Fn “ tpa1 , a2 , . . . , an q|ai P Fu,
with addition and scalar multiplication termwise. This is the vector space of n-tuples and
is the most common vector space in lower-level classes. For n “ 1, we just have the original
field F.

Notice that the vector space axioms are a proper subset of the field axioms; in particular, there's
no multiplicative inverse. Thus you can define an analogous structure over a ring R; these are called modules over R. We will need multiplicative inverses in the field for a number of our basic results (and module theory is much more complicated than
linear algebra for this reason), so we'll stick to fields for now.
We have some basic lemmas:

Lemma 3.1.3. 0v “ ~0.

Proof.
0v “ p0 ` 0qv “ 0v ` 0v
and we have already shown that there is only one element in an additive group that is the
sum of itself and itself (by adding the additive inverse). Thus 0v “ ~0.

Lemma 3.1.4. p´1qv “ ´v.

Proof.
p´1qv ` 1v “ pp´1q ` 1qv “ 0v “ ~0
so p´1qv is the additive inverse of v.
Lemma 3.1.5. a~0 “ ~0.
Proof.
a~0 ` a~0 “ ap~0 ` ~0q “ a~0, so a~0 “ ~0 by cancellation.

These lemmas are Proposition 1.29 and 1.30 in Axler.


The converse (proving that av “ ~0 implies a “ 0 or v “ ~0) is on the problem set.

3.2 Constructing Vector Spaces: Direct Sums and Subspaces


Example 3.2.1. The most trivial example of a vector space is V “ t0u which is a vector
space over any field. This is the zero vector space.
We have already constructed Fn and F above as vector spaces.
Looking at our fields, we notice that some are subfields of others, i.e. subsets that are
also fields like Q Ă R, R Ă C. This is a result of closure. This motivates the following
example.
Example 3.2.2. Consider a subfield F Ă F 1 . Then V “ F 1 is a vector space over F . Note
that we are throwing away some structure, specifically multiplication among elements of F 1
that are not in F .
For example, C is a R-vector space and is isomorphic to R2 , which we can illustrate with
the complex plane. Further, C and R are Q-vector spaces, but they are not isomorphic to Qn
for any n! This is because R is uncountable by Cantor diagonalization while Q is countable.
If you haven’t seen countability or Cantor diagonalization before, come talk to one of us.
We can also construct a new vector space from two previous ones V1 , V2 .
Definition 3.2.3. The direct sum of two F -vector spaces V1 , V2 is
V1 ‘ V2 :“ tp~v1 , ~v2 q | v1 P V1 , v2 P V2 u.
As a set, this is the Cartesian product V1 ˆ V2 . Addition and scalar multiplication are
done termwise, and the zero element is p~0, ~0q.
Some books (like Axler) call this an external direct sum, reserving the internal direct
sum for subspaces of the same vector space.
To build up Fn , all we have to do now is take the direct sum of F with itself n times.
We can also go farther and take the infinite direct sum to get infinite sequences F‘N . These
infinite sequences are just a map N Ñ F (as opposed to the finite version which is a map
t1, 2, . . . , nu Ñ F). Thus all we are doing is adding functions to get linearity. We use F X to
denote the set of all functions f : X Ñ F .
Suppose we have a subset U Ă V and we want U to inherit the vector space operations
from V . This is where we require a closed subset.

Definition 3.2.4. A subset U Ă V of a vector space is called a subspace when U is closed
under the vector space operations (i.e. ~0 P U ; v1 , v2 P U ùñ v1 ` v2 P U ; v P U, a P F ùñ
av P U ).

We don’t need additive inverses since p´1qv “ ´v.


The notion of what a subspace is depends on what field you're working over. For example,
R Ă C is a subspace as an R-vector space but not as a C-vector space.
If U1 , U2 Ă V are subspaces, then their intersection U1 X U2 is also a subspace. Proving
this is on the problem set (as is the extension to an arbitrary number of subspaces). This
does not work for the union.

4 9/8/17
Problem Set 1 is due now. Problem Set 2 has been posted.
You can drop down from 55 to 25 until the fifth Monday. Please do problem sets for only
one of the two classes if you’re still deciding.

4.1 Characteristic of a Field, Homomorphisms, and Ideals


In any field, you have the special numbers 0 and 1. We thus can consider standard names
for numbers 2 “ 1 ` 1, 3 “ 1 ` 1 ` 1, . . .. You can perform this inductively for both the
positive and negative integers to get a map h : Z Ñ F where hpn ` 1q “ hpnq ` 1F , hp0q “
0F , hp1q “ 1F , hp´nq “ ´hpnq.
We want a number of consistency conditions to hold, e.g. hp6q “ hp5q ` hp1q “ hp2q ¨ hp3q.
These turn out to be true if you apply the field axioms repeatedly (check this!).
We can formalize these axioms as follows:

Definition 4.1.1. A ring homomorphism h : R1 Ñ R2 is a map that satisfies the axioms


hpmq ` hpnq “ hpm ` nq, hpmqhpnq “ hpmnq, hp´nq “ ´hpnq, hp0q “ 0, hp1q “ 1.

Thus our map h : Z Ñ F is a ring homomorphism. There are similar notions of a


group homomorphism, field homomorphism, vector space homomorphism, etc. In general a
homomorphism in algebra is a map that preserves the algebraic structure of the underlying
object.
Notice that nothing that we have said so far allows you to deduce that hpnq ‰ 0 for any
n ‰ 0. In fact, hpnq “ 0 can happen for nonzero n, and the set of such n tells us how far this map is from being an injection.

Definition 4.1.2. An injection is a map f : A Ñ B where f pa1 q “ f pa2 q implies a1 “ a2 .

Note that hpmq “ hpnq is equivalent to hpn ´ mq “ 0. Thus the set of integers that map
to 0 tells us everything about whether or not this map is an injection.

Definition 4.1.3. For a map f : A Ñ B, the set ta P A | f paq “ 0u is known as the kernel
of f , denoted ker f .

In our case, the kernel is an ideal I (a term we define below). We can also state that
hpmq “ hpnq iff m ” n pmod Iq. The properties that this kernel satisfies motivate the following
definition.

Definition 4.1.4. An ideal I is a subset of a ring R such that:

1. For all x, y P I, we have x ` y P I.

2. 0 P I.

3. x P I, y P R implies that xy P I.

We have the following lemma:

Lemma 4.1.5. The kernel of a ring homomorphism h is an ideal.

The proof of this was outlined in class for the case that our ring homomorphism is
h : Z Ñ F to motivate the ideal axioms. I leave the general case to you as an exercise.
The key fact about the integers is that we understand their ideals completely. Any ideal
of the integers is of the form I “ t0, ˘g, ˘2g, ˘3g, . . .u for some g ě 0 (with g “ 0 giving the zero ideal). The proof
of this is also left as an exercise; it relies crucially on the Euclidean algorithm. In fact, the
set tag | a P Ru is an ideal for any g in any ring R (another exercise!). The special fact about
the integers in particular is that any ideal can be written in this form.
We can now define the characteristic with this setup.

Definition 4.1.6. The characteristic of a field F is the generator of ker h, i.e. 0 if h is


injective or g if g is the least positive integer in ker h.

You can define the characteristic of any ring in exactly the same way and the same proofs
will work (our statements were general to all rings, not just fields). There are fields with
positive characteristic, e.g. Z{2Z (the integers mod 2) with characteristic 2. But not every
integer can be the characteristic of a field! For example, Z{nZ has characteristic n but is
not always a field; if n is nonprime then there are elements without multiplicative inverses.
If p is a prime, we can check that Z{pZ is a field. There are a number of ways of doing
this (e.g. by Bezout). Fields with finitely many elements like this are called finite fields.
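A small worked example (mine, for concreteness): in $\mathbf{Z}/6\mathbf{Z}$, the element 2 has no multiplicative inverse, since $2 \cdot 3 = 6 \equiv 0$; if $2x \equiv 1$, multiplying both sides by 3 would give $0 \equiv 3$, a contradiction. By contrast, in $\mathbf{Z}/7\mathbf{Z}$ every nonzero element has an inverse; for instance Bezout gives $3 \cdot 5 - 2 \cdot 7 = 1$, so $3^{-1} = 5$.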
We have the following theorem:

Theorem 4.1.7. A number is the size of a finite field iff it is a prime power, and for each prime power there is a unique such finite field (up to isomorphism).

4.2 Span and Linear Combinations


There are various ways we can study new mathematical structures by describing ways of
getting new ones from ones that already exist. If V is a vector space over some field F, we
know that the intersection of any family of subspaces is a subspace (PS 1 problem).
What about if we had a subset? Let’s define the following:

Definition 4.2.1. The span of a subset S of a vector space V is the intersection of all
subspaces U Ă V such that S Ă U .

This is a subspace, and is in fact the smallest subspace that contains S. It is a very
general definition that works in the context of modules over commutative rings as well.
We can also consider the span as linear combinations of finite subsets of S.

Definition 4.2.2. A linear combination of vectors v1 , . . . , vn P V is any element v that


can be written as a1 v1 ` . . . ` an vn for a1 , . . . , an P F.

It’s easy to verify that the set of all linear combinations is a subspace of V (just check
addition and scalar multiplication). It is similarly easy to see that this is the smallest
subspace that contains the elements of S.
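For a concrete example (added here, not from lecture): in $\mathbf{F}^3$, take $S = \{(1,0,0), (0,1,0)\}$. Its linear combinations are exactly the vectors $(a, b, 0)$ with $a, b \in \mathbf{F}$; for instance $(2,3,0) = 2(1,0,0) + 3(0,1,0)$ lies in the span, while $(0,0,1)$ does not. This plane is the smallest subspace containing $S$, so it is span $S$.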
We will use the notation
$$\sum_{s \in S} a_s s$$
to denote a linear combination over the entire set S, whether or not S is finite. In order
to deal with infinite sets, we will require that $a_s = 0$ for all but finitely many $s \in S$ (which
makes this a finite sum). This condition is vacuous if S is finite.
What if S is the union of some number of subspaces U1 Y U2 Y . . . Y Un ? We can then
define our span as
$$\left\{ \sum_{i \in I} u_i \;\middle|\; u_i \in U_i,\ u_i = 0 \text{ for all but finitely many } i \right\}.$$

If U “ span S and S is finite, then we say that U is finitely generated. In the case of a
vector space, we say that U is finite dimensional. This refers to the fundamental invariant
of the space called the dimension that we'll discuss later. Note that there is no notion of
dimension in the case of a Z-module or a general module over a ring.

5 9/11/17
Announcements: My section is Monday (starting today), 1 - 2 PM. Room today is SC 112,
next week and beyond is SC 222.
OH tonight, 8 - 10 PM, at Math Night in Leverett dining hall.
Tomorrow: Math Table from 5:30 - 6:30 in Mather dining hall. Talk will be given by
Rosalie Belanger-Rioux on chaos. Elkies OH from 7:30 - 9:30 in Lowell dining hall.

5.1 Span and Finite-Dimensional


Recall the following definitions from last time.
Definition 5.1.1. A vector space V over a field F is finite-dimensional if there exists a
spanning set S with |S| ă 8.
Definition 5.1.2. A module M over a ring R is finitely generated if there exists a spanning
set S with |S| ă 8.
Today, we are going to start the theory of vector spaces by looking at linear independence
and dependence and covering what Axler calls the linear independence lemma.
Example 5.1.3. Some examples of finite-dimensional vector spaces.
1. Fm is finitely generated. The spanning set has m elements; the ith element (1 ď i ď m)
has a 1 in the ith coordinate and 0 elsewhere. These are unit vectors and often denoted
~ei . Then a general vector is
pa1 , . . . , am q “ a1 e1 ` a2 e2 ` . . . ` am em .

2. A finite direct sum of vector spaces Ui , i.e. U1 ‘ U2 ‘ . . . ‘ Un is finite-dimensional iff


the Ui are finite-dimensional. If the Ui are finite-dimensional, you can construct this
analogously to the previous example (0s in all but the ith coordinate where the ith
coordinate has the spanning set). This shows the backwards direction; the forwards
direction follows by projecting down onto each individual subspace (i.e. consider the
image of the spanning set under the map πi : ‘i Ui Ñ Ui which takes the n-tuple to its
ith coordinate. This is obviously still a spanning set, so we are done.)

3. Pm , the set of polynomials of degree at most m. (The polynomials of degree exactly


m are not a vector space). This is finitely generated by the set p1, t, t2 , . . . , tm q. The
space of all polynomials P “ Frts is not finitely generated. (Just consider the maximal
degree of your spanning set.)

4. The more general direct sum:


Definition 5.1.4. The direct sum ‘iPI Ui consists of I-tuples, or functions f : I Ñ
Yi Ui , i ÞÑ ui P Ui for all i such that all but finitely many i have ui “ 0.

We leave it as an exercise to show when this is finite-dimensional.

5.2 Linear Independence
The words finite-dimensional suggest the notion of a dimension dim V , which can be 0, 1, 2, . . . , 8.
We will show what exactly this dimension is over the next day or so, but remember that this
is an important consequence of being over a field as opposed to over a ring. Modules do not
have a notion of dimension!
We’ll start by discussing linear independence. A set S Ă V can be:

• A spanning set.
• A minimal spanning set/basis/maximal linearly independent set.
• A linearly independent set.

Here, we have:
Definition 5.2.1. A linearly independent set is a set S such that
$$\sum_{v_i \in S} a_i v_i = 0$$
implies that $a_i = 0$ for all $i$.
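For instance (my example): in $\mathbf{F}^2$, the set $\{(1,0), (0,1)\}$ is linearly independent, since $a_1(1,0) + a_2(0,1) = (a_1, a_2) = 0$ forces $a_1 = a_2 = 0$; adding the vector $(1,1)$ destroys independence, because $1 \cdot (1,0) + 1 \cdot (0,1) + (-1) \cdot (1,1) = 0$ is a nontrivial relation.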


The nontrivial statement here is that minimal spanning sets and maximal linearly inde-
pendent sets are the same notion, are of finite size, are all of the same size, and thus imply
that bases exist and have nice properties.
We’ll show that finite dimensional implies there exists a finite basis and all bases have
the same cardinality, which will be the dimension. Every spanning set contains a basis and
every linearly independent set is contained in a basis.
What about infinite-dimensional vector spaces? If you assume the Axiom of Choice/Zorn’s
lemma, then all vector spaces do indeed have bases that are all the same size, but they might
be countable or uncountable. P has a countable basis, but F8 and R{Q have uncountable
bases (the latter are called Hamel bases).
Lemma 5.2.2. Linear independence is equivalent to the following: suppose you have
$$\sum_{v_i \in S} a_i v_i = \sum_{v_i \in S} \hat{a}_i v_i.$$
Then $a_i = \hat{a}_i$ for all $i$.


Proof omitted as an exercise (though it was given in class).
The opposite of linear independence is linear dependence.
Axler makes a big point of distinguishing sets from lists. As far as I can tell, a list is the
same as an ordered n-tuple. The idea is that we are not going to use lists aside from lists of
vectors. This is not the same thing as a finite set per Axler’s definition, since it is ordered
and can have repeated elements.
Clearly, if V Ą S 1 Ą S, then if S 1 is linearly independent, so is S, and if S is linearly
dependent, so is S 1 . Further, if S spans V , then so does S 1 .
Here is the first key fact where we finally use multiplicative inverses.
Theorem 5.2.3 (Linear Independence Lemma, from Axler 2.21). Suppose that v1 , . . . , vm
are linearly dependent such that v1 ‰ 0. Then there exists a j, 1 ď j ď m, such that
vj P Spanpvi |i ă jq and Spanpv1 , v2 , . . . , vj´1 , vˆj , vj`1 , . . . , vm q “ Spanpv1 , . . . , vm q.

The hat on the vj denotes that we are considering the pm ´ 1q-tuple resulting from
removing vj from the original m-tuple.
Proof. We have a linear relation
$$\sum_i a_i v_i = 0.$$
Because $v_1 \ne \vec{0}$, there exists some $j > 1$ such that $a_j \ne 0$. Take the maximal such $j$; then
$$\sum_{i=1}^{j} a_i v_i = 0,$$
or equivalently
$$a_j v_j = \sum_{i < j} (-a_i) v_i.$$
Now use the fact that we have multiplicative inverses to get
$$v_j = \sum_{i < j} a_j^{-1} (-a_i) v_i,$$
which implies the first statement. In class this was done in more detail using the field axioms
directly; I've skipped steps and shown as much work as you can on your problem sets. You
can also write $-\frac{a_i}{a_j}$ instead of $a_j^{-1}(-a_i)$.
The second part is easy; a linear combination of a linear combination is a linear combination of what you started with, so the $\supseteq$ part follows. The $\subseteq$ part is easy.
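As a quick illustration of the lemma (example mine, not from class): in $\mathbf{F}^2$ take $v_1 = (1,0)$, $v_2 = (0,1)$, $v_3 = (1,1)$, which satisfy the relation $v_1 + v_2 - v_3 = 0$. The largest index with nonzero coefficient is $j = 3$, and indeed $v_3 = v_1 + v_2 \in \mathrm{Span}(v_1, v_2)$, while removing $v_3$ does not change the span: $\mathrm{Span}(v_1, v_2, v_3) = \mathrm{Span}(v_1, v_2) = \mathbf{F}^2$.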

Theorem 5.2.4 (Axler 2.23). If V “ Spanpw1 , . . . , wn q and u1 , . . . , um are linearly indepen-


dent, then m ď n.

Proof. Start with the spanning set u1 , w1 , . . . , wn which is clearly linearly dependent by
hypothesis. Apply Theorem 5.2.3 to remove one of the wj . Add u2 and repeat (you can’t
remove any of the ui since they are linearly independent). Once we do this m times, we have
a set of n elements that contains all of the ui , implying m ď n.

6 9/13/17
Rohil's section: Thursday, 4 - 5 PM, in Science Center 411. We do not coordinate sections:
attend as many or as few as you feel is appropriate.
Problem Set 2 due on Friday.

6.1 Spanning Sets and Bases


Last time, we had Theorem 5.2.4 (a result proved by Theorem 5.2.3, the Linear Independence
Lemma). We’ll derive some corollaries of this.

Corollary 6.1.1. If V is finite-dimensional and U Ă V is a subspace, then U is finite-dimensional.

Note that this fails in the context of modules! Rings that have this property for all of
their modules are called Noetherian.
Proof. Let’s try to construct a finite generating set for U . If U “ t0u there is nothing to do.
If not, let u1 P U ´ t0u. If we’re not done, let u2 P U ´ Spanpu1 q, and so on inductively until
um . The key point is that all of the ui are linearly independent. V “ Spanpv1 , . . . , vn q, so
we must finish when m ď n by our theorem. Thus U is finite dimensional.

Definition 6.1.2. A basis of a vector space V is a subset or an n-tuple in V that is both


spanning and linearly independent.

Proposition 6.1.3 (Axler 2.30). $(v_1, \dots, v_n)$ is a basis iff every $v \in V$ is uniquely represented as a sum $\sum_{i=1}^n a_i v_i$ for some $a_i \in \mathbf{F}$ (or, for an index set $S$, as $\sum_{s \in S} a_s v_s$ with $a_s$ almost all 0).

The proof is omitted; it is largely identical to the one we provided for linear independence
earlier.
Now consider the map Fn Ñ V that takes pa1 , . . . , an q ÞÑ a1 v1 ` . . . ` an vn for some basis vi . This
map is clearly a linear homomorphism and by the proposition is also an isomorphism. So all
finite-dimensional vector spaces are just Fn !
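For a worked case (my addition): with $V = P_2$ and basis $1, x, x^2$, the map $\mathbf{F}^3 \to P_2$ sends $(a_0, a_1, a_2) \mapsto a_0 + a_1 x + a_2 x^2$. It is linear, injective (if all coefficients are zero the polynomial is zero), and surjective, so $P_2 \cong \mathbf{F}^3$.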
But we will continue studying vector spaces abstractly as opposed to by picking a basis.
This is because it is often cleaner and more useful to understand finite-dimensional vector
spaces abstractly and we will often want to change bases to get a better understanding of
a particular map. Thus you should not get used to picking a basis for every problem! (also
that will make your CAs very very sad.)

Theorem 6.1.4 (Axler 2.31). Any finite spanning set contains a basis.

Proof. Use Theorem 5.2.3 (getting rid of 0s) until you eliminate all linear dependences; then
you get a linearly independent spanning set which is a basis.

Corollary 6.1.5 (Axler 2.32). Every finite-dimensional vector space has a basis.

The following refinement is for whatever reason not stated in Axler.

Theorem 6.1.6 (Axler 2.31+). Any spanning set of a finite-dimensional vector space con-
tains a basis.
Proof. We know that there must be a finite spanning set inside our infinite spanning set, so
we can just take that and then apply Theorem 6.1.4.
In the infinite-dimensional case, these theorems hold if you assume the Axiom of Choice/Zorn's
lemma, but we won't worry about that here. When you have infinite-dimensional vector
spaces with some additional topological structure (i.e. some notion of distance/convergence/etc.), you usually want a different notion of basis adapted to that structure.

6.2 Linear Independence and Bases


On the other hand, we have:
Theorem 6.2.1 (Axler 2.33). Any linearly independent subset of a finite-dimensional vector
space V can be extended to a basis.
Proof. Let the linearly independent subset be u1 , . . . , um . Add a finite spanning set v1 , . . . , vn
which we know exists since it is finite-dimensional. Now apply Theorem 5.2.3 repeatedly to
the whole set, noting that you will only throw out elements in the spanning set. Now we
have a basis since our set is both linearly independent and spanning.
Corollary 6.2.2. Suppose V is finite-dimensional and U Ă V is a subspace. Then there
exists a complement W Ă V where V “ U ‘ W .
Proof. U is finite-dimensional and has basis u1 , . . . , um which we can extend by Theorem
6.2.1 to a basis u1 , . . . , um , w1 , . . . , wn . Then let W “ Spanpw1 , . . . , wn q. We have V » U ‘W
since any v P V is uniquely u ` w, u P U, w P W .
(Note: the proof was presented in class before the corollary’s statement.)
This is not the orthogonal complement, which requires additional structure of distances
or inner products that we won’t see for a few weeks.
This is true in the infinite-dimensional case, but requires the Axiom of Choice/Zorn’s
lemma as well. We’ll see an unconditional result when we talk about quotients later.

6.3 Dimension
Now the dimension actually makes sense.
Theorem 6.3.1 (Axler 2.35). If V is finite-dimensional, all bases have the same length.
Definition 6.3.2. The dimension dim V is that length.
Our theorem is thus equivalent to saying that dimension is well-defined, since it estab-
lishes that our definition of dimension is not dependent on the choice of basis that we use in
the definition.
Proof. Given two bases v1 , . . . , vn and w1 , . . . , wm , we'll show that m “ n. By Theorem 5.2.4,
we have n ď m since the vi are linearly independent and the wi span. Reversing the roles,
m ď n. Thus m “ n as desired.

Example 6.3.3. Elementary dimension computations:

1. dim Fn “ n.

2. dim Pm “ m ` 1 since the basis is x0 , x1 , . . . , xm .

3. Let Fn0 be the subspace of Fn consisting of all vectors whose coordinates sum to 0.
Then dim Fn0 “ n ´ 1.
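To make the last computation concrete (detail I am adding): the $n-1$ vectors $e_1 - e_n, e_2 - e_n, \dots, e_{n-1} - e_n$ lie in $\mathbf{F}^n_0$, are linearly independent, and span it, since a vector $(a_1, \dots, a_n)$ with $\sum_i a_i = 0$ equals $a_1(e_1 - e_n) + \dots + a_{n-1}(e_{n-1} - e_n)$; hence $\dim \mathbf{F}^n_0 = n - 1$.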

The dimension is somewhat a notion of size. So we have:

Proposition 6.3.4 (Axler 2.38). If V is finite-dimensional and U Ă V is a subspace, then


dim U ď dim V . The equality condition is dim U “ dim V iff U “ V .

Proof. The basis of U is linearly independent, so it can be extended to a basis of V .


Note: when you prove an inequality, figure out what the equality condition is.
We can similarly show that V “ U ‘ W implies that dim V “ dim U ` dim W .
Further, we have:

Proposition 6.3.5 (Axler 2.39, 2.42). If V is finite-dimensional and S is linearly indepen-


dent, then |S| ď dim V and if |S| “ dim V , then S is a basis. Similarly, if S spans, then
|S| ě dim V and if |S| “ dim V , then S is a basis.

Proof left as an exercise.


For a simple example of why this fails for modules, consider Z-modules. 2Z Ă Z is a
submodule, but you cannot write Z ” 2Z ‘ M for any submodule M .

7 9/15/17
PS 2 due now, PS 3 goes out today.
A couple of notes about the remainder of Axler, chapter 2:

1. So far F may be skew, since we have not used commutativity of multiplication yet.
Thus dimension of vector spaces over a skew field is well-defined. This will stop being
the case very soon.

2. There’s a notion by which dimension is similar to size of a finite set, i.e. V ‘ W is


similar to S >T (the disjoint union). We also have |S YT |`|S XT | “ |S|`|T | according
to the principle of inclusion-exclusion; this still works in the context of vector spaces

dim U ` dim W “ dimpU ` W q ` dimpU X W q.

The proof of this can be found in the textbook and you should have looked this up for PS
2. However, on your most recent problem set you (hopefully!) found a counterexample
to the statement with three vector spaces.
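A quick check of this formula (example mine): in $\mathbf{R}^3$, let $U$ be the $xy$-plane and $W$ the $xz$-plane. Then $U + W = \mathbf{R}^3$ and $U \cap W$ is the $x$-axis, and indeed $\dim U + \dim W = 2 + 2 = 4 = 3 + 1 = \dim(U + W) + \dim(U \cap W)$.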

So far, we have defined vector spaces and given properties and constructions of them, i.e.
the nouns and adjectives of our theory. Today, we start discussing linear transformations
(Axler chapter 3). These are the verbs of our theory, i.e. how we map from one vector space
to another.

7.1 Linear Transformations


Suppose we have V, W as F-vector spaces. We have a linear transformation T : V Ñ
W . These are the right objects to understand a number of important mathematical ideas
including linear differential equations. At a higher level, these make the vector spaces over
a given field F a category, where we have the objects (vector spaces) and the morphisms or
arrows (which usually are maps between the objects, but are not required to be)!
We will also be able to use linear transformations to construct a number of vector spaces:
the image, the kernel (the preimage of 0), Hom (the set of linear transformations), and later
tensor products, etc.
The following definition was embedded in class in the definition of a linear transformation:
Definition 7.1.1. A group homomorphism is a map T : G Ñ H between groups G, H
that respects the group structure. Specifically:
1. For all g1 , g2 P G, we have T pg1 ¨ g2 q “ T pg1 q ¨ T pg2 q.

2. T p0G q “ 0H .

3. T p´gq “ ´T pgq.
When proving that something is a group homomorphism, you only need to prove that
T pg1 ¨ g2 q “ T pg1 q ¨ T pg2 q since the auxiliary properties are actually consequences of this one
(just consider T p0 ¨ 0q “ T p0q ¨ T p0q “ T p0q, for example).

Definition 7.1.2. A map T : V Ñ W is said to be an F-linear transformation if:

1. It is a group homomorphism of the additive groups, i.e. T pv ` v 1 q “ T pvq ` T pv 1 q and


similarly.

2. For all a P F, v P V , we have T pavq “ aT pvq. Thus T respects scalar multiplication.


This is also known as homogeneity of degree 1.

Sometimes we see that T is linear iff for all a, a1 P F, v, v 1 P V , we have T pav ` a1 v 1 q “


aT pvq ` a1 T pv 1 q. This is equivalent (exercise!).

Example 7.1.3. Examples of linear transformations (exercise: check that all of these are
linear transformations):

1. The only map 0 Ñ 0.

2. The map V Ñ 0, or alternatively the zero map V Ñ W that takes v ÞÑ 0.

3. The identity map V Ñ V that takes v ÞÑ v. (This is required by the axioms of a


category.)

4. Given any linear transformation T : V Ñ W and a scalar a P F, then the product


paT qv “ a ¨ T pvq is linear. To check this, we finally use commutativity of our field!

5. Given any two linear transformations T, T 1 : V Ñ W , the sum pT ` T 1 qpvq “ T pvq `


T 1 pvq is linear.

6. Given a subspace U Ă V , any map T : V Ñ W can be restricted to U to get a map


T |U that is also a linear transformation.

7. The isomorphism Fn » V resulting from picking a basis is a linear transformation, as


is any isomorphism. (Vector space isomorphisms are required to be linear transforma-
tions.)

8. Scalar multiplication by a P F is a linear transformation.

9. Let V be the vector space of differentiable functions R Ñ R and let W be the vector
space of all functions R Ñ R. Then the map of taking the derivative on V Ñ W is
linear. This can also be restricted to V “ W “ P. The same is true of integration.

10. Given a linear transformation T : V Ñ W and a transformation S : W Ñ X, the


composition v ÞÑ SpT pvqq is a linear transformation. This is also required for the
axioms of a category. Similarly, note that compositions are associative, i.e. pST qU “
SpT U q.

Sidenote: You can compute that Dx ´ xD “ id. This is the uncertainty principle in
quantum mechanics (essentially).
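To spell out that computation (not written out in class): reading $x$ as the operator "multiply by $x$" and $D$ as differentiation, for any differentiable $f$ we have
$$(Dx - xD)f = (xf)' - x f' = (f + x f') - x f' = f,$$
so $Dx - xD = \mathrm{id}$ as operators.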

7.2 The Space of Linear Transformations
So far, we’ve seen that you can add linear transformations, can multiply them by a scalar,
and that there’s a zero transformation. This implies that linear transformations form a
vector space!

Definition 7.2.1. The space of linear transformations LpV, W q “ HompV, W q is a


vector space containing all linear transformations, with scalar multiplication and addition
defined as above.

Example 7.2.2. Some special cases:

1. HompF, V q “ V under the map T ÞÑ T p1q.

2. Homp0, V q “ 0.

3. HompV, 0q “ 0.

4. HompV ‘ V 1 , W q » HompV, W q ‘ HompV 1 , W q. The map here is T ÞÑ pT |V , T |V 1 q.


This is the universal property of the direct sum.

5. HompFn , V q » V ‘n by applying the above repeatedly. This implies that dim HompV, W q “
dim V ¨ dim W for finite-dimensional vector spaces V, W .

6. HompV, Fq “ V ˚ , the dual vector space, or the space of functionals on V (Axler


denotes this as V 1 ). Note that we can write HompV, Fq » V for finite-dimensional
vector spaces since they have the same dimension, but this isomorphism is not canonical
and does not hold for infinite-dimensional vector spaces!

8 9/18/17
My section today: 1 - 2 pm, Science Center 222. Math Night/my OH: Leverett dining hall,
8 - 10 PM.
Math Table tomorrow: Cameron Krulewski will be giving a talk on vector bundles and
K-theory.
Professor Elkies’s OH are Wednesday this week in Lowell dining hall, 7:30 - 9 PM.
Correction to PS 3 #1 coming soon: see the webpage.
There will be a quiz next week (either Monday or Wednesday). Email Professor Elkies
if you have a strong preference. It is in-class, 1 hour.

8.1 Kernel and Image


Last time, we defined the notion of a linear transformation and noted that it can be ex-
tended from being about vector spaces to being about modules, etc. Only some of the
properties of linear transformations will extend to modules, but these are the right notion
of homomorphism for modules over a ring.
We have a couple of important definitions:

Definition 8.1.1. The kernel ker T of a linear map T : V Ñ W is the set of v P V such
that T v “ 0.

The kernel is sometimes known as the nullspace.

Definition 8.1.2. The image of a linear map T : V Ñ W is the set T pvq for all v P V .

The image is sometimes known as the range. It is occasionally denoted as Im T (some-


thing I might use since I’m used to that notation).
These are both automatically subspaces. If T v “ 0, then T pavq “ aT v “ 0 and T v1 “ 0
and T v2 “ 0 implies that T pv1 ` v2 q “ T v1 ` T v2 “ 0. You can prove this similarly for the
image (exercise).
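A concrete example (mine): let $T = D$ be differentiation on $V = P_3$, the polynomials of degree at most 3, viewed as a map $P_3 \to P_3$. Then $\ker D$ is the constants, of dimension 1, and the image $D(P_3) = P_2$ has dimension 3, since every polynomial of degree at most 2 is the derivative of one of degree at most 3. Note $1 + 3 = 4 = \dim P_3$, previewing the theorem below.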
We have two additional dimensions that we can now associate with any linear map T ,
i.e. dim ker T and dim T pV q. However, we have an extremely important theorem that relates
these to dim V :

8.2 Rank-Nullity Theorem


Theorem 8.2.1 (Axler 3.22, Fundamental Theorem of Linear Maps, Rank-Nullity Theorem). If V, W are finite-dimensional vector spaces over F and T : V Ñ W is linear, then

dim V “ dim ker T ` dim T pV q.

Definition 8.2.2. The rank of T is dim T pV q and the nullity of T is dim ker T .
Nullity is used less often in abstract math but is still used in computational math text-
books.
Based on Theorem 8.2.1, we can show that dim V , dim W , and dim ker T or equivalently
dim T pV q define T up to isomorphism, i.e. given some map T 1 : V 1 Ñ W 1 with the same
such numbers, we can find isomorphisms V » V 1 and W » W 1 such that T » T 1 . This will
give us a complete classification of all linear transformations between different vector spaces.
We start with an important proposition.
Proposition 8.2.3 (Axler 3.16). T : V Ñ W is injective iff ker T “ 0.
Definition 8.2.4. A map T : V Ñ W (over any set) is injective if T v “ T v 1 implies v “ v 1 .
Proof of Proposition 8.2.3. If T is injective, then since T p0q “ 0, T v “ 0 forces v “ 0, so ker T “ 0.
Conversely, T v “ T v 1 implies T pv ´ v 1 q “ 0, and this forces v ´ v 1 “ 0 exactly when ker T “ 0, as
desired.
Please do not use Axler's notation of one-to-one for injective, since some people use it to
mean bijective.
You can also show that the conditions of Proposition 8.2.3 hold iff T : V Ñ T pV q is an
isomorphism (you can check that if T is linear and bijective then T ´1 is also linear).
Proof of Theorem 8.2.1. Let U “ ker T . Axler uses bases here (for the record, when I took
55 this was on a midterm and we lost points for citing bases), but we will not. We know
that U has a complement U 1 , i.e. V “ U ‘ U 1 and dim V “ dim U ` dim U 1 . We have
T pV q “ T pU q ` T pU 1 q and T pU q “ 0 so T pV q “ T pU 1 q. Further, T |U 1 is an isomorphism by
Proposition 8.2.3 and the fact that U X U 1 “ t0u. Thus
dim V “ dim U ` dim U 1 “ dim U ` dim T pU 1 q “ dim ker T ` dim T pV q
as desired.
Corollary 8.2.5 (Axler 3.23, 3.24). Given that V, W are finite-dimensional vector spaces
with T : V Ñ W a linear map, then T injective implies that dim V ď dim W with equality
if T is an isomorphism. T surjective implies that dim V ě dim W with equality if T is an
isomorphism.
Definition 8.2.6. A map T : V Ñ W (over any set) is surjective if given any w P W ,
there exists some v P V such that T v “ w.
We also have the structure theorem, i.e. that dim V , dim W , and dim ker T completely
describe the structure of the transformation. This is because if W is finite-dimensional, you
can take the complement W “ T pV q ‘ W 1 (this complement is called the cokernel) and our
linear transformation is just
T pu, u1 q “ T pu ` u1 q “ T pu1 q “ pT pu1 q, 0q.
Now, you can be very explicit. Consider a basis for $V$ that is $u_1, u_2, \dots, u_m, u_1', u_2', \dots, u_n'$
and for $W$ the basis $w_1 = T(u_1'), \dots, w_n = T(u_n'), w_1', \dots, w_p'$. Any other such linear map with the
same dimensions of the vector spaces and kernel will look the same with respect to some
basis.
Here, we are implicitly using:

Lemma 8.2.7 (Axler 3.?). 1. HompV ‘ V 1 , W q “ HompV, W q ‘ HompV 1 , W q. This is
also true for arbitrary direct sums.

2. Let S be a basis for a not necessarily finite-dimensional vector space V over a field
F. Then for all W over F and any ws , there exists a unique linear transformation
T : V Ñ W such that T pvs q “ ws for all vs in the basis.
The first part is easy to see directly. The second part is true since
$$v = \sum_{s \in S} a_s v_s$$
with $a_s = 0$ for all but finitely many $s$. Then we can just write
$$Tv = \sum_{s \in S} a_s w_s,$$
which must be our linear transformation. We just need to check that this is a linear transformation.
Finally, we can prove:
Theorem 8.2.8 (Axler 3.61). If both V and W are finite-dimensional, then HompV, W q is
also and
dim HompV, W q “ dim V ¨ dim W.

This is easy enough to prove via Lemma 8.2.7; choose a basis for V that identifies V » Fn
and now apply Lemma 8.2.7 (1) as many times as necessary.
Why is the theorem number so high? Because Axler decides to veer from the golden light
of abstract linear algebra and introduce discussion of. . .

8.3 Matrices
Choose a basis for both $V$ and $W$, i.e. $v_1, \dots, v_n$ and $w_1, \dots, w_m$. Then we have
$$T(v_i) = \sum_{j=1}^{m} a_{j,i} w_j.$$
We can then put this in an array
$$\mathcal{M}(T) = \begin{pmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{pmatrix}.$$
In particular, the $k$th column is the image of the $k$th basis vector $v_k$. We get the matrix by
regarding elements of $W \cong \mathbf{F}^m$ as column vectors and writing them in order.
Note that we must think of vectors as column vectors rather than row vectors, which
corresponds to our convention of composing functions/multiplying matrices on the left. It’s

easy to see that composition of maps corresponds to multiplication of matrices (we’ll show
this next time). You are not required to know that m ˆ n matrices are different than n ˆ m
matrices.
Note that I will try to write a1,1 or ai,j instead of a11 or aij . The literature often uses the
latter since it is clear from context exactly what I mean.
Consider a map T : V Ñ V . You must use the same basis of V for both sides, so these
maps look different from the ones we have characterized above. The set HompV, V q is a
ring of n ˆ n matrices, since you can compose these maps with themselves to get new maps
V → V. We have an identity I which corresponds to the identity matrix. We have

    I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.
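As a concrete aside (mine, not from lecture), here is a tiny numpy sketch of this recipe: the kth column of M(T) is the image of the kth basis vector, and composing maps corresponds to multiplying the matrices. The particular maps S and T below are made up for illustration.

import numpy as np

def T(v):   # an arbitrary linear map F^3 -> F^3
    return np.array([v[0] + v[1], v[1] + v[2], v[2]])

def S(v):   # another arbitrary linear map F^3 -> F^3
    return np.array([2 * v[0], v[0] - v[2], v[1]])

def matrix_of(f, n):
    # kth column = f(e_k), exactly as in the definition of M(T)
    return np.column_stack([f(e) for e in np.eye(n)])

MT, MS = matrix_of(T, 3), matrix_of(S, 3)

v = np.array([1., 2., 3.])
# Composition of maps corresponds to multiplication of matrices (in the same order):
assert np.allclose(S(T(v)), (MS @ MT) @ v)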

9 9/20/17
Professor Elkies will hold office hours from 7:30 - 9:00 PM in Lowell dining hall today. Rohil
will hold make-up office hours from 1 - 2:30 PM in Science Center 304 and section from 4 -
5 PM as usual.
There will be a quiz next Wednesday, September 27 in class. This quiz is purely diagnostic
and will count for an insignificant portion of your grade.
Axler now has a discussion of quotients, an essential topic in algebra.

9.1 Quotient Spaces


Given U Ă V , both vector spaces over F, we can form the quotient space V {U , also over
F. This notation is identical to one standard notation for modular arithmetic, i.e. Z{nZ,
and modular arithmetic is an example of a quotient construction over modules. This is a
construction that works identically for modules and groups and many many other algebraic
objects.
Vectors in V {U are cosets mod U .

Definition 9.1.1. A coset of V mod U is a subset of V of the form rv0 s for some element
v0 P V where rv0 s “ tv0 ` u | u P U u.

The best way to think about this is congruence classes mod n. We can consider these as equivalence classes [v], where v ∼ v′ means v′ ≡ v (mod U), i.e. v′ − v ∈ U. We want ∼ to be an equivalence relation.

Definition 9.1.2. An equivalence relation ∼ on a set S satisfies the following properties:

1. For all v ∈ S, v ∼ v (reflexivity).

2. For all v1, v2 ∈ S, v1 ∼ v2 implies v2 ∼ v1 (symmetry).

3. For all v1, v2, v3 ∈ S, v1 ∼ v2 and v2 ∼ v3 imply v1 ∼ v3 (transitivity).

The last property (transitivity) is the only nontrivial property, which we can check by
noting that
v3 ´ v1 “ pv3 ´ v2 q ` pv2 ´ v1 q
and both of the latter terms are P U , so v3 ´ v1 P U as desired. The other two are left as
exercises for you.
Note that symmetry and transitivity do not imply reflexivity; the obvious proof v1 ∼ v2 ∼ v1 ⟹ v1 ∼ v1 assumes that there exists some v2 with v1 ∼ v2, which is not given by symmetry and transitivity alone! Thus we keep the reflexivity axiom.
Now we have an idea as to what the vector space consists of. We now need addition and
multiplication to work.

Definition 9.1.3. The quotient space V {U consists of cosets of V mod U with addition
and multiplication defined by cosets, i.e. rv1 s ` rv2 s “ rv1 ` v2 s and crv1 s “ rcv1 s. We have
0V {U “ r0s and ´rv0 s “ r´v0 s.

We need to check that this definition is well-defined, i.e. for whatever choice of v1 , v2 in
the cosets rv1 s, rv2 s, I get the same value for rv1 ` v2 s. This is equivalent to v1 ” v11 , v2 ”
v21 ùñ v1 ` v2 ” v11 ` v21 , which I leave as an exercise. Similarly, one can show that
v1 ” v11 ùñ ´v1 ” ´v11 and v1 ” v11 ùñ cv1 ” cv11 .
As a note, checking these facts is nontrivial if you are not working over vector spaces!
You cannot quotient out by an arbitrary subgroup or subring. Thus I strongly recommend
you check all of the claims I made above to ensure our definition is well-defined.

9.2 Quotient Spaces and Kernels


It’s easy to verify that π : V Ñ V {U, v ÞÑ rvs is a linear map (exercise!). This map is called
the quotient map. This map is clearly surjective and has kernel U , so we have

dim V “ dim U ` dimpV {U q.

So what does it mean to map V/U → W? Well, consider the triangle of maps

    V --(T∘π)--> W
    π ↓        ↗ T
      V/U

i.e. a map T : V/U → W corresponds to the composite T ∘ π : V → W.
Thus Hom(V/U, W) ⊂ Hom(V, W), and the image consists exactly of the T′ such that U ⊂ ker T′, i.e. T′(U) = 0. Why? To descend a map T′ : V → W to a map T : V/U → W, you set T([v]) = T′(v) for a representative v of each coset [v]. But this is only well-defined if v′ ≡ v implies T′(v′) − T′(v) = T′(v′ − v) = 0, i.e. T′(U) = 0. Conversely, if T′(U) = 0, then T([v]) = T′(v) is well-defined, giving us the desired correspondence. Another way of saying this is that {T′ ∈ Hom(V, W) : T′ vanishes on U} ≅ Hom(V/U, W).
If you've seen some real analysis, you might know that ∫_a^b : Funct([0, 1], R) → R sends null functions (for instance, functions that are nonzero at only finitely many points) to 0, so ∫_a^b is really a map from the quotient of all functions by the null functions; this is the idea behind the L¹ space.
Definition 9.2.1. The codimension of U in V is dimpV {U q.
As above, if V is finite-dimensional we can apply rank-nullity to get dimpV {U q “ dim V ´
dim U . Thus all rank-nullity is saying is that codimpker T q “ dimpT pV qq which can be proven
just by descending to the isomorphism V { ker T » T pV q.
We also have the following theorem, which was not explicitly stated in class but referred
to:
Theorem 9.2.2 (First Isomorphism Theorem). Given T : V Ñ W , we have V { ker T »
T pV q.
The only thing we haven’t checked to prove Theorem 9.2.2 is that the map V { ker T Ñ
T pV q is an injection (surjectivity is obvious). You can verify that the kernel of this map is
0, implying that this map is an isomorphism as desired. This can be used to recover the rank-nullity theorem and generalizes it to other contexts, like quotient groups, quotient rings, etc.

9.3 Duality
Recall that we defined the dual:

Definition 9.3.1. The dual vector space V ˚ :“ HompV, Fq, or the space of functionals
on V .

What is pV {U q˚ ? It is the subspace of V ˚ consisting of v ˚ such that v ˚ pU q “ 0.


Now suppose V » Fn with a basis v1 , v2 , . . . vn . Then pFn q˚ “ Fn , specifically v ˚ “
pb1 , b2 , . . . , bn q is the functional v “ pa1 , a2 , . . . , an q ÞÑ a1 b1 ` a2 b2 ` . . . ` an bn . This is the
inner product of row and column vectors. We can take a basis, i.e. a˚i paj q “ δij , the
Kronecker delta. This basis is called the dual basis. In particular, F˚ » F canonically.
Note however that V ˚ » V is not canonical for finite-dimensional vector spaces (and not
true for infinite-dimensional vector spaces); you need to pick a basis to get the identification
discussed above!
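To make the basis-dependence concrete, here is a small numpy sketch (my own illustration, with an arbitrarily chosen basis of R³): writing the basis vectors as the columns of B, the dual basis functionals are the rows of B⁻¹, and rescaling a basis vector rescales the dual functional inversely.

import numpy as np

# Columns of B form a (non-standard) basis v_1, v_2, v_3 of R^3.
B = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 2.]])

# Rows of B^{-1} represent the dual basis functionals v_1*, v_2*, v_3*
# (a functional acts on a column vector by matrix multiplication from the left).
B_inv = np.linalg.inv(B)

assert np.allclose(B_inv @ B, np.eye(3))      # v_i*(v_j) = delta_ij

# Multiplying v_1 by 2 divides v_1* by 2, so the identification V ~ V* is not canonical.
B2 = B.copy(); B2[:, 0] *= 2
assert np.allclose(np.linalg.inv(B2)[0], B_inv[0] / 2)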
Suppose you have T : V → W with V having basis v_k, 1 ≤ k ≤ n, and W having basis w_j, 1 ≤ j ≤ m. Then we have

    M(T, (v_1, ..., v_n), (w_1, ..., w_m)) = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix}

where the columns correspond to v_1, ..., v_n and the rows to w_1, ..., w_m. Thus column vectors correspond to elements of W (take V = F), while if we let W = F, then row vectors correspond to Hom(V, F) = V*, i.e. elements of the dual space.
Finally, you can verify that composition of maps is equivalent to multiplication of matri-
ces.

10 9/22/17
Quiz: next Wednesday, September 27.
Problem Set 3 due now, Problem Set 4 due Monday, Oct 2 (note the changed deadline).
Note that Axler uses V′, T′, U⁰ instead of what we call V*, T*, U^⊥. Note that there are two forms of the same Greek letter: \phi refers to φ and \varphi refers to ϕ.

10.1 Duality of Vector Spaces


We have already defined V* = Hom(V, F) for any vector space V over F. We know that if
dim V ă 8 then dim V ˚ “ dim V , as a special case of

dim HompV, W q “ dim V dim W.

Thus V ˚ » V since they have the same dimension. But this isomorphism is not canonical.
If you pick a basis v1 , . . . , vn P V , then there exists a dual basis vi˚ P V ˚ where vi˚ pvj q “ δij ,
i.e. vi˚ pvi q “ 1 and vi˚ pvj q “ 0 for i ‰ j. But this map depends on choosing a basis and is
not basis-independent! There cannot be a natural identification V » V ˚ since it would take
any vi ÞÑ vi˚ always, which is impossible as multiplying vi by a constant divides vi˚ by the
same constant.
However, the second dual pV ˚ q˚ » V canonically! Our disproof above doesn’t work
since we’d multiply the dual-dual basis by the same constant. We can demonstrate that
this isomorphism is canonical on Problem Set 4 by defining it in a basis-free manner (note:
this means that using bases on your problem set will cause you to lose points). Hint:
pV ˚ q˚ “ HompHompV, Fq, Fq.

10.2 Duality of Linear Transformations


Let’s look at how duality works with linear transformations. I claim that T : V Ñ W has a
natural dual T ˚ : W ˚ Ñ V ˚ (note the direction of the arrows). We can define this map by
using the commutative diagram

    V --T--> W --ϕ--> F,    ϕ ∈ W*.

This diagram takes a map ϕ ∈ W* and associates to it the composite ϕ ∘ T ∈ V*. Thus we have the map T* : W* → V* that takes ϕ ↦ ϕ ∘ T.
We need to check that T ˚ is actually a linear map. This is left as an exercise for you;
be careful that you don’t get confused! To do this, check that T ˚ pϕ1 ` ϕ2 q “ T ˚ pϕ1 q `
T ˚ pϕ2 q, T ˚ pcϕq “ cT ˚ pϕq for any ϕ1 , ϕ2 , ϕ P W ˚ “ HompW, Fq.
We now have a map HompV, W q Ñ HompW ˚ , V ˚ q consisting of T ÞÑ T ˚ . This map
is also linear and for finite-dimensional vector spaces, it is an isomorphism. Using bases,
HompV, W q corresponds to matrices and HompW ˚ , V ˚ q corresponds to switching columns

and rows, i.e. transposing the matrix (assuming you use the dual basis). There are much cleaner ways to prove that this is an isomorphism, however.
What happens if we compose duals of linear transformations? We have the diagram

    V --T--> W --S--> X --ϕ--> F

with ϕ ∘ S = S*ϕ and ϕ ∘ S ∘ T = T*(S*ϕ). By the diagram, we have (ST)* = T*S*. Note the switched order, corresponding to the arrows reversing above.
Suppose we have T : V → W with kernel ker T and image T(V). Then ker T = (Im T*)⁰ and Im T = (ker T*)⁰ (identifying V** ≅ V and W** ≅ W). Proofs of this are in the textbook.

10.3 Exact Sequences


Much mathematical communication today is done via diagrams of spaces and maps between them, and arguments are made by composing these maps together. These diagrams are known as commutative diagrams. One particularly common commutative diagram is the sequence

    · · · → V1 → V2 → V3 → · · ·
One particular example is U →(i) V →(q) V/U. We can form the composite map, but it is actually q ∘ i = 0. Further, we also know that Im(i) = ker(q). This property occurs often enough that it has a special name.
Definition 10.3.1. A sequence is exact at V if Im i “ ker q, given i the incoming map and
q the outgoing map.
If your sequence is exact at all spaces it could be, then it is called an exact sequence.
Note that the sequence

    0 → U →(i) V →(q) V/U → 0
is exact. This sequence is known as a short exact sequence. The only shorter exact
sequences are
0ÑV Ñ0
which implies that V “ 0 and
0 Ñ V1 Ñ V2 Ñ 0
which implies that V1 ≅ V2. Finally, note that an exact sequence

    0 → U → V → W → X → 0

implies that, writing T for the map V → W, we have U ≅ ker T and X ≅ W/T(V) = coker T; conversely, any linear map T gives rise to such a four-term exact sequence.
Now note that duality preserves exactness, but reverses arrows! We have:

Lemma 10.3.2. V Ñ W Ñ X is exact iff X ˚ Ñ W ˚ Ñ V ˚ is also exact.

Proof. It’s easy to show that the composition of the two maps is always zero. The hard part
is showing that the image of the first is the kernel of the second, but this is equivalent to
the statements of image and kernel of the dual we made earlier.
In general, we have an exact sequence

    0 → U → V →(T) W → X → 0

and the dual

    0 ← U* ← V* ← W* ← X* ← 0.

Thus U* ≅ V*/(T*W*) = coker T*, so the dual of this is the annihilator of T*W*, which is ker T. (Recall that the dual of V/U is the annihilator of U.)
Next time, we'll show that this implies the rank of a matrix is invariant under transposition, which implies that the column space has the same dimension as the row space for any matrix.

10.4 Digression: Category Theory


Categories themselves are related by functors. Categories C themselves have objects
ObjpCq (like vector spaces) which are connected by morphisms (MorpCq). A functor F :
C Ñ D is a map ObjpCq Ñ ObjpDq and a map MorpCq Ñ MorpDq, subject to certain
natural axioms (e.g. for any T : V Ñ W with V, W P C, we have FT : FV Ñ FW , there
is an identity morphism in C for every V that maps to an identity morphism in D for every
FV , composition works correctly, etc.)

Example 10.4.1. Some examples of functors:

1. The identity functor: take everything to itself.

2. The trivial category with one object and one morphism and the trivial functor that
sends everything to that category.

3. Forgetful functors, e.g. VecF Ñ Set that forgets the structure of the vector spaces (i.e.
a vector space homomorphism is clearly a map between sets!) or VecC Ñ VecR that
forgets the complex structure of the vector space.

4. A contravariant functor, that does everything described above but reverses all the
arrows. Duality is an example of a contravariant functor. In fact, this is an exact
contravariant functor, since it preserves exactness (which is well-defined for categories).

11 9/25/17
My section: 1 - 2 PM today, SC 222. CA OH and Math Night: 8 - 10 PM, Leverett dining
hall.
Math Table: graduate school panel, tomorrow 5:30 - 6:30 PM. Professor Elkies’s OH:
7:30 - 9 PM, Lowell dining hall.
Diagnostic quiz in class on Wednesday: 11:07 - 12:00. Being on time would be helpful.
It covers material on problem sets 1 to 3 only.
Putnam exam: on December 2. Signups will be on the third floor under the undergraduate
section. I don’t think signups are up yet, but the math department is often disorganized so
it’s best to check.
We’re finishing Chapter 3 of Axler. Chapter 4 of Axler is about polynomials, but we’ve
already assumed some of Chapter 4 in earlier problem sets, so we’re not going to cover it.
It’s regarded as something that you probably already know (polynomials don’t change from
over R to over F much).

11.1 Finishing Up Duality


We still need to do some proofs of basic facts about linear maps between finite-dimensional
vector spaces and their duals. Last time, we said something about exact sequences behav-
ing under duality and how duality preserves exactness, but we didn’t prove any of these
statements.
Suppose you have U Ă V , i.e. the exact sequence 0 Ñ U Ñ V . The dual is 0 Ð U ˚ Ð V ˚
where the dual map V ˚ Ñ U ˚ is exactly what we defined it to be. We want to show that
this map is exact, or equivalently:
Lemma 11.1.1 (Axler 3.A.11). The dual of any injective map on finite-dimensional vector
spaces T : U → V is a surjective map T* : V* → U*.
Proof. We need to show that, given the inclusion U ⊂ V, any map from U → F can be extended to a map V → F. This is actually not an obvious statement, and it fails for infinite-dimensional vector spaces without the Axiom of Choice! Consider Q ⊕ Q√2 ⊂ R (as vector spaces over Q) and the map 1 ↦ 1, √2 ↦ 0; you can see that you can't actually extend such a map to all of R without Hamel bases.
In the finite-dimensional case one possible way to prove this is to use complements. Write
V “ U ‘ W and V ˚ “ U ˚ ‘ W ˚ and V ˚ Ñ U ˚ where pu˚ , w˚ q Ñ u˚ . In other words, if you
give me a map from U Ñ F, I get a map from V Ñ F by saying that everything in W goes
to 0 and then writing any vector v P V as v “ u ` w, sending u to whatever and w to 0. Of
course, this only works in the finite-dimensional case.

Now extend our short exact sequence to 0 Ñ U Ñ V Ñ V {U Ñ 0 and the dual
0 Ð U ˚ Ð V ˚ Ð pV {U q˚ Ð 0. We want to show that pV {U q˚ “ kerpV ˚ Ñ U ˚ q and have
identified pV {U q˚ “ W ˚ , which gives us the desired (along with a dimensions argument).
The more general fact is that any T : V → W induces a dual T* : W* → V*, with T*ϕ = ϕ ∘ T, as in the diagram

    V --T--> W --ϕ--> F.
Lemma 11.1.2 (Axler 3.107A, Axler 3.109B). ker T ˚ “ pT pV qq0 and Im T ˚ “ pker T q0 .
Proof. We want to show that ker T* = (T(V))⁰. This is almost immediate from the definitions: ker T* consists of exactly those ϕ with T*ϕ = ϕ ∘ T = 0, i.e. the ϕ that annihilate everything in T(V), as desired.
The hard part is the other direction, i.e. Im T* = (ker T)⁰. If v ∈ ker T, then (T*φ)v = φ(T(v)) = 0, thus Im T* ⊂ (ker T)⁰. If I have an element v* ∈ V* that annihilates everything in ker T, we want to show that it comes from a functional in W*. The easy way to do this is to compare the dimensions of our two subspaces and show that they are equal, which is just a computation.

\begin{align*}
\dim(\operatorname{Im} T^*) &= \dim W^* - \dim \ker T^* \\
&= \dim W - \dim (T(V))^0 \\
&= \dim W - (\dim W - \dim T(V)) \\
&= \dim T(V)
\end{align*}
and
\begin{align*}
\dim(\ker T)^0 &= \dim V - \dim \ker T \\
&= \dim V - (\dim V - \dim T(V)) \\
&= \dim T(V)
\end{align*}
as desired. So we have equality.


Note that we showed along the way that dim T*(W*) = dim T(V). This has important consequences for matrices.

11.2 Duality and Matrices


Suppose we have matrices A “ raj,k s and the corresponding dual matrix AT “ rak,j s. This
is often known as the transpose but corresponds to the dual linear transformation. If you
think about A as a map from Fm Ñ Fn , then AT is a linear transformation Fn Ñ Fm .
Specifically, if A corresponds to a map T , then AT corresponds to a map T ˚ upon picking
the dual basis on both sides.

You do need to check that this is true (which is left as an exercise since we don’t want
to do ugly matrix stuff in class). But we have shown that dim Im T “ dim Im T ˚ , which in
terms of matrices looks quite different. dim Im T is the dimension of the span of the columns
of the matrix, i.e. the column rank of A. dim Im T ˚ is the dimension of the span of the
rows of the matrix, i.e. the row rank of A. And thus we have:

Corollary 11.2.1 (Axler 3.115 - 3.119, 3.111 - 3.112). The row rank and column rank of a
matrix are the same.

This common number is called the rank. This is a nontrivial fact in terms of matrices,
but is relatively obvious in terms of duals. This is a common punchline in linear algebra
proofs.
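As a quick numerical illustration (mine), a random rectangular matrix of rank 3 has the same row rank and column rank:

import numpy as np

rng = np.random.default_rng(0)
# A 4x7 matrix of rank 3: product of random 4x3 and 3x7 factors.
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 7))

col_rank = np.linalg.matrix_rank(A)      # dimension of the column space, dim Im T
row_rank = np.linalg.matrix_rank(A.T)    # dimension of the row space, dim Im T*
print(col_rank, row_rank)                # 3 3
assert col_rank == row_rank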
If you have a 1 × n matrix, then the rank is clearly 1 (aside from the 0 matrix). If you have a 2 × 2 matrix, the rank is 0 if you have the 0 matrix. Letting the matrix be

    \begin{pmatrix} a & b \\ c & d \end{pmatrix},

we see that a nonzero such matrix has rank 1 if ad = bc, i.e. the rows are multiples of each other. The rest are rank 2, i.e. invertible matrices or bijections. Things get a lot more complicated in the general case.
Recall that linear transformations on Fn Ñ Fn that are bijective/invertible form a group.
This group is the group of general linear transformations and called either GLpnq or
GLn pFq. In general, it’s not commutative. If you want, you can also talk about GLpV q for
a general vector space (all invertible linear transformations T : V Ñ V ).

11.3 Linear Transformations from One Vector Space to Itself


We’ll spend quite a bit of time focusing on the study of linear transformations from one
vector space to itself. This is significantly more complicated than what we’ve studied before
and also much more interesting, since you can iterate linear transformations to get interesting
properties.
Why is this so much more interesting? Consider a general linear transformation T : V → W where dim V = n, dim W = m. You have n² degrees of freedom in choosing a basis of V and m² in choosing a basis of W, for a total of m² + n² − 1 effective parameters. But the matrix of T has only mn entries, and m² + n² − 1 > mn. Thus most linear transformations V → W look the same, and we've already shown that dim V, dim W, and the rank/kernel tell you everything about the matrix.
What about T : V → V? There are still n² degrees of freedom in choosing a basis, but you have to use the same identification on both sides. So you really only have n² choices for n² scalars, so it looks like we should have a continuous invariant, i.e. an element of the field that we can pull out of T. There are actually n such elements, and we'll see what these are soon. This will lead us to eigenspaces and eigenbases, etc.
Why do you want to iterate maps? It turns out to be useful to generate the Fibonacci numbers by iterating

    \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}.

You can also characterize Fibonacci-esque recursions in this way, that turn out to be relevant
for population dynamics and the like.
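For instance, here is a short sketch (my own) of generating Fibonacci numbers by iterating this matrix: starting from (F_1, F_0) = (1, 0), the nth iterate is (F_{n+1}, F_n).

import numpy as np

M = np.array([[1, 1],
              [1, 0]])

v = np.array([1, 0])       # (F_1, F_0)
for _ in range(10):
    v = M @ v              # one step of the Fibonacci recursion

print(v)                   # [89 55], i.e. (F_11, F_10)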

12 9/29/17
We are starting Axler Ch. 5, about eigenstuff. We will need to say something about the
complex numbers being algebraically closed (Ch. 4) but will talk about that later.
If you want to talk to Professor Elkies, please meet him Sunday, 2 - 4 PM on the 4th
floor common room. Rohil and I are also available to talk.
Add/drop deadline on Monday. If you are switching between 23/25/55, there should be
no fee; contact Professor Cliff Taubes if you are charged.
PS 4 due Monday.

12.1 Motivation for Eigenstuff


There are a number of reasons to study eigenstuff, which are characteristic of linear trans-
formations T : V Ñ V . We will see more and more applications throughout the semester,
including field extensions F Ă F 1 and others. The motivation you’d want to keep in mind is
that such a map can be iterated, i.e. T 2 “ T ˝ T and you can keep going.
Such iterations arise in a number of cases, e.g. predator-prey models or planetary rota-
tions. Many moons around large planets have resonances in them (e.g. Galilean resonances)
and these can be modeled by iterating the differential change in position (i.e. velocity) as
determined by Newton’s second law. This map is certainly not linear, but in multivariate
calculus we will find that we can choose coordinates and make approximations to find the
derivative, a linear map that approximates this function near the origin. Iterates of this linear
map will approximate the full nonlinear function.

12.2 Invariant Subspaces


We’ll be particularly concerned with the finite-dimensional case. The simplest possible case is
dimension 0 but that’s trivial. Thus consider dimension 1. Then T : V Ñ V is multiplication
by a scalar T : v ÞÑ λv for some λ P F. Thus T “ λ id and T n “ λn id as well. This is
extremely nice, and as it turns out we’ll come very close to reducing the general case to this!
The first step is the following definition:

Definition 12.2.1. A subspace U Ă V is an invariant subspace under T : V Ñ V if


T pU q Ď U , i.e. u P U implies T u P U .

It would be really nice if there were a complement V = U ⊕ W such that both U and W were invariant. Then we could just consider the smaller-dimensional vector spaces U and W instead of thinking about V. If you want to think about this in terms of matrices, we essentially get

    M(T) = \begin{pmatrix} M(T|_U) & * \\ 0 & M(T|_W) \end{pmatrix}

when picking a basis for U and W. You can then iterate this over and over again, getting essentially a block diagonal matrix (if you can get the * to be 0).
The complement in general is not invariant, since you cannot define the complement
purely canonically. But we always have the quotient space! The key observation is that we

always have an induced map T : V {U Ñ V {U . Axler calls this map T {U , but we will not use
this notation. You can define this map via T prvsq “ rT vs or via the commutative diagram
(I couldn't resist)

    V  --T-->  V
    π|         |π
    v          v
   V/U --T--> V/U

In either case, you need to check that this map is well-defined, i.e. rvs “ rv 1 s implies
rT vs “ rT v 1 s. But rvs “ rv 1 s implies that v ´ v 1 P U so T pv ´ v 1 q P U or rT vs “ rT v 1 s, as
desired.
Thus we now have the matrix

    M(T) = \begin{pmatrix} M(T|_U) & * \\ 0 & M(T|_{V/U}) \end{pmatrix}

which is known as a block upper triangular matrix. If we repeat this dissecting pro-
cess to get 1-dimensional subspaces, then we get an upper triangular matrix. Iterating
this transformation gets another upper triangular matrix. Ideally, we’d even want to get a
diagonal matrix, which is very easy to deal with.

12.3 Eigenspaces and Eigenvectors


Example 12.3.1. Examples of invariant subspaces:

1. t0u and V are invariant subspaces.

2. ker T and Im T are invariant subspaces.

3. If U and U 1 are invariant, then so are U X U 1 and U ‘ U 1 .

4. We have the following:

Lemma 12.3.2 (Axler “5.0”). Given T : V Ñ V , U Ă V a subspace, and λ P F,


define T ´ λI to be pT ´ λIqpvq “ T v ´ λv. Then U is invariant under T ´ λI iff U
is invariant under T .

Thus we can try lots more things to get invariant subspaces. Note that kerpT ´ λIq “
tv P V | T v “ λvu is also invariant under T . We thus have:

Definition 12.3.3. The λ-eigenspace of T is the space V_λ = ker(T − λI). The nonzero vectors in this eigenspace are called eigenvectors.

Definition 12.3.4. λ is an eigenvalue if Vλ ‰ 0.

We want to generate a basis of eigenvectors so that our matrix becomes diagonal (or at least upper triangular). Does there even always exist one eigenvalue? This is not the case over all fields F. Consider rotation by 90° in R². This clearly has no (real) eigenvalues or eigenvectors. However, there will always be eigenvalues when working with F = C or any other algebraically closed field.
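A quick numerical aside (mine): asking numpy for the eigenvalues of the 90° rotation matrix illustrates exactly this point, since the answers ±i only exist once we pass to C.

import numpy as np

R = np.array([[0., -1.],
              [1.,  0.]])           # rotation by 90 degrees in R^2

eigvals = np.linalg.eigvals(R)
print(eigvals)                       # [0.+1.j 0.-1.j]: no real eigenvalues, two complex ones
assert np.allclose(eigvals.real, 0.) and np.allclose(sorted(eigvals.imag), [-1., 1.])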
Traditionally, at this point you use the determinant to prove that eigenvalues exist. We
will not be doing that; instead we’ll naturally construct a polynomial that will demonstrate
the existence of eigenvalues.

12.4 Quiz Results


I will not be posting the distribution of grades here. If you scored between a 35 and 40, you
are fine. If you scored significantly below this, you should seriously consider switching classes
to ensure that you aren’t totally in over your head and are understanding the material. If
you are somewhere in between, you should either come in and talk to us or make sure that
you didn’t just totally mess up on the quiz.
The definitions were mostly fine on the quiz. The key fact on the last question was to
say that T 2 “ 0 implies that Im T Ă ker T which allows you to apply rank-nullity. We
have dim Im T ă dim ker T so we must have dim Im T “ 1, dim ker T “ 2 or dim Im T “
0, dim ker T “ 3. A few people used bases, which makes me sad.
There was some confusion about the difference between FrXs and FF . The first is the
set of polynomials and the second is the set of maps F Ñ F. Remember that Frxs is not
a subspace of FF ! In finite fields, x2 ´ x in characteristic 2 evaluates to 0 everywhere but
is not the 0 polynomial. The codimension of AP or the dimension of P{AP is just deg A,
since you can always write P “ QA ` R with deg R ă deg A, which gives you a complement
of AP as desired.

13 10/2/17
PS 4 due now. PS 5 will be posted soon.
My section/Math Night as usual.
Today is the last day to add or drop a class. Please contact one of the CAs if you are
still uncertain.
Tomorrow: math table, Davis Lavowski on Poincare-Birkhoff. Elkies’s office hours as
usual.
Putnam sign-ups are due by October 18.

13.1 Review: Invariant Subspaces and Eigenvectors


Last time, we had a vector space V of dimension 1 < dim V < ∞ and some linear transformation T : V → V. We want to find invariant subspaces U ⊊ V, which give us an induced action of T on the quotient space, i.e. (a different diagram than the one I used)

    0 → U → V → V/U → 0
        |T   |T    |T/U
        v    v     v
    0 → U → V → V/U → 0

Last time, we checked that there exists a map that makes the above diagram commute
and this map is well-defined.
The idea is that we try to find invariant subspaces until we cut everything down to one
dimension.
Why do we assume finite-dimensionality? Here’s a counterexample. Suppose F “ R, C
and V “ Cpr0, 1s, Fq “ tf : r0, 1s Ñ F | f continuousu. Let T : V Ñ V with f ÞÑ xf , where
pxf qptq “ tf ptq. The claim here is that there is no one-dimensional invariant subspace, i.e.
no eigenvector. Thus there is no nonzero function f ‰ 0 and real or complex number λ P F
such that xf “ λf . Thus invariant subspaces in an infinite-dimensional space will look more
complicated than those in a finite-dimensional case.
You might wonder whether you can define a codimension 1 invariant subspace? Or any
invariant subspace? I leave that to you.

13.2 Linear Independence of Eigenvectors


Recall the definitions of eigenspaces, eigenvalues, and eigenvectors. The key fact is the
following:

Theorem 13.2.1 (Axler 5.10). Eigenspaces for different eigenvalues are linearly independent. That is, for pairwise distinct λ_i, if v_i ∈ V_{λ_i} and Σ_i v_i = 0 (with finitely many i), then each v_i = 0.

This implies that if you have found n different eigenvalues in a space of dimension n, you
have found all possible eigenvalues.

Proof. We know that T vi “ λi vi with each λi distinct. Take a set such that the number
of nonzero vi is minimal. If not all vectors are zero, we’ll produce a linear relationship
with fewer nonzero vectors (but not all); this will produce a contradiction. If n “ 1, we’re
obviously done.
Now induct on the number of eigenvectors. Suppose

v1 ` v2 ` . . . ` vN `1 “ 0.

We have
T v1 ` . . . T vN `1 “ λ1 v1 ` . . . ` λN `1 vN `1 “ 0.
We can now get rid of one of the vs by subtracting λN `1 times the first linear combination.
We get
pλ1 ´ λN `1 qv1 ` . . . ` pλN ´ λN `1 qvN “ 0.
This has a smaller number of nonzero vectors, as desired.

13.3 Existence of Eigenvectors


We can now avoid using determinants via the following observation. Consider v, Tv, T²v, T³v, .... The span of these vectors is clearly a T-invariant subspace. If dim V = n, the n + 1 vectors v, Tv, ..., Tⁿv must be linearly dependent, so

    a₀v + a₁Tv + a₂T²v + ... + aₙTⁿv = 0

for some scalars aᵢ, not all zero. But a₀I + a₁T + ... + aₙTⁿ is also a linear transformation. We've written this as if it were a polynomial evaluated at T; this needs a little care, since T is a linear transformation, not a formal variable.
If we are over an algebraically closed field, we might expect to factor this as

an pT ´ t1 IqpT ´ t2 Iq . . . pT ´ tm Iqv “ 0.

Thus we have a sequence of linear transformations that in a finite number of steps sends v
to 0. One of these linear transformations must have nontrivial kernel, implying the existence
of an eigenvalue.
We now want to formalize this proof to get the theorem:

Theorem 13.3.1 (Axler 5.21). Every linear transformation on a nonzero finite-dimensional vector space over an algebraically closed field has an eigenvalue.

Proof. Define the map Frxs Ñ LpV q “ LpV, V q “ HompV, V q “ EndpV q. This is the
endomorphism ring and is a non-commutative ring in general. Our map is defined by
x ÞÑ T ; this implies 1 ÞÑ I, x ÞÑ T, x2 ÞÑ T 2 , x3 ÞÑ T 3 , . . .. This homomorphism is both a
ring homomorphism and an F-algebra homomorphism.
Definition 13.3.2. An F-algebra is a ring that contains the field F, making it a vector
space over F.

Let this map be h_T. Then h_T(a) = aI for constants a ∈ F, and we can extend this to a map on all of F[x] by linearity. You can check that this map preserves multiplication and addition. F[x] is infinite-dimensional, while End(V) is finite-dimensional (of dimension n²). This means that we have a nontrivial kernel, so there is a nonzero polynomial of degree ≤ n² that kills every vector, i.e. annihilates the entire vector space!
The ker hT is an ideal of the ring Frxs.
Definition 13.3.3. An ideal I Ă R is a subring of R such that for all r P R, we have
rI Ă I. This is stronger than just requiring closure of multiplication within I.
It is a nonzero ideal, as we’ve established, and cannot be the whole ring. All ideals
in the polynomial ring are generated by one element (exercise: use the Euclidean algo-
rithm/Bezout’s lemma) so the polynomials are a principal ideal domain. Call this gen-
erator P , so ker hT “ P Frxs. This P can be characterized as the monic polynomial of the
minimal degree such that hT pP q “ 0. We call this the minimal polynomial of T .
We can now factor the minimal polynomial of T and proceed as above to get the desired
result.
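The proof suggests an algorithm. Here is a rough computational sketch (my own, using sympy for exact arithmetic; the matrix T below is an arbitrary example) that finds the minimal polynomial by looking for the first linear dependence among I, T, T², ....

import sympy as sp

T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])           # example operator; its minimal polynomial is (x-2)^2 (x-3)

n = T.shape[0]
x = sp.symbols('x')

power = sp.eye(n)
columns = [power.reshape(n * n, 1)]            # vec(I)
for k in range(1, n * n + 1):                  # degree <= n^2 suffices, as argued above
    power = power * T
    columns.append(power.reshape(n * n, 1))    # vec(T^k)
    null = sp.Matrix.hstack(*columns).nullspace()
    if null:
        coeffs = null[0]                       # a_0, ..., a_k with  sum_i a_i T^i = 0
        minpoly = sum(c * x**i for i, c in enumerate(coeffs))
        print(sp.factor(minpoly))              # (x - 3)*(x - 2)**2, up to a scalar factor
        break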

14 10/4/17
Rohil’s section as usual tomorrow. Today - Friday: CMSA workshop continues.
Next Monday is a university holiday so we will not have class. However, section and
Math Night/OH will be held as usual.
Putnam signups are due October 18.
To proceed further with eigenvalues and eigenspaces, it’s most natural to work in the
context of algebraically closed fields. There are several questions here: what is an alge-
braically closed field? where does that term come from? why do such fields exist (e.g. the
complex numbers C)? This is really a topological/analytic fact and doesn’t really belong in
55a; we’ll prove it very easily in 55b. The textbook gives a very fast proof using complex
analysis which we will outline here.

14.1 Remainder and Factor Theorems


Start from some field F and look at the ring of polynomials F[t] = P (i.e. polynomials of the form Σ_{i=0}^{n} a_i t^i for any n ∈ N with each a_i ∈ F). Suppose you have t₁ ∈ F and P ∈ F[t]; then you can evaluate P(t₁) by using polynomial division.
In general, we know that for any P, Q ∈ F[t] with P ≠ 0 and deg P = d, where:

Definition 14.1.1. The degree deg P of a polynomial P = Σ_{i=0}^{n} a_i t^i is the maximum i such that a_i ≠ 0,

there exist unique A, B ∈ F[t] such that Q = AP + B and deg B < d. This is just polynomial division. Now we can do the computation

P ptq “ Aptqpt ´ t1 q ` B

where deg B < 1, i.e. B is constant. Substituting t₁ for t, we have P(t₁) = B. You can think about substitution as a map ev_{t₁} : F[t] → F, which is just evaluation at t₁. You can check that this is a ring homomorphism, but this is really a good way of obfuscating something rather trivial.
Similarly, we know that P(t₁) = 0 is equivalent to P(t) = (t − t₁)A(t). Thus the kernel of our evaluation map is just the ideal generated by (t − t₁).
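A small numerical check of this (mine, using numpy's polynomial division with an arbitrary cubic): dividing P by (t − t₁) leaves exactly the remainder P(t₁).

import numpy as np

P = np.array([2., -3., 0., 5.])              # P(t) = 2t^3 - 3t^2 + 5, highest degree first
t1 = 2.0

A, B = np.polydiv(P, np.array([1., -t1]))    # P = A * (t - t1) + B
print(B, np.polyval(P, t1))                  # remainder B equals P(t1) = 9
assert np.isclose(B[-1], np.polyval(P, t1))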

Definition 14.1.2. If P(t₁) = 0, then t₁ is a root or a zero of P.

14.2 Algebraically Closed Fields


In basic algebra, we spend a lot of time solving these polynomial equations. Degree 0 is
impossible, degree 1 is trivial, degree 2 is the quadratic formula, etc. But we can always
reduce degree of the polynomial we’re trying to solve using the above technique (assuming
we can find a root).

Definition 14.2.1. Suppose P P Frts, P ‰ 0. Then we say that P factors completely if


there exist t1 , . . . , tn P F such that P ptq “ apt ´ t1 q . . . pt ´ tn q for some nonzero a P F.

Proposition 14.2.2. The following are equivalent:
1. Every nonzero P P Frts factors completely.
2. Every P P Frts of degree ě 1 has a root in F.
Proof. The direction 1 ùñ 2 is essentially trivial. The direction 2 ùñ 1 is also easy; just
write P ptq “ pt ´ t1 qQptq where deg Q ă deg P and induct on deg P .
Definition 14.2.3. A field that satisfies the above conditions is algebraically closed.
We know that the real numbers are not algebraically closed. For example, pptq “ t2 ` 1
has no real solutions. Any equation of odd degree has at least one real solution by the
intermediate value theorem, which we have not proven and is also a topological fact about
the real numbers. With enough algebra, one can reduce the fundamental theorem of algebra
to this statement, but it is still topological.
If you make your real numbers larger by choosing some solution of this and calling it i, i.e. i² + 1 = 0, then the numbers a + ib for a, b ∈ R can be used to do arithmetic, like addition, subtraction, multiplication, and even division via

    1/(a + ib) = (a − ib)/(a² + b²).
However, we find that, surprisingly, √i etc. are all already in our new field! This suggests that the complex numbers are in fact algebraically closed.
Theorem 14.2.4 (Fundamental Theorem of Algebra). The complex numbers are alge-
braically closed.
Sketch of Proof. The idea is to consider the number of times various paths go around the origin, known as the winding number. You can use a polynomial with no root to show that a path with winding number 0 deforms continuously to one with winding number n, which is impossible, contradiction. We'll make this more rigorous and clearer next semester.

14.3 Field Extensions


Let’s generalize this picture slightly. Start with some field F and consider another field
F1 such that F Ă F1 . We know that F1 is automatically a vector space over F (the scalar
multiplication of scalars in F is just the ordinary multiplication operation in F1 ).
Definition 14.3.1. The degree of a field extension rF1 : Fs “ dimF F1 .
We also just draw a diagram like this:

    F′
    |  d
    F
If d “ 1, then F1 “ F. The claim is that algebraically closed fields have only one possible
finite-degree extension.

Theorem 14.3.2. F is algebraically closed iff there is no field F1 Ą F of finite degree aside
from F itself.

This proof tells us something about how the ideas we’ve developed so far tell us things
about the structure of fields that are not automatically about vector spaces.
Proof. We first prove that if F is not algebraically closed then there exists such a field F1 .
If F is not algebraically closed, then let P be a polynomial that does not factor completely.
We know deg P ě 2 and assume P irreducible (otherwise just pick an irreducible factor that
does not factor completely).
Construct a ring F1 “ Frxs{P pxqFrxs. This quotient ring (the ring of polynomials mod
P ) guarantees that x is a root of P . P pxqFrxs is an ideal so this quotient ring makes sense.
We do need to check that this is actually a field and that it has the right dimension. The
dimension of this is the codimension of P pxqFrxs which is deg P ě 2. So we just need to
check that it is actually a field.
Suppose you have a ‰ 0, a P F1 . We need to show that there exists an inverse. We have
a “ Apxq for some polynomial A and want to show that there exists b such that ab ” 1
pmod P q. Multiplication by a, i.e. b ÞÑ ab is an F-linear map from F1 Ñ F1 . F1 is a finite-
dimensional vector space, so we’ll just show that this map is surjective. By rank-nullity, it’s
enough to show that this map is injective, i.e. that ab ≡ 0 (mod P) forces b ≡ 0 (mod P). If b ≢ 0, then since P is irreducible and P | Ab, we must have P | A or P | b; both are impossible since A and b are nonzero of degree less than deg P, contradiction. Thus we are done.
Now we need to show the reverse direction, i.e. if there exists such an F′ then F is not algebraically closed. Let x ∈ F′, x ∉ F. Consider 1, x, x², ..., x^d where d = [F′ : F]. Given d + 1 vectors in a d-dimensional vector space, there must be a linear relation between them. We know that 1 ≠ 0 and x is linearly independent from 1, so there exist a₀, a₁, ..., a_e with 1 < e ≤ d and a_e ≠ 0 such that

    Σ_{i=0}^{e} a_i x^i = 0.

I claim that the polynomial Σ_{i=0}^{e} a_i t^i does not split completely over F. If it did, it would be a_e ∏_{i=1}^{e} (t − t_i) with each t_i ∈ F. But substituting x then implies that x = t_i for some i, contradiction.
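As a concrete (and entirely optional) illustration of the construction F′ = F[x]/P(x)F[x], here is a tiny Python sketch for F = F₂ and the irreducible P(x) = x² + x + 1, which produces a field with four elements; the representation of elements as coefficient pairs is my own choice.

# Elements of F_2[x]/(x^2 + x + 1), written as (a0, a1) meaning a0 + a1*x with a_i in {0, 1}.

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    # (a0 + a1 x)(b0 + b1 x) = a0 b0 + (a0 b1 + a1 b0) x + a1 b1 x^2, with x^2 = x + 1 mod P.
    c0 = u[0] * v[0] + u[1] * v[1]
    c1 = u[0] * v[1] + u[1] * v[0] + u[1] * v[1]
    return (c0 % 2, c1 % 2)

elements = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Every nonzero element has a multiplicative inverse, so the quotient ring is a field.
for u in elements[1:]:
    assert any(mul(u, v) == (1, 0) for v in elements[1:])

# x really is a root of P in the new field: x^2 + x + 1 = 0.
x = (0, 1)
assert add(add(mul(x, x), x), (1, 0)) == (0, 0)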

This gives us a good way to construct algebraic closures; just keep adding field extensions
for all remaining polynomials that do not split completely and keep stacking extensions!
If your field is countable, this works no matter what; if you are allowing the Axiom of
Choice/Zorn’s lemma, this works always (exercise). If you’re in the complex numbers, you
can just pick C.

15 10/6/17
Monday: no class, my section/Math Night as usual.
Tuesday: Math table (Carlos Arbos-Ribeira, Cobordism Groups) and OH as usual.
Wednesday: PS 5 due, start tensor algebra.

15.1 Finishing Up Eigenvalues


Let T : V Ñ V as usual. Recall that an eigenvalue is λ such that there exists v such that
T v “ λv for v ‰ 0 P V, λ P F, or equivalently v P kerpT ´ λIq. v is an eigenvector; such
vectors constitute a subspace Vλ “ kerpT ´ λIq when you put the zero back in. Another way
to say that λ is an eigenvalue is to say that Vλ ‰ 0.
Recall Theorem 13.2.1. We have the following corollary:
Corollary 15.1.1 (Axler 5.13+). dim V ă 8 implies that the number of eigenvalues ď
dim V .
When does equality hold? We require there to be n distinct eigenvalues, i.e.

    V = ⊕_{i=1}^{n} V_{λ_i},

which implies that each dim V_{λ_i} = 1. We may now pick a basis for each V_{λ_i} to get a basis for V on which M(T) is diagonal with diagonal entries being the eigenvalues λ_i. We loosen the distinctness condition on the eigenvalues to get:
Definition 15.1.2. T is diagonalizable if there exists a basis on which T has a diagonal
matrix, or equivalently V has a basis of eigenvectors of T .
Recall also the minimal polynomial P pT q “ 0. Factorizations of this polynomial necessar-
ily give you all possible eigenvalues. Alternatively, you could just start with v, T v, T 2 v, . . . , T n v
to get a polynomial P pT qv “ 0. We therefore also get the existence of eigenvalues in an
algebraically closed field by factoring the minimal polynomial.
We’ll now spend a bit of time talking about linear transformations that are less nice than
diagonalizable transformations, i.e. upper-triangular transformations.

15.2 Upper-Triangular Matrices and Flags


Definition 15.2.1. An upper-triangular matrix is one whose nonzero entries all lie on or above the main diagonal.
Adding two upper-triangular matrices, multiplying them together, or multiplying one by a scalar gives you another upper-triangular matrix. A diagonal matrix preserves a direct sum decomposition into 1-dimensional subspaces, while an upper-triangular matrix preserves a flag.
Definition 15.2.2. A flag or filtration in a finite-dimensional vector space V {F is a se-
quence of subspaces 0 “ V0 Ă V1 Ă V2 Ă . . . Ă Vn “ V with each Vi having dimension
i.

From a basis, we can get a flag by considering V_i = Span(v_1, ..., v_i). Observe that upper-triangular matrices respect the flag coming from the basis of that matrix, since T v_i ∈ Span(v_1, ..., v_i) by definition. In particular, if the ith diagonal entry is λ_i, then the induced map on V_i/V_{i−1} is multiplication by λ_i.
We can now determine a condition for when an upper-triangular matrix is invertible.

Proposition 15.2.3 (Axler 5.30). If M is upper-triangular, M is invertible iff all diagonal


entries are nonzero.

Proof. In the forward direction, if some λi “ 0, then the induced map on Vi {Vi´1 is 0, so
T pVi q Ă Vi´1 , contradiction.
In the backward direction, suppose that v P ker T , i.e. T v “ 0, v ‰ 0. Thus v P Vi , v R
Vi´1 for some i. Since λi ‰ 0, we must have T v P Vi , T v R Vi´1 since when we take mod Vi´1
we find T v “ λi v ‰ 0. But this is clearly a contradiction.

Proposition 15.2.4 (Axler 5.32). If M is upper-triangular, its eigenvalues are precisely its
diagonal entries.

Proof. Apply the previous to M ´ λI.


Why is this so messy? There might be vectors killed by pT ´ λIqk as opposed to T ´ λI.
This is known as the nilspace, and it turns out that the dimension of the nilspace is equal to
the multiplicity of the λi . If the matrix is diagonalizable, the dimension of the eigenspace is
the dimension of the nilspace. If we can only get an upper-triangular matrix, it is because
of vectors in the nilspace but not in the kernel. We will cover this a bit later in Chapter 8.

15.3 Preview: Inner Products


The next main topic is inner products and inner product spaces. Inner products connect
most strongly with the geometry of the real plane by giving a notion of distances and angles.
The inner product is a map V ˆ V Ñ F that is bilinear, i.e. fixing any v P V gets a map
V Ñ F or an element of V ˚ . Thus inner products are the same as maps V Ñ V ˚ . We have
the additional condition that our inner product is symmetric.
We can reinterpret such a form as an element of the tensor product V* ⊗ V*. The tensor product allows us to speak more generally about many other constructions, including the trace and determinant. Thus we will talk about the tensor product next week.

16 10/11/17
Today: tensor products. PS 6 posted online later today and available in class; PS 5 due
now.
Science Center A 4:15 - 5:15 today and tomorrow: Tim Gowers is giving the Ahlfors
lecture series on additive combinatorics.
We’ll use tensor products to prove later today that the trace (concretely in terms of
matrices, the sum of diagonal elements) is coordinate-independent and thus a canonical
map. You may have seen a proof of this based on bases, but we’ll show why the trace is
actually a very natural map. Later, we’ll do something similar for the determinant.

16.1 Definition of the Tensor Product


The tensor product of two vector spaces U, V over the field F is a new vector space U b V
that is kind of like the Cartesian product. We’d expect for finite-dimensional U, V that
dimpU b V q “ pdim U qpdim V q. This is occasionally known as the Kronecker product.
Picking bases u_1, ..., u_m and v_1, ..., v_n, we find that the nm elements u_i ⊗ v_j will span our
tensor product. Along with some rules about how we manipulate this symbol b, this will
eventually show that we have a basis-independent definition of the tensor product. But this
construction is non-canonical and seems to depend on choice of basis, which is not ideal.
We don’t want to have to prove that everything is basis-independent and have to do lots of
calculations with change of basis.
Thus we want a canonical construction of U b V . The nicest way of doing this is as a
quotient space of two huge infinite-dimensional vector spaces. This is a construction that
you’ll see over and over again in algebra.
In general, for any u P U, v P V , there will be a pure tensor u b v P U b V (also
known as simple tensors). Pure tensors span the space U b V , but not all elements of
the tensor product are pure tensors (this is the most common error students make with
tensor products). We’ll need some rules for dealing with these pure tensors. We will require
bilinearity, i.e.

    (Σ_{α=1}^{A} a_α u_α) ⊗ (Σ_{β=1}^{B} b_β v_β) = Σ_{α=1}^{A} Σ_{β=1}^{B} a_α b_β (u_α ⊗ v_β).

Using this rule, we see that any u b v can be written as a linear combination of the
ui b vj . But how do we know that the ui b vj are linearly independent? This is much harder
to prove. And what about vector spaces for which we don’t know a basis? We’ll want to
form the tensor product of R bQ R, for example.

Definition 16.1.1. We define the tensor product of two vector spaces U, V over F to be
U b V “ Z{Z0 . Z is generated by tu b v | u P U, v P V u formally (we have no rules on how to
add each of these symbols u b v; they’re just symbols). Z0 is generated by all the relations:
pu ` u1 q b v ´ u b v ´ u1 b v for u, u1 P U, v P V , u b pv ` v 1 q ´ u b v ´ u b v 1 for u P U, v, v 1 P V ,
pauq b v ´ apu b vq and u b pavq ´ apu b vq for a P F, u P U, v P V .

Note that we have a natural map Z Ñ U b V that sends u b v ÞÑ u b v with kernel Z0 .

16.2 Properties of the Tensor Product
First, we’ll show that the tensor product has the basis we expect.

Lemma 16.2.1. If ui and vj are bases for U and V , then ui b vj is a basis for U b V .

We are not going to prove this by showing that our set is linearly independent and spans
the vector space. Exercise: try doing this; see how hard it is.
Proof. Let W have basis wij “ ui b vj ; we’ll show that W » U b V . Construct α : W Ñ
U b V, β : U b V Ñ W and prove that α ˝ β “ id, β ˝ α “ id. This demonstrates the
isomorphism by providing a map with a two-sided inverse.
Define αpwij q “ ui b vj (technically I should take this mod Z0 , but I’ll drop that since
people consider this implicit).
The other direction is trickier. We define our transformation on Z and check that it is 0 on Z₀, thus inducing a transformation on Z/Z₀. Define β̃ : Z → W with

    β̃(u ⊗ v) = β̃((Σ_i a_i u_i) ⊗ (Σ_j b_j v_j)) = Σ_i Σ_j a_i b_j w_{ij}.

We need to check that β̃(Z₀) = 0 to ensure that β̃ induces a map β : Z/Z₀ = U ⊗ V → W. This is easy enough to check manually (exercise).
We now need to check that α ∘ β = id and that β ∘ α = id. Check both on a basis. The second is trivial; note that β(α(w_{ij})) = β(u_i ⊗ v_j) = w_{ij}. For the first, note that

    α(β(u ⊗ v)) = α(Σ_i Σ_j a_i b_j w_{ij}) = Σ_i Σ_j a_i b_j (u_i ⊗ v_j) = u ⊗ v

by bilinearity. Thus we have our two-sided inverse and get an isomorphism.


Note: most of the above proof is relatively tautological, just written out. If you think
that this is easy, you’re not missing anything.
There are some natural consequences that you might expect from the use of this notation.
We have:

1. U b V » V b U , with the map between the two being u b v ÞÑ v b u.

2. pU b V q b W » U b pV b W q, with the map being pu b vq b w ÞÑ u b pv b wq.

3. pU ‘ V q b W » pU b W q ‘ pV b W q, with the map being pu, vq b w ÞÑ pu b w, v b wq.

In all cases, we need to check that our maps are well-defined under the relations, i.e. the
elements of Z0 , and that a two-sided inverse exists. But there is an easier way to prove many
of these statements. . .
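A concrete coordinate picture (my own aside): for U = F^m and V = F^n, the coordinates of a pure tensor with respect to the basis u_i ⊗ v_j are given by numpy's Kronecker product, which makes bilinearity and the failure of most elements to be pure tensors easy to see.

import numpy as np

u, v = np.array([1., 2.]), np.array([3., 0., -1.])

pure = np.kron(u, v)        # coordinates of u (x) v: the six products a_i * b_j
print(pure)                 # [ 3.  0. -1.  6.  0. -2.]

# Bilinearity: (2u) (x) v = 2 (u (x) v)  and  (u + u') (x) v = u (x) v + u' (x) v.
u2 = np.array([0., 5.])
assert np.allclose(np.kron(2 * u, v), 2 * np.kron(u, v))
assert np.allclose(np.kron(u + u2, v), np.kron(u, v) + np.kron(u2, v))

# Not every element is a pure tensor: reshaped into a 2x3 matrix, pure tensors have rank 1,
# but e_1 (x) f_1 + e_2 (x) f_2 has rank 2.
t = np.kron([1., 0.], [1., 0., 0.]) + np.kron([0., 1.], [0., 1., 0.])
assert np.linalg.matrix_rank(t.reshape(2, 3)) == 2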

16.3 Universal Property of the Tensor Product
Suppose we have a map B : U ˆ V Ñ X (just the Cartesian product).

Definition 16.3.1. B is bilinear if fixing u P U gets a linear map V Ñ X and fixing v P V


gets a linear map U Ñ X.

Proposition 16.3.2 (Universal Property of the Tensor Product). Bilinear maps B : U ˆ


V → X are in bijection with linear maps T : U ⊗ V → X, where T(u ⊗ v) = B(u, v).

Defining T_univ : U × V → U ⊗ V as T_univ(u, v) = u ⊗ v, we have the diagram

    U × V --T_univ--> U ⊗ V
        \              |
       B \             | ∃! T
          \            v
           ----------> X

i.e. every bilinear map B : U × V → X factors uniquely as B = T ∘ T_univ.
You need to check that the map T makes sense (under the relations) and is actually a
linear map and that there is a bijection between B and T . We’ll leave this as an exercise.
You can define the tensor product through the universal property, since universal prop-
erties are unique up to isomorphism (you do need to construct it, though). You can then
just use the universal property to prove all the properties we listed above by proving that
the maps we showed are actually bilinear in both directions.
You can also define the tensor product of linear transformations.

Definition 16.3.3. The tensor product of linear transformations T : U Ñ V, T 1 : U 1 Ñ


V 1 is T bT 1 , defined as the map induced by U ˆU 1 Ñ V bV 1 that sends pu, u1 q ÞÑ T puqbT 1 pu1 q.
This map is bilinear, and thus we induce a linear map T b T 1 .

17 10/13/17
There's a typo on the official handout: on page 2, the first displayed equation should read

    β̃(u ⊗ v) = Σ_i Σ_j a_i b_j w_{ij},

which is not quite yet u_i ⊗ v_j.

17.1 Review: Tensor Products of Linear Transformations


Last time, we constructed the tensor product of linear transformations S : U → U′, T : V →
V 1 by pu, vq ÞÑ Spuq b T pvq which induces a map S b T : U b V Ñ U 1 b V 1 .
Observe that this gives a bilinear map HompU, U 1 q ˆ HompV, V 1 q Ñ HompU b V, U 1 b V 1 q
which yields a linear map

HompU, U 1 q b HompV, V 1 q Ñ HompU b V, U 1 b V 1 q.

This is an isomorphism for finite-dimensional U, U 1 , V, V 1 . One can prove this by picking a


basis to observe that the dimensions on both sides are the same and can define a map taking
one basis to another. You can also do this by other basis-free means if you want to.
We already know that V b F “ F b V “ V , so we get

U ˚ b V ˚ » HompU, Fq b HompV, Fq » HompU b V, F b Fq “ pU b V q˚ .

This is an extremely useful statement and is an isomorphism only in the finite-dimensional


case.
We also get the statement

U ˚ b V » HompU, Fq b HompF, V q » HompU, V q

which implies that the space of linear transformations is itself a tensor product of the dual!
This is again only an isomorphism in the finite-dimensional case.

17.2 Bilinear Forms


A bilinear form is a bilinear map V ˆV Ñ F. We know that this corresponds to linear maps
V ⊗ V → F, i.e. elements of (V ⊗ V)*. In the finite-dimensional case, this is just V* ⊗ V*. Of course, inner products are not just random bilinear forms; they satisfy a few other properties (symmetry and positive-definiteness). Thus not every element in the tensor product will correspond to an inner product.
Symmetric bilinear forms satisfy the additional condition that pv, wq “ pw, vq. Un-
winding everything, we recall the isomorphism U b V » V b U under u b v ÞÑ v b u. If
U = V, this is a map S : V ⊗ V → V ⊗ V that is not the identity. S² = id however, implying
that this map is an involution.
Consider the symmetric space, the subspace Sym2 V which can be either tτ P V b
V | Sτ “ τ u “ kerpI ´ Sq or Spanpu b v ` v b uq assuming char F ‰ 2. Note that these

subspaces are invariant under S as desired. You can also define Sym2 V “ Z{Z1 where
Z1 “ SpanpZ0 , u b v ´ v b uq, where Z, Z0 are as in the tensor product definition.
Given this background, we know that symmetric bilinear forms are elements of Hom(Sym²V, F). We'll be using this very soon to understand inner product spaces.

17.3 The Trace


Note that HompV, V q » V ˚ b V for V finite-dimensional. There is an obvious bilinear map
V* × V → F that takes (v*, v) ↦ v*(v), which induces the evaluation map V* ⊗ V → F.
Definition 17.3.1. The trace of a map T : V Ñ V is the map HompV, V q Ñ F induced by
the evaluation map V ˚ b V Ñ F.
We can evaluate this explicitly. If M(T) = (a_{ij}), we see that Tr(T) = Σ_i a_{ii}, implying that this is the normal trace. We also see for free that the trace is coordinate-independent, since we provided a coordinate-independent definition of it!
Further, note that TrpIq “ dim V , implying that we’ve got to keep to finite-dimensional
spaces for any of this to make sense. You can further check that TrpST q “ TrpT Sq (without
taking bases, please). Note that ST ´ T S “ I for some linear transformations on the space
of polynomials, further suggesting that the trace is not well-defined over infinite-dimensional
spaces.
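A quick numerical check of these facts (mine, with random matrices): Tr(ST) = Tr(TS), the trace is unchanged under change of basis, and Tr(I) = dim V.

import numpy as np

rng = np.random.default_rng(1)
S, T = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))

assert np.isclose(np.trace(S @ T), np.trace(T @ S))          # Tr(ST) = Tr(TS)

P = rng.standard_normal((4, 4))                               # a change of basis (invertible a.s.)
assert np.isclose(np.trace(np.linalg.inv(P) @ T @ P), np.trace(T))   # coordinate independence

assert np.trace(np.eye(4)) == 4                               # Tr(I) = dim V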

17.4 Inner Product Spaces


Inner products are a generalization of dot products. Recall that the dot product of two vectors (x_i), (y_i) is

    ⟨x, y⟩ = Σ_i x_i y_i.

When x = y, we know that

    ⟨x, x⟩ = Σ_i x_i² > 0

when working over R (for x ≠ 0). Further, ⟨x, x⟩ = 0 iff x = 0. We can define ⟨x, x⟩ = ||x||², where ||x|| is the length of x. This can be used to prove the Pythagorean theorem. We will call |x| = √⟨x, x⟩ the norm (not its square, which is sometimes used). Finally, we have ⟨x, y⟩ = ⟨y, x⟩.
Note that we also have a map V → V* such that x ↦ (y ↦ ⟨x, y⟩). When dealing with an inner product and not just an arbitrary bilinear form, this map is injective, since ⟨x, x⟩ > 0 for all x ≠ 0. Thus in the finite-dimensional case this is an isomorphism, implying that we get a canonical isomorphism V ≅ V* for finite-dimensional inner product spaces.
Define dpx1 , x2 q “ |x1 ´ x2 | as a distance function. We have dpx1 , x2 q ě 0 with equality
if x1 “ x2 . We also want the triangle inequality

dpx1 , x2 q ` dpx2 , x3 q ě dpx1 , x3 q

to hold. This is equivalent to |x ` x1 | ď |x| ` |x1 | (often also called the triangle inequality).
We have to prove this, and it is a consequence of bilinearity and positive-definiteness. We
will do that next time.

18 10/16/17
Today we are going to continue our discussion of inner product spaces.

18.1 Inner Products over Real Vector Spaces


Let V be a real vector space.
Definition 18.1.1. An inner product is a map, denoted by x, y from V ˆ V Ñ R that is:
• Bilinear: xav1 ` bv2 , wy “ axv1 , wy ` bxv2 , wy, xv, aw1 ` bw2 y “ axv, w1 y ` bxv, w2 y for
all v1 , v2 , w1 , w2 P V , a, b P R.

• Symmetric: xx, yy “ xy, xy for all x, y P V .

• Positive definite: xx, xy ą 0 for any nonzero vector x P V .


Example 18.1.2. The standard example is the "dot product" with V = Rⁿ, ⟨x, y⟩ = Σ_{i=1}^{n} x_i y_i.

Before now, it was not too crucial which fields we were working over. However, to define
an inner product, it is necessary that we consider vector spaces over R. This is because the
real numbers have the special property that x2 ě 0, with equality if and only if x “ 0.
Otherwise, the positive-definiteness axiom is difficult to ensure. For example, if we are
working over a finite field, then x2 ` y 2 ` z 2 may be equal to zero as we found out on the
last problem set.
An inner product space also gives rise to notions of norms and distance. These are given by

    |x| = √⟨x, x⟩

and

    d(x, y) = d(x − y, 0) = |x − y|.

The former is a definition, but the latter is just something we cooked up. To show this actually works, we need to check that the axioms of a distance function are satisfied.
In particular, the triangle inequality boils down to showing:
Lemma 18.1.3. For any vectors x, y P V , |x| ` |y| ě |x ` y|.
If we square both sides, since they are positive the inequality is equivalent to showing

xx, xy ` 2|x||y| ` xy, yy ě xx ` y, x ` yy


“ xx, xy ` xx, yy ` xy, xy ` xy, yy.

By symmetry, xy, xy “ xx, yy.


Simplifying, this is equivalent to |x||y| ě xx, yy. This will follow from:
Lemma 18.1.4 (Cauchy-Schwarz inequality).

|x||y| ě |xx, yy|.

There are two different types of proofs.
Proof 1. This is the proof given in the book. It is more geometric.
Let us take two vectors x, y. If x = 0, there is nothing to check.
On the other hand, if x ≠ 0, project y onto the axis perpendicular to x to get ỹ, so that y = ỹ + Cx for some constant C and ⟨x, ỹ⟩ = 0. Then ⟨x, y⟩ = C|x|², while |y|² = |ỹ|² + C²|x|² ≥ C²|x|², so |x||y| ≥ |C||x|² = |⟨x, y⟩|, and the inequality holds.
Proof 2. This proof is more algebraic, but if you translate it to pictures the underlying
geometric idea is the same as Axler.
For every a, b P R we have that xax ` by, ax ` byy ě 0. The left hand side expands to
a2 |x|2 ` 2abxx, yy ` b2 xy, yy.
Then it is true by the standard theory of quadratic forms that constants A, B, C satisfy Aa² + 2Bab + Cb² ≥ 0 for all a, b if and only if A ≥ 0, C ≥ 0, and AC − B² ≥ 0.
In our situation, we set A “ xx, xy, B “ xx, yy, C “ xy, yy. Then we get xx, xy ě 0,
xx, xyxy, yy ě pxx, yyq2 . This last statement is equivalent to the Cauchy-Schwarz inequality.
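A small numerical sanity check (mine) of Cauchy-Schwarz and the resulting triangle inequality for random vectors, together with the equality case for linearly dependent vectors:

import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12          # Cauchy-Schwarz
    assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y) + 1e-12  # triangle

x = rng.standard_normal(5)
y = -3.0 * x                                         # linearly dependent with x
assert np.isclose(abs(x @ y), np.linalg.norm(x) * np.linalg.norm(y))   # equality case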

Example 18.1.5. There is another natural inner product for continuous functions on the
interval rx0 , x1 s. Although we don’t have a notion of integration yet, we can take it in the
usual high-school calculus sense and set
xf, gy “ ş f pxqgpxq dx, the integral taken over rx0 , x1 s.
We need to show the axioms. For positive-definiteness, xf, f y “ ş f pxq2 dx over rx0 , x1 s. The inte-
grand is non-negative, so the integral is clearly non-negative. Then, it remains to argue that
it is zero if and only if f “ 0, which we can figure out using notions of continuity.
In the case of polynomials, we can also define an inner product on the real line by
xf, gy “ ş f pxqgpxqe´x dx over r0, 8q. Since e´x decreases faster than any polynomial increases, this
integral is always well-defined.

Something that was forgotten earlier was to discuss when equality holds. For the Cauchy-
Schwarz inequality, we have the equality xax ` by, ax ` byy “ 0 if and only if ax ` by “ 0.
We have a name for this: x and y are linearly dependent!

18.2 Inner Products over Complex Vector Spaces


We would also like to have a notion of inner product for complex vector spaces.
If z is a complex number, we don’t have z 2 ě 0. In fact, z 2 is complex in general so this
doesn’t make sense! However, if we take z “ x ` iy, z “ x ´ iy to be its complex conjugate,
then zz “ |z|2 ě 0. In terms of the expansions of z and z, we find zz “ px ` iyqpx ´ iyq “
x2 ` y 2 ě 0.

Definition 18.2.1. For a complex vector space V , an inner product is a function x, y :


V ˆ V Ñ C that is:

• “Sesquilinear” or “conjugate-linear”: xav1 ` bv2 , wy “ axv1 , wy ` bxv2 , wy, xv, aw1 `
bw2 y “ axv, w1 y ` bxv, w2 y for all v1 , v2 , w1 , w2 P V , a, b P C, where the coefficients a, b are
complex-conjugated when pulled out of the second slot.

• Conjugate-symmetric or Hermitian: xx, yy “ xy, xy, where the right-hand side carries a complex conjugate.

• Positive-definiteness: xx, xy ą 0 for all nonzero vectors x P V .

Note that the positive-definiteness is well-defined, since xx, xy is equal to its own conjugate
by conjugate-symmetry and is therefore real.
In our proof of the triangle inequality, we now need to show |x||y| ě Rexx, yy instead.
However, the Cauchy-Schwarz inequality |x||y| ě |xx, yy| still holds and |xx, yy| ě Rexx, yy.

18.3 Equivalence of Norms


For the remainder of class, we have just a bit more to say about norms. For any real (or
complex) vector space, we can choose a variety of inner products.
In general, if we have a fixed basis, then we can write xx, yy “ řnj“1 cj xj yj for reals cj ą 0.
Each inner product gives rise to a different notion of distance. If we look at the unit
disk, i.e. the set of all vectors of norm less than 1, then changing the constants cj stretches
it out in different directions along the different axes, making it into an ellipsoid.
However, all norms are actually within a constant of each other. That is, if | ¨ |1 , | ¨ |2 are
two norms, then there are constants C, D such that C|x|1 ď |x|2 ď D|x|1 independent of
the choice of vector x.

19 10/18/17
PS 6 due now, PS 7 out today. Last day to sign up for the Putnam exam.
No class on Friday, November 3.
Final exam due end of Sunday, December 10.

19.1 Inner Products over the Complex Numbers


We have some bilinear pairing x , y : V ˆ V Ñ F. This pairing is symmetric and positive def-
inite (if F “ R), and non-degenerate. Non-degenerate means that the induced linear map
V Ñ V ˚ is injective/surjective/an isomorphism, for V finite-dimensional when these condi-
tions are equivalent. We will avoid using the word non-degenerate for infinite-dimensional
vector spaces.
Last time, we extended this to work for the complex numbers. We require that our
sesquilinear (respects multiplication via the conjugate on the second element) pairing be
conjugate symmetric, i.e.
xv, wy “ xw, vy
and positive definite here. Non-degeneracy doesn’t make quite as much sense since we only
get a semi-linear map V Ñ V ˚ that respects multiplication by the conjugate.
We can generalize this to an arbitrary field F by considering a field automorphism σ :
F Ñ F (i.e. any map between fields that respects addition and multiplication). Pick an
automorphism with order 2, i.e. an automorphism σ such that σ 2 “ id. Then we can require

xv, wy “ σpxw, vyq.

The trivial example here is σ “ id and there could be nontrivial maps, as in the case of
the complex numbers. Such maps are called involutions. In the field F “ Qpiq you can
consider the map a ` bi ÞÑ a ´ bi, and this similarly works for F “ Qp?dq for d not a square,
where a ` b?d ÞÑ a ´ b?d. We can now interpret non-degeneracy for inner products over
the complex numbers as: if xv, wy “ 0 for all w, then v “ 0.
Recall our standard example of an inner product on the reals to be
xa, by “ ř ai bi .
Over the complexes, we have
xa, by “ ř aj σpbj q.
This still satisfies conjugate symmetry and nondegeneracy.

19.2 Orthogonality
Definition 19.2.1. If u, v P V with an inner product (either a conjugate-symmetric sesquilin-
ear pairing or a symmetric bilinear pairing), we say that u is orthogonal to v or u K v iff
xu, vy “ 0.

We know that xv, uy “ 0 iff xu, vy “ 0 so v K u iff u K v.

Lemma 19.2.2 (Pythagorean Theorem). If u K v, then xu ` v, u ` vy “ xu, uy ` xv, vy .

Proof. Just expand.


This isn’t quite the Pythagorean theorem since Euclidean geometry has a very different
axiomatic system, but it is similar. The converse holds in the case that you have a symmetric
bilinear pairing and char F ‰ 2; in the conjugate-symmetric case this does not hold.
Thus if you have some ui , vj such that for all i and j ui K vj , then u K v for all
u P Spanpui q, v P Spanpvj q.

Definition 19.2.3. U K U 1 for two subspaces U, U 1 if u K u1 for all u P U, u1 P U 1 .

As we just established, you can just check this on the spanning sets.
Note that in the sum U ` U 1 of orthogonal subspaces, we have

xu1 ` u11 , u2 ` u12 y “ xu1 , u2 y ` xu11 , u12 y

by bilinearity. Thus given an inner product, two orthogonal subspaces must meet only at 0,
so we actually have a direct sum U ‘ U 1 . Such a direct sum is called an orthogonal direct sum,
sometimes denoted U ` U 1 .

19.3 Orthogonal Complements


The next natural step is to think about orthogonal complements.

Theorem 19.3.1 (Axler 6.29+, Gram-Schmidt). Let V have a bilinear or sesquilinear pairing
and let U Ă V be a finite-dimensional subspace. If our pairing is non-degenerate on U (e.g. it’s an
inner product), then there exists a unique W Ă V such that V “ U ` W (that is, U K W and V “ U ‘ W ).

Proof. Begin by considering some simple cases. If U is 0-dimensional this is trivial. If U is


1-dimensional, then U “ Fu. Then we want v “ w ` au and 0 “ xw, uy “ xv, uy ´ a xu, uy,
which can be fulfilled only if xu, uy ‰ 0. In that case, we take v ÞÑ v ´ pxv, uy { xu, uyq u, which is clearly
a linear map. This takes any element of V to the unique vector w of the form v ´ au that
is orthogonal to u. W is then the image of this map.
Suppose we have a general subspace U of arbitrary dimension. Pick u1 such that
xu1 , u1 y ‰ 0 and U “ Fu1 ` U1 . Pick u2 P U1 such that xu2 , u2 y ‰ 0 and U1 “ Fu2 ` U2 .
We can now proceed until we have an orthogonal basis for U . We then have a corresponding
V “ Fu1 ` Fu2 ` . . . ` Fun ` W replicating the construction from above.
This will work assuming that we can always find a vector such that xu, uy ‰ 0. This
works if we have a non-degenerate pairing. Once we have done that, we will have proven that
every such space has an orthogonal basis and constructed it via something equivalent to the
Gram-Schmidt procedure.
In particular, if you have a finite-dimensional inner product space, you can find an or-
thogonal basis. This is roughly Axler Theorem 6.24. You can require the basis to be an
orthonormal basis by dividing each basis vector u by ?xu, uy.
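A minimal sketch of this Gram–Schmidt procedure for the standard dot product on Rn , assuming Python with numpy; it only illustrates the inductive construction for an honest inner product, not the general non-degenerate pairing:

import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span of the given vectors (standard dot product)."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        # subtract the components along the vectors already chosen
        for u in basis:
            w -= np.dot(w, u) * u
        if np.linalg.norm(w) > 1e-12:            # skip dependent vectors
            basis.append(w / np.linalg.norm(w))  # normalize: divide by sqrt(<w, w>)
    return np.array(basis)

B = gram_schmidt(np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]))
print(np.round(B @ B.T, 10))   # should print the identity matrix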

20 10/20/17
This weekend: Head of the Charles. Don’t leave the Yard without your HUID.
Nov 3: no class.
Nov 10: class despite Veteran’s Day.

20.1 Orthogonal Complements, Continued


Last time, we had a vector space V with either a bilinear or sesquilinear pairing to F. We have
some finite-dimensional subspace U Ă V on which the pairing is nondegenerate. We claimed
that in that case there exists a unique orthogonal complement W such that V “ U ` W
(pu, wq “ 0 for u P U, w P W and V “ U ‘ W ). We outlined a proof by induction last time,
but the nice way of seeing is the following:
Clean Proof of Theorem 19.3.1. Fix some v P V . Let v “ u ` pv ´ uq. We want xv ´ u, u1 y “
xv, u1 y ´ xu, u1 y “ 0. Thus we want xv, u1 y “ xu, u1 y. Note that u1 ÞÑ xu1 , vy is a linear
functional U Ñ F. Non-degeneracy is the condition that every linear functional can be
represented by an inner product with some vector in U (surjectivity in the finite-dimensional
case), so there exists a unique u such that xv, u1 y “ xu, u1 y for all u1 . Pick that u.
Now, a direct sum is just equivalent to projections πi : V Ñ Ui that send πi př ui q ÞÑ ui .
We already have π1 : V Ñ U that maps v ÞÑ u. We need to verify that it’s linear and the
identity on U . The second is trivial; the first is also relatively easy since

xπ1 pvq ` π1 pv 1 q, uy “ xπ1 pvq, uy ` xπ1 pv 1 q, uy “ xv, uy ` xv 1 , uy

and then uniqueness gives us the desired. We can now just let W “ π2 pV q, π2 “ id ´π1 .
There is a nice geometric description of this in terms of inner products. Given an inner
product and any orthogonal decomposition V “ U1 ` U2 , then v “ u1 ` u2 implies that u1
is the closest vector in U1 to v, since
|u1 ´ v| ď |w ´ v|
for all w P U1 . The map v ÞÑ u1 is often called the orthogonal projection. Note that w ´ v “ pu1 ´ vq `
pw ´ u1 q, so by the Pythagorean theorem |w ´ v|2 “ |u1 ´ v|2 ` |w ´ u1 |2 .
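A small numerical illustration of this closest-vector property, assuming numpy and the standard inner product on R3 : project v onto the span of two orthonormal vectors and compare distances.

import numpy as np

rng = np.random.default_rng(1)
# U1 = span of two orthonormal vectors in R^3
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
v = rng.standard_normal(3)

proj = np.dot(v, u1) * u1 + np.dot(v, u2) * u2   # orthogonal projection of v onto U1
for _ in range(1000):
    w = rng.standard_normal(2) @ np.array([u1, u2])  # a random vector of U1
    # no vector of U1 is closer to v than the projection
    assert np.linalg.norm(v - proj) <= np.linalg.norm(v - w) + 1e-12
print("projection:", proj)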

20.2 Orthogonal and Orthonormal Bases


We now want to identify any inner product space with the standard one,
xx, yy “ ři xi yi .

The idea is to generate an orthonormal basis.


To do this, pick u1 P U such that xu1 , u1 y ‰ 0 (assuming U ‰ 0). Then write U “
W1 ` Fu1 . Then pick u2 P W1 and continue inducting downwards. We get

U “ Fu1 ` Fu2 ` . . . ` Fun .

We can now write
x “ řj xj uj , y “ řj yj uj , xx, yy “ řj řj 1 xj yj 1 xuj , uj 1 y,
and since our basis is orthogonal, we get xuj , uj 1 y “ 0 for j ‰ j 1 . Therefore
xx, yy “ řj xj yj xuj , uj y “ řj xj yj cj .

How do we know cj is nonzero? Note that the polarization identity says that
4 xu, vy “ xu ` v, u ` vy ´ xu ´ v, u ´ vy .
If char F ‰ 2 and xu, uy “ 0 for all u, then xu, vy “ 0 for all u, v. Thus nondegeneracy implies
that the cj are nonzero (and for an inner product, positive-definiteness makes them positive).
This basis is called an orthogonal basis. We can now scale to get cj “ 1 by dividing
by ?xuj , uj y to get an orthonormal basis.
What if we have a bilinear, symmetric, nondegenerate pairing that is not positive definite?
We can then make the cj either ˘1 by scaling. It turns out that:

Theorem 20.2.1 (Sylvester’s Law of Inertia). The number of positive and negative cs is an
invariant of the pairing, called the signature.

This is much less obvious over the rationals! Note that 5px2 ` y 2 q “ px ` 2yq2 ` p2x ´ yq2 ,
which is equivalent to x2 ` y 2 , but 7px2 ` y 2 q is not equivalent to x2 ` y 2 over Q.
Proof. If you could find two different signatures, then you have written U “ U1 `U2 where U1
is positive definite and U2 is negative definite. Given two different such decompositions where
the dimensions don’t match, we must have dim U1 ą dim U11 , dim U21 ą dim U2 . Consider
U1 XU21 which must be nontrivial since dim U1 `dim U21 ą dim U . Our form is simultaneously
positive definite and negative definite on this space, contradiction.
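A hedged sketch of how one might compute the signature in practice, assuming numpy: diagonalize a real symmetric Gram matrix of the pairing in some basis and count positive and negative eigenvalues. This relies on the spectral theorem (below) rather than the inductive argument above.

import numpy as np

def signature(gram):
    """Count (positive, negative) eigenvalues of a real symmetric Gram matrix."""
    eigenvalues = np.linalg.eigvalsh(gram)
    pos = int(np.sum(eigenvalues > 1e-12))
    neg = int(np.sum(eigenvalues < -1e-12))
    return pos, neg

# the pairing <x, y> = x1*y1 + x2*y2 - x3*y3 has signature (2, 1)
G = np.diag([1.0, 1.0, -1.0])
print(signature(G))            # (2, 1)
# changing basis does not change the signature (Sylvester's law)
S = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])   # an invertible change of basis
print(signature(S.T @ G @ S))  # still (2, 1)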

21 10/23/17
Today: adjoints and orthogonal eigenbases. Same stuff as usual.

21.1 Adjoints
Duality tells us that given a map T : U Ñ V , there is a dual map T ˚ : V ˚ Ñ U ˚ . This
is defined by precomposition T ˚ pv ˚ q “ pu ÞÑ v ˚ pT uqq. If you have an inner product, this
induces an isomorphism U Ñ U ˚ , V Ñ V ˚ , which is u ÞÑ xu, ´y , v ÞÑ xv, ´y (assuming U
and V are finite-dimensional). This induces a map:

Definition 21.1.1. The adjoint T ˚ (via abuse of notation to torment poor students) is the
induced map T ˚ : V Ñ U by the above isomorphisms.

If you unwind the definition, you find that the adjoint is characterized by

xT u, vy “ xu, T ˚ vy .

(Exercise: verify that the above is true). We now see that pT ˚ q˚ “ T (adjoints, not duals
here) and pST q˚ “ T ˚ S ˚ .
If you choose bases to represent T and the dual T ˚ by matrices, T ˚ is the transpose of T .
This is also true for the adjoint, provided that you use a canonical “self-dual” basis. What
does it mean for a basis to be self-dual? Well, each element of the dual basis of pu1 , . . . , um q sends one basis
vector to 1 and the rest to 0. Thus we want the inner products of the basis vectors to be the Kronecker
delta, i.e. we need an orthonormal basis.
Over the complex numbers, the adjoint T ˚ is actually the conjugate transpose, since we
have to take a complex conjugate. This is also known as the Hermitian transpose T H .
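A quick numerical check of the characterization xT u, vy “ xu, T ˚ vy, assuming numpy and the inner product xa, by “ ř aj σpbj q with conjugation in the second slot as in these notes; in coordinates the adjoint is then the conjugate transpose.

import numpy as np

def inner(a, b):
    # <a, b> = sum a_j * conj(b_j), conjugate-linear in the second slot as in these notes
    return np.dot(a, np.conj(b))

rng = np.random.default_rng(2)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T_adj = T.conj().T                      # conjugate (Hermitian) transpose
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
# <Tu, v> should equal <u, T* v>
print(np.allclose(inner(T @ u, v), inner(u, T_adj @ v)))   # True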

21.2 Self-Adjoint Transformations and Orthonormal Eigenbases


Suppose we have T : V Ñ V . There are two kinds of bases we’ve pointed out already:
orthonormal bases and eigenbases. Now can we ever expect to have these at the same time?
Warning: you do NOT always have an eigenbasis without further hypotheses on the linear
transformation.
There is a natural condition on T that is necessary and sufficient for the existence of
an orthonormal eigenbasis. Suppose there exists an orthonormal eigenbasis. Then MpT q
is diagonal with respect to this basis. Since our basis is its own dual basis, MpT ˚ q is just
the transpose of this, which is just MpT q. Thus T ˚ “ T . This is known as a self-adjoint
transformation.
Over the complex numbers, the matrix MpT H q in this basis is the conjugate of MpT q, so the same
argument only gives T H “ T when the eigenvalues happen to be real. However,
it is always true that T H commutes with T , i.e. T H T “ T T H . This is known as a normal
operator.
Now let’s consider the other direction.

Theorem 21.2.1 (Spectral Theorem). Suppose V is a finite-dimensional inner product space over R or C
and T : V Ñ V is self-adjoint, i.e.

xu, T vy “ xT u, vy .

Then there exists an orthonormal eigenbasis and all eigenvalues are real.

Proof. Note that xu, T uy “ xT u, uy by the given statement, and by conjugate symmetry xT u, uy is
the complex conjugate of xu, T uy. Thus xu, T uy equals its own conjugate and is always real. In particular,
if u is an eigenvector, then xu, T uy “ xu, λuy “ λ xu, uy and xu, uy is always a positive real,
so λ must be real. Note that the converse of this does not hold.
Next, suppose T u “ λu, T v “ µv with µ ‰ λ. Then µ xu, vy “ xu, T vy “ xT u, vy “
λ xu, vy, implying that xu, vy “ 0. Thus u and v are orthogonal. Thus Vµ K Vλ .
We now need to show that there exists a basis of eigenvectors. If F “ C, we can find
a nonzero eigenvector T u1 “ λ1 u1 , u1 ‰ 0. Now consider pCu1 qK and note that this is an
invariant subspace. We can show that this holds for any field. Specifically, T pFu1 qK Ă pFu1 qK
since v P pFu1 qK implies that xv, u1 y “ 0 so xT v, u1 y “ xv, T u1 y “ λ1 xv, u1 y “ 0, so
T v P pFu1 qK as well. Now we can just induct downwards, noting that pFu1 qK must contain
an eigenvector u2 , and keep going. Of course, thus far this only works over the complex
numbers C.
What about over the reals? Tensor up to the complex numbers (or extend coefficients)
to get a complex vector space. Now apply the above argument, noting that our basis hasn’t
changed and our eigenvalues are real. Thus the above argument still holds.
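Numerically, the real spectral theorem is what numpy.linalg.eigh computes; a minimal sketch, assuming numpy: for a symmetric matrix it returns real eigenvalues and the columns of an orthonormal eigenbasis.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2                       # a random symmetric (self-adjoint) matrix

eigenvalues, Q = np.linalg.eigh(T)      # columns of Q form an orthonormal eigenbasis
print(eigenvalues)                                       # all real
print(np.allclose(Q.T @ Q, np.eye(4)))                   # orthonormal: True
print(np.allclose(T, Q @ np.diag(eigenvalues) @ Q.T))    # T diagonalized: True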

22 10/25/17
Talk by Noga Alon about random Cayley graphs (Center for Mathematical Sciences and
Applications) today at 3.

22.1 Moore Graphs


Spectral graph theory is concerned with the relationship between linear algebra and graph
theory, where we refer to graphs in combinatorics. A graph is a finite collection of vertices
and a choice of two-element subsets called edges that tell you which pairs are going to be
called adjacent. There are many standard reasons for people to visualize graphs.
We’ll talk about undirected graphs with a lot of regularity. The examples we will look
at are the regular pentagon and the Petersen graph (see Figure 22.1.1). The Petersen graph has degree 3 and
all vertices and edges are equivalent. There are no short cycles (of length less than 5). Further, any
two non-adjacent vertices have a unique common neighbor. This makes it a Moore graph
of degree d, girth 5, and diameter 2. The girth is the length of the shortest cycle and the
diameter is the largest distance between any two vertices. The pentagon is the Moore graph
of degree 2 and the Petersen graph is the Moore graph of degree 3.

Figure 22.1.1: The Petersen Graph

It’s an amazing theorem that the only possible degrees that a graph of girth 5 and
diameter 2 can have are 2, 3 (constructed above), 7, and 57. The more amazing part is that
this comes from linear algebra.

22.2 Introducing Linear Algebra to Graph Theory


To do this, define the adjacency matrix of a graph. It is a square matrix with 1 if the two
vertices are adjacent and 0 otherwise. For the pentagon, we get
A “
0 1 0 0 1
1 0 1 0 0
0 1 0 1 0
0 0 1 0 1
1 0 0 1 0

A label-free formulation is to consider TG : RV Ñ RV , where RV has basis ev for v P V . Then
TG pev q “ řv1 adjacent to v ev1 .
We can now compute
TG2 ev “ TG TG ev “ řvÑv1 Ñv2 ev2 .

In our case, we find that
A2G “
2 0 1 1 0
0 2 0 1 1
1 0 2 0 1
1 1 0 2 0
0 1 1 0 2
for the pentagon. Now that we’re interested in iterations, this suggests that eigenvectors
and eigenvalues may be useful. If your graph has constant degree d, then
TG přv ev q “ řv řvÑv1 ev1 “ řv1 pdeg v 1 qev1 “ d řv ev ,
implying that řv ev is an eigenvector with eigenvalue d.
Further, our graphs are undirected, so AG “ ATG . This corresponds to a self-adjoint
operator over the reals, which we noted means the eigenvalues are real and you have an
orthogonal decomposition into eigenvectors. Notice that, ignoring the diagonal, AG has
ones everywhere A2G has zeroes and vice versa. Thus AG ` A2G “ J ` I where J is the
all-ones matrix. Given J, let 1 be the all-ones vector. This is automatically an eigenvector
of J with eigenvalue n, and the orthogonal complement 1K is the kernel of J. Thus TG |1K satisfies
TG2 ` TG “ I. This is a polynomial equation with distinct roots λ1,2 “ p´1 ˘ ?5q{2, so you can
decompose 1K “ Vλ1 ` Vλ2 .

22.3 Characterizing Moore Graphs


The assumption that you have a Moore graph is going to give you similar behavior: it satisfies
some kind of quadratic equation with distinct roots on the codimension-1 space 1K . So
how do we see this? Note first that there is 1 vertex, d vertices 1 away, and pd ´ 1qd vertices
2 away, yielding n “ 1 ` d ` pd ´ 1qd “ d2 ` 1 vertices total. You can check that you’re not overcounting by
the girth and not undercounting by the diameter.
A2G takes each basis vector to the sum of all vectors in the second neighborhood plus d
times itself, so
A2G ` AG “ J ` pd ´ 1qI.
On the pn ´ 1q “ d2 -dimensional space Rn0 “ 1K , we get
A2 ` A “ pd ´ 1qI,
which implies that λ1,2 “ p´1 ˘ ?4d ´ 3q{2. Since 4d ´ 3 ą 0, these are real and distinct. So
Rn0 “ Vλ1 ` Vλ2 .

What are the dimensions of these eigenspaces? Let d1 “ dim Vλ1 , d2 “ dim Vλ2 . To
compute this, note that Tr TG “ 0. Pick a basis corresponding to the decomposition Spanp1q `
Vλ1 ` Vλ2 . Then the trace is

Tr TG “ d ` d1 λ1 ` d2 λ2 “ 0.

This is a linear equation satisfied by d1 , d2 . We also know that

d1 ` d2 “ d2

since dim Vλ1 ` dim Vλ2 “ dim 1K . Thus we can solve this for d1 , d2 . In the pentagon, we get
d1 “ d2 “ 2; in the Petersen graph, we get d1 “ 5, d2 “ 4.
What about other d? You can find that d1 ´ d2 “ pd2 ´ 2dq{?4d ´ 3, but this needs to be an integer!
Thus either d “ 2 (so the numerator vanishes) or 4d ´ 3 must be a perfect square, i.e. 4d ´ 3 “ p2m ` 1q2 ,
which gives d “ m2 ` m ` 1. Thus d “ 3 corresponds to m “ 1 and d “ 7 corresponds to m “ 2, but m “ 4 does not give
integer dimensions. m “ 7 is the next working candidate, with d “ 57. We can show this by
dividing d1 ´ d2 out by polynomial division to get P pmq ` 15{p2m ` 1q, so 2m ` 1 | 15, which implies
the desired.
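To see the eigenvalue computation concretely, here is a sketch, assuming numpy, that builds the Petersen graph’s adjacency matrix and checks that its spectrum is 3 once, λ1 “ 1 with multiplicity 5, and λ2 “ ´2 with multiplicity 4, matching d1 “ 5, d2 “ 4 above.

import itertools
import numpy as np

# Petersen graph as the Kneser graph K(5,2): vertices are 2-element subsets of {0,...,4},
# adjacent iff the subsets are disjoint.
vertices = list(itertools.combinations(range(5), 2))
n = len(vertices)                      # 10 vertices
A = np.zeros((n, n))
for i, s in enumerate(vertices):
    for j, t in enumerate(vertices):
        if not set(s) & set(t):
            A[i, j] = 1

eigenvalues = np.round(np.linalg.eigvalsh(A), 6)
values, counts = np.unique(eigenvalues, return_counts=True)
print(dict(zip(values, counts)))       # {-2.0: 4, 1.0: 5, 3.0: 1}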

23 10/27/17
Today: matrices (eww), Schur and normal operators, quadratic forms.

23.1 Matrix Groups


Suppose we have an isomorphism V » Fn . We can compose this with an automorphism
Fn » Fn to get all other isomorphisms, so once we pick an isomorphism V » Fn (a basis), we
have all other isomorphism. The set of automorphisms is a group (it satisfies associativity,
identity, and inverses) and is equivalent to choosing all possible bases.
Definition 23.1.1. The general linear group GLn pFq is the set of invertible nˆn matrices
with entries in field F, or automorphisms Fn Ñ Fn .
If you don’t choose a basis, this is denoted GLpV q as the set of automorphisms of V as
an F-vector space.
In Chapter 3, we looked at linear transformations V Ñ W and found that (if V, W are
finite-dimensional) there’s basically only one choice of linear transformation T : V Ñ W for
every rank. Any two transformations of the same rank are equivalent, i.e. you can find automorphisms of
V and W such that any other matrix of the same rank can be written as GLpW q ¨ T ¨ GLpV q.
Commutative diagram: S : V Ñ W on the top row, T : V Ñ W on the bottom row, with
vertical maps gV and gW .
In Chapter 5, we looked at linear transformations T : V Ñ V ; here you need to use the same
automorphism on both sides, or in matrix notation the automorphism and its inverse. Thus we want
gV T gV´1 .
Consider the completely general setting f : X Ñ Y . Both X and Y have various
automorphisms. f is equivalent to f 1 under automorphisms if the diagram commutes:
f : X Ñ Y on the top row, f 1 : X Ñ Y on the bottom row, with vertical maps gX and gY ,
or in equations that f 1 “ gY f gX´1 . If we let X “ Y , we then get f 1 “ gX f gX´1 , which gives
us the equation we had before in the case of vector spaces. This construction is known as
the conjugate of f by gX (or gX´1 ). We sometimes denote f g “ g ´1 f g, which makes some
sense since pf g qh “ f gh .
If you have an inner product space, an automorphism of an inner product space is more
constrained since it has to preserve the inner product. In particular, we require
xAv, Avy “ xv, vy
for all v if A P AutpV q (exercise: this implies xAv, Awy “ xv, wy). We know that
xAv, Awy “ xAT Av, wy “ xv, wy

which implies AT A “ id over the reals and AH A “ id over the complex numbers.
Definition 23.1.2. The set of matrices A such that AT A “ id or equivalently xAv, Awy “
xv, wy is called the set of orthogonal matrices OpV q or On .
Notice that in this case, the columns of A forming an orthonormal basis is equivalent to the
rows of A forming an orthonormal basis. This is a useful punchline for mathematical proofs.
Definition 23.1.3. The set of matrices A such that AH A “ id or equivalently xAv, Awy “
xv, wy is called the set of unitary matrices UpV q or Un .

23.2 Spectral Theorem with Matrices


What does it mean to find an eigenbasis? We want to write our linear transformation as
T “ A´1 DA where D is diagonal. This is the condition for T to be diagonalizable, which
may or may not be possible.
If we are working in an inner-product space, then A has to be an orthogonal or unitary
matrix as discussed above so this map will also be consistent with the inner product structure.
Most linear transformations are diagonalizable if you are over an inner product space. In
the real case, this corresponds to T “ A´1 DA “ AT DA so T T “ AT DT A “ AT DA “ T
so T has to be its own transpose (i.e. T is symmetric). We’ve derived this condition before
and know that it is also sufficient. If our matrix is complex, then this implies we have real
eigenvalues.
In the complex case, we had a condition for our matrix to be diagonalizable, i.e. that it
was a normal operator and T T ˚ “ T ˚ T . Then T ˚ “ A´1 D˚ A which gives us the desired.
How do we show that this is sufficient?
Theorem 23.2.1 (Schur). For any linear operator T on a finite-dimensional complex vector space, there exists a
basis B “ pv1 , . . . , vn q such that the matrix with respect to this basis is upper triangular.
Given an inner product space, we can make the vj orthonormal using Gram-Schmidt.
A normal operator has the property that |T ˚ v| “ |T v|. You can see this by noting that

|T ˚ v|2 “ xT ˚ v, T ˚ vy “ xv, T T ˚ vy “ xv, T ˚ T vy “ xT v, T vy “ |T v|2

as desired. Now apply Schur’s theorem with an orthonormal basis, and we can induct down to show that
the off-diagonal terms are zero.
How do you prove the spectral theorem without invoking the fundamental theorem of
algebra? Consider xx, T xy where T “ T ˚ . In bases,
xx, T xy “ ři řj xi Tij xj .
It is a homogeneous polynomial in the coordinates and thus an element of Sym2 V ˚ . If you
choose coordinates such that you have eigenvectors, then this is ři λi x2i . How big can this
get in terms of the size of x? Picking the largest and smallest λ, we see that
λmin |x|2 ď ři λi x2i ď λmax |x|2 .

Equality holds iff you are in the Vλmax eigenspace (and likewise for the lower bound). This
gives us an idea of how to maximize or minimize the quantity |T x|{|x|, and once we do compactness
and continuity next semester, we’ll see that this can be completed to an independent proof
of the spectral theorem without the fundamental theorem of algebra.

24 10/30/17
Section/Math Night as usual.
Tomorrow: Math Table talk on the Monster groups by Brian Warner (55 alum from last
year).
No class on Friday.

24.1 Introduction to Determinants


Traditionally, the determinant is the condition on v1 , . . . , vn P Fn to guarantee that they are
linearly independent, or that the associated matrix pv1 ¨ ¨ ¨ vn q is invertible. It can also be
written in terms of systems of linear equations having a unique solution, i.e.

a11 x1 ` a12 x2 ` ¨ ¨ ¨ ` a1n xn “ b1
...
an1 x1 ` an2 x2 ` ¨ ¨ ¨ ` ann xn “ bn .
For n “ 1, we require a11 ‰ 0. For n “ 2, we require a11 a22 ‰ a21 a12 . You can find that
the solutions for both x1 and x2 have denominator a11 a22 ´ a21 a12 . For n “ 3, we get a more
complicated condition

a11 a22 a33 ` a12 a23 a31 ` a13 a32 a21 ´ a11 a23 a32 ´ a12 a21 a33 ´ a13 a22 a31 .

Cramer’s rule tells you the relationship between the numerator and denominator of the
solutions.
We could write further cases out explicitly, but you could guess that the general pattern
is something like the following. Thus we can define:

Definition 24.1.1. v1 , . . . , vn form a basis if detpv1 , . . . , vn q “ detpAq ‰ 0. This determi-


nant is defined as
det A “ řσPSn εpσq a1,σp1q a2,σp2q . . . an,σpnq “ řσPSn εpσq śni“1 ai,σpiq .

Here, ε : Sn Ñ t1, ´1u is a nontrivial group homomorphism (there’s only one possibility)
and Sn is the symmetric group, or the group of permutations of n letters. We can now
try to develop some properties of  and det, which pins down the determinant uniquely and
(hopefully) makes it reasonably easy to verify that det A ‰ 0 iff vi form a basis.
Highlights:

1. ε is a group homomorphism, i.e. for all σ, τ P Sn we have
εpστ q “ εpσqεpτ q,
εpidq “ 1, and εpσ ´1 q “ εpσq´1 “ εpσq. The kernel of this homomorphism is An , the
alternating group.

2. If σ is a simple transposition, i.e. σpiq “ i`1, σpi`1q “ i and σ sends everything else
to itself, then εpσq “ ´1. Sn is generated by simple transpositions, so this determines ε
(and An is a proper subgroup for n ą 1). We do need to prove that this is a consistent
definition of ε; we’ll do this later.

3. The determinant is a function pFn qn “ V n Ñ F. It is multilinear and alternating


(if vi “ vj then det “ 0). The multilinear part induces a map pFn qbn Ñ F. The
alternating condition implies that the determinant is a map out of a smaller vector
space called the exterior product Λn V , which we’ll come back and construct later
when we do exterior algebra.

4. Applications: define the characteristic polynomial of T : V Ñ V . This is defined


to be pT pλq “ detpλI ´ T q. The Cayley-Hamilton theorem tells you that pT pT q “ 0.
We will also give a determinant description of what it means for a quadratic form to
be positive-definite and a description of the Sylvester signature.

We’re now going to prove the results above, starting with the construction of ε. Our
ideal definition of the determinant will be based on the exterior power. We’ll show that
dim Źk pV q is the binomial coefficient pdim V choose kq, implying that Źn pV q for n “ dim V is one-dimensional. We can then
construct Źn pT q, the wedge product of a linear transformation, which is a map from a one-
dimensional space to itself and thus a scalar known as det T . We’ll then show that that
definition is consistent with the classical formula but also totally basis-independent.
Let’s start by defining ε. For σ P Sn , let
εpσq “ p´1qN

where N is the number of inversions, or pairs pj, kq such that 1 ď j ă k ď n but σpjq ą
σpkq. We can do some messy combinatorics now to convince ourselves that this is equivalent
to the definition we provided above on the generating set.

Proposition 24.1.2. εpσ ˝ pi, i ` 1qq “ ´εpσq.

Proof. Count inversions. Notice that no additional inversions could have been generated
aside from the ones involving i and i ` 1, and i, i ` 1 has an additional inversion. Hence the
negative sign.
This is similar to the bubble sort algorithm from computer science.
How do we show that this is a group homomorphism? Just write each permutation as a
product of transpositions and work from there. This reduces ε to addition of inversions mod
2, which is simple enough.
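Here is a small sketch, assuming Python, of the inversion-count definition of ε and the resulting n!-term determinant formula; it is only meant to make the definitions concrete, not to be an efficient algorithm.

from itertools import permutations

def sign(sigma):
    """epsilon(sigma) = (-1)^N where N is the number of inversions."""
    inversions = sum(1 for j in range(len(sigma)) for k in range(j + 1, len(sigma))
                     if sigma[j] > sigma[k])
    return -1 if inversions % 2 else 1

def det(A):
    """Naive determinant: sum over all permutations of sign times the diagonal product."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][sigma[i]]
        total += sign(sigma) * prod
    return total

print(sign((1, 0, 2)))                         # a simple transposition: -1
print(det([[1, 2], [3, 4]]))                   # 1*4 - 2*3 = -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 5]]))  # 30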

25 11/1/17
No class on Friday, but PS 8 problems 7 - 8 are due Friday at noon.

25.1 Graded Associative Algebras


We’ll start with some context for the exterior algebra in terms of graded and associative
F-algebras.

Definition 25.1.1. An F-algebra is a ring A with unity that contains a copy of F, i.e. F Ă A,
with the same ring operations. In particular, it’s a vector space over F. If this multiplication
operation is associative, we call it an associative F-algebra.

Let V be any F-vector space and let dim V “ n. Then the tensor algebra is defined as
b‚ V “ À8i“0 V bi .
Here dim V bi “ ni . This is clearly an F-algebra, with the multiplication operation being the
tensor product.

Definition 25.1.2. An F-algebra graded by the natural numbers is one that decomposes as
A “ À8i“0 Ai ,
where multiplication takes Ai ˆ Aj Ñ Ai`j . We usually require A0 “ F.

The tensor algebra above is a graded F-algebra. Other examples include F (Ai “ 0 for
i ą 0), Frxs (Ai “ Fxi ), and Frx1 , . . . , xm s (Ai are the homogeneous polynomials of degree
i).
Given this, we can define a couple more spaces. The symmetric algebra is
Sym‚ V “ ‘8i“0 Symi V “ b‚ V { Spanpt b pv b w ´ w b vq b t1 q

where t, t1 P b‚ V, v, w P V . This quotient ensures that all the cosets will be symmetric
tensors. Note that we can simply perform this quotient on each graded part to get

Symi V “ bi V { Spanpt b pv b w ´ w b vq b t1 q

where deg t ` deg t1 “ i ´ 2.

25.2 The Wedge Product


For determinants, we will introduce the exterior algebra, which is similar but with
anticommutation instead of commutation. We want

v ^ w “ ´w ^ v,

and v ^ v “ 0 (which should make sense from the condition you’re familiar with about
determinants). These two rules are almost equivalent to each other, aside from char F “ 2
in which case v ^ w “ ´w ^ v does not imply v ^ v “ 0. Thus we have a choice; which
condition do we want? We’ll use the stronger condition v ^ v “ 0 which is almost always
equivalent to v ^ w “ ´w ^ v.

Definition 25.2.1. The wedge product is defined as
Ź2 V “ V b V { Spanpv b v | v P V q.
The image of v b w in Ź2 V is v ^ w.

Observe that if V has basis e1 , . . . , en , then V b2 has basis ei b ej for 1 ď i, j ď n and
Ź2 V has basis ei ^ ej for 1 ď i ă j ď n, since all the others disappear (ej ^ ei “ ´ei ^ ej ).
How do we know that these are all linearly independent? Suppose that v “ ři ai ei ; then
v b v “ ři řj ai aj ei b ej “ ři a2i ei b ei ` řiăj ai aj pei b ej ` ej b ei q.
We can now use dimension-counting or something similar.


Recall the universal property of the tensor product, that a bilinear map V ˆ W Ñ X is
the same as a linear map V b W Ñ X. We have a similar universal property for the wedge
product.

Definition 25.2.2. An alternating map f : V ˆ V Ñ X is a bilinear map such that


f pv, vq “ 0.

Proposition 25.2.3 (Universal Property of the Wedge Product). The set of alternating
bilinear maps f : V ˆ V Ñ X is isomorphic to the set of linear maps Ź2 V Ñ X. Specifically, every
such f factors through the universal map Tuniv : V ˆ V Ñ Ź2 V as f “ fr ˝ Tuniv for a unique linear fr : Ź2 V Ñ X.
Proof. Take Tuniv pv, wq “ v ^ w. The rest is left to you as an exercise.

25.3 Determinants
We can now define the wedge product of linear transformations.

Definition 25.3.1. Suppose T : V Ñ W . The wedge product of linear transformations
Ź2 T is the map Ź2 V Ñ Ź2 W induced by the composite V b2 Ñ W b2 Ñ Ź2 W (first T b T ,
then the quotient), i.e. v b w ÞÑ T v ^ T w. Explicitly, we have Ź2 T pv ^ wq “ T v ^ T w.
Just as for tensor products, if we have T : V Ñ W and S : W Ñ X, then Ź2 pST q “ Ź2 S ˝ Ź2 T . This
is implied by the corresponding problem for tensor products.

Now consider T : V Ñ V and for now consider dim V “ 2. Then T e1 “ ae1 ` ce2 , T e2 “
be1 ` de2 so
Ź2 T pe1 ^ e2 q “ pae1 ` ce2 q ^ pbe1 ` de2 q “ pad ´ bcqpe1 ^ e2 q.

This scalar is the canonical definition of the determinant of the matrix. Our condition above
then corresponds to det ST “ det S det T , so the determinant is multiplicative. We’ll now
need to generalize this further.

25.4 The Exterior Algebra


Definition 25.4.1. The exterior algebra is
Ź‚ V “ À8i“0 Źi V “ b‚ V { Spanpt b pv b vq b t1 q,
just as for the symmetric algebra. In particular,
Źi V “ V bi { Spanpt b pv b vq b t1 q
where deg t ` deg t1 “ i ´ 2.
Note that Ź0 V “ F, Ź1 V “ V , and Ź2 V is our definition from above. We find that v ^ w “
´w ^ v as before, and Źi V, Źj V anticommute if i, j are both odd and commute otherwise. The
universal property above also extends to alternating multilinear maps, not just
alternating bilinear maps.
As before, we find that the ei1 ^ . . . ^ eij for subsets ti1 , . . . , ij u Ă t1, . . . , nu form a basis
for Źj V . This is clearly a spanning set; to show that it is linearly independent, let A be the
algebra with this basis. Now there is an obvious map b‚ V Ñ A sending simple tensors to the
appropriate wedge product; this map is alternating, so it descends to a map Ź‚ V Ñ A that
is at least surjective, but it must also be injective since we already know that the wedges
span Ź‚ V . Thus we get our desired isomorphism and we have a basis.
In particular, dim Źk V is the binomial coefficient pn choose kq and dim Źn V “ 1. Thus maps on Źn V are scalar
multiplication. We can define the n-fold wedge product of linear transformations just as

Definition 25.4.2. The determinant of a linear transformation T : V Ñ V is the scalar
corresponding to the induced map Źn T : Źn V Ñ Źn V , which is just scalar multiplication.

We will now leave it as an exercise to show that the formula from last time corresponds
to this definition of the determinant. But note that this is clearly basis-independent and
multiplicative and as an easy exercise, you can show that det T “ 0 iff T is not invertible
and that det id “ 1.

26 11/6/17
Tomorrow’s math table is by Rohil; section, math night, etc. as usual.
We’ll start with a digression about the 15 puzzle and then talk about group homomor-
phisms; we’ll then go back to signs and determinants.

26.1 Digression: The Fifteen Puzzle


The Rubik’s cube (on your pset) is a variant of a nineteenth-century craze of the fifteen
puzzle. See Figure 26.1.1.

Figure 26.1.1: The Fifteen Puzzle.

The goal is to unjumble the puzzle into the final state shown above. However, not all
configurations are reachable from the above (or solvable given that is an initial configura-
tion)! There are two possibilities here: either you can solve it, or you will end up with a
configuration where the 15 and 14 are swapped. Mathematically, we want to show that all
configurations are reachable from these two possibilities and these two possibilities are not
reachable from each other.
The reason you can’t solve it is essentially the existence of the sign homomorphism.
Always getting it to this form is deriving an algorithm which is easier than the Rubik’s cube
but still annoying. But you can see that all configurations you attain constitute a subgroup
of S15 (assuming the empty square is in the same place). It contains the identity and the
inverse (just retrace your steps) and is closed under composition (just do both permutations
in sequence). The claim is that this subgroup is contained in A15 “ ker , where  is the sign
map.
To prove this, go inside S16 (add the empty square). But we no longer have a subgroup
since we no longer have composition. However, if we wind up with the empty square back
where we started, we end up back in S15 and in our original subgroup. There must be an
even number of such transitions (since 16 needs to move an even number of steps) so we have
an even number of transpositions meaning that we are actually A16 X tσ | σp16q “ 16u “ S15 .
This is just A15 , as desired.
Why is the number of steps even? You can either note this by seeing that for every step
you take out, you take one step backwards (so the total is even) or consider the checkerboard
coloring, under which on odd steps you are always on the opposite color.

26.2 Digression: Group Homomorphisms
Consider a map between groups. Not surprisingly, we have:
Definition 26.2.1. A map φ : G Ñ H is a group homomorphism if φp1G q “ 1H , φpg ´1 q “
φpgq´1 , and φpgg 1 q “ φpgqφpg 1 q for all g, g 1 P G.
Definition 26.2.2. The kernel of a group homomorphism φ : G Ñ H is ker φ “ tg P
G | φpgq “ 1H u. It is clearly a subgroup (exercise).
We can now write a short exact sequence
φ
1 Ñ ker φ ãÑ G Ñ H Ñ 1
to reinterpret this information. But there is one major difference with vector spaces. If we
took any subspace U Ă V , we can always form a quotient space
0 Ñ U Ñ V Ñ V {U Ñ 0
to complete this exact sequence. This is not true for any subgroup of G! We can’t always
form a quotient group to complete
1 Ñ K Ñ G Ñ G{K Ñ 1.
Suppose we consider the set of cosets rgs “ gK Ă G. We want to define a multiplication
operation on this set of cosets. For vector spaces, we wrote pv1 ` U q ` pv2 ` U q “ v1 ` v2 ` U
which is well-defined any time you have an abelian group. But if you have a nonabelian
group, then it’s not necessarily true that
pgKqpg 1 Kq “ pgg 1 Kq
since we can’t commute K with g 1 . Further, there are two different types of cosets (gK and
Kg); one is a right coset and one is a left coset.
Sn is not commutative, but we were able to have Sn {An , since one coset is all the even
permutations and the other is all the odd permutations. So clearly sometimes we can take
quotients, but there are plenty of cases where we can’t. For example, S2 Ă S3 cannot form
a quotient group. Thus we have the condition.
Definition 26.2.3. A normal subgroup is a subgroup K Ă G for which any of the
following equivalent conditions hold:
1. For all g, gK “ Kg.
2. For all g, gKg ´1 “ K (the conjugate of K by g, which is automatically a subgroup).
3. G{K exists as a quotient group.
Theorem 26.2.4. The conditions above in the definition of a normal subgroup are equivalent.

Proof. The equivalence of the first two should be obvious. For the last, just note that
pgKqpg 1 Kq “ gpKg 1 qK “ gpg 1 KqK “ gg 1 KK “ gg 1 K, using gK “ Kg, as desired.
Note that the kernel of any group homomorphism K “ ker φ is automatically normal, as
φpgqφpKqφpg ´1 q “ φpgqφpg ´1 q “ 1, so gKg ´1 Ă K as desired.

26.3 Back to Determinants
We have constructed the determinant map det : GLpV q “ GLn pFq Ñ Fˆ (the multiplicative
group associated with a field F).

Definition 26.3.1. The special linear group SLpV q “ SLn pFq “ ker det.

All elements of the special linear group have determinant 1.


We still need easier ways to compute the determinant and want to see how computation
of the determinant is related to computation of the inverse of a matrix and to solving equations.
This can be written down for general matrices, but it’s a bit of a pain. You’ll find that
each entry is a determinant of a submatrix. The general pattern is that to get the pi, jqth
entry, you delete the ith row and jth column from the transpose of
your original matrix, take the determinant, and multiply by a checkerboard sign. This is the
adjugate matrix or adjunct matrix, which we’ll just call Adj A. Much more simply, this
also corresponds to Źn´1 T (which is also over an n-dimensional space and thus an n ˆ n
matrix).
Suppose you are looking at matrices over a commutative ring. The determinant still
makes sense, as does det T det S “ detpST q since it’s just a polynomial identity. Thus
det T “ pdet Sq´1 when S “ T ´1 , so the determinant of an invertible matrix is an invertible element of the
ring, i.e. a unit. An invertible matrix over the integers must have determinant ˘1. What’s
not obvious is that this condition is also sufficient, but we see this from the adjugate
matrix.
Here is our first attempt at computing the determinant. Start from vectors v1 , . . . , vn and
add en . We’ll have some linear dependence, so we can replace vj by βn en in v1 ^ . . . ^ vj ^
. . . ^ vn (the other vi cancel out since we’re already wedging with them). We can keep doing
this by adding en´1 , en´2 , etc. You’ll end up with βn en ^ βn´1 en´1 ^ βn´2 en´2 ^ . . . ^ β1 e1 in
some order (i.e. some sign). Multiply the βi s together, keeping track of that sign, and that gives you the determinant.

27 11/8/17
27.1 Computing Determinants via Column Reduction
We’ve defined det T in terms of the exterior algebra, and would now like an easy way to
compute it in terms of matrices (and to connect it to more familiar definitions of the de-
terminant from more standard and lamer linear algebra courses). Our standard definition
immediately implies
detpdiagpa11 , a22 , . . . , ann qq “ a11 a22 . . . ann
(diag means a diagonal matrix with those entries). This is still true if you have an upper-
triangular matrix; all the elements in the upper triangle cancel out since they have elements
of the form ei ^ ei .
We’ll use these facts to compute the determinant in general, along with the fact that
the determinant is multilinear and alternating. This is equivalent to the identity detpABq “
det A det B for some special matrices A. We start with some easy properties.

1. If vj P Spanpv1 , . . . , vpj , . . . , vn q (in particular if vj “ 0), then detpv1 , . . . , vj , . . . , vn q “ 0. This is the statement that the
determinant of a noninvertible matrix is 0; just replace vj by a linear combination of
the other vi ’s and their wedge product is obviously 0.

2. We have
detpv1 , . . . , vj ` w, . . . , vn q “ detpv1 , . . . , vj , . . . , vn q
with w P Spanpv1 , . . . , vpj , . . . , vn q where the proof is the same as above. This is equiva-
lent to detpABq “ det A det B by letting A be anything and B be the identity matrix,
except in the jth column which has other random coefficients in the non-diagonal
entries, i.e. B agrees with the identity matrix except in one column, which has 1 on the
diagonal and arbitrary entries ai off the diagonal.
You can check that det B “ 1. These matrices are closed under multiplication (as are
the diagonal matrices), so you really only need to check them for a spanning set, i.e.
matrices with 1s on the diagonal and a single off-diagonal entry. This is known as
column reduction.
We can use this to compute determinants. Use column reduction to make your matrix
upper or lower triangular by subtracting columns appropriately; this allows you to
compute the determinant easily using the property that we described above.

3. You can switch the ith and jth column by using the matrix B with bij “ bji “ 1, bii “
bjj “ 0 and otherwise the identity. This matrix has det B “ ´1, as we’d expect by the
exterior product. This allows you to switch columns up to sign.

This gives us an algorithm to compute determinants that works in roughly OpN 3 q, since
you have N elements per column and N 2 pairs of columns to compare. This is a signif-
icant improvement over the OpN !q that we had earlier from the explicit formula for the
determinant!
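A rough sketch of this elimination procedure, assuming Python, using row operations of the same elementary types: adding a multiple of one row to another leaves the determinant unchanged, swapping two rows flips its sign, and the result is the product of the pivots.

def det_by_elimination(A):
    """Determinant via Gaussian elimination; roughly O(N^3) arithmetic operations."""
    M = [row[:] for row in A]          # work on a copy
    n = len(M)
    sign = 1
    for col in range(n):
        # find a nonzero pivot in this column, swapping rows if necessary
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0                   # no pivot: the matrix is singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign               # each swap multiplies the determinant by -1
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]   # determinant unchanged
    result = sign
    for i in range(n):
        result *= M[i][i]              # product of the diagonal of the triangular matrix
    return result

print(det_by_elimination([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3.0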
Of course, if you have an eigenbasis or an upper triangular basis already, you should use
that. But finding those matrices is roughly as hard as this procedure, so it’s not necessarily
general.
Note that the fact that the determinant of A is invariant under change of basis is just

detpS ´1 ASq “ pdet Sq´1 det A det S “ det A.

The matrices above are called elementary matrices, along with diagp1, . . . , a, . . . , 1q
which is sometimes called elementary. These allow us to reduce our matrix into something
that is easier to compute the determinant of since each elementary matrix has an obvious
determinant. Every invertible matrix can be written as the product of elementary matrices.
In terms of group theory, we say that the elementary matrices generate GLn pFq.
We have
AB1 . . . Bk “ D
where the Bi are elementary matrices and D is diagonal. We can thus write

A “ DBk´1 . . . B1´1

implying that
B1 . . . Bk D´1 “ A´1 .
Thus we can compute the inverse!

27.2 Exterior Powers of Duals


How do we get from column reduction to row reduction? We can take the transpose. We
know that pABqT “ B T AT so using the above notation, we have

AT “ pB1´1 qT . . . pBk´1 qT DT .

We claim that detpAT q “ detpAq. You can check this by checking it on the generators, which
is easy enough.
Alternatively, consider T ˚ : V ˚ Ñ V ˚ . Then we know that det T ˚ is the scalar given by Źn pT ˚ q, which
we want to connect with Źn T . Note that Źn pV ˚ q is not quite Źn V ! But we have a map
Źn pV ˚ q ˆ Źn V Ñ F (by evaluation, i.e. pv1˚ ^ . . . ^ vn˚ , v1 ^ . . . ^ vn q ÞÑ v1˚ pv1 qv2˚ pv2 q . . . vn˚ pvn q),
which gives us a map Źn pV ˚ q Ñ pŹn V q˚ . You can finish off this proof with a commutative
diagram for brownie points from me.
Źn´1
27.3 Adj A and pV q
Recall
Źn´1 the matrix Adj A that relates closely to the inverse. We’ll connect this now to
pV q. Recall that there’s a canonical map V ˚ ˆ V Ñ F and also an important map

81
Źn´1
V ˆ V Ñ F via wedging, i.e. pv1 ^ . . . ^ vn´1 q ˆ v Ñ v1 ^ . . . ^ vn´1 ^ vn . This is a
perfect pairing, which you can check by extending v to n linearly independent vectors and
wedging the other n ´ 1Ź together.
We can now identify n´1 Ź V » V ˚ b n V by comparing these two maps. Applying n T
Ź Ź
on both sides, you find that n´1 T “ T ˚ det T for some interpretation of T ˚ . Thus the
adjugate ^n´1 T is equal to the determinant times the transpose, which proves the standard
formula for the inverse of a matrix.
Next time, we’ll talk about positive definite matrices and signatures in terms of determi-
nants.

28 11/10/17
PS 10 due next Monday, November 20 (not next Friday)!

28.1 Motivation for Quadratic Forms


Recall some of our motivation for linear algebra. We can model any function f : R Ñ R as
f px ` aq “ f pxq ` af 1 pxq ` pa2 {2qf 2 pxq ` . . . .
If you find a point with a zero first derivative, then it’s a critical point which could be either
a local minimum or maximum depending on the value of f 2 pxq.
Now what if you have a map on n variables f : V » Rn Ñ R? Then x is a vector and a
is a vector as well. f 1 pxq maps vectors to scalars and is thus a linear functional (sometimes
called the gradient of f ), an element of V ˚ » Rn .
you expect f 1 pxq “ 0 to get a critical point. The second derivative is homogeneous of degree
2, and must therefore be a quadratic form, or a symmetric map V ˆ V Ñ R which (with
an inner product) can be identified as an element of V ˚ b V or a linear transformation T .
By choosing coordinates, we see that this is just xT v, vy with T “ T ˚ . We want to know
something about whether this is positive or negative or neither (?) in dimension greater
than 2. Thus we’d like to know some things about how quadratic forms behave.

28.2 Positive-Definite Operators


So let’s focus on the theory of quadratic forms. This goes back to Chapter 7, but we’ll
use determinants freely. This works for F “ R, C. Assume V finite-dimensional and let
T : V Ñ V be self-adjoint, so xT v, vy “ xv, T vy P R.
Definition 28.2.1. T is a positive operator if for all v P V we have xT v, vy ě 0.
Example 28.2.2. Examples of positive operators:
1. The zero map. Hmmm. . .
2. The identity.
We want to be a bit careful here, since we don’t want to replace the ě sign with a strict
inequality, but we don’t really want 0 to be positive. Thus we have:
Definition 28.2.3. T is a positive-definite operator if for all v P V such that v ‰ 0,
xT v, vy ą 0.
Positive is not necessarily a standard definition; it’s sometimes called positive semidefi-
nite. We also say that T is negative-definite if ´T is positive-definite.ˆ ˙
0 1
Not all properties of positive numbers transfer over; in particular is not a positive-
1 0
definite or negative-definite operator. The sum of two positive-definite operators is positive-
definite, and so is aT ` a1 T 1 if a, a1 ą 0. So this is not quite a vector space; it is someitmes
called a cone.

Now pick an eigenbasis. Then we see that v “ ři ai ei implies that
xT v, vy “ ři |ai |2 λi .

Thus our operator is positive semidefinite iff the λi ě 0 and positive definite iff λi ą 0. But
how do you determine if your transformation T is positive-definite?
Observe that
detpxI ´ T q “ śi px ´ λi q

(as a special case of the characteristic polynomial applied to the self-adjoint operator). This
determinant doesn’t depend on eigenbasis, so we can apply our formula to the matrix in
any basis and find roots of our polynomial (or approximate them numerically) and check
whether they are positive.
T being positive definite implies that xT u, uy ą 0 for all u P U given any subspace
U Ă V . This isn’t very exciting, but you can take any subspace, restrict the quadratic form
to that subspace, get the matrix and determinant, and this also has to be positive. This is
a much stronger condition since you can check on any subspace. It is also sufficient (check
on one-dimensional subspaces). Thus given a flag

0 “ V0 Ă V1 Ă . . . Ă Vn “ V

where each Vk has dimension k, you can check positive-definiteness by checking that the determinant of the
restricted form on each Vk is positive. You can also compute the signature in this way by counting sign changes.
Consider the 3 ˆ 3 matrix with pi, jq entry 1{pi ` j ´ 1q, i.e. with rows
p1, 1{2, 1{3q, p1{2, 1{3, 1{4q, p1{3, 1{4, 1{5q.

You can verify that the sub-determinants here are positive, so this is a positive-definite
matrix (as are its obvious generalizations).
Proof. We want to show that if xT v, vy |Vk is already known to be positive-definite and det on
Vk`1 is positive, then xT v, vy |Vk`1 is also positive-definite. (This is essentially the necessary
induction step).
The signature of xT v, vy |Vk`1 is either pk ` 1, 0q or pk, 1q. But the latter would imply a
negative determinant, impossible. So we have the desired.
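A numerical sketch of this leading-minor criterion, assuming numpy, applied to the matrix with entries 1{pi ` j ´ 1q from above; every upper-left k ˆ k sub-determinant is positive, so the matrix is positive-definite.

import numpy as np

def is_positive_definite(A):
    """Check that every leading principal minor det(A[:k, :k]) is positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

n = 3
H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
print([np.linalg.det(H[:k, :k]) for k in range(1, n + 1)])  # all positive
print(is_positive_definite(H))                              # True
print(np.linalg.eigvalsh(H) > 0)                            # all eigenvalues positive too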

28.3 Characteristic Polynomial and the Cayley-Hamilton Theo-


rem
If you evaluate detpxI ´ T q on T , you get 0. This turns out to be true in general even if you
have a non-diagonalizable matrix. Proving this is the key topic of Chapter 8, which we’ll do
some of next week.

29 11/13/17
Math table tomorrow: Sander Kupers (BP fellow) on knots and the spirit world.
This week: algebraic methods in combinatorics talks at CMSA.
Our last topic of linear algebra is Ch. 8, where we come back to one of the fundamental
questions of linear algebra: given T : V Ñ V , how do you classify such operators? We already said
that vector spaces are classified by their dimension and a linear map T : V Ñ W is classified
by the dimensions dim V, dim W and the rank. T : V Ñ V is more complicated; we already
have some information (the eigenvalues). If you can write V “ ‘λ Vλ , i.e. T is diagonalizable,
then we’re done (we just need the eigenvalues). But this doesn’t always work; F might not
be algebraically closed, and even if it is, ‘λ Vλ might be strictly smaller, as for the 2 ˆ 2 matrix
with a 1 in the upper-right corner and 0s elsewhere. The
answer to this question if F is algebraically closed is Jordan normal form (which you should
be familiar with from my section). If F is not algebraically closed, i.e. F “ R, this becomes
a pain.

29.1 The Characteristic Polynomial


Assume V is finite-dimensional.
Definition 29.1.1. The characteristic polynomial qT pxq “ detpxI ´ T q.
One may explicitly compute that
qT pxq “ xn ´ TrpT qxn´1 ` . . . ` p´1qn det T
by considering the matrix representation of T . Further, λ is a root of this polynomial iff λ
is an eigenvalue.
The key result about the characteristic polynomial is:
Theorem 29.1.2 (Cayley-Hamilton Theorem). qT pT q “ 0.

This is relatively obvious for diagonalizable transformations, since we have qT pxq “
śλ px ´ λqdim Vλ , which is clearly satisfied by T . We’ll show that it is also true for non-
diagonalizable transformations.
Why is this useful? One simple application is for n “ 2; we have
T 2 ´ pTr T qT ` pdet T qI “ 0,
so you can solve to get
T ´1 “ ppTr T qI ´ T q{ det T
given det T ‰ 0. This corresponds to our usual formula for the inverse of a 2 ˆ 2 matrix.
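A quick numerical verification of Cayley–Hamilton for a random matrix, assuming numpy: numpy.poly returns the coefficients of the characteristic polynomial of a square matrix, and we evaluate it on T with matrix powers.

import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((4, 4))

coeffs = np.poly(T)          # coefficients of q_T(x) = det(xI - T), highest degree first
# evaluate q_T(T) with Horner's rule, using matrix multiplication for the powers of T
Q = np.zeros_like(T)
for c in coeffs:
    Q = Q @ T + c * np.eye(4)
print(np.allclose(Q, 0))     # True: q_T(T) = 0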
We’ll give a proof over the complex numbers, but this is just a polynomial identity with
coefficients in Z. So this is true in the reals, or in any other field we like, once we reduce it
to this polynomial identity. This general principle (turning things into polynomial identities
and writing them over different fields) works in other examples, like proving that if A is
skew-symmetric and n is odd then det A “ 0. Proof: work over C using AT “ ´A and then
deduce it for all fields, even those with characteristic 2.

29.2 Nilspace and Generalized Eigenspaces
Let the algebraic multiplicity be the multiplicity of λ as a root of qT , or the number of
λs on the diagonal of the upper triangular form. Let the geometric multiplicity of λ be
dim Vλ . This indicates that the number of λs on the diagonal in the upper-triangular form
is an invariant.
We can show that the geometric multiplicity is ą 0 iff the algebraic multiplicity is
ą 0. However, the geometric multiplicity ď the algebraic multiplicity always, so the same is
true of the totals (which we already knew, since the geometric multiplicities add to řλ dim Vλ
and the algebraic multiplicities add to dim V ). Equality of the totals occurs iff the algebraic and geometric
multiplicities are all equal, i.e. iff the transformation is diagonalizable.
What happens if not? We need to replace Vλ by something similar. We’ll do this for λ “ 0
and then see how this works for general λ. Consider T : V Ñ V and define Vk “ ker T k . We
have
0 Ă V0 Ă V1 Ă . . .
as an increasing sequence of subspaces that is contained in V . Further, T : Vi`1 Ñ Vi for all
i. Further, any time Vk “ Vk`1 , we have Vk`1 “ Vk`2 and so on. Thus we say:

Definition 29.2.1. The nilspace of T is Yk Vk .

Note: in section, I called this the eventual kernel. It is also sometimes denoted kerpT 8 q.
The chain must stabilize within dimpnilspaceq steps, thus you can stop taking the
union at the dim V step, or just take Vdim V “ kerpT dim V q.

Definition 29.2.2. An operator is nilpotent if V “ Vdim V , the nilspace, or T k “ 0 for


some k.

We may define the generalized λ-eigenspace to be the nilspace of λI ´T . The algebraic


multiplicity is now the dimension of the generalized λ-eigenspace. Combining this fact with
the fact that the sum of the algebraic multiplicities is n, we get
V “ ‘λ nilspacepλI ´ T q.

Using this, you can prove Cayley-Hamilton and find a canonical form for any operator: you’ll
have all copies of each eigenvalue together and 1s or 0s on the superdiagonal and nothing
beyond that.

30 11/15/17
Announcement: the math department is holding a new seminar called the Open Neighbor-
hood Seminar with accessible math talks. The series is every other Thursday, talks from 4 -
5, snacks from 5 - 6. See http://math.harvard.edu/ons.

30.1 Proof of Cayley-Hamilton


Last time, we defined the characteristic polynomial of a transformation T : V Ñ V as

qT pxq “ detpxI ´ T q “ xn ´ pTr T qxn´1 ` . . . ` p´1qn det T.

It’s enough to prove this over C, as we pointed out last time.
Since we are over C, we know that qT pxq “ śi px ´ λi q where the λi are the eigenvalues
of T . In the diagonalizable case, this is now trivial since we can find an eigenbasis which
trivializes the problem.
If not, then we need to consider the nilspace. This was defined as the union of all
the Vi where Vi “ ker T i and
0 “ V0 Ă V1 Ă . . . Ă Vk Ă . . . .

We can see that this sequence stabilizes at Vn . We may also define the opposing sequence

V “ Im T 0 Ą Im T 1 Ą . . . Ą Im T m Ą . . . .

Proposition 30.1.1 (Axler 8.5). V “ Vn ‘ Im T n .

Proof. The dimensions match up by rank-nullity, so we need to check that the intersection is
zero. The intersection is all vectors v such that T n v “ 0, T n w “ v. But T is an isomorphism
on Im T n so this implies v “ 0.
Now work inside ker T n . T is nilpotent here, so we can find a basis under which the
matrix of T is strictly upper-triangular, not just upper-triangular. Applying this to T ´ λI, we have
pT ´ λIqk “ 0 on nilspacepT ´ λIq once k “ dim nilspacepT ´ λIq. So we can decompose
V “ nilspacepT ´ λIq ‘ ImpT ´ λIqn .

Now you can induct, applying the same argument to each eigenvalue, until you get all the
generalized λ-eigenspaces.
Now we just need to put everything together. Suppose you take the direct sum of two
linear transformations T1 : V1 Ñ V1 , T2 : V2 Ñ V2 . Then we have

qT1 ‘T2 “ qT1 qT2 .

This can be easily verified using the exterior product or using the block diagonal formula for
the determinant. Now we just use the nilspace decomposition to get Cayley-Hamilton.

87
30.2 Computation Issues
We know that
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
How do you compute this? ad and bc can be computed to about 2N digits of accuracy if you are given N digits of accuracy in a, b, c, d, but we are then subtracting and dividing. Dividing is problematic since dividing by something close to 0 gives something huge, which can change a lot with very small errors in the denominator. There is a field of numerical analysis that studies the stability of doing operations like this on real numbers represented with finite precision.
For eigenspaces and eigenvalue decompositions this gets more annoying. If λ1 and λ2 are very close to each other, it's very difficult to tell whether they are exactly the same. So it's very difficult to distinguish between close eigenvalues and a repeated eigenvalue, which greatly messes with attempts to compute the eigendecomposition.
Finally, there is the question of sparse linear algebra. This is about storing and computing with matrices for which the vast majority of entries are 0s (e.g. Google's PageRank algorithm). You can compute the top eigenvector by iterating the matrix several times on a random vector, which gets you the vector with the top eigenvalue, and so on for the next few eigenvectors on the orthogonal complements. You also have a notion of how an eigenvalue changes with respect to certain entries changing.
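A minimal sketch of the iteration idea just mentioned (assuming numpy; the small dense matrix and iteration count are illustrative, and this ignores the convergence and stability caveats discussed in this section):

```python
import numpy as np

def power_iteration(A, iters=1000, seed=0):
    # Repeatedly apply A to a random vector; the component along the top
    # eigenvector dominates, provided |lambda_1| > |lambda_2|.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    eigval = v @ (A @ v)  # Rayleigh quotient
    return eigval, v

A = np.array([[2., 1.], [1., 3.]])
print(power_iteration(A)[0])  # ~3.618, the larger eigenvalue of A
```

For a genuinely sparse matrix one would store only the nonzero entries and apply the same loop, since each step only needs a matrix-vector product.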

88
31 11/17/17
Thanks to Rohil for taking these notes when I was out of class.
Today we are starting representation theory. Our references are Chapter 9 of Artin’s
Algebra and Chapter 1 of Fulton and Harris’ book.

31.1 Representation Theory


Let F be a field (usually C) and V a vector space over F (usually finite-dimensional).
We have discussed a variety of concepts in linear algebra, including linear maps V Ñ W ,
kernels, images, splittings V “ U ‘ U 1 , etc.
Now we will consider vector spaces with a bit of extra structure. Fix a group G that is
usually finite, and fix an action of G on V .

Definition 31.1.1. A (linear) representation of G is a pair (V, ρ) where V is a vector space and ρ is a group homomorphism from G to GL(V), the group of transformations V → V with nonzero determinant.
This is a "group homomorphism" if we put the group operation of composition on GL(V). In other words, the identity e ∈ G maps to the identity transformation on V, and ρ(gh) = ρ(g) ∘ ρ(h).

This is in fact a generalization of an ordinary vector space. If we set G to be the trivial group {1}, then a representation (V, ρ) must have ρ be the trivial map sending 1 to the identity. Therefore, a representation of the trivial group is the same thing as a vector space.
All of the constructions in linear algebra we have done have analogues in the setting of representations. Note that the map ρ does not necessarily have to be injective; when it is injective, we call the representation faithful.

Example 31.1.2. Examples of Representations:

• V “ t0u is a representation.

• A trivial representation is a pair pV, idq, i.e. where gv “ v for all g P G, v P V .

• Two representations pV1 , ρ1 q and pV2 , ρ2 q can be summed to produce a representation


pV1 ‘ V2 , ρ1 ‘ ρ2 q, i.e. pρ1 ‘ ρ2 qpgqpv1 , v2 q “ pρ1 pgqv1 , ρ2 pgqv2 q for every g P G, v1 P V1 ,
v2 P V2 .

31.2 Subrepresentations and Quotient Representations


We can build new representations out of direct sums, so it is interesting to ask the opposite question: whether a representation can be broken down into a direct sum.

Definition 31.2.1. A nonzero representation (V, ρ) is irreducible if whenever (V, ρ) = (V1, ρ1) ⊕ (V2, ρ2), one of V1 or V2 is equal to {0}.

We can also generalize the notion of a subspace to representations.

89
Definition 31.2.2. A subrepresentation of pV, ρq is a subspace U Ď V such that U is
G-invariant, i.e. ρpgqu P U for all g P G, u P U . The action of G on U is then the restriction
of the action of G on V .

Here’s an example from (9.6) in Artin. If G acts on a set S by permutations, then G acts
on the vector space F S linearly. This is sometimes called the permutation representation.
The most natural example is G “ Sn , S “ t1, 2, . . . , nu. Then G acts on F n by permuting
the standard basis e1 , . . . , en . Another example is where G is equal to the set S, in which case
F S is known as the regular representation of G. We will show later that the decomposition
of the regular representation into irreducible representations gives us all of the irreducible
representations of G.
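A small computational sketch of the permutation representation (assuming numpy and itertools; the encoding of S_3 elements as tuples of images is an illustrative choice): it builds ρ(g) as permutation matrices and checks the homomorphism property ρ(gh) = ρ(g)ρ(h).

```python
import itertools
import numpy as np

n = 3
S_n = list(itertools.permutations(range(n)))  # elements of S_3 as tuples

def rho(p):
    # Permutation matrix sending e_i to e_{p(i)}.
    M = np.zeros((n, n))
    for i in range(n):
        M[p[i], i] = 1
    return M

def compose(p, q):
    # (p q)(i) = p(q(i)), matching rho(pq) = rho(p) @ rho(q)
    return tuple(p[q[i]] for i in range(n))

assert all(np.array_equal(rho(compose(p, q)), rho(p) @ rho(q))
           for p in S_n for q in S_n)
print("rho is a homomorphism on all", len(S_n) ** 2, "pairs")
```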
We can also generalize quotient spaces.

Definition 31.2.3. Given a representation pV, ρq with a subrepresentation U , the quotient


representation is the vector space V {U with the action ρpgqpv ` U q “ ρpgqv ` U .

31.3 Mapping Representations


We can also define maps between representations.

Definition 31.3.1. A map T : pV, ρV q Ñ pW, ρW q of representations is a linear map T :


V Ñ W that intertwines the representations, i.e. T pρV pgqvq “ ρW pgqT pvq for every g P G,
v PV.

Now let’s work over F “ R, C and suppose V has an inner product.

Definition 31.3.2. pV, ρq is a unitary representation of G if ρ maps G to U pV q Ă GLpV q,


i.e. for every g P G, v, v 1 P V , the identity xgv, gv 1 y “ xv, v 1 y holds.

We can show that U⊥ is a subrepresentation if (V, ρ) is a unitary representation and U ⊂ V is a subrepresentation. For any v ∈ U⊥, we have ⟨u, v⟩ = 0 for every u ∈ U. Then ⟨gu, gv⟩ = ⟨u, v⟩ = 0 for every g ∈ G, u ∈ U; since gU = U, every element of U is of the form gu, and this implies gv ∈ U⊥ as well.
If ⟨·, ·⟩ is any inner product on V, we can define a new inner product for which V is a unitary representation:

[v, v'] = Σ_{g∈G} ⟨gv, gv'⟩.

This kind of “averaging trick” or “unitary trick” pops up often.
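A minimal sketch of the averaging trick (assuming numpy; the two-element group of matrices below is an illustrative toy example): starting from the standard inner product, averaging over the group produces a G-invariant inner product.

```python
import numpy as np

def averaged_gram(group_matrices):
    # [v, v'] = sum_g <gv, gv'> corresponds to the Gram matrix B = sum_g g^H g
    # (starting from the standard inner product).
    return sum(g.conj().T @ g for g in group_matrices)

# Z/2Z acting on R^2 by a non-orthogonal involution M (check: M @ M = I).
M = np.array([[1., 1.], [0., -1.]])
group = [np.eye(2), M]
B = averaged_gram(group)

# The averaged form is G-invariant: g^H B g = B for every g, i.e. [gv, gv'] = [v, v'].
print(all(np.allclose(g.conj().T @ B @ g, B) for g in group))  # True
```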

90
32 11/20/17

Today: PS 2. No class for the rest of the week, or section, or math table.

32.1 Characters
Characters are the main tool that we will use to understand the representations of finite
groups. As a reminder, a representation is a homomorphism ρ : G Ñ GLpV q given V {C finite-
dimensional and G a finite group. Given such a representation, we associate the character
χρ : G Ñ C such that χρ pgq “ Trpρpgqq. This is not necessarily a homomorphism. This
character carries a lot of information about your representation and can answer questions
about irreducibility, homomorphisms, duals, etc.
Here are some basic properties of the character:

1. Suppose g 1 is conjugate to g in G, i.e. g 1 “ h´1 gh. Because ρ is a homomorphism,


we have ρpg 1 q “ pρphqq´1 ρpgqρphq is the conjugate of ρpgq by ρphq, so it has the same
trace. Thus χρ pg 1 q “ χρ pgq. Conjugacy is clearly an equivalence relation in your group
(exercise) with equivalence classes called conjugacy classes; the character is constant
on conjugacy classes. Such a function is called a class function.

2. dim V “ Tr id “ χρ pidG q. (We’ll occasionally use ρ to refer to the vector space itself
as well, so sometimes you might see dim ρ).

3. If dim V = 1, then ρ : G → C* = GL(C) and χρ = ρ. Thus maps G → C* are themselves sometimes known as characters of the group.
It's well known that for any g ∈ G in a finite group G there exists some n with g^n = 1. The smallest such n is the order of the element g, which divides |G| (such an n exists by pigeonhole, and it divides |G| by Lagrange). In this case (ρ(g))^n = id, so χρ(g)^n = 1 if dim V = 1. Thus χρ(g) = ρ(g) is an nth root of unity (the group of nth roots of unity is sometimes denoted μ_n). Further, ρ(g^{-1}) must be the complex conjugate. This leads to:

4. χρ(g^{-1}) = \overline{χρ(g)}. This is because ρ(g) satisfies x^n − 1 = 0, which has n distinct roots in any algebraically closed field whose characteristic does not divide n. Thus ρ(g) can be diagonalized and each eigenvalue λ_j is an nth root of unity. The inverse has eigenvalues λ_j^{-1} = \overline{λ_j}, so the traces are also complex conjugates.

5. χρ1 ‘ρ2 pgq “ χρ1 pgq ` χρ2 pgq. This is immediate from the formula for the trace of a
direct sum.

6. G Ñ Sn gives a map G Ñ GLn pFq. This is a permutation representation. Then ρpgq


has a 1 on the diagonal precisely when a basis vector maps to itself, i.e. when the
permutation has a fixed point. Thus χρ pgq is the number of fixed points.
A very important special case is the regular representation. In this case, χpgq is either
0 (when g is not the identity) or |G| (when g is the identity).
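A small sketch tying properties 2 and 6 together for G = S_3 (assuming numpy and itertools; the tuple encoding is illustrative): the character of the natural permutation representation counts fixed points, and the character of the regular representation is |G| at the identity and 0 elsewhere.

```python
import itertools
import numpy as np

n = 3
G = list(itertools.permutations(range(n)))

def perm_matrix(sigma, m):
    M = np.zeros((m, m))
    for i in range(m):
        M[sigma[i], i] = 1
    return M

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

# Natural permutation representation on C^3: the character counts fixed points.
for g in G:
    fixed = sum(1 for i in range(n) if g[i] == i)
    assert np.trace(perm_matrix(g, n)) == fixed

# Regular representation: |G| at the identity, 0 elsewhere.
index = {g: k for k, g in enumerate(G)}
for g in G:
    sigma = tuple(index[compose(g, h)] for h in G)  # left multiplication by g, as a permutation of G
    chi = np.trace(perm_matrix(sigma, len(G)))
    assert chi == (len(G) if g == tuple(range(n)) else 0)
print("characters match the fixed-point description")
```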

91
32.2 Orthonormality of Characters
We have an inner product on C^G = {f : G → C} defined by

⟨f1, f2⟩ = (1/|G|) Σ_{g∈G} f1(g) \overline{f2(g)}.

The characters are class functions so we can consider their inner products.

Theorem 32.2.1 (Artin 5.9). Characters of irreducible representations are orthonormal,


i.e. given two irreducible representations V1 , V2 we have xχV1 , χV2 y “ 1 if V1 “ V2 and 0 if
not. Further, these form an orthonormal basis of the class functions.

Given this, if V is any representation, then

χ_V = Σ_{irreps χ} n_χ χ,   where n_χ = ⟨χ, χ_V⟩

by the usual formula. This is true because by Maschke’s theorem (shown last time), V can
be written as the sum of irreducibles, implying the above holds with the nχ real (and actually
nonnegative integers). Thus it doesn’t matter whether or not we take a complex conjugate.
Further, we can determine if V and W are isomorphic by splitting into direct sums
of irreducibles and then determining if they have the same number of each irreducible.
Alternatively, we could just determine if their characters are the same.
Finally, consider the regular representation V. We see that ⟨χ, χ_V⟩ = χ(1) = dim V_χ, so

V = ⊕_χ (V_χ)^{⊕ dim V_χ}.
Thus the regular representation contains all irreducible representations. Finally,

|G| = dim V = Σ_χ (dim V_χ)².

This gives us a bound on the number of irreducible representations.

92
33 11/27/17
Makeup class next Monday. Saturday: Putnam exam (10 - 1 and 3 - 6, Science Center C,
be there by 9:30 to register). PS 11 out today, due next Wednesday.

33.1 Orthogonality of Characters


We define the inner product on functions f1, f2 ∈ C^G to be

⟨f1, f2⟩ = (1/|G|) Σ_{g∈G} f1(g) \overline{f2(g)}.

We will show that two irreducible characters are orthonormal:

⟨χ, χ'⟩ = 1 if V ≅ V', and 0 if V ≇ V'.

We will in fact show the more general statement that ⟨χ_V, χ_W⟩ = dim Hom_G(V, W). In this case, Hom_G(V, W) is the set of homomorphisms T : V → W, i.e. T ∈ Hom(V, W) such that T ∘ ρ_V(g) = ρ_W(g) ∘ T for every g ∈ G (i.e. the square with T on the top and bottom and ρ(g) on the two sides commutes).
We already know by Maschke that V = ⊕_j V_j for some irreducibles V_j and similarly W = ⊕_k W_k for irreducibles W_k. You can then check that

Hom_G(V, W) = ⊕_{j,k} Hom_G(V_j, W_k).

Further, ⟨χ_V, χ_W⟩ is bilinear in V and W (with respect to direct sums), since

χ_{⊕_j V_j} = Σ_j χ_{V_j}

and the inner product is bilinear. Thus we can just prove our result for irreducibles. This leads us to the following.

33.2 Schur’s Lemma


Theorem 33.2.1 (Schur’s Lemma). If V, V 1 are irreducible representations, then:

1. If V ‰ V 1 , then HomG pV, V 1 q “ t0u (note here I mean equivalence as representations,


not just vector spaces) for any ground field.

2. Over an algebraically closed ground field F, then HomG pV, V 1 q “ F idV .

93
Proof. The first part is relatively easy. Given T : V → V', we know that ker T ⊂ V is a subspace, and since T is a G-homomorphism, ker T is a subrepresentation. V is irreducible, so ker T is 0 or V, and similarly Im T is 0 or V'. If ker T = V then T = 0, and likewise if Im T = 0 then T = 0. If T ≠ 0, then ker T = 0 and Im T = V', implying that T is an isomorphism. But V ≇ V', so there can be no isomorphism, and hence there are no nonzero G-homomorphisms T : V → V'.
For the second part, note that T ∈ End_G(V) has an eigenvalue λ (here we use that F is algebraically closed). ker(T − λI) ≠ 0 and is a subrepresentation, so we must have ker(T − λI) = V, implying that T = λI.
If the field is not algebraically closed, we still know that End_G(V) is an algebra, and it is still true that T has no eigenvalues in F if T is not a multiple of the identity. But this algebra might be strictly larger.

Example 33.2.2. Consider G = Z/4Z and consider representations over F = R. Our representation is V = R² with the action of the generator g being rotation by 90 degrees, i.e. (x, y) ↦ (−y, x). This is an irreducible representation, but there are G-endomorphisms over R that are not multiples of the identity, for instance the action of g itself. In fact, End_G(V) ≅ C.
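A quick numerical check of this example (a sketch assuming numpy; the vectorization trick is an illustrative way to compute the commutant): solving the linear condition TJ = JT shows the commutant of the rotation is 2-dimensional, i.e. {aI + bJ} ≅ C.

```python
import numpy as np

J = np.array([[0., -1.], [1., 0.]])  # rotation by 90 degrees, the action of g

# T commutes with J  <=>  (I ⊗ J - J^T ⊗ I) vec(T) = 0, using vec(AXB) = (B^T ⊗ A) vec(X).
K = np.kron(np.eye(2), J) - np.kron(J.T, np.eye(2))
null_dim = 4 - np.linalg.matrix_rank(K)
print(null_dim)  # 2: End_G(V) = { aI + bJ }, a copy of C as an R-algebra
```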

This implies that two decompositions of V into irreducibles have the same multiplicities.
Note that dim HomG pV, V0 q for V0 irreducible is the number of appearances of V0 in the
decomposition of V into irreducibles, by what we’ve shown above. So without knowing
anything about the character, we can compute how many times V0 has to appear in any
decomposition.

33.3 Proving Orthonormality of Characters


Let V be any representation and consider

(1/|G|) Σ_{g∈G} χ_V(g) = ⟨χ_V, 1⟩ = ⟨χ_V, χ_triv⟩.

We claim that this is the number of times that the trivial representation occurs in V in any decomposition, i.e.

⟨χ_V, 1⟩ = dim V^G
where V G is the space of G-invariants, or vectors v P V such that for all g P G we have
gv “ v. This formula holds any time char F does not divide |G|.
Proof. Let P ∈ End(V) be defined by

P = (1/|G|) Σ_{g∈G} ρ(g).

Then note that

P ρ(h) = (1/|G|) Σ_{g∈G} ρ(g)ρ(h) = P

94
and similarly ρ(h)P = P. Thus P sends an element v to the average of the elements in its orbit, where the orbit is just the set of ρ(g)v for g ∈ G. Note that P² = P since

P² = (1/|G|²) Σ_{g,h} ρ(gh) = (1/|G|) Σ_g ρ(g) = P

as desired.
Thus P has eigenvalues 1 and 0, and we can decompose V = V_1 ⊕ V_0 since there are no generalized eigenvectors (P satisfies x² − x = 0, which has no repeated roots). Further, V_1 = V^G, so

dim V^G = Tr P = (1/|G|) Σ_{g∈G} Tr ρ(g) = (1/|G|) Σ_{g∈G} χ_V(g)

as desired.
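A minimal sketch of this projection for the S_3 permutation representation on C³ (assuming numpy and itertools; illustrative only): P is idempotent and its trace is 1, the dimension of the invariants, which are spanned by (1, 1, 1).

```python
import itertools
import numpy as np

n = 3
G = list(itertools.permutations(range(n)))

def perm_matrix(sigma):
    M = np.zeros((n, n))
    for i in range(n):
        M[sigma[i], i] = 1
    return M

P = sum(perm_matrix(g) for g in G) / len(G)
print(np.allclose(P @ P, P))   # True: P is a projection
print(round(np.trace(P)))      # 1 = dim V^G; the invariants are spanned by (1, 1, 1)
```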
This proves the orthonormality formula for the case that one representation is the trivial representation. We'll now want to generalize this by describing the action of G on Hom(V, W) and the dual representation. Observe that

Hom_G(V, W) = (Hom(V, W))^G

if you are careful with how G acts on Hom(V, W). Given T : V → W, define the action of g so that the square with T : V → W on top, g acting vertically on both sides, and g · T : V → W on the bottom commutes, which implies that

g · T = g ∘ T ∘ g^{-1}.

If we have g · T = T for all g, then T is a G-homomorphism, so T ∈ Hom_G(V, W). Thus Hom(V, W) is a representation.
Another way of thinking of this is by noting that Hom(V, W) ≅ V* ⊗ W. We'd like to define an action of G on V* by taking the dual of ρ(g), but duals compose in the reverse order. Thus we define the action of g as the dual of the action of g^{-1}, since inverses also multiply in the reverse order. Thus the dual representation or contragredient representation is ρ*(g) = (ρ(g^{-1}))*. With that, Hom(V, W) as a G-representation is the same as V* ⊗ W with the usual tensor product and dual representations.
Now,

χ_{Hom(V,W)}(g) = χ_{V*⊗W}(g) = χ_{V*}(g) χ_W(g) = χ_V(g^{-1}) χ_W(g) = \overline{χ_V(g)} χ_W(g).

Thus we find that

dim Hom_G(V, W) = dim(Hom(V, W))^G = ⟨χ_{V*⊗W}, 1⟩ = (1/|G|) Σ_g \overline{χ_V(g)} χ_W(g) = ⟨χ_W, χ_V⟩ = ⟨χ_V, χ_W⟩

(the last equality holds since both sides are real), and we are done.

95
34 11/29/17
34.1 Group Actions
We already know that maps ρ : G → GL(V) are called representations and maps ϕ : G → {permutations of S} are called permutation representations. These are also known as group actions: G acts on S. Since this map must be a homomorphism, we require that for all g1, g2 ∈ G and all s ∈ S, ϕ(g1 g2) = ϕ(g1)ϕ(g2), so

(ϕ(g1 g2))(s) = (ϕ(g1) ∘ ϕ(g2))(s).

It follows that ϕ(id)(s) = s and ϕ(g^{-1}) = (ϕ(g))^{-1}. Note: we often write g · s instead of (ϕ(g))(s).
We can now define an equivalence relation: s ∼ s' iff there exists g ∈ G such that ϕ(g)s = s'. The orbit of s is its equivalence class, i.e. the set of all elements of the form ϕ(g)s for g ∈ G. Because this is an equivalence relation, S is the disjoint union of its orbits. We say that G acts transitively if S is its own orbit, i.e. there is just one orbit.
The stabilizer of s ∈ S is the subgroup consisting of all g ∈ G such that ϕ(g)s = s. It is a subgroup since it is closed under multiplication and inverses. Stabilizers do not need to be normal subgroups. If G is finite, then g ↦ ϕ(g)s maps G onto the orbit of s. Every element of the orbit arises the same number of times, since ϕ(g)(s) = ϕ(g')(s) implies ϕ(g'^{-1}g)(s) = s, i.e. g'^{-1}g ∈ Stab(s) and g ∈ g' Stab(s). These cosets all have the same size, so each element arises the same number of times: the map is a |Stab(s)|-to-1 map. This gives us:
Lemma 34.1.1 (Orbit-Stabilizer). |G| = |Stab(s)| · |orbit of s|.
Further, we have the following (similar to Cayley’s theorem that any group is a subgroup
of the permutation group):
Theorem 34.1.2. Any subgroup is the stabilizer of a transitive group action.
Proof. Given some subgroup H, let G act on the set of cosets xH by g : xH ↦ gxH. This action is transitive, and the stabilizer of the coset H is exactly H.
So we get (something that I’m still unconvinced is big enough to be a theorem):
Theorem 34.1.3 (Lagrange). Any subgroup has order dividing the order of the group: |H| divides |G|.
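A tiny computational check of the orbit–stabilizer lemma above (a pure-Python sketch; S_3 encoded as tuples of images, an illustrative choice):

```python
import itertools

n = 3
G = list(itertools.permutations(range(n)))

def act(g, s):
    return g[s]          # S_3 acting on the set {0, 1, 2}

s = 0
orbit = {act(g, s) for g in G}
stab = [g for g in G if act(g, s) == s]
assert len(G) == len(stab) * len(orbit)   # 6 = 2 * 3
print(len(orbit), len(stab))
```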

34.2 Permutation Representations


A very common way in mathematics to study any kind of group action is to extract something
linear and then apply representation theory. Given an action on G by permutations of S,
you can get out a linear representation C^S. We define ρ(g)(e_s) = e_{ϕ(g)(s)} and extend by linearity. Now considering f : S → C, written as the function s ↦ a_s, we have that ρ(g)(f) is the function sending ϕ(g)(s) ↦ a_s, i.e. s ↦ a_{ϕ(g)^{-1}(s)},
96
implying that

ρ(g)f = f ∘ (ϕ(g)^{-1}).
Note the important inverse sign.

Example 34.2.1. Consider G = S3. This is the smallest nonabelian group. Characters are class functions, and conjugacy classes in S_n correspond to cycle structures (exercise). The cycle structure is just the number and sizes of the cycles in the cycle decomposition of an element of S_n. These correspond to partitions of n, which grow like e^{C√n} (per Ramanujan's formula). For n = 3, fortunately, we have 3 partitions: 3 = 1 + 1 + 1 = 2 + 1 = 3. The first is the identity (1 element), the second is the transpositions (3 elements), and the last is the 3-cycles (2 elements).
We want to determine the character table; see Table 34.2.1. We always have a trivial character χ1. The next character is the sign, which we discussed in connection with the determinant and whose values you have already computed; it is a representation since S3 → {±1} → GL(C) by sending 1 ↦ 1, −1 ↦ −1. We know that the rows of the character table have to be orthonormal and there can be at most 3 of them. We are looking for some other representation, and we can guess what it has to be by orthonormality. Thus we get χ2.

             1+1+1 (size 1)   2+1 (size 3)   3 (size 2)
χ1 (triv)          1                1             1
χ2                 2                0            −1
χ3 (sign)          1               −1             1

Table 34.2.1: The Character Table of S3.

But we know where this representation comes from. We clearly have a map φ : G Ñ
tpermutations of t1, 2, 3uu. The character is the number of fixed points p3, 1, 0q and it has a
copy of χ1 in it, so subtracting that off you get χ2 .
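A small sketch verifying the table numerically (assuming numpy; the class sizes 1, 3, 2 are as above): the rows are orthonormal for the class-size-weighted inner product, and the fixed-point character (3, 1, 0) decomposes as χ1 + χ2.

```python
import numpy as np

class_sizes = np.array([1, 3, 2])          # conjugacy classes: identity, transpositions, 3-cycles
order = class_sizes.sum()                  # |S3| = 6
chi1 = np.array([1, 1, 1])                 # trivial
chi2 = np.array([2, 0, -1])                # 2-dimensional irreducible
chi3 = np.array([1, -1, 1])                # sign

def inner(a, b):
    # <a, b> = (1/|G|) sum over classes of |class| * a * conj(b)
    return np.sum(class_sizes * a * np.conj(b)) / order

chars = [chi1, chi2, chi3]
print([[round(float(inner(a, b)), 6) for b in chars] for a in chars])  # identity matrix

# The permutation character (fixed points) decomposes as chi1 + chi2.
perm = np.array([3, 1, 0])
print([round(float(inner(perm, c))) for c in chars])  # [1, 1, 0]
```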

Artin goes through a number of other examples including the icosahedral group (group
of symmetries of the icosahedron). We’ll go back to some of the fundamental results of
representation theory.

34.3 Group Algebra


There is an underlying structure C[G] called the group algebra. It's the vector space C^G but with a multiplication operation e_g e_h = e_{gh} (extended by linearity). Explicitly, we see that

(Σ_g a_g e_g)(Σ_h b_h e_h) = Σ_g Σ_h a_g b_h e_{gh} = Σ_{k∈G} (Σ_g a_g b_{g^{-1}k}) e_k,

which is like convolution, just generalized to nonabelian groups.
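A minimal sketch of this multiplication (pure Python; the dictionary encoding of elements of C[G] and the S_3 tuple encoding are illustrative choices):

```python
import itertools
from collections import defaultdict

n = 3
G = list(itertools.permutations(range(n)))

def mult(g, h):
    return tuple(g[h[i]] for i in range(n))

def convolve(a, b):
    # a, b: dicts g -> coefficient, representing sum_g a_g e_g in C[G]
    c = defaultdict(complex)
    for g, ag in a.items():
        for h, bh in b.items():
            c[mult(g, h)] += ag * bh
    return dict(c)

e = tuple(range(n))
t = (1, 0, 2)                    # a transposition, so t * t = e
a = {e: 1, t: 2}
print(convolve(a, a))            # (e + 2t)^2 = 5e + 4t
```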


Note that representations of G are actually C[G]-modules. Specifically,

(Σ_g a_g e_g) · v = Σ_g a_g (ρ(g))(v).

97
Thus you can interpret a lot of what we're doing in terms of modules over C[G]. For example, the regular representation is just C[G] acting on itself by multiplication. Any subrepresentation inside C[G] is a left ideal (an irreducible one if the representation is irreducible).
We can compute the center Z(C[G]) of this group algebra (the center is the set of elements that commute with everything in the algebra). We must have

e_g (Σ_h a_h e_h) = (Σ_h a_h e_h) e_g

for all g. Comparing coefficients, this implies that a_{gh} = a_{hg} for all g, h. Two elements can be written as gh and hg iff they are conjugate, thus g ∼ g' implies that a_g = a_{g'}. Thus g ↦ a_g defines a class function, so Z(C[G]) corresponds to the set of class functions. By the theorem below, the characters then form a basis of the center of the group algebra.

Theorem 34.3.1. The characters comprise an orthonormal basis for the set of class func-
tions.

Proof. Suppose that ϕ : G → C is a class function that is orthogonal to all characters. We'll show that ϕ = 0. This implies the desired result, since we already have orthonormality and thus linear independence of our characters, and if our characters did not span, we could take the orthogonal complement and find such a ϕ.
Let (V, ρ) be any representation. Consider the map T : V → V given by

T = (1/|G|) Σ_g ϕ(g) ρ(g).

ϕ is a class function, so (1/|G|) Σ_g ϕ(g) e_g lies in Z(C[G]); hence T ρ(h) = ρ(h) T for all h ∈ G, and T is a G-endomorphism. Suppose V is irreducible, so by Schur T = cI where c = Tr T / dim V. However,

Tr T = (1/|G|) Σ_{g∈G} ϕ(g) χ(g) = 0

by assumption (note that Σ_g ϕ(g)χ(g) = |G| ⟨ϕ, \overline{χ}⟩, and \overline{χ} is again an irreducible character), so T = 0. Now any V (irreducible or not) is the direct sum of irreducibles; T acts on each one of these by 0, so T = 0 on V. Apply this to V = C[G]. Then 0 = T e_1 = (1/|G|) Σ_g ϕ(g) e_g, so ϕ = 0, as desired.

98
35 12/1/17
Final exam: December 8 to 11. Handed out around noon and due in the evening (exact time
TBD).
Today: representation theory finale.

35.1 Projection Operators


We’ve used the result several times that if T : V Ñ V such that T 2 “ I then V “ V´1 ‘ V1 .
We can also do this for any other polynomial with no repeated roots. Note that V now is a
representation of Z?2Z where 1 pmod 2q ÞÑ T . We know that Z{2Z has two reprsentations:
a trivial one p1, 1q and a nontrivial one p1, ´1q. Note that in V´1 we have T v “ ´v so any
element generates a copy of our nontrivial representation. V1 contains a bunch of copies of
the trivial representation. Thus we may write
1`T 1´T
V1 “ V, V´1 “ V.
2 2
These formulas generalize to representations. Given any finite group G and any irreducible representations (V, ρ) and (V', ρ'), we have constructed the operator

(dim V / |G|) Σ_g \overline{χ(g)} ρ'(g).

The coefficients form a class function, so this operator is a G-endomorphism of V', and by Schur it is a scalar multiple of the identity. By taking the trace and using orthogonality of characters, we find that this operator is 0 if V' ≇ V and the identity if V' ≅ V.
Now consider any representation (W, ϕ). We can then consider the operator

(dim V / |G|) Σ_g \overline{χ(g)} ϕ(g),

which by the above argument must be a projection W → ⊕_{Vi ≅ V} Vi. This thus extracts out all of the parts of W that are isomorphic to V; this argument works even in the infinite case. The image is called the ρ-isotypic subspace.
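A small sketch of these projectors for the S_3 permutation representation on C³ (assuming numpy and itertools; illustrative only): the trivial and 2-dimensional isotypic projectors have traces 1 and 2 and sum to the identity.

```python
import itertools
import numpy as np

n = 3
G = list(itertools.permutations(range(n)))

def perm_matrix(g):
    M = np.zeros((n, n))
    for i in range(n):
        M[g[i], i] = 1
    return M

def char_value(g, which):
    fixed = sum(1 for i in range(n) if g[i] == i)
    if which == "trivial":
        return 1
    if which == "standard":
        return fixed - 1          # chi2 = permutation character minus trivial
    raise ValueError(which)

def isotypic_projector(which, dim):
    # (dim V / |G|) * sum_g conj(chi(g)) * rho(g)
    P = sum(np.conj(char_value(g, which)) * perm_matrix(g) for g in G)
    return dim * P / len(G)

P_triv = isotypic_projector("trivial", 1)
P_std = isotypic_projector("standard", 2)
print(round(np.trace(P_triv)), round(np.trace(P_std)))   # 1 and 2: dimensions of the isotypic pieces
print(np.allclose(P_triv + P_std, np.eye(n)))            # True: C^3 = trivial ⊕ standard
```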
Note that, for example, End_G(V ⊕ V) = Hom_G(V ⊕ V, V ⊕ V) ≅ M_{2×2}(C). If you know about algebraic integers, you may note that (|G|/dim V)·id must have eigenvalues that are algebraic integers by this kind of argument, so we must have dim V | |G|.

35.2 Other Orthogonality Relations


We know that the rows of the character table (a square matrix) are orthogonal, but they're orthogonal in this weird way that depends on the sizes of the conjugacy classes and |G|. In general, we have to insert a diagonal matrix to make this work. For example, in S3 we have
$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & -1 \\ 1 & -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 1 \\ 1 & 0 & -1 \\ 1 & -1 & 1 \end{pmatrix} = 6 \cdot \mathrm{id}.$$

99
Let the character table be T and the diagonal matrix of class sizes be D; we have

(1/|G|) T D T^H = id.

Now conjugate both sides by T to get

D T^H T = |G| id

and therefore

T^H T = |G| D^{-1}.
This gives you inner products of columns with other columns, and it appears we have another orthogonality relationship, where the squared norm of each column is the group order divided by the size of the conjugacy class. This is the size of the stabilizer under conjugation (since the orbit is the conjugacy class), i.e. the set of all h such that h^{-1}gh = g, or gh = hg; in other words, all h that commute with g. This is the centralizer of g.
Thus we have:
Theorem 35.2.1. For all g, h ∈ G,
$$\sum_\chi \chi(g)\overline{\chi(h)} = \begin{cases} 0 & g \text{ not conjugate to } h, \\ |\text{centralizer of } g| & g \text{ conjugate to } h. \end{cases}$$

If you pick g = h = id, then this is equivalent to saying that

Σ_ρ (dim ρ)² = |G|.

We’ll now see that this is actually a structural fact.

35.3 Group Ring Decomposition


Note that

dim ⊕_{irreducible V} End_C(V) = Σ_V (dim V)² = |G| = dim C[G].
We’ll see now that
i : CrGs » ‘V EndC pV q
so this is actually a natural fact.
The map i is defined by g ÞÑ ‘V ρpgq (this is really the only reasonable map). i is clearly
a homomorphism since
ipgqiphq “ ‘V ρpgqρphq “ ‘V ρpghq “ ipghq.
This is actually an isomorphism. Since we already have the sum-of-squares formula, we only
need to show one of injectivity or surjectivity.
We’ll show injectivity, i.e. ker i “ 0. But we know that CrGs “ ‘V V ‘ dim V as represen-
tations (where the LHS is the regular representation). So if A P ker i, then A must act as 0
on each V and so also on CrGs. This takes the identity to 0, so A “ 0, as desired.

100
36 12/4/17
Today: review session + last Math Night.
Final exam handed out at noon in the 4th floor common room on Friday; due next Monday.

36.1 Problem Set 3 Problem 10


The problem considers a finite field F with |F| = q and an integer e with 0 < 2e < q. We have the evaluation map P_{q−2e} → F^F. We are considering a linear error-correcting code: a polynomial P with q − 2e + 1 coefficients a_0, ..., a_{q−2e}, and we send the values P(0), P(1), .... We expect not more than e of them to be corrupted. If there were no corruption, then we would have q linear equations in the coefficients and could easily solve this system of linear equations to rederive the coefficients (since we have more equations than unknowns).
If we had two codewords w and w' that are both corrupted to the same received word v by at most e errors each, then we're screwed, since we can't distinguish between w and w'. So first we want to show that this doesn't happen: such w and w' would be separated by at most 2e errors (triangle inequality), and the beginning of the problem is to show that two distinct codewords can't be that close.
Suppose you now have at most e entries changed, say d entries. We now want to actually find the original polynomial. This is in general a hard problem, since there are $\binom{q}{d}$ possible sets of error positions, which is far too many to try naively. However, picking errors x_1, ..., x_d and supposing that the polynomial we should have received is A, observe that

v(x) · (x − x_1)⋯(x − x_d) = A(x) · (x − x_1)⋯(x − x_d)

as functions in F^F, since the two sides disagree at most at the x_i, where both are 0. Writing P(x) = (x − x_1)⋯(x − x_d), this equation reads v_x P(x) = A(x)P(x) for all x ∈ F, which is another system of linear equations in the unknown coefficients of P (a polynomial of degree at most e) and of the product A(x)P(x). The problem required you to show that you could solve this linear system as soon as d was the number of errors (and not before).
The procedure is as follows. Try it for 0 errors; if it works, you're done. If not, try it with 1 error, i.e. solve v_x(x − x_1) = A(x) where A(x) now has degree q − 2e + 1; this allows us to recover the original polynomial. If that doesn't work, try v_x(x² − ax + b) = A(x) and solve those linear equations, and so on. Since e < q, in at most e attempts at solving simultaneous linear equations you will find the error-locating polynomial and can then recover your polynomial.
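Here is a minimal decoding sketch in this spirit (a Berlekamp–Welch style formulation; the parameters q, k, e, the test polynomial, and all helper names are illustrative assumptions, not the problem set's exact notation). For each candidate error count d it sets up the homogeneous linear system "received(x)·E(x) = Q(x) for all x" and stops once a nonzero solution exists; the roots of the error locator E are then the corrupted positions, after which the message polynomial can be recovered by interpolating on the remaining points.

```python
q, k, e = 13, 4, 2        # prime field size, number of message coefficients, error bound

def poly_eval(coeffs, x):
    return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q

def solve_homogeneous(rows, ncols):
    """Return a nonzero solution v of rows . v = 0 over F_q, or None if only v = 0 works."""
    M = [row[:] for row in rows]
    pivot_cols, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(M)) if M[i][c]), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        inv = pow(M[r][c], q - 2, q)                       # inverse mod q by Fermat
        M[r] = [x * inv % q for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % q for a, b in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    free = [c for c in range(ncols) if c not in pivot_cols]
    if not free:
        return None
    v = [0] * ncols
    v[free[0]] = 1                                         # pick one free variable
    for row, c in zip(M, pivot_cols):
        v[c] = -row[free[0]] % q
    return v

A = [3, 1, 4, 1]                                           # the message polynomial, degree < k
received = [poly_eval(A, x) for x in range(q)]
received[2] = (received[2] + 5) % q                        # corrupt two positions
received[7] = (received[7] + 1) % q

for d in range(e + 1):
    # Unknowns: error locator E (deg <= d) and Q = A*E (deg < k + d), with
    # received[x] * E(x) - Q(x) = 0 for every x in F_q.
    rows = []
    for x in range(q):
        row = [received[x] * pow(x, i, q) % q for i in range(d + 1)]   # E coefficients
        row += [(-pow(x, i, q)) % q for i in range(k + d)]             # Q coefficients
        rows.append(row)
    sol = solve_homogeneous(rows, (d + 1) + (k + d))
    if sol is not None:
        E = sol[:d + 1]
        print("errors at", [x for x in range(q) if poly_eval(E, x) == 0])  # [2, 7]
        break
```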

36.2 Finite Group Theory


This is an overview of material typically covered in Math 122 and also in Artin Chapter 6. We've avoided this because representation theory fits nicely into linear algebra but the Sylow theorems do not; further, the Sylow theorems have restricted uses in math (mostly a bit in Galois theory). We'll try to cover some of this material that's relevant for Math 123.
Given a finite group G and a subgroup H ⊂ G (some brilliant person decided that using H ≤ G to denote this was a smart idea), we've already seen that |H| divides |G| (Artin 6.11).

Corollary 36.2.1. For any g ∈ G, we know there exists a smallest possible n (the order of g) such that g^n = 1. Then n divides |G|.

101
Proof. Consider the subgroup generated by g, i.e. H “ xgy “ t1, g, g 2 , . . . , u. Now apply
Lagrange’s theorem.

Corollary 36.2.2. If G has prime order, then G is cyclic.

Proof. There exists g ≠ 1, so the order of g is not 1. Thus the order of g is p = |G|, implying G is cyclic.
ˇ
Theorem 36.2.3 (Cauchy’s Theorem). If pˇ|G|, then there exists g P G such that g has
ˇ
order p.

Proof. Consider the case p = 2. We're trying to find an element g = g^{-1} other than the identity. The map g ↦ g^{-1} is an involution of G, i.e. a 2-element group acting on G. We have orbits of size 1 and 2: orbits {g} for g = g^{-1} and {g, g^{-1}} otherwise. Since |G| is even, the number of size-1 orbits has to be even. There is one such orbit {1}, so there must be another, which is not the identity, and we have our element g = g^{-1} of order 2.
We now want to generalize this. In general, have the cyclic group Z/pZ act on ordered tuples (g_1, ..., g_p) such that g_1 ⋯ g_p = 1. The number of such tuples is |G|^{p−1}, since we can pick g_1, ..., g_{p−1} arbitrarily. Have the cyclic group act by having 1 send (g_1, ..., g_p) ↦ (g_2, ..., g_p, g_1); since g_1 ⋯ g_p = 1 implies g_2 ⋯ g_p g_1 = 1, this is a valid action. Now repeat the argument above, noting that fixed points are of the form (g, ..., g) with g^p = 1 and that all other orbits have size p.
Note that the number of elements of order p is congruent to −1 (mod p), and the number of subgroups of order p is congruent to 1 (mod p) (since elements of order p map to such subgroups by a (p − 1)-to-1 map).
What if |G| is not a multiple of p? Then if we let |G| = n, we see that n^{p−1} ≡ 1 (mod p), giving a proof of Fermat's little theorem.
What if we ask for subgroups whose order is not prime? Then this doesn't hold: for G = A_4 (the alternating group on 4 letters, of order 12), there is no subgroup H of order 6, even though 6 divides 12 = |G|. However, we have Sylow's theorems, which generalize this fact.

Theorem 36.2.4 (Sylow’s Theorems). If pk |G, then there exists H Ă G such that |H| “ pk .
If |G| “ pe m such that m ı 0 pmod pq, then such H is called a p-Sylow group. Any two
such groups are conjugate in G and any p-power subgroup is contained in one of those. The
number of such p-Sylow subgroups is congruent to 1 pmod pq.

Note: a p-group is a group whose order is a power of p.


We can use this theorem to prove some useful things. If G = GL_n(k) where |k| = q = p^f, then

|G| = (q^n − 1)(q^n − q) ⋯ (q^n − q^{n−1}) = q^{(n²−n)/2} ∏_{m=1}^{n} (q^m − 1).

The latter product is ≡ ±1 (mod p), hence prime to p, while the first factor is the full power of p dividing |G|. Sylow's theorem promises that G has a subgroup of order q^{(n²−n)/2}. The unipotent matrices (upper-triangular matrices with 1s on the diagonal) form such a subgroup, the unipotent subgroup.
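For the smallest interesting case, n = 2 and q = 3, here is a brute-force sanity check (a pure-Python sketch, illustrative only) of the order formula and of the count of unipotent matrices:

```python
import itertools

n, q = 2, 3

def det2(m):
    # determinant of a 2x2 matrix over F_q
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % q

gl_count = sum(1 for flat in itertools.product(range(q), repeat=n * n)
               if det2([flat[:2], flat[2:]]) != 0)

formula = q ** ((n * n - n) // 2)
for m in range(1, n + 1):
    formula *= q ** m - 1

unipotent_count = q ** ((n * n - n) // 2)   # upper triangular with 1s on the diagonal

print(gl_count, formula, unipotent_count)   # 48 48 3: the unipotent subgroup is a 3-Sylow of GL_2(F_3)
```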

102
Proof Outline of Sylow. If |G| = p^e with e > 0, G can be partitioned into conjugacy classes. Each conjugacy class is an orbit of the action of G on itself by conjugation, and the possible orders of these orbits are 1, p, p², ..., p^{e−1}. In particular, orbits of size 1 have to occur at least p times, by the same argument as before. A conjugacy class of size 1 is a group element g such that hgh^{-1} = g for all h, i.e. an element of the center. Thus Z(G) ≠ 1 and has size a multiple of p, so there is a nontrivial normal subgroup of G. Now consider G/Z(G), apply the same argument, and keep quotienting out. This gets you subgroups of every p-power order.
To show the existence of Sylow subgroups in general, consider subsets of G of size p^e. There are $\binom{p^e m}{p^e}$ of these, which is not a multiple of p (Lucas's theorem). If G acts on this collection of subsets, there is an orbit whose size p does not divide. The stabilizer of any element in this orbit has size p^e and is thus our desired p-Sylow subgroup.
Note that p-groups are not very nice. Abelian p-groups are pretty nice; they're just ⊕(Z/p^i Z) and we can classify them by partitions. But non-abelian p-groups are not nice. Suppose we have an extension

1 → (Z/pZ)^a → G → (Z/pZ)^b → 1.

Let V = (Z/pZ)^a and W = (Z/pZ)^b. You get a map ∧²W → V by pulling back g ↦ \tilde{g}, h ↦ \tilde{h} and forming the commutator (g, h) ↦ [\tilde{g}, \tilde{h}] = \tilde{g}\tilde{h}\tilde{g}^{-1}\tilde{h}^{-1}; you can check that this is actually a linear map ∧²W → V. The vector space of such maps has dimension $\binom{b}{2} a$, which (for n = a + b fixed) is maximized around a = n/3, b = 2n/3, where it is roughly 2n³/27. Thus the number of groups of order p^n is around p^{2n³/27 − O(n²)}, which is way, way more than p^n.

103
Index

G-invariants, 94 commutative diagrams, 35


λ-eigenspace, 42 commutator, 100
F-algebra, 45, 74 complement, 22
F-algebra homomorphism, 45 cone, 83
F-linear transformation, 25 conjugacy classes, 91
ρ-isotypic subspace, 99 conjugate, 69
n-tuples, 12 conjugate symmetric, 60
p-Sylow group, 102 contravariant functor, 36
p-group, 102 coset, 31
(linear) representation, 89
degree, 47, 48
abelian group, 9 determinant, 72, 76
action, 89 diagonalizable, 50
adjacency matrix, 66 dimension, 22
adjoint, 64 direct sum, 13, 18
adjugate matrix, 79 distance, 57
adjunct matrix, 79 dual, 26
Algebra, 89 dual basis, 33
algebraic multiplicity, 86 dual representation, 95
algebraically closed, 48 dual vector space, 33
alternating, 75
alternating group, 72 edges, 66
associative F-algebra, 74 eigenvalue, 42
axiom, 8 eigenvectors, 42
elementary matrices, 81
basis, 21 endomorphism ring, 45
bilinear, 54 equivalence classes, 31
bilinear form, 55 exact at V , 35
block upper triangular matrix, 42 exact sequence, 35
exterior algebra, 76
Cartesian product, 13 external direct sum, 13
Categories, 36
center, 98 factors completely, 47
character, 91 faithful, 89
characteristic, 16 field, 9
characteristic polynomial, 73, 85 filtration, 50
characters of the group, 91 finite dimensional, 17
class function, 91 finite field, 16
closure, 13 finite-dimensional, 18
codimension, 32 finitely generated, 17, 18
cokernel, 28 flag, 50
column rank, 39 functionals, 26
column reduction, 80 functors, 36

104
general linear group, 69 nullity, 27
generalized λ-eigenspace, 86 nullspace, 27
geometric multiplicity, 86
graded, 74 orbit, 95, 96
graphs, 66 order, 91, 101
group, 9 orthogonal, 60
group actions, 96 orthogonal basis, 61, 63
group algebra, 97 orthogonal direct sum, 61
group homomorphism, 24, 78 orthogonal matrices, 70
group of general linear transformations, 39 orthonormal basis, 61, 63

Hamel bases, 19 permutation representation, 90


Hermitian transpose, 64 polarization identity, 63
homomorphism, 15 positive operator, 83
positive-definite operator, 83
ideal, 15, 46 principal ideal domain, 46
identity matrix, 30 pure tensor, 52
image, 27
injection, 15 quotient map, 32
injective, 28 quotient representation, 90
inner product, 57 quotient space, 31
intertwines, 90
invariant subspace, 41 range, 27
inversions, 73 rank, 27, 39
involution, 55 regular representation, 90
involutions, 60 ring, 10
irreducible, 89 ring homomorphism, 15
root, 47
kernel, 15, 27, 78 row rank, 39
linear combination, 16 self-adjoint, 64
linear dependence, 19 short exact sequence, 35
linearity, 8 signature, 63
linearly independent set, 19 simple transposition, 73
minimal polynomial, 46 space of linear transformations, 26
modules, 12 span, 16
Moore graph, 66 special linear group, 79
stabilizer, 96
negative-definite, 83 subfields, 13
nilpotent, 86 subrepresentation, 90
nilspace, 86 subspace, 14
Noetherian, 21 surjective, 28
non-degenerate, 60 symmetric algebra, 74
normal, 64 Symmetric bilinear forms, 55
normal subgroup, 78 symmetric group, 72
norms, 57 symmetric space, 55

105
tensor algebra, 74 unitary representation, 90
tensor product, 52 universal property, 26
tensor product of linear transformations, upper triangular matrix, 42
54 upper-triangular matrix, 50
trace, 56
transitivitely, 96 vector space, 12
transpose, 38 vectors, 12
trivial representation, 89 vertices , 66

wedge product, 75
unipotent matrices, 102 wedge product of linear transformations,
unipotent subgroup, 102 75
unit, 79
unit disk, 59 zero, 47
unitary matrices, 70 zero vector space, 13

106
