Abstract Algebra Applications To Galois Theory, Algebraic Geometry, Representation Theory and Cryptography (Celine Carstensen-Opitz, Benjamin Fine Etc.)

This textbook provides an introduction to abstract algebra, covering topics such as groups, rings, fields, polynomials, field extensions, Galois theory, and applications to number theory. The book assumes some familiarity with linear algebra and calculus and is intended to be suitable for a full year undergraduate course in abstract algebra.


Celine Carstensen-Opitz, Benjamin Fine, Anja Moldenhauer, and
Gerhard Rosenberger
Abstract Algebra
Also of Interest
Algebra and Number Theory. A Selection of Highlights
Benjamin Fine, Anthony Gaglione, Anja Moldenhauer,
Gerhard Rosenberger, Dennis Spellman, 2017
ISBN 978-3-11-051584-8, e-ISBN (PDF) 978-3-11-051614-2,
e-ISBN (EPUB) 978-3-11-051626-5

Geometry and Discrete Mathematics. A Selection of Highlights


Benjamin Fine, Anthony Gaglione, Anja Moldenhauer,
Gerhard Rosenberger, Dennis Spellman, 2018
ISBN 978-3-11-052145-0, e-ISBN (PDF) 978-3-11-052150-4,
e-ISBN (EPUB) 978-3-11-052153-5

A Course in Mathematical Cryptography


Gilbert Baumslag, Benjamin Fine, Martin Kreuzer, Gerhard
Rosenberger, 2015
ISBN 978-3-11-037276-2, e-ISBN (PDF) 978-3-11-037277-9,
e-ISBN (EPUB) 978-3-11-038616-5

Abstract Algebra. An Introduction with Applications


Derek J. S. Robinson, 2015
ISBN 978-3-11-034086-0, e-ISBN (PDF) 978-3-11-034087-7,
e-ISBN (EPUB) 978-3-11-038560-1

Discrete Algebraic Methods. Arithmetic, Cryptography, Automata and Groups
Volker Diekert, Manfred Kufleitner, Gerhard Rosenberger,
Ulrich Hertrampf, 2016
ISBN 978-3-11-041332-8, e-ISBN (PDF) 978-3-11-041333-5,
e-ISBN (EPUB) 978-3-11-041632-9
Celine Carstensen-Opitz, Benjamin Fine,
Anja Moldenhauer, and Gerhard Rosenberger

Abstract Algebra
Applications to Galois Theory, Algebraic Geometry,
Representation Theory and Cryptography
Mathematics Subject Classification 2010
Primary: 11-01, 12-01, 13-01, 14-01, 16-01, 20-01, 20C15; Secondary: 01-01, 08-01, 94-01

Authors
Celine Carstensen-Opitz
Dortmund
Germany

Dr. Anja Moldenhauer
Hamburg
Germany

Prof. Dr. Benjamin Fine
Fairfield University
Department of Mathematics
1073 North Benson Road
Fairfield, CT 06430
USA

Prof. Dr. Gerhard Rosenberger
University of Hamburg
Department of Mathematics
Bundesstr. 55, 20146 Hamburg
Germany

ISBN 978-3-11-060393-4
e-ISBN (PDF) 978-3-11-060399-6
e-ISBN (EPUB) 978-3-11-060525-9

Library of Congress Control Number: 2019938926

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2019 Walter de Gruyter GmbH, Berlin/Boston


Cover image: Stephen Barnes / iStock / Getty Images
Typesetting: VTeX UAB, Lithuania
Printing and binding: CPI books GmbH, Leck

www.degruyter.com
Preface
Traditionally, mathematics has been separated into three main areas: algebra, anal-
ysis, and geometry. Of course, there is a great deal of overlap between these areas.
For example, topology, which is geometric in nature, owes its origins and problems
as much to analysis as to geometry. Furthermore, the basic techniques in studying
topology are predominantly algebraic. In general, algebraic methods and symbolism
pervade all of mathematics, and it is essential for anyone learning any advanced math-
ematics to be familiar with the concepts and methods in abstract algebra.
This is an introductory text on abstract algebra. It grew out of courses given to
advanced undergraduates and beginning graduate students in the United States, and
to mathematics students and teachers in Germany. We assume that the students are
familiar with calculus and with some linear algebra, primarily matrix algebra and the
basic concepts of vector spaces, bases, and dimensions. All other necessary material
is introduced and explained in the book. We assume, however, that the students have
some, but not a great deal of, mathematical sophistication. Our experience is that the
material in this text can be completed in a full year’s course. We presented the material
sequentially, so that polynomials and field extensions preceded an in-depth look at
group theory. We feel that a student who goes through the material in these notes
will attain a solid background in abstract algebra, and be able to move on to more
advanced topics.
The centerpiece of these notes is the development of Galois theory and its impor-
tant applications, especially the insolvability of the quintic polynomial. After intro-
ducing the basic algebraic structures, groups, rings, and fields, we begin the theory
of polynomials and polynomial equations over fields. We then develop the main ideas
of field extensions and adjoining elements to fields. After this, we present the nec-
essary material from group theory needed to complete both the insolvability of the
quintic polynomial and solvability by radicals in general. Hence, the middle part of
the book, Chapters 9 through 14, is concerned with group theory, including permutation
groups, solvable groups, abelian groups, and group actions. Chapter 14 is somewhat
off to the side of the main theme of the book. Here, we give a brief introduction
to free groups, group presentations and combinatorial group theory. With the group
theory material, we return to Galois theory and study general normal and separable
extensions and the fundamental theorem of Galois theory. Using this approach, we
present several major applications of the theory, including solvability by radicals and
the insolvability of the quintic, the fundamental theorem of algebra, the construction
of regular n-gons, and the famous impossibilities: squaring the circle, doubling the
cube, and trisecting an angle. We finish in a slightly different direction, giving an
introduction to algebraic and group-based cryptography.

https://doi.org/10.1515/9783110603996-201

October 2010 Celine Carstensen


Benjamin Fine
Gerhard Rosenberger
Preface to the second edition
We were very pleased with the response to the first edition of this book, and we were
very happy to do a second edition. In this second edition, we cleaned up various typos
pointed out by readers, and have added some new material suggested by them.
Here, we have to give a special thank you to Ahmad Mirzay. The most important
changes are worth mentioning: We added a new chapter, Chapter 22, on algebras and group
representations. This can be included in a year-long course. In Chapter 7, we added some
material on skew field extensions of ℂ and Frobenius’s theorem, and, in Chapter 17, on
solvability of polynomial equations. In the bibliography, we chose to mention some
interesting books and papers that are not used explicitly in our exposition, but are
very much related to the topics of the present book and could be helpful for additional
reading. As before, we would like to thank the many people who read or used
the first edition and made suggestions. We would also especially like to thank Anja
Rosenberger, who helped tremendously with editing and LaTeX, and made some invaluable
suggestions about contents. We would also like to thank Annika Schürenberg
and Leonard Wienke for the careful reading of the new edition.
Last but not least, we thank de Gruyter for publishing our book.

January 2019 Celine Carstensen-Opitz


Benjamin Fine
Anja Moldenhauer
Gerhard Rosenberger

https://doi.org/10.1515/9783110603996-202
Contents
Preface

Preface to the second edition

1 Groups, rings and fields
1.1 Abstract algebra
1.2 Rings
1.3 Integral domains and fields
1.4 Subrings and ideals
1.5 Factor rings and ring homomorphisms
1.6 Fields of fractions
1.7 Characteristic and prime rings
1.8 Groups
1.9 Exercises

2 Maximal and prime ideals
2.1 Maximal and prime ideals
2.2 Prime ideals and integral domains
2.3 Maximal ideals and fields
2.4 The existence of maximal ideals
2.5 Principal ideals and principal ideal domains
2.6 Exercises

3 Prime elements and unique factorization domains
3.1 The fundamental theorem of arithmetic
3.2 Prime elements, units and irreducibles
3.3 Unique factorization domains
3.4 Principal ideal domains and unique factorization
3.5 Euclidean domains
3.6 Overview of integral domains
3.7 Exercises

4 Polynomials and polynomial rings
4.1 Polynomials and polynomial rings
4.2 Polynomial rings over fields
4.3 Polynomial rings over integral domains
4.4 Polynomial rings over unique factorization domains
4.5 Exercises

5 Field extensions
5.1 Extension fields and finite extensions
5.2 Finite and algebraic extensions
5.3 Minimal polynomials and simple extensions
5.4 Algebraic closures
5.5 Algebraic and transcendental numbers
5.6 Exercises

6 Field extensions and compass and straightedge constructions
6.1 Geometric constructions
6.2 Constructible numbers and field extensions
6.3 Four classical construction problems
6.3.1 Squaring the circle
6.3.2 The doubling of the cube
6.3.3 The trisection of an angle
6.3.4 Construction of a regular n-gon
6.4 Exercises

7 Kronecker’s theorem and algebraic closures
7.1 Kronecker’s theorem
7.2 Algebraic closures and algebraically closed fields
7.3 The fundamental theorem of algebra
7.3.1 Splitting fields
7.3.2 Permutations and symmetric polynomials
7.4 The fundamental theorem of algebra
7.5 The fundamental theorem of symmetric polynomials
7.6 Skew field extensions of ℂ and Frobenius’s theorem
7.7 Exercises

8 Splitting fields and normal extensions
8.1 Splitting fields
8.2 Normal extensions
8.3 Exercises

9 Groups, subgroups, and examples
9.1 Groups, subgroups, and isomorphisms
9.2 Examples of groups
9.3 Permutation groups
9.4 Cosets and Lagrange’s theorem
9.5 Generators and cyclic groups
9.6 Exercises

10 Normal subgroups, factor groups, and direct products
10.1 Normal subgroups and factor groups
10.2 The group isomorphism theorems
10.3 Direct products of groups
10.4 Finite Abelian groups
10.5 Some properties of finite groups
10.6 Automorphisms of a group
10.7 Exercises

11 Symmetric and alternating groups
11.1 Symmetric groups and cycle decomposition
11.2 Parity and the alternating groups
11.3 Conjugation in Sn
11.4 The simplicity of An
11.5 Exercises

12 Solvable groups
12.1 Solvability and solvable groups
12.2 Solvable groups
12.3 The derived series
12.4 Composition series and the Jordan–Hölder theorem
12.5 Exercises

13 Group actions and the Sylow theorems
13.1 Group actions
13.2 Conjugacy classes and the class equation
13.3 The Sylow theorems
13.4 Some applications of the Sylow theorems
13.5 Exercises

14 Free groups and group presentations
14.1 Group presentations and combinatorial group theory
14.2 Free groups
14.3 Group presentations
14.3.1 The modular group
14.4 Presentations of subgroups
14.5 Geometric interpretation
14.6 Presentations of factor groups
14.7 Group presentations and decision problems
14.8 Group amalgams: free products and direct products
14.9 Exercises

15 Finite Galois extensions
15.1 Galois theory and the solvability of polynomial equations
15.2 Automorphism groups of field extensions
15.3 Finite Galois extensions
15.4 The fundamental theorem of Galois theory
15.5 Exercises

16 Separable field extensions
16.1 Separability of fields and polynomials
16.2 Perfect fields
16.3 Finite fields
16.4 Separable extensions
16.5 Separability and Galois extensions
16.6 The primitive element theorem
16.7 Exercises

17 Applications of Galois theory
17.1 Applications of Galois theory
17.2 Field extensions by radicals
17.3 Cyclotomic extensions
17.4 Solvability and Galois extensions
17.5 The insolvability of the quintic polynomial
17.6 Constructibility of regular n-gons
17.7 The fundamental theorem of algebra
17.8 Exercises

18 The theory of modules
18.1 Modules over rings
18.2 Annihilators and torsion
18.3 Direct products and direct sums of modules
18.4 Free modules
18.5 Modules over principal ideal domains
18.6 The fundamental theorem for finitely generated modules
18.7 Exercises

19 Finitely generated Abelian groups
19.1 Finite Abelian groups
19.2 The fundamental theorem: p-primary components
19.3 The fundamental theorem: elementary divisors
19.4 Exercises

20 Integral and transcendental extensions
20.1 The ring of algebraic integers
20.2 Integral ring extensions
20.3 Transcendental field extensions
20.4 The transcendence of e and π
20.5 Exercises

21 The Hilbert basis theorem and the nullstellensatz
21.1 Algebraic geometry
21.2 Algebraic varieties and radicals
21.3 The Hilbert basis theorem
21.4 The Hilbert nullstellensatz
21.5 Applications and consequences of Hilbert’s theorems
21.6 Dimensions
21.7 Exercises

22 Algebras and group representations
22.1 Group representations
22.2 Representations and modules
22.3 Semisimple algebras and Wedderburn’s theorem
22.4 Ordinary representations, characters and character theory
22.5 Burnside’s theorem
22.6 Exercises

23 Algebraic cryptography
23.1 Basic cryptography
23.2 Encryption and number theory
23.3 Public key cryptography
23.3.1 The Diffie–Hellman protocol
23.3.2 The RSA algorithm
23.3.3 The El-Gamal protocol
23.3.4 Elliptic curves and elliptic curve methods
23.4 Noncommutative-group-based cryptography
23.4.1 Free group cryptosystems
23.5 Ko–Lee and Anshel–Anshel–Goldfeld methods
23.5.1 The Ko–Lee protocol
23.5.2 The Anshel–Anshel–Goldfeld protocol
23.6 Platform groups and braid group cryptography
23.7 Exercises

Bibliography

Index
1 Groups, rings and fields
1.1 Abstract algebra
Abstract algebra or modern algebra can be best described as the theory of algebraic
structures. Briefly, an algebraic structure is a set S together with one or more binary
operations on it satisfying axioms governing the operations. There are many algebraic
structures, but the most commonly studied structures are groups, rings, fields, and
vector spaces. Also, widely used are modules and algebras. In this first chapter, we
will look at some basic preliminaries concerning groups, rings, and fields. We will
only briefly touch on groups here; a more extensive treatment will be done later in the
book.
Mathematics traditionally has been subdivided into three main areas—analysis,
algebra, and geometry. These areas overlap in many places so that it is often difficult,
for example, to determine whether a topic is one in geometry or in analysis. Algebra
and algebraic methods permeate all these disciplines and most of mathematics has
been algebraicized; that is, uses the methods and language of algebra. Groups, rings,
and fields play a major role in the modern study of analysis, topology, geometry, and
even applied mathematics. We will see these connections in examples throughout the
book.
Abstract algebra has its origins in two main areas and questions that arose in
these areas—the theory of numbers and the theory of equations. The theory of num-
bers deals with the properties of the basic number systems—integers, rationals, and
reals, whereas the theory of equations, as the name indicates, deals with solving equa-
tions, in particular, polynomial equations. Both are subjects that date back to classical
times. A whole section of Euclid’s Elements is dedicated to number theory. The foundations
for the modern study of number theory were laid by Fermat in the 1600s, and
then by Gauss in the 1800s. In an attempt to prove Fermat’s last theorem, Gauss introduced
the complex integers a + bi, where a and b are integers (now called the Gaussian
integers), and showed that this set has unique factorization. These ideas were extended by Dedekind and Kronecker,
who developed a wide ranging theory of algebraic number fields and algebraic inte-
gers. A large portion of the terminology used in abstract algebra, such as rings, ideals,
and factorization, comes from the study of algebraic number fields. This has evolved
into the modern discipline of algebraic number theory.
The second origin of modern abstract algebra was the problem of trying to determine
a formula for finding the solutions in terms of radicals of a fifth degree polynomial.
It was proved first by Ruffini in 1800, and then by Abel that it is impossible
to find a formula in terms of radicals for such a solution. Galois, around 1830, extended this
and showed that such a formula is impossible for any degree five or greater. In proving
this, he laid the groundwork for much of the development of modern abstract algebra,
especially field theory and finite group theory. Earlier, in 1800, Gauss proved the fundamental
theorem of algebra, which says that any nonconstant complex polynomial
equation must have a solution. One of the goals of this book is to present a comprehensive
treatment of Galois theory and a proof of the results mentioned above.

https://doi.org/10.1515/9783110603996-001

The locus of real points (x, y) that satisfy a polynomial equation f(x, y) = 0 is
called an algebraic plane curve. Algebraic geometry deals with the study of algebraic
plane curves and extensions to loci in a higher number of variables. Algebraic geometry
is intricately tied to abstract algebra and especially commutative algebra. We will
also touch on this in the book.
Finally, linear algebra, although a part of abstract algebra, arose in a somewhat
different context. Historically, it grew out of the study of solution sets of systems of
linear equations and the study of the geometry of real n-dimensional spaces. It began
to be developed formally in the early 1800s with the work of Jordan and Gauss, and then
later in the century by Cayley, Hamilton, and Sylvester.

1.2 Rings
The primary motivating examples for algebraic structures are the basic number systems:
the integers ℤ, the rational numbers ℚ, the real numbers ℝ, and the complex
numbers ℂ. Each of these has two basic operations, addition and multiplication, and
forms what is called a ring. We formally define this.
Definition 1.2.1. A ring is a set R with two binary operations defined on it: addition,
denoted by +, and multiplication, denoted by ⋅, or just by juxtaposition, satisfying the
following six axioms:
(1) Addition is commutative: a + b = b + a for each pair a, b in R.
(2) Addition is associative: a + (b + c) = (a + b) + c for a, b, c ∈ R.
(3) There exists an additive identity, denoted by 0, such that a + 0 = a for each a ∈ R.
(4) For each a ∈ R, there exists an additive inverse, denoted by −a, such that
a + (−a) = 0.
(5) Multiplication is associative: a(bc) = (ab)c for a, b, c ∈ R.
(6) Multiplication is left and right distributive over addition: a(b + c) = ab + ac, and
(b + c)a = ba + ca for a, b, c ∈ R.

If in addition
(7) Multiplication is commutative: ab = ba for each pair a, b in R,

then R is a commutative ring.


Further, if
(8) There exists a multiplicative identity denoted by 1 such that a ⋅ 1 = a and 1 ⋅ a = a
for each a in R,

then R is a ring with identity.

If R satisfies (1) through (8), then R is a commutative ring with an identity.


A set G with one operation, +, on it satisfying axioms (1) through (4) is called an
abelian group. We will discuss these further later in the chapter.
The number systems ℤ, ℚ, ℝ, ℂ are all commutative rings with identity.
A ring R with only one element is called trivial. A ring R with identity is trivial if
and only if 0 = 1.
A finite ring is a ring R with only finitely many elements in it. Otherwise, R is an
infinite ring. ℤ, ℚ, ℝ, ℂ are all infinite rings. Examples of finite rings are given by the
integers modulo n, ℤn , with n > 1. The ring ℤn consists of the elements 0, 1, 2, . . . , n − 1
with addition and multiplication done modulo n. That is, for example 4 ⋅ 3 = 12 = 2
modulo 5. Hence, in ℤ5 , we have 4 ⋅ 3 = 2. The rings ℤn are all finite commutative rings
with identity.
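
The arithmetic of ℤn is easy to experiment with. The following is a minimal sketch in Python, assuming each element of ℤn is represented by an integer between 0 and n − 1; the function names `add_mod`, `mul_mod`, and `multiplication_table` are illustrative and do not come from the text.

```python
def add_mod(a, b, n):
    """Add two residues in Z_n."""
    return (a + b) % n


def mul_mod(a, b, n):
    """Multiply two residues in Z_n."""
    return (a * b) % n


def multiplication_table(n):
    """Return the full multiplication table of Z_n as a list of rows."""
    return [[mul_mod(a, b, n) for b in range(n)] for a in range(n)]


if __name__ == "__main__":
    # The computation from the text: 4 * 3 = 12 = 2 in Z_5.
    print(mul_mod(4, 3, 5))
    for row in multiplication_table(5):
        print(row)
```

Here `mul_mod(4, 3, 5)` reproduces the computation 4 ⋅ 3 = 2 in ℤ5.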
To give examples of rings without an identity, consider the set nℤ = {nz : z ∈ ℤ}
consisting of all multiples of the fixed integer n. It is an easy verification (see exercises)
that this forms a ring under the same addition and multiplication as in ℤ, but that
there is no identity for multiplication. Hence, for each n ∈ ℤ with n > 1, we get an
infinite commutative ring without an identity.
To obtain examples of noncommutative rings, we consider matrices. Let M2 (ℤ) be
the set of 2 × 2 matrices with integral entries. Addition of matrices is done component-
wise; that is,

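$$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{pmatrix},$$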

whereas multiplication is matrix multiplication

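$$\begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \cdot \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix} = \begin{pmatrix} a_1 a_2 + b_1 c_2 & a_1 b_2 + b_1 d_2 \\ c_1 a_2 + d_1 c_2 & c_1 b_2 + d_1 d_2 \end{pmatrix}.$$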

Then again, it is an easy verification (see exercises) that M2 (ℤ) forms a ring. Further,
since matrix multiplication is noncommutative, this forms a noncommutative ring.
However, the identity matrix does form a multiplicative identity for it. M2 (nℤ) with
n > 1 provides an example of an infinite noncommutative ring without an identity.
Finally, M2 (ℤn ) for n > 1 will give an example of a finite noncommutative ring.
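
To see the noncommutativity concretely, one can multiply two specific integral matrices in both orders. The sketch below assumes a 2 × 2 matrix is stored as a pair of rows; the helper `mat_mul` and the sample matrices `A` and `B` are illustrative choices, not taken from the text.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a1, b1), (c1, d1) = A
    (a2, b2), (c2, d2) = B
    return (
        (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
        (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2),
    )


IDENTITY = ((1, 0), (0, 1))
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

AB = mat_mul(A, B)
BA = mat_mul(B, A)

if __name__ == "__main__":
    print("AB =", AB)
    print("BA =", BA)
    print("AB == BA:", AB == BA)
```

The two products differ, so M2(ℤ) is not commutative, while `IDENTITY` acts as the multiplicative identity on both sides.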

1.3 Integral domains and fields


Our basic number systems have the property that if ab = 0, then either a = 0, or b = 0.
However, this is not necessarily true in the modular rings. For example, 2 ⋅ 3 = 0 in ℤ6 .

Definition 1.3.1. A zero divisor in a ring R is an element a ∈ R with a ≠ 0 such that
there exists an element b ≠ 0 with ab = 0. A commutative ring with an identity 1 ≠ 0
and with no zero divisors is called an integral domain.

Notice that having no zero divisors is equivalent to the fact that if ab = 0 in R, then
either a = 0, or b = 0.
Hence, ℤ, ℚ, ℝ, ℂ are all integral domains, but from the example above, ℤ6 is not.
In general, we have the following:

Theorem 1.3.2. ℤn is an integral domain if and only if n is a prime.

Proof. First of all, notice that under multiplication modulo n, an element m is 0 if and
only if n divides m. We will make this precise shortly. Recall further Euclid’s lemma
(see Chapter 3), which says that if a prime p divides a product ab, then p divides a, or
p divides b.
Now suppose that n is a prime and ab = 0 in ℤn . Then n divides ab. From Euclid’s
lemma it follows that n divides a, or n divides b. In the first case, a = 0 in ℤn , whereas
in the second, b = 0 in ℤn . It follows that there are no zero divisors in ℤn , and since
ℤn is a commutative ring with an identity, it is an integral domain.
Conversely, suppose ℤn is an integral domain. Suppose that n is not prime. Then
n = ab with 1 < a < n, 1 < b < n. It follows that ab = 0 in ℤn with neither a nor b
being zero. Therefore, they are zero divisors, which is a contradiction. Hence, n must
be prime.
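
Theorem 1.3.2 can be checked by brute force for small n. The following sketch lists the zero divisors of ℤn directly from Definition 1.3.1 and compares the result with a naive primality test; the names `zero_divisors` and `is_prime` are illustrative assumptions, and the search is only practical for small n.

```python
from math import isqrt


def zero_divisors(n):
    """Return all nonzero a in Z_n with a*b = 0 for some nonzero b in Z_n."""
    return [
        a
        for a in range(1, n)
        if any((a * b) % n == 0 for b in range(1, n))
    ]


def is_prime(n):
    """Naive trial-division primality test."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


if __name__ == "__main__":
    for n in range(2, 13):
        print(n, zero_divisors(n), is_prime(n))
```

For each n in the tested range, the list of zero divisors is empty exactly when n is prime.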

In ℚ, every nonzero element has a multiplicative inverse. This is not true in ℤ,
where only the elements −1, 1 have multiplicative inverses within ℤ.

Definition 1.3.3. A unit in a ring R with identity 1 ≠ 0 is an element a ∈ R, which has
a multiplicative inverse; that is, an element b ∈ R such that ab = ba = 1. If a is a unit
in R, we denote its inverse by a⁻¹. We denote the set of units of R by R⋆.

Hence, every nonzero element of ℚ and of ℝ and of ℂ is a unit, but in ℤ, the
only units are ±1. In M₂(ℝ), the units are precisely those matrices that have nonzero
determinant, whereas in M₂(ℤ), the units are those integral matrices that have
determinant ±1.
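
The statement about units in the ring of 2 × 2 integral matrices can be illustrated with the adjugate formula for the inverse. This is a sketch under the assumption that a matrix is stored as a pair of rows; `inverse_if_unit` and `mat_mul` are hypothetical helper names introduced only for this example.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a1, b1), (c1, d1) = A
    (a2, b2), (c2, d2) = B
    return (
        (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
        (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2),
    )


def inverse_if_unit(M):
    """Return the inverse of M inside M_2(Z) if det(M) is 1 or -1, else None."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det not in (1, -1):
        return None
    # The inverse is the adjugate divided by det; since det is 1 or -1,
    # dividing by det is the same as multiplying by det, so entries stay integral.
    return ((d * det, -b * det), (-c * det, a * det))


if __name__ == "__main__":
    M = ((1, 2), (3, 5))          # determinant -1
    print(inverse_if_unit(M))
    print(inverse_if_unit(((2, 0), (0, 1))))  # determinant 2, not a unit
```

A matrix such as ((2, 0), (0, 1)) is invertible over ℝ but has no integral inverse, so it is a unit in the real matrix ring only.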

Definition 1.3.4. A field K is a commutative ring with an identity 1 ≠ 0, where every
nonzero element is a unit.

Hence, a field K always contains at least two elements, a zero element 0 and an
identity 1 ≠ 0.
The rationals ℚ, the reals ℝ, and the complexes ℂ are all fields. If we relax the com-
mutativity requirement and just require that in the ring R with identity, each nonzero
element is a unit, then we get a skew field or division ring.

Lemma 1.3.5. If K is a field, then K is an integral domain.

Proof. Since a field K is already a commutative ring with an identity, we must only
show that there are no zero divisors in K.

Suppose that ab = 0 with a ≠ 0. Since K is a field and a is nonzero, it has an
inverse a⁻¹. Hence,

$$a^{-1}(ab) = a^{-1} \cdot 0 = 0 \implies (a^{-1}a)b = 0 \implies b = 0.$$

Therefore, K has no zero divisors and must be an integral domain.

Recall that ℤn was an integral domain only when n was a prime. This turns out to
also be necessary and sufficient for ℤn to be a field.

Theorem 1.3.6. ℤn is a field if and only if n is a prime.

Proof. First suppose that ℤn is a field. Then from Lemma 1.3.5, it is an integral domain.
Therefore, from Theorem 1.3.2, n must be a prime.
Conversely, suppose that n is a prime. We must show that ℤn is a field. Since we
already know that ℤn is an integral domain, we must only show that each nonzero
element of ℤn is a unit. Here, we need some elementary facts from number theory. If
a, b are integers, we use the notation a|b to indicate that a divides b.
Recall that given nonzero integers a, b, their greatest common divisor or GCD d > 0
is a positive integer, which is a common divisor; that is, d|a and d|b, and if d1 is any
other common divisor, then d1 |d. We denote the greatest common divisor of a, b by
either gcd(a, b) or (a, b). It can be proved that given nonzero integers a, b their GCD
exists, is unique and can be characterized as the least positive linear combination
of a and b. If the GCD of a and b is 1, then we say that a and b are relatively prime or
coprime. This is equivalent to being able to express 1 as a linear combination of a and b
(see Chapter 3 for proofs and more details).
Now let a ∈ ℤn with n prime and a ≠ 0. Since a ≠ 0, we have that n does
not divide a. Since n is prime, it follows that a and n must be relatively prime,
(a, n) = 1. From the number theoretic remarks above, we then have that there ex-
ist x, y with

ax + ny = 1.

However, in ℤn , the element ny = 0. Therefore, in ℤn , we have

ax = 1.

Therefore, a has a multiplicative inverse in ℤn and is, hence, a unit. Since a was
an arbitrary nonzero element, we conclude that ℤn is a field.
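
The proof is constructive: the coefficients x, y with ax + ny = 1 can be computed by the extended Euclidean algorithm, and x is then the inverse of a in ℤn. The following is a minimal sketch of that computation; the function names `extended_gcd` and `inverse_mod` are illustrative, and the text itself defers the underlying number theory to Chapter 3.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def inverse_mod(a, n):
    """Return the inverse of a in Z_n, or raise ValueError if a is not a unit."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError(f"{a} is not a unit in Z_{n}")
    return x % n


if __name__ == "__main__":
    p = 7
    for a in range(1, p):
        print(a, inverse_mod(a, p))
```

When n is prime, `inverse_mod` succeeds for every nonzero residue, which is exactly the content of the theorem. For composite n it fails on the residues that share a factor with n.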

The theorem above is actually a special case of a more general result from which
Theorem 1.3.6 could also be obtained.

Theorem 1.3.7. Each finite integral domain is a field.

Proof. Let K be a finite integral domain. We must show that K is a field. It is clearly
sufficient to show that each nonzero element of K is a unit. Let

{0, 1, r₁, ..., rₙ}

be the elements of K. Let rᵢ be a fixed nonzero element and multiply each element of
K by rᵢ on the left. Now

if rᵢrⱼ = rᵢrₖ, then rᵢ(rⱼ − rₖ) = 0.

Since rᵢ ≠ 0 and K has no zero divisors, it follows that rⱼ − rₖ = 0, or rⱼ = rₖ. Therefore,
all the products rᵢrⱼ are distinct. Since K is finite, multiplication by rᵢ merely permutes
the elements of K. Hence,

K = {0, 1, r₁, ..., rₙ} = rᵢK = {0, rᵢ, rᵢr₁, ..., rᵢrₙ}.

Therefore, the identity element 1 must be in the right-hand list; that is, there is an
rⱼ such that rᵢrⱼ = 1. Therefore, rᵢ has a multiplicative inverse and is, hence, a unit.
Therefore, K is a field.
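
The key step of the proof, that multiplication by a fixed nonzero element rearranges the elements of a finite integral domain, can be observed in ℤ7 and seen to fail in ℤ6. This is a small sketch; `left_multiples` is an illustrative name, not notation from the text.

```python
def left_multiples(r, n):
    """Return the list [r*s mod n for every s in Z_n], in order of s."""
    return [(r * s) % n for s in range(n)]


if __name__ == "__main__":
    # In Z_7 (an integral domain) every nonzero r permutes the elements.
    for r in range(1, 7):
        print(r, left_multiples(r, 7))
    # In Z_6 the zero divisor 2 does not: products repeat and 1 is missing.
    print(left_multiples(2, 6))
```

In ℤ7 each row contains 1, so every nonzero element has an inverse. In ℤ6 the multiples of 2 never reach 1.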

1.4 Subrings and ideals


A very important concept in algebra is that of a substructure, that is, a subset having
the same structure as the superset.

Definition 1.4.1. A subring of a ring R is a nonempty subset S that is also a ring under
the same operations as R. If R is a field and S also a field, then it is a subfield.

If S ⊂ R, then S automatically satisfies the basic axioms that hold in all of R, for
example, associativity and commutativity of addition. Therefore, S will be a subring
if it is nonempty and closed under the operations; that is, closed under addition,
multiplication, and taking additive inverses.

Lemma 1.4.2. A subset S of a ring R is a subring if and only if S is nonempty, and when-
ever a, b ∈ S, we have a + b ∈ S, a − b ∈ S and ab ∈ S.

Example 1.4.3. Show that if n > 1, the set nℤ is a subring of ℤ. Here, clearly nℤ is
nonempty. Suppose a = nz₁, b = nz₂ are two elements of nℤ. Then

a + b = nz₁ + nz₂ = n(z₁ + z₂) ∈ nℤ
a − b = nz₁ − nz₂ = n(z₁ − z₂) ∈ nℤ
ab = nz₁ ⋅ nz₂ = n(nz₁z₂) ∈ nℤ.

Therefore, nℤ is a subring.

Example 1.4.4. Show that the set of real numbers of the form

S = {u + v√2 : u, v ∈ ℚ}

is a subring of ℝ.
Here, 1 + √2 ∈ S; therefore, S is nonempty. Suppose a = u₁ + v₁√2, b = u₂ + v₂√2
are two elements of S. Then

a + b = (u₁ + v₁√2) + (u₂ + v₂√2) = u₁ + u₂ + (v₁ + v₂)√2 ∈ S
a − b = (u₁ + v₁√2) − (u₂ + v₂√2) = u₁ − u₂ + (v₁ − v₂)√2 ∈ S
a ⋅ b = (u₁ + v₁√2) ⋅ (u₂ + v₂√2) = (u₁u₂ + 2v₁v₂) + (u₁v₂ + v₁u₂)√2 ∈ S.

Therefore, S is a subring.

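In fact, S is a field because

$$\frac{1}{u + v\sqrt{2}} = \frac{u}{u^2 - 2v^2} - \frac{v}{u^2 - 2v^2}\sqrt{2} \quad \text{if } (u, v) \neq (0, 0).$$

The denominator u² − 2v² is nonzero for rational (u, v) ≠ (0, 0) because √2 is irrational.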
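
The arithmetic of S can be carried out exactly with rational coefficients. The sketch below assumes an element u + v√2 is stored as a pair of `Fraction` values; the class name `QSqrt2` and its method `inverse` are illustrative and are not notation used in the text.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class QSqrt2:
    """An element u + v*sqrt(2) with rational u and v (pass Fraction values)."""
    u: Fraction
    v: Fraction

    def __add__(self, other):
        return QSqrt2(self.u + other.u, self.v + other.v)

    def __sub__(self, other):
        return QSqrt2(self.u - other.u, self.v - other.v)

    def __mul__(self, other):
        # (u1 + v1*sqrt2)(u2 + v2*sqrt2) = (u1*u2 + 2*v1*v2) + (u1*v2 + v1*u2)*sqrt2
        return QSqrt2(
            self.u * other.u + 2 * self.v * other.v,
            self.u * other.v + self.v * other.u,
        )

    def inverse(self):
        # 1/(u + v*sqrt2) = u/(u^2 - 2v^2) - v/(u^2 - 2v^2) * sqrt2
        norm = self.u * self.u - 2 * self.v * self.v
        if norm == 0:
            raise ZeroDivisionError("zero has no multiplicative inverse")
        return QSqrt2(self.u / norm, -self.v / norm)


ONE = QSqrt2(Fraction(1), Fraction(0))

if __name__ == "__main__":
    a = QSqrt2(Fraction(1), Fraction(1))  # 1 + sqrt(2)
    print(a.inverse())
    print(a * a.inverse() == ONE)
```

Using `Fraction` rather than floating-point numbers keeps every result inside S exactly, which mirrors the closure computations in Example 1.4.4.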

In the following, we are especially interested in special types of subrings called
ideals.

Definition 1.4.5. Let R be a ring and I ⊂ R. Then I is a (two-sided) ideal if the following
properties hold:
(1) I is nonempty.
(2) If a, b ∈ I, then a ± b ∈ I.
(3) If a ∈ I and r is any element of R, then ra ∈ I, and ar ∈ I.

We denote the fact that I forms an ideal in R by I ⊲ R.

Notice that if a, b ∈ I, then from (3), we have ab ∈ I, and ba ∈ I. Hence, I forms a
subring; that is, each ideal is also a subring. The set {0} and the whole ring R are trivial
ideals of R.
If we assume that in (3), only ra ∈ I, then I is called a left ideal. Analogously, we
define a right ideal.

Lemma 1.4.6. Let R be a commutative ring and a ∈ R. Then the set

⟨a⟩ = aR = {ar : r ∈ R}

is an ideal of R.

This ideal is called the principal ideal generated by a.

Proof. We must verify the three properties of the definition. Since a ∈ R, we have that
aR is nonempty. If u = ar1 , v = ar2 are two elements of aR, then

u ± v = ar1 ± ar2 = a(r1 ± r2 ) ∈ aR.

Therefore, (2) is satisfied.


Finally, let u = ar1 ∈ aR and r ∈ R. Then

ru = rar1 = a(rr1 ) ∈ aR, and ur = ar1 r = a(r1 r) ∈ aR.

Recall that a ∈ ⟨a⟩ if R has an identity.


Notice that if n ∈ ℤ, then the principal ideal generated by n is precisely the ring
nℤ, which we have already examined. Hence, for each n > 1, the subring nℤ is actually
an ideal. We can show more.

Theorem 1.4.7. Any subring of ℤ is of the form nℤ for some n. Hence, each subring of ℤ
is actually a principal ideal.

Proof. Let S be a subring of ℤ. If S = {0}, then S = 0ℤ, so we may assume that S has
nonzero elements. Since S is a subring, it contains the additive inverse of each of its
elements, and so, it must have positive elements.
Let S+ be the set of positive elements in S. From the remarks above, this is a
nonempty set, and so, there must be a least positive element n. We claim that S = nℤ.
Let m be a positive element in S. By the division algorithm

m = qn + r,

where either r = 0, or 0 < r < n (see Chapter 3). Suppose that r ≠ 0. Then

r = m − qn.

Now m ∈ S, and n ∈ S. Since S is a subring, it is closed under addition and additive
inverses so that qn ∈ S and, therefore, m − qn ∈ S. It follows that r ∈ S. But this is a
contradiction, since 0 < r < n and n was the least positive element in S. Therefore, r = 0,
and m = qn. Hence, each positive element in S is a multiple of n.
Now let m be a negative element of S. Then −m ∈ S, and −m is positive. Hence,
−m = qn, and thus, m = (−q)n. Therefore, every element of S is a multiple of n; that is,
S ⊂ nℤ. Conversely, nℤ ⊂ S because n ∈ S, and so, S = nℤ.
It follows that every subring of ℤ is of this form and, therefore, every subring of ℤ
is an ideal.
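Theorem 1.4.7 can be explored numerically. The sketch below assumes two hypothetical helper functions: `subring_generator` returns the generator n of the subring of ℤ generated by finitely many integers (this generator is their greatest common divisor), and `least_positive_element` searches by brute force for the least positive element used in the proof. The search bound is an arbitrary choice made for illustration.

```python
from functools import reduce
from math import gcd

def subring_generator(elements):
    """Return n >= 0 such that the subring of Z generated by `elements` is nZ."""
    return reduce(gcd, (abs(e) for e in elements), 0)

def least_positive_element(elements, bound=50):
    """Brute force: least positive value of x*a + y*b with |x|, |y| <= bound."""
    a, b = elements
    combos = (x * a + y * b
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1))
    return min(c for c in combos if c > 0)

n = subring_generator([12, 18])             # the subring generated by 12 and 18
least = least_positive_element([12, 18])    # least positive element found by search
```

Both computations give the same integer, in agreement with the proof, where the generator n is obtained as the least positive element of S.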

We mention that every subring of ℤ is an ideal, but this is not true for rings in general.
For example, ℤ is a subring of ℚ, but not an ideal.
An extension of the proof of Lemma 1.4.6 gives the following. We leave the proof
as an exercise.

Lemma 1.4.8. Let R be a commutative ring and a1 , . . . , an ∈ R be a finite set of elements
in R. Then the set

⟨a1 , . . . , an ⟩ = {r1 a1 + r2 a2 + ⋅ ⋅ ⋅ + rn an : ri ∈ R}

is an ideal of R.


This ideal is called the ideal generated by a1 , . . . , an .


Recall that a1 , . . . , an are in ⟨a1 , . . . , an ⟩ if R has an identity.

Theorem 1.4.9. Let R be a commutative ring with an identity 1 ≠ 0. Then R is a field if
and only if the only ideals in R are {0} and R.

Proof. Suppose that R is a field and I ⊲ R is an ideal. We must show that either I = {0},
or I = R. Suppose that I ≠ {0}, then we must show that I = R.
Since I ≠ {0}, there exists an element a ∈ I with a ≠ 0. Since R is a field, this
element a has an inverse a⁻¹. Since I is an ideal, it follows that a⁻¹a = 1 ∈ I. Let r ∈ R,
then, since 1 ∈ I, we have r ⋅ 1 = r ∈ I. Hence, R ⊂ I and, therefore, R = I.
Conversely, suppose that R is a commutative ring with an identity, whose only
ideals are {0} and R. We must show that R is a field, or equivalently, that every nonzero
element of R has a multiplicative inverse.
Let a ∈ R with a ≠ 0. Since R is a commutative ring, the principal ideal aR is an ideal
in R, and aR ≠ {0} because a = a ⋅ 1 ∈ aR. Hence, aR = R. Therefore, the multiplicative
identity 1 ∈ aR. It follows that there exists an r ∈ R with ar = 1. Hence, a has a
multiplicative inverse, and R must be a field.
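Theorem 1.4.9 can be illustrated in the rings ℤn. The following sketch assumes the standard fact that every ideal of ℤn is principal, so that listing the sets aℤn lists all ideals; the function names are hypothetical and chosen only for this illustration.

```python
def principal_ideals(n):
    """Return the set of distinct principal ideals a*Z_n of Z_n = {0, ..., n-1}."""
    return {frozenset((a * r) % n for r in range(n)) for a in range(n)}

def is_field(n):
    """Z_n is a field iff every nonzero class has a multiplicative inverse."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

ideals_7 = principal_ideals(7)     # expected: only {0} and Z_7
ideals_12 = principal_ideals(12)   # expected: one ideal for each divisor of 12
```

For n = 7 only the two trivial ideals appear and every nonzero class is invertible, whereas for n = 12 there are further ideals and ℤ12 is not a field.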

1.5 Factor rings and ring homomorphisms


Given an ideal I in a ring R, we can build a new ring called the factor ring or quotient
ring of R modulo I. The special condition on the subring I, that rI ⊂ I and Ir ⊂ I for all
r ∈ R, that makes it an ideal, is specifically to allow this construction to be a ring.

Definition 1.5.1. Let I be an ideal in a ring R. Then a coset of I is a subset of R of the
form

r + I = {r + i : i ∈ I}

with r a fixed element of R.

Lemma 1.5.2. Let I be an ideal in a ring R. Then the cosets of I partition R; that is, any
two cosets either coincide or are disjoint.

We leave the proof to the exercises.


Now, on the set of all cosets of an ideal, we will build a new ring.

Theorem 1.5.3. Let I be an ideal in a ring R. Let R/I be the set of all cosets of I in R; that
is,

R/I = {r + I : r ∈ R}.

We define addition and multiplication on R/I in the following manner:

(r1 + I) + (r2 + I) = (r1 + r2 ) + I


(r1 + I) ⋅ (r2 + I) = (r1 ⋅ r2 ) + I.

Then R/I forms a ring called the factor ring of R modulo I. The zero element of R/I is 0 + I
and the additive inverse of r + I is −r + I.
Further, if R is commutative, then R/I is commutative, and if R has an identity, then
R/I has an identity 1 + I.

Proof. The proof that R/I satisfies the ring axioms under the definitions above is
straightforward. For example,

(r1 + I) + (r2 + I) = (r1 + r2 ) + I = (r2 + r1 ) + I = (r2 + I) + (r1 + I),

and so, addition is commutative.


What must be shown is that both addition and multiplication are well-defined.
That is, if

r1 + I = r1′ + I, and r2 + I = r2′ + I

then

(r1 + I) + (r2 + I) = (r1′ + I) + (r2′ + I),

and

(r1 + I) ⋅ (r2 + I) = (r1′ + I) ⋅ (r2′ + I).

Now if r1 + I = r1′ + I, then r1 ∈ r1′ + I, and so, r1 = r1′ + i1 for some i1 ∈ I. Similarly, if
r2 + I = r2′ + I, then r2 ∈ r2′ + I, and so, r2 = r2′ + i2 for some i2 ∈ I. Then

(r1 + I) + (r2 + I) = (r1′ + i1 + I) + (r2′ + i2 + I) = (r1′ + I) + (r2′ + I)

since i1 + I = I and i2 + I = I. Similarly,

(r1 + I) ⋅ (r2 + I) = (r1 ⋅ r2 ) + I = (r1′ + i1 ) ⋅ (r2′ + i2 ) + I
= r1′ ⋅ r2′ + r1′ ⋅ i2 + i1 ⋅ r2′ + i1 ⋅ i2 + I
= (r1′ ⋅ r2′ ) + I

since the products r1′ ⋅ i2 , i1 ⋅ r2′ and i1 ⋅ i2 are all in the ideal I.


This shows that addition and multiplication are well-defined. It also shows why
the ideal property is necessary.

As an example, let R be the integers ℤ. As we have seen, each subring is an ideal
and of the form nℤ for some natural number n. The factor ring ℤ/nℤ is called the
residue class ring modulo n, denoted ℤn . Notice that we can take as cosets

0 + nℤ, 1 + nℤ, . . . , (n − 1) + nℤ.


Addition and multiplication of cosets are then just addition and multiplication
modulo n. As we can see, this is just a formalization of the ring ℤn , which we have already
looked at. Recall that ℤn is an integral domain if and only if n is prime, and ℤn is a field
for precisely the same n. If n = 0, then ℤ/nℤ is the same as ℤ.
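That the coset operations in ℤ/nℤ do not depend on the chosen representatives can be tested on random examples. The sketch below is only a numerical sanity check under stated assumptions, not a proof; `coset_rep` and `check_well_defined` are hypothetical names, and the ranges of the random integers are arbitrary.

```python
import random

def coset_rep(r, n):
    """Canonical representative of the coset r + nZ, an integer in 0..n-1."""
    return r % n

def check_well_defined(n, trials=1000, seed=0):
    """Randomly test that sums and products of cosets ignore the representatives."""
    rng = random.Random(seed)
    for _ in range(trials):
        r1, r2 = rng.randrange(-100, 100), rng.randrange(-100, 100)
        # Other representatives of the same cosets: r' = r + i with i in nZ.
        s1 = r1 + n * rng.randrange(-10, 10)
        s2 = r2 + n * rng.randrange(-10, 10)
        if coset_rep(r1 + r2, n) != coset_rep(s1 + s2, n):
            return False
        if coset_rep(r1 * r2, n) != coset_rep(s1 * s2, n):
            return False
    return True

ok = check_well_defined(12)
```

Replacing a representative r by r + i with i ∈ nℤ never changes the resulting coset, which is exactly the well-definedness shown in the proof of Theorem 1.5.3.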
We now show that ideals and factor rings are closely related to certain mappings
between rings.

Definition 1.5.4. Let R and S be rings. Then a mapping f : R → S is a ring
homomorphism if

f (r1 + r2 ) = f (r1 ) + f (r2 ) for any r1 , r2 ∈ R


f (r1 ⋅ r2 ) = f (r1 ) ⋅ f (r2 ) for any r1 , r2 ∈ R.

In addition,
(1) f is an epimorphism if it is surjective.
(2) f is a monomorphism if it is injective.
(3) f is an isomorphism if it is bijective; that is, both surjective and injective. In this
case, R and S are said to be isomorphic rings, which we denote by R ≅ S.
(4) f is an endomorphism if R = S; that is, a ring homomorphism from a ring to itself.
(5) f is an automorphism if R = S and f is an isomorphism.

Lemma 1.5.5. Let R and S be rings, and let f : R → S be a ring homomorphism. Then
(1) f (0) = 0, where the first 0 is the zero element of R, and the second is the zero element
of S.
(2) f (−r) = −f (r) for any r ∈ R.

Proof. We obtain f (0) = 0 from the equation f (0) = f (0 + 0) = f (0) + f (0). Hence,
0 = f (0) = f (r − r) = f (r + (−r)) = f (r) + f (−r); that is, f (−r) = −f (r).

Definition 1.5.6. Let R and S be rings, and let f : R → S be a ring homomorphism. Then
the kernel of f is

ker(f ) = {r ∈ R : f (r) = 0}.

The image of f , denoted im(f ), is the range of f within S. That is,

im(f ) = {s ∈ S : there exists r ∈ R with f (r) = s}.

Theorem 1.5.7 (Ring isomorphism theorem). Let R and S be rings, and let

f :R→S

be a ring homomorphism. Then


(1) ker(f ) is an ideal in R, im(f ) is a subring of S, and

R/ ker(f ) ≅ im(f ).

(2) Conversely, suppose that I is an ideal in a ring R. Then the map f : R → R/I, given
by f (r) = r + I for r ∈ R, is a ring homomorphism, whose kernel is I, and whose image
is R/I.

The theorem says that the concepts of ideal of a ring and kernel of a ring homo-
morphism coincide; that is, each ideal is the kernel of a homomorphism and the kernel
of each ring homomorphism is an ideal.

Proof. Let f : R → S be a ring homomorphism. If s1 , s2 ∈ im(f ), then there are r1 , r2 ∈
R, such that f (r1 ) = s1 , and f (r2 ) = s2 . Then s1 ± s2 = f (r1 ± r2 ) and s1 ⋅ s2 = f (r1 ⋅ r2 )
are again in im(f ), so im(f ) is a subring of S from Definition 1.5.4 and Lemma 1.5.5.
Now, let I = ker(f ). We show first that I is an ideal. Since f (0) = 0, the set I is nonempty.
If r1 , r2 ∈ I, then f (r1 ) = f (r2 ) = 0. It follows from the homomorphism property that

f (r1 ± r2 ) = f (r1 ) ± f (r2 ) = 0 ± 0 = 0


f (r1 ⋅ r2 ) = f (r1 ) ⋅ f (r2 ) = 0 ⋅ 0 = 0.

Therefore, I is a subring.
Now let i ∈ I and r ∈ R. Then

f (r ⋅ i) = f (r) ⋅ f (i) = f (r) ⋅ 0 = 0 and f (i ⋅ r) = f (i) ⋅ f (r) = 0 ⋅ f (r) = 0

and, hence, I is an ideal.


Consider the factor ring R/I. Define f ∗ : R/I → im(f ) by f ∗ (r + I) = f (r). We show that
f ∗ is an isomorphism.

First, we show that it is well-defined. Suppose that r1 + I = r2 + I, then r1 − r2 ∈ I =
ker(f ). It follows that f (r1 − r2 ) = 0, so f (r1 ) = f (r2 ). Hence, f ∗ (r1 + I) = f ∗ (r2 + I), and
the map f ∗ is well-defined.
Now

f ∗ ((r1 + I) + (r2 + I)) = f ∗ ((r1 + r2 ) + I) = f (r1 + r2 )


= f (r1 ) + f (r2 ) = f ∗ (r1 + I) + f ∗ (r2 + I),

and

f ∗ ((r1 + I) ⋅ (r2 + I)) = f ∗ ((r1 ⋅ r2 ) + I) = f (r1 ⋅ r2 )


= f (r1 ) ⋅ f (r2 ) = f ∗ (r1 + I) ⋅ f ∗ (r2 + I).

Hence, f ∗ is a homomorphism. We must now show that it is injective and surjective.


Suppose that f ∗ (r1 + I) = f ∗ (r2 + I). Then f (r1 ) = f (r2 ) so that f (r1 − r2 ) = 0. Hence,
r1 − r2 ∈ ker(f ) = I. Therefore, r1 ∈ r2 + I, and thus, r1 + I = r2 + I, and the map f ∗ is
injective.


Finally, let s ∈ im(f ). Then there exists r ∈ R such that f (r) = s. Then f ∗ (r + I) = s,
and the map f ∗ is surjective and, hence, an isomorphism. This proves the first part of
the theorem.
To prove the second part, let I be an ideal in R and R/I the factor ring. Consider
the map f : R → R/I, given by f (r) = r + I. From the definition of addition and multi-
plication in the factor ring R/I, it is clear that this is a homomorphism. Consider the
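kernel of f . If r ∈ ker(f ), then f (r) = r + I = 0 + I, the zero element of R/I. This implies
that r ∈ I. Conversely, if r ∈ I, then f (r) = r + I = 0 + I. Hence, the kernel of this map is
exactly the ideal I. Since every coset r + I equals f (r), the image of f is R/I, completing
the proof.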
Theorem 1.5.7 is called the ring isomorphism theorem or the first ring isomorphism
theorem. We mention that there is an analogous theorem for each algebraic structure,
in particular, for groups and vector spaces. We will mention the result for groups in
Section 1.8.
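A small concrete instance of the ring isomorphism theorem can be computed directly. The sketch below assumes the reduction map from ℤ12 to ℤ4 as the example homomorphism (our choice for illustration, not an example from the text) and checks the statements of Theorem 1.5.7 by enumeration.

```python
def f(r):
    """Reduction map Z_12 -> Z_4 (a ring homomorphism because 4 divides 12)."""
    return r % 4

R = range(12)

# Homomorphism property, checked on all pairs of elements of Z_12.
is_hom = all(f((a + b) % 12) == (f(a) + f(b)) % 4 and
             f((a * b) % 12) == (f(a) * f(b)) % 4
             for a in R for b in R)

kernel = {r for r in R if f(r) == 0}
image = {f(r) for r in R}

# Cosets r + ker(f), computed inside Z_12.
cosets = {frozenset((r + k) % 12 for k in kernel) for r in R}

# The induced map f*(r + I) = f(r): collect the values f takes on each coset.
induced = {coset: {f(r) for r in coset} for coset in cosets}
well_defined = all(len(values) == 1 for values in induced.values())
values_hit = {next(iter(values)) for values in induced.values()}
bijective = well_defined and len(values_hit) == len(cosets) == len(image)
```

Here f takes a single value on each coset of the kernel, and distinct cosets are sent to distinct elements of the image, so the induced map is a bijection between R/ker(f) and im(f).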

1.6 Fields of fractions


The integers are an integral domain, and the rationals ℚ are a field that contains the
integers. First, we show that ℚ is the smallest field containing ℤ.

Theorem 1.6.1. The rationals ℚ are the smallest field containing the integers ℤ. That is,
if ℤ ⊂ K ⊂ ℚ with K a subfield of ℚ, then K = ℚ.

Proof. Since ℤ ⊂ K, we have m, n ∈ K for any two integers m, n with n ≠ 0. Since K is a
subfield, it is closed under division; that is, under multiplication and taking
multiplicative inverses and, hence, the fraction m/n ∈ K. Since each element of ℚ is such
a fraction, it follows that ℚ ⊂ K. Since K ⊂ ℚ, it follows that K = ℚ.
Notice that to construct the rationals from the integers, we form all the fractions
m/n with n ≠ 0, and where m1 /n1 = m2 /n2 if m1 n2 = n1 m2 . We then do the standard
operations on fractions. If we start with any integral domain D, we can mimic this
construction to build a field of fractions from D; that is, the smallest field containing D.

Theorem 1.6.2. Let D be an integral domain. Then there is a field K containing D, called
the field of fractions for D, such that each element of K is a fraction from D; that is, an
element of the form d1 d2⁻¹ with d1 , d2 ∈ D and d2 ≠ 0. Further, K is unique up to
isomorphism and is the smallest field containing D.

Proof. The proof is just the mimicking of the construction of the rationals from the
integers. Let

K′ = {(d1 , d2 ) : d1 , d2 ∈ D, d2 ≠ 0}.

Define on K′ the equivalence relation

(d1 , d2 ) ∼ (d1′ , d2′ ) if d1 d2′ = d2 d1′ .


Let K be the set of equivalence classes, and define addition and multiplication in the
usual manner as for fractions, where the result is the equivalence class:

(d1 , d2 ) + (d3 , d4 ) = (d1 d4 + d2 d3 , d2 d4 )


(d1 , d2 ) ⋅ (d3 , d4 ) = (d1 d3 , d2 d4 ).

It is now straightforward to verify the ring axioms for K. The inverse of (d1 , d2 ) is
(d2 , d1 ) for d1 ≠ 0 in D; in particular, the inverse of (d1 , 1) is (1, d1 ).
As with ℤ, we identify the elements of K as fractions d1 /d2 , and we identify each d ∈ D
with the fraction d/1.
The proof that K is the smallest field containing D is the same as for ℚ from ℤ.
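The construction in the proof can be written out almost verbatim as code. The sketch below takes D = ℤ and uses a hypothetical class `Frac` whose instances are pairs (d1, d2) with d2 ≠ 0; equality of instances is the equivalence relation defined by cross-multiplication, and no reduction to lowest terms is performed.

```python
class Frac:
    """Pair (d1, d2) with d2 != 0 over the integers (hypothetical class)."""
    def __init__(self, d1, d2):
        if d2 == 0:
            raise ZeroDivisionError("second component must be nonzero")
        self.d1, self.d2 = d1, d2

    def __eq__(self, other):
        # (d1, d2) ~ (d1', d2')  if and only if  d1*d2' == d2*d1'
        return self.d1 * other.d2 == self.d2 * other.d1

    def __add__(self, other):
        # (d1, d2) + (d3, d4) = (d1*d4 + d2*d3, d2*d4)
        return Frac(self.d1 * other.d2 + self.d2 * other.d1, self.d2 * other.d2)

    def __mul__(self, other):
        # (d1, d2) * (d3, d4) = (d1*d3, d2*d4)
        return Frac(self.d1 * other.d1, self.d2 * other.d2)

    def inverse(self):
        # Defined only for nonzero classes, that is, for d1 != 0.
        return Frac(self.d2, self.d1)

half, third = Frac(1, 2), Frac(2, 6)
total = half + third        # the pair (1*6 + 2*2, 2*6) = (10, 12)
```

The pair (10, 12) is equivalent to (5, 6), so the sum agrees with the usual arithmetic of fractions even though different representatives are used.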

As examples, we have that ℚ is the field of fractions for ℤ. A familiar, but less
common, example is the following:
Let ℝ[x] be the set of polynomials over the real numbers ℝ. It can be shown that
ℝ[x] forms an integral domain (see Chapter 3). The field of fractions consists of all
formal functions f (x)/g(x), where f (x), g(x) are real polynomials with g(x) ≠ 0. The
corresponding field of fractions is called the field of rational functions over ℝ and is
denoted ℝ(x).

1.7 Characteristic and prime rings


We saw in the last section that ℚ is the smallest field containing the integers. Since
any subfield of ℚ must contain the identity, it follows that any subfield of ℚ must
contain the integers and, hence, be all of ℚ. Therefore, ℚ has no proper subfields.
We say that ℚ is a prime field.

Definition 1.7.1. A field K is a prime field if K contains no proper subfields.

Lemma 1.7.2. Let K be any field. Then K contains a prime field K̄ as a subfield.

Proof. Let K1 , K2 be subfields of K. If k1 , k2 ∈ K1 ∩ K2 , then k1 ± k2 ∈ K1 since K1 is a
subfield, and k1 ± k2 ∈ K2 since K2 is a subfield. Therefore, k1 ± k2 ∈ K1 ∩ K2 . Similarly,
k1 k2⁻¹ ∈ K1 ∩ K2 if k2 ≠ 0. It follows that K1 ∩ K2 is again a subfield, and the same
argument applies to the intersection of any family of subfields.
Now, let K̄ be the intersection of all subfields of K. From the argument above, K̄ is
a subfield. Every subfield of K̄ is also a subfield of K and, therefore, contains K̄. Hence,
the only subfield of K̄ is K̄ itself, and K̄ is a prime field.

Definition 1.7.3. Let R be a commutative ring with an identity 1 ≠ 0. The smallest
positive integer n such that n ⋅ 1 = 1 + 1 + ⋅ ⋅ ⋅ + 1 = 0 is called the characteristic of R. If
there is no such n, then R has characteristic 0. We denote the characteristic by char(R).

First, notice that 0 is the characteristic of ℤ, ℚ, ℝ. Further, the characteristic of ℤn
is n.

Theorem 1.7.4. Let R be an integral domain. Then the characteristic of R is either 0 or
a prime. In particular, the characteristic of a field is zero or a prime.


Proof. Suppose that R is an integral domain and char(R) = n ≠ 0. Suppose that n = mk
with 1 < m < n, 1 < k < n. Then n ⋅ 1 = 0 = (m ⋅ 1)(k ⋅ 1). Since R is an integral domain, we
have no zero divisors and, hence, m ⋅ 1 = 0, or k ⋅ 1 = 0. However, this is a contradiction
since n is the least positive integer such that n ⋅ 1 = 0. Therefore, n must be a prime.
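The argument of the proof can be watched at work in the rings ℤn. The following sketch uses two hypothetical helper functions: `characteristic` adds the identity to itself until the sum is zero, and `zero_divisors` lists the pairs of nonzero classes whose product is zero.

```python
def characteristic(n):
    """Smallest k >= 1 with k * 1 = 0 in Z_n, found by repeated addition of 1."""
    total, k = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        k += 1
    return k

def zero_divisors(n):
    """Pairs (a, b) of nonzero classes in Z_n with a * b = 0."""
    return [(a, b) for a in range(1, n) for b in range(1, n) if (a * b) % n == 0]

char_6 = characteristic(6)
divisors_6 = zero_divisors(6)   # contains (2, 3), coming from 6 = 2 * 3
divisors_7 = zero_divisors(7)   # empty, since 7 is prime
```

A factorization n = mk of the characteristic produces the zero divisors m ⋅ 1 and k ⋅ 1, which is why an integral domain can only have characteristic 0 or a prime.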

We have seen that every field contains a prime field. We extend this.

Definition 1.7.5. A commutative ring R with an identity 1 ≠ 0 is a prime ring if the only
subring containing the identity is the whole ring.

Clearly both the integers ℤ and the modular integers ℤn are prime rings. In fact,
up to isomorphism, they are the only prime rings.

Theorem 1.7.6. Let R be a prime ring. If char(R) = 0, then R ≅ ℤ, whereas if char(R) =
n > 0, then R ≅ ℤn .

Proof. Suppose that char(R) = 0. Let S = {r = m ⋅ 1 : r ∈ R, m ∈ ℤ}. Then S is a subring
of R containing the identity and, hence, S = R. However, the map m ⋅ 1 ↦ m gives an
isomorphism from S to ℤ; it is well-defined because char(R) = 0 means that m ⋅ 1 = 0
only for m = 0. It follows that R is isomorphic to ℤ.
If char(R) = n > 0, the proof is identical. Since n ⋅ 1 = 0, the subring S of R, defined
above, is all of R and isomorphic to ℤn .

Theorem 1.7.6 can be extended to fields, with ℚ taking the place of ℤ, and ℤp , with
p a prime, taking the place of ℤn .

Theorem 1.7.7. Let K be a prime field. If K has characteristic 0, then K ≅ ℚ, whereas if
K has characteristic p, then K ≅ ℤp .

Proof. The proof is identical to that of Theorem 1.7.6; however, we consider the small-
est subfield K1 of K containing S.

We mention that there can be infinite fields of characteristic p. Consider, for ex-
ample, the field of fractions of the polynomial ring ℤp [x]. This is the field of rational
functions with coefficients in ℤp .
We give a theorem on fields of characteristic p that will be important much later
when we look at Galois theory.

Theorem 1.7.8. Let K be a field of characteristic p. Then the mapping ϕ : K → K, given
by ϕ(k) = k^p , is an injective endomorphism of K. In particular, (a + b)^p = a^p + b^p for any
a, b ∈ K.
This mapping is called the Frobenius homomorphism of K.
Further, if K is finite, ϕ is an automorphism.

Proof. We first show that ϕ is a homomorphism. Now

ϕ(ab) = (ab)^p = a^p b^p = ϕ(a)ϕ(b).


We need a little more work for addition:

\[ \phi(a + b) = (a + b)^p = \sum_{i=0}^{p} \binom{p}{i} a^i b^{p-i} = a^p + \sum_{i=1}^{p-1} \binom{p}{i} a^i b^{p-i} + b^p \]

by the binomial expansion, which holds in any commutative ring. However,

\[ \binom{p}{i} = \frac{p(p - 1) \cdots (p - i + 1)}{i \cdot (i - 1) \cdots 1}, \]

and it is clear that $p \mid \binom{p}{i}$ for 1 ≤ i ≤ p − 1, since the prime p divides the
numerator, but none of the factors of the denominator. Hence, in K, we have
$\binom{p}{i} \cdot 1 = 0$, and so, we have

ϕ(a + b) = (a + b)^p = a^p + b^p = ϕ(a) + ϕ(b).

Therefore, ϕ is a homomorphism.
Further, ϕ is always injective. To see this, suppose that ϕ(x) = ϕ(y). Then

ϕ(x − y) = 0 ⟹ (x − y)^p = 0.

But K is a field, so there are no zero divisors. Therefore, we must have x − y = 0, or
x = y.
If K is finite and ϕ is injective, it must also be surjective and, hence, an
automorphism of K.
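The three facts used in this proof can be verified by brute force in the prime field ℤp. The sketch below assumes a hypothetical function `frobenius_checks`; it only tests the small field ℤp and says nothing about larger fields of characteristic p.

```python
from math import comb

def frobenius_checks(p):
    """Brute-force checks in Z_p of the facts used in the proof above."""
    # p divides every binomial coefficient C(p, i) with 1 <= i <= p - 1.
    binomials_vanish = all(comb(p, i) % p == 0 for i in range(1, p))
    # (a + b)^p = a^p + b^p for all a, b in Z_p.
    additive = all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
                   for a in range(p) for b in range(p))
    # The map k -> k^p takes p distinct values, so it is injective.
    injective = len({pow(a, p, p) for a in range(p)}) == p
    return binomials_vanish, additive, injective

checks_7 = frobenius_checks(7)
checks_4 = frobenius_checks(4)   # 4 is not prime: C(4, 2) = 6 is not divisible by 4
```

The composite modulus 4 is included only to show that the divisibility of the binomial coefficients really depends on p being prime.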

1.8 Groups
We close this first chapter by introducing some basic definitions and results from
group theory that mirror the results presented for rings and fields. We will look at
group theory in more detail later in the book. Proofs will be given at that point.

Definition 1.8.1. A group G is a set with one binary operation (which we will denote
by multiplication) such that
(1) The operation is associative.
(2) There exists an identity for this operation.
(3) Each g ∈ G has an inverse for this operation.

If, in addition, the operation is commutative, the group G is called an abelian group.
The order of G is the number of elements in G, denoted by |G|. If |G| < ∞, G is a finite
group; otherwise G is an infinite group.

Groups most often arise from invertible mappings of a set onto itself. Such map-
pings are called permutations.


Theorem 1.8.2. The set of all permutations on a set A forms a group called the
symmetric group on A, which we denote by SA . If A has more than 2 elements, then SA is
nonabelian.

Definition 1.8.3. Let G1 and G2 be groups. Then a mapping f : G1 → G2 is a (group)
homomorphism if

f (g1 g2 ) = f (g1 )f (g2 ) for any g1 , g2 ∈ G1 .

As with rings, we have, in addition,


(1) f is an epimorphism if it is surjective.
(2) f is a monomorphism if it is injective.
(3) f is an isomorphism if it is bijective; that is, both surjective and injective. In this
case, G1 and G2 are said to be isomorphic groups, which we denote by G1 ≅ G2 .
(4) f is an endomorphism if G1 = G2 ; that is, a homomorphism from a group to itself.
(5) f is an automorphism if G1 = G2 , and f is an isomorphism.

Lemma 1.8.4. Let G1 and G2 be groups, and let f : G1 → G2 be a homomorphism. Then
(1) f (1) = 1, where the first 1 is the identity element of G1 , and the second is the identity
element of G2 .
(2) f (g⁻¹) = (f (g))⁻¹ for any g ∈ G1 .

If A is a set, |A| denotes the size of A.

Theorem 1.8.5. If A1 and A2 are sets with |A1 | = |A2 |, then SA1 ≅ SA2 . If |A| = n with n
finite, we call SA the symmetric group on n elements, which we denote by Sn . Further, we
have |Sn | = n!.
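For small n, the symmetric group Sn can be listed explicitly. The sketch below represents a permutation of {0, . . . , n − 1} as a tuple whose i-th entry is the image of i; the function names are hypothetical. It checks the order n! and the statement of Theorem 1.8.2 on commutativity.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Composition of permutations stored as tuples: first t, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def symmetric_group(n):
    """All permutations of {0, ..., n-1} as tuples."""
    return list(permutations(range(n)))

def is_abelian(group):
    """Check commutativity of composition on every pair of elements."""
    return all(compose(s, t) == compose(t, s) for s in group for t in group)

S3 = symmetric_group(3)
order_matches = len(S3) == factorial(3)
```

For n = 2 the group is abelian, while for n = 3 two transpositions already fail to commute.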

Subgroups are defined in an analogous manner to subrings. Special types of
subgroups, called normal subgroups, take the place in group theory that ideals play in
ring theory.

Definition 1.8.6. A subset H of a group G is a subgroup if H ≠ ∅ and H forms a group
under the same operation as G. Equivalently, H is a subgroup if H ≠ ∅, and H is closed
under the operation and inverses.

Definition 1.8.7. If H is a subgroup of a group G, then a left coset of H is a subset
of G of the form gH = {gh : h ∈ H}. A right coset of H is a subset of G of the form
Hg = {hg : h ∈ H}.

As with rings, the cosets of a subgroup partition a group. We call the number of
right cosets of a subgroup H in a group G the index of H in G, denoted |G : H|. One
can prove that the number of right cosets is equal to the number of left cosets. For
finite groups, we have the following beautiful result called Lagrange’s theorem.


Theorem 1.8.8 (Lagrange’s theorem). Let G be a finite group and H a subgroup. Then
the order of H divides the order of G. In particular,

|G| = |H||G : H|.
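Lagrange's theorem can be checked by listing cosets in a small group. The sketch below assumes G = S3, with permutations stored as tuples, and the two-element subgroup H generated by the transposition of 0 and 1; both choices are ours and serve only as an illustration.

```python
from itertools import permutations

def compose(s, t):
    """Composition of permutations stored as tuples: first t, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

G = list(permutations(range(3)))     # the symmetric group S_3
H = [(0, 1, 2), (1, 0, 2)]           # identity and the transposition of 0 and 1

left_cosets = {frozenset(compose(g, h) for h in H) for g in G}
right_cosets = {frozenset(compose(h, g) for h in H) for g in G}

index = len(right_cosets)            # the index |G : H|
# The cosets are pairwise disjoint and cover G exactly when their sizes add up to |G|.
partition = sum(len(c) for c in left_cosets) == len(G)
```

There are three left cosets and three right cosets, each with two elements, so |G| = 6 = 2 ⋅ 3 = |H||G : H|.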

Normal subgroups take the place of ideals in group theory.

Definition 1.8.9. A subgroup H of a group G is a normal subgroup, denoted H ⊲ G, if
every left coset of H is also a right coset; that is, gH = Hg for each g ∈ G. Note that this
does not say that g and H commute elementwise, just that the subsets gH and Hg are
the same. Equivalently, H is normal if g⁻¹Hg = H for any g ∈ G.

Normal subgroups allow us to construct factor groups, just as ideals allowed us
to construct factor rings.

Theorem 1.8.10. Let H be a normal subgroup of a group G. Let G/H be the set of all
cosets of H in G; that is,

G/H = {gH : g ∈ G}.

We define multiplication on G/H in the following manner:

(g1 H)(g2 H) = g1 g2 H.

Then G/H forms a group called the factor group or quotient group of G modulo H.
The identity element of G/H is 1H, and the inverse of gH is g⁻¹H.
Further, if G is abelian, then G/H is also abelian.

Finally, as with ideals and factor rings, normal subgroups and factor groups are
closely tied to homomorphisms.

Definition 1.8.11. Let G1 and G2 be groups, and let f : G1 → G2 be a homomorphism.
Then the kernel of f , denoted ker(f ), is

ker(f ) = {g ∈ G1 : f (g) = 1}.

The image of f , denoted im(f ), is the range of f within G2 . That is,

im(f ) = {h ∈ G2 : there exists g ∈ G1 with f (g) = h}.

Theorem 1.8.12 (Group isomorphism theorem). Let G1 and G2 be groups, and let f :
G1 → G2 be a homomorphism. Then
(1) ker(f ) is a normal subgroup in G1 , im(f ) is a subgroup of G2 , and

G1 / ker(f ) ≅ im(f ).

(2) Conversely, suppose that H is a normal subgroup of a group G. Then the map f :
G → G/H, given by f (g) = gH for g ∈ G is a homomorphism, whose kernel is H and
whose image is G/H.
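The group isomorphism theorem can be illustrated with the sign of a permutation. The sketch below assumes the standard fact that the sign, computed here by counting inversions, is a homomorphism from S3 to the multiplicative group {1, −1}; the example and the function names are our own choice.

```python
from itertools import permutations

def compose(s, t):
    """Composition of permutations stored as tuples: first t, then s."""
    return tuple(s[t[i]] for i in range(len(t)))

def sign(s):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(1 for i in range(len(s))
                     for j in range(i + 1, len(s)) if s[i] > s[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))
is_hom = all(sign(compose(s, t)) == sign(s) * sign(t) for s in G for t in G)
kernel = [s for s in G if sign(s) == 1]
image = {sign(s) for s in G}

# Normality of the kernel: gH and Hg agree as sets for every g in G.
is_normal = all({compose(g, h) for h in kernel} == {compose(h, g) for h in kernel}
                for g in G)
cosets = {frozenset(compose(g, h) for h in kernel) for g in G}
```

The kernel has three elements and is normal, and the number of its cosets equals the number of elements in the image, as the theorem predicts for G1/ker(f) ≅ im(f).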


1.9 Exercises
1. Let ϕ : K → R be a homomorphism from a field K to a ring R. Show: Either ϕ(a) = 0
for all a ∈ K, or ϕ is a monomorphism.
2. Let R be a ring and M ≠ ∅ an arbitrary set. Show that the following are equivalent:
(i) The ring of all mappings from M to R is a field.
(ii) M contains only one element and R is a field.
3. Let π be a set of prime numbers. Define

ℚπ = {a/b : all prime divisors of b are in π}.

(i) Show that ℚπ is a subring of ℚ.
(ii) Let R be a subring of ℚ and let a/b ∈ R with coprime integers a, b. Show that
1/b ∈ R.
(iii) Determine all subrings R of ℚ. (Hint: Consider the set of all prime divisors of
denominators of reduced elements of R.)
4. Prove Lemma 1.5.2.
5. Let R be a commutative ring with an identity 1 ∈ R. Let A, B and C be ideals in R.
Define A + B := {a + b : a ∈ A, b ∈ B} and AB := ⟨{ab : a ∈ A, b ∈ B}⟩. Show:
(i) A + B ⊲ R, A + B = ⟨A ∪ B⟩
(ii) AB = {a1 b1 + ⋅ ⋅ ⋅ + an bn : n ∈ ℕ, ai ∈ A, bi ∈ B}, AB ⊂ A ∩ B
(iii) A(B + C) = AB + AC, (A + B)C = AC + BC, (AB)C = A(BC)
(iv) A = R ⇔ A ∩ R∗ ≠ ∅
(v) a, b ∈ R ⇒ ⟨a⟩ + ⟨b⟩ = {xa + yb : x, y ∈ R}
(vi) a, b ∈ R ⇒ ⟨a⟩⟨b⟩ = ⟨ab⟩. Here, ⟨a⟩ = Ra = {xa : x ∈ R}.
6. Solve the following congruence:

3x ≡ 5 mod 7.

Is this congruence also solvable mod 17?


7. Show that the set of 2 × 2 matrices over a ring R forms a ring.
8. Prove Lemma 1.4.8.
9. Prove that if R is a ring with identity and S = {r = m ⋅ 1 : r ∈ R, m ∈ ℤ} then S is a
subring of R containing the identity.

2 Maximal and prime ideals
2.1 Maximal and prime ideals
In the first chapter, we defined ideals I in a ring R, and then the factor ring R/I of R
modulo the ideal I. We saw, furthermore, that if R is commutative, then R/I is also
commutative, and if R has an identity, then so does R/I. This raises further questions
concerning the structure of factor rings. In particular, we can ask under what con-
ditions does R/I form an integral domain, and under what conditions does R/I form
a field. These questions lead us to define certain special properties of ideals, called
prime ideals and maximal ideals.
Let us look back at the integers ℤ. Recall that each nonzero proper ideal in ℤ has the
form nℤ for some n > 1, and the resulting factor ring ℤ/nℤ is isomorphic to ℤn . We
proved the following result:

Theorem 2.1.1. ℤn = ℤ/nℤ is an integral domain if and only if n = p is a prime.
Furthermore, ℤn is a field, again, if and only if n = p is a prime.

Hence, for the integers ℤ, a factor ring is a field if and only if it is an integral
domain. We will see later that this is not true in general. However, what is clear is
that special ideals nℤ lead to integral domains and fields when n is a prime. We look
at the ideals pℤ with p a prime in two different ways, and then use these in subsequent
sections to give the general definitions. We first need a famous result, Euclid’s lemma,
from number theory. For integers a, b, the notation a|b means that a divides b.

Lemma 2.1.2 (Euclid). If p is a prime and p|ab, then p|a or p|b.

Proof. Recall that the greatest common divisor or GCD of two integers a, b is an integer
d > 0 such that d is a common divisor of both a and b, and if d1 is another common
divisor of a and b, then d1 |d. We express the GCD of a, b by d = (a, b). It is known that
for any two integers a, b, their GCD exists and is unique, and is the least positive linear
combination of a and b; that is, the least positive integer of the form ax+by for integers
x, y. The integers a, b are relatively prime if their GCD is 1, (a, b) = 1. In this case, 1 is a
linear combination of a and b (see Chapter 3 for proofs and more details).
Now suppose p|ab, where p is a prime. If p does not divide a, then since the only
positive divisors of p are 1 and p, it follows that (a, p) = 1. Hence, 1 is expressible as
a linear combination of a and p. That is, ax + py = 1 for some integers x, y. Multiply
through by b, so that

abx + pby = b.

Now p|ab, so p|abx and p|pby. Therefore, p|(abx + pby); that is, p|b.
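The linear combination ax + py = 1 used in the proof can be computed with the extended Euclidean algorithm. The following is a minimal sketch; the function name `extended_gcd` and the sample values p = 7, a = 10, b = 21 are chosen only for illustration.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and a*x + b*y == d."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: a*old_x + b*old_y == old_r and a*x + b*y == r.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

p, a, b = 7, 10, 21            # p divides a*b = 210, but p does not divide a
d, x, y = extended_gcd(a, p)   # coefficients with a*x + p*y == 1
lhs = a * b * x + p * b * y    # the left-hand side abx + pby from the proof
```

Multiplying the identity ax + py = 1 by b reproduces b as abx + pby, and both summands are visibly divisible by p.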

https://doi.org/10.1515/9783110603996-002

Brought to you by | Stockholm University Library


Authenticated
Download Date | 10/13/19 8:58 AM

We now recast this lemma in two different ways in terms of the ideal pℤ. Notice
that pℤ consists precisely of all the multiples of p. Hence, p|ab is equivalent to ab ∈
pℤ.

Lemma 2.1.3. If p is a prime and ab ∈ pℤ, then a ∈ pℤ, or b ∈ pℤ.

This conclusion will be taken as a motivation for the definition of a prime ideal in
the next section.

Lemma 2.1.4. If p is a prime and pℤ ⊂ nℤ, then n = 1, or n = p. That is, every ideal in
ℤ containing pℤ with p a prime is either all of ℤ or pℤ.

Proof. Suppose that pℤ ⊂ nℤ. Then p ∈ nℤ; therefore, p is a multiple of n. Since p is a
prime, it follows easily that either n = 1, or n = p.

In Section 2.3, the conclusion of this lemma will be taken as a motivation for the
definition of a maximal ideal.

2.2 Prime ideals and integral domains


Motivated by Lemma 2.1.3, we make the following general definition for commutative
rings R with identity:

Definition 2.2.1. Let R be a commutative ring. An ideal P in R with P ≠ R is a prime
ideal if whenever ab ∈ P with a, b ∈ R, then either a ∈ P, or b ∈ P.

This property of an ideal P is precisely what is necessary and sufficient to make the
factor ring R/P an integral domain.

Theorem 2.2.2. Let R be a commutative ring with an identity 1 ≠ 0, and let P be a
nontrivial ideal in R. Then P is a prime ideal if and only if the factor ring R/P is an
integral domain.

Proof. Let R be a commutative ring with an identity 1 ≠ 0, and let P be a prime ideal.
We show that R/P is an integral domain. From the results in the last chapter, we have
that R/P is again a commutative ring with an identity. Therefore, we must show that
there are no zero divisors in R/P. Suppose that (a + P)(b + P) = 0 in R/P. The zero
element in R/P is 0 + P and, hence,

(a + P)(b + P) = 0 = 0 + P ⟹ ab + P = 0 + P ⟹ ab ∈ P.

However, P is a prime ideal; therefore, we must have a ∈ P, or b ∈ P. If a ∈ P, then
a + P = P = 0 + P so a + P = 0 in R/P. The identical argument works if b ∈ P. Therefore,
there are no zero divisors in R/P and, hence, R/P is an integral domain.
Conversely, suppose that R/P is an integral domain. We must show that P is a
prime ideal. Suppose that ab ∈ P. Then (a + P)(b + P) = ab + P = 0 + P. Hence, in R/P,


we have

(a + P)(b + P) = 0.

However, R/P is an integral domain, so it has no zero divisors. It follows that either
a + P = 0 and, hence, a ∈ P or b + P = 0, and b ∈ P. Therefore, either a ∈ P, or b ∈ P.
Therefore, P is a prime ideal.

In a commutative ring R, we can define a multiplication of ideals. We then obtain
an exact analog of Euclid’s lemma. Since R is commutative, each ideal is 2-sided.

Definition 2.2.3. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B
be ideals in R. Define

AB = {a1 b1 + ⋅ ⋅ ⋅ + an bn : ai ∈ A, bi ∈ B, n ∈ ℕ}.

That is, AB is the set of finite sums of products ab with a ∈ A and b ∈ B.

Lemma 2.2.4. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B be
ideals in R. Then AB is an ideal.

Proof. We must verify that AB is a subring, and that it is closed under multiplication
from R. Let r1 , r2 ∈ AB. Then

r1 = a1 b1 + ⋅ ⋅ ⋅ + an bn for some ai ∈ A, bi ∈ B,

and

r2 = a′1 b′1 + ⋅ ⋅ ⋅ + a′m b′m for some a′i ∈ A, b′i ∈ B.

Then

r1 ± r2 = a1 b1 + ⋅ ⋅ ⋅ + an bn ± a′1 b′1 ± ⋅ ⋅ ⋅ ± a′m b′m ,

which is clearly in AB. Furthermore, r1 ⋅ r2 is the sum of all the products ai bi a′j b′j ; that is,

r1 ⋅ r2 = a1 b1 a′1 b′1 + ⋅ ⋅ ⋅ + an bn a′m b′m .

Consider, for example, the first term a1 b1 a′1 b′1 . Since R is commutative, this is equal to

(a1 a′1 )(b1 b′1 ).

Now a1 a′1 ∈ A since A is a subring, and b1 b′1 ∈ B since B is a subring. Hence, this term
is in AB. Similarly, for each of the other terms. Therefore, r1 r2 ∈ AB and, hence, AB is
a subring.
Now let r ∈ R, and consider rr1 . This is then

rr1 = ra1 b1 + ⋅ ⋅ ⋅ + ran bn .

Now rai ∈ A for each i since A is an ideal. Hence, each summand is in AB, and then
rr1 ∈ AB. Therefore, AB is an ideal.


Lemma 2.2.5. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B be
ideals in R. If P is a prime ideal in R, then AB ⊂ P implies that A ⊂ P or B ⊂ P.

Proof. Suppose that AB ⊂ P with P a prime ideal, and suppose that B is not contained
in P. We show that A ⊂ P. Since AB ⊂ P, each product ai bj ∈ P. Choose a b ∈ B with
b ∉ P, and let a be an arbitrary element of A. Then ab ∈ P. Since P is a prime ideal,
this implies either a ∈ P, or b ∈ P. But by assumption b ∉ P, so a ∈ P. Since a was
arbitrary, we have A ⊂ P.
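In ℤ, Lemma 2.2.5 can be tested numerically. Here we use two standard facts about ideals of ℤ: (mℤ)(nℤ) = mnℤ, and aℤ ⊂ pℤ exactly when p divides a. The sketch below is only a finite check over small generators; the helper names contained_in and lemma_holds and the search bound are our own choices.

```python
def contained_in(a, p):
    """aZ ⊂ pZ holds exactly when p divides a (p nonzero)."""
    return a % p == 0


def lemma_holds(p, bound=30):
    """Check: (mZ)(nZ) = mnZ ⊂ pZ implies mZ ⊂ pZ or nZ ⊂ pZ, for 1 <= m, n <= bound."""
    return all(
        contained_in(m, p) or contained_in(n, p)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
        if contained_in(m * n, p)
    )


print(lemma_holds(7))  # 7Z is a prime ideal
print(lemma_holds(6))  # 6Z is not prime: (2Z)(3Z) ⊂ 6Z, but neither factor lies in 6Z
```

The second call fails the implication, which shows that the hypothesis that P is prime cannot be dropped.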

2.3 Maximal ideals and fields


Now, motivated by Lemma 2.1.4, we define a maximal ideal.

Definition 2.3.1. Let R be a ring and I an ideal in R. Then I is a maximal ideal if I ≠ R,


and if J is an ideal in R with I ⊂ J, then I = J, or J = R.

If R is a commutative ring with an identity, this property of an ideal I is precisely
what is necessary and sufficient for R/I to be a field.

Theorem 2.3.2. Let R be a commutative ring with an identity 1 ≠ 0, and let I be an ideal
in R. Then I is a maximal ideal if and only if the factor ring R/I is a field.

Proof. Suppose that R is a commutative ring with an identity 1 ≠ 0, and let I be an


ideal in R. Suppose first that I is a maximal ideal, and we show that the factor ring R/I
is a field.
Since R is a commutative ring with an identity, the factor ring R/I is also a com-
mutative ring with an identity. We must show then that each nonzero element of R/I
has a multiplicative inverse. Suppose then that r + I ∈ R/I is a nonzero element
of R/I. It follows that r ∉ I. Consider the set ⟨r, I⟩ = {rx + i : x ∈ R, i ∈ I}. This is also
an ideal (see exercises), called the ideal generated by r and I. Clearly,
I ⊂ ⟨r, I⟩, and since r ∉ I, and r = r ⋅ 1 + 0 ∈ ⟨r, I⟩, it follows that ⟨r, I⟩ ≠ I. Since I is
a maximal ideal, it follows that ⟨r, I⟩ = R, the whole ring. Hence, the identity element
1 ∈ ⟨r, I⟩, and so, there exist elements x ∈ R and i ∈ I such that 1 = rx + i. But then
1 ∈ rx + I = (r + I)(x + I), and so, 1 + I = (r + I)(x + I). Since 1 + I is the multiplicative identity of
R/I, it follows that x + I is the multiplicative inverse of r + I in R/I. Since r + I was an
arbitrary nonzero element of R/I, it follows that R/I is a field.
Now suppose that R/I is a field for an ideal I. We show that I must be maximal.
Suppose then that I1 is an ideal with I ⊂ I1 and I ≠ I1 . We must show that I1 is all of R.
Since I ≠ I1 , there exists an r ∈ I1 with r ∉ I. Therefore, the element r + I is nonzero in
the factor ring R/I, and since R/I is a field, it must have a multiplicative inverse x + I.
Hence, (r + I)(x + I) = rx + I = 1 + I and, therefore, there is an i ∈ I with 1 = rx + i. Since
r ∈ I1 , and I1 is an ideal, we get that rx ∈ I1 . In addition, since I ⊂ I1 , it follows that
rx + i ∈ I1 , and so, 1 ∈ I1 . If r1 is an arbitrary element of R, then r1 ⋅ 1 = r1 ∈ I1 . Hence,


R ⊂ I1 , and so, R = I1 . Therefore, I is a maximal ideal.
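The first half of the proof can be illustrated in R = ℤ with I = nℤ: an equation 1 = rx + i with i ∈ nℤ says exactly that rx ≡ 1 mod n. The following sketch simply searches for such an x; the name inverse_mod is ours, and a brute-force search is used instead of anything more efficient so that the code stays close to the statement.

```python
def inverse_mod(r, n):
    """Search for x with r*x ≡ 1 (mod n); return None if no such x exists."""
    for x in range(1, n):
        if (r * x) % n == 1:
            return x
    return None


# I = 7Z is maximal in Z, so every nonzero coset r + 7Z has an inverse.
print({r: inverse_mod(r, 7) for r in range(1, 7)})

# I = 6Z is not maximal (6Z ⊂ 2Z ≠ Z), and the coset 2 + 6Z has no inverse.
print(inverse_mod(2, 6))
```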

Recall that a field is already an integral domain. Combining this with the ideas of
prime and maximal ideals we obtain:

Theorem 2.3.3. Let R be a commutative ring with an identity 1 ≠ 0. Then each maximal
ideal is a prime ideal.

Proof. Suppose that R is a commutative ring with an identity and I is a maximal ideal
in R. Then from Theorem 2.3.2, we have that the factor ring R/I is a field. But a field is
an integral domain, so R/I is an integral domain. Therefore, from Theorem 2.2.2, we
have that I must be a prime ideal.

The converse is not true in general. That is, there are prime ideals that are not
maximal. Consider, for example, R = ℤ the integers and I = {0}. Then I is an ideal,
and R/I = ℤ/{0} ≅ ℤ is an integral domain. Hence, {0} is a prime ideal. However, ℤ is
not a field, so {0} is not maximal. Note, however, that in the integers ℤ, a nonzero proper ideal
is maximal if and only if it is a prime ideal.

2.4 The existence of maximal ideals


In this section, we prove that in any commutative ring R with an identity 1 ≠ 0, there do exist maximal
ideals. Furthermore, given an ideal I ≠ R, then there exists a maximal ideal I0 such
that I ⊂ I0 . To prove this, we need three important equivalent results from logic and
set theory.
First, recall that a partial order ≤ on a set S is a reflexive, antisymmetric, transitive relation on S.
That is, a ≤ a for all a ∈ S; if a ≤ b and b ≤ a, then a = b; and if a ≤ b, b ≤ c, then a ≤ c. This is a “partial” order since
there may exist elements a, b ∈ S, where neither a ≤ b, nor b ≤ a. If A is any set, then it
is clear that containment of subsets is a partial order on the power set 𝒫 (A).
If ≤ is a partial order on a set M, then a chain on M is a subset K ⊂ M such that
a, b ∈ K implies that a ≤ b or b ≤ a. A chain K on M is bounded if there exists an m ∈ M
such that k ≤ m for all k ∈ K. The element m is called an upper bound for K. An element
m0 ∈ M is maximal if whenever m ∈ M with m0 ≤ m, then m = m0 . We now state the
three important results from logic.

Zorn’s lemma. If each chain of M has an upper bound in M, then there is at least one
maximal element in M.

Axiom of well-ordering. Each set M can be well-ordered, such that each nonempty sub-
set of M contains a least element.

Axiom of choice. Let {Mi : i ∈ I} be a nonempty collection of nonempty sets. Then there
is a mapping f : I → ⋃i∈I Mi with f (i) ∈ Mi for all i ∈ I.


The following can be proved.

Theorem 2.4.1. Zorn’s lemma, the axiom of well-ordering and the axiom of choice are
all equivalent.

We now show the existence of maximal ideals in commutative rings with identity.

Theorem 2.4.2. Let R be a commutative ring with an identity 1 ≠ 0, and let I be an ideal
in R with I ≠ R. Then there exists a maximal ideal I0 in R with I ⊂ I0 . In particular, a commutative ring
with an identity 1 ≠ 0 contains maximal ideals.

Proof. Let I be an ideal in the commutative ring R with I ≠ R. We must show that there exists a
maximal ideal I0 in R with I ⊂ I0 .
Let

M = {X : X is an ideal with I ⊂ X ≠ R}.

Then M is partially ordered by containment, and M is nonempty since I ∈ M. We want to show first that each chain in
M has an upper bound in M. The empty chain has the upper bound I. If K = {Xj : Xj ∈ M, j ∈ J} is a nonempty chain, let

X′ = ⋃j∈J Xj .

If a, b ∈ X′ , then there exist i, j ∈ J with a ∈ Xi , b ∈ Xj . Since K is a chain, either
Xi ⊂ Xj or Xj ⊂ Xi . Without loss of generality, suppose that Xi ⊂ Xj so that a, b ∈ Xj .
Then a ± b ∈ Xj ⊂ X′ , and ab ∈ Xj ⊂ X′ , since Xj is an ideal. Furthermore, if r ∈ R, then
ra ∈ Xj ⊂ X′ , since Xj is an ideal. Therefore, X′ is an ideal in R, and I ⊂ X′ since I ⊂ Xj for each j ∈ J.
Since Xj ≠ R, it follows that 1 ∉ Xj for all j ∈ J. Therefore, 1 ∉ X′ , and so X′ ≠ R. Hence, X′ ∈ M, and it
follows that under the partial order of containment X′ is an upper bound for K in M.
We now use Zorn’s lemma. From the argument above, we have that each chain
has an upper bound in M. Hence, for an ideal I ≠ R, the set M above has a maximal element I0 .
If J is an ideal with I0 ⊂ J ≠ R, then J ∈ M, and so J = I0 by the maximality of I0 in M.
This maximal element I0 is then a maximal ideal containing I.

2.5 Principal ideals and principal ideal domains


Recall again that in the integers ℤ, each ideal I is of the form nℤ for some integer n.
Hence, in ℤ, each ideal can be generated by a single element.

Lemma 2.5.1. Let R be a commutative ring and a1 , . . . , an be elements of R. Then the set

⟨a1 , . . . , an ⟩ = {r1 a1 + ⋅ ⋅ ⋅ + rn an : ri ∈ R}

forms an ideal in R called the ideal generated by a1 , . . . , an .

Proof. The proof is straightforward. Let

a = r1 a1 + ⋅ ⋅ ⋅ + rn an , b = s1 a1 + ⋅ ⋅ ⋅ + sn an


with r1 , . . . , rn , s1 , . . . , sn elements of R, be two elements of ⟨a1 , . . . , an ⟩. Then

a ± b = (r1 ± s1 )a1 + ⋅ ⋅ ⋅ + (rn ± sn )an ∈ ⟨a1 , . . . , an ⟩


ab = (r1 s1 a1 )a1 + (r1 s2 a1 )a2 + ⋅ ⋅ ⋅ + (rn sn an )an ∈ ⟨a1 , . . . , an ⟩,

so ⟨a1 , . . . , an ⟩ forms a subring. Furthermore, if r ∈ R, we have

ra = (rr1 )a1 + ⋅ ⋅ ⋅ + (rrn )an ∈ ⟨a1 , . . . , an ⟩,

and so ⟨a1 , . . . , an ⟩ is an ideal.

Definition 2.5.2. Let R be a commutative ring. An ideal I ⊂ R is a principal ideal if it


has a single generator. That is,

I = ⟨a⟩ = aR for some a ∈ R.

We now restate Theorem 1.4.7 of Chapter 1.

Theorem 2.5.3. Every nonzero ideal in ℤ is a principal ideal.

Proof. Every ideal I in ℤ is of the form nℤ. This is the principal ideal generated
by n.
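In ℤ, the single generator of a finitely generated ideal ⟨a1 , . . . , an ⟩ can be computed explicitly: it is the greatest common divisor of the ai (greatest common divisors are treated in Chapter 3). The sketch below is an illustration under that assumption; the names generator and ideal_sample are ours, and ideal_sample only enumerates finitely many elements of the ideal, with coefficients bounded by coeff_bound.

```python
from functools import reduce
from math import gcd


def generator(*elements):
    """Nonnegative d with <a1, ..., an> = dZ in the integers."""
    return reduce(gcd, elements, 0)


def ideal_sample(elements, coeff_bound=3):
    """All sums r1*a1 + ... + rn*an with |ri| <= coeff_bound (a finite sample of the ideal)."""
    sums = {0}
    for a in elements:
        sums = {s + r * a for s in sums for r in range(-coeff_bound, coeff_bound + 1)}
    return sums


d = generator(12, 18, 30)
sample = ideal_sample([12, 18, 30])
# Every sampled element is a multiple of d, and d itself occurs (d = 18 - 12).
print(d, all(s % d == 0 for s in sample), d in sample)
```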

Definition 2.5.4. A principal ideal domain or PID is an integral domain, in which every
ideal is principal.

Corollary 2.5.5. The integers ℤ are a principal ideal domain.

We mention that the set of polynomials K[x] with coefficients from a field K is also
a principal ideal domain. We will return to this in the next chapter.
Not every integral domain is a PID. Consider K[x, y] = (K[x])[y], the set of polyno-
mials over K in two variables x, y (see Chapter 4). Let I consist of all the polynomials
with zero constant term.

Lemma 2.5.6. The set I in K[x, y] as defined above is an ideal, but not a principal ideal.

Proof. We leave the proof that I forms an ideal to the exercises. To show that it is not
a principal ideal, suppose I = ⟨p(x, y)⟩. Now the polynomial q(x) = x has zero con-
stant term, so q(x) ∈ I. Hence, p(x, y) cannot be a constant polynomial. In addition,
if p(x, y) had any terms with y in them, there would be no way to multiply p(x, y) by
a polynomial h(x, y) and obtain just x. Therefore, p(x, y) can contain no terms with y
in them. But the same argument, using s(y) = y, shows that p(x, y) cannot have any
terms with x in them. Therefore, there can be no such p(x, y) generating I, and so, I is
not principal, and K[x, y] is not a principal ideal domain.


2.6 Exercises
1. Consider the set ⟨r, I⟩ = {rx + i : x ∈ R, i ∈ I}, where I is an ideal. Prove that this is
also an ideal called the ideal generated by r and I, denoted ⟨r, I⟩.
2. Let R and S be commutative rings, and let ϕ : R → S be a ring epimorphism. Let
M be a maximal ideal in R. Show:
ϕ(M) is a maximal ideal in S if and only if ker(ϕ) ⊂ M. Is ϕ(M) always a prime
ideal of S?
3. Let A1 , . . . , At be ideals of a commutative ring R. Let P be a prime ideal of R. Show:
(i) A1 ∩ ⋅ ⋅ ⋅ ∩ At ⊂ P ⇒ Aj ⊂ P for at least one index j.
(ii) A1 ∩ ⋅ ⋅ ⋅ ∩ At = P ⇒ Aj = P for at least one index j.
4. Which of the following ideals A are prime ideals of R? Which are maximal ideals?
(i) A = (x), R = ℤ[x].
(ii) A = (x²), R = ℤ[x].
(iii) A = (1 + √5), R = ℤ[√5].
(iv) A = (x, y), R = ℚ[x, y].
5. Let w = ½(1 + √−3). Show that ⟨2⟩ is a prime ideal and even a maximal ideal of
ℤ[w], but ⟨2⟩ is neither a prime ideal nor a maximal ideal of ℤ[i], i = √−1 ∈ ℂ.
6. Let R = {a/b : a, b ∈ ℤ, b odd}. Show that R is a subring of ℚ, and that there is only
one maximal ideal M in R.
7. Let R be a commutative ring with an identity. Let x, y ∈ R and x ≠ 0 not be a
zero divisor. Furthermore, let ⟨x⟩ be a prime ideal with ⟨x⟩ ⊂ ⟨y⟩ ≠ R. Show that
⟨x⟩ = ⟨y⟩.
8. Consider K[x, y] the set of polynomials over K in two variables x, y. Let I consist of
all the polynomials with zero constant term. Prove that the set I is an ideal.

3 Prime elements and unique factorization domains
3.1 The fundamental theorem of arithmetic
The integers ℤ have served as much of our motivation for properties of integral do-
mains. In the last chapter, we saw that ℤ is a principal ideal domain, and furthermore,
that prime ideals ≠ {0} are maximal. From the viewpoint of the multiplicative structure
of ℤ and the viewpoint of classical number theory, the most important property of ℤ
is the fundamental theorem of arithmetic. This states that any integer n ≠ 0 is uniquely
expressible as a product of primes, where uniqueness is up to ordering and the intro-
duction of ±1; that is, units. In this chapter, we show that this property is not unique to
the integers, and there are many other integral domains, where this also holds. These
are called unique factorization domains, and we will present several examples. First,
we review the fundamental theorem of arithmetic, its proof and several other ideas
from classical number theory.

Theorem 3.1.1 (Fundamental theorem of arithmetic). Given any integer n ≠ 0, there is


a factorization

n = cp1 p2 ⋅ ⋅ ⋅ pk ,

where c = ±1 and p1 , . . . , pk are primes. Furthermore, this factorization is unique up to


the ordering of the factors.

There are two main ingredients that go into the proof: induction and Euclid’s
lemma. We presented this in the last chapter. In turn, however, Euclid’s lemma de-
pends upon the existence of greatest common divisors and their linear expressibility.
Therefore, to begin, we present several basic ideas from number theory.
The starting point for the theory of numbers is divisibility.

Definition 3.1.2. If a, b are integers, we say that a divides b, or that a is a factor or


divisor of b, if there exists an integer q such that b = aq. We denote this by a|b. b is
then a multiple of a. If b > 1 is an integer whose only factors are ±1, ±b, then b is a
prime; otherwise, b is composite.

The following properties of divisibility are straightforward consequences of the


definition.

Lemma 3.1.3. The following properties hold:


(1) a|b ⇒ a|bc for any integer c.
(2) a|b and b|c implies a|c.
(3) a|b and a|c implies that a|(bx + cy) for any integers x, y.
(4) a|b and b|a implies that a = ±b.
(5) If a|b and a > 0, b > 0, then a ≤ b.

https://doi.org/10.1515/9783110603996-003


(6) a|b if and only if ca|cb for any integer c ≠ 0.


(7) a|0 for all a ∈ ℤ, and 0|a only for a = 0.
(8) a| ± 1 only for a = ±1.
(9) a1 |b1 and a2 |b2 implies that a1 a2 |b1 b2 .

If b, c, x, y are integers, then an integer bx + cy is called a linear combination of b, c.


Thus, part (3) of Lemma 3.1.3 says that if a is a common divisor of b, c, then a divides
any linear combination of b and c.
Furthermore, note that if b > 1 is a composite, then there exists x > 0 and y > 0
such that b = xy, and from part (5), we must have 1 < x < b, 1 < y < b.
In ordinary arithmetic, given a, b, we can always attempt to divide a into b. The
next result, called the division algorithm, says that if a > 0, either a will divide b, or
the remainder of the division of b by a will be less than a.

Theorem 3.1.4 (Division algorithm). Given integers a, b with a > 0, then there exist
unique integers q and r such that b = qa + r, where either r = 0 or 0 < r < a.

One may think of q and r as the quotient and remainder, respectively, when divid-
ing b by a.

Proof. Given a, b with a > 0, consider the set

S = {b − qa ≥ 0 : q ∈ ℤ}.

If b > 0, then b + a ≥ 0, and the sum is in S. If b ≤ 0, then there exists a q > 0 with
−qa < b. Then b + qa > 0 and is in S. Therefore, in either case, S is nonempty. Hence, S
is a nonempty subset of ℕ ∪ {0} and, therefore, has a least element r. If r ≠ 0, we must
show that 0 < r < a. Suppose r ≥ a, then r = a + x with x ≥ 0, and x < r since a > 0.
Then b − qa = r = a + x ⇒ b − (q + 1)a = x. This means that x ∈ S. Since x < r, this
contradicts the minimality of r. Therefore, if r ≠ 0, it follows
that 0 < r < a.
The only thing left is to show the uniqueness of q and r. Suppose b = q1 a + r1 also, with 0 ≤ r1 < a.
Subtracting the two expressions gives (q − q1 )a = r1 − r, so a divides r1 − r. Since −a < r1 − r < a,
this forces r1 − r = 0, so r = r1 . Now

b − qa = b − q1 a ⇒ (q1 − q)a = 0,

but since a > 0, it follows that q1 − q = 0 so that q = q1 .
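The quotient and remainder of the division algorithm are easy to compute. In Python, the built-in divmod uses floor division, so for a > 0 it already returns a remainder with 0 ≤ r < a, also when b is negative. The wrapper division_algorithm below is our own name and is only a small sketch.

```python
def division_algorithm(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < a, for an integer b and a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    # Floor division guarantees 0 <= r < a whenever a > 0.
    q, r = divmod(b, a)
    return q, r


for b in (17, -17, 20, 0):
    q, r = division_algorithm(b, 5)
    print(f"{b} = {q} * 5 + {r}")
```

Note that for b = −17 and a = 5, the result is q = −4 and r = 3, not q = −3 and r = −2; the remainder is never negative.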

The next idea that is necessary is the concept of greatest common divisor.

Definition 3.1.5. Given nonzero integers a, b, their greatest common divisor or GCD
d > 0 is a positive integer such that it is their common divisor, that is, d|a and d|b, and
if d1 is any other common divisor, then d1 |d. We denote the greatest common divisor
of a, b by either gcd(a, b) or (a, b).


Certainly, if a, b are nonzero integers with a > 0 and a|b, then a = gcd(a, b).
The next result says that given any nonzero integers, they do have a greatest com-
mon divisor, and it is unique.

Theorem 3.1.6. Given nonzero integers a, b, their GCD exists, is unique, and can be char-
acterized as the least positive linear combination of a and b.

Proof. Given nonzero a, b, consider the set

S = {ax + by > 0 : x, y ∈ ℤ}.

Now, a² + b² = a ⋅ a + b ⋅ b > 0, so S is a nonempty subset of ℕ and, hence, has a least element,


d > 0. We show that d is the GCD.
First we must show that d is a common divisor. Now d = ax + by and is the least
such positive linear combination. By the division algorithm, a = qd + r with 0 ≤ r < d.
Suppose r ≠ 0. Then r = a − qd = a − q(ax + by) = (1 − qx)a − qby > 0. Hence, r is a
positive linear combination of a and b, and therefore in S. But then r < d, contradicting
the minimality of d in S. It follows that r = 0, and so, a = qd, and d|a. An identical
argument shows that d|b, and so, d is a common divisor of a and b. Let d1 be any other
common divisor of a and b. Then d1 divides any linear combination of a and b, and so
d1 |d. Therefore, d is the GCD of a and b.
Finally, we must show that d is unique. Suppose d1 is another GCD of a and b. Then
d1 > 0, and d1 is a common divisor of a, b. Then d1 |d since d is a GCD. Identically, d|d1
since d1 is a GCD. Therefore, d = ±d1 , and then d = d1 since they are both positive.
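The characterization of the GCD as the least positive linear combination can be checked by a direct search. The following sketch tries all coefficients x, y in a bounded window and keeps the smallest positive value of ax + by. The function name and the bound are our own assumptions, and the search only finds the true minimum if the window is large enough to contain a suitable pair (x, y).

```python
from math import gcd


def least_positive_combination(a, b, bound=50):
    """Smallest positive a*x + b*y with |x|, |y| <= bound, returned as (value, x, y)."""
    best = None
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            value = a * x + b * y
            if value > 0 and (best is None or value < best[0]):
                best = (value, x, y)
    return best


d, x, y = least_positive_combination(84, 30)
print(d, x, y, gcd(84, 30))
```

For a = 84 and b = 30, the least positive value found agrees with gcd(84, 30) = 6.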

If (a, b) = 1, then we say that a, b are relatively prime. It follows that a and b are
relatively prime if and only if 1 is expressible as a linear combination of a and b. We
need the following three results:

Lemma 3.1.7. If d = (a, b), then a = a1 d and b = b1 d with (a1 , b1 ) = 1.

Proof. If d = (a, b), then d|a, and d|b. Hence, a = a1 d, and b = b1 d. We have

d = ax + by = a1 dx + b1 dy.

Dividing both sides of the equation by d, we obtain

1 = a1 x + b1 y.

Therefore, (a1 , b1 ) = 1.

Lemma 3.1.8. For any integer c, we have that (a, b) = (a, b + ac).

Proof. Suppose (a, b) = d and (a, b + ac) = d1 . Now d is the least positive linear com-
bination of a and b. Suppose d = ax + by. d1 is a linear combination of a, b + ac so
that

d1 = ar + (b + ac)s = a(cs + r) + bs.


Hence, d1 is also a linear combination of a and b; therefore, d1 ≥ d. On the other hand,


d1 |a, and d1 |(b+ac), and so, d1 |b. Therefore, d1 |d, so d1 ≤ d. Combining these, we must
have d1 = d.
The next result, called the Euclidean algorithm, provides a technique for both find-
ing the GCD of two integers and expressing the GCD as a linear combination.

Theorem 3.1.9 (Euclidean algorithm). Given integers b and a > 0 with a ∤ b, the follow-
ing repeated divisions are formed:

b = q1 a + r1 , 0 < r1 < a
a = q2 r1 + r2 , 0 < r2 < r1
⋮
rn−2 = qn rn−1 + rn , 0 < rn < rn−1
rn−1 = qn+1 rn .

The last nonzero remainder rn is the GCD of a, b. Furthermore, rn can be expressed as


a linear combination of a and b by successively eliminating the ri ’s in the intermediate
equations.
Proof. In taking the successive divisions as outlined in the statement of the theorem,
each remainder ri gets strictly smaller and still nonnegative. Hence, it must finally end
with a zero remainder. Therefore, there is a last nonzero remainder rn . We must show
that this is the GCD.
Now from Lemma 3.1.8, the gcd (a, b) = (a, b − q1 a) = (a, r1 ) = (r1 , a − q2 r1 ) = (r1 , r2 ).
Continuing in this manner, we have then that (a, b) = (rn−1 , rn ) = rn since rn divides
rn−1 . This shows that rn is the GCD.
To express rn as a linear combination of a and b, first notice that

rn = rn−2 − qn rn−1 .

Substituting rn−1 = rn−3 − qn−1 rn−2 from the immediately preceding division, we get

rn = rn−2 − qn (rn−3 − qn−1 rn−2 ) = (1 + qn qn−1 )rn−2 − qn rn−3 .

Doing this successively, we ultimately express rn as a linear combination of a


and b.
Example 3.1.10. Find the GCD of 270 and 2412, and express it as a linear combination
of 270 and 2412.
We apply the Euclidean algorithm

2412 = 8 ⋅ 270 + 252


270 = 1 ⋅ 252 + 18
252 = 14 ⋅ 18.


Therefore, the last nonzero remainder is 18, which is the GCD. We now must express
18 as a linear combination of 270 and 2412.
From the first equation

252 = 2412 − 8 ⋅ 270,

which gives in the second equation

270 = 2412 − 8 ⋅ 270 + 18 ⇒ 18 = −1 ⋅ 2412 + 9 ⋅ 270,

which is the desired linear combination.
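The back-substitution in the Euclidean algorithm can be carried out alongside the divisions by keeping track of how each remainder is written in terms of a and b. The following is a minimal sketch of this standard bookkeeping (often called the extended Euclidean algorithm); the name extended_gcd is ours, and the function is written for positive integers a and b.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and d == a*x + b*y, for positive integers a, b."""
    old_r, r = a, b
    old_x, x = 1, 0  # invariant: old_r == a*old_x + b*old_y
    old_y, y = 0, 1  # invariant: r == a*x + b*y
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


d, x, y = extended_gcd(270, 2412)
print(d, x, y)  # 18 9 -1, that is, 18 = 9 * 270 - 1 * 2412
```

This reproduces the linear combination found by hand in Example 3.1.10.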

The next result that we need is Euclid’s lemma. We stated and proved this in the
last chapter, but we restate it here.

Lemma 3.1.11 (Euclid’s lemma). If p is a prime and p|ab, then p|a, or p|b.

We can now prove the fundamental theorem of arithmetic. Induction suffices to


show that there always exists such a decomposition into prime factors.

Lemma 3.1.12. Any integer n > 1 can be expressed as a product of primes, perhaps with
only one factor.

Proof. The proof is by induction. n = 2 is prime. Therefore, it is true at the lowest level.
Suppose that any integer 2 ≤ k < n can be decomposed into prime factors, we must
show that n then also has a prime factorization.
If n is prime, then we are done. Suppose then that n is composite. Hence, n = m1 m2
with 1 < m1 < n, 1 < m2 < n. By the inductive hypothesis, both m1 and m2 can be
expressed as products of primes. Therefore, n can also be expressed as a product of primes, using the primes from m1 and
m2 , completing the proof.
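The induction in this proof translates directly into a recursive procedure: if n has a proper divisor m1 , factor m1 and n/m1 separately; otherwise, n is prime. The sketch below follows that idea with trial division; the name prime_factors is ours, and the method is only practical for small n.

```python
from math import isqrt


def prime_factors(n):
    """Prime factors of an integer n > 1 in nondecreasing order, with multiplicity."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    for m1 in range(2, isqrt(n) + 1):
        if n % m1 == 0:
            # n = m1 * m2 is composite: factor both parts, as in the inductive step.
            return sorted(prime_factors(m1) + prime_factors(n // m1))
    return [n]  # no proper divisor exists, so n is prime


print(prime_factors(270))   # [2, 3, 3, 3, 5]
print(prime_factors(2412))  # [2, 2, 3, 3, 67]
```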

Before we continue to the fundamental theorem, we mention that the existence


of a prime decomposition, unique or otherwise, can be used to prove that the set of
primes is infinite. The proof we give goes back to Euclid and is quite straightforward.

Theorem 3.1.13. There are infinitely many primes.

Proof. Suppose that there are only finitely many primes p1 , . . . , pn . Each of these is
positive, so we can form the positive integer

N = p1 p2 ⋅ ⋅ ⋅ pn + 1.

From Lemma 3.1.12, N has a prime decomposition. In particular, there is a prime p,


which divides N. Then

p|(p1 p2 ⋅ ⋅ ⋅ pn + 1).


Since the only primes are assumed to be p1 , p2 , . . . , pn , it follows that p = pi for some i =
1, . . . , n. But then p|p1 p2 ⋅ ⋅ ⋅ pi ⋅ ⋅ ⋅ pn , so p cannot divide p1 ⋅ ⋅ ⋅ pn + 1, since otherwise p would
divide the difference 1. This is a contradiction. Therefore, p is not one of the given primes, showing that the list of primes must
be endless.
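Euclid's construction can be tried on a concrete list of primes. Note that the number N = p1 p2 ⋅ ⋅ ⋅ pn + 1 need not itself be prime; the proof only uses that N has some prime factor outside the list. The sketch below illustrates this with the first six primes; the helper name smallest_prime_factor is ours.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n, for n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1
p = smallest_prime_factor(N)
# N = 30031 = 59 * 509 is composite, but its prime factors are not in the list.
print(N, p, N // p, p in primes)
```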

We can now prove the fundamental theorem of arithmetic.

Proof. We assume that n ≥ 1. If n ≤ −1, we use c = −1 and apply the argument to −n. The
statement certainly holds for n = 1 with k = 0. Now suppose n > 1. From Lemma 3.1.12,
n has a prime decomposition:

n = p1 p2 ⋅ ⋅ ⋅ pm .

We must show that this is unique up to the ordering of the factors. Suppose then that
n has another such factorization n = q1 q2 ⋅ ⋅ ⋅ qk with the qi all prime. We must show
that m = k, and that the primes are the same. Now we have

n = p1 p2 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qk .

Assume that k ≥ m. From

n = p1 p2 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qk ,

it follows that p1 |q1 q2 ⋅ ⋅ ⋅ qk . From Lemma 3.1.11 then, we must have that p1 |qi for some i.
But qi is prime, and p1 > 1, so it follows that p1 = qi . Therefore, we can eliminate p1
and qi from both sides of the factorization to obtain

p2 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qi−1 qi+1 ⋅ ⋅ ⋅ qk .

Continuing in this manner, we can eliminate all the pi from the left side of the factor-
ization to obtain

1 = qm+1 ⋅ ⋅ ⋅ qk .

Since each qi is a prime and, hence, greater than 1, this is impossible unless the product on the right is empty. Therefore, m = k, and each prime
pi was included in the primes q1 , . . . , qm . Therefore, the factorizations differ only in the
order of the factors, proving the theorem.

3.2 Prime elements, units and irreducibles


We now let R be an arbitrary integral domain and attempt to mimic the divisibility
definitions and properties.

Definition 3.2.1. Let R be an integral domain.


(1) Suppose that a, b ∈ R. Then a is a factor or divisor of b if there exists a c ∈ R with


b = ac. We denote this, as in the integers, by a|b. If a is a factor of b, then b is
called a multiple of a.
(2) An element a ∈ R is a unit if a has a multiplicative inverse within R; that is, there
exists an element a⁻¹ ∈ R with aa⁻¹ = 1.
(3) A prime element of R is an element p ≠ 0 such that p is not a unit, and if p|ab, then
p|a or p|b.
(4) An irreducible element in R is an element c ≠ 0 such that c is not a unit, and if
c = ab, then a or b must be a unit.
(5) a and b in R are associates if there exists a unit e ∈ R with a = eb.

Notice that in the integers ℤ, the units are just ±1. The set of prime elements co-
incides with the set of irreducible elements. In ℤ, these are precisely the prime
numbers and their negatives. On the other hand, if K is a field, every nonzero element is a unit. Therefore,
in K, there are no prime elements and no irreducible elements.
Recall that the modular rings ℤn are fields (and integral domains) when n is a
prime. In general, if n is not a prime then ℤn is a commutative ring with an identity,
and a unit is still an invertible element. We can characterize the units within ℤn .

Lemma 3.2.2. a ∈ ℤn is a unit if and only if (a, n) = 1.

Proof. Suppose (a, n) = 1. Then there exist x, y ∈ ℤ such that ax + ny = 1. This implies
that ax ≡ 1 mod n, which in turn implies that ax = 1 in ℤn and, therefore, a is a unit.
Conversely, suppose a is a unit in ℤn . Then there is an x ∈ ℤn with ax = 1. In terms
of congruence then

ax ≡ 1 mod n ⇒ n|(ax − 1) ⇒ ax − 1 = ny ⇒ ax − ny = 1.

Therefore, 1 is a linear combination of a and n and so (a, n) = 1.
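Lemma 3.2.2 can be confirmed for small n by computing the units of ℤn in two ways: by searching directly for inverses, and by the gcd condition. This is only a finite check, not a proof, and the function names units and coprime_residues are ours.

```python
from math import gcd


def units(n):
    """Units of Z_n, found by searching directly for a multiplicative inverse."""
    return [a for a in range(1, n) if any((a * x) % n == 1 for x in range(1, n))]


def coprime_residues(n):
    """Residues a with 1 <= a < n and gcd(a, n) = 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


for n in (7, 12, 15):
    print(n, units(n), units(n) == coprime_residues(n))
```

For n = 12, both computations give the units 1, 5, 7, 11.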

If R is an integral domain, then the set of units within R will form a group.

Lemma 3.2.3. If R is a commutative ring with an identity, then the set of units in R forms
an abelian group under ring multiplication. This is called the unit group of R, denoted
U(R).

Proof. The commutativity and associativity of U(R) follow from the ring properties.
The identity of U(R) is the multiplicative identity of R, whereas the ring multiplicative
inverse for each unit is the group inverse. We must show that U(R) is closed under
ring multiplication. If a ∈ R is a unit, we denote its multiplicative inverse by a⁻¹ . Now
suppose a, b ∈ U(R). Then a⁻¹ , b⁻¹ exist. It follows that

(ab)(b⁻¹ a⁻¹ ) = a(bb⁻¹ )a⁻¹ = aa⁻¹ = 1.

Hence, ab has an inverse, namely b⁻¹ a⁻¹ (= a⁻¹ b⁻¹ in a commutative ring) and, hence,
ab is also a unit. Therefore, U(R) is closed under ring multiplication.


In general, irreducible elements are not prime. Consider for example the subring
of the complex numbers (see exercises) given by

R = ℤ[i√5] = {x + iy√5 : x, y ∈ ℤ}.

This is a subring of the complex numbers ℂ and, hence, can have no zero divisors.
Therefore, R is an integral domain.
For an element x + iy√5 ∈ R, define its norm by

N(x + iy√5) = |x + iy√5|² = x² + 5y² .

Since x, y ∈ ℤ, it is clear that the norm of an element in R is a nonnegative integer.


Furthermore, if a ∈ R with N(a) = 0, then a = 0.
We have the following result concerning the norm:

Lemma 3.2.4. Let R and N be as above. Then


(1) N(ab) = N(a)N(b) for any elements a, b ∈ R.
(2) The units of R are those a ∈ R with N(a) = 1. In R, the only units are ±1.

Proof. The fact that the norm is multiplicative is straightforward and left to the exer-
cises. If a ∈ R is a unit, then there exists a multiplicative inverse b ∈ R with ab = 1.
Then N(ab) = N(a)N(b) = 1. Since both N(a) and N(b) are nonnegative integers, we
must have N(a) = N(b) = 1.
Conversely, suppose that N(a) = 1. If a = x + iy√5, then x² + 5y² = 1. Since x, y ∈ ℤ,
we must have y = 0 and x² = 1. Then a = x = ±1.

Using this lemma we can show that R possesses irreducible elements that are not
prime.

Lemma 3.2.5. Let R be as above. Then 3 = 3 + i0√5 is an irreducible element in R, but


3 is not prime.

Proof. Suppose that 3 = ab with a, b ∈ R and a, b nonunits. Then N(3) = 9 = N(a)N(b)


with neither N(a) = 1, nor N(b) = 1. Hence, N(a) = 3, and N(b) = 3. Let a = x + iy√5.
It follows that x² + 5y² = 3. Since x, y ∈ ℤ, this is impossible. Therefore, one of a or b
must be a unit, and 3 is an irreducible element.
We show that 3 is not prime in R. Let a = 2 + i√5 and b = 2 − i√5. Then ab = 9 and,
hence, 3|ab. Suppose 3|a so that a = 3c for some c ∈ R. Then

9 = N(a) = N(3)N(c) = 9N(c) ⇒ N(c) = 1.

Therefore, c is a unit in R, and from Lemma 3.2.4, we get c = ±1. Hence, a = ±3. This is
a contradiction, so 3 does not divide a. An identical argument shows that 3 does not
divide b. Therefore, 3 is not a prime element in R.
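The computations in this proof can be replayed with a small model of R = ℤ[i√5], in which the element x + iy√5 is stored as the pair (x, y). This representation and the names mul, norm, and divisible_by_3 are our own choices for illustration. Divisibility by 3 is tested componentwise, since 3(u + iv√5) = 3u + i(3v)√5.

```python
def mul(a, b):
    """Product (x1 + i*y1*sqrt5)(x2 + i*y2*sqrt5), with elements stored as pairs (x, y)."""
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 - 5 * y1 * y2, x1 * y2 + x2 * y1)


def norm(a):
    """N(x + i*y*sqrt5) = x^2 + 5*y^2."""
    x, y = a
    return x * x + 5 * y * y


def divisible_by_3(a):
    """3 divides x + i*y*sqrt5 in R exactly when 3 divides both x and y."""
    x, y = a
    return x % 3 == 0 and y % 3 == 0


a, b = (2, 1), (2, -1)
product = mul(a, b)
# x^2 + 5*y^2 = 3 forces |x| <= 1 and y = 0, so this window covers all candidates.
norm_3 = [(x, y) for x in range(-2, 3) for y in range(-1, 2) if norm((x, y)) == 3]
print(product, divisible_by_3(product), divisible_by_3(a), divisible_by_3(b), norm_3)
```

The output shows that ab = 9 is divisible by 3 while neither a nor b is, and that R has no element of norm 3.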

We now examine the relationship between prime elements and irreducibles.


Theorem 3.2.6. Let R be an integral domain. Then


(1) Each prime element of R is irreducible.
(2) p ∈ R is a prime element if and only if p ≠ 0, and ⟨p⟩ = pR is a prime ideal.
(3) p ∈ R is irreducible if and only if p ≠ 0, and ⟨p⟩ = pR is maximal in the set of all
principal ideals of R, which are not equal to R.

Proof. (1) Suppose that p ∈ R is a prime element, and p = ab. We must show that
either a or b must be a unit. Now p|ab, so either p|a, or p|b. Without loss of generality,
we may assume that p|a, so a = pr for some r ∈ R. Hence, p = ab = (pr)b = p(rb).
However, R is an integral domain, so p − prb = p(1 − rb) = 0 implies that 1 − rb = 0
and, hence, rb = 1. Therefore, b is a unit and, hence, p is irreducible.
(2) Suppose that p is a prime element. Then p ≠ 0. Consider the ideal pR, and
suppose that ab ∈ pR. Then ab is a multiple of p and, hence, p|ab. Since p is prime, it
follows that p|a or p|b. If p|a, then a ∈ pR, whereas if p|b, then b ∈ pR. Therefore, pR
is a prime ideal.
Conversely, suppose that p ≠ 0 and that pR is a prime ideal. Since a prime ideal is proper, pR ≠ R,
so p is not a unit. Suppose that p|ab. Then ab ∈
pR, so a ∈ pR, or b ∈ pR. If a ∈ pR, then p|a, and if b ∈ pR, then p|b. Therefore, p is
prime.
(3) Let p be irreducible, then p ≠ 0. Suppose that pR ⊂ aR, where a ∈ R. Then
p = ra for some r ∈ R. Since p is irreducible, it follows that either a is a unit, or r is a
unit. If r is a unit, we have pR = raR = aR ≠ R since p is not a unit. If a is a unit, then
aR = R, and pR = rR ≠ R. Therefore, pR is maximal in the set of principal ideals not
equal to R.
Conversely, suppose p ≠ 0 and pR is a maximal ideal in the set of principal ideals
≠ R. Let p = ab with a not a unit. We must show that b is a unit. Since aR ≠ R, and
pR ⊂ aR, from the maximality we must have pR = aR. Hence, a = rp for some r ∈ R.
Then p = ab = rpb and, as before, we must have rb = 1 and b a unit.
Theorem 3.2.7. Let R be a principal ideal domain. Then we have the following:
(1) An element p ∈ R is irreducible if and only if it is a prime element.
(2) A nonzero ideal of R is a maximal ideal if and only if it is a prime ideal.
(3) The maximal ideals of R are precisely those ideals pR, where p is a prime element.

Proof. First note that {0} is a prime ideal, but not maximal.
(1) We already know that prime elements are irreducible. To show the converse,
suppose that p is irreducible. Since R is a principal ideal domain, every ideal of R is
principal, so by Theorem 3.2.6(3), pR is a maximal ideal, and each maximal ideal is also a prime ideal.
Therefore, from Theorem 3.2.6, we have that p is a prime element.
(2) We already know that each maximal ideal is a prime ideal. To show the con-
verse, suppose that I ≠ {0} is a prime ideal. Then I = pR, where p is a prime element
with p ≠ 0. Therefore, p is irreducible from part (1) and, hence, pR is a maximal ideal
from Theorem 3.2.6.
(3) This follows directly from the proof in part (2) and Theorem 3.2.6.


In particular, this theorem explains the remark at the end of Section 2.3:
in the principal ideal domain ℤ, a nonzero proper ideal is maximal if and only if it is a prime
ideal.
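
For R = ℤ, this can be observed experimentally. The sketch below assumes the standard characterisation that an ideal I is prime exactly when R/I is an integral domain, and maximal exactly when R/I is a field; the function names are hypothetical. It tests both properties for ℤ/nℤ by brute force and compares the results with the primality of n.

```python
def is_prime(n):
    """Trial-division primality test for a rational integer n."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def quotient_is_integral_domain(n):
    """True if Z/nZ (n >= 2) has no zero divisors, i.e. nZ is a prime ideal."""
    return all((a * b) % n != 0 for a in range(1, n) for b in range(1, n))

def quotient_is_field(n):
    """True if every nonzero class in Z/nZ (n >= 2) is invertible, i.e. nZ is maximal."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))

# The n in 2..29 for which nZ is a prime ideal, a maximal ideal, or generated by a prime.
prime_ideals = [n for n in range(2, 30) if quotient_is_integral_domain(n)]
maximal_ideals = [n for n in range(2, 30) if quotient_is_field(n)]
primes = [n for n in range(2, 30) if is_prime(n)]
```

The three lists coincide, in line with parts (2) and (3) of Theorem 3.2.7.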

3.3 Unique factorization domains


We now consider integral domains, where there is unique factorization into primes. If
R is an integral domain and a, b ∈ R, then we say that a and b are associates if there
exists a unit ϵ ∈ R with a = ϵb.

Definition 3.3.1. An integral domain D is a unique factorization domain or UFD if for
each d ∈ D either d = 0, d is a unit, or d has a factorization into primes, which is
unique up to ordering and unit factors. This means that if

\[ d = p_1 \cdots p_m = q_1 \cdots q_k, \]

then m = k, and each $p_i$ is an associate of some $q_j$.

There are several relationships in integral domains that are equivalent to unique
factorization.

Definition 3.3.2. Let R be an integral domain.


(1) R has property (A) if and only if for each nonunit a ≠ 0 there are irreducible elements
$q_1, \ldots, q_r \in R$ satisfying $a = q_1 \cdots q_r$.
(2) R has property (A′) if and only if for each nonunit a ≠ 0 there are prime elements
$p_1, \ldots, p_r \in R$ satisfying $a = p_1 \cdots p_r$.
(3) R has property (B) if and only if whenever $q_1, \ldots, q_r$ and $q_1', \ldots, q_s'$ are irreducible
elements of R with

\[ q_1 \cdots q_r = q_1' \cdots q_s', \]

then r = s, and there is a permutation $\pi \in S_r$ such that for each $i \in \{1, \ldots, r\}$ the
elements $q_i$ and $q_{\pi(i)}'$ are associates (uniqueness up to ordering and unit factors).
(4) R has property (C) if and only if each irreducible element of R is a prime element.

Notice that properties (A) and (C) together are equivalent to what we defined as
unique factorization. Hence, an integral domain satisfying (A) and (C) is a UFD. Next,
we show that there are other equivalent formulations.

Theorem 3.3.3. In an integral domain R, the following are equivalent:


(1) R is a UFD.
(2) R satisfies properties (A) and (B).
(3) R satisfies properties (A) and (C).
(4) R satisfies property (A′).


Proof. As remarked before the statement of the theorem, properties (A) and (C) together are,
by definition, equivalent to unique factorization. We show here that (2), (3), and (4) are
equivalent. First, we show that (2) implies (3).
Suppose that R satisfies properties (A) and (B). We must show that it also satisfies
(C); that is, we must show that if q ∈ R is irreducible, then q is prime. Suppose that
q ∈ R is irreducible and q|ab with a, b ∈ R. Then we have ab = cq for some c ∈ R. If a
is a unit, then from ab = cq we get $b = a^{-1}cq$, and q|b. The result is identical if b is a
unit. Therefore, we may assume that neither a nor b is a unit.
If c = 0, then since R is an integral domain, either a = 0 or b = 0, and so q|a or q|b.
We may assume then that c ≠ 0.
If c is a unit, then $q = c^{-1}ab$, and since q is irreducible, either $c^{-1}a$ or b is a unit.
If $c^{-1}a$ is a unit, then a is also a unit. Therefore, if c is a unit, either a or b is a unit,
contrary to our assumption.
Therefore, we may assume that c ≠ 0, and c is not a unit. From property (A) we
have

\[ a = q_1 \cdots q_r, \qquad b = q_1' \cdots q_s', \qquad c = q_1'' \cdots q_t'', \]

where $q_1, \ldots, q_r, q_1', \ldots, q_s', q_1'', \ldots, q_t''$ are all irreducibles. Hence,

\[ q_1 \cdots q_r \, q_1' \cdots q_s' = q_1'' \cdots q_t'' \cdot q. \]

From property (B), q is an associate of some $q_i$ or $q_j'$. Hence, $q \mid q_i$ or $q \mid q_j'$. It follows
that q|a, or q|b and, therefore, q is a prime element.
That (3) implies (4) is direct.
We show that (4) implies (2).
Suppose that R satisfies property (A′). We must show that it satisfies both (A)
and (B). Property (A) follows at once from (A′), since each prime element is irreducible
by Theorem 3.2.6(1). To prepare for (B), we show that irreducible elements are prime.
Suppose that q is irreducible. Then from (A′), we have

\[ q = p_1 \cdots p_r \]

with each $p_i$ prime. Since q is irreducible, we may assume, without loss of generality, that
$p_2 \cdots p_r$ is a unit, and $p_1$ is a nonunit; hence, $p_i \mid 1$ for i = 2, …, r. Since prime elements
are not units, this forces r = 1. Thus, $q = p_1$, and q is prime.
We now show that (B) holds. Let

\[ q_1 \cdots q_r = q_1' \cdots q_s', \]

where the $q_i$ and $q_j'$ are all irreducibles; hence primes. Then

\[ q_1' \mid q_1 \cdots q_r, \]


and so, $q_1' \mid q_i$ for some i. Without loss of generality, suppose $q_1' \mid q_1$. Then $q_1 = a q_1'$. Since
$q_1$ is irreducible, it follows that a is a unit, and $q_1$ and $q_1'$ are associates. It follows then
that

\[ a q_2 \cdots q_r = q_2' \cdots q_s' \]

since R has no zero divisors. Property (B) holds then by induction, and the theorem is
proved.

Note that in our new terminology, ℤ is a UFD. In the next section, we will present
other examples of UFDs. However, not every integral domain is a unique factorization
domain.
As we defined in the last section, let R be the following subring of ℂ:

R = ℤ[i√5] = {x + iy√5 : x, y ∈ ℤ}.

R is an integral domain, and we showed, using the norm, that 3 is an irreducible in R.
Analogously, we can show that the elements 2 + i√5, 2 − i√5 are also irreducibles in R,
and furthermore, 3 is not an associate of either 2 + i√5 or 2 − i√5. Then

9 = 3 ⋅ 3 = (2 + i√5)(2 − i√5)

gives two different decompositions of an element in terms of irreducible elements. The
fact that R is not a UFD also follows from the fact that 3 is an irreducible element, which
is not prime.
Unique factorization is tied to the famous solution of Fermat's last theorem. Wiles
and Taylor in 1995 proved the following:

Theorem 3.3.4. The equation $x^p + y^p = z^p$ has no integral solutions with xyz ≠ 0 for any
prime p ≥ 3.

Kummer tried to prove this theorem by attempting to factor $x^p = z^p - y^p$. We call
the statement of Theorem 3.3.4 in an integral domain R property ($F_p$). Let $\epsilon = e^{2\pi i/p}$. Then

\[ z^p - y^p = \prod_{j=0}^{p-1} (z - \epsilon^j y). \]

View this equation in the ring:

\[ R = \mathbb{Z}[\epsilon] = \Big\{ \sum_{j=0}^{p-1} a_j \epsilon^j : a_j \in \mathbb{Z} \Big\}. \]

Kummer proved that if R is a UFD, then property ($F_p$) holds. However, Uchida and
Montgomery showed independently (1971) that R is a UFD only if p ≤ 19 (see [49]).
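
The factorization of $z^p - y^p$ over the p-th roots of unity can be checked numerically. The following Python sketch is only a floating-point sanity check, not a computation in ℤ[ϵ]; the function name cyclotomic_product and the sample values z = 5, y = 2, p = 7 are assumptions chosen for illustration.

```python
import cmath

def cyclotomic_product(z, y, p):
    """Evaluate prod_{j=0}^{p-1} (z - eps^j * y) with eps = exp(2*pi*i/p)."""
    eps = cmath.exp(2j * cmath.pi / p)
    result = 1 + 0j
    for j in range(p):
        result *= z - eps**j * y
    return result

z, y, p = 5, 2, 7
lhs = z**p - y**p                    # exact integer value of z^p - y^p
rhs = cyclotomic_product(z, y, p)    # complex approximation of the product
error = abs(rhs - lhs)               # should be tiny, up to rounding
```

Up to rounding error, rhs agrees with the integer lhs, and its imaginary part is negligible.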


3.4 Principal ideal domains and unique factorization


In this section, we prove that every principal ideal domain (PID) is a unique factoriza-
tion domain (UFD). We say that an ascending chain of ideals in R

\[ I_1 \subset I_2 \subset \cdots \subset I_n \subset \cdots \]

becomes stationary if there exists an m such that $I_r = I_m$ for all r ≥ m.

Theorem 3.4.1. Let R be an integral domain. If each ascending chain of principal ideals
in R becomes stationary, then R satisfies property (A).

Proof. Suppose that a ≠ 0 is not a unit in R. Suppose that a is not a product of
irreducible elements. Clearly then, a cannot itself be irreducible. Hence, $a = a_1 b_1$ with
$a_1, b_1 \in R$, and $a_1$, $b_1$ are not units. If both $a_1$ and $b_1$ can be expressed as a product of
irreducible elements, then so can a. Without loss of generality then, suppose that $a_1$
is not a product of irreducible elements.
Since $a_1 \mid a$, we have the inclusion of ideals $aR \subseteq a_1R$. If $a_1R = aR$, then $a_1 \in aR$, and
$a_1 = ar = a_1 b_1 r$, which implies that $b_1$ is a unit contrary to our assumption. Therefore,
$aR \neq a_1R$, and the inclusion is proper. Applying the same argument to $a_1$ and iterating,
we obtain a strictly increasing chain of ideals

\[ aR \subset a_1R \subset \cdots \subset a_nR \subset \cdots. \]

From our hypothesis on R, this chain must become stationary, contradicting the fact
that each inclusion is proper. Therefore, a must be a product of irreducibles.
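
In ℤ, the splitting step of this proof can be carried out explicitly. The following Python sketch is an illustration under assumptions made here: the names split and irreducible_factors are hypothetical, and trial division stands in for the abstract choice of a factorization $a = a_1 b_1$. It splits a nonzero nonunit until every factor is irreducible. The recursion terminates because each proper divisor generates a strictly larger ideal, and ascending chains of ideals in ℤ become stationary.

```python
import math

def split(a):
    """Return (a1, b1) with a == a1 * b1 and both nonunits, or None if a is irreducible."""
    for d in range(2, math.isqrt(abs(a)) + 1):
        if a % d == 0:
            return d, a // d
    return None

def irreducible_factors(a):
    """Factor a nonzero nonunit integer a into irreducibles by repeated splitting."""
    parts = split(a)
    if parts is None:
        return [a]                      # a is irreducible: nothing left to split
    a1, b1 = parts
    return irreducible_factors(a1) + irreducible_factors(b1)

factors_360 = irreducible_factors(360)
factors_neg12 = irreducible_factors(-12)
```

For negative input, one factor carries the sign, which reflects that factorizations are only unique up to unit factors.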

Theorem 3.4.2. Each principal ideal domain R is a unique factorization domain.

Proof. Suppose that R is a principal ideal domain. R satisfies property (C) by Theo-
rem 3.2.7(1). Therefore, to show that it is a unique factorization domain, we must show
that it also satisfies property (A). From the previous theorem, it suffices to show that
each ascending chain of principal ideals becomes stationary. Consider such an as-
cending chain

\[ a_1R \subset a_2R \subset \cdots \subset a_nR \subset \cdots. \]

Now let

\[ I = \bigcup_{i=1}^{\infty} a_iR. \]

Now I is an ideal in R; hence a principal ideal. Therefore, I = aR for some a ∈ R. Since
I is a union, there exists an m such that $a \in a_mR$. Therefore, $I = aR \subset a_mR$ and, hence,
$I = a_mR$, and $a_iR = a_mR$ for all i ≥ m. Therefore, the chain becomes stationary and,
from Theorem 3.4.1, R satisfies property (A).


Since we showed that the integers ℤ are a PID, we can recover the fundamental
theorem of arithmetic from Theorem 3.4.2. We now present another important example
of a PID; hence a UFD. In the next chapter, we will look in detail at polynomials with
coefficients in an integral domain. Below, we consider polynomials with coefficients
in a field, and for the present leave out many of the details.
If K is a field and n is a nonnegative integer, then a polynomial of degree n over K
is a formal sum of the form

\[ P(x) = a_0 + a_1 x + \cdots + a_n x^n \]

with $a_i \in K$ for i = 0, …, n, $a_n \neq 0$, and x an indeterminate. A polynomial P(x) over
K is either a polynomial of some degree or the expression P(x) = 0, which is called
the zero polynomial, and has degree −∞. We denote the degree of P(x) by deg P(x). A
polynomial of zero degree has the form $P(x) = a_0$ and is called a constant polynomial,
and can be identified with the corresponding element of K. The elements $a_i \in K$ are
called the coefficients of P(x); $a_n$ is the leading coefficient. If $a_n = 1$, P(x) is called
a monic polynomial. Two nonzero polynomials are equal if and only if they have the
same degree and exactly the same coefficients. A polynomial of degree 1 is called a
linear polynomial, whereas one of degree two is a quadratic polynomial.
We denote by K[x] the set of all polynomials over K, and we will show that K[x] be-
comes a principal ideal domain; hence a unique factorization domain. We first define
addition, subtraction, and multiplication on K[x] by algebraic manipulation. That is,
suppose $P(x) = a_0 + a_1 x + \cdots + a_n x^n$, $Q(x) = b_0 + b_1 x + \cdots + b_m x^m$, then

\[ P(x) \pm Q(x) = (a_0 \pm b_0) + (a_1 \pm b_1)x + \cdots; \]

that is, the coefficient of $x^i$ in P(x) ± Q(x) is $a_i \pm b_i$, where $a_i = 0$ for i > n, and $b_j = 0$
for j > m. Multiplication is given by

\[ P(x)Q(x) = a_0 b_0 + (a_1 b_0 + a_0 b_1)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots + a_n b_m x^{n+m}; \]

that is, the coefficient of $x^i$ in P(x)Q(x) is $a_0 b_i + a_1 b_{i-1} + \cdots + a_i b_0$.

Example 3.4.3. Let $P(x) = 3x^2 + 4x - 6$ and $Q(x) = 2x + 7$ be in ℚ[x]. Then

\[ P(x) + Q(x) = 3x^2 + 6x + 1 \]

and

\[ P(x)Q(x) = (3x^2 + 4x - 6)(2x + 7) = 6x^3 + 29x^2 + 16x - 42. \]
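
The coefficient formulas translate directly into code. The following Python sketch is a minimal illustration, assuming a polynomial is stored as a list of coefficients with the constant term first; the names poly_add and poly_mul are introduced here and are not from the text. It reproduces the sum and product of Example 3.4.3 with exact rational arithmetic.

```python
from fractions import Fraction

def poly_add(p, q):
    """Coefficientwise sum; the shorter list is padded with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Product of nonzero polynomials: the coefficient of x^k is the sum of a_i * b_j over i + j = k."""
    result = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

P = [Fraction(c) for c in (-6, 4, 3)]   # 3x^2 + 4x - 6, constant term first
Q = [Fraction(c) for c in (7, 2)]       # 2x + 7

total = poly_add(P, Q)      # coefficients of 3x^2 + 6x + 1
product = poly_mul(P, Q)    # coefficients of 6x^3 + 29x^2 + 16x - 42
```

The length of product is one more than deg P(x) + deg Q(x), in agreement with the degree formula for products over a field.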

From the definitions, the following degree relationships are clear. The proofs are
in the exercises.

Lemma 3.4.4. Let 0 ≠ P(x), 0 ≠ Q(x) in K[x]. Then the following hold:


(1) deg P(x)Q(x) = deg P(x) + deg Q(x).


(2) deg(P(x) ± Q(x)) ≤ max(deg P(x), deg Q(x)) if P(x) ± Q(x) ≠ 0.

We next obtain the following:

Theorem 3.4.5. If K is a field, then K[x] forms an integral domain. K can be naturally
embedded into K[x] by identifying each element of K with the corresponding constant
polynomial. The only units in K[x] are the nonzero elements of K.

Proof. Verification of the basic ring properties is solely computational and is left to the
exercises. Since deg P(x)Q(x) = deg P(x) + deg Q(x), it follows that if P(x) ≠ 0 and
Q(x) ≠ 0, then P(x)Q(x) ≠ 0 and, therefore, K[x] is an integral domain.
If G(x) is a unit in K[x], then there exists an H(x) ∈ K[x] with G(x)H(x) = 1. From
the degrees, we have deg G(x) + deg H(x) = 0. Since deg G(x) ≥ 0 and deg H(x) ≥ 0,
this is possible only if deg G(x) = deg H(x) = 0. Therefore, G(x) ∈ K. Conversely, each
nonzero element of K is a unit in K and, hence, in K[x].

Now that we have K[x] as an integral domain, we proceed to show that K[x] is a
principal ideal domain and, hence, there is unique factorization into primes. We first
repeat the definition of a prime in K[x]. If 0 ≠ f (x) has no nontrivial, nonunit factors
(it cannot be factorized into polynomials of lower degree), then f (x) is a prime in K[x]
or a prime polynomial. A prime polynomial is also called an irreducible polynomial.
Clearly, if deg g(x) = 1, then g(x) is irreducible.
The fact that K[x] is a principal ideal domain follows from the division algorithm
for polynomials, which is entirely analogous to the division algorithm for integers.

Lemma 3.4.6 (Division algorithm in K[x]). If 0 ≠ f (x), 0 ≠ g(x) ∈ K[x], then there exist
unique polynomials q(x), r(x) ∈ K[x] such that f (x) = q(x)g(x) + r(x), where r(x) = 0 or
deg r(x) < deg g(x). (The polynomials q(x) and r(x) are called, respectively, the quotient
and remainder.)

We give a formal proof in Chapter 4 on polynomials and polynomial rings, but
content ourselves here with two examples from ℚ[x]:

Example 3.4.7.
(1) Let $f(x) = 3x^4 - 6x^2 + 8x - 6$, $g(x) = 2x^2 + 4$. Then

\[ \frac{3x^4 - 6x^2 + 8x - 6}{2x^2 + 4} = \frac{3}{2}x^2 - 6 \quad \text{with remainder } 8x + 18. \]

Thus, here, $q(x) = \frac{3}{2}x^2 - 6$, $r(x) = 8x + 18$.
(2) Let $f(x) = 2x^5 + 2x^4 + 6x^3 + 10x^2 + 4x$, $g(x) = x^2 + x$. Then

\[ \frac{2x^5 + 2x^4 + 6x^3 + 10x^2 + 4x}{x^2 + x} = 2x^3 + 6x + 4. \]

Thus, here, $q(x) = 2x^3 + 6x + 4$, and r(x) = 0.
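
The long division behind these two examples can be sketched as follows. This Python code is a minimal illustration, not the formal proof deferred to Chapter 4: polynomials are assumed to be coefficient lists with the constant term first, the zero polynomial is the empty list, and the name poly_divmod is hypothetical.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = [] or deg r < deg g, over Q.

    Coefficient lists have the constant term first; g must have a nonzero
    leading (last) coefficient.
    """
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(r) >= len(g) and any(r):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]           # cancels the leading term of r
        q[shift] = coeff
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        while r and r[-1] == 0:         # drop vanished leading coefficients
            r.pop()
    return q, r

# Example 3.4.7 (1): f = 3x^4 - 6x^2 + 8x - 6, g = 2x^2 + 4
q1, r1 = poly_divmod([-6, 8, -6, 0, 3], [4, 0, 2])
# Example 3.4.7 (2): f = 2x^5 + 2x^4 + 6x^3 + 10x^2 + 4x, g = x^2 + x
q2, r2 = poly_divmod([0, 4, 10, 6, 2, 2], [0, 1, 1])
```

Each pass of the loop lowers the degree of the running remainder, which is why the process stops with r(x) = 0 or deg r(x) < deg g(x).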


Theorem 3.4.8. Let K be a field. Then the polynomial ring K[x] is a principal ideal do-
main; hence a unique factorization domain.

Proof. The proof is essentially analogous to the proof in the integers. Let I be an ideal
in K[x] with I ≠ K[x]. If I = {0}, then I = ⟨0⟩ is principal, so we may assume that I ≠ {0}.
Let f(x) be a nonzero polynomial in I of minimal degree. We claim that
I = ⟨f(x)⟩, the principal ideal generated by f(x). Let g(x) ∈ I. We must show that g(x)
is a multiple of f(x). By the division algorithm in K[x], we have

g(x) = q(x)f(x) + r(x),

where r(x) = 0, or deg(r(x)) < deg(f(x)). If r(x) ≠ 0, then deg(r(x)) < deg(f(x)). However,
r(x) = g(x) − q(x)f(x) ∈ I since I is an ideal, and g(x), f(x) ∈ I. This is a contradiction
since f(x) was assumed to be a nonzero polynomial in I of minimal degree. Therefore,
r(x) = 0 and, hence, g(x) = q(x)f(x) is a multiple of f(x). Therefore, each element of I
is a multiple of f(x) and, hence, I = ⟨f(x)⟩.
Therefore, K[x] is a principal ideal domain and, from Theorem 3.4.2, a unique factorization
domain.

We proved that in a principal ideal domain, every ascending chain of ideals be-
comes stationary. In general, a ring R (commutative or not) satisfies the ascending
chain condition or ACC if every ascending chain of left (or right) ideals in R becomes
stationary. A ring satisfying the ACC is called a Noetherian ring.

3.5 Euclidean domains


In analyzing the proof of unique factorization in both ℤ and K[x], it is clear that it
depends primarily on the division algorithm. In ℤ, the division algorithm depended
on the fact that the positive integers could be ordered, and in K[x], on the fact that
the degrees of nonzero polynomials are nonnegative integers and, hence, could be
ordered. This basic idea can be generalized in the following way:

Definition 3.5.1. An integral domain D is a Euclidean domain if there exists a function
N from D⋆ = D \ {0} to the nonnegative integers such that
(1) $N(r_1) \le N(r_1 r_2)$ for any $r_1, r_2 \in D^\star$.
(2) For all $r_1, r_2 \in D$ with $r_1 \neq 0$, there exist q, r ∈ D such that

\[ r_2 = q r_1 + r, \]

where either r = 0, or $N(r) < N(r_1)$.

The function N is called a Euclidean norm on D.

Therefore, Euclidean domains are precisely those integral domains, which allow
division algorithms. In the integers ℤ, define N(z) = |z|. Then N is a Euclidean norm


on ℤ and, hence, ℤ is a Euclidean domain. On K[x], define N(p(x)) = deg(p(x)) if
p(x) ≠ 0. Then N is also a Euclidean norm on K[x] so that K[x] is also a Euclidean
domain. In any Euclidean domain, we can mimic the proofs of unique factorization in
both ℤ and K[x] to obtain the following:

Theorem 3.5.2. Every Euclidean domain is a principal ideal domain; hence a unique
factorization domain.

Before proving this theorem, we must develop some results on the number theory
of general Euclidean domains. First, some properties of the norm.

Lemma 3.5.3. If R is a Euclidean domain then the following hold:


(a) N(1) is minimal among {N(r) : r ∈ R⋆ }.
(b) N(u) = N(1) if and only if u is a unit.
(c) N(a) = N(b) for a, b ∈ R⋆ if a, b are associates.
(d) N(a) < N(ab) unless b is a unit.

Proof. (a) From property (1) of Euclidean norms, we have

N(1) ≤ N(1 ⋅ r) = N(r) for any r ∈ R⋆ .

(b) Suppose u is a unit. Then there exists $u^{-1}$ with $u \cdot u^{-1} = 1$. Then

\[ N(u) \le N(u \cdot u^{-1}) = N(1). \]

From the minimality of N(1), it follows that N(u) = N(1).


Conversely, suppose N(u) = N(1). Apply the division algorithm to get

1 = qu + r.

If r ≠ 0, then N(r) < N(u) = N(1), contradicting the minimality of N(1). Therefore,
r = 0, and 1 = qu. Then u has a multiplicative inverse and, hence, is a unit.
(c) Suppose a, b ∈ R⋆ are associates. Then a = ub with u a unit. Then

N(b) ≤ N(ub) = N(a).

On the other hand, $b = u^{-1}a$. Therefore,

\[ N(a) \le N(u^{-1}a) = N(b). \]

Since N(a) ≤ N(b), and N(b) ≤ N(a), it follows that N(a) = N(b).
(d) Suppose N(a) = N(ab). Apply the division algorithm

a = q(ab) + r,


where r = 0, or N(r) < N(ab). If r ≠ 0, then

r = a − qab = a(1 − qb) ⟹ N(ab) = N(a) ≤ N(a(1 − qb)) = N(r),

contradicting that N(r) < N(ab). Hence, r = 0, and a = q(ab) = (qb)a. Then

a = (qb)a = 1 ⋅ a ⟹ qb = 1

since there are no zero divisors in an integral domain. Hence, b is a unit. Since N(a) ≤
N(ab), it follows that if b is not a unit, we must have N(a) < N(ab).

We can now prove Theorem 3.5.2.

Proof. Let D be a Euclidean domain. We show that each ideal I ≠ D in D is principal.
Let I ≠ D be an ideal in D. If I = {0}, then I = ⟨0⟩, and I is principal. Therefore, we
may assume that there are nonzero elements in I. Since the norms of these elements are
nonnegative integers, we may choose a nonzero element a of I of minimal norm. We claim that I = ⟨a⟩.
Let b ∈ I. We must show that b is a multiple of a. Now by the division algorithm

b = qa + r,

where either r = 0, or N(r) < N(a). As in ℤ and K[x], we have a contradiction if r ≠ 0.
In this case, N(r) < N(a), but r = b − qa ∈ I since I is an ideal, contradicting the
minimality of N(a). Therefore, r = 0, and b = qa and, hence, I = ⟨a⟩.
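
For D = ℤ with N(z) = |z|, the generator of minimal norm can be computed rather than merely shown to exist. The sketch below is illustrative and uses the extended Euclidean algorithm, a technique not spelled out in the text; the function name extended_gcd is an assumption. It returns g together with integers s, t satisfying g = sa + tb, so that g lies in the ideal ⟨a, b⟩ and, since g divides both a and b, generates it.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and g == s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                       # division step: old_r = q*r + remainder
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:                            # normalise the generator by the unit -1
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

g, s, t = extended_gcd(84, 30)               # the ideal <84, 30> equals <g>
```

Every remainder produced in the loop lies in ⟨a, b⟩ and has smaller norm than the previous one, mirroring the minimality argument in the proof.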

As a final example of a Euclidean domain, we consider the Gaussian integers

ℤ[i] = {a + bi : a, b ∈ ℤ}.

It was first observed by Gauss that this set permits unique factorization. To show this,
we need a Euclidean norm on ℤ[i].

Definition 3.5.4. If z = a + bi ∈ ℤ[i], then its norm N(z) is defined by

\[ N(a + bi) = a^2 + b^2. \]

The basic properties of this norm follow directly from the definition (see exer-
cises).

Lemma 3.5.5. If α, β ∈ ℤ[i] then we have the following:


(1) N(α) is an integer for all α ∈ ℤ[i].
(2) N(α) ≥ 0 for all α ∈ ℤ[i].
(3) N(α) = 0 if and only if α = 0.
(4) N(α) ≥ 1 for all α ≠ 0.
(5) N(αβ) = N(α)N(β); that is, the norm is multiplicative.


From the multiplicativity of the norm, we have the following concerning primes
and units in ℤ[i].

Lemma 3.5.6.
(1) u ∈ ℤ[i] is a unit if and only if N(u) = 1.
(2) If π ∈ ℤ[i] and N(π) = p, where p is an ordinary prime in ℤ, then π is a prime in
ℤ[i].

Proof. Certainly u is a unit if and only if N(u) = N(1). But in ℤ[i], we have N(1) = 1.
Therefore, the first part follows.
Suppose next that π ∈ ℤ[i] with N(π) = p for some rational prime p. Suppose that $\pi = \pi_1\pi_2$.
From the multiplicativity of the norm, we have

\[ N(\pi) = p = N(\pi_1)N(\pi_2). \]

Since each norm is a positive ordinary integer, and p is a prime, it follows that either
$N(\pi_1) = 1$, or $N(\pi_2) = 1$. Hence, either $\pi_1$ or $\pi_2$ is a unit. Therefore, π is a prime in
ℤ[i].

Armed with this norm, we can show that ℤ[i] is a Euclidean domain.

Theorem 3.5.7. The Gaussian integers ℤ[i] form a Euclidean domain.

Proof. That ℤ[i] forms a commutative ring with an identity can be verified directly
and easily. If αβ = 0, then N(α)N(β) = 0, and since there are no zero divisors in ℤ, we
must have N(α) = 0, or N(β) = 0. But then either α = 0, or β = 0 and, hence, ℤ[i] is
an integral domain. To complete the proof, we show that the norm N is a Euclidean
norm.
From the multiplicativity of the norm, we have, if α, β ≠ 0

N(αβ) = N(α)N(β) ≥ N(α) since N(β) ≥ 1.

Therefore, property (1) of Euclidean norms is satisfied. We must now show that the
division algorithm holds.
Let α = a + bi and β = c + di be Gaussian integers with β ≠ 0. Recall that the inverse of a
nonzero complex number z = x + iy is

\[ \frac{1}{z} = \frac{\bar{z}}{|z|^2} = \frac{x - iy}{x^2 + y^2}. \]

Therefore, as a complex number

\[ \frac{\alpha}{\beta} = \alpha\,\frac{\bar{\beta}}{|\beta|^2} = (a + bi)\,\frac{c - di}{c^2 + d^2}
   = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i = u + iv. \]


Now since a, b, c, d are integers u, v must be rationals. The set

{u + iv : u, v ∈ ℚ}

is called the set of the Gaussian rationals.


If u, v ∈ ℤ, then u + iv ∈ ℤ[i], α = qβ with q = u + iv, and we are done. Otherwise,
choose ordinary integers m, n satisfying $|u - m| \le \frac{1}{2}$ and $|v - n| \le \frac{1}{2}$, and let q = m + in.
Then q ∈ ℤ[i]. Let r = α − qβ. We must show that N(r) < N(β).
Working with complex absolute value, we get

\[ |r| = |\alpha - q\beta| = |\beta|\,\Big|\frac{\alpha}{\beta} - q\Big|. \]

Now

\[ \Big|\frac{\alpha}{\beta} - q\Big| = |(u - m) + i(v - n)| = \sqrt{(u - m)^2 + (v - n)^2}
   \le \sqrt{\Big(\frac{1}{2}\Big)^2 + \Big(\frac{1}{2}\Big)^2} < 1. \]

Therefore,

\[ |r| < |\beta| \implies |r|^2 < |\beta|^2 \implies N(r) < N(\beta), \]

completing the proof.
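
The proof is constructive, and the division step can be sketched in Python. This is a minimal illustration, assuming Gaussian integers are stored as integer pairs (a, b) for a + bi; the names gauss_norm and gauss_divmod are introduced here and are not from the text. Exact rational arithmetic is used for u and v so that the rounding to the nearest integers m, n is not affected by floating-point error.

```python
from fractions import Fraction

def gauss_norm(z):
    """N(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b

def gauss_divmod(alpha, beta):
    """Return (q, r) with alpha = q*beta + r and N(r) < N(beta); beta must be nonzero."""
    a, b = alpha
    c, d = beta
    n = gauss_norm(beta)
    u = Fraction(a * c + b * d, n)      # real part of alpha/beta
    v = Fraction(b * c - a * d, n)      # imaginary part of alpha/beta
    m, k = round(u), round(v)           # nearest integers, so |u - m|, |v - k| <= 1/2
    # q*beta = (m + k*i)(c + d*i) = (m*c - k*d) + (m*d + k*c)*i
    r = (a - (m * c - k * d), b - (m * d + k * c))
    return (m, k), r

q, r = gauss_divmod((11, 10), (4, 1))   # divide 11 + 10i by 4 + i
```

For this input the quotient is 3 + 2i and the remainder is 1 − i, whose norm 2 is smaller than N(4 + i) = 17.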

Since ℤ[i] forms a Euclidean domain, it follows from our previous results that ℤ[i]
must be a principal ideal domain; hence a unique factorization domain.

Corollary 3.5.8. The Gaussian integers are a UFD.

Since we will now be dealing with many kinds of integers, we will refer to the
ordinary integers ℤ as the rational integers and the ordinary primes p as the rational
primes. It is clear that ℤ can be embedded into ℤ[i]. However, not every rational prime
is also prime in ℤ[i]. The primes in ℤ[i] are called the Gaussian primes. For example,
we can show that both 1 + i and 1 − i are Gaussian primes; that is, primes in ℤ[i].
However, (1 + i)(1 − i) = 2. Therefore, the rational prime 2 is not a prime in ℤ[i]. Using
the multiplicativity of the Euclidean norm in ℤ[i], we can describe all the units and
primes in ℤ[i].

Theorem 3.5.9.
(1) The only units in ℤ[i] are ±1, ±i.
(2) Suppose π is a Gaussian prime. Then π is one of the following:
(a) a positive rational prime p ≡ 3 mod 4, or an associate of such a rational prime.
(b) 1 + i, or an associate of 1 + i.


(c) a + bi, or a − bi, where a > 0, b > 0, a is even, and $N(\pi) = a^2 + b^2 = p$ with p a
rational prime congruent to 1 mod 4, or an associate of a + bi, or a − bi.

Proof. (1) Suppose u = x + iy ∈ ℤ[i] is a unit. Then, from Lemma 3.5.6, we have
$N(u) = x^2 + y^2 = 1$, implying that (x, y) = (0, ±1) or (x, y) = (±1, 0). Hence, u = ±1 or u = ±i.
(2) Now suppose that π is a Gaussian prime. Since $N(\pi) = \pi\bar{\pi}$, and $\bar{\pi} \in \mathbb{Z}[i]$, it
follows that π|N(π). N(π) is a rational integer, so $N(\pi) = p_1 \cdots p_k$, where the $p_i$'s are
rational primes. By Euclid's lemma $\pi \mid p_i$ for some $p_i$ and, hence, a Gaussian prime must
divide at least one rational prime. On the other hand, suppose π|p and π|q, where
p, q are different primes. Then (p, q) = 1 and, hence, there exist x, y ∈ ℤ such that
1 = px + qy. It follows that π|1, which is a contradiction. Therefore, a Gaussian prime divides
one and only one rational prime.
Let p be the rational prime that π divides. Then $N(\pi) \mid N(p) = p^2$. Since N(π) is a
rational integer greater than 1, it follows that N(π) = p, or $N(\pi) = p^2$. If π = a + bi, then
$a^2 + b^2 = p$, or $a^2 + b^2 = p^2$.
If p = 2, then $a^2 + b^2 = 2$, or $a^2 + b^2 = 4$. It follows that π = ±2, ±2i, or π = 1 + i, or an
associate of 1 + i. Since (1 + i)(1 − i) = 2, and neither 1 + i, nor 1 − i is a unit, it follows
that neither 2, nor any of its associates is prime. Then π = 1 + i, or an associate of
1 + i. To see that 1 + i is prime, suppose 1 + i = αβ. Then N(1 + i) = 2 = N(α)N(β). It
follows that either N(α) = 1, or N(β) = 1, and either α or β is a unit.
If p ≠ 2, then either p ≡ 3 mod 4, or p ≡ 1 mod 4. Suppose first that p ≡ 3 mod 4.
Then $a^2 + b^2 = p$ would imply, from Fermat's two-square theorem (see [43]), that p ≡
1 mod 4. Therefore, from the remarks above $a^2 + b^2 = p^2$, and N(π) = N(p). Since π|p,
we have p = απ with α ∈ ℤ[i]. From N(π) = N(p), we get that N(α) = 1, and α is a unit.
Therefore, π and p are associates. Hence, in this case, π is an associate of a rational
prime congruent to 3 mod 4.
Finally, suppose p ≡ 1 mod 4. From the remarks above, either N(π) = p, or
$N(\pi) = p^2$. If $N(\pi) = p^2$, then $a^2 + b^2 = p^2$. Since p ≡ 1 mod 4, from Fermat's two-square
theorem, there exist m, n ∈ ℤ with $m^2 + n^2 = p$. Let u = m + in, then the norm N(u) = p.
Since p is a rational prime, it follows that u is a Gaussian prime. Similarly, its conjugate
$\bar{u}$ is also a Gaussian prime. Now $N(\pi) = p^2 = (u\bar{u})^2$. Since π|N(π), it follows
from Euclid's lemma that either $\pi \mid u$, or $\pi \mid \bar{u}$. If $\pi \mid u$, they are associates since both are
primes. But this is a contradiction since N(π) ≠ N(u). The same is true if $\pi \mid \bar{u}$.
It follows that if p ≡ 1 mod 4, then $N(\pi) \neq p^2$. Therefore, in this case,
$N(\pi) = p = a^2 + b^2$. An associate of π has both a, b > 0 (see exercises). Furthermore, since
$a^2 + b^2 = p$ is odd, exactly one of a or b must be even. If a is odd, then b is even; then iπ is an
associate of π with even real part, completing the proof.
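
The classification yields a simple primality test in ℤ[i]. The following Python sketch is illustrative: it additionally assumes the standard converse of part (2)(a), namely that every rational prime p ≡ 3 mod 4 remains prime in ℤ[i], and it uses Lemma 3.5.6 for elements of prime norm. The function names are hypothetical.

```python
def is_rational_prime(n):
    """Trial-division primality test in Z."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_prime(a, b):
    """Decide whether a + bi is a Gaussian prime, following Theorem 3.5.9."""
    if a == 0 or b == 0:
        # Associates of a rational integer: prime only for rational primes p = 3 mod 4.
        p = abs(a + b)
        return is_rational_prime(p) and p % 4 == 3
    # Both coordinates nonzero: prime exactly when the norm is a rational prime
    # (norm 2 for associates of 1 + i, otherwise a prime congruent to 1 mod 4).
    return is_rational_prime(a * a + b * b)

examples = {z: is_gaussian_prime(*z) for z in [(1, 1), (3, 0), (0, -7), (2, 0), (5, 0), (2, 1), (3, 1)]}
```

For instance, 5 is not a Gaussian prime because 5 = (2 + i)(2 − i), whereas 2 + i has prime norm 5 and is a Gaussian prime.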

Finally, we mention that the methods used in ℤ[i] cannot be applied to all
quadratic integers. For example, we have seen that there is not unique factorization
in ℤ[√−5].


3.6 Overview of integral domains


Here we present some additional definitions for special types of integral domains.

Definition 3.6.1.
(1) A Dedekind domain D is an integral domain such that each nonzero proper ideal
A ({0} ≠ A ≠ D) can be written uniquely as a product of prime ideals

\[ A = P_1 \cdots P_r \]

with each $P_i$ being a prime ideal and the factorization being unique up to ordering.
(2) A Prüfer ring R is an integral domain such that

A ⋅ (B ∩ C) = AB ∩ AC

for all ideals A, B, C in R.

Dedekind domains arise naturally in algebraic number theory. It can be proved
that the ring of algebraic integers in any algebraic number field is a Dedekind domain
(see [43]).
If R is a Dedekind domain, it is also a Prüfer ring. If R is a Prüfer ring and a unique
factorization domain, then R is a principal ideal domain.
In the next chapter, we will prove a theorem of Gauss, which states that if R is a
UFD, then the polynomial ring R[x] is also a UFD. If K is a field, we have already seen
that K[x] is a UFD. Hence, by induction on n, the polynomial ring in several variables
$K[x_1, \ldots, x_n]$ is also a UFD. This fact plays an important role in algebraic geometry.

3.7 Exercises
1. Let R be an integral domain, and let π ∈ R \ (U(R) ∪ {0}). Show the following:
(i) If for each a ∈ R with π ∤ a, there exist λ, μ ∈ R with λπ + μa = 1, then π is a
prime element of R.
(ii) Give an example of a prime element π in a UFD R, which does not satisfy
the conditions of (i).
2. Let R be a UFD, and let $a_1, \ldots, a_t$ be pairwise coprime elements of R. If $a_1 \cdots a_t$ is
an m-th power (m ∈ ℕ), then each factor $a_i$ is an associate of an m-th power. Is
each $a_i$ necessarily an m-th power?
3. Decide if the unit group of ℤ[√3], ℤ[√5], and ℤ[√7] is finite or infinite. For which
a ∈ ℤ are (1 − √5) and (a + √5) associates in ℤ[√5]?
4. Let k ∈ ℤ and $k \neq x^2$ for all x ∈ ℤ. Let α = a + b√k and β = c + d√k be elements of
ℤ[√k], and $N(\alpha) = a^2 - kb^2$, $N(\beta) = c^2 - kd^2$. Show the following:
(i) The equality of the absolute values of N(α) and N(β) is necessary for α and β to be
associates in ℤ[√k]. Is this condition also sufficient?


(ii) Sufficient for the irreducibility of α in ℤ[√k] is the irreducibility of N(α) in ℤ.
Is this also necessary?
5. In general irreducible elements are not prime. Consider the set of complex numbers
given by

R = ℤ[i√5] = {x + iy√5 : x, y ∈ ℤ}.

Show that they form a subring of ℂ.
6. For an element x + iy√5 ∈ R define its norm by

\[ N(x + iy\sqrt{5}) = |x + iy\sqrt{5}|^2 = x^2 + 5y^2. \]

Prove that the norm is multiplicative, that is N(ab) = N(a)N(b).


7. Prove Lemma 3.4.4.
8. Prove that the set of polynomials R[x] with coefficients in a ring R forms a ring.
9. Prove the basic properties of the norm of the Gaussian integers. If α, β ∈ ℤ[i] then:
(i) N(α) is an integer for all α ∈ ℤ[i].
(ii) N(α) ≥ 0 for all α ∈ ℤ[i].
(iii) N(α) = 0 if and only if α = 0.
(iv) N(α) ≥ 1 for all α ≠ 0.
(v) N(αβ) = N(α)N(β), that is the norm is multiplicative.

4 Polynomials and polynomial rings
4.1 Polynomials and polynomial rings
In the last chapter, we saw that if K is a field, then the set of polynomials with co-
efficients in K, which we denoted K[x], forms a unique factorization domain. In this
chapter, we take a more detailed look at polynomials over a general ring R. We then
prove that if R is a UFD, then the polynomial ring R[x] is also a UFD. We first take a
formal look at polynomials.
Let R be a commutative ring with an identity. Consider the set R̃ of functions f
from the nonnegative integers N = ℕ ∪ {0} into R with only a finite number of values
nonzero. That is,

R̃ = {f : N → R : f (n) ≠ 0 for only finitely many n}.

On R̃, we define the following addition and multiplication:

$$(f + g)(n) = f(n) + g(n),$$

$$(f \cdot g)(n) = \sum_{i+j=n} f(i)\,g(j).$$

If we let x = (0, 1, 0, . . .) and identify (r, 0, . . .) with r ∈ R, then

$$x^0 = (1, 0, \ldots) = 1, \quad\text{and}\quad x^{i+1} = x \cdot x^i.$$

Now if $f = (r_0, r_1, r_2, \ldots)$, then f can be written as

$$f = \sum_{i=0}^{\infty} r_i x^i = \sum_{i=0}^{m} r_i x^i$$

for some m ≥ 0 since $r_i \neq 0$ for only finitely many i. Furthermore, this presentation is
unique.
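The formal definition can be mirrored directly in code. The following Python sketch is only an illustration and not part of the text's development: it stores a polynomial as the finite list of its coefficients (lowest degree first, with the empty list standing for 0) and assumes the coefficients are ordinary Python integers. The names `trim`, `poly_add`, `poly_mul`, and `degree` are hypothetical helpers chosen here.

```python
def trim(coeffs):
    """Drop trailing zeros so that the last entry is the leading coefficient."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(f, g):
    """Coefficientwise sum: (f + g)(n) = f(n) + g(n)."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def poly_mul(f, g):
    """Convolution product: (f * g)(n) = sum of f(i) g(j) over i + j = n."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def degree(f):
    """Degree of f, with the convention that the zero polynomial has degree -infinity."""
    f = trim(f)
    return len(f) - 1 if f else float("-inf")


x = [0, 1]                      # the indeterminate x = (0, 1, 0, ...)
x_squared = poly_mul(x, x)      # x^2 = (0, 0, 1, 0, ...)
square = poly_mul([1, 1], [1, 1])  # (1 + x)^2
```

Here `x_squared` is `[0, 0, 1]` and `square` is `[1, 2, 1]`, matching the sequence description of $x^2$ and of $(1 + x)^2 = 1 + 2x + x^2$.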
We now call x an indeterminate over R, and write each element of R̃ as
$f(x) = \sum_{i=0}^{m} r_i x^i$ with f(x) = 0 or $r_m \neq 0$. We also now write R[x] for R̃. Each element of R[x]
is called a polynomial over R. The elements $r_0, \ldots, r_m$ are called the coefficients of f(x)
with $r_m$ the leading coefficient. If $r_m \neq 0$, the nonnegative integer m is called the degree
of f(x), which we denote by deg f(x). We say that f(x) = 0 has degree −∞. The
uniqueness of the representation of a polynomial implies that two nonzero polynomials
are equal if and only if they have the same degree and exactly the same coefficients.
A polynomial of degree 1 is called a linear polynomial, whereas one of degree 2 is a
quadratic polynomial. The set of polynomials of degree 0, together with 0, forms a ring
isomorphic to R and, hence, can be identified with R, the constant polynomials. Thus,
the ring R embeds in the set of polynomials R[x]. The following results concerning degree
are straightforward:

https://doi.org/10.1515/9783110603996-004

Lemma 4.1.1. Let f (x) ≠ 0, g(x) ≠ 0 ∈ R[x]. Then the following hold:
(a) deg f (x)g(x) ≤ deg f (x) + deg g(x).
(b) deg(f (x) ± g(x)) ≤ max(deg f (x), deg g(x)).

If R is an integral domain, then we have equality in (a).
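
The inequality in (a) can be strict when R has zero divisors. The following worked example is our own illustration (the ring ℤ/6ℤ and the two linear polynomials are chosen here and do not appear in the text):

```latex
\[
(2x + 1)(3x + 1) = 6x^2 + 5x + 1 = 5x + 1 \quad\text{in } (\mathbb{Z}/6\mathbb{Z})[x],
\]
\[
\deg\big((2x+1)(3x+1)\big) = 1 < 2 = \deg(2x+1) + \deg(3x+1).
\]
```

The leading coefficients 2 and 3 multiply to 0 in ℤ/6ℤ, which is exactly what cannot happen in an integral domain.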

Theorem 4.1.2. Let R be a commutative ring with an identity. Then the set of polynomi-
als R[x] forms a ring called the ring of polynomials over R. The ring R identified with 0
and the polynomials of degree 0 naturally embeds into R[x]. R[x] is commutative. Fur-
thermore, R[x] is uniquely determined by R and x.

Proof. Set $f(x) = \sum_{i=0}^{n} r_i x^i$ and $g(x) = \sum_{j=0}^{m} s_j x^j$. The ring properties follow directly by
computation. The identification of r ∈ R with the polynomial r(x) = r provides the
embedding of R into R[x]. From the definition of multiplication in R[x], if R is commutative,
then R[x] is commutative. Note that if R has a multiplicative identity 1 ≠ 0, then
this is also the multiplicative identity of R[x].
Finally, if S is a ring that contains R and α ∈ S, then

$$R[\alpha] = \Big\{ \sum_{i \geq 0} r_i \alpha^i : r_i \in R, \text{ and } r_i \neq 0 \text{ for only a finite number of } i \Big\}$$

is a homomorphic image of R[x] via the map

$$\sum_{i \geq 0} r_i x^i \mapsto \sum_{i \geq 0} r_i \alpha^i.$$

Hence, R[x] is uniquely determined by R and x. We remark that R[α] must be commutative.

If R is an integral domain, then irreducible polynomials are defined as irreducibles
in the ring R[x]. If R is a field, then a nonconstant polynomial f(x) is irreducible if there is no
factorization f(x) = g(x)h(x), where g(x) and h(x) are polynomials of lower degree than
f(x). Otherwise, f(x) is called reducible. In elementary mathematics, polynomials are
considered as functions. We recover that idea via the concept of evaluation.

Definition 4.1.3. Let $f(x) = r_0 + r_1 x + \cdots + r_n x^n$ be a polynomial over a commutative
ring R with an identity, and let c ∈ R. Then the element

$$f(c) = r_0 + r_1 c + \cdots + r_n c^n \in R$$

is called the evaluation of f(x) at c.

Definition 4.1.4. If f (x) ∈ R[x] and f (c) = 0 for c ∈ R, then c is called a zero or a root
of f (x) in R.
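
Evaluation is easy to carry out mechanically. The sketch below is an illustration under our own conventions (coefficients stored lowest degree first, Python integers as the ring; the function name `evaluate` is hypothetical). It uses Horner's rule, which rewrites $r_0 + r_1 c + \cdots + r_n c^n$ as $r_0 + c(r_1 + c(r_2 + \cdots))$ and so needs only n multiplications.

```python
def evaluate(coeffs, c):
    """Evaluate r_0 + r_1 c + ... + r_n c^n by Horner's rule.

    coeffs lists the coefficients lowest degree first: [r_0, r_1, ..., r_n].
    """
    result = 0
    for r in reversed(coeffs):
        result = result * c + r
    return result


def is_zero_of(coeffs, c):
    """True if c is a zero (root) of the polynomial with the given coefficients."""
    return evaluate(coeffs, c) == 0


value_at_2 = evaluate([1, 0, 1], 2)   # x^2 + 1 evaluated at 2
root_check = is_zero_of([-1, 0, 1], 1)  # is 1 a zero of x^2 - 1?
```

With these inputs `value_at_2` is 5 and `root_check` is `True`.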


4.2 Polynomial rings over fields


We now restate some of the results of the last chapter for K[x], where K is a field. We
then consider some consequences of these results for zeros of polynomials.

Theorem 4.2.1. If K is a field, then K[x] forms an integral domain. K can be naturally
embedded into K[x] by identifying each element of K with the corresponding constant
polynomial. The only units in K[x] are the nonzero elements of K.

Proof. Verification of the basic ring properties is solely computational and is left to the
exercises. Since deg P(x)Q(x) = deg P(x) + deg Q(x), it follows that if P(x) ≠ 0
and Q(x) ≠ 0, then P(x)Q(x) ≠ 0. Therefore, K[x] is an integral domain.
If G(x) is a unit in K[x], then there exists an H(x) ∈ K[x] with G(x)H(x) = 1.
From the degrees, we have deg G(x) + deg H(x) = 0, where deg G(x) ≥ 0 and
deg H(x) ≥ 0. This is possible only if deg G(x) = deg H(x) = 0. Therefore, G(x) ∈ K.

Now that we have K[x] as an integral domain, we proceed to show that K[x] is a
principal ideal domain and, hence, there is unique factorization into primes. We first
repeat the definition of a prime in K[x]. If a nonconstant f(x) has no nontrivial, nonunit factors (that is, it
cannot be factorized into polynomials of lower degree), then f(x) is a prime in K[x] or a
prime polynomial. A prime polynomial is also called an irreducible polynomial over K.
Clearly, if deg g(x) = 1, then g(x) is irreducible.
The fact that K[x] is a principal ideal domain follows from the division algorithm
for polynomials, which is entirely analogous to the division algorithm for integers.

Theorem 4.2.2 (Division algorithm in K[x]). If 0 ≠ f (x), 0 ≠ g(x) ∈ K[x], then there ex-
ist unique polynomials q(x), r(x) ∈ K[x] such that f (x) = q(x)g(x) + r(x), where r(x) = 0,
or deg r(x) < deg g(x). (The polynomials q(x) and r(x) are called respectively the quo-
tient and remainder.)

Proof. If deg f(x) = 0 and deg g(x) ≥ 1, then we just choose q(x) = 0, and r(x) = f(x).
If deg f(x) = 0 = deg g(x), then f(x) = f ∈ K, and g(x) = g ∈ K, and we choose
$q(x) = \frac{f}{g}$ and r(x) = 0. Hence, Theorem 4.2.2 is proved for deg f(x) = 0; the
uniqueness statement is clear in this case.
Now, let n > 0, and suppose that Theorem 4.2.2 is proved for all f(x) ∈ K[x] with deg f(x) < n.
Let

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \quad\text{with } a_n \neq 0, \text{ and}$$

$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0, \quad\text{with } b_m \neq 0,\ m \geq 0.$$

If m > n, then just choose q(x) = 0 and r(x) = f(x).
Now, finally, let 0 ≤ m ≤ n. We define

$$h(x) = f(x) - \frac{a_n}{b_m} x^{n-m} g(x).$$

We have h(x) = 0 or deg h(x) < n. If h(x) = 0, set $q_1(x) = 0$ and r(x) = 0. Otherwise, by the induction
assumption, there are $q_1(x)$ and r(x) with
$h(x) = q_1(x)g(x) + r(x)$ and r(x) = 0 or deg r(x) < deg g(x). Then

$$\begin{aligned}
f(x) &= h(x) + \frac{a_n}{b_m} x^{n-m} g(x) \\
&= \Big( \frac{a_n}{b_m} x^{n-m} + q_1(x) \Big) g(x) + r(x) \\
&= q(x)g(x) + r(x) \quad\text{with } q(x) = \frac{a_n}{b_m} x^{n-m} + q_1(x),
\end{aligned}$$

which proves the existence.


We now show the uniqueness. Let

$$f(x) = q_1(x)g(x) + r_1(x) = q_2(x)g(x) + r_2(x),$$

with

$$\deg r_1(x) < \deg g(x), \quad\text{and}\quad \deg r_2(x) < \deg g(x).$$

Assume $r_1(x) \neq r_2(x)$. Let $\deg r_1(x) \geq \deg r_2(x)$. We get

$$(q_2(x) - q_1(x))\,g(x) = r_1(x) - r_2(x),$$

which gives a contradiction because $\deg(r_1(x) - r_2(x)) < \deg g(x)$, and $q_2(x) - q_1(x) \neq 0$
if $r_1(x) \neq r_2(x)$. Therefore, $r_1(x) = r_2(x)$, and furthermore $q_1(x) = q_2(x)$ because K[x] is
an integral domain.

Example 4.2.3. Let $f(x) = 2x^3 + x^2 - 5x + 3$ and $g(x) = x^2 + x + 1$. Then

$$\frac{2x^3 + x^2 - 5x + 3}{x^2 + x + 1} = 2x - 1 \quad\text{with remainder } -6x + 4.$$

Hence, q(x) = 2x − 1, r(x) = −6x + 4, and

$$2x^3 + x^2 - 5x + 3 = (2x - 1)(x^2 + x + 1) + (-6x + 4).$$
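
The existence part of the proof of the division algorithm is effectively a procedure: repeatedly cancel the leading term of the current remainder with a suitable multiple of g(x). A minimal Python sketch over K = ℚ follows, using the standard library's `Fraction` for exact arithmetic. The coefficient-list convention (lowest degree first) and the name `poly_divmod` are our assumptions for illustration.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = 0 or deg r < deg g, over the rationals.

    Polynomials are coefficient lists, lowest degree first; [] stands for 0.
    """
    g = [Fraction(b) for b in g]
    while g and g[-1] == 0:
        g.pop()
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")

    r = [Fraction(a) for a in f]
    while r and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)

    # Each pass subtracts (a_n / b_m) x^(n-m) g(x), as in the proof,
    # which strictly lowers the degree of the remainder.
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]
        q[shift] = coeff
        for i, b in enumerate(g):
            r[i + shift] -= coeff * b
        while r and r[-1] == 0:
            r.pop()
    return q, r


# The polynomials of Example 4.2.3: f = 2x^3 + x^2 - 5x + 3, g = x^2 + x + 1.
quotient, remainder = poly_divmod([3, -5, 1, 2], [1, 1, 1])
```

For the polynomials of Example 4.2.3 this yields the quotient 2x − 1 (the list `[-1, 2]`) and the remainder −6x + 4 (the list `[4, -6]`).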

Theorem 4.2.4. Let K be a field. Then the polynomial ring K[x] is a principal ideal do-
main, and hence a unique factorization domain.

We now give some consequences relative to zeros of polynomials in K[x].

Theorem 4.2.5. If f (x) ∈ K[x] and c ∈ K with f (c) = 0, then

f (x) = (x − c)h(x),

where deg h(x) < deg f (x).


Proof. Divide f (x) by x − c. Then by the division algorithm, we have

f (x) = (x − c)h(x) + r(x),

where r(x) = 0, or deg r(x) < deg(x −c) = 1. Hence, if r(x) ≠ 0, then r(x) is a polynomial
of degree 0, that is, a constant polynomial, and thus r(x) = r for r ∈ K. Hence, we have

f (x) = (x − c)h(x) + r.

This implies that

0 = f(c) = 0 · h(c) + r = r

and, therefore, r = 0, and f(x) = (x − c)h(x). Since deg(x − c) = 1, we must have that
deg h(x) < deg f(x).

If $f(x) = (x - c)^k h(x)$ for some k ≥ 1 with h(c) ≠ 0, then c is called a zero of order k.

Theorem 4.2.6. Let f (x) ∈ K[x] with degree 2 or 3. Then f is irreducible if and only if
f (x) does not have a zero in K.

Proof. Suppose that f(x) is irreducible of degree 2 or 3. If f(x) has a zero c in K, then from
Theorem 4.2.5, we have f(x) = (x − c)h(x) with h(x) of degree 1 or 2. Therefore, f(x) is
reducible, a contradiction and, hence, f(x) cannot have a zero in K.
Conversely, if f(x) is reducible, then f(x) = g(x)h(x) with both factors of degree at least 1.
Since deg f(x) is 2 or 3, one of the factors, say g(x), has degree 1. Writing g(x) = ax + b
with a ≠ 0, we see that −b/a ∈ K is a zero of g(x) and, hence, f(x) has a
zero in K.
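
Over a finite field, Theorem 4.2.6 turns irreducibility in degree 2 or 3 into a finite search. The sketch below assumes K = ℤ_p for a prime p, with elements represented by the integers 0, …, p − 1; the function names are hypothetical and the primality of p is not checked.

```python
def has_zero_mod_p(coeffs, p):
    """True if the polynomial (coefficients lowest degree first) has a zero in Z_p."""
    return any(
        sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p == 0
        for c in range(p)
    )


def is_irreducible_deg_2_or_3(coeffs, p):
    """Irreducibility test over Z_p, valid only in degree 2 or 3 (Theorem 4.2.6)."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("expected a polynomial of degree 2 or 3 over Z_p")
    return not has_zero_mod_p(coeffs, p)


over_z3 = is_irreducible_deg_2_or_3([1, 0, 1], 3)  # x^2 + 1 over Z_3
over_z2 = is_irreducible_deg_2_or_3([1, 0, 1], 2)  # x^2 + 1 over Z_2
```

Here `over_z3` is `True`, while `over_z2` is `False` because 1 is a zero of $x^2 + 1$ in ℤ_2. The test must not be used in degree 4 or higher, where a polynomial can be reducible without having a zero.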

4.3 Polynomial rings over integral domains


Here we consider R[x] where R is an integral domain.

Definition 4.3.1. Let R be an integral domain. Then $a_1, a_2, \ldots, a_n \in R$ are coprime if the
set of all common divisors of $a_1, a_2, \ldots, a_n$ consists only of units.

Notice that this concept depends on the ring R. For example, 6 and
9 are not coprime over the integers ℤ since 3|6 and 3|9 and 3 is not a unit. However,
6 and 9 are coprime over the rationals ℚ. Here, 3 is a unit.

Definition 4.3.2. Let $f(x) = \sum_{i=0}^{n} r_i x^i \in R[x]$, where R is an integral domain. Then f(x)
is a primitive polynomial or just primitive if $r_0, r_1, \ldots, r_n$ are coprime in R.

Theorem 4.3.3. Let R be an integral domain. Then the following hold:


(a) The units of R[x] are the units of R.
(b) If p is a prime element of R, then p is a prime element of R[x].


Proof. If r ∈ R is a unit, then since R embeds into R[x], it follows that r is also a unit
in R[x]. Conversely, suppose that h(x) ∈ R[x] is a unit. Then there is a g(x) such that
h(x)g(x) = 1. Hence, deg h(x) + deg g(x) = deg 1 = 0. Since degrees are nonnegative
integers, it follows that deg h(x) = deg g(x) = 0 and, hence, h(x) ∈ R with inverse g(x) ∈ R;
that is, h(x) is a unit of R.
Now suppose that p is a prime element of R. Then p ≠ 0, and pR is a prime ideal
in R. We must show that pR[x] is a prime ideal in R[x]. Consider the map

$$\tau : R[x] \to (R/pR)[x] \quad\text{given by}\quad \tau\Big( \sum_{i=0}^{n} r_i x^i \Big) = \sum_{i=0}^{n} (r_i + pR)\, x^i.$$

Then τ is an epimorphism with kernel pR[x]. Since pR is a prime ideal, we know that
R/pR is an integral domain. It follows that (R/pR)[x] is also an integral domain. Hence,
pR[x] must be a prime ideal in R[x], and therefore p is also a prime element of R[x].

Recall that each integral domain R can be embedded into a unique field of frac-
tions K. We can use results on K[x] to deduce some results in R[x].

Lemma 4.3.4. If K is a field, then each nonzero f(x) ∈ K[x] is primitive.

Proof. Since K is a field, each nonzero element of K is a unit. Therefore, the only common
divisors of the coefficients of f(x) are units and, hence, f(x) ∈ K[x] is primitive.

Theorem 4.3.5. Let R be an integral domain. Then each irreducible f (x) ∈ R[x] of degree
> 0 is primitive.

Proof. Let f (x) be an irreducible polynomial in R[x], and let r ∈ R be a common divi-
sor of the coefficients of f (x). Then f (x) = rg(x), where g(x) ∈ R[x]. Then deg f (x) =
deg g(x) > 0, so g(x) ∉ R. Since the units of R[x] are the units of R, it follows that g(x)
is not a unit in R[x]. Since f (x) is irreducible, it follows that r must be a unit in R[x]
and, hence, r is a unit in R. Therefore, f (x) is primitive.

Theorem 4.3.6. Let R be an integral domain and K its field of fractions. If f (x) ∈ R[x] is
primitive and irreducible in K[x], then f (x) is irreducible in R[x].

Proof. Suppose that f (x) ∈ R[x] is primitive and irreducible in K[x], and suppose that
f (x) = g(x)h(x), where g(x), h(x) ∈ R[x] ⊂ K[x]. Since f (x) is irreducible in K[x], either
g(x) or h(x) must be a unit in K[x]. Without loss of generality, suppose that g(x) is a
unit in K[x]. Then g(x) = g ∈ K. But g(x) ∈ R[x], and K ∩ R[x] = R.
Hence, g ∈ R. Then g is a divisor of the coefficients of f(x), and as f(x) is primitive,
g must be a unit in R and, therefore, also a unit in R[x]. Therefore, f(x) is irreducible
in R[x].


4.4 Polynomial rings over unique factorization domains


In this section, we prove that if R is a UFD, then the polynomial ring R[x] is also a UFD.
We first need the following due to Gauss:

Theorem 4.4.1 (Gauss’ lemma). Let R be a UFD and f (x), g(x) primitive polynomials in
R[x]. Then their product f (x)g(x) is also primitive.

Proof. Let R be a UFD and f (x), g(x) primitive polynomials in R[x]. Suppose that
f (x)g(x) is not primitive. Then there is a prime element p ∈ R that divides each of
the coefficients of f (x)g(x). Then p|f (x)g(x). Since prime elements of R are also prime
elements of R[x], it follows that p is also a prime element of R[x] and, hence, p|f (x),
or p|g(x). Therefore, either f (x) or g(x) is not primitive, giving a contradiction.

Theorem 4.4.2. Let R be a UFD and K its field of fractions.


(a) If g(x) ∈ K[x] is nonzero, then there is a nonzero a ∈ K such that ag(x) ∈ R[x] is
primitive.
(b) Let f (x), g(x) ∈ R[x] with g(x) primitive and f (x) = ag(x) for some a ∈ K. Then a ∈ R.
(c) If f (x) ∈ R[x] is nonzero, then there is a b ∈ R and a primitive g(x) ∈ R[x] such that
f (x) = bg(x).

Proof. (a) Suppose that $g(x) = \sum_{i=0}^{n} a_i x^i$ with $a_i = \frac{r_i}{s_i}$, $r_i, s_i \in R$. Set $s = s_0 s_1 \cdots s_n$.
Then sg(x) is a nonzero element of R[x]. Let d be a greatest common divisor of the
coefficients of sg(x). If we set $a = \frac{s}{d}$, then ag(x) is primitive.
(b) For a ∈ K, there are coprime r, s ∈ R satisfying $a = \frac{r}{s}$. Suppose that a ∉ R.
Then there is a prime element p ∈ R dividing s. Since g(x) is primitive, p does not
divide all the coefficients of g(x). However, we also have $f(x) = ag(x) = \frac{r}{s} g(x)$. Hence,
sf(x) = rg(x), where p|s and p does not divide r. Therefore, p divides all the coefficients
of g(x), a contradiction. Hence, a ∈ R.
(c) From part (a), there is a nonzero a ∈ K such that af(x) is primitive in R[x].
Then $f(x) = a^{-1}(af(x))$. From part (b), we must have $a^{-1} \in R$. Set g(x) = af(x) and
$b = a^{-1}$.
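
The construction in part (a) is explicit enough to compute. The following sketch specializes to R = ℤ and K = ℚ (our assumption for illustration) and follows the proof step by step: multiply by the product s of the denominators, then divide by a greatest common divisor d of the resulting integer coefficients. The name `primitive_multiple` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, prod


def primitive_multiple(coeffs):
    """Return (a, h) with a in Q nonzero and h = a*g primitive in Z[x].

    coeffs are the rational coefficients of a nonzero g(x), lowest degree first.
    """
    coeffs = [Fraction(c) for c in coeffs]
    if all(c == 0 for c in coeffs):
        raise ValueError("g(x) must be nonzero")
    s = prod(c.denominator for c in coeffs)   # s = s_0 s_1 ... s_n
    cleared = [int(c * s) for c in coeffs]    # coefficients of s*g(x), all integers
    d = reduce(gcd, cleared)                  # a greatest common divisor
    return Fraction(s, d), [n // d for n in cleared]


# g(x) = (2/3) x + 4/5
a, h = primitive_multiple([Fraction(4, 5), Fraction(2, 3)])
```

For $g(x) = \frac{2}{3}x + \frac{4}{5}$ this gives s = 15, d = 2, hence a = 15/2 and the primitive polynomial 5x + 6 (the list `[6, 5]`).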

Theorem 4.4.3. Let R be a UFD and K its field of fractions. Let f (x) ∈ R[x] be a polyno-
mial of degree ≥ 1.
(a) If f (x) is primitive and f (x)|g(x) in K[x], then f (x) divides g(x) also in R[x].
(b) If f (x) is irreducible in R[x], then it is also irreducible in K[x].
(c) If f (x) is primitive and a prime element of K[x], then f (x) is also a prime element of
R[x].

Proof. (a) Suppose that g(x) = f(x)h(x) with h(x) ∈ K[x]. From Theorem 4.4.2 part
(a), there is a nonzero a ∈ K such that $h_1(x) = ah(x)$ is primitive in R[x]. Hence,
$g(x) = \frac{1}{a}(f(x)h_1(x))$. From Gauss' lemma, $f(x)h_1(x)$ is primitive in R[x]. Therefore, from
Theorem 4.4.2 part (b), we have $\frac{1}{a} \in R$. It follows that f(x)|g(x) in R[x].


(b) Suppose that g(x) ∈ K[x] is a factor of f (x). From Theorem 4.4.2 part (a), there
is a nonzero a ∈ K with g1 (x) = ag(x) primitive in R[x]. Since a is a unit in K, it follows
that

g(x)|f (x) in K[x] implies g1 (x)|f (x) in K[x]

and, hence, since g1 (x) is primitive

g1 (x)|f (x) in R[x].

However, by assumption, f (x) is irreducible in R[x]. This implies that either g1 (x) is a
unit in R, or g1 (x) is an associate of f (x).
If g1 (x) is a unit, then g1 ∈ K, and g1 = ga. Hence, g ∈ K; that is, g = g(x) is a unit.
If g1 (x) is an associate of f (x), then f (x) = bg(x), where b ∈ K since g1 (x) = ag(x)
with a ∈ K. Combining these, it follows that f (x) has only trivial factors in K[x], and
since—by assumption—f (x) is nonconstant, it follows that f (x) is irreducible in K[x].
(c) Suppose that f (x)|g(x)h(x) with g(x), h(x) ∈ R[x]. Since f (x) is a prime element
in K[x], we have that f (x)|g(x) or f (x)|h(x) in K[x]. From part (a), we have f (x)|g(x) or
f (x)|h(x) in R[x] implying that f (x) is a prime element in R[x].

We can now state and prove our main result.

Theorem 4.4.4 (Gauss). Let R be a UFD. Then the polynomial ring R[x] is also a UFD.

Proof. By induction, on degree, we show that each nonunit f (x) ∈ R[x], f (x) ≠ 0, is
a product of prime elements. Since R is an integral domain, so is R[x]. Therefore, the
fact that R[x] is a UFD then follows from Theorem 3.3.3.
If deg f (x) = 0, then f (x) = f is a nonunit in R. Since R is a UFD, f is a product
of prime elements in R. However, from Theorem 4.3.3, each prime factor is then also
prime in R[x]. Therefore, f (x) is a product of prime elements.
Now suppose n > 0 and that the claim is true for all polynomials f (x) of degree
< n. Let f (x) be a polynomial of degree n > 0. From Theorem 4.4.2 (c), there is an a ∈ R
and a primitive h(x) ∈ R[x] satisfying f (x) = ah(x). Since R is a UFD, the element a is a
product of prime elements in R, or a is a unit in R. Since the units in R[x] are the units
in R, and a prime element in R is also a prime element in R[x], it follows that a is a
product of prime elements in R[x], or a is a unit in R[x]. Let K be the field of fractions
of R. Then K[x] is a UFD. Hence, h(x) is a product of prime elements of K[x]. Let p(x) ∈
K[x] be a prime divisor of h(x). From Theorem 4.4.2 (a), we can assume by multiplication
by a field element that p(x) ∈ R[x], and p(x) is primitive. From Theorem 4.4.3 (c), it
follows that p(x) is a prime element of R[x]. Furthermore, from Theorem 4.4.3 (a), p(x)
is a divisor of h(x) in R[x]. Therefore,

f (x) = ah(x) = ap(x)g(x) ∈ R[x],

where the following hold:


(1) a is a product of prime elements of R[x], or a is a unit in R[x],


(2) deg p(x) > 0, since p(x) is a prime element in K[x],
(3) p(x) is a prime element in R[x], and
(4) deg g(x) < deg f (x) since deg p(x) > 0.

By our inductive hypothesis, we have then that g(x) is a product of prime elements in
R[x], or g(x) is a unit in R[x]. Therefore, the claim holds for f (x), and therefore holds
for all f (x) by induction.

If R[x] is a polynomial ring over R, we can form a polynomial ring in a new inde-
terminate y over this ring to form (R[x])[y]. It is straightforward that (R[x])[y] is iso-
morphic to (R[y])[x]. We denote both of these rings by R[x, y] and consider this as the
ring of polynomials in two commuting variables x, y with coefficients in R.
If R is a UFD, then from Theorem 4.4.4, R[x] is also a UFD. Hence, R[x, y] is
also a UFD. Inductively then, the ring of polynomials in n commuting variables
R[x1 , x2 , . . . , xn ] is also a UFD. Here, R[x1 , . . . , xn ] is inductively given by R[x1 , . . . , xn ] =
(R[x1 , . . . , xn−1 ])[xn ] if n > 2.

Corollary 4.4.5. If R is a UFD, then the polynomial ring in n commuting variables
$R[x_1, \ldots, x_n]$ is also a UFD.

We now give a necessary condition for an element of K to be a zero of a monic polynomial
in R[x], where K is the field of fractions of R.

Theorem 4.4.6. Let R be a UFD and K its field of fractions. Let
$f(x) = x^n + r_{n-1}x^{n-1} + \cdots + r_0 \in R[x]$. Suppose that β ∈ K is a zero of f(x). Then β is in R and is a divisor of $r_0$.

Proof. Let $\beta = \frac{r}{s}$, where s ≠ 0, and r, s ∈ R and r, s are coprime. Now

$$f\Big(\frac{r}{s}\Big) = 0 = \frac{r^n}{s^n} + r_{n-1}\frac{r^{n-1}}{s^{n-1}} + \cdots + r_0.$$

Multiplying by $s^n$ gives $r^n = -s\,(r_{n-1}r^{n-1} + r_{n-2}r^{n-2}s + \cdots + r_0 s^{n-1})$.
Hence, it follows that s must divide $r^n$. Since r and s are coprime, s must be a unit, and
then, without loss of generality, we may assume that s = 1. Then β = r ∈ R, and

$$r\,(r^{n-1} + r_{n-1}r^{n-2} + \cdots + r_1) = -r_0,$$

and so $r \mid r_0$.
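
For R = ℤ the theorem gives a finite search: every rational zero of a monic integer polynomial is an integer dividing the constant term. The sketch below is a minimal illustration under that assumption; the names are hypothetical, and the case $r_0 = 0$ is excluded because every integer divides 0 (one would first factor out a power of x).

```python
def evaluate(coeffs, c):
    """Horner evaluation; coefficients are listed lowest degree first."""
    result = 0
    for r in reversed(coeffs):
        result = result * c + r
    return result


def rational_zeros_of_monic(coeffs):
    """All rational zeros of a monic polynomial in Z[x] with nonzero constant term.

    By Theorem 4.4.6 (with R = Z) they are integers dividing r_0.
    """
    if coeffs[-1] != 1:
        raise ValueError("polynomial must be monic")
    r0 = coeffs[0]
    if r0 == 0:
        raise ValueError("constant term must be nonzero; factor out x first")
    divisors = [d for d in range(1, abs(r0) + 1) if r0 % d == 0]
    candidates = divisors + [-d for d in divisors]
    return sorted(c for c in candidates if evaluate(coeffs, c) == 0)


zeros = rational_zeros_of_monic([6, -5, 1])      # x^2 - 5x + 6
no_zeros = rational_zeros_of_monic([-2, 0, 1])   # x^2 - 2
```

Here `zeros` is `[2, 3]`, and `no_zeros` is empty, which recovers the irrationality of √2.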

Note that since ℤ is a UFD, Gauss’ theorem implies that ℤ[x] is also a UFD. How-
ever, ℤ[x] is not a principal ideal domain. For example, the set of integral polynomials
with even constant term is an ideal, but not principal. We leave the verification to the
exercises. On the other hand, we saw that if K is a field, K[x] is a PID. The question
arises as to when R[x] actually is a principal ideal domain. It turns out to be precisely
when R is a field.


Theorem 4.4.7. Let R be a commutative ring with an identity. Then the following are
equivalent:
(a) R is a field.
(b) R[x] is Euclidean.
(c) R[x] is a principal ideal domain.

Proof. From Section 4.2, we know that (a) implies (b), which in turn implies (c). There-
fore, we must show that (c) implies (a). Assume then that R[x] is a principal ideal do-
main. Define the map

τ : R[x] → R

by

τ(f (x)) = f (0).

It is easy to see that τ is a surjective ring homomorphism, so R[x]/ker(τ) ≅ R. Therefore,
ker(τ) ≠ R[x]. Since R[x] is a principal ideal domain, it is an integral domain, and hence
so is its subring R. It follows that ker(τ) must be a prime ideal since the quotient ring is
an integral domain. Moreover, ker(τ) ≠ {0} because x ∈ ker(τ). However,
since R[x] is a principal ideal domain, nonzero prime ideals are maximal ideals; hence,
ker(τ) is a maximal ideal by Theorem 3.2.7. Therefore, R ≅ R[x]/ker(τ) is a field.

We now consider the relationship between irreducibles in R[x] for a general inte-
gral domain and irreducibles in K[x], where K is its field of fractions. This is handled
by the next result called Eisenstein’s criterion.

Theorem 4.4.8 (Eisenstein’s criterion). Let R be an integral domain and K its field of
fractions. Let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ be of degree n > 0. Let p be a prime element of R
satisfying the following:
(1) $p \mid a_i$ for i = 0, . . . , n − 1.
(2) p does not divide $a_n$.
(3) $p^2$ does not divide $a_0$.

Then the following hold:


(a) If f (x) is primitive, then f (x) is irreducible in R[x].
(b) Suppose that R is a UFD. Then f (x) is also irreducible in K[x].

Proof. (a) Suppose that f(x) = g(x)h(x) with g(x), h(x) ∈ R[x]. Suppose that

$$g(x) = \sum_{i=0}^{k} b_i x^i,\ b_k \neq 0 \quad\text{and}\quad h(x) = \sum_{j=0}^{l} c_j x^j,\ c_l \neq 0.$$

Then $a_0 = b_0 c_0$. Now $p \mid a_0$, but $p^2$ does not divide $a_0$. This implies that p divides
exactly one of $b_0$ and $c_0$. Without loss of generality, assume that $p \mid b_0$ and
p does not divide $c_0$.
Since $a_n = b_k c_l$, and p does not divide $a_n$, it follows that p does not divide $b_k$. Let
$b_j$ be the first coefficient of g(x), which is not divisible by p. Consider

$$a_j = b_j c_0 + b_{j-1} c_1 + \cdots + b_0 c_j,$$

where everything after the first term is divisible by p. Since p is prime and divides neither $b_j$
nor $c_0$, it follows that p does not divide $b_j c_0$. Therefore, p does not divide $a_j$, which
implies that j = n. Then from j ≤ k ≤ n, it follows that k = n. Therefore, deg g(x) =
deg f(x) and, hence, deg h(x) = 0. Thus, h(x) = h ∈ R. Then from f(x) = hg(x) with f
primitive, it follows that h is a unit and, therefore, f(x) is irreducible.
(b) Suppose that f (x) = g(x)h(x) with g(x), h(x) ∈ R[x]. The fact that f (x) was
primitive was only used in the final part of part (a). Therefore, by the same arguments
as in part (a), we may assume—without loss of generality—that h ∈ R ⊂ K. Therefore,
f (x) is irreducible in K[x].

Following are some examples:

Example 4.4.9. Let R = ℤ and p a prime number. Suppose that n, m are integers such
that n ≥ 1 and p does not divide m. Then $x^n \pm pm$ is irreducible in ℤ[x] and ℚ[x]. In
particular, for n ≥ 2, $(pm)^{1/n}$ is irrational.

Example 4.4.10. Let R = ℤ and p a prime number. Consider the polynomial

$$\Phi_p(x) = \frac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + 1.$$

Since all the coefficients of $\Phi_p(x)$ are equal to 1, Eisenstein’s criterion is not directly
applicable. However, for any integer a, the polynomial $\Phi_p(x)$ is irreducible in ℤ[x] if and
only if $\Phi_p(x + a)$ is irreducible in ℤ[x]. We have

$$\Phi_p(x+1) = \frac{(x+1)^p - 1}{(x+1) - 1} = \frac{x^p + \binom{p}{1}x^{p-1} + \cdots + \binom{p}{p-1}x + 1 - 1}{x} = x^{p-1} + \binom{p}{1}x^{p-2} + \cdots + \binom{p}{p-1}.$$

Now $p \mid \binom{p}{i}$ for 1 ≤ i ≤ p − 1 (see exercises) and, moreover, $\binom{p}{p-1} = p$ is not divisible
by $p^2$. Therefore, we can apply the Eisenstein criterion to $\Phi_p(x + 1)$ to conclude that $\Phi_p(x)$ is
irreducible in ℤ[x] and ℚ[x].
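
The three divisibility conditions of Eisenstein's criterion are simple to check by machine. The sketch below assumes R = ℤ and takes the prime p as given without verifying primality. The function names are hypothetical, and `shifted_cyclotomic` uses the expansion $\Phi_p(x+1) = \sum_{i=1}^{p} \binom{p}{i} x^{i-1}$ obtained above.

```python
from math import comb


def eisenstein_applies(coeffs, p):
    """Check conditions (1)-(3) of Eisenstein's criterion over Z for the prime p.

    coeffs = [a_0, a_1, ..., a_n], lowest degree first, with n > 0.
    """
    *lower, leading = coeffs
    return (
        all(a % p == 0 for a in lower)      # (1) p divides a_0, ..., a_{n-1}
        and leading % p != 0                # (2) p does not divide a_n
        and coeffs[0] % (p * p) != 0        # (3) p^2 does not divide a_0
    )


def shifted_cyclotomic(p):
    """Coefficients of Phi_p(x + 1), lowest degree first."""
    return [comb(p, i) for i in range(1, p + 1)]


checks = {p: eisenstein_applies(shifted_cyclotomic(p), p) for p in (2, 3, 5, 7, 11)}
direct = eisenstein_applies([1, 1, 1, 1, 1], 5)   # Phi_5(x) itself
```

Every entry of `checks` is `True`, whereas `direct` is `False`, in line with the remark that the criterion does not apply to $\Phi_p(x)$ directly. A `False` result only means the criterion is inconclusive for that prime; it says nothing about reducibility.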

Theorem 4.4.11. Let R be a UFD and K its field of fractions. Let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$
be a polynomial of degree n ≥ 1. Let P be a prime ideal in R with $a_n \notin P$. Let $\bar{R} = R/P$, and
let $\alpha : R[x] \to \bar{R}[x]$ be defined by

$$\alpha\Big( \sum_{i=0}^{m} r_i x^i \Big) = \sum_{i=0}^{m} (r_i + P)\, x^i.$$

Then α is an epimorphism, and if α(f(x)) is irreducible in $\bar{R}[x]$, then f(x) is irreducible in K[x].


Proof. By Theorem 4.4.2 (c), there is an a ∈ R and a primitive g(x) ∈ R[x] satisfying f(x) =
ag(x). Since $a_n \notin P$, we have that α(a) ≠ 0. Furthermore, the highest coefficient of
g(x) is also not an element of P. If α(g(x)) is reducible, then α(f(x)) = α(a)α(g(x)) is also reducible.
Thus, α(g(x)) is irreducible. If the theorem holds for primitive polynomials, then g(x) is
irreducible in K[x] and, since a is a unit in K, f(x) = ag(x) is also irreducible in K[x]. Therefore,
to prove the theorem, it suffices to consider the case where f(x) is primitive in R[x].
Now suppose that f(x) is primitive. We show that f(x) is irreducible in R[x]; by
Theorem 4.4.3 (b), f(x) is then also irreducible in K[x].
Suppose that f(x) = g(x)h(x), g(x), h(x) ∈ R[x] with h(x), g(x) nonunits in R[x].
Since f(x) is primitive, g, h ∉ R. Therefore, deg g(x) < deg f(x), and deg h(x) < deg f(x).
Now we have α(f(x)) = α(g(x))α(h(x)). Since P is a prime ideal, R/P is an integral
domain. Therefore, in (R/P)[x] we have

deg α(g(x)) + deg α(h(x)) = deg α(f (x)) = deg f (x)

since an ∉ P. Since R is a UFD, it has no zero divisors. Therefore,

deg f (x) = deg g(x) + deg h(x).

Now

deg α(g(x)) ≤ deg g(x)


deg α(h(x)) ≤ deg h(x).

Therefore, deg α(g(x)) = deg g(x) ≥ 1, and deg α(h(x)) = deg h(x) ≥ 1, so neither factor
is a unit. Therefore, α(f(x)) is reducible, and we have a contradiction.

It is important to note that the reducibility of α(f(x)) does not imply that f(x) is
reducible. For example, $f(x) = x^2 + 1$ is irreducible in ℤ[x]. However, in $\mathbb{Z}_2[x]$, we have

$$x^2 + 1 = (x + 1)^2$$

and, hence, f(x) is reducible in $\mathbb{Z}_2[x]$.

Example 4.4.12. Let $f(x) = x^5 - x^2 + 1 \in \mathbb{Z}[x]$. Choose P = 2ℤ so that

$$\alpha(f(x)) = x^5 + x^2 + 1 \in \mathbb{Z}_2[x].$$

Suppose that in $\mathbb{Z}_2[x]$, we have α(f(x)) = g(x)h(x) with g(x), h(x) nonunits. Without loss
of generality, we may assume that g(x) is of degree 1 or 2.
If deg g(x) = 1, then α(f(x)) has a zero c in $\mathbb{Z}_2$. The two possibilities for c are
c = 0, or c = 1. Then the following hold:

If c = 0, then 0 + 0 + 1 = 1 ≠ 0.
If c = 1, then 1 + 1 + 1 = 1 ≠ 0.

Hence, the degree of g(x) cannot be 1.


Suppose deg g(x) = 2. The polynomials of degree 2 over $\mathbb{Z}_2$ are

$$x^2 + x + 1, \quad x^2 + x, \quad x^2 + 1, \quad x^2.$$

The last three, $x^2 + x$, $x^2 + 1$, $x^2$, all have zeros in $\mathbb{Z}_2$. Therefore, they cannot divide
α(f(x)). Therefore, g(x) must be $x^2 + x + 1$. Applying the division algorithm, we obtain

$$\alpha(f(x)) = (x^3 + x^2)(x^2 + x + 1) + 1$$

and, therefore, $x^2 + x + 1$ does not divide α(f(x)). It follows that α(f(x)) is irreducible,
and from the previous theorem, f(x) must be irreducible in ℚ[x].
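
The case analysis in this example can be automated by trial division. The sketch below is our own illustration over ℤ_2 only: polynomials are lists of 0s and 1s (lowest degree first, leading entry 1), and a polynomial of degree n is declared irreducible when no polynomial of degree between 1 and n/2 divides it. The function names are hypothetical.

```python
from itertools import product


def divmod_mod2(f, g):
    """Quotient and remainder of f by g over Z_2 (g must have leading coefficient 1)."""
    r = [a % 2 for a in f]
    while r and r[-1] == 0:
        r.pop()
    q = [0] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        q[shift] = 1
        for i, b in enumerate(g):
            r[i + shift] = (r[i + shift] - b) % 2
        while r and r[-1] == 0:
            r.pop()
    return q, r


def is_irreducible_mod2(f):
    """Trial division by every polynomial of degree 1, ..., deg(f) // 2 over Z_2."""
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for lower in product((0, 1), repeat=d):
            g = list(lower) + [1]
            _, remainder = divmod_mod2(f, g)
            if not remainder:
                return False
    return True


reduced_f = [1, 0, 1, 0, 0, 1]                       # x^5 + x^2 + 1
q, r = divmod_mod2(reduced_f, [1, 1, 1])             # divide by x^2 + x + 1
irreducible = is_irreducible_mod2(reduced_f)
```

The division returns the quotient $x^3 + x^2$ (`[0, 0, 1, 1]`) and remainder 1 (`[1]`), as in the example, and `irreducible` is `True`. Applied to `[1, 0, 1]`, that is $x^2 + 1$, the test returns `False`.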

4.5 Exercises
1. For which a, b ∈ ℤ does the polynomial $x^2 + 3x + 1$ divide the polynomial $x^3 + x^2 + ax + b$?
2. Let a + bi ∈ ℂ be a zero of f(x) ∈ ℝ[x]. Show that a − bi is also a zero of f(x).
3. Determine all irreducible quadratic polynomials over ℝ.
4. Let R be an integral domain, I ⊲ R an ideal, and f ∈ R[x] a monic polynomial.
Define $\bar{f} \in (R/I)[x]$ by the mapping R[x] → (R/I)[x], $f = \sum a_i x^i \mapsto \bar{f} = \sum \bar{a}_i x^i$, where
$\bar{a} := a + I$. Show that if $\bar{f} \in (R/I)[x]$ is irreducible, then f ∈ R[x] is also irreducible.
5. Decide if the following polynomials f ∈ R[x] are irreducible:
(i) $f(x) = x^3 + 2x^2 + 3$, R = ℤ.
(ii) $f(x) = x^5 - 2x + 1$, R = ℚ.
(iii) $f(x) = 3x^4 + 7x^2 + 14x + 7$, R = ℚ.
(iv) $f(x) = x^7 + (3 - i)x^2 + (3 + 4i)x + 4 + 2i$, R = ℤ[i].
(v) $f(x) = x^4 + 3x^3 + 2x^2 + 3x + 4$, R = ℚ.
(vi) $f(x) = 8x^3 - 4x^2 + 2x - 1$, R = ℤ.
6. Let R be an integral domain with characteristic 0, let k ≥ 1 and α ∈ R. In R[x],
define the derivatives $f^{(k)}(x)$, k = 0, 1, 2, . . . , of a polynomial f(x) ∈ R[x] by

$$f^{(0)}(x) := f(x), \qquad f^{(k)}(x) := \big(f^{(k-1)}\big)'(x).$$

Show that α is a zero of order k of the polynomial f(x) ∈ R[x], if $f^{(k-1)}(\alpha) = 0$, but
$f^{(k)}(\alpha) \neq 0$.
7. Prove that the set of integral polynomials with even constant term is an ideal, but
not principal.
8. Prove that $p \mid \binom{p}{i}$ for 1 ≤ i ≤ p − 1, where p is a prime number.

5 Field extensions
5.1 Extension fields and finite extensions
Much of algebra in general arose from the theory of equations, specifically polynomial
equations. As discovered by Galois and Abel, the solution of polynomial equations
over fields is intimately tied to the theory of field extensions. This theory eventually
blossoms into Galois theory. In this chapter, we discuss the basic material concerning
field extensions.
Recall that if L is a field and K ⊂ L is also a field under the same operations as L,
then K is called a subfield of L. If we view this situation from the viewpoint of K, we
say that L is an extension field or field extension of K. If K, L are fields with K ⊂ L, we
always assume that K is a subfield of L.

Definition 5.1.1. If K, L are fields with K ⊂ L, then we say that L is a field extension or
extension field of K. We denote this by L|K.
Note that this is equivalent to having a field monomorphism

i:K→L

and then identifying K and i(K).

As examples, we have that ℝ is an extension field of ℚ, and ℂ is an extension
field of both ℝ and ℚ. If K is any field then the ring of polynomials K[x] over K is an
integral domain. Let K(x) be the field of fractions of K[x]. This is called the field of
rational functions over K. Since K can be considered as part of K[x], it follows that
K ⊂ K(x) and, hence, K(x) is an extension field of K.
A crucial concept is that of the degree of a field extension. Recall that a vector
space V over a field K consists of an abelian group V together with scalar multiplica-
tion from K satisfying the following:
(1) fv ∈ V if f ∈ K, v ∈ V.
(2) f (u + v) = fu + fv for f ∈ K, u, v ∈ V.
(3) (f + g)v = fv + gv for f , g ∈ K, v ∈ V.
(4) (fg)v = f (gv) for f , g ∈ K, v ∈ V.
(5) 1v = v for v ∈ V.

Notice that if K is a subfield of L, then products of elements of L with elements of K are


still in L. Since L is an abelian group under addition, L can be considered as a vector
space over K. Thus, any extension field is a vector space over any of its subfields. Using
this, we define the degree |L : K| of an extension K ⊂ L as the dimension dimK (L) of L
as a vector space over K. We call L a finite extension of K if |L : K| < ∞.

Definition 5.1.2. If L is an extension field of K, then the degree of the extension L|K
is defined as the dimension, $\dim_K(L)$, of L, as a vector space over K. We denote the
degree by |L : K|. The field extension L|K is a finite extension if the degree |L : K| is
finite.

https://doi.org/10.1515/9783110603996-005

Lemma 5.1.3. |ℂ : ℝ| = 2, but |ℝ : ℚ| = ∞.

Proof. Every complex number can be written uniquely as a+ib, where a, b ∈ ℝ. Hence,
the elements 1, i constitute a basis for ℂ over ℝ and, therefore, the dimension is 2. That
is, |ℂ : ℝ| = 2.
The fact that |ℝ : ℚ| = ∞ depends on the existence of transcendental numbers.
An element r ∈ ℝ is algebraic (over ℚ) if it satisfies some nonzero polynomial with
coefficients from ℚ. That is, P(r) = 0, where

\[ 0 \neq P(x) = a_0 + a_1 x + \dots + a_n x^n \quad \text{with } a_i \in \mathbb{Q}. \]

Any q ∈ ℚ is algebraic since if P(x) = x − q, then P(q) = 0. However, many irrationals
are also algebraic. For example, √2 is algebraic since $x^2 - 2$ has √2 as a zero. An
element r ∈ ℝ is transcendental if it is not algebraic.
In general, it is very difficult to show that a particular element is transcendental.
However, there are uncountably many transcendental elements (see exercises). Spe-
cific examples are e and π. We will give a proof of their transcendence in Chapter 20.
Since e is transcendental, for any natural number n, the set of vectors
$\{1, e, e^2, \dots, e^n\}$ must be linearly independent over ℚ, for otherwise there would be a
nonzero rational polynomial that e would satisfy. Therefore, ℝ contains arbitrarily large finite
sets of vectors that are linearly independent over ℚ, which would be impossible if ℝ had
finite degree over ℚ.

Lemma 5.1.4. If K is any field, then |K(x) : K| = ∞.

Proof. For any n, the elements $1, x, x^2, \dots, x^n$ are linearly independent over K. Therefore, as in
the proof of Lemma 5.1.3, K(x) must be infinite dimensional over K.

If L|K and $L_1|K_1$ are field extensions, then they are isomorphic field extensions if
there exists a field isomorphism $f : L \to L_1$ such that the restriction $f|_K$ is an
isomorphism from K to $K_1$.
Suppose that K ⊂ L ⊂ M are fields. Below we show that the degrees multiply. In
this situation, where K ⊂ L ⊂ M, we call L an intermediate field.

Theorem 5.1.5. Let K, L, M be fields with K ⊂ L ⊂ M. Then

|M : K| = |M : L||L : K|.

Note that |M : K| = ∞ if and only if either |M : L| = ∞, or |L : K| = ∞.

Proof. Let $\{x_i : i \in I\}$ be a basis for L as a vector space over K, and let $\{y_j : j \in J\}$ be a
basis for M as a vector space over L. To prove the result, it is sufficient to show that the
set

\[ B = \{x_i y_j : i \in I,\ j \in J\} \]

is a basis for M as a vector space over K. To show this, we must show that B is a linearly
independent set over K, and that B spans M.
Suppose that

\[ \sum_{i,j} k_{ij} x_i y_j = 0 \quad \text{where } k_{ij} \in K. \]

We can then write this sum as

\[ \sum_j \Bigl( \sum_i k_{ij} x_i \Bigr) y_j = 0. \]

But $\sum_i k_{ij} x_i \in L$. Since $\{y_j : j \in J\}$ is a basis for M over L, the $y_j$ are independent over
L; hence, for each j, we get $\sum_i k_{ij} x_i = 0$. Now since $\{x_i : i \in I\}$ is a basis for L over K, it
follows that the $x_i$ are linearly independent, and since for each j we have $\sum_i k_{ij} x_i = 0$,
it must be that $k_{ij} = 0$ for all i and for all j. Therefore, the set B is linearly independent
over K.
Now suppose that m ∈ M. Then since $\{y_j : j \in J\}$ spans M over L, we have

\[ m = \sum_j c_j y_j \quad \text{with } c_j \in L. \]

However, $\{x_i : i \in I\}$ spans L over K, and so for each $c_j$, we have

\[ c_j = \sum_i k_{ij} x_i \quad \text{with } k_{ij} \in K. \]

Combining these two sums, we have

\[ m = \sum_{i,j} k_{ij} x_i y_j \]

and, hence, B spans M over K. Therefore, B is a basis for M over K, and the result is
proved.
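
To make the product basis concrete, here is a minimal sketch (our own illustration, not part of the original text) for K = ℚ, L = ℚ(√2), and M = ℚ(√2, √3). It assumes the standard bases {1, √2} of L over K and {1, √3} of M over L; the names X, Y, B, and multiply are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction
from itertools import product

# Bases stored as (label, square): the second entry is the square of the basis vector.
X = [("1", 1), ("sqrt2", 2)]      # x_i: assumed basis of L = Q(sqrt2) over K = Q
Y = [("1", 1), ("sqrt3", 3)]      # y_j: assumed basis of M = L(sqrt3) over L

# The product basis B = {x_i y_j} of M over K, indexed by pairs (i, j).
B = list(product(range(len(X)), range(len(Y))))

def multiply(u, v):
    """Multiply two elements of M given as dicts {(i, j): coefficient in Q}."""
    result = {b: Fraction(0) for b in B}
    for (i1, j1), c1 in u.items():
        for (i2, j2), c2 in v.items():
            coeff = Fraction(c1) * Fraction(c2)
            # sqrt2 * sqrt2 = 2 and sqrt3 * sqrt3 = 3 fall back onto the basis vector 1.
            if i1 == 1 and i2 == 1:
                coeff *= X[1][1]
            if j1 == 1 and j2 == 1:
                coeff *= Y[1][1]
            # XOR gives the index of the resulting basis vector in each factor.
            result[(i1 ^ i2, j1 ^ j2)] += coeff
    return result

s = {(1, 0): 1, (0, 1): 1}          # s = sqrt2 + sqrt3
square = multiply(s, s)             # coordinates of s^2 with respect to B
```

Here B has 2 · 2 = 4 elements, matching |M : K| = |M : L||L : K|, and the coordinates of s² are 5 on the basis vector 1 and 2 on √2·√3 = √6, in agreement with (√2 + √3)² = 5 + 2√6.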

Corollary 5.1.6.
(a) If |L : K| is a prime number, then there exists no proper intermediate field between
L and K.
(b) If K ⊂ L and |L : K| = 1, then L = K.

Let L|K be a field extension, and suppose that A ⊂ L. Then certainly there are sub-
rings of L containing both A and K, for example L. We denote by K[A] the intersection
of all subrings of L containing both K and A. Since the intersection of subrings is a
subring, it follows that K[A] is a subring containing both K and A and the smallest
such subring. We call K[A] the ring adjunction of A to K.

In an analogous manner, we let K(A) be the intersection of all subfields of L containing
both K and A. This is then a subfield of L, and the smallest subfield of L containing
both K and A. The subfield K(A) is called the field adjunction of A to K.
Clearly, K[A] ⊂ K(A). If $A = \{a_1, \dots, a_n\}$, then we write

\[ K[A] = K[a_1, \dots, a_n] \quad \text{and} \quad K(A) = K(a_1, \dots, a_n). \]

Definition 5.1.7. The field extension L|K is finitely generated if there exist
$a_1, \dots, a_n \in L$ such that $L = K(a_1, \dots, a_n)$. The extension L|K is a simple extension if there
is an a ∈ L with L = K(a). In this case, a is called a primitive element of L|K.

In Chapter 7, we will look at an alternative way to view the adjunction constructions
in terms of polynomials.

5.2 Finite and algebraic extensions


We now turn to the relationship between field extensions and the solution of polyno-
mial equations.

Definition 5.2.1. Let L|K be a field extension. An element a ∈ L is algebraic over K if
there exists a nonzero polynomial p(x) ∈ K[x] with p(a) = 0. L is an algebraic extension of K
if each element of L is algebraic over K. An element a ∈ L that is not algebraic over
K is called transcendental over K. L is a transcendental extension of K if L contains
elements that are transcendental over K.

For the remainder of this section, we assume that L|K is a field extension.

Lemma 5.2.2. Each element of K is algebraic over K.

Proof. Let k ∈ K. Then k is a zero of the polynomial p(x) = x − k ∈ K[x].

We now tie algebraic extensions to finite extensions.

Theorem 5.2.3. If L|K is a finite extension, then L|K is an algebraic extension.

Proof. Suppose that L|K is a finite extension and a ∈ L. We must show that a is algebraic
over K. Suppose that |L : K| = n < ∞, so that $\dim_K(L) = n$. It follows that any n + 1
distinct elements of L are linearly dependent over K.
Now consider the elements $1, a, a^2, \dots, a^n$ in L. If two of them coincide, say $a^i = a^j$
with i < j, then a is a zero of the nonzero polynomial $x^j - x^i \in K[x]$. Otherwise, these are
n + 1 distinct elements in L, so they are linearly dependent over K. Hence, there exist
$c_0, \dots, c_n \in K$, not all zero, such that

\[ c_0 + c_1 a + \dots + c_n a^n = 0. \]

Let $p(x) = c_0 + c_1 x + \dots + c_n x^n$. Then p(x) ∈ K[x] is nonzero, and p(a) = 0. Therefore, a is
algebraic over K. Since a was arbitrary, it follows that L is an algebraic extension of K.
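
The argument in this proof is constructive, and the following sketch (our own illustration, under stated assumptions) carries it out for a = √2 + √3 in L = ℚ(√2, √3). We assume the ℚ-basis 1, √2, √3, √6 of L, so n = 4, and look for a linear dependence among the five vectors 1, a, a², a³, a⁴ by row reduction. The helper names mul, powers, and dependence are hypothetical.

```python
from fractions import Fraction

def mul(u, v):
    """Product in Q(sqrt2, sqrt3); coordinates w.r.t. the assumed basis 1, sqrt2, sqrt3, sqrt6."""
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    return (
        a0*b0 + 2*a1*b1 + 3*a2*b2 + 6*a3*b3,   # coefficient of 1
        a0*b1 + a1*b0 + 3*(a2*b3 + a3*b2),     # coefficient of sqrt2 (sqrt3*sqrt6 = 3*sqrt2)
        a0*b2 + a2*b0 + 2*(a1*b3 + a3*b1),     # coefficient of sqrt3 (sqrt2*sqrt6 = 2*sqrt3)
        a0*b3 + a3*b0 + a1*b2 + a2*b1,         # coefficient of sqrt6
    )

def powers(a, n):
    """Coordinate vectors of 1, a, a^2, ..., a^n."""
    out = [(Fraction(1), Fraction(0), Fraction(0), Fraction(0))]
    for _ in range(n):
        out.append(mul(out[-1], a))
    return out

def dependence(vectors):
    """Return c, not all zero, with sum(c[i] * vectors[i]) == 0.

    Assumes the vectors are linearly dependent, e.g. n + 1 vectors in dimension n.
    """
    rows, cols = len(vectors[0]), len(vectors)
    # Column i of the matrix is vectors[i]; reduce to reduced row echelon form.
    m = [[Fraction(vectors[c][r]) for c in range(cols)] for r in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    free = next(c for c in range(cols) if c not in pivots)   # first non-pivot column
    coeffs = [Fraction(0)] * cols
    coeffs[free] = Fraction(1)
    for row, pc in enumerate(pivots):
        coeffs[pc] = -m[row][free]
    return coeffs

a = (0, 1, 1, 0)                      # a = sqrt2 + sqrt3
coeffs = dependence(powers(a, 4))     # c_0, ..., c_4 with c_0 + c_1 a + ... + c_4 a^4 = 0
```

For this choice of a, the dependence found corresponds to the polynomial p(x) = x⁴ − 10x² + 1, which indeed has √2 + √3 as a zero.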

From the previous theorem, it follows that every finite extension is algebraic. The
converse is not true; that is, there are algebraic extensions that are not finite. We will
give examples in Section 5.4.
The following lemma gives some examples of algebraic and transcendental exten-
sions.

Lemma 5.2.4. ℂ|ℝ is algebraic, but ℝ|ℚ and ℂ|ℚ are transcendental. If K is any field,
then K(x)|K is transcendental.

Proof. Since 1, i constitute a basis for ℂ over ℝ, we have |ℂ : ℝ| = 2. Hence, ℂ is a finite
extension of ℝ and therefore, by Theorem 5.2.3, an algebraic extension. More directly,
if α = a + ib ∈ ℂ, then α is a zero of $x^2 - 2ax + (a^2 + b^2) \in \mathbb{R}[x]$.
The existence of transcendental numbers (we will discuss these more fully in Section 5.5)
shows that both ℝ|ℚ and ℂ|ℚ are transcendental extensions.
Finally, the element x ∈ K(x) is not a zero of any nonzero polynomial in K[x]. Therefore,
x is a transcendental element, so the extension K(x)|K is transcendental.
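
The quadratic polynomial used in the first part of the proof can be checked directly. The short derivation below is our own worked example; the conjugate $\bar\alpha = a - ib$ and the sample value α = 1 + 2i are chosen here for illustration.

```latex
\[
(x-\alpha)(x-\bar\alpha)
  = x^2 - (\alpha+\bar\alpha)\,x + \alpha\bar\alpha
  = x^2 - 2ax + (a^2+b^2) \in \mathbb{R}[x].
\]
% Sample check with alpha = 1 + 2i, so a = 1, b = 2 and the polynomial is x^2 - 2x + 5:
\[
(1+2i)^2 - 2(1+2i) + 5 = (-3+4i) + (-2-4i) + 5 = 0 .
\]
```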

5.3 Minimal polynomials and simple extensions


If L|K is a field extension and a ∈ L is algebraic over K, then p(a) = 0 for some nonzero
polynomial p(x) ∈ K[x]. In this section, we consider the smallest such polynomial and tie
it to a simple extension of K.

Definition 5.3.1. Suppose that L|K is a field extension and a ∈ L is algebraic over K.
The polynomial $m_a(x) \in K[x]$ is the minimal polynomial of a over K if the following
hold:
(1) $m_a(x)$ has leading coefficient 1; that is, it is a monic polynomial.
(2) $m_a(a) = 0$.
(3) If f(x) ∈ K[x] with f(a) = 0, then $m_a(x) \mid f(x)$.

Hence, $m_a(x)$ is the monic polynomial of minimal degree that has a as a zero.

We prove next that every algebraic element has such a minimal polynomial.

Theorem 5.3.2. Suppose that L|K is a field extension and a ∈ L is algebraic over K. Then
we have:
(1) The minimal polynomial $m_a(x) \in K[x]$ exists and is irreducible over K.
(2) $K[a] \cong K(a) \cong K[x]/(m_a(x))$, where $(m_a(x))$ is the principal ideal in K[x] generated
by $m_a(x)$.
(3) $|K(a) : K| = \deg(m_a(x))$. Therefore, K(a)|K is a finite extension.

Proof. (1) Suppose that a ∈ L is algebraic over K. Let

I = {f (x) ∈ K[x] : f (a) = 0}.

Since a is algebraic, I ≠ {0}. It is straightforward to show (see exercises) that I is an
ideal in K[x]. Since K is a field, we have that K[x] is a principal ideal domain. Hence,
there exists a nonzero g(x) ∈ K[x] with I = (g(x)). Let b be the leading coefficient of g(x). Then
$m_a(x) = b^{-1} g(x)$ is a monic polynomial. We claim that $m_a(x)$ is the minimal polynomial
of a and that $m_a(x)$ is irreducible.
First, it is clear that $I = (g(x)) = (m_a(x))$. If f(x) ∈ K[x] with f(a) = 0, then
$f(x) = h(x) m_a(x)$ for some h(x). Therefore, $m_a(x)$ divides any polynomial that has a as a zero.
It follows that $m_a(x)$ is the minimal polynomial.
Suppose that $m_a(x) = g_1(x) g_2(x)$. Then since $m_a(a) = 0$, it follows that either
$g_1(a) = 0$ or $g_2(a) = 0$. Suppose $g_1(a) = 0$. Then from above, $m_a(x) \mid g_1(x)$, and since
$g_1(x) \mid m_a(x)$, we must then have that $g_2(x)$ is a unit. Therefore, $m_a(x)$ is irreducible.
(2) Consider the map τ : K[x] → K[a] given by

\[ \tau\Bigl(\sum_i k_i x^i\Bigr) = \sum_i k_i a^i. \]

Then τ is a ring epimorphism (see exercises), and

\[ \ker(\tau) = \{f(x) \in K[x] : f(a) = 0\} = (m_a(x)) \]

from the argument in the proof of part (1). It follows that

\[ K[x]/(m_a(x)) \cong K[a]. \]

Since $m_a(x)$ is irreducible, $K[x]/(m_a(x))$ is a field and, therefore, K[a] = K(a).
(3) Let $n = \deg(m_a(x))$. We claim that the elements $1, a, \dots, a^{n-1}$ are a basis for
K[a] = K(a) over K. First suppose that

\[ \sum_{i=0}^{n-1} c_i a^i = 0 \]

with $c_i \in K$ and not all $c_i = 0$. Then h(a) = 0, where $h(x) = \sum_{i=0}^{n-1} c_i x^i$ is a nonzero
polynomial of degree less than n. But this contradicts the fact that $m_a(x)$ has minimal degree
among all nonzero polynomials in K[x] that have a as a zero. Therefore, the set
$1, a, \dots, a^{n-1}$ is linearly independent over K.
Now let $b \in K[a] \cong K[x]/(m_a(x))$. Then there is a g(x) ∈ K[x] with b = g(a). By the
division algorithm

\[ g(x) = h(x) m_a(x) + r(x), \]

where r(x) = 0 or $\deg(r(x)) < \deg(m_a(x))$. Now

\[ r(a) = g(a) - h(a) m_a(a) = g(a) = b. \]

If r(x) = 0, then b = 0. If r(x) ≠ 0, then since deg(r(x)) < n, we have

\[ r(x) = c_0 + c_1 x + \dots + c_{n-1} x^{n-1} \]

with $c_i \in K$, where some but not all of the $c_i$ may be zero. This implies that

\[ b = r(a) = c_0 + c_1 a + \dots + c_{n-1} a^{n-1} \]

and, hence, b is a linear combination over K of $1, a, \dots, a^{n-1}$. Hence, $1, a, \dots, a^{n-1}$
span K[a] over K and, hence, form a basis.
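
Part (2) says that computing in K(a) amounts to computing with polynomials modulo $m_a(x)$. The sketch below is our own illustration, assuming K = ℚ and a = 2^(1/3) with minimal polynomial x³ − 2. It reduces by the division algorithm and inverts an element with the extended Euclidean algorithm, a standard technique that the text itself does not spell out. Polynomials are coefficient lists, lowest degree first, and all function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p, q = list(p) + [0] * (n - len(p)), list(q) + [0] * (n - len(q))
    return trim([x + y for x, y in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(g, m):
    """Division algorithm in Q[x]: (h, r) with g = h*m + r and deg r < deg m; m must be nonzero."""
    g, m = trim(g), trim(m)
    h = [Fraction(0)] * max(len(g) - len(m) + 1, 1)
    while len(g) >= len(m):
        shift = len(g) - len(m)
        factor = Fraction(g[-1]) / m[-1]
        h[shift] = factor
        # Subtract factor * x^shift * m, which cancels the leading term of g.
        g = trim([c - factor * m[i - shift] if i >= shift else c
                  for i, c in enumerate(g)])
    return trim(h), g

def inverse(b, m):
    """Inverse of the class of b in Q[x]/(m), assuming gcd(b, m) = 1."""
    # Invariant: r0 = s0*b (mod m) and r1 = s1*b (mod m).
    r0, r1 = trim(m), divmod_poly(b, m)[1]
    s0, s1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, [-c for c in mul(q, s1)])
    # Now r0 is a nonzero constant, and s0*b = r0 (mod m).
    return divmod_poly([Fraction(c) / r0[0] for c in s0], m)[1]

m_a = [-2, 0, 0, 1]            # m_a(x) = x^3 - 2
b = [1, 1]                     # b = 1 + a
b_inv = inverse(b, m_a)        # coordinates of 1/b in the basis 1, a, a^2
```

In this example the result is 1/(1 + a) = (1 − a + a²)/3, which can be confirmed by hand from (1 + a)(1 − a + a²) = 1 + a³ = 3.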

Theorem 5.3.3. Suppose that L|K is a field extension and a ∈ L is algebraic over K.
Suppose that f (x) ∈ K[x] is a monic polynomial with f (a) = 0. Then f (x) is the minimal
polynomial if and only if f (x) is irreducible in K[x].

Proof. Suppose that f(x) is the minimal polynomial of a. Then f(x) is irreducible from
the previous theorem.
Conversely, suppose that f(x) is monic, irreducible and f(a) = 0. From the previous
theorem $m_a(x) \mid f(x)$. Since f(x) is irreducible and $m_a(x)$ is not a unit, we have
$f(x) = c\,m_a(x)$ with c ∈ K. However, since both f(x) and $m_a(x)$ are monic, we must have
c = 1, and $f(x) = m_a(x)$.

We now show that a finite extension of K is actually finitely generated over K. In


addition, it is generated by finitely many algebraic elements.

Theorem 5.3.4. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a finite extension.
(2) L|K is an algebraic extension, and there exist elements $a_1, \dots, a_n \in L$ such that
$L = K(a_1, \dots, a_n)$.
(3) There exist algebraic elements $a_1, \dots, a_n \in L$ such that $L = K(a_1, \dots, a_n)$.

Proof. (1) ⇒ (2). We have seen in Theorem 5.2.3 that a finite extension is algebraic.
Suppose that $a_1, \dots, a_n$ are a basis for L over K. Then clearly $L = K(a_1, \dots, a_n)$.
(2) ⇒ (3). If L|K is an algebraic extension and $L = K(a_1, \dots, a_n)$, then each $a_i$ is
algebraic over K.
(3) ⇒ (1). Suppose that there exist algebraic elements $a_1, \dots, a_n \in L$ such that
$L = K(a_1, \dots, a_n)$. We show that L|K is a finite extension. We do this by induction on n.
If n = 1, then L = K(a) for some algebraic element a, and the result follows from
Theorem 5.3.2. Suppose now that n ≥ 2. We assume then that an extension $K(a_1, \dots, a_{n-1})$
with $a_1, \dots, a_{n-1}$ algebraic elements is a finite extension. Now suppose that we have
$L = K(a_1, \dots, a_n)$ with $a_1, \dots, a_n$ algebraic elements.
Then

\[ |K(a_1, \dots, a_n) : K| = |K(a_1, \dots, a_{n-1})(a_n) : K(a_1, \dots, a_{n-1})| \cdot |K(a_1, \dots, a_{n-1}) : K|. \]

The second term $|K(a_1, \dots, a_{n-1}) : K|$ is finite from the inductive hypothesis. The first
term $|K(a_1, \dots, a_{n-1})(a_n) : K(a_1, \dots, a_{n-1})|$ is also finite from Theorem 5.3.2 since it is

a simple extension of the field $K(a_1, \dots, a_{n-1})$ by the algebraic element $a_n$. Therefore,
$|K(a_1, \dots, a_n) : K|$ is finite.

Theorem 5.3.5. Suppose that K is a field and R is an integral domain with K ⊂ R. Then
R can be viewed as a vector space over K. If $\dim_K(R) < \infty$, then R is a field.

Proof. Let $r_0 \in R$ with $r_0 \neq 0$. Define the map τ from R to R given by

\[ \tau(r) = r r_0. \]

It is easy to show (see exercises) that this is a linear transformation from R to R,
considered as a vector space over K.
Suppose that τ(r) = 0. Then $r r_0 = 0$ and, hence, r = 0 since $r_0 \neq 0$ and R is an
integral domain. It follows that τ is an injective map. Since R is a finite dimensional
vector space over K, and τ is an injective linear transformation, it follows that τ must
also be surjective. This implies that there exists an $r_1$ with $\tau(r_1) = 1$. Then $r_1 r_0 = 1$ and,
hence, $r_0$ has an inverse within R. Since $r_0$ was an arbitrary nonzero element of R, it
follows that R is a field.
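
The proof suggests a way to compute inverses by linear algebra: write down the matrix of τ and solve τ(r₁) = 1. The following sketch is our own illustration, assuming R = ℚ[t]/(t³ − 2) with the ℚ-basis 1, t, t², so elements are coordinate triples. The function names mul, solve, and inverse are hypothetical.

```python
from fractions import Fraction

def mul(u, v):
    """Product in R = Q[t]/(t^3 - 2); coordinates w.r.t. the assumed basis 1, t, t^2."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    return (a0*b0 + 2*(a1*b2 + a2*b1),      # t * t^2 = t^3 = 2
            a0*b1 + a1*b0 + 2*a2*b2,        # t^2 * t^2 = t^4 = 2t
            a0*b2 + a1*b1 + a2*b0)

def solve(matrix, rhs):
    """Solve matrix * r = rhs over Q by Gauss-Jordan elimination (unique solution assumed)."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(matrix, rhs)]
    for c in range(n):
        # A pivot exists in every column because tau is injective.
        pr = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pr] = aug[pr], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[c])]
    return tuple(row[n] for row in aug)

def inverse(r0):
    """Return r1 with tau(r1) = r1 * r0 = 1, for a nonzero r0."""
    basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    images = [mul(e, r0) for e in basis]                      # tau(e_k) = e_k * r0
    # Column k of the matrix of tau holds the coordinates of tau(e_k).
    matrix = [[images[k][row] for k in range(3)] for row in range(3)]
    return solve(matrix, (1, 0, 0))                           # preimage of 1 under tau

r0 = (1, 0, 1)          # r0 = 1 + t^2
r1 = inverse(r0)        # coordinates of the inverse of r0
```

For r₀ = 1 + t² this gives r₁ = (1 + 2t − t²)/5, and one checks by hand that (1 + t²)(1 + 2t − t²) = 5 when t³ = 2.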

Theorem 5.3.6. Suppose that K ⊂ L ⊂ M is a chain of field extensions. Then M|K is
algebraic if and only if M|L is algebraic and L|K is algebraic.

Proof. If M|K is algebraic, then certainly M|L and L|K are algebraic.
Now suppose that M|L and L|K are algebraic. We show that M|K is algebraic. Let
a ∈ M. Then since a is algebraic over L, there exist $b_0, b_1, \dots, b_n \in L$, not all zero, with

\[ b_0 + b_1 a + \dots + b_n a^n = 0. \]

Each $b_i$ is algebraic over K and, hence, $K(b_0, \dots, b_n)$ is finite dimensional over K by
Theorem 5.3.4. Since a is algebraic over $K(b_0, \dots, b_n)$, the field
$K(b_0, \dots, b_n)(a) = K(b_0, \dots, b_n, a)$ is finite dimensional over $K(b_0, \dots, b_n)$ and, by
Theorem 5.1.5, also over K. Therefore, $K(b_0, \dots, b_n, a)$ is a finite extension of K and, hence,
an algebraic extension of K. Since $a \in K(b_0, \dots, b_n, a)$, it follows that a is algebraic over K
and, therefore, M is algebraic over K.

5.4 Algebraic closures


As before, suppose that L|K is a field extension. Since each element of K is algebraic
over K, there are certainly algebraic elements over K within L. Let $\mathcal{A}_K$ denote the set
of all elements of L that are algebraic over K. We prove that $\mathcal{A}_K$ is actually a subfield
of L. It is called the algebraic closure of K within L.

Theorem 5.4.1. Suppose that L|K is a field extension, and let $\mathcal{A}_K$ denote the set of all
elements of L that are algebraic over K. Then $\mathcal{A}_K$ is a subfield of L. $\mathcal{A}_K$ is called the
algebraic closure of K in L.

Proof. Since $K \subset \mathcal{A}_K$, we have that $\mathcal{A}_K \neq \emptyset$. Let $a, b \in \mathcal{A}_K$. Since a, b are both
algebraic over K, we have from Theorem 5.3.4 that K(a, b) is a finite extension of K. Therefore,
K(a, b) is an algebraic extension of K and, hence, each element of K(a, b) is algebraic
over K. Now a, b ∈ K(a, b), and K(a, b) is a field. Therefore, a ± b, ab, and, if b ≠ 0, a/b
are all in K(a, b) and, hence, all algebraic over K. Therefore, a ± b, ab, and a/b, if b ≠ 0, are
all in $\mathcal{A}_K$. It follows that $\mathcal{A}_K$ is a subfield of L.

In Section 5.2, we showed that every finite extension is an algebraic extension.


We mentioned that the converse is not necessarily true; that is, there are algebraic
extensions that are not finite. Here we give an example.

Theorem 5.4.2. Let 𝒜 be the algebraic closure of the rational numbers ℚ within the
complex numbers ℂ. Then 𝒜 is an algebraic extension of ℚ, but |𝒜 : ℚ| = ∞.

Proof. From the previous theorem, 𝒜 is an algebraic extension of ℚ. We show that it
cannot be a finite extension. By Eisenstein's criterion, the rational polynomial
$f(x) = x^p + p$ is irreducible over ℚ for any prime p. Let a be a zero in ℂ of f(x). Then a ∈ 𝒜,
and |ℚ(a) : ℚ| = p. Therefore, |𝒜 : ℚ| ≥ p for all primes p. Since there are infinitely
many primes, this implies that |𝒜 : ℚ| = ∞.
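
The use of Eisenstein's criterion in this proof can be checked mechanically. The sketch below is our own illustration; the function names are hypothetical, and the argument p is assumed to be prime without being verified. It tests the divisibility conditions of the criterion for f(x) = xᵖ + p.

```python
def is_eisenstein(coeffs, p):
    """Check the conditions of Eisenstein's criterion at p (p is assumed to be prime).

    coeffs lists integer coefficients a_0, ..., a_n, lowest degree first, with n >= 1.
    Returns True when p does not divide a_n, p divides a_0, ..., a_{n-1},
    and p^2 does not divide a_0; the polynomial is then irreducible over Q.
    A return value of False only means the criterion does not apply at p.
    """
    *lower, leading = coeffs
    return (leading % p != 0
            and all(a % p == 0 for a in lower)
            and lower[0] % (p * p) != 0)

def x_to_the_p_plus_p(p):
    """Coefficient list of f(x) = x^p + p."""
    return [p] + [0] * (p - 1) + [1]

# The criterion applies to x^p + p at the prime p itself.
checks = [is_eisenstein(x_to_the_p_plus_p(p), p) for p in (2, 3, 5, 7, 11)]
```

Each entry of checks is True, so every one of these polynomials is irreducible and contributes an element of degree p over ℚ.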

5.5 Algebraic and transcendental numbers


In this section, we consider the string of field extensions ℚ ⊂ ℝ ⊂ ℂ.

Definition 5.5.1. An algebraic number α is an element of ℂ that is algebraic over ℚ.
Hence, an algebraic number is an α ∈ ℂ such that f(α) = 0 for some nonzero f(x) ∈ ℚ[x]. If
α ∈ ℂ is not algebraic, it is transcendental.

We will let 𝒜 denote the totality of algebraic numbers within the complex numbers ℂ,
and 𝒯 the set of transcendentals so that ℂ = 𝒜 ∪ 𝒯. In the language of the last
section, 𝒜 is the algebraic closure of ℚ within ℂ. As in the general case, if α ∈ ℂ is
algebraic, we will let $m_\alpha(x)$ denote the minimal polynomial of α over ℚ.
We now examine the sets 𝒜 and 𝒯 more closely. Since 𝒜 is precisely the algebraic
closure of ℚ in ℂ, we have from our general result that 𝒜 actually forms a subfield
of ℂ. Furthermore, since the intersection of subfields is again a subfield, it follows
that 𝒜′ = 𝒜 ∩ ℝ, the set of real algebraic numbers, forms a subfield of the reals.

Theorem 5.5.2. The set 𝒜 of algebraic numbers forms a subfield of ℂ. The subset 𝒜′ =
𝒜 ∩ ℝ of real algebraic numbers forms a subfield of ℝ.

Since each rational is algebraic, it is clear that there are algebraic numbers. Furthermore,
there are irrational algebraic numbers, √2 for example, since it is a zero of the
irreducible polynomial $x^2 - 2$ over ℚ. On the other hand, we have not examined

the question of whether transcendental numbers really exist. To show that any particular
complex number is transcendental is, in general, quite difficult. However, it is
relatively easy to show that there are uncountably many transcendentals.

Theorem 5.5.3. The set 𝒜 of algebraic numbers is countably infinite. Therefore, 𝒯, the
set of transcendental numbers, and 𝒯′ = 𝒯 ∩ ℝ, the set of real transcendental numbers, are
uncountably infinite.

Proof. Let

\[ \mathcal{P}_n = \{f(x) \in \mathbb{Q}[x] : \deg(f(x)) \le n\}. \]

Since each $f(x) \in \mathcal{P}_n$ has the form $f(x) = q_0 + q_1 x + \dots + q_n x^n$ with $q_i \in \mathbb{Q}$, we can
identify a polynomial of degree ≤ n with an (n + 1)-tuple $(q_0, q_1, \dots, q_n)$ of rational
numbers. Therefore, the set $\mathcal{P}_n$ has the same size as the (n + 1)-fold Cartesian product of ℚ:

\[ \mathbb{Q}^{n+1} = \mathbb{Q} \times \mathbb{Q} \times \dots \times \mathbb{Q}. \]

Since a finite Cartesian product of countable sets is still countable, it follows that $\mathcal{P}_n$
is a countable set.
Now let

\[ \mathcal{B}_n = \bigcup_{0 \neq p(x) \in \mathcal{P}_n} \{\text{zeros of } p(x)\}; \]

that is, $\mathcal{B}_n$ is the union of all zeros in ℂ of all nonzero rational polynomials of degree ≤ n.
Since each such p(x) has a maximum of n zeros, and since $\mathcal{P}_n$ is countable, it follows that
$\mathcal{B}_n$ is a countable union of finite sets and, hence, is still countable. Now

\[ \mathcal{A} = \bigcup_{n=1}^{\infty} \mathcal{B}_n, \]

so 𝒜 is a countable union of countable sets and is, therefore, countable.


Since both ℝ and ℂ are uncountably infinite, the second assertions follow directly
from the countability of 𝒜. If say 𝒯 were countable, then ℂ = 𝒜 ∪ 𝒯 would also be
countable, which is a contradiction.
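
Countability can also be seen by writing down an explicit enumeration. The sketch below is our own illustration and uses a variant of the argument above, not the proof given in the text: instead of counting $\mathbb{Q}^{n+1}$, it lists nonzero integer polynomials by their height, which we define here as n + |a₀| + ⋯ + |aₙ| for a polynomial of degree n. Every algebraic number is a zero of such a polynomial, since denominators can be cleared. The function names are hypothetical.

```python
from itertools import product

def polynomials_of_height(h):
    """Yield all integer coefficient tuples (a_0, ..., a_n) with a_n != 0
    and height n + |a_0| + ... + |a_n| equal to h."""
    for n in range(h):                      # a_n != 0 forces n <= h - 1
        budget = h - n                      # required value of |a_0| + ... + |a_n|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[-1] != 0 and sum(abs(a) for a in coeffs) == budget:
                yield coeffs

def enumerate_polynomials(max_height):
    """List every nonzero integer polynomial of height <= max_height exactly once."""
    return [p for h in range(1, max_height + 1) for p in polynomials_of_height(h)]

counts = [len(list(polynomials_of_height(h))) for h in (1, 2, 3)]
```

Each height contributes only finitely many polynomials, each with finitely many zeros, so running through the heights 1, 2, 3, . . . lists all algebraic numbers in a sequence. For example, x² − 2 appears as the tuple (−2, 0, 1) at height 5.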

From Theorem 5.5.3, we know that there exist infinitely many transcendental num-
bers. Liouville, in 1851, gave the first proof of the existence of transcendentals by ex-
hibiting a few. He gave the following as one example:

Theorem 5.5.4. The real number

\[ c = \sum_{j=1}^{\infty} \frac{1}{10^{j!}} \]

is transcendental.

Proof. First of all, since $\frac{1}{10^{j!}} \le \frac{1}{10^{j}}$, with strict inequality for j ≥ 3, and
$\sum_{j=1}^{\infty} \frac{1}{10^{j}}$ is a convergent geometric series, it follows from the comparison test
that the infinite series defining c converges and defines a real number. Furthermore, since
$\sum_{j=1}^{\infty} \frac{1}{10^{j}} = \frac{1}{9}$, it follows that $c < \frac{1}{9} < 1$.
Suppose that c is algebraic so that g(c) = 0 for some rational nonzero polynomial
g(x). Multiplying through by the least common multiple of all the denominators in
g(x), we may suppose that f(c) = 0 for some nonzero integral polynomial $f(x) = \sum_{j=0}^{n} m_j x^j$.
Then c satisfies

\[ \sum_{j=0}^{n} m_j c^j = 0 \]

for some integers $m_0, \dots, m_n$.


If 0 < x < 1, then by the triangle inequality

\[ |f'(x)| = \Bigl| \sum_{j=1}^{n} j m_j x^{j-1} \Bigr| \le \sum_{j=1}^{n} |j m_j| = B, \]

where B is a real constant depending only on the coefficients of f(x).
Now let

\[ c_k = \sum_{j=1}^{k} \frac{1}{10^{j!}} \]

be the k-th partial sum for c. Then

\[ |c - c_k| = \sum_{j=k+1}^{\infty} \frac{1}{10^{j!}} < 2 \cdot \frac{1}{10^{(k+1)!}}. \]

Apply the mean value theorem to f(x) at c and $c_k$ to obtain

\[ |f(c) - f(c_k)| = |c - c_k|\,|f'(\zeta)| \]

for some ζ with $c_k < \zeta < c < 1$. Now since 0 < ζ < 1, we have

\[ |c - c_k|\,|f'(\zeta)| < \frac{2B}{10^{(k+1)!}}. \]
On the other hand, since f(x) can have at most n zeros, it follows that for all k large
enough, we would have $f(c_k) \neq 0$. Since f(c) = 0, we have

\[ |f(c) - f(c_k)| = |f(c_k)| = \Bigl| \sum_{j=0}^{n} m_j c_k^j \Bigr| \ge \frac{1}{10^{n k!}} \]

since for each j, $m_j c_k^j$ is a rational number with denominator $10^{j k!}$, so that $f(c_k)$ is a
nonzero rational number whose denominator divides $10^{n k!}$. However, if k is
chosen sufficiently large and n is fixed, we have

\[ \frac{1}{10^{n k!}} > \frac{2B}{10^{(k+1)!}}, \]

contradicting the equality from the mean value theorem. Therefore, c is transcenden-
tal.
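
The two estimates on the partial sums used in the proof are easy to test with exact rational arithmetic. The sketch below is our own illustration; the function names are hypothetical, and the later partial sum c₅ stands in for c itself, so the comparison only illustrates the tail bound and does not prove it.

```python
from fractions import Fraction
from math import factorial

def partial_sum(k):
    """The k-th partial sum c_k = sum_{j=1}^{k} 10^(-j!) as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(j)) for j in range(1, k + 1))

def tail_bound(k):
    """The bound 2 / 10^((k+1)!) used in the proof for |c - c_k|."""
    return Fraction(2, 10 ** factorial(k + 1))

c3 = partial_sum(3)                                   # 1/10 + 1/100 + 1/10^6
# c_5 - c_k is a lower approximation of the tail c - c_k.
gaps = [partial_sum(5) - partial_sum(k) for k in (1, 2, 3)]
```

Here c₃ = 110001/1000000, and each cₖ has denominator exactly 10^(k!), which is the fact behind the lower bound for |f(cₖ)|.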

In 1873, Hermite proved that e is transcendental, whereas, in 1882, Lindemann
showed that π is transcendental. Schneider, in 1934, showed that $a^b$ is transcendental
if a and b are algebraic, a ≠ 0, 1, and b is irrational. In Chapter 20, we will prove that
both e and π are transcendental. An interesting open question is the following:
Is π transcendental over ℚ(e)?
To close this section, we show that in general if a ∈ L is transcendental over K,
then K(a)|K is isomorphic to the field of rational functions over K.

Theorem 5.5.5. Suppose that L|K is a field extension and a ∈ L is transcendental over K.
Then K(a)|K is isomorphic to K(x)|K. Here the isomorphism μ : K(x) → K(a) can be
chosen such that μ(x) = a.

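Proof. Define the map μ : K(x) → K(a) by

\[ \mu\Bigl(\frac{f(x)}{g(x)}\Bigr) = \frac{f(a)}{g(a)} \]

for f(x), g(x) ∈ K[x] with g(x) ≠ 0. Since a is transcendental over K, we have g(a) ≠ 0
whenever g(x) ≠ 0, so μ is well defined. Then μ is a homomorphism, and μ(x) = a. Since
μ ≠ 0 and K(x) is a field, μ is injective. Its image is a subfield of K(a) containing K and a
and is, therefore, all of K(a). It follows that μ is an isomorphism.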

5.6 Exercises
1. Let a ∈ ℂ with $a^3 - 2a + 2 = 0$ and $b = a^2 - a$. Compute the minimal polynomial
$m_b(x)$ of b over ℚ and compute the inverse of b in ℚ(a).
2. Determine the algebraic closure of ℝ in ℂ(x).
3. Let $a_n := \sqrt[2^n]{2} \in \mathbb{R}$, n = 1, 2, 3, . . . and $A := \{a_n : n \in \mathbb{N}\}$ and E := ℚ(A). Show the
following:
(i) $|\mathbb{Q}(a_n) : \mathbb{Q}| = 2^n$.
(ii) |E : ℚ| = ∞.
(iii) $E = \bigcup_{n=1}^{\infty} \mathbb{Q}(a_n)$.
(iv) E is algebraic over ℚ.
(iv) E is algebraic over ℚ.
4. Determine |E : ℚ| for
(i) E = ℚ(√2, √−2).
(ii) E = ℚ(√3, √3 + √3 3).
(iii) $E = \mathbb{Q}\bigl(\frac{1+i}{\sqrt{2}}, \frac{-1+i}{\sqrt{2}}\bigr)$.
5. Show that ℚ(√2, √3) = {a + b√2 + c√3 + d√6 : a, b, c, d ∈ ℚ}. Determine the degree
of ℚ(√2, √3) over ℚ. Further show that ℚ(√2, √3) = ℚ(√2 + √3).
6. Let E|K be a field extension and a ∈ E be transcendental over K. Show the following:
(i) Each element of K(a) that is not in K is transcendental over K.
(ii) $a^n$ is transcendental over K for each n > 1.

(iii) If $L := K\bigl(\frac{a^3}{a+1}\bigr)$, then a is algebraic over L. Determine the minimal polynomial
$m_a(x)$ of a over L.
7. Let K be a field and a ∈ K(x) \ K. Show the following:
(i) x is algebraic over K(a).
(ii) If L is a field with K ⊂ L ⊆ K(x) and if a ∈ L, then |K(x) : L| < ∞.
(iii) a is transcendental over K.
8. Suppose that a ∈ L is algebraic over K. Let

I = {f (x) ∈ K[x] : f (a) = 0}.

Since a is algebraic, I ≠ {0}. Prove that I is an ideal in K[x].


9. Prove that there are uncountably many transcendental numbers by showing that the
set 𝒜 of algebraic numbers is countable:
(i) Show that $\mathbb{Q}_n[x]$, the set of rational polynomials of degree ≤ n, is countable
(finite Cartesian product of countable sets).
(ii) Let $\mathcal{B}_n$ = {zeros of nonzero polynomials in $\mathbb{Q}_n[x]$}. Show that $\mathcal{B}_n$ is countable.
(iii) Show that $\mathcal{A} = \bigcup_{n=1}^{\infty} \mathcal{B}_n$ and conclude that 𝒜 is countable.
(iv) Show that the transcendental numbers are uncountable.
10. Consider the map τ : K[x] → K[a] given by

\[ \tau\Bigl(\sum_i k_i x^i\Bigr) = \sum_i k_i a^i. \]

Show that τ is a ring epimorphism.


11. Suppose that K is a field and R is an integral domain with K ⊂ R. Then R can be
viewed as a vector space over K. Let r0 ∈ R with r0 ≠ 0. Define the map from R to
R given by

τ(r) = rr0 .

Show that this is a linear transformation from R to R, considered as a vector space


over K.

6 Field extensions and compass and straightedge
constructions
6.1 Geometric constructions

Greek mathematicians in the classical period posed the problem of constructing cer-
tain geometric figures in the Euclidean plane using only a straightedge and a compass.
These are known as geometric construction problems.
Recall from elementary geometry that using a straightedge and compass, it is pos-
sible to draw a line parallel to a given line segment through a given point, to extend a
given line segment, and to erect a perpendicular to a given line at a given point on that
line. There were other geometric construction problems for which the Greeks could not
find straightedge and compass solutions but, on the other hand, were never able to
prove that such constructions were impossible. In particular, there were four famous
insolvable (to the Greeks) construction problems. The first is the squaring of the circle.
This problem is, given a circle, to construct using straightedge and compass a square
having an area equal to that of the given circle. The second is the doubling of the cube.
This problem is, given a cube of given side length, to construct using a straightedge
and compass, a side of a cube having double the volume of the original cube. The third
problem is the trisection of an angle. This problem is to trisect a given angle using only
a straightedge and compass. The final problem is the construction of a regular n-gon.
This problem asks which regular n-gons can be constructed using only straightedge
and compass.
By translating each of these problems into the language of field extensions, we
can show that each of the first three problems is insolvable in general, and we can
give the complete solution to the construction of the regular n-gons.

6.2 Constructible numbers and field extensions

We now translate the geometric construction problems into the language of field ex-
tensions. As a first step, we define a constructible number.

Definition 6.2.1. Suppose we are given a line segment of unit length. An α ∈ ℝ is


constructible if we can construct a line segment of length |α|, in a finite number of
steps, from the unit segment using a straightedge and compass.

Our first result is that the set of all constructible numbers forms a subfield of ℝ.

Theorem 6.2.2. The set 𝒞 of all constructible numbers forms a subfield of ℝ. Further-
more, ℚ ⊂ 𝒞 .

https://doi.org/10.1515/9783110603996-006

Proof. Let 𝒞 be the set of all constructible numbers. Since the given unit length segment
is constructible, we have 1 ∈ 𝒞. Therefore, 𝒞 ≠ ∅. Thus, to show that it is a field,
we must show that it is closed under the field operations.
Suppose α, β are constructible. We must show then that α ± β, αβ, and α/β for β ≠ 0
are constructible. If α, β > 0, construct a line segment of length |α|. At one end of this
line segment, extend it by a segment of length |β|. This will construct a segment of
length α + β. Similarly, if α > β, lay off a segment of length |β| at the beginning of a
segment of length |α|. The remaining piece will be α − β. By considering cases, we can
do this in the same manner if either α or β, or both, are negative. These constructions
are pictured in Figure 6.1. Therefore, α ± β are constructible.

Figure 6.1: Addition of constructible numbers.

In Figure 6.2, we show how to construct αβ. Let the line segment OA have length |α|.
Consider a line L through O not coincident with OA. Let OB have length |β| as in the
diagram. Let P be on ray OB so that OP has length 1. Draw AP and then find Q on ray
OA such that BQ is parallel to AP. From similar triangles, we then have

\[ \frac{|OP|}{|OB|} = \frac{|OA|}{|OQ|} \;\Longrightarrow\; \frac{1}{|\beta|} = \frac{|\alpha|}{|OQ|}. \]

Then |OQ| = |α||β|, and so αβ is constructible.

Figure 6.2: Multiplication of constructible numbers.
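
The similar-triangles argument for the product can be replayed in coordinates. The sketch below is our own illustration with several assumptions that are not in the text: O is placed at the origin, the ray OA is the positive x-axis, the ray OB points in the direction (3/5, 4/5), and α, β are positive rationals so that exact arithmetic can be used. The function name is hypothetical.

```python
from fractions import Fraction

def product_by_similar_triangles(alpha, beta):
    """Return the point Q of the construction for positive rationals alpha, beta."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    u = (Fraction(3, 5), Fraction(4, 5))         # unit direction of the ray OB (arbitrary choice)
    A = (alpha, Fraction(0))                     # |OA| = alpha on the x-axis
    P = u                                        # |OP| = 1 on the ray OB
    B = (beta * u[0], beta * u[1])               # |OB| = beta on the ray OB
    d = (P[0] - A[0], P[1] - A[1])               # direction of the segment AP
    # The parallel to AP through B is B + t*d; it meets the ray OA where y = 0.
    t = -B[1] / d[1]
    return (B[0] + t * d[0], B[1] + t * d[1])

Q = product_by_similar_triangles(Fraction(3, 2), 4)
```

With α = 3/2 and β = 4, the point Q lies on the x-axis at distance 6 = αβ from O, as the proportion |OP|/|OB| = |OA|/|OQ| predicts.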

A similar construction, pictured in Figure 6.3, shows that α/β for β ≠ 0 is constructible.
Find OA, OB, OP as above. Now, connect A to B, and let PQ be parallel to AB. From
similar triangles again, we have

\[ \frac{1}{|\beta|} = \frac{|OQ|}{|\alpha|} \;\Longrightarrow\; \frac{|\alpha|}{|\beta|} = |OQ|. \]

Hence, α/β is constructible.

Figure 6.3: Inversion of constructible numbers.

Therefore, 𝒞 is a subfield of ℝ. Since char 𝒞 = 0, it follows that ℚ ⊂ 𝒞 .

Let us now consider how a constructible number is found in the plane. Starting
at the origin and using the unit length and the constructions above, we can locate
any point in the plane with rational coordinates. That is, we can construct the point
P = (q1 , q2 ) with q1 , q2 ∈ ℚ. Using only straightedge and compass, any further point in
the plane can be determined in one of the following three ways:
1. The intersection point of two lines, each of which passes through two known
points each having rational coordinates.
2. The intersection point of a line passing through two known points having rational
coordinates and a circle, whose center has rational coordinates, and whose radius
squared is rational.
3. The intersection point of two circles, each of whose centers has rational coordi-
nates, and each of whose radii is the square root of a rational number.

Analytically, the first case involves the solution of a pair of linear equations, each with
rational coefficients and, thus, only leads to other rational numbers. In cases two and
three, we must solve equations of the form $x^2 + y^2 + ax + by + c = 0$, with a, b, c ∈ ℚ. These
will then be quadratic equations over ℚ and, thus, the solutions will either be in ℚ, or
in a quadratic extension ℚ(√α) of ℚ. Once a real quadratic extension of ℚ is found, the
process can be iterated. Conversely, using the altitude theorem, if α is constructible,
so is √α. A much more detailed description of the constructible numbers can be found
in [42]. We thus can prove the following theorem:

Theorem 6.2.3. If γ is constructible with γ ∉ ℚ, then there exists a finite number of
elements $\alpha_1, \dots, \alpha_r \in \mathbb{R}$ with $\alpha_r = \gamma$ such that for i = 1, . . . , r, $\mathbb{Q}(\alpha_1, \dots, \alpha_i)$ is a
quadratic extension of $\mathbb{Q}(\alpha_1, \dots, \alpha_{i-1})$. In particular, $|\mathbb{Q}(\gamma) : \mathbb{Q}| = 2^n$ for some n ≥ 1.

Therefore, the constructible numbers are precisely those real numbers that
are contained in repeated quadratic extensions of ℚ. In the next section, we use
this idea to show the impossibility of the first three mentioned construction prob-
lems.


6.3 Four classical construction problems


We now consider the aforementioned construction problems. Our main technique will
be to use Theorem 6.2.3. From this result, we have that if γ is constructible with γ ∉ ℚ,
then $|\mathbb{Q}(\gamma) : \mathbb{Q}| = 2^n$ for some $n \ge 1$.

6.3.1 Squaring the circle

Theorem 6.3.1. It is impossible to square the circle. That is, it is impossible in general,
given a circle, to construct using straightedge and compass a square having area equal
to that of the given circle.

Proof. Suppose the given circle has radius 1. It is then constructible and has
an area of π. A corresponding square would then have to have a side of length $\sqrt{\pi}$. To
be constructible, a number α must have $|\mathbb{Q}(\alpha) : \mathbb{Q}| = 2^m < \infty$ for some m and, hence, α must be
algebraic. However, π is transcendental, so $\sqrt{\pi}$ is also transcendental (see Section 20.4);
therefore, $\sqrt{\pi}$ is not constructible.

6.3.2 The doubling of the cube

Theorem 6.3.2. It is impossible to double the cube. This means that it is impossible in
general, given a cube of given side length, to construct using a straightedge and compass,
a side of a cube having double the volume of the original cube.

Proof. Let the given side length be 1, so that the original volume is also 1. To double
this, we would have to construct a side of length $2^{1/3}$. However, $|\mathbb{Q}(2^{1/3}) : \mathbb{Q}| = 3$ since
the minimal polynomial over ℚ is $m_{2^{1/3}}(x) = x^3 - 2$. This is not a power of 2, so $2^{1/3}$ is
not constructible.

6.3.3 The trisection of an angle

Theorem 6.3.3. It is impossible to trisect an angle. This means that it is impossible, in
general, to trisect a given angle using only a straightedge and compass.

Proof. An angle θ is constructible if and only if a segment of length |cos θ| is
constructible. Since cos(π/3) = 1/2, the angle π/3 is constructible. We show that it cannot
be trisected by straightedge and compass.
The following trigonometric identity holds:

\[
\cos(3\theta) = 4\cos^3(\theta) - 3\cos(\theta).
\]

Let α = cos(π/9). From the above identity, we have $4\alpha^3 - 3\alpha - \frac{1}{2} = 0$. The polynomial
$4x^3 - 3x - \frac{1}{2}$ is irreducible over ℚ and, hence, the minimal polynomial over ℚ is
$m_\alpha(x) = x^3 - \frac{3}{4}x - \frac{1}{8}$. It follows that $|\mathbb{Q}(\alpha) : \mathbb{Q}| = 3$; hence, α is not constructible.
Therefore, the corresponding angle π/9 is not constructible. Thus, π/3 is constructible, but it
cannot be trisected.
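
The irreducibility claims used in the proofs of Theorems 6.3.2 and 6.3.3 can be checked by the rational root test, since a cubic over ℚ without a rational zero has no linear factor and is therefore irreducible. The following Python sketch is only an illustration under that assumption; the helper names `divisors` and `rational_roots` are our own and do not come from the text. It works with the integer polynomial $8x^3 - 6x - 1 = 2(4x^3 - 3x - \frac{1}{2})$.

```python
from fractions import Fraction
from math import cos, pi


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial with nonzero constant term.

    coeffs lists the coefficients from the leading one down to the constant.
    By the rational root test, every rational root has the form +-p/q with
    p dividing the constant term and q dividing the leading coefficient.
    """
    lead, const = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(const)
                  for q in divisors(lead)
                  for sign in (1, -1)}
    roots = []
    for r in sorted(candidates):
        value = Fraction(0)
        for c in coeffs:          # Horner evaluation with exact fractions
            value = value * r + c
        if value == 0:
            roots.append(r)
    return roots


doubling = [1, 0, 0, -2]      # x^3 - 2
trisection = [8, 0, -6, -1]   # 8x^3 - 6x - 1

print(rational_roots(doubling))     # []
print(rational_roots(trisection))   # []

# Numerical sanity check that cos(pi/9) is a zero of 8x^3 - 6x - 1.
alpha = cos(pi / 9)
print(abs(8 * alpha**3 - 6 * alpha - 1) < 1e-12)   # True
```

Both lists are empty, so neither cubic has a rational zero, which is consistent with the degree 3 extensions used in the two proofs.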

6.3.4 Construction of a regular n-gon

The final construction problem we consider is the construction of regular n-gons. The
algebraic study of the constructibility of regular n-gons was initiated by Gauss in the
early part of the nineteenth century.
Notice first that a regular n-gon will be constructible for n ≥ 3 if and only if the
angle $\frac{2\pi}{n}$ is constructible, which is the case if and only if the length $\cos\frac{2\pi}{n}$ is a
constructible number. From our techniques, if $\cos\frac{2\pi}{n}$ is a constructible number, then
necessarily $|\mathbb{Q}(\cos(\frac{2\pi}{n})) : \mathbb{Q}| = 2^m$ for some m. After we discuss Galois theory, we see that
this condition is also sufficient. Therefore, $\cos\frac{2\pi}{n}$ is a constructible number if and only
if $|\mathbb{Q}(\cos(\frac{2\pi}{n})) : \mathbb{Q}| = 2^m$ for some m.
The solution of this problem, that is, the determination of when
$|\mathbb{Q}(\cos(\frac{2\pi}{n})) : \mathbb{Q}| = 2^m$, involves two concepts from number theory: the Euler
phi-function and Fermat primes.

Definition 6.3.4. For any natural number n, the Euler phi-function is defined by

ϕ(n) = number of positive integers less than or equal to n and relatively prime to n.

Example 6.3.5. ϕ(6) = 2 since among 1, 2, 3, 4, 5, 6 only 1, 5 are relatively prime to 6.
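
The definition can be turned directly into a short computation. The following Python sketch simply counts the integers a with 1 ≤ a ≤ n and (a, n) = 1; the function name `phi` is our own choice, and the method is a naive count, not an efficient algorithm.

```python
from math import gcd


def phi(n):
    """Euler phi-function: count the a with 1 <= a <= n and gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print([a for a in range(1, 7) if gcd(a, 6) == 1])   # [1, 5]
print(phi(6))                                        # 2
print([phi(n) for n in range(1, 11)])                # [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
```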

It is fairly straightforward to develop a formula for ϕ(n). A formula is first
determined for primes and for prime powers, and then pasted back together via the
fundamental theorem of arithmetic.

Lemma 6.3.6. For any prime p and m > 0,

\[
\phi(p^m) = p^m - p^{m-1} = p^m \left(1 - \frac{1}{p}\right).
\]

Proof. If 1 ≤ a ≤ p, then either a = p, or (a, p) = 1. It follows that the positive integers
less than or equal to $p^m$ that are not relatively prime to $p^m$ are precisely the multiples
of p; that is, $p, 2p, 3p, \ldots, p^{m-1} \cdot p$. All other positive $a < p^m$ are relatively prime to $p^m$.
Hence, the number relatively prime to $p^m$ is

\[
p^m - p^{m-1}.
\]

Lemma 6.3.7. If (a, b) = 1, then ϕ(ab) = ϕ(a)ϕ(b).

Proof. Given a natural number n, a reduced residue system modulo n is a set of integers
$x_1, \ldots, x_k$ such that each $x_i$ is relatively prime to n, $x_i \not\equiv x_j \bmod n$ unless i = j, and if
(x, n) = 1 for some integer x, then $x \equiv x_i \bmod n$ for some i. Clearly, ϕ(n) is the size of a
reduced residue system modulo n.
Let $R_a = \{x_1, \ldots, x_{\phi(a)}\}$ be a reduced residue system modulo a, $R_b = \{y_1, \ldots, y_{\phi(b)}\}$
be a reduced residue system modulo b, and let

\[
S = \{a y_i + b x_j : i = 1, \ldots, \phi(b),\ j = 1, \ldots, \phi(a)\}.
\]

We claim that S is a reduced residue system modulo ab. Since S has ϕ(a)ϕ(b) elements,
it will follow that ϕ(ab) = ϕ(a)ϕ(b).
To show that S is a reduced residue system modulo ab, we must show three things:
first that each x ∈ S is relatively prime to ab; second that the elements of S are distinct
modulo ab; and, finally, that given any integer n with (n, ab) = 1, then n ≡ s mod ab for some s ∈ S.
Let $x = a y_i + b x_j$. Then since $(x_j, a) = 1$ and (a, b) = 1, it follows that (x, a) = 1.
Analogously, (x, b) = 1. Since x is relatively prime to both a and b, we have (x, ab) = 1.
This shows that each element of S is relatively prime to ab.
Next suppose that

\[
a y_i + b x_j \equiv a y_k + b x_l \bmod ab.
\]

Then

\[
ab \mid (a y_i + b x_j) - (a y_k + b x_l) \implies a y_i \equiv a y_k \bmod b.
\]

Since (a, b) = 1, it follows that $y_i \equiv y_k \bmod b$. But then $y_i = y_k$ since $R_b$ is a reduced
residue system. Similarly, $x_j = x_l$. This shows that the elements of S are distinct modulo
ab.
Finally, suppose (n, ab) = 1. Since (a, b) = 1, there exist x, y with ax + by = 1. Then

\[
anx + bny = n.
\]

Since (x, b) = 1, and (n, b) = 1, it follows that (nx, b) = 1. Therefore, there is a $y_i \in R_b$ with
$nx = y_i + tb$. In the same manner, (ny, a) = 1, and so there is an $x_j \in R_a$ with $ny = x_j + ua$.
Then

\[
a(y_i + tb) + b(x_j + ua) = n \implies n = a y_i + b x_j + (t + u)ab \implies n \equiv a y_i + b x_j \bmod ab,
\]

and we are done.
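
The set S from the proof can be built explicitly for small a and b. The following Python sketch is an illustration only; it assumes a, b ≥ 2, uses the standard representatives 1, …, n for the residue classes, and the function names are our own.

```python
from math import gcd


def reduced_residues(n):
    """The standard reduced residue system modulo n: the units among 1..n."""
    return [x for x in range(1, n + 1) if gcd(x, n) == 1]


def combined_system(a, b):
    """The set S = {a*y + b*x : y in R_b, x in R_a} of the proof, reduced mod ab."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime")
    return sorted({(a * y + b * x) % (a * b)
                   for y in reduced_residues(b)
                   for x in reduced_residues(a)})


a, b = 4, 9
S = combined_system(a, b)
print(len(S) == len(reduced_residues(a)) * len(reduced_residues(b)))   # True
print(S == reduced_residues(a * b))                                    # True
```

The first check reflects that the elements of S are distinct modulo ab, the second that S represents exactly the residue classes relatively prime to ab.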

We now give the general formula for ϕ(n).

Theorem 6.3.8. Suppose $n = p_1^{e_1} \cdots p_k^{e_k}$, then

\[
\phi(n) = (p_1^{e_1} - p_1^{e_1 - 1})(p_2^{e_2} - p_2^{e_2 - 1}) \cdots (p_k^{e_k} - p_k^{e_k - 1}).
\]


Proof. From Lemmas 6.3.6 and 6.3.7, we have

\begin{align*}
\phi(n) &= \phi(p_1^{e_1}) \phi(p_2^{e_2}) \cdots \phi(p_k^{e_k}) \\
&= (p_1^{e_1} - p_1^{e_1 - 1})(p_2^{e_2} - p_2^{e_2 - 1}) \cdots (p_k^{e_k} - p_k^{e_k - 1}) \\
&= p_1^{e_1}(1 - 1/p_1) \cdots p_k^{e_k}(1 - 1/p_k) = p_1^{e_1} \cdots p_k^{e_k} \cdot (1 - 1/p_1) \cdots (1 - 1/p_k) \\
&= n \prod_i (1 - 1/p_i).
\end{align*}

Example 6.3.9. Determine ϕ(126). Now

\[
126 = 2 \cdot 3^2 \cdot 7 \implies \phi(126) = \phi(2)\phi(3^2)\phi(7) = (1)(3^2 - 3)(6) = 36.
\]

Hence, there are 36 units in $\mathbb{Z}_{126}$.
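
The formula of Theorem 6.3.8 translates into a short computation once n is factored. The following Python sketch is an illustration only; it factors n by simple trial division, which is an assumption of convenience suitable for small n, and the names `factorize` and `phi` are our own.

```python
def factorize(n):
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever is left is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi(n):
    """Euler phi-function as the product of p^e - p^(e-1) over the prime powers."""
    result = 1
    for p, e in factorize(n).items():
        result *= p**e - p**(e - 1)
    return result


print(factorize(126))   # {2: 1, 3: 2, 7: 1}
print(phi(126))         # 36
```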

An interesting result with many generalizations in number theory is the following:

Theorem 6.3.10. For n > 1, summing over the divisors d ≥ 1 of n,

\[
\sum_{d \mid n} \phi(d) = n.
\]

Proof. We first prove the theorem for prime powers and then paste together via the
fundamental theorem of arithmetic.
Suppose that $n = p^e$ for p a prime. Then the divisors of n are $1, p, p^2, \ldots, p^e$, so

\begin{align*}
\sum_{d \mid n} \phi(d) &= \phi(1) + \phi(p) + \phi(p^2) + \cdots + \phi(p^e) \\
&= 1 + (p - 1) + (p^2 - p) + \cdots + (p^e - p^{e-1}).
\end{align*}

Notice that this sum telescopes; that is, 1 + (p − 1) = p, $p + (p^2 - p) = p^2$ and so on.
Hence, the sum is just $p^e$, and the result is proved for n a prime power.
We now do an induction on the number of distinct prime factors of n. The above
argument shows that the result is true if n has only one distinct prime factor. Assume
that the result is true whenever an integer has fewer than k distinct prime factors, and
suppose $n = p_1^{e_1} \cdots p_k^{e_k}$ has k distinct prime factors. Then $n = p^e c$, where $p = p_1$, $e = e_1$,
and c has fewer than k distinct prime factors. By the inductive hypothesis

\[
\sum_{d \mid c} \phi(d) = c.
\]

Since (c, p) = 1, the divisors of n are all of the form $p^\alpha d_1$, where $d_1 \mid c$, and
α = 0, 1, . . . , e. It follows that

\[
\sum_{d \mid n} \phi(d) = \sum_{d_1 \mid c} \phi(d_1) + \sum_{d_1 \mid c} \phi(p d_1) + \cdots + \sum_{d_1 \mid c} \phi(p^e d_1).
\]


Since $(d_1, p^\alpha) = 1$ for any divisor $d_1$ of c, this sum equals

\begin{align*}
&\sum_{d_1 \mid c} \phi(d_1) + \sum_{d_1 \mid c} \phi(p)\phi(d_1) + \cdots + \sum_{d_1 \mid c} \phi(p^e)\phi(d_1) \\
&\quad = \sum_{d_1 \mid c} \phi(d_1) + (p - 1) \sum_{d_1 \mid c} \phi(d_1) + \cdots + (p^e - p^{e-1}) \sum_{d_1 \mid c} \phi(d_1) \\
&\quad = c + (p - 1)c + (p^2 - p)c + \cdots + (p^e - p^{e-1})c.
\end{align*}

As in the case of prime powers, this sum telescopes, giving a final result

\[
\sum_{d \mid n} \phi(d) = p^e c = n.
\]

Example 6.3.11. Consider n = 10. The divisors are 1, 2, 5, 10. Then ϕ(1) = 1, ϕ(2) = 1,
ϕ(5) = 4, ϕ(10) = 4. Then

ϕ(1) + ϕ(2) + ϕ(5) + ϕ(10) = 1 + 1 + 4 + 4 = 10.
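
The divisor sum of Theorem 6.3.10 is easy to check by machine for small n. The following Python sketch is an illustration only; it recomputes ϕ by direct counting so that it is self-contained, and the function names are our own.

```python
from math import gcd


def phi(n):
    """Euler phi-function by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def divisors(n):
    """All positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


n = 10
print(divisors(n))                        # [1, 2, 5, 10]
print([phi(d) for d in divisors(n)])      # [1, 1, 4, 4]
print(sum(phi(d) for d in divisors(n)))   # 10

# The identity holds for every n tried here.
print(all(sum(phi(d) for d in divisors(m)) == m for m in range(1, 300)))   # True
```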

We will see later in the book that the Euler phi-function plays an important role
in the structure theory of abelian groups.
We now turn to Fermat primes.

Definition 6.3.12. The Fermat numbers are the sequence $(F_n)$ of positive integers
defined by

\[
F_n = 2^{2^n} + 1, \quad n = 0, 1, 2, 3, \ldots.
\]

If a particular Fn is prime, it is called a Fermat prime.

Fermat believed that all the numbers in this sequence were primes. In fact, $F_0$, $F_1$,
$F_2$, $F_3$, $F_4$ are all primes, but $F_5$ is composite and divisible by 641 (see exercises). It is
still an open question whether or not there are infinitely many Fermat primes. It has
been conjectured that there are only finitely many. On the other hand, if a number of
the form $2^n + 1$ is a prime for some integer n ≥ 1, then it must be a Fermat prime.
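
The statements about the first Fermat numbers can be verified directly. The following Python sketch is an illustration only; it tests primality by naive trial division, which is an assumption of convenience that is fast enough up to $F_5$, and the function names are our own.

```python
def fermat(n):
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def is_prime(m):
    """Primality test by trial division up to the square root of m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


for n in range(6):
    print(n, fermat(n), is_prime(fermat(n)))
# 0 3 True
# 1 5 True
# 2 17 True
# 3 257 True
# 4 65537 True
# 5 4294967297 False

print(fermat(5) % 641, fermat(5) // 641)   # 0 6700417
```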

Theorem 6.3.13. If a ≥ 2 and $a^n + 1$ is a prime for some n ≥ 1, then a is even, and $n = 2^m$
for some nonnegative integer m. In particular, if $p = 2^k + 1$ is a prime for some k ≥ 1, then
$k = 2^n$ for some n, and p is a Fermat prime.

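Proof. If a is odd, then $a^n + 1$ is even and greater than 2 and, hence, not a prime. Suppose
then that a is even and n = kl with k odd and k ≥ 3. Then

\[
\frac{a^{kl} + 1}{a^l + 1} = a^{(k-1)l} - a^{(k-2)l} + \cdots + 1.
\]

Therefore, $a^l + 1$ divides $a^{kl} + 1$ if k ≥ 3. Since $1 < a^l + 1 < a^{kl} + 1$, this is a proper
divisor, so $a^n + 1$ is not a prime. Hence, if $a^n + 1$ is a prime, n has no odd factor k ≥ 3, and
we must have $n = 2^m$.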
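
The factorization used in the proof can be checked on a small case. The following Python sketch is an illustration only; the function name `cofactor` is our own, and the sample values a = 2, k = 3, l = 2 are chosen for illustration.

```python
def cofactor(a, k, l):
    """The alternating sum a^((k-1)l) - a^((k-2)l) + ... + 1 for odd k."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    return sum((-1) ** j * a ** ((k - 1 - j) * l) for j in range(k))


a, k, l = 2, 3, 2                      # n = kl = 6
print(a ** l + 1, cofactor(a, k, l))   # 5 13
print((a ** l + 1) * cofactor(a, k, l) == a ** (k * l) + 1)   # True
```

Here $2^6 + 1 = 65 = 5 \cdot 13$, so the exponent 6 = 3 · 2, which has the odd factor 3, does not give a prime.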


We can now state the solution to the constructibility of regular n-gons.


Theorem 6.3.14. A regular n-gon is constructible with a straightedge and compass if
and only if $n = 2^m p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct Fermat primes.

For example, before proving the theorem, notice that a regular 20-gon is
constructible since $20 = 2^2 \cdot 5$, and 5 is a Fermat prime. On the other hand, a regular
11-gon is not constructible.
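Proof. Let $\mu = e^{2\pi i/n}$ be a primitive n-th root of unity. Since

\[
e^{2\pi i/n} = \cos\left(\frac{2\pi}{n}\right) + i \sin\left(\frac{2\pi}{n}\right),
\]

it is easy to compute that (see exercises)

\[
\mu + \frac{1}{\mu} = 2\cos\left(\frac{2\pi}{n}\right).
\]

Therefore, $\mathbb{Q}(\mu + \frac{1}{\mu}) = \mathbb{Q}(\cos(\frac{2\pi}{n}))$. After we discuss Galois theory in more detail, we
will prove that

\[
\left|\mathbb{Q}\left(\mu + \frac{1}{\mu}\right) : \mathbb{Q}\right| = \frac{\phi(n)}{2},
\]

where ϕ(n) is the Euler phi-function. Therefore, $\cos(\frac{2\pi}{n})$ is constructible if and only if
$\frac{\phi(n)}{2}$ and, hence, ϕ(n) is a power of 2.
Suppose that $n = 2^m p_1^{e_1} \cdots p_k^{e_k}$, all $p_i$ distinct odd primes. Then from Theorem 6.3.8,

\[
\phi(n) = 2^{m-1} \cdot (p_1^{e_1} - p_1^{e_1 - 1})(p_2^{e_2} - p_2^{e_2 - 1}) \cdots (p_k^{e_k} - p_k^{e_k - 1})
\]

for m ≥ 1 (for m = 0, the factor $2^{m-1}$ is omitted). If this is a power of 2, each factor must also be a power of 2. Now

\[
p_i^{e_i} - p_i^{e_i - 1} = p_i^{e_i - 1}(p_i - 1).
\]

If this is to be a power of 2, we must have $e_i = 1$ and $p_i - 1 = 2^{k_i}$ for some $k_i$. Therefore,
each prime occurs only to the first power, and $p_i = 2^{k_i} + 1$ is a Fermat prime by
Theorem 6.3.13. Conversely, if n has this form, then each factor $p_i - 1$ is a power of 2, and
so is ϕ(n), proving the theorem.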
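
The two descriptions in the proof, ϕ(n) being a power of 2 and n having the form $2^m p_1 \cdots p_k$, can be compared by machine for small n. The following Python sketch is an illustration only; it assumes that the only Fermat primes that can divide n are the five known ones $F_0, \ldots, F_4$, which is certainly true in the range tested, and the function names are our own.

```python
from math import gcd

FERMAT_PRIMES = (3, 5, 17, 257, 65537)   # F_0, ..., F_4


def phi(n):
    """Euler phi-function by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def is_power_of_two(m):
    return m >= 1 and m & (m - 1) == 0


def constructible_by_phi(n):
    """Criterion from the proof: phi(n) is a power of 2."""
    return n >= 3 and is_power_of_two(phi(n))


def constructible_by_form(n):
    """Criterion of the theorem: n = 2^m times distinct Fermat primes."""
    if n < 3:
        return False
    while n % 2 == 0:
        n //= 2
    for p in FERMAT_PRIMES:
        if n % p == 0:
            n //= p        # divide once only, so a repeated prime is left over
    return n == 1


print([n for n in range(3, 41) if constructible_by_phi(n)])
# [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20, 24, 30, 32, 34, 40]
print(constructible_by_phi(20), constructible_by_phi(11))   # True False
print(all(constructible_by_phi(n) == constructible_by_form(n)
          for n in range(3, 600)))                           # True
```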

6.4 Exercises
1. Let ϕ be a given angle. In which of the following cases is the angle ψ constructible
from the angle ϕ by compass and straightedge?
(a) ϕ = π/13, ψ = π/26.
(b) ϕ = π/33, ψ = π/11.
(c) ϕ = π/7, ψ = π/12.


2. (The golden section) In the plane, let AB be a given segment from A to B with
length a. The segment AB should be divided such that the proportion of AB to the
length of the bigger subsegment is equal to the proportion of the length of the
bigger subsegment to the length of the smaller subsegment:

\[
\frac{a}{b} = \frac{b}{a - b},
\]

where b is the length of the bigger subsegment. Such a division is called division
by the golden section. If we write b = ax, 0 < x < 1, then $\frac{1}{x} = \frac{x}{1 - x}$, that is, $x^2 = 1 - x$.
Do the following:
(a) Show that $\frac{1}{x} = \frac{1 + \sqrt{5}}{2} = \alpha$.

(b) Construct the division of AB by the golden section with compass and straight-
edge.
(c) If we divide the radius r > 0 of a circle by the golden section, then the bigger
part of the so divided radius is the side of the regular 10-gon with its 10 vertices
on the circle.
3. Given a regular 10-gon such that the 10 vertices are on the circle with radius R > 0.
Show that the length of each side is equal to the bigger part of the radius divided
by the golden section. Describe the procedure of the construction of the regular
10-gon and 5-gon.
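4. Construct the regular 17-gon with compass and straightedge. Hint: We have to
construct the number $\frac{1}{2}(\omega + \omega^{-1}) = \cos\frac{2\pi}{17}$, where $\omega = e^{2\pi i/17}$. First, construct the positive
zero $\omega_1$ of the polynomial $x^2 + x - 4$; we get

\[
\omega_1 = \frac{1}{2}(\sqrt{17} - 1) = \omega + \omega^{-1} + \omega^2 + \omega^{-2} + \omega^4 + \omega^{-4} + \omega^8 + \omega^{-8}.
\]

Then, construct the positive zero $\omega_2$ of the polynomial $x^2 - \omega_1 x - 1$; we get

\[
\omega_2 = \frac{1}{4}\left(\sqrt{17} - 1 + \sqrt{34 - 2\sqrt{17}}\right) = \omega + \omega^{-1} + \omega^4 + \omega^{-4}.
\]

From $\omega_1$ and $\omega_2$, construct $\beta = \frac{1}{2}(\omega_2^2 - \omega_1 + \omega_2 - 4)$. Then $\omega_3 = 2\cos\frac{2\pi}{17}$ is the bigger
of the two positive zeros of the polynomial $x^2 - \omega_2 x + \beta$.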
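5. The Fibonacci numbers $f_n$, n ∈ ℕ ∪ {0}, are defined by $f_0 = 0$, $f_1 = 1$ and $f_{n+2} = f_{n+1} + f_n$
for n ∈ ℕ ∪ {0}. Show the following:
(a) $f_n = \frac{\alpha^n - \beta^n}{\alpha - \beta}$ with $\alpha = \frac{1 + \sqrt{5}}{2}$, $\beta = \frac{1 - \sqrt{5}}{2}$.
(b) $\left(\frac{f_{n+1}}{f_n}\right)_{n \in \mathbb{N}}$ converges and $\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \frac{1 + \sqrt{5}}{2} = \alpha$.
(c) $\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}^n = \begin{pmatrix} f_{n-1} & f_n \\ f_n & f_{n+1} \end{pmatrix}$, n ∈ ℕ.
(d) $f_1 + f_2 + \cdots + f_n = f_{n+2} - 1$, n ≥ 1.
(e) $f_{n-1} f_{n+1} - f_n^2 = (-1)^n$, n ∈ ℕ.
(f) $f_1^2 + f_2^2 + \cdots + f_n^2 = f_n f_{n+1}$, n ∈ ℕ.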


6. Show: The Fermat numbers $F_0$, $F_1$, $F_2$, $F_3$, $F_4$ are all prime but $F_5$ is composite and
divisible by 641.
7. Let $\mu = e^{2\pi i/n}$ be a primitive n-th root of unity. Using

\[
e^{2\pi i/n} = \cos\left(\frac{2\pi}{n}\right) + i \sin\left(\frac{2\pi}{n}\right),
\]

show that

\[
\mu + \frac{1}{\mu} = 2\cos\left(\frac{2\pi}{n}\right).
\]

7 Kronecker’s theorem and algebraic closures
7.1 Kronecker’s theorem
In the last chapter, we proved that if L|K is a field extension, then there exists an inter-
mediate field K ⊂ 𝒜 ⊂ L such that 𝒜 is algebraic over K, and contains all the elements
of L that are algebraic over K. We call 𝒜 the algebraic closure of K within L. In this
chapter, we prove that starting with any field K, we can construct an extension field
$\overline{K}$ that is algebraic over K and is algebraically closed. By this, we mean that there are
no proper algebraic extensions of $\overline{K}$ or, equivalently, that there are no irreducible nonlinear
polynomials in $\overline{K}[x]$. In the final section of this chapter, we will give a proof of the
famous fundamental theorem of algebra, which in the language of this chapter says that
the field ℂ of complex numbers is algebraically closed. We will present another proof
of this important result later in the book after we discuss Galois theory.
First, we need the following crucial result of Kronecker, which says that given a
polynomial f (x) in K[x], where K is a field, we can construct an extension field L of K,
in which f (x) has a zero α. We say that L has been constructed by adjoining α to K.
Recall that if f(x) ∈ K[x] is irreducible of degree greater than one, then f(x) can have no
zeros in K. We first need the following concept:

Definition 7.1.1. Let L|K and L′|K be field extensions. Then a K-isomorphism is an
isomorphism τ : L → L′ that is the identity map on K; thus, it fixes each element of K.

Theorem 7.1.2 (Kronecker’s theorem). Let K be a field and f(x) ∈ K[x] a nonconstant
polynomial. Then there exists a finite extension K′ of K in which f(x) has a zero.

Proof. Suppose that f (x) ∈ K[x]. We know that f (x) factors into irreducible polynomi-
als. Let p(x) be an irreducible factor of f (x). From the material in Chapter 4, we know
that since p(x) is irreducible, the principal ideal ⟨p(x)⟩ in K[x] is a maximal ideal. To
see this, suppose that g(x) ∉ ⟨p(x)⟩, so that g(x) is not a multiple of p(x). Since p(x) is
irreducible, it follows that (p(x), g(x)) = 1. Thus, there exist h(x), k(x) ∈ K[x] with

h(x)p(x) + k(x)g(x) = 1.

The element on the left is in the ideal (g(x), p(x)), so the identity, 1, is in this ideal.
Therefore, this ideal is the whole ring K[x]. Since g(x) was arbitrary, this implies
that the principal ideal ⟨p(x)⟩ is maximal.
Now let K′ = K[x]/⟨p(x)⟩. Since ⟨p(x)⟩ is a maximal ideal, it follows that K′ is a
field. We show that K can be embedded in K′, and that p(x) has a zero in K′.
First, consider the map α : K[x] → K′ given by α(f(x)) = f(x) + ⟨p(x)⟩. This is a
homomorphism. Since the identity element 1 ∈ K is not in ⟨p(x)⟩, it follows that α restricted
to K is nontrivial. Therefore, α restricted to K is a monomorphism since if $\ker(\alpha|_K) \ne K$
then $\ker(\alpha|_K) = \{0\}$. Therefore, K can be embedded into α(K), which is contained in
K′. Therefore, K′ can be considered as an extension field of K. Consider the element
a = x + ⟨p(x)⟩ ∈ K′. Then p(a) = p(x) + ⟨p(x)⟩ = 0 + ⟨p(x)⟩ since p(x) ∈ ⟨p(x)⟩. But
0 + ⟨p(x)⟩ is the zero element 0 of the factor ring K[x]/⟨p(x)⟩. Therefore, in K′, we have
p(a) = 0; hence, p(x) has a zero in K′. Since p(x) divides f(x), we must have f(a) = 0
in K′ also. Therefore, we have constructed an extension field of K, in which f(x) has a
zero.

https://doi.org/10.1515/9783110603996-007
In conformity to Chapter 5, we write K(a) for the field adjunction of a = x + ⟨p(x)⟩
to K. We now outline an intuitive construction. From this, we say that the field K′ is
constructed by adjoining the zero α to K. We remark that this construction is not a
formally correct proof like the one given for Theorem 7.1.2.
We can assume that f(x) is irreducible. Suppose that $f(x) = a_0 + a_1 x + \cdots + a_n x^n$
with $a_n \ne 0$. Define α to satisfy

\[
a_0 + a_1 \alpha + \cdots + a_n \alpha^n = 0.
\]

Now, define K′ = K(α) in the following manner. We let

\[
K(\alpha) = \{c_0 + c_1 \alpha + \cdots + c_{n-1} \alpha^{n-1} : c_i \in K\}.
\]

Then on K(α), define addition and subtraction componentwise, and define
multiplication by algebraic manipulation, replacing powers $\alpha^k$ with k ≥ n by using

\[
\alpha^n = \frac{-a_0 - a_1 \alpha - \cdots - a_{n-1} \alpha^{n-1}}{a_n}.
\]

We claim that K′ = K(α) then forms a field of finite degree over K. The basic
ring properties follow easily by computation (see exercises) using the definitions. We
must show then that every nonzero element of K(α) has a multiplicative inverse. Let
g(α) ∈ K(α) be nonzero. Then the corresponding polynomial g(x) ∈ K[x] is a nonzero polynomial
of degree ≤ n − 1. Since f(x) is irreducible of degree n, it follows that f(x) and g(x) must be
relatively prime; that is, (f(x), g(x)) = 1. Hence, there exist a(x), b(x) ∈ K[x] with

a(x)f (x) + b(x)g(x) = 1.

Evaluate these polynomials at α to get

a(α)f (α) + b(α)g(α) = 1.

Since by definition we have f (α) = 0, this becomes

b(α)g(α) = 1.

Now b(α) might have degree higher than n − 1 in α. However, using the relation that
f(α) = 0, we can rewrite b(α) as $\overline{b}(\alpha)$, where $\overline{b}(\alpha)$ now has degree ≤ n − 1 in α and,
hence, is in K(α). Therefore,

\[
\overline{b}(\alpha) g(\alpha) = 1;
\]

hence, g(α) has a multiplicative inverse. It follows that K(α) is a field and, by definition,
f(α) = 0. The elements $1, \alpha, \ldots, \alpha^{n-1}$ form a basis for K(α) over K and, hence,

\[
|K(\alpha) : K| = n.
\]

Example 7.1.3. Let $f(x) = x^2 + 1 \in \mathbb{R}[x]$. This is irreducible over ℝ. We construct the
field, in which this has a zero. Let $K' = \mathbb{R}[x]/\langle x^2 + 1 \rangle$, and let α ∈ K′ with f(α) = 0. The
extension field ℝ(α) then has the form

\[
K' = \mathbb{R}(\alpha) = \{x + \alpha y : x, y \in \mathbb{R},\ \alpha^2 = -1\}.
\]

It is clear that this field is ℝ-isomorphic to the complex numbers ℂ; that is, ℝ(α) ≅
ℝ(i) ≅ ℂ.
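
The intuitive construction of K(α) can be carried out mechanically: elements are polynomials of degree less than n in α, products are reduced modulo f, and inverses come from the extended Euclidean algorithm, as in the argument above. The following Python sketch is an illustration only. It assumes coefficients in ℚ (exact fractions) instead of ℝ, stores a polynomial as a list of coefficients with the constant term first, and all function names are our own.

```python
from fractions import Fraction


def trim(p):
    """Convert to fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([x + y for x, y in zip(p, q)])


def sub(p, q):
    return add(p, [-c for c in q])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(p, d):
    """Quotient and remainder of p on division by the nonzero polynomial d."""
    d = trim(d)
    q, r = [], trim(p)
    while len(r) >= len(d):
        term = [Fraction(0)] * (len(r) - len(d)) + [r[-1] / d[-1]]
        q = add(q, term)
        r = sub(r, mul(term, d))
    return q, r


def mul_mod(g, h, f):
    """Product of g(alpha) and h(alpha) in K(alpha), where f(alpha) = 0."""
    return divmod_poly(mul(g, h), f)[1]


def inverse_mod(g, f):
    """Return b with b*g = 1 modulo f, by the extended Euclidean algorithm."""
    r0, r1 = trim(f), divmod_poly(g, f)[1]
    t0, t1 = [], [Fraction(1)]          # invariant: t_i * g = r_i modulo f
    while r1:
        q, r2 = divmod_poly(r0, r1)
        r0, r1 = r1, r2
        t0, t1 = t1, sub(t0, mul(q, t1))
    if len(r0) != 1:
        raise ZeroDivisionError("g is not invertible modulo f")
    return divmod_poly([c / r0[0] for c in t0], f)[1]


f = [1, 0, 1]                                          # x^2 + 1
print([str(c) for c in mul_mod([0, 1], [0, 1], f)])    # ['-1'], that is, alpha^2 = -1
inv = inverse_mod([3, 4], f)                           # inverse of 3 + 4*alpha
print([str(c) for c in inv])                           # ['3/25', '-4/25']
print(mul_mod([3, 4], inv, f) == [1])                  # True
```

With f(x) = x^2 + 1 the arithmetic agrees with that of the complex numbers: the inverse of 3 + 4α is (3 − 4α)/25, matching 1/(3 + 4i).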

Theorem 7.1.4. Let p(x) ∈ K[x] be an irreducible polynomial, and let K′ = K(α) be the
extension field of K constructed in Kronecker’s theorem, in which p(x) has a zero α. Let L
be an extension field of K, and suppose that a ∈ L is algebraic with minimal polynomial
$m_a(x) = p(x)$. Then K(α) is K-isomorphic to K(a).

Proof. If L|K is a field extension and a ∈ L with p(a) = 0 and if deg(p(x)) = n, then the
elements $1, a, \ldots, a^{n-1}$ constitute a basis for K(a) over K, and the elements $1, \alpha, \ldots, \alpha^{n-1}$
constitute a basis for K(α) over K. The mapping

\[
\tau : K(a) \to K(\alpha)
\]

defined by τ(k) = k if k ∈ K and τ(a) = α, and then extended by linearity, is easily
shown to be a K-isomorphism.

Theorem 7.1.5. Let K be a field. Then the following are equivalent:


(1) Each nonconstant polynomial in K[x] has a zero in K.
(2) Each nonconstant polynomial in K[x] factors into linear factors over K. That is, for
each f (x) ∈ K[x], there exist elements a1 , . . . , an , b ∈ K with

f (x) = b(x − a1 ) ⋅ ⋅ ⋅ (x − an ).

(3) An element of K[x] is irreducible if and only if it is of degree one.


(4) If L|K is an algebraic extension, then L = K.

Proof. Suppose that each nonconstant polynomial in K[x] has a zero in K. Let f(x) ∈
K[x] with deg(f(x)) = n ≥ 1. Suppose that $a_1$ is a zero of f(x), then

\[
f(x) = (x - a_1) h(x),
\]

where the degree of h(x) is n − 1. Now, if n > 1, h(x) has a zero $a_2$ in K so that

\[
f(x) = (x - a_1)(x - a_2) g(x)
\]

with deg(g(x)) = n − 2. Continue in this manner, and f(x) factors completely into linear
factors. Hence, (1) implies (2).
Now suppose (2); that is, that each nonconstant polynomial in K[x] factors into
linear factors over K. Suppose that f (x) is irreducible. If deg(f (x)) > 1, then f (x) factors
into linear factors and, hence, is not irreducible. Therefore, f (x) must be of degree 1,
and (2) implies (3).
Now suppose that an element of K[x] is irreducible if and only if it is of degree one,
and suppose that L|K is an algebraic extension. Let a ∈ L. Then a is algebraic over K.
Its minimal polynomial ma (x) is monic and irreducible over K and, hence, from (3),
is linear. Therefore, ma (x) = x − a ∈ K[x]. It follows that a ∈ K and, hence, K = L.
Therefore, (3) implies (4).
Finally, suppose that whenever L|K is an algebraic extension, then L = K. Suppose
that f (x) is a nonconstant polynomial in K[x]. From Kronecker’s theorem, there exists
a field extension L, and a ∈ L with f (a) = 0. However, L is an algebraic extension.
Therefore, by supposition, K = L. Therefore, a ∈ K, and f (x) has a zero in K. Therefore,
(4) implies (1), completing the proof.

In the next section, we will prove that given a field K, we can always find an
extension field $\overline{K}$ with the properties of the last theorem.

7.2 Algebraic closures and algebraically closed fields


A field K is termed algebraically closed if K has no algebraic extensions other than K
itself. This is equivalent to any one of the conditions of Theorem 7.1.5.

Definition 7.2.1. A field K is algebraically closed if every nonconstant polynomial
f(x) ∈ K[x] has a zero in K.

The following theorem is just a restatement of Theorem 7.1.5.

Theorem 7.2.2. A field K is algebraically closed if and only if it satisfies any one of the
following conditions:
(1) Each nonconstant polynomial in K[x] has a zero in K.
(2) Each nonconstant polynomial in K[x] factors into linear factors over K. That is, for
each f (x) ∈ K[x], there exist elements a1 , . . . , an , b ∈ K with

f (x) = b(x − a1 ) ⋅ ⋅ ⋅ (x − an ).

(3) An element of K[x] is irreducible if and only if it is of degree one.


(4) If L|K is an algebraic extension, then L = K.

The prime example of an algebraically closed field is the field ℂ of complex num-
bers. The fundamental theorem of algebra says that any nonconstant complex poly-
nomial has a complex zero.


We now show that the algebraic closure of one field within an algebraically closed
field is algebraically closed. First, we define a general algebraic closure.

Definition 7.2.3. An extension field $\overline{K}$ of a field K is an algebraic closure of K if $\overline{K}$ is
algebraically closed and $\overline{K}|K$ is algebraic.

Theorem 7.2.4. Let K be a field and L|K an extension of K with L algebraically closed.
Let $\overline{K} = \mathcal{A}_K$ be the algebraic closure of K within L. Then $\overline{K}$ is an algebraic closure of K.

Proof. Let $\overline{K} = \mathcal{A}_K$ be the algebraic closure of K within L. We know that $\overline{K}|K$ is
algebraic. Therefore, we must show that $\overline{K}$ is algebraically closed.
Let f(x) be a nonconstant polynomial in $\overline{K}[x]$. Then f(x) ∈ L[x]. Since L is
algebraically closed, f(x) has a zero a in L. Since f(a) = 0 and $f(x) \in \overline{K}[x]$, it follows that a
is algebraic over $\overline{K}$. However, $\overline{K}$ is algebraic over K. Therefore, a is also algebraic over K.
Hence, $a \in \overline{K}$, and f(x) has a zero in $\overline{K}$. Therefore, $\overline{K}$ is algebraically closed.

We want to note the distinction between being algebraically closed and being an
algebraic closure.

Lemma 7.2.5. The complex numbers ℂ are an algebraic closure of ℝ, but not an
algebraic closure of ℚ. An algebraic closure of ℚ is 𝒜, the field of algebraic numbers within ℂ.

Proof. ℂ is algebraically closed (the fundamental theorem of algebra), and since
|ℂ : ℝ| = 2, it is algebraic over ℝ. Therefore, ℂ is an algebraic closure of ℝ. Although
ℂ is algebraically closed and contains the rational numbers ℚ, it is not an algebraic
closure of ℚ since it is not algebraic over ℚ as there exist transcendental elements.
On the other hand, 𝒜, the field of algebraic numbers within ℂ, is an algebraic
closure of ℚ from Theorem 7.2.4.

We now show that every field has an algebraic closure. To do this, we first show
that any field can be embedded into an algebraically closed field.

Theorem 7.2.6. Let K be a field. Then K can be embedded into an algebraically closed
field.

Proof. We show first that there is an extension field L of K, in which each nonconstant
polynomial f (x) ∈ K[x] has a zero in L.
Assign to each nonconstant f(x) ∈ K[x] the symbol $y_f$, and consider

\[
R = K[y_f : f(x) \in K[x]],
\]

the polynomial ring over K in the variables $y_f$. Let

\[
I = \left\{ \sum_{j=1}^{n} f_j(y_{f_j}) r_j : r_j \in R,\ f_j(x) \in K[x] \right\}.
\]


It is straightforward that I is an ideal in R. Suppose that I = R. Then 1 ∈ I. Hence, there
is a linear combination

\[
1 = g_1 f_1(y_{f_1}) + \cdots + g_n f_n(y_{f_n}),
\]

where $g_i \in R$.
In the n polynomials $g_1, \ldots, g_n$, there are only a finite number of variables, say for
example,

\[
y_{f_1}, \ldots, y_{f_n}, \ldots, y_{f_m}.
\]

Hence,

\[
1 = \sum_{i=1}^{n} g_i(y_{f_1}, \ldots, y_{f_m}) f_i(y_{f_i}). \tag{$*$}
\]

Successive applications of Kronecker’s theorem lead us to construct an extension field
P of K, in which each $f_i$ has a zero $a_i$. Substituting $a_i$ for $y_{f_i}$, i = 1, . . . , n (and, say, 0 for
the remaining variables) in (∗) above, we get that 1 = 0, a contradiction. Therefore, I ≠ R.
Since I is an ideal not equal to the whole ring R, it follows that I is contained in a
maximal ideal M of R. Set L = R/M. Since M is maximal, L is a field. Now K ∩ M = {0}.
If not, suppose that a ∈ K ∩ M with a ≠ 0. Then $a^{-1} a = 1 \in M$, and then M = R, a
contradiction. Now define τ : K → L by τ(k) = k + M. Since K ∩ M = {0}, it follows that ker(τ) = {0}.
Therefore, τ is a monomorphism. This allows us to identify K and τ(K), and shows
that K embeds into L.
Now suppose that f(x) is a nonconstant polynomial in K[x]. Then

\[
f(y_f + M) = f(y_f) + M.
\]

However, by the construction $f(y_f) \in I \subset M$, so that

\[
f(y_f + M) = M = \text{the zero element of } L.
\]

Therefore, $y_f + M$ is a zero of f(x).
Therefore, we have constructed a field L, in which every nonconstant polynomial
in K[x] has a zero.
We now iterate this procedure to form a chain of fields

\[
K \subset K_1 (= L) \subset K_2 \subset \cdots
\]

such that each nonconstant polynomial of $K_i[x]$ has a zero in $K_{i+1}$.
Now let $\hat{K} = \bigcup_i K_i$. It is easy to show (see exercises) that $\hat{K}$ is a field. If f(x) is a
nonconstant polynomial in $\hat{K}[x]$, then there is some i with $f(x) \in K_i[x]$. Therefore, f(x)
has a zero in $K_{i+1} \subset \hat{K}$. Hence, f(x) has a zero in $\hat{K}$, and $\hat{K}$ is algebraically closed.


Theorem 7.2.7. Let K be a field. Then K has an algebraic closure.

Proof. Let $\hat{K}$ be an algebraically closed field containing K, which exists from
Theorem 7.2.6.
Now let $\overline{K} = \mathcal{A}_K$ be the set of elements of $\hat{K}$ that are algebraic over K. From
Theorem 7.2.4, $\overline{K}$ is an algebraic closure of K.
The following lemma is straightforward. We leave the proof to the exercises.

Lemma 7.2.8. Let K, K′ be fields and ϕ : K → K′ a homomorphism. Then
$\tilde{\phi} : K[x] \to K'[x]$, given by

\[
\tilde{\phi}\left( \sum_{i=0}^{n} k_i x^i \right) = \sum_{i=0}^{n} \phi(k_i) x^i,
\]

is also a homomorphism. By convention, we identify ϕ and $\tilde{\phi}$ and write $\phi = \tilde{\phi}$. If ϕ is an
isomorphism, then so is $\tilde{\phi}$.

Lemma 7.2.9. Let K, K′ be fields and ϕ : K → K′ an isomorphism. Let f(x) ∈ K[x] be
irreducible. Let K ⊂ K(a) and K′ ⊂ K′(a′), where a is a zero of f(x) and a′ is a zero of
ϕ(f(x)). Then there is an isomorphism ψ : K(a) → K′(a′) with $\psi|_K = \phi$ and ψ(a) = a′.
Furthermore, ψ is uniquely determined.

Proof. This is a generalized version of Theorem 7.1.4. If b ∈ K(a), then from the
construction of K(a), there is a polynomial g(x) ∈ K[x] with b = g(a). Define a map

\[
\psi : K(a) \to K'(a')
\]

by

\[
\psi(b) = \phi(g(x))(a').
\]

We show that ψ is an isomorphism.
First, ψ is well-defined. Suppose that b = g(a) = h(a) with h(x) ∈ K[x]. Then
(g − h)(a) = 0. Since f(x) is irreducible, this implies that $f(x) = c\, m_a(x)$ for some nonzero
c ∈ K, and since a is a zero of (g − h)(x), then f(x) | (g − h)(x). Then

\[
\phi(f(x)) \mid (\phi(g(x)) - \phi(h(x))).
\]

Since ϕ(f(x))(a′) = 0, this implies that ϕ(g(x))(a′) = ϕ(h(x))(a′); hence, the map ψ is
well-defined.
It is easy to show that ψ is a homomorphism. Let $b_1 = g_1(a)$, $b_2 = g_2(a)$. Then
$b_1 b_2 = g_1 g_2(a)$. Hence,

\[
\psi(b_1 b_2) = (\phi(g_1 g_2))(a') = \phi(g_1)(a') \phi(g_2)(a') = \psi(b_1) \psi(b_2).
\]

In the same manner, we have $\psi(b_1 + b_2) = \psi(b_1) + \psi(b_2)$.


Now suppose that k ∈ K so that k ∈ K[x] is a constant polynomial. Then ψ(k) =
(ϕ(k))(a′) = ϕ(k). Therefore, ψ restricted to K is precisely ϕ.
As ψ is not the zero mapping, it follows that ψ is a monomorphism. Its image contains
K′ = ϕ(K) and a′ = ψ(a), so ψ is also surjective and, hence, an isomorphism.
Finally, since K(a) is generated from K and a, and ψ restricted to K is ϕ, it follows
that ψ is uniquely determined by ϕ and ψ(a) = a′. Hence, ψ is unique.

Theorem 7.2.10. Let L|K be an algebraic extension. Suppose that $L_1$ is an algebraically closed field and ϕ is an isomorphism from K to $K_1 \subset L_1$. Then there exists a monomorphism ψ from L to $L_1$ with $\psi|_K = \phi$.

Before we give the proof, we note that the theorem gives the following commutative diagram:

[Diagram: ϕ : K → $K_1$ and ψ : L → $L_1$ as horizontal arrows, with the inclusions K ⊂ L and $K_1 \subset L_1$ as vertical arrows.]

In particular, the theorem can be applied to monomorphisms of a field K within an algebraic closure $\overline{K}$ of K. Specifically, suppose that $K \subset \overline{K}$, where $\overline{K}$ is an algebraic closure of K, and let $\alpha : K \to \overline{K}$ be a monomorphism with α(K) = K. Then there exists an automorphism $\alpha^*$ of $\overline{K}$ with $\alpha^*|_K = \alpha$.
Proof of Theorem 7.2.10. Consider the set

ℳ = {(M, τ) : M is a field with K ⊂ M ⊂ L, and τ : M → $L_1$ is a monomorphism with $\tau|_K = \phi$}.

Now the set ℳ is nonempty since (K, ϕ) ∈ ℳ. Order ℳ by $(M_1, \tau_1) < (M_2, \tau_2)$ if $M_1 \subset M_2$ and $(\tau_2)|_{M_1} = \tau_1$. Let

$$\mathcal{K} = \{(M_i, \tau_i) : i \in I\}$$

be a chain in ℳ. Let (M, τ) be defined by

$$M = \bigcup_{i \in I} M_i \quad \text{with} \quad \tau(a) = \tau_i(a) \text{ for all } a \in M_i.$$

It is clear that (M, τ) is an upper bound for the chain 𝒦. Since each chain has an upper bound, it follows from Zorn’s lemma that ℳ has a maximal element (N, ρ). We show that N = L.
Suppose that N ⊊ L. Let a ∈ L \ N. Then a is algebraic over K, since L|K is algebraic, and hence also algebraic over N. Let $m_a(x) \in N[x]$ be the minimal polynomial of a relative to N. Since $L_1$ is algebraically closed, $\rho(m_a(x))$ has a zero a′ ∈ $L_1$. Therefore, by Lemma 7.2.9, there is a monomorphism ρ′ : N(a) → $L_1$ whose restriction to N is ρ. It follows that (N, ρ) < (N(a), ρ′) since a ∉ N. This contradicts the maximality of (N, ρ). Therefore, N = L, completing the proof.


Combining the previous two theorems, we can now prove that an algebraic closure of a field K is unique up to K-isomorphism; that is, up to an isomorphism that is the identity on K.

Theorem 7.2.11. Let $L_1$ and $L_2$ be algebraic closures of the field K. Then there is a K-isomorphism $\tau : L_1 \to L_2$. Again by K-isomorphism, we mean that τ is the identity on K.

Proof. From Theorem 7.2.10, there is a monomorphism $\tau : L_1 \to L_2$ with τ the identity on K. However, since $L_1$ is algebraically closed, so is $\tau(L_1)$. Then $L_2|\tau(L_1)$ is an algebraic extension. Therefore, since $\tau(L_1)$ is algebraically closed, we must have $L_2 = \tau(L_1)$. Therefore, τ is also surjective and, hence, an isomorphism.

The following corollary is immediate.

Corollary 7.2.12. Let L|K and L′|K be field extensions with a ∈ L and a′ ∈ L′ algebraic elements over K. Then K(a) is K-isomorphic to K(a′) if and only if |K(a) : K| = |K(a′) : K|, and there is an element a″ ∈ K(a′) with $m_a(x) = m_{a''}(x)$.

7.3 The fundamental theorem of algebra


In this section, we give a proof of the fact that the complex numbers form an alge-
braically closed field. This is known as the fundamental theorem of algebra. First, we
need the concept of a splitting field for a polynomial. In the next chapter, we will ex-
amine this concept more deeply.

7.3.1 Splitting fields

We have just seen that given an irreducible polynomial over a field K, we could always
find a field extension, in which this polynomial has a zero. We now push this further
to obtain field extensions, where a given polynomial has all its zeros.

Definition 7.3.1. If K is a field and 0 ≠ f(x) ∈ K[x], and K′ is an extension field of K, then f(x) splits in K′ (K′ may be K), if f(x) factors into linear factors in K′[x]. Equivalently, this means that all the zeros of f(x) are in K′.
K′ is a splitting field for f(x) over K if K′ is the smallest extension field of K, in which f(x) splits. (A splitting field for f(x) is the smallest extension field, in which f(x) has all its possible zeros.)
K′ is a splitting field over K if it is the splitting field for some finite set of polynomials over K.

Theorem 7.3.2. If K is a field and 0 ≠ f (x) ∈ K[x], then there exists a splitting field for
f (x) over K.


Proof. The splitting field is constructed by repeated adjoining of zeros. Suppose, without loss of generality, that f(x) is irreducible of degree n over K. From Theorem 7.1.2, there exists a field K′ containing α with f(α) = 0. Then f(x) = (x − α)g(x) ∈ K′[x] with deg g(x) = n − 1. By an inductive argument, g(x) has a splitting field; therefore, so does f(x).

In the next chapter, we will further characterize splitting fields.

7.3.2 Permutations and symmetric polynomials

To obtain a proof of the fundamental theorem of algebra, we need to go a bit outside


of our main discussions of rings and fields and introduce symmetric polynomials. To
introduce this concept, we first review some basic ideas from elementary group theory,
which we will look at in detail later in the book.

Definition 7.3.3. A group G is a set with one binary operation, which we will denote
by multiplication, such that the following hold:
(1) The operation is associative; that is, (g1 g2 )g3 = g1 (g2 g3 ) for all g1 , g2 , g3 ∈ G.
(2) There exists an identity for this operation; that is, an element 1 such that 1g = g
for each g ∈ G.
(3) Each g ∈ G has an inverse for this operation; that is, for each g, there exists a $g^{-1}$ with the property that $g g^{-1} = 1$.

If in addition the operation is commutative ($g_1 g_2 = g_2 g_1$ for all $g_1, g_2 \in G$), the group G is called an abelian group. The order of G is the number of elements in G, denoted |G|. If |G| < ∞, G is a finite group. H ⊂ G is a subgroup if H is also a group under the same operation as G. Equivalently, H is a subgroup if H ≠ ∅, and H is closed under the operation and inverses.

Groups most often arise from invertible mappings of a set onto itself. Such map-
pings are called permutations.

Definition 7.3.4. If T is a set, a permutation on T is a one-to-one mapping of T onto


itself. We denote the set of all permutations on T by ST .

Theorem 7.3.5. For any set T, ST forms a group under composition called the symmetric
group on T. If T, T1 have the same cardinality (size), then ST ≅ ST1 . If T is a finite set with
|T| = n, then ST is a finite group, and |ST | = n!.

Proof. If ST is the set of all permutations on the set T, we must show that composition
is an operation on ST that is associative and has an identity and inverses.
Let f , g ∈ ST . Then f , g are one-to-one mappings of T onto itself. Consider f ∘ g :
T → T. If f ∘ g(t1 ) = f ∘ g(t2 ), then f (g(t1 )) = f (g(t2 )), and g(t1 ) = g(t2 ), since f is
one-to-one. But then t1 = t2 since g is one-to-one.


If t ∈ T, there exists t1 ∈ T with f (t1 ) = t since f is onto. Then there exists t2 ∈ T


with g(t2 ) = t1 since g is onto. Putting these together, f (g(t2 )) = t; therefore, f ∘g is onto.
Therefore, f ∘ g is also a permutation, and composition gives a valid binary operation
on ST .
The identity function 1(t) = t for all t ∈ T will serve as the identity for ST , whereas
the inverse function for each permutation will be the inverse. Such unique inverse
functions exist since each permutation is a bijection.
Finally, composition of functions is always associative; therefore, ST forms a
group.
If T, $T_1$ have the same cardinality, then there exists a bijection $\sigma : T \to T_1$. Define a map $F : S_T \to S_{T_1}$ in the following manner: if $f \in S_T$, let F(f) be the permutation on $T_1$ given by $F(f)(t_1) = \sigma(f(\sigma^{-1}(t_1)))$. It is straightforward to verify that F is an isomorphism (see the exercises).
Finally, suppose |T| = n < ∞. Then $T = \{t_1, \dots, t_n\}$. Each $f \in S_T$ can be pictured as

$$f = \begin{pmatrix} t_1 & \dots & t_n \\ f(t_1) & \dots & f(t_n) \end{pmatrix}.$$

For t1 , there are n choices for f (t1 ). For t2 , there are only n − 1 choices since f is one-to-
one. This continues down to only one choice for tn . Using the multiplication principle,
the number of choices for f and, therefore, the size of ST is

n(n − 1) ⋅ ⋅ ⋅ 1 = n!.
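The count can be checked by brute force for small n. This is a minimal sketch, assuming the set T is {1, ..., n} and each permutation is stored as the tuple of its images; the function name `symmetric_group` is illustrative.

```python
from itertools import permutations
from math import factorial

def symmetric_group(n):
    """All permutations of {1, ..., n}, each stored as (f(1), ..., f(n))."""
    return list(permutations(range(1, n + 1)))

# |S_n| = n! for the first few values of n.
for n in range(1, 7):
    assert len(symmetric_group(n)) == factorial(n)
```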

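For a set with n elements, we denote $S_T$ by $S_n$, called the symmetric group on n symbols.

Example 7.3.6. Write down the six elements of $S_3$, and give the multiplication table for the group.
Name the three elements 1, 2, 3 of T. The six elements of $S_3$ are then:

$$1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},$$

$$c = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad d = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \quad e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.$$

The multiplication table for $S_3$ can be written down directly by doing the required composition. For example,

$$ac = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = d.$$

To see this, note that a : 1 → 2, 2 → 3, 3 → 1; c : 1 → 2, 2 → 1, 3 → 3, and so ac : 1 → 3, 2 → 2, 3 → 1 (the right-hand factor c is applied first).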


It is somewhat easier to construct the multiplication table if we make some observations. First, $a^2 = b$, and $a^3 = 1$. Next, $c^2 = 1$, $d = ac$, $e = a^2c$ and, finally, $ac = ca^2$.
From these relations, the following multiplication table can be constructed (the entry in row g and column h is the product gh):

$$\begin{array}{c|cccccc}
 & 1 & a & a^2 & c & ac & a^2c \\ \hline
1 & 1 & a & a^2 & c & ac & a^2c \\
a & a & a^2 & 1 & ac & a^2c & c \\
a^2 & a^2 & 1 & a & a^2c & c & ac \\
c & c & a^2c & ac & 1 & a^2 & a \\
ac & ac & c & a^2c & a & 1 & a^2 \\
a^2c & a^2c & ac & c & a^2 & a & 1
\end{array}$$

To see this, consider, for example, $(ac)a^2 = a(ca^2) = a(ac) = a^2c$.


More generally, we can say that $S_3$ has a presentation given by

$$S_3 = \langle a, c;\ a^3 = c^2 = 1,\ ac = ca^2 \rangle.$$

By this, we mean that $S_3$ is generated by a, c, or that $S_3$ has generators a, c. Thus, the whole group and its multiplication table can be generated by using the relations $a^3 = c^2 = 1$, $ac = ca^2$.
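The relations and the table of Example 7.3.6 can be reproduced mechanically. The sketch below assumes each permutation is stored as the tuple (f(1), f(2), f(3)) and that a product fg means "apply g first, then f", which is the convention used in the computation of ac above; the names `mul`, `names`, and `table` are illustrative.

```python
# A permutation of {1, 2, 3} is stored as the tuple of images (f(1), f(2), f(3)).
one = (1, 2, 3)
a = (2, 3, 1)
b = (3, 1, 2)
c = (2, 1, 3)
d = (3, 2, 1)
e = (1, 3, 2)

def mul(f, g):
    """Product fg in the convention of the example: apply g first, then f."""
    return tuple(f[g[t - 1] - 1] for t in (1, 2, 3))

# The relations quoted in the example.
assert mul(a, c) == d                      # ac = d
assert mul(a, a) == b                      # a^2 = b
assert mul(a, b) == one                    # a^3 = 1
assert mul(c, c) == one                    # c^2 = 1
assert mul(b, c) == e                      # e = a^2 c
assert mul(a, c) == mul(c, b)              # ac = c a^2

# Rebuild the multiplication table in the order 1, a, a^2, c, ac, a^2c.
names = {one: "1", a: "a", b: "a^2", c: "c", d: "ac", e: "a^2c"}
order = [one, a, b, c, d, e]
table = [[names[mul(r, s)] for s in order] for r in order]
for r, row in zip(order, table):
    print(f"{names[r]:>5} | " + " ".join(f"{x:>5}" for x in row))
```

With the opposite composition convention (left factor applied first), the same tuples would give a different table, so the convention matters when comparing against the text.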

An important result, the form of which we will see later in our work on extension
fields, is the following:

Lemma 7.3.7. Let T be a set and T1 ⊂ T a subset. Let H be the subset of ST that fixes
each element of T1 ; that is, f ∈ H if f (t) = t for all t ∈ T1 . Then H is a subgroup.

Proof. H ≠ ∅ since 1 ∈ H. Now suppose $h_1, h_2 \in H$. Let $t_1 \in T_1$, and consider $h_1 \circ h_2(t_1) = h_1(h_2(t_1))$. Now $h_2(t_1) = t_1$ since $h_2 \in H$, but then $h_1(t_1) = t_1$ since $h_1 \in H$. Therefore, $h_1 \circ h_2 \in H$, and H is closed under composition. If $h_1$ fixes $t_1$, then $h_1^{-1}$ also fixes $t_1$. Thus, H is also closed under inverses and is, therefore, a subgroup.
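Lemma 7.3.7 can be checked on a small case. This is a minimal sketch, assuming T = {1, 2, 3, 4} and T1 = {1, 2} as sample sets; the names `compose`, `inverse`, and `H` are illustrative.

```python
from itertools import permutations

T = (1, 2, 3, 4)
T1 = (1, 2)          # sample subset whose elements must stay fixed

# Each permutation of T is stored as the tuple (f(1), f(2), f(3), f(4)).
S_T = list(permutations(T))

def compose(f, g):
    """f o g: apply g first, then f."""
    return tuple(f[g[t - 1] - 1] for t in T)

def inverse(f):
    """The inverse permutation of f."""
    inv = [0] * len(T)
    for t in T:
        inv[f[t - 1] - 1] = t
    return tuple(inv)

# H consists of the permutations that fix every element of T1.
H = [f for f in S_T if all(f[t - 1] == t for t in T1)]

assert T in H                                          # identity lies in H
assert all(compose(f, g) in H for f in H for g in H)   # closed under composition
assert all(inverse(f) in H for f in H)                 # closed under inverses
```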

We now apply these ideas of permutations to certain polynomial rings in indepen-


dent indeterminates over a field. We will look at these in detail in Chapter 11.

Definition 7.3.8. Let $y_1, \dots, y_n$ be (independent) indeterminates over a field K. A polynomial $f(y_1, \dots, y_n) \in K[y_1, \dots, y_n]$ is a symmetric polynomial in $y_1, \dots, y_n$ if $f(y_1, \dots, y_n)$ is unchanged by any permutation σ of $\{y_1, \dots, y_n\}$; that is, $f(y_1, \dots, y_n) = f(\sigma(y_1), \dots, \sigma(y_n))$.
If K ⊂ K′ are fields and $\alpha_1, \dots, \alpha_n$ are in K′, then we call a polynomial $f(\alpha_1, \dots, \alpha_n)$ with coefficients in K symmetric in $\alpha_1, \dots, \alpha_n$ if $f(\alpha_1, \dots, \alpha_n)$ is unchanged by any permutation σ of $\{\alpha_1, \dots, \alpha_n\}$.

Example 7.3.9. Let K be a field and k0 , k1 ∈ K. Let h(y1 , y2 ) = k0 (y1 + y2 ) + k1 (y1 y2 ).


There are two permutations on {y1 , y2 }, namely, σ1 : y1 → y1 , y2 → y2 and σ2 :
y1 → y2 , y2 → y1 . Applying either one of these two to {y1 , y2 } leaves h(y1 , y2 ) invariant.
Therefore, h(y1 , y2 ) is a symmetric polynomial.


Definition 7.3.10. Let $x, y_1, \dots, y_n$ be indeterminates over a field K (or elements of an extension field K′ of K). Form the polynomial

$$p(x, y_1, \dots, y_n) = (x - y_1) \cdots (x - y_n).$$

The i-th elementary symmetric polynomial $s_i$ in $y_1, \dots, y_n$ for i = 1, ..., n, is $(-1)^i a_i$, where $a_i$ is the coefficient of $x^{n-i}$ in $p(x, y_1, \dots, y_n)$.

Example 7.3.11. Consider $y_1, y_2, y_3$. Then

$$p(x, y_1, y_2, y_3) = (x - y_1)(x - y_2)(x - y_3) = x^3 - (y_1 + y_2 + y_3)x^2 + (y_1 y_2 + y_1 y_3 + y_2 y_3)x - y_1 y_2 y_3.$$

Therefore, the three elementary symmetric polynomials in $y_1, y_2, y_3$ over any field are
(1) $s_1 = y_1 + y_2 + y_3$.
(2) $s_2 = y_1 y_2 + y_1 y_3 + y_2 y_3$.
(3) $s_3 = y_1 y_2 y_3$.

In general, the pattern of the last example holds for $y_1, \dots, y_n$. That is,

$$\begin{aligned}
s_1 &= y_1 + y_2 + \cdots + y_n \\
s_2 &= y_1 y_2 + y_1 y_3 + \cdots + y_{n-1} y_n \\
s_3 &= y_1 y_2 y_3 + y_1 y_2 y_4 + \cdots + y_{n-2} y_{n-1} y_n \\
&\;\;\vdots \\
s_n &= y_1 \cdots y_n.
\end{aligned}$$
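The two descriptions of the elementary symmetric polynomials (as signed coefficients of the product of linear factors, and as sums of products of k distinct entries) can be compared on numbers. This is a minimal sketch, assuming the indeterminates are replaced by sample integer values; the function names are illustrative.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """s_k: the sum of all products of k distinct entries of `values`."""
    return sum(prod(combo) for combo in combinations(values, k))

def expand_linear_factors(values):
    """Coefficients of (x - y_1)...(x - y_n), highest power of x first."""
    coeffs = [1]
    for y in values:
        # Multiply the current polynomial by (x - y).
        coeffs = [hi - y * lo for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

values = [2, 3, 5]                       # sample values for y_1, y_2, y_3
coeffs = expand_linear_factors(values)   # x^3 - 10 x^2 + 31 x - 30

# The coefficient of x^(n-i) equals (-1)^i s_i.
for i, coeff in enumerate(coeffs):
    assert coeff == (-1) ** i * elementary_symmetric(values, i)
```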

The importance of the elementary symmetric polynomials is that any symmetric


polynomial can be built up from the elementary symmetric polynomials. We make this
precise in the next theorem called the fundamental theorem of symmetric polynomials.
We will use this important result several times, and we will give a complete proof in
Section 7.5.

Theorem 7.3.12 (Fundamental theorem of symmetric polynomials). If P is a symmetric polynomial in the indeterminates $y_1, \dots, y_n$ over a field K; that is, $P \in K[y_1, \dots, y_n]$ and P is symmetric, then there exists a unique $g \in K[y_1, \dots, y_n]$ such that $P(y_1, \dots, y_n) = g(s_1, \dots, s_n)$. That is, any symmetric polynomial in $y_1, \dots, y_n$ is a polynomial expression in the elementary symmetric polynomials in $y_1, \dots, y_n$.

From this theorem, we obtain the following two lemmas, which will be crucial in
our proof of the fundamental theorem of algebra.

Lemma 7.3.13. Let p(x) ∈ K[x], and suppose p(x) has the zeros $\alpha_1, \dots, \alpha_n$ in the splitting field K′. Then the elementary symmetric polynomials in $\alpha_1, \dots, \alpha_n$ are in K.


Proof. Suppose $p(x) = c_0 + c_1 x + \cdots + c_n x^n \in K[x]$ with $c_n \neq 0$. Since p(x) splits in K′[x], with zeros $\alpha_1, \dots, \alpha_n$, we have that, in K′[x],

$$p(x) = c_n (x - \alpha_1) \cdots (x - \alpha_n).$$

The coefficients are then $c_{n-i} = c_n (-1)^i s_i(\alpha_1, \dots, \alpha_n)$, where the $s_i(\alpha_1, \dots, \alpha_n)$ are the elementary symmetric polynomials in $\alpha_1, \dots, \alpha_n$. However, p(x) ∈ K[x], so each coefficient is in K. It follows then that for each i, $c_n (-1)^i s_i(\alpha_1, \dots, \alpha_n) \in K$; hence, $s_i(\alpha_1, \dots, \alpha_n) \in K$ since $c_n \in K$ and $c_n \neq 0$.

Lemma 7.3.14. Let p(x) ∈ K[x], and suppose p(x) has the zeros $\alpha_1, \dots, \alpha_n$ in the splitting field K′. Suppose further that $g(x) = g(x, \alpha_1, \dots, \alpha_n) \in K'[x]$. If g(x) is a symmetric polynomial in $\alpha_1, \dots, \alpha_n$, then g(x) ∈ K[x].

Proof. If $g(x) = g(x, \alpha_1, \dots, \alpha_n)$ is symmetric in $\alpha_1, \dots, \alpha_n$, then from Theorem 7.3.12, its coefficients are polynomial expressions, with coefficients in K, in the elementary symmetric polynomials in $\alpha_1, \dots, \alpha_n$. From Lemma 7.3.13, these are in the ground field K, so the coefficients of g(x) are in K. Therefore, g(x) ∈ K[x].

7.4 The fundamental theorem of algebra


We now present a proof of the fundamental theorem of algebra.

Theorem 7.4.1 (Fundamental theorem of algebra). Any nonconstant complex polyno-


mial has a complex zero. In other words, the complex number field ℂ is algebraically
closed.

The proof depends on the following sequence of lemmas. The crucial one now is
the last, which says that any real polynomial must have a complex zero.

Lemma 7.4.2. Any odd-degree real polynomial must have a real zero.

Proof. This is a consequence of the intermediate value theorem from analysis.


Suppose P(x) ∈ ℝ[x] with deg P(x) = n = 2k + 1, and suppose the leading coefficient $a_n > 0$ (the proof is almost identical if $a_n < 0$). Then

$$P(x) = a_n x^n + (\text{lower terms}),$$

and n is odd. Then,
(1) $\lim_{x \to \infty} P(x) = \lim_{x \to \infty} a_n x^n = \infty$ since $a_n > 0$.
(2) $\lim_{x \to -\infty} P(x) = \lim_{x \to -\infty} a_n x^n = -\infty$ since $a_n > 0$ and n is odd.

From (1), P(x) gets arbitrarily large positively, so there exists an $x_1$ with $P(x_1) > 0$. Similarly, from (2) there exists an $x_2$ with $P(x_2) < 0$.


A real polynomial is a continuous real-valued function for all x ∈ ℝ. Since


P(x1 )P(x2 ) < 0, it follows from the intermediate value theorem that there exists an
x3 , between x1 and x2 , such that P(x3 ) = 0.

Lemma 7.4.3. Any degree-two complex polynomial must have a complex zero.

Proof. This is a consequence of the quadratic formula and of the fact that any complex number has a square root.
If $P(x) = ax^2 + bx + c$, a ≠ 0, then the zeros formally are

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

From DeMoivre’s theorem, every complex number has a square root; hence, $x_1, x_2$ exist in ℂ. They of course are the same if $b^2 - 4ac = 0$.
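The quadratic formula over ℂ can be evaluated directly with a complex square root. The sketch below assumes floating-point complex arithmetic from Python's standard `cmath` module, so the zeros are only approximate; the coefficients used are hypothetical sample values.

```python
import cmath

def quadratic_zeros(a, b, c):
    """Both zeros of a x^2 + b x + c for complex a, b, c with a != 0."""
    if a == 0:
        raise ValueError("leading coefficient must be nonzero")
    root = cmath.sqrt(b * b - 4 * a * c)   # one of the two square roots
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

def evaluate(a, b, c, z):
    """Value of a z^2 + b z + c."""
    return a * z * z + b * z + c

# Hypothetical sample: x^2 - 2i x - (1 + 2i), with discriminant 8i.
a, b, c = 1, -2j, -1 - 2j
z1, z2 = quadratic_zeros(a, b, c)

# Both values are zeros up to rounding error.
assert abs(evaluate(a, b, c, z1)) < 1e-9
assert abs(evaluate(a, b, c, z2)) < 1e-9
```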
To go further, we need the concept of the conjugate of a polynomial and some
straightforward consequences of this idea.

Definition 7.4.4. If $P(x) = a_0 + \cdots + a_n x^n$ is a complex polynomial, then its conjugate is the polynomial $\overline{P}(x) = \overline{a_0} + \cdots + \overline{a_n} x^n$. That is, the conjugate is the polynomial whose coefficients are the complex conjugates of those of P(x).

Lemma 7.4.5. For any P(x) ∈ ℂ[x], we have the following:
(1) $\overline{P(z)} = \overline{P}(\overline{z})$ if z ∈ ℂ.
(2) P(x) is a real polynomial if and only if $P(x) = \overline{P}(x)$.
(3) If P(x)Q(x) = H(x), then $\overline{H}(x) = \overline{P}(x)\,\overline{Q}(x)$.

Proof. (1) Suppose z ∈ ℂ and $P(z) = a_0 + \cdots + a_n z^n$. Then

$$\overline{P(z)} = \overline{a_0 + \cdots + a_n z^n} = \overline{a_0} + \overline{a_1}\,\overline{z} + \cdots + \overline{a_n}\,\overline{z}^n = \overline{P}(\overline{z}).$$

(2) Suppose P(x) is real, then $a_i = \overline{a_i}$ for all its coefficients; hence, $P(x) = \overline{P}(x)$. Conversely, suppose $P(x) = \overline{P}(x)$. Then $a_i = \overline{a_i}$ for all its coefficients; hence, $a_i \in ℝ$ for each $a_i$; therefore, P(x) is a real polynomial.
(3) The proof is a computation and left to the exercises.

Lemma 7.4.6. Suppose G(x) ∈ ℂ[x]. Then $H(x) = G(x)\overline{G}(x) \in ℝ[x]$.

Proof. By Lemma 7.4.5(3), $\overline{H}(x) = \overline{G(x)\overline{G}(x)} = \overline{G}(x)\,\overline{\overline{G}}(x) = \overline{G}(x)G(x) = G(x)\overline{G}(x) = H(x)$. Therefore, by Lemma 7.4.5(2), H(x) is a real polynomial.
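Lemma 7.4.6 is easy to observe on an example. This is a minimal sketch, assuming polynomials are stored as coefficient lists with the constant term first and using a hypothetical sample G with Gaussian integer coefficients so that the arithmetic is exact; the helper names are illustrative.

```python
def poly_mul(f, g):
    """Multiply two nonempty coefficient lists [a_0, a_1, ...]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def conjugate(f):
    """The conjugate polynomial: conjugate every coefficient."""
    return [complex(a).conjugate() for a in f]

# Hypothetical sample: G(x) = (2 - i) + 3i x + (1 + i) x^2.
G = [2 - 1j, 3j, 1 + 1j]
H = poly_mul(G, conjugate(G))

# Every coefficient of H = G * conj(G) has imaginary part zero.
assert all(h.imag == 0 for h in H)
```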

Lemma 7.4.7. If every nonconstant real polynomial has a complex zero, then every non-
constant complex polynomial has a complex zero.

Proof. Let P(x) ∈ ℂ[x] be nonconstant, and suppose that every nonconstant real polynomial has at least one complex zero. Let $H(x) = P(x)\overline{P}(x)$. From Lemma 7.4.6, H(x) ∈ ℝ[x], and H(x) is nonconstant. By supposition there exists a $z_0 \in ℂ$ with $H(z_0) = 0$. Then $P(z_0)\overline{P}(z_0) = 0$, and since ℂ is a field it has no zero divisors. Hence, either $P(z_0) = 0$, or $\overline{P}(z_0) = 0$. In the first case, $z_0$ is a zero of P(x). In the second case, $\overline{P}(z_0) = 0$. Then from Lemma 7.4.5, $0 = \overline{\overline{P}(z_0)} = \overline{\overline{P}}(\overline{z_0}) = P(\overline{z_0})$. Therefore, $\overline{z_0}$ is a zero of P(x).

Now we come to the crucial lemma.

Lemma 7.4.8. Any nonconstant real polynomial has a complex zero.

Proof. Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in ℝ[x]$ with n ≥ 1, $a_n \neq 0$. The proof is an induction on the degree n of f(x).
Suppose $n = 2^m q$, where q is odd. We do the induction on m. If m = 0, then f(x) has odd degree, and the lemma is true from Lemma 7.4.2. Assume then that the lemma is true for all degrees $d = 2^k q'$, where k < m and q′ is odd. Now assume that the degree of f(x) is $n = 2^m q$.
Suppose K′ is the splitting field for f(x) over ℝ, in which the zeros are $\alpha_1, \dots, \alpha_n$. We show that at least one of these zeros must be in ℂ. (In fact, all are in ℂ, but to prove the lemma, we need only show at least one.)
Let h ∈ ℤ, and form the polynomial

$$H(x) = \prod_{i<j} \big(x - (\alpha_i + \alpha_j + h\alpha_i\alpha_j)\big).$$

This is in K′[x]. In forming H(x), we chose pairs of zeros $\{\alpha_i, \alpha_j\}$, so the number of such pairs is the number of ways of choosing two elements out of $n = 2^m q$ elements. This is given by

$$\frac{(2^m q)(2^m q - 1)}{2} = 2^{m-1} q (2^m q - 1) = 2^{m-1} q'$$

with q′ odd (recall that m ≥ 1 here). Therefore, the degree of H(x) is $2^{m-1} q'$.
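The degree count that drives the induction can be confirmed for many even n at once. This is a minimal sketch; the helper `two_adic_split` is an illustrative name, and it assumes a positive integer argument.

```python
def two_adic_split(n):
    """Write a positive integer n as 2**m * q with q odd; return (m, q)."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return m, n

# For even n = 2^m q (so m >= 1), the number of pairs i < j is 2^(m-1) q'
# with q' = q (n - 1) odd, so the exponent of 2 drops by exactly one.
for n in range(2, 200, 2):
    m, q = two_adic_split(n)
    pairs = n * (n - 1) // 2
    m_pairs, q_pairs = two_adic_split(pairs)
    assert m_pairs == m - 1
    assert q_pairs == q * (n - 1)
```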
H(x) is a symmetric polynomial in the zeros $\alpha_1, \dots, \alpha_n$. Since $\alpha_1, \dots, \alpha_n$ are the zeros of a real polynomial, from Lemma 7.3.14, any polynomial in the splitting field symmetric in these zeros must be a real polynomial.
Therefore, H(x) ∈ ℝ[x] with degree $2^{m-1} q'$. By the inductive hypothesis, then, H(x) must have a complex zero. This implies that there exists a pair $\{\alpha_i, \alpha_j\}$ with

αi + αj + hαi αj ∈ ℂ.

Since h was an arbitrary integer, for any integer h1 , there must exist such a pair
{αi , αj } with

αi + αj + h1 αi αj ∈ ℂ.

Now let h1 vary over the integers. Since there are only finitely many such pairs
{αi , αj }, it follows that there must be at least two different integers h1 , h2 such that

z1 = αi + αj + h1 αi αj ∈ ℂ, and z2 = αi + αj + h2 αi αj ∈ ℂ.


Then $z_1 - z_2 = (h_1 - h_2)\alpha_i\alpha_j \in ℂ$, and since $h_1, h_2 \in ℤ \subset ℂ$ with $h_1 \neq h_2$, it follows that $\alpha_i\alpha_j \in ℂ$. But then $h_1\alpha_i\alpha_j \in ℂ$, from which it follows that $\alpha_i + \alpha_j \in ℂ$. Then,

$$p(x) = (x - \alpha_i)(x - \alpha_j) = x^2 - (\alpha_i + \alpha_j)x + \alpha_i\alpha_j \in ℂ[x].$$

However, p(x) is then a degree-two complex polynomial, and so from Lemma 7.4.3, its zeros are complex. Therefore, $\alpha_i, \alpha_j \in ℂ$; thus, f(x) has a complex zero.

It is now easy to give a proof of the fundamental theorem of algebra. From Lem-
ma 7.4.8, every nonconstant real polynomial has a complex zero. From Lemma 7.4.7, if
every nonconstant real polynomial has a complex zero, then every nonconstant com-
plex polynomial has a complex zero, proving the fundamental theorem.

Theorem 7.4.9. If E is a finite dimensional field extension of ℂ, then E = ℂ.

Proof. Let a ∈ E. Consider the elements $1, a, a^2, \dots$. Since E is finite dimensional over ℂ, these elements are linearly dependent over ℂ, and we get a nonconstant polynomial over ℂ with zero a. By the fundamental theorem of algebra, this polynomial splits into linear factors over ℂ, so a ∈ ℂ.

Corollary 7.4.10. If E is a finite dimensional field extension of ℝ, then E = ℝ, or E = ℂ.

7.5 The fundamental theorem of symmetric polynomials


In the proof of the fundamental theorem of algebra that was given in the previous
section, we used the fact that any symmetric polynomial in n indeterminates is a poly-
nomial in the elementary symmetric polynomials in these indeterminates. In this sec-
tion, we give a proof of this theorem.
Let R be an integral domain with $x_1, \dots, x_n$ (independent) indeterminates over R, and let $R[x_1, \dots, x_n]$ be the polynomial ring in these indeterminates. Any polynomial $f(x_1, \dots, x_n) \in R[x_1, \dots, x_n]$ is composed of a sum of pieces of the form $a x_1^{i_1} \cdots x_n^{i_n}$ with a ∈ R. We first put an order on these pieces of a polynomial.
The piece $a x_1^{i_1} \cdots x_n^{i_n}$ with a ≠ 0 is called higher than the piece $b x_1^{j_1} \cdots x_n^{j_n}$ with b ≠ 0, if the first one of the differences

$$i_1 - j_1,\ i_2 - j_2,\ \dots,\ i_n - j_n$$

that differs from zero is in fact positive. The highest piece of a polynomial $f(x_1, \dots, x_n)$ is denoted by HG(f).

Lemma 7.5.1. For f (x1 , . . . , xn ), g(x1 , . . . , xn ) ∈ R[x1 , . . . , xn ], we have

HG(fg) = HG(f ) HG(g).


Proof. We use an induction on n, the number of indeterminates. It is clearly true for n = 1, and now assume that the statement holds for all polynomials in k indeterminates with k < n and n ≥ 2. Order the polynomials via exponents on the first indeterminate $x_1$ so that

$$f(x_1, \dots, x_n) = x_1^r \phi_r(x_2, \dots, x_n) + x_1^{r-1} \phi_{r-1}(x_2, \dots, x_n) + \cdots + \phi_0(x_2, \dots, x_n),$$

$$g(x_1, \dots, x_n) = x_1^s \psi_s(x_2, \dots, x_n) + x_1^{s-1} \psi_{s-1}(x_2, \dots, x_n) + \cdots + \psi_0(x_2, \dots, x_n),$$

with $\phi_r \neq 0$ and $\psi_s \neq 0$. Then $HG(fg) = x_1^{r+s}\, HG(\phi_r \psi_s)$. By the inductive hypothesis

$$HG(\phi_r \psi_s) = HG(\phi_r)\, HG(\psi_s).$$

Hence,

$$HG(fg) = x_1^{r+s}\, HG(\phi_r)\, HG(\psi_s) = \big(x_1^r\, HG(\phi_r)\big)\big(x_1^s\, HG(\psi_s)\big) = HG(f)\, HG(g).$$

The elementary symmetric polynomials in n indeterminates $x_1, \dots, x_n$ are as follows:

$$\begin{aligned}
s_1 &= x_1 + x_2 + \cdots + x_n \\
s_2 &= x_1 x_2 + x_1 x_3 + \cdots + x_{n-1} x_n \\
s_3 &= x_1 x_2 x_3 + x_1 x_2 x_4 + \cdots + x_{n-2} x_{n-1} x_n \\
&\;\;\vdots \\
s_n &= x_1 \cdots x_n.
\end{aligned}$$

These were found by forming the polynomial $p(x, x_1, \dots, x_n) = (x - x_1) \cdots (x - x_n)$. The i-th elementary symmetric polynomial $s_i$ in $x_1, \dots, x_n$ is then $(-1)^i a_i$, where $a_i$ is the coefficient of $x^{n-i}$ in $p(x, x_1, \dots, x_n)$.
In general,

$$s_k = \sum_{i_1 < i_2 < \cdots < i_k} x_{i_1} x_{i_2} \cdots x_{i_k}, \qquad 1 \le k \le n,$$

where the sum is taken over all the $\binom{n}{k}$ different systems of indices $i_1, \dots, i_k$ with $i_1 < i_2 < \cdots < i_k$.
Furthermore, a polynomial $s(x_1, \dots, x_n)$ is a symmetric polynomial if $s(x_1, \dots, x_n)$ is unchanged by any permutation σ of $\{x_1, \dots, x_n\}$; that is, $s(x_1, \dots, x_n) = s(\sigma(x_1), \dots, \sigma(x_n))$.


Lemma 7.5.2. In the highest piece $a x_1^{k_1} \cdots x_n^{k_n}$, a ≠ 0, of a symmetric polynomial $s(x_1, \dots, x_n)$, we have $k_1 \ge k_2 \ge \cdots \ge k_n$.

Proof. Assume that $k_i < k_j$ for some i < j. As a symmetric polynomial, $s(x_1, \dots, x_n)$ also must then contain the piece $a x_1^{k_1} \cdots x_i^{k_j} \cdots x_j^{k_i} \cdots x_n^{k_n}$, which is higher than $a x_1^{k_1} \cdots x_i^{k_i} \cdots x_j^{k_j} \cdots x_n^{k_n}$, giving a contradiction.

Lemma 7.5.3. The product $s_1^{k_1 - k_2} s_2^{k_2 - k_3} \cdots s_{n-1}^{k_{n-1} - k_n} s_n^{k_n}$ with $k_1 \ge k_2 \ge \cdots \ge k_n$ has the highest piece $x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$.

Proof. From the definition of the elementary symmetric polynomials, we have that

$$HG(s_k^t) = (x_1 x_2 \cdots x_k)^t, \qquad 1 \le k \le n,\ t \ge 1.$$

From Lemma 7.5.1,

$$\begin{aligned}
HG\big(s_1^{k_1 - k_2} s_2^{k_2 - k_3} \cdots s_{n-1}^{k_{n-1} - k_n} s_n^{k_n}\big)
&= x_1^{k_1 - k_2} (x_1 x_2)^{k_2 - k_3} \cdots (x_1 \cdots x_{n-1})^{k_{n-1} - k_n} (x_1 \cdots x_n)^{k_n} \\
&= x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}.
\end{aligned}$$

Theorem 7.5.4. Let $s(x_1, \dots, x_n) \in R[x_1, \dots, x_n]$ be a symmetric polynomial. Then $s(x_1, \dots, x_n)$ can be uniquely expressed as a polynomial $f(s_1, \dots, s_n)$ in the elementary symmetric polynomials $s_1, \dots, s_n$ with coefficients from R.

Proof. We prove the existence of the polynomial f by induction on the size of the highest pieces. If in the highest piece of a symmetric polynomial all exponents are zero, then it is constant, that is, an element of R. Therefore, there is nothing to prove.
Now we assume that each symmetric polynomial with highest piece smaller than that of $s(x_1, \dots, x_n)$ can be written as a polynomial in the elementary symmetric polynomials. Let $a x_1^{k_1} \cdots x_n^{k_n}$, a ≠ 0, be the highest piece of $s(x_1, \dots, x_n)$. By Lemma 7.5.2, $k_1 \ge k_2 \ge \cdots \ge k_n$. Let

$$t(x_1, \dots, x_n) = s(x_1, \dots, x_n) - a\, s_1^{k_1 - k_2} \cdots s_{n-1}^{k_{n-1} - k_n} s_n^{k_n}.$$

Clearly, $t(x_1, \dots, x_n)$ is another symmetric polynomial, and from Lemma 7.5.3, the highest piece of $t(x_1, \dots, x_n)$ is smaller than that of $s(x_1, \dots, x_n)$. Therefore, by the inductive hypothesis, $t(x_1, \dots, x_n)$ can be written as a polynomial in $s_1, \dots, s_n$. Hence, $s(x_1, \dots, x_n) = t(x_1, \dots, x_n) + a\, s_1^{k_1 - k_2} \cdots s_{n-1}^{k_{n-1} - k_n} s_n^{k_n}$ can be written as a polynomial in $s_1, \dots, s_n$.
To prove the uniqueness of this expression, assume that $s(x_1, \dots, x_n) = f(s_1, \dots, s_n) = g(s_1, \dots, s_n)$. Then $f(s_1, \dots, s_n) - g(s_1, \dots, s_n) = h(s_1, \dots, s_n) = \phi(x_1, \dots, x_n)$ is the zero polynomial in $x_1, \dots, x_n$. Hence, if we write $h(s_1, \dots, s_n)$ as a sum of products of powers of the $s_1, \dots, s_n$, all coefficients disappear because two different products of powers in the $s_1, \dots, s_n$ have different highest pieces. This follows from the previous set of lemmas. Therefore, f and g are the same, proving the theorem.
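The existence part of the proof is effectively an algorithm: repeatedly subtract the product of elementary symmetric polynomials that matches the highest piece. The following is a minimal sketch of that procedure, assuming integer coefficients and a dictionary representation {exponent tuple: coefficient}; the function names are illustrative and not from the text.

```python
from itertools import combinations

# A polynomial in n indeterminates is a dict {exponent tuple: coefficient}.

def poly_mul(f, g):
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def poly_sub(f, g):
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}

def elementary(k, n):
    """The elementary symmetric polynomial s_k in n indeterminates."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def to_elementary(s, n):
    """Write a symmetric polynomial s as a polynomial in s_1, ..., s_n.

    The result maps (d_1, ..., d_n) to the coefficient of s_1^d_1 ... s_n^d_n.
    """
    result = {}
    s = {e: c for e, c in s.items() if c != 0}
    while s:
        k = max(s)                 # tuple order is the "higher piece" order
        a = s[k]
        if any(k[i] < k[i + 1] for i in range(n - 1)):
            raise ValueError("polynomial is not symmetric")
        # Exponents k_1 - k_2, ..., k_{n-1} - k_n, k_n as in Lemma 7.5.3.
        d = tuple(k[i] - k[i + 1] for i in range(n - 1)) + (k[n - 1],)
        term = {(0,) * n: a}       # builds a * s_1^d_1 * ... * s_n^d_n
        for i, power in enumerate(d):
            for _ in range(power):
                term = poly_mul(term, elementary(i + 1, n))
        s = poly_sub(s, term)      # the highest piece cancels
        result[d] = a
    return result

# x1^2 + x2^2 + x3^2 = s_1^2 - 2 s_2
power_sum = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
assert to_elementary(power_sum, 3) == {(2, 0, 0): 1, (0, 1, 0): -2}
```

The loop terminates because each step replaces the highest piece by strictly lower pieces of the same total degree, which mirrors the induction in the proof.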


7.6 Skew field extensions of ℂ and Frobenius’s theorem


Let V be an ℝ-vector space with $\dim_ℝ(V) = n < \infty$.
We have already seen, as a consequence of the fundamental theorem of algebra, that only for n = 1 and n = 2 may we provide V with a multiplication such that V becomes a field with respect to the addition in V and this multiplication. Up to isomorphisms, we get V = ℝ if n = 1 and V = ℂ if n = 2.
If we want a suitable multiplication for n ≥ 3, we have to give up some of the rules of a field. If all the axioms of a field hold except for the commutativity of multiplication, then we have a skew field or division ring. Hence, a division ring is a ring with identity, not necessarily commutative, in which every nonzero element has a multiplicative inverse.
Hamilton described for n = 4 a multiplication in V in such a way that V becomes a skew field. In his honor, we talk about the Hamiltonian skew field. This skew field is denoted by ℍ and is called the quaternions.
In this section, we want first to describe the skew field ℍ of Hamilton’s quaternions and then to prove that if n ≥ 3, only for n = 4 can we provide V with a multiplication such that V becomes a skew field.
We start with the construction and description of ℍ. Let {1, i, j, k} be a basis of V.
The addition will be the usual addition in the vector space. We also take scalar mul-
tiplication by ℝ. The basis element 1 shall be the unit element for the multiplication
(as already mentioned in the case of the complex numbers, this is not a restriction be-
cause any nonzero vector in V is a member of a basis). The basis element 1 then should
generate the embedding of ℝ.
For i, j, k, we define a multiplication by the following rules of Hamilton:

$$\begin{aligned}
& i^2 = j^2 = k^2 = -1, \\
& ij = k, \quad jk = i, \quad ki = j, \\
& ji = -k, \quad kj = -i, \quad ik = -j.
\end{aligned}$$

For

$$x = x_0 + x_1 i + x_2 j + x_3 k \quad \text{and} \quad y = y_0 + y_1 i + y_2 j + y_3 k,$$

we determine the addition and multiplication in V by the following basic algebraic manipulation:

$$x + y := (x_0 + y_0) + (x_1 + y_1)i + (x_2 + y_2)j + (x_3 + y_3)k,$$

$$\begin{aligned}
x \cdot y := {} & (x_0 y_0 - x_1 y_1 - x_2 y_2 - x_3 y_3) + (x_0 y_1 + x_1 y_0 + x_2 y_3 - x_3 y_2)i \\
& + (x_0 y_2 - x_1 y_3 + x_2 y_0 + x_3 y_1)j + (x_0 y_3 + x_1 y_2 - x_2 y_1 + x_3 y_0)k.
\end{aligned}$$
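The product formula can be transcribed directly and checked against Hamilton's rules. This is a minimal sketch, assuming a quaternion $x_0 + x_1 i + x_2 j + x_3 k$ is stored as the tuple (x0, x1, x2, x3); the names `qmul` and `neg` are illustrative.

```python
def qmul(x, y):
    """Quaternion product, transcribed from the formula for x * y."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 - x1 * y3 + x2 * y0 + x3 * y1,
        x0 * y3 + x1 * y2 - x2 * y1 + x3 * y0,
    )

def neg(x):
    return tuple(-t for t in x)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# Hamilton's rules.
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == neg(ONE)
assert qmul(I, J) == K and qmul(J, K) == I and qmul(K, I) == J
assert qmul(J, I) == neg(K) and qmul(K, J) == neg(I) and qmul(I, K) == neg(J)

# The product is associative but not commutative (spot check on samples).
x, y, z = (1, 2, 3, 4), (5, 6, 7, 8), (-2, 0, 1, 3)
assert qmul(qmul(x, y), z) == qmul(x, qmul(y, z))
assert qmul(x, y) != qmul(y, x)
```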

Together with this addition and multiplication, V becomes a noncommutative ring with unit element 1. For each quaternion

$$x = x_0 + x_1 i + x_2 j + x_3 k,$$

we define the conjugate quaternion by

$$\overline{x} := x_0 - x_1 i - x_2 j - x_3 k.$$

We have the rules

$$\overline{\overline{x}} = x, \quad \overline{x + y} = \overline{x} + \overline{y}, \quad \overline{\lambda x} = \lambda \overline{x},\ \lambda \in ℝ, \quad \text{and} \quad \overline{xy} = \overline{y} \cdot \overline{x}.$$

With help of the conjugation, we may now define the norm and the length of a quaternion

$$x = x_0 + x_1 i + x_2 j + x_3 k$$

by

$$n(x) = x\overline{x} = \overline{x}x = x_0^2 + x_1^2 + x_2^2 + x_3^2 \quad \text{and} \quad |x| = \sqrt{x_0^2 + x_1^2 + x_2^2 + x_3^2},$$

respectively, in analogy to the complex numbers. If x ≠ 0, then we get the multiplicative inverse $x^{-1}$ by $x^{-1} = \overline{x}/(x\overline{x})$, because

$$x x^{-1} = x\,\frac{\overline{x}}{x\overline{x}} = 1 = \frac{\overline{x}}{x\overline{x}}\,x.$$

Hence, together with the addition and multiplication, V becomes a skew field, in which ℝ can be embedded via r ↦ r ⋅ 1 for r ∈ ℝ.

Theorem 7.6.1. The set of quaternions ℍ is a skew field, which contains both the reals
and the complexes as subfields. It has dimension 4 as a vector space over ℝ. Further-
more, rx = xr for all x ∈ ℍ, and all r ∈ ℝ (considered as elements of ℍ).

In ℍ, there is an important multiplicative rule for the norm and the length:

$$n(xy) = n(x)n(y) \quad \text{and} \quad |xy| = |x||y| \quad \text{for } x, y \in ℍ.$$

This can be shown by an easy calculation.
This result on norms in the quaternions provides the general equation in ℝ on sums of four squares:

$$\begin{aligned}
(x_0^2 + x_1^2 + x_2^2 + x_3^2)(y_0^2 + y_1^2 + y_2^2 + y_3^2) = {} & (x_0 y_0 - x_1 y_1 - x_2 y_2 - x_3 y_3)^2 \\
& + (x_0 y_1 + x_1 y_0 + x_2 y_3 - x_3 y_2)^2 \\
& + (x_0 y_2 - x_1 y_3 + x_2 y_0 + x_3 y_1)^2 \\
& + (x_0 y_3 + x_1 y_2 - x_2 y_1 + x_3 y_0)^2.
\end{aligned}$$

This equation is one of the bases for the Theorem of Lagrange.
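The norm rule, and with it the four-square identity, can be spot-checked in exact arithmetic together with the conjugation and inverse formulas. This is a minimal sketch, assuming quaternions are stored as 4-tuples and using Python's `fractions.Fraction` for the inverse; the sample values and function names are illustrative.

```python
from fractions import Fraction

def qmul(x, y):
    """Quaternion product of 4-tuples (x0, x1, x2, x3) and (y0, y1, y2, y3)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 - x1 * y3 + x2 * y0 + x3 * y1,
        x0 * y3 + x1 * y2 - x2 * y1 + x3 * y0,
    )

def conj(x):
    return (x[0], -x[1], -x[2], -x[3])

def norm(x):
    return sum(t * t for t in x)

def inverse(x):
    """conj(x) / n(x), computed exactly with fractions."""
    n = norm(x)
    if n == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(Fraction(t, n) for t in conj(x))

x, y = (1, 2, 3, 4), (5, 6, 7, 8)      # sample integer quaternions

assert norm(qmul(x, y)) == norm(x) * norm(y)        # four-square identity
assert qmul(x, conj(x)) == (norm(x), 0, 0, 0)       # n(x) = x conj(x)
assert conj(qmul(x, y)) == qmul(conj(y), conj(x))   # order reverses
assert qmul(x, inverse(x)) == (1, 0, 0, 0) == qmul(inverse(x), x)
```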


Theorem 7.6.2 (Theorem of Lagrange). Each natural number n can be written as a sum

$$n = a^2 + b^2 + c^2 + d^2$$

of four squares with a, b, c, d ∈ ℤ.

Hint: Since the four-square equation shows that a product of two sums of four squares is again a sum of four squares, we have only to show that (see [43, Chapter 3.2]) if p is a prime number with p ≡ 3 mod 4, then $p = a^2 + b^2 + c^2 + d^2$ for some a, b, c, d ∈ ℤ.
A proof of this can be found for instance in the book [43].
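For small n, a representation as a sum of four squares can simply be searched for. This brute-force sketch is not the proof strategy of the hint; it only illustrates the statement, and the function name `four_squares` is illustrative.

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and a^2+b^2+c^2+d^2 == n."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d <= c and d * d == rest:
                    return a, b, c, d
    # Not reached for n >= 0 if the theorem of Lagrange holds.
    raise AssertionError(f"no representation found for {n}")

# Every n below 500 has such a representation.
for n in range(500):
    assert sum(t * t for t in four_squares(n)) == n
```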
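We remark that the skew field ℍ of the quaternions can be embedded into M(2, ℂ)
via

\[ 1 \mapsto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad i \mapsto \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad j \mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad k \mapsto \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]

Using this map, a quaternion $x = x_0 + x_1 i + x_2 j + x_3 k$ can be considered as a matrix

\[ \begin{pmatrix} x_0 + x_1 i & x_2 + x_3 i \\ -x_2 + x_3 i & x_0 - x_1 i \end{pmatrix} = \begin{pmatrix} w & z \\ -\bar{z} & \bar{w} \end{pmatrix} \]

with $w = x_0 + x_1 i \in \mathbb{C}$ and $z = x_2 + x_3 i \in \mathbb{C}$.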
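The defining relations of ℍ can be verified directly on these four matrices. The following sketch uses plain Python complex numbers; the helper names `matmul`, `embed`, and `det` are assumptions introduced for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )


ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = ((0, 1j), (1j, 0))
NEG_ONE = ((-1, 0), (0, -1))


def embed(x0, x1, x2, x3):
    """Matrix of x0 + x1*i + x2*j + x3*k, written as [[w, z], [-conj(z), conj(w)]]."""
    w, z = complex(x0, x1), complex(x2, x3)
    return ((w, z), (-z.conjugate(), w.conjugate()))


def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


print(matmul(I, I) == NEG_ONE, matmul(J, J) == NEG_ONE, matmul(K, K) == NEG_ONE)
print(matmul(I, J) == K)          # ij = k
print(det(embed(1, 2, 3, 4)))     # the determinant equals the norm n(x)
```

Under this embedding the determinant of the matrix is $w\bar{w} + z\bar{z} = n(x)$, which gives another way to see that the norm is multiplicative.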
We have shown that the quaternions form a skew field of degree 4 over the real
numbers. We ask whether there can be other finite degree skew field extensions of ℝ.
Let V be an ℝ-vector space with $\dim_{\mathbb{R}}(V) = n < \infty$. For which n can we equip V with a
multiplication such that V, with the vector addition and this multiplication, becomes a
field or a skew field?
We remark that some nonzero vector in V has to be the unit element 1; therefore,
we automatically have an embedding ℝ → V.
Let n ≥ 2. Since the irreducible polynomials from ℝ[x] have degree 1 or 2, then
under the existence of such a multiplication, each element α ∈ V, which is not in ℝ
(considered as a subset of V), must be a zero of a quadratic polynomial from ℝ[x].
We now assume that we have in V a multiplication such that V, together with the
addition in V and this multiplication, is a field or a skew field.
If n = 2, we get the field ℂ of the complex numbers.
Now, let n = 3.
Arguing as in the construction of ℂ, we may construct in
two steps a basis {1, i, j} of V such that 1 is the unit element of V, and $i^2 = j^2 = -1$.
Recall that a two-dimensional subspace of V has to be isomorphic to ℂ as a subfield
of V.
Let k = ij. Since $\dim_{\mathbb{R}}(V) = 3$, we must have $k = a_1 + b_1 i + c_1 j$ with $a_1, b_1, c_1 \in \mathbb{R}$.
Multiplication from the left with i results in

\[ -j = a_1 i - b_1 + c_1 k = a_1 i - b_1 + c_1 (a_1 + b_1 i + c_1 j), \]


and since 1, i, j are linearly independent, comparing the coefficients of j gives $c_1^2 = -1$, which is impossible
in ℝ. Therefore, the case n = 3 is not possible.
If n = 4, we may construct in V three linearly independent elements 1, i, j such that
1 is the unit element of V, and $i^2 = j^2 = -1$. Certainly ij is linearly independent from 1, i
and j, because otherwise, we get a contradiction as in the case n = 3. Also ji is linearly
independent from 1, i and j. Now i + j and i − j are both zeros of quadratic polynomials
over ℝ; that is, there exist $r_1, s_1, r_2, s_2 \in \mathbb{R}$ with

\[ (i + j)^2 + r_1 (i + j) + s_1 = 0 \quad \text{and} \quad (i - j)^2 + r_2 (i - j) + s_2 = 0. \]

If we add these equations, we see that $r_1 = r_2 = 0$; therefore, we get from the first
equation that ij + ji = c ∈ ℝ. Here, we used that 1, i and j are linearly independent.
Now, we may replace j by $j + \frac{c}{2} i$, which gives

\[ i\left(j + \tfrac{c}{2} i\right) + \left(j + \tfrac{c}{2} i\right) i = 0. \]

Since the subspace of V generated by 1 and $j + \frac{c}{2} i$ must, as a field, be isomorphic to ℂ,
we may normalize $j + \frac{c}{2} i$ to $j_1$ with $j_1^2 = -1$.
We now define $k = i j_1$. Then automatically

\[ k = i j_1 = -j_1 i \quad \text{and} \quad k^2 = -1. \]

So altogether, we may construct a basis {1, i, j, k} of V such that 1 is the unit element
of V, and $i^2 = j^2 = k^2 = -1$, k = ij = −ji. Thereby, V is isomorphic to the skew field ℍ of
the quaternions.
Finally, let n ≥ 5.
Analogously to the case n = 4 and the general observation on the subfield
isomorphic to ℂ, we may construct a basis {1, i, j, k, l, . . .} such that

\[ i^2 = j^2 = k^2 = -1, \quad k = ij = -ji \quad \text{and} \quad l^2 = -1. \]

As in the case n = 4, we have that i + l and i − l are both zeros of quadratic
polynomials over ℝ.
Therefore, as in the case n = 4,

\[ il + li = a_2 \in \mathbb{R}. \]

In the same manner, we get

\[ jl + lj = b_2 \in \mathbb{R} \quad \text{and} \quad kl + lk = c_2 \in \mathbb{R}. \]


We calculate

\[
\begin{aligned}
lk = l(ij) &= a_2 j - ilj = a_2 j - i(b_2 - jl) \\
&= a_2 j - b_2 i + ijl = a_2 j - b_2 i + kl \\
&= a_2 j - b_2 i + c_2 - lk.
\end{aligned}
\]

From this, we get

\[ 2lk = a_2 j - b_2 i + c_2. \]

Multiplication with k from the right gives

\[ -2l = a_2 i + b_2 j + c_2 k, \]

because jk = i, and ik = −j.
This means that l is linearly dependent on {1, i, j, k}, which is not the case. This
contradiction shows that n ≥ 5 is not possible.
Altogether, we have proven the following theorem:

Theorem 7.6.3 (Theorem of Frobenius). Let V be an ℝ-vector space with $\dim_{\mathbb{R}}(V) = n < \infty$.
Let V be provided in addition with a multiplication, such that V together with the
vector addition and the multiplication is a field or a skew field.
Then n = 1, 2 or 4.
If n = 1, then V is isomorphic to ℝ.
If n = 2, then V is isomorphic to ℂ.
If n = 4, then V is isomorphic to ℍ.

7.7 Exercises
1. Let f , g ∈ K[x] be irreducible polynomials of degree 2 over the field K. Let α1 , α2
(respectively, β1 , β2 ) be zeros of f and g. For 1 ≤ i, j ≤ 2, let νij = αi + βj . Show the
following:
(a) |K(νij ) : K| ∈ {1, 2, 3, 4}.
(b) For fixed f , g, there are at most two different degrees in (a).
(c) Decide which sets of combinations of degrees in (b) (with f , g variable) are
possible, and give an example in each case.
2. Let L|K be a field extension; let ν ∈ L and f (x) ∈ L[x], a polynomial of degree ≥ 1.
Let all coefficients of f (x) be algebraic over K. If f (ν) = 0, then ν is algebraic over K.
3. Let L|K be a field extension, and let M be an intermediate field. The extension M|K
is algebraic. For ν ∈ L, the following are equivalent:
(a) ν is algebraic over M.
(b) ν is algebraic over K.


4. Let L|K be a field extension and ν1 , ν2 ∈ L. Then the following are equivalent:
(a) ν1 and ν2 are algebraic over K.
(b) ν1 + ν2 and ν1 ν2 are algebraic over K.
5. Let L|K be a simple field extension. Then there is an extension field L′ of L of the
form $L' = K(\nu_1, \nu_2)$ with the following:
(a) $\nu_1$ and $\nu_2$ are transcendental over K.
(b) The set of all elements of L′ that are algebraic over K is L.
6. In the proof of Theorem 7.1.4, show that the mapping

τ : K(a) → K(α),

defined by τ(k) = k if k ∈ K and τ(a) = α, and then extended by linearity, is a
K-isomorphism.
7. Prove Lemma 7.2.8.
8. If $T, T_1$ are sets with the same cardinality, then there exists a bijection $\sigma : T \to T_1$.
Define a map $F : S_T \to S_{T_1}$ in the following manner: if $f \in S_T$, let F(f) be the
permutation on $T_1$ given by $F(f)(t_1) = \sigma(f(\sigma^{-1}(t_1)))$. Prove that F is an isomorphism.
9. Prove that if P(x), Q(x), H(x) ∈ ℂ[x] and P(x)Q(x) = H(x), then
$\overline{H}(x) = (\overline{P}(x))(\overline{Q}(x))$.
10. Show the multiplicative rule for the norm and the length for the quaternions:

n(xy) = n(x)n(y) and |xy| = |x||y| for x, y ∈ ℍ.

11. Determine all irreducible polynomials over ℝ. Factorize f (x) ∈ ℝ[x] in irreducible
polynomials.

8 Splitting fields and normal extensions
8.1 Splitting fields

In the last chapter, we introduced splitting fields and used this idea to present a proof
of the fundamental theorem of algebra. The concept of a splitting field is essential to
the Galois theory of equations. Therefore, in this chapter, we look more deeply at this
idea.

Definition 8.1.1. Let K be a field and f (x) a nonconstant polynomial in K[x]. An exten-
sion field L of K is a splitting field for f (x) over K if the following hold:
(a) f (x) splits into linear factors in L[x].
(b) If K ⊂ M ⊂ L and M ≠ L, then f (x) does not split into linear factors in M[x].

From part (b) in the definition, the following is clear:

Lemma 8.1.2. L is a splitting field for f (x) ∈ K[x] if and only if f (x) splits into linear
factors in L[x], and if f (x) = b(x − a1 ) ⋅ ⋅ ⋅ (x − an ) with b ∈ K, then L = K(a1 , . . . , an ).

Example 8.1.3. The field ℂ of complex numbers is a splitting field for the polynomial
$p(x) = x^2 + 1$ in ℝ[x]. In fact, since ℂ is algebraically closed, it is a splitting field for
any real polynomial f (x) ∈ ℝ[x] that has at least one nonreal zero.
The field ℚ(i), obtained by adjoining i to ℚ, is a splitting field for $x^2 + 1$ over ℚ.

The next result was used in the previous chapter. We restate and reprove it here.

Theorem 8.1.4. Let K be a field. Then each nonconstant polynomial in K[x] has a split-
ting field.

Proof. Let f (x) ∈ K[x] be nonconstant, and let $\overline{K}$ be an algebraic closure of K. Then f (x) splits in $\overline{K}[x]$; that is,
$f(x) = b(x - a_1) \cdots (x - a_n)$ with b ∈ K and $a_i \in \overline{K}$. Let $L = K(a_1, \ldots, a_n)$. Then L is a splitting field
for f (x) over K.

We next show that the splitting field over K of a given polynomial is unique up to
K-isomorphism.

Theorem 8.1.5. Let K, K′ be fields and ϕ : K → K′ an isomorphism. Let f (x) be a nonconstant
polynomial in K[x] and f′(x) = ϕ(f (x)) its image in K′[x]. Suppose that L is a
splitting field for f (x) over K, and L′ is a splitting field for f′(x) over K′.
(a) Suppose that L′ ⊂ L″. Then, if ψ : L → L″ is a monomorphism with $\psi|_K = \phi$, then
ψ is an isomorphism from L onto L′. Moreover, ψ maps the set of zeros of f (x) in L
onto the set of zeros of f′(x) in L′. The map ψ is uniquely determined by its values
on the zeros of f (x).

https://doi.org/10.1515/9783110603996-008


(b) If g(x) is an irreducible factor of f (x) in K[x], a is a zero of g(x) in L, and a′ is a zero
of g′(x) = ϕ(g(x)) in L′, then there is an isomorphism ψ from L to L′ with $\psi|_K = \phi$
and ψ(a) = a′.

Before giving the proof of this theorem, we note that the following important result
is a direct consequence of it:

Theorem 8.1.6. A splitting field for f (x) ∈ K[x] is unique up to K-isomorphism.

Proof of Theorem 8.1.5. Suppose that $f(x) = b(x - a_1) \cdots (x - a_n) \in L[x]$ and that
$f'(x) = b'(x - a'_1) \cdots (x - a'_n) \in L'[x]$. Then

\[ f'(x) = \phi(f(x)) = \psi(f(x)) = \psi(b)(x - \psi(a_1)) \cdots (x - \psi(a_n)). \]

We have proved that polynomials have unique factorization over fields. Since L′ ⊂ L″,
it follows that $(\psi(a_1), \ldots, \psi(a_n))$ is a permutation of the zeros
$(a'_1, \ldots, a'_n)$. In particular, this implies that $\psi(a_i) \in L'$; thus,

\[ \operatorname{im}(\psi) = L' = K'(a'_1, \ldots, a'_n). \]

Since the image of ψ is $K'(a'_1, \ldots, a'_n) = K'(\psi(a_1), \ldots, \psi(a_n))$, it is clear that ψ is uniquely
determined by the images $\psi(a_i)$. This proves part (a).
For part (b), embed L′ in an algebraic closure L″. Hence, there is a monomorphism

\[ \phi' : K(a) \to L'' \]

with $\phi'|_K = \phi$ and $\phi'(a) = a'$. Hence, there is a monomorphism ψ : L → L″ with
$\psi|_{K(a)} = \phi'$. Then from part (a), it follows that ψ : L → L′ is an isomorphism.

Example 8.1.7. Let $f(x) = x^3 - 7 \in \mathbb{Q}[x]$. This has no zeros in ℚ, and since it is of
degree 3, it follows that it must be irreducible in ℚ[x].
Let $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2} i \in \mathbb{C}$. Then it is easy to show by computation that $\omega^2 = -\frac{1}{2} - \frac{\sqrt{3}}{2} i$,
and $\omega^3 = 1$. Therefore, the three zeros of f (x) in ℂ are as follows:

\[ a_1 = 7^{1/3}, \qquad a_2 = \omega \cdot 7^{1/3}, \qquad a_3 = \omega^2 \cdot 7^{1/3}. \]

Hence, $L = \mathbb{Q}(a_1, a_2, a_3)$ is the splitting field of f (x). Since the minimal polynomial
of all three zeros over ℚ is the same f (x), it follows that

\[ \mathbb{Q}(a_1) \cong \mathbb{Q}(a_2) \cong \mathbb{Q}(a_3). \]

Since $\mathbb{Q}(a_1) \subset \mathbb{R}$ and $a_2, a_3$ are nonreal, it is clear that $a_2, a_3 \notin \mathbb{Q}(a_1)$.


Suppose that $\mathbb{Q}(a_2) = \mathbb{Q}(a_3)$. Then $\omega = a_3 a_2^{-1} \in \mathbb{Q}(a_2)$, and so $7^{1/3} = \omega^{-1} a_2 \in \mathbb{Q}(a_2)$.
Hence, $\mathbb{Q}(a_1) \subset \mathbb{Q}(a_2)$; therefore, $\mathbb{Q}(a_1) = \mathbb{Q}(a_2)$ since they have the same degree
over ℚ. This contradiction shows that $\mathbb{Q}(a_2)$ and $\mathbb{Q}(a_3)$ are distinct.
By computation, we have $a_3 = a_1^{-1} a_2^2$; hence,

\[ L = \mathbb{Q}(a_1, a_2, a_3) = \mathbb{Q}(a_1, a_2) = \mathbb{Q}(7^{1/3}, \omega). \]

Now the degree of L over ℚ is

\[ |L : \mathbb{Q}| = |\mathbb{Q}(7^{1/3}, \omega) : \mathbb{Q}(\omega)| \, |\mathbb{Q}(\omega) : \mathbb{Q}|. \]

Now $|\mathbb{Q}(\omega) : \mathbb{Q}| = 2$ since the minimal polynomial of ω over ℚ is $x^2 + x + 1$. Since no zero
of f (x) lies in ℚ(ω), and the degree of f (x) is 3, it follows that f (x) is irreducible over
ℚ(ω). Therefore, we have that the degree of L over ℚ(ω) is 3. Hence, |L : ℚ| = (2)(3) = 6.
We now have the following lattice diagram of fields and subfields:

We do not know however if there are any more intermediate fields. There could,
for example, be infinitely many. However, as we will see when we do the Galois theory,
there are no others.
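The computational claims in Example 8.1.7 can be checked numerically. The sketch below works with floating-point complex numbers, so every comparison is only up to a tolerance; the helper `close` and the tolerance value are assumptions made for illustration.

```python
omega = complex(-0.5, 3 ** 0.5 / 2)    # omega = -1/2 + (sqrt(3)/2) i
a1 = 7 ** (1 / 3)                       # the real cube root of 7
a2 = omega * a1
a3 = omega ** 2 * a1


def f(x):
    return x ** 3 - 7


def close(u, v, tol=1e-9):
    """Approximate equality for floating-point complex numbers."""
    return abs(u - v) < tol


print(all(close(f(a), 0) for a in (a1, a2, a3)))   # all three are zeros of x^3 - 7
print(close(omega ** 2 + omega + 1, 0))            # omega is a zero of x^2 + x + 1
print(close(a3, a2 ** 2 / a1))                     # a3 = a1^{-1} a2^2
```

Numerical agreement does not decide field membership or irreducibility; it only confirms the identities among the zeros that the example uses.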

8.2 Normal extensions


We now consider algebraic field extensions L of K, which have the property that if
f (x) ∈ K[x] has a zero in L, then f (x) must split in L. In particular, we show that if L is
a splitting field of finite degree for some g(x) ∈ K[x], then L has this property.

Definition 8.2.1. A field extension L of a field K is a normal extension if the following
hold:
(a) L|K is algebraic.
(b) Each irreducible polynomial f (x) ∈ K[x] that has a zero in L splits into linear fac-
tors in L[x].

Note that in Example 8.1.7, the extension fields $\mathbb{Q}(a_i)|\mathbb{Q}$ are not normal extensions.
Although f (x) has a zero in $\mathbb{Q}(a_i)$, the polynomial f (x) does not split into linear factors
in $\mathbb{Q}(a_i)[x]$.


We now show that L|K is a finite normal extension if and only if L is the splitting
field for some f (x) ∈ K[x].

Theorem 8.2.2. Let L|K be a finite extension. Then the following are equivalent:
(a) L|K is a normal extension.
(b) L|K is a splitting field for some f (x) ∈ K[x].
(c) If L ⊂ L′ and ψ : L → L′ is a monomorphism with $\psi|_K$ the identity map on K, then ψ
is an automorphism of L; that is, ψ(L) = L.

Proof. Suppose that L|K is a finite normal extension. Since L|K is a finite extension, L is
algebraic over K, and since it is of finite degree, we have $L = K(a_1, \ldots, a_n)$ with $a_i$ algebraic
over K.
Let $f_i(x) \in K[x]$ be the minimal polynomial of $a_i$. Since L|K is a normal extension,
$f_i(x)$ splits in L[x]. This is true for each i = 1, . . . , n. Let $f(x) = f_1(x) f_2(x) \cdots f_n(x)$. Then f (x)
splits into linear factors in L[x]. Since $L = K(a_1, \ldots, a_n)$, the polynomial f (x) cannot
have all its zeros in any proper intermediate extension between K and L. Therefore, L is the
splitting field for f (x). Hence, (a) implies (b).
Now suppose that L ⊂ L′ and ψ : L → L′ is a monomorphism with $\psi|_K$ the identity
map on K. Then the extension field ψ(L) of K is also a splitting field for f (x) since $\psi|_K$
is the identity on K. Hence, ψ maps the zeros of f (x) in L ⊂ L′ onto the zeros of f (x) in
ψ(L) ⊂ L′, and thus it follows that ψ(L) = L. Hence, (b) implies (c).
Finally, suppose (c). Hence, we assume that if L ⊂ L′ and ψ : L → L′ is a monomorphism
with $\psi|_K$ the identity map on K, then ψ is an automorphism of L; that is,
ψ(L) = L.
As before, L|K is algebraic since L|K is finite. Suppose that f (x) ∈ K[x] is irreducible
and that a ∈ L is a zero of f (x). There are algebraic elements $a_1, \ldots, a_n \in L$
with $L = K(a_1, \ldots, a_n)$ since L|K is finite. For i = 1, . . . , n, let $f_i(x) \in K[x]$ be the minimal
polynomial of $a_i$, and let $g(x) = f(x) f_1(x) \cdots f_n(x)$. Let L′ be the splitting field of g(x).
Clearly, L ⊂ L′. Let b ∈ L′ be a zero of f (x). From Theorem 8.1.5, there is an automorphism
ψ of L′ with ψ(a) = b and $\psi|_K$ the identity on K. Hence, by our assumption, $\psi|_L$
is an automorphism of L. It follows that b ∈ L; hence, f (x) splits in L[x]. Therefore, (c)
implies (a), completing the proof.

To give simple examples of normal extensions, we have the following:

Lemma 8.2.3. If L is an extension of K with |L : K| = 2, then L is a normal extension
of K.

Proof. Suppose that |L : K| = 2. Then L|K is algebraic since it is finite. Let f (x) ∈
K[x] be irreducible with leading coefficient 1, and which has a zero in L. Let a be
one zero. Then f (x) must be the minimal polynomial $m_a(x)$ of a. However, $\deg(m_a(x)) \le
|L : K| = 2$; hence, f (x) is of degree 1 or 2. Since f (x) has a zero a in L, dividing by x − a
leaves a factor of degree at most 1 in L[x]. It follows that
f (x) must split into linear factors in L[x]; therefore, L is a normal extension.


Later, we will tie this result to group theory when we prove that a subgroup of
index 2 must be a normal subgroup.

Example 8.2.4. As a first example of the lemma, consider the polynomial $f(x) = x^2 - 2$.
In ℝ, this splits as (x − √2)(x + √2); hence, the field ℚ(√2) is the splitting field of
$f(x) = x^2 - 2$ over ℚ. Therefore, ℚ(√2) is a normal extension of ℚ.

Example 8.2.5. As a second example, consider the polynomial $x^4 - 2$ in ℚ[x]. The zeros
in ℂ are

\[ 2^{1/4}, \quad 2^{1/4} i, \quad 2^{1/4} i^2, \quad 2^{1/4} i^3. \]

Hence,

\[ L = \mathbb{Q}(2^{1/4}, 2^{1/4} i, 2^{1/4} i^2, 2^{1/4} i^3) \]

is the splitting field of $x^4 - 2$ over ℚ.
Now

\[ L = \mathbb{Q}(2^{1/4}, 2^{1/4} i, 2^{1/4} i^2, 2^{1/4} i^3) = \mathbb{Q}(2^{1/4}, i). \]

Therefore, we have

\[ |L : \mathbb{Q}| = |L : \mathbb{Q}(2^{1/4})| \, |\mathbb{Q}(2^{1/4}) : \mathbb{Q}|. \]

Since $x^4 - 2$ is irreducible over ℚ, we have $|\mathbb{Q}(2^{1/4}) : \mathbb{Q}| = 4$. Since i has degree 2 over
any real field, we have $|L : \mathbb{Q}(2^{1/4})| = 2$. Therefore, L is a normal extension of $\mathbb{Q}(2^{1/4})$,
and $x^2 - \sqrt{2} \in \mathbb{Q}(\sqrt{2})[x]$ has the splitting field $\mathbb{Q}(2^{1/4})$.
Altogether, we have that $L|\mathbb{Q}(2^{1/4})$, $\mathbb{Q}(2^{1/4})|\mathbb{Q}(2^{1/2})$, $\mathbb{Q}(2^{1/2})|\mathbb{Q}$, and $L|\mathbb{Q}$ are normal
extensions. However, $\mathbb{Q}(2^{1/4})|\mathbb{Q}$ is not normal since $2^{1/4}$ is a zero of $x^4 - 2$, but $\mathbb{Q}(2^{1/4})$
does not contain all the zeros of $x^4 - 2$.
Hence, we get the following Figure 8.1.

Figure 8.1: Normal extensions.


8.3 Exercises
1. Determine the splitting field of f (x) ∈ ℚ[x] and its degree over ℚ in the following
cases:
(a) $f(x) = x^4 - p$, where p is a prime.
(b) $f(x) = x^p - 2$, where p is a prime.
2. Determine the degree of the splitting field of the polynomial $x^4 + 4$ over ℚ. Determine
the splitting field of $x^6 + 4x^4 + 4x^2 + 3$ over ℚ.
3. For each a ∈ ℤ, let $f_a(x) = x^3 - ax^2 + (a - 3)x + 1 \in \mathbb{Q}[x]$ be given:
(a) $f_a$ is irreducible over ℚ for each a ∈ ℤ.
(b) If b ∈ ℝ is a zero of $f_a$, then also $(1 - b)^{-1}$ and $(b - 1)b^{-1}$ are zeros of $f_a$.
(c) Determine the splitting field L of $f_a(x)$ over ℚ and its degree |L : ℚ|.
4. Let K be a field and f (x) ∈ K[x] a polynomial of degree n. Let L be a splitting field
of f (x). Show the following:
(a) If a1 , . . . , an ∈ L are the zeros of f , then |K(a1 , . . . , at ) : K| ≤ n ⋅ (n − 1) ⋅ ⋅ ⋅ (n − t + 1)
for each t with 1 ≤ t ≤ n.
(b) L over K is of degree at most n!.
(c) If f (x) is irreducible over K, then n divides |L : K|.

9 Groups, subgroups, and examples
9.1 Groups, subgroups, and isomorphisms
Recall from Chapter 1 that the three most commonly studied algebraic structures are
groups, rings and fields. We have now looked rather extensively at rings and fields.
In this chapter, we consider the basic concepts of group theory. Groups arise in many
different areas of mathematics. For example they arise in geometry as groups of con-
gruence motions, and in topology as groups of various types of continuous functions.
Later in this book, they will appear in Galois theory as groups of automorphisms of
fields. First, we recall the definition of a group given previously in Chapter 1.

Definition 9.1.1. A group G is a set with one binary operation, which we will denote
by multiplication, such that
(1) The operation is associative; that is, (g1 g2 )g3 = g1 (g2 g3 ) for all g1 , g2 , g3 ∈ G.
(2) There exists an identity for this operation; that is, an element 1 such that 1g = g
and g1 = g for each g ∈ G.
(3) Each g ∈ G has an inverse for this operation; that is, for each g, there exists a $g^{-1}$
with the property that $g g^{-1} = 1$, and $g^{-1} g = 1$.

If, in addition, the operation is commutative; that is, g1 g2 = g2 g1 for all g1 , g2 ∈ G, the
group G is called an abelian group.
The order of G, denoted |G|, is the number of elements in the group G. If |G| < ∞,
G is a finite group, otherwise, it is an infinite group.

It follows easily from the definition that the identity is unique, and that each ele-
ment has a unique inverse.

Lemma 9.1.2. If G is a group, then there is a unique identity. Furthermore, if g ∈ G, its
inverse is unique. Finally, if $g_1, g_2 \in G$, then $(g_1 g_2)^{-1} = g_2^{-1} g_1^{-1}$.

Proof. Suppose that 1 and e are both identities for G. Then 1e = e since 1 is an identity,
and 1e = 1 since e is an identity. Therefore, 1 = e, and there is only one identity.
Next suppose that g ∈ G and that g1 and g2 are both inverses for g. Then

g1 gg2 = (g1 g)g2 = 1g2 = g2

since g1 g = 1. On the other hand,

g1 gg2 = g1 (gg2 ) = g1 1 = g1

since gg2 = 1. It follows that g1 = g2 , and g has a unique inverse.


Finally, consider

\[ (g_1 g_2)(g_2^{-1} g_1^{-1}) = g_1 (g_2 g_2^{-1}) g_1^{-1} = g_1 1 g_1^{-1} = g_1 g_1^{-1} = 1. \]

https://doi.org/10.1515/9783110603996-009


Therefore, $g_2^{-1} g_1^{-1}$ is an inverse for $g_1 g_2$, and since inverses are unique, it is the inverse
of the product.

Groups most often arise as permutations on a set. We will see this, as well as other
specific examples of groups, in the next sections.
Finite groups can be completely described by their group tables or multiplication
tables. These are sometimes called Cayley tables. In general, let G = {g1 , . . . , gn } be a
group, then the multiplication table of G is

      |  g1    g2    ⋯    gj      ⋯    gn
  ----+-----------------------------------
  g1  |                   ⋮
  g2  |                   ⋮
  ⋮   |
  gi  |  ⋯     ⋯     ⋯    gi gj
  ⋮   |
  gn  |

The entry in the row of gi ∈ G and column of gj ∈ G is the product (in that order)
gi gj in G.
Groups satisfy the cancellation law for multiplication.

Lemma 9.1.3. If G is a group and a, b, c ∈ G with ab = ac or ba = ca, then b = c.

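Proof. Suppose that ab = ac. Then a has an inverse $a^{-1}$, so we have

\[ a^{-1}(ab) = a^{-1}(ac). \]

From the associativity of the group operation, we then have

\[ (a^{-1}a)b = (a^{-1}a)c \implies 1 \cdot b = 1 \cdot c \implies b = c. \]

The case ba = ca is analogous, multiplying by $a^{-1}$ from the right.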

A consequence of Lemma 9.1.3 is that each row and each column in a group table is
just a permutation of the group elements. That is, each group element appears exactly
once in each row and each column.
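This row-and-column property is easy to test on small examples. The following sketch builds the multiplication table of the permutations of three symbols and checks the property; the function names `compose`, `cayley_table`, and `is_latin_square` are hypothetical, and permutations are stored as tuples by assumption.

```python
from itertools import permutations


def compose(p, q):
    """Return p∘q for permutations stored as tuples: (p∘q)(x) = p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(q)))


def cayley_table(elements, op):
    """Entry in row a, column b is the product op(a, b), in that order."""
    return [[op(a, b) for b in elements] for a in elements]


def is_latin_square(table, elements):
    """True if every row and every column is a rearrangement of `elements`."""
    target = sorted(elements)
    rows_ok = all(sorted(row) == target for row in table)
    cols_ok = all(sorted(col) == target for col in zip(*table))
    return rows_ok and cols_ok


S3 = list(permutations(range(3)))
print(is_latin_square(cayley_table(S3, compose), S3))

# Multiplication mod 4 on {0, 1, 2, 3} is not a group, and the property fails.
Z4 = list(range(4))
print(is_latin_square(cayley_table(Z4, lambda a, b: (a * b) % 4), Z4))
```

The converse does not hold in general: a table with this property need not come from an associative operation, so the check is necessary but not sufficient for a group.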
A subset H ⊂ G is a subgroup of G if H is also a group under the same operation
as G. As for rings and fields, a subset of a group is a subgroup if it is nonempty and
closed under both the group operation and inverses.

Lemma 9.1.4.
1. A subset H ⊂ G is a subgroup if H ≠ ∅, and H is closed under the operation and
inverses. That is, if a, b ∈ H, then ab ∈ H, and $a^{-1}, b^{-1} \in H$.
2. A nonempty subset H of a group G is a subgroup if and only if $ab^{-1} \in H$ for all
a, b ∈ H. In addition, if G is finite, then H is a subgroup if and only if ab ∈ H for all
a, b ∈ H.


We leave the proof of this to the exercises.
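For finite subsets, both criteria of Lemma 9.1.4 translate directly into a check. The sketch below is illustrative only; the functions `is_subgroup` and `is_closed` and the choice of the additive group of integers mod 12 as the ambient group are assumptions.

```python
def is_subgroup(H, op, inverse):
    """One-step test: H is nonempty and a * b^{-1} lies in H for all a, b in H."""
    H = set(H)
    return bool(H) and all(op(a, inverse(b)) in H for a in H for b in H)


def is_closed(H, op):
    """Finite-group test: H is nonempty and closed under the operation."""
    H = set(H)
    return bool(H) and all(op(a, b) in H for a in H for b in H)


def add12(a, b):
    return (a + b) % 12


def neg12(a):
    return (-a) % 12


print(is_subgroup({0, 4, 8}, add12, neg12))   # multiples of 4 form a subgroup
print(is_subgroup({0, 4}, add12, neg12))      # 0 - 4 = 8 is missing
```

Both functions assume that `op` and `inverse` really are the operation and inversion of a group containing H; they do not verify the group axioms themselves.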


Let G be a group and g ∈ G; we denote by $g^n$, n ∈ ℕ, as with numbers, the product
of g taken n times. A negative exponent will indicate the inverse of the positive exponent.
As usual, let $g^0 = 1$. Clearly, group exponentiation will satisfy the standard laws
of exponents. Now consider the set

\[ H = \{1 = g^0, g, g^{-1}, g^2, g^{-2}, \ldots\} \]

of all powers of g. We will denote this by ⟨g⟩.

Lemma 9.1.5. If G is a group and g ∈ G, then ⟨g⟩ forms a subgroup of G called the cyclic
subgroup generated by g. ⟨g⟩ is abelian, even if G is not.

Proof. If g ∈ G, then g ∈ ⟨g⟩; hence, ⟨g⟩ is nonempty. Suppose then that $a = g^n$, $b = g^m$
are elements of ⟨g⟩. Then $ab = g^n g^m = g^{n+m} \in \langle g \rangle$, so ⟨g⟩ is closed under the group
operation. Furthermore, $a^{-1} = (g^n)^{-1} = g^{-n} \in \langle g \rangle$, so ⟨g⟩ is closed under inverses.
Therefore, ⟨g⟩ is a subgroup.
Finally, $ab = g^n g^m = g^{n+m} = g^{m+n} = g^m g^n = ba$; hence, ⟨g⟩ is abelian.

Suppose that g ∈ G and $g^m = 1$ for some positive integer m. Then let n be the smallest
positive integer such that $g^n = 1$. It follows that the elements $1, g, g^2, \ldots, g^{n-1}$
are all distinct, and for any other power $g^k$, we have $g^k = g^t$ for some $t \in \{0, 1, \ldots, n - 1\}$
(see exercises). The cyclic subgroup generated by g then has order n, and we say that g
has order n, which we denote by o(g) = n. If no such n exists, we say that g has infinite
order. We will look more deeply at cyclic groups and subgroups in Section 9.5.
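The definition of the order suggests a direct computation: multiply by g until the identity is reached. The sketch below does this for an element of a finite group; the function names `element_order` and `cyclic_subgroup` are hypothetical, and the loop would not terminate for an element of infinite order.

```python
def element_order(g, op, identity):
    """Smallest n >= 1 with g^n == identity (assumes g has finite order)."""
    power, n = g, 1
    while power != identity:
        power = op(power, g)
        n += 1
    return n


def cyclic_subgroup(g, op, identity):
    """List the distinct powers 1, g, g^2, ..., g^(n-1) of g."""
    elems, power = [identity], g
    while power != identity:
        elems.append(power)
        power = op(power, g)
    return elems


def mul7(a, b):
    return (a * b) % 7


print(element_order(2, mul7, 1))     # 2, 4, 1: order 3
print(cyclic_subgroup(2, mul7, 1))   # [1, 2, 4]
print(element_order(3, mul7, 1))     # 3 generates all six nonzero residues mod 7
```

Here the nonzero residues mod 7 under multiplication serve as the example group; the length of the list returned by `cyclic_subgroup` always agrees with `element_order`.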
We introduce one more concept before looking at examples.

Definition 9.1.6. If G and H are groups, then a mapping f : G → H is a (group) homomorphism
if $f(g_1 g_2) = f(g_1) f(g_2)$ for any $g_1, g_2 \in G$. If f is also a bijection, then it is an
isomorphism.

As with rings and fields, we say that two groups G and H are isomorphic, denoted
by G ≅ H, if there exists an isomorphism f : G → H. This means that, abstractly, G
and H have exactly the same algebraic structure.

9.2 Examples of groups


As already mentioned, groups arise in many diverse areas of mathematics. In this sec-
tion and the next, we present specific examples of groups.
First of all, any ring or field under addition forms an abelian group. Hence, for
example, (ℤ, +), (ℚ, +), (ℝ, +), (ℂ, +), where ℤ, ℚ, ℝ, ℂ are respectively the integers, the
rationals, the reals, and the complex numbers, are all infinite abelian groups. If $\mathbb{Z}_n$ is
the modular ring ℤ/nℤ, then for any natural number n, $(\mathbb{Z}_n, +)$ forms a finite abelian


group. In abelian groups, the group operation is often denoted by + and the identity
element by 0 (zero).
In a field K, the nonzero elements are all invertible and form a group under multiplication.
This is called the multiplicative group of the field K and is usually denoted by
$K^*$. Since multiplication in a field is commutative, the multiplicative group of a field
is an abelian group. Hence, $\mathbb{Q}^*, \mathbb{R}^*, \mathbb{C}^*$ are all infinite abelian groups, whereas if p is
a prime, $\mathbb{Z}_p^*$ forms a finite abelian group. Recall that if p is a prime, then the modular
ring $\mathbb{Z}_p$ is a field.
Within $\mathbb{Q}^*, \mathbb{R}^*, \mathbb{C}^*$, there are certain multiplicative subgroups. Since the positive
rationals $\mathbb{Q}^+$ and the positive reals $\mathbb{R}^+$ are closed under multiplication and inverse,
they form subgroups of $\mathbb{Q}^*$ and $\mathbb{R}^*$, respectively. In $\mathbb{C}^*$, if we consider the set of all
complex numbers z with |z| = 1, these form a multiplicative subgroup. Further within
this subgroup, if we consider the set of n-th roots of unity z (that is, $z^n = 1$) for a fixed n,
this forms a subgroup, this time of finite order.
The multiplicative group of a field is a special case of the unit group of a ring. If R
is a ring with identity, recall that a unit is an element of R with a multiplicative inverse.
Hence, in ℤ, the only units are ±1, whereas in any field every nonzero element is a unit.

Lemma 9.2.1. If R is a ring with identity, then the set of units in R forms a group under
multiplication called the unit group of R, and is denoted by U(R). If R is a field, then
U(R) = R∗ .

Proof. Let R be a ring with identity. Then the identity 1 itself is a unit, so 1 ∈ U(R);
hence, U(R) is nonempty. If e ∈ R is a unit, then it has a multiplicative inverse $e^{-1}$.
Clearly then, the multiplicative inverse has an inverse, namely e, so $e^{-1} \in U(R)$ whenever e ∈ U(R).
Hence, to show U(R) is a group, we must show that it is closed under product.
Let $e_1, e_2 \in U(R)$. Then there exist $e_1^{-1}, e_2^{-1}$. It follows that $e_2^{-1} e_1^{-1}$ is an inverse for
$e_1 e_2$. Hence, $e_1 e_2$ is also a unit, and U(R) is closed under product. Therefore, for any
ring R with identity, U(R) forms a multiplicative group.
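For the modular rings ℤ/nℤ, the unit group can be listed by brute force. The sketch below is an illustration under the assumption n ≥ 2; the function name `units` is hypothetical, and the comparison with the gcd condition uses the standard fact that a residue is invertible mod n exactly when it is coprime to n.

```python
from math import gcd


def units(n):
    """Units of Z/nZ for n >= 2: residues a with a*b ≡ 1 (mod n) for some b."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]


U12 = units(12)
print(U12)                                                   # [1, 5, 7, 11]
print(all((a * b) % 12 in U12 for a in U12 for b in U12))    # closed under product
print(U12 == [a for a in range(12) if gcd(a, 12) == 1])      # same as the coprime residues
```

For a prime modulus every nonzero residue is returned, in line with the remark that the unit group of a field consists of all its nonzero elements.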

To present examples of nonabelian groups, we turn to matrices. If K is a field, we
let

GL(n, K) = {n × n matrices over K with nonzero determinant}

and

SL(n, K) = {n × n matrices over K with determinant one}.

Lemma 9.2.2. If K is a field, then for n ≥ 2, GL(n, K) forms a nonabelian group under
matrix multiplication, and SL(n, K) forms a subgroup.
GL(n, K) is called the n-dimensional general linear group over K, whereas SL(n, K)
is called the n-dimensional special linear group over K.


Proof. Recall that for two n × n matrices A, B with n ≥ 2 over a field, we have

det(AB) = det(A) det(B),

where det is the determinant.


Now for any field, the n×n identity matrix I has determinant 1; hence, I ∈ GL(n, K).
Since the determinant is multiplicative, the product of two matrices with nonzero
determinant has nonzero determinant, so GL(n, K) is closed under product. Furthermore,
over a field K, if A is an invertible matrix, then

\[ \det(A^{-1}) = \frac{1}{\det A}. \]

Therefore, if A has nonzero determinant, so does its inverse. It follows that GL(n, K) has
the inverse of any of its elements. Since matrix multiplication is associative, it follows
that GL(n, K) forms a group. It is nonabelian since in general matrix multiplication is
noncommutative.
SL(n, K) forms a subgroup of GL(n, K) because det(AB) = 1 and $\det(A^{-1}) = 1$ whenever det(A) = det(B) = 1.
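Over a finite field, these groups are small enough to enumerate. The following sketch lists GL(2, K) and SL(2, K) for K the integers mod a prime p; storing a 2×2 matrix as a 4-tuple `(a, b, c, d)` and the helper names `det2`, `mul2`, `GL2`, `SL2` are assumptions made for illustration.

```python
from itertools import product


def det2(m, p):
    """Determinant of the matrix [[a, b], [c, d]] with entries mod p."""
    a, b, c, d = m
    return (a * d - b * c) % p


def mul2(m, n, p):
    """Product of two 2x2 matrices with entries mod p."""
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def GL2(p):
    return [m for m in product(range(p), repeat=4) if det2(m, p) != 0]


def SL2(p):
    return [m for m in product(range(p), repeat=4) if det2(m, p) == 1]


A, B = (1, 1, 0, 1), (1, 0, 1, 1)
print(len(GL2(3)), len(SL2(3)))          # 48 and 24
print(mul2(A, B, 3), mul2(B, A, 3))      # AB and BA differ, so the group is nonabelian
```

The count for GL(2, K) agrees with the known formula $(p^2 - 1)(p^2 - p)$, which is stated here as an outside fact and is not derived in the text above.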

Groups play an important role in geometry. In any metric geometry, an isometry is
a mapping that preserves distance. To understand a geometry, one must understand
the group of isometries. We look briefly at the Euclidean geometry of the plane ℰ².
An isometry or congruence motion of ℰ² is a transformation or bijection T of ℰ² that
preserves distance; that is, d(a, b) = d(T(a), T(b)) for all points a, b ∈ ℰ².

Theorem 9.2.3. The set of congruence motions of ℰ² forms a group called the Euclidean
group. We denote the Euclidean group by ℰ.

Proof. The identity map I is clearly an isometry, and since composition of mappings
is associative, we need only to show that the product of isometries is an isometry, and
that the inverse of an isometry is an isometry.
Let T, U be isometries. Then d(a, b) = d(T(a), T(b)) and d(a, b) = d(U(a),
U(b)) for any points a, b. Now consider

d(TU(a), TU(b)) = d(T(U(a)), T(U(b))) = d(U(a), U(b))

since T is an isometry. However,

d(U(a), U(b)) = d(a, b)

since U is an isometry. Combining these, we have that TU is also an isometry.


Consider $T^{-1}$ and points a, b. Then

\[ d(T^{-1}(a), T^{-1}(b)) = d(TT^{-1}(a), TT^{-1}(b)) \]

since T is an isometry. But $TT^{-1} = I$; hence,

\[ d(T^{-1}(a), T^{-1}(b)) = d(TT^{-1}(a), TT^{-1}(b)) = d(a, b). \]

Therefore, $T^{-1}$ is also an isometry; hence, ℰ is a group.


One of the major results concerning ℰ is the following. We refer to [32], [33], [23],
and [29] for a more thorough treatment.

Theorem 9.2.4. If T ∈ ℰ, then T is either a translation, rotation, reflection, or glide
reflection. The set of translations and rotations forms a subgroup.

Proof. We outline a brief proof. If T is an isometry and T fixes the origin (0, 0), then T
is a linear mapping. It follows that T is a rotation or a reflection. If T does not fix the
origin, then there is a translation $T_0$ such that $T_0 T$ fixes the origin. This gives
translations and glide reflections. In the exercises, we expand on the proof.

If D is a geometric figure in ℰ², such as a triangle or square, then a symmetry of
D is a congruence motion T : ℰ² → ℰ² that leaves D in place. However, it may move
the individual elements of D. For example, a rotation about the center of a circle is a
symmetry of the circle.

Lemma 9.2.5. If D is a geometric figure in ℰ², then the set of symmetries of D forms a
subgroup of ℰ called the symmetry group of D, denoted by Sym(D).

Proof. We show that Sym(D) is a subgroup of ℰ. The identity map I fixes D, so I ∈
Sym(D), and thus Sym(D) is nonempty. Let T, U ∈ Sym(D). Then T maps D to D, and
so does U. It follows directly that so does the composition TU; hence, TU ∈ Sym(D).
If T maps D to D, then certainly the inverse does. Therefore, Sym(D) is a subgroup
of ℰ.

Example 9.2.6. Let T be an equilateral triangle. Then there are exactly six symmetries
of T (see exercises). These are as follows:

I = the identity,
r = a rotation of 120° around the center of T,
r^2 = a rotation of 240° around the center of T,
f = a reflection over the perpendicular bisector of one of the sides,
fr = the composition of f and r,
fr^2 = the composition of f and r^2.

Sym(T) is called the dihedral group D_3. In the next section, we will see that it is
isomorphic to S_3, the symmetric group on 3 symbols.
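
The six symmetries can also be listed by machine. The following Python sketch is only an illustration: it assumes that the vertices of the triangle are labelled 0, 1, 2 and that a symmetry is recorded by where it sends each vertex. The names compose and generate are our own and do not come from the text.

```python
# A symmetry is stored as a tuple p, where p[i] is the vertex that vertex i
# is moved to (an assumed encoding, chosen for illustration).
def compose(p, q):
    """Return the symmetry 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

identity = (0, 1, 2)
r = (1, 2, 0)   # rotation by 120 degrees: 0 -> 1 -> 2 -> 0
f = (0, 2, 1)   # reflection in the axis through vertex 0

def generate(generators):
    """Close the identity under left multiplication by the generators."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = compose(g, p)
            if q not in elements:
                elements.add(q)
                frontier.append(q)
    return elements

symmetries = generate([r, f])
print(len(symmetries))  # 6
```

Since every permutation of the three vertices is obtained, this is consistent with Sym(T) being isomorphic to S_3.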

9.3 Permutation groups


Groups most often appear as groups of transformations or permutations on a set. In
this section, we will take a short look at permutation groups, and then examine them
more deeply in Chapter 11. We recall some ideas, first introduced in Chapter 7, in
relation to the proof of the fundamental theorem of algebra.

Definition 9.3.1. If A is a set, a permutation on A is a one-to-one mapping of A onto
itself. We denote the set of all permutations on A by S_A.

Theorem 9.3.2. For any set A, S_A forms a group under composition, called the symmetric
group on A. If |A| > 2, then S_A is nonabelian. Furthermore, if A, B have the same
cardinality, then S_A ≅ S_B.

Proof. If S_A is the set of all permutations on the set A, we must show that composition
is an operation on S_A that is associative, and has an identity and inverses.
Let f, g ∈ S_A. Then f, g are one-to-one mappings of A onto itself. Consider f ∘ g :
A → A. If f ∘ g(a_1) = f ∘ g(a_2), then f(g(a_1)) = f(g(a_2)), and g(a_1) = g(a_2), since f is
one-to-one. But then a_1 = a_2 since g is one-to-one.
If a ∈ A, there exists a_1 ∈ A with f(a_1) = a since f is onto. Then there exists a_2 ∈ A
with g(a_2) = a_1 since g is onto. Putting these together, f(g(a_2)) = a; therefore, f ∘ g
is onto. Therefore, f ∘ g is also a permutation, and composition gives a valid binary
operation on S_A.
The identity function 1(a) = a for all a ∈ A will serve as the identity for S_A, whereas
the inverse function for each permutation will be the inverse. Such unique inverse
functions exist since each permutation is a bijection.
Finally, composition of functions is always associative; therefore, S_A forms a
group.
Suppose that |A| > 2. Then A has at least 3 elements. Call them a_1, a_2, a_3. Consider
the two permutations f and g, which fix (leave unchanged) all of A except a_1, a_2, a_3,
and act on these three elements as follows:

f(a_1) = a_2, f(a_2) = a_3, f(a_3) = a_1,
g(a_1) = a_2, g(a_2) = a_1, g(a_3) = a_3.

Then under composition

f(g(a_1)) = a_3, f(g(a_2)) = a_2, f(g(a_3)) = a_1,

whereas

g(f(a_1)) = a_1, g(f(a_2)) = a_3, g(f(a_3)) = a_2.

Therefore, f ∘ g ≠ g ∘ f; hence, S_A is not abelian.


If A, B have the same cardinality, then there exists a bijection σ : A → B. Define a
map F : S_A → S_B in the following manner: if f ∈ S_A, let F(f) be the permutation on B,
given by F(f)(b) = σ(f(σ^{-1}(b))). It is straightforward to verify that F is an isomorphism
(see the exercises).
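
The two computations in this proof are easy to replay by machine. The Python sketch below is an illustration only: it takes A = {1, 2, 3}, stores a permutation as a dictionary, and picks a hypothetical set B = {"x", "y", "z"} with a bijection sigma. None of these choices come from the text.

```python
def compose(f, g):
    """Return f ∘ g, i.e. the map a -> f(g(a)), for permutations stored as dicts."""
    return {a: f[g[a]] for a in g}

# The two permutations from the proof, with a_1, a_2, a_3 written as 1, 2, 3.
f = {1: 2, 2: 3, 3: 1}
g = {1: 2, 2: 1, 3: 3}

fg = compose(f, g)
gf = compose(g, f)
print(fg)  # {1: 3, 2: 2, 3: 1}
print(gf)  # {1: 1, 2: 3, 3: 2}

# Transport along a bijection sigma : A -> B (B and sigma are hypothetical choices).
sigma = {1: "x", 2: "y", 3: "z"}
sigma_inv = {b: a for a, b in sigma.items()}

def F(perm):
    """F(perm)(b) = sigma(perm(sigma^{-1}(b)))."""
    return {b: sigma[perm[sigma_inv[b]]] for b in sigma_inv}

# F respects composition on this pair: F(f ∘ g) = F(f) ∘ F(g).
print(F(fg) == compose(F(f), F(g)))  # True
```

This checks the homomorphism property for one pair only; the general verification is the content of the exercise.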


If A_1 ⊂ A, then those permutations on A that map A_1 to A_1 form a subgroup of S_A
called the stabilizer of A_1, denoted as stab(A_1). We leave the proof to the exercises.

Lemma 9.3.3. If A_1 ⊂ A, then stab(A_1) = {f ∈ S_A : f : A_1 → A_1} forms a subgroup of S_A.

A permutation group is any subgroup of S_A for some set A.


We now look at finite permutation groups. Let A be a finite set, say A = {a_1,
a_2, ..., a_n}. Then each f ∈ S_A can be pictured as

f = ( a_1     ...  a_n    )
    ( f(a_1)  ...  f(a_n) ).

For a_1, there are n choices for f(a_1). For a_2, there are only n − 1 choices since f is
one-to-one. This continues down to only one choice for a_n. Using the multiplication
principle, the number of choices for f, and therefore the size of S_A, is

n(n − 1) ⋯ 1 = n!.

We have thus proved the following theorem.

Theorem 9.3.4. If |A| = n, then |S_A| = n!.

For a set A with n elements, we denote S_A by S_n, called the symmetric group on n
symbols.
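
As a quick sanity check of the count n!, one can simply list the permutations for small n. The sketch below uses Python's standard itertools.permutations; the helper name count_permutations is our own.

```python
from itertools import permutations
from math import factorial

def count_permutations(n):
    """Count the one-to-one maps of {0, ..., n-1} onto itself by listing them."""
    return sum(1 for _ in permutations(range(n)))

for n in range(1, 7):
    # The second and third columns agree for every n tried.
    print(n, count_permutations(n), factorial(n))
```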

Example 9.3.5. Write down the six elements of S_3 and give the multiplication table for
the group.
Name the three elements 1, 2, 3. The six elements of S_3 are then as follows:

1 = ( 1 2 3 ),   a = ( 1 2 3 ),   b = ( 1 2 3 ),
    ( 1 2 3 )        ( 2 3 1 )        ( 3 1 2 )

c = ( 1 2 3 ),   d = ( 1 2 3 ),   e = ( 1 2 3 ).
    ( 2 1 3 )        ( 3 2 1 )        ( 1 3 2 )

The multiplication table for S_3 can be written down directly by doing the required
composition. For example,

ac = ( 1 2 3 ) ( 1 2 3 ) = ( 1 2 3 ) = d.
     ( 2 3 1 ) ( 2 1 3 )   ( 3 2 1 )

To see this, note that a : 1 → 2, 2 → 3, 3 → 1; c : 1 → 2, 2 → 1, 3 → 3, and so
ac : 1 → 3, 2 → 2, 3 → 1. (Here the product ac means: apply c first, then a.)
It is somewhat easier to construct the multiplication table if we make some
observations. First, a^2 = b and a^3 = 1. Next, c^2 = 1, d = ac, e = a^2 c and, finally, ac = ca^2.


From these relations, the following multiplication table can be constructed (the entry
in row x and column y is the product xy):

        | 1      a      a^2    c      ac     a^2 c
--------+-----------------------------------------
1       | 1      a      a^2    c      ac     a^2 c
a       | a      a^2    1      ac     a^2 c  c
a^2     | a^2    1      a      a^2 c  c      ac
c       | c      a^2 c  ac     1      a^2    a
ac      | ac     c      a^2 c  a      1      a^2
a^2 c   | a^2 c  ac     c      a^2    a      1

To see this, consider, for example, (ac)a^2 = a(ca^2) = a(ac) = a^2 c.


More generally, we can say that S_3 has a presentation given by

S_3 = ⟨a, c; a^3 = c^2 = 1, ac = ca^2⟩.

By this, we mean that S_3 is generated by a, c, or that S_3 has generators a, c, and
the whole group and its multiplication table can be generated by using the relations
a^3 = c^2 = 1, ac = ca^2.
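
The relations and the table can be checked mechanically. The following Python sketch is an illustration under our own encoding: a permutation of {1, 2, 3} is stored as the tuple of its images, and compose(p, q) applies q first and then p, matching the convention used for ac in the example.

```python
# p[i - 1] is the image of i under the permutation p (an assumed encoding).
def compose(p, q):
    """Product pq in the convention of the example: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(3))

one = (1, 2, 3)
a = (2, 3, 1)
c = (2, 1, 3)
a2 = compose(a, a)

names = {one: "1", a: "a", a2: "a^2", c: "c",
         compose(a, c): "ac", compose(a2, c): "a^2 c"}

# The relations of the presentation.
assert compose(a, a2) == one             # a^3 = 1
assert compose(c, c) == one              # c^2 = 1
assert compose(a, c) == compose(c, a2)   # ac = ca^2

# Print the multiplication table; the entry in row x, column y is xy.
elements = list(names)
for x in elements:
    print("  ".join(f"{names[compose(x, y)]:6}" for y in elements))
```

The rows are printed in the order 1, a, a^2, c, ac, a^2 c, so the output can be compared line by line with the table above.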

A theorem of Cayley actually shows that every group is a permutation group.


A group G is a permutation group on the group G itself considered as a set. This
result, however, does not give much information about the group.

Theorem 9.3.6 (Cayley’s theorem). Let G be a group. Consider the set of elements of G.
Then the group G is a permutation group on the set G; that is, G is a subgroup of SG .

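Proof. We show that to each g ∈ G, we can associate a permutation of the set G. If
g ∈ G, let π_g be the map given by

π_g : g_1 → gg_1 for each g_1 ∈ G.

It is straightforward to show that each π_g is a permutation on G. Moreover, π_{gh} =
π_g ∘ π_h for all g, h ∈ G, and π_g is the identity permutation only when g = 1; hence,
the assignment g → π_g identifies G with a subgroup of S_G.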
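
To make the construction concrete, the sketch below carries it out for one small group. The choice of group (the nonzero residues modulo 5 under multiplication) and the names op and pi are our own illustrative assumptions.

```python
# A small group given by its element list and operation (an illustrative choice).
elements = [1, 2, 3, 4]

def op(x, y):
    return (x * y) % 5

def pi(g):
    """Left multiplication by g, recorded as the tuple of images of `elements`."""
    return tuple(op(g, x) for x in elements)

perms = {g: pi(g) for g in elements}

# Each pi(g) is a permutation of the underlying set ...
assert all(sorted(p) == elements for p in perms.values())
# ... distinct elements give distinct permutations ...
assert len(set(perms.values())) == len(elements)
# ... and pi(gh) agrees with 'first pi(h), then pi(g)'.
for g in elements:
    for h in elements:
        composed = tuple(op(g, op(h, x)) for x in elements)
        assert perms[op(g, h)] == composed

print(perms[2])  # (2, 4, 1, 3)
```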

9.4 Cosets and Lagrange’s theorem


In this section, given a group G and a subgroup H, we define an equivalence relation
on G. The equivalence classes all have the same size and are called the (left) or (right)
cosets of H in G.

Definition 9.4.1. Let G be a group and H ⊂ G a subgroup. For a, b ∈ G, define a ∼ b if
a^{-1}b ∈ H.

Lemma 9.4.2. Let G be a group and H ⊂ G a subgroup. Then the relation defined above
is an equivalence relation on G. The equivalence classes all have the form aH for a ∈ G
and are called the left cosets of H in G. Clearly, G is a disjoint union of its left cosets.


Proof. Let us show, first of all, that this is an equivalence relation. Now a ∼ a since
a^{-1}a = e ∈ H. Therefore, the relation is reflexive. Furthermore, a ∼ b implies a^{-1}b ∈ H,
but since H is a subgroup of G, we have b^{-1}a = (a^{-1}b)^{-1} ∈ H. Thus, b ∼ a. Therefore,
the relation is symmetric. Finally, suppose that a ∼ b and b ∼ c. Then a^{-1}b ∈ H, and
b^{-1}c ∈ H. Since H is a subgroup, a^{-1}b ⋅ b^{-1}c = a^{-1}c ∈ H; hence, a ∼ c. Therefore, the
relation is transitive and, hence, is an equivalence relation.
For a ∈ G, the equivalence class is

[a] = {g ∈ G : a ∼ g} = {g ∈ G : a^{-1}g ∈ H}.

But a^{-1}g ∈ H means precisely that g ∈ aH. It follows that the equivalence class for
a ∈ G is precisely the set

aH = {g ∈ G : g = ah for some h ∈ H}.

These classes, aH, are called left cosets of H, and since they are equivalence classes,
they partition G. This means that every element of G is in one and only one left coset.
In particular, bH = H = eH if and only if b ∈ H.
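
The partition into left cosets can be seen in a small case. The Python sketch below is an illustration only: it takes G to be the symmetric group on the symbols 0, 1, 2 and H a subgroup of order 2. Both choices, and the tuple encoding of permutations, are our own.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p; permutations of {0, 1, 2} as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))      # the symmetric group on three symbols
H = [(0, 1, 2), (1, 0, 2)]            # a subgroup of order 2 (illustrative choice)

# The left coset aH is the set {ah : h in H}.
left_cosets = {frozenset(compose(a, h) for h in H) for a in G}

print(len(left_cosets))  # 3
assert all(len(coset) == len(H) for coset in left_cosets)   # equal sizes
assert set().union(*left_cosets) == set(G)                  # the cosets cover G
assert sum(len(coset) for coset in left_cosets) == len(G)   # and do not overlap
```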

If aH is a left coset, then we call the element a a coset representative. A complete
collection of coset representatives, that is, a set containing exactly one element from
each of the distinct left cosets of H, is called a (left) transversal of H in G.


One could define another equivalence relation by defining a ∼ b if and only if
ba^{-1} ∈ H. Again, this can be shown to be an equivalence relation on G, and the
equivalence classes here are sets of the form

Ha = {g ∈ G : g = ha for some h ∈ H},

called right cosets of H. Also, of course, G is the (disjoint) union of distinct right cosets.
It is easy to see that any two left (right) cosets have the same order (number of
elements). To demonstrate this, consider the mapping aH → bH via ah ↦ bh, where
h ∈ H. It is not hard to show that this mapping is 1–1 and onto (see exercises). Thus, we
have |aH| = |bH|. (This is also true for right cosets and can be established in a similar
manner.) Letting b ∈ H in the above discussion, we see |aH| = |H|, for any a ∈ G. That
is, the size of each left or right coset is exactly the same as the subgroup H.
One can also see that the collection {aH} of all distinct left cosets has the same
number of elements as the collection {Ha} of all distinct right cosets. In other words,
the number of left cosets equals the number of right cosets (this number may be
infinite). To see this, consider the map

f : aH → Ha^{-1}.

This mapping is well-defined; for if aH = bH, then b = ah, where h ∈ H. Thus, f(bH) =
Hb^{-1} = Hh^{-1}a^{-1} = Ha^{-1} = f(aH). It is not hard to show that this mapping is 1–1 and
onto (see exercises). Hence, the number of left cosets equals the number of right cosets.

Definition 9.4.3. Let G be a group and H ⊂ G a subgroup. The number of distinct left
cosets, which is the same as the number of distinct right cosets, is called the index of
H in G, denoted by [G : H].

Now let us consider the case where the group G is finite. Each left coset has the
same size as the subgroup H; here, both are finite. Hence, |aH| = |H| for each coset.
In addition, the group G is a disjoint union of the left cosets; that is,

G = H ∪ g_1H ∪ ⋯ ∪ g_nH.

Since this is a disjoint union, we have

|G| = |H| + |g_1H| + ⋯ + |g_nH| = |H| + |H| + ⋯ + |H| = |H|[G : H].

This establishes the following extremely important theorem:

Theorem 9.4.4 (Lagrange’s theorem). Let G be a group and H ⊂ G a subgroup. Then

|G| = |H|[G : H].

If G is a finite group, this implies that both the order of a subgroup and the index of a
subgroup are divisors of the order of the group.

This theorem plays a crucial role in the structure theory of finite groups since it
greatly restricts the size of subgroups. For example, in a group of order 10, there can
be proper subgroups only of orders 1, 2, and 5.
As an immediate corollary, we have the following result:

Corollary 9.4.5. The order of any element g ∈ G, where G is a finite group, divides the
order of the group. In particular, if |G| = n and g ∈ G, then o(g)|n, and g^n = 1.

Proof. Let g ∈ G and o(g) = m. Then m is the size of the cyclic subgroup generated
by g; hence, m divides n by Lagrange's theorem. Then n = mk, and so

g^n = g^{mk} = (g^m)^k = 1^k = 1.
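
The corollary is easy to test numerically. The sketch below uses, as an illustrative choice of ours, the group of units modulo 20 under multiplication, which has order 8. The helper order is our own.

```python
from math import gcd

n = 20
# The residues coprime to n form a group under multiplication modulo n.
units = [x for x in range(1, n) if gcd(x, n) == 1]

def order(g):
    """Smallest positive m with g^m = 1 (mod n); g is assumed to be a unit."""
    m, power = 1, g % n
    while power != 1:
        power = (power * g) % n
        m += 1
    return m

for g in units:
    assert len(units) % order(g) == 0      # o(g) divides |G|
    assert pow(g, len(units), n) == 1      # g^|G| = 1

print(len(units), order(3), order(19))  # 8 4 2
```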

Before leaving this section, we consider some results concerning general subsets
of a group.
Suppose that G is a group and S is an arbitrary nonempty subset of G; that is, S ⊂ G
and S ≠ ∅. Such a set S is usually called a complex of G.
If U and V are two complexes of G, the product UV is defined as follows:

UV = {uv ∈ G : u ∈ U, v ∈ V}.


Now suppose that U, V are subgroups of G. When is the complex UV again a sub-
group of G?

Theorem 9.4.6. The product UV of two subgroups U, V of a group G is itself a subgroup
if and only if U and V commute; that is, if and only if UV = VU.

Proof. We note first that when we say U and V commute, we do not demand that this
is so elementwise. In other words, it is not required that uv = vu for all u ∈ U and all
v ∈ V. All that is required is that for any u ∈ U and v ∈ V, uv = v_1u_1 for some elements
u_1 ∈ U and v_1 ∈ V.
Assume that UV is a subgroup of G. Let u ∈ U and v ∈ V. Then u ∈ U ⋅ 1 ⊂ UV and
v ∈ 1 ⋅ V ⊂ UV. But since UV is assumed itself to be a subgroup, it follows that vu ∈ UV.
Hence, each product vu ∈ UV, and so VU ⊂ UV. Similarly, (uv)^{-1} ∈ UV, say (uv)^{-1} =
u_0v_0 with u_0 ∈ U and v_0 ∈ V; then uv = v_0^{-1}u_0^{-1} ∈ VU. Thus, UV ⊂ VU,
and so UV = VU.
Conversely, suppose that UV = VU. Let g_1 = u_1v_1 ∈ UV, g_2 = u_2v_2 ∈ UV. Then

g_1g_2 = (u_1v_1)(u_2v_2) = u_1(v_1u_2)v_2 = u_1u_3v_3v_2 = (u_1u_3)(v_3v_2) ∈ UV

since v_1u_2 = u_3v_3 for some u_3 ∈ U and v_3 ∈ V. Furthermore,

g_1^{-1} = (u_1v_1)^{-1} = v_1^{-1}u_1^{-1} = u_4v_4

for some u_4 ∈ U and v_4 ∈ V. It follows that UV is a subgroup.
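
Both directions of the theorem can be observed inside the symmetric group on three symbols. In the Python sketch below, the subgroups U, V, W and the helper names are our own illustrative choices. Since the ambient group is finite, closure under the operation is enough to test for a subgroup.

```python
def compose(p, q):
    """Apply q first, then p; permutations of {0, 1, 2} as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def product_set(U, V):
    """The complex UV = {uv : u in U, v in V}."""
    return {compose(u, v) for u in U for v in V}

def is_subgroup(S):
    """A nonempty finite subset closed under the operation is a subgroup."""
    return bool(S) and all(compose(x, y) in S for x in S for y in S)

e = (0, 1, 2)
U = {e, (1, 0, 2)}              # generated by the transposition of 0 and 1
V = {e, (0, 2, 1)}              # generated by the transposition of 1 and 2
W = {e, (1, 2, 0), (2, 0, 1)}   # generated by a 3-cycle

print(product_set(U, V) == product_set(V, U), is_subgroup(product_set(U, V)))  # False False
print(product_set(U, W) == product_set(W, U), is_subgroup(product_set(U, W)))  # True True
```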

Theorem 9.4.7 (product formula). Let U, V be subgroups of G, and let R be a left
transversal of the intersection U ∩ V in U. Then

UV = ⋃_{r∈R} rV,

where this is a disjoint union.
In particular, if U, V are finite, then

|UV| = |U||V| / |U ∩ V|.

Proof. Since R ⊂ U, we have that

⋃_{r∈R} rV ⊂ UV.

In the other direction, let uv ∈ UV with u ∈ U and v ∈ V. Since

U = ⋃_{r∈R} r(U ∩ V),

it follows that u = rv′ with r ∈ R and v′ ∈ U ∩ V. Hence,

uv = rv′v ∈ rV ⊂ ⋃_{r∈R} rV.

Therefore, UV ⊂ ⋃_{r∈R} rV, proving the equality. The union is disjoint: if r_1V = r_2V
with r_1, r_2 ∈ R, then r_1^{-1}r_2 ∈ U ∩ V, so r_1(U ∩ V) = r_2(U ∩ V) and, hence, r_1 = r_2.


Now suppose that |U| and |V| are finite. Then we have

|UV| = |R||V| = [U : U ∩ V]|V| = (|U| / |U ∩ V|)|V| = |U||V| / |U ∩ V|.
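
As a numerical check of the product formula, the following sketch works in the symmetric group on the symbols 0, 1, 2, 3 and takes for U and V two point stabilizers. These subgroups and the tuple encoding are our own illustrative choices.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p; permutations as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))
U = {p for p in S4 if p[3] == 3}     # permutations fixing the symbol 3
V = {p for p in S4 if p[0] == 0}     # permutations fixing the symbol 0

UV = {compose(u, v) for u in U for v in V}
print(len(U), len(V), len(U & V))                # 6 6 2
print(len(UV), len(U) * len(V) // len(U & V))    # 18 18
```

Note that 18 does not divide 24, so in this case the complex UV is not a subgroup, although the counting formula still holds.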

We now show that index is multiplicative. Later, we will see how this fact is related
to the multiplicativity of the degree of field extensions.

Theorem 9.4.8. Suppose G is a group and U and V are subgroups with U ⊂ V ⊂ G. Then
if G is the disjoint union

G = ⋃_{r∈R} rV,

R a left transversal of V in G, and V is the disjoint union

V = ⋃_{s∈S} sU,

S a left transversal of U in V, then we get a disjoint union for G as

G = ⋃_{r∈R, s∈S} rsU.

In particular, if [G : V] and [V : U] are finite, then

[G : U] = [G : V][V : U].

Proof. Now

G = ⋃_{r∈R} rV = ⋃_{r∈R} r(⋃_{s∈S} sU) = ⋃_{r∈R, s∈S} rsU.

Suppose that r_1s_1U = r_2s_2U. Then r_1s_1UV = r_2s_2UV. But s_1UV = V, and s_2UV = V, so
r_1V = r_2V, which implies that r_1 = r_2. Then s_1U = s_2U, which implies that s_1 = s_2.
Therefore, the union is disjoint.
The index formula now follows directly.
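
The index formula can be illustrated in a small abelian example. In the sketch below, the group is the integers modulo 24 under addition (so cosets are written a + K), and U ⊂ V are the multiples of 6 and of 2. This choice and the helper cosets are our own assumptions.

```python
n = 24
G = set(range(n))
V = {x for x in G if x % 2 == 0}   # the subgroup of multiples of 2
U = {x for x in G if x % 6 == 0}   # the subgroup of multiples of 6, contained in V

def cosets(H, K):
    """Distinct cosets a + K for a in H (groups written additively mod n)."""
    return {frozenset((a + k) % n for k in K) for a in H}

index_G_V = len(cosets(G, V))
index_V_U = len(cosets(V, U))
index_G_U = len(cosets(G, U))
print(index_G_V, index_V_U, index_G_U)  # 2 3 6
```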

The next result says that the intersection of subgroups of finite index must again
be of finite index.


Theorem 9.4.9 (Poincaré). Suppose that U, V are subgroups of finite index in G. Then
U ∩ V is also of finite index. Furthermore,

[G : U ∩ V] ≤ [G : U][G : V].

If [G : U], [G : V] are relatively prime then equality holds.

Proof. Let r be the number of left cosets of U in G that are contained in VU. Then r is
finite since the index [G : U] is finite. From Theorem 9.4.7 (with the roles of U and V
interchanged), we then have

[V : U ∩ V] = r ≤ [G : U].

Then from Theorem 9.4.8,

[G : U ∩ V] = [G : V][V : U ∩ V] ≤ [G : V][G : U].

Since both [G : U] and [G : V] are finite, so is [G : U ∩ V].


Now [G : U] | [G : U ∩ V] and [G : V] | [G : U ∩ V] by Theorem 9.4.8. If [G : U] and
[G : V] are relatively prime, then

[G : U][G : V] | [G : U ∩ V] ⟹ [G : U][G : V] ≤ [G : U ∩ V].

Therefore, we must have equality.

Corollary 9.4.10. Suppose that [G : U] and [G : V] are finite and relatively prime. Then
G = UV.

Proof. From Theorem 9.4.9, we have

[G : U ∩ V] = [G : U][G : V].

From Theorem 9.4.8

[G : U ∩ V] = [G : V][V : U ∩ V].

Combining these, we have

[V : U ∩ V] = [G : U].

The number of left cosets of U in G that are contained in VU is equal to the number of
all left cosets of U in G. It follows that G = VU. Since VU = G is then a subgroup,
Theorem 9.4.6 gives UV = VU, and so G = UV.

9.5 Generators and cyclic groups


We saw that if G is any group and g ∈ G, then the powers of g generate a subgroup
of G, called the cyclic subgroup generated by g. Here, we explore more fully the idea
of generating a group or subgroup. We first need the following:


Lemma 9.5.1. If U and V are subgroups of a group G, then their intersection U ∩ V is
also a subgroup.

Proof. Since the identity of G is in both U and V, we have that U ∩ V is nonempty.
Suppose that g_1, g_2 ∈ U ∩ V. Then g_1, g_2 ∈ U; hence, g_1^{-1}g_2 ∈ U since U is a subgroup.
Analogously, g_1^{-1}g_2 ∈ V. Hence, g_1^{-1}g_2 ∈ U ∩ V; therefore, U ∩ V is a subgroup.

Now let S be a subset of a group G. The subset S is certainly contained in at least
one subgroup of G, namely G itself. Let {U_α} be the collection of all subgroups of G
containing S. Then ⋂_α U_α is again a subgroup of G, by the same argument as in
Lemma 9.5.1. Furthermore, it is the smallest subgroup of G containing S (see the
exercises). We call ⋂_α U_α the subgroup of G generated by S, and denote it by ⟨S⟩, or
grp(S). We call the set S a set of generators for ⟨S⟩.

Definition 9.5.2. A subset M of a group G is a set of generators for G if G = ⟨M⟩; that
is, the smallest subgroup of G containing M is all of G. We say that G is generated by
M, and that M is a set of generators for G.

Notice that any group G has at least one set of generators, namely G itself. If G =
⟨M⟩ and M is a finite set, then we say that G is finitely generated. Clearly, any finite
group is finitely generated. Shortly, we will give an example of a finitely generated
infinite group.

Example 9.5.3. The set of all reflections forms a set of generators for the Euclidean
group ℰ . Recall that any T ∈ ℰ is either a translation, a rotation, a reflection, or a glide
reflection. It can be shown (see exercises) that any one of these can be expressed as a
product of three or fewer reflections.

We now consider the case where a group G has a single generator.

Definition 9.5.4. A group G is cyclic if there exists a g ∈ G such that G = ⟨g⟩.

In this case, G = {g^n : n ∈ ℤ}; that is, G consists of all the powers of the element g.
If there exists a nonzero integer m such that g^m = 1, then there exists a smallest such
positive integer, say n. It follows that g^k = g^l if and only if k ≡ l mod n. In this
situation, the distinct powers of g are precisely

{1 = g^0, g, g^2, ..., g^{n−1}}.

It follows that |G| = n. We then call G a finite cyclic group. If no such power exists, then
all the powers of g are distinct and G is an infinite cyclic group.
We show next that any two cyclic groups of the same order are isomorphic.

Theorem 9.5.5.
(a) If G = ⟨g⟩ is an infinite cyclic group, then G ≅ (ℤ, +); that is, the integers under
addition.
(b) If G = ⟨g⟩ is a finite cyclic group of order n, then G ≅ (ℤ_n, +); that is, the integers
modulo n under addition.

It follows that for a given order there is only one cyclic group up to isomorphism.

Proof. Let G be an infinite cyclic group with generator g. Map g onto 1 ∈ (ℤ, +). Since
g generates G and 1 generates ℤ under addition, this can be extended to a homomor-
phism. It is straightforward to show that this defines an isomorphism.
Now let G be a finite cyclic group of order n with generator g. As above, map g to
1 ∈ ℤ_n and extend to a homomorphism. Again it is straightforward to show that this
defines an isomorphism.
Now let G and H be two cyclic groups of the same order. If both are infinite, then
both are isomorphic to (ℤ, +) and, hence, isomorphic to each other. If both are finite of
order n, then both are isomorphic to (ℤ_n, +) and, hence, isomorphic to each other.

Theorem 9.5.6. Let G = ⟨g⟩ be a finite cyclic group of order n. Then every subgroup of
G is also cyclic. Furthermore, if d|n, there exists a unique subgroup of G of order d.

Proof. Let G = ⟨g⟩ be a finite cyclic group of order n, and suppose that H is a subgroup
of G. Notice that if g^m ∈ H, then g^{-m} is also in H since H is a subgroup. Hence, H must
contain positive powers of the generator g. Let t be the smallest positive integer
such that g^t ∈ H. We claim that H = ⟨g^t⟩, the cyclic subgroup of G generated by g^t. Let
h ∈ H; then h = g^m for some positive integer m ≥ t. Divide m by t to get

m = qt + r, where r = 0 or 0 < r < t.

If r ≠ 0, then r = m − qt > 0. Now g^m ∈ H, g^t ∈ H, so g^{-qt} ∈ H for any q since H is a
subgroup. It follows that g^m g^{-qt} = g^{m−qt} ∈ H. This implies that g^r ∈ H. However, this
is a contradiction since r < t and t is the least positive power in H. It follows that r = 0,
so m = qt. This implies that g^m = g^{qt} = (g^t)^q; that is, g^m is a power of g^t. Therefore,
every element of H is a power of g^t; thus, g^t generates H and, hence, H is cyclic.
Now suppose that d|n so that n = kd. Let H = ⟨g^k⟩; that is, the subgroup of G
generated by g^k. We claim that H has order d and that any other subgroup H_1 of G
with order d coincides with H. Now (g^k)^d = g^{kd} = g^n = 1, so the order of g^k divides d,
hence is ≤ d. Suppose that (g^k)^{d_1} = g^{kd_1} = 1 with 0 < d_1 < d. Then since the order of
g is n, we have n = kd | kd_1 with d_1 < d, which is impossible. Therefore, the order of
g^k is d, and H = ⟨g^k⟩ is a subgroup of G of order d.
Now let H_1 be a subgroup of G of order d. We must show that H_1 = H. Let h ∈ H_1,
so h = g^t for some positive integer t; hence, g^{td} = 1. It follows that n|td, and so kd|td;
hence k|t. That is, t = qk for some positive integer q. Therefore, g^t = (g^k)^q ∈ H.
Therefore, H_1 ⊂ H, and since they are of the same size, H = H_1.
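
For a concrete instance, one may list the subgroups of the integers modulo 12 under addition. The sketch below relies on the first part of the theorem (every subgroup is cyclic), so enumerating the cyclic subgroups finds them all. The value n = 12 and the helper cyclic_subgroup are our own illustrative choices.

```python
def cyclic_subgroup(k, n):
    """The subgroup of (Z_n, +) generated by k, as a frozenset."""
    return frozenset((k * i) % n for i in range(n))

n = 12
subgroups = {cyclic_subgroup(k, n) for k in range(n)}
orders = sorted(len(H) for H in subgroups)
divisors = [d for d in range(1, n + 1) if n % d == 0]
print(orders)    # [1, 2, 3, 4, 6, 12]
print(divisors)  # [1, 2, 3, 4, 6, 12]
```

Each divisor of 12 occurs exactly once among the orders, in line with the uniqueness statement.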

Theorem 9.5.7. Let G = ⟨g⟩ be an infinite cyclic group. Then every nontrivial subgroup H
is of the form H = ⟨g^t⟩ for a positive integer t. Furthermore, if t_1, t_2 are positive
integers with t_1 ≠ t_2, then ⟨g^{t_1}⟩ and ⟨g^{t_2}⟩ are distinct.


Proof. Let G = ⟨g⟩ be an infinite cyclic group and H a nontrivial subgroup of G. As in
the proof of Theorem 9.5.6, H must contain positive powers of the generator g. Let t be
the smallest positive integer such that g^t ∈ H. We claim that H = ⟨g^t⟩, the cyclic
subgroup of G generated by g^t. Let h ∈ H; since h^{-1} ∈ H as well, it suffices to consider
h = g^m for some positive integer m ≥ t. Divide m by t to get

m = qt + r, where r = 0 or 0 < r < t.

If r ≠ 0, then r = m − qt > 0. Now g^m ∈ H, g^t ∈ H, so g^{-qt} ∈ H for any q since H is a
subgroup. It follows that g^m g^{-qt} = g^{m−qt} ∈ H. This implies that g^r ∈ H. However, this
is a contradiction since r < t and t is the least positive power in H. It follows that r = 0,
so m = qt. This implies that g^m = g^{qt} = (g^t)^q; that is, g^m is a power of g^t. Therefore,
every element of H is a power of g^t and, therefore, g^t generates H; hence, H = ⟨g^t⟩.
From the proof above, in the subgroup ⟨g^t⟩, the integer t is the smallest positive
power of g in ⟨g^t⟩. Therefore, if t_1, t_2 are positive integers with t_1 ≠ t_2, then ⟨g^{t_1}⟩
and ⟨g^{t_2}⟩ are distinct.

Theorem 9.5.8. Let G = ⟨g⟩ be a cyclic group. Then the following hold:
(a) If G = ⟨g⟩ is finite of order n, then g^k is also a generator if and only if (k, n) = 1. That
is, the generators of G are precisely those powers g^k, where k is relatively prime to n.
(b) If G = ⟨g⟩ is infinite, then the only generators are g, g^{-1}.

Proof. (a) Let G = ⟨g⟩ be a finite cyclic group of order n, and suppose that (k, n) = 1.
Then there exist integers x, y with kx + ny = 1. It follows that

g = g^{kx+ny} = (g^k)^x (g^n)^y = (g^k)^x

since g^n = 1. Hence, g is a power of g^k, which implies that every element of G is also
a power of g^k. Therefore, g^k is also a generator.
Conversely, suppose that g^k is also a generator. Then g is a power of g^k, so there
exists an x such that g = g^{kx}. It follows that kx ≡ 1 modulo n, and so there exists a y
such that

kx + ny = 1.

This then implies that (k, n) = 1.
(b) If G = ⟨g⟩ is infinite, then any power of g other than g and g^{-1} generates a proper
subgroup. Indeed, if g is a power of g^n for some n ≠ ±1, so that g = g^{nx}, then
g^{nx−1} = 1 with nx − 1 ≠ 0; thus, g has finite order, contradicting that G is infinite cyclic.

Recall that for positive integers n, the Euler phi-function is defined as follows:

Definition 9.5.9. For any n > 0, let

ϕ(n) = number of positive integers less than or equal to n and relatively prime to n.


Example 9.5.10. ϕ(6) = 2 since among 1, 2, 3, 4, 5, 6 only 1, 5 are relatively prime to 6.

Corollary 9.5.11. If G = ⟨g⟩ is finite of order n, then there are ϕ(n) generators for G,
where ϕ is the Euler phi-function.

Proof. From Theorem 9.5.8, the generators of G are precisely the powers g k , where
(k, n) = 1. The numbers relatively prime to n are counted by the Euler phi-function.
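
The description of the generators can be tested directly in the integers modulo n under addition, where the "power" g^k of the generator g = 1 is simply the residue k. The sketch below is an illustration with n = 12; the helper generates is our own.

```python
from math import gcd

def generates(k, n):
    """True if k generates (Z_n, +), checked by listing the multiples of k."""
    return len({(k * i) % n for i in range(n)}) == n

n = 12
generators = [k for k in range(1, n + 1) if generates(k, n)]
coprime = [k for k in range(1, n + 1) if gcd(k, n) == 1]
print(generators)  # [1, 5, 7, 11]
print(coprime)     # [1, 5, 7, 11]
```

The two lists agree and have ϕ(12) = 4 entries.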

Recall that in an arbitrary group G, if g ∈ G, then the order of g, denoted o(g),
is the order of the cyclic subgroup generated by g. Given two elements g, h ∈ G, in
general, there is no relationship between o(g), o(h) and the order of the product gh.
However, if they commute, there is a very direct relationship.

Lemma 9.5.12. Let G be an arbitrary group and g, h ∈ G both of finite order o(g), o(h). If
g and h commute; that is, gh = hg, then o(gh) divides lcm(o(g), o(h)). In particular, if G is
an abelian group, then o(gh)| lcm(o(g), o(h)) for all g, h ∈ G of finite order. Furthermore,
if ⟨g⟩ ∩ ⟨h⟩ = {1}, then o(gh) = lcm(o(g), o(h)).

Proof. Suppose o(g) = n and o(h) = m are finite. If g, h commute, then for any k, we
have (gh)^k = g^k h^k. Let t = lcm(n, m); then t = k_1m, t = k_2n. Hence,

(gh)^t = g^t h^t = (g^n)^{k_2} (h^m)^{k_1} = 1.

Therefore, the order of gh is finite and divides t. Suppose that ⟨g⟩ ∩ ⟨h⟩ = {1}; that is, the
cyclic subgroup generated by g intersects trivially with the cyclic subgroup generated
by h. Let k = o(gh), which we know is finite from the first part of the lemma. Let t =
lcm(n, m). We then have (gh)^k = g^k h^k = 1, which implies that g^k = h^{-k}. Since the cyclic
subgroups have only trivial intersection, this implies that g^k = 1 and h^k = 1. But then
n|k and m|k; hence t|k. Since k|t, it follows that k = t.

Recall that if m and n are relatively prime, then lcm(m, n) = mn. Furthermore,
if the orders of g and h are relatively prime, it follows from Lagrange’s theorem that
⟨g⟩ ∩ ⟨h⟩ = {1}. We then get the following:

Corollary 9.5.13. If g, h commute and o(g) and o(h) are finite and relatively prime, then
o(gh) = o(g)o(h).
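
A small numerical illustration may help here. The sketch below works in the (abelian) group of units modulo 35 under multiplication. The modulus, the elements 6 and 11, and the helper names are our own choices.

```python
from math import gcd

def order(g, n):
    """Multiplicative order of g modulo n (g is assumed to be a unit mod n)."""
    m, power = 1, g % n
    while power != 1:
        power = (power * g) % n
        m += 1
    return m

def lcm(a, b):
    return a * b // gcd(a, b)

n = 35
g, h = 6, 11
# Relatively prime orders: the order of the product is the product of the orders.
print(order(g, n), order(h, n), order(g * h % n, n))   # 2 3 6

# Without the trivial-intersection hypothesis only divisibility survives:
# here g * g = 1 has order 1, a proper divisor of lcm(2, 2) = 2.
print(order(g * g % n, n), lcm(order(g, n), order(g, n)))   # 1 2
```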

Definition 9.5.14. If G is a finite abelian group, then the exponent of G is the lcm of
the orders of all elements of G. That is,

exp(G) = lcm{o(g) : g ∈ G}.

As a consequence of Lemma 9.5.12, we obtain

Lemma 9.5.15. Let G be a finite abelian group. Then G contains an element of order
exp(G).


Proof. Suppose that exp(G) = p_1^{e_1} ⋯ p_k^{e_k} with p_i distinct primes. By the definition
of exp(G), there is a g_i ∈ G with o(g_i) = p_i^{e_i} r_i with p_i and r_i relatively prime. Let
h_i = g_i^{r_i}. Then from Lemma 9.5.12, we get o(h_i) = p_i^{e_i}. Now let g = h_1h_2 ⋯ h_k. From
the corollary to Lemma 9.5.12, we have o(g) = p_1^{e_1} ⋯ p_k^{e_k} = exp(G).

If K is a field, then the nonzero elements of K form an abelian group K^⋆ under
multiplication. The above results lead to the fact that a finite subgroup of K^⋆ must
actually be cyclic.

Theorem 9.5.16. Let K be a field. Then any finite subgroup of K^⋆ is cyclic.

Proof. Let A ⊂ K^⋆ be a subgroup with |A| = n. Suppose that m = exp(A). Consider the
polynomial f(x) = x^m − 1 ∈ K[x]. Since the order of each element in A divides m, it
follows that a^m = 1 for all a ∈ A; hence, each a ∈ A is a zero of the polynomial f(x).
Hence, f(x) has at least n zeros. Since a polynomial of degree m over a field can have at
most m zeros, it follows that n ≤ m. From Lemma 9.5.15, there is an element a ∈ A with
o(a) = m. Since |A| = n, it follows that m|n; hence, m ≤ n. Therefore, m = n; hence,
A = ⟨a⟩ showing that A is cyclic.
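
For the field of integers modulo a prime p, the theorem says that the whole group of nonzero residues is cyclic. The sketch below searches for generators by brute force. The prime p = 13 and the helper order are our own illustrative choices.

```python
def order(g, p):
    """Multiplicative order of g modulo the prime p (g not divisible by p)."""
    m, power = 1, g % p
    while power != 1:
        power = (power * g) % p
        m += 1
    return m

p = 13
# A generator of the group of nonzero residues is an element of order p - 1.
generators = [g for g in range(1, p) if order(g, p) == p - 1]
print(generators)  # [2, 6, 7, 11]
```

There are ϕ(12) = 4 generators, as Corollary 9.5.11 predicts for a cyclic group of order 12.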

We close this section with two other results concerning cyclic groups. The first
proves, using group theory, a very interesting number theoretic result concerning the
Euler phi-function.

Theorem 9.5.17. For n > 1 and for d ≥ 1

∑_{d|n} ϕ(d) = n.

Proof. Consider a cyclic group G of order n. For each d|n, d ≥ 1, there is a unique
cyclic subgroup H of order d. H then has ϕ(d) generators. Each element in G generates
its own cyclic subgroup H1 , say of order d and, hence, must be included in the ϕ(d)
generators of H1 . Therefore,

∑_{d|n} ϕ(d) = sum of the numbers of generators of the cyclic subgroups of G.

Since every element of G is counted exactly once in this way, this accounts for the
whole group; hence, this sum is n.
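
The identity is easy to confirm for small n. The sketch below computes ϕ directly from the definition; the function name phi and the range of n tested are our own choices.

```python
from math import gcd

def phi(n):
    """Euler phi-function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in range(1, 50):
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n

# For n = 12: the divisors are 1, 2, 3, 4, 6, 12.
print([phi(d) for d in (1, 2, 3, 4, 6, 12)])  # [1, 1, 2, 2, 2, 4]
```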

We shall make use of the above theorem directly in the following theorem.

Theorem 9.5.18. If |G| = n and if for each positive d such that d|n, G has at most one
cyclic subgroup of order d, then G is cyclic (and, consequently, has exactly one cyclic
subgroup of order d).

Proof. For each d|n, d > 0, let ψ(d) = the number of elements of G of order d. Then

∑_{d|n} ψ(d) = n.


Now suppose that ψ(d) ≠ 0 for a given d|n. Then there exists an a ∈ G of order d,
which generates a cyclic subgroup, ⟨a⟩, of order d of G. We claim that all elements of
G of order d are in ⟨a⟩. Indeed, if b ∈ G with o(b) = d and b ∉ ⟨a⟩, then ⟨b⟩ is a second
cyclic subgroup of order d, distinct from ⟨a⟩. This contradicts the hypothesis, so the
claim is proved. Thus, if ψ(d) ≠ 0, then ψ(d) = ϕ(d). In general, we have ψ(d) ≤ ϕ(d),
for all positive d|n. But n = ∑_{d|n} ψ(d) ≤ ∑_{d|n} ϕ(d) = n by the previous theorem. It
follows from this that ψ(d) = ϕ(d) for all d|n. In particular, ψ(n) = ϕ(n) ≥ 1. Hence,
there exists at least one element of G of order n; hence, G is cyclic. This completes the
proof.

Corollary 9.5.19. If in a group G of order n, for each d|n, the equation x^d = 1 has at most
d solutions in G, then G is cyclic.

Proof. The hypothesis clearly implies that G can have at most one cyclic subgroup of
order d since all elements of such a subgroup satisfy the equation. So Theorem 9.5.18
applies to give our result.

If H is a subgroup of a group G, then G operates as a group of permutations on the
set {aH : a ∈ R} of left cosets of H in G, where R is a left transversal of H in G. This we
can use to show that a finitely generated group has only finitely many subgroups of a
given finite index.

Theorem 9.5.20. Let G be a finitely generated group. The number of subgroups of index
n < ∞ is finite.

Proof. Let H be a subgroup of index n. We choose a left transversal {c_1, ..., c_n} for H in G,
where c_1 = 1 represents H. G permutes the set of cosets c_iH by multiplication from the
left. This induces a homomorphism ψ_H from G to S_n as follows. For each g ∈ G, let ψ_H(g)
be the permutation which maps i to j if gc_iH = c_jH. ψ_H(g) fixes the number 1 if and
only if g ∈ H because c_1H = H. Now, let H and L be two different subgroups of index n
in G. Then there exists g ∈ H with g ∉ L and ψ_H(g) ≠ ψ_L(g), and hence ψ_H and ψ_L are
different. Since G is finitely generated, there are only finitely many homomorphisms
from G to S_n. Therefore the number of subgroups of index n < ∞ is finite.
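
The homomorphism ψ_H used in this proof can be written out in a small finite case. The sketch below is an illustration only: G is the symmetric group on the symbols 0, 1, 2, H is a subgroup of index 3, and cosets are numbered from 0 rather than from 1. All names are our own.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p; permutations are tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

G = sorted(permutations(range(3)))   # the identity (0, 1, 2) comes first
H = {(0, 1, 2), (1, 0, 2)}           # a subgroup of index 3 (illustrative choice)

def coset(c):
    """The left coset cH."""
    return frozenset(compose(c, h) for h in H)

# Build a left transversal whose first element is the identity.
transversal = []
for c in G:
    if all(coset(c) != coset(t) for t in transversal):
        transversal.append(c)
cosets = [coset(c) for c in transversal]

def psi(g):
    """The permutation i -> j with g c_i H = c_j H (indices start at 0 here)."""
    return tuple(cosets.index(coset(compose(g, c))) for c in transversal)

for g in G:
    assert (psi(g)[0] == 0) == (g in H)      # psi(g) fixes the coset H iff g is in H
    for h in G:
        assert psi(compose(g, h)) == compose(psi(g), psi(h))   # homomorphism
```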

9.6 Exercises
1. Prove Lemma 9.1.4.
2. Let G be a group and H a nonempty subset. H is a subgroup of G if and only if
ab−1 ∈ H for all a, b ∈ H.
3. Suppose that g ∈ G and g^m = 1 for some positive integer m. Let n be the smallest
positive integer such that g^n = 1. Show that the elements 1, g, g^2, ..., g^{n−1}
are all distinct, but that for any other power g^k we have g^k = g^t for some
t = 0, 1, ..., n − 1.


4. Let G be a group and U1 , U2 be finite subgroups of G. If |U1 | and |U2 | are relatively
prime, then U1 ∩ U2 = {e}.
5. Let A, B be subgroups of a finite group G. If |A| ⋅ |B| > |G| then A ∩ B ≠ {e}.
6. Let G be the set of all real matrices of the form $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$, where $a^2 + b^2 \ne 0$. Show:
(a) G is a group.
(b) For each n ∈ ℕ there is at least one element of order n in G.
7. Let p be a prime, and let G = SL(2, p) = SL(2, ℤp ). Show: G has at least 2p − 2
elements of order p.
8. Let p be a prime and a ∈ ℤ. Show that $a^p \equiv a \bmod p$.
9. Here we outline a proof that every planar Euclidean congruence motion is either
a rotation, translation, reflection or glide reflection. An isometry in this problem
is a planar Euclidean congruence motion. Show:
(a) If T is an isometry then it is completely determined by its action on a triangle –
equivalent to showing that if T fixes three noncollinear points then it must be
the identity.
(b) If an isometry T has exactly one fixed point then it must be a rotation with
that point as center.
(c) If an isometry T has two fixed points then it fixes the line joining them. Then
show that if T is not the identity it must be a reflection through this line.
(d) If an isometry T has no fixed point but preserves orientation then it must be
a translation.
(e) If an isometry T has no fixed point but reverses orientation then it must be a
glide reflection.
10. Let Pn be a regular n-gon and Dn its group of symmetries. Show that |Dn | = 2n.
(Hint: First show that |Dn | ≤ 2n and then exhibit 2n distinct symmetries.)
11. If A, B have the same cardinality, then there exists a bijection σ : A → B. Define a
map F : SA → SB in the following manner: if f ∈ SA , let F(f ) be the permutation
on B given by F(f )(b) = σ(f (σ −1 (b))). Show that F is an isomorphism.
12. Prove Lemma 9.3.3.

10 Normal subgroups, factor groups, and direct products
10.1 Normal subgroups and factor groups
In rings, we saw that there were certain special types of subrings, called ideals, which
allowed us to define factor rings. The analogous object for groups is called a normal
subgroup, which we will define and investigate in this section.

Definition 10.1.1. Let G be an arbitrary group and suppose that H1 and H2 are subgroups
of G. We say that H2 is conjugate to H1 if there exists an element a ∈ G such that
$H_2 = a^{-1} H_1 a$. H1 , H2 are then called conjugate subgroups of G.

Lemma 10.1.2. Let G be an arbitrary group. Then the relation of conjugacy is an equiv-
alence relation on the set of subgroups of G.

Proof. We must show that conjugacy is reflexive, symmetric, and transitive. If H is a
subgroup of G, then $1^{-1} H 1 = H$; hence, H is conjugate to itself and, therefore, the
relation is reflexive.
Suppose that H2 is conjugate to H1 . Then there exists a g ∈ G with $g^{-1} H_1 g = H_2$.
This implies that $g H_2 g^{-1} = H_1$. However, $(g^{-1})^{-1} = g$; hence, letting $g_1 = g^{-1}$, we have
$g_1^{-1} H_2 g_1 = H_1$. Therefore, H1 is conjugate to H2 and conjugacy is symmetric.
Finally, suppose that H2 is conjugate to H1 and H3 is conjugate to H2 . Then there
exist g1 , g2 ∈ G with $H_2 = g_1^{-1} H_1 g_1$ and $H_3 = g_2^{-1} H_2 g_2$. Then

$H_3 = g_2^{-1} g_1^{-1} H_1 g_1 g_2 = (g_1 g_2)^{-1} H_1 (g_1 g_2).$

Therefore, H3 is conjugate to H1 and conjugacy is transitive.

Lemma 10.1.3. Let G be an arbitrary group. Then for each g ∈ G, the map $a \mapsto g^{-1} a g$ is
an automorphism of G.

Proof. For a fixed g ∈ G, define the map f : G → G by $f(a) = g^{-1} a g$ for a ∈ G. We must
show that this is a homomorphism, and that it is one-to-one and onto.
Let a1 , a2 ∈ G. Then

f (a1 a2 ) = g −1 a1 a2 g = (g −1 a1 g)(g −1 a2 g) = f (a1 )f (a2 ).

Hence, f is a homomorphism.
If f (a1 ) = f (a2 ), then g −1 a1 g = g −1 a2 g. Clearly, by the cancellation law, we then
have a1 = a2 ; hence, f is one-to-one.
Finally, let a ∈ G, and let a1 = gag −1 . Then a = g −1 a1 g; hence, f (a1 ) = a. It follows
that f is onto; therefore, f is an automorphism on G.

https://doi.org/10.1515/9783110603996-010


In general, a subgroup H of a group G may have many different conjugates. However,
in certain situations, the only conjugate of a subgroup H is H itself. If this is the
case, we say that H is a normal subgroup. We will see shortly that this is precisely the
analog for groups of the concept of an ideal in rings.

Definition 10.1.4. Let G be an arbitrary group. A subgroup H is a normal subgroup of G,
which we denote by H ⊲ G, if $g^{-1} H g = H$ for all g ∈ G.

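Since conjugation by each g ∈ G is an automorphism of G, it follows that if $g^{-1} H g \subset H$
for all g ∈ G, then $g^{-1} H g = H$ for all g ∈ G: applying the inclusion to $g^{-1}$ gives
$g H g^{-1} \subset H$, that is, $H \subset g^{-1} H g$. Hence, in order to show that a subgroup is normal,
we need only show inclusion.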

Lemma 10.1.5. Let N be a subgroup of a group G. If $a^{-1} N a \subset N$ for all a ∈ G, then
$a^{-1} N a = N$ for all a ∈ G. In particular, $a^{-1} N a \subset N$ for all a ∈ G implies that N is a normal subgroup.

Notice that if $g^{-1} H g = H$, then Hg = gH. That is, as sets, the left coset gH is equal
to the right coset Hg. Hence, for each h1 ∈ H, there is an h2 ∈ H with gh1 = h2 g. If
H ⊲ G, this is true for all g ∈ G. Furthermore, if H is normal, then for the product of
two cosets g1 H and g2 H, we have

(g1 H)(g2 H) = g1 (Hg2 )H = g1 g2 (HH) = g1 g2 H.

Conversely, if (g1 H)(g2 H) = (g1 g2 )H for all g1 , g2 ∈ G, then taking $g_1 = g^{-1}$ and $g_2 = g$
gives $g^{-1} H g \subset (g^{-1} H)(gH) = H$ for all g ∈ G, and therefore $g^{-1} H g = H$ for all g ∈ G.
Hence, we have proved the following:

Lemma 10.1.6. Let H be a subgroup of a group G. Then the following are equivalent:
(1) H is a normal subgroup of G.
(2) g −1 Hg = H for all g ∈ G.
(3) gH = Hg for all g ∈ G.
(4) (g1 H)(g2 H) = (g1 g2 )H for all g1 , g2 ∈ G.

This is precisely the condition needed to construct factor groups. First we give
some examples of normal subgroups.
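
The criterion gH = Hg of Lemma 10.1.6 is easy to test by machine in a small group. The following sketch assumes G = S3 with permutations stored as tuples; `is_normal` and `conjugates` are hypothetical helper names, and the two subgroups A3 and H are chosen only for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_normal(H, G):
    """Check gH == Hg as sets for every g in G."""
    return all(
        {compose(g, h) for h in H} == {compose(h, g) for h in H} for g in G
    )

def conjugates(H, G):
    """The set of all subgroups g^{-1} H g with g in G."""
    return {
        frozenset(compose(compose(inverse(g), h), g) for h in H) for g in G
    }

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # the rotations, a subgroup of index 2
H = [(0, 1, 2), (1, 0, 2)]               # generated by one transposition
```

In S3 the subgroup A3 has only itself as a conjugate, while H has three distinct conjugates and so is not normal.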

Lemma 10.1.7. Every subgroup of an abelian group is normal.

Proof. Let G be abelian and H a subgroup of G. Let g ∈ G. Then gh = hg for all
h ∈ H since G is abelian. It follows that gH = Hg. Since this is true for every g ∈ G, it
follows that H is normal.

Lemma 10.1.8. Let H ⊂ G be a subgroup of index 2; that is, [G : H] = 2. Then H is normal
in G.

Proof. Suppose that [G : H] = 2. We must show that gH = Hg for all g ∈ G. If g ∈ H,
then clearly H = gH = Hg. Therefore, we may assume that g is not in H. Since there
are only 2 left cosets and 2 right cosets, we have

G = H ∪ gH = H ∪ Hg.

Since both unions are disjoint, we must have gH = Hg = G \ H; hence, H is normal.

Lemma 10.1.9. Let K be any field. Then the group SL(n, K) is a normal subgroup of
GL(n, K) for any positive integer n.

Proof. Recall that GL(n, K) is the group of n × n matrices over the field K with nonzero
determinant, whereas SL(n, K) is the subgroup of n × n matrices over the field K with
determinant equal to 1. Let U ∈ SL(n, K) and T ∈ GL(n, K). Consider T −1 UT. Then

$\det(T^{-1} U T) = \det(T^{-1}) \det(U) \det(T) = \det(U) \det(T^{-1} T) = \det(U) \det(I) = \det(U) = 1.$

Hence, T −1 UT ∈ SL(n, K) for any U ∈ SL(n, K), and any T ∈ GL(n, K). It follows that
T −1 SL(n, K)T ⊂ SL(n, K); therefore, SL(n, K) is normal in GL(n, K).
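
For a finite field, the statement of Lemma 10.1.9 can be confirmed by brute force. The sketch below assumes n = 2 and K = ℤ3; the helper names `det`, `mul`, and `inv` are our own, and `pow(x, -1, p)` for modular inverses requires Python 3.8 or later.

```python
from itertools import product

p = 3   # illustrative prime; K = Z_p

def det(M):
    """Determinant of a 2x2 matrix over Z_p."""
    (a, b), (c, d) = M
    return (a * d - b * c) % p

def mul(A, B):
    """Product of two 2x2 matrices over Z_p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def inv(M):
    """Inverse of an invertible 2x2 matrix over Z_p (adjugate formula)."""
    (a, b), (c, d) = M
    s = pow(det(M), -1, p)
    return ((d * s % p, -b * s % p), (-c * s % p, a * s % p))

matrices = [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)]
GL = [M for M in matrices if det(M) != 0]
SL = [M for M in GL if det(M) == 1]

# Every conjugate T^{-1} U T of an element U of SL again has determinant 1.
closed_under_conjugation = all(
    det(mul(mul(inv(T), U), T)) == 1 for T in GL for U in SL
)
```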

The intersection of normal subgroups is again normal, and the product of normal
subgroups is normal.

Lemma 10.1.10. Let N1 , N2 be normal subgroups of the group G. Then the following hold:
(a) N1 ∩ N2 is a normal subgroup of G.
(b) N1 N2 is a normal subgroup of G.
(c) If H is any subgroup of G, then N1 ∩ H is a normal subgroup of H, and N1 H = HN1 .

Proof. (a) Let n ∈ N1 ∩ N2 and g ∈ G. Then $g^{-1} n g \in N_1$ since N1 is normal. Similarly,
$g^{-1} n g \in N_2$ since N2 is normal. Hence, $g^{-1} n g \in N_1 \cap N_2$. It follows that
$g^{-1} (N_1 \cap N_2) g \subset N_1 \cap N_2$; therefore, N1 ∩ N2 is normal.
(b) Let n1 ∈ N1 , n2 ∈ N2 . Since N1 , N2 are both normal N1 N2 = N2 N1 as sets, and the
complex N1 N2 forms a subgroup of G. Let g ∈ G and n1 n2 ∈ N1 N2 . Then

g −1 (n1 n2 )g = (g −1 n1 g)(g −1 n2 g) ∈ N1 N2

since $g^{-1} n_1 g \in N_1$ and $g^{-1} n_2 g \in N_2$. Therefore, N1 N2 is normal in G.
(c) Let h ∈ H and n ∈ N1 ∩ H. Then, as in part (a), $h^{-1} n h \in N_1 \cap H$; therefore, N1 ∩ H is
a normal subgroup of H.
If nh ∈ N1 H with n ∈ N1 and h ∈ H, then $nh = h(h^{-1} n h) = h n'$ with $n' = h^{-1} n h \in N_1$.
Hence, N1 H ⊂ HN1 . The reverse inclusion follows in the same way, so N1 H = HN1 .

We now construct factor groups or quotient groups of a group modulo a normal
subgroup.


Definition 10.1.11. Let G be an arbitrary group and H a normal subgroup of G. Let G/H
denote the set of distinct left (and hence also right) cosets of H in G. On G/H, define
the multiplication

(g1 H)(g2 H) = g1 g2 H

for any elements g1 H, g2 H in G/H.

Theorem 10.1.12. Let G be a group and H a normal subgroup of G. Then G/H under the
operation defined above forms a group. This group is called the factor group or quotient
group of G modulo H. The identity element is the coset 1H = H, and the inverse of a coset
gH is $g^{-1} H$.

Proof. Throughout the proof, we write N for the normal subgroup H. We first show that
the operation on G/N is well-defined. Suppose that a′N = aN
and b′N = bN. Then b′ ∈ bN, and so b′ = bn1 . Similarly, a′ = an2 , where n1 , n2 ∈ N.
Therefore,

a′b′N = an2 bn1 N = an2 bN

since n1 ∈ N. But $b^{-1} n_2 b = n_3 \in N$, since N is normal, so that n2 b = bn3 . Therefore, the
right-hand side of the equation can be written as

an2 bN = abn3 N = abN.

Thus, we have shown that if N ⊲ G, then a′b′N = abN, and the operation on G/N is
indeed well-defined.
The associative law holds, because coset multiplication as defined above uses the
ordinary group operation, which is by definition associative.
The coset N serves as the identity element of G/N. Notice that

$aN \cdot N = aN^2 = aN,$

and, since Na = aN,

$N \cdot aN = aN \cdot N = aN^2 = aN.$

The inverse of aN is $a^{-1} N$ since

$aN \cdot a^{-1} N = a a^{-1} N^2 = N.$

We emphasize that the elements of G/N are cosets; thus, subsets of G. If |G| < ∞,
then |G/N| = [G : N], the number of cosets of N in G. It is also to be emphasized that
for G/N to be a group, N must be a normal subgroup of G.
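
The role of normality in making coset multiplication well-defined can be checked directly. This is a minimal sketch assuming G = S3; `product_well_defined` is a hypothetical helper that tests whether the coset of a product depends only on the cosets of the factors.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def coset(g, H):
    """The left coset gH as a set."""
    return frozenset(compose(g, h) for h in H)

def product_well_defined(H, G):
    """True if coset(a*b) is independent of the chosen representatives."""
    for a in G:
        for b in G:
            target = coset(compose(a, b), H)
            for a2 in coset(a, H):
                for b2 in coset(b, H):
                    if coset(compose(a2, b2), H) != target:
                        return False
    return True

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # normal in S3
H = [(0, 1, 2), (1, 0, 2)]               # not normal in S3
```

For the normal subgroup A3 the check succeeds, while for H different representatives of the same cosets lead to different product cosets.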
In some cases, properties of G are preserved in factor groups.


Lemma 10.1.13. If G is abelian, then any factor group of G is also abelian. If G is cyclic,
then any factor group of G is also cyclic.

Proof. Suppose that G is abelian and H is a subgroup of G. By Lemma 10.1.7, H is
necessarily normal, so that we can form the factor group G/H. Let g1 H, g2 H ∈ G/H.
Since G is abelian, we have g1 g2 = g2 g1 . Then in G/H,

(g1 H)(g2 H) = (g1 g2 )H = (g2 g1 )H = (g2 H)(g1 H).

Therefore, G/H is abelian.


We leave the proof of the second part to the exercises.

An extremely important concept concerns groups that contain no proper
normal subgroups other than the identity subgroup {1}.

Definition 10.1.14. A group G ≠ {1} is simple, provided that N ⊲ G implies N = G or
N = {1}.

One of the most outstanding problems in group theory has been to give a complete
classification of all finite simple groups. In other words, this is the program to discover
all finite simple groups, and to prove that there are no more to be found. This was ac-
complished through the efforts of many mathematicians. The proof of this magnificent
result took thousands of pages. We refer the reader to [25] for a complete discussion of
this. We give one elementary example:

Lemma 10.1.15. Any finite group of prime order is simple and cyclic.

Proof. Suppose that G is a finite group and |G| = p, where p is a prime. Let g ∈ G with
g ≠ 1. Then ⟨g⟩ is a nontrivial subgroup of G, so its order divides the order of G by
Lagrange’s theorem. Since g ≠ 1, and p is a prime, we must have |⟨g⟩| = p. Therefore,
⟨g⟩ is all of G; that is, G = ⟨g⟩; hence, G is cyclic.
The argument above shows that G has no nontrivial proper subgroups and, there-
fore, no nontrivial normal subgroups. Therefore, G is simple.

In the next chapter, we will examine certain other finite simple groups.

10.2 The group isomorphism theorems


In Chapter 1, we saw that there was a close relationship between ring homomorphisms
and factor rings. In particular to each ideal, and consequently to each factor ring, there
is a ring homomorphism that has that ideal as its kernel. Conversely, to each ring ho-
momorphism, its kernel is an ideal, and the corresponding factor ring is isomorphic
to the image of the homomorphism. This was formalized in Theorem 1.5.7, which we


called the ring isomorphism theorem. We now look at the group theoretical analog of
this result, called the group isomorphism theorem. We will then examine some conse-
quences of this result that will be crucial in the Galois theory of fields.

Definition 10.2.1. Let G1 and G2 be groups and f : G1 → G2 a group homomorphism.
Then the kernel of f , denoted ker(f ), is defined as

ker(f ) = {g ∈ G1 : f (g) = 1}.

That is, the kernel is the set of elements of G1 that map to the identity of G2 . The
image of f , denoted im(f ), is the set of elements of G2 mapped onto by f from elements
of G1 . That is,

im(f ) = {g2 ∈ G2 : f (g1 ) = g2 for some g1 ∈ G1 }.

Note that if f is a surjection, then im(f ) = G2 .

As with ring homomorphisms, the kernel measures how far a homomorphism is
from being an injection, that is, a one-to-one mapping.

Lemma 10.2.2. Let G1 and G2 be groups and f : G1 → G2 a group homomorphism. Then
f is injective if and only if ker(f ) = {1}.

Proof. Suppose that f is injective. Since f (1) = 1, we always have 1 ∈ ker(f ). Suppose
that g ∈ ker(f ). Then f (g) = f (1). Since f is injective, this implies that g = 1; hence,
ker(f ) = {1}.
Conversely, suppose that ker(f ) = {1} and f (g1 ) = f (g2 ). Then

$f(g_1)(f(g_2))^{-1} = 1 \implies f(g_1 g_2^{-1}) = 1 \implies g_1 g_2^{-1} \in \ker(f).$

Then since ker(f ) = {1}, we have $g_1 g_2^{-1} = 1$; hence, g1 = g2 . Therefore, f is injective.

We now state the group isomorphism theorem. This is entirely analogous to the
ring isomorphism theorem replacing ideals by normal subgroups. We note that this
theorem is sometimes called the first group isomorphism theorem.

Theorem 10.2.3 (Group isomorphism theorem).
(a) Let G1 and G2 be groups and f : G1 → G2 a group homomorphism. Then ker(f ) is a
normal subgroup of G1 , im(f ) is a subgroup of G2 , and

G1 / ker(f ) ≅ im(f ).

(b) Conversely, suppose that N is a normal subgroup of a group G. Then there exists a
group H and a homomorphism f : G → H such that ker(f ) = N, and im(f ) = H.


Proof. (a) Since 1 ∈ ker(f ), the kernel is nonempty. Suppose that g1 , g2 ∈ ker(f ). Then
f (g1 ) = f (g2 ) = 1. It follows that $f(g_1 g_2^{-1}) = f(g_1)(f(g_2))^{-1} = 1$. Hence, $g_1 g_2^{-1} \in \ker(f)$;
therefore, ker(f ) is a subgroup of G1 . Furthermore, for any g ∈ G1 , we have

$f(g^{-1} g_1 g) = (f(g))^{-1} f(g_1) f(g) = (f(g))^{-1} \cdot 1 \cdot f(g) = f(g^{-1} g) = f(1) = 1.$

Hence, $g^{-1} g_1 g \in \ker(f)$ and ker(f ) is a normal subgroup.


It is straightforward to show that im(f ) is a subgroup of G2 .
Consider the map f̂ : G1 / ker(f ) → im(f ) defined by

f̂(g ker(f )) = f (g).

We show that this is an isomorphism.
Suppose that g1 ker(f ) = g2 ker(f ). Then $g_1 g_2^{-1} \in \ker(f)$ so that $f(g_1 g_2^{-1}) = 1$. This
implies that f (g1 ) = f (g2 ); hence, the map f̂ is well-defined. Now,

f̂(g1 ker(f ) g2 ker(f )) = f̂(g1 g2 ker(f )) = f (g1 g2 ) = f (g1 )f (g2 ) = f̂(g1 ker(f )) f̂(g2 ker(f ));

therefore, f̂ is a homomorphism.
Suppose that f̂(g1 ker(f )) = f̂(g2 ker(f )). Then f (g1 ) = f (g2 ), so $g_2^{-1} g_1 \in \ker(f)$; hence,
g1 ker(f ) = g2 ker(f ). It follows that f̂ is injective.
Finally, suppose that h ∈ im(f ). Then there exists a g ∈ G1 with f (g) = h. Then
f̂(g ker(f )) = h, and f̂ is a surjection onto im(f ). Therefore, f̂ is an isomorphism,
completing the proof of part (a).
(b) Conversely, suppose that N is a normal subgroup of G. Define the map f : G →
G/N by f (g) = gN for g ∈ G. By the definition of the product in the quotient group G/N,
it is clear that f is a homomorphism with im(f ) = G/N. If g ∈ ker(f ), then f (g) = gN = N
since N is the identity in G/N. However, this implies that g ∈ N; hence, it follows that
ker(f ) = N, completing the proof.
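
A small additive example makes the group isomorphism theorem concrete. The sketch below assumes G = ℤ12 under addition and the homomorphism f (x) = 3x mod 12, both chosen only for illustration; `f_hat` records the value that f takes on each coset of the kernel.

```python
n = 12
G = range(n)

def f(x):
    """An illustrative homomorphism from Z_12 to itself."""
    return (3 * x) % n

kernel = [x for x in G if f(x) == 0]
image = sorted({f(x) for x in G})

# The cosets x + ker(f), each stored as a frozenset.
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# f is constant on each coset, so f_hat is well-defined on G/ker(f).
f_hat = {c: {f(x) for x in c} for c in cosets}
values = {next(iter(v)) for v in f_hat.values()}
```

Each coset is sent to a single element of im(f ), and distinct cosets have distinct values, which is the bijection G/ ker(f ) → im(f ) in this case.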

There are two related theorems that are called the second isomorphism theorem
and the third isomorphism theorem.

Theorem 10.2.4 (Second isomorphism theorem). Let N be a normal subgroup of a
group G and U a subgroup of G. Then U ∩ N is normal in U, and

(UN)/N ≅ U/(U ∩ N).

Proof. From Lemma 10.1.10, we know that U ∩ N is normal in U. Define the map

α : UN → U/(U ∩ N)

by α(un) = u(U ∩ N). If un = u′n′, then $u'^{-1} u = n' n^{-1} \in U \cap N$. Therefore, u′(U ∩ N) =
u(U ∩ N); hence, the map α is well-defined.
Suppose that un, u′n′ ∈ UN. Since N is normal in G, we have that unu′n′ ∈ uu′N.
Hence, unu′n′ = uu′n″ with n″ ∈ N. Then

α(unu′n′) = α(uu′n″) = uu′(U ∩ N).

However, U ∩ N is normal in U, so

uu′(U ∩ N) = u(U ∩ N)u′(U ∩ N) = α(un)α(u′n′).

Therefore, α is a homomorphism.
We have im(α) = U/(U ∩ N) by definition. Suppose that un ∈ ker(α). Then α(un) =
u(U ∩ N) = U ∩ N, which implies u ∈ U ∩ N ⊂ N, and so un ∈ N. Conversely, every n ∈ N
satisfies α(n) = α(1 ⋅ n) = U ∩ N. Therefore, ker(α) = N. From the group isomorphism
theorem, we then have

UN/N ≅ U/(U ∩ N),

proving the theorem.
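
The second isomorphism theorem can be checked on a small abelian example. The sketch below assumes G = ℤ12 written additively with U = {0, 4, 8} and N = {0, 6}, all our own choices; `alpha_bar` is the map u + (U ∩ N) → u + N, which is the inverse direction of the isomorphism induced by α.

```python
n = 12
U = {0, 4, 8}
N = {0, 6}

# In additive notation the product UN is the set of sums u + x.
UN = {(u + x) % n for u in U for x in N}

def cosets(A, B):
    """The cosets a + B for a in A, where B is a subgroup of Z_n."""
    return {frozenset((a + b) % n for b in B) for a in A}

lhs = cosets(UN, N)        # (UN)/N
rhs = cosets(U, U & N)     # U/(U ∩ N)

# Send the coset u + (U ∩ N) to the coset u + N.
alpha_bar = {
    frozenset((u + b) % n for b in U & N): frozenset((u + x) % n for x in N)
    for u in U
}
```

Both quotients have three elements here, and `alpha_bar` maps the cosets of U ∩ N in U bijectively onto the cosets of N in UN.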

Theorem 10.2.5 (Third isomorphism theorem). Let N and M be normal subgroups of a
group G with N a subgroup of M. Then M/N is a normal subgroup in G/N, and

(G/N)/(M/N) ≅ G/M.

Proof. Define the map β : G/N → G/M by

β(gN) = gM.

It is straightforward that β is well-defined, a homomorphism, and surjective. If
gN ∈ ker(β), then β(gN) = gM = M; hence, g ∈ M. It follows that ker(β) = M/N. In
particular, this shows that M/N is normal in G/N. From the group isomorphism theorem then,

(G/N)/(M/N) ≅ G/M.

For a normal subgroup N in G, the homomorphism f : G → G/N provides a one-to-one
correspondence between subgroups of G containing N and the subgroups of
G/N. This correspondence will play a fundamental role in the study of subfields of a
field.

Theorem 10.2.6 (Correspondence Theorem). Let N be a normal subgroup of a group G,
and let f be the corresponding homomorphism f : G → G/N. Then the mapping

ϕ : H → f (H),

where H is a subgroup of G containing N, provides a one-to-one correspondence between
all the subgroups of G/N and the subgroups of G containing N.


Proof. We first show that the mapping ϕ is surjective. Let H1 be a subgroup of G/N,
and let

H = {g ∈ G : f (g) ∈ H1 }.

We show that H is a subgroup of G, and that N ⊂ H.
If g1 , g2 ∈ H, then f (g1 ) ∈ H1 , and f (g2 ) ∈ H1 . Therefore, f (g1 )f (g2 ) ∈ H1 ; hence,
f (g1 g2 ) ∈ H1 . Therefore, g1 g2 ∈ H. In an identical fashion, $g_1^{-1} \in H$. Therefore, H is a
subgroup of G. If n ∈ N, then f (n) = 1 ∈ H1 ; hence, n ∈ H. Therefore, N ⊂ H. Since f is
surjective, f (H) = H1 ; that is, ϕ(H) = H1 , showing that the map ϕ is surjective.
Suppose that ϕ(H1 ) = ϕ(H2 ), where H1 and H2 are subgroups of G containing N.
This implies that f (H1 ) = f (H2 ). Let g1 ∈ H1 . Then f (g1 ) = f (g2 ) for some g2 ∈ H2 . Then
g1 g2−1 ∈ ker(f ) = N ⊂ H2 . It follows that g1 g2−1 ∈ H2 so that g1 ∈ H2 . Hence, H1 ⊂ H2 . In a
similar fashion, H2 ⊂ H1 ; therefore, H1 = H2 . It follows that ϕ is injective.
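
The correspondence can be observed in a cyclic example. The sketch below assumes G = ℤ12 with N generated by 6 (our own choice) and uses the fact that every subgroup of a cyclic group is cyclic to list all subgroups; `generated` and `image` are hypothetical helper names.

```python
n = 12

def generated(g):
    """The cyclic subgroup of Z_n generated by g."""
    return frozenset((g * k) % n for k in range(n))

# Every subgroup of a cyclic group is cyclic, so this lists all subgroups.
subgroups = {generated(g) for g in range(n)}
N = generated(6)

def image(H):
    """The image f(H) in G/N, as a set of cosets of N."""
    return frozenset(frozenset((h + x) % n for x in N) for h in H)

containing_N = [H for H in subgroups if N <= H]
images = {image(H) for H in containing_N}
```

The four subgroups containing N have four distinct images, matching the four subgroups of G/N ≅ ℤ6.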

10.3 Direct products of groups


In this section, we look at a very important construction, the direct product, which
allows us to build new groups out of existing groups. This construction is the analog
for groups of the direct sum of rings. As an application of this construction, in the
next section, we present a theorem, which completely describes the structure of finite
abelian groups.
Let G1 , G2 be groups and let G be the Cartesian product of G1 and G2 . That is,

G = G1 × G2 = {(a, b) : a ∈ G1 , b ∈ G2 }.

On G, define

(a1 , b1 ) ⋅ (a2 , b2 ) = (a1 a2 , b1 b2 ).

With this operation, it is straightforward to verify the group axioms for G; hence, G becomes a
group.

Theorem 10.3.1. Let G1 , G2 be groups and G the Cartesian product G1 × G2 with the
operation defined above. Then G forms a group called the direct product of G1 and G2 . The
identity element is (1, 1), and $(g, h)^{-1} = (g^{-1}, h^{-1})$.

This construction can be iterated to any finite number of groups (also to an infinite
number, but we will not consider that here) G1 , . . . , Gn to form the direct product G1 ×
G2 × ⋅ ⋅ ⋅ × Gn .

Theorem 10.3.2. For groups G1 and G2 , we have G1 × G2 ≅ G2 × G1 , and G1 × G2 is abelian
if and only if each Gi , i = 1, 2, is abelian.


Proof. The map (a, b) → (b, a), where a ∈ G1 , b ∈ G2 provides an isomorphism from
G1 × G2 → G2 × G1 .
Suppose that both G1 , G2 are abelian. Then if a1 , a2 ∈ G1 , b1 , b2 ∈ G2 , we have

(a1 , b1 )(a2 , b2 ) = (a1 a2 , b1 b2 ) = (a2 a1 , b2 b1 ) = (a2 , b2 )(a1 , b1 );

hence, G1 × G2 is abelian.
Conversely, suppose G1 × G2 is abelian, and suppose that a1 , a2 ∈ G1 . Then for the
identity 1 ∈ G2 , we have

(a1 a2 , 1) = (a1 , 1)(a2 , 1) = (a2 , 1)(a1 , 1) = (a2 a1 , 1).

Therefore, a1 a2 = a2 a1 , and G1 is abelian. Similarly, G2 is abelian.

We show next that in G1 × G2 , there are normal subgroups H1 , H2 with H1 ≅ G1 and
H2 ≅ G2 .

Theorem 10.3.3. Let G = G1 × G2 . Let H1 = {(a, 1) : a ∈ G1 } and H2 = {(1, b) : b ∈ G2 }.
Then both H1 and H2 are normal subgroups of G with G = H1 H2 and H1 ∩ H2 = {1}.
Furthermore, H1 ≅ G1 , H2 ≅ G2 , G/H1 ≅ G2 , and G/H2 ≅ G1 .

Proof. Map G1 × G2 onto G2 by (a, b) → b. It is clear that this map is a homomorphism,
and that the kernel is H1 = {(a, 1) : a ∈ G1 }. This establishes that H1 is a normal
subgroup of G, and that G/H1 ≅ G2 . In an identical fashion, we get that G/H2 ≅ G1 . The
map (a, 1) → a provides the isomorphism from H1 onto G1 .

If the factors are finite, it is easy to find the order of G1 ×G2 . The size of the Cartesian
product is just the product of the sizes of the factors.

Lemma 10.3.4. If |G1 | and |G2 | are finite, then |G1 × G2 | = |G1 ||G2 |.

Now suppose that G is a group with normal subgroups G1 , G2 such that G = G1 G2
and G1 ∩ G2 = {1}. Then we will show that G is isomorphic to the direct product G1 × G2 .
In this case, we say that G is the internal direct product of its subgroups, and that G1 , G2
are direct factors of G.

Theorem 10.3.5. Suppose that G is a group with normal subgroups G1 , G2 such that G =
G1 G2 , and G1 ∩ G2 = {1}. Then G is isomorphic to the direct product G1 × G2 .

Proof. Since G = G1 G2 , each element of G has the form ab with a ∈ G1 , b ∈ G2 . This
representation as ab is unique, because G1 ∩ G2 = {1}. We first show that each a ∈
G1 commutes with each b ∈ G2 . Consider the element $aba^{-1}b^{-1}$. Since G1 is normal,
$ba^{-1}b^{-1} \in G_1$, which implies that $aba^{-1}b^{-1} \in G_1$. Since G2 is normal, $aba^{-1} \in G_2$, which
implies that $aba^{-1}b^{-1} \in G_2$. Therefore, $aba^{-1}b^{-1} \in G_1 \cap G_2 = \{1\}$; hence, $aba^{-1}b^{-1} = 1$, so
that ab = ba.


Now map G onto G1 × G2 by f (ab) = (a, b); this is well-defined because the
representation ab is unique. We claim that this is an isomorphism.
It is clearly onto. Now, since elements of G1 commute with elements of G2 ,

f ((a1 b1 )(a2 b2 )) = f (a1 a2 b1 b2 ) = (a1 a2 , b1 b2 ) = (a1 , b1 )(a2 , b2 ) = f (a1 b1 )f (a2 b2 ),

so that f is a homomorphism. If f (ab) = (1, 1), then a = b = 1, so the kernel is trivial,
and f is an isomorphism.

Although the resulting groups are isomorphic, we call G1 × G2 an external
direct product if we started with the groups G1 , G2 and constructed G1 × G2 , and call G1 ×
G2 an internal direct product if we started with a group G having normal subgroups,
as in the theorem.
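
An internal direct product can be verified by hand or by a short computation. The sketch below assumes G = ℤ6 written additively with G1 = {0, 3} and G2 = {0, 2, 4}, our own choice of example; the dictionary `f` plays the role of the map ab → (a, b), here a + b → (a, b).

```python
n = 6
G1 = [0, 3]
G2 = [0, 2, 4]

pairs = [(a, b) for a in G1 for b in G2]

# Each element of Z_6 should arise exactly once as a sum a + b.
f = {(a + b) % n: (a, b) for a, b in pairs}

def add_pairs(x, y):
    """Componentwise addition in G1 x G2 (both components reduced mod n)."""
    return ((x[0] + y[0]) % n, (x[1] + y[1]) % n)

is_homomorphism = all(
    f[(g + h) % n] == add_pairs(f[g], f[h]) for g in range(n) for h in range(n)
)
```

Since `f` has six keys, the representation a + b is unique, and the check confirms that f respects the group operation, so ℤ6 is the internal direct product of G1 and G2.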

10.4 Finite Abelian groups


We now use the results of the last section to present a theorem that completely
describes the structure of finite abelian groups. This theorem is a special case of a general
result on modules that we will examine in detail in Chapter 19.

Theorem 10.4.1 (Basis theorem for finite abelian groups). Let G be a finite abelian
group. Then G is a direct product of cyclic groups of prime power order.

Before giving the proof, we give two examples showing how this theorem leads to
the classification of finite abelian groups.
Since all cyclic groups of order n are isomorphic to (ℤn , +), we will denote a cyclic
group of order n by ℤn .

Example 10.4.2. Classify all abelian groups of order 60. Let G be an abelian group of
order 60. From Theorem 10.4.1, G must be a direct product of cyclic groups of prime
power order. Now $60 = 2^2 \cdot 3 \cdot 5$, so the only primes involved are 2, 3, and 5. Hence, the
cyclic groups involved in the direct product decomposition of G have order 2, 4,
3, or 5 (by Lagrange’s theorem, their orders must be divisors of 60). Therefore, G must be
of one of the forms

G ≅ ℤ4 × ℤ3 × ℤ5
G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ5 .

Hence, up to isomorphism, there are only two abelian groups of order 60.

Example 10.4.3. Classify all abelian groups of order 180. Now $180 = 2^2 \cdot 3^2 \cdot 5$, so the
only primes involved are 2, 3, and 5. Hence, the cyclic groups involved in the direct
product decomposition of G have order 2, 4, 3, 9, or 5 (by Lagrange’s theorem,
their orders must be divisors of 180). Therefore, G must be of one of the forms

G ≅ ℤ4 × ℤ9 × ℤ5
G ≅ ℤ2 × ℤ2 × ℤ9 × ℤ5
G ≅ ℤ4 × ℤ3 × ℤ3 × ℤ5
G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ3 × ℤ5 .

Hence, up to isomorphism, there are four abelian groups of order 180.
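
The counting in the two examples can be automated: for each prime power $p^e$ dividing the order exactly, one chooses a partition of e. The sketch below is one way to list the candidates; the function names are our own, and it assumes, as the examples do, that different lists of prime power orders give non-isomorphic groups.

```python
from itertools import product

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def factorize(n):
    """Return the prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def abelian_groups(n):
    """List each candidate type as a tuple of cyclic prime power orders."""
    per_prime = [
        [tuple(p ** e for e in part) for part in partitions(exp)]
        for p, exp in sorted(factorize(n).items())
    ]
    return [sum(choice, ()) for choice in product(*per_prime)]
```

For instance, `abelian_groups(60)` returns the two tuples (4, 3, 5) and (2, 2, 3, 5), which correspond to the two groups of Example 10.4.2.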

The proof of Theorem 10.4.1 involves the following lemmas:

Lemma 10.4.4. Let G be a finite abelian group, and let p | |G|, where p is a prime. Then
all the elements of G, whose orders are a power of p, form a normal subgroup of G. This
subgroup is called the p-primary component of G, which we will denote by Gp .

Proof. Let p be a prime with p | |G|, and let a and b be two elements of G of order a power
of p. Since G is abelian, the order of ab divides the lcm of the orders, which is again a power
of p. Therefore, ab ∈ Gp . The order of $a^{-1}$ is the same as the order of a, so $a^{-1} \in G_p$;
therefore, Gp is a subgroup. It is normal because G is abelian.

Lemma 10.4.5. Let G be a finite abelian group of order n. Suppose that $n = p_1^{e_1} \cdots p_k^{e_k}$
with p1 , . . . , pk distinct primes. Then

G ≅ Gp1 × ⋅ ⋅ ⋅ × Gpk ,

where Gpi is the pi -primary component of G.

Proof. Each Gpi is normal since G is abelian, and since distinct primes are relatively
prime, the intersection of each Gpi with the product of the others is the identity.
Therefore, the lemma will follow by showing that each element of G is a product of
elements in the Gpi .
Let g ∈ G. Then the order of g is $p_1^{f_1} \cdots p_k^{f_k}$ with $0 \le f_i \le e_i$. For each i, we write
this as $p_i^{f_i} m_i$ with $(m_i, p_i) = 1$. Then $g^{m_i}$ has order $p_i^{f_i}$ and, hence, is in Gpi . Now
since the integers m1 , . . . , mk have no common prime factor, there exist integers
c1 , . . . , ck with

$c_1 m_1 + \cdots + c_k m_k = 1;$

hence,

$g = (g^{m_1})^{c_1} \cdots (g^{m_k})^{c_k}.$

Therefore, g is a product of elements in the Gpi .
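
The splitting of an element into components of prime power order can be computed with the extended Euclidean algorithm. The sketch below works in the additive group ℤn, where powers become multiples; the function names are our own, and the coefficients it finds are one valid choice among many.

```python
from math import gcd

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def bezout(ms):
    """Return coefficients cs with sum(c*m for c, m in zip(cs, ms)) == gcd(ms)."""
    g, cs = ms[0], [1]
    for m in ms[1:]:
        g, x, y = ext_gcd(g, m)
        cs = [c * x for c in cs] + [y]
    return cs

def primary_components(g, n):
    """Split g in Z_n (additive) into summands of prime power order."""
    order = n // gcd(g, n)
    prime_powers, m, p = [], order, 2
    while m > 1:
        q = 1
        while m % p == 0:
            m //= p
            q *= p
        if q > 1:
            prime_powers.append(q)
        p += 1
    if not prime_powers:
        return []                      # g is the identity
    cofactors = [order // q for q in prime_powers]
    cs = bezout(cofactors)             # sum of c_i * m_i equals 1
    return [(c * m * g) % n for c, m in zip(cs, cofactors)]
```

For g = 1 in ℤ60 this yields summands of orders 4, 3, and 5 whose sum is again 1 modulo 60.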

We next need the concept of a basis. Let G be any finitely generated abelian group
(finite or infinite), and let g1 , . . . , gn be a set of generators for G. The generators g1 , . . . , gn
form a basis if

G = ⟨g1 ⟩ × ⋅ ⋅ ⋅ × ⟨gn ⟩;


that is, G is the direct product of the cyclic subgroups generated by the gi . The basis
theorem for finite abelian groups says that any finite abelian group has a basis.
Suppose that G is a finite abelian group with a basis g1 , . . . , gk so that G = ⟨g1 ⟩ ×
⋅ ⋅ ⋅ × ⟨gk ⟩. Since G is finite, each gi has finite order, say mi . It follows then, from the fact
that G is a direct product, that each g ∈ G can be expressed as

$g = g_1^{n_1} \cdots g_k^{n_k}$

and, furthermore, each integer ni is unique modulo the order mi of gi . Hence,
each integer ni can be chosen in the range 0, 1, . . . , mi − 1, and within this range for the
element g, the integer ni is unique.
From the previous lemma, each finite abelian group splits into a direct product of
its p-primary components for different primes p. Hence, to complete the proof of the
basis theorem, we must show that any finite abelian group of order $p^m$ for some prime
p has a basis. We call an abelian group of order $p^m$ an abelian p-group.
Consider an abelian group G of order $p^m$ for a prime p. It is somewhat easier to
complete the proof if we consider the group using additive notation. That is, the oper-
ation is considered +, the identity as 0, and powers are given by multiples. Hence, if
an element g ∈ G has order $p^k$, then in additive notation, $p^k g = 0$. A set of elements
g1 , . . . , gk is then a basis for G if each g ∈ G can be expressed uniquely as g = m1 g1 +
⋅ ⋅ ⋅ + mk gk , where the mi are unique modulo the order of gi . We say that the g1 , . . . , gk
are independent, and this is equivalent to the fact that whenever m1 g1 + ⋅ ⋅ ⋅ + mk gk = 0,
then mi ≡ 0 modulo the order of gi . We now prove that any abelian p-group has a basis.

Lemma 10.4.6. Let G be a finite abelian group of prime power order $p^n$ for some prime p.
Then G is a direct product of cyclic groups.

Notice that in the group G, we have $p^n g = 0$ for all g ∈ G as a consequence of
Lagrange’s theorem. Furthermore, every element has as its order a power of p. The
smallest power of p, say $p^r$, such that $p^r g = 0$ for all g ∈ G, is called the exponent of G.
Any finite abelian p-group must have some exponent $p^r$.

Proof. The proof of this lemma is by induction on the exponent.


The lowest possible exponent is p. So, first, suppose that pg = 0 for all g ∈ G.
Since G is finite it has a finite system of generators. Let S = {g1 , . . . , gk } be a minimal
set of generators for G. We claim that this is a basis. Since this is a set of generators, to
show that it is a basis, we must show that they are independent. Hence, suppose that
we have

m1 g1 + ⋅ ⋅ ⋅ + mk gk = 0 (10.1)

for some set of integers mi . Since the order of each gi is p, as explained above, we may
assume that 0 ≤ mi < p for i = 1, . . . , k. Suppose that one mi ≠ 0. Then (mi , p) = 1;


hence, there exists an xi with mi xi ≡ 1 mod p (see Chapter 4). Multiplying the equa-
tion (10.1) by xi , we get, modulo p,

m1 xi g1 + ⋅ ⋅ ⋅ + gi + ⋅ ⋅ ⋅ + mk xi gk = 0,

and rearranging,

gi = −m1 xi g1 − ⋅ ⋅ ⋅ − mk xi gk ,

where the term with index i is omitted on the right-hand side.

But then gi can be expressed in terms of the other gj ; therefore, the set {g1 , . . . , gk } is
not minimal. It follows that g1 , . . . , gk constitute a basis, and the lemma is true for the
exponent p.
Now suppose that any finite abelian p-group of exponent at most $p^{n-1}$ has a basis, and
assume that G has exponent $p^n$ with n ≥ 2. Consider the set Ḡ = pG = {pg : g ∈ G}. It is
straightforward that this forms a subgroup (see exercises). Since $p^n g = 0$ for all g ∈ G, it
follows that $p^{n-1} h = 0$ for all h ∈ Ḡ, and so the exponent of Ḡ is at most $p^{n-1}$. By the
inductive hypothesis, Ḡ has a basis

S = {pg1 , . . . , pgk }.

Consider the set {g1 , . . . , gk }, and adjoin to this set the set of all elements h ∈ G, satis-
fying ph = 0. Call this set S1 , so that we have

S1 = {g1 , . . . , gk , h1 , . . . , ht }.

We claim that $S_1$ is a set of generators for G. Let g ∈ G. Then $pg \in \overline{G} = pG$, which has the
basis $pg_1, \dots, pg_k$, so that

\[ pg = m_1 pg_1 + \cdots + m_k pg_k. \]

This implies that

\[ p(g - m_1 g_1 - \cdots - m_k g_k) = 0, \]

so that $g - m_1 g_1 - \cdots - m_k g_k$ must be one of the $h_i$. Hence,

\[ g - m_1 g_1 - \cdots - m_k g_k = h_i, \quad\text{so that}\quad g = m_1 g_1 + \cdots + m_k g_k + h_i, \]

proving the claim.


Now S1 is finite, so there is a minimal subset of S1 that is still a generating system
for G. Call this S0 , and suppose that S0 , renumbering if necessary, is

S0 = {g1 , . . . , gr , h1 , . . . , hs } with phi = 0 for i = 1, . . . , s.

The subgroup generated by $h_1, \dots, h_s$ has exponent p. Therefore, by the inductive
hypothesis, it has a basis. We may assume then that $h_1, \dots, h_s$ is a basis for this subgroup and,


hence, is independent. We claim now that $g_1, \dots, g_r, h_1, \dots, h_s$ are independent and,
hence, form a basis for G.
Suppose that

\[ m_1 g_1 + \cdots + m_r g_r + n_1 h_1 + \cdots + n_s h_s = 0 \tag{10.2} \]

for some integers $m_1, \dots, m_r, n_1, \dots, n_s$. Each $m_i$, $n_i$ must be divisible by p. Suppose, for
example, that an $m_i$ is not. Then $(m_i, p) = 1$, and then $(m_i, p^n) = 1$. This implies that
there exists an $x_i$ with $m_i x_i \equiv 1 \bmod p^n$. Multiplying through by $x_i$ and rearranging, we
then obtain

\[ g_i = -m_1 x_i g_1 - \cdots - n_s x_i h_s. \]

Therefore, $g_i$ can be expressed in terms of the remaining elements of $S_0$, contradicting
the minimality of $S_0$. An identical argument works if an $n_i$ is not divisible by p.
Therefore, the relation (10.2) takes the form

a1 pg1 + ⋅ ⋅ ⋅ + ar pgr + b1 ph1 + ⋅ ⋅ ⋅ + bs phs = 0. (10.3)

Each of the terms phi = 0, so that (10.3) becomes

a1 pg1 + ⋅ ⋅ ⋅ + ar pgr = 0.

The g1 , . . . , gr are independent and, hence, ai p = 0 for each i; hence, ai = 0. Now (10.2)
becomes

n1 h1 + ⋅ ⋅ ⋅ + ns hs = 0.

However, h1 , . . . , hs are independent, so each ni = 0, completing the claim.


Therefore, the whole group G has a basis proving the lemma by induction.

For more details see the proof of the general result on modules over principal ideal
domains later in the book. There is also an additional elementary proof for the basis
theorem for finitely generated abelian groups.

10.5 Some properties of finite groups


Classification is an extremely important concept in algebra. A large part of the theory
is devoted to classifying all structures of a given type, for example all UFD’s. In most
cases, this is not possible. Since for a given finite n, there are only finitely many group
tables, it is theoretically possible to classify all groups of order n. However, even for
small n, this becomes impractical. We close the chapter by looking at some further
results on finite groups, and then using these to classify all the finite groups up to
order 10.
Before stating the classification, we give some further examples of groups that are
needed.


Example 10.5.1. In Example 9.2.6, we saw that the symmetry group of an equilateral
triangle had 6 elements, and is generated by elements r and f, which satisfy the
relations $r^3 = f^2 = 1$, $f^{-1} r f = r^{-1}$, where r is a rotation of 120° about the center of the
triangle, and f is a reflection through an altitude. This was called the dihedral group
$D_3$ of order 6.
This can be generalized to any regular n-gon, n > 2. If D is a regular n-gon, then
the symmetry group $D_n$ has 2n elements, and is called the dihedral group of order 2n.
It is generated by elements r and f, which satisfy the relations $r^n = f^2 = 1$, $f^{-1} r f = r^{n-1}$,
where r is a rotation of $\frac{2\pi}{n}$ about the center of the n-gon, and f is a reflection.
Hence, D4 , the symmetries of a square, has order 8 and D5 , the symmetries of a
regular pentagon, has order 10.
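The relations in Example 10.5.1 can be checked numerically. The following is a minimal sketch, assuming that the vertices of the n-gon are labeled 0, ..., n−1 and that permutations are stored as tuples of images; the helper names compose, power, generate, and dihedral are illustrative choices and do not come from the text.

```python
def compose(f, g):
    """Return the permutation f∘g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(len(f)))


def power(p, k):
    """Return p composed with itself k times (k >= 0)."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def generate(gens):
    """Return the set of all finite products of the given generators."""
    identity = tuple(range(len(gens[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = compose(s, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def dihedral(n):
    """Return (r, f, D_n) for the regular n-gon with vertices 0..n-1."""
    r = tuple((i + 1) % n for i in range(n))  # rotation by 2*pi/n
    f = tuple((-i) % n for i in range(n))     # reflection fixing vertex 0
    return r, f, generate([r, f])


for n in (3, 4, 5):
    r, f, group = dihedral(n)
    identity = tuple(range(n))
    # f is an involution, so f^{-1} r f is computed as f r f.
    print(n, len(group),
          power(r, n) == identity,
          compose(f, f) == identity,
          compose(f, compose(r, f)) == power(r, n - 1))
```

For n = 3, 4, 5 the generated group has 6, 8, and 10 elements, in agreement with the orders stated above.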

Example 10.5.2. Let i, j, k be the generators of the quaternions. Then we have

\[ i^2 = j^2 = k^2 = -1, \quad (-1)^2 = 1, \quad\text{and}\quad ijk = -1. \]

The elements ±1, ±i, ±j, ±k then form a group of order 8 called the quaternion group, denoted by Q.
Since $ijk = -1$, we have $ij = -ji$, and the generators i and j satisfy the relations $i^4 = j^4 = 1$,
$i^2 = j^2$, $ij = i^2 ji$.
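The relations between i and j can be verified by direct computation. The sketch below assumes Hamilton's multiplication rule (so that ij = k) and stores a quaternion a + bi + cj + dk as the integer tuple (a, b, c, d); the names qmul and closure are illustrative.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)


def closure(gens):
    """Return all finite products of the generators."""
    elements, frontier = {ONE}, [ONE]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = qmul(s, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


Q = closure([I, J])
I2 = qmul(I, I)
print(len(Q))                                     # order of the group
print(qmul(I2, I2) == ONE, I2 == qmul(J, J))      # i^4 = 1 and i^2 = j^2
print(qmul(I, J) == qmul(qmul(I2, J), I))         # ij = i^2 ji
```

The closure of {i, j} under multiplication consists of the eight elements ±1, ±i, ±j, ±k.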

We now state the main classification, and then prove it in a series of lemmas.

Theorem 10.5.3. Let G be a finite group.


(a) If |G| = 2, then G ≅ ℤ2 .
(b) If |G| = 3, then G ≅ ℤ3 .
(c) If |G| = 4, then G ≅ ℤ4 , or G ≅ ℤ2 × ℤ2 .
(d) If |G| = 5, then G ≅ ℤ5 .
(e) If |G| = 6, then G ≅ ℤ6 ≅ ℤ2 × ℤ3 , or G ≅ D3 , the dihedral group with 6 elements.
(Note D3 ≅ S3 the symmetric group on 3 symbols.)
(f) If |G| = 7, then G ≅ ℤ7 .
(g) If |G| = 8, then G ≅ ℤ8 , or G ≅ ℤ4 × ℤ2 , or G ≅ ℤ2 × ℤ2 × ℤ2 , or G ≅ D4 , the dihedral
group of order 8, or G ≅ Q, the quaternion group.
(h) If |G| = 9, then G ≅ ℤ9 , or G ≅ ℤ3 × ℤ3 .
(i) If |G| = 10, then G ≅ ℤ10 ≅ ℤ2 × ℤ5 , or G ≅ D5 , the dihedral group with 10 elements.

Recall from Section 10.1, that a finite group of prime order must be cyclic. Hence,
in the theorem, the cases |G| = 2, 3, 5, 7 are handled. We next consider the case, where
G has order p2 , and where p is a prime.

Definition 10.5.4. If G is a group, then its center denoted Z(G), is the set of elements
in G, which commute with everything in G. That is,

Z(G) = {g ∈ G : gh = hg for any h ∈ G}.

Lemma 10.5.5. For any group G the following hold:


(a) The center Z(G) is a normal subgroup.


(b) G = Z(G) if and only if G is abelian.
(c) If G/Z(G) is cyclic, then G is abelian.

Proof. (a) and (b) are direct, and we leave them to the exercises. Consider the case,
where G/Z(G) is cyclic, generated by the coset gZ(G) with g ∈ G. Then each coset of Z(G)
has the form $g^m Z(G)$ for some integer m. Let a, b ∈ G. Then since a, b are in cosets of
the center, we have $a = g^m u$ and $b = g^n v$ with u, v ∈ Z(G). Then

\[ ab = (g^m u)(g^n v) = (g^m g^n)(uv) = (g^n g^m)(vu) = (g^n v)(g^m u) = ba \]

since u, v commute with everything. Therefore, G is abelian.
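For small groups, the center can be computed by brute force directly from Definition 10.5.4. The following sketch assumes that $S_3$ and $D_4$ are realized as permutation groups on {0, 1, 2} and {0, 1, 2, 3}, respectively; the function names are illustrative.

```python
from itertools import permutations


def compose(f, g):
    """Return the permutation f∘g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(len(f)))


def power(p, k):
    """Return p composed with itself k times (k >= 0)."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def center(group):
    """Return the elements commuting with every element of the group."""
    return {z for z in group
            if all(compose(z, h) == compose(h, z) for h in group)}


S3 = set(permutations(range(3)))

# D4 as the symmetries of a square with vertices 0, 1, 2, 3.
r = (1, 2, 3, 0)   # rotation by a quarter turn
f = (0, 3, 2, 1)   # reflection fixing vertices 0 and 2
D4 = ({power(r, k) for k in range(4)}
      | {compose(power(r, k), f) for k in range(4)})

print(sorted(center(S3)))   # only the identity
print(sorted(center(D4)))   # the identity and the half turn r^2
```

Here $Z(D_4)$ has order 2, so $D_4/Z(D_4)$ has order 4; by Lemma 10.5.5 (c), this factor group cannot be cyclic, since $D_4$ is nonabelian.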

A p-group is any finite group of prime power order $p^k$. We need the following lemma,
whose proof is based on what is called the class equation, which we will prove in
Chapter 13.

Lemma 10.5.6. A nontrivial finite p-group has a nontrivial center of order at least p.

Lemma 10.5.7. If $|G| = p^2$ with p a prime, then G is abelian; hence, $G \cong \mathbb{Z}_{p^2}$, or
$G \cong \mathbb{Z}_p \times \mathbb{Z}_p$.

Proof. Suppose that $|G| = p^2$. Then from the previous lemma, G has a nontrivial center;
hence, $|Z(G)| = p$, or $|Z(G)| = p^2$. If $|Z(G)| = p^2$, then G = Z(G), and G is abelian. If
$|Z(G)| = p$, then $|G/Z(G)| = p$. Since p is a prime this implies that G/Z(G) is cyclic;
hence, from Lemma 10.5.5, G is abelian.

Lemma 10.5.7 handles the cases n = 4 and n = 9. Therefore, if |G| = 4, we must


have G ≅ ℤ4 , or G ≅ ℤ2 × ℤ2 , and if |G| = 9, we must have G ≅ ℤ9 , or G ≅ ℤ3 × ℤ3 .
This leaves n = 6, 8, 10. We next handle the cases 6 and 10.

Lemma 10.5.8. If G is any group, where every nontrivial element has order 2, then G is
abelian.

Proof. Suppose that $g^2 = 1$ for all g ∈ G. This implies that $g = g^{-1}$ for all g ∈ G. Let a, b
be arbitrary elements of G. Then

\[ (ab)^2 = 1 \implies abab = 1 \implies ab = b^{-1}a^{-1} = ba. \]

Therefore, a, b commute, and G is abelian.

Lemma 10.5.9. If |G| = 6, then G ≅ ℤ6 , or G ≅ D3 .

Proof. Suppose first that G is abelian. Since 6 = 2 ⋅ 3, we then have G ≅ ℤ2 × ℤ3. Notice
that if an abelian group has an element of order m and an element of order n with
(n, m) = 1, then it has an element of order mn. Therefore, if G is abelian of order 6,
there is an element of order 6; hence, G ≅ ℤ2 × ℤ3 ≅ ℤ6.


Now suppose that G is nonabelian. The nontrivial elements of G have orders 2, 3,
or 6. If there is an element of order 6, then G is cyclic, and hence abelian. If every
nontrivial element has order 2, then G is abelian. Therefore, there is an element of order 3, say g ∈ G.
The cyclic subgroup $\langle g\rangle = \{1, g, g^2\}$ then has index 2 in G and is, therefore, normal. Let
h ∈ G with h ∉ ⟨g⟩. Since $g$ and $g^2$ both generate ⟨g⟩, we must have ⟨g⟩ ∩ ⟨h⟩ = {1}. If h
also had order 3, then

\[ |\langle g, h\rangle| = \frac{|\langle g\rangle|\,|\langle h\rangle|}{|\langle g\rangle \cap \langle h\rangle|} = 9, \]

which is impossible. Therefore, h must have order 2. Since ⟨g⟩ is normal, we have
$h^{-1}gh = g^t$ for t = 1, 2. If $h^{-1}gh = g$, then g, h commute, and the group G is abelian.
Therefore, $h^{-1}gh = g^2 = g^{-1}$. It follows that g, h generate a subgroup of G, satisfying

\[ g^3 = h^2 = 1, \quad h^{-1}gh = g^{-1}. \]

This defines a subgroup of order 6 isomorphic to D3 and, hence, must be all of G.

Lemma 10.5.10. If |G| = 10, then G ≅ ℤ10, or G ≅ D5 .

Proof. The proof is almost identical to that for n = 6. Since 10 = 2 ⋅ 5, if G were abelian,
G ≅ ℤ2 × ℤ5 ≅ ℤ10.
Now suppose that G is nonabelian. As for n = 6, G must contain a normal cyclic
subgroup of order 5, say $\langle g\rangle = \{1, g, g^2, g^3, g^4\}$. If h ∉ ⟨g⟩, then exactly as for n = 6, it
follows that h must have order 2, and $h^{-1}gh = g^t$ for t = 1, 2, 3, 4. If $h^{-1}gh = g$, then g, h
commute, and G is abelian. Notice that $h^{-1} = h$. Suppose that $h^{-1}gh = hgh = g^2$. Then

\[ g = h^2 g h^2 = h(hgh)h = h g^2 h = (hgh)^2 = g^4 \implies g^3 = 1, \]

and together with $g^5 = 1$ this gives g = 1, which is a contradiction. Similarly, $hgh = g^3$
leads to a contradiction. Therefore, $h^{-1}gh = g^4 = g^{-1}$, and g, h generate a subgroup of
order 10, satisfying

\[ g^5 = h^2 = 1; \quad h^{-1}gh = g^{-1}. \]

Therefore, this is all of G, and is isomorphic to D5 .

This leaves the case n = 8, the most difficult. If |G| = 8, and G is abelian, then
clearly, G ≅ ℤ8 , or G ≅ ℤ4 × ℤ2 , or G ≅ ℤ2 × ℤ2 × ℤ2 . The proof of Theorem 10.5.3 is
then completed with the following:

Lemma 10.5.11. If G is a nonabelian group of order 8, then G ≅ D4 , or G ≅ Q.

Proof. The nontrivial elements of G have orders 2, 4, or 8. If there is an element of
order 8, then G is cyclic, and hence abelian, whereas if every nontrivial element has order 2, then
G is abelian. Hence, we may assume that G has an element of order 4, say g. Then ⟨g⟩
has index 2 and is a normal subgroup. First, suppose that G has an element h ∉ ⟨g⟩ of
order 2. Then

\[ h^{-1}gh = g^t \quad\text{for some } t = 1, 2, 3. \]


If $h^{-1}gh = g$, then as in the cases 6 and 10, ⟨g, h⟩ defines an abelian subgroup of order 8;
hence, G is abelian. If $h^{-1}gh = g^2$, then

\[ (h^{-1}gh)^2 = (g^2)^2 = g^4 = 1 \implies g = h^{-2}gh^2 = h^{-1}g^2h = g^4 \implies g^3 = 1, \]

contradicting the fact that g has order 4. Therefore, $h^{-1}gh = g^3 = g^{-1}$. It follows that
g, h define a subgroup of order 8, isomorphic to $D_4$. Since |G| = 8, this must be all of G
and $G \cong D_4$.
Therefore, we may now assume that every element h ∈ G with h ∉ ⟨g⟩ has
order 4. Let h be such an element. Then $h^2$ has order 2, so $h^2 \in \langle g\rangle$, which implies that
$h^2 = g^2$. This further implies that $g^2$ is central; that is, commutes with everything.
Identifying g with i, h with j, and $g^2$ with −1, we get that G is isomorphic to Q, completing
Lemma 10.5.11 and the proof of Theorem 10.5.3.

In principle, this type of analysis can be used to determine the structure of any fi-
nite group, although it quickly becomes impractical. A major tool in this classification
is the following important result known as the Sylow theorem, which we just state. We
will prove this theorem in Chapter 13. If $|G| = p^m n$ with p a prime and (n, p) = 1, then
a subgroup of G of order $p^m$ is called a p-Sylow subgroup. It is not clear at first that a
group will contain p-Sylow subgroups.

Theorem 10.5.12 (Sylow theorem). Let $|G| = p^m n$ with p a prime and (n, p) = 1.
(a) G contains a p-Sylow subgroup.
(b) All p-Sylow subgroups of G are conjugate.
(c) Any p-subgroup of G is contained in a p-Sylow subgroup.
(d) The number of p-Sylow subgroups of G is of the form 1 + kp for some integer k ≥ 0, and divides n.
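Part (d) already restricts the possible number of p-Sylow subgroups considerably. As a small illustration, the following sketch lists the values permitted by (d) for a given group order; the function name sylow_counts is hypothetical, and p is assumed to be a prime dividing the order.

```python
def sylow_counts(order, p):
    """List the numbers of p-Sylow subgroups allowed by part (d).

    Assumes p is a prime dividing order. Writing order = p^m * n with
    (n, p) = 1, the count must divide n and be congruent to 1 mod p.
    """
    n = order
    while n % p == 0:
        n //= p
    return [d for d in range(1, n + 1) if n % d == 0 and d % p == 1]


print(sylow_counts(10, 5))   # candidates for the number of 5-Sylow subgroups
print(sylow_counts(6, 3))
print(sylow_counts(24, 2))
```

For |G| = 10 and p = 5, the only admissible value is 1. By part (b), a unique p-Sylow subgroup coincides with all of its conjugates and is, therefore, normal; this agrees with the normal cyclic subgroup of order 5 used in the proof of Lemma 10.5.10.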

10.6 Automorphisms of a group


Let G be a group. A homomorphism f : G → G is called an automorphism of G if f is
bijective. Let Aut(G) be the set of all automorphisms of G.

Theorem 10.6.1. Aut(G) is a group.

Proof. The identity map 1 is the identity of Aut(G).
Let f, g ∈ Aut(G).
Then certainly fg ∈ Aut(G). Now

\[
\begin{aligned}
f^{-1}(ab) &= f^{-1}\bigl(f f^{-1}(a)\, f f^{-1}(b)\bigr) \\
           &= f^{-1}\bigl(f(f^{-1}(a)\, f^{-1}(b))\bigr) \\
           &= f^{-1}(a)\, f^{-1}(b)
\end{aligned}
\]

for a, b ∈ G, because f ∈ Aut(G).
Hence, $f^{-1}$ ∈ Aut(G).


A special automorphism of G is as follows: Let a ∈ G, and

\[ i_a : G \to G, \quad i_a(x) = axa^{-1}. \]

By Lemma 10.1.3, we have that $i_a$ ∈ Aut(G).

Definition 10.6.2. ia is called an inner automorphism of G by a.


Let Inn(G) be the set of all inner automorphisms of G.

Theorem 10.6.3. The map φ : G → Inn(G), a ↦ $i_a$, is an epimorphism; that is, a
surjective homomorphism.

Proof. Certainly φ(G) = Inn(G). We have the following:

\[
\begin{aligned}
\varphi(a)\varphi(b)(x) &= i_a(i_b(x)) = i_a(bxb^{-1}) \\
                        &= abxb^{-1}a^{-1} = (ab)x(ab)^{-1} \\
                        &= i_{ab}(x) = \varphi(ab)(x),
\end{aligned}
\]

that is, φ(ab) = φ(a)φ(b).

Theorem 10.6.4. Inn(G) is a normal subgroup of Aut(G); that is, Inn(G) ⊲ Aut(G).

Proof. From Theorem 10.6.3, Inn(G) is a homomorphic image φ(G) of G. Therefore,
Inn(G) < Aut(G). Let f ∈ Aut(G). Then

\[
\begin{aligned}
f i_a f^{-1}(x) &= f(a f^{-1}(x) a^{-1}) = f(a)\, f f^{-1}(x)\, f(a^{-1}) \\
                &= f(a)\, x\, (f(a))^{-1} = i_{f(a)}(x),
\end{aligned}
\]

that is, $f i_a f^{-1} = i_{f(a)}$ ∈ Inn(G).

We now consider the kernel ker(φ) of the map φ : G → Aut(G), a ↦ $i_a$.
We have

\[
\begin{aligned}
\ker(\varphi) &= \{a \in G : i_a(x) = x \text{ for all } x \in G\} \\
              &= \{a \in G : axa^{-1} = x \text{ for all } x \in G\}.
\end{aligned}
\]

Hence, ker(φ) = Z(G), the center of G. Now, from Theorem 10.2.3, we get the following:

Theorem 10.6.5.

Inn(G) ≅ G/Z(G)
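Theorem 10.6.5 implies that a finite group G has exactly |G|/|Z(G)| distinct inner automorphisms. The following sketch checks this count for $S_3$ and $D_4$, assuming both groups are realized as permutation groups stored as tuples of images; all function names are illustrative.

```python
from itertools import permutations


def compose(f, g):
    """Return the permutation f∘g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(len(f)))


def inverse(f):
    """Return the inverse permutation of f."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)


def center(group):
    """Return the elements commuting with every element of the group."""
    return {z for z in group
            if all(compose(z, h) == compose(h, z) for h in group)}


def inner_automorphisms(group):
    """Return the distinct maps x -> a x a^{-1}, each as a tuple of images."""
    elems = sorted(group)
    return {tuple(compose(compose(a, x), inverse(a)) for x in elems)
            for a in elems}


S3 = set(permutations(range(3)))

# D4 as the symmetries of a square: rotations r^k and reflections r^k f.
r, f = (1, 2, 3, 0), (0, 3, 2, 1)
rotations = [(0, 1, 2, 3)]
for _ in range(3):
    rotations.append(compose(r, rotations[-1]))
D4 = set(rotations) | {compose(p, f) for p in rotations}

for name, G in (("S3", S3), ("D4", D4)):
    print(name, len(inner_automorphisms(G)), len(G) // len(center(G)))
```

In both cases the two printed numbers coincide: 6 for $S_3$, whose center is trivial, and 4 for $D_4$, whose center has order 2.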

Let G be a group and f ∈ Aut(G). If a ∈ G has order n, then f (a) also has order n; if
a ∈ G has infinite order then f (a) also has infinite order.


Example 10.6.6. Let V ≅ ℤ2 × ℤ2; that is, V has four elements 1, a, b and ab with
$a^2 = b^2 = (ab)^2 = 1$.
V is often called the Klein four group. An automorphism of V permutes the three
elements a, b and ab of order 2, and each permutation of {a, b, ab} defines an automor-
phism of V. Hence, Aut(V) ≅ S3 .

Example 10.6.7.

S3 ≅ Inn(S3 ) = Aut(S3 ).

By Theorem 10.6.5, we have S3 ≅ Inn(S3 ), because Z(S3 ) = {1}. Now, let f ∈ Aut(S3 ).
Analogously, as in Example 10.6.6, the automorphism f permutes the three transposi-
tions (1, 2), (1, 3), and (2, 3). This gives | Aut(S3 )| ≤ |S3 | = 6, because S3 is generated by
these transpositions. From S3 ≅ Inn(S3 ) ⊲ Aut(S3 ), we have | Aut(S3 )| ≥ 6.
Hence, Aut(S3 ) ≅ Inn(S3 ) ≅ S3 .

Example 10.6.8. Let $G_n = \langle g\rangle \cong (\mathbb{Z}_n, +)$, n ∈ ℕ, be a cyclic group of order n. If
$f \in \mathrm{Aut}(G_n)$, then $G_n = \langle f(g)\rangle = \langle g^k\rangle$, and (k, n) = 1 by Theorem 9.5.8. Hence,
$\mathrm{Aut}(G_n) \cong \mathbb{Z}_n^{\star}$, the group of units of the ring $\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$. In particular,
$|\mathrm{Aut}(G_n)| = \varphi(n)$. If n = p is a prime number, then $\mathrm{Aut}(G_p) \cong \mathbb{Z}_p^{\star}$ is cyclic by Theorem 9.5.16.
In general, $\mathrm{Aut}(G_n)$ is not cyclic. If, for instance, n = 8, then φ(8) = 4. The four
automorphisms of $G_8$ are given by $f_1(g) = g$, $f_2(g) = g^3$, $f_3(g) = g^5$, and $f_4(g) = g^7$.
We have $f_i^2(g) = g$ for i = 1, 2, 3, 4. Hence, $\mathrm{Aut}(G_8) \cong \mathbb{Z}_2 \times \mathbb{Z}_2$.
We remark that certainly $\mathrm{Aut}(\mathbb{Z}, +) \cong \mathbb{Z}_2$, because f(1) = 1 or f(1) = −1 for
$f \in \mathrm{Aut}(\mathbb{Z}, +)$.
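Since an automorphism of $G_n$ is determined by the exponent k with f(g) = $g^k$, the statements of Example 10.6.8 reduce to arithmetic modulo n. The following is a minimal sketch, assuming n ≥ 2; the helper names units, order_mod, and is_cyclic are illustrative.

```python
from math import gcd


def units(n):
    """Return the units of Z_n, that is, all k in 1..n with gcd(k, n) = 1."""
    return [k for k in range(1, n + 1) if gcd(k, n) == 1]


def order_mod(k, n):
    """Return the multiplicative order of the unit k modulo n (n >= 2)."""
    x, t = k % n, 1
    while x != 1:
        x = x * k % n
        t += 1
    return t


def is_cyclic(n):
    """Check whether the unit group of Z_n has an element of full order."""
    u = units(n)
    return any(order_mod(k, n) == len(u) for k in u)


print(units(8))                             # exponents k of the four automorphisms of G_8
print([order_mod(k, 8) for k in units(8)])  # every non-identity automorphism has order 2
print(is_cyclic(7), is_cyclic(8))
```

Every unit modulo 8 squares to 1, so the unit group has no element of order 4; this is the computation behind the statement that the automorphism group of $G_8$ is isomorphic to ℤ2 × ℤ2.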

10.7 Exercises
1. Prove that if G is cyclic, then any factor group of G is also cyclic.
2. Prove that for any group G, the center Z(G) is a normal subgroup, and G = Z(G) if
and only if G is abelian.
3. Let U1 and U2 be subgroups of a group G. Let x, y ∈ G. Show the following:
(i) If xU1 = yU2 , then U1 = U2 .
(ii) Give an example showing that xU1 = U2 x does not imply U1 = U2 .
4. Let U, V be subgroups of a group G. Let x, y ∈ G. If UxV ∩ UyV ≠ ∅, then UxV = UyV.
5. Let N be a cyclic normal subgroup of the group G. Then all subgroups of N are
normal subgroups of G. Give an example to show that the statement is not correct
if N is not cyclic.
6. Let N1 and N2 be normal subgroups of G. Show the following:
(i) If all elements in N1 and N2 have finite order, then so do all the elements of N1 N2 .
(ii) Let e1 , e2 ∈ ℕ. If $n_i^{e_i} = 1$ for all $n_i \in N_i$ (i = 1, 2), then $x^{e_1 e_2} = 1$ for all $x \in N_1 N_2$.
7. Find groups N1 , N2 and G with N1 ⊲ N2 ⊲ G, but N1 is not a normal subgroup of G.


8. Let G be a group generated by a and b and let $bab^{-1} = a^r$ and $a^n = 1$ for suitable
r ∈ ℤ, n ∈ ℕ. Show the following:
(i) The subgroup A := ⟨a⟩ is a normal subgroup of G.
(ii) G/A = ⟨bA⟩.
(iii) $G = \{b^j a^i : i, j \in \mathbb{Z}\}$.
9. Prove that no group of order 24 is simple.
10. Let G be a group with subgroups G1 , G2 . Then the following are equivalent:
(i) G ≅ G1 × G2 ;
(ii) G1 ⊲ G, G2 ⊲ G, G = G1 G2 , and G1 ∩ G2 = {1};
(iii) Every g ∈ G has a unique expression g = g1 g2 , where g1 ∈ G1 , g2 ∈ G2 , and
g1 g2 = g2 g1 for each g1 ∈ G1 , g2 ∈ G2 .
11. Suppose that G is a finite group with normal subgroups G1 , G2 such that
(|G1 |, |G2 |) = 1. If |G| = |G1 ||G2 |, then G ≅ G1 × G2 .
12. Let G be a group with normal subgroups G1 and G2 such that G = G1 G2 . Then

G/(G1 ∩ G2 ) ≅ G1 /(G1 ∩ G2 ) × G2 /(G1 ∩ G2 ).

11 Symmetric and alternating groups
11.1 Symmetric groups and cycle decomposition
Groups most often appear as groups of transformations or permutations on a set. In
Galois Theory, groups will appear as permutation groups on the zeros of a polynomial.
In Section 9.3, we introduced permutation groups and the symmetric group Sn . In this
chapter, we look more carefully at the structure of Sn , and for each n introduce a very
important normal subgroup, An of Sn , called the alternating group on n symbols.
Recall that if A is a set, a permutation on A is a one-to-one mapping of A onto
itself. The set SA of all permutations on A forms a group under composition called the
symmetric group on A. If |A| > 2, then SA is nonabelian. Furthermore, if A, B have the
same cardinality, then SA ≅ SB .
If |A| = n, then |SA | = n! and, in this case, we denote SA by Sn , called the symmetric
group on n symbols. For example, |S3 | = 6. In Example 9.3.5, we showed that the six
elements of S3 can be given by the following:

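\[
1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad
a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad
b = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},
\]
\[
c = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad
d = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \quad
e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.
\]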

In addition, we saw that S3 has a presentation given by

\[ S_3 = \langle a, c;\ a^3 = c^2 = 1,\ ac = ca^2 \rangle. \]

By this, we mean that $S_3$ is generated by a and c, and that the whole group and its
multiplication table can be generated by using the relations
$a^3 = c^2 = 1$, $ac = ca^2$.
In general, a permutation group is any subgroup of SA for a set A.
For the remainder of this chapter, we will only consider finite symmetric groups
Sn and always consider the set A as A = {1, 2, 3, . . . , n}.

Definition 11.1.1. Suppose that f is a permutation of A = {1, 2, . . . , n}, which has the
following effect on the elements of A: There exist distinct elements $a_1, \dots, a_k \in A$ such
that $f(a_1) = a_2$, $f(a_2) = a_3$, . . . , $f(a_{k-1}) = a_k$, $f(a_k) = a_1$, and f leaves all other
elements (if there are any) of A fixed; that is, f(a) = a for every a ∈ A with $a \neq a_i$,
i = 1, 2, . . . , k. Such a permutation f is called a cycle or a k-cycle.

We use the following notation for a k-cycle, f , as given above:

f = (a1 , a2 , . . . , ak ).

https://doi.org/10.1515/9783110603996-011


The cycle notation is read from left to right. It says f takes a1 into a2 , a2 into a3 , et
cetera, and finally ak , the last symbol, into a1 , the first symbol. Moreover, f leaves all
the other elements not appearing in the representation above fixed.
Note that one can write the same cycle in many ways using this type of notation;
for example, f = (a2 , a3 , . . . , ak , a1 ). In fact, any cyclic rearrangement of the symbols
gives the same cycle. The integer k is the length of the cycle. Note we allow a cycle
to have length 1, that is, f = (a1 ), for instance. This is just the identity map. For this
reason, we will usually designate the identity of Sn by (1), or just 1. (Of course, it also
could be written as (ai ), where ai ∈ A.)
If f and g are two cycles, they are called disjoint cycles if the elements moved by
one are left fixed by the other; that is, their representations contain different elements
of the set A (their representations are disjoint as sets).

Lemma 11.1.2. If f and g are disjoint cycles, then they must commute; that is, fg = gf .

Proof. Since the cycles f and g are disjoint, each element moved by f is fixed by g, and
vice versa. First, suppose $f(a_i) \neq a_i$. This implies that $g(a_i) = a_i$, and, since f is
one-to-one, $f^2(a_i) \neq f(a_i)$. Hence $f(a_i)$ is also moved by f, so that $g(f(a_i)) = f(a_i)$.
Thus, $(fg)(a_i) = f(g(a_i)) = f(a_i)$, whereas $(gf)(a_i) = g(f(a_i)) = f(a_i)$. Similarly, if
$g(a_j) \neq a_j$, then $(fg)(a_j) = (gf)(a_j)$. Finally, if $f(a_k) = a_k$ and $g(a_k) = a_k$, clearly
then, $(fg)(a_k) = a_k = (gf)(a_k)$. Thus, gf = fg.

Before proceeding further with the theory, let us consider a specific example. Let
A = {1, 2, . . . , 8}, and let

\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 2 & 4 & 6 & 5 & 1 & 7 & 3 & 8 \end{pmatrix}. \]

We pick an arbitrary number from the set A, say 1. Then f (1) = 2, f (2) = 4, f (4) = 5,
f (5) = 1. Now select an element from A not in the set {1, 2, 4, 5}, say 3. Then f (3) = 6,
f (6) = 7, f (7) = 3. Next select any element of A not occurring in the set {1, 2, 4, 5} ∪
{3, 6, 7}. The only element left is 8, and f (8) = 8. It is clear that we can now write the
permutation f as a product of cycles:

f = (1, 2, 4, 5)(3, 6, 7)(8),

where the order of the cycles is immaterial since they are disjoint and, therefore, com-
mute. It is customary to omit such cycles as (8) and write f simply as

f = (1, 2, 4, 5)(3, 6, 7)

with the understanding that the elements of A not appearing are left fixed by f .
It is not difficult to generalize what was done here for a specific example, and
show that any permutation f can be written uniquely, except for order, as a product of
disjoint cycles. Thus, let f be a permutation on the set A = {1, 2, . . . , n}, and let a1 ∈ A.
Let f (a1 ) = a2 , f 2 (a1 ) = f (a2 ) = a3 , et cetera, and continue until a repetition is obtained.


We claim that this first occurs for $a_1$; that is, the first repetition is, say $f^k(a_1) = f(a_k) =
a_{k+1} = a_1$. For suppose the first repetition occurs at the k-th iterate of f and

\[ f^k(a_1) = f(a_k) = a_{k+1}, \]

and $a_{k+1} = a_j$, where j < k. Then

\[ f^k(a_1) = f^{j-1}(a_1), \]

and so $f^{k-j+1}(a_1) = a_1$. However, k − j + 1 < k if j ≠ 1, and we assumed that the first
repetition occurred for k. Thus, j = 1, and so f does cyclically permute the set $\{a_1, a_2, \dots, a_k\}$.
If k < n, then there exists b1 ∈ A such that b1 ∉ {a1 , a2 , . . . , ak }, and we may proceed
similarly with b1 . We continue in this manner until all the elements of A are accounted
for. It is then seen that f can be written in the form

f = (a1 , . . . , ak )(b1 , . . . , bℓ )(c1 , . . . , cm ) ⋅ ⋅ ⋅ (h1 , . . . , ht ).

Note that all powers $f^i(a_1)$ belong to the set
$\{a_1 = f^0(a_1) = f^k(a_1),\ a_2 = f^1(a_1),\ \dots,\ a_k = f^{k-1}(a_1)\}$; all powers $f^i(b_1)$ belong to the set
$\{b_1 = f^0(b_1) = f^{\ell}(b_1),\ b_2 = f^1(b_1),\ \dots,\ b_{\ell} = f^{\ell-1}(b_1)\}$; and so on. Here, by definition,
$b_1$ is the smallest element in {1, 2, . . . , n}, which does not belong to $\{a_1, \dots, a_k\}$; $c_1$ is the
smallest element in {1, 2, . . . , n}, which does not belong to

\[ \{a_1, a_2, \dots, a_k\} \cup \{b_1, b_2, \dots, b_{\ell}\}. \]

Therefore, by construction, all the cycles are disjoint. From this, it follows that k + ℓ +
m + ⋅ ⋅ ⋅ + t = n. It is clear that this factorization is unique, except for the order of the
factors, since it tells explicitly what effect f has on each element of A.
In summary, we have proven the following result.

Theorem 11.1.3. Every permutation of Sn can be written uniquely as a product of disjoint
cycles (up to order).
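The construction used in the proof of Theorem 11.1.3 is an algorithm: start with the smallest element not yet visited and follow its images under f until the cycle closes. The sketch below implements this procedure, assuming a permutation is given as a Python dict mapping each element of {1, . . . , n} to its image; the function name cycle_decomposition is illustrative.

```python
def cycle_decomposition(perm):
    """Return the disjoint cycles of perm, omitting cycles of length 1.

    perm is a dict mapping each element of {1, ..., n} to its image.
    Each cycle starts with the smallest element it contains.
    """
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [start], perm[start]
        seen.add(start)
        while x != start:        # follow start, f(start), f^2(start), ...
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


# The permutation f of {1, ..., 8} from the example above.
f = dict(zip(range(1, 9), [2, 4, 6, 5, 1, 7, 3, 8]))
print(cycle_decomposition(f))
```

For the permutation f of the example, this returns the cycles (1, 2, 4, 5) and (3, 6, 7); the fixed element 8 is omitted, as is customary.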

Example 11.1.4. The elements of S3 can be written in cycle notation as 1 = (1),
(1, 2), (1, 3), (2, 3), (1, 2, 3), (1, 3, 2). This is the largest symmetric group, which consists
entirely of cycles.
In S4 , for example, the element (1, 2)(3, 4) is not a cycle, but a product of cycles.
Suppose we multiply two elements of S3 , say (1, 2) and (1, 3). In forming the product
or composition here, we read from right to left. Thus, to compute (1, 2)(1, 3): We note
the permutation (1, 3) takes 1 into 3, and then the permutation (1, 2) takes 3 into 3.
Therefore, the composite (1, 2)(1, 3) takes 1 into 3. Continuing the permutation, (1, 3)
takes 3 into 1, and then the permutation (1, 2) takes 1 into 2. Therefore, the composite


(1, 2)(1, 3) takes 3 into 2. Finally, (1, 3) takes 2 into 2, and then (1, 2) takes 2 into 1. So
(1, 2)(1, 3) takes 2 into 1. Thus, we see

(1, 2)(1, 3) = (1, 3, 2).

As another example of this cycle multiplication consider the product in S5 ,

(1, 2)(2, 4, 5)(1, 3)(1, 2, 5).

Reading from right to left, 1 ↦ 2 ↦ 2 ↦ 4 ↦ 4, so 1 ↦ 4. Now 4 ↦ 4 ↦ 4 ↦ 5 ↦ 5,
so 4 ↦ 5. Next 5 ↦ 1 ↦ 3 ↦ 3 ↦ 3, so 5 ↦ 3. Then 3 ↦ 3 ↦ 1 ↦ 1 ↦ 2, so 3 ↦ 2.
Finally, 2 ↦ 5 ↦ 5 ↦ 2 ↦ 1, so 2 ↦ 1. Since all the elements of A = {1, 2, 3, 4, 5} have
been accounted for, we have

(1, 2)(2, 4, 5)(1, 3)(1, 2, 5) = (1, 4, 5, 3, 2).
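The right-to-left rule for multiplying cycles is easy to mechanize. The following sketch assumes that cycles are given as tuples and that permutations of {1, . . . , n} are stored as dicts; the helper names cycle_to_map and product are illustrative.

```python
def cycle_to_map(cycle, n):
    """Return the permutation of {1, ..., n} defined by a single cycle."""
    m = {x: x for x in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        m[a] = b
    return m


def product(cycles, n):
    """Multiply cycles as permutations of {1, ..., n}, reading right to left."""
    result = {x: x for x in range(1, n + 1)}
    for cycle in reversed(cycles):       # the rightmost cycle acts first
        m = cycle_to_map(cycle, n)
        result = {x: m[result[x]] for x in result}
    return result


print(product([(1, 2), (1, 3)], 3) == cycle_to_map((1, 3, 2), 3))
print(product([(1, 2), (2, 4, 5), (1, 3), (1, 2, 5)], 5)
      == cycle_to_map((1, 4, 5, 3, 2), 5))
```

Both comparisons evaluate to True, reproducing the two products computed by hand above.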

Let f ∈ Sn . If f is a cycle of length 2, that is, f = (a1 , a2 ), where a1 , a2 ∈ A, then f is


called a transposition. Any cycle can be written as a product of transpositions, namely,

(a1 , . . . , ak ) = (a1 , ak )(a1 , ak−1 ) ⋅ ⋅ ⋅ (a1 , a2 ).

From Theorem 11.1.3, any permutation can be written in terms of cycles, but from the
above, any cycle can be written as a product of transpositions. Thus, we have the fol-
lowing result:

Theorem 11.1.5. Let f ∈ Sn be any permutation. Then f can be written as a product of


transpositions.
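The factorization of a cycle into transpositions given above can be tested on examples. The sketch below is a minimal check under the same right-to-left convention; the function names are illustrative and not part of the text.

```python
def cycle_to_map(cycle, n):
    """Return the permutation of {1, ..., n} defined by a single cycle."""
    m = {x: x for x in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        m[a] = b
    return m


def product(cycles, n):
    """Multiply cycles as permutations of {1, ..., n}, reading right to left."""
    result = {x: x for x in range(1, n + 1)}
    for cycle in reversed(cycles):
        m = cycle_to_map(cycle, n)
        result = {x: m[result[x]] for x in result}
    return result


def to_transpositions(cycle):
    """Write (a1, ..., ak) as (a1, ak)(a1, a_{k-1}) ... (a1, a2)."""
    a1 = cycle[0]
    return [(a1, x) for x in reversed(cycle[1:])]


cycle = (1, 2, 4, 5)
factors = to_transpositions(cycle)
print(factors)
print(product(factors, 5) == cycle_to_map(cycle, 5))
```

A k-cycle is written here as a product of k − 1 transpositions; for the 4-cycle (1, 2, 4, 5) the factors are (1, 5), (1, 4), (1, 2).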

11.2 Parity and the alternating groups


If f is a permutation with a cycle decomposition

(a1 , . . . , ak )(b1 , . . . , bj ) ⋅ ⋅ ⋅ (m1 , . . . , mt ),

then f can be written as a product of

W(f ) = (k − 1) + (j − 1) + ⋅ ⋅ ⋅ + (t − 1)

transpositions. The number W(f ) is uniquely associated with the permutation f since
f is uniquely represented (up to order) as a product of disjoint cycles. However, there
is nothing unique about the number of transpositions occurring in an arbitrary repre-
sentation of f as a product of transpositions. For example, in S3 ,

(1, 3, 2) = (1, 2)(1, 3) = (1, 2)(1, 3)(1, 2)(1, 2),

since (1, 2)(1, 2) = (1), the identity permutation of S3 .


Although the number of transpositions is not unique in the representation of a
permutation f as a product of transpositions, we will show that the parity (evenness
or oddness) of that number is unique. Moreover, this depends solely on the number
W(f ) uniquely associated with the representation of f . More explicitly, we have the
following result:

Theorem 11.2.1. If f is a permutation written as a product of disjoint cycles, and if W(f )
is the associated integer given above, then if W(f ) is even (odd), any representation of
f , as a product of transpositions, must contain an even (odd) number of transpositions.

Proof. We first observe the following:

(a, b)(b, c1 , . . . , ct )(a, b1 , . . . , bk ) = (a, b1 , . . . , bk , b, c1 , . . . , ct ),


(a, b)(a, b1 , . . . , bk , b, c1 , . . . , ct ) = (a, b1 , . . . , bk )(b, c1 , . . . , ct ).

Suppose now that f is represented as a product of disjoint cycles, where we include all
the 1-cycles of elements of A, which f fixes, if any. If a and b occur in the same cycle in
this representation for f ,

f = ⋅ ⋅ ⋅ (a, b1 , . . . , bk , b, c1 , . . . , ct ) ⋅ ⋅ ⋅ ,

then, in the computation of W(f ), this cycle contributes k + t + 1. Now consider (a, b)f .
Since the cycles are disjoint and disjoint cycles commute,

(a, b)f = ⋅ ⋅ ⋅ (a, b)(a, b1 , . . . , bk , b, c1 , . . . , ct ) ⋅ ⋅ ⋅

since neither a nor b can occur in any factor of f other than $(a, b_1, \dots, b_k, b, c_1, \dots, c_t)$.
By the second of the two identities above, we find that
$(a, b)f = \cdots (b, c_1, \dots, c_t)(a, b_1, \dots, b_k) \cdots$.
Since $W((b, c_1, \dots, c_t)(a, b_1, \dots, b_k)) = k + t$, but
$W(a, b_1, \dots, b_k, b, c_1, \dots, c_t) = k + t + 1$, we have W((a, b)f ) = W(f ) − 1.
A similar analysis shows that in the case, where a and b occur in different cycles
in the representation of f , then W((a, b)f ) = W(f ) + 1. Combining both cases, we have

W((a, b)f ) = W(f ) ± 1.

Now let f be written as a product of m transpositions, say

f = (a1 , b1 )(a2 , b2 ) ⋅ ⋅ ⋅ (am , bm ).

Then

(am , bm ) ⋅ ⋅ ⋅ (a2 , b2 )(a1 , b1 )f = 1.

Iterating this, together with the fact that W(1) = 0, shows that

\[ W(f) + (\pm 1) + (\pm 1) + \cdots + (\pm 1) = 0, \]


where there are m terms of the form ±1. Thus,

\[ W(f) = (\pm 1) + (\pm 1) + \cdots + (\pm 1), \]

with m summands. Note, if exactly p are + and q = m − p are −, then m = p + q, and
W(f ) = p − q. Hence, m ≡ W(f ) (mod 2). Thus, W(f ) is even if and only if m is even, and this
completes the proof.
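The key step of the proof, W((a, b)f ) = W(f ) ± 1, can be confirmed exhaustively for small n. The following sketch does so for $S_4$, assuming permutations of {0, 1, 2, 3} stored as tuples of images; the function names are illustrative.

```python
from itertools import combinations, permutations


def compose(f, g):
    """Return the permutation f∘g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(len(f)))


def W(perm):
    """Sum of (cycle length - 1) over the disjoint cycles of perm."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1            # counts 1-cycles as well
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return len(perm) - cycles


def transposition(a, b, n):
    """Return the transposition (a, b) as a permutation of 0..n-1."""
    t = list(range(n))
    t[a], t[b] = t[b], t[a]
    return tuple(t)


n = 4
swaps = [transposition(a, b, n) for a, b in combinations(range(n), 2)]
steps = {W(compose(t, f)) - W(f)
         for f in permutations(range(n)) for t in swaps}
print(sorted(steps))
```

The only differences that occur are −1 and +1, as the two cases of the proof predict.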

It now makes sense to state the following definition since we know that the parity
is indeed unique:

Definition 11.2.2. A permutation f ∈ Sn is said to be even if it can be written as a
product of an even number of transpositions. Similarly, f is called odd if it can be written
as a product of an odd number of transpositions.

Definition 11.2.3. On the group Sn , for n ≥ 2, we define the sign function sgn : Sn → (ℤ2 , +)
by sgn(π) = 0 if π is an even permutation, and sgn(π) = 1 if π is an odd permutation.

We note that if f and g are even permutations, then so are fg and f −1 and also the
identity permutation is even. Furthermore, if f is even and g is odd, it is clear that fg
is odd. From this it is straightforward to establish the following:

Lemma 11.2.4. sgn is a homomorphism from Sn , for n ≥ 2, onto (ℤ2 , +).

We now let

An = {π ∈ Sn : sgn(π) = 0}.

That is, An is precisely the set of even permutations in Sn .

Theorem 11.2.5. For each n ∈ ℕ, n ≥ 2, the set An forms a normal subgroup of index 2
in Sn , called the alternating group on n symbols. Furthermore, $|A_n| = \frac{n!}{2}$.

Proof. By Lemma 11.2.4, sgn : Sn → (ℤ2 , +) is a homomorphism. Then ker(sgn) = An ;
therefore, An is a normal subgroup of Sn . Since im(sgn) = ℤ2 , we have |im(sgn)| = 2,
hence, |Sn /An | = 2. Therefore, [Sn : An ] = 2. Since |Sn | = n!, then $|A_n| = \frac{n!}{2}$ follows from
Lagrange's theorem.
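For small n, both the homomorphism property of sgn and the order of An can be confirmed by enumeration. The sketch below computes sgn(π) as W(π) mod 2, assuming permutations of {0, . . . , n−1} stored as tuples of images; the function names are illustrative.

```python
from itertools import permutations
from math import factorial


def compose(f, g):
    """Return the permutation f∘g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(len(f)))


def W(perm):
    """Sum of (cycle length - 1) over the disjoint cycles of perm."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return len(perm) - cycles


def sgn(perm):
    """Return 0 for an even permutation and 1 for an odd permutation."""
    return W(perm) % 2


for n in range(2, 6):
    evens = [p for p in permutations(range(n)) if sgn(p) == 0]
    print(n, len(evens), factorial(n) // 2)

S4 = list(permutations(range(4)))
print(all(sgn(compose(f, g)) == (sgn(f) + sgn(g)) % 2
          for f in S4 for g in S4))
```

For n = 2, . . . , 5 the number of even permutations equals n!/2, and the final line confirms that sgn(fg) = sgn(f ) + sgn(g) in ℤ2 for all f, g in S4.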

11.3 Conjugation in Sn
Recall that in a group G, two elements x, y ∈ G are conjugates if there exists a g ∈ G
with g −1 xg = y. Conjugacy is an equivalence relation on G. In the symmetric groups Sn ,
it is easy to determine if two elements are conjugates. We say that two permutations
in Sn have the same cycle structure if they have the same number of cycles and the
lengths are the same. Hence, for example in S8 the permutations

π1 = (1, 3, 6, 7)(2, 5) and π2 = (2, 3, 5, 6)(1, 8)


have the same cycle structure. In particular, if π1 , π2 are two permutations in Sn , then
π1 , π2 are conjugates if and only if they have the same cycle structure. Therefore, in S8 ,
the permutations

π1 = (1, 3, 6, 7)(2, 5) and π2 = (2, 3, 5, 6)(1, 8)

are conjugates.

Lemma 11.3.1. Let

π = (a11, a12, ..., a1k1) ⋅⋅⋅ (as1, as2, ..., asks)

be the cycle decomposition of π ∈ Sn. Let τ ∈ Sn, and denote the image τ(aij) of aij under τ by aij^τ. Then

τπτ⁻¹ = (a11^τ, a12^τ, ..., a1k1^τ) ⋅⋅⋅ (as1^τ, as2^τ, ..., asks^τ).

Proof. Consider a11. Operating on the left like functions, we have

τπτ⁻¹(a11^τ) = τπ(a11) = τ(a12) = a12^τ.

The same computation applies to every symbol aij, proving the lemma.

Theorem 11.3.2. Two permutations π1 , π2 ∈ Sn are conjugates if and only if they are of
the same cycle structure.

Proof. Suppose that π2 = τπ1 τ−1 . Then, from Lemma 11.3.1, we have that π1 and π2 are
of the same cycle structure.
Conversely, suppose that π1 and π2 are of the same cycle structure. Let

π1 = (a11 , a12 , . . . , a1k1 ) ⋅ ⋅ ⋅ (as1 , as2 , . . . , asks )


π2 = (b11 , b12 , . . . , b1k1 ) ⋅ ⋅ ⋅ (bs1 , bs2 , . . . , bsks ),

where we place the cycles of the same length under each other. Let τ be the per-
mutation in Sn that maps each symbol in π1 to the digit below it in π2 . Then, from
Lemma 11.3.1, we have τπ1 τ−1 = π2 ; hence, π1 and π2 are conjugate.
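The construction in the proof of Theorem 11.3.2 is effective: write the cycles of equal length under each other and read off τ. A minimal Python sketch follows, applied to the two permutations π1 = (1, 3, 6, 7)(2, 5) and π2 = (2, 3, 5, 6)(1, 8) in S8 from the text. Permutations are stored as dictionaries on {1, ..., n}; the helper names from_cycles, cycles_of, and conjugator are hypothetical, and fixed points are treated as cycles of length 1 so that τ is defined on every digit.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1, ..., n} (as a dict) from disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for k, a in enumerate(cycle):
            perm[a] = cycle[(k + 1) % len(cycle)]
    return perm


def cycles_of(perm):
    """Disjoint cycles of perm, fixed points included, longest cycles first."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start not in seen:
            cycle, j = [], start
            while j not in seen:
                seen.add(j)
                cycle.append(j)
                j = perm[j]
            cycles.append(cycle)
    return sorted(cycles, key=len, reverse=True)


def compose(p, q):
    """Product pq acting on the left: apply q first, then p."""
    return {i: p[q[i]] for i in q}


def inverse(p):
    return {image: i for i, image in p.items()}


def conjugator(p1, p2):
    """Return tau with tau p1 tau^-1 = p2, or None if the cycle structures differ."""
    c1, c2 = cycles_of(p1), cycles_of(p2)
    if [len(c) for c in c1] != [len(c) for c in c2]:
        return None
    # map each symbol of p1 to the symbol written below it in p2
    return {a: b for x, y in zip(c1, c2) for a, b in zip(x, y)}


if __name__ == "__main__":
    pi1 = from_cycles([(1, 3, 6, 7), (2, 5)], 8)
    pi2 = from_cycles([(2, 3, 5, 6), (1, 8)], 8)
    tau = conjugator(pi1, pi2)
    print(tau)
    print(compose(compose(tau, pi1), inverse(tau)) == pi2)
```

The conjugating permutation is not unique: rotating a cycle before aligning it, or swapping two cycles of the same length, gives a different τ with the same property.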

11.4 The simplicity of An


A simple group is a group G with no nontrivial proper normal subgroups. Up to this
point, the only examples we have of simple groups are cyclic groups of prime order.
In this section, we prove that if n ≥ 5, each alternating group An is a simple group.

Theorem 11.4.1. For each n ≥ 3 each π ∈ An is a product of cycles of length 3.


Proof. Let π ∈ An. Since π is a product of an even number of transpositions, to prove the theorem it suffices to show that if τ1, τ2 are transpositions, then τ1τ2 is a product of 3-cycles.
The statement certainly holds for n = 3. Now, let n ≥ 4.
Suppose that a, b, c, d are different digits in {1, ..., n}. There are three cases to consider. First:

Case (1): (a, b)(a, b) = 1 = (1, 2, 3)^0;

hence, it is true here. Next:

Case (2): (a, b)(b, c) = (c, a, b);

hence, it is also true here. Finally:

Case (3): (a, b)(c, d) = (a, b)(b, c)(b, c)(c, d) = (c, a, b)(c, d, b)

since (b, c)(b, c) = 1. Therefore, it is also true here, proving the theorem.

Now our main result:

Theorem 11.4.2. For n ≥ 5, the alternating group An is a simple nonabelian group.

Proof. Suppose that N is a nontrivial normal subgroup of An with n ≥ 5. We show that N = An; hence, An is simple.
We claim first that N must contain a 3-cycle. Let 1 ≠ π ∈ N. Then π is not a transposition since π ∈ An. Therefore, π moves at least 3 digits. If π moves exactly 3 digits, then it is a 3-cycle, and we are done. Suppose then that π moves at least 4 digits. Let π = τ1 ⋅⋅⋅ τr with τi disjoint cycles.
Case (1): There is a τi = (..., a, b, c, d) of length at least 4. Set σ = (a, b, c) ∈ An. Then, from Lemma 11.3.1,

πσπ⁻¹ = τi σ τi⁻¹ = (a^τi, b^τi, c^τi) = (b, c, d).

Furthermore, since π ∈ N and N is normal in An, we have σπ⁻¹σ⁻¹ ∈ N, and therefore

π(σπ⁻¹σ⁻¹) = (b, c, d)(a, c, b) = (a, d, b) ∈ N.

Therefore, in this case, N contains a 3-cycle.


Case (2): There is a τi , which is a 3-cycle. Then

π = (a, b, c)(d, e, . . .).


Now, set σ = (a, b, d) ∈ An. Then

πσπ⁻¹ = (a^π, b^π, d^π) = (b, c, e),

and, since σ⁻¹ = (a, d, b),

σ⁻¹πσπ⁻¹ = (a, d, b)(b, c, e) = (b, c, e, a, d) ∈ N.

Now, use Case (1). Therefore, in this case, N has a 3-cycle.


In the final case, π is a product of disjoint transpositions.
Case (3): π = (a, b)(c, d) ⋅⋅⋅. Since n ≥ 5, there is an e ≠ a, b, c, d. Let σ = (a, c, e) ∈ An. Then

πσπ⁻¹ = (a^π, c^π, e^π) = (b, d, e1) with e1 = e^π ≠ b, d.

Let γ = (σ⁻¹πσ)π⁻¹. This is in N since N is normal. If e = e1, then γ = (e, c, a)(b, d, e) = (a, e, b, d, c), and we can use Case (1) to get that N contains a 3-cycle. If e ≠ e1, then γ = (e, c, a)(b, d, e1) ∈ N is a product of two disjoint 3-cycles, and then we can use Case (2) to obtain that N contains a 3-cycle.
These three cases show that N must contain a 3-cycle.
Let τ be a 3-cycle in N. From Theorem 11.3.2, any two 3-cycles in Sn are conjugate in Sn, so every 3-cycle has the form ρτρ⁻¹ for some ρ ∈ Sn. If ρ is odd, choose two digits x, y not moved by τ, which is possible since n ≥ 5; then (x, y) commutes with τ, so ρ(x, y) ∈ An conjugates τ to the same 3-cycle. Hence, τ is conjugate in An to every other 3-cycle. Since N is normal in An and τ ∈ N, each of these conjugates must also be in N. Therefore, N contains all 3-cycles in Sn. From Theorem 11.4.1, each element of An is a product of 3-cycles. It follows then that each element of An is in N. Since N ⊂ An, this is only possible if N = An, completing the proof.
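For a concrete group such as A5, simplicity can also be confirmed by exhaustive computation: a normal subgroup containing g ≠ 1 must contain the subgroup generated by all conjugates of g (its normal closure), so it suffices to check that every such closure is the whole group. The Python sketch below does this on tuples representing permutations of {0, ..., n − 1}; the function names are assumptions for this illustration, and the brute-force approach is only practical for very small n.

```python
from itertools import permutations


def compose(p, q):
    """Product pq acting on the left: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


def generated_subgroup(generators, n):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def normal_closure_sizes(n):
    """Set of orders of the normal closures in A_n of all non-identity elements."""
    a_n = [p for p in permutations(range(n)) if is_even(p)]
    identity = tuple(range(n))
    sizes = set()
    for g in a_n:
        if g != identity:
            conjugates = {compose(compose(h, g), inverse(h)) for h in a_n}
            sizes.add(len(generated_subgroup(conjugates, n)))
    return sizes


if __name__ == "__main__":
    print(normal_closure_sizes(5))  # a single value, |A_5|, means A_5 is simple
    print(normal_closure_sizes(4))  # more than one value: A_4 is not simple
```

For n = 4 the closure of a product of two disjoint transpositions is a proper normal subgroup of order 4, which is why the theorem requires n ≥ 5.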

Theorem 11.4.3. Let n be a prime number and U ⊂ Sn a subgroup. Let τ be a transposition and α an n-cycle with α, τ ∈ U. Then U = Sn.

Proof. Suppose, without loss of generality, that τ = (1, 2). There is an i with α^i(1) = 2. Since n is prime, α^i is again an n-cycle in U, so we may assume that α = (1, 2, a3, ..., an). Let

    π = ( 1  2  a3  ⋅⋅⋅  an )
        ( 1  2  3   ⋅⋅⋅  n  ),

that is, π(1) = 1, π(2) = 2, and π(ak) = k for k = 3, ..., n.

Then, from Lemma 11.3.1, we have

παπ −1 = (1, 2, . . . , n).

Furthermore, π(1, 2)π −1 = (1, 2). Hence, U1 = πUπ −1 contains (1, 2) and (1, 2, . . . , n).
Now we have

(1, 2, . . . , n)(1, 2)(1, 2, . . . , n)−1 = (2, 3) ∈ U1 .


Analogously,

(1, 2, . . . , n)(2, 3)(1, 2, . . . , n)−1 = (3, 4) ∈ U1 ,

and so on until

(1, 2, . . . , n)(n − 2, n − 1)(1, 2, . . . , n)−1 = (n − 1, n) ∈ U1 .

Hence, the transpositions (1, 2), (2, 3), . . . , (n − 1, n) ∈ U1 . Moreover,

(1, 2)(2, 3)(1, 2) = (1, 3) ∈ U1 .

In an identical fashion, each (1, k) ∈ U1 . Then for any digits s, t, we have

(1, s)(1, t)(1, s) = (s, t) ∈ U1 .

Therefore, U1 contains all the transpositions of Sn; hence, U1 = Sn. Since U = π⁻¹U1π, we must have U = Sn also.
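The size of the subgroup generated by a transposition and an n-cycle can be computed directly for small n. The sketch below, with digits numbered from 0 and with hypothetical helper names cycle, generated_subgroup, and size_generated, builds the subgroup generated by the n-cycle (0, 1, ..., n − 1) and a chosen transposition.

```python
def cycle(points, n):
    """The cycle (points[0], points[1], ...) as a tuple permutation of {0, ..., n-1}."""
    perm = list(range(n))
    for k, a in enumerate(points):
        perm[a] = points[(k + 1) % len(points)]
    return tuple(perm)


def compose(p, q):
    """Product pq acting on the left: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generated_subgroup(generators, n):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def size_generated(transposition, n):
    """Order of the subgroup generated by the given transposition and (0, 1, ..., n-1)."""
    n_cycle = cycle(list(range(n)), n)
    return len(generated_subgroup([cycle(list(transposition), n), n_cycle], n))


if __name__ == "__main__":
    print(size_generated((0, 1), 5))  # compare with 5! = 120
    print(size_generated((0, 2), 5))  # compare with 5! = 120
    print(size_generated((0, 2), 4))  # compare with 4! = 24
```

For the prime n = 5, both transpositions give all of S5. For the composite n = 4, the transposition (0, 2) together with the 4-cycle generates only a subgroup of order 8, so the step that replaces α by a power α^i really does need that power to be an n-cycle again.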

11.5 Exercises
1. Show that for n ≥ 3, the group An is generated by {(1, 2, k) : k ≥ 3}.
2. Let σ ∈ Sn be a permutation whose disjoint cycles have lengths k1, ..., ks. Show that the order of σ is the least common multiple of k1, ..., ks. Compute the order of

       τ = ( 1  2  3  4  5  6  7 )  ∈ S7.
           ( 2  6  5  1  3  4  7 )
3. Let G = S4 .
(i) Determine a noncyclic subgroup H of order 4 of G.
(ii) Show that H is normal.
(iii) Show that f (g)(h) := ghg −1 defines an epimorphism f : G → Aut(H) for g ∈ G
and h ∈ H. Determine its kernel.
4. Show that all subgroups of order 6 of S4 are conjugate.
5. Let σ1 = (1, 2)(3, 4) and σ2 = (1, 3)(2, 4) ∈ S4 . Determine τ ∈ S4 such that τσ1 τ−1 = σ2 .
6. Let σ = (a1 , . . . , ak ) ∈ Sn . Describe σ −1 .

12 Solvable groups
12.1 Solvability and solvable groups
The original motivation for Galois theory grew out of a famous problem in the theory of
equations. This problem was to determine the solvability or insolvability of a polyno-
mial equation of degree 5 or higher in terms of a formula involving the coefficients of
the polynomial and only using algebraic operations and radicals. This question arose
out of the well-known quadratic formula.
The ability to solve quadratic equations and, in essence, the quadratic formula
was known to the Babylonians some 3600 years ago. With the discovery of imaginary
numbers, the quadratic formula then says that any second degree polynomial over
ℂ can be solved by radicals in terms of the coefficients. In the sixteenth century, the
Italian mathematician, Niccolo Tartaglia, discovered a similar formula in terms of rad-
icals to solve cubic equations. This cubic formula is now known erroneously as Car-
dano’s formula in honor of Cardano, who first published it in 1545. An earlier special
version of this formula was discovered by Scipione del Ferro. Cardano’s student, Fer-
rari, extended the formula to solutions by radicals for fourth degree polynomials. The
combination of these formulas says that polynomial equations of degree four or less
over the complex numbers can be solved by radicals.
From Cardano’s work until the very early nineteenth century, attempts were made
to find similar formulas for degree five polynomials. In 1805, Ruffini proved that fifth
degree polynomial equations are insolvable by radicals in general. Therefore, there
exists no comparable formula for degree 5. Abel (in 1825–1826) and Galois (in 1831)
extended Ruffini’s result and proved the insolubility by radicals for all degrees five or
greater. In doing this, Galois developed a general theory of field extensions and its
relationship to group theory. This has come to be known as Galois theory and is really
the main focus of this book.
The proof of the insolvability of the quintic and higher polynomials involved a
translation of the problem into a group theory setting. For a polynomial equation to
be solvable by radicals, its corresponding Galois group (a concept we will introduce in
Chapter 16) must be a solvable group. This is a group with a certain defined structure.
In this chapter, we introduce and discuss this class of groups.

12.2 Solvable groups


A normal series for a group G is a finite chain of subgroups beginning with G and end-
ing with the identity subgroup {1}

G = G0 ⊃ G1 ⊃ G2 ⊃ ⋅ ⋅ ⋅ ⊃ Gn−1 ⊃ Gn = {1},

https://doi.org/10.1515/9783110603996-012


in which each Gi+1 is a proper normal subgroup of Gi . The factor groups Gi /Gi+1 are
called the factors of the series, and n is the length of the series.

Definition 12.2.1. A group G is solvable if it has a normal series with abelian factors;
that is, Gi /Gi+1 is abelian for all i = 0, 1, . . . , n − 1. Such a normal series is called a
solvable series.

If G is an abelian group, then G = G0 ⊃ {1} provides a solvable series. Hence, any abelian group is solvable. Furthermore, the symmetric group S3 on 3 symbols is solvable but nonabelian. Consider the series

S3 ⊃ A3 ⊃ {1}.

Since |S3 | = 6, we have |A3 | = 3; hence, A3 is cyclic and therefore abelian. Furthermore,
|S3 /A3 | = 2; hence, the factor group S3 /A3 is also cyclic, thus abelian. Therefore, the
series above gives a solvable series for S3 .

Lemma 12.2.2. If G is a finite solvable group, then G has a normal series with cyclic
factors.

Proof. If G is a finite solvable group, then by definition, it has a normal series with
abelian factors. Hence, to prove the lemma, it suffices to show that a finite abelian
group has a normal series with cyclic factors.
Let A be a nontrivial finite abelian group. We do an induction on the order of A. If |A| = 2, then A itself is cyclic, and the result follows. Suppose that |A| > 2. Choose a 1 ≠ a ∈ A, and let N = ⟨a⟩ so that N is cyclic. If N = A, we are done. Otherwise, we have the normal series A ⊃ N ⊃ {1} with A/N abelian. Moreover, A/N has order less than |A|, so by induction A/N has a normal series with cyclic factors. The preimages in A of the terms of this series, followed by N ⊃ {1}, form a normal series for A with cyclic factors, and the result follows.

Solvability is preserved under subgroups and factor groups.

Theorem 12.2.3. Let G be a solvable group. Then the following hold:


(1) Any subgroup H of G is also solvable.
(2) Any factor group G/N of G is also solvable.

Proof. (1) Let G be a solvable group, and suppose that

G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}

is a solvable series for G. Hence, Gi+1 is a normal subgroup of Gi for each i, and the
factor group Gi /Gi+1 is abelian.
Now let H be a subgroup of G, and consider the chain of subgroups

H = H ∩ G0 ⊃ H ∩ G1 ⊃ ⋅ ⋅ ⋅ ⊃ H ∩ Gr = {1}.


Since Gi+1 is normal in Gi , we know that H ∩ Gi+1 is normal in H ∩ Gi ; hence, this gives
a finite normal series for H. Furthermore, from the second isomorphism theorem, we
have for each i,

(H ∩ Gi )/(H ∩ Gi+1 ) = (H ∩ Gi )/((H ∩ Gi ) ∩ Gi+1 )


≅ (H ∩ Gi )Gi+1 /Gi+1 ⊂ Gi /Gi+1 .

However, Gi /Gi+1 is abelian, so each factor in the normal series for H is abelian. There-
fore, the above series is a solvable series for H; hence, H is also solvable.
(2) Let N be a normal subgroup of G. Then from (1) N is also solvable. As above,
let

G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}

be a solvable series for G. Consider the chain of subgroups

G/N = G0 N/N ⊃ G1 N/N ⊃ ⋅ ⋅ ⋅ ⊃ Gr N/N = N/N = {1}.

Let m ∈ Gi−1 , n ∈ N. Then since N is normal in G,

(mn)−1 Gi N(mn) = n−1 m−1 Gi mnN = n−1 Gi nN


= n−1 NGi = NGi = Gi N.

It follows that Gi+1 N is normal in Gi N for each i; therefore, the series for G/N is a normal
series.
Again, from the isomorphism theorems,

(Gi N/N)/(Gi+1 N/N) ≅ Gi /(Gi ∩ Gi+1 N)


≅ (Gi /Gi+1 )/((Gi ∩ Gi+1 N)/Gi+1 ).

However, the last group (Gi /Gi+1 )/((Gi ∩ Gi+1 N)/Gi+1 ) is a factor group of the group
Gi /Gi+1, which is abelian. Hence, this last group is also abelian; therefore, each factor
in the normal series for G/N is abelian. Hence, this series is a solvable series, and G/N
is solvable.

The following is a type of converse of the above theorem:

Theorem 12.2.4. Let G be a group and N a normal subgroup of G. If both N and G/N
are solvable, then G is solvable.

Proof. Suppose that

N = N0 ⊃ N1 ⊃ ⋅ ⋅ ⋅ ⊃ Nr = {1}
G/N = G0 /N ⊃ G1 /N ⊃ ⋅ ⋅ ⋅ ⊃ Gs /N = N/N = {1}


are solvable series for N and G/N, respectively. Then

G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gs = N ⊃ N1 ⊃ ⋅ ⋅ ⋅ ⊃ Nr = {1}

gives a normal series for G. Furthermore, from the isomorphism theorems again,

Gi /Gi+1 ≅ (Gi /N)/(Gi+1 /N);

hence, each factor is abelian. Therefore, this is a solvable series for G; hence, G is
solvable.

This theorem allows us to prove that solvability is preserved under direct products.

Corollary 12.2.5. Let G and H be solvable groups. Then their direct product G ×H is also
solvable.

Proof. Suppose that G and H are solvable groups and K = G × H. Recall from Chap-
ter 10 that G can be considered as a normal subgroup of K with K/G ≅ H. Therefore,
G is a solvable subgroup of K, and K/G is a solvable quotient. It follows then, from
Theorem 12.2.4, that K is solvable.

We saw that the symmetric group S3 is solvable. However, the following theorem
shows that the symmetric group Sn is not solvable for n ≥ 5. This result will be crucial
to the proof of the insolvability of the quintic and higher polynomials.

Theorem 12.2.6. For n ≥ 5, the symmetric group Sn is not solvable.

Proof. For n ≥ 5, we saw that the alternating group An is simple. Furthermore, An is nonabelian. Hence, the only normal series for An is An ⊃ {1}, whose single factor An is not abelian, and so An has no solvable series. Therefore, An is not solvable. If Sn were solvable for n ≥ 5, then from Theorem 12.2.3, An would also be solvable. Therefore, Sn must also be nonsolvable for n ≥ 5.

In general, for a simple, solvable group we have the following:

Lemma 12.2.7. If a group G is both simple and solvable, then G is cyclic of prime order.

Proof. Suppose that G is a nontrivial simple, solvable group. Since G is simple, the only normal series for G is G = G0 ⊃ {1}. Since G is solvable, the factors are abelian; hence, G is abelian. In an abelian group every subgroup is normal, so for any 1 ≠ a ∈ G the simplicity of G forces ⟨a⟩ = G; that is, G is cyclic. If G were infinite, then G ≅ (ℤ, +). However, then 2ℤ is a nontrivial proper normal subgroup, a contradiction. Therefore, G must be finite cyclic. If the order were not prime, then for each proper divisor d > 1 of the order, there would be a nontrivial proper normal subgroup. Therefore, G must be of prime order.

In general, a finite p-group is solvable.

Theorem 12.2.8. A finite p-group G is solvable.


Proof. Suppose that |G| = p^n. We do this by induction on n. If n = 1, then |G| = p, and G is cyclic, hence abelian and therefore solvable. Suppose that n > 1. Then, as used previously, G has a nontrivial center Z(G). If Z(G) = G, then G is abelian; hence solvable. If Z(G) ≠ G, then Z(G) is a finite p-group of order less than p^n. From our inductive hypothesis, Z(G) must be solvable. Furthermore, G/Z(G) is then also a finite p-group of order less than p^n, so it is also solvable. Hence, Z(G) and G/Z(G) are both solvable. Therefore, from Theorem 12.2.4, G is solvable.

12.3 The derived series


Let G be a group, and let a, b ∈ G. The product aba⁻¹b⁻¹ is called the commutator of a and b. We write [a, b] = aba⁻¹b⁻¹.
Clearly, [a, b] = 1 if and only if a and b commute.

Definition 12.3.1. Let G′ be the subgroup of G that is generated by the set of all commutators:

G′ = gp({[x, y] : x, y ∈ G}).

G′ is called the commutator (or derived) subgroup of G. We sometimes write G′ = [G, G].

Theorem 12.3.2. For any group G, the commutator subgroup G′ is a normal subgroup of G, and G/G′ is abelian. Furthermore, if H is a normal subgroup of G, then G/H is abelian if and only if G′ ⊂ H.

Proof. The commutator subgroup G′ consists of all finite products of commutators and inverses of commutators. However,

[a, b]⁻¹ = (aba⁻¹b⁻¹)⁻¹ = bab⁻¹a⁻¹ = [b, a],

and so the inverse of a commutator is once again a commutator. It then follows that G′ is precisely the set of all finite products of commutators; that is, G′ is the set of all elements of the form

h1 h2 ⋅⋅⋅ hn,

where each hi is a commutator of elements of G.
If h = [a, b] for a, b ∈ G, then for x ∈ G, xhx⁻¹ = [xax⁻¹, xbx⁻¹] is again a commutator of elements of G. Now from our previous comments, an arbitrary element of G′ has the form h1 h2 ⋅⋅⋅ hn, where each hi is a commutator. Thus, x(h1 h2 ⋅⋅⋅ hn)x⁻¹ = (x h1 x⁻¹)(x h2 x⁻¹) ⋅⋅⋅ (x hn x⁻¹) and, since by the above each x hi x⁻¹ is a commutator, x(h1 h2 ⋅⋅⋅ hn)x⁻¹ ∈ G′. It follows that G′ is a normal subgroup of G.


Consider the factor group G/G′. Let aG′ and bG′ be any two elements of G/G′. Then

[aG′, bG′] = aG′ ⋅ bG′ ⋅ (aG′)⁻¹ ⋅ (bG′)⁻¹
           = aG′ ⋅ bG′ ⋅ a⁻¹G′ ⋅ b⁻¹G′ = aba⁻¹b⁻¹G′ = G′

since [a, b] ∈ G′. In other words, any two elements of G/G′ commute; therefore, G/G′ is abelian.
Now let N be a normal subgroup of G with G/N abelian. Let a, b ∈ G. Then aN and bN commute since G/N is abelian. Therefore,

[aN, bN] = aN ⋅ bN ⋅ a⁻¹N ⋅ b⁻¹N = aba⁻¹b⁻¹N = N.

It follows that [a, b] ∈ N. Therefore, all commutators of elements in G lie in N; thus, G′ ⊂ N. Conversely, if G′ ⊂ N, then [aN, bN] = [a, b]N = N for all a, b ∈ G, so G/N is abelian.

From the second part of Theorem 12.3.2, we see that G′ is the smallest normal subgroup N of G such that G/N is abelian. We call G/G′ = G^ab the abelianization of G.
We consider next the following inductively defined sequence of subgroups of an
arbitrary group G called the derived series:

Definition 12.3.3. For an arbitrary group G, define G(0) = G and G(1) = G′, and then, inductively, G(n+1) = (G(n))′. That is, G(n+1) is the commutator subgroup or derived group of G(n). The chain of subgroups

G = G(0) ⊃ G(1) ⊃ ⋅ ⋅ ⋅ ⊃ G(n) ⊃ ⋅ ⋅ ⋅

is called the derived series for G.

Notice that since G(i+1) is the commutator subgroup of G(i) , we have G(i) /G(i+1) is
abelian. If the derived series was finite, then G would have a normal series with abelian
factors; hence would be solvable. The converse is also true and characterizes solvable
groups in terms of the derived series.

Theorem 12.3.4. A group G is solvable if and only if its derived series is finite. That is,
there exists an n such that G(n) = {1}.

Proof. If G(n) = {1} for some n, then as explained above, the derived series provides a
solvable series for G; hence, G is solvable.
Conversely, suppose that G is solvable, and let

G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}

be a solvable series for G. We claim first that Gi ⊃ G(i) for all i. We do this by induction on i. If i = 0, then G0 = G = G(0). Suppose that Gi ⊃ G(i). Then Gi′ ⊃ (G(i))′ = G(i+1). Since Gi/Gi+1 is abelian, it follows, from Theorem 12.3.2, that Gi+1 ⊃ Gi′. Therefore, Gi+1 ⊃ G(i+1), establishing the claim.
Now if G is solvable, from the claim, we have that Gr ⊃ G(r) . However, Gr = {1};
therefore, G(r) = {1}, proving the theorem.
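Theorem 12.3.4 gives a mechanical test for solvability of a finite group: compute the derived series and see whether it reaches {1}. The following Python sketch does this for small symmetric groups given as sets of tuples. The function names derived_subgroup and derived_series_orders are assumptions for this illustration, and the quadratic enumeration of commutators is only suitable for small groups.

```python
from itertools import permutations


def compose(p, q):
    """Product pq acting on the left: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, n):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def derived_subgroup(group, n):
    """Subgroup generated by all commutators [a, b] = a b a^-1 b^-1 with a, b in group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, n)


def derived_series_orders(group, n):
    """Orders |G^(0)|, |G^(1)|, ... until the series reaches {1} or stops shrinking."""
    orders = [len(group)]
    current = set(group)
    while len(current) > 1:
        nxt = derived_subgroup(current, n)
        if len(nxt) == len(current):
            break  # stabilized at a nontrivial group equal to its own derived subgroup
        orders.append(len(nxt))
        current = nxt
    return orders


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, derived_series_orders(set(permutations(range(n))), n))
```

A returned list ending in 1 means the group is solvable. A list that stops at a larger value means the derived series has become stationary before reaching {1}, so the group is not solvable; this is what happens for S5, in agreement with Theorem 12.2.6.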


The length of the derived series is called the solvability length of a solvable
group G. The class of solvable groups of class c consists of those solvable groups
of solvability length c, or less.

12.4 Composition series and the Jordan–Hölder theorem


The concept of a normal series is extremely important in the structure theory of groups.
This is especially true for finite groups. If

G = G0 ⊃ G1 ⊃ ⋅⋅⋅ ⊃ Gs = {1}
G = H0 ⊃ H1 ⊃ ⋅⋅⋅ ⊃ Ht = {1}

are two normal series for the group G, then the second is a refinement of the first if all the terms of the first occur in the second series. Furthermore, two normal series are called equivalent (or isomorphic) if there exists a 1–1 correspondence between the factors (hence the length must be the same) of the two series such that the corresponding factors are isomorphic.

Theorem 12.4.1 (Schreier’s theorem). Any two normal series for a group G have equiv-
alent refinements.

Proof. Consider two normal series for G:

G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gs−1 ⊃ Gs = {1}
G = H0 ⊃ H1 ⊃ ⋅ ⋅ ⋅ ⊃ Ht−1 ⊃ Ht = {1}.

Now define

Gij = (Gi ∩ Hj)Gi+1,   i = 0, 1, ..., s − 1,   j = 0, 1, ..., t,
Hji = (Gi ∩ Hj)Hj+1,   j = 0, 1, ..., t − 1,   i = 0, 1, ..., s.

Then Gi0 = Gi, Git = Gi+1, Hj0 = Hj, and Hjs = Hj+1, so we have

G = G00 ⊃ G01 ⊃ ⋅⋅⋅ ⊃ G0t = G1
  = G10 ⊃ ⋅⋅⋅ ⊃ G1t = G2 ⊃ ⋅⋅⋅ ⊃ G(s−1)t = {1},

and

G = H00 ⊃ H01 ⊃ ⋅⋅⋅ ⊃ H0s = H1
  = H10 ⊃ ⋅⋅⋅ ⊃ H1s = H2 ⊃ ⋅⋅⋅ ⊃ H(t−1)s = {1}.

Now, applying the third isomorphism theorem to the groups Gi, Hj, Gi+1, Hj+1, we have that Gi(j+1) = (Gi ∩ Hj+1)Gi+1 is a normal subgroup of Gij = (Gi ∩ Hj)Gi+1, and Hj(i+1) = (Gi+1 ∩ Hj)Hj+1 is a normal subgroup of Hji = (Gi ∩ Hj)Hj+1. Furthermore,

Gij/Gi(j+1) ≅ Hji/Hj(i+1).


Thus, the above two are normal series, which are refinements of the two given series,
and they are equivalent.

A proper normal subgroup N of a group G is called maximal in G if there does not exist any normal subgroup M of G with N ⊂ M ⊂ G and all inclusions proper. This is the group theoretic analog of a maximal ideal. An alternative characterization is the following: N is a maximal normal subgroup of G if and only if G/N is simple.
A normal series in which each factor is simple can have no proper refinements.

Definition 12.4.2. A composition series for a group G is a normal series, where all the
inclusions are proper and such that Gi+1 is maximal in Gi . Equivalently, a normal se-
ries, where each factor is simple.

It is possible that an arbitrary group does not have a composition series, or even
if it does have one, a subgroup of it may not have one. Of course, a finite group does
have a composition series.
In the case in which a group G does have a composition series, the following im-
portant theorem, called the Jordan–Hölder theorem, provides a type of unique factor-
ization.

Theorem 12.4.3 (Jordan–Hölder theorem). If a group G has a composition series, then


any two composition series are equivalent; that is, the composition factors are unique.

Proof. Suppose we are given two composition series. Applying Theorem 12.4.1, we get that the two composition series have equivalent refinements. But the only refinement of a composition series is one obtained by introducing repetitions. In the 1–1 correspondence between the factors of these refinements, we disregard the paired factors equal to {e}; that is, we drop the repetitions. What remains is a 1–1 correspondence between the factors of the original composition series with isomorphic paired factors, so the original composition series are equivalent.

We remarked in Chapter 10 that the simple groups are important, because they
play a role in finite group theory somewhat analogous to that of the primes in number
theory. In particular, an arbitrary finite group G can be broken down into simple com-
ponents. These uniquely determined simple components are, according to the Jordan–
Hölder theorem, the factors of a composition series for G.

12.5 Exercises
1. Let K be a field and

            ( a  x  y )
     G = {  ( 0  b  z )  : a, b, c, x, y, z ∈ K, abc ≠ 0 }.
            ( 0  0  c )

   Show that G is solvable.


2. A group G is called polycyclic if it has a normal series with cyclic factors. Show the
following:
(i) Each subgroup and each factor group of a polycyclic group is polycyclic.
(ii) In a polycyclic group, each normal series has the same number of infinite
cyclic factors.
3. Let G be a group. Show the following:
(i) If G is finite and solvable, then G is polycyclic.
(ii) If G is polycyclic, then G is finitely generated.
(iii) The group (ℚ, +) is solvable, but not polycyclic.
4. Let N1 and N2 be normal subgroups of G. Show the following:
(i) If N1 and N2 are solvable, then also N1 N2 is a solvable normal subgroup of G.
(ii) Is (i) still true, if we replace “solvable” by “abelian”?
5. Let N1, ..., Nt be normal subgroups of a group G. Show that if all factor groups G/Ni are solvable, then G/(N1 ∩ ⋅⋅⋅ ∩ Nt) is also solvable.

13 Group actions and the Sylow theorems
13.1 Group actions
A group action of a group G on a set A is a homomorphism from G into SA , the symmet-
ric group on A. We say that G acts on A. Hence, G acts on A if to each g ∈ G corresponds
a permutation

πg : A → A

such that
(1) πg1 (πg2 (a)) = πg1 g2 (a) for all g1 , g2 ∈ G and for all a ∈ A,
(2) π1(a) = a for all a ∈ A, where 1 is the identity element of G.

For the remainder of this chapter, if g ∈ G and a ∈ A, we will write ga for πg (a).
Group actions are an extremely important idea, and we use this idea in the present
chapter to prove several fundamental results in group theory.
If G acts on the set A, then we say that two elements a1 , a2 ∈ A are congruent under
G if there exists a g ∈ G with ga1 = a2 . The set

Ga = {a1 ∈ A : a1 = ga for some g ∈ G}

is called the orbit of a. It consists of elements congruent to a under G.

Lemma 13.1.1. If G acts on A, then congruence under G is an equivalence relation on A.

Proof. Any element a ∈ A is congruent to itself via the identity map; hence, the relation
is reflexive. If a1 ∼ a2 so that ga1 = a2 for some g ∈ G, then g −1 a2 = a1 , and so a2 ∼ a1 ,
and the relation is symmetric. Finally, if g1 a1 = a2 and g2 a2 = a3 , then g2 g1 a1 = a3 , and
the relation is transitive.

Recall that the equivalence classes under an equivalence relation partition a set.
For a given a ∈ A, its equivalence class under this relation is precisely its orbit Ga , as
defined above.

Corollary 13.1.2. If G acts on the set A, then the orbits under G partition the set A.

We say that G acts transitively on A if any two elements of A are congruent under G.
That is, the action is transitive if for any a1 , a2 ∈ A there is some g ∈ G such that
ga1 = a2 .
If a ∈ A, the stabilizer of a consists of those g ∈ G that fix a. Hence,

StabG (a) = {g ∈ G : ga = a}.

The following lemma is easily proved and left to the exercises:

https://doi.org/10.1515/9783110603996-013


Lemma 13.1.3. If G acts on A, then for any a ∈ A, the stabilizer StabG (a) is a subgroup
of G.

We now prove the crucial theorem concerning group actions.

Theorem 13.1.4. Suppose that G acts on A and a ∈ A. Let Ga be the orbit of a under G and StabG(a) its stabilizer. Then

|G : StabG(a)| = |Ga|.

That is, the size of the orbit of a is the index of its stabilizer in G.

Proof. Suppose that g1, g2 ∈ G with g1 StabG(a) = g2 StabG(a); that is, they define the same left coset of the stabilizer. Then g2⁻¹g1 ∈ StabG(a). This implies that g2⁻¹g1a = a so that g2a = g1a. Hence, any two elements in the same left coset of the stabilizer produce the same image of a in Ga. Conversely, if g1a = g2a, then g2⁻¹g1a = a, so g2⁻¹g1 ∈ StabG(a), and g1, g2 define the same left coset of StabG(a). This shows that there is a one-to-one correspondence between left cosets of StabG(a) and elements of Ga. It follows that the size of Ga is precisely the index of the stabilizer.
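Theorem 13.1.4 is easy to test numerically. The sketch below lets S4 act on the subsets of {0, 1, 2, 3} by applying a permutation to each element of the subset, and compares the orbit size with the index of the stabilizer. The choice of action and the names act, orbit, and stabilizer are assumptions made for this illustration.

```python
from itertools import combinations, permutations


def act(g, subset):
    """Image of a subset under the permutation g (a tuple of images)."""
    return frozenset(g[x] for x in subset)


def orbit(group, a):
    """The orbit Ga = {ga : g in G}."""
    return {act(g, a) for g in group}


def stabilizer(group, a):
    """The stabilizer Stab_G(a) = {g in G : ga = a}."""
    return [g for g in group if act(g, a) == a]


if __name__ == "__main__":
    s4 = list(permutations(range(4)))
    for size in (1, 2, 3):
        for points in combinations(range(4), size):
            a = frozenset(points)
            orb, stab = orbit(s4, a), stabilizer(s4, a)
            # the index |G : Stab_G(a)| is |G| / |Stab_G(a)| for a finite group
            print(sorted(a), len(orb), len(s4) // len(stab))
```

For the subset {0, 1}, for instance, the stabilizer consists of the four permutations that preserve {0, 1}, and the orbit consists of the six 2-element subsets, in agreement with 24/4 = 6.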

We will use this theorem repeatedly with different group actions to obtain impor-
tant group theoretic results.

13.2 Conjugacy classes and the class equation


In Section 10.5, we introduced the center of a group

Z(G) = {g ∈ G : gg1 = g1 g for all g1 ∈ G},

and showed that it is a normal subgroup of G. We use this normal subgroup in con-
junction with what we call the class equation to show that any finite p-group has a
nontrivial center. In this section, we use group actions to derive the class equation
and prove the result for finite p-groups.
Recall that if G is a group, then two elements g1, g2 ∈ G are conjugate if there exists a g ∈ G with g⁻¹g1g = g2. We saw that conjugacy is an equivalence relation on G. The equivalence class of g ∈ G is called its conjugacy class, which we will denote by Cl(g). Thus,

Cl(g) = {g1 ∈ G : g1 is conjugate to g}.

If g ∈ G, then its centralizer CG (g) is the set of elements in G that commute with g:

CG (g) = {g1 ∈ G : gg1 = g1 g}.


Theorem 13.2.1. Let G be a finite group and g ∈ G. Then the centralizer of g is a subgroup of G, and

|G : CG(g)| = |Cl(g)|.

That is, the index of the centralizer of g is the size of its conjugacy class.
In particular, for a finite group the size of each conjugacy class divides the order of the group.

Proof. Let the group G act on itself by conjugation. That is, g(g1) = gg1g⁻¹. It is easy to show that this is an action on the set G (see exercises). The orbit of g ∈ G under this action is precisely its conjugacy class Cl(g), and the stabilizer is its centralizer CG(g). The statements in the theorem then follow directly from Theorem 13.1.4.

For any group G, since conjugacy is an equivalence relation, the conjugacy classes
partition G. Hence,

G = ⋃̇_{g ∈ G} Cl(g),

where this union is taken over the distinct conjugacy classes. It follows that

|G| = ∑_{g ∈ G} |Cl(g)|,

where this sum is taken over the distinct conjugacy classes.

If Cl(g) = {g}, that is, the conjugacy class of g is g alone, then C_G(g) = G so that g
commutes with all of G. Therefore, in this case, g ∈ Z(G). Conversely, every element
of the center forms a conjugacy class by itself; therefore,

G = Z(G) ∪ ⋃̇_{g ∉ Z(G)} Cl(g),

where again the second union is taken over the distinct conjugacy classes Cl(g) with
g ∉ Z(G). The size of G is then the sum of these disjoint pieces, so

|G| = |Z(G)| + ∑_{g ∉ Z(G)} |Cl(g)|,

where the sum is taken over the distinct conjugacy classes Cl(g) with g ∉ Z(G).
However, from Theorem 13.2.1, |Cl(g)| = |G : C_G(g)|, so the equation above becomes

|G| = |Z(G)| + ∑_{g ∉ Z(G)} |G : C_G(g)|,

where the sum is taken over one representative g of each conjugacy class not
contained in Z(G). This is known as the class equation.
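The class equation can be checked by brute force on a small group. The following is a minimal sketch for the symmetric group S_4, with permutations stored as tuples on {0, 1, 2, 3}. The helper names (`compose`, `inverse`, `conjugacy_class`, `centralizer`) are illustrative choices, not notation from the text, and products are read right to left (apply the right factor first), which is an assumption of this sketch.

```python
from itertools import permutations

def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_class(G, g):
    """Cl(g) = {x^-1 g x : x in G}."""
    return frozenset(compose(inverse(x), compose(g, x)) for x in G)

def centralizer(G, g):
    """C_G(g) = {x in G : gx = xg}."""
    return [x for x in G if compose(g, x) == compose(x, g)]

G = list(permutations(range(4)))  # S_4 as permutations of {0, 1, 2, 3}

# Theorem 13.2.1: the index of the centralizer equals the class size.
index_equals_class_size = all(
    len(G) // len(centralizer(G, g)) == len(conjugacy_class(G, g)) for g in G
)

classes = {conjugacy_class(G, g) for g in G}          # distinct classes
center = [g for g in G if len(conjugacy_class(G, g)) == 1]
class_sizes = sorted(len(c) for c in classes)
noncentral_sum = sum(len(c) for c in classes if len(c) > 1)

print(class_sizes)
print(len(G), "=", len(center), "+", noncentral_sum)
```

For S_4 the center is trivial, so the class equation here reads 24 = 1 + (3 + 6 + 6 + 8).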

Theorem 13.2.2 (Class equation). Let G be a finite group. Then

|G| = |Z(G)| + ∑_{g ∉ Z(G)} |G : C_G(g)|,

where the sum is taken over one representative g of each conjugacy class not
contained in Z(G).

As a first application, we prove the result that finite p-groups have nontrivial cen-
ters (see Lemma 10.5.6).

Theorem 13.2.3. Let G be a finite p-group. Then G has a nontrivial center.

Proof. Let G be a finite p-group so that |G| = p^n for some n ≥ 1, and consider the class
equation

|G| = |Z(G)| + ∑_{g ∉ Z(G)} |G : C_G(g)|,

where the sum is taken over one representative of each conjugacy class outside Z(G).
For each g ∉ Z(G), the index |G : C_G(g)| is greater than 1 and divides |G| = p^n, so
p | |G : C_G(g)|. Furthermore, p | |G|. Therefore, p must divide |Z(G)|; hence,
|Z(G)| = p^m for some m ≥ 1. Therefore, Z(G) is nontrivial.

The idea of conjugacy and the centralizer of an element can be extended to subgroups.
If H_1, H_2 are subgroups of a group G, then H_1, H_2 are conjugate if there exists a
g ∈ G such that g^{-1} H_1 g = H_2. As for elements, conjugacy is an equivalence relation on
the set of subgroups of G.
If H ⊂ G is a subgroup, then its conjugacy class consists of all the subgroups of G
conjugate to it. The normalizer of H is

N_G(H) = {g ∈ G : g^{-1} H g = H}.

As for elements, let G act on the set of subgroups of G by conjugation. That is, for
g ∈ G, the map is given by H ↦ g^{-1} H g. For H ⊂ G, the stabilizer under this action
is precisely the normalizer. Hence, exactly as for elements, we obtain the following
theorem:

Theorem 13.2.4. Let G be a group and H ⊂ G a subgroup. Then the normalizer N_G(H)
of H is a subgroup of G, H is normal in N_G(H), and

|G : N_G(H)| = number of conjugates of H in G.

13.3 The Sylow theorems


If G is a finite group and H ⊂ G is a subgroup, then Lagrange’s theorem guarantees
that the order of H divides the order of G. However, the converse of Lagrange’s theorem
is false. That is, if G is a finite group of order n and if d | n, then G need not contain
a subgroup of order d. If d is a prime p or a prime power p^e, however, then we shall
see that G must contain subgroups of that order. In particular, we shall see that if p^a is
the highest power of p that divides n, then all subgroups of order p^a are conjugate,
and we shall finally get a formula concerning the number of such subgroups.
These theorems constitute the Sylow theorems, which we will examine in this section.
First, we give an example in which the converse of Lagrange’s theorem fails.

Lemma 13.3.1. The alternating group on 4 symbols A4 has order 12, but has no subgroup
of order 6.

Proof. Suppose that there exists a subgroup U ⊂ A_4 with |U| = 6. Then |A_4 : U| = 2
since |A_4| = 12; hence, U is normal in A_4.
Now id, (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3) are in A_4. The three nonidentity elements
have order 2 and commute, so these four elements form a normal subgroup V ⊂ A_4 of
order 4. This subgroup V ≅ ℤ_2 × ℤ_2. Then

12 = |A_4| ≥ |VU| = |V||U| / |V ∩ U| = (4 ⋅ 6) / |V ∩ U|.

It follows that V ∩ U ≠ {1}, and since U and V are both normal, V ∩ U is also normal
in A_4.
Now (1, 2)(3, 4) ∈ V, and by renaming the entries in V, if necessary, we may assume
that it is also in U, so that (1, 2)(3, 4) ∈ V ∩ U. Since (1, 2, 3) ∈ A_4 and V ∩ U is
normal, we have

(3, 2, 1)(1, 2)(3, 4)(1, 2, 3) = (1, 3)(2, 4) ∈ V ∩ U,

and then

(3, 2, 1)(1, 3)(2, 4)(1, 2, 3) = (1, 4)(2, 3) ∈ V ∩ U.

But then V ⊂ V ∩ U, and so V ⊂ U. But this is impossible since |V| = 4, which does
not divide |U| = 6.
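Since A_4 has only 12 elements, the lemma can also be confirmed by exhaustive search. The sketch below is an illustration and not part of the proof. It tests every subset containing the identity for closure under the product, which is enough for a finite subset to be a subgroup. The function names are hypothetical, and permutations act on {0, 1, 2, 3}.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]
identity = tuple(range(4))
others = [p for p in A4 if p != identity]

def is_subgroup(S):
    """A finite nonempty subset closed under the product is a subgroup."""
    return all(compose(x, y) in S for x in S for y in S)

def subgroups_of_order(k):
    """All subgroups of A_4 with exactly k elements."""
    found = []
    for rest in combinations(others, k - 1):
        S = frozenset(rest) | {identity}
        if is_subgroup(S):
            found.append(S)
    return found

# Number of subgroups for every divisor of |A_4| = 12.
orders = {k: len(subgroups_of_order(k)) for k in (1, 2, 3, 4, 6, 12)}
print(orders)
```

The entry for order 6 comes out as 0, while every other divisor of 12 is realized by at least one subgroup.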

Definition 13.3.2. Let G be a finite group with |G| = n, and let p be a prime such that
p^a | n, but no higher power of p divides n. A subgroup of G of order p^a is called a p-Sylow
subgroup.

It is not clear that a p-Sylow subgroup must exist. We will prove that for each prime
p | n a p-Sylow subgroup exists.
We first consider and prove a very special case.

Theorem 13.3.3. Let G be a finite abelian group, and let p be a prime such that p | |G|.
Then G contains at least one element of order p.

Proof. Suppose that G is a finite abelian group of order pn (the product of p and n).
We use induction on n. If n = 1, then G has order p, and hence is cyclic. Therefore, it
has an element of order p. Suppose that the theorem is true for all abelian groups of
order pm with m < n, and suppose that G has order pn. Let g ∈ G with g ≠ 1. If the
order of g is pt for some integer t, then g^t ≠ 1, and g^t has order p, proving the theorem
in this case. Hence, we may suppose that g has order prime to p, and we show that
there must be an element whose order is a multiple of p, and then use the above
argument to get an element of exact order p.
Hence, we have g ∈ G with order m > 1, where (m, p) = 1. Since m divides |G| = pn, we
must have m | n. Since G is abelian, ⟨g⟩ is normal, and the factor group G/⟨g⟩ is abelian
of order p(n/m) < pn. By the inductive hypothesis, G/⟨g⟩ has an element h⟨g⟩ of order p,
h ∈ G; hence, h^p = g^k for some k. The element g^k has order m_1 | m; therefore, h has
order pm_1. Now, as above, h^{m_1} has order p, proving the theorem.

Therefore, if G is a finite abelian group, and if p | |G|, then G contains a subgroup of
order p, namely the cyclic subgroup generated by an element a ∈ G of order p, whose
existence is guaranteed by the above theorem. We now present the first Sylow theorem:

Theorem 13.3.4 (First Sylow theorem). Let G be a finite group, and let p be a prime with
p | |G|. Then G contains a p-Sylow subgroup; that is, a p-Sylow subgroup exists.

Proof. Let G be a finite group of order pn, and, as above, we do induction on n. If
n = 1, then G is cyclic, and G is its own maximal p-subgroup; hence, all of G is a
p-Sylow subgroup. We assume then that if |G| = pm with m < n, then G has a p-Sylow
subgroup.
Assume that |G| = p^t m with (m, p) = 1. We must show that G contains a subgroup
of order p^t. If H is a proper subgroup whose index is prime to p, then |H| = p^t m_1 with
m_1 < m. Therefore, by the inductive hypothesis, H has a p-Sylow subgroup of order p^t.
This will also be a subgroup of G, hence a p-Sylow subgroup of G.
Therefore, we may assume that the index of any proper subgroup H of G must be
divisible by p. Now consider the class equation for G,

|G| = |Z(G)| + ∑_{g ∉ Z(G)} |G : C_G(g)|,

where the sum is taken over one representative of each conjugacy class outside Z(G).
By assumption, each of the indices is divisible by p, and also p | |G|. Therefore,
p | |Z(G)|. It follows that Z(G) is a finite abelian group whose order is divisible by p.
From Theorem 13.3.3, there exists an element g ∈ Z(G) ⊂ G of order p. Since g ∈ Z(G),
we must have ⟨g⟩ normal in G. The factor group G/⟨g⟩ then has order p^{t−1} m, and, by
the inductive hypothesis, must have a p-Sylow subgroup K̄ of order p^{t−1}, hence of
index m. By the Correspondence Theorem 10.2.6, there is a subgroup K of G with
⟨g⟩ ⊂ K such that K/⟨g⟩ ≅ K̄. Therefore, |K| = p^t, and K is a p-Sylow subgroup of G.

On the basis of this theorem, we can now strengthen the result obtained in Theorem
13.3.3.

Theorem 13.3.5 (Cauchy). If G is a finite group, and if p is a prime such that p | |G|, then
G contains at least one element of order p.

Proof. Let P be a p-Sylow subgroup of G, and let |P| = p^t with t ≥ 1. If g ∈ P, g ≠ 1,
then the order of g is p^{t_1} for some t_1 with 1 ≤ t_1 ≤ t. Then g^{p^{t_1 − 1}} has order p.

We have seen that p-Sylow subgroups exist. We now wish to show that any two
p-Sylow subgroups are conjugate. This is the content of the second Sylow theorem:

Theorem 13.3.6 (Second Sylow theorem). Let G be a finite group and p a prime such
that p | |G|. Then any p-subgroup H of G is contained in a p-Sylow subgroup.
Furthermore, all p-Sylow subgroups of G are conjugate. That is, if P_1 and P_2 are any two
p-Sylow subgroups of G, then there exists an a ∈ G such that P_1 = a P_2 a^{-1}.

Proof. Let Ω be the set of p-Sylow subgroups of G, and let G act on Ω by conjugation.
This action will, of course, partition Ω into disjoint orbits. Let P be a fixed p-Sylow
subgroup and Ω_P be its orbit under the conjugation action. The size of the orbit is the
index of its stabilizer; that is, |Ω_P| = |G : Stab_G(P)|. Now P ⊂ Stab_G(P), and P is a
maximal p-subgroup of G. It follows that the index of Stab_G(P) must be prime to p,
and so the number of p-Sylow subgroups conjugate to P is prime to p.
Now let H be a p-subgroup of G, and let H act on Ω_P by conjugation. Ω_P will itself
decompose into disjoint orbits under this action. Furthermore, the size of each orbit is
an index of a subgroup of H, hence must be a power of p. On the other hand, the size of
Ω_P is prime to p. Therefore, there must be one orbit that has size exactly 1.
This orbit contains a p-Sylow subgroup P′, and P′ is fixed by H under conjugation;
that is, H normalizes P′. It follows that HP′ is a subgroup of G, and P′ is normal in
HP′. From the second isomorphism theorem, we then obtain

HP′/P′ ≅ H/(H ∩ P′).

Since H is a p-group, the size of H/(H ∩ P′) is a power of p; therefore, so is the size of
HP′/P′. But P′ is also a p-group, so it follows that HP′ also has order a power of p. Now
P′ ⊂ HP′, but P′ is a maximal p-subgroup of G. Hence, HP′ = P′. This is possible only
if H ⊂ P′, proving the first assertion in the theorem. Therefore, any p-subgroup of G is
contained in a p-Sylow subgroup.
Now let H be a p-Sylow subgroup P_1, and let P_1 act on Ω_P. Exactly as in the
argument above, P_1 ⊂ P′, where P′ is a conjugate of P. Since P_1 and P′ are both p-Sylow
subgroups, they have the same size; hence, P_1 = P′. This implies that P_1 is a
conjugate of P. Since P_1 and P are arbitrary p-Sylow subgroups, it follows that all p-Sylow
subgroups are conjugate.

We come now to the last of the three Sylow theorems. This one gives us informa-
tion concerning the number of p-Sylow subgroups.

Theorem 13.3.7 (Third Sylow theorem). Let G be a finite group and p a prime such that
p | |G|. Then the number of p-Sylow subgroups of G is of the form 1 + pk and divides the
order of G. It follows that if |G| = p^a m with (p, m) = 1, then the number of p-Sylow
subgroups divides m.

Proof. Let P be a p-Sylow subgroup, and let P act on Ω, the set of all p-Sylow
subgroups, by conjugation. Now P normalizes itself, so there is one orbit, namely, {P},
having exactly size 1. If {P′} is any orbit of size 1, then P normalizes P′, and the argument
in the proof of Theorem 13.3.6 gives P ⊂ P′, hence P′ = P. Every other orbit therefore
has as its size the index of a proper subgroup of P, and so its size is divisible by p.
Hence, the size of Ω is 1 + pk. Finally, by Theorem 13.3.6, Ω is a single orbit under
conjugation by G, so |Ω| = |G : N_G(P)| by Theorem 13.2.4, and this divides |G|. Since
1 + pk is prime to p, it divides m.
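As an illustration of the three Sylow theorems together, the following sketch counts the p-Sylow subgroups of the alternating group A_5 (order 60 = 2^2 ⋅ 3 ⋅ 5) directly. The choice of A_5, the helper names, and the right-to-left product convention are assumptions made for this example. The 2-Sylow subgroups are found as groups generated by two commuting involutions, which works here because A_5 has no element of order 4.

```python
from itertools import permutations

def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

identity = tuple(range(5))
A5 = [p for p in permutations(range(5)) if is_even(p)]

def cyclic_subgroup(g):
    """The subgroup generated by g, as a frozenset of its powers."""
    elements, power = {identity}, g
    while power != identity:
        elements.add(power)
        power = compose(power, g)
    return frozenset(elements)

def element_order(g):
    return len(cyclic_subgroup(g))

# Sylow subgroups of prime order are cyclic, generated by elements of that order.
sylow5 = {cyclic_subgroup(g) for g in A5 if element_order(g) == 5}
sylow3 = {cyclic_subgroup(g) for g in A5 if element_order(g) == 3}

# A 2-Sylow subgroup has order 4; with no elements of order 4 available,
# each one consists of the identity and three mutually commuting involutions.
involutions = [g for g in A5 if element_order(g) == 2]
sylow2 = {
    frozenset({identity, x, y, compose(x, y)})
    for x in involutions for y in involutions
    if x != y and compose(x, y) == compose(y, x)
}

counts = {2: len(sylow2), 3: len(sylow3), 5: len(sylow5)}
print(counts)
```

Each count n_p found this way satisfies n_p ≡ 1 (mod p) and divides the part of 60 prime to p, as Theorem 13.3.7 requires.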

13.4 Some applications of the Sylow theorems


We now give some applications of the Sylow theorems. First, we show that the
converse of Lagrange’s theorem is true for both finite p-groups and finite abelian
groups.

Theorem 13.4.1. Let G be a group of order p^n, p a prime number. Then G contains at
least one normal subgroup of order p^m for each m such that 0 ≤ m ≤ n.

Proof. We use induction on n. For n = 1, the theorem is trivial. By Lemma 10.5.7, any
group of order p^2 is abelian. This, together with Theorem 13.3.3, establishes the claim
for n = 2.
We now assume the theorem is true for all groups G of order p^k, where 1 ≤ k < n,
where n > 2. Let G be a group of order p^n. From Lemma 10.3.4, G has a nontrivial
center of order at least p, hence an element g ∈ Z(G) of order p. Let N = ⟨g⟩. Since
g ∈ Z(G), it follows that N is a normal subgroup of order p. Then G/N is of order p^{n−1},
therefore contains (by the induction hypothesis) normal subgroups of orders p^{m−1}, for
0 ≤ m − 1 ≤ n − 1. These groups are of the form H/N, where the normal subgroup H ⊂ G
contains N and is of order p^m, 1 ≤ m ≤ n, because |H| = |N|[H : N] = |N| ⋅ |H/N|.

On the basis of the first Sylow theorem and Theorem 13.4.1, we see that if G is a finite
group, and if p^k | |G|, then G must contain a subgroup of order p^k. One can actually show
that, as in the case of p-Sylow subgroups, the number of such subgroups is of the form
1 + pt, but we shall not prove this here.

Theorem 13.4.2. Let G be a finite abelian group of order n. Suppose that d | n. Then G
contains a subgroup of order d.

Proof. Suppose that n = p_1^{e_1} ⋯ p_k^{e_k} is the prime factorization of n. Then
d = p_1^{f_1} ⋯ p_k^{f_k} for some nonnegative integers f_1, . . . , f_k with f_i ≤ e_i. Now G has a
p_1-Sylow subgroup H_1 of order p_1^{e_1}. Hence, from Theorem 13.4.1, H_1 has a subgroup K_1
of order p_1^{f_1}. Similarly, there are subgroups K_2, . . . , K_k of G of respective orders
p_2^{f_2}, . . . , p_k^{f_k}. Moreover, since the orders are pairwise coprime, K_i ∩ K_j = {1} if i ≠ j.
Since G is abelian, it follows that ⟨K_1, K_2, . . . , K_k⟩ = K_1 K_2 ⋯ K_k has order
|K_1||K_2| ⋯ |K_k| = p_1^{f_1} ⋯ p_k^{f_k} = d.

In Section 10.5, we examined the classification of finite groups of small orders.
Here, we use the Sylow theorems to extend some of this material further.

Theorem 13.4.3. Let p, q be distinct primes with p < q and q not congruent to 1 mod p.
Then any group of order pq is cyclic. For example, any group of order 15 must be cyclic.

Proof. Suppose that |G| = pq with p < q and q not congruent to 1 mod p. The number
of q-Sylow subgroups is of the form 1 + qk and divides p. Since q is greater than p,
this implies that there can be only one; hence, there is a normal q-Sylow subgroup H.
Since q is a prime, H is cyclic of order q; therefore, there is an element g of order q.
The number of p-Sylow subgroups is of the form 1 + pk and divides q. Since q is
not congruent to 1 mod p, this implies that there also can be only one p-Sylow
subgroup; hence, there is a normal p-Sylow subgroup K. Since p is a prime, K is cyclic
of order p; therefore, there is an element h of order p. Since p, q are distinct primes,
H ∩ K = {1}. Consider the element g^{-1} h^{-1} g h. Since K is normal, g^{-1} h^{-1} g ∈ K. Then
g^{-1} h^{-1} g h = (g^{-1} h^{-1} g)h ∈ K. But H is also normal, so h^{-1} g h ∈ H. This then implies
that g^{-1} h^{-1} g h = g^{-1}(h^{-1} g h) ∈ H; therefore, g^{-1} h^{-1} g h ∈ K ∩ H. It follows then that
g^{-1} h^{-1} g h = 1; that is, gh = hg. Since g, h commute and have coprime orders, the order
of gh is the lcm of the orders of g and h, which is pq. Therefore, G has an element of
order pq. Since |G| = pq, this implies that G is cyclic.

In the above theorem, the assumption that q is not congruent to 1 mod p forces
p ≠ 2, since every odd prime q is congruent to 1 mod 2. In the case where p = 2, we get
another possibility.

Theorem 13.4.4. Let p be an odd prime and G a finite group of order 2p. Then either
G is cyclic, or G is isomorphic to the dihedral group of order 2p; that is, the group of
symmetries of a regular p-gon. In this latter case, G is generated by two elements, g
and h, which satisfy the relations g^p = h^2 = (gh)^2 = 1.

Proof. As in the proof of Theorem 13.4.3, G must have a normal cyclic subgroup of
order p, say ⟨g⟩. Since 2 | |G|, the group G must have an element of order 2, say h. Consider
the order of gh. By Lagrange’s theorem, this element can have order 1, 2, p, or 2p. If the
order is 1, then gh = 1 or g = h^{-1} = h. This is impossible since g has order p, and h has
order 2. If the order of gh is p, then from the second Sylow theorem, gh ∈ ⟨g⟩. But this
implies that h ∈ ⟨g⟩, which is impossible since every nontrivial element of ⟨g⟩ has
order p. Therefore, the order of gh is either 2 or 2p.
If the order of gh is 2p, then since G has order 2p, it must be cyclic.
If the order of gh is 2, then within G, we have the relations g^p = h^2 = (gh)^2 = 1. Let
H = ⟨g, h⟩ be the subgroup of G generated by g and h. Since H contains an element of
order p and an element of order 2, its order is divisible by 2p. Since |G| = 2p, we get
that H = G. G is isomorphic to the dihedral group D_p of order 2p (see exercises).

In the above description, g represents a rotation by 2π/p of a regular p-gon about its
center, whereas h represents any reflection across a line of symmetry of the regular
p-gon.

Example 13.4.5 (The groups of order 21). Let G be a group of order 21. The number of
7-Sylow subgroups of G is 1, because it is of the form 1 + 7k and divides 3. Hence, the
7-Sylow subgroup K is normal and cyclic; that is, K ⊲ G and K = ⟨a⟩ with a of order 7.
The number of 3-Sylow subgroups is analogously 1 or 7. If it is 1, then we have
exactly two elements of order 3 in G, and if it is 7, there are 14 elements of order 3 in G.
Let b be an element of order 3. Then b a b^{-1} = a^r for some r with 1 ≤ r ≤ 6. Now,
a = b^3 a b^{-3} = a^{r^3}; hence, r^3 ≡ 1 (mod 7), which implies r = 1, 2 or 4.
If r = 4, then b^2 also has order 3 and b^2 a b^{-2} = a^{r^2} = a^{16} = a^2, so replacing b by b^2
reduces the case r = 4 to the case r = 2.
Hence, up to isomorphism, there are exactly two groups of order 21.
If r = 1, then G is abelian. In fact, G = ⟨ab⟩ is cyclic of order 21.
The group for r = 2 can be realized as a subgroup of S_7. Let a = (1, 2, 3, 4, 5, 6, 7)
and b = (2, 3, 5)(4, 7, 6). Then b a b^{-1} = a^2, and ⟨a, b⟩ has order 21.
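The permutation realization of the nonabelian group of order 21 can be verified mechanically. The sketch below builds a and b from their cycles, checks the relation b a b^{-1} = a^2, and generates ⟨a, b⟩. It assumes that products are read right to left (apply the right factor first); the helper names `from_cycles` and `generate` are hypothetical.

```python
def from_cycles(cycles, n):
    """Permutation of {1, ..., n} given by disjoint cycles, stored 0-indexed."""
    perm = list(range(n))
    for cycle in cycles:
        for i, x in enumerate(cycle):
            perm[x - 1] = cycle[(i + 1) % len(cycle)] - 1
    return tuple(perm)

def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Subgroup generated by the given permutations (finite, so closure suffices)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

a = from_cycles([(1, 2, 3, 4, 5, 6, 7)], 7)
b = from_cycles([(2, 3, 5), (4, 7, 6)], 7)

conjugate = compose(b, compose(a, inverse(b)))      # b a b^-1
a_squared = compose(a, a)
G = generate([a, b])

# Exponents r with r^3 = 1 modulo 7, as in the example.
admissible_r = [r for r in range(1, 7) if pow(r, 3, 7) == 1]

print(conjugate == a_squared, len(G), admissible_r)
```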

We have looked at the finite fields ℤ_p. We give an example of a p-Sylow subgroup
of a matrix group over ℤ_p.

Example 13.4.6. Consider GL(n, p), the group of n × n invertible matrices over ℤ_p. If
{v_1, . . . , v_n} is a basis for (ℤ_p)^n over ℤ_p, then the size of GL(n, p) is the number of
independent images {w_1, . . . , w_n} of {v_1, . . . , v_n}. For w_1, there are p^n − 1 choices; for w_2
there are p^n − p choices and so on. It follows that

|GL(n, p)| = (p^n − 1)(p^n − p) ⋯ (p^n − p^{n−1}) = p^{1+2+⋯+(n−1)} m = p^{n(n−1)/2} m

with (p, m) = 1. Therefore, a p-Sylow subgroup must have size p^{n(n−1)/2}.
Let P be the subgroup of upper triangular matrices with 1’s on the diagonal.
Then P has size p^{1+2+⋯+(n−1)} = p^{n(n−1)/2}, and is therefore a p-Sylow subgroup of
GL(n, p).
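The counting formula is easy to evaluate numerically. The following sketch computes |GL(n, p)| from the product formula, extracts the largest power of p dividing it, and compares the formula with a direct count of invertible 2 × 2 matrices over ℤ_3. The function names are illustrative only.

```python
from itertools import product

def gl_order(n, p):
    """|GL(n, p)| = (p^n - 1)(p^n - p) ... (p^n - p^(n-1))."""
    result = 1
    for i in range(n):
        result *= p**n - p**i
    return result

def p_part(value, p):
    """Largest power of p dividing value."""
    power = 1
    while value % p == 0:
        value //= p
        power *= p
    return power

def brute_force_gl2_order(p):
    """Count 2 x 2 matrices over Z_p with nonzero determinant."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )

print(gl_order(2, 3), brute_force_gl2_order(3))
print(p_part(gl_order(3, 5), 5), 5 ** (3 * 2 // 2))
```

In each case the p-part agrees with p^{n(n−1)/2}, the number of upper triangular matrices with 1’s on the diagonal.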

The final example is a bit more difficult. We mentioned that a major result on finite
groups is the classification of the finite simple groups. This classification showed
that any finite simple group is either cyclic of prime order, in one of several classes of
groups such as the A_n, n > 4, or one of a number of special examples called sporadic
groups. One of the major tools in this classification is the following famous result,
called the Feit–Thompson theorem, which showed that any finite group G of odd
order is solvable and, in addition, if G is not of prime order, then G is nonsimple.

Theorem 13.4.7 (Feit–Thompson theorem). Any finite group of odd order is solvable.

The proof of this theorem, one of the major results in algebra in the twentieth
century, is far beyond the scope of this book. The proof is actually hundreds of pages
in length, when one counts the results used. However, we look at the smallest
nonabelian simple group.

Theorem 13.4.8. Suppose that G is a simple group of order 60. Then G is isomorphic to
A5 . Moreover, A5 is the smallest nonabelian finite simple group.

Proof. Suppose that G is a simple group of order 60 = 2^2 ⋅ 3 ⋅ 5. The number of 5-Sylow
subgroups is of the form 1 + 5k and divides 12. Hence, there are 1 or 6. Since G is assumed
simple, and all 5-Sylow subgroups are conjugate, there cannot be only one. Hence,
there are 6. Since each of these is cyclic of order 5, they intersect only in the identity.
Hence, these 6 subgroups cover 24 distinct nonidentity elements.
The number of 3-Sylow subgroups is of the form 1 + 3k and divides 20. Hence,
there are 1, 4, 10. We claim that there are 10. There cannot be only 1, since G is simple.
Suppose there were 4. Let G act on the set of 3-Sylow subgroups by conjugation. Since
an action is a permutation, this gives a homomorphism f from G into S4 . By the first
isomorphism theorem, G/ ker(f ) ≅ im(f ). However, since G is simple, the kernel must
be trivial, and this implies that G would imbed into S4 . This is impossible, since |G| =
60 > 24 = |S4 |. Therefore, there are 10 3-Sylow subgroups. Since each of these is cyclic
of order 3, they intersect only in the identity. Therefore, these 10 subgroups cover 20
distinct elements.
Hence, together with the elements in the 5-Sylow subgroups, we have 44 nontrivial
elements.
The number of 2-Sylow subgroups is of the form 1 + 2k and divides 15. Hence, there
are 1, 3, 5, 15. We claim that there are 5. As before, there cannot be only 1, since G is
simple. There cannot be 3, since as for the case of 3-Sylow subgroups, this would imply
an imbedding of G into S3 , which is impossible, given |S3 | = 6. Suppose that there
were 15 2-Sylow subgroups, each of order 4. The intersections would have a maximum
of 2 elements. Therefore, each of these would contribute at least 2 distinct elements.
This gives a minimum of 30 distinct elements. However, we already have 44 nontrivial
elements from the 3-Sylow and 5-Sylow subgroups. Since |G| = 60, this is too many.
Therefore, G must have 5 2-Sylow subgroups.
Now let G act on the set of 2-Sylow subgroups. This then, as above, implies an
imbedding of G into S5 , so we may consider G as a subgroup of S5 . However, the only
subgroup of S5 of order 60 is A5 ; therefore, G ≅ A5 .
The proof that A5 is the smallest nonabelian simple group is actually brute force.
We show that any group G of order less than 60 either has prime order, or is nonsimple.
There are strong tools that we can use. By the Feit–Thompson theorem, we must only
consider groups of even order. From Theorem 13.4.4, we do not have to consider orders
2p. The rest can be done by an analysis using Sylow theory. For example, we show that
any group of order 20 is nonsimple. Since 20 = 2^2 ⋅ 5, the number of 5-Sylow subgroups
is 1 + 5k and divides 4. Hence, there is only one; therefore, it must be normal, and so
G is nonsimple. There is a strong theorem by Burnside, whose proof is usually done
with representation theory (see Chapter 22), which says that any group whose order
is divisible by only two primes is solvable. Therefore, for |G| < 60, we only have to
show that groups of order 30 = 2 ⋅ 3 ⋅ 5 and 42 = 2 ⋅ 3 ⋅ 7 are nonsimple. This is done
in the same manner as the first part of this proof. Suppose |G| = 30. The number of
5-Sylow subgroups is of the form 1 + 5k and divides 6. Hence, there are 1 or 6. If G were
simple there would have to be 6 covering 24 distinct elements. The number of 3-Sylow
subgroups is of the form 1 + 3k and divides 10; hence, there are 1 or 10. If there were
10 these would cover an additional 20 distinct elements, which is impossible, since
we already have 24 and G has order 30. Therefore, there is only one, hence a normal
3-Sylow subgroup. It follows that G cannot be simple. The case |G| = 42 is even simpler.
There must be a normal 7-Sylow subgroup.
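The Sylow counting used throughout this proof can be automated. The sketch below lists, for a given order n and prime p, the values allowed by the third Sylow theorem, and then collects the orders below 60 for which this count alone does not force a normal Sylow subgroup. Prime powers are skipped, since p-groups are handled by Theorem 13.2.3. The function names are hypothetical, and the routine is only a bookkeeping aid: the remaining orders still need the additional arguments described above.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 2 by trial division."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def possible_sylow_counts(n, p):
    """Divisors d of m with d = 1 mod p, where n = p^a * m and p does not divide m."""
    a = prime_factorization(n)[p]
    m = n // p**a
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

def undecided_orders(limit):
    """Orders n < limit (not prime powers) where no prime forces a unique Sylow subgroup."""
    undecided = []
    for n in range(2, limit):
        factors = prime_factorization(n)
        if len(factors) == 1:
            continue  # prime powers: covered by the nontrivial center of a p-group
        if all(len(possible_sylow_counts(n, p)) > 1 for p in factors):
            undecided.append(n)
    return undecided

print(possible_sylow_counts(60, 5), possible_sylow_counts(60, 3), possible_sylow_counts(60, 2))
print(undecided_orders(60))
```

For order 60 this reproduces the candidate counts 1 or 6, then 1, 4, or 10, then 1, 3, 5, or 15 used in the proof.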

13.5 Exercises
1. Prove Lemma 13.1.3.
2. Let the group G act on itself by conjugation; that is, g(g_1) = g^{-1} g_1 g. Prove that this
is an action on the set G.
3. Show that the dihedral group D_n of order 2n has the presentation

⟨r, f; r^n = f^2 = (rf)^2 = 1⟩

(see Chapter 14 for group presentations).
4. Show that each group of order ≤ 59 is solvable.
5. Show that there is no simple group of order 84.
6. Let P1 and P2 be two different p-Sylow subgroups of a finite group G. Show that
P1 P2 is not a subgroup of G.
7. Let P and Q be two p-Sylow subgroups of the finite group G. If Z(P) is a normal
subgroup of Q, then Z(P) = Z(Q).
8. Let G be a finite group. For a prime p the following are equivalent:
(i) G has exactly one p-Sylow subgroup.
(ii) The product of any two elements of order p has some order p^k.
9. Let p be a prime and G = SL(2, p). Let P = ⟨a⟩, where a = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
(i) Determine the normalizer NG (P) and the number of p-Sylow subgroups of G.
(ii) Determine the centralizer CG (a). How many elements of order p does G have?
In how many conjugacy classes can they be decomposed?
(iii) Show that all subgroups of G of order p(p − 1) are conjugate.
(iv) Show that G has no elements of order p(p − 1) for p ≥ 5.
10. Let G be a finite group and N a normal subgroup such that |N| is a power of p.
Show that N is contained in every p-Sylow subgroup of G.
11. Let p be a prime number, and let P and Q be two p-Sylow subgroups of the finite
group G such that P is contained in N_G(Q). Show that P = Q.

14 Free groups and group presentations
14.1 Group presentations and combinatorial group theory
In discussing the symmetric group on 3 symbols and then the various dihedral groups
in Chapters 9, 10, and 11, we came across the concept of a group presentation. Roughly,
for a group G, a presentation consists of a set of generators X for G, so that G = ⟨X⟩,
and a set of relations between the elements of X, from which—in principle—the whole
group table can be constructed. In this chapter, we make this concept precise. As we
will see, every group G has a presentation, but it is mainly in the case where the group
is finite or countably infinite that presentations are most useful. Historically, the idea
of group presentations arose out of the attempt to describe the countably infinite fun-
damental groups that came out of low dimensional topology. The study of groups us-
ing group presentations is called combinatorial group theory.
Before looking at group presentations in general, we revisit two examples of finite
groups and then a class of infinite groups.
Consider the symmetric group on 3 symbols, S_3. We saw that it has the following
6 elements:

1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, b = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},
c = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, d = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.

Notice that a^3 = 1, c^2 = 1, and that ac = ca^2. We claim that

⟨a, c; a^3 = c^2 = (ac)^2 = 1⟩

is a presentation for S_3. First, it is easy to show that S_3 = ⟨a, c⟩. Indeed,

1 = 1, a = a, b = a^2, c = c, d = ac, e = a^2 c,

and so a, c generate S_3.
Now from (ac)^2 = acac = 1, we get that ca = a^2 c. This implies that if we write any
sequence (or word in our later language) in a and c, we can also rearrange it so that
the only nontrivial powers of a are a and a^2; the only nontrivial power of c is c, and all
a terms precede c terms. For example,

a c a^2 c a c = a c a (a c a c) = a(ca) = a(a^2 c) = (a^3)c = c.

Therefore, using the three relations from the presentation above, each element of S_3
can be written as a^α c^β with α = 0, 1, 2 and β = 0, 1. From this the multiplication of any
two elements can be determined.
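The relations and the normal form a^α c^β can be checked directly on the permutations listed above. This is a small sketch under the assumption that a product such as ac means "apply c first, then a", which is the convention under which d = ac and e = a^2 c hold for the elements as given. The helper `product` is a hypothetical name.

```python
IDENTITY = (0, 1, 2)
a = (1, 2, 0)   # 1 -> 2, 2 -> 3, 3 -> 1, stored 0-indexed
c = (1, 0, 2)   # swaps 1 and 2, fixes 3

def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def product(*factors):
    """Multiply permutations in the order written, e.g. product(a, c) = ac."""
    result = IDENTITY
    for factor in factors:
        result = compose(result, factor)
    return result

# The three defining relations a^3 = c^2 = (ac)^2 = 1.
relations_hold = (
    product(a, a, a) == IDENTITY
    and product(c, c) == IDENTITY
    and product(a, c, a, c) == IDENTITY
)

# Normal forms a^alpha c^beta with alpha in {0, 1, 2} and beta in {0, 1}.
normal_forms = {
    (alpha, beta): product(*([a] * alpha + [c] * beta))
    for alpha in range(3)
    for beta in range(2)
}

example = product(a, c, a, a, c, a, c)   # the word a c a^2 c a c

print(relations_hold, len(set(normal_forms.values())), example == c)
```

The six normal forms give six different permutations, so they account for all of S_3.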

https://doi.org/10.1515/9783110603996-014

This type of argument applies equally to all the dihedral groups D_n. We saw that, in
general, |D_n| = 2n. Since these are the symmetry groups of a regular n-gon, we always
have a rotation r of angle 2π/n about the center of the n-gon. This element r has
order n. Let f be a reflection about any line of symmetry. Then f^2 = 1, and rf is a
reflection about the rotated line, which is also a line of symmetry. Therefore, (rf)^2 = 1.
Exactly as for S_3, the relation (rf)^2 = 1 implies that fr = r^{-1} f = r^{n−1} f. This allows us to
always place r terms in front of f terms in any word on r and f. Therefore, the elements
of D_n are always of the form

r^α f^β, α = 0, 1, 2, . . . , n − 1, β = 0, 1.

Moreover, the relations r^n = f^2 = (rf)^2 = 1 allow us to rearrange any word in r and f
into this form. It follows that |⟨r, f⟩| = 2n; hence, D_n = ⟨r, f⟩ together with the relations
above. Hence, we obtain the following:

Theorem 14.1.1. If D_n is the symmetry group of a regular n-gon, then a presentation for
D_n is given by

D_n = ⟨r, f; r^n = f^2 = (rf)^2 = 1⟩.

(See Section 14.3 for the concept of group presentations.)
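The relations in this presentation can be tested on a concrete model of D_n. In the sketch below the vertices of the n-gon are labeled 0, . . . , n − 1, the rotation r sends i to i + 1 (mod n), and the reflection f sends i to −i (mod n). This labeling and the helper names are assumptions of the example. It confirms that the relations hold and that the words r^α f^β give 2n different symmetries, but it does not by itself prove that no further relations are needed.

```python
def compose(p, q):
    """Product pq, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def power(p, k):
    """The k-th power of the permutation p, for k >= 0."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def dihedral_check(n):
    """Check the relations of Theorem 14.1.1 on the vertices of a regular n-gon."""
    identity = tuple(range(n))
    r = tuple((i + 1) % n for i in range(n))   # rotation by 2*pi/n
    f = tuple((-i) % n for i in range(n))      # reflection fixing vertex 0
    rf = compose(r, f)
    relations = (
        power(r, n) == identity
        and power(f, 2) == identity
        and power(rf, 2) == identity
        and compose(f, r) == compose(power(r, n - 1), f)   # fr = r^(n-1) f
    )
    elements = {
        compose(power(r, alpha), power(f, beta))
        for alpha in range(n)
        for beta in range(2)
    }
    return relations, len(elements)

for n in range(3, 8):
    print(n, dihedral_check(n))
```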

We now give one class of infinite examples. If G is an infinite cyclic group, so that
G ≅ ℤ, then G = ⟨g; ⟩ is a presentation for G. That is, G has a single generator with no
relations.
A direct product of n copies of ℤ is called a free abelian group of rank n. We will
denote this by ℤ^n. A presentation for ℤ^n is then given by

ℤ^n = ⟨x_1, x_2, . . . , x_n; x_i x_j = x_j x_i for all i, j = 1, . . . , n⟩.

14.2 Free groups


Crucial to the concept of a group presentation is the idea of a free group.

Definition 14.2.1 (Universal mapping property). A group F is free on a subset X if every
map f : X → G with G a group can be extended to a unique homomorphism f : F → G.
X is called a free basis for F. In general, a group F is a free group if it is free on some
subset X. If X is a free basis for a free group F, we write F = F(X).

We first show that given any set X, there does exist a free group with free basis X.
Let X = {x_i}_{i∈I} be a set (possibly empty). We will construct a group F(X), which is free
with free basis X. First, let X^{-1} be a set disjoint from X, but bijective to X. If x_i ∈ X,
then we denote as x_i^{-1} the corresponding element of X^{-1} under the bijection, and say
that x_i and x_i^{-1} are associated. The set X^{-1} is called the set of formal inverses from X,
and we call X ∪ X^{-1} the alphabet. Elements of the alphabet are called letters. Hence, a
letter has the form x_i^{ϵ_i}, where ϵ_i = ±1. A word in X is a finite sequence of letters from
the alphabet. That is, a word has the form

w = x_{i_1}^{ϵ_{i_1}} x_{i_2}^{ϵ_{i_2}} ⋯ x_{i_n}^{ϵ_{i_n}},

where x_{i_j} ∈ X, and ϵ_{i_j} = ±1. If n = 0, we call it the empty word, which we will denote
as e. The integer n is called the length of the word. Words of the form x_i x_i^{-1} or x_i^{-1} x_i are
called trivial words. We let W(X) be the set of all words on X.
If w1 , w2 ∈ W(X), we say that w1 is equivalent to w2 , denoted as w1 ∼ w2 , if w1 can
be converted to w2 by a finite string of insertions and deletions of trivial words. For ex-
ample, if w1 = x3 x4 x4−1 x2 x2 and w2 = x3 x2 x2 , then w1 ∼ w2 . It is straightforward to verify
that this is an equivalence relation on W(X) (see exercises). Let F(X) denote the set of
equivalence classes in W(X) under this relation; hence, F(X) is a set of equivalence
classes of words from X.
A word w ∈ W(X) is said to be freely reduced or reduced if it has no trivial subwords
(a subword is a connected sequence within a word). Hence, in the example above,
w2 = x3 x2 x2 is reduced, but w1 = x3 x4 x4−1 x2 x2 is not reduced. There is a unique element
of minimal length in each equivalence class in F(X). Furthermore, this element must
be reduced or else it would be equivalent to something of smaller length. Two reduced
words in W(X) are either equal or not in the same equivalence class in F(X). Hence,
F(X) can also be considered as the set of all reduced words from W(X).
Given a word $w = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}}$, we can find the unique reduced word $\bar{w}$ equivalent
to w via the following free reduction process. Beginning from the left side of w, we
cancel each occurrence of a trivial subword. After all these possible cancellations, we
have a word $w'$. Now we repeat the process again, starting from the left side. Since w
has finite length, eventually the resulting word will either be empty or reduced. The
final reduced word $\bar{w}$ is the free reduction of w.
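The free reduction process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a letter is modelled as a pair `(name, exponent)` with exponent `+1` or `-1`, and the helper names `inverse`, `free_reduce` and `multiply` are hypothetical. Instead of repeated left-to-right passes, the sketch uses a single pass with a stack, which produces the same reduced word.

```python
def inverse(letter):
    """Return the inverse letter: same generator, opposite exponent."""
    name, eps = letter
    return (name, -eps)


def free_reduce(word):
    """Return the free reduction of a word (a list of letters).

    A letter is pushed onto a stack unless it cancels the letter on top,
    in which case the trivial subword is deleted by popping.
    """
    reduced = []
    for letter in word:
        if reduced and reduced[-1] == inverse(letter):
            reduced.pop()
        else:
            reduced.append(letter)
    return reduced


def multiply(w1, w2):
    """Product in F(X): concatenate, then freely reduce."""
    return free_reduce(list(w1) + list(w2))


# The example from the text: w1 = x3 x4 x4^-1 x2 x2 reduces to w2 = x3 x2 x2.
w1 = [("x3", 1), ("x4", 1), ("x4", -1), ("x2", 1), ("x2", 1)]
w2 = [("x3", 1), ("x2", 1), ("x2", 1)]
print(free_reduce(w1) == w2)
```

Here `multiply` mirrors the product on reduced words: concatenate first, then freely reduce.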
Now we build a multiplication on F(X). If

$$w_1 = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}}, \quad w_2 = x_{j_1}^{\epsilon_{j_1}} x_{j_2}^{\epsilon_{j_2}} \cdots x_{j_m}^{\epsilon_{j_m}}$$

are two words in W(X), then their concatenation $w_1 \star w_2$ is simply placing $w_2$ after $w_1$,

$$w_1 \star w_2 = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}} x_{j_1}^{\epsilon_{j_1}} x_{j_2}^{\epsilon_{j_2}} \cdots x_{j_m}^{\epsilon_{j_m}}.$$

If $w_1, w_2 \in F(X)$, then we define their product as

$$w_1 w_2 = \text{equivalence class of } w_1 \star w_2.$$

That is, we concatenate $w_1$ and $w_2$, and the product is the equivalence class of the resulting
word. It is easy to show that if $w_1 \sim w_1'$ and $w_2 \sim w_2'$, then $w_1 \star w_2 \sim w_1' \star w_2'$ so that
the above multiplication is well-defined. Equivalently, we can think of this product in
the following way. If $w_1, w_2$ are reduced words, then to find $w_1 w_2$, first concatenate, and
then freely reduce. Notice that if $x_{i_n}^{\epsilon_{i_n}} x_{j_1}^{\epsilon_{j_1}}$ is a trivial word, then it is cancelled when the
concatenation is formed. We say then that there is cancellation in forming the product
$w_1 w_2$. Otherwise, the product is formed without cancellation.

Theorem 14.2.2. Let X be a set, and let F(X) be as above. Then F(X) is a free
group with free basis X. Furthermore, if X = ∅, then F(X) = {1}; if |X| = 1, then F(X) ≅ ℤ;
and if |X| ≥ 2, then F(X) is nonabelian.

Proof. We first show that F(X) is a group, and then show that it satisfies the universal
mapping property on X. We consider F(X) as the set of reduced words in W(X) with the
multiplication defined above. Clearly, the empty word acts as the identity element 1.
If $w = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}}$ and $w_1 = x_{i_n}^{-\epsilon_{i_n}} x_{i_{n-1}}^{-\epsilon_{i_{n-1}}} \cdots x_{i_1}^{-\epsilon_{i_1}}$, then both $w \star w_1$ and $w_1 \star w$ freely
reduce to the empty word, and so $w_1$ is the inverse of w. Therefore, each element of
F(X) has an inverse. Hence, to show that F(X) forms a group, it remains to show that
the multiplication is associative. Let

$$w_1 = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}}, \quad w_2 = x_{j_1}^{\epsilon_{j_1}} x_{j_2}^{\epsilon_{j_2}} \cdots x_{j_m}^{\epsilon_{j_m}}, \quad w_3 = x_{k_1}^{\epsilon_{k_1}} x_{k_2}^{\epsilon_{k_2}} \cdots x_{k_p}^{\epsilon_{k_p}}$$

be three freely reduced words in F(X). We must show that

$$(w_1 w_2) w_3 = w_1 (w_2 w_3).$$

To prove this, we use induction on m, the length of $w_2$. If m = 0, then $w_2$ is the
empty word, hence the identity, and the claim is certainly true. Now suppose that m = 1 so
that $w_2 = x_{j_1}^{\epsilon_{j_1}}$. We must consider exactly four cases.
Case (1): There is no cancellation in forming either $w_1 w_2$ or $w_2 w_3$. That is,
$x_{j_1}^{\epsilon_{j_1}} \neq x_{i_n}^{-\epsilon_{i_n}}$ and $x_{j_1}^{\epsilon_{j_1}} \neq x_{k_1}^{-\epsilon_{k_1}}$.
Then the product $w_1 w_2$ is just the concatenation of the words,
and so is $(w_1 w_2) w_3$. The same is true for $w_1 (w_2 w_3)$. Therefore, in this case, $w_1 (w_2 w_3) =
(w_1 w_2) w_3$.
Case (2): There is cancellation in forming w1 w2 , but not in forming w2 w3 . Then if we
concatenate all three words, the only cancellation occurs between w1 and w2 in either
w1 (w2 w3 ) or in (w1 w2 )w3 ; hence, they are equal. Therefore, in this case, w1 (w2 w3 ) =
(w1 w2 )w3 .
Case (3): There is cancellation in forming w2 w3 , but not in forming w1 w2 . This is
entirely analogous to Case (2). Therefore, in this case, w1 (w2 w3 ) = (w1 w2 )w3 .
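Case (4): There is cancellation in forming $w_1 w_2$ and also in forming $w_2 w_3$. Then
$x_{j_1}^{\epsilon_{j_1}} = x_{i_n}^{-\epsilon_{i_n}}$ and $x_{j_1}^{\epsilon_{j_1}} = x_{k_1}^{-\epsilon_{k_1}}$. Here,

$$(w_1 w_2) w_3 = x_{i_1}^{\epsilon_{i_1}} \cdots x_{i_{n-1}}^{\epsilon_{i_{n-1}}} x_{k_1}^{\epsilon_{k_1}} x_{k_2}^{\epsilon_{k_2}} \cdots x_{k_p}^{\epsilon_{k_p}}.$$

On the other hand,

$$w_1 (w_2 w_3) = x_{i_1}^{\epsilon_{i_1}} \cdots x_{i_n}^{\epsilon_{i_n}} x_{k_2}^{\epsilon_{k_2}} \cdots x_{k_p}^{\epsilon_{k_p}}.$$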

However, these are equal since $x_{i_n}^{\epsilon_{i_n}} = x_{k_1}^{\epsilon_{k_1}}$. Therefore, in this final case, $w_1 (w_2 w_3) =
(w_1 w_2) w_3$.
It follows, inductively, from these four cases, that the associative law holds in
F(X); therefore, F(X) forms a group.
Now suppose that f : X → G is a map from X into a group G. By the construction
of F(X) as a set of reduced words, this can be extended to a unique homomorphism. If
$w \in F(X)$ with $w = x_{i_1}^{\epsilon_{i_1}} \cdots x_{i_n}^{\epsilon_{i_n}}$, then define $f(w) = f(x_{i_1})^{\epsilon_{i_1}} \cdots f(x_{i_n})^{\epsilon_{i_n}}$. Since multiplication
in F(X) is concatenation followed by free reduction, and deleting a trivial word does not
change the image in G, this defines a homomorphism, and again from the construction
of F(X), it is the only one extending f. This is analogous to constructing a linear
transformation from one vector space to another by specifying the images of a basis.
Therefore, F(X) satisfies the universal mapping property of Definition 14.2.1. Hence,
F(X) is a free group with free basis X.
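As an illustration of the universal mapping property, the following Python sketch extends a map from generators into a concrete group to a map on words. Everything here is an assumption made for the example: the target group is the symmetric group on three points, permutations are stored as tuples, and the names `compose`, `invert` and `extend` are hypothetical.

```python
def compose(p, q):
    """Product of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def extend(f, identity):
    """Extend f: generator name -> group element to a map on words.

    A word is a list of letters (name, exponent) with exponent +1 or -1;
    the image is the product of the images of its letters.
    """
    def phi(word):
        result = identity
        for name, eps in word:
            g = f[name] if eps == 1 else invert(f[name])
            result = compose(result, g)
        return result
    return phi


identity = (0, 1, 2)
f = {"x1": (1, 0, 2), "x2": (1, 2, 0)}   # images chosen arbitrarily
phi = extend(f, identity)

u = [("x1", 1), ("x2", -1)]
v = [("x2", 1), ("x2", 1), ("x1", -1)]
# Homomorphism property on the concatenation u * v.
print(phi(u + v) == compose(phi(u), phi(v)))
```

Because a trivial word maps to the identity, `phi` takes the same value on a word and on its free reduction, which is why it is well defined on F(X).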
The final parts of Theorem 14.2.2 are straightforward. If X is empty, the only re-
duced word is the empty word; hence, the group is just the identity. If X has a single
letter, then F(X) has a single generator, and is therefore cyclic. It is easy to see that
it must be torsion-free. Therefore, F(X) is infinite cyclic; that is, F(X) ≅ ℤ. Finally,
if |X| ≥ 2, let x1 , x2 ∈ X. Then x1 x2 ≠ x2 x1 , and both are reduced. Therefore, F(X) is
nonabelian.

The proof of Theorem 14.2.2 provides another way to look at free groups.

Theorem 14.2.3. F is a free group if and only if there is a generating set X such that every
element of F has a unique representation as a freely reduced word on X.

The structure of a free group is entirely dependent on the cardinality of a free
basis. In particular, the cardinality of a free basis X for a free group F is unique, and is
called the rank of F. If |X| < ∞, F is of finite rank. If F has rank n and $X = \{x_1, x_2, \ldots, x_n\}$,
we say that F is free on $\{x_1, x_2, \ldots, x_n\}$. We denote this by $F(x_1, x_2, \ldots, x_n)$.

Theorem 14.2.4. If X and Y are sets with the same cardinality, that is, |X| = |Y|, then
the free groups F(X) and F(Y) are isomorphic. Conversely, if F(X) ≅ F(Y), then
|X| = |Y|.

Proof. Suppose that f : X → Y is a bijection from X onto Y. Now Y ⊂ F(Y), so there
is a unique homomorphism $\phi : F(X) \to F(Y)$ extending f. Since f is a bijection, it has
an inverse $f^{-1} : Y \to X$, and since F(Y) is free, there is a unique homomorphism $\phi_1$
from F(Y) to F(X) extending $f^{-1}$. Then $\phi \phi_1$ is the identity map on F(Y), and $\phi_1 \phi$ is the
identity map on F(X). Therefore, $\phi, \phi_1$ are isomorphisms with $\phi = \phi_1^{-1}$.
Conversely, suppose that F(X) ≅ F(Y). In F(X), let N(X) be the subgroup generated
by all squares in F(X); that is,

$$N(X) = \langle \{ g^2 : g \in F(X) \} \rangle.$$

Then N(X) is a normal subgroup, and the factor group F(X)/N(X) is abelian, where
every nontrivial element has order 2 (see exercises). Therefore, F(X)/N(X) can be considered
as a vector space over $\mathbb{Z}_2$, the finite field of order 2, with the image of X as a vector space
basis. Hence, |X| is the dimension of this vector space. Let N(Y) be the corresponding
subgroup of F(Y). Since F(X) ≅ F(Y), we would have F(X)/N(X) ≅ F(Y)/N(Y);
therefore, |Y| is the dimension of the vector space F(Y)/N(Y). Thus, |X| = |Y| from the
uniqueness of dimension of vector spaces.

Expressing elements of F(X) as reduced words gives a normal form for elements
in a free group F. As we will see in Section 14.5, this solves what is termed the word
problem for free groups. Another important concept is the following: a freely reduced
word $W = x_{v_1}^{e_1} x_{v_2}^{e_2} \cdots x_{v_n}^{e_n}$ is cyclically reduced if $v_1 \neq v_n$, or if $v_1 = v_n$, then $e_1 \neq -e_n$. Clearly
then, every element of a free group is conjugate to an element given by a cyclically
reduced word. This provides a method to determine conjugacy in free groups.

Theorem 14.2.5. In a free group F, two elements g1 , g2 are conjugate if and only if a
cyclically reduced word for g1 is a cyclic permutation of a cyclically reduced word for g2 .
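The conjugacy criterion can be turned into a short test. The following Python sketch is an illustration only: words are lists of letters `(name, exponent)` with exponent `+1` or `-1`, and the function names `free_reduce`, `cyclic_reduce` and `are_conjugate` are hypothetical choices for this example.

```python
def free_reduce(word):
    """Free reduction using a stack of letters (name, exponent)."""
    reduced = []
    for name, eps in word:
        if reduced and reduced[-1] == (name, -eps):
            reduced.pop()
        else:
            reduced.append((name, eps))
    return reduced


def cyclic_reduce(word):
    """Strip mutually inverse first and last letters of the free reduction."""
    w = free_reduce(word)
    while len(w) >= 2 and w[0] == (w[-1][0], -w[-1][1]):
        w = w[1:-1]
    return w


def are_conjugate(u, v):
    """Compare cyclically reduced forms up to cyclic permutation."""
    cu, cv = cyclic_reduce(u), cyclic_reduce(v)
    if len(cu) != len(cv):
        return False
    if not cu:
        return True
    return any(cu[i:] + cu[:i] == cv for i in range(len(cu)))


ab = [("a", 1), ("b", 1)]
conj = [("b", -1), ("a", 1), ("b", 1), ("b", 1)]   # b^-1 (ab) b
print(are_conjugate(ab, conj))
```

Removing the first and last letters of a freely reduced word leaves a freely reduced word, so no further reduction is needed inside the loop.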

The theory of free groups has a large and extensive literature. We close this section
by stating several important properties. Proofs for these results can be found in [31],
[30] or [20].

Theorem 14.2.6. A free group is torsion-free.

From Theorem 14.2.4, we can deduce:

Theorem 14.2.7. An abelian subgroup of a free group must be cyclic.

Finally, a celebrated theorem of Nielsen and Schreier states that a subgroup of a
free group must be free.

Theorem 14.2.8 (Nielsen–Schreier). A subgroup of a free group is itself a free group.

Combinatorially, F is free on X if X is a set of generators for F, and there are no
nontrivial relations among these generators.
There are several different proofs of the Nielsen–Schreier theorem (see [31]), with the most
straightforward being topological in nature. We give an outline of a simple topological proof in
Section 14.4.
About 1920, Nielsen, using a technique now called Nielsen transformations in his
honor, first proved this theorem for finitely generated subgroups. Schreier, shortly af-
ter, found a combinatorial method to extend this to arbitrary subgroups. A complete
version of the original combinatorial proof appears in [31], and in the notes by John-
son [26].
Schreier’s combinatorial proof also allows for a description of the free basis for
the subgroup. In particular, let F be free on X, and H ⊂ F a subgroup. Let $T = \{t_\alpha\}$
be a complete set of right coset representatives for F mod H with the property that
if $t_\alpha = x_{v_1}^{e_1} x_{v_2}^{e_2} \cdots x_{v_n}^{e_n} \in T$, with $e_i = \pm 1$, then all the initial segments $1, x_{v_1}^{e_1}, x_{v_1}^{e_1} x_{v_2}^{e_2}$, et
cetera are also in T. Such a system of coset representatives can always be found, and
is called a Schreier system or Schreier transversal for H. If g ∈ F, let $\bar{g}$ represent its coset
representative in T, and further define for g ∈ F and t ∈ T, $S_{tg} = tg(\overline{tg})^{-1}$. Notice that
$S_{tg} \in H$ for all t, g. We then have the following:

Theorem 14.2.9 (Explicit form of Nielsen–Schreier). Let F be free on X and H a subgroup
of F. If T is a Schreier transversal for F mod H, then H is free on the set
$\{S_{tx} : t \in T, x \in X, S_{tx} \neq 1\}$.

Example 14.2.10. Let F be free on {a, b} and $H = F(X^2)$ the normal subgroup of F generated
by all squares in F.
Then $F/F(X^2) = \langle a, b; a^2 = b^2 = (ab)^2 = 1 \rangle = \mathbb{Z}_2 \times \mathbb{Z}_2$ (see Section 14.3 for
the concept of group presentations). It follows that a Schreier system for F mod H is
{1, a, b, ab} with $\bar{a} = a$, $\bar{b} = b$ and $\overline{ba} = ab$. From this it can be shown that H is free on
the generating set

$$x_1 = a^2, \quad x_2 = bab^{-1}a^{-1}, \quad x_3 = b^2, \quad x_4 = abab^{-1}, \quad x_5 = ab^2a^{-1}.$$

The theorem also allows for a computation of the rank of H, given the rank of F
and the index. Specifically:

Corollary 14.2.11. Suppose F is free of rank n and H is a subgroup of F with |F : H| = k.
Then H is free of rank nk − k + 1.

From the example, we see that F is free of rank 2, H has index 4, so H is free of
rank 2 ⋅ 4 − 4 + 1 = 5.
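The Schreier generators of Example 14.2.10 can be recomputed mechanically. The following Python sketch is a hedged illustration: it assumes that the coset of a word modulo H is determined by the exponent sums of a and b modulo 2 (which holds here because F/H is ℤ2 × ℤ2), and the helper names `coset_rep`, `schreier_generators` and `show` are hypothetical.

```python
def free_reduce(word):
    """Free reduction of a list of letters (name, exponent)."""
    reduced = []
    for name, eps in word:
        if reduced and reduced[-1] == (name, -eps):
            reduced.pop()
        else:
            reduced.append((name, eps))
    return reduced


def inverse_word(word):
    """Inverse of a word: reverse the order and negate the exponents."""
    return [(name, -eps) for name, eps in reversed(word)]


def coset_rep(word):
    """Representative in T = {1, a, b, ab} of the coset of H containing word."""
    i = sum(eps for name, eps in word if name == "a") % 2
    j = sum(eps for name, eps in word if name == "b") % 2
    return [("a", 1)] * i + [("b", 1)] * j


def schreier_generators(transversal, generators):
    """Nontrivial elements S_{t,x} = t x (rep(t x))^-1, freely reduced."""
    result = []
    for t in transversal:
        for x in generators:
            tx = t + [(x, 1)]
            s = free_reduce(tx + inverse_word(coset_rep(tx)))
            if s:
                result.append(s)
    return result


def show(word):
    """Render a word as a string such as 'bab^-1a^-1'."""
    return "".join(name if eps == 1 else name + "^-1" for name, eps in word)


T = [[], [("a", 1)], [("b", 1)], [("a", 1), ("b", 1)]]
gens = schreier_generators(T, ["a", "b"])
print([show(g) for g in gens])
```

The number of nontrivial generators found this way agrees with the rank formula nk − k + 1 for n = 2 and k = 4.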

14.3 Group presentations


The significance of free groups stems from the following result, which is easily de-
duced from the definition and will lead us directly to a formal definition of a group
presentation. Let G be any group and F the free group on the elements of G considered
as a set. The identity map f : G → G can be extended to a homomorphism of F onto G.
Therefore, we have the following:

Theorem 14.3.1. Every group G is a homomorphic image of a free group. That is, let G
be any group. Then G ≅ F/N, where F is a free group and N is a normal subgroup of F.

In the above theorem, instead of taking all the elements of G, we can consider
just a set X of generators for G. Then G is a factor group of F(X), G ≅ F(X)/N. The
normal subgroup N is the kernel of the homomorphism from F(X) onto G. We use The-
orem 14.3.1 to formally define a group presentation.
If H is a subset of a group G, then the normal closure of H denoted by N(H) is the
smallest normal subgroup of G containing H. This can be described alternatively in
the following manner. The normal closure of H is the subgroup of G generated by all
conjugates of elements of H.

Now suppose that G is a group and X is a set of generators for G. We also call X a
generating system for G. Now let G = F(X)/N as in Theorem 14.3.1 and the comments
after it. N is the kernel of the homomorphism f : F(X) → G. It follows that if r is a free
group word with r ∈ N, then r = 1 in G (under the homomorphism). We then call r
a relator in G, and the equation r = 1 a relation in G. Suppose that R is a subset of N
such that N = N(R), then R is called a set of defining relators for G. The equations r = 1,
r ∈ R, are a set of defining relations for G. It follows that any relator in G is a product of
conjugates of elements of R. Equivalently, r ∈ F(X) is a relator in G if and only if r can
be reduced to the empty word by insertions and deletions of elements of R, and trivial
words.

Definition 14.3.2. Let G be a group. Then a group presentation for G consists of a set of
generators X for G and a set R of defining relators. In this case, we write G = ⟨X; R⟩. We
could also write the presentation in terms of defining relations as G = ⟨X; r = 1, r ∈ R⟩.

From Theorem 14.3.1, it follows immediately that every group has a presentation.
However, in general, there are many presentations for the same group. For instance, if
R ⊂ R1 ⊂ N, then R1 is also a set of defining relators.

Lemma 14.3.3. Let G be a group. Then G has a presentation.

If G = ⟨X; R⟩ and X is finite, then G is said to be finitely generated. If R is finite, G
is finitely related. If both X and R are finite, G is finitely presented.
Using group presentations, we get another characterization of free groups.

Theorem 14.3.4. F is a free group if and only if F has a presentation of the form F = ⟨X; ⟩.

Mimicking the construction of a free group from a set X, we can show that to each
presentation corresponds a group. Suppose that we are given a supposed presentation
⟨X; R⟩, where R is given as a set of words in X. Consider the free group F(X) on X.
Define two words w1 , w2 on X to be equivalent if w1 can be transformed into w2 using
insertions and deletions of elements of R and trivial words. As in the free group case,
this is an equivalence relation. Let G be the set of equivalence classes. If we define
multiplication as before, as concatenation followed by the appropriate equivalence
class, then G is a group. Furthermore, each r ∈ R must equal the identity in G so that
G = ⟨X; R⟩. Notice that here there may be no unique reduced word for an element
of G.

Theorem 14.3.5. Given (X, R), where X is a set and R is a set of words on X. Then there
exists a group G with presentation ⟨X; R⟩.

We now give some examples of group presentations:

Example 14.3.6. A free group of rank n has a presentation

$$F_n = \langle x_1, \ldots, x_n ; \ \rangle.$$

Example 14.3.7. A free abelian group of rank n has a presentation

$$\mathbb{Z}^n = \langle x_1, \ldots, x_n ; \ x_i x_j x_i^{-1} x_j^{-1}, \ i = 1, \ldots, n, \ j = 1, \ldots, n \rangle.$$

Example 14.3.8. A cyclic group of order n has a presentation

$$\mathbb{Z}_n = \langle x ; \ x^n = 1 \rangle.$$

Example 14.3.9. The dihedral group of order 2n, representing the symmetry group of
a regular n-gon, has a presentation

$$\langle r, f ; \ r^n = 1, \ f^2 = 1, \ (rf)^2 = 1 \rangle.$$
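As a sanity check on Example 14.3.9, the relations can be verified in a concrete model. The Python sketch below is an assumption-laden illustration: r and f are realized as permutations of the vertices 0, ..., n − 1 of a regular n-gon (rotation i ↦ i + 1 and reflection i ↦ −i modulo n), and the helper names are hypothetical. It checks that the relations hold and that the generated group has order 2n; it does not by itself prove that the relations are defining.

```python
def compose(p, q):
    """Product of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def power(p, k):
    """k-th power of a permutation for k >= 0."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result


def generated_group(gens):
    """All products of the given permutations (closure under right multiplication)."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def dihedral_generators(n):
    """Rotation r and reflection f acting on the vertices of a regular n-gon."""
    r = tuple((i + 1) % n for i in range(n))
    f = tuple((-i) % n for i in range(n))
    return r, f


n = 5
r, f = dihedral_generators(n)
e = tuple(range(n))
print(power(r, n) == e, power(f, 2) == e, power(compose(r, f), 2) == e)
print(len(generated_group([r, f])))
```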

14.3.1 The modular group

In this section, we give a more complicated example, and then a nice application to
number theory.
If R is any commutative ring with an identity, then the set of invertible n × n matrices
with entries from R forms a group under matrix multiplication called the n-dimensional
general linear group over R (see [32]). This group is denoted by $\mathrm{GL}_n(R)$. Since
det(A) det(B) = det(AB) for square matrices A, B, it follows that the subset of $\mathrm{GL}_n(R)$,
consisting of those matrices of determinant 1, forms a subgroup. This subgroup is
called the special linear group over R and is denoted by $\mathrm{SL}_n(R)$. In this section, we
concentrate on $\mathrm{SL}_2(\mathbb{Z})$, or more specifically, a quotient of it, $\mathrm{PSL}_2(\mathbb{Z})$, and find presentations
for them.
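The group $\mathrm{SL}_2(\mathbb{Z})$ then consists of 2 × 2 integral matrices of determinant one:

$$\mathrm{SL}_2(\mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z}, \ ad - bc = 1 \right\}.$$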

$\mathrm{SL}_2(\mathbb{Z})$ is called the homogeneous modular group, and an element of $\mathrm{SL}_2(\mathbb{Z})$ is called a
unimodular matrix.
If G is any group, recall that its center Z(G) consists of those elements of G, which
commute with all elements of G:

$$Z(G) = \{ g \in G : gh = hg, \ \forall h \in G \}.$$

Z(G) is a normal subgroup of G. Hence, we can form the factor group G/Z(G). For G =
$\mathrm{SL}_2(\mathbb{Z})$, the only unimodular matrices that commute with all others are $\pm I = \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
Therefore, $Z(\mathrm{SL}_2(\mathbb{Z})) = \{I, -I\}$. The quotient

$$\mathrm{SL}_2(\mathbb{Z}) / Z(\mathrm{SL}_2(\mathbb{Z})) = \mathrm{SL}_2(\mathbb{Z}) / \{I, -I\}$$

is denoted by $\mathrm{PSL}_2(\mathbb{Z})$ and is called the projective special linear group or inhomogeneous
modular group. More commonly, $\mathrm{PSL}_2(\mathbb{Z})$ is just called the Modular Group, and
denoted by M.
M arises in many different areas of mathematics, including number theory, complex
analysis, Riemann surface theory, and the theory of automorphic forms and
functions. M is perhaps the most widely studied single finitely presented group. Complete
discussions of M and its structure can be found in the books Integral Matrices by
M. Newman [46] and Algebraic Theory of the Bianchi Groups by B. Fine [41].
Since $M = \mathrm{PSL}_2(\mathbb{Z}) = \mathrm{SL}_2(\mathbb{Z}) / \{I, -I\}$, it follows that each element of M can be
considered as ±A, where A is a unimodular matrix. A projective unimodular matrix is
then

$$\pm \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad a, b, c, d \in \mathbb{Z}, \quad ad - bc = 1.$$

The elements of M can also be considered as linear fractional transformations over
the complex numbers

$$z' = \frac{az + b}{cz + d}, \quad a, b, c, d \in \mathbb{Z}, \quad ad - bc = 1, \quad \text{where } z \in \mathbb{C}.$$
Thought of in this way, M forms a Fuchsian group, which is a discrete group of isome-
tries of the non-Euclidean hyperbolic plane. The book by Katok [27] gives a solid and
clear introduction to such groups. This material can also be found in condensed form
in [43].
We now determine presentations for both SL2 (ℤ) and M = PSL2 (ℤ).

Theorem 14.3.10. The group $\mathrm{SL}_2(\mathbb{Z})$ is generated by the elements

$$X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}.$$

Furthermore, a complete set of defining relations for the group in terms of these
generators is given by

$$X^4 = Y^3 = Y X^2 Y^{-1} X^{-2} = I.$$

It follows that $\mathrm{SL}_2(\mathbb{Z})$ has the presentation

$$\langle X, Y ; \ X^4 = Y^3 = Y X^2 Y^{-1} X^{-2} = I \rangle.$$

Proof. We first show that $\mathrm{SL}_2(\mathbb{Z})$ is generated by X and Y; that is, every matrix A in the
group can be written as a product of powers of X and Y.
Let

$$U = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Then a direct multiplication shows that U = XY, and we show that $\mathrm{SL}_2(\mathbb{Z})$ is generated
by X and U, which implies that it is also generated by X and Y. Furthermore,

$$U^n = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix};$$

therefore, U has infinite order.
Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$. Then we have

$$XA = \begin{pmatrix} -c & -d \\ a & b \end{pmatrix}, \quad \text{and} \quad U^k A = \begin{pmatrix} a + kc & b + kd \\ c & d \end{pmatrix}$$

for any k ∈ ℤ. We may assume that |c| ≤ |a|; otherwise, start with XA rather than A. If
c = 0, then $A = \pm U^q$ for some q. If $A = U^q$, then certainly A is in the group generated
by X and U. If $A = -U^q$, then $A = X^2 U^q$ since $X^2 = -I$. It follows that here also A is in
the group generated by X and U.
Now suppose c ≠ 0. Apply the Euclidean algorithm to a and c in the following
modified way:

$$\begin{aligned}
a &= q_0 c + r_1 \\
-c &= q_1 r_1 + r_2 \\
r_1 &= q_2 r_2 + r_3 \\
&\ \ \vdots \\
(-1)^n r_{n-1} &= q_n r_n + 0,
\end{aligned}$$

where $r_n = \pm 1$ since (a, c) = 1. Then

$$X U^{-q_n} \cdots X U^{-q_0} A = \pm U^{q_{n+1}} \quad \text{with } q_{n+1} \in \mathbb{Z}.$$

Therefore,

$$A = X^m U^{q_0} X U^{q_1} \cdots X U^{q_n} X U^{q_{n+1}}$$

with m = 0, 1, 2, 3; $q_0, q_1, \ldots, q_{n+1} \in \mathbb{Z}$ and $q_0, \ldots, q_n \neq 0$. Thus, X and U, and hence X
and Y, generate $\mathrm{SL}_2(\mathbb{Z})$.
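The generation argument is constructive, and a variant of it can be sketched in Python. This is a minimal sketch, not the exact normal form of the proof: it reduces the lower-left entry with floor division, records each step as a power of X or U, and uses X^3 in place of X^{-1} (valid because X^4 = I). The function names `mat_mul`, `decompose` and `evaluate` are hypothetical.

```python
X = ((0, -1), (1, 0))
IDENTITY = ((1, 0), (0, 1))


def mat_mul(A, B):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def decompose(A):
    """Return factors ('X', k) / ('U', k) whose ordered product equals A."""
    (a, b), (c, d) = A
    if a * d - b * c != 1:
        raise ValueError("matrix must have determinant 1")
    factors = []
    while c != 0:
        q = a // c                       # floor division, so |a - q*c| < |c|
        a, b = a - q * c, b - q * d      # pass from A to U^(-q) A
        if q != 0:
            factors.append(("U", q))
        a, b, c, d = -c, -d, a, b        # pass from A to X A
        factors.append(("X", 3))         # X^(-1) = X^3 since X^4 = I
    if a == -1:                          # remaining matrix is -U^(-b) = X^2 U^(-b)
        factors.append(("X", 2))
        b = -b
    if b != 0:
        factors.append(("U", b))
    return factors


def evaluate(factors):
    """Multiply the recorded powers of X and U back together."""
    result = IDENTITY
    for name, k in factors:
        if name == "U":
            M = ((1, k), (0, 1))
        else:
            M = IDENTITY
            for _ in range(k % 4):
                M = mat_mul(M, X)
        result = mat_mul(result, M)
    return result


A = ((2, 1), (1, 1))
print(decompose(A))
print(evaluate(decompose(A)) == A)
```

Each pass through the loop strictly decreases |c|, which is the same termination argument as in the Euclidean algorithm used in the proof.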
We must now show that

$$X^4 = Y^3 = Y X^2 Y^{-1} X^{-2} = I$$

form a complete set of defining relations for $\mathrm{SL}_2(\mathbb{Z})$, or that every relation on these
generators is derivable from these. It is straightforward to see that X and Y do satisfy
these relations. Assume then that we have a relation

$$S = X^{\epsilon_1} Y^{\alpha_1} X^{\epsilon_2} Y^{\alpha_2} \cdots Y^{\alpha_n} X^{\epsilon_{n+1}} = I$$

with all $\epsilon_i, \alpha_j \in \mathbb{Z}$. Using the set of relations

$$X^4 = Y^3 = Y X^2 Y^{-1} X^{-2} = I,$$

we may transform S so that

$$S = X^{\epsilon_1} Y^{\alpha_1} X Y^{\alpha_2} \cdots Y^{\alpha_m} X^{\epsilon_{m+1}}$$

with $\epsilon_1, \epsilon_{m+1} = 0, 1, 2$ or 3 and $\alpha_i = 1$ or 2 for i = 1, . . . , m and m ≥ 0. Multiplying by a
suitable power of X, we obtain

$$Y^{\alpha_1} X \cdots Y^{\alpha_m} X = X^{\alpha} = S_1$$

with m ≥ 0 and α = 0, 1, 2 or 3. Assume that m ≥ 1, and let

$$S_1 = \begin{pmatrix} a & -b \\ -c & d \end{pmatrix}.$$

We show by induction that

$$a, b, c, d \geq 0, \quad b + c > 0,$$

or

$$a, b, c, d \leq 0, \quad b + c < 0.$$

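This claim for the entries of $S_1$ is true for

$$YX = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}, \quad \text{and} \quad Y^2 X = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}.$$

Suppose it is correct for $S_2 = \begin{pmatrix} a_1 & -b_1 \\ -c_1 & d_1 \end{pmatrix}$. Then

$$Y X S_2 = \begin{pmatrix} a_1 & -b_1 \\ -(a_1 + c_1) & b_1 + d_1 \end{pmatrix} \quad \text{and} \quad Y^2 X S_2 = \begin{pmatrix} -a_1 - c_1 & b_1 + d_1 \\ c_1 & -d_1 \end{pmatrix}.$$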

Therefore, the claim is correct for all S1 with m ≥ 1. This gives a contradiction, for the
entries of X α with α = 0, 1, 2 or 3 do not satisfy the claim. Hence, m = 0, and S can be
reduced to a trivial relation by the given set of relations. Therefore, they are a complete
set of defining relations, and the theorem is proved.

Corollary 14.3.11. The modular group $M = \mathrm{PSL}_2(\mathbb{Z})$ has the presentation

$$M = \langle x, y ; \ x^2 = y^3 = 1 \rangle.$$

Furthermore, x, y can be taken as the linear fractional transformations

$$x : z' = -\frac{1}{z}, \quad \text{and} \quad y : z' = -\frac{1}{z + 1}.$$

Proof. The center of $\mathrm{SL}_2(\mathbb{Z})$ is {I, −I}. Since $X^2 = -I$, setting $X^2 = I$ in the presentation for
$\mathrm{SL}_2(\mathbb{Z})$ gives the presentation for M. Writing the projective matrices as linear fractional
transformations gives the second statement.

This corollary says that M is the free product of a cyclic group of order 2 and a
cyclic group of order 3, a concept we will introduce in Section 14.7.
We note that there is an elementary alternative proof to Corollary 14.3.11 as far as
showing that $X^2 = Y^3 = 1$ are a complete set of defining relations. As linear fractional
transformations, we have

$$X(z) = -\frac{1}{z}, \quad Y(z) = -\frac{1}{z + 1}, \quad Y^2(z) = -\frac{z + 1}{z}.$$

Now let

$$\mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\} \quad \text{and} \quad \mathbb{R}^- = \{x \in \mathbb{R} : x < 0\}.$$

Then

$$X(\mathbb{R}^-) \subset \mathbb{R}^+, \quad \text{and} \quad Y^{\alpha}(\mathbb{R}^+) \subset \mathbb{R}^-, \quad \alpha = 1, 2.$$

Let S ∈ M. Using the relations $X^2 = Y^3 = 1$ and a suitable conjugation, we may assume
that either S = 1 is a consequence of these relations, or that

$$S = Y^{\alpha_1} X Y^{\alpha_2} \cdots X Y^{\alpha_n}$$

with $1 \leq \alpha_i \leq 2$ and $\alpha_1 = \alpha_n$.
In this second case, if $x \in \mathbb{R}^+$, then $S(x) \in \mathbb{R}^-$; hence, S ≠ 1.
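This type of ping-pong argument can be used in many examples (see [30], [20]
and [26]). As another example, consider the unimodular matrices

$$A = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & -1 \\ 1 & 2 \end{pmatrix}.$$

Let $\bar{A}, \bar{B}$ denote the corresponding linear fractional transformations in the modular
group M. We have

$$A^n = \begin{pmatrix} -n + 1 & n \\ -n & n + 1 \end{pmatrix}, \quad B^n = \begin{pmatrix} -n + 1 & -n \\ n & n + 1 \end{pmatrix} \quad \text{for } n \in \mathbb{Z}.$$

In particular, $\bar{A}$ and $\bar{B}$ have infinite order. Now

$$\bar{A}^n(\mathbb{R}^-) \subset \mathbb{R}^+ \quad \text{and} \quad \bar{B}^n(\mathbb{R}^+) \subset \mathbb{R}^-$$

for all n ≠ 0. The ping-pong argument used for any element of the type

$$S = \bar{A}^{n_1} \bar{B}^{m_1} \cdots \bar{B}^{m_k} \bar{A}^{n_{k+1}}$$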

with all $n_i, m_i \neq 0$ and $n_1 + n_{k+1} \neq 0$ shows that $S(x) \in \mathbb{R}^+$ if $x \in \mathbb{R}^-$. It follows that there
are no nontrivial relations on $\bar{A}$ and $\bar{B}$; therefore, the subgroup of M generated by $\bar{A}, \bar{B}$
must be a free group of rank 2.
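The closed forms for the powers of A and B and the two ping-pong inclusions can be spot-checked numerically. The Python sketch below is only a finite check on sample values, not a proof; it uses exact rational arithmetic, and the helper names `mat_pow` and `moebius` are hypothetical.

```python
from fractions import Fraction

A = ((0, 1), (-1, 2))
B = ((0, -1), (1, 2))
IDENTITY = ((1, 0), (0, 1))


def mat_mul(P, Q):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = P
    (e, f), (g, h) = Q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mat_pow(M, n):
    """Integer power of a determinant-1 matrix; negative n uses the inverse."""
    if n < 0:
        (a, b), (c, d) = M
        M, n = ((d, -b), (-c, a)), -n
    result = IDENTITY
    for _ in range(n):
        result = mat_mul(result, M)
    return result


def moebius(M, z):
    """Value of the linear fractional transformation z -> (az + b)/(cz + d)."""
    (a, b), (c, d) = M
    return Fraction(a * z + b, c * z + d)


negatives = [Fraction(-7), Fraction(-1), Fraction(-1, 3)]
positives = [Fraction(5), Fraction(1), Fraction(2, 9)]
exponents = [n for n in range(-4, 5) if n != 0]

# A^n should send negative reals to positive reals, B^n the other way round.
print(all(moebius(mat_pow(A, n), z) > 0 for n in exponents for z in negatives))
print(all(moebius(mat_pow(B, n), z) < 0 for n in exponents for z in positives))
```

Since the product of matrices corresponds to the composition of the transformations, the same helpers can be used to evaluate a word such as A^2 B^{-1} A^3 on a negative number and observe that the image is positive.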
To close this section, we present a significant number-theoretical application
of the modular group. First, we need the following corollary to Corollary 14.3.11:

Corollary 14.3.12. Let $M = \langle X, Y ; X^2 = Y^3 = 1 \rangle$ be the modular group. If A is an element
of order 2, then A is conjugate to X. If B is an element of order 3, then B is conjugate to
either Y or $Y^2$.

Definition 14.3.13. Let a, n be relatively prime integers with a ≠ 0, n ≥ 1. Then a is a
quadratic residue mod n if there exists an x ∈ ℤ with $x^2 \equiv a \bmod n$; that is, $a = x^2 + kn$
for some k ∈ ℤ.

The following is called Fermat’s two-square theorem.

Theorem 14.3.14 (Fermat’s two-square theorem). Let n > 0 be a natural number. Then
$n = a^2 + b^2$ with (a, b) = 1 if and only if −1 is a quadratic residue modulo n.

Proof. Suppose −1 is a quadratic residue mod n. Then there exists an x with
$x^2 \equiv -1 \bmod n$, or $x^2 = -1 - mn$ for some m ∈ ℤ. This implies that $-x^2 - mn = 1$ so that there must exist a
projective unimodular matrix

$$A = \pm \begin{pmatrix} x & n \\ m & -x \end{pmatrix}.$$

It is straightforward that $A^2 = 1$. Therefore, by Corollary 14.3.12, A is conjugate within
M to X. Now consider conjugates of X within M. Let $T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then

$$T^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$

and

$$T X T^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \pm \begin{pmatrix} -(bd + ac) & a^2 + b^2 \\ -(c^2 + d^2) & bd + ac \end{pmatrix}. \qquad (*)$$

Therefore, any conjugate of X must have the form (∗), and thus A also must have the
form (∗). Therefore, $n = a^2 + b^2$. Furthermore, (a, b) = 1 since in finding the form (∗),
we had ad − bc = 1.
Conversely suppose $n = a^2 + b^2$ with (a, b) = 1. Then there exist c, d ∈ ℤ with
ad − bc = 1; hence, there exists a projective unimodular matrix

$$T = \pm \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

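Then

$$T X T^{-1} = \pm \begin{pmatrix} \alpha & a^2 + b^2 \\ \gamma & -\alpha \end{pmatrix} = \pm \begin{pmatrix} \alpha & n \\ \gamma & -\alpha \end{pmatrix}.$$

This has determinant one, so

$$-\alpha^2 - n\gamma = 1 \implies \alpha^2 = -1 - n\gamma \implies \alpha^2 \equiv -1 \bmod n.$$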

Therefore, −1 is a quadratic residue mod n.
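The statement of the two-square theorem can be checked by brute force for small n. The following Python sketch is only a finite experiment with hypothetical helper names; it does not follow the group-theoretic proof, it simply compares both sides of the equivalence.

```python
from math import gcd, isqrt


def minus_one_is_residue(n):
    """True if x^2 = -1 (mod n) has a solution."""
    return any((x * x + 1) % n == 0 for x in range(n))


def coprime_two_square(n):
    """Return (a, b) with n = a^2 + b^2 and gcd(a, b) = 1, or None."""
    for a in range(isqrt(n) + 1):
        b_squared = n - a * a
        b = isqrt(b_squared)
        if b * b == b_squared and gcd(a, b) == 1:
            return (a, b)
    return None


# Both conditions should agree for every n in the tested range.
agree = all(
    minus_one_is_residue(n) == (coprime_two_square(n) is not None)
    for n in range(1, 201)
)
print(agree)
print(coprime_two_square(13))
```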

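This type of group theoretical proof can be extended in several directions. Kern-Isberner
and Rosenberger [28] considered groups of matrices of the form

$$U = \begin{pmatrix} a & b\sqrt{N} \\ c\sqrt{N} & d \end{pmatrix}, \quad a, b, c, d, N \in \mathbb{Z}, \quad ad - Nbc = 1,$$

or

$$U = \begin{pmatrix} a\sqrt{N} & b \\ c & d\sqrt{N} \end{pmatrix}, \quad a, b, c, d, N \in \mathbb{Z}, \quad Nad - bc = 1.$$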

They then proved that if

$$N \in \{1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 16, 18, 22, 25, 28, 37, 58\}$$

and n ∈ ℕ with (n, N) = 1, then the following hold:
(1) If −N is a quadratic residue mod n and n is a quadratic residue mod N, then n
can be written as $n = x^2 + Ny^2$ with x, y ∈ ℤ.
(2) Conversely, if $n = x^2 + Ny^2$ with x, y ∈ ℤ and (x, y) = 1, then −N is a quadratic
residue mod n, and n is a quadratic residue mod N.

The proof of the above results depends on the class number of $\mathbb{Q}(\sqrt{-N})$ (see [28]).
In another direction, Fine [39], [40] showed that the Fermat two-square property
is actually a property satisfied by many rings R. These are called sum of squares
rings. For example, if p ≡ 3 mod 4, then $\mathbb{Z}_{p^n}$ for n > 1 is a sum of squares ring.

14.4 Presentations of subgroups


Given a group presentation G = ⟨X; R⟩, it is possible to find a presentation for a sub-
group H of G. The procedure to do this is called the Reidemeister–Schreier process
and is a consequence of the explicit version of the Nielsen–Schreier theorem (Theo-
rem 14.2.9). We give a brief description. A complete description and a verification of
its correctness is found in [31], or in [20].
Let G be a group with the presentation ⟨a1 , . . . , an ; R1 , . . . , Rk ⟩. Let H be a subgroup
of G and T a Schreier system for G mod H, defined analogously as above.

Reidemeister–Schreier process
Let G, H and T be as above. Then H is generated by the set

$$\{ S_{t a_v} : t \in T, \ a_v \in \{a_1, \ldots, a_n\}, \ S_{t a_v} \neq 1 \}$$

with a complete set of defining relations given by conjugates of the original relators
rewritten in terms of the subgroup generating set.
To actually rewrite the relators in terms of the new generators, we use a mapping
τ on words on the generators of G called the Reidemeister rewriting process. This map
is defined as follows: If

$$W = a_{v_1}^{e_1} a_{v_2}^{e_2} \cdots a_{v_j}^{e_j} \quad \text{with } e_i = \pm 1$$

defines an element of H, then

$$\tau(W) = S_{t_1, a_{v_1}}^{e_1} S_{t_2, a_{v_2}}^{e_2} \cdots S_{t_j, a_{v_j}}^{e_j},$$

where $t_i$ is the coset representative of the initial segment of W preceding $a_{v_i}$ if $e_i = 1$,
and $t_i$ is the representative of the initial segment of W up to and including $a_{v_i}^{-1}$ if $e_i = -1$.
The complete set of relators rewritten in terms of the subgroup generators is then given
by

$$\{ \tau(t R_i t^{-1}) \} \quad \text{with } t \in T, \text{ and } R_i \text{ runs over all relators in } G.$$

We present two examples; one with a finite group, and then an important example
with a free group, which shows that a countable free group contains free subgroups
of arbitrary ranks.

Example 14.4.1. Let $G = A_4$ be the alternating group on 4 symbols. Then a presentation
for G is

$$G = A_4 = \langle a, b ; \ a^2 = b^3 = (ab)^3 = 1 \rangle.$$

Let $H = A_4'$ be the commutator subgroup. We use the above method to find a presentation
for H. Now

$$G/H = A_4 / A_4' = \langle a, b ; \ a^2 = b^3 = (ab)^3 = [a, b] = 1 \rangle = \langle b ; \ b^3 = 1 \rangle.$$

Therefore, $|A_4 : A_4'| = 3$. A Schreier system is then $\{1, b, b^2\}$. The generators for $A_4'$ are
then

$$X_1 = S_{1a} = a, \quad X_2 = S_{ba} = bab^{-1}, \quad X_3 = S_{b^2 a} = b^2 a b,$$

whereas the relations are the following:

1. $\tau(aa) = S_{1a} S_{1a} = X_1^2$
2. $\tau(baab^{-1}) = X_2^2$
3. $\tau(b^2 aab^{-2}) = X_3^2$
4. $\tau(bbb) = 1$
5. $\tau(bbbbb^{-1}) = 1$
6. $\tau(b^2 bbbb^{-2}) = 1$
7. $\tau(ababab) = S_{1a} S_{ba} S_{b^2 a} = X_1 X_2 X_3$
8. $\tau(babababb^{-1}) = S_{ba} S_{b^2 a} S_{1a} = X_2 X_3 X_1$
9. $\tau(b^2 abababb^{-2}) = S_{b^2 a} S_{1a} S_{ba} = X_3 X_1 X_2$.

Therefore, after eliminating redundant relations and using $X_3 = X_1 X_2$, we get as a
presentation for $A_4'$,

$$\langle X_1, X_2 ; \ X_1^2 = X_2^2 = (X_1 X_2)^2 = 1 \rangle.$$
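The rewriting in Example 14.4.1 can be reproduced with a small Python sketch. It relies on assumptions specific to this example: the coset of a word modulo the commutator subgroup is tracked by the exponent sum of b modulo 3, the transversal is {1, b, b^2}, and the symbols S_{t,b} are dropped because they are trivial in A4. The names `parse` and `tau`, and the convention that an uppercase letter denotes an inverse, are hypothetical.

```python
def parse(text):
    """Turn a string like 'babababB' into letters; uppercase means inverse."""
    return [(ch.lower(), 1 if ch.islower() else -1) for ch in text]


def tau(word):
    """Rewrite a word in a, b as a list of pairs (k, e) meaning S_{b^k, a}^e.

    The letter a lies in the subgroup, so it does not change the coset;
    the letter b moves the coset index by its exponent modulo 3.
    """
    coset = 0
    result = []
    for name, eps in word:
        if name == "b":
            coset = (coset + eps) % 3
        elif name == "a":
            result.append((coset, eps))
        else:
            raise ValueError("unknown generator: " + name)
    if coset != 0:
        raise ValueError("word does not define an element of the subgroup")
    return result


NAMES = {0: "X1", 1: "X2", 2: "X3"}


def show(symbols):
    """Render the rewritten word in terms of X1, X2, X3."""
    return " ".join(NAMES[k] + ("" if e == 1 else "^-1") for k, e in symbols)


print(show(tau(parse("ababab"))))
print(show(tau(parse("babababB"))))
print(show(tau(parse("bbabababBB"))))
```

Under these assumptions the three printed words correspond to relations 7, 8 and 9 of the list above.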

Example 14.4.2. Let $F = \langle x, y ; \ \rangle$ be the free group of rank 2. Let H be the commutator
subgroup. Then

$$F/H = \langle x, y ; \ [x, y] = 1 \rangle = \mathbb{Z} \times \mathbb{Z},$$

a free abelian group of rank 2. It follows that H has infinite index in F. As Schreier
coset representatives, we can take

$$t_{m,n} = x^m y^n, \quad m = 0, \pm 1, \pm 2, \ldots, \quad n = 0, \pm 1, \pm 2, \ldots.$$

The corresponding Schreier generators for H are

$$x_{m,n} = x^m y^n x^{-m} y^{-n}, \quad m = 0, \pm 1, \pm 2, \ldots, \quad n = 0, \pm 1, \pm 2, \ldots.$$

The relations are only trivial; therefore, H is free on the countably infinitely many
generators above. It follows that a free group of rank 2 contains as a subgroup a free group
of countably infinite rank. Since a free group of countably infinite rank contains as
subgroups free groups of all finite ranks, it follows that a free group of rank 2 contains
as a subgroup a free group of any finite rank.

Theorem 14.4.3. Let F be free of rank 2. Then the commutator subgroup $F'$ is free of
countably infinite rank. In particular, a free group of rank 2 contains as a subgroup a
free group of any finite rank n.

Corollary 14.4.4. Let n, m be any pair of positive integers n, m ≥ 2 and Fn , Fm free groups
of ranks n, m, respectively. Then Fn can be embedded into Fm , and Fm can be embedded
into Fn .

14.5 Geometric interpretation


Combinatorial group theory has its origins in topology and complex analysis. Espe-
cially important in the development is the theory of the fundamental group. This con-
nection is so deep that many people consider combinatorial group theory as the study
of the fundamental group—especially the fundamental group of a low-dimensional
complex. This connection proceeds in both directions. The fundamental group pro-
vides methods and insights to study the topology. In the other direction, the topology
can be used to study the groups.
Recall that if X is a topological space, then its fundamental group based at a point
$x_0$, denoted by $\pi(X, x_0)$, is the group of all homotopy classes of closed paths at $x_0$.
If X is path-connected, then the fundamental groups at different points are all isomorphic,
and we can speak of the fundamental group of X, which we will denote by π(X).
Historically, group presentations were developed to handle the fundamental groups
of spaces, which allowed simplicial or cellular decompositions. In these cases, the
presentation of the fundamental group can be read off from the combinatorial decom-
position of the space.
An (abstract) simplicial complex or cell complex K is a topological space consisting
of a set of points called the vertices, which we will denote by V(K), and collections of
subsets of vertices called simplices or cells, which have the property that the intersec-
tion of any two simplices is again a simplex. If n is the number of vertices in a cell, then
n − 1 is called its dimension. Hence, the vertices are the 0-dimensional cells, and
a simplex {v1 , . . . , vn } is an (n − 1)-dimensional cell. The 1-dimensional cells are called
edges. These have the form {u, v}, where u and v are vertices. One should think of the
cells in a geometric manner so that the edges are really edges, the 2-cells are filled tri-
angles (which are equivalent to disks), and so on. The maximum dimension of any cell
in a complex K is called the dimension of K. From now on, we will assume that our
simplicial complexes are path-connected.
A graph Γ is just a 1-dimensional simplicial complex. Hence, Γ consists of just ver-
tices and edges. If K is any complex, then the set of vertices and edges is called the
1-skeleton of K. Similarly, all the cells of dimension less than or equal to 2 comprise
the 2-skeleton. A connected graph with no closed paths in it is called a tree. If K is
any complex, then a maximal tree in K is a tree that is contained in no other tree
within K.
From the viewpoint of combinatorial group theory what is relevant is that if K is
a complex, then a presentation of its fundamental group can be determined from its
2-skeleton and read off directly. In particular the following hold:

Theorem 14.5.1. Suppose that K is a connected cell complex. Suppose that T is a max-
imal tree within the 1-skeleton of K. Then a presentation for π(K) can be determined in
the following manner:


Generators: all edges outside of the maximal tree T


Relations:
(a) {u, v} = 1 if {u, v} is an edge in T
(b) {u, v}{v, w} = {u, w} if u, v, w lie in a simplex of K.

From this the following is obvious:

Corollary 14.5.2. The fundamental group of a connected graph is free. Furthermore, its
rank is the number of edges outside a maximal tree.
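
As a small illustration of Corollary 14.5.2, the following Python sketch computes the rank of the fundamental group of a finite connected graph by growing a maximal (spanning) tree with a breadth-first search and counting the edges left outside it. The function name `free_rank` and the encoding of a graph as a vertex collection plus an edge list are assumptions made for this example only; they do not come from the text.

```python
from collections import deque


def free_rank(vertices, edges):
    """Number of edges outside a spanning tree of a finite connected graph.

    vertices: iterable of hashable vertex labels
    edges: list of pairs (u, v) of vertex labels
    """
    adjacency = {v: [] for v in vertices}
    for index, (u, v) in enumerate(edges):
        adjacency[u].append((v, index))
        adjacency[v].append((u, index))

    start = next(iter(adjacency))
    seen = {start}
    tree_edges = set()
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, index in adjacency[u]:
            if v not in seen:
                # The edge that first reaches v joins the spanning tree.
                seen.add(v)
                tree_edges.add(index)
                queue.append(v)

    if len(seen) != len(adjacency):
        raise ValueError("graph is not connected")
    return len(edges) - len(tree_edges)


# Boundary of a triangle: one edge lies outside any maximal tree.
triangle_rank = free_rank([0, 1, 2], [(0, 1), (1, 2), (0, 2)])

# Complete graph on 4 vertices: 6 edges, 3 of them in a maximal tree.
k4_edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
k4_rank = free_rank(range(4), k4_edges)
```

For a connected graph with V vertices and E edges, a maximal tree has V − 1 edges, so the value returned is E − V + 1.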

A connected graph is homotopy equivalent to a wedge or bouquet of circles. If there are n
circles in a bouquet of circles, then the fundamental group is free of rank n. The con-
verse is also true. A free group can be realized as the fundamental group of a wedge
of circles.
An important concept in applying combinatorial group theory is that of a covering
complex.

Definition 14.5.3. Suppose that K is a complex. Then a complex K1 is a covering com-
plex for K if there exists a surjection p : K1 → K, called a covering map, with the property
that for any cell s ∈ K the inverse image $p^{-1}(s)$ is a union of pairwise disjoint cells in
K1 , and p restricted to any of the preimage cells is a homeomorphism.
That is, for each simplex S in K, we have

$$p^{-1}(S) = \bigcup_i S_i,$$

and $p : S_i \to S$ is a bijection for each i.

The following then becomes clear:

Lemma 14.5.4. If K1 is a connected covering complex for K, then K1 and K have the same
dimension.

What is crucial in using covering complexes to study the fundamental group is
that there is a Galois theory of covering complexes and maps. The covering map p
induces a homomorphism of the fundamental group, which we will also call p. Then
we have the following:

Theorem 14.5.5. Let K1 be a covering complex of K with covering map p. Then p(π(K1 )) is
a subgroup of π(K). Conversely, to each subgroup H of π(K), there is a covering complex
K1 with π(K1 ) = H. Hence, there is a one-to-one correspondence between subgroups of
the fundamental group of a complex K and covers of K.

We will see the analog of this theorem in regard to algebraic field extensions in
Chapter 15.
A topological space X is simply connected if π(X) = {1}. Hence, the covering com-
plex of K corresponding to the trivial subgroup {1} of π(K) is simply connected. This is called the
universal cover of K since it covers any other cover of K.


Based on Theorem 14.5.1, we get a very simple proof of the Nielsen–Schreier the-
orem.

Theorem 14.5.6 (Nielsen–Schreier). Any subgroup of a free group is free.

Proof. Let F be a free group. Then F = π(K), where K is a connected graph. Let H be a
subgroup of F. Then H corresponds to a cover K1 of K. But a cover is also 1-dimensional;
hence, H = π(K1 ), where K1 is a connected graph. Therefore, H is also free.

The fact that a presentation of the fundamental group of a simplicial complex is
determined by its 2-skeleton goes in the other direction also. That is, given an arbi-
trary presentation, there exists a 2-dimensional complex whose fundamental group
has that presentation. Essentially, given a presentation ⟨X; R⟩, we consider a wedge of
|X| circles. We then paste on a 2-cell for each relator W in R, bounded
by the path corresponding to the word W.

Theorem 14.5.7. Given an arbitrary presentation ⟨X; R⟩, there exists a connected 2-com-
plex K with π(K) = ⟨X; R⟩.

We note that the books by Rotman [33] and Camps, Kühling and Rosenberger [21]
have significantly detailed and accessible descriptions of groups and complexes.
Cayley, and then Dehn, introduced for each group G a graph, now called the Cayley
graph, as a tool to apply complexes to the study of G. The Cayley graph is actually
tied to a presentation, and not to the group itself. Gromov reversed the procedure and
showed that by considering the geometry of the Cayley graph, one could get informa-
tion about the group. This led to the development of the theory of hyperbolic groups
(see for instance [20]).

Definition 14.5.8. Let G = ⟨X; R⟩ be a presentation. We form a graph Γ(G, X) in the
following way: Let $A = X \cup X^{-1}$. For the vertex set of Γ(G, X), we take the elements of G;
that is, V(Γ) = {g : g ∈ G}. The edges of Γ are given by the set {(g, x) : g ∈ G, x ∈ A}. We
call g the initial point and gx the terminal point of the edge (g, x). That is, two points g, g1 in the vertex
set are connected by an edge if g1 = gx for some x ∈ A. We have $(g, x)^{-1} = (gx, x^{-1})$. This
gives a directed graph called the Cayley graph C(X) of G on the generating set X.

Call x the label on the edge (g, x). Given g ∈ G, the element g is represented by at least
one word W in A. This represents a path in the Cayley graph. The length of the word W
is the length of the path. This is equivalent to making each edge have length one. If we
take the distance between 2 points as the minimum path length, we make the Cayley
graph a metric space. This metric is called the word metric. If we extend this metric to
all pairs of points in the Cayley graph in the obvious way (making each edge a unit
real interval), then the Cayley graph becomes a geodesic metric space with a metric d.
Each closed path in the Cayley graph represents a relator.
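
The word metric can be computed by a breadth-first search in the Cayley graph. The sketch below does this for the symmetric group S3, realized as permutations of {0, 1, 2}, with a generating set X consisting of one transposition and one 3-cycle. The choice of group, the tuple encoding of permutations, and the helper names are assumptions of this illustration, not part of the text.

```python
from collections import deque


def compose(p, q):
    """Product pq of permutations stored as tuples (apply q first, then p)."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def word_lengths(generators):
    """Distance of every group element from the identity in the word metric.

    Runs a breadth-first search over the Cayley graph whose edges are
    (g, x) with x in A = X together with the inverses of X.
    """
    identity = tuple(range(len(generators[0])))
    alphabet = set(generators) | {inverse(x) for x in generators}
    distance = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for x in alphabet:
            h = compose(g, x)  # the edge (g, x) ends at gx
            if h not in distance:
                distance[h] = distance[g] + 1
                queue.append(h)
    return distance


a = (1, 0, 2)  # a transposition
b = (1, 2, 0)  # a 3-cycle
lengths = word_lengths([a, b])
```

Since G acts on its Cayley graph by isometries via left multiplication, the distance between two arbitrary vertices g and h is the word length of $g^{-1}h$, so distances from the identity determine the whole metric on the vertex set.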


By left multiplication, the group G acts on the Cayley graph as a group of isome-
tries. Furthermore, the action of G on the Cayley graph is without inversion; that is,
$ge \neq e^{-1}$ if e is an edge.
If we sew in a 2-cell for each closed path in the Cayley graph, we get a simply
connected 2-complex called the Cayley complex. We now want to briefly describe the
concept of hyperbolic groups. Let G be a finitely generated group with a finite gener-
ating system X. G is called a hyperbolic group if the Cayley graph C(X) is a hyperbolic
space; that is, there exists a constant δ = δ(X) ≥ 0 such that d(u, [y, z] ∪ [z, x]) ≤ δ for
each geodesic triangle [x, y] ∪ [y, z] ∪ [z, x], and each u ∈ [x, y].

Figure 14.1: Geodesic triangle with vertices x, y, z, a point u on [x, y], and a point v on [y, z] ∪ [z, x].

That is, for each such u there exists a v ∈ [y, z] ∪ [z, x] with d(u, v) ≤ δ; see Figure 14.1.


This property is independent of the finite generating system X, and a hyper-
bolic group is finitely presented. Prominent examples of hyperbolic groups are finite
groups, free groups, oriented surface groups $\langle a_1, b_1, \ldots, a_g, b_g ; [a_1, b_1] \cdots [a_g, b_g] = 1 \rangle$
with g > 1, and the nonoriented surface groups $\langle a_1, \ldots, a_g ; a_1^2 \cdots a_g^2 = 1 \rangle$ with g > 2.
For a proof of these statements and more details on hyperbolic groups see [20].

14.6 Presentations of factor groups


Let G be a group with a presentation G = ⟨X; R⟩. Suppose that H is a factor group of G;
that is, H ≅ G/N for some normal subgroup N of G. We show that a presentation for H
is then H = ⟨X; R ∪ R1 ⟩, where R1 is a, perhaps additional, system of relators.

Theorem 14.6.1 (Dyck’s theorem). Let G = ⟨X; R⟩, and suppose that H ≅ G/N, where N
is a normal subgroup of G. Then a presentation for H is ⟨X; R ∪ R1 ⟩ for some set of words
R1 on X. Conversely, the presentation ⟨X; R ∪ R1 ⟩ defines a group that is a factor group
of G.

Proof. Since each element of H is a coset of N, it has the form gN for some g ∈ G. It is
clear then that the images of X generate H. Furthermore, since H is a homomorphic
image of G, each relator in R is a relator in H. Let N1 be a set of elements that generate N,
and let R1 be the corresponding words in the free group on X. Then R1 is an additional
set of relators in H. Hence, R ∪ R1 is a set of relators for H. Any relator in H is either a
relator in G, hence a consequence of R, or can be realized as an element of G that lies
in N, and therefore a consequence of R1 . Therefore, R ∪ R1 is a complete set of defining
relators for H, and H has the presentation H = ⟨X; R ∪ R1 ⟩.
Conversely, let G = ⟨X; R⟩ and G1 = ⟨X; R ∪ R1 ⟩. Then G = F(X)/N1 , where N1 = N(R),
and G1 = F(X)/N2 , where N2 = N(R ∪ R1 ). Hence, N1 ⊂ N2 . The normal subgroup
N2 /N1 of F(X)/N1 corresponds to a normal subgroup H of G, and therefore by the
isomorphism theorem

G/H ≅ (F(X)/N1 )/(N2 /N1 ) ≅ F(X)/N2 ≅ G1 .
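
To illustrate Dyck's theorem in a concrete case, the following sketch checks that two permutations generating S3 satisfy the relators a², b³ and (ab)². Since the generators satisfy a² and b³, the group S3 is a homomorphic image, hence a factor group, of ⟨a, b; a², b³⟩; the relator (ab)² is one of the additional relators that hold in this factor group. The encoding of words as strings (uppercase letters stand for inverses), the chosen permutations, and all function names are hypothetical choices for this example.

```python
def compose(p, q):
    """Product pq of permutations stored as tuples (apply q first, then p)."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def evaluate(word, images):
    """Evaluate a word in the generators; an uppercase letter means an inverse."""
    degree = len(next(iter(images.values())))
    result = tuple(range(degree))
    for letter in word:
        x = images[letter.lower()]
        if letter.isupper():
            x = inverse(x)
        result = compose(result, x)
    return result


def generated_group(images):
    """Closure of the identity under right multiplication by the generators."""
    degree = len(next(iter(images.values())))
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for x in images.values():
            h = compose(g, x)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


images = {"a": (1, 0, 2), "b": (1, 2, 0)}   # images of the generators in S3
relators = ["aa", "bbb", "abab"]            # a^2, b^3, (ab)^2
identity = (0, 1, 2)

relators_hold = [evaluate(r, images) == identity for r in relators]
order = len(generated_group(images))
```

The check only confirms that the relators hold in S3 and that the two permutations generate a group of order 6; it does not by itself prove that these three relators form a complete set of defining relators.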

14.7 Group presentations and decision problems


We have seen that given any group G, there exists a presentation for it, G = ⟨X; R⟩. In
the other direction, given any presentation ⟨X; R⟩, we have seen that there is a group
with that presentation. In principle, every question about a group can be answered
via a presentation. However, things are not that simple. Max Dehn in his pioneering
work on combinatorial group theory about 1910 introduced the following three fun-
damental group decision problems:
(1) Word Problem: Suppose G is a group given by a finite presentation. Is there an
algorithm to determine if an arbitrary word w in the generators of G defines the
identity element of G?
(2) Conjugacy Problem: Suppose G is a group given by a finite presentation. Is there
an algorithm to determine if an arbitrary pair of words u, v in the generators of G
define conjugate elements of G?
(3) Isomorphism Problem: Is there an algorithm to determine, given two arbitrary fi-
nite presentations, whether the groups they present are isomorphic or not?

All three of these problems have negative answers in general. That is, for each of these
problems one can find a finite presentation, for which these questions cannot be an-
swered algorithmically (see [30]). Attempts at solutions, and solutions in restricted
cases, have been of central importance in combinatorial group theory. For this reason
combinatorial group theory has always searched for and studied classes of groups, in
which these decision problems are solvable.
For finitely generated free groups, there are simple and elegant solutions to all
three problems. If F is a free group on x1 , . . . , xn and W is a freely reduced word in
x1 , . . . , xn , then W ≠ 1 if and only if L(W) ≥ 1 for the length of W. Since freely reducing
any word to a freely reduced word is algorithmic, this provides a solution to the word
problem. Furthermore, a freely reduced word $W = x_{v_1}^{e_1} x_{v_2}^{e_2} \cdots x_{v_n}^{e_n}$ is cyclically reduced
if $v_1 \neq v_n$, or if $v_1 = v_n$, then $e_1 \neq -e_n$. Clearly then, every element of a free group is
conjugate to an element given by a cyclically reduced word, called a cyclic reduction.
This leads to a solution to the conjugacy problem. Suppose V and W are two words
in the generators of F, and $\bar{V}$, $\bar{W}$ are respective cyclic reductions. Then V is conjugate
to W if and only if $\bar{V}$ is a cyclic permutation of $\bar{W}$. Finally, two finitely generated free
groups are isomorphic if and only if they have the same rank.
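
The solutions to the word and conjugacy problems in a free group described above can be sketched in a few lines of Python. In this illustration a word is a string in which a lowercase letter denotes a generator and the matching uppercase letter its inverse; this encoding and the function names are assumptions made for the example.

```python
def free_reduce(word):
    """Freely reduce a word by cancelling adjacent pairs x x^-1 and x^-1 x."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()          # the new letter cancels the previous one
        else:
            stack.append(letter)
    return "".join(stack)


def is_trivial(word):
    """Word problem: a word is the identity iff its free reduction is empty."""
    return free_reduce(word) == ""


def cyclic_reduce(word):
    """Return a cyclically reduced word conjugate to the given word."""
    w = free_reduce(word)
    while len(w) >= 2 and w[0] == w[-1].swapcase():
        w = w[1:-1]              # conjugate away the first and last letters
    return w


def are_conjugate(u, v):
    """Conjugacy problem: compare cyclic reductions up to cyclic permutation."""
    cu, cv = cyclic_reduce(u), cyclic_reduce(v)
    # cv is a cyclic permutation of cu iff the lengths agree and cv occurs in cu cu.
    return len(cu) == len(cv) and cv in cu + cu


reduced = free_reduce("abBAa")
cyclic = cyclic_reduce("Abba")
```

Here `free_reduce` uses a stack so that cancellations exposed by earlier cancellations are handled in a single pass over the word.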

14.8 Group amalgams: free products and direct products


Closely related to free groups in both form and properties are free products of groups.
Let A = ⟨a1 , . . . ; R1 , . . .⟩ and B = ⟨b1 , . . . ; S1 , . . .⟩ be two groups. We consider A and B to
be disjoint. Then we have the following:

Definition 14.8.1. The free product of A and B, denoted by A ∗ B, is the group G with
the presentation ⟨a1 , . . . , b1 , . . . ; R1 , . . . , S1 , . . .⟩; that is, the generators of G consist of the
disjoint union of the generators of A and B with relators taken as the disjoint union of
the relators Ri of A and Sj of B. A and B are called the factors of G.

In an analogous manner, the concept of a free product can be extended to an ar-
bitrary collection of groups.

Definition 14.8.2. If Aα = ⟨gens Aα ; rels Aα ⟩, α ∈ ℐ , is a collection of groups, then their
free product G = ∗Aα is the group whose generators consist of the disjoint union of
the generators of the Aα , and whose relators are the disjoint union of the relators of
the Aα .

Free products exist and are nontrivial. In that regard, we have the following:

Theorem 14.8.3. Let G = A ∗ B. Then the maps A → G and B → G are injections. The
subgroup of G generated by the generators of A has the presentation ⟨generators of A;
relators of A⟩, that is, is isomorphic to A. Similarly for B. Thus, A and B can be considered
as subgroups of G. In particular, A ∗ B is nontrivial if A and B are.

Free products share many properties with free groups. First of all there is a cate-
gorical formulation of free products. Specifically we have the following:

Theorem 14.8.4. A group G is the free product of its subgroups A and B if A and B gen-
erate G, and given homomorphisms f1 : A → H, f2 : B → H into a group H, there exists
a unique homomorphism f : G → H, extending f1 and f2 .

Secondly, each element of a free product has a normal form related to the reduced
words of free groups. If G = A ∗ B, then a reduced sequence or reduced word in G is a
sequence g1 g2 . . . gn , n ≥ 0, with gi ≠ 1, each gi in either A or B and gi , gi+1 not both in
the same factor. Then the following hold:

Theorem 14.8.5. Each element g ∈ G = A ∗ B has a unique representation as a reduced
sequence. The length n is unique and is called the syllable length. The case n = 0 is
reserved for the identity.
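
The passage from an arbitrary sequence of factor elements to its reduced sequence can be sketched as follows. As a hypothetical example we take A and B to be cyclic of orders 2 and 3, written additively, and encode a sequence as a list of pairs (factor name, element); these choices and the name `normal_form` are assumptions of the illustration.

```python
# Hypothetical example: A is cyclic of order 2 and B is cyclic of order 3,
# both written additively, so the identity of each factor is 0.
ORDERS = {"A": 2, "B": 3}


def normal_form(sequence):
    """Reduce a sequence of (factor, element) pairs in the free product A * B.

    Adjacent elements of the same factor are multiplied inside that factor,
    and identity elements are dropped, until the sequence is reduced.
    """
    reduced = []
    for factor, value in sequence:
        value %= ORDERS[factor]
        if value == 0:
            continue                      # drop identity elements
        if reduced and reduced[-1][0] == factor:
            merged = (reduced[-1][1] + value) % ORDERS[factor]
            reduced.pop()
            if merged != 0:
                reduced.append((factor, merged))
        else:
            reduced.append((factor, value))
    return reduced


# a b b^2 a b^2 collapses: b b^2 = 1, then a a = 1, leaving b^2.
example = normal_form([("A", 1), ("B", 1), ("B", 2), ("A", 1), ("B", 2)])
syllable_length = len(example)
```

Using a stack means that a cancellation which brings two elements of the same factor next to each other is handled immediately, so the result is always a reduced sequence.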


A reduced word g1 . . . gn ∈ G = A ∗ B is called cyclically reduced if either n ≤ 1 or
n ≥ 2 and g1 and gn are from different factors. Certainly, every element of G is conjugate
to a cyclically reduced word.
From this, we obtain several important properties of free products, which are anal-
ogous to properties in free groups.

Theorem 14.8.6. An element of finite order in a free product is conjugate to an element
of finite order in a factor. In particular a finite subgroup of a free product is entirely
contained in a conjugate of a factor.

Theorem 14.8.7. If two elements of a free product commute, then they are both powers
of a single element or are contained in a conjugate of an abelian subgroup of a fac-
tor.

Finally, a theorem of Kurosh extends the Nielsen–Schreier theorem to free prod-
ucts.

Theorem 14.8.8 (Kurosh). A subgroup of a free product is also a free product. Explicitly,
if G = A ∗ B and H ⊂ G, then

H = F ∗ (∗Aα ) ∗ (∗Bβ ),

where F is a free group, (∗Aα ) is a free product of conjugates of subgroups of A, and
(∗Bβ ) is a free product of conjugates of subgroups of B.

We note that the rank of F and the number of the other factors can be computed.
A complete discussion of these is in [31], [30] and [20].
If A and B are disjoint groups, then we now have two types of products form-
ing new groups out of them: the free product and the direct product. In both these
products, the original factors inject. In the free product, there are no relations be-
tween elements of A and elements of B, whereas in a direct product, each element
of A commutes with each element of B. If a ∈ A and b ∈ B, a cross commutator
is $[a, b] = aba^{-1}b^{-1}$. The direct product is a factor group of the free product, and
the kernel is precisely the normal subgroup generated by all the cross commuta-
tors.

Theorem 14.8.9. Suppose that A and B are disjoint groups. Then

A × B = (A ∗ B)/H,

where H is the normal closure in A ∗ B of all the cross commutators. In particular, a
presentation for A × B is given by

A × B = ⟨gens A, gens B; rels A, rels B, [a, b] for all a ∈ A, b ∈ B⟩.

This coincides with the concept in Section 10.3.


14.9 Exercises
1. Let $X^{-1}$ be a set disjoint from X, but bijective to X. A word in X is a finite sequence
of letters from the alphabet $X \cup X^{-1}$. That is, a word has the form

$$w = x_{i_1}^{\epsilon_{i_1}} x_{i_2}^{\epsilon_{i_2}} \cdots x_{i_n}^{\epsilon_{i_n}},$$

where $x_{i_j} \in X$, and $\epsilon_{i_j} = \pm 1$. Let W(X) be the set of all words on X.
If w1 , w2 ∈ W(X), we say that w1 is equivalent to w2 , denoted by w1 ∼ w2 , if w1 can
be converted to w2 by a finite string of insertions and deletions of trivial words.
Verify that this is an equivalence relation on W(X).
2. In F(X), let N(X) be the subgroup generated by all squares in F(X); that is,

$$N(X) = \langle \{ g^2 : g \in F(X) \} \rangle.$$

Show that N(X) is a normal subgroup, and that the factor group F(X)/N(X) is
abelian, where every nontrivial element has order 2.
3. Show that a free group F is torsion-free.
4. Let F be a free group, and a, b ∈ F. Show: If $a^k = b^k$, k ≠ 0, then a = b.
5. Let F = ⟨a, b; ⟩ be a free group with basis {a, b}. Let $c_i = a^{-i} b a^{i}$, i ∈ ℤ. Then G =
⟨ci , i ∈ ℤ⟩ is free with basis {ci | i ∈ ℤ}.
6. Show that $\langle x, y ; x^2 y^3, x^3 y^4 \rangle \cong \langle x ; x \rangle = \{1\}$.
7. Let $G = \langle v_1, \ldots, v_n ; v_1^2 \cdots v_n^2 \rangle$, n ≥ 1, and α : G → ℤ2 the epimorphism with
α(vi ) = −1 for all i. Let U be the kernel of α. Then U has a presentation
$U = \langle x_1, \ldots, x_{n-1}, y_1, \ldots, y_{n-1} ; y_1 x_1 \cdots y_{n-1} x_{n-1} y_{n-1}^{-1} x_{n-1}^{-1} \cdots y_1^{-1} x_1^{-1} \rangle$.
8. Let $M = \langle x, y ; x^2, y^3 \rangle \cong \mathrm{PSL}(2, \mathbb{Z})$ be the modular group. Let M′ be the commutator
subgroup. Show that M′ is a free group of rank 2 with a basis $\{[x, y], [x, y^2]\}$.

15 Finite Galois extensions
15.1 Galois theory and the solvability of polynomial equations

As we mentioned in Chapter 1, one of the origins of abstract algebra was the prob-
lem of trying to determine a formula for finding the solutions in terms of radicals of a
fifth degree polynomial. It was proved first by Ruffini in 1800 and then by Abel that,
in general, it is impossible to find a formula in terms of radicals for such a solution.
About 1830, Galois extended this and showed that such a formula is impossible for any
degree five or greater. In proving this, he laid the groundwork for much of the devel-
opment of modern abstract algebra, especially field theory and finite group theory.
One of the goals of this book has been to present a comprehensive treatment of Ga-
lois theory and a proof of the results mentioned above. At this point, we have covered
enough general algebra and group theory to discuss Galois extensions and general
Galois theory.
In modern terms, Galois theory is the branch of mathematics that deals with
the interplay of the algebraic theory of fields, the theory of equations, and finite group
theory. This theory was introduced by Évariste Galois about 1830 in his study of the
insolvability by radicals of quintic (degree 5) polynomials, a result proved somewhat
earlier by Ruffini, and independently by Abel. Galois was the first to see the close con-
nection between field extensions and permutation groups. In doing so, he initiated the
study of finite groups. He was the first to use the term group as an abstract concept,
although his definition was really just for a closed set of permutations.
The method Galois developed not only facilitated the proof of the insolvability
of the quintic and of polynomials of higher degree, but also led to other applications, and to a much larger
theory.
The main idea of Galois theory is to associate to certain special types of algebraic
field extensions called Galois extensions, a group called the Galois group. The prop-
erties of the field extension will be reflected in the properties of the group, which are
somewhat easier to examine. Thus, for example, solvability by radicals can be trans-
lated into solvability of groups, which was discussed in Chapter 12. Showing that for
every polynomial of degree five or greater, there exists a field extension whose Galois
group is not solvable proves that there cannot be a general formula for solvability by
radicals.
The tie-in to the theory of equations is as follows: If f (x) = 0 is a polynomial equa-
tion over some field K, we can form the splitting field L of f (x) over K. This is usually a Galois ex-
tension, and therefore has a Galois group called the Galois group of the equation. As
before, properties of this group will reflect properties of this equation.

https://doi.org/10.1515/9783110603996-015


15.2 Automorphism groups of field extensions


To define the Galois group, we must first consider the automorphism group of a field
extension. In this section, K, L, M will always be (commutative) fields with additive
identity 0 and multiplicative identity 1.

Definition 15.2.1. Let L|K be a field extension. Then the set

Aut(L|K) = {α ∈ Aut(L) : α|K = the identity on K}

is called the set of automorphisms of L over K. Notice that if α ∈ Aut(L|K), then α(k) = k
for all k ∈ K.

Lemma 15.2.2. Let L|K be a field extension. Then Aut(L|K) forms a group called the
Galois group of L|K.

Proof. Aut(L|K) ⊂ Aut(L). Hence, to show that Aut(L|K) is a group, we only have to
show that it is a subgroup of Aut(L). Now the identity map on L is certainly the identity
map on K, so 1 ∈ Aut(L|K); hence, Aut(L|K) is nonempty. If α, β ∈ Aut(L|K), then
consider $\alpha^{-1}\beta$. If k ∈ K, then β(k) = k, and α(k) = k, so $\alpha^{-1}(k) = k$. Therefore, $\alpha^{-1}\beta(k) = k$
for all k ∈ K, and hence $\alpha^{-1}\beta$ ∈ Aut(L|K). It follows that Aut(L|K) is a subgroup of
Aut(L), and therefore a group.

If f (x) ∈ K[x] \ K and L is the splitting field of f (x) over K, then Aut(L|K) is also
called the Galois group of f (x).

Theorem 15.2.3. If P is the prime field of L, then Aut(L|P) = Aut(L).

Proof. We must show that every automorphism of L restricts to the identity on the prime field P. If α ∈
Aut(L), then α(1) = 1, and so α(n ⋅ 1) = n ⋅ 1 for every integer n. Therefore, in P, α fixes all integer multiples
of the identity. However, every element of P can be written as a quotient $\frac{m \cdot 1}{n \cdot 1}$ of integer
multiples of the identity. Since α is a field homomorphism and α fixes both the top
and the bottom, it follows that α will fix every element of this form, and hence fix each
element of P.

For splitting fields, the Galois group is a permutation group on the zeros of the
defining polynomial.

Theorem 15.2.4. Let f (x) ∈ K[x] and L the splitting field of f (x) over K. Suppose that
f (x) has zeros α1 , . . . , αn ∈ L.
(a) Then each ϕ ∈ Aut(L|K) is a permutation on the zeros. In particular, Aut(L|K) is
isomorphic to a subgroup of Sn and uniquely determined by the zeros of f (x).
(b) If f (x) is irreducible, then Aut(L|K) operates transitively on {α1 , . . . , αn }. Hence, for
each i, j, there is a ϕ ∈ Aut(L|K) such that ϕ(αi ) = αj .
(c) If f (x) = b(x − α1 ) ⋅ ⋅ ⋅ (x − αn ) with α1 , . . . , αn pairwise distinct and Aut(L|K) operates
transitively on α1 , . . . , αn , then f (x) is irreducible.


Proof. For the proofs, we use the results of Chapter 8.
(a) Let ϕ ∈ Aut(L|K). Then, from Theorem 8.1.5, we obtain that ϕ permutes the zeros
α1 , . . . , αn . Hence, the restriction $\phi|_{\{\alpha_1, \ldots, \alpha_n\}}$ lies in Sn . This map then defines a homomorphism

$$\tau : \operatorname{Aut}(L|K) \to S_n \quad \text{by} \quad \tau(\phi) = \phi|_{\{\alpha_1, \ldots, \alpha_n\}}.$$

Furthermore, ϕ is uniquely determined by the images ϕ(αi ). It follows that τ is a
monomorphism.
(b) If f (x) is irreducible, then Aut(L|K) operates transitively on the set {α1 , . . . , αn },
again following from Theorem 8.1.5.
(c) Suppose that f (x) = b(x − α1 ) ⋅ ⋅ ⋅ (x − αn ) with α1 , . . . , αn distinct and that
Aut(L|K) operates transitively on α1 , . . . , αn . Now, assume that f (x) = g(x)h(x) with
g(x), h(x) ∈ K[x] \ K. Without loss of generality, let α1 be a zero of g(x) and αn be a zero
of h(x).
Let α ∈ Aut(L|K) with α(α1 ) = αn . However, α(g(x)) = g(x); that is, α(α1 ) is a zero
of α(g(x)) = g(x), which gives a contradiction since αn is not a zero of g(x). Therefore,
f (x) must be irreducible.
Example 15.2.5. Let $f(x) = (x^2 - 2)(x^2 - 3)$ ∈ ℚ[x]. The field L = ℚ(√2, √3) is the splitting
field of f (x).
Over L, we have
f (x) = (x + √2)(x − √2)(x + √3)(x − √3).

We want to determine the Galois group Aut(L|ℚ) = Aut(L) = G.


Lemma 15.2.6. The Galois group G above is the Klein 4-group.
Proof. First, we show that |Aut(L)| ≤ 4. Let α ∈ Aut(L). Then α is uniquely determined
by α(√2) and α(√3), and

$$\alpha(2) = 2 = (\sqrt{2})^2 = \alpha((\sqrt{2})^2) = (\alpha(\sqrt{2}))^2.$$

Hence, α(√2) = ±√2. Analogously, α(√3) = ±√3. From this it follows that |Aut(L)| ≤ 4.
Furthermore, $\alpha^2 = 1$ for any α ∈ G.
Next we show that the polynomial $f(x) = x^2 - 3$ is irreducible over K = ℚ(√2).
Assume that $x^2 - 3$ were reducible over K. Then √3 ∈ K. This implies that $\sqrt{3} = \frac{a}{b} + \frac{c}{d}\sqrt{2}$
with a, b, c, d ∈ ℤ and b ≠ 0 ≠ d, and gcd(c, d) = 1. Then $bd\sqrt{3} = ad + bc\sqrt{2}$, hence
$3b^2d^2 = a^2d^2 + 2b^2c^2 + 2\sqrt{2}\,abcd$. Since bd ≠ 0 and √2 is irrational, this implies that we must have ac = 0.
If c = 0, then $\sqrt{3} = \frac{a}{b}$ ∈ ℚ, a contradiction. If a = 0, then $\sqrt{3} = \frac{c}{d}\sqrt{2}$, which implies
$3d^2 = 2c^2$. It follows from this that 3 | gcd(c, d) = 1, again a contradiction. Hence $f(x) = x^2 - 3$
is irreducible over K = ℚ(√2).
Since L is the splitting field of f (x) over K and f (x) is irreducible over K, there exists
an automorphism α ∈ Aut(L) with α(√3) = −√3 and $\alpha|_K = I_K$; that is, α(√2) = √2.
Analogously, there is a β ∈ Aut(L) with β(√2) = −√2 and β(√3) = √3.
Clearly, α ≠ β, αβ = βα and α ≠ αβ ≠ β. It follows that Aut(L) = {1, α, β, αβ},
completing the proof.
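
The four automorphisms of Lemma 15.2.6 can be explored numerically. In the sketch below an element c0 + c1√2 + c2√3 + c3√6 of L = ℚ(√2, √3) is stored as the tuple of its rational coefficients, and α, β act by sign changes. The coefficient encoding, the function names, and the two sample elements are assumptions of this illustration; the checks are spot checks on sample elements, not a proof.

```python
from fractions import Fraction

# An element c0 + c1*sqrt(2) + c2*sqrt(3) + c3*sqrt(6) of Q(sqrt(2), sqrt(3))
# is stored as the coefficient tuple (c0, c1, c2, c3).


def mul(x, y):
    """Product in Q(sqrt(2), sqrt(3)), using sqrt(2)*sqrt(3) = sqrt(6)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0 * b0 + 2 * a1 * b1 + 3 * a2 * b2 + 6 * a3 * b3,
        a0 * b1 + a1 * b0 + 3 * (a2 * b3 + a3 * b2),
        a0 * b2 + a2 * b0 + 2 * (a1 * b3 + a3 * b1),
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,
    )


def identity(t):
    return t


def alpha(t):
    """sqrt(3) -> -sqrt(3), sqrt(2) fixed; hence sqrt(6) -> -sqrt(6)."""
    return (t[0], t[1], -t[2], -t[3])


def beta(t):
    """sqrt(2) -> -sqrt(2), sqrt(3) fixed; hence sqrt(6) -> -sqrt(6)."""
    return (t[0], -t[1], t[2], -t[3])


def alpha_beta(t):
    return alpha(beta(t))


GROUP = {"1": identity, "alpha": alpha, "beta": beta, "alpha*beta": alpha_beta}

x = (Fraction(1, 2), Fraction(3), Fraction(-2), Fraction(5, 7))
y = (Fraction(4), Fraction(-1, 3), Fraction(1), Fraction(2))

# Each map respects multiplication on the sample pair (x, y).
multiplicative = all(g(mul(x, y)) == mul(g(x), g(y)) for g in GROUP.values())
# Every element squares to the identity and alpha, beta commute,
# as expected for the Klein 4-group.
involutions = all(g(g(x)) == x for g in GROUP.values())
commute = alpha(beta(x)) == beta(alpha(x))
# The four maps are pairwise different: they send x to four different elements.
distinct = len({g(x) for g in GROUP.values()}) == 4
```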


15.3 Finite Galois extensions


We now define (finite) Galois extensions. First, we introduce the concept of a fix field.
Let K be a field and G a subgroup of Aut(K). Define the set

Fix(K, G) = {k ∈ K : g(k) = k ∀g ∈ G}.

Theorem 15.3.1. For a G ⊂ Aut(K), the set Fix(K, G) is a subfield of K called the fix field
of G over K.

Proof. 1 ∈ K is in Fix(K, G), so Fix(K, G) is not empty. Let k1 , k2 ∈ Fix(K, G), and let
g ∈ G. Then g(k1 ± k2 ) = g(k1 ) ± g(k2 ) since g is an automorphism. Then g(k1 ) ± g(k2 ) =
k1 ± k2 , and it follows that k1 ± k2 ∈ Fix(K, G). In an analogous manner, $k_1 k_2^{-1}$ ∈ Fix(K, G)
if k2 ≠ 0; therefore, Fix(K, G) is a subfield of K.

Using the concept of a fix field, we define a finite Galois extension.

Definition 15.3.2. L|K is a (finite) Galois extension if there exists a finite subgroup G ⊂
Aut(L) such that K = Fix(L, G).

We now give some examples of finite Galois extensions:

Lemma 15.3.3. Let L = ℚ(√2, √3) and K = ℚ. Then L|K is a Galois extension.

Proof. Let G = Aut(L|K). From the example in the previous section, there are automor-
phisms α, β ∈ G with

α(√3) = −√3, α(√2) = √2 and β(√2) = −√2, β(√3) = √3.

We have

ℚ(√2, √3) = {c + d√3 : c, d ∈ ℚ(√2)}.

Let t = a1 + b1 √2 + (a2 + b2 √2)√3 ∈ Fix(L, G), with a1 , b1 , a2 , b2 ∈ ℚ.
Then applying β, we have

t = β(t) = a1 − b1 √2 + (a2 − b2 √2)√3.

It follows that b1 + b2 √3 = 0; that is, b1 = b2 = 0 since √3 ∉ ℚ. Therefore, t = a1 + a2 √3.
Applying α, we have α(t) = a1 − a2 √3, and hence a2 = 0. Therefore, t = a1 ∈ ℚ. Hence
ℚ = Fix(L, G), and L|K is a Galois extension.
Lemma 15.3.4. Let $L = \mathbb{Q}(2^{1/4})$ and K = ℚ. Then L|K is not a Galois extension.

Proof. Suppose that α ∈ Aut(L) and $a = 2^{1/4}$. Then a is a zero of $x^4 - 2$, and hence
$\alpha(a) = 2^{1/4}$, or
$\alpha(a) = i\,2^{1/4}$ ∉ L since i ∉ L, or
$\alpha(a) = -2^{1/4}$, or
$\alpha(a) = -i\,2^{1/4}$ ∉ L since i ∉ L.

Since α(a) ∈ L, only α(a) = ±a is possible. In particular, $\alpha(\sqrt{2}) = \alpha(a^2) = \alpha(a)^2 = \sqrt{2}$; therefore,

Fix(L, Aut(L)) = ℚ(√2) ≠ ℚ.

15.4 The fundamental theorem of Galois theory


We now state the fundamental theorem of Galois theory. This theorem describes the
interplay between the Galois group and Galois extensions. In particular, the result ties
together subgroups of the Galois group and intermediate fields between L and K.

Theorem 15.4.1 (Fundamental theorem of Galois theory). Let L|K be a Galois exten-
sion with Galois group G = Aut(L|K). For each intermediate field E, let τ(E) be the
subgroup of G fixing E. Then the following hold:
(1) τ is a bijection between intermediate fields containing K and subgroups of G.
(2) L|K is a finite extension, and if M is an intermediate field, then

|L : M| = |Aut(L|M)|,
|M : K| = |Aut(L|K) : Aut(L|M)|.

(3) If M is an intermediate field, then the following hold:
(a) L|M is always a Galois extension.
(b) M|K is a Galois extension if and only if Aut(L|M) is a normal subgroup of Aut(L|K).
(4) If M is an intermediate field and M|K is a Galois extension, then we have the follow-
ing:
(a) α(M) = M for all α ∈ Aut(L|K),
(b) the map ϕ : Aut(L|K) → Aut(M|K) with ϕ(α) = α|M = β is an epimorphism,
(c) Aut(M|K) = Aut(L|K)/ Aut(L|M).
(5) The lattice of subfields of L containing K is the inverted lattice of subgroups of
Aut(L|K).

We will prove this main result via a series of theorems, and then combine them
all.

Theorem 15.4.2. Let G be a group, K a field, and α1 , . . . , αn pairwise distinct group ho-
momorphisms from G to $K^{*}$, the multiplicative group of K. Then α1 , . . . , αn are linearly
independent elements of the K-vector space of all maps from G to K.


Proof. The proof is by induction on n. If n = 1 and kα1 = 0 with k ∈ K, then 0 =
kα1 (1) = k ⋅ 1, and hence k = 0. Now suppose that n ≥ 2, and suppose that each n − 1 of
the α1 , . . . , αn are linearly independent over K. If

$$\sum_{i=1}^{n} k_i \alpha_i = 0, \quad k_i \in K, \tag{$*$}$$

then we must show that all ki = 0. Since α1 ≠ αn , there exists an a ∈ G with α1 (a) ≠
αn (a). Let g ∈ G and apply the sum above to ag. We get

$$\sum_{i=1}^{n} k_i\, \alpha_i(a)\, \alpha_i(g) = 0. \tag{$**$}$$

Now evaluate equation (∗) at g and multiply by αn (a) ∈ K to get

$$\sum_{i=1}^{n} k_i\, \alpha_n(a)\, \alpha_i(g) = 0. \tag{$***$}$$

If we subtract equation (∗∗∗) from equation (∗∗), then the last term vanishes and
we have an equation in the n − 1 homomorphisms α1 , . . . , αn−1 . Since these are linearly
independent, we obtain

$$k_1 \alpha_1(a) - k_1 \alpha_n(a) = 0$$

for the coefficient for α1 . Since α1 (a) ≠ αn (a), we must have k1 = 0. Now α2 , . . . , αn
are by assumption linearly independent, so k2 = ⋅ ⋅ ⋅ = kn = 0 also. Hence, all the
coefficients must be zero, and therefore the mappings are independent.

Theorem 15.4.3. Let α1 , . . . , αn be pairwise distinct monomorphisms from the field K into
the field K′. Let

L = {k ∈ K : α1 (k) = α2 (k) = ⋅ ⋅ ⋅ = αn (k)}.

Then L is a subfield of K with |K : L| ≥ n.

Proof. Certainly L is a field. Assume that r = |K : L| < n, and let {a1 , . . . , ar } be a basis
of the L-vector space K. We consider the following system of linear equations with r
equations and n unknowns:

\[
\begin{aligned}
\alpha_1(a_1) x_1 + \cdots + \alpha_n(a_1) x_n &= 0 \\
&\;\;\vdots \\
\alpha_1(a_r) x_1 + \cdots + \alpha_n(a_r) x_n &= 0.
\end{aligned}
\]

Since r < n, there exists a nontrivial solution (x1 , . . . , xn ) ∈ (K′)^n .


Let a ∈ K. Then

\[ a = \sum_{j=1}^{r} l_j a_j \quad \text{with } l_j \in L. \]

From the definition of L, we have

α1 (lj ) = αi (lj ) for i = 2, . . . , n.

Then with our nontrivial solution (x1 , . . . , xn ), we have

\[
\sum_{i=1}^{n} x_i \alpha_i(a)
= \sum_{i=1}^{n} x_i \Bigl( \sum_{j=1}^{r} \alpha_i(l_j) \alpha_i(a_j) \Bigr)
= \sum_{j=1}^{r} \alpha_1(l_j) \sum_{i=1}^{n} x_i \alpha_i(a_j) = 0,
\]

since α1 (lj ) = αi (lj ) for i = 2, . . . , n. This holds for all a ∈ K, and hence
x1 α1 + ⋅ ⋅ ⋅ + xn αn = 0, contradicting Theorem 15.4.2. Therefore, our assumption that
|K : L| < n must be false, and hence |K : L| ≥ n.

Definition 15.4.4. Let K be a field and G a finite subgroup of Aut(K). The map
trG : K → K, given by

\[ \operatorname{tr}_G(k) = \sum_{\alpha \in G} \alpha(k), \]

is called the G-trace of K.

Theorem 15.4.5. Let K be a field and G a finite subgroup of Aut(K). Then

{0} ≠ trG (K) ⊂ Fix(K, G).

Proof. Let β ∈ G. Then

\[ \beta(\operatorname{tr}_G(k)) = \sum_{\alpha \in G} \beta\alpha(k) = \sum_{\alpha \in G} \alpha(k) = \operatorname{tr}_G(k), \]

since βα runs through all of G as α does. Therefore, trG (K) ⊂ Fix(K, G).
Now assume that trG (k) = 0 for all k ∈ K. Then the sum of all α ∈ G is the zero map;
hence, the elements of G are linearly dependent as elements of the K-vector space of
all maps from K to K. This contradicts Theorem 15.4.2,
and hence the trace cannot be the zero map.
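
The G-trace can be illustrated with a small computation. The following is only a sketch under assumptions that are not part of the text: we take K = ℚ(√2), store an element a + b√2 as the pair (a, b), and let G consist of the identity and the automorphism √2 ↦ −√2. The function names are chosen for illustration.

```python
from fractions import Fraction

# Assumption for this sketch: an element a + b*sqrt(2) of Q(sqrt(2))
# is stored as the pair (a, b) of rational numbers.

def identity(x):
    return x

def conjugate(x):
    """The nontrivial automorphism sending sqrt(2) to -sqrt(2)."""
    a, b = x
    return (a, -b)

# G is a finite subgroup of Aut(K) with two elements.
G = [identity, conjugate]

def trace(x):
    """G-trace: the sum of alpha(x) over all alpha in G."""
    images = [alpha(x) for alpha in G]
    return (sum(a for a, _ in images), sum(b for _, b in images))

x = (Fraction(3, 2), Fraction(5))          # the element 3/2 + 5*sqrt(2)
t = trace(x)
print(t)                                   # (Fraction(3, 1), Fraction(0, 1))
print(all(alpha(t) == t for alpha in G))   # True: the trace lies in Fix(K, G)
```

In this sketch the trace of a + b√2 is 2a, which is nonzero for suitable a and always lies in the fixed field ℚ, in line with Theorem 15.4.5.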

Theorem 15.4.6. Let K be a field and G a finite subgroup of Aut(K). Then

|K : Fix(K, G)| = |G|.


Proof. Let L = Fix(K, G), and suppose that |G| = n. From Theorem 15.4.3, we know that
|K : L| ≥ n. We must show that |K : L| ≤ n.
Suppose that G = {α1 , . . . , αn }. To prove the result, we show that if m > n and
a1 , . . . , am ∈ K, then a1 , . . . , am are linearly dependent.
We consider the system of equations

\[
\begin{aligned}
\alpha_1^{-1}(a_1) x_1 + \cdots + \alpha_1^{-1}(a_m) x_m &= 0 \\
&\;\;\vdots \\
\alpha_n^{-1}(a_1) x_1 + \cdots + \alpha_n^{-1}(a_m) x_m &= 0.
\end{aligned}
\]

Since m > n, there exists a nontrivial solution (y1 , . . . , ym ) ∈ K^m . Suppose that yl ≠ 0.
Using Theorem 15.4.5, we can choose k ∈ K with trG (k) ≠ 0. Define

\[ (x_1, \ldots, x_m) = k y_l^{-1} (y_1, \ldots, y_m). \]

This m-tuple (x1 , . . . , xm ) is then also a nontrivial solution of the system of equations
considered above.
Then we have

trG (xl ) = trG (k) since xl = k.

Now we apply αi to the i-th equation to obtain

\[
\begin{aligned}
a_1 \alpha_1(x_1) + \cdots + a_m \alpha_1(x_m) &= 0 \\
&\;\;\vdots \\
a_1 \alpha_n(x_1) + \cdots + a_m \alpha_n(x_m) &= 0.
\end{aligned}
\]

Summation leads to

\[ 0 = \sum_{j=1}^{m} a_j \sum_{i=1}^{n} \alpha_i(x_j) = \sum_{j=1}^{m} \operatorname{tr}_G(x_j) a_j \]

by definition of the G-trace. Hence, a1 , . . . , am are linearly dependent over L, since
each trG (xj ) lies in L by Theorem 15.4.5 and trG (xl ) ≠ 0. Therefore, |K : L| ≤ n.
Combining this with Theorem 15.4.3, we get that
|K : L| = n = |G|.

Theorem 15.4.7. Let K be a field and G a finite subgroup of Aut(K). Then

Aut(K|Fix(K, G)) = G.

Proof. We have G ⊂ Aut(K|Fix(K, G)), since if g ∈ G, then g ∈ Aut(K), and g fixes
Fix(K, G) by definition. Therefore, we must show that Aut(K|Fix(K, G)) ⊂ G.


Assume then that there exists an α ∈ Aut(K| Fix(K, G)) with α ∉ G. Suppose, as in
the previous proof, |G| = n and G = {α1 , . . . , αn } with α1 = 1. Now

Fix(K, G) = {a ∈ K : a = α2 (a) = ⋅ ⋅ ⋅ = αn (a)}


= {a ∈ K : α(a) = a = α2 (a) = ⋅ ⋅ ⋅ = αn (a)}.

From Theorem 15.4.3, applied to the n + 1 pairwise distinct automorphisms α, α1 , . . . , αn ,
we have that |K : Fix(K, G)| ≥ n + 1. However, from Theorem 15.4.6,
|K : Fix(K, G)| = n, a contradiction.

Suppose that L|K is a Galois extension. We now establish that the map τ between
intermediate fields K ⊂ E ⊂ L and subgroups of Aut(L|K) is a bijection.

Theorem 15.4.8. Let L|K be a Galois extension. Then we have the following:
(1) Aut(L|K) is finite and

Fix(L, Aut(L|K)) = K.

(2) If H ⊂ Aut(L|K), then

Aut(L|Fix(L, H)) = H.

Proof. (1) If L|K is a Galois extension, there is a finite subgroup G of Aut(L) with K =
Fix(L, G). From Theorem 15.4.7, we have G = Aut(L|K). In particular, Aut(L|K) is finite,
and K = Fix(L, Aut(L|K)).
(2) Let H ⊂ Aut(L|K). From part (1), H is finite, and then Aut(L|Fix(L, H)) = H from
Theorem 15.4.7.

Theorem 15.4.9. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a Galois extension.
(2) |L : K| = |Aut(L|K)| < ∞.
(3) |Aut(L|K)| < ∞, and K = Fix(L, Aut(L|K)).

Proof. (1) ⇒ (2): Now, from Theorem 15.4.8, |Aut(L|K)| < ∞, and Fix(L, Aut(L|K)) = K.
Therefore, from Theorem 15.4.6, |L : K| = |Aut(L|K)|.
(2) ⇒ (3): Let G = Aut(L|K). Then K ⊂ Fix(L, G) ⊂ L. From Theorem 15.4.6, we have

|L : Fix(L, G)| = |G| = |L : K|.

Since K ⊂ Fix(L, G), the degree formula gives |Fix(L, G) : K| = 1, and hence
K = Fix(L, G).
(3) ⇒ (1) follows directly from the definition, completing the proof.

We now show that if L|K is a Galois extension, then L|M is also a Galois extension
for any intermediate field M.

Theorem 15.4.10. Let L|K be a Galois extension and K ⊂ M ⊂ L be an intermediate
field. Then L|M is always a Galois extension, and

|M : K| = |Aut(L|K) : Aut(L|M)|.


Proof. Let G = Aut(L|K). Then, from Theorem 15.4.9, |G| < ∞, and furthermore, K =
Fix(L, G). Define H = Aut(L|M) and M′ = Fix(L, H). We must show that M′ = M, for
then L|M is a Galois extension.
Since the elements of H fix M, we have M ⊂ M′. Let G = α1 H ∪ ⋅ ⋅ ⋅ ∪ αr H, a disjoint union
of the cosets of H. Let α1 = 1, and define βi = αi|M . The β1 , . . . , βr are pairwise distinct,
for if βi = βj , that is, αi|M = αj|M , then αj⁻¹αi ∈ H, so αi and αj are in the same coset.
We claim that

{a ∈ M : β1 (a) = ⋅ ⋅ ⋅ = βr (a)} = M ∩ Fix(L, G).

Moreover, from Theorem 15.4.9, we know that

M ∩ Fix(L, G) = M ∩ K = K.

To establish the claim, it is clear that

M ∩ Fix(L, G) ⊂ {a ∈ M : β1 (a) = ⋅ ⋅ ⋅ = βr (a)},

since

a = αi (a) = βi (a) for all i whenever a ∈ M ∩ Fix(L, G).

Hence, we must show that

{a ∈ M : β1 (a) = ⋅ ⋅ ⋅ = βr (a)} ⊂ M ∩ Fix(L, G).

To do this, we must show that α(b) = b for all α ∈ G and all b ∈ M with
β1 (b) = ⋅ ⋅ ⋅ = βr (b). We have α ∈ αi H for some i, and hence α = αi γ for some γ ∈ H.
Since γ fixes M and β1 is the identity on M, we obtain

α(b) = αi (γ(b)) = αi (b) = βi (b) = β1 (b) = b,

proving the inclusion and establishing the claim.


Now, from Theorem 15.4.3, |M : K| ≥ r. From the degree formula, we get

|L : M′||M′ : M||M : K| = |L : K| = |G| = |G : H||H| = r|L : M′|,

since |L : K| = |G| from Theorem 15.4.9 and |H| = |L : M′| from Theorem 15.4.6.
Therefore, |M′ : M||M : K| = r. Since |M : K| ≥ r, it follows that |M′ : M| = 1 and
|M : K| = r. Hence, M = M′. Now

|M : K| = r = |G : H| = |Aut(L|K) : Aut(L|M)|,

completing the proof.

Lemma 15.4.11. Let L|K be a field extension and K ⊂ M ⊂ L be an intermediate field. If
α ∈ Aut(L|K), then

Aut(L|α(M)) = α Aut(L|M) α⁻¹.


Proof. Now, β ∈ Aut(L|α(M)) if and only if β(α(a)) = α(a) for all a ∈ M. This occurs if
and only if α⁻¹βα(a) = a for all a ∈ M, which is true if and only if α⁻¹βα ∈ Aut(L|M),
that is, β ∈ α Aut(L|M) α⁻¹.

Lemma 15.4.12. Let L|K be a Galois extension and K ⊂ M ⊂ L be an intermediate field.


Suppose that α(M) = M for all α ∈ Aut(L|K). Then

ϕ : Aut(L|K) → Aut(M|K) by ϕ(α) = α|M

is an epimorphism with kernel ker(ϕ) = Aut(L|M).

Proof. It is clear that ϕ is a homomorphism with ker(ϕ) = Aut(L|M) (see exercises).


We must show that it is an epimorphism.
Let G = im(ϕ). Since L|K is a Galois extension, we get that

Fix(M, G) = Fix(L, Aut(L|K)) ∩ M = K ∩ M = K.

Then, from Theorem 15.4.8, we have

Aut(M|K) = Aut(M|Fix(M, G)) = G,

and therefore ϕ is an epimorphism.

Theorem 15.4.13. Let L|K be a Galois extension and K ⊂ M ⊂ L be an intermediate field.


Then the following are equivalent:
(1) M|K is a Galois extension.
(2) If α ∈ Aut(L|K), then α(M) = M.
(3) Aut(L|M) is a normal subgroup of Aut(L|K).

Proof. (1) ⇒ (2): Suppose that M|K is a Galois extension. Let Aut(M|K) = {α1 , . . . , αr }.
Consider the αi as monomorphisms from M into L. Let αr+1 : M → L be a monomor-
phism with αr+1|K = 1. Then

{a ∈ M : α1 (a) = α2 (a) = ⋅ ⋅ ⋅ = αr (a) = αr+1 (a)} = K,

since M|K is a Galois extension. Therefore, from Theorem 15.4.3, we have that if the
α1 , . . . , αr , αr+1 are distinct, then

|M : K| ≥ r + 1 > r = |Aut(M|K)| = |M : K|,

giving a contradiction. Hence, if αr+1 ∈ Aut(L|K) is arbitrary, then αr+1|M ∈ {α1 , . . . , αr };
that is, αr+1 (M) = M.
(2) ⇒ (1): Suppose that if α ∈ Aut(L|K), then α(M) = M. By Lemma 15.4.12, the map
ϕ : Aut(L|K) → Aut(M|K) with ϕ(α) = α|M is surjective. Since L|K is a Galois extension,
Aut(L|K) is finite. Therefore, H = Aut(M|K) is also finite. To prove (1) then, it is
sufficient to show that K = Fix(M, H).


We have K ⊂ Fix(M, H) from the definition of the fix field. Hence, we must show
that Fix(M, H) ⊂ K. Assume that there exists an a ∈ Fix(M, H) with a ∉ K. Recall that
L|K is a Galois extension, and therefore Fix(L, Aut(L|K)) = K. Hence, there exists an
α ∈ Aut(L|K) with α(a) ≠ a. Define β = α|M . Then β ∈ H, since α(M) = M by our original
assumption. Then β(a) ≠ a, contradicting a ∈ Fix(M, H). Therefore, K = Fix(M, H), and
M|K is a Galois extension.
(2) ⇒ (3): Suppose that if α ∈ Aut(L|K), then α(M) = M. That Aut(L|M) is a normal
subgroup of Aut(L|K) then follows from Lemma 15.4.12, since Aut(L|M) is the kernel of ϕ.
(3) ⇒ (2): Suppose that Aut(L|M) is a normal subgroup of Aut(L|K). Let α ∈
Aut(L|K), then from our assumption and Lemma 15.4.11, we get that

Aut(L|α(M)) = Aut(L|M).

Now L|M and L|α(M) are Galois extensions by Theorem 15.4.10. Therefore,

α(M) = Fix(L, Aut(L|α(M))) = Fix(L, Aut(L|M)) = M,

completing the proof.

We now combine all of these results to give the proof of Theorem 15.4.1, the fun-
damental theorem of Galois theory.

Proof of Theorem 15.4.1. Let L|K be a Galois extension.


(1) Let G ⊂ Aut(L|K). Both G and Aut(L|K) are finite from Theorem 15.4.8. Further-
more, G = Aut(L|Fix(L, G)) from Theorem 15.4.7.
Now let M be an intermediate field of L|K. Then L|M is a Galois extension from
Theorem 15.4.10, and then Fix(L, Aut(L|M)) = M from Theorem 15.4.8.
(2) Let M be an intermediate field of L|K. From Theorem 15.4.10, L|M is a Galois
extension. From Theorem 15.4.9, we have |L : M| = |Aut(L|M)|. Applying Theorem 15.4.10,
we get the result on indices

|M : K| = |Aut(L|K) : Aut(L|M)|.

(3) Let M be an intermediate field of L|K.


(a) From Theorem 15.4.10, we have that L|M is a Galois extension.
(b) From Theorem 15.4.13, M|K is a Galois extension if and only if

Aut(L|M) is a normal subgroup of Aut(L|K).

(4) Let M|K be a Galois extension.


(a) α(M) = M for all α ∈ Aut(L|K) from Theorem 15.4.13.
(b) That the map ϕ : Aut(L|K) → Aut(M|K) with ϕ(α) = α|M = β is an epimorphism
follows from Lemma 15.4.12 and Theorem 15.4.13.
(c) Aut(M|K) = Aut(L|K)/ Aut(L|M) follows directly from the group isomorphism
theorem.
(5) That the lattice of subfields of L containing K is the inverted lattice of subgroups
of Aut(L|K) follows directly from the previous results.


In Chapter 8, we looked at the following example (Example 8.1.7). Here, we analyze
it further using Galois theory.

Example 15.4.14. Let f (x) = x³ − 7 ∈ ℚ[x]. This has no zeros in ℚ, and since it is of
degree 3, it follows that it must be irreducible in ℚ[x].
Let ω = −1/2 + (√3/2)i ∈ ℂ. Then it is easy to show by computation that

ω² = −1/2 − (√3/2)i and ω³ = 1.

Therefore, the three zeros of f (x) in ℂ are

a1 = 7^(1/3), a2 = ω ⋅ 7^(1/3), a3 = ω² ⋅ 7^(1/3).

Hence, L = ℚ(a1 , a2 , a3 ) is the splitting field of f (x). Since the minimal polynomial
of all three zeros over ℚ is the same f (x), it follows that

ℚ(a1 ) ≅ ℚ(a2 ) ≅ ℚ(a3 ).

Since ℚ(a1 ) ⊂ ℝ and a2 , a3 are nonreal, it is clear that a2 , a3 ∉ ℚ(a1 ).
Suppose that ℚ(a2 ) = ℚ(a3 ). Then ω = a3 /a2 ∈ ℚ(a2 ), and so 7^(1/3) = ω⁻¹ ⋅ a2 ∈ ℚ(a2 ).
Hence, ℚ(a1 ) ⊂ ℚ(a2 ); therefore, ℚ(a1 ) = ℚ(a2 ) since they are the same degree over ℚ.
This contradiction shows that ℚ(a2 ) and ℚ(a3 ) are distinct.
By computation, we have a3 = (a2 )²/a1 , and hence

L = ℚ(a1 , a2 , a3 ) = ℚ(a1 , a2 ) = ℚ(7^(1/3), ω).

Now the degree of L over ℚ is

|L : ℚ| = |ℚ(7^(1/3), ω) : ℚ(ω)| ⋅ |ℚ(ω) : ℚ|.

Now |ℚ(ω) : ℚ| = 2, since the minimal polynomial of ω over ℚ is x² + x + 1.
Since no zero of f (x) lies in ℚ(ω), and the degree of f (x) is 3, it follows that f (x) is
irreducible over ℚ(ω). Therefore, we have that the degree of L over ℚ(ω) is 3. Hence,
|L : ℚ| = (2)(3) = 6.
Clearly then, we have the following lattice of intermediate fields:

[Figure: lattice of fields with ℚ at the bottom, L at the top, and ℚ(a1 ), ℚ(a2 ), ℚ(a3 ), ℚ(ω) in between.]


The question then arises as to whether these are all the intermediate fields. The
answer is yes, which we now prove.
Let G = Aut(L|ℚ) = Aut(L). (Aut(L|ℚ) = Aut(L), since ℚ is a prime field.) Each element
of G permutes the zeros a1 , a2 , a3 of f and is determined by this permutation, so G is
isomorphic to a subgroup of S3 . G acts transitively on {a1 , a2 , a3 }, since f is
irreducible. Let δ : ℂ → ℂ be the automorphism of ℂ taking each element to its complex
conjugate; that is, δ(z) = z̄.
Then δ(f ) = f , and δ|L ∈ G (see Theorem 8.2.2). Since a1 ∈ ℝ, we get that δ|{a1 ,a2 ,a3 } =
(a2 , a3 ), the 2-cycle that maps a2 to a3 and a3 to a2 . Since G is transitive on {a1 , a2 , a3 },
there is a τ ∈ G with τ(a1 ) = a2 .
Case 1: τ(a3 ) = a3 . Then τ = (a1 , a2 ), and (a1 , a2 )(a2 , a3 ) = (a1 , a2 , a3 ) ∈ G.
Case 2: τ(a3 ) ≠ a3 . Then τ is a 3-cycle.
In either case, G contains a transposition and a 3-cycle, which together generate S3 .
Hence, G is all of S3 . Then L|ℚ is a Galois extension from Theorem 15.4.9, since
|G| = |L : ℚ|.
The subgroups of S3 are as follows:

[Figure: lattice of subgroups of S3 , consisting of the trivial subgroup, the three subgroups of order 2 generated by the transpositions, the subgroup of order 3 generated by (a1 , a2 , a3 ), and S3 itself.]

Hence, the above lattice of fields is complete. L|ℚ, ℚ|ℚ, ℚ(ω)|ℚ and L|ℚ(ai ) are
Galois extensions, whereas ℚ(ai )|ℚ with i = 1, 2, 3 are not Galois extensions.
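
The computations in Example 15.4.14 can be checked numerically. The following is a minimal sketch, not part of the original example: it uses floating-point complex numbers, so equalities are tested only up to a tolerance, and it encodes permutations of the zeros as index tuples. The helper names `close` and `compose` are chosen for illustration.

```python
# Numerical check of the zeros of x^3 - 7 and of the relation a3 = a2^2 / a1.
omega = complex(-0.5, 3 ** 0.5 / 2)
r = 7 ** (1 / 3)
a1, a2, a3 = r, omega * r, omega ** 2 * r

def close(z, w, tol=1e-9):
    """Approximate equality for floating-point complex numbers."""
    return abs(z - w) < tol

print(all(close(a ** 3, 7) for a in (a1, a2, a3)))  # True
print(close(a3, a2 ** 2 / a1))                      # True

# Permutations of the indices 0, 1, 2 (standing for a1, a2, a3):
# p[i] is the index of the image of the i-th zero.
def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

delta = (0, 2, 1)  # complex conjugation: swaps a2 and a3
tau = (1, 0, 2)    # the transposition of Case 1: swaps a1 and a2

# Close {identity} under multiplication by the two generators.
group = {(0, 1, 2)}
frontier = [delta, tau]
while frontier:
    g = frontier.pop()
    if g not in group:
        group.add(g)
        frontier.extend(compose(g, h) for h in (delta, tau))

print(len(group))  # 6
```

The six permutations found agree with |G| = |L : ℚ| = 6. The numerical part only confirms the arithmetic; it does not replace the argument in the text.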

15.5 Exercises
1. Let K ⊂ M ⊂ L be a chain of fields, and let ϕ : Aut(L|K) → Aut(M|K) be defined by
ϕ(α) = α|M . Show that ϕ is an epimorphism with kernel ker(ϕ) = Aut(L|M).
2. Show that ℚ(5^(1/4))|ℚ(√5) and ℚ(√5)|ℚ are Galois extensions, and ℚ(5^(1/4))|ℚ is not a
Galois extension.
3. Let L|K be a field extension and u, v ∈ L algebraic over K with |K(u) : K| = m and
|K(v) : K| = n. Show that if m and n are coprime, then |K(u, v) : K| = n ⋅ m.
4. Let p, q be prime numbers with p ≠ q. Let L = ℚ(√p, q^(1/3)). Show that L = ℚ(√p ⋅ q^(1/3)).
Determine a basis of L over ℚ and the minimal polynomial of √p ⋅ q^(1/3).
5. Let K = ℚ(2^(1/n)) with n ≥ 2.
(i) Determine the number of ℚ-embeddings σ : K → ℝ. Show that for each such
embedding, we have σ(K) = K.
(ii) Determine Aut(K|ℚ).


6. Let α = √(5 + 2√5).
(i) Determine the minimal polynomial of α over ℚ.
(ii) Show that ℚ(α)|ℚ is a Galois extension.
(iii) Determine Aut(ℚ(α)|ℚ).
7. Let K be a field of prime characteristic p, and let f (x) = x^p − x + a ∈ K[x] be an
irreducible polynomial. Let L = K(v), where v is a zero of f (x). Show the following:
(i) If α is a zero of f (x), then so is α + 1.
(ii) L|K is a Galois extension.
(iii) There is exactly one K-automorphism σ of L with σ(v) = v + 1.
(iv) The Galois group Aut(L|K) is cyclic with generating element σ.

16 Separable field extensions
16.1 Separability of fields and polynomials
In the previous chapter, we introduced and examined Galois extensions. Recall that
L|K is a Galois extension if there exists a finite subgroup G ⊂ Aut(L) such that K =
Fix(L, G). The following questions logically arise:
(1) Under what conditions is a field extension L|K a Galois extension?
(2) When is L|K a Galois extension when L is the splitting field of a polynomial f (x) ∈
K[x]?

In this chapter, we consider these questions and completely characterize Galois ex-
tensions. To do this, we must introduce separable extensions.

Definition 16.1.1. Let K be a field. Then a nonconstant polynomial f (x) ∈ K[x] is called
separable over K if each irreducible factor of f (x) has only simple zeros in its splitting
field.

We now extend this definition to field extensions.

Definition 16.1.2. Let L|K be a field extension and a ∈ L. Then a is separable over K
if a is a zero of a separable polynomial. The field extension L|K is a separable field
extension, or just separable if all a ∈ L are separable over K. In particular, a separable
extension is an algebraic extension.

Finally, we consider fields, where every nonconstant polynomial is separable.

Definition 16.1.3. A field K is perfect if each nonconstant polynomial in K[x] is sepa-


rable over K.

The following is straightforward from the definitions: An element a is separable


over K if and only if its minimal polynomial ma (x) is separable.
If f (x) ∈ K[x], then f (x) = k0 + k1 x + ⋅ ⋅ ⋅ + kn x^n with ki ∈ K. The formal derivative
of f (x) is then f′(x) = k1 + 2k2 x + ⋅ ⋅ ⋅ + nkn x^(n−1). As in ordinary calculus, we have
the usual differentiation rules

(f (x) + g(x))′ = f′(x) + g′(x)

and

(f (x)g(x))′ = f′(x)g(x) + f (x)g′(x)

for f (x), g(x) ∈ K[x].

Lemma 16.1.4. Let K be a field and f (x) an irreducible nonconstant polynomial in K[x].
Then f (x) is separable if and only if its formal derivative is nonzero.

https://doi.org/10.1515/9783110603996-016


Proof. Let L be the splitting field of f (x) over K, and let a ∈ L be a zero of f (x).
Write f (x) = (x − a)^r g(x), where (x − a) does not divide g(x). Then

f′(x) = (x − a)^(r−1) (rg(x) + (x − a)g′(x)).

If f′(x) ≠ 0, then a is a zero of f (x) in L of multiplicity m ≥ 2 if and only if
(x − a)|f (x), and also (x − a)|f′(x).
Let f (x) be a separable polynomial over K, and let a be a zero of f (x) in L. Then
if f (x) = (x − a)^r g(x) with (x − a) not dividing g(x), we must have r = 1. Then

f′(x) = g(x) + (x − a)g′(x).

If g′(x) = 0, then f′(x) = g(x) ≠ 0. Now suppose that g′(x) ≠ 0. Assume that f′(x) = 0;
then, necessarily, (x − a)|g(x), giving a contradiction. Therefore, f′(x) ≠ 0.
Conversely, suppose that f′(x) ≠ 0. Assume that f (x) is not separable. Then both
f (x) and f′(x) have a common zero a ∈ L. Let ma (x) be the minimal polynomial of
a in K[x]. Then ma (x)|f (x), and ma (x)|f′(x). Since f (x) is irreducible, the degree
of ma (x) must equal the degree of f (x). But the degree of ma (x) is also at most the
degree of f′(x), which is less than that of f (x), giving a contradiction. Therefore,
f (x) must be separable.

We now consider the following example of a nonseparable polynomial over a field of
characteristic p. We will denote the finite field ℤp of p elements now as GF(p), the
Galois field of p elements.

Example 16.1.5. Let K = GF(p) and L = K(t), the field of rational functions in t over K.
Consider the polynomial f (x) = x^p − t ∈ L[x].
Now K[t]/tK[t] ≅ K. Since K is a field, this implies that tK[t] is a maximal ideal,
and hence a prime ideal in K[t] with prime element t ∈ K[t] (see Theorem 3.2.7). By
the Eisenstein criterion, f (x) is an irreducible polynomial in L[x] (see Theorem 4.4.8).
However, f′(x) = px^(p−1) = 0, since char(K) = p. Therefore, f (x) is not separable.

16.2 Perfect fields


We now consider when a field K is perfect. First, we show that any field
of characteristic 0 is perfect. In particular, the rationals ℚ are perfect, and hence any
algebraic extension of the rationals is separable.

Theorem 16.2.1. Each field K of characteristic zero is perfect.

Proof. Suppose that K is a field with char(K) = 0. Suppose that f (x) is a nonconstant
polynomial in K[x]. Then f′(x) ≠ 0. If f (x) is irreducible, then f (x) is separable from
Lemma 16.1.4. Therefore, by definition, each nonconstant polynomial f (x) ∈ K[x] is
separable.


We remark that in the original motivation for Galois theory, the ground field was
the rationals ℚ. Since this has characteristic zero, it is perfect and all algebraic
extensions are separable. Hence, the question of separability did not arise until
extensions of fields of prime characteristic were considered.

Corollary 16.2.2. Any finite extension of the rationals ℚ is separable.

We now consider the case of prime characteristic.

Theorem 16.2.3. Let K be a field with char(K) = p ≠ 0. If f (x) is a nonconstant
polynomial in K[x], then the following are equivalent:
(1) f′(x) = 0.
(2) f (x) is a polynomial in x^p; that is, there is a g(x) ∈ K[x] with f (x) = g(x^p).

If in (1) and (2) f (x) is irreducible, then f (x) is not separable over K if and only if f (x) is
a polynomial in x^p.

Proof. Let f (x) = a0 + a1 x + ⋅ ⋅ ⋅ + an x^n. Then f′(x) = 0 if and only if p|i for all i
with ai ≠ 0. But this is equivalent to

f (x) = a0 + ap x^p + ⋅ ⋅ ⋅ + amp x^(mp).

If f (x) is irreducible, then f (x) is not separable if and only if f′(x) = 0 from
Lemma 16.1.4.
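
The equivalence of (1) and (2) can be tried out on small examples. The following is a sketch under assumptions not made in the text: the ground field is the prime field GF(p), a polynomial is stored as the list of its coefficients (index i holds the coefficient of x^i), and both function names are hypothetical.

```python
def formal_derivative(coeffs, p):
    """Formal derivative over GF(p); coeffs[i] is the coefficient of x^i."""
    return [(i * c) % p for i, c in enumerate(coeffs)][1:]

def is_polynomial_in_x_to_p(coeffs, p):
    """True if every exponent with a nonzero coefficient is divisible by p."""
    return all(c % p == 0 for i, c in enumerate(coeffs) if i % p != 0)

p = 3
f = [2, 0, 0, 1, 0, 0, 2]  # 2 + x^3 + 2x^6 = g(x^3) with g(y) = 2 + y + 2y^2
h = [2, 1, 0, 1]           # 2 + x + x^3, not a polynomial in x^3

print(formal_derivative(f, p))  # [0, 0, 0, 0, 0, 0]
print(formal_derivative(h, p))  # [1, 0, 0]
print(is_polynomial_in_x_to_p(f, p), is_polynomial_in_x_to_p(h, p))  # True False
```

Here f has zero derivative because every exponent that occurs is a multiple of 3, while the linear term of h survives differentiation.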

Theorem 16.2.4. Let K be a field with char(K) = p ≠ 0. Then the following are equiva-
lent:
(1) K is perfect.
(2) Each element in K has a p-th root in K.
(3) The Frobenius homomorphism x ↦ x^p is an automorphism of K.

Proof. First we show that (1) implies (2). Suppose that K is perfect, and a ∈ K. Then
x^p − a is separable over K. Let g(x) ∈ K[x] be an irreducible factor of x^p − a. Let L be
the splitting field of g(x) over K, and b a zero of g(x) in L. Then b^p = a. Furthermore,
x^p − a = x^p − b^p = (x − b)^p ∈ L[x], since the characteristic of K is p. Hence,
g(x) = (x − b)^s, and then s must equal 1, since g(x) has only simple zeros by
separability. Therefore, b ∈ K, and b is a p-th root of a.
Now we show that (2) implies (3). Recall that the Frobenius homomorphism τ :
x ↦ x^p is injective (see Theorem 1.8.8). We must show that it is also surjective. Let
a ∈ K, and let b be a p-th root of a so that a = b^p. Then τ(b) = b^p = a, and τ is
surjective.
Finally, we show that (3) implies (1). Let τ : x ↦ x^p be surjective. It follows that
each a ∈ K has a p-th root in K. Now let f (x) ∈ K[x] be irreducible. Assume that f (x) is
not separable. From Theorem 16.2.3, there is a g(x) ∈ K[x] with f (x) = g(x^p); that is,

f (x) = a0 + a1 x^p + ⋅ ⋅ ⋅ + am x^(mp).


Let bi ∈ K with ai = (bi )^p. Then

f (x) = (b0 )^p + (b1 )^p x^p + ⋅ ⋅ ⋅ + (bm )^p x^(mp) = (b0 + b1 x + ⋅ ⋅ ⋅ + bm x^m)^p.

However, this is a contradiction since f (x) is irreducible. Therefore, f (x) is separable,
completing the proof.

Theorem 16.2.5. Let K be a field with char(K) = p ≠ 0. Then each element of K has at
most one p-th root in K.

Proof. Suppose that b1 , b2 ∈ K with (b1 )^p = (b2 )^p = a. Then

0 = (b1 )^p − (b2 )^p = (b1 − b2 )^p.

Since K has no zero divisors, it follows that b1 = b2 .

16.3 Finite fields


In this section, we consider finite fields. In particular, we show that if K is a finite field,
then |K| = p^m for some prime p and natural number m > 0. Moreover, we show that
if K1 , K2 are finite fields with |K1 | = |K2 |, then K1 ≅ K2 . Hence, there is a unique finite
field for each possible order.
Notice that if K is a finite field, then by necessity char K = p ≠ 0. We first show
that, in this case, K is always perfect.

Theorem 16.3.1. A finite field is perfect.

Proof. Let K be a finite field of characteristic p > 0. Then the Frobenius map τ : x ↦
x^p is surjective, since it is injective and K is finite. Therefore, K is perfect from
Theorem 16.2.4.

Next we show that each finite field has order p^m for some prime p and natural
number m > 0.

Lemma 16.3.2. Let K be a finite field. Then |K| = p^m for some prime p and natural
number m > 0.

Proof. Let K be a finite field with characteristic p > 0. Then K can be considered as a
vector space over its prime field GF(p), and hence of finite dimension m since |K| < ∞.
If α1 , . . . , αm is a basis, then each f ∈ K can be written uniquely as
f = c1 α1 + ⋅ ⋅ ⋅ + cm αm with each ci ∈ GF(p).
Hence, there are p choices for each ci , and therefore p^m elements f in K.

In Theorem 9.5.16, we proved that any finite subgroup of the multiplicative group
of a field is cyclic. If K is a finite field, then its multiplicative subgroup K ⋆ is finite, and
hence cyclic.


Lemma 16.3.3. Let K be a finite field. Then its multiplicative subgroup K ⋆ is cyclic.

If K is a finite field with order p^m, then its multiplicative subgroup K ⋆ has order
p^m − 1. Then, from Lagrange’s theorem, each nonzero element to the power p^m − 1 is the
identity. Therefore, we have the following result.

Lemma 16.3.4. Let K be a field of order p^m. Then each α ∈ K is a zero of the polynomial
x^(p^m) − x. In particular, if α ≠ 0, then α is a zero of x^(p^m − 1) − 1.

If K is a finite field of order p^m, it is a finite extension of GF(p). Since the
multiplicative group is cyclic, we must have K = GF(p)(α) for some α ∈ K. From this, we obtain
that for a given possible finite order, there is only one finite field up to isomorphism.

Theorem 16.3.5. Let K1 , K2 be finite fields with |K1 | = |K2 |. Then K1 ≅ K2 .

Proof. Let |K1 | = |K2 | = p^m. From the remarks above, K1 = GF(p)(α), where α has order
p^m − 1 in K1⋆ . Similarly, K2 = GF(p)(β), where β also has order p^m − 1 in K2⋆ . By
Lemma 16.3.4, both K1 and K2 consist exactly of the zeros of x^(p^m) − x, so both are
splitting fields of this polynomial over GF(p). Since splitting fields are unique up to
isomorphism, GF(p)(α) ≅ GF(p)(β), and therefore K1 ≅ K2 .

In Lemma 16.3.2, we saw that if K is a finite field, then |K| = p^n for some prime p
and positive integer n. We now show that given a prime power p^n, there does exist a
finite field of that order.

Theorem 16.3.6. Let p be a prime and n > 0 a natural number. Then there exists a field
K of order p^n.

Proof. Given a prime p, consider the polynomial g(x) = x^(p^n) − x ∈ GF(p)[x]. Let K be
the splitting field of this polynomial over GF(p). Since a finite field is perfect, K is a
separable extension; moreover, g′(x) = −1 ≠ 0, so g(x) and g′(x) have no common zero.
Hence, all the zeros of g(x) are distinct in K.
Let F be the set of p^n distinct zeros of g(x) within K. Let a, b ∈ F. Since

(a ± b)^(p^n) = a^(p^n) ± b^(p^n) and (ab)^(p^n) = a^(p^n) b^(p^n),

it follows that F forms a subfield of K. However, F contains all the zeros of g(x), and
since K is the smallest extension of GF(p) containing all the zeros of g(x), we must
have K = F. Since F has p^n elements, it follows that the order of K is p^n.
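
For a concrete instance of these results, one can build the field of order 9 by hand. The following sketch makes an assumption that does not appear in the text: it realizes the field as GF(3)[x]/(x² + 1), which is possible because −1 is not a square modulo 3, and stores the element a + bi (with i² = −1) as the pair (a, b). All function names are illustrative.

```python
p = 3

# Assumption: GF(9) is modelled as GF(3)[x]/(x^2 + 1); the pair (a, b)
# stands for a + b*i, where i is the class of x, so i^2 = -1.
elements = [(a, b) for a in range(p) for b in range(p)]

def mul(u, v):
    """Multiply (a + b*i)(c + d*i) and reduce the coefficients mod p."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def power(u, n):
    """Compute u^n by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

# Lemma 16.3.4: every element is a zero of x^9 - x.
print(all(power(u, 9) == u for u in elements))  # True

# Theorem 16.3.1: the Frobenius map x -> x^3 is a bijection.
frobenius_images = {power(u, p) for u in elements}
print(len(frobenius_images))  # 9

def order(u):
    """Multiplicative order of a nonzero element u."""
    n, v = 1, u
    while v != (1, 0):
        v = mul(v, u)
        n += 1
    return n

# Lemma 16.3.3: the multiplicative group of order 8 is cyclic.
generators = [u for u in elements if u != (0, 0) and order(u) == 8]
print(len(generators))  # 4
```

The four generators found correspond to the four elements of order 8 in a cyclic group of order 8; for example, 1 + i is one of them.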

Combining Theorems 16.3.5 and 16.3.6, we get the following summary result,
indicating that up to isomorphism there exists one and only one finite field of order p^n.

Theorem 16.3.7. Let p be a prime and n > 0 a natural number. Then up to isomorphism,
there exists a unique finite field of order p^n.

16.4 Separable extensions


In this section, we consider some properties of separable extensions.


Theorem 16.4.1. Let K be a field with K ⊂ L and L algebraically closed, and let a be
algebraic over K. Let α : K → L be a monomorphism. Then the number of monomorphisms
β : K(a) → L with β|K = α is equal to the number of pairwise distinct zeros in L of the
minimal polynomial ma of a over K.

Proof. Let β be as in the statement of the theorem. Then β is uniquely determined by
β(a), and β(a) is a zero of the polynomial β(ma (x)) = α(ma (x)). Now let a′ be a zero
of α(ma (x)) in L. Then there exists a β : K(a) → L with β(a) = a′ from Theorem 7.1.4.
Therefore, α has exactly as many extensions β as α(ma (x)) has pairwise distinct
zeros in L. The number of pairwise distinct zeros of α(ma (x)) is equal to the number of
pairwise distinct zeros of ma (x). This can be seen as follows: Let L0 be a splitting field
of ma (x) and L1 ⊂ L a splitting field of α(ma (x)). From Theorems 8.1.5 and 8.1.6, there
is an isomorphism ψ : L0 → L1 , which maps the zeros of ma (x) onto the zeros of
α(ma (x)).

Lemma 16.4.2. Let L|K be a finite extension with L ⊂ L̄, and L̄ algebraically closed. In
particular, L = K(a1 , . . . , an ), where the ai are algebraic over K. Let pi be the number of
pairwise distinct zeros in L̄ of the minimal polynomial mai of ai over K(a1 , . . . , ai−1 ). Then
there are exactly p1 ⋅ ⋅ ⋅ pn monomorphisms β : L → L̄ with β|K = 1K .

Proof. From Theorem 16.4.1, there are exactly p1 monomorphisms α : K(a1 ) → L̄ with
α|K equal to the identity on K. Each such α has exactly p2 extensions to K(a1 , a2 ).
We now continue in this manner.

Theorem 16.4.3. Let L|K be a field extension with M an intermediate field. If a ∈ L is


separable over K, then it is also separable over M.

Proof. This follows directly from the fact that the minimal polynomial of a over M
divides the minimal polynomial of a over K.

Theorem 16.4.4. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is finite and separable.
(2) There are finitely many separable elements a1 , . . . , an over K with L = K(a1 , . . . , an ).
(3) L|K is finite, and if L ⊂ L̄ with L̄ algebraically closed, then there are exactly [L : K]
monomorphisms α : L → L̄ with α|K = 1K .

Proof. That (1) implies (2) follows directly from the definitions. We show then that (2)
implies (3). Let L = K(a1 , . . . , an ), where a1 , . . . , an are separable elements over K. The
extension L|K is finite (see Theorem 5.3.4). Let pi be the number of pairwise distinct
zeros in L̄ of the minimal polynomial mai (x) = fi (x) of ai over K(a1 , . . . , ai−1 ). Then pi ≤
deg(fi ) = |K(a1 , . . . , ai ) : K(a1 , . . . , ai−1 )|. Hence, pi = deg(fi (x)), since ai is separable
over K(a1 , . . . , ai−1 ) from Theorem 16.4.3. Therefore, by Lemma 16.4.2,

[L : K] = p1 ⋅ ⋅ ⋅ pn

is equal to the number of monomorphisms α : L → L̄ with α|K the identity on K.


Finally, we show that (3) implies (1). Suppose then the conditions of (3). Since L|K
is finite, there are finitely many a1 , . . . , an ∈ L with L = K(a1 , . . . , an ). Let pi and fi (x) be
as in the proof above, and hence pi ≤ deg(fi (x)). By assumption we have

[L : K] = p1 ⋅ ⋅ ⋅ pn

equal to the number of monomorphisms α : L → L̄ with α|K the identity on K. Also

[L : K] = p1 ⋅ ⋅ ⋅ pn ≤ deg(f1 (x)) ⋅ ⋅ ⋅ deg(fn (x)) = [L : K].

Hence, pi = deg(fi (x)). Therefore, by definition, each ai is separable over K.


To complete the proof, we must show that L|K is separable. Inductively, it suffices
to prove that K(a1 )|K is separable whenever a1 is separable over K, and not in K.
This is clear if char(K) = 0, because K is perfect. Suppose then that char(K) =
p > 0. First, we show that K((a1 )^p) = K(a1 ). Certainly, K((a1 )^p) ⊂ K(a1 ). Assume that
a1 ∉ K((a1 )^p). Then g(x) = x^p − (a1 )^p is the minimal polynomial of a1 over K((a1 )^p).
This follows from the fact that x^p − (a1 )^p = (x − a1 )^p, and hence there can be no
irreducible factor of x^p − (a1 )^p of the form (x − a1 )^m with m < p and m|p.
However, it follows then, in this case, that g′(x) = 0, so a1 is not separable over
K((a1 )^p), contradicting the separability of a1 over K by Theorem 16.4.3. Therefore,
K(a1 ) = K((a1 )^p).
Let E = K(a1 ), then also E = K(E^p), where E^p is the field generated by the p-th
powers of E. Now let b ∈ E = K(a1 ). We must show that the minimal polynomial of b,
say mb (x), is separable over K.
Assume that mb (x) is not separable over K. Then

\[ m_b(x) = \sum_{i=0}^{k} b_i x^{pi}, \quad b_i \in K, \; b_k = 1, \]

from Theorem 16.2.3. We have

b0 + b1 b^p + ⋅ ⋅ ⋅ + bk b^(pk) = 0.

Therefore, the elements 1, b^p , . . . , b^(pk) are linearly dependent over K. Since K(a1 ) = E =
K(E^p), we find that 1, b, . . . , b^k are linearly dependent also, since if they were
independent the p-th powers would also be independent. However, this is not possible, since
k < deg(mb (x)). Therefore, mb (x) is separable over K, and hence K(a1 )|K is separable.
Altogether L|K is then finite and separable, completing the proof.
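
As a small numerical illustration of condition (3) of Theorem 16.4.4, the following sketch counts the embeddings of ℚ(∛2) into ℂ. The choice K = ℚ, L = ℚ(∛2), and all function names are assumptions made here for illustration; the check uses floating point arithmetic and is therefore only approximate.

```python
import cmath

def cube_roots_of_two():
    """Return the three complex zeros of x^3 - 2."""
    r = 2 ** (1 / 3)
    return [r * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def count_distinct(values, tol=1e-9):
    """Count values that are pairwise further apart than tol."""
    distinct = []
    for v in values:
        if all(abs(v - w) > tol for w in distinct):
            distinct.append(v)
    return len(distinct)

zeros = cube_roots_of_two()
# A monomorphism Q(2^(1/3)) -> C fixing Q is determined by the image of
# 2^(1/3), which must be a zero of the minimal polynomial x^3 - 2.
num_embeddings = count_distinct(zeros)
degree = 3  # [Q(2^(1/3)) : Q] = deg(x^3 - 2)
print(num_embeddings == degree)
```

The number of distinct zeros equals the degree, as the theorem predicts for a separable extension.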

Theorem 16.4.5. Let L|K be a field extension, and let M be an intermediate field. Then
the following are equivalent:
(1) L|K is separable.
(2) L|M and M|K are separable.


Proof. We first show that (1) ⇒ (2): If L|K is separable then L|M is separable by Theo-
rem 16.4.3, and M|K is separable.
Now suppose (2), and let $M|K$ and $L|M$ be separable. Let $a \in L$, and let

$$m_a(x) = f(x) = b_0 + \cdots + b_{n-1} x^{n-1} + x^n$$

be the minimal polynomial of $a$ over $M$. Then $f(x)$ is separable. Let

$$M' = K(b_0, \dots, b_{n-1}).$$

We have $K \subset M' \subset M$, and hence $M'|K$ is separable, since $M|K$ is separable. Furthermore,
$a$ is separable over $M'$, since $f(x)$ is separable, and $f(x) \in M'[x]$. From Theorem 16.4.1,
each monomorphism $\alpha : M' \to \bar{M}$ has $m = \deg(f(x)) = [M'(a) : M']$ extensions to $M'(a)$, where $\bar{M}$
is the algebraic closure of $M'$.
Since $M'|K$ is separable and finite, there are $[M' : K]$ monomorphisms $\alpha : M' \to \bar{M}$
with $\alpha|_K$ the identity on $K$, from Theorem 16.4.4. Altogether, there are
$[M'(a) : M'][M' : K] = [M'(a) : K]$ monomorphisms $\alpha : M'(a) \to \bar{M}$
with $\alpha|_K$ the identity on $K$. Therefore, $M'(a)|K$ is separable from Theorem 16.4.4.
Hence, $a$ is separable over $K$, and then $L|K$ is separable. Therefore, (2) implies (1).

Theorem 16.4.6. Let L|K be a field extension, and let S ⊂ L such that all elements of S
are separable over K. Then K(S)|K is separable, and K[S] = K(S).

Proof. Let W be the set of finite subsets of S. Let T ∈ W. From Theorem 16.4.4, we
obtain that K(T)|K is separable. Since each element of K(S) is contained in some K(T),
we have that K(S)|K is separable. Since all elements of S are algebraic, we have that
K[S] = K(S).

Theorem 16.4.7. Let L|K be a field extension. Then there exists in L a uniquely deter-
mined maximal field M with the property that M|K is separable. If a ∈ L is separable
over M, then a ∈ M. M is called the separable hull of K in L.

Proof. Let S be the set of all elements in L, which are separable over K. Define M =
K(S). Then M|K is separable from Theorem 16.4.6. Now, let a ∈ L be separable over M.
Then M(a)|M is separable from Theorem 16.4.4. Furthermore, M(a)|K is separable from
Theorem 16.4.5. It follows that a ∈ M.

16.5 Separability and Galois extensions


We now completely characterize Galois extensions L|K as finite, normal, separable
extensions.

Theorem 16.5.1. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a Galois extension.
(2) L is the splitting field of a separable polynomial in K[x].
(3) L|K is finite, normal, and separable.


Therefore, we may characterize Galois extensions of a field K as finite, normal, and sep-
arable extensions of K.

Proof. Recall from Theorem 8.2.2 that an extension L|K is normal if the following hold:
(1) L|K is algebraic, and
(2) each irreducible polynomial f (x) ∈ K[x] that has a zero in L splits into linear fac-
tors in L[x].

Now suppose that L|K is a Galois extension. Then L|K is finite from Theorem 15.4.1.
Let L = K(b1 , . . . , bm ) and mbi (x) = fi (x) be the minimal polynomial of bi over K. Let
ai1 , . . . , ain be the pairwise distinct elements from

Hi = {α(bi ) : α ∈ Aut(L|K)}.

Define

gi (x) = (x − ai1 ) ⋅ ⋅ ⋅ (x − ain ) ∈ L[x].

If α ∈ Aut(L|K), then α(gi ) = gi , since α permutes the elements of Hi . This means that
the coefficients of gi (x) are in Fix(L, Aut(L|K)) = K; that is, gi (x) ∈ K[x]. Furthermore,
fi (x)|gi (x), because bi is one of the aij . The group Aut(L|K) acts transitively on {ai1 , . . . , ain }
by the choice of ai1 , . . . , ain . Therefore, each gi (x) is irreducible (see Theorem 15.2.4).
It follows that fi (x) = gi (x). Now, fi (x) splits over L and has only simple zeros in L; that is,
no zero has multiplicity ≥ 2. Therefore, L is a splitting field of f (x) =
f1 (x) ⋅ ⋅ ⋅ fm (x), and f (x) is separable by definition. Hence, (1) implies (2).
Now suppose that L is a splitting field of the separable polynomial f (x) ∈ K[x]. Then
L|K is finite. From Theorem 16.4.4, we get that L|K is separable, since L = K(a1 , . . . , an )
with each ai separable over K. Moreover, L|K is normal from Definition 8.2.1. Hence,
(2) implies (3).
Finally, suppose that L|K is finite, normal, and separable. Since L|K is finite and
separable, Theorem 16.4.4 shows that there exist exactly [L : K] monomorphisms α : L → L̄,
where L̄ is the algebraic closure of L, with α|K the identity on K. Since L|K is normal, these
monomorphisms are already automorphisms of L from Theorem 8.2.2. Hence, [L : K] ≤
|Aut(L|K)|. Furthermore, [L : K] ≥ |Aut(L|K)| from Theorem 15.4.3. Combining these,
we have [L : K] = |Aut(L|K)|, and hence L|K is a Galois extension from Theorem 15.4.9.
Therefore, (3) implies (1), completing the proof.

Recall that any field of characteristic 0 is perfect, and therefore any finite exten-
sion is separable. Applying this to ℚ implies that the Galois extensions of the rationals
are precisely the splitting fields of polynomials.

Corollary 16.5.2. The Galois extensions of the rationals are precisely the splitting fields
of polynomials in ℚ[x].


Theorem 16.5.3. Let L|K be a finite, separable field extension. Then there exists an ex-
tension field M of L such that M|K is a Galois extension.

Proof. Let L = K(a1 , . . . , an ) with all ai separable over K. Let fi (x) be the minimal poly-
nomial of ai over K. Then each fi (x), and hence also f (x) = f1 (x) ⋅ ⋅ ⋅ fn (x), is separable
over K. Let M be the splitting field of f (x) over K. Then M|K is a Galois extension from
Theorem 16.5.1.

Example 16.5.4. Let $K = \mathbb{Q}$ be the rationals, and let $f(x) = x^4 - 2 \in \mathbb{Q}[x]$. From Chapter 8,
we know that $L = \mathbb{Q}(\sqrt[4]{2}, i)$ is a splitting field of $f(x)$. By the Eisenstein criterion,
$f(x)$ is irreducible, and $[L : \mathbb{Q}] = 8$. Moreover,

$$\sqrt[4]{2}, \quad i\sqrt[4]{2}, \quad -\sqrt[4]{2}, \quad -i\sqrt[4]{2}$$

are the zeros of $f(x)$. Since the rationals are perfect, $f(x)$ is separable. $L|K$ is a Galois
extension by Theorem 16.5.1. From the calculations in Chapter 15, we have

$$|\operatorname{Aut}(L|K)| = |\operatorname{Aut}(L)| = [L : K] = 8.$$

Let

G = Aut(L|K) = Aut(L|ℚ) = Aut(L).

We want to determine the subgroup lattice of the Galois group $G$. We show $G \cong D_4$,
the dihedral group of order 8. Since there are 4 zeros of $f(x)$, and $G$ permutes these, $G$
must be isomorphic to a subgroup of $S_4$, and since the order is 8, $G$ is a 2-Sylow subgroup of $S_4$. From
this, we have that

$$G = \langle (2, 4), (1, 2, 3, 4) \rangle.$$

If we let $\tau = (2, 4)$ and $\sigma = (1, 2, 3, 4)$, we get the isomorphism between $G$ and $D_4$. From
Theorem 14.1.1, we know that $D_4 = \langle r, f ;\ r^4 = f^2 = (rf)^2 = 1 \rangle$.
This can also be seen in the following manner. Let

$$a_1 = \sqrt[4]{2}, \quad a_2 = i\sqrt[4]{2}, \quad a_3 = -\sqrt[4]{2}, \quad a_4 = -i\sqrt[4]{2}.$$

Let $\alpha \in G$. $\alpha$ is determined if we know $\alpha(\sqrt[4]{2})$ and $\alpha(i)$. The possibilities for $\alpha(i)$ are $i$ or
$-i$; that is, the zeros of $x^2 + 1$.
The possibilities for $\alpha(\sqrt[4]{2})$ are the 4 zeros of $f(x) = x^4 - 2$. Hence, we have 8 possibilities
for $\alpha$. These are exactly the elements of the group $G$. We have $\delta, \tau \in G$ with

$$\delta(\sqrt[4]{2}) = i\sqrt[4]{2}, \quad \delta(i) = i$$

and

$$\tau(\sqrt[4]{2}) = \sqrt[4]{2}, \quad \tau(i) = -i.$$


It is straightforward to show that δ has order 4, τ has order 2, and δτ has order 2. These
define a group of order 8 isomorphic to D4 , and since G has 8 elements, this must be
all of G.
We now look at the subgroup lattice of G, and then the corresponding field lattice.
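Let $\delta$ and $\tau$ be as above. Then $G$ has 5 subgroups of order 2:

$$\{1, \delta^2\}, \quad \{1, \tau\}, \quad \{1, \delta\tau\}, \quad \{1, \delta^2\tau\}, \quad \{1, \delta^3\tau\}.$$

Of these only $\{1, \delta^2\}$ is normal in $G$.

$G$ has 3 subgroups of order 4:

$$\{1, \delta, \delta^2, \delta^3\}, \quad \{1, \delta^2, \tau, \tau\delta^2\}, \quad \{1, \delta^2, \delta\tau, \delta^3\tau\},$$

and all are normal since they all have index 2.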
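
The subgroup counts above can be checked by brute force. The following is a minimal sketch, assuming that the zeros a1, a2, a3, a4 are encoded as the indices 0, 1, 2, 3, so that δ becomes the 4-cycle (1, 2, 3, 4) and τ the transposition (2, 4); the function names are ours and not part of the text.

```python
from itertools import combinations

IDENTITY = (0, 1, 2, 3)

def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of index i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generate(gens):
    """Generate the finite group spanned by gens under composition."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def subgroups_of_order(group, k):
    """All subsets of size k containing the identity and closed under composition."""
    others = sorted(group - {IDENTITY})
    found = []
    for rest in combinations(others, k - 1):
        h = {IDENTITY, *rest}
        if all(compose(a, b) in h for a in h for b in h):
            found.append(frozenset(h))
    return found

def is_normal(h, group):
    return all(compose(compose(g, x), inverse(g)) in h for g in group for x in h)

delta = (1, 2, 3, 0)  # a1 -> a2 -> a3 -> a4 -> a1
tau = (0, 3, 2, 1)    # swaps a2 and a4
G = generate([delta, tau])
order2 = subgroups_of_order(G, 2)
order4 = subgroups_of_order(G, 4)
normal2 = [h for h in order2 if is_normal(h, G)]
normal4 = [h for h in order4 if is_normal(h, G)]
print(len(G), len(order2), len(order4), len(normal2), len(normal4))
```

A finite subset containing the identity and closed under composition is a subgroup, which is why the closure test suffices here.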


Hence, we have the following subgroup lattice:

From this we construct the lattice of fields and intermediate fields. Since G has
10 subgroups (counting {1} and G itself), the fundamental theorem of Galois theory gives
10 intermediate fields in L|ℚ (counting ℚ and L), namely, the fix fields Fix(L, H), where H is a
subgroup of G. In the identification, the extension field corresponding to the whole
group G is the ground field ℚ (recall that the lattice of fields is the inverted lattice of
the subgroups), whereas the extension field corresponding to the identity is the whole
field L. We now consider the 8 nontrivial proper subgroups. Let δ, τ be as before.
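(1) Consider $M_1 = \operatorname{Fix}(L, \{1, \tau\})$. Now $\{1, \tau\}$ fixes $\mathbb{Q}(\sqrt[4]{2})$ elementwise so that $\mathbb{Q}(\sqrt[4]{2}) \subset
M_1$. Furthermore, $[L : M_1] = |\{1, \tau\}| = 2$, and also $[L : \mathbb{Q}(\sqrt[4]{2})] = 8/4 = 2$. Therefore,
$M_1 = \mathbb{Q}(\sqrt[4]{2})$.
(2) Consider $M_2 = \operatorname{Fix}(L, \{1, \tau\delta\})$. We have the following:

$$\begin{aligned}
\tau\delta(\sqrt[4]{2}) &= \tau(i\sqrt[4]{2}) = -i\sqrt[4]{2} \\
\tau\delta(i\sqrt[4]{2}) &= \tau(-\sqrt[4]{2}) = -\sqrt[4]{2} \\
\tau\delta(-\sqrt[4]{2}) &= \tau(-i\sqrt[4]{2}) = i\sqrt[4]{2} \\
\tau\delta(-i\sqrt[4]{2}) &= \tau(\sqrt[4]{2}) = \sqrt[4]{2}.
\end{aligned}$$

It follows that $\tau\delta$ fixes $(1 - i)\sqrt[4]{2}$, and hence $M_2 = \mathbb{Q}((1 - i)\sqrt[4]{2})$.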


(3) Consider $M_3 = \operatorname{Fix}(L, \{1, \tau\delta^2\})$. The map $\tau\delta^2$ interchanges $a_1$ and $a_3$ and fixes $a_2$ and
$a_4$. Therefore, $M_3 = \mathbb{Q}(i\sqrt[4]{2})$.
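
The fixed elements found in (1) to (3) can be checked numerically. The sketch below assumes an encoding of our own: each automorphism is a pair (k, s), meaning that the fourth root of 2 is sent to i^k times itself and i is sent to s·i with s = ±1. Elements of L are given as polynomial expressions in these two generators, and the comparison is done in floating point.

```python
R = 2 ** 0.25  # the real fourth root of 2

def compose(a, b):
    """Return a∘b for automorphisms coded as (k, s): root -> i^k * root, i -> s * i."""
    (k, s), (l, t) = a, b
    # a(i^l * root) = (s*i)^l * i^k * root = i^(k + s*l) * root
    return ((k + s * l) % 4, s * t)

def apply(auto, expr):
    """Apply an automorphism to an element given as a polynomial expression expr(root, i)."""
    k, s = auto
    return expr((1j ** k) * R, s * 1j)

def is_fixed(auto, expr, tol=1e-9):
    return abs(apply(auto, expr) - expr(R, 1j)) < tol

delta = (1, 1)    # root -> i * root, i -> i
tau = (0, -1)     # root -> root,     i -> -i
tau_delta = compose(tau, delta)
tau_delta2 = compose(tau, compose(delta, delta))

print(is_fixed(tau, lambda root, i: root))                   # generator of M1
print(is_fixed(tau_delta, lambda root, i: (1 - i) * root))   # generator of M2
print(is_fixed(tau_delta2, lambda root, i: i * root))        # generator of M3
```

Such a check only confirms that the listed generators lie in the respective fix fields; the equality of fields still needs the degree argument used in (1).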

In an analogous manner, we can then consider the other 5 proper subgroups and cor-
responding intermediate fields. We get the following lattice of fields and subfields:

16.6 The primitive element theorem


In this section, we describe finite separable field extensions as simple extensions. It
follows that a Galois extension is always a simple extension.

Theorem 16.6.1 (Primitive element theorem). Let L = K(γ1 , . . . , γn ), and suppose that
each γi is separable over K. Then there exists a γ0 ∈ L such that L = K(γ0 ). The element
γ0 is called a primitive element.

Proof. Suppose first that K is a finite field. Then L is also a finite field, and therefore
L⋆ = ⟨γ0 ⟩ is cyclic. Therefore, L = K(γ0 ), and the theorem is proved if K is a finite field.
Now suppose that K is infinite. Inductively, it suffices to prove the theorem for
n = 2. Hence, let α, β ∈ L be separable over K. We must show that there exists a γ ∈ L
with K(α, β) = K(γ).
Let $\bar{L}$ be the splitting field of the polynomial $m_\alpha(x) m_\beta(x)$ over $L$, where $m_\alpha(x)$, $m_\beta(x)$
are, respectively, the minimal polynomials of $\alpha$, $\beta$ over $K$. In $\bar{L}[x]$, we have the following:

$$m_\alpha(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_s) \quad \text{with } \alpha = \alpha_1,$$

$$m_\beta(x) = (x - \beta_1)(x - \beta_2) \cdots (x - \beta_t) \quad \text{with } \beta = \beta_1.$$

By assumption the $\alpha_i$ and the $\beta_j$ are, respectively, pairwise distinct.


For each pair (i, j) with 1 ≤ i ≤ s, 2 ≤ j ≤ t, the equation

α1 + zβ1 = αi + zβj


has exactly one solution $z \in \bar{L}$, since $\beta_j - \beta_1 \ne 0$ if $j \ge 2$. Since $K$ is infinite, there exists
a $c \in K$ with

α1 + cβ1 ≠ αi + cβj

for all i, j with 1 ≤ i ≤ s, 2 ≤ j ≤ t. With such a value c ∈ K, we define

γ = α + cβ = α1 + cβ1 .

We claim that K(α, β) = K(γ). It suffices to show that β ∈ K(γ), for then α = γ − cβ ∈
K(γ). This implies that K(α, β) ⊂ K(γ), and since γ ∈ K(α, β), it follows that K(α, β) =
K(γ).
To show that $\beta \in K(\gamma)$, we first define $f(x) = m_\alpha(\gamma - cx)$, and let $d(x) = \gcd(f(x),
m_\beta(x))$ in $K(\gamma)[x]$. We may assume that $d(x)$ is monic. We show that $d(x) = x - \beta$. Then $\beta \in K(\gamma)$,
since $d(x) \in K(\gamma)[x]$.
Assume first that $d(x) = 1$. Then $\gcd(f(x), m_\beta(x)) = 1$, and $f(x)$ and $m_\beta(x)$ are also
relatively prime in $\bar{L}[x]$. This is a contradiction, since $f(x)$ and $m_\beta(x)$ have the common
zero $\beta \in \bar{L}$ (note that $f(\beta) = m_\alpha(\gamma - c\beta) = m_\alpha(\alpha) = 0$), and hence the common divisor $x - \beta$.
Therefore, $d(x) \ne 1$, so $\deg(d(x)) \ge 1$.
The polynomial $d(x)$ is a divisor of $m_\beta(x)$, and hence $d(x)$ splits into pairwise distinct linear factors
of the form $x - \beta_j$, $1 \le j \le t$, in $\bar{L}[x]$. The proof is completed if we can show that no linear
factor of the form $x - \beta_j$ with $2 \le j \le t$ is a divisor of $f(x)$. That is, we must show that
$f(\beta_j) \ne 0$ in $\bar{L}$ if $j \ge 2$.
Now $f(\beta_j) = m_\alpha(\gamma - c\beta_j) = m_\alpha(\alpha_1 + c\beta_1 - c\beta_j)$. Suppose that $f(\beta_j) = 0$ for some
$j \ge 2$. This would imply that $\alpha_i = \alpha_1 + c\beta_1 - c\beta_j$ for some $i$; that is, $\alpha_1 + c\beta_1 = \alpha_i + c\beta_j$ with $j \ge 2$.
This contradicts the choice of the value $c$. Therefore, $f(\beta_j) \ne 0$ if $j \ge 2$, completing the
proof.
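
To make the construction concrete, the following sketch treats the case K = ℚ, α = √2, β = √3 with c = 1, so that γ = √2 + √3. These choices, the coefficient encoding, and the helper names are assumptions made here for illustration. The code uses exact rational arithmetic to show that 1, γ, γ², γ³ are linearly independent over ℚ. Since [ℚ(√2, √3) : ℚ] = 4, this gives ℚ(γ) = ℚ(√2, √3).

```python
from fractions import Fraction

def mul(u, v):
    """Multiply elements of Q(sqrt2, sqrt3), coded as (a, b, c, d) = a + b*sqrt2 + c*sqrt3 + d*sqrt6."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,
    )

def rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

one = (1, 0, 0, 0)
gamma = (0, 1, 1, 0)  # gamma = sqrt2 + c*sqrt3 with c = 1
powers = [one]
for _ in range(4):
    powers.append(mul(powers[-1], gamma))

# 1, gamma, gamma^2, gamma^3 span a 4-dimensional Q-vector space.
print(rank(powers[:4]))
# gamma is a zero of x^4 - 10x^2 + 1.
print(tuple(p4 - 10 * p2 + p0 for p4, p2, p0 in zip(powers[4], powers[2], powers[0])))
```

Exact arithmetic is used on purpose: a floating point rank computation could not distinguish a genuine linear dependence from a near dependence.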

In the above theorem, it is sufficient to assume that n − 1 of γ1 , . . . , γn are separable
over K. The proof is similar. To show that K(α, β) = K(γ) for some γ ∈ L, we only need that
the β1 , . . . , βt are pairwise distinct, which holds if β is separable over K.
If K is a perfect field, then every finite extension is separable. Therefore, we get
the following corollary:

Corollary 16.6.2. Let L|K be a finite extension with K a perfect field. Then L = K(γ) for
some γ ∈ L.

Corollary 16.6.3. Let L|K be a finite extension with K a perfect field. Then there exist
only finitely many intermediate fields E with K ⊂ E ⊂ L.

Proof. Since K is a perfect field, we have L = K(γ) for some γ ∈ L. Let mγ (x) ∈ K[x]
be the minimal polynomial of γ over K, and let L̄ be the splitting field of mγ (x) over K.
Then L̄|K is a Galois extension; hence, there are only finitely many intermediate fields
between K and L̄. Therefore, there are also only finitely many fields between K and L.


Suppose that L|K is algebraic. Then, in general, L = K(γ) for some γ ∈ L if and only
if there exist only finitely many intermediate fields E with K ⊂ E ⊂ L.
This condition on intermediate fields implies that L|K is finite if L|K is algebraic.
Hence, we have proved this result, in the case that K is perfect. The general case is
discussed in the book of S. Lang [12].

16.7 Exercises
1. Let $f(x) = x^4 - 8x^3 + 24x^2 - 32x + 14 \in \mathbb{Q}[x]$, and let $v \in \mathbb{C}$ be a zero of $f$. Let
$\alpha := v(4 - v)$, and $K$ a splitting field of $f$ over $\mathbb{Q}$. Show the following:
(i) $f$ is irreducible over $\mathbb{Q}$, and $f(x) = f(4 - x)$.
(ii) There is exactly one automorphism $\sigma$ of $\mathbb{Q}(v)$ with $\sigma(v) = 4 - v$.
(iii) $L := \mathbb{Q}(\alpha)$ is the fix field of $\sigma$ and $[L : \mathbb{Q}] = 2$.
(iv) Determine the minimal polynomial of $\alpha$ over $\mathbb{Q}$ and determine $\alpha$.
(v) $[\mathbb{Q}(v) : L] = 2$, and determine the minimal polynomial of $v$ over $L$; also determine
$v$ and all other zeros of $f(x)$.
(vi) Determine the degree $[K : \mathbb{Q}]$.
(vii) Determine the structure of $\operatorname{Aut}(K|\mathbb{Q})$.
2. Let $L|K$ be a field extension and $f \in K[x]$ a separable polynomial. Let $Z$ be a splitting
field of $f$ over $L$ and $Z_0$ a splitting field of $f$ over $K$. Show that $\operatorname{Aut}(Z|L)$ is
isomorphic to a subgroup of $\operatorname{Aut}(Z_0|K)$.
3. Let $L|K$ be a field extension and $v \in L$. Show that $K(v + c) = K(v)$ for each element $c \in K$,
and that $K(cv) = K(v)$ for $c \ne 0$.
4. Let $v = \sqrt{2} + \sqrt{3}$ and let $K = \mathbb{Q}(v)$. Show that $\sqrt{2}$ and $\sqrt{3}$ are presentable as a
$\mathbb{Q}$-linear combination of $1, v, v^2, v^3$. Conclude that $K = \mathbb{Q}(\sqrt{2}, \sqrt{3})$.
5. Let $L$ be the splitting field of $x^3 - 5$ over $\mathbb{Q}$ in $\mathbb{C}$. Determine a primitive element $t$ of
$L$ over $\mathbb{Q}$.

17 Applications of Galois theory
17.1 Applications of Galois theory
As we mentioned in Chapter 1, Galois theory was originally developed as part of the
proof that polynomial equations of degree 5 or higher over the rationals cannot be
solved by formulas in terms of radicals. In this chapter, we do this first and prove the
insolvability of the quintic polynomials by radicals. To do this, we must examine in
detail what we call radical extensions.
We then return to some geometric material we started in Chapter 6. There, using
general field extensions, we proved the impossibility of certain geometric compass
and straightedge constructions. Here, we use Galois theory to consider constructible
n-gons.
Finally, we will use Galois theory to present a proof of the fundamental theorem
of algebra, which says, essentially, that the complex number field ℂ is algebraically
closed.
In Chapter 17, we always assume that K is a field of characteristic 0; in particular,
K is perfect. We remark that some parts of Sections 17.2–17.5 go through for finite fields
of characteristic p > 3.

17.2 Field extensions by radicals


We would like to use Galois theory to prove the insolvability by radicals of polynomial
equations of degree 5 or higher. To do this we must introduce extensions by radicals
and solvability by radicals.

Definition 17.2.1. Let L|K be a field extension.


(1) Each zero of a polynomial $x^n - a \in K[x]$ in $L$ is called a radical (over $K$). We denote
it by $\sqrt[n]{a}$ (if a more detailed identification is not necessary).
(2) $L$ is called a simple extension of $K$ by a radical if $L = K(\sqrt[n]{a})$ for some $a \in K$.
(3) L is called an extension of K by radicals if there is a chain of fields

K = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lm = L

such that each Li is a simple extension of Li−1 by a radical for each i = 1, . . . , m.


(4) Let f (x) ∈ K[x]. Then the equation f (x) = 0 is solvable by radicals, or just solvable,
if the splitting field of f (x) over K is contained in an extension of K by radicals.
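
As an illustration of parts (3) and (4), consider the chain ℚ ⊂ ℚ(√2) ⊂ ℚ(√(2 + √2)) and the polynomial x⁴ − 4x² + 2. Both are examples chosen here, not taken from the text. The sketch below checks numerically, in floating point, that all four zeros of this polynomial can be written in terms of t = √(2 + √2), so that the splitting field lies in an extension of ℚ by radicals.

```python
import math

# Tower Q ⊂ Q(s) ⊂ Q(t); each step adjoins a square root of an element
# of the previous field.
s = math.sqrt(2)      # zero of x^2 - 2, a radical over Q
t = math.sqrt(2 + s)  # zero of x^2 - (2 + s), a radical over Q(s)

def f(x):
    return x ** 4 - 4 * x ** 2 + 2

# Since s = t^2 - 2 lies in Q(t), so does s / t, and (s / t)^2 = 2 - s.
zeros = [t, -t, s / t, -s / t]
print(all(abs(f(z)) < 1e-9 for z in zeros))
print(abs(s / t - math.sqrt(2 - s)) < 1e-12)
```

Hence the equation x⁴ − 4x² + 2 = 0 is solvable by radicals in the sense of part (4).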

In proving the insolvability of the quintic polynomial, we will look for necessary
and sufficient conditions for the solvability of polynomial equations. Our main result

https://doi.org/10.1515/9783110603996-017


will be that if f (x) ∈ K[x], then f (x) = 0 is solvable by radicals over K if and only if the Galois
group of the splitting field of f (x) over K is a solvable group (see Chapter 11).
In the remainder of this section, we assume that all fields have characteristic zero.
The next theorem gives a characterization of simple extensions by radicals:

Theorem 17.2.2. Let $L|K$ be a field extension and $n \in \mathbb{N}$. Assume that the polynomial
$x^n - 1$ splits into linear factors in $K[x]$ so that $K$ contains all the $n$-th roots of unity. Then
$L = K(\sqrt[n]{a})$ for some $a \in K$ if and only if $L$ is a Galois extension over $K$, and $\operatorname{Aut}(L|K) \cong
\mathbb{Z}/m\mathbb{Z}$ for some $m \in \mathbb{N}$ with $m|n$.

Proof. The $n$-th roots of unity, that is, the zeros of the polynomial $x^n - 1 \in K[x]$, form a
cyclic multiplicative group $\mathcal{F} \subset K^\star$ of order $n$, since each finite subgroup of the
multiplicative group $K^\star$ of $K$ is cyclic, and $|\mathcal{F}| = n$. We call an $n$-th root of unity $\omega$ primitive
if $\mathcal{F} = \langle \omega \rangle$.
Now let $L = K(\sqrt[n]{a})$ with $a \in K$; that is, $L = K(\beta)$ with $\beta^n = a \in K$. Let $\omega$ be a
primitive $n$-th root of unity. With this $\beta$, the elements $\omega\beta, \omega^2\beta, \dots, \omega^n\beta = \beta$ are zeros of
$x^n - a$. Hence, the polynomial $x^n - a$ splits into linear factors over $L$; hence, $L = K(\beta)$
is a splitting field of $x^n - a$ over $K$. It follows that $L|K$ is a Galois extension.
Let $\sigma \in \operatorname{Aut}(L|K)$. Then $\sigma(\beta) = \omega^\nu \beta$ for some $0 < \nu \le n$. The element $\omega^\nu$ is uniquely
determined by $\sigma$, and we may write $\omega^\nu = \omega_\sigma$.
Consider the map $\phi : \operatorname{Aut}(L|K) \to \mathcal{F}$ given by $\sigma \mapsto \omega_\sigma$, where $\omega_\sigma$ is defined as
above by $\sigma(\beta) = \omega_\sigma \beta$. If $\tau, \sigma \in \operatorname{Aut}(L|K)$, then

$$\sigma\tau(\beta) = \sigma(\omega_\tau \beta) = \sigma(\omega_\tau)\sigma(\beta) = \omega_\tau \omega_\sigma \beta,$$

because $\omega_\tau \in K$.
Therefore, $\phi(\sigma\tau) = \phi(\sigma)\phi(\tau)$; hence, $\phi$ is a homomorphism. The kernel $\ker(\phi)$
contains all the $K$-automorphisms $\sigma$ of $L$, for which $\sigma(\beta) = \beta$. However, since $L = K(\beta)$, it
follows that $\ker(\phi)$ contains only the identity. The Galois group $\operatorname{Aut}(L|K)$ is, therefore,
isomorphic to a subgroup of $\mathcal{F}$. Since $\mathcal{F}$ is cyclic of order $n$, we have that $\operatorname{Aut}(L|K)$ is
cyclic of order $m$ for some $m|n$, completing one way in the theorem.
Conversely, first suppose that $L|K$ is a Galois extension with $\operatorname{Aut}(L|K) \cong \mathbb{Z}_n$, a
cyclic group of order $n$. Let $\sigma$ be a generator of $\operatorname{Aut}(L|K)$. This is equivalent to

$$\operatorname{Aut}(L|K) = \{\sigma, \sigma^2, \dots, \sigma^n = 1\}.$$

Let $\omega$ be a primitive $n$-th root of unity. Then, by assumption, $\omega \in K$, $\sigma(\omega) = \omega$, and
$\mathcal{F} = \{\omega, \omega^2, \dots, \omega^n = 1\}$. Furthermore, the pairwise distinct automorphisms $\sigma^\nu$,
$\nu = 1, 2, \dots, n$, of $L$ are linearly independent; in particular, there exists an $\eta \in L$ such that

$$\omega \star \eta = \sum_{\nu=1}^{n} \omega^\nu \sigma^\nu(\eta) \ne 0.$$


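The element $\omega \star \eta$ is called the Lagrange resolvent of $\omega$ by $\eta$. We fix such an element
$\eta \in L$. Then we get, since $\sigma(\omega) = \omega$,

$$\begin{aligned}
\sigma(\omega \star \eta) &= \sum_{\nu=1}^{n} \omega^\nu \sigma^{\nu+1}(\eta) = \omega^{-1} \sum_{\nu=1}^{n} \omega^{\nu+1} \sigma^{\nu+1}(\eta) = \omega^{-1} \sum_{\nu=2}^{n+1} \omega^\nu \sigma^\nu(\eta) \\
&= \omega^{-1} \sum_{\nu=1}^{n} \omega^\nu \sigma^\nu(\eta) = \omega^{-1}(\omega \star \eta).
\end{aligned}$$

Moreover, $\sigma^\mu(\omega \star \eta) = \omega^{-\mu}(\omega \star \eta)$, $\mu = 1, 2, \dots, n$. Hence, the only $K$-automorphism of $L$,
which fixes $\omega \star \eta$ is the identity. Therefore, $\operatorname{Aut}(L|K(\omega \star \eta)) = \{1\}$; hence, $L = K(\omega \star \eta)$
by the fundamental theorem of Galois theory.
Furthermore,

$$\sigma((\omega \star \eta)^n) = (\sigma(\omega \star \eta))^n = (\omega^{-1}(\omega \star \eta))^n = \omega^{-n}(\omega \star \eta)^n = (\omega \star \eta)^n.$$

Therefore, $(\omega \star \eta)^n \in \operatorname{Fix}(L, \operatorname{Aut}(L|K)) = K$, again from the fundamental theorem of
Galois theory. If $a = (\omega \star \eta)^n$, then first $a \in K$, and second $L = K(\sqrt[n]{a}) = K(\omega \star \eta)$.
This proves the result in the case where $m = n$. We now use this to prove it in general.
Finally, suppose that $L|K$ is a Galois extension with $\operatorname{Aut}(L|K) \cong \mathbb{Z}_m$, a cyclic group
of order $m$, where $n = qm$ for some $q \ge 1$. Since $K$ contains the primitive $m$-th root of unity $\omega^q$,
we have $L = K(\sqrt[m]{b})$ for some $b \in K$ by
the above argument. Hence, $L = K(\beta)$ with $\beta^m \in K$. Then certainly, $a = \beta^n = (\beta^m)^q \in K$;
therefore, $L = K(\beta) = K(\sqrt[n]{a})$ for some $a \in K$, completing the general case.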
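
The role of the element η can be seen in a small numerical experiment. The sketch below assumes n = 3, K = ℚ(ω) with ω a primitive third root of unity, L = K(β) with β³ = 2, and σ(β) = ωβ; these choices and the function names are ours. It shows that η = β gives a vanishing resolvent, whereas η = β² gives a nonzero resolvent whose third power lies in K.

```python
import cmath

n = 3
omega = cmath.exp(2j * cmath.pi / n)  # primitive cube root of unity, assumed to lie in K
beta = 2 ** (1 / 3)                   # real zero of x^3 - 2, so that L = K(beta)

def sigma_power(nu, exponent):
    """Image of beta**exponent under sigma^nu, where sigma(beta) = omega * beta."""
    return (omega ** nu * beta) ** exponent

def resolvent(exponent):
    """Lagrange resolvent of omega by eta = beta**exponent."""
    return sum(omega ** nu * sigma_power(nu, exponent) for nu in range(1, n + 1))

r1 = resolvent(1)  # eta = beta: the resolvent vanishes
r2 = resolvent(2)  # eta = beta^2: the resolvent equals 3 * beta^2

# sigma applied to the resolvent of eta = beta^2, term by term
sigma_r2 = sum(omega ** nu * sigma_power(nu + 1, 2) for nu in range(1, n + 1))

print(abs(r1) < 1e-9)
print(abs(r2 ** 3 - 108) < 1e-6)            # (3 * beta^2)^3 = 27 * 4 = 108
print(abs(sigma_r2 - r2 / omega) < 1e-9)    # sigma(w * eta) = w^(-1) (w * eta)
```

This is why the proof only claims the existence of a suitable η and does not prescribe one.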

We next show that every extension by radicals is contained in a Galois extension
by radicals.

Theorem 17.2.3. Each extension $L$ of $K$ by radicals is contained in a Galois extension $\tilde{L}$
of $K$ by radicals. This means that there is an extension $\tilde{L}$ of $K$ by radicals with $L \subset \tilde{L}$, and
$\tilde{L}|K$ is a Galois extension.

Proof. We use induction on the degree $m = [L : K]$. Suppose that $m = 1$. If $L = K(\sqrt[n]{a})$,
then if $\omega$ is a primitive $n$-th root of unity, define $\tilde{K} = K(\omega)$ and $\tilde{L} = \tilde{K}(\sqrt[n]{a})$. We then
get the chain $K \subset \tilde{K} \subset \tilde{L}$ with $L \subset \tilde{L}$, and $\tilde{L}|K$ is a Galois extension. This last statement
is due to the fact that $\tilde{L}$ is the splitting field of the polynomial $x^n - a \in K[x]$ over $K$.
Hence, the theorem is true if $m = 1$.
Now suppose that $m \ge 2$, and suppose that the theorem is true for all extensions
$F$ of $K$ by radicals with $[F : K] < m$.
Since $m \ge 2$ by the definition of extension by radicals, there exists a simple extension
$L|E$ by a radical. That is, there exists a field $E$ with

$$K \subset E \subset L, \quad [L : E] \ge 2$$

and $L = E(\sqrt[n]{a})$ for some $a \in E$, $n \in \mathbb{N}$. Now $[E : K] < m$. Therefore, by the inductive
hypothesis, there exists a Galois extension by radicals $\tilde{E}$ of $K$ with $E \subset \tilde{E}$. Let $G =
\operatorname{Aut}(\tilde{E}|K)$, and let $\tilde{L}$ be the splitting field of the polynomial $f(x) = m_a(x^n) \in K[x]$ over $\tilde{E}$,
where $m_a(x)$ is the minimal polynomial of $a$ over $K$. We show that $\tilde{L}$ has the desired
properties.
Now $\sqrt[n]{a} \in L$ is a zero of the polynomial $f(x)$, and $E \subset \tilde{E} \subset \tilde{L}$. Therefore, $\tilde{L}$ contains
an $E$-isomorphic image of $L = E(\sqrt[n]{a})$; hence, we may consider $\tilde{L}$ as an extension of $L$.
Since $\tilde{E}$ is a Galois extension of $K$, the polynomial $f(x)$ may be factored as

$$f(x) = (x^n - \alpha_1) \cdots (x^n - \alpha_s)$$

with $\alpha_i \in \tilde{E}$ for $i = 1, \dots, s$. All zeros of $f(x)$ in $\tilde{L}$ are radicals over $\tilde{E}$. Therefore, $\tilde{L}$ is an
extension by radicals of $\tilde{E}$. Since $\tilde{E}$ is also an extension by radicals of $K$, we obtain that
$\tilde{L}$ is an extension by radicals of $K$.
Since $\tilde{E}$ is a Galois extension of $K$, we have that $\tilde{E}$ is a splitting field of a polynomial
$g(x) \in K[x]$. Furthermore, $\tilde{L}$ is a splitting field of $f(x) \in K[x]$ over $\tilde{E}$. Altogether then,
we have that $\tilde{L}$ is a splitting field of $f(x)g(x) \in K[x]$ over $K$. Therefore, $\tilde{L}$ is a Galois
extension of $K$, completing the proof.

We will eventually show that a polynomial equation is solvable by radicals if and
only if the corresponding Galois group is a solvable group. We now begin to find
conditions under which the Galois group is solvable.

Lemma 17.2.4. Let K = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lr = L be a chain of fields such that the following
hold:
(i) L is a Galois extension of K.
(ii) Lj is a Galois extension of Lj−1 for j = 1, . . . , r.
(iii) Gj = Aut(Lj |Lj−1 ) is abelian for j = 1, . . . , r.

Then G = Aut(L|K) is solvable.

Proof. We prove the lemma by induction on $r$. If $r = 0$, then $G = \{1\}$, and there is nothing
to prove. Suppose then that $r \ge 1$, and assume that the lemma holds for all such
chains of fields with a length $r' < r$. Since $L_1|K$ is a Galois extension, $\operatorname{Aut}(L|L_1)$ is
a normal subgroup of $G$ by the fundamental theorem of Galois theory. Moreover,

$$G_1 = \operatorname{Aut}(L_1|K) \cong G/\operatorname{Aut}(L|L_1).$$

Since $G_1$ is an abelian group, it is solvable, and by the inductive hypothesis, applied to the
chain $L_1 \subset \cdots \subset L_r = L$, the group $\operatorname{Aut}(L|L_1)$ is solvable.
Therefore, $G$ is solvable (see Theorem 12.2.4).

Lemma 17.2.5. Let $L|K$ be a field extension. Let $\tilde{K}$ and $\tilde{L}$ be the splitting fields of the
polynomial $x^n - 1 \in K[x]$ over $K$ and $L$, respectively. Since $K \subset L$, we have $\tilde{K} \subset \tilde{L}$. Then
the following hold:
(1) If $\sigma \in \operatorname{Aut}(\tilde{L}|L)$, then $\sigma|_{\tilde{K}} \in \operatorname{Aut}(\tilde{K}|K)$, and the map

$$\operatorname{Aut}(\tilde{L}|L) \to \operatorname{Aut}(\tilde{K}|K), \quad \text{given by } \sigma \mapsto \sigma|_{\tilde{K}},$$

is an injective homomorphism.
(2) Suppose that in addition $L|K$ is a Galois extension. Then $\tilde{L}|\tilde{K}$ is also a Galois extension.
If furthermore, $\sigma \in \operatorname{Aut}(\tilde{L}|\tilde{K})$, then $\sigma|_L \in \operatorname{Aut}(L|K)$, and

$$\operatorname{Aut}(\tilde{L}|\tilde{K}) \to \operatorname{Aut}(L|K), \quad \text{given by } \sigma \mapsto \sigma|_L,$$

is an injective homomorphism.

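Proof. (1) Let $\omega$ be a primitive $n$th root of unity. Then $\tilde{K} = K(\omega)$, and $\tilde{L} = L(\omega)$. Each
$\sigma \in \operatorname{Aut}(\tilde{L}|L)$ maps $\omega$ onto a primitive $n$th root of unity, and fixes $K \subset L$ elementwise.
Hence, from $\sigma \in \operatorname{Aut}(\tilde{L}|L)$, we get that $\sigma|_{\tilde{K}} \in \operatorname{Aut}(\tilde{K}|K)$. Certainly, the map $\sigma \mapsto \sigma|_{\tilde{K}}$
defines a homomorphism $\operatorname{Aut}(\tilde{L}|L) \to \operatorname{Aut}(\tilde{K}|K)$. Let $\sigma|_{\tilde{K}} = 1$ with $\sigma \in \operatorname{Aut}(\tilde{L}|L)$. Then
$\sigma(\omega) = \omega$; therefore, we have already that $\sigma = 1$, since $\tilde{L} = L(\omega)$.
(2) If $L$ is the splitting field of a polynomial $g(x)$ over $K$, then $\tilde{L}$ is the splitting field
of $g(x)(x^n - 1)$ over $K$. Hence, $\tilde{L}|K$ is a Galois extension. Therefore, $K \subset L \subset \tilde{L}$, and
$L|K$, $\tilde{L}|L$ and $\tilde{L}|K$ are all Galois extensions. Therefore, from the fundamental theorem
of Galois theory

$$\operatorname{Aut}(L|K) = \{\sigma|_L ;\ \sigma \in \operatorname{Aut}(\tilde{L}|K)\}.$$

In particular, $\sigma|_L \in \operatorname{Aut}(L|K)$ if $\sigma \in \operatorname{Aut}(\tilde{L}|\tilde{K})$. Certainly, the map $\operatorname{Aut}(\tilde{L}|\tilde{K}) \to \operatorname{Aut}(L|K)$,
given by $\sigma \mapsto \sigma|_L$, is a homomorphism. From $\sigma \in \operatorname{Aut}(\tilde{L}|\tilde{K})$, we get that $\sigma(\omega) = \omega$,
where, as above, $\omega$ is a primitive $n$th root of unity. Therefore, if $\sigma|_L = 1$, then already,
$\sigma = 1$, since $\tilde{L} = L(\omega)$. Hence, the map is injective.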

17.3 Cyclotomic extensions


Very important in the solvability by radicals problem are the splitting fields of the
polynomials xn − 1 over ℚ. These are called cyclotomic fields.

Definition 17.3.1. The splitting field of the polynomial xn −1 ∈ ℚ[x] with n ≥ 2 is called
the nth cyclotomic field denoted by kn .

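We have $k_n = \mathbb{Q}(\omega)$, where $\omega$ is a primitive $n$th root of unity, for example, $\omega =
e^{2\pi i/n}$. The extension $k_n|\mathbb{Q}$ is a Galois extension, and the Galois group $\operatorname{Aut}(k_n|\mathbb{Q})$ is the set of
automorphisms $\sigma_m : \omega \mapsto \omega^m$ with $1 \le m \le n$ and $\gcd(m, n) = 1$.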


To understand this group G, we need the following concept: A prime residue class
mod n is a residue class a + nℤ with gcd(a, n) = 1. The set of the prime residue classes
mod n is just the set of invertible elements with respect to multiplication in ℤ/nℤ.
This forms a multiplicative group that we denote by (ℤ/nℤ)⋆ = Pn . We have |Pn | =
ϕ(n), where ϕ(n) is the Euler phi-function.
If G = Aut(kn |ℚ), then G ≅ Pn under the map σm 󳨃→ m + nℤ.
If n = p is a prime number, then G = Aut(kp |ℚ) is cyclic with |G| = p − 1.


If $n = p^2$, then $|G| = |\operatorname{Aut}(k_{p^2}|\mathbb{Q})| = p(p - 1)$, since

$$\frac{x^{p^2} - 1}{x - 1} \cdot \frac{x - 1}{x^p - 1} = x^{p(p-1)} + x^{p(p-2)} + \cdots + x^p + 1.$$
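
The group P_n of prime residue classes can be explored directly. The following sketch, with function names of our own choosing and assuming n ≥ 2 as in Definition 17.3.1, lists representatives of P_n and tests whether P_n is cyclic by searching for an element whose order equals |P_n|.

```python
from math import gcd

def prime_residue_classes(n):
    """Representatives m with 1 <= m <= n and gcd(m, n) = 1 (for n >= 2)."""
    return [m for m in range(1, n + 1) if gcd(m, n) == 1]

def order_mod(m, n):
    """Multiplicative order of m modulo n; m must be coprime to n and n >= 2."""
    k, power = 1, m % n
    while power != 1:
        power = (power * m) % n
        k += 1
    return k

def is_cyclic(n):
    """True if the group of prime residue classes mod n has a generator."""
    units = prime_residue_classes(n)
    return any(order_mod(m, n) == len(units) for m in units)

for n in (7, 8, 9, 25):
    print(n, len(prime_residue_classes(n)), is_cyclic(n))
```

For n = 7 and n = 25 the orders are p − 1 = 6 and p(p − 1) = 20, in line with the statements above, while n = 8 gives an abelian group that is not cyclic.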

Lemma 17.3.2. Let $K$ be a field and $\tilde{K}$ be the splitting field of $x^n - 1$ over $K$. Then $\operatorname{Aut}(\tilde{K}|K)$
is abelian.

Proof. We apply Lemma 17.2.5 for the field extension $K|\mathbb{Q}$. This can be done since the
characteristic of $K$ is zero, and $\mathbb{Q}$ is the prime field of $K$. It follows that $\operatorname{Aut}(\tilde{K}|K)$ is
isomorphic to a subgroup of $\operatorname{Aut}(\tilde{\mathbb{Q}}|\mathbb{Q})$ from part (1) of Lemma 17.2.5. But $\tilde{\mathbb{Q}} = k_n$, and
hence $\operatorname{Aut}(\tilde{\mathbb{Q}}|\mathbb{Q})$ is abelian. Therefore, $\operatorname{Aut}(\tilde{K}|K)$ is abelian.

17.4 Solvability and Galois extensions


In this section, we prove that solvability by radicals is equivalent to the solvability of
the Galois group.

Theorem 17.4.1. Let L|K be a Galois extension of K by radicals. Then G = Aut(L|K) is a
solvable group.

Proof. Suppose that $L|K$ is a Galois extension by radicals. Then we have a chain of fields

$$K = L_0 \subset L_1 \subset \cdots \subset L_r = L$$

such that $L_j = L_{j-1}(\sqrt[n_j]{a_j})$ for some $a_j \in L_{j-1}$. Let $n = n_1 \cdots n_r$, and let $\tilde{L}_j$ be the splitting
field of the polynomial $x^n - 1 \in K[x]$ over $L_j$ for each $j = 0, 1, \dots, r$. Then $\tilde{L}_j = \tilde{L}_{j-1}(\sqrt[n_j]{a_j})$,
and we get the chain

$$K \subset \tilde{K} = \tilde{L}_0 \subset \tilde{L}_1 \subset \cdots \subset \tilde{L}_r = \tilde{L}.$$

From part (2) of Lemma 17.2.5, we get that $\tilde{L}|K$ is a Galois extension. Furthermore,
$\tilde{L}_j|\tilde{L}_{j-1}$ is a Galois extension with $\operatorname{Aut}(\tilde{L}_j|\tilde{L}_{j-1})$ cyclic from Theorem 17.2.2. In particular,
$\operatorname{Aut}(\tilde{L}_j|\tilde{L}_{j-1})$ is abelian. The group $\operatorname{Aut}(\tilde{K}|K)$ is abelian from Lemma 17.3.2. Therefore,
we may apply Lemma 17.2.4 to the chain

$$K \subset \tilde{K} = \tilde{L}_0 \subset \cdots \subset \tilde{L}_r = \tilde{L}.$$

Therefore, $\tilde{G} = \operatorname{Aut}(\tilde{L}|K)$ is solvable. The group $G = \operatorname{Aut}(L|K)$ is a homomorphic image
of $\tilde{G}$ from the fundamental theorem of Galois theory. Since homomorphic images of
solvable groups are still solvable (see Theorem 12.2.3), it follows that $G$ is solvable.

Lemma 17.4.2. Let L|K be a Galois extension, and suppose that G = Aut(L|K) is solv-
able. Assume further that K contains all q-th roots of unity for each prime divisor q of
m = [L : K]. Then L is an extension of K by radicals.


Proof. Let $L|K$ be a Galois extension, and suppose that $G = \operatorname{Aut}(L|K)$ is solvable; also
assume that $K$ contains all the $q$-th roots of unity for each prime divisor $q$ of $m = [L : K]$.
We prove the result by induction on $m$.
If $m = 1$, then $L = K$, and the result is clear. Now suppose that $m \ge 2$, and assume
that the result holds for all Galois extensions $L'|K'$ with $[L' : K'] < m$. Now
$G = \operatorname{Aut}(L|K)$ is solvable, and $G$ is nontrivial since $m \ge 2$.
From Lemma 12.2.2 and Theorem 13.3.5, it follows that there is a prime divisor $q$ of $m$ and a normal subgroup $H$
of $G$ with $G/H$ cyclic of order $q$. Let $E = \operatorname{Fix}(L, H)$. From the fundamental theorem of
Galois theory, $E|K$ is a Galois extension with $\operatorname{Aut}(E|K) \cong G/H$, and hence $\operatorname{Aut}(E|K)$ is
cyclic of order $q$. From Theorem 17.2.2, $E|K$ is a simple extension of $K$ by a radical. The
proof is completed if we can show that $L$ is an extension of $E$ by radicals.
The extension $L|E$ is a Galois extension, and the group $\operatorname{Aut}(L|E)$ is solvable, since
it is a subgroup of $G = \operatorname{Aut}(L|K)$. Each prime divisor $p$ of $[L : E]$ is also a prime divisor
of $m = [L : K]$ by the degree formula. Hence, as an extension of $K$, the field $E$ contains
all the $p$-th roots of unity. Finally,

$$[L : E] = \frac{[L : K]}{[E : K]} = \frac{m}{q} < m.$$

Therefore, $L|E$ is an extension of $E$ by radicals from the inductive assumption,
completing the proof.

17.5 The insolvability of the quintic polynomial


We are now able to prove the insolvability of the quintic polynomial. This is one of
the most important applications of Galois theory. As aforementioned, we do this by
equating the solvability of a polynomial equation by radicals to the solvability of the
Galois group of the splitting field of this polynomial.

Theorem 17.5.1. Let K be a field of characteristic 0, and let f (x) ∈ K[x]. Suppose that L
is the splitting field of f (x) over K. Then the polynomial equation f (x) = 0 is solvable by
radicals if and only if Aut(L|K) is solvable.

Proof. Suppose first that f(x) = 0 is solvable by radicals. Then L is contained in an
extension L′ of K by radicals. Hence, L is contained in a Galois extension L̃ of K by
radicals from Theorem 17.2.3. The group G̃ = Aut(L̃|K) is solvable from Theorem 17.4.1.
Furthermore, L|K is a Galois extension. Therefore, the Galois group Aut(L|K) is
solvable, being isomorphic to the factor group G̃/Aut(L̃|L).
Conversely, suppose that the group Aut(L|K) is solvable. Let q1, . . . , qr be the prime
divisors of m = [L : K], and let n = q1 ⋅ ⋅ ⋅ qr. Let K̃ and L̃ be the splitting fields of the
polynomial x^n − 1 ∈ K[x] over K and L, respectively. We have K̃ ⊂ L̃. From part (2) of
Lemma 17.2.5, we have that L̃|K̃ is a Galois extension, and Aut(L̃|K̃) is isomorphic to a
subgroup of Aut(L|K). From this, we first obtain that [L̃ : K̃] = |Aut(L̃|K̃)| is a divisor


of [L : K] = |Aut(L|K)|. Hence, each prime divisor q of [L̃ : K̃] is also a prime divisor of
[L : K], so that K̃ contains all q-th roots of unity. Therefore, L̃ is an extension by
radicals of K̃ by Lemma 17.4.2. Since K̃ = K(ω), where ω is a primitive n-th root of unity,
we obtain that L̃ is also an extension of K by radicals. Thus, L is contained in an
extension L̃ of K by radicals, and f(x) = 0 is solvable by radicals.

Corollary 17.5.2. Let K be a field of characteristic 0, and let f (x) ∈ K[x] be a polynomial
of degree m with 1 ≤ m ≤ 4. Then the equation f (x) = 0 is solvable by radicals.

Proof. Let L be the splitting field of f(x) over K. The Galois group Aut(L|K) is isomorphic
to a subgroup of the symmetric group Sm. Now the group S4 is solvable via the
chain

{1} ⊂ ℤ2 ⊂ D2 ⊂ A4 ⊂ S4 ,

where ℤ2 is the cyclic group of order 2, and D2 is the Klein 4-group, which is isomorphic
to ℤ2 × ℤ2 . Because Sm ⊂ S4 for 1 ≤ m ≤ 4, it follows that Aut(L|K) is solvable. From
Theorem 17.5.1, the equation f (x) = 0 is solvable by radicals.
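The solvability of S4 can also be checked by brute force. The following is a minimal sketch, not part of the original text: it computes the orders of the derived series of the symmetric group on a few letters. The helper names (`compose`, `generated_subgroup`, `derived_series_orders`) are illustrative choices. Note that the derived series {1} ⊂ D2 ⊂ A4 ⊂ S4 is slightly coarser than the chain in the proof, which inserts ℤ2 to obtain factors of prime order.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q: apply q first, then p (permutations stored as tuples)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, degree):
    """Close a set of permutations under multiplication (enough in a finite group)."""
    identity = tuple(range(degree))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(group, degree):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, degree)

def derived_series_orders(degree):
    """Orders of S_n, S_n', S_n'', ... until the series becomes stationary."""
    group = set(permutations(range(degree)))
    orders = [len(group)]
    while True:
        smaller = derived_subgroup(group, degree)
        if len(smaller) == len(group):
            return orders
        orders.append(len(smaller))
        group = smaller

print(derived_series_orders(4))  # [24, 12, 4, 1]: the series reaches the trivial group
```

A group is solvable exactly when this series ends in the trivial group, which is what the final entry 1 expresses for S4.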

Corollary 17.5.2 uses the general theory to show that any polynomial equation of
degree less than or equal to 4 is solvable by radicals. This, however, does not provide
explicit formulas for the solutions. We present these below:
Let K be a field of characteristic 0, and let f (x) ∈ K[x] be a polynomial of degree
m with 1 ≤ m ≤ 4. As mentioned above, we assume that K is the splitting field of the
respective polynomial.
Case (1): If deg(f(x)) = 1, then f(x) = ax + b with a, b ∈ K and a ≠ 0. A zero is then
given by k = −b/a.
Case (2): If deg(f(x)) = 2, then f(x) = ax² + bx + c with a, b, c ∈ K and a ≠ 0. The
zeros are then given by the quadratic formula

\[ k = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

We note that the quadratic formula holds over any field of characteristic not equal to 2.
Whether there is a solution within the field K then depends on whether b² − 4ac has a
square root within K.
For the cases of degrees 3 and 4, we have the general forms of what are known as
Cardano’s formulas.
Case (3): If deg(f(x)) = 3, then f(x) = ax³ + bx² + cx + d with a, b, c, d ∈ K and a ≠ 0.
Dividing through by a, we may assume, without loss of generality, that a = 1.
By a substitution x = y − b/3, the polynomial is transformed into

g(y) = y³ + py + q ∈ K[y].


Let L be the splitting field of g(y) over K, and let α ∈ L be a zero of g(y) so that

α³ + pα + q = 0.

If p = 0, then α is a cube root of −q, so that g(y) has the three zeros

\[ \sqrt[3]{-q}, \quad \omega\sqrt[3]{-q}, \quad \omega^2\sqrt[3]{-q}, \]

where ω is a primitive third root of unity, that is, ω³ = 1 with ω ≠ 1.


Now let p ≠ 0, and let β be a zero of x² − αx − p/3 in a suitable extension L′ of L. We
have β ≠ 0, since p ≠ 0. Hence, α = β − p/(3β). Putting this into the transformed cubic
equation

α³ + pα + q = 0,

we get

\[ \beta^3 - \frac{p^3}{27\beta^3} + q = 0. \]

Define γ = β³ and δ = (−p/(3β))³ so that

γ + δ + q = 0.

Then

\[ \gamma^2 + q\gamma - \left(\frac{p}{3}\right)^3 = 0 \quad\text{and}\quad -\frac{p^3}{27\delta} + \delta + q = 0 \quad\text{and}\quad \delta^2 + q\delta - \left(\frac{p}{3}\right)^3 = 0. \]

Hence, the zeros of the polynomial

\[ x^2 + qx - \left(\frac{p}{3}\right)^3 \]

are

\[ \gamma, \delta = -\frac{q}{2} \pm \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}. \]

If we have γ = δ, then both are equal to −q/2, and

\[ \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} = 0. \]

Then from the definitions of γ, δ, we have γ = β³ and δ = (−p/(3β))³. From above, α = β − p/(3β).
Therefore, we get α by finding the cube roots of γ and δ.


There are certain possibilities and combinations with these cube roots, but be-
cause of the conditions, the cube roots of γ and δ are not independent. We must satisfy
the condition
\[ \sqrt[3]{\gamma}\,\sqrt[3]{\delta} = \beta \cdot \frac{-p}{3\beta} = -\frac{p}{3}. \]

Therefore, we get the final result:


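The zeros of g(y) = y³ + py + q with p ≠ 0 are

u + v, ωu + ω²v, ω²u + ωv,

where ω is a primitive third root of unity, and

\[ u = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}} \quad\text{and}\quad v = \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}. \]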

The above is known as the cubic formula, or Cardano’s formula.
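As a numerical illustration, the cubic formula can be sketched in a few lines of Python. This is only a floating-point sketch under stated assumptions, not an exact computation in a field K: the function name `cardano_roots` is hypothetical, the coefficients are taken as real or complex numbers, and the dependence between the two cube roots is enforced by setting v = −p/(3u) instead of extracting a second cube root.

```python
import cmath

def cardano_roots(p, q):
    """Zeros of y^3 + p*y + q for p != 0, in the form u + v, ωu + ω²v, ω²u + ωv."""
    if p == 0:
        raise ValueError("this sketch covers only the case p != 0")
    omega = complex(-0.5, 3 ** 0.5 / 2)            # a primitive third root of unity
    root_term = cmath.sqrt((q / 2) ** 2 + (p / 3) ** 3)
    u = (-q / 2 + root_term) ** (1 / 3)            # one cube root of gamma
    v = -p / (3 * u)                               # forced by the condition u * v = -p/3
    return [u + v, omega * u + omega ** 2 * v, omega ** 2 * u + omega * v]

# Example: y^3 - 15y - 4 has the zeros 4 and -2 ± sqrt(3).
for zero in cardano_roots(-15, -4):
    print(zero)
```

In this example the term under the square root is negative, so the computation passes through complex numbers even though all three zeros are real.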


Case (4): If deg(f(x)) = 4, then f(x) = ax⁴ + bx³ + cx² + dx + e with a, b, c, d, e ∈ K
and a ≠ 0. Dividing through by a, we may assume without loss of generality that a = 1.
By a substitution x = y − b/4, the polynomial f(x) is transformed into

g(y) = y⁴ + py² + qy + r.

We have to find the zeros of g(y). Let x1, x2, x3, x4 be the solutions in the splitting field
of the polynomial equation

y⁴ + py² + qy + r = 0.

Then

y⁴ + py² + qy + r = (y − x1)(y − x2)(y − x3)(y − x4).

If we compare the coefficients, we get the following:

0 = x1 + x2 + x3 + x4 ,
p = x1 x2 + x1 x3 + x1 x4 + x2 x3 + x2 x4 + x3 x4 ,
−q = x1 x2 x3 + x1 x2 x4 + x1 x3 x4 + x2 x3 x4 ,
r = x1 x2 x3 x4 .

We define

y1 = (x1 + x2 )(x3 + x4 ),
y2 = (x1 + x3 )(x2 + x4 ),
y3 = (x1 + x4 )(x2 + x3 ).


From x1 + x2 + x3 + x4 = 0, we get

y1 = −(x1 + x2)² = −(x3 + x4)², because x1 + x2 = −(x3 + x4),
y2 = −(x1 + x3)² = −(x2 + x4)², because x1 + x3 = −(x2 + x4),
y3 = −(x1 + x4)² = −(x2 + x3)², because x1 + x4 = −(x2 + x3).

Let y³ + fy² + gy + h = 0 be the cubic equation with the solutions y1, y2, and y3. This
polynomial y³ + fy² + gy + h is called the cubic resolvent of the equation of degree
four.
If we compare the coefficients, we get the following:

f = − y1 − y2 − y3 ,
g =y1 y2 + y1 y3 + y2 y3 ,
h = − y1 y2 y3 .

A direct calculation leads to

f = −2p,
g = p² − 4r,
h = q².

Hence, the equation

y³ − 2py² + (p² − 4r)y + q² = 0

is the resolvent of y⁴ + py² + qy + r = 0. We now calculate the solutions y1, y2, y3 of
y³ − 2py² + (p² − 4r)y + q² = 0 using Cardano’s formula.
Then we substitute backwards, and get the following:

x1 + x2 = −(x3 + x4 ) = ±√−y1 ,
x1 + x3 = −(x2 + x4 ) = ±√−y2 ,
x1 + x4 = −(x2 + x3 ) = ±√−y3 .

We add these equations, and get

\[ 3x_1 + x_2 + x_3 + x_4 = 2x_1 = \pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3}
\quad\Rightarrow\quad x_1 = \frac{\pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3}}{2}. \]

The formulas for x2 , x3 , and x4 follow analogously, and are of the same type as that for
x1 .


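By variation of the signs we get eight numbers ±x1, ±x2, ±x3, and ±x4. Four of them
are the solutions of the equation

y⁴ + py² + qy + r = 0.

We find the correct ones by substituting into the equation. Since
(x1 + x2)(x1 + x3)(x1 + x4) = −q, the square roots must be chosen such that
√−y1 ⋅ √−y2 ⋅ √−y3 = −q. With this choice, the solutions are as follows:

\[ x_1 = \tfrac{1}{2}\left(\sqrt{-y_1} + \sqrt{-y_2} + \sqrt{-y_3}\right), \]
\[ x_2 = \tfrac{1}{2}\left(\sqrt{-y_1} - \sqrt{-y_2} - \sqrt{-y_3}\right), \]
\[ x_3 = \tfrac{1}{2}\left(-\sqrt{-y_1} + \sqrt{-y_2} - \sqrt{-y_3}\right), \]
\[ x_4 = \tfrac{1}{2}\left(-\sqrt{-y_1} - \sqrt{-y_2} + \sqrt{-y_3}\right). \]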
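The resolvent construction can be checked on a concrete example. The sketch below is illustrative only: the test quartic with zeros 1, 2, 3, −6 (which sum to 0) is a hypothetical choice, and the function name `quartic_zeros_from_resolvent` is not from the text. It first verifies f = −2p, g = p² − 4r, h = q² with exact integers, and then recovers the zeros from y1, y2, y3, fixing the sign of the third square root so that the product of the three square roots is −q.

```python
import cmath
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """k-th elementary symmetric polynomial of the given values."""
    return sum(prod(c) for c in combinations(values, k))

# Hypothetical test quartic y^4 + p y^2 + q y + r with known zeros summing to 0.
zeros = [1, 2, 3, -6]
p = elementary_symmetric(zeros, 2)
q = -elementary_symmetric(zeros, 3)
r = elementary_symmetric(zeros, 4)

x1, x2, x3, x4 = zeros
y = [(x1 + x2) * (x3 + x4), (x1 + x3) * (x2 + x4), (x1 + x4) * (x2 + x3)]

# Coefficients of the cubic resolvent, read off from its zeros y1, y2, y3.
f = -elementary_symmetric(y, 1)
g = elementary_symmetric(y, 2)
h = -elementary_symmetric(y, 3)
resolvent_matches = (f, g, h) == (-2 * p, p * p - 4 * r, q * q)

def quartic_zeros_from_resolvent(y1, y2, y3, q):
    """Recover x1..x4 from the resolvent zeros; signs fixed by s1*s2*s3 = -q."""
    s1 = cmath.sqrt(-y1)
    s2 = cmath.sqrt(-y2)
    s3 = -q / (s1 * s2) if s1 * s2 != 0 else cmath.sqrt(-y3)
    return [(s1 + s2 + s3) / 2, (s1 - s2 - s3) / 2,
            (-s1 + s2 - s3) / 2, (-s1 - s2 + s3) / 2]

recovered = quartic_zeros_from_resolvent(*y, q)
print(resolvent_matches, recovered)
```

Choosing all three square roots positive here would give 6, −1, −2, −3, the zeros of the quartic with q replaced by −q, which is why the sign condition matters.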

The following theorem is due to Abel; it shows the insolvability of the general
degree 5 polynomial over the rationals ℚ.

Theorem 17.5.3. Let L be the splitting field of the polynomial f(x) = x⁵ − 2x⁴ + 2 ∈ ℚ[x]
over ℚ. Then Aut(L|ℚ) = S5, the symmetric group on 5 letters. Since S5 is not solvable,
the equation f (x) = 0 is not solvable by radicals.

Proof. The polynomial f (x) is irreducible over ℚ by the Eisenstein criterion. Further-
more, f (x) has five zeros in the complex numbers ℂ by the fundamental theorem of
algebra (see Section 17.7). We claim that f (x) has exactly 3 real zeros and 2 nonreal
zeros, which then necessarily are complex conjugates. In particular, the 5 zeros are
pairwise distinct.
To see the claim, notice first that f (x) has at least 3 real zeros from the intermediate
value theorem. As a real function, f(x) is continuous, and f(−1) = −1 < 0 and f(0) =
2 > 0, so it must have a real zero between −1 and 0. Furthermore, f(3/2) = −17/32 < 0, and
f(2) = 2 > 0. Hence, there must be distinct real zeros between 0 and 3/2, and between
3/2 and 2. Suppose that f(x) has more than 3 real zeros. Then f′(x) = x³(5x − 8) has at
least 3 pairwise distinct real zeros from Rolle’s theorem. But f′(x) clearly has only 2
real zeros, so this is not the case. Therefore, f (x) has exactly 3 real zeros, and hence 2
nonreal zeros that are complex conjugates.
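The sign changes and the Eisenstein condition used in this argument are easy to confirm with exact rational arithmetic. The following sketch is an illustration only; the helper `eisenstein_applies` is a hypothetical name, and it checks the criterion for one given prime rather than searching for a suitable one.

```python
from fractions import Fraction

def f(x):
    return x**5 - 2 * x**4 + 2

def f_prime(x):
    return 5 * x**4 - 8 * x**3      # equals x^3 (5x - 8)

sample_points = [Fraction(-1), Fraction(0), Fraction(3, 2), Fraction(2)]
values = [f(x) for x in sample_points]

# Each sign change between neighbouring sample points yields a real zero.
sign_changes = sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

def eisenstein_applies(coeffs, prime):
    """coeffs = [a_0, ..., a_n]; checks p | a_i for i < n, p ∤ a_n, p² ∤ a_0."""
    *lower, leading = coeffs
    return (all(c % prime == 0 for c in lower)
            and leading % prime != 0
            and lower[0] % prime**2 != 0)

print(values)                                          # values -1, 2, -17/32, 2
print(sign_changes)                                    # 3
print(eisenstein_applies([2, 0, 0, 0, -2, 1], 2))      # True
print(f_prime(Fraction(0)), f_prime(Fraction(8, 5)))   # both 0: the only real zeros of f'
```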
Let L be the splitting field of f (x). The field L lies in ℂ, and the restriction of the
map δ : z ↦ z̄ (complex conjugation) of ℂ to L maps the set of zeros of f(x) onto itself. Therefore, δ is an
automorphism of L. The map δ fixes the 3 real zeros and transposes the 2 nonreal zeros.
From this, we now show that Aut(L|ℚ) = Aut L = G = S5 , the full symmetric group on
5 symbols. Clearly, G ⊂ S5 , since G acts as a permutation group on the 5 zeros of f (x).
Since δ transposes the 2 nonreal zeros, G (as a permutation group) contains at
least one transposition. Since f (x) is irreducible, G acts transitively on the zeros of
f (x). Let x0 be one of the zeros of f (x), and let Gx0 be the stabilizer of x0 . Since G acts


transitively, x0 has five images under G; therefore, the index of the stabilizer must be 5
(see Chapter 10):

5 = [G : Gx0 ],

which—by Lagrange’s theorem—must divide the order of G. Therefore, from the Sylow
theorems, G contains an element of order 5. Hence, G contains a 5-cycle and a trans-
position; therefore, by Theorem 11.4.3, it follows that G = S5 . Since S5 is not solvable,
it follows that f (x) cannot be solved by radicals.
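The last two steps can be illustrated computationally. The sketch below is not a proof: it takes one specific 5-cycle and one specific transposition (hypothetical choices, with the zeros labelled 0 to 4), generates the subgroup they span, and then computes derived subgroups to see that the series never reaches the trivial group.

```python
def compose(p, q):
    """Return p∘q: apply q first, then p (permutations stored as tuples)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, degree=5):
    """Close a set of permutations under multiplication (enough in a finite group)."""
    identity = tuple(range(degree))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(group):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators)

five_cycle = (1, 2, 3, 4, 0)       # i -> i + 1 (mod 5)
transposition = (1, 0, 2, 3, 4)    # swaps 0 and 1

G = generated_subgroup([five_cycle, transposition])
G1 = derived_subgroup(G)           # the alternating group A5
G2 = derived_subgroup(G1)          # A5 again: the series is stationary

print(len(G), len(G1), len(G2))    # 120 60 60
```

Since the derived series stops at a subgroup of order 60, it never reaches {1}, in line with the statement that S5 is not solvable.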

Since Abel’s theorem shows that there exists a degree 5 polynomial that cannot
be solved by radicals, it follows that there can be no formula like Cardano’s formula
in terms of radicals for degree 5.

Corollary 17.5.4. There is no general formula for solving by radicals a fifth degree poly-
nomial over the rationals.

We now show that this result can be further extended to any degree greater than 5.

Theorem 17.5.5. For each n ≥ 5, there exist polynomials f (x) ∈ ℚ[x] of degree n, for
which the equation f (x) = 0 is not solvable by radicals.

Proof. Let f(x) = x^(n−5) (x⁵ − 2x⁴ + 2), and let L be the splitting field of f(x) over ℚ. Then
Aut(L|ℚ) = Aut(L) contains a subgroup that is isomorphic to S5 . It follows that Aut(L)
is not solvable; therefore, the equation f (x) = 0 is not solvable by radicals.

This immediately implies the following:

Corollary 17.5.6. There is no general formula for solving by radicals polynomial equa-
tions over the rationals of degree 5 or greater.

17.6 Constructibility of regular n-gons


In Chapter 6, we considered certain geometric material related to field extensions.
There, using general field extensions, we proved the impossibility of certain geometric
compass and straightedge constructions. In particular, there were four famous insolv-
able (to the Greeks) construction problems. The first is the squaring of the circle. This
problem is, given a circle, to construct using straightedge and compass a square hav-
ing an area equal to that of the given circle. The second is the doubling of the cube.
This problem is, given a cube of given side length, to construct, using a straightedge
and compass, a side of a cube having double the volume of the original cube. The third
problem is the trisection of an angle. This problem is to trisect a given angle using only
a straightedge and compass. The final problem is the construction of a regular n-gon.
This problem asks which regular n-gons can be constructed using only straightedge


and compass. In Chapter 6, we proved the impossibility of the first 3 problems. Here,
we use Galois theory to consider constructible n-gons.
Recall that a Fermat number is a positive integer of the form

\[ F_n = 2^{2^n} + 1, \quad n = 0, 1, 2, 3, \ldots \]

If a particular Fm is prime, it is called a Fermat prime.


Fermat believed that all the numbers in this sequence were primes. In fact, F0 , F1 ,
F2 , F3 , F4 are all prime, but F5 is composite and divisible by 641 (see exercises). It is
still an open question whether or not there are infinitely many Fermat primes. It has
been conjectured that there are only finitely many. On the other hand, if a number of
the form 2^n + 1 is a prime for some positive integer n, then it must be a Fermat prime; that is, n
must be a power of 2.
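These statements about small Fermat numbers can be confirmed directly. The following sketch uses plain trial division, which is adequate only for numbers of this size; the function names are illustrative.

```python
def fermat_number(n):
    """F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(k):
    """Trial division; fine for the small numbers considered here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

first_five = [fermat_number(n) for n in range(5)]
print(first_five)                              # [3, 5, 17, 257, 65537]
print(all(is_prime(F) for F in first_five))    # True
print(fermat_number(5) % 641)                  # 0, so 641 divides F_5

# Exponents n <= 32 for which 2^n + 1 is prime: only powers of 2 appear.
prime_exponents = [n for n in range(1, 33) if is_prime(2 ** n + 1)]
print(prime_exponents)                         # [1, 2, 4, 8, 16]
```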
We first need the following:

Theorem 17.6.1. Let p = 2^n + 1, n = 2^s with s ≥ 0, be a Fermat prime. Then there exists a
chain of fields

ℚ = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Ln = kp ,

where kp is the p-th cyclotomic field such that

[Lj : Lj−1 ] = 2

for j = 1, . . . , n.

Proof. The extension kp|ℚ is a Galois extension, and [kp : ℚ] = p − 1. Furthermore,
Aut(kp) is cyclic of order p − 1 = 2^n. Hence, there is a chain of subgroups

{1} = Un ⊂ Un−1 ⊂ ⋅ ⋅ ⋅ ⊂ U0 = Aut(kp )

with [Uj−1 : Uj ] = 2 for j = 1, . . . , n. From the fundamental theorem of Galois theory, the
fields Lj = Fix(kp , Uj ) with j = 0, . . . , n have the desired properties.

The following corollaries describe completely the constructible n-gons, tying
them to Fermat primes.

Corollary 17.6.2. Consider the numbers 0, 1, that is, a unit line segment or a unit circle.
A regular p-gon with p ≥ 3 prime is constructible from {0, 1} using a straightedge and
compass if and only if p = 2^(2^s) + 1, s ≥ 0, is a Fermat prime.

Proof. From Theorem 6.3.13, we have that if a regular p-gon is constructible with a
straightedge and compass, then p must be a Fermat prime. The sufficiency follows
from Theorem 17.6.1.


We now extend this to general n-gons. Let m, n ∈ ℕ. Assume that we may construct
from {0, 1} a regular n-gon and a regular m-gon. In particular, this means that we may
construct the real numbers cos(2π/n), sin(2π/n), cos(2π/m), and sin(2π/m). If gcd(m, n) = 1,
then we may construct from {0, 1} a regular mn-gon.
To see this, notice that

\[ \cos\left(\frac{2\pi}{n} + \frac{2\pi}{m}\right) = \cos\left(\frac{2(n+m)\pi}{nm}\right) = \cos\left(\frac{2\pi}{n}\right)\cos\left(\frac{2\pi}{m}\right) - \sin\left(\frac{2\pi}{n}\right)\sin\left(\frac{2\pi}{m}\right), \]

and

\[ \sin\left(\frac{2\pi}{n} + \frac{2\pi}{m}\right) = \sin\left(\frac{2(n+m)\pi}{nm}\right) = \sin\left(\frac{2\pi}{n}\right)\cos\left(\frac{2\pi}{m}\right) + \cos\left(\frac{2\pi}{n}\right)\sin\left(\frac{2\pi}{m}\right). \]

Therefore, we may construct from {0, 1} the numbers cos(2π/(mn)) and sin(2π/(mn)), because
gcd(n + m, mn) = 1. Therefore, we may construct from {0, 1} a regular mn-gon.
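The role of the condition gcd(n + m, mn) = 1 can be made concrete: it guarantees that some integer multiple of the angle 2π/n + 2π/m differs from 2π/(mn) by a multiple of 2π. The sketch below (an illustration with the hypothetical helper `angle_multiplier`, requiring Python 3.8 or later for the modular inverse via `pow`) finds such a multiplier and checks it numerically for n = 3, m = 5.

```python
import math

def angle_multiplier(n, m):
    """Return t with t * (n + m) ≡ 1 (mod n*m); requires gcd(n, m) = 1."""
    if math.gcd(n, m) != 1:
        raise ValueError("n and m must be coprime")
    return pow(n + m, -1, n * m)    # modular inverse

n, m = 3, 5
t = angle_multiplier(n, m)                          # here t = 2
combined = 2 * math.pi / n + 2 * math.pi / m        # angle 2(n + m)π / (nm)
target = 2 * math.pi / (n * m)                      # angle of the regular mn-gon

print(t)
print(math.isclose(math.cos(t * combined), math.cos(target), abs_tol=1e-12))
print(math.isclose(math.sin(t * combined), math.sin(target), abs_tol=1e-12))
```

Repeating the constructible angle t times is itself a straightedge and compass construction, which is what turns the addition formulas into a construction of the 15-gon in this example.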
Now let p ≥ 3 be a prime. Then [k_{p²} : ℚ] = p(p − 1), which is not a power of 2.
Therefore, from {0, 1} it is not possible to construct a regular p²-gon. Hence, altogether
we have the following:

Corollary 17.6.3. Consider the numbers 0, 1, that is, a unit line segment or a unit circle.
A regular n-gon with n ∈ ℕ is constructible from {0, 1} using a straightedge and compass
if and only if
(i) n = 2^m, m ≥ 0 or
(ii) n = 2^m p1 p2 ⋅ ⋅ ⋅ pr, m ≥ 0, and the pi are pairwise distinct Fermat primes.

Proof. Certainly we may construct a 2^m-gon. Furthermore, if r, s ∈ ℕ with gcd(r, s) = 1,
and if we can construct a regular rs-gon, then clearly, we may construct a regular r-gon
and a regular s-gon. The statement now follows from Corollary 17.6.2 together with the
two observations preceding this corollary.
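The criterion of Corollary 17.6.3 translates directly into a small test. In the sketch below, the function names are illustrative, and an odd prime p is recognized as a Fermat prime by checking that p − 1 is a power of 2. The comparison with Euler's function φ(n) rests on a standard equivalent formulation (constructible exactly when φ(n) is a power of 2) that is not stated in the text and is included only as a cross-check.

```python
from math import gcd

def is_power_of_two(k):
    return k >= 1 and k & (k - 1) == 0

def is_constructible(n):
    """Criterion of Corollary 17.6.3 for a regular n-gon with n >= 3."""
    while n % 2 == 0:          # strip the factor 2^m
        n //= 2
    p = 3
    while n > 1:
        if n % p == 0:         # p is the smallest remaining factor, hence prime
            n //= p
            if n % p == 0 or not is_power_of_two(p - 1):
                return False   # repeated odd prime, or p is not a Fermat prime
        else:
            p += 2
    return True

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

constructible = [n for n in range(3, 21) if is_constructible(n)]
print(constructible)   # [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]

# Cross-check against the phi(n) formulation (an assumption external to the text).
agrees = all(is_constructible(n) == is_power_of_two(euler_phi(n))
             for n in range(3, 300))
print(agrees)
```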

17.7 The fundamental theorem of algebra


The fundamental theorem of algebra is one of the most important algebraic results.
This says that any nonconstant complex polynomial must have a complex zero. In
the language of field extensions, this says that the field of complex numbers ℂ is alge-
braically closed. There are many distinct and completely different proofs of this result.
In [6], twelve proofs were given covering a wide area of mathematics. In this section,
we use Galois theory to present a proof. Before doing this, we briefly mention some of
the history surrounding this theorem.
The first mention of the fundamental theorem of algebra, in the form that every
polynomial equation of degree n has exactly n zeros, was given by Peter Roth of
Nürnberg in 1608. However, its conjecture is generally credited to Girard, who also stated


the result in 1629. It was then more clearly stated in 1637 by Descartes, who also dis-
tinguished between real and imaginary zeros. The first published proof of the funda-
mental theorem of algebra was then given by D’Alembert in 1746. However, there were
gaps in D’Alembert’s proof, and the first fully accepted proof was that given by Gauss
in 1797 in his Ph. D. thesis. This was published in 1799. Interestingly enough, in re-
viewing Gauss’ original proof, modern scholars tend to agree that there are as many
holes in this proof as in D’Alembert’s proof. Gauss, however, published three other
proofs with no such holes. He published second and third proofs in 1816, while his
final proof, which was essentially another version of the first, was presented in 1849.

Theorem 17.7.1. Each nonconstant polynomial f (x) ∈ ℂ[x], where ℂ is the field of com-
plex numbers, has a zero in ℂ. Therefore, ℂ is an algebraically closed field.

Proof. Let f (x) ∈ ℂ[x] be a nonconstant polynomial, and let K be the splitting field of
f (x) over ℂ. Since the characteristic of the complex numbers ℂ is zero, this will be a
Galois extension of ℂ. Since ℂ is a finite extension of ℝ, this field K would also be a
Galois extension of ℝ. The fundamental theorem of algebra asserts that K must be ℂ
itself, and hence the fundamental theorem of algebra is equivalent to the fact that any
finite Galois extension of ℂ must be ℂ itself.
Let K be any finite extension of ℝ with |K : ℝ| = 2^m q, (2, q) = 1. If m = 0, then K is
an odd-degree extension of ℝ. Since K is separable over ℝ, from the primitive element
theorem, it is a simple extension, and hence K = ℝ(α), where the minimal polynomial
mα (x) over ℝ has odd degree. However, odd-degree real polynomials always have a
real zero, and therefore mα (x) is irreducible only if its degree is one. But then, α ∈ ℝ,
and K = ℝ. Therefore, if K is a nontrivial finite extension of ℝ of degree 2^m q, we must
have m > 0. This shows more generally that there are no odd-degree finite extensions
of ℝ.
Suppose that K is a degree 2 extension of ℂ. Then K = ℂ(α) with deg mα(x) = 2,
where mα(x) is the minimal polynomial of α over ℂ. But by the quadratic formula,
complex quadratic polynomials always have zeros in ℂ, which is a contradiction. Therefore,
ℂ has no degree 2 extensions.
Now, let K be a Galois extension of ℂ. Then K is also Galois over ℝ. Suppose |K :
ℝ| = 2^m q, (2, q) = 1. From the argument above, we must have m > 0. Let G = Gal(K/ℝ)
be the Galois group. Then |G| = 2^m q, m > 0, (2, q) = 1. Thus, G has a 2-Sylow subgroup
of order 2^m and index q (see Theorem 13.3.4). This would correspond to an intermediate
field E with |K : E| = 2^m and |E : ℝ| = q. However, then E is an odd-degree finite
extension of ℝ. It follows that q = 1 and E = ℝ. Therefore, |K : ℝ| = 2^m, and |G| = 2^m.
Now, |K : ℂ| = 2^(m−1); let G1 = Gal(K/ℂ). This is a 2-group. If it were not
trivial, then from Theorem 13.4.1 there would exist a subgroup of order 2^(m−2) and
index 2. This would correspond to an intermediate field E of degree 2 over ℂ. However,
from the argument above, ℂ has no degree 2 extensions. It follows then that G1 is
trivial; that is, |G1| = 1, so |K : ℂ| = 1, and K = ℂ, completing the proof.


The fact that ℂ is algebraically closed limits the possible algebraic extensions of
the reals.

Corollary 17.7.2. Let K be a finite field extension of the real numbers ℝ. Then K = ℝ or
K = ℂ.

Proof. Since |K : ℝ| < ∞, by the primitive element theorem, K = ℝ(α) for some α ∈ K.
Then the minimal polynomial mα(x) of α over ℝ is in ℝ[x], and hence in ℂ[x].
Therefore, from the fundamental theorem of algebra it has a zero in ℂ. Hence, α ∈ ℂ. If
α ∈ ℝ, then K = ℝ; if not, then K = ℂ.

17.8 Exercises
1. For f(x) ∈ ℚ[x] with

f(x) = x⁶ − 12x⁴ + 36x² − 50

(f(x) = 4x⁴ − 12x² + 20x − 3)

determine for each complex zero α of f(x) a finite number of radicals $\gamma_i = \beta_i^{1/m_i}$,
i = 1, . . . , r, and a presentation of α as a rational function in γ1, . . . , γr over ℚ such
that γi+1 is irreducible over ℚ(γ1, . . . , γi), and βi+1 ∈ ℚ(γ1, . . . , γi) for i = 0, . . . , r − 1.
2. Let K be a field of prime characteristic p. Let n ∈ ℕ and Kn the splitting field of
x^n − 1 over K. Show that Aut(Kn|K) is cyclic.
3. Let f(x) = x⁴ − x + 1 ∈ ℤ[x]. Show the following:
(i) f has a real zero.
(ii) f is irreducible over ℚ.
(iii) If u + iv (u, v ∈ ℝ) is a zero of f in ℂ, then g = x³ − 4x − 1 is the minimal
polynomial of 4u² over ℚ.
(iv) The Galois group of f over ℚ has an element of order 3.
(v) No zero a ∈ ℂ of f is constructible from the points 0 and 1 with straightedge
and compass.
4. Show that each polynomial f(x) over ℝ decomposes in linear factors and quadratic
factors (f(x) = d(x − a1) ⋅ (x − a2) ⋅ ⋅ ⋅ (x² + b1x + c1) ⋅ (x² + b2x + c2) ⋅ ⋅ ⋅, d ∈ ℝ).
5. Let E be a finite (commutative) field extension of ℝ. Then E ≅ ℝ, or E ≅ ℂ.
6. (Vieta) Show that y³ − py = q reduces to the form 4z³ − 3z = c by a suitable
substitution y = mz.
7. Suppose that |a + id| = |c + id| and |a + ib|3 = c + id. Show that the relation between
a and c is 4a³ − 3a = c.
8. Show the identity of Bombelli:

\[ \sqrt[3]{2 \pm \sqrt{-121}} = 2 \pm \sqrt{-1}, \]

and apply it to the equation x³ = 15x + 4.


9. Solve the following equations:


(a) x³ − 2x + 3 = 0;
(b) x⁴ + 2x³ + 3x² − x − 2 = 0.
10. Let n ≥ 1 be a natural number and x an indeterminate over ℂ. Consider the
polynomial x^n − 1 ∈ ℤ[x]. In ℂ[x] it decomposes into linear factors:

x^n − 1 = (x − ξ1)(x − ξ2) ⋅ ⋅ ⋅ (x − ξn),

where the complex numbers

\[ \xi_\nu = e^{2\pi i \nu/n} = \cos\frac{2\pi\nu}{n} + i \cdot \sin\frac{2\pi\nu}{n}, \quad 1 \le \nu \le n, \]

are all the (distinct) n-th roots of unity; in particular, ξn = 1. These ξν form a
multiplicative cyclic group G = {ξ1, ξ2, . . . , ξn} generated by ξ1, and ξν = ξ1^ν.
An n-th root of unity ξν is called a primitive n-th root of unity, if ξν is not an m-th
root of unity for any m < n.
Show that the following are equivalent:
(i) ξν is a primitive n-th root of unity.
(ii) ξν is a generating element of G.
(iii) gcd(ν, n) = 1.
11. The polynomial ϕn(x) ∈ ℂ[x], whose zeros are exactly the primitive n-th roots of
unity, is called the n-th cyclotomic polynomial. By Exercise 10, we have

\[ \phi_n(x) = \prod_{\substack{1 \le \nu \le n \\ \gcd(\nu, n) = 1}} (x - \xi_\nu) = \prod_{\substack{1 \le \nu \le n \\ \gcd(\nu, n) = 1}} \left(x - e^{2\pi i \nu/n}\right). \]

The degree of ϕn(x) is the number of integers in {1, . . . , n} that are coprime to n.
Show the following:
(i) $x^n - 1 = \prod_{d \ge 1,\ d \mid n} \phi_d(x)$.
(ii) ϕn (x) ∈ ℤ[x] for all n ≥ 1.
(iii) ϕn (x) is irreducible over ℚ (and therefore also over ℤ) for all n ≥ 1.
12. Show that the Fermat numbers F0 , F1 , F2 , F3 , F4 are all prime but F5 is composite
and divisible by 641.

18 The theory of modules
18.1 Modules over rings
Recall that a vector space V over a field K is an abelian group V with a scalar multipli-
cation ⋅ : K × V → V, satisfying the following:
(1) f (v1 + v2 ) = fv1 + fv2 for f ∈ K and v1 , v2 ∈ V.
(2) (f1 + f2 )v = f1 v + f2 v for f1 , f2 ∈ K and v ∈ V.
(3) (f1 f2 )v = f1 (f2 v) for f1 , f2 ∈ K and v ∈ V.
(4) 1v = v for v ∈ V.

Vector spaces are the fundamental algebraic structures in linear algebra, and the study
of linear equations. Vector spaces have been crucial in our study of fields and Galois
theory, since any field extension is a vector space over any subfield. In this context,
the degree of a field extension is just the dimension of the extension field as a vector
space over the base field.
If we modify the definition of a vector space to allow scalar multiplication from an
arbitrary ring, we obtain a more general structure called a module. We will formally
define this below. Modules generalize vector spaces, but the fact that the scalars do
not necessarily have inverses makes the study of modules much more complicated.
Modules will play an important role in both the study of rings and the study of abelian
groups. In fact, any abelian group is a module over the integers ℤ so that modules, be-
sides being generalizations of vector spaces, can also be considered as generalizations
of abelian groups.
In this chapter, we will introduce the theory of modules. In particular, we will
extend to modules the basic algebraic properties such as the isomorphism theorems,
which have been introduced earlier in presenting groups, rings, and fields.
In this chapter, we restrict ourselves to commutative rings, so that throughout R
is always a commutative ring. If R has an identity 1, then we always consider only the
case that 1 ≠ 0. Throughout this chapter, we use letters a, b, c, m, . . . for ideals in R. For
principal ideals, we write ⟨a⟩ or aR for the ideal generated by a ∈ R. We note, however,
that the definition can be extended to include modules over noncommutative rings
(see Chapter 22). In this case, we would speak of left modules and right modules.

Definition 18.1.1. Let R = (R, +, ⋅) be a commutative ring and M = (M, +) an abelian group.
M together with a scalar multiplication ⋅ : R × M → M, (α, x) ↦ αx, is called an R-module
or module over R if the following axioms hold:
(M1) (α + β)x = αx + βx,
(M2) α(x + y) = αx + αy, and
(M3) (αβ)x = α(βx) for all α, β ∈ R and x, y ∈ M.

If R has an identity 1, then M is called a unitary R-module, if in addition

https://doi.org/10.1515/9783110603996-018


(M4) 1 ⋅ x = x for all x ∈ M holds.

In the following, R always is a commutative ring. If R contains an identity 1, then
M always is a unitary R-module. If R has an identity 1, then we always assume 1 ≠ 0.
As usual, we have the rules:

0 ⋅ x = 0, α ⋅ 0 = 0, −(αx) = (−α)x = α(−x),

for all α ∈ R and for all x ∈ M.


We next present a series of examples of modules.

Example 18.1.2.
(1) If R = K is a field, then a K-module is a K-vector space.
(2) Let G = (G, +) be an abelian group. If n ∈ ℤ and x ∈ G, then nx is defined as usual:

0 ⋅ x = 0,
nx = x + ⋅ ⋅ ⋅ + x (n times) if n > 0, and
nx = (−n)(−x) if n < 0.

Then G is a unitary ℤ-module via the scalar multiplication

⋅ : ℤ × G → G, (n, x) ↦ nx.

(3) Let S be a subring of R. Then, via (s, r) ↦ sr, the ring R itself becomes an S-module.
(4) Let K be a field, V a K-vector space, and f : V → V a linear map of V. Let p =
∑i αi t^i ∈ K[t]. Then p(f) := ∑i αi f^i defines a linear map of V, and V is a unitary
K[t]-module via the scalar multiplication

K[t] × V → V, (p, v) ↦ pv := p(f)(v).

(5) If R is a commutative ring and a is an ideal in R, then a is a module over R.
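Example (4) can be made concrete with a small computation. The sketch below is illustrative only: it takes V = ℚ² and, as a hypothetical choice of f, the linear map that swaps the two coordinates; polynomials are stored as coefficient lists [α0, α1, . . .], and the names `act` and `poly_mul` are not from the text. It evaluates the scalar multiplication pv := p(f)(v) and checks axiom (M3) on one instance.

```python
from fractions import Fraction

def apply_matrix(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def act(poly, matrix, vector):
    """Compute p·v := p(f)(v) for poly = [α_0, α_1, ...] and f given by matrix."""
    result = [Fraction(0)] * len(vector)
    power = list(vector)                          # f^0(v) = v
    for coeff in poly:
        result = [r + coeff * x for r, x in zip(result, power)]
        power = apply_matrix(matrix, power)       # move on to f^(i+1)(v)
    return result

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    product = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

swap = [[0, 1], [1, 0]]                           # f swaps the coordinates, so f∘f = id
v = [Fraction(2), Fraction(5)]
t = [Fraction(0), Fraction(1)]                    # the polynomial t
p = [Fraction(-1), Fraction(0), Fraction(1)]      # t^2 - 1
p2 = [Fraction(1), Fraction(2)]                   # 1 + 2t
q = [Fraction(3), Fraction(1)]                    # 3 + t

print(act(t, swap, v))                            # t·v = f(v), here the swapped vector
print(act(p, swap, v))                            # (t^2 - 1)·v is the zero vector
print(act(poly_mul(p2, q), swap, v) == act(p2, swap, act(q, swap, v)))   # axiom (M3)
```

The second line of output shows a phenomenon without analogue in vector spaces: the nonzero scalar t² − 1 sends every vector of V to 0.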

Basic to all algebraic theory is the concept of substructures. Next we define sub-
modules.

Definition 18.1.3. Let M be an R-module. ∅ ≠ U ⊂ M is called a submodule of M if


(UMI) (U, +) < (M, +) and
(UMII) α ∈ R, u ∈ U ⇒ αu ∈ U; that is, RU ⊂ U.

Example 18.1.4.
(1) In an abelian group G, considered as a ℤ-module, the subgroups are precisely the
submodules.
(2) The submodules of R, considered as an R-module, are precisely the ideals.


(3) Rx := {αx : α ∈ R} is a submodule of M for each x ∈ M.


(4) Let K be a field, V a K-vector space, and f : V → V a linear map of V. Let U be a
submodule of V, considered as a K[t]-module as above. Then the following holds:
(a) U < V.
(b) pU = p(f )U ⊂ U for all p ∈ K[t]. In particular, αU ⊂ U for p = α ∈ K and
tU = f (U) ⊂ U for p = t; that is, U is an f -invariant subspace.
Conversely, p(f)U ⊂ U for all p ∈ K[t] if U is an f-invariant
subspace.

We next extend to modules the concept of a generating system. For a single gen-
erator, as with groups, this is called cyclic.

Definition 18.1.5. A submodule U of the R-module M is called cyclic if there exists an
x ∈ M with U = Rx.

Example 18.1.4 (3) above is an example of a cyclic submodule.


As in vector spaces, groups, and rings, the following constructions are standard
leading us to generating systems.
(1) Let M be an R-module and {Ui : i ∈ I} a family of submodules. Then ⋂i∈I Ui is a
submodule of M.
(2) Let M be an R-module. If A ⊂ M, then we define

⟨A⟩ := ⋂{U : U submodule of M with A ⊂ U}.

⟨A⟩ is the smallest submodule of M that contains A. If R has an identity 1, then
⟨A⟩ is the set of all finite linear combinations ∑i αi ai with all αi ∈ R, all ai ∈ A. This holds
because M is unitary, and na = n(1 ⋅ a) = (n ⋅ 1)a for n ∈ ℤ and a ∈ A; that is, we
may consider the pseudoproduct na as a real product in the module. In particular, if
R has an identity 1, then aR = ⟨{a}⟩ =: ⟨a⟩.

Definition 18.1.6. Let R have an identity 1. If M = ⟨A⟩, then A is called a generating
system of M. M is called finitely generated if there are a1, . . . , an ∈ M with
M = ⟨{a1, . . . , an}⟩ =: ⟨a1, . . . , an⟩.

The following is clear:

Lemma 18.1.7. Let Ui be submodules of M, i ∈ I, I an index set. Then

\[ \Big\langle \bigcup_{i \in I} U_i \Big\rangle = \Big\{ \sum_{i \in L} a_i : a_i \in U_i,\ L \subset I \text{ finite} \Big\}. \]

We write ⟨⋃i∈I Ui⟩ =: ∑i∈I Ui and call this submodule the sum of the Ui. A sum
∑i∈I Ui is called a direct sum if for each representation of 0, as 0 = ∑ ai, ai ∈ Ui, it
follows that all ai = 0. This is equivalent to Ui ∩ ∑j≠i Uj = 0 for all i ∈ I. Notation:
⨁i∈I Ui; and if I = {1, . . . , n}, then we also write U1 ⊕ ⋅ ⋅ ⋅ ⊕ Un.
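A small finite example may help to see both conditions at work. The sketch below is only an illustration: it uses the ℤ-modules ℤ/6ℤ and ℤ/12ℤ (hypothetical choices, with residues represented by integers) and the cyclic submodules generated by 2 and 3, and it counts the representations of 0 as a sum of one element from each submodule.

```python
def cyclic_submodule(x, n):
    """The submodule ℤx of the ℤ-module ℤ/nℤ."""
    return {(k * x) % n for k in range(n)}

def submodule_sum(u, w, n):
    """The sum U + W inside ℤ/nℤ."""
    return {(a + b) % n for a in u for b in w}

def representations_of_zero(u, w, n):
    """All pairs (a, b) with a in U, b in W and a + b = 0 in ℤ/nℤ."""
    return [(a, b) for a in sorted(u) for b in sorted(w) if (a + b) % n == 0]

# In ℤ/6ℤ the sum of ⟨2⟩ and ⟨3⟩ is direct and equals the whole module.
U1, U2 = cyclic_submodule(2, 6), cyclic_submodule(3, 6)
print(U1 & U2, submodule_sum(U1, U2, 6) == set(range(6)))
print(representations_of_zero(U1, U2, 6))      # only the trivial one: [(0, 0)]

# In ℤ/12ℤ the sum of ⟨2⟩ and ⟨3⟩ is not direct: 6 lies in both submodules.
V1, V2 = cyclic_submodule(2, 12), cyclic_submodule(3, 12)
print(V1 & V2)
print(representations_of_zero(V1, V2, 12))     # [(0, 0), (6, 6)]
```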


In analogy with our previously defined algebraic structures, we extend to modules
the concepts of quotient modules and module homomorphisms.

Definition 18.1.8. Let U be a submodule of the R-module M. Let M/U be the factor
group. We define a (well-defined) scalar multiplication:

R × M/U → M/U, α(x + U) := αx + U.

With this M/U is an R-module, the factor module or quotient module of M by U. In M/U,
we have the operations

(x + U) + (y + U) = (x + y) + U,

and

α(x + U) = αx + U.

A module M over a ring R can also be considered as a module over a quotient ring
of R. The following is straightforward to verify (see exercises):

Lemma 18.1.9. Let a ⊲ R be an ideal in R and M an R-module. The set of all finite sums of
the form ∑ αi xi, αi ∈ a, xi ∈ M, is a submodule of M, which we denote by aM. The factor
group M/aM becomes an R/a-module via the well-defined scalar multiplication

(α + a)(m + aM) = αm + aM.

If here R has an identity 1 and a is a maximal ideal, then M/aM becomes a vector space
over the field K = R/a.

We next define module homomorphisms:

Definition 18.1.10. Let R be a ring and M, N be R-modules. A map f : M → N is called
an R-module homomorphism (or R-linear) if

f (x + y) = f (x) + f (y)

and

f (αx) = αf (x)

for all α ∈ R and all x, y ∈ M. Endo-, epi-, mono-, iso- and automorphisms are defined
analogously via the corresponding properties of the maps. If f : M → N and g : N → P
are module homomorphisms, then g ∘ f : M → P is also a module homomorphism. If
f : M → N is an isomorphism, then so is f^{−1} : N → M.

We define kernel and image in the usual way:

ker(f ) := {x ∈ M : f (x) = 0},

and

im(f ) := f (M) = {f (x) : x ∈ M}.

The set ker(f ) is a submodule of M, and im(f ) is a submodule of N. As usual,

f is injective ⇐⇒ ker(f ) = {0}.

If U is a submodule of M, then the map x ↦ x + U defines a module epimorphism (the canonical epimorphism) from M onto M/U with kernel U.
There are module isomorphism theorems. The proofs are straightforward extensions of the corresponding proofs for groups and rings.

Theorem 18.1.11 (Module isomorphism theorems). Let M, N be R-modules.


(1) If f : M → N is a module homomorphism, then

f (M) ≅ M/ ker(f ).

(2) If U, V are submodules of the R-module M, then

U/(U ∩ V) ≅ (U + V)/V.

(3) If U and V are submodules of the R-module M with U ⊂ V ⊂ M, then

(M/U)/(V/U) ≅ M/V.

For the proof of (2), as for groups, just consider the map f : U + V → U/(U ∩ V),
u + v ↦ u + (U ∩ V), which is well defined because u + v = u′ + v′ implies u − u′ = v′ − v ∈ U ∩ V; then
we have ker(f) = V.
Note that α ↦ αρ, ρ ∈ R fixed, defines a module homomorphism R → R if we
consider R itself as an R-module.
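The isomorphism theorems can be checked by counting in a small finite example. The following is only a sketch, under the assumption that M = ℤ/12ℤ is viewed as a ℤ-module; the helper names (`kernel`, `image`, `cyclic`) are chosen here for illustration. Comparing cardinalities is of course weaker than exhibiting an isomorphism, but it shows what statements (1) and (2) predict.

```python
n = 12
M = set(range(n))  # the Z-module Z/12Z, elements written as residues


def f(x):
    """The module homomorphism M -> M, x -> 3x."""
    return (3 * x) % n


def kernel(h, module):
    return {x for x in module if h(x) == 0}


def image(h, module):
    return {h(x) for x in module}


def cyclic(a):
    """The cyclic submodule Z*a of Z/12Z."""
    return {(k * a) % n for k in range(n)}


# (1): f(M) is isomorphic to M/ker(f), so |f(M)| = |M| / |ker(f)|.
ker_f, im_f = kernel(f, M), image(f, M)

# (2): U/(U ∩ V) is isomorphic to (U + V)/V.
U, V = cyclic(2), cyclic(3)
U_plus_V = {(u + v) % n for u in U for v in V}

print(len(im_f), len(M) // len(ker_f))
print(len(U) // len(U & V), len(U_plus_V) // len(V))
```

Both printed pairs consist of equal numbers, in agreement with the theorem.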

18.2 Annihilators and torsion


In this section, we define torsion for an R-module and a very important ideal of R
called the annihilator.

Definition 18.2.1. Let M be an R-module. For a fixed a ∈ M, consider the map λ_a : R → M, λ_a(α) := αa. λ_a is a module homomorphism, considering R as an R-module.
We call ker(λ_a) the annihilator of a, denoted by Ann(a); that is,

Ann(a) = {α ∈ R : αa = 0}.

Lemma 18.2.2. Ann(a) is a submodule of R, and the module isomorphism theorem (1)
gives R/ Ann(a) ≅ Ra.
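
For a concrete check of Lemma 18.2.2, one may take R = ℤ and M = ℤ/nℤ. The sketch below assumes this setting; the function names are hypothetical. In this case Ann(a) = dℤ with d = n/gcd(a, n), so that R/Ann(a) ≅ ℤ/dℤ should have exactly as many elements as the cyclic submodule Ra.

```python
from math import gcd


def annihilator_generator(a: int, n: int) -> int:
    """Nonnegative generator d of Ann(a) = dZ for a in the Z-module Z/nZ."""
    return n // gcd(a, n)


def cyclic_submodule(a: int, n: int) -> set:
    """The submodule Z*a of Z/nZ, listed as residues."""
    return {(k * a) % n for k in range(n)}


n, a = 12, 8
d = annihilator_generator(a, n)

# d annihilates a, and no smaller positive integer does.
assert (d * a) % n == 0
assert all((k * a) % n != 0 for k in range(1, d))

# R/Ann(a) = Z/dZ has d elements, and so does Ra.
assert len(cyclic_submodule(a, n)) == d
```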

We next extend the annihilator to whole submodules of M:

Definition 18.2.3. Let U be a submodule of the R-module M. The annihilator Ann(U) is defined to be

Ann(U) := {α ∈ R : αu = 0 for all u ∈ U}.

As for single elements, Ann(U) is a submodule of R, since Ann(U) = ⋂_{u∈U} Ann(u).
If ρ ∈ R, u ∈ U, then ρu ∈ U; that means, if α ∈ Ann(U), then also αρ ∈ Ann(U),
because (αρ)u = α(ρu) = 0. Hence, Ann(U) is an ideal in R.
Suppose that G is an abelian group. Then as aforementioned, G is a ℤ-module. An
element g ∈ G is a torsion element, or has finite order if ng = 0 for some n ∈ ℕ. The
set Tor(G) consists of all the torsion elements in G. An abelian group is torsion-free if
Tor(G) = {0}.

Lemma 18.2.4. Let G be an abelian group. Then Tor(G) is a subgroup of G, and G/Tor(G) is torsion-free.

We extend this concept now to general modules:

Definition 18.2.5. The R-module M is called faithful if Ann(M) = {0}. An element a ∈ M is called a torsion element, or element of finite order, if Ann(a) ≠ {0}. A module
without torsion elements ≠ 0 is called torsion-free. If an R-module M ≠ {0} is torsion-free,
then R has no zero divisors ≠ 0.

Theorem 18.2.6. Let R be an integral domain and M an R-module (by our agreement
M is unitary). Let Tor(M) = T(M) be the set of torsion elements of M. Then Tor(M) is a
submodule of M, and M/ Tor(M) is torsion-free.

Proof. If m ∈ Tor(M), α ∈ Ann(m), α ≠ 0, and β ∈ R, then we get α(βm) = (αβ)m = (βα)m = β(αm) = 0; that is, βm ∈ Tor(M), because α ≠ 0 annihilates βm. Let m′ be another element of Tor(M) and 0 ≠ α′ ∈ Ann(m′). Then αα′ ≠ 0 (R is an integral domain), and
αα′(m + m′) = αα′m + αα′m′ = α′(αm) + α(α′m′) = 0; that is, m + m′ ∈ Tor(M). Therefore,
Tor(M) is a submodule.
Now, let m + Tor(M) be a torsion element in M/ Tor(M). Let α ∈ R, α ≠ 0 with
α(m + Tor(M)) = αm + Tor(M) = Tor(M). Then αm ∈ Tor(M). Hence, there exists a
β ∈ R, β ≠ 0, with 0 = β(αm) = (βα)m. Since βα ≠ 0, we get that m ∈ Tor(M), and the
torsion element m + Tor(M) is trivial.

18.3 Direct products and direct sums of modules


Let M_i, i ∈ I ≠ ∅, be a family of R-modules. On the direct product

P = ∏_{i∈I} M_i = {f : I → ⋃_{i∈I} M_i : f(i) ∈ M_i for all i ∈ I},

we define the module operations

+ : P × P → P and ⋅ : R × P → P

via

(f + g)(i) := f (i) + g(i) and (αf )(i) := αf (i).

Together with these operations, P = ∏_{i∈I} M_i is an R-module, the direct product of the M_i. If we identify f with the I-tuple of its images, f = (f_i)_{i∈I}, then the sum and the
scalar multiplication are componentwise. If I = {1, . . . , n} and M_i = M for all i ∈ I, then
we write, as usual, M^n = ∏_{i∈I} M_i.
We make the agreement that ∏_{i∈∅} M_i := {0}.
The set ⨁_{i∈I} M_i := {f ∈ ∏_{i∈I} M_i : f(i) = 0 for almost all i} (“for almost all i” means that
there are at most finitely many i with f(i) ≠ 0) is a submodule of the direct product,
called the direct sum of the M_i. If I = {1, . . . , n}, then we write ⨁_{i∈I} M_i = M_1 ⊕ ⋅ ⋅ ⋅ ⊕ M_n.
In particular, ∏_{i=1}^n M_i = ⨁_{i=1}^n M_i for finite I.

Theorem 18.3.1.
(1) If π ∈ S_I is a permutation of I, then

∏_{i∈I} M_i ≅ ∏_{i∈I} M_{π(i)},

and

⨁_{i∈I} M_i ≅ ⨁_{i∈I} M_{π(i)}.

(2) If I = ⋃̇_{j∈J} I_j, the disjoint union, then

∏_{i∈I} M_i ≅ ∏_{j∈J} (∏_{i∈I_j} M_i),

and

⨁_{i∈I} M_i ≅ ⨁_{j∈J} (⨁_{i∈I_j} M_i).

Proof. (1) Consider the map f ↦ f ∘ π.
(2) Consider the map f ↦ (f_j)_{j∈J}, where f_j ∈ ∏_{i∈I_j} M_i is the restriction of f to I_j.

Let I ≠ ∅. If M = ∏_{i∈I} M_i, then we get in a natural manner module homomorphisms π_i : M → M_i via f ↦ f(i); π_i is called the projection onto the ith component. In duality,
we define module homomorphisms δ_i : M_i → ⨁_{j∈I} M_j ⊂ ∏_{j∈I} M_j via δ_i(m_i) = (n_j)_{j∈I},
where n_j = 0 if i ≠ j and n_i = m_i. δ_i is called the ith canonical injection. If I = {1, . . . , n},
then π_i(a_1, . . . , a_i, . . . , a_n) = a_i, and δ_i(m_i) = (0, . . . , 0, m_i, 0, . . . , 0).

Theorem 18.3.2 (Universal properties). Let A, M_i, i ∈ I ≠ ∅, be R-modules.
(1) If ϕ_i : A → M_i, i ∈ I, are module homomorphisms, then there exists exactly one
module homomorphism ϕ : A → ∏_{i∈I} M_i such that, for each j ∈ I, the triangle formed by ϕ, π_j, and ϕ_j
commutes; that is, ϕ_j = π_j ∘ ϕ, where π_j is the jth projection.
(2) If Ψ_i : M_i → A, i ∈ I, are module homomorphisms, then there exists exactly one
module homomorphism Ψ : ⨁_{i∈I} M_i → A such that, for each j ∈ I, the triangle formed by δ_j, Ψ, and Ψ_j
commutes; that is, Ψ_j = Ψ ∘ δ_j, where δ_j is the jth canonical injection.

Proof. (1) If there is such a ϕ, then the jth component of ϕ(a) is equal to ϕ_j(a), because
π_j ∘ ϕ = ϕ_j. Hence, define ϕ(a) ∈ ∏_{i∈I} M_i via ϕ(a)(i) := ϕ_i(a), and ϕ is the desired map.
(2) If there is such a Ψ with Ψ ∘ δ_j = Ψ_j, then Ψ(x) = Ψ((x_i)) = Ψ(∑_{i∈I} δ_i(x_i)) =
∑_{i∈I} Ψ ∘ δ_i(x_i) = ∑_{i∈I} Ψ_i(x_i). Hence, define Ψ((x_i)) = ∑_{i∈I} Ψ_i(x_i), and Ψ is the desired
map (recall that the sum is well defined, since only finitely many x_i are nonzero).
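
The two constructions in this proof can be written down almost literally. The following sketch assumes that I-tuples are stored as Python dictionaries indexed by I, with zero components of a direct-sum element simply omitted; `product_map` and `direct_sum_map` are illustrative names, not notation from the text.

```python
def product_map(phis):
    """Given phi_i : A -> M_i, return phi : A -> prod M_i with phi(a)(i) = phi_i(a)."""
    def phi(a):
        return {i: phi_i(a) for i, phi_i in phis.items()}
    return phi


def direct_sum_map(psis, add, zero):
    """Given Psi_i : M_i -> A, return Psi with Psi((x_i)) = sum of Psi_i(x_i).

    `add` and `zero` describe the addition of the target module A.
    """
    def psi(x):
        total = zero
        for i, x_i in x.items():  # finitely many nonzero components
            total = add(total, psis[i](x_i))
        return total
    return psi


# (1) A = Z, M_1 = Z/2Z, M_2 = Z/3Z, phi_i = reduction modulo 2 and 3.
phis = {1: lambda a: a % 2, 2: lambda a: a % 3}
phi = product_map(phis)
assert all(phi(5)[j] == phis[j](5) for j in phis)  # pi_j ∘ phi = phi_j

# (2) M_1 = Z/2Z, M_2 = Z/3Z, A = Z/6Z, Psi_1(x) = 3x, Psi_2(x) = 2x.
psis = {1: lambda x: (3 * x) % 6, 2: lambda x: (2 * x) % 6}
psi = direct_sum_map(psis, add=lambda u, v: (u + v) % 6, zero=0)
assert psi({1: 1}) == psis[1](1)  # Psi ∘ delta_1 = Psi_1
assert psi({2: 2}) == psis[2](2)  # Psi ∘ delta_2 = Psi_2
```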

18.4 Free modules


If V is a vector space over a field K, then V always has a basis over K, which may
be infinite. Despite the similarity to vector spaces, because the scalars may not have
inverses, this is not necessarily true for modules.
We now define a basis for a module. Those modules that actually have a basis are
called free modules.
Let R be a ring with identity 1, M be a unitary R-module, and S ⊂ M. Each finite
sum ∑ α_i s_i, with the α_i ∈ R and the s_i ∈ S, is called a linear combination in S. Since M is
unitary, if S ≠ ∅, then ⟨S⟩ is exactly the set of all linear combinations in S. In the
following, we assume that S ≠ ∅. If S = ∅, then ⟨S⟩ = ⟨∅⟩ = {0}, and this case is not
interesting. By convention, in the following, we always assume m_i ≠ m_j if i ≠ j in a
finite sum ∑ α_i m_i with all α_i ∈ R and all m_i ∈ M.

Definition 18.4.1. A finite set {m_1, . . . , m_n} ⊂ M is called linearly independent or free (over
R) if a representation 0 = ∑_{i=1}^n α_i m_i always implies α_i = 0 for all i ∈ {1, . . . , n}; that is, 0
can be represented only trivially on {m_1, . . . , m_n}. A nonempty subset S ⊂ M is called
free (over R) if each finite subset of S is free.

Definition 18.4.2. Let M be an R-module (as above).
(1) S ⊂ M is called a basis of M if
(a) M = ⟨S⟩, and
(b) S is free (over R).
(2) If M has a basis, then M is called a free R-module. If S is a basis of M, then M is
called free on S, or free with basis S.

In this sense, we can consider {0} as a free module with basis ∅.

Example 18.4.3.
1. R × R = R^2, as an R-module, is free with basis {(1, 0), (0, 1)}.
2. More generally, let I ≠ ∅. Then ⨁_{i∈I} R_i with R_i = R for all i ∈ I is free with basis
{ϵ_i : I → R : ϵ_i(j) = δ_ij, i, j ∈ I}, where δ_ij = 0 if i ≠ j, and δ_ij = 1 if i = j.
In particular, if I = {1, . . . , n}, then R^n = {(a_1, . . . , a_n) : a_i ∈ R} is free with basis
{ϵ_i = (0, . . . , 0, 1, 0, . . . , 0) : 1 ≤ i ≤ n}, where the entry 1 is preceded by i − 1 zeros.
3. Let G be an abelian group. If G, as a ℤ-module, is free on S ⊂ G, then G is called a
free abelian group with basis S. If |S| = n < ∞, then G ≅ ℤ^n.
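
The difference from vector spaces becomes visible already for R = ℤ. The sketch below rests on two standard facts that are not proved in this section: two vectors of ℤ^2 are linearly independent over ℤ exactly when their determinant is nonzero, and they form a basis of ℤ^2 exactly when the determinant is ±1. The function names are chosen for illustration only.

```python
def det2(v, w):
    return v[0] * w[1] - v[1] * w[0]


def is_independent_in_Z2(v, w):
    """Linear independence over Z of two vectors in Z^2."""
    return det2(v, w) != 0


def is_basis_of_Z2(v, w):
    """Two vectors form a basis of Z^2 iff their determinant is a unit of Z."""
    return det2(v, w) in (1, -1)


def has_free_element(n):
    """Is there an x in Z/nZ such that {x} is free over Z?

    Since n * x = 0 for every x, a nontrivial representation of 0 always
    exists; it suffices to test the coefficients 1, ..., n.
    """
    return any(
        all((alpha * x) % n != 0 for alpha in range(1, n + 1))
        for x in range(n)
    )


print(is_basis_of_Z2((1, 0), (1, 1)))        # a basis different from the standard one
print(is_independent_in_Z2((2, 0), (0, 1)))  # independent ...
print(is_basis_of_Z2((2, 0), (0, 1)))        # ... but (1, 0) is not in the span
print(has_free_element(6))                   # Z/6Z has no nonempty free subset
```

In particular, a linearly independent set of the "right" size need not be a basis, and a nonzero finite abelian group is never free as a ℤ-module.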

Theorem 18.4.4. The R-module M is free on S if and only if each m ∈ M can be written
uniquely in the form ∑ α_i s_i with α_i ∈ R, s_i ∈ S. This is exactly the case when M = ⨁_{s∈S} Rs
is the direct sum of the cyclic submodules Rs, and each Rs is module isomorphic to R.

Proof. If S is a basis, then each m ∈ M can be written as m = ∑ α_i s_i, because M = ⟨S⟩. This representation is unique, because if ∑ α_i s_i = ∑ β_i s_i, then ∑ (α_i − β_i)s_i = 0; that is,
α_i − β_i = 0 for all i. If, conversely, we assume that the representation is unique,
then we get from ∑ α_i s_i = 0 = ∑ 0 ⋅ s_i that all α_i = 0, and therefore M is free on S.
The rest of the theorem is essentially a rewriting of the definition. If each m ∈ M can
be written as m = ∑ α_i s_i, then M = ∑_{s∈S} Rs. If x ∈ Rs′ ∩ ∑_{s∈S, s≠s′} Rs with s′ ∈ S, then
x = α′s′ = ∑_{s_i≠s′, s_i∈S} α_i s_i, and 0 = α′s′ − ∑_{s_i≠s′, s_i∈S} α_i s_i. Therefore, α′ = 0, and α_i = 0 for
all i. This gives M = ⨁_{s∈S} Rs. The cyclic modules Rs are isomorphic to R/Ann(s), and
Ann(s) = {0} in a free module. Conversely, such modules are free on S.

Corollary 18.4.5.
(1) M is free on S ⇔ M ≅ ⨁_{s∈S} R_s, R_s = R for all s ∈ S.
(2) If M is finitely generated and free, then there exists an n ∈ ℕ_0 such that M ≅ R^n = R ⊕ ⋅ ⋅ ⋅ ⊕ R (n times).

Proof. Part (1) is clear. We prove part (2). Let M = ⟨x_1, . . . , x_r⟩ and S a basis of M. Each x_i
is uniquely representable on S as a finite sum x_i = ∑_j α_ij s_j with s_j ∈ S. Since the x_i generate M, we get m =
∑_i β_i x_i = ∑_i ∑_j β_i α_ij s_j for arbitrary m ∈ M, and we need only finitely many s_j to generate M.
Hence, S is finite.

Theorem 18.4.6. Let R be a commutative ring with identity 1, and M a free R-module.
Then any two bases of M have the same cardinality.

Proof. R contains a maximal ideal m, and R/m is a field (see Theorem 2.3.2 and 2.4.2).
Then M/mM is a vector space over R/m. From M ≅ ⨁_{s∈S} Rs with basis S, we get mM ≅
⨁_{s∈S} ms; hence,

M/mM ≅ (⨁_{s∈S} Rs)/mM ≅ ⨁_{s∈S} (Rs/ms) ≅ ⨁_{s∈S} R/m.

Therefore, the R/m-vector space M/mM has a basis of the cardinality of S. This gives
the result.

Let R be a commutative ring with identity 1, and M a free R-module. The cardinality
of a basis is an invariant of M, called the rank of M or dimension of M. If rank(M) = n <
∞, then this means M ≅ R^n.

Theorem 18.4.7. Each R-module is a (module-)homomorphic image of a free R-module.

Proof. Let M be an R-module. We consider F := ⨁_{m∈M} R_m with R_m = R for all m ∈ M. F is a free R-module. The map f : F → M, f((α_m)_{m∈M}) = ∑ α_m m, defines a surjective
module homomorphism.

Theorem 18.4.8. Let F, M be R-modules, and let F be free. Let f : M → F be a module epimorphism. Then there exists a module homomorphism g : F → M with f ∘ g = id_F,
and we have M = ker(f) ⊕ g(F).

Proof. Let S be a basis of F. By the axiom of choice, there exists for each s ∈ S an
element m_s ∈ M with f(m_s) = s (f is surjective). We define the map g : F → M via
s ↦ m_s linearly; that is, g(∑_{s_i∈S} α_i s_i) = ∑_{s_i∈S} α_i m_{s_i}. Since F is free, the map g is well
defined. Obviously, f ∘ g(s) = f (ms ) = s for s ∈ S; that means f ∘ g = idF , because
F is free on S. For each m ∈ M, we have also m = g ∘ f (m) + (m − g ∘ f (m)), where
g ∘ f (m) = g(f (m)) ∈ g(F). Since f ∘ g = idF , the elements of the form m − g ∘ f (m) are in
the kernel of f . Therefore, M = g(F) + ker(f ). Now let x ∈ g(F) ∩ ker(f ). Then x = g(y)
for some y ∈ F and 0 = f (x) = f ∘ g(y) = y, and hence x = 0. Therefore, the sum is
direct: M = g(F) ⊕ ker(f ).
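
The decomposition m = g ∘ f(m) + (m − g ∘ f(m)) used in this proof can be followed in a small case. The sketch below assumes M = ℤ^2, F = ℤ, and the epimorphism f(a, b) = 2a + 3b; the preimage (−1, 1) of the basis element 1 is one arbitrary choice among many, and the names `f`, `g`, and `split` are illustrative.

```python
def f(m):
    """Epimorphism Z^2 -> Z (surjective because gcd(2, 3) = 1)."""
    a, b = m
    return 2 * a + 3 * b


def g(k):
    """Section F -> M defined linearly by 1 -> (-1, 1), where f((-1, 1)) = 1."""
    return (-k, k)


def split(m):
    """Write m as (element of g(F)) + (element of ker f)."""
    image_part = g(f(m))
    kernel_part = (m[0] - image_part[0], m[1] - image_part[1])
    return image_part, kernel_part


m = (5, 7)
image_part, kernel_part = split(m)
assert f(g(4)) == 4                  # f ∘ g = id_F on a sample element
assert f(kernel_part) == 0           # the second summand lies in ker(f)
assert (image_part[0] + kernel_part[0], image_part[1] + kernel_part[1]) == m
```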

Corollary 18.4.9. Let M be an R-module and N a submodule such that M/N is free. Then
there is a submodule N′ of M with M = N ⊕ N′.

Proof. Apply the above theorem for the canonical map π : M → M/N with
ker(π) = N.

18.5 Modules over principal ideal domains


We now specialize to the case of modules over principal ideal domains. For the re-
mainder of this section, R is always a principal ideal domain ≠ {0}. We now use the
notation (α) := αR, α ∈ R, for the principal ideal αR.

Theorem 18.5.1. Let M be a free R-module of finite rank over the principal ideal do-
main R. Then each submodule U is free of finite rank, and rank(U) ≤ rank(M).

Proof. We prove the theorem by induction on n = rank(M). The theorem certainly holds if n = 0. Now let n ≥ 1, and assume that the theorem holds for all free R-modules
of rank < n. Let M be a free R-module of rank n with basis {x_1, . . . , x_n}. Let U be a
submodule of M. We represent the elements of U as linear combinations of the basis
elements x_1, . . . , x_n, and we consider the set of coefficients of x_1 for the elements of U:

a = {β ∈ R : βx_1 + ∑_{i=2}^n β_i x_i ∈ U for some β_2, . . . , β_n ∈ R}.

Certainly a is an ideal in R. Since R is a principal ideal domain, we have a = (α_1) for some α_1 ∈ R. Let u ∈ U be an element in U, which has α_1 as its first coefficient; that is,

u = α_1 x_1 + ∑_{i=2}^n α_i x_i ∈ U.

Let v ∈ U be arbitrary. Then, for some ρ, ρ_2, . . . , ρ_n ∈ R,

v = ρ(α_1 x_1) + ∑_{i=2}^n ρ_i x_i.

Hence, v − ρu ∈ U′ := U ∩ M′, where M′ is the free R-module with basis {x_2, . . . , x_n}. By induction, U′ is a free submodule of M′ with a basis {y_1, . . . , y_t}, t ≤ n − 1. If α_1 = 0,
then a = (0), and U = U′, and there is nothing to prove. Now let α_1 ≠ 0. We show that
{u, y_1, . . . , y_t} is a basis of U. v − ρu is a linear combination of the basis elements of U′;
that is, v − ρu = ∑_{i=1}^t η_i y_i uniquely. Hence, v = ρu + ∑_{i=1}^t η_i y_i, and U = ⟨u, y_1, . . . , y_t⟩.
Now let 0 = γu + ∑_{i=1}^t μ_i y_i. We write u and the y_i as linear combinations in the basis
elements x_1, . . . , x_n of M. There is only an x_1-portion in γu. Hence,

0 = γα_1 x_1 + ∑_{i=2}^n μ′_i x_i.

Therefore, first γα_1 x_1 = 0; that is, γ = 0, because R has no zero divisor ≠ 0, and furthermore, μ′_2 = ⋅ ⋅ ⋅ = μ′_n = 0. That means, ∑_{i=1}^t μ_i y_i = 0, and hence μ_1 = ⋅ ⋅ ⋅ = μ_t = 0, because the y_i form a basis of U′.
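
For R = ℤ, the induction in this proof can be turned into a procedure. The following is a minimal sketch, assuming that U ⊂ ℤ^n is given by finitely many generators; it is a plain row reduction with the Euclidean algorithm, not an optimized normal-form routine, and `submodule_basis` is a name chosen here. In each column, the generators are combined until a single one carries the generator of the coefficient ideal a (the element u of the proof), and all others lie in M′.

```python
def submodule_basis(gens, n):
    """Return a basis (in echelon form) of the submodule of Z^n generated by gens."""
    rows = [list(v) for v in gens if any(v)]
    basis = []
    for col in range(n):
        # Euclidean algorithm on the entries of this column: afterwards at most
        # one remaining row has a nonzero entry here, namely a gcd of the entries.
        while True:
            nonzero = [r for r in rows if r[col] != 0]
            if len(nonzero) <= 1:
                break
            pivot = min(nonzero, key=lambda r: abs(r[col]))
            for r in nonzero:
                if r is not pivot:
                    q = r[col] // pivot[col]
                    for k in range(n):
                        r[k] -= q * pivot[k]
        if nonzero:
            u = nonzero[0]  # the element u of the proof
            basis.append(u)
            rows = [r for r in rows if r is not u]
        rows = [r for r in rows if any(r)]  # the rest generates U ∩ M'
    return basis


print(submodule_basis([(2, 4), (6, 8)], 2))  # rank 2
print(submodule_basis([(4, 6), (6, 9)], 2))  # rank 1: the generators are dependent
```

In both cases the number of basis elements is at most n = 2, as the theorem asserts.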

Let R be a principal ideal domain. Then the annihilator Ann(x) in R-modules M has certain further properties. Let x ∈ M. By definition,

Ann(x) = {α ∈ R : αx = 0} ⊲ R, an ideal in R,

hence Ann(x) = (δ_x) for some δ_x ∈ R. If x = 0, then (δ_x) = R. δ_x is called the order of x and (δ_x) the
order ideal of x. δ_x is uniquely determined up to units in R (that is, up to elements η with
ηη′ = 1 for some η′ ∈ R). For a submodule U of M, we call Ann(U) = ⋂_{u∈U} (δ_u) = (μ)
the order ideal of U.
In an abelian group G, considered as a ℤ-module, this order for elements corre-
sponds exactly to the order as group elements if we choose δx ≥ 0 for x ∈ G.

Theorem 18.5.2. Let R be a principal ideal domain and M be a finitely generated torsion-free R-module. Then M is free.

Proof. Let M = ⟨x_1, . . . , x_n⟩ be torsion-free and R a principal ideal domain; we may assume M ≠ {0} and all x_i ≠ 0. Each set {x_i} is free (linearly independent over R), because M is torsion-free. Hence, there exist nonempty free subsets S ⊂ {x_1, . . . , x_n}. Among
all free subsets S ⊂ {x_1, . . . , x_n}, we choose one with a maximal number of elements.
We may assume that {x_1, . . . , x_s}, 1 ≤ s ≤ n, is such a maximal set, after possible renaming. If s = n, then the theorem holds. Now, let s < n. By the choice of s, the sets
{x_1, . . . , x_s, x_j} with s < j ≤ n are not free. Hence, there are α_j ∈ R and α_i ∈ R, not all 0,
with

α_j x_j = ∑_{i=1}^s α_i x_i, α_j ≠ 0, s < j ≤ n.

For the product α := α_{s+1} ⋅ ⋅ ⋅ α_n ≠ 0, we get αx_j ∈ Rx_1 ⊕ ⋅ ⋅ ⋅ ⊕ Rx_s =: F for s < j ≤ n, and also αx_i ∈ F for 1 ≤ i ≤ s. Altogether, we get αM ⊂ F. αM is a submodule of the free
R-module F of rank s. By Theorem 18.5.1, we have that αM is free. Since α ≠ 0, and M
is torsion-free, the map M → αM, x ↦ αx, defines a (module) isomorphism; that is,
M ≅ αM. Therefore, M is also free.

Recall that for an integral domain R, the set

Tor(M) = T(M) = {x ∈ M : ∃α ∈ R, α ≠ 0, with αx = 0}

of the torsion elements of an R-module M is a submodule with torsion-free factor module M/T(M).

Corollary 18.5.3. Let R be a principal ideal domain and M be a finitely generated R-module. Then M = T(M) ⊕ F with a free submodule F ≅ M/T(M).

Proof. M/T(M) is a finitely generated, torsion-free R-module, and hence free. By Corollary 18.4.9, we have M = T(M) ⊕ F, F ≅ M/T(M).

From now on, we are interested in the case where M ≠ {0} is a torsion R-module;
that is, M = T(M). Let R be a principal ideal domain and M = T(M) an R-module.
Let M ≠ {0} be finitely generated. As above, let δ_x be the order of x ∈ M, unique
up to units in R, and let (δ_x) = {α ∈ R : αx = 0} be the order ideal of x. Let (μ) =
⋂_{x∈M} (δ_x) be the order ideal of M. Since (μ) ⊂ (δ_x), we have δ_x | μ for all x ∈ M. Since
principal ideal domains are unique factorization domains, if μ ≠ 0, then there are only
finitely many essentially different orders (that means, different up to units). Since M ≠ {0}
is finitely generated, we have in any case μ ≠ 0, because if M = ⟨x_1, . . . , x_n⟩ and α_i x_i = 0
with α_i ≠ 0, then αM = {0} for α := α_1 ⋅ ⋅ ⋅ α_n ≠ 0.

Lemma 18.5.4. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M).
(1) If the orders δ_x and δ_y of x, y ∈ M are relatively prime, that is, gcd(δ_x, δ_y) = 1, then
(δ_{x+y}) = (δ_x δ_y).
(2) Let δ_z be the order of z ∈ M, z ≠ 0. If δ_z = αβ with gcd(α, β) = 1, then there exist
x, y ∈ M with z = x + y and (δ_x) = (α), (δ_y) = (β).

Proof. (1) Since δ_x δ_y (x + y) = δ_x δ_y x + δ_x δ_y y = δ_y δ_x x + δ_x δ_y y = 0, we get (δ_x δ_y) ⊂ (δ_{x+y}). On the other hand, from δ_x x = 0 and δ_{x+y}(x + y) = 0, we get 0 = δ_x δ_{x+y}(x + y) = δ_x δ_{x+y} y;
that means, δ_x δ_{x+y} ∈ (δ_y), and hence δ_y | δ_x δ_{x+y}. Since gcd(δ_x, δ_y) = 1, we have δ_y | δ_{x+y}.
Analogously, δ_x | δ_{x+y}. Hence, δ_x δ_y | δ_{x+y}, and (δ_{x+y}) ⊂ (δ_x δ_y).
(2) Let δ_z = αβ with gcd(α, β) = 1. Then there are ρ, σ ∈ R with 1 = ρα + σβ.
Therefore, we get

z = 1 ⋅ z = ραz + σβz = y + x = x + y, where y := ραz and x := σβz.

Since αx = ασβz = σδ_z z = 0, we get α ∈ (δ_x); that means, δ_x | α. On the other hand,
from 0 = δ_x x = σβδ_x z, we get δ_z | σβδ_x, and hence αβ | σβδ_x, because δ_z = αβ. Therefore,
α | σδ_x. From gcd(α, σ) = 1, we get α | δ_x. Therefore, α is associated to δ_x; that is, α = δ_x ϵ
with ϵ a unit in R, and furthermore, (α) = (δ_x). Analogously, (β) = (δ_y).

In Lemma 18.5.4, we do not need M = T(M). We only need x, y, z ∈ M with δ_x ≠ 0,
δ_y ≠ 0 and δ_z ≠ 0, respectively.
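
Part (2) of the lemma is constructive, and for R = ℤ the construction can be carried out directly. The sketch below assumes M = ℤ/nℤ, where the order of x is n/gcd(x, n); the helper names `order`, `bezout`, and `split` are illustrative. The coefficients ρ, σ with 1 = ρα + σβ come from the extended Euclidean algorithm.

```python
from math import gcd


def order(x, n):
    """Order of x in the Z-module Z/nZ (nonnegative generator of Ann(x))."""
    return n // gcd(x, n)


def bezout(a, b):
    """Return (r, s) with r*a + s*b == gcd(a, b), by the extended Euclidean algorithm."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_s, old_t


def split(z, alpha, beta, n):
    """Write z = x + y in Z/nZ with order(x) = alpha and order(y) = beta."""
    rho, sigma = bezout(alpha, beta)  # 1 = rho*alpha + sigma*beta
    y = (rho * alpha * z) % n
    x = (sigma * beta * z) % n
    return x, y


n, z = 12, 1              # z has order 12 = 4 * 3
x, y = split(z, 4, 3, n)
assert (x + y) % n == z
assert order(x, n) == 4 and order(y, n) == 3
```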

Corollary 18.5.5. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M).
1. Let x_1, . . . , x_n ∈ M be pairwise different elements with pairwise relatively prime orders δ_{x_i} = α_i.
Then y = x_1 + ⋅ ⋅ ⋅ + x_n has order α := α_1 ⋅ ⋅ ⋅ α_n.
2. Let 0 ≠ x ∈ M and δ_x = ϵπ_1^{k_1} ⋅ ⋅ ⋅ π_n^{k_n} be a prime decomposition of the order δ_x of x (ϵ a
unit in R and the π_i pairwise nonassociate prime elements in R), where n > 0, k_i > 0.
Then there exist x_i, i = 1, . . . , n, with δ_{x_i} associated with π_i^{k_i} and x = x_1 + ⋅ ⋅ ⋅ + x_n.

This is exercise 7.

18.6 The fundamental theorem for finitely generated modules


In Section 10.4, we described the following result called the basis theorem for finite
abelian groups. In the following, we give a complete proof in detail; an elementary
proof is given in Chapter 19:

Theorem 18.6.1 (Theorem 10.4.1, basis theorem for finite abelian groups). Let G be a
finite abelian group. Then G is a direct product of cyclic groups of prime power order.

This allowed us, for a given finite order n, to present a complete classification of
abelian groups of order n. In this section, we extend this result to general modules
over principal ideal domains. As a consequence, we obtain the fundamental decom-
position theorem for finitely generated (not necessarily finite) abelian groups, which
finally proves Theorem 10.4.1. In the next chapter, we present a separate proof of this
in a slightly different format.

Definition 18.6.2. Let R be a principal ideal domain and M be an R-module. Let π ∈ R be a prime element. M_π := {x ∈ M : ∃k ≥ 0 with π^k x = 0} is called the π-primary
component of M. If M = M_π for some prime element π ∈ R, then M is called π-primary.

We have the following:
1. M_π is a submodule of M.
2. The primary components correspond to the p-subgroups in abelian groups.

Theorem 18.6.3. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M). Then M is the direct sum of its π-primary components.

Proof. Each x ∈ M has finite order δ_x ≠ 0. Let δ_x = ϵπ_1^{k_1} ⋅ ⋅ ⋅ π_n^{k_n} be a prime decomposition of δ_x.
By Corollary 18.5.5, we have that x = ∑ x_i with x_i ∈ M_{π_i}. That means, M = ∑_{π∈P} M_π,
where P is the set of the prime elements of R. Let y ∈ M_π ∩ ∑_{σ∈P, σ≠π} M_σ; that is, δ_y = π^k
for some k ≥ 0 and y = ∑ x_i with x_i ∈ M_{σ_i}, σ_i ≠ π. That means, δ_{x_i} = σ_i^{l_i} for some l_i ≥ 0. By
Corollary 18.5.5, we get that y has the order ∏_{σ_i≠π} σ_i^{l_i}; that means, π^k is associated to
∏_{σ_i≠π} σ_i^{l_i}. Therefore, k = l_i = 0 for all i, and the sum is direct.

If R is a principal ideal domain and {0} ≠ M = T(M) a finitely generated torsion R-module, then there are only finitely many nontrivial π-primary components, namely those for
the prime elements π with π | μ, where (μ) is the order ideal of M.

Corollary 18.6.4. Let R be a principal ideal domain and {0} ≠ M be a finitely generated torsion R-module. Then M has only finitely many nontrivial primary components
M_{π_1}, . . . , M_{π_n}, and we have

M = ⨁_{i=1}^n M_{π_i}.
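
As an illustration of Corollary 18.6.4, the primary components of the ℤ-module ℤ/12ℤ can be listed by brute force. This is only a sketch for this one example; `primary_component` is a hypothetical helper. Since n ⋅ x = 0 for all x, it suffices to test one sufficiently high power of p.

```python
def primary_component(p, n):
    """The p-primary component {x : p^k x = 0 for some k} of Z/nZ."""
    k = n.bit_length()  # p^k exceeds the p-part of n, because p >= 2
    return {x for x in range(n) if (pow(p, k) * x) % n == 0}


n = 12
M2 = primary_component(2, n)
M3 = primary_component(3, n)
print(sorted(M2), sorted(M3))

# The sum M2 + M3 is all of Z/12Z, and it is direct:
sums = [(a + b) % n for a in M2 for b in M3]
assert set(sums) == set(range(n))  # M = M2 + M3
assert len(sums) == len(set(sums))  # every element is represented only once
assert M2 & M3 == {0}
```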

Hence, we have a reduction of the decomposition problem to the primary components.

Theorem 18.6.5. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0}
an R-module with π^k M = {0}; furthermore, let m ∈ M with (δ_m) = (π^k). Then there exists
a submodule N ⊂ M with M = Rm ⊕ N.

Proof. By Zorn’s lemma, the set {U : U submodule of M and U ∩ Rm = {0}} has a maximal element N. This set is nonempty, because it contains {0}. We consider
M′ := N ⊕ Rm ⊂ M, and have to show that M′ = M. Assume that M′ ≠ M. Then there
exists an x ∈ M with x ∉ M′, in particular x ∉ N. Then N is properly contained in the
submodule Rx + N = ⟨x, N⟩. By our choice of N, we get A := (Rx + N) ∩ Rm ≠ {0}. If
z ∈ A, z ≠ 0, then z = ρm = αx + n with ρ, α ∈ R and n ∈ N. Since z ≠ 0, we have
ρm ≠ 0; also α ≠ 0, because otherwise z ∈ Rm ∩ N = {0}; α is not a unit in R, because
otherwise x = α^{−1}(ρm − n) ∈ M′. Hence we have: If x ∈ M, x ∉ M′, then there exist
α ∈ R, α ≠ 0, α not a unit in R, ρ ∈ R with ρm ≠ 0, and n ∈ N such that

αx = ρm + n. (⋆)

In particular, αx ∈ M′.
Now let α = ϵπ_1 ⋅ ⋅ ⋅ π_r be a prime decomposition. We consider one after the other
the elements x, π_r x, π_{r−1}π_r x, . . . , ϵπ_1 ⋅ ⋅ ⋅ π_r x = αx. We have x ∉ M′, but αx ∈ M′; hence,
there exists a y ∉ M′ with π_i y ∈ N + Rm for some i.
1. π_i ≠ π, π the prime element in the statement of the theorem. Then gcd(π_i, π^k)
= 1; hence, there are σ, σ′ ∈ R with σπ_i + σ′π^k = 1, and we get Rm = (Rπ_i + Rπ^k)m =
π_i Rm, because π^k m = 0. Therefore, π_i y ∈ M′ = N ⊕ Rm = N + π_i Rm.
2. π_i = π. Then we write πy as πy = n + λm with n ∈ N and λ ∈ R. This is possible,
because πy ∈ M′. Since π^k M = {0}, we get 0 = π^{k−1} ⋅ πy = π^{k−1}n + π^{k−1}λm. Therefore,
π^{k−1}n = π^{k−1}λm = 0, because N ∩ Rm = {0}. In particular, we get π^{k−1}λ ∈ (δ_m); that
is, π^k | π^{k−1}λ, and hence π | λ. Therefore, πy = n + λm = n + πλ′m ∈ N + πRm, λ′ ∈ R.

Hence, in any case, we have π_i y ∈ N + π_i Rm; that is, π_i y = n + π_i z with n ∈ N and z ∈ Rm. It follows that π_i(y − z) = n ∈ N.
y − z is not an element of M′, because y ∉ M′ and z ∈ M′. By (⋆), we have, therefore, α, β ∈ R,
β ≠ 0 not a unit in R, with β(y − z) = n′ + αm, αm ≠ 0, n′ ∈ N. We write z′ = αm; then
z′ ∈ Rm, z′ ≠ 0, and β(y − z) = n′ + z′. So, we have the equations

β(y − z) = n′ + z′, z′ ≠ 0, and π_i(y − z) = n. (⋆⋆)

We have gcd(β, π_i) = 1, because otherwise π_i | β and, hence, β(y − z) ∈ N and z′ = 0,
because N ∩ Rm = {0}. Then there exist γ, γ′ with γπ_i + γ′β = 1. In (⋆⋆), we multiply the
first equation with γ′ and the second with γ.
Addition gives y − z ∈ N ⊕ Rm = M′, and hence y ∈ M′, which contradicts y ∉ M′.
Therefore, M = M′.

Theorem 18.6.6. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0}
a finitely generated π-primary R-module. Then there exist finitely many m_1, . . . , m_s ∈ M
with M = ⨁_{i=1}^s Rm_i.

Proof. Let M = ⟨x_1, . . . , x_n⟩. Each x_i has an order π^{k_i}. We may assume that k_1 = max{k_1, k_2, . . . , k_n}, possibly after renaming. We have π^{k_i} x_i = 0 for all i. Since
π^{k_1} x_i = π^{k_1−k_i}(π^{k_i} x_i) = 0 for all i, we have also π^{k_1} M = {0}, and also (δ_{x_1}) = (π^{k_1}). Then M = Rx_1 ⊕ N for
some submodule N ⊂ M by Theorem 18.6.5. Now N ≅ M/Rx_1, and M/Rx_1 is generated
by the elements x_2 + Rx_1, . . . , x_n + Rx_1. Hence, N is generated by n − 1 elements,
and certainly N is π-primary. This proves the result by induction.

Since Rm_i ≅ R/Ann(m_i), and Ann(m_i) = (δ_{m_i}) = (π^{k_i}), we get the following extension of Theorem 18.6.6:

Theorem 18.6.7. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0}
a finitely generated π-primary R-module. Then there exist finitely many k_1, . . . , k_s ∈ ℕ
with

M ≅ ⨁_{i=1}^s R/(π^{k_i}),

and M is, up to isomorphism, uniquely determined by (k_1, . . . , k_s).

Proof. The first part, that is, a description as M ≅ ⨁_{i=1}^s R/(π^{k_i}), follows directly from
Theorem 18.6.6. Now, let

M ≅ ⨁_{i=1}^n R/(π^{k_i}) ≅ ⨁_{i=1}^m R/(π^{l_i}).

We may assume that k_1 ≥ k_2 ≥ ⋅ ⋅ ⋅ ≥ k_n > 0, and l_1 ≥ l_2 ≥ ⋅ ⋅ ⋅ ≥ l_m > 0. We consider first the submodule N := {x ∈ M : πx = 0}. Let M = ⨁_{i=1}^n R/(π^{k_i}). If we then write
x = ∑ (r_i + (π^{k_i})), we have πx = 0 if and only if r_i ∈ (π^{k_i−1}) for all i; that is, N ≅ ⨁_{i=1}^n (π^{k_i−1})/(π^{k_i}) ≅
⨁_{i=1}^n R/(π), because π^{k−1}R/π^k R ≅ R/πR.
Since (α + (π))x := αx is well defined if πx = 0, we get that N is an R/(π)-module, and hence a
vector space over the field R/(π). From the decompositions

N ≅ ⨁_{i=1}^n R/(π) and, analogously, N ≅ ⨁_{i=1}^m R/(π),

we get

n = dim_{R/(π)} N = m. (⋆⋆⋆)

Assume that there is an i with k_i < l_i or l_i < k_i. Without loss of generality, assume that
there is an i with k_i < l_i.

Let j be the smallest index for which k_j < l_j. Then (because of the ordering of
the k_i)

M′ := π^{k_j}M ≅ ⨁_{i=1}^n π^{k_j}R/π^{k_i}R ≅ ⨁_{i=1}^{j−1} π^{k_j}R/π^{k_i}R,

because if i ≥ j, then k_i ≤ k_j and π^{k_j}R/π^{k_i}R = {0}.
We now consider M′ = π^{k_j}M with respect to the second decomposition; that is,
M′ ≅ ⨁_{i=1}^m π^{k_j}R/π^{l_i}R. By our choice of j, we have k_j < l_j ≤ l_i for 1 ≤ i ≤ j.
Therefore, in this second decomposition, the first j summands π^{k_j}R/π^{l_i}R are unequal {0}; that is, π^{k_j}R/π^{l_i}R ≠ {0} if 1 ≤ i ≤ j. The remaining summands are {0}, or of
the form R/π^s R. Hence, altogether, on the one hand, M′ is a direct sum of at most j − 1 nontrivial cyclic
submodules, and, on the other hand, a direct sum of t ≥ j nontrivial cyclic submodules. But
this contradicts the above result (⋆⋆⋆) about the number of direct summands for finitely
generated π-primary modules, because, certainly, M′ is also finitely generated and
π-primary. Therefore, k_i = l_i for i = 1, . . . , n. This proves the theorem.
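
The counting argument behind the uniqueness proof can be tried out for R = ℤ and π = p. The sketch below assumes M = ℤ/p^{k_1} ⊕ ⋅ ⋅ ⋅ ⊕ ℤ/p^{k_n} and counts by brute force the elements annihilated by p^j; `count_killed` is an illustrative name. For j = 1 this is the size p^n of the vector space N from the proof, so it detects the number of summands; the counts for larger j then separate different exponent tuples.

```python
from itertools import product


def count_killed(p, j, exponents):
    """Number of x in Z/p^k1 + ... + Z/p^kn (direct sum) with p^j * x = 0."""
    moduli = [p ** k for k in exponents]
    return sum(
        1
        for x in product(*(range(m) for m in moduli))
        if all((p ** j * xi) % m == 0 for xi, m in zip(x, moduli))
    )


# N = {x : p x = 0} has p^n elements, where n is the number of summands.
print(count_killed(2, 1, (2, 1)), count_killed(2, 1, (3,)))

# Same order 16 and same number of summands, but different modules:
print(count_killed(2, 2, (2, 2)), count_killed(2, 2, (3, 1)))
```

Thus ℤ/4 ⊕ ℤ/2 and ℤ/8 are not isomorphic, and neither are ℤ/4 ⊕ ℤ/4 and ℤ/8 ⊕ ℤ/2.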

Theorem 18.6.8 (Fundamental theorem for finitely generated modules over principal
ideal domains). Let R be a principal ideal domain and M ≠ {0} be a finitely generated
(unitary) R-module. Then there exist prime elements π_1, . . . , π_r ∈ R, 0 ≤ r < ∞, and
numbers k_1, . . . , k_r ∈ ℕ, t ∈ ℕ_0 such that

M ≅ R/(π_1^{k_1}) ⊕ R/(π_2^{k_2}) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π_r^{k_r}) ⊕ R ⊕ ⋅ ⋅ ⋅ ⊕ R (with t copies of R),

and M is, up to isomorphism, uniquely determined by (π_1^{k_1}, . . . , π_r^{k_r}, t).

The prime elements πi are not necessarily pairwise different (up to units in R); that
means, it can be πi = ϵπj for i ≠ j, where ϵ is a unit in R.

Proof. The proof is a combination of the preceding results. The free part of M is isomorphic to M/T(M), and the rank of M/T(M), which we call here t, is uniquely determined, because two bases of M/T(M) have the same cardinality. Therefore, we may restrict ourselves to torsion modules. Here, we have a reduction to π-primary modules,
because in a decomposition M = ⨁_i R/(π_i^{k_i}), the submodule M_π = ⨁_{π_i=π} R/(π_i^{k_i}) is the π-primary component of M (an isomorphism certainly maps a π-primary component onto a π-primary
component). Therefore, it is only necessary, now, to consider π-primary modules M.
The uniqueness statement now follows from Theorem 18.6.7.

Since abelian groups can be considered as ℤ-modules, and ℤ is a principal ideal domain, we get the following corollary. We will restate this result in the next chapter
and prove a different version of it.

Theorem 18.6.9 (Fundamental theorem for finitely generated abelian groups). Let {0}
≠ G = (G, +) be a finitely generated abelian group. Then there exist prime numbers
p_1, . . . , p_r, 0 ≤ r < ∞, and numbers k_1, . . . , k_r ∈ ℕ, t ∈ ℕ_0 such that

G ≅ ℤ/(p_1^{k_1}ℤ) ⊕ ⋅ ⋅ ⋅ ⊕ ℤ/(p_r^{k_r}ℤ) ⊕ ℤ ⊕ ⋅ ⋅ ⋅ ⊕ ℤ (with t copies of ℤ),

and G is, up to isomorphism, uniquely determined by (p_1^{k_1}, . . . , p_r^{k_r}, t).
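
For finite groups (t = 0), the theorem reduces the classification of abelian groups of a given order n to listing, for each prime p dividing n, the ways of splitting the exponent of p into a sum of positive integers. The following is a sketch of that enumeration; the function names are illustrative, and each group is represented by its list of prime powers p_i^{k_i}.

```python
from itertools import product


def factorize(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def partitions(k, largest=None):
    """Yield the partitions of k as non-increasing tuples of positive integers."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def abelian_groups(n):
    """All abelian groups of order n, each given by its prime-power cyclic orders."""
    per_prime = [
        [[p ** k for k in part] for part in partitions(e)]
        for p, e in sorted(factorize(n).items())
    ]
    return [sum(choice, []) for choice in product(*per_prime)]


for orders in abelian_groups(72):
    print(" ⊕ ".join(f"Z/{q}Z" for q in orders))
```

For n = 72 = 2^3 ⋅ 3^2 this prints six groups, one for each pair consisting of a partition of 3 and a partition of 2.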

18.7 Exercises
1. Let M and N be isomorphic modules over a commutative ring R. Then EndR (M)
and EndR (N) are isomorphic rings. (EndR (M) is the set of all R-modules endomor-
phisms of M.)
2. Let R be an integral domain and M an R-module with M = Tor(M) (torsion mod-
ule). Show that HomR (M, R) = 0. (HomR (M, R) is the set of all R-module homo-
morphisms from M to R.)
3. Prove the isomorphism theorems for modules (1), (2), and (3) in Theorem 18.1.11
in detail.
4. Let M, M′, N be R-modules, R a commutative ring. Show the following:
(i) $\mathrm{Hom}_R(M \oplus M', N) \cong \mathrm{Hom}_R(M, N) \times \mathrm{Hom}_R(M', N)$
(ii) $\mathrm{Hom}_R(N, M \times M') \cong \mathrm{Hom}_R(N, M) \oplus \mathrm{Hom}_R(N, M')$.
5. Show that two free R-modules whose bases have equal cardinality are isomorphic.
6. Let M be a unitary R-module (R a commutative ring), and let $\{m_1, \ldots, m_s\}$ be a
finite subset of M. Show that the following are equivalent:
(i) $\{m_1, \ldots, m_s\}$ generates M freely.
(ii) $\{m_1, \ldots, m_s\}$ is linearly independent and generates M.
(iii) Every element m ∈ M is uniquely expressible in the form $m = \sum_{i=1}^{s} r_i m_i$ with
$r_i \in R$.
(iv) Each $R m_i$ is torsion-free, and $M = R m_1 \oplus \cdots \oplus R m_s$.
7. Let R be a principal ideal domain and M ≠ {0} be an R-module with M = T(M).
(i) Let $x_1, \ldots, x_n \in M$ be pairwise different elements with pairwise relatively prime
orders $\delta_{x_i} = \alpha_i$. Then $y = x_1 + \cdots + x_n$ has order $\alpha := \alpha_1 \cdots \alpha_n$.
(ii) Let 0 ≠ x ∈ M and $\delta_x = \epsilon \pi_1^{k_1} \cdots \pi_n^{k_n}$ be a prime decomposition of the order $\delta_x$ of
x (ϵ a unit in R and the $\pi_i$ pairwise nonassociate prime elements in R), where
n > 0, $k_i > 0$. Then there exist $x_i$, i = 1, …, n, with $\delta_{x_i}$ associated with $\pi_i^{k_i}$ and
$x = x_1 + \cdots + x_n$.

19 Finitely generated Abelian groups
19.1 Finite Abelian groups
In Chapter 10, we stated the theorem below, which completely describes the structure
of finite abelian groups. As we saw in Chapter 18, this result is a special case of a
general result on modules over principal ideal domains.

Theorem 19.1.1 (Theorem 10.4.1, basis theorem for finite abelian groups). Let G be a
finite abelian group. Then G is a direct product of cyclic groups of prime power order.

We review two examples that show how this theorem leads to the classification of
finite abelian groups. In particular, this theorem allows us, for a given finite order n,
to present a complete classification of abelian groups of order n.
Since all cyclic groups of order n are isomorphic to $(\mathbb{Z}_n, +)$, $\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$, we will
denote a cyclic group of order n by $\mathbb{Z}_n$.

Example 19.1.2. Classify all abelian groups of order 60. Let G be an abelian group of
order 60. From Theorem 10.4.1, G must be a direct product of cyclic groups of prime
power order. Now $60 = 2^2 \cdot 3 \cdot 5$, so the only primes involved are 2, 3, and 5. Hence, the
cyclic groups involved in the direct product decomposition of G have order either 2, 4,
3, or 5 (by Lagrange’s theorem they must be divisors of 60). Therefore, G must be of
the form

$$G \cong \mathbb{Z}_4 \times \mathbb{Z}_3 \times \mathbb{Z}_5,$$

or

$$G \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_3 \times \mathbb{Z}_5.$$

Hence, up to isomorphism, there are only two abelian groups of order 60.

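Example 19.1.3. Classify all abelian groups of order 180. Let G be an abelian group of
order 180. Now $180 = 2^2 \cdot 3^2 \cdot 5$, so the only primes involved are 2, 3, and 5. Hence, the
cyclic groups involved in the direct product decomposition of G have order either 2, 4,
3, 9, or 5 (by Lagrange’s theorem they must be divisors of 180). Therefore, G must be
of one of the forms

$$G \cong \mathbb{Z}_4 \times \mathbb{Z}_9 \times \mathbb{Z}_5,$$
$$G \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_9 \times \mathbb{Z}_5,$$
$$G \cong \mathbb{Z}_4 \times \mathbb{Z}_3 \times \mathbb{Z}_3 \times \mathbb{Z}_5,$$
$$G \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_3 \times \mathbb{Z}_3 \times \mathbb{Z}_5.$$

Therefore, up to isomorphism, there are four abelian groups of order 180.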
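
The bookkeeping in the two examples above can be mechanized. The following is a minimal Python sketch, not part of the original text: for each prime p dividing n it chooses one way of splitting the exponent of p into a sum of positive integers, and each choice gives one p-primary component. The function names prime_factorization, partitions, and abelian_groups are illustrative assumptions.

```python
from itertools import product


def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 1 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def partitions(m, largest=None):
    """Yield the partitions of m as non-increasing tuples of positive integers."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, largest), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def abelian_groups(n):
    """List every abelian group of order n as a tuple of prime-power cyclic orders."""
    per_prime = []
    for p, e in sorted(prime_factorization(n).items()):
        # One possible p-primary component per partition of the exponent e.
        per_prime.append([tuple(p ** k for k in part) for part in partitions(e)])
    groups = []
    for choice in product(*per_prime):
        groups.append(tuple(q for component in choice for q in component))
    return groups


for orders in abelian_groups(180):
    print(" x ".join(f"Z_{q}" for q in orders))
```

Run on 60 and 180, the sketch lists two and four groups, respectively, in agreement with Examples 19.1.2 and 19.1.3.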

https://doi.org/10.1515/9783110603996-019


The proof of Theorem 19.1.1 involves the lemmas that follow. We refer back to Chap-
ter 10 or Chapter 18 for the proofs. Notice how these lemmas mirror the results for
finitely generated modules over principal ideal domains considered in the last chap-
ter.

Lemma 19.1.4. Let G be a finite abelian group, and let p | |G|, where p is a prime. Then
all the elements of G whose orders are a power of p form a normal subgroup of G. This
subgroup is called the p-primary component of G, which we will denote by $G_p$.

Lemma 19.1.5. Let G be a finite abelian group of order n. Suppose that $n = p_1^{e_1} \cdots p_k^{e_k}$
with $p_1, \ldots, p_k$ distinct primes. Then

$$G \cong G_{p_1} \times \cdots \times G_{p_k},$$

where $G_{p_i}$ is the $p_i$-primary component of G.

Theorem 19.1.6 (Basis theorem for finite abelian groups). Let G be a finite abelian
group. Then G is a direct product of cyclic groups of prime power order.

19.2 The fundamental theorem: p-primary components


In this section, we use the fundamental theorem for finitely generated modules over
principal ideal domains to extend the basis theorem for finite abelian groups to the
more general case of finitely generated abelian groups. We also consider the decom-
position into p-primary components, mirroring our result in the finite case. In the next
section, we present a different form of the basis theorem with a more elementary proof.
In Chapter 18, we proved the following:

Theorem 19.2.1 (Fundamental theorem for finitely generated modules over principal
ideal domains). Let R be a principal ideal domain and M ≠ {0} be a finitely generated
(unitary) R-module. Then there exist prime elements $\pi_1, \ldots, \pi_r \in R$, 0 ≤ r < ∞, and
numbers $k_1, \ldots, k_r \in \mathbb{N}$, $t \in \mathbb{N}_0$, such that

$$M \cong R/(\pi_1^{k_1}) \oplus R/(\pi_2^{k_2}) \oplus \cdots \oplus R/(\pi_r^{k_r}) \oplus \underbrace{R \oplus \cdots \oplus R}_{t\text{-times}},$$

and M is, up to isomorphism, uniquely determined by $(\pi_1^{k_1}, \ldots, \pi_r^{k_r}, t)$.

The prime elements $\pi_i$ are not necessarily pairwise different (up to units in R); that
is, we may have $\pi_i = \epsilon \pi_j$ for i ≠ j, where ϵ is a unit in R.
Since abelian groups can be considered as ℤ-modules, and ℤ is a principal ideal
domain, we get the following corollary, which is extremely important in its own right:

Theorem 19.2.2 (Fundamental theorem for finitely generated abelian groups). Let {0}
≠ G = (G, +) be a finitely generated abelian group. Then there exist prime numbers
$p_1, \ldots, p_r$, 0 ≤ r < ∞, and numbers $k_1, \ldots, k_r \in \mathbb{N}$, $t \in \mathbb{N}_0$, such that

$$G \cong \mathbb{Z}/(p_1^{k_1}\mathbb{Z}) \oplus \cdots \oplus \mathbb{Z}/(p_r^{k_r}\mathbb{Z}) \oplus \underbrace{\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}}_{t\text{-times}},$$

and G is, up to isomorphism, uniquely determined by $(p_1^{k_1}, \ldots, p_r^{k_r}, t)$.

Notice that the number t of infinite components is unique. This is called the rank
or Betti number of the abelian group G. This number plays an important role in the
study of homology and cohomology groups in topology.
If $G = \mathbb{Z} \times \mathbb{Z} \times \cdots \times \mathbb{Z} = \mathbb{Z}^r$ for some r, we call G a free abelian group of rank r.
Notice that if a finitely generated abelian group G is torsion-free, then the p-primary
components are trivial. It follows that, in this case, G is a free abelian group of finite rank.
Again, using module theory, it follows that subgroups of such a group must also be free
abelian and of smaller or equal rank. Notice the distinction between free abelian groups and
absolutely free groups (see Chapter 14). In the free group case, a nonabelian free group of
finite rank contains free subgroups of all possible countable ranks. In the free abelian
case, however, the subgroups have smaller or equal rank. We summarize these
comments as follows:

Theorem 19.2.3. Let G ≠ {0} be a finitely generated torsion-free abelian group. Then G
is a free abelian group of finite rank r; that is, $G \cong \mathbb{Z}^r$. Furthermore, if H is a subgroup
of G, then H is also free abelian and the rank of H is smaller than or equal to the rank
of G.

19.3 The fundamental theorem: elementary divisors


In this section, we present the fundamental theorem of finitely generated abelian
groups in a slightly different form, and present an elementary proof of it.
In the following, G is always a finitely generated abelian group. We use the addi-
tion “+” for the binary operation; that is,

+ : G × G → G, (x, y) ↦ x + y.

We also write ng instead of $g^n$, and use 0 as the symbol for the identity element in G;
that is, 0 + g = g for all g ∈ G. We write $G = \langle g_1, \ldots, g_t \rangle$, 0 ≤ t < ∞, if G is (finitely)
generated by $g_1, \ldots, g_t$; that is, if each g ∈ G can be written in the
form $g = n_1 g_1 + n_2 g_2 + \cdots + n_t g_t$ with $n_i \in \mathbb{Z}$. A relation between the $g_i$ with coefficients
$n_1, \ldots, n_t$ is an equation of the form $n_1 g_1 + \cdots + n_t g_t = 0$. A relation is called
nontrivial if $n_i \neq 0$ for at least one i. A system R of relations in G is called a system of
defining relations if each relation in G is a consequence of R. The elements $g_1, \ldots, g_t$ are
called integrally linearly independent if there are no nontrivial relations between them.
A finite generating system $\{g_1, \ldots, g_t\}$ of G is called a minimal generating system if there
is no generating system with t − 1 elements.


Certainly, each finitely generated group has a minimal generating system. In what
follows, we always assume that our finitely generated abelian group G is not equal to {0};
that is, G is nontrivial.
As above, we may consider G as a finitely generated ℤ-module, and in this sense,
the subgroups of G are precisely the submodules. Hence, it is clear what we mean if
we call G a direct product G = U1 × ⋅ ⋅ ⋅ × Us of its subgroups U1 , . . . , Us ; namely, each
g ∈ G can be written as $g = u_1 + u_2 + \cdots + u_s$ with $u_i \in U_i$ and

$$U_i \cap \Bigl(\prod_{j=1,\, j \neq i}^{s} U_j\Bigr) = \{0\}.$$

To emphasize the slight difference between abelian groups and ℤ-modules, here
we use the notation “direct product” instead of “direct sum”. Considered as ℤ-modules,
for finite index sets I = {1, …, s}, we have in any case

$$\prod_{i=1}^{s} U_i = \bigoplus_{i=1}^{s} U_i.$$

Finally, we use the notation $\mathbb{Z}_n$ instead of ℤ/nℤ, n ∈ ℕ. In general, we use $Z_n$ to
denote a cyclic group of order n.
The aim in this section is to prove the following:

Theorem 19.3.1 (Basis theorem for finitely generated abelian groups). Let G ≠ {0} be
a finitely generated abelian group. Then G is a direct product

$$G \cong Z_{k_1} \times \cdots \times Z_{k_r} \times U_1 \times \cdots \times U_s,$$

r ≥ 0, s ≥ 0, of cyclic subgroups with $|Z_{k_i}| = k_i$ for i = 1, …, r, $k_i \mid k_{i+1}$ for i = 1, …, r − 1,
and $U_j \cong \mathbb{Z}$ for j = 1, …, s. Here, the numbers $k_1, \ldots, k_r$, r, and s are uniquely determined
by G; that is, if $k_1', \ldots, k_{r'}'$, r′, and s′ are the respective numbers for a second analogous
decomposition of G, then r = r′, $k_1 = k_1', \ldots, k_r = k_r'$, and s = s′.

The numbers ki are called the elementary divisors of G.


We can have r = 0, or s = 0 (but not both, because G ≠ {0}). If s > 0, r = 0, then
G is a free abelian group of rank s (exactly the same rank if you consider G as a free
ℤ-module of rank s). If s = 0, then G is finite. In fact, s = 0 ⇔ G is finite.
We first prove some preliminary results:

Lemma 19.3.2. Let $G = \langle g_1, \ldots, g_t \rangle$, t ≥ 2, be an abelian group. Then also
$G = \langle g_1 + \sum_{i=2}^{t} m_i g_i, g_2, \ldots, g_t \rangle$ for arbitrary $m_2, \ldots, m_t \in \mathbb{Z}$.

Lemma 19.3.3. Let G be a finitely generated abelian group. Among all nontrivial rela-
tions between elements of minimal generating systems of G, we choose one relation,

m1 g1 + ⋅ ⋅ ⋅ + mt gt = 0 (⋆)


with smallest possible positive coefficient, and let this smallest coefficient be m1 . Let

n1 g1 + ⋅ ⋅ ⋅ + nt gt = 0 (⋆⋆)

be another relation between the same generators g1 , . . . , gt . Then


(1) m1 |n1 , and
(2) m1 |mi for i = 1, 2, . . . , t.

Proof. (1) Assume $m_1 \nmid n_1$. Then $n_1 = q m_1 + m_1'$ with $0 < m_1' < m_1$. If we multiply the
relation (⋆) by q and subtract the resulting relation from the relation (⋆⋆), then we
get a relation with a coefficient $m_1' < m_1$, which contradicts the choice of $m_1$. Hence,
$m_1 \mid n_1$.
(2) Assume $m_1 \nmid m_2$. Then $m_2 = q m_1 + m_2'$ with $0 < m_2' < m_1$. By Lemma 19.3.2,
$\{g_1 + q g_2, g_2, \ldots, g_t\}$ is a minimal generating system, which satisfies the relation
$m_1 (g_1 + q g_2) + m_2' g_2 + m_3 g_3 + \cdots + m_t g_t = 0$, and this relation has a coefficient
$m_2' < m_1$. This again contradicts the choice of $m_1$. Hence, $m_1 \mid m_2$, and analogously,
$m_1 \mid m_i$ for i = 1, …, t.

Lemma 19.3.4 (Invariant characterization of $k_r$ for finite abelian groups G). Let
$G = Z_{k_1} \times \cdots \times Z_{k_r}$ with $Z_{k_i}$ finite cyclic of order $k_i \geq 2$, i = 1, …, r, and $k_i \mid k_{i+1}$ for
i = 1, …, r − 1. Then $k_r$ is the smallest natural number n such that ng = 0 for all g ∈ G.
$k_r$ is called the exponent or the maximal order of G.

Proof. 1. Let g ∈ G be arbitrary; that is, $g = n_1 g_1 + \cdots + n_r g_r$ with $g_i \in Z_{k_i}$. Then $k_i g_i = 0$ for
i = 1, …, r by the theorem of Fermat. Since $k_i \mid k_r$, we also have $k_r g_i = 0$, and we get
$k_r g = n_1 k_r g_1 + \cdots + n_r k_r g_r = 0$.
2. Let a ∈ G with $Z_{k_r} = \langle a \rangle$. Then the order of a is $k_r$ and, hence, na ≠ 0 for all
$0 < n < k_r$.
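
Lemma 19.3.4 can be checked by brute force on small examples. The following Python sketch is an illustration only; the names lcm_of and exponent_by_search are assumptions and do not come from the text. It models an element of $Z_{k_1} \times \cdots \times Z_{k_r}$ as a tuple of residues and searches for the smallest n with ng = 0 for all g.

```python
from functools import reduce
from itertools import product
from math import gcd


def lcm_of(orders):
    """Least common multiple of a nonempty sequence of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), orders)


def exponent_by_search(orders):
    """Smallest n >= 1 with n*g = 0 for every g in Z_{k_1} x ... x Z_{k_r}."""
    elements = list(product(*(range(k) for k in orders)))
    n = 1
    while any((n * x) % k != 0 for g in elements for x, k in zip(g, orders)):
        n += 1
    return n


orders = (2, 6, 12)   # k_1 | k_2 | k_3, as required in Lemma 19.3.4
print(exponent_by_search(orders), lcm_of(orders), orders[-1])
```

Under the divisibility condition $k_i \mid k_{i+1}$, the least common multiple of the orders is just $k_r$, which is what the lemma asserts. Without that condition, the exponent is still the least common multiple but need not be the last order, as the orders (4, 6) show.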

Lemma 19.3.5 (Invariant characterization of s). Let $G = Z_{k_1} \times \cdots \times Z_{k_r} \times U_1 \times \cdots \times U_s$,
s > 0, where the $Z_{k_i}$ are finite cyclic groups of order $k_i$, and the $U_j$ are infinite cyclic
groups. Then, s is the maximal number of integrally linearly independent elements of G; s
is called the rank of G.

Proof. 1. Let $g_i \in U_i$, $g_i \neq 0$, for i = 1, …, s. Then $g_1, \ldots, g_s$ are integrally linearly
independent, because from $n_1 g_1 + \cdots + n_s g_s = 0$ with $n_i \in \mathbb{Z}$, we get
$n_1 g_1 \in U_1 \cap (U_2 \times \cdots \times U_s) = \{0\}$. Hence, $n_1 g_1 = 0$; that is, $n_1 = 0$, because $g_1$ has
infinite order. Analogously, we get $n_2 = \cdots = n_s = 0$.
2. Let $g_1, \ldots, g_{s+1} \in G$. We look for integers $x_1, \ldots, x_{s+1}$, not all 0, such that a relation
$\sum_{i=1}^{s+1} x_i g_i = 0$ holds. Let $Z_{k_i} = \langle a_i \rangle$, $U_j = \langle b_j \rangle$. Then we may write each $g_i$ as
$g_i = m_{i1} a_1 + \cdots + m_{ir} a_r + n_{i1} b_1 + \cdots + n_{is} b_s$ for i = 1, …, s + 1, where $m_{ij} a_j \in Z_{k_j}$ and
$n_{il} b_l \in U_l$.
Case 1: all $m_{ij} a_j = 0$. Then $\sum_{i=1}^{s+1} x_i g_i = 0$ is equivalent to

$$\sum_{i=1}^{s+1} x_i \Bigl(\sum_{j=1}^{s} n_{ij} b_j\Bigr) = \sum_{j=1}^{s} \Bigl(\sum_{i=1}^{s+1} n_{ij} x_i\Bigr) b_j = 0.$$


The system $\sum_{i=1}^{s+1} n_{ij} x_i = 0$, j = 1, …, s, of linear equations has at least one nontrivial
rational solution $(x_1, \ldots, x_{s+1})$, because we have more unknowns than equations.
Multiplication with the common denominator gives a nontrivial integral solution
$(x_1, \ldots, x_{s+1}) \in \mathbb{Z}^{s+1}$. For this solution, we get

$$\sum_{i=1}^{s+1} x_i g_i = 0.$$

Case 2: $m_{ij} a_j$ arbitrary. Let k ≠ 0 be a common multiple of the orders $k_j$ of the cyclic
groups $Z_{k_j}$, j = 1, …, r. Then

$$k g_i = \underbrace{m_{i1} k a_1}_{=0} + \cdots + \underbrace{m_{ir} k a_r}_{=0} + n_{i1} k b_1 + \cdots + n_{is} k b_s$$

for i = 1, …, s + 1. By case 1, $k g_1, \ldots, k g_{s+1}$ are integrally linearly dependent; that is,
we have integers $x_1, \ldots, x_{s+1}$, not all 0, with $\sum_{i=1}^{s+1} x_i (k g_i) = 0 = \sum_{i=1}^{s+1} (x_i k) g_i$, and the $x_i k$
are not all 0. Hence, $g_1, \ldots, g_{s+1}$ are also integrally linearly dependent.
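
Case 1 of the proof is constructive, and the construction can be sketched in Python. The code below is an illustration under stated assumptions, not the authors' procedure: it represents each $g_i$ by its coefficient vector $(n_{i1}, \ldots, n_{is})$, solves the homogeneous system over ℚ by Gaussian elimination with exact fractions, and then clears denominators. The function name integer_relation is hypothetical.

```python
from fractions import Fraction
from math import gcd


def integer_relation(vectors):
    """Given more than s integer vectors of length s, return integers x_i,
    not all 0, with sum_i x_i * vectors[i] = 0."""
    rows = len(vectors[0])   # number of equations, s
    cols = len(vectors)      # number of unknowns, at least s + 1
    # Equation j reads sum_i n_ij * x_i = 0, so we work with the transpose.
    a = [[Fraction(vectors[i][j]) for i in range(cols)] for j in range(rows)]
    pivot_cols = []
    r = 0
    for c in range(cols):
        pivot = next((k for k in range(r, rows) if a[k][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        a[r] = [v / a[r][c] for v in a[r]]
        for k in range(rows):
            if k != r and a[k][c] != 0:
                factor = a[k][c]
                a[k] = [vk - factor * vr for vk, vr in zip(a[k], a[r])]
        pivot_cols.append(c)
        r += 1
        if r == rows:
            break
    # More unknowns than equations, so some column has no pivot.
    free = next(c for c in range(cols) if c not in pivot_cols)
    x = [Fraction(0)] * cols
    x[free] = Fraction(1)
    for row, pc in zip(a, pivot_cols):
        x[pc] = -row[free]
    # Multiply by the common denominator to obtain an integral solution.
    denom = 1
    for v in x:
        denom = denom * v.denominator // gcd(denom, v.denominator)
    return [int(v * denom) for v in x]


vectors = [(1, 2), (3, 4), (5, 6)]   # three elements of Z x Z, so s = 2
print(integer_relation(vectors))
```

For the three vectors shown, the returned coefficients satisfy $x_1 g_1 + x_2 g_2 + x_3 g_3 = 0$ with not all $x_i$ equal to 0, as the proof guarantees for any s + 1 elements.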

Lemma 19.3.6. Let $G := Z_{k_1} \times \cdots \times Z_{k_r} \cong Z_{k_1'} \times \cdots \times Z_{k_{r'}'} =: G'$, with $Z_{k_i}$, $Z_{k_j'}$ cyclic groups
of orders $k_i \neq 1$ and $k_j' \neq 1$, respectively, and $k_i \mid k_{i+1}$ for i = 1, …, r − 1 and $k_j' \mid k_{j+1}'$
for j = 1, …, r′ − 1. Then r = r′, and $k_1 = k_1', k_2 = k_2', \ldots, k_r = k_r'$.

Proof. We prove this lemma by induction on the group order |G| = |G′|. Certainly,
Lemma 19.3.6 holds if |G| ≤ 2, because then, either G = {0}, and here r = r′ = 0, or
$G \cong \mathbb{Z}_2$, and here r = r′ = 1. Now let |G| > 2. Then, in particular, r ≥ 1. Inductively we
assume that Lemma 19.3.6 holds for all finite abelian groups of order less than |G|. By
Lemma 19.3.4, the number $k_r$ is invariantly characterized; that is, from G ≅ G′ it follows
that $k_r = k_{r'}'$, and in particular, $Z_{k_r} \cong Z_{k_{r'}'}$. Then $G/Z_{k_r} \cong G'/Z_{k_{r'}'}$; that is,
$Z_{k_1} \times \cdots \times Z_{k_{r-1}} \cong Z_{k_1'} \times \cdots \times Z_{k_{r'-1}'}$. Inductively, r − 1 = r′ − 1; that is, r = r′, and
$k_1 = k_1', \ldots, k_{r-1} = k_{r'-1}'$.

This proves Lemma 19.3.6.

We can now present the main result, which we state again, and its proof.

Theorem 19.3.7 (Basis theorem for finitely generated abelian groups). Let G ≠ {0} be
a finitely generated abelian group. Then G is a direct product

$$G \cong Z_{k_1} \times \cdots \times Z_{k_r} \times U_1 \times \cdots \times U_s, \quad r \geq 0,\ s \geq 0,$$

of cyclic subgroups with $|Z_{k_i}| = k_i$ for i = 1, …, r, $k_i \mid k_{i+1}$ for i = 1, …, r − 1, and $U_j \cong \mathbb{Z}$
for j = 1, …, s. Here, the numbers $k_1, \ldots, k_r$, r, and s are uniquely determined by G; that
is, if $k_1', \ldots, k_{r'}'$, r′, and s′ are the respective numbers for a second analogous
decomposition of G, then r = r′, $k_1 = k_1', \ldots, k_r = k_r'$, and s = s′.

Proof. (a) We first prove the existence of the given decomposition. Let G ≠ {0} be a
finitely generated abelian group. Let t, 0 < t < ∞, be the number of elements in a


minimal generating system of G. We have to show that G is decomposable as a direct
product of t cyclic groups with the given description. We prove this by induction on t.
If t = 1, then G is cyclic, and the basis theorem holds. Now let t ≥ 2, and assume that the
assertion holds for all abelian groups with fewer than t generators.
Case 1: There does not exist a minimal generating system of G, which satisfies a
nontrivial relation. Let {g1 , . . . , gt } be an arbitrary minimal generating system for G. Let
Ui = ⟨gi ⟩. Then all Ui are infinite cyclic, and we have G = U1 × ⋅ ⋅ ⋅ × Ut , because if, for
instance, U1 ∩ (U2 + ⋅ ⋅ ⋅ + Ut ) ≠ {0}, then we must have a nontrivial relation between
the g1 , . . . , gt .
Case 2: There exist minimal generating systems of G, which satisfy nontrivial rela-
tions. Among all nontrivial relations between elements of minimal generating systems
of G, we choose one relation,

m1 g1 + ⋅ ⋅ ⋅ + mt gt = 0 (⋆)

with smallest possible positive coefficient. Without loss of generality, let $m_1$ be
this coefficient. By Lemma 19.3.3, we get $m_2 = q_2 m_1, \ldots, m_t = q_t m_1$. Now,
$\{g_1 + \sum_{i=2}^{t} q_i g_i, g_2, \ldots, g_t\}$ is a minimal generating system of G by Lemma 19.3.2. Define
$h_1 = g_1 + \sum_{i=2}^{t} q_i g_i$; then $m_1 h_1 = 0$. If $n_1 h_1 + n_2 g_2 + \cdots + n_t g_t = 0$ is an arbitrary relation
between $h_1, g_2, \ldots, g_t$, then $m_1 \mid n_1$ by Lemma 19.3.3; hence, $n_1 h_1 = 0$. Define $H_1 := \langle h_1 \rangle$
and $G' = \langle g_2, \ldots, g_t \rangle$. Then $G = H_1 \times G'$. This we can see as follows: First, each g ∈ G
can be written as $g = n_1 h_1 + n_2 g_2 + \cdots + n_t g_t = n_1 h_1 + g'$ with $g' \in G'$. Also $H_1 \cap G' = \{0\}$,
because $n_1 h_1 = g' \in G'$ implies a relation $n_1 h_1 + n_2 g_2 + \cdots + n_t g_t = 0$, and from this we
get, as above, $n_1 h_1 = g' = 0$. Now, inductively, $G' = Z_{k_2} \times \cdots \times Z_{k_r} \times U_1 \times \cdots \times U_s$ with
$Z_{k_i}$ a cyclic group of order $k_i$, i = 2, …, r, $k_i \mid k_{i+1}$ for i = 2, …, r − 1, $U_j \cong \mathbb{Z}$ for j = 1, …, s,
and (r − 1) + s = t − 1; that is, r + s = t. Furthermore, $G = H_1 \times G'$, where $H_1$ is cyclic of
order $m_1$. If r ≥ 2 and $Z_{k_2} = \langle h_2 \rangle$, then we get a nontrivial relation

$$\underbrace{m_1 h_1}_{=0} + \underbrace{k_2 h_2}_{=0} = 0,$$

since $k_2 \neq 0$. Again $m_1 \mid k_2$ by Lemma 19.3.3. This gives the desired decomposition.
(b) We now prove the uniqueness statement.
Case 1: G is finite abelian. Then the claim follows from Lemma 19.3.6.
Case 2: G is arbitrary finitely generated and abelian. Let T := {x ∈ G : |x| < ∞};
that is, the set of elements of G of finite order. Since G is abelian, T is a subgroup of G,
the so-called torsion subgroup of G. If, as above, $G = Z_{k_1} \times \cdots \times Z_{k_r} \times U_1 \times \cdots \times U_s$, then
$T = Z_{k_1} \times \cdots \times Z_{k_r}$, because an element $b_1 + \cdots + b_r + c_1 + \cdots + c_s$ with $b_i \in Z_{k_i}$, $c_j \in U_j$
has finite order if and only if all $c_j = 0$. That means: $Z_{k_1} \times \cdots \times Z_{k_r}$ does not depend
on the particular decomposition, but is uniquely determined by G; hence, so are the numbers
$r, k_1, \ldots, k_r$ by Lemma 19.3.6. Finally, the number s, the rank of G, is uniquely
determined by Lemma 19.3.5. This proves the basis theorem for finitely generated abelian
groups.


As a corollary, we get the fundamental theorem for finitely generated abelian
groups as given in Theorem 19.2.2.

Theorem 19.3.8. Let {0} ≠ G = (G, +) be a finitely generated abelian group. Then there
exist prime numbers $p_1, \ldots, p_r$, 0 ≤ r < ∞, and numbers $k_1, \ldots, k_r \in \mathbb{N}$, $t \in \mathbb{N}_0$ such that

$$G \cong \mathbb{Z}_{p_1^{k_1}} \times \cdots \times \mathbb{Z}_{p_r^{k_r}} \times \underbrace{\mathbb{Z} \times \cdots \times \mathbb{Z}}_{t\text{-times}},$$

and G is, up to isomorphism, uniquely determined by $(p_1^{k_1}, \ldots, p_r^{k_r}, t)$.

Proof. For the existence, we only have to show that $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \times \mathbb{Z}_n$ if gcd(m, n) = 1. For
this, we write $U_n = \langle m + mn\mathbb{Z} \rangle < \mathbb{Z}_{mn}$ and $U_m = \langle n + mn\mathbb{Z} \rangle < \mathbb{Z}_{mn}$. Then
$U_n \cap U_m = \{mn\mathbb{Z}\}$, because gcd(m, n) = 1. Furthermore, there are h, k ∈ ℤ with 1 = hm + kn.
Hence, $l + mn\mathbb{Z} = hlm + mn\mathbb{Z} + kln + mn\mathbb{Z}$ for each l ∈ ℤ, and therefore
$\mathbb{Z}_{mn} = U_n \times U_m \cong \mathbb{Z}_n \times \mathbb{Z}_m$.
For the uniqueness statement, we may reduce the problem to the case $|G| = p^k$ for a
prime number p and k ∈ ℕ. But here the result follows directly from Lemma 19.3.6.
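
The existence part of this proof is a recipe: split each elementary divisor $k_i$ into its prime-power factors and apply $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \times \mathbb{Z}_n$ for coprime m, n repeatedly. A small Python sketch of this conversion follows; the function names prime_power_parts and primary_decomposition are illustrative assumptions.

```python
def prime_power_parts(k):
    """Split k >= 2 into its prime-power factors, e.g. 12 -> [4, 3]."""
    parts = []
    d = 2
    while d * d <= k:
        if k % d == 0:
            q = 1
            while k % d == 0:
                q *= d
                k //= d
            parts.append(q)
        d += 1
    if k > 1:
        parts.append(k)
    return parts


def primary_decomposition(elementary_divisors):
    """Replace each Z_k by the product of the Z_{p^e}, p^e the prime powers
    exactly dividing k, and return the sorted list of all prime powers."""
    components = []
    for k in elementary_divisors:
        components.extend(prime_power_parts(k))
    return sorted(components)


# Z_2 x Z_90 has elementary divisors 2 | 90 and order 180.
print(primary_decomposition([2, 90]))
```

For the elementary divisors 2 and 90, the sketch returns the prime powers 2, 2, 5, 9, so the group is the second one listed in Example 19.1.3.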

From this proof, we automatically get the Chinese remainder theorem for the case
ℤn = ℤ/nℤ.

Theorem 19.3.9 (Chinese remainder theorem). Let $m_1, \ldots, m_r \in \mathbb{N}$, r ≥ 2, with
$\gcd(m_i, m_j) = 1$ for i ≠ j. Define $m := m_1 \cdots m_r$.
(1) $\pi : \mathbb{Z}_m \to \mathbb{Z}_{m_1} \times \cdots \times \mathbb{Z}_{m_r}$, $a + m\mathbb{Z} \mapsto (a + m_1\mathbb{Z}, \ldots, a + m_r\mathbb{Z})$, defines a ring
isomorphism.
(2) The restriction of π to the multiplicative group of the prime residue classes defines
a group isomorphism $\mathbb{Z}_m^\star \to \mathbb{Z}_{m_1}^\star \times \cdots \times \mathbb{Z}_{m_r}^\star$.
(3) For given $a_1, \ldots, a_r \in \mathbb{Z}$, there exists modulo m exactly one x ∈ ℤ with
$x \equiv a_i \pmod{m_i}$ for i = 1, …, r.

Recall that for k ∈ ℕ, a prime residue class is defined by a + kℤ with gcd(a, k) = 1.


The set of prime residue classes modulo k is certainly a multiplicative group.

Proof. By the proof of Theorem 19.3.8, we get that π is an additive group isomorphism, which can
be extended directly to a ring isomorphism via $(a + m\mathbb{Z})(b + m\mathbb{Z}) \mapsto (ab + m_1\mathbb{Z},
\ldots, ab + m_r\mathbb{Z})$. The remaining statements are now obvious.
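
Statement (3) can be made explicit with the Bézout coefficients h, k from 1 = hm + kn that appear in the proof of Theorem 19.3.8. The following Python sketch is one standard way to compute the solution, offered as an illustration and not as the authors' method; the names extended_gcd and crt are assumptions.

```python
def extended_gcd(a, b):
    """Return (g, h, k) with g = gcd(a, b) = h*a + k*b."""
    old_r, r = a, b
    old_h, h = 1, 0
    old_k, k = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_h, h = h, old_h - q * h
        old_k, k = k, old_k - q * k
    return old_r, old_h, old_k


def crt(residues, moduli):
    """Return the unique x with 0 <= x < m_1 * ... * m_r and x = a_i (mod m_i)."""
    x, m = 0, 1
    for a_i, m_i in zip(residues, moduli):
        g, h, k = extended_gcd(m, m_i)
        if g != 1:
            raise ValueError("moduli must be pairwise coprime")
        # 1 = h*m + k*m_i, so k*m_i = 1 (mod m) and h*m = 1 (mod m_i).
        x = (x * k * m_i + a_i * h * m) % (m * m_i)
        m *= m_i
    return x


print(crt([2, 3, 2], [3, 5, 7]))
```

Each step combines the solution modulo the product of the moduli handled so far with one further congruence, which mirrors the two-factor isomorphism $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \times \mathbb{Z}_n$ applied inductively.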

Let A(n) be the number of nonisomorphic finite abelian groups of order
$n = p_1^{k_1} \cdots p_r^{k_r}$, r ≥ 1, with pairwise different prime numbers $p_1, \ldots, p_r$ and
$k_1, \ldots, k_r \in \mathbb{N}$. By Theorem 19.2.2, we have $A(n) = A(p_1^{k_1}) \cdots A(p_r^{k_r})$. Hence, to calculate A(n),
we have to calculate $A(p^m)$ for a prime number p and a natural number m ∈ ℕ. Again,
by Theorem 19.2.2, we get $G \cong \mathbb{Z}_{p^{m_1}} \times \cdots \times \mathbb{Z}_{p^{m_k}}$, all $m_i \geq 1$, if G is abelian of order $p^m$. If
we compare the orders, we get $m = m_1 + \cdots + m_k$. We may order the $m_i$ by size. A k-tuple
$(m_1, \ldots, m_k)$ with $0 < m_1 \leq m_2 \leq \cdots \leq m_k$ and $m_1 + m_2 + \cdots + m_k = m$ is called a partition
of m. From above, each abelian group of order $p^m$ gives a partition $(m_1, \ldots, m_k)$ of m


for some k with 1 ≤ k ≤ m. On the other hand, each partition $(m_1, \ldots, m_k)$ of m gives an
abelian group of order $p^m$, namely $\mathbb{Z}_{p^{m_1}} \times \cdots \times \mathbb{Z}_{p^{m_k}}$. Theorem 19.2.2 shows that different
partitions give nonisomorphic groups. If we define p(m) to be the number of partitions
of m, then we get the following: $A(p^m) = p(m)$, and $A(p_1^{k_1} \cdots p_r^{k_r}) = p(k_1) \cdots p(k_r)$.
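
The formula $A(p_1^{k_1} \cdots p_r^{k_r}) = p(k_1) \cdots p(k_r)$ is easy to evaluate. The Python sketch below is an illustration with hypothetical function names partition_count and count_abelian_groups; it computes p(m) by a standard dynamic-programming recurrence and multiplies the values over the prime factorization of n found by trial division.

```python
def partition_count(m):
    """Number p(m) of partitions of m, adding one allowed part size at a time."""
    ways = [1] + [0] * m
    for part in range(1, m + 1):
        for total in range(part, m + 1):
            ways[total] += ways[total - part]
    return ways[m]


def count_abelian_groups(n):
    """A(n) = p(k_1) * ... * p(k_r) for n = p_1^{k_1} * ... * p_r^{k_r}."""
    count = 1
    d = 2
    while d * d <= n:
        k = 0
        while n % d == 0:
            n //= d
            k += 1
        if k:
            count *= partition_count(k)
        d += 1
    # A remaining factor n > 1 is a prime with exponent 1, and p(1) = 1.
    return count


for n in (60, 180, 16):
    print(n, count_abelian_groups(n))
```

The values for 60 and 180 agree with Examples 19.1.2 and 19.1.3.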

19.4 Exercises
1. Let H be a finitely generated abelian group, which is the homomorphic image of a
torsion-free abelian group of finite rank n. Show that H is the direct sum of ≤ n
cyclic groups.
2. Determine (up to isomorphism) all groups of order $p^2$ (p prime) and all abelian
groups of order ≤ 15.
3. Let G be an abelian group with generating elements $a_1, \ldots, a_4$ and defining
relations

$$\begin{aligned}
5a_1 + 4a_2 + a_3 + 5a_4 &= 0 \\
7a_1 + 6a_2 + 5a_3 + 11a_4 &= 0 \\
2a_1 + 2a_2 + 10a_3 + 12a_4 &= 0 \\
10a_1 + 8a_2 - 4a_3 + 4a_4 &= 0.
\end{aligned}$$

Express G as a direct product of cyclic groups.


4. Let G be a finite abelian group and $u = \prod_{g \in G} g$, the product of all elements of G.
Show: If G has exactly one element a of order 2, then u = a, otherwise u = e.
Conclude from this the theorem of Wilson:

$$(p - 1)! \equiv -1 \pmod{p} \quad \text{for each prime } p.$$

5. Let p be a prime and G a finite abelian p-group; that is, the order of all elements
of G is finite and a power of p. Show that G is cyclic, if G has exactly one subgroup
of order p. Is the statement still correct if G is not abelian?

20 Integral and transcendental extensions
20.1 The ring of algebraic integers
Recall that a complex number α is an algebraic number if it is algebraic over the ra-
tional numbers ℚ. That is, α is a zero of a polynomial p(x) ∈ ℚ[x]. If α ∈ ℂ is not
algebraic, then it is a transcendental number.
We will let 𝒜 denote the totality of algebraic numbers within the complex num-
bers ℂ, and 𝒯 the set of transcendentals, so that ℂ = 𝒜 ∪ 𝒯 . The set 𝒜 is the algebraic
closure of ℚ within ℂ.
The set 𝒜 of algebraic numbers forms a subfield of ℂ (see Chapter 5), and the
subset 𝒜󸀠 = 𝒜 ∩ ℝ of real algebraic numbers forms a subfield of ℝ. The field 𝒜 is an
algebraic extension of the rationals ℚ. However, the degree is infinite.
Since each rational is algebraic, it is clear that there are algebraic numbers.
Furthermore, there are irrational algebraic numbers, √2 for example, since it is a zero of
the irreducible polynomial $x^2 - 2$ over ℚ. In Chapter 5, we proved that there are
uncountably infinitely many transcendental numbers (Theorem 5.5.3). However, it is very
difficult to prove that any particular real or complex number is actually transcendental.
In Theorem 5.5.4, we showed that the real number

$$c = \sum_{j=1}^{\infty} \frac{1}{10^{j!}}$$

is transcendental.
In this section, we examine a special type of algebraic number called an algebraic
integer. These are the algebraic numbers that are zeros of monic integral polynomials.
The set of all such algebraic integers forms a subring of ℂ. The proofs in this section
can be found in [43].
After we do this, we extend the concept of an algebraic integer to a general con-
text and define integral ring extensions. We then consider field extensions that are
nonalgebraic—transcendental field extensions. Finally, we will prove that the famil-
iar numbers e and π are transcendental.

Definition 20.1.1. An algebraic integer is a complex number α that is a zero of a monic
integral polynomial. That is, α ∈ ℂ is an algebraic integer if there exists f(x) ∈ ℤ[x]
with $f(x) = x^n + b_{n-1} x^{n-1} + \cdots + b_0$, $b_i \in \mathbb{Z}$, n ≥ 1, and f(α) = 0.

An algebraic integer is clearly an algebraic number. The following are clear:

Lemma 20.1.2. If α ∈ ℂ is an algebraic integer, then all its conjugates, $\alpha_1, \ldots, \alpha_n$, over
ℚ are also algebraic integers.

Lemma 20.1.3. α ∈ ℂ is an algebraic integer if and only if its minimal polynomial $m_\alpha$
over ℚ lies in ℤ[x].

https://doi.org/10.1515/9783110603996-020


To prove the converse of this lemma, we need the concept of a primitive integral
polynomial. This is a polynomial p(x) ∈ ℤ[x] such that the GCD of all its coefficients
is 1. The following can be proved (see exercises or Chapter 4):
(1) If f (x) and g(x) are primitive, then so is f (x)g(x).
(2) If f (x) ∈ ℤ[x] is monic, then it is primitive.
(3) If f (x) ∈ ℚ[x], then there exists a rational number c such that f (x) = cf1 (x) with
f1 (x) primitive.

Now suppose f (x) ∈ ℤ[x] is a monic polynomial with f (α) = 0. Let p(x) = mα (x). Then
p(x) divides f (x) so f (x) = p(x)q(x).
Let $p(x) = c_1 p_1(x)$ with $p_1(x)$ primitive, and let $q(x) = c_2 q_1(x)$ with $q_1(x)$ primitive.
Then

$$f(x) = c\, p_1(x) q_1(x) \quad \text{with } c = c_1 c_2.$$

Since f(x) is monic, it is primitive, and $p_1(x) q_1(x)$ is primitive by (1); hence c = 1, so
$f(x) = p_1(x) q_1(x)$.
Since $p_1(x)$ and $q_1(x)$ are integral and their product is monic, they both must be
monic. Since $p(x) = c_1 p_1(x)$, and they are both monic, it follows that $c_1 = 1$. Hence,
$p(x) = p_1(x)$. Therefore, $p(x) = m_\alpha(x)$ is integral.
When we speak of algebraic integers, we will refer to the ordinary integers as ra-
tional integers. The next lemma shows the close ties between algebraic integers and
rational integers.

Lemma 20.1.4. If α is an algebraic integer and also rational, then it is a rational integer.
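
Lemma 20.1.4 can be observed experimentally. The Python sketch below is illustrative only and uses hypothetical helper names evaluate and rational_zeros_by_search. It searches exhaustively for rational zeros p/q of an integer polynomial with bounded numerator and denominator; for a monic polynomial every zero found has denominator 1, whereas a non-monic polynomial such as 2x − 1 can have a non-integral rational zero.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the leading coefficient down."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def rational_zeros_by_search(coeffs, bound):
    """All zeros p/q with |p| <= bound and 1 <= q <= bound, by exhaustive search."""
    zeros = set()
    for q in range(1, bound + 1):
        for p in range(-bound, bound + 1):
            x = Fraction(p, q)
            if evaluate(coeffs, x) == 0:
                zeros.add(x)
    return zeros


monic = [1, -2, -5, 6]        # x^3 - 2x^2 - 5x + 6 = (x - 1)(x + 2)(x - 3)
zeros = rational_zeros_by_search(monic, 6)
print(sorted(zeros), all(z.denominator == 1 for z in zeros))

non_monic = [2, -1]           # 2x - 1 has the zero 1/2, which is not an integer
print(sorted(rational_zeros_by_search(non_monic, 6)))
```

The search is of course no proof; it only illustrates why the hypothesis that the polynomial is monic matters.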

The following ties algebraic numbers in general to corresponding algebraic integers.
Notice that if q ∈ ℚ, then there exists a rational integer n ≠ 0 such that nq ∈ ℤ. This
result generalizes this simple idea.

Theorem 20.1.5. If θ is an algebraic number, then there exists a rational integer r ≠ 0
such that rθ is an algebraic integer.

We saw that the set 𝒜 of all algebraic numbers is a subfield of ℂ. In the same
manner, the set ℐ of all algebraic integers forms a subring of 𝒜. First, we need the
following extension of a result on algebraic numbers.

Lemma 20.1.6. Suppose α1 , . . . , αn form the set of conjugates over ℚ of an algebraic in-
teger α. Then any integral symmetric function of α1 , . . . , αn is a rational integer.

Theorem 20.1.7. The set ℐ of all algebraic integers forms a subring of 𝒜.

We note that 𝒜, the field of algebraic numbers, is precisely the quotient field of
the ring of algebraic integers.
An algebraic number field is a finite extension of ℚ within ℂ. Since any finite ex-
tension of ℚ is a simple extension, each algebraic number field has the form K = ℚ(θ)
for some algebraic number θ.


Let K = ℚ(θ) be an algebraic number field, and let $R_K = K \cap \mathcal{I}$. Then $R_K$ forms a
subring of K called the algebraic integers, or integers of K. An analysis of the proof of
Theorem 20.1.5 shows that each β ∈ K can be written as

$$\beta = \frac{\alpha}{r}$$

with $\alpha \in R_K$ and r ∈ ℤ, r ≠ 0.
These rings of algebraic integers share many properties with the rational integers.
Whereas there may not be unique factorization into primes, there is always prime fac-
torization.

Theorem 20.1.8. Let K be an algebraic number field and RK its ring of integers. Then
each α ∈ RK is either 0, a unit, or can be factored into a product of primes.

We stress again that the prime factorization need not be unique. However, from
the existence of a prime factorization, we can extend Euclid’s original proof of the
infinitude of primes (see [43]) to obtain the following:

Corollary 20.1.9. There exist infinitely many primes in RK for any algebraic number ring
RK .

Just as any algebraic number field is finite dimensional over ℚ, we will see that
each $R_K$ is of finite degree over ℤ. That is, if K has degree n over ℚ, there
exist $\omega_1, \ldots, \omega_n$ in $R_K$ such that each $\alpha \in R_K$ is expressible as

$$\alpha = m_1 \omega_1 + \cdots + m_n \omega_n,$$

where $m_1, \ldots, m_n \in \mathbb{Z}$.

Definition 20.1.10. An integral basis for $R_K$ is a set of integers $\omega_1, \ldots, \omega_t \in R_K$ such
that each $\alpha \in R_K$ can be expressed uniquely as

$$\alpha = m_1 \omega_1 + \cdots + m_t \omega_t,$$

where $m_1, \ldots, m_t \in \mathbb{Z}$.

The finite degree comes from the following result that shows there does exist an
integral basis (see [43]):

Theorem 20.1.11. Let RK be the ring of integers in the algebraic number field K of degree
n over ℚ. Then there exists at least one integral basis for RK .

20.2 Integral ring extensions


We now extend the concept of an algebraic integer to general ring extensions. We first
need the idea of an R-algebra, where R is a commutative ring with identity 1 ≠ 0.


Definition 20.2.1. Let R be a commutative ring with an identity 1 ≠ 0. An R-algebra or
algebra over R is a unitary R-module A, in which there is an additional multiplication
such that the following hold:
(1) A is a ring with respect to the addition and this multiplication.
(2) (rx)y = x(ry) = r(xy) for all r ∈ R and x, y ∈ A.

As examples of R-algebras, first consider R = K, where K is a field, and let
$A = M_n(K)$, the set of all (n × n)-matrices over K. Then $M_n(K)$ is a K-algebra. Furthermore,
the set of polynomials K[x] is also a K-algebra.
We now define ring extensions. Let A be a ring, not necessarily commutative, with
an identity 1 ≠ 0, and R be a commutative subring of A, which contains 1. Assume that
R is contained in the center of A; that is, rx = xr for all r ∈ R and x ∈ A. We then call A
a ring extension of R and write A|R. If A|R is a ring extension, then A is an R-algebra in
a natural manner.
Let A be an R-algebra with an identity 1 ≠ 0. Then we have the canonical ring
homomorphism ϕ : R → A, r ↦ r ⋅ 1. The image R′ := ϕ(R) is a subring of the center
of A, and R′ contains the identity element of A. Then A|R′ is a ring extension (in the
above sense). Hence, if A is an R-algebra with an identity 1 ≠ 0, then we may consider
R as a subring of A and A|R as a ring extension.
We now will extend to the general context of ring extensions the ideas of inte-
gral elements and integral extensions. As above, let R be a commutative ring with an
identity 1 ≠ 0, and let A be an R-algebra.

Definition 20.2.2. An element a ∈ A is said to be integral over R, or integrally dependent
over R, if there is a monic polynomial f(x) = x^n + α_{n−1} x^{n−1} + ⋯ + α_0 ∈ R[x] of
degree n ≥ 1 over R with f(a) = a^n + α_{n−1} a^{n−1} + ⋯ + α_0 = 0. That is, a is integral over R
if it is a zero of a monic polynomial of degree ≥ 1 over R.
An equation that an integral element satisfies is called an integral equation of a
over R. If A has an identity 1 ≠ 0, then we may write a^0 = 1 and ∑_{i=0}^{n} α_i a^i = 0 with α_n = 1.

Example 20.2.3.
1. Let E|K be a field extension. An element a ∈ E is integral over K if and only if a is
   algebraic over K. If K is the quotient field of an integral domain R and a ∈ E is
   algebraic over K, then there exists an α ∈ R, α ≠ 0, with αa integral over R: if
   0 = α_n a^n + α_{n−1} a^{n−1} + ⋯ + α_0 with α_i ∈ R and α_n ≠ 0, then multiplication with α_n^{n−1}
   gives 0 = (α_n a)^n + α_{n−1} (α_n a)^{n−1} + ⋯ + α_n^{n−1} α_0, so we may take α = α_n.
2. The elements of ℂ, which are integral over ℤ are precisely the algebraic integers
over ℤ, that is, the zeros of monic polynomials over ℤ.

Theorem 20.2.4. Let R be as above and A an R-algebra with an identity 1 ≠ 0. If A is,
as an R-module, finitely generated, then each element of A is integral over R.

Proof. Let {b_1, . . . , b_n} be a finite generating system of A, as an R-module. We may assume
that b_1 = 1, otherwise add 1 to the system. As explained in the preliminaries,
without loss of generality, we may assume that R ⊂ A. Let a ∈ A. For each 1 ≤ j ≤ n, we
have an equation a b_j = ∑_{k=1}^{n} α_{kj} b_k for some α_{kj} ∈ R. In other words,

∑_{k=1}^{n} (α_{kj} − δ_{jk} a) b_k = 0 (⋆⋆)

for j = 1, . . . , n, where

δ_{jk} = 0 if j ≠ k, and δ_{jk} = 1 if j = k.

Define γ_{jk} := α_{kj} − δ_{jk} a and C = (γ_{jk})_{j,k}. C is an (n × n)-matrix over the commutative ring
R[a]. Recall that R[a] has an identity element. Let C̃ = (γ̃_{jk})_{j,k} be the complementary
matrix of C (see for instance [8]). Then C̃C = (det C)E_n. From (⋆⋆), we get

0 = ∑_{j=1}^{n} γ̃_{ij} (∑_{k=1}^{n} γ_{jk} b_k) = ∑_{k=1}^{n} ∑_{j=1}^{n} γ̃_{ij} γ_{jk} b_k = ∑_{k=1}^{n} (det C) δ_{ik} b_k = (det C) b_i

for all 1 ≤ i ≤ n. Since b_1 = 1, we have necessarily that det C = det(α_{jk} − δ_{jk} a)_{j,k} = 0
(recall that δ_{jk} = δ_{kj}). Hence, a is a zero of the monic polynomial f(x) = det(δ_{jk} x − α_{jk}) ∈
R[x] of degree n ≥ 1. Therefore, a is integral over R.
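
The determinant argument can be checked on a concrete case. The following is a minimal sketch under assumptions that are not in the text: R = ℤ, A = ℤ[√2, √3] with generating system (1, √2, √3, √6), and a = √2 + √3. The helper names `mat_mul` and `char_poly` are hypothetical, and the polynomial det(δ_{jk} x − α_{jk}) is computed with the Faddeev–LeVerrier recursion instead of the complementary matrix used in the proof.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(x*E - A), lowest degree first
    (Faddeev-LeVerrier recursion, exact arithmetic)."""
    n = len(A)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)                      # the polynomial is monic
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        # M_k = A * M_{k-1} + c_{n-k+1} * E
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        AMk = mat_mul(A, M)
        coeffs[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return coeffs

# Assumed example: entry [k][j] is alpha_kj, i.e. column j holds the
# coordinates of a*b_j in the system (1, sqrt2, sqrt3, sqrt6), a = sqrt2 + sqrt3.
ALPHA = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]

if __name__ == "__main__":
    # Coefficients of 1, x, x^2, x^3, x^4 of an integral equation for a.
    print([int(c) for c in char_poly(ALPHA)])
```

For this choice the resulting monic polynomial is x^4 − 10x^2 + 1, which indeed vanishes at √2 + √3.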

Definition 20.2.5. A ring extension A|R is called an integral extension if each element
of A is integral over R. A ring extension A|R is called finite if A, as an R-module, is finitely
generated.

Recall that finite field extensions are algebraic extensions. As an immediate con-
sequence of Theorem 20.2.4, we get the corresponding result for ring extensions.

Theorem 20.2.6. Each finite ring extension A|R is an integral extension.

Theorem 20.2.7. Let A be an R-algebra with an identity 1 ≠ 0. If a ∈ A, then the following
are equivalent:
(1) a is integral over R.
(2) The subalgebra R[a] is, as an R-module, finitely generated.
(3) There exists a subalgebra A′ of A, which contains a, and which is, as an R-module,
finitely generated.

A subalgebra of an algebra over R is a submodule, which is also a subring.

Proof. (1) ⇒ (2): We have R[a] = {g(a) : g ∈ R[x]}. Let f(a) = 0 be an integral equation
of a over R. Since f is monic, by the division algorithm, for each g ∈ R[x], there are
h, r ∈ R[x] with g = h ⋅ f + r and r = 0, or r ≠ 0 and deg(r) < deg(f) =: n. Since
g(a) = h(a)f(a) + r(a) = r(a), we get that {1, a, . . . , a^{n−1}} is a generating system for the
R-module R[a].
(2) ⇒ (3): Take A′ = R[a].
(3) ⇒ (1): Use Theorem 20.2.4 for A′.
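
The step (1) ⇒ (2) rests on the fact that division by a monic polynomial needs no division in R. A minimal sketch, assuming R = ℤ and coefficient lists ordered from the constant term upward; the function name `divmod_monic` and the sample polynomial x^2 − x − 1 are illustrative choices, not taken from the text.

```python
def divmod_monic(g, f):
    """Divide g by the monic polynomial f over the integers.

    Polynomials are coefficient lists, constant term first.
    Returns (h, r) with g = h*f + r and deg r < deg f.
    """
    assert f[-1] == 1, "f must be monic"
    n = len(f) - 1                       # degree of f
    r = list(g)
    h = [0] * max(len(g) - n, 0)
    for i in range(len(r) - 1, n - 1, -1):
        c = r[i]                         # current leading coefficient
        h[i - n] = c
        for j in range(n + 1):           # subtract c * x^(i-n) * f
            r[i - n + j] -= c * f[j]
    return h, r[:n]

if __name__ == "__main__":
    # a is a zero of f = x^2 - x - 1; reduce g = x^5 to an R-combination of 1, a.
    h, r = divmod_monic([0, 0, 0, 0, 0, 1], [-1, -1, 1])
    print(h, r)   # r = [3, 5] means a^5 = 3 + 5a
```

Only additions and multiplications in ℤ occur, which is why {1, a, . . . , a^{n−1}} generates R[a] as an R-module for any commutative ring R.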


For the remainder of this chapter, all rings are commutative with an identity 1 ≠ 0.

Theorem 20.2.8. Let A|R and B|A be finite ring extensions. Then also B|R is finite.

Proof. From A = Re_1 + ⋯ + Re_m and B = Af_1 + ⋯ + Af_n, we get B = ∑_{i,j} R e_i f_j, where the sum runs over 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Theorem 20.2.9. Let A|R be a ring extension. Then the following are equivalent:
(1) There are finitely many elements a1 , . . . , am in A, each integral over R, such that A =
R[a1 , . . . , am ].
(2) A|R is finite.

Proof. (2) ⇒ (1): We only need to take for a_1, . . . , a_m a generating system of A as an
R-module, and the result holds, because A = Ra_1 + ⋯ + Ra_m, and each a_i is integral
over R by Theorem 20.2.4.
(1) ⇒ (2): We use induction on m. If m = 0, then there is nothing to prove. Now let
m ≥ 1, and assume that (1) holds. Define A′ = R[a_1, . . . , a_{m−1}]. Then A = A′[a_m], and
a_m is integral over A′. A|A′ is finite by Theorem 20.2.7. By the induction assumption,
A′|R is finite. Then A|R is finite by Theorem 20.2.8.

Definition 20.2.10. Let A|R be a ring extension. Then the subset C = {a ∈ A :
a is integral over R} ⊂ A is called the integral closure of R in A.

Theorem 20.2.11. Let A|R be a ring extension. Then the integral closure C of R in A is a
subring of A with R ⊂ C.

Proof. R ⊂ C, because α ∈ R is a zero of the polynomial x − α. Let a, b ∈ C. We consider
the subalgebra R[a, b] of the R-algebra A. R[a, b]|R is finite by Theorem 20.2.9. Hence,
by Theorem 20.2.4, all elements from R[a, b] are integral over R; that is, R[a, b] ⊂ C. In
particular, a + b, a − b, and ab are in C.

We extend to ring extensions the idea of a closure:

Definition 20.2.12. Let A|R be a ring extension. R is called integrally closed in A, if R itself
is its integral closure in A; that is, R = C, the integral closure of R in A.

Theorem 20.2.13. For each ring extension A|R, the integral closure C of R in A, is inte-
grally closed in A.

Proof. Let a ∈ A be integral over C. Then a^n + α_{n−1} a^{n−1} + ⋯ + α_0 = 0 for some α_i ∈ C,
n ≥ 1. Then a is also integral over the R-subalgebra A′ = R[α_0, . . . , α_{n−1}] of C, and A′|R is
finite by Theorem 20.2.9. Furthermore, A′[a]|A′ is finite by Theorem 20.2.7. Hence, A′[a]|R
is finite by Theorem 20.2.8. By Theorem 20.2.4, then
a ∈ A′[a] is already integral over R, that is, a ∈ C.

Theorem 20.2.14. Let A|R and B|A be ring extensions. If A|R and B|A are integral exten-
sions, then also B|R is an integral extension (and certainly vice versa).


Proof. Let C be the integral closure of R in B. We have A ⊂ C, since A|R is integral.
Together with B|A, we also have that B|C is integral. By Theorem 20.2.13, we get that C
is integrally closed in B. Hence, B = C.

We now consider integrally closed integral domains.

Definition 20.2.15. An integral domain R is called integrally closed if R is integrally
closed in its quotient field K.

Theorem 20.2.16. Each unique factorization domain R is integrally closed.

Proof. Let α ∈ K and α = a/b with a, b ∈ R, a ≠ 0, b ≠ 0. Since R is a unique factorization
domain, we may assume that a and b are relatively prime. Let α be integral over R.
Then we have over R an integral equation α^n + a_{n−1} α^{n−1} + ⋯ + a_0 = 0 for α. Multiplication
with b^n gives a^n + b a_{n−1} a^{n−1} + ⋯ + b^n a_0 = 0. Hence, b is a divisor of a^n. Since a and b are
relatively prime in R, we have that b is a unit in R. Hence, α = a/b ∈ R.
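
For R = ℤ this is the familiar statement that every rational zero of a monic integer polynomial is an integer. A minimal sketch, assuming integer coefficient lists with nonzero constant and leading terms; the helper names `divisors` and `rational_roots` and both sample polynomials are hypothetical illustrations.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """All rational zeros of an integer polynomial (constant term first).

    Candidates are +-a/b with a dividing the constant term and b dividing
    the leading coefficient; both are assumed to be nonzero.
    """
    roots = set()
    for a in divisors(coeffs[0]):
        for b in divisors(coeffs[-1]):
            for cand in (Fraction(a, b), Fraction(-a, b)):
                if sum(c * cand ** i for i, c in enumerate(coeffs)) == 0:
                    roots.add(cand)
    return roots

if __name__ == "__main__":
    monic = [2, -1, -2, 1]        # x^3 - 2x^2 - x + 2
    non_monic = [1, -3, 2]        # 2x^2 - 3x + 1
    print(sorted(rational_roots(monic)))      # only integers occur
    print(sorted(rational_roots(non_monic)))  # 1/2 occurs: not integral over Z
```

In the monic case the only possible denominator b is 1, which mirrors the conclusion that b is a unit.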

Theorem 20.2.17. Let R be an integral domain and K its quotient field. Let E|K be a finite
field extension. Let R be integrally closed and α ∈ E integral over R. Then the minimal
polynomial g ∈ K[x] of α over K has only coefficients of R.

Proof. Let g ∈ K[x] be the minimal polynomial of α over K (recall that g is monic by
definition). Let Ē be an algebraic closure of E. Then g(x) = (x − α_1) ⋯ (x − α_n) with α_1 = α
over Ē. There are K-isomorphisms σ_i : K(α) → Ē with σ_i(α) = α_i. Hence, all α_i are also
integral over R. Since all coefficients of g are polynomial expressions C_j(α_1, . . . , α_n) in
the α_i, we get that all coefficients of g are integral over R (see Theorem 20.2.11). Now
g ∈ R[x], because g ∈ K[x], and R is integrally closed.
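
For R = ℤ and a quadratic field this gives a practical integrality test: r + s√d with rational r, s and s ≠ 0 has minimal polynomial x^2 − 2r x + (r^2 − d s^2), so it is integral over ℤ exactly when both coefficients are integers. A minimal sketch under these assumptions; the function names are hypothetical, and d is assumed to be a square-free integer different from 1.

```python
from fractions import Fraction

def quadratic_min_poly(r, s, d):
    """Coefficients (c1, c0) of x^2 + c1*x + c0 with zero r + s*sqrt(d).

    For s != 0 this is the minimal polynomial over Q.
    """
    r, s = Fraction(r), Fraction(s)
    return (-2 * r, r * r - d * s * s)

def is_integral(r, s, d):
    """True if r + s*sqrt(d) is integral over Z (coefficient test)."""
    c1, c0 = quadratic_min_poly(r, s, d)
    return c1.denominator == 1 and c0.denominator == 1

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(is_integral(half, half, 5))   # (1 + sqrt 5)/2, zero of x^2 - x - 1
    print(is_integral(half, half, 3))   # (1 + sqrt 3)/2, zero of x^2 - x - 1/2
```

The two sample calls differ only in d, which reflects the case distinction d ≡ 1 (mod 4) versus d ≡ 2, 3 (mod 4) for integral bases of quadratic fields.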

Theorem 20.2.18. Let R be an integrally closed integral domain and K its quotient field.
Let f , g, h ∈ K[x] be monic polynomials over K with f = gh. If f ∈ R[x], then also g, h ∈
R[x].

Proof. Let E be the splitting field of f over K. Over E, we have f(x) = (x − α_1) ⋯ (x − α_n).
Since f is monic, all α_k are integral over R (see the proof of Theorem 20.2.17). Since
f = gh, there are I, J ⊂ {1, . . . , n} with g(x) = ∏_{i∈I} (x − α_i) and h(x) = ∏_{j∈J} (x − α_j). As
polynomial expressions in the α_i, i ∈ I, and α_j, j ∈ J, respectively, the coefficients of
g and h, respectively, are integral over R. On the other hand, all these coefficients are
in K, and R is integrally closed. Hence, g, h ∈ R[x].

Theorem 20.2.19. Let E|R be an integral ring extension. If E is a field, then also R is a
field.

Proof. Let α ∈ R \ {0}. The element 1/α ∈ E satisfies an integral equation (1/α)^n + a_{n−1}(1/α)^{n−1} +
⋯ + a_0 = 0 over R. Multiplication with α^{n−1} gives 1/α = −a_{n−1} − a_{n−2} α − ⋯ − a_0 α^{n−1} ∈ R.
Hence, R is a field.


20.3 Transcendental field extensions


Recall that a transcendental number is an element of ℂ that is not algebraic over ℚ.
More generally, if E|K is a field extension, then an element α ∈ E is transcendental over
K if it is not algebraic; that is, it is not a zero of any nonzero polynomial f(x) ∈ K[x]. Since
finite extensions are algebraic, clearly E|K will contain transcendental elements only if
[E : K] = ∞. However, this is not sufficient. The field 𝒜 of algebraic numbers is
algebraic over ℚ, but infinite dimensional over ℚ. We now extend the idea of a
transcendental number to that of a transcendental extension.
Let K ⊂ E be fields; that is, E|K is a field extension. Let M be a subset of E. The
algebraic cover of M in E is defined to be the algebraic closure H(M) of K(M) in E; that
is, HK,E (M) = H(M) = {α ∈ E : α algebraic over K(M)}. H(M) is a field with K ⊂ K(M) ⊂
H(M) ⊂ E. α ∈ E is called algebraically dependent on M (over K) if α ∈ H(M); that is, if
α is algebraic over K(M).
The following are clear:
1. M ⊂ H(M),
2. M ⊂ M′ ⇒ H(M) ⊂ H(M′), and
3. H(H(M)) = H(M).

Definition 20.3.1.
(a) M is said to be algebraically independent (over K) if α ∉ H(M \ {α}) for all α ∈ M;
that is, if each α ∈ M is transcendental over K(M \ {α}).
(b) M is said to be algebraically dependent (over K) if M is not algebraically indepen-
dent.

The proofs of the statements in the following lemma are straightforward:

Lemma 20.3.2.
(1) M is algebraically dependent if and only if there exists an α ∈ M, which is algebraic
over K(M \ {α}).
(2) Let α ∈ M. Then α ∈ H(M \ {α}) ⇔ H(M) = H(M \ {α}).
(3) If α ∉ M and α is algebraic over K(M), then M ∪ {α} is algebraically dependent.
(4) M is algebraically dependent if and only if there is a finite subset in M, which is
algebraically dependent.
(5) M is algebraically independent if and only if each finite subset of M is algebraically
independent.
(6) M is algebraically independent if and only if the following holds: If α1 , . . . , αn are
finitely many, pairwise different elements of M, then the canonical homomorphism
ϕ : K[x1 , . . . , xn ] → E, f(x1 , . . . , xn ) ↦ f(α1 , . . . , αn ) is injective; or in other words,
for all f ∈ K[x1 , . . . , xn ], we have that f = 0 if f(α1 , . . . , αn ) = 0. That is, there is no
nontrivial algebraic relation between the α1 , . . . , αn over K.


(7) Let M ⊂ E, α ∈ E. If M is algebraically independent and M ∪ {α} algebraically
dependent, then α ∈ H(M); that is, α is algebraically dependent on M.
(8) Let M ⊂ E, B ⊂ M. If B is a maximal algebraically independent subset of M, that is,
if B is algebraically independent and B ∪ {α} is algebraically dependent for each
α ∈ M \ B, then M ⊂ H(B). That is, each element of M
is algebraic over K(B).
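
A small worked instance of (6) and (8), given only as an illustration and under the assumption that E = K(t) with t transcendental over K and M = {t^2, t^3}:

```latex
% Assumed example, not from the text: E = K(t), M = \{t^2, t^3\}.
\[
  f(x_1, x_2) = x_1^{3} - x_2^{2} \neq 0, \qquad
  f(t^{2}, t^{3}) = (t^{2})^{3} - (t^{3})^{2} = t^{6} - t^{6} = 0 ,
\]
% so M is algebraically dependent over K by (6).
% B = \{t^2\} is algebraically independent, and t^3 is a zero of
% x^2 - (t^2)^3 \in K(t^2)[x], hence
\[
  t^{3} \in H(\{t^{2}\}), \qquad M \subset H(B).
\]
```

Here B = {t^2} is a maximal algebraically independent subset of M, in line with (8).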

We will show that any field extension can be decomposed into a purely transcendental
extension followed by an algebraic extension. We need the idea of a transcendence basis.

Definition 20.3.3. B ⊂ E is called a transcendence basis of the field extension E|K if
the following two conditions are satisfied:
1. E = H(B), that is, the extension E|K(B) is algebraic.
2. B is algebraically independent over K.

Theorem 20.3.4. If B ⊂ E, then the following are equivalent:


(1) B is a transcendence basis of E|K.
(2) If B ⊂ M ⊂ E with H(M) = E, then B is a maximal algebraically independent subset
of M.
(3) There exists a subset M ⊂ E with H(M) = E, which contains B as a maximal alge-
braically independent subset.

Proof. (1) ⇒ (2): Let α ∈ M \ B. We have to show that B ∪ {α} is algebraically dependent.
But this is clear, because α ∈ H(B) = E.
(2) ⇒ (3): We just take M = E.
(3) ⇒ (1): We have to show that H(B) = E. Certainly, M ⊂ H(B). Hence, E = H(M) ⊂
H(H(B)) = H(B) ⊂ E.

We next show that any field extension does have a transcendence basis:

Theorem 20.3.5. Each field extension E|K has a transcendence basis. More concretely,
if there is a subset M ⊂ E such that E|K(M) is algebraic and if there is a subset C ⊂ M,
which is algebraically independent, then there exists a transcendence basis B of E|K with
C ⊂ B ⊂ M.

Proof. We have to extend C to a maximal algebraically independent subset B of M.
By Theorem 20.3.4, such a B is a transcendence basis of E|K. If M is finite, then such
a B certainly exists. Now let M be not finite. We argue analogously as for the existence
of a basis of a vector space, for instance, with Zorn’s lemma: If a partially ordered,
nonempty set S is inductive, then there exist maximal elements in S. Here, a partially
ordered, nonempty set S is said to be inductive if every totally ordered subset of S has
an upper bound in S. The set N of all algebraically independent subsets of M, which
contain C, is partially ordered with respect to “⊂”, and N ≠ ∅, because C ∈ N. Let 𝒦 ≠ ∅
be a chain in N, for instance, an ascending chain Y_1 ⊂ Y_2 ⊂ ⋯
in N. The union U = ⋃_{Y∈𝒦} Y is also algebraically independent by Lemma 20.3.2 (5),
because each finite subset of U is contained in some Y ∈ 𝒦. Hence, U is an upper bound
of 𝒦 in N, and N is inductive. By Zorn’s lemma, there exists a
maximal algebraically independent subset B ⊂ M with C ⊂ B.

Theorem 20.3.6. Let E|K be a field extension and M a subset of E, for which E|K(M) is
algebraic. Let C be an arbitrary subset of E, which is algebraically independent over K.
Then there exists a subset M′ ⊂ M with C ∩ M′ = ∅ such that C ∪ M′ is a transcendence
basis of E|K.

Proof. Apply Theorem 20.3.5 to M ∪ C in place of M, and define M′ := B \ C.

Theorem 20.3.7. Let B, B′ be two transcendence bases of the field extension E|K. Then
there is a bijection ϕ : B → B′. In other words, any two transcendence bases of E|K have
the same cardinal number.

Proof. (a) If B is a transcendence basis of E|K and M is a subset of E such that E|K(M)
is algebraic, then we may write B = ⋃_{α∈M} B_α with finite sets B_α. In particular, if B
is infinite, then the cardinal number of B is not bigger than the cardinal number
of M.
(b) Let B and B′ be two transcendence bases of E|K. If B and B′ are both infinite,
then B and B′ have the same cardinal number by (a) and the theorem by Schroeder–
Bernstein [9]. We now prove Theorem 20.3.7 for the case that E|K has a finite
transcendence basis. Let B be finite with n elements. Let C be an arbitrary algebraically
independent subset in E over K with m elements. We show that m ≤ n. Let C = {α_1, . . . , α_m}
with m ≥ n. We show, by induction, that for each integer k, 0 ≤ k ≤ n, there are
subsets B = B_0 ⫌ B_1 ⫌ ⋯ ⫌ B_k of B such that {α_1, . . . , α_k} ∪ B_k is a transcendence basis of
E|K, and {α_1, . . . , α_k} ∩ B_k = ∅. For k = 0, we take B_0 = B, and the statement holds.
Assume now that the statement is correct for some k with 0 ≤ k < n. By Theorems 20.3.4 and 20.3.5,
there is a subset B_{k+1} of {α_1, . . . , α_k} ∪ B_k such that {α_1, . . . , α_{k+1}} ∪ B_{k+1} is a
transcendence basis of E|K, and {α_1, . . . , α_{k+1}} ∩ B_{k+1} = ∅. Then necessarily, B_{k+1} ⊂ B_k. Assume
B_k = B_{k+1}. Then on the one hand, B_k ∪ {α_1, . . . , α_{k+1}} is algebraically independent because
B_k = B_{k+1}. On the other hand, B_k ∪ {α_1, . . . , α_k} ∪ {α_{k+1}} is algebraically dependent,
because B_k ∪ {α_1, . . . , α_k} is a transcendence basis of E|K that does not contain α_{k+1}.
This gives a contradiction. Hence, B_{k+1} ⫋ B_k. Now B_k has at most n − k elements.
Therefore, B_n = ∅; that is, {α_1, . . . , α_n} = {α_1, . . . , α_n} ∪ B_n is a transcendence basis of
E|K. Because C = {α_1, . . . , α_m} is algebraically independent, we cannot have m > n.
Thus, m ≤ n, and B and B′ have the same number of elements, because B′ must also
be finite.

Since the cardinality of any transcendence basis for a field extension E|K is the
same, we can define the transcendence degree.

Definition 20.3.8. The transcendence degree trgd(E|K) of a field extension is the car-
dinal number of one (and hence of each) transcendence basis of E|K. A field extension
E|K is called purely transcendental, if E|K has a transcendence basis B with E = K(B).

We note the following facts:


(1) If E|K is purely transcendental and B = {α1 , . . . , αn } is a transcendence basis of E|K,
then E is K-isomorphic to the quotient field of the polynomial ring K[x1 , . . . , xn ] of
the independent indeterminates x1 , . . . , xn .
(2) K is algebraically closed in E if E|K is purely transcendental.
(3) By Theorem 20.3.5, the field extension E|K has an intermediate field F, K ⊂ F ⊂ E,
such that F|K is purely transcendental, and E|F is algebraic. Certainly F is not
uniquely determined.
For example, for ℚ ⊂ F ⊂ ℚ(i, π), we may take F = ℚ(π), but also
F = ℚ(iπ), for instance.
(4) trgd(ℝ|ℚ) = trgd(ℂ|ℚ) = card ℝ, the cardinal number of ℝ. This holds, because
the set of the algebraic numbers (over ℚ) is countable.

Theorem 20.3.9. Let E|K be a field extension and F an arbitrary intermediate field,
K ⊂ F ⊂ E. Let B be a transcendence basis of F|K and B′ a transcendence basis of E|F.
Then B ∩ B′ = ∅, and B ∪ B′ is a transcendence basis of E|K. In particular, trgd(E|K) =
trgd(E|F) + trgd(F|K).

Proof. (1) Assume α ∈ B ∩ B′. As an element of F, α is algebraic over F(B′ \ {α}). But
this gives a contradiction, because α ∈ B′, and B′ is algebraically independent over F.
(2) F|K(B) is an algebraic extension, and hence so is F(B′)|K(B ∪ B′), where
K(B ∪ B′) = K(B)(B′). Since E|F(B′) is algebraic and the
relation “algebraic extension” is transitive, we have that E|K(B ∪ B′) is algebraic.
(3) Finally, we have to show that B ∪ B′ is algebraically independent over K. By
Theorems 20.3.5 and 20.3.6, there is a subset B″ of B ∪ B′ with B ∩ B″ = ∅ such that B ∪ B″ is a
transcendence basis of E|K. We have B″ ⊂ B′, and have to show that B′ ⊂ B″. Assume
that there is an α ∈ B′ with α ∉ B″. Then α is algebraic over K(B ∪ B″) = K(B)(B″), and
hence algebraic over F(B″). Since B″ ⊂ B′ \ {α}, this contradicts the algebraic independence
of B′ over F. Hence, B″ = B′.

Theorem 20.3.10 (Noether’s normalization theorem). Let K be a field and
A = K[a_1, . . . , a_n]. Then there exist elements u_1, . . . , u_m, 0 ≤ m ≤ n, in A with the
following properties:
(1) K[u_1, . . . , u_m] is K-isomorphic to the polynomial ring K[x_1, . . . , x_m] of the independent
indeterminates x_1, . . . , x_m.
(2) The ring extension A|K[u_1, . . . , u_m] is an integral extension; that is, for each a ∈
A \ K[u_1, . . . , u_m] there exists a monic polynomial f(x) = x^n + α_{n−1} x^{n−1} + ⋯ + α_0 ∈
K[u_1, . . . , u_m][x] of degree n ≥ 1 with f(a) = a^n + α_{n−1} a^{n−1} + ⋯ + α_0 = 0. In particular,
A|K[u_1, . . . , u_m] is finite.

Proof. Without loss of generality, let the a_1, . . . , a_n be pairwise different. We prove the
theorem by induction on n. If n = 1, then there is nothing to show. Now, let n ≥ 2, and
assume that the statement holds for n − 1. If there is no nontrivial algebraic relation
f(a_1, . . . , a_n) = 0 over K between the a_1, . . . , a_n, then there is nothing to show. Hence,
suppose there exists a polynomial f ∈ K[x_1, . . . , x_n] with f ≠ 0 and f(a_1, . . . , a_n) = 0. Let

f = ∑_{ν=(ν_1,...,ν_n)} c_ν x_1^{ν_1} ⋯ x_n^{ν_n}.

Let μ_2, μ_3, . . . , μ_n be natural numbers, which we specify later.
Define b_2 = a_2 − a_1^{μ_2}, b_3 = a_3 − a_1^{μ_3}, . . . , b_n = a_n − a_1^{μ_n}. Then a_i = b_i + a_1^{μ_i} for 2 ≤ i ≤ n,
hence, f(a_1, b_2 + a_1^{μ_2}, . . . , b_n + a_1^{μ_n}) = 0. We write R := K[x_1, . . . , x_n] and consider the
polynomial ring R[y_2, . . . , y_n] of the n − 1 independent indeterminates y_2, . . . , y_n over R.
In R[y_2, . . . , y_n], we consider the polynomial f(x_1, y_2 + x_1^{μ_2}, . . . , y_n + x_1^{μ_n}). We may rewrite
this polynomial as

∑_{ν=(ν_1,...,ν_n)} c_ν x_1^{ν_1 + μ_2 ν_2 + ⋯ + μ_n ν_n} + g(x_1, y_2, . . . , y_n)

with a polynomial g(x_1, y_2, . . . , y_n), for which, as a polynomial in x_1 over K[y_2, . . . , y_n],
the degree in x_1 is smaller than the degree of ∑_{ν=(ν_1,...,ν_n)} c_ν x_1^{ν_1 + μ_2 ν_2 + ⋯ + μ_n ν_n}, provided that
we may choose the μ_2, . . . , μ_n in such a way that this really holds. We now specify the
μ_2, . . . , μ_n. We write μ := (1, μ_2, . . . , μ_n), and define the scalar product μν = 1 ⋅ ν_1 + μ_2 ν_2 +
⋯ + μ_n ν_n. Choose p ∈ ℕ with p > deg(f) = max{ν_1 + ⋯ + ν_n : c_ν ≠ 0}. We now
take μ = (1, p, p^2, . . . , p^{n−1}). If ν = (ν_1, . . . , ν_n) with c_ν ≠ 0 and ν′ = (ν′_1, . . . , ν′_n) with
c_{ν′} ≠ 0 are different n-tuples, then indeed μν ≠ μν′, because ν_i, ν′_i < p for all i, 1 ≤ i ≤ n.
This follows from the uniqueness of the p-adic expression of a natural number. Hence,
we may choose μ_2, . . . , μ_n such that f(x_1, y_2 + x_1^{μ_2}, . . . , y_n + x_1^{μ_n}) = c x_1^N + h(x_1, y_2, . . . , y_n)
with c ∈ K, c ≠ 0, and h ∈ K[y_2, . . . , y_n][x_1] has in x_1 a degree < N. If we divide by
c and take a_1, b_2, . . . , b_n for x_1, y_2, . . . , y_n, then we get an integral equation of a_1 over
K[b_2, . . . , b_n]. Therefore, the ring extension A = K[a_1, . . . , a_n]|K[b_2, . . . , b_n] is integral
(see Theorem 20.2.9), since a_i = b_i + a_1^{μ_i} for 2 ≤ i ≤ n. By induction, there exist elements
u_1, . . . , u_m in K[b_2, . . . , b_n] with the following properties:
1. K[u_1, . . . , u_m] is a polynomial ring of the m independent indeterminates u_1, . . . , u_m,
and
2. K[b_2, . . . , b_n]|K[u_1, . . . , u_m] is integral.

Hence, also A|K[u_1, . . . , u_m] is integral by Theorem 20.2.14.
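
The choice μ = (1, p, p^2, . . . , p^{n−1}) can be tried out on a small polynomial. A minimal sketch, assuming a polynomial is stored as a dictionary from exponent tuples ν to nonzero coefficients c_ν and taking p = deg(f) + 1 (any larger p works as well); the function name `noether_exponents` and the sample f = x_1 x_2 − 1 are hypothetical.

```python
def noether_exponents(f):
    """Exponent vector mu = (1, p, ..., p^(n-1)) and the weights mu*nu.

    f maps exponent tuples (nu_1, ..., nu_n) to nonzero coefficients.
    Assumption of this sketch: p = deg(f) + 1.
    """
    n = len(next(iter(f)))
    p = max(sum(nu) for nu in f) + 1
    mu = [p ** i for i in range(n)]
    weights = {nu: sum(m * e for m, e in zip(mu, nu)) for nu in f}
    return mu, weights

if __name__ == "__main__":
    f = {(1, 1): 1, (0, 0): -1}          # f = x1*x2 - 1
    mu, weights = noether_exponents(f)
    N = max(weights.values())            # degree in x1 after substitution
    print(mu, weights, N)
```

For this f the substitution x_2 = y_2 + x_1^3 gives x_1^4 + x_1 y_2 − 1, which is monic in x_1 of degree N = 4, so a_1 is integral over K[b_2].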

Corollary 20.3.11. Let E|K be a field extension. If E = K[a_1, . . . , a_n] for a_1, . . . , a_n ∈ E,
then E|K is algebraic.

Proof. By Theorem 20.3.10, we have that E contains a polynomial ring K[u_1, . . . , u_m],
0 ≤ m ≤ n, of the m independent indeterminates u_1, . . . , u_m as a subring, for which
E|K[u_1, . . . , u_m] is integral. We claim that then already K[u_1, . . . , u_m] is a field. To prove
that, let a ∈ K[u_1, . . . , u_m], a ≠ 0. The element a^{−1} ∈ E satisfies an integral equation
(a^{−1})^n + α_{n−1}(a^{−1})^{n−1} + ⋯ + α_0 = 0 over K[u_1, . . . , u_m] =: R. Hence, a^{−1} = −α_{n−1} − α_{n−2} a −
⋯ − α_0 a^{n−1} ∈ R. Therefore, R is a field, which proves the claim. This is possible only
for m = 0, and then E|K is integral; here, that is algebraic.


20.4 The transcendence of e and π


Although we have shown that within ℂ, there are continuum many transcendental
numbers, we have only shown that one particular number is transcendental. In this
section, we prove that the numbers e and π are transcendental. We start with e.

Theorem 20.4.1. e is a transcendental number, that is, transcendental over ℚ.

Proof. Let f(x) ∈ ℝ[x] with the degree of f(x) = m ≥ 1. Let z_1 ∈ ℂ, z_1 ≠ 0, and γ :
[0, 1] → ℂ, γ(t) = t z_1. Let

I(z_1) = ∫_γ e^{z_1−z} f(z) dz = (∫_0^{z_1})_γ e^{z_1−z} f(z) dz.

By (∫_0^{z_1})_γ, we mean the integral from 0 to z_1 along γ. Recall that

(∫_0^{z_1})_γ e^{z_1−z} f(z) dz = −f(z_1) + e^{z_1} f(0) + (∫_0^{z_1})_γ e^{z_1−z} f′(z) dz.

It follows then by repeated partial integration that
(1) I(z_1) = e^{z_1} ∑_{j=0}^{m} f^{(j)}(0) − ∑_{j=0}^{m} f^{(j)}(z_1).

Let |f|(x) be the polynomial we get if we replace the coefficients of f(x) by their absolute
values. Since |e^{z_1−z}| ≤ e^{|z_1−z|} ≤ e^{|z_1|}, we get
(2) |I(z_1)| ≤ |z_1| e^{|z_1|} |f|(|z_1|).
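
Identity (1) can be sanity-checked numerically. A minimal sketch, restricted (as an assumption of this illustration) to real z_1 > 0, so that the path γ is the real segment [0, z_1]; the integral is approximated with the composite Simpson rule, and all function names are hypothetical.

```python
import math

def poly_eval(coeffs, x):
    """Value of the polynomial with coefficients [c_0, c_1, ...] at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def derivative(coeffs):
    """Coefficient list of the derivative."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def I_closed_form(coeffs, z1):
    """Right-hand side of (1): e^z1 * sum f^(j)(0) - sum f^(j)(z1)."""
    s0 = s1 = 0.0
    g = list(coeffs)
    while g:                      # runs over f, f', f'', ... until it vanishes
        s0 += poly_eval(g, 0.0)
        s1 += poly_eval(g, z1)
        g = derivative(g)
    return math.exp(z1) * s0 - s1

def I_numeric(coeffs, z1, steps=2000):
    """Composite Simpson approximation of the integral over [0, z1]."""
    h = z1 / steps                # steps must be even
    total = 0.0
    for i in range(steps + 1):
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        z = i * h
        total += w * math.exp(z1 - z) * poly_eval(coeffs, z)
    return total * h / 3

if __name__ == "__main__":
    f = [0, 0, 1]                 # f(z) = z^2, so I(1) = 2e - 5
    print(I_closed_form(f, 1.0), I_numeric(f, 1.0))
```

For f(z) = z^2 and z_1 = 1 both values agree with 2e − 5 up to rounding.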

Now assume that e is an algebraic number; that is,
(3) q_0 + q_1 e + ⋯ + q_n e^n = 0 for n ≥ 1 and integers q_0 ≠ 0, q_1, . . . , q_n, and the greatest
common divisor of q_0, q_1, . . . , q_n is equal to 1.

For a detailed proof of these facts see for instance [42]. We consider now the polynomial
f(x) = x^{p−1}(x − 1)^p ⋯ (x − n)^p with p a sufficiently large prime number, and we
consider I(z_1) with respect to this polynomial. Let

J = q0 I(0) + q1 I(1) + ⋅ ⋅ ⋅ + qn I(n).

From (1) and (3), we get that

J = −∑_{j=0}^{m} ∑_{k=0}^{n} q_k f^{(j)}(k),

where m = (n + 1)p − 1, since (q_0 + q_1 e + ⋯ + q_n e^n)(∑_{j=0}^{m} f^{(j)}(0)) = 0.

Now, f^{(j)}(k) = 0 if j < p, k > 0, and if j < p − 1, k = 0. Hence, f^{(j)}(k) is
an integer that is divisible by p! for all j, k, except for j = p − 1, k = 0. Furthermore,
f^{(p−1)}(0) = (p − 1)!(−1)^{np}(n!)^p. Hence, if p > n, then f^{(p−1)}(0) is an integer divisible by
(p − 1)!, but not by p!.
It follows that J is a nonzero integer that is divisible by (p − 1)! if p > |q_0| and p > n.
So let p > n, p > |q_0|, so that |J| ≥ (p − 1)!.
Now, |f|(k) ≤ (2n)^m. Together with (2), we then get that

|J| ≤ |q_1| e |f|(1) + ⋯ + |q_n| n e^n |f|(n) ≤ c^p

for a number c independent of p. It follows that

(p − 1)! ≤ |J| ≤ c^p;

that is,

1 ≤ |J|/(p − 1)! ≤ c ⋅ c^{p−1}/(p − 1)!.

This gives a contradiction, since c^{p−1}/(p − 1)! → 0 as p → ∞. Therefore, e is
transcendental.
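
The final step only needs some prime p with (p − 1)! > c^p, and such a prime exists for every fixed c. A minimal sketch with exact integer arithmetic, assuming an integer constant c ≥ 1; the value c = 10 and the function names are hypothetical illustrations and not the constant arising in the proof.

```python
from math import factorial

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def first_prime_beating(c):
    """Smallest prime p with (p-1)! > c**p, for an integer c >= 1."""
    p = 2
    while not (is_prime(p) and factorial(p - 1) > c ** p):
        p += 1
    return p

if __name__ == "__main__":
    c = 10
    p = first_prime_beating(c)
    print(p, factorial(p - 1) > c ** p)
```

Since the factorial eventually outgrows every exponential, the loop terminates for each c, which is exactly why the inequality (p − 1)! ≤ c^p cannot hold for all large primes p.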

We now move on to the transcendence of π. We first need the following lemma:

Lemma 20.4.2. Suppose α ∈ ℂ is an algebraic number and f(x) = a_n x^n + ⋯ + a_0, n ≥ 1,
a_n ≠ 0, and all a_i ∈ ℤ (f(x) ∈ ℤ[x]) with f(α) = 0. Then a_n α is an algebraic integer.

Proof.

a_n^{n−1} f(x) = a_n^n x^n + a_n^{n−1} a_{n−1} x^{n−1} + ⋯ + a_n^{n−1} a_0
             = (a_n x)^n + a_{n−1} (a_n x)^{n−1} + ⋯ + a_n^{n−1} a_0
             = g(a_n x) = g(y) ∈ ℤ[y],

where y = a_n x, and g(y) is monic. Then g(a_n α) = 0; hence, a_n α is an algebraic
integer.
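
The passage from f to the monic polynomial g is purely mechanical. A minimal sketch, assuming integer coefficient lists with the constant term first; the function names and the sample polynomial 6x^2 − 5x + 1 (zeros 1/2 and 1/3) are hypothetical.

```python
from fractions import Fraction

def monic_scaling(f):
    """Given f = [a_0, ..., a_n] with a_n != 0, return the monic g with
    g(a_n * x) = a_n^(n-1) * f(x); the coefficient of y^i is a_i * a_n^(n-1-i)."""
    n = len(f) - 1
    a_n = f[-1]
    return [f[i] * a_n ** (n - 1 - i) for i in range(n)] + [1]

def poly_eval(coeffs, x):
    """Value of the polynomial with coefficients [c_0, c_1, ...] at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

if __name__ == "__main__":
    f = [1, -5, 6]                      # 6x^2 - 5x + 1
    g = monic_scaling(f)                # y^2 - 5y + 6
    alpha = Fraction(1, 2)              # a zero of f
    print(g, poly_eval(g, 6 * alpha))   # g vanishes at a_n * alpha = 3
```

Here the zeros 1/2 and 1/3 of f are scaled to the integers 3 and 2, the zeros of g.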

Theorem 20.4.3. π is a transcendental number, that is, transcendental over ℚ.

Proof. Assume that π is an algebraic number. Then θ = iπ is also algebraic.
Let θ_1 = θ, θ_2, . . . , θ_d be the conjugates of θ. Suppose

p(x) = q_0 + q_1 x + ⋯ + q_d x^d ∈ ℤ[x], q_d > 0, and gcd(q_0, . . . , q_d) = 1

is the entire minimal polynomial of θ over ℚ. Then θ_1 = θ, θ_2, . . . , θ_d are the zeros of
this polynomial. Let t = q_d. Then from Lemma 20.4.2, tθ_i is an algebraic integer for
all i. From e^{iπ} + 1 = 0, and from θ_1 = iπ, we get that

(1 + e^{θ_1})(1 + e^{θ_2}) ⋯ (1 + e^{θ_d}) = 0.


The product on the left side can be written as a sum of 2^d terms e^ϕ, where ϕ =
ϵ_1 θ_1 + ⋯ + ϵ_d θ_d, ϵ_j = 0 or 1. Let n be the number of terms ϵ_1 θ_1 + ⋯ + ϵ_d θ_d that are
nonzero. Call these α_1, . . . , α_n. We then have an equation

(4) q + e^{α_1} + ⋯ + e^{α_n} = 0

with q = 2^d − n > 0. Recall that all tα_i are algebraic integers, and we consider the
polynomial

f(x) = t^{np} x^{p−1} (x − α_1)^p ⋯ (x − α_n)^p

with p a sufficiently large prime integer. We have f(x) ∈ ℝ[x], since the α_i are
algebraic numbers, and the elementary symmetric polynomials in α_1, . . . , α_n are rational
numbers.
Let I(z1 ) be defined as in the proof of Theorem 20.4.1, and now let

J = I(α1 ) + ⋅ ⋅ ⋅ + I(αn ).

From (1) in the proof of Theorem 20.4.1 and (4), we get

J = −q ∑_{j=0}^{m} f^{(j)}(0) − ∑_{j=0}^{m} ∑_{k=1}^{n} f^{(j)}(α_k),

with m = (n + 1)p − 1.
Now, ∑_{k=1}^{n} f^{(j)}(α_k) is a symmetric polynomial in tα_1, . . . , tα_n with integer
coefficients, since the tα_i are algebraic integers. It follows from the main theorem on
symmetric polynomials that ∑_{j=0}^{m} ∑_{k=1}^{n} f^{(j)}(α_k) is an integer. Furthermore, f^{(j)}(α_k) = 0 for
j < p. Hence, ∑_{j=0}^{m} ∑_{k=1}^{n} f^{(j)}(α_k) is an integer divisible by p!.
Now, f^{(j)}(0) is an integer divisible by p! if j ≠ p − 1, and f^{(p−1)}(0) = (p − 1)!(−t)^{np}
(α_1 ⋯ α_n)^p is an integer divisible by (p − 1)!, but not divisible by p! if p is sufficiently
large. In particular, this is true if p > |t^n (α_1 ⋯ α_n)| and also p > q.
From (2) in the proof of Theorem 20.4.1, we get that

|J| ≤ |α_1| e^{|α_1|} |f|(|α_1|) + ⋯ + |α_n| e^{|α_n|} |f|(|α_n|) ≤ c^p

for some number c independent of p.


As in the proof of Theorem 20.4.1, this gives us

(p − 1)! ≤ |J| ≤ c^p;

that is,

1 ≤ |J|/(p − 1)! ≤ c ⋅ c^{p−1}/(p − 1)!.

This, as before, gives a contradiction, since c^{p−1}/(p − 1)! → 0 as p → ∞. Therefore, π is
transcendental.


20.5 Exercises
1. A polynomial p(x) ∈ ℤ[x] is primitive if the GCD of all its coefficients is 1. Prove
the following:
(i) If f (x) and g(x) are primitive, then so is f (x)g(x).
(ii) If f (x) ∈ ℤ[x] is monic, then it is primitive.
(iii) If f (x) ∈ ℚ[x], then there exists a rational number c such that f (x) = cf1 (x)
with f1 (x) primitive.
2. Let d be a square-free integer and K = ℚ(√d) be a quadratic field. Let RK be the
subring of K of the algebraic integers of K. Show the following:
(i) RK = {m + n√d : m, n ∈ ℤ} if d ≡ 2(mod 4) or d ≡ 3(mod 4). {1, √d} is an
integral basis for RK .
(ii) RK = {m + n(1 + √d)/2 : m, n ∈ ℤ} if d ≡ 1(mod 4). {1, (1 + √d)/2} is an integral basis for
RK .
(iii) If d < 0, then there are only finitely many units in RK .
(iv) If d > 0, then there are infinitely many units in RK .
3. Let K = ℚ(α) with α^3 + α + 1 = 0 and RK the subring of the algebraic integers in K.
Show that:
(i) {1, α, α^2} is an integral basis for RK .
(ii) RK = ℤ[α].
4. Let A|R be an integral ring extension. If A is an integral domain and R a field, then
A is also a field.
5. Let A|R be an integral extension. Let 𝒫 be a prime ideal of A and p be a prime ideal
of R such that 𝒫 ∩ R = p. Show that:
(i) If p is maximal in R, then 𝒫 is maximal in A. (Hint: consider A/𝒫 .)
(ii) If 𝒫0 is another prime ideal of A with 𝒫0 ∩ R = p and 𝒫0 ⊂ 𝒫 , then 𝒫 = 𝒫0 .
(Hint: we may assume that A is an integral domain, and 𝒫 ∩R = {0}, otherwise
go to A/𝒫 .)
6. Show that for a field extension E|K, the following are equivalent:
(i) [E : K(B)] < ∞ for each transcendence basis B of E|K.
(ii) trgd(E|K) < ∞ and [E : K(B)] < ∞ for each transcendence basis B of E|K.
(iii) There is a finite transcendence basis B of E|K with [E : K(B)] < ∞.
(iv) There are finitely many x1 , . . . , xn ∈ E with E = K(x1 , . . . , xn ).
7. Let E|K be a field extension. If E|K is purely transcendental, then K is algebraically
closed in E.

21 The Hilbert basis theorem and the nullstellensatz
21.1 Algebraic geometry
An extremely important application of abstract algebra and an application central to
all of mathematics is the subject of algebraic geometry. As the name suggests this is the
branch of mathematics that uses the techniques of abstract algebra to study geomet-
ric problems. Classically, algebraic geometry involved the study of algebraic curves,
which roughly are the sets of zeros of a polynomial or set of polynomials in several
variables over a field. For example, in two variables a real algebraic plane curve is the
set of zeros in ℝ2 of a polynomial p(x, y) ∈ ℝ[x, y]. The common planar curves, such as
parabolas and the other conic sections, are all plane algebraic curves. In actual prac-
tice, plane algebraic curves are usually considered over the complex numbers and are
projectivized.
The algebraic theory that deals most directly with algebraic geometry is called
commutative algebra. This is the study of commutative rings, ideals in commutative
rings, and modules over commutative rings. A large portion of this book has dealt with
commutative algebra.
Although we will not consider the geometric aspects of algebraic geometry in gen-
eral, we will close the book by introducing some of the basic algebraic ideas that are
crucial to the subject. These include the concept of an algebraic variety or algebraic
set and its radical. We also state and prove two of the cornerstones of the theory as
applied to commutative algebra—the Hilbert basis theorem and the nullstellensatz.
In this chapter, we also often consider a fixed field extension C|K and the poly-
nomial ring K[x1 , . . . , xn ] of the n independent indeterminates x1 , . . . , xn . Again, in this
chapter, we often use letters a, b, m, p, P, A, Q, . . . for ideals in rings.

21.2 Algebraic varieties and radicals


We first define the concept of an algebraic variety:

Definition 21.2.1. If M ⊂ K[x1 , . . . , xn ], then we define

𝒩 (M) = {(α1 , . . . , αn ) ∈ C^n : f (α1 , . . . , αn ) = 0 ∀f ∈ M}.

α = (α1 , . . . , αn ) ∈ 𝒩 (M) is called a zero (Nullstelle) of M in C^n, and 𝒩 (M) is called the
zero set of M in C^n. If we want to mention C, then we write 𝒩 (M) = 𝒩C (M). A subset
V ⊂ C^n of the form V = 𝒩 (M) for some M ⊂ K[x1 , . . . , xn ] is called an algebraic variety
or (affine) algebraic set of C^n over K, or just an algebraic K-set of C^n.
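
As a small illustration of Definition 21.2.1, the following sketch computes 𝒩 (M) by brute force in the special case K = C = 𝔽p, a finite prime field, where C^n is finite and can be enumerated. The function name zero_set and the encoding of polynomials as Python callables are assumptions made for this example only; they are not taken from the text.

```python
from itertools import product

def zero_set(polys, p, n):
    """Return N(M) inside (F_p)^n for M given as a list of callables.

    Each callable takes n integers and returns an integer; the value is
    reduced mod p, so the polynomial is effectively evaluated over F_p.
    """
    return {
        point
        for point in product(range(p), repeat=n)
        if all(f(*point) % p == 0 for f in polys)
    }

# M = {x^2 + y^2 - 1} over F_5: the "unit circle" in the plane (F_5)^2.
circle = zero_set([lambda x, y: x * x + y * y - 1], p=5, n=2)
print(sorted(circle))
```

For an infinite field C, such an enumeration is of course not available; the sketch is only meant to make the definition concrete.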

For any subset N of C^n, we can reverse the procedure and consider the set of poly-
nomials that vanish at every point of N.

https://doi.org/10.1515/9783110603996-021


Definition 21.2.2. Suppose that N ⊂ C n . Then

I(N) = {f ∈ K[x1 , . . . , xn ] : f (α1 , . . . , αn ) = 0 ∀(α1 , . . . , αn ) ∈ N}.

Instead of f ∈ I(N), we also say that f vanishes on N (over K). If we want to mention K,
then we write I(N) = IK (N).

What is important is that the set I(N) forms an ideal. The proof is straightforward.

Theorem 21.2.3. For any subset N ⊂ C n , the set I(N) is an ideal in K[x1 , . . . , xn ]; it is
called the vanishing ideal of N ⊂ C n in K[x1 , . . . , xn ].

The following result examines the relationship between subsets in C n and their
vanishing ideals.

Theorem 21.2.4. The following properties hold:


(1) M ⊂ M′ ⇒ 𝒩 (M′) ⊂ 𝒩 (M);
(2) If a = (M) is the ideal in K[x1 , . . . , xn ] generated by M, then 𝒩 (M) = 𝒩 (a);
(3) N ⊂ N′ ⇒ I(N′) ⊂ I(N);
(4) M ⊂ I 𝒩 (M) for all M ⊂ K[x1 , . . . , xn ];
(5) N ⊂ 𝒩 I(N) for all N ⊂ C n ;
(6) If (ai )i∈I is a family of ideals in K[x1 , . . . , xn ], then ⋂i∈I 𝒩 (ai ) = 𝒩 (∑i∈I ai ). Here
∑i∈I ai is the ideal in K[x1 , . . . , xn ], generated by the union ⋃i∈I ai ;
(7) If a, b are ideals in K[x1 , . . . , xn ], then 𝒩 (a) ∪ 𝒩 (b) = 𝒩 (ab) = 𝒩 (a ∩ b). Here ab is
the ideal in K[x1 , . . . , xn ] generated by all products fg, where f ∈ a and g ∈ b;
(8) 𝒩 (M) = 𝒩 I 𝒩 (M) for all M ⊂ K[x1 , . . . , xn ];
(9) V = 𝒩 I(V) for all algebraic K-sets V;
(10) I(N) = I 𝒩 I(N) for all N ⊂ C n .

Proof. The proofs are straightforward. Hence, we prove only (7), (8), and (9). The rest
are left as exercises for the reader.
Proof of (7): Since ab ⊂ a ∩ b ⊂ a, b, we have, by (1), the inclusion 𝒩 (a) ∪ 𝒩 (b) ⊂
𝒩 (a ∩ b) ⊂ 𝒩 (ab). Hence, we have to show that 𝒩 (ab) ⊂ 𝒩 (a) ∪ 𝒩 (b).
Let α = (α1 , . . . , αn ) ∈ C n be a zero of ab, but not a zero of a. Then there is an f ∈ a
with f (α) ≠ 0; hence, for all g ∈ b, we get f (α)g(α) = (fg)(α) = 0. Thus, g(α) = 0.
Therefore, α ∈ 𝒩 (b).
Proof of (8) and (9): Let M ⊂ K[x1 , . . . , xn ]. Then, on the one hand, M ⊂ I 𝒩 (M) by
(4), and further 𝒩 I 𝒩 (M) ⊂ 𝒩 (M) by (1). On the other hand, 𝒩 (M) ⊂ 𝒩 I 𝒩 (M) by (5),
applied to N = 𝒩 (M). Therefore, 𝒩 (M) = 𝒩 I 𝒩 (M) for all M ⊂ K[x1 , . . . , xn ].
Now, the algebraic K-sets of C n are precisely the sets of the form V = 𝒩 (M). Hence,
V = 𝒩 I(V).
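
Properties (1) and (7) of Theorem 21.2.4 can be checked by enumeration in a finite example. The sketch below assumes K = C = 𝔽7 and two principal ideals a = (f ), b = (g) in K[x, y]; the concrete polynomials and the helper zero_set are hypothetical choices for illustration. By property (2), it suffices to evaluate the generators.

```python
from itertools import product

P = 7  # assumed prime; K = C = F_7

def zero_set(polys, n=2):
    """Zero set in (F_P)^n of a list of polynomials given as callables."""
    return {pt for pt in product(range(P), repeat=n)
            if all(f(*pt) % P == 0 for f in polys)}

f = lambda x, y: x - y                 # generator of a = (f)
g = lambda x, y: x * y - 1             # generator of b = (g)
fg = lambda x, y: f(x, y) * g(x, y)    # generator of ab = (fg)

# Property (7): N(a) ∪ N(b) = N(ab).
union = zero_set([f]) | zero_set([g])
assert union == zero_set([fg])

# Property (1): enlarging M can only shrink the zero set.
assert zero_set([f, g]) <= zero_set([f])

print(len(zero_set([f])), len(zero_set([g])), len(union))
```

Here 𝒩 (f ) has 7 points, 𝒩 (g) has 6 points, and they share the two points with x = y and x² = 1, so the union has 11 points.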

We make the following agreement: if a is an ideal in K[x1 , . . . , xn ], then we write

a ⊲ K[x1 , . . . , xn ].


If a ⊲ K[x1 , . . . , xn ], then we do not have a = I 𝒩 (a) in general. That is, a is, in
general, not equal to the vanishing ideal of its zero set in C^n. The reason for this is that
not every ideal a occurs as a vanishing ideal of some N ⊂ C^n. If a = I(N), then we must
have

f^m ∈ a, m ≥ 1 ⟹ f ∈ a. (⋆)

Hence, for instance, if a = (x1², . . . , xn²) ⊲ K[x1 , . . . , xn ], then a is not of the form a = I(N)
for any N ⊂ C^n. We now define the radical of an ideal:

Definition 21.2.5. Let R be a commutative ring and a ⊲ R an ideal in R. Then √a = {f ∈
R : f^m ∈ a for some m ∈ ℕ} is an ideal in R. √a is called the radical of a (in R). a is said
to be reduced if √a = a.

We note that √0 is called the nil radical of R; it consists exactly of the nilpotent
elements of R; that is, the elements a ∈ R with a^m = 0 for some m ∈ ℕ.
Let a ⊲ R be an ideal in R and π : R → R/a the canonical mapping. Then √a is
exactly the preimage under π of the nil radical of R/a.
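
To make the last remark concrete, consider R = ℤ and a = (n). The sketch below lists the nilpotent residues of ℤ/nℤ by direct computation; their preimage in ℤ is the radical √(n). The function name nilpotents is an assumption of this example. For n = 12, the nilpotent residues are 0 and 6, which matches √(12) = (6).

```python
def nilpotents(n):
    """Nilpotent residues of Z/nZ, i.e. the nil radical of Z/nZ."""
    result = []
    for a in range(n):
        power = a % n
        # a is nilpotent iff some power a^m with m >= 1 is 0 mod n;
        # checking the first n powers is more than enough.
        for _ in range(n):
            if power == 0:
                result.append(a)
                break
            power = (power * a) % n
    return result

print(nilpotents(12))
print(nilpotents(72))
```

For n = 72 = 2³ ⋅ 3², the nilpotent residues are exactly the multiples of 6, so the ideal (72) is not reduced, while (6) is.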

21.3 The Hilbert basis theorem


In this section, we show that if K is a field, then each ideal a ⊲ K[x1 , . . . , xn ] is finitely
generated. This is the content of the Hilbert basis theorem. An important
consequence is that any algebraic variety of C^n is the zero set of only finitely many
polynomials.
The Hilbert basis theorem follows directly from the following Theorem 21.3.2. Be-
fore we state this theorem, we need a definition.

Definition 21.3.1. Let R be a commutative ring with an identity 1 ≠ 0. R is said to be
noetherian if each ideal in R is generated by finitely many elements; that is, each ideal
in R is finitely generated.

Theorem 21.3.2. Let R be a noetherian ring. Then the polynomial ring R[x] over R is also
noetherian.

Proof. For 0 ≠ f ∈ R[x], we denote the degree of f by deg(f ). Let a ⊲ R[x] be an ideal
in R[x]. Assume that a is not finitely generated. Then, in particular, a ≠ {0}. We construct
a sequence of polynomials fk ∈ a such that the highest coefficients ak generate an
ideal in R, which is not finitely generated. This produces a contradiction; hence,
a is in fact finitely generated. Choose f1 ∈ a, f1 ≠ 0, so that deg(f1 ) = n1 is minimal.
If k ≥ 1, then choose fk+1 ∈ a, fk+1 ∉ (f1 , . . . , fk ) so that deg(fk+1 ) = nk+1 is minimal
among the polynomials in a \ (f1 , . . . , fk ). This is possible, because we assume that a is not
finitely generated. We have nk ≤ nk+1 by our construction. Furthermore, we claim that
(a1 , . . . , ak ) ⫋ (a1 , . . . , ak , ak+1 ).


Proof of this claim: Assume that (a1 , . . . , ak ) = (a1 , . . . , ak , ak+1 ). Then ak+1 ∈
(a1 , . . . , ak ). Hence, there are b_i ∈ R with a_{k+1} = ∑_{i=1}^{k} a_i b_i . Let
g(x) = ∑_{i=1}^{k} b_i f_i (x) x^{n_{k+1} − n_i};
hence, g ∈ (f1 , . . . , fk ), and g = a_{k+1} x^{n_{k+1}} + ⋅ ⋅ ⋅. Therefore, deg(fk+1 − g) < nk+1 , and
fk+1 − g ∈ a with fk+1 − g ∉ (f1 , . . . , fk ), which contradicts the choice of fk+1 . This proves the claim.
Hence, (a1 , . . . , ak ) ⫋ (a1 , . . . , ak , ak+1 ) for all k, so the ideal of R generated by all the ak
is not finitely generated, which contradicts the fact that R is noetherian. Hence, a is finitely generated.
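
The cancellation step in the proof of the claim can be followed on a small example. The sketch below assumes R = ℤ and three hypothetical polynomials f1 , f2 , f3 ∈ ℤ[x], stored as coefficient lists (index = exponent), with leading coefficients a1 = 2, a2 = 3, and a3 = 5 = 1 ⋅ a1 + 1 ⋅ a2 . The polynomials are not chosen by the minimal-degree procedure of the proof; the example only shows how g cancels the leading term of f3 .

```python
def degree(f):
    """Degree of a nonzero polynomial given as a coefficient list."""
    return max(i for i, c in enumerate(f) if c != 0)

def shift(f, k):
    """Multiply f by x^k."""
    return [0] * k + list(f)

def scale(c, f):
    """Multiply f by the constant c."""
    return [c * a for a in f]

def add(f, g, sign=1):
    """Return f + sign * g, padding the shorter list with zeros."""
    size = max(len(f), len(g))
    f = list(f) + [0] * (size - len(f))
    g = list(g) + [0] * (size - len(g))
    return [a + sign * b for a, b in zip(f, g)]

# Hypothetical data in Z[x].
f1 = [1, 2]         # 2x + 1,    a1 = 2, n1 = 1
f2 = [0, 1, 3]      # 3x^2 + x,  a2 = 3, n2 = 2
f3 = [7, 0, 0, 5]   # 5x^3 + 7,  a3 = 5 = 1*a1 + 1*a2
b = [1, 1]          # coefficients b_i with a3 = sum of a_i * b_i

n3 = degree(f3)
g = [0]
for b_i, f_i in zip(b, [f1, f2]):
    # add the term b_i * f_i(x) * x^(n3 - n_i)
    g = add(g, scale(b_i, shift(f_i, n3 - degree(f_i))))

remainder = add(f3, g, sign=-1)   # f3 - g
print(g, remainder, degree(remainder))
```

Here g = 5x³ + 2x² has the same leading term as f3 , and f3 − g = −2x² + 7 has degree 2 < 3, exactly as in the proof.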

We now have the Hilbert basis theorem:

Theorem 21.3.3 (Hilbert basis theorem). Let K be a field. Then each ideal a⊲K[x1 , . . . , xn ]
is finitely generated; that is, a = (f1 , . . . , fm ) for finitely many f1 , . . . , fm ∈ K[x1 , . . . , xn ].

Corollary 21.3.4. If C|K is a field extension, then each algebraic K-set V of C n is already
the zero set of only finitely many polynomials f1 , . . . , fm ∈ K[x1 , . . . , xn ]:

V = {(α1 , . . . , αn ) ∈ C n : fi (α1 , . . . , αn ) = 0 for i = 1, . . . , m}.

Furthermore, we write V = 𝒩 (f1 , . . . , fm ).

21.4 The Hilbert nullstellensatz


Vanishing ideals of subsets of C^n are always reduced, as (⋆) shows. For an arbitrary field C,
however, the condition

f^m ∈ a, m ≥ 1 ⟹ f ∈ a

is, in general, not sufficient for a ⊲ K[x1 , . . . , xn ] to be a vanishing ideal of a subset
of C^n. For example, let n ≥ 2, K = C = ℝ and a = (x1² + ⋅ ⋅ ⋅ + xn²) ⊲ ℝ[x1 , . . . , xn ]. a is
a prime ideal in ℝ[x1 , . . . , xn ], because x1² + ⋅ ⋅ ⋅ + xn² is a prime element in ℝ[x1 , . . . , xn ].
Hence, a is reduced. But, on the other hand, 𝒩 (a) = {0}, and I({0}) = (x1 , . . . , xn ).
Therefore, a is not of the form I(N) for any N ⊂ C^n. If this were the case, then
a = I(N) = I 𝒩 I(N) = I 𝒩 (a) = I({0}) = (x1 , . . . , xn ), because of Theorem 21.2.4(10), which gives a
contradiction.
The nullstellensatz of Hilbert, which we give in two forms, shows that if C is algebraically
closed and a is reduced, that is, a = √a, then I 𝒩 (a) = a.

Theorem 21.4.1 (Hilbert’s nullstellensatz, first form). Let C|K be a field extension with
C algebraically closed. If a ⊲ K[x1 , . . . , xn ], then I 𝒩 (a) = √a. Moreover, if a is reduced,
that is, a = √a, then I 𝒩 (a) = a. Therefore, 𝒩 defines a bijective map between the set of
reduced ideals in K[x1 , . . . , xn ] and the set of the algebraic K-sets in C n , and I defines the
inverse map.

The proof follows from the following:


Theorem 21.4.2 (Hilbert’s nullstellensatz, second form). Let C|K be a field extension
with C algebraically closed. Let a ⊲ K[x1 , . . . , xn ] with a ≠ K[x1 , . . . , xn ]. Then there exists
an α = (α1 , . . . , αn ) ∈ C^n with f (α) = 0 for all f ∈ a; that is, 𝒩C (a) ≠ ∅.

Proof. Since a ≠ K[x1 , . . . , xn ], there exists a maximal ideal m ⊲ K[x1 , . . . , xn ] with a ⊂ m.
We consider the canonical map π : K[x1 , . . . , xn ] → K[x1 , . . . , xn ]/m. Let βi = π(xi ) for
i = 1, . . . , n. Then K[x1 , . . . , xn ]/m = K[β1 , . . . , βn ] =: E. Since m is maximal, E is a field.
Moreover, E|K is algebraic by Corollary 20.3.11. Hence, there exists a K-homomorphism
σ : K[β1 , . . . , βn ] → C (C is algebraically closed). Let αi = σ(βi ). For each f ∈ m, we then have
f (α1 , . . . , αn ) = σ(f (β1 , . . . , βn )) = σ(π(f )) = 0. Since a ⊂ m, this also holds for all f ∈ a.
Hence, we get a zero (α1 , . . . , αn ) of a in C^n.
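
The hypothesis that C is algebraically closed cannot simply be dropped. As a hedged illustration, take K = 𝔽3 and the proper ideal a = (x² + 1) ⊲ K[x]. The sketch below checks that a has no zero in C = K, but does have zeros in the extension field 𝔽9 = 𝔽3 (i) with i² = −1, which is contained in an algebraic closure of 𝔽3 . The encoding of elements of 𝔽9 as pairs (a, b) standing for a + bi is an assumption of this example.

```python
P = 3  # K = F_3; x^2 + 1 is irreducible over F_3

# Zeros of f(x) = x^2 + 1 in C = K = F_3.
zeros_in_K = [x for x in range(P) if (x * x + 1) % P == 0]

def mul(u, v):
    """Multiply a + b*i and c + d*i in F_9, using i^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def f(u):
    """Evaluate x^2 + 1 at u in F_9."""
    a, b = mul(u, u)
    return ((a + 1) % P, b)

zeros_in_F9 = [(a, b) for a in range(P) for b in range(P)
               if f((a, b)) == (0, 0)]
print(zeros_in_K, zeros_in_F9)
```

The first list is empty, while the second contains the two elements i and 2i = −i.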

Proof of Theorem 21.4.1. Let a ⊲ K[x1 , . . . , xn ]. The inclusion √a ⊂ I 𝒩 (a) is clear. Conversely,
let f ∈ I 𝒩 (a). We have to show that
f^m ∈ a for some m ∈ ℕ. If f = 0, then there is nothing to show.
Now, let f ≠ 0. We consider K[x1 , . . . , xn ] as a subring of K[x1 , . . . , xn , xn+1 ] of the
n + 1 independent indeterminates x1 , . . . , xn , xn+1 . In K[x1 , . . . , xn , xn+1 ], we consider the
ideal ā = (a, 1 − xn+1 f ) ⊲ K[x1 , . . . , xn , xn+1 ], generated by a and 1 − xn+1 f .
Case 1: ā ≠ K[x1 , . . . , xn , xn+1 ].
ā then has a zero (β1 , . . . , βn , βn+1 ) in C^(n+1) by Theorem 21.4.2. Hence, for (β1 , . . . , βn ,
βn+1 ) ∈ 𝒩 (ā), we have the equations:
(1) g(β1 , . . . , βn ) = 0 for all g ∈ a, and
(2) f (β1 , . . . , βn )βn+1 = 1.

From (1), we get (β1 , . . . , βn ) ∈ 𝒩 (a). Hence, in particular, f (β1 , . . . , βn ) = 0 for our f ∈
I 𝒩 (a). But this contradicts (2). Therefore, ā ≠ K[x1 , . . . , xn , xn+1 ] is not possible. Thus,
we have
Case 2: ā = K[x1 , . . . , xn , xn+1 ], that is, 1 ∈ ā. Then there exists a relation of the form

1 = ∑_i hi gi + h(1 − xn+1 f ) for some gi ∈ a and hi , h ∈ K[x1 , . . . , xn , xn+1 ]. (3)

The map xi ↦ xi for 1 ≤ i ≤ n and xn+1 ↦ 1/f defines a homomorphism ϕ :
K[x1 , . . . , xn , xn+1 ] → K(x1 , . . . , xn ), the quotient field of K[x1 , . . . , xn ]. Since ϕ(1 − xn+1 f ) = 0,
from (3), we get a relation 1 = ∑_i hi (x1 , . . . , xn , 1/f )gi (x1 , . . . , xn ) in K(x1 , . . . , xn ). If we
multiply this with a suitable power f^m of f , we get f^m = ∑_i h̃i (x1 , . . . , xn )gi (x1 , . . . , xn )
for some polynomials h̃i ∈ K[x1 , . . . , xn ]. Since gi ∈ a, we get f^m ∈ a.
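
The relation used in Case 2 can be written down explicitly in a tiny example. Assume n = 1, a = (x²) ⊲ K[x], and f = x, so that f ∈ I 𝒩 (a) but f ∉ a. With y playing the role of xn+1 , one possible relation is 1 = y² ⋅ x² + (1 + xy)(1 − yx). The sketch below verifies this identity with integer coefficients; the dictionary encoding {(i, j): c} for the monomial c ⋅ x^i y^j and the helper names are assumptions of this example.

```python
from collections import defaultdict

def mul(p, q):
    """Product of two polynomials in x, y stored as {(i, j): coeff}."""
    out = defaultdict(int)
    for (i, j), a in p.items():
        for (k, m), b in q.items():
            out[(i + k, j + m)] += a * b
    return {mono: c for mono, c in out.items() if c != 0}

def add(p, q):
    """Sum of two polynomials in x, y stored as {(i, j): coeff}."""
    out = defaultdict(int)
    for mono, c in list(p.items()) + list(q.items()):
        out[mono] += c
    return {mono: c for mono, c in out.items() if c != 0}

g1 = {(2, 0): 1}                          # g1 = x^2, an element of a
h1 = {(0, 2): 1}                          # h1 = y^2
h = {(0, 0): 1, (1, 1): 1}                # h  = 1 + x*y
one_minus_yf = {(0, 0): 1, (1, 1): -1}    # 1 - y*f with f = x

relation = add(mul(h1, g1), mul(h, one_minus_yf))
print(relation)
```

The result is the constant polynomial 1. Substituting y ↦ 1/f = 1/x gives 1 = (1/x²) ⋅ x², and multiplying by f² yields f² = x² ∈ a, so m = 2 works here.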

21.5 Applications and consequences of Hilbert’s theorems


Theorem 21.5.1. Each nonempty set of algebraic K-sets in C n contains a minimal ele-
ment. In other words, for each descending chain

V1 ⊃ V2 ⊃ ⋅ ⋅ ⋅ ⊃ Vm ⊃ Vm+1 ⊃ ⋅ ⋅ ⋅ (21.1)


of algebraic K-sets Vi in C^n, there exists an integer m such that Vm = Vm+1 = Vm+2 = ⋅ ⋅ ⋅,
or equivalently, every strictly descending chain V1 ⫌ V2 ⫌ ⋅ ⋅ ⋅ of algebraic K-sets Vi in C^n
is finite.

Proof. We apply the operator I; that is, we pass to the vanishing ideals. This gives an
ascending chain of ideals

I(V1 ) ⊂ I(V2 ) ⊂ ⋅ ⋅ ⋅ ⊂ I(Vm ) ⊂ I(Vm+1 ) ⊂ ⋅ ⋅ ⋅ . (21.2)

The union of the I(Vi ) is an ideal in K[x1 , . . . , xn ], and hence, by Theorem 21.3.3,
finitely generated. Therefore, there is an m with I(Vm ) = I(Vm+1 ) = I(Vm+2 ) = ⋅ ⋅ ⋅.
Now we apply the operator 𝒩 and get the desired result, because Vi = 𝒩 I(Vi ) by
Theorem 21.2.4 (9).

Definition 21.5.2. An algebraic K-set V ≠ ∅ in C^n is called irreducible if it is not
describable as a union V = V1 ∪ V2 of two algebraic K-sets Vi ≠ ∅ in C^n with Vi ≠ V for
i = 1, 2. An irreducible algebraic K-set in C^n is also called a K-variety in C^n.

Theorem 21.5.3. An algebraic K-set V ≠ ∅ in C^n is irreducible if and only if its vanishing
ideal IK (V) = I(V) is a prime ideal of R = K[x1 , . . . , xn ] with I(V) ≠ R.

Proof. (1) Let V be irreducible. Let fg ∈ I(V). Then V = 𝒩 I(V) ⊂ 𝒩 (fg) = 𝒩 (f ) ∪ 𝒩 (g);
hence, V = V1 ∪ V2 with the algebraic K-sets V1 = 𝒩 (f ) ∩ V and V2 = 𝒩 (g) ∩ V. Now
V is irreducible; hence, V = V1 , or V = V2 , say V = V1 . Then V ⊂ 𝒩 (f ). Therefore,
f ∈ I 𝒩 (f ) ⊂ I(V). Since V ≠ ∅, we have further 1 ∉ I(V); that is, I(V) ≠ R.
(2) Let I(V) ⊲ R with I(V) ≠ R be a prime ideal. Let V = V1 ∪ V2 , V1 ≠ V, with
algebraic K-sets Vi in C n . First,

I(V) = I(V1 ∪ V2 ) = I(V1 ) ∩ I(V2 ) ⊃ I(V1 )I(V2 ), (⋆)

where I(V1 )I(V2 ) is the ideal generated by all products fg with f ∈ I(V1 ), g ∈ I(V2 ).
We have I(V1 ) ≠ I(V), because otherwise V1 = 𝒩 I(V1 ) = 𝒩 I(V) = V contradicting
V1 ≠ V. Hence, there is an f ∈ I(V1 ) with f ∉ I(V). Now, I(V) ≠ R is a prime ideal; hence,
necessarily I(V2 ) ⊂ I(V) by (⋆). It follows that V = 𝒩 I(V) ⊂ 𝒩 I(V2 ) = V2 , that is, V = V2 .
Therefore, V is irreducible.

Note that the affine space K^n is, as the zero set of the zero polynomial 0, itself
an algebraic K-set in K^n. If K is infinite, then I(K^n) = {0}. Hence, K^n is irreducible
by Theorem 21.5.3. Moreover, if K is infinite, then K^n cannot be written as a union of
finitely many proper algebraic K-subsets. If K is finite and n ≥ 1, then K^n is not irreducible.
Furthermore, each algebraic K-set V in C n is also an algebraic C-set in C n . If V is an
irreducible algebraic K-set in C n , then—in general—it is not an irreducible algebraic
C-set in C n .
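
The remark that K^n is not irreducible for a finite field K can be checked directly for n = 1. The sketch below assumes K = C = 𝔽5 . It verifies that the nonzero polynomial x^5 − x vanishes on all of K, so that I(K) ≠ {0}, and that K is the union of the two proper algebraic K-sets 𝒩 (x) and 𝒩 (x^4 − 1). The variable names are hypothetical.

```python
P = 5  # assumed prime; K = C = F_5, n = 1

# x^P - x is a nonzero polynomial that vanishes at every point of F_P.
vanishes_everywhere = all((pow(a, P, P) - a) % P == 0 for a in range(P))

# x^P - x = x * (x^(P-1) - 1), so K is the union of the two zero sets.
V1 = {a for a in range(P) if a % P == 0}                        # N(x)
V2 = {a for a in range(P) if (pow(a, P - 1, P) - 1) % P == 0}   # N(x^(P-1) - 1)

print(vanishes_everywhere, V1, V2)
```

Both V1 = {0} and V2 = {1, 2, 3, 4} are proper subsets, and their union is all of K.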

Theorem 21.5.4. Each algebraic K-set V in C^n can be written as a finite union V = V1 ∪
V2 ∪ ⋅ ⋅ ⋅ ∪ Vr of irreducible algebraic K-sets Vi in C^n. If here Vi ⊈ Vk for all pairs (i, k) with
i ≠ k, then this presentation is unique, up to the ordering of the Vi , and then the Vi are
called the irreducible K-components of V.


Proof. Let a be the set of all algebraic K-sets in C^n, which cannot be presented as a
finite union of irreducible algebraic K-sets in C^n.
Assume that a ≠ ∅. By Theorem 21.5.1, there is a minimal element V in a. This V
is not irreducible, otherwise we have a presentation as desired. Hence, there exists a
presentation V = V1 ∪ V2 with algebraic K-sets Vi , which are strictly smaller than V.
By the minimality of V, both V1 and V2 have a presentation as desired; hence, V also has one,
which gives a contradiction. Hence, a = ∅.
Now suppose that V = V1 ∪ ⋅ ⋅ ⋅ ∪ Vr = W1 ∪ ⋅ ⋅ ⋅ ∪ Ws are two presentations of the
desired form. For each Vi , we have a presentation Vi = (Vi ∩ W1 ) ∪ ⋅ ⋅ ⋅ ∪ (Vi ∩ Ws ). Each
Vi ∩ Wj is a K-algebraic set (see Theorem 21.2.4). Since Vi is irreducible, we get that
there is a Wj with Vi = Vi ∩ Wj , that is, Vi ⊂ Wj . Analogously, for this Wj , there is a Vk
with Wj ⊂ Vk . Altogether, Vi ⊂ Wj ⊂ Vk . But Vp ⊈ Vq if p ≠ q. Hence, from Vi ⊂ Wj ⊂ Vk ,
we get i = k. Therefore, Vi = Wj ; that means, for each Vi there is a Wj with Vi = Wj .
Analogously, for each Wk , there is a Vl with Wk = Vl . This proves the theorem.

Example 21.5.5.
1. Let M = {gf } ⊂ ℝ[x, y] with g(x, y) = x² + y² − 1 and f (x, y) = x² + y² − 2. Then 𝒩 (M) =
V = V1 ∪ V2 , where V1 = 𝒩 (g), and V2 = 𝒩 (f ); V is not irreducible.
2. Let M = {f } ⊂ ℝ[x, y] with f (x, y) = xy − 1; f is irreducible in ℝ[x, y]. Therefore, the
ideal (f ) is a prime ideal in ℝ[x, y]. Hence, V = 𝒩 (f ) is irreducible.
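
The decomposition in the first part of Example 21.5.5 can be tested on exact rational points. The sketch below uses the standard rational parametrization of the unit circle and the map (x, y) ↦ (x − y, x + y), which doubles x² + y², to produce sample points on both circles; the helper name unit_circle_point and the chosen parameters are assumptions of this example. It checks that gf vanishes at all samples, and that neither circle is contained in the other.

```python
from fractions import Fraction

def g(x, y):
    """Unit circle: x^2 + y^2 - 1."""
    return x * x + y * y - 1

def f(x, y):
    """Circle of radius sqrt(2): x^2 + y^2 - 2."""
    return x * x + y * y - 2

def unit_circle_point(t):
    """Rational point on x^2 + y^2 = 1 obtained from the parameter t."""
    t = Fraction(t)
    d = 1 + t * t
    return ((1 - t * t) / d, 2 * t / d)

points_V1 = [unit_circle_point(t) for t in (0, 1, 2, Fraction(1, 3))]
# (x, y) -> (x - y, x + y) doubles x^2 + y^2, so it maps V1 into V2.
points_V2 = [(x - y, x + y) for x, y in points_V1]

for x, y in points_V1 + points_V2:
    assert g(x, y) * f(x, y) == 0       # every sample lies in V = N(gf)

assert all(f(x, y) != 0 for x, y in points_V1)   # V1 is not inside V2
assert all(g(x, y) != 0 for x, y in points_V2)   # V2 is not inside V1
print(points_V1[2], points_V2[2])
```

Exact arithmetic with Fraction avoids rounding errors, so the vanishing tests are genuine equalities rather than approximations.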

Definition 21.5.6. Let V be an algebraic K-set in C n . The residue class ring K[V] =
K[x1 , . . . , xn ]/I(V) is called the (affine) coordinate ring of V.

K[V] can be identified with the ring of all those functions V → C, which are given
by polynomials from K[x1 , . . . , xn ]. As a homomorphic image of K[x1 , . . . , xn ], we get
that K[V] can be described in the form K[V] = K[α1 , . . . , αn ]; therefore, a K-algebra
of the form K[α1 , . . . , αn ] is often called an affine K-algebra. If the algebraic K-set V
in C^n is irreducible (we can now call V an affine K-variety in C^n), then K[V] is an
integral domain with an identity, because I(V) is then a prime ideal with I(V) ≠ R
by Theorem 21.5.3. The quotient field K(V) = Quot K[V] is called the field of rational
functions on the K-variety V.
We note the following:
1. If C is algebraically closed, then V = C n is a K-variety, and K(V) is the field
K(x1 , . . . , xn ) of the rational functions in n variables over K.
2. Let the affine K-algebra A = K[α1 , . . . , αn ] be an integral domain with an identity
1 ≠ 0. Then A ≅ K[x1 , . . . , xn ]/p for some prime ideal p ≠ K[x1 , . . . , xn ]. Hence, if C is
algebraically closed, then A is isomorphic to the coordinate ring of the K-variety
V = 𝒩 (p) in C n (see Hilbert’s nullstellensatz, first form, Theorem 21.4.1).
3. If the affine K-algebra A = K[α1 , . . . , αn ] is an integral domain with an identity
1 ≠ 0, then we define the transcendence degree trgd(A|K) to be the transcendence
degree of the field extension Quot(A)|K; that is, trgd(A|K) = trgd(Quot(A)|K),
Quot(A) the quotient field of A.


In this sense, trgd(K[x1 , . . . , xn ]|K) = n. Since Quot(A) = K(α1 , . . . , αn ), we get
trgd(A|K) ≤ n by Noether’s normalization theorem (Theorem 20.3.10).
4. An arbitrary affine K-algebra K[α1 , . . . , αn ] is, as a homomorphic image of the poly-
nomial ring K[x1 , . . . , xn ], noetherian (see Theorem 21.3.2 and Theorem 21.3.3).

Example 21.5.7. Let ω1 , ω2 ∈ ℂ be two elements which are linearly independent over ℝ. An
element ω = m1 ω1 + m2 ω2 with m1 , m2 ∈ ℤ, is called a period. The periods form an
abelian group Ω = {m1 ω1 + m2 ω2 : m1 , m2 ∈ ℤ} ≅ ℤ ⊕ ℤ and give a lattice in ℂ.

An elliptic function f (with respect to Ω) is a meromorphic function with period
group Ω, that is, f (z + w) = f (z) for all z ∈ ℂ and all w ∈ Ω. The Weierstrass ℘-function,

℘(z) = 1/z² + ∑_{0≠w∈Ω} (1/(z − w)² − 1/w²),

is an elliptic function.
With g2 = 60 ∑_{0≠w∈Ω} 1/w⁴ and g3 = 140 ∑_{0≠w∈Ω} 1/w⁶, we get the differential equation
℘′(z)² − 4℘(z)³ + g2 ℘(z) + g3 = 0. The set of elliptic functions is a field E, and each
elliptic function is a rational function in ℘ and ℘′ (for details see, for instance, [34]).
The polynomial f (t) = t² − 4s³ + g2 s + g3 ∈ ℂ(s)[t] is irreducible over ℂ(s). For the
corresponding algebraic ℂ(s)-set V, we get K(V) = ℂ(s)[t]/(t² − 4s³ + g2 s + g3 ) ≅ E with
respect to t ↦ ℘′, s ↦ ℘.

21.6 Dimensions
From now on, we assume that C is algebraically closed.

Definition 21.6.1.
(1) The dimension dim(V) of an algebraic K-set V in C n is said to be the supremum of
all integers m, for which there exists a strictly descending chain V0 ⊋ V1 ⊋ ⋅ ⋅ ⋅ ⊋ Vm
of K-varieties Vi in C n with Vi ⊂ V for all i.
(2) Let A be a commutative ring with an identity 1 ≠ 0. The height h(p) of a prime ideal
p ≠ A of A is said to be the supremum of all integers m, for which there exists a
strictly ascending chain p0 ⊊ p1 ⊊ ⋅ ⋅ ⋅ ⊊ pm = p of prime ideals pi of A with pi ≠ A.
The dimension (Krull dimension) dim(A) of A is the supremum of the heights of
all prime ideals ≠ A in A.


Theorem 21.6.2. Let V be an algebraic K-set in C n . Then dim(V) = dim(K[V]).

Proof. By Theorem 21.4.1 and Theorem 21.5.3, we have a bijective map between the
K-varieties W with W ⊂ V and the prime ideals ≠ R = K[x1 , . . . , xn ] of R, which con-
tain I(V) (the bijective map reverses the inclusion). But these prime ideals correspond
exactly with the prime ideals ≠ K[V] of K[V] = K[x1 , . . . , xn ]/I(V), which gives the
statement.

Suppose that V is an algebraic K-set in C^n, and let V1 , . . . , Vr be the irreducible com-
ponents of V. Then dim(V) = max{dim(V1 ), . . . , dim(Vr )}, because if V′ is a K-variety
with V′ ⊂ V, then V′ = (V′ ∩ V1 ) ∪ ⋅ ⋅ ⋅ ∪ (V′ ∩ Vr ), and hence V′ ⊂ Vi for some i, since V′
is irreducible. Hence, we may restrict ourselves to K-varieties V.
Consider the special case of the K-variety V = C^1 = C (recall that C is alge-
braically closed, and, hence, in particular, C is infinite). Then K[V] = K[x], the poly-
nomial ring in one indeterminate x. Now, K[x] is a principal ideal domain, and
hence, each prime ideal ≠ K[x] is either a maximal ideal or the zero ideal {0} of K[x].
The only K-varieties in V = C are therefore V itself and the zero sets of irreducible poly-
nomials in K[x]. Hence, if V = C, then dim(V) = dim K[V] = 1 = trgd(K[V]|K).

Theorem 21.6.3. Let A = K[α1 , . . . , αn ] be an affine K-algebra, and let A be also an inte-
gral domain. Let {0} = p0 ⊊ p1 ⊊ ⋅ ⋅ ⋅ ⊊ pm be a maximal strictly ascending chain of prime
ideals in A (such a chain exists since A is noetherian). Then m = trgd(A|K) = dim(A). In
other words:
All maximal ideals of A have the same height, and this height is equal to the
transcendence degree of A over K.

Corollary 21.6.4. Let V be a K-variety in C n . Then dim(V) = trgd(K[V]|K).

We prove Theorem 21.6.3 in several steps.

Lemma 21.6.5. Let R be a unique factorization domain. Then each prime ideal p with
height h(p) = 1 is a principal ideal.

Proof. p ≠ {0}, since h(p) = 1. Hence, there is an f ∈ p, f ≠ 0. Since R is a unique
factorization domain, f has a decomposition f = p1 ⋅ ⋅ ⋅ ps with prime elements pi ∈ R.
Now, p is a prime ideal; hence, some pi ∈ p, because f ∈ p, say p1 ∈ p. Then we have the
chain {0} ⊊ (p1 ) ⊂ p, and (p1 ) is a prime ideal of R. Since h(p) = 1, we get (p1 ) = p.

Lemma 21.6.6. Let R = K[y1 , . . . , yr ] be the polynomial ring of the r independent inde-
terminates y1 , . . . , yr over the field K (recall that R is a unique factorization domain). If
p is a prime ideal in R with height h(p) = 1, then the residue class ring R̄ = R/p has
transcendence degree r − 1 over K.

Proof. By Lemma 21.6.5, we have that p = (p) for some nonconstant polynomial
p ∈ K[y1 , . . . , yr ]. Let the indeterminate y = yr occur in p, that is, degy (p) ≥ 1, the
degree in y. If f ≠ 0 is a multiple of p, then also degy (f ) ≥ 1. Hence, p ∩ K[y1 , . . . , yr−1 ] = {0}.


Therefore, the residue class mapping R → R̄ = K[ȳ1 , . . . , ȳr ] induces an isomor-
phism K[y1 , . . . , yr−1 ] → K[ȳ1 , . . . , ȳr−1 ] of the subring K[y1 , . . . , yr−1 ]; that is, ȳ1 , . . . , ȳr−1
are algebraically independent over K. On the other hand, p(ȳ1 , . . . , ȳr−1 , ȳr ) = 0 is a
nontrivial algebraic relation for ȳr over K(ȳ1 , . . . , ȳr−1 ). Hence, altogether trgd(R̄|K) =
trgd(K(ȳ1 , . . . , ȳr )|K) = r − 1 by Theorem 20.3.9.

Before we describe the last technical lemma, we need some preparatory theoreti-
cal material.
Let R, A be integral domains (with identity 1 ≠ 0), and let A|R be a ring extension.
We first consider only R.
(1) A subset S ⊂ R \ {0} is called a multiplicative subset of R if 1 ∈ S for the identity
1 of R, and if s, t ∈ S, then also, st ∈ S. (x, s) ∼ (y, t) :⇔ xt − ys = 0 defines an
equivalence relation on M = R × S. Let x/s be the equivalence class of (x, s) and S⁻¹R the
set of all equivalence classes. We call x/s a fraction. If we add and multiply fractions as
usual, we get that S⁻¹R becomes an integral domain; it is called the ring of fractions of
R with respect to S. If, in particular, S = R \ {0}, then S⁻¹R = Quot(R), the quotient field
of R.
Now, back to the general situation. i : R → S⁻¹R, i(r) = r/1, defines an embedding
of R into S⁻¹R. Hence, we may consider R as a subring of S⁻¹R. For each s ∈ S ⊂ R \ {0},
we have that i(s) is a unit in S⁻¹R. That is, i(s) is invertible, and each element of S⁻¹R
has the form i(s)⁻¹ i(r) with r ∈ R, s ∈ S. Therefore, S⁻¹R is uniquely determined up to
isomorphisms, and we have the following universal property:
If ϕ : R → R′ is a ring homomorphism (of integral domains) such that ϕ(s) is
invertible for each s ∈ S, then there exists exactly one ring homomorphism λ : S⁻¹R →
R′ with λ ∘ i = ϕ. If a ⊲ R is an ideal in R, then we write S⁻¹a for the ideal in S⁻¹R,
generated by i(a). S⁻¹a is the set of all elements of the form a/s with a ∈ a and s ∈ S.
Furthermore, S⁻¹a = (1) ⇔ a ∩ S ≠ ∅.
Conversely, if A ⊲ S⁻¹R is an ideal in S⁻¹R, then we also denote the ideal i⁻¹(A) ⊲ R
by A ∩ R. An ideal a ⊲ R is of the form a = i⁻¹(A) if and only if there is no s ∈ S such
that its image in R/a under the canonical map R → R/a is a proper zero divisor in R/a.
Under the mappings P ↦ P ∩ R and p ↦ S⁻¹p, the prime ideals in S⁻¹R correspond
exactly to the prime ideals in R, which do not contain an element of S.
We now identify R with i(R):
(2) Now, let p ⊲ R be a prime ideal in R. Then S = R \ p is multiplicative. In this
case, we write Rp instead of S⁻¹R, and call Rp the quotient ring of R with respect to p,
or the localization of R at p. Put m = pRp = S⁻¹p. Then 1 ∉ m. Each element of Rp \ m is a
unit in Rp and vice versa. In other words, each ideal a ≠ (1) in Rp is contained in m, or
equivalently, m is the only maximal ideal in Rp . A commutative ring with an identity
1 ≠ 0, which has exactly one maximal ideal, is called a local ring. Hence, Rp is a local
ring. From part (1), we additionally get that the prime ideals of the local ring Rp correspond
bijectively to the prime ideals of R, which are contained in p.
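
A familiar instance of part (2) is R = ℤ with the prime ideal p = (5): the localization consists of all rational numbers whose reduced denominator is not divisible by 5. The sketch below, written under this assumption with hypothetical helper names, checks on a few samples that each element of the localization is either a unit or lies in the maximal ideal m, but never both.

```python
from fractions import Fraction

P = 5  # assumed prime; we localize Z at the prime ideal (5)

def in_localization(q):
    """q lies in Z_(5) iff its reduced denominator is not divisible by 5."""
    return Fraction(q).denominator % P != 0

def is_unit(q):
    """q is a unit of Z_(5) iff q and 1/q both lie in Z_(5)."""
    q = Fraction(q)
    return q != 0 and in_localization(q) and in_localization(1 / q)

def in_maximal_ideal(q):
    """m = 5 * Z_(5): elements whose reduced numerator is divisible by 5."""
    q = Fraction(q)
    return in_localization(q) and q.numerator % P == 0

samples = [Fraction(3, 7), Fraction(10, 3), Fraction(0),
           Fraction(-4, 9), Fraction(25, 2)]
for q in samples:
    # exactly one of the two alternatives holds for every sample
    assert is_unit(q) != in_maximal_ideal(q)
print([(str(q), is_unit(q)) for q in samples])
```

Fraction always stores numerator and denominator in lowest terms, which is what the divisibility tests rely on.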
(3) Now we consider our ring extension A|R as above. Let q be a prime ideal in R.


Claim: If qA ∩ R = q, then there exists a prime ideal Q ⊲ A with Q ∩ R = q (and vice
versa).
Proof of the claim: If S = R \ q, then qA ∩ S = ∅. Hence, qS⁻¹A is a proper ideal in
S⁻¹A, and hence contained in a maximal ideal m in S⁻¹A. Here, qS⁻¹A is the ideal in
S⁻¹A, which is generated by q. Define Q = m ∩ A; Q is a prime ideal in A, and Q ∩ R = q
by part (1), because Q ∩ S = ∅, where S = R \ q.
(4) Now let A|R be an integral extension (A, R integral domains as above). Assume
that R is integrally closed in its quotient field K. Let P ⊲ A be a prime ideal in A and
p = P ∩ R.
Claim: If q ⊲ R is a prime ideal in R with q ⊂ p, then qA_P ∩ R = q.
Proof of the claim: An arbitrary β ∈ qA_P has the form β = α/s with α ∈ qA (the
ideal in A generated by q), and s ∈ S = A \ P. An integral equation for α ∈ qA over K
is given by a form α^n + a_{n−1} α^{n−1} + ⋅ ⋅ ⋅ + a_0 = 0 with a_i ∈ q. This can be seen as follows:
we have certainly a form α = b1 α1 + ⋅ ⋅ ⋅ + bm αm with bi ∈ q and αi ∈ A. The subring
A′ = R[α1 , . . . , αm ] is, as an R-module, finitely generated, and αA′ ⊂ qA′. Now, a_i ∈ q
follows with the same type of arguments as in the proof of Theorem 20.2.4.
Now, in addition, let β ∈ R. Then, for s = α/β, we have an equation

s^n + (a_{n−1}/β) s^{n−1} + ⋅ ⋅ ⋅ + a_0/β^n = 0

over K. But s is integral over R; hence, all a_{n−i}/β^i ∈ R. If β ∉ q, then, since q is prime
and β^i ⋅ (a_{n−i}/β^i) = a_{n−i} ∈ q, all a_{n−i}/β^i ∈ q; this gives s^n ∈ qA ⊂ P, hence s ∈ P,
contradicting s ∈ A \ P. Therefore, β ∈ q.
We are now prepared to prove the last preliminary lemma, which we need for the
proof of Theorem 21.6.3.

Lemma 21.6.7 (Krull’s going up lemma). Let A|R be an integral ring extension of inte-
gral domains, and let R be integrally closed in its quotient field. Let p and q be prime
ideals in R with q ⊂ p. Furthermore, let P be a prime ideal in A with P ∩ R = p. Then
there exists a prime ideal Q in A with Q ∩ R = q, and Q ⊂ P.

Proof. It is enough to show that there exists a prime ideal Q in A_P with Q ∩ R = q.
This can be seen from the preceding preparations. By part (1) and (2) such a Q has the
form Q = Q′A_P with a prime ideal Q′ in A with Q′ ⊂ P, and Q ∩ A = Q′. It follows
that q = Q′ ∩ R ⊂ P ∩ R = p. And the existence of such a Q follows from parts (3)
and (4).

Proof of Theorem 21.6.3. First, let m = 0. Then {0} is a maximal ideal in A; hence, A =
K[α1 , . . . , αn ] is a field. By Corollary 20.3.11 then, A|K is algebraic; therefore, trgd(A|K) = 0.
So, Theorem 21.6.3 holds for m = 0.
Now, let m ≥ 1. We use Noether’s normalization theorem. A has a polynomial ring
R = K[y1 , . . . , yr ] of the r independent indeterminates y1 , . . . , yr as a subring, and A|R is
an integral extension. As a polynomial ring over K, the ring R is a unique factorization
domain, and hence, certainly, integrally closed (in its quotient field).


Now, let

{0} = P0 ⊊ P1 ⊊ ⋅ ⋅ ⋅ ⊊ Pm (21.3)

be a maximal strictly ascending chain of prime ideals in A. If we intersect with R, we
get a chain

{0} = p0 ⊂ p1 ⊂ ⋅ ⋅ ⋅ ⊂ pm (21.4)

of prime ideals pi = Pi ∩ R of R. Since A|R is integral, the chain (21.4) is also a strictly
ascending chain. This follows from Krull’s going up lemma (Lemma 21.6.7), because if
pi = pj , then Pi = Pj . If Pm is a maximal ideal in A, then also pm is a maximal ideal in
R, because A|R is integral (consider A/Pm and use Theorem 20.2.19). If the chain (21.3)
is maximal and strict, then so is the chain (21.4).
Now, let the chain (21.3) be maximal and strict. If we pass to the residue class
rings Ā = A/P1 and R̄ = R/p1 , then we get the chains of prime ideals {0} = P̄1 ⊂ P̄2 ⊂
⋅ ⋅ ⋅ ⊂ P̄m and {0} = p̄1 ⊂ p̄2 ⊂ ⋅ ⋅ ⋅ ⊂ p̄m for the affine K-algebras Ā and R̄, respectively,
but with length reduced by 1. By induction, we may assume that already trgd(Ā|K) = m − 1 =
trgd(R̄|K). On the other hand, by construction, we have trgd(A|K) = trgd(R|K) = r.
Finally, to prove Theorem 21.6.3, we have to show that r = m. If we compare both
equations, then r = m follows if trgd(R̄|K) = r − 1. But this holds by Lemma 21.6.6, since
h(p1 ) = 1.

Theorem 21.6.8. Let V be a K-variety in C^n. Then dim(V) = n − 1 if and only if V = 𝒩 (f )
for some irreducible polynomial f ∈ K[x1 , . . . , xn ].

Proof. (1) Let V be a K-variety in C^n with dim(V) = n − 1. The corresponding ideal
(in the sense of Theorem 21.2.4) is by Theorem 21.4.2 a prime ideal p in K[x1 , . . . , xn ].
By Theorem 21.3.3 and Corollary 21.3.4, we get h(p) = 1 for the height of p, because
dim(V) = n − 1 (see also Theorem 21.3.2). Since K[x1 , . . . , xn ] is a unique factorization
domain, we get that p = (f) is a principal ideal by Lemma 21.6.5.
(2) Now let f ∈ K[x1 , . . . , xn ] be irreducible. We have to show that V = 𝒩(f) has
dimension n − 1. For that, by Theorem 21.6.3, we have to show that the prime ideal
p = (f) has height h(p) = 1. Assume that this is not the case. Then there exists a
prime ideal q ≠ p with {0} ≠ q ⊂ p. Choose g ∈ q, g ≠ 0. Let $g = u f^{e_1} \pi_2^{e_2} \cdots \pi_r^{e_r}$ be its
prime factorization in K[x1 , . . . , xn ]. Now g ∈ q and f ∉ q, because q ≠ p. Since q is prime, there
is a πi in q ⊊ p = (f). This is impossible, because f would then divide πi , so that πi and f
would be associated and f ∈ q. Therefore h(p) = 1.

21.7 Exercises
1. Let A = K[a1 , . . . , an ] and C|K be a field extension with C algebraically closed.
Show that there is a K-algebra homomorphism K[a1 , . . . , an ] → C.
2. Let K[x1 , . . . , xn ] be the polynomial ring of the n independent indeterminates
x1 , . . . , xn over the algebraically closed field K. The maximal ideals of K[x1 , . . . , xn ]
are exactly the ideals of the form m(α) = (x1 − α1 , x2 − α2 , . . . , xn − αn ) with
α = (α1 , . . . , αn ) ∈ K^n.
3. The nil radical √0 of A = K[a1 , . . . , an ] coincides with the Jacobson radical of A,
that is, the intersection of all maximal ideals of A.
4. Let R be a commutative ring with 1 ≠ 0. If each prime ideal of R is finitely gener-
ated, then R is noetherian.
5. Prove the theoretical preparations for Krull’s going up lemma in detail.
6. Let K[x1 , . . . , xn ] be the polynomial ring of the n independent indeterminates
x1 , . . . , xn . For each ideal a of K[x1 , . . . , xn ], there exists a natural number m with
the following property: if f ∈ K[x1 , . . . , xn ] vanishes on the zero set of a, then
f^m ∈ a.
7. Let K be a field with char K ≠ 2 and a, b ∈ K⋆. We consider the polynomial f(x, y) =
ax^2 + by^2 − 1 in the polynomial ring K[x, y] of the independent indeterminates x
and y. Let C be the algebraic closure of K(x) and β ∈ C with f(x, β) = 0. Show the
following:
(i) f is irreducible over the algebraic closure C0 of K (in C).
(ii) trgd(K(x, β)|K) = 1, [K(x, β) : K(x)] = 2, and K is algebraically closed in K(x, β).

22 Algebras and group representations
22.1 Group representations
In Chapter 13, we spoke about group actions. These are homomorphisms from a group
G into the group of permutations of a set S. The way a group G acts on a set S can often be
used to study the structure of the group G, and, in Chapter 13, we used group actions
to prove the important Sylow theorems.
In this chapter, we discuss a very important type of group action called a group
representation or linear representation. This is a homomorphism of a group G into the
set of linear transformations of a vector space V over a field K. It is a finite dimensional
representation if V is a finite dimensional vector space over K, and infinite dimensional
otherwise. For an n-dimensional representation, each element of the group G can be
represented by an n × n matrix over K, and the group operation can be represented
by matrix multiplication. As with general group actions, much information about the
structure of the group G can be obtained from representations. In particular, in this
chapter, we will present an important Burnside theorem, which shows that any finite
group, whose order is divisible by only two primes, must be solvable.
Representations of groups are important in many areas of mathematics. Group
representations allow many group-theoretic problems to be reduced to problems in
linear algebra, which is well understood. They are also important in physics and the
study of physical structure, because they describe how the symmetry group of a phys-
ical system affects the solutions of equations describing that system.
The theory of group representations can be divided into several areas depending
on the kind of group being represented. The various areas can be quite different in
detail, though the basic definitions and concepts are the same. The most important
areas are:
(1) The theory of finite group representations. Group representations constitute a cru-
cial tool in the study of finite groups. They also arise in applications of finite group
theory to crystallography and to geometry.
(2) Group representations of compact and locally compact groups. Using integration
theory and Haar measure, many of the results on representations of finite groups
can be extended to infinite locally compact groups. The resulting theory is a cen-
tral part of the area of mathematics called harmonic analysis. Pontryagin dual-
ity describes the theory for commutative groups as a generalized Fourier trans-
form.
(3) Representations of Lie Groups. Lie groups are continuous groups with a differen-
tiable structure. Most of the groups that arise in physics and chemistry are Lie
groups, and their representation theory is important to the application of group
theory in those fields.

https://doi.org/10.1515/9783110603996-022

(4) Linear algebraic groups are the analogues of Lie groups, but over more general
fields than just the reals or complexes. Their representation theory is more com-
plicated than that of Lie groups.

In this chapter, we consider solely the representation theory of finite groups; for
the remainder of the chapter, when we say group, we mean finite group.

22.2 Representations and modules


A group representation is a group action on a vector space that respects the vector
space structure. In this section, we examine the basic definitions of group repre-
sentations and the ties to general modules over rings, both commutative and non-
commutative. The main reference for this chapter is the book entitled Groups and
Representations by J. L. Alperin and R. B. Bell [1]. We follow the main lines of this
book. As we mentioned in the previous section, throughout the remainder of the
chapter, group refers to a finite group.
Let K be a field, and let G be a group acting on a K-vector space V. We denote this
action by gv for g ∈ G and v ∈ V. The action is called linear if the following hold:
(1) g(v + w) = gv + gw for all g ∈ G, and v, w ∈ V.
(2) g(αv) = α(gv) for all g ∈ G, α ∈ K, and v ∈ V.

Recall that group actions correspond to group homomorphisms into symmetric
groups. For linear actions on a vector space V, we have a stronger result.

Theorem 22.2.1. There is a bijective correspondence between the set of linear actions
of a group G on a K-vector space V and the set of homomorphisms from G into GL(V),
the group of all invertible linear transformations of V, which is called the general linear
group over V.

Proof. Suppose that ρ : G → GL(V) is a homomorphism. Then an action of G on V
is defined by setting gv = ρ(g)(v), and it is clear that this action is linear. Conversely,
if we have a linear action of G on V, then we can define a homomorphism ρ : G →
GL(V) by ρ(g)(v) = gv. These processes are mutually inverse, which gives the desired
correspondence.
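
The correspondence of Theorem 22.2.1 can be checked by machine in a small case. The following Python sketch is an illustration under assumptions of our own: it lets the symmetric group S3 act on K^3 by permuting the canonical basis vectors, e_i ↦ e_g(i), and verifies that the resulting map ρ into the 3 × 3 matrices is multiplicative. The names compose, rho, and matmul are hypothetical helpers chosen for this sketch.

```python
from itertools import permutations

def compose(g, h):
    """Return the product gh of permutations stored as tuples: (gh)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))

def rho(g):
    """Matrix of the linear map e_i -> e_{g(i)}; column i is the basis vector e_{g(i)}."""
    n = len(g)
    return [[1 if g[col] == row else 0 for col in range(n)] for row in range(n)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

S3 = list(permutations(range(3)))

# rho is a homomorphism if rho(gh) = rho(g) rho(h) for all g, h in S3.
is_homomorphism = all(
    rho(compose(g, h)) == matmul(rho(g), rho(h)) for g in S3 for h in S3
)
```

Each ρ(g) is a permutation matrix and hence invertible, so in this sketch ρ is a 3-dimensional representation of S3.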

Definition 22.2.2. A homomorphism ρ : G → GL(V), where G is a group and V is a
K-vector space, is called a linear representation or group representation of G in V.

From Theorem 22.2.1, it follows that the study of group representations is equiva-
lent to the study of linear actions of groups. This area of study, with emphasis on finite
groups and finite dimensional vector spaces, has many applications to finite group
theory.

The modern approach to the representation theory of finite groups involves
another equivalent concept, namely that of finitely generated modules over group
rings.
In Chapter 18, we considered R-modules over commutative rings R, and used this
study to prove the fundamental theorem of finitely generated modules over principal
ideal domains. In particular, we used the same study to prove the fundamental theo-
rem of finitely generated abelian groups. Here we must extend the concepts and allow
R to be a general ring with identity.

Definition 22.2.3. Let R be a ring with identity 1, and let M be an abelian group written
additively. M is called a left R-module if there is a map R × M → M written as (r, m) ↦ rm
such that the following hold:
(1) 1 ⋅ m = m;
(2) r(m + n) = rm + rn;
(3) (r + s)m = rm + sm;
(4) r(sm) = (rs)m;

for all r, s ∈ R and m, n ∈ M.


We can similarly define the notion of a right R-module via a map from M × R to
M sending (m, r) to mr, which satisfies the analogous properties to those above. If R
is commutative, then every left module can in an obvious manner be given a right
R-module structure; hence, it is not necessary in the commutative case to distinguish
between left and right R-modules.
We always use the wording R-module to denote left R-module, unless otherwise
specified.

Definition 22.2.4. An R-module M is finitely generated if there is a finite subset
{m1 , . . . , mk } of M such that every element m of M can be written as an R-linear
combination m = r1 m1 + ⋅ ⋅ ⋅ + rk mk with r1 , . . . , rk ∈ R.

Finite minimal generating sets for a given module may have different numbers of elements.
This is in contrast to the situation in free R-modules over a commutative ring R
with identity, where any two finite bases have the same number of elements (The-
orem 18.4.6).
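
To illustrate this remark in the simplest case, consider ℤ as a module over itself. A finite set of integers generates ℤ exactly when its greatest common divisor is 1, so {1}, {2, 3}, and {6, 10, 15} are all minimal generating sets, of which only {1} is a basis. The following Python sketch checks this; the function names are our own and serve only as an illustration.

```python
from functools import reduce
from math import gcd

def generates_Z(gens):
    """A finite set of integers generates Z as a Z-module iff its gcd equals 1."""
    return reduce(gcd, gens, 0) == 1

def is_minimal_generating_set(gens):
    """True if gens generates Z and no proper subset obtained by dropping one element does."""
    gens = list(gens)
    if not generates_Z(gens):
        return False
    return all(not generates_Z(gens[:i] + gens[i + 1:]) for i in range(len(gens)))

# Sizes of three minimal generating sets of the same module Z.
sizes = [len(s) for s in ([1], [2, 3], [6, 10, 15]) if is_minimal_generating_set(s)]
```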
In the following, we review the module theory that is necessary for the study of
group representations. The facts we use are straightforward extensions of the respec-
tive facts for modules over commutative rings or for groups.

Definition 22.2.5. Let M be an R-module, and let N be a subgroup of M. Then N is an
R-submodule (or just a submodule) if rn ∈ N for every r ∈ R and n ∈ N.

Example 22.2.6. The R-submodules of a ring R are exactly the left ideals of R (see
Chapter 1). Every R-module M has at least two submodules, namely, M itself and the
zero submodule {0}.

Definition 22.2.7. A simple R-module is an R-module M ≠ {0}, which has only M and
{0} as submodules.

If N is a submodule of M, then we may construct the factor group M/N (recall that
M is abelian). We may give the factor group M/N an R-module structure by defining
r(m + N) = rm + N for every r ∈ R and m + N ∈ M/N. We call M/N the factor R-module,
or just the factor module, of M by N.

Definition 22.2.8. Let N1 , N2 be submodules of an R-module M. Then we define the
module sum N1 + N2 by

N1 + N2 = {x + y | x ∈ N1 , y ∈ N2 } ⊂ M.

The sum N1 + N2 and the intersection N1 ∩ N2 are submodules of M. If N1 ∩ N2 = {0},
then we call the sum N1 + N2 a direct sum and write N1 ⊕ N2 instead of N1 + N2 .
We say that a submodule N of M is a direct summand if there is some other sub-
module N′ of M such that M = N ⊕ N′. In general, we write kN or N^k to denote the
direct sum

N ⊕ N ⊕ ⋅⋅⋅ ⊕ N

of k copies of N.

As for groups, we also have the external notion of a direct sum. If M and N are
R-modules, then we give the Cartesian product M ×N an R-module structure by setting
r(m, n) = (rm, rn), and we write M ⊕ N instead of M × N.
The notions of internal and external direct sums can be extended to any finite
number of submodules and modules, respectively.

Definition 22.2.9. A composition series of an R-module M ≠ {0} is a descending series

M = M0 ⊃ M1 ⊃ ⋅ ⋅ ⋅ ⊃ Mk = {0}

of finitely many submodules Mi of M beginning with M and ending with {0}, where the
inclusions are proper, and in which each successive factor module Mi /Mi+1 is a simple
module. We call k the length of the composition series.

Notice the following:


(1) A module need not have a composition series. For example, an infinite abelian
group, considered as a ℤ-module, does not have a composition series (see Chap-
ter 12).
(2) The analog of the Jordan–Hölder theorem for groups (see Theorem 12.4.3) holds
for modules that have composition series.

Theorem 22.2.10 (Jordan–Hölder theorem for R-modules). If an R-module M ≠ {0}
has a composition series, then any two composition series are equivalent; that is, there
exists a one-to-one correspondence between their respective factor modules such that
corresponding factor modules are isomorphic. Hence, the factor modules are unique
up to isomorphism and order, and, in particular, the length must be the same.
Therefore, we can speak in a well-defined manner about the factor modules of a
composition series. If an R-module M has a composition series, then each submodule
N and each factor module M/N also has a composition series.
If the submodule N and the factor module M/N each have a composition series,
then the module M also has one (see Chapter 13 for the respective proofs for groups).
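
The Jordan–Hölder theorem can be observed directly for the ℤ-module ℤ/n. Its submodules are the subgroups dℤ/nℤ for the divisors d of n, and a step from dℤ/nℤ down to dpℤ/nℤ has a simple factor module exactly when p is prime; the factor is then isomorphic to ℤ/p. The following Python sketch, an illustration with hypothetical function names, enumerates all composition series of ℤ/12 in this encoding and records their factors.

```python
def prime_divisors(n):
    """Return the distinct prime divisors of n >= 1 in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def composition_series(n):
    """Yield, for every composition series of Z/n, the list of primes p
    such that the successive factor modules are isomorphic to Z/p."""
    def extend(d, factors):
        if d == n:                      # reached the zero submodule nZ/nZ
            yield list(factors)
            return
        for p in prime_divisors(n // d):
            yield from extend(d * p, factors + [p])
    yield from extend(1, [])

series = list(composition_series(12))
# Every series has the same factors up to order, hence the same length.
factor_multisets = {tuple(sorted(s)) for s in series}
```

Here ℤ/12 has three composition series, all of length 3 and all with the factors ℤ/2, ℤ/2, ℤ/3 in some order.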
Definition 22.2.11. Let M and N be R-modules, and let ϕ : M → N be a group homo-
morphism. Then ϕ is an R-module homomorphism if ϕ(rm) = rϕ(m) for any r ∈ R and
m ∈ M.
As for all other structures, we define monomorphism, epimorphism, isomor-
phism, and automorphism of R-modules in analogy with the definition for groups.
As for groups, we have the following results:
Theorem 22.2.12 (First isomorphism theorem). Let M and N be R-modules, and
ϕ : M → N an R-module homomorphism.
(1) The kernel ker(ϕ) = {m ∈ M | ϕ(m) = 0} of ϕ is a submodule of M.
(2) The image Im(ϕ) = {n ∈ N | ϕ(m) = n for some m ∈ M} of ϕ is a submodule of N.
(3) The R-modules M/ker(ϕ) and Im(ϕ) are isomorphic via the map induced by ϕ.

If the R-modules M and N are R-module isomorphic, then we write M ≅ N.


Corollary 22.2.13. An R-module homomorphism ϕ : M → N is injective if and only if
ker(ϕ) = {0}.
Theorem 22.2.14 (Second isomorphism theorem). Let N1 , N2 be submodules of an
R-module M. Then

(N1 + N2 )/N2 ≅ N1 /(N1 ∩ N2 ).

Theorem 22.2.15 (Schur’s lemma). Let M and N be simple R-modules, and let
ϕ : M → N be a nonzero R-module homomorphism. Then ϕ is an R-module iso-
morphism.
Proof. Since M is simple, we must have either ker(ϕ) = M or ker(ϕ) = {0}.
If ker(ϕ) = M, then ϕ = 0, the zero homomorphism. Since ϕ ≠ 0, we get ker(ϕ) = {0}.
Moreover, Im(ϕ) is a nonzero submodule of the simple module N; hence, Im(ϕ) = N.
Therefore, ϕ is an R-module isomorphism.

Group rings and modules over group rings


We now introduce the class of rings, whose modules we will study for group represen-
tations. They form the class of group algebras.
Definition 22.2.16. Let R be a ring and G a group. Then the group ring of G over R,
denoted by RG, consists of all finite R-linear combinations of elements of G. This is
the set of linear combinations of the form

\[ \Big\{ \sum_{g \in G} \alpha_g g \;\Big|\; \alpha_g \in R,\ \text{only finitely many } \alpha_g \neq 0 \Big\}. \]

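For addition in RG, we take the rule

\[ \sum_{g \in G} \alpha_g g + \sum_{g \in G} \beta_g g = \sum_{g \in G} (\alpha_g + \beta_g) g. \]

Multiplication in RG is defined by extending the multiplication in G:

\[ \Big( \sum_{g \in G} \alpha_g g \Big) \Big( \sum_{h \in G} \beta_h h \Big) = \sum_{g \in G} \sum_{h \in G} \alpha_g \beta_h \, gh = \sum_{x \in G} \Big( \sum_{g \in G} \alpha_g \beta_{g^{-1}x} \Big) x. \]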

The group ring RG has an identity element, which coincides with the identity element
of G. We usually denote this by just 1.

From the viewpoint of abstract group theory, it is of interest to consider the case,
where the underlying ring is an integral domain. In this connection, we mention the
famous zero divisor conjecture by Higman and Kaplansky, which poses the question
whether every group ring RG of a torsion-free group G over an integral domain R or
over a field K has no zero divisors.
The conjecture has been proved only for a fairly restricted class of torsion-free
groups.
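
The multiplication rule in RG is a convolution of coefficients, and it also shows why the zero divisor conjecture is restricted to torsion-free groups: if g ∈ G has order 2, then (1 − g)(1 + g) = 1 − g^2 = 0 in RG. The following Python sketch checks this under assumptions of our own, namely R = ℤ and G = S3, with elements of ℤS3 stored as dictionaries from permutations to coefficients; the names compose and multiply are hypothetical.

```python
from collections import defaultdict

def compose(g, h):
    """Product gh of permutations stored as tuples: (gh)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(h)))

def multiply(a, b, op=compose):
    """Product in RG: (sum a_g g)(sum b_h h) = sum_x (sum_{gh = x} a_g b_h) x."""
    out = defaultdict(int)
    for g, a_g in a.items():
        for h, b_h in b.items():
            out[op(g, h)] += a_g * b_h
    # Drop zero coefficients so that the zero element is the empty dictionary.
    return {x: c for x, c in out.items() if c != 0}

e = (0, 1, 2)   # identity of S3
t = (1, 0, 2)   # a transposition, of order 2
c = (1, 2, 0)   # a 3-cycle

one_minus_t = {e: 1, t: -1}
one_plus_t = {e: 1, t: 1}
zero_product = multiply(one_minus_t, one_plus_t)   # two nonzero factors
tc = multiply({t: 1}, {c: 1})
ct = multiply({c: 1}, {t: 1})
```

In this sketch the product of the two nonzero elements 1 − t and 1 + t is zero, and tc ≠ ct shows that ℤS3 is not commutative.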
In this chapter, we will primarily consider the case where R = K is a field and the
group G is finite, in which case the group ring KG is not only a ring, but also a finite
dimensional K-vector space having G as a basis. In this case, KG is called the group
algebra.
In mathematics, in general, an algebra over a field K is a K-vector space with a
bilinear product that makes it a ring. That is, an algebra over K is an algebraic structure
A with both a ring structure and a K-vector space structure that are compatible in the
sense that α(ab) = (αa)b = a(αb) for any α ∈ K and a, b ∈ A. An algebra is finite-dimensional
if it has finite dimension as a K-vector space.

Example 22.2.17.
(1) The matrix ring Mn (K) is a finite dimensional K-algebra for any natural number n.
(2) The group ring KG is a finite dimensional K-algebra when the group G is finite.

Definition 22.2.18. A homomorphism of K-algebras is a ring homomorphism that
is also a K-linear transformation.

Modules over a group algebra KG can also be considered as K-vector spaces with
α ∈ K acting as α ⋅ 1 ∈ KG.

Lemma 22.2.19. If K is a field, and G is a finite group, then a KG-module is finitely gen-
erated if and only if it is finite dimensional as a K-vector space.

Proof. If V is generated as a KG-module by {v1 , . . . , vk }, then V is generated as a
K-vector space by the finite set {gvi | g ∈ G, 1 ≤ i ≤ k}, and hence has finite dimension
as a K-vector space. The converse is clear.

We now describe the fundamental connections between modules over group al-
gebras and group representation theory.

Theorem 22.2.20. If K is a field and G is a finite group, then there is a one-to-one cor-
respondence between finitely generated KG-modules and linear actions of G on finite
dimensional K-vector spaces V, and hence with the homomorphisms ρ : G → GL(V) for
finite dimensional K-vector spaces V.

Proof. If V is a finitely generated KG-module, then dimK (V) < ∞ by Lemma 22.2.19,
and the map from G × V to V obtained by restricting the module structure map from
KG × V to V is a linear action.
Conversely, let V be a finite dimensional K-vector space, on which G acts linearly.
Then we place a KG-module structure on V by defining

\[ \Big( \sum_{g \in G} \alpha_g g \Big) v = \sum_{g \in G} \alpha_g (gv) \quad \text{for } \sum_{g \in G} \alpha_g g \in KG \text{ and } v \in V. \]

The processes are inverses of each other.

To define a KG-module structure on a K-vector space V, it suffices to stipulate the
action of the elements of G on V. The action of arbitrary elements of KG on V is then
defined by extending linearly.
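
As a small sketch of this linear extension, assume that G = C_n is cyclic with generator g acting on K^n by the cyclic shift g e_i = e_(i+1), with indices taken modulo n. An element ∑ a_k g^k of KG is stored as a dictionary {k: a_k}; this encoding and the function names act and multiply are assumptions made only for the illustration.

```python
def act(element, v):
    """Apply sum_k a_k g^k to the vector v, where g e_i = e_{(i+1) mod n}."""
    n = len(v)
    out = [0] * n
    for k, a_k in element.items():
        for i, v_i in enumerate(v):
            out[(i + k) % n] += a_k * v_i
    return out

def multiply(a, b, n):
    """Product in K[C_n]: g^j * g^k = g^{(j+k) mod n}, extended bilinearly."""
    out = {}
    for j, a_j in a.items():
        for k, b_k in b.items():
            out[(j + k) % n] = out.get((j + k) % n, 0) + a_j * b_k
    return out

v = [1, 2, 3]
shifted = act({1: 1}, v)               # the action of g itself
summed = act({0: 1, 1: 1, 2: 1}, v)    # the action of 1 + g + g^2

# Module axiom (ab)v = a(bv) for the linearly extended action.
a, b = {0: 1, 1: -1}, {1: 2, 2: 1}
axiom_holds = act(multiply(a, b, 3), v) == act(a, act(b, v))
```

Only the shift by g had to be specified; the action of every other element of KG is determined by it.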
As indicated, for the remainder of this section, G will denote a finite group, and K
will denote a field. All K-vector spaces will be finite dimensional, and all KG-modules
will be finitely generated and hence of finite dimension as K-vector spaces. Our at-
tention will primarily be on KG-modules, although on occasion it will be convenient
to work with the linear representation ρ : G → GL(V) with ρ(g)(v) = gv for g ∈ G, v ∈ V
arising from a given KG-module V.

Example 22.2.21.
(1) The field K can always be considered as a KG-module by defining gλ = λ for all
g ∈ G and λ ∈ K. This module is called the trivial module.
(2) Let G act on the finite set X = {x1 , . . . , xn }. Let KX be the set

\[ \Big\{ \sum_{i=1}^{n} c_i x_i \;\Big|\; c_i \in K \text{ for } i = 1, \dots, n \Big\} \]

of all formal K-linear combinations of elements of X. This then has a
K-vector space structure with basis X. On KX, we may define a KG-module structure in the
following manner: If g ∈ G and $\sum_{i=1}^{n} c_i x_i \in KX$, then

\[ g \Big( \sum_{i=1}^{n} c_i x_i \Big) = \sum_{i=1}^{n} c_i (g x_i). \]

These modules are called the permutation modules.


(3) Let U, V be KG-modules. Then the (external) direct sum U ⊕ V has a KG-module
structure given by

g(u, v) = (gu, gv).

(4) Let U, V be KG-modules, and let HomK (U, V) be the set of all K-linear
maps from U to V. For ϕ, ψ ∈ HomK (U, V) define ϕ + ψ ∈ HomK (U, V)
by

(ϕ + ψ)(u) = ϕ(u) + ψ(u).

With this definition HomK (U, V) is an abelian group. Furthermore, HomK (U, V)
is a K-vector space with (λϕ)(u) = λϕ(u) for λ ∈ K, u ∈ U and ϕ ∈ HomK (U, V).

Note that this K-vector space has finite dimension. The K-vector space HomK (U, V)
also admits a natural KG-module structure. For g ∈ G and ϕ ∈ HomK (U, V), we
define

gϕ : U → V by (gϕ)(u) = g(ϕ(g^{-1}u)).

It is clear that gϕ ∈ HomK (U, V).

For g1 , g2 ∈ G, and ϕ ∈ HomK (U, V), we have

\[ ((g_1 g_2)\phi)(u) = g_1 g_2\, \phi((g_1 g_2)^{-1} u) = g_1 \big( g_2\, \phi(g_2^{-1}(g_1^{-1} u)) \big) = g_1 \big( (g_2 \phi)(g_1^{-1} u) \big) = (g_1 (g_2 \phi))(u). \]

Therefore, (g1 g2 )ϕ = g1 (g2 ϕ). It follows that HomK (U, V) has a KG-module structure.
The KG-module homomorphisms from U to V are exactly the elements ϕ with gϕ = ϕ for all g ∈ G.
We write U⋆ for HomK (U, K), where K is the trivial
module. U⋆ is called the dual module of U, and here we have (gϕ)(u) = ϕ(g^{-1}u).

Theorem 22.2.22 (Maschke’s Theorem). Let G be a finite group, and suppose that the
characteristic of K is either 0 or co-prime to |G|; that is, gcd(char(K), |G|) = 1. If U
is a KG-module and V is a KG-submodule of U, then V is a direct summand of U as
KG-modules.

Proof. U is, in particular, a finite dimensional K-vector space, and V is a K-subspace.
Any basis for V can be extended to a basis of U. Hence, there is some subspace W of U
such that U = V ⊕ W as K-vector spaces. However, W may not be a KG-submodule of U.
Let π : U → V be the projection of U onto V in terms of this vector space decomposition;
that is, π is the unique linear transformation that is the identity on V and
zero on W. We now define a linear transformation

\[ \pi' : U \to U \]

by

\[ \pi'(u) = \frac{1}{|G|} \sum_{g \in G} g\,\pi(g^{-1}u) \quad \text{for } u \in U. \]

Since char(K) = 0, or gcd(char(K), |G|) = 1, it follows that |G| ≠ 0 in K; hence, 1/|G| exists
in K. Therefore, the definition of π′ makes sense.

We have gv ∈ V for any g ∈ G and v ∈ V, because V is a KG-submodule of U.
Therefore, the map π′ maps U into V. Moreover, since π is the identity on V, we have
that gπ(g^{-1}v) = gg^{-1}v = v for any g ∈ G and v ∈ V. Therefore, the restriction of π′ to
V is the identity. It also follows that U = V ⊕ ker(π′) as K-vector spaces.
It remains to show that ker(π′) is a KG-submodule of U. To show this, it is sufficient
to show that π′ is a KG-module homomorphism; that is, we must show that π′(xu) =
xπ′(u) for any x ∈ G and u ∈ U. We have

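\[ \pi'(xu) = \frac{1}{|G|} \sum_{g \in G} g\,\pi(g^{-1}xu) = \frac{1}{|G|} \sum_{g \in G} x x^{-1} g\,\pi(g^{-1}xu) = x \Big( \frac{1}{|G|} \sum_{g \in G} x^{-1} g\,\pi(g^{-1}xu) \Big). \]

But as g varies through G, the element y = x^{-1}g also varies through G for fixed x ∈ G.
Therefore,

\[ \pi'(xu) = x \Big( \frac{1}{|G|} \sum_{y \in G} y\,\pi(y^{-1}u) \Big) = x\,\pi'(u) \]

as required.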
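
The averaging step of this proof can be followed in a minimal sketch. We assume, purely for illustration, K = ℚ and G = {1, s} of order 2 acting on U = ℚ^2 by swapping the two coordinates, with V spanned by (1, 1). The complement W spanned by (1, 0) is not G-invariant, and the projection π along W has matrix P below; averaging over G produces π′. The helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def averaged_projection(P, elements):
    """Return (1/|G|) * sum of g P g^{-1} over the pairs (g, g^{-1}) in elements."""
    n = len(P)
    total = [[Fraction(0)] * n for _ in range(n)]
    for g, g_inv in elements:
        term = matmul(matmul(g, P), g_inv)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return [[x / len(elements) for x in row] for row in total]

I = [[1, 0], [0, 1]]
S = [[0, 1], [1, 0]]   # the swap s, which is its own inverse
P = [[0, 1], [0, 1]]   # (a, b) -> (b, b): identity on V, zero on W = span{(1, 0)}

P_avg = averaged_projection(P, [(I, I), (S, S)])

# pi does not commute with s, but the averaged map pi' does.
pi_commutes = matmul(S, P) == matmul(P, S)
pi_avg_commutes = matmul(S, P_avg) == matmul(P_avg, S)
```

In this example the kernel of π′ is spanned by (1, −1), which is a G-invariant complement of V.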

Definition 22.2.23. A module U is semisimple if it is a direct sum of simple modules.
If U = {0}, then the sum is the empty sum.

Corollary 22.2.24. Let G be a finite group and K a field. Suppose that either char(K) = 0
or char(K) is relatively prime to |G|. Then every nonzero KG-module is semisimple.

Proof. Let U be a nonzero KG-module. We use induction on dimK (U). If U is simple, we
are done. This includes the case where dimK (U) = 1. Suppose that dimK (U) > 1, and
assume that U is not simple. Then U must have a nonzero proper KG-submodule V.
By Maschke’s theorem, we have U = V ⊕ W for some nonzero proper KG-submodule W
of U. Then both V and W have dimension strictly less than dimK (U). By the induction
hypothesis, both V and W are semisimple; therefore, U is semisimple.

We now present a version of Maschke’s theorem for linear group representations
ρ : G → GL(U), where ρ(g)(u) = gu for g ∈ G, u ∈ U, which arise from a given
KG-module U.
To formulate Maschke’s result, we need some additional definitions and notation.

Definition 22.2.25.
(1) A K-vector subspace V of U is a G-invariant subspace if gv ∈ V for all g ∈ G and
v ∈ V.
(2) Let U be nonzero. A representation ρ : G → GL(U) is irreducible if {0} and U are
the only G-invariant subspaces of U.
(3) Let U be nonzero. A representation ρ : G → GL(U) is fully reducible if each
G-invariant subspace V of U has a G-invariant complement W in U; that is,
U = V ⊕ W as K-vector spaces.

Theorem 22.2.26 (Maschke’s theorem). Let G be a finite group and K a field. Suppose
that either char(K) = 0 or char(K) is relatively prime to |G|. Let U be a finite dimensional
K-vector space. Then each representation ρ : G → GL(U) is fully reducible.

Proof. By Theorem 22.2.1, we may consider U as a KG-module. Then the above version
of Maschke’s theorem follows from the version for modules, because the KG-submodules
of U are exactly the G-invariant subspaces of U.

The theory of KG-modules when char(K) = p > 0 and p divides |G|, in which case
arbitrary KG-modules need not be semisimple, is called modular representation
theory. The earliest work on modular representations was done by Dickson, and many
of the main developments are due to Brauer. More details and a good overview may
be found in [1], [4], [5], and [17].
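
A minimal sketch of how semisimplicity fails in the modular case: assume G has order 2 and acts on the plane over the prime field with p elements by swapping coordinates, and let V be the line spanned by (1, 1). A complement of V is any other line, so it suffices to list the G-invariant lines different from V. The function name and the encoding of lines as sets of vectors are assumptions made for this illustration, and p is assumed to be prime.

```python
from itertools import product

def invariant_complements(p):
    """List the swap-invariant lines of (F_p)^2 other than V = span{(1, 1)}."""
    def span(v):
        # All scalar multiples of v over F_p (a line, since p is prime and v != 0).
        return frozenset(tuple((c * x) % p for x in v) for c in range(p))

    lines = {span(v) for v in product(range(p), repeat=2) if v != (0, 0)}
    V = span((1, 1))
    return [L for L in lines
            if L != V and all((w[1], w[0]) in L for w in L)]

modular_case = invariant_complements(2)    # p = 2 divides |G| = 2
coprime_case = invariant_complements(3)    # p = 3 is coprime to |G| = 2
```

For p = 2 no invariant complement exists, so this module is not semisimple; for p = 3 the line spanned by (1, 2) = (1, −1) is an invariant complement, in agreement with Maschke's theorem.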

22.3 Semisimple algebras and Wedderburn’s theorem


In this section, K will denote a field and all algebras will be finite dimensional
K-algebras and, unless explicitly stated otherwise, will be algebras with an identity el-
ement. All modules and algebras are assumed to be finitely generated or equivalently
finite dimensional as K-vector spaces. All direct sums of modules will be assumed to
be finite.
Let A be an algebra. We are interested in semisimple A-modules, and want to de-
termine conditions on A so that every A-module is semisimple.

Lemma 22.3.1. Let M be an A-module. Then the following are equivalent:


(1) Any submodule of M is a direct summand of M.
(2) M is semisimple.
(3) M is a sum of simple submodules.

Proof. The implication (1) ⟹ (2) follows in the same manner as Corollary 22.2.24.
The implication (2) ⟹ (3) is direct.
Finally, we must show the implication (3) ⟹ (1). Suppose that (3) holds, and
let N be a submodule of M. Let V be a submodule of M that is maximal among
all submodules of M that intersect N trivially. Such a submodule V exists by Zorn’s
lemma. We wish to show that N + V = M. Suppose that N + V ≠ M (certainly we have
N + V ⊂ M). If every simple submodule of M were contained in N + V, then as M can
be written as a sum of simple submodules, we would have M ⊂ N + V. This is not
the case, since N + V ≠ M. Hence, there is some simple submodule S of M that is not
contained in N + V. Since S ∩ (N + V) is a proper submodule of the simple module S,
we must have S ∩ (N + V) = {0}. In particular, S ∩ V = {0}, so we have V ⊊ V + S. Let
n ∈ N ∩ (V + S). Then n = s + v for some v ∈ V and s ∈ S. This gives s = n − v ∈ S ∩ (N + V),
and therefore s = 0. Hence, n = v, which forces n to be 0, because N ∩ V = {0}. It
follows that N ∩ (V + S) = {0}, which contradicts the maximality of V. Hence, we
have M = N + V. Furthermore, since N ∩ V = {0}, we get that the sum is direct and
M = N ⊕ V. Therefore, N is a direct summand of M, which proves the implication
(3) ⟹ (1), completing the proof of the lemma.

Lemma 22.3.2. Submodules and factor modules of semisimple modules are also semi-
simple.

Proof. Let M be a semisimple A-module. By the previous lemma and the isomorphism
theorem for modules, we get that every submodule of M is isomorphic to a factor mod-
ule of M. Therefore, it suffices to show that factor modules of M are semisimple. Let
M/N be an arbitrary factor module, and let η : M → M/N with m ↦ m + N be the
canonical map. Since M is semisimple, we have M = S1 + ⋅ ⋅ ⋅ + Sn with n ∈ ℕ, and each
Si a simple module. Then M/N = η(M) = η(S1 ) + ⋅ ⋅ ⋅ + η(Sn ). But each η(Si ) is isomorphic
to a factor module of Si , and hence each η(Si ) is either {0} or a simple module. There-
fore, M/N is a sum of simple modules, and hence semisimple by Lemma 22.3.1.

Definition 22.3.3. An algebra A is semisimple if all nonzero A-modules are semisimple.

Note that if G is a finite group, and either char(K) = 0 or gcd(char(K), |G|) = 1,
then KG is semisimple.
We now give some fundamental results on semisimple algebras.

Lemma 22.3.4. The algebra A is semisimple if and only if the A-module A is semisimple.

Proof. Suppose that the A-module A is semisimple, and let M be an A-module gen-
erated by {m1 , . . . , mr }. Let A^r denote the direct sum of r copies of A. Then (a1 , . . . , ar ) ↦
a1 m1 + ⋅ ⋅ ⋅ + ar mr defines a map from A^r to M, which is an A-module epimorphism.
Thus, M is isomorphic to a factor module of the semisimple module A^r , and hence
semisimple by Lemma 22.3.2. It follows that A is a semisimple algebra.
The converse is clear.

Theorem 22.3.5. Let A be a semisimple algebra, and suppose that as an A-module, we
have

A ≅ S1 ⊕ ⋅ ⋅ ⋅ ⊕ Sr , r ∈ ℕ,

where the Si are simple submodules of A. Then any simple A-module is isomorphic to
some Si .

Proof. Let S be a simple A-module and s ∈ S with s ≠ 0. We define an A-module
homomorphism ϕ : A → S by ϕ(a) = as for a ∈ A. Since S is simple, the map ϕ is
surjective. For each i, let ϕi = ϕ|Si be the restriction of ϕ to Si . If ϕi = 0 for all i, then
we would have ϕ = 0. Hence, ϕi is nonzero for some i, and it follows from Schur’s
lemma that ϕi : Si → S is an isomorphism for such an i.

Theorem 22.3.6. Suppose that A is a semisimple algebra, and let S1 , . . . , Sr be a collec-
tion of simple A-modules such that every simple A-module is isomorphic with exactly
one Si . Let M be an A-module, and let

M ≅ m1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ mr Sr

for some integers mi ∈ ℕ ∪ {0}. Then the mi are uniquely determined.

Proof. There is a composition series of m1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ mr Sr having m1 + ⋅ ⋅ ⋅ + mr terms,
in which Si appears mi times as a composition factor. The result then follows from the
Jordan–Hölder theorem for modules (Theorem 22.2.10).

Whenever a module is written as m1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ mr Sr , as in the previous
theorem, we will say that the Si are the distinct simple A-modules; that is, the Si are
pairwise nonisomorphic.
We want to classify all semisimple algebras. We start by showing the semi-
simplicity of a certain class of algebras, and then showing that all semisimple algebras
fall in this class. We will introduce this class in steps.
Let D be a finite-dimensional K-algebra. Then for any n ∈ ℕ, the set Mn (D) of n × n
matrices with entries in D is a finite dimensional K-algebra of dimension n^2 dimK (D).
Algebras of this form are called matrix algebras over D.
For 1 ≤ i, j ≤ n, and α ∈ D, let Eij (α) be the matrix, whose only nonzero entry is
equal to α, and occurs in the (i, j)-th position.
Let D^n be the set of column vectors of length n with entries from D. D^n forms an
Mn (D)-module under matrix multiplication.

Definition 22.3.7. An algebra D is a division algebra or skew field if the nonzero ele-
ments of D form a group under multiplication. Equivalently, it is a ring, where every nonzero element has a
multiplicative inverse. It is exactly the definition of a field without requiring commu-
tativity.

Any field K is a division algebra over itself, but there may be division algebras
that are noncommutative. If the interest is on the ring structure of D, one often speaks
about division rings (see Chapter 7).

Theorem 22.3.8. Let D be a division algebra, and let n ∈ ℕ. Then any simple
Mn (D)-module is isomorphic to D^n , and Mn (D) is an Mn (D)-module isomorphic to the
direct sum of n copies of D^n . In particular, Mn (D) is a semisimple algebra.

Proof. A nonzero submodule of D^n must contain some nonzero vector, which must
have a nonzero entry x in the j-th place for some j. This x is invertible in D.
By premultiplying this vector by Ejj (x^{-1}), we see that the submodule contains the
j-th canonical basis vector. By premultiplying this basis vector by appropriate permutation
matrices, we get that the submodule contains every canonical basis vector, and
hence contains every vector.
It follows that D^n is the only nonzero Mn (D)-submodule of D^n , and hence D^n is
simple. Now for each 1 ≤ k ≤ n, let Ck be the submodule of Mn (D) consisting of those
matrices whose only nonzero entries appear in the k-th column. Then we have

Mn (D) ≅ C1 ⊕ ⋅ ⋅ ⋅ ⊕ Cn

as Mn (D)-modules. But each Ck is isomorphic as an Mn (D)-module to D^n .
It follows that Mn (D) is a semisimple algebra by Lemma 22.3.4, and then D^n is the
unique simple Mn (D)-module by Theorem 22.3.5.

Definition 22.3.9. A nonzero algebra is simple if its only (two-sided) ideals (as a ring)
are itself and the zero ideal.

Lemma 22.3.10. Simple algebras are semisimple.

Proof. Let A be a simple algebra, and let Σ be the sum of all simple submodules of A.
Let S be a simple submodule of A, and let a ∈ A. Then the map ϕ : S → Sa, given by
s ↦ sa, is a module epimorphism. Therefore, Sa is simple, or Sa = {0}. In either case,
we have Sa ⊂ Σ for any simple submodule S and any a ∈ A.
It follows that Σ is a right ideal in A, and hence that Σ is a two-sided ideal. However,
A is simple, and Σ ≠ {0}, so we must have Σ = A. Therefore, A is the sum of simple
A-modules, and from Lemmas 22.3.1 and 22.3.4, it follows that A is a semisimple algebra.

Theorem 22.3.11. Let D be a division algebra, and let n ∈ ℕ. Then Mn (D) is a simple
algebra.


Proof. Let M ∈ Mn (D) with M ≠ 0. We must show that the principal two-sided ideal
J of Mn (D) generated by M is equal to Mn (D).
It suffices to show that J contains each Eij (1), since these matrices generate Mn (D)
as an Mn (D)-module. Since M ≠ 0, there exist some 1 ≤ r, s ≤ n such that the
(r, s)-entry of M is nonzero. We call this entry x. By calculation, we have

Ess (1) = Esr (x^{-1}) M Ess (1) ∈ J.

Now let 1 ≤ i, j ≤ n, and let w, w′ be the permutation matrices corresponding to the
transpositions (i, s) and (s, j), respectively. Then Eij (1) = w Ess (1) w′ ∈ J.
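
The two matrix identities used in this proof can be checked mechanically. The following sketch is an illustration added here, not part of the original argument: it assumes D = ℚ (modelled by Python's Fraction), n = 3, and an arbitrarily chosen nonzero matrix M. The helper names unit, matmul, and transposition are hypothetical, and indices are 0-based.

```python
from fractions import Fraction

def unit(n, i, j, a):
    """E_ij(a): the n x n matrix whose only nonzero entry is a, at position (i, j)."""
    return [[a if (r, c) == (i, j) else Fraction(0) for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def transposition(n, i, j):
    """Permutation matrix of the transposition (i j): the identity with rows i, j swapped."""
    P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    P[i], P[j] = P[j], P[i]
    return P

n = 3
# An arbitrary nonzero matrix; its (r, s) = (1, 2) entry x = 5/2 is nonzero.
M = [[Fraction(0), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(5, 2)],
     [Fraction(1), Fraction(0), Fraction(0)]]
r, s = 1, 2
x = M[r][s]

# E_ss(1) = E_sr(x^{-1}) M E_ss(1)
E_ss = matmul(matmul(unit(n, s, r, 1 / x), M), unit(n, s, s, Fraction(1)))

# E_ij(1) = w E_ss(1) w' for the transpositions w = (i s) and w' = (s j)
all_units_reached = all(
    matmul(matmul(transposition(n, i, s), E_ss), transposition(n, s, j))
    == unit(n, i, j, Fraction(1))
    for i in range(n) for j in range(n)
)
```

Every matrix unit is obtained from M by multiplying on the left and on the right, which is exactly why the two-sided ideal generated by M is all of Mn (D).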
Let B1 , . . . , Br be algebras. The external direct sum B = B1 ⊕B2 ⊕⋅ ⋅ ⋅⊕Br is the algebra,
whose underlying set is the Cartesian product, and whose addition, multiplication,
and scalar multiplication are defined componentwise.
If M is a Bi -module for some i, then M has a B-module structure by (b1 , . . . , br )m =
bi m.
If M is simple (respectively semisimple) as a Bi -module, then M is also simple
(respectively semisimple) as a B-module. For each i, the set of elements of B, whose
only nonzero entry is in the ith component of B, is an ideal in B, and this ideal is
B-module isomorphic to Bi .
Now suppose that B is an algebra having ideals B1 , . . . , Br such that, as vector
spaces, B is the direct sum of the Bi . Then B is isomorphic to the external direct sum
B1 ⊕ ⋅ ⋅ ⋅ ⊕ Br by the map

b = b1 + ⋅ ⋅ ⋅ + br 󳨃→ (b1 , . . . , br ).

The algebra B is the internal direct sum as algebras of the Bi . This can be seen as follows.
If i ≠ j and bi ∈ Bi , bj ∈ Bj , then we must have bi bj ∈ Bi ∩ Bj = {0}, since Bi and Bj
are ideals. Therefore, the product in B of b1 + ⋅ ⋅ ⋅ + br and b′1 + ⋅ ⋅ ⋅ + b′r is just b1 b′1 + ⋅ ⋅ ⋅ + br b′r .

Lemma 22.3.12. Let B = B1 ⊕ ⋅ ⋅ ⋅ ⊕ Br be a direct sum of algebras. Then the (two-sided)
ideals of B are precisely the sets of the form J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr , where Ji is a (two-sided) ideal of
Bi for each i.

Proof. Let J be a (two-sided) ideal of B, and let Ji = J ∩ Bi for each i. Certainly, J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr ⊂ J.
Let b ∈ J, then b = b1 + ⋅ ⋅ ⋅ + br with bi ∈ Bi for each i. For each i, let ei =
(0, . . . , 0, 1, 0, . . . , 0); that is, the element of B whose only nonzero entry is the identity
element of Bi . Then bi = b ei ∈ J ∩ Bi = Ji . Therefore, b ∈ J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr , which shows that
J = J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr .
The converse is clear.

Theorem 22.3.13. Let r ∈ ℕ. For each 1 ≤ i ≤ r, let Di be a division algebra over K.
Let ni ∈ ℕ, and let Bi = Mni (Di ). Let B be the external direct sum of the Bi . Then B
is a semisimple algebra having exactly r isomorphism classes of simple modules and
exactly 2^r (two-sided) ideals, namely, every sum of the form ⊕j∈J Bj , where J is a subset
of {1, . . . , r}.


Proof. For each i, we write Bi = Ci1 ⊕ ⋅ ⋅ ⋅ ⊕ Ci,ni using Theorem 22.3.8, where the Cij are
mutually isomorphic simple Bi -modules. As we saw above, each Cij is also simple as a B-module.
Therefore, as B-modules, we have B ≅ ⊕i,j Cij , and hence B is a semisimple algebra by
Lemma 22.3.4. From Theorem 22.3.5, we get that any simple B-module is isomorphic
to some Cij , but Cij ≅ Ckl if and only if i = k. Hence, there are exactly r isomorphism
classes of simple B-modules. The final statement is a straightforward consequence of
Theorem 22.3.11 and Lemma 22.3.12.

We saw that a direct sum of matrix algebras over division algebras is semisimple.
We now start to show that the converse is also true; that is, any semisimple algebra is
isomorphic to a direct sum of matrix algebras over division algebras. This is Wedderburn's
theorem.

Definition 22.3.14. If M is an A-module, then let EndA (M) = HomA (M, M) denote the
set of all A-module endomorphisms of M. In a more general context, we have seen that
EndA (M) has the structure of a K-vector space via

(ϕ + ψ)(m) = ϕ(m) + ψ(m)

(λϕ)(m) = ϕ(λm)

for all ϕ, ψ ∈ EndA (M), λ ∈ K, and m ∈ M. The composition of mappings gives a
multiplication in EndA (M), and hence EndA (M) is a K-algebra, called the endomorphism
algebra of M.

Definition 22.3.15. The opposite algebra of B, denoted B^op , is the set B together with
the usual addition and scalar multiplication, but with the opposite multiplication,
that is, the multiplication rule of B reversed.

Given a, b ∈ B, we use ab to denote their product in B, and a ⋅ b to denote their product
in B^op . Hence, a ⋅ b = ba. We certainly have (B^op )^op = B. If B is a division algebra,
then so is B^op . The opposite of a direct sum of algebras is the direct sum of the opposite
algebras, because the multiplication in the direct sum is defined componentwise.
Endomorphism algebras and opposite algebras are closely related.

Lemma 22.3.16. Let B be an algebra. Then B^op ≅ EndB (B).

Proof. Let ϕ ∈ EndB (B), and let a = ϕ(1). Then ϕ(b) = bϕ(1) = ba for any b ∈ B;
hence, ϕ is equal to the endomorphism ψa given by right multiplication by a. Therefore,
EndB (B) = {ψa | a ∈ B}; hence, EndB (B) and B are in one-to-one correspondence.
To finish the proof, we must show that ψa ψb = ψa⋅b for any a, b ∈ B.
Let a, b ∈ B. Then ψa ψb (x) = ψa (xb) = xba = ψba (x) = ψa⋅b (x), as required.
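
The reversal of the product in this lemma is easy to see on a concrete noncommutative algebra. The sketch below is an added illustration, assuming B is the algebra of 2 × 2 integer matrices; the names matmul and psi and the sample matrices are hypothetical choices.

```python
def matmul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def psi(a):
    """psi_a: right multiplication by a, an endomorphism of B as a left B-module."""
    return lambda x: matmul(x, a)

a = [[1, 2], [0, 1]]
b = [[0, 1], [1, 1]]
x = [[3, -1], [4, 2]]

composite = psi(a)(psi(b)(x))        # (psi_a psi_b)(x) = (x b) a
via_ba = psi(matmul(b, a))(x)        # psi_{ba}(x) = x (b a)
via_ab = psi(matmul(a, b))(x)        # psi_{ab}(x) = x (a b), in general different
```

Here composite agrees with via_ba and differs from via_ab, so composition of the maps ψa corresponds to the product a ⋅ b = ba of B^op rather than to the product ab of B.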

Lemma 22.3.17. Let S1 , . . . , Sr be the r distinct simple A-modules of Theorem 22.3.6. For
each i, let Ui be a direct sum of copies of Si , and let U = U1 ⊕ ⋅ ⋅ ⋅ ⊕ Ur . Then

EndA (U) ≅ EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur ).


Proof. Let ϕ ∈ EndA (U). Fix some i. Then every composition factor of Ui is isomorphic
to Si . Therefore, by the Jordan–Hölder theorem for modules (Theorem 22.2.10), we see
that the same is true for ϕ(Ui ), since ϕ(Ui ) is isomorphic to a quotient of Ui . Assume
that ϕ(Ui ) is not contained in Ui . Then the image of ϕ(Ui ) in U/Ui under the canonical
map is a nonzero submodule having Si as a composition factor. However, the composition
factors of U/Ui are exactly those Sj for j ≠ i, so a submodule of U/Ui cannot have
Si as a composition factor. This gives a contradiction, and it follows that ϕ(Ui ) ⊂ Ui . For
each i, we can define ϕi = ϕ|Ui , and we have ϕi ∈ EndA (Ui ). In this way, we define a
map

Γ : EndA (U) → EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur )

by setting

Γ(ϕ) = (ϕ1 , . . . , ϕr ) ∈ EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur ).

It is straightforward to check that Γ is an injective algebra homomorphism.



Now let (ϕ1 , . . . , ϕr ) ∈ EndA (U1 )⊕⋅ ⋅ ⋅⊕EndA (Ur ). We define ϕ ∈ EndA (U) as follows:
Given x ∈ U with x = x1 + ⋅ ⋅ ⋅ + xr , and xi ∈ Ui for each i, then

ϕ(x) = ϕ1 (x1 ) + ⋅ ⋅ ⋅ + ϕr (xr ).

We then have (ϕ1 , . . . , ϕr ) = Γ(ϕ), which shows that Γ is surjective, and hence an iso-
morphism.

Lemma 22.3.18. If S is a simple A-module, then EndA (nS) ≅ Mn (EndA (S)) for any n ∈ ℕ.

Proof. We regard the elements of nS as being column vectors of length n with entries
from S. Let Φ = (ϕij ) ∈ Mn (EndA (S)). We now define the map

Γ(Φ) : nS → nS

by

\[
\Gamma(\Phi)\begin{pmatrix} s_1 \\ \vdots \\ s_n \end{pmatrix}
= \begin{pmatrix} \phi_{11} & \cdots & \phi_{1n} \\ \vdots & & \vdots \\ \phi_{n1} & \cdots & \phi_{nn} \end{pmatrix}
\begin{pmatrix} s_1 \\ \vdots \\ s_n \end{pmatrix}
= \begin{pmatrix} \phi_{11}(s_1) + \cdots + \phi_{1n}(s_n) \\ \vdots \\ \phi_{n1}(s_1) + \cdots + \phi_{nn}(s_n) \end{pmatrix}.
\]

We write \(\vec{s} = (s_1, \ldots, s_n)^t \in nS\). Then

\[
\Gamma(\Phi)(a\vec{s} + \vec{t}) = a\,\Gamma(\Phi)(\vec{s}) + \Gamma(\Phi)(\vec{t})
\]

for any a ∈ A and \(\vec{s}, \vec{t} \in nS\), because each ϕij is an A-module homomorphism.
Therefore, Γ(Φ) ∈ EndA (nS), and we easily obtain that

Γ : Mn (EndA (S)) → EndA (nS)

by

Φ 󳨃→ Γ(Φ)

is an algebra monomorphism.
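Now let ψ ∈ EndA (nS). For each 1 ≤ i, j ≤ n, we define ψij : S → S implicitly by

\[
\psi\begin{pmatrix} s \\ 0 \\ \vdots \\ 0 \end{pmatrix}
= \begin{pmatrix} \psi_{11}(s) \\ \vdots \\ \psi_{n1}(s) \end{pmatrix},
\quad \ldots, \quad
\psi\begin{pmatrix} 0 \\ \vdots \\ 0 \\ s \end{pmatrix}
= \begin{pmatrix} \psi_{1n}(s) \\ \vdots \\ \psi_{nn}(s) \end{pmatrix}.
\]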

We get that each ψij ∈ EndA (S). Now let Ψ = (ψij ) ∈ Mn (EndA (S)). Then Γ(Ψ) = ψ,
showing that Γ is also surjective, and hence an isomorphism.

If S is a simple A-module, then EndA (S) is a division algebra by Schur’s lemma
(Theorem 22.2.15). If the ground field K is algebraically closed, then more specific
results can be stated about the structure of EndA (S).

Lemma 22.3.19. Suppose that K is algebraically closed, and let S be a simple A-module.
Then EndA (S) ≅ K.

Proof. Let ϕ ∈ EndA (S). Consider ϕ as a K-linear map of the finite-dimensional
K-vector space S into itself. Since K is algebraically closed, ϕ has an
eigenvalue λϕ ∈ K. If I is the identity element of EndA (S), then (ϕ − λϕ I) ∈ EndA (S) has
a nonzero kernel, and therefore is not invertible. From this, it follows that ϕ = λϕ I,
since EndA (S) is a division algebra. The map ϕ ↦ λϕ is then an isomorphism from
EndA (S) to K.

Lemma 22.3.20. Let B be an algebra. Then (Mn (B))^op ≅ Mn (B^op ) for any n ∈ ℕ.

Proof. Define the map ψ : (Mn (B))^op → Mn (B^op ) by ψ(X) = X^t , where X^t is the
transpose of the matrix X. This map is bijective.
Let X = (xij ) and Y = (yij ) be elements of (Mn (B))^op . Then for any i and j we have

\[
\begin{aligned}
(\psi(X)\psi(Y))_{ij} &= \sum_{k=1}^{n} \psi(X)_{ik} \cdot \psi(Y)_{kj} = \sum_{k=1}^{n} (X^t)_{ik} \cdot (Y^t)_{kj} \\
&= \sum_{k=1}^{n} X_{ki} \cdot Y_{jk} = \sum_{k=1}^{n} Y_{jk} X_{ki} = (YX)_{ji} \\
&= ((YX)^t)_{ij} = ((X \cdot Y)^t)_{ij} = \psi(X \cdot Y)_{ij}.
\end{aligned}
\]


Therefore, ψ(X ⋅ Y) = ψ(X)ψ(Y), so ψ is an algebra homomorphism and, since
it is bijective, also an algebra isomorphism.

We are now at the point of stating Wedderburn’s main structure theorem for
semisimple algebras.

Theorem 22.3.21 (Wedderburn). The algebra A is semisimple if and only if it is isomorphic
to a direct sum of matrix algebras over division algebras.

Proof. Suppose that the algebra A is semisimple. Then A is of the form A = U1 ⊕ ⋅ ⋅ ⋅ ⊕ Ur ,
where each Ui is the direct sum of ni copies of a simple A-module Si , and no two of
the distinct Si are isomorphic. We have A^op ≅ EndA (A) by Lemma 22.3.16, and EndA (A) ≅
EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur ) by Lemma 22.3.17. Therefore,

A^op ≅ EndA (n1 S1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (nr Sr ),

and then by Lemma 22.3.18,

A^op ≅ Mn1 (EndA (S1 )) ⊕ ⋅ ⋅ ⋅ ⊕ Mnr (EndA (Sr )).

Hence, it follows from Lemma 22.3.20 that

\[
\begin{aligned}
A &\cong \bigl(M_{n_1}(\operatorname{End}_A(S_1)) \oplus \cdots \oplus M_{n_r}(\operatorname{End}_A(S_r))\bigr)^{op} \\
&\cong \bigl(M_{n_1}(\operatorname{End}_A(S_1))\bigr)^{op} \oplus \cdots \oplus \bigl(M_{n_r}(\operatorname{End}_A(S_r))\bigr)^{op} \\
&\cong M_{n_1}(\operatorname{End}_A(S_1)^{op}) \oplus \cdots \oplus M_{n_r}(\operatorname{End}_A(S_r)^{op}).
\end{aligned}
\]

Since the endomorphism algebra of a simple module is a division algebra, and
the opposite algebra of a division algebra is also a division algebra, it follows that
a semisimple algebra is isomorphic to a direct sum of matrix algebras over division
algebras.
The converse is a direct consequence of Theorem 22.3.13.

Theorem 22.3.22. The algebra A is simple if and only if it is isomorphic to a matrix
algebra over a division algebra.

Proof. Suppose that A is a simple algebra. Then by Lemma 22.3.10, A is semisimple;
hence, by Theorem 22.3.21, A is isomorphic to a direct sum of r matrix algebras over
division algebras. From Theorem 22.3.13, we have that A has exactly 2^r ideals. However,
A is simple, and hence has only 2 ideals. Therefore, r = 1. Hence, any simple algebra
is isomorphic to a matrix algebra over a division algebra.
The converse follows from Theorem 22.3.11.

We see that an algebra is semisimple if and only if it is a direct sum of simple
algebras. This affirms the consistency of the choice of terminology.


Theorem 22.3.23. Suppose that the field K is algebraically closed. Then any semisimple
algebra is isomorphic to a direct sum of matrix algebras over K.

Proof. This follows directly from Lemma 22.3.19 and Theorem 22.3.21.

22.4 Ordinary representations, characters and character theory


In this section, we look at a concept, the character of a representation, which gives
more information than one might expect at first glance. Throughout this section, we
will be concerned with the case, where the ground field K is ℂ, the field of complex
numbers. In this case, representation theory of groups is called ordinary representa-
tion theory. Recall that ℂ has characteristic 0 and is algebraically closed. For this sec-
tion, G will denote a finite group, and all ℂG-modules are finitely generated, or equiv-
alently have finite dimension as ℂ-vector spaces.
From Theorem 22.3.21, we see that every nonzero ℂG-module is semisimple for
any finite group G. It follows, from Wedderburn’s theorem, that we have very specific
information about the nature of the group algebra ℂG.

Theorem 22.4.1. There exists some r ∈ ℕ and some f1 , . . . , fr ∈ ℕ such that

ℂG ≅ Mf1 (ℂ) ⊕ ⋅ ⋅ ⋅ ⊕ Mfr (ℂ)

as ℂ-algebras.
Furthermore, there are exactly r isomorphism classes of simple ℂG-modules, and if
we let S1 , . . . , Sr be representatives of these r classes, then we can order the Si so that

ℂG ≅ f1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ fr Sr

as ℂG-modules, where dimℂ Si = fi for each i. Any ℂG-module can be written uniquely
in the form

a1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ ar Sr ,

where all ai ∈ ℕ ∪ {0}.

Proof. The theorem follows from our results on the classification of simple and
semisimple algebras. The first statement follows from Corollary 22.2.24 and
Theorem 22.3.23. The second statement follows from Theorems 22.3.8 and 22.3.13, where we
take Si as the space of column vectors of length fi with the canonical module structure
over the i-th summand Mfi (ℂ).
The final statement follows from Theorem 22.3.6.

Definition 22.4.2. The ℂ-dimensions f1 , . . . , fr of the r simple ℂG-modules are called
the degrees of the irreducible representations of G.


The trivial ℂG-module ℂ is one-dimensional, and hence simple. Therefore, G will
always have at least one representation of degree 1. By convention, we let f1 = 1. The
degrees are constrained by the order of the group G.

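Corollary 22.4.3.

\[
\sum_{i=1}^{r} f_i^2 = |G|.
\]

Proof. Theorem 22.4.1 gives

\[
|G| = \dim_{\mathbb{C}}(\mathbb{C}G) = \dim_{\mathbb{C}}\Bigl(\bigoplus_{i=1}^{r} M_{f_i}(\mathbb{C})\Bigr)
= \sum_{i=1}^{r} \dim_{\mathbb{C}}(M_{f_i}(\mathbb{C})) = \sum_{i=1}^{r} f_i^2 .
\]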

We note that the degrees of G divide |G|. We do not need this fact. For a proof see
the appendix in the book [1].
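
To see how strongly these two conditions restrict the degrees, one can enumerate the candidates for a small order. The sketch below is an added illustration; the function name degree_patterns is hypothetical. It lists all nondecreasing tuples starting with f1 = 1 whose squares sum to |G| and whose entries divide |G|. It only produces candidates: it does not decide which patterns are realized by an actual group.

```python
def degree_patterns(order):
    """All nondecreasing tuples (1, f_2, ..., f_r) with sum of squares equal to
    `order` and every entry dividing `order`."""
    divisors = [d for d in range(1, order + 1) if order % d == 0]
    results = []

    def extend(prefix, remaining):
        # `remaining` is the part of `order` not yet accounted for by squares.
        if remaining == 0:
            results.append(tuple(prefix))
            return
        for d in divisors:
            if d >= prefix[-1] and d * d <= remaining:
                extend(prefix + [d], remaining - d * d)

    extend([1], order - 1)   # f_1 = 1 comes from the trivial module
    return results

patterns_for_6 = degree_patterns(6)
```

For |G| = 6 only two candidates survive: six degrees equal to 1 and the pattern (1, 1, 2). It is a standard fact, not proved in this section, that the first occurs for the cyclic group of order 6 and the second for the symmetric group S3.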

Theorem 22.4.4. The number r of isomorphism classes of simple ℂG-modules is equal to the
number of conjugacy classes of G.

Proof. Let Z be the center of ℂG; that is, the subalgebra of ℂG consisting of all elements
that commute with every element of ℂG. From Theorem 22.4.1, it follows that Z
is isomorphic to the center of Mf1 (ℂ) ⊕ ⋅ ⋅ ⋅ ⊕ Mfr (ℂ), and therefore is isomorphic to the
direct sum of the centers of the Mfi (ℂ). It is straightforward that the center of Mfi (ℂ) is
equal to the set of scalar matrices

{αI | I is the identity matrix in Mfi (ℂ), α ∈ ℂ}.

Hence, the center of Mfi (ℂ) is isomorphic to ℂ, and therefore Z ≅ ℂ^r , which implies
that dimℂ (Z) = r.
We now consider an element ∑g∈G λg g of Z. For any h ∈ G, we have

\[
\Bigl(\sum_{g\in G} \lambda_g g\Bigr) h = h \Bigl(\sum_{g\in G} \lambda_g g\Bigr),
\]

which leads to

\[
\sum_{g\in G} \lambda_g g = \sum_{g\in G} \lambda_g h^{-1} g h = \sum_{g\in G} \lambda_{hgh^{-1}} g .
\]

It follows that we must have \(\lambda_g = \lambda_{hgh^{-1}}\) for all g, h ∈ G.


It also follows then that the coefficients of elements of the center Z are constant
on conjugacy classes of G, and that a basis for Z is the set of class sums, which are the
sums of the form ∑g∈C g, where C is a conjugacy class of G. Thus, dimℂ Z is equal to
the number of conjugacy classes of G.
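
The role of the class sums can be checked directly for a small group. The following sketch is an added illustration, assuming G = S3 with permutations stored as tuples and elements of ℂG stored as dictionaries from group elements to coefficients; all function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(G):
    classes, seen = [], set()
    for g in G:
        if g not in seen:
            cls = frozenset(compose(compose(h, g), inverse(h)) for h in G)
            classes.append(cls)
            seen |= cls
    return classes

def multiply(a, b):
    """Product in the group algebra; elements are dicts {group element: coefficient}."""
    out = {}
    for g, x in a.items():
        for h, y in b.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + x * y
    return {g: c for g, c in out.items() if c != 0}

G = list(permutations(range(3)))
classes = conjugacy_classes(G)
class_sums = [{g: 1 for g in cls} for cls in classes]

# Every class sum commutes with every group element, hence with all of CG.
class_sums_central = all(
    multiply(z, {h: 1}) == multiply({h: 1}, z) for z in class_sums for h in G
)
# A single transposition is not constant on its class, and is not central.
t = {(1, 0, 2): 1}
transposition_central = all(multiply(t, {h: 1}) == multiply({h: 1}, t) for h in G)
```

S3 has three conjugacy classes, of sizes 1, 2, and 3, so the center of ℂS3 has dimension 3, in agreement with the three simple ℂS3-modules.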


Characters and character theory


We now define and study the characters of an ordinary representation.

Definition 22.4.5. If U is a ℂG-module, then each g ∈ G defines an invertible linear
transformation of U via u ↦ gu for u ∈ U. The character of U is the function χU : G → ℂ,
where χU (g) is the trace of the linear transformation of U defined by g.

We note that for any representation U, we have χU (1) = dimℂ (U), since the identity
element of G induces the identity transformation of U. Furthermore, if ρ : G → GL(U)
is the representation corresponding to U, then χU (g) is just the trace of the map ρ(g).
Thus, isomorphic ℂG-modules have equal characters.
If g, h ∈ G, then the linear transformations of U defined by g and hgh^{-1} are
similar, and hence have the same trace. Therefore, any character
is constant on each conjugacy class of G; that is, the value of the character on any two
conjugate elements is the same.

Example 22.4.6. Let U = ℂG and g ∈ G. By considering the matrix of the linear trans-
formation defined by g with respect to the basis G of ℂG, we get that χU (g) is equal to
the number of elements x ∈ G, for which gx = x. Therefore, we have χU (1) = |G| and
χU (g) = 0 for every g ∈ G with g ≠ 1. This character is called the regular character of G.
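
The fixed-point count in this example can be carried out literally. The sketch below is an added illustration, assuming G = S3 with permutations stored as tuples; the names compose and regular_character are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))
identity = tuple(range(3))

def regular_character(g):
    """Trace of left multiplication by g on CG in the basis G:
    the number of basis elements x with g x = x."""
    return sum(1 for x in G if compose(g, x) == x)

values = {g: regular_character(g) for g in G}
```

The dictionary values has the entry 6 at the identity and 0 everywhere else, since gx = x forces g = 1.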

The theory of characters was introduced by Frobenius. In connection with number
theory, he defined characters as being functions from G to ℂ satisfying certain properties.
However, it turned out that his characters were exactly the trace functions of
finitely generated ℂG-modules. In what follows, we describe the properties of characters.
We first consider the characters of the r simple ℂG-modules. We denote these by
χ1 , . . . , χr . These are called the irreducible characters of G.
Whenever we have that S1 , . . . , Sr are the distinct (up to isomorphism) simple ℂG-modules,
we order them so that χSi = χi for each i. We take S1 to be the trivial ℂG-module,
so that χ1 is the character of the trivial representation, and call χ1 the principal
character of G. We then have χ1 (g) = 1 for all g ∈ G.

Definition 22.4.7. A character of a one-dimensional ℂG-module is
called a linear character.

Since one-dimensional modules are simple, we get that all linear characters are irreducible.
Let χ be the linear character arising from the ℂG-module U, and let g, h ∈ G.
Since U is one-dimensional, for any u ∈ U we have gu = χ(g)u and hu = χ(h)u. Then
χ(gh)u = (gh)u = g(hu) = χ(g)χ(h)u. Hence, χ is a homomorphism from G to the
multiplicative group ℂ⋆ = ℂ \ {0}. On the other hand, given a homomorphism ϕ : G → ℂ⋆ ,
we can define a one-dimensional ℂG-module U by gu = ϕ(g)u for g ∈ G and u ∈ U.
Then χU = ϕ. It follows that the linear characters of G are precisely the group
homomorphisms from G to ℂ⋆ .
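
For a cyclic group the linear characters can be written down explicitly. The sketch below is an added illustration, assuming G = ℤ/6 written additively; the function name linear_character is hypothetical, and equalities are tested up to floating-point tolerance.

```python
import cmath

def linear_character(n, k):
    """chi_k : Z/n -> C*, a |-> exp(2*pi*i*k*a/n)."""
    def chi(a):
        return cmath.exp(2 * cmath.pi * 1j * k * (a % n) / n)
    return chi

n = 6
characters = [linear_character(n, k) for k in range(n)]

# Each chi_k is multiplicative: chi(a + b) = chi(a) chi(b).
is_homomorphism = all(
    abs(chi((a + b) % n) - chi(a) * chi(b)) < 1e-12
    for chi in characters for a in range(n) for b in range(n)
)

# The characters are pairwise distinct, since they differ at the generator 1.
values_at_generator = [chi(1) for chi in characters]
all_distinct = all(
    abs(values_at_generator[k] - values_at_generator[l]) > 1e-6
    for k in range(n) for l in range(k)
)
```

A homomorphism from ℤ/n to ℂ⋆ is determined by the image of the generator, which must be an n-th root of unity, so these n maps are all the linear characters of ℤ/n. This matches Corollary 22.4.3, since n degrees equal to 1 already account for |G| = n.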


Theorem 22.4.8. Let U be a ℂG-module, and let ρ : G → GL(U) be the representation
corresponding to U. Let g ∈ G be of order n. Then the following hold:
(i) ρ(g) is diagonalizable.
(ii) χU (g) equals the sum (with multiplicities) of the eigenvalues of ρ(g).
(iii) χU (g) is a sum of χU (1) n-th roots of unity.
(iv) χU (g^{-1}) is the complex conjugate of χU (g).
(v) |χU (g)| ≤ χU (1).
(vi) The set {x ∈ G | χU (x) = χU (1)} is a normal subgroup of G.

Proof. Since g^n = 1, we get that ρ(g) is a zero of the polynomial X^n − 1. However, X^n − 1
splits into distinct linear factors in ℂ[X], and so it follows that the minimal polynomial
of ρ(g) does also. Hence, ρ(g) is diagonalizable, which proves (i). From this, we
have that the trace of ρ(g) is the sum (with multiplicities) of the eigenvalues, proving
(ii). The eigenvalues are precisely the zeros of the minimal polynomial of ρ(g), which
divides X^n − 1. Consequently, the eigenvalues are n-th roots of unity, which proves (iii), since
the number of eigenvalues, counted with multiplicities, is χU (1) = dimℂ (U). Each
eigenvector of ρ(g) is also an eigenvector for ρ(g^{-1}) with the
eigenvalue for ρ(g^{-1}) being the inverse of the eigenvalue for ρ(g). Since the eigenvalues
are roots of unity, their inverses are their complex conjugates, and it follows that
χU (g^{-1}) is the complex conjugate of χU (g). This is (iv).
Now (v) follows directly from (iii). We have already seen that χU (g) is the sum of
its χU (1) eigenvalues, each of which is a root of unity. If the sum is equal to χU (1), then
it follows that each of these eigenvalues must be 1, in which case ρ(g) must be the
identity map. Conversely, if ρ(g) is the identity map, then χU (g) = dimℂ (U) = χU (1).
Therefore, {x ∈ G | χU (x) = χU (1)} = ker(ρ), and hence is a normal subgroup of G.
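
Properties (iv), (v), and (vi) can be observed on a small diagonal representation. The sketch below is an added illustration under stated assumptions: G = ℤ/6 written additively, and U is the three-dimensional module on which a ∈ G acts by the diagonal matrix with entries ω^(2a), ω^(4a), ω^(2a), where ω = exp(2πi/6). The name chi_U and the list weights are hypothetical.

```python
import cmath

n = 6
weights = [2, 4, 2]          # a acts with eigenvalues omega^(k a) for k in weights
dim = len(weights)

def chi_U(a):
    """Character of the diagonal representation: the sum of the eigenvalues."""
    return sum(cmath.exp(2 * cmath.pi * 1j * k * a / n) for k in weights)

# (iv): the value at the inverse element -a is the complex conjugate.
conjugate_ok = all(abs(chi_U(-a % n) - chi_U(a).conjugate()) < 1e-12 for a in range(n))
# (v): |chi_U(a)| <= chi_U(0) = dim.
bound_ok = all(abs(chi_U(a)) <= dim + 1e-12 for a in range(n))
# (vi): the elements where chi_U takes the value chi_U(0) form the kernel.
kernel = [a for a in range(n) if abs(chi_U(a) - dim) < 1e-12]
```

The list kernel comes out as [0, 3], the subgroup of order 2 in ℤ/6, which is exactly the set of elements acting as the identity on U.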

Suppose that χ and ψ are characters of G. We define new functions χ + ψ and χψ
from G to ℂ by (χ + ψ)(g) = χ(g) + ψ(g) and (χψ)(g) = χ(g)ψ(g) for g ∈ G. These new
functions are not a priori characters themselves. Given a scalar λ ∈ ℂ, define a new
function λχ : G → ℂ by (λχ)(g) = λχ(g). Consequently, we can view the characters of
G as elements of a ℂ-vector space of functions from G to ℂ.

Theorem 22.4.9. The irreducible characters of G are, as functions from G to ℂ, linearly
independent over ℂ.

Proof. We have ℂG ≅ Mf1 (ℂ) ⊕ ⋅ ⋅ ⋅ ⊕ Mfr (ℂ) by Theorem 22.4.1.
Let S1 , . . . , Sr be the distinct simple ℂG-modules. For each i, let ei be the identity
element of Mfi (ℂ). We fix some i.
Recall that χi (g) is the trace of the linear transformation on Si defined by g ∈ G. We
extend each character linearly to ℂG, so that χi (a) is the trace of the linear transformation
on Si defined by a ∈ ℂG. The
linear transformation on Si given by ei is the identity map. Hence, χi (ei ) = dimℂ (Si ) =
fi . Moreover, if j ≠ i, then the linear transformation on Sj given by ei is the zero map,
and hence χj (ei ) = 0 for j ≠ i. Now suppose that λ1 , . . . , λr ∈ ℂ are such that
λ1 χ1 + ⋅ ⋅ ⋅ + λr χr = 0.
From above, we see that 0 = λ1 χ1 (ei ) + ⋅ ⋅ ⋅ + λr χr (ei ) = λi fi for each i. It follows that λi = 0 for all i;
therefore, the characters are linearly independent.

Lemma 22.4.10. χU⊕V = χU + χV for any ℂG-modules U and V.


Proof. By considering a ℂ-basis for U ⊕V, whose first dimℂ (U) elements form a ℂ-basis
for U ⊕ {0}, and whose remaining elements form a ℂ-basis for {0} ⊕ V, we get that
χU⊕V (g) = χU (g) + χV (g) for any g ∈ G.

Theorem 22.4.11. If S1 , . . . , Sr are the distinct (up to isomorphism) simple ℂG-modules,
then the character of the ℂG-module a1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ ar Sr with ai ∈ ℕ ∪ {0} is a1 χ1 + ⋅ ⋅ ⋅ + ar χr .
Consequently, two ℂG-modules are isomorphic if and only if their characters are equal.

Proof. The first statement follows directly from Lemma 22.4.10. Now, suppose that
χU = χV for some ℂG-modules U and V. Since ℂG is semisimple, we can write U ≅
a1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ ar Sr and V ≅ b1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ br Sr with ai , bi ∈ ℕ ∪ {0}. By taking characters, we
have

\[
0 = \chi_U - \chi_V = \sum_{i=1}^{r} (a_i - b_i)\chi_i .
\]

By Theorem 22.4.9, this forces ai = bi for all i, and therefore U ≅ V.

Definition 22.4.12. A class function on G is a function from G to ℂ whose value within
any conjugacy class is constant.

For example, characters of ℂG-modules are class functions.


The set of all class functions on G forms a ℂ-vector space of dimension r, where r
is the number of conjugacy classes within G. An obvious basis for this vector space is
the set of class functions on G that have the value 1 on a single conjugacy class, and 0
on all other conjugacy classes.

Theorem 22.4.13. The irreducible characters for G form a basis for the ℂ-vector space
of class functions on G.

Proof. By Theorem 22.4.9, the irreducible characters of G are linearly independent el-
ements of the space of class functions. Their number equals the number of conjugacy
classes of G by Theorem 22.4.4, and this number is equal to the dimension of the space
of class functions.

Definition 22.4.14. If α, β are class functions on G, then their inner product is the complex
number

\[
\langle \alpha, \beta \rangle = \frac{1}{|G|} \sum_{g\in G} \alpha(g)\overline{\beta(g)} .
\]

This inner product is a Hermitian inner product on the space of class
functions. Therefore, we have the following properties:
(1) ⟨α, α⟩ ≥ 0, and ⟨α, α⟩ = 0 if and only if α = 0;
(2) \(\langle \alpha, \beta \rangle = \overline{\langle \beta, \alpha \rangle}\);
(3) ⟨λα, β⟩ = λ⟨α, β⟩ for all λ ∈ ℂ;
(4) ⟨α1 + α2 , β⟩ = ⟨α1 , β⟩ + ⟨α2 , β⟩.


From these basic properties we further have
(5) \(\langle \alpha, \lambda\beta \rangle = \overline{\lambda}\,\langle \alpha, \beta \rangle\),
(6) ⟨α, β1 + β2 ⟩ = ⟨α, β1 ⟩ + ⟨α, β2 ⟩,

for all class functions α, β, α1 , β1 , α2 , β2 , and all λ ∈ ℂ.

Definition 22.4.15. If U is a ℂG-module, then

U^G = {u ∈ U | gu = u for all g ∈ G}.

Lemma 22.4.16. If U is a ℂG-module, then

\[
\dim_{\mathbb{C}}(U^G) = \frac{1}{|G|} \sum_{g\in G} \chi_U(g) .
\]

Proof. Let a = (1/|G|) ∑g∈G g ∈ ℂG. Clearly, ga = a for any g ∈ G, and hence a^2 = a. If T is the
linear transformation of U defined by a, then T must satisfy the equation X^2 − X = 0,
and consequently, T is diagonalizable. It follows that the only possible eigenvalues of T are 0
and 1. Let U1 ⊂ U be the eigenspace of T corresponding to the eigenvalue 1. If u ∈ U1 ,
then gu = gau = au = u for any g ∈ G. Therefore, u ∈ U^G . Conversely, suppose that
u ∈ U^G . Then

\[
|G|\,au = \Bigl(\sum_{g\in G} g\Bigr) u = \sum_{g\in G} gu = \sum_{g\in G} u = |G|\,u ,
\]

and hence u ∈ U1 . It follows that U^G = U1 . However, the trace of T is equal to the
dimension of U1 , and then the result follows from the linearity of the trace map.
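
The averaging element a used in this proof can be computed explicitly in a small case. The sketch below is an added illustration, assuming G = S3 acting on U = ℂ^3 by permuting the canonical basis vectors; exact rational arithmetic is used, and the names perm_character, T, and T_squared are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))   # g sends the basis vector e_c to e_{g[c]}

def perm_character(g):
    """Trace of the permutation matrix of g: the number of fixed points."""
    return sum(1 for i, image in enumerate(g) if image == i)

# Right-hand side of the lemma: the average of the character values.
dim_fixed = Fraction(sum(perm_character(g) for g in G), len(G))

# Matrix of T, the action of a = (1/|G|) * (sum of all g) on U:
# the (r, c) entry is the fraction of g in G with g(c) = r.
T = [[Fraction(sum(1 for g in G if g[c] == r), len(G)) for c in range(3)]
     for r in range(3)]
T_squared = [[sum(T[r][k] * T[k][c] for k in range(3)) for c in range(3)]
             for r in range(3)]
trace_T = sum(T[i][i] for i in range(3))
```

Here T is the matrix with all entries 1/3. It satisfies T^2 = T, its trace is 1, and the average of the character values is also 1, reflecting that U^G is the line spanned by (1, 1, 1).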

Theorem 22.4.17. ⟨χU , χV ⟩ = dimℂ (HomℂG (U, V)) for any ℂG-modules U, V.

Recall that HomℂG (U, V) is a ℂ-vector space with (ϕ + ψ)(u) = ϕ(u) + ψ(u), and
(λϕ)(u) = λϕ(u) for any λ ∈ ℂ, u ∈ U and ϕ, ψ ∈ HomℂG (U, V).

Proof. We first observe that HomℂG (U, V) is a subspace of the ℂG-module Homℂ (U, V),
on which G acts by (gϕ)(u) = gϕ(g^{-1} u). If ϕ ∈ HomℂG (U, V) and g ∈ G, then
(gϕ)(u) = gϕ(g^{-1} u) = gg^{-1} ϕ(u) = ϕ(u) for any
u ∈ U. Hence, gϕ = ϕ for all g ∈ G. This implies that ϕ ∈ Homℂ (U, V)^G . By reversing
the argument, we get HomℂG (U, V) = Homℂ (U, V)^G .
Therefore,

\[
\begin{aligned}
\dim_{\mathbb{C}}(\operatorname{Hom}_{\mathbb{C}G}(U, V)) &= \dim_{\mathbb{C}}(\operatorname{Hom}_{\mathbb{C}}(U, V)^G) \\
&= \frac{1}{|G|} \sum_{g\in G} \chi_{\operatorname{Hom}_{\mathbb{C}}(U,V)}(g) \\
&= \frac{1}{|G|} \sum_{g\in G} \overline{\chi_U(g)}\,\chi_V(g) \\
&= \langle \chi_V, \chi_U \rangle
\end{aligned}
\]

by Lemma 22.4.16, and part (iv) of Theorem 22.4.8.


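This implies that

\[
\langle \chi_U, \chi_V \rangle = \overline{\langle \chi_V, \chi_U \rangle} = \langle \chi_V, \chi_U \rangle = \dim_{\mathbb{C}}(\operatorname{Hom}_{\mathbb{C}G}(U, V)),
\]

since we know that ⟨χV , χU ⟩ is real.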

The character table and orthogonality relations


We have seen that the number r of conjugacy classes in a finite group G is the same as
the number of irreducible characters. Furthermore, the set of irreducible characters
forms a basis for the space of class functions on G. If χ1 , . . . , χr are the irreducible
characters, and g1 , . . . , gr form a complete set of conjugacy class representatives, then
the r × r-matrix χ = (χi (gj )) is called the character table for G.
We close this section by showing that the rows and columns of the character table
are orthogonal vectors relative to the defined inner product. These results are called
the orthogonality relations. As a consequence of these relations, we obtain the fact
that the irreducible characters form an orthonormal basis for the space of class functions.
There is a great deal of other information that can be obtained from the character table.
We refer to the book by Alperin and Bell [1] for further discussion.

Theorem 22.4.18 (First orthogonality relation). Let χ1 , . . . , χr be the set of irreducible
characters of G. Then

\[
\frac{1}{|G|} \sum_{g\in G} \chi_i(g)\overline{\chi_j(g)} =
\begin{cases} 0, & \text{if } i \neq j, \\ 1, & \text{if } i = j. \end{cases}
\]

In other words, the irreducible characters form an orthonormal set with respect to
the defined inner product.

Proof. Let S1 , . . . , Sr be the distinct simple ℂG-modules that go with the irreducible
characters. From the previous theorem, we have

⟨χi , χj ⟩ = dimℂ (HomℂG (Si , Sj ))

for any i, j. We further have HomℂG (Si , Si ) ≅ ℂ, and by Schur’s lemma HomℂG (Si , Sj ) = 0
for i ≠ j, proving the theorem.
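
The complex conjugate in the inner product is essential, as a small example with non-real character values shows. The sketch below is an added illustration, assuming G = ℤ/4, whose irreducible characters are the linear characters χk (a) = i^(ka); the names table and inner are hypothetical.

```python
n = 4
# Row k lists the values chi_k(a) = i^(k a) for a = 0, 1, 2, 3.
table = [[1j ** (k * a) for a in range(n)] for k in range(n)]

def inner(alpha, beta):
    """<alpha, beta> = (1/|G|) * sum over g of alpha(g) * conjugate(beta(g))."""
    return sum(x * y.conjugate() for x, y in zip(alpha, beta)) / n

gram = [[inner(table[i], table[j]) for j in range(n)] for i in range(n)]
is_orthonormal = all(
    abs(gram[i][j] - (1 if i == j else 0)) < 1e-12
    for i in range(n) for j in range(n)
)

# Dropping the conjugate breaks the relation: for chi_1 the plain sum of
# squares 1 + i^2 + i^4 + i^6 vanishes instead of giving |G| = 4.
sum_without_conjugate = sum(x * x for x in table[1]) / n
```

With the conjugate the Gram matrix is the identity, while sum_without_conjugate is 0 rather than 1.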

Corollary 22.4.19. The set of irreducible characters forms an orthonormal basis for the
vector space of class functions.

Proof. The irreducible characters form a basis for the space of class functions
(Theorem 22.4.13), and from
the orthogonality result they are an orthonormal set relative to the inner product.
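
Orthonormality turns the decomposition of a character into a computation: the multiplicity of χi in a character α is ⟨α, χi ⟩. The sketch below is an added illustration, assuming G = S3 with classes ordered as identity, transpositions, 3-cycles. The character table used is the standard one for S3 and is not derived in this section; α is the character of the permutation module ℂ^3, whose values are numbers of fixed points.

```python
from fractions import Fraction

class_sizes = [1, 3, 2]            # identity, transpositions, 3-cycles
order = sum(class_sizes)           # |S3| = 6
irreducibles = {
    "chi1": [1, 1, 1],             # principal character
    "chi2": [1, -1, 1],            # sign character
    "chi3": [2, 0, -1],            # character of degree 2
}
perm_char = [3, 1, 0]              # fixed points of each class on {0, 1, 2}

def inner(alpha, beta):
    """Inner product of class functions given by their values on the classes.
    All values here are real, so the complex conjugate is omitted."""
    return Fraction(sum(k * a * b for k, a, b in zip(class_sizes, alpha, beta)), order)

multiplicities = {name: inner(perm_char, chi) for name, chi in irreducibles.items()}
reconstructed = [
    sum(multiplicities[name] * chi[c] for name, chi in irreducibles.items())
    for c in range(3)
]
```

The multiplicities are 1, 0, 1, so the permutation character equals χ1 + χ3, and by Theorem 22.4.11 the permutation module is the direct sum of the trivial module and the simple module of degree 2.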

The second orthogonality relation says that the columns of the character table are
also a set of orthogonal vectors. That is, for a set of conjugacy class representatives
g1 , . . . , gr , the vectors (χ1 (gj ), . . . , χr (gj )) of irreducible character values also form
an orthogonal set with respect to the standard Hermitian inner product on ℂ^r .


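Theorem 22.4.20 (Second orthogonality relation). Let χ1 , . . . , χr be the set of irreducible
characters of G, and suppose that g1 , . . . , gr are a set of conjugacy class representatives,
and k1 , . . . , kr are the orders of the conjugacy classes. Then for any 1 ≤ i,
j ≤ r, we have

\[
\sum_{s=1}^{r} \chi_s(g_i)\overline{\chi_s(g_j)} =
\begin{cases} 0, & \text{if } i \neq j, \\ \dfrac{|G|}{k_i}, & \text{if } i = j. \end{cases}
\]

Proof. Let χ = (χi (gj ))1≤i,j≤r be the character table for G, and let K be the r × r diagonal
matrix with k1 , . . . , kr on its main diagonal. Then we have (χK)ij = χi (gj )kj for
any i, j. Then

\[
(\chi K \overline{\chi}^{\,t})_{ij} = \sum_{\ell=1}^{r} k_\ell\, \chi_i(g_\ell)\overline{\chi_j(g_\ell)}
= \sum_{g\in G} \chi_i(g)\overline{\chi_j(g)} = |G|\,\langle \chi_i, \chi_j \rangle ,
\]

and by the first orthogonality relation this equals |G| if i = j and 0 if i ≠ j.
Hence, \(\chi K \overline{\chi}^{\,t} = |G| I\), where I is the identity matrix. Taking complex
conjugates, and using that a one-sided inverse of a square matrix is a two-sided inverse,
we obtain \(K \chi^{t} \overline{\chi} = |G| I\). Comparing entries, it follows that for any i, j, we
have

\[
|G| = \sum_{\ell=1}^{r} k_j\, \chi_\ell(g_j)\overline{\chi_\ell(g_j)} ,
\]

and

\[
0 = \sum_{\ell=1}^{r} k_j\, \chi_\ell(g_j)\overline{\chi_\ell(g_i)} \quad \text{for } i \neq j,
\]

completing the proof.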
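
Both orthogonality relations can be read off a concrete character table. The sketch below is an added illustration, assuming G = S3 with classes ordered as identity, transpositions, 3-cycles; the table is the standard one for S3 and is not derived in this section. All entries are real, so complex conjugation is omitted in the code.

```python
class_sizes = [1, 3, 2]            # k_1, k_2, k_3
order = sum(class_sizes)           # |S3| = 6
table = [
    [1, 1, 1],                     # chi_1 (principal)
    [1, -1, 1],                    # chi_2 (sign)
    [2, 0, -1],                    # chi_3 (degree 2)
]
r = len(table)

def row_product(i, j):
    """Sum over classes of k_l * chi_i(g_l) * chi_j(g_l); equals |G| <chi_i, chi_j>."""
    return sum(k * table[i][c] * table[j][c] for c, k in enumerate(class_sizes))

def column_product(i, j):
    """Sum over irreducible characters of chi_s(g_i) * chi_s(g_j)."""
    return sum(row[i] * row[j] for row in table)

rows_ok = all(
    row_product(i, j) == (order if i == j else 0)
    for i in range(r) for j in range(r)
)
columns_ok = all(
    column_product(i, j) == (order // class_sizes[i] if i == j else 0)
    for i in range(r) for j in range(r)
)
```

The rows need the weights k1 , . . . , kr to be orthogonal, while the columns are orthogonal without weights and have squared lengths 6, 2, and 3, that is, |G|/ki .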

As mentioned before, more information about character tables and their conse-
quences can be found in [1].

22.5 Burnside’s theorem


We conclude this chapter by presenting a very important result in finite group theory,
whose proof uses representation theory. This is Burnside’s Theorem, which asserts that
any group of order p^a q^b with p, q distinct primes must be solvable. Burnside’s result
was important in the proof of the famous Feit–Thompson theorem, which asserted that
any group of odd order must be solvable. This was crucial in the classification of finite
simple groups.


Recall that a group G is solvable if it has a normal series with abelian factors. Solv-
able groups play a crucial role in the proof of the insolvability of the quintic polyno-
mial, and we discussed solvable groups in detail in Chapter 12. For the proof, we need
the following two facts about solvable groups:
1. If a group G has a normal solvable subgroup N with G/N solvable, then G is solv-
able (Theorem 12.2.3).
2. Any finite group of prime power order is solvable (Theorem 12.2.8).

We start with several lemmas that depend on representation theory.


Let G be a finite group, and suppose it has r irreducible characters χ1, χ2, . . . , χr
of respective degrees m1, m2, . . . , mr. Suppose the respective orders of the r conjugacy
classes are h1, h2, . . . , hr. The statements in the lemmas depend on some mild facts
about algebraic integers. An algebraic integer is a complex number that is a zero of
a monic integral polynomial. Here we just need the following two facts:
1. The set of algebraic integers forms a subring of ℂ.
2. If an algebraic integer is a rational number, then it is an ordinary integer.

For more information about algebraic integers see Chapter 21.

Lemma 22.5.1. Let χ be a character of G. The value χ(g) for any g ∈ G is an algebraic
integer.

Proof. For any g ∈ G, the value χ(g) is a sum of roots of unity. However, any root of
unity is a zero of a monic integral polynomial X^n − 1, and hence is an algebraic integer.
Since the algebraic integers form a ring, any sum of roots of unity is an algebraic
integer.

Lemma 22.5.2. Let χ be an irreducible character of G. Let g ∈ G, and let CG(g) be the
centralizer of g in G. Then

\[
\frac{|G : C_G(g)|}{\chi(1)}\,\chi(g)
\]

is an algebraic integer.

Proof. Let S be the simple ℂG-module having character χ.
Let g ∈ G, and let C be the conjugacy class of g in G. By Theorem 13.2.1, we have
|C| = |G : CG(g)|.
Let α ∈ ℂG, α = ∑x∈C x, be the class sum of C. We consider the map φ : S → S,
φ(s) = αs, for s ∈ S. From Theorem 22.4.4 and its proof, we get that α is in the center of
ℂG.
This gives φ ∈ EndℂG(S), and there exists a λ ∈ ℂ with αs = λs for all s ∈ S by
Schur’s lemma. We obtain

\[
\lambda\,\chi(1) = \sum_{x \in C} \chi(x) = |C|\,\chi(g) = |G : C_G(g)|\,\chi(g)
\]


by taking traces. Therefore,

\[
\lambda = \frac{|G : C_G(g)|}{\chi(1)}\,\chi(g).
\]

Let τ : ℂG → ℂG, τ(z) = zα for z ∈ ℂG. We get τ ∈ EndℂG(ℂG) by the proof of
Lemma 22.3.6. Since S is a simple ℂG-module, we may consider S as a submodule
of ℂG, and for 0 ≠ s ∈ S ⊂ ℂG, we have τ(s) = sα = αs = λs, since α is a central
element.
Therefore, λ is an eigenvalue of τ, and so det(λI − A) = 0, where I is the identity
matrix, and A the matrix of τ with respect to the ℂ-basis G for ℂG. Each entry of A is
either 0 or 1, which means that, in particular, f (X) = det(XI − A) is a monic polynomial
in X with integer coefficients. Since f (λ) = 0, we get that λ is an algebraic integer.

Lemma 22.5.3. Let χ be an irreducible character of G. Then χ(1) divides |G|.

Proof. Let g1, g2, . . . , gr be a set of representatives of the conjugacy classes of G. We
know that

\[
\frac{|G : C_G(g_i)|\,\chi(g_i)}{\chi(1)}
\quad\text{and}\quad
\overline{\chi(g_i)} = \chi(g_i^{-1})
\]

are algebraic integers. By the first orthogonality relation

\[
\frac{|G|}{\chi(1)}
= \frac{1}{\chi(1)} \sum_{i=1}^{r} |G : C_G(g_i)|\,\chi(g_i)\,\overline{\chi(g_i)}
= \sum_{i=1}^{r} \frac{|G : C_G(g_i)|\,\chi(g_i)}{\chi(1)}\,\overline{\chi(g_i)},
\]

which is an algebraic integer. Since it is also a rational number, it is an ordinary
integer, that is, χ(1) divides |G|.


Lemma 22.5.4. Let χ be a character of G, g ∈ G and γ = χ(g)/χ(1).
If γ is a nonzero algebraic integer, then |γ| = 1.

Proof. From Theorem 22.4.8, we know that |γ| ≤ 1.


Suppose that 0 < |γ| < 1, and assume that γ is an algebraic integer.
Now, γ is an average of complex roots of unity. The same will be true for all σ(γ)
with σ ∈ Aut(K | ℚ) =: H, where K is the splitting field of the minimal polynomial of
γ over ℚ.
In particular, |σ(γ)| ≤ 1 for all σ ∈ H. Hence, p := | ∏σ∈H σ(γ)| < 1.
On the other hand, p ∈ ℤ by Theorems 7.3.12 and 16.5.1 (recall that γ is a zero of an
irreducible, monic polynomial with integer coefficients, see Theorem 4.4.3).
This implies p = 0, and therefore the constant term of the minimal polynomial of
γ over ℚ must be zero, which gives a contradiction.
Hence, γ cannot be an algebraic integer.

Theorem 22.5.5. If G has a conjugacy class of nontrivial prime power order, then G is
not simple.


Proof. Suppose that G is simple and that the conjugacy class of 1 ≠ g ∈ G has order p^n
with p a prime number, and n ∈ ℕ. From the second orthogonality relation, we get

\[
0 = \frac{0}{p} = \frac{1}{p} \sum_{i=1}^{r} \chi_i(g)\,\chi_i(1)
= \frac{1}{p} + \frac{1}{p} \sum_{i=2}^{r} \chi_i(g)\,\chi_i(1),
\]

where χ1, χ2, . . . , χr are the irreducible characters of G (recall that χ1 is the principal
character).
Since −1/p is not an algebraic integer, it follows that χi(g)χi(1)/p is not an algebraic
integer for some 2 ≤ i ≤ r. As χi(g) is an algebraic integer, this implies that p ∤ χi(1), and
χi(g) ≠ 0.
Now |G : CG(g)| = p^n is relatively prime to χi(1).
Therefore,

\[
a\,|G : C_G(g)| + b\,\chi_i(1) = 1
\]

for some a, b ∈ ℤ (see Theorem 3.1.9).
Thus,

\[
\frac{\chi_i(g)}{\chi_i(1)} = \frac{a\,|G : C_G(g)|\,\chi_i(g)}{\chi_i(1)} + b\,\chi_i(g),
\]

which is an algebraic integer by Lemma 22.5.2. Since χi(g) ≠ 0, Lemma 22.5.4 gives
|χi(g)| = χi(1).
Consequently,

\[
g \in Z_i = \{x \in G \mid |\chi_i(x)| = \chi_i(1)\}.
\]

We show that Zi is a subgroup of G.

First of all, if g ∈ Zi, then g^{-1} ∈ Zi. From Theorem 22.4.8, we also get that |χi(g)| =
χi(1) if and only if g has exactly one eigenvalue. If g ∈ Zi, let this eigenvalue be λ(g),
so that, if U is the ℂG-module corresponding to χi, then we have gu = λ(g)u for all
u ∈ U. We now see that for g, h ∈ Zi, we have (gh)u = λ(g)λ(h)u for all u ∈ U. Hence,
χi(gh) = χi(1)λ(g)λ(h), and thus |χi(gh)| = χi(1), which gives gh ∈ Zi. Therefore, Zi is a
subgroup of G.
Now, let Ki = {x ∈ G | χi(x) = χi(1)}. Then Ki is a normal subgroup of G, and it is
contained in Zi. We now want to show that

Zi/Ki = Z(G/Ki),

the center of G/Ki. If ρ : G → GL(U) is the representation corresponding to χi, then
for any g ∈ Zi, the matrix of ρ(g) (with respect to any ℂ-basis of U) will be scalar, and
hence ρ(g) ∈ Z(ρ(G)). Since ρ(G) ≅ G/Ki, it follows that Zi/Ki is a subgroup of Z(G/Ki).
Now, we apply that χi is irreducible. If gKi ∈ Z(G/Ki), then ρ(g) commutes with ρ(x) for


every x ∈ G. Consequently, the map defined by u ↦ gu, u ∈ U, is a ℂG-endomorphism
of U. But U is simple, so we have EndℂG(U) ≅ ℂ by Schur’s lemma.
Therefore, there is a complex root of unity μ such that gu = μu for all u ∈ U. We
now have χi(g) = μχi(1), so |χi(g)| = χi(1), and hence g ∈ Zi. Therefore, Zi/Ki = Z(G/Ki).
Now G is non-abelian, since it has a conjugacy class with more than one element, and
χi is not the principal character, so Ki ≠ G. Consequently, if G is simple, then Ki = {1}
and Zi = Z(G) = {1}. But this contradicts 1 ≠ g ∈ Zi.

Theorem 22.5.6 (Burnside’s Theorem). If |G| = p^a q^b, where p and q are prime numbers
and a, b ∈ ℕ, then G is solvable.

Proof. We use induction on a + b. If a + b = 1, then G has a prime order, and hence is
solvable. We now assume that a + b ≥ 2, and that any group of order p^r q^s, r, s ∈ ℕ, is
solvable whenever r + s < a + b.
First of all, if the center Z(G) is nontrivial, then G is solvable, because Z(G) is solv-
able and G/Z(G) is solvable by the inductive hypothesis.
Now, let Z(G) = {1}.
Then we may take h1 = 1 for the conjugacy class of 1.
By the class equation (see Theorem 13.2.2), we then have

p^a q^b = |G| = 1 + h2 + h3 + ⋅ ⋅ ⋅ + hr.

It follows that pq cannot divide each of h2, h3, . . . , hr. Hence, hi is a power of either
p or q for some i ≥ 2. If hi is a nontrivial prime power, then from Theorem 22.5.5 it
follows that G is not simple.
If hi = 1 for some i ≥ 2, then G has at least two representations into ℂ. The number
of these representations is given by the order of the abelianization, that is, by |G : G′|,
where G′ is the commutator subgroup of G. Then |G : G′| > 1, and since G is non-abelian,
G′ is a nontrivial proper normal subgroup. Hence, G is not simple. So, in any case, G
is not simple. Therefore, G contains a nontrivial proper normal subgroup N. Since |N|
divides |G|, we have |N| = p^{a1} q^{b1} with a1 + b1 < a + b, since N is a proper subgroup.
By the inductive hypothesis, N is solvable. Furthermore, |G/N| also divides |G| and is
smaller than |G|. So, for the same reason, G/N is solvable. Therefore, both N and G/N
are solvable, so G is solvable by Theorem 12.2.3.
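
The step of the proof that uses the class equation can be illustrated on a small group. The following is a minimal sketch, assuming the symmetric group S4 of order 24 = 2^3 · 3 as the example (it is not discussed in the proof above); the function names are chosen here for illustration. It computes the orders of the conjugacy classes by brute force and lists those that are prime powers.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p after q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def class_sizes(n):
    """Sorted orders of the conjugacy classes of the symmetric group S_n."""
    group = list(permutations(range(n)))
    seen = set()
    sizes = []
    for g in group:
        if g in seen:
            continue
        # Conjugacy class of g: all elements x g x^{-1} with x in the group.
        cls = {compose(compose(x, g), inverse(x)) for x in group}
        seen |= cls
        sizes.append(len(cls))
    return sorted(sizes)


def is_prime_power(m):
    """True if m = p**k for a prime p and some k >= 1."""
    if m < 2:
        return False
    p = next(d for d in range(2, m + 1) if m % d == 0)  # smallest prime factor
    while m % p == 0:
        m //= p
    return m == 1


sizes = class_sizes(4)
print(sizes)                                    # [1, 3, 6, 6, 8]
print([h for h in sizes if is_prime_power(h)])  # [3, 8]
```

Here the class equation reads 24 = 1 + 3 + 6 + 6 + 8, and pq = 6 does not divide the class orders 3 and 8, which are nontrivial prime powers.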

22.6 Exercises
1. Let K be a field, and let G be a finite group. Let U and V be KG-modules having the
same dimension n, and let ρ : G → GL(U) and τ : G → GL(V) be the corresponding
representations.
By fixing K-bases for U and V, consider ρ and τ as homomorphisms from G to
GL(n, K). Show that U and V are KG-module isomorphic if and only if there exists
some M ∈ GL(n, K) such that ρ(g)M = Mτ(g) for every g ∈ G.


2. Let K be a field, and let G be a finite group. Let x = ∑g∈G g ∈ KG.


(i) Show that the subspace Kx of KG is the unique submodule of KG that is
isomorphic to the trivial module.
(ii) Let ϵ : KG → K be the KG-module epimorphism defined by ϵ(g) = 1 for all
g ∈ G.
Show that ker(ϵ) is the unique KG-submodule of KG, whose quotient is iso-
morphic to the trivial module. This kernel is called the augmentation ideal of
KG.
(iii) Suppose that char(K) = p, with p dividing |G|. Show that Kx ⊂ ker(ϵ), the
augmentation ideal of KG. Show that ker(ϵ) is not a direct summand of KG,
and hence that the KG-module KG is not semisimple.
3. Show that the converse of Corollary 22.2.24 is true.
4. Let U be a finite-dimensional K-vector space, and let G be a finite group. Let ρ :
G → GL(U) be a fully reducible representation. Show that ρ gives a direct decom-
position

U = V1 ⊕ ⋅ ⋅ ⋅ ⊕ Vk

of U with all Vi , i = 1, . . . , k, irreducible G-invariant subspaces of U.


5. Show that A is a simple A-module if and only if A is a division algebra.
6. Let n ∈ ℕ, and let Tn (K) be the algebra of upper triangular n × n matrices over K.
(i) Show that the set Vn(K) of column vectors of length n over K is a Tn(K)-module
that has a unique composition series, in which every simple Tn(K)-module
appears exactly once as a composition factor.
(ii) Show that the Tn (K)-module Tn (K) is isomorphic to the direct sum of all
nonzero submodules of Vn (K).
7. Let U be an A-module, let n ∈ ℕ, and let U n be the set of column vectors of length
n with entries from U, considered in the obvious way as an Mn (A)-module. Show
that U is a simple A-module if and only if U n is a simple Mn (A)-module.
8. Let χ be an irreducible character of G. Let λ be any |G|th root of unity. Show that
the set {x ∈ G | χ(x) = λχ(1)} is a normal subgroup of G.
9. Prove that the set of algebraic integers forms a subring of ℂ.
10. Prove that if an algebraic integer is rational, then it is an ordinary integer.
11. Prove that G is simple if and only if the only irreducible character χi , for which
χi (g) = χi (1) for some 1 ≠ g ∈ G is the principal character χ1 .

23 Algebraic cryptography
23.1 Basic cryptography
As we have mentioned, much of mathematics has been algebraicized, that is, uses the
methods and techniques of abstract algebra. Throughout this book, we have looked
at various applications of the algebraic ideas. Many of these were to other areas of
mathematics, such as the insolvability of the quintic polynomial. In this final chapter,
we move in a slightly different direction and look at applications of algebra to cryp-
tography. This has become increasingly important, because of the extensive use of
cryptography and cryptosystems in modern commerce and communications. We first
give a brief introduction to general cryptography and its history.
Cryptography refers to the science and/or art of sending and receiving coded mes-
sages. Coding and hidden ciphering is an old endeavor used by governments and mil-
itaries, and between private individuals from ancient times. Recently, it has become
even more prominent because of the necessity of sending secure and private informa-
tion, such as credit card numbers, over essentially open communication systems.
Traditionally cryptography is the science and/or art of devising and implement-
ing secret codes or cryptosystems. Cryptanalysis is the science and/or art of break-
ing cryptosystems, whereas cryptology refers to the whole field of cryptography, plus
cryptanalysis. In most modern literature, cryptography is used synonymously with
cryptology. Theoretically, cryptography uses mathematics, computer science, and en-
gineering.
A cryptosystem or code is an algorithm to change a plain message, called the plaintext
message, into a coded message, called the ciphertext message. In general, both the
plaintext message (uncoded message) and the ciphertext message (coded message) are
written in some N-letter alphabet, which is usually the same for both plaintext and
code. The method of coding or the encoding algorithm is then a transformation of the
N letters. The most common way to perform this transformation is to consider the N
letters as the integers modulo N, and then apply a number-theoretic function to
them. Therefore, most encoding algorithms use modular arithmetic; hence, cryptography
is closely tied to number theory. The subject is very broad, and as mentioned
above, very current, due to the need for publicly viewed but coded messages. There
are many references to the subject. The book by Koblitz [70] gives an outstanding
introduction to the interaction between number theory and cryptography. It also includes
many references to other sources. The books by Baumslag, Fine, Kreuzer and
Rosenberger [54] and Stinson [78] describe the whole area.
Modern cryptography is usually separated into classical cryptography, also called
symmetric key cryptography, and public key cryptography. In the former, both the
encoding and decoding algorithms are supposedly known only to the sender and
receiver, usually referred to as Bob and Alice. In the latter, the encryption method is
public knowledge but only the receiver knows how to decode.

https://doi.org/10.1515/9783110603996-023

The message that one wants to send is written in plaintext, and then converted
into code. The coded message is written in ciphertext. The plaintext message and ci-
phertext message are written in some alphabets that are usually the same. The process
of putting the plaintext message into code is called enciphering or encryption, whereas
the reverse process is called deciphering or decryption. Encryption algorithms break
the plaintext and ciphertext message into message units. These are single letters or
pairs of letters, or more generally, k-vectors of letters. The transformations are done
on these message units, and the encryption algorithm is a mapping from the set of
plaintext message units to the set of ciphertext message units. Putting this into a math-
ematical formulation we let

𝒫 = set of all plaintext message units;


𝒞 = set of all ciphertext message units.

The encryption algorithm is then the application of a left invertible function

f : 𝒫 → 𝒞.

The function f is the encryption map. The left inverse

g : 𝒞 → 𝒫, g ∘ f = id𝒫

is the decryption or deciphering map. The triple {𝒫 , 𝒞 , f }, consisting of a set of plaintext


message units, a set of ciphertext message units, and an encryption map, is called a
cryptosystem.
Breaking a code is called cryptanalysis. An attempt to break a code is called an
attack. Most cryptanalysis depends on a statistical frequency analysis of the plaintext
language used (see exercises for examples). Cryptanalysis depends also on a knowl-
edge of the form of the code, that is, the type of cryptosystem used.
We now give some examples of cryptosystems and cryptanalysis.

Example 23.1.1. The simplest type of encryption algorithm is a permutation cipher.
Here, the letters of the plaintext alphabet are permuted and the plaintext message
is sent in the permuted letters. Mathematically, if the alphabet has N letters and σ
is a permutation on 1, . . . , N, the letter i in each message unit is replaced by σ(i). For
example, suppose the plaintext language is English, the plaintext word is BOB and
the permutation algorithm is

plaintext:  a b c d e f g h i j k l m
ciphertext: b c d f g h j k l n o p r

plaintext:  n o p q r s t u v w x y z
ciphertext: s t v w x a e i z m q y u

then BOB → CTC.
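
The permutation of this example can be written down directly as a lookup table. The following is a minimal sketch; the names PLAIN, CIPHER, and substitute are chosen here for illustration, and CIPHER is simply the bottom rows of the table read from left to right.

```python
import string

PLAIN = string.ascii_lowercase
CIPHER = "bcdfghjklnoprstvwxaeizmqyu"  # bottom rows of the table, left to right

ENCRYPT = dict(zip(PLAIN, CIPHER))
DECRYPT = dict(zip(CIPHER, PLAIN))  # well defined because CIPHER is a permutation


def substitute(text, table):
    """Replace each letter of text using table; other characters are kept."""
    return "".join(table.get(ch, ch) for ch in text.lower()).upper()


print(substitute("BOB", ENCRYPT))  # CTC
print(substitute("CTC", DECRYPT))  # BOB
```

Decryption is the same substitution carried out with the inverse permutation.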


Example 23.1.2. A very straightforward example of a permutation encryption algo-


rithm is a shift algorithm. Here, we consider the plaintext alphabet as the integers
0, 1, . . . , N − 1 mod N. We choose a fixed integer k and the encryption algorithm is

f : m → m + k mod N.

This is often known as a Caesar code after Julius Caesar, who supposedly invented it.
It was used by the Union Army during the American Civil War. For example, if both the
plaintext and ciphertext alphabets were English, and each message unit was a single
letter, then N = 26. Suppose k = 5, and we wish to send the message ATTACK. If a = 0,
then ATTACK is the numerical sequence 0, 19, 19, 0, 2, 10. The encoded message would
then be FYYFHP.
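
The shift map of this example is easy to reproduce. The following is a minimal sketch for the English alphabet with a = 0; the function names are chosen here for illustration.

```python
import string

ALPHABET = string.ascii_uppercase
N = len(ALPHABET)  # 26


def shift_encrypt(message, k):
    """Apply f(m) = (m + k) mod N to every letter; non-letters are dropped."""
    numbers = [ALPHABET.index(ch) for ch in message.upper() if ch in ALPHABET]
    return "".join(ALPHABET[(m + k) % N] for m in numbers)


def shift_decrypt(ciphertext, k):
    """Invert the shift by applying the shift -k."""
    return shift_encrypt(ciphertext, -k)


print(shift_encrypt("ATTACK", 5))  # FYYFHP
print(shift_decrypt("FYYFHP", 5))  # ATTACK
```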

Any permutation encryption algorithm that goes letter to letter is very simple to
attack using a statistical analysis. If enough messages are intercepted and the
plaintext language is guessed, then a frequency analysis of the letters will suffice to crack
the code. For example, in the English language, the three most commonly occurring
letters are E, T, and A, with frequencies of occurrence of approximately 13 %, 9 %,
and 8 %, respectively. By examining the frequency of occurrences of letters in the
ciphertext the letters corresponding to E, T, and A can be uncovered.
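
For a shift cipher, the attack described above reduces to a single guess. The following is a minimal sketch, assuming that the most frequent ciphertext letter is the image of E; the sample plaintext and the function names are invented here for illustration, and on short or atypical texts the guess can of course fail.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def guess_shift(ciphertext, most_common_plain="E"):
    """Guess k by assuming the most frequent ciphertext letter encrypts E."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    top_letter, _ = Counter(letters).most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(most_common_plain)) % 26


def shift(text, k):
    """Shift every letter of text by k positions; other characters are kept."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


plaintext = "MEET ME AT THE SECRET TREE NEAR THE CREEK"
ciphertext = shift(plaintext, 5)
k = guess_shift(ciphertext)
print(k, shift(ciphertext, -k))
```

In this sample text the letter E occurs most often, so the guessed shift agrees with the shift that was used.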

Example 23.1.3. A variation on the Caesar code is the Vigenère code. Here, message
units are considered as k-vectors of integers mod N from an N-letter alphabet. Let
B = (b1, . . . , bk) be a fixed k-vector in ℤ_N^k. The Vigenère code then takes a message
unit

(a1, . . . , ak) → (a1 + b1, . . . , ak + bk) mod N.

From a cryptanalysis point of view, a Vigenère code is no more secure than a Caesar
code and is susceptible to the same type of statistical attack.
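
The map of this example can be sketched as follows. The key vector B = (1, 4, 19) and the message are hypothetical choices made here for illustration; they do not come from the text.

```python
import string

ALPHABET = string.ascii_uppercase
N = len(ALPHABET)


def vigenere(message, key_vector, decrypt=False):
    """Add (or subtract) the key vector B to each block of k letters mod N."""
    sign = -1 if decrypt else 1
    k = len(key_vector)
    letters = [ch for ch in message.upper() if ch in ALPHABET]
    out = []
    for pos, ch in enumerate(letters):
        b = key_vector[pos % k]  # component of B for this position in the block
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * b) % N])
    return "".join(out)


B = (1, 4, 19)  # hypothetical key vector, corresponding to the letters B, E, T
print(vigenere("ATTACK", B))                # BXMBGD
print(vigenere("BXMBGD", B, decrypt=True))  # ATTACK
```

With k = 1 this reduces to the shift cipher of the previous example.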

The Alberti Code is a polyalphabetic cipher, and can often be used to thwart a sta-
tistical frequency attack. We describe it in the next example.

Example 23.1.4. Suppose we have an N letter alphabet. We then form an N × N ma-


trix P, where each row and column is a distinct permutation of the plaintext alphabet.
Hence, P is a permutation matrix on the integers 0, . . . , N − 1. Bob and Alice decide on
a keyword. The keyword is placed above the plaintext message, and the intersection
of the keyword letter and plaintext letter below will determine which cipher alphabet
to use. We will make this precise with a 9-letter alphabet A, B, C, D, E, O, S, T, U. Here,
for simplicity, we will assume that each row is just a shift of the previous row, but any
permutation can be used.


Key letters:  A B C D E O S T U
Alphabet A:   a b c d e o s t u
Alphabet B:   b c d e o s t u a
Alphabet C:   c d e o s t u a b
Alphabet D:   d e o s t u a b c
Alphabet E:   e o s t u a b c d
Alphabet O:   o s t u a b c d e
Alphabet S:   s t u a b c d e o
Alphabet T:   t u a b c d e o s
Alphabet U:   u a b c d e o s t

Suppose the plaintext message is STAB DOC and Bob and Alice have chosen the
keyword BET. We place the keyword repeatedly over the message

B E T B E T B
S T A B D O C.

To encode, we look at B, which lies over S. The intersection of the B key letter and the
S alphabet is a T; so we encrypt the S with T. The next key letter is E, which lies over T.
The intersection of the E keyletter with the T alphabet is C. Continuing in this manner,
and ignoring the space, we get the encryption

STAB DOC → TCTC TDD.
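
Since each row of the table in this example is a shift of the previous one, the table lookup can be reproduced with modular arithmetic on the positions in the 9-letter alphabet. The following is a minimal sketch; the names TABLE and alberti_encrypt are chosen here for illustration.

```python
ALPHABET = "ABCDEOSTU"  # the 9-letter alphabet of the example
N = len(ALPHABET)

# Row p of the table is the alphabet shifted by p places, so the entry in the
# row of plaintext letter p and the column of key letter k is ALPHABET[(p + k) % N].
TABLE = [[ALPHABET[(p + k) % N] for k in range(N)] for p in range(N)]


def alberti_encrypt(message, keyword):
    """Encrypt with the shifted-row table; spaces are kept but use no key letter."""
    out = []
    pos = 0  # index into the repeated keyword
    for ch in message.upper():
        if ch not in ALPHABET:
            out.append(ch)
            continue
        key_letter = keyword[pos % len(keyword)]
        out.append(TABLE[ALPHABET.index(ch)][ALPHABET.index(key_letter)])
        pos += 1
    return "".join(out)


print(alberti_encrypt("STAB DOC", "BET"))  # TCTC TDD
```

For a table whose rows are arbitrary permutations, only the definition of TABLE would change.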

Example 23.1.5. A final example, which is not number theory based, is the so-called
Beale Cipher. This has a very interesting history, which is related in the popular book
Archimedes’ Revenge by Paul Hoffman (see [66]). Here, letters are encrypted by numbering
the first letters of each word in some document like the Declaration of Independence
or the Bible. There will then be several choices for each letter, making a Beale
cipher quite difficult to attack.

Until relatively recent times, cryptography was mainly concerned with message
confidentiality—that is sending secret messages so that interceptors or eavesdroppers
cannot decipher them. The discipline was primarily used in military and espionage sit-
uations. This changed with the vast amount of confidential data that had to be trans-
mitted over public airways. Thus, the field has expanded to many different types of
cryptographic techniques, such as digital signatures and message authentications.
Cryptography and encryption do have a long and celebrated history. In the
Bible, in the book of Jeremiah, one finds what is called an Atbash code. In this code,
the letters of the alphabet (Hebrew in the Bible, but any alphabet can be used)
are permuted first to last. That is, in the Latin alphabet, A would go to Z, Z to A, and so on.
The Kabbalists and the Kabbala believe that the Bible—written in Hebrew, where
each letter also stands for a number—is a code from heaven. They have devised elabo-
rate ways to decode it. This idea has seeped into popular culture, where the book “The
Bible Code” became a bestseller.


In his military campaigns, Julius Caesar would send out coded messages. His
method, which we looked at in Example 23.1.2, is now known as a Caesar code. It is a
shift cipher. That is, each letter is shifted a fixed number of places along the alphabet.
A shift cipher is a special case of an affine cipher that will be elaborated upon in the
next section. The Caesar code was resurrected and used during the American Civil War.
Coded messages produced by most of the historical methods reveal statistical
information about the plaintext. This could be used in most cases to break the codes.
Frequency analysis was discovered by the Arab mathematician Al-Kindi in
the ninth century, and the basic classical substitution ciphers became more or less
easily breakable. About 1470, Leon Alberti developed a method to thwart statistical
analysis. His innovation was to use a polyalphabetic cipher, where different parts of
the message are encrypted with different alphabets. We looked at an example of an
Alberti code in Example 23.1.4.
A different way to thwart statistical attacks is to use blank and neutral letters,
that is, meaningless letters within the message. Mary, Queen of Scots, used a ran-
dom permutation cipher with neutrals in it, where a neutral was a random mean-
ingless symbol. Unfortunately for her, her messages were decoded, and she was be-
headed.
There have been various physical devices and aids used to create codes. Prior
to the widespread use of the computer, the most famous cryptographic aid was the
Enigma machine, developed and used by the German military during the Second World
War. This was a rotor machine using a polyalphabetic cipher. An early version was
broken by Polish cryptographers before the war, so a larger system was built that
was considered unbreakable. British cryptographers led by Alan Turing broke this,
and British knowledge of German secrets had a great effect on the latter part of the
war.
The development of digital computers allowed for the development of much more
complicated cryptosystems. Furthermore, this allowed for the encryption using any-
thing that can be placed in binary formats, whereas historical cryptosystems could
only be rendered using language texts. This has revolutionized cryptography.
In 1976, Diffie and Hellman developed the first usable public key exchange protocol.
This allowed for the transmission of secret data over open airways. A year later,
Rivest, Shamir, and Adleman developed the RSA algorithm, a second public key
protocol. There are now many, and we will discuss them later. In 1997, it became known
that public key cryptography had been developed earlier by James Ellis working for
British Intelligence, and that both the Diffie–Hellman and RSA protocols had been
developed earlier by Malcolm Williamson and Clifford Cocks, respectively.
Before we close this introductory section, we give a short overview of the crypto-
graphic tasks.
Secure confidential message transmission is only one type of task that must be
done with secrecy, and there are many other tasks and procedures that are important
in cryptography.


Suppose that several parties want to perform a cryptographic task. To accomplish
this, the involved parties must communicate and cooperate. Moreover, each party has
to obey certain rules and use certain preassigned algorithms. The set of all methods
and algorithms used to perform such a cryptographic task is called a cryptographic
protocol. A cryptosystem is just one type of cryptographic protocol. The formal
definition of a cryptographic protocol is given below.

Definition 23.1.6. Suppose that several parties want to manage a cryptographical


task. Then they must communicate with each other and cooperate; hence, each party
must follow certain rules and implement certain agreed-upon algorithms. The set of
all such methods and rules to perform a cryptographical task is called a cryptographic
protocol.

Several cryptographic tasks are described below. As more techniques are intro-
duced later in the book, we will look at more instances of these cryptographic tasks
and cryptographic protocols to handle them.
(1) Authentication: Authentication refers to the process of determining that a mes-
sage, supposedly from a given person, does come from that person, and further,
has not been tampered with. Included in the general topic of authentication are
the concepts of hash functions and digital signatures. Another important usage is
password identification.
(2) Key exchange and key transport: In a key exchange protocol, two people, usually
called Bob and Alice, exchange a secret shared key to be used in some symmetric
encryption. In a key transport protocol, one party transports to another a secret
key that is to be used.
(3) Secret sharing: Secret sharing involves methods where some secret is to be shared
by k people, but is not available to any proper subset of them. There are many ways
to accomplish this, and it is related to a classical lock and key problem. A beautiful
simple solution to the general problem using polynomial interpolation is due
to Shamir (see for instance the book [54] by Baumslag, Fine, Kreuzer, and
Rosenberger).
(4) Zero knowledge proof : A zero-knowledge proof is an argument that convinces
someone that you have solved a problem, for example, a combinatorial problem,
without giving away the solution. This is tied to authentication.
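
The interpolation idea behind the secret sharing task (3) can be sketched as follows. This is only an illustration under stated assumptions, not the presentation in [54]: the secret is the constant term of a random polynomial of degree k − 1 over a prime field, each of the k people receives one point on its graph, and all k points are needed to interpolate the polynomial at 0. The prime 2^61 − 1 and the names make_shares and recover are choices made here.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; all arithmetic is done modulo PRIME


def make_shares(secret, k):
    """Split secret into k shares so that all k are used to recover it."""
    # Random polynomial of degree k - 1 with constant term equal to the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, k + 1):
        y = sum(c * pow(x, e, PRIME) for e, c in enumerate(coeffs)) % PRIME
        shares.append((x, y))
    return shares


def recover(shares):
    """Lagrange interpolation at 0 recovers the constant term (the secret)."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8 or later).
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


shares = make_shares(123456789, k=4)
print(recover(shares) == 123456789)  # True
```

Fewer than k points do not determine a polynomial of degree k − 1, which is the reason a proper subset of the shares does not reveal the secret.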

23.2 Encryption and number theory


Here, we describe some basic cryptosystems derived from number theory. In applying
a cryptosystem to an N-letter alphabet, we consider the letters as integers mod N.
The encryption algorithms then apply number-theoretic functions and use modular
arithmetic on these integers. One example of this was the shift, or Caesar cipher,
described in the last section. In this encryption method, a fixed integer k is chosen and


the encryption map is given by

f : m → m + k mod N.

The shift algorithm is a special case of an affine cipher. Recall that an affine map
on a ring R is a function f (x) = ax + b with a, b, x ∈ R. We apply such a map to the ring
of integers modulo N, that is, R = ℤN , as the encryption map. Again, suppose we have
an N letter alphabet, and we consider the letters as the integers 0, 1, . . . , N − 1 mod N,
that is, in the ring ℤN . We choose integers a, b ∈ ℤN with (a, N) = 1 and b ≠ 0. The
numbers a, b are called the keys of the cryptosystem. The encryption map is then given
by

f : m → am + b mod N.

Example 23.2.1. Using an affine cipher with the English language and keys a = 3,
b = 5 encode the message EAT AT JOE’S. Ignore spaces and punctuation.
The numerical sequence for the message ignoring the spaces and punctuation is

4, 0, 19, 0, 19, 9, 14, 4, 18.

Applying the map f (m) = 3m + 5 mod 26, we get

17, 5, 62, 5, 62, 32, 47, 17, 59 → 17, 5, 10, 5, 10, 6, 21, 17, 7.

Now rewriting these as letters, we get

EAT AT JOE’S → RFKFKGVRH.

Since (a, N) = 1, the integer a has a multiplicative inverse a^{-1} mod N. The
decryption map for an affine cipher with keys a, b is then

g = f^{-1} : m → a^{-1}(m − b) mod N.
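
The encryption and decryption maps of the affine cipher can be sketched as follows, using the keys a = 3, b = 5 of Example 23.2.1. The function names are chosen here for illustration, and the modular inverse is computed with the built-in pow, which accepts the exponent −1 in Python 3.8 or later.

```python
import string

ALPHABET = string.ascii_uppercase
N = len(ALPHABET)


def to_numbers(text):
    """Map letters to 0..25 (A = 0); spaces and punctuation are ignored."""
    return [ALPHABET.index(ch) for ch in text.upper() if ch in ALPHABET]


def affine_encrypt(text, a, b):
    """Apply f(m) = (a*m + b) mod N to every letter."""
    return "".join(ALPHABET[(a * m + b) % N] for m in to_numbers(text))


def affine_decrypt(text, a, b):
    """Apply g(m) = a^{-1} * (m - b) mod N; requires gcd(a, N) = 1."""
    a_inv = pow(a, -1, N)  # modular inverse of a
    return "".join(ALPHABET[a_inv * (m - b) % N] for m in to_numbers(text))


print(affine_encrypt("EAT AT JOE'S", 3, 5))  # RFKFKGVRH
print(affine_decrypt("RFKFKGVRH", 3, 5))     # EATATJOES
```

Here a^{-1} = 9, since 3 · 9 = 27 ≡ 1 mod 26.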

Since an affine cipher, as given above, goes letter to letter, it is easy to attack using
a statistical frequency approach. Furthermore, if an attacker can determine two letters
and knows that it is an affine cipher, the keys can be determined and the code broken.
To give better security it is preferable to use k-vectors of letters as message units. The
affine cipher then takes the form

f : v → Av + B,

where v and B are k-vectors from ℤ_N^k, and A is an invertible k × k matrix with entries
from the ring ℤN. The computations are then done modulo N. Since v is a k-vector,
and A is a k × k matrix, the matrix product Av produces another k-vector from ℤ_N^k.
Adding the k-vector B again produces a k-vector, so the ciphertext message unit is


again a k-vector. The keys for this affine cryptosystem are the enciphering matrix A,
and the shift vector B. The matrix A is chosen to be invertible over ℤN (equivalent to
the determinant of A being a unit in the ring ℤN), so the decryption map is given by

v → A^{-1}(v − B).

Here, A^{-1} is the matrix inverse over ℤN, and v is a k-vector.
A statistical frequency attack on such a cryptosystem requires knowledge, within
a given language, of the statistical frequency of k-strings of letters. This is more diffi-
cult to determine than the statistical frequency of single letters. As for a letter to letter
affine cipher, if k + 1 message units, where k is the message block length, are discov-
ered, then the code can be broken.
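
For the letter to letter case, the key recovery mentioned above amounts to solving two linear congruences. The following is a minimal sketch, assuming that the difference of the two known plaintext letters is a unit mod N (otherwise another pair has to be used); the function name is chosen here for illustration, and the sample pairs are read off from Example 23.2.1.

```python
from math import gcd

N = 26


def recover_affine_keys(pair1, pair2, n=N):
    """Recover (a, b) from two (plaintext, ciphertext) number pairs.

    Solves c1 = a*m1 + b and c2 = a*m2 + b modulo n. This sketch only
    handles the case where m1 - m2 is a unit modulo n.
    """
    (m1, c1), (m2, c2) = pair1, pair2
    diff = (m1 - m2) % n
    if gcd(diff, n) != 1:
        raise ValueError("m1 - m2 is not invertible mod n; try another pair")
    a = (c1 - c2) * pow(diff, -1, n) % n
    b = (c1 - a * m1) % n
    return a, b


# From Example 23.2.1: T (19) is sent to K (10) and A (0) is sent to F (5).
print(recover_affine_keys((19, 10), (0, 5)))  # (3, 5)
```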

Example 23.2.2. Using an affine cipher with message units of length 2 in the English
language and keys

\[
A = \begin{pmatrix} 5 & 1 \\ 8 & 7 \end{pmatrix}, \qquad
B = \begin{pmatrix} 5 \\ 3 \end{pmatrix},
\]

encode the message EAT AT JOE’S. Again ignore spaces and punctuation.
Message units of length 2; that is, 2-vectors of letters are called digraphs. We first
must place the plaintext message in terms of these message units. The numerical se-
quence for the message EAT AT JOE’S, ignoring the spaces and punctuation, is as be-
fore

4, 0, 19, 0, 19, 9, 14, 4, 18.

Therefore, the message units are

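\[ \begin{pmatrix} 4 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 19 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 19 \\ 9 \end{pmatrix}, \quad \begin{pmatrix} 14 \\ 4 \end{pmatrix}, \quad \begin{pmatrix} 18 \\ 18 \end{pmatrix}, \]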

repeating the last letter to end the message.


The enciphering matrix A has determinant 5 ⋅ 7 − 1 ⋅ 8 = 27 ≡ 1 mod 26, which is a unit mod 26,
and hence A is invertible. Therefore, it is a valid key.
Now we must apply the map f (v) = Av + B mod 26 to each digraph. For example,

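\[ A \begin{pmatrix} 4 \\ 0 \end{pmatrix} + B = \begin{pmatrix} 5 & 1 \\ 8 & 7 \end{pmatrix} \begin{pmatrix} 4 \\ 0 \end{pmatrix} + \begin{pmatrix} 5 \\ 3 \end{pmatrix} = \begin{pmatrix} 20 \\ 32 \end{pmatrix} + \begin{pmatrix} 5 \\ 3 \end{pmatrix} \equiv \begin{pmatrix} 25 \\ 9 \end{pmatrix} \pmod{26}. \]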

Doing this to the other message units, we obtain

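\[ \begin{pmatrix} 25 \\ 9 \end{pmatrix}, \quad \begin{pmatrix} 22 \\ 25 \end{pmatrix}, \quad \begin{pmatrix} 5 \\ 10 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 13 \end{pmatrix}, \quad \begin{pmatrix} 9 \\ 13 \end{pmatrix}. \]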

Now rewriting these as digraphs of letters, we get

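\[ \begin{pmatrix} \mathrm{Z} \\ \mathrm{J} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{W} \\ \mathrm{Z} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{F} \\ \mathrm{K} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{B} \\ \mathrm{N} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{J} \\ \mathrm{N} \end{pmatrix}. \]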

Therefore, the coded message is

EAT AT JOE’S → ZJWZFKBNJN.
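
The computation in this example can be reproduced with a short Python sketch. It is given as an illustration only; the helper names are hypothetical, and the code handles only message units of length 2, padding an odd-length message by repeating its last letter as was done above.

```python
def to_numbers(text):
    """Map letters A..Z to 0..25, dropping spaces and punctuation."""
    return [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]

def to_letters(nums):
    """Map 0..25 back to the letters A..Z."""
    return "".join(chr(n + ord("A")) for n in nums)

def affine_encrypt(text, A, B, N=26):
    """Encrypt with f(v) = A v + B mod N on digraphs (k = 2)."""
    nums = to_numbers(text)
    if len(nums) % 2:            # pad by repeating the last letter
        nums.append(nums[-1])
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((A[0][0] * x + A[0][1] * y + B[0]) % N)
        out.append((A[1][0] * x + A[1][1] * y + B[1]) % N)
    return to_letters(out)
```

Calling `affine_encrypt("EAT AT JOE'S", [[5, 1], [8, 7]], [5, 3])` gives the ciphertext ZJWZFKBNJN found above.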

Example 23.2.3. Suppose we receive the message ZJWZFKBNJN, and we wish to de-
code it. We know that an affine cipher with message units of length 2 in the English
language and keys

\[ A = \begin{pmatrix} 5 & 1 \\ 8 & 7 \end{pmatrix}, \qquad B = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \]

is being used.
The decryption map is given by

v → A−1 (v − B),

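so we must find the inverse matrix for A. For a 2 × 2 invertible matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we have

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]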

Therefore, in this case, recalling that multiplication is mod 26,

\[ A = \begin{pmatrix} 5 & 1 \\ 8 & 7 \end{pmatrix} \implies A^{-1} = \begin{pmatrix} 7 & -1 \\ -8 & 5 \end{pmatrix}. \]

The message ZJWZFKBNJN, in terms of message units, is

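\[ \begin{pmatrix} 25 \\ 9 \end{pmatrix}, \quad \begin{pmatrix} 22 \\ 25 \end{pmatrix}, \quad \begin{pmatrix} 5 \\ 10 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 13 \end{pmatrix}, \quad \begin{pmatrix} 9 \\ 13 \end{pmatrix}. \]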

We apply the decryption map to each digraph. For example,

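\[ A^{-1}\left( \begin{pmatrix} 25 \\ 9 \end{pmatrix} - B \right) = \begin{pmatrix} 7 & -1 \\ -8 & 5 \end{pmatrix} \left( \begin{pmatrix} 25 \\ 9 \end{pmatrix} - \begin{pmatrix} 5 \\ 3 \end{pmatrix} \right) = \begin{pmatrix} 7 & -1 \\ -8 & 5 \end{pmatrix} \begin{pmatrix} 20 \\ 6 \end{pmatrix} \equiv \begin{pmatrix} 4 \\ 0 \end{pmatrix} \pmod{26}. \]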

Doing this to each, we obtain

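\[ \begin{pmatrix} 4 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 19 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 19 \\ 9 \end{pmatrix}, \quad \begin{pmatrix} 14 \\ 4 \end{pmatrix}, \quad \begin{pmatrix} 18 \\ 18 \end{pmatrix} \]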

and rewriting in terms of letters

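\[ \begin{pmatrix} \mathrm{E} \\ \mathrm{A} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{T} \\ \mathrm{A} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{T} \\ \mathrm{J} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{O} \\ \mathrm{E} \end{pmatrix}, \quad \begin{pmatrix} \mathrm{S} \\ \mathrm{S} \end{pmatrix}. \]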

This gives us

ZJWZFKBNJN → EATATJOESS.
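
The decryption in this example can be checked in the same way. The sketch below is illustrative only; the function name is hypothetical, it is restricted to 2 × 2 key matrices, and it relies on `pow(x, -1, N)` from Python 3.8 or later to invert the determinant mod N.

```python
def affine_decrypt(cipher, A, B, N=26):
    """Decrypt digraphs with v = A^{-1} (c - B) mod N for a 2x2 key matrix A."""
    (a, b), (c, d) = A
    det_inv = pow((a * d - b * c) % N, -1, N)   # determinant must be a unit mod N
    A_inv = [[d * det_inv % N, -b * det_inv % N],
             [-c * det_inv % N, a * det_inv % N]]
    nums = [ord(ch) - ord("A") for ch in cipher]
    out = []
    for i in range(0, len(nums), 2):
        x, y = (nums[i] - B[0]) % N, (nums[i + 1] - B[1]) % N
        out.append((A_inv[0][0] * x + A_inv[0][1] * y) % N)
        out.append((A_inv[1][0] * x + A_inv[1][1] * y) % N)
    return "".join(chr(n + ord("A")) for n in out)
```

With the keys of the example, `affine_decrypt("ZJWZFKBNJN", [[5, 1], [8, 7]], [5, 3])` returns EATATJOESS, including the padding letter.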

Modern cryptography is done via a computer. Hence, all messages both plaintext
and ciphertext are actually presented as binary strings. Important in this regard is the
concept of a hash function.
A cryptographic hash function is a deterministic function

\[ h : S \to \{0, 1\}^n, \]

which returns for each arbitrary block of data, called a message, a fixed size bit string.
It should have the property that a change in the data will change the hash value. The
hash value is called the digest.
An ideal cryptographic hash function has the following properties:
(1) It is easy to compute the hash value for any given message.
(2) It is infeasible to find a message that has a given hash value (preimage resistant).
(3) It is infeasible to modify a message without changing its hash.
(4) It is infeasible to find two different messages with the same hash (collision resis-
tant).

A cryptographic hash function can serve as a digital signature.
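
As a small illustration of the fixed digest size and of the sensitivity to changes in the data, the following Python sketch uses SHA-256 from the standard `hashlib` module. The choice of SHA-256 is ours (so n = 256 here), and the helper name `digest_bits` is hypothetical.

```python
import hashlib

def digest_bits(message: bytes) -> str:
    """Return the SHA-256 digest of message as a 256-character bit string."""
    value = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return format(value, "0256b")

d1 = digest_bits(b"EAT AT JOE'S")
d2 = digest_bits(b"EAT AT JOE'S!")      # the message changed by one character
# Number of bit positions in which the two digests differ.
differing = sum(x != y for x, y in zip(d1, d2))
```

Both digests have length 256 regardless of the message length, and for a well-designed hash function one expects roughly half of the bits to differ after even a small change to the message.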


Hash functions can also be used with encryption. Suppose that Bob and Alice
want to communicate openly. They have exchanged a secret key K that supposedly
only they know. Let fK be an encryption function or encryption algorithm based on
the key K. Alice wants to send the message m to Bob, and m is given as a binary bit
string. Alice sends to Bob

fK (m) ⊕ h(K),

where ⊕ is addition modulo 2.


Bob knows the key K, and hence its hash value h(K). He now computes

fK (m) ⊕ h(K) ⊕ h(K).

Since every bit string has order 2 under addition modulo 2, we have

fK (m) ⊕ h(K) ⊕ h(K) = fK (m).

Bob now applies the decryption algorithm gK , where gK ∘ fK = id, to decode the message.
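
The masking step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the hash h is taken to be SHA-256, the value fK (m) is treated as an already computed byte string of at most 32 bytes, and the function names are hypothetical.

```python
import hashlib

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise addition modulo 2 of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))

def mask(ciphertext: bytes, key: bytes) -> bytes:
    """Compute f_K(m) XOR h(K); applying it twice returns f_K(m)."""
    pad = hashlib.sha256(key).digest()
    if len(ciphertext) > len(pad):
        raise ValueError("this sketch only handles blocks of up to 32 bytes")
    return xor_bytes(ciphertext, pad[:len(ciphertext)])
```

Alice would send `mask(c, K)` for c = fK (m), and Bob recovers c as `mask(mask(c, K), K)` before applying gK.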

Alice could have just as easily sent fK (m) ⊕ K. However, sending the hash has two
benefits. Usually the hash is shorter than the key, and from the properties of hash
functions, it gives another level of security. As we will see, tying the secret key to the
actual encryption in this manner is the basis for the El-Gamal and elliptic curve cryp-
tographic methods.
The encryption algorithm fK is usually a symmetric key encryption, so that anyone
knowing K can encrypt and decrypt easily. However, it should be resistant to plaintext-
ciphertext attacks. That is, if an attacker gains some knowledge of a piece of plaintext
together with the corresponding ciphertext, it should not compromise the whole sys-
tem.
The encryption algorithm can either be a block cipher or a stream cipher. In the
former, blocks of fixed length k are transformed into blocks of fixed length n, and there
is a method to tie the encrypted blocks together. In the latter, a stream cipher, bits are
transformed one by one into new bit strings by some procedure.
In 2001, the National Institute of Standards and Technology adopted a block cipher,
now called AES for Advanced Encryption Standard, as the industry standard for a
symmetric key encryption. Although not universally used, it is the most widely used.
This block cipher was a standardization of the Rijndael cipher, named after its inventors
Rijmen and Daemen.
AES replaced DES or Data Encryption Standard, which had been the standard.
Parts of DES were found to be insecure. AES proceeds with several rounds of encrypting
blocks, and then mixing blocks. The mathematics in AES is done over the finite
field $GF(2^8)$.
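
To indicate what arithmetic in this field looks like, here is a hedged Python sketch of multiplication in $GF(2^8)$. Bytes are read as polynomials over GF(2), and products are reduced modulo $x^8 + x^4 + x^3 + x + 1$, the irreducible polynomial fixed in the AES specification. The function name is hypothetical, and the code is not an AES implementation.

```python
def gf256_mul(a: int, b: int, modulus: int = 0x11B) -> int:
    """Multiply two elements of GF(2^8), each represented as a byte.

    Bits are polynomial coefficients over GF(2); the default modulus
    0x11B encodes x^8 + x^4 + x^3 + x + 1.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= modulus         # reduce as soon as the degree reaches 8
        b >>= 1
    return result
```

For instance, the worked product {57} ⋅ {83} = {c1} from the AES specification is reproduced by `gf256_mul(0x57, 0x83)`.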

23.3 Public key cryptography


Presently there are many instances, where secure information must be sent over open
communication lines. These include, for example, banking and financial transac-
tions, purchasing items via credit cards over the internet and similar things. This
led to the development of public key cryptography. Roughly, in classical cryptography,
only the sender and receiver know the encoding and decoding methods. Furthermore,
it is a feature of such cryptosystems, such as the ones that we have looked at, that if
the encrypting method is known, then the decryption can be carried out. In public
key cryptography, the encryption method is public knowledge, but only the receiver
knows how to decode. More precisely, in a classical cryptosystem once the encrypting
algorithm is known, the decryption algorithm can be implemented in approximately
the same order of magnitude of time. In the public key cryptosystem, developed first
by Diffie and Hellman, the decryption algorithm is much more difficult to implement.
This difficulty depends on the type of computing machinery used, and as computers
get better, new and more secure public key cryptosystems become necessary.

The basic idea in a public key cryptosystem is to have a one-way function or trap-
door function. That is, a function, which is easy to implement, but very hard to invert.
Hence, it becomes simple to encrypt a message, but very hard, unless you know the
inverse, to decrypt.
The standard model for public key systems is the following: Alice wants to send
a message to Bob. The encrypting map fA for Alice is public knowledge, as well as the
encrypting map fB for Bob. On the other hand, the decryption algorithms gA and gB
are secret and known only to Alice and Bob, respectively. Let 𝒫 be the message Alice
wants to send to Bob. She sends fB gA (𝒫 ). To decode, Bob applies first gB , which only he
knows. This gives him gB (fB gA (𝒫 )) = gA (𝒫 ). He then looks up fA , which is publicly
available, and applies it to get fA (gA (𝒫 )) = 𝒫 , the message. Why not just send
fB (𝒫 )? Bob is the only one who can decode this. The idea is authentication, that is, being
certain from Bob’s point of view that the message really came from Alice. Suppose
𝒫 is Alice’s verification: a signature, social security number, et cetera. If Bob receives
fB (𝒫 ), it could be sent by anyone, since fB is public. On the other hand, since only
Alice supposedly knows gA , getting a reasonable message from fA (gB fB gA (𝒫 )) would
verify that it is from Alice. Applying gB alone should result in nonsense.
Getting a reasonable one-way function can be a formidable task. The most widely
used (at present) public key systems are based on difficult-to-invert number theoretic
functions. The original public key system was developed by Diffie and Hellman in 1976.
It was followed closely by a second public key system developed by Rivest, Shamir,
and Adleman, known as the RSA system. Although at present there are many different
public key systems in use, most are variations of these original two. The variations are
attempts to make the systems more secure. We will discuss four such systems.

23.3.1 The Diffie–Hellman protocol

Diffie and Hellman in 1976 developed the original public key idea using the discrete log
problem. In modular arithmetic, it is easy to raise an element to a power, but difficult
to determine, given an element known to be a power of another element, the exponent
involved. Specifically, if G is a finite group, such as the cyclic multiplicative group
of ℤp , where p is a prime, and $h = g^k$ for some k, then the discrete log of h to the base g
is any integer t with $h = g^t$.
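
For a very small prime, a discrete log can simply be found by trying all exponents. The Python sketch below (with a hypothetical function name) does exactly this; its running time grows linearly with p, which is why it is hopeless for primes of cryptographic size.

```python
def discrete_log(h: int, g: int, p: int):
    """Return the least t >= 0 with g**t == h (mod p), or None if none exists."""
    value = 1
    for t in range(p - 1):       # the order of g divides p - 1
        if value == h % p:
            return t
        value = value * g % p
    return None
```

For example, 5 generates the multiplicative group mod 23, and `discrete_log(8, 5, 23)` returns 6 because $5^6 \equiv 8 \bmod 23$.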
The rough form of the Diffie–Hellman public key system is as follows: Bob and
Alice will use a classical cryptosystem based on a key k with 1 < k < q − 1, where q is a
prime. It is the key k that Alice must share with Bob. Let g be a multiplicative generator
of $\mathbb{Z}_q^\star$, the multiplicative group of $\mathbb{Z}_q$. The generator g is public. It is known that this
group is cyclic if q is a prime.
Alice chooses an $a \in \mathbb{Z}_q$ with 1 < a < q − 1. She makes public $g^a$. Bob chooses a
$b \in \mathbb{Z}_q^\star$ and makes public $g^b$. The secret key is $g^{ab}$. Both Bob and Alice, but presumably
no one else, can discover this key. Alice knows her secret power a, and the value $g^b$ is
public from Bob. Hence, she can compute the key $g^{ab} = (g^b)^a$. The analogous situation
holds for Bob. An attacker, however, only knows $g^a$, $g^b$, and g. Unless the attacker
can solve the discrete log problem, the key exchange is secure.
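
The exchange can be sketched in a few lines of Python. The parameters q = 23 and g = 5 are toy values chosen for illustration (5 generates the multiplicative group mod 23), and the helper name is hypothetical; real parameters are far larger.

```python
import secrets

def dh_keypair(g: int, q: int):
    """Pick a secret exponent x with 1 < x < q - 1 and return (x, g^x mod q)."""
    x = 2 + secrets.randbelow(q - 3)     # uniform in {2, ..., q - 2}
    return x, pow(g, x, q)

q, g = 23, 5                  # toy public parameters
a, A = dh_keypair(g, q)       # Alice keeps a secret and publishes A = g^a
b, B = dh_keypair(g, q)       # Bob keeps b secret and publishes B = g^b
alice_key = pow(B, a, q)      # (g^b)^a
bob_key = pow(A, b, q)        # (g^a)^b
```

Both parties arrive at the same value, which then serves as the shared key.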
Given q, g, $g^a$, $g^b$, the problem of determining the secret key $g^{ab}$ is called the
Diffie–Hellman problem. At present the only known solution is to solve the discrete log
problem, which appears to be very hard. In choosing the prime q and the generator g, it is
assumed that the prime q is very large, so that the order of g is very large. There are
algorithms to solve the discrete log problem if q is too small.
One attack on the Diffie–Hellman key exchange is a man in the middle attack. Since
the basic protocol involves no authentication, an attacker can pretend to be Bob and
get information from Alice, and then pretend to be Alice and get information from
Bob. In this way, the attacker could get the secret shared key. To prevent this, digital
signatures are often used (see [70] for a discussion of these).
The decision Diffie–Hellman problem is: given a prime q and $g^a \bmod q$, $g^b \bmod q$,
and $g^c \bmod q$, determine if $g^c = g^{ab}$.
In 1997, it became known that the ideas of public key cryptography were developed
by British Intelligence Services prior to Diffie and Hellman.

23.3.2 The RSA algorithm

In 1977, Rivest, Shamir, and Adleman developed the RSA algorithm, which is presently
(in several variations) the most widely used public key cryptosystem. It is based on
the difficulty of factoring large integers and, in particular, on the fact that it is easier
to test for primality than to factor very large integers.
In basic form, the RSA algorithm works as follows: Alice chooses two large primes
pA , qA and an integer eA relatively prime to ϕ(pA qA ) = (pA − 1)(qA − 1), where ϕ is the
Euler phi-function. It is assumed that these integers are chosen randomly to minimize
attacks. Primality tests arise in the following manner: Alice first randomly chooses a
large odd integer m and tests it for primality. If m is prime it is used. If not, she tests
m + 2, m + 4, and so on, until she gets her first prime pA . She then repeats the process
to get qA . Similarly, she chooses another odd integer m and tests until she gets an eA
relatively prime to ϕ(pA qA ). The primes she chooses should be quite large. Originally,
RSA used primes of approximately 100 decimal digits, but as computing and attacks
have become more sophisticated, larger primes have had to be utilized. Presently, keys
with 400 decimal digits are not uncommon. Once Alice has obtained pA , qA , eA , she
lets nA = pA qA and computes dA , the multiplicative inverse of eA modulo ϕ(nA ). That
is, dA satisfies eA dA ≡ 1 mod (pA − 1)(qA − 1). She makes public the enciphering key
KA = (nA , eA ), and the encryption algorithm known to all is

\[ f_A(\mathcal{P}) = \mathcal{P}^{e_A} \bmod n_A, \]

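where $\mathcal{P} \in \mathbb{Z}_{n_A}$ is a message unit. It can be shown (see for instance [43] or exercises)
that if $(e_A, (p_A - 1)(q_A - 1)) = 1$ and $e_A d_A \equiv 1 \bmod (p_A - 1)(q_A - 1)$, then $\mathcal{P}^{e_A d_A} \equiv \mathcal{P} \bmod n_A$.
Therefore, the decryption algorithm is

\[ g_A(\mathcal{C}) = \mathcal{C}^{d_A} \bmod n_A. \]

Notice then that $g_A(f_A(\mathcal{P})) = \mathcal{P}^{e_A d_A} \equiv \mathcal{P} \bmod n_A$, so $g_A$ is a left inverse of $f_A$.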
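
The key setup and the two power maps can be sketched in Python as follows. The function names are hypothetical, the primes in the usage note are toy values, and `pow(e, -1, phi)` requires Python 3.8 or later. No padding or randomization is included, so this is not a secure implementation.

```python
from math import gcd

def rsa_keygen(p: int, q: int, e: int):
    """Return (n, e, d) for primes p, q and an exponent e coprime to (p-1)(q-1)."""
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p - 1)(q - 1)")
    d = pow(e, -1, phi)          # multiplicative inverse of e modulo phi(n)
    return p * q, e, d

def rsa_apply(x: int, exponent: int, n: int) -> int:
    """The power map x -> x^exponent mod n, used for both f_A and g_A."""
    return pow(x, exponent, n)
```

With the toy primes 61 and 53 and e = 17, `rsa_keygen(61, 53, 17)` returns n = 3233 and d = 2753, and applying `rsa_apply` with e and then with d returns every message unit in ℤn unchanged.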


Now Bob makes the same type of choices to obtain pB , qB , eB . He lets nB = pB qB
and makes public his key KB = (nB , eB ).
If Alice wants to send a message to Bob that can be authenticated to be from Alice,
she sends fB (gA (𝒫 )). An attack then requires factoring nA or nB , which is much more
difficult than generating the primes pA , qA , pB , qB in the first place.
In practice, suppose there is an N letter alphabet, which is to be used for both
plaintext and ciphertext. The plaintext message is to consist of k-vectors of letters, and
the ciphertext message of l-vectors of letters with k < l. Each of the k plaintext letters
in a message unit 𝒫 is then considered as an integer mod N, and the whole message
unit is considered as a k-digit integer written to the base N (see example below).
The transformed message is then written as an l-digit integer to the base N, and its
digits are then considered as integers mod N, from which encrypted letters are found. To
ensure that every plaintext unit is an element of $\mathbb{Z}_{n_U}$ and every ciphertext unit fits
into l letters, k < l are chosen so that

\[ N^k < n_U < N^l \]

for each user U, where $n_U = p_U q_U$. In this case, any plaintext message 𝒫 is an integer
less than $N^k$, considered as an element of $\mathbb{Z}_{n_U}$. Since $n_U < N^l$, the image under
the power transformation corresponds to an l-digit integer written to the base N, and
hence to an l letter block. We give an example with relatively small primes. In real
world applications, the primes would be chosen to have over a hundred digits, and
the computations and choices must be done using good computing machinery.

Example 23.3.1. Suppose N = 26, k = 2, and l = 3. Suppose further that Alice chooses
pA = 29, qA = 41, eA = 13. Here, nA = 29 ⋅ 41 = 1189, so she makes public the key KA =
(1189, 13). She then computes the multiplicative inverse dA of 13 mod 1120 = 28 ⋅ 40,
which is dA = 517, since 13 ⋅ 517 = 6721 = 6 ⋅ 1120 + 1.
Now suppose we want to send her the message TABU. Since k = 2, the message units
in plaintext are 2-vectors of letters, so we separate the message into TA BU. We show
how to send TA. First, the numerical sequence for the letters TA mod 26 is (19,0). We
then use these as the digits of a 2-digit number to the base 26. Hence,

TA =̂ 19 ⋅ 26 + 0 ⋅ 1 = 494.

We now compute the power transformation using her eA = 13 to evaluate

\[ f(19, 0) = 494^{13} \bmod 1189. \]

This is evaluated as 320. Now we write 320 to the base 26. By our choices of k, l this
can be written with a maximum of 3 digits to this base. Then

\[ 320 = 0 \cdot 26^2 + 12 \cdot 26 + 8. \]

The letters in the encoded message then correspond to (0, 12, 8), and therefore the en-
cryption of TA is AMI.
To decode the message, Alice uses her secret exponent dA and applies the inverse transformation gA .
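
The arithmetic of this example can be checked with the following Python sketch. The helper names are hypothetical; the value 517 is the inverse of 13 modulo 1120, as one verifies from 13 ⋅ 517 = 6721 = 6 ⋅ 1120 + 1.

```python
def block_to_int(block: str, N: int = 26) -> int:
    """Read a block of letters as the digits of a base-N integer."""
    value = 0
    for ch in block:
        value = value * N + (ord(ch) - ord("A"))
    return value

def int_to_block(value: int, length: int, N: int = 26) -> str:
    """Write value as exactly `length` base-N digits, then as letters."""
    digits = []
    for _ in range(length):
        value, r = divmod(value, N)
        digits.append(chr(r + ord("A")))
    return "".join(reversed(digits))

n, e, d = 1189, 13, 517          # Alice's modulus and exponents from the example
cipher = int_to_block(pow(block_to_int("TA"), e, n), 3)     # k = 2 letters in, l = 3 out
plain = int_to_block(pow(block_to_int(cipher), d, n), 2)    # Alice's decryption
```

Here `cipher` is AMI and `plain` is TA, in agreement with the computation above.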

Since we have assumed that k < l, this seems to restrict the direction in which
messages can be sent. In practice, to allow messages to go between any two users,
the following is done: Suppose Alice is sending an authenticated message to Bob. The
keys KA = (nA , eA ), KB = (nB , eB ) are public. If nA < nB , Alice sends fB gA (𝒫 ). On the
other hand, if nA > nB , she sends gA fB (𝒫 ).
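
The rule for ordering the two maps can be sketched as follows. This Python illustration uses hypothetical function names and assumes that 𝒫 is smaller than both moduli; the map with the smaller modulus is applied first, so that its output is a valid input for the second map.

```python
def send_authenticated(P, alice, bob_public):
    """alice = (n_A, e_A, d_A); bob_public = (n_B, e_B). Returns the ciphertext."""
    n_A, _, d_A = alice
    n_B, e_B = bob_public
    if n_A < n_B:
        return pow(pow(P, d_A, n_A), e_B, n_B)   # f_B(g_A(P))
    return pow(pow(P, e_B, n_B), d_A, n_A)       # g_A(f_B(P))

def receive_authenticated(C, bob, alice_public):
    """bob = (n_B, e_B, d_B); alice_public = (n_A, e_A). Inverts the map above."""
    n_B, _, d_B = bob
    n_A, e_A = alice_public
    if n_A < n_B:
        return pow(pow(C, d_B, n_B), e_A, n_A)   # f_A(g_B(C))
    return pow(pow(C, e_A, n_A), d_B, n_B)       # g_B(f_A(C))
```

Bob undoes the two maps in the reverse order, using his own secret exponent and Alice's public key.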
There have been attacks on RSA for special types of primes, so care must be taken
in choosing the primes.
The computations and choices used in real world implementations of the RSA al-
gorithm must be done with computers. Similarly, attacks on RSA are done via comput-
ers. As computing machinery gets stronger and factoring algorithms get faster, RSA
becomes less secure, and larger and larger primes must be used. To combat this, other
public key methods are in various stages of ongoing development. RSA and Diffie–
Hellman, and many related public key cryptosystems use properties in abelian groups.
In recent years, a great deal of work has been done to encrypt and decrypt using cer-
tain nonabelian groups, such as linear groups or braid groups. We will discuss these
later in the chapter.

23.3.3 The El-Gamal protocol

The El-Gamal cryptosystem is a method to use the Diffie–Hellman key exchange
method to do encryption. The method works as follows, and uses the fact that hash
functions can also be used with encryption: Suppose that Bob and Alice want to
communicate openly. They have exchanged a secret key K that supposedly only they
know. Let fK be an encryption function or encryption algorithm based on the key K.
Alice wants to send the message m to Bob and m is given as a binary bit string. Alice
sends to Bob

fK (m) ⊕ h(K),

where ⊕ is addition modulo 2.


Bob knows the key K, and hence its hash value h(K). He now computes

fK (m) ⊕ h(K) ⊕ h(K).

Since addition modulo 2 has order 2, we have

fK (m) ⊕ h(K) ⊕ h(K) = fK (m).

Bob now applies the decryption algorithm gK to decode the message.


Hence, if K is a publicly exchanged secret key and fK is a cryptosystem based on K,
then the above format allows an encryption algorithm to go with the key exchange. The
El-Gamal system does this with the Diffie–Hellman key exchange protocol.
Suppose that Bob and Alice want to communicate openly. Alice chooses a prime q
and a generator g of the multiplicative group $\mathbb{Z}_q^\star$. The prime q should be large enough
to thwart the known discrete logarithm algorithms. Alice then chooses an integer a
with 1 < a < q − 1. She computes

\[ A = g^a \bmod q. \]

Her public key is then (q, g, A). Bob wants to send a message M to Alice. He first
encodes the message as an integer m mod q. For Bob to now send an encrypted message m
to Alice, he chooses a random integer b with 1 < b < q − 2, and computes

\[ B = g^b \bmod q. \]

Bob then sends to Alice the integer

\[ c = A^b m \bmod q; \]

that is, Bob encrypts the whole message by multiplying it by the Diffie–Hellman
shared key. The complete El-Gamal ciphertext is then the pair (B, c).
How does Alice decode the message? Given the integer m, she knows how to
reconstruct the plaintext message M, so she must recover the mod q integer m. As in
the Diffie–Hellman key exchange, she can compute the shared key $A^b = B^a$. She can
then divide c by this Diffie–Hellman key $g^{ab}$ to obtain m. To avoid having to find the
inverse of $B^a$ mod q, which can be difficult, she computes the exponent x = q − 1 − a.
The inverse is then $B^x$ mod q, since $B^{q-1} \equiv 1$ mod q.
For each new El-Gamal encryption, a new exponent b is chosen so that there is a
random component of El-Gamal, which improves the security.
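
The scheme can be sketched in Python as follows. The parameters q = 23, g = 5, a = 6 are toy values for illustration only, and the function names are hypothetical. Decryption uses the exponent x = q − 1 − a described above instead of an explicit modular inversion.

```python
import secrets

def elgamal_encrypt(m: int, public_key):
    """Encrypt 0 < m < q under the public key (q, g, A); returns the pair (B, c)."""
    q, g, A = public_key
    b = 2 + secrets.randbelow(q - 4)      # fresh random b with 1 < b < q - 2
    B = pow(g, b, q)
    c = pow(A, b, q) * m % q              # multiply m by the shared key A^b
    return B, c

def elgamal_decrypt(ciphertext, a: int, q: int) -> int:
    """Recover m = c * B^(q - 1 - a) mod q."""
    B, c = ciphertext
    x = q - 1 - a
    return c * pow(B, x, q) % q

q, g, a = 23, 5, 6                        # toy parameters; a is Alice's secret
public_key = (q, g, pow(g, a, q))
```

Because b is chosen afresh on every call, encrypting the same m twice will usually give different pairs (B, c), both of which decrypt to m.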
Breaking the El-Gamal system is as difficult as breaking the Diffie–Hellman pro-
tocol, and hence is based on the difficulty of the discrete log problem. However, the
El-Gamal has the advantage that the choice of primes is random. As mentioned, the
primes should be chosen large enough to not be susceptible to known discrete log
algorithms. Presently, the primes should be of binary length at least 512.

23.3.4 Elliptic curves and elliptic curve methods

A very powerful approach, which has had wide-ranging applications in cryptography,
is to use elliptic curves. If K is a field of characteristic not equal to 2 or 3, then an elliptic
curve over K is the locus of points (x, y) ∈ K × K, satisfying the equation

\[ y^2 = x^3 + ax + b \quad \text{with } 4a^3 + 27b^2 \neq 0. \]

We denote by 0 a single point at infinity, and let

\[ E(K) = \{(x, y) \in K \times K : y^2 = x^3 + ax + b\} \cup \{0\}. \]

The important thing about elliptic curves from the viewpoint of cryptography is
that a group structure can be placed on E(K). In particular, we define the operation +
on E(K) by the following:
1. 0 + P = P for any point P ∈ E(K).
2. If P = (x, y), then −P = (x, −y), and −0 = 0.
3. P + (−P) = 0 for any point P ∈ E(K).
4. If $P_1 = (x_1, y_1)$, $P_2 = (x_2, y_2)$ with $P_1 \neq -P_2$, then

\[ P_1 + P_2 = (x_3, y_3) \quad \text{with} \quad x_3 = m^2 - (x_1 + x_2), \quad y_3 = -m(x_3 - x_1) - y_1, \]

where

\[ m = \frac{y_2 - y_1}{x_2 - x_1} \quad \text{if } x_2 \neq x_1, \]

and

\[ m = \frac{3x_1^2 + a}{2y_1} \quad \text{if } x_2 = x_1. \]

This operation has a very nice geometric interpretation if K = ℝ, the real numbers. It
is known as the chord and tangent method. If P1 ≠ P2 are two points on the curve, then
the line through P1 , P2 intersects the curve at another point P3 . If we reflect P3 through
the x-axis, we get P1 + P2 . If P1 = P2 , we take the tangent line at P1 .
With this operation, E(K) becomes an abelian group (due to Cassels), whose struc-
ture can be worked out.
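
Over a prime field the addition rules translate directly into code. The Python sketch below is an illustration with a hypothetical function name; it represents the point at infinity by `None`, assumes p > 3 is prime, and uses `pow(x, -1, p)` from Python 3.8 or later for division mod p.

```python
def ec_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + a x + b over Z_p.

    Points are (x, y) tuples; None stands for the point at infinity 0.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:                       # Q = -P
        return None
    if x1 != x2:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p         # chord slope
    else:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p  # tangent slope
    x3 = (m * m - x1 - x2) % p
    y3 = (-m * (x3 - x1) - y1) % p
    return (x3, y3)
```

On the toy curve $y^2 = x^3 + 2x + 2$ over ℤ17 with P = (5, 1), this gives P + P = (6, 3) and P + (6, 3) = (10, 6).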

Theorem 23.3.2. E(K), together with the operations defined above, forms an abelian
group. If K is a finite field of order $p^k$, then E(K) is either cyclic or has the structure

\[ E(K) = \mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2} \]

with $m_1 \mid m_2$ and $m_1 \mid (p^k - 1)$.

A comprehensive description and discussion of elliptic curve methods can be
found in Crandall and Pomerance [61].
The groups of elliptic curves can be used for cryptography, as developed by Koblitz
and others. If q is a prime and a, b ∈ ℤq , then we can form the elliptic curve E(q; a, b)
and the corresponding elliptic curve abelian group. In this group, the Diffie–Hellman
key exchange protocol and the corresponding El-Gamal encryption system can be im-
plemented. Care must be taken that the discrete log problem in E(q; a, b) is difficult.
The curve is then called a cryptographically secure elliptic curve.
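
A Diffie–Hellman exchange in an elliptic curve group can be sketched as follows. The curve $y^2 = x^3 + 2x + 2$ over ℤ17 with base point G = (5, 1), which generates a group of order 19, is a toy example and far too small to be cryptographically secure. The function names and the secret values are hypothetical, and the addition routine simply restates the group law given earlier.

```python
def ec_add(P, Q, a, p):
    """Group law on y^2 = x^3 + a x + b over Z_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if x1 != x2:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    else:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (-m * (x3 - x1) - y1) % p)

def ec_mul(k, P, a, p):
    """Compute the multiple k P by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        k >>= 1
    return R

a, p, G = 2, 17, (5, 1)            # toy curve and base point
alice_secret, bob_secret = 3, 7
A = ec_mul(alice_secret, G, a, p)  # Alice publishes 3G
B = ec_mul(bob_secret, G, a, p)    # Bob publishes 7G
shared_alice = ec_mul(alice_secret, B, a, p)
shared_bob = ec_mul(bob_secret, A, a, p)
```

Both parties obtain the point 21G, which equals 2G = (6, 3) because G has order 19.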
Elliptic curve public-key cryptosystems are at present the most important commu-
tative alternatives to the use of the RSA algorithm. There are several reasons for this:
They are more efficient in many cases than RSA, and keys in elliptic curve systems are
much smaller than keys in RSA. It is felt that it is important to have good workable
alternatives to RSA in the event that factoring algorithms become strong enough to
compromise RSA encryption.

23.4 Noncommutative-group-based cryptography


The public key cryptosystems and public key exchange protocols that we have dis-
cussed, such as the RSA algorithm, Diffie–Hellman, El-Gamal and elliptic curve meth-
ods are number theory based, and hence depend on the structure of abelian groups.
Although there have been no overall successful attacks on the standard methods,
there is a feeling that the strength of computing machinery has made these techniques
theoretically susceptible to attack. As a result of this, there has been a recent active line
of research to develop cryptosystems and key exchange protocols using noncommuta-
tive cryptographic platforms. This line of investigation has been given the broad title
of noncommutative algebraic cryptography. Since most of the cryptographic platforms
are groups, this is also known as group-based cryptography. The books by Myasnikov,
Shpilrain and Ushakov [73] and Steinwandt [64] provide an overview of group-based
cryptographic methods tied to complexity theory.
Up to this point, the main sources for noncommutative cryptographic platforms
have been nonabelian groups. In cryptosystems based on these objects, algebraic properties
of the platforms are used prominently in both devising cryptosystems and in
cryptanalysis. In particular, the nonsolvability of certain algorithmic problems in
finitely presented groups, such as the conjugator search problem, has been crucial in
encryption and decryption.
The main sources for nonabelian groups are combinatorial group theory and
linear group theory. Braid group cryptography (see [62]), where encryption is done
within the classical braid groups, is one prominent example. The one-way functions
in braid group systems are based on the difficulty of solving group theoretic decision
problems, such as the conjugacy problem and conjugator search problem. Although
braid group cryptography had initial spectacular success, various potential attacks
have been identified. Borovik, Myasnikov, Shpilrain [58], and others have studied the
statistical aspects of these attacks, and have identified what are termed black holes
in the platform groups, outside of which present cryptographic problems. Baumslag,
Fine and Xu in [55] and [79] suggested potential cryptosystems using a combination
of combinatorial group theory and linear groups, and a general schema for these
types of cryptosystems was given. In [56], a public key version of this schema using
the classical modular group as a platform was presented. A cryptosystem using the
extended modular group SL2 (ℤ) was developed by Yamamura [80], but was subse-
quently shown to have loopholes [77]. In [56], attacks based on these loopholes were
closed.
The extension of the cryptographic ideas to noncommutative platforms involves
the following ideas:
(1) General algebraic techniques for developing cryptosystems,
(2) Potential algebraic platforms (specific groups, rings, et cetera) for implementing
the techniques,
(3) Cryptanalysis and security analysis of the resulting systems.

The main sources for noncommutative platforms are nonabelian groups, and the main
method for handling nonabelian groups in cryptography is combinatorial group the-
ory, which we discussed in detail in Chapter 14. The basic idea in using combinatorial
group theory for cryptography is that elements of groups can be expressed as words in
some alphabet. If there is an easy method to rewrite group elements in terms of these
words, and further the technique used in this rewriting process can be supplied by a
secret key, then a cryptosystem can be created.
One of the earliest descriptions of a free group cryptosystem was in a paper by
W. Magnus in the early 1970s [71]. Recall that the classical modular group M is
$M = PSL_2(\mathbb{Z})$. Hence, M consists of the 2 × 2 projective integral matrices:

\[ M = \left\{ \pm \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc = 1,\ a, b, c, d \in \mathbb{Z} \right\}. \]

Equivalently, M can be considered as the set of integral linear fractional transformations
with determinant 1:

\[ z' = \frac{az + b}{cz + d}, \quad ad - bc = 1, \quad a, b, c, d \in \mathbb{Z}. \]

Magnus proved the following theorem:

Theorem 23.4.1 ([55]). The matrices

\[ \pm \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \quad \pm \begin{pmatrix} 1 + 4t^2 & 2t \\ 2t & 1 \end{pmatrix}, \quad t = 1, 2, 3, \ldots \]

freely generate a free subgroup F of infinite index in M. Furthermore, distinct elements
of F have distinct first columns.

Since the entries in the generating matrices are positive, we can do the following:
Choose a set

T1 , . . . , Tn

of projective matrices from the set above with n large enough to encode a desired plain-
text alphabet 𝒜. Any message would be encoded by a word

W(T1 , . . . , Tn )

with nonnegative exponents. This represents an element g of F. The two elements in
the first column determine W, and therefore g. Receiving W then determines the message
uniquely.
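
The following Python sketch illustrates this encoding for words with nonnegative exponents. The decoding routine is our own illustration and is not taken from Magnus's paper: for a first column (x, y) with positive entries, the quotient x/y lies in (1/2, 1] when the leftmost generator is the first matrix, and in (2t, 2t + 1/(2t)] when it is the matrix with parameter t, so the leftmost generator can be read off and removed repeatedly. The function names and the indexing (0 for the first matrix, t ≥ 1 for the others) are hypothetical.

```python
def generator(i):
    """i = 0 gives (1 1; 1 2); i = t >= 1 gives (1+4t^2 2t; 2t 1)."""
    return ((1, 1), (1, 2)) if i == 0 else ((1 + 4 * i * i, 2 * i), (2 * i, 1))

def encode(word):
    """First column of the product T_{i1} T_{i2} ... for a word [i1, i2, ...]."""
    x, y = 1, 0                              # first column of the identity
    for i in reversed(word):
        (p, q), (r, s) = generator(i)
        x, y = p * x + q * y, r * x + s * y
    return x, y

def decode(column):
    """Peel generators off the left using the size of x relative to y."""
    x, y = column
    word = []
    while y != 0:
        i = 0 if x <= y else x // (2 * y)
        (p, q), (r, s) = generator(i)
        x, y = s * x - q * y, -r * x + p * y   # multiply by the inverse (det = 1)
        word.append(i)
    return word
```

For instance, the word [0, 1] has first column (7, 9), and `decode((7, 9))` returns [0, 1].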
The idea of using the difficulty of group theory decision problems in infinite non-
abelian groups was first developed by Magyarik and Wagner in 1985. They devised a
public key protocol based on the difficulty of the solution of the word problem (see
Chapter 14). Although this was a seminal idea, their basic cryptosystem was really un-
workable and not secure in the form they presented. Wagner and Magyarik outlined
a conceptual public key cryptosystem based on the hardness of the word problem for
finitely presented groups. At the same time, they gave a specific example of such a
system. González Vasco and Steinwandt proved that their approach is vulnerable to
so-called reaction attacks. In particular, for the proposed instance, it is possible to
retrieve the private key just by watching the performance of a legitimate recipient.
The general scheme of the Wagner and Magyarik public-key cryptosystem is as
follows: Let X be a finite set of generators, and let R and S be finite sets of relators such
that the group G0 = ⟨X; R ∪ S⟩ has an easy word problem, that is, the word problem
can be solved in polynomial time, whereas the group G = ⟨X; R⟩ has a hard word problem
(see Chapter 14 for terminology).
Choose two words W0 and W1 , which are not equivalent in G0 (and hence not
equivalent in G). The public key is the presentation ⟨X; R⟩, and the chosen words W0
and W1 . To encrypt a single bit i ∈ {0, 1}, pick Wi and transform it into a ciphertext word
W by repeatedly and randomly applying Tietze transformations to the presentation
⟨X; R⟩. To decrypt a word W, run the algorithm for the word problem of G0 to decide
which of $W_i W^{-1}$ is equivalent to the empty word for the presentation ⟨X; R ∪ S⟩. The
private key is the set S. Actually, this is not sufficient and Wagner and Magyarik are
not clear on this point. The private key should be a deterministic polynomial-time
algorithm for the word problem of G0 = ⟨X; R ∪ S⟩. Just knowing S does not automatically
and explicitly give us an efficient algorithm (even if such an algorithm exists).

23.4.1 Free group cryptosystems

The simplest example of a nonabelian-group-based cryptosystem is perhaps a free
group cryptosystem. This can be described in the following manner:

Consider a free group F on free generators x1 , . . . , xr . Then each element g in F has
a unique expression as a reduced word W(x1 , . . . , xr ). Let W1 , . . . , Wk with Wi = Wi (x1 , . . . , xr ) be
a set of words in the generators x1 , . . . , xr of the free group F. At the most basic level, to
construct a cryptosystem, suppose that we have a plaintext alphabet 𝒜. For example,
suppose that 𝒜 = {a, b, . . .} are the symbols needed to construct meaningful messages
in English. To encrypt, use a substitution cipher

𝒜 → {W1 , . . . , Wk }.

That is,

a ↦ W1 , b ↦ W2 , . . . .

Then, given a word W(a, b, . . .) in the plaintext alphabet, form the free group word
W(W1 , W2 , . . .). This represents an element g in F. Send out g as the secret message.
To implement this scheme, we need a concrete representation of g, and then for
decryption, a way to rewrite g back in terms of W1 , . . . , Wk . This concrete representation
is the idea behind homomorphic cryptosystems.
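
The encryption step can be sketched in Python by storing a word as a list of pairs (generator, exponent) with exponent ±1 and freely reducing the product. The substitution table below is hypothetical; its words $y$, $xyx^{-1}$, $x^2yx^{-2}$ are a standard example of elements that freely generate a subgroup of the free group on x, y. Decryption, which requires rewriting in terms of W1 , . . . , Wk , is not shown.

```python
def reduce_word(word):
    """Freely reduce a word given as a list of (generator, exponent), exponent +1 or -1."""
    stack = []
    for gen, exp in word:
        if stack and stack[-1] == (gen, -exp):
            stack.pop()                # cancel x x^{-1} or x^{-1} x
        else:
            stack.append((gen, exp))
    return stack

def encrypt(plaintext, table):
    """Substitute each letter by its word W_i and freely reduce the product."""
    product = [symbol for letter in plaintext for symbol in table[letter]]
    return reduce_word(product)

x, y = ("x", 1), ("y", 1)
X, Y = ("x", -1), ("y", -1)            # inverses of the free generators
# Hypothetical table: a -> y, b -> x y x^{-1}, c -> x^2 y x^{-2}.
table = {"a": [y], "b": [x, y, X], "c": [x, x, y, X, X]}
```

For example, the plaintext bc is sent to the reduced word $x y x y x^{-2}$, in which the cancellation between the two substituted words hides where one letter ends and the next begins.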
The decryption algorithm in a free group cryptosystem then depends on the Reide-
meister–Schreier rewriting process. As described in Chapter 14, this is a method to
rewrite elements of a subgroup of a free group in terms of the generators of that sub-
group. Recall that roughly it works as follows: Assume that W1 , . . . , Wk are free generators
for some subgroup H of a free group F on {x1 , . . . , xr }. Each Wi is then a reduced
word in the generators {x1 , . . . , xr }. A Schreier transversal for H is a set {h1 , . . . , ht , . . .} of
(left) coset representatives for H in F of a special form (see Chapter 14). Any subgroup
of a free group has a Schreier transversal. The Reidemeister–Schreier process allows
one to construct a set of generators W1 , . . . , Wk for H by using a Schreier transversal.
Furthermore, given the Schreier transversal, from which the set of generators for H was
constructed, the Reidemeister–Schreier rewriting process allows us to algorithmically
rewrite an element of H. Given such an element expressed as a word W = W(x1 , . . . , xr )
in the generators of F, this algorithm rewrites W as a word W ⋆ (W1 , . . . , Wk ) in the gen-
erators of H.
The knowledge of a Schreier transversal, and the use of Reidemeister–Schreier
rewriting, facilitates the decoding process in the free group case, but is not essential.
Given a known set of generators for a subgroup, the Stallings folding method, which
builds a subgroup graph, can also be used to rewrite in terms of the given generators. The
paper by Kapovich and Myasnikov [68] is now a standard reference for this method in
free groups. At present, the complexity of Reidemeister–Schreier rewriting is being
studied by Brukhov, Fine, and Troeger.
Pure free group cryptosystems are subject to various attacks and can be broken
easily. However, a public key free group cryptosystem, using a free group represen-
tation in the modular group, was developed by Baumslag, Fine, and Xu [55, 56]. The
most successful attacks on free group cryptosystems are called length-based attacks.


Here, an attacker multiplies a word in ciphertext by a generator to get a shorter word,
which could possibly be decoded.
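A length-based attack of this kind can be sketched in a few lines of Python. This is only a toy under stated assumptions: the attacker is assumed to know (or to have guessed) the candidate code words, words are written as strings in which an upper-case letter denotes the inverse of the lower-case generator, and the code words x^i y x^-i and all function names are hypothetical.

```python
def free_reduce(word):
    """Freely reduce a word; 'X' denotes the inverse of 'x'."""
    reduced = []
    for ch in word:
        if reduced and reduced[-1] == ch.swapcase():
            reduced.pop()
        else:
            reduced.append(ch)
    return "".join(reduced)

def inverse(word):
    """Inverse of a word: reverse it and invert every letter."""
    return word[::-1].swapcase()

# Hypothetical code words W_i = x^i y x^-i, assumed known to the attacker.
CODE = {"a": "xyX", "b": "xxyXX", "c": "xxxyXXX"}

def encrypt(plaintext):
    return free_reduce("".join(CODE[letter] for letter in plaintext))

def length_based_attack(cipher, code):
    """Greedily strip the code word whose inverse shortens the word the most."""
    recovered, current = [], cipher
    while current:
        best_letter, best_word = None, current
        for letter, w in code.items():
            candidate = free_reduce(inverse(w) + current)
            if len(candidate) < len(best_word):
                best_letter, best_word = letter, candidate
        if best_letter is None:      # no code word shortens the word: give up
            return None
        recovered.append(best_letter)
        current = best_word
    return "".join(recovered)

print(length_based_attack(encrypt("ab"), CODE))   # ab
```

Each successful step strictly shortens the word, so the loop terminates. The greedy choice is a heuristic and is not guaranteed to succeed for arbitrary code words.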
Baumslag, Fine, and Xu in [55] described the general encryption scheme that fol-
lows using free group cryptography. A further enhancement was discussed in the pa-
per [56].
We start with a finitely presented group

G = ⟨X|R⟩,

where X = {x1 , . . . , xn } and a faithful representation,

ρ : G → Ḡ.

Ḡ can be any one of several different kinds of objects: linear group, permutation group,
power series ring, et cetera.
We assume that there is an algorithm to re-express an element of ρ(G) in Ḡ in terms
of the generators of G. That is, if g = W(x1 , . . . , xn , . . .) ∈ G, where W is a word in these
generators, and we are given ρ(g) ∈ Ḡ, we can algorithmically find g and its expression
as the word W(x1 , . . . , xn ).
Once we have G, we assume that we have two free subgroups K, H with

H ⊂ K ⊂ G.

We assume that we have fixed Schreier transversals for K in G and for H in K, both of
which are held in secret by the communicating parties Bob and Alice. Now, based on
the fixed Schreier transversals, we have sets of Schreier generators constructed from
the Reidemeister–Schreier process for K and for H:

k1 , . . . , km , . . . for K,

and

h1 , . . . , ht , . . . for H.

Notice that the generators for K will be given as words in x1 , . . . , xn , the generators
of G, whereas the generators for H will be given as words in the generators k1 , k2 , . . . for
K. We note further that H and K may coincide, and that H and K need not, in general,
be free, but only have a unique set of normal forms so that the representation of an
element in terms of the given Schreier generators is unique.
We will encode within H, or more precisely within ρ(H). We assume that the number
of generators for H is larger than the number of characters in our plaintext alphabet.
Let 𝒜 = {a, b, c, . . .} be our plaintext alphabet. At the simplest level, we choose a
starting point i within the generators of H, and encode

a ↦ hi , b ↦ hi+1 , . . . et cetera.

Suppose that Bob wants to communicate the message W(a, b, c, . . .) to Alice, where
W is a word in the plaintext alphabet. Recall that both Bob and Alice know the various
Schreier transversals, which are kept secret between them. Bob then encodes
W(hi , hi+1 , . . .) and computes in the target group Ḡ of ρ the element
W(ρ(hi ), ρ(hi+1 ), . . .), which he sends to Alice. This is sent as a matrix if Ḡ is a
linear group, or as a permutation if Ḡ is a permutation group, and so on.
Alice uses the algorithm for Ḡ relative to G to rewrite W(ρ(hi ), ρ(hi+1 ), . . .) as a word
W⋆(x1 , . . . , xn ) in the generators of G. She then uses the Schreier transversal for K in
G and the Reidemeister–Schreier process to rewrite W⋆ as a word W⋆⋆(k1 , . . . , ks , . . .)
in the generators of K. Since K is free, or has unique normal forms, this expression
for the element of K is unique. Once she has the word written in the generators of K,
she uses the transversal for H in K to rewrite again, using the Reidemeister–Schreier
process, in terms of the generators for H. She then has a word W⋆⋆⋆(hi , hi+1 , . . .), and
using hi ↦ a, hi+1 ↦ b, . . . decodes the message.
In actual implementation, an additional random noise factor is added.
In [55] and [56], an implementation of this process was presented that used for
the base group G, the classical modular group M = PSL2 (ℤ). Furthermore, it was a
polyalphabetic cipher, which was secure.
The system in the modular group M was presented as follows: A list of finitely
generated free subgroups H1 , . . . , Hm of M is public and presented by their systems of
generators (presented as matrices). In a full practical implementation, it is assumed
that m is large. For each Hi , we have a Schreier transversal

h1,i , . . . , ht(i),i

and a corresponding ordered set of generators

W1,i , . . . , Wm(i),i

constructed from the Schreier transversal by the Reidemeister–Schreier process. It is
assumed that each m(i) ≫ l, where l is the size of the plaintext alphabet; that is, each
subgroup has many more generators than the size of the plaintext alphabet. Although
Bob and Alice know these subgroups in terms of free group generators, what is made
public are generating systems given in terms of matrices.
The subgroups on this list and their corresponding Schreier transversals can be
chosen in a variety of ways. For example, the commutator subgroup of the modular
group is free of rank 2, and some of the subgroups Hi can be determined from
homomorphisms of this subgroup onto a set of finite groups.
Suppose that Bob wants to send a message to Alice. Bob first chooses three inte-
gers (m, q, t), where

m = choice of the subgroup Hm ;
q = starting point among the generators of Hm for the substitution of the plaintext alphabet;
t = size of the message unit.


We clarify the meanings of q and t. Once Bob has chosen m, the integer q fixes the
substitution

a ↦ Wm,q , b ↦ Wm,q+1 , . . . .

Again, the assumption is that m(i) ≫ l so that starting almost anywhere in the se-
quence of generators of Hm will allow this substitution. The message unit size t is the
number of coded letters that Bob will place into each coded integral matrix.
Once Bob has made the choices (m, q, t), he takes his plaintext message W(a, b, . . .)
and groups it into blocks of t letters. He then makes the given substitution above to form the
corresponding matrices in the modular group:

T1 , . . . , Ts .
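The bookkeeping behind the key (m, q, t) can be sketched as follows. This is only an illustration under assumptions: the lower-case Latin alphabet stands in for the plaintext alphabet, the function name `message_units` is hypothetical, and the sketch stops at the generator indices. Forming the actual matrices T1 , . . . , Ts from the generators Wm,q , Wm,q+1 , . . . is not shown.

```python
import string

def message_units(plaintext, q, t, alphabet=string.ascii_lowercase):
    """Split plaintext into units of t generator indices, starting at index q.

    The letter at position k of the alphabet is sent to generator index q + k,
    mirroring a -> W_{m,q}, b -> W_{m,q+1}, and so on.
    """
    indices = [q + alphabet.index(letter) for letter in plaintext]
    return [indices[k:k + t] for k in range(0, len(indices), t)]

units = message_units("abba", q=5, t=3)
print(units)   # [[5, 6, 6], [5]]
```

Each inner list corresponds to one message unit, that is, to one matrix Ti obtained by multiplying the generators with the listed indices.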

We now introduce a random noise factor. After forming T1 , . . . , Ts , Bob then multiplies
on the right each Ti by a random matrix in M, say RTi (different for each Ti ). The only
restriction on this random matrix RTi is that there is no free cancellation in forming the
product Ti RTi . This can be easily checked, and ensures that the freely reduced form for
Ti RTi is just the concatenation of the expressions for Ti and RTi . Next he sends Alice
the integral key (m, q, t) by some public key method (RSA, Anshel–Anshel–Goldfeld,
et cetera). He then sends the message as s random matrices

T1 RT1 , T2 RT2 , . . . , Ts RTs .

Hence, what is actually being sent out are not elements of the chosen subgroup Hm ,
but rather elements of random right cosets of Hm in M. The purpose of sending coset
elements is two-fold. The first is to hinder any geometric attack by masking the sub-
group. The second is that it makes the resulting words in the modular group generators
longer—effectively hindering a brute force attack.
To decode the message, Alice first uses public key decryption to obtain the inte-
gral keys (m, q, t). She then knows the subgroup Hm , the ciphertext substitution from
the generators of Hm and how many letters t each matrix encodes. She next uses the
algorithms, described in Section 14.4, to express each Ti RTi in terms of the free group
generators of M, say WTi (y1 , . . . , yn ). She has knowledge of the Schreier transversal,
which is held secretly by Bob and Alice, so now uses the Reidemeister–Schreier rewrit-
ing process to start expressing this freely reduced word in terms of the generators of
Hm . The Reidemeister–Schreier rewriting is done letter by letter from left to right (see
Chapter 14). Hence, when she reaches t of the free generators, she stops. Notice that
the string that she is rewriting is longer than what she needs to rewrite to decode as a
result of the random matrix RTi . This is due to the fact that she is actually rewriting
not an element of the subgroup, but an element in a right coset. This presents a
further difficulty to an attacker. Since these are random right cosets, it makes it diffi-
cult to pick up statistical patterns in the generators even if more than one message is
intercepted. In practice, the subgroups should be changed with each message.


The initial key (m, q, t) is changed frequently. Hence, as mentioned above, this
method becomes a type of polyalphabetic cipher. Polyalphabetic ciphers have histor-
ically been very difficult to decode.
A further variation of this method, using a formal power series ring in noncom-
muting variables over a field, was described in [51].
There have been many cryptosystems based on the difficulty of solving hard group
theoretic problems. The book by Myasnikov, Shpilrain, and Ushakov [73] describes
many of these in detail.

23.5 Ko–Lee and Anshel–Anshel–Goldfeld methods


After the initial attempt by Wagner and Magyarik to develop a cryptosystem based on a
hard group theoretic problem, there have been many developments using nonabelian
groups in cryptography. Among the first were the cryptographic schemes of Anshel,
Anshel and Goldfeld [50] and Ko and Lee [69]. Both sets of authors, at about the same
time, proposed using nonabelian groups and combinatorial group theory for public
key exchange. The security of these systems depended on the difficulty of solving cer-
tain “hard” group theoretic problems.
The methods of both Anshel–Anshel–Goldfeld and Ko–Lee can be considered as
group theoretic analogs of the number-theory-based Diffie–Hellman method. The basic
underlying idea is the following: If G is a group and g, h ∈ G, we let g^h denote the
conjugate of g by h; that is, g^h = h^{-1} g h. The simple observation is that this behaves
like ordinary exponentiation in that (g^{h1})^{h2} = g^{h1 h2}. From this straightforward idea,
one can exactly mimic the Diffie–Hellman protocol within a nonabelian group.
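The exponent-like behaviour of conjugation can be checked on a small example. The sketch below uses permutations written as tuples, a choice made purely for illustration. The helper names `mul`, `inv`, and `conj` and the particular permutations are assumptions, not part of the text.

```python
def mul(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    """Inverse permutation."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def conj(g, h):
    """The conjugate g^h = h^-1 g h."""
    return mul(mul(inv(h), g), h)

# Arbitrary illustrative permutations of {0, 1, 2, 3, 4}.
g = (1, 2, 0, 3, 4)
h1 = (0, 2, 1, 4, 3)
h2 = (4, 0, 1, 2, 3)

lhs = conj(conj(g, h1), h2)      # (g^h1)^h2
rhs = conj(g, mul(h1, h2))       # g^(h1 h2)
print(lhs == rhs)                # True
```

The identity holds in every group because (g^{h1})^{h2} = h2^{-1} h1^{-1} g h1 h2 = (h1 h2)^{-1} g (h1 h2).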
Both the Anshel–Anshel–Goldfeld protocol and the Ko–Lee protocol start with a
platform group G given by a group presentation. A major assumption in both protocols
is that the elements of G have unique normal forms that are easy to compute for given
group elements. However, it is further assumed that given normal forms for x, y ∈ G,
the normal form for the product xy does not reveal x or y.

23.5.1 The Ko–Lee protocol

Ko and Lee [69] developed a public key exchange system that is a direct translation
of the Diffie–Hellman protocol to a nonabelian group theoretic setting. Its security is
based on the difficulty of the conjugacy problem. We again assume that the platform
group has nice unique normal forms that are easy to compute for a given group element,
but from which it is hard to recover how that element was built up. Recall again that
g^h means the conjugate of g by h; that is, g^h = h^{-1} g h.
In the Ko–Lee protocol, Alice and Bob choose commuting subgroups A and B of
the platform group G. A is Alice’s subgroup, whereas Bob’s subgroup is B, and these are
secret. Now they completely mimic the classical Diffie–Hellman technique. There is a
public element g ∈ G; Alice chooses a random secret element a ∈ A and makes public
g^a. Bob chooses a random secret element b ∈ B and makes public g^b. The secret shared
key is g^{ab}. Notice that ab = ba, since the subgroups commute. It follows then that
(g^a)^b = g^{ab} = g^{ba} = (g^b)^a just as if these were exponents. Hence, both Bob and Alice
can determine the common secret. The security rests on the difficulty of the conjugacy
problem.
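The exchange just described can be traced in a toy setting. The following Python sketch assumes the symmetric group on 8 points as platform, with A acting on the points 0 to 3 and B on the points 4 to 7, so that the two subgroups commute. This platform is chosen only because it is easy to compute in; conjugator search in symmetric groups is easy, so the sketch offers no security. All names and the seed are hypothetical.

```python
import random

def mul(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def conj(g, h):
    """The conjugate g^h = h^-1 g h."""
    return mul(mul(inv(h), g), h)

def random_element(support, n, rng):
    """Random permutation of range(n) that moves only points in `support`."""
    images = list(support)
    rng.shuffle(images)
    perm = list(range(n))
    for point, image in zip(support, images):
        perm[point] = image
    return tuple(perm)

n = 8
rng = random.Random(2019)                 # fixed seed, for reproducibility only
g = (1, 2, 3, 4, 5, 6, 7, 0)              # public element: an 8-cycle

a = random_element([0, 1, 2, 3], n, rng)  # Alice's secret, in A
b = random_element([4, 5, 6, 7], n, rng)  # Bob's secret, in B

public_alice = conj(g, a)                 # g^a
public_bob = conj(g, b)                   # g^b

key_alice = conj(public_bob, a)           # (g^b)^a
key_bob = conj(public_alice, b)           # (g^a)^b
print(key_alice == key_bob)               # True
```

The two keys agree because a and b have disjoint supports and therefore commute.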
The conjugacy problem for a group G, or more precisely, for a group presentation
for G, is to determine algorithmically, given g, h ∈ G, whether they are conjugate. As with
the conjugator search problem, it is known that the conjugacy problem is undecidable in
general, but there are groups where it is decidable, but hard. These groups then become the
target platform groups for the Ko–Lee protocol. As with the Anshel–Anshel–Goldfeld
protocol, Ko and Lee suggest the use of the braid groups.
As with the standard Diffie–Hellman key exchange protocol using number theory,
the Ko–Lee protocol can be changed to an encryption system via the El-Gamal method.
There are several different variants of noncommutative El-Gamal systems.

23.5.2 The Anshel–Anshel–Goldfeld protocol

We now describe the Anshel–Anshel–Goldfeld public key exchange protocol. Let G be
the platform group given by a finite presentation and with the assumptions on normal
forms, as described above.
Alice and Bob want to communicate a shared secret. First, Alice and Bob choose
random finitely generated subgroups of G by giving a set of generators for each:

A = {a1 , . . . , an }, B = {b1 , . . . , bm },

and make them public. The subgroup A is Alice’s subgroup, whereas the subgroup B
is Bob’s subgroup.
Alice chooses a secret group word a = W(a1 , . . . , an ) in her subgroup, whereas
Bob chooses a secret group word b = V(b1 , . . . , bm ) in his subgroup. For an element
g ∈ G, we let NF(g) denote the normal form for g. Alice knows her secret word a and
knows the generators bi of Bob’s subgroup. She makes public the normal forms of the
conjugates

NF(b_i^a), i = 1, . . . , m.

Bob knows his secret word b and the generators ai of Alice’s subgroup, and makes
public the normal forms of the conjugates

NF(a_j^b), j = 1, . . . , n.


The common shared secret is the commutator

[a, b] = a^{-1} b^{-1} a b = a^{-1} a^b = (b^a)^{-1} b.

Notice that Alice knows a^b, since she knows a in terms of the generators ai of her
subgroup, and she knows the conjugates by b, since Bob has made the conjugates of
the generators of A by b public. Since Alice knows a^b, she knows [a, b] = a^{-1} a^b.
In an analogous manner, Bob knows [a, b] = (b^a)^{-1} b. An attacker would have to
know the corresponding conjugator, that is, the element that conjugates each of the
generators. Given elements g, h in a group G, where it is known that g^k = k^{-1} g k = h, the
conjugator search problem is to determine the conjugator k. It is known that this problem
is undecidable in general; that is, there are groups where the conjugator cannot
be determined algorithmically. On the other hand, there are groups where the
conjugator search problem is solvable, but “difficult”. That is, the conjugator search
problem has high computational complexity. Such groups become the ideal platform groups for
the Anshel–Anshel–Goldfeld protocol.
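The way both parties arrive at the commutator can be followed in a toy computation. The sketch below assumes the symmetric group on 6 points as platform, with permutations written as tuples, which serve as their own normal forms. The generators, the secret words, and all function names are hypothetical, and a symmetric group is far too easy a platform for real use.

```python
def mul(p, q):
    """Product pq of permutations given as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)

def inv(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def conj(g, h):
    """The conjugate g^h = h^-1 g h."""
    return mul(mul(inv(h), g), h)

def evaluate(word, gens):
    """Evaluate a word, given as (generator index, +1 or -1) pairs, in gens."""
    result = tuple(range(len(gens[0])))
    for index, sign in word:
        factor = gens[index] if sign == 1 else inv(gens[index])
        result = mul(result, factor)
    return result

# Public generators of Alice's and Bob's subgroups (illustrative choices).
a_gens = [(1, 0, 2, 3, 4, 5), (0, 2, 1, 3, 4, 5), (0, 1, 2, 4, 5, 3)]
b_gens = [(1, 2, 3, 4, 5, 0), (0, 1, 3, 2, 5, 4)]

# Secret words a = W(a_1, ..., a_n) and b = V(b_1, ..., b_m).
alice_word = [(0, 1), (2, -1), (1, 1), (0, 1)]
bob_word = [(1, 1), (0, -1), (1, 1)]
a = evaluate(alice_word, a_gens)
b = evaluate(bob_word, b_gens)

published_by_alice = [conj(bi, a) for bi in b_gens]   # the b_i^a
published_by_bob = [conj(aj, b) for aj in a_gens]     # the a_j^b

# Alice rebuilds a^b from her own word and Bob's published conjugates.
a_to_b = evaluate(alice_word, published_by_bob)
key_alice = mul(inv(a), a_to_b)                       # a^-1 a^b

# Bob rebuilds b^a from his own word and Alice's published conjugates.
b_to_a = evaluate(bob_word, published_by_alice)
key_bob = mul(inv(b_to_a), b)                         # (b^a)^-1 b

commutator = mul(mul(mul(inv(a), inv(b)), a), b)
print(key_alice == key_bob == commutator)             # True
```

The step that rebuilds a^b works because conjugation by b is a homomorphism: substituting a_j^b for a_j in the word for a yields a^b.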
The security in this system is then in the difficulty of the conjugator search problem.
Anshel, Anshel, and Goldfeld suggested the braid groups as potential platforms;
they use, for example, B80 with 12 or more generators in the subgroups. Their suggestion
and that of Ko and Lee led to the development of braid group cryptography. There have
been various attacks on the braid group system. However, some have been handled
by changing the parameters. In general, the ideas remain valid despite the attacks.
The Anshel–Anshel–Goldfeld key exchange can be developed into a cryptosystem
again by the El-Gamal method.
There have been many other public key exchange protocols developed using non-
abelian groups. A large number of them are described in the book of Myasnikov, Sh-
pilrain, and Ushakov [73]. The authors of that book themselves have developed many
of these methods. They use different “hard” group theoretic decision problems and
many have been broken. On the other hand, the security of many of them is still open,
and they, perhaps, can be used as viable alternatives to commutative methods.

23.6 Platform groups and braid group cryptography


Given a group-based encryption scheme, such as Ko–Lee or Anshel–Anshel–Goldfeld,
a platform group is a group G, in which the encryption is to take place. In general,
platform groups for the noncommutative protocols that we have discussed require cer-
tain properties. The first is the existence of a normal form for elements in the group.
Normal forms provide an effective method of disguising elements. Without this, one
can determine a secret key simply by inspection of group elements. Furthermore, if
N(x), N(y) are the normal forms for x, y, respectively, then it should be difficult to de-
termine N(x) and N(y) from N(xy). The existence of a normal form in a group implies


that the group has solvable word problem, which is essential for these protocols. For
purposes of practicality, the group also needs an efficiently computable normal form,
which ensures an efficiently solvable word problem.
In addition to the platform group having normal form, ideally, it would also be
large enough so that a brute force search for the secret key is infeasible.
Currently, there are many potential platform groups that have been suggested.
What follow are some of the proposals. We refer to [73] for a discussion of many of
these.
– Braid groups (Ko–Lee, Anshel–Anshel–Goldfeld),
– Thompson groups (Shpilrain–Ushakov) [75],
– Polycyclic groups (Eick–Kahrobaei) [63],
– Linear groups (Baumslag–Fine–Xu) [55, 56],
– Free metabelian groups (Shpilrain–Zapata) [76],
– Artin groups (Shpilrain–Zapata) [76],
– Grigorchuk groups (Petrides) [74],
– Groups of matrices (Grigoriev–Ponomarenko) [65],
– Surface braid groups (Camps) [60].

As platform groups for their respective protocols, both Ko–Lee and Anshel–Anshel–Goldfeld
suggested the braid groups Bn (see [59]). The groups in this class of groups
possess the desired properties for the key exchange and key transport protocols; they
have remarkable presentations with solvable word problems and conjugacy prob-
lems; the solution to the conjugacy and conjugator search problem is “hard”; there
are several possibilities for normal forms for elements, and they have many choices
for large commuting subgroups. Initially, the braid groups were considered so ideal as
platforms that many other cryptographic applications were framed within the braid
group setting. These included authentication (identifying over a public airwave that
a message received was from the correct sender) and digital signatures (sending an
encrypted message with an included authentication). There was so much enthusiasm
about using these groups that the whole area of study was named braid group cryptog-
raphy. A comprehensive and well-written article by Dehornoy [38] provides a detailed
overview of the subject, and we refer the reader to that for technical details.
After the initial successes with braid group cryptographic schemes, there were
some surprisingly effective attacks. There were essentially three types of attacks: an
attack using solutions to the conjugacy and conjugator search problems, an attack
using heuristic probability within Bn , and an attack based on the fact that there are
faithful linear representations of each Bn (see [38]). What is most surprising is that
the Anshel–Anshel–Goldfeld method was susceptible to a length-based attack. In the
Anshel–Anshel–Goldfeld method, the parameters are the specific braid group Bn , and
the rank of the secret subgroups for Bob and Alice. A length-based attack essentially
broke the method for the initial parameters suggested by Anshel, Anshel and Goldfeld.


The parameters were then made larger and attacks by this method were less success-
ful. However, this led to research on why these attacks on the conjugator search prob-
lem within Bn were successful. What was discovered was that, generically, a random
subgroup of Bn is a free group; hence, length-based attacks are essentially attacks on
free group cryptography, and therefore successful (see [22]). What this indicated was
that although randomness is important in cryptography, when using the braid groups as
platforms, subgroups cannot be chosen purely randomly.
Braid groups arise in several different areas of mathematics and have several
equivalent formulations. We close this chapter and the book with a brief introduction
to braid groups. A complete topological and algebraic description can be found in the
book of Joan Birman [59].
A braid on n strings is obtained by starting with n parallel strings and intertwining
them. We number the strings at each vertical position and keep track of where each
individual string begins and ends. We say that two braids are equivalent if it is possible
to move the strings of one of the braids in space, without moving the endpoints or
passing one string through another, and obtain the other braid. A braid with no crossings is called
a trivial braid. We form a product of braids in the following manner: If u is the first braid
and v is the second braid, then uv is the braid formed by placing the starting points for
the strings in v at the endpoints of the strings in u. The inverse of a braid is the mirror
image in the horizontal plane. It is clear that if we form the product of a braid and
its mirror image, we get a braid equivalent to the trivial braid. With these definitions,
the set of all equivalence classes of braids on n strings forms a group Bn . We let σi
denote the braid that has a single crossing from string i over string i+1. Since a general
braid is just a series of crossings, it follows that Bn is generated by the set σi ; i = 1, . . . ,
n − 1.
There is an equivalent algebraic formulation of the braid group Bn . Let Fn be free
on the n generators x1 , . . . , xn with n > 2. Let σi , i = 1, . . . , n − 1, be the automorphism of
Fn , given by

σ_i : x_i ↦ x_{i+1}, x_{i+1} ↦ x_{i+1}^{-1} x_i x_{i+1},
σ_i : x_j ↦ x_j , j ≠ i, i + 1.

Then each σi corresponds precisely to the basic crossings in Bn . Therefore, Bn can
be considered as the subgroup of Aut(Fn ) generated by the automorphisms σi . Artin
proved [35] (see also [31]) that a finite presentation for Bn is given by

Bn = ⟨σ1 , . . . , σn−1 ; [σi , σj ] = 1 if |i − j| > 1,
σi+1 σi σi+1 = σi σi+1 σi , i = 1, . . . , n − 2⟩.

This is now called the Artin presentation. The fact that Bn is contained in Aut(Fn )
provides an elementary solution to the word problem in Bn , since one can determine


easily if an automorphism of Fn is trivial on all the generators. We note that although
the braid groups Bn are linear (the Lawrence–Krammer representation is faithful
(see [38])), it is known that Aut(Fn ) is not linear (see [41]).
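This solution to the word problem can be made concrete. The Python sketch below is a minimal illustration under stated assumptions: a word in Fn is a list of nonzero integers (k for x_k and -k for its inverse), a braid word is a list of nonzero integers (i for σi and -i for its inverse), and the function names are hypothetical. It applies the automorphisms letter by letter and tests whether every generator is fixed.

```python
def free_reduce(word):
    """Freely reduce a word in F_n; the letter -k is the inverse of k."""
    reduced = []
    for letter in word:
        if reduced and reduced[-1] == -letter:
            reduced.pop()
        else:
            reduced.append(letter)
    return reduced

def elementary_image(s, j):
    """Image of x_j under sigma_i (s = i > 0) or its inverse (s = -i)."""
    i = abs(s)
    if s > 0:
        if j == i:
            return [i + 1]                  # x_i -> x_{i+1}
        if j == i + 1:
            return [-(i + 1), i, i + 1]     # x_{i+1} -> x_{i+1}^-1 x_i x_{i+1}
    else:
        if j == i:
            return [i, i + 1, -i]           # x_i -> x_i x_{i+1} x_i^-1
        if j == i + 1:
            return [i]                      # x_{i+1} -> x_i
    return [j]                              # all other generators are fixed

def apply_elementary(s, word):
    """Apply sigma_i or its inverse to a word of F_n."""
    out = []
    for letter in word:
        image = elementary_image(s, abs(letter))
        if letter < 0:
            image = [-x for x in reversed(image)]
        out.extend(image)
    return free_reduce(out)

def is_trivial_braid(braid_word, n):
    """True if the braid word fixes every generator x_1, ..., x_n of F_n."""
    images = {j: [j] for j in range(1, n + 1)}
    for s in braid_word:
        images = {j: apply_elementary(s, w) for j, w in images.items()}
    return all(images[j] == [j] for j in range(1, n + 1))

print(is_trivial_braid([1, 2, 1, -2, -1, -2], 3))   # True: braid relation
print(is_trivial_braid([1, 2, -1, -2], 3))          # False
```

The first call confirms the relation σ1 σ2 σ1 = σ2 σ1 σ2 in B3. The images of the generators can grow quickly with the length of the braid word, so this elementary test is not efficient in general.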
From the commuting relations in the Artin presentation, it is clear that each Bn
has the requisite collection of commuting subgroups.
The conjugacy problem for Bn was originally solved by Garside, and it was as-
sumed that it was hard in the complexity sense. Recently, there has been significant
research on the complexity of the solution to the conjugacy problem (see [73] and [38]).
There are several possibilities for normal forms for elements of Bn . The two most
commonly used are the Garside normal form and the Dehornoy handle form. These are
described in [38] and [73].
For braid group cryptography, one must be careful in using more than one normal
form in an encryption scheme. The second may expose what the first is hiding and vice
versa (see [38]).
We describe first the Dehornoy handle form. Let W be a word in the generators of
the braid group Bn . An xi -handle is a subword of W of the form

x_i^{-ϵ} V x_i^{ϵ}

with ϵ = ±1, and where the word V does not involve xi . If V does not contain any
xi+1 -handles, then the xi -handle is called permitted.
A braid word W′ is obtained from a braid word W by a one-step handle reduction
if some subword of W is a permitted xi-handle x_i^{-ϵ} V x_i^{ϵ}, and W′ is obtained from W by
applying the following substitutions for all letters in the xi-handle:

x_j^{±1} → 1, if j = i,
x_j^{±1} → x_{i+1}^{-ϵ} x_i^{±1} x_{i+1}^{ϵ}, if j = i + 1,
x_j^{±1} → x_j^{±1}, if j < i or j > i + 1.

W′ can be obtained from W by an m-step handle reduction if W′ can be obtained
from W by a sequence of m one-step handle reductions.
A word is handle free if it has no handles. The handle free braid words provide
normal forms for the elements of Bn .

Theorem 23.6.1. Let W be a braid word. Then the following holds:


(1) Any sequence of handle reductions applied to W will eventually stop and produce a
handle free braid word V representing the same element as W.
(2) The word W represents the identity in Bn if and only if any sequence of handle reduc-
tions applied to W produces the trivial word or, equivalently, the handle free form of
W is trivial.


The handle reduction process is very efficient and most of the time runs in time
polynomial in the length of the braid word to produce the handle free form.
However, there is no known theoretical complexity estimate (see [38]).
Garside solved the conjugacy problem using a different type of normal form for Bn .
Let Sn be the symmetric group on n letters, let π : Bn → Sn be the natural projection
sending a braid to the permutation it induces on its strings, and for each s ∈ Sn , let ζs
be the shortest positive braid such that π(ζs ) = s. The elements

S = {ζs : s ∈ Sn } ⊂ Bn

are called simple elements. We order the simple elements so that ζs < ζt if there exists
r ∈ Sn such that ζt = ζs ζr . This produces a lattice structure on S.
The trivial braid is the smallest element of S, whereas the greatest element of S is
the half-twist braid

Δ = ζ_{(n, n−1, . . . , 2, 1)}.

The Garside left normal form of a braid a ∈ Bn is a pair (p, (s1 , . . . , st )), where p ∈ ℤ
and s1 , . . . , st is a sequence of permutations in Sn \ {1, Δ} satisfying for each i = 1, . . . , t − 1

ζ_1 = gcd(ζ_{s_i}^{-1} Δ, ζ_{s_{i+1}}),

where

gcd(ζs , ζt ) = max{ζr : ζr < ζs and ζr < ζt }.

A normal form (p, (s1 , . . . , st )) represents the element

Δ^p ζ_{s_1} . . . ζ_{s_t}.

Theorem 23.6.2. There exists an algorithm, which computes the normal form of the cor-
responding braid for any braid word W = w(x1 , . . . , xn ).

23.7 Exercises
1. Show that if p, q are distinct primes and e, d are positive integers with (e, (p − 1)(q − 1)) = 1
and ed ≡ 1 mod (p − 1)(q − 1), then a^{ed} ≡ a mod pq for any integer a. (This is the
basis of the decryption function used in the RSA algorithm.)
2. The following table gives the approximate statistical frequency of occurrence of
letters in the English language. The passage below is encrypted with a simple per-
mutation cipher without punctuation. Use a frequency analysis to try to decode
it.


letter  frequency    letter  frequency    letter  frequency
A       .082         B       .015         C       .028
D       .043         E       .127         F       .022
G       .020         H       .061         I       .070
J       .002         K       .008         L       .040
M       .024         N       .067         O       .075
P       .019         Q       .001         R       .060
S       .063         T       .091         U       .028
V       .010         W       .023         X       .001
Y       .020         Z       .001

ZKIRNVMFNYVIRHZKLHRGREVRMGVTVIDSR
XSSZHZHGHLMOBKLHRGREVWRERHLIHLMVZ
MWRGHVOUKIRNVMFNYVIHKOZBZXIFXRZOI
LOVRMMFNYVIGSVLIBZMWZIVGSVYZHRHUL
IGHSHVMLGVHGSVIVZIVRMURMRGVOBNZMB
KIRNVHZMWGSVBHVIEVZHYFROWRMTYOLXP
HULIZOOGSVKLHRGREVRMGVTVIH

3. Encrypt the message NO MORE WAR using an affine cipher with single letter keys
a = 7, b = 5.
4. Encrypt the message NO MORE WAR using an affine cipher on 2 vectors of letters
and an encrypting key

A = \begin{pmatrix} 5 & 2 \\ 1 & 1 \end{pmatrix}, B = \begin{pmatrix} 3 \\ 7 \end{pmatrix}.

5. What is the decryption algorithm for the affine cipher given in the last problem?
6. How many different affine enciphering transformations are there on single letters
with an N letter alphabet?
7. Let N ∈ ℕ with N ≥ 2, and let n ↦ an + b with (a, N) = 1 be an affine cipher on an N
letter alphabet. Show that if any two letters are guessed n1 ↦ m1 , n2 ↦ m2 with
(n1 − n2 , N) = 1, then the code can be broken.
8. If we use an affine cipher on N, N ≥ 2, single letters with n ↦ an + b, b ≢ 0 mod N,
and (a − 1, N) = 1, show that there is always a unique fixed letter. This can be used
in cryptanalysis.
9. A user has the public RSA key (n, e). By a security gap, the number ϕ(n) becomes
known. Show that the user has to reject the key.
(i) Explain how the secret key d can be calculated, that is, the number d such
that ed ≡ 1 mod ϕ(n).
(ii) Explain how n can be factorized.


10. The plaintext message x is encrypted with two RSA keys (551, 5) and (551, 11). The
respective ciphertexts are 277 mod 551 and 429 mod 551. From this calculate x.
11. Let F be a free group of rank 3 with generators x, y, z. Code the English alphabet
by a ↦ 0, b ↦ 1, . . . . Consider the free group cryptosystem given by

i ↦ Wi ,

where Wi = x^i y^{i+1} z^{i+2} x^{-i+1}. Code the message EAT AT JOES with this system.
12. In the Anshel–Anshel–Goldfeld protocol, verify that both Bob and Alice will know
the commutator.

Bibliography
General abstract algebra

[1] J. L. Alperin and R. B. Bell, Groups and Representations, Springer-Verlag, 1995.
[2] M. Artin, Algebra, Prentice-Hall, 1991.
[3] C. Curtis and I. Reiner, Representation Theory of Finite Groups and Associative Algebras, Wiley
Interscience, 1966.
[4] C. Curtis and I. Reiner, Methods of Representation Theory I, Wiley Interscience, 1982.
[5] C. Curtis and I. Reiner, Methods of Representation Theory II, Wiley Interscience, 1986.
[6] B. Fine and G. Rosenberger, The Fundamental Theorem of Algebra, Springer-Verlag, 2000.
[7] J. Fraleigh, A First Course in Abstract Algebra, 7th ed., Addison-Wesley, 2003.
[8] E. G. Hafner, Lineare Algebra, Wiley-VCH, 2018.
[9] P. R. Halmos, Naive Set Theory, Springer-Verlag, 1998.
[10] I. Herstein, Topics in Algebra, Blaisdell, 1964.
[11] M. Kreuzer and L. Robbiano, Computational Commutative Algebra I and II, Springer-Verlag,
1999.
[12] S. Lang, Algebra, Addison-Wesley, 1965.
[13] S. MacLane and G. Birkhoff, Algebra, Macmillan, 1967.
[14] N. McCoy, Introduction to Modern Algebra, Allyn and Bacon, 1960.
[15] N. McCoy, The Theory of Rings, Macmillan, 1964.
[16] G. Stroth, Algebra. Einführung in die Galoistheorie, De Gruyter, 1998.
[17] A. Zimmermann, Modular Representations of Finite Groups, in Representation Theory: Algebra
and Applications 19, 155–257, Springer-Verlag, 2015.

Group theory and related topics

[18] G. Baumslag, Topics in Combinatorial Group Theory, Birkhäuser, 1993.


[19] O. Bogopolski, Introduction to Group Theory, European Mathematical Society, 2008.
[20] T. Camps, V. Große Rebel, G. Rosenberger, Einführung in die kombinatorische und die
geometrische Gruppentheorie, Heldermann Verlag, 2008.
[21] T. Camps, S. Kühling and G. Rosenberger, Einführung in die mengentheoretische und die
algebraische Topologie, Heldermann Verlag, 2006.
[22] C. Carstensen, B. Fine and G. Rosenberger, On asymptotic densities and generic properties in
finitely generated groups, Groups–Complexity–Cryptology 2, 2010, 212–25.
[23] B. Fine, A. Gaglione, A. Moldenhauer, G. Rosenberger and D. Spellman, Geometry and Discrete
Mathematics: A Selection of Highlights, De Gruyter, 2018.
[24] B. Fine and G. Rosenberger, Algebraic Generalizations of Discrete Groups, Marcel Dekker,
2001.
[25] D. Gorenstein, Finite Simple Groups. An Introduction to their Classification, Plenum Press,
1982.
[26] D. Johnson, Presentations of Groups, Cambridge University Press, 1990.
[27] S. Katok, Fuchsian Groups, Univ. of Chicago Press, 1992.
[28] G. Kern-Isberner and G. Rosenberger, A note on numbers of the form $x^2 + Ny^2$, Arch. Math. 43,
1986, 148–55.

https://doi.org/10.1515/9783110603996-024


[29] R. C. Lyndon, Groups and Geometry, LMS Lecture Note Series 101, Cambridge University Press,
1985.
[30] R. C. Lyndon and P. Schupp, Combinatorial Group Theory, Springer-Verlag, 1977.
[31] W. Magnus, A. Karrass and D. Solitar, Combinatorial Group Theory, Wiley, 1966.
[32] D. J. S. Robinson, A Course in the Theory of Groups, Springer-Verlag, 1982.
[33] J. Rotman, Group Theory, 3rd ed., Wm. C. Brown, 1988.

Number theory

[34] L. Ahlfors, Introduction to Complex Analysis, Springer-Verlag, 1968.


[35] T. M. Apostol, Introduction to Analytic Number Theory, Springer-Verlag, 1976.
[36] A. Baker, Transcendental Number Theory, Cambridge University Press, 1975.
[37] H. Cohn, A Classical Invitation to Algebraic Numbers and Class Fields, Springer-Verlag, 1978.
[38] L. E. Dickson, History of the Theory of Numbers, Chelsea, 1950.
[39] B. Fine, A note on the two-square theorem, Can. Math. Bulletin, 20, 1977, 93–4.
[40] B. Fine, Sums of squares rings, Can. J. Math., 29, 1977, 155–60.
[41] B. Fine, The Algebraic Theory of the Bianchi Groups, Marcel Dekker, 1989.
[42] B. Fine, A. Gaglione, A. Moldenhauer, G. Rosenberger and D. Spellman, Algebra and Number
Theory: A Selection of Highlights, De Gruyter, 2017.
[43] B. Fine and G. Rosenberger, Number Theory: An Introduction via the Distribution of Primes,
second edition, Birkhäuser, 2016.
[44] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 5th ed., Clarendon
Press, 1979.
[45] E. Landau, Elementary Number Theory, Chelsea, 1958.
[46] M. Newman, Integral Matrices, Academic Press, 1972.
[47] I. Niven and H. S. Zuckerman, The Theory of Numbers, 4th ed., John Wiley, 1980.
[48] O. Ore, Number Theory and its History, McGraw-Hill, 1949.
[49] H. Pollard and H. Diamond, The Theory of Algebraic Numbers, Carus Mathematical Monographs
9, Math. Assoc. of America, 1975.

Cryptography

[50] I. Anshel, M. Anshel and D. Goldfeld, An algebraic method for public key cryptography, Math.
Res. Lett., 6, 1999, 287–91.
[51] G. Baumslag, Y. Brjukhov, B. Fine and G. Rosenberger, Some cryptoprimitives for
noncommutative algebraic cryptography, in Aspects of Infinite Groups, 26–44, World Scientific
Press, 2009.
[52] G. Baumslag, Y. Brjukhov, B. Fine and D. Troeger, Challenge response password security using
combinatorial group theory, Groups Complex. Cryptol., 2, 2010, 67–81.
[53] G. Baumslag, T. Camps, B. Fine, G. Rosenberger and X. Xu, Designing key transport protocols
using combinatorial group theory, Cont. Math. 418, 2006, 35–43.
[54] G. Baumslag, B. Fine, M. Kreuzer and G. Rosenberger, A Course in Mathematical Cryptography,
De Gruyter, 2015.
[55] G. Baumslag, B. Fine and X. Xu, Cryptosystems using linear groups, Appl. Algebra Eng.
Commun. Comput. 17, 2006, 205–17.


[56] G. Baumslag, B. Fine and X. Xu, A proposed public key cryptosystem using the modular group,
Cont. Math. 421, 2007, 35–44.
[57] J. Birman, Braids, Links and Mapping Class Groups, Annals of Math Studies 82, Princeton
University Press, 1975.
[58] A. V. Borovik, A. G. Myasnikov and V. Shpilrain, Measuring sets in infinite groups, in
Computational and Statistical Group Theory, Contemp. Math. 298, 21–42, 2002.
[59] J. A. Buchmann, Introduction to Cryptography, Springer, 2004.
[60] T. Camps, Surface Braid Groups as Platform Groups and Applications in Cryptography, Ph.D.
thesis, Universität Dortmund, 2009.
[61] R. E. Crandall and C. Pomerance, Prime Numbers. A Computational Perspective, 2nd ed.,
Springer-Verlag, 2005.
[62] P. Dehornoy, Braid-based cryptography, Cont. Math., 360, 2004, 5–34.
[63] B. Eick and D. Kahrobaei, Polycyclic groups: A new platform for cryptology? math.GR/0411077
(2004), 1–7.
[64] M. I. González Vasco and R. Steinwandt, Group Theoretic Cryptography, Chapman & Hall, 2015.
[65] D. Grigoriev and I. Ponomarenko, Homomorphic public-key cryptosystems over groups and
rings, Quaderni di Matematica, 2005.
[66] P. Hoffman, Archimedes’ Revenge, W. W. Norton & Company, 1988.
[67] D. Kahrobaei and B. Khan, A non-commutative generalization of the El-Gamal key exchange
using polycyclic groups, in Proceeding of IEEE, 1–5, 2006.
[68] I. Kapovich and A. Myasnikov, Stallings foldings and subgroups of free groups, J. Algebra 248,
2002, 608–68.
[69] K. H. Ko, S. J. Lee, J. H. Cheon, J. H. Han, J. S. Kang and C. Park, New public-key cryptosystems
using Braid groups, in Advances in Cryptography, Proceedings of Crypto 2000, Lecture Notes in
Computer Science 1880, 166–83, 2000.
[70] N. Koblitz, Algebraic Methods of Cryptography, Springer, 1998.
[71] W. Magnus, Rational representations of fuchsian groups and non-parabolic subgroups of the
modular group, Nachrichten der Akad. Göttingen, 179–89, 1973.
[72] A. G. Myasnikov, V. Shpilrain and A. Ushakov, A practical attack on some braid group based
cryptographic protocols, in CRYPTO 2005, Lecture Notes in Computer Science 3621, 86–96,
2005.
[73] A. G. Myasnikov, V. Shpilrain and A. Ushakov, Group-Based Cryptography, Advanced Courses in
Mathematics, CRM Barcelona, 2007.
[74] G. Petrides, Cryptanalysis of the public key cryptosystem based on the word problem on the
Grigorchuk groups, in Cryptography and Coding, Lecture Notes in Computer Science 2898,
234–44, 2003.
[75] V. Shpilrain and A. Ushakov, The conjugacy search problem in public key cryptography;
unnecessary and insufficient, Applicable Algebra in Engineering, Communication and
Computing, 17, 2006, 285–9.
[76] V. Shpilrain and A. Zapata, Using the subgroup membership problem in public key
cryptography, Cont. Math., 418, 2006, 169–79.
[77] R. Steinwandt, Loopholes in two public key cryptosystems using the modular groups, preprint,
University of Karlsruhe, 2000.
[78] R. Stinson, Cryptography; Theory and Practice, Chapman and Hall, 2002.
[79] X. Xu, Cryptography and Infinite Group Theory, Ph.D. thesis, CUNY, 2006.
[80] A. Yamamura, Public key cryptosystems using the modular group, in Public Key Cryptography,
Lecture Notes in Computer Science 1431, 203–16, 1998.

Index
Abelian group 3, 102
Abelianization 184
AES 375
Affine cipher 371
Affine coordinate ring 325
Algebra 338
Algebraic closure 74, 93, 97
Algebraic extension 70
Algebraic geometry 319
Algebraic integer 303
Algebraic number field 304
Algebraic numbers 68, 75
Algebraic variety 319
Algebraically closed 93, 96
Alternating group 174
Annihilator 279
Anshel–Anshel–Goldfeld protocol 390
Associates 35
Authentication 370
Automorphism 11
Axiom of choice 25
Axiom of well-ordering 25

Basis theorem for finite abelian groups 157, 293
Betti number 295
Block cipher 375
Braid group 393
Braid group cryptography 393
Burnside’s Theorem 358

Cardano’s formulas 264
Cayley graph 220
Cayley’s theorem 133
Cell complex 218
Centralizer 190
Character 353
Character table 357
Characteristic 14
Characters and Character Theory 353
Ciphertext 366
Class equation 192
Class function 355
Class sums 352
Combinatorial group theory 201
Commutative algebra 319
Commutative ring 2
Commutator 183
Composition series 186
Composition series for modules 336
Congruence motion 129
Conjugacy class 190
Conjugacy problem 222
Conjugation in groups 147
Constructible number 81
Construction of a regular n-gon 85
Coset 17, 133
Cryptanalysis 365, 366
Cryptographic protocol 370
Cryptography 365
– public key 365
– symmetric key 365
Cryptology 365
Cryptosystem 365
Cyclic group 127
Cyclotomic field 261

Decryption 366
Dedekind domain 50
Degree of a representation 351
Dehornoy handle form 394
Derived series 184
Diffie–Hellman protocol 376
Digital signature 370
Dihedral groups 162
Dimension of an algebraic set 326
Direct summand 336
Discrete log problem 376
Divisibility 29
Division algorithm 30
Division ring 112, 345
Doubling the cube 84
Dual module 340
Dyck’s theorem 221

Eisenstein’s criterion 62
El-Gamal protocol 379
Elliptic curve methods 381
Elliptic function 326
Encryption 366
Endomorphism algebra 347
Euclidean algorithm 32


Euclidean domain 44
Euclidean group 129
Euclidean norm 44
Euclid’s lemma 21
Extension field 67

Factor group 18, 150
Factor module 336
Factor R-module 336
Factor ring 9
Feit–Thompson theorem 198
Feit–Thompson Theorem 358
Field 4
– extension 67
Field extension 67
– algebraic 70
– by radicals 257
– degree 67
– finite 67
– finitely generated 70
– isomorphic 68
– separable 243
– simple 70
– transcendental 70
Field of fractions 13
Finite fields 246
Finite integral domains 5
Fix field 230
Free group 202
– rank 205
Free group cryptosystems 384
Free modules 282
Free product 223
Free reduction 203
Frobenius homomorphism 15
Fuchsian group 210
Fully reducible representation 342
Fundamental theorem of algebra 106, 272
Fundamental theorem of arithmetic 29
Fundamental theorem of Galois theory 231
Fundamental theorem of modules 288
Fundamental theorem of symmetric polynomials 105

G-invariant subspace 342
Galois extension 243
– finite 230
Galois group 228
Galois theory 227
Garside normal form 395
Gauss’ lemma 59
Gaussian integers 46
Gaussian primes 48
Gaussian rationals 48
General linear group 128
Group 16, 102, 125
– abelian 3, 16, 125
– center 190
– conjugate elements 190
– coset 133
– coset representative 134
– cyclic 139
– direct product 155
– finite 16, 102, 125
– finitely generated 208
– finitely presented 208
– finitely related 208
– free abelian 295
– free product 223
– generating system 208
– generators 133, 208
– homomorphism 127
– internal direct product 156
– isomorphism 127
– order 16, 102, 125
– presentation 133, 208
– relations 133
– relator 208
– simple 175
– solvable 180
– transversal 134
Group action 189
Group algebra 338
Group based cryptography 382
Group isomorphism theorem 18, 151
Group presentation 208
Group representation 333, 334
Group ring 337, 338
Group Rings and Modules Over Group Rings 337
Group table 126

Hamiltonian skew field 112
Hash function 370, 374
Hilbert basis theorem 322
Hilbert’s Nullstellensatz 322, 323
Homomorphism
– group 17
  – automorphism 17


  – epimorphism 17
  – isomorphism 17
  – monomorphism 17
– ring 11
  – automorphism 11
  – endomorphism 11
  – epimorphism 11
  – isomorphism 11
  – monomorphism 11

Ideal 7
– generators 26
– maximal 24
– prime 22
– product 23
Ideals in ℤ 8
Index of a subgroup 17
Inner automorphism of G by a 166
Inner product 355
Insolvability of the quintic 263
Integral closure 308
Integral domain 3
Integral element 306
Integral ring extension 307
Integrally closed 308
Intermediate field 68
Irreducible character 353
Irreducible element 35
Irreducible representation 342
Isometry 129
Isomorphism problem 222

Jordan–Hölder theorem 186
Jordan–Hölder Theorem for R-modules 336

K-isomorphism 93
Kernel 18
Key exchange 370
Key transport 370
Ko–Lee protocol 389
Kronecker’s theorem 93
Krull dimension 326
Krull’s lemma 329
Kurosh theorem 224

Lagrange’s theorem 18
Left module 335
Left R-module 335
Linear action 334
Linear character 353
Linear representation 333, 334
Local ring 328

Maschke’s Theorem 340
Maximal ideal 24
Minimal polynomial 71
Modular group 209
Modular representation theory 342
Modular rings 5
Modular rings in ℤ 10
Module 275
Module homomorphism 337
Module sum 336

Nielsen–Schreier theorem 206
Noetherian 321
Noncommutative algebraic cryptography 383
Norm 36
Normal extension 121
Normal forms 206
Normal series 179
Normal subgroup 18, 148
Normalizer 192

One-way function 376
Opposite algebra 347
Ordinary representation theory 351
Orthogonality relations 357

p-group 163
p-Sylow subgroup 165
Perfect field 243
Permutation 16, 102
Permutation cipher 366
Permutation group 132
Permutation module 340
Plaintext 366
Platform group 391
Polynomial 42, 53
– coefficients 42, 53
– constant 42
– degree 42, 53
– irreducible 43, 54, 55
– leading coefficient 42, 53
– linear 42, 53
– prime 43, 55
– primitive 57
– quadratic 42, 53
– separable 243


– zero 42
– zero of 54
Prime element 35
Prime field 14
Prime ideal 22
Prime ring 15
Primitive element theorem 254
Principal character 353
Principal ideal 7, 27
Principal ideal domain 27
Prüfer ring 50
Public key cryptosystem 376
Purely transcendental 312

Quaternions 112
Quotient group 18, 150
Quotient ring 9

R-module 275
R-algebra 306
R-module
– cyclic 277
– direct product 281
– factor module 278
– faithful 280
– free 283
– generators 277
– quotient module 278
– torsion element 280
– unitary 275
Radical 321
– nil 321
Rational integers 48
Rational primes 48
Regular character 353
Reidemeister–Schreier process 216
Ring 2
– commutative 2
– finite 3
– prime 15
– trivial 3
– with identity 2
Ring extension 306
Ring isomorphism theorem 11
Ring of polynomials 54
RSA algorithm 377

Schur’s lemma 337
Secret sharing 370
Semisimple algebra 343
Semisimple module 341
Separable field extension 243
Separable hull 250
Separable polynomial 243
Simple algebra 345
Simple extension 70
Simple group 175
Simple module 336
Simplicial complex 218
Skew field 112, 345
Solvability by radicals 257
Solvable group 180
Solvable series 180
Special linear group 128
Splitting field 101, 119
Squaring the circle 84
Stabilizer 132, 189
Stream cipher 375
Subfield 6
Subgroup 17, 102, 126
– commutator 183
– conjugate 147
– cyclic 127
– derived 183
– index 135
– normal 148
Submodule 335
Subring 6
Sylow theorems 165, 192
Symmetric group 17, 102, 169
Symmetric polynomials 105
Symmetry 130

The Character Table and Orthogonality Relations 357
Theorem of Frobenius 116
Transcendence basis 311
Transcendence degree 312, 325
Transcendental extension 70
Transcendental numbers 68, 75
Transitive action 189
Transposition 172
Trapdoor function 376
Trisecting an angle 84
Trivial module 339

UFD 38
Unique factorization domain 38


Unit 4, 35
Unit group 35

Vector space 67

Wagner–Magyarik system 384
Word 203
– cyclically reduced 206
– length 203
– reduced 203
– trivial 203
Word problem 222

Zero divisor 3
Zero-knowledge proof 370
Zorn’s lemma 25

