Abstract Algebra Applications To Galois Theory, Algebraic Geometry, Representation Theory and Cryptography (Celine Carstensen-Opitz, Benjamin Fine Etc.)
Gerhard Rosenberger
Abstract Algebra
Also of Interest
Algebra and Number Theory. A Selection of Highlights
Benjamin Fine, Anthony Gaglione, Anja Moldenhauer,
Gerhard Rosenberger, Dennis Spellman, 2017
ISBN 978-3-11-051584-8, e-ISBN (PDF) 978-3-11-051614-2,
e-ISBN (EPUB) 978-3-11-051626-5
Abstract Algebra
Applications to Galois Theory, Algebraic Geometry,
Representation Theory and Cryptography
Mathematics Subject Classification 2010
Primary: 11-01, 12-01, 13-01, 14-01, 16-01, 20-01, 20C15; Secondary: 01-01, 08-01, 94-01
Authors
Celine Carstensen-Opitz
Dortmund, Germany
[email protected]

Dr. Anja Moldenhauer
Hamburg, Germany
[email protected]
ISBN 978-3-11-060393-4
e-ISBN (PDF) 978-3-11-060399-6
e-ISBN (EPUB) 978-3-11-060525-9
www.degruyter.com
Preface
Traditionally, mathematics has been separated into three main areas: algebra, anal-
ysis, and geometry. Of course, there is a great deal of overlap between these areas.
For example, topology, which is geometric in nature, owes its origins and problems
as much to analysis as to geometry. Furthermore, the basic techniques in studying
topology are predominantly algebraic. In general, algebraic methods and symbolism
pervade all of mathematics, and it is essential for anyone learning any advanced math-
ematics to be familiar with the concepts and methods in abstract algebra.
This is an introductory text on abstract algebra. It grew out of courses given to
advanced undergraduates and beginning graduate students in the United States, and
to mathematics students and teachers in Germany. We assume that the students are
familiar with calculus and with some linear algebra, primarily matrix algebra and the
basic concepts of vector spaces, bases, and dimensions. All other necessary material
is introduced and explained in the book. We assume, however, that the students have
some, but not a great deal, of mathematical sophistication. Our experience is that the
material in this text can be completed in a full year’s course. We presented the material
sequentially, so that polynomials and field extensions preceded an in-depth look at
group theory. We feel that a student who goes through the material in these notes
will attain a solid background in abstract algebra, and be able to move on to more
advanced topics.
The centerpiece of these notes is the development of Galois theory and its impor-
tant applications, especially the insolvability of the quintic polynomial. After intro-
ducing the basic algebraic structures, groups, rings, and fields, we begin the theory
of polynomials and polynomial equations over fields. We then develop the main ideas
of field extensions and adjoining elements to fields. After this, we present the nec-
essary material from group theory needed to complete both the insolvability of the
quintic polynomial and solvability by radicals in general. Hence, the middle part of the book, Chapters 9 through 14, is concerned with group theory, including permutation groups, solvable groups, abelian groups, and group actions. Chapter 14 is somewhat off to the side of the main theme of the book. Here, we give a brief introduction
to free groups, group presentations and combinatorial group theory. With the group
theory material, we return to Galois theory and study general normal and separable
extensions and the fundamental theorem of Galois theory. Using this approach, we
present several major applications of the theory, including solvability by radicals and
the insolvability of the quintic, the fundamental theorem of algebra, the construction
of regular n-gons, and the famous impossibilities: squaring the circle, doubling the cube, and trisecting an angle. We finish in a slightly different direction, giving an introduction to algebraic and group-based cryptography.
https://doi.org/10.1515/9783110603996-201
Contents
Preface | V
5 Field extensions | 67
5.1 Extension fields and finite extensions | 67
5.2 Finite and algebraic extensions | 70
5.3 Minimal polynomials and simple extensions | 71
5.4 Algebraic closures | 74
5.5 Algebraic and transcendental numbers | 75
5.6 Exercises | 78
Bibliography | 399
Index | 403
1 Groups, rings and fields
1.1 Abstract algebra
Abstract algebra or modern algebra can be best described as the theory of algebraic
structures. Briefly, an algebraic structure is a set S together with one or more binary
operations on it satisfying axioms governing the operations. There are many algebraic
structures, but the most commonly studied structures are groups, rings, fields, and
vector spaces. Also, widely used are modules and algebras. In this first chapter, we
will look at some basic preliminaries concerning groups, rings, and fields. We will
only briefly touch on groups here; a more extensive treatment will be done later in the
book.
Mathematics traditionally has been subdivided into three main areas—analysis,
algebra, and geometry. These areas overlap in many places so that it is often difficult,
for example, to determine whether a topic is one in geometry or in analysis. Algebra
and algebraic methods permeate all these disciplines and most of mathematics has
been algebraicized; that is, it uses the methods and language of algebra. Groups, rings,
and fields play a major role in the modern study of analysis, topology, geometry, and
even applied mathematics. We will see these connections in examples throughout the
book.
Abstract algebra has its origins in two main areas and questions that arose in
these areas—the theory of numbers and the theory of equations. The theory of num-
bers deals with the properties of the basic number systems—integers, rationals, and
reals, whereas the theory of equations, as the name indicates, deals with solving equa-
tions, in particular, polynomial equations. Both are subjects that date back to classical
times. A whole section of Euclid’s Elements is dedicated to number theory. The foundations for the modern study of number theory were laid by Fermat in the 1600s, and
then by Gauss in the 1800s. In an attempt to prove Fermat’s Last Theorem, Gauss introduced the complex integers a + bi, where a and b are integers, and showed that this
set has unique factorization. These ideas were extended by Dedekind and Kronecker,
who developed a wide ranging theory of algebraic number fields and algebraic inte-
gers. A large portion of the terminology used in abstract algebra, such as rings, ideals,
and factorization, comes from the study of algebraic number fields. This has evolved
into the modern discipline of algebraic number theory.
The second origin of modern abstract algebra was the problem of trying to de-
termine a formula for finding the solutions in terms of radicals of a fifth degree poly-
nomial. It was proved first by Ruffini in 1800, and then by Abel that it is impossible
to find a formula in terms of radicals for such a solution. Galois, around 1830, extended this
and showed that such a formula is impossible for any degree five or greater. In proving
this, he laid the groundwork for much of the development of modern abstract algebra,
especially field theory and finite group theory. Earlier, in 1799, Gauss proved the fundamental theorem of algebra, which says that any nonconstant complex polynomial
equation must have a solution. One of the goals of this book is to present a compre-
hensive treatment of Galois theory and a proof of the results mentioned above.
The locus of real points (x, y), which satisfy a polynomial equation f (x, y) = 0, is
called an algebraic plane curve. Algebraic geometry deals with the study of algebraic
plane curves and extensions to loci in a higher number of variables. Algebraic geom-
etry is intricately tied to abstract algebra and especially commutative algebra. We will
touch on this in the book also.
Finally linear algebra, although a part of abstract algebra, arose in a somewhat
different context. Historically, it grew out of the study of solution sets of systems of
linear equations and the study of the geometry of real n-dimensional spaces. It began
to be developed formally in the early 1800s with work of Jordan and Gauss, and then
later in the century by Cayley, Hamilton, and Sylvester.
1.2 Rings
The primary motivating examples for algebraic structures are the basic number sys-
tems: the integers ℤ, the rational numbers ℚ, the real numbers ℝ, and the complex
numbers ℂ. Each of these has two basic operations, addition and multiplication, and
forms what is called a ring. We formally define this.
Definition 1.2.1. A ring is a set R with two binary operations defined on it: addition,
denoted by +, and multiplication, denoted by ⋅, or just by juxtaposition, satisfying the
following six axioms:
(1) Addition is commutative: a + b = b + a for each pair a, b in R.
(2) Addition is associative: a + (b + c) = (a + b) + c for a, b, c ∈ R.
(3) There exists an additive identity, denoted by 0, such that a + 0 = a for each a ∈ R.
(4) For each a ∈ R, there exists an additive inverse, denoted by −a, such that
a + (−a) = 0.
(5) Multiplication is associative: a(bc) = (ab)c for a, b, c ∈ R.
(6) Multiplication is left and right distributive over addition: a(b + c) = ab + ac, and
(b + c)a = ba + ca for a, b, c ∈ R.
If, in addition,
(7) Multiplication is commutative: ab = ba for each pair a, b in R,
then R is called a commutative ring. For example, consider M2 (ℤ), the set of 2 × 2 matrices with integer entries, with entrywise addition and the usual matrix multiplication:

( a1 b1 )   ( a2 b2 )   ( a1 + a2   b1 + b2 )
(       ) + (       ) = (                   ),
( c1 d1 )   ( c2 d2 )   ( c1 + c2   d1 + d2 )

( a1 b1 )   ( a2 b2 )   ( a1 a2 + b1 c2   a1 b2 + b1 d2 )
(       ) ⋅ (       ) = (                               ).
( c1 d1 )   ( c2 d2 )   ( c1 a2 + d1 c2   c1 b2 + d1 d2 )
Again, it is an easy verification (see exercises) that M2 (ℤ) forms a ring. Further,
since matrix multiplication is noncommutative, this forms a noncommutative ring.
However, the identity matrix does form a multiplicative identity for it. M2 (nℤ) with
n > 1 provides an example of an infinite noncommutative ring without an identity.
Finally, M2 (ℤn ) for n > 1 will give an example of a finite noncommutative ring.
Notice that having no zero divisors is equivalent to the fact that if ab = 0 in R, then
either a = 0, or b = 0.
Hence, ℤ, ℚ, ℝ, ℂ are all integral domains, but from the example above, ℤ6 is not.
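This dichotomy is easy to check by brute force. The following short Python sketch (our illustration, not an exercise from the text) lists the zero divisors of ℤn for a few moduli:

```python
# List the nonzero zero divisors of Z_n by brute force (illustration only).
def zero_divisors(n):
    """Nonzero a in Z_n such that a*b = 0 in Z_n for some nonzero b."""
    return sorted({a for a in range(1, n)
                   for b in range(1, n) if (a * b) % n == 0})

print(zero_divisors(6))   # [2, 3, 4], e.g. 2 * 3 = 6 = 0 in Z_6
print(zero_divisors(7))   # [], 7 is prime, so Z_7 is an integral domain
```

For composite n, exactly the elements sharing a common factor with n appear in the list.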
In general, we have the following:
Proof. First of all, notice that under multiplication modulo n, an element m is 0 if and
only if n divides m. We will make this precise shortly. Recall further Euclid’s lemma
(see Chapter 2), which says that if a prime p divides a product ab, then p divides a, or
p divides b.
Now suppose that n is a prime and ab = 0 in ℤn . Then n divides ab. From Euclid’s
lemma it follows that n divides a, or n divides b. In the first case, a = 0 in ℤn , whereas
in the second, b = 0 in ℤn . It follows that there are no zero divisors in ℤn , and since
ℤn is a commutative ring with an identity, it is an integral domain.
Conversely, suppose ℤn is an integral domain. Suppose that n is not prime. Then
n = ab with 1 < a < n, 1 < b < n. It follows that ab = 0 in ℤn with neither a nor b
being zero. Therefore, they are zero divisors, which is a contradiction. Hence, n must
be prime.
Hence, a field K always contains at least two elements, a zero element 0 and an
identity 1 ≠ 0.
The rationals ℚ, the reals ℝ, and the complexes ℂ are all fields. If we relax the com-
mutativity requirement and just require that in the ring R with identity, each nonzero
element is a unit, then we get a skew field or division ring.
Proof. Since a field K is already a commutative ring with an identity, we must only
show that there are no zero divisors in K.
Recall that ℤn was an integral domain only when n was a prime. This turns out to
also be necessary and sufficient for ℤn to be a field.
Proof. First suppose that ℤn is a field. Then from Lemma 1.3.5, it is an integral domain.
Therefore, from Theorem 1.3.2, n must be a prime.
Conversely, suppose that n is a prime. We must show that ℤn is a field. Since we
already know that ℤn is an integral domain, we must only show that each nonzero
element of ℤn is a unit. Here, we need some elementary facts from number theory. If
a, b are integers, we use the notation a|b to indicate that a divides b.
Recall that given nonzero integers a, b, their greatest common divisor or GCD d > 0
is a positive integer, which is a common divisor; that is, d|a and d|b, and if d1 is any
other common divisor, then d1 |d. We denote the greatest common divisor of a, b by
either gcd(a, b) or (a, b). It can be proved that given nonzero integers a, b their GCD
exists, is unique and can be characterized as the least positive linear combination
of a and b. If the GCD of a and b is 1, then we say that a and b are relatively prime or
coprime. This is equivalent to being able to express 1 as a linear combination of a and b
(see Chapter 3 for proofs and more details).
Now let a ∈ ℤn with n prime and a ≠ 0. Since a ≠ 0, we have that n does
not divide a. Since n is prime, it follows that a and n must be relatively prime,
(a, n) = 1. From the number theoretic remarks above, we then have that there ex-
ist x, y with
ax + ny = 1.

In ℤn , the term ny reduces to 0, so this equation becomes

ax = 1.
Therefore, a has a multiplicative inverse in ℤn and is, hence, a unit. Since a was
an arbitrary nonzero element, we conclude that ℤn is a field.
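The inverse guaranteed by this argument can be computed explicitly with the extended Euclidean algorithm, which produces the x and y above. A small Python sketch (the function names are our own, not the book's):

```python
# Extended Euclidean algorithm: for gcd(a, n) = 1 it yields x, y with
# a*x + n*y = 1, and x mod n is then the inverse of a in Z_n.

def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def inverse_mod(a, n):
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError(f"{a} is not a unit in Z_{n}")
    return x % n

# Every nonzero element of Z_7 is a unit, so Z_7 is a field:
print([inverse_mod(a, 7) for a in range(1, 7)])  # [1, 4, 5, 2, 3, 6]
```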
The theorem above is actually a special case of a more general result from which
Theorem 1.3.6 could also be obtained.
Proof. Let K be a finite integral domain. We must show that K is a field. It is clearly
sufficient to show that each nonzero element of K is a unit. Let
{0, 1, r1 , . . . , rn }
be the elements of K. Let ri be a fixed nonzero element and multiply each element of
K by ri on the left. Now
if ri rj = ri rk , then ri (rj − rk ) = 0.

Since K is an integral domain and ri ≠ 0, this forces rj = rk . Hence, multiplication by ri is injective on K, and since K is finite, the two lists below contain the same elements:

K = {0, 1, r1 , . . . , rn } = ri K = {0, ri , ri r1 , . . . , ri rn }.
Therefore, the identity element 1 must be in the right-hand list; that is, there is an
rj such that ri rj = 1. Therefore, ri has a multiplicative inverse and is, hence, a unit.
Therefore, K is a field.
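The counting argument in this proof can be watched in action in ℤ7 (our choice of example, not the book's): multiplication by any fixed nonzero element merely permutes the elements, so 1 must occur among the products.

```python
# In the finite integral domain Z_7, multiplying everything by a fixed
# nonzero r permutes the elements; hence 1 occurs among the products and
# r has a multiplicative inverse (illustration only).

n = 7  # a prime, so Z_7 is a finite integral domain
for r in range(1, n):
    products = sorted((r * x) % n for x in range(n))
    assert products == list(range(n))   # r * Z_7 is all of Z_7 again
    inverse = next(x for x in range(n) if (r * x) % n == 1)
    print(r, "has inverse", inverse, "in Z_7")
```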
Definition 1.4.1. A subring of a ring R is a nonempty subset S that is also a ring under
the same operations as R. If R is a field and S also a field, then it is a subfield.
Lemma 1.4.2. A subset S of a ring R is a subring if and only if S is nonempty, and when-
ever a, b ∈ S, we have a + b ∈ S, a − b ∈ S and ab ∈ S.
Example 1.4.3. Show that if n > 1, the set nℤ is a subring of ℤ. Here, clearly nℤ is
nonempty. Suppose a = nz1 , b = nz2 are two elements of nℤ. Then
Therefore, nℤ is a subring.
Example 1.4.4. Show that the set of real numbers of the form
S = {u + v√2 : u, v ∈ ℚ}
is a subring of ℝ.
Here, 1 + √2 ∈ S; therefore, S is nonempty. Suppose a = u1 + v1 √2, b = u2 + v2 √2
are two element of S. Then
Therefore, S is a subring.
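This closure computation can also be verified mechanically. In the sketch below (a representation chosen by us, not the book's notation), an element u + v√2 of S is stored as the pair (u, v) of rationals; the product rule uses √2 ⋅ √2 = 2.

```python
# Represent u + v*sqrt(2) by the pair (u, v) with rational u, v; the
# subring operations never leave this form (illustration only).
from fractions import Fraction

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def mul(a, b):
    # (u1 + v1*sqrt2)(u2 + v2*sqrt2)
    #   = (u1*u2 + 2*v1*v2) + (u1*v2 + v1*u2)*sqrt2
    return (a[0] * b[0] + 2 * a[1] * b[1], a[0] * b[1] + a[1] * b[0])

x = (Fraction(1), Fraction(2))       # 1 + 2*sqrt(2)
y = (Fraction(1, 2), Fraction(-1))   # 1/2 - sqrt(2)
print(add(x, y))   # (Fraction(3, 2), Fraction(1, 1))
print(mul(x, y))   # both components rational, so the product stays in S
```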
Definition 1.4.5. Let R be a ring and I ⊂ R. Then I is a (two-sided) ideal if the following
properties hold:
(1) I is nonempty.
(2) If a, b ∈ I, then a ± b ∈ I.
(3) If a ∈ I and r is any element of R, then ra ∈ I, and ar ∈ I.
⟨a⟩ = aR = {ar : r ∈ R}
is an ideal of R.
Proof. We must verify the three properties of the definition. Since a ∈ R, we have that
aR is nonempty. If u = ar1 , v = ar2 are two elements of aR, then
Theorem 1.4.7. Any subring of ℤ is of the form nℤ for some n. Hence, each subring of ℤ
is actually a principal ideal.
Proof. Let S be a subring of ℤ. If S = {0}, then S = 0ℤ, so we may assume that S has
nonzero elements. Since S is a subring, if it has nonzero elements, then it must also have positive elements (since it contains the additive inverse of each of its elements).
Let S+ be the set of positive elements in S. From the remarks above, this is a
nonempty set, and so, there must be a least positive element n. We claim that S = nℤ.
Let m be a positive element in S. By the division algorithm
m = qn + r,
where either r = 0, or 0 < r < n (see Chapter 3). Suppose that r ≠ 0. Then

r = m − qn.

Since m ∈ S and qn ∈ S, it follows that r is a positive element of S smaller than n, contradicting the minimality of n. Therefore, r = 0, and m = qn ∈ nℤ.
We mention that in ℤ every subring is an ideal, but this is not true in general. For example, ℤ is a subring of ℚ, but not an ideal.
An extension of the proof of Lemma 1.4.6 gives the following. We leave the proof
as an exercise.
⟨a1 , . . . , an ⟩ = {r1 a1 + r2 a2 + ⋅ ⋅ ⋅ + rn an : ri ∈ R}
is an ideal of R.
Proof. Suppose that R is a field and I ⊲ R is an ideal. We must show that either I = {0},
or I = R. Suppose that I ≠ {0}, then we must show that I = R.
Since I ≠ {0}, there exists an element a ∈ I with a ≠ 0. Since R is a field, this
element a has an inverse a−1 . Since I is an ideal, it follows that a−1 a = 1 ∈ I. Let r ∈ R,
then, since 1 ∈ I, we have r ⋅ 1 = r ∈ I. Hence, R ⊂ I and, therefore, R = I.
Conversely, suppose that R is a commutative ring with an identity, whose only
ideals are {0} and R. We must show that R is a field, or equivalently, that every nonzero
element of R has a multiplicative inverse.
Let a ∈ R with a ≠ 0. Since R is a commutative ring, and a ≠ 0, the principal ideal
aR is a nontrivial ideal in R. Hence, aR = R. Therefore, the multiplicative identity
1 ∈ aR. It follows that there exists an r ∈ R with ar = 1. Hence, a has a multiplicative
inverse, and R must be a field.
r + I = {r + i : i ∈ I}
Lemma 1.5.2. Let I be an ideal in a ring R. Then the cosets of I partition R; that is, any two cosets either coincide or are disjoint.
Theorem 1.5.3. Let I be an ideal in a ring R. Let R/I be the set of all cosets of I in R; that
is,
R/I = {r + I : r ∈ R}.
Then R/I forms a ring called the factor ring of R modulo I. The zero element of R/I is 0 + I
and the additive inverse of r + I is −r + I.
Further, if R is commutative, then R/I is commutative, and if R has an identity, then
R/I has an identity 1 + I.
Proof. The proofs that R/I satisfies the ring axioms under the definitions above are straightforward. For example, we check that the operations are well defined: if

r1 + I = r1′ + I and r2 + I = r2′ + I,

then

(r1 + I) + (r2 + I) = (r1′ + I) + (r2′ + I),

and

(r1 + I)(r2 + I) = (r1′ + I)(r2′ + I).

Now if r1′ + I = r1 + I, then r1′ ∈ r1 + I, and so, r1′ = r1 + i1 for some i1 ∈ I. Similarly, if r2′ + I = r2 + I, then r2′ ∈ r2 + I, and so, r2′ = r2 + i2 for some i2 ∈ I. Then
Addition and multiplication of cosets is then just addition and multiplication mod-
ulo n. As we can see, this is just a formalization of the ring ℤn , which we have already
looked at. Recall that ℤn is an integral domain if and only if n is prime and ℤn is a field
for precisely the same n. If n = 0, then ℤ/nℤ is the same as ℤ.
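The factor ring ℤ/6ℤ can be modeled directly, with each coset named by its canonical representative (a small sketch in our own notation):

```python
# Cosets of 6Z in Z, named by representatives in {0, ..., 5}.
n = 6

def coset(r):            # the coset r + 6Z
    return r % n

def coset_add(r, s):     # (r + 6Z) + (s + 6Z) = (r + s) + 6Z
    return (r + s) % n

def coset_mul(r, s):     # (r + 6Z)(s + 6Z) = rs + 6Z
    return (r * s) % n

# Well-definedness: shifting representatives by multiples of 6 changes nothing.
assert coset(8) == coset(2)
assert coset_add(2 + 6, 5 + 60) == coset_add(2, 5)

print(coset_mul(2, 3))   # 0, the zero divisors 2 and 3 of Z_6 again
```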
We now show that ideals and factor rings are closely related to certain mappings
between rings.
In addition,
(1) f is an epimorphism if it is surjective.
(2) f is a monomorphism if it is injective.
(3) f is an isomorphism if it is bijective; that is, both surjective and injective. In this
case, R and S are said to be isomorphic rings, which we denote by R ≅ S.
(4) f is an endomorphism if R = S; that is, a ring homomorphism from a ring to itself.
(5) f is an automorphism if R = S and f is an isomorphism.
Lemma 1.5.5. Let R and S be rings, and let f : R → S be a ring homomorphism. Then
(1) f (0) = 0, where the first 0 is the zero element of R, and the second is the zero element
of S.
(2) f (−r) = −f (r) for any r ∈ R.
Proof. We obtain f (0) = 0 from the equation f (0) = f (0 + 0) = f (0) + f (0). Hence,
0 = f (0) = f (r − r) = f (r + (−r)) = f (r) + f (−r); that is, f (−r) = −f (r).
Definition 1.5.6. Let R and S be rings, and let f : R → S be a ring homomorphism. Then
the kernel of f is
Theorem 1.5.7 (Ring isomorphism theorem). Let R and S be rings, and let
f :R→S
R/ ker(f ) ≅ im(f ).
(2) Conversely, suppose that I is an ideal in a ring R. Then the map f : R → R/I, given
by f (r) = r + I for r ∈ R, is a ring homomorphism, whose kernel is I, and whose image
is R/I.
The theorem says that the concepts of ideal of a ring and kernel of a ring homo-
morphism coincide; that is, each ideal is the kernel of a homomorphism and the kernel
of each ring homomorphism is an ideal.
Therefore, I is a subring.
Now let i ∈ I and r ∈ R. Then

f (ri) = f (r)f (i) = f (r) ⋅ 0 = 0,

and

f (ir) = f (i)f (r) = 0 ⋅ f (r) = 0,

so ri ∈ I and ir ∈ I. Therefore, I = ker(f ) is an ideal.
Finally, let s ∈ im(f ). Then there exists r ∈ R such that f (r) = s. Then f ∗ (r + I) = s,
and the map f ∗ is surjective and, hence, an isomorphism. This proves the first part of
the theorem.
To prove the second part, let I be an ideal in R and R/I the factor ring. Consider
the map f : R → R/I, given by f (r) = r + I. From the definition of addition and multi-
plication in the factor ring R/I, it is clear that this is a homomorphism. Consider the
kernel of f . If r ∈ ker(f ), then f (r) = r + I = 0 = 0 + I. This implies that r ∈ I and, hence,
the kernel of this map is exactly the ideal I, completing the theorem.
Theorem 1.5.7 is called the ring isomorphism theorem or the first ring isomorphism
theorem. We mention that there is an analogous theorem for each algebraic structure,
in particular, for groups and vector spaces. We will mention the result for groups in
Section 1.8.
Theorem 1.6.1. The rationals ℚ are the smallest field containing the integers ℤ. That is,
if ℤ ⊂ K ⊂ ℚ with K a subfield of ℚ, then K = ℚ.
Theorem 1.6.2. Let D be an integral domain. Then there is a field K containing D, called
the field of fractions for D, such that each element of K is a fraction from D; that is, an
element of the form d1 d2−1 with d1 , d2 ∈ D. Further, K is unique up to isomorphism and is
the smallest field containing D.
Proof. The proof is just the mimicking of the construction of the rationals from the
integers. Let

S = {(d1 , d2 ) : d1 , d2 ∈ D, d2 ≠ 0},

and call two pairs (d1 , d2 ) and (e1 , e2 ) equivalent if d1 e2 = e1 d2 , mimicking equality of fractions.
Let K be the set of equivalence classes, and define addition and multiplication in the
usual manner as for fractions, where the result is the equivalence class:
It is now straightforward to verify the ring axioms for K. The inverse of (d1 , 1) is (1, d1 )
for d1 ≠ 0 in D.
As with ℤ, we identify the elements of K with fractions d1 /d2 .
The proof that K is the smallest field containing D is the same as for ℚ from ℤ.
As examples, we have that ℚ is the field of fractions for ℤ. A familiar, but less
common, example is the following:
Let ℝ[x] be the set of polynomials over the real numbers ℝ. It can be shown that
ℝ[x] forms an integral domain (see Chapter 3). Its field of fractions consists of all formal quotients f (x)/g(x), where f (x), g(x) are real polynomials with g(x) ≠ 0. This field is called the field of rational functions over ℝ and is denoted ℝ(x).
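Python's standard fractions module realizes precisely this construction for D = ℤ: a Fraction is an equivalence class of pairs (d1 , d2 ) with d2 ≠ 0, stored in lowest terms.

```python
# The field of fractions of Z, as implemented in the standard library.
from fractions import Fraction

a = Fraction(3, 4)    # the class of the pair (3, 4)
b = Fraction(6, 8)    # another pair in the same class
assert a == b         # (3, 4) ~ (6, 8) since 3 * 8 == 6 * 4

# Every nonzero element is invertible, so we are working in a field:
print(a + b, a * b, 1 / a)   # 3/2 9/16 4/3
```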
Lemma 1.7.2. Let K be any field. Then K contains a smallest subfield, called the prime field of K.
We have seen that every field contains a prime field. We extend this.
Definition 1.7.5. A commutative ring R with an identity 1 ≠ 0 is a prime ring if the only
subring containing the identity is the whole ring.
Clearly both the integers ℤ and the modular integers ℤn are prime rings. In fact,
up to isomorphism, they are the only prime rings.
Theorem 1.7.6 can be extended to fields, with ℚ taking the place of ℤ, and ℤp , with p a prime, taking the place of ℤn .
Proof. The proof is identical to that of Theorem 1.7.6; however, we consider the small-
est subfield K1 of K containing S.
We mention that there can be infinite fields of characteristic p. Consider, for ex-
ample, the field of fractions of the polynomial ring ℤp [x]. This is the field of rational
functions with coefficients in ℤp .
We give a theorem on fields of characteristic p that will be important much later
when we look at Galois theory.
The binomial coefficients satisfy

C(p, i) = p(p − 1) ⋅ ⋅ ⋅ (p − i + 1) / (i ⋅ (i − 1) ⋅ ⋅ ⋅ 1),

and it is clear that p | C(p, i) for 1 ≤ i ≤ p − 1. Hence, in K, we have C(p, i) ⋅ 1 = 0, and so, we have

(x + y)^p = x^p + y^p for all x, y ∈ K.
Therefore, ϕ is a homomorphism.
Further, ϕ is always injective. To see this, suppose that ϕ(x) = ϕ(y). Then

ϕ(x − y) = 0; that is, (x − y)^p = 0.

Since a field has no zero divisors, it follows that x − y = 0; that is, x = y.
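Both facts, that p divides the middle binomial coefficients and that the Frobenius map x ↦ x^p is therefore additive, can be confirmed numerically (our check, using Python's math.comb):

```python
# Numerical check: p | C(p, i) for 0 < i < p, hence the "freshman's
# dream" (x + y)^p = x^p + y^p holds in characteristic p.
from math import comb

p = 7
assert all(comb(p, i) % p == 0 for i in range(1, p))

for x in range(p):
    for y in range(p):
        assert pow(x + y, p, p) == (pow(x, p, p) + pow(y, p, p)) % p

print("Frobenius is additive modulo", p)
```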
1.8 Groups
We close this first chapter by introducing some basic definitions and results from
group theory that mirror the results, which were presented for rings and fields. We
will look at group theory in more detail later in the book. Proofs will be given at that
point.
Definition 1.8.1. A group G is a set with one binary operation (which we will denote
by multiplication) such that
(1) The operation is associative.
(2) There exists an identity for this operation.
(3) Each g ∈ G has an inverse for this operation.
If, in addition, the operation is commutative, the group G is called an abelian group.
The order of G is the number of elements in G, denoted by |G|. If |G| < ∞, G is a finite
group; otherwise G is an infinite group.
Groups most often arise from invertible mappings of a set onto itself. Such map-
pings are called permutations.
Theorem 1.8.2. The set of all permutations of a set A forms a group under composition, called the symmetric group on A, which we denote by SA . If A has more than two elements, then SA is nonabelian.
Theorem 1.8.5. If A1 and A2 are sets with |A1 | = |A2 |, then SA1 ≅ SA2 . If |A| = n with n
finite, we call SA the symmetric group on n elements, which we denote by Sn . Further, we
have |Sn | = n!.
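For a concrete look at a small symmetric group, permutations of {0, 1, 2} can be represented as tuples (an illustration in our own notation, not the book's):

```python
# S_3 made concrete: 3! = 6 permutations, with noncommutative composition.
from itertools import permutations

def compose(f, g):
    """The permutation 'first apply g, then f'."""
    return tuple(f[g[i]] for i in range(len(g)))

S3 = list(permutations(range(3)))
assert len(S3) == 6   # |S_3| = 3!

f = (1, 0, 2)  # the transposition swapping 0 and 1
g = (0, 2, 1)  # the transposition swapping 1 and 2
print(compose(f, g), compose(g, f))  # (1, 2, 0) (2, 0, 1): S_3 is nonabelian
```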
As with rings, the cosets of a subgroup partition a group. We call the number of right cosets of a subgroup H in a group G the index of H in G, denoted |G : H|. One can prove that the number of right cosets is equal to the number of left cosets. For
finite groups, we have the following beautiful result called Lagrange’s theorem.
Theorem 1.8.8 (Lagrange’s theorem). Let G be a finite group and H a subgroup. Then the order of H divides the order of G. In particular, |G| = |G : H| ⋅ |H|.
Theorem 1.8.10. Let H be a normal subgroup of a group G. Let G/H be the set of all cosets of H in G; that is,

G/H = {gH : g ∈ G},

with multiplication given by

(g1 H)(g2 H) = g1 g2 H.
Then G/H forms a group called the factor group or quotient group of G modulo H.
The identity element of G/H is 1H, and the inverse of gH is g −1 H.
Further, if G is abelian, then G/H is also abelian.
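Lagrange's theorem from above can be checked numerically on the additive group ℤ12 (group and notation chosen by us as an illustration): the cyclic subgroup ⟨g⟩ has order 12/gcd(g, 12), which always divides |G| = 12.

```python
# Check Lagrange's theorem for all cyclic subgroups of the additive
# group Z_12 (illustration only).
from math import gcd

n = 12
for g in range(n):
    subgroup = {(k * g) % n for k in range(n)}   # the cyclic subgroup <g>
    assert len(subgroup) == n // gcd(g, n)       # its order
    assert n % len(subgroup) == 0                # |<g>| divides |G|

print(sorted({n // gcd(g, n) for g in range(n)}))  # [1, 2, 3, 4, 6, 12]
```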
Finally, as with rings, normal subgroups and factor groups are closely tied to homomorphisms.
Theorem 1.8.12 (Group isomorphism theorem). Let G1 and G2 be groups, and let f :
G1 → G2 be a homomorphism. Then
(1) ker(f ) is a normal subgroup in G1 . im(f ) is a subgroup of G2 , and
G1 / ker(f ) ≅ im(f ).
(2) Conversely, suppose that H is a normal subgroup of a group G. Then the map f :
G → G/H, given by f (g) = gH for g ∈ G is a homomorphism, whose kernel is H and
whose image is G/H.
1.9 Exercises
1. Let ϕ : K → R be a homomorphism from a field K to a ring R. Show: Either ϕ(a) = 0
for all a ∈ K, or ϕ is a monomorphism.
2. Let R be a ring and M ≠ ∅ an arbitrary set. Show that the following are equivalent:
(i) The ring of all mappings from M to R is a field.
(ii) M contains only one element and R is a field.
3. Let π be a set of prime numbers. Define
ℚπ = {a/b : all prime divisors of b are in π}.
3x ≡ 5 mod 7.
Hence, for the integers ℤ, a factor ring is a field if and only if it is an integral
domain. We will see later that this is not true in general. However, what is clear is
that special ideals nℤ lead to integral domains and fields when n is a prime. We look
at the ideals pℤ with p a prime in two different ways, and then use these in subsequent
sections to give the general definitions. We first need a famous result, Euclid’s lemma,
from number theory. For integers a, b, the notation a|b means that a divides b.
Proof. Recall that the greatest common divisor or GCD of two integers a, b is an integer
d > 0 such that d is a common divisor of both a and b, and if d1 is another common
divisor of a and b, then d1 |d. We express the GCD of a, b by d = (a, b). It is known that
for any two integers a, b, their GCD exists and is unique, and is the least positive linear
combination of a and b; that is, the least positive integer of the form ax+by for integers
x, y. The integers a, b are relatively prime if their GCD is 1, (a, b) = 1. In this case, 1 is a
linear combination of a and b (see Chapter 3 for proofs and more details).
Now suppose p|ab, where p is a prime. If p does not divide a, then since the only
positive divisors of p are 1 and p, it follows that (a, p) = 1. Hence, 1 is expressible as
a linear combination of a and p. That is, ax + py = 1 for some integers x, y. Multiply
through by b, so that
abx + pby = b.
Now p|ab, so p|abx and p|pby. Therefore, p|abx + pby; that is, p|b.
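The argument can be traced with concrete numbers (our choices: p = 7, a = 12, b = 14, with Bezout coefficients found by inspection):

```python
# Euclid's lemma traced numerically: p | ab and p does not divide a,
# so the Bezout identity a*x + p*y = 1 forces p | b.
from math import gcd

p, a, b = 7, 12, 14
assert (a * b) % p == 0 and a % p != 0   # p | ab, but p does not divide a
assert gcd(a, p) == 1

x, y = 3, -5                              # by inspection: 12*3 + 7*(-5) = 1
assert a * x + p * y == 1

# Multiply through by b: abx + pby = b; p divides both terms on the left.
assert a * b * x + p * b * y == b
assert b % p == 0                         # hence p | b
print("Euclid's lemma verified for p = 7, ab = 168")
```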
We now recast this lemma in two different ways in terms of the ideal pℤ. Notice
that pℤ consists precisely of all the multiples of p. Hence, p|ab is equivalent to ab ∈
pℤ.
This conclusion will be taken as a motivation for the definition of a prime ideal in
the next section.
Lemma 2.1.4. If p is a prime and pℤ ⊂ nℤ, then n = 1, or n = p. That is, every ideal in
ℤ containing pℤ with p a prime is either all of ℤ or pℤ.
In Section 2.3, the conclusion of this lemma will be taken as a motivation for the
definition of a maximal ideal.
This property of an ideal is precisely what is necessary and sufficient to make the
factor ring R/I an integral domain.
Theorem 2.2.2. Let R be a commutative ring with an identity 1 ≠ 0, and let P be a non-
trivial ideal in R. Then P is a prime ideal if and only if the factor ring R/P is an integral
domain.
Proof. Let R be a commutative ring with an identity 1 ≠ 0, and let P be a prime ideal.
We show that R/P is an integral domain. From the results in the last chapter, we have
that R/P is again a commutative ring with an identity. Therefore, we must show that
there are no zero divisors in R/P. Suppose that (a + P)(b + P) = 0 in R/P. The zero element
in R/P is 0 + P and, hence,
(a + P)(b + P) = 0 = 0 + P ⇒ ab + P = 0 + P ⇒ ab ∈ P.

Since P is a prime ideal, it follows that a ∈ P, or b ∈ P; that is, a + P = 0, or b + P = 0 in R/P. Hence, R/P has no zero divisors and is an integral domain.

Conversely, suppose that R/P is an integral domain, and let ab ∈ P. Then ab + P = 0 + P, so we have

(a + P)(b + P) = 0.
However, R/P is an integral domain, so it has no zero divisors. It follows that either
a + P = 0 and, hence, a ∈ P or b + P = 0, and b ∈ P. Therefore, either a ∈ P, or b ∈ P.
Therefore, P is a prime ideal.
Definition 2.2.3. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B
be ideals in R. Define
AB = {a1 b1 + ⋅ ⋅ ⋅ + an bn : ai ∈ A, bi ∈ B, n ∈ ℕ}.
Lemma 2.2.4. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B be
ideals in R. Then AB is an ideal.
Proof. We must verify that AB is a subring, and that it is closed under multiplication from R. Let r1 , r2 ∈ AB. Then

r1 = a1 b1 + ⋅ ⋅ ⋅ + an bn for some ai ∈ A, bi ∈ B,

and

r2 = a1′ b1′ + ⋅ ⋅ ⋅ + am′ bm′ for some ai′ ∈ A, bi′ ∈ B.

Then r1 ± r2 is again a sum of products of an element of A with an element of B, so r1 ± r2 ∈ AB. For r1 r2 , expand the product term by term. Consider, for example, the first term a1 b1 a1′ b1′ . Since R is commutative, this is equal to

(a1 a1′ )(b1 b1′ ).

Now a1 a1′ ∈ A since A is a subring, and b1 b1′ ∈ B since B is a subring. Hence, this term is in AB. Similarly, for each of the other terms. Therefore, r1 r2 ∈ AB and, hence, AB is a subring.
Now let r ∈ R, and consider rr1 . This is then

rr1 = (ra1 )b1 + ⋅ ⋅ ⋅ + (ran )bn .

Now rai ∈ A for each i since A is an ideal. Hence, each summand is in AB, and then
rr1 ∈ AB. Therefore, AB is an ideal.
Lemma 2.2.5. Let R be a commutative ring with an identity 1 ≠ 0, and let A and B be
ideals in R. If P is a prime ideal in R, then AB ⊂ P implies that A ⊂ P or B ⊂ P.
Proof. Suppose that AB ⊂ P with P a prime ideal, and suppose that B is not contained
in P. We show that A ⊂ P. Since AB ⊂ P, each product ai bj ∈ P. Choose a b ∈ B with
b ∉ P, and let a be an arbitrary element of A. Then ab ∈ P. Since P is a prime ideal,
this implies either a ∈ P, or b ∈ P. But by assumption b ∉ P, so a ∈ P. Since a was
arbitrary, we have A ⊂ P.
Theorem 2.3.2. Let R be a commutative ring with an identity 1 ≠ 0, and let I be an ideal
in R. Then I is a maximal ideal if and only if the factor ring R/I is a field.
Recall that a field is already an integral domain. Combining this with the ideas of
prime and maximal ideals we obtain:
Theorem 2.3.3. Let R be a commutative ring with an identity 1 ≠ 0. Then each maximal
ideal is a prime ideal.
Proof. Suppose that R is a commutative ring with an identity and I is a maximal ideal
in R. Then from Theorem 2.3.2, we have that the factor ring R/I is a field. But a field is
an integral domain, so R/I is an integral domain. Therefore, from Theorem 2.2.2, we
have that I must be a prime ideal.
The converse is not true in general. That is, there are prime ideals that are not
maximal. Consider, for example, R = ℤ the integers and I = {0}. Then I is an ideal,
and R/I = ℤ/{0} ≅ ℤ is an integral domain. Hence, {0} is a prime ideal. However, ℤ is
not a field, so {0} is not maximal. Note, however, that in the integers ℤ, a nonzero
proper ideal is maximal if and only if it is a prime ideal.
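The dichotomy in ℤ/nℤ can be checked by direct computation. The following sketch (the function name is our own) tests whether every nonzero residue modulo n is invertible, which by Theorem 2.3.2 is exactly the condition that nℤ be a maximal ideal of ℤ:

```python
# Z/nZ is a field exactly when every nonzero residue has a multiplicative
# inverse; by Theorem 2.3.2 this happens exactly when nZ is maximal in Z.

def is_field_mod(n: int) -> bool:
    return all(any(a * x % n == 1 for x in range(1, n)) for a in range(1, n))

# The maximal ideals nZ correspond to the primes n:
print([n for n in range(2, 20) if is_field_mod(n)])
# → [2, 3, 5, 7, 11, 13, 17, 19]
```

In ℤ/6ℤ, for instance, the residue 2 has no inverse, reflecting the fact that 6ℤ is not maximal.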
Zorn’s lemma. Let M be a nonempty partially ordered set. If each chain of M has an
upper bound in M, then there is at least one maximal element in M.
Axiom of well-ordering. Each set M can be well-ordered; that is, M admits a total order
such that each nonempty subset of M contains a least element.
Axiom of choice. Let {Mi : i ∈ I} be a nonempty collection of nonempty sets. Then there
is a mapping f : I → ⋃i∈I Mi with f (i) ∈ Mi for all i ∈ I.
Theorem 2.4.1. Zorn’s lemma, the axiom of well-ordering and the axiom of choice are
all equivalent.
We now show the existence of maximal ideals in commutative rings with identity.
Theorem 2.4.2. Let R be a commutative ring with an identity 1 ≠ 0, and let I be an ideal
in R with I ≠ R. Then there exists a maximal ideal I0 in R with I ⊂ I0 . In particular, a ring
with an identity contains maximal ideals.
Proof. Let I be an ideal in the commutative ring R. We must show that there exists a
maximal ideal I0 in R with I ⊂ I0 .
Let
M = {A : A is an ideal in R with I ⊂ A and A ≠ R}.
Then M is partially ordered by containment. We want to show first that each chain in
M has an upper bound in M. If K = {Xj : Xj ∈ M, j ∈ J} is a chain, let
X = ⋃j∈J Xj .
Lemma 2.5.1. Let R be a commutative ring and a1 , . . . , an be elements of R. Then the set
⟨a1 , . . . , an ⟩ = {r1 a1 + ⋅ ⋅ ⋅ + rn an : ri ∈ R}
forms an ideal in R.
Proof. Let
a = r1 a1 + ⋅ ⋅ ⋅ + rn an , b = s1 a1 + ⋅ ⋅ ⋅ + sn an
be two elements of ⟨a1 , . . . , an ⟩. Then a − b = (r1 − s1 )a1 + ⋅ ⋅ ⋅ + (rn − sn )an lies in
⟨a1 , . . . , an ⟩, and for r ∈ R we have ra = (rr1 )a1 + ⋅ ⋅ ⋅ + (rrn )an ∈ ⟨a1 , . . . , an ⟩. Therefore,
⟨a1 , . . . , an ⟩ is an ideal.
Proof. Every ideal I in ℤ is of the form I = nℤ for some n ≥ 0; that is, I is the principal
ideal generated by n.
Definition 2.5.4. A principal ideal domain or PID is an integral domain, in which every
ideal is principal.
We mention that the set of polynomials K[x] with coefficients from a field K is also
a principal ideal domain. We will return to this in the next chapter.
Not every integral domain is a PID. Consider K[x, y] = (K[x])[y], the set of polyno-
mials over K in two variables x, y (see Chapter 4). Let I consist of all the polynomials
with zero constant term.
Lemma 2.5.6. The set I in K[x, y] as defined above is an ideal, but not a principal ideal.
Proof. We leave the proof that I forms an ideal to the exercises. To show that it is not
a principal ideal, suppose I = ⟨p(x, y)⟩. Now the polynomial q(x) = x has zero con-
stant term, so q(x) ∈ I. Hence, p(x, y) cannot be a constant polynomial. In addition,
if p(x, y) had any terms with y in them, there would be no way to multiply p(x, y) by
a polynomial h(x, y) and obtain just x. Therefore, p(x, y) can contain no terms with y
in them. But the same argument, using s(y) = y, shows that p(x, y) cannot have any
terms with x in them. Therefore, there can be no such p(x, y) generating I, and so, I is
not principal, and K[x, y] is not a principal ideal domain.
2.6 Exercises
1. Consider the set ⟨r, I⟩ = {rx + i : x ∈ R, i ∈ I}, where I is an ideal. Prove that this is
also an ideal called the ideal generated by r and I, denoted ⟨r, I⟩.
2. Let R and S be commutative rings, and let ϕ : R → S be a ring epimorphism. Let
M be a maximal ideal in R. Show:
ϕ(M) is a maximal ideal in S if and only if ker(ϕ) ⊂ M. Is ϕ(M) always a prime
ideal of S?
3. Let A1 , . . . , At be ideals of a commutative ring R. Let P be a prime ideal of R. Show:
(i) A1 ∩ ⋅ ⋅ ⋅ ∩ At ⊂ P ⇒ Aj ⊂ P for at least one index j.
(ii) A1 ∩ ⋅ ⋅ ⋅ ∩ At = P ⇒ Aj = P for at least one index j.
4. Which of the following ideals A are prime ideals of R? Which are maximal ideals?
(i) A = (x), R = ℤ[x].
(ii) A = (x2 ), R = ℤ[x].
(iii) A = (1 + √5), R = ℤ[√5].
(iv) A = (x, y), R = ℚ[x, y].
5. Let w = (1 + √−3)/2. Show that ⟨2⟩ is a prime ideal and even a maximal ideal of
ℤ[w], but ⟨2⟩ is neither a prime ideal nor a maximal ideal of ℤ[i], i = √−1 ∈ ℂ.
6. Let R = {a/b : a, b ∈ ℤ, b odd}. Show that R is a subring of ℚ, and that there is only
one maximal ideal M in R.
7. Let R be a commutative ring with an identity. Let x, y ∈ R with x ≠ 0 and x not a
zero divisor. Furthermore, let ⟨x⟩ be a prime ideal with ⟨x⟩ ⊂ ⟨y⟩ ≠ R. Show that
⟨x⟩ = ⟨y⟩.
8. Consider K[x, y] the set of polynomials over K in two variables x, y. Let I consist of
all the polynomials with zero constant term. Prove that the set I is an ideal.
n = cp1 p2 ⋅ ⋅ ⋅ pk ,
There are two main ingredients that go into the proof: induction and Euclid’s
lemma. We presented this in the last chapter. In turn, however, Euclid’s lemma de-
pends upon the existence of greatest common divisors and their linear expressibility.
Therefore, to begin, we present several basic ideas from number theory.
The starting point for the theory of numbers is divisibility.
https://doi.org/10.1515/9783110603996-003
Theorem 3.1.4 (Division algorithm). Given integers a, b with a > 0, then there exist
unique integers q and r such that b = qa + r, where either r = 0 or 0 < r < a.
One may think of q and r as the quotient and remainder, respectively, when divid-
ing b by a.
Proof. Consider the set
S = {b − qa ≥ 0 : q ∈ ℤ}.
If b > 0, then b + a > 0 (take q = −1), and this element is in S. If b ≤ 0, then there exists a q > 0 with
−qa < b. Then b + qa > 0 and is in S. Therefore, in either case, S is nonempty. Hence, S
is a nonempty subset of ℕ ∪ {0} and, therefore, has a least element r. If r ≠ 0, we must
show that 0 < r < a. Suppose r ≥ a, then r = a + x with x ≥ 0, and x < r since a > 0.
Then b − qa = r = a + x ⇒ b − (q + 1)a = x. This means that x ∈ S. Since x < r, this
contradicts the minimality of r. Therefore, if r ≠ 0, it follows that 0 < r < a.
The only thing left is to show the uniqueness of q and r. Suppose b = q1 a + r1 also.
By the construction above, r1 must also be the minimal element of S. Hence, r1 ≤ r,
and r ≤ r1 so r = r1 . Now
b − qa = b − q1 a ⇒ (q1 − q)a = 0,
and since a > 0, this forces q1 = q, proving uniqueness.
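In Python, the division algorithm is available through divmod; floor division guarantees a remainder in the required range for a > 0 even when b is negative (a small sketch of ours):

```python
# Division algorithm: b = q*a + r with r = 0 or 0 < r < a, for a > 0.
# Python's floor division already produces a remainder in this range.

def division_algorithm(b: int, a: int) -> tuple[int, int]:
    assert a > 0
    q, r = divmod(b, a)
    assert b == q * a + r and 0 <= r < a  # conditions of Theorem 3.1.4
    return q, r

print(division_algorithm(17, 5))   # → (3, 2)
print(division_algorithm(-17, 5))  # → (-4, 3), not (-3, -2)
```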
The next idea that is necessary is the concept of greatest common divisor.
Definition 3.1.5. Given nonzero integers a, b, their greatest common divisor or GCD
d > 0 is a positive integer such that it is their common divisor, that is, d|a and d|b, and
if d1 is any other common divisor, then d1 |d. We denote the greatest common divisor
of a, b by either gcd(a, b) or (a, b).
Certainly, if a, b are nonzero integers with a > 0 and a|b, then a = gcd(a, b).
The next result says that given any nonzero integers, they do have a greatest com-
mon divisor, and it is unique.
Theorem 3.1.6. Given nonzero integers a, b, their GCD exists, is unique, and can be char-
acterized as the least positive linear combination of a and b.
If (a, b) = 1, then we say that a, b are relatively prime. It follows that a and b are
relatively prime if and only if 1 is expressible as a linear combination of a and b. We
need the following three results:
Lemma 3.1.7. If d = (a, b) with a = a1 d and b = b1 d, then (a1 , b1 ) = 1.
Proof. If d = (a, b), then d|a, and d|b. Hence, a = a1 d, and b = b1 d. By Theorem 3.1.6,
we may write d = ax + by. Then
d = ax + by = a1 dx + b1 dy.
Dividing through by d gives
1 = a1 x + b1 y.
Since 1 is then a linear combination of a1 and b1 , it is their least positive linear combination.
Therefore, (a1 , b1 ) = 1.
Lemma 3.1.8. For any integer c, we have that (a, b) = (a, b + ac).
Proof. Suppose (a, b) = d and (a, b + ac) = d1 . Now d is the least positive linear com-
bination of a and b; suppose d = ax + by. Then d = a(x − cy) + (b + ac)y, so d is also
a linear combination of a and b + ac; hence, d1 ≤ d. Conversely, d1 is a linear combination
of a and b + ac, say d1 = ax1 + (b + ac)y1 = a(x1 + cy1 ) + by1 , so d1 is a linear
combination of a and b; hence, d ≤ d1 . Therefore, d = d1 .
To compute GCDs, we use the Euclidean algorithm: apply the division algorithm
repeatedly, so that
b = q1 a + r1 , 0 < r1 < a,
a = q2 r1 + r2 , 0 < r2 < r1 ,
⋮
rn−2 = qn rn−1 + rn , 0 < rn < rn−1 ,
rn−1 = qn+1 rn .
The last nonzero remainder rn is the GCD of a and b, and back-substitution, starting
from rn = rn−2 − qn rn−1 , expresses rn as a linear combination of a and b.
For example, let a = 270 and b = 2412. Then
2412 = 8 ⋅ 270 + 252,
270 = 1 ⋅ 252 + 18,
252 = 14 ⋅ 18.
Therefore, the last nonzero remainder is 18, which is the GCD. We now must express
18 as a linear combination of 270 and 2412. From the second equation, 18 = 270 − 1 ⋅ 252,
and from the first equation, 252 = 2412 − 8 ⋅ 270. Hence,
18 = 270 − (2412 − 8 ⋅ 270) = 9 ⋅ 270 − 2412.
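The back-substitution is mechanical, and the whole computation can be sketched as the extended Euclidean algorithm (our own implementation, not part of the text):

```python
# Extended Euclidean algorithm: returns (d, x, y) with d = gcd(a, b) = x*a + y*b.

def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    # Invariant: r0 = x0*a + y0*b and r1 = x1*a + y1*b at every step.
    r0, r1, x0, x1, y0, y1 = a, b, 1, 0, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

d, x, y = extended_gcd(270, 2412)
print(d, x, y)  # → 18 9 -1, i.e., 18 = 9*270 - 1*2412
```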
The next result that we need is Euclid’s lemma. We stated and proved this in the
last chapter, but we restate it here.
Lemma 3.1.11 (Euclid’s lemma). If p is a prime and p|ab, then p|a, or p|b.
Lemma 3.1.12. Any integer n > 1 can be expressed as a product of primes, perhaps with
only one factor.
Proof. The proof is by induction. n = 2 is prime. Therefore, it is true at the lowest level.
Suppose that any integer 2 ≤ k < n can be decomposed into prime factors. We must
show that n then also has a prime factorization.
If n is prime, then we are done. Suppose then that n is composite. Hence, n = m1 m2
with 1 < m1 < n, 1 < m2 < n. By the inductive hypothesis, both m1 and m2 can be
expressed as products of primes. Therefore, n can, also using the primes from m1 and
m2 , completing the proof.
Theorem 3.1.13 (Euclid). There are infinitely many primes.
Proof. Suppose that there are only finitely many primes p1 , . . . , pn . Each of these is
positive, so we can form the positive integer
N = p1 p2 ⋅ ⋅ ⋅ pn + 1.
Since N > 1, by Lemma 3.1.12, N has a prime divisor p, so
p|(p1 p2 ⋅ ⋅ ⋅ pn + 1).
Since the only primes are assumed p1 , p2 , . . . , pn , it follows that p = pi for some i =
1, . . . , n. But then p|p1 p2 ⋅ ⋅ ⋅ pn , so p cannot divide p1 ⋅ ⋅ ⋅ pn + 1, which is a contradic-
tion. Therefore, p is not one of the given primes, showing that the list of primes must
be endless.
Proof. We assume that n ≥ 1. If n ≤ −1, we take c = −1 and factor −n; the proof is the same. The
statement certainly holds for n = 1 with k = 0. Now suppose n > 1. From Lemma 3.1.12,
n has a prime decomposition:
n = p1 p2 ⋅ ⋅ ⋅ pm .
We must show that this is unique up to the ordering of the factors. Suppose then that
n has another such factorization n = q1 q2 ⋅ ⋅ ⋅ qk with the qi all prime. We must show
that m = k and that the primes are the same up to order. Now we have
n = p1 p2 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qk .
Since p1 divides n = q1 ⋅ ⋅ ⋅ qk ,
it follows that p1 |q1 q2 ⋅ ⋅ ⋅ qk . From Lemma 3.1.11 then, we must have that p1 |qi for some i.
But qi is prime, and p1 > 1, so it follows that p1 = qi . Therefore, we can eliminate p1
and qi from both sides of the factorization to obtain
p2 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qi−1 qi+1 ⋅ ⋅ ⋅ qk .
Continuing in this manner, we can eliminate all the pi from the left side of the factor-
ization. If k > m, we would obtain
1 = qm+1 ⋅ ⋅ ⋅ qk .
Since the qj are primes, this is impossible. Therefore, m = k, and each prime
pi was included in the primes q1 , . . . , qm . Therefore, the factorizations differ only in the
order of the factors, proving the theorem.
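The existence half of the proof is effectively trial division: split off a divisor and recurse on the factors. A minimal sketch (ours):

```python
# Factor n > 1 into primes. The recursion mirrors the inductive step:
# if n = m1*m2 is composite, factor m1 and m2 and combine.

def prime_factors(n: int) -> list[int]:
    assert n > 1
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return prime_factors(d) + prime_factors(n // d)
    return [n]  # no divisor up to sqrt(n): n is itself prime

print(prime_factors(360))  # → [2, 2, 2, 3, 3, 5]
```

Uniqueness, of course, is the content of the theorem: any procedure that factors 360 must produce these primes up to order.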
Notice that in the integers ℤ, the units are just ±1. The set of prime elements co-
incides with the set of irreducible elements. In ℤ, these are precisely the set of prime
numbers. On the other hand, if K is a field, every nonzero element is a unit. Therefore,
in K, there are no prime elements and no irreducible elements.
Recall that the modular rings ℤn are fields (and integral domains) when n is a
prime. In general, if n is not a prime then ℤn is a commutative ring with an identity,
and a unit is still an invertible element. We can characterize the units within ℤn : an
element a ∈ ℤn is a unit if and only if (a, n) = 1.
Proof. Suppose (a, n) = 1. Then there exist x, y ∈ ℤ such that ax + ny = 1. This implies
that ax ≡ 1 mod n, which in turn implies that ax = 1 in ℤn and, therefore, a is a unit.
Conversely, suppose a is a unit in ℤn . Then there is an x ∈ ℤn with ax = 1. In terms
of congruence then
ax ≡ 1 mod n ⇒ n|(ax − 1) ⇒ ax − 1 = ny ⇒ ax − ny = 1.
Hence, 1 is a linear combination of a and n and, therefore, (a, n) = 1.
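This characterization of the units in ℤn is easy to verify by machine; in Python, pow(a, −1, n) computes the inverse when it exists (a sketch of ours):

```python
from math import gcd

# The units of Z_n are the residues a with gcd(a, n) = 1; for these,
# pow(a, -1, n) returns the x with a*x ≡ 1 (mod n).

def units(n: int) -> list[int]:
    return [a for a in range(1, n) if gcd(a, n) == 1]

print(units(12))  # → [1, 5, 7, 11]
for a in units(12):
    assert a * pow(a, -1, 12) % 12 == 1  # each unit really is invertible
```

For prime n, units(n) is all of {1, . . . , n − 1}, recovering the fact that ℤp is a field.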
If R is an integral domain, then the set of units within R will form a group.
Lemma 3.2.3. If R is a commutative ring with an identity, then the set of units in R form
an abelian group under ring multiplication. This is called the unit group of R, denoted
U(R).
Proof. The commutativity and associativity of U(R) follow from the ring properties.
The identity of U(R) is the multiplicative identity of R, whereas the ring multiplicative
inverse for each unit is the group inverse. We must show that U(R) is closed under
ring multiplication. If a ∈ R is a unit, we denote its multiplicative inverse by a−1 . Now
suppose a, b ∈ U(R). Then a−1 , b−1 exist. It follows that
(ab)(b−1 a−1 ) = a(bb−1 )a−1 = aa−1 = 1.
Hence, ab has an inverse, namely b−1 a−1 (= a−1 b−1 in a commutative ring) and, hence,
ab is also a unit. Therefore, U(R) is closed under ring multiplication.
In general, irreducible elements are not prime. Consider, for example, the subring
of the complex numbers (see exercises) given by
R = ℤ[√−5] = {x + iy√5 : x, y ∈ ℤ}.
This is a subring of the complex numbers ℂ and, hence, can have no zero divisors.
Therefore, R is an integral domain.
For an element x + iy√5 ∈ R, define its norm by
N(x + iy√5) = x2 + 5y2 .
Lemma 3.2.4. The norm on R is multiplicative, that is, N(ab) = N(a)N(b), and a ∈ R is
a unit if and only if N(a) = 1; in this case, a = ±1.
Proof. The fact that the norm is multiplicative is straightforward and left to the exer-
cises. If a ∈ R is a unit, then there exists a multiplicative inverse b ∈ R with ab = 1.
Then N(ab) = N(a)N(b) = 1. Since both N(a) and N(b) are nonnegative integers, we
must have N(a) = N(b) = 1.
Conversely, suppose that N(a) = 1. If a = x + iy√5, then x2 + 5y2 = 1. Since x, y ∈ ℤ,
we must have y = 0 and x2 = 1. Then a = x = ±1.
Using this lemma, we can show that R possesses irreducible elements that are not
prime. Consider the element 3 ∈ R. If 3 = ab with a, b ∈ R, then N(3) = 9 = N(a)N(b).
If neither a nor b were a unit, then N(a) = N(b) = 3. But x2 + 5y2 = 3 has no solution
with x, y ∈ ℤ, so there is no element of norm 3 in R. Therefore, 3 is irreducible in R.
Now 3 divides 9 = (2 + i√5)(2 − i√5). If 3 were a prime element, then 3 would divide
one of the factors. Suppose 3|a with a = 2 + i√5, say a = 3c for some c ∈ R. Taking
norms gives 9 = N(a) = 9N(c), so N(c) = 1.
Therefore, c is a unit in R, and from Lemma 3.2.4, we get c = ±1. Hence, a = ±3. This is
a contradiction, so 3 does not divide a. An identical argument shows that 3 does not
divide b = 2 − i√5. Therefore, 3 is not a prime element in R.
Theorem 3.2.6. Let R be an integral domain, and let p ∈ R be a nonzero nonunit. Then
we have the following:
(1) If p is a prime element, then p is irreducible.
(2) p is a prime element if and only if pR is a prime ideal.
(3) p is irreducible if and only if pR is maximal in the set of all principal ideals of R
different from R.
Proof. (1) Suppose that p ∈ R is a prime element, and p = ab. We must show that
either a or b must be a unit. Now p|ab, so either p|a, or p|b. Without loss of generality,
we may assume that p|a, so a = pr for some r ∈ R. Hence, p = ab = (pr)b = p(rb).
However, R is an integral domain, so p − prb = p(1 − rb) = 0 implies that 1 − rb = 0
and, hence, rb = 1. Therefore, b is a unit and, hence, p is irreducible.
(2) Suppose that p is a prime element. Then p ≠ 0. Consider the ideal pR, and
suppose that ab ∈ pR. Then ab is a multiple of p and, hence, p|ab. Since p is prime, it
follows that p|a or p|b. If p|a, then a ∈ pR, whereas if p|b, then b ∈ pR. Therefore, pR
is a prime ideal.
Conversely, suppose that pR is a prime ideal, and suppose that p = ab. Then ab ∈
pR, so a ∈ pR, or b ∈ pR. If a ∈ pR, then p|a, and if b ∈ pR, then p|b. Therefore, p is
prime.
(3) Let p be irreducible, then p ≠ 0. Suppose that pR ⊂ aR, where a ∈ R. Then
p = ra for some r ∈ R. Since p is irreducible, it follows that either a is a unit, or r is a
unit. If r is a unit, we have pR = raR = aR ≠ R since p is not a unit. If a is a unit, then
aR = R, and pR = rR ≠ R. Therefore, pR is maximal in the set of principal ideals not
equal to R.
Conversely, suppose p ≠ 0 and pR is a maximal ideal in the set of principal ideals
≠ R. Let p = ab with a not a unit. We must show that b is a unit. Since aR ≠ R, and
pR ⊂ aR, from the maximality we must have pR = aR. Hence, a = rp for some r ∈ R.
Then p = ab = rpb and, as before, we must have rb = 1 and b a unit.
Theorem 3.2.7. Let R be a principal ideal domain. Then we have the following:
(1) An element p ∈ R is irreducible if and only if it is a prime element.
(2) A nonzero ideal of R is a maximal ideal if and only if it is a prime ideal.
(3) The maximal ideals of R are precisely those ideals pR, where p is a prime element.
Proof. First note that {0} is a prime ideal, but not maximal.
(1) We already know that prime elements are irreducible. To show the converse,
suppose that p is irreducible. Since R is a principal ideal domain from Theorem 3.2.6,
we have that pR is a maximal ideal, and each maximal ideal is also a prime ideal.
Therefore, from Theorem 3.2.6, we have that p is a prime element.
(2) We already know that each maximal ideal is a prime ideal. To show the con-
verse, suppose that I ≠ {0} is a prime ideal. Then I = pR, where p is a prime element
with p ≠ 0. Therefore, p is irreducible from part (1) and, hence, pR is a maximal ideal
from Theorem 3.2.6.
(3) This follows directly from the proof in part (2) and Theorem 3.2.6.
This theorem explains, in particular, the remark at the end of Section 2.3: in the
principal ideal domain ℤ, a nonzero proper ideal is maximal if and only if it is a prime
ideal.
r = p1 ⋅ ⋅ ⋅ pm = q1 ⋅ ⋅ ⋅ qk ,
There are several relationships in integral domains that are equivalent to unique
factorization.
q1 ⋅ ⋅ ⋅ qr = q′1 ⋅ ⋅ ⋅ q′s .
Then r = s, and there is a permutation π ∈ Sr such that for each i ∈ {1, . . . , r}, the
elements qi and q′π(i)
are associates (uniqueness up to ordering and unit factors).
(4) R has property (C) if and only if each irreducible element of R is a prime element.
Notice that properties (A) and (C) together are equivalent to what we defined as
unique factorization. Hence, an integral domain satisfying (A) and (C) is a UFD. Next,
we show that there are other equivalent formulations.
Proof. As remarked before, by definition, properties (A) and (C) together are
equivalent to unique factorization. We show here that (2), (3), and (4) are equivalent.
First, we show that (2) implies (3).
Suppose that R satisfies properties (A) and (B). We must show that it also satisfies
(C); that is, we must show that if q ∈ R is irreducible, then q is prime. Suppose that
q ∈ R is irreducible and q|ab with a, b ∈ R. Then we have ab = cq for some c ∈ R. If a
is a unit from ab = cq, we get that b = a−1 cq, and q|b. The results are identical if b is a
unit. Therefore, we may assume that neither a nor b are units.
If c = 0, then since R is an integral domain, either a = 0, or b = 0, and q|a, or q|b.
We may assume then that c ≠ 0.
If c is a unit, then q = c−1 ab, and since q is irreducible, either c−1 a, or b are units.
If c−1 a is a unit, then a is also a unit. Therefore, if c is a unit, either a or b are units
contrary to our assumption.
Therefore, we may assume that c ≠ 0, and c is not a unit. From property (A), we
have
a = q1 ⋅ ⋅ ⋅ qr ,
b = q′1 ⋅ ⋅ ⋅ q′s ,
c = q′′1 ⋅ ⋅ ⋅ q′′t ,
with all factors irreducible. Then
q1 ⋅ ⋅ ⋅ qr q′1 ⋅ ⋅ ⋅ q′s = ab = cq = q′′1 ⋅ ⋅ ⋅ q′′t q
gives two factorizations of ab into irreducibles. From property (B), q is an associate of
some qi or q′j . Hence, q|qi , or q|q′j . It follows
that q|a, or q|b and, therefore, q is a prime element.
That (3) implies (4) is direct.
We show that (4) implies (2).
Suppose that R satisfies property (A′). We must show that it satisfies both (A)
and (B). Since prime elements are irreducible, (A) follows at once from (A′). We show
next that irreducible elements are prime.
Suppose that q is irreducible. Then from (A′), we have
q = p1 ⋅ ⋅ ⋅ pr
with each pi prime. Since q is irreducible and the prime p1 is a nonunit, the product
p2 ⋅ ⋅ ⋅ pr must be a unit; hence, pi |1 for i = 2, . . . , r. Since primes are nonunits, this
forces r = 1. Thus, q = p1 , and q is prime. Therefore,
(A) holds, and each irreducible element is prime.
We now show that (B) holds. Let
q1 ⋅ ⋅ ⋅ qr = q′1 ⋅ ⋅ ⋅ q′s
with all qi , q′j irreducible and, hence, prime. Then
q1 |q′1 ⋅ ⋅ ⋅ q′s ,
and so, q1 |q′i for some i. Without loss of generality, suppose q1 |q′1 . Then q′1 = aq1 . Since
q′1 is irreducible, it follows that a is a unit, and q1 and q′1 are associates. It follows then
that
q2 ⋅ ⋅ ⋅ qr = (aq′2 )q′3 ⋅ ⋅ ⋅ q′s ,
since R has no zero divisors. Property (B) holds then by induction, and the theorem is
proved.
Note that in our new terminology, ℤ is a UFD. In the next section, we will present
other examples of UFDs. However, not every integral domain is a unique factorization
domain.
As we defined in the last section, let R be the following subring of ℂ:
R = ℤ[√−5] = {x + iy√5 : x, y ∈ ℤ}.
Then the two factorizations
9 = 3 ⋅ 3 = (2 + i√5)(2 − i√5)
give two different decompositions for an element in terms of irreducible elements. The
fact that R is not a UFD also follows from the fact that 3 is an irreducible element, which
is not prime.
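The obstruction can be seen numerically from the norm N(x + iy√5) = x2 + 5y2 (a sketch of ours; the search bounds are chosen by hand to cover all candidates):

```python
# In R = Z[sqrt(-5)] the norm N(x + i*y*sqrt(5)) = x^2 + 5y^2 is multiplicative.
# A proper factor of 3 would need norm 3, but no element of R has norm 3.

def norm(x: int, y: int) -> int:
    return x * x + 5 * y * y

print(any(norm(x, y) == 3 for x in range(-2, 3) for y in range(-1, 2)))
# → False: 3 is irreducible in R

# Both sides of 9 = 3*3 = (2 + i*sqrt(5)) * (2 - i*sqrt(5)) consist of
# irreducible elements of norm 9:
print(norm(3, 0), norm(2, 1), norm(2, -1))  # → 9 9 9
```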
Unique factorization is tied to the famous solution of Fermat’s last theorem. Wiles
and Taylor in 1995 proved the following:
Theorem 3.3.4. The equation x p + yp = z p has no integral solutions with xyz ≠ 0 for any
prime p ≥ 3.
If ϵ = e2πi/p is a primitive p-th root of unity, then
z p − yp = (z − y)(z − ϵy) ⋅ ⋅ ⋅ (z − ϵp−1 y),
which leads to the ring
R = ℤ[ϵ] = {a0 + a1 ϵ + ⋅ ⋅ ⋅ + ap−1 ϵp−1 : aj ∈ ℤ}.
Kummer proved that if R is a UFD, then property (Fp ) holds. However, by results
obtained independently by Uchida and by Montgomery (1971), R is a UFD only if
p ≤ 19 (see [49]).
I1 ⊂ I2 ⊂ ⋅ ⋅ ⋅ ⊂ In ⊂ ⋅ ⋅ ⋅
Theorem 3.4.1. Let R be an integral domain. If each ascending chain of principal ideals
in R becomes stationary, then R satisfies property (A).
aR ⊂ a1 R ⊂ ⋅ ⋅ ⋅ ⊂ an R ⊂ ⋅ ⋅ ⋅ .
From our hypothesis on R, this must become stationary, contradicting the argument
above that the inclusion is proper. Therefore, a must be a product of irreducibles.
Proof. Suppose that R is a principal ideal domain. R satisfies property (C) by Theo-
rem 3.2.7(1). Therefore, to show that it is a unique factorization domain, we must show
that it also satisfies property (A). From the previous theorem, it suffices to show that
each ascending chain of principal ideals becomes stationary. Consider such an as-
cending chain
a1 R ⊂ a2 R ⊂ ⋅ ⋅ ⋅ ⊂ an R ⊂ ⋅ ⋅ ⋅ .
Now let
I = ⋃i≥1 ai R.
Since we showed that the integers ℤ are a PID, we can recover the fundamental
theorem of arithmetic from Theorem 3.4.2. We now present another important example
of a PID; hence a UFD. In the next chapter, we will look in detail at polynomials with
coefficients in an integral domain. Below, we consider polynomials with coefficients
in a field, and for the present leave out many of the details.
If K is a field and n is a nonnegative integer, then a polynomial of degree n over K
is a formal sum of the form
P(x) = a0 + a1 x + ⋅ ⋅ ⋅ + an xn
with ai ∈ K and an ≠ 0. If Q(x) = b0 + b1 x + ⋅ ⋅ ⋅ + bm xm is another polynomial over K,
then P(x) and Q(x) are added and subtracted componentwise;
that is, the coefficient of xi in P(x) ± Q(x) is ai ± bi , where ai = 0 for i > n, and bj = 0
for j > m. Multiplication is given by
P(x)Q(x) = c0 + c1 x + ⋅ ⋅ ⋅ + cn+m xn+m , where ck = a0 bk + a1 bk−1 + ⋅ ⋅ ⋅ + ak b0 .
From the definitions, the following degree relationships are clear. The proofs are
in the exercises.
Lemma 3.4.4. Let 0 ≠ P(x), 0 ≠ Q(x) in K[x]. Then the following hold:
(a) deg P(x)Q(x) = deg P(x) + deg Q(x).
(b) deg(P(x) ± Q(x)) ≤ max(deg P(x), deg Q(x)).
Theorem 3.4.5. If K is a field, then K[x] forms an integral domain. K can be naturally
embedded into K[x] by identifying each element of K with the corresponding constant
polynomial. The only units in K[x] are the nonzero elements of K.
Proof. Verification of the basic ring properties is solely computational and is left to the
exercises. Since deg P(x)Q(x) = deg P(x) + deg Q(x), it follows that if P(x) ≠ 0 and
Q(x) ≠ 0, then P(x)Q(x) ≠ 0 and, therefore, K[x] is an integral domain.
If G(x) is a unit in K[x], then there exists an H(x) ∈ K[x] with G(x)H(x) = 1. From
the degrees, we have deg G(x) + deg H(x) = 0, and since deg G(x) ≥ 0 and deg H(x) ≥ 0,
this is possible only if deg G(x) = deg H(x) = 0. Therefore, G(x) ∈ K.
Now that we have K[x] as an integral domain, we proceed to show that K[x] is a
principal ideal domain and, hence, there is unique factorization into primes. We first
repeat the definition of a prime in K[x]. If 0 ≠ f (x) has no nontrivial, nonunit factors
(it cannot be factorized into polynomials of lower degree), then f (x) is a prime in K[x]
or a prime polynomial. A prime polynomial is also called an irreducible polynomial.
Clearly, if deg g(x) = 1, then g(x) is irreducible.
The fact that K[x] is a principal ideal domain follows from the division algorithm
for polynomials, which is entirely analogous to the division algorithm for integers.
Lemma 3.4.6 (Division algorithm in K[x]). If 0 ≠ f (x), 0 ≠ g(x) ∈ K[x], then there exist
unique polynomials q(x), r(x) ∈ K[x] such that f (x) = q(x)g(x) + r(x), where r(x) = 0 or
deg r(x) < deg g(x). (The polynomials q(x) and r(x) are called, respectively, the quotient
and remainder.)
Example 3.4.7.
(1) Let f (x) = 3x4 − 6x2 + 8x − 6 and g(x) = 2x2 + 4. Then
f (x) = ((3/2)x2 − 6)g(x) + 8x + 18;
that is, the quotient is q(x) = (3/2)x2 − 6, and the remainder is r(x) = 8x + 18.
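Division with remainder in K[x] can be carried out over K = ℚ with exact rational arithmetic. The following sketch (our own helper, with coefficient lists ordered from the constant term up) reproduces the example:

```python
from fractions import Fraction

# Division algorithm in K[x] over K = Q: f = q*g + r with deg r < deg g.
# Polynomials are lists of coefficients, lowest degree first.

def poly_divmod(f, g):
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g) and any(r):
        d = len(r) - len(g)          # degree gap
        c = r[-1] / g[-1]            # quotient of leading coefficients
        q[d] = c
        for i, gc in enumerate(g):   # subtract c * x^d * g(x)
            r[i + d] -= c * gc
        while r and r[-1] == 0:      # drop vanished leading terms
            r.pop()
    return q, r

# f = 3x^4 - 6x^2 + 8x - 6, g = 2x^2 + 4:
q, r = poly_divmod([-6, 8, -6, 0, 3], [4, 0, 2])
print(q)  # → [Fraction(-6, 1), Fraction(0, 1), Fraction(3, 2)]
print(r)  # → [Fraction(18, 1), Fraction(8, 1)]
```

The quotient (3/2)x2 − 6 and remainder 8x + 18 match the computation above.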
Theorem 3.4.8. Let K be a field. Then the polynomial ring K[x] is a principal ideal do-
main; hence a unique factorization domain.
Proof. The proof is essentially analogous to the proof in the integers. Let I be an ideal
in K[x] with I ≠ K[x]. Let f (x) be a polynomial in I of minimal degree. We claim that
I = ⟨f (x)⟩, the principal ideal generated by f (x). Let g(x) ∈ I. We must show that g(x)
is a multiple of f (x). By the division algorithm in K[x], we have
g(x) = q(x)f (x) + r(x),
where r(x) = 0, or deg(r(x)) < deg(f (x)). If r(x) ≠ 0, then deg(r(x)) < deg(f (x)). How-
ever, r(x) = g(x) − q(x)f (x) ∈ I since I is an ideal, and g(x), f (x) ∈ I. This is a contra-
diction since f (x) was assumed to be a polynomial in I of minimal degree. Therefore,
r(x) = 0 and, hence, g(x) = q(x)f (x) is a multiple of f (x). Therefore, each element of I
is a multiple of f (x) and, hence, I = ⟨f (x)⟩.
Therefore, K[x] is a principal ideal domain and, from Theorem 3.4.2, a unique fac-
torization domain.
We proved that in a principal ideal domain, every ascending chain of ideals be-
comes stationary. In general, a ring R (commutative or not) satisfies the ascending
chain condition or ACC if every ascending chain of left (or right) ideals in R becomes
stationary. A ring satisfying the ACC is called a Noetherian ring.
r2 = qr1 + r,
Therefore, Euclidean domains are precisely those integral domains which admit a
division algorithm. In the integers ℤ, define N(z) = |z|. Then N is a Euclidean norm
on ℤ; hence, ℤ is a Euclidean domain.
Theorem 3.5.2. Every Euclidean domain is a principal ideal domain; hence a unique
factorization domain.
Before proving this theorem, we must develop some results on the number theory
of general Euclidean domains. First, some properties of the norm.
(b) Suppose u is a unit. Then there exists u−1 with u ⋅ u−1 = 1. Then
N(u) ≤ N(u ⋅ u−1 ) = N(1) ≤ N(u),
so N(u) = N(1). Conversely, suppose N(u) = N(1). Apply the division algorithm to 1
and u:
1 = qu + r.
If r ≠ 0, then N(r) < N(u) = N(1), contradicting the minimality of N(1). Therefore,
r = 0, and 1 = qu. Then u has a multiplicative inverse and, hence, is a unit.
(c) Suppose a, b ∈ R⋆ are associates. Then a = ub with u a unit. Then
N(b) ≤ N(ub) = N(a) and N(a) ≤ N(u−1 a) = N(b).
Since N(a) ≤ N(b), and N(b) ≤ N(a), it follows that N(a) = N(b).
(d) Suppose N(a) = N(ab). Apply the division algorithm to a and ab:
a = q(ab) + r,
where r = 0, or N(r) < N(ab). If r ≠ 0, then r = a − q(ab) = a(1 − qb), and hence
N(a) ≤ N(a(1 − qb)) = N(r), contradicting that N(r) < N(ab) = N(a). Hence, r = 0, and
a = q(ab) = (qb)a. Then
a = (qb)a = 1 ⋅ a ⇒ qb = 1
since there are no zero divisors in an integral domain. Hence, b is a unit. Since N(a) ≤
N(ab), it follows that if b is not a unit, we must have N(a) < N(ab).
b = qa + r,
Consider the Gaussian integers
ℤ[i] = {a + bi : a, b ∈ ℤ}.
It was first observed by Gauss that this set permits unique factorization. To show this,
we need a Euclidean norm on ℤ[i]. For a + bi ∈ ℤ[i], define
N(a + bi) = a2 + b2 .
The basic properties of this norm follow directly from the definition (see exer-
cises).
From the multiplicativity of the norm, we have the following concerning primes
and units in ℤ[i].
Lemma 3.5.6.
(1) u ∈ ℤ[i] is a unit if and only if N(u) = 1.
(2) If π ∈ ℤ[i] and N(π) = p, where p is an ordinary prime in ℤ, then π is a prime in
ℤ[i].
Proof. Certainly u is a unit if and only if N(u) = N(1). But in ℤ[i], we have N(1) = 1.
Therefore, the first part follows.
Suppose next that π ∈ ℤ[i] with N(π) = p for some rational prime p. Suppose that π = π1 π2 .
From the multiplicativity of the norm, we have
p = N(π) = N(π1 )N(π2 ).
Since each norm is a positive ordinary integer, and p is a prime, it follows that either
N(π1 ) = 1, or N(π2 ) = 1. Hence, either π1 or π2 is a unit. Therefore, π is a prime in
ℤ[i].
Armed with this norm, we can show that ℤ[i] is a Euclidean domain.
Proof. That ℤ[i] forms a commutative ring with an identity can be verified directly
and easily. If αβ = 0, then N(α)N(β) = 0, and since there are no zero divisors in ℤ, we
must have N(α) = 0, or N(β) = 0. But then either α = 0, or β = 0 and, hence, ℤ[i] is
an integral domain. To complete the proof, we show that the norm N is a Euclidean
norm.
From the multiplicativity of the norm, we have, if α, β ≠ 0,
N(α) ≤ N(α)N(β) = N(αβ).
Therefore, property (1) of Euclidean norms is satisfied. We must now show that the
division algorithm holds.
Let α = a + bi and β = c + di be Gaussian integers. Recall that the inverse for a
nonzero complex number z = x + iy is
1/z = z̄ /|z|2 = (x − iy)/(x2 + y2 ).
Then
α/β = αβ̄ /|β|2 = (a + bi)(c − di)/(c2 + d2 )
= (ac + bd)/(c2 + d2 ) + ((bc − ad)/(c2 + d2 ))i = u + iv.
Thus, α/β lies in
{u + iv : u, v ∈ ℚ}.
Choose rational integers m, n with |u − m| ≤ 1/2 and |v − n| ≤ 1/2, and set q = m + in
and r = α − qβ. Then
|r| = |α − qβ| = |β| ⋅ |α/β − q|.
Now
|α/β − q| = |(u − m) + i(v − n)| = √((u − m)2 + (v − n)2 ) ≤ √((1/2)2 + (1/2)2 ) < 1.
Therefore,
N(r) = |r|2 = |β|2 ⋅ |α/β − q|2 < |β|2 = N(β),
so the division algorithm holds, and N is a Euclidean norm on ℤ[i].
Since ℤ[i] forms a Euclidean domain, it follows from our previous results that ℤ[i]
must be a principal ideal domain; hence a unique factorization domain.
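The rounding step in the proof translates directly into a division procedure for Gaussian integers (a sketch of ours, using Python's built-in complex numbers):

```python
# Division algorithm in Z[i]: round alpha/beta to the nearest Gaussian
# integer q; the remainder r = alpha - q*beta then has N(r) < N(beta).

def gauss_divmod(alpha: complex, beta: complex) -> tuple[complex, complex]:
    z = alpha / beta
    q = complex(round(z.real), round(z.imag))  # nearest lattice point
    return q, alpha - q * beta

def norm(z: complex) -> int:
    return round(z.real) ** 2 + round(z.imag) ** 2

q, r = gauss_divmod(27 + 23j, 8 + 1j)
print(q, r)                    # → (4+2j) (-3+3j)
print(norm(r) < norm(8 + 1j))  # → True
```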
Since we will now be dealing with many kinds of integers, we will refer to the
ordinary integers ℤ as the rational integers and the ordinary primes p as the rational
primes. It is clear that ℤ can be embedded into ℤ[i]. However, not every rational prime
is also prime in ℤ[i]. The primes in ℤ[i] are called the Gaussian primes. For example,
we can show that both 1 + i and 1 − i are Gaussian primes; that is, primes in ℤ[i].
However, (1 + i)(1 − i) = 2. Therefore, the rational prime 2 is not a prime in ℤ[i]. Using
the multiplicativity of the Euclidean norm in ℤ[i], we can describe all the units and
primes in ℤ[i].
Theorem 3.5.9.
(1) The only units in ℤ[i] are ±1, ±i.
(2) Suppose π is a Gaussian prime. Then π is one of the following:
(a) a positive rational prime p ≡ 3 mod 4, or an associate of such a rational prime.
(b) 1 + i, or an associate of 1 + i.
(c) a + bi, or a − bi, where a > 0, b > 0, a is even, and N(π) = a2 + b2 = p with p a
rational prime congruent to 1 mod 4, or an associate of a + bi, or a − bi.
Proof. (1) Suppose u = x + iy ∈ ℤ[i] is a unit. Then, from Lemma 3.5.6, we have N(u) =
x2 + y2 = 1, implying that (x, y) = (0, ±1) or (x, y) = (±1, 0). Hence, u = ±1 or u = ±i.
(2) Now suppose that π is a Gaussian prime. Since N(π) = ππ̄ , and π̄ ∈ ℤ[i], it
follows that π|N(π). N(π) is a rational integer, so N(π) = p1 ⋅ ⋅ ⋅ pk , where the pi ’s are
rational primes. By Euclid’s lemma π|pi for some pi and, hence, a Gaussian prime must
divide at least one rational prime. On the other hand, suppose π|p and π|q, where
p, q are different primes. Then (p, q) = 1 and, hence, there exist x, y ∈ ℤ such that
1 = px + qy. It follows that π|1, which is a contradiction. Therefore, a Gaussian prime divides
one and only one rational prime.
Let p be the rational prime that π divides. Then N(π)|N(p) = p2 . Since N(π) is a
rational integer, it follows that N(π) = p, or N(π) = p2 . If π = a + bi, then a2 + b2 = p,
or a2 + b2 = p2 .
If p = 2, then a2 + b2 = 2, or a2 + b2 = 4. It follows that π = ±2, ±2i, or π = 1 + i, or an
associate of 1 + i. Since (1 + i)(1 − i) = 2, and neither 1 + i, nor 1 − i are units, it follows
that neither 2, nor any of its associates are primes. Then π = 1 + i, or an associate of
1 + i. To see that 1 + i is prime, suppose 1 + i = αβ. Then N(1 + i) = 2 = N(α)N(β). It
follows that either N(α) = 1, or N(β) = 1, and either α or β is a unit.
If p ≠ 2, then either p ≡ 3 mod 4, or p ≡ 1 mod 4. Suppose first that p ≡ 3 mod 4.
Then a2 + b2 = p would imply, from Fermat’s two-square theorem (see [43]), that p ≡
1 mod 4. Therefore, from the remarks above a2 + b2 = p2 , and N(π) = N(p). Since π|p,
we have π = αp with α ∈ ℤ[i]. From N(π) = N(p), we get that N(α) = 1, and α is a unit.
Therefore, π and p are associates. Hence, in this case, π is an associate of a rational
prime congruent to 3 mod 4.
Finally, suppose p ≡ 1 mod 4. From the remarks above, either N(π) = p, or N(π) =
p2 . If N(π) = p2 , then a2 + b2 = p2 . Since p ≡ 1 mod 4, from Fermat’s two square
theorem, there exist m, n ∈ ℤ with m2 + n2 = p. Let u = m + in; then the norm N(u) = p.
Since p is a rational prime, it follows that u is a Gaussian prime. Similarly, its conjugate
ū is also a Gaussian prime. Now uū = p, so uū | p2 = N(π). Since π|N(π), it follows that π|uū
and, from Euclid’s lemma, either π|u, or π|ū. If π|u, they are associates since both are
primes. But this is a contradiction since N(π) = p2 ≠ p = N(u). The same is true if π|ū.
It follows that if p ≡ 1 mod 4, then N(π) ≠ p2 . Therefore, in this case, N(π) =
p = a2 + b2 . An associate of π has both a, b > 0 (see exercises). Furthermore, since
a2 + b2 = p, one of a or b must be even. If a is odd, then b is even; then iπ is an
associate of π with a even, completing the proof.
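The trichotomy of the theorem can be sketched computationally (our own function; the p ≡ 1 mod 4 branch finds the two-square representation by brute force):

```python
# How a rational prime p factors in Z[i] (Theorem 3.5.9):
#   p = 2:          ramified, 2 = -i*(1+i)^2
#   p ≡ 3 (mod 4):  inert, p stays a Gaussian prime
#   p ≡ 1 (mod 4):  splits, p = (a+bi)(a-bi) with a^2 + b^2 = p

def split_in_gaussian_integers(p: int) -> str:
    if p == 2:
        return "ramified: 2 = -i*(1+i)^2"
    if p % 4 == 3:
        return "inert: p remains a Gaussian prime"
    for a in range(1, p):  # p ≡ 1 (mod 4): search for a^2 + b^2 = p
        b2 = p - a * a
        if b2 <= 0:
            break
        b = int(b2 ** 0.5)
        if b * b == b2:
            return f"split: {p} = ({a}+{b}i)({a}-{b}i)"
    raise ValueError("no representation found (is p prime?)")

print(split_in_gaussian_integers(13))  # → split: 13 = (2+3i)(2-3i)
print(split_in_gaussian_integers(7))   # → inert: p remains a Gaussian prime
```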
Finally, we mention that the methods used in ℤ[i] cannot be applied to all
quadratic integers. For example, we have seen that there is not unique factorization
in ℤ[√−5].
Definition 3.6.1.
(1) A Dedekind domain D is an integral domain such that each nonzero proper ideal
A ({0} ≠ A ≠ D) can be written as a product of prime ideals
A = P1 ⋅ ⋅ ⋅ Pr
with each Pi a prime ideal, the factorization being unique up to ordering.
(2) A Prüfer ring R is an integral domain such that
A ⋅ (B ∩ C) = AB ∩ AC
for all ideals A, B, C of R.
3.7 Exercises
1. Let R be an integral domain, and let π ∈ R \ (U(R) ∪ {0}). Show the following:
(i) If for each a ∈ R with π ∤ a, there exist λ, μ ∈ R with λπ + μa = 1, then π is a
prime element of R.
(ii) Give an example of a prime element π in a UFD R that does not satisfy
the conditions of (i).
2. Let R be a UFD, and let a1 , . . . , at be pairwise coprime elements of R. If a1 ⋅ ⋅ ⋅ at is
an m-th power (m ∈ ℕ), then all factors ai are an associate of an m-th power. Is
each ai necessarily an m-th power?
3. Decide if the unit group of ℤ[√3], ℤ[√5], and ℤ[√7] is finite or infinite. For which
a ∈ ℤ are (1 − √5) and (a + √5) associates in ℤ[√5]?
4. Let k ∈ ℤ and k ≠ x2 for all x ∈ ℤ. Let α = a + b√k and β = c + d√k be elements of
ℤ[√k], and N(α) = a2 − kb2 , N(β) = c2 − kd2 . Show the following:
(i) The equality |N(α)| = |N(β)| is necessary for α and β to be associates
in ℤ[√k]. Is this condition also sufficient?
for some m ≥ 0 since ri ≠ 0 for only finitely many i. Furthermore, this presentation is
unique.
We now call x an indeterminate over R, and write each element of R̃ as f (x) =
∑mi=0 ri xi with f (x) = 0 or rm ≠ 0. We also now write R[x] for R̃. Each element of R[x]
is called a polynomial over R. The elements r0 , . . . , rm are called the coefficients of f (x)
with rm the leading coefficient. If rm ≠ 0, the non-negative integer m is called the de-
gree of f (x), which we denote by deg f (x). We say that f (x) = 0 has degree −∞. The
uniqueness of the representation of a polynomial implies that two nonzero polynomi-
als are equal if and only if they have the same degree and exactly the same coefficients.
A polynomial of degree 1 is called a linear polynomial, whereas one of degree two is a
quadratic polynomial. The set of polynomials of degree 0, together with 0, forms a ring
isomorphic to R and, hence, can be identified with R, the constant polynomials. Thus,
the ring R embeds in the set of polynomials R[x]. The following results are straightfor-
ward concerning degree:
https://doi.org/10.1515/9783110603996-004
Lemma 4.1.1. Let f (x) ≠ 0, g(x) ≠ 0 ∈ R[x]. Then the following hold:
(a) deg f (x)g(x) ≤ deg f (x) + deg g(x).
(b) deg(f (x) ± g(x)) ≤ max(deg f (x), deg g(x)).
Theorem 4.1.2. Let R be a commutative ring with an identity. Then the set of polynomi-
als R[x] forms a ring called the ring of polynomials over R. The ring R, identified with
the constant polynomials, naturally embeds into R[x]. R[x] is commutative. Fur-
thermore, R[x] is uniquely determined by R and x.
∑i≥0 ri xi ⟼ ∑i≥0 ri αi .
Hence, R[x] is uniquely determined by R and x. We remark that R[α] must be commu-
tative.
For f (x) = r0 + r1 x + ⋅ ⋅ ⋅ + rn xn ∈ R[x] and c ∈ R, substitution yields the element
f (c) = r0 + r1 c + ⋅ ⋅ ⋅ + rn cn ∈ R.
Definition 4.1.4. If f (x) ∈ R[x] and f (c) = 0 for c ∈ R, then c is called a zero or a root
of f (x) in R.
Theorem 4.2.1. If K is a field, then K[x] forms an integral domain. K can be naturally
embedded into K[x] by identifying each element of K with the corresponding constant
polynomial. The only units in K[x] are the nonzero elements of K.
Proof. Verification of the basic ring properties is purely computational and is left to the
exercises. Since deg P(x)Q(x) = deg P(x) + deg Q(x), it follows that if P(x) ≠ 0 and
Q(x) ≠ 0, then P(x)Q(x) ≠ 0. Therefore, K[x] is an integral domain.
If G(x) is a unit in K[x], then there exists an H(x) ∈ K[x] with G(x)H(x) = 1.
From the degrees, we have deg G(x) + deg H(x) = 0, and since deg G(x) ≥ 0,
deg H(x) ≥ 0. This is possible only if deg G(x) = deg H(x) = 0. Therefore, G(x) ∈ K.
Now that we have K[x] as an integral domain, we proceed to show that K[x] is a
principal ideal domain and, hence, there is unique factorization into primes. We first
repeat the definition of a prime in K[x]. If 0 ≠ f (x) has no nontrivial, nonunit factors (it
cannot be factorized into polynomials of lower degree), then f (x) is a prime in K[x] or a
prime polynomial. A prime polynomial is also called an irreducible polynomial over K.
Clearly, if deg g(x) = 1, then g(x) is irreducible.
The fact that K[x] is a principal ideal domain follows from the division algorithm
for polynomials, which is entirely analogous to the division algorithm for integers.
Theorem 4.2.2 (Division algorithm in K[x]). If 0 ≠ f (x), 0 ≠ g(x) ∈ K[x], then there ex-
ist unique polynomials q(x), r(x) ∈ K[x] such that f (x) = q(x)g(x) + r(x), where r(x) = 0,
or deg r(x) < deg g(x). (The polynomials q(x) and r(x) are called respectively the quo-
tient and remainder.)
Proof. If deg f (x) = 0 and deg g(x) ≥ 1, then we just choose q(x) = 0 and r(x) = f (x).
If deg f (x) = 0 = deg g(x), then f (x) = f ∈ K and g(x) = g ∈ K, and we choose
q(x) = f /g and r(x) = 0. Hence, Theorem 4.2.2 is proved for deg f (x) = 0, including
the uniqueness statement.
Now, let n > 0, and suppose that Theorem 4.2.2 is proved for all f (x) ∈ K[x] with
deg f (x) < n. Now, given f (x) = an xn + ⋅ ⋅ ⋅ + a0 of degree n and g(x) = bm xm + ⋅ ⋅ ⋅ + b0
of degree m, we may assume that m ≤ n (otherwise take q(x) = 0 and r(x) = f (x)),
and we set
h(x) = f (x) − (an /bm )xn−m g(x).
We have deg h(x) < n. Hence, by the induction assumption, there are q1 (x) and r(x) with
h(x) = q1 (x)g(x) + r(x) and deg r(x) < deg g(x). Then
f (x) = h(x) + (an /bm )xn−m g(x)
= ((an /bm )xn−m + q1 (x))g(x) + r(x)
= q(x)g(x) + r(x) with q(x) = (an /bm )xn−m + q1 (x),
For the uniqueness, suppose that f (x) = q1 (x)g(x) + r1 (x) = q2 (x)g(x) + r2 (x) with
deg r1 (x) < deg g(x) and deg r2 (x) < deg g(x). Then
(q1 (x) − q2 (x))g(x) = r2 (x) − r1 (x),
which gives a contradiction if r1 (x) ≠ r2 (x), because then q1 (x) ≠ q2 (x), so
deg((q1 (x) − q2 (x))g(x)) ≥ deg g(x), while deg(r2 (x) − r1 (x)) < deg g(x). Therefore,
r1 (x) = r2 (x) and, furthermore, q1 (x) = q2 (x) because K[x] is an integral domain.
For example, in ℚ[x],
(2x3 + x2 − 5x + 3)/(x2 + x + 1) = 2x − 1 with remainder −6x + 4.
Hence, q(x) = 2x − 1, r(x) = −6x + 4, and
2x3 + x2 − 5x + 3 = (2x − 1)(x2 + x + 1) + (−6x + 4).
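The computation can be checked mechanically. Here is a minimal sketch of the division algorithm over K = ℚ, using exact rational arithmetic; the function name and the coefficient-list convention (lowest degree first) are our own choices:

```python
# Division algorithm in K[x] over K = Q, following the inductive step of the
# proof: repeatedly subtract (a_n/b_m) x^(n-m) g(x) from the remainder.
from fractions import Fraction

def poly_divmod(f, g):
    """f, g: coefficient lists, lowest degree first, over Q. Return (q, r)."""
    f = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g) and any(f):
        while f and f[-1] == 0:      # strip leading zeros of the remainder
            f.pop()
        if len(f) < len(g):
            break
        shift = len(f) - len(g)
        coef = f[-1] / Fraction(g[-1])
        q[shift] = coef
        for i, c in enumerate(g):
            f[i + shift] -= coef * Fraction(c)
        f.pop()                      # the leading term cancels exactly
    return q, f

# 2x^3 + x^2 - 5x + 3 divided by x^2 + x + 1
q, r = poly_divmod([3, -5, 1, 2], [1, 1, 1])
print(q, r)  # q = 2x - 1, r = -6x + 4
```

The loop mirrors the inductive step of the proof: each pass subtracts (an /bm )xn−m g(x) from the current remainder.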
Theorem 4.2.4. Let K be a field. Then the polynomial ring K[x] is a principal ideal do-
main, and hence a unique factorization domain.
f (x) = (x − c)h(x) + r(x),
where r(x) = 0, or deg r(x) < deg(x −c) = 1. Hence, if r(x) ≠ 0, then r(x) is a polynomial
of degree 0, that is, a constant polynomial, and thus r(x) = r for r ∈ K. Hence, we have
f (x) = (x − c)h(x) + r.
0 = f (c) = 0 ⋅ h(c) + r = r
and, therefore, r = 0, and f (x) = (x − c)h(x). Since deg(x − c) = 1, we must have that
deg h(x) < deg f (x).
If f (x) = (x − c)k h(x) for some k ≥ 1 with h(c) ≠ 0, then c is called a zero of order k.
Theorem 4.2.6. Let f (x) ∈ K[x] with degree 2 or 3. Then f is irreducible if and only if
f (x) does not have a zero in K.
Proof. Suppose that f (x) is irreducible of degree 2 or 3. If f (x) has a zero c, then from
Theorem 4.2.5, we have f (x) = (x − c)h(x) with h(x) of degree 1 or 2. Therefore, f (x) is
reducible, a contradiction; hence, f (x) cannot have a zero.
Conversely, suppose that f (x) is reducible, say f (x) = g(x)h(x) with g(x), h(x) non-
constant. Since deg g(x) + deg h(x) is 2 or 3, one of the factors, say g(x), has degree 1
and, hence, f (x) has a
zero in K.
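For a finite field such as ℤp , Theorem 4.2.6 gives an effective irreducibility test for degrees 2 and 3: simply try all p candidate zeros. A small sketch (the helper name is our own):

```python
# Irreducibility test for polynomials of degree 2 or 3 over Z_p
# (Theorem 4.2.6): irreducible iff there is no zero in Z_p.
def irreducible_low_degree(coeffs, p):
    """coeffs: integers, lowest degree first, degree 2 or 3 over Z_p."""
    assert len(coeffs) in (3, 4) and coeffs[-1] % p != 0
    return all(sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p != 0
               for x in range(p))

print(irreducible_low_degree([1, 1, 1], 2))     # x^2 + x + 1 over Z_2: True
print(irreducible_low_degree([1, 0, 1], 2))     # x^2 + 1 = (x + 1)^2: False
print(irreducible_low_degree([1, 1, 0, 1], 2))  # x^3 + x + 1 over Z_2: True
```

The test is valid only for degrees 2 and 3; a quartic can be reducible without zeros, as x4 + x2 + 1 = (x2 + x + 1)2 over ℤ2 shows.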
Notice that this concept depends on the ring R. For example, 6 and
9 are not coprime over the integers ℤ since 3|6 and 3|9 and 3 is not a unit in ℤ. However,
6 and 9 are coprime over the rationals ℚ, where 3 is a unit.
Definition 4.3.2. Let f (x) = ∑ni=0 ri xi ∈ R[x], where R is an integral domain. Then f (x)
is a primitive polynomial or just primitive if r0 , r1 , . . . , rn are coprime in R.
Proof. If r ∈ R is a unit, then since R embeds into R[x], it follows that r is also a unit
in R[x]. Conversely, suppose that h(x) ∈ R[x] is a unit. Then there is a g(x) such that
h(x)g(x) = 1. Hence, deg h(x) + deg g(x) = deg 1 = 0. Since degrees are nonnegative
integers, it follows that deg h(x) = deg g(x) = 0 and, hence, h(x) ∈ R.
Now suppose that p is a prime element of R. Then p ≠ 0, and pR is a prime ideal
in R. We must show that pR[x] is a prime ideal in R[x]. Consider the map
τ : R[x] → (R/pR)[x], ∑i ri xi ⟼ ∑i (ri + pR)xi .
Then τ is an epimorphism with kernel pR[x]. Since pR is a prime ideal, we know that
R/pR is an integral domain. It follows that (R/pR)[x] is also an integral domain. Hence,
pR[x] must be a prime ideal in R[x], and therefore p is also a prime element of R[x].
Recall that each integral domain R can be embedded into a unique field of frac-
tions K. We can use results on K[x] to deduce some results in R[x].
Proof. Since K is a field, each nonzero element of K is a unit. Therefore, the only com-
mon divisors of the coefficients of f (x) are units and, hence, f (x) ∈ K[x] is primi-
tive.
Theorem 4.3.5. Let R be an integral domain. Then each irreducible f (x) ∈ R[x] of degree
> 0 is primitive.
Proof. Let f (x) be an irreducible polynomial in R[x], and let r ∈ R be a common divi-
sor of the coefficients of f (x). Then f (x) = rg(x), where g(x) ∈ R[x]. Then deg f (x) =
deg g(x) > 0, so g(x) ∉ R. Since the units of R[x] are the units of R, it follows that g(x)
is not a unit in R[x]. Since f (x) is irreducible, it follows that r must be a unit in R[x]
and, hence, r is a unit in R. Therefore, f (x) is primitive.
Theorem 4.3.6. Let R be an integral domain and K its field of fractions. If f (x) ∈ R[x] is
primitive and irreducible in K[x], then f (x) is irreducible in R[x].
Proof. Suppose that f (x) ∈ R[x] is primitive and irreducible in K[x], and suppose that
f (x) = g(x)h(x), where g(x), h(x) ∈ R[x] ⊂ K[x]. Since f (x) is irreducible in K[x], either
g(x) or h(x) must be a unit in K[x]. Without loss of generality, suppose that g(x) is a
unit in K[x]. Then g(x) = g ∈ K. But g(x) ∈ R[x], and K ∩ R[x] = R.
Hence, g ∈ R. Then g is a divisor of the coefficients of f (x), and as f (x) is primitive,
g(x) must be a unit in R and, therefore, also a unit in R[x]. Therefore, f (x) is irreducible
in R[x].
Theorem 4.4.1 (Gauss’ lemma). Let R be a UFD and f (x), g(x) primitive polynomials in
R[x]. Then their product f (x)g(x) is also primitive.
Proof. Let R be a UFD and f (x), g(x) primitive polynomials in R[x]. Suppose that
f (x)g(x) is not primitive. Then there is a prime element p ∈ R that divides each of
the coefficients of f (x)g(x). Then p|f (x)g(x). Since prime elements of R are also prime
elements of R[x], it follows that p is also a prime element of R[x] and, hence, p|f (x),
or p|g(x). Therefore, either f (x) or g(x) is not primitive, giving a contradiction.
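Gauss’ lemma is easy to observe experimentally for R = ℤ: primitivity is just coprimality of the coefficients. The helper names below are our own:

```python
# Primitivity over R = Z: the coefficients are coprime iff their gcd is 1.
# Gauss' lemma: the product of primitive polynomials is again primitive.
from functools import reduce
from math import gcd

def is_primitive(coeffs):
    """True if the integer coefficients have content 1 (i.e., are coprime)."""
    return reduce(gcd, coeffs) == 1

def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [2, 3]     # 3x + 2 is primitive
g = [5, 0, 3]  # 3x^2 + 5 is primitive
print(is_primitive(f), is_primitive(g), is_primitive(poly_mul(f, g)))  # all True
```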
Proof. (a) Suppose that g(x) = ∑ni=0 ai xi with ai = ri /si , ri , si ∈ R. Set s = s0 s1 ⋅ ⋅ ⋅ sn .
Then sg(x) is a nonzero element of R[x]. Let d be a greatest common divisor of the
coefficients of sg(x). If we set a = s/d, then ag(x) is primitive.
(b) For a ∈ K, there are coprime r, s ∈ R satisfying a = r/s. Suppose that a ∉ R.
Then there is a prime element p ∈ R dividing s. Since g(x) is primitive, p does not
divide all the coefficients of g(x). However, we also have f (x) = ag(x) = (r/s)g(x). Hence,
sf (x) = rg(x), where p|s and p does not divide r. Then p divides all the coefficients of
rg(x) and, hence, all the coefficients of g(x), a contradiction. Therefore, a ∈ R.
(c) From part (a), there is a nonzero a ∈ K such that af (x) is primitive in R[x].
Then f (x) = a−1 (af (x)). From part (b), we must have a−1 ∈ R. Set g(x) = af (x) and
b = a−1 .
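Theorem 4.4.2 is constructive for R = ℤ, K = ℚ: clearing denominators and dividing by the content produces the factor b and the primitive part g(x). A sketch (assuming Python 3.9+ for math.lcm; the names are our own):

```python
# Write f in Q[x] as f = b * g with b in Q and g primitive in Z[x]
# (Theorem 4.4.2 (c) for R = Z). Requires Python 3.9+ for math.lcm.
from fractions import Fraction
from functools import reduce
from math import gcd, lcm

def primitive_part(coeffs):
    """coeffs: rationals, lowest degree first. Return (b, g)."""
    coeffs = [Fraction(c) for c in coeffs]
    s = reduce(lcm, (c.denominator for c in coeffs))  # clear denominators
    ints = [int(c * s) for c in coeffs]               # s*f lies in Z[x]
    d = reduce(gcd, ints)                             # content of s*f
    return Fraction(d, s), [c // d for c in ints]

b, g = primitive_part([Fraction(4, 3), Fraction(2, 3), 2])
print(b, g)  # 2/3 and [2, 1, 3], i.e. f = (2/3)(3x^2 + x + 2)
```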
Theorem 4.4.3. Let R be a UFD and K its field of fractions. Let f (x) ∈ R[x] be a polyno-
mial of degree ≥ 1.
(a) If f (x) is primitive and f (x)|g(x) in K[x], then f (x) divides g(x) also in R[x].
(b) If f (x) is irreducible in R[x], then it is also irreducible in K[x].
(c) If f (x) is primitive and a prime element of K[x], then f (x) is also a prime element of
R[x].
Proof. (a) Suppose that g(x) = f (x)h(x) with h(x) ∈ K[x]. From Theorem 4.4.2 part
(a), there is a nonzero a ∈ K such that h1 (x) = ah(x) is primitive in R[x]. Hence,
g(x) = a−1 (f (x)h1 (x)). From Gauss’ lemma, f (x)h1 (x) is primitive in R[x]. Therefore, from
Theorem 4.4.2 part (b), we have a−1 ∈ R. It follows that f (x)|g(x) in R[x].
(b) Suppose that g(x) ∈ K[x] is a factor of f (x). From Theorem 4.4.2 part (a), there
is a nonzero a ∈ K with g1 (x) = ag(x) primitive in R[x]. Since a is a unit in K, it follows
that g1 (x) is also a factor of f (x) in K[x] and, by part (a), g1 (x) is a factor of f (x) in R[x].
However, by assumption, f (x) is irreducible in R[x]. This implies that either g1 (x) is a
unit in R, or g1 (x) is an associate of f (x).
If g1 (x) is a unit, then g1 ∈ R ⊂ K, and g1 = ag. Hence, g = a−1 g1 ∈ K; that is, g(x) = g is a unit in K[x].
If g1 (x) is an associate of f (x), then f (x) = bg(x), where b ∈ K since g1 (x) = ag(x)
with a ∈ K. Combining these, it follows that f (x) has only trivial factors in K[x], and
since—by assumption—f (x) is nonconstant, it follows that f (x) is irreducible in K[x].
(c) Suppose that f (x)|g(x)h(x) with g(x), h(x) ∈ R[x]. Since f (x) is a prime element
in K[x], we have that f (x)|g(x) or f (x)|h(x) in K[x]. From part (a), we have f (x)|g(x) or
f (x)|h(x) in R[x] implying that f (x) is a prime element in R[x].
Theorem 4.4.4 (Gauss). Let R be a UFD. Then the polynomial ring R[x] is also a UFD.
Proof. By induction on degree, we show that each nonunit f (x) ∈ R[x], f (x) ≠ 0, is
a product of prime elements. Since R is an integral domain, so is R[x]. Therefore, the
fact that R[x] is a UFD then follows from Theorem 3.3.3.
If deg f (x) = 0, then f (x) = f is a nonunit in R. Since R is a UFD, f is a product
of prime elements in R. However, from Theorem 4.3.3, each prime factor is then also
prime in R[x]. Therefore, f (x) is a product of prime elements.
Now suppose n > 0 and that the claim is true for all polynomials f (x) of degree
< n. Let f (x) be a polynomial of degree n > 0. From Theorem 4.4.2 (c), there is an a ∈ R
and a primitive h(x) ∈ R[x] satisfying f (x) = ah(x). Since R is a UFD, the element a is a
product of prime elements in R, or a is a unit in R. Since the units in R[x] are the units
in R, and a prime element in R is also a prime element in R[x], it follows that a is a
product of prime elements in R[x], or a is a unit in R[x]. Let K be the field of fractions
of R. Then K[x] is a UFD. Hence, h(x) is a product of prime elements of K[x]. Let p(x) ∈
K[x] be a prime divisor of h(x). From Theorem 4.4.2, we can assume by multiplication
of field elements that p(x) ∈ R[x], and p(x) is primitive. From Theorem 4.4.2 (c), it
follows that p(x) is a prime element of R[x]. Furthermore, from Theorem 4.4.3 (a), p(x)
is a divisor of h(x) in R[x]. Therefore, h(x) = p(x)g(x) with g(x) ∈ R[x] and deg g(x) < deg h(x) = n.
By our inductive hypothesis, we have then that g(x) is a product of prime elements in
R[x], or g(x) is a unit in R[x]. Therefore, the claim holds for f (x), and therefore holds
for all f (x) by induction.
If R[x] is a polynomial ring over R, we can form a polynomial ring in a new inde-
terminate y over this ring to form (R[x])[y]. It is straightforward that (R[x])[y] is iso-
morphic to (R[y])[x]. We denote both of these rings by R[x, y] and consider this as the
ring of polynomials in two commuting variables x, y with coefficients in R.
If R is a UFD, then from Theorem 4.4.4, R[x] is also a UFD. Hence, R[x, y] is
also a UFD. Inductively then, the ring of polynomials in n commuting variables
R[x1 , x2 , . . . , xn ] is also a UFD. Here, R[x1 , . . . , xn ] is inductively given by R[x1 , . . . , xn ] =
(R[x1 , . . . , xn−1 ])[xn ] if n ≥ 2.
We now give a condition for a polynomial in R[x] to have a zero in K[x], where K
is the field of fractions of R.
Theorem 4.4.6. Let R be a UFD and K its field of fractions. Let f (x) = xn + rn−1 xn−1 + ⋅ ⋅ ⋅ +
r0 ∈ R[x]. Suppose that β ∈ K is a zero of f (x). Then β is in R and is a divisor of r0 .
Proof. Write β = r/s with r, s ∈ R coprime. Then
f (r/s) = 0 = rn /sn + rn−1 (rn−1 /sn−1 ) + ⋅ ⋅ ⋅ + r0 .
Multiplying by sn gives 0 = rn + rn−1 rn−1 s + ⋅ ⋅ ⋅ + r0 sn .
Hence, it follows that s must divide rn . Since r and s are coprime, s must be a unit, and
then, without loss of generality, we may assume that s = 1. Then β = r ∈ R, and
0 = rn + rn−1 rn−1 + ⋅ ⋅ ⋅ + r0 ,
and so r|r0 .
Note that since ℤ is a UFD, Gauss’ theorem implies that ℤ[x] is also a UFD. How-
ever, ℤ[x] is not a principal ideal domain. For example, the set of integral polynomials
with even constant term is an ideal, but not principal. We leave the verification to the
exercises. On the other hand, we saw that if K is a field, K[x] is a PID. The question
arises as to when R[x] actually is a principal ideal domain. It turns out to be precisely
when R is a field.
Theorem 4.4.7. Let R be a commutative ring with an identity. Then the following are
equivalent:
(a) R is a field.
(b) R[x] is Euclidean.
(c) R[x] is a principal ideal domain.
Proof. From Section 4.2, we know that (a) implies (b), which in turn implies (c). There-
fore, we must show that (c) implies (a). Assume then that R[x] is a principal ideal do-
main. Define the map
τ : R[x] → R
by τ(∑i ri xi ) = r0 . Then τ is an epimorphism with kernel (x), so R ≅ R[x]/(x). Since
R[x] is an integral domain (being a principal ideal domain), (x) is a nonzero prime ideal.
In a principal ideal domain, every nonzero prime ideal is maximal; hence, (x) is maximal,
and R ≅ R[x]/(x) is a field.
We now consider the relationship between irreducibles in R[x] for a general inte-
gral domain and irreducibles in K[x], where K is its field of fractions. This is handled
by the next result called Eisenstein’s criterion.
Theorem 4.4.8 (Eisenstein’s criterion). Let R be an integral domain and K its field of
fractions. Let f (x) = ∑ni=0 ai xi ∈ R[x] of degree n > 0. Let p be a prime element of R
satisfying the following:
(1) p|ai for i = 0, . . . , n − 1.
(2) p does not divide an .
(3) p2 does not divide a0 .
Then the following hold:
(a) If f (x) is in addition primitive, then f (x) is irreducible in R[x].
(b) f (x) is irreducible in K[x].
Proof. (a) Suppose that f (x) = g(x)h(x) with g(x), h(x) ∈ R[x]. Suppose that
k l
g(x) = ∑ bi xi , bk ≠ 0 and h(x) = ∑ cj xj , cl ≠ 0.
i=0 j=0
Then a0 = b0 c0 . Now p|a0 , but p2 does not divide a0 . This implies that either p does
not divide b0 , or p doesn’t divide c0 . Without loss of generality, assume that p|b0 and
p does not divide c0 .
Since an = bk cl , and p does not divide an , it follows that p does not divide bk . Let
bj be the first coefficient of g(x) that is not divisible by p. Consider
aj = bj c0 + ⋅ ⋅ ⋅ + b0 cj ,
where every summand after the first is divisible by p. Since p divides neither bj
nor c0 , it follows that p does not divide bj c0 . Therefore, p does not divide aj , which
implies that j = n. Then from j ≤ k ≤ n, it follows that k = n. Therefore, deg g(x) =
deg f (x) and, hence, deg h(x) = 0. Thus, h(x) = h ∈ R. Then from f (x) = hg(x) with f
primitive, it follows that h is a unit and, therefore, f (x) is irreducible.
(b) Suppose that f (x) = g(x)h(x) with g(x), h(x) ∈ R[x]. The fact that f (x) was
primitive was only used in the final part of part (a). Therefore, by the same arguments
as in part (a), we may assume—without loss of generality—that h ∈ R ⊂ K. Therefore,
f (x) is irreducible in K[x].
Example 4.4.9. Let R = ℤ and p a prime number. Suppose that n, m are integers such
that n ≥ 1 and p does not divide m. Then xn ± pm is irreducible in ℤ[x] and ℚ[x]. In
particular, (pm)1/n is irrational for n ≥ 2.
Next, let p be a prime number, and consider the polynomial
Φp (x) = (xp − 1)/(x − 1) = xp−1 + xp−2 + ⋅ ⋅ ⋅ + 1.
Since all the coefficients of Φp (x) are equal to 1, Eisenstein’s criterion is not directly
applicable. However, for any integer a, the polynomial Φp (x) is irreducible if and only if
Φp (x + a) is irreducible in ℤ[x], since any factorization of the one yields a factorization
of the other. It follows that
Φp (x + 1) = ((x + 1)p − 1)/((x + 1) − 1)
= (xp + (p1)xp−1 + ⋅ ⋅ ⋅ + (pp−1)x + 1 − 1)/x
= xp−1 + (p1)xp−2 + ⋅ ⋅ ⋅ + (pp−1),
where (pi ) denotes the binomial coefficient. Since p | (pi ) for 1 ≤ i ≤ p − 1 (see the
exercises), and the constant term (pp−1) = p is not divisible by p2 , Eisenstein’s criterion
applies to Φp (x + 1). Hence, Φp (x + 1), and therefore also Φp (x), is irreducible in ℤ[x]
and ℚ[x].
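Eisenstein’s criterion is mechanical to check, and the shift Φp (x) ↦ Φp (x + 1) can be computed from binomial coefficients. A sketch for p = 5 (the helper names are our own):

```python
# Eisenstein's criterion over R = Z, applied to Phi_p(x + 1).
from math import comb

def eisenstein(coeffs, p):
    """coeffs: integers, lowest degree first. Check conditions (1)-(3)."""
    *low, lead = coeffs
    return (all(c % p == 0 for c in low)   # p divides a_0, ..., a_{n-1}
            and lead % p != 0              # p does not divide a_n
            and low[0] % (p * p) != 0)     # p^2 does not divide a_0

p = 5
# Phi_p(x + 1) = x^(p-1) + C(p,1) x^(p-2) + ... + C(p,p-1), lowest degree first:
shifted = [comb(p, p - 1 - i) for i in range(p)]
print(shifted)                         # [5, 10, 10, 5, 1]
print(eisenstein(shifted, p))          # True, so Phi_5 is irreducible
print(eisenstein([1, 1, 1, 1, 1], p))  # False: the criterion fails for Phi_5 itself
```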
Theorem 4.4.11. Let R be a UFD and K its field of fractions. Let f (x) = ∑ni=0 ai xi ∈ R[x]
be a polynomial of degree ≥ 1. Let P be a prime ideal in R with an ∉ P. Let R̄ = R/P, and
let α : R[x] → R̄[x] be defined by
α(∑mi=0 ri xi ) = ∑mi=0 (ri + P)xi .
Then α is an epimorphism. If α(f (x)) is irreducible in R̄[x], then f (x) is irreducible in K[x].
Proof. By Theorem 4.4.2 (c), there is an a ∈ R and a primitive g(x) ∈ R[x] satisfying
f (x) = ag(x). Since an ∉ P, we have that α(a) ≠ 0. Furthermore, the highest coefficient of
g(x) is also not an element of P. If α(g(x)) were reducible, then α(f (x)) would also be
reducible. Thus, α(g(x)) is irreducible. Hence, if the theorem holds for primitive polyno-
mials, then g(x) is irreducible in K[x], and therefore f (x) = ag(x) is also irreducible in
K[x]. Therefore, to prove the theorem, it suffices to consider the case where f (x) is
primitive in R[x].
Now suppose that f (x) is primitive. We show that f (x) is irreducible in R[x] and,
hence, by Theorem 4.4.3 (b), irreducible in K[x]. Suppose, to the contrary, that
f (x) = g(x)h(x), g(x), h(x) ∈ R[x] with h(x), g(x) nonunits in R[x].
Since f (x) is primitive, g, h ∉ R. Therefore, deg g(x) < deg f (x), and deg h(x) < deg f (x).
Now we have α(f (x)) = α(g(x))α(h(x)). Since P is a prime ideal, R/P is an integral
domain. Therefore, in R̄[x] we have
deg α(f (x)) = deg α(g(x)) + deg α(h(x)).
Now
deg α(f (x)) = deg f (x) since an ∉ P, while deg α(g(x)) ≤ deg g(x) and deg α(h(x)) ≤ deg h(x).
Therefore, deg α(g(x)) = deg g(x) ≥ 1, and deg α(h(x)) = deg h(x) ≥ 1. Therefore, α(f (x)) is
reducible, and we have a contradiction.
It is important to note that the reducibility of α(f (x)) does not imply that f (x) is
reducible. For example, f (x) = x2 + 1 is irreducible in ℤ[x]. However, in ℤ2 [x], we have
x2 + 1 = (x + 1)2
Suppose that in ℤ2 [x], we have α(f (x)) = g(x)h(x). Without loss of generality, we may
assume that g(x) is of degree 1 or 2.
If deg g(x) = 1, then α(f (x)) has a zero c in ℤ2 . The two possibilities for c are
c = 0, or c = 1. Then the following hold:
If c = 0, then 0 + 0 + 1 = 1 ≠ 0.
If c = 1, then 1 + 1 + 1 = 1 ≠ 0.
Suppose deg g(x) = 2. The polynomials of degree 2 over ℤ2 have the form
x2 + x + 1, x2 + x, x2 + 1, x2 .
The last three, x2 + x, x2 + 1, and x2 , all have zeros in ℤ2 . Therefore, they cannot divide
α(f (x)), so g(x) would have to be x2 + x + 1. Applying the division algorithm, we obtain
a nonzero remainder and, therefore, x2 + x + 1 does not divide α(f (x)). It follows that α(f (x)) is irreducible,
and from the previous theorem, f (x) must be irreducible in ℚ[x].
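Irreducibility over ℤ2 can always be settled by exhaustive trial division, since there are only finitely many candidate factors. The sketch below (our own helpers; the sample quartic x4 + x + 1 is our choice and is not taken from the text) implements the zero test and the quadratic-factor test in one loop:

```python
# Exhaustive irreducibility test over Z_2: try every monic divisor of
# degree between 1 and deg(f)/2. Coefficient lists, lowest degree first.
def polydiv_mod2(f, g):
    """Remainder of f modulo g over Z_2 (g monic)."""
    f = f[:]
    while len(f) >= len(g):
        if f[-1]:  # subtract g * x^(deg f - deg g); over Z_2 this is XOR
            shift = len(f) - len(g)
            for i, c in enumerate(g):
                f[i + shift] ^= c
        f.pop()
    return f

def irreducible_mod2(f):
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for bits in range(1 << d):
            g = [bits >> i & 1 for i in range(d)] + [1]  # monic of degree d
            if not any(polydiv_mod2(f, g)):
                return False  # g divides f
    return True

print(irreducible_mod2([1, 1, 0, 0, 1]))  # x^4 + x + 1 over Z_2: True
print(irreducible_mod2([1, 0, 1]))        # x^2 + 1 = (x + 1)^2: False
```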
4.5 Exercises
1. For which a, b ∈ ℤ does the polynomial x2 + 3x + 1 divide the polynomial x3 + x2 +
ax + b?
2. Let a + bi ∈ ℂ be a zero of f (x) ∈ ℝ[x]. Show that also a − ib is a zero of f (x).
3. Determine all quadratic irreducible polynomials over ℝ.
4. Let R be an integral domain, I ⊲ R an ideal, and f ∈ R[x] a monic polynomial.
Consider the mapping R[x] → (R/I)[x], f = ∑ ai xi ⟼ f ̄ = ∑ aī xi , where
ā := a + I. Show that if f ̄ ∈ (R/I)[x] is irreducible, then f ∈ R[x] is also irreducible.
5. Decide if the following polynomials f ∈ R[x] are irreducible:
(i) f (x) = x3 + 2x2 + 3, R = ℤ.
(ii) f (x) = x5 − 2x + 1, R = ℚ.
(iii) f (x) = 3x4 + 7x2 + 14x + 7, R = ℚ.
(iv) f (x) = x7 + (3 − i)x2 + (3 + 4i)x + 4 + 2i, R = ℤ[i].
(v) f (x) = x4 + 3x3 + 2x2 + 3x + 4, R = ℚ.
(vi) f (x) = 8x3 − 4x2 + 2x − 1, R = ℤ.
6. Let R be an integral domain with characteristic 0, let k ≥ 1, and let α ∈ R. In R[x],
define the derivative of f (x) = ∑ni=0 ai xi by f ′ (x) := ∑ni=1 iai xi−1 , and the higher
derivatives f (k) (x), k = 0, 1, 2, . . . , by
f (0) (x) := f (x),
f (k) (x) := (f (k−1) )′ (x).
Show that α is a zero of order k of the polynomial f (x) ∈ R[x] if and only if
f (α) = f (1) (α) = ⋅ ⋅ ⋅ = f (k−1) (α) = 0, but f (k) (α) ≠ 0.
7. Prove that the set of integral polynomials with even constant term is an ideal, but
not principal.
8. Let p be a prime number. Prove that p|(pi ) for 1 ≤ i ≤ p − 1, where (pi ) denotes the binomial coefficient.
Definition 5.1.1. If K, L are fields with K ⊂ L, then we say that L is a field extension or
extension field of K. We denote this by L|K.
Note that this is equivalent to having a field monomorphism
i : K → L.
Definition 5.1.2. If L is an extension field of K, then the degree of the extension L|K
is defined as the dimension, dimK (L), of L, as a vector space over K. We denote the
degree by |L : K|. The field extension L|K is a finite extension if the degree |L : K| is
finite.
Proof. Every complex number can be written uniquely as a+ib, where a, b ∈ ℝ. Hence,
the elements 1, i constitute a basis for ℂ over ℝ and, therefore, the dimension is 2. That
is, |ℂ : ℝ| = 2.
The fact that |ℝ : ℚ| = ∞ depends on the existence of transcendental numbers.
An element r ∈ ℝ is algebraic (over ℚ) if it satisfies some nonzero polynomial with
coefficients from ℚ. That is, P(r) = 0, where
0 ≠ P(x) = a0 + a1 x + ⋅ ⋅ ⋅ + an xn with ai ∈ ℚ.
If L|K and L1 |K1 are field extensions, then they are isomorphic field extensions if
there exists a field isomorphism f : L → L1 such that f|K is an isomorphism from K to
K1 .
Suppose that K ⊂ L ⊂ M are fields. Below we show that the degrees multiply. In
this situation, where K ⊂ L ⊂ M, we call L an intermediate field.
|M : K| = |M : L||L : K|.
Proof. Let {xi : i ∈ I} be a basis for L as a vector space over K, and let {yj : j ∈ J} be a
basis for M as a vector space over L. To prove the result, it is sufficient to show that the
set
B = {xi yj : i ∈ I, j ∈ J}
is a basis for M as a vector space over K. To show this, we must show that B is a linearly
independent set over K, and that B spans M.
Suppose that
∑i,j kij xi yj = 0 with kij ∈ K and only finitely many kij nonzero; that is, ∑j (∑i kij xi )yj = 0.
But ∑i kij xi ∈ L. Since {yj : j ∈ J} is a basis for M over L, the yj are independent over
L; hence, for each j, we get ∑i kij xi = 0. Now since {xi : i ∈ I} is a basis for L over K, it
follows that the xi are linearly independent, and since for each j we have ∑i kij xi = 0,
it must be that kij = 0 for all i and for all j. Therefore, the set B is linearly independent
over K.
Now suppose that m ∈ M. Then since {yj : j ∈ J} spans M over L, we have
m = ∑j cj yj with cj ∈ L.
Since {xi : i ∈ I} spans L over K, each cj can be written as cj = ∑i kij xi with kij ∈ K.
Substituting, we obtain
m = ∑ij kij xi yj
and, hence, B spans M over K. Therefore, B is a basis for M over K, and the result is
proved.
Corollary 5.1.6.
(a) If |L : K| is a prime number, then there exists no proper intermediate field between
L and K.
(b) If K ⊂ L and |L : K| = 1, then L = K.
Let L|K be a field extension, and suppose that A ⊂ L. Then certainly there are sub-
rings of L containing both A and K, for example L. We denote by K[A] the intersection
of all subrings of L containing both K and A. Since the intersection of subrings is a
subring, it follows that K[A] is a subring containing both K and A and the smallest
such subring. We call K[A] the ring adjunction of A to K.
Analogously, K(A) denotes the intersection of all subfields of L containing both K
and A; it is the smallest subfield of L containing K and A and is called the field
adjunction of A to K. We write K(a1 , . . . , an ) for K({a1 , . . . , an }).
Definition 5.1.7. The field extension L|K is finitely generated if there exist a1 , . . . ,
an ∈ L such that L = K(a1 , . . . , an ). The extension L|K is a simple extension if there
is an a ∈ L with L = K(a). In this case, a is called a primitive element of L|K.
For the remainder of this section, we assume that L|K is a field extension.
Proof. Suppose that L|K is a finite extension and a ∈ L. We must show that a is alge-
braic over K. Suppose that |L : K| = n < ∞, then dimK (L) = n. It follows that any n + 1
elements of L are linearly dependent over K.
Now consider the elements 1, a, a2 , . . . , an in L. These are n + 1 elements
of L, so they are linearly dependent over K. Hence, there exist c0 , . . . , cn ∈ K, not all
zero, such that
c0 + c1 a + ⋅ ⋅ ⋅ + cn an = 0.
Hence, a is algebraic over K.
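As a concrete illustration of this dependency (our own example, not from the text): a = √2 + √3 lies in the degree-4 extension ℚ(√2, √3), so 1, a, a2 , a3 , a4 must be linearly dependent over ℚ; indeed a4 − 10a2 + 1 = 0, which one can confirm numerically:

```python
# The powers 1, a, ..., a^4 of a = sqrt(2) + sqrt(3) are linearly dependent
# over Q: a satisfies x^4 - 10x^2 + 1. Numerical check up to float error.
from math import isclose, sqrt

a = sqrt(2) + sqrt(3)
value = a ** 4 - 10 * a ** 2 + 1
print(value)  # approximately 0
```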
From the previous theorem, it follows that every finite extension is algebraic. The
converse is not true; that is, there are algebraic extensions that are not finite. We will
give examples in Section 5.4.
The following lemma gives some examples of algebraic and transcendental exten-
sions.
Lemma 5.2.4. ℂ|ℝ is algebraic, but ℝ|ℚ and ℂ|ℚ are transcendental. If K is any field,
then K(x)|K is transcendental.
Definition 5.3.1. Suppose that L|K is a field extension and a ∈ L is algebraic over K.
The polynomial ma (x) ∈ K[x] is the minimal polynomial of a over K if the following
hold:
(1) ma (x) has leading coefficient 1; that is, it is a monic polynomial.
(2) ma (a) = 0.
(3) If f (x) ∈ K[x] with f (a) = 0, then ma (x)|f (x).
Hence, ma (x) is the monic polynomial of minimal degree that has a as a zero.
We prove next that every algebraic element has such a minimal polynomial.
Theorem 5.3.2. Suppose that L|K is a field extension and a ∈ L is algebraic over K. Then
we have:
(1) The minimal polynomial ma (x) ∈ K[x] exists and is irreducible over K.
(2) K[a] ≅ K(a) ≅ K[x]/(ma (x)), where (ma (x)) is the principal ideal in K[x] generated
by ma (x).
(3) |K(a) : K| = deg(ma (x)). Therefore, K(a)|K is a finite extension.
Proof. (1), (2) Consider the evaluation map τ : K[x] → L given by
τ(∑i ki xi ) = ∑i ki ai .
Then τ is a ring homomorphism with image K[a]. Its kernel consists of all polynomials
in K[x] having a as a zero; it is nonzero since a is algebraic. Since K[x] is a principal
ideal domain, the kernel is generated by a monic polynomial ma (x), which is then the
minimal polynomial of a, and K[a] ≅ K[x]/(ma (x)). Because K[a] ⊂ L is an integral
domain, the kernel is a prime ideal, and so ma (x) is irreducible over K.
Since ma (x) is irreducible, we have K[x]/(ma (x)) is a field and, therefore, K[a] = K(a).
(3) Let n = deg(ma (x)). We claim that the elements 1, a, . . . , an−1 are a basis for
K[a] = K(a) over K. First suppose that
∑n−1i=0 ci ai = 0
with ci ∈ K not all zero. This implies that a is a zero of the nonzero polynomial
∑n−1i=0 ci xi of degree < n, contradicting the minimality of deg(ma (x)). Hence, the
elements 1, a, . . . , an−1 are linearly independent over K. That they span K[a] follows
from the division algorithm: for f (x) ∈ K[x], write f (x) = q(x)ma (x) + r(x) with
deg r(x) < n; then f (a) = r(a). Therefore, |K(a) : K| = n = deg(ma (x)).
Theorem 5.3.3. Suppose that L|K is a field extension and a ∈ L is algebraic over K.
Suppose that f (x) ∈ K[x] is a monic polynomial with f (a) = 0. Then f (x) is the minimal
polynomial if and only if f (x) is irreducible in K[x].
Proof. Suppose that f (x) is the minimal polynomial of a. Then f (x) is irreducible from
the previous theorem.
Conversely, suppose that f (x) is monic, irreducible and f (a) = 0. From the previ-
ous theorem ma (x)|f (x). Since f (x) is irreducible, we have f (x) = cma (x) with c ∈ K.
However, since both f (x) and ma (x) are monic, we must have c = 1, and f (x) = ma (x).
Theorem 5.3.4. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a finite extension.
(2) L|K is an algebraic extension, and there exist elements a1 , . . . , an ∈ L such that L =
K(a1 , . . . , an ).
(3) There exist algebraic elements a1 , . . . , an ∈ L such that L = K(a1 , . . . , an ).
Proof. (1) ⇒ (2). We have seen in Theorem 5.2.3 that a finite extension is algebraic.
Suppose that a1 , . . . , an are a basis for L over K. Then clearly L = K(a1 , . . . , an ).
(2) ⇒ (3). If L|K is an algebraic extension and L = K(a1 , . . . , an ), then each ai is
algebraic over K.
(3) ⇒ (1). Suppose that there exist algebraic elements a1 , . . . , an ∈ L such that
L = K(a1 , . . . , an ). We show that L|K is a finite extension. We do this by induction on n.
If n = 1, then L = K(a) for some algebraic element a, and the result follows from The-
orem 5.3.2. Suppose now that n ≥ 2. We assume then that an extension K(a1 , . . . , an−1 )
with a1 , . . . , an−1 algebraic elements is a finite extension. Now suppose that we have
L = K(a1 , . . . , an ) with a1 , . . . , an algebraic elements.
Then
|K(a1 , . . . , an ) : K| = |K(a1 , . . . , an−1 )(an ) : K(a1 , . . . , an−1 )| ⋅ |K(a1 , . . . , an−1 ) : K|.
The second term |K(a1 , . . . , an−1 ) : K| is finite from the inductive hypothesis. The first
term |K(a1 , . . . , an−1 )(an ) : K(a1 , . . . , an−1 )| is also finite from Theorem 5.3.2 since it is
a simple extension of the field K(a1 , . . . , an−1 ) by the algebraic element an . Therefore,
|K(a1 , . . . , an ) : K| is finite.
Theorem 5.3.5. Suppose that K is a field and R is an integral domain with K ⊂ R. Then
R can be viewed as a vector space over K. If dimK (R) < ∞, then R is a field.
Proof. Let r0 ∈ R with r0 ≠ 0. Define the map τ : R → R by
τ(r) = rr0 .
It is easy to show (see exercises) that this is a linear transformation from R to R, con-
sidered as a vector space over K.
Suppose that τ(r) = 0. Then rr0 = 0 and, hence, r = 0 since r0 ≠ 0 and R is an
integral domain. It follows that τ is an injective map. Since R is a finite dimensional
vector space over K, and τ is an injective linear transformation, it follows that τ must
also be surjective. This implies that there exists an r1 with τ(r1 ) = 1. Then r1 r0 = 1 and,
hence, r0 has an inverse within R. Since r0 was an arbitrary nonzero element of R, it
follows that R is a field.
Proof. If M|K is algebraic, then certainly M|L and L|K are algebraic.
Now suppose that M|L and L|K are algebraic. We show that M|K is algebraic. Let
a ∈ M. Then since a is algebraic over L, there exist b0 , b1 , . . . , bn ∈ L with
b0 + b1 a + ⋅ ⋅ ⋅ + bn an = 0.
Hence, a is algebraic over the field K(b0 , b1 , . . . , bn ). Since each bi is algebraic over K,
the extension K(b0 , . . . , bn )|K is finite by Theorem 5.3.4, and therefore K(b0 , . . . , bn , a)|K
is finite. By Theorem 5.2.3, a is then algebraic over K.
Theorem 5.4.1. Suppose that L|K is a field extension, and let 𝒜K denote the set of all
elements of L that are algebraic over K. Then 𝒜K is a subfield of L. 𝒜K is called the
algebraic closure of K in L.
Theorem 5.4.2. Let 𝒜 be the algebraic closure of the rational numbers ℚ within the
complex numbers ℂ. Then 𝒜 is an algebraic extension of ℚ, but |𝒜 : ℚ| = ∞.
We will let 𝒜 denote the totality of algebraic numbers within the complex num-
bers ℂ, and 𝒯 the set of transcendentals so that ℂ = 𝒜 ∪ 𝒯 . In the language of the last
subsection, 𝒜 is the algebraic closure of ℚ within ℂ. As in the general case, if α ∈ ℂ is
algebraic, we will let mα (x) denote the minimal polynomial of α over ℚ.
We now examine the sets 𝒜 and 𝒯 more closely. Since 𝒜 is precisely the algebraic
closure of ℚ in ℂ, we have from our general result that 𝒜 actually forms a subfield
of ℂ. Furthermore, since the intersection of subfields is again a subfield, it follows
that the real algebraic numbers 𝒜 ∩ ℝ form a subfield of the reals.
Theorem 5.5.2. The set 𝒜 of algebraic numbers forms a subfield of ℂ. The subset
𝒜 ∩ ℝ of real algebraic numbers forms a subfield of ℝ.
Since each rational is algebraic, it is clear that there are algebraic numbers. Fur-
thermore, there are irrational algebraic numbers, √2 for example, since it satisfies the
irreducible polynomial x2 − 2 = 0 over ℚ. On the other hand, we have not examined
the question of whether transcendental numbers really exist. To show that any par-
ticular complex number is transcendental is, in general, quite difficult. However, it is
relatively easy to show that there are uncountably infinitely many transcendentals.
Theorem 5.5.3. The set 𝒜 of algebraic numbers is countably infinite. Therefore, 𝒯 , the
set of transcendental numbers, and 𝒯 ∩ ℝ, the set of real transcendental numbers, are
uncountably infinite.
Proof. Let 𝒫n denote the set of rational polynomials of degree at most n. Associating
to each such polynomial its tuple of coefficients identifies 𝒫n with a subset of
ℚn+1 = ℚ × ℚ × ⋅ ⋅ ⋅ × ℚ.
Since a finite Cartesian product of countable sets is still countable, it follows that 𝒫n
is a countable set.
Now let
ℬn = ⋃p(x)∈𝒫n {zeros of p(x)};
that is, ℬn is the union of all zeros in ℂ of all rational polynomials of degree ≤ n. Since
each such p(x) has a maximum of n zeros, and since 𝒫n is countable, it follows that ℬn
is a countable union of finite sets and, hence, is still countable. Now
𝒜 = ⋃∞n=1 ℬn ,
a countable union of countable sets; hence, 𝒜 is countable. Since ℂ is uncountable and
ℂ = 𝒜 ∪ 𝒯 , the set 𝒯 must be uncountably infinite, and the same argument applies to
𝒯 ∩ ℝ within ℝ.
From Theorem 5.5.3, we know that there exist infinitely many transcendental numbers.
Liouville, in 1851, gave the first proof of the existence of transcendentals by exhibiting
a few. He gave the following as one example: the number
c = ∑∞k=1 1/10k! = 0.110001000 . . .
is transcendental.
Suppose that c were algebraic, and let f(x) = ∑_{j} m_j x^j be an integral polynomial of degree n with f(c) = 0. Let c_k = ∑_{j=1}^{k} 1/10^{j!} be the k-th partial sum of the series for c, and let B be a bound for |f′(x)| on [0, 1]. By the mean value theorem,
f(c) − f(c_k) = (c − c_k)f′(ζ)
for some ζ with c_k < ζ < c < 1. Now since 0 < ζ < 1, we have
|c − c_k||f′(ζ)| < 2B/10^{(k+1)!},
because c − c_k = ∑_{j>k} 1/10^{j!} < 2/10^{(k+1)!}.
On the other hand, since f(x) can have at most n zeros, it follows that for all k large enough, we would have f(c_k) ≠ 0. Since f(c) = 0, we have
|f(c) − f(c_k)| = |f(c_k)| = |∑_{j} m_j c_k^j| ≥ 1/10^{nk!},
since for each j, m_j c_k^j is a rational number with denominator 10^{jk!}, so f(c_k) is a nonzero rational with denominator dividing 10^{nk!}. However, if k is chosen sufficiently large and n is fixed, we have
1/10^{nk!} > 2B/10^{(k+1)!},
contradicting the equality from the mean value theorem. Therefore, c is transcendental.
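For readers who want to experiment, the inequalities in this argument can be checked with exact rational arithmetic. The following Python sketch (added here as an illustration; the name `partial_sum` is ours, and a high partial sum stands in for c itself) verifies that each c_k has denominator 10^{k!} yet approximates c to within 2/10^{(k+1)!}.

```python
from fractions import Fraction
from math import factorial

def partial_sum(k):
    # c_k = sum_{j=1}^{k} 10^(-j!), the k-th partial sum of Liouville's constant
    return sum(Fraction(1, 10 ** factorial(j)) for j in range(1, k + 1))

# Use c_6 as a stand-in for c; the neglected tail is far smaller than 10^(-5000).
c = partial_sum(6)
for k in (2, 3, 4):
    ck = partial_sum(k)
    # c_k is a rational with denominator exactly 10^(k!) ...
    assert ck.denominator == 10 ** factorial(k)
    # ... yet it approximates c to within 2/10^((k+1)!), far better than
    # a fixed algebraic degree n would permit for large k.
    assert 0 < c - ck < Fraction(2, 10 ** factorial(k + 1))
```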
Theorem 5.5.5. Suppose that L|K is a field extension and a ∈ L is transcendental over K.
Then K(a)|K is isomorphic to K(x)|K. Here the isomorphism μ : K(x) → K(a) can be
chosen such that μ(x) = a.
Proof. Define μ : K(x) → K(a) by
μ(f(x)/g(x)) = f(a)/g(a)
for f(x), g(x) ∈ K[x] with g(x) ≠ 0. Since a is transcendental over K, we have g(a) ≠ 0 whenever g(x) ≠ 0, so μ is well-defined. Then μ is a homomorphism, and μ(x) = a. Since μ ≠ 0, it follows that μ is an isomorphism.
5.6 Exercises
1. Let a ∈ ℂ with a3 − 2a + 2 = 0 and b = a2 − a. Compute the minimal polynomial
mb (x) of b over ℚ and compute the inverse of b in ℚ(a).
2. Determine the algebraic closure of ℝ in ℂ(x).
3. Let a_n := 2^{1/2^n} ∈ ℝ, n = 1, 2, 3, . . ., and A := {a_n : n ∈ ℕ} and E := ℚ(A). Show the
following:
(i) |ℚ(a_n) : ℚ| = 2^n.
(ii) |E : ℚ| = ∞.
(iii) E = ⋃_{n=1}^{∞} ℚ(a_n).
(iv) E is algebraic over ℚ.
4. Determine |E : ℚ| for
(i) E = ℚ(√2, √−2).
(ii) E = ℚ(√3, √3 + ³√3).
(iii) E = ℚ((1 + i)/√2, (−1 + i)/√2).
5. Show that ℚ(√2, √3) = {a + b√2 + c√3 + d√6 : a, b, c, d ∈ ℚ}. Determine the degree
of ℚ(√2, √3) over ℚ. Further show that ℚ(√2, √3) = ℚ(√2 + √3).
6. Let K, E be fields and a ∈ E be transcendental over K. Show the following:
(i) Each element of K(a), which is not in K, is transcendental over K.
(ii) a^n is transcendental over K for each n > 1.
(iii) If L := K(a³/(a + 1)), then a is algebraic over L. Determine the minimal polynomial
m_a(x) of a over L.
7. Let K be a field and a ∈ K(x) \ K. Show the following:
(i) x is algebraic over K(a).
(ii) If L is a field with K ⊂ L ⊆ K(x) and if a ∈ L, then |K(x) : L| < ∞.
(iii) a is transcendental over K.
8. Suppose that a ∈ L is algebraic over K. Let
τ(∑_i k_i x^i) = ∑_i k_i a^i,
τ(r) = rr₀.
Greek mathematicians in the classical period posed the problem of constructing cer-
tain geometric figures in the Euclidean plane using only a straightedge and a compass.
These are known as geometric construction problems.
Recall from elementary geometry that using a straightedge and compass, it is pos-
sible to draw a line parallel to a given line segment through a given point, to extend a
given line segment, and to erect a perpendicular to a given line at a given point on that
line. There were other geometric construction problems for which the Greeks could not find straightedge and compass solutions but which, on the other hand, they were never able to prove impossible. In particular, there were four famous insolvable (to the Greeks) construction problems. The first is the squaring of the circle.
This problem is, given a circle, to construct using straightedge and compass a square
having an area equal to that of the given circle. The second is the doubling of the cube.
This problem is, given a cube of given side length, to construct using a straightedge
and compass, a side of a cube having double the volume of the original cube. The third
problem is the trisection of an angle. This problem is to trisect a given angle using only
a straightedge and compass. The final problem is the construction of a regular n-gon.
This problem asks which regular n-gons can be constructed using only straightedge and compass.
By translating each of these problems into the language of field extensions, we can show that each of the first three problems is insolvable in general, and we can give the complete solution to the construction of the regular n-gons.
We now translate the geometric construction problems into the language of field extensions. As a first step, we define a constructible number: a real number α is called constructible if, starting from a segment of unit length, a segment of length |α| can be constructed with straightedge and compass in a finite number of steps.
Our first result is that the set of all constructible numbers forms a subfield of ℝ.
Theorem 6.2.2. The set 𝒞 of all constructible numbers forms a subfield of ℝ. Further-
more, ℚ ⊂ 𝒞 .
Proof. Let 𝒞 be the set of all constructible numbers. Since the given unit length segment is constructible, we have 1 ∈ 𝒞. Therefore, 𝒞 ≠ ∅. Thus, to show that it is a field, we must show that it is closed under the field operations.
Suppose α, β are constructible. We must show then that α ± β, αβ, and α/β for β ≠ 0
are constructible. If α, β > 0, construct a line segment of length |α|. At one end of this
line segment, extend it by a segment of length |β|. This will construct a segment of
length α + β. Similarly, if α > β, lay off a segment of length |β| at the beginning of a
segment of length |α|. The remaining piece will be α − β. By considering cases, we can
do this in the same manner if either α or β, or both, are negative. These constructions
are pictured in Figure 6.1. Therefore, α ± β are constructible.
In Figure 6.2, we show how to construct αβ. Let the line segment OA have length |α|.
Consider a line L through O not coincident with OA. Let OB have length |β| as in the
diagram. Let P be on ray OB so that OP has length 1. Draw AP and then find Q on ray
OA such that BQ is parallel to AP. From similar triangles, we then have
1/|β| = |α|/|OQ| ⟹ |OQ| = |α||β|.
A similar construction, pictured in Figure 6.3, shows that α/β for β ≠ 0 is constructible.
Find OA, OB, OP as above. Now, connect A to B, and let PQ be parallel to AB. From
similar triangles again, we have
1/|β| = |OQ|/|α| ⟹ |OQ| = |α|/|β|.
Let us now consider how a constructible number is found in the plane. Starting
at the origin and using the unit length and the constructions above, we can locate
any point in the plane with rational coordinates. That is, we can construct the point
P = (q1 , q2 ) with q1 , q2 ∈ ℚ. Using only straightedge and compass, any further point in
the plane can be determined in one of the following three ways:
1. The intersection point of two lines, each of which passes through two known
points each having rational coordinates.
2. The intersection point of a line passing through two known points having rational
coordinates and a circle, whose center has rational coordinates, and whose radius
squared is rational.
3. The intersection point of two circles, each of whose centers has rational coordi-
nates, and each of whose radii is the square root of a rational number.
Analytically, the first case involves the solution of a pair of linear equations, each with rational coefficients and, thus, only leads to other rational numbers. In cases two and three, we must solve equations of the form x² + y² + ax + by + c = 0 with a, b, c ∈ ℚ. These
will then be quadratic equations over ℚ and, thus, the solutions will either be in ℚ, or
in a quadratic extension ℚ(√α) of ℚ. Once a real quadratic extension of ℚ is found, the
process can be iterated. Conversely, using the altitude theorem, if α is constructible,
so is √α. A much more detailed description of the constructible numbers can be found
in [42]. We thus can prove the following theorem:
Theorem 6.2.3. A real number α is constructible if and only if α lies in some subfield K_m of ℝ, where ℚ = K₀ ⊂ K₁ ⊂ ⋅⋅⋅ ⊂ K_m and |K_{i+1} : K_i| = 2 for i = 0, . . . , m − 1.
Therefore, the constructible numbers are precisely those real numbers that are contained in repeated quadratic extensions of ℚ. In the next section, we use this idea to show the impossibility of the first three mentioned construction problems.
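A single quadratic step can be made concrete. The following Python sketch (an illustration added here, not part of the book's development) models ℚ(√2) by pairs (a, b) ↔ a + b√2 with exact rational coordinates and checks that this set is closed under multiplication and inversion, as one step of such a quadratic tower.

```python
from fractions import Fraction as F

# a + b*sqrt(2) represented as the pair (a, b) with a, b rational
def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + b r)(c + d r) = (ac + 2bd) + (ad + bc) r,  where r = sqrt(2), r^2 = 2
    return (a * c + 2 * b * d, a * d + b * c)

def inv(x):
    # 1/(a + b r) = (a - b r)/(a^2 - 2 b^2); the norm a^2 - 2b^2 is never 0
    # for rational (a, b) != (0, 0), since sqrt(2) is irrational
    a, b = x
    n = a * a - 2 * b * b
    return (a / n, -b / n)

x = (F(3), F(5))          # 3 + 5*sqrt(2)
y = (F(1, 2), F(-2))      # 1/2 - 2*sqrt(2)
assert mul(x, y) == (F(-37, 2), F(-7, 2))   # the product stays in Q(sqrt 2)
assert mul(x, inv(x)) == (F(1), F(0))       # every nonzero element is invertible
```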
Theorem 6.3.1. It is impossible to square the circle. That is, it is impossible in general,
given a circle, to construct using straightedge and compass a square having area equal
to that of the given circle.
Proof. Suppose the given circle has radius 1. It is then constructible and would have an area of π. A corresponding square would then have to have a side of length √π. To be constructible, a number α must have |ℚ(α) : ℚ| = 2^m < ∞ and, hence, α must be algebraic. However, π is transcendental, so √π is also transcendental (see Section 20.4) and, therefore, not constructible.
Theorem 6.3.2. It is impossible to double the cube. This means that it is impossible in
general, given a cube of given side length, to construct using a straightedge and compass,
a side of a cube having double the volume of the original cube.
Proof. Let the given side length be 1, so that the original volume is also 1. To double this, we would have to construct a side of length 2^{1/3}. However, |ℚ(2^{1/3}) : ℚ| = 3 since the minimal polynomial of 2^{1/3} over ℚ is m_{2^{1/3}}(x) = x³ − 2. This is not a power of 2, so 2^{1/3} is not constructible.
Theorem 6.3.3. It is impossible, in general, to trisect a given angle using only a straightedge and compass.
Proof. We show that the angle π/3 cannot be trisected. Recall the identity cos(3θ) = 4cos³(θ) − 3cos(θ). Let α = cos(π/9). From the above identity, with θ = π/9 and cos(π/3) = 1/2, we have 4α³ − 3α − 1/2 = 0. The polynomial 4x³ − 3x − 1/2 is irreducible over ℚ and, hence, the minimal polynomial of α over ℚ is m_α(x) = x³ − (3/4)x − 1/8. Therefore, |ℚ(α) : ℚ| = 3, which is not a power of 2, so cos(π/9) is not constructible, and the angle π/3 cannot be trisected.
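Irreducibility of the cubic here can be confirmed with the rational root theorem: a rational zero of 8x³ − 6x − 1 (the polynomial 4x³ − 3x − 1/2 with denominators cleared) would have numerator dividing 1 and denominator dividing 8. A short exhaustive check in Python (added here as an illustration):

```python
from fractions import Fraction

# Candidates p/q with p | 1 and q | 8, by the rational root theorem
candidates = [Fraction(s, q) for s in (1, -1) for q in (1, 2, 4, 8)]

def f(x):
    return 8 * x**3 - 6 * x - 1

assert all(f(x) != 0 for x in candidates)
# A cubic with no rational zero is irreducible over Q, so
# |Q(cos(pi/9)) : Q| = 3, which is not a power of 2.
```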
The final construction problem we consider is the construction of regular n-gons. The
algebraic study of the constructibility of regular n-gons was initiated by Gauss in the
early part of the nineteenth century.
Notice first that a regular n-gon will be constructible for n ≥ 3 if and only if the angle 2π/n is constructible, which is the case if and only if the length cos(2π/n) is a constructible number. From our techniques, if cos(2π/n) is a constructible number, then necessarily |ℚ(cos(2π/n)) : ℚ| = 2^m for some m. After we discuss Galois theory, we see that this condition is also sufficient. Therefore, cos(2π/n) is a constructible number if and only if
|ℚ(cos(2π/n)) : ℚ| = 2^m for some m.
The solution of this problem, that is, the determination of when |ℚ(cos(2π/n)) : ℚ| = 2^m, involves two concepts from number theory: the Euler phi-function and Fermat primes.
Definition 6.3.4. For any natural number n, the Euler phi-function is defined by
ϕ(n) = |{k ∈ ℕ : 1 ≤ k ≤ n and (k, n) = 1}|.
In particular, for a prime p and m ≥ 1, the integers up to p^m that are not relatively prime to p^m are exactly the multiples of p, so ϕ(p^m) = p^m − p^{m−1}.
Lemma 6.3.5. If a and b are natural numbers with (a, b) = 1, then ϕ(ab) = ϕ(a)ϕ(b).
Proof. Given a natural number n, a reduced residue system modulo n is a set of integers x₁, . . . , x_k such that each x_i is relatively prime to n, x_i ≢ x_j mod n unless i = j, and if (x, n) = 1 for some integer x, then x ≡ x_i mod n for some i. Clearly, ϕ(n) is the size of a reduced residue system modulo n.
Let R_a = {x₁, . . . , x_{ϕ(a)}} be a reduced residue system modulo a, R_b = {y₁, . . . , y_{ϕ(b)}} be a reduced residue system modulo b, and let
S = {ay_i + bx_j : x_j ∈ R_a, y_i ∈ R_b}.
We claim that S is a reduced residue system modulo ab. Since S has ϕ(a)ϕ(b) elements, it will follow that ϕ(ab) = ϕ(a)ϕ(b).
To show that S is a reduced residue system modulo ab, we must show three things:
first that each x ∈ S is relatively prime to ab; second that the elements of S are distinct;
and, finally, that given any integer n with (n, ab) = 1, then n ≡ s mod ab for some s ∈ S.
Let x = ayi + bxj . Then since (xj , a) = 1 and (a, b) = 1, it follows that (x, a) = 1.
Analogously, (x, b) = 1. Since x is relatively prime to both a and b, we have (x, ab) = 1.
This shows that each element of S is relatively prime to ab.
Next suppose that
ay_i + bx_j ≡ ay_k + bx_l mod ab.
Reducing modulo a gives bx_j ≡ bx_l mod a, and since (b, a) = 1, it follows that x_j ≡ x_l mod a; hence j = l. In the same way, i = k. Therefore, the elements of S are distinct modulo ab.
Finally, let n be an integer with (n, ab) = 1. Since (a, b) = 1, there exist integers x, y with ax + by = 1. Then
anx + bny = n.
Since (x, b) = 1, and (n, b) = 1, it follows that (nx, b) = 1. Therefore, there is an s_i ∈ R_b with nx = s_i + tb. In the same manner, (ny, a) = 1, and so there is an r_j ∈ R_a with ny = r_j + ua. Then
n = a(s_i + tb) + b(r_j + ua) = as_i + br_j + ab(t + u),
so n ≡ as_i + br_j mod ab, an element of S. This completes the proof.
If n = p₁^{e₁} p₂^{e₂} ⋅⋅⋅ p_k^{e_k} is the prime factorization of n, then
ϕ(n) = (p₁^{e₁} − p₁^{e₁−1})(p₂^{e₂} − p₂^{e₂−1}) ⋅⋅⋅ (p_k^{e_k} − p_k^{e_k−1}).
Indeed, since ϕ is multiplicative,
ϕ(n) = ϕ(p₁^{e₁})ϕ(p₂^{e₂}) ⋅⋅⋅ ϕ(p_k^{e_k})
= (p₁^{e₁} − p₁^{e₁−1})(p₂^{e₂} − p₂^{e₂−1}) ⋅⋅⋅ (p_k^{e_k} − p_k^{e_k−1})
= p₁^{e₁}(1 − 1/p₁) ⋅⋅⋅ p_k^{e_k}(1 − 1/p_k) = p₁^{e₁} ⋅⋅⋅ p_k^{e_k} ⋅ (1 − 1/p₁) ⋅⋅⋅ (1 − 1/p_k)
= n ∏_i (1 − 1/p_i).
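The product formula can be compared against a direct count of residues prime to n. The following Python sketch (an illustration added here; the helper names are ours) implements both and checks that they agree.

```python
from math import gcd

def phi_count(n):
    # Direct count of 1 <= k <= n with gcd(k, n) = 1
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def phi_formula(n):
    # n * prod_{p | n} (1 - 1/p), computed with integer arithmetic
    # by trial-dividing out each prime factor p of n
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                     # a remaining prime factor > sqrt(n)
        result -= result // m
    return result

assert all(phi_count(n) == phi_formula(n) for n in range(1, 200))
assert phi_formula(10) == 4 and phi_formula(17) == 16
```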
Moreover, for every natural number n,
∑_{d|n} ϕ(d) = n.
Proof. We first prove the theorem for prime powers and then paste together via the
fundamental theorem of arithmetic.
Suppose that n = p^e for p a prime. Then the divisors of n are 1, p, p², . . . , p^e, so
∑_{d|n} ϕ(d) = ϕ(1) + ϕ(p) + ϕ(p²) + ⋅⋅⋅ + ϕ(p^e) = 1 + (p − 1) + (p² − p) + ⋅⋅⋅ + (p^e − p^{e−1}).
Notice that this sum telescopes; that is, 1 + (p − 1) = p, p + (p² − p) = p², and so on. Hence, the sum is just p^e, and the result is proved for n a prime power.
We now do an induction on the number of distinct prime factors of n. The above
argument shows that the result is true if n has only one distinct prime factor. Assume
that the result is true whenever an integer has fewer than k distinct prime factors, and suppose n = p₁^{e₁} ⋅⋅⋅ p_k^{e_k} has k distinct prime factors. Then n = p^e c, where p = p₁, e = e₁, and c has fewer than k distinct prime factors. By the inductive hypothesis,
∑_{d|c} ϕ(d) = c.
Since (c, p) = 1, the divisors of n are all of the form p^α d₁, where d₁|c and α = 0, 1, . . . , e. It follows that
∑_{d|n} ϕ(d) = ∑_{α=0}^{e} ∑_{d₁|c} ϕ(p^α d₁) = ∑_{α=0}^{e} ϕ(p^α) ∑_{d₁|c} ϕ(d₁) = (∑_{α=0}^{e} ϕ(p^α)) c.
As in the case of prime powers, the sum ∑_{α=0}^{e} ϕ(p^α) telescopes to p^e, giving the final result
∑_{d|n} ϕ(d) = p^e c = n.
Example 6.3.11. Consider n = 10. The divisors are 1, 2, 5, 10. Then ϕ(1) = 1, ϕ(2) = 1, ϕ(5) = 4, ϕ(10) = 4, and
ϕ(1) + ϕ(2) + ϕ(5) + ϕ(10) = 1 + 1 + 4 + 4 = 10 = n.
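The divisor-sum identity is easy to test over a range of n; the following Python sketch (an illustration added here, brute-force and unoptimized) confirms both the n = 10 example and the identity for all n below 200.

```python
from math import gcd

def phi(n):
    # Euler phi by direct count
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def divisor_sum(n):
    # sum of phi(d) over all divisors d of n
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)

assert [phi(d) for d in (1, 2, 5, 10)] == [1, 1, 4, 4]   # the n = 10 example
assert all(divisor_sum(n) == n for n in range(1, 200))
```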
We will see later in the book that the Euler phi-function plays an important role
in the structure theory of abelian groups.
We now turn to Fermat primes.
Definition 6.3.12. The Fermat numbers are the sequence (F_n) of positive integers defined by
F_n = 2^{2^n} + 1, n = 0, 1, 2, 3, . . . .
Fermat believed that all the numbers in this sequence were primes. In fact, F0 , F1 ,
F2 , F3 , F4 are all primes, but F5 is composite and divisible by 641 (see exercises). It is
still an open question whether or not there are infinitely many Fermat primes. It has
been conjectured that there are only finitely many. On the other hand, if a number of the form 2^n + 1 is a prime for some integer n, then it must be a Fermat prime.
Proof. If a is odd, then a^n + 1 is even and, hence, not a prime. Suppose then that a is even and n = kl with k odd and k ≥ 3. Then
(a^{kl} + 1)/(a^l + 1) = a^{(k−1)l} − a^{(k−2)l} + ⋅⋅⋅ + 1,
so a^l + 1 is a proper divisor of a^{kl} + 1, and a^n + 1 is not prime. Hence, if a^n + 1 is prime, then n can have no odd factor greater than 1; that is, n is a power of 2, say n = 2^m, and 2^n + 1 = F_m is a Fermat number.
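The facts quoted above about the first Fermat numbers (and Exercise 6 below) can be verified directly; a small Python sketch with naive trial division, added here as an illustration:

```python
def is_prime(n):
    # Naive trial division; adequate for numbers up to F_4 = 65537
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

F = [2 ** (2 ** n) + 1 for n in range(6)]   # F_0, ..., F_5
assert F[:5] == [3, 5, 17, 257, 65537]
assert all(is_prime(f) for f in F[:5])       # F_0, ..., F_4 are prime
assert F[5] % 641 == 0 and F[5] != 641       # F_5 is composite, divisible by 641
```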
Let μ = e^{2πi/n} denote a primitive n-th root of unity. Then
e^{2πi/n} = cos(2π/n) + i sin(2π/n)
and
μ + 1/μ = 2 cos(2π/n).
Moreover, if n = 2^m p₁^{e₁} ⋅⋅⋅ p_k^{e_k} with m ≥ 1 and p₁, . . . , p_k odd primes, then
ϕ(n) = 2^{m−1} ⋅ (p₁^{e₁} − p₁^{e₁−1})(p₂^{e₂} − p₂^{e₂−1}) ⋅⋅⋅ (p_k^{e_k} − p_k^{e_k−1}).
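Anticipating the criterion that results from this discussion — the regular n-gon is constructible exactly when ϕ(n) is a power of 2 (equivalently, n is a power of 2 times a product of distinct Fermat primes) — a short Python check over small n, added here as an illustration:

```python
from math import gcd

def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def power_of_two(m):
    return m & (m - 1) == 0     # m > 0 is a power of 2 iff it has a single set bit

constructible = [n for n in range(3, 21) if power_of_two(phi(n))]
# The 7-, 9-, 11-, 13-, 14-, 18- and 19-gons fail; 17 (a Fermat prime) succeeds.
assert constructible == [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```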
6.4 Exercises
1. Let ϕ be a given angle. In which of the following cases is the angle ψ constructible
from the angle ϕ by compass and straightedge?
(a) ϕ = π/13, ψ = π/26.
(b) ϕ = π/33, ψ = π/11.
(c) ϕ = π/7, ψ = π/12.
2. (The golden section) In the plane, let AB be a given segment from A to B with
length a. The segment AB should be divided such that the proportion of AB to the
length of the bigger subsegment is equal to the proportion of the length of the
bigger subsegment to the length of the smaller subsegment:
a/b = b/(a − b),
where b is the length of the bigger subsegment. Such a division is called division
by the golden section. If we write b = ax, 0 < x < 1, then 1/x = x/(1 − x), that is, x² = 1 − x.
Do the following:
Do the following:
(a) Show that 1/x = (1 + √5)/2 = α.
(b) Construct the division of AB by the golden section with compass and straight-
edge.
(c) If we divide the radius r > 0 of a circle by the golden section, then the bigger
part of the so divided radius is the side of the regular 10-gon with its 10 vertices
on the circle.
3. Given a regular 10-gon such that the 10 vertices are on the circle with radius R > 0.
Show that the length of each side is equal to the bigger part of the radius divided
by the golden section. Describe the procedure of the construction of the regular
10-gon and 5-gon.
4. Construct the regular 17-gon with compass and straightedge. Hint: We have to construct the number (1/2)(ω + ω⁻¹) = cos(2π/17), where ω = e^{2πi/17}. First, construct the positive zero ω₁ of the polynomial x² + x − 4; we get
ω₁ = (1/2)(√17 − 1) = ω + ω⁻¹ + ω² + ω⁻² + ω⁴ + ω⁻⁴ + ω⁸ + ω⁻⁸.
Continue with
ω₂ = (1/4)(√17 − 1 + √(34 − 2√17)) = ω + ω⁻¹ + ω⁴ + ω⁻⁴.
6. Show: The Fermat numbers F0 , F1 , F2 , F3 , F4 are all prime but F5 is composite and
divisible by 641.
7. Let μ = e^{2πi/n} be a primitive n-th root of unity. Using
e^{2πi/n} = cos(2π/n) + i sin(2π/n),
show that
μ + 1/μ = 2 cos(2π/n).
Definition 7.1.1. Let L|K and L′|K be field extensions. Then a K-isomorphism is an isomorphism τ : L → L′ that is the identity map on K; thus, it fixes each element of K.
Theorem 7.1.2 (Kronecker's theorem). Let K be a field and f(x) ∈ K[x]. Then there exists a finite extension K′ of K, in which f(x) has a zero.
Proof. Suppose that f (x) ∈ K[x]. We know that f (x) factors into irreducible polynomi-
als. Let p(x) be an irreducible factor of f (x). From the material in Chapter 4, we know
that since p(x) is irreducible, the principal ideal ⟨p(x)⟩ in K[x] is a maximal ideal. To
see this, suppose that g(x) ∉ ⟨p(x)⟩, so that g(x) is not a multiple of p(x). Since p(x) is
irreducible, it follows that (p(x), g(x)) = 1. Thus, there exist h(x), k(x) ∈ K[x] with
h(x)p(x) + k(x)g(x) = 1.
The element on the left is in the ideal (g(x), p(x)), so the identity, 1, is in this ideal.
Therefore, the whole ring K[x] is in this ideal. Since g(x) was arbitrary, this implies
that the principal ideal ⟨p(x)⟩ is maximal.
Now let K′ = K[x]/⟨p(x)⟩. Since ⟨p(x)⟩ is a maximal ideal, it follows that K′ is a field. We show that K can be embedded in K′, and that p(x) has a zero in K′.
First, consider the map α : K[x] → K′ given by α(f(x)) = f(x) + ⟨p(x)⟩. This is a homomorphism. Since the identity element 1 ∈ K is not in ⟨p(x)⟩, it follows that α restricted to K is nontrivial. Therefore, α restricted to K is a monomorphism, since if ker(α|K) ≠ K, then ker(α|K) = {0}. Therefore, K can be embedded into α(K), which is contained in K′.
a₀ + a₁α + ⋅⋅⋅ + a_nα^n = 0.
Then on K(α), define addition and subtraction componentwise, and define multiplication by algebraic manipulation, replacing any power α^m with m ≥ n by lower powers using the relation a₀ + a₁α + ⋅⋅⋅ + a_nα^n = 0.
We claim that K′ = K(α) then forms a field of finite degree over K. The basic ring properties follow easily by computation (see exercises) using the definitions. We must show then that every nonzero element of K(α) has a multiplicative inverse. Let g(α) ∈ K(α) with g(α) ≠ 0. Then the corresponding polynomial g(x) ∈ K[x] is a polynomial of degree ≤ n − 1. Since f(x) is irreducible of degree n, it follows that f(x) and g(x) must be relatively prime; that is, (f(x), g(x)) = 1. Hence, there exist a(x), b(x) ∈ K[x] with
a(x)f(x) + b(x)g(x) = 1.
Substituting α and using f(α) = 0 gives
b(α)g(α) = 1.
Now b(α) might have degree higher than n − 1 in α. However, using the relation f(α) = 0, we can rewrite b(α) as b̄(α), where b̄(α) now has degree ≤ n − 1 in α and, hence, is in K(α). Therefore,
b̄(α)g(α) = 1;
hence, g(α) has a multiplicative inverse. It follows that K(α) is a field and, by definition, f(α) = 0. The elements 1, α, . . . , α^{n−1} form a basis for K(α) over K and, hence,
|K(α) : K| = n.
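The inverse b(α) in this proof is produced by the extended Euclidean algorithm in K[x]. The following Python sketch (an illustration added here, not the book's construction; polynomial helpers are ours) carries this out over ℚ for the polynomial of Exercise 1 in Section 5.6, inverting b = a² − a where a³ − 2a + 2 = 0.

```python
from fractions import Fraction as F

# Elements of K(a) = Q[x]/(p(x)) as coefficient lists [c0, c1, ...], lowest power first.
def trim(a):
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def psub(a, b):
    n = max(len(a), len(b))
    a = a + [F(0)] * (n - len(a))
    b = b + [F(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def pmul(a, b):
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def pdivmod(a, b):
    a = trim(a[:])
    q = [F(0)] * max(1, len(a) - len(b) + 1)
    while len(a) >= len(b) and a != [F(0)]:
        c, s = a[-1] / b[-1], len(a) - len(b)
        q[s] += c
        a = psub(a, [F(0)] * s + [c * y for y in b])
    return trim(q), a

def inverse_mod(g, p):
    # Extended Euclid in Q[x]: since p is irreducible and does not divide g,
    # (p, g) = 1, so there are a, b with a*p + b*g = 1; then b*g = 1 mod p.
    r0, r1, t0, t1 = p, g, [F(0)], [F(1)]
    while r1 != [F(0)]:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, psub(t0, pmul(q, t1))
    lead = r0[-1]              # the gcd is a nonzero constant; normalize it to 1
    return trim([t / lead for t in t0])

p = [F(2), F(-2), F(0), F(1)]      # p(x) = x^3 - 2x + 2
g = [F(0), F(-1), F(1)]            # g(x) = x^2 - x, i.e. b = a^2 - a
binv = inverse_mod(g, p)
_, remainder = pdivmod(pmul(g, binv), p)
assert remainder == [F(1)]         # b * b^(-1) = 1 in Q[x]/(p(x))
```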
Example 7.1.3. Let f(x) = x² + 1 ∈ ℝ[x]. This is irreducible over ℝ. We construct the field, in which this has a zero. Let K′ = ℝ[x]/⟨x² + 1⟩, and let α ∈ K′ with f(α) = 0. The extension field ℝ(α) then has the form
K′ = ℝ(α) = {x + αy : x, y ∈ ℝ, α² = −1}.
It is clear that this field is ℝ-isomorphic to the complex numbers ℂ; that is, ℝ(α) ≅ ℝ(i) ≅ ℂ.
Theorem 7.1.4. Let p(x) ∈ K[x] be an irreducible polynomial, and let K′ = K(α) be the extension field of K constructed in Kronecker's theorem, in which p(x) has a zero α. Let L be an extension field of K, and suppose that a ∈ L is algebraic with minimal polynomial m_a(x) = p(x). Then K(α) is K-isomorphic to K(a).
Proof. If L|K is a field extension and a ∈ L with p(a) = 0 and if deg(p(x)) = n, then the elements 1, a, . . . , a^{n−1} constitute a basis for K(a) over K, and the elements 1, α, . . . , α^{n−1} constitute a basis for K(α) over K. The mapping
τ : K(a) → K(α), τ(c₀ + c₁a + ⋅⋅⋅ + c_{n−1}a^{n−1}) = c₀ + c₁α + ⋅⋅⋅ + c_{n−1}α^{n−1},
is then a K-isomorphism.
f(x) = b(x − a₁) ⋅⋅⋅ (x − a_n).
Proof. Suppose that each nonconstant polynomial in K[x] has a zero in K. Let f(x) ∈ K[x] with deg(f(x)) = n. Suppose that a₁ is a zero of f(x); then
f(x) = (x − a₁)h(x)
with deg(h(x)) = n − 1. Applying the same argument to h(x) and continuing in this manner, f(x) factors completely into linear factors. Hence, (1) implies (2).
Now suppose (2); that is, that each nonconstant polynomial in K[x] factors into
linear factors over K. Suppose that f (x) is irreducible. If deg(f (x)) > 1, then f (x) factors
into linear factors and, hence, is not irreducible. Therefore, f (x) must be of degree 1,
and (2) implies (3).
Now suppose that an element of K[x] is irreducible if and only if it is of degree one,
and suppose that L|K is an algebraic extension. Let a ∈ L. Then a is algebraic over K.
Its minimal polynomial ma (x) is monic and irreducible over K and, hence, from (3),
is linear. Therefore, ma (x) = x − a ∈ K[x]. It follows that a ∈ K and, hence, K = L.
Therefore, (3) implies (4).
Finally, suppose that whenever L|K is an algebraic extension, then L = K. Suppose
that f (x) is a nonconstant polynomial in K[x]. From Kronecker’s theorem, there exists
a field extension L, and a ∈ L with f (a) = 0. However, L is an algebraic extension.
Therefore, by supposition, K = L. Therefore, a ∈ K, and f (x) has a zero in K. Therefore,
(4) implies (1), completing the proof.
In the next section, we will prove that given a field K, we can always find an extension field of K with the properties of the last theorem.
Theorem 7.2.2. A field K is algebraically closed if and only if it satisfies any one of the following conditions:
(1) Each nonconstant polynomial in K[x] has a zero in K.
(2) Each nonconstant polynomial in K[x] factors into linear factors over K. That is, for each f(x) ∈ K[x], there exist elements a₁, . . . , a_n, b ∈ K with
f(x) = b(x − a₁) ⋅⋅⋅ (x − a_n).
(3) An element of K[x] is irreducible if and only if it is of degree one.
(4) If L|K is an algebraic extension, then L = K.
The prime example of an algebraically closed field is the field ℂ of complex num-
bers. The fundamental theorem of algebra says that any nonconstant complex poly-
nomial has a complex zero.
We now show that the algebraic closure of one field within an algebraically closed
field is algebraically closed. First, we define a general algebraic closure.
Theorem 7.2.4. Let K be a field and L|K an extension of K with L algebraically closed. Let K̄ = 𝒜_K be the algebraic closure of K within L. Then K̄ is an algebraic closure of K.
Proof. Let K̄ = 𝒜_K be the algebraic closure of K within L. We know that K̄|K is algebraic. Therefore, we must show that K̄ is algebraically closed.
Let f(x) be a nonconstant polynomial in K̄[x]. Then f(x) ∈ L[x]. Since L is algebraically closed, f(x) has a zero a in L. Since f(a) = 0 and f(x) ∈ K̄[x], it follows that a is algebraic over K̄. However, K̄ is algebraic over K. Therefore, a is also algebraic over K. Hence, a ∈ K̄, and f(x) has a zero in K̄. Therefore, K̄ is algebraically closed.
We want to note the distinction between being algebraically closed and being an
algebraic closure.
Lemma 7.2.5. The complex numbers ℂ are an algebraic closure of ℝ, but not an algebraic closure of ℚ. An algebraic closure of ℚ is 𝒜, the field of algebraic numbers within ℂ.
We now show that every field has an algebraic closure. To do this, we first show
that any field can be embedded into an algebraically closed field.
Theorem 7.2.6. Let K be a field. Then K can be embedded into an algebraically closed
field.
Proof. We show first that there is an extension field L of K, in which each nonconstant
polynomial f (x) ∈ K[x] has a zero in L.
Assign to each nonconstant f(x) ∈ K[x] a symbol y_f, let R = K[{y_f}] be the polynomial ring over K in these (infinitely many) variables, and consider the ideal
I = {∑_{j=1}^{n} f_j(y_{f_j}) r_j : r_j ∈ R, f_j(x) ∈ K[x] nonconstant}.
We claim that I ≠ R. Suppose, on the contrary, that 1 ∈ I; then
1 = g₁f₁(y_{f₁}) + ⋅⋅⋅ + g_nf_n(y_{f_n}),
where g_i ∈ R.
In the n polynomials g₁, . . . , g_n, there are only a finite number of variables, say for example,
y_{f₁}, . . . , y_{f_n}, . . . , y_{f_m}.
Hence,
1 = ∑_{i=1}^{n} g_i(y_{f₁}, . . . , y_{f_m}) f_i(y_{f_i}). (∗)
f (yf + M) = f (yf ) + M.
K ⊂ K1 (= L) ⊂ K2 ⊂ ⋅ ⋅ ⋅
Proof. Let K̂ be an algebraically closed field containing K, which exists from Theorem 7.2.6.
Now let K̄ = 𝒜_K be the set of elements of K̂ that are algebraic over K. From Theorem 7.2.4, K̄ is an algebraic closure of K.
The following lemma is straightforward. We leave the proof to the exercises.
Proof. This is a generalized version of Theorem 7.1.4. If b ∈ K(a), then from the construction of K(a), there is a polynomial g(x) ∈ K[x] with b = g(a). Define a map
ψ : K(a) → K′(a′)
by
ψ(b) = ϕ(g(x))(a′).
If also b = h(a), then g(x) − h(x) is divisible by the minimal polynomial f(x) of a. Since ϕ(f(x))(a′) = 0, this implies that ϕ(g(x))(a′) = ϕ(h(x))(a′); hence, the map ψ is well-defined.
It is easy to show that ψ is a homomorphism. Let b₁ = g₁(a), b₂ = g₂(a). Then b₁b₂ = (g₁g₂)(a). Hence, ψ(b₁b₂) = ϕ((g₁g₂)(x))(a′) = ϕ(g₁(x))(a′)ϕ(g₂(x))(a′) = ψ(b₁)ψ(b₂), and similarly for sums.
Before we give the proof, we note that the theorem gives the following diagram:
Now the set ℳ is nonempty since (K, ϕ) ∈ ℳ. Order ℳ by (M₁, τ₁) < (M₂, τ₂) if M₁ ⊂ M₂ and (τ₂)|_{M₁} = τ₁. Let
𝒦 = {(M_i, τ_i) : i ∈ I}
be a chain in ℳ, and let M = ⋃_{i∈I} M_i with τ : M → L₁ defined by τ|_{M_i} = τ_i.
It is clear that (M, τ) is an upper bound for the chain 𝒦. Since each chain has an upper bound, it follows from Zorn's lemma that ℳ has a maximal element (N, ρ). We show that N = L.
Suppose that N ⊊ L. Let a ∈ L \ N. Then a is algebraic over N and further algebraic over K, since L|K is algebraic. Let m_a(x) ∈ N[x] be the minimal polynomial of a relative to N. Since L₁ is algebraically closed, ρ(m_a(x)) has a zero a′ ∈ L₁. Therefore, there is a monomorphism ρ′ : N(a) → L₁ with ρ′ restricted to N the same as ρ. It follows that (N, ρ) < (N(a), ρ′) since a ∉ N. This contradicts the maximality of (N, ρ). Therefore, N = L, completing the proof.
Combining the previous two theorems, we can now prove that any two algebraic closures of a field K are unique up to K-isomorphism; that is, up to an isomorphism that is the identity on K.
Theorem 7.2.11. Let L₁ and L₂ be algebraic closures of the field K. Then there is a K-isomorphism τ : L₂ → L₁. Again by K-isomorphism, we mean that τ is the identity on K.
Corollary 7.2.12. Let L|K and L′|K be field extensions with a ∈ L and a′ ∈ L′ algebraic elements over K. Then K(a) is K-isomorphic to K(a′) if and only if |K(a) : K| = |K(a′) : K|, and there is an element ã ∈ K(a′) with m_ã(x) = m_a(x).
We have just seen that given an irreducible polynomial over a field K, we could always
find a field extension, in which this polynomial has a zero. We now push this further
to obtain field extensions, where a given polynomial has all its zeros.
Theorem 7.3.2. If K is a field and 0 ≠ f (x) ∈ K[x], then there exists a splitting field for
f (x) over K.
Proof. The splitting field is constructed by repeatedly adjoining zeros. Suppose, without loss of generality, that f(x) is irreducible of degree n over K. From Theorem 7.1.2, there exists a field K′ containing α with f(α) = 0. Then f(x) = (x − α)g(x) ∈ K′[x] with deg g(x) = n − 1. By an inductive argument, g(x) has a splitting field; therefore, so does f(x).
Definition 7.3.3. A group G is a set with one binary operation, which we will denote
by multiplication, such that the following hold:
(1) The operation is associative; that is, (g1 g2 )g3 = g1 (g2 g3 ) for all g1 , g2 , g3 ∈ G.
(2) There exists an identity for this operation; that is, an element 1 such that 1g = g
for each g ∈ G.
(3) Each g ∈ G has an inverse for this operation; that is, for each g, there exists a g −1
with the property that gg −1 = 1.
If in addition the operation is commutative (g₁g₂ = g₂g₁ for all g₁, g₂ ∈ G), the group G is called an abelian group. The order of G is the number of elements in G, denoted |G|. If |G| < ∞, G is a finite group. H ⊂ G is a subgroup if H is also a group under the same operation as G. Equivalently, H is a subgroup if H ≠ ∅, and H is closed under the operation and inverses.
Groups most often arise from invertible mappings of a set onto itself. Such map-
pings are called permutations.
Theorem 7.3.5. For any set T, ST forms a group under composition called the symmetric
group on T. If T, T1 have the same cardinality (size), then ST ≅ ST1 . If T is a finite set with
|T| = n, then ST is a finite group, and |ST | = n!.
Proof. If ST is the set of all permutations on the set T, we must show that composition
is an operation on ST that is associative and has an identity and inverses.
Let f , g ∈ ST . Then f , g are one-to-one mappings of T onto itself. Consider f ∘ g :
T → T. If f ∘ g(t1 ) = f ∘ g(t2 ), then f (g(t1 )) = f (g(t2 )), and g(t1 ) = g(t2 ), since f is
one-to-one. But then t1 = t2 since g is one-to-one.
Each f ∈ S_T can be written in the two-row form
f = (t₁ ⋅⋅⋅ t_n ; f(t₁) ⋅⋅⋅ f(t_n)),
with the top row listing the elements of T and the bottom row their images.
For t1 , there are n choices for f (t1 ). For t2 , there are only n − 1 choices since f is one-to-
one. This continues down to only one choice for tn . Using the multiplication principle,
the number of choices for f and, therefore, the size of ST is
n(n − 1) ⋅ ⋅ ⋅ 1 = n!.
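The count |S_T| = n! is easy to confirm by listing all bijections of a small set; a Python sketch added here as an illustration:

```python
from itertools import permutations

# All bijections of a 4-element set onto itself, as image tuples
T = (1, 2, 3, 4)
perms = list(permutations(T))
assert len(perms) == 24            # 4! = 4 * 3 * 2 * 1
assert len(set(perms)) == 24       # and they are pairwise distinct
```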
Example 7.3.6. Write down the six elements of S3 , and give the multiplication table
for the group.
Name the three elements of T as 1, 2, 3. Writing each permutation in the two-row form (1 2 3 ; images), the six elements of S₃ are then:
1 = (1 2 3 ; 1 2 3), a = (1 2 3 ; 2 3 1), b = (1 2 3 ; 3 1 2),
c = (1 2 3 ; 2 1 3), d = (1 2 3 ; 3 2 1), e = (1 2 3 ; 1 3 2).
The multiplication table for S₃ can be written down directly by doing the required composition. For example,
ac = (1 2 3 ; 2 3 1)(1 2 3 ; 2 1 3) = (1 2 3 ; 3 2 1) = d.
Continuing in this way, one obtains the presentation
S₃ = ⟨a, c; a³ = c² = 1, ac = ca²⟩.
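These compositions and the relations in the presentation can be checked mechanically. The following Python sketch (an illustration added here; permutations are encoded as tuples of images, and the product fg applies g first, then f, matching ac above):

```python
one = (1, 2, 3)
a = (2, 3, 1)
b = (3, 1, 2)
c = (2, 1, 3)
d = (3, 2, 1)
e = (1, 3, 2)

def comp(f, g):
    # (fg)(t) = f(g(t)): apply g first, then f
    return tuple(f[g[i] - 1] for i in range(3))

assert comp(a, c) == d                     # ac = d, as in the worked example
assert comp(a, comp(a, a)) == one          # a^3 = 1
assert comp(c, c) == one                   # c^2 = 1
assert comp(a, c) == comp(c, comp(a, a))   # ac = ca^2
```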
An important result, the form of which we will see later in our work on extension
fields, is the following:
Lemma 7.3.7. Let T be a set and T1 ⊂ T a subset. Let H be the subset of ST that fixes
each element of T1 ; that is, f ∈ H if f (t) = t for all t ∈ T1 . Then H is a subgroup.
p(x, y₁, . . . , y_n) = (x − y₁) ⋅⋅⋅ (x − y_n).
In general, the pattern of the last example holds for y₁, . . . , y_n. That is,
s₁ = y₁ + y₂ + ⋅⋅⋅ + y_n,
s₂ = y₁y₂ + y₁y₃ + ⋅⋅⋅ + y_{n−1}y_n,
s₃ = y₁y₂y₃ + y₁y₂y₄ + ⋅⋅⋅ + y_{n−2}y_{n−1}y_n,
⋮
s_n = y₁ ⋅⋅⋅ y_n.
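The relation between the s_i and the coefficients of (x − y₁)⋅⋅⋅(x − y_n) — the coefficient of x^{n−i} is (−1)^i s_i — can be verified computationally; a Python sketch added here as an illustration:

```python
from itertools import combinations
from math import prod

def esp(ys, i):
    # i-th elementary symmetric polynomial s_i(y_1, ..., y_n)
    return sum(prod(c) for c in combinations(ys, i))

def expand(ys):
    # Coefficients of (x - y_1)...(x - y_n), lowest power first,
    # built by repeatedly multiplying by the linear factor (x - y)
    poly = [1]
    for y in ys:
        poly = [u - v for u, v in zip([0] + poly, [y * c for c in poly] + [0])]
    return poly

ys = [2, -1, 3, 5]
coeffs = expand(ys)
n = len(ys)
# Coefficient of x^(n-i) is (-1)^i * s_i(y_1, ..., y_n)
assert all(coeffs[n - i] == (-1) ** i * esp(ys, i) for i in range(n + 1))
```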
From this theorem, we obtain the following two lemmas, which will be crucial in
our proof of the fundamental theorem of algebra.
Lemma 7.3.13. Let p(x) ∈ K[x], and suppose p(x) has the zeros α₁, . . . , α_n in the splitting field K′. Then the elementary symmetric polynomials in α₁, . . . , α_n are in K.
Proof. Suppose p(x) = c₀ + c₁x + ⋅⋅⋅ + c_nx^n ∈ K[x]. Since p(x) splits in K′[x], with zeros α₁, . . . , α_n, we have that, in K′[x],
p(x) = c_n(x − α₁) ⋅⋅⋅ (x − α_n).
The coefficients are then c_n(−1)^i s_i(α₁, . . . , α_n), where the s_i(α₁, . . . , α_n) are the elementary symmetric polynomials in α₁, . . . , α_n. However, p(x) ∈ K[x], so each coefficient is in K. It follows then that for each i, c_n(−1)^i s_i(α₁, . . . , α_n) ∈ K; hence, s_i(α₁, . . . , α_n) ∈ K since c_n ∈ K.
Lemma 7.3.14. Let p(x) ∈ K[x], and suppose p(x) has the zeros α₁, . . . , α_n in the splitting field K′. Suppose further that g(x) = g(x, α₁, . . . , α_n) ∈ K′[x]. If g(x) is a symmetric polynomial in α₁, . . . , α_n, then g(x) ∈ K[x].
The proof depends on the following sequence of lemmas. The crucial one now is
the last, which says that any real polynomial must have a complex zero.
Lemma 7.4.2. Any odd-degree real polynomial must have a real zero.
Proof. Suppose P(x) is a real polynomial of odd degree; without loss of generality, its leading coefficient is positive, so that (1) P(x) → ∞ as x → ∞ and (2) P(x) → −∞ as x → −∞. From (1), P(x) gets arbitrarily large positively, so there exists an x₁ with P(x₁) > 0. Similarly, from (2), there exists an x₂ with P(x₂) < 0. Since P(x) is continuous, the intermediate value theorem gives an x₀ between x₂ and x₁ with P(x₀) = 0.
Lemma 7.4.3. Any degree-two complex polynomial must have a complex zero.
Proof. This is a consequence of the quadratic formula and of the fact that any complex number has a square root. If P(x) = ax² + bx + c, a ≠ 0, then the zeros formally are
x = (−b ± √(b² − 4ac))/(2a),
and both values are complex numbers.
Here, P̄(x) = ā₀ + ā₁x + ⋅⋅⋅ + ā_nx^n denotes the polynomial whose coefficients are the complex conjugates of those of P(x).
(1) For any z ∈ ℂ, conjugating term by term, the complex conjugate of P(z) is
ā₀ + ā₁z̄ + ⋅⋅⋅ + ā_nz̄^n = P̄(z̄).
(2) Suppose P(x) is real; then ā_i = a_i for all its coefficients; hence, P̄(x) = P(x). Conversely, suppose P̄(x) = P(x). Then ā_i = a_i for all its coefficients; hence, a_i ∈ ℝ for each a_i; therefore, P(x) is a real polynomial.
(3) The proof is a computation and left to the exercises.
Lemma 7.4.7. If every nonconstant real polynomial has a complex zero, then every non-
constant complex polynomial has a complex zero.
Proof. Let P(x) ∈ ℂ[x], and suppose that every nonconstant real polynomial has at least one complex zero. Let H(x) = P(x)P̄(x). From Lemma 7.4.6, H(x) ∈ ℝ[x]. By supposition, there exists a z₀ ∈ ℂ with H(z₀) = 0. Then P(z₀)P̄(z₀) = 0, and since ℂ is a field, it has no zero divisors. Hence, either P(z₀) = 0, or P̄(z₀) = 0. In the first case, z₀ is a zero of P(x). In the second case, from Lemma 7.4.5, the conjugate of P(z̄₀) equals P̄(z₀) = 0; hence, P(z̄₀) = 0, and z̄₀ is a zero of P(x).
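That H(x) = P(x)P̄(x) has real coefficients can be observed directly by multiplying coefficient lists; a Python sketch (an illustration added here, with a hypothetical example polynomial) follows.

```python
def pmul(a, b):
    # Multiply two polynomials given as coefficient lists (lowest power first)
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# An arbitrary complex polynomial P(x) and P_bar(x) with conjugated coefficients
P = [1 - 2j, 3j, 2 + 1j, 1j]
P_bar = [c.conjugate() for c in P]
H = pmul(P, P_bar)
# Each coefficient of H equals its own conjugate, so H is a real polynomial;
# with these (Gaussian-integer) coefficients the cancellation is exact.
assert all(c.imag == 0 for c in H)
```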
Proof. Let f(x) = a₀ + a₁x + ⋅⋅⋅ + a_nx^n ∈ ℝ[x] with n ≥ 1, a_n ≠ 0. The proof is an induction on the degree n of f(x).
Suppose n = 2^m q, where q is odd. We do the induction on m. If m = 0, then f(x) has odd degree, and the theorem is true from Lemma 7.4.2. Assume then that the theorem is true for all degrees d = 2^k q′, where k < m and q′ is odd. Now assume that the degree of f(x) is n = 2^m q.
Suppose K′ is the splitting field for f(x) over ℝ, in which the zeros are α₁, . . . , α_n. We show that at least one of these zeros must be in ℂ. (In fact, all are in ℂ, but to prove the lemma, we need only show at least one.)
Let h ∈ ℤ, and form the polynomial
H(x) = ∏ (x − (αi + αj + hαiαj)),
where the product runs over all pairs {αi, αj} with i < j. This is in K[x]. In forming H(x), we chose pairs of zeros {αi, αj}, so the number of such pairs is the number of ways of choosing two elements out of n = 2ᵐq elements.
This is given by
(2ᵐq)(2ᵐq − 1)/2 = 2ᵐ⁻¹q(2ᵐq − 1) = 2ᵐ⁻¹q′
with q′ = q(2ᵐq − 1) odd. Therefore, the degree of H(x) is 2ᵐ⁻¹q′.
H(x) is a symmetric polynomial in the zeros α1 , . . . , αn . Since α1 , . . . , αn are the zeros
of a real polynomial, from Lemma 7.3.14, any polynomial in the splitting field symmet-
ric in these zeros must be a real polynomial.
Therefore, H(x) ∈ ℝ[x] with degree 2ᵐ⁻¹q′. By the inductive hypothesis, then, H(x)
must have a complex zero. This implies that there exists a pair {αi , αj } with
αi + αj + hαi αj ∈ ℂ.
Since h was an arbitrary integer, for any integer h1 , there must exist such a pair
{αi , αj } with
αi + αj + h1 αi αj ∈ ℂ.
Now let h1 vary over the integers. Since there are only finitely many such pairs {αi, αj} but infinitely many integers, there must be at least two different integers h1, h2 and a single pair {αi, αj} such that
z1 = αi + αj + h1αiαj ∈ ℂ, and z2 = αi + αj + h2αiαj ∈ ℂ.
Subtracting, (h1 − h2)αiαj = z1 − z2 ∈ ℂ with h1 ≠ h2; hence, αiαj ∈ ℂ, and then also αi + αj ∈ ℂ. Form
p(x) = (x − αi)(x − αj) = x² − (αi + αj)x + αiαj.
However, p(x) is then a degree-two complex polynomial, and so from Lemma 7.4.3, its zeros are complex. Therefore, αi, αj ∈ ℂ; thus, f(x) has a complex zero.
It is now easy to give a proof of the fundamental theorem of algebra. From Lem-
ma 7.4.8, every nonconstant real polynomial has a complex zero. From Lemma 7.4.7, if
every nonconstant real polynomial has a complex zero, then every nonconstant com-
plex polynomial has a complex zero, proving the fundamental theorem.
Proof. Let a ∈ E. Consider the elements 1, a, a², . . . . These elements become linearly dependent over ℂ, and we get a nonconstant polynomial over ℂ with zero a. By the fundamental theorem of algebra, we conclude that a ∈ ℂ.
A term ax1^i1 ⋅ ⋅ ⋅ xn^in is higher than a term bx1^j1 ⋅ ⋅ ⋅ xn^jn if the first of the differences
i1 − j1, i2 − j2, . . . , in − jn
that differs from zero is in fact positive. The highest piece of a polynomial f(x1, . . . , xn)
is denoted by HG(f).
Hence,
s1 = x1 + x2 + ⋅ ⋅ ⋅ + xn
s2 = x1 x2 + x1 x3 + ⋅ ⋅ ⋅ + xn−1 xn
s3 = x1 x2 x3 + x1 x2 x4 + ⋅ ⋅ ⋅ + xn−2 xn−1 xn
..
.
sn = x1 ⋅ ⋅ ⋅ xn .
In general, sk = ∑ xi1 xi2 ⋅ ⋅ ⋅ xik, where the sum is taken over all the (n choose k) different systems of indices i1, . . . , ik with i1 < i2 < ⋅ ⋅ ⋅ < ik.
Furthermore, a polynomial s(x1 , . . . , xn ) is a symmetric polynomial if s(x1 , . . . , xn )
is unchanged by any permutation σ of {x1 , . . . , xn }; that is, s(x1 , . . . , xn ) = s(σ(x1 ), . . . ,
σ(xn )).
Lemma 7.5.2. In the highest piece ax1^k1 ⋅ ⋅ ⋅ xn^kn, a ≠ 0, of a symmetric polynomial s(x1, . . . , xn), we have k1 ≥ k2 ≥ ⋅ ⋅ ⋅ ≥ kn.
Proof. Assume that ki < kj for some i < j. As a symmetric polynomial, s(x1, . . . , xn) must then also contain the piece ax1^k1 ⋅ ⋅ ⋅ xi^kj ⋅ ⋅ ⋅ xj^ki ⋅ ⋅ ⋅ xn^kn, which is higher than ax1^k1 ⋅ ⋅ ⋅ xi^ki ⋅ ⋅ ⋅ xj^kj ⋅ ⋅ ⋅ xn^kn, giving a contradiction.
Lemma 7.5.3. The product s1^(k1−k2) s2^(k2−k3) ⋅ ⋅ ⋅ sn−1^(kn−1−kn) sn^kn with k1 ≥ k2 ≥ ⋅ ⋅ ⋅ ≥ kn has the highest piece x1^k1 x2^k2 ⋅ ⋅ ⋅ xn^kn.
Proof. From the definition of the elementary symmetric polynomials, we have that
HG(sk^t) = (x1x2 ⋅ ⋅ ⋅ xk)^t, 1 ≤ k ≤ n, t ≥ 1.
Proof. We prove the existence of the polynomial f by induction on the size of the high-
est pieces. If in the highest piece of a symmetric polynomial all exponents are zero,
then it is constant, that is, an element of R. Therefore, there is nothing to prove.
Now we assume that each symmetric polynomial with highest piece smaller than that of s(x1, . . . , xn) can be written as a polynomial in the elementary symmetric polynomials. Let ax1^k1 ⋅ ⋅ ⋅ xn^kn, a ≠ 0, be the highest piece of s(x1, . . . , xn). Let
t(x1, . . . , xn) = s(x1, . . . , xn) − a s1^(k1−k2) ⋅ ⋅ ⋅ sn−1^(kn−1−kn) sn^kn.
Clearly, t(x1, . . . , xn) is another symmetric polynomial, and from Lemma 7.5.3, the highest piece of t(x1, . . . , xn) is smaller than that of s(x1, . . . , xn). Therefore, t(x1, . . . , xn) can be written as a polynomial in the elementary symmetric polynomials by the inductive hypothesis. Hence, s(x1, . . . , xn) = t(x1, . . . , xn) + a s1^(k1−k2) ⋅ ⋅ ⋅ sn−1^(kn−1−kn) sn^kn can be written as a polynomial
in s1 , . . . , sn .
To prove the uniqueness of this expression, assume that s(x1 , . . . , xn ) = f (s1 , . . . ,
sn ) = g(s1 , . . . , sn ). Then f (s1 , . . . , sn ) − g(s1 , . . . , sn ) = h(s1 , . . . , sn ) = ϕ(x1 , . . . , xn ) is the
zero polynomial in x1 , . . . , xn . Hence, if we write h(s1 , . . . , sn ) as a sum of products of
powers of the s1 , . . . , sn , all coefficients disappear because two different products of
powers in the s1 , . . . , sn have different highest pieces. This follows from the previous
set of lemmas. Therefore, f and g are the same, proving the theorem.
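As a small computational illustration of the theorem (a Python sketch, not part of the text), one can check an identity such as x1² + x2² + x3² = s1² − 2s2 at sample values:

```python
# Check x1^2 + x2^2 + x3^2 = s1^2 - 2 s2 at sample integer values.
# (Illustration; the sample values are arbitrary.)

from itertools import combinations

def elementary_symmetric(xs, k):
    """s_k(xs): sum of products over all k-element subsets of xs."""
    total = 0
    for combo in combinations(xs, k):
        prod = 1
        for v in combo:
            prod *= v
        total += prod
    return total

xs = [3, -5, 7]
s1 = elementary_symmetric(xs, 1)       # x1 + x2 + x3 = 5
s2 = elementary_symmetric(xs, 2)       # x1 x2 + x1 x3 + x2 x3 = -29

power_sum = sum(v * v for v in xs)     # 9 + 25 + 49 = 83
assert power_sum == s1 ** 2 - 2 * s2   # 83 = 25 + 58
```

The identity itself is the first step of the subtraction procedure in the existence proof: the highest piece x1² of the power sum is HG(s1²), and subtracting s1² leaves a symmetric polynomial with a smaller highest piece.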
i² = j² = k² = −1,
ij = k, jk = i, ki = j,
ji = −k, kj = −i, ik = −j.
For
x = x0 + x1i + x2j + x3k and y = y0 + y1i + y2j + y3k,
addition is componentwise, and the product is obtained by multiplying out and applying the rules above. For a quaternion
x = x0 + x1i + x2j + x3k,
the conjugate is defined by
x̄ := x0 − x1i − x2j − x3k.
Conjugation satisfies: the conjugate of x̄ is x, and
(x + y)‾ = x̄ + ȳ, (λx)‾ = λx̄ for λ ∈ ℝ, and (xy)‾ = ȳ ⋅ x̄.
With help of the conjugation, we may now define the norm and the length of a quater-
nion
x = x0 + x1 i + x2 j + x3 k
by
n(x) = xx̄ = x̄x = x0² + x1² + x2² + x3² and |x| = √(x0² + x1² + x2² + x3²).
If x ≠ 0, then n(x) > 0, and x⁻¹ = x̄/(xx̄) is a two-sided inverse, since
x ⋅ (x̄/(xx̄)) = 1 = (x̄/(xx̄)) ⋅ x.
Hence, together with the addition and multiplication, V becomes a skew field, in
which ℝ can be embedded via r → r ⋅ 1 for r ∈ ℝ.
Theorem 7.6.1. The set of quaternions ℍ is a skew field, which contains both the reals
and the complexes as subfields. It has dimension 4 as a vector space over ℝ. Further-
more, rx = xr for all x ∈ ℍ, and all r ∈ ℝ (considered as elements of ℍ).
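The quaternion arithmetic above can be sketched in a few lines (an illustration, not part of the text; representing quaternions as 4-tuples of real coefficients is our choice):

```python
# Quaternion arithmetic sketch: q = (x0, x1, x2, x3) stands for
# x0 + x1*i + x2*j + x3*k.  (Illustration only.)

def qmul(x, y):
    """Hamilton product, derived from i^2 = j^2 = k^2 = -1, ij = k, etc."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def norm(x):
    """n(x) = x0^2 + x1^2 + x2^2 + x3^2."""
    return sum(c * c for c in x)

one = (1, 0, 0, 0)
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# defining relations
assert qmul(i, i) == qmul(j, j) == qmul(k, k) == (-1, 0, 0, 0)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)   # ij = k, ji = -k

# unit element and the multiplicative norm: n(xy) = n(x) n(y)
x, y = (1, 2, 3, 4), (5, -1, 0, 2)
assert qmul(x, one) == x == qmul(one, x)
assert norm(qmul(x, y)) == norm(x) * norm(y)
```

The final assertion is exactly the multiplicative rule for the norm discussed next.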
In ℍ, there is an important multiplicative rule for the norm and the length: n(xy) = n(x)n(y) and |xy| = |x| ⋅ |y| for all x, y ∈ ℍ.
Theorem 7.6.2 (Theorem of Lagrange). Each natural number n can be written as a sum
n = a² + b² + c² + d²
with a, b, c, d ∈ ℤ.
Hint: We have only to show (see [43, Chapter 3.2]) that if p is a prime number with p ≡ 3 mod 4, then p = a² + b² + c² + d² for some a, b, c, d ∈ ℤ.
A proof of this can be found for instance in the book [43].
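A brute-force check of the four-square theorem for small n can be sketched as follows (an illustrative Python sketch; the function name four_squares is ours, not from the text):

```python
# Brute-force four-square decompositions for small n.  (Illustration.)

from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a <= b <= c <= d and a^2+b^2+c^2+d^2 = n."""
    limit = isqrt(n)
    for a in range(limit + 1):
        for b in range(a, limit + 1):
            for c in range(b, limit + 1):
                d2 = n - a*a - b*b - c*c
                if d2 < 0:
                    break            # d2 only shrinks as c grows
                d = isqrt(d2)
                if d * d == d2 and d >= c:
                    return (a, b, c, d)
    return None   # never reached for natural n, by Lagrange's theorem

# every natural number up to 200 has a four-square representation
assert all(four_squares(n) is not None for n in range(201))
print(four_squares(7))   # → (1, 1, 1, 2)
```

Note that 7 needs all four squares: 7 ≡ 7 mod 8, so it is not a sum of three squares.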
We remark that the skew field ℍ of the quaternions can be embedded into M(2, ℂ) via
1 → ( 1 0 ; 0 1 ), i → ( i 0 ; 0 −i ), j → ( 0 1 ; −1 0 ), k → ( 0 i ; i 0 ),
where ( a b ; c d ) denotes the 2 × 2 matrix with rows (a, b) and (c, d). A general quaternion x = x0 + x1i + x2j + x3k then maps to
( x0 + x1i   x2 + x3i ; −x2 + x3i   x0 − x1i ) = ( w z ; −z̄ w̄ )
with w = x0 + x1i ∈ ℂ and z = x2 + x3i ∈ ℂ.
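That this assignment respects the quaternion relations can be checked directly (a Python sketch using 2 × 2 complex matrices as nested lists; an illustration, not part of the text):

```python
# The images of i, j, k in M(2, C) satisfy the quaternion relations.
# (Matrices as nested lists of Python complex numbers; illustration only.)

def mat_mul(A, B):
    return [[sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]
MINUS_ONE = [[-1, 0], [0, -1]]

assert mat_mul(I, I) == MINUS_ONE             # i^2 = -1
assert mat_mul(J, J) == MINUS_ONE             # j^2 = -1
assert mat_mul(K, K) == MINUS_ONE             # k^2 = -1
assert mat_mul(I, J) == K                     # ij = k
assert mat_mul(J, I) == [[0, -1j], [-1j, 0]]  # ji = -k
```

Since the map is ℝ-linear and multiplicative on the basis, it extends to an embedding of ℍ.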
We have shown that the quaternions form a skew field of degree 4 over the real
numbers. We ask whether there can be other finite degree skew field extensions of ℝ.
Let V be an ℝ-vector space with dimℝ(V) = n < ∞. For which n can we provide V with a multiplication such that V, with the vector addition and this multiplication, becomes a field or a skew field?
We remark that some nonzero vector in V has to be the unit element 1; therefore,
we automatically have an embedding ℝ → V.
Let n ≥ 2. Since the irreducible polynomials in ℝ[x] have degree 1 or 2, if such a multiplication exists, then each element α ∈ V that is not in ℝ (considered as a subset of V) must be a zero of a quadratic polynomial from ℝ[x].
We now assume that we have in V a multiplication such that V, together with the
addition in V and this multiplication, is a field or a skew field.
If n = 2, we get the field ℂ of the complex numbers.
Now, let n = 3.
Arguing as in the construction of ℂ, we may construct in two steps a basis {1, i, j} of V such that 1 is the unit element of V, and i² = j² = −1.
Recall that a two-dimensional subspace of V has to be isomorphic to ℂ as a subfield
of V.
Let k = ij. Since dimℝ(V) = 3, we must have k = a1 + b1i + c1j with a1, b1, c1 ∈ ℝ.
Multiplication from the left with i results in
−j = a1i − b1 + c1k = a1i − b1 + c1(a1 + b1i + c1j),
and since 1, i, j are linearly independent, comparing the coefficients of j gives c1² = −1, which is impossible in ℝ. Therefore, the case n = 3 is not possible.
If n = 4, we may construct in V three linearly independent elements 1, i, j such that
1 is the unit element of V, and i2 = j2 = −1. Certainly ij is linearly independent from 1, i
and j, because otherwise, we get a contradiction as in the case n = 3. Also ji is linearly
independent from 1, i and j. Now i + j and i − j are both zeros of quadratic polynomials over ℝ; that is, there exist r1, s1, r2, s2 ∈ ℝ with
(i + j)² + r1(i + j) + s1 = 0 and (i − j)² + r2(i − j) + s2 = 0.
If we add these equations, we see that r1 = r2 = 0; therefore, we get from the first equation that ij + ji = c ∈ ℝ. Here, we used that 1, i, and j are linearly independent.
Now, we may replace j by j + (c/2)i, which gives
i(j + (c/2)i) + (j + (c/2)i)i = (ij + ji) − c = 0.
So altogether, we may construct a basis {1, i, j, k} of V such that 1 is the unit element of V, and i² = j² = k² = −1, k = ij = −ji. Thereby, V is isomorphic to the skew field ℍ of the quaternions.
Finally, let n ≥ 5.
Analogously as for the case n = 4, and using the general observation on the subfields isomorphic to ℂ, we may construct a basis {1, i, j, k, l, . . .} of V such that 1 is the unit element of V, i² = j² = k² = l² = −1, and k = ij = −ji.
Analogously, as in the case n = 4, we have that i + l and i − l are both zeros of quadratic polynomials over ℝ. Therefore, as in the case n = 4,
il + li = a2 ∈ ℝ, jl + lj = b2 ∈ ℝ, and kl + lk = c2 ∈ ℝ.
We calculate, using k = ij:
2lk = a2j − b2i + c2.
Multiplying on the right by k and using k² = −1 gives
−2l = a2i + b2j + c2k,
which contradicts the linear independence of 1, i, j, k, l. Therefore, no n ≥ 5 is possible.
Theorem 7.6.3 (Theorem of Frobenius). Let V be an ℝ-vector space with dimℝ(V) = n < ∞. Let V in addition be provided with a multiplication such that V, together with the vector addition and the multiplication, is a field or a skew field.
Then n = 1, 2, or 4.
If n = 1, then V is isomorphic to ℝ.
If n = 2, then V is isomorphic to ℂ.
If n = 4, then V is isomorphic to ℍ.
7.7 Exercises
1. Let f , g ∈ K[x] be irreducible polynomials of degree 2 over the field K. Let α1 , α2
(respectively, β1 , β2 ) be zeros of f and g. For 1 ≤ i, j ≤ 2, let νij = αi + βj . Show the
following:
(a) |K(νij ) : K| ∈ {1, 2, 3, 4}.
(b) For fixed f , g, there are at most two different degrees in (a).
(c) Decide which sets of combinations of degrees in (b) (with f , g variable) are
possible, and give an example in each case.
2. Let L|K be a field extension; let ν ∈ L and f (x) ∈ L[x], a polynomial of degree ≥ 1.
Let all coefficients of f (x) be algebraic over K. If f (ν) = 0, then ν is algebraic over K.
3. Let L|K be a field extension, and let M be an intermediate field such that the extension M|K is algebraic. For ν ∈ L, show that the following are equivalent:
(a) ν is algebraic over M.
(b) ν is algebraic over K.
4. Let L|K be a field extension and ν1 , ν2 ∈ L. Then the following are equivalent:
(a) ν1 and ν2 are algebraic over K.
(b) ν1 + ν2 and ν1 ν2 are algebraic over K.
5. Let L|K be a simple field extension. Then there is an extension field L′ of L of the form L′ = K(ν1, ν2) with the following:
(a) ν1 and ν2 are transcendental over K.
(b) The set of all elements of L′ that are algebraic over K is L.
6. In the proof of Theorem 7.1.4, show that the mapping
τ : K(a) → K(α),
11. Determine all irreducible polynomials over ℝ. Factorize f(x) ∈ ℝ[x] into irreducible polynomials.
In the last chapter, we introduced splitting fields and used this idea to present a proof
of the fundamental theorem of algebra. The concept of a splitting field is essential to
the Galois theory of equations. Therefore, in this chapter, we look more deeply at this
idea.
Definition 8.1.1. Let K be a field and f (x) a nonconstant polynomial in K[x]. An exten-
sion field L of K is a splitting field for f (x) over K if the following hold:
(a) f (x) splits into linear factors in L[x].
(b) If K ⊂ M ⊂ L and M ≠ L, then f(x) does not split into linear factors in M[x].
Lemma 8.1.2. L is a splitting field for f (x) ∈ K[x] if and only if f (x) splits into linear
factors in L[x], and if f (x) = b(x − a1 ) ⋅ ⋅ ⋅ (x − an ) with b ∈ K, then L = K(a1 , . . . , an ).
Example 8.1.3. The field ℂ of complex numbers is a splitting field for the polynomial p(x) = x² + 1 in ℝ[x]. In fact, since ℂ is algebraically closed, it is a splitting field for any real polynomial f(x) ∈ ℝ[x] that has at least one nonreal zero.
The field ℚ(i), obtained by adjoining i to ℚ, is a splitting field for x² + 1 over ℚ.
The next result was used in the previous chapter. We restate and reprove it here.
Theorem 8.1.4. Let K be a field. Then each nonconstant polynomial in K[x] has a split-
ting field.
Proof. Let K̄ be an algebraic closure of K. Then f(x) splits in K̄[x]; that is, f(x) = b(x − a1) ⋅ ⋅ ⋅ (x − an) with b ∈ K and ai ∈ K̄. Let L = K(a1, . . . , an). Then L is the splitting field for f(x) over K.
We next show that the splitting field over K of a given polynomial is unique up to
K-isomorphism.
https://doi.org/10.1515/9783110603996-008
(b) If g(x) is an irreducible factor of f(x) in K[x], a is a zero of g(x) in L, and a′ is a zero of g′(x) = ϕ(g(x)) in L′, then there is an isomorphism ψ from L to L′ with ψ|K = ϕ and ψ(a) = a′.
Before giving the proof of this theorem, we note that the following important result
is a direct consequence of it:
Proof of Theorem 8.1.5. Suppose that f(x) = b(x − a1) ⋅ ⋅ ⋅ (x − an) ∈ L[x] and that f(x) = b′(x − a1′) ⋅ ⋅ ⋅ (x − an′) ∈ L′[x]. Then applying ψ to the first factorization gives a factorization of f(x) in L′[x].
We have proved that polynomials have unique factorization over fields. Since ψ(L) ⊂ L′, it follows that the set of zeros (ψ(a1), . . . , ψ(an)) is a permutation of the set of zeros (a1′, . . . , an′). In particular, this implies that ψ(ai) ∈ L′.
Since the image of ψ is K′(a1′, . . . , an′) = K′(ψ(a1), . . . , ψ(an)), it is clear that ψ is uniquely determined by the images ψ(ai). This proves part (a).
For part (b), embed L′ in an algebraic closure L̄′. Hence, there is a monomorphism
ϕ : K(a) → L̄′
Example 8.1.7. Let f(x) = x³ − 7 ∈ ℚ[x]. This has no zeros in ℚ, and since it is of degree 3, it follows that it must be irreducible in ℚ[x].
Let ω = −1/2 + (√3/2)i ∈ ℂ. Then it is easy to show by computation that ω² = −1/2 − (√3/2)i, and the three zeros of f(x) in ℂ are
a1 = 7^(1/3),
a2 = ω ⋅ 7^(1/3),
a3 = ω² ⋅ 7^(1/3).
Hence, L = ℚ(a1, a2, a3), the splitting field of f(x). Since the minimal polynomial of all three zeros over ℚ is the same f(x), it follows that
|ℚ(a1) : ℚ| = |ℚ(a2) : ℚ| = |ℚ(a3) : ℚ| = 3.
Suppose that ℚ(a2) = ℚ(a3). Then ω = a3a2⁻¹ ∈ ℚ(a2), and so 7^(1/3) = ω⁻¹a2 ∈ ℚ(a2). Hence, ℚ(a1) ⊂ ℚ(a2); therefore, ℚ(a1) = ℚ(a2) since they have the same degree over ℚ. However, ℚ(a1) ⊂ ℝ, whereas a2 ∉ ℝ. This contradiction shows that ℚ(a2) and ℚ(a3) are distinct.
By computation, we have a3 = a1⁻¹a2²; hence, a3 ∈ ℚ(a1, a2), and so L = ℚ(a1, a2).
Now |ℚ(ω) : ℚ| = 2 since the minimal polynomial of ω over ℚ is x² + x + 1. Since no zero of f(x) lies in ℚ(ω), and the degree of f(x) is 3, it follows that f(x) is irreducible over ℚ(ω). Therefore, we have that the degree of L over ℚ(ω) is 3. Hence, |L : ℚ| = (2)(3) = 6.
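The numbers in this example can be checked numerically (a Python sketch with floating-point arithmetic; an illustration, not part of the text):

```python
# Floating-point check of Example 8.1.7: omega is a zero of x^2 + x + 1,
# and a2 = omega * 7^(1/3) is a nonreal zero of x^3 - 7.  (Illustration.)

omega = -0.5 + (3 ** 0.5 / 2) * 1j
a2 = omega * 7 ** (1 / 3)

assert abs(omega ** 2 + omega + 1) < 1e-12   # minimal polynomial of omega
assert abs(a2 ** 3 - 7) < 1e-9               # a2 is a zero of x^3 - 7
assert abs(a2.imag) > 1                      # a2 is not real
```

The last assertion reflects the step in the argument above: a2 is nonreal, so ℚ(a2) ≠ ℚ(a1).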
We now have the following lattice diagram of fields and subfields: ℚ at the bottom; above it the intermediate fields ℚ(ω), ℚ(a1), ℚ(a2), ℚ(a3); and L at the top.
We do not know however if there are any more intermediate fields. There could,
for example, be infinitely many. However, as we will see when we do the Galois theory,
there are no others.
Note, in Example 8.1.7, the extension fields ℚ(ai)|ℚ are not normal extensions. Although f(x) has a zero in ℚ(ai), the polynomial f(x) does not split into linear factors in ℚ(ai)[x].
We now show that L|K is a finite normal extension if and only if L is the splitting
field for some f (x) ∈ K[x].
Theorem 8.2.2. Let L|K be a finite extension. Then the following are equivalent:
(a) L|K is a normal extension.
(b) L|K is a splitting field for some f (x) ∈ K[x].
(c) If L ⊂ L′ and ψ : L → L′ is a monomorphism with ψ|K the identity map on K, then ψ is an automorphism of L; that is, ψ(L) = L.
Proof. Suppose that L|K is a finite normal extension. Since L|K is a finite extension, L is
algebraic over K, and since of finite degree, we have L = K(a1 , . . . , an ) with ai algebraic
over K.
Let fi (x) ∈ K[x] be the minimal polynomial of ai . Since L|K is a normal extension,
fi (x) splits in L[x]. This is true for each i = 1, . . . , n. Let f (x) = f1 (x)f2 (x) ⋅ ⋅ ⋅ fn (x). Then f (x)
splits into linear factors in L[x]. Since L = K(a1, . . . , an), the polynomial f(x) cannot have all its zeros in any proper intermediate extension between K and L. Therefore, L is the splitting field for f(x). Hence, (a) implies (b).
Now suppose that L ⊂ L′ and ψ : L → L′ is a monomorphism with ψ|K the identity map on K. Then the extension field ψ(L) of K is also a splitting field for f(x) since ψ|K is the identity on K. Hence, ψ maps the zeros of f(x) in L ⊂ L′ onto the zeros of f(x) in ψ(L) ⊂ L′, and thus it follows that ψ(L) = L. Hence, (b) implies (c).
Finally, suppose (c). Hence, we assume that if L ⊂ L′ and ψ : L → L′ is a monomorphism with ψ|K the identity map on K, then ψ is an automorphism of L; that is, ψ(L) = L.
As before, L|K is algebraic since L|K is finite. Suppose that f(x) ∈ K[x] is irreducible, and that a ∈ L is a zero of f(x). There are algebraic elements a1, . . . , an ∈ L with L = K(a1, . . . , an) since L|K is finite. For i = 1, . . . , n, let fi(x) ∈ K[x] be the minimal polynomial of ai, and let g(x) = f(x)f1(x) ⋅ ⋅ ⋅ fn(x). Let L′ be the splitting field of g(x). Clearly, L ⊂ L′. Let b ∈ L′ be a zero of f(x). From Theorem 8.1.5, there is an automorphism ψ of L′ with ψ(a) = b and ψ|K the identity on K. Hence, by our assumption, ψ|L is an automorphism of L. It follows that b ∈ L; hence, f(x) splits in L[x]. Therefore, (c) implies (a), completing the proof.
Proof. Suppose that |L : K| = 2. Then L|K is algebraic since it is finite. Let f(x) ∈ K[x] be irreducible with leading coefficient 1, and suppose that it has a zero in L. Let a be one such zero. Then f(x) must be the minimal polynomial ma(x) of a. However, deg(ma(x)) ≤ |L : K| = 2; hence, f(x) is of degree 1 or 2. Since f(x) has a zero in L, it follows that it must split into linear factors in L[x]; therefore, L is a normal extension.
Later, we will tie this result to group theory when we prove that a subgroup of
index 2 must be a normal subgroup.
Example 8.2.4. As a first example of the lemma, consider the polynomial f (x) = x2 −2.
In ℝ, this splits as (x − √2)(x + √2); hence, the field ℚ(√2) is the splitting field of
f (x) = x2 − 2 over ℚ. Therefore, ℚ(√2) is a normal extension of ℚ.
Example 8.2.5. As a second example, consider the polynomial x⁴ − 2 in ℚ[x]. The zeros in ℂ are
±2^(1/4) and ±i ⋅ 2^(1/4).
Hence, the splitting field is L = ℚ(2^(1/4), i). Therefore, we have
|L : ℚ| = |L : ℚ(2^(1/4))| ⋅ |ℚ(2^(1/4)) : ℚ| = 2 ⋅ 4 = 8.
8.3 Exercises
1. Determine the splitting field of f (x) ∈ ℚ[x] and its degree over ℚ in the following
cases:
(a) f (x) = x4 − p, where p is a prime.
(b) f (x) = xp − 2, where p is a prime.
2. Determine the degree of the splitting field of the polynomial x⁴ + 4 over ℚ. Determine the splitting field of x⁶ + 4x⁴ + 4x² + 3 over ℚ.
3. For each a ∈ ℤ, let fa(x) = x³ − ax² + (a − 3)x + 1 ∈ ℚ[x] be given:
(a) fa is irreducible over ℚ for each a ∈ ℤ.
(b) If b ∈ ℝ is a zero of fa, then (1 − b)⁻¹ and (b − 1)b⁻¹ are also zeros of fa.
(c) Determine the splitting field L of fa (x) over ℚ and its degree |L : ℚ|.
4. Let K be a field and f (x) ∈ K[x] a polynomial of degree n. Let L be a splitting field
of f (x). Show the following:
(a) If a1 , . . . , an ∈ L are the zeros of f , then |K(a1 , . . . , at ) : K| ≤ n ⋅ (n − 1) ⋅ ⋅ ⋅ (n − t + 1)
for each t with 1 ≤ t ≤ n.
(b) L over K is of degree at most n!.
(c) If f (x) is irreducible over K, then n divides |L : K|.
Definition 9.1.1. A group G is a set with one binary operation, which we will denote
by multiplication, such that
(1) The operation is associative; that is, (g1 g2 )g3 = g1 (g2 g3 ) for all g1 , g2 , g3 ∈ G.
(2) There exists an identity for this operation; that is, an element 1 such that 1g = g
and g1 = g for each g ∈ G.
(3) Each g ∈ G has an inverse for this operation; that is, for each g, there exists a g −1
with the property that gg −1 = 1, and g −1 g = 1.
If, in addition, the operation is commutative; that is, g1 g2 = g2 g1 for all g1 , g2 ∈ G, the
group G is called an abelian group.
The order of G, denoted |G|, is the number of elements in the group G. If |G| < ∞, then G is a finite group; otherwise, it is an infinite group.
It follows easily from the definition that the identity is unique, and that each ele-
ment has a unique inverse.
Proof. Suppose that 1 and e are both identities for G. Then 1e = e since 1 is an identity,
and 1e = 1 since e is an identity. Therefore, 1 = e, and there is only one identity.
Next suppose that g ∈ G, and that g1 and g2 are both inverses for g. Then
g1 gg2 = g1 (gg2) = g1 1 = g1 and g1 gg2 = (g1 g)g2 = 1g2 = g2;
hence, g1 = g2, and the inverse is unique.
https://doi.org/10.1515/9783110603996-009
Therefore, g2−1 g1−1 is an inverse for g1 g2 , and since inverses are unique, it is the inverse
of the product.
Groups most often arise as permutations on a set. We will see this, as well as other
specific examples of groups, in the next sections.
Finite groups can be completely described by their group tables or multiplication
tables. These are sometimes called Cayley tables. In general, let G = {g1 , . . . , gn } be a
group, then the multiplication table of G is
      g1    g2    ⋅⋅⋅   gj     ⋅⋅⋅   gn
g1
g2
⋮
gi                       gi gj
⋮
gn
The entry in the row of gi ∈ G and column of gj ∈ G is the product (in that order)
gi gj in G.
Groups satisfy the cancellation law for multiplication.
A consequence of Lemma 9.1.3 is that each row and each column in a group table is
just a permutation of the group elements. That is, each group element appears exactly
once in each row and each column.
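This observation is easy to test by machine (a Python sketch, not part of the text; the sample group ℤ7 under addition mod 7 is our choice):

```python
# Each row and column of a Cayley table is a permutation of the group
# elements; checked here for Z_7 under addition mod 7 (our sample group).

n = 7
elements = list(range(n))
table = [[(a + b) % n for b in elements] for a in elements]

for row in table:
    assert sorted(row) == elements                    # rows are permutations
for col_index in range(n):
    col = [table[r][col_index] for r in range(n)]
    assert sorted(col) == elements                    # columns as well

print(table[2])   # → [2, 3, 4, 5, 6, 0, 1]
```

The assertions are exactly the cancellation law in action: ga = gb forces a = b, so no row repeats an element.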
A subset H ⊂ G is a subgroup of G if H is also a group under the same operation
as G. As for rings and fields, a subset of a group is a subgroup if it is nonempty and
closed under both the group operation and inverses.
Lemma 9.1.4.
1. A subset H ⊂ G is a subgroup if H ≠ ∅, and H is closed under the operation and inverses. That is, if a, b ∈ H, then ab ∈ H, and a⁻¹, b⁻¹ ∈ H.
2. A nonempty subset H of a group G is a subgroup if and only if ab−1 ∈ H for all
a, b ∈ H. In addition, if G is finite, then H is a subgroup if and only if ab ∈ H for all
a, b ∈ H.
H = {1 = g⁰, g, g⁻¹, g², g⁻², . . .}
Lemma 9.1.5. If G is a group and g ∈ G, then ⟨g⟩ forms a subgroup of G called the cyclic
subgroup generated by g. ⟨g⟩ is abelian, even if G is not.
Suppose that g ∈ G and gᵐ = 1 for some positive integer m. Then let n be the smallest positive integer such that gⁿ = 1. It follows that the elements of the set {1, g, g², . . . , gⁿ⁻¹} are all distinct, but for any other power gᵏ, we have gᵏ = gᵗ for some t = 0, 1, . . . , n − 1 (see exercises). The cyclic subgroup generated by g then has order n, and we say that g
has order n, which we denote by o(g) = n. If no such n exists, we say that g has infinite
order. We will look more deeply at cyclic groups and subgroups in Section 9.5.
We introduce one more concept before looking at examples.
As with rings and fields, we say that two groups G and H are isomorphic, denoted
by G ≅ H, if there exists an isomorphism f : G → H. This means that, abstractly, G
and H have exactly the same algebraic structure.
group. In abelian groups, the group operation is often denoted by + and the identity
element by 0 (zero).
In a field K, the nonzero elements are all invertible and form a group under multi-
plication. This is called the multiplicative group of the field K and is usually denoted by
K ∗ . Since multiplication in a field is commutative, the multiplicative group of a field
is an abelian group. Hence, ℚ∗ , ℝ∗ , ℂ∗ are all infinite abelian groups, whereas if p is
a prime, ℤ∗p forms a finite abelian group. Recall that if p is a prime, then the modular
ring ℤp is a field.
Within ℚ∗ , ℝ∗ , ℂ∗ , there are certain multiplicative subgroups. Since the positive
rationals ℚ+ and the positive reals ℝ+ are closed under multiplication and inverse,
they form subgroups of ℚ∗ and ℝ∗ , respectively. In ℂ, if we consider the set of all
complex numbers z with |z| = 1, these form a multiplicative subgroup. Further, within this subgroup, if we consider the set of n-th roots of unity z (that is, zⁿ = 1) for a fixed n, this forms a subgroup, this time of finite order.
The multiplicative group of a field is a special case of the unit group of a ring. If R
is a ring with identity, recall that a unit is an element of R with a multiplicative inverse.
Hence, in ℤ, the only units are ±1, whereas in any field every nonzero element is a unit.
Lemma 9.2.1. If R is a ring with identity, then the set of units in R forms a group under
multiplication called the unit group of R, and is denoted by U(R). If R is a field, then
U(R) = R∗ .
Proof. Let R be a ring with identity. The identity 1 is itself a unit, so 1 ∈ U(R); hence, U(R) is nonempty. If e ∈ R is a unit, then it has a multiplicative inverse e⁻¹. Clearly, e⁻¹ in turn has the inverse e, so e⁻¹ ∈ U(R) whenever e is. Hence, to show that U(R) is a group, we must show that it is closed under products.
Let e1, e2 ∈ U(R). Then there exist e1⁻¹, e2⁻¹, and e2⁻¹e1⁻¹ is an inverse for e1e2. Hence, e1e2 is also a unit, and U(R) is closed under products. Therefore, for any ring R with identity, U(R) forms a multiplicative group.
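For a concrete case (an illustrative Python sketch, not part of the text), the unit group U(ℤ12):

```python
# The unit group U(Z_12): residues coprime to 12, closed under
# multiplication mod 12.  (Illustration; the modulus 12 is our choice.)

from math import gcd

n = 12
units = [a for a in range(1, n) if gcd(a, n) == 1]
assert units == [1, 5, 7, 11]

# closure under products, and every unit has an inverse (here, itself)
for a in units:
    for b in units:
        assert (a * b) % n in units
assert all((a * a) % n == 1 for a in units)
```

So U(ℤ12) has order 4, and in this particular group every element is its own inverse.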
GL(n, K) = {A ∈ M(n, K) : det(A) ≠ 0} and SL(n, K) = {A ∈ M(n, K) : det(A) = 1}.
Lemma 9.2.2. If K is a field, then for n ≥ 2, GL(n, K) forms a nonabelian group under
matrix multiplication, and SL(n, K) forms a subgroup.
GL(n, K) is called the n-dimensional general linear group over K, whereas SL(n, K)
is called the n-dimensional special linear group over K.
Proof. Recall that for two n × n matrices A, B with n ≥ 2 over a field, we have det(AB) = det(A) det(B).
Theorem 9.2.3. The set of congruence motions of ℰ 2 forms a group called the Euclidean
group. We denote the Euclidean group by ℰ .
Proof. The identity map I is clearly an isometry, and since composition of mappings
is associative, we need only to show that the product of isometries is an isometry, and
that the inverse of an isometry is an isometry.
Let T, U be isometries. Then d(a, b) = d(T(a), T(b)) and d(a, b) = d(U(a),
U(b)) for any points a, b. Now consider
One of the major results concerning ℰ is the following. We refer to [32], [33], [23],
and [29] for a more thorough treatment.
Proof. We outline a brief proof. If T is an isometry and T fixes the origin (0, 0), then T
is a linear mapping. It follows that T is a rotation or a reflection. If T does not fix the
origin, then there is a translation T0 such that T0 T fixes the origin. This gives transla-
tions and glide reflections. In the exercises, we expand out more of the proof.
Example 9.2.6. Let T be an equilateral triangle. Then there are exactly six symmetries
of T (see exercises). These are as follows:
I = the identity,
r = a rotation of 120° around the center of T,
r² = a rotation of 240° around the center of T,
and the three reflections across the perpendicular bisectors of the three sides of T.
Sym(T) is called the dihedral group D3 . In the next section, we will see that it is
isomorphic to S3 , the symmetric group on 3 symbols.
more deeply in Chapter 11. We recall some ideas, first introduced in Chapter 7, in rela-
tion to the proof of the fundamental theorem of algebra.
Theorem 9.3.2. For any set A, SA forms a group under composition, called the symmet-
ric group on A. If |A| > 2, then SA is nonabelian. Furthermore, if A, B have the same
cardinality, then SA ≅ SB .
Proof. If SA is the set of all permutations on the set A, we must show that composition
is an operation on SA that is associative, and has an identity and inverses.
Let f , g ∈ SA . Then f , g are one-to-one mappings of A onto itself. Consider f ∘ g :
A → A. If f ∘ g(a1 ) = f ∘ g(a2 ), then f (g(a1 )) = f (g(a2 )), and g(a1 ) = g(a2 ), since f is
one-to-one. But then a1 = a2 since g is one-to-one.
If a ∈ A, there exists a1 ∈ A with f (a1 ) = a since f is onto. Then there exists a2 ∈ A
with g(a2 ) = a1 since g is onto. Putting these together, f (g(a2 )) = a; therefore, f ∘ g
is onto. Therefore, f ∘ g is also a permutation, and composition gives a valid binary
operation on SA .
The identity function 1(a) = a for all a ∈ A will serve as the identity for SA , whereas
the inverse function for each permutation will be the inverse. Such unique inverse
functions exist since each permutation is a bijection.
Finally, composition of functions is always associative; therefore, SA forms a
group.
Suppose that |A| > 2. Then A has at least 3 elements. Call them a1, a2, a3. Consider the two permutations f and g, which fix (leave unchanged) all of A except a1, a2, a3, and on these three elements:
whereas
f = ( a1 ⋅ ⋅ ⋅ an ; f(a1) ⋅ ⋅ ⋅ f(an) ),
written in two-row form, where each element in the top row is sent to the element below it.
For a1, there are n choices for f(a1). For a2, there are only n − 1 choices since f is one-to-one. This continues down to only one choice for an. By the multiplication principle, the number of choices for f, and therefore the size of SA, is
n(n − 1) ⋅ ⋅ ⋅ 1 = n!.
Example 9.3.5. Write down the six elements of S3 and give the multiplication table for
the group.
Name the three elements 1, 2, 3. The six elements of S3 are then as follows:
1 = ( 1 2 3 ; 1 2 3 ), a = ( 1 2 3 ; 2 3 1 ), b = ( 1 2 3 ; 3 1 2 ),
c = ( 1 2 3 ; 2 1 3 ), d = ( 1 2 3 ; 3 2 1 ), e = ( 1 2 3 ; 1 3 2 ),
where ( 1 2 3 ; i j k ) is two-row notation for the permutation sending 1 → i, 2 → j, and 3 → k.
The multiplication table for S3 can be written down directly by doing the required
composition. For example,
ac = ( 1 2 3 ; 2 3 1 )( 1 2 3 ; 2 1 3 ) = ( 1 2 3 ; 3 2 1 ) = d.
      1     a     a²    c     ac    a²c
1     1     a     a²    c     ac    a²c
a     a     a²    1     ac    a²c   c
a²    a²    1     a     a²c   c     ac
c     c     a²c   ac    1     a²    a
ac    ac    c     a²c   a     1     a²
a²c   a²c   ac    c     a²    a     1

S3 = ⟨a, c; a³ = c² = 1, ac = ca²⟩.
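The generators and relations can be verified by composing permutations (a Python sketch, not part of the text; permutations are written 0-based in one-line notation):

```python
# Verify a^3 = c^2 = 1 and ac = ca^2 for S3, with composition
# (f g)(x) = f(g(x)) as in the sample computation of ac above.

def compose(f, g):
    return tuple(f[g[x]] for x in range(3))

identity = (0, 1, 2)
a = (1, 2, 0)   # the 3-cycle sending 1 -> 2, 2 -> 3, 3 -> 1
c = (1, 0, 2)   # the transposition swapping 1 and 2

a2 = compose(a, a)
assert compose(a2, a) == identity        # a^3 = 1
assert compose(c, c) == identity         # c^2 = 1
assert compose(a, c) == compose(c, a2)   # ac = ca^2
assert compose(a, c) == (2, 1, 0)        # ac = d: 1 -> 3, 2 -> 2, 3 -> 1
```

With the relations confirmed, the whole multiplication table follows by rewriting each product into the normal form aⁱcʲ.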
Theorem 9.3.6 (Cayley’s theorem). Let G be a group, and consider the set of elements of G. Then G is isomorphic to a group of permutations on the set G; that is, G is isomorphic to a subgroup of SG.
Lemma 9.4.2. Let G be a group and H ⊂ G a subgroup. Then the relation defined above
is an equivalence relation on G. The equivalence classes all have the form aH for a ∈ G
and are called the left cosets of H in G. Clearly, G is a disjoint union of its left cosets.
Proof. Let us show, first of all, that this is an equivalence relation. Now a ∼ a since
a−1 a = e ∈ H. Therefore, the relation is reflexive. Furthermore, a ∼ b implies a−1 b ∈ H,
but since H is a subgroup of G, we have b−1 a = (a−1 b)−1 ∈ H. Thus, b ∼ a. Therefore,
the relation is symmetric. Finally, suppose that a ∼ b and b ∼ c. Then a−1 b ∈ H, and
b−1 c ∈ H. Since H is a subgroup a−1 b ⋅ b−1 c = a−1 c ∈ H; hence, a ∼ c. Therefore, the
relation is transitive and, hence, is an equivalence relation.
For a ∈ G, the equivalence class is
[a] = {g ∈ G : a ∼ g} = {g ∈ G : a⁻¹g ∈ H}.
If a⁻¹g = h ∈ H, then g = ah. But then, clearly, g ∈ aH. It follows that the equivalence class for a ∈ G is precisely the set
aH = {ah : h ∈ H}.
These classes, aH, are called left cosets of H, and since they are equivalence classes, they partition G. This means that every element of G is in one and only one left coset. In particular, bH = H = eH if and only if b ∈ H.
In the same manner, the sets Ha = {ha : h ∈ H} are called right cosets of H. Also, of course, G is the (disjoint) union of distinct right cosets.
It is easy to see that any two left (right) cosets have the same order (number of
elements). To demonstrate this, consider the mapping aH → bH via ah → bh, where
h ∈ H. It is not hard to show that this mapping is 1–1 and onto (see exercises). Thus, we
have |aH| = |bH|. (This is also true for right cosets and can be established in a similar
manner.) Letting b ∈ H in the above discussion, we see |aH| = |H|, for any a ∈ G. That
is, the size of each left or right coset is exactly the same as the subgroup H.
One can also see that the collection {aH} of all distinct left cosets has the same
number of elements as the collection {Ha} of all distinct right cosets. In other words,
the number of left cosets equals the number of right cosets (this number may be infi-
nite). For example, consider the map
f : aH → Ha−1 .
This mapping is well-defined; for if aH = bH, then b = ah, where h ∈ H. Thus, f (bH) =
Hb−1 = Hh−1 a−1 = f (aH). It is not hard to show that this mapping is 1–1 and onto (see
exercises). Hence, the number of left cosets equals the number of right cosets.
Definition 9.4.3. Let G be a group and H ⊂ G a subgroup. The number of distinct left
cosets, which is the same as the number of distinct right cosets, is called the index of
H in G, denoted by [G : H].
Now let us consider the case where the group G is finite. Each left coset has the
same size as the subgroup H; here, both are finite. Hence, |aH| = |H| for each coset.
In addition, the group G is a disjoint union of the left cosets; that is,
G = H ∪ g1H ∪ ⋅ ⋅ ⋅ ∪ gnH
with n + 1 = [G : H] cosets in total. Counting elements gives |G| = [G : H] ⋅ |H|, which is Lagrange's theorem. If G is a finite group, this implies that both the order of a subgroup and the index of a subgroup are divisors of the order of the group.
This theorem plays a crucial role in the structure theory of finite groups since it
greatly restricts the size of subgroups. For example, in a group of order 10, there can
be proper subgroups only of orders 1, 2, and 5.
As an immediate corollary, we have the following result:
Corollary 9.4.5. The order of any element g ∈ G, where G is a finite group, divides the
order of the group. In particular, if |G| = n and g ∈ G, then o(g)|n, and g n = 1.
Proof. Let g ∈ G and o(g) = m. Then m is the size of the cyclic subgroup generated
by g; hence, m divides n by Lagrange's theorem. Write n = mk; then

g^n = g^(mk) = (g^m)^k = 1^k = 1.
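A quick numerical check of Lagrange's theorem and Corollary 9.4.5 (an illustration in Python, not part of the text), using the group of units modulo 20 under multiplication:

```python
from math import gcd

# Illustrative check: in the group of units modulo 20, every element
# order divides the group order, and g**|G| = 1 for every g.
m = 20
G = [a for a in range(1, m) if gcd(a, m) == 1]    # group of units, order 8

def order(g):
    k, x = 1, g
    while x != 1:
        x = (x * g) % m
        k += 1
    return k

n = len(G)
assert all(n % order(g) == 0 for g in G)                  # o(g) | n
assert all(pow(g, n, m) == 1 for g in G)                  # g^n = 1
assert sorted(order(g) for g in G) == [1, 2, 2, 2, 4, 4, 4, 4]
```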
Before leaving this section, we consider some results concerning general subsets
of a group.
Suppose that G is a group and S is an arbitrary nonempty subset of G, S ⊂ G, and
S ≠ ∅. Such a set S is usually called a complex of G.
If U and V are two complexes of G, the product UV is defined as follows:
UV = {uv : u ∈ U, v ∈ V}.
Now suppose that U, V are subgroups of G. When is the complex UV again a subgroup of G? The answer is given by the following theorem.

Theorem 9.4.6. Let U and V be subgroups of a group G. Then UV is a subgroup of G if and only if U and V commute; that is, UV = VU.
Proof. We note first that when we say U and V commute, we do not demand that this
is so elementwise. In other words, it is not required that uv = vu for all u ∈ U and all
v ∈ V. All that is required is that for any u ∈ U and v ∈ V uv = v1 u1 for some elements
u1 ∈ U and v1 ∈ V.
Assume that UV is a subgroup of G. Let u ∈ U and v ∈ V. Then u ∈ U ⋅ 1 ⊂ UV and
v ∈ 1 ⋅ V ⊂ UV. But since UV is assumed itself to be a subgroup, it follows that vu ∈ UV.
Hence, each product vu ∈ UV, and so VU ⊂ UV. In an identical manner, UV ⊂ VU,
and so UV = VU.
Conversely, suppose that UV = VU. Let g1 = u1 v1 ∈ UV and g2 = u2 v2 ∈ UV. Since v1 u2 ∈ VU = UV, we can write v1 u2 = u3 v3 with u3 ∈ U, v3 ∈ V; hence, g1 g2 = u1 (v1 u2 )v2 = (u1 u3 )(v3 v2 ) ∈ UV. Furthermore, g1⁻¹ = (u1 v1 )⁻¹ = v1⁻¹ u1⁻¹ ∈ VU = UV, so g1⁻¹ = u4 v4 for some u4 ∈ U, v4 ∈ V. Therefore, UV is a subgroup of G.
Theorem 9.4.7 (Product formula). Let U and V be finite subgroups of a group G. Then

|UV| = |U||V| / |U ∩ V|.

Proof. Write U as a disjoint union of left cosets of U ∩ V, say

U = ⋃_{r∈R} r(U ∩ V),

where R is a set of coset representatives, so that |R| = [U : U ∩ V]. We claim that UV is the disjoint union

UV = ⋃_{r∈R} rV.

Certainly ⋃_{r∈R} rV ⊂ UV. Conversely, let uv ∈ UV. Write u = rv′ with r ∈ R and v′ ∈ U ∩ V; then uv = r(v′v) ∈ rV, so uv ∈ ⋃_{r∈R} rV. The cosets rV for r ∈ R are distinct (see exercises). Therefore,

|UV| = |R||V| = [U : U ∩ V]|V| = (|U| / |U ∩ V|)|V| = |U||V| / |U ∩ V|.
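The formula |UV| = |U||V| / |U ∩ V| is easy to test in a small abelian example (an illustrative Python sketch, not from the text); in the additive group ℤ/24 with U = ⟨4⟩ and V = ⟨6⟩, the set U + V plays the role of UV:

```python
# Illustrative check of |UV| = |U||V| / |U ∩ V| in the additive group Z/24,
# where the complex "UV" becomes the sumset U + V.
n = 24
U = {(4 * k) % n for k in range(n)}   # subgroup <4>, order 6
V = {(6 * k) % n for k in range(n)}   # subgroup <6>, order 4
UV = {(u + v) % n for u in U for v in V}

assert len(U & V) == 2                # U ∩ V = <12> = {0, 12}
assert len(UV) == len(U) * len(V) // len(U & V) == 12
```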
We now show that index is multiplicative. Later, we will see how this fact is related
to the multiplicativity of the degree of field extensions.
Theorem 9.4.8. Suppose G is a group, and U and V are subgroups with U ⊂ V ⊂ G. If G is the disjoint union

G = ⋃_{r∈R} rV

and V is the disjoint union

V = ⋃_{s∈S} sU,

then G is the disjoint union

G = ⋃_{r∈R, s∈S} rsU.

In particular, if the indices are finite,

[G : U] = [G : V][V : U].

Proof. Now

G = ⋃_{r∈R} rV = ⋃_{r∈R} r(⋃_{s∈S} sU) = ⋃_{r∈R, s∈S} rsU,

and it is straightforward to check that the cosets rsU are pairwise disjoint (see exercises). Counting cosets gives [G : U] = |R||S| = [G : V][V : U].
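Multiplicativity of the index can be confirmed numerically (an illustrative sketch, not from the text), again in ℤ/24 with the chain U = ⟨8⟩ ⊂ V = ⟨2⟩ ⊂ G:

```python
# Illustrative check of [G : U] = [G : V][V : U] in Z/24 with U = <8> ⊂ V = <2>.
n = 24
G = set(range(n))
V = {(2 * k) % n for k in range(n)}   # order 12
U = {(8 * k) % n for k in range(n)}   # order 3

assert U <= V <= G
idx_G_U = len(G) // len(U)            # [G : U] = 8
idx_G_V = len(G) // len(V)            # [G : V] = 2
idx_V_U = len(V) // len(U)            # [V : U] = 4
assert idx_G_U == idx_G_V * idx_V_U
```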
The next result says that the intersection of subgroups of finite index must again
be of finite index.
Theorem 9.4.9 (Poincaré). Suppose that U, V are subgroups of finite index in G. Then
U ∩ V is also of finite index. Furthermore,
[G : U ∩ V] ≤ [G : U][G : V].
Proof. Let r be the number of left cosets of U in G that are contained in UV; r is finite since the index [G : U] is finite. From Theorem 9.4.7, we then have

[V : U ∩ V] = r ≤ [G : U].

By Theorem 9.4.8, [G : U ∩ V] = [G : V][V : U ∩ V] ≤ [G : V][G : U], which is finite.
Corollary 9.4.10. Suppose that [G : U] and [G : V] are finite and relatively prime. Then

[G : U ∩ V] = [G : U][G : V],

and G = UV.

Proof. Both [G : U] and [G : V] divide [G : U ∩ V] by Theorem 9.4.8. Since they are relatively prime, their product divides [G : U ∩ V]; combined with Theorem 9.4.9, this gives [G : U ∩ V] = [G : U][G : V]. Now

[G : U ∩ V] = [G : V][V : U ∩ V],

so [V : U ∩ V] = [G : U]. The number of left cosets of U in G that are contained in VU is, therefore, equal to the number of all left cosets of U in G. It follows that we must have G = UV.
Notice that any group G has at least one set of generators, namely G itself. If G =
⟨M⟩ and M is a finite set, then we say that G is finitely generated. Clearly, any finite
group is finitely generated. Shortly, we will give an example of a finitely generated
infinite group.
Example 9.5.3. The set of all reflections forms a set of generators for the Euclidean
group ℰ . Recall that any T ∈ ℰ is either a translation, a rotation, a reflection, or a glide
reflection. It can be shown (see exercises) that any one of these can be expressed as a product of three or fewer reflections.
If G = ⟨g⟩ is cyclic with generator g, then G = {g^n : n ∈ ℤ}; that is, G consists of all the powers of the element g. If there exists an integer m such that g^m = 1, then there exists a smallest such positive integer, say n. It follows that g^k = g^l if and only if k ≡ l mod n. In this situation, the
distinct powers of g are precisely
{1 = g 0 , g, g 2 , . . . , g n−1 }.
It follows that |G| = n. We then call G a finite cyclic group. If no such power exists, then all the powers of g are distinct, and G is an infinite cyclic group.
We show next that any two cyclic groups of the same order are isomorphic.
Theorem 9.5.5.
(a) If G = ⟨g⟩ is an infinite cyclic group, then G ≅ (ℤ, +); that is, the integers under
addition.
(b) If G = ⟨g⟩ is a finite cyclic group of order n, then G ≅ (ℤn , +); that is, the integers
modulo n under addition.
It follows that for a given order there is only one cyclic group up to isomorphism.
Proof. Let G be an infinite cyclic group with generator g. Map g onto 1 ∈ (ℤ, +). Since
g generates G and 1 generates ℤ under addition, this can be extended to a homomor-
phism. It is straightforward to show that this defines an isomorphism.
Now let G be a finite cyclic group of order n with generator g. As above, map g to
1 ∈ ℤn and extend to a homomorphism. Again it is straightforward to show that this
defines an isomorphism.
Now let G and H be two cyclic groups of the same order. If both are infinite, then
both are isomorphic to (ℤ, +) and, hence, isomorphic to each other. If both are finite of
order n, then both are isomorphic to (ℤn , +) and, hence, isomorphic to each other.
Theorem 9.5.6. Let G = ⟨g⟩ be a finite cyclic group of order n. Then every subgroup of
G is also cyclic. Furthermore, if d|n, there exists a unique subgroup of G of order d.
Proof. Let G = ⟨g⟩ be a finite cyclic group of order n, and suppose that H is a subgroup
of G. Notice that if g m ∈ H, then g −m is also in H since H is a subgroup. Hence, H must
contain positive powers of the generator g. Let t be the smallest positive power of g
such that g t ∈ H. We claim that H = ⟨g t ⟩, the cyclic subgroup of G generated by g t . Let
h ∈ H; then h = g^m for some positive integer m ≥ t. Divide m by t to get m = qt + s with 0 ≤ s < t. Then g^s = g^(m−qt) = h(g^t)^(−q) ∈ H. By the minimality of t, we must have s = 0; hence, h = (g^t)^q ∈ ⟨g^t⟩, proving the claim.
Theorem 9.5.7. Let G = ⟨g⟩ be an infinite cyclic group. Then a subgroup H is of the form
H = ⟨g t ⟩ for a positive integer t. Furthermore, if t1 , t2 are positive integers with t1 ≠ t2 ,
then ⟨g t1 ⟩ and ⟨g t2 ⟩ are distinct.
Proof. Let G = ⟨g⟩ be an infinite cyclic group and H a subgroup of G. As in the proof of
Theorem 9.5.6, H must contain positive powers of the generator g. Let t be the smallest
positive power of g such that g t ∈ H. We claim that H = ⟨g t ⟩, the cyclic subgroup of G
generated by g^t. Let h ∈ H; then h = g^m for some positive integer m ≥ t. Divide m by t to get m = qt + s with 0 ≤ s < t. Then g^s = g^(m−qt) = h(g^t)^(−q) ∈ H, so the minimality of t forces s = 0; hence, h = (g^t)^q ∈ ⟨g^t⟩, and H = ⟨g^t⟩.
Theorem 9.5.8. Let G = ⟨g⟩ be a cyclic group. Then the following hold:
(a) If G = ⟨g⟩ is finite of order n, then g k is also a generator if and only if (k, n) = 1. That
is, the generators of G are precisely those powers g k , where k is relatively prime to n.
(b) If G = ⟨g⟩ is infinite, then the only generators are g, g −1 .
Proof. (a) Let G = ⟨g⟩ be a finite cyclic group of order n, and suppose that (k, n) = 1.
Then there exist integers x, y with kx + ny = 1. It follows that

g = g^(kx+ny) = (g^k)^x (g^n)^y = (g^k)^x,

so g ∈ ⟨g^k⟩ and g^k is a generator of G. Conversely, if g^k generates G, then g = (g^k)^x for some integer x, so g^(kx−1) = 1; hence, n | kx − 1, and there is an integer y with

kx + ny = 1.

Therefore, (k, n) = 1.
Recall that for positive integers n, the Euler phi-function is defined by ϕ(n) = |{k ∈ ℕ : 1 ≤ k ≤ n and (k, n) = 1}|; that is, ϕ(n) counts the positive integers up to n that are relatively prime to n.
Corollary 9.5.11. If G = ⟨g⟩ is finite of order n, then there are ϕ(n) generators for G,
where ϕ is the Euler phi-function.
Proof. From Theorem 9.5.8, the generators of G are precisely the powers g k , where
(k, n) = 1. The numbers relatively prime to n are counted by the Euler phi-function.
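A small check of Theorem 9.5.8 and Corollary 9.5.11 (a Python illustration, not from the text) in the additive cyclic group ℤ/12: the generators are exactly the residues coprime to 12, and there are ϕ(12) = 4 of them.

```python
from math import gcd

# Illustrative check in Z/12 (additive): k generates Z/12 exactly when
# gcd(k, 12) = 1, and the number of generators is phi(12) = 4.
n = 12
generators = [k for k in range(n) if len({(k * j) % n for j in range(n)}) == n]

assert generators == [k for k in range(1, n) if gcd(k, n) == 1]  # 1, 5, 7, 11
assert len(generators) == 4
```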
Lemma 9.5.12. Let G be an arbitrary group and g, h ∈ G both of finite order o(g), o(h). If
g and h commute; that is, gh = hg, then o(gh) divides lcm(o(g), o(h)). In particular, if G is
an abelian group, then o(gh)| lcm(o(g), o(h)) for all g, h ∈ G of finite order. Furthermore,
if ⟨g⟩ ∩ ⟨h⟩ = {1}, then o(gh) = lcm(o(g), o(h)).
Proof. Suppose o(g) = n and o(h) = m are finite. If g, h commute, then for any k, we have (gh)^k = g^k h^k. Let t = lcm(n, m); then t = k1 n and t = k2 m. Hence,

(gh)^t = g^t h^t = (g^n)^(k1) (h^m)^(k2) = 1.
Therefore, the order of gh is finite and divides t. Suppose that ⟨g⟩∩⟨h⟩ = {1}; that is, the
cyclic subgroup generated by g intersects trivially with the cyclic subgroup generated
by h. Let k = o(gh), which we know is finite from the first part of the lemma. Let t =
lcm(n, m). We then have (gh)k = g k hk = 1, which implies that g k = h−k . Since the cyclic
subgroups have only trivial intersection, this implies that g k = 1 and hk = 1. But then
n|k and m|k; hence t|k. Since k|t it follows that k = t.
Recall that if m and n are relatively prime, then lcm(m, n) = mn. Furthermore,
if the orders of g and h are relatively prime, it follows from Lagrange’s theorem that
⟨g⟩ ∩ ⟨h⟩ = {1}. We then get the following:
Corollary 9.5.13. If g, h commute and o(g) and o(h) are finite and relatively prime, then
o(gh) = o(g)o(h).
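The order formula for commuting elements can be checked in a small example (an illustrative Python sketch, not from the text), using the additive group ℤ/12, where the order of a is 12/gcd(a, 12):

```python
from math import gcd, lcm

# Illustrative check of Lemma 9.5.12 in the additive group Z/12: the
# commuting elements g = 4 (order 3) and h = 6 (order 2) generate cyclic
# subgroups meeting trivially, so o(g + h) = lcm(3, 2) = 6.
n = 12
order = lambda a: n // gcd(a, n)
g, h = 4, 6

assert {(g * k) % n for k in range(n)} & {(h * k) % n for k in range(n)} == {0}
assert order((g + h) % n) == lcm(order(g), order(h)) == 6
```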
Definition 9.5.14. If G is a finite abelian group, then the exponent of G is the lcm of the orders of all elements of G. That is, exp(G) = lcm{o(g) : g ∈ G}.
Lemma 9.5.15. Let G be a finite abelian group. Then G contains an element of order
exp(G).
Proof. Suppose that exp(G) = p1^(e1) ⋅ ⋅ ⋅ pk^(ek) with the pi distinct primes. By the definition of exp(G), for each i there is a gi ∈ G with o(gi) = pi^(ei) ri, where pi and ri are relatively prime. Let hi = gi^(ri). Then from Lemma 9.5.12, we get o(hi) = pi^(ei). Now let g = h1 h2 ⋅ ⋅ ⋅ hk. From the corollary to Lemma 9.5.12, we have o(g) = p1^(e1) ⋅ ⋅ ⋅ pk^(ek) = exp(G).
Theorem 9.5.16. Let K be a field, and let A be a finite subgroup of the multiplicative group K⋆ = K \ {0}. Then A is cyclic.

Proof. Let A ⊂ K⋆ with |A| = n. Suppose that m = exp(A). Consider the polynomial
f (x) = xm − 1 ∈ K[x]. Since the order of each element in A divides m, it follows that
am = 1 for all a ∈ A; hence, each a ∈ A is a zero of the polynomial f (x). Hence, f (x) has
at least n zeros. Since a polynomial of degree m over a field can have at most m zeros, it
follows that n ≤ m. From Lemma 9.5.15, there is an element a ∈ A with o(a) = m. Since
|A| = n, it follows that m|n; hence, m ≤ n. Therefore, m = n; hence, A = ⟨a⟩ showing
that A is cyclic.
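In particular, the multiplicative group of ℤ/p is cyclic for every prime p. A brute-force search (an illustrative Python sketch, not from the text) finds its generators for p = 13:

```python
# The theorem implies that the multiplicative group of Z/p, p prime, is
# cyclic. Brute-force search for the generators when p = 13.
p = 13
gens = [g for g in range(1, p)
        if len({pow(g, k, p) for k in range(1, p)}) == p - 1]

assert gens == [2, 6, 7, 11]   # the phi(12) = 4 primitive roots mod 13
```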
We close this section with two other results concerning cyclic groups. The first
proves, using group theory, a very interesting number theoretic result concerning the
Euler phi-function.
Theorem 9.5.17. For any positive integer n,

∑_{d|n} ϕ(d) = n.
Proof. Consider a cyclic group G of order n. For each d|n, d ≥ 1, there is a unique
cyclic subgroup H of order d. H then has ϕ(d) generators. Each element in G generates
its own cyclic subgroup H1 , say of order d and, hence, is counted among the ϕ(d) generators of H1 . Therefore, grouping the n elements of G according to the cyclic subgroup they generate gives n = ∑_{d|n} ϕ(d).
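The identity ∑_{d|n} ϕ(d) = n is easy to confirm computationally (an illustrative Python sketch, not from the text):

```python
from math import gcd

# Illustrative check of sum_{d|n} phi(d) = n for every n up to 50.
def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in range(1, 51):
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
```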
We shall make use of the above theorem directly in the following theorem.
Theorem 9.5.18. If |G| = n and if for each positive d such that d|n, G has at most one
cyclic subgroup of order d, then G is cyclic (and, consequently, has exactly one cyclic
subgroup of order d).
Proof. For each d|n, d > 0, let ψ(d) = the number of elements of G of order d. Then
∑_{d|n} ψ(d) = n.
Now suppose that ψ(d) ≠ 0 for a given d|n. Then there exists an a ∈ G of order d,
which generates a cyclic subgroup, ⟨a⟩, of order d of G. We claim that all elements of
G of order d are in ⟨a⟩. Indeed, if b ∈ G with o(b) = d and b ∉ ⟨a⟩, then ⟨b⟩ is a second
cyclic subgroup of order d, distinct from ⟨a⟩. This contradicts the hypothesis, so the
claim is proved. Thus, if ψ(d) ≠ 0, then ψ(d) = ϕ(d). In general, we have ψ(d) ≤ ϕ(d),
for all positive d|n. But n = ∑d|n ψ(d) ≤ ∑d|n ϕ(d), by the previous theorem. It follows,
clearly, from this that ψ(d) = ϕ(d) for all d|n. In particular, ψ(n) = ϕ(n) ≥ 1. Hence,
there exists at least one element of G of order n; hence, G is cyclic. This completes the
proof.
Corollary 9.5.19. If in a group G of order n, for each d|n, the equation x d = 1 has at most
d solutions in G, then G is cyclic.
Proof. The hypothesis clearly implies that G can have at most one cyclic subgroup of
order d since all elements of such a subgroup satisfy the equation. So Theorem 9.5.18
applies to give our result.
Theorem 9.5.20. Let G be a finitely generated group. Then, for each positive integer n, the number of subgroups of G of index n is finite.
9.6 Exercises
1. Prove Lemma 9.1.4.
2. Let G be a group and H a nonempty subset. Show that H is a subgroup of G if and only if ab⁻¹ ∈ H for all a, b ∈ H.
3. Suppose that g ∈ G and g^m = 1 for some positive integer m. Let n be the smallest positive integer such that g^n = 1. Show that the elements 1, g, g^2, . . . , g^(n−1) are all distinct, but for any other power g^k , we have g^k = g^t for some t = 0, 1, . . . , n − 1.
4. Let G be a group and U1 , U2 be finite subgroups of G. If |U1 | and |U2 | are relatively
prime, then U1 ∩ U2 = {e}.
5. Let A, B be subgroups of a finite group G. If |A| ⋅ |B| > |G| then A ∩ B ≠ {e}.
6. Let G be the set of all real matrices of the form ( a −b ; b a ) (rows (a, −b) and (b, a)), where a^2 + b^2 ≠ 0. Show:
(a) G is a group.
(b) For each n ∈ ℕ there is at least one element of order n in G.
7. Let p be a prime, and let G = SL(2, p) = SL(2, ℤp ). Show: G has at least 2p − 2
elements of order p.
8. Let p be a prime and a ∈ ℤ. Show that a^p ≡ a mod p.
9. Here we outline a proof that every planar Euclidean congruence motion is either
a rotation, translation, reflection or glide reflection. An isometry in this problem
is a planar Euclidean congruence motion. Show:
(a) If T is an isometry then it is completely determined by its action on a triangle –
equivalent to showing that if T fixes three noncollinear points then it must be
the identity.
(b) If an isometry T has exactly one fixed point then it must be a rotation with
that point as center.
(c) If an isometry T has two fixed points then it fixes the line joining them. Then
show that if T is not the identity it must be a reflection through this line.
(d) If an isometry T has no fixed point but preserves orientation then it must be
a translation.
(e) If an isometry T has no fixed point but reverses orientation then it must be a
glide reflection.
10. Let Pn be a regular n-gon and Dn its group of symmetries. Show that |Dn | = 2n.
(Hint: First show that |Dn | ≤ 2n and then exhibit 2n distinct symmetries.)
11. If A, B have the same cardinality, then there exists a bijection σ : A → B. Define a
map F : SA → SB in the following manner: if f ∈ SA , let F(f ) be the permutation
on B given by F(f )(b) = σ(f (σ −1 (b))). Show that F is an isomorphism.
12. Prove Lemma 9.3.3.
Definition 10.1.1. Let G be an arbitrary group and suppose that H1 and H2 are sub-
groups of G. We say that H2 is conjugate to H1 if there exists an element a ∈ G such that
H2 = a⁻¹H1 a. H1 , H2 are then called conjugate subgroups of G.
Lemma 10.1.2. Let G be an arbitrary group. Then the relation of conjugacy is an equiv-
alence relation on the set of subgroups of G.
For a fixed g ∈ G, consider the map f : G → G given by f(a) = g⁻¹ag. For a1 , a2 ∈ G, we have f(a1 a2 ) = g⁻¹a1 a2 g = (g⁻¹a1 g)(g⁻¹a2 g) = f(a1 )f(a2 ). Hence, f is a homomorphism.
If f (a1 ) = f (a2 ), then g −1 a1 g = g −1 a2 g. Clearly, by the cancellation law, we then
have a1 = a2 ; hence, f is one-to-one.
Finally, let a ∈ G, and let a1 = gag −1 . Then a = g −1 a1 g; hence, f (a1 ) = a. It follows
that f is onto; therefore, f is an automorphism on G.
https://doi.org/10.1515/9783110603996-010
Lemma 10.1.5. Let N be a subgroup of a group G. Then if a−1 Na ⊂ N for all a ∈ G, then
a−1 Na = N. In particular, a−1 Na ⊂ N for all a ∈ G implies that N is a normal subgroup.
Notice that if g −1 Hg = H, then Hg = gH. That is as sets the left coset, gH, is equal
to the right coset, Hg. Hence, for each h1 ∈ H, there is an h2 ∈ H with gh1 = h2 g. If
H ⊲ G, this is true for all g ∈ G. Furthermore, if H is normal, then for the product of two cosets g1 H and g2 H, we have (g1 H)(g2 H) = g1 (Hg2 )H = g1 g2 HH = g1 g2 H.
Lemma 10.1.6. Let H be a subgroup of a group G. Then the following are equivalent:
(1) H is a normal subgroup of G.
(2) g −1 Hg = H for all g ∈ G.
(3) gH = Hg for all g ∈ G.
(4) (g1 H)(g2 H) = (g1 g2 )H for all g1 , g2 ∈ G.
This is precisely the condition needed to construct factor groups. First we give some examples of normal subgroups. For example, any subgroup H of index 2 in a group G is normal: if g ∉ H, then

G = H ∪ gH = H ∪ Hg.

Since each union is a disjoint union, we must have gH = Hg for every g ∈ G; hence, H is normal.
Lemma 10.1.9. Let K be any field. Then the group SL(n, K) is a normal subgroup of
GL(n, K) for any positive integer n.
Proof. Recall that GL(n, K) is the group of n × n matrices over the field K with nonzero
determinant, whereas SL(n, K) is the subgroup of n × n matrices over the field K with
determinant equal to 1. Let U ∈ SL(n, K) and T ∈ GL(n, K). Consider T⁻¹UT. Then det(T⁻¹UT) = det(T)⁻¹ det(U) det(T) = det(U) = 1. Hence, T⁻¹UT ∈ SL(n, K) for any U ∈ SL(n, K), and any T ∈ GL(n, K). It follows that
T −1 SL(n, K)T ⊂ SL(n, K); therefore, SL(n, K) is normal in GL(n, K).
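The determinant argument is easy to test concretely (an illustrative Python sketch, not from the text) with 2 × 2 matrices over the field ℤ/5:

```python
# Illustrative check of Lemma 10.1.9 over Z/5: conjugating a
# determinant-1 matrix by an invertible matrix preserves determinant 1.
p = 5

def det(M):
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % p

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2)]
            for i in range(2)]

def inv(T):
    # inverse via the 2x2 adjugate formula and a modular inverse of det(T)
    d = pow(det(T), -1, p)
    return [[(T[1][1] * d) % p, (-T[0][1] * d) % p],
            [(-T[1][0] * d) % p, (T[0][0] * d) % p]]

U = [[1, 1], [0, 1]]          # in SL(2, Z/5): det = 1
T = [[2, 0], [0, 1]]          # in GL(2, Z/5): det = 2
C = mul(mul(inv(T), U), T)    # T^{-1} U T

assert det(U) == 1 and det(T) == 2
assert det(C) == 1            # the conjugate stays in SL(2, Z/5)
```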
The intersection of normal subgroups is again normal, and the product of normal
subgroups is normal.
Lemma 10.1.10. Let N1 , N2 be normal subgroups of the group G. Then the following hold:
(1) N1 ∩ N2 is a normal subgroup of G.
(2) N1 N2 is a normal subgroup of G.
(3) If H is any subgroup of G, then N1 ∩ H is a normal subgroup of H, and N1 H = HN1 .
For part (2), note that for n1 ∈ N1 , n2 ∈ N2 , and g ∈ G, we have g⁻¹(n1 n2 )g = (g⁻¹n1 g)(g⁻¹n2 g) ∈ N1 N2 .
Definition 10.1.11. Let G be an arbitrary group and H a normal subgroup of G. Let G/H
denote the set of distinct left (and hence also right) cosets of H in G. On G/H, define
the multiplication
(g1 H)(g2 H) = g1 g2 H
Theorem 10.1.12. Let G be a group and H a normal subgroup of G. Then G/H under the
operation defined above forms a group. This group is called the factor group or quotient
group of G modulo H. The identity element is the coset 1H = H, and the inverse of a coset
gH is g −1 H.
Proof. We first show that the operation on G/N is well-defined. Suppose that a′N = aN and b′N = bN. Then b′ ∈ bN, and so b′ = bn1 . Similarly, a′ = an2 , where n1 , n2 ∈ N. Therefore,

a′b′N = an2 bn1 N = an2 bN = ab(b⁻¹n2 b)N = abN,

since b⁻¹n2 b ∈ N by the normality of N. Thus, we have shown that if N ⊲ G, then a′b′N = abN, and the operation on G/N is indeed well-defined.
The associative law is true, because coset multiplication as defined above uses the
ordinary group operation, which is by definition associative.
The coset N serves as the identity element of G/N. Notice that
aN ⋅ N = aN 2 = aN,
and
N ⋅ aN = aN 2 = aN.
The inverse of the coset aN is a⁻¹N, since aNa⁻¹N = aa⁻¹N² = N.
We emphasize that the elements of G/N are cosets; thus, subsets of G. If |G| < ∞,
then |G/N| = [G : N], the number of cosets of N in G. It is also to be emphasized that
for G/N to be a group, N must be a normal subgroup of G.
In some cases, properties of G are preserved in factor groups.
Lemma 10.1.13. If G is abelian, then any factor group of G is also abelian. If G is cyclic,
then any factor group of G is also cyclic.
Recall that a group G is called simple if it has no normal subgroups other than {1} and G itself. One of the most outstanding problems in group theory has been to give a complete
classification of all finite simple groups. In other words, this is the program to discover
all finite simple groups, and to prove that there are no more to be found. This was ac-
complished through the efforts of many mathematicians. The proof of this magnificent
result took thousands of pages. We refer the reader to [25] for a complete discussion of
this. We give one elementary example:
Lemma 10.1.15. Any finite group of prime order is simple and cyclic.
Proof. Suppose that G is a finite group and |G| = p, where p is a prime. Let g ∈ G with
g ≠ 1. Then ⟨g⟩ is a nontrivial subgroup of G, so its order divides the order of G by
Lagrange’s theorem. Since g ≠ 1, and p is a prime, we must have |⟨g⟩| = p. Therefore,
⟨g⟩ is all of G; that is, G = ⟨g⟩; hence, G is cyclic.
The argument above shows that G has no nontrivial proper subgroups and, there-
fore, no nontrivial normal subgroups. Therefore, G is simple.
In the next chapter, we will examine certain other finite simple groups.
Earlier, we proved the result called the ring isomorphism theorem. We now look at the group theoretical analog of this result, called the group isomorphism theorem. We will then examine some consequences of this result that will be crucial in the Galois theory of fields.
That is, ker(f) = {g ∈ G1 : f(g) = 1}, the set of the elements of G1 that map onto the identity of G2 . The image of f , denoted im(f ), is the set of elements of G2 mapped onto by f from elements of G1 . That is, im(f) = {h ∈ G2 : h = f(g) for some g ∈ G1 }.
Proof. Suppose that f is injective. Since f (1) = 1, we always have 1 ∈ ker(f ). Suppose
that g ∈ ker(f ). Then f (g) = f (1). Since f is injective, this implies that g = 1; hence,
ker(f ) = {1}.
Conversely, suppose that ker(f ) = {1} and f (g1 ) = f (g2 ). Then f(g1 g2⁻¹) = f(g1 )(f(g2 ))⁻¹ = 1, so g1 g2⁻¹ ∈ ker(f ) = {1}; hence, g1 = g2 , and f is injective.
We now state the group isomorphism theorem. This is entirely analogous to the
ring isomorphism theorem replacing ideals by normal subgroups. We note that this
theorem is sometimes called the first group isomorphism theorem.
(a) Let f : G1 → G2 be a group homomorphism. Then ker(f ) is a normal subgroup of G1 , im(f ) is a subgroup of G2 , and

G1 / ker(f ) ≅ im(f ).
(b) Conversely, suppose that N is a normal subgroup of a group G. Then there exists a
group H and a homomorphism f : G → H such that ker(f ) = N, and im(f ) = H.
Proof. (a) Since 1 ∈ ker(f ), the kernel is nonempty. Suppose that g1 , g2 ∈ ker(f ). Then
f (g1 ) = f (g2 ) = 1. It follows that f (g1 g2−1 ) = f (g1 )(f (g2 ))−1 = 1. Hence, g1 g2−1 ∈ ker(f );
therefore, ker(f ) is a subgroup of G1 . Furthermore, for any g ∈ G1 , we have

f(g⁻¹)f(g) = f(g⁻¹g) = f(1) = 1;

hence, f(g⁻¹) = (f(g))⁻¹. For g ∈ G1 and k ∈ ker(f ), we also have f(g⁻¹kg) = (f(g))⁻¹f(k)f(g) = (f(g))⁻¹f(g) = 1, so g⁻¹kg ∈ ker(f ); therefore, ker(f ) is a normal subgroup of G1 .
Define f̂ : G1 / ker(f ) → im(f ) by f̂(g ker(f )) = f (g). If g1 ker(f ) = g2 ker(f ), then g1 = g2 k with k ∈ ker(f ), so f (g1 ) = f (g2 ); hence, f̂ is well-defined. Moreover, f̂((g1 ker(f ))(g2 ker(f ))) = f̂(g1 g2 ker(f )) = f (g1 g2 ) = f (g1 )f (g2 ) = f̂(g1 ker(f ))f̂(g2 ker(f )); therefore, f̂ is a homomorphism.
Suppose that f ̂(g1 ker(f )) = f ̂(g2 ker(f )), then f (g1 ) = f (g2 ); hence, g1 ker(f ) =
g2 ker(f ). It follows that f ̂ is injective.
Finally, suppose that h ∈ im(f ). Then there exists a g ∈ G1 with f (g) = h. Then
̂f (g ker(f )) = h, and f ̂ is a surjection onto im(f ). Therefore, f ̂ is an isomorphism com-
pleting the proof of part (a).
(b) Conversely, suppose that N is a normal subgroup of G. Define the map f : G →
G/N by f (g) = gN for g ∈ G. By the definition of the product in the quotient group G/N,
it is clear that f is a homomorphism with im(f ) = G/N. If g ∈ ker(f ), then f (g) = gN = N
since N is the identity in G/N. However, this implies that g ∈ N; hence, it follows that
ker(f ) = N, completing the proof.
There are two related theorems that are called the second isomorphism theorem
and the third isomorphism theorem.
Proof. From Lemma 10.1.10, we know that U ∩ N is normal in U. Define the map

α : UN → U/(U ∩ N), α(un) = u(U ∩ N), for u ∈ U, n ∈ N.

If un = u′n′ with u, u′ ∈ U and n, n′ ∈ N, then u′⁻¹u = n′n⁻¹ ∈ U ∩ N, so u(U ∩ N) = u′(U ∩ N), and α is well-defined. However, U ∩ N is normal in U, and N is normal in G, so (un)(u′n′) = uu′(u′⁻¹nu′)n′ with u′⁻¹nu′ ∈ N; hence, α((un)(u′n′)) = uu′(U ∩ N) = α(un)α(u′n′). Therefore, α is a homomorphism.
We have im(α) = U/(U ∩ N) by definition. Suppose that un ∈ ker(α). Then α(un) = u(U ∩ N) = U ∩ N, so u ∈ U ∩ N ⊂ N, which implies un ∈ N. Conversely, N ⊂ ker(α). Therefore, ker(α) = N. From the group isomorphism theorem, we then have UN/N ≅ U/(U ∩ N).
For the third isomorphism theorem, let N ⊂ M be normal subgroups of G. Then M/N is a normal subgroup of G/N, and

(G/N)/(M/N) ≅ G/M.

To see this, define β : G/N → G/M by

β(gN) = gM.

Then β is a well-defined surjective homomorphism with ker(β) = M/N, so the group isomorphism theorem gives (G/N)/(M/N) ≅ G/M.
ϕ : H → f (H), defined for subgroups H of G with ker(f ) ⊂ H.
Proof. We first show that the mapping ϕ is surjective. Let H1 be a subgroup of G/N,
and let
H = {g ∈ G : f (g) ∈ H1 }.
G = G1 × G2 = {(a, b) : a ∈ G1 , b ∈ G2 }.
On G, define the operation

(a1 , b1 )(a2 , b2 ) = (a1 a2 , b1 b2 ).

With this operation, it is direct to verify the group axioms for G; hence, G becomes a group.
Theorem 10.3.1. Let G1 , G2 be groups and G the Cartesian product G1 × G2 with the op-
eration defined above. Then G forms a group called the direct product of G1 and G2 . The
identity element is (1, 1), and (g, h)−1 = (g −1 , h−1 ).
This construction can be iterated to any finite number of groups (also to an infinite
number, but we will not consider that here) G1 , . . . , Gn to form the direct product G1 ×
G2 × ⋅ ⋅ ⋅ × Gn .
Proof. The map (a, b) → (b, a), where a ∈ G1 , b ∈ G2 provides an isomorphism from
G1 × G2 → G2 × G1 .
Suppose that both G1 , G2 are abelian. Then if a1 , a2 ∈ G1 , b1 , b2 ∈ G2 , we have

(a1 , b1 )(a2 , b2 ) = (a1 a2 , b1 b2 ) = (a2 a1 , b2 b1 ) = (a2 , b2 )(a1 , b1 );

hence, G1 × G2 is abelian.
Conversely, suppose G1 × G2 is abelian, and suppose that a1 , a2 ∈ G1 . Then for the identity 1 ∈ G2 , we have (a1 , 1)(a2 , 1) = (a2 , 1)(a1 , 1), so (a1 a2 , 1) = (a2 a1 , 1); hence, a1 a2 = a2 a1 , and G1 is abelian. Similarly, G2 is abelian.
If the factors are finite, it is easy to find the order of G1 ×G2 . The size of the Cartesian
product is just the product of the sizes of the factors.
Lemma 10.3.4. If |G1 | and |G2 | are finite, then |G1 × G2 | = |G1 ||G2 |.
Theorem 10.3.5. Suppose that G is a group with normal subgroups G1 , G2 such that G =
G1 G2 , and G1 ∩ G2 = {1}. Then G is isomorphic to the direct product G1 × G2 .
Now map G onto G1 × G2 by f (ab) = (a, b) for a ∈ G1 , b ∈ G2 . We claim that this is an isomorphism. It is clearly onto, and it is well-defined because each g ∈ G has a unique expression g = ab: if ab = a′b′, then (a′)⁻¹a = b′b⁻¹ ∈ G1 ∩ G2 = {1}. Since G1 and G2 are normal with trivial intersection, elements of G1 commute with elements of G2 ; hence, f ((a1 b1 )(a2 b2 )) = f ((a1 a2 )(b1 b2 )) = (a1 a2 , b1 b2 ) = f (a1 b1 )f (a2 b2 ), and f is a homomorphism. Finally, if f (ab) = (1, 1), then a = b = 1, so f is 1–1.
Theorem 10.4.1 (Basis theorem for finite abelian groups). Let G be a finite abelian
group. Then G is a direct product of cyclic groups of prime power order.
Before giving the proof, we give two examples showing how this theorem leads to
the classification of finite abelian groups.
Since all cyclic groups of order n are isomorphic to (ℤn , +), we will denote a cyclic
group of order n by ℤn .
Example 10.4.2. Classify all abelian groups of order 60. Let G be an abelian group of order 60. From Theorem 10.4.1, G must be a direct product of cyclic groups of prime power order. Now 60 = 2² ⋅ 3 ⋅ 5, so the only primes involved are 2, 3, and 5. Hence, the cyclic groups involved in the direct product decomposition of G have order 2, 4, 3, or 5 (by Lagrange's theorem, they must be divisors of 60). Therefore, G must be of the form

G ≅ ℤ4 × ℤ3 × ℤ5 or G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ5 .
Hence, up to isomorphism, there are only two abelian groups of order 60.
Example 10.4.3. Classify all abelian groups of order 180. Now 180 = 2² ⋅ 3² ⋅ 5, so the only primes involved are 2, 3, and 5. Hence, the cyclic groups involved in the direct product decomposition of G have order 2, 4, 3, 9, or 5. Therefore, G must be one of

G ≅ ℤ4 × ℤ9 × ℤ5 ,
G ≅ ℤ2 × ℤ2 × ℤ9 × ℤ5 ,
G ≅ ℤ4 × ℤ3 × ℤ3 × ℤ5 ,
G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ3 × ℤ5 .

Hence, up to isomorphism, there are four abelian groups of order 180.
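By the basis theorem, the number of abelian groups of order n, up to isomorphism, is the product over the prime factorization n = ∏ p^e of the number of partitions of each exponent e. A small computational sketch (an illustration, not from the text) that reproduces the counts in the two examples:

```python
# Counting abelian groups of order n via partitions of the exponents in
# the prime factorization of n.
def partitions(e, max_part=None):
    # number of ways to write e as a sum of positive integers
    if max_part is None:
        max_part = e
    if e == 0:
        return 1
    return sum(partitions(e - k, k) for k in range(1, min(e, max_part) + 1))

def exponents(n):
    exps, d = [], 2
    while d * d <= n:
        e = 0
        while n % d == 0:
            n //= d
            e += 1
        if e:
            exps.append(e)
        d += 1
    if n > 1:
        exps.append(1)
    return exps

def num_abelian(n):
    result = 1
    for e in exponents(n):
        result *= partitions(e)
    return result

assert num_abelian(60) == 2    # as in Example 10.4.2
assert num_abelian(180) == 4   # as in Example 10.4.3
```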
Lemma 10.4.4. Let G be a finite abelian group, and let p||G|, where p is a prime. Then
all the elements of G, whose orders are a power of p, form a normal subgroup of G. This
subgroup is called the p-primary component of G, which we will denote by Gp .
Proof. Let p be a prime with p||G|, and let a and b be two elements of G of order a power
of p. Since G is abelian, the order of ab is the lcm of the orders, which is again a power
of p. Therefore, ab ∈ Gp . The order of a⁻¹ is the same as the order of a, so a⁻¹ ∈ Gp ; therefore, Gp is a subgroup. Since G is abelian, every subgroup is normal; hence, Gp is a normal subgroup.
Lemma 10.4.5. Let G be a finite abelian group of order n. Suppose that n = p1^(e1) ⋅ ⋅ ⋅ pk^(ek) with p1 , . . . , pk distinct primes. Then

G ≅ Gp1 × ⋅ ⋅ ⋅ × Gpk ,

where Gpi denotes the pi -primary component of G.
Proof. Each Gpi is normal since G is abelian, and since distinct primes are relatively prime, the intersection of the Gpi is the identity. Therefore, Lemma 10.4.5 will follow by showing that each element of G is a product of elements in the Gpi .
Let g ∈ G. Then the order of g is p1^(f1) ⋅ ⋅ ⋅ pk^(fk). For each i, write this as pi^(fi) qi with (qi , pi ) = 1. Then g^(qi) has order pi^(fi) and, hence, is in Gpi . Now, since the qi have no common factor, there exist m1 , . . . , mk with

m1 q1 + ⋅ ⋅ ⋅ + mk qk = 1;

hence,

g = g^(m1 q1 + ⋅ ⋅ ⋅ + mk qk) = (g^(q1))^(m1) ⋅ ⋅ ⋅ (g^(qk))^(mk) ∈ Gp1 ⋅ ⋅ ⋅ Gpk .
We next need the concept of a basis. Let G be any finitely generated abelian group
(finite or infinite), and let g1 , . . . , gn be a set of generators for G. The generators g1 , . . . , gn
form a basis if
G = ⟨g1 ⟩ × ⋅ ⋅ ⋅ × ⟨gn ⟩;
that is, G is the direct product of the cyclic subgroups generated by the gi . The basis
theorem for finite abelian groups says that any finite abelian group has a basis.
Suppose that G is a finite abelian group with a basis g1 , . . . , gk so that G = ⟨g1 ⟩ ×
⋅ ⋅ ⋅ × ⟨gk ⟩. Since G is finite, each gi has finite order, say mi . It follows then, from the fact
that G is a direct product, that each g ∈ G can be expressed as
g = g1^(n1) ⋅ ⋅ ⋅ gk^(nk)
and, furthermore, the integers n1 , . . . , nk are unique modulo the order of gi . Hence,
each integer ni can be chosen in the range 0, 1, . . . , mi − 1, and within this range for the
element g, the integer ni is unique.
From the previous lemma, each finite abelian group splits into a direct product of
its p-primary components for different primes p. Hence, to complete the proof of the
basis theorem, we must show that any finite abelian group of order pm for some prime
p has a basis. We call an abelian group of order pm an abelian p-group.
Consider an abelian group G of order pm for a prime p. It is somewhat easier to
complete the proof if we consider the group using additive notation. That is, the oper-
ation is considered +, the identity as 0, and powers are given by multiples. Hence, if
an element g ∈ G has order pk , then in additive notation, pk g = 0. A set of elements
g1 , . . . , gk is then a basis for G if each g ∈ G can be expressed uniquely as g = m1 g1 +
⋅ ⋅ ⋅ + mk gk , where the mi are unique modulo the order of gi . We say that the g1 , . . . , gk
are independent, and this is equivalent to the fact that whenever m1 g1 + ⋅ ⋅ ⋅ + mk gk = 0,
then mi ≡ 0 modulo the order of gi . We now prove that any abelian p-group has a basis.
Lemma 10.4.6. Let G be a finite abelian group of prime power order pn for some prime p.
Then G is a direct product of cyclic groups.
Proof. We induct on the exponent of G. First suppose that G has exponent p, and let {g1 , . . . , gk } be a minimal set of generators for G. Suppose that

m1 g1 + ⋅ ⋅ ⋅ + mk gk = 0 (10.1)

for some set of integers mi . Since the order of each gi is p, as explained above, we may
assume that 0 ≤ mi < p for i = 1, . . . , k. Suppose that one mi ≠ 0. Then (mi , p) = 1;
hence, there exists an xi with mi xi ≡ 1 mod p (see Chapter 4). Multiplying the equa-
tion (10.1) by xi , we get modulo p,
m1 xi g1 + ⋅ ⋅ ⋅ + gi + ⋅ ⋅ ⋅ + mk xi gk = 0,

and rearranging

gi = −m1 xi g1 − ⋅ ⋅ ⋅ − mk xi gk ,

where the term in gi is omitted from the right-hand side.
But then gi can be expressed in terms of the other gj ; therefore, the set {g1 , . . . , gk } is
not minimal. It follows that g1 , . . . , gk constitute a basis, and the lemma is true for the
exponent p.
Now suppose that any finite abelian group of exponent p^(n−1) has a basis, and assume that G has exponent p^n. Consider the subset pG = {pg : g ∈ G}. It is straightforward to check that this forms a subgroup (see exercises). Since p^n g = 0 for all g ∈ G, it follows that p^(n−1)(pg) = 0 for all pg ∈ pG, and so the exponent of pG is at most p^(n−1). By the inductive hypothesis, pG has a basis

S = {pg1 , . . . , pgk }.
Consider the set {g1 , . . . , gk }, and adjoin to this set the set of all elements h ∈ G, satis-
fying ph = 0. Call this set S1 , so that we have
S1 = {g1 , . . . , gk , h1 , . . . , ht }.
We claim that S1 is a set of generators for G. Let g ∈ G. Then pg ∈ pG, which has the basis pg1 , . . . , pgk , so that

pg = m1 pg1 + ⋅ ⋅ ⋅ + mk pgk .
Hence,

p(g − m1 g1 − ⋅ ⋅ ⋅ − mk gk ) = 0,

so g − m1 g1 − ⋅ ⋅ ⋅ − mk gk = hi for some element hi with phi = 0, and therefore g = m1 g1 + ⋅ ⋅ ⋅ + mk gk + hi . Thus, S1 generates G.
Now pass to a minimal generating subset {g1 , . . . , gr , h1 , . . . , hs } of S1 , and suppose that

m1 g1 + ⋅ ⋅ ⋅ + mr gr + n1 h1 + ⋅ ⋅ ⋅ + ns hs = 0 (10.2)

for integers mi , nj . If some coefficient were invertible modulo the order of its generator, then exactly as in the exponent p case, we could solve for that generator, for example,

gi = −m1 xi g1 − ⋅ ⋅ ⋅ − ns xi hs ,

contradicting minimality. Multiplying (10.2) by p, and using that phj = 0 for each j, we obtain, with ai = mi ,

a1 pg1 + ⋅ ⋅ ⋅ + ar pgr = 0.

The pg1 , . . . , pgr are independent and, hence, ai pgi = 0 for each i; thus, mi ≡ 0 modulo the order of gi . Now (10.2) becomes

n1 h1 + ⋅ ⋅ ⋅ + ns hs = 0.
For more details see the proof of the general result on modules over principal ideal
domains later in the book. There is also an additional elementary proof for the basis
theorem for finitely generated abelian groups.
Example 10.5.1. In Example 9.2.6, we saw that the symmetry group of an equilateral
triangle has 6 elements and is generated by elements r and f , which satisfy the relations r^3 = f^2 = 1, f⁻¹rf = r⁻¹, where r is a rotation of 120° about the center of the
triangle, and f is a reflection through an altitude. This was called the dihedral group
D3 of order 6.
This can be generalized to any regular n-gon, n > 2. If D is a regular n-gon, then
the symmetry group Dn has 2n elements, and is called the dihedral group of order 2n.
It is generated by elements r and f , which satisfy the relations r^n = f^2 = 1, f⁻¹rf = r^(n−1), where r is a rotation of 2π/n about the center of the n-gon, and f is a reflection.
Hence, D4 , the symmetries of a square, has order 8 and D5 , the symmetries of a
regular pentagon, has order 10.
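The count |Dn| = 2n can be checked concretely (an illustrative Python sketch, not from the text) by letting a rotation r and a reflection f of the square act as permutations of the vertices 0..3 and closing under composition:

```python
# Illustrative check that |D_4| = 8: the rotation r and a reflection f
# of the square, as permutations of the vertices 0..3, generate a group
# of 2*4 elements satisfying f^{-1} r f = r^{-1}.
def compose(p, q):
    # (p ∘ q)(x) = p(q(x))
    return tuple(p[q[i]] for i in range(4))

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)     # rotation by 90 degrees
f = (0, 3, 2, 1)     # reflection through the diagonal fixing 0 and 2

elems, frontier = {e}, [e]
while frontier:
    a = frontier.pop()
    for g in (r, f):
        b = compose(a, g)
        if b not in elems:
            elems.add(b)
            frontier.append(b)

r_inv = compose(r, compose(r, r))            # r^3 = r^{-1}
assert len(elems) == 8                       # |D_4| = 2 * 4
assert compose(f, compose(r, f)) == r_inv    # f^{-1} r f = r^{-1}, as f = f^{-1}
```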
These elements then form a group of order 8 called the quaternion group denoted by Q.
Since ijk = −1, we have ij = −ji, and the generators i and j satisfy the relations i^4 = j^4 = 1, i^2 = j^2 , ij = i^2 ji.
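The group Q can also be realized by a standard 2 × 2 complex matrix representation (an illustration, not the text's construction), sending i to [[i, 0], [0, −i]] and j to [[0, 1], [−1, 0]]; closing under multiplication yields exactly 8 elements and verifies the stated relations:

```python
# The quaternion group Q via a standard 2x2 complex matrix representation:
# i -> [[i, 0], [0, -i]], j -> [[0, 1], [-1, 0]], k = ij.
def mul(A, B):
    return tuple(tuple(sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2))
                 for r in range(2))

E = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))
K = mul(I, J)
minus = lambda A: tuple(tuple(-x for x in row) for row in A)

# close {I, J} under multiplication to generate the whole group
elems, frontier = {E}, [E]
while frontier:
    A = frontier.pop()
    for g in (I, J):
        B = mul(A, g)
        if B not in elems:
            elems.add(B)
            frontier.append(B)

assert len(elems) == 8               # |Q| = 8
assert mul(K, K) == minus(E)         # k^2 = -1, so ijk = (ij)k = k^2 = -1
assert mul(I, I) == mul(J, J)        # i^2 = j^2
assert mul(I, J) == minus(mul(J, I)) # ij = -ji
```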
We now state the main classification, and then prove it in a series of lemmas.
Recall from Section 10.1, that a finite group of prime order must be cyclic. Hence,
in the theorem, the cases |G| = 2, 3, 5, 7 are handled. We next consider the case, where
G has order p2 , and where p is a prime.
Definition 10.5.4. If G is a group, then its center, denoted Z(G), is the set of elements in G, which commute with everything in G. That is, Z(G) = {z ∈ G : zg = gz for all g ∈ G}.
Proof. (a) and (b) are direct, and we leave them to the exercises. Consider the case,
where G/Z(G) is cyclic. Then each coset of Z(G) has the form g m Z(G), where g ∈ G. Let
a, b ∈ G. Then since a, b are in cosets of the center, we have a = g^m u and b = g^n v with u, v ∈ Z(G). Then ab = g^m ug^n v = g^(m+n) uv = g^n vg^m u = ba; hence, G is abelian.
A p-group is any finite group of prime power order p^k . We need the following: every nontrivial p-group has a nontrivial center. The proof of this is based on what is called the class equation, which we will prove in Chapter 13.
Proof. Suppose that |G| = p2 . Then from the previous lemma, G has a nontrivial center;
hence, |Z(G)| = p, or |Z(G)| = p2 . If |Z(G)| = p2 , then G = Z(G), and G is abelian. If
|Z(G)| = p, then |G/Z(G)| = p. Since p is a prime this implies that G/Z(G) is cyclic;
hence, from Lemma 10.5.5, G is abelian.
Lemma 10.5.8. If G is any group, where every nontrivial element has order 2, then G is
abelian.
Proof. Suppose that g² = 1 for all g ∈ G. This implies that g = g⁻¹ for all g ∈ G. Let a, b be arbitrary elements of G. Then
ab = (ab)⁻¹ = b⁻¹a⁻¹ = ba,
so G is abelian.
g³ = h² = 1, h⁻¹gh = g⁻¹.
Proof. The proof is almost identical to that for n = 6. Since 10 = 2 ⋅ 5, if G were abelian,
G ≅ ℤ2 × ℤ5 = ℤ10 .
Now suppose that G is nonabelian. As for n = 6, G must contain a normal cyclic
subgroup of order 5, say ⟨g⟩ = {1, g, g 2 , g 3 , g 4 }. If h ∉ ⟨g⟩, then exactly as for n = 6, it
follows that h must have order 2, and h−1 gh = g t for t = 1, 2, 3, 4. If h−1 gh = g, then g, h
commute, and G is abelian. Notice that h⁻¹ = h. Suppose that h⁻¹gh = hgh = g². Then
(hgh)³ = (g²)³ = g⁶ = g, and also (hgh)³ = hg³h since h² = 1, so g³ = hgh = g² ⇒ g = 1,
a contradiction. The case hgh = g³ leads to a contradiction in the same way. Therefore, h⁻¹gh = g⁴ = g⁻¹, and G is the dihedral group of order 10 with presentation
g⁵ = h² = 1, h⁻¹gh = g⁻¹.
This leaves the case n = 8, the most difficult. If |G| = 8, and G is abelian, then
clearly, G ≅ ℤ8 , or G ≅ ℤ4 × ℤ2 , or G ≅ ℤ2 × ℤ2 × ℤ2 . The proof of Theorem 10.5.3 is
then completed with the following:
If h−1 gh = g, then as in the cases 6 and 10, ⟨g, h⟩ defines an abelian subgroup of order 8;
hence, G is abelian. If h−1 gh = g 2 , then
(h⁻¹gh)² = (g²)² = g⁴ = 1 ⇒ g = h⁻²gh² = h⁻¹g²h = g⁴ ⇒ g³ = 1,
contradicting the fact that g has order 4. Therefore, h−1 gh = g 3 = g −1 . It follows that
g, h define a subgroup of order 8, isomorphic to D4 . Since |G| = 8, this must be all of G
and G ≅ D4 .
Therefore, we may now assume that every element h ∈ G with h ∉ ⟨g⟩ has or-
der 4. Let h be such an element. Then h2 has order 2, so h2 ∈ ⟨g⟩, which implies that
h2 = g 2 . This further implies that g 2 is central; that is, commutes with everything. Iden-
tifying g with i, h with j, and g 2 with −1, we get that G is isomorphic to Q, completing
Lemma 10.5.11 and the proof of Theorem 10.5.3.
In principle, this type of analysis can be used to determine the structure of any fi-
nite group, although it quickly becomes impractical. A major tool in this classification
is the following important result known as the Sylow theorem, which we just state. We
will prove this theorem in Chapter 13. If |G| = pm n with p a prime and (n, p) = 1, then
a subgroup of G of order pm is called a p-Sylow subgroup. It is not clear at first that a
group will contain p-Sylow subgroups.
Theorem 10.5.12 (Sylow theorem). Let |G| = pm n with p a prime and (n, p) = 1.
(a) G contains a p-Sylow subgroup.
(b) All p-Sylow subgroups of G are conjugate.
(c) Any p-subgroup of G is contained in a p-Sylow subgroup.
(d) The number of p-Sylow subgroups of G is of the form 1 + pk and divides n.
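Part (d) can be observed in a concrete case. The following Python sketch (ours, not from the text) finds the 2-Sylow subgroups of S₄, where |S₄| = 24 = 2³ · 3, by brute force; their count should be of the form 1 + 2k and divide n = 3. We use the fact (specific to this example) that every subgroup of order 8 in S₄ is dihedral and hence generated by two elements, so closures of pairs suffice.

```python
# Brute-force check of Theorem 10.5.12(d) for G = S4: the 2-Sylow subgroups
# have order 8; count them and verify 1 + 2k form and divisibility of n = 3.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

S4 = list(permutations(range(4)))

def closure(gens):
    group, frontier = {tuple(range(4))}, list(gens)
    while frontier:
        g = frontier.pop()
        if g not in group:
            group.add(g)
            frontier.extend(compose(g, h) for h in gens)
    return frozenset(group)

# every order-8 subgroup of S4 happens to be 2-generated, so pairs suffice
sylow2 = {H for a in S4 for b in S4 if len(H := closure([a, b])) == 8}
print(len(sylow2))  # 3
assert len(sylow2) % 2 == 1        # of the form 1 + 2k
assert 3 % len(sylow2) == 0        # divides n = 3
```

Indeed S₄ has exactly three 2-Sylow subgroups, and 3 = 1 + 2 · 1 divides 3.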
For each a ∈ G, the map
ia : G → G, ia(x) = axa⁻¹,
is an automorphism of G, called an inner automorphism. The set of all inner automorphisms of G is denoted by Inn(G).
Theorem 10.6.4. Inn(G) is a normal subgroup of Aut(G); that is, Inn(G) ⊲ Aut(G).
The map φ : G → Aut(G), φ(a) = ia, is a homomorphism with image Inn(G), and ker(φ) = Z(G), the center of G. Now, from Theorem 10.2.3, we get the following:
Theorem 10.6.5.
Inn(G) ≅ G/Z(G)
Let G be a group and f ∈ Aut(G). If a ∈ G has order n, then f (a) also has order n; if
a ∈ G has infinite order then f (a) also has infinite order.
Example 10.6.6. Let V ≅ ℤ2 × ℤ2 ; that is, V has four elements 1, a, b and ab with
a2 = b2 = (ab)2 = 1.
V is often called the Klein four group. An automorphism of V permutes the three
elements a, b and ab of order 2, and each permutation of {a, b, ab} defines an automor-
phism of V. Hence, Aut(V) ≅ S3 .
Example 10.6.7.
S3 ≅ Inn(S3 ) = Aut(S3 ).
By Theorem 10.6.5, we have S3 ≅ Inn(S3 ), because Z(S3 ) = {1}. Now, let f ∈ Aut(S3 ).
Analogously, as in Example 10.6.6, the automorphism f permutes the three transposi-
tions (1, 2), (1, 3), and (2, 3). This gives | Aut(S3 )| ≤ |S3 | = 6, because S3 is generated by
these transpositions. From S3 ≅ Inn(S3 ) ⊲ Aut(S3 ), we have | Aut(S3 )| ≥ 6.
Hence, Aut(S3 ) ≅ Inn(S3 ) ≅ S3 .
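The count |Aut(S₃)| = 6 can be confirmed by brute force. The following Python sketch (ours, not from the text) represents the six elements of S₃ as permutation tuples and tests every bijection of this 6-element set for the homomorphism property.

```python
# Count Aut(S3) directly: check every bijection S3 -> S3 for
# phi(xy) = phi(x)phi(y).
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

S3 = list(permutations(range(3)))          # the 6 elements of S3
auts = 0
for images in permutations(S3):            # each bijection S3 -> S3
    phi = dict(zip(S3, images))
    if all(phi[compose(x, y)] == compose(phi[x], phi[y])
           for x in S3 for y in S3):
        auts += 1
print(auts)  # 6, so Aut(S3) ≅ S3
```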
10.7 Exercises
1. Prove that if G is cyclic, then any factor group of G is also cyclic.
2. Prove that for any group G, the center Z(G) is a normal subgroup, and G = Z(G) if
and only if G is abelian.
3. Let U1 and U2 be subgroups of a group G. Let x, y ∈ G. Show the following:
(i) If xU1 = yU2 , then U1 = U2 .
(ii) Give an example showing that xU1 = U2x does not imply U1 = U2.
4. Let U, V be subgroups of a group G. Let x, y ∈ G. Show that if UxV ∩ UyV ≠ ∅, then UxV = UyV.
5. Let N be a cyclic normal subgroup of the group G. Then all subgroups of N are
normal subgroups of G. Give an example to show that the statement is not correct
if N is not cyclic.
6. Let N1 and N2 be normal subgroups of G. Show the following:
(i) If all elements in N1 and N2 have finite order, then so do all elements of N1N2.
(ii) Let e1, e2 ∈ ℕ. If ni^(ei) = 1 for all ni ∈ Ni (i = 1, 2), then x^(e1e2) = 1 for all x ∈ N1N2.
7. Find groups N1 , N2 and G with N1 ⊲ N2 ⊲ G, but N1 is not a normal subgroup of G.
8. Let G be a group generated by a and b and let bab−1 = ar and an = 1 for suitable
r ∈ ℤ, n ∈ ℕ. Show the following:
(i) The subgroup A := ⟨a⟩ is a normal subgroup of G.
(ii) G/A = ⟨bA⟩.
(iii) G = {bj ai : i, j ∈ ℤ}.
9. Prove that any group of order 24 cannot be simple.
10. Let G be a group with subgroups G1 , G2 . Then the following are equivalent:
(i) G ≅ G1 × G2 ;
(ii) G1 ⊲ G, G2 ⊲ G, G = G1 G2 , and G1 ∩ G2 = {1};
(iii) Every g ∈ G has a unique expression g = g1 g2 , where g1 ∈ G1 , g2 ∈ G2 , and
g1 g2 = g2 g1 for each g1 ∈ G1 , g2 ∈ G2 .
11. Suppose that G is a finite group with normal subgroups G1 , G2 such that
(|G1 |, |G2 |) = 1. If |G| = |G1 ||G2 |, then G ≅ G1 × G2 .
12. Let G be a group with normal subgroups G1 and G2 such that G = G1G2. Then G/(G1 ∩ G2) ≅ (G/G1) × (G/G2).
1 = (1 2 3 / 1 2 3), a = (1 2 3 / 2 3 1), b = (1 2 3 / 3 1 2),
c = (1 2 3 / 2 1 3), d = (1 2 3 / 3 2 1), e = (1 2 3 / 1 3 2),
written in two-row form (top row / bottom row), each symbol above its image.
S3 = ⟨a, c; a3 = c2 = 1, ac = ca2 ⟩.
Definition 11.1.1. Suppose that f is a permutation of A = {1, 2, . . . , n}, which has the
following effect on the elements of A: There exists an element a1 ∈ A such that f (a1 ) =
a2 , f (a2 ) = a3 , . . . , f (ak−1 ) = ak , f (ak ) = a1 , and f leaves all other elements (if there are
any) of A fixed; that is, f (aj ) = aj for aj ≠ ai , i = 1, 2, . . . , k. Such a permutation f is
called a cycle or a k-cycle.
f = (a1 , a2 , . . . , ak ).
https://doi.org/10.1515/9783110603996-011
The cycle notation is read from left to right. It says f takes a1 into a2 , a2 into a3 , et
cetera, and finally ak , the last symbol, into a1 , the first symbol. Moreover, f leaves all
the other elements not appearing in the representation above fixed.
Note that one can write the same cycle in many ways using this type of notation;
for example, f = (a2 , a3 , . . . , ak , a1 ). In fact, any cyclic rearrangement of the symbols
gives the same cycle. The integer k is the length of the cycle. Note we allow a cycle
to have length 1, that is, f = (a1 ), for instance. This is just the identity map. For this
reason, we will usually designate the identity of Sn by (1), or just 1. (Of course, it also
could be written as (ai ), where ai ∈ A.)
If f and g are two cycles, they are called disjoint cycles if the elements moved by
one are left fixed by the other; that is, their representations contain different elements
of the set A (their representations are disjoint as sets).
Lemma 11.1.2. If f and g are disjoint cycles, then they must commute; that is, fg = gf .
Proof. Since the cycles f and g are disjoint, each element moved by f is fixed by g, and
vice versa. First, suppose f (ai ) ≠ ai . This implies that g(ai ) = ai , and f 2 (ai ) ≠ f (ai ).
But since f 2 (ai ) ≠ f (ai ), g(f (ai )) = f (ai ). Thus, (fg)(ai ) = f (g(ai )) = f (ai ), whereas
(gf )(ai ) = g(f (ai )) = f (ai ). Similarly, if g(aj ) ≠ aj , then (fg)(aj ) = (gf )(aj ). Finally, if
f (ak ) = ak and g(ak ) = ak , clearly then, (fg)(ak ) = ak = (gf )(ak ). Thus, gf = fg.
Before proceeding further with the theory, let us consider a specific example. Let
A = {1, 2, . . . , 8}, and let
f = (1 2 3 4 5 6 7 8 / 2 4 6 5 1 7 3 8),
in two-row form, with the top row listing the symbols and the bottom row their images.
We pick an arbitrary number from the set A, say 1. Then f (1) = 2, f (2) = 4, f (4) = 5,
f (5) = 1. Now select an element from A not in the set {1, 2, 4, 5}, say 3. Then f (3) = 6,
f (6) = 7, f (7) = 3. Next select any element of A not occurring in the set {1, 2, 4, 5} ∪
{3, 6, 7}. The only element left is 8, and f (8) = 8. It is clear that we can now write the
permutation f as a product of cycles:
f = (1, 2, 4, 5)(3, 6, 7)(8),
where the order of the cycles is immaterial since they are disjoint and, therefore, com-
mute. It is customary to omit such cycles as (8) and write f simply as
f = (1, 2, 4, 5)(3, 6, 7)
with the understanding that the elements of A not appearing are left fixed by f .
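The procedure just carried out by hand is an algorithm: follow each unvisited symbol under f until it returns to its starting point. The following Python sketch (ours, not from the text) applies it to the example permutation.

```python
# Disjoint cycle decomposition: trace each unvisited symbol under f until it
# returns, exactly as in the worked example above.

def cycle_decomposition(f):        # f is a dict: symbol -> image
    seen, cycles = set(), []
    for start in sorted(f):
        if start not in seen:
            cycle, x = [], start
            while x not in seen:
                seen.add(x)
                cycle.append(x)
                x = f[x]
            cycles.append(tuple(cycle))
    return cycles

# the example permutation on A = {1, ..., 8}
f = {1: 2, 2: 4, 3: 6, 4: 5, 5: 1, 6: 7, 7: 3, 8: 8}
print(cycle_decomposition(f))  # [(1, 2, 4, 5), (3, 6, 7), (8,)]
```

The 1-cycle (8) appears explicitly here; as noted above, it is customarily omitted in writing.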
It is not difficult to generalize what was done here for a specific example, and
show that any permutation f can be written uniquely, except for order, as a product of
disjoint cycles. Thus, let f be a permutation on the set A = {1, 2, . . . , n}, and let a1 ∈ A.
Let f (a1 ) = a2 , f 2 (a1 ) = f (a2 ) = a3 , et cetera, and continue until a repetition is obtained.
We claim that this first occurs for a1; that is, the first repetition is, say, fᵏ(a1) = f(ak) = ak+1 = a1. For suppose the first repetition occurs at the k-th iterate of f, with fᵏ(a1) = aj = fʲ⁻¹(a1) for some j with 1 ≤ j ≤ k. Applying f⁻⁽ʲ⁻¹⁾ to both sides, we get fᵏ⁻ʲ⁺¹(a1) = a1. However, k − j + 1 < k if j ≠ 1, and we assumed that the first repetition occurred for k. Thus, j = 1, and so f does cyclically permute the set {a1, a2, . . . , ak}.
If k < n, then there exists b1 ∈ A such that b1 ∉ {a1 , a2 , . . . , ak }, and we may proceed
similarly with b1 . We continue in this manner until all the elements of A are accounted
for. It is then seen that f can be written in the form
f = (a1, a2, . . . , ak)(b1, b2, . . . , bℓ)(c1, c2, . . . , cm) ⋅ ⋅ ⋅ ,
a product of disjoint cycles of lengths k, ℓ, m, . . . , t.
Note that all powers f i (a1 ) belong to the set {a1 = f 0 (a1 ) = f k (a1 ), a2 = f 1 (a1 ), . . . , ak =
f k−1 (a1 )}; all powers f i (b1 ) belong to the set {b1 = f 0 (b1 ) = f ℓ (b1 ), b2 = f 1 (b1 ), . . . , bℓ =
f ℓ−1 (b1 )}; . . . . Here, by definition, b1 is the smallest element in {1, 2, . . . , n}, which does
not belong to {a1 = f 0 (a1 ) = f k (a1 ), a2 = f 1 (a1 ), . . . , ak = f k−1 (a1 )}; c1 is the smallest
element in {1, 2, . . . , n}, which does not belong to
{a1, . . . , ak} ∪ {b1, . . . , bℓ}; and so on.
Therefore, by construction, all the cycles are disjoint. From this, it follows that k + ℓ +
m + ⋅ ⋅ ⋅ + t = n. It is clear that this factorization is unique, except for the order of the
factors, since it tells explicitly what effect f has on each element of A.
In summary, we have proven the following result.
(1, 2)(1, 3) takes 3 into 2. Finally, (1, 3) takes 2 into 2, and then (1, 2) takes 2 into 1. So
(1, 2)(1, 3) takes 2 into 1. Thus, we see that (1, 2)(1, 3) = (1, 3, 2).
From Theorem 11.1.3, any permutation can be written in terms of cycles, but from the
above, any cycle can be written as a product of transpositions. Thus, we have the fol-
lowing result:
W(f ) = (k − 1) + (j − 1) + ⋅ ⋅ ⋅ + (t − 1)
transpositions. The number W(f ) is uniquely associated with the permutation f since
f is uniquely represented (up to order) as a product of disjoint cycles. However, there
is nothing unique about the number of transpositions occurring in an arbitrary representation of f as a product of transpositions. For example, in S3,
(1, 3) = (1, 2)(2, 3)(1, 2),
so the same permutation can be written as a product of one transposition or of three.
Suppose now that f is represented as a product of disjoint cycles, where we include all
the 1-cycles of elements of A, which f fixes, if any. If a and b occur in the same cycle in
this representation for f ,
f = ⋅ ⋅ ⋅ (a, b1 , . . . , bk , b, c1 , . . . , ct ) ⋅ ⋅ ⋅ ,
then, in the computation of W(f ), this cycle contributes k + t + 1. Now consider (a, b)f .
Since the cycles are disjoint and disjoint cycles commute,
(a, b)f = ⋅ ⋅ ⋅ (a, b)(a, b1, . . . , bk, b, c1, . . . , ct) ⋅ ⋅ ⋅ ,
since neither a nor b can occur in any factor of f other than (a, b1 , . . . , bk , b, c1 ,
. . . , ct ). So that (a, b) cancels out, and we find that (a, b)f = ⋅ ⋅ ⋅ (b, c1 , . . . , ct )(a, b1 ,
. . . , bk ) ⋅ ⋅ ⋅. Since W((b, c1 , . . . , ct )(a, b1 , . . . , bk )) = k + t, but W(a, b1 , . . . , bk , b,
c1 , . . . , ct ) = k + t + 1, we have W((a, b)f ) = W(f ) − 1.
A similar analysis shows that in the case where a and b occur in different cycles in the representation of f, then W((a, b)f) = W(f) + 1. Combining both cases, we have
W((a, b)f) = W(f) ± 1.
Then, if f = τ1τ2 ⋅ ⋅ ⋅ τs is any representation of f as a product of s transpositions, iterating this, together with the fact that W(1) = 0, shows that
s ≡ W(f) (mod 2).
Hence, the number of transpositions in any two such representations of f has the same parity.
It now makes sense to state the following definition since we know that the parity
is indeed unique:
Definition 11.2.3. On the group Sn, for n ≥ 2, we define the sign function sgn : Sn → (ℤ2, +) by sgn(π) = 0 if π is an even permutation, and sgn(π) = 1 if π is an odd permutation.
We note that if f and g are even permutations, then so are fg and f −1 and also the
identity permutation is even. Furthermore, if f is even and g is odd, it is clear that fg
is odd. From this it is straightforward to establish the following:
We now let
An = {π ∈ Sn : sgn(π) = 0}.
Theorem 11.2.5. For each n ∈ ℕ, n ≥ 2, the set An forms a normal subgroup of index 2 in Sn, called the alternating group on n symbols. Furthermore, |An| = n!/2.
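The sign function and the count |A₄| = 4!/2 = 12 can be checked directly. The following Python sketch (ours, not from the text) computes sgn via W(f), the sum of (length − 1) over the disjoint cycles, and verifies that sgn is a homomorphism into (ℤ₂, +) on S₄.

```python
# Compute sgn via W(f) = sum of (cycle length - 1), then check |A4| = 12 and
# the homomorphism property sgn(pq) = sgn(p) + sgn(q) in Z_2.
from itertools import permutations

def sgn(p):                      # p is a tuple: p[i] = image of i
    seen, w = set(), 0
    for i in range(len(p)):
        if i not in seen:
            length, x = 0, i
            while x not in seen:
                seen.add(x)
                x = p[x]
                length += 1
            w += length - 1      # each k-cycle contributes k - 1 transpositions
    return w % 2                 # value in Z_2

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sgn(p) == 0]
assert len(A4) == 12             # |A_n| = n!/2

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

# sgn is a homomorphism into (Z_2, +)
assert all(sgn(compose(p, q)) == (sgn(p) + sgn(q)) % 2
           for p in S4 for q in S4)
print(len(A4))  # 12
```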
11.3 Conjugation in Sn
Recall that in a group G, two elements x, y ∈ G are conjugates if there exists a g ∈ G
with g −1 xg = y. Conjugacy is an equivalence relation on G. In the symmetric groups Sn ,
it is easy to determine if two elements are conjugates. We say that two permutations
in Sn have the same cycle structure if they have the same number of cycles and the
lengths are the same. Hence, for example in S8 the permutations
have the same cycle structure. In particular, if π1 , π2 are two permutations in Sn , then
π1 , π2 are conjugates if and only if they have the same cycle structure. Therefore, in S8 ,
the permutations
are conjugates.
be the cycle decomposition of π ∈ Sn . Let τ ∈ Sn , and denote the image of aij under τ by
aτij . Then
Proof. (a) Consider a11; then, operating on the left like functions, we have
(τπτ⁻¹)(aτ11) = τπ(a11) = τ(a12) = aτ12.
The same computation then follows for all the symbols aij, proving the lemma.
Theorem 11.3.2. Two permutations π1 , π2 ∈ Sn are conjugates if and only if they are of
the same cycle structure.
Proof. Suppose that π2 = τπ1 τ−1 . Then, from Lemma 11.3.1, we have that π1 and π2 are
of the same cycle structure.
Conversely, suppose that π1 and π2 are of the same cycle structure. Let
where we place the cycles of the same length under each other. Let τ be the per-
mutation in Sn that maps each symbol in π1 to the digit below it in π2 . Then, from
Lemma 11.3.1, we have τπ1 τ−1 = π2 ; hence, π1 and π2 are conjugate.
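The construction of τ in the proof is completely mechanical: align the cycle decompositions of the two permutations and read τ off column by column. The following Python sketch (ours, not from the text; π₁, π₂ are small permutations we chose for illustration) carries this out and checks τπ₁τ⁻¹ = π₂.

```python
# Build tau from two permutations of the same cycle structure by placing the
# cycles of pi1 over those of pi2, then verify tau pi1 tau^-1 = pi2.

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def cycles(p):
    seen, out = set(), []
    for i in range(len(p)):
        if i not in seen:
            c, x = [], i
            while x not in seen:
                seen.add(x)
                c.append(x)
                x = p[x]
            out.append(c)
    return sorted(out, key=len)   # align cycles of equal length

# pi1 = (0,1,2)(3,4) and pi2 = (1,3,4)(0,2) as maps on {0, ..., 4}
pi1 = (1, 2, 0, 4, 3)
pi2 = (2, 3, 0, 4, 1)

tau = [None] * 5
for c1, c2 in zip(cycles(pi1), cycles(pi2)):   # place cycles under each other
    for a, b in zip(c1, c2):
        tau[a] = b                             # tau maps each symbol down
tau = tuple(tau)
tau_inv = tuple(tau.index(i) for i in range(5))
assert compose(tau, compose(pi1, tau_inv)) == pi2
print(tau)  # (1, 3, 4, 0, 2)
```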
Case (3): (a, b)(c, d) = (a, b)(b, c)(b, c)(c, d) = (c, a, b)(c, d, b)
since (b, c)(b, c) = 1. Therefore, it is also true here, proving the theorem.
However, from Lemma 11.3.1, (b, c, d) = (aτi , bτi , cτi ). Furthermore, since π ∈ N and N
is normal, we have
and
Proof. Suppose, without loss of generality, that τ = (1, 2). There is an i with αi (1) = 2.
Without loss of generality, we may then assume that α = (1, 2, a3 , . . . , an ). Let
π = (1 2 a3 ⋅ ⋅ ⋅ an / 1 2 3 ⋅ ⋅ ⋅ n).
Then, by Lemma 11.3.1, παπ⁻¹ = (1, 2, 3, . . . , n). Furthermore, π(1, 2)π⁻¹ = (1, 2). Hence, U1 = πUπ⁻¹ contains (1, 2) and (1, 2, . . . , n).
Now we have
Analogously,
and so on until
11.5 Exercises
1. Show that for n ≥ 3, the group An is generated by {(1, 2, k) : k ≥ 3}.
2. Let σ ∈ Sn be a permutation whose decomposition into disjoint cycles has cycles of lengths k1, . . . , ks. Show that the order of σ is the least common multiple of k1, . . . , ks. Compute the order of τ = (1 2 3 4 5 6 7 / 2 6 5 1 3 4 7) ∈ S7.
3. Let G = S4 .
(i) Determine a noncyclic subgroup H of order 4 of G.
(ii) Show that H is normal.
(iii) Show that f (g)(h) := ghg −1 defines an epimorphism f : G → Aut(H) for g ∈ G
and h ∈ H. Determine its kernel.
4. Show that all subgroups of order 6 of S4 are conjugate.
5. Let σ1 = (1, 2)(3, 4) and σ2 = (1, 3)(2, 4) ∈ S4 . Determine τ ∈ S4 such that τσ1 τ−1 = σ2 .
6. Let σ = (a1 , . . . , ak ) ∈ Sn . Describe σ −1 .
G = G0 ⊃ G1 ⊃ G2 ⊃ ⋅ ⋅ ⋅ ⊃ Gn−1 ⊃ Gn = {1},
in which each Gi+1 is a proper normal subgroup of Gi . The factor groups Gi /Gi+1 are
called the factors of the series, and n is the length of the series.
Definition 12.2.1. A group G is solvable if it has a normal series with abelian factors;
that is, Gi /Gi+1 is abelian for all i = 0, 1, . . . , n − 1. Such a normal series is called a
solvable series.
S3 ⊃ A3 ⊃ {1}.
Since |S3 | = 6, we have |A3 | = 3; hence, A3 is cyclic and therefore abelian. Furthermore,
|S3 /A3 | = 2; hence, the factor group S3 /A3 is also cyclic, thus abelian. Therefore, the
series above gives a solvable series for S3 .
Lemma 12.2.2. If G is a finite solvable group, then G has a normal series with cyclic
factors.
Proof. If G is a finite solvable group, then by definition, it has a normal series with
abelian factors. Hence, to prove the lemma, it suffices to show that a finite abelian
group has a normal series with cyclic factors.
Let A be a nontrivial finite abelian group. We do an induction on the order of A. If |A| = 2, then A itself is cyclic, and the result follows. Suppose that |A| > 2. Choose an a ∈ A with a ≠ 1, and let N = ⟨a⟩, so that N is cyclic. Then we have the normal series A ⊃ N ⊃ {1} with A/N abelian. Moreover, A/N has order less than |A|, so by the inductive hypothesis A/N has a normal series with cyclic factors. Pulling this series back to A and appending N ⊃ {1} gives a normal series for A with cyclic factors, and the result follows.
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}
is a solvable series for G. Hence, Gi+1 is a normal subgroup of Gi for each i, and the
factor group Gi /Gi+1 is abelian.
Now let H be a subgroup of G, and consider the chain of subgroups
H = H ∩ G0 ⊃ H ∩ G1 ⊃ ⋅ ⋅ ⋅ ⊃ H ∩ Gr = {1}.
Since Gi+1 is normal in Gi , we know that H ∩ Gi+1 is normal in H ∩ Gi ; hence, this gives
a finite normal series for H. Furthermore, from the second isomorphism theorem, we have, for each i,
(H ∩ Gi)/(H ∩ Gi+1) ≅ (H ∩ Gi)Gi+1/Gi+1, which is a subgroup of Gi/Gi+1.
However, Gi /Gi+1 is abelian, so each factor in the normal series for H is abelian. There-
fore, the above series is a solvable series for H; hence, H is also solvable.
(2) Let N be a normal subgroup of G. Then from (1) N is also solvable. As above,
let
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}
be a solvable series for G, and consider the chain
G/N = G0N/N ⊃ G1N/N ⊃ ⋅ ⋅ ⋅ ⊃ GrN/N = N/N = {1}.
It follows that Gi+1 N is normal in Gi N for each i; therefore, the series for G/N is a normal
series.
Again, from the isomorphism theorems,
(GiN/N)/(Gi+1N/N) ≅ GiN/Gi+1N ≅ Gi/(Gi ∩ Gi+1N) ≅ (Gi/Gi+1)/((Gi ∩ Gi+1N)/Gi+1).
However, the last group (Gi /Gi+1 )/((Gi ∩ Gi+1 N)/Gi+1 ) is a factor group of the group
Gi /Gi+1, which is abelian. Hence, this last group is also abelian; therefore, each factor
in the normal series for G/N is abelian. Hence, this series is a solvable series, and G/N
is solvable.
Theorem 12.2.4. Let G be a group and N a normal subgroup of G. If both N and G/N
are solvable, then G is solvable.
Proof. Let
N = N0 ⊃ N1 ⊃ ⋅ ⋅ ⋅ ⊃ Nr = {1}
be a solvable series for N, and let
G/N = G0/N ⊃ G1/N ⊃ ⋅ ⋅ ⋅ ⊃ Gs/N = N/N = {1}
be a solvable series for G/N, where each Gi is the preimage in G of the corresponding term. Then
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gs = N ⊃ N1 ⊃ ⋅ ⋅ ⋅ ⊃ Nr = {1}
gives a normal series for G. Furthermore, from the isomorphism theorems again,
Gi/Gi+1 ≅ (Gi/N)/(Gi+1/N) for 0 ≤ i < s;
hence, each factor is abelian. Therefore, this is a solvable series for G; hence, G is
solvable.
This theorem allows us to prove that solvability is preserved under direct products.
Corollary 12.2.5. Let G and H be solvable groups. Then their direct product G ×H is also
solvable.
Proof. Suppose that G and H are solvable groups and K = G × H. Recall from Chap-
ter 10 that G can be considered as a normal subgroup of K with K/G ≅ H. Therefore,
G is a solvable subgroup of K, and K/G is a solvable quotient. It follows then, from
Theorem 12.2.4, that K is solvable.
We saw that the symmetric group S3 is solvable. However, the following theorem
shows that the symmetric group Sn is not solvable for n ≥ 5. This result will be crucial
to the proof of the insolvability of the quintic and higher polynomials.
Lemma 12.2.7. If a group G is both simple and solvable, then G is cyclic of prime order.
Proof. Suppose that G is a nontrivial simple, solvable group. Since G is simple, the
only normal series for G is G = G0 ⊃ {1}. Since G is solvable, the factors are abelian;
hence, G is abelian. Again, since G is simple, G must be cyclic. If G were infinite, then
G ≅ (ℤ, +). However, then 2ℤ is a proper normal subgroup, a contradiction. Therefore,
G must be finite cyclic. If the order were not prime, then for each proper divisor of the
order, there would be a nontrivial proper normal subgroup. Therefore, G must be of
prime order.
Definition 12.3.1. Let G′ be the subgroup of G which is generated by the set of all commutators [x, y] = x⁻¹y⁻¹xy; that is,
G′ = gp({[x, y] : x, y ∈ G}).
G′ is called the commutator subgroup, or derived group, of G.
Theorem 12.3.2. For any group G, the commutator subgroup G′ is a normal subgroup of G, and G/G′ is abelian. Furthermore, if H is a normal subgroup of G, then G/H is abelian if and only if G′ ⊂ H.
Proof. The commutator subgroup G′ consists of all finite products of commutators and inverses of commutators. However,
[x, y]⁻¹ = (x⁻¹y⁻¹xy)⁻¹ = y⁻¹x⁻¹yx = [y, x],
and so the inverse of a commutator is once again a commutator. It then follows that G′ is precisely the set of all finite products of commutators; that is, G′ is the set of all elements of the form
h1h2 ⋅ ⋅ ⋅ hn,
where each hi is a commutator. Moreover, G′ is normal in G, since g⁻¹[x, y]g = [g⁻¹xg, g⁻¹yg] for all g, x, y ∈ G.
Consider the factor group G/G′. Let aG′ and bG′ be any two elements of G/G′. Then
(aG′)(bG′) = abG′ = ba[a, b]G′ = baG′ = (bG′)(aG′),
since [a, b] ∈ G′. In other words, any two elements of G/G′ commute; therefore, G/G′ is abelian.
Now let N be a normal subgroup of G with G/N abelian. Let a, b ∈ G; then aN and bN commute since G/N is abelian. Therefore,
abN = baN, so (ba)⁻¹(ab) = a⁻¹b⁻¹ab = [a, b] ∈ N.
Since a and b were arbitrary, G′ ⊂ N.
From the second part of Theorem 12.3.2, we see that G′ is the minimal normal subgroup of G such that G/G′ is abelian. We call G/G′ = Gab the abelianization of G.
We consider next the following inductively defined sequence of subgroups of an
arbitrary group G called the derived series:
Definition 12.3.3. For an arbitrary group G, define G(0) = G and G(1) = G′, and then, inductively, G(n+1) = (G(n))′. That is, G(n+1) is the commutator subgroup or derived group of G(n). The chain of subgroups
G = G(0) ⊃ G(1) ⊃ G(2) ⊃ ⋅ ⋅ ⋅
is called the derived series of G.
Notice that since G(i+1) is the commutator subgroup of G(i), the factor G(i)/G(i+1) is abelian. If the derived series reaches {1} after finitely many steps, then G has a normal series with abelian factors; hence, G is solvable. The converse is also true and characterizes solvable groups in terms of the derived series.
Theorem 12.3.4. A group G is solvable if and only if its derived series is finite. That is,
there exists an n such that G(n) = {1}.
Proof. If G(n) = {1} for some n, then as explained above, the derived series provides a
solvable series for G; hence, G is solvable.
Conversely, suppose that G is solvable, and let
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gr = {1}
be a solvable series for G. We claim first that Gi ⊃ G(i) for all i. We do this by induction on i. If i = 0, then G = G0 = G(0). Suppose that Gi ⊃ G(i). Then Gi′ ⊃ (G(i))′ = G(i+1). Since Gi/Gi+1 is abelian, it follows, from Theorem 12.3.2, that Gi+1 ⊃ Gi′. Therefore, Gi+1 ⊃ G(i+1), establishing the claim.
Now if G is solvable, from the claim, we have that Gr ⊃ G(r) . However, Gr = {1};
therefore, G(r) = {1}, proving the theorem.
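Theorem 12.3.4 can be watched in action. The following Python sketch (ours, not from the text) computes the derived series of S₄ by brute force: each step forms the closure of all commutators [x, y] = x⁻¹y⁻¹xy, and the series of orders 24, 12, 4, 1 shows that S₄ is solvable.

```python
# Compute the derived series of S4 and observe that it terminates in {1},
# so S4 is solvable (its derived series is S4 > A4 > V > {1}).
from itertools import permutations

N = 4
def compose(p, q):
    return tuple(p[q[i]] for i in range(N))
def inverse(p):
    return tuple(p.index(i) for i in range(N))

def derived(G):                 # commutator subgroup gp({[x, y]})
    comms = {compose(inverse(x), compose(inverse(y), compose(x, y)))
             for x in G for y in G}
    group, frontier = {tuple(range(N))}, list(comms)
    while frontier:
        g = frontier.pop()
        if g not in group:
            group.add(g)
            frontier.extend(compose(g, h) for h in comms)
    return group

G = set(permutations(range(N)))
sizes = [len(G)]
while len(G) > 1:
    G = derived(G)
    sizes.append(len(G))
print(sizes)  # [24, 12, 4, 1]: the derived series reaches {1}
```

For S₅ and beyond, the analogous computation stalls at A₅, reflecting the insolvability result mentioned above.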
The length of the derived series is called the solvability length of a solvable
group G. The class of solvable groups of class c consists of those solvable groups
of solvability length c, or less.
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gs = {1}
G = H0 ⊃ H1 ⊃ ⋅ ⋅ ⋅ ⊃ Ht = {1}
are two normal series for the group G, then the second is a refinement of the first if all the terms of the first occur in the second series. Furthermore, two normal series are called equivalent (or isomorphic) if there exists a 1–1 correspondence between the factors (hence, the lengths must be the same) of the two series such that the corresponding factors are isomorphic.
Theorem 12.4.1 (Schreier’s theorem). Any two normal series for a group G have equiv-
alent refinements.
G = G0 ⊃ G1 ⊃ ⋅ ⋅ ⋅ ⊃ Gs−1 ⊃ Gs = {1}
G = H0 ⊃ H1 ⊃ ⋅ ⋅ ⋅ ⊃ Ht−1 ⊃ Ht = {1}.
Now define
Gij = (Gi ∩ Hj)Gi+1 for 0 ≤ i ≤ s − 1, 0 ≤ j ≤ t,
Hji = (Gi ∩ Hj)Hj+1 for 0 ≤ j ≤ t − 1, 0 ≤ i ≤ s.
Then we have
Gi = Gi0 ⊃ Gi1 ⊃ ⋅ ⋅ ⋅ ⊃ Git = Gi+1
and
Hj = Hj0 ⊃ Hj1 ⊃ ⋅ ⋅ ⋅ ⊃ Hjs = Hj+1.
Now, applying the third isomorphism theorem to the groups Gi , Hj , Gi+1 , Hj+1 , we have
that Gi(j+1) = (Gi ∩ Hj+1 )Gi+1 is a normal subgroup of Gij = (Gi ∩ Hj )Gi+1 , and Hj(i+1) =
(Gi+1 ∩ Hj)Hj+1 is a normal subgroup of Hji = (Gi ∩ Hj)Hj+1. Furthermore, also
Gij/Gi(j+1) ≅ Hji/Hj(i+1).
Thus, the above two are normal series, which are refinements of the two given series,
and they are equivalent.
Definition 12.4.2. A composition series for a group G is a normal series, where all the
inclusions are proper and such that Gi+1 is maximal in Gi . Equivalently, a normal se-
ries, where each factor is simple.
It is possible that an arbitrary group does not have a composition series, or even
if it does have one, a subgroup of it may not have one. Of course, a finite group does
have a composition series.
In the case in which a group G does have a composition series, the following im-
portant theorem, called the Jordan–Hölder theorem, provides a type of unique factor-
ization.
Proof. Suppose we are given two composition series. Applying Theorem 12.4.1, we get
that the two composition series have equivalent refinements. But the only refinement
of a composition series is one obtained by introducing repetitions. If in the 1–1 corre-
spondence between the factors of these refinements, the paired factors equal to {e} are
disregarded; that is, if we drop the repetitions, clearly, we get that the original compo-
sition series are equivalent.
We remarked in Chapter 10 that the simple groups are important, because they
play a role in finite group theory somewhat analogous to that of the primes in number
theory. In particular, an arbitrary finite group G can be broken down into simple com-
ponents. These uniquely determined simple components are, according to the Jordan–
Hölder theorem, the factors of a composition series for G.
12.5 Exercises
1. Let K be a field and
G = {( a x y
       0 b z
       0 0 c ) : a, b, c, x, y, z ∈ K, abc ≠ 0} .
Show that G is solvable.
2. A group G is called polycyclic if it has a normal series with cyclic factors. Show the
following:
(i) Each subgroup and each factor group of a polycyclic group is polycyclic.
(ii) In a polycyclic group, each normal series has the same number of infinite
cyclic factors.
3. Let G be a group. Show the following:
(i) If G is finite and solvable, then G is polycyclic.
(ii) If G is polycyclic, then G is finitely generated.
(iii) The group (ℚ, +) is solvable, but not polycyclic.
4. Let N1 and N2 be normal subgroups of G. Show the following:
(i) If N1 and N2 are solvable, then also N1 N2 is a solvable normal subgroup of G.
(ii) Is (i) still true, if we replace “solvable” by “abelian”?
5. Let N1 , . . . , Nt be normal subgroups of a group G. If all factor groups G/Ni are solv-
able, then also G/(N1 ∩ ⋅ ⋅ ⋅ ∩ Nt ) is solvable.
πg : A → A
such that
(1) πg1 (πg2 (a)) = πg1 g2 (a) for all g1 , g2 ∈ G and for all a ∈ A,
(2) π1(a) = a for all a ∈ A, where 1 is the identity element of G.
For the remainder of this chapter, if g ∈ G and a ∈ A, we will write ga for πg (a).
Group actions are an extremely important idea, and we use this idea in the present
chapter to prove several fundamental results in group theory.
If G acts on the set A, then we say that two elements a1 , a2 ∈ A are congruent under
G if there exists a g ∈ G with ga1 = a2. The set
Ga = {ga : g ∈ G}
is called the orbit of a under G. Congruence under G is an equivalence relation on A.
Proof. Any element a ∈ A is congruent to itself via the identity map; hence, the relation
is reflexive. If a1 ∼ a2 so that ga1 = a2 for some g ∈ G, then g −1 a2 = a1 , and so a2 ∼ a1 ,
and the relation is symmetric. Finally, if g1 a1 = a2 and g2 a2 = a3 , then g2 g1 a1 = a3 , and
the relation is transitive.
Recall that the equivalence classes under an equivalence relation partition a set.
For a given a ∈ A, its equivalence class under this relation is precisely its orbit Ga , as
defined above.
Corollary 13.1.2. If G acts on the set A, then the orbits under G partition the set A.
We say that G acts transitively on A if any two elements of A are congruent under G.
That is, the action is transitive if for any a1 , a2 ∈ A there is some g ∈ G such that
ga1 = a2 .
If a ∈ A, the stabilizer of a consists of those g ∈ G that fix a. Hence,
StabG(a) = {g ∈ G : ga = a}.
Lemma 13.1.3. If G acts on A, then for any a ∈ A, the stabilizer StabG (a) is a subgroup
of G.
Theorem 13.1.4. Suppose that G acts on A and a ∈ A. Let Ga be the orbit of a under G
and StabG (a) its stabilizer. Then
|G : StabG(a)| = |Ga|.
That is, the size of the orbit of a is the index of its stabilizer in G.
Proof. Suppose that g1 , g2 ∈ G with g1 StabG (a) = g2 StabG (a); that is, they define the
same left coset of the stabilizer. Then g2−1 g1 ∈ StabG (a). This implies that g2−1 g1 a = a so
that g2 a = g1 a. Hence, any two elements in the same left coset of the stabilizer produce
the same image of a in Ga . Conversely, if g1 a = g2 a, then g1 , g2 define the same left coset
of StabG (a). This shows that there is a one-to-one correspondence between left cosets
of StabG (a) and elements of Ga . It follows that the size of Ga is precisely the index of
the stabilizer.
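Theorem 13.1.4 is easy to observe in a concrete action. The following Python sketch (ours, not from the text) lets S₄ act on {0, 1, 2, 3} in the natural way and checks that the orbit size equals the index of the stabilizer.

```python
# Orbit-stabilizer check: S4 acting naturally on A = {0, 1, 2, 3}.
from itertools import permutations

G = list(permutations(range(4)))      # S4; g acts by a -> g[a]
a = 0
orbit = {g[a] for g in G}             # Ga = {ga : g in G}
stab = [g for g in G if g[a] == a]    # StabG(a)
assert len(G) // len(stab) == len(orbit)   # |G : Stab(a)| = |Ga|
print(len(orbit), len(stab))  # 4 6
```

Here the action is transitive (the orbit is all of A), and the stabilizer of 0 is a copy of S₃ of order 6, with index 24/6 = 4.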
We will use this theorem repeatedly with different group actions to obtain impor-
tant group theoretic results.
and showed that it is a normal subgroup of G. We use this normal subgroup in con-
junction with what we call the class equation to show that any finite p-group has a
nontrivial center. In this section, we use group actions to derive the class equation
and prove the result for finite p-groups.
Recall that if G is a group, then two elements g1 , g2 ∈ G are conjugate if there exists
a g ∈ G with g −1 g1 g = g2 . We saw that conjugacy is an equivalence relation on G. For
The equivalence class of g ∈ G is called its conjugacy class, which we will denote by Cl(g). Thus,
Cl(g) = {x⁻¹gx : x ∈ G}.
If g ∈ G, then its centralizer CG(g) is the set of elements in G that commute with g:
CG(g) = {x ∈ G : xg = gx}.
Theorem 13.2.1. Let G be a finite group and g ∈ G. Then the centralizer of g is a subgroup
of G, and
|G : CG(g)| = |Cl(g)|.
That is, the index of the centralizer of g is the size of its conjugacy class.
In particular, for a finite group the size of each conjugacy class divides the order of
the group.
Proof. Let the group G act on itself by conjugation. That is, g(g1 ) = g −1 g1 g. It is easy to
show that this is an action on the set G (see exercises). The orbit of g ∈ G under this
action is precisely its conjugacy class Cl(g), and the stabilizer is its centralizer CG (g).
The statements in the theorem then follow directly from Theorem 13.1.4.
For any group G, since conjugacy is an equivalence relation, the conjugacy classes
partition G. Hence,
G = ⋃̇ Cl(g),
g∈G
where this union is taken over the distinct conjugacy classes. It follows that
|G| = ∑ |Cl(g)|,
where the sum is likewise over the distinct conjugacy classes. Now, Cl(g) = {g} if and only if g ∈ Z(G); hence, we may write
G = Z(G) ∪ ⋃̇ Cl(g),
g∉Z(G)
where again the second union is taken over the distinct conjugacy classes Cl(g) with
g ∉ Z(G). The size of G is then the sum of these disjoint pieces, so
|G| = |Z(G)| + ∑_{g∉Z(G)} |Cl(g)|,
where the sum is taken over the distinct conjugacy classes Cl(g) with g ∉ Z(G). How-
ever, from Theorem 13.2.1, |Cl(g)| = |G : CG (g)|, so the equation above becomes
|G| = |Z(G)| + ∑_{g∉Z(G)} |G : CG(g)|,
where the sum is taken over the distinct indices |G : CG (g)| with g ∉ Z(G). This is
known as the class equation.
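The class equation can be verified numerically for a small group. The following Python sketch (ours, not from the text) realizes the dihedral group D₄ of order 8 as vertex permutations of the square, computes its center and its non-central conjugacy classes, and checks |G| = |Z(G)| + ∑ |Cl(g)|.

```python
# Verify the class equation for D4, realized inside S4 as vertex permutations.

def compose(p, q):
    return tuple(p[q[i]] for i in range(4))
def inverse(p):
    return tuple(p.index(i) for i in range(4))

r, f, e = (1, 2, 3, 0), (0, 3, 2, 1), (0, 1, 2, 3)   # rotation, reflection
G = {e}
frontier = [r, f]
while frontier:
    g = frontier.pop()
    if g not in G:
        G.add(g)
        frontier.extend(compose(g, h) for h in (r, f))
assert len(G) == 8

Z = {g for g in G if all(compose(g, h) == compose(h, g) for h in G)}
classes = {frozenset(compose(inverse(x), compose(g, x)) for x in G)
           for g in G if g not in Z}
# class equation: |G| = |Z(G)| + sum over the non-central classes
assert len(G) == len(Z) + sum(len(c) for c in classes)
print(len(Z), sorted(len(c) for c in classes))  # 2 [2, 2, 2]
```

So 8 = 2 + 2 + 2 + 2; note that each class size |Cl(g)| = 2 divides |G| = 8, as Theorem 13.2.1 requires.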
As a first application, we prove the result that finite p-groups have nontrivial cen-
ters (see Lemma 10.5.6).
Proof. Let G be a finite p-group so that |G| = pn for some n, and consider the class
equation
|G| = |Z(G)| + ∑_{g∉Z(G)} |G : CG(g)|,
where the sum is taken over the distinct conjugacy classes Cl(g) with g ∉ Z(G). For each g ∉ Z(G), the index |G : CG(g)| is greater than 1 and divides |G| = pⁿ; hence, p | |G : CG(g)| for each such g. Furthermore, p | |G|. Therefore, p must divide |Z(G)|; hence, |Z(G)| = pᵐ for some m ≥ 1. Therefore, Z(G) is nontrivial.
The idea of conjugacy and the centralizer of an element can be extended to sub-
groups. If H1 , H2 are subgroups of a group G, then H1 , H2 are conjugate if there exists a
g ∈ G such that g −1 H1 g = H2 . As for elements, conjugacy is an equivalence relation on
the set of subgroups of G.
If H ⊂ G is a subgroup, then its conjugacy class consists of all the subgroups of G
conjugate to it. The normalizer of H is
NG (H) = {g ∈ G : g −1 Hg = H}.
As for elements, let G act on the set of subgroups of G by conjugation. That is, for
g ∈ G, the map is given by H → g −1 Hg. For H ⊂ G, the stabilizer under this action
is precisely the normalizer. Hence, exactly as for elements, we obtain the following
theorem:
Theorem 13.2.4. Let G be a group and H ⊂ G a subgroup. Then the normalizer NG (H)
of H is a subgroup of G, H is normal in NG (H), and
|G : NG(H)| = the number of conjugates of H in G.
Lemma 13.3.1. The alternating group on 4 symbols A4 has order 12, but has no subgroup
of order 6.
Proof. Suppose that there exists a subgroup U ⊂ A4 with |U| = 6. Then |A4 : U| = 2
since |A4 | = 12; hence, U is normal in A4 .
Now id, (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3) are in A4 . These each have order 2 and
commute, so they form a normal subgroup V ⊂ A4 of order 4. This subgroup V ≅
ℤ2 × ℤ2 . Then
12 = |A₄| ≥ |VU| = |V||U| / |V ∩ U| = (4 ⋅ 6) / |V ∩ U|.
It follows that V ∩ U ≠ {1}, and since U is normal, we have that V ∩ U is also normal
in A4 .
Now (1, 2)(3, 4) ∈ V, and by renaming the entries in V, if necessary, we may assume that it is also in U, so that (1, 2)(3, 4) ∈ V ∩ U. Since (1, 2, 3) ∈ A₄ and V ∩ U is normal in A₄, we have

(1, 2, 3)(1, 2)(3, 4)(1, 2, 3)⁻¹ = (1, 4)(2, 3) ∈ V ∩ U,

and then

(1, 3, 2)(1, 2)(3, 4)(1, 3, 2)⁻¹ = (1, 3)(2, 4) ∈ V ∩ U.
But then V ⊂ V ∩ U, and so V ⊂ U. But this is impossible since |V| = 4, which does
not divide |U| = 6.
Definition 13.3.2. Let G be a finite group with |G| = n, and let p be a prime such that
pa |n, but no higher power of p divides n. A subgroup of G of order pa is called a p-Sylow
subgroup.
It is not clear that a p-Sylow subgroup must exist. We will prove that for each prime p | n, a p-Sylow subgroup exists.
We first consider and prove a very special case.
Theorem 13.3.3. Let G be a finite abelian group, and let p be a prime such that p||G|.
Then G contains at least one element of order p.
Proof. Suppose that G is a finite abelian group of order pn. We use induction on n. If
n = 1, then G has order p, and hence is cyclic. Therefore, it has an element of order
p. Suppose that the theorem is true for all abelian groups of order pm with m < n,
and suppose that G has order pn. Suppose that g ∈ G. If the order of g is pt for some integer t, then g^t ≠ 1, and g^t has order p, proving the theorem in this case. Hence, we may suppose that g ∈ G has order prime to p, and we show that there must be an element whose order is a multiple of p, and then use the above argument to get an element of exact order p.
Hence, we have g ∈ G with order m, where (m, p) = 1. Since m | |G| = pn, we must have m | n. Since G is abelian, ⟨g⟩ is normal, and the factor group G/⟨g⟩ is abelian of order p(n/m) < pn. By the inductive hypothesis, G/⟨g⟩ has an element h⟨g⟩ of order p with h ∈ G; hence, h^p = g^k for some k. Now g^k has order m₁ | m; therefore, h has order pm₁. As above, h^{m₁} then has order p, proving the theorem.
Theorem 13.3.4 (First Sylow theorem). Let G be a finite group, and let p||G|, then G con-
tains a p-Sylow subgroup; that is, a p-Sylow subgroup exists.
Proof. Suppose that |G| = p^t m with (p, m) = 1. We use induction on |G|; for |G| = 1 there is nothing to prove. If p does not divide the index |G : C_G(g)| for some g ∉ Z(G), then p^t divides the order of the proper subgroup C_G(g), and by the inductive hypothesis, C_G(g) contains a subgroup of order p^t, which is then a p-Sylow subgroup of G. Hence, we may assume that p divides each index |G : C_G(g)| with g ∉ Z(G). Consider the class equation

|G| = |Z(G)| + ∑_{g∉Z(G)} |G : C_G(g)|,

where the sum is taken over the distinct conjugacy classes with g ∉ Z(G). By assumption, each of these indices is divisible by p, and also p | |G|. Therefore, p | |Z(G)|. It follows that Z(G) is a finite
abelian group, whose order is divisible by p. From Theorem 13.3.3, there exists an el-
ement g ∈ Z(G) ⊂ G of order p. Since g ∈ Z(G), we must have ⟨g⟩ normal in G. The
factor group G/⟨g⟩ then has order p^{t−1}m and, by the inductive hypothesis, must have a p-Sylow subgroup K̄ of order p^{t−1}, hence of index m. By the Correspondence Theorem 10.2.6, there is a subgroup K of G with ⟨g⟩ ⊂ K such that K/⟨g⟩ = K̄. Therefore, |K| = p^t, and K is a p-Sylow subgroup of G.
On the basis of this theorem, we can now strengthen the result obtained in Theo-
rem 13.3.3.
Theorem 13.3.5 (Cauchy). If G is a finite group, and if p is a prime such that p||G|, then
G contains at least one element of order p.
Proof. Let P be a p-Sylow subgroup of G, and let |P| = p^t. If g ∈ P, g ≠ 1, then the order of g is p^{t₁} for some t₁ ≥ 1. Then g^{p^{t₁−1}} has order p.
We have seen that p-Sylow subgroups exist. We now wish to show that any two
p-Sylow subgroups are conjugate. This is the content of the second Sylow theorem:
Theorem 13.3.6 (Second Sylow theorem). Let G be a finite group and p a prime such
that p||G|. Then any p-subgroup H of G is contained in a p-Sylow subgroup. Further-
more, all p-Sylow subgroups of G are conjugate. That is, if P1 and P2 are any two p-Sylow
subgroups of G, then there exists an a ∈ G such that P1 = aP2 a−1 .
Proof. Let Ω be the set of p-Sylow subgroups of G, and let G act on Ω by conjugation.
This action will, of course, partition Ω into disjoint orbits. Let P be a fixed p-Sylow
subgroup and ΩP be its orbit under the conjugation action. The size of the orbit is the
index of its stabilizer; that is, |ΩP | = |G : StabG (P)|. Now P ⊂ StabG (P), and P is a
maximal p-subgroup of G. It follows that the index of StabG (P) must be prime to p,
and so the number of p-Sylow subgroups conjugate to P is prime to p.
Now let H be a p-subgroup of G, and let H act on Ω_P by conjugation. Ω_P will itself decompose into disjoint orbits under this action. Furthermore, the size of each orbit is an index of a subgroup of H, hence must be a power of p. On the other hand, |Ω_P| is prime to p. Therefore, there must be one orbit that has size exactly 1. This orbit contains a p-Sylow subgroup P′, and P′ is fixed by H under conjugation; that is, H normalizes P′. It follows that HP′ is a subgroup of G, and P′ is normal in HP′. From the second isomorphism theorem, we then obtain

HP′/P′ ≅ H/(H ∩ P′).

The right-hand side is a p-group, so HP′ is a p-subgroup of G containing the maximal p-subgroup P′. Hence, HP′ = P′; that is, H ⊂ P′, and H is contained in a p-Sylow subgroup. If H is itself a p-Sylow subgroup, then H = P′ by maximality, and P′ is conjugate to P; hence, all p-Sylow subgroups are conjugate.
We come now to the last of the three Sylow theorems. This one gives us informa-
tion concerning the number of p-Sylow subgroups.
Theorem 13.3.7 (Third Sylow theorem). Let G be a finite group and p a prime such that
p | |G|. Then the number of p-Sylow subgroups of G is of the form 1 + pk and divides |G|. It follows that if |G| = p^a m with (p, m) = 1, then the number of p-Sylow subgroups divides m.
Proof. Let P be a p-Sylow subgroup, and let P act on Ω, the set of all p-Sylow subgroups, by conjugation. Now P normalizes itself, so there is one orbit, namely {P}, having size exactly 1. Every other orbit has size a power of p, since the size is the index of a proper subgroup of P, and therefore must be divisible by p. Hence, |Ω| = 1 + pk for some k. Furthermore, by the second Sylow theorem, G acts transitively on Ω by conjugation, so |Ω| divides |G|; since 1 + pk is prime to p, it must divide m.
Theorem 13.4.1. Let G be a group of order p^n, p a prime. Then G contains a normal subgroup of order p^m for each 0 ≤ m ≤ n.
Proof. We use induction on n. For n = 1, the theorem is trivial. By Lemma 10.5.7, any group of order p² is abelian. This, together with Theorem 13.3.3, establishes the claim for n = 2.
We now assume the theorem is true for all groups of order p^k, where 1 ≤ k < n and n > 2. Let G be a group of order p^n. From Lemma 10.3.4, G has a nontrivial center of order at least p, hence an element g ∈ Z(G) of order p. Let N = ⟨g⟩. Since g ∈ Z(G), it follows that N is a normal subgroup of order p. Then G/N is of order p^{n−1}, and therefore contains, by the inductive hypothesis, normal subgroups of orders p^{m−1} for 0 ≤ m − 1 ≤ n − 1. These groups are of the form H/N, where the normal subgroup H ⊂ G contains N and is of order p^m, 1 ≤ m ≤ n, because |H| = |N|[H : N] = |N| ⋅ |H/N|.
On the basis of the first Sylow theorem, we see that if G is a finite group, and if p^k | |G|, then G must contain a subgroup of order p^k. One can actually show that, as in the case of p-Sylow subgroups, the number of such subgroups is of the form 1 + pt, but we shall not prove this here.
Theorem 13.4.2. Let G be a finite abelian group of order n. Suppose that d|n. Then G
contains a subgroup of order d.
Proof. Suppose that n = p_1^{e_1} ⋯ p_k^{e_k} is the prime factorization of n. Then d = p_1^{f_1} ⋯ p_k^{f_k} for some exponents 0 ≤ f_i ≤ e_i. Now G has a p_1-Sylow subgroup H_1 of order p_1^{e_1}. Hence, from Theorem 13.4.1, H_1 has a subgroup K_1 of order p_1^{f_1}. Similarly, there are subgroups K_2, . . . , K_k of G of respective orders p_2^{f_2}, . . . , p_k^{f_k}. Moreover, since the orders are relatively prime, K_i ∩ K_j = {1} if i ≠ j. It follows that ⟨K_1, K_2, . . . , K_k⟩ has order |K_1||K_2| ⋯ |K_k| = p_1^{f_1} ⋯ p_k^{f_k} = d.
Theorem 13.4.3. Let p, q be distinct primes with p < q and q not congruent to 1 mod p.
Then any group of order pq is cyclic. For example, any group of order 15 must be cyclic.
Proof. Suppose that |G| = pq with p < q and q not congruent to 1 mod p. The number
of q-Sylow subgroups is of the form 1 + qk and divides p. Since q is greater than p,
this implies that there can be only one; hence, there is a normal q-Sylow subgroup H.
Since q is a prime, H is cyclic of order q; therefore, there is an element g of order q.
The number of p-Sylow subgroups is of the form 1 + pk and divides q. Since q is
not congruent to 1 mod p, this implies that there also can be only one p-Sylow sub-
group; hence, there is a normal p-Sylow subgroup K. Since p is a prime, K is cyclic of order p; therefore, there is an element h of order p. Since p, q are distinct primes,
H ∩ K = {1}. Consider the element g −1 h−1 gh. Since K is normal, g −1 hg ∈ K. Then
g −1 h−1 gh = (g −1 h−1 g)h ∈ K. But H is also normal, so h−1 gh ∈ H. This then implies
that g −1 h−1 gh = g −1 (h−1 gh) ∈ H; therefore, g −1 h−1 gh ∈ K ∩ H. It follows then that
g −1 h−1 gh = 1 or gh = hg. Since g, h commute, the order of gh is the lcm of the orders
of g and h, which is pq. Therefore, G has an element of order pq. Since |G| = pq, this
implies that G is cyclic.
In the above theorem, the assumption that q is not congruent to 1 mod p forces p ≠ 2, since every odd prime q is congruent to 1 mod 2. In the case where p = 2, we get another possibility.
Theorem 13.4.4. Let p be an odd prime and G a finite group of order 2p. Then either
G is cyclic, or G is isomorphic to the dihedral group of order 2p; that is, the group of
symmetries of a regular p-gon. In this latter case, G is generated by two elements, g
and h, which satisfy the relations g p = h2 = (gh)2 = 1.
Proof. As in the proof of Theorem 13.4.3, G must have a normal cyclic subgroup of or-
der p, say ⟨g⟩. Since 2||G|, the group G must have an element of order 2, say h. Consider
the order of gh. By Lagrange’s theorem, this element can have order 1, 2, p, 2p. If the or-
der is 1, then gh = 1 or g = h−1 = h. This is impossible since g has order p, and h has
order 2. If the order of gh is p, then from the second Sylow theorem, gh ∈ ⟨g⟩. But this
implies that h ∈ ⟨g⟩, which is impossible since every nontrivial element of ⟨g⟩ has
order p. Therefore, the order of gh is either 2 or 2p.
If the order of gh is 2p, then since G has order 2p, it must be cyclic.
If the order of gh is 2, then within G, we have the relations g p = h2 = (gh)2 = 1. Let
H = ⟨g, h⟩ be the subgroup of G generated by g and h. The relations g p = h2 = (gh)2 = 1
imply that H has order 2p. Since |G| = 2p, we get that H = G. G is isomorphic to the
dihedral group Dp of order 2p (see exercises).
Example 13.4.5 (The groups of order 21). Let G be a group of order 21. The number of
7-Sylow subgroups of G is 1, because it is of the form 1 + 7k and divides 3. Hence, the
7-Sylow subgroup K is normal and cyclic; that is, K ⊲ G and K = ⟨a⟩ with a of order 7.
The number of 3-Sylow subgroups is analogously 1 or 7. If it is 1, then we have
exactly one element of order 3 in G, and if it is 7, there are 14 elements of order 3 in G.
Let b be an element of order 3. Then bab⁻¹ = a^r for some r with 1 ≤ r ≤ 6. Now, a = b³ab⁻³ = a^{r³}; hence, r³ ≡ 1 (mod 7), which implies r = 1, 2 or 4.
The groups obtained for r = 2 and r = 4 are isomorphic: if bab⁻¹ = a², then b²ab⁻² = a⁴, so replacing the generator b by b² carries one presentation to the other.
Hence, up to isomorphism, there are exactly two groups of order 21.
If r = 1, then G is abelian. In fact, G = ⟨ab⟩ is cyclic of order 21.
The group for r = 2 can be realized as a subgroup of S7 . Let a = (1, 2, 3, 4, 5, 6, 7)
and b = (2, 3, 5)(4, 7, 6). Then bab−1 = a2 , and ⟨a, b⟩ has order 21.
Example 13.4.6. Consider GL(n, p), the group of n × n invertible matrices over ℤp . If
{v_1, . . . , v_n} is a basis for (ℤ_p)^n over ℤ_p, then the size of GL(n, p) is the number of independent images {w_1, . . . , w_n} of {v_1, . . . , v_n}. For w_1, there are p^n − 1 choices; for w_2, there are p^n − p choices, and so on. It follows that

|GL(n, p)| = (p^n − 1)(p^n − p) ⋯ (p^n − p^{n−1}) = p^{1+2+⋯+(n−1)} m = p^{n(n−1)/2} m

with (p, m) = 1. Therefore, a p-Sylow subgroup must have size p^{n(n−1)/2}.
Let P be the subgroup of upper triangular matrices with 1's on the diagonal. Then P has size p^{1+2+⋯+(n−1)} = p^{n(n−1)/2}, and is therefore a p-Sylow subgroup of GL(n, p).
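These counts are easy to confirm numerically for small n and p. The sketch below (function names are ours) compares the product formula with a brute-force count of invertible matrices over ℤ_p and extracts the order of a p-Sylow subgroup:

```python
from itertools import product

def order_gl(n, p):
    """|GL(n, p)| = (p^n - 1)(p^n - p) ... (p^n - p^(n-1))."""
    result = 1
    for k in range(n):
        result *= p**n - p**k
    return result

def p_part(m, p):
    """The largest power of p dividing m: the order of a p-Sylow subgroup."""
    power = 1
    while m % p == 0:
        m //= p
        power *= p
    return power

def det_mod(matrix, p):
    """Determinant mod p by Laplace expansion (fine for tiny matrices)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0] % p
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det_mod(minor, p)
    return total % p

def order_gl_brute(n, p):
    """Count the invertible n x n matrices over Z_p directly."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        matrix = [list(entries[i * n:(i + 1) * n]) for i in range(n)]
        if det_mod(matrix, p) != 0:
            count += 1
    return count
```

For example, both counts give |GL(2, 3)| = (9 − 1)(9 − 3) = 48, whose 3-part is 3^{2·1/2} = 3.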
The final example is a bit more difficult. We mentioned that a major result on finite groups is the classification of the finite simple groups. This classification showed that any finite simple group is either cyclic of prime order, lies in one of several infinite classes of groups, such as the A_n, n > 4, or is one of a number of special examples called sporadic groups. One of the major tools in this classification is the following famous result, called the Feit–Thompson theorem, which shows that any finite group of odd order is solvable; in particular, a noncyclic finite group of odd order cannot be simple.
Theorem 13.4.7 (Feit–Thompson theorem). Any finite group of odd order is solvable.
The proof of this theorem, one of the major results in algebra in the twentieth cen-
tury, is way beyond the scope of this book. The proof is actually hundreds of pages
in length, when one counts the results used. However, we look at the smallest non-
abelian simple group.
Theorem 13.4.8. Suppose that G is a simple group of order 60. Then G is isomorphic to
A5 . Moreover, A5 is the smallest nonabelian finite simple group.
By a theorem of Burnside, any group, whose order is divisible by only two primes, is solvable. Therefore, for |G| = 60, we only have to show that groups of order 30 = 2 ⋅ 3 ⋅ 5 and 42 = 2 ⋅ 3 ⋅ 7 are nonsimple. This is done in the same manner as the first part of this proof. Suppose |G| = 30. The number of 5-Sylow subgroups is of the form 1 + 5k and divides 6; hence, there are 1 or 6. If G were simple, there would have to be 6, covering 24 distinct elements of order 5. The number of 3-Sylow subgroups is of the form 1 + 3k and divides 10; hence, there are 1 or 10. If there were 10, these would cover an additional 20 distinct elements of order 3, which is impossible, since we already have 24 and G has order 30. Therefore, there is only one, hence a normal 3-Sylow subgroup. It follows that G cannot be simple. The case |G| = 42 is even simpler: there must be a normal 7-Sylow subgroup, since the number of 7-Sylow subgroups is of the form 1 + 7k and divides 6.
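Sylow counting of this kind is mechanical: the third Sylow theorem restricts the number of p-Sylow subgroups to the divisors of the p′-part of |G| that are congruent to 1 mod p. A minimal sketch (our own naming):

```python
def sylow_counts(order, p):
    """Candidate numbers of p-Sylow subgroups allowed by the third Sylow
    theorem: divisors of the p'-part of the group order congruent to 1 mod p."""
    m = order
    while m % p == 0:           # strip the p-part, leaving m with (p, m) = 1
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]
```

For |G| = 30 this returns [1, 6] for p = 5 and [1, 10] for p = 3, and for |G| = 42 it returns [1] for p = 7, exactly the counts used in the proof.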
13.5 Exercises
1. Prove Lemma 13.1.3.
2. Let the group G act on itself by conjugation; that is, g(g1 ) = g −1 g1 g. Prove that this
is an action on the set G.
3. Show that the dihedral group Dn of order 2n has the presentation
⟨r, f ; r n = f 2 = (rf )2 = 1⟩
Consider the symmetric group S₃, whose six elements in two-line notation are

1 = (1 2 3 | 1 2 3), a = (1 2 3 | 2 3 1), b = (1 2 3 | 3 1 2),
c = (1 2 3 | 2 1 3), d = (1 2 3 | 3 2 1), e = (1 2 3 | 1 3 2),

where (1 2 3 | 2 3 1) denotes the permutation sending 1, 2, 3 to 2, 3, 1, respectively. We claim that S₃ has the presentation

⟨a, c; a³ = c² = (ac)² = 1⟩.

Indeed,

1 = 1, a = a, b = a², c = c, d = ac, e = a²c,

and so a, c generate S₃.
Now from (ac)² = acac = 1, we get that ca = a⁻¹c⁻¹ = a²c. This implies that if we write any sequence (or word in our later language) in a and c, we can also rearrange it so that the only nontrivial powers of a are a and a², the only power of c is c, and all a terms precede c terms. For example,

ca² = (ca)a = (a²c)a = a²(ca) = a²(a²c) = a⁴c = ac.
Therefore, using the three relations from the presentation above, each element of S3
can be written as aα cβ with α = 0, 1, 2 and β = 0, 1. From this the multiplication of any
two elements can be determined.
https://doi.org/10.1515/9783110603996-014
This type of argument applies exactly to all the dihedral groups D_n. We saw that, in general, |D_n| = 2n. Since these are the symmetry groups of a regular n-gon, we always have a rotation r of angle 2π/n about the center of the n-gon. This element r has order n. Let f be a reflection about any line of symmetry. Then f² = 1, and rf is a reflection about the rotated line, which is also a line of symmetry. Therefore, (rf)² = 1.
Exactly as for S₃, the relation (rf)² = 1 implies that fr = r⁻¹f = r^{n−1}f. This allows us to always place r terms in front of f terms in any word on r and f. Therefore, the elements of D_n are always of the form

r^α f^β, α = 0, 1, 2, . . . , n − 1, β = 0, 1.
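The normal form makes D_n easy to compute with: represent an element by the exponent pair (α, β), and use the relation fr = r^{n−1}f to move r terms leftward when multiplying. A small sketch, with our own naming:

```python
def dihedral_multiply(x, y, n):
    """Multiply r^a1 f^b1 by r^a2 f^b2 in D_n, elements given as pairs
    (alpha, beta) with 0 <= alpha < n and beta in {0, 1}."""
    a1, b1 = x
    a2, b2 = y
    # Moving r^a2 past f^b1 uses f r^a = r^(-a) f: the exponent flips if b1 = 1.
    if b1 == 1:
        a2 = -a2
    return ((a1 + a2) % n, (b1 + b2) % 2)
```

For instance, in D₄ the product f ⋅ r comes out as r³f, and (rf)(rf) comes out as the identity, in accordance with (rf)² = 1.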
Theorem 14.1.1. If D_n is the symmetry group of a regular n-gon, then a presentation for D_n is given by

D_n = ⟨r, f; r^n = f² = (rf)² = 1⟩.
We now give one class of infinite examples. If G is an infinite cyclic group, so that
G ≅ ℤ, then G = ⟨g; ⟩ is a presentation for G. That is, G has a single generator with no
relations.
A direct product of n copies of ℤ is called a free abelian group of rank n. We will denote this by ℤ^n. A presentation for ℤ^n is then given by

ℤ^n = ⟨x_1, . . . , x_n; x_i x_j = x_j x_i, 1 ≤ i < j ≤ n⟩.
We first show that given any set X, there does exist a free group with free basis X.
Let X = {xi }i∈I be a set (possibly empty). We will construct a group F(X), which is free
with free basis X. First, let X −1 be a set disjoint from X, but bijective to X. If xi ∈ X,
then we denote as xi−1 the corresponding element of X −1 under the bijection, and say
that xi and xi−1 are associated. The set X −1 is called the set of formal inverses from X,
and we call X ∪ X −1 the alphabet. Elements of the alphabet are called letters. Hence, a
letter has the form x_i^ε, where ε = ±1. A word in X is a finite sequence of letters from the alphabet. That is, a word has the form

w = x_{i_1}^{ε_1} x_{i_2}^{ε_2} ⋯ x_{i_n}^{ε_n},

where x_{i_j} ∈ X, and ε_j = ±1. If n = 0, we call it the empty word, which we will denote
as e. The integer n is called the length of the word. Words of the form xi xi−1 or xi−1 xi are
called trivial words. We let W(X) be the set of all words on X.
If w1 , w2 ∈ W(X), we say that w1 is equivalent to w2 , denoted as w1 ∼ w2 , if w1 can
be converted to w2 by a finite string of insertions and deletions of trivial words. For ex-
ample, if w1 = x3 x4 x4−1 x2 x2 and w2 = x3 x2 x2 , then w1 ∼ w2 . It is straightforward to verify
that this is an equivalence relation on W(X) (see exercises). Let F(X) denote the set of
equivalence classes in W(X) under this relation; hence, F(X) is a set of equivalence
classes of words from X.
A word w ∈ W(X) is said to be freely reduced or reduced if it has no trivial subwords
(a subword is a connected sequence within a word). Hence, in the example above,
w2 = x3 x2 x2 is reduced, but w1 = x3 x4 x4−1 x2 x2 is not reduced. There is a unique element
of minimal length in each equivalence class in F(X). Furthermore, this element must
be reduced or else it would be equivalent to something of smaller length. Two reduced
words in W(X) are either equal or not in the same equivalence class in F(X). Hence,
F(X) can also be considered as the set of all reduced words from W(X).
Given a word w = x_{i_1}^{ε_1} x_{i_2}^{ε_2} ⋯ x_{i_n}^{ε_n}, we can find the unique reduced word equivalent to w via the following free reduction process. Beginning from the left side of w, we cancel each occurrence of a trivial subword. After all these possible cancellations, we have a word w′. Now we repeat the process again, starting from the left side. Since w has finite length, eventually the resulting word will either be empty or reduced. The final reduced word is the free reduction of w.
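The free reduction process can be implemented with a single left-to-right scan using a stack, which performs all the cancellations at once. A sketch, assuming words are stored as lists of (generator, exponent) pairs with exponent ±1:

```python
def free_reduce(word):
    """Freely reduce a word: one stack scan removes all trivial subwords."""
    stack = []
    for generator, exponent in word:
        # The next letter cancels the previous one exactly when the two form
        # a trivial subword: same generator, opposite exponents.
        if stack and stack[-1] == (generator, -exponent):
            stack.pop()
        else:
            stack.append((generator, exponent))
    return stack
```

For the earlier example, x₃x₄x₄⁻¹x₂x₂ reduces to x₃x₂x₂.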
Now we build a multiplication on F(X). If

w_1 = x_{i_1}^{ε_1} ⋯ x_{i_n}^{ε_n}, w_2 = x_{j_1}^{δ_1} ⋯ x_{j_m}^{δ_m}

are two words in W(X), then their concatenation w_1 ⋆ w_2 is simply placing w_2 after w_1:

w_1 ⋆ w_2 = x_{i_1}^{ε_1} ⋯ x_{i_n}^{ε_n} x_{j_1}^{δ_1} ⋯ x_{j_m}^{δ_m}.

We then define the product in F(X) by
w1 w2 = equivalence class of w1 ⋆ w2 .
That is, we concatenate w_1 and w_2, and the product is the equivalence class of the resulting word. It is easy to show that if w_1 ∼ w_1′ and w_2 ∼ w_2′, then w_1 ⋆ w_2 ∼ w_1′ ⋆ w_2′, so that the above multiplication is well-defined. Equivalently, we can think of this product in the following way. If w_1, w_2 are reduced words, then to find w_1 w_2, first concatenate, and then freely reduce. Notice that if x_{i_n}^{ε_n} x_{j_1}^{δ_1} is a trivial word, then it is cancelled when the concatenation is formed. We say then that there is cancellation in forming the product w_1 w_2. Otherwise, the product is formed without cancellation.
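The product just described, concatenate and then freely reduce, can be sketched as follows, again storing words as lists of (generator, exponent) pairs (function names are ours):

```python
def free_reduce(word):
    """Remove trivial subwords with a single stack scan."""
    stack = []
    for generator, exponent in word:
        if stack and stack[-1] == (generator, -exponent):
            stack.pop()
        else:
            stack.append((generator, exponent))
    return stack

def multiply(w1, w2):
    """Product in F(X): concatenate, then freely reduce."""
    return free_reduce(w1 + w2)

def inverse(word):
    """Formal inverse: reverse the word and invert each letter."""
    return [(g, -e) for g, e in reversed(word)]
```

With this in hand, w ⋅ w⁻¹ reduces to the empty word, as in the group axioms verified below in Theorem 14.2.2.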
Theorem 14.2.2. Let X be a nonempty set, and let F(X) be as above. Then F(X) is a free
group with free basis X. Furthermore, if X = 0, then F(X) = {1}; if |X| = 1, then F(X) ≅ ℤ,
and if |X| ≥ 2, then F(X) is nonabelian.
Proof. We first show that F(X) is a group, and then show that it satisfies the universal
mapping property on X. We consider F(X) as the set of reduced words in W(X) with the
multiplication defined above. Clearly, the empty word acts as the identity element 1.
If w = x_{i_1}^{ε_1} ⋯ x_{i_n}^{ε_n} and w_1 = x_{i_n}^{−ε_n} ⋯ x_{i_1}^{−ε_1}, then both w ⋆ w_1 and w_1 ⋆ w freely reduce to the empty word, and so w_1 is the inverse of w. Therefore, each element of F(X) has an inverse. Hence, to show that F(X) forms a group, we must show that
the multiplication is associative. Let
w_1 = x_{i_1}^{ε_1} ⋯ x_{i_n}^{ε_n}, w_2 = x_{j_1}^{δ_1} ⋯ x_{j_m}^{δ_m}, w_3 = x_{k_1}^{η_1} ⋯ x_{k_p}^{η_p}
However, these are equal since x_{i_n}^{ε_n} = x_{k_1}^{η_1}. Therefore, in this final case, w_1(w_2 w_3) = (w_1 w_2)w_3.
It follows, inductively, from these four cases, that the associative law holds in
F(X); therefore, F(X) forms a group.
Now suppose that f : X → G is a map from X into a group G. By the construction of F(X) as a set of reduced words, this can be extended to a unique homomorphism: if w ∈ F(X) with w = x_{i_1}^{ε_1} ⋯ x_{i_n}^{ε_n}, then define f(w) = f(x_{i_1})^{ε_1} ⋯ f(x_{i_n})^{ε_n}. Since multiplication in F(X) is concatenation, this defines a homomorphism, and again from the construction of F(X), it is the only one extending f. This is analogous to constructing a linear
transformation from one vector space to another by specifying the images of a basis.
Therefore, F(X) satisfies the universal mapping property of Definition 14.2.1. Hence,
F(X) is a free group with free basis X.
The final parts of Theorem 14.2.2 are straightforward. If X is empty, the only re-
duced word is the empty word; hence, the group is just the identity. If X has a single
letter, then F(X) has a single generator, and is therefore cyclic. It is easy to see that
it must be torsion-free. Therefore, F(X) is infinite cyclic; that is, F(X) ≅ ℤ. Finally,
if |X| ≥ 2, let x1 , x2 ∈ X. Then x1 x2 ≠ x2 x1 , and both are reduced. Therefore, F(X) is
nonabelian.
The proof of Theorem 14.2.2 provides another way to look at free groups.
Theorem 14.2.3. F is a free group if and only if there is a generating set X such that every
element of F has a unique representation as a freely reduced word on X.
Theorem 14.2.4. If X and Y are sets with the same cardinality, that is, |X| = |Y|, then
F(X) ≅ F(Y), the resulting free groups are isomorphic. Furthermore, if F(X) ≅ F(Y), then
|X| = |Y|.
Proof. A bijection between X and Y extends, by the universal mapping property, to an isomorphism F(X) → F(Y), proving the first statement. For the second statement, let N(X) be the normal subgroup of F(X) generated by the squares w², w ∈ F(X). Then N(X) is a normal subgroup, and the factor group F(X)/N(X) is abelian, where every nontrivial element has order 2 (see exercises). Therefore, F(X)/N(X) can be considered as a vector space over ℤ₂, the finite field of order 2, with X as a vector space
basis. Hence, |X| is the dimension of this vector space. Let N(Y) be the correspond-
ing subgroup of F(Y). Since F(X) ≅ F(Y), we would have F(X)/N(X) ≅ F(Y)/N(Y);
therefore, |Y| is the dimension of the vector space F(Y)/N(Y). Thus, |X| = |Y| from the
uniqueness of dimension of vector spaces.
Expressing elements of F(X) as a reduced word gives a normal form for elements
in a free group F. As we will see in Section 14.5, this solves what is termed the word
problem for free groups. Another important concept is the following: a freely reduced word w = x_{v_1}^{e_1} x_{v_2}^{e_2} ⋯ x_{v_n}^{e_n} is cyclically reduced if v_1 ≠ v_n, or if v_1 = v_n, then e_1 ≠ −e_n. Clearly then, every element of a free group is conjugate to an element given by a cyclically reduced word. This provides a method to determine conjugacy in free groups.
Theorem 14.2.5. In a free group F, two elements g1 , g2 are conjugate if and only if a
cyclically reduced word for g1 is a cyclic permutation of a cyclically reduced word for g2 .
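Theorem 14.2.5 yields a simple conjugacy test: cyclically reduce both words, then compare against all cyclic permutations. A sketch, with words as lists of (generator, exponent) pairs and our own naming:

```python
def free_reduce(word):
    """Remove trivial subwords with a single stack scan."""
    stack = []
    for generator, exponent in word:
        if stack and stack[-1] == (generator, -exponent):
            stack.pop()
        else:
            stack.append((generator, exponent))
    return stack

def cyclically_reduce(word):
    """Freely reduce, then strip matching inverse letters from the two ends."""
    w = free_reduce(word)
    while len(w) >= 2 and w[0] == (w[-1][0], -w[-1][1]):
        w = w[1:-1]
    return w

def conjugate_in_free_group(w1, w2):
    """Theorem 14.2.5: conjugate iff one cyclic reduction is a cyclic
    permutation of the other."""
    c1, c2 = cyclically_reduce(w1), cyclically_reduce(w2)
    if len(c1) != len(c2):
        return False
    return any(c1[k:] + c1[:k] == c2 for k in range(max(len(c1), 1)))
```

For example, aba⁻¹ is conjugate to b, while a and b are not conjugate.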
The theory of free groups has a large and extensive literature. We close this section
by stating several important properties. Proofs for these results can be found in [31],
[30] or [20].
A transversal T for H in F, containing exactly one element from each coset of H and closed under taking initial segments of reduced words, is called a Schreier system or Schreier transversal for H. If g ∈ F, let \overline{g} represent its coset representative in T, and further define, for g ∈ F and t ∈ T, S_{t,g} = tg(\overline{tg})^{-1}. Notice that S_{t,g} ∈ H for all t, g. We then have the following:
Example 14.2.10. Let F be free on {a, b} and H = F(X 2 ) the normal subgroup of F gen-
erated by all squares in F.
Then F/F(X²) = ⟨a, b; a² = b² = (ab)² = 1⟩ ≅ ℤ₂ × ℤ₂ (see Section 14.3 for the concept of group presentations). It follows that a Schreier system for F mod H is {1, a, b, ab} with \overline{a} = a, \overline{b} = b and \overline{ba} = ab. From this it can be shown that H is free on the generating set

{a², b², ab²a⁻¹, bab⁻¹a⁻¹, abab⁻¹}.
The theorem also allows for a computation of the rank of H, given the rank of F and the index. Specifically, if F has finite rank n and H has finite index i in F, then H is free of rank ni − i + 1.
From the example, we see that F is free of rank 2 and H has index 4, so H is free of rank 2 ⋅ 4 − 4 + 1 = 5.
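The example can be verified mechanically, assuming the convention S_{t,x} = tx(\overline{tx})^{-1} for the Schreier generators. Cosets of H are tracked by exponent-sum parities, since F/H ≅ ℤ₂ × ℤ₂. In the sketch below, words are strings over a, b, with uppercase letters denoting formal inverses (our own encoding):

```python
def free_reduce(word):
    """Cancel adjacent inverse letters (case-swapped pairs) until none remain."""
    stack = []
    for ch in word:
        if stack and stack[-1] == ch.swapcase():
            stack.pop()
        else:
            stack.append(ch)
    return "".join(stack)

def inverse(word):
    """Formal inverse of a word: reverse it and invert each letter."""
    return word[::-1].swapcase()

def coset(word):
    """Image in F/H = Z_2 x Z_2: parities of the exponent sums of a and b."""
    return (sum(ch in "aA" for ch in word) % 2,
            sum(ch in "bB" for ch in word) % 2)

# The Schreier transversal {1, a, b, ab} from the example, indexed by coset.
transversal = {(0, 0): "", (1, 0): "a", (0, 1): "b", (1, 1): "ab"}

def schreier_generators():
    """The nontrivial S_{t,x} = t x (representative of tx)^(-1)."""
    gens = set()
    for t in transversal.values():
        for x in "ab":
            rep = transversal[coset(t + x)]
            s = free_reduce(t + x + inverse(rep))
            if s:
                gens.add(s)
    return gens
```

The five nontrivial generators that appear are a², b², ab²a⁻¹, bab⁻¹a⁻¹ and abab⁻¹, matching the rank 2 ⋅ 4 − 4 + 1 = 5.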
Theorem 14.3.1. Every group G is a homomorphic image of a free group. That is, let G
be any group. Then G = F/N, where F is a free group.
In the above theorem, instead of taking all the elements of G, we can consider
just a set X of generators for G. Then G is a factor group of F(X), G ≅ F(X)/N. The
normal subgroup N is the kernel of the homomorphism from F(X) onto G. We use The-
orem 14.3.1 to formally define a group presentation.
If H is a subset of a group G, then the normal closure of H denoted by N(H) is the
smallest normal subgroup of G containing H. This can be described alternatively in
the following manner. The normal closure of H is the subgroup of G generated by all
conjugates of elements of H.
Now suppose that G is a group with X, a set of generators for G. We also call X a
generating system for G. Now let G = F(X)/N as in Theorem 14.3.1 and the comments
after it. N is the kernel of the homomorphism f : F(X) → G. It follows that if r is a free
group word with r ∈ N, then r = 1 in G (under the homomorphism). We then call r
a relator in G, and the equation r = 1 a relation in G. Suppose that R is a subset of N
such that N = N(R), then R is called a set of defining relators for G. The equations r = 1,
r ∈ R, are a set of defining relations for G. It follows that any relator in G is a product of
conjugates of elements of R. Equivalently, r ∈ F(X) is a relator in G if and only if r can
be reduced to the empty word by insertions and deletions of elements of R, and trivial
words.
Definition 14.3.2. Let G be a group. Then a group presentation for G consists of a set of
generators X for G and a set R of defining relators. In this case, we write G = ⟨X; R⟩. We
could also write the presentation in terms of defining relations as G = ⟨X; r = 1, r ∈ R⟩.
From Theorem 14.3.1, it follows immediately that every group has a presentation.
However, in general, there are many presentations for the same group. If R ⊂ R1 , then
R1 is also a set of defining relators.
Theorem 14.3.4. F is a free group if and only if F has a presentation of the form F = ⟨X; ⟩.
Mimicking the construction of a free group from a set X, we can show that to each
presentation corresponds a group. Suppose that we are given a supposed presentation
⟨X; R⟩, where R is given as a set of words in X. Consider the free group F(X) on X.
Define two words w1 , w2 on X to be equivalent if w1 can be transformed into w2 using
insertions and deletions of elements of R and trivial words. As in the free group case,
this is an equivalence relation. Let G be the set of equivalence classes. If we define
multiplication as before, as concatenation followed by the appropriate equivalence
class, then G is a group. Furthermore, each r ∈ R must equal the identity in G so that
G = ⟨X; R⟩. Notice that here there may be no unique reduced word for an element
of G.
Theorem 14.3.5. Given (X, R), where X is a set and R is a set of words on X. Then there
exists a group G with presentation ⟨X; R⟩.
The free group F_n of rank n has the presentation

F_n = ⟨x_1, . . . , x_n; ⟩.

The cyclic group of order n has the presentation

ℤ_n = ⟨x; x^n = 1⟩.
Example 14.3.9. The dihedral group of order 2n, the symmetry group of a regular n-gon, has a presentation

D_n = ⟨r, f; r^n = f² = (rf)² = 1⟩.
In this section, we give a more complicated example, and then a nice application to
number theory.
If R is any commutative ring with an identity, then the set of invertible n × n matri-
ces with entries from R forms a group under matrix multiplication called the n-dimen-
sional general linear group over R (see [32]). This group is denoted by GLn (R). Since
det(A) det(B) = det(AB) for square matrices A, B, it follows that the subset of GLn (R),
consisting of those matrices of determinant 1, forms a subgroup. This subgroup is
called the special linear group over R and is denoted by SLn (R). In this section, we
concentrate on SL2 (ℤ), or more specifically, a quotient of it, PSL2 (ℤ), and find presen-
tations for them.
The group SL₂(ℤ) then consists of 2 × 2 integral matrices of determinant one:

SL₂(ℤ) = { ( a b ; c d ) : a, b, c, d ∈ ℤ, ad − bc = 1 },

where ( a b ; c d ) denotes the matrix with rows (a, b) and (c, d).
SL2 (ℤ) is called the homogeneous modular group, and an element of SL2 (ℤ) is called a
unimodular matrix.
If G is any group, recall that its center Z(G) consists of those elements of G, which commute with all elements of G:

Z(G) = {z ∈ G : zg = gz for all g ∈ G}.

Z(G) is a normal subgroup of G. Hence, we can form the factor group G/Z(G). For G = SL₂(ℤ), the only unimodular matrices that commute with all others are ±I, where I = ( 1 0 ; 0 1 ) is the identity matrix. Therefore, Z(SL₂(ℤ)) = {I, −I}. The quotient

SL₂(ℤ)/{I, −I}
is denoted by PSL2 (ℤ) and is called the projective special linear group or inhomoge-
neous modular group. More commonly, PSL2 (ℤ) is just called the Modular Group, and
denoted by M.
M arises in many different areas of mathematics, including number theory, complex analysis, Riemann surface theory, and the theory of automorphic forms and functions. M is perhaps the most widely studied single finitely presented group. Complete discussions of M and its structure can be found in the books Integral Matrices by
M. Newman [46] and Algebraic Theory of the Bianchi Groups by B. Fine [41].
Since M = PSL2 (ℤ) = SL2 (ℤ)/{I, −I}, it follows that each element of M can be
considered as ±A, where A is a unimodular matrix. A projective unimodular matrix is
then

± ( a b ; c d ), a, b, c, d ∈ ℤ, ad − bc = 1.

Each projective unimodular matrix can be identified with the linear fractional transformation

z′ = (az + b)/(cz + d), a, b, c, d ∈ ℤ, ad − bc = 1, where z ∈ ℂ.
Thought of in this way, M forms a Fuchsian group, which is a discrete group of isome-
tries of the non-Euclidean hyperbolic plane. The book by Katok [27] gives a solid and
clear introduction to such groups. This material can also be found in condensed form
in [43].
We now determine presentations for both SL2 (ℤ) and M = PSL2 (ℤ).
Theorem 14.3.10. SL₂(ℤ) is generated by the two matrices

X = ( 0 −1 ; 1 0 ) and Y = ( 0 1 ; −1 −1 ).

Furthermore, a complete set of defining relations for the group in terms of these generators is given by

X⁴ = Y³ = YX²Y⁻¹X⁻² = I.

That is, SL₂(ℤ) has the presentation

⟨X, Y; X⁴ = Y³ = YX²Y⁻¹X⁻² = I⟩.
Proof. We first show that SL2 (ℤ) is generated by X and Y; that is, every matrix A in the
group can be written as a product of powers of X and Y.
Let

U = ( 1 1 ; 0 1 ).
Then a direct multiplication shows that U = XY, and we show that SL2 (ℤ) is generated
by X and U, which implies that it is also generated by X and Y. Furthermore,
U^n = ( 1 n ; 0 1 );

XA = ( −c −d ; a b ), and U^k A = ( a + kc b + kd ; c d )
for any k ∈ ℤ. We may assume that |c| ≤ |a| otherwise start with XA rather than A. If
c = 0, then A = ±U q for some q. If A = U q , then certainly A is in the group generated
by X and U. If A = −U q , then A = X 2 U q since X 2 = −I. It follows that here also A is in
the group generated by X and U.
Now suppose c ≠ 0. Apply the Euclidean algorithm to a and c in the following
modified way:

a = q_0 c + r_1,
−c = q_1 r_1 + r_2,
r_1 = q_2 r_2 + r_3,
. . .
(−1)^n r_{n−1} = q_n r_n + 0.

Therefore,

A = X^m U^{q_0} X U^{q_1} ⋯ X U^{q_n} X U^{q_{n+1}}
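The reduction in this proof is effectively an algorithm: left multiplication by a power of U performs one division step, and left multiplication by X swaps the roles of a and c (with a sign). The sketch below (our own naming, not the book's) records the factors and checks them by multiplying back:

```python
def mat_mul(P, Q):
    """Product of two 2 x 2 integer matrices given as nested lists."""
    return [[P[0][0] * Q[0][0] + P[0][1] * Q[1][0], P[0][0] * Q[0][1] + P[0][1] * Q[1][1]],
            [P[1][0] * Q[0][0] + P[1][1] * Q[1][0], P[1][0] * Q[0][1] + P[1][1] * Q[1][1]]]

def mat_pow(P, n):
    """Integer power of a determinant-1 matrix; negative n uses the inverse."""
    result = [[1, 0], [0, 1]]
    base = P if n >= 0 else [[P[1][1], -P[0][1]], [-P[1][0], P[0][0]]]
    for _ in range(abs(n)):
        result = mat_mul(result, base)
    return result

X = [[0, -1], [1, 0]]
U = [[1, 1], [0, 1]]
X_INV = mat_pow(X, -1)

def decompose(A):
    """Express A in SL_2(Z) as a word in X and U: a list of ("X", 1), ("U", k)."""
    M = [row[:] for row in A]
    word = []
    while M[1][0] != 0:
        q = M[0][0] // M[1][0]           # division step: a = q c + r, |r| < |c|
        M = mat_mul(mat_pow(U, -q), M)   # M_old = U^q M_new
        word.append(("U", q))
        M = mat_mul(X_INV, M)            # M_old = X M_new; this shrinks |c|
        word.append(("X", 1))
    # Now M is upper triangular with determinant 1: M = U^b or M = X^2 U^(-b).
    if M[0][0] == 1:
        word.append(("U", M[0][1]))
    else:
        word.extend([("X", 1), ("X", 1), ("U", -M[0][1])])
    return word

def evaluate(word):
    """Multiply out a word produced by decompose."""
    M = [[1, 0], [0, 1]]
    for sym, k in word:
        M = mat_mul(M, X if sym == "X" else mat_pow(U, k))
    return M
```

Since the loop strictly decreases |c|, the procedure terminates, mirroring the modified Euclidean algorithm in the proof.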
It remains to show that the relations

X⁴ = Y³ = YX²Y⁻¹X⁻² = I

form a complete set of defining relations for SL₂(ℤ), or that every relation on these generators is derivable from these. It is straightforward to see that X and Y do satisfy these relations. Assume then that we have a relation

S = X^{ε_1} Y^{α_1} X^{ε_2} Y^{α_2} ⋯ Y^{α_n} X^{ε_{n+1}} = I.

Using the relations

X⁴ = Y³ = YX²Y⁻¹X⁻² = I,

we may assume that S has the form

S = X^{ε_1} Y^{α_1} X Y^{α_2} ⋯ Y^{α_m} X^{ε_{m+1}}

with 1 ≤ α_i ≤ 2, and this can be rewritten as

Y^{α_1} X ⋯ Y^{α_m} X = X^α = S_1.
a, b, c, d ≥ 0, b + c > 0,
or
a, b, c, d ≤ 0, b + c < 0.
YX = ( 1 0 ; −1 1 ), and Y²X = ( −1 1 ; 0 −1 ).

Suppose it is correct for S_2 = ( a_1 −b_1 ; −c_1 d_1 ). Then

YXS_2 = ( a_1 −b_1 ; −(a_1 + c_1) b_1 + d_1 ), and

Y²XS_2 = ( −(a_1 + c_1) b_1 + d_1 ; c_1 −d_1 ).
Therefore, the claim is correct for all S1 with m ≥ 1. This gives a contradiction, for the
entries of X α with α = 0, 1, 2 or 3 do not satisfy the claim. Hence, m = 0, and S can be
reduced to a trivial relation by the given set of relations. Therefore, they are a complete
set of defining relations, and the theorem is proved.
Corollary 14.3.11. The modular group M = PSL2 (ℤ) has the presentation
    M = ⟨x, y; x^2 = y^3 = 1⟩.
Proof. The center of SL2(ℤ) is {±I}. Since X^2 = −I, setting X^2 = I in the presentation for
SL2 (ℤ) gives the presentation for M. Writing the projective matrices as linear fractional
transformations gives the second statement.
This corollary says that M is the free product of a cyclic group of order 2 and a
cyclic group of order 3, a concept we will introduce in Section 14.7.
We note that there is an elementary alternative proof to Corollary 14.3.11 as far as
showing that X^2 = Y^3 = 1 are a complete set of defining relations. As linear fractional
transformations, we have

    X(z) = −1/z,  Y(z) = −1/(z + 1),  Y^2(z) = −(z + 1)/z.
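These linear fractional transformations can be checked directly as functions; the short Python sketch below verifies X^2 = Y^3 = id and the formula for Y^2 on a few sample points (chosen to avoid the poles at 0 and −1 and their preimages).

```python
# Treat the maps as real functions and check their orders pointwise.
def X(z):  # X(z) = -1/z
    return -1 / z

def Y(z):  # Y(z) = -1/(z+1)
    return -1 / (z + 1)

for z in [2.0, -3.5, 0.25, 7.0]:
    assert abs(X(X(z)) - z) < 1e-12            # X has order 2 as a map
    assert abs(Y(Y(Y(z))) - z) < 1e-12         # Y has order 3 as a map
    assert abs(Y(Y(z)) - (-(z + 1) / z)) < 1e-12   # Y^2(z) = -(z+1)/z
```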
Now let S be a nontrivial word in X and Y. Then, after conjugating and applying the
relations, we may assume that

    S = Y^{α_1} X Y^{α_2} ⋅ ⋅ ⋅ X Y^{α_n}

with 1 ≤ α_i ≤ 2 and α_1 = α_n.
In this second case, if x ∈ ℝ+ , then S(x) ∈ ℝ− ; hence, S ≠ 1.
This type of ping-pong argument can be used in many examples (see [30], [20]
and [26]). As another example, consider the unimodular matrices
    A = ( 0, 1 ; −1, 2 ), B = ( 0, −1 ; 1, 2 ).

A direct induction shows that

    A^n = ( −n + 1, n ; −n, n + 1 ), B^n = ( −n + 1, −n ; n, n + 1 ) for n ∈ ℤ.
for all n ≠ 0. The ping-pong argument used for any element of the type
    S = A^{n_1} B^{m_1} ⋅ ⋅ ⋅ B^{m_k} A^{n_{k+1}}
with all ni , mi ≠ 0 and n1 + nk+1 ≠ 0 shows that S(x) ∈ ℝ+ if x ∈ ℝ− . It follows that there
are no nontrivial relations on A and B; therefore, the subgroup of M generated by A, B
must be a free group of rank 2.
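The closed forms for A^n and B^n can be verified against actual matrix powers; in the numpy sketch below, A_pow and B_pow are transcriptions of the displays above (the function names are ad hoc).

```python
import numpy as np

A = np.array([[0, 1], [-1, 2]])
B = np.array([[0, -1], [1, 2]])

def A_pow(n):  # claimed closed form for A^n
    return np.array([[-n + 1, n], [-n, n + 1]])

def B_pow(n):  # claimed closed form for B^n
    return np.array([[-n + 1, -n], [n, n + 1]])

I = np.eye(2, dtype=int)
for n in range(6):
    assert np.array_equal(np.linalg.matrix_power(A, n), A_pow(n))
    assert np.array_equal(np.linalg.matrix_power(B, n), B_pow(n))
    # the same formulas cover negative exponents: A^n A^{-n} = I
    assert np.array_equal(A_pow(n) @ A_pow(-n), I)
    assert np.array_equal(B_pow(n) @ B_pow(-n), I)
```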
To close this section, we present a number of number-theoretic applications
of the modular group. First, we need the following corollary to Corollary 14.3.11:
Theorem 14.3.14 (Fermat’s two-square theorem). Let n > 0 be a natural number. Then
n = a2 + b2 with (a, b) = 1 if and only if −1 is a quadratic residue modulo n.
    A = ±( x, n ; m, −x ).
    T^{−1} = ( d, −b ; −c, a ),
and
    TXT^{−1} = ( a, b ; c, d )( 0, 1 ; −1, 0 )( d, −b ; −c, a ) = ±( −(bd + ac), a^2 + b^2 ; −(c^2 + d^2), bd + ac ).  (∗)
Therefore, any conjugate of X must have the form (∗), and thus A also must have the
form (∗). Therefore, n = a2 + b2 . Furthermore, (a, b) = 1 since in finding the form (∗),
we had ad − bc = 1.
Conversely suppose n = a2 + b2 with (a, b) = 1. Then there exist c, d ∈ ℤ with
ad − bc = 1; hence, there exists a projective unimodular matrix
    T = ±( a, b ; c, d ).
Then
    TXT^{−1} = ±( α, a^2 + b^2 ; γ, −α ) = ±( α, n ; γ, −α ).
Taking determinants gives −α^2 − nγ = 1; hence α^2 = −1 − nγ, and so α^2 ≡ −1 mod n.
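The two-square criterion can be checked by brute force for small n; the Python sketch below (with ad hoc function names) confirms that n = a^2 + b^2 with gcd(a, b) = 1 occurs exactly when −1 is a quadratic residue modulo n.

```python
from math import gcd, isqrt

# n = a^2 + b^2 with gcd(a, b) = 1?
def has_coprime_two_square(n):
    return any(gcd(a, b) == 1
               for a in range(isqrt(n) + 1)
               for b in range(isqrt(n) + 1)
               if a * a + b * b == n)

# Is -1 a quadratic residue modulo n?
def minus_one_is_qr(n):
    return any(x * x % n == n - 1 for x in range(1, n))

# The two conditions agree, as Theorem 14.3.14 asserts.
for n in range(2, 200):
    assert has_coprime_two_square(n) == minus_one_is_qr(n)
```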
This type of group theoretical proof can be extended in several directions. Kern-
Isberner and Rosenberger [28] considered groups of matrices of the form
    U = ( a, b√N ; c√N, d ),  a, b, c, d, N ∈ ℤ, ad − Nbc = 1,
or
    U = ( a√N, b ; c, d√N ),  a, b, c, d, N ∈ ℤ, Nad − bc = 1.
N ∈ {1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 16, 18, 22, 25, 28, 37, 58}
The proof of the above results depends on the class number of ℚ(√−N) (see [28]).
In another direction, Fine [40] and [39] showed that the Fermat two-square prop-
erty is actually a property satisfied by many rings R. These are called sum of squares
rings. For example, if p ≡ 3 mod 4, then ℤ_{p^n} for n > 1 is a sum of squares ring.
Reidemeister–Schreier process
Let G, H and T be as above. Then H is generated by the set
with a complete set of defining relations given by conjugates of the original relators
rewritten in terms of the subgroup generating set.
To actually rewrite the relators in terms of the new generators, we use a mapping
τ on words on the generators of G called the Reidemeister rewriting process. This map
is defined as follows: If
    W = a_{v_1}^{e_1} a_{v_2}^{e_2} ⋅ ⋅ ⋅ a_{v_j}^{e_j} with e_i = ±1 defines an element of H,

then

    τ(W) = S_{t_1, a_{v_1}}^{e_1} S_{t_2, a_{v_2}}^{e_2} ⋅ ⋅ ⋅ S_{t_j, a_{v_j}}^{e_j},
We present two examples; one with a finite group, and then an important example
with a free group, which shows that a countable free group contains free subgroups
of arbitrary ranks.
Let H = A_4′ be the commutator subgroup. We use the above method to find a presentation
for H. Now
Therefore, |A_4 : A_4′| = 3. A Schreier system is then {1, b, b^2}. The generators for A_4′ are
then
Example 14.4.2. Let F = ⟨x, y; ⟩ be the free group of rank 2. Let H be the commutator
subgroup. Then F/H ≅ ℤ × ℤ, a free abelian group of rank 2. It follows that H has infinite
index in F. As Schreier
coset representatives, we can take
The relations are all trivial; therefore, H is free on the countably infinitely many generators
above. It follows that a free group of rank 2 contains as a subgroup a free group
of countably infinite rank. Since a free group of countably infinite rank contains as
subgroups free groups of all finite ranks, it follows that a free group of rank 2 contains
as a subgroup a free subgroup of any finite rank.
Theorem 14.4.3. Let F be free of rank 2. Then the commutator subgroup F′ is free of
countably infinite rank. In particular, a free group of rank 2 contains as a subgroup a
free group of any finite rank n.
Corollary 14.4.4. Let n, m be any pair of positive integers n, m ≥ 2 and Fn , Fm free groups
of ranks n, m, respectively. Then Fn can be embedded into Fm , and Fm can be embedded
into Fn .
Theorem 14.5.1. Suppose that K is a connected cell complex. Suppose that T is a max-
imal tree within the 1-skeleton of K. Then a presentation for π(K) can be determined in
the following manner:
Corollary 14.5.2. The fundamental group of a connected graph is free. Furthermore, its
rank is the number of edges outside a maximal tree.
Lemma 14.5.4. If K1 is a connected covering complex for K, then K1 and K have the same
dimension.
Theorem 14.5.5. Let K1 be a covering complex of K with covering map p. Then p(π(K1 )) is
a subgroup of π(K). Conversely, to each subgroup H of π(K), there is a covering complex
K1 with π(K1 ) = H. Hence, there is a one-to-one correspondence between subgroups of
the fundamental group of a complex K and covers of K.
We will see the analog of this theorem in regard to algebraic field extensions in
Chapter 15.
A topological space X is simply connected if π(X) = {1}. Hence, the covering com-
plex of K corresponding to the identity in π(K) is simply connected. This is called the
universal cover of K since it covers any other cover of K.
Based on Theorem 14.5.1, we get a very simple proof of the Nielsen–Schreier the-
orem.
Proof. Let F be a free group. Then F = π(K), where K is a connected graph. Let H be a
subgroup of F. Then H corresponds to a cover K1 of K. But a cover is also 1-dimensional;
hence, H = π(K1 ), where K1 is a connected graph. Therefore, H is also free.
Theorem 14.5.7. Given an arbitrary presentation ⟨X; R⟩, there exists a connected 2-com-
plex K with π(K) = ⟨X; R⟩.
We note that the books by Rotman [33] and Camps, Kühling and Rosenberger [21]
have significantly detailed and accessible descriptions of groups and complexes.
Cayley, and then Dehn, introduced for each group G, a graph, now called Cayley
graph, as a tool to apply complexes to the study of G. The Cayley graph is actually
tied to a presentation, and not to the group itself. Gromov reversed the procedure and
showed that by considering the geometry of the Cayley graph, one could get informa-
tion about the group. This led to the development of the theory of hyperbolic groups
(see for instance [20]).
Call x the label on the edge (g, x). Given g ∈ G, g is represented by at least
one word W in A. This represents a path in the Cayley graph. The length of the word W
is the length of the path. This is equivalent to making each edge have length one. If we
take the distance between 2 points as the minimum path length, we make the Cayley
graph a metric space. This metric is called the word metric. If we extend this metric to
all pairs of points in the Cayley graph in the obvious way (making each edge a unit
real interval), then the Cayley graph becomes a geodesic metric space with a metric d.
Each closed path in the Cayley graph represents a relator.
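The word metric can be computed by breadth-first search on the Cayley graph; the following Python sketch does this for a small hypothetical example, S3 with a transposition and a 3-cycle as generators (the encoding of permutations as tuples is an implementation choice, not from the text).

```python
from collections import deque

# (p ∘ q)(i) = p[q[i]]: composition of permutations given as tuples.
def compose(p, q):
    return tuple(p[i] for i in q)

# Generating set for S3: a transposition, a 3-cycle, and the 3-cycle's inverse.
gens = [(1, 0, 2), (1, 2, 0), (2, 0, 1)]

# BFS from the identity computes d(g, 1) in the word metric.
def word_metric_from_identity():
    e = (0, 1, 2)
    dist = {e: 0}
    queue = deque([e])
    while queue:
        g = queue.popleft()
        for s in gens:
            h = compose(g, s)
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

dist = word_metric_from_identity()
assert len(dist) == 6              # every element of S3 is reached
assert max(dist.values()) == 2     # diameter of this Cayley graph
```

Extending each edge to a unit interval, as in the text, turns this combinatorial distance into a geodesic metric.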
By left multiplication, the group G acts on the Cayley graph as a group of isome-
tries. Furthermore, the action of G on the Cayley graph is without inversion; that is,
ge ≠ e^{−1} if e is an edge.
If we sew in a 2-cell for each closed path in the Cayley graph, we get a simply
connected 2-complex called the Cayley complex. We now want to briefly describe the
concept of hyperbolic groups. Let G be a finitely generated group with a finite gener-
ating system X. G is called a hyperbolic group if the Cayley graph C(X) is a hyperbolic
space; that is, there exists a constant δ = δ(X) ≥ 0 such that d(u, [y, z] ∪ [z, x]) ≤ δ for
each geodesic triangle [x, y] ∪ [y, z] ∪ [z, x], and each u ∈ [x, y].
[Figure: a δ-thin geodesic triangle with vertices x, y, z; each point u on the side [x, y] lies within distance δ of [y, z] ∪ [z, x].]
Theorem 14.6.1 (Dyck’s theorem). Let G = ⟨X; R⟩, and suppose that H ≅ G/N, where N
is a normal subgroup of G. Then a presentation for H is ⟨X; R ∪ R1 ⟩ for some set of words
R1 on X. Conversely, the presentation ⟨X; R ∪ R1 ⟩ defines a group, that is, a factor group
of G.
Proof. Since each element of H is a coset of N, it has the form gN for g ∈ G. It is
clear then that the images of X generate H. Furthermore, since H is a homomorphic
image of G, each relator in R is a relator in H. Let N1 be a set of elements that generate N,
and let R1 be the corresponding words in the free group on X. Then R1 is an additional
All three of these problems have negative answers in general. That is, for each of these
problems one can find a finite presentation, for which these questions cannot be an-
swered algorithmically (see [30]). Attempts at solutions, and solutions in restricted
cases, have been of central importance in combinatorial group theory. For this reason,
combinatorial group theory has always searched for and studied classes of groups in
which these decision problems are solvable.
For finitely generated free groups, there are simple and elegant solutions to all
three problems. If F is a free group on x1 , . . . , xn and W is a freely reduced word in
x1 , . . . , xn , then W ≠ 1 if and only if L(W) ≥ 1, where L(W) denotes the length of W. Since freely reducing
any word to a freely reduced word is algorithmic, this provides a solution to the word
problem. Furthermore, a freely reduced word W = x_{v_1}^{e_1} x_{v_2}^{e_2} ⋅ ⋅ ⋅ x_{v_n}^{e_n} is cyclically reduced
if v_1 ≠ v_n, or if v_1 = v_n, then e_1 ≠ −e_n. Clearly then, every element of a free group is
conjugate to an element given by a cyclically reduced word called a cyclic reduction.
This leads to a solution to the conjugacy problem. Suppose V and W are two words
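The free-reduction and cyclic-reduction procedures just described can be sketched in Python; representing a word as a list of (generator, exponent) pairs is an implementation choice of this sketch.

```python
# A word is a list of pairs (generator name, ±1).
def free_reduce(word):
    out = []
    for letter in word:
        if out and out[-1][0] == letter[0] and out[-1][1] == -letter[1]:
            out.pop()            # cancel x x^{-1} or x^{-1} x
        else:
            out.append(letter)
    return out

def cyclic_reduce(word):
    w = free_reduce(word)
    while len(w) >= 2 and w[0][0] == w[-1][0] and w[0][1] == -w[-1][1]:
        w = w[1:-1]              # strip a conjugating letter from both ends
    return w

# Word problem: W = 1 in the free group iff its free reduction is empty.
w = [('x', 1), ('y', 1), ('y', -1), ('x', -1)]
assert free_reduce(w) == []      # x y y^{-1} x^{-1} = 1

# x^{-1} x y x freely reduces to y x, which is already cyclically reduced.
v = [('x', -1), ('x', 1), ('y', 1), ('x', 1)]
assert cyclic_reduce(v) == [('y', 1), ('x', 1)]
```

The stack-based single pass in free_reduce performs all cancellations, which is what makes the word problem algorithmic here.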
Definition 14.8.1. The free product of A and B, denoted by A ∗ B, is the group G with
the presentation ⟨a1 , . . . , b1 , . . . ; R1 , . . . , S1 , . . .⟩; that is, the generators of G consist of the
disjoint union of the generators of A and B with relators taken as the disjoint union of
the relators Ri of A and Sj of B. A and B are called the factors of G.
Free products exist and are nontrivial. In that regard, we have the following:
Theorem 14.8.3. Let G = A ∗ B. Then the maps A → G and B → G are injections. The
subgroup of G generated by the generators of A has the presentation ⟨generators of A;
relators of A⟩, that is, is isomorphic to A. Similarly for B. Thus, A and B can be considered
as subgroups of G. In particular, A ∗ B is nontrivial if A and B are.
Free products share many properties with free groups. First of all there is a cate-
gorical formulation of free products. Specifically we have the following:
Theorem 14.8.4. A group G is the free product of its subgroups A and B if A and B gen-
erate G, and given homomorphisms f1 : A → H, f2 : B → H into a group H, there exists
a unique homomorphism f : G → H, extending f1 and f2 .
Secondly, each element of a free product has a normal form related to the reduced
words of free groups. If G = A ∗ B, then a reduced sequence or reduced word in G is a
sequence g1 g2 . . . gn , n ≥ 0, with gi ≠ 1, each gi in either A or B and gi , gi+1 not both in
the same factor. Then the following hold:
Theorem 14.8.7. If two elements of a free product commute, then they are both powers
of a single element or are contained in a conjugate of an abelian subgroup of a fac-
tor.
Theorem 14.8.8 (Kurosh). A subgroup of a free product is also a free product. Explicitly,
if G = A ∗ B and H ⊂ G, then
H = F ∗ (∗Aα ) ∗ (∗Bβ ),
We note that the rank of F and the number of the other factors can be computed.
A complete discussion of these is in [31], [30] and [20].
If A and B are disjoint groups, then we now have two types of products form-
ing new groups out of them: the free product and the direct product. In both these
products, the original factors inject. In the free product, there are no relations be-
tween elements of A and elements of B, whereas in a direct product, each element
of A commutes with each element of B. If a ∈ A and b ∈ B, a cross commutator
is [a, b] = aba^{−1}b^{−1}. The direct product is a factor group of the free product, and
the kernel is precisely the normal subgroup generated by all the cross commuta-
tors.
    A × B = (A ∗ B)/H,

where H is the normal subgroup generated by the cross commutators.
14.9 Exercises
1. Let X be a set, and let X^{−1} be a set disjoint from X, but bijective to X. A word in X is a finite sequence
of letters from the alphabet X ∪ X^{−1}; that is, a word has the form

    w = x_{i_1}^{ϵ_{i_1}} x_{i_2}^{ϵ_{i_2}} ⋅ ⋅ ⋅ x_{i_n}^{ϵ_{i_n}},

where x_{i_j} ∈ X, and ϵ_{i_j} = ±1. Let W(X) be the set of all words on X.
If w1 , w2 ∈ W(X), we say that w1 is equivalent to w2 , denoted by w1 ∼ w2 , if w1 can
be converted to w2 by a finite string of insertions and deletions of trivial words.
Verify that this is an equivalence relation on W(X).
2. In F(X), let N(X) be the subgroup generated by all squares in F(X); that is, N(X) = ⟨w^2 | w ∈ F(X)⟩.
Show that N(X) is a normal subgroup, and that the factor group F(X)/N(X) is
abelian, where every nontrivial element has order 2.
3. Show that a free group F is torsion-free.
4. Let F be a free group, and a, b ∈ F. Show: If ak = bk , k ≠ 0, then a = b.
5. Let F = ⟨a, b; ⟩ be a free group with basis {a, b}. Let c_i = a^{−i} b a^i, i ∈ ℤ. Then G =
⟨c_i, i ∈ ℤ⟩ is free with basis {c_i | i ∈ ℤ}.
6. Show that ⟨x, y; x^2 y^3, x^3 y^4⟩ ≅ ⟨x; x⟩ = {1}.
7. Let G = ⟨v_1, . . . , v_n; v_1^2 ⋅ ⋅ ⋅ v_n^2⟩, n ≥ 1, and α : G → ℤ_2 the epimorphism with
α(v_i) = −1 for all i. Let U be the kernel of α. Then U has a presentation U =
⟨x_1, . . . , x_{n−1}, y_1, . . . , y_{n−1}; y_1 x_1 ⋅ ⋅ ⋅ y_{n−1} x_{n−1} y_{n−1}^{−1} x_{n−1}^{−1} ⋅ ⋅ ⋅ y_1^{−1} x_1^{−1}⟩.
8. Let M = ⟨x, y; x^2, y^3⟩ ≅ PSL(2, ℤ) be the modular group. Let M′ be the commutator
subgroup. Show that M′ is a free group of rank 2 with a basis {[x, y], [x, y^2]}.
As we mentioned in Chapter 1, one of the origins of abstract algebra was the prob-
lem of trying to determine a formula for finding the solutions in terms of radicals of a
fifth degree polynomial. It was proved first by Ruffini in 1800 and then by Abel that,
in general, it is impossible to find a formula in terms of radicals for such a solution.
Around 1830, Galois extended this and showed that such a formula is impossible for any
degree five or greater. In proving this, he laid the groundwork for much of the devel-
opment of modern abstract algebra, especially field theory and finite group theory.
One of the goals of this book has been to present a comprehensive treatment of Ga-
lois theory and a proof of the results mentioned above. At this point, we have covered
enough general algebra and group theory to discuss Galois extensions and general
Galois theory.
In modern terms, Galois theory is that branch of mathematics, which deals with
the interplay of the algebraic theory of fields, the theory of equations, and finite group
theory. This theory was introduced by Evariste Galois about 1830 in his study of the
insolvability by radicals of quintic (degree 5) polynomials, a result proved somewhat
earlier by Ruffini, and independently by Abel. Galois was the first to see the close con-
nection between field extensions and permutation groups. In doing so, he initiated the
study of finite groups. He was the first to use the term group as an abstract concept,
although his definition was really just for a closed set of permutations.
The method Galois developed not only facilitated the proof of the insolvability
of the quintic and of higher-degree equations, but led to other applications, and to a much larger
theory.
The main idea of Galois theory is to associate to certain special types of algebraic
field extensions called Galois extensions, a group called the Galois group. The prop-
erties of the field extension will be reflected in the properties of the group, which are
somewhat easier to examine. Thus, for example, solvability by radicals can be trans-
lated into solvability of groups, which was discussed in Chapter 12. Showing that for
every polynomial of degree five or greater, there exists a field extension whose Galois
group is not solvable proves that there cannot be a general formula for solvability by
radicals.
The tie-in to the theory of equations is as follows: If f (x) = 0 is a polynomial equation
over some field K, we can form the splitting field of f (x) over K. This is usually a Galois
extension, and therefore has a Galois group called the Galois group of the equation. As
before, properties of this group will reflect properties of this equation.
https://doi.org/10.1515/9783110603996-015
is called the set of automorphisms of L over K. Notice that if α ∈ Aut(L|K), then α(k) = k
for all k ∈ K.
Lemma 15.2.2. Let L|K be a field extension. Then Aut(L|K) forms a group called the
Galois group of L|K.
Proof. Aut(L|K) ⊂ Aut(L). Hence, to show that Aut(L|K) is a group, we only have to
show that it is a subgroup of Aut(L). Now the identity map on L is certainly the identity
map on K, so 1 ∈ Aut(L|K); hence, Aut(L|K) is nonempty. If α, β ∈ Aut(L|K), then
consider α−1 β. If k ∈ K, then β(k) = k, and α(k) = k, so α−1 (k) = k. Therefore, α−1 β(k) =
k for all k ∈ K, and hence α−1 β ∈ Aut(L|K). It follows that Aut(L|K) is a subgroup of
Aut(L), and therefore a group.
If f (x) ∈ K[x] \ K and L is the splitting field of f (x) over K, then Aut(L|K) is also
called the Galois group of f (x).
Proof. We must show that any automorphism of a prime field P is the identity. If α ∈
Aut(L), then α(1) = 1, and so α(n ⋅ 1) = n ⋅ 1. Therefore, in P, α fixes all integer multiples
of the identity. However, every element of P can be written as a quotient (m ⋅ 1)/(n ⋅ 1) of integer
multiples of the identity. Since α is a field homomorphism and α fixes both the top
and the bottom, it follows that α will fix every element of this form, and hence fix each
element of P.
For splitting fields, the Galois group is a permutation group on the zeros of the
defining polynomial.
Theorem 15.2.4. Let f (x) ∈ K[x] and L the splitting field of f (x) over K. Suppose that
f (x) has zeros α1 , . . . , αn ∈ L.
(a) Then each ϕ ∈ Aut(L|K) is a permutation on the zeros. In particular, Aut(L|K) is
isomorphic to a subgroup of Sn and uniquely determined by the zeros of f (x).
(b) If f (x) is irreducible, then Aut(L|K) operates transitively on {α1 , . . . , αn }. Hence, for
each i, j, there is a ϕ ∈ Aut(L|K) such that ϕ(αi ) = αj .
(c) If f (x) = b(x − α1 ) ⋅ ⋅ ⋅ (x − αn ) with α1 , . . . , αn pairwise distinct and Aut(L|K) operates
transitively on α1 , . . . , αn , then f (x) is irreducible.
Hence, α(√2) = ±√2. Analogously, α(√3) = ±√3. From this it follows that |Aut(L)| ≤ 4.
Furthermore, α^2 = 1 for any α ∈ G.
Next, we show that the polynomial f (x) = x^2 − 3 is irreducible over K = ℚ(√2).
Assume that x^2 − 3 were reducible over K. Then √3 ∈ K. This implies that √3 = a/b + (c/d)√2
with a, b, c, d ∈ ℤ, b ≠ 0 ≠ d, and gcd(c, d) = 1. Then bd√3 = ad + bc√2; hence,
3b^2 d^2 = a^2 d^2 + 2b^2 c^2 + 2√2 abcd. Since √2 is irrational and bd ≠ 0, we must have ac = 0.
If c = 0, then √3 = a/b ∈ ℚ, a contradiction. If a = 0, then √3 = (c/d)√2, which implies
3d^2 = 2c^2. It follows from this that 3 | gcd(c, d) = 1, again a contradiction. Hence, f (x) =
x^2 − 3 is irreducible over K = ℚ(√2).
Since L is the splitting field of f (x) and f (x) is irreducible over K, then there exists
an automorphism α ∈ Aut(L) with α(√3) = −√3 and α|K = IK ; that is, α(√2) = √2.
Analogously, there is a β ∈ Aut(L) with β(√2) = −√2 and β(√3) = √3.
Clearly, α ≠ β, αβ = βα and α ≠ αβ ≠ β. It follows that Aut(L) = {1, α, β, αβ},
completing the proof.
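This Klein four-group structure can be checked concretely by modeling each automorphism by its pair of signs on (√2, √3); the encoding below is an implementation choice of this sketch, not notation from the text.

```python
from itertools import product

# Each automorphism acts on (√2, √3) by a pair of signs:
one, alpha, beta = (1, 1), (1, -1), (-1, 1)   # α flips √3, β flips √2
alphabeta = (-1, -1)
G = [one, alpha, beta, alphabeta]

def comp(f, g):          # composition multiplies the signs componentwise
    return (f[0] * g[0], f[1] * g[1])

# G is closed under composition, every element squares to the identity,
# and α and β commute — the Klein four-group.
assert all(comp(f, g) in G for f, g in product(G, G))
assert all(comp(f, f) == one for f in G)
assert comp(alpha, beta) == comp(beta, alpha) == alphabeta
```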
Theorem 15.3.1. For a subgroup G ⊂ Aut(K), the set Fix(K, G) is a subfield of K called the fix field
of G over K.
Proof. 1 ∈ K is in Fix(K, G), so Fix(K, G) is not empty. Let k1 , k2 ∈ Fix(K, G), and let
g ∈ G. Then g(k1 ± k2 ) = g(k1 ) ± g(k2 ) since g is an automorphism. Then g(k1 ) ± g(k2 ) =
k1 ±k2 , and it follows that k1 ±k2 ∈ Fix(K, G). In an analogous manner, k1 k2−1 ∈ Fix(K, G)
if k2 ≠ 0; therefore, Fix(K, G) is a subfield of K.
Using the concept of a fix field, we define a finite Galois extension.
Definition 15.3.2. L|K is a (finite) Galois extension if there exists a finite subgroup G ⊂
Aut(L) such that K = Fix(L, G).
Lemma 15.3.3. Let L = ℚ(√2, √3) and K = ℚ. Then L|K is a Galois extension.
Proof. Let G = Aut(L|K). From the example in the previous section, there are automor-
phisms α, β ∈ G with
We have

    α(a) = −2^{1/4} or α(a) = −i ⋅ 2^{1/4} ∉ L, since i ∉ L.
Theorem 15.4.1 (Fundamental theorem of Galois theory). Let L|K be a Galois exten-
sion with Galois group G = Aut(L|K). For each intermediate field E, let τ(E) be the
subgroup of G fixing E. Then the following hold:
(1) τ is a bijection between intermediate fields containing K and subgroups of G.
(2) L|K is a finite extension, and if M is an intermediate field, then

    |L : M| = |Aut(L|M)| and |M : K| = |Aut(L|K) : Aut(L|M)|.
(4) If M is an intermediate field and M|K is a Galois extension then we have the follow-
ing:
(a) α(M) = M for all α ∈ Aut(L|K),
(b) the map ϕ : Aut(L|K) → Aut(M|K) with ϕ(α) = α|M = β is an epimorphism,
(c) Aut(M|K) = Aut(L|K)/ Aut(L|M).
(5) The lattice of subfields of L containing K is the inverted lattice of subgroups of
Aut(L|K).
We will prove this main result via a series of theorems, and then combine them
all.
Theorem 15.4.2. Let G be a group, K a field, and α1 , . . . , αn pairwise distinct group ho-
momorphisms from G to K ⋆ , the multiplicative group of K. Then α1 , . . . , αn are linearly
independent elements of the K-vector space of all homomorphisms from G to K.
then we must show that all ki = 0. Since α1 ≠ αn , there exists an a ∈ G with α1 (a) ≠
αn (a). Let g ∈ G and apply the sum above to ag. We get
    ∑_{i=1}^{n} k_i (α_i(a))(α_i(g)) = 0.  (∗∗)
If we subtract equation (∗∗∗) from equation (∗∗), then the last term vanishes and
we have an equation in the n − 1 homomorphisms α_1, . . . , α_{n−1}. Since these are linearly
independent, we obtain
for the coefficient for α1 . Since α1 (a) ≠ αn (a), we must have k1 = 0. Now α2 , . . . , αn−1
are by assumption linearly independent, so k2 = ⋅ ⋅ ⋅ = kn = 0 also. Hence, all the
coefficients must be zero, and therefore the mappings are independent.
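For a concrete instance of the theorem, the n distinct homomorphisms g ↦ ζ^{kg} from ℤ/n into ℂ^* have linearly independent value vectors; numerically, the matrix of their values is the DFT matrix, and its full rank witnesses the independence (the choice n = 5 is just an example).

```python
import numpy as np

# The n characters χ_k(g) = ζ^{kg} of ℤ/n; their value matrix is the DFT
# matrix, and full rank means the characters are linearly independent.
n = 5
zeta = np.exp(2j * np.pi / n)
M = np.array([[zeta ** (k * g) for g in range(n)] for k in range(n)])
assert np.linalg.matrix_rank(M) == n
```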
Theorem 15.4.3. Let α_1, . . . , α_n be pairwise distinct monomorphisms from the field K into
the field K′. Let
Proof. Certainly L is a field. Assume that r = |K : L| < n, and let {a1 , . . . , ar } be a basis
of the L-vector space K. We consider the following system of linear equations with r
equations and n unknowns:
Let a ∈ K. Then

    a = ∑_{j=1}^{r} l_j a_j with l_j ∈ L.
    ∑_{i=1}^{n} x_i (α_i(a)) = ∑_{i=1}^{n} x_i ( ∑_{j=1}^{r} α_i(l_j) α_i(a_j) ) = ∑_{j=1}^{r} α_1(l_j) ∑_{i=1}^{n} x_i (α_i(a_j)) = 0
since α1 (lj ) = αi (lj ) for i = 2, . . . , n. This holds for all a ∈ K, and hence ∑ni=1 xi αi = 0,
contradicting Theorem 15.4.2. Therefore, our assumption that |K : L| < n must be false,
and hence |K : L| ≥ n.
Definition 15.4.4. Let K be a field and G a finite subgroup of Aut(K). The map
trG : K → K, given by trG(a) = ∑_{α∈G} α(a), is called the trace of G.
Proof. Let L = Fix(K, G), and suppose that |G| = n. From Theorem 15.4.3, we know that
|K : L| ≥ n. We must show that |K : L| ≤ n.
Suppose that G = {α1 , . . . , αn }. To prove the result, we show that if m > n and
a1 , . . . , am ∈ K, then a1 , . . . , am are linearly dependent.
We consider the system of equations
This m-tuple (x1 , . . . , xm ) is then also a nontrivial solution of the system of equations
considered above.
Then we have
Summation leads to

    0 = ∑_{j=1}^{m} a_j ∑_{i=1}^{n} α_i(x_j) = ∑_{j=1}^{m} trG(x_j) a_j.
Aut(K|Fix(K, G)) = G.
Assume then that there exists an α ∈ Aut(K| Fix(K, G)) with α ∉ G. Suppose, as in
the previous proof, |G| = n and G = {α1 , . . . , αn } with α1 = 1. Now
From Theorem 15.4.3, we have that |K : Fix(K, G)| ≥ n + 1. However, from Theo-
rem 15.4.6, |K : Fix(K, G)| = n, getting a contradiction.
Suppose that L|K is a Galois extension. We now establish that the map τ between
intermediate fields K ⊂ E ⊂ L and subgroups of Aut(L|K) is a bijection.
Theorem 15.4.8. Let L|K be a Galois extension. Then we have the following:
(1) Aut(L|K) is finite and
Fix(L, Aut(L|K)) = K.
Aut(L|Fix(L, H)) = H.
Proof. (1) If (L|K) is a Galois extension, there is a finite subgroup of Aut(L) with K =
Fix(K, G). From Theorem 15.4.7, we have G = Aut(L|K). In particular, Aut(L|K) is finite,
and K = Fix(L, Aut(L|K)).
(2) Let H ⊂ Aut(L|K). From part (1), H is finite, and then Aut(L|Fix(L, H)) = H from
Theorem 15.4.7.
Theorem 15.4.9. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a Galois extension.
(2) |L : K| = |Aut(L|K)| < ∞.
(3) |Aut(L|K)| < ∞, and K = Fix(L, Aut(L|K)).
Proof. (1) ⇒ (2): Now, from Theorem 15.4.8, |Aut(L|K)| < ∞, and Fix(L, Aut(L|K)) = K.
Therefore, from Theorem 15.4.6, |L : K| = |Aut(L|K)|.
(2) ⇒ (3): Let G = Aut(L|K). Then K ⊂ Fix(L, G) ⊂ L. From Theorem 15.4.6, we have
|L : Fix(L, G)| = |G| = |L : K|.
(3) ⇒ (1) follows directly from the definition completing the proof.
We now show that if L|K is a Galois extension, then L|M is also a Galois extension
for any intermediate field M.
Proof. Let G = Aut(L|K). Then, from Theorem 15.4.9, |G| < ∞, and furthermore, K =
Fix(L, G). Define H = Aut(L|M) and M′ = Fix(L, H). We must show that M = M′, for
then L|M is a Galois extension.
Since the elements of H fix M, we have M ⊂ M′. Let G = ⋃_{i=1}^{r} α_i H, a disjoint union
of the cosets of H. Let α_1 = 1, and define β_i = α_{i|M}. The β_1, . . . , β_r are pairwise distinct,
for if β_i = β_j, that is, α_{i|M} = α_{j|M}, then α_j^{−1} α_i ∈ H, so α_i and α_j are in the same coset; hence, i = j.
We claim that
M ∩ Fix(L, G) = M ∩ K = K.
since
Aut(L|α(M)) = α Aut(L|M)α−1 .
Proof. Now, β ∈ Aut(L|α(M)) if and only if β(α(a)) = α(a) for all a ∈ M. This occurs if
and only if α−1 βα(a) = a for all a ∈ M, which is true if and only if β ∈ α Aut(L|M)α−1 .
Proof. (1) ⇒ (2): Suppose that M|K is a Galois extension. Let Aut(M|K) = {α1 , . . . , αr }.
Consider the αi as monomorphisms from M into L. Let αr+1 : M → L be a monomor-
phism with αr+1|K = 1. Then
since M|K is a Galois extension. Therefore, from Theorem 15.4.3, we have that if the
α1 , . . . , αr , αr+1 are distinct, then
|M : K| ≥ r + 1 > r = |Aut(M|K)| = |M : K|,
We have K ⊂ Fix(M, H) from the definition of the fix field. Hence, we must show
that Fix(M, H) ⊂ K. Assume that there exists an α ∈ Aut(L|K) with α(a) ≠ a for some
a ∈ Fix(M, H). Recall that L|K is a Galois extension, and therefore Fix(L, Aut(L|K)) = K.
Define β = α|M . Then β ∈ H, since α(M) = M and our original assumption. Then
β(a) ≠ a, contradicting a ∈ Fix(M, H). Therefore, K = Fix(M, H), and M|K is a Galois
extension.
(2) ⇒ (3): Suppose that if α ∈ Aut(L|K), then α(M) = M. That Aut(L|M) is a normal
subgroup of Aut(L|K) follows from Lemma 15.4.12, since Aut(L|M) is the kernel of ϕ.
(3) ⇒ (2): Suppose that Aut(L|M) is a normal subgroup of Aut(L|K). Let α ∈
Aut(L|K), then from our assumption and Lemma 15.4.11, we get that
Aut(L|α(M)) = Aut(L|M).
Now L|M and L|α(M) are Galois extensions by Theorem 15.4.10. Therefore,
We now combine all of these results to give the proof of Theorem 15.4.1, the fun-
damental theorem of Galois theory.
Example 15.4.14. Let f (x) = x3 − 7 ∈ ℚ[x]. This has no zeros in ℚ, and since it is of
degree 3, it follows that it must be irreducible in ℚ[x].
Let ω = −1/2 + (√3/2) i ∈ ℂ. Then it is easy to show by computation that

    ω^2 = −1/2 − (√3/2) i and ω^3 = 1.
Therefore, the three zeros of f (x) in ℂ are a_1 = 7^{1/3}, a_2 = ω ⋅ 7^{1/3}, and a_3 = ω^2 ⋅ 7^{1/3}.
Hence, L = ℚ(a_1, a_2, a_3) is the splitting field of f (x). Since the minimal polynomial
of all three zeros over ℚ is the same f (x), it follows that
The question then arises as to whether these are all the intermediate fields. The
answer is yes, which we now prove.
Let G = Aut(L|ℚ) = Aut(L). (Aut(L|ℚ) = Aut(L), since ℚ is a prime field.) Now
G ≅ S3 . G acts transitively on {a1 , a2 , a3 }, since f is irreducible. Let δ : ℂ → ℂ be the
automorphism of ℂ taking each element to its complex conjugate; that is, δ(z) = z.
Then δ(f ) = f , and δ|L ∈ G (see Theorem 8.2.2). Since a1 ∈ ℝ, we get that δ|{a1 ,a2 ,a3 } =
(a2 , a3 ), the 2-cycle that maps a2 to a3 and a3 to a2 . Since G is transitive on {a1 , a2 , a3 },
there is a τ ∈ G with τ(a1 ) = a2 .
Case 1: τ(a3 ) = a3 . Then τ = (a1 , a2 ), and (a1 , a2 )(a2 , a3 ) = (a1 , a2 , a3 ) ∈ G.
Case 2: τ(a3 ) ≠ a3 . Then τ is a 3-cycle. In either case, G is generated by a transpo-
sition and a 3-cycle. Hence, G is all of S3 . Then L|ℚ is a Galois extension from Theo-
rem 15.4.9, since |G| = |L : ℚ|.
The subgroups of S3 are as follows:
Hence, the above lattice of fields is complete. L|ℚ, ℚ|ℚ, ℚ(ω)|ℚ and L|ℚ(ai ) are
Galois extensions, whereas ℚ(ai )|ℚ with i = 1, 2, 3 are not Galois extensions.
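The arithmetic in this example is easy to confirm numerically; the sketch below checks the three zeros of x^3 − 7 and the action of complex conjugation on them.

```python
import numpy as np

# The three zeros of f(x) = x^3 - 7: the real cube root and its ω-multiples,
# where ω = -1/2 + (√3/2)i is a primitive cube root of unity.
r = 7 ** (1 / 3)
omega = -0.5 + (np.sqrt(3) / 2) * 1j
zeros = [r, r * omega, r * omega**2]

# Each is a zero of f, and ω^3 = 1:
for z in zeros:
    assert abs(z**3 - 7) < 1e-9
assert abs(omega**3 - 1) < 1e-12

# Complex conjugation fixes the real zero a1 and swaps a2 and a3,
# giving the 2-cycle (a2 a3) used in the argument above:
assert abs(np.conj(zeros[0]) - zeros[0]) < 1e-12
assert abs(np.conj(zeros[1]) - zeros[2]) < 1e-9
```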
15.5 Exercises
1. Let K ⊂ M ⊂ L be a chain of fields, and let ϕ : Aut(L|K) → Aut(M|K) be defined by
ϕ(α) = α|M . Show that ϕ is an epimorphism with kernel ker(ϕ) = Aut(L|M).
2. Show that ℚ(5^{1/4})|ℚ(√5) and ℚ(√5)|ℚ are Galois extensions, and ℚ(5^{1/4})|ℚ is not a
Galois extension.
3. Let L|K be a field extension and u, v ∈ L algebraic over K with |K(u) : K| = m and
|K(v) : K| = n. If m and n are coprime, then |K(u, v) : K| = n ⋅ m.
4. Let p, q be prime numbers with p ≠ q. Let L = ℚ(√p, q^{1/3}). Show that L = ℚ(√p ⋅ q^{1/3}).
Determine a basis of L over ℚ and the minimal polynomial of √p ⋅ q^{1/3}.
5. Let K = ℚ(2^{1/n}) with n ≥ 2.
(i) Determine the number of ℚ-embeddings σ : K → ℝ. Show that for each such
embedding, we have σ(K) = K.
(ii) Determine Aut(K|ℚ).
6. Let α = √(5 + 2√5).
(i) Determine the minimal polynomial of α over ℚ.
(ii) Show that ℚ(α)|ℚ is a Galois extension.
(iii) Determine Aut(ℚ(α)|ℚ).
7. Let K be a field of prime characteristic p, and let f (x) = x^p − x + a ∈ K[x] be an
irreducible polynomial. Let L = K(v), where v is a zero of f (x).
(i) If α is a zero of f (x), then α + 1 is also a zero.
(ii) L|K is a Galois extension.
(iii) There is exactly one K-automorphism σ of L with σ(v) = v + 1.
(iv) The Galois group Aut(L|K) is cyclic with generating element σ.
In this chapter, we consider these questions and completely characterize Galois ex-
tensions. To do this, we must introduce separable extensions.
Definition 16.1.1. Let K be a field. Then a nonconstant polynomial f (x) ∈ K[x] is called
separable over K if each irreducible factor of f (x) has only simple zeros in its splitting
field.
Definition 16.1.2. Let L|K be a field extension and a ∈ L. Then a is separable over K
if a is a zero of a separable polynomial. The field extension L|K is a separable field
extension, or just separable if all a ∈ L are separable over K. In particular, a separable
extension is an algebraic extension.
and
Lemma 16.1.4. Let K be a field and f (x) an irreducible nonconstant polynomial in K[x].
Then f (x) is separable if and only if its formal derivative is nonzero.
https://doi.org/10.1515/9783110603996-016
Proof. Let L be the splitting field of f (x) over K. Let f (x) = (x − a)^r g(x), where (x − a)
does not divide g(x). Then
If g′(x) = 0, then f ′(x) = g(x) ≠ 0. Now suppose that g′(x) ≠ 0. Assume that f ′(x) = 0;
then, necessarily, (x − a) | g(x), giving a contradiction. Therefore, f ′(x) ≠ 0.
Conversely, suppose that f ′(x) ≠ 0. Assume that f (x) is not separable. Then both
f (x) and f ′(x) have a common zero a ∈ L. Let m_a(x) be the minimal polynomial of
a in K[x]. Then m_a(x) | f (x), and m_a(x) | f ′(x). Since f (x) is irreducible, the degree
of m_a(x) must equal the degree of f (x). But m_a(x) must also have degree at most that of
f ′(x), which is less than that of f (x), giving a contradiction. Therefore, f (x) must be
separable.
Example 16.1.5. Let K = GF(p) and L = K(t), the field of rational functions in t over K.
Consider the polynomial f (x) = xp − t ∈ L[x].
Now K[t]/tK[t] ≅ K. Since K is a field, this implies that tK[t] is a maximal ideal,
and hence a prime ideal in K[t] with prime element t ∈ K[t] (see Theorem 3.2.7). By
the Eisenstein criteria, f (x) is an irreducible polynomial in L[x] (see Theorem 4.4.8).
However, f'(x) = px^{p−1} = 0, since char(K) = p. Therefore, f(x) is not separable.
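The two facts behind this example can be checked directly with integer arithmetic: in characteristic p, the middle binomial coefficients C(p, k) vanish modulo p, so (x − b)^p = x^p − b^p, and the formal derivative of x^p − a has leading coefficient p ≡ 0. A minimal sketch (with p = 5; the variable names are ours):

```python
from math import comb

p = 5  # any prime

# Freshman's dream: all middle binomial coefficients C(p, k) vanish mod p,
# so (x - b)^p = x^p - b^p over a field of characteristic p.
middle_binomials = [comb(p, k) % p for k in range(1, p)]

# Formal derivative of f(x) = x^p - a in characteristic p:
# f'(x) = p * x^(p-1) = 0, so the criterion of Lemma 16.1.4 detects inseparability.
derivative_leading_coefficient = p % p
```

The same check works for any prime p, since p divides C(p, k) for 0 < k < p.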
Recall that a field K is called perfect if every nonconstant polynomial in K[x] is separable. Every field of characteristic 0 is perfect.

Proof. Suppose that K is a field with char(K) = 0, and let f(x) be a nonconstant polynomial in K[x]. Then f'(x) ≠ 0. If f(x) is irreducible, then f(x) is separable from Lemma 16.1.4. Since the irreducible factors of a nonconstant polynomial are themselves nonconstant, each nonconstant polynomial f(x) ∈ K[x] is, by definition, separable.
We remark that in the original motivation for Galois theory, the ground field was
the rationals ℚ. Since this has characteristic zero, it is perfect and all extensions are
separable. Hence, the question of separability did not arise until the question of ex-
tensions of fields of prime characteristic arose.
Theorem 16.2.3. Let K be a field with char(K) = p ≠ 0, and let f(x) ∈ K[x] be a nonconstant polynomial. Then f'(x) = 0 if and only if f(x) is a polynomial in x^p; that is, f(x) = g(x^p) for some g(x) ∈ K[x]. In particular, if f(x) is irreducible, then f(x) is not separable over K if and only if f(x) is a polynomial in x^p.
Proof. Let f(x) = ∑_{i=0}^{n} ai x^i. Then f'(x) = 0 if and only if p|i for all i with ai ≠ 0. But this is equivalent to

f(x) = a0 + ap x^p + ⋅ ⋅ ⋅ + a_{mp} x^{mp};

that is, f(x) is a polynomial in x^p. If f(x) is irreducible, then f(x) is not separable if and only if f'(x) = 0 from Lemma 16.1.4.
Theorem 16.2.4. Let K be a field with char(K) = p ≠ 0. Then the following are equiva-
lent:
(1) K is perfect.
(2) Each element in K has a p-th root in K.
(3) The Frobenius homomorphism x → xp is an automorphism of K.
Proof. First we show that (1) implies (2). Suppose that K is perfect, and a ∈ K. Then x^p − a is separable over K. Let g(x) ∈ K[x] be an irreducible factor of x^p − a. Let L be the splitting field of g(x) over K, and b a zero of g(x) in L. Then b^p = a. Furthermore, x^p − b^p = (x − b)^p ∈ L[x], since the characteristic of K is p. Hence, g(x) = (x − b)^s, and s must equal 1, since g(x) has only simple zeros. Therefore, b ∈ K, and b is a p-th root of a.
Now we show that (2) implies (3). Recall that the Frobenius homomorphism τ : x → x^p is injective (see Theorem 1.8.8). We must show that it is also surjective. Let a ∈ K, and let b be a p-th root of a so that a = b^p. Then τ(b) = b^p = a, and τ is surjective.
Finally, we show that (3) implies (1). Let τ : x → x^p be surjective. It follows that each a ∈ K has a p-th root in K. Now let f(x) ∈ K[x] be irreducible. Assume that f(x) is not separable. From Theorem 16.2.3, there is a g(x) ∈ K[x] with f(x) = g(x^p); that is,

f(x) = a0 + a1 x^p + ⋅ ⋅ ⋅ + am x^{mp}.

For each i, choose bi ∈ K with bi^p = ai. Then

f(x) = b0^p + b1^p x^p + ⋅ ⋅ ⋅ + bm^p x^{mp} = (b0 + b1 x + ⋅ ⋅ ⋅ + bm x^m)^p,

contradicting the irreducibility of f(x). Therefore, each irreducible f(x) ∈ K[x] is separable, and K is perfect.
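The equivalence of (2) and (3) can be made concrete in a small finite field. The following sketch (our own encoding, modeling GF(9) as GF(3)[x]/(x² + 1), which is a field since −1 is not a square mod 3) checks that the Frobenius map x → x^p is a bijection that fixes the prime field elementwise but is not the identity on GF(9):

```python
# GF(9) modeled as pairs (a, b) = a + b*i with i^2 = -1 over GF(3);
# x^2 + 1 is irreducible over GF(3), so this is a field of 9 elements.
P = 3

def mul(u, v):
    (a, b), (c, d) = u, v
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

elements = [(a, b) for a in range(P) for b in range(P)]

# The Frobenius map x -> x^p.
frobenius = {u: power(u, P) for u in elements}

# It is a bijection (hence an automorphism, being a ring homomorphism),
# and it fixes the prime field GF(3) = {(a, 0)} elementwise.
is_bijective = sorted(frobenius.values()) == sorted(elements)
fixes_prime_field = all(frobenius[(a, 0)] == (a, 0) for a in range(P))
# It is not the identity on all of GF(9): it sends i to i^3 = -i.
i_image = frobenius[(0, 1)]
```

Since GF(9) is finite and the Frobenius map is injective, surjectivity is automatic, matching the argument in the text.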
Theorem 16.2.5. Let K be a field with char(K) = p ≠ 0. Then each element of K has at most one p-th root in K.

Theorem 16.3.1. Each finite field is perfect.

Proof. Let K be a finite field of characteristic p > 0. Then the Frobenius map τ : x → x^p is surjective, since it is injective and K is finite. Therefore, K is perfect from Theorem 16.2.4.
Next we show that each finite field has order p^m for some prime p and natural number m > 0.
Lemma 16.3.2. Let K be a finite field. Then |K| = p^m for some prime p and natural number m > 0.
Proof. Let K be a finite field with characteristic p > 0. Then K can be considered as a vector space over its prime field GF(p), and hence of finite dimension since |K| < ∞. If α1, . . . , αm is a basis, then each f ∈ K can be written uniquely as f = c1 α1 + ⋅ ⋅ ⋅ + cm αm with each ci ∈ GF(p). Hence, there are p choices for each ci, and therefore |K| = p^m.
In Theorem 9.5.16, we proved that any finite subgroup of the multiplicative group
of a field is cyclic. If K is a finite field, then its multiplicative subgroup K ⋆ is finite, and
hence cyclic.
Lemma 16.3.3. Let K be a finite field. Then its multiplicative subgroup K ⋆ is cyclic.
If K is a finite field with order p^m, then its multiplicative group K⋆ has order p^m − 1. Then, from Lagrange's theorem, each nonzero element raised to the power p^m − 1 is the identity. This yields the following:
Lemma 16.3.4. Let K be a field of order p^m. Then each α ∈ K is a zero of the polynomial x^{p^m} − x. In particular, if α ≠ 0, then α is a zero of x^{p^m − 1} − 1.
Theorem 16.3.5. Let K1 and K2 be finite fields with |K1| = |K2| = p^m. Then K1 ≅ K2.

Proof. Let |K1| = |K2| = p^m. From the remarks above, K1 = GF(p)(α), where α has order p^m − 1 in K1⋆. Similarly, K2 = GF(p)(β), where β also has order p^m − 1 in K2⋆. Hence, GF(p)(α) ≅ GF(p)(β), and therefore K1 ≅ K2.
In Lemma 16.3.2, we saw that if K is a finite field, then |K| = p^n for some prime p and positive integer n. We now show that given a prime power p^n, there does exist a finite field of that order.
Theorem 16.3.6. Let p be a prime and n > 0 a natural number. Then there exists a field K of order p^n.
Proof. Given a prime p, consider the polynomial g(x) = x^{p^n} − x ∈ GF(p)[x]. Let K be the splitting field of this polynomial over GF(p). Since a finite field is perfect, K is a separable extension of GF(p), and hence all the zeros of g(x) are distinct in K.
Let F be the set of p^n distinct zeros of g(x) within K. Let a, b ∈ F. Since

(a ± b)^{p^n} = a^{p^n} ± b^{p^n} and (ab)^{p^n} = a^{p^n} b^{p^n},
it follows that F forms a subfield of K. However, F contains all the zeros of g(x), and since K is the smallest extension of GF(p) containing all the zeros of g(x), we must have K = F. Since F has p^n elements, it follows that the order of K is p^n.
Combining Theorems 16.3.5 and 16.3.6, we get the following summary result, indicating that up to isomorphism there exists one and only one finite field of order p^n.

Theorem 16.3.7. Let p be a prime and n > 0 a natural number. Then up to isomorphism, there exists a unique finite field of order p^n.
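The existence proof can be made concrete for p = 2, n = 3. The sketch below (our own encoding of GF(8) as GF(2)[x]/(x³ + x + 1), with elements stored as 3-bit integers) checks Lemma 16.3.4 and the cyclicity of the multiplicative group K⋆:

```python
# GF(8) modeled as GF(2)[x]/(x^3 + x + 1); an element is a 3-bit integer
# whose bits are the coefficients of 1, x, x^2.
MOD = 0b1011  # x^3 + x + 1, irreducible over GF(2)

def gf8_mul(a, b):
    result = 0
    for bit in range(3):
        if (b >> bit) & 1:
            result ^= a << bit
    # reduce modulo x^3 + x + 1
    for shift in (1, 0):
        if result & (1 << (3 + shift)):
            result ^= MOD << shift
    return result

def gf8_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf8_mul(result, a)
    return result

elements = list(range(8))

# Lemma 16.3.4: every element of a field of order p^m = 8 satisfies a^8 = a.
all_satisfy = all(gf8_pow(a, 8) == a for a in elements)

# The multiplicative group K* has order 7 and is cyclic; here the class of x
# (encoded as 0b010) generates it.
powers_of_x = {gf8_pow(0b010, n) for n in range(1, 8)}
```

Since |K⋆| = 7 is prime, any element other than 1 generates K⋆, which the computed set of powers confirms.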
Lemma 16.4.2. Let L|K be a finite extension with L ⊂ L̄, where L̄ is algebraically closed. Write L = K(a1, . . . , an), where the ai are algebraic over K. Let pi be the number of pairwise distinct zeros in L̄ of the minimal polynomial m_{ai} of ai over K(a1, . . . , a_{i−1}). Then there are exactly p1 ⋅ ⋅ ⋅ pn monomorphisms β : L → L̄ with β|K = 1K.
Proof. From Theorem 16.4.1, there are exactly p1 monomorphisms α : K(a1) → L̄ with α|K equal to the identity on K. Each such α has exactly p2 extensions to a monomorphism K(a1, a2) → L̄. We now continue in this manner.
Theorem 16.4.3. Let L|K be a field extension, M an intermediate field of L|K, and a ∈ L separable over K. Then a is also separable over M.

Proof. This follows directly from the fact that the minimal polynomial of a over M divides the minimal polynomial of a over K.
Theorem 16.4.4. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is finite and separable.
(2) There are finitely many separable elements a1, . . . , an over K with L = K(a1, . . . , an).
(3) L|K is finite, and if L ⊂ L̄ with L̄ algebraically closed, then there are exactly [L : K] monomorphisms α : L → L̄ with α|K = 1K.
Proof. That (1) implies (2) follows directly from the definitions. We show then that (2)
implies (3). Let L = K(a1 , . . . , an ), where a1 , . . . , an are separable elements over K. The
extension L|K is finite (see Theorem 5.3.4). Let pi be the number of pairwise distinct zeros in L̄ of the minimal polynomial m_{ai}(x) = fi(x) of ai over K(a1, . . . , a_{i−1}). Then pi ≤ deg(fi) = [K(a1, . . . , ai) : K(a1, . . . , a_{i−1})]. Hence, pi = deg(fi(x)), since ai is separable over K(a1, . . . , a_{i−1}) from Theorem 16.4.3. Therefore,

[L : K] = p1 ⋅ ⋅ ⋅ pn,

and from Lemma 16.4.2, there are exactly [L : K] monomorphisms α : L → L̄ with α|K = 1K.
Finally, we show that (3) implies (1). Suppose that the conditions in (3) hold. Since L|K is finite, there are finitely many a1, . . . , an ∈ L with L = K(a1, . . . , an). Let pi and fi(x) be as in the proof above, and hence pi ≤ deg(fi(x)). By assumption, we have

[L : K] = p1 ⋅ ⋅ ⋅ pn.
Assume that some b ∈ K(a1) is not separable over K. Then its minimal polynomial is a polynomial in x^p, say

m_b(x) = ∑_{i=0}^{k} bi x^{pi}, bi ∈ K, bk = 1,

and substituting b gives

b0 + b1 b^p + ⋅ ⋅ ⋅ + bk b^{pk} = 0.
Therefore, the elements 1, b^p, . . . , b^{pk} are linearly dependent over K. Since K(a1) = E = K(E^p), we find that 1, b, . . . , b^k are linearly dependent also, since if they were independent, then their p-th powers would also be independent. However, this is not possible, since k < deg(m_b(x)). Therefore, m_b(x) is separable over K, and hence K(a1)|K is separable.
Altogether L|K is then finite and separable, completing the proof.
Theorem 16.4.5. Let L|K be a field extension, and let M be an intermediate field. Then
the following are equivalent:
(1) L|K is separable.
(2) L|M and M|K are separable.
Proof. We first show that (1) ⇒ (2): If L|K is separable then L|M is separable by Theo-
rem 16.4.3, and M|K is separable.
Now suppose (2), and let M|K and L|M be separable. Let a ∈ L, let m_a(x) = x^n + b_{n−1} x^{n−1} + ⋅ ⋅ ⋅ + b0 ∈ M[x] be the minimal polynomial of a over M, and let

M0 = K(b0, . . . , b_{n−1}).
Theorem 16.4.6. Let L|K be a field extension, and let S ⊂ L such that all elements of S
are separable over K. Then K(S)|K is separable, and K[S] = K(S).
Proof. Let W be the set of finite subsets of S. Let T ∈ W. From Theorem 16.4.4, we
obtain that K(T)|K is separable. Since each element of K(S) is contained in some K(T),
we have that K(S)|K is separable. Since all elements of S are algebraic, we have that
K[S] = K(S).
Theorem 16.4.7. Let L|K be a field extension. Then there exists in L a uniquely deter-
mined maximal field M with the property that M|K is separable. If a ∈ L is separable
over M, then a ∈ M. M is called the separable hull of K in L.
Proof. Let S be the set of all elements in L that are separable over K. Define M =
K(S). Then M|K is separable from Theorem 16.4.6. Now, let a ∈ L be separable over M.
Then M(a)|M is separable from Theorem 16.4.4. Furthermore, M(a)|K is separable from
Theorem 16.4.5. It follows that a ∈ M.
Theorem 16.5.1. Let L|K be a field extension. Then the following are equivalent:
(1) L|K is a Galois extension.
(2) L is the splitting field of a separable polynomial in K[x].
(3) L|K is finite, normal, and separable.
Therefore, we may characterize Galois extensions of a field K as finite, normal, and sep-
arable extensions of K.
Proof. Recall from Theorem 8.2.2 that an extension L|K is normal if the following hold:
(1) L|K is algebraic, and
(2) each irreducible polynomial f (x) ∈ K[x] that has a zero in L splits into linear fac-
tors in L[x].
Now suppose that L|K is a Galois extension. Then L|K is finite from Theorem 15.4.1.
Let L = K(b1 , . . . , bm ) and mbi (x) = fi (x) be the minimal polynomial of bi over K. Let
ai1 , . . . , ain be the pairwise distinct elements from
Hi = {α(bi ) : α ∈ Aut(L|K)}.
Define

gi(x) = (x − ai1) ⋅ ⋅ ⋅ (x − ain).
If α ∈ Aut(L|K), then α(gi) = gi, since α permutes the elements of Hi. This means that the coefficients of gi(x) are in Fix(L, Aut(L|K)) = K; hence, gi(x) ∈ K[x]. Because bi is one of the aij, we have fi(x)|gi(x). The group Aut(L|K) acts transitively on {ai1, . . . , ain} by the choice of ai1, . . . , ain. Therefore, each gi(x) is irreducible (see Theorem 15.2.4).
It follows that fi (x) = gi (x). Now, fi (x) has only simple zeros in L; that is, no zero has
multiplicity ≥ 2, and hence fi (x) splits over L. Therefore, L is a splitting field of f (x) =
f1 (x) ⋅ ⋅ ⋅ fm (x), and f (x) is separable by definition. Hence, (1) implies (2).
Now suppose that L is a splitting field of the separable polynomial f (x) ∈ K[x], and
L|K is finite. From Theorem 16.4.4, we get that L|K is separable, since L = K(a1 , . . . , an )
with each ai separable over K. Furthermore, L|K is normal, since L is a splitting field over K (see Theorem 8.2.2). Hence, (2) implies (3).
Finally, suppose that L|K is finite, normal, and separable. Since L|K is finite and separable, from Theorem 16.4.4 there exist exactly [L : K] monomorphisms α : L → L̄, where L̄ is the algebraic closure of L, with α|K the identity on K. Since L|K is normal, these monomorphisms are already automorphisms of L from Theorem 8.2.2. Hence, [L : K] ≤ |Aut(L|K)|. Furthermore, [L : K] ≥ |Aut(L|K)| from Theorem 15.4.3. Combining these, we have [L : K] = |Aut(L|K)|, and hence L|K is a Galois extension from Theorem 15.4.9.
Therefore, (3) implies (1), completing the proof.
Recall that any field of characteristic 0 is perfect, and therefore any finite exten-
sion is separable. Applying this to ℚ implies that the Galois extensions of the rationals
are precisely the splitting fields of polynomials.
Corollary 16.5.2. The Galois extensions of the rationals are precisely the splitting fields
of polynomials in ℚ[x].
Theorem 16.5.3. Let L|K be a finite, separable field extension. Then there exists an ex-
tension field M of L such that M|K is a Galois extension.
Proof. Let L = K(a1 , . . . , an ) with all ai separable over K. Let fi (x) be the minimal poly-
nomial of ai over K. Then each fi (x), and hence also f (x) = f1 (x) ⋅ ⋅ ⋅ fn (x), is separable
over K. Let M be the splitting field of f (x) over K. Then M|K is a Galois extension from
Theorem 16.5.1.
Example 16.5.4. Let K = ℚ be the rationals, and let f(x) = x^4 − 2 ∈ ℚ[x]. From Chapter 8, we know that L = ℚ(⁴√2, i) is a splitting field of f(x). By the Eisenstein criteria, f(x) is irreducible, and [L : ℚ] = 8. Moreover,

⁴√2, i⁴√2, −⁴√2, −i⁴√2

are the zeros of f(x). Since the rationals are perfect, f(x) is separable. L|K is a Galois extension by Theorem 16.5.1. From the calculations in Chapter 15, we have

|Aut(L|K)| = |Aut(L)| = [L : K] = 8.
Let G = Aut(L|ℚ), regarded as a group of permutations of the four zeros of f(x).
If we let τ = (2, 4) and σ = (1, 2, 3, 4), we get the isomorphism between G and D4 . From
Theorem 14.1.1, we know that D4 = ⟨r, f ; r 4 = f 2 = (rf )2 = 1⟩.
This can also be seen in the following manner. Let

a1 = ⁴√2, a2 = i⁴√2, a3 = −⁴√2, a4 = −i⁴√2.
Let α ∈ G. Then α is determined if we know α(⁴√2) and α(i). The possibilities for α(i) are i or −i; that is, the zeros of x^2 + 1. The possibilities for α(⁴√2) are the 4 zeros of f(x) = x^4 − 2. Hence, we have 8 possibilities for α. These are exactly the elements of the group G. We have δ, τ ∈ G with
δ(⁴√2) = i⁴√2, δ(i) = i

and

τ(⁴√2) = ⁴√2, τ(i) = −i.
It is straightforward to show that δ has order 4, τ has order 2, and δτ has order 2. These
define a group of order 8 isomorphic to D4 , and since G has 8 elements, this must be
all of G.
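These order computations can be verified mechanically. In the following sketch (the encoding of automorphisms as pairs is ours), an automorphism α is stored as (k, e), meaning α(⁴√2) = i^k · ⁴√2 and α(i) = i^e:

```python
# Each automorphism of L = Q(2^(1/4), i) is encoded as a pair (k, e):
# it sends 2^(1/4) to i^k * 2^(1/4) and i to i^e, with k mod 4 and e in {1, 3}
# (e = 3 encodes i -> i^3 = -i).
def compose(s1, s2):
    # apply s2 first, then s1:
    # s1(s2(2^(1/4))) = s1(i)^k2 * s1(2^(1/4)) = i^(e1*k2 + k1) * 2^(1/4)
    k1, e1 = s1
    k2, e2 = s2
    return ((e1 * k2 + k1) % 4, (e1 * e2) % 4)

identity = (0, 1)
delta = (1, 1)   # delta(2^(1/4)) = i * 2^(1/4), delta(i) = i
tau = (0, 3)     # tau(2^(1/4)) = 2^(1/4),      tau(i) = -i

def order(s):
    n, t = 1, s
    while t != identity:
        t = compose(t, s)
        n += 1
    return n

# Generate the subgroup generated by delta and tau by closure under composition.
G = {identity}
while True:
    new = {compose(a, b) for a in G | {delta, tau} for b in G | {delta, tau}}
    if new <= G:
        break
    G |= new
```

The closure computation confirms that δ and τ generate a group of 8 automorphisms satisfying the D4 relations r^4 = f^2 = (rf)^2 = 1.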
We now look at the subgroup lattice of G, and then the corresponding field lattice.
Let δ and τ be as above. Then G has 5 subgroups of order 2, namely ⟨δ²⟩, ⟨τ⟩, ⟨τδ⟩, ⟨τδ²⟩, and ⟨τδ³⟩, and 3 subgroups of order 4, namely ⟨δ⟩, {1, δ², τ, τδ²}, and {1, δ², τδ, τδ³}.
From this we construct the lattice of fields and intermediate fields. Since G has 10 subgroups, from the fundamental theorem of Galois theory there are 10 intermediate fields in L|ℚ, namely, the fixed fields Fix(L, H), where H is a subgroup of G. In the identification, the extension field corresponding to the whole
group G is the ground field ℚ (recall that the lattice of fields is the inverted lattice of
the subgroups), whereas the extension field corresponding to the identity is the whole
field L. We now consider the other proper subgroups. Let δ, τ be as before.
(1) Consider M1 = Fix(L, {1, τ}). Now {1, τ} fixes ℚ(⁴√2) elementwise so that ℚ(⁴√2) ⊂ M1. Furthermore, [L : M1] = |{1, τ}| = 2, and also [L : ℚ(⁴√2)] = 2. Therefore, M1 = ℚ(⁴√2).
(2) Consider M2 = Fix(L, {1, τδ}). We have the following:

τδ(⁴√2) = τ(i⁴√2) = −i⁴√2,
τδ(i⁴√2) = τ(−⁴√2) = −⁴√2,
τδ(−⁴√2) = τ(−i⁴√2) = i⁴√2,
τδ(−i⁴√2) = τ(⁴√2) = ⁴√2.

Hence, τδ fixes ⁴√2 − i⁴√2 = (1 − i)⁴√2, and one can check that M2 = ℚ((1 − i)⁴√2).
(3) Consider M3 = Fix(L, {1, τδ²}). The map τδ² interchanges a1 and a3 and fixes a2 and a4. Therefore, M3 = ℚ(i⁴√2).
In an analogous manner, we can then consider the other 5 subgroups and corresponding intermediate fields, and from this obtain the complete lattice of fields and subfields.
Theorem 16.6.1 (Primitive element theorem). Let L = K(γ1 , . . . , γn ), and suppose that
each γi is separable over K. Then there exists a γ0 ∈ L such that L = K(γ0 ). The element
γ0 is called a primitive element.
Proof. Suppose first that K is a finite field. Then L is also a finite field, and therefore
L⋆ = ⟨γ0 ⟩ is cyclic. Therefore, L = K(γ0 ), and the theorem is proved if K is a finite field.
Now suppose that K is infinite. Inductively, it suffices to prove the theorem for
n = 2. Hence, let α, β ∈ L be separable over K. We must show that there exists a γ ∈ L
with K(α, β) = K(γ).
Let L̄ be the splitting field of the polynomial mα(x)mβ(x) over L, where mα(x), mβ(x) are, respectively, the minimal polynomials of α, β over K. In L̄[x], we have the factorizations

mα(x) = (x − α1) ⋅ ⋅ ⋅ (x − αr) and mβ(x) = (x − β1) ⋅ ⋅ ⋅ (x − βt),

where α1 = α and β1 = β, and where the βj are pairwise distinct, since β is separable over K. For each i and each j ≥ 2, the equation

α1 + zβ1 = αi + zβj

has at most one solution z. Since K is infinite, we can therefore choose a c ∈ K with

α1 + cβ1 ≠ αi + cβj for all i and all j ≥ 2.

Set

γ = α + cβ = α1 + cβ1.
We claim that K(α, β) = K(γ). It suffices to show that β ∈ K(γ), for then α = γ − cβ ∈
K(γ). This implies that K(α, β) ⊂ K(γ), and since γ ∈ K(α, β), it follows that K(α, β) =
K(γ).
To show that β ∈ K(γ), we first define f (x) = mα (γ − cx), and let d(x) = gcd(f (x),
mβ (x)). We may assume that d(x) is monic. We show that d(x) = x − β. Then β ∈ K(γ),
since d(x) ∈ K(γ)[x].
Assume first that d(x) = 1. Then gcd(f(x), mβ(x)) = 1, and f(x) and mβ(x) are also relatively prime in L̄[x]. This is a contradiction, since f(x) and mβ(x) have the common zero β, and hence the common divisor x − β.
Therefore, d(x) ≠ 1, so deg(d(x)) ≥ 1.
The polynomial d(x) is a divisor of mβ(x), and hence d(x) splits in L̄[x] into linear factors of the form x − βj, 1 ≤ j ≤ t. The proof is completed if we can show that no linear
factor of the form x − βj with 2 ≤ j ≤ t is a divisor of f (x). That is, we must show that
f (βj ) ≠ 0 in L if j ≥ 2.
Now f(βj) = mα(γ − cβj) = mα(α1 + cβ1 − cβj). Suppose that f(βj) = 0 for some j ≥ 2. This would imply that α1 + cβ1 − cβj = αi for some i; that is, α1 + cβ1 = αi + cβj with j ≥ 2.
This contradicts the choice of the value c. Therefore, f (βj ) ≠ 0 if j ≥ 2, completing the
proof.
Corollary 16.6.2. Let L|K be a finite extension with K a perfect field. Then L = K(γ) for
some γ ∈ L.
Corollary 16.6.3. Let L|K be a finite extension with K a perfect field. Then there exist
only finitely many intermediate fields E with K ⊂ E ⊂ L.
Proof. Since K is a perfect field, we have L = K(γ) for some γ ∈ L. Let mγ(x) ∈ K[x] be the minimal polynomial of γ over K, and let L̄ be the splitting field of mγ(x) over K. Then L̄|K is a Galois extension; hence, there are only finitely many intermediate fields between K and L̄. Therefore, there are also only finitely many fields between K and L.
Suppose that L|K is algebraic. Then, in general, L = K(γ) for some γ ∈ L if and only
if there exist only finitely many intermediate fields E with K ⊂ E ⊂ L.
This condition on intermediate fields implies that L|K is finite if L|K is algebraic. Hence, we have proved this result in the case that K is perfect. The general case is discussed in the book of S. Lang [12].
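As a worked instance of the primitive element theorem, γ = √2 + √3 is a primitive element for ℚ(√2, √3): since γ³ = 11√2 + 9√3, we get √2 = (γ³ − 9γ)/2 and √3 = (11γ − γ³)/2, so both generators lie in ℚ(γ). The following sketch (our own exact-arithmetic encoding over the basis 1, √2, √3, √6) verifies this:

```python
from fractions import Fraction

# Elements of Q(sqrt2, sqrt3) as coordinate vectors over the basis
# (1, sqrt2, sqrt3, sqrt6), with exact rational coordinates.
def mul(u, v):
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    # use sqrt2^2 = 2, sqrt3^2 = 3, sqrt2*sqrt3 = sqrt6,
    # sqrt2*sqrt6 = 2*sqrt3, sqrt3*sqrt6 = 3*sqrt2, sqrt6^2 = 6
    return (
        a0*b0 + 2*a1*b1 + 3*a2*b2 + 6*a3*b3,
        a0*b1 + a1*b0 + 3*(a2*b3 + a3*b2),
        a0*b2 + a2*b0 + 2*(a1*b3 + a3*b1),
        a0*b3 + a3*b0 + a1*b2 + a2*b1,
    )

gamma = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))  # sqrt2 + sqrt3
gamma2 = mul(gamma, gamma)    # = 5 + 2*sqrt6
gamma3 = mul(gamma2, gamma)   # = 11*sqrt2 + 9*sqrt3

# sqrt2 = (gamma^3 - 9*gamma)/2 and sqrt3 = (11*gamma - gamma^3)/2,
# so both lie in Q(gamma).
sqrt2 = tuple((g3 - 9*g) / 2 for g3, g in zip(gamma3, gamma))
sqrt3 = tuple((11*g - g3) / 2 for g3, g in zip(gamma3, gamma))
```

The exact coordinates confirm ℚ(√2, √3) = ℚ(γ), matching the linear-combination argument of Exercise 4 below.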
16.7 Exercises
1. Let f(x) = x^4 − 8x^3 + 24x^2 − 32x + 14 ∈ ℚ[x], and let v ∈ ℂ be a zero of f. Let α := v(4 − v), and let K be a splitting field of f over ℚ. Show the following:
(i) f is irreducible over ℚ, and f (x) = f (4 − x).
(ii) There is exactly one automorphism σ of ℚ(v) with σ(v) = 4 − v.
(iii) L := ℚ(α) is the fixed field of σ, and |L : ℚ| = 2.
(iv) Determine the minimal polynomial of α over ℚ and determine α.
(v) |ℚ(v) : L| = 2, and determine the minimal polynomial of v over L; also deter-
mine v and all other zeros of f (x).
(vi) Determine the degree of |K : ℚ|.
(vii) Determine the structure of Aut(K|ℚ).
2. Let L|K be a field extension and f ∈ K[x] a separable polynomial. Let Z be a split-
ting field of f over L and Z0 a splitting field of f over K. Show that Aut(Z|L) is
isomorphic to a subgroup of Aut(Z0 |K).
3. Let L|K be a field extension and v ∈ L. Show that K(v + c) = K(v) for each element c ∈ K, and that K(cv) = K(v) for each c ∈ K with c ≠ 0.
4. Let v = √2 + √3, and let K = ℚ(v). Show that √2 and √3 can be expressed as ℚ-linear combinations of 1, v, v², v³. Conclude that K = ℚ(√2, √3).
5. Let L be the splitting field of x^3 − 5 over ℚ in ℂ. Determine a primitive element t of L over ℚ.
A field extension L|K is called an extension by radicals if there is a chain of fields

K = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lm = L

in which each Li arises from Li−1 by adjoining a radical; that is, Li = Li−1(βi) with βi^{ni} ∈ Li−1 for some ni ∈ ℕ. A polynomial equation f(x) = 0 over K is then solvable by radicals if a splitting field of f(x) over K is contained in an extension of K by radicals.
In proving the insolvability of the quintic polynomial, we will look for necessary and sufficient conditions for the solvability of polynomial equations. Our main result will be that if f(x) ∈ K[x], then f(x) = 0 is solvable by radicals over K if and only if the Galois group of the splitting field of f(x) over K is a solvable group (see Chapter 11).

https://doi.org/10.1515/9783110603996-017
In the remainder of this section, we assume that all fields have characteristic zero.
The next theorem gives a characterization of simple extensions by radicals:
Theorem 17.2.2. Let L|K be a field extension and n ∈ ℕ. Assume that the polynomial x^n − 1 splits into linear factors in K[x] so that K contains all the n-th roots of unity. Then

L = K(ⁿ√a)

for some a ∈ K if and only if L|K is a Galois extension with Aut(L|K) ≅ ℤ/mℤ for some m ∈ ℕ with m|n.
Proof. The n-th roots of unity, that is, the zeros of the polynomial x^n − 1 ∈ K[x], form a cyclic multiplicative group ℱ ⊂ K⋆ of order n, since each finite subgroup of the multiplicative group K⋆ of K is cyclic, and |ℱ| = n. We call an n-th root of unity ω primitive if ℱ = ⟨ω⟩.
Now let L = K(ⁿ√a) with a ∈ K; that is, L = K(β) with β^n = a ∈ K. Let ω be a primitive n-th root of unity. With this β, the elements ωβ, ω²β, . . . , ωⁿβ = β are zeros of x^n − a. Hence, the polynomial x^n − a splits into linear factors over L; hence, L = K(β) is a splitting field of x^n − a over K. It follows that L|K is a Galois extension.
Let σ ∈ Aut(L|K). Then σ(β) = ω^ν β for some 0 < ν ≤ n. The element ω^ν is uniquely determined by σ, and we may write ω^ν = ωσ.
Consider the map ϕ : Aut(L|K) → ℱ given by σ → ωσ, where ωσ is defined as above by σ(β) = ωσ β. If τ, σ ∈ Aut(L|K), then

(στ)(β) = σ(τ(β)) = σ(ωτ β) = ωτ σ(β) = ωσ ωτ β,

because ωτ ∈ K.
Therefore, ϕ(στ) = ϕ(σ)ϕ(τ); hence, ϕ is a homomorphism. The kernel ker(ϕ)
contains all the K-automorphisms σ of L for which σ(β) = β. However, since L = K(β), it follows that ker(ϕ) contains only the identity. The Galois group Aut(L|K) is, therefore,
isomorphic to a subgroup of ℱ . Since ℱ is cyclic of order n, we have that Aut(L|K) is
cyclic of order m for some m|n, completing one way in the theorem.
Conversely, first suppose that L|K is a Galois extension with Aut(L|K) ≅ ℤn, a cyclic group of order n. Let σ be a generator of Aut(L|K), and let ω ∈ K be a primitive n-th root of unity. Since the distinct automorphisms 1, σ, . . . , σ^{n−1} are linearly independent over L (Dedekind's lemma), there exists an η ∈ L with

ω ⋆ η = ∑_{ν=1}^{n} ω^ν σ^ν(η) ≠ 0.
Then

σ(ω ⋆ η) = ∑_{ν=1}^{n} ω^ν σ^{ν+1}(η) = ω^{−1} ∑_{ν=1}^{n} ω^{ν+1} σ^{ν+1}(η) = ω^{−1} ∑_{ν=2}^{n+1} ω^ν σ^ν(η) = ω^{−1} ∑_{ν=1}^{n} ω^ν σ^ν(η) = ω^{−1}(ω ⋆ η).

Hence, for β = ω ⋆ η, we have σ(β) = ω^{−1}β, so that σ(β^n) = β^n, and a = β^n lies in the fixed field of Aut(L|K), which is K. One can then show that L = K(β) = K(ⁿ√a).
then if ω is a primitive n-th root of unity, define K̃ = K(ω) and L̃ = K̃(ⁿ√a). We then have a chain of fields

K ⊂ E ⊂ L, [L : E] ≥ 2,

and L = E(ⁿ√a) for some a ∈ E, n ∈ ℕ. Now [E : K] < m. Therefore, by the inductive hypothesis, there exists a Galois extension by radicals Ẽ of K with E ⊂ Ẽ. Let G = Aut(Ẽ|K), and let L̃ be the splitting field of the polynomial f(x) = ma(x^n) ∈ K[x] over Ẽ, where ma(x) is the minimal polynomial of a over K. We show that L̃ has the desired properties.
Now ⁿ√a ∈ L is a zero of the polynomial f(x), and E ⊂ Ẽ ⊂ L̃. Therefore, L̃ contains an E-isomorphic image of L = E(ⁿ√a); hence, we may consider L̃ as an extension of L.
Since Ẽ is a Galois extension of K, the polynomial f (x) may be factored as
f(x) = (x^n − α1) ⋅ ⋅ ⋅ (x^n − αs)
with αi ∈ Ẽ for i = 1, . . . , s. All zeros of f(x) in L̃ are radicals over Ẽ. Therefore, L̃ is an extension by radicals of Ẽ. Since Ẽ is also an extension by radicals of K, we obtain that L̃ is an extension by radicals of K.
Since Ẽ is a Galois extension of K, we have that Ẽ is a splitting field of a polynomial g(x) ∈ K[x]. Furthermore, L̃ is a splitting field of f(x) ∈ K[x] over Ẽ. Altogether then, we have that L̃ is a splitting field of f(x)g(x) ∈ K[x] over K. Therefore, L̃ is a Galois extension of K, completing the proof.
Lemma 17.2.4. Let L|K be a Galois extension, and suppose that there is a chain of fields

K = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lr = L

such that each Lj|Lj−1, 1 ≤ j ≤ r, is a Galois extension with abelian Galois group Aut(Lj|Lj−1). Then G = Aut(L|K) is solvable.

Proof. We prove the lemma by induction on r. If r = 0, then G = {1}, and there is nothing to prove. Suppose then that r ≥ 1, and assume that the lemma holds for all such chains of fields of length r' < r. Since L1|K is a Galois extension, Aut(L|L1) is a normal subgroup of G by the fundamental theorem of Galois theory. Moreover, G/Aut(L|L1) ≅ Aut(L1|K), which is abelian by hypothesis, and Aut(L|L1) is solvable by the inductive hypothesis applied to the chain L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lr = L. Therefore, G is solvable.
Lemma 17.2.5. Let L|K be a field extension. Let K̃ and L̃ be the splitting fields of the polynomial x^n − 1 ∈ K[x] over K and L, respectively. Since K ⊂ L, we have K̃ ⊂ L̃. Then the following hold:
(1) If σ ∈ Aut(L̃|L), then σ|K̃ ∈ Aut(K̃|K), and the map

Aut(L̃|L) → Aut(K̃|K), given by σ → σ|K̃,

is an injective homomorphism.
(2) Suppose that in addition L|K is a Galois extension. Then L̃|K̃ is also a Galois extension. If furthermore σ ∈ Aut(L̃|K̃), then σ|L ∈ Aut(L|K), and the map

Aut(L̃|K̃) → Aut(L|K), given by σ → σ|L,

is an injective homomorphism.
Proof. (1) Let ω be a primitive n-th root of unity. Then K̃ = K(ω), and L̃ = L(ω). Each σ ∈ Aut(L̃|L) maps ω onto a primitive n-th root of unity and fixes K ⊂ L elementwise. Hence, from σ ∈ Aut(L̃|L), we get that σ|K̃ ∈ Aut(K̃|K). Certainly, the map σ → σ|K̃ defines a homomorphism Aut(L̃|L) → Aut(K̃|K). Let σ|K̃ = 1 with σ ∈ Aut(L̃|L). Then σ(ω) = ω; therefore, we have already that σ = 1, since L̃ = L(ω).
(2) If L is the splitting field of a polynomial g(x) over K, then L̃ is the splitting field of g(x)(x^n − 1) over K, and also over K̃. Hence, L̃|K̃ is a Galois extension. Therefore, K ⊂ L ⊂ L̃, and L|K, L̃|L, and L̃|K are all Galois extensions. Therefore, from the fundamental theorem of Galois theory, since L|K is normal, each σ ∈ Aut(L̃|K̃) ⊂ Aut(L̃|K) restricts to an automorphism σ|L ∈ Aut(L|K). If σ|L = 1, then σ fixes L elementwise and also fixes ω ∈ K̃; hence, σ = 1 on L̃ = L(ω). Therefore, the map σ → σ|L is an injective homomorphism.
Definition 17.3.1. The splitting field of the polynomial x^n − 1 ∈ ℚ[x] with n ≥ 2 is called the n-th cyclotomic field, denoted by kn.
Lemma 17.3.2. Let K be a field, and let K̃ be the splitting field of x^n − 1 over K. Then Aut(K̃|K) is abelian.
Proof. We apply Lemma 17.2.5 to the field extension K|ℚ. This can be done since the characteristic of K is zero, and ℚ is the prime field of K. It follows that Aut(K̃|K) is isomorphic to a subgroup of Aut(ℚ̃|ℚ) from part (1) of Lemma 17.2.5. But ℚ̃ = kn, and hence Aut(ℚ̃|ℚ) is abelian. Therefore, Aut(K̃|K) is abelian.
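The abelian property here reflects the standard fact that Aut(kn|ℚ) embeds in the group of units (ℤ/nℤ)⋆: an automorphism is determined by ω → ω^k with gcd(k, n) = 1, and composition corresponds to multiplication of exponents mod n. A sketch for n = 12 (the variable names are ours):

```python
from math import gcd

# An automorphism of the n-th cyclotomic field is determined by
# omega -> omega^k with gcd(k, n) = 1; composing two such maps multiplies
# the exponents mod n, so Aut(k_n | Q) embeds in (Z/nZ)*.
n = 12
units = [k for k in range(1, n) if gcd(k, n) == 1]

# (Z/nZ)* is closed under multiplication mod n and is abelian.
closed = all((a * b) % n in units for a in units for b in units)
abelian = all((a * b) % n == (b * a) % n for a in units for b in units)

# For n = 12 every unit squares to 1, so the group is the Klein four-group.
orders = {k: next(e for e in range(1, n + 1) if pow(k, e, n) == 1)
          for k in units}
```

Since (ℤ/nℤ)⋆ is abelian for every n, every subgroup of it is abelian as well, consistent with Lemma 17.3.2.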
Theorem 17.3.3. Let L|K be a Galois extension by radicals. Then Aut(L|K) is solvable.

Proof. Suppose that L|K is a Galois extension by radicals. Then we have a chain of fields

K = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Lr = L

in which each Lj arises from Lj−1 by adjoining a radical. Let n be a common multiple of the exponents of these radicals, let ω be a primitive n-th root of unity, and set K̃ = K(ω) and L̃j = Lj(ω) for each j. This yields the chain

K ⊂ K̃ = L̃0 ⊂ L̃1 ⊂ ⋅ ⋅ ⋅ ⊂ L̃r = L̃.
From part (2) of Lemma 17.2.5, we get that L̃|K is a Galois extension. Furthermore, each L̃j|L̃j−1 is a Galois extension with Aut(L̃j|L̃j−1) cyclic from Theorem 17.2.2. In particular, Aut(L̃j|L̃j−1) is abelian. The group Aut(K̃|K) is abelian from Lemma 17.3.2. Therefore, we may apply Lemma 17.2.4 to the chain

K ⊂ K̃ = L̃0 ⊂ ⋅ ⋅ ⋅ ⊂ L̃r = L̃.
Therefore, G̃ = Aut(L̃|K) is solvable. The group G = Aut(L|K) is a homomorphic image of G̃ from the fundamental theorem of Galois theory. Since homomorphic images of solvable groups are still solvable (see Theorem 12.2.3), it follows that G is solvable.
Lemma 17.4.2. Let L|K be a Galois extension, and suppose that G = Aut(L|K) is solv-
able. Assume further that K contains all q-th roots of unity for each prime divisor q of
m = [L : K]. Then L is an extension of K by radicals.
Proof. Let L|K be a Galois extension, and suppose that G = Aut(L|K) is solvable; also
assume that K contains all the q-th roots of unity for each prime divisor q of m = [L : K].
We prove the result by induction on m.
If m = 1, then L = K, and the result is clear. Now suppose that m ≥ 2, and as-
sume that the result holds for all Galois extensions L'|K' with [L' : K'] < m. Now
G = Aut(L|K) is solvable, and G is nontrivial since m ≥ 2. Let q be a prime divisor of m.
From Lemma 12.2.2 and Theorem 13.3.5, it follows that there is a normal subgroup H
of G with G/H cyclic of order q. Let E = Fix(L, H). From the fundamental theorem of
Galois theory, E|K is a Galois extension with Aut(E|K) ≅ G/H, and hence Aut(E|K) is
cyclic of order q. From Theorem 17.2.2, E|K is a simple extension of K by a radical. The
proof is completed if we can show that L is an extension of E by radicals.
The extension L|E is a Galois extension, and the group Aut(L|E) is solvable, since
it is a subgroup of G = Aut(L|K). Each prime divisor p of [L : E] is also a prime divisor
of m = [L : K] by the degree formula. Hence, as an extension of K, the field E contains
all the p-th roots of unity. Finally,
[L : E] = [L : K]/[E : K] = m/q < m.

By the inductive hypothesis, L is an extension of E by radicals, and the proof is complete.
Theorem 17.5.1. Let K be a field of characteristic 0, and let f (x) ∈ K[x]. Suppose that L
is the splitting field of f (x) over K. Then the polynomial equation f (x) = 0 is solvable by
radicals if and only if Aut(L|K) is solvable.
Proof. If f(x) = 0 is solvable by radicals, then L is contained in a Galois extension of K by radicals, and Aut(L|K) is solvable by the preceding results.
Conversely, suppose that Aut(L|K) is solvable. Let ω be a primitive n-th root of unity, where n = [L : K], and set K̃ = K(ω) and L̃ = L(ω). From Lemma 17.2.5, Aut(L̃|K̃) is isomorphic to a subgroup of Aut(L|K); hence, it is solvable, and [L̃ : K̃] = |Aut(L̃|K̃)| is a divisor of [L : K] = |Aut(L|K)|. Hence, each prime divisor q of [L̃ : K̃] is also a prime divisor of [L : K], and K̃ contains the q-th roots of unity for each such q. Therefore, L̃ is an extension by radicals of K̃ by Lemma 17.4.2. Since K̃ = K(ω),
where ω is a primitive n-th root of unity, we obtain that L̃ is also an extension of K
by radicals. Therefore, L is contained in an extension L̃ of K by radicals; therefore,
f (x) = 0 is solvable by radicals.
Corollary 17.5.2. Let K be a field of characteristic 0, and let f (x) ∈ K[x] be a polynomial
of degree m with 1 ≤ m ≤ 4. Then the equation f (x) = 0 is solvable by radicals.
Proof. Let L be the splitting field of f(x) over K. The Galois group Aut(L|K) is isomorphic to a subgroup of the symmetric group Sm. Now the group S4 is solvable via the
chain
{1} ⊂ ℤ2 ⊂ D2 ⊂ A4 ⊂ S4 ,
where ℤ2 is the cyclic group of order 2, and D2 is the Klein 4-group, which is isomorphic
to ℤ2 × ℤ2 . Because Sm ⊂ S4 for 1 ≤ m ≤ 4, it follows that Aut(L|K) is solvable. From
Theorem 17.5.1, the equation f (x) = 0 is solvable by radicals.
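The solvability of S4 can also be verified computationally: its derived series terminates in the trivial group, which is equivalent to the existence of a chain like the one above. A sketch (our own permutation encoding):

```python
from itertools import permutations

# Permutations of {0,1,2,3} as tuples; compose(p, q) applies q first, then p.
def compose(p, q):
    return tuple(p[q[i]] for i in range(4))

def inverse(p):
    inv = [0] * 4
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

S4 = set(permutations(range(4)))
identity = tuple(range(4))

def derived_subgroup(G):
    # subgroup generated by all commutators [a, b] = a b a^-1 b^-1
    gens = {compose(compose(a, b), compose(inverse(a), inverse(b)))
            for a in G for b in G}
    H = {identity} | gens
    while True:
        new = {compose(a, b) for a in H for b in H}
        if new <= H:
            return H
        H |= new

# Derived series: S4 > A4 > V4 (Klein four-group) > {1}, so S4 is solvable.
A4 = derived_subgroup(S4)
V4 = derived_subgroup(A4)
last = derived_subgroup(V4)
series_sizes = [len(S4), len(A4), len(V4), len(last)]
```

The computed sizes 24, 12, 4, 1 correspond exactly to the chain {1} ⊂ ℤ2 ⊂ D2 ⊂ A4 ⊂ S4 used in the proof (with the ℤ2 step refining the abelian quotient V4).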
Corollary 17.5.2 uses the general theory to show that any polynomial equation of
degree less than or equal to 4 is solvable by radicals. This, however, does not provide
explicit formulas for the solutions. We present these below:
Let K be a field of characteristic 0, and let f(x) ∈ K[x] be a polynomial of degree m with 1 ≤ m ≤ 4. In each of the following cases, the zeros are taken in a splitting field of the respective polynomial over K.
Case (1): If deg(f(x)) = 1, then f(x) = ax + b with a, b ∈ K and a ≠ 0. A zero is then given by k = −b/a.
Case (2): If deg(f(x)) = 2, then f(x) = ax^2 + bx + c with a, b, c ∈ K and a ≠ 0. The zeros are then given by the quadratic formula

k = (−b ± √(b² − 4ac))/(2a).
We note that the quadratic formula holds over any field of characteristic not equal to 2. Whether there is a solution within the field K then depends on whether b² − 4ac has a square root within K.
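For K = ℚ, this membership question can be decided exactly with rational arithmetic: the zeros lie in ℚ precisely when b² − 4ac is a square in ℚ. A sketch (the helper names are ours):

```python
from fractions import Fraction
from math import isqrt

# Decide whether ax^2 + bx + c (a, b, c rational, a != 0) has its zeros in Q:
# by the quadratic formula this happens exactly when b^2 - 4ac is a square in Q.
def rational_sqrt(d):
    # exact square root of a nonnegative Fraction, or None if it is not a square
    if d < 0:
        return None
    p, q = d.numerator, d.denominator
    rp, rq = isqrt(p), isqrt(q)
    return Fraction(rp, rq) if rp * rp == p and rq * rq == q else None

def rational_zeros(a, b, c):
    disc = b * b - 4 * a * c
    root = rational_sqrt(Fraction(disc))
    if root is None:
        return None  # the zeros exist only in a quadratic extension of Q
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))

split_case = rational_zeros(Fraction(1), Fraction(-3), Fraction(2))        # x^2 - 3x + 2
irreducible_case = rational_zeros(Fraction(1), Fraction(0), Fraction(-2))  # x^2 - 2
```

Here x² − 3x + 2 splits over ℚ (discriminant 1), while x² − 2 does not (discriminant 8 is not a rational square), so its zeros ±√2 lie in the quadratic extension ℚ(√2).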
For the cases of degrees 3 and 4, we have the general forms of what are known as
Cardano’s formulas.
Case (3): If deg(f(x)) = 3, then f(x) = ax^3 + bx^2 + cx + d with a, b, c, d ∈ K and a ≠ 0. Dividing through by a, we may assume, without loss of generality, that a = 1. By the substitution x = y − b/3, the polynomial is transformed into

g(y) = y^3 + py + q ∈ K[y].
Let L be the splitting field of g(y) over K, and let α ∈ L be a zero of g(y) so that
α^3 + pα + q = 0.
If p = 0, then α = ³√−q, so that g(y) has the three zeros

³√−q, ω ³√−q, ω² ³√−q,

where ω is a primitive third root of unity. Now suppose that p ≠ 0. We seek a solution of the form α = β − p/(3β). Substituting this into

α^3 + pα + q = 0,

we get

β^3 − p^3/(27β^3) + q = 0.
Define γ = β^3 and δ = (−p/(3β))^3 so that

γ + δ + q = 0.

Then γδ = (β ⋅ (−p/(3β)))^3 = −(p/3)^3, and hence

γ^2 + qγ − (p/3)^3 = 0, −p^3/(27δ) + δ + q = 0, and δ^2 + qδ − (p/3)^3 = 0.

Solving these quadratic equations, we may take

γ = −q/2 + √((q/2)² + (p/3)³) and δ = −q/2 − √((q/2)² + (p/3)³).
Then, from the definitions of γ and δ, we have γ = β^3 and δ = (−p/(3β))^3. From above, α = β − p/(3β). Therefore, we get α by finding the cube roots of γ and δ.
There are certain possibilities and combinations with these cube roots, but because of the conditions, the cube roots of γ and δ are not independent. We must satisfy the condition

³√γ ⋅ ³√δ = β ⋅ (−p/(3β)) = −p/3.
The three zeros of g(y) are then

u + v, ωu + ω²v, ω²u + ωv,

where ω is a primitive third root of unity, and

u = ³√(−q/2 + √((q/2)² + (p/3)³)) and v = ³√(−q/2 − √((q/2)² + (p/3)³)),

with the cube roots chosen so that uv = −p/3.
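Cardano's formulas can be checked numerically. The sketch below (our own implementation; it couples the cube roots so that uv = −p/3) recovers the zeros 1, 2, −3 of y³ − 7y + 6, a case where the expression under the square root is negative, so that complex arithmetic is needed even though all three zeros are real:

```python
import cmath

# Zeros of y^3 + p*y + q via Cardano's formulas, with the cube roots of
# gamma and delta coupled so that u*v = -p/3.
def cardano(p, q):
    disc = (q / 2) ** 2 + (p / 3) ** 3
    u = (-q / 2 + cmath.sqrt(complex(disc))) ** (1 / 3)
    if abs(u) < 1e-12:          # happens only when p == 0
        u = complex(-q) ** (1 / 3)
        v = 0j
    else:
        v = -p / (3 * u)        # forces the constraint u*v = -p/3
    omega = cmath.exp(2j * cmath.pi / 3)  # primitive third root of unity
    return [u + v, omega * u + omega ** 2 * v, omega ** 2 * u + omega * v]

# y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3): here (q/2)^2 + (p/3)^3 < 0
# ("casus irreducibilis"), yet the three zeros 1, 2, -3 are all real.
zeros = cardano(-7.0, 6.0)
residuals = [abs(z ** 3 - 7 * z + 6) for z in zeros]
```

The residuals confirm that all three returned values are zeros of the cubic up to floating-point error.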
Case (4): If deg(f(x)) = 4, we may again assume that f(x) is monic; a substitution x = y − b/4 then transforms it into a polynomial of the form

g(y) = y^4 + py^2 + qy + r.

We have to find the zeros of g(y). Let x1, x2, x3, x4 be the solutions in the splitting field of the polynomial

y^4 + py^2 + qy + r = 0.
Then
0 = x1 + x2 + x3 + x4 ,
p = x1 x2 + x1 x3 + x1 x4 + x2 x3 + x2 x4 + x3 x4 ,
−q = x1 x2 x3 + x1 x2 x4 + x1 x3 x4 + x2 x3 x4 ,
r = x1 x2 x3 x4 .
We define
y1 = (x1 + x2 )(x3 + x4 ),
y2 = (x1 + x3 )(x2 + x4 ),
y3 = (x1 + x4 )(x2 + x3 ).
From x1 + x2 + x3 + x4 = 0, we get

y1 = −(x1 + x2)², y2 = −(x1 + x3)², y3 = −(x1 + x4)².
Let y^3 + fy^2 + gy + h = 0 be the cubic equation with the solutions y1, y2, and y3. The polynomial y^3 + fy^2 + gy + h is called the cubic resolvent of the equation of degree four.
If we compare the coefficients, we get the following:

f = −y1 − y2 − y3,
g = y1y2 + y1y3 + y2y3,
h = −y1y2y3.

Expressing y1, y2, y3 in terms of p, q, and r, a direct computation gives:

f = −2p,
g = p² − 4r,
h = q².
x1 + x2 = −(x3 + x4 ) = ±√−y1 ,
x1 + x3 = −(x2 + x4 ) = ±√−y2 ,
x1 + x4 = −(x2 + x3 ) = ±√−y3 .
The formulas for x2 , x3 , and x4 follow analogously, and are of the same type as that for
x1 .
By varying the signs, we get eight numbers ±x1, ±x2, ±x3, and ±x4. Four of them are the solutions of the equation

y^4 + py^2 + qy + r = 0.

We obtain the correct ones by substituting into the equation. They are as follows:
x1 = (1/2)(√−y1 + √−y2 + √−y3),
x2 = (1/2)(√−y1 − √−y2 − √−y3),
x3 = (1/2)(−√−y1 + √−y2 − √−y3),
x4 = (1/2)(−√−y1 − √−y2 + √−y3).
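The coefficient identities f = −2p, g = p² − 4r, h = q² can be verified numerically for a sample quartic. In this sketch (our own check), the zeros 1, 2, 3, −6 have sum 0, and we use the sign choice √−y1 ⋅ √−y2 ⋅ √−y3 = −q, which is equivalent to the substitution test in the text:

```python
from itertools import combinations
from math import prod

x = [1.0, 2.0, 3.0, -6.0]   # sample zeros of y^4 + p y^2 + q y + r, sum = 0

# Recover p, q, r from the elementary symmetric polynomials of the zeros.
p = sum(prod(c) for c in combinations(x, 2))
q = -sum(prod(c) for c in combinations(x, 3))
r = prod(x)

y1 = (x[0] + x[1]) * (x[2] + x[3])
y2 = (x[0] + x[2]) * (x[1] + x[3])
y3 = (x[0] + x[3]) * (x[1] + x[2])

# Coefficients of the cubic resolvent y^3 + f y^2 + g y + h:
f = -(y1 + y2 + y3)
g = y1 * y2 + y1 * y3 + y2 * y3
h = -y1 * y2 * y3

identities = (abs(f - (-2 * p)), abs(g - (p * p - 4 * r)), abs(h - q * q))

# Recover x1 with the sign choice sqrt(-y1)*sqrt(-y2)*sqrt(-y3) = -q:
roots = [(-y1) ** 0.5, (-y2) ** 0.5, (-y3) ** 0.5]   # 3.0, 4.0, 5.0
roots[2] = -roots[2]                                  # 3 * 4 * (-5) = -60 = -q
x1 = 0.5 * sum(roots)
```

With this sign choice, the formula reproduces the zero x1 = 1 exactly, while the wrong sign combinations produce values that fail the substitution test.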
The following theorem goes back to Abel; it demonstrates the insolvability by radicals of a degree 5 polynomial equation over the rationals ℚ.

Theorem 17.5.3. Let L be the splitting field of the polynomial f(x) = x^5 − 2x^4 + 2 ∈ ℚ[x] over ℚ. Then Aut(L|ℚ) ≅ S5, the symmetric group on 5 letters. Since S5 is not solvable, the equation f(x) = 0 is not solvable by radicals.
Proof. The polynomial f (x) is irreducible over ℚ by the Eisenstein criterion. Further-
more, f (x) has five zeros in the complex numbers ℂ by the fundamental theorem of
algebra (see Section 17.7). We claim that f (x) has exactly 3 real zeros and 2 nonreal
zeros, which then necessarily are complex conjugates. In particular, the 5 zeros are
pairwise distinct.
To see the claim, notice first that f (x) has at least 3 real zeros by the intermediate value theorem. As a real function, f (x) is continuous, and f (−1) = −1 < 0 and f (0) = 2 > 0, so it must have a real zero between −1 and 0. Furthermore, f (3/2) = −17/32 < 0, and f (2) = 2 > 0. Hence, there must be distinct real zeros between 0 and 3/2, and between 3/2 and 2. Suppose that f (x) has more than 3 real zeros. Then f′(x) = x³(5x − 8) has at least 3 pairwise distinct real zeros by Rolle's theorem. But f′(x) clearly has only 2 real zeros, so this is not the case. Therefore, f (x) has exactly 3 real zeros, and hence 2 nonreal zeros that are complex conjugates.
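The sign computations in this argument can be replayed exactly with rational arithmetic; the following sketch (Python, not from the text) evaluates f at the sample points used above:

```python
from fractions import Fraction

def f(x):
    # the polynomial of Theorem 17.5.3
    return x**5 - 2 * x**4 + 2

# sample points of the intermediate value argument, in exact arithmetic
samples = [Fraction(-1), Fraction(0), Fraction(3, 2), Fraction(2)]
values = [f(x) for x in samples]
# the values are -1, 2, -17/32, 2: three sign changes, so f has >= 3 real zeros.
# f'(x) = 5x^4 - 8x^3 = x^3(5x - 8) has only the real zeros 0 and 8/5, so by
# Rolle's theorem f cannot have more than 3 real zeros: it has exactly 3.
```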
Let L be the splitting field of f (x). The field L lies in ℂ, and the restriction to L of the complex conjugation map δ : z ↦ z̄ maps the set of zeros of f (x) onto itself. Therefore, δ restricts to an automorphism of L. The map δ fixes the 3 real zeros and transposes the 2 nonreal zeros.
From this, we now show that Aut(L|ℚ) = Aut L = G = S5 , the full symmetric group on
5 symbols. Clearly, G ⊂ S5 , since G acts as a permutation group on the 5 zeros of f (x).
Since δ transposes the 2 nonreal zeros, G (as a permutation group) contains at
least one transposition. Since f (x) is irreducible, G acts transitively on the zeros of
f (x). Let x0 be one of the zeros of f (x), and let Gx0 be the stabilizer of x0 . Since G acts
transitively, x0 has five images under G; therefore, the index of the stabilizer must be 5
(see Chapter 10):
5 = [G : Gx0 ],
which—by Lagrange’s theorem—must divide the order of G. Therefore, from the Sylow
theorems, G contains an element of order 5. Hence, G contains a 5-cycle and a trans-
position; therefore, by Theorem 11.4.3, it follows that G = S5 . Since S5 is not solvable,
it follows that f (x) cannot be solved by radicals.
Since Abel’s theorem shows that there exists a degree 5 polynomial that cannot
be solved by radicals, it follows that there can be no formula like Cardano’s formula
in terms of radicals for degree 5.
Corollary 17.5.4. There is no general formula for solving by radicals a fifth degree poly-
nomial over the rationals.
We now show that this result can be further extended to any degree greater than 5.
Theorem 17.5.5. For each n ≥ 5, there exist polynomials f (x) ∈ ℚ[x] of degree n, for
which the equation f (x) = 0 is not solvable by radicals.
Proof. Let f (x) = x^(n−5)(x⁵ − 2x⁴ + 2), and let L be the splitting field of f (x) over ℚ. Then
Aut(L|ℚ) = Aut(L) contains a subgroup that is isomorphic to S5 . It follows that Aut(L)
is not solvable; therefore, the equation f (x) = 0 is not solvable by radicals.
Corollary 17.5.6. There is no general formula for solving by radicals polynomial equa-
tions over the rationals of degree 5 or greater.
and compass. In Chapter 6, we proved the impossibility of the first 3 problems. Here,
we use Galois theory to consider constructible n-gons.
Recall that a Fermat number is a positive integer of the form
Fn = 2^(2^n) + 1, n = 0, 1, 2, 3, . . . .
ℚ = L0 ⊂ L1 ⊂ ⋅ ⋅ ⋅ ⊂ Ln = kp ,
[Lj : Lj−1 ] = 2
for j = 1, . . . , n.
with [Uj−1 : Uj ] = 2 for j = 1, . . . , n. From the fundamental theorem of Galois theory, the
fields Lj = Fix(kp , Uj ) with j = 0, . . . , n have the desired properties.
Corollary 17.6.2. Consider the numbers 0, 1, that is, a unit line segment or a unit circle. A regular p-gon with p ≥ 3 prime is constructible from {0, 1} using a straightedge and compass if and only if p = 2^(2^s) + 1, s ≥ 0, is a Fermat prime.
Proof. From Theorem 6.3.13, we have that if a regular p-gon is constructible with a
straightedge and compass, then p must be a Fermat prime. The sufficiency follows
from Theorem 17.6.1.
We now extend this to general n-gons. Let m, n ∈ ℕ. Assume that we may construct from {0, 1} a regular n-gon and a regular m-gon. In particular, this means that we may construct the real numbers cos(2π/n), sin(2π/n), cos(2π/m), and sin(2π/m). If gcd(m, n) = 1, then we may construct from {0, 1} a regular mn-gon.
To see this, notice that
cos(2π/n + 2π/m) = cos(2(n + m)π/(nm)) = cos(2π/n) cos(2π/m) − sin(2π/n) sin(2π/m),
and
sin(2π/n + 2π/m) = sin(2(n + m)π/(nm)) = sin(2π/n) cos(2π/m) + cos(2π/n) sin(2π/m).
Therefore, we may construct from {0, 1} the numbers cos(2π/(mn)) and sin(2π/(mn)), because gcd(n + m, mn) = 1. Hence, we may construct from {0, 1} a regular mn-gon.
Now let p ≥ 3 be a prime. Then [kp² : ℚ] = p(p − 1), which is not a power of 2. Therefore, from {0, 1} it is not possible to construct a regular p²-gon. Hence, altogether we have the following:
Corollary 17.6.3. Consider the numbers 0, 1, that is, a unit line segment or a unit circle.
A regular n-gon with n ∈ ℕ is constructible from {0, 1} using a straightedge and compass
if and only if
(i) n = 2^m, m ≥ 0, or
(ii) n = 2^m p1 p2 ⋅ ⋅ ⋅ pr , m ≥ 0, and the pi are pairwise distinct Fermat primes.
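Corollary 17.6.3 turns into a simple divisibility test. The sketch below (Python, not from the text) assumes the five known Fermat primes 3, 5, 17, 257, 65537; since it is an open question whether further Fermat primes exist, the test reflects current knowledge:

```python
FERMAT_PRIMES = (3, 5, 17, 257, 65537)   # all Fermat primes currently known

def is_constructible_ngon(n):
    """Criterion of Corollary 17.6.3: n = 2^m * (pairwise distinct Fermat primes)."""
    if n < 3:
        return False
    while n % 2 == 0:                    # strip the factor 2^m
        n //= 2
    for p in FERMAT_PRIMES:              # each Fermat prime may occur at most once
        if n % p == 0:
            n //= p
    return n == 1                        # any leftover factor violates the criterion
```

For instance, the regular 15-gon and 17-gon are constructible, while the 9-gon (9 = 3²) and 7-gon are not.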
the result in 1629. It was then more clearly stated in 1637 by Descartes, who also dis-
tinguished between real and imaginary zeros. The first published proof of the funda-
mental theorem of algebra was then given by D’Alembert in 1746. However, there were
gaps in D’Alembert’s proof, and the first fully accepted proof was that given by Gauss
in 1797 in his Ph. D. thesis. This was published in 1799. Interestingly enough, in re-
viewing Gauss’ original proof, modern scholars tend to agree that there are as many
holes in this proof as in D’Alembert’s proof. Gauss, however, published three other
proofs with no such holes. He published second and third proofs in 1816, while his
final proof, which was essentially another version of the first, was presented in 1849.
Theorem 17.7.1. Each nonconstant polynomial f (x) ∈ ℂ[x], where ℂ is the field of com-
plex numbers, has a zero in ℂ. Therefore, ℂ is an algebraically closed field.
Proof. Let f (x) ∈ ℂ[x] be a nonconstant polynomial, and let K be the splitting field of
f (x) over ℂ. Since the characteristic of the complex numbers ℂ is zero, this will be a
Galois extension of ℂ. Since ℂ is a finite extension of ℝ, this field K would also be a
Galois extension of ℝ. The fundamental theorem of algebra asserts that K must be ℂ itself, and hence the fundamental theorem of algebra is equivalent to the statement that every finite Galois extension of ℂ is ℂ itself.
Let K be any finite extension of ℝ with |K : ℝ| = 2^m q, (2, q) = 1. If m = 0, then K is an odd-degree extension of ℝ. Since K is separable over ℝ, from the primitive element theorem, it is a simple extension, and hence K = ℝ(α), where the minimal polynomial mα (x) over ℝ has odd degree. However, odd-degree real polynomials always have a real zero, and therefore mα (x) is irreducible only if its degree is one. But then, α ∈ ℝ, and K = ℝ. Therefore, if K is a nontrivial finite extension of ℝ of degree 2^m q, we must have m > 0. This shows more generally that there are no nontrivial odd-degree finite extensions of ℝ.
Suppose that K is a degree 2 extension of ℂ. Then K = ℂ(α) with deg mα (x) = 2, where mα (x) is the minimal polynomial of α over ℂ. But by the quadratic formula, quadratic polynomials over ℂ always have their zeros in ℂ, a contradiction. Therefore, ℂ has no degree 2 extensions.
Now, let K be a Galois extension of ℂ. Then K is also Galois over ℝ. Suppose |K : ℝ| = 2^m q, (2, q) = 1. From the argument above, we must have m > 0. Let G = Gal(K/ℝ) be the Galois group. Then |G| = 2^m q, m > 0, (2, q) = 1. Thus, G has a 2-Sylow subgroup of order 2^m and index q (see Theorem 13.3.4). This would correspond to an intermediate field E with |K : E| = 2^m and |E : ℝ| = q. However, then E is an odd-degree finite extension of ℝ. It follows that q = 1 and E = ℝ. Therefore, |K : ℝ| = 2^m, and |G| = 2^m.
Now, |K : ℂ| = 2^(m−1); let G1 = Gal(K/ℂ). This is a 2-group. If it were not trivial, then from Theorem 13.4.1 there would exist a subgroup of order 2^(m−2) and index 2. This would correspond to an intermediate field E of degree 2 over ℂ. However, from the argument above, ℂ has no degree 2 extensions. It follows then that G1 is trivial; that is, |G1| = 1, so |K : ℂ| = 1, and K = ℂ, completing the proof.
The fact that ℂ is algebraically closed limits the possible algebraic extensions of
the reals.
Corollary 17.7.2. Let K be a finite field extension of the real numbers ℝ. Then K = ℝ or
K = ℂ.
Proof. Since |K : ℝ| < ∞ by the primitive element theorem, K = ℝ(α) for some α ∈ K.
Then the minimal polynomial mα (x) of α over ℝ is in ℝ[x], and hence in ℂ[x]. There-
fore, from the fundamental theorem of algebra it has a zero in ℂ. Hence, α ∈ ℂ. If
α ∈ ℝ, then K = ℝ; if not, then K = ℂ.
17.8 Exercises
1. For f (x) ∈ ℚ[x] with
∛(2 ± √−121) = 2 ± √−1,
x^n − 1 = (x − ξ1 )(x − ξ2 ) ⋅ ⋅ ⋅ (x − ξn ),
where ξ1 , . . . , ξn are all the (distinct) n-th roots of unity; in particular, ξn = 1. These ξν form a multiplicative cyclic group G = {ξ1 , ξ2 , . . . , ξn } generated by ξ1 , and ξν = ξ1^ν.
An n-th root of unity ξν is called a primitive n-th root of unity, if ξν is not an m-th
root of unity for any m < n.
Show that the following are equivalent:
(i) ξν is a primitive n-th root of unity.
(ii) ξν is a generating element of G.
(iii) gcd(ν, n) = 1.
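The equivalence (i) ⇔ (iii) can be checked by machine for a concrete n; the following sketch (Python, not from the text) compares the multiplicative orders of ξν = ξ1^ν with the gcd criterion for n = 12:

```python
import cmath
from math import gcd

n = 12
xi1 = cmath.exp(2j * cmath.pi / n)       # the generator xi_1 of the group G
assert abs(xi1 ** n - 1) < 1e-9          # xi_1 is an n-th root of unity

def multiplicative_order(nu, n):
    """Smallest m >= 1 with (xi_1**nu)**m = 1, i.e. with n | nu*m."""
    m = 1
    while (nu * m) % n != 0:
        m += 1
    return m

# xi_nu is a primitive n-th root of unity iff its order is n, and this
# happens exactly for the nu coprime to n (criterion (iii))
primitive = [nu for nu in range(1, n + 1) if multiplicative_order(nu, n) == n]
assert primitive == [nu for nu in range(1, n + 1) if gcd(nu, n) == 1]
```

For n = 12, the primitive 12-th roots of unity are ξ1, ξ5, ξ7, ξ11.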
11. The polynomial ϕn (x) ∈ ℂ[x], whose zeros are exactly the primitive n-th roots of unity, is called the n-th cyclotomic polynomial. With Exercise 6, we have
ϕn (x) = ∏ (x − ξν ) = ∏ (x − e^(2πiν/n)),
where both products run over all ν with 1 ≤ ν ≤ n and gcd(ν, n) = 1.
The degree of ϕn (x) is the number of integers in {1, . . . , n} that are coprime to n.
Show the following:
(i) x^n − 1 = ∏ ϕd (x), where the product is taken over all divisors d ≥ 1 of n.
(ii) ϕn (x) ∈ ℤ[x] for all n ≥ 1.
(iii) ϕn (x) is irreducible over ℚ (and therefore also over ℤ) for all n ≥ 1.
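Property (i) also gives a practical way to compute ϕn (x): divide x^n − 1 by all ϕd (x) with d | n, d < n. A small sketch (pure Python, coefficients stored lowest degree first; not from the text):

```python
def poly_divmod(a, b):
    """Exact division of integer polynomials given as coefficient lists,
    lowest degree first; b is assumed monic (leading coefficient 1)."""
    a, q = a[:], [0] * (len(a) - len(b) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = a[i + len(b) - 1]
        for j, c in enumerate(b):
            a[i + j] -= q[i] * c
    return q, a                           # quotient and remainder

def cyclotomic(n, cache={}):
    """phi_n(x) computed from x^n - 1 = prod over d | n of phi_d(x)."""
    if n not in cache:
        p = [-1] + [0] * (n - 1) + [1]            # x^n - 1
        for d in range(1, n):
            if n % d == 0:
                p, rem = poly_divmod(p, cyclotomic(d))
                assert all(c == 0 for c in rem)   # the division is exact, cf. (ii)
        cache[n] = p
    return cache[n]
```

For example, ϕ6 (x) = x² − x + 1 and ϕ12 (x) = x⁴ − x² + 1; the integrality of all quotients illustrates (ii).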
12. Show that the Fermat numbers F0 , F1 , F2 , F3 , F4 are all prime but F5 is composite
and divisible by 641.
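Both claims of this exercise are quick to verify by machine; the sketch below (Python, not from the text) uses Euler's factorization F5 = 641 · 6700417:

```python
def fermat(n):
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(m):
    """Naive trial division; adequate for numbers up to F_4 = 65537."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

assert all(is_prime(fermat(n)) for n in range(5))   # F_0, ..., F_4 are prime
assert fermat(5) == 641 * 6700417                   # F_5 is composite (Euler)
```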
Vector spaces are the fundamental algebraic structures in linear algebra, and the study
of linear equations. Vector spaces have been crucial in our study of fields and Galois
theory, since any field extension is a vector space over any subfield. In this context,
the degree of a field extension is just the dimension of the extension field as a vector
space over the base field.
If we modify the definition of a vector space to allow scalar multiplication from an
arbitrary ring, we obtain a more general structure called a module. We will formally
define this below. Modules generalize vector spaces, but the fact that the scalars do
not necessarily have inverses makes the study of modules much more complicated.
Modules will play an important role in both the study of rings and the study of abelian
groups. In fact, any abelian group is a module over the integers ℤ so that modules, be-
sides being generalizations of vector spaces, can also be considered as generalizations
of abelian groups.
In this chapter, we will introduce the theory of modules. In particular, we will
extend to modules the basic algebraic properties such as the isomorphism theorems,
which have been introduced earlier in presenting groups, rings, and fields.
In this chapter, we restrict ourselves to commutative rings, so that throughout R
is always a commutative ring. If R has an identity 1, then we always consider only the
case that 1 ≠ 0. Throughout this chapter, we use letters a, b, c, m, . . . for ideals in R. For
principal ideals, we write ⟨a⟩ or aR for the ideal generated by a ∈ R. We note, however,
that the definition can be extended to include modules over noncommutative rings
(see Chapter 22). In this case, we would speak of left modules and right modules.
Definition 18.1.1. Let R = (R, +, ⋅) be a commutative ring and M = (M, +) an abelian group. M together with a scalar multiplication ⋅ : R × M → M, (α, x) ↦ αx, is called an R-module or module over R if the following axioms hold:
(M1) (α + β)x = αx + βx,
(M2) α(x + y) = αx + αy, and
(M3) (αβ)x = α(βx) for all α, β ∈ R and x, y ∈ M.
https://doi.org/10.1515/9783110603996-018
Example 18.1.2.
(1) If R = K is a field, then a K-module is a K-vector space.
(2) Let G = (G, +) be an abelian group. If n ∈ ℤ and x ∈ G, then nx is defined as usual:
0 ⋅ x = 0,
nx = x + ⋅ ⋅ ⋅ + x (n times) if n > 0, and
nx = (−n)(−x) if n < 0.
With this, G becomes a ℤ-module via
⋅ : ℤ × G → G, (n, x) ↦ nx.
(3) Let S be a subring of R. Then, via (s, r) → sr, the ring R itself becomes an S-module.
(4) Let K be a field, V a K-vector space, and f : V → V a linear map of V. Let p = ∑i αi t^i ∈ K[t]. Then p(f ) := ∑i αi f^i defines a linear map of V, and V is a unitary K[t]-module via the scalar multiplication
K[t] × V → V, (p, v) ↦ p(f )(v).
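Example (4) can be made concrete. In the sketch below (Python, not from the text), K = ℚ, V = ℚ², and f is the rotation with matrix [[0, −1], [1, 0]]; polynomials are coefficient lists, lowest degree first, and p ⋅ v := p(f )(v) is evaluated by a Horner-type scheme. Since f² = −id, the polynomial t² + 1 acts as 0, so V is a torsion K[t]-module:

```python
from fractions import Fraction as F

# V = Q^2; f is the linear map with matrix [[0, -1], [1, 0]] (rotation by 90 degrees)
def f(v):
    x, y = v
    return (-y, x)

def scalar_mult(p, v):
    """p . v := p(f)(v) for p = [a0, a1, ...] in K[t], lowest degree first.
    This is the K[t]-module structure of Example 18.1.2 (4)."""
    result = (F(0), F(0))
    for a in reversed(p):      # Horner: p(f) = a0*id + f o (a1*id + f o (...))
        result = f(result)
        result = (result[0] + a * v[0], result[1] + a * v[1])
    return result

# f^2 = -id, so t^2 + 1 annihilates every vector
assert scalar_mult([F(1), F(0), F(1)], (F(2), F(3))) == (F(0), F(0))
```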
Basic to all algebraic theory is the concept of substructures. Next we define sub-
modules.
Example 18.1.4.
(1) In an abelian group G, considered as a ℤ-module, the subgroups are precisely the
submodules.
(2) The submodules of R, considered as an R-module, are precisely the ideals.
We next extend to modules the concept of a generating system. For a single gen-
erator, as with groups, this is called cyclic.
⟨⋃i∈I Ui ⟩ = {∑i∈L ai : ai ∈ Ui , L ⊂ I finite}.
We write ⟨⋃i∈I Ui ⟩ =: ∑i∈I Ui and call this submodule the sum of the Ui . A sum ∑i∈I Ui is called a direct sum if for each representation of 0, as 0 = ∑ ai , ai ∈ Ui , it follows that all ai = 0. This is equivalent to Ui ∩ ∑j≠i Uj = {0} for all i ∈ I. Notation: ⨁i∈I Ui ; and if I = {1, . . . , n}, then we also write U1 ⊕ ⋅ ⋅ ⋅ ⊕ Un .
Definition 18.1.8. Let U be a submodule of the R-module M. Let M/U be the factor group. We define a (well-defined) scalar multiplication
α(x + U) := αx + U, α ∈ R, x ∈ M.
With this, M/U is an R-module, the factor module or quotient module of M by U. In M/U,
we have the operations
(x + U) + (y + U) = (x + y) + U,
and
α(x + U) = αx + U.
A module M over a ring R can also be considered as a module over a quotient ring
of R. The following is straightforward to verify (see exercises):
Lemma 18.1.9. Let a ⊲ R be an ideal in R and M an R-module. The set of all finite sums of the form ∑ αi xi , αi ∈ a, xi ∈ M, is a submodule of M, which we denote by aM. The factor group M/aM becomes an R/a-module via the well-defined scalar multiplication
(α + a)(x + aM) := αx + aM.
If here R has an identity 1 and a is a maximal ideal, then M/aM becomes a vector space over the field K = R/a.
f (x + y) = f (x) + f (y)
and
f (αx) = αf (x)
for all α ∈ R and all x, y ∈ M. Endo-, epi-, mono-, iso-, and automorphisms are defined analogously via the corresponding properties of the maps. If f : M → N and g : N → P are module homomorphisms, then g ∘ f : M → P is also a module homomorphism. If f : M → N is an isomorphism, then so is f⁻¹ : N → M.
and
f (M) ≅ M/ ker(f ).
U/(U ∩ V) ≅ (U + V)/V.
(M/U)/(V/U) ≅ M/V.
For the proofs, as for groups, just consider the map f : U + V → U/(U ∩ V),
u + v → u + (U ∩ V), which is well-defined because U ∩ V is a submodule of U; then
we have ker(f ) = V.
Note that α ↦ αρ, ρ ∈ R fixed, defines a module homomorphism R → R if we consider R itself as an R-module.
Ann(a) = {α ∈ R : αa = 0}.
Lemma 18.2.2. Ann(a) is a submodule of R, and the module isomorphism theorem (1)
gives R/ Ann(a) ≅ Ra.
As for single elements, since Ann(U) = ⋂u∈U Ann(u), the set Ann(U) is a submodule of R. Moreover, if α ∈ Ann(U) and ρ ∈ R, then also ρα ∈ Ann(U), because (ρα)u = ρ(αu) = 0 for all u ∈ U. Hence, Ann(U) is an ideal in R.
Suppose that G is an abelian group. Then, as noted above, G is a ℤ-module. An element g ∈ G is a torsion element, or has finite order, if ng = 0 for some n ∈ ℕ. The
set Tor(G) consists of all the torsion elements in G. An abelian group is torsion-free if
Tor(G) = {0}.
Theorem 18.2.6. Let R be an integral domain and M an R-module (by our agreement
M is unitary). Let Tor(M) = T(M) be the set of torsion elements of M. Then Tor(M) is a
submodule of M, and M/ Tor(M) is torsion-free.
+ : P × P → P and ⋅ : R × P → P
via the componentwise operations (xi ) + (yi ) := (xi + yi ) and α(xi ) := (αxi ).
Theorem 18.3.1.
(1) If π ∈ SI is a permutation of I, then
∏i∈I Mi ≅ ∏i∈I Mπ(i) and ⨁i∈I Mi ≅ ⨁i∈I Mπ(i) .
(2) If I = ⋃j∈J Ij is a partition of I into pairwise disjoint subsets Ij , then
∏i∈I Mi ≅ ∏j∈J (∏i∈Ij Mi ) and ⨁i∈I Mi ≅ ⨁j∈J (⨁i∈Ij Mi ).
Proof. (1) If there is such a ϕ, then the jth component of ϕ(a) is equal to ϕj (a), because πj ∘ ϕ = ϕj . Hence, define ϕ(a) ∈ ∏i∈I Mi via ϕ(a)(i) := ϕi (a); then ϕ is the desired map.
(2) If there is such a Ψ with Ψ ∘ αj = Ψj , then Ψ(x) = Ψ((xi )) = Ψ(∑i∈I δi (xi )) = ∑i∈I Ψ ∘ δi (xi ) = ∑i∈I Ψi (xi ). Hence, define Ψ((xi )) = ∑i∈I Ψi (xi ); then Ψ is the desired map (recall that the sum is well defined).
following, we assume that S ≠ ∅. If S = ∅, then ⟨S⟩ = {0}, and this case is not interesting. By convention, in the following, we always assume mi ≠ mj for i ≠ j in a finite sum ∑ αi mi with all αi ∈ R and all mi ∈ M.
Definition 18.4.1. A finite set {m1 , . . . , mn } ⊂ M is called linearly independent or free (over R) if a representation 0 = α1 m1 + ⋅ ⋅ ⋅ + αn mn always implies αi = 0 for all i ∈ {1, . . . , n}; that is, 0 can be represented only trivially on {m1 , . . . , mn }. A nonempty subset S ⊂ M is called free (over R) if each finite subset of S is free.
Example 18.4.3.
1. R × R = R2 , as an R-module, is free with basis {(1, 0), (0, 1)}.
2. More generally, let I ≠ ∅. Then ⨁i∈I Ri with Ri = R for all i ∈ I is free with basis {ϵi : I → R : ϵi (j) = δij , i, j ∈ I}, where δij = 0 if i ≠ j, and δij = 1 if i = j.
Theorem 18.4.4. The R-module M is free on S if and only if each m ∈ M can be written uniquely in the form ∑ αi si with αi ∈ R, si ∈ S. This is exactly the case when M = ⨁s∈S Rs is the direct sum of the cyclic submodules Rs, and each Rs is module isomorphic to R.
Corollary 18.4.5.
(1) M is free on S ⇔ M ≅ ⨁s∈S Rs , Rs = R for all s ∈ S.
(2) If M is finitely generated and free, then there exists an n ∈ ℕ0 such that M ≅ R^n = R ⊕ ⋅ ⋅ ⋅ ⊕ R (n times).
Proof. Part (1) is clear. We prove part (2). Let M = ⟨x1 , . . . , xr ⟩ and S a basis of M. Each xi is uniquely representable on S, as xi = ∑j αij sj with sj ∈ S and only finitely many αij ≠ 0. Since the xi generate M, for arbitrary m ∈ M we get m = ∑i βi xi = ∑i,j βi αij sj , so only finitely many elements sj ∈ S are needed to generate M. Hence, S is finite.
Theorem 18.4.6. Let R be a commutative ring with identity 1, and M a free R-module.
Then any two bases of M have the same cardinality.
Proof. R contains a maximal ideal m, and R/m is a field (see Theorems 2.3.2 and 2.4.2). Then M/mM is a vector space over R/m. From M ≅ ⨁s∈S Rs with basis S, we get mM ≅ ⨁s∈S ms; hence,
M/mM ≅ ⨁s∈S Rs/ms ≅ ⨁s∈S R/m.
Therefore, the R/m-vector space M/mM has a basis of the cardinality of S. This gives the result.
Let R be a commutative ring with identity 1, and M a free R-module. The cardinality of a basis is an invariant of M, called the rank of M or dimension of M. If rank(M) = n < ∞, then this means M ≅ R^n.
Proof. Let S be a basis of F. By the axiom of choice, there exists for each s ∈ S an
element ms ∈ M with f (ms ) = s (f is surjective). We define the map g : F → M via
s → ms linearly; that is, g(∑si ∈S αi si ) = ∑si ∈S αi msi . Since F is free, the map g is well
defined. Obviously, f ∘ g(s) = f (ms ) = s for s ∈ S; that means f ∘ g = idF , because
F is free on S. For each m ∈ M, we have also m = g ∘ f (m) + (m − g ∘ f (m)), where
g ∘ f (m) = g(f (m)) ∈ g(F). Since f ∘ g = idF , the elements of the form m − g ∘ f (m) are in
the kernel of f . Therefore, M = g(F) + ker(f ). Now let x ∈ g(F) ∩ ker(f ). Then x = g(y)
for some y ∈ F and 0 = f (x) = f ∘ g(y) = y, and hence x = 0. Therefore, the sum is
direct: M = g(F) ⊕ ker(f ).
Corollary 18.4.9. Let M be an R-module and N a submodule such that M/N is free. Then there is a submodule N′ of M with M = N ⊕ N′.
Proof. Apply the above theorem for the canonical map π : M → M/N with
ker(π) = N.
Theorem 18.5.1. Let M be a free R-module of finite rank over the principal ideal do-
main R. Then each submodule U is free of finite rank, and rank(U) ≤ rank(M).
a = {β ∈ R : βx1 + β2 x2 + ⋅ ⋅ ⋅ + βn xn ∈ U for some β2 , . . . , βn ∈ R}.
Therefore, first γα1 x1 = 0; that is, γ = 0, because R has no zero divisor ≠ 0, and fur-
thermore, μ2 = ⋅ ⋅ ⋅ = μn = 0. That means, μ1 = ⋅ ⋅ ⋅ = μt = 0.
Ann(x) = {α ∈ R : αx = 0} ⊲ R, an ideal in R,
hence Ann(x) = (δx ). If x = 0, then (δx ) = R. δx is called the order of x and (δx ) the order ideal of x. δx is uniquely determined up to units in R (that is, up to elements η with ηη′ = 1 for some η′ ∈ R). For a submodule U of M, we call Ann(U) = ⋂u∈U (δu ) = (μ) the order ideal of U.
In an abelian group G, considered as a ℤ-module, this order for elements corre-
sponds exactly to the order as group elements if we choose δx ≥ 0 for x ∈ G.
Proof. Let M = ⟨x1 , . . . , xn ⟩ be torsion-free and R a principal ideal domain. Each submodule ⟨xi ⟩ = Rxi is free, because M is torsion-free. We call a subset S ⊂ ⟨x1 , . . . , xn ⟩ free if
the submodule ⟨S⟩ is free. Since ⟨xi ⟩ is free, there exist such nonempty subsets. Under
all free subsets S ⊂ ⟨x1 , . . . , xn ⟩, we choose one with a maximal number of elements.
We may assume that {x1 , . . . , xs }, 1 ≤ s ≤ n, is such a maximal set—after possible re-
naming. If s = n, then the theorem holds. Now, let s < n. By the choice of s, the sets
{x1 , . . . , xs , xj } with s < j ≤ n are not free. Hence, there are αj ∈ R, and αi ∈ R, not all 0,
with
αj xj = α1 x1 + ⋅ ⋅ ⋅ + αs xs , αj ≠ 0, s < j ≤ n.
Proof. M/T(M) is a finitely generated, torsion-free R-module, and hence free. By Corol-
lary 18.4.9, we have M = T(M) ⊕ F, F ≅ M/T(M).
From now on, we are interested in the case where M ≠ {0} is a torsion R-module;
that is, M = T(M). Let R be a principal ideal domain and M = T(M) an R-module.
Let M ≠ {0} and finitely generated. As above, let δx be the order of x ∈ M, unique
up to units in R, and let (δx ) = {α ∈ R : αx = 0} be the order ideal of x. Let (μ) =
⋂x∈M (δx ) be the order ideal of M. Since (μ) ⊂ (δx ), we have δx | μ for all x ∈ M. Since principal ideal domains are unique factorization domains, if μ ≠ 0, then there can only be finitely many essentially different orders (that means, different up to units). Since M ≠ {0} is finitely generated, we have in any case μ ≠ 0, because if M = ⟨x1 , . . . , xn ⟩ and αi xi = 0 with αi ≠ 0, then αM = {0} for α := α1 ⋅ ⋅ ⋅ αn ≠ 0.
Lemma 18.5.4. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M).
(1) If the orders δx and δy of x, y ∈ M are relatively prime; that is, gcd(δx , δy ) = 1, then
(δx+y ) = (δx δy ).
(2) Let δz be the order of z ∈ M, z ≠ 0. If δz = αβ with gcd(α, β) = 1, then there exist
x, y ∈ M with z = x + y and (δx ) = (α), (δy ) = (β).
Since αx = ασβz = σδz z = 0, we get α ∈ (δz ); that means, δx |α. On the other hand,
from 0 = δx x = σβδx z, we get δz |σβδx , and hence αβ|σβδx , because δz = αβ. Therefore,
α|σδx . From gcd(α, σ) = 1, we get α|δx . Therefore, α is associated to δx ; that is α = δx ϵ
with ϵ a unit in R, and furthermore, (α) = (δx ). Analogously, (β) = (δy ).
In Lemma 18.5.4, we do not need M = T(M). We only need x, y, z ∈ M with δx ≠ 0,
δy ≠ 0 and δz ≠ 0, respectively.
Corollary 18.5.5. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M).
1. Let x1 , . . . , xn ∈ M be pairwise different with pairwise relatively prime orders δxi = αi . Then y = x1 + ⋅ ⋅ ⋅ + xn has order α := α1 ⋅ ⋅ ⋅ αn .
2. Let 0 ≠ x ∈ M and δx = ϵ π1^k1 ⋅ ⋅ ⋅ πn^kn be a prime decomposition of the order δx of x (ϵ a unit in R and the πi pairwise nonassociate prime elements in R), where n > 0, ki > 0. Then there exist xi , i = 1, . . . , n, with δxi associated with πi^ki and x = x1 + ⋅ ⋅ ⋅ + xn .
This is exercise 7.
Theorem 18.6.1 (Theorem 10.4.1, basis theorem for finite abelian groups). Let G be a
finite abelian group. Then G is a direct product of cyclic groups of prime power order.
This allowed us, for a given finite order n, to present a complete classification of
abelian groups of order n. In this section, we extend this result to general modules
over principal ideal domains. As a consequence, we obtain the fundamental decom-
position theorem for finitely generated (not necessarily finite) abelian groups, which
finally proves Theorem 10.4.1. In the next chapter, we present a separate proof of this
in a slightly different format.
Theorem 18.6.3. Let R be a principal ideal domain and M ≠ {0} be an R-module with
M = T(M). Then M is the direct sum of its π-primary components.
Proof. Each x ∈ M has finite order δx . Let δx = ϵ π1^k1 ⋅ ⋅ ⋅ πn^kn be a prime decomposition of δx . By Corollary 18.5.5, we have that x = ∑ xi with xi ∈ Mπi . That means, M = ∑π∈P Mπ , where P is the set of the prime elements of R. Let y ∈ Mπ ∩ ∑σ∈P,σ≠π Mσ ; that is, δy = π^k for some k ≥ 0 and y = ∑ xi with xi ∈ Mσi . That means, δxi = σi^li for some li ≥ 0. By Corollary 18.5.5, we get that y has the order ∏σi≠π σi^li ; that means, π^k is associated to ∏σi≠π σi^li . Therefore, k = li = 0 for all i, and the sum is direct.
Corollary 18.6.4. Let R be a principal ideal domain and {0} ≠ M be a finitely gener-
ated torsion R-module. Then M has only finitely many nontrivial primary components
Mπ1 , . . . , Mπn , and we have
M = Mπ1 ⊕ ⋅ ⋅ ⋅ ⊕ Mπn .
Theorem 18.6.5. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0} an R-module with π^k M = {0}; furthermore, let m ∈ M with (δm ) = (π^k). Then there exists a submodule N ⊂ M with M = Rm ⊕ N.
αx = ρm + n. (⋆)
In particular, αx ∈ M′.
Now let α = ϵπ1 ⋅ ⋅ ⋅ πr be a prime decomposition. We consider one after the other the elements x, πr x, πr−1 πr x, . . . , ϵπ1 ⋅ ⋅ ⋅ πr x = αx. We have x ∉ M′, but αx ∈ M′; hence, there exists a y ∉ M′ with πi y ∈ N + Rm.
1. πi ≠ π, π the prime element in the statement of the theorem. Then gcd(πi , π^k) = 1; hence, there are σ, σ′ ∈ R with σπi + σ′π^k = 1, and we get Rm = (Rπi + Rπ^k)m = πi Rm, because π^k m = 0. Therefore, πi y ∈ M′ = N ⊕ Rm = N + πi Rm.
2. πi = π. Then we write πy as πy = n + λm with n ∈ N and λ ∈ R. This is possible, because πy ∈ M′. Since π^k M = {0}, we get 0 = π^(k−1) ⋅ πy = π^(k−1) n + π^(k−1) λm. Therefore, π^(k−1) n = π^(k−1) λm = 0, because N ∩ Rm = {0}. In particular, we get π^(k−1) λ ∈ (δm ); that is, π^k | π^(k−1) λ, and hence π | λ. Therefore, πy = n + λm = n + πλ′m ∈ N + πRm, λ′ ∈ R.
πi (y − z) = n. (⋆⋆)
Theorem 18.6.6. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0}
a finitely generated π-primary R-module. Then there exist finitely many m1 , . . . , ms ∈ M
with M = ⨁si=1 Rmi .
Since Rmi ≅ R/ Ann(mi ), and Ann(mi ) = (δmi ) = (π^ki), we get the following extension of Theorem 18.6.6:
Theorem 18.6.7. Let R be a principal ideal domain, π ∈ R a prime element, and M ≠ {0}
a finitely generated π-primary R-module. Then there exist finitely many k1 , . . . , ks ∈ ℕ
with
M ≅ R/(π^k1) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π^ks), where we may assume k1 ≥ k2 ≥ ⋅ ⋅ ⋅ ≥ ks ,
Proof. The first part, that is, a description as M ≅ R/(π^k1) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π^ks), follows directly from Theorem 18.6.6. Now, let
M ≅ R/(π^k1) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π^kn) ≅ R/(π^l1) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π^lm).
N ≅ R/(π) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π) (n summands) and, analogously, N ≅ R/(π) ⊕ ⋅ ⋅ ⋅ ⊕ R/(π) (m summands),
we get
n = dimR/(π) N = m. (⋆⋆⋆)
Assume that there is an i with ki < li or li < ki . Without loss of generality, assume that
there is an i with ki < li .
Let j be the smallest index, for which kj < lj . Then (because of the ordering of
the ki )
M′ := π^kj M ≅ ⨁i=1,...,n π^kj R/π^ki R ≅ ⨁i=1,...,j−1 π^kj R/π^ki R,
Theorem 18.6.8 (Fundamental theorem for finitely generated modules over principal ideal domains). Let R be a principal ideal domain and M ≠ {0} be a finitely generated (unitary) R-module. Then there exist prime elements π1 , . . . , πr ∈ R, 0 ≤ r < ∞, and numbers k1 , . . . , kr ∈ ℕ, t ∈ ℕ0 , such that
M ≅ R/(π1^k1) ⊕ R/(π2^k2) ⊕ ⋅ ⋅ ⋅ ⊕ R/(πr^kr) ⊕ R ⊕ ⋅ ⋅ ⋅ ⊕ R (t times),
and M is, up to isomorphism, uniquely determined by (π1^k1, . . . , πr^kr, t).
The prime elements πi are not necessarily pairwise different (up to units in R); that
means, it can be πi = ϵπj for i ≠ j, where ϵ is a unit in R.
Proof. The proof is a combination of the preceding results. The free part of M is isomorphic to M/T(M), and the rank of M/T(M), which we call here t, is uniquely determined, because two bases of M/T(M) have the same cardinality. Therefore, we may restrict ourselves to torsion modules. Here, we have a reduction to π-primary modules, because in a decomposition M = ⨁i R/(πi^ki), the π-primary component of M is Mπ = ⨁πi=π R/(πi^ki) (an isomorphism certainly maps a π-primary component onto a π-primary component). Therefore, it is only necessary, now, to consider π-primary modules M.
The uniqueness statement follows from Theorem 18.6.7. As a consequence of Theorem 18.6.8, we obtain:
Theorem 18.6.9 (Fundamental theorem for finitely generated abelian groups). Let {0} ≠ G = (G, +) be a finitely generated abelian group. Then there exist prime numbers p1 , . . . , pr and numbers k1 , . . . , kr ∈ ℕ, t ∈ ℕ0 , such that
G ≅ ℤ/(p1^k1 ℤ) ⊕ ⋅ ⋅ ⋅ ⊕ ℤ/(pr^kr ℤ) ⊕ ℤ ⊕ ⋅ ⋅ ⋅ ⊕ ℤ (t times),
and G is, up to isomorphism, uniquely determined by (p1^k1, . . . , pr^kr, t).
18.7 Exercises
1. Let M and N be isomorphic modules over a commutative ring R. Show that EndR (M) and EndR (N) are isomorphic rings. (EndR (M) is the set of all R-module endomorphisms of M.)
2. Let R be an integral domain and M an R-module with M = Tor(M) (torsion mod-
ule). Show that HomR (M, R) = 0. (HomR (M, R) is the set of all R-module homo-
morphisms from M to R.)
3. Prove the isomorphism theorems for modules (1), (2), and (3) in Theorem 18.1.11
in detail.
4. Let M, M′, N be R-modules, R a commutative ring. Show the following:
(i) HomR (M ⊕ M′, N) ≅ HomR (M, N) × HomR (M′, N);
(ii) HomR (N, M × M′) ≅ HomR (N, M) ⊕ HomR (N, M′).
5. Show that two free R-modules having bases of equal cardinality are isomorphic.
6. Let M be a unitary R-module (R a commutative ring), and let {m1 , . . . , ms } be a
finite subset of M. Show that the following are equivalent:
(i) {m1 , . . . , ms } generates M freely.
(ii) {m1 , . . . , ms } is linearly independent and generates M.
(iii) Every element m ∈ M is uniquely expressible in the form m = r1 m1 + ⋅ ⋅ ⋅ + rs ms with ri ∈ R.
(iv) Each Rmi is torsion-free, and M = Rm1 ⊕ ⋅ ⋅ ⋅ ⊕ Rms .
7. Let R be a principal ideal domain and M ≠ {0} be an R-module with M = T(M).
(i) Let x1 , . . . , xn ∈ M be pairwise different with pairwise relatively prime orders δxi = αi . Show that y = x1 + ⋅ ⋅ ⋅ + xn has order α := α1 ⋅ ⋅ ⋅ αn .
(ii) Let 0 ≠ x ∈ M and δx = ϵ π1^k1 ⋅ ⋅ ⋅ πn^kn be a prime decomposition of the order δx of x (ϵ a unit in R and the πi pairwise nonassociate prime elements in R), where n > 0, ki > 0. Show that there exist xi , i = 1, . . . , n, with δxi associated with πi^ki and x = x1 + ⋅ ⋅ ⋅ + xn .
Theorem 19.1.1 (Theorem 10.4.1, basis theorem for finite abelian groups). Let G be a
finite abelian group. Then G is a direct product of cyclic groups of prime power order.
We review two examples that show how this theorem leads to the classification of
finite abelian groups. In particular, this theorem allows us, for a given finite order n,
to present a complete classification of abelian groups of order n.
Since all cyclic groups of order n are isomorphic to (ℤn , +), ℤn = ℤ/nℤ, we will
denote a cyclic group of order n by ℤn .
Example 19.1.2. Classify all abelian groups of order 60. Let G be an abelian group of
order 60. From Theorem 10.4.1, G must be a direct product of cyclic groups of prime
power order. Now 60 = 22 ⋅ 3 ⋅ 5, so the only primes involved are 2, 3, and 5. Hence, the
cyclic groups involved in the direct product decomposition of G have order either 2, 4,
3, or 5 (by Lagrange’s theorem they must be divisors of 60). Therefore, G must be of
the form
G ≅ ℤ4 × ℤ3 × ℤ5 ,
or
G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ5 .
Hence, up to isomorphism, there are only two abelian groups of order 60.
Example 19.1.3. Classify all abelian groups of order 180. Let G be an abelian group of
order 180. Now 180 = 22 ⋅ 32 ⋅ 5, so the only primes involved are 2, 3, and 5. Hence, the
cyclic groups involved in the direct product decomposition of G have order either 2, 4,
3, 9, or 5 (by Lagrange’s theorem they must be divisors of 180). Therefore, G must be
of the form
G ≅ ℤ4 × ℤ9 × ℤ5 ,
G ≅ ℤ2 × ℤ2 × ℤ9 × ℤ5 ,
G ≅ ℤ4 × ℤ3 × ℤ3 × ℤ5 , or
G ≅ ℤ2 × ℤ2 × ℤ3 × ℤ3 × ℤ5 .
Hence, up to isomorphism, there are exactly four abelian groups of order 180.
https://doi.org/10.1515/9783110603996-019
The proof of Theorem 19.1.1 involves the lemmas that follow. We refer back to Chap-
ter 10 or Chapter 18 for the proofs. Notice how these lemmas mirror the results for
finitely generated modules over principal ideal domains considered in the last chap-
ter.
Lemma 19.1.4. Let G be a finite abelian group, and let p | |G|, where p is a prime. Then
all the elements of G whose orders are a power of p form a normal subgroup of G. This
subgroup is called the p-primary component of G, which we will denote by Gp .
Lemma 19.1.5. Let G be a finite abelian group of order n. Suppose that n = p1^e1 ⋅ ⋅ ⋅ pk^ek
with p1 , . . . , pk distinct primes. Then
G ≅ Gp1 × ⋅ ⋅ ⋅ × Gpk .
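For the additive groups ℤn, the primary components of Lemmas 19.1.4 and 19.1.5 can be computed by brute force. A small sketch (our own illustration, not from the text):

```python
from math import gcd

def primary_component(n, p):
    """The p-primary component of (Z_n, +): all x whose additive order
    n // gcd(n, x) is a power of the prime p (Lemma 19.1.4)."""
    def is_power_of_p(m):
        while m % p == 0:
            m //= p
        return m == 1
    return sorted(x for x in range(n) if is_power_of_p(n // gcd(n, x)))
```

For n = 12, the 2-primary component is {0, 3, 6, 9} ≅ ℤ4 and the 3-primary component is {0, 4, 8} ≅ ℤ3; their orders multiply to 12, in line with Lemma 19.1.5.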
Theorem 19.1.6 (Basis theorem for finite abelian groups). Let G be a finite abelian
group. Then G is a direct product of cyclic groups of prime power order.
Theorem 19.2.1 (Fundamental theorem for finitely generated modules over principal
ideal domains). Let R be a principal ideal domain and M ≠ {0} be a finitely generated
(unitary) R-module. Then there exist prime elements π1 , . . . , πr ∈ R, 0 ≤ r < ∞ and
numbers k1 , . . . , kr ∈ ℕ, t ∈ ℕ0 , such that
M ≅ R/(π1^k1 ) ⊕ R/(π2^k2 ) ⊕ ⋅ ⋅ ⋅ ⊕ R/(πr^kr ) ⊕ R ⊕ ⋅ ⋅ ⋅ ⊕ R (t times),
and M is, up to isomorphism, uniquely determined by (π1^k1 , . . . , πr^kr , t).
The prime elements πi are not necessarily pairwise different (up to units in R); that
is, it can happen that πi = ϵπj for i ≠ j, where ϵ is a unit in R.
Since abelian groups can be considered as ℤ-modules, and ℤ is a principal ideal
domain, we get the following corollary, which is extremely important in its own right:
Theorem 19.2.2 (Fundamental theorem for finitely generated abelian groups). Let {0}
≠ G = (G, +) be a finitely generated abelian group. Then there exist prime numbers
p1 , . . . , pr , 0 ≤ r < ∞, and numbers k1 , . . . , kr ∈ ℕ, t ∈ ℕ0 , such that
G ≅ ℤp1^k1 × ⋅ ⋅ ⋅ × ℤpr^kr × ℤ^t ,
and G is, up to isomorphism, uniquely determined by (p1^k1 , . . . , pr^kr , t).
Notice that the number t of infinite components is unique. This is called the rank
or Betti number of the abelian group G. This number plays an important role in the
study of homology and cohomology groups in topology.
If G = ℤ × ℤ × ⋅ ⋅ ⋅ × ℤ = ℤ^r for some r, we call G a free abelian group of rank r. Notice
that if an abelian group G is torsion-free, then the p-primary components are just
the identity. It follows that, in this case, G is a free abelian group of finite rank. Again,
using module theory, it follows that subgroups of this must also be free abelian and
of smaller or equal rank. Notice the distinction between free abelian groups and abso-
lutely free groups (see Chapter 14). In the free group case, a nonabelian free group of
finite rank contains free subgroups of all possible countable ranks. In the free abelian
case, however, the subgroups have smaller or equal rank. We summarize these com-
ments as follows:
Theorem 19.2.3. Let G ≠ {0} be a finitely generated torsion-free abelian group. Then G
is a free abelian group of finite rank r; that is, G ≅ ℤ^r . Furthermore, if H is a subgroup
of G, then H is also free abelian and the rank of H is smaller than or equal to the rank
of G.
+ : G × G → G, (x, y) → x + y.
We also write ng instead of g^n , and use 0 as the symbol for the identity element in G;
that is, 0 + g = g for all g ∈ G. That G = ⟨g1 , . . . , gt ⟩, 0 ≤ t < ∞, that is, that G is
(finitely) generated by g1 , . . . , gt , is equivalent to the fact that each g ∈ G can be written
in the form g = n1 g1 + n2 g2 + ⋅ ⋅ ⋅ + nt gt , ni ∈ ℤ. A relation between the gi with
coefficients n1 , . . . , nt is an equation of the form n1 g1 + ⋅ ⋅ ⋅ + nt gt = 0. A relation is
called nontrivial if ni ≠ 0 for at least one i. A system R of relations in G is called a
system of defining relations if each relation in G is a consequence of R. The elements
g1 , . . . , gt are called integrally linearly independent if there are no nontrivial relations
between them. A finite generating system {g1 , . . . , gt } of G is called a minimal generating
system if there is no generating system with t − 1 elements.
Certainly, each finitely generated group has a minimal generating system. In what
follows, we always assume that our finitely generated abelian group G is unequal
to {0}; that is, G is nontrivial.
As above, we may consider G as a finitely generated ℤ-module, and in this sense,
the subgroups of G are precisely the submodules. Hence, it is clear what we mean if
we call G a direct product G = U1 × ⋅ ⋅ ⋅ × Us of its subgroups U1 , . . . , Us ; namely, each
g ∈ G can be written as g = u1 + u2 + ⋅ ⋅ ⋅ + us with ui ∈ Ui and
Ui ∩ ( ∏_{j=1, j≠i}^{s} Uj ) = {0}.
To emphasize the small difference between abelian groups and ℤ-modules, we use
here the notation “direct product” instead of “direct sum”. Considered as ℤ-modules,
for finite index sets I = {1, . . . , s}, we have in any case
∏_{i=1}^{s} Ui = ⨁_{i=1}^{s} Ui .
Theorem 19.3.1 (Basis theorem for finitely generated abelian groups). Let G ≠ {0} be
a finitely generated abelian group. Then G is a direct product
G ≅ Zk1 × ⋅ ⋅ ⋅ × Zkr × U1 × ⋅ ⋅ ⋅ × Us ,
where the Zki are finite cyclic groups of order ki ≥ 2 with ki |ki+1 for i = 1, . . . , r − 1,
and the U1 , . . . , Us are infinite cyclic; the numbers r, s, and k1 , . . . , kr are uniquely
determined by G.
Lemma 19.3.3. Let G be a finitely generated abelian group. Among all nontrivial rela-
tions between elements of minimal generating systems of G, we choose one relation,
m1 g1 + ⋅ ⋅ ⋅ + mt gt = 0 (⋆)
with smallest possible positive coefficient, and let this smallest coefficient be m1 . Let
n1 g1 + ⋅ ⋅ ⋅ + nt gt = 0 (⋆⋆)
be an arbitrary relation for the same generating system {g1 , . . . , gt }. Then m1 |n1 and
m1 |mi for i = 1, . . . , t.
Proof. (1) Assume m1 ∤ n1 . Then n1 = qm1 + m1′ with 0 < m1′ < m1 . If we multiply the
relation (⋆) with q and subtract the resulting relation from the relation (⋆⋆), then we
get a relation with a coefficient m1′ < m1 , which contradicts the choice of m1 . Hence,
m1 |n1 .
(2) Assume m1 ∤ m2 . Then m2 = qm1 + m2′ with 0 < m2′ < m1 . {g1 + qg2 , g2 , . . . , gt } is
a minimal generating system, which satisfies the relation m1 (g1 + qg2 ) + m2′ g2 + m3 g3 +
⋅ ⋅ ⋅ + mt gt = 0, and this relation has a coefficient m2′ < m1 . This again contradicts the
choice of m1 . Hence, m1 |m2 , and furthermore, m1 |mi for i = 1, . . . , t.
Lemma 19.3.4 (Invariant characterization of kr for finite abelian groups G). Let G =
Zk1 × ⋅ ⋅ ⋅ × Zkr and Zki finite cyclic of order ki ≥ 2, i = 1, . . . , r, with ki |ki+1 for i = 1, . . . , r − 1.
Then kr is the smallest natural number n such that ng = 0 for all g ∈ G. kr is called the
exponent or the maximal order of G.
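The invariant characterization of kr can be confirmed by brute force for small groups. The sketch below (our own, with hypothetical names) searches for the smallest n with ng = 0 for all g and compares it with the last invariant factor:

```python
from math import lcm

def exponent(orders):
    """Smallest n >= 1 with n*g = 0 for every g in Z_k1 x ... x Z_kr,
    found by brute force over all components and all residues."""
    n = 1
    while any(x * n % k != 0 for k in orders for x in range(k)):
        n += 1
    return n

# When k_i | k_{i+1}, the exponent is lcm(k_1, ..., k_r) = k_r:
assert exponent([2, 4, 12]) == lcm(2, 4, 12) == 12
```

The brute-force search and the lcm agree for any list of orders; the divisibility chain ki | ki+1 is what collapses the lcm to the single value kr of Lemma 19.3.4.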
∑_{i=1}^{s+1} xi ( ∑_{j=1}^{s} nij bj ) = ∑_{j=1}^{s} ( ∑_{i=1}^{s+1} nij xi ) bj = 0.
The system ∑_{i=1}^{s+1} nij xi = 0, j = 1, . . . , s, of linear equations has at least one nontriv-
ial rational solution (x1 , . . . , xs+1 ), because we have more unknowns than equations.
Multiplication with the common denominator gives a nontrivial integral solution
(x1 , . . . , xs+1 ) ∈ ℤ^(s+1) . For this solution, we get
∑_{i=1}^{s+1} xi gi = 0.
Case 2: The coefficients mij arbitrary. Let k ≠ 0 be a common multiple of the orders kj
of the cyclic groups Zkj , j = 1, . . . , r. Then
kgi = mi1 ka1 + ⋅ ⋅ ⋅ + mir kar + ni1 kb1 + ⋅ ⋅ ⋅ + nis kbs = ni1 kb1 + ⋅ ⋅ ⋅ + nis kbs ,
since ka1 = ⋅ ⋅ ⋅ = kar = 0, for i = 1, . . . , s + 1. By case 1, the kg1 , . . . , kgs+1 are integrally
linearly dependent; that is, we have integers x1 , . . . , xs+1 , not all 0, with
∑_{i=1}^{s+1} xi (kgi ) = 0 = ∑_{i=1}^{s+1} (xi k)gi , and the xi k are not all 0. Hence, also
g1 , . . . , gs+1 are integrally linearly dependent.
Lemma 19.3.6. Let G := Zk1 × ⋅ ⋅ ⋅ × Zkr ≅ Zk′1 × ⋅ ⋅ ⋅ × Zk′r′ =: G′ , the Zki , Zk′j cyclic groups
of orders ki ≠ 1 and k′j ≠ 1, respectively, and ki |ki+1 for i = 1, . . . , r − 1 and k′j |k′j+1 for
j = 1, . . . , r′ − 1. Then r = r′ , and k1 = k′1 , k2 = k′2 , . . . , kr = k′r .
Proof. We prove this lemma by induction on the group order |G| = |G′ |. Certainly,
Lemma 19.3.6 holds if |G| ≤ 2, because then, either G = {0}, and here r = r′ = 0, or
G ≅ ℤ2 , and here r = r′ = 1. Now let |G| > 2. Then, in particular, r ≥ 1. Inductively, we
assume that Lemma 19.3.6 holds for all finite abelian groups of order less than |G|. By
Lemma 19.3.4, the number kr is invariantly characterized; that is, from G ≅ G′ follows
kr = k′r′ , and especially Zkr ≅ Zk′r′ . Then G/Zkr ≅ G′ /Zk′r′ ; that is, Zk1 × ⋅ ⋅ ⋅ × Zkr−1 ≅
Zk′1 × ⋅ ⋅ ⋅ × Zk′r′−1 . Inductively, r − 1 = r′ − 1; that is, r = r′ , and k1 = k′1 , . . . , kr−1 = k′r−1 .
We can now present the main result, which we state again, and its proof.
Theorem 19.3.7 (Basis theorem for finitely generated abelian groups). Let G ≠ {0} be
a finitely generated abelian group. Then G is a direct product
G ≅ Zk1 × ⋅ ⋅ ⋅ × Zkr × U1 × ⋅ ⋅ ⋅ × Us , r ≥ 0, s ≥ 0,
where the Zki are finite cyclic groups of order ki ≥ 2 with ki |ki+1 for i = 1, . . . , r − 1,
and the U1 , . . . , Us are infinite cyclic; the numbers r, s, and k1 , . . . , kr are uniquely
determined by G.
Proof. (a) We first prove the existence of the given decomposition. Let G ≠ {0} be a
finitely generated abelian group. Let t, 0 < t < ∞, be the number of elements in a
minimal generating system {g1 , . . . , gt } of G.
m1 g1 + ⋅ ⋅ ⋅ + mt gt = 0 (⋆)
m1 h1 + k2 h2 = 0, where m1 h1 = 0 and k2 h2 = 0,
since k2 ≠ 0. Again m1 |k2 by Lemma 19.3.3. This gives the desired decomposition.
(b) We now prove the uniqueness statement.
Case 1: G is finite abelian. Then the claim follows from Lemma 19.3.6.
Case 2: G is arbitrary finitely generated and abelian. Let T := {x ∈ G : |x| < ∞};
that is, the set of elements of G of finite order. Since G is abelian, T is a subgroup of G,
the so-called torsion subgroup of G. If, as above, G = Zk1 × ⋅ ⋅ ⋅ × Zkr × U1 × ⋅ ⋅ ⋅ × Us , then
T = Zk1 × ⋅ ⋅ ⋅ × Zkr , because an element b1 + ⋅ ⋅ ⋅ + br + c1 + ⋅ ⋅ ⋅ + cs with bi ∈ Zki , cj ∈ Uj
has finite order if and only if all cj = 0. That means Zk1 × ⋅ ⋅ ⋅ × Zkr is independent
of the particular decomposition and uniquely determined by G; hence, so are the numbers
r, k1 , . . . , kr by Lemma 19.3.6. Finally, the number s, the rank of G, is uniquely determined
by Lemma 19.3.5. This proves the basis theorem for finitely generated abelian
groups.
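The two normal forms met in this proof, prime-power factors on the one hand and invariant factors k1 | k2 | ⋅ ⋅ ⋅ | kr on the other, can be converted into each other mechanically. A sketch of one direction (our own helper, not from the text):

```python
from collections import defaultdict
from math import prod

def invariant_factors(prime_powers):
    """Turn a list of prime-power orders into the invariant factors
    k_1 | k_2 | ... | k_r of the corresponding finite abelian group."""
    by_prime = defaultdict(list)
    for q in prime_powers:
        p = next(d for d in range(2, q + 1) if q % d == 0)  # the prime of q
        by_prime[p].append(q)
    r = max(len(powers) for powers in by_prime.values())
    # align the prime powers on the right, padding with trivial factors 1
    padded = [[1] * (r - len(powers)) + sorted(powers)
              for powers in by_prime.values()]
    return [prod(col[i] for col in padded) for i in range(r)]

# Z_2 x Z_2 x Z_3 x Z_3 x Z_5 has invariant factors 6 | 30:
assert invariant_factors([2, 2, 3, 3, 5]) == [6, 30]
```

Multiplying the largest prime power of each prime gives kr, the next largest give kr−1, and so on, which is why the resulting list automatically satisfies the divisibility chain.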
Theorem 19.3.8. Let {0} ≠ G = (G, +) be a finitely generated abelian group. Then there
exist prime numbers p1 , . . . , pr , 0 ≤ r < ∞, and numbers k1 , . . . , kr ∈ ℕ, t ∈ ℕ0 such that
G ≅ ℤp1^k1 × ⋅ ⋅ ⋅ × ℤpr^kr × ℤ^t ,
and G is, up to isomorphism, uniquely determined by (p1^k1 , . . . , pr^kr , t).
Proof. For the existence, we only have to show that ℤmn ≅ ℤm ×ℤn if gcd(m, n) = 1. For
this, we write Un = ⟨m + mnℤ⟩ < ℤmn , Um = ⟨n + nmℤ⟩ < ℤmn , and Un ∩ Um = {mnℤ},
because gcd(m, n) = 1. Furthermore, there are h, k ∈ ℤ with 1 = hm + kn. Hence,
l + mnℤ = (hlm + mnℤ) + (kln + mnℤ), and therefore ℤmn = Un × Um ≅ ℤn × ℤm .
For the uniqueness statement, we may reduce the problem to the case |G| = p^k for a
prime number p and k ∈ ℕ. But here the result follows directly from Lemma 19.3.6.
From this proof, we automatically get the Chinese remainder theorem for the case
ℤn = ℤ/nℤ.
Proof. By Theorem 19.3.1, we get that π is an additive group isomorphism, which can
be extended directly to a ring isomorphism via (a + mℤ)(b + mℤ) → (ab + m1 ℤ,
. . . , ab + mr ℤ). The remaining statements are now obvious.
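Bezout coefficients like the h, k with 1 = hm + kn used above also make the inverse of the isomorphism computable. A minimal sketch for pairwise coprime moduli (names are ours, not from the text):

```python
def crt_inverse(residues, moduli):
    """Recover l mod (m1*...*mr) from its residues modulo the pairwise
    coprime m1, ..., mr (Chinese remainder theorem)."""
    M = 1
    for m in moduli:
        M *= m
    l = 0
    for r, m in zip(residues, moduli):
        N = M // m
        # pow(N, -1, m) is the inverse of N mod m; it exists since gcd(N, m) = 1
        l += r * N * pow(N, -1, m)
    return l % M

assert crt_inverse([2, 3], [3, 5]) == 8  # 8 = 2 mod 3 and 8 = 3 mod 5
```

Each summand r · N · N^(−1) is congruent to r modulo its own m and to 0 modulo every other modulus, which is exactly the direct-product decomposition ℤmn ≅ ℤm × ℤn in computational form.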
for some k with 1 ≤ k ≤ m. On the other hand, each partition (m1 , . . . , mk ) of m gives an
abelian group of order p^m , namely ℤp^m1 × ⋅ ⋅ ⋅ × ℤp^mk . Theorem 19.2.2 shows that different
partitions give nonisomorphic groups. If we define p(m) to be the number of partitions
of m, then we get the following: A(p^m ) = p(m), and A(p1^k1 ⋅ ⋅ ⋅ pr^kr ) = p(k1 ) ⋅ ⋅ ⋅ p(kr ).
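The partition numbers p(m), and with them the counts A(n), are easy to compute recursively. A short sketch (our own, using the standard largest-part recursion):

```python
from functools import lru_cache
from math import prod

@lru_cache(maxsize=None)
def num_partitions(m, largest=None):
    """Number of partitions of m into parts of size at most `largest`."""
    if largest is None:
        largest = m
    if m == 0:
        return 1
    return sum(num_partitions(m - first, first)
               for first in range(1, min(m, largest) + 1))

def num_abelian_groups(exponents):
    """A(p1^k1 * ... * pr^kr) = p(k1) * ... * p(kr)."""
    return prod(num_partitions(k) for k in exponents)

assert num_partitions(4) == 5               # 4, 3+1, 2+2, 2+1+1, 1+1+1+1
assert num_abelian_groups((2, 2, 1)) == 4   # e.g. |G| = 180 = 2^2 * 3^2 * 5
```

The exponent tuples (2, 1, 1) for 60 and (2, 2, 1) for 180 reproduce the counts 2 and 4 from Examples 19.1.2 and 19.1.3.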
19.4 Exercises
1. Let H be a finitely generated abelian group, which is the homomorphic image of a
torsion-free abelian group of finite rank n. Show that H is the direct sum of ≤ n
cyclic groups.
2. Determine (up to isomorphism) all groups of order p2 (p prime) and all abelian
groups of order ≤ 15.
3. Let G be an abelian group with generating elements a1 , . . . , a4 and defining rela-
tions
5. Let p be a prime and G a finite abelian p-group; that is, the order of all elements
of G is finite and a power of p. Show that G is cyclic if G has exactly one subgroup
of order p. Is the statement still correct if G is not abelian?
is transcendental.
In this section, we examine a special type of algebraic number called an algebraic
integer. These are the algebraic numbers that are zeros of monic integral polynomials.
The set of all such algebraic integers forms a subring of ℂ. The proofs in this section
can be found in [43].
After we do this, we extend the concept of an algebraic integer to a general con-
text and define integral ring extensions. We then consider field extensions that are
nonalgebraic—transcendental field extensions. Finally, we will prove that the famil-
iar numbers e and π are transcendental.
Definition 20.1.1. An algebraic integer is a complex number α that is a zero of a monic
integral polynomial. That is, α ∈ ℂ is an algebraic integer if there exists f (x) ∈ ℤ[x]
with f (x) = x^n + bn−1 x^(n−1) + ⋅ ⋅ ⋅ + b0 , bi ∈ ℤ, n ≥ 1, and f (α) = 0.
https://doi.org/10.1515/9783110603996-020
To prove the converse of this lemma, we need the concept of a primitive integral
polynomial. This is a polynomial p(x) ∈ ℤ[x] such that the GCD of all its coefficients
is 1. The following can be proved (see exercises or Chapter 4):
(1) If f (x) and g(x) are primitive, then so is f (x)g(x).
(2) If f (x) ∈ ℤ[x] is monic, then it is primitive.
(3) If f (x) ∈ ℚ[x], then there exists a rational number c such that f (x) = cf1 (x) with
f1 (x) primitive.
Now suppose f (x) ∈ ℤ[x] is a monic polynomial with f (α) = 0. Let p(x) = mα (x). Then
p(x) divides f (x) so f (x) = p(x)q(x).
Let p(x) = c1 p1 (x) with p1 (x) primitive, and let q(x) = c2 q1 (x) with q1 (x) primitive.
Then
Lemma 20.1.4. If α is an algebraic integer and also rational, then it is a rational integer.
We saw that the set 𝒜 of all algebraic numbers is a subfield of ℂ. In the same
manner, the set ℐ of all algebraic integers forms a subring of 𝒜. First, we give an
extension of the following result on algebraic numbers.
Lemma 20.1.6. Suppose α1 , . . . , αn form the set of conjugates over ℚ of an algebraic in-
teger α. Then any integral symmetric function of α1 , . . . , αn is a rational integer.
We note that 𝒜, the field of algebraic numbers, is precisely the quotient field of
the ring of algebraic integers.
An algebraic number field is a finite extension of ℚ within ℂ. Since any finite ex-
tension of ℚ is a simple extension, each algebraic number field has the form K = ℚ(θ)
for some algebraic number θ.
Theorem 20.1.8. Let K be an algebraic number field and RK its ring of integers. Then
each α ∈ RK is either 0, a unit, or can be factored into a product of primes.
We stress again that the prime factorization need not be unique. However, from
the existence of a prime factorization, we can extend Euclid’s original proof of the
infinitude of primes (see [43]) to obtain the following:
Corollary 20.1.9. There exist infinitely many primes in RK for any algebraic number ring
RK .
Just as any algebraic number field is finite dimensional over ℚ, we will see that
each RK is of finite degree over ℚ. That is, if K has degree n over ℚ, we show that there
exists ω1 , . . . , ωn in RK such that each α ∈ RK is expressible as
α = m1 ω1 + ⋅ ⋅ ⋅ + mn ωn ,
where m1 , . . . , mn ∈ ℤ.
α = m1 ω1 + ⋅ ⋅ ⋅ + mt ωt ,
where m1 , . . . , mt ∈ ℤ.
The finite degree comes from the following result that shows there does exist an
integral basis (see [43]):
Theorem 20.1.11. Let RK be the ring of integers in the algebraic number field K of degree
n over ℚ. Then there exists at least one integral basis for RK .
Example 20.2.3.
1. Let E|K be a field extension. a ∈ E is integral over K if and only if a is algebraic over
K. If K is the quotient field of an integral domain R, and a ∈ E is algebraic over K,
then there exists an α ∈ R with αa integral over R, because if 0 = αn a^n + αn−1 a^(n−1) +
⋅ ⋅ ⋅ + α0 with αi ∈ R, then multiplication by αn^(n−1) gives 0 = (αn a)^n +
αn−1 (αn a)^(n−1) + ⋅ ⋅ ⋅ + αn^(n−1) α0 ; thus, we may take α = αn .
2. The elements of ℂ, which are integral over ℤ are precisely the algebraic integers
over ℤ, that is, the zeros of monic polynomials over ℤ.
for j = 1, . . . , n, where
δjk = 0 if j ≠ k, and δjk = 1 if j = k.
Define γjk := αkj − δjk a and C = (γjk )j,k . C is an (n × n)-matrix over the commutative ring
R[a]. Recall that R[a] has an identity element. Let C̃ = (γ̃jk )j,k be the complementary
matrix of C (see for instance [8]). Then C̃ C = (det C)En . From (⋆⋆), we get
0 = ∑_{j=1}^{n} γ̃ij ( ∑_{k=1}^{n} γjk bk ) = ∑_{k=1}^{n} ∑_{j=1}^{n} γ̃ij γjk bk = ∑_{k=1}^{n} (det C)δik bk = (det C)bi
for all 1 ≤ i ≤ n. Since b1 = 1, we have necessarily that det C = det(αjk − δjk a)j,k = 0
(recall that δjk = δkj ). Hence, a is a zero of the monic polynomial f (x) = det(δjk x −αjk ) ∈
R[x] of degree n ≥ 1. Therefore, a is integral over R.
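The determinant trick can be followed on a concrete case. Take R = ℤ and a = √2 acting on the module with generators b1 = 1, b2 = √2: multiplication by a sends b1 to b2 and b2 to 2b1, so its matrix is A = [[0, 1], [2, 0]], and det(x·E2 − A) is a monic polynomial annihilating a. A small sketch (our own illustration, not from the text):

```python
def charpoly_2x2(a11, a12, a21, a22):
    """Coefficients (1, c1, c0) of det(x*E - A) = x^2 - tr(A)*x + det(A)
    for a 2x2 matrix A over a commutative ring."""
    return (1, -(a11 + a22), a11 * a22 - a12 * a21)

# a*b1 = 0*b1 + 1*b2 and a*b2 = 2*b1 + 0*b2, so A = [[0, 1], [2, 0]]:
assert charpoly_2x2(0, 1, 2, 0) == (1, 0, -2)  # x^2 - 2, with sqrt(2)^2 - 2 = 0
```

The returned polynomial x^2 − 2 is monic with coefficients in R = ℤ, exhibiting √2 as integral over ℤ exactly as the proof promises.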
Definition 20.2.5. A ring extension A|R is called an integral extension if each element
of A is integral over R. A ring extension A|R is called finite if A, as an R-module, is finitely
generated.
Recall that finite field extensions are algebraic extensions. As an immediate con-
sequence of Theorem 20.2.4, we get the corresponding result for ring extensions.
Proof. (1) ⇒ (2): We have R[a] = {g(a) : g ∈ R[x]}. Let f (a) = 0 be an integral equation
of a over R. Since f is monic, by the division algorithm, for each g ∈ R[x], there are
h, r ∈ R[x] with g = h ⋅ f + r and r = 0, or r ≠ 0 and deg(r) < deg(f ) =: n. Let r ≠ 0. Since
g(a) = r(a), we get that {1, a, . . . , an−1 } is a generating system for the R-module R[a].
(2) ⇒ (3): Take A = R[a].
(3) ⇒ (1): Use Theorem 20.2.4 for A .
For the remainder of this chapter, all rings are commutative with an identity 1 ≠ 0.
Theorem 20.2.8. Let A|R and B|A be finite ring extensions. Then also B|R is finite.
Proof. From A = Re1 +⋅ ⋅ ⋅+Rem , and B = Af1 +⋅ ⋅ ⋅+Afn , we get B = Re1 f1 +⋅ ⋅ ⋅+Rem fn .
Theorem 20.2.9. Let A|R be a ring extension. Then the following are equivalent:
(1) There are finitely many elements a1 , . . . , am ∈ A, integral over R, such that A =
R[a1 , . . . , am ].
(2) A|R is finite.
Theorem 20.2.11. Let A|R be a ring extension. Then the integral closure of R in A is a
subring of A containing R.
Definition 20.2.12. Let A|R be a ring extension. R is called integrally closed in A if R itself
is its integral closure in A; that is, R = C, the integral closure of R in A.
Theorem 20.2.13. For each ring extension A|R, the integral closure C of R in A, is inte-
grally closed in A.
Theorem 20.2.14. Let A|R and B|A be ring extensions. If A|R and B|A are integral exten-
sions, then also B|R is an integral extension (and certainly vice versa).
Theorem 20.2.17. Let R be an integral domain and K its quotient field. Let E|K be a finite
field extension. Let R be integrally closed and α ∈ E integral over R. Then the minimal
polynomial g ∈ K[x] of α over K has all its coefficients in R.
Proof. Let g ∈ K[x] be the minimal polynomial of α over K (recall that g is monic by
definition). Let Ē be an algebraic closure of E. Then g(x) = (x − α1 ) ⋅ ⋅ ⋅ (x − αn ) with α1 = α
over Ē. There are K-isomorphisms σi : K(α) → Ē with σi (α) = αi . Hence, all αi are also
integral over R. Since all coefficients of g are polynomial expressions Cj (α1 , . . . , αn ) in
the αi , we get that all coefficients of g are integral over R (see Theorem 20.2.11). Now
g ∈ R[x], because g ∈ K[x], and R is integrally closed.
Theorem 20.2.18. Let R be an integrally closed integral domain and K its quotient field.
Let f , g, h ∈ K[x] be monic polynomials over K with f = gh. If f ∈ R[x], then also g, h ∈
R[x].
Theorem 20.2.19. Let E|R be an integral ring extension. If E is a field, then also R is a
field.
Proof. Let α ∈ R\{0}. The element 1/α ∈ E satisfies an integral equation (1/α)^n +
an−1 (1/α)^(n−1) + ⋅ ⋅ ⋅ + a0 = 0 over R. Multiplication with α^(n−1) gives
1/α = −an−1 − an−2 α − ⋅ ⋅ ⋅ − a0 α^(n−1) ∈ R. Hence, R is a field.
Definition 20.3.1.
(a) M is said to be algebraically independent (over K) if α ∉ H(M \ {α}) for all α ∈ M;
that is, if each α ∈ M is transcendental over K(M \ {α}).
(b) M is said to be algebraically dependent (over K) if M is not algebraically indepen-
dent.
Lemma 20.3.2.
(1) M is algebraically dependent if and only if there exists an α ∈ M, which is algebraic
over K(M \ {α}).
(2) Let α ∈ M. Then α ∈ H(M \ {α}) ⇔ H(M) = H(M \ {α}).
(3) If α ∉ M and α is algebraic over K(M), then M ∪ {α} is algebraically dependent.
(4) M is algebraically dependent if and only if there is a finite subset in M, which is
algebraically dependent.
(5) M is algebraically independent if and only if each finite subset of M is algebraically
independent.
(6) M is algebraically independent if and only if the following holds: If α1 , . . . , αn are
finitely many, pairwise different elements of M, then the canonical homomorphism
ϕ : K[x1 , . . . , xn ] → E, f (x1 , . . . , xn ) → f (α1 , . . . , αn ) is injective; or in other words,
for all f ∈ K[x1 , . . . , xn ], we have that f = 0 if f (α1 , . . . , αn ) = 0. That is, there is no
nontrivial algebraic relation between the α1 , . . . , αn over K.
We will show that any field extension can be decomposed into a transcendental
extension over an algebraic extension. We need the idea of a transcendence basis.
Proof. (1) ⇒ (2): Let α ∈ M \ B. We have to show that B ∪ {α} is algebraically dependent.
But this is clear, because α ∈ H(B) = E.
(2) ⇒ (3): We just take M = E.
(3) ⇒ (1): We have to show that H(B) = E. Certainly, M ⊂ H(B). Hence, E = H(M) ⊂
H(H(B)) = H(B) ⊂ E.
We next show that any field extension does have a transcendence basis:
Theorem 20.3.5. Each field extension E|K has a transcendence basis. More concretely,
if there is a subset M ⊂ E such that E|K(M) is algebraic and if there is a subset C ⊂ M,
which is algebraically independent, then there exists a transcendence basis B of E|K with
C ⊂ B ⊂ M.
Theorem 20.3.6. Let E|K be a field extension and M a subset of E, for which E|K(M) is
algebraic. Let C be an arbitrary subset of E, which is algebraically independent over K.
Then there exists a subset M′ ⊂ M with C ∩ M′ = ∅ such that C ∪ M′ is a transcendence
basis of E|K.
Theorem 20.3.7. Let B, B′ be two transcendence bases of the field extension E|K. Then
there is a bijection ϕ : B → B′ . In other words, any two transcendence bases of E|K have
the same cardinal number.
Proof. (a) If B is a transcendence basis of E|K and M is a subset of E such that E|K(M)
is algebraic, then we may write B = ⋃α∈M Bα with finite sets Bα . In particular, if B
is infinite, then the cardinal number of B is not bigger than the cardinal number
of M.
(b) Let B and B′ be two transcendence bases of E|K. If B and B′ are both infinite,
then B and B′ have the same cardinal number by (a) and the theorem of Schroeder–
Bernstein [9]. We now prove Theorem 20.3.7 for the case that E|K has a finite transcen-
dence basis. Let B be finite with n elements. Let C be an arbitrary algebraically inde-
pendent subset in E over K with m elements. We show that m ≤ n. For this, we may
assume that C = {α1 , . . . , αm } with m ≥ n. We show, by induction, that for each integer k,
0 ≤ k ≤ n, there are subsets B ⫌ B1 ⫌ ⋅ ⋅ ⋅ ⫌ Bk of B such that {α1 , . . . , αk } ∪ Bk is a
transcendence basis of E|K, and {α1 , . . . , αk } ∩ Bk = ∅. For k = 0, we take B0 = B, and
the statement holds. Assume now that the statement is correct for 0 ≤ k < n. By
Theorems 20.3.4 and 20.3.5, there is a subset Bk+1 of {α1 , . . . , αk } ∪ Bk such that
{α1 , . . . , αk+1 } ∪ Bk+1 is a transcendence basis of E|K, and {α1 , . . . , αk+1 } ∩ Bk+1 = ∅.
Then necessarily, Bk+1 ⊂ Bk . Assume Bk = Bk+1 . Then, on the one hand, Bk ∪ {α1 , . . . , αk+1 }
is algebraically independent because Bk = Bk+1 . On the other hand, Bk ∪ {α1 , . . . , αk } ∪
{αk+1 } is algebraically dependent, which gives a contradiction. Hence, Bk+1 ⫋ Bk . Now
Bk has at most n − k elements. Therefore, Bn = ∅; that is, {α1 , . . . , αn } = {α1 , . . . , αn } ∪ Bn
is a transcendence basis of E|K. Because C = {α1 , . . . , αm } is algebraically independent,
we cannot have m > n. Thus, m ≤ n, and B and B′ have the same number of elements,
because B′ must also be finite.
Since the cardinality of any transcendence basis for a field extension E|K is the
same, we can define the transcendence degree.
Definition 20.3.8. The transcendence degree trgd(E|K) of a field extension is the car-
dinal number of one (and hence of each) transcendence basis of E|K. A field extension
E|K is called purely transcendental, if E|K has a transcendence basis B with E = K(B).
Theorem 20.3.9. Let E|K be a field extension and F an arbitrary intermediate field,
K ⊂ F ⊂ E. Let B be a transcendence basis of F|K and B′ a transcendence basis of E|F.
Then B ∩ B′ = ∅, and B ∪ B′ is a transcendence basis of E|K. In particular, trgd(E|K) =
trgd(E|F) + trgd(F|K).
Proof. (1) Assume α ∈ B ∩ B′ . As an element of F, α is then algebraic over F, and hence
also over F(B′ \ {α}). But this gives a contradiction, because α ∈ B′ , and B′ is
algebraically independent over F.
(2) F|K(B) is an algebraic extension, and hence also F(B′ )|K(B ∪ B′ ) = K(B)(B′ ) is
algebraic. Moreover, E|F(B′ ) is algebraic, because B′ is a transcendence basis of E|F.
Since the relation “algebraic extension” is transitive, we have that E|K(B ∪ B′ ) is algebraic.
(3) Finally, we have to show that B ∪ B′ is algebraically independent over K. By The-
orems 20.3.5 and 20.3.6, there is a subset B″ of B ∪ B′ with B ∩ B″ = ∅ such that B ∪ B″
is a transcendence basis of E|K. We have B″ ⊂ B′ , and have to show that B′ ⊂ B″ . Assume
that there is an α ∈ B′ with α ∉ B″ . Then α is algebraic over K(B ∪ B″ ) = K(B)(B″ ), and
hence algebraic over F(B″ ). Since B″ ∪ {α} ⊂ B′ , and B′ is algebraically independent
over F, α is transcendental over F(B″ ), which gives a contradiction. Hence, B′ = B″ .
Proof. Without loss of generality, let the a1 , . . . , an be pairwise different. We prove the
theorem by induction on n. If n = 1, then there is nothing to show. Now, let n ≥ 2, and
assume that the statement holds for n − 1. If there is no nontrivial algebraic relation
f (a1 , . . . , an ) = 0 over K between the a1 , . . . , an , then there is nothing to show. Hence,
suppose there exists a polynomial f ∈ K[x1 , . . . , xn ] with f ≠ 0 and f (a1 , . . . , an ) = 0. Let
f = ∑ν=(ν1 ,...,νn ) cν x1^ν1 ⋅ ⋅ ⋅ xn^νn . Let μ2 , μ3 , . . . , μn be natural numbers, which we specify later.
Define b2 = a2 − a1^μ2 , b3 = a3 − a1^μ3 , . . . , bn = an − a1^μn . Then ai = bi + a1^μi for 2 ≤ i ≤ n;
hence, f (a1 , b2 + a1^μ2 , . . . , bn + a1^μn ) = 0. We write R := K[x1 , . . . , xn ] and consider the
polynomial ring R[y2 , . . . , yn ] of the n − 1 independent indeterminates y2 , . . . , yn over R.
In R[y2 , . . . , yn ], we consider the polynomial f (x1 , y2 + x1^μ2 , . . . , yn + x1^μn ). We may rewrite
this polynomial as
∑ν=(ν1 ,...,νn ) cν x1^(ν1 +μ2 ν2 +⋅⋅⋅+μn νn ) + g(x1 , y2 , . . . , yn )
Proof. Let f (x) ∈ ℝ[x] with degree of f (x) = m ≥ 1. Let z1 ∈ ℂ, z1 ≠ 0, and γ :
[0, 1] → ℂ, γ(t) = tz1 . Let
I(z1 ) = ∫γ e^(z1 −z) f (z) dz = (∫_0^z1 )γ e^(z1 −z) f (z) dz.
By (∫_0^z1 )γ , we mean the integral from 0 to z1 along γ. Recall that
(∫_0^z1 )γ e^(z1 −z) f (z) dz = −f (z1 ) + e^z1 f (0) + (∫_0^z1 )γ e^(z1 −z) f ′(z) dz.
Let |f |(x) be the polynomial we get if we replace the coefficients of f (x) by their absolute
values. Since |e^(z1 −z) | ≤ e^|z1 −z| ≤ e^|z1 | , we get
(2) |I(z1 )| ≤ |z1 | e^|z1 | |f |(|z1 |).
For a detailed proof of these facts, see for instance [42]. We consider now the polyno-
mial f (x) = x^(p−1) (x − 1)^p ⋅ ⋅ ⋅ (x − n)^p with p a sufficiently large prime number, and we
consider I(z1 ) with respect to this polynomial. Let
Now, f^(j) (k) = 0 if j < p and k = 1, . . . , n, and also if j < p − 1 and k = 0. Hence, f^(j) (k) is
an integer that is divisible by p! for all j, k, except for j = p − 1, k = 0. Furthermore,
f^(p−1) (0) = (p − 1)!(−1)^(np) (n!)^p . Hence, if p > n, then f^(p−1) (0) is an integer divisible by
(p − 1)!, but not by p!.
It follows that J is a nonzero integer that is divisible by (p − 1)! if p > |q0 | and p > n.
So let p > n, p > |q0 |, so that |J| ≥ (p − 1)!.
Now, |f |(k) ≤ (2n)^m . Together with (2), we then get that
(p − 1)! ≤ |J| ≤ c^p
for a constant c that does not depend on p; that is,
1 ≤ |J| / (p − 1)! ≤ c ⋅ c^(p−1) / (p − 1)!.
This gives a contradiction, since c^(p−1) / (p − 1)! → 0 as p → ∞. Therefore, e is transcen-
dental.
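The final inequality works because (p − 1)! eventually outgrows c^(p−1) for any fixed c. Exact integer arithmetic makes this easy to observe (a quick check of our own, with hypothetical names):

```python
from math import factorial

def factorial_dominates(c, p):
    """True once (p - 1)! exceeds c**(p - 1), i.e. once the bound
    c * c**(p - 1) / (p - 1)! from the proof has dropped below c."""
    return factorial(p - 1) > c ** (p - 1)

assert not factorial_dominates(100, 50)  # 100^49 still dwarfs 49!
assert factorial_dominates(100, 400)     # but (p - 1)! wins for large p
```

Since there are infinitely many primes, a prime p large enough for the contradiction can always be chosen.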
Proof.
an^(n−1) f (x) = an^n x^n + an^(n−1) an−1 x^(n−1) + ⋅ ⋅ ⋅ + an^(n−1) a0
The product on the left side can be written as a sum of 2^d terms e^ϕ , where ϕ =
ϵ1 θ1 + ⋅ ⋅ ⋅ + ϵd θd , ϵj = 0 or 1. Let n be the number of terms ϵ1 θ1 + ⋅ ⋅ ⋅ + ϵd θd that are
nonzero. Call these α1 , . . . , αn . We then have an equation
q + e^α1 + ⋅ ⋅ ⋅ + e^αn = 0
with q = 2^d − n > 0. Recall that all tαi are algebraic integers, and we consider the
polynomial
f (x) = t^(np) x^(p−1) (x − α1 )^p ⋅ ⋅ ⋅ (x − αn )^p
with p a sufficiently large prime integer. We have f (x) ∈ ℝ[x], since the αi are alge-
braic numbers, and the elementary symmetric polynomials in α1 , . . . , αn are rational
numbers.
Let I(z1 ) be defined as in the proof of Theorem 20.4.1, and now let
J = I(α1 ) + ⋅ ⋅ ⋅ + I(αn ).
with m = (n + 1)p − 1.
Now, ∑_{k=1}^{n} f^(j) (αk ) is a symmetric polynomial in tα1 , . . . , tαn with integer coeffi-
cients, since the tαi are algebraic integers. It follows from the main theorem on sym-
metric polynomials that ∑_{j=0}^{m} ∑_{k=1}^{n} f^(j) (αk ) is an integer. Furthermore,
f^(j) (αk ) = 0 for j < p. Hence, ∑_{j=0}^{m} ∑_{k=1}^{n} f^(j) (αk ) is an integer divisible by p!.
(p − 1)! ≤ |J| ≤ c^p
for a constant c that does not depend on p; that is,
1 ≤ |J| / (p − 1)! ≤ c ⋅ c^(p−1) / (p − 1)!.
This, as before, gives a contradiction, since c^(p−1) / (p − 1)! → 0 as p → ∞. Therefore, π is
transcendental.
20.5 Exercises
1. A polynomial p(x) ∈ ℤ[x] is primitive if the GCD of all its coefficients is 1. Prove
the following:
(i) If f (x) and g(x) are primitive, then so is f (x)g(x).
(ii) If f (x) ∈ ℤ[x] is monic, then it is primitive.
(iii) If f (x) ∈ ℚ[x], then there exists a rational number c such that f (x) = cf1 (x)
with f1 (x) primitive.
2. Let d be a square-free integer and K = ℚ(√d) be a quadratic field. Let RK be the
subring of K of the algebraic integers of K. Show the following:
(i) RK = {m + n√d : m, n ∈ ℤ} if d ≡ 2 (mod 4) or d ≡ 3 (mod 4). {1, √d} is an
integral basis for RK .
(ii) RK = {m + n(1 + √d)/2 : m, n ∈ ℤ} if d ≡ 1 (mod 4). {1, (1 + √d)/2} is an
integral basis for RK .
(iii) If d < 0, then there are only finitely many units in RK .
(iv) If d > 0, then there are infinitely many units in RK .
3. Let K = ℚ(α) with α^3 + α + 1 = 0 and RK the subring of the algebraic integers in K.
Show that:
(i) {1, α, α^2 } is an integral basis for RK .
(ii) RK = ℤ[α].
4. Let A|R be an integral ring extension. If A is an integral domain and R a field, then
A is also a field.
5. Let A|R be an integral extension. Let 𝒫 be a prime ideal of A and p be a prime ideal
of R such that 𝒫 ∩ R = p. Show that:
(i) If p is maximal in R, then 𝒫 is maximal in A. (Hint: consider A/𝒫 .)
(ii) If 𝒫0 is another prime ideal of A with 𝒫0 ∩ R = p and 𝒫0 ⊂ 𝒫 , then 𝒫 = 𝒫0 .
(Hint: we may assume that A is an integral domain, and 𝒫 ∩R = {0}, otherwise
go to A/𝒫 .)
6. Show that for a field extension E|K, the following are equivalent:
(i) [E : K(B)] < ∞ for each transcendence basis B of E|K.
(ii) trgd(E|K) < ∞ and [E : K(B)] < ∞ for each transcendence basis B of E|K.
(iii) There is a finite transcendence basis B of E|K with [E : K(B)] < ∞.
(iv) There are finitely many x1 , . . . , xn ∈ E with E = K(x1 , . . . , xn ).
7. Let E|K be a field extension. If E|K is purely transcendental, then K is algebraically
closed in E.
𝒩 (M) = {(α1 , . . . , αn ) ∈ C^n : f (α1 , . . . , αn ) = 0 ∀f ∈ M}.
For any subset N of C n , we can reverse the procedure and consider the set of poly-
nomials, whose zero set is N.
https://doi.org/10.1515/9783110603996-021
Instead of f ∈ I(N), we also say that f vanishes on N (over K). If we want to mention K,
then we write I(N) = IK (N).
What is important is that the set I(N) forms an ideal. The proof is straightforward.
Theorem 21.2.3. For any subset N ⊂ C n , the set I(N) is an ideal in K[x1 , . . . , xn ]; it is
called the vanishing ideal of N ⊂ C n in K[x1 , . . . , xn ].
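Over a small finite coefficient ring, zero sets and the vanishing condition can be enumerated exhaustively, which makes the definitions concrete. A brute-force sketch with K = C = ℤ/3 (our own setup, not from the text):

```python
from itertools import product

P = 3  # work over Z/3 so that C^n is finite and searchable

def zero_set(polys, n):
    """N(M): the points of (Z/P)^n at which every polynomial in M
    vanishes; polynomials are passed as Python functions of n variables."""
    return {pt for pt in product(range(P), repeat=n)
            if all(f(*pt) % P == 0 for f in polys)}

# N(x + y) in (Z/3)^2 is the line x + y = 0:
assert zero_set([lambda x, y: x + y], 2) == {(0, 0), (1, 2), (2, 1)}
# x^3 - x vanishes at every point of Z/3, so it lies in the vanishing
# ideal of the whole affine line:
assert zero_set([lambda x: x**3 - x], 1) == {(0,), (1,), (2,)}
```

The second assertion shows why the correspondence between sets and ideals is subtler over finite or non-algebraically-closed fields: x^3 − x vanishes everywhere without being the zero polynomial.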
The following result examines the relationship between subsets in C n and their
vanishing ideals.
Proof. The proofs are straightforward. Hence, we prove only (7), (8), and (9). The rest
can be left as exercise for the reader.
Proof of (7): Since ab ⊂ a ∩ b ⊂ a, b, we have, by (1), the inclusion 𝒩 (a) ∪ 𝒩 (b) ⊂
𝒩 (a ∩ b) ⊂ 𝒩 (ab). Hence, we have to show that 𝒩 (ab) ⊂ 𝒩 (a) ∪ 𝒩 (b).
Let α = (α1 , . . . , αn ) ∈ C n be a zero of ab, but not a zero of a. Then there is an f ∈ a
with f (α) ≠ 0; hence, for all g ∈ b, we get f (α)g(α) = (fg)(α) = 0. Thus, g(α) = 0.
Therefore, α ∈ 𝒩 (b).
Proof of (8) and (9): Let M ⊂ K[x1 , . . . , xn ]. Then, on the one hand, M ⊂ I 𝒩 (M) by
(5), and further 𝒩 I 𝒩 (M) ⊂ 𝒩 (M) by (1). On the other hand, 𝒩 (M) ⊂ 𝒩 I 𝒩 (M) by (6).
Therefore, 𝒩 (M) = 𝒩 I 𝒩 (M) for all M ⊂ K[x1 , . . . , xn ].
Now, the algebraic K-sets of C n are precisely the sets of the form V = 𝒩 (M). Hence,
V = 𝒩 I(V).
Every vanishing ideal a = I(N) ⊲ K[x1 , . . . , xn ] satisfies the condition
f^m ∈ a, m ≥ 1 ⇒ f ∈ a. (⋆)
Hence, for instance, if a = (x1^2 , . . . , xn^2 ) ⊲ K[x1 , . . . , xn ], then a is not of the form a = I(N)
for some N ⊂ C n . We now define the radical of an ideal:
We note that √0 is called the nil radical of R; it consists exactly of the nilpotent
elements of R; that is, the elements a ∈ R with a^m = 0 for some m ∈ ℕ.
Let a ⊲ R be an ideal in R and π : R → R/a the canonical mapping. Then √a is
exactly the preimage of the nil radical of R/a.
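A small illustrative computation (not from the text): in ℤ/12ℤ, the nilpotent elements are exactly 0 and 6, so the nil radical is the ideal (6), which is also the intersection of the prime ideals (2) and (3):

```python
# The nil radical of Z/12Z by brute force: an element a is nilpotent
# exactly when some power a^k is 0 modulo 12.
n = 12
nilpotent = [a for a in range(n)
             if any(pow(a, k, n) == 0 for k in range(1, n + 1))]
assert nilpotent == [0, 6]     # so sqrt(0) = (6) in Z/12Z
```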
Theorem 21.3.2. Let R be a noetherian ring. Then the polynomial ring R[x] over R is also
noetherian.
Proof. For 0 ≠ f ∈ R[x], we denote the degree of f by deg(f ). Let a ⊲ R[x] be an ideal
in R[x]. Assume that a is not finitely generated. Then, in particular, a ≠ {0}. We construct
a sequence of polynomials fk ∈ a whose leading coefficients ak produce a strictly
ascending chain of ideals in R. This then produces a contradiction; hence,
a is in fact finitely generated. Choose f1 ∈ a, f1 ≠ 0, so that deg(f1 ) = n1 is minimal.
If k ≥ 1, then choose fk+1 ∈ a, fk+1 ∉ (f1 , . . . , fk ) so that deg(fk+1 ) = nk+1 is minimal
for the polynomials in a \ (f1 , . . . , fk ). This is possible, because we assume that a is not
finitely generated. We have nk ≤ nk+1 by our construction. Furthermore, (a1 , . . . , ak ) ⫋
(a1 , . . . , ak , ak+1 ).
Proof of this claim: Assume that (a1 , . . . , ak ) = (a1 , . . . , ak , ak+1 ). Then ak+1 ∈
(a1 , . . . , ak ). Hence, there are bi ∈ R with ak+1 = ∑_{i=1}^k ai bi . Let g(x) = ∑_{i=1}^k bi fi (x) x^(nk+1 − ni) ;
hence, g ∈ (f1 , . . . , fk ), and g = ak+1 x^(nk+1) + ⋅ ⋅ ⋅. Therefore, deg(fk+1 − g) < nk+1 , and
fk+1 − g ∉ (f1 , . . . , fk ), which contradicts the choice of fk+1 . This proves the claim.
Hence, (a1 ) ⫋ (a1 , a2 ) ⫋ ⋅ ⋅ ⋅ is an infinite, strictly ascending chain of ideals in R,
which contradicts the fact that R is noetherian. Hence, a is finitely generated.
Theorem 21.3.3 (Hilbert basis theorem). Let K be a field. Then each ideal a⊲K[x1 , . . . , xn ]
is finitely generated; that is, a = (f1 , . . . , fm ) for finitely many f1 , . . . , fm ∈ K[x1 , . . . , xn ].
Corollary 21.3.4. If C|K is a field extension, then each algebraic K-set V of C n is already
the zero set of only finitely many polynomials f1 , . . . , fm ∈ K[x1 , . . . , xn ]:
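The Hilbert basis theorem is effective in practice via Gröbner bases. A minimal sympy sketch (illustrative; the ideal below is an assumption for the example): `groebner()` returns a finite basis of the ideal generated by the input polynomials, and reduction modulo that basis decides ideal membership.

```python
# Finite generation in action: a Groebner basis of an ideal in Q[x, y],
# and ideal membership via reduction (remainder 0 <=> membership).
from sympy import groebner, symbols

x, y = symbols('x y')
G = groebner([x**2 + y**2 - 1, x - y], x, y, order='lex')

f = y * (x**2 + y**2 - 1) + x**3 * (x - y)   # built from the generators
_, r = G.reduce(f)
assert r == 0                                 # f lies in the ideal

_, r = G.reduce(x)                            # x does not lie in the ideal
assert r != 0
```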
f^m ∈ a, m ≥ 1 ⇒ f ∈ a
Theorem 21.4.1 (Hilbert’s nullstellensatz, first form). Let C|K be a field extension with
C algebraically closed. If a ⊲ K[x1 , . . . , xn ], then I 𝒩 (a) = √a. Moreover, if a is reduced,
that is, a = √a, then I 𝒩 (a) = a. Therefore, 𝒩 defines a bijective map between the set of
reduced ideals in K[x1 , . . . , xn ] and the set of the algebraic K-sets in C n , and I defines the
inverse map.
Theorem 21.4.2 (Hilbert’s nullstellensatz, second form). Let C|K be a field extension
with C algebraically closed. Let a ⊲ K[x1 , . . . , xn ] with a ≠ K[x1 , . . . , xn ]. Then there exists
an α = (α1 , . . . , αn ) ∈ C n with f (α) = 0 for all f ∈ a; that is, 𝒩C (a) ≠ ∅.
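The second form can be checked computationally (an illustrative sketch; the systems below are assumptions for the example): over an algebraically closed field, 1 lies in the ideal, i.e. the Gröbner basis collapses to [1], exactly when the system has no common zero.

```python
# Second form of the nullstellensatz, computationally: the basis is [1]
# precisely when the polynomials have no common zero in C^2.
from sympy import groebner, symbols

x, y = symbols('x y')

# two concentric circles: no common zero, and indeed 1 is in the ideal
assert groebner([x**2 + y**2 - 1, x**2 + y**2 - 2], x, y).exprs == [1]

# circle and line: common zeros exist, so the basis is not [1]
assert groebner([x**2 + y**2 - 1, x - y], x, y, order='lex').exprs != [1]
```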
Proof of Theorem 21.4.1. Let a ⊲ K[x1 , . . . , xn ], and let f ∈ I 𝒩 (a). We have to show that
f^m ∈ a for some m ∈ ℕ. If f = 0, then there is nothing to show.
Now, let f ≠ 0. We consider K[x1 , . . . , xn ] as a subring of K[x1 , . . . , xn , xn+1 ] of the
n + 1 independent indeterminates x1 , . . . , xn , xn+1 . In K[x1 , . . . , xn , xn+1 ], we consider the
ideal ā = (a, 1 − xn+1 f ) ⊲ K[x1 , . . . , xn , xn+1 ], generated by a and 1 − xn+1 f .
Case 1: ā ≠ K[x1 , . . . , xn , xn+1 ].
ā then has a zero (β1 , . . . , βn , βn+1 ) in C n+1 by Theorem 21.4.2. Hence, for (β1 , . . . , βn ,
βn+1 ) ∈ 𝒩 (ā ), we have the equations:
(1) g(β1 , . . . , βn ) = 0 for all g ∈ a, and
(2) f (β1 , . . . , βn )βn+1 = 1.
From (1), we get (β1 , . . . , βn ) ∈ 𝒩 (a). Hence, especially, f (β1 , . . . , βn ) = 0 for our f ∈
I 𝒩 (a). But this contradicts (2). Therefore, ā ≠ K[x1 , . . . , xn , xn+1 ] is not possible. Thus,
we have
Case 2: ā = K[x1 , . . . , xn , xn+1 ]; that is, 1 ∈ ā . Then there exists a relation of the form
1 = ∑_i hi gi + h(1 − xn+1 f ) with gi ∈ a and hi , h ∈ K[x1 , . . . , xn , xn+1 ]. Substituting
xn+1 = 1/f and multiplying by a sufficiently high power f^m clears all denominators
and yields f^m ∈ a, as required.
V1 ⊃ V2 ⊃ ⋅ ⋅ ⋅ ⊃ Vm ⊃ Vm+1 ⊃ ⋅ ⋅ ⋅ (21.1)
Proof. We apply the operator I; that is, we pass to the vanishing ideals. This gives an
ascending chain of ideals I(V1 ) ⊂ I(V2 ) ⊂ ⋅ ⋅ ⋅.
The union of the I(Vi ) is an ideal in K[x1 , . . . , xn ], and hence, by Theorem 21.3.3,
finitely generated. Therefore, there is an m with I(Vm ) = I(Vm+1 ) = I(Vm+2 ) = ⋅ ⋅ ⋅.
Now we apply the operator 𝒩 and get the desired result, because Vi = 𝒩 I(Vi ) by
Theorem 21.2.4 (10).
Proof. (1) Let V be irreducible. Let fg ∈ I(V). Then V = 𝒩 I(V) ⊂ 𝒩 (fg) = 𝒩 (f ) ∪ 𝒩 (g);
hence, V = V1 ∪ V2 with the algebraic K-sets V1 = 𝒩 (f ) ∩ V and V2 = 𝒩 (g) ∩ V. Now
V is irreducible; hence, V = V1 , or V = V2 , say V = V1 . Then V ⊂ 𝒩 (f ). Therefore,
f ∈ I 𝒩 (f ) ⊂ I(V). Since V ≠ ∅, we have further 1 ∉ I(V); that is, I(V) ≠ R.
(2) Let I(V) ⊲ R with I(V) ≠ R be a prime ideal. Let V = V1 ∪ V2 , V1 ≠ V, with
algebraic K-sets Vi in C n . First,
I(V1 )I(V2 ) ⊂ I(V1 ∪ V2 ) = I(V), (⋆)
where I(V1 )I(V2 ) is the ideal generated by all products fg with f ∈ I(V1 ), g ∈ I(V2 ).
We have I(V1 ) ≠ I(V), because otherwise V1 = 𝒩 I(V1 ) = 𝒩 I(V) = V contradicting
V1 ≠ V. Hence, there is a f ∈ I(V1 ) with f ∉ I(V). Now, I(V) ≠ R is a prime ideal; hence,
necessarily I(V2 ) ⊂ I(V) by (⋆). It follows that V ⊂ V2 . Therefore, V is irreducible.
Note that the affine space K n is, as the zero set of the zero polynomial 0, itself
an algebraic K-set in K n . If K is infinite, then I(K n ) = {0}. Hence, K n is irreducible
by Theorem 21.5.3. Moreover, if K is infinite, then K n cannot be written as a union of
finitely many proper algebraic K-subsets. If K is finite, then K n is not irreducible.
Furthermore, each algebraic K-set V in C n is also an algebraic C-set in C n . If V is an
irreducible algebraic K-set in C n , then—in general—it is not an irreducible algebraic
C-set in C n .
Proof. Let a be the set of all algebraic K-sets in C n that cannot be presented as a
finite union of irreducible algebraic K-sets in C n .
Assume that a ≠ ∅. By the descending chain condition for algebraic K-sets (see (21.1)),
there is a minimal element V in a. This V
is not irreducible, otherwise we have a presentation as desired. Hence, there exists a
presentation V = V1 ∪ V2 with algebraic K-sets Vi , which are strictly smaller than V.
By definition, both V1 and V2 have a presentation as desired; hence, V also has one,
which gives a contradiction. Hence, a = ∅.
Now suppose that V = V1 ∪ ⋅ ⋅ ⋅ ∪ Vr = W1 ∪ ⋅ ⋅ ⋅ ∪ Ws are two presentations of the
desired form. For each Vi , we have a presentation Vi = (Vi ∩ W1 ) ∪ ⋅ ⋅ ⋅ ∪ (Vi ∩ Ws ). Each
Vi ∩ Wj is a K-algebraic set (see Theorem 21.2.4). Since Vi is irreducible, we get that
there is a Wj with Vi = Vi ∩ Wj , that is, Vi ⊂ Wj . Analogously, for this Wj , there is a Vk
with Wj ⊂ Vk . Altogether, Vi ⊂ Wj ⊂ Vk . But Vp ⊈ Vq if p ≠ q. Hence, from Vi ⊂ Wj ⊂ Vk ,
we get i = k. Therefore, Vi = Wj ; that is, for each Vi there is a Wj with Vi = Wj .
Analogously, for each Wk , there is a Vl with Wk = Vl . This proves the theorem.
Example 21.5.5.
1. Let M = {gf } ⊂ ℝ[x, y] with g(x, y) = x^2 + y^2 − 1 and f (x, y) = x^2 + y^2 − 2. Then 𝒩 (M) =
V = V1 ∪ V2 , where V1 = 𝒩 (g), and V2 = 𝒩 (f ); V is not irreducible.
2. Let M = {f } ⊂ ℝ[x, y] with f (x, y) = xy − 1; f is irreducible in ℝ[x, y]. Therefore, the
ideal (f ) is a prime ideal in ℝ[x, y]. Hence, V = 𝒩 (f ) is irreducible.
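Both parts of the example can be mirrored by factoring over ℚ (an illustrative sketch; factoring over ℚ is a convenient stand-in here, since both quadrics are already irreducible over ℚ):

```python
# The product g*f splits into two irreducible factors, matching the
# decomposition V = V1 ∪ V2; in contrast, x*y - 1 stays irreducible.
from sympy import symbols, factor_list, expand

x, y = symbols('x y')

_, factors = factor_list(expand((x**2 + y**2 - 1) * (x**2 + y**2 - 2)))
assert len(factors) == 2      # two irreducible components: V is reducible

_, factors = factor_list(x * y - 1)
assert len(factors) == 1      # irreducible, so its zero set is a variety
```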
Definition 21.5.6. Let V be an algebraic K-set in C n . The residue class ring K[V] =
K[x1 , . . . , xn ]/I(V) is called the (affine) coordinate ring of V.
K[V] can be identified with the ring of all those functions V → C, which are given
by polynomials from K[x1 , . . . , xn ]. As a homomorphic image of K[x1 , . . . , xn ], we get
that K[V] can be described in the form K[V] = K[α1 , . . . , αn ]; therefore, a K-algebra
of the form K[α1 , . . . , αn ] is often called an affine K-algebra. If the algebraic K-set V
in C n is irreducible—we can call V now an (affine) K-variety in C n —then K[V] is an
integral domain with an identity, because I(V) is then a prime ideal with I(V) ≠ R
by Theorem 21.5.3. The quotient field K(V) = Quot K[V] is called the field of rational
functions on the K-variety V.
We note the following:
1. If C is algebraically closed, then V = C n is a K-variety, and K(V) is the field
K(x1 , . . . , xn ) of the rational functions in n variables over K.
2. Let the affine K-algebra A = K[α1 , . . . , αn ] be an integral domain with an identity
1 ≠ 0. Then A ≅ K[x1 , . . . , xn ]/p for some prime ideal p ≠ K[x1 , . . . , xn ]. Hence, if C is
algebraically closed, then A is isomorphic to the coordinate ring of the K-variety
V = 𝒩 (p) in C n (see Hilbert’s nullstellensatz, first form, Theorem 21.4.1).
3. If the affine K-algebra A = K[α1 , . . . , αn ] is an integral domain with an identity
1 ≠ 0, then we define the transcendence degree trgd(A|K) to be the transcendence
degree of the field extension Quot(A)|K; that is, trgd(A|K) = trgd(Quot(A)|K),
Quot(A) the quotient field of A.
Example 21.5.7. Let ω1 , ω2 ∈ ℂ be two elements which are linearly independent over ℝ. An
element ω = m1 ω1 + m2 ω2 with m1 , m2 ∈ ℤ is called a period. The periods form an
abelian group Ω = {m1 ω1 + m2 ω2 : m1 , m2 ∈ ℤ} ≅ ℤ ⊕ ℤ and give a lattice in ℂ.
The Weierstrass ℘-function
℘(z) = 1/z^2 + ∑_{0≠w∈Ω} ( 1/(z − w)^2 − 1/w^2 )
is an elliptic function.
With g2 = 60 ∑_{0≠w∈Ω} 1/w^4 and g3 = 140 ∑_{0≠w∈Ω} 1/w^6 , we get the differential equation
℘′(z)^2 − 4℘(z)^3 + g2 ℘(z) + g3 = 0. The set of elliptic functions is a field E, and each
elliptic function is a rational function in ℘ and ℘′ (for details see, for instance, [34]).
The polynomial f (t) = t^2 − 4s^3 + g2 s + g3 ∈ ℂ(s)[t] is irreducible over ℂ(s). For the
corresponding algebraic ℂ(s)-set V, we get K(V) = ℂ(s)[t]/(t^2 − 4s^3 + g2 s + g3 ) ≅ E with
respect to t ↦ ℘′, s ↦ ℘.
21.6 Dimensions
From now on, we assume that C is algebraically closed.
Definition 21.6.1.
(1) The dimension dim(V) of an algebraic K-set V in C n is said to be the supremum of
all integers m, for which there exists a strictly descending chain V0 ⊋ V1 ⊋ ⋅ ⋅ ⋅ ⊋ Vm
of K-varieties Vi in C n with Vi ⊂ V for all i.
(2) Let A be a commutative ring with an identity 1 ≠ 0. The height h(p) of a prime ideal
p ≠ A of A is said to be the supremum of all integers m, for which there exists a
strictly ascending chain p0 ⊊ p1 ⊊ ⋅ ⋅ ⋅ ⊊ pm = p of prime ideals pi of A with pi ≠ A.
The dimension (Krull dimension) dim(A) of A is the supremum of the heights of
all prime ideals ≠ A in A.
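As an orientation for these definitions (a standard example, not spelled out in the text), the following chain exhibits dim K[x, y] ≥ 2; by Theorem 21.6.3 below, the dimension is in fact exactly 2:

```latex
% In R = K[x, y]: a strictly ascending chain of prime ideals
% showing h((x, y)) >= 2, hence dim K[x, y] >= 2; combined with
% trgd(K[x, y] | K) = 2 and Theorem 21.6.3, dim K[x, y] = 2.
\{0\} \subsetneq (x) \subsetneq (x, y)
```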
Proof. By Theorem 21.2.4 and Theorem 21.4.2, we have a bijective map between the
K-varieties W with W ⊂ V and the prime ideals ≠ R = K[x1 , . . . , xn ] of R, which con-
tain I(V) (the bijective map reverses the inclusion). But these prime ideals correspond
exactly with the prime ideals ≠ K[V] of K[V] = K[x1 , . . . , xn ]/I(V), which gives the
statement.
Theorem 21.6.3. Let A = K[α1 , . . . , αn ] be an affine K-algebra, and let A be also an inte-
gral domain. Let {0} = p0 ⊊ p1 ⊊ ⋅ ⋅ ⋅ ⊊ pm be a maximal strictly ascending chain of prime
ideals in A (such a chain exists since A is noetherian). Then m = trgd(A|K) = dim(A). In
other words:
All maximal ideals of A have the same height, and this height is equal to the tran-
scendence degree of A over K.
Lemma 21.6.5. Let R be a unique factorization domain. Then each prime ideal p with
height h(p) = 1 is a principal ideal.
Lemma 21.6.6. Let R = K[y1 , . . . , yr ] be the polynomial ring of the r independent inde-
terminates y1 , . . . , yr over the field K (recall that R is a unique factorization domain). If
p is a prime ideal in R with height h(p) = 1, then the residue class ring R̄ = R/p has
transcendence degree r − 1 over K.
Proof. By Lemma 21.6.5, we have that p = (p) for some nonconstant polynomial
p ∈ K[y1 , . . . , yr ]. Let the indeterminate y = yr occur in p; that is, degy (p) ≥ 1, where degy denotes
the degree in y. If f is a multiple of p, then also degy (f ) ≥ 1. Hence, p ∩ K[y1 , . . . , yr−1 ] = {0}.
Before we describe the last technical lemma, we need some preparatory theoreti-
cal material.
Let R, A be integral domains (with identity 1 ≠ 0), and let A|R be a ring extension.
We first consider only R.
(1) A subset S ⊂ R \ {0} is called a multiplicative subset of R if 1 ∈ S for the identity
1 of R, and if s, t ∈ S, then also st ∈ S. (x, s) ∼ (y, t) :⇔ xt − ys = 0 defines an
equivalence relation on M = R × S. Let x/s be the equivalence class of (x, s) and S−1 R the
set of all equivalence classes. We call x/s a fraction. If we add and multiply fractions as
usual, we get that S−1 R becomes an integral domain; it is called the ring of fractions of
R with respect to S. If, in particular, S = R \ {0}, then S−1 R = Quot(R), the quotient field
of R.
Now, back to the general situation. i : R → S−1 R, i(r) = r/1, defines an embedding
of R into S−1 R. Hence, we may consider R as a subring of S−1 R. For each s ∈ S ⊂ R \ {0},
we have that i(s) is a unit in S−1 R. That is, i(s) is invertible, and each element of S−1 R
has the form i(s)−1 i(r) with r ∈ R, s ∈ S. Therefore, S−1 R is uniquely determined up to
isomorphism, and we have the following universal property:
If ϕ : R → R′ is a ring homomorphism (of integral domains) such that ϕ(s) is
invertible for each s ∈ S, then there exists exactly one ring homomorphism λ : S−1 R →
R′ with λ ∘ i = ϕ. If a ⊲ R is an ideal in R, then we write S−1 a for the ideal in S−1 R
generated by i(a). S−1 a is the set of all elements of the form a/s with a ∈ a and s ∈ S.
Furthermore, S−1 a = (1) ⇔ a ∩ S ≠ ∅.
Vice versa, if A ⊲ S−1 R is an ideal in S−1 R, then we also denote the ideal i−1 (A) ⊲ R
by A ∩ R. An ideal a ⊲ R is of the form a = i−1 (A) if and only if there is no s ∈ S
whose image in R/a under the canonical map R → R/a is a proper zero divisor in R/a.
Under the mapping P → P ∩ R and p → S−1 p, the prime ideals in S−1 R correspond
exactly to the prime ideals in R, which do not contain an element of S.
We now identify R with i(R):
(2) Now, let p ⊲ R be a prime ideal in R. Then S = R \ p is multiplicative. In this
case, we write Rp instead of S−1 R, and call Rp the quotient ring of R with respect to p,
or the localization of R at p. Put m = pRp = S−1 p. Then 1 ∉ m. Each element of Rp \ m is a
unit in Rp and vice versa. In other words, each ideal a ≠ (1) in Rp is contained in m, or
equivalently, m is the only maximal ideal in Rp . A commutative ring with an identity
1 ≠ 0, which has exactly one maximal ideal, is called a local ring. Hence, Rp is a local
ring. From part (1), we additionally get that the prime ideals of the local ring Rp correspond
bijectively to the prime ideals of R, which are contained in p.
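A concrete model of this construction (illustrative only): the localization ℤ_(5) of ℤ at the prime ideal (5) consists of fractions with denominator prime to 5, and the non-units are exactly the elements of the maximal ideal m = 5ℤ_(5):

```python
# The localization Z_(5): membership and unit tests on reduced fractions.
from fractions import Fraction

P = 5

def in_localization(q):   # q lies in Z_(5): denominator prime to 5
    return q.denominator % P != 0

def is_unit(q):           # unit in Z_(5): numerator prime to 5 as well
    return in_localization(q) and q.numerator % P != 0

assert is_unit(Fraction(3, 7))                                  # inverse 7/3 also lies in Z_(5)
assert in_localization(Fraction(10, 3)) and not is_unit(Fraction(10, 3))  # in m
assert not in_localization(Fraction(1, 5))
```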
(3) Now we consider our ring extension A|R as above. Let q be a prime ideal in R.
s^n + (a_{n−1}/β) s^{n−1} + ⋅ ⋅ ⋅ + a_0/β^n = 0
over K. But s is integral over R; hence, all a_{n−i}/β^i ∈ R.
We are now prepared to prove the last preliminary lemma, which we need for the
proof of Theorem 21.6.3.
Lemma 21.6.7 (Krull’s going up lemma). Let A|R be an integral ring extension of inte-
gral domains, and let R be integrally closed in its quotient field. Let p and q be prime
ideals in R with q ⊂ p. Furthermore, let P be a prime ideal in A with P ∩ R = p. Then
there exists a prime ideal Q in A with Q ∩ R = q, and Q ⊂ P.
Proof of Theorem 21.6.3. First, let m = 0. Then {0} is a maximal ideal in A; hence, A =
K[α1 , . . . , αn ] is a field. By Corollary 20.3.11, A|K is then algebraic; therefore, trgd(A|K) = 0.
So, Theorem 21.6.3 holds for m = 0.
Now, let m ≥ 1. We use Noether’s normalization theorem. A has a polynomial ring
R = K[y1 , . . . , yr ] of the r independent indeterminates y1 , . . . , yr as a subring, and A|R is
an integral extension. As a polynomial ring over K, the ring R is a unique factorization
domain, and hence, certainly, integrally closed (in its quotient field).
Now, let
{0} = P0 ⊊ P1 ⊊ ⋅ ⋅ ⋅ ⊊ Pm (21.3)
be a strictly ascending chain of prime ideals in A. Intersecting with R gives a chain
{0} = p0 ⊂ p1 ⊂ ⋅ ⋅ ⋅ ⊂ pm (21.4)
of prime ideals pi = Pi ∩ R of R. Since A|R is integral, the chain (21.4) is also a strictly
ascending chain. This follows from Krull’s going up lemma (Lemma 21.6.7), because if
pi = pj , then Pi = Pj . If Pm is a maximal ideal in A, then also pm is a maximal ideal in
R, because A|R is integral (consider A/Pm and use Theorem 20.2.19). If the chain (21.3)
is maximal and strict, then so is the chain (21.4).
Now, let the chain (21.3) be maximal and strict. If we pass to the residue class
rings Ā = A/P1 and R̄ = R/p1 , then we get the chains of prime ideals {0} = P̄ 1 ⊂ P̄ 2 ⊂
⋅ ⋅ ⋅ ⊂ P̄ m and {0} = p̄ 1 ⊂ p̄ 2 ⊂ ⋅ ⋅ ⋅ ⊂ p̄ m for the affine K-algebras Ā and R̄ , respectively,
but of length one less. By induction, we may assume that already trgd(Ā |K) = m − 1 =
trgd(R̄ |K). On the other hand, by construction, we have trgd(A|K) = trgd(R|K) = r.
Finally, to prove Theorem 21.6.3, we have to show that r = m. If we compare both
equations, then r = m follows if trgd(R̄ |K) = r − 1. But this holds by Lemma 21.6.6.
21.7 Exercises
1. Let A = K[a1 , . . . , an ] and C|K be a field extension with C algebraically closed.
Show that there is a K-algebra homomorphism K[a1 , . . . , an ] → C.
2. Let K[x1 , . . . , xn ] be the polynomial ring of the n independent indeterminates
x1 , . . . , xn over the algebraically closed field K. Show that the maximal ideals of K[x1 , . . . , xn ]
are exactly the ideals of the form (x1 − α1 , . . . , xn − αn ) with α1 , . . . , αn ∈ K.
(4) Linear algebraic groups are the analogues of Lie groups, but over more general
fields than just the reals or complexes. Their representation theory is more com-
plicated than that of Lie groups.
For this chapter, we will consider solely the representation theory of finite groups, and
for the remainder of this chapter, when we say group, we mean finite group.
Theorem 22.2.1. There is a bijective correspondence between the set of linear actions
of a group G on a K-vector space V and the set of homomorphisms from G into GL(V),
the group of all invertible linear transformations of V, which is called the general linear
group over V.
From Theorem 22.2.1, it follows that the study of group representations is equiva-
lent to the study of linear actions of groups. This area of study, with emphasis on finite
groups and finite dimensional vector spaces, has many applications to finite group
theory.
Definition 22.2.3. Let R be a ring with identity 1, and let M be an abelian group written
additively. M is called a left R-module if there is a map R × M → M, written as (r, m) → rm,
such that the following hold:
(1) 1 ⋅ m = m;
(2) r(m + n) = rm + rn;
(3) (r + s)m = rm + sm;
(4) r(sm) = (rs)m.
Finite minimal generating sets for a given module may have different numbers of elements.
This is in contrast to the situation in free R-modules over a commutative ring R
with identity, where any two finite bases have the same number of elements (The-
orem 18.4.6).
In the following, we review the module theory that is necessary for the study of
group representations. The facts we use are straightforward extensions of the respec-
tive facts for modules over commutative rings or for groups.
Example 22.2.6. The R-submodules of a ring R are exactly the left ideals of R (see
Chapter 1). Every R-module M has at least two submodules, namely, M itself and the
zero submodule {0}.
Definition 22.2.7. A simple R-module is an R-module M ≠ {0}, which has only M and
{0} as submodules.
If N is a submodule of M, then we may construct the factor group M/N (recall that
M is abelian). We may give the factor group M/N an R-module structure by defining
r(m + N) = rm + N for every r ∈ R and m + N ∈ M/N. We call M/N the factor R-module,
or just factor module, of M by N.
N1 + N2 = {x + y | x ∈ N1 , y ∈ N2 } ⊂ M.
N ⊕ N ⊕ ⋅⋅⋅ ⊕ N
of k copies of N.
As for groups, we also have the external notion of a direct sum. If M and N are
R-modules, then we give the Cartesian product M ×N an R-module structure by setting
r(m, n) = (rm, rn), and we write M ⊕ N instead of M × N.
The notions of internal and external direct sums can be extended to any finite
number of submodules and modules, respectively.
M = M0 ⊃ M1 ⊃ ⋅ ⋅ ⋅ ⊃ Mk = {0}
of finitely many submodules Mi of M beginning with M and ending with {0}, where the
inclusions are proper, and in which each successive factor module Mi /Mi+1 is a simple
module. We call k the length of the composition series.
exists a one-to-one correspondence between their respective factor modules. Hence, the
factor modules are unique, and, in particular, the length must be the same.
Therefore, we can speak in a well-defined manner about the factor modules of a
composition series. If an R-module M has a composition series, then each submodule
N and each factor module M/N also has a composition series.
If the submodule N and the factor module M/N each have a composition series,
then the module M also has one (see Chapter 13 for the respective proofs for groups).
Definition 22.2.11. Let M and N be R-modules, and let ϕ : M → N be a group homo-
morphism. Then ϕ is an R-module homomorphism if ϕ(rm) = rϕ(m) for any r ∈ R and
m ∈ M.
As for all other structures, we define monomorphism, epimorphism, isomor-
phism, and automorphism of R-modules in analogy with the definition for groups.
Analogously to groups, we have the following results:
Theorem 22.2.12 (First isomorphism theorem). Let M and N be R-modules, and
ϕ : M → N an R-module homomorphism.
(1) The kernel ker(ϕ) = {m ∈ M | ϕ(m) = 0} of ϕ is a submodule of M.
(2) The image Im(ϕ) = {n ∈ N | ϕ(m) = n for some m ∈ M} of ϕ is a submodule of N.
(3) The R-modules M/kerϕ and Im(ϕ) are isomorphic via the map induced by ϕ.
Theorem 22.2.15 (Schur’s lemma). Let M and N be simple R-modules, and let
ϕ : M → N be a nonzero R-module homomorphism. Then ϕ is an R-module iso-
morphism.
Proof. Since M is simple, we must have either ker(ϕ) = M or
ker(ϕ) = {0}. If ker(ϕ) = M, then ϕ = 0, the zero homomorphism. Hence, since ϕ ≠ 0,
we get ker(ϕ) = {0}. Moreover, Im(ϕ) ≠ {0} is a submodule of the simple module N;
hence, Im(ϕ) = N. Therefore, ϕ is an R-module isomorphism.
The group ring RG of a group G over a ring R with identity consists of all formal sums
{ ∑_{g∈G} αg g | αg ∈ R},
where, for infinite G, almost all αg = 0. Addition is componentwise:
∑_{g∈G} αg g + ∑_{g∈G} βg g = ∑_{g∈G} (αg + βg )g.
Multiplication is given by
( ∑_{g∈G} αg g)( ∑_{h∈G} βh h) = ∑_{g∈G} ∑_{h∈G} αg βh gh = ∑_{x∈G} ( ∑_{g∈G} αg β_{g⁻¹x} )x.
The group ring RG has an identity element, which coincides with the identity element
of G. We usually denote this by just 1.
From the viewpoint of abstract group theory, it is of interest to consider the case,
where the underlying ring is an integral domain. In this connection, we mention the
famous zero divisor conjecture by Higman and Kaplansky, which poses the question
whether every group ring RG of a torsion-free group G over an integral domain R or
over a field K has no zero divisors.
The conjecture has been proved only for a fairly restricted class of torsion-free
groups.
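The torsion-free hypothesis in the conjecture is essential; the classical counterexample with torsion can be checked by hand or by a few lines of code (an illustrative sketch of the convolution product above, for G = C3 written additively as ℤ/3ℤ):

```python
# In the group ring Z[C3], torsion produces zero divisors:
# (1 - g)(1 + g + g^2) = 1 - g^3 = 0, since g^3 = 1.
from itertools import product

N = 3                                    # C3 = Z/3Z

def mul(a, b):                           # convolution product in Z[C3]
    c = {g: 0 for g in range(N)}
    for g, h in product(range(N), repeat=2):
        c[(g + h) % N] += a[g] * b[h]
    return c

one_minus_g = {0: 1, 1: -1, 2: 0}        # 1 - g
norm = {0: 1, 1: 1, 2: 1}                # 1 + g + g^2
assert mul(one_minus_g, norm) == {0: 0, 1: 0, 2: 0}   # a zero divisor pair
```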
In this chapter, we will primarily consider the case where R = K is a field and the
group G is finite, in which case the group ring KG is not only a ring, but also a finite
dimensional K-vector space having G as a basis. In this case, KG is called the group
algebra.
In mathematics, in general, an algebra over a field K is a K-vector space with a
bilinear product that makes it a ring. That is, an algebra over K is an algebraic structure
A with both a ring structure and a K-vector space structure that are compatible. That
is, α(ab) = (αa)b = a(αb) for any α ∈ K and a, b ∈ A. An algebra is finite-dimensional
if it has finite dimension as K-vector space.
Example 22.2.17.
(1) The matrix ring Mn (K) is a finite dimensional K-algebra for any natural number n.
(2) The group ring KG is a finite dimensional K-algebra when the group G is finite.
Modules over a group algebra KG can also be considered as K-vector spaces with
α ∈ K acting as α ⋅ 1 ∈ KG.
Lemma 22.2.19. If K is a field, and G is a finite group, then a KG-module is finitely gen-
erated if and only if it is finite dimensional as a K-vector space.
We now describe the fundamental connections between modules over group al-
gebras and group representation theory.
Theorem 22.2.20. If K is a field and G is a finite group, then there is a one-to-one cor-
respondence between finitely generated KG-modules and linear actions of G on finite
dimensional K-vector spaces V, and hence with the homomorphisms ρ : G → GL(V) for
finite dimensional K-vector spaces V.
Proof. If V is a finitely generated KG-module, then dimK (V) < ∞ by Lemma 22.2.19,
and the map from G × V to V obtained by restricting the module structure map from
KG × V to V is a linear action.
Conversely, let V be a finite dimensional K-vector space, on which G acts linearly.
Then we place a KG-module structure on V by defining
( ∑_{g∈G} αg g)v = ∑_{g∈G} αg (gv) for ∑_{g∈G} αg g ∈ KG and v ∈ V.
Example 22.2.21.
(1) The field K can always be considered as a KG-module by defining gλ = λ for all
g ∈ G and λ ∈ K. This module is called the trivial module.
(2) Let G act on the finite set X = {x1 , . . . , xn }. Let KX be the set
{ ∑_{i=1}^n ci xi | ci ∈ K, xi ∈ X for i = 1, . . . , n }
of formal sums. Then KX becomes a KG-module via
g( ∑_{i=1}^n ci xi ) = ∑_{i=1}^n ci (gxi ).
(4) Let U, V be KG-modules, and let HomKG (U, V) be the set of all KG-module homo-
morphisms from U to V. For ϕ, ψ ∈ HomKG (U, V), define ϕ + ψ ∈ HomKG (U, V)
by
(ϕ + ψ)(u) = ϕ(u) + ψ(u) for u ∈ U.
With this definition, HomKG (U, V) is an abelian group. Furthermore, HomKG (U, V)
is a K-vector space with (λϕ)(u) = λϕ(u) for λ ∈ K, u ∈ U and ϕ ∈ HomKG (U, V).
Note that this K-vector space has finite dimension. The K-vector space HomKG (U, V)
also admits a natural KG-module structure. For g ∈ G and ϕ ∈ HomKG (U, V), we
define
(gϕ)(u) = gϕ(g⁻¹u) for u ∈ U.
Then (g1 g2 )ϕ = g1 (g2 ϕ). It follows that HomKG (U, V) has a KG-module structure.
G acts on HomKG (U, V), and we write U ⋆ for HomKG (U, K), where K is the trivial
module. U ⋆ is called the dual module of U, and here we have (gϕ)(u) = ϕ(g −1 u).
Theorem 22.2.22 (Maschke’s Theorem). Let G be a finite group, and suppose that the
characteristic of K is either 0 or co-prime to |G|; that is, gcd(char(K), |G|) = 1. If U
is a KG-module and V is a KG-submodule of U, then V is a direct summand of U as
KG-modules.
Proof. Since V is a K-subspace of U, there is a K-linear projection π : U → V with
π(v) = v for all v ∈ V. Define π′ : U → U by
π′(u) = (1/|G|) ∑_{g∈G} gπ(g⁻¹u) for u ∈ U.
Since char(K) = 0, or gcd(char(K), |G|) = 1, it follows that |G| ≠ 0 in K; hence, 1/|G| exists
in K. Therefore, the definition of π′ makes sense.
For x ∈ G and u ∈ U, we compute
π′(xu) = (1/|G|) ∑_{g∈G} gπ(g⁻¹xu) = (1/|G|) ∑_{g∈G} xx⁻¹gπ(g⁻¹xu) = x( (1/|G|) ∑_{g∈G} x⁻¹gπ(g⁻¹xu)).
Substituting y = x⁻¹g, so that y⁻¹ = g⁻¹x, we get
π′(xu) = x( (1/|G|) ∑_{y∈G} yπ(y⁻¹u)) = xπ′(u)
as required.
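The averaging trick can be seen numerically in the smallest nontrivial case (an illustrative sketch; the group, projection, and invariant line below are assumptions chosen for the example): for G = C2 acting on K^2 by swapping coordinates, averaging any projection onto the invariant line V = span{(1, 1)} over G produces a G-equivariant projection:

```python
# Averaging a projection over G = {I, g}, g the coordinate swap, yields
# a projection that commutes with the G-action and fixes V pointwise.
import numpy as np

g = np.array([[0., 1.], [1., 0.]])       # generator of C2
pi = np.array([[1., 0.], [1., 0.]])      # some projection with image span{(1,1)}
group = [np.eye(2), g]

pi_avg = sum(h @ pi @ np.linalg.inv(h) for h in group) / len(group)

assert np.allclose(pi_avg @ pi_avg, pi_avg)                 # still a projection
assert np.allclose(pi_avg @ g, g @ pi_avg)                  # now G-equivariant
assert np.allclose(pi_avg @ np.array([1., 1.]), [1., 1.])   # fixes V pointwise
```

The kernel of the averaged projection, span{(1, −1)}, is then the G-invariant complement W with U = V ⊕ W.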
Corollary 22.2.24. Let G be a finite group and K a field. Suppose that either char(K) = 0
or char(K) is relatively prime to |G|. Then every nonzero KG-module is semisimple.
Proof. We use induction on dimK (U). If U is simple, there is nothing to show. So
assume that U is not simple. Then U must have a nonzero proper KG-submodule V.
By Maschke's theorem, we have U = V ⊕ W for some nonzero proper KG-submodule W
of U. Then both V and W have dimension strictly less than dimK (U). By the induction
hypothesis, both V and W are semisimple; therefore, U is semisimple.
Definition 22.2.25.
(1) A K-vector subspace V of U is a G-invariant subspace if gv ∈ V for all g ∈ G and
v ∈ V.
(2) Let U be nonzero. A representation ρ : G → GL(U) is irreducible if {0} and U are
the only G-invariant subspaces of U.
(3) Let U be nonzero. A representation ρ : G → GL(U) is fully reducible if each
G-invariant subspace V of U has a G-invariant complement W in U; that is,
U = V ⊕ W as K-vector spaces.
Theorem 22.2.26 (Maschke's theorem). Let G be a finite group and K a field. Suppose
that either char(K) = 0 or char(K) is relatively prime to |G|. Let U be a finite dimensional
K-vector space. Then each representation ρ : G → GL(U) is fully reducible.
Proof. By Theorem 22.2.1, we may consider U as a KG-module. Then the above version
of Maschke's theorem follows from the version for modules, because the KG-submodules
of U are exactly the G-invariant subspaces of U.
The theory of KG-modules in the case char(K) = p > 0 with p dividing |G|, in which
case KG-modules need not be semisimple, is called modular representation
theory. The earliest work on modular representations was done by Dickson, and many
of the main developments are due to Brauer. More details and a good overview may
be found in [1], [4], [5], and [17].
Proof. The implication (1) ⇒ (2) follows in the same manner as Corollary 22.2.24.
The implication (2) ⇒ (3) is direct.
Finally, we must show the implication (3) ⇒ (1). Suppose that (3) holds, and
let N be a submodule of M. Let V be a submodule of M that is maximal among
all submodules of M that intersect N trivially. Such a submodule V exists by Zorn's
lemma. We wish to show that N + V = M. Suppose that N + V ≠ M (certainly we have
N + V ⊂ M). If every simple submodule of M were contained in N + V, then as M can
be written as a sum of simple submodules, we would have M ⊂ N + V. This is not
the case, since N + V ≠ M. Hence, there is some simple submodule S of M that is not
contained in N + V. Since S ∩ (N + V) is a proper submodule of the simple module S,
we must have S ∩ (N + V) = {0}. In particular, S ∩ V = {0}, so we have V ⊊ V + S. Let
n ∈ N ∩ (V + S). Then n = s + v for some v ∈ V and s ∈ S. This gives s = n − v ∈ S ∩ (N + V),
and therefore s = 0. Hence, n = v, which forces n to be 0, because N ∩ V = {0}. It
follows that N ∩ (V + S) = {0}, which contradicts the maximality of V. Hence, we now
have M = N + V. Furthermore, since N ∩ V = {0}, we get that the sum is direct and
M = N ⊕ V. Therefore, N is a direct summand of M, which proves the implication
(3) ⇒ (1) completing the proof of the lemma.
Lemma 22.3.2. Submodules and factor modules of semisimple modules are also semi-
simple.
Proof. Let M be a semisimple A-module. By the previous lemma and the isomorphism
theorem for modules, we get that every submodule of M is isomorphic to a factor mod-
ule of M. Therefore, it suffices to show that factor modules of M are semisimple. Let
M/N be an arbitrary factor module, and let η : M → M/N with m → m + N be the
canonical map. Since M is semisimple, we have M = S1 + ⋅ ⋅ ⋅ + Sn with n ∈ ℕ, and each
Si a simple module. Then M/N = η(M) = η(S1 )+⋅ ⋅ ⋅+η(Sn ). But each η(Si ) is isomorphic
to a factor module of Si , and hence each η(Si ) is either {0} or a simple module. There-
fore, M/N is a sum of simple modules, and hence semisimple by Lemma 22.3.1.
Lemma 22.3.4. The algebra A is semisimple if and only if the A-module A is semisimple.
Proof. Suppose that the A-module A is semisimple, and let M be an A-module gen-
erated by {m1 , . . . , mr }. Let Ar denote the direct sum of r copies of A; (a1 , . . . , ar ) →
a1 m1 + ⋅ ⋅ ⋅ + ar mr defines a map from Ar to M, which is an A-module epimorphism.
Thus, M is isomorphic to a factor module of the semisimple module Ar , and hence
semisimple by Lemma 22.3.2. It follows that A is a semisimple algebra.
The converse is clear.
A ≅ S1 ⊕ ⋅ ⋅ ⋅ ⊕ Sr , r ∈ ℕ,
where the Si are simple submodules of A. Then any simple A-module is isomorphic to
some Si .
M ≅ m1 S1 + ⋅ ⋅ ⋅ + mr Sr
Definition 22.3.7. An algebra D is a division algebra or skew field if the nonzero ele-
ments of D form a group under multiplication. Equivalently, it is a ring where every nonzero
element has a multiplicative inverse. This is exactly the definition of a field, except that
commutativity is not required.
Any field K is a division algebra over itself, but there may be division algebras
that are noncommutative. If the interest is on the ring structure of D, one often speaks
about division rings (see Chapter 7).
Theorem 22.3.8. Let D be a division algebra, and let n ∈ ℕ. Then any simple
Mn (D)-module is isomorphic to Dn , and Mn (D) is an Mn (D)- module isomorphic to the
direct sum of n copies of Dn . In particular, Mn (D) is a semisimple algebra.
Proof. A nonzero submodule of Dn must contain some nonzero vector, which must
have a nonzero entry x in the j-th place for some j. This x is invertible in D.
By premultiplying this vector by Ejj (x−1 ), we see that the submodule contains the
j-th canonical basis vector. By premultiplying this basis vector by appropriate permu-
tation matrices, we get that the submodule contains every canonical basis vector, and
hence contains every vector.
It follows that Dn is the only nonzero Mn (D)-submodule of Dn , and hence Dn is
simple. Now for each 1 ≤ k ≤ n, let Ck be the submodule of Mn (D) consisting of those
matrices, whose only nonzero entries appear in the k-th column. Then we have
Mn (D) = C1 ⊕ ⋅ ⋅ ⋅ ⊕ Cn ,
and each Ck is isomorphic to Dn as an Mn (D)-module, which proves the theorem.
Definition 22.3.9. A nonzero algebra is simple if its only (two-sided) ideals (as a ring)
are itself and the zero ideal.
Theorem. Any simple algebra is semisimple.
Proof. Let A be a simple algebra, and let Σ be the sum of all simple submodules of A.
Let S be a simple submodule of A, and let a ∈ A. Then the map ϕ : S → Sa, given by
s → sa, is a module epimorphism. Therefore, Sa is simple, or Sa = {0}. In either case,
we have Sa ⊂ Σ for any submodule S and any a ∈ A.
It follows that Σ is a right ideal in A; since Σ is a sum of submodules, it is also a left ideal, and hence a two-sided ideal. However, A is simple, and Σ ≠ {0}, so we must have Σ = A. Therefore, A is the sum of simple A-modules, and from Lemmas 22.3.1 and 22.3.4, it follows that A is a semisimple algebra.
Theorem 22.3.11. Let D be a division algebra, and let n ∈ ℕ. Then Mn (D) is a simple
algebra.
Proof. Let M ∈ Mn (D) with M ≠ {0}. We must show that the principal two-sided ideal
J of Mn (D) generated by M is equal to Mn (D).
It suffices to show that J contains each Eij (1), since these matrices generate Mn (D)
as an Mn (D)-module. Since M ≠ {0}, there exists some 1 ≤ r, s ≤ n such that the
(r, s)-entry of M is nonzero. We call this entry x. By calculation, we have
Eir (x−1 )MEsj (1) = Eij (1),
and hence Eij (1) ∈ J for all i, j. Therefore, J = Mn (D).
Now let B1 , . . . , Br be algebras, and let B = B1 ⊕ ⋅ ⋅ ⋅ ⊕ Br be their direct sum. We identify each Bi with the corresponding ideal of B, so that every element of B can be written uniquely as
b = b1 + ⋅ ⋅ ⋅ + br with bi ∈ Bi .
The algebra B is the internal direct sum as algebras of the Bi . This can be seen as follows. If i ≠ j and bi ∈ Bi , bj ∈ Bj , then we must have bi bj ∈ Bi ∩ Bj = {0}, since Bi and Bj are ideals. Therefore, the product in B of b1 + ⋅ ⋅ ⋅ + br and b1′ + ⋅ ⋅ ⋅ + br′ is just b1 b1′ + ⋅ ⋅ ⋅ + br br′ .
Lemma 22.3.12. Let B = B1 ⊕ ⋅ ⋅ ⋅ ⊕ Br be a direct sum of algebras. Then the (two-sided) ideals of B are exactly the sets J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr , where each Ji is an ideal of Bi .
Proof. Let J be a (two-sided) ideal of B, and let Ji = J ∩ Bi for each i. Certainly, J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr ⊂ J. Let b ∈ J; then b = b1 + ⋅ ⋅ ⋅ + br with bi ∈ Bi for each i. For each i, let ei = (0, . . . , 0, 1, 0, . . . , 0); that is, the element of B whose only nonzero component is the identity element of Bi . Then bi = bei ∈ J ∩ Bi = Ji . Therefore, b ∈ J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr , which shows that J = J1 ⊕ ⋅ ⋅ ⋅ ⊕ Jr .
The converse is clear.
Proof. For each i, we write Bi = Ci1 ⊕ ⋅ ⋅ ⋅ ⊕ Cini using Theorem 22.3.8, where the Cij are mutually isomorphic simple Bi -modules. As we saw above, each Cij is also simple as a B-module. Therefore, as B-modules, we have B ≅ ⊕i,j Cij , and hence B is a semisimple algebra by Lemma 22.3.4. From Theorem 22.3.5, we get that any simple B-module is isomorphic to some Cij , but Cij ≅ Ckl if and only if i = k. Hence, there are exactly r isomorphism classes of simple B-modules. The final statement is a straightforward consequence of Theorem 22.3.11 and Lemma 22.3.12.
We saw that a direct sum of matrix algebras over division algebras is semisimple. We now start to show that the converse is also true; that is, any semisimple algebra is isomorphic to a direct sum of matrix algebras over division algebras. This is Wedderburn's theorem.
Definition 22.3.14. If M is an A-module, then let EndA (M) = HomA (M, M) denote the set of all A-module endomorphisms of M. In a more general context, we have seen that EndA (M) has the structure of a K-vector space via
(ϕ + ψ)(m) = ϕ(m) + ψ(m) and (λϕ)(m) = λϕ(m)
for all ϕ, ψ ∈ EndA (M), λ ∈ K, and m ∈ M. The composition of mappings gives a multiplication in EndA (M), and hence EndA (M) is a K-algebra, called the endomorphism algebra of M.
Definition 22.3.15. The opposite algebra of B, denoted Bop , is the set B together with
the usual addition and scalar multiplication, but with the opposite multiplication,
that is, the multiplication rule of B reversed.
Given a, b ∈ B, we use ab to denote their product in B, and a⋅b to denote their prod-
uct in Bop . Hence, a ⋅ b = ba. We certainly have (Bop )op = B. If B is a division algebra,
then so is Bop . The opposite of a direct sum of algebras is the direct sum of the opposite
algebras, because the multiplication in the direct sum is defined componentwise.
Endomorphism algebras and opposite algebras are closely related.
Lemma 22.3.16. Let B be an algebra. Then EndB (B) ≅ Bop .
Proof. Let ϕ ∈ EndB (B), and let a = ϕ(1). Then ϕ(b) = bϕ(1) = ba for any b ∈ B; hence, ϕ is equal to the endomorphism ψa , given by right multiplication by a. Therefore, EndB (B) = {ψa | a ∈ B}; hence, EndB (B) and B are in one-to-one correspondence. To finish the proof, we must show that ψa ψb = ψa⋅b for any a, b ∈ B.
Let a, b ∈ B. Then ψa ψb (x) = ψa (xb) = xba = ψba (x) = ψa⋅b (x), as required.
Lemma 22.3.17. Let S1 , . . . , Sr be the r distinct simple A-modules of Theorem 22.3.6. For each i, let Ui be a direct sum of copies of Si , and let U = U1 ⊕ ⋅ ⋅ ⋅ ⊕ Ur . Then
EndA (U) ≅ EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur ).
Proof. Let ϕ ∈ EndA (U). Fix some i. Then every composition factor of Ui is isomorphic to Si . Therefore, by the Jordan–Hölder theorem for modules (Theorem 22.3.10), the same is true for ϕ(Ui ), since ϕ(Ui ) is isomorphic to a quotient of Ui . Assume that ϕ(Ui ) is not contained in Ui . Then the image of ϕ(Ui ) in U/Ui under the canonical map is a nonzero submodule having Si as a composition factor. However, the composition factors of U/Ui are exactly those Sj for j ≠ i. This gives a contradiction. It follows that ϕ(Ui ) ⊂ Ui . For each i, we can therefore define ϕi = ϕ|Ui , and we have ϕi ∈ EndA (Ui ). In this way, we define a map
EndA (U) → EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur )
by setting
ϕ → (ϕ1 , . . . , ϕr ),
and it is straightforward to check that this map is an algebra isomorphism.
Lemma 22.3.18. If S is a simple A-module, then EndA (nS) ≅ Mn (EndA (S)) for any n ∈ ℕ.
Proof. We regard the elements of nS as being column vectors of length n with entries
from S. Let Φ = (ϕij ) ∈ Mn (EndA (S)). We now define the map
Γ(Φ) : nS → nS
by
Γ(Φ)(s⃗ ) = Φs⃗ ,
the matrix product being computed by applying the entries ϕij to the components of s⃗ . We have
Γ(Φ)(as⃗ + t ⃗ ) = aΓ(Φ)(s⃗ ) + Γ(Φ)(t ⃗ )
for any a ∈ A and s⃗ , t ⃗ ∈ nS, because each ϕij is an A-module homomorphism. Therefore, Γ(Φ) ∈ EndA (nS), and we easily obtain that the map
Γ : Mn (EndA (S)) → EndA (nS)
given by
Φ → Γ(Φ)
is an algebra monomorphism.
Now let ψ ∈ EndA (nS). For each 1 ≤ i, j ≤ n, we define ψij : S → S implicitly as follows: for s ∈ S, the value ψij (s) is the i-th component of the image under ψ of the column vector with s in the j-th place and 0 elsewhere.
We get that each ψij ∈ EndA (S). Now let Ψ = (ψij ) ∈ Mn (EndA (S)). Then Γ(Ψ) = ψ,
showing that Γ is also surjective, and hence an isomorphism.
Lemma 22.3.19. Suppose that K is algebraically closed, and let S be a simple A-module.
Then EndA (S) ≅ K.
Proof. Let ϕ ∈ EndA (S). Consider ϕ as a K-linear map of the finite dimensional K-vector space S into itself. Since K is algebraically closed, ϕ has an eigenvalue λϕ ∈ K. If I is the identity element of EndA (S), then (ϕ − λϕ I) ∈ EndA (S) has a nonzero kernel, and therefore is not invertible. From this, it follows that ϕ = λϕ I, since EndA (S) is a division algebra by Schur's lemma. The map ϕ → λϕ is then an isomorphism from EndA (S) to K.
Lemma 22.3.20. Let B be an algebra. Then (Mn (B))op ≅ Mn (Bop ) for any n ∈ ℕ.
Proof. Define the map ψ : (Mn (B))op → Mn (Bop ) by ψ(X) = X t , where X t is the trans-
pose of the matrix X. This map is bijective.
Let X = (xij ) and Y = (yij ) be elements of (Mn (B))op . Then for any i and j, we have
(ψ(X)ψ(Y))ij = ∑k ψ(X)ik ⋅ ψ(Y)kj = ∑k (X t )ik ⋅ (Y t )kj = ∑k xki ⋅ yjk = ∑k yjk xki = (YX)ji = ((YX)t )ij = ψ(YX)ij = ψ(X ⋅ Y)ij ,
where k runs from 1 to n, and where we used that a ⋅ b = ba in Bop and, in the last step, that X ⋅ Y = YX in (Mn (B))op . Hence, ψ(X)ψ(Y) = ψ(X ⋅ Y), and ψ is an algebra isomorphism.
We are now at the point of stating Wedderburn's main structure theorem for semisimple algebras.
Theorem 22.3.21 (Wedderburn). Any semisimple algebra is isomorphic to a direct sum of matrix algebras over division algebras.
Proof. Suppose that the algebra A is semisimple. Then A is of the form A = U1 ⊕ ⋅ ⋅ ⋅ ⊕ Ur , where each Ui is the direct sum of ni copies of a simple A-module Si , and no two of the distinct Si are isomorphic. We have Aop ≅ EndA (A) by Lemma 22.3.16, and Aop ≅ EndA (U1 ) ⊕ ⋅ ⋅ ⋅ ⊕ EndA (Ur ) by Lemma 22.3.17. By Lemma 22.3.18, this gives Aop ≅ Mn1 (EndA (S1 )) ⊕ ⋅ ⋅ ⋅ ⊕ Mnr (EndA (Sr )). Therefore,
A ≅ (Mn1 (EndA (S1 )) ⊕ ⋅ ⋅ ⋅ ⊕ Mnr (EndA (Sr )))op
≅ (Mn1 (EndA (S1 )))op ⊕ ⋅ ⋅ ⋅ ⊕ (Mnr (EndA (Sr )))op
≅ Mn1 (EndA (S1 )op ) ⊕ ⋅ ⋅ ⋅ ⊕ Mnr (EndA (Sr )op )
by Lemma 22.3.20. Each EndA (Si ) is a division algebra by Schur's lemma, and hence so is each EndA (Si )op , which proves the theorem.
Theorem 22.3.23. Suppose that the field K is algebraically closed. Then any semisimple
algebra is isomorphic to a direct sum of matrix algebras over K.
Proof. This follows directly from Lemma 22.3.19 and Theorem 22.3.21.
Theorem 22.4.1. Let G be a finite group. Then
ℂG ≅ Mf1 (ℂ) ⊕ ⋅ ⋅ ⋅ ⊕ Mfr (ℂ)
as ℂ-algebras.
Furthermore, there are exactly r isomorphism classes of simple ℂG-modules, and if
we let S1 , . . . , Sr be representations of these r classes, then we can order the Si so that
ℂG ≅ f1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ fr Sr
as ℂG-modules, where dimℂ Si = fi for each i. Any ℂG-module can be written uniquely
in the form
a1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ ar Sr with ai ∈ ℕ ∪ {0}.
Proof. The theorem follows from our results on the classification of simple and
semisimple algebras. The first statement follows from Corollary 22.2.24 and Theo-
rem 22.3.23. The second statement follows from Theorem 22.3.8 and 22.3.13, where we
take Si as the space of column vectors of length fi with the canonical module structure
over the ith summand Mfi (ℂ).
The final statement follows from Theorem 22.3.6.
Corollary 22.4.3. With the notation of Theorem 22.4.1,
f1² + ⋅ ⋅ ⋅ + fr² = |G|.
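As a quick numerical check of this corollary, here is a minimal sketch in Python, using the standard degrees of the irreducible complex representations of S3 and S4:

```python
# Sum of squared degrees of the irreducible complex representations
# equals the group order (Corollary 22.4.3).
degrees = {"S3": [1, 1, 2], "S4": [1, 1, 2, 3, 3]}
orders = {"S3": 6, "S4": 24}

for name, fs in degrees.items():
    assert sum(f * f for f in fs) == orders[name]
```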
We note that the degrees fi divide |G|. We do not need this fact; for a proof, see the appendix in the book [1].
Theorem 22.4.4. The number r of simple G-modules is equal to the number of conjugacy
classes of G.
Proof. Let Z be the center of ℂG; that is, the subalgebra of ℂG consisting of all ele-
ments that commute with every element of ℂG. From Theorem 22.4.1, it follows that Z
is isomorphic to the center of Mf1 (ℂ) ⊕ ⋅ ⋅ ⋅ ⊕ Mfr (ℂ), and therefore is isomorphic to the
direct sum of the centers of the Mfi (ℂ). It is straightforward that the center of Mfi (ℂ) is
equal to the set of scalar matrices λI with λ ∈ ℂ.
Hence, the center of Mfi (ℂ) is isomorphic to ℂ, and therefore Z ≅ ℂr , which implies
that dimℂ (Z) = r.
We now consider an element ∑g∈G λg g of Z. For any h ∈ G, we have
(∑g∈G λg g)h = h(∑g∈G λg g),
which leads to
∑g∈G λg g = ∑g∈G λg h−1 gh = ∑g∈G λhgh−1 g.
Hence, λg = λhgh−1 for all g, h ∈ G; that is, the coefficients λg are constant on the conjugacy classes of G. It follows that the class sums (the sums of the elements of the individual conjugacy classes) form a ℂ-basis of Z. Therefore, the number of conjugacy classes of G equals dimℂ (Z) = r.
We note that for any representation U, we have χU (1) = dimℂ (U), since the identity
element of G induces the identity transformation of U. Furthermore, if ρ : G → GL(U)
is the representation corresponding to U, then χU (g) is just the trace of the map ρ(g).
Thus, isomorphic ℂG-modules have equal characters.
If g, h ∈ G, then the linear transformations of U, defined by g and hgh−1 , have the
same trace. These linear transformations are called similar. Therefore, any character
is constant on each conjugacy class of G; that is, the value of the character on any two
conjugate elements is the same.
Example 22.4.6. Let U = ℂG and g ∈ G. By considering the matrix of the linear trans-
formation defined by g with respect to the basis G of ℂG, we get that χU (g) is equal to
the number of elements x ∈ G, for which gx = x. Therefore, we have χU (1) = |G| and
χU (g) = 0 for every g ∈ G with g ≠ 1. This character is called the regular character of G.
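The regular character can be verified directly for a small group. The sketch below takes G = S3, realized as permutations of {0, 1, 2}, and counts the fixed points of left multiplication on the basis G of ℂG; the setup and names are illustrative only:

```python
from itertools import permutations

# Elements of S3 as permutations of (0, 1, 2)
G = list(permutations(range(3)))
identity = (0, 1, 2)

def compose(g, x):
    # (g ∘ x)(i) = g(x(i))
    return tuple(g[x[i]] for i in range(3))

def regular_character(g):
    # chi(g) = number of x in G with g·x = x, i.e. the trace of the
    # permutation matrix of left multiplication by g on the basis G
    return sum(1 for x in G if compose(g, x) == x)

# Left multiplication is fixed-point free unless g = 1:
for g in G:
    assert regular_character(g) == (len(G) if g == identity else 0)
```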
Since one-dimensional modules are simple, we get that all linear characters are irreducible. Let χ be the linear character arising from the ℂG-module U, and let g, h ∈ G. Since U is one-dimensional, for any u ∈ U we have gu = χ(g)u and hu = χ(h)u. Then χ(gh)u = (gh)u = χ(g)χ(h)u. Hence, χ is a homomorphism from G to the multiplicative group ℂ⋆ = ℂ \ {0}. On the other hand, given a homomorphism ϕ : G → ℂ⋆ ,
we can define a one-dimensional ℂG-module U by gu = ϕ(g)u for g ∈ G and u ∈ U.
Therefore, χU = ϕ. It follows that the linear characters of G are precisely the homomorphisms from G to ℂ⋆ ; they form a group under pointwise multiplication.
Proof. By considering a ℂ-basis for U ⊕V, whose first dimℂ (U) elements form a ℂ-basis
for U ⊕ {0}, and whose remaining elements form a ℂ-basis for {0} ⊕ V, we get that
χU⊕V (g) = χU (g) + χV (g) for any g ∈ G.
Proof. The first statement follows directly from Lemma 22.4.10. Now, suppose that
χU = χV for some ℂG-modules U and V. Since ℂG is semisimple, we can write U ≅
a1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ ar Sr and V ≅ b1 S1 ⊕ ⋅ ⋅ ⋅ ⊕ br Sr with ai , bi ∈ ℕ ∪ {0}. By taking characters, we
have
0 = χU − χV = (a1 − b1 )χ1 + ⋅ ⋅ ⋅ + (ar − br )χr .
Since the irreducible characters χ1 , . . . , χr are linearly independent (Theorem 22.4.9), ai = bi for all i, and hence U ≅ V.
Theorem 22.4.13. The irreducible characters for G form a basis for the ℂ-vector space
of class functions on G.
Proof. By Theorem 22.4.9, the irreducible characters of G are linearly independent el-
ements of the space of class functions. Their number equals the number of conjugacy
classes of G by Theorem 22.4.4, and this number is equal to the dimension of the space
of class functions.
Definition 22.4.14. If α, β are class functions of G, then their inner product is the complex number
⟨α, β⟩ = (1/|G|) ∑g∈G α(g)β(g)‾ ,
where ‾ denotes complex conjugation.
This inner product is a traditional complex inner product on the space of class functions. Therefore, we have the following properties:
(1) ⟨α, α⟩ ≥ 0, and ⟨α, α⟩ = 0 if and only if α = 0;
(2) ⟨α, β⟩ = ⟨β, α⟩‾ ;
(3) ⟨λα, β⟩ = λ⟨α, β⟩ for all λ ∈ ℂ;
(4) ⟨α1 + α2 , β⟩ = ⟨α1 , β⟩ + ⟨α2 , β⟩.
and hence a ∈ U1 . It follows that U = U1 . However, the trace of T is equal to the
dimension of U1 , and then the result follows from the linearity of the trace map.
Theorem 22.4.17. ⟨χU , χV ⟩ = dimℂ (HomℂG (U, V)) for any ℂG-modules U, V.
Recall that HomℂG (U, V) is a ℂ-vector space with (ϕ + ψ)(u) = ϕ(u) + ψ(u) and (λϕ)(u) = λϕ(u) for any λ ∈ ℂ, u ∈ U, and ϕ, ψ ∈ HomℂG (U, V).
Proof. We first observe that HomℂG (U, V) is a subspace of the ℂG-module Homℂ (U, V). If ϕ ∈ HomℂG (U, V) and g ∈ G, then (gϕ)(u) = gϕ(g −1 u) = gg −1 ϕ(u) = ϕ(u) for any u ∈ U. Hence, gϕ = ϕ for all g ∈ G. This implies that ϕ ∈ Homℂ (U, V)G . Conversely, any ϕ ∈ Homℂ (U, V)G commutes with the action of G, so we get HomℂG (U, V) = Homℂ (U, V)G .
Therefore, dimℂ (HomℂG (U, V)) = dimℂ (Homℂ (U, V)G ) = ⟨χU , χV ⟩, as claimed.
Theorem 22.4.18 (First orthogonality relation). For the irreducible characters χ1 , . . . , χr of G, we have
(1/|G|) ∑g∈G χi (g)χj (g)‾ = 0, if i ≠ j, and = 1, if i = j.
In other words, the irreducible characters form an orthonormal set with respect to
the defined inner product.
Proof. Let S1 , . . . , Sr be the distinct simple ℂG-modules that go with the irreducible characters. From the previous theorem, we have
⟨χi , χj ⟩ = dimℂ (HomℂG (Si , Sj ))
for any i, j. We further have HomℂG (Si , Si ) ≅ ℂ, and by Schur's lemma HomℂG (Si , Sj ) = {0} for i ≠ j, proving the theorem.
Corollary 22.4.19. The irreducible characters form an orthonormal basis for the vector space of class functions.
Proof. The irreducible characters form a basis for the space of characters, and from
the orthogonality result they are an orthonormal set relative to the inner product.
The second orthogonality relation says that the columns of the character table are
also a set of orthogonal vectors. That is, the irreducible characters of a set of conjugacy
class representatives also forms an orthogonal set with respect to the defined inner
product.
χ1 (gi )χ1 (gj )‾ + ⋅ ⋅ ⋅ + χr (gi )χr (gj )‾ = 0, if i ≠ j, and = |G|/ki , if i = j,
where g1 , . . . , gr are representatives of the conjugacy classes of G, and ki is the number of elements in the conjugacy class of gi .
Proof. Let χ = (χi (gj ))1≤i,j≤r be the character table for G, let K be the r × r diagonal matrix with k1 , . . . , kr on its main diagonal, and let χ ∗ denote the conjugate transpose of the matrix χ. Then we have (χK)ij = χi (gj )kj for any i, j, and hence
(χKχ ∗ )ij = k1 χi (g1 )χj (g1 )‾ + ⋅ ⋅ ⋅ + kr χi (gr )χj (gr )‾ = ∑g∈G χi (g)χj (g)‾ = |G|⟨χi , χj ⟩,
which is |G| if i = j, and 0 otherwise, by the first orthogonality relation. Hence, χKχ ∗ = |G|I with I the r × r identity matrix. A right inverse of a square matrix is also a left inverse, so χ ∗ χK = |G|I as well; that is, χ ∗ χ = |G|K −1 . Comparing the (i, j)-entries of χ ∗ χ = |G|K −1 gives
χ1 (gi )‾ χ1 (gj ) + ⋅ ⋅ ⋅ + χr (gi )‾ χr (gj ) = 0 for i ≠ j, and = |G|/ki for i = j,
as claimed.
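Both orthogonality relations can be checked numerically. The sketch below does so for the (standard) character table of S3, whose classes have sizes 1, 3, 2; all character values here happen to be real:

```python
from fractions import Fraction

# Character table of S3; columns are the classes {1}, transpositions,
# 3-cycles, with sizes 1, 3, 2.
sizes = [1, 3, 2]
table = [
    [1,  1,  1],   # trivial character
    [1, -1,  1],   # sign character
    [2,  0, -1],   # degree-2 character
]
order = sum(sizes)  # |G| = 6

def inner(chi, psi):
    # <chi, psi> = (1/|G|) sum over g of chi(g) * conj(psi(g))
    return Fraction(sum(k * a * b for k, a, b in zip(sizes, chi, psi)), order)

# First orthogonality: the rows are orthonormal.
for i in range(3):
    for j in range(3):
        assert inner(table[i], table[j]) == (1 if i == j else 0)

# Second orthogonality: columns satisfy
# sum over s of chi_s(g_i) * conj(chi_s(g_j)) = |G|/k_i if i = j, else 0.
for i in range(3):
    for j in range(3):
        col = sum(table[s][i] * table[s][j] for s in range(3))
        assert col == (Fraction(order, sizes[i]) if i == j else 0)
```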
As mentioned before, more information about character tables and their conse-
quences can be found in [1].
Recall that a group G is solvable if it has a normal series with abelian factors. Solv-
able groups play a crucial role in the proof of the insolvability of the quintic polyno-
mial, and we discussed solvable groups in detail in Chapter 12. For the proof, we need
the following two facts about solvable groups:
1. If a group G has a normal solvable subgroup N with G/N solvable, then G is solv-
able (Theorem 12.2.3).
2. Any finite group of prime power order is solvable (Theorem 12.2.8).
Lemma 22.5.1. Let χ be a character of G. The value χ(g) for any g ∈ G is an algebraic
integer.
Proof. Let ρ be a representation with character χ, and let g ∈ G have order n. Then ρ(g)n = I, so the eigenvalues of ρ(g) are roots of unity, and χ(g) is a sum of roots of unity. Any root of unity is a root of the monic integer polynomial X n − 1, and hence is an algebraic integer. Since the algebraic integers form a ring, any sum of roots of unity is an algebraic integer.
Lemma 22.5.2. Let χ be an irreducible character of G. Let g ∈ G and CG (g) the centralizer of g in G. Then
|G : CG (g)| χ(g)/χ(1)
is an algebraic integer.
Proof. Let
λ = |G : CG (g)| χ(g)/χ(1),
and let α ∈ ℂG be the sum of the |G : CG (g)| elements of the conjugacy class of g; α is a central element of ℂG. Let S be a simple ℂG-module with character χ. Since α is central, it acts on S as a ℂG-endomorphism, and hence, by Schur's lemma, as multiplication by a scalar c. Taking traces gives cχ(1) = |G : CG (g)| χ(g), so c = λ.
Let τ : ℂG → ℂG, τ(z) = zα for z ∈ ℂG. We get τ ∈ EndℂG (ℂG) by the proof of Lemma 22.3.16. Since ℂG is semisimple, we may consider the simple module S as a submodule of ℂG, and for 0 ≠ s ∈ S ⊂ ℂG, we have τ(s) = sα = αs = λs, since α is a central element.
Therefore, λ is an eigenvalue of τ, and so det(λI − A) = 0, where I is the identity
matrix, and A the matrix of τ with respect to the ℂ-basis G for ℂG. Each entry of A is
either 0 or 1, which means that, in particular, f (X) = det(XI − A) is a monic polynomial
in X with integer coefficients. Since f (λ) = 0, we get that λ is an algebraic integer.
By Lemma 22.5.2, each
|G : CG (gi )|χ(gi )/χ(1)
is an algebraic integer, and χ(gi )‾ = χ(gi−1 ) is an algebraic integer by Lemma 22.5.1. Since ⟨χ, χ⟩ = 1, we have |G| = ∑i |G : CG (gi )|χ(gi )χ(gi )‾ , the sum running over representatives g1 , . . . , gr of the conjugacy classes of G. Therefore,
|G|/χ(1) = (1/χ(1)) ∑i |G : CG (gi )|χ(gi )χ(gi )‾ = ∑i (|G : CG (gi )|χ(gi )/χ(1)) χ(gi )‾
is an algebraic integer. Since |G|/χ(1) is also a rational number, it is an integer; that is, the degree χ(1) divides |G|.
Theorem 22.5.5. If G has a conjugacy class of nontrivial prime power order, then G is
not simple.
Proof. Suppose that G is simple and that the conjugacy class of 1 ≠ g ∈ G has order pn
with p a prime number, and n ∈ ℕ. From the second orthogonality relation, we get
0 = (1/p) ∑i χi (g)χi (1) = 1/p + (1/p) ∑i≥2 χi (g)χi (1),
where the sums run over the irreducible characters χ1 , . . . , χr of G, and χ1 is the trivial character.
First of all, if g ∈ Zi , then g −1 ∈ Zi . From Theorem 22.4.8, we also get that |χi (g)| = χi (1) if and only if the linear transformation defined by g on the module corresponding to χi has exactly one eigenvalue. If g ∈ Zi , let this eigenvalue be λ(g), so that, if U is the ℂG-module corresponding to χi , then we have gu = λ(g)u for all
u ∈ U. We now see that for g, h ∈ Zi , then (gh)u = λ(g)λ(h)u for all u ∈ U. Hence,
χi (gh) = χi (1)λ(g)λ(h), and thus |χi (gh)| = χi (1), which gives gh ∈ Zi . Therefore, Zi is a
subgroup of G.
Now, let Ki = {x ∈ G | χi (x) = χi (1)}. Then Ki is a normal subgroup of G contained in Zi .
We now want to show that
Zi /Ki = Z(G/Ki ),
Theorem 22.5.6 (Burnside’s Theorem). If |G| = pa qb , where p and q are prime numbers
and a, b ∈ ℕ, then G is solvable.
The class equation for G reads
pa qb = |G| = 1 + h2 + h3 + ⋅ ⋅ ⋅ + hr ,
where h2 , . . . , hr are the sizes of the nontrivial conjugacy classes of G.
22.6 Exercises
1. Let K be a field, and let G be a finite group. Let U and V be KG-modules having the
same dimension n, and let ρ : G → GL(U) and τ : G → GL(V) be the corresponding
representations.
By fixing K-bases for U and V, consider ρ and τ as homomorphisms from G to
GL(n, K). Show that U and V are KG-module isomorphic if and only if there exists
some M ∈ GL(n, K) such that ρ(g)M = Mτ(g) for every g ∈ G.
U = V1 ⊕ ⋅ ⋅ ⋅ ⊕ Vk
https://doi.org/10.1515/9783110603996-023
ceiver, usually referred to as Bob and Alice. In the latter, the encryption method is
public knowledge but only the receiver knows how to decode.
The message that one wants to send is written in plaintext, and then converted
into code. The coded message is written in ciphertext. The plaintext message and ci-
phertext message are written in some alphabets that are usually the same. The process
of putting the plaintext message into code is called enciphering or encryption, whereas
the reverse process is called deciphering or decryption. Encryption algorithms break
the plaintext and ciphertext message into message units. These are single letters or
pairs of letters, or more generally, k-vectors of letters. The transformations are done
on these message units, and the encryption algorithm is a mapping from the set of
plaintext message units to the set of ciphertext message units. Putting this into a math-
ematical formulation we let
f : 𝒫 → 𝒞.
The decryption algorithm is then a map g : 𝒞 → 𝒫 with g ∘ f = id𝒫 .
For example, in a shift cipher on an N letter alphabet, with the letters regarded as the elements of ℤN , a message unit m is encrypted by
f : m → m + k mod N
for a fixed key k ∈ ℤN .
This is often known as a Caesar code after Julius Caesar, who supposedly invented it.
It was used by the Union Army during the American Civil War. For example, if both the
plaintext and ciphertext alphabets were English, and each message unit was a single
letter, then N = 26. Suppose k = 5, and we wish to send the message ATTACK. If a = 0,
then ATTACK is the numerical sequence 0, 19, 19, 0, 2, 10. The encoded message would
then be FYYFHP.
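The shift encryption just described can be sketched in a few lines of Python (the function names are illustrative):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_encrypt(message, k):
    # f : m -> m + k mod N, applied letter by letter (A = 0, ..., Z = 25)
    N = len(ALPHABET)
    return "".join(ALPHABET[(ALPHABET.index(c) + k) % N] for c in message)

def caesar_decrypt(message, k):
    # decryption is the shift by -k
    return caesar_encrypt(message, -k)

print(caesar_encrypt("ATTACK", 5))  # FYYFHP
```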
Any permutation encryption algorithm that goes letter to letter is very simple to attack using a statistical analysis. If enough messages are intercepted, and the plaintext language is guessed, then a frequency analysis of the letters will suffice to crack
the code. For example, in the English language, the three most commonly occurring
letters are E, T, and A with a frequency of occurrence of approximately 13 % and 9 %
and 8 %, respectively. By examining the frequency of occurrences of letters in the ci-
phertext the letters corresponding to E, T, and A can be uncovered.
Example 23.1.3. A variation on the Caesar code is the Vigenère code. Here, message units are considered as k-vectors of integers mod N from an N letter alphabet. Let B = (b1 , . . . , bk ) be a fixed k-vector in ℤkN . The Vigenère code then takes a message unit v ∈ ℤkN to v + B mod N, the addition being componentwise.
From a cryptanalysis point of view, a Vigenère code is no more secure than a Caesar code and is susceptible to the same type of statistical attack.
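A minimal sketch of this encryption, with a keyword playing the role of the fixed vector B (names illustrative):

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def vigenere_encrypt(message, keyword):
    # Message units are k-vectors; the fixed key vector
    # B = (b1, ..., bk) is added componentwise mod N.
    N = len(ALPHABET)
    B = [ALPHABET.index(c) for c in keyword]
    return "".join(ALPHABET[(ALPHABET.index(c) + B[i % len(B)]) % N]
                   for i, c in enumerate(message))

# With a one-letter keyword, this is exactly a Caesar code:
print(vigenere_encrypt("ATTACK", "F"))  # FYYFHP (shift by 5)
```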
The Alberti Code is a polyalphabetic cipher, and can often be used to thwart a sta-
tistical frequency attack. We describe it in the next example.
Key Letters
      A B C D E O S T U
  A   a b c d e o s t u
  B   b c d e o s t u a
  C   c d e o s t u a b
  D   d e o s t u a b c
  E   e o s t u a b c d
  O   o s t u a b c d e
  S   s t u a b c d e o
  T   t u a b c d e o s
  U   u a b c d e o s t
The rows are the plaintext alphabets, labeled by their first letters; the columns are indexed by the key letters.
Suppose the plaintext message is STAB DOC and Bob and Alice have chosen the
keyword BET. We place the keyword repeatedly over the message
B E T B E T B
S T A B D O C.
To encode, we look at B, which lies over S. The intersection of the B key letter column and the S alphabet row is a T; so we encrypt the S with T. The next key letter is E, which lies over T. The intersection of the E key letter with the T alphabet is C. Continuing in this manner, and ignoring the space, we get the encryption
STABDOC → TCTCTDD.
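Reading the tableau above, each row is a cyclic shift of the order a, b, c, d, e, o, s, t, u from its first row, so the lookup amounts to adding indices in that order. A sketch under that reading (function name illustrative):

```python
ORDER = "abcdeostu"  # first row of the tableau; later rows are cyclic shifts

def alberti_encrypt(message, keyword):
    # Tableau lookup: row = plaintext letter, column = key letter;
    # the entry is the plaintext index shifted by the key index.
    n = len(ORDER)
    out = []
    for i, p in enumerate(message.lower()):
        k = keyword.lower()[i % len(keyword)]
        out.append(ORDER[(ORDER.index(p) + ORDER.index(k)) % n])
    return "".join(out).upper()

print(alberti_encrypt("STABDOC", "BET"))  # TCTCTDD
```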
Example 23.1.5. A final example, which is not number theory based, is the so-called Beale cipher. This has a very interesting history, which is related in the popular book Archimedes' Revenge by Paul Hoffman (see [66]). Here, letters are encrypted by numbering the first letters of each word in some document like the Declaration of Independence or the Bible. There will then be several choices for each letter, making a Beale cipher quite difficult to attack.
Until relatively recent times, cryptography was mainly concerned with message
confidentiality—that is sending secret messages so that interceptors or eavesdroppers
cannot decipher them. The discipline was primarily used in military and espionage sit-
uations. This changed with the vast amount of confidential data that had to be trans-
mitted over public airways. Thus, the field has expanded to many different types of
cryptographic techniques, such as digital signatures and message authentications.
Cryptography and encryption have a long and celebrated history. In the Bible, in the book of Jeremiah, what is called an Atbash code is used. In this code, the letters of the alphabet—Hebrew in the Bible, but it can be used with any alphabet—are permuted first to last. That is, in the Latin alphabet, Z would go to A and so on.
The Kabbalists and the Kabbala believe that the Bible—written in Hebrew, where
each letter also stands for a number—is a code from heaven. They have devised elabo-
rate ways to decode it. This idea has seeped into popular culture, where the book “The
Bible Code” became a bestseller.
In his military campaigns, Julius Caesar would send out coded messages. His
method, which we looked at in the last section, is now known as a Caesar code. It is a
shift cipher. That is, each letter is shifted a certain amount to the right. A shift cipher
is a special case of an affine cipher that will be elaborated upon in the next section.
The Caesar code was resurrected and used during the American Civil War.
Coded messages produced by most of the historical methods reveal statistical in-
formation about the plaintext. This could be used in most cases to break the codes.
Frequency analysis was discovered by the Arab mathematician Al-Kindi in the ninth century, after which the basic classical substitution ciphers became more or less
easily breakable. About 1470, Leon Alberti developed a method to thwart statistical
analysis. His innovation was to use a polyalphabetic cipher, where different parts of
the message are encrypted with different alphabets. We looked at an example of an
Alberti code in this section.
A different way to thwart statistical attacks is to use blank and neutral letters,
that is, meaningless letters within the message. Mary, Queen of Scots, used a ran-
dom permutation cipher with neutrals in it, where a neutral was a random mean-
ingless symbol. Unfortunately for her, her messages were decoded, and she was be-
headed.
There have been various physical devices and aids used to create codes. Prior
to the widespread use of the computer, the most famous cryptographic aid was the
Enigma machine, developed and used by the German military during the Second World
War. This was a rotor machine using a polyalphabetic cipher. An early version was
broken by Polish cryptographers early in the war, so a larger system was built that
was considered unbreakable. British cryptographers led by Alan Turing broke this,
and British knowledge of German secrets had a great effect on the latter part of the
war.
The development of digital computers allowed for the development of much more
complicated cryptosystems. Furthermore, this allowed for the encryption using any-
thing that can be placed in binary formats, whereas historical cryptosystems could
only be rendered using language texts. This has revolutionized cryptography.
In 1976, Diffie and Hellman developed the first usable public key exchange protocol. This allowed for the transmission of secret data over open airways. A year later, Rivest, Shamir, and Adleman developed the RSA algorithm, a second public key protocol. There are now many, and we will discuss them later. In 1997, it became known that public key cryptography had been developed earlier by James Ellis working for British Intelligence, and that both the Diffie–Hellman and RSA protocols had been developed earlier by Malcolm Williamson and Clifford Cocks, respectively.
Before we close this introductory section, we give a short overview of the crypto-
graphic tasks.
Secure confidential message transmission is only one type of task that must be
done with secrecy, and there are many other tasks and procedures that are important
in cryptography.
Several cryptographic tasks are described below. As more techniques are intro-
duced later in the book, we will look at more instances of these cryptographic tasks
and cryptographic protocols to handle them.
(1) Authentication: Authentication refers to the process of determining that a mes-
sage, supposedly from a given person, does come from that person, and further,
has not been tampered with. Included in the general topic of authentication are
the concepts of hash functions and digital signatures. Another important usage is
password identification.
(2) Key exchange and key transport: In a key exchange protocol, two people, usually
called Bob and Alice, exchange a secret shared key to be used in some symmetric
encryption. In a key transport protocol, one party transports to another a secret
key that is to be used.
(3) Secret sharing: Secret sharing involves methods, where some secret is to be shared
by k people, but not available to any proper subset of them. There are many ways
to accomplish this, and it is related to a classical lock and key problem. A beautifully simple solution to the general problem using polynomial interpolation is due to Shamir (see, for instance, the book [54] by Baumslag, Fine, Kreuzer, and
Rosenberger).
(4) Zero knowledge proof : A zero-knowledge proof is an argument that convinces
someone that you have solved a problem, for example, a combinatorial problem,
without giving away the solution. This is tied to authentication.
f :m→m+k mod N.
The shift algorithm is a special case of an affine cipher. Recall that an affine map
on a ring R is a function f (x) = ax + b with a, b, x ∈ R. We apply such a map to the ring
of integers modulo N, that is, R = ℤN , as the encryption map. Again, suppose we have
an N letter alphabet, and we consider the letters as the integers 0, 1, . . . , N − 1 mod N,
that is, in the ring ℤN . We choose integers a, b ∈ ℤN with (a, N) = 1 and b ≠ 0. The
numbers a, b are called the keys of the cryptosystem. The encryption map is then given
by
f : m → am + b mod N.
Example 23.2.1. Using an affine cipher with the English language and keys a = 3,
b = 5 encode the message EAT AT JOE’S. Ignore spaces and punctuation.
The numerical sequence for the message, ignoring the spaces and punctuation, is 4, 0, 19, 0, 19, 9, 14, 4, 18. Applying f gives
17, 5, 62, 5, 62, 32, 47, 17, 59 → 17, 5, 10, 5, 10, 6, 21, 17, 7 mod 26,
that is, the ciphertext RFKFKGVRH.
Since (a, N) = 1, the integer a has a multiplicative inverse a−1 mod N. The decryp-
tion map for an affine cipher with keys a, b is then
g = f −1 : m → a−1 (m − b) mod N.
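The pair f, g can be sketched directly, using Python's built-in modular inverse pow(a, -1, N); the data from Example 23.2.1 serve as a check:

```python
def affine_encrypt(nums, a, b, N=26):
    # f : m -> a*m + b mod N; requires gcd(a, N) = 1
    return [(a * m + b) % N for m in nums]

def affine_decrypt(nums, a, b, N=26):
    # g : m -> a^{-1} (m - b) mod N
    a_inv = pow(a, -1, N)  # modular inverse of a mod N (Python 3.8+)
    return [(a_inv * (m - b)) % N for m in nums]

plaintext = [4, 0, 19, 0, 19, 9, 14, 4, 18]   # E A T A T J O E S
cipher = affine_encrypt(plaintext, 3, 5)
print(cipher)                                  # [17, 5, 10, 5, 10, 6, 21, 17, 7]
assert affine_decrypt(cipher, 3, 5) == plaintext
```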
Since an affine cipher, as given above, goes letter to letter, it is easy to attack using
a statistical frequency approach. Furthermore, if an attacker can determine two letters
and knows that it is an affine cipher, the keys can be determined and the code broken.
To give better security it is preferable to use k-vectors of letters as message units. The
form then of an affine cipher becomes
f : v → Av + B,
where v and B are k-vectors from ℤkN , and A is an invertible k × k matrix with entries
from the ring ℤN . The computations are then done modulo N. Since v is a k-vector,
and A is a k × k matrix, the matrix product Av produces another k-vector from ℤkN .
Adding the k-vector B again produces a k-vector, so the ciphertext message unit is
again a k-vector. The keys for this affine cryptosystem are the enciphering matrix A,
and the shift vector B. The matrix A is chosen to be invertible over ℤN (equivalent to
the determinant of A being a unit in the ring ℤN ), so the decryption map is given by
v → A−1 (v − B).
Here, A−1 is the matrix inverse over ℤN , and v is a k-vector. The enciphering matrix A
and the shift vector B are now the keys of the cryptosystem.
A statistical frequency attack on such a cryptosystem requires knowledge, within
a given language, of the statistical frequency of k-strings of letters. This is more diffi-
cult to determine than the statistical frequency of single letters. As for a letter to letter
affine cipher, if k + 1 message units, where k is the message block length, are discov-
ered, then the code can be broken.
Example 23.2.2. Using an affine cipher with message units of length 2 in the English
language and keys
A = ( 5 1 ),    B = ( 5 ),
    ( 8 7 )         ( 3 )
encode the message EAT AT JOE’S. Again ignore spaces and punctuation.
Message units of length 2; that is, 2-vectors of letters are called digraphs. We first
must place the plaintext message in terms of these message units. The numerical se-
quence for the message EAT AT JOE’S, ignoring the spaces and punctuation, is as be-
fore
( 4 )  ( 19 )  ( 19 )  ( 14 )  ( 18 )
( 0 ), (  0 ), (  9 ), (  4 ), ( 18 ),
where the final S of EATATJOES has been repeated to fill the last digraph. For the first vector, we compute
A ( 4 ) + B = ( 5 1 ) ( 4 ) + ( 5 ) = ( 20 ) + ( 5 ) = ( 25 ) mod 26.
  ( 0 )       ( 8 7 ) ( 0 )   ( 3 )   ( 32 )   ( 3 )   (  9 )
Continuing in this manner gives the ciphertext vectors
( 25 )  ( 22 )  (  5 )  (  1 )  (  9 )
(  9 ), ( 25 ), ( 10 ), ( 13 ), ( 13 ),
that is, the letter pairs
( Z )  ( W )  ( F )  ( B )  ( J )
( J ), ( Z ), ( K ), ( N ), ( N ).
Hence, the encoded message is ZJWZFKBNJN.
Example 23.2.3. Suppose we receive the message ZJWZFKBNJN, and we wish to de-
code it. We know that an affine cipher with message units of length 2 in the English
language and keys
A = ( 5 1 ),    B = ( 5 )
    ( 8 7 )         ( 3 )
is being used.
The decryption map is given by
v → A−1 (v − B),
so we must find the inverse matrix for A. For a 2 × 2 invertible matrix, we have
( a b )−1 = 1/(ad − bc) (  d −b )
( c d )                 ( −c  a ).
Since ad − bc = 5 ⋅ 7 − 1 ⋅ 8 = 27 ≡ 1 mod 26, we get
A = ( 5 1 )  ⇒  A−1 = (  7 −1 )  mod 26.
    ( 8 7 )           ( −8  5 )
The ciphertext ZJWZFKBNJN corresponds to the digraph vectors
( 25 )  ( 22 )  (  5 )  (  1 )  (  9 )
(  9 ), ( 25 ), ( 10 ), ( 13 ), ( 13 ).
For the first vector, we compute
A−1 ( ( 25 ) − B ) = (  7 −1 ) ( 20 ) = ( 4 ) mod 26.
    ( (  9 )     )   ( −8  5 ) (  6 )   ( 0 )
Continuing in this manner, we recover the plaintext vectors
( 4 )  ( 19 )  ( 19 )  ( 14 )  ( 18 )
( 0 ), (  0 ), (  9 ), (  4 ), ( 18 ),
that is, the letter pairs
( E )  ( T )  ( T )  ( O )  ( S )
( A ), ( A ), ( J ), ( E ), ( S ).
This gives us
ZJWZFKBNJN → EATATJOESS.
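The digraph cipher of Examples 23.2.2 and 23.2.3 can be sketched as follows; the helper names are illustrative, and the decryption computes A−1 via the 2 × 2 adjugate formula:

```python
ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_pairs(msg):
    # split an even-length message into digraphs (2-vectors of letters)
    nums = [ALPHA.index(c) for c in msg]
    return list(zip(nums[0::2], nums[1::2]))

def apply_affine(pairs, A, B, N=26):
    # v -> A v + B mod N on column 2-vectors
    return [((A[0][0] * x + A[0][1] * y + B[0]) % N,
             (A[1][0] * x + A[1][1] * y + B[1]) % N) for x, y in pairs]

def encrypt(msg, A, B, N=26):
    return "".join(ALPHA[x] + ALPHA[y]
                   for x, y in apply_affine(to_pairs(msg), A, B, N))

def decrypt(msg, A, B, N=26):
    # v -> A^{-1}(v - B) = A^{-1} v + (-A^{-1} B); A^{-1} via the adjugate
    d = pow((A[0][0] * A[1][1] - A[0][1] * A[1][0]) % N, -1, N)
    Ai = [[d * A[1][1] % N, -d * A[0][1] % N],
          [-d * A[1][0] % N, d * A[0][0] % N]]
    Bi = [(-Ai[0][0] * B[0] - Ai[0][1] * B[1]) % N,
          (-Ai[1][0] * B[0] - Ai[1][1] * B[1]) % N]
    return encrypt(msg, Ai, Bi, N)

A, B = [[5, 1], [8, 7]], [5, 3]
assert encrypt("EATATJOESS", A, B) == "ZJWZFKBNJN"
assert decrypt("ZJWZFKBNJN", A, B) == "EATATJOESS"
```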
Modern cryptography is done via a computer. Hence, all messages both plaintext
and ciphertext are actually presented as binary strings. Important in this regard is the
concept of a hash function.
A cryptographic hash function is a deterministic function
h : S → {0, 1}n ,
which returns, for each arbitrary block of data (called a message), a fixed-size bit
string. It should have the property that any change in the data changes the hash value.
The hash value is called the digest.
An ideal cryptographic hash function has the following properties:
(1) It is easy to compute the hash value for any given message.
(2) It is infeasible to find a message that has a given hash value (preimage resistant).
(3) It is infeasible to modify a message without changing its hash.
(4) It is infeasible to find two different messages with the same hash (collision resis-
tant).
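These properties are easy to observe experimentally with a standard hash function such as SHA-256 (our illustrative choice here; the text does not single out a particular function):

```python
import hashlib

def digest(message: bytes) -> str:
    """Return the SHA-256 digest of a message as a hex string."""
    return hashlib.sha256(message).hexdigest()

d1 = digest(b"EAT AT JOES")
d2 = digest(b"EAT AT JOE'S")   # a tiny change in the data

# The digest has fixed size (n = 256 bits, i.e. 64 hex characters),
# and even a one-character change yields a completely different digest.
print(len(d1), d1 == d2)       # 64 False
```

Preimage and collision resistance, of course, cannot be demonstrated by running code; they are conjectured properties of the function.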
To tie the shared secret key K to the encryption, Alice can send
fK (m) ⊕ h(K),
where fK is an encryption algorithm depending on K.
Alice could have just as easily sent fK (m) ⊕ K. However, sending the hash has two
benefits. Usually the hash is shorter than the key, and from the properties of hash
functions, it gives another level of security. As we will see, tying the secret key to the
actual encryption in this manner is the basis for the El-Gamal and elliptic curve cryp-
tographic methods.
The encryption algorithm fK is usually a symmetric key encryption, so that anyone
knowing K can encrypt and decrypt easily. However, it should be resistant to plaintext-
ciphertext attacks. That is, if an attacker gains some knowledge of a piece of plaintext
together with the corresponding ciphertext, it should not compromise the whole sys-
tem.
The encryption algorithm can either be a block cipher or a stream cipher. In the
former, blocks of fixed length k are transformed into blocks of fixed length n, and there
is a method to tie the encrypted blocks together. In the latter, a stream cipher, bits are
transformed one by one into new bit strings by some procedure.
In 2001, the National Institute of Standards and Technology adopted a block cipher, now called AES for Advanced Encryption Standard, as the industry standard for symmetric key encryption. Although not universally used, it is the most widely used. This block cipher was a standardization of the Rijndael cipher, named after its inventors Rijmen and Daemen.
AES replaced DES, the Data Encryption Standard, which had been the previous standard. Parts of DES were found to be insecure. AES proceeds with several rounds of encrypting blocks, and then mixing blocks. The mathematics in AES is done over the finite field GF(2^8).
The basic idea in a public key cryptosystem is to have a one-way function or trap-
door function. That is, a function, which is easy to implement, but very hard to invert.
Hence, it becomes simple to encrypt a message, but very hard, unless you know the
inverse, to decrypt.
The standard model for public key systems is the following: Alice wants to send
a message to Bob. The encrypting map fA for Alice is public knowledge, as well as the
encrypting map fB for Bob. On the other hand, the decryption algorithms gA and gB
are secret and known only to Alice and Bob, respectively. Let 𝒫 be the message Alice
wants to send to Bob. She sends fB gA (𝒫 ). To decode, Bob applies first gB , which only he
knows. This gives him gB (fB gA (𝒫 )) = gA (𝒫 ). He then looks up fA , which is publicly
available and applies this fA (gA (𝒫 )) = 𝒫 to obtain the message. Why not just send
fB (𝒫 )? Bob is the only one who can decode this. The idea is authentication, that is, be-
ing certain from Bob’s point of view that the message really came from Alice. Suppose
𝒫 is Alice’s verification: signature, social security number, et cetera. If Bob receives
fB (𝒫 ), it could be sent by anyone, since fB is public. On the other hand, since only
Alice supposedly knows gA , getting a reasonable message from fA (gB fB gA (𝒫 )) would
verify that it is from Alice. Applying gB alone should result in nonsense.
Getting a reasonable one-way function can be a formidable task. The most widely
used (at present) public key systems are based on difficult-to-invert number theoretic
functions. The original public key system was developed by Diffie and Hellman in 1976.
It was followed closely by a second public key system developed by Rivest, Shamir,
and Adleman, known as the RSA system. Although at present there are many different
public key systems in use, most are variations of these original two. The variations are
attempts to make the systems more secure. We will discuss four such systems.
Diffie and Hellman in 1976 developed the original public key idea using the discrete log
problem. In modular arithmetic, it is easy to raise an element to a power, but difficult
to determine, given an element, whether it is a power of another element. Specifically, if G is a finite group, such as the cyclic multiplicative group of ℤp , where p is a prime, and h = g^k for some k, then the discrete log of h to the base g is any integer t with h = g^t .
The rough form of the Diffie–Hellman public key system is as follows: Bob and
Alice will use a classical cryptosystem based on a key k with 1 < k < q − 1 where q is a
prime. It is the key k that Alice must share with Bob. Let g be a multiplicative generator
of ℤ⋆q , the multiplicative group of ℤq . The generator g is public. It is known that this
group is cyclic if q is a prime.
Alice chooses an a ∈ ℤ⋆q with 1 < a < q − 1. She makes public g^a . Bob chooses a b ∈ ℤ⋆q and makes public g^b . The secret key is g^ab . Both Bob and Alice, but presumably no one else, can discover this key. Alice knows her secret power a, and the value g^b is public from Bob. Hence, she can compute the key g^ab = (g^b )^a . The analogous situation holds for Bob. An attacker, however, only knows g^a , g^b , and g. Unless the attacker can solve the discrete log problem, the key exchange is secure.
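For illustration, here is the exchange with toy parameters (the values q = 23, g = 5 and the secret exponents are our own choices; real parameters are hundreds of digits long):

```python
# Diffie–Hellman key exchange with toy parameters (insecure sizes).
q = 23   # public prime
g = 5    # public generator of the multiplicative group of Z_q

a = 4                  # Alice's secret exponent
A = pow(g, a, q)       # Alice publishes g^a mod q

b = 3                  # Bob's secret exponent
B = pow(g, b, q)       # Bob publishes g^b mod q

# Each party computes the shared key g^(ab) from the other's public value.
key_alice = pow(B, a, q)   # (g^b)^a
key_bob   = pow(A, b, q)   # (g^a)^b
print(A, B, key_alice == key_bob)
```

An attacker sees only q, g, A, and B, and must solve the discrete log problem (or the Diffie–Hellman problem) to find the shared key.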
Given q, g, g^a , g^b , the problem of determining the secret key g^ab is called the Diffie–Hellman problem. At present, the only known solution is to solve the discrete log problem, which appears to be very hard. In choosing the prime q and the generator g, it is
assumed that the prime q is very large, so that the order of g is very large. There are
algorithms to solve the discrete log problem if q is too small.
One attack on the Diffie–Hellman key exchange is a man in the middle attack. Since
the basic protocol involves no authentication, an attacker can pretend to be Bob and
get information from Alice, and then pretend to be Alice and get information from
Bob. In this way, the attacker could get the secret shared key. To prevent this, digital
signatures are often used (see [70] for a discussion of these).
The decision Diffie–Hellman problem is: given a prime q and g^a mod q, g^b mod q, and g^c mod q, determine whether g^c ≡ g^ab mod q.
In 1997, it became known that the ideas of public key cryptography were developed
by British Intelligence Services prior to Diffie and Hellman.
In 1977, Rivest, Shamir, and Adleman developed the RSA algorithm, which is presently (in several variations) the most widely used public key cryptosystem. It is based on
the difficulty of factoring large integers and, in particular, on the fact that it is easier
to test for primality than to factor very large integers.
In basic form, the RSA algorithm works as follows: Alice chooses two large primes
pA , qA and an integer eA relatively prime to ϕ(pA qA ) = (pA − 1)(qA − 1), where ϕ is the
Euler phi-function. It is assumed that these integers are chosen randomly to minimize
attacks. Primality tests arise in the following manner: Alice first randomly chooses a
large odd integer m and tests it for primality. If m is prime it is used. If not, she tests
m + 2, m + 4, and so on, until she gets her first prime pA . She then repeats the process
to get qA . Similarly, she chooses another odd integer m and tests until she gets an eA
relatively prime to ϕ(pA qA ). The primes she chooses should be quite large. Originally,
RSA used primes of approximately 100 decimal digits, but as computing and attack
have become more sophisticated, larger primes have had to be utilized. Presently, keys
with 400 decimal digits are not uncommon. Once Alice has obtained pA , qA , eA , she
lets nA = pA qA and computes dA , the multiplicative inverse of eA modulo ϕ(nA ). That
is, dA satisfies eA dA ≡ 1 mod (pA − 1)(qA − 1). She makes public the enciphering key
KA = (nA , eA ), and the encryption algorithm known to all is
fA (𝒫 ) = 𝒫 ^eA mod nA ,
where 𝒫 ∈ ℤnA is a message unit. It can be shown (see for instance [43] or the exercises) that if (eA , (pA − 1)(qA − 1)) = 1 and eA dA ≡ 1 mod (pA − 1)(qA − 1), then 𝒫 ^(eA dA) ≡ 𝒫 mod nA .
Therefore, the decryption algorithm is
gA (𝒞 ) = 𝒞 ^dA mod nA .
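The key setup and the round trip 𝒫 ^(eA dA) ≡ 𝒫 mod nA can be sketched with small illustrative primes of our own choosing (Python 3.8+ computes the modular inverse as pow(e, -1, phi)):

```python
# RSA key generation and round trip with toy primes (illustrative only).
p, q = 61, 53
n = p * q                  # n = 3233
phi = (p - 1) * (q - 1)    # (p-1)(q-1) = 3120
e = 17                     # public exponent, relatively prime to phi
d = pow(e, -1, phi)        # private exponent: e*d ≡ 1 mod phi

def encrypt(P):
    return pow(P, e, n)    # f(P) = P^e mod n

def decrypt(C):
    return pow(C, d, n)    # g(C) = C^d mod n

P = 65                     # a message unit in Z_n
C = encrypt(P)
print(C, decrypt(C) == P)  # P^(e*d) ≡ P mod n
```

Only (n, e) is public; recovering d requires factoring n, which is the hard problem underlying RSA.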
N^k < nU < N^l
for each user U; that is, nU = pU qU . In this case, any plaintext message 𝒫 is an integer less than N^k , considered as an element of ℤnU . Since nU < N^l , the image under the power transformation corresponds to an l digit integer written to the base N, and
hence to an l letter block. We give an example with relatively small primes. In real
world applications, the primes would be chosen to have over a hundred digits, and
the computations and choices must be done using good computing machinery.
Example 23.3.1. Suppose N = 26, k = 2, and l = 3. Suppose further that Alice chooses
pA = 29, qA = 41, eA = 13. Here, nA = 29 ⋅ 41 = 1189, so she makes public the key KA =
(1189, 13). She then computes the multiplicative inverse dA of 13 mod 1120 = 28 ⋅ 40; here dA = 517, since 13 ⋅ 517 = 6721 ≡ 1 mod 1120.
Now suppose we want to send her the message TABU. Since k = 2, the message units
in plaintext are 2 vectors of letters, so we separate the message into TA BU. We show
how to send TA. First, the numerical sequence for the letters TA mod 26 is (19,0). We
then use these as the digits of a 2-digit number to the base 26. Hence,
TA =̂ 19 ⋅ 26 + 0 ⋅ 1 = 494.
The encryption of 494 is fA (494) = 494^13 mod 1189, which evaluates to 320. Now we write 320 to the base 26. By our choices of k, l, this can be written with a maximum of 3 digits to this base. Then
320 = 0 ⋅ 26^2 + 12 ⋅ 26 + 8.
The letters in the encoded message then correspond to (0, 12, 8), and therefore the en-
cryption of TA is AMI.
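Example 23.3.1 can be checked directly. The sketch below (helper names are ours) reproduces the encryption of the block TA, including both base-26 conversions:

```python
# Reproduce Example 23.3.1: N = 26, k = 2, l = 3, public key (1189, 13).
n_A, e_A = 1189, 13                 # n_A = 29 * 41
d_A = pow(e_A, -1, 28 * 40)         # inverse of 13 mod 1120 (Python 3.8+)

# The digraph TA corresponds to (19, 0), read as a 2-digit base-26 integer.
m = 19 * 26 + 0                     # = 494
c = pow(m, e_A, n_A)                # 494^13 mod 1189 = 320

# Write c with 3 digits to the base 26: 320 = 0*26^2 + 12*26 + 8.
digits = [c // 26**2, (c // 26) % 26, c % 26]
cipher = ''.join(chr(d + ord('A')) for d in digits)
print(cipher)                       # AMI

# Alice decrypts with her secret exponent d_A.
print(pow(c, d_A, n_A) == m)
```

The same computation, with the roles of encryption and decryption exponents reversed, handles the second block BU.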
To decode the message Alice knows dA and applies the inverse transformation.
Since we have assumed that k < l, this seems to restrict the direction in which
messages can be sent. In practice, to allow messages to go between any two users,
the following is done: Suppose Alice is sending an authenticated message to Bob. The
keys kA = (nA , eA ), kB = (nB , eB ) are public. If nA < nB , Alice sends fB gA (𝒫 ). On the
other hand, if nA > nB , she sends gA fB (𝒫 ).
There have been attacks on RSA for special types of primes, so care must be taken
in choosing the primes.
The computations and choices used in real world implementations of the RSA al-
gorithm must be done with computers. Similarly, attacks on RSA are done via comput-
ers. As computing machinery gets stronger and factoring algorithms get faster, RSA
becomes less secure, and larger and larger primes must be used. To combat this, other
public key methods are in various stages of ongoing development. RSA and Diffie–
Hellman, and many related public key cryptosystems use properties in abelian groups.
In recent years, a great deal of work has been done to encrypt and decrypt using cer-
tain nonabelian groups, such as linear groups or braid groups. We will discuss these
later in the chapter.
Alice chooses a large prime q, a generator g of ℤ⋆q , and a secret exponent a with 1 < a < q − 1, and computes
A = g^a mod q.
Her public key is then (q, g, A). Bob wants to send a message M to Alice. He first encodes the message as an integer m mod q. For Bob to now send the encrypted message m to Alice, he chooses a random integer b with 1 < b < q − 2, and computes
B = g^b mod q
and
c = A^b m mod q;
that is, Bob encrypts the whole message by multiplying it by the Diffie–Hellman
shared key. The complete El-Gamal ciphertext is then the pair (B, c).
How does Alice decode the message? Given the message m, she knows how to
reconstruct the plaintext message M, so she must recover the mod q integer m. As in
the Diffie–Hellman key exchange, she can compute the shared key A^b = B^a . She can then divide c by this Diffie–Hellman key g^ab to obtain m. To avoid having to find the inverse of B^a mod q directly, she computes the exponent x = q − 1 − a. The inverse is then B^x mod q, since B^(q−1) ≡ 1 mod q.
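With toy parameters of our own choosing, the complete El-Gamal round trip, including the inverse-free decryption via the exponent x = q − 1 − a, looks as follows:

```python
# El-Gamal encryption with toy parameters (insecure sizes).
q, g = 23, 5              # public prime and generator of Z_q*
a = 6                     # Alice's secret exponent
A = pow(g, a, q)          # Alice's public key component A = g^a mod q

# Bob encrypts the message m with a fresh ephemeral exponent b.
m, b = 13, 3
B = pow(g, b, q)              # B = g^b mod q
c = (pow(A, b, q) * m) % q    # c = A^b * m mod q; ciphertext is (B, c)

# Alice decrypts: B^(q-1-a) is the inverse of the shared key B^a mod q.
x = q - 1 - a
recovered = (pow(B, x, q) * c) % q
print(recovered == m)
```

Because b is chosen fresh for each message, encrypting the same m twice gives different ciphertexts, which is the random component mentioned below.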
For each new El-Gamal encryption, a new exponent b is chosen so that there is a
random component of El-Gamal, which improves the security.
Breaking the El-Gamal system is as difficult as breaking the Diffie–Hellman pro-
tocol, and hence is based on the difficulty of the discrete log problem. However, the
El-Gamal has the advantage that the choice of primes is random. As mentioned, the
primes should be chosen large enough to not be susceptible to known discrete log
algorithms. Presently, the primes should be of binary length at least 512.
The important thing about elliptic curves from the viewpoint of cryptography is
that a group structure can be placed on E(K). In particular, we define the operation +
on E(K) by the following:
1. 0 + P = P for any point P ∈ E(K).
2. If P = (x, y), then −P = (x, −y), and −0 = 0.
3. P + (−P) = 0 for any point P ∈ E(K).
4. If P1 = (x1 , y1 ), P2 = (x2 , y2 ) with P1 ≠ −P2 , then P1 + P2 = (x3 , y3 ) with
x3 = m^2 − (x1 + x2 ), y3 = −m(x3 − x1 ) − y1 ,
where
m = (y2 − y1 )/(x2 − x1 ) if x2 ≠ x1 ,
and
m = (3x1^2 + a)/(2y1 ) if x2 = x1 .
This operation has a very nice geometric interpretation if K = ℝ, the real numbers. It
is known as the chord and tangent method. If P1 ≠ P2 are two points on the curve, then the line through P1 , P2 intersects the curve at a third point P3 . If we reflect P3 through the x-axis, we get P1 + P2 . If P1 = P2 , we take the tangent line at P1 .
With this operation, E(K) becomes an abelian group (due to Cassels), whose struc-
ture can be worked out.
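The addition formulas translate directly into code. The sketch below works over a small prime field; the curve y^2 = x^3 + 2x + 2 over ℤ17 and the base point are our illustrative choices, and the identity 0 is represented by None:

```python
# Chord-and-tangent addition on E: y^2 = x^3 + a*x + b over Z_p.
p, a, b = 17, 2, 2            # toy curve y^2 = x^3 + 2x + 2 mod 17
O = None                      # the identity element 0

def add(P1, P2):
    if P1 is O:
        return P2             # 0 + P = P
    if P2 is O:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return O              # P + (-P) = 0, where -P = (x, -y)
    if P1 == P2:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (m * m - x1 - x2) % p
    y3 = (-m * (x3 - x1) - y1) % p
    return (x3, y3)

P = (5, 1)                    # a point on the curve
Q = add(P, P)                 # doubling uses the tangent slope
print(Q)
print((Q[1]**2 - (Q[0]**3 + a * Q[0] + b)) % p == 0)   # Q lies on E
```

Division in the slope formulas becomes multiplication by a modular inverse, computed here with pow(·, -1, p).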
Theorem 23.3.2. E(K), together with the operations defined above, forms an abelian
group. If K is a finite field of order pk , then E(K) is either cyclic or has the structure
have been identified. Borovik, Myasnikov, Shpilrain [58], and others have studied the statistical aspects of these attacks, and have identified what are termed black holes in the platform groups, outside of which the relevant cryptographic problems present little difficulty. Baumslag,
Fine and Xu in [55] and [79] suggested potential cryptosystems using a combination
of combinatorial group theory and linear groups, and a general schema for these
types of cryptosystems was given. In [56], a public key version of this schema using
the classical modular group as a platform was presented. A cryptosystem using the
extended modular group SL2 (ℤ) was developed by Yamamura [80], but was subse-
quently shown to have loopholes [77]. In [56], attacks based on these loopholes were
closed.
The extension of the cryptographic ideas to noncommutative platforms involves
the following idea:
(1) General algebraic techniques for developing cryptosystems,
(2) Potential algebraic platforms (specific groups, rings, et cetera) for implementing
the techniques,
(3) Cryptanalysis and security analysis of the resulting systems.
The main source for noncommutative platforms are nonabelian groups, and the main
method for handling nonabelian groups in cryptography is combinatorial group the-
ory, which we discussed in detail in Chapter 14. The basic idea in using combinatorial
group theory for cryptography is that elements of groups can be expressed as words in
some alphabet. If there is an easy method to rewrite group elements in terms of these
words, and further the technique used in this rewriting process can be supplied by a
secret key, then a cryptosystem can be created.
One of the earliest descriptions of a free group cryptosystem was in a paper by
W. Magnus in the early 1970s [71]. Recall that the classical modular group M is M =
PSL2 (ℤ). Hence, M consists of the 2 × 2 projective integral matrices:
M = \left\{ ± \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad − bc = 1, a, b, c, d ∈ ℤ \right\}.
Among these are the projective matrices
± \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \qquad ± \begin{pmatrix} 1 + 4t^2 & 2t \\ 2t & 1 \end{pmatrix}, \quad t = 1, 2, 3, . . .
Since the entries in the generating matrices are positive, we can do the following:
Choose a set
T1 , . . . , Tn
of projective matrices from the set above with n large enough to encode a desired plain-
text alphabet 𝒜. Any message would be encoded by a word
W(T1 , . . . , Tn )
Suppose more generally that W1 , . . . , Wk are free generators of a free group F, and encode the plaintext alphabet by
𝒜 → {W1 , . . . , Wk }.
That is,
a → W1 , b → W2 , . . . .
Then, given a word W(a, b, . . .) in the plaintext alphabet, form the free group word
W(W1 , W2 , . . .). This represents an element g in F. Send out g as the secret message.
To implement this scheme, we need a concrete representation of g, and then for
decryption, a way to rewrite g back in terms of W1 , . . . , Wk . This concrete representation
is the idea behind homomorphic cryptosystems.
The decryption algorithm in a free group cryptosystem then depends on the Reide-
meister–Schreier rewriting process. As described in Chapter 14, this is a method to
rewrite elements of a subgroup of a free group in terms of the generators of that sub-
group. Recall that roughly it works as follows: Assume that W1 , . . . , Wk are free gener-
ators for some subgroup H of a free group F on {x1 , . . . , xn }. Each Wi is then a reduced
word in the generators {x1 , . . . , xn }. A Schreier transversal for H is a set {h1 , . . . , ht , . . .} of
(left) coset representatives for H in F of a special form (see Chapter 14). Any subgroup
of a free group has a Schreier transversal. The Reidemeister–Schreier process allows
one to construct a set of generators W1 , . . . , Wk for H by using a Schreier transversal.
Furthermore, given the Schreier transversal, from which the set of generators for H was
constructed, the Reidemeister–Schreier rewriting process allows us to algorithmically
rewrite an element of H. Given such an element expressed as a word W = W(x1 , . . . , xr )
in the generators of F, this algorithm rewrites W as a word W ⋆ (W1 , . . . , Wk ) in the gen-
erators of H.
The knowledge of a Schreier transversal, and the use of Reidemeister–Schreier
rewriting, facilitates the decoding process in the free group case, but is not essential.
Given a known set of generators for a subgroup the Stallings folding method to develop
a subgroup graph can also be utilized to rewrite in terms of the given generators. The
paper by Kapovich and Myasnikov [68] is now a standard reference for this method in
free groups. At present, there is an ongoing study of the complexity of Reidemeister–
Schreier being done by Brukhov, Fine, and Troeger.
Pure free group cryptosystems are subject to various attacks and can be broken
easily. However, a public key free group cryptosystem, using a free group represen-
tation in the modular group, was developed by Baumslag, Fine, and Xu [55, 56]. The
most successful attacks on free group cryptosystems are called length-based attacks.
The general schema begins with a finitely presented group
G = ⟨X|R⟩
together with a faithful representation
ρ : G → Ḡ,
where Ḡ is a group in which computation is concrete. Ḡ can be any one of several different kinds of objects: linear group, permutation group, power series ring, et cetera.
We assume that there is an algorithm to re-express an element of ρ(G) in terms of the generators of G. That is, if g = W(x1 , . . . , xn , . . .) ∈ G, where W is a word in these generators, and we are given ρ(g) ∈ Ḡ, we can algorithmically find g and its expression as the word W(x1 , . . . , xn ).
Once we have G, we assume that we have two free subgroups K, H with
H ⊂ K ⊂ G.
We assume that we have fixed Schreier transversals for K in G and for H in K, both of
which are held in secret by the communicating parties Bob and Alice. Now, based on
the fixed Schreier transversals, we have sets of Schreier generators constructed from
the Reidemeister–Schreier process for K and for H:
k1 , . . . , km , . . . for K,
and
h1 , . . . , ht , . . . for H.
Notice that the generators for K will be given as words in x1 , . . . , xn , the generators
of G, whereas the generators for H will be given as words in the generators k1 , k2 , . . . for
K. We note further that H and K may coincide, and that H and K need not, in general,
be free, but only have a unique set of normal forms so that the representation of an
element in terms of the given Schreier generators is unique.
We will encode within H, or more precisely within ρ(H). We assume that the num-
ber of generators for H is larger than the set of characters within our plaintext alpha-
bet. Let 𝒜 = {a, b, c, . . .} be our plaintext alphabet. At the simplest level, we choose a
starting point i within the generators of H, and encode
a → hi , b → hi+1 , . . . et cetera.
Suppose that Bob wants to communicate the message W(a, b, c, . . .) to Alice, where
W is a word in the plaintext alphabet. Recall that both Bob and Alice know the var-
ious Schreier transversals, which are kept secret between them. Bob then encodes
W(hi , hi+1 , . . .), and computes in the image of the representation the element W(ρ(hi ), ρ(hi+1 ), . . .), which he sends to Alice. This is sent as a matrix if the image group is a linear group, or as a permutation if it is a permutation group, and so on.
Alice uses the algorithm for the representation to rewrite W(ρ(hi ), ρ(hi+1 ), . . .) as a word W ⋆ (x1 , . . . , xn ) in the generators of G. She then uses the Schreier transversal for K in
G to rewrite using the Reidemeister–Schreier process W ⋆ as a word W ⋆⋆ (k1 , . . . , ks , . . .)
in the generators of K. Since K is free, or has unique normal forms, this expression
for the element of K is unique. Once she has the word written in the generators of K,
she uses the transversal for H in K to rewrite again, using the Reidemeister–Schreier
process, in terms of the generators for H. She then has a word W ⋆⋆⋆ (hi , hi+1 , . . .), and
using hi → a, hi+1 → b, . . . decodes the message.
In actual implementation, an additional random noise factor is added.
In [55] and [56], an implementation of this process was presented that used for
the base group G, the classical modular group M = PSL2 (ℤ). Furthermore, it was a
polyalphabetic cipher, which was secure.
The system in the modular group M was presented as follows: A list of finitely
generated free subgroups H1 , . . . , Hm of M is public and presented by their systems of
generators (presented as matrices). In a full practical implementation, it is assumed
that m is large. For each Hi , we have a Schreier transversal
h1,i , . . . , ht(i),i
and a corresponding system of free generators
W1,i , . . . , Wm(i),i .
We clarify the meanings of q and t. Once Bob chooses m, to further clarify the meaning
of q, he makes the substitution
a → Wm,q , b → Wm,q+1 , . . . .
Again, the assumption is that m(i) ≫ l so that starting almost anywhere in the se-
quence of generators of Hm will allow this substitution. The message unit size t is the
number of coded letters that Bob will place into each coded integral matrix.
Once Bob has made the choices (m, q, t), he takes his plaintext message W(a, b, . . .)
and groups blocks of t letters. He then makes the given substitution above to form the
corresponding matrices in the Modular group:
T1 , . . . , Ts .
We now introduce a random noise factor. After forming T1 , . . . , Ts , Bob then multiplies
on the right each Ti by a random matrix in M, say RTi (different for each Ti ). The only
restriction on this random matrix RTi is that there is no free cancellation in forming the
product Ti RTi . This can be easily checked, and ensures that the freely reduced form for
Ti RTi is just the concatenation of the expressions for Ti and RTi . Next he sends Alice
the integral key (m, q, t) by some public key method (RSA, Anshel–Anshel–Goldfeld
et cetera). He then sends the message as the s random matrices
T1 RT1 , . . . , Ts RTs .
Hence, what is actually being sent out are not elements of the chosen subgroup Hm ,
but rather elements of random right cosets of Hm in M. The purpose of sending coset
elements is two-fold. The first is to hinder any geometric attack by masking the sub-
group. The second is that it makes the resulting words in the modular group generators
longer—effectively hindering a brute force attack.
To decode the message, Alice first uses public key decryption to obtain the inte-
gral keys (m, q, t). She then knows the subgroup Hm , the ciphertext substitution from
the generators of Hm and how many letters t each matrix encodes. She next uses the
algorithms, described in Section 14.4, to express each Ti RTi in terms of the free group
generators of M, say WTi (y1 , . . . , yn ). She has knowledge of the Schreier transversal,
which is held secretly by Bob and Alice, so now uses the Reidemeister–Schreier rewrit-
ing process to start expressing this freely reduced word in terms of the generators of
Hm . The Reidemeister–Schreier rewriting is done letter by letter from left to right (see
Chapter 14). Hence, when she reaches t of the free generators, she stops. Notice that
the string that she is rewriting is longer than what she needs to rewrite to decode, as a result of the random matrix RTi . This is due to the fact that she is actually rewriting not an element of the subgroup, but an element in a right coset. This presents a
further difficulty to an attacker. Since these are random right cosets, it makes it diffi-
cult to pick up statistical patterns in the generators even if more than one message is
intercepted. In practice, the subgroups should be changed with each message.
The initial key (m, q, t) is changed frequently. Hence, as mentioned above, this
method becomes a type of polyalphabetic cipher. Polyalphabetic ciphers have histor-
ically been very difficult to decode.
A further variation of this method, using a formal power series ring in noncom-
muting variables over a field, was described in [51].
There have been many cryptosystems based on the difficulty of solving hard group
theoretic problems. The book by Myasnikov, Shpilrain, and Ushakov [73] describes
many of these in detail.
Ko and Lee [69] developed a public key exchange system that is a direct translation of the Diffie–Hellman protocol to a nonabelian group theoretic setting. Its security is
based on the difficulty of the conjugacy problem. We again assume that the platform
group has nice unique normal forms that are easy to compute from a given group element, but from which it is hard to recover the group element. Recall again that g^h means the conjugate of g by h; that is, g^h = h^−1 gh.
In the Ko–Lee protocol, Alice and Bob choose commuting subgroups A and B of
the platform group G. A is Alice’s subgroup, whereas Bob’s subgroup is B and these are
secret. Now they completely mimic the classical Diffie–Hellman technique. There is a
public element g ∈ G; Alice chooses a random secret element a ∈ A and makes public g^a . Bob chooses a random secret element b ∈ B and makes public g^b . The secret shared key is g^ab . Notice that ab = ba, since the subgroups commute. It follows then that (g^a )^b = g^ab = g^ba = (g^b )^a , just as if these were exponents. Hence, both Bob and Alice can determine the common secret. The security rests on the difficulty of the conjugacy problem.
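The protocol can be mimicked in any group in which we can compute. The toy sketch below uses permutations (represented as Python tuples), with Alice's and Bob's subgroups acting on disjoint points so that they commute; this choice is purely illustrative and would of course be insecure.

```python
# Ko–Lee key exchange mimicked in a permutation group (toy example).
# A permutation on n points is a tuple s with s[i] = image of i.

def compose(s, t):
    """(s ∘ t)[i] = s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def conjugate(g, h):
    """g^h = h^{-1} g h."""
    return compose(compose(inverse(h), g), h)

# Public element g of S_8; Alice's secret a moves only {0,...,3},
# Bob's secret b moves only {4,...,7}, so a and b commute.
g = (1, 2, 3, 4, 5, 6, 7, 0)
a = (1, 0, 3, 2, 4, 5, 6, 7)      # Alice's secret
b = (0, 1, 2, 3, 5, 6, 7, 4)      # Bob's secret

g_a = conjugate(g, a)             # Alice publishes g^a
g_b = conjugate(g, b)             # Bob publishes g^b

# Shared secret: (g^a)^b = g^(ab) = g^(ba) = (g^b)^a.
print(conjugate(g_a, b) == conjugate(g_b, a))
```

In a serious platform group, recovering a from g and g^a is an instance of the conjugator search problem; in a small permutation group it is trivial, which is why this is only a sketch of the mechanics.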
The conjugacy problem for a group G, or more precisely, for a group presentation of G, is, given g, h ∈ G, to determine algorithmically whether they are conjugate. As with the conjugator search problem, it is known that the conjugacy problem is undecidable in general, but there are groups where it is decidable, but hard. These groups then become the
target platform groups for the Ko–Lee protocol. As with the Anshel–Anshel–Goldfeld
protocol, Ko and Lee suggest the use of the braid groups.
As with the standard Diffie–Hellman key exchange protocol, using number theory
the Ko–Lee protocol can be changed to an encryption system via the El-Gamal method.
There are several different variants of noncommutative El-Gamal systems.
In the Anshel–Anshel–Goldfeld protocol, Alice and Bob choose finitely generated subgroups of the platform group G, given by generating sets
A = {a1 , . . . , an }, B = {b1 , . . . , bm },
and make them public. The subgroup A is Alice's subgroup, whereas the subgroup B is Bob's subgroup.
Alice chooses a secret group word a = W(a1 , . . . , an ) in her subgroup, whereas
Bob chooses a secret group word b = V(b1 , . . . , bm ) in his subgroup. For an element
g ∈ G, we let NF(g) denote the normal form for g. Alice knows her secret word a and
knows the generators bi of Bob’s subgroup. She makes public the normal forms of the
conjugates
NF(bi^a ), i = 1, . . . , m.
Bob knows his secret word b and the generators ai of Alice’s subgroup, and makes
public the normal forms of the conjugates
NF(aj^b ), j = 1, . . . , n.
Notice that Alice knows a^b , since she knows a in terms of the generators ai of her subgroup, and she knows their conjugates by b, since Bob has made the conjugates of the generators of A by b public. Since Alice knows a^b , she knows [a, b] = a^−1 a^b .
In an analogous manner, Bob knows [a, b] = (b^a )^−1 b. An attacker would have to
know the corresponding conjugator, that is, the element that conjugates each of the
generators. Given elements g, h in a group G, where it is known that g^k = k^−1 gk = h, the conjugator search problem is to determine the conjugator k. It is known that this problem is undecidable in general; that is, there are groups where the conjugator cannot
be determined algorithmically. On the other hand, there are groups, where the conju-
gator search problem is solvable, but “difficult”. That is, the complexity of solving the
conjugator search problem is hard. Such groups become the ideal platform groups for
the Anshel–Anshel–Goldfeld protocol.
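The heart of the protocol, namely that both parties arrive at the commutator [a, b] = a^−1 b^−1 ab from the published conjugates, can be checked in any concrete group. Reusing permutations as a stand-in platform (our illustrative choice, not a secure one):

```python
# Anshel–Anshel–Goldfeld shared secret in a permutation group (toy example).
def compose(s, t):
    """(s ∘ t)[i] = s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def conjugate(g, h):
    """g^h = h^{-1} g h."""
    return compose(compose(inverse(h), g), h)

# Secret elements a (Alice) and b (Bob) in S_5; they do not commute.
a = (1, 2, 0, 3, 4)
b = (0, 1, 3, 4, 2)

# Alice knows a and the published conjugate a^b, so she forms
# [a, b] = a^{-1} * a^b.
alice_key = compose(inverse(a), conjugate(a, b))

# Bob knows b and the published conjugate b^a, so he forms
# [a, b] = (b^a)^{-1} * b, using (b^a)^{-1} = (b^{-1})^a.
bob_key = compose(conjugate(inverse(b), a), b)

print(alice_key == bob_key)   # both obtain the commutator [a, b]
```

In the real protocol, Alice assembles a^b from the published conjugates of her generators, since conjugation by b is a homomorphism; here we simply conjugate directly.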
The security of this system thus rests on the difficulty of the conjugator search problem. Anshel, Anshel, and Goldfeld suggested the braid groups as potential platforms; they use, for example, B80 with 12 or more generators in the subgroups. Their suggestion and that of Ko and Lee led to the development of braid group cryptography. There have
been various attacks on the Braid group system. However, some have been handled
by changing the parameters. In general, the ideas remain valid despite the attacks.
The Anshel–Anshel–Goldfeld key exchange can be developed into a cryptosystem
again by the El-Gamal method.
There have been many other public key exchange protocols developed using non-
abelian groups. A large number of them are described in the book of Myasnikov, Sh-
pilrain, and Ushakov [73]. The authors of that book themselves have developed many
of these methods. They use different “hard” group theoretic decision problems and
many have been broken. On the other hand, the security of many of them is still open,
and they, perhaps, can be used as viable alternatives to commutative methods.
A minimal requirement on any platform group is that the group has solvable word problem, which is essential for these protocols. For
purposes of practicality, the group also needs an efficiently computable normal form,
which ensures an efficiently solvable word problem.
In addition to the platform group having normal form, ideally, it would also be
large enough so that a brute force search for the secret key is infeasible.
Currently, there are many potential platform groups that have been suggested.
What follow are some of the proposals. We refer to [73] for a discussion of many of
these.
– Braid groups (Ko–Lee, Anshel–Anshel–Goldfeld),
– Thompson groups (Shpilrain–Ushakov) [75],
– Polycyclic groups (Eick–Kahrobaei) [63],
– Linear groups (Baumslag–Fine–Xu) [55, 56],
– Free metabelian groups (Shpilrain–Zapata) [76],
– Artin groups (Shpilrain–Zapata) [76],
– Grigorchuk groups (Petrides) [74],
– Groups of matrices (Grigoriev–Ponomarenko) [65],
– Surface braid groups (Camps) [60].
As platform groups for their respective protocols, both Ko–Lee and Anshel–Anshel–Goldfeld suggested the braid groups Bn (see [59]). The groups in this class
possess the desired properties for the key exchange and key transport protocols; they
have remarkable presentations with solvable word problems and conjugacy prob-
lems; the solution to the conjugacy and conjugator search problem is “hard”; there
are several possibilities for normal forms for elements, and they have many choices
for large commuting subgroups. Initially, the braid groups were considered so ideal as
platforms that many other cryptographic applications were framed within the braid
group setting. These included authentication (identifying over a public airwave that
a message received was from the correct sender) and digital signature, (sending an
encrypted message with an included authentication). There was so much enthusiasm
about using these groups that the whole area of study was named braid group cryptog-
raphy. A comprehensive and well-written article by Dehornoy [38] provides a detailed
overview of the subject, and we refer the reader to that for technical details.
After the initial successes with braid group cryptographic schemes, there were
some surprisingly effective attacks. There were essentially three types of attacks: an
attack using solutions to the conjugacy and conjugator search problems, an attack
using heuristic probability within Bn , and an attack based on the fact that there are
faithful linear representations of each Bn (see [38]). What is most surprising is that
the Anshel–Anshel–Goldfeld method was susceptible to a length-based attack. In the
Anshel–Anshel–Goldfeld method, the parameters are the specific braid group Bn , and
the rank of the secret subgroups for Bob and Alice. A length-based attack essentially
broke the method for the initial parameters suggested by Anshel, Anshel and Goldfeld.
The parameters were then made larger and attacks by this method were less success-
ful. However, this led to research on why these attacks on the conjugator search prob-
lem within Bn were successful. What was discovered was that, generically, a random
subgroup of Bn is a free group; hence, length-based attacks are essentially attacks on
free group cryptography, and therefore successful (see [22]). This indicated that,
although randomness is important in cryptography, subgroups cannot be chosen
purely at random when the braid groups are used as platforms.
Braid groups arise in several different areas of mathematics and have several
equivalent formulations. We close this chapter and the book with a brief introduction
to braid groups. A complete topological and algebraic description can be found in the
book of Joan Birman [59].
A braid on n strings is obtained by starting with n parallel strings and intertwining
them. We number the strings at each vertical position and keep track of where each
individual string begins and ends. We say that two braids are equivalent if it is possible
to move the strings of one braid in space, without moving the endpoints or passing
one string through another, and obtain the other braid. A braid with no crossings is called
a trivial braid. We form a product of braids in the following manner: If u is the first braid
and v is the second braid, then uv is the braid formed by placing the starting points for
the strings in v at the endpoints of the strings in u. The inverse of a braid is the mirror
image in the horizontal plane. It is clear that if we form the product of a braid and
its mirror image, we get a braid equivalent to the trivial braid. With these definitions,
the set of all equivalence classes of braids on n strings forms a group Bn . We let σi
denote the braid that has a single crossing from string i over string i+1. Since a general
braid is just a series of crossings, it follows that Bn is generated by the set
{σi : i = 1, . . . , n − 1}.
There is an equivalent algebraic formulation of the braid group Bn . Let Fn be free
on the n generators x1 , . . . , xn with n > 2. Let σi , i = 1, . . . , n − 1, be the automorphism of
Fn , given by
$$\sigma_i :\; x_i \mapsto x_i x_{i+1} x_i^{-1}, \qquad x_{i+1} \mapsto x_i, \qquad x_j \mapsto x_j \ \text{ for } j \neq i, i+1 .$$
The map sending the braid σi to this automorphism embeds Bn into Aut(Fn ), and
relative to the generators σ1 , . . . , σn−1 the group Bn has the presentation
$$\bigl\langle \sigma_1, \ldots, \sigma_{n-1} \;;\; \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} \ (1 \le i \le n-2),\;\; \sigma_i \sigma_j = \sigma_j \sigma_i \ (|i-j| \ge 2) \bigr\rangle .$$
This is now called the Artin presentation. The fact that Bn is contained in Aut(Fn )
provides an elementary solution to the word problem in Bn , since one can determine
whether a braid word represents the identity by checking whether the corresponding
automorphism fixes each of the generators x1 , . . . , xn .
A second, very efficient approach to the word problem is Dehornoy's handle
reduction. An xi -handle is a subword of a braid word of the form
$$x_i^{-\epsilon} \, V \, x_i^{\epsilon}$$
with ϵ = ±1, and where the word V involves neither xi nor xi−1 . If V does not contain any
xi+1 -handles, then the xi -handle is called permitted.
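The Artin action is easy to experiment with on a computer. The following Python sketch (the encoding of words as lists of pairs and all function names are ours, not from the text) implements the automorphisms σi and their inverses on freely reduced words of Fn, so one can check the braid relations and test whether a braid word acts as the identity:

```python
def free_reduce(word):
    """Freely reduce a word in F_n; a word is a list of (generator, ±1) pairs."""
    out = []
    for g, e in word:
        if out and out[-1] == (g, -e):
            out.pop()          # cancel x x^{-1}
        else:
            out.append((g, e))
    return out

def inverse(word):
    """Formal inverse of a free-group word."""
    return [(g, -e) for g, e in reversed(word)]

def artin_images(i, e):
    """Images of the two affected generators under sigma_i^e (Artin action);
    every other generator is fixed."""
    if e == 1:
        return {i: [(i, 1), (i + 1, 1), (i, -1)],       # x_i -> x_i x_{i+1} x_i^{-1}
                i + 1: [(i, 1)]}                        # x_{i+1} -> x_i
    return {i: [(i + 1, 1)],                            # x_i -> x_{i+1}
            i + 1: [(i + 1, -1), (i, 1), (i + 1, 1)]}   # x_{i+1} -> x_{i+1}^{-1} x_i x_{i+1}

def apply_letter(i, e, word):
    """Apply the automorphism sigma_i^e to a free-group word."""
    table = artin_images(i, e)
    out = []
    for g, d in word:
        image = table.get(g, [(g, 1)])
        out.extend(image if d == 1 else inverse(image))
    return free_reduce(out)

def apply_braid(braid, word):
    """Apply a braid word (list of (i, ±1); leftmost letter acts first, a
    convention of this sketch) to a free-group word."""
    for i, e in braid:
        word = apply_letter(i, e, word)
    return word

# sigma_1 sends x_2 to x_1:
assert apply_braid([(1, 1)], [(2, 1)]) == [(1, 1)]
```

Applying a braid word letter by letter and comparing the images of the generators with the generators themselves is exactly the elementary word-problem test described above.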
A braid word W′ is obtained from a braid word W by a one-step handle reduction
if some subword of W is a permitted xi -handle $x_i^{-\epsilon} V x_i^{\epsilon}$, and W′ is obtained from W by
applying the following substitutions to all letters in the xi -handle:
$$x_j^{\pm 1} \;\to\; \begin{cases} 1, & \text{if } j = i,\\ x_{i+1}^{\epsilon}\, x_i^{\pm 1}\, x_{i+1}^{-\epsilon}, & \text{if } j = i+1,\\ x_j^{\pm 1}, & \text{if } j < i \text{ or } j > i+1 . \end{cases}$$
The handle reduction process is very efficient and, in practice, usually runs in time
polynomial in the length of the braid word to produce a handle-free form.
However, there is no known theoretical complexity estimate (see [38]).
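Handle reduction can be prototyped in a few lines. The Python sketch below is ours, not Dehornoy's implementation: it encodes a braid word as a list of (index, ±1) pairs, treats a subword $x_i^{-\epsilon} V x_i^{\epsilon}$ as a handle when V involves neither x_i nor x_{i−1}, counts a handle as permitted when all x_{i+1} letters in V carry one sign, replaces each such letter by $x_{i+1}^{\epsilon} x_i^{\pm 1} x_{i+1}^{-\epsilon}$, and simply caps the number of steps instead of proving termination:

```python
def free_reduce(word):
    """Cancel adjacent inverse pairs; a braid word is a list of (index, ±1)."""
    out = []
    for g, e in word:
        if out and out[-1] == (g, -e):
            out.pop()
        else:
            out.append((g, e))
    return out

def find_permitted_handle(word):
    """Return (p, q) so that word[p..q] is a permitted x_i-handle, else None."""
    for p in range(len(word)):
        i, e0 = word[p]
        for q in range(p + 1, len(word)):
            j, e1 = word[q]
            if j != i:
                continue
            if e1 == -e0:
                interior = word[p + 1:q]
                clean = all(g != i - 1 for g, _ in interior)   # no x_{i-1} inside
                signs = {d for g, d in interior if g == i + 1}
                if clean and len(signs) <= 1:                  # permitted handle
                    return p, q
            break  # only the next x_i letter can close a handle starting at p
    return None

def reduce_once(word, p, q):
    """One-step reduction of the handle x_i^{-eps} V x_i^{eps} = word[p..q]."""
    i, eps = word[q]
    out = list(word[:p])                       # boundary letters map to 1
    for g, d in word[p + 1:q]:
        if g == i + 1:
            out += [(i + 1, eps), (i, d), (i + 1, -eps)]
        else:
            out.append((g, d))
    return free_reduce(out + list(word[q + 1:]))

def handle_reduce(word, max_steps=10_000):
    """Reduce until no permitted handle remains (naive search, step-capped)."""
    word = free_reduce(list(word))
    for _ in range(max_steps):
        h = find_permitted_handle(word)
        if h is None:
            return word
        word = reduce_once(word, *h)
    raise RuntimeError("step limit reached")

# x_1^{-1} x_2 x_1 reduces to the handle-free word x_2 x_1 x_2^{-1}:
assert handle_reduce([(1, -1), (2, 1), (1, 1)]) == [(2, 1), (1, 1), (2, -1)]
```

When the process terminates, the output is empty precisely when the input word represents the trivial braid, so `handle_reduce` doubles as a word-problem test; none of Dehornoy's refinements for guaranteeing fast termination are included here.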
Garside solved the conjugacy problem using a different type of normal form for Bn .
Let Sn be the symmetric group on n letters, let π : Bn → Sn be the natural projection,
and for each s ∈ Sn , let ζs be the shortest positive braid such that π(ζs ) = s. The elements
S = {ζs : s ∈ Sn } ⊂ Bn
are called simple elements. We order the simple elements so that ζs < ζt if there exists
r ∈ Sn such that ζt = ζs ζr . This produces a lattice structure on S.
The trivial braid is the smallest element of S, whereas the greatest element of S is
the half-twist braid
Δ = ζ(n,n−1,...,2,1) .
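Since simple elements correspond to permutations, they can be enumerated mechanically: writing s as a shortest product of adjacent transpositions gives a positive braid word for ζs. A small Python sketch (bubble sort recording its swaps; the naming is ours, and we gloss over the left/right action convention):

```python
from itertools import permutations

def reduced_word(perm):
    """Record the adjacent swaps bubble sort makes while sorting `perm` (a
    tuple of 1..n); this yields a shortest word in the sigma_i for the
    permutation, whose length equals the number of inversions."""
    p = list(perm)
    word = []
    changed = True
    while changed:
        changed = False
        for k in range(len(p) - 1):
            if p[k] > p[k + 1]:
                p[k], p[k + 1] = p[k + 1], p[k]
                word.append(k + 1)  # the adjacent transposition sigma_{k+1}
                changed = True
    return word

def inversions(perm):
    """Number of inversions of a permutation given as a tuple."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])

# The half twist corresponds to the reversal permutation; in B_4 it has
# length 4 * 3 / 2 = 6:
assert len(reduced_word((4, 3, 2, 1))) == 6
```

There are n! simple elements in Bn, one per permutation, and the reversal permutation has n(n − 1)/2 inversions, matching the length of the half-twist braid Δ.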
The Garside left normal form of a braid a ∈ Bn is a pair (p, (s1 , . . . , st )), where p ∈ ℤ
and s1 , . . . , st is a sequence of permutations in Sn with ζsi ∉ {ζ1 , Δ}, satisfying for each i = 1, . . . , t − 1
$$\zeta_1 = \gcd\bigl(\zeta_{s_i}^{-1}\Delta,\; \zeta_{s_{i+1}}\bigr),$$
where gcd denotes the greatest lower bound with respect to the lattice order on S, and
$a = \Delta^{p}\, \zeta_{s_1} \cdots \zeta_{s_t}$.
Theorem 23.6.2. There exists an algorithm that computes the normal form of the cor-
responding braid for any braid word W = w(x1 , . . . , xn ).
23.7 Exercises
1. Show that if p, q are primes and e, d are positive integers with (e, (p − 1)(q − 1)) = 1
and ed ≡ 1 mod (p − 1)(q − 1), then aed ≡ a mod pq for any integer a. (This is the
basis of the decryption function used in the RSA algorithm.)
2. The following table gives the approximate statistical frequency of occurrence (in
percent) of letters in the English language:
E 12.7 T 9.1 A 8.2 O 7.5 I 7.0 N 6.7 S 6.3 H 6.1 R 6.0 D 4.3 L 4.0 C 2.8 U 2.8
M 2.4 W 2.4 F 2.2 G 2.0 Y 2.0 P 1.9 B 1.5 V 1.0 K 0.8 J 0.15 X 0.15 Q 0.1 Z 0.07
The passage below is encrypted with a simple permutation cipher without
punctuation. Use a frequency analysis to try to decode it.
ZKIRNVMFNYVIRHZKLHRGREVRMGVTVIDSR
XSSZHZHGHLMOBKLHRGREVWRERHLIHLMVZ
MWRGHVOUKIRNVMFNYVIHKOZBZXIFXRZOI
LOVRMMFNYVIGSVLIBZMWZIVGSVYZHRHUL
IGHSHVMLGVHGSVIVZIVRMURMRGVOBNZMB
KIRNVHZMWGSVBHVIEVZHYFROWRMTYOLXP
HULIZOOGSVKLHRGREVRMGVTVIH
3. Encrypt the message NO MORE WAR using an affine cipher with single letter keys
a = 7, b = 5.
4. Encrypt the message NO MORE WAR using an affine cipher on 2-vectors of letters
and the encrypting key
$$A = \begin{pmatrix} 5 & 2 \\ 1 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 \\ 7 \end{pmatrix}.$$
5. What is the decryption algorithm for the affine cipher given in the previous problem?
6. How many different affine enciphering transformations are there on single letters
with an N letter alphabet?
7. Let N ∈ ℕ with N ≥ 2, and let n → an + b with (a, N) = 1 be an affine cipher on an N
letter alphabet. Show that if any two letters are guessed, n1 → m1 , n2 → m2 with
(n1 − n2 , N) = 1, then the code can be broken.
8. If we use an affine cipher on N, N ≥ 2, single letters with n → an + b, b ≢ 0 mod N,
and (a − 1, N) = 1, show that there is always a unique fixed letter. This can be used
in cryptanalysis.
9. A user has the public RSA key (n, e). By a security gap, the number ϕ(n) becomes
known. Show that the user has to reject the key.
(i) Explain how the secret key d can be calculated, that is, the number d such
that ed ≡ 1 mod ϕ(n).
(ii) Explain how n can be factorized.
10. The plaintext message x is encrypted with two RSA keys (551, 5) and (551, 11). The
respective ciphertexts are 277 mod 551 and 429 mod 551. From this calculate x.
11. Let F be a free group of rank 3 with generators x, y, z. Code the English alphabet
by a → 0, b → 1, . . . . Consider the free group cryptosystem given by
i → Wi ,
where $W_i = x^{i} y^{i+1} z^{i+2} x^{-i+1}$. Code the message EAT AT JOES with this system.
12. In the Anshel–Anshel–Goldfeld protocol, verify that both Bob and Alice will know
the commutator.
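The congruence in Exercise 1, which underlies the correctness of RSA decryption, is easy to check numerically before proving it. A quick Python sketch with small, arbitrarily chosen parameters (the function name and parameter values are ours):

```python
from math import gcd

def rsa_identity_holds(p, q, e, trials=200):
    """Check that a^(e*d) ≡ a (mod p*q) for a = 0, 1, ..., trials - 1,
    where d is the inverse of e modulo (p - 1)(q - 1)."""
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1)(q - 1)")
    d = pow(e, -1, phi)          # modular inverse (Python 3.8+)
    n = p * q
    return all(pow(a, e * d, n) == a % n for a in range(trials))

# Example with p = 11, q = 23, e = 3 (illustrative values only):
assert rsa_identity_holds(11, 23, 3)
```

Note that the check also passes for values of a divisible by p or q; the exercise asks for a proof covering exactly those cases, which is where the Chinese remainder argument is needed.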
https://doi.org/10.1515/9783110603996-024
[29] R. C. Lyndon, Groups and Geometry, LMS Lecture Note Series 101, Cambridge University Press,
1985.
[30] R. C. Lyndon and P. Schupp, Combinatorial Group Theory, Springer-Verlag 1977.
[31] W. Magnus, A. Karrass and D. Solitar, Combinatorial Group Theory, Wiley, 1966.
[32] D. J. S. Robinson, A Course in the Theory of Groups, Springer-Verlag, 1982.
[33] J. Rotman, Group Theory, 3rd ed., Wm. C. Brown, 1988.
Number theory
Cryptography
[50] I. Anshel, M. Anshel and D. Goldfeld, An algebraic method for public key cryptography, Math.
Res. Lett., 6, 1999, 287–91.
[51] G. Baumslag, Y. Brjukhov, B. Fine and G. Rosenberger, Some cryptoprimitives for
noncommutative algebraic cryptography, in Aspects of Infinite Groups, 26–44, World Scientific
Press, 2009.
[52] G. Baumslag, Y. Brjukhov, B. Fine and D. Troeger, Challenge response password security using
combinatorial group theory, Groups Complex. Cryptol., 2, 2010, 67–81.
[53] G. Baumslag, T. Camps, B. Fine, G. Rosenberger and X. Xu, Designing key transport protocols
using combinatorial group theory, Cont. Math. 418, 2006, 35–43.
[54] G. Baumslag, B. Fine, M. Kreuzer and G. Rosenberger, A Course in Mathematical Cryptography,
De Gruyter, 2015.
[55] G. Baumslag, B. Fine and X. Xu, Cryptosystems using linear groups, Appl. Algebra Eng.
Commun. Comput. 17, 2006, 205–17.
[56] G. Baumslag, B. Fine and X. Xu, A proposed public key cryptosystem using the modular group,
Cont. Math. 421, 2007, 35–44.
[57] J. Birman, Braids, Links and Mapping Class Groups, Annals of Math Studies 82, Princeton
University Press, 1975.
[58] A. V. Borovik, A. G. Myasnikov and V. Shpilrain, Measuring sets in infinite groups, in
Computational and Statistical Group Theory, Contemp. Math. 298, 21–42, 2002.
[59] J. A. Buchmann, Introduction to Cryptography, Springer 2004.
[60] T. Camps, Surface Braid Groups as Platform Groups and Applications in Cryptography, Ph.D.
thesis, Universität Dortmund 2009.
[61] R. E. Crandall and C. Pomerance, Prime Numbers. A Computational Perspective, 2nd ed.,
Springer-Verlag, 2005.
[62] P. Dehornoy, Braid-based cryptography, Cont. Math., 360, 2004, 5–34.
[63] B. Eick and D. Kahrobaei, Polycyclic groups: A new platform for cryptology? math.GR/0411077
(2004), 1–7.
[64] M. I. González Vasco and R. Steinwandt, Group Theoretic Cryptography, Chapman & Hall, 2015.
[65] D. Grigoriev and I. Ponomarenko, Homomorphic public-key cryptosystems over groups and
rings, Quaderni di Matematica, 2005.
[66] P. Hoffman, Archimedes’ Revenge, W. W. Norton & Company, 1988.
[67] D. Kahrobaei and B. Khan, A non-commutative generalization of the El-Gamal key exchange
using polycyclic groups, in Proceeding of IEEE, 1–5, 2006.
[68] I. Kapovich and A. Myasnikov, Stallings foldings and subgroups of free groups, J. Algebra 248,
2002, 608–68.
[69] K. H. Ko, S. J. Lee, J. H. Cheon, J. H. Han, J. S. Kang and C. Park, New public-key cryptosystems
using braid groups, in Advances in Cryptology – CRYPTO 2000, Lecture Notes in
Computer Science 1880, 166–83, 2000.
[70] N. Koblitz, Algebraic Methods of Cryptography, Springer, 1998.
[71] W. Magnus, Rational representations of Fuchsian groups and non-parabolic subgroups of the
modular group, Nachrichten der Akad. Göttingen, 179–89, 1973.
[72] A. G. Myasnikov, V. Shpilrain and A. Ushakov, A practical attack on some braid group based
cryptographic protocols, in CRYPTO 2005, Lecture Notes in Computer Science 3621, 86–96,
2005.
[73] A. G. Myasnikov, V. Shpilrain and A. Ushakov, Group-Based Cryptography, Advanced Courses in
Mathematics, CRM Barcelona, 2007.
[74] G. Petrides, Cryptanalysis of the public key cryptosystem based on the word problem on the
Grigorchuk groups, in Cryptography and Coding, Lecture Notes in Computer Science 2898,
234–44, 2003.
[75] V. Shpilrain and A. Ushakov, The conjugacy search problem in public key cryptography:
unnecessary and insufficient, Appl. Algebra Eng. Commun. Comput., 17, 2006, 285–9.
[76] V. Shpilrain and A. Zapata, Using the subgroup membership problem in public key
cryptography, Cont. Math., 418, 2006, 169–79.
[77] R. Steinwandt, Loopholes in two public key cryptosystems using the modular group, preprint,
University of Karlsruhe, 2000.
[78] D. R. Stinson, Cryptography: Theory and Practice, Chapman and Hall, 2002.
[79] X. Xu, Cryptography and Infinite Group Theory, Ph.D. thesis, CUNY, 2006.
[80] A. Yamamura, Public key cryptosystems using the modular group, in Public Key Cryptography,
Lecture Notes in Computer Sciences 1431, 203–16, 1998.